author Qu Wenruo <wqu@suse.com> 2025-07-29 19:01:45 +0930
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org> 2025-09-09 19:02:16 +0200
commit 6edbd021384f223ac19685452a79f83b7f6deaca (patch)
tree 1a6aa259929e6f84ae0547473ef1db9613acfa0b
parent e1ec77b94094a718ed7c04dda4dba98dff5dea58 (diff)
btrfs: clear block dirty if submit_one_sector() failed
[ Upstream commit 4bcd3061e8154606af7f721cb75ca04ffe191a12 ]

[BUG]
If submit_one_sector() failed, the block will be kept dirty, but with
their corresponding range finished in the ordered extent.

This means if a writeback happens later again, we can hit the following
problems:

- ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector()
  If the original extent map is a hole, then we can hit this case, as
  the new ordered extent failed, we will drop the new extent map and
  re-read one from the disk.

- DEBUG_WARN() in btrfs_writepage_cow_fixup()
  This is because we no longer have an ordered extent for those dirty
  blocks. The original for them is already finished with error.

[CAUSE]
The function submit_one_sector() is not following the regular error
handling of writeback. The common practice is to clear the folio dirty,
start and finish the writeback for the block.

This is normally done by extent_clear_unlock_delalloc() with
PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during
run_delalloc_range().

So if we keep those failed blocks dirty, they will stay in the page
cache and wait for the next writeback.

And since the original ordered extent is already finished and removed,
depending on the original extent map, we either hit the ASSERT() inside
submit_one_sector(), or hit the DEBUG_WARN() in
btrfs_writepage_cow_fixup().

[FIX]
Follow the regular error handling to clear the dirty flag for the block,
start and finish writeback for that block instead.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
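
The state transition described in [BUG] and [FIX] can be illustrated with a small userspace model. This is only a sketch and not kernel code: struct toy_folio, toy_submit_one_sector(), toy_count_orphan_dirty() and toy_orphans_after_failure() are hypothetical names invented for the illustration, and the per-block flags are plain booleans rather than the real btrfs folio state. It assumes the caller finishes the ordered extent range with an error whenever submission fails, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SECTORS 4

/* Hypothetical per-block state; NOT the real btrfs folio/subpage structures. */
struct toy_folio {
	bool dirty[TOY_SECTORS];
	bool writeback[TOY_SECTORS];
	bool has_ordered_extent[TOY_SECTORS];
};

/*
 * Toy stand-in for submit_one_sector(). When lookup_fails is true it models
 * the extent map lookup failing. With apply_fix set, the error path clears
 * the dirty flag and starts then finishes writeback, as the patch does.
 */
static int toy_submit_one_sector(struct toy_folio *f, int idx,
				 bool lookup_fails, bool apply_fix)
{
	assert(idx >= 0 && idx < TOY_SECTORS);

	if (lookup_fails) {
		if (apply_fix) {
			f->dirty[idx] = false;
			f->writeback[idx] = true;   /* start writeback */
			f->writeback[idx] = false;  /* finish it immediately */
		}
		return -5; /* stands in for -EIO */
	}

	/* Success: the block is no longer dirty and is under writeback. */
	f->dirty[idx] = false;
	f->writeback[idx] = true;
	return 0;
}

/* Count blocks that are still dirty but no longer have an ordered extent. */
static int toy_count_orphan_dirty(const struct toy_folio *f)
{
	int count = 0;

	for (int i = 0; i < TOY_SECTORS; i++) {
		if (f->dirty[i] && !f->has_ordered_extent[i])
			count++;
	}
	return count;
}

/* Fail the submission of one dirty block and report the orphaned blocks. */
int toy_orphans_after_failure(bool apply_fix)
{
	struct toy_folio f = {0};
	int ret;

	f.dirty[1] = true;
	f.has_ordered_extent[1] = true;

	ret = toy_submit_one_sector(&f, 1, true, apply_fix);
	if (ret < 0)
		f.has_ordered_extent[1] = false; /* range finished with error */

	return toy_count_orphan_dirty(&f);
}
```

In this model, toy_orphans_after_failure(false) returns 1: one block stays dirty with no ordered extent, which is the state a later writeback pass would trip over. toy_orphans_after_failure(true) returns 0.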
-rw-r--r-- fs/btrfs/extent_io.c 17
1 files changed, 13 insertions, 4 deletions
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3711a5d07342..fac4000a5bca 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1483,7 +1483,7 @@ out:
/*
* Return 0 if we have submitted or queued the sector for submission.
- * Return <0 for critical errors.
+ * Return <0 for critical errors, and the sector will have its dirty flag cleared.
*
* Caller should make sure filepos < i_size and handle filepos >= i_size case.
*/
@@ -1506,8 +1506,17 @@ static int submit_one_sector(struct btrfs_inode *inode,
ASSERT(filepos < i_size);
em = btrfs_get_extent(inode, NULL, filepos, sectorsize);
- if (IS_ERR(em))
+ if (IS_ERR(em)) {
+ /*
+ * When submission failed, we should still clear the folio dirty.
+ * Or the folio will be written back again but without any
+ * ordered extent.
+ */
+ btrfs_folio_clear_dirty(fs_info, folio, filepos, sectorsize);
+ btrfs_folio_set_writeback(fs_info, folio, filepos, sectorsize);
+ btrfs_folio_clear_writeback(fs_info, folio, filepos, sectorsize);
return PTR_ERR(em);
+ }
extent_offset = filepos - em->start;
em_end = btrfs_extent_map_end(em);
@@ -1637,8 +1646,8 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
* Here we set writeback and clear for the range. If the full folio
* is no longer dirty then we clear the PAGECACHE_TAG_DIRTY tag.
*
- * If we hit any error, the corresponding sector will still be dirty
- * thus no need to clear PAGECACHE_TAG_DIRTY.
+ * If we hit any error, the corresponding sector will have its dirty
+ * flag cleared and writeback finished, thus no need to handle the error case.
*/
if (!submitted_io && !error) {
btrfs_folio_set_writeback(fs_info, folio, start, len);