summaryrefslogtreecommitdiff
path: root/fs/btrfs/extent_io.c
AgeCommit message (Collapse)Author
2025-05-29btrfs: avoid linker error in btrfs_find_create_tree_block()Mark Harmstone
[ Upstream commit 7ef3cbf17d2734ca66c4ed8573be45f4e461e7ee ] The inline function btrfs_is_testing() is hardcoded to return 0 if CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set. Currently we're relying on the compiler optimizing out the call to alloc_test_extent_buffer() in btrfs_find_create_tree_block(), as it's not been defined (it's behind an #ifdef). Add a stub version of alloc_test_extent_buffer() to avoid linker errors on non-standard optimization levels. This problem was seen on GCC 14 with -O0 and is helps to see symbols that would be otherwise optimized out. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-09btrfs: adjust subpage bit start based on sectorsizeJosef Bacik
commit e08e49d986f82c30f42ad0ed43ebbede1e1e3739 upstream. When running machines with 64k page size and a 16k nodesize we started seeing tree log corruption in production. This turned out to be because we were not writing out dirty blocks sometimes, so this in fact affects all metadata writes. When writing out a subpage EB we scan the subpage bitmap for a dirty range. If the range isn't dirty we do bit_start++; to move onto the next bit. The problem is the bitmap is based on the number of sectors that an EB has. So in this case, we have a 64k pagesize, 16k nodesize, but a 4k sectorsize. This means our bitmap is 4 bits for every node. With a 64k page size we end up with 4 nodes per page. To make this easier this is how everything looks [0 16k 32k 48k ] logical address [0 4 8 12 ] radix tree offset [ 64k page ] folio [ 16k eb ][ 16k eb ][ 16k eb ][ 16k eb ] extent buffers [ | | | | | | | | | | | | | | | | ] bitmap Now we use all of our addressing based on fs_info->sectorsize_bits, so as you can see the above our 16k eb->start turns into radix entry 4. When we find a dirty range for our eb, we correctly do bit_start += sectors_per_node, because if we start at bit 0, the next bit for the next eb is 4, to correspond to eb->start 16k. However if our range is clean, we will do bit_start++, which will now put us offset from our radix tree entries. In our case, assume that the first time we check the bitmap the block is not dirty, we increment bit_start so now it == 1, and then we loop around and check again. This time it is dirty, and we go to find that start using the following equation start = folio_start + bit_start * fs_info->sectorsize; so in the case above, eb->start 0 is now dirty, and we calculate start as 0 + 1 * fs_info->sectorsize = 4096 4096 >> 12 = 1 Now we're looking up the radix tree for 1, and we won't find an eb. What's worse is now we're using bit_start == 1, so we do bit_start += sectors_per_node, which is now 5. If that eb is dirty we will run into the same thing, we will look at an offset that is not populated in the radix tree, and now we're skipping the writeout of dirty extent buffers. The best fix for this is to not use sectorsize_bits to address nodes, but that's a larger change. Since this is a fs corruption problem fix it simply by always using sectors_per_node to increment the start bit. Fixes: c4aec299fa8f ("btrfs: introduce submit_eb_subpage() to submit a subpage metadata page") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-22btrfs: fix two misuses of folio_shift()Matthew Wilcox (Oracle)
[ Upstream commit 01af106a076352182b2916b143fc50272600bd81 ] It is meaningless to shift a byte count by folio_shift(). The folio index is in units of PAGE_SIZE, not folio_size(). We can use folio_contains() to make this work for arbitrary-order folios, so remove the assertion that the folios are of order 0. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: fix double accounting race when extent_writepage_io() failedQu Wenruo
[ Upstream commit 8bf334beb3496da3c3fbf3daf3856f7eec70dacc ] [BUG] If submit_one_sector() failed inside extent_writepage_io() for sector size < page size cases (e.g. 4K sector size and 64K page size), then we can hit double ordered extent accounting error. This should be very rare, as submit_one_sector() only fails when we failed to grab the extent map, and such extent map should exist inside the memory and has been pinned. [CAUSE] For example we have the following folio layout: 0 4K 32K 48K 60K 64K |//| |//////| |///| Where |///| is the dirty range we need to writeback. The 3 different dirty ranges are submitted for regular COW. Now we hit the following sequence: - submit_one_sector() returned 0 for [0, 4K) - submit_one_sector() returned 0 for [32K, 48K) - submit_one_sector() returned error for [60K, 64K) - btrfs_mark_ordered_io_finished() called for the whole folio This will mark the following ranges as finished: * [0, 4K) * [32K, 48K) Both ranges have their IO already submitted, this cleanup will lead to double accounting. * [60K, 64K) That's the correct cleanup. The only good news is, this error is only theoretical, as the target extent map is always pinned, thus we should directly grab it from memory, other than reading it from the disk. [FIX] Instead of calling btrfs_mark_ordered_io_finished() for the whole folio range, which can touch ranges we should not touch, instead move the error handling inside extent_writepage_io(). So that we can cleanup exact sectors that ought to be submitted but failed. This provides much more accurate cleanup, avoiding the double accounting. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: fix double accounting race when btrfs_run_delalloc_range() failedQu Wenruo
[ Upstream commit 72dad8e377afa50435940adfb697e070d3556670 ] [BUG] When running btrfs with block size (4K) smaller than page size (64K, aarch64), there is a very high chance to crash the kernel at generic/750, with the following messages: (before the call traces, there are 3 extra debug messages added) BTRFS warning (device dm-3): read-write for sector size 4096 with page size 65536 is experimental BTRFS info (device dm-3): checking UUID tree hrtimer: interrupt took 5451385 ns BTRFS error (device dm-3): cow_file_range failed, root=4957 inode=257 start=1605632 len=69632: -28 BTRFS error (device dm-3): run_delalloc_nocow failed, root=4957 inode=257 start=1605632 len=69632: -28 BTRFS error (device dm-3): failed to run delalloc range, root=4957 ino=257 folio=1572864 submit_bitmap=8-15 start=1605632 len=69632: -28 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 3020984 at ordered-data.c:360 can_finish_ordered_extent+0x370/0x3b8 [btrfs] CPU: 2 UID: 0 PID: 3020984 Comm: kworker/u24:1 Tainted: G OE 6.13.0-rc1-custom+ #89 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] pc : can_finish_ordered_extent+0x370/0x3b8 [btrfs] lr : can_finish_ordered_extent+0x1ec/0x3b8 [btrfs] Call trace: can_finish_ordered_extent+0x370/0x3b8 [btrfs] (P) can_finish_ordered_extent+0x1ec/0x3b8 [btrfs] (L) btrfs_mark_ordered_io_finished+0x130/0x2b8 [btrfs] extent_writepage+0x10c/0x3b8 [btrfs] extent_write_cache_pages+0x21c/0x4e8 [btrfs] btrfs_writepages+0x94/0x160 [btrfs] do_writepages+0x74/0x190 filemap_fdatawrite_wbc+0x74/0xa0 start_delalloc_inodes+0x17c/0x3b0 [btrfs] btrfs_start_delalloc_roots+0x17c/0x288 [btrfs] shrink_delalloc+0x11c/0x280 [btrfs] flush_space+0x288/0x328 [btrfs] btrfs_async_reclaim_data_space+0x180/0x228 [btrfs] process_one_work+0x228/0x680 worker_thread+0x1bc/0x360 kthread+0x100/0x118 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1605632 OE len=16384 to_dec=16384 left=0 BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1622016 OE len=12288 to_dec=12288 left=0 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 BTRFS critical (device dm-3): bad ordered extent accounting, root=4957 ino=257 OE offset=1634304 OE len=8192 to_dec=4096 left=0 CPU: 1 UID: 0 PID: 3286940 Comm: kworker/u24:3 Tainted: G W OE 6.13.0-rc1-custom+ #89 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: btrfs_work_helper [btrfs] (btrfs-endio-write) pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : process_one_work+0x110/0x680 lr : worker_thread+0x1bc/0x360 Call trace: process_one_work+0x110/0x680 (P) worker_thread+0x1bc/0x360 (L) worker_thread+0x1bc/0x360 kthread+0x100/0x118 ret_from_fork+0x10/0x20 Code: f84086a1 f9000fe1 53041c21 b9003361 (f9400661) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 2-3 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x275bb9540000 from 0xffff800080000000 PHYS_OFFSET: 0xffff8fbba0000000 CPU features: 0x100,00000070,00801250,8201720b [CAUSE] The above warning is triggered immediately after the delalloc range failure, this happens in the following sequence: - Range [1568K, 1636K) is dirty 1536K 1568K 1600K 1636K 1664K | |/////////|////////| | Where 1536K, 1600K and 1664K are page boundaries (64K page size) - Enter extent_writepage() for page 1536K - Enter run_delalloc_nocow() with locked page 1536K and range [1568K, 1636K) This is due to the inode having preallocated extents. - Enter cow_file_range() with locked page 1536K and range [1568K, 1636K) - btrfs_reserve_extent() only reserved two extents The main loop of cow_file_range() only reserved two data extents, Now we have: 1536K 1568K 1600K 1636K 1664K | |<-->|<--->|/|///////| | 1584K 1596K Range [1568K, 1596K) has an ordered extent reserved. - btrfs_reserve_extent() failed inside cow_file_range() for file offset 1596K This is already a bug in our space reservation code, but for now let's focus on the error handling path. Now cow_file_range() returned -ENOSPC. - btrfs_run_delalloc_range() do error cleanup <<< ROOT CAUSE Call btrfs_cleanup_ordered_extents() with locked folio 1536K and range [1568K, 1636K) Function btrfs_cleanup_ordered_extents() normally needs to skip the ranges inside the folio, as it will normally be cleaned up by extent_writepage(). Such split error handling is already problematic in the first place. What's worse is the folio range skipping itself, which is not taking subpage cases into consideration at all, it will only skip the range if the page start >= the range start. In our case, the page start < the range start, since for subpage cases we can have delalloc ranges inside the folio but not covering the folio. So it doesn't skip the page range at all. This means all the ordered extents, both [1568K, 1584K) and [1584K, 1596K) will be marked as IOERR. And these two ordered extents have no more pending ios, they are marked finished, and *QUEUED* to be deleted from the io tree. - extent_writepage() do error cleanup Call btrfs_mark_ordered_io_finished() for the range [1536K, 1600K). Although ranges [1568K, 1584K) and [1584K, 1596K) are finished, the deletion from io tree is async, it may or may not happen at this time. If the ranges have not yet been removed, we will do double cleaning on those ranges, triggering the above ordered extent warnings. In theory there are other bugs, like the cleanup in extent_writepage() can cause double accounting on ranges that are submitted asynchronously (compression for example). But that's much harder to trigger because normally we do not mix regular and compression delalloc ranges. [FIX] The folio range split is already buggy and not subpage compatible, it was introduced a long time ago where subpage support was not even considered. So instead of splitting the ordered extents cleanup into the folio range and out of folio range, do all the cleanup inside writepage_delalloc(). - Pass @NULL as locked_folio for btrfs_cleanup_ordered_extents() in btrfs_run_delalloc_range() - Skip the btrfs_cleanup_ordered_extents() if writepage_delalloc() failed So all ordered extents are only cleaned up by btrfs_run_delalloc_range(). - Handle the ranges that already have ordered extents allocated If part of the folio already has ordered extent allocated, and btrfs_run_delalloc_range() failed, we also need to cleanup that range. Now we have a concentrated error handling for ordered extents during btrfs_run_delalloc_range(). Fixes: d1051d6ebf8e ("btrfs: Fix error handling in btrfs_cleanup_ordered_extents") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: use btrfs_inode in extent_writepage()David Sterba
[ Upstream commit 011a9a1f244656cc3cbde47edba2b250f794d440 ] As extent_writepage() is internal helper we should use our inode type, so change it from struct inode. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: rename btrfs_folio_(set|start|end)_writer_lock()Qu Wenruo
[ Upstream commit 0f7120266584490616f031873e7148495d77dd68 ] Since there is no user of reader locks, rename the writer locks into a more generic name, by removing the "_writer" part from the name. And also rename btrfs_subpage::writer into btrfs_subpage::locked. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: unify to use writer locks for subpage lockingQu Wenruo
[ Upstream commit 336e69f3025fb70db9d0dfb7f36ac79887bf5341 ] Since commit d7172f52e993 ("btrfs: use per-buffer locking for extent_buffer reading"), metadata read no longer relies on the subpage reader locking. This means we do not need to maintain a different metadata/data split for locking, so we can convert the existing reader lock users by: - add_ra_bio_pages() Convert to btrfs_folio_set_writer_lock() - end_folio_read() Convert to btrfs_folio_end_writer_lock() - begin_folio_read() Convert to btrfs_folio_set_writer_lock() - folio_range_has_eb() Remove the subpage->readers checks, since it is always 0. - Remove btrfs_subpage_start_reader() and btrfs_subpage_end_reader() Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: mark all dirty sectors as locked inside writepage_delalloc()Qu Wenruo
[ Upstream commit c96d0e3921419bd3e5d8a1f355970c8ae3047ef4 ] Currently we only mark sectors as locked if there is a *NEW* delalloc range for it. But NEW delalloc range is not the same as dirty sectors we want to submit, e.g: 0 32K 64K 96K 128K | |////////||///////| |////| 120K For above 64K page size case, writepage_delalloc() for page 0 will find and lock the delalloc range [32K, 96K), which is beyond the page boundary. Then when writepage_delalloc() is called for the page 64K, since [64K, 96K) is already locked, only [120K, 128K) will be locked. This means, although range [64K, 96K) is dirty and will be submitted later by extent_writepage_io(), it will not be marked as locked. This is fine for now, as we call btrfs_folio_end_writer_lock_bitmap() to free every non-compressed sector, and compression is only allowed for full page range. But this is not safe for future sector perfect compression support, as this can lead to double folio unlock: Thread A | Thread B ---------------------------------------+-------------------------------- | submit_one_async_extent() | |- extent_clear_unlock_delalloc() extent_writepage() | |- btrfs_folio_end_writer_lock() |- btrfs_folio_end_writer_lock_bitmap()| |- btrfs_subpage_end_and_test_writer() | | | |- atomic_sub_and_test() | | | /* Now the atomic value is 0 */ |- if (atomic_read() == 0) | | |- folio_unlock() | |- folio_unlock() The root cause is the above range [64K, 96K) is dirtied and should also be locked but it isn't. So to make everything more consistent and prepare for the incoming sector perfect compression, mark all dirty sectors as locked. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: move the delalloc range bitmap search into extent_io.cQu Wenruo
[ Upstream commit 2bca8eb0774d271b1077b72f1be135073e0a898f ] Currently for subpage (sector size < page size) cases, we reuse subpage locked bitmap to find out all delalloc ranges we have locked, and run all those found ranges. However such reuse is not perfect, e.g.: 0 32K 64K 96K 128K | |////////||///////| |////| 120K For above range, writepage_delalloc() for page 0 will handle the range [32K, 96k), note delalloc range can be beyond the page boundary. But writepage_delalloc() for page 64K will only handle range [120K, 128K), as the previous run on page 0 has already handled range [64K, 96K). Meanwhile for the writeback we should expect range [64K, 96K) to also be locked, this leads to the mismatch from locked bitmap and delalloc range. This is not causing problems yet, but it's still an inconsistent behavior. So instead of relying on the subpage locked bitmap, move the delalloc range search using local @delalloc_bitmap, so that we can remove the existing btrfs_folio_find_writer_locked(). Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-27btrfs: do not assume the full page range is not dirty in extent_writepage_io()Qu Wenruo
[ Upstream commit 928b4de66ed3b0d9a6f201ce41ab2eed6ea2e7ef ] The function extent_writepage_io() will submit the dirty sectors inside the page for the write. But recently to co-operate with the incoming subpage compression enhancement, a new bitmap is introduced to btrfs_bio_ctrl::submit_bitmap, to only avoid a subset of the dirty range. This is because we can have the following cases with 64K page size: 0 16K 32K 48K 64K | |/////////| |/| 52K For range [16K, 32K), we queue the dirty range for compression, which is ran in a delayed workqueue. Then for range [48K, 52K), we go through the regular submission path. In that case, our btrfs_bio_ctrl::submit_bitmap will exclude the range [16K, 32K). The dirty flags for the range [16K, 32K) is only cleared when the compression is done, by the extent_clear_unlock_delalloc() call inside submit_one_async_extent(). This patch fix the false alert by removing the btrfs_folio_assert_not_dirty() check, since it's no longer correct for subpage compression cases. Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: 8bf334beb349 ("btrfs: fix double accounting race when extent_writepage_io() failed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-21btrfs: fix stale page cache after race between readahead and direct IO writeFilipe Manana
[ Upstream commit acc18e1c1d8c0d59d793cf87790ccfcafb1bf5f0 ] After commit ac325fc2aad5 ("btrfs: do not hold the extent lock for entire read") we can now trigger a race between a task doing a direct IO write and readahead. When this race is triggered it results in tasks getting stale data when they attempt do a buffered read (including the task that did the direct IO write). This race can be sporadically triggered with test case generic/418, failing like this: $ ./check generic/418 FSTYP -- btrfs PLATFORM -- Linux/x86_64 debian0 6.13.0-rc7-btrfs-next-185+ #17 SMP PREEMPT_DYNAMIC Mon Feb 3 12:28:46 WET 2025 MKFS_OPTIONS -- /dev/sdc MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1 generic/418 14s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad) # --- tests/generic/418.out 2020-06-10 19:29:03.850519863 +0100 # +++ /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad 2025-02-03 15:42:36.974609476 +0000 @@ -1,2 +1,5 @@ QA output created by 418 +cmpbuf: offset 0: Expected: 0x1, got 0x0 +[6:0] FAIL - comparison failed, offset 24576 +diotest -wp -b 4096 -n 8 -i 4 failed at loop 3 Silence is golden ... (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/generic/418.out /home/fdmanana/git/hub/xfstests/results//generic/418.out.bad' to see the entire diff) Ran: generic/418 Failures: generic/418 Failed 1 of 1 tests The race happens like this: 1) A file has a prealloc extent for the range [16K, 28K); 2) Task A starts a direct IO write against file range [24K, 28K). At the start of the direct IO write it invalidates the page cache at __iomap_dio_rw() with kiocb_invalidate_pages() for the 4K page at file offset 24K; 3) Task A enters btrfs_dio_iomap_begin() and locks the extent range [24K, 28K); 4) Task B starts a readahead for file range [16K, 28K), entering btrfs_readahead(). First it attempts to read the page at offset 16K by entering btrfs_do_readpage(), where it calls get_extent_map(), locks the range [16K, 20K) and gets the extent map for the range [16K, 28K), caching it into the 'em_cached' variable declared in the local stack of btrfs_readahead(), and then unlocks the range [16K, 20K). Since the extent map has the prealloc flag, at btrfs_do_readpage() we zero out the page's content and don't submit any bio to read the page from the extent. Then it attempts to read the page at offset 20K entering btrfs_do_readpage() where we reuse the previously cached extent map (decided by get_extent_map()) since it spans the page's range and it's still in the inode's extent map tree. Just like for the previous page, we zero out the page's content since the extent map has the prealloc flag set. Then it attempts to read the page at offset 24K entering btrfs_do_readpage() where we reuse the previously cached extent map (decided by get_extent_map()) since it spans the page's range and it's still in the inode's extent map tree. Just like for the previous pages, we zero out the page's content since the extent map has the prealloc flag set. Note that we didn't lock the extent range [24K, 28K), so we didn't synchronize with the ongoing direct IO write being performed by task A; 5) Task A enters btrfs_create_dio_extent() and creates an ordered extent for the range [24K, 28K), with the flags BTRFS_ORDERED_DIRECT and BTRFS_ORDERED_PREALLOC set; 6) Task A unlocks the range [24K, 28K) at btrfs_dio_iomap_begin(); 7) The ordered extent enters btrfs_finish_one_ordered() and locks the range [24K, 28K); 8) Task A enters fs/iomap/direct-io.c:iomap_dio_complete() and it tries to invalidate the page at offset 24K by calling kiocb_invalidate_post_direct_write(), resulting in a call chain that ends up at btrfs_release_folio(). The btrfs_release_folio() call ends up returning false because the range for the page at file offset 24K is currently locked by the task doing the ordered extent completion in the previous step (7), so we have: btrfs_release_folio() -> __btrfs_release_folio() -> try_release_extent_mapping() -> try_release_extent_state() This last function checking that the range is locked and returning false and propagating it up to btrfs_release_folio(). So this results in a failure to invalidate the page and kiocb_invalidate_post_direct_write() triggers this message logged in dmesg: Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! After this we leave the page cache with stale data for the file range [24K, 28K), filled with zeroes instead of the data written by direct IO write (all bytes with a 0x01 value), so any task attempting to read with buffered IO, including the task that did the direct IO write, will get all bytes in the range with a 0x00 value instead of the written data. Fix this by locking the range, with btrfs_lock_and_flush_ordered_range(), at the two callers of btrfs_do_readpage() instead of doing it at get_extent_map(), just like we did before commit ac325fc2aad5 ("btrfs: do not hold the extent lock for entire read"), and unlocking the range after all the calls to btrfs_do_readpage(). This way we never reuse a cached extent map without flushing any pending ordered extents from a concurrent direct IO write. Fixes: ac325fc2aad5 ("btrfs: do not hold the extent lock for entire read") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-21btrfs: rename __get_extent_map() and pass btrfs_inodeDavid Sterba
[ Upstream commit 06de96faf795b5c276a3be612da6b08c6112e747 ] The double underscore naming scheme does not apply here, there's only only get_extent_map(). As the definition is changed also pass the struct btrfs_inode. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Stable-dep-of: acc18e1c1d8c ("btrfs: fix stale page cache after race between readahead and direct IO write") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-17fs/writeback: convert wbc_account_cgroup_owner to take a folioPankaj Raghav
[ Upstream commit 30dac24e14b52e1787572d1d4e06eeabe8a63630 ] Most of the callers of wbc_account_cgroup_owner() are converting a folio to page before calling the function. wbc_account_cgroup_owner() is converting the page back to a folio to call mem_cgroup_css_from_folio(). Convert wbc_account_cgroup_owner() to take a folio instead of a page, and convert all callers to pass a folio directly except f2fs. Convert the page to folio for all the callers from f2fs as they were the only callers calling wbc_account_cgroup_owner() with a page. As f2fs is already in the process of converting to folios, these call sites might also soon be calling wbc_account_cgroup_owner() with a folio directly in the future. No functional changes. Only compile tested. Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20240926140121.203821-1-kernel@pankajraghav.com Acked-by: David Sterba <dsterba@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: 51d20d1dacbe ("iomap: fix zero padding data issue in concurrent append writes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-22btrfs: fix the delalloc range locking if sector size < page sizeQu Wenruo
Inside lock_delalloc_folios(), there are several problems related to sector size < page size handling: - Set the writer locks without checking if the folio is still valid We call btrfs_folio_start_writer_lock() just like it's folio_lock(). But since the folio may not even be the folio of the current mapping, we can easily screw up the folio->private. - The range is not clamped inside the page This means we can over write other bitmaps if the start/len is not properly handled, and trigger the btrfs_subpage_assert(). - @processed_end is always rounded up to page end If the delalloc range is not page aligned, and we need to retry (returning -EAGAIN), then we will unlock to the page end. Thankfully this is not a huge problem, as now btrfs_folio_end_writer_lock() can handle range larger than the locked range, and only unlock what is already locked. Fix all these problems by: - Lock and check the folio first, then call btrfs_folio_set_writer_lock() So that if we got a folio not belonging to the inode, we won't touch folio->private. - Properly truncate the range inside the page - Update @processed_end to the locked range end Fixes: 1e1de38792e0 ("btrfs: make process_one_page() to handle subpage locking") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-09btrfs: fix clear_dirty and writeback ordering in submit_one_sector()Naohiro Aota
This commit is a replay of commit 6252690f7e1b ("btrfs: fix invalid mapping of extent xarray state"). We need to call btrfs_folio_clear_dirty() before btrfs_set_range_writeback(), so that xarray DIRTY tag is cleared. With a refactoring commit 8189197425e7 ("btrfs: refactor __extent_writepage_io() to do sector-by-sector submission"), it screwed up and the order is reversed and causing the same hang. Fix the ordering now in submit_one_sector(). Fixes: 8189197425e7 ("btrfs: refactor __extent_writepage_io() to do sector-by-sector submission") Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: only unlock the to-be-submitted ranges inside a folioQu Wenruo
[SUBPAGE COMPRESSION LIMITS] Currently inside writepage_delalloc(), if a delalloc range is going to be submitted asynchronously (inline or compression, the page dirty/writeback/unlock are all handled in at different time, not at the submission time), then we return 1 and extent_writepage() will skip the submission. This is fine if every sector matches page size, but if a sector is smaller than page size (aka, subpage case), then it can be very problematic, for example for the following 64K page: 0 16K 32K 48K 64K |/| |///////| |/| | | 4K 52K Where |/| is the dirty range we need to submit. In the above case, we need the following different handling for the 3 ranges: - [0, 4K) needs to be submitted for regular write A single sector cannot be compressed. - [16K, 32K) needs to be submitted for compressed write - [48K, 52K) needs to be submitted for regular write. Above, if we try to submit [16K, 32K) for compressed write, we will return 1 and immediately, and without submitting the remaining [48K, 52K) range. Furthermore, since extent_writepage() will exit without unlocking any sectors, the submitted range [0, 4K) will not have sector unlocked. That's the reason why for now subpage is only allowed for full page range. [ENHANCEMENT] - Introduce a submission bitmap at btrfs_bio_ctrl::submit_bitmap This records which sectors will be submitted by extent_writepage_io(). This allows us to track which sectors needs to be submitted thus later to be properly unlocked. For asynchronously submitted range (inline/compression), the corresponding bits will be cleared from that bitmap. - Only return 1 if no sector needs to be submitted in writepage_delalloc() - Only submit sectors marked by submission bitmap inside extent_writepage_io() So we won't touch the asynchronously submitted part. - Introduce btrfs_folio_end_writer_lock_bitmap() helper This will only unlock the involved sectors specified by @bitmap parameter, to avoid touching the range asynchronously submitted. Please note that, since subpage compression is still limited to page aligned range, this change is only a preparation for future sector perfect compression support for subpage. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: merge btrfs_folio_unlock_writer() into btrfs_folio_end_writer_lock()Qu Wenruo
The function btrfs_folio_unlock_writer() is already calling btrfs_folio_end_writer_lock() to do the heavy lifting work, the only missing 0 writer check. Thus there is no need to keep two different functions, move the 0 writer check into btrfs_folio_end_writer_lock(), and remove btrfs_folio_unlock_writer(). Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: remove btrfs_folio_end_all_writers()Qu Wenruo
The function btrfs_folio_end_all_writers() is only utilized in extent_writepage() as a way to unlock all subpage range (for both successful submission and error handling). Meanwhile we have a similar function, btrfs_folio_end_writer_lock(). The difference is, btrfs_folio_end_writer_lock() expects a range that is a subset of the already locked range. This limit on btrfs_folio_end_writer_lock() is a little overkilled, preventing it from being utilized for error paths. So here we enhance btrfs_folio_end_writer_lock() to accept a superset of the locked range, and only end the locked subset. This means we can replace btrfs_folio_end_all_writers() with btrfs_folio_end_writer_lock() instead. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert try_release_extent_mapping() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. And page_to_inode() can be replaced with folio_to_inode() now. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert try_release_extent_state() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Moreover, use folio_pos() instead of page_offset(), which is more consistent with folio usage. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert submit_eb_page() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert submit_eb_subpage() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Moreover, use folio_pos() instead of page_offset(), which is more consistent with folio usage. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert try_release_extent_buffer() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert try_release_subpage_extent_buffer() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. And use folio_pos instead of page_offset, which is more consistent with folio usage. At the same time, folio_test_private() can handle folio directly without converting from page to folio first. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert get_next_extent_buffer() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Use folio_pos instead of page_offset, which is more consistent with folio usage. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert clear_page_extent_mapped() to take a folioLi Zetao
The old page API is being gradually replaced and converted to use folio to improve code readability and avoid repeated conversion between page and folio. Now clear_page_extent_mapped() can deal with a folio directly, so change its name to clear_folio_extent_mapped(). Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: do not hold the extent lock for entire readJosef Bacik
Historically we've held the extent lock throughout the entire read. There's been a few reasons for this, but it's mostly just caused us problems. For example, this prevents us from allowing page faults during direct io reads, because we could deadlock. This has forced us to only allow 4k reads at a time for io_uring NOWAIT requests because we have no idea if we'll be forced to page fault and thus have to do a whole lot of work. On the buffered side we are protected by the page lock, as long as we're reading things like buffered writes, punch hole, and even direct IO to a certain degree will get hung up on the page lock while the page is in flight. On the direct side we have the dio extent lock, which acts much like the way the extent lock worked previously to this patch, however just for direct reads. This protects direct reads from concurrent direct writes, while we're protected from buffered writes via the inode lock. Now that we're protected in all cases, narrow the extent lock to the part where we're getting the extent map to submit the reads, no longer holding the extent lock for the entire read operation. Push the extent lock down into do_readpage() so that we're only grabbing it when looking up the extent map. This portion was contributed by Goldwyn. Co-developed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: rename __extent_writepage() and drop double underscoresDavid Sterba
The function does not follow the pattern where the underscores would be justified, so rename it. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: rename btrfs_submit_bio() to btrfs_submit_bbio()David Sterba
The function name is a bit misleading as it submits the btrfs_bio (bbio), rename it so we can use btrfs_submit_bio() when an actual bio is submitted. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: subpage: remove btrfs_fs_info::subpage_info memberQu Wenruo
The member btrfs_fs_info::subpage_info stores the cached bitmap start position inside the merged bitmap. However in reality there is only one thing depending on the sectorsize, bitmap_nr_bits, which records the number of sectors that fit inside a page. The sequence of sub-bitmaps have fixed order, thus it's just a quick multiplication to calculate the start position of each sub-bitmaps. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: remove the nr_ret parameter from __extent_writepage_io()Qu Wenruo
The parameter @nr_ret is used to tell the caller how many sectors have been submitted for IO. Then callers check @nr_ret value to determine if we need to manually clear the PAGECACHE_TAG_DIRTY, as if we submitted no sector (e.g. all sectors are beyond i_size) there is no folio_start_writeback() called thus PAGECACHE_TAG_DIRTY tag will not be cleared. Remove this parameter by: - Moving the btrfs_folio_clear_writeback() call into __extent_writepage_io() So that if we didn't submit any IO, then manually call btrfs_folio_set_writeback() to clear PAGECACHE_TAG_DIRTY when the page is no longer dirty. - Use a bool to record if we have submitted any sector Instead of an int. - Use subpage compatible helpers to end folio writeback. This brings no change to the behavior, just for the sake of consistency. As for the call site inside __extent_writepage(), we're always called for the whole page, so the existing full page helper folio_(start|end)_writeback() is totally fine. For the call site inside extent_write_locked_range(), although we can have subpage range, folio_start_writeback() will only clear PAGECACHE_TAG_DIRTY if the page is no longer dirty, and the full folio will still be dirty if there is any subpage dirty range. Only when the last dirty subpage sector is cleared, the folio_start_writeback() will clear PAGECACHE_TAG_DIRTY. So no matter if we call the full page or subpage helper, the result is still the same, then just use the subpage helpers for consistency. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: refactor __extent_writepage_io() to do sector-by-sector submissionQu Wenruo
Unlike the bitmap usage inside raid56, for __extent_writepage_io() we handle the subpage submission not sector-by-sector, but for each dirty range we found. This is not a big deal normally, as the subpage complex code is already mostly optimized out by the compiler for x86_64. However for the sake of consistency and for the future of subpage sector-perfect compression support, this patch does: - Extract the sector submission code into submit_one_sector() - Add the needed code to extract the dirty bitmap for subpage case There is a small pitfall for non-subpage case, as we cleared page dirty before starting writeback, so we have to manually set the default dirty_bitmap to 1 for such case. - Use bitmap_and() to calculate the target sectors we need to submit This is done for both subpage and non-subpage cases, and will later be expanded to skip inline/compression ranges. For x86_64, the dirty bitmap will be fixed to 1, with the length of 1, so we're still doing the same workload per sector. For larger page sizes, the overhead will be a little larger, as previous we only need to do one extent_map lookup per-dirty-range, but now it will be one extent_map lookup per-sector. But that is the same frequency as x86_64, so we're just aligning the behavior to x86_64. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert find_next_dirty_byte() to take a folioJosef Bacik
We already use a folio some in this function, replace all page usage with the folio and update the function to take the folio as an argument. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert __get_extent_map() to take a folioJosef Bacik
Now that btrfs_get_extent takes a folio, update __get_extent_map to take a folio as well. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert btrfs_get_extent() to take a folioJosef Bacik
We only pass this into read_inline_extent, change it to take a folio and update the callers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert btrfs_writepage_cow_fixup() to use folioJosef Bacik
Instead of a page, use a folio for btrfs_writepage_cow_fixup. We already have a folio at the only caller, and the fixup worker uses folios. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert btrfs_run_delalloc_range() to take a folioJosef Bacik
Now that every function that btrfs_run_delalloc_range calls takes a folio, update it to take a folio and update the callers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert extent_write_locked_range() to take a folioJosef Bacik
This mostly uses folios, convert it to take a folio instead and update the callers to pass in the folio. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert extent_clear_unlock_delalloc() to take a folioJosef Bacik
Instead of taking the locked page, take the locked folio so we can pass that into __process_folios_contig. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert process_one_page() to operate only on foliosJosef Bacik
Now that this mostly uses folios, update it to take folios, use the folios that are passed in, and rename from process_one_page => process_one_folio. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert __process_pages_contig() to take a folioJosef Bacik
This operates mostly on folios, update it to take a folio for the locked folio instead of the page, rename from __process_pages_contig => __process_folios_contig. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert __unlock_for_delalloc() to take a folioJosef Bacik
All of the callers have a folio at this point, update __unlock_for_delalloc to take a folio so that it's consistent with its callers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert lock_delalloc_pages() to take a folioJosef Bacik
Also rename lock_delalloc_pages => lock_delalloc_folios in the process, now that it exclusively works on folios. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert find_lock_delalloc_range() to use a folioJosef Bacik
Instead of passing in a page for locked_page, pass in the folio instead. We only use the folio itself to validate some range assumptions, and then pass it into other functions. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert writepage_delalloc() to take a folioJosef Bacik
We already use a folio heavily in this function, pass the folio in directly and use it everywhere, only passing the page down to functions that do not take a folio yet. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert btrfs_mark_ordered_io_finished() to take a folioJosef Bacik
We only need a folio now, make it take a folio as an argument and update all of the callers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert btrfs_finish_ordered_extent() to take a folioJosef Bacik
The callers and callee's of this now all use folios, update it to take a folio as well. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert __extent_writepage() to be completely folio basedJosef Bacik
Now that we've gotten most of the helpers updated to only take a folio, update __extent_writepage to only deal in folios. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-10btrfs: convert extent_write_locked_range() to use foliosJosef Bacik
Instead of using pages for everything, find a folio and use that. This makes things a bit cleaner as a lot of the functions calls here all take folios. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>