summaryrefslogtreecommitdiff
path: root/fs/btrfs/extent_io.c
AgeCommit message (Collapse)Author
2019-02-25btrfs: Remove EXTENT_FIRST_DELALLOC bitNikolay Borisov
With the refactoring introduced in 8b62f87bad9c ("Btrfs: reworki outstanding_extents") this flag became unused. Remove it and renumber the following flags accordingly. No functional changes. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25btrfs: Remove unused arguments from btrfs_get_extent_fiemapNikolay Borisov
This function is a simple wrapper over btrfs_get_extent that returns either: a) A real extent in the passed range or b) Adjusted extent based on whether delalloc bytes are found backing up a hole. To support these semantics it doesn't need the page/pg_offset/create arguments which are passed to btrfs_get_extent in case an extent is to be created. So simplify the function by removing the unused arguments. No functional changes. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-15block: allow bio_for_each_segment_all() to iterate over multi-page bvecMing Lei
This patch introduces one extra iterator variable to bio_for_each_segment_all(), then we can allow bio_for_each_segment_all() to iterate over multi-page bvec. Given it is just one mechannical & simple change on all bio_for_each_segment_all() users, this patch does tree-wide change in one single patch, so that we can avoid to use a temporary helper for this conversion. Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-15btrfs: use mp_bvec_last_segment to get bio's last pageMing Lei
Preparing for supporting multi-page bvec. Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-15btrfs: look at bi_size for repair decisionsChristoph Hellwig
bio_readpage_error currently uses bi_vcnt to decide if it is worth retrying an I/O. But the vector count is mostly an implementation artifact - it really should figure out if there is more than a single sector worth retrying. Use bi_size for that and shift by PAGE_SHIFT. This really should be blocks/sectors, but given that btrfs doesn't support a sector size different from the PAGE_SIZE using the page size keeps the changes to a minimum. Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-04fs: don't open code lru_to_page()Nikolay Borisov
Multiple filesystems open code lru_to_page(). Rectify this by moving the macro from mm_inline (which is specific to lru stuff) to the more generic mm.h header and start using the macro where appropriate. No functional changes. Link: http://lkml.kernel.org/r/20181129104810.23361-1-nborisov@suse.com Link: https://lkml.kernel.org/r/20181129075301.29087-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Pankaj gupta <pagupta@redhat.com> Acked-by: "Yan, Zheng" <zyan@redhat.com> [ceph] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-17btrfs: Fix typos in comments and stringsAndrea Gelmini
The typos accumulate over time so once in a while time they get fixed in a large patch. Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Refactor main loop in extent_readpagesNikolay Borisov
extent_readpages processes all pages in the readlist in batches of 16, this is implemented by a single for loop but thanks to an if condition the loop does 2 things based on whether we've filled the batch or not. Additionally due to the structure of the code there is an additional check which deals with partial batches. Streamline all of this by explicitly using two loops. The outter one is used to process all pages while the inner one just fills in the batch of 16 (currently). Due to this new structure the code guarantees that all pages are processed in the loop hence the code to deal with any leftovers is eliminated. This also enable the compiler to inline __extent_readpages: ./scripts/bloat-o-meter fs/btrfs/extent_io.o extent_io.for add/remove: 0/1 grow/shrink: 1/0 up/down: 660/-820 (-160) Function old new delta extent_readpages 476 1136 +660 __extent_readpages 820 - -820 Total: Before=44315, After=44155, chg -0.36% Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: use offset_in_page instead of open-coding itJohannes Thumshirn
Constructs like 'var & (PAGE_SIZE - 1)' or 'var & ~PAGE_MASK' can denote an offset into a page. So replace them by the offset_in_page() macro instead of open-coding it if they're not used as an alignment check. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: remove always true if branch in find_delalloc_rangeLu Fengqi
The @found is always false when it comes to the if branch. Besides, the bool type is more suitable for @found. Change the return value of the function and its caller to bool as well. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Refactor btrfs_merge_bio_hookNikolay Borisov
This function really checks whether adding more data to the bio will straddle a stripe/chunk. So first let's give it a more appropraite name - btrfs_bio_fits_in_stripe. Secondly, the offset parameter was never used to just remove it. Thirdly, pages are submitted to either btree or data inodes so it's guaranteed that tree->ops is set so replace the check with an ASSERT. Finally, document the parameters of the function. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: don't initialize 'offset' in map_private_extent_buffer()Johannes Thumshirn
In map_private_extent_buffer() the 'offset' variable is initialized to a page aligned version of the 'start' parameter. But later on it is overwritten with either the offset from the extent buffer's start or 0. So get rid of the initial initialization. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::readpage_io_failed_hookNikolay Borisov
For data inodes this hook does nothing but to return -EAGAIN which is used to signal to the endio routines that this bio belongs to a data inode. If this is the case the actual retrying is handled by bio_readpage_error. Alternatively, if this bio belongs to the btree inode then btree_io_failed_hook just does some cleanup and doesn't retry anything. This patch simplifies the code flow by eliminating readpage_io_failed_hook and instead open-coding btree_io_failed_hook in end_bio_extent_readpage. Also eliminate some needless checks since IO is always performed on either data inode or btree inode, both of which are guaranteed to have their extent_io_tree::ops set. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: replace btrfs_io_bio::end_io with a simple helperDavid Sterba
The end_io callback implemented as btrfs_io_bio_endio_readpage only calls kfree. Also the callback is set only in case the csum buffer is allocated and not pointing to the inline buffer. We can use that information to drop the indirection and call a helper that will free the csums only in the right case. This shrinks struct btrfs_io_bio by 8 bytes. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: use EXPORT_FOR_TESTS for conditionally exported functionsJohannes Thumshirn
Several functions in BTRFS are only used inside the source file they are declared if CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not defined. However if CONFIG_BTRFS_FS_RUN_SANITY_TESTS is defined these functions are shared with the unit tests code. Before the introduction of the EXPORT_FOR_TESTS macro, these functions could not be declared as static and the compiler had a harder task when optimizing and inlining them. As we have EXPORT_FOR_TESTS now, use it where appropriate to support the compiler. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Replace BUG_ON with ASSERT in find_lock_delalloc_rangeNikolay Borisov
lock_delalloc_pages should only return 2 values - 0 in case of success and -EAGAIN if the range of pages to be locked should be shrunk due to some of gone. Manual inspections confirms that this is indeed the case since __process_pages_contig is where lock_delalloc_pages gets its return value. The latter always returns 0 or -EAGAIN so the invariant holds. No functional changes. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Sink find_lock_delalloc_range's 'max_bytes' argumentNikolay Borisov
All callers of this function pass BTRFS_MAX_EXTENT_SIZE (128M) so let's reduce the argument count and make that a local variable. No functional changes. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: use tagged writepage to mitigate livelock of snapshotEthan Lien
Snapshot is expected to be fast. But if there are writers steadily creating dirty pages in our subvolume, the snapshot may take a very long time to complete. To fix the problem, we use tagged writepage for snapshot flusher as we do in the generic write_cache_pages(), so we can omit pages dirtied after the snapshot command. This does not change the semantics regarding which data get to the snapshot, if there are pages being dirtied during the snapshotting operation. There's a sync called before snapshot is taken in old/new case, any IO in flight just after that may be in the snapshot but this depends on other system effects that might still sync the IO. We do a simple snapshot speed test on a Intel D-1531 box: fio --ioengine=libaio --iodepth=32 --bs=4k --rw=write --size=64G --direct=0 --thread=1 --numjobs=1 --time_based --runtime=120 --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5; time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio original: 1m58sec patched: 6.54sec This is the best case for this patch since for a sequential write case, we omit nearly all pages dirtied after the snapshot command. For a multi writers, random write test: fio --ioengine=libaio --iodepth=32 --bs=4k --rw=randwrite --size=64G --direct=0 --thread=1 --numjobs=4 --time_based --runtime=120 --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5; time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio original: 15.83sec patched: 10.35sec The improvement is smaller compared to the sequential write case, since we omit only half of the pages dirtied after snapshot command. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Ethan Lien <ethanlien@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove unused extent_state argument from ↵Nikolay Borisov
btrfs_writepage_endio_finish_ordered This parameter was never used, yet was part of the interface of the function ever since its introduction as extent_io_ops::writepage_end_io_hook in e6dcd2dc9c48 ("Btrfs: New data=ordered implementation"). Now that NULL is passed everywhere as a value for this parameter let's remove it for good. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_page_data argument from writepage_delallocNikolay Borisov
The only remaining use of the 'epd' argument in writepage_delalloc is to reference the extent_io_tree which was set in extent_writepages. Since it is guaranteed that page->mapping of any page passed to writepage_delalloc (and __extent_writepage as the sole caller) to be equal to that passed in extent_writepages we can directly get the io_tree via the already passed inode (which is also taken from page->mapping->host). No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Move epd::extent_locked check to writepage_delalloc's callerNikolay Borisov
If epd::extent_locked is set then writepage_delalloc terminates. Make this a bit more apparent in the caller by simply bubbling the check up. This enables to remove epd as an argument to writepage_delalloc in a future patch. No functional change. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Adjust loop in free_extent_bufferNikolay Borisov
The loop construct in free_extent_buffer was added in 242e18c7c1a8 ("Btrfs: reduce lock contention on extent buffer locks") as means of reducing the times the eb lock is taken, the non-last ref count is decremented and lock is released. As the special handling of UNMAPPED extent buffers was removed now there is only one decrement op which is happening for EXTENT_BUFFER_UNMAPPED case. This commit modifies the loop condition so that in case of UNMAPPED buffers the eb's lock is taken only if we are 100% sure the eb is going to be freed by the current executor of the code. Additionally, remove superfluous ref count ops in btrfs test. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove special handling of EXTENT_BUFFER_UNMAPPED while freeingNikolay Borisov
Now that the whole of btrfs code has been audited for eb reference count management it's time to remove the hunk in free_extent_buffer that essentially considered the condition "eb->ref == 2 && EXTENT_BUFFER_DUMMY" to equal "eb->ref = 1". Also remove the last location which takes an extra reference count in alloc_test_extent_buffer. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::split_extent_hook callbackNikolay Borisov
This is the counterpart to merge_extent_hook, similarly, it's used only for data/freespace inodes so let's remove it, rename it and call it directly where necessary. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::merge_extent_hook callbackNikolay Borisov
This callback is used only for data and free space inodes. Such inodes are guaranteed to have their extent_io_tree::private_data set to the inode struct. Exploit this fact to directly call the function. Also give it a more descriptive name. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::clear_bit_hook callbackNikolay Borisov
This is the counterpart to ex-set_bit_hook (now btrfs_set_delalloc_extent), similar to what was done before remove clear_bit_hook and rename the function. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::set_bit_hook extent_io callbackNikolay Borisov
This callback is used to properly account delalloc extents for data inodes (ordinary file inodes and freespace v1 inodes). Those can be easily identified since they have their extent_io trees ->private_data member point to the inode. Let's exploit this fact to remove the needless indirection through extent_io_hooks and directly call the function. Also give the function a name which reflects its purpose - btrfs_set_delalloc_extent. This patch also modified test_find_delalloc so that the extent_io_tree used for testing doesn't have its ->private_data set which would have caused a crash in btrfs_set_delalloc_extent due to the btrfs_inode->root member not being initialised. The old version of the code also didn't call set_bit_hook since the extent_io ops weren't set for the inode. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::check_extent_io_range callbackNikolay Borisov
This callback was only used in debug builds by btrfs_leak_debug_check. A better approach is to move its implementation in btrfs_leak_debug_check and ensure the latter is only executed for extent tree which have ->private_data set i.e. relate to a data node and not the btree one. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::writepage_end_io_hookNikolay Borisov
This callback is ony ever called for data page writeout so there is no need to actually abstract it via extent_io_ops. Lets just export it, remove the definition of the callback and call it directly in the functions that invoke the callback. Also rename the function to btrfs_writepage_endio_finish_ordered since what it really does is account finished io in the ordered extent data structures. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::writepage_start_hookNikolay Borisov
This hook is called only from __extent_writepage_io which is already called only from the data page writeout path. So there is no need to make an indirect call via extent_io_ops. This patch just removes the callback definition, exports the callback function and calls it directly at the only call site. Also give the function a more descriptive name. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17btrfs: Remove extent_io_ops::fill_delallocNikolay Borisov
This callback is called only from writepage_delalloc which in turn is guaranteed to be called from the data page writeout path. In the end there is no reason to have the call to this function to be indrected via the extent_io_ops structure. This patch removes the callback definition, exports the function and calls it directly. No functional changes. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ rename to btrfs_run_delalloc_range ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-28Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray conversion from Matthew Wilcox: "The XArray provides an improved interface to the radix tree data structure, providing locking as part of the API, specifying GFP flags at allocation time, eliminating preloading, less re-walking the tree, more efficient iterations and not exposing RCU-protected pointers to its users. This patch set 1. Introduces the XArray implementation 2. Converts the pagecache to use it 3. Converts memremap to use it The page cache is the most complex and important user of the radix tree, so converting it was most important. Converting the memremap code removes the only other user of the multiorder code, which allows us to remove the radix tree code that supported it. I have 40+ followup patches to convert many other users of the radix tree over to the XArray, but I'd like to get this part in first. The other conversions haven't been in linux-next and aren't suitable for applying yet, but you can see them in the xarray-conv branch if you're interested" * 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits) radix tree: Remove multiorder support radix tree test: Convert multiorder tests to XArray radix tree tests: Convert item_delete_rcu to XArray radix tree tests: Convert item_kill_tree to XArray radix tree tests: Move item_insert_order radix tree test suite: Remove multiorder benchmarking radix tree test suite: Remove __item_insert memremap: Convert to XArray xarray: Add range store functionality xarray: Move multiorder_check to in-kernel tests xarray: Move multiorder_shrink to kernel tests xarray: Move multiorder account test in-kernel radix tree test suite: Convert iteration test to XArray radix tree test suite: Convert tag_tagged_items to XArray radix tree: Remove radix_tree_clear_tags radix tree: Remove radix_tree_maybe_preload_order radix tree: Remove split/join code radix tree: Remove radix_tree_update_node_t page cache: Finish XArray conversion dax: Convert page fault handlers to XArray ...
2018-10-21btrfs: Convert page cache to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: David Sterba <dsterba@suse.com>
2018-10-21pagevec: Use xa_mark_tMatthew Wilcox
Removes sparse warnings. Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-15btrfs: tests: add separate stub for find_lock_delalloc_rangeDavid Sterba
The helper find_lock_delalloc_range is now conditionally built static, dpending on whether the self-tests are enabled or not. There's a macro that is supposed to hide the export, used only once. To discourage further use, drop it an add a public wrapper for the helper needed by tests. Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-15Btrfs: skip set_page_dirty if eb pages are already dirtyLiu Bo
As long as @eb is marked with EXTENT_BUFFER_DIRTY, all of its pages are dirty, so no need to set pages dirty again. Ftrace showed that the loop took 10us on my dev box, so removing this can save us at least 10us if eb is already dirty and otherwise avoid a potentially expensive calls to set_page_dirty. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-15Btrfs: assert page dirty bit on extent buffer pagesLiu Bo
Just in case that someone breaks the rule that pages are dirty as long as eb is dirty. The next patch will dirty the pages conditionally. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-10-15Btrfs: use next_state in find_first_extent_bitLiu Bo
As next_state() is already defined to get the next state, use it in find_first_extent_bit. No functional changes. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-17btrfs: readpages() should submit IO as read-aheadJens Axboe
a_ops->readpages() is only ever used for read-ahead. Ensure that we pass this information down to the block layer. Link: http://lkml.kernel.org/r/20180621010725.17813-4-axboe@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <clm@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-06btrfs: drop extent_io_ops::set_range_writeback callbackDavid Sterba
The data and metadata callback implementation both use the same function. We can remove the call indirection and intermediate helper completely. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: drop extent_io_ops::merge_bio_hook callbackDavid Sterba
The data and metadata callback implementation both use the same function. We can remove the call indirection completely. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: drop extent_io_ops::tree_fs_info callbackDavid Sterba
All implementations of the callback are trivial and do the same and there's only one user. Merge everything together. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Rename EXTENT_BUFFER_DUMMY to EXTENT_BUFFER_UNMAPPEDNikolay Borisov
EXTENT_BUFFER_DUMMY is an awful name for this flag. Buffers which have this flag set are not in any way dummy. Rather, they are private in the sense that are not mapped and linked to the global buffer tree. This flag has subtle implications to the way free_extent_buffer works for example, as well as controls whether page->mapping->private_lock is held during extent_buffer release. Pages for an unmapped buffer cannot be under io, nor can they be written by a 3rd party so taking the lock is unnecessary. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ EXTENT_BUFFER_UNMAPPED, update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Document locking requirement via lockdep_assert_heldNikolay Borisov
Remove stale comment since there is no longer an eb->eb_lock and document the locking expectation with a lockdep_assert_held statement. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: rename btrfs_release_extent_buffer_pageDavid Sterba
The function used to release one page (and always the first one), but not anymore since a50924e3a4d7fccb0ecfbd4 ("btrfs: drop constant param from btrfs_release_extent_buffer_page"). Update the name and comment. Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Refactor loop in btrfs_release_extent_buffer_pageNikolay Borisov
The purpose of the function is to free all the pages comprising an extent buffer. This can be achieved with a simple for loop rather than the slightly more involved 'do {} while' construct. So rewrite the loop using a 'for' construct. Additionally we can never have an extent_buffer that has 0 pages so remove the check for index == 0. No functional changes. The reversed order used to have a meaning in the past where the first page served as a blocking point for several callers. See eg 4f2de97acee6532b36dd6e99 ("Btrfs: set page->private to the eb"). Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Reword dodgy comments in alloc_extent_bufferNikolay Borisov
Commit eb14ab8ed24a ("Btrfs: fix page->private races") fixed a genuine race between extent buffer initialisation and btree_releasepage. Unfortunately as the code has evolved the comments weren't changed which made them slightly wrong and they weren't very clear in the fist place. Fix this by (hopefully) rewording them in a more approachable manner. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Simplify page unlocking in alloc_extent_bufferNikolay Borisov
Current version of the page unlocking code was added in 727011e07cbd ("Btrfs: allow metadata blocks larger than the page size") but even in this commit that particular flag was never used per-se. In fact, btrfs only uses PageChecked for data pages to identify pages which have been dirtied but don't have ORDERED bit set. For more information see 247e743cbe6e ("Btrfs: Use async helpers to deal with pages that have been improperly dirtied"). However, this doesn't apply to extent buffer pages. The important bit here is that the pages are unlocked AFTER the extent buffer has been properly recorded in the radix tree to avoid races with btree_releasepage. Let's exploit this fact and simplify the page unlocking sequence by unlocking the pages in-order and removing the redundant PageChecked flag setting/clearing. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: open-code bio_set_op_attrsDavid Sterba
The helper is trivial and marked as deprecated. Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: switch types to int when counting eb pagesDavid Sterba
The loops iterating eb pages use unsigned long, that's an overkill as we know that there are at most 16 pages (64k / 4k), and 4 by default (with nodesize 16k). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>