path: root/Documentation/core-api/xarray.rst
Age | Commit message | Author
2025-03-17 | xarray: add xas_try_split() to split a multi-index entry | Zi Yan

Patch series "Buddy allocator like (or non-uniform) folio split", v10.

This patchset adds a new buddy allocator like (or non-uniform) large folio split from an order-n folio to order-m with m < n. It reduces:

1. the total number of after-split folios from 2^(n-m) to n-m+1;
2. the amount of memory needed for multi-index xarray split from 2^(n/6-m/6) to n/6-m/6, assuming XA_CHUNK_SHIFT=6.

3. It also keeps more large folios after a split: instead of all order-m folios, the result is folios of order-(n-1) down to order-m.

For example, to split an order-9 to order-0, folio split generates 10 (or 11 for anonymous memory) folios instead of 512, allocates 1 xa_node instead of 8, and leaves 1 order-8, 1 order-7, ..., 1 order-1 and 2 order-0 folios (or 4 order-0 for anonymous memory) instead of 512 order-0 folios.

Instead of duplicating existing split_huge_page*() code, __folio_split() is introduced as the shared backend code for both split_huge_page_to_list_to_order() and folio_split(). __folio_split() can support both uniform split and buddy allocator like (or non-uniform) split. All existing split_huge_page*() users can be gradually converted to use folio_split() if possible. In this patchset, I converted truncate_inode_partial_folio() to use folio_split().

xfstests quick group passed for both tmpfs and xfs. I also semi-replicated Hugh's test[12] and ran it without any issue for almost 24 hours.

This patch (of 8):

A preparation patch for non-uniform folio split, which always splits a folio in half iteratively, and for minimal xarray entry split.

Currently, xas_split_alloc() and xas_split() always split all slots from a multi-index entry. They cost the same number of xa_node as the to-be-split slots. For example, to split an order-9 entry, which takes 2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL), 8 xa_node are needed.

Instead, xas_try_split() is intended to be used iteratively to split the order-9 entry into 2 order-8 entries, then split one order-8 entry, based on the given index, to 2 order-7 entries, ..., and split one order-1 entry to 2 order-0 entries. When splitting the order-6 entry and a new xa_node is needed, xas_try_split() will try to allocate one if possible. As a result, xas_try_split() would only need 1 xa_node instead of 8.

When a new xa_node is needed during the split, xas_try_split() can try to allocate one but no more. -ENOMEM will be returned if a node cannot be allocated. -EINVAL will be returned if a sibling node is split or a cascade split happens, where two or more new nodes are needed; these cases are not supported by xas_try_split().

xas_split_alloc() and xas_split() split an order-9 to order-0:

             ---------------------------------
             |   |   |   |   |   |   |   |   |
             | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
             |   |   |   |   |   |   |   |   |
             ---------------------------------
               |   |                   |   |
         -------   ---               ---   -------
         |           |     ...       |           |
         V           V               V           V
    ----------- -----------     ----------- -----------
    | xa_node | | xa_node | ... | xa_node | | xa_node |
    ----------- -----------     ----------- -----------

xas_try_split() splits an order-9 to order-0:

       ---------------------------------
       |   |   |   |   |   |   |   |   |
       | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
       |   |   |   |   |   |   |   |   |
       ---------------------------------
         |
         |
         V
    -----------
    | xa_node |
    -----------

Link: https://lkml.kernel.org/r/20250307174001.242794-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250307174001.242794-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Kairui Song <kasong@tencent.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
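The folio counts quoted in this commit message can be checked with a small userspace model. This is only an arithmetic sketch in Python, not kernel code: the function names are hypothetical, it covers the file-backed case only (the anonymous-memory counts of 11 folios and 4 order-0 folios are not modelled), and it assumes XA_CHUNK_SHIFT is 6 as the message does.

```python
XA_CHUNK_SHIFT = 6  # value assumed by the commit message (!CONFIG_BASE_SMALL)


def uniform_split_orders(n, m):
    """Orders of the folios after a uniform split of one order-n folio to order-m."""
    if not 0 <= m < n:
        raise ValueError("need 0 <= m < n")
    return [m] * (1 << (n - m))


def non_uniform_split_orders(n, m):
    """Orders after halving repeatedly and re-splitting only one of the halves."""
    if not 0 <= m < n:
        raise ValueError("need 0 <= m < n")
    orders = []
    current = n
    while current > m:
        current -= 1
        orders.append(current)  # the half that is left alone at this step
    orders.append(m)  # the half that was split all the way down to order-m
    return orders


def parent_slots_spanned(order):
    """Slots an order-`order` entry occupies one level above the leaf nodes."""
    if not XA_CHUNK_SHIFT <= order < 2 * XA_CHUNK_SHIFT:
        raise ValueError("model only covers a single level above the leaves")
    return 1 << (order - XA_CHUNK_SHIFT)


if __name__ == "__main__":
    print(len(uniform_split_orders(9, 0)))   # folios after a uniform split
    print(non_uniform_split_orders(9, 0))    # folio orders after a non-uniform split
    print(parent_slots_spanned(9))           # slots taken by an order-9 entry
```

Both functions return folios that together cover the same 2^n pages, so the model only changes how the pages are grouped, never how many there are.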
2025-01-12 | XArray: minor documentation improvements | Tamir Duberstein
- Replace "they" with "you" where "you" is used in the preceding sentence fragment.
- Mention `xa_erase` in discussion of multi-index entries. Split this into a separate sentence.
- Add "call" parentheses on "xa_store" for consistency and linkification.
- Add caveat that `xa_store` and `xa_erase` are not equivalent in the presence of `XA_FLAGS_ALLOC`.
Link: https://lkml.kernel.org/r/20241105-xarray-documentation-v5-1-8e1702321b41@gmail.com
Signed-off-by: Tamir Duberstein <tamird@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-02-03 | XArray: Document the locking requirement for the xa_state | Matthew Wilcox (Oracle)
It wasn't obvious to all readers that it's unsafe to reuse an xa_state after dropping the xas_lock() or the rcu_read_lock().
Reported-by: Charan Teja Kalla <charante@codeaurora.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-16 | XArray: add xas_split | Matthew Wilcox (Oracle)
In order to use multi-index entries for huge pages in the page cache, we need to be able to split a multi-index entry (e.g. if a file is truncated in the middle of a huge page entry). This version does not support splitting more than one level of the tree at a time. This is an acceptable limitation for the page cache as we do not expect to support order-12 pages in the near future.
[akpm@linux-foundation.org: export xas_split_alloc() to modules]
[willy@infradead.org: fix xarray split]
Link: https://lkml.kernel.org/r/20200910175450.GV6583@casper.infradead.org
[willy@infradead.org: fix xarray]
Link: https://lkml.kernel.org/r/20201001233943.GW20115@casper.infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Qian Cai <cai@lca.pw>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/r/20200903183029.14930-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-17 | XArray: Add xa_for_each_range | Matthew Wilcox (Oracle)
This function supports iterating over a range of an array. Also add documentation links for xa_for_each_start().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2019-11-08 | XArray: Improve documentation of search marks | Matthew Wilcox (Oracle)
Move most of the mark-related documentation to its own section to make it easier to understand. Add clarification that you can't search for an unset mark, and you can't yet search for combinations of marks.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2019-06-26 | docs: remove :c:func: annotations from xarray.rst | Jonathan Corbet
Now that the build system automatically marks up function references, we don't have to clutter the source files, so take it out.
[Some paragraphs could now benefit from refilling, but that was left out to avoid obscuring the real changes.]
Acked-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-02-20 | XArray: Use xa_cmpxchg to implement xa_reserve | Matthew Wilcox
Jason feels this is clearer, and it saves a function and an exported symbol.
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2019-02-06 | XArray: Add cyclic allocation | Matthew Wilcox
This differs slightly from the IDR equivalent in five ways.
1. It can allocate up to UINT_MAX instead of being limited to INT_MAX, like xa_alloc(). Also like xa_alloc(), it will write to the 'id' pointer before placing the entry in the XArray.
2. The 'next' cursor is allocated separately from the XArray instead of being part of the IDR. This saves memory for all the users which do not use the cyclic allocation API and suits some users better.
3. It returns -EBUSY instead of -ENOSPC.
4. It will attempt to wrap back to the minimum value on memory allocation failure as well as on an -EBUSY error, assuming that a user would rather allocate a small ID than suffer an ID allocation failure.
5. It reports whether it has wrapped, which is important to some users.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
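The cursor and wrap behaviour listed in this commit can be illustrated with a toy userspace model. This is a hedged sketch, not the kernel implementation: `alloc_cyclic`, its arguments, and the use of a Python set for occupied IDs are all hypothetical, and neither the retry on memory allocation failure nor the write to the 'id' pointer is modelled. It shows only a caller-owned 'next' cursor, a wrap back to the minimum, a "wrapped" status, and -EBUSY when the range is full.

```python
import errno


def alloc_cyclic(used, lo, hi, cursor):
    """Allocate the next free ID in [lo, hi], searching from the cursor first.

    Returns (status, new_id, new_cursor), where status is 0 on success,
    1 on success after wrapping back to lo, and -EBUSY if no ID is free.
    """
    start = max(cursor, lo)
    # First pass: from the cursor up to the top of the range.
    for idx in range(start, hi + 1):
        if idx not in used:
            used.add(idx)
            return 0, idx, idx + 1
    # Second pass: wrap around to the minimum value and report the wrap.
    for idx in range(lo, min(start, hi + 1)):
        if idx not in used:
            used.add(idx)
            return 1, idx, idx + 1
    return -errno.EBUSY, None, cursor


if __name__ == "__main__":
    used, cursor = set(), 0
    for _ in range(3):
        status, new_id, cursor = alloc_cyclic(used, 1, 3, cursor)
        print(status, new_id, cursor)
```

Keeping the cursor outside the set of IDs mirrors point 2 of the list: callers that never allocate cyclically do not pay for it.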
2019-02-06 | XArray: Add support for 1s-based allocation | Matthew Wilcox
A lot of places want to allocate IDs starting at 1 instead of 0. While the xa_alloc() API supports this, it's not very efficient if lots of IDs are allocated, due to having to walk down to the bottom of the tree to see if ID 1 is available, then all the way over to the next non-allocated ID. This method marks ID 0 as being occupied which wastes one slot in the XArray, but preserves xa_empty() as working.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2019-02-06 | XArray: Change xa_insert to return -EBUSY | Matthew Wilcox
Userspace translates EEXIST to "File exists" which isn't a very good error message for the problem. "Device or resource busy" is a better indication of what went wrong.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2019-01-06 | XArray: Honour reserved entries in xa_insert | Matthew Wilcox
xa_insert() should treat reserved entries as occupied, not as available. Also, it should treat requests to insert a NULL pointer as a request to reserve the slot. Add xa_insert_bh() and xa_insert_irq() for completeness.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-12-06 | XArray: Add xa_cmpxchg_irq and xa_cmpxchg_bh | Matthew Wilcox
These convenience wrappers match the other _irq and _bh wrappers we already have. It turns out I'd already open-coded xa_cmpxchg_irq() in the shmem code, so convert that.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-05 | XArray: Fix Documentation | Matthew Wilcox
Minor fixes.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-05 | XArray: Handle NULL pointers differently for allocation | Matthew Wilcox
For allocating XArrays, it makes sense to distinguish between erasing an entry and storing NULL. Storing NULL keeps the index allocated with a NULL pointer associated with it, while xa_erase() frees the index. Some existing IDR users rely on this ability.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
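The distinction this commit draws can be shown with a toy model of an allocating array. This is a hedged userspace sketch, not the XArray implementation: the class and method names are hypothetical, and Python's `None` stands in for a NULL pointer. The only point is that storing NULL leaves the index allocated, while erasing makes it available to the allocator again.

```python
class AllocatingArrayModel:
    """Toy model of an allocating array: index -> entry, None stands for NULL."""

    def __init__(self):
        # A key being present means the index is allocated, even if the
        # value stored there is None.
        self._slots = {}

    def alloc(self, entry, lo=0, hi=2**32 - 1):
        """Claim the lowest free index in [lo, hi] and store entry there."""
        for index in range(lo, hi + 1):
            if index not in self._slots:
                self._slots[index] = entry
                return index
        raise RuntimeError("no free index in range")

    def store(self, index, entry):
        """Store an entry; storing None keeps the index allocated."""
        self._slots[index] = entry

    def erase(self, index):
        """Free the index so that alloc() may hand it out again."""
        self._slots.pop(index, None)

    def load(self, index):
        """Return the entry at index, or None if it is NULL or unallocated."""
        return self._slots.get(index)


if __name__ == "__main__":
    arr = AllocatingArrayModel()
    first = arr.alloc("a")
    arr.store(first, None)   # index stays allocated, entry is NULL
    print(arr.alloc("b"))    # a different index is handed out
    arr.erase(first)         # index is freed
    print(arr.alloc("c"))    # the freed index is reused
```

Note that `load()` cannot tell a NULL entry from an unallocated index in this model; only the allocator sees the difference.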
2018-11-05 | XArray: Add xa_store_bh() and xa_store_irq() | Matthew Wilcox
These convenience wrappers disable interrupts while taking the spinlock. A number of drivers would otherwise have to open-code these functions.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-05 | XArray: Regularise xa_reserve | Matthew Wilcox
The xa_reserve() function was a little unusual in that it attempted to be callable for all kinds of locking scenarios. Make it look like the other APIs with __xa_reserve, xa_reserve_bh and xa_reserve_irq variants.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 | xarray: Add range store functionality | Matthew Wilcox
This version of xa_store_range() really only supports load and store. Our only user only needs basic load and store functionality, so there's no need to do the extra work to support marking and overlapping stores correctly yet.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 | xarray: Track free entries in an XArray | Matthew Wilcox
Add the optional ability to track which entries in an XArray are free and provide xa_alloc() to replace most of the functionality of the IDR.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 | xarray: Add xa_reserve and xa_release | Matthew Wilcox
This function reserves a slot in the XArray for users which need to acquire multiple locks before storing their entry in the tree and so cannot use a plain xa_store().
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 | xarray: Add documentation | Matthew Wilcox
This is documentation on how to use the XArray, not details about its internal implementation.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Josef Bacik <jbacik@fb.com>