summaryrefslogtreecommitdiff
path: root/mm/slub.c
AgeCommit message (Collapse)Author
2024-04-25mm/slab: add allocation accounting into slab allocation and free pathsSuren Baghdasaryan
Account slab allocations using codetag reference embedded into slabobj_ext. Link: https://lkml.kernel.org/r/20240321163705.3067592-24-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creationSuren Baghdasaryan
Slab extension objects can't be allocated before slab infrastructure is initialized. Some caches, like kmem_cache and kmem_cache_node, are created before slab infrastructure is initialized. Objects from these caches can't have extension objects. Introduce SLAB_NO_OBJ_EXT slab flag to mark these caches and avoid creating extensions for objects allocated from these slabs. Link: https://lkml.kernel.org/r/20240321163705.3067592-9-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creationSuren Baghdasaryan
Introduce __GFP_NO_OBJ_EXT flag in order to prevent recursive allocations when allocating slabobj_ext on a slab. Link: https://lkml.kernel.org/r/20240321163705.3067592-8-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25mm: introduce slabobj_ext to support slab object extensionsSuren Baghdasaryan
Currently slab pages can store only vectors of obj_cgroup pointers in page->memcg_data. Introduce slabobj_ext structure to allow more data to be stored for each slab object. Wrap obj_cgroup into slabobj_ext to support current functionality while allowing to extend slabobj_ext in the future. Link: https://lkml.kernel.org/r/20240321163705.3067592-7-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25mm/slub: mark slab_free_freelist_hook() __always_inlineKent Overstreet
It seems we need to be more forceful with the compiler on this one. This is done for performance reasons only. Link: https://lkml.kernel.org/r/20240321163705.3067592-4-surenb@google.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-23slub: use count_partial_free_approx() in slab_out_of_memory()Jianfeng Wang
slab_out_of_memory() uses count_partial() to get the exact count of free objects for each node. As it may get called in the slab allocation path, count_partial_free_approx() can be used to avoid the risk and overhead of traversing a long partial slab list. At the same time, show_slab_objects() still uses count_partial(). Thus, slub users can still have the option to access the exact count of objects via sysfs if the overhead is acceptable to them. Signed-off-by: Jianfeng Wang <jianfeng.w.wang@oracle.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-23slub: introduce count_partial_free_approx()Jianfeng Wang
When reading "/proc/slabinfo", the kernel needs to report the number of free objects for each kmem_cache. The current implementation uses count_partial() to get it by scanning each kmem_cache_node's partial slab list and summing free objects from every partial slab. This process must hold per-kmem_cache_node spinlock and disable IRQ, and may take a long time. Consequently, it can block slab allocations on other CPUs and cause timeouts for network devices, when the partial list is long. In production, even NMI watchdog can be triggered due to this matter: e.g., for "buffer_head", the number of partial slabs was observed to be ~1M in one kmem_cache_node. This problem was also confirmed by others [1-3]. Iterating a partial list to get the exact count of objects can cause soft lockups for a long list with or without the lock (e.g., if preemption is disabled), and may not be very useful: the object count can change after the lock is released. The approach of maintaining free-object counters requires atomic operations on the fast path [3]. So, the fix is to introduce count_partial_free_approx(). This function can be used for getting the free object count in a kmem_cache_node's partial list. It limits the number of slabs to scan and avoids scanning the whole list by giving an approximation for a long list. Suppose the limit is N. If the list's length is not greater than N, output the exact count by traversing the list; if its length is greater than N, output an approximated count by traversing a subset of the list. The proposed method is to scan N/2 slabs from the list's head and N/2 slabs from the tail. For a partial list with ~280K slabs, benchmarks show that it performs better than just counting from the list's head, after slabs get sorted by kmem_cache_shrink(). Default the limit to 10000, as it produces an approximation within 1% of the exact count for both scenarios. Then, use count_partial_free_approx() in get_slabinfo(). Benchmarks: Diff = (exact - approximated) / exact * Normal case (w/o kmem_cache_shrink()): | MAX_TO_SCAN | Diff (count from head)| Diff (count head+tail)| | 1000 | 0.43 % | 1.09 % | | 5000 | 0.06 % | 0.37 % | | 10000 | 0.02 % | 0.16 % | | 20000 | 0.009 % | -0.003 % | * Skewed case (w/ kmem_cache_shrink()): | MAX_TO_SCAN | Diff (count from head)| Diff (count head+tail)| | 1000 | 12.46 % | 6.75 % | | 5000 | 5.38 % | 1.27 % | | 10000 | 4.99 % | 0.22 % | | 20000 | 4.86 % | -0.06 % | [1] https://lore.kernel.org/linux-mm/alpine.DEB.2.21.2003031602460.1537@www.lameter.com/T/ [2] https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2008071258020.55871@www.lameter.com/T/ [3] https://lore.kernel.org/lkml/1e01092b-140d-2bab-aeba-321a74a194ee@linux.com/T/ Signed-off-by: Jianfeng Wang <jianfeng.w.wang@oracle.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-15slub: Set __GFP_COMP in kmem_cache by defaultHaifeng Xu
Now the __GFP_COMP is set only if the higher-order is not 0. However, __GFP_COMP flag can be set unconditionally because compound page can not be created in the order-0 case. And this can also simplify the code a bit (no need to check the order is 0 or not). Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-09mm/slub: remove duplicate initialization for early_kmem_cache_node_alloc()Sangyun Kim
The struct track for every object in a new slab is already set up by new_slab(), so remove the duplicate initialization in early_kmem_cache_node_alloc(). Co-developed-by: Hyunmin Lee <hyunminlr@gmail.com> Signed-off-by: Hyunmin Lee <hyunminlr@gmail.com> Co-developed-by: Jeungwoo Yoo <casionwoo@gmail.com> Signed-off-by: Jeungwoo Yoo <casionwoo@gmail.com> Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04mm/slub: correct comment in do_slab_free()Xiu Jianfeng
slab_alloc_node() should be __slab_alloc_node(). Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04mm/slub: simplify get_partial_node()Xiongwei Song
The break conditions for filling cpu partial can be more readable and simple. If slub_get_cpu_partial() returns 0, we can confirm that we don't need to fill cpu partial, then we should break from the loop. On the other hand, we also should break from the loop if we have added enough cpu partial slabs. Meanwhile, the logic above gets rid of the #ifdef and also fixes a weird corner case that if we set cpu_partial_slabs to 0 from sysfs, we still allocate at least one here. Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04mm/slub: add slub_get_cpu_partial() helperXiongwei Song
Add slub_get_cpu_partial() and dummy function to help improve get_partial_node(). It can help remove #ifdef of CONFIG_SLUB_CPU_PARTIAL and improve filling cpu partial logic. Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-04mm/slub: remove the check of !kmem_cache_has_cpu_partial()Xiongwei Song
The check of !kmem_cache_has_cpu_partial(s) with CONFIG_SLUB_CPU_PARTIAL enabled here is always false. We have already checked kmem_cache_debug() earlier and if it was true, then we either continued or broke from the loop so we can't reach this code in that case and don't need to check kmem_cache_debug() as part of kmem_cache_has_cpu_partial() again. Here we can remove it. Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-04-02mm/slub: Reduce memory consumption in extreme scenariosChen Jun
When kmalloc_node() is called without __GFP_THISNODE and the target node lacks sufficient memory, SLUB allocates a folio from a different node other than the requested node, instead of taking a partial slab from it. However, since the allocated folio does not belong to the requested node, on the following allocation it is deactivated and added to the partial slab list of the node it belongs to. This behavior can result in excessive memory usage when the requested node has insufficient memory, as SLUB will repeatedly allocate folios from other nodes without reusing the previously allocated ones. To prevent memory wastage, when a preferred node is indicated (not NUMA_NO_NODE) but without a prior __GFP_THISNODE constraint: 1) try to get a partial slab from target node only by having __GFP_THISNODE in pc.flags for get_partial() 2) if 1) failed, try to allocate a new slab from target node with GFP_NOWAIT | __GFP_THISNODE opportunistically. 3) if 2) failed, retry with original gfpflags which will allow get_partial() try partial lists of other nodes before potentially allocating new page from other nodes Without a preferred node, or with __GFP_THISNODE constraint, the behavior remains unchanged. On qemu with 4 numa nodes and each numa has 1G memory. Write a test ko to call kmalloc_node(196, GFP_KERNEL, 3) for (4 * 1024 + 4) * 1024 times. cat /proc/slabinfo shows: kmalloc-256 4200530 13519712 256 32 2 : tunables.. after this patch, cat /proc/slabinfo shows: kmalloc-256 4200558 4200768 256 32 2 : tunables.. Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-25mm/slub: mark racy accesses on slab->slabslinke li
The reads of slab->slabs are racy because it may be changed by put_cpu_partial concurrently. In slabs_cpu_partial_show() and show_slab_objects(), slab->slabs is only used for showing information. Data-racy reads from shared variables that are used only for diagnostic purposes should typically use data_race(), since it is normally not a problem if the values are off by a little. This patch is aimed at reducing the number of benign races reported by KCSAN in order to focus future debugging effort on harmful races. Signed-off-by: linke li <lilinke99@qq.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-25mm/slub: remove dummy slabinfo functionsXiu Jianfeng
The SLAB implementation has been removed since 6.8, so there is no other version of slabinfo_show_stats() and slabinfo_write(), then we can remove these two dummy functions. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-12Merge branch 'slab/for-6.9/slab-flag-cleanups' into slab/for-linusVlastimil Babka
Merge a series from myself that replaces hardcoded SLAB_ cache flag values with an enum, and explicitly deprecates the SLAB_MEM_SPREAD flag that is a no-op sine SLAB removal.
2024-03-12Merge branch 'slab/for-6.9/optimize-get-freelist' into slab/for-linusVlastimil Babka
Merge a series from Chengming Zhou that optimizes cpu freelist loading when grabbing a cpu partial slab, and removes some unnecessary code.
2024-03-04mm, slab: remove memcg_from_slab_obj()Vlastimil Babka
This empty wrapped exists only for !CONFIG_MEMCG_KMEM and seems it was never used. Probably a leftover from development of a series. Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-01mm, slab: remove the corner case of inc_slabs_node()Chengming Zhou
We already have the inc_slabs_node() after kmem_cache_node->node[node] initialized in early_kmem_cache_node_alloc(), this special case of inc_slabs_node() can be removed. Then we don't need to consider the existence of kmem_cache_node in inc_slabs_node() anymore. Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-01mm/slab: Fix a kmemleak in kmem_cache_destroy()Xiaolei Wang
For earlier kmem cache creation, slab_sysfs_init() has not been called. Consequently, kmem_cache_destroy() cannot utilize kobj_type::release to release the kmem_cache structure. Therefore, tweak kmem_cache_release() to use slab_kmem_cache_release() for releasing kmem_cache when slab_state isn't FULL. This will fixes the memory leaks like following: unreferenced object 0xffff0000c2d87080 (size 128): comm "swapper/0", pid 1, jiffies 4294893428 hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 6b 6b 6b 6b .....N......kkkk ff ff ff ff ff ff ff ff b8 ab 48 89 00 80 ff ff.....H..... backtrace (crc 8819d0f6): [<ffff80008317a298>] kmemleak_alloc+0xb0/0xc4 [<ffff8000807e553c>] kmem_cache_alloc_node+0x288/0x3a8 [<ffff8000807e95f0>] __kmem_cache_create+0x1e4/0x64c [<ffff8000807216bc>] kmem_cache_create_usercopy+0x1c4/0x2cc [<ffff8000807217e0>] kmem_cache_create+0x1c/0x28 [<ffff8000819f6278>] arm_v7s_alloc_pgtable+0x1c0/0x6d4 [<ffff8000819f53a0>] alloc_io_pgtable_ops+0xe8/0x2d0 [<ffff800084b2d2c4>] arm_v7s_do_selftests+0xe0/0x73c [<ffff800080016b68>] do_one_initcall+0x11c/0x7ac [<ffff800084a71ddc>] kernel_init_freeable+0x53c/0xbb8 [<ffff8000831728d8>] kernel_init+0x24/0x144 [<ffff800080018e98>] ret_from_fork+0x10/0x20 Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-02-26mm, slab: use an enum to define SLAB_ cache creation flagsVlastimil Babka
The values of SLAB_ cache creation flags are defined by hand, which is tedious and error-prone. Use an enum to assign the bit number and a __SLAB_FLAG_BIT() macro to #define the final flags. This renumbers the flag values, which is OK as they are only used internally. Also define a __SLAB_FLAG_UNUSED macro to assign value to flags disabled by their respective config options in a unified and sparse-friendly way. Reviewed-and-tested-by: Xiongwei Song <xiongwei.song@windriver.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-02-21mm, slab: fix the comment of cpu partial listChengming Zhou
The partial slabs on cpu partial list are not frozen after the commit 8cd3fa428b56 ("slub: Delay freezing of partial slabs") merged. So fix the comment. Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-02-21mm, slab: remove unused object_size parameter in kmem_cache_flags()Chengming Zhou
We don't use the object_size parameter in kmem_cache_flags(), so just remove it. Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-30mm/slub: remove parameter 'flags' in create_kmalloc_caches()Zheng Yejian
After commit 16a1d968358a ("mm/slab: remove mm/slab.c and slab_def.h"), parameter 'flags' is only passed as 0 in create_kmalloc_caches(), and then it is only passed to new_kmalloc_cache(). So we can change parameter 'flags' to be a local variable with initial value 0 in new_kmalloc_cache() and remove parameter 'flags' in create_kmalloc_caches(). Also make new_kmalloc_cache() static due to it is only used in mm/slab_common.c. Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-23mm/slub: remove unused parameter in next_freelist_entry()Chengming Zhou
The parameter "struct slab *slab" is unused in next_freelist_entry(), so just remove it. Acked-by: Christoph Lameter (Ampere) <cl@linux.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-23mm/slub: remove full list manipulation for non-debug slabChengming Zhou
Since debug slab is processed by free_to_partial_list(), and only debug slab which has SLAB_STORE_USER flag would care about the full list, we can remove these unrelated full list manipulations from __slab_free(). Acked-by: Christoph Lameter (Ampere) <cl@linux.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-23mm/slub: directly load freelist from cpu partial slab in the likely caseChengming Zhou
The likely case is that we get a usable slab from the cpu partial list, we can directly load freelist from it and return back, instead of going the other way that need more work, like reenable interrupt and recheck. But we need to remove the "VM_BUG_ON(!new.frozen)" in get_freelist() for reusing it, since cpu partial slab is not frozen. It seems acceptable since it's only for debug purpose. And get_freelist() also assumes it can return NULL if the freelist is empty, which is not possible for the cpu partial slab case, so we add "VM_BUG_ON(!freelist)" after get_freelist() to make it explicit. There is some small performance improvement too, which shows by: perf bench sched messaging -g 5 -t -l 100000 mm-stable slub-optimize Total time 7.473 7.209 Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-22mm/slub: unify all sl[au]b parameters with "slab_$param"Xiongwei Song
Since the SLAB allocator has been removed, so we can clean up the sl[au]b_$params. With only one slab allocator left, it's better to use the generic "slab" term instead of "slub" which is an implementation detail, which is pointed out by Vlastimil Babka. For more information please see [1]. Hence, we are going to use "slab_$param" as the primary prefix. This patch is changing the following slab parameters - slub_max_order - slub_min_order - slub_min_objects - slub_debug to - slab_max_order - slab_min_order - slab_min_objects - slab_debug as the primary slab parameters for all references of them in docs and comments. But this patch won't change variables and functions inside slub as we will have wider slub/slab change. Meanwhile, "slub_$params" can also be passed by command line, which is to keep backward compatibility. Also mark all "slub_$params" as legacy. Remove the separate descriptions for slub_[no]merge, append legacy tip for them at the end of descriptions of slab_[no]merge. [1] https://lore.kernel.org/linux-mm/7512b350-4317-21a0-fab3-4101bc4d8f7a@suse.cz/ Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-01-09Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05slub: use alloc_pages_node() in alloc_slab_page()Matthew Wilcox (Oracle)
For no apparent reason, we were open-coding alloc_pages_node() in this function. Link: https://lkml.kernel.org/r/20231228085748.1083901-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-04Merge branch 'slab/for-6.8/slub-hook-cleanups' into slab/for-nextVlastimil Babka
Merge the SLAB allocator removal and a number of subsequent SLUB cleanups and optimizations.
2023-12-29kasan: rename and document kasan_(un)poison_object_dataAndrey Konovalov
Rename kasan_unpoison_object_data to kasan_unpoison_new_object and add a documentation comment. Do the same for kasan_poison_object_data. The new names and the comments should suggest the users that these hooks are intended for internal use by the slab allocator. The following patch will remove non-slab-internal uses of these hooks. No functional changes. [andreyknvl@google.com: update references to renamed functions in comments] Link: https://lkml.kernel.org/r/20231221180637.105098-1-andrey.konovalov@linux.dev Link: https://lkml.kernel.org/r/eab156ebbd635f9635ef67d1a4271f716994e628.1703024586.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Lobakin <alobakin@pm.me> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Breno Leitao <leitao@debian.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-28mm/slub: free KFENCE objects in slab_free_hook()Vlastimil Babka
When freeing an object that was allocated from KFENCE, we do that in the slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be the cpu slab, so the fastpath has to fallback to the slowpath. This optimization doesn't help much though, because is_kfence_address() is checked earlier anyway during the free hook processing or detached freelist building. Thus we can simplify the code by making the slab_free_hook() free the KFENCE object immediately, similarly to KASAN quarantine. In slab_free_hook() we can place kfence_free() above init processing, as callers have been making sure to set init to false for KFENCE objects. This simplifies slab_free(). This places it also above kasan_slab_free() which is ok as that skips KFENCE objects anyway. While at it also determine the init value in slab_free_freelist_hook() outside of the loop. This change will also make introducing per cpu array caches easier. Tested-by: Marco Elver <elver@google.com> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-10slub, kasan: improve interaction of KASAN and slub_debug poisoningAndrey Konovalov
When both KASAN and slub_debug are enabled, when a free object is being prepared in setup_object, slub_debug poisons the object data before KASAN initializes its per-object metadata. Right now, in setup_object, KASAN only initializes the alloc metadata, which is always stored outside of the object. slub_debug is aware of this and it skips poisoning and checking that memory area. However, with the following patch in this series, KASAN also starts initializing its free medata in setup_object. As this metadata might be stored within the object, this initialization might overwrite the slub_debug poisoning. This leads to slub_debug reports. Thus, skip checking slub_debug poisoning of the object data area that overlaps with the in-object KASAN free metadata. Also make slub_debug poisoning of tail kmalloc redzones more precise when KASAN is enabled: slub_debug can still poison and check the tail kmalloc allocation area that comes after the KASAN free metadata. Link: https://lkml.kernel.org/r/20231122231202.121277-1-andrey.konovalov@linux.dev Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Marco Elver <elver@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-07mm/slub: handle bulk and single object freeing separatelyVlastimil Babka
Currently we have a single function slab_free() handling both single object freeing and bulk freeing with necessary hooks, the latter case requiring slab_free_freelist_hook(). It should be however better to distinguish the two use cases for the following reasons: - code simpler to follow for the single object case - better code generation - although inlining should eliminate the slab_free_freelist_hook() for single object freeing in case no debugging options are enabled, it seems it's not perfect. When e.g. KASAN is enabled, we're imposing additional unnecessary overhead for single object freeing. - preparation to add percpu array caches in near future Therefore, simplify slab_free() for the single object case by dropping unnecessary parameters and calling only slab_free_hook() instead of slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk() and adjust callers accordingly. While at it, flip (and document) slab_free_hook() return value so that it returns true when the freeing can proceed, which matches the logic of slab_free_freelist_hook() and is not confusingly the opposite. Additionally we can simplify a bit by changing the tail parameter of do_slab_free() when freeing a single object - instead of NULL we can set it equal to head. bloat-o-meter shows small code reduction with a .config that has KASAN etc disabled: add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118) Function old new delta kmem_cache_alloc_bulk 1203 1196 -7 kmem_cache_free 861 835 -26 __kmem_cache_free 741 704 -37 kmem_cache_free_bulk 911 863 -48 Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-07mm/slub: introduce __kmem_cache_free_bulk() without free hooksVlastimil Babka
Currently, when __kmem_cache_alloc_bulk() fails, it frees back the objects that were allocated before the failure, using kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free hooks (KASAN etc.) and those expect objects that were processed by the post alloc hooks, slab_post_alloc_hook() is called before kmem_cache_free_bulk(). This is wasteful, although not a big concern in practice for the rare error path. But in order to efficiently handle percpu array batch refill and free in the near future, we will also need a variant of kmem_cache_free_bulk() that avoids the free hooks. So introduce it now and use it for the failure path. In case of failure we however still need to perform memcg uncharge so handle that in a new memcg_slab_alloc_error_hook(). Thanks to Chengming Zhou for noticing the missing uncharge. As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg parameter, remove it. Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-07mm/slub: fix bulk alloc and free statsVlastimil Babka
The SLUB sysfs stats enabled CONFIG_SLUB_STATS have two deficiencies identified wrt bulk alloc/free operations: - Bulk allocations from cpu freelist are not counted. Add the ALLOC_FASTPATH counter there. - Bulk fastpath freeing will count a list of multiple objects with a single FREE_FASTPATH inc. Add a stat_add() variant to count them all. Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slub: optimize free fast path code layoutVlastimil Babka
Inspection of kmem_cache_free() disassembly showed we could make the fast path smaller by providing few more hints to the compiler, and splitting the memcg_slab_free_hook() into an inline part that only checks if there's work to do, and an out of line part doing the actual uncharge. bloat-o-meter results: add/remove: 2/0 grow/shrink: 0/3 up/down: 286/-554 (-268) Function old new delta __memcg_slab_free_hook - 270 +270 __pfx___memcg_slab_free_hook - 16 +16 kfree 828 665 -163 kmem_cache_free 1116 948 -168 kmem_cache_free_bulk.part 1701 1478 -223 Checking kmem_cache_free() disassembly now shows the non-fastpath cases are handled out of line, which should reduce instruction cache usage. Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slub: optimize alloc fastpath code layoutVlastimil Babka
With allocation fastpaths no longer divided between two .c files, we have better inlining, however checking the disassembly of kmem_cache_alloc() reveals we can do better to make the fastpaths smaller and move the less common situations out of line or to separate functions, to reduce instruction cache pressure. - split memcg pre/post alloc hooks to inlined checks that use likely() to assume there will be no objcg handling necessary, and non-inline functions doing the actual handling - add some more likely/unlikely() to pre/post alloc hooks to indicate which scenarios should be out of line - change gfp_allowed_mask handling in slab_post_alloc_hook() so the code can be optimized away when kasan/kmsan/kmemleak is configured out bloat-o-meter shows: add/remove: 4/2 grow/shrink: 1/8 up/down: 521/-2924 (-2403) Function old new delta __memcg_slab_post_alloc_hook - 461 +461 kmem_cache_alloc_bulk 775 791 +16 __pfx_should_failslab.constprop - 16 +16 __pfx___memcg_slab_post_alloc_hook - 16 +16 should_failslab.constprop - 12 +12 __pfx_memcg_slab_post_alloc_hook 16 - -16 kmem_cache_alloc_lru 1295 1023 -272 kmem_cache_alloc_node 1118 817 -301 kmem_cache_alloc 1076 772 -304 kmalloc_node_trace 1149 838 -311 kmalloc_trace 1102 789 -313 __kmalloc_node_track_caller 1393 1080 -313 __kmalloc_node 1397 1082 -315 __kmalloc 1374 1059 -315 memcg_slab_post_alloc_hook 464 - -464 Note that gcc still decided to inline __memcg_pre_alloc_hook(), but the code is out of line. Forcing noinline did not improve the results. As a result the fastpaths are shorter and overal code size is reduced. Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappersVlastimil Babka
slab_alloc() is a thin wrapper around slab_alloc_node() with only one caller. Replace with direct call of slab_alloc_node(). __kmem_cache_alloc_lru() itself is a thin wrapper with two callers, so replace it with direct calls of slab_alloc_node() and trace_kmem_cache_alloc(). This also makes sure _RET_IP_ has always the expected value and not depending on inlining decisions. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move kmalloc() functions from slab_common.c to slub.cVlastimil Babka
This will eliminate a call between compilation units through __kmem_cache_alloc_node() and allow better inlining of the allocation fast path. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move kfree() from slab_common.c to slub.cVlastimil Babka
This should result in better code. Currently kfree() makes a function call between compilation units to __kmem_cache_free() which does its own virt_to_slab(), throwing away the struct slab pointer we already had in kfree(). Now it can be reused. Additionally kfree() can now inline the whole SLUB freeing fastpath. Also move over free_large_kmalloc() as the only callsites are now in slub.c, and make it static. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move struct kmem_cache_node from slab.h to slub.cVlastimil Babka
The declaration and associated helpers are not used anywhere else anymore. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move memcg related functions from slab.h to slub.cVlastimil Babka
We don't share those between SLAB and SLUB anymore, so most memcg related functions can be moved to slub.c proper. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move pre/post-alloc hooks from slab.h to slub.cVlastimil Babka
We don't share the hooks between two slab implementations anymore so they can be moved away from the header. As part of the move, also move should_failslab() from slab_common.c as the pre_alloc hook uses it. This means slab.h can stop including fault-inject.h and kmemleak.h. Fix up some files that were depending on the includes transitively. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move struct kmem_cache_cpu declaration to slub.cVlastimil Babka
Nothing outside SLUB itself accesses the struct kmem_cache_cpu fields so it does not need to be declared in slub_def.h. This allows also to move enum stat_item. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-05mm/slab, docs: switch mm-api docs generation from slab.c to slub.cVlastimil Babka
The SLAB implementation is going to be removed, and mm-api.rst currently uses mm/slab.c to obtain kerneldocs for some API functions. Switch it to mm/slub.c and move the relevant kerneldocs of exported functions from one to the other. The rest of kerneldocs in slab.c is for static SLAB implementation-specific functions that don't have counterparts in slub.c and thus can be simply removed with the implementation. Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-05slub: Update frozen slabs documentations in the sourceChengming Zhou
The current updated scheme (which this series implemented) is: - node partial slabs: PG_Workingset && !frozen - cpu partial slabs: !PG_Workingset && !frozen - cpu slabs: !PG_Workingset && frozen - full slabs: !PG_Workingset && !frozen The most important change is that "frozen" bit is not set for the cpu partial slabs anymore, __slab_free() will grab node list_lock then check by !PG_Workingset that it's not on a node partial list. And the "frozen" bit is still kept for the cpu slabs for performance, since we don't need to grab node list_lock to check whether the PG_Workingset is set or not if the "frozen" bit is set in __slab_free(). Update related documentations and comments in the source. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Christoph Lameter (Ampere) <cl@linux.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>