summaryrefslogtreecommitdiff
path: root/kernel/sysctl.c
AgeCommit message (Collapse)Author
10 dayssysctl: rename kern_table -> sysctl_subsys_tableJoel Granados
Renamed sysctl table from kern_table to sysctl_subsys_table and grouped the two arch specific ctls to the end of the array. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayskernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.cJoel Granados
Moved ctl_tables elements for overflowuid and overflowgid into in kernel/sys.c. Create a register function that keeps them under "kernel" and run it after core with postcore_initcall. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysuevent: mv uevent_helper into kobject_uevent.cJoel Granados
Move both uevent_helper table into lib/kobject_uevent.c. Place the registration early in the initcall order with postcore_initcall. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayssysctl: Remove superfluous includes from kernel/sysctl.cJoel Granados
Remove the following headers from the include list in sysctl.c. * These are removed as the related variables are no longer there. =================== ==================== Include Related Var =================== ==================== linux/kmod.h usermodehelper asm/nmi.h nmi_watchdoc_enabled asm/io.h io_delay_type linux/pid.h pid_max_{,min,max} linux/sched/sysctl.h sysctl_{sched_*,numa_*,timer_*} linux/mount.h sysctl_mount_max linux/reboot.h poweroff_cmd linux/ratelimit.h {,printk_}ratelimit_state linux/printk.h kptr_restrict linux/security.h CONFIG_SECURITY_CAPABILITIES linux/net.h net_table linux/key.h key_sysctls linux/nvs_fs.h acpi_video_flags linux/acpi.h acpi_video_flags linux/fs.h proc_nr_files * These are no longer needed as intermediate includes ============== Include ============== linux/filter.h linux/binfmts.h Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayssysctl: Remove (very) old file changelogJoel Granados
These comments are older than 2003 and therefore do not bare any relevance on the current state of the sysctl.c file. Remove them as they confuse more than clarify. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayssysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.cJoel Granados
This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayssysctl: move cad_pid into kernel/pid.cJoel Granados
Move cad_pid as well as supporting function proc_do_cad_pid into kernel/pic.c. Replaced call to __do_proc_dointvec with proc_dointvec inside proc_do_cad_pid which requires the copy of the ctl_table to handle the temp value. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayssysctl: Move tainted ctl_table into kernel/panic.cJoel Granados
Move the ctl_table with the "tainted" proc_name into kernel/panic.c. With it moves the proc_tainted helper function. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysInput: sysrq: mv sysrq into drivers/tty/sysrq.cJoel Granados
Move both sysrq ctl_table and supported sysrq_sysctl_handler helper function into drivers/tty/sysrq.c. Replaced the __do_proc_dointvec in helper function with do_proc_dointvec_minmax as the former is local to kernel/sysctl.c. Here we use the minmax version of do_proc_dointvec because do_proc_dointvec is static and calling do_proc_dointvec_minmax with a NULL min and max is the same as calling do_proc_dointvec. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysfork: mv threads-max into kernel/fork.cJoel Granados
make sysctl_max_threads static as it no longer needs to be exported into sysctl.c. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysparisc/power: Move soft-power into power.cJoel Granados
Move the soft-power ctl table into parisc/power.c. As a consequence the pwrsw_enabled var is made static. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysmm: move randomize_va_space into memory.cJoel Granados
Move the randomize_va_space variable together with all its sysctl table elements into memory.c. Register it to the "kernel" directory by adding it to the subsys initialization calls This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysrcu: Move rcu_stall related sysctls into rcu/tree_stall.hJoel Granados
Move sysctl_panic_on_rcu_stall and sysctl_max_rcu_stall_to_panic into the kernel/rcu subdirectory. Make these static in tree_stall.h and removed them as extern from panic.h as their scope is now confined into one file. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 dayslocking/rtmutex: Move max_lock_depth into rtmutex.cJoel Granados
Move the max_lock_depth sysctl table element into rtmutex_api.c. Removed the rtmutex.h include from sysctl.c. Chose to move into rtmutex_api.c to avoid multiple registrations every time rtmutex.c is included in other files. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
10 daysmodule: Move modprobe_path and modules_disabled ctl_tables into the module ↵Joel Granados
subsys Move module sysctl (modprobe_path and modules_disabled) out of sysctl.c and into the modules subsystem. Make modules_disabled static as it no longer needs to be exported. Remove module.h from the includes in sysctl as it no longer uses any module exported variables. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-04-09sparc: mv sparc sysctls into their own file under arch/sparc/kernelJoel Granados
Move sparc sysctls (reboot-cmd, stop-a, scons-poweroff and tsb-ratio) into a new file (arch/sparc/kernel/setup.c). This file will be included for both 32 and 64 bit sparc. Leave "tsb-ratio" under SPARC64 ifdef as it was in kernel/sysctl.c. The sysctl table register is called with arch_initcall placing it after its original place in proc_root_init. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-04-09stack_tracer: move sysctl registration to kernel/trace/trace_stack.cJoel Granados
Move stack_tracer_enabled into trace_stack_sysctl_table. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-04-09tracing: Move trace sysctls into trace.cJoel Granados
Move trace ctl tables into their own const array in kernel/trace/trace.c. The sysctl table register is called with subsys_initcall placing if after its original place in proc_root_init. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-09signal: Move signal ctl tables into signal.cJoel Granados
Move print-fatal-signals into its own const ctl table array in kernel/signal.c. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-04-09panic: Move panic ctl tables into panic.cJoel Granados
Move panic, panic_on_oops, panic_print, panic_on_warn into kerne/panic.c. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-03-29Merge tag 's390-6.15-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add sorting of mcount locations at build time - Rework uaccess functions with C exception handling to shorten inline assembly size and enable full inlining. This yields near-optimal code for small constant copies with a ~40kb kernel size increase - Add support for a configurable STRICT_MM_TYPECHECKS which allows to generate better code, but also allows to have type checking for debug builds - Optimize get_lowcore() for common callers with alternatives that nearly revert to the pre-relocated lowcore code, while also slightly reducing syscall entry and exit time - Convert MACHINE_HAS_* checks for single facility tests into cpu_has_* style macros that call test_facility(), and for features with additional conditions, add a new ALT_TYPE_FEATURE alternative to provide a static branch via alternative patching. Also, move machine feature detection to the decompressor for early patching and add debugging functionality to easily show which alternatives are patched - Add exception table support to early boot / startup code to get rid of the open coded exception handling - Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE to ensure correct inlining and unrolling decisions - Remove 2k page table leftovers now that s390 has been switched to always allocate 4k page tables - Split kfence pool into 4k mappings in arch_kfence_init_pool() and remove the architecture-specific kfence_split_mapping() - Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence spurious KASAN warnings from opportunistic ftrace argument tracing - Force __atomic_add_const() variants on s390 to always return void, ensuring compile errors for improper usage - Remove s390's ioremap_wt() and pgprot_writethrough() due to mismatched semantics and lack of known users, relying on asm-generic fallbacks - Signal eventfd in vfio-ap to notify userspace when the guest AP configuration changes, including during mdev removal - Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap drivers to avoid fake flex array confusion - Cleanup trap code - Remove references to the outdated linux390@de.ibm.com address - Other various small fixes and improvements all over the code * tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (78 commits) s390: Use inline qualifier for all EX_TABLE and ALTERNATIVE inline assemblies s390/kfence: Split kfence pool into 4k mappings in arch_kfence_init_pool() s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth() s390/boot: Ignore vmlinux.map s390/sysctl: Remove "vm/allocate_pgste" sysctl s390: Remove 2k vs 4k page table leftovers s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste() s390/lowcore: Use lghi instead llilh to clear register s390/syscall: Merge __do_syscall() and do_syscall() s390/spinlock: Implement SPINLOCK_LOCKVAL with inline assembly s390/smp: Implement raw_smp_processor_id() with inline assembly s390/current: Implement current with inline assembly s390/lowcore: Use inline qualifier for get_lowcore() inline assembly s390: Move s390 sysctls into their own file under arch/s390 s390/syscall: Simplify syscall_get_arguments() s390/vfio-ap: Notify userspace that guest's AP config changed when mdev removed s390: Remove ioremap_wt() and pgprot_writethrough() s390/mm: Add configurable STRICT_MM_TYPECHECKS s390/mm: Convert pgste_val() into function s390/mm: Convert pgprot_val() into function ...
2025-03-26Merge tag 'sysctl-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move vm_table members out of kernel/sysctl.c All vm_table array members have moved to their respective subsystems leading to the removal of vm_table from kernel/sysctl.c. This increases modularity by placing the ctl_tables closer to where they are actually used and at the same time reducing the chances of merge conflicts in kernel/sysctl.c. - ctl_table range fixes Replace the proc_handler function that checks variable ranges in coredump_sysctls and vdso_table with the one that actually uses the extra{1,2} pointers as min/max values. This tightens the range of the values that users can pass into the kernel effectively preventing {under,over}flows. - Misc fixes Correct grammar errors and typos in test messages. Update sysctl files in MAINTAINERS. Constified and removed array size in declaration for alignment_tbl * tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits) selftests/sysctl: fix wording of help messages selftests: fix spelling/grammar errors in sysctl/sysctl.sh MAINTAINERS: Update sysctl file list in MAINTAINERS sysctl: Fix underflow value setting risk in vm_table coredump: Fixes core_pipe_limit sysctl proc_handler sysctl: remove unneeded include sysctl: remove the vm_table sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c fs: dcache: move the sysctl to fs/dcache.c sunrpc: simplify rpcauth_cache_shrink_count() fs: drop_caches: move sysctl to fs/drop_caches.c fs: fs-writeback: move sysctl to fs/fs-writeback.c mm: nommu: move sysctl to mm/nommu.c security: min_addr: move sysctl to security/min_addr.c mm: mmap: move sysctl to mm/mmap.c mm: util: move sysctls to mm/util.c mm: vmscan: move vmscan sysctls to mm/vmscan.c mm: swap: move sysctl to mm/swap.c mm: filemap: move sysctl to mm/filemap.c ...
2025-03-24Merge tag 'x86-core-2025-03-22' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Ingo Molnar: "x86 CPU features support: - Generate the <asm/cpufeaturemasks.h> header based on build config (H. Peter Anvin, Xin Li) - x86 CPUID parsing updates and fixes (Ahmed S. Darwish) - Introduce the 'setcpuid=' boot parameter (Brendan Jackman) - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan Jackman) - Utilize CPU-type for CPU matching (Pawan Gupta) - Warn about unmet CPU feature dependencies (Sohil Mehta) - Prepare for new Intel Family numbers (Sohil Mehta) Percpu code: - Standardize & reorganize the x86 percpu layout and related cleanups (Brian Gerst) - Convert the stackprotector canary to a regular percpu variable (Brian Gerst) - Add a percpu subsection for cache hot data (Brian Gerst) - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak) - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak) MM: - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction (Rik van Riel) - Rework ROX cache to avoid writable copy (Mike Rapoport) - PAT: restore large ROX pages after fragmentation (Kirill A. Shutemov, Mike Rapoport) - Make memremap(MEMREMAP_WB) map memory as encrypted by default (Kirill A. Shutemov) - Robustify page table initialization (Kirill A. Shutemov) - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn) - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW (Matthew Wilcox) KASLR: - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir Singh) CPU bugs: - Implement FineIBT-BHI mitigation (Peter Zijlstra) - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta) - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta) - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta) System calls: - Break up entry/common.c (Brian Gerst) - Move sysctls into arch/x86 (Joel Granados) Intel LAM support updates: (Maciej Wieczor-Retman) - selftests/lam: Move cpu_has_la57() to use cpuinfo flag - selftests/lam: Skip test if LAM is disabled - selftests/lam: Test get_user() LAM pointer handling AMD SMN access updates: - Add SMN offsets to exclusive region access (Mario Limonciello) - Add support for debugfs access to SMN registers (Mario Limonciello) - Have HSMP use SMN through AMD_NODE (Yazen Ghannam) Power management updates: (Patryk Wlazlyn) - Allow calling mwait_play_dead with an arbitrary hint - ACPI/processor_idle: Add FFH state handling - intel_idle: Provide the default enter_dead() handler - Eliminate mwait_play_dead_cpuid_hint() Build system: - Raise the minimum GCC version to 8.1 (Brian Gerst) - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor) Kconfig: (Arnd Bergmann) - Add cmpxchg8b support back to Geode CPUs - Drop 32-bit "bigsmp" machine support - Rework CONFIG_GENERIC_CPU compiler flags - Drop configuration options for early 64-bit CPUs - Remove CONFIG_HIGHMEM64G support - Drop CONFIG_SWIOTLB for PAE - Drop support for CONFIG_HIGHPTE - Document CONFIG_X86_INTEL_MID as 64-bit-only - Remove old STA2x11 support - Only allow CONFIG_EISA for 32-bit Headers: - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers (Thomas Huth) Assembly code & machine code patching: - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf) - x86/alternatives: Simplify callthunk patching (Peter Zijlstra) - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf) - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf) - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra) - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> (Uros Bizjak) - Use named operands in inline asm (Uros Bizjak) - Improve performance by using asm_inline() for atomic locking instructions (Uros Bizjak) Earlyprintk: - Harden early_serial (Peter Zijlstra) NMI handler: - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() (Waiman Long) Miscellaneous fixes and cleanups: - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner, Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly Kuznetsov, Xin Li, liuye" * tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits) zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault x86/asm: Make asm export of __ref_stack_chk_guard unconditional x86/mm: Only do broadcast flush from reclaim if pages were unmapped perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones perf/x86/intel, x86/cpu: Simplify Intel PMU initialization x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions x86/asm: Use asm_inline() instead of asm() in clwb() x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h> x86/hweight: Use asm_inline() instead of asm() x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm() x86/hweight: Use named operands in inline asm() x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP x86/head/64: Avoid Clang < 17 stack protector in startup code x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h> x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro x86/cpu/intel: Limit the non-architectural constant_tsc model checks x86/mm/pat: Replace Intel x86_model checks with VFM ones x86/cpu/intel: Fix fast string initialization for extended Families ...
2025-03-18s390: Move s390 sysctls into their own file under arch/s390joel granados
Move s390 sysctls (spin_retry and userprocess_debug) into their own files under arch/s390. Create two new sysctl tables (2390_{fault,spin}_sysctl_table) which will be initialized with arch_initcall placing them after their original place in proc_root_init. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Signed-off-by: joel granados <joel.granados@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20250306-jag-mv_ctltables-v2-6-71b243c8d3f8@kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-02-21perf/core: Move perf_event sysctls into kernel/eventsJoel Granados
Move ctl tables to two files: - perf_event_{paranoid,mlock_kb,max_sample_rate} and perf_cpu_time_max_percent into kernel/events/core.c - perf_event_max_{stack,context_per_stack} into kernel/events/callchain.c Make static variables and functions that are fully contained in core.c and callchain.cand remove them from include/linux/perf_event.h. Additionally six_hundred_forty_kb is moved to callchain.c. Two new sysctl tables are added ({callchain,events_core}_sysctl_table) with their respective sysctl registration functions. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kerenel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250218-jag-mv_ctltables-v1-5-cd3698ab8d29@kernel.org
2025-02-18x86: Move sysctls into arch/x86Joel Granados
Move the following sysctl tables into arch/x86/kernel/setup.c: panic_on_{unrecoverable_nmi,io_nmi} bootloader_{type,version} io_delay_type unknown_nmi_panic acpi_realmode_flags Variables moved from include/linux/ to arch/x86/include/asm/ because there is no longer need for them outside arch/x86/kernel: acpi_realmode_flags panic_on_{unrecoverable_nmi,io_nmi} Include <asm/nmi.h> in arch/s86/kernel/setup.h in order to bring in panic_on_{io_nmi,unrecovered_nmi}. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kerenel/sysctl.c. Signed-off-by: Joel Granados <joel.granados@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250218-jag-mv_ctltables-v1-8-cd3698ab8d29@kernel.org
2025-02-07sysctl: remove unneeded includeKaixiong Yu
Removing unneeded mm includes in kernel/sysctl.c. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07sysctl: remove the vm_tableKaixiong Yu
After patch1~14 is applied, all sysctls of vm_table would be moved. So, delete vm_table. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.cKaixiong Yu
When CONFIG_SUPERH and CONFIG_VSYSCALL are defined, vdso_enabled belongs to arch/sh/kernel/vsyscall/vsyscall.c. So, move it into its own file. To avoid failure when registering the vdso_table, move the call to register_sysctl_init() into its own fs_initcall(). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.cKaixiong Yu
When CONFIG_X86_32 is defined and CONFIG_UML is not defined, vdso_enabled belongs to arch/x86/entry/vdso/vdso32-setup.c. So, move it into its own file. Before this patch, vdso_enabled was allowed to be set to a value exceeding 1 on x86_32 architecture. After this patch is applied, vdso_enabled is not permitted to set the value more than 1. It does not matter, because according to the function load_vdso32(), only vdso_enabled is set to 1, VDSO would be enabled. Other values all mean "disabled". The same limitation could be seen in the function vdso32_setup(). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07fs: dcache: move the sysctl to fs/dcache.cKaixiong Yu
The sysctl_vfs_cache_pressure belongs to fs/dcache.c, move it to fs/dcache.c from kernel/sysctl.c. As a part of fs/dcache.c cleaning, sysctl_vfs_cache_pressure is changed to a static variable, and change the inline-type function vfs_pressure_ratio() to out-of-inline type, export vfs_pressure_ratio() with EXPORT_SYMBOL_GPL to be used by other files. Move the unneeded include(linux/dcache.h). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07fs: drop_caches: move sysctl to fs/drop_caches.cKaixiong Yu
The sysctl_drop_caches to fs/drop_caches.c, move it to fs/drop_caches.c from /kernel/sysctl.c. And remove the useless extern variable declaration from include/linux/mm.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07fs: fs-writeback: move sysctl to fs/fs-writeback.cKaixiong Yu
The dirtytime_expire_interval belongs to fs/fs-writeback.c, move it to fs/fs-writeback.c from /kernel/sysctl.c. And remove the useless extern variable declaration and the function declaration from include/linux/writeback.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: nommu: move sysctl to mm/nommu.cKaixiong Yu
The sysctl_nr_trim_pages belongs to nommu.c, move it to mm/nommu.c from /kernel/sysctl.c. And remove the useless extern variable declaration from include/linux/mm.h Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07security: min_addr: move sysctl to security/min_addr.cKaixiong Yu
The dac_mmap_min_addr belongs to min_addr.c, move it to min_addr.c from /kernel/sysctl.c. In the previous Linux kernel boot process, sysctl_init_bases needs to be executed before init_mmap_min_addr, So, register_sysctl_init should be executed before update_mmap_min_addr in init_mmap_min_addr. And according to the compilation condition in security/Makefile: obj-$(CONFIG_MMU) += min_addr.o if CONFIG_MMU is not defined, min_addr.c would not be included in the compilation process. So, drop the CONFIG_MMU check. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: mmap: move sysctl to mm/mmap.cKaixiong Yu
This moves all mmap related sysctls to mm/mmap.c, as part of the kernel/sysctl.c cleaning, also move the variable declaration from kernel/sysctl.c into mm/mmap.c. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: util: move sysctls to mm/util.cKaixiong Yu
This moves all util related sysctls to mm/util.c, as part of the kernel/sysctl.c cleaning, also removes redundant external variable declarations and function declarations. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: vmscan: move vmscan sysctls to mm/vmscan.cKaixiong Yu
This moves vm_swappiness and zone_reclaim_mode to mm/vmscan.c, as part of the kernel/sysctl.c cleaning, also moves some external variable declarations and function declarations from include/linux/swap.h into mm/internal.h. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: swap: move sysctl to mm/swap.cKaixiong Yu
The page-cluster belongs to mm/swap.c, move it to mm/swap.c . Removes the redundant external variable declaration and unneeded include(linux/swap.h). Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: filemap: move sysctl to mm/filemap.cKaixiong Yu
This moves the filemap related sysctl to mm/filemap.c, and removes the redundant external variable declaration. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-02-07mm: vmstat: move sysctls to mm/vmstat.cKaixiong Yu
This moves all vmstat related sysctls to its own file, removes useless extern variable declarations, and do some related clean-ups. To avoid compiler warnings when CONFIG_PROC_FS is not defined, add the macro definition CONFIG_PROC_FS ahead CONFIG_NUMA in vmstat.c. Signed-off-by: Kaixiong Yu <yukaixiong@huawei.com> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-28treewide: const qualify ctl_tables where applicableJoel Granados
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/inifiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function. Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers. Created this by running an spatch followed by a sed command: Spatch: virtual patch @ depends on !(file in "net") disable optional_qualifier @ identifier table_name != { watchdog_hardlockup_sysctl, iwcm_ctl_table, ucma_ctl_table, memory_allocation_profiling_sysctls, loadpin_sysctl_table }; @@ + const struct ctl_table table_name [] = { ... }; sed: sed --in-place \ -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \ kernel/utsname_sysctl.c Reviewed-by: Song Liu <song@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/ Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Bill O'Donnell <bodonnel@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-12-02pid: allow pid_max to be set per pid namespaceChristian Brauner
The pid_max sysctl is a global value. For a long time the default value has been 65535 and during the pidfd dicussions Linus proposed to bump pid_max by default (cf. [1]). Based on this discussion systemd started bumping pid_max to 2^22. So all new systems now run with a very high pid_max limit with some distros having also backported that change. The decision to bump pid_max is obviously correct. It just doesn't make a lot of sense nowadays to enforce such a low pid number. There's sufficient tooling to make selecting specific processes without typing really large pid numbers available. In any case, there are workloads that have expections about how large pid numbers they accept. Either for historical reasons or architectural reasons. One concreate example is the 32-bit version of Android's bionic libc which requires pid numbers less than 65536. There are workloads where it is run in a 32-bit container on a 64-bit kernel. If the host has a pid_max value greater than 65535 the libc will abort thread creation because of size assumptions of pthread_mutex_t. That's a fairly specific use-case however, in general specific workloads that are moved into containers running on a host with a new kernel and a new systemd can run into issues with large pid_max values. Obviously making assumptions about the size of the allocated pid is suboptimal but we have userspace that does it. Of course, giving containers the ability to restrict the number of processes in their respective pid namespace indepent of the global limit through pid_max is something desirable in itself and comes in handy in general. Independent of motivating use-cases the existence of pid namespaces makes this also a good semantical extension and there have been prior proposals pushing in a similar direction. The trick here is to minimize the risk of regressions which I think is doable. The fact that pid namespaces are hierarchical will help us here. What we mostly care about is that when the host sets a low pid_max limit, say (crazy number) 100 that no descendant pid namespace can allocate a higher pid number in its namespace. Since pid allocation is hierarchial this can be ensured by checking each pid allocation against the pid namespace's pid_max limit. This means if the allocation in the descendant pid namespace succeeds, the ancestor pid namespace can reject it. If the ancestor pid namespace has a higher limit than the descendant pid namespace the descendant pid namespace will reject the pid allocation. The ancestor pid namespace will obviously not care about this. All in all this means pid_max continues to enforce a system wide limit on the number of processes but allows pid namespaces sufficient leeway in handling workloads with assumptions about pid values and allows containers to restrict the number of processes in a pid namespace through the pid_max interface. [1]: https://lore.kernel.org/linux-api/CAHk-=wiZ40LVjnXSi9iHLE_-ZBsWFGCgdmNiYZUXn1-V5YBg2g@mail.gmail.com - rebased from 5.14-rc1 - a few fixes (missing ns_free_inum on error path, missing initialization, etc) - permission check changes in pid_table_root_permissions - unsigned int pid_max -> int pid_max (keep pid_max type as it was) - add READ_ONCE in alloc_pid() as suggested by Christian - rebased from 6.7 and take into account: * sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table) * sysctl: treewide: constify ctl_table_header::ctl_table_arg * pidfd: add pidfs * tracing: Move saved_cmdline code into trace_sched_switch.c Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Link: https://lore.kernel.org/r/20241122132459.135120-2-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-23sysctl: Reorganize kerneldoc parameter namesJulia Lawall
Reorganize kerneldoc parameter names to match the parameter order in the function header. Problems identified using Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2024-07-24sysctl: treewide: constify the ctl_table argument of proc_handlersJoel Granados
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script: ``` virtual patch @r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos); @r2@ identifier func, ctl, write, buffer, lenp, ppos; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... } @r3@ identifier func; @@ int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *); @r4@ identifier func, ctl; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *); @r5@ identifier func, write, buffer, lenp, ppos; @@ int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos); ``` * Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted. * The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration. Co-developed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Co-developed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-03sysctl: constify ctl_table arguments of utility functionThomas Weißschuh
In a future commit the proc_handlers themselves will change to "const struct ctl_table". As a preparation for that adapt the internal helper. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-06-03sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_arrayWen Yang
Move boundary checking for proc_dou8ved_minmax into module loading, thereby reporting errors in advance. And add a kunit test case ensuring the boundary check is done correctly. The boundary check in proc_dou8vec_minmax done to the extra elements in the ctl_table struct is currently performed at runtime. This allows buggy kernel modules to be loaded normally without any errors only to fail when used. This is a buggy example module: #include <linux/kernel.h> #include <linux/module.h> #include <linux/sysctl.h> static struct ctl_table_header *_table_header = NULL; static unsigned char _data = 0; struct ctl_table table[] = { { .procname = "foo", .data = &_data, .maxlen = sizeof(u8), .mode = 0644, .proc_handler = proc_dou8vec_minmax, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE_THOUSAND, }, }; static int init_demo(void) { _table_header = register_sysctl("kernel", table); if (!_table_header) return -ENOMEM; return 0; } module_init(init_demo); MODULE_LICENSE("GPL"); And this is the result: # insmod test.ko # cat /proc/sys/kernel/foo cat: /proc/sys/kernel/foo: Invalid argument Suggested-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Wen Yang <wen.yang@linux.dev> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Joel Granados <j.granados@samsung.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-kernel@vger.kernel.org Reviewed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-04-24kernel misc: Remove the now superfluous sentinel elements from ctl_table arrayJoel Granados
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove the sentinel from ctl_table arrays. Reduce by one the values used to compare the size of the adjusted arrays. Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-03-18tracing: Support to dump instance traces by ftrace_dump_on_oopsHuang Yiwei
Currently ftrace only dumps the global trace buffer on an OOPs. For debugging a production usecase, instance trace will be helpful to check specific problems since global trace buffer may be used for other purposes. This patch extend the ftrace_dump_on_oops parameter to dump a specific or multiple trace instances: - ftrace_dump_on_oops=0: as before -- don't dump - ftrace_dump_on_oops[=1]: as before -- dump the global trace buffer on all CPUs - ftrace_dump_on_oops=2 or =orig_cpu: as before -- dump the global trace buffer on CPU that triggered the oops - ftrace_dump_on_oops=<instance_name>: new behavior -- dump the tracing instance matching <instance_name> - ftrace_dump_on_oops[=2/orig_cpu],<instance1_name>[=2/orig_cpu], <instrance2_name>[=2/orig_cpu]: new behavior -- dump the global trace buffer and multiple instance buffer on all CPUs, or only dump on CPU that triggered the oops if =2 or =orig_cpu is given Also, the sysctl node can handle the input accordingly. Link: https://lore.kernel.org/linux-trace-kernel/20240223083126.1817731-1-quic_hyiwei@quicinc.com Cc: Ross Zwisler <zwisler@google.com> Cc: <mhiramat@kernel.org> Cc: <mark.rutland@arm.com> Cc: <mcgrof@kernel.org> Cc: <keescook@chromium.org> Cc: <j.granados@samsung.com> Cc: <mathieu.desnoyers@efficios.com> Cc: <corbet@lwn.net> Signed-off-by: Huang Yiwei <quic_hyiwei@quicinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-11-01Merge tag 'asm-generic-6.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull ia64 removal and asm-generic updates from Arnd Bergmann: - The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. - The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. * tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: hexagon: Remove unusable symbols from the ptrace.h uapi asm-generic: Fix spelling of architecture arch: Reserve map_shadow_stack() syscall number for all architectures syscalls: Cleanup references to sys_lookup_dcookie() Documentation: Drop or replace remaining mentions of IA64 lib/raid6: Drop IA64 support Documentation: Drop IA64 from feature descriptions kernel: Drop IA64 support from sig_fault handlers arch: Remove Itanium (IA-64) architecture