Age | Commit message (Collapse) | Author |
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Xin Li <xin@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Complete the AMD CPUID(4) emulation logic, which uses CPUID(0x80000006)
for L2/L3 cache info and an assocs[] associativity mapping array, by
adding entries for 3-way caches and 6-way caches.
Properly handle the case where CPUID(0x80000006) returns an L2/L3
associativity of 9. This is not real associativity, but a marker to
indicate that the respective L2/L3 cache information should be retrieved
from CPUID(0x8000001d) instead. If such a marker is encountered, return
early from legacy_amd_cpuid4(), thus effectively emulating an "invalid
index" CPUID(4) response with a cache type of zero.
When checking if CPUID(0x80000006) L2/L3 cache info output is valid, and
given the associtivity marker 9 above, do not just check if the whole
ECX/EDX register is zero. Rather, check if the associativity is zero or
9. An associativity of zero implies no L2/L3 cache, which make it the
more correct check anyway vs. a zero check of the whole output register.
Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors")
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250409122233.1058601-3-darwi@linutronix.de
|
|
For the AMD CPUID(4) emulation cache info logic, the same associativity
mapping array, assocs[], is used for both CPUID(0x80000005) and
CPUID(0x80000006).
This is incorrect since per the AMD manuals, the mappings for
CPUID(0x80000005) L1d/L1i associativity is:
n = 0x1 -> 0xfe n
n = 0xff fully associative
while assocs[] maps these values to:
n = 0x1, 0x2, 0x4 n
n = 0x3, 0x7, 0x9 0
n = 0x6 8
n = 0x8 16
n = 0xa 32
n = 0xb 48
n = 0xc 64
n = 0xd 96
n = 0xe 128
n = 0xf fully associative
which is only valid for CPUID(0x80000006).
Parse CPUID(0x80000005) L1d/L1i associativity values as shown in the AMD
manuals. Since the 0xffff literal is used to denote full associativity
at the AMD CPUID(4)-emulation logic, define AMD_CPUID4_FULLY_ASSOCIATIVE
for it instead of spreading that literal in more places.
Mark the assocs[] mapping array as only valid for CPUID(0x80000006) L2/L3
cache information.
Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors")
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250409122233.1058601-2-darwi@linutronix.de
|
|
The NULL array terminator at the end of erratum_1386_microcode was
removed during the switch from x86_cpu_desc to x86_cpu_id. This
causes readers to run off the end of the array.
Replace the NULL.
Fixes: f3f325152673 ("x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id'")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
|
|
Create a document to summarize hard-earned knowledge about RSB-related
mitigations, with references, and replace the overly verbose yet
incomplete comments with a reference to the document.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/ab73f4659ba697a974759f07befd41ae605e33dd.1744148254.git.jpoimboe@kernel.org
|
|
User->user Spectre v2 attacks (including RSB) across context switches
are already mitigated by IBPB in cond_mitigation(), if enabled globally
or if either the prev or the next task has opted in to protection. RSB
filling without IBPB serves no purpose for protecting user space, as
indirect branches are still vulnerable.
User->kernel RSB attacks are mitigated by eIBRS. In which case the RSB
filling on context switch isn't needed, so remove it.
Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Amit Shah <amit.shah@amd.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/98cdefe42180358efebf78e3b80752850c7a3e1b.1744148254.git.jpoimboe@kernel.org
|
|
eIBRS protects against guest->host RSB underflow/poisoning attacks.
Adding retpoline to the mix doesn't change that. Retpoline has a
balanced CALL/RET anyway.
So the current full RSB filling on VMEXIT with eIBRS+retpoline is
overkill. Disable it or do the VMEXIT_LITE mitigation if needed.
Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Amit Shah <amit.shah@amd.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Link: https://lore.kernel.org/r/84a1226e5c9e2698eae1b5ade861f1b8bf3677dc.1744148254.git.jpoimboe@kernel.org
|
|
IBPB is expected to clear the RSB. However, if X86_BUG_IBPB_NO_RET is
set, that doesn't happen. Make indirect_branch_prediction_barrier()
take that into account by calling write_ibpb() which clears RSB on
X86_BUG_IBPB_NO_RET:
/* Make sure IBPB clears return stack preductions too. */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_BUG_IBPB_NO_RET
Note that, as of the previous patch, write_ibpb() also reads
'x86_pred_cmd' in order to use SBPB when applicable:
movl _ASM_RIP(x86_pred_cmd), %eax
Therefore that existing behavior in indirect_branch_prediction_barrier()
is not lost.
Fixes: 50e4b3b94090 ("x86/entry: Have entry_ibpb() invalidate return predictions")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/bba68888c511743d4cd65564d1fc41438907523f.1744148254.git.jpoimboe@kernel.org
|
|
There's nothing entry-specific about entry_ibpb(). In preparation for
calling it from elsewhere, rename it to write_ibpb().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/1e54ace131e79b760de3fe828264e26d0896e3ac.1744148254.git.jpoimboe@kernel.org
|
|
First of all, using 'mmio' prevents proper implementation of 8-bit accessors.
Second, it's simply inconsistent with uart8250 set of options. Rename it to
'mmio32'. While at it, remove rather misleading comment in the documentation.
From now on mmio32 is self-explanatory and pciserial supports not only 32-bit
MMIO accessors.
Also, while at it, fix the comment for the "pciserial" case. The comment
seems to be a copy'n'paste error when mentioning "serial" instead of
"pciserial" (with double quotes). Fix this.
With that, move it upper, so we don't calculate 'buf' twice.
Fixes: 3181424aeac2 ("x86/early_printk: Add support for MMIO-based UARTs")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Denis Mukhin <dmukhin@ford.com>
Link: https://lore.kernel.org/r/20250407172214.792745-1-andriy.shevchenko@linux.intel.com
|
|
Since
741c10b096bc ("kernfs: Use RCU to access kernfs_node::name.")
a helper rdt_kn_name() that checks that rdtgroup_mutex is held has been used
for all accesses to the kernfs node name.
rdtgroup_mkdir() uses the name to determine if a valid monitor group is being
created by checking the parent name is "mon_groups". This is done without
holding rdtgroup_mutex, and now triggers the following warning:
| WARNING: suspicious RCU usage
| 6.15.0-rc1 #4465 Tainted: G E
| -----------------------------
| arch/x86/kernel/cpu/resctrl/internal.h:408 suspicious rcu_dereference_check() usage!
[...]
| Call Trace:
| <TASK>
| dump_stack_lvl
| lockdep_rcu_suspicious.cold
| is_mon_groups
| rdtgroup_mkdir
| kernfs_iop_mkdir
| vfs_mkdir
| do_mkdirat
| __x64_sys_mkdir
| do_syscall_64
| entry_SYSCALL_64_after_hwframe
Creating a control or monitor group calls mkdir_rdt_prepare(), which uses
rdtgroup_kn_lock_live() to take the rdtgroup_mutex.
To avoid taking and dropping the lock, move the check for the monitor group
name and position into mkdir_rdt_prepare() so that it occurs under
rdtgroup_mutex. Hoist is_mon_groups() earlier in the file.
[ bp: Massage. ]
Fixes: 741c10b096bc ("kernfs: Use RCU to access kernfs_node::name.")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250407124637.2433230-1-james.morse@arm.com
|
|
e820__register_nosave_regions()
While debugging kexec/hibernation hangs and crashes, it turned out that
the current implementation of e820__register_nosave_regions() suffers from
multiple serious issues:
- The end of last region is tracked by PFN, causing it to find holes
that aren't there if two consecutive subpage regions are present
- The nosave PFN ranges derived from holes are rounded out (instead of
rounded in) which makes it inconsistent with how explicitly reserved
regions are handled
Fix this by:
- Treating reserved regions as if they were holes, to ensure consistent
handling (rounding out nosave PFN ranges is more correct as the
kernel does not use partial pages)
- Tracking the end of the last RAM region by address instead of pages
to detect holes more precisely
These bugs appear to have been introduced about ~18 years ago with the very
first version of e820_mark_nosave_regions(), and its flawed assumptions were
carried forward uninterrupted through various waves of rewrites and renames.
[ mingo: Added Git archeology details, for kicks and giggles. ]
Fixes: e8eff5ac294e ("[PATCH] Make swsusp avoid memory holes and reserved memory regions on x86_64")
Reported-by: Roberto Ricci <io@r-ricci.it>
Tested-by: Roberto Ricci <io@r-ricci.it>
Signed-off-by: Myrrh Periwinkle <myrrhperiwinkle@qtmlabs.xyz>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250406-fix-e820-nosave-v3-1-f3787bc1ee1d@qtmlabs.xyz
Closes: https://lore.kernel.org/all/Z4WFjBVHpndct7br@desktop0a/
|
|
Xen disables ACPI for PV guests in DomU, which causes acpi_mps_check() to
return 1 when CONFIG_X86_MPPARSE is not set. As a result, the local APIC is
disabled and the guest is later limited to a single vCPU, despite being
configured with more.
This regression was introduced in version 6.9 in commit 7c0edad3643f
("x86/cpu/topology: Rework possible CPU management"), which added an
early check that limits CPUs to 1 if apic_is_disabled.
Update the acpi_mps_check() logic to return 0 early when running as a Xen
PV guest in DomU, preventing APIC from being disabled in this specific case
and restoring correct multi-vCPU behaviour.
Fixes: 7c0edad3643f ("x86/cpu/topology: Rework possible CPU management")
Signed-off-by: Petr Vaněk <arkamar@atlas.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250407132445.6732-2-arkamar@atlas.cz
|
|
If microcode did not get loaded there is no reason to keep it in the cache.
Moreover, if loading failed it will not be possible to load an earlier version
of microcode since the failed revision will always be selected from the cache
on the next reload attempt.
Since the failed revisions is not easily available at this point just clean the
whole cache. It will be rebuilt later if needed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250327230503.1850368-3-boris.ostrovsky@oracle.com
|
|
Add some missing dependencies to the CPUID dependency table:
- All the AMX features depend on AMX_TILE
- All the SPEC_CTRL features depend on SPEC_CTRL
[ mingo: Keep the AMX part of the table grouped ... ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sohil Mehta <sohil.mehta@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ahmed S. Darwish <darwi@linutronix.de>
Link: https://lore.kernel.org/r/20240924170128.2611854-1-ak@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer cleanups from Thomas Gleixner:
"A set of final cleanups for the timer subsystem:
- Convert all del_timer[_sync]() instances over to the new
timer_delete[_sync]() API and remove the legacy wrappers.
Conversion was done with coccinelle plus some manual fixups as
coccinelle chokes on scoped_guard().
- The final cleanup of the hrtimer_init() to hrtimer_setup()
conversion.
This has been delayed to the end of the merge window, so that all
patches which have been merged through other trees are in mainline
and all new users are catched.
Doing this right before rc1 ensures that new code which is merged post
rc1 is not introducing new instances of the original functionality"
* tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing/timers: Rename the hrtimer_init event to hrtimer_setup
hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack()
hrtimers: Rename debug_init() to debug_setup()
hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper()
hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns()
hrtimers: Make callback function pointer private
hrtimers: Merge __hrtimer_init() into __hrtimer_setup()
hrtimers: Switch to use __htimer_setup()
hrtimers: Delete hrtimer_init()
treewide: Convert new and leftover hrtimer_init() users
treewide: Switch/rename to timer_delete[_sync]()
|
|
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Naming interrupt domains host is confusing at best and the irqdomain code
uses both domain and host inconsistently.
Therefore rename irq_set_default_host() to irq_set_default_domain().
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250319092951.37667-3-jirislaby@kernel.org
|
|
mwait_idle_with_hints() and prefer_mwait_c1_over_halt()
The following commit, 12 years ago:
7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
added barriers around the CLFLUSH in mwait_idle_with_hints(), justified with:
... and add memory barriers around it since the documentation is explicit
that CLFLUSH is only ordered with respect to MFENCE.
This also triggered, 11 years ago, the same adjustment in:
f8e617f45829 ("sched/idle/x86: Optimize unnecessary mwait_idle() resched IPIs")
during development, although it failed to get the static_cpu_has_bug() treatment.
X86_BUG_CLFLUSH_MONITOR (a.k.a the AAI65 errata) is specific to Intel CPUs,
and the SDM currently states:
Executions of the CLFLUSH instruction are ordered with respect to each
other and with respect to writes, locked read-modify-write instructions,
and fence instructions[1].
With footnote 1 reading:
Earlier versions of this manual specified that executions of the CLFLUSH
instruction were ordered only by the MFENCE instruction. All processors
implementing the CLFLUSH instruction also order it relative to the other
operations enumerated above.
i.e. The SDM was incorrect at the time, and barriers should not have been
inserted. Double checking the original AAI65 errata (not available from
intel.com any more) shows no mention of barriers either.
Note: If this were a general codepath, the MFENCEs would be needed, because
AMD CPUs of the same vintage do sport otherwise-unordered CLFLUSHs.
Remove the unnecessary barriers. Furthermore, use a plain alternative(),
rather than static_cpu_has_bug() and/or no optimisation. The workaround
is a single instruction.
Use an explicit %rax pointer rather than a general memory operand, because
MONITOR takes the pointer implicitly in the same way.
[ mingo: Cleaned up the commit a bit. ]
Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250402172458.1378112-1-andrew.cooper3@citrix.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 TDX updates from Dave Hansen:
"Avoid direct HLT instruction execution in TDX guests.
TDX guests aren't expected to use the HLT instruction directly. It
causes a virtualization exception (#VE). While the #VE _can_ be
handled, the current handling is slow and buggy and the easiest thing
is just to avoid HLT in the first place. Plus, the kernel already has
paravirt infrastructure that makes it relatively painless.
Make TDX guests require paravirt and add some TDX-specific paravirt
handlers which avoid HLT in the normal halt routines. Also add a
warning in case another HLT sneaks in.
There was a report that this leads to a "major performance
improvement" on specjbb2015, probably because of the extra #VE
overhead or missed wakeups from the buggy HLT handling"
* tag 'x86_tdx_for_6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tdx: Emit warning if IRQs are enabled during HLT #VE handling
x86/tdx: Fix arch_safe_halt() execution for TDX VMs
x86/paravirt: Move halt paravirt calls under CONFIG_PARAVIRT
|
|
Convert the last remaining printk() in nmi.c to pr_info(). Along with
it, use timespec macros to calculate the NMI handler duration.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-10-sohil.mehta@intel.com
|
|
The expected_testcase_failures variable in the NMI selftest has never
been set since its introduction. Remove this unused variable along with
the related checks to simplify the code.
While at it, replace printk() with the corresponding pr_{cont,info}()
calls. Also, get rid of the superfluous testname wrapper and the
redundant file path comment.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-9-sohil.mehta@intel.com
|
|
Some of the comments in the default NMI handling code are out of place
or inadequate. Move them to the appropriate locations and update them as
needed.
Move the comment related to CPU-specific NMIs closer to the actual code.
Also, add more details about how back-to-back NMIs are detected since
that isn't immediately obvious.
Opportunistically, replace an #ifdef section in the vicinity with an
IS_ENABLED() check to make the code easier to read.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-7-sohil.mehta@intel.com
|
|
The comment in unknown_nmi_error() is incorrect and misleading. There
is no longer a restriction on having a single Unknown NMI handler. Also,
nmi_handle() never used the 'b2b' parameter.
The commits that made the comment outdated are:
0d443b70cc92 ("x86/platform: Remove warning message for duplicate NMI handlers")
bf9f2ee28d47 ("x86/nmi: Remove the 'b2b' parameter from nmi_handle()")
Remove the old comment and update it to reflect the current logic.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-6-sohil.mehta@intel.com
|
|
Commit:
feb6cd6a0f9f ("thermal/intel_powerclamp: stop sched tick in forced idle")
got rid of the last exported user of local_touch_nmi() a while back.
Remove the unnecessary export.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-5-sohil.mehta@intel.com
|
|
The NMI descriptors for each NMI type are stored in an array. However,
they are currently initialized using raw numbers, which makes it
difficult to understand the code.
Introduce a macro to initialize the NMI descriptors using the NMI type
enum values to make the code more readable.
No functional change intended.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-4-sohil.mehta@intel.com
|
|
Commit:
c305a4e98378 ("x86: Move sysctls into arch/x86")
recently moved the sysctl handling of panic_on_unrecovered_nmi and
panic_on_io_nmi to x86-specific code. These variables no longer need to
be declared in the generic header file.
Relocate the variable definitions and declarations closer to where they
are used. This makes all the NMI panic options consistent and easier to
track.
[ mingo: Fixed up the SHA1 of the commit reference. ]
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: Joel Granados <joel.granados@kernel.org>
Link: https://lore.kernel.org/r/20250327234629.3953536-3-sohil.mehta@intel.com
|
|
The unknown_nmi_panic variable is used to control whether the kernel
should panic on unknown NMIs. There is a sysctl entry under:
/proc/sys/kernel/unknown_nmi_panic
which can be used to change the behavior at runtime.
However, it seems that in some places, the option unnecessarily depends
on CONFIG_X86_LOCAL_APIC. Other code in nmi.c uses unknown_nmi_panic
without such a dependency. This results in a few messy #ifdefs
splattered across the code. The dependency was likely introduce due to a
potential build bug reported a long time ago:
https://lore.kernel.org/lkml/40BC67F9.3000609@myrealbox.com/
This build bug no longer exists.
Also, similar NMI panic options, such as panic_on_unrecovered_nmi and
panic_on_io_nmi, do not have an explicit dependency on the local APIC
either.
Though, it's hard to imagine a production system without the local APIC
configuration, making a specific NMI sysctl option dependent on it
doesn't make sense.
Remove the explicit dependency between unknown NMI handling and the
local APIC to make the code cleaner and more consistent.
While at it, reorder the header includes to maintain alphabetical order.
[ mingo: Cleaned up the changelog a bit, truly ordered the headers ... ]
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20250327234629.3953536-2-sohil.mehta@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updatesk from Greg KH:
"Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
happened this development cycle, including:
- kernfs scaling changes to make it even faster thanks to rcu
- bin_attribute constify work in many subsystems
- faux bus minor tweaks for the rust bindings
- rust binding updates for driver core, pci, and platform busses,
making more functionaliy available to rust drivers. These are all
due to people actually trying to use the bindings that were in
6.14.
- make Rafael and Danilo full co-maintainers of the driver core
codebase
- other minor fixes and updates"
* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
rust: platform: require Send for Driver trait implementers
rust: pci: require Send for Driver trait implementers
rust: platform: impl Send + Sync for platform::Device
rust: pci: impl Send + Sync for pci::Device
rust: platform: fix unrestricted &mut platform::Device
rust: pci: fix unrestricted &mut pci::Device
rust: device: implement device context marker
rust: pci: use to_result() in enable_device_mem()
MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
rust/kernel/faux: mark Registration methods inline
driver core: faux: only create the device if probe() succeeds
rust/faux: Add missing parent argument to Registration::new()
rust/faux: Drop #[repr(transparent)] from faux::Registration
rust: io: fix devres test with new io accessor functions
rust: io: rename `io::Io` accessors
kernfs: Move dput() outside of the RCU section.
efi: rci2: mark bin_attribute as __ro_after_init
rapidio: constify 'struct bin_attribute'
firmware: qemu_fw_cfg: constify 'struct bin_attribute'
powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- The series "powerpc/crash: use generic crashkernel reservation" from
Sourabh Jain changes powerpc's kexec code to use more of the generic
layers.
- The series "get_maintainer: report subsystem status separately" from
Vlastimil Babka makes some long-requested improvements to the
get_maintainer output.
- The series "ucount: Simplify refcounting with rcuref_t" from
Sebastian Siewior cleans up and optimizing the refcounting in the
ucount code.
- The series "reboot: support runtime configuration of emergency
hw_protection action" from Ahmad Fatoum improves the ability for a
driver to perform an emergency system shutdown or reboot.
- The series "Converge on using secs_to_jiffies() part two" from Easwar
Hariharan performs further migrations from msecs_to_jiffies() to
secs_to_jiffies().
- The series "lib/interval_tree: add some test cases and cleanup" from
Wei Yang permits more userspace testing of kernel library code, adds
some more tests and performs some cleanups.
- The series "hung_task: Dump the blocking task stacktrace" from Masami
Hiramatsu arranges for the hung_task detector to dump the stack of
the blocking task and not just that of the blocked task.
- The series "resource: Split and use DEFINE_RES*() macros" from Andy
Shevchenko provides some cleanups to the resource definition macros.
- Plus the usual shower of singleton patches - please see the
individual changelogs for details.
* tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
mailmap: consolidate email addresses of Alexander Sverdlin
fs/procfs: fix the comment above proc_pid_wchan()
relay: use kasprintf() instead of fixed buffer formatting
resource: replace open coded variant of DEFINE_RES()
resource: replace open coded variants of DEFINE_RES_*_NAMED()
resource: replace open coded variant of DEFINE_RES_NAMED_DESC()
resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED()
samples: add hung_task detector mutex blocking sample
hung_task: show the blocker task if the task is hung on mutex
kexec_core: accept unaccepted kexec segments' destination addresses
watchdog/perf: optimize bytes copied and remove manual NUL-termination
lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap()
lib/interval_tree: skip the check before go to the right subtree
lib/interval_tree: add test case for span iteration
lib/interval_tree: add test case for interval_tree_iter_xxx() helpers
lib/rbtree: add random seed
lib/rbtree: split tests
lib/rbtree: enable userland test suite for rbtree related data structure
checkpatch: describe --min-conf-desc-length
scripts/gdb/symbols: determine KASLR offset on s390
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- The series "Enable strict percpu address space checks" from Uros
Bizjak uses x86 named address space qualifiers to provide
compile-time checking of percpu area accesses.
This has caused a small amount of fallout - two or three issues were
reported. In all cases the calling code was found to be incorrect.
- The series "Some cleanup for memcg" from Chen Ridong implements some
relatively monir cleanups for the memcontrol code.
- The series "mm: fixes for device-exclusive entries (hmm)" from David
Hildenbrand fixes a boatload of issues which David found then using
device-exclusive PTE entries when THP is enabled. More work is
needed, but this makes thins better - our own HMM selftests now
succeed.
- The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed
remove the z3fold and zbud implementations. They have been deprecated
for half a year and nobody has complained.
- The series "mm: further simplify VMA merge operation" from Lorenzo
Stoakes implements numerous simplifications in this area. No runtime
effects are anticipated.
- The series "mm/madvise: remove redundant mmap_lock operations from
process_madvise()" from SeongJae Park rationalizes the locking in the
madvise() implementation. Performance gains of 20-25% were observed
in one MADV_DONTNEED microbenchmark.
- The series "Tiny cleanup and improvements about SWAP code" from
Baoquan He contains a number of touchups to issues which Baoquan
noticed when working on the swap code.
- The series "mm: kmemleak: Usability improvements" from Catalin
Marinas implements a couple of improvements to the kmemleak
user-visible output.
- The series "mm/damon/paddr: fix large folios access and schemes
handling" from Usama Arif provides a couple of fixes for DAMON's
handling of large folios.
- The series "mm/damon/core: fix wrong and/or useless damos_walk()
behaviors" from SeongJae Park fixes a few issues with the accuracy of
kdamond's walking of DAMON regions.
- The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo
Stoakes changes the interaction between framebuffer deferred-io and
core MM. No functional changes are anticipated - this is preparatory
work for the future removal of page structure fields.
- The series "mm/damon: add support for hugepage_size DAMOS filter"
from Usama Arif adds a DAMOS filter which permits the filtering by
huge page sizes.
- The series "mm: permit guard regions for file-backed/shmem mappings"
from Lorenzo Stoakes extends the guard region feature from its
present "anon mappings only" state. The feature now covers shmem and
file-backed mappings.
- The series "mm: batched unmap lazyfree large folios during
reclamation" from Barry Song cleans up and speeds up the unmapping
for pte-mapped large folios.
- The series "reimplement per-vma lock as a refcount" from Suren
Baghdasaryan puts the vm_lock back into the vma. Our reasons for
pulling it out were largely bogus and that change made the code more
messy. This patchset provides small (0-10%) improvements on one
microbenchmark.
- The series "Docs/mm/damon: misc DAMOS filters documentation fixes and
improves" from SeongJae Park does some maintenance work on the DAMON
docs.
- The series "hugetlb/CMA improvements for large systems" from Frank
van der Linden addresses a pile of issues which have been observed
when using CMA on large machines.
- The series "mm/damon: introduce DAMOS filter type for unmapped pages"
from SeongJae Park enables users of DMAON/DAMOS to filter my the
page's mapped/unmapped status.
- The series "zsmalloc/zram: there be preemption" from Sergey
Senozhatsky teaches zram to run its compression and decompression
operations preemptibly.
- The series "selftests/mm: Some cleanups from trying to run them" from
Brendan Jackman fixes a pile of unrelated issues which Brendan
encountered while runnimg our selftests.
- The series "fs/proc/task_mmu: add guard region bit to pagemap" from
Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
determine whether a particular page is a guard page.
- The series "mm, swap: remove swap slot cache" from Kairui Song
removes the swap slot cache from the allocation path - it simply
wasn't being effective.
- The series "mm: cleanups for device-exclusive entries (hmm)" from
David Hildenbrand implements a number of unrelated cleanups in this
code.
- The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual
implements a number of preparatoty cleanups to the GENERIC_PTDUMP
Kconfig logic.
- The series "mm/damon: auto-tune aggregation interval" from SeongJae
Park implements a feedback-driven automatic tuning feature for
DAMON's aggregation interval tuning.
- The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in
powerpc, sparc and x86 lazy MMU implementations. Ryan did this in
preparation for implementing lazy mmu mode for arm64 to optimize
vmalloc.
- The series "mm/page_alloc: Some clarifications for migratetype
fallback" from Brendan Jackman reworks some commentary to make the
code easier to follow.
- The series "page_counter cleanup and size reduction" from Shakeel
Butt cleans up the page_counter code and fixes a size increase which
we accidentally added late last year.
- The series "Add a command line option that enables control of how
many threads should be used to allocate huge pages" from Thomas
Prescher does that. It allows the careful operator to significantly
reduce boot time by tuning the parallalization of huge page
initialization.
- The series "Fix calculations in trace_balance_dirty_pages() for cgwb"
from Tang Yizhou fixes the tracing output from the dirty page
balancing code.
- The series "mm/damon: make allow filters after reject filters useful
and intuitive" from SeongJae Park improves the handling of allow and
reject filters. Behaviour is made more consistent and the documention
is updated accordingly.
- The series "Switch zswap to object read/write APIs" from Yosry Ahmed
updates zswap to the new object read/write APIs and thus permits the
removal of some legacy code from zpool and zsmalloc.
- The series "Some trivial cleanups for shmem" from Baolin Wang does as
it claims.
- The series "fs/dax: Fix ZONE_DEVICE page reference counts" from
Alistair Popple regularizes the weird ZONE_DEVICE page refcount
handling in DAX, permittig the removal of a number of special-case
checks.
- The series "refactor mremap and fix bug" from Lorenzo Stoakes is a
preparatoty refactoring and cleanup of the mremap() code.
- The series "mm: MM owner tracking for large folios (!hugetlb) +
CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
which we determine whether a large folio is known to be mapped
exclusively into a single MM.
- The series "mm/damon: add sysfs dirs for managing DAMOS filters based
on handling layers" from SeongJae Park adds a couple of new sysfs
directories to ease the management of DAMON/DAMOS filters.
- The series "arch, mm: reduce code duplication in mem_init()" from
Mike Rapoport consolidates many per-arch implementations of
mem_init() into code generic code, where that is practical.
- The series "mm/damon/sysfs: commit parameters online via
damon_call()" from SeongJae Park continues the cleaning up of sysfs
access to DAMON internal data.
- The series "mm: page_ext: Introduce new iteration API" from Luiz
Capitulino reworks the page_ext initialization to fix a boot-time
crash which was observed with an unusual combination of compile and
cmdline options.
- The series "Buddy allocator like (or non-uniform) folio split" from
Zi Yan reworks the code to split a folio into smaller folios. The
main benefit is lessened memory consumption: fewer post-split folios
are generated.
- The series "Minimize xa_node allocation during xarry split" from Zi
Yan reduces the number of xarray xa_nodes which are generated during
an xarray split.
- The series "drivers/base/memory: Two cleanups" from Gavin Shan
performs some maintenance work on the drivers/base/memory code.
- The series "Add tracepoints for lowmem reserves, watermarks and
totalreserve_pages" from Martin Liu adds some more tracepoints to the
page allocator code.
- The series "mm/madvise: cleanup requests validations and
classifications" from SeongJae Park cleans up some warts which
SeongJae observed during his earlier madvise work.
- The series "mm/hwpoison: Fix regressions in memory failure handling"
from Shuai Xue addresses two quite serious regressions which Shuai
has observed in the memory-failure implementation.
- The series "mm: reliable huge page allocator" from Johannes Weiner
makes huge page allocations cheaper and more reliable by reducing
fragmentation.
- The series "Minor memcg cleanups & prep for memdescs" from Matthew
Wilcox is preparatory work for the future implementation of memdescs.
- The series "track memory used by balloon drivers" from Nico Pache
introduces a way to track memory used by our various balloon drivers.
- The series "mm/damon: introduce DAMOS filter type for active pages"
from Nhat Pham permits users to filter for active/inactive pages,
separately for file and anon pages.
- The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia
separates the proactive reclaim statistics from the direct reclaim
statistics.
- The series "mm/vmscan: don't try to reclaim hwpoison folio" from
Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim
code.
* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits)
mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex()
x86/mm: restore early initialization of high_memory for 32-bits
mm/vmscan: don't try to reclaim hwpoison folio
mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
cgroup: docs: add pswpin and pswpout items in cgroup v2 doc
mm: vmscan: split proactive reclaim statistics from direct reclaim statistics
selftests/mm: speed up split_huge_page_test
selftests/mm: uffd-unit-tests support for hugepages > 2M
docs/mm/damon/design: document active DAMOS filter type
mm/damon: implement a new DAMOS filter type for active pages
fs/dax: don't disassociate zero page entries
MM documentation: add "Unaccepted" meminfo entry
selftests/mm: add commentary about 9pfs bugs
fork: use __vmalloc_node() for stack allocation
docs/mm: Physical Memory: Populate the "Zones" section
xen: balloon: update the NR_BALLOON_PAGES state
hv_balloon: update the NR_BALLOON_PAGES state
balloon_compaction: update the NR_BALLOON_PAGES state
meminfo: add a per node counter for balloon drivers
mm: remove references to folio in __memcg_kmem_uncharge_page()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules updates from Petr Pavlu:
- Use RCU instead of RCU-sched
The mix of rcu_read_lock(), rcu_read_lock_sched() and
preempt_disable() in the module code and its users has
been replaced with just rcu_read_lock()
- The rest of changes are smaller fixes and updates
* tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (32 commits)
MAINTAINERS: Update the MODULE SUPPORT section
module: Remove unnecessary size argument when calling strscpy()
module: Replace deprecated strncpy() with strscpy()
params: Annotate struct module_param_attrs with __counted_by()
bug: Use RCU instead RCU-sched to protect module_bug_list.
static_call: Use RCU in all users of __module_text_address().
kprobes: Use RCU in all users of __module_text_address().
bpf: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_address().
x86: Use RCU in all users of __module_address().
cfi: Use RCU while invoking __module_address().
powerpc/ftrace: Use RCU in all users of __module_text_address().
LoongArch: ftrace: Use RCU in all users of __module_text_address().
LoongArch/orc: Use RCU in all users of __module_address().
arm64: module: Use RCU in all users of __module_text_address().
ARM: module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_address().
module: Use RCU in search_module_extables().
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes and updates from Ingo Molnar:
- Fix a large number of x86 Kconfig dependency and help text accuracy
bugs/problems, by Mateusz Jończyk and David Heideberg
- Fix a VM_PAT interaction with fork() crash. This also touches core
kernel code
- Fix an ORC unwinder bug for interrupt entries
- Fixes and cleanups
- Fix an AMD microcode loader bug that can promote verification
failures into success
- Add early-printk support for MMIO based UARTs on an x86 board that
had no other serial debugging facility and also experienced early
boot crashes
* tag 'x86-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Fix __apply_microcode_amd()'s return value
x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()
x86/fpu: Update the outdated comment above fpstate_init_user()
x86/early_printk: Add support for MMIO-based UARTs
x86/dumpstack: Fix inaccurate unwinding from exception stacks due to misplaced assignment
x86/entry: Fix ORC unwinder for PUSH_REGS with save_ret=1
x86/Kconfig: Fix lists in X86_EXTENDED_PLATFORM help text
x86/Kconfig: Correct X86_X2APIC help text
x86/speculation: Remove the extra #ifdef around CALL_NOSPEC
x86/Kconfig: Document release year of glibc 2.3.3
x86/Kconfig: Make CONFIG_PCI_CNB20LE_QUIRK depend on X86_32
x86/Kconfig: Document CONFIG_PCI_MMCONFIG
x86/Kconfig: Update lists in X86_EXTENDED_PLATFORM
x86/Kconfig: Move all X86_EXTENDED_PLATFORM options together
x86/Kconfig: Always enable ARCH_SPARSEMEM_ENABLE
x86/Kconfig: Enable X86_X2APIC by default and improve help text
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc locking fixes and updates from Ingo Molnar:
- Fix a locking self-test FAIL on PREEMPT_RT kernels
- Fix nr_unused_locks accounting bug
- Simplify the split-lock debugging feature's fast-path
* tag 'locking-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Decrease nr_unused_locks if lock unused in zap_class()
lockdep: Fix wait context check on softirq for PREEMPT_RT
x86/split_lock: Simplify reenabling
|