Age | Commit message (Collapse) | Author |
|
Parallel AP bringup requires that the APs can run fully parallel through
the early startup code including the real mode trampoline.
To prepare for this implement a bit-spinlock to serialize access to the
real mode stack so that parallel upcoming APs are not going to corrupt each
others stack while going through the real mode startup code.
Co-developed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.355425551@linutronix.de
|
|
For parallel CPU brinugp it's required to read the APIC ID in the low level
startup code. The virtual APIC base address is a constant because its a
fix-mapped address. Exposing that constant which is composed via macros to
assembly code is non-trivial due to header inclusion hell.
Aside of that it's constant only because of the vsyscall ABI
requirement. Once vsyscall is out of the picture the fixmap can be placed
at runtime.
Avoid header hell, stay flexible and store the address in a variable which
can be exposed to the low level startup code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.299231005@linutronix.de
|
|
There is often significant latency in the early stages of CPU bringup, and
time is wasted by waking each CPU (e.g. with SIPI/INIT/INIT on x86) and
then waiting for it to respond before moving on to the next.
Allow a platform to enable parallel setup which brings all to be onlined
CPUs up to the CPUHP_BP_KICK_AP state. While this state advancement on the
control CPU (BP) is single-threaded the important part is the last state
CPUHP_BP_KICK_AP which wakes the to be onlined CPUs up.
This allows the CPUs to run up to the first sychronization point
cpuhp_ap_sync_alive() where they wait for the control CPU to release them
one by one for the full onlining procedure.
This parallelism depends on the CPU hotplug core sync mechanism which
ensures that the parallel brought up CPUs wait for release before touching
any state which would make the CPU visible to anything outside the hotplug
control mechanism.
To handle the SMT constraints of X86 correctly the bringup happens in two
iterations when CONFIG_HOTPLUG_SMT is enabled. The control CPU brings up
the primary SMT threads of each core first, which can load the microcode
without the need to rendevouz with the thread siblings. Once that's
completed it brings up the secondary SMT threads.
Co-developed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.240231377@linutronix.de
|
|
Make the primary thread tracking CPU mask based in preparation for simpler
handling of parallel bootup.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.186599880@linutronix.de
|
|
The x86 CPU bringup state currently does AP wake-up, wait for AP to
respond and then release it for full bringup.
It is safe to be split into a wake-up and and a separate wait+release
state.
Provide the required functions and enable the split CPU bringup, which
prepares for parallel bringup, where the bringup of the non-boot CPUs takes
two iterations: One to prepare and wake all APs and the second to wait and
release them. Depending on timing this can eliminate the wait time
completely.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.133453992@linutronix.de
|
|
The bring up logic of a to be onlined CPU consists of several parts, which
are considered to be a single hotplug state:
1) Control CPU issues the wake-up
2) To be onlined CPU starts up, does the minimal initialization,
reports to be alive and waits for release into the complete bring-up.
3) Control CPU waits for the alive report and releases the upcoming CPU
for the complete bring-up.
Allow to split this into two states:
1) Control CPU issues the wake-up
After that the to be onlined CPU starts up, does the minimal
initialization, reports to be alive and waits for release into the
full bring-up. As this can run after the control CPU dropped the
hotplug locks the code which is executed on the AP before it reports
alive has to be carefully audited to not violate any of the hotplug
constraints, especially not modifying any of the various cpumasks.
This is really only meant to avoid waiting for the AP to react on the
wake-up. Of course an architecture can move strict CPU related setup
functionality, e.g. microcode loading, with care before the
synchronization point to save further pointless waiting time.
2) Control CPU waits for the alive report and releases the upcoming CPU
for the complete bring-up.
This allows that the two states can be split up to run all to be onlined
CPUs up to state #1 on the control CPU and then at a later point run state
#2. This spares some of the latencies of the full serialized per CPU
bringup by avoiding the per CPU wakeup/wait serialization. The assumption
is that the first AP already waits when the last AP has been woken up. This
obvioulsy depends on the hardware latencies and depending on the timings
this might still not completely eliminate all wait scenarios.
This split is just a preparatory step for enabling the parallel bringup
later. The boot time bringup is still fully serialized. It has a separate
config switch so that architectures which want to support parallel bringup
can test the split of the CPUHP_BRINGUG step separately.
To enable this the architecture must support the CPU hotplug core sync
mechanism and has to be audited that there are no implicit hotplug state
dependencies which require a fully serialized bringup.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.080801387@linutronix.de
|
|
Commit dce1ca0525bf ("sched/scs: Reset task stack state in bringup_cpu()")
ensured that the shadow call stack and KASAN poisoning were removed from
a CPU's stack each time that CPU is brought up, not just once.
This is not incorrect. However, with parallel bringup the idle thread setup
will happen at a different step. As a consequence the cleanup in
bringup_cpu() would be too late.
Move the SCS/KASAN cleanup to the generic _cpu_up() function instead,
which already ensures that the new CPU's stack is available, purely to
allow for early failure. This occurs when the CPU to be brought up is
in the CPUHP_OFFLINE state, which should correctly do the cleanup any
time the CPU has been taken down to the point where such is needed.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205257.027075560@linutronix.de
|
|
All users converted to the hotplug core mechanism.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.972894276@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.916055844@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.859920443@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. This unfortunately requires to add dead reporting to the non CPS
platforms as CPS is the only user, but it allows an overall consolidation
of this functionality.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.803238859@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.747254502@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.690926018@linutronix.de
|
|
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.635326070@linutronix.de
|
|
No more users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.582584351@linutronix.de
|
|
The new AP state tracking and synchronization mechanism in the CPU hotplug
core code allows to remove quite some x86 specific code:
1) The AP alive synchronization based on cpumasks
2) The decision whether an AP can be brought up again
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.529657366@linutronix.de
|
|
The CPU state tracking and synchronization mechanism in smpboot.c is
completely independent of the hotplug code and all logic around it is
implemented in architecture specific code.
Except for the state reporting of the AP there is absolutely nothing
architecture specific and the sychronization and decision functions can be
moved into the generic hotplug core code.
Provide an integrated variant and add the core synchronization and decision
points. This comes in two flavours:
1) DEAD state synchronization
Updated by the architecture code once the AP reaches the point where
it is ready to be torn down by the control CPU, e.g. by removing power
or clocks or tear down via the hypervisor.
The control CPU waits for this state to be reached with a timeout. If
the state is reached an architecture specific cleanup function is
invoked.
2) Full state synchronization
This extends #1 with AP alive synchronization. This is new
functionality, which allows to replace architecture specific wait
mechanims, e.g. cpumasks, completely.
It also prevents that an AP which is in a limbo state can be brought
up again. This can happen when an AP failed to report dead state
during a previous off-line operation.
The dead synchronization is what most architectures use. Only x86 makes a
bringup decision based on that state at the moment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.476305035@linutronix.de
|
|
No point in this conditional voodoo. Un-initializing the lock mechanism is
safe to be called unconditionally even if it was already invoked when the
CPU died.
Remove the invocation of xen_smp_intr_free() as that has been already
cleaned up in xen_cpu_dead_hvm().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.423407127@linutronix.de
|
|
Now that the core code drops sparse_irq_lock after the idle thread
synchronized, it's pointless to wait for the AP to mark itself online.
Whether the control CPU runs in a wait loop or sleeps in the core code
waiting for the online operation to complete makes no difference.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.369512093@linutronix.de
|
|
Now that the core code drops sparse_irq_lock after the idle thread
synchronized, it's pointless to wait for the AP to mark itself online.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.316417181@linutronix.de
|
|
There is no harm to hold sparse_irq lock until the upcoming CPU completes
in cpuhp_online_idle(). This allows to remove cpu_online() synchronization
from architecture code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.263722880@linutronix.de
|
|
Now that TSC synchronization is SMP function call based there is no reason
to wait for the AP to be set in smp_callin_mask. The control CPU waits for
the AP to set itself in the online mask anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.206394064@linutronix.de
|
|
Spin-waiting on the control CPU until the AP reaches the TSC
synchronization is just a waste especially in the case that there is no
synchronization required.
As the synchronization has to run with interrupts disabled the control CPU
part can just be done from a SMP function call. The upcoming AP issues that
call async only in the case that synchronization is required.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.148255496@linutronix.de
|
|
The usage is in smpboot.c and not in the CPU initialization code.
The XEN_PV usage of cpu_callout_mask is obsolete as cpu_init() not longer
waits and cacheinfo has its own CPU mask now, so cpu_callout_mask can be
made static too.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.091511483@linutronix.de
|
|
cpu_callout_mask is used for the stop machine based MTRR/PAT init.
In preparation of moving the BP/AP synchronization to the core hotplug
code, use a private CPU mask for cacheinfo and manage it in the
starting/dying hotplug state.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.035041005@linutronix.de
|
|
The synchronization of the AP with the control CPU is a SMP boot problem
and has nothing to do with cpu_init().
Open code cpu_init_secondary() in start_secondary() and move
wait_for_master_cpu() into the SMP boot code.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.981999763@linutronix.de
|
|
There are four logical parts to what native_cpu_up() does on the BSP (or
on the controlling CPU for a later hotplug):
1) Wake the AP by sending the INIT/SIPI/SIPI sequence.
2) Wait for the AP to make it as far as wait_for_master_cpu() which
sets that CPU's bit in cpu_initialized_mask, then sets the bit in
cpu_callout_mask to let the AP proceed through cpu_init().
3) Wait for the AP to finish cpu_init() and get as far as the
smp_callin() call, which sets that CPU's bit in cpu_callin_mask.
4) Perform the TSC synchronization and wait for the AP to actually
mark itself online in cpu_online_mask.
In preparation to allow these phases to operate in parallel on multiple
APs, split them out into separate functions and document the interactions
a little more clearly in both the BP and AP code paths.
No functional change intended.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.928917242@linutronix.de
|
|
Peter stumbled over the barrier() after the invocation of smp_callin() in
start_secondary():
"...this barrier() and it's comment seem weird vs smp_callin(). That
function ends with an atomic bitop (it has to, at the very least it must
not be weaker than store-release) but also has an explicit wmb() to order
setup vs CPU_STARTING.
There is no way the smp_processor_id() referred to in this comment can land
before cpu_init() even without the barrier()."
The barrier() along with the comment was added in 2003 with commit
d8f19f2cac70 ("[PATCH] x86-64 merge") in the history tree. One of those
well documented combo patches of that time which changes world and some
more. The context back then was:
/*
* Dont put anything before smp_callin(), SMP
* booting is too fragile that we want to limit the
* things done here to the most necessary things.
*/
cpu_init();
smp_callin();
+ /* otherwise gcc will move up smp_processor_id before the cpu_init */
+ barrier();
Dprintk("cpu %d: waiting for commence\n", smp_processor_id());
Even back in 2003 the compiler was not allowed to reorder that
smp_processor_id() invocation before the cpu_init() function call.
Especially not as smp_processor_id() resolved to:
asm volatile("movl %%gs:%c1,%0":"=r" (ret__):"i"(pda_offset(field)):"memory");
There is no trace of this change in any mailing list archive including the
back then official x86_64 list discuss@x86-64.org, which would explain the
problem this change solved.
The debug prints are gone by now and the the only smp_processor_id()
invocation today is farther down in start_secondary() after locking
vector_lock which itself prevents reordering.
Even if the compiler would be allowed to reorder this, the code would still
be correct as GSBASE is set up early in the assembly code and is valid when
the CPU reaches start_secondary(), while the code at the time when this
barrier was added did the GSBASE setup in cpu_init().
As the barrier has zero value, remove it.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.875713771@linutronix.de
|
|
Now that the CPU0 hotplug cruft is gone, the only user is AMD SEV.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.822234014@linutronix.de
|
|
This was introduced with commit e1c467e69040 ("x86, hotplug: Wake up CPU0
via NMI instead of INIT, SIPI, SIPI") to eventually support physical
hotplug of CPU0:
"We'll change this code in the future to wake up hard offlined CPU0 if
real platform and request are available."
11 years later this has not happened and physical hotplug is not officially
supported. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.768845190@linutronix.de
|
|
This was introduced together with commit e1c467e69040 ("x86, hotplug: Wake
up CPU0 via NMI instead of INIT, SIPI, SIPI") to eventually support
physical hotplug of CPU0:
"We'll change this code in the future to wake up hard offlined CPU0 if
real platform and request are available."
11 years later this has not happened and physical hotplug is not officially
supported. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.715707999@linutronix.de
|
|
This is used in the SEV play_dead() implementation to re-online CPUs. But
that has nothing to do with CPU0.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.662319599@linutronix.de
|
|
When TSC is synchronized across sockets then there is no reason to
calibrate the delay for the first CPU which comes up on a socket.
Just reuse the existing calibration value.
This removes 100ms pointlessly wasted time from CPU hotplug per socket.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.608773568@linutronix.de
|
|
No point in keeping them around.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.551974164@linutronix.de
|
|
Make topology_phys_to_logical_pkg_die() static as it's only used in
smpboot.c and fixup the kernel-doc warnings for both functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.493750666@linutronix.de
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull compute express link fixes from Dan Williams:
- Fix a compilation issue with DEFINE_STATIC_SRCU() in the unit tests
- Fix leaking kernel memory to a root-only sysfs attribute
* tag 'cxl-fixes-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl: Add missing return to cdat read error path
tools/testing/cxl: Use DEFINE_STATIC_SRCU()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
- Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag
- Include reboot.h to avoid gcc-12 compiler warning
* tag 'parisc-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag
parisc: kexec: include reboot.h
|
|
Pull ARM fixes from Russell King:
- fix unwinder for uleb128 case
- fix kernel-doc warnings for HP Jornada 7xx
- fix unbalanced stack on vfp success path
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9297/1: vfp: avoid unbalanced stack on 'success' return path
ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings
ARM: 9295/1: unwind:fix unwind abort for uleb128 case
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Make sure __down_read_common() is always inlined so that the callers'
names land in traceevents output and thus the blocked function can be
identified
* tag 'locking_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Make sure the PEBS buffer is flushed before reprogramming the
hardware so that the correct record sizes are used
- Update the sample size for AMD BRS events
- Fix a confusion with using the same on-stack struct with different
events in the event processing path
* tag 'perf_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG
perf/x86: Fix missing sample size update on AMD BRS
perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:
- Fix a couple of kernel-doc warnings
* tag 'sched_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: fix cid_lock kernel-doc warnings
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Add the required PCI IDs so that the generic SMN accesses provided by
amd_nb.c work for drivers which switch to them. Add a PCI device ID
to k10temp's table so that latter is loaded on such systems too
* tag 'x86_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hwmon: (k10temp) Add PCI ID for family 19, model 78h
x86/amd_nb: Add PCI ID for family 19h model 78h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:
- Prevent CPU state corruption when an active clockevent broadcast
device is replaced while the system is already in oneshot mode
* tag 'timers_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make broadcast device replacement work correctly
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Some ext4 bug fixes (mostly to address Syzbot reports)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: bail out of ext4_xattr_ibody_get() fails for any reason
ext4: add bounds checking in get_max_inline_xattr_value_size()
ext4: add indication of ro vs r/w mounts in the mount message
ext4: fix deadlock when converting an inline directory in nojournal mode
ext4: improve error recovery code paths in __ext4_remount()
ext4: improve error handling from ext4_dirhash()
ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled
ext4: check iomap type only if ext4_iomap_begin() does not fail
ext4: avoid a potential slab-out-of-bounds in ext4_group_desc_csum
ext4: fix data races when using cached status extents
ext4: avoid deadlock in fs reclaim with page writeback
ext4: fix invalid free tracking in ext4_xattr_move_to_block()
ext4: remove a BUG_ON in ext4_mb_release_group_pa()
ext4: allow ext4_get_group_info() to fail
ext4: fix lockdep warning when enabling MMP
ext4: fix WARNING in mb_find_extent
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:
- use after free fix in imsttfb (Zheng Wang)
- fix error handling in arcfb (Zongjie Li)
- lots of whitespace cleanups (Thomas Zimmermann)
- add 1920x1080 modedb entry (me)
* tag 'fbdev-for-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: stifb: Fix info entry in sti_struct on error path
fbdev: modedb: Add 1920x1080 at 60 Hz video mode
fbdev: imsttfb: Fix use after free bug in imsttfb_probe
fbdev: vfb: Remove trailing whitespaces
fbdev: valkyriefb: Remove trailing whitespaces
fbdev: stifb: Remove trailing whitespaces
fbdev: sa1100fb: Remove trailing whitespaces
fbdev: platinumfb: Remove trailing whitespaces
fbdev: p9100: Remove trailing whitespaces
fbdev: maxinefb: Remove trailing whitespaces
fbdev: macfb: Remove trailing whitespaces
fbdev: hpfb: Remove trailing whitespaces
fbdev: hgafb: Remove trailing whitespaces
fbdev: g364fb: Remove trailing whitespaces
fbdev: controlfb: Remove trailing whitespaces
fbdev: cg14: Remove trailing whitespaces
fbdev: atmel_lcdfb: Remove trailing whitespaces
fbdev: 68328fb: Remove trailing whitespaces
fbdev: arcfb: Fix error handling in arcfb_probe()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"A single small fix for the UFS driver to fix a power management
failure"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix I/O hang that occurs when BKOPS fails in W-LUN suspend
|
|
Fix the __swp_offset() and __swp_entry() macros due to commit 6d239fc78c0b
("parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE") which introduced the
SWP_EXCLUSIVE flag by reusing the _PAGE_ACCESSED flag.
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 6d239fc78c0b ("parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE")
Cc: <stable@vger.kernel.org> # v6.3+
|
|
In ext4_update_inline_data(), if ext4_xattr_ibody_get() fails for any
reason, it's best if we just fail as opposed to stumbling on,
especially if the failure is EFSCORRUPTED.
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Normally the extended attributes in the inode body would have been
checked when the inode is first opened, but if someone is writing to
the block device while the file system is mounted, it's possible for
the inode table to get corrupted. Add bounds checking to avoid
reading beyond the end of allocated memory if this happens.
Reported-by: syzbot+1966db24521e5f6e23f7@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=1966db24521e5f6e23f7
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|