Age | Commit message (Collapse) | Author |
|
Traditionally, LoongArch uses "dbar 0" (full completion barrier) for
everything. But the full completion barrier is a performance killer, so
Loongson-3A6000 and newer processors have made finer granularity hints
available:
Bit4: ordering or completion (0: completion, 1: ordering)
Bit3: barrier for previous read (0: true, 1: false)
Bit2: barrier for previous write (0: true, 1: false)
Bit1: barrier for succeeding read (0: true, 1: false)
Bit0: barrier for succeeding write (0: true, 1: false)
Hint 0x700: barrier for "read after read" from the same address, which
is needed by LL-SC loops on old models (dbar 0x700 behaves the same as
nop if such reordering is disabled on new models).
This patch makes use of the various new hints for different kinds of
memory barriers. It brings performance improvements on Loongson-3A6000
series, while not affecting the existing models because all variants are
treated as 'dbar 0' there.
Why override queued_spin_unlock()?
After commit 01e3b958efe85a26d9b ("drivers: Remove explicit invocations
of mmiowb()") we need a completion barrier in queued_spin_unlock(), but
the generic implementation use smp_store_release() which only provide an
ordering barrier.
Signed-off-by: Jun Yi <yijun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Loongson-3A6000 has SMT (Simultaneous Multi-Threading) support, each
physical core has two logical cores (threads). This patch add SMT probe
and scheduler support via ACPI PPTT.
If SCHED_SMT enabled, Loongson-3A6000 is treated as 4 cores, 8 threads;
If SCHED_SMT disabled, Loongson-3A6000 is treated as 8 cores, 8 threads.
Remove smp_num_siblings to support HMP (Heterogeneous Multi-Processing).
Signed-off-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
ACPI systems set io masters by parsing ACPI MADT, FDT systems have no
MADT so we explicitly set CPU#0 as the io master. Otherwise CPU#0 will
be considered as hotpluggable.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP cross-CPU function-call updates from Ingo Molnar:
- Remove diagnostics and adjust config for CSD lock diagnostics
- Add a generic IPI-sending tracepoint, as currently there's no easy
way to instrument IPI origins: it's arch dependent and for some major
architectures it's not even consistently available.
* tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
trace,smp: Trace all smp_function_call*() invocations
trace: Add trace_ipi_send_cpu()
sched, smp: Trace smp callback causing an IPI
smp: reword smp call IPI comment
treewide: Trace IPIs sent via smp_send_reschedule()
irq_work: Trace self-IPIs sent via arch_irq_work_raise()
smp: Trace IPIs sent via arch_send_call_function_ipi_mask()
sched, smp: Trace IPIs sent via send_call_function_single_ipi()
trace: Add trace_ipi_send_cpumask()
kernel/smp: Make csdlock_debug= resettable
locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging
locking/csd_lock: Remove added data from CSD lock debugging
locking/csd_lock: Add Kconfig option for csd_debug default
|
|
To be able to trace invocations of smp_send_reschedule(), rename the
arch-specific definitions of it to arch_smp_send_reschedule() and wrap it
into an smp_send_reschedule() that contains a tracepoint.
Changes to include the declaration of the tracepoint were driven by the
following coccinelle script:
@func_use@
@@
smp_send_reschedule(...);
@include@
@@
#include <trace/events/ipi.h>
@no_include depends on func_use && !include@
@@
#include <...>
+
+ #include <trace/events/ipi.h>
[csky bits]
[riscv bits]
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230307143558.294354-6-vschneid@redhat.com
|
|
play_dead() doesn't return. Make that more explicit with a BUG().
BUG() is preferable to unreachable() because BUG() is a more explicit
failure mode and avoids undefined behavior like falling off the edge of
the function into whatever code happens to be next.
Link: https://lore.kernel.org/r/21245d687ffeda34dbcf04961a2df3724f04f7c8.1676358308.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Add suspend (Suspend To RAM, aka ACPI S3) support for LoongArch.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Since commit 40cd01a9c324("efi/loongarch: libstub: remove dependency on
flattened DT"), we can parse the FDT from efi system table.
And now, LoongArch is coming to support booting with FDT, so we add the
relevant booting support as well as parameter parsing.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Function smp_send_reschedule() is standard kernel API, which is defined
in header file include/linux/smp.h. However, on LoongArch it is defined
as an inline function, this is confusing and kernel modules can not use
this function.
Now we define smp_send_reschedule() as a general function, and add a
EXPORT_SYMBOL_GPL on this function, so that kernel modules can use it.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
SMP operations can be shared by Loongson-2 series and Loongson-3 series,
so we change the prefix from loongson3 to loongson for all functions and
data structures.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now io master CPUs are not hotpluggable on LoongArch, but in the current
code only /sys/devices/system/cpu/cpu0/online is not created. Let us set
the hotpluggable field of all the io master CPUs as 0, then prevent to
create sysfs control file for all the io master CPUs which confuses some
user space tools. This is similar with commit 9cce844abf07 ("MIPS: CPU#0
is not hotpluggable").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Parse MADT to get multi-processor information, in order to fix the boot
problem and cpu-hotplug problem for SMP platform.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
On physical machine we can save power by disabling clock of hot removed
cpu. However as different platforms require different methods to
configure clocks, the code is platform-specific, and probably belongs to
firmware/pmu or cpu regulator, rather than generic arch/loongarch code.
Also, there is no such register on QEMU virt machine since the
clock/frequency regulation is not emulated.
This patch removes the hard-coded clock register accesses in generic
LoongArch cpu hotplug flow.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
1, We assume arch/loongarch/include/asm/smp.h be included in include/
linux/smp.h is valid and the reverse inclusion isn't. So remove the
<linux/smp.h> in arch/loongarch/include/asm/smp.h.
2, arch/loongarch/include/asm/smp.h is only needed when CONFIG_SMP,
and setup.c include it only because it need plat_smp_setup(). So,
reorganize setup.c & smp.h, and then remove <asm/smp.h> in setup.c.
3, Fix cacheinfo.c and percpu.h build error by adding the missing header
files when !CONFIG_SMP.
4, Fix acpi.c build error by adding CONFIG_SMP guards.
5, Move irq_stat definition from smp.c to irq.c and fix its declaration.
6, Select CONFIG_SMP for CONFIG_NUMA, similar as other architectures do.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add Non-Uniform Memory Access (NUMA) support for LoongArch. LoongArch
has 48-bit physical address, but the HyperTransport I/O bus only support
40-bit address, so we need a custom phys_to_dma() and dma_to_phys() to
extract the 4-bit node id (bit 44~47) from Loongson-3's 48-bit physical
address space and embed it into 40-bit. In the 40-bit dma address, node
id offset can be read from the LS7A_DMA_CFG register.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
LoongArch-based procesors have 4, 8 or 16 cores per package. This patch
adds multi-processor (SMP) support for LoongArch.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|