summaryrefslogtreecommitdiff
path: root/drivers/irqchip/irq-gic-v3-its.c
AgeCommit message (Collapse)Author
2020-10-14Merge tag 'acpi-5.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...
2020-10-13memblock: implement for_each_reserved_mem_region() using __next_mem_region()Mike Rapoport
Iteration over memblock.reserved with for_each_reserved_mem_region() used __next_reserved_mem_region() that implemented a subset of __next_mem_region(). Use __for_each_mem_range() and, essentially, __next_mem_region() with appropriate parameters to reduce code duplication. While on it, rename for_each_reserved_mem_region() to for_each_reserved_mem_range() for consistency. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format] Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-17-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-24irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without ↵Jonathan Cameron
processor or memory Note this crash is present before any of the patches in this series, but as explained below it is highly unlikely anyone is shipping a firmware that causes it. Tests were done using an overriden SRAT. On ARM64, the gic-v3 driver directly parses SRAT to locate GIC Interrupt Translation Service (ITS) Affinity Structures. This is done much later in the boot than the parses of SRAT which identify proximity domains. As a result, an ITS placed in a proximity domain that is not defined by another SRAT structure will result in a NUMA node that is not completely configured and a crash. ITS [mem 0x202100000-0x20211ffff] ITS@0x0000000202100000: Using ITS number 0 Unable to handle kernel paging request at virtual address 0000000000001a08 ... Call trace: __alloc_pages_nodemask+0xe8/0x338 alloc_pages_node.constprop.0+0x34/0x40 its_probe_one+0x2f8/0xb18 gic_acpi_parse_madt_its+0x108/0x150 acpi_table_parse_entries_array+0x17c/0x264 acpi_table_parse_entries+0x48/0x6c acpi_table_parse_madt+0x30/0x3c its_init+0x1c4/0x644 gic_init_bases+0x4b8/0x4ec gic_acpi_init+0x134/0x264 acpi_match_madt+0x4c/0x84 acpi_table_parse_entries_array+0x17c/0x264 acpi_table_parse_entries+0x48/0x6c acpi_table_parse_madt+0x30/0x3c __acpi_probe_device_table+0x8c/0xe8 irqchip_init+0x3c/0x48 init_IRQ+0xcc/0x100 start_kernel+0x33c/0x548 ACPI 6.3 allows any set of Affinity Structures in SRAT to define a proximity domain. However, as we do not see this crash, we can conclude that no firmware is currently placing an ITS in a node that is separate from those containing memory and / or processors. We could modify the SRAT parsing behavior to identify the existence of Proximity Domains unique to the ITS structures, and handle them as a special case of a generic initiator (once support for those merges). This patch avoids the complexity that would be needed to handle this corner case, by not allowing the ITS entry parsing code to instantiate new NUMA Nodes. If one is encountered that does not already exist, then NO_NUMA_NODE is assigned and a warning printed just as if the value had been greater than allowed NUMA Nodes. "SRAT: Invalid NUMA node -1 in ITS affinity" Whilst this does not provide the full flexibility allowed by ACPI, it does fix the problem. We can revisit a more sophisticated solution if needed by future platforms. Change is simply to replace acpi_map_pxm_to_node with pxm_to_node reflecting the fact a new mapping is not created. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-17Merge remote-tracking branch 'origin/irq/gic-retrigger' into irq/irqchip-nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-06irqchip/git-v3-its: Implement irq_retrigger callback for device-triggered LPIsMarc Zyngier
It is pretty easy to provide a retrigger callback for the ITS, as it we already have the required support in terms of irq_set_irqchip_state(). Note that this only works for device-generated LPIs, and not the GICv4 doorbells, which should never have to be retriggered anyway. Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-04Merge tag 'irq-core-2020-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The usual boring updates from the interrupt subsystem: - Infrastructure to allow building irqchip drivers as modules - Consolidation of irqchip ACPI probing - Removal of the EOI-preflow interrupt handler which was required for SPARC support and became obsolete after SPARC was converted to use sparse interrupts. - Cleanups, fixes and improvements all over the place" * tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/loongson-pch-pic: Fix the misused irq flow handler irqchip/loongson-htvec: Support 8 groups of HT vectors irqchip/loongson-liointc: Fix misuse of gc->mask_cache dt-bindings: interrupt-controller: Update Loongson HTVEC description irqchip/imx-intmux: Fix irqdata regs save in imx_intmux_runtime_suspend() irqchip/imx-intmux: Implement intmux runtime power management irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() irqchip: Fix IRQCHIP_PLATFORM_DRIVER_* compilation by including module.h irqchip/stm32-exti: Map direct event to irq parent irqchip/mtk-cirq: Convert to a platform driver irqchip/mtk-sysirq: Convert to a platform driver irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros irqchip: irq-bcm2836.h: drop a duplicated word irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR irqchip/irq-bcm7038-l1: Guard uses of cpu_logical_map irqchip/gic-v3: Remove unused register definition irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent irqdomain: Export irq_domain_update_bus_token ...
2020-07-27genirq/affinity: Make affinity setting if activated opt-inThomas Gleixner
John reported that on a RK3288 system the perf per CPU interrupts are all affine to CPU0 and provided the analysis: "It looks like what happens is that because the interrupts are not per-CPU in the hardware, armpmu_request_irq() calls irq_force_affinity() while the interrupt is deactivated and then request_irq() with IRQF_PERCPU | IRQF_NOBALANCING. Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls irq_setup_affinity() which returns early because IRQF_PERCPU and IRQF_NOBALANCING are set, leaving the interrupt on its original CPU." This was broken by the recent commit which blocked interrupt affinity setting in hardware before activation of the interrupt. While this works in general, it does not work for this particular case. As contrary to the initial analysis not all interrupt chip drivers implement an activate callback, the safe cure is to make the deferred interrupt affinity setting at activation time opt-in. Implement the necessary core logic and make the two irqchip implementations for which this is required opt-in. In hindsight this would have been the right thing to do, but ... Fixes: baedb87d1b53 ("genirq/affinity: Handle affinity setting on inactive interrupts correctly") Reported-by: John Keeping <john@metanate.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
2020-07-27irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()Zenghui Yu
Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled box, I get the following kernel splat: [ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567 [ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1 [ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23 [ 0.053770] Call trace: [ 0.053774] dump_backtrace+0x0/0x218 [ 0.053775] show_stack+0x2c/0x38 [ 0.053777] dump_stack+0xc4/0x10c [ 0.053779] ___might_sleep+0xfc/0x140 [ 0.053780] __might_sleep+0x58/0x90 [ 0.053782] slab_pre_alloc_hook+0x7c/0x90 [ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0 [ 0.053785] its_cpu_init+0x6f4/0xe40 [ 0.053786] gic_starting_cpu+0x24/0x38 [ 0.053788] cpuhp_invoke_callback+0xa0/0x710 [ 0.053789] notify_cpu_starting+0xcc/0xd8 [ 0.053790] secondary_start_kernel+0x148/0x200 # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40 its_cpu_init+0x6f4/0xe40: allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818 (inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138 (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166 It turned out that we're allocating memory using GFP_KERNEL (may sleep) within the CPU hotplug notifier, which is indeed an atomic context. Bad thing may happen if we're playing on a system with more than a single CommonLPIAff group. Avoid it by turning this into an atomic allocation. Fixes: 5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200630133746.816-1-yuzenghui@huawei.com
2020-07-27irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLRZenghui Yu
The GICv4.1 spec tells us that it's CONSTRAINED UNPREDICTABLE to issue a register-based invalidation operation for a vPEID not mapped to that RD, or another RD within the same CommonLPIAff group. To follow this rule, commit f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") tried to address the race between the RD accesses and the vPE affinity change, but somehow forgot to take GICR_INVALLR into account. Let's take the vpe_lock before evaluating vpe->col_idx to fix it. Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200720092328.708-1-yuzenghui@huawei.com
2020-07-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "Bugfixes and a one-liner patch to silence a sparse warning" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART KVM: arm64: PMU: Fix per-CPU access in preemptible context KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks KVM: x86: Mark CR4.TSD as being possibly owned by the guest KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode kvm: use more precise cast and do not drop __user KVM: x86: bit 8 of non-leaf PDPEs is not reserved KVM: X86: Fix async pf caused null-ptr-deref KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell KVM: arm64: pvtime: Ensure task delay accounting is enabled KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE KVM: arm64: Annotate hyp NMI-related functions as __always_inline KVM: s390: reduce number of IO pins to 1
2020-06-23KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbellMarc Zyngier
When making a vPE non-resident because it has hit a blocking WFI, the doorbell can fire at any time after the write to the RD. Crucially, it can fire right between the write to GICR_VPENDBASER and the write to the pending_last field in the its_vpe structure. This means that we would overwrite pending_last with stale data, and potentially not wakeup until some unrelated event (such as a timer interrupt) puts the vPE back on the CPU. GICv4 isn't affected by this as we actively mask the doorbell on entering the guest, while GICv4.1 automatically manages doorbell delivery without any hypervisor-driven masking. Use the vpe_lock to synchronize such update, which solves the problem altogether. Fixes: ae699ad348cdc ("irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-21irqchip/gic-v4.1: Use readx_poll_timeout_atomic() to fix sleep in atomicZenghui Yu
readx_poll_timeout() can sleep if @sleep_us is specified by the caller, and is therefore unsafe to be used inside the atomic context, which is this case when we use it to poll the GICR_VPENDBASER.Dirty bit in irq_set_vcpu_affinity() callback. Let's convert to its atomic version instead which helps to get the v4.1 board back to life! Fixes: 96806229ca03 ("irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signaling") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200605052345.1494-1-yuzenghui@huawei.com
2020-05-20irqchip/gic-v3-its: Balance initial LPI affinity across CPUsMarc Zyngier
When mapping a LPI, the ITS driver picks the first possible affinity, which is in most cases CPU0, assuming that if that's not suitable, someone will come and set the affinity to something more interesting. It apparently isn't the case, and people complain of poor performance when many interrupts are glued to the same CPU. So let's place the interrupts by finding the "least loaded" CPU (that is, the one that has the fewer LPIs mapped to it). So called 'managed' interrupts are an interesting case where the affinity is actually dictated by the kernel itself, and we should honor this. Reported-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1575642904-58295-1-git-send-email-john.garry@huawei.com Link: https://lore.kernel.org/r/20200515165752.121296-3-maz@kernel.org
2020-05-18irqchip/gic-v3-its: Track LPI distribution on a per CPU basisMarc Zyngier
In order to improve the distribution of LPIs among CPUs, let start by tracking the number of LPIs assigned to CPUs, both for managed and non-managed interrupts (as separate counters). Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20200515165752.121296-2-maz@kernel.org
2020-04-16irqchip/gic-v4.1: Update effective affinity of virtual SGIsMarc Zyngier
Although the vSGIs are not directly visible to the host, they still get moved around by the CPU hotplug, for example. This results in the kernel moaning on the console, such as: genirq: irq_chip GICv4.1-sgi did not update eff. affinity mask of irq 38 Updating the effective affinity on set_affinity() fixes it. Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-04-16irqchip/gic-v4.1: Add support for VPENDBASER's Dirty+Valid signalingMarc Zyngier
When a vPE is made resident, the GIC starts parsing the virtual pending table to deliver pending interrupts. This takes place asynchronously, and can at times take a long while. Long enough that the vcpu enters the guest and hits WFI before any interrupt has been signaled yet. The vcpu then exits, blocks, and now gets a doorbell. Rince, repeat. In order to avoid the above, a (optional on GICv4, mandatory on v4.1) feature allows the GIC to feedback to the hypervisor whether it is done parsing the VPT by clearing the GICR_VPENDBASER.Dirty bit. The hypervisor can then wait until the GIC is ready before actually running the vPE. Plug the detection code as well as polling on vPE schedule. While at it, tidy-up the kernel message that displays the GICv4 optional features. Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-03-24Merge branch 'irq/gic-v4.1' into irq/irqchip-nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-03-24irqchip/gic-v4.1: Eagerly vmap vPEsMarc Zyngier
Now that we have HW-accelerated SGIs being delivered to VPEs, it becomes required to map the VPEs on all ITSs instead of relying on the lazy approach that we would use when using the ITS-list mechanism. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-17-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacksMarc Zyngier
Just like for vLPIs, there is some configuration information that cannot be directly communicated through the normal irqchip API, and we have to use our good old friend set_vcpu_affinity as a side-band communication mechanism. This is used to configure group and priority for a given vSGI. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-13-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacksMarc Zyngier
To implement the get/set_irqchip_state callbacks (limited to the PENDING state), we have to use a particular set of hacks: - Reading the pending state is done by using a pair of new redistributor registers (GICR_VSGIR, GICR_VSGIPENDR), which allow the 16 interrupts state to be retrieved. - Setting the pending state is done by generating it as we'd otherwise do for a guest (writing to GITS_SGIR). - Clearing the pending state is done by emitting a VSGI command with the "clear" bit set. This requires some interesting locking though: - When talking to the redistributor, we must make sure that the VPE affinity doesn't change, hence taking the VPE lock. - At the same time, we must ensure that nobody accesses the same redistributor's GICR_VSGIR registers for a different VPE, which would corrupt the reading of the pending bits. We thus take the per-RD spinlock. Much fun. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-12-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb mask/unmask SGI callbacksMarc Zyngier
Implement mask/unmask for virtual SGIs by calling into the configuration helper. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-11-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Add initial SGI configurationMarc Zyngier
The GICv4.1 ITS has yet another new command (VSGI) which allows a VPE-targeted SGI to be configured (or have its pending state cleared). Add support for this command and plumb it into the activate irqdomain callback so that it is ready to be used. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-10-maz@kernel.org
2020-03-24irqchip/gic-v4.1: Plumb skeletal VSGI irqchipMarc Zyngier
Since GICv4.1 has the capability to inject 16 SGIs into each VPE, and that I'm keen not to invent too many specific interfaces to manipulate these interrupts, let's pretend that each of these SGIs is an actual Linux interrupt. For that matter, let's introduce a minimal irqchip and irqdomain setup that will get fleshed up in the following patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-9-maz@kernel.org
2020-03-21irqchip/gic-v4: Use Inner-Shareable attributes for virtual pending tablesHeyi Guo
There is no special reason to set virtual LPI pending table as non-shareable. If we choose to hard code the shareability without probing, Inner-Shareable is likely to be a better choice, as the VPEs can move around and benefit from having the redistributors snooping each other's cache, if that's something they can do. Furthermore, Hisilicon hip08 ends up with unspecified errors when mixing shareability attributes. So let's move to IS attributes for the VPT. This has also been tested on D05 and didn't show any regression. Signed-off-by: Heyi Guo <guoheyi@huawei.com> [maz: rewrote commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20191130073849.38378-1-guoheyi@huawei.com
2020-03-20irqchip/gic-v4.1: Map the ITS SGIR register pageMarc Zyngier
One of the new features of GICv4.1 is to allow virtual SGIs to be directly signaled to a VPE. For that, the ITS has grown a new 64kB page containing only a single register that is used to signal a SGI to a given VPE. Add a second mapping covering this new 64kB range, and take this opportunity to limit the original mapping to 64kB, which is enough to cover the span of the ITS registers. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-8-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Advertise support v4.1 to KVMMarc Zyngier
Tell KVM that we support v4.1. Nothing uses this information so far. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-7-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Ensure mutual exclusion betwen invalidations on the same RDMarc Zyngier
The GICv4.1 spec says that it is CONTRAINED UNPREDICTABLE to write to any of the GICR_INV{LPI,ALL}R registers if GICR_SYNCR.Busy == 1. To deal with it, we must ensure that only a single invalidation can happen at a time for a given redistributor. Add a per-RD lock to that effect and take it around the invalidation/syncr-read to deal with this. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-6-maz@kernel.org
2020-03-20irqchip/gic-v4.1: Wait for completion of redistributor's INVALL operationZenghui Yu
In GICv4.1, we emulate a guest-issued INVALL command by a direct write to GICR_INVALLR. Before we finish the emulation and go back to guest, let's make sure the physical invalidate operation is actually completed and no stale data will be left in redistributor. Per the specification, this can be achieved by polling the GICR_SYNCR.Busy bit (to zero). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200302092145.899-1-yuzenghui@huawei.com Link: https://lore.kernel.org/r/20200304203330.4967-5-maz@kernel.org
2020-03-19irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD ↵Marc Zyngier
access Before GICv4.1, all operations would be serialized with the affinity changes by virtue of using the same ITS command queue. With v4.1, things change, as invalidations (and a number of other operations) are issued using the redistributor MMIO frame. We must thus make sure that these redistributor accesses cannot race against aginst the affinity change, or we may end-up talking to the wrong redistributor. To ensure this, we expand the irq_to_cpuid() helper to take a spinlock when the LPI is mapped to a vLPI (a new per-VPE lock) on each operation that requires mutual exclusion. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-4-maz@kernel.org
2020-03-19irqchip/gic-v4.1: Skip absent CPUs while iterating over redistributorsMarc Zyngier
In a system that is only sparsly populated with CPUs, we can end-up with redistributors structures that are not initialized. Let's make sure we don't try and access those when iterating over them (in this case when checking we have a L2 VPE table). Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20200304203330.4967-3-maz@kernel.org
2020-03-16irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependencyMarc Zyngier
On a very heavily loaded D05 with GICv4, I managed to trigger the following lockdep splat: [ 6022.598864] ====================================================== [ 6022.605031] WARNING: possible circular locking dependency detected [ 6022.611200] 5.6.0-rc4-00026-geee7c7b0f498 #680 Tainted: G E [ 6022.618061] ------------------------------------------------------ [ 6022.624227] qemu-system-aar/7569 is trying to acquire lock: [ 6022.629789] ffff042f97606808 (&p->pi_lock){-.-.}, at: try_to_wake_up+0x54/0x7a0 [ 6022.637102] [ 6022.637102] but task is already holding lock: [ 6022.642921] ffff002fae424cf0 (&irq_desc_lock_class){-.-.}, at: __irq_get_desc_lock+0x5c/0x98 [ 6022.651350] [ 6022.651350] which lock already depends on the new lock. [ 6022.651350] [ 6022.659512] [ 6022.659512] the existing dependency chain (in reverse order) is: [ 6022.666980] [ 6022.666980] -> #2 (&irq_desc_lock_class){-.-.}: [ 6022.672983] _raw_spin_lock_irqsave+0x50/0x78 [ 6022.677848] __irq_get_desc_lock+0x5c/0x98 [ 6022.682453] irq_set_vcpu_affinity+0x40/0xc0 [ 6022.687236] its_make_vpe_non_resident+0x6c/0xb8 [ 6022.692364] vgic_v4_put+0x54/0x70 [ 6022.696273] vgic_v3_put+0x20/0xd8 [ 6022.700183] kvm_vgic_put+0x30/0x48 [ 6022.704182] kvm_arch_vcpu_put+0x34/0x50 [ 6022.708614] kvm_sched_out+0x34/0x50 [ 6022.712700] __schedule+0x4bc/0x7f8 [ 6022.716697] schedule+0x50/0xd8 [ 6022.720347] kvm_arch_vcpu_ioctl_run+0x5f0/0x978 [ 6022.725473] kvm_vcpu_ioctl+0x3d4/0x8f8 [ 6022.729820] ksys_ioctl+0x90/0xd0 [ 6022.733642] __arm64_sys_ioctl+0x24/0x30 [ 6022.738074] el0_svc_common.constprop.3+0xa8/0x1e8 [ 6022.743373] do_el0_svc+0x28/0x88 [ 6022.747198] el0_svc+0x14/0x40 [ 6022.750761] el0_sync_handler+0x124/0x2b8 [ 6022.755278] el0_sync+0x140/0x180 [ 6022.759100] [ 6022.759100] -> #1 (&rq->lock){-.-.}: [ 6022.764143] _raw_spin_lock+0x38/0x50 [ 6022.768314] task_fork_fair+0x40/0x128 [ 6022.772572] sched_fork+0xe0/0x210 [ 6022.776484] copy_process+0x8c4/0x18d8 [ 6022.780742] _do_fork+0x88/0x6d8 [ 6022.784478] kernel_thread+0x64/0x88 [ 6022.788563] rest_init+0x30/0x270 [ 6022.792390] arch_call_rest_init+0x14/0x1c [ 6022.796995] start_kernel+0x498/0x4c4 [ 6022.801164] [ 6022.801164] -> #0 (&p->pi_lock){-.-.}: [ 6022.806382] __lock_acquire+0xdd8/0x15c8 [ 6022.810813] lock_acquire+0xd0/0x218 [ 6022.814896] _raw_spin_lock_irqsave+0x50/0x78 [ 6022.819761] try_to_wake_up+0x54/0x7a0 [ 6022.824018] wake_up_process+0x1c/0x28 [ 6022.828276] wakeup_softirqd+0x38/0x40 [ 6022.832533] __tasklet_schedule_common+0xc4/0xf0 [ 6022.837658] __tasklet_schedule+0x24/0x30 [ 6022.842176] check_irq_resend+0xc8/0x158 [ 6022.846609] irq_startup+0x74/0x128 [ 6022.850606] __enable_irq+0x6c/0x78 [ 6022.854602] enable_irq+0x54/0xa0 [ 6022.858431] its_make_vpe_non_resident+0xa4/0xb8 [ 6022.863557] vgic_v4_put+0x54/0x70 [ 6022.867469] kvm_arch_vcpu_blocking+0x28/0x38 [ 6022.872336] kvm_vcpu_block+0x48/0x490 [ 6022.876594] kvm_handle_wfx+0x18c/0x310 [ 6022.880938] handle_exit+0x138/0x198 [ 6022.885022] kvm_arch_vcpu_ioctl_run+0x4d4/0x978 [ 6022.890148] kvm_vcpu_ioctl+0x3d4/0x8f8 [ 6022.894494] ksys_ioctl+0x90/0xd0 [ 6022.898317] __arm64_sys_ioctl+0x24/0x30 [ 6022.902748] el0_svc_common.constprop.3+0xa8/0x1e8 [ 6022.908046] do_el0_svc+0x28/0x88 [ 6022.911871] el0_svc+0x14/0x40 [ 6022.915434] el0_sync_handler+0x124/0x2b8 [ 6022.919951] el0_sync+0x140/0x180 [ 6022.923773] [ 6022.923773] other info that might help us debug this: [ 6022.923773] [ 6022.931762] Chain exists of: [ 6022.931762] &p->pi_lock --> &rq->lock --> &irq_desc_lock_class [ 6022.931762] [ 6022.942101] Possible unsafe locking scenario: [ 6022.942101] [ 6022.948007] CPU0 CPU1 [ 6022.952523] ---- ---- [ 6022.957039] lock(&irq_desc_lock_class); [ 6022.961036] lock(&rq->lock); [ 6022.966595] lock(&irq_desc_lock_class); [ 6022.973109] lock(&p->pi_lock); [ 6022.976324] [ 6022.976324] *** DEADLOCK *** This is happening because we have a pending doorbell that requires retrigger. As SW retriggering is done in a tasklet, we trigger the circular dependency above. The easy cop-out is to provide a retrigger callback that doesn't require acquiring any extra lock. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200310184921.23552-5-maz@kernel.org
2020-03-16irqchip/gic-v3-its: Probe ITS page size for all GITS_BASERn registersMarc Zyngier
The GICv3 ITS driver assumes that once it has latched on a page size for a given BASER register, it can use the same page size as the maximum page size for all subsequent BASER registers. Although it worked so far, nothing in the architecture guarantees this, and Nianyao Tang hit this problem on some undisclosed implementation. Let's bite the bullet and probe the the supported page size on all BASER registers before starting to populate the tables. This simplifies the setup a bit, at the expense of a few additional MMIO accesses. Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Nianyao Tang <tangnianyao@huawei.com> Tested-by: Nianyao Tang <tangnianyao@huawei.com> Link: https://lore.kernel.org/r/1584089195-63897-1-git-send-email-zhangshaokun@hisilicon.com
2020-03-08irqchip/gic-v3-its: Fix access width for gicr_syncrHeyi Guo
GICR_SYNCR is a 32bit register, so it is better to access it with 32bit access width, though we have not seen any real problem. Signed-off-by: Heyi Guo <guoheyi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200225090023.28020-1-guoheyi@huawei.com
2020-02-09irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARMMarc Zyngier
In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessorsZenghui Yu
V{PEND,PROP}BASER registers are actually located in VLPI_base frame of the *redistributor*. Rename their accessors to reflect this fact. No functional changes. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-7-yuzenghui@huawei.com
2020-02-08irqchip/gic-v3-its: Remove superfluous WARN_ONZenghui Yu
"ITS virtual pending table not cleaning" is already complained inside its_clear_vpend_valid(), there's no need to trigger a WARN_ON again. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-6-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Drop 'tmp' in inherit_vpe_l1_table_from_rd()Zenghui Yu
The variable 'tmp' in inherit_vpe_l1_table_from_rd() is actually not needed, drop it. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-5-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD levelZenghui Yu
In GICv4, we will ensure that level2 vPE table memory is allocated for the specified vpe_id on all v4 ITS, in its_alloc_vpe_table(). This still works well for the typical GICv4.1 implementation, where the new vPE table is shared between the ITSs and the RDs. To make it explicit, let us introduce allocate_vpe_l2_table() to make sure that the L2 tables are allocated on all v4.1 RDs. We're likely not need to allocate memory in it because the vPE table is shared and (L2 table is) already allocated at ITS level, except for the case where the ITS doesn't share anything (say SVPET == 0, practically unlikely but architecturally allowed). The implementation of allocate_vpe_l2_table() is mostly copied from its_alloc_table_entry(). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-4-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Set vpe_l1_base for all redistributorsZenghui Yu
Currently, we will not set vpe_l1_page for the current RD if we can inherit the vPE configuration table from another RD (or ITS), which results in an inconsistency between RDs within the same CommonLPIAff group. Let's rename it to vpe_l1_base to indicate the base address of the vPE configuration table of this RD, and set it properly for *all* v4.1 redistributors. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-3-yuzenghui@huawei.com
2020-02-08irqchip/gic-v4.1: Fix programming of GICR_VPROPBASER_4_1_SIZEZenghui Yu
The Size field of GICv4.1 VPROPBASER register indicates number of pages minus one and together Page_Size and Size control the vPEID width. Let's respect this requirement of the architecture. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200206075711.1275-2-yuzenghui@huawei.com
2020-02-03irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALLZenghui Yu
It looks like an obvious mistake to use its_mapc_cmd descriptor when building the INVALL command block. It so far worked by luck because both its_mapc_cmd.col and its_invall_cmd.col sit at the same offset of the ITS command descriptor, but we should not rely on it. Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20191202071021.1251-1-yuzenghui@huawei.com
2020-01-22irqchip/gic-v4.1: Allow direct invalidation of VLPIsMarc Zyngier
Just like for INVALL, GICv4.1 has grown a VPE-aware INVLPI register. Let's plumb it in and make use of the DirectLPI code in that case. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-16-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Suppress per-VLPI doorbellMarc Zyngier
Since GICv4.1 gives us a per-VPE doorbell, avoid programming anything else on VMOVI/VMAPI/VMAPTI and on any other action that would have otherwise resulted in a per-VLPI doorbell to be programmed. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-15-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Add VPE INVALL callbackMarc Zyngier
GICv4.1 redistributors have a VPE-aware INVALL register. Progress! We can now emulate a guest-requested INVALL without emiting a VINVALL command. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-14-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Add VPE eviction callbackMarc Zyngier
When descheduling a VPE, special care must be taken to tell the GIC about whether we want to receive a doorbell or not. This is a major improvement on GICv4.0, where the doorbell had to be separately enabled/disabled. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-13-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Add VPE residency callbackMarc Zyngier
Making a VPE resident on GICv4.1 is pretty simple, as it is just a single write to the local redistributor. We just need extra information about which groups to enable, which the KVM code will have to provide. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-12-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Add mask/unmask doorbell callbacksMarc Zyngier
masking/unmasking doorbells on GICv4.1 relies on a new INVDB command, which broadcasts the invalidation to all RDs. Implement the new command as well as the masking callbacks, and plug the whole thing into the v4.1 VPE irqchip. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-11-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Plumb skeletal VPE irqchipMarc Zyngier
Just like for GICv4.0, each VPE has its own doorbell interrupt, and thus an irqchip that manages them. Since the doorbell management is quite different on GICv4.1, let's introduce an almost empty irqchip the will get populated over the next new patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-10-maz@kernel.org
2020-01-22irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVPMarc Zyngier
With GICv4.1, VMOVP is extended to allow a default doorbell to be specified, as well as a validity bit for this doorbell. As an added bonus, VMOVP isn't required anymore of moving a VPE between redistributors that share the same affinity. Let's add this support to the VMOVP builder, and make sure we don't issue the command if we don't really need to. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-9-maz@kernel.org