summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-28x86/cpufeatures: Rename X86_CMPXCHG64 to X86_CX8H. Peter Anvin (Intel)
Replace X86_CMPXCHG64 with X86_CX8, as CX8 is the name of the CPUID flag, thus to make it consistent with X86_FEATURE_CX8 defined in <asm/cpufeatures.h>. No functional change intended. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250228082338.73859-2-xin@zytor.com
2025-02-28x86/cpu: Enable modifying CPU bug flags with '{clear,set}puid='Brendan Jackman
Sometimes it can be very useful to run CPU vulnerability mitigations on systems where they aren't known to mitigate any real-world vulnerabilities. This can be handy for mundane reasons like debugging HW-agnostic logic on whatever machine is to hand, but also for research reasons: while some mitigations are focused on individual vulns and uarches, others are fairly general, and it's strategically useful to have an idea how they'd perform on systems where they aren't currently needed. As evidence for this being useful, a flag specifically for Retbleed was added in: 5c9a92dec323 ("x86/bugs: Add retbleed=force"). Since CPU bugs are tracked using the same basic mechanism as features, and there are already parameters for manipulating them by hand, extend that mechanism to support bug as well as capabilities. With this patch and setcpuid=srso, a QEMU guest running on an Intel host will boot with Safe-RET enabled. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241220-force-cpu-bug-v2-3-7dc71bce742a@google.com
2025-02-28x86/cpu: Add the 'setcpuid=' boot parameterBrendan Jackman
In preparation for adding support to inject fake CPU bugs at boot-time, add a general facility to force enablement of CPU flags. The flag taints the kernel and the documentation attempts to be clear that this is highly unsuitable for uses outside of kernel development and platform experimentation. The new arg is parsed just like clearcpuid, but instead of leading to setup_clear_cpu_cap() it leads to setup_force_cpu_cap(). I've tested this by booting a nested QEMU guest on an Intel host, which with setcpuid=svm will claim that it supports AMD virtualization. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241220-force-cpu-bug-v2-2-7dc71bce742a@google.com
2025-02-28x86/cpu: Create helper function to parse the 'clearcpuid=' boot parameterBrendan Jackman
This is in preparation for a later commit that will reuse this code, to make review convenient. Factor out a helper function which does the full handling for this arg including printing info to the console. No functional change intended. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241220-force-cpu-bug-v2-1-7dc71bce742a@google.com
2025-02-28x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when ↵Max Grobecker
running in a virtual machine When running in a virtual machine, we might see the original hardware CPU vendor string (i.e. "AuthenticAMD"), but a model and family ID set by the hypervisor. In case we run on AMD hardware and the hypervisor sets a model ID < 0x14, the LAHF cpu feature is eliminated from the the list of CPU capabilities present to circumvent a bug with some BIOSes in conjunction with AMD K8 processors. Parsing the flags list from /proc/cpuinfo seems to be happening mostly in bash scripts and prebuilt Docker containers, as it does not need to have additionals tools present – even though more reliable ways like using "kcpuid", which calls the CPUID instruction instead of parsing a list, should be preferred. Scripts, that use /proc/cpuinfo to determine if the current CPU is "compliant" with defined microarchitecture levels like x86-64-v2 will falsely claim the CPU is incapable of modern CPU instructions when "lahf_lm" is missing in that flags list. This can prevent some docker containers from starting or build scripts to create unoptimized binaries. Admittably, this is more a small inconvenience than a severe bug in the kernel and the shoddy scripts that rely on parsing /proc/cpuinfo should be fixed instead. This patch adds an additional check to see if we're running inside a virtual machine (X86_FEATURE_HYPERVISOR is present), which, to my understanding, can't be present on a real K8 processor as it was introduced only with the later/other Athlon64 models. Example output with the "lahf_lm" flag missing in the flags list (should be shown between "hypervisor" and "abm"): $ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 6 model name : Common KVM processor stepping : 1 microcode : 0x1000065 cpu MHz : 2599.998 cache size : 512 KB physical id : 0 siblings : 1 core id : 0 cpu cores : 1 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c hypervisor abm 3dnowprefetch vmmcall bmi1 avx2 bmi2 xsaveopt ... while kcpuid shows the feature to be present in the CPU: # kcpuid -d | grep lahf lahf_lm - LAHF/SAHF available in 64-bit mode [ mingo: Updated the comment a bit, incorporated Boris's review feedback. ] Signed-off-by: Max Grobecker <max@grobecker.info> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de>
2025-02-27Merge tag 'drm-fixes-2025-02-28' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "This week's fixes pull, amdgpu mostly, with some xe and a few misc others, the fb defio fix is bit of a change, but it avoids some nasty NULL pointer crashes due to defio assuming page backing in places it didn't have pages. amdgpu: - Legacy dpm suspend/resume fix - Runtime PM fix for DELL G5 SE - MAINTAINERS updates - Enforce Isolation fixes - mailmap update - EDID reading i2c fix - PSR fix - eDP fix - HPD interrupt handling fix - Clear memory fix amdkfd: - MQD handling fix vkms: - fix rounding error imagination: - header fix nouveau: - connector status fix fb/defio: - NULL ptr fix for defio drivers i915: - Fix encoder HW state readout for DP UHBR MST xe: - OA uapi fix (Umesh) - Userptr related fixes - Remove a duplicated register entry - Scheduler related fix to prevent exec races when freeing it" * tag 'drm-fixes-2025-02-28' of https://gitlab.freedesktop.org/drm/kernel: (25 commits) drm/fbdev-dma: Add shadow buffering for deferred I/O drm/nouveau: Do not override forced connector status drm/i915/dp_mst: Fix encoder HW state readout for UHBR MST drm/xe: cancel pending job timer before freeing scheduler drm/xe/regs: remove a duplicate definition for RING_CTL_SIZE(size) drm/imagination: remove unnecessary header include path drm/amdgpu: init return value in amdgpu_ttm_clear_buffer drm/amd/display: Fix HPD after gpu reset drm/amd/display: add a quirk to enable eDP0 on DP1 drm/amd/display: Disable PSR-SU on eDP panels MAINTAINERS: Update AMDGPU DML maintainers info drm/amd/display: restore edid reading from a given i2c adapter mailmap: Add entry for Rodrigo Siqueira MAINTAINERS: Change my role from Maintainer to Reviewer drm/amdgpu/mes: keep enforce isolation up to date drm/amdgpu/gfx: only call mes for enforce isolation if supported MAINTAINERS: update amdgpu maintainers list drm/amdgpu: disable BAR resize on Dell G5 SE drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd amdgpu/pm/legacy: fix suspend/resume issues ...
2025-02-27ftrace: Avoid potential division by zero in function_stat_show()Nikolay Kuratov
Check whether denominator expression x * (x - 1) * 1000 mod {2^32, 2^64} produce zero and skip stddev computation in that case. For now don't care about rec->counter * rec->counter overflow because rec->time * rec->time overflow will likely happen earlier. Cc: stable@vger.kernel.org Cc: Wen Yang <wenyang@linux.alibaba.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250206090156.1561783-1-kniv@yandex-team.ru Fixes: e31f7939c1c27 ("ftrace: Avoid potential division by zero in function profiler") Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-27selftests/ftrace: Let fprobe test consider already enabled functionsHeiko Carstens
The fprobe test fails on Fedora 41 since the fprobe test assumption that the number of enabled_functions is zero before the test starts is not necessarily true. Some user space tools, like systemd, add BPF programs that attach to functions. Those will show up in the enabled_functions table and must be taken into account by the fprobe test. Therefore count the number of lines of enabled_functions before tests start, and use that as base when comparing expected results. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250226142703.910860-1-hca@linux.ibm.com Fixes: e85c5e9792b9 ("selftests/ftrace: Update fprobe test to check enabled_functions file") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-27tracing: Fix bad hist from corrupting named_triggers listSteven Rostedt
The following commands causes a crash: ~# cd /sys/kernel/tracing/events/rcu/rcu_callback ~# echo 'hist:name=bad:keys=common_pid:onmax(bogus).save(common_pid)' > trigger bash: echo: write error: Invalid argument ~# echo 'hist:name=bad:keys=common_pid' > trigger Because the following occurs: event_trigger_write() { trigger_process_regex() { event_hist_trigger_parse() { data = event_trigger_alloc(..); event_trigger_register(.., data) { cmd_ops->reg(.., data, ..) [hist_register_trigger()] { data->ops->init() [event_hist_trigger_init()] { save_named_trigger(name, data) { list_add(&data->named_list, &named_triggers); } } } } ret = create_actions(); (return -EINVAL) if (ret) goto out_unreg; [..] ret = hist_trigger_enable(data, ...) { list_add_tail_rcu(&data->list, &file->triggers); <<<---- SKIPPED!!! (this is important!) [..] out_unreg: event_hist_unregister(.., data) { cmd_ops->unreg(.., data, ..) [hist_unregister_trigger()] { list_for_each_entry(iter, &file->triggers, list) { if (!hist_trigger_match(data, iter, named_data, false)) <- never matches continue; [..] test = iter; } if (test && test->ops->free) <<<-- test is NULL test->ops->free(test) [event_hist_trigger_free()] { [..] if (data->name) del_named_trigger(data) { list_del(&data->named_list); <<<<-- NEVER gets removed! } } } } [..] kfree(data); <<<-- frees item but it is still on list The next time a hist with name is registered, it causes an u-a-f bug and the kernel can crash. Move the code around such that if event_trigger_register() succeeds, the next thing called is hist_trigger_enable() which adds it to the list. A bunch of actions is called if get_named_trigger_data() returns false. But that doesn't need to be called after event_trigger_register(), so it can be moved up, allowing event_trigger_register() to be called just before hist_trigger_enable() keeping them together and allowing the file->triggers to be properly populated. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250227163944.1c37f85f@gandalf.local.home Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reported-by: Tomas Glozar <tglozar@redhat.com> Tested-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: Tom Zanussi <zanussi@kernel.org> Closes: https://lore.kernel.org/all/CAP4=nvTsxjckSBTz=Oe_UYh8keD9_sZC4i++4h72mJLic4_W4A@mail.gmail.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-28Merge tag 'drm-xe-fixes-2025-02-27' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes uAPI: - OA uapi fix (Umesh) Driver: - Userptr related fixes (Auld) - Remove a duplicated register entry (Mingong) - Scheduler related fix to prevent exec races when freeing it (Tejas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z8CSqJre1VCjPXt2@intel.com
2025-02-28Merge tag 'drm-intel-fixes-2025-02-27' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix encoder HW state readout for DP UHBR MST (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z8CRM7XzlerbWSJy@intel.com
2025-02-28Merge tag 'drm-misc-fixes-2025-02-27' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Fix a rounding error in vkms, a header fix for img, a connector status fix for nouveau, and a NULL pointer dereference fix for deferred IO drivers. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227-antique-robust-earthworm-09dfd1@houat
2025-02-28Merge tag 'amd-drm-fixes-6.14-2025-02-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.14-2025-02-26: amdgpu: - Legacy dpm suspend/resume fix - Runtime PM fix for DELL G5 SE - MAINTAINERS updates - Enforce Isolation fixes - mailmap update - EDID reading i2c fix - PSR fix - eDP fix - HPD interrupt handling fix - Clear memory fix amdkfd: - MQD handling fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226200342.3685347-1-alexander.deucher@amd.com
2025-02-27sched/core: Prevent rescheduling when interrupts are disabledThomas Gleixner
David reported a warning observed while loop testing kexec jump: Interrupts enabled after irqrouter_resume+0x0/0x50 WARNING: CPU: 0 PID: 560 at drivers/base/syscore.c:103 syscore_resume+0x18a/0x220 kernel_kexec+0xf6/0x180 __do_sys_reboot+0x206/0x250 do_syscall_64+0x95/0x180 The corresponding interrupt flag trace: hardirqs last enabled at (15573): [<ffffffffa8281b8e>] __up_console_sem+0x7e/0x90 hardirqs last disabled at (15580): [<ffffffffa8281b73>] __up_console_sem+0x63/0x90 That means __up_console_sem() was invoked with interrupts enabled. Further instrumentation revealed that in the interrupt disabled section of kexec jump one of the syscore_suspend() callbacks woke up a task, which set the NEED_RESCHED flag. A later callback in the resume path invoked cond_resched() which in turn led to the invocation of the scheduler: __cond_resched+0x21/0x60 down_timeout+0x18/0x60 acpi_os_wait_semaphore+0x4c/0x80 acpi_ut_acquire_mutex+0x3d/0x100 acpi_ns_get_node+0x27/0x60 acpi_ns_evaluate+0x1cb/0x2d0 acpi_rs_set_srs_method_data+0x156/0x190 acpi_pci_link_set+0x11c/0x290 irqrouter_resume+0x54/0x60 syscore_resume+0x6a/0x200 kernel_kexec+0x145/0x1c0 __do_sys_reboot+0xeb/0x240 do_syscall_64+0x95/0x180 This is a long standing problem, which probably got more visible with the recent printk changes. Something does a task wakeup and the scheduler sets the NEED_RESCHED flag. cond_resched() sees it set and invokes schedule() from a completely bogus context. The scheduler enables interrupts after context switching, which causes the above warning at the end. Quite some of the code paths in syscore_suspend()/resume() can result in triggering a wakeup with the exactly same consequences. They might not have done so yet, but as they share a lot of code with normal operations it's just a question of time. The problem only affects the PREEMPT_NONE and PREEMPT_VOLUNTARY scheduling models. Full preemption is not affected as cond_resched() is disabled and the preemption check preemptible() takes the interrupt disabled flag into account. Cure the problem by adding a corresponding check into cond_resched(). Reported-by: David Woodhouse <dwmw@amazon.co.uk> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@infradead.org
2025-02-27arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation levelRyan Roberts
commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in flush_hugetlb_tlb_range() was not updated. So as it stands, when trying to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will spuriously try to invalidate at level 0 on LPA2-enabled systems. Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are at it, explicitly use the correct stride and level for CONT_PMD_SIZE and CONT_PTE_SIZE, which should provide a minor optimization. Cc: stable@vger.kernel.org Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2") Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20250226120656.2400136-4-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-02-27arm64: hugetlb: Fix huge_ptep_get_and_clear() for non-present ptesRyan Roberts
arm64 supports multiple huge_pte sizes. Some of the sizes are covered by a single pte entry at a particular level (PMD_SIZE, PUD_SIZE), and some are covered by multiple ptes at a particular level (CONT_PTE_SIZE, CONT_PMD_SIZE). So the function has to figure out the size from the huge_pte pointer. This was previously done by walking the pgtable to determine the level and by using the PTE_CONT bit to determine the number of ptes at the level. But the PTE_CONT bit is only valid when the pte is present. For non-present pte values (e.g. markers, migration entries), the previous implementation was therefore erroneously determining the size. There is at least one known caller in core-mm, move_huge_pte(), which may call huge_ptep_get_and_clear() for a non-present pte. So we must be robust to this case. Additionally the "regular" ptep_get_and_clear() is robust to being called for non-present ptes so it makes sense to follow the behavior. Fix this by using the new sz parameter which is now provided to the function. Additionally when clearing each pte in a contig range, don't gather the access and dirty bits if the pte is not present. An alternative approach that would not require API changes would be to store the PTE_CONT bit in a spare bit in the swap entry pte for the non-present case. But it felt cleaner to follow other APIs' lead and just pass in the size. As an aside, PTE_CONT is bit 52, which corresponds to bit 40 in the swap entry offset field (layout of non-present pte). Since hugetlb is never swapped to disk, this field will only be populated for markers, which always set this bit to 0 and hwpoison swap entries, which set the offset field to a PFN; So it would only ever be 1 for a 52-bit PVA system where memory in that high half was poisoned (I think!). So in practice, this bit would almost always be zero for non-present ptes and we would only clear the first entry if it was actually a contiguous block. That's probably a less severe symptom than if it was always interpreted as 1 and cleared out potentially-present neighboring PTEs. Cc: stable@vger.kernel.org Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://lore.kernel.org/r/20250226120656.2400136-3-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-02-27mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()Ryan Roberts
In order to fix a bug, arm64 needs to be told the size of the huge page for which the huge_pte is being cleared in huge_ptep_get_and_clear(). Provide for this by adding an `unsigned long sz` parameter to the function. This follows the same pattern as huge_pte_clear() and set_huge_pte_at(). This commit makes the required interface modifications to the core mm as well as all arches that implement this function (arm64, loongarch, mips, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed in a separate commit. Cc: stable@vger.kernel.org Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> # s390 Link: https://lore.kernel.org/r/20250226120656.2400136-2-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-02-27Merge tag 'net-6.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth. We didn't get netfilter or wireless PRs this week, so next week's PR is probably going to be bigger. A healthy dose of fixes for bugs introduced in the current release nonetheless. Current release - regressions: - Bluetooth: always allow SCO packets for user channel - af_unix: fix memory leak in unix_dgram_sendmsg() - rxrpc: - remove redundant peer->mtu_lock causing lockdep splats - fix spinlock flavor issues with the peer record hash - eth: iavf: fix circular lock dependency with netdev_lock - net: use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net() RDMA driver register notifier after the device Current release - new code bugs: - ethtool: fix ioctl confusing drivers about desired HDS user config - eth: ixgbe: fix media cage present detection for E610 device Previous releases - regressions: - loopback: avoid sending IP packets without an Ethernet header - mptcp: reset connection when MPTCP opts are dropped after join Previous releases - always broken: - net: better track kernel sockets lifetime - ipv6: fix dst ref loop on input in seg6 and rpl lw tunnels - phy: qca807x: use right value from DTS for DAC_DSP_BIAS_CURRENT - eth: enetc: number of error handling fixes - dsa: rtl8366rb: reshuffle the code to fix config / build issue with LED support" * tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits) net: ti: icss-iep: Reject perout generation request idpf: fix checksums set in idpf_rx_rsc() selftests: drv-net: Check if combined-count exists net: ipv6: fix dst ref loop on input in rpl lwt net: ipv6: fix dst ref loop on input in seg6 lwt usbnet: gl620a: fix endpoint checking in genelink_bind() net/mlx5: IRQ, Fix null string in debug print net/mlx5: Restore missing trace event when enabling vport QoS net/mlx5: Fix vport QoS cleanup on error net: mvpp2: cls: Fixed Non IP flow, with vlan tag flow defination. af_unix: Fix memory leak in unix_dgram_sendmsg() net: Handle napi_schedule() calls from non-interrupt net: Clear old fragment checksum value in napi_reuse_skb gve: unlink old napi when stopping a queue using queue API net: Use rtnl_net_dev_lock() in register_netdevice_notifier_dev_net(). tcp: Defer ts_recent changes until req is owned net: enetc: fix the off-by-one issue in enetc_map_tx_tso_buffs() net: enetc: remove the mm_lock from the ENETC v4 driver net: enetc: add missing enetc4_link_deinit() net: enetc: update UDP checksum when updating originTimestamp field ...
2025-02-27efi/mokvar-table: Avoid repeated map/unmap of the same pageArd Biesheuvel
Tweak the logic that traverses the MOKVAR UEFI configuration table to only unmap the entry header and map the next one if they don't live in the same physical page. Link: https://lore.kernel.org/all/8f085931-3e9d-4386-9209-1d6c95616327@uncooperative.org/ Tested-By: Peter Jones <pjones@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-02-27efi: Don't map the entire mokvar table to determine its sizePeter Jones
Currently, when validating the mokvar table, we (re)map the entire table on each iteration of the loop, adding space as we discover new entries. If the table grows over a certain size, this fails due to limitations of early_memmap(), and we get a failure and traceback: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:139 __early_ioremap+0xef/0x220 ... Call Trace: <TASK> ? __early_ioremap+0xef/0x220 ? __warn.cold+0x93/0xfa ? __early_ioremap+0xef/0x220 ? report_bug+0xff/0x140 ? early_fixup_exception+0x5d/0xb0 ? early_idt_handler_common+0x2f/0x3a ? __early_ioremap+0xef/0x220 ? efi_mokvar_table_init+0xce/0x1d0 ? setup_arch+0x864/0xc10 ? start_kernel+0x6b/0xa10 ? x86_64_start_reservations+0x24/0x30 ? x86_64_start_kernel+0xed/0xf0 ? common_startup_64+0x13e/0x141 </TASK> ---[ end trace 0000000000000000 ]--- mokvar: Failed to map EFI MOKvar config table pa=0x7c4c3000, size=265187. Mapping the entire structure isn't actually necessary, as we don't ever need more than one entry header mapped at once. Changes efi_mokvar_table_init() to only map each entry header, not the entire table, when determining the table size. Since we're not mapping any data past the variable name, it also changes the code to enforce that each variable name is NUL terminated, rather than attempting to verify it in place. Cc: <stable@vger.kernel.org> Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-02-27Merge tag 'sound-6.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of fixes. The only slightly large change is for ASoC Cirrus codec, but that's still in a normal range. All the rest are small device-specific fixes and should be fairly safe to take" * tag 'sound-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix microphone regression on ASUS N705UD ALSA: hda/realtek: Fix wrong mic setup for ASUS VivoBook 15 ASoC: cs35l56: Prevent races when soft-resetting using SPI control firmware: cs_dsp: Remove async regmap writes ASoC: Intel: sof_sdw: warn both sdw and pch dmic are used ASoC: SOF: Intel: don't check number of sdw links when set dmic_fixup ASoC: dapm-graph: set fill colour of turned on nodes ASoC: fsl: Rename stream name of SAI DAI driver ASoC: es8328: fix route from DAC to output ALSA: usb-audio: Re-add sample rate quirk for Pioneer DJM-900NXS2 ASoC: tas2764: Set the SDOUT polarity correctly ASoC: tas2764: Fix power control mask ALSA: usb-audio: Avoid dropping MIDI events at closing multiple ports ASoC: tas2770: Fix volume scale
2025-02-27net: ti: icss-iep: Reject perout generation requestMeghana Malladi
IEP driver supports both perout and pps signal generation but perout feature is faulty with half-cooked support due to some missing configuration. Remove perout support from the driver and reject perout requests with "not supported" error code. Fixes: c1e0230eeaab2 ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250227092441.1848419-1-m-malladi@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27idpf: fix checksums set in idpf_rx_rsc()Eric Dumazet
idpf_rx_rsc() uses skb_transport_offset(skb) while the transport header is not set yet. This triggers the following warning for CONFIG_DEBUG_NET=y builds. DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb)) [ 69.261620] WARNING: CPU: 7 PID: 0 at ./include/linux/skbuff.h:3020 idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf [ 69.261629] Modules linked in: vfat fat dummy bridge intel_uncore_frequency_tpmi intel_uncore_frequency_common intel_vsec_tpmi idpf intel_vsec cdc_ncm cdc_eem cdc_ether usbnet mii xhci_pci xhci_hcd ehci_pci ehci_hcd libeth [ 69.261644] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Tainted: G S W 6.14.0-smp-DEV #1697 [ 69.261648] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN [ 69.261650] RIP: 0010:idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf [ 69.261677] ? __warn (kernel/panic.c:242 kernel/panic.c:748) [ 69.261682] ? idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf [ 69.261687] ? report_bug (lib/bug.c:?) [ 69.261690] ? handle_bug (arch/x86/kernel/traps.c:285) [ 69.261694] ? exc_invalid_op (arch/x86/kernel/traps.c:309) [ 69.261697] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) [ 69.261700] ? __pfx_idpf_vport_splitq_napi_poll (drivers/net/ethernet/intel/idpf/idpf_txrx.c:4011) idpf [ 69.261704] ? idpf_vport_splitq_napi_poll (include/linux/skbuff.h:3020) idpf [ 69.261708] ? idpf_vport_splitq_napi_poll (drivers/net/ethernet/intel/idpf/idpf_txrx.c:3072) idpf [ 69.261712] __napi_poll (net/core/dev.c:7194) [ 69.261716] net_rx_action (net/core/dev.c:7265) [ 69.261718] ? __qdisc_run (net/sched/sch_generic.c:293) [ 69.261721] ? sched_clock (arch/x86/include/asm/preempt.h:84 arch/x86/kernel/tsc.c:288) [ 69.261726] handle_softirqs (kernel/softirq.c:561) Fixes: 3a8845af66edb ("idpf: add RX splitq napi poll support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alan Brady <alan.brady@intel.com> Cc: Joshua Hay <joshua.a.hay@intel.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20250226221253.1927782-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27selftests: drv-net: Check if combined-count existsJoe Damato
Some drivers, like tg3, do not set combined-count: $ ethtool -l enp4s0f1 Channel parameters for enp4s0f1: Pre-set maximums: RX: 4 TX: 4 Other: n/a Combined: n/a Current hardware settings: RX: 4 TX: 1 Other: n/a Combined: n/a In the case where combined-count is not set, the ethtool netlink code in the kernel elides the value and the code in the test: netnl.channels_get(...) With a tg3 device, the returned dictionary looks like: {'header': {'dev-index': 3, 'dev-name': 'enp4s0f1'}, 'rx-max': 4, 'rx-count': 4, 'tx-max': 4, 'tx-count': 1} Note that the key 'combined-count' is missing. As a result of this missing key the test raises an exception: # Exception| if channels['combined-count'] == 0: # Exception| ~~~~~~~~^^^^^^^^^^^^^^^^^^ # Exception| KeyError: 'combined-count' Change the test to check if 'combined-count' is a key in the dictionary first and if not assume that this means the driver has separate RX and TX queues. With this change, the test now passes successfully on tg3 and mlx5 (which does have a 'combined-count'). Fixes: 1cf270424218 ("net: selftest: add test for netdev netlink queue-get API") Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250226181957.212189-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27Merge branch 'fixes-for-seg6-and-rpl-lwtunnels-on-input'Paolo Abeni
Justin Iurman says: ==================== fixes for seg6 and rpl lwtunnels on input As a follow up to commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels"), we also need a conditional dst cache on input for seg6_iptunnel and rpl_iptunnel to prevent dst ref loops (i.e., if the packet destination did not change, we may end up recording a reference to the lwtunnel in its own cache, and the lwtunnel state will never be freed). This series provides a fix to respectively prevent a dst ref loop on input in seg6_iptunnel and rpl_iptunnel. v2: - https://lore.kernel.org/netdev/20250211221624.18435-1-justin.iurman@uliege.be/ v1: - https://lore.kernel.org/netdev/20250209193840.20509-1-justin.iurman@uliege.be/ ==================== Link: https://patch.msgid.link/20250225175139.25239-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-27net: ipv6: fix dst ref loop on input in rpl lwtJustin Iurman
Prevent a dst ref loop on input in rpl_iptunnel. Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel") Cc: Alexander Aring <alex.aring@gmail.com> Cc: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-27net: ipv6: fix dst ref loop on input in seg6 lwtJustin Iurman
Prevent a dst ref loop on input in seg6_iptunnel. Fixes: af4a2209b134 ("ipv6: sr: use dst_cache in seg6_input") Cc: David Lebrun <dlebrun@google.com> Cc: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-27x86/cpu: Remove get_this_hybrid_cpu_*()Pawan Gupta
Because calls to get_this_hybrid_cpu_type() and get_this_hybrid_cpu_native_id() are not required now. cpu-type and native-model-id are cached at boot in per-cpu struct cpuinfo_topology. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-4-2ae010f50370@linux.intel.com
2025-02-27perf/x86/intel: Use cache cpu-type for hybrid PMU selectionPawan Gupta
get_this_hybrid_cpu_type() misses a case when cpu-type is populated regardless of X86_FEATURE_HYBRID_CPU. This is particularly true for hybrid variants that have P or E cores fused off. Instead use the cpu-type cached in struct x86_topology, as it does not rely on hybrid feature to enumerate cpu-type. This can also help avoid the model-specific fixup get_hybrid_cpu_type(). Also replace the get_this_hybrid_cpu_native_id() with its cached value in struct x86_topology. While at it, remove enum hybrid_cpu_type as it serves no purpose when we have the exact cpu-types defined in enum intel_cpu_type. Also rename atom_native_id to intel_native_id and move it to intel-family.h where intel_cpu_type lives. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-3-2ae010f50370@linux.intel.com
2025-02-27cpufreq: intel_pstate: Avoid SMP calls to get cpu-typePawan Gupta
Intel pstate driver relies on SMP calls to get the cpu-type of a given CPU. Remove the SMP calls and instead use the cached value of cpu-type which is more efficient. [ mingo: Forward ported it. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-2-2ae010f50370@linux.intel.com
2025-02-27x86/cpu: Prefix hexadecimal values with 0x in cpu_debug_show()Pawan Gupta
The hex values in CPU debug interface are not prefixed with 0x. This may cause misinterpretation of values. Fix it. [ mingo: Restore previous vertical alignment of the output. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20241211-add-cpu-type-v5-1-2ae010f50370@linux.intel.com
2025-02-27usbnet: gl620a: fix endpoint checking in genelink_bind()Nikita Zhandarovich
Syzbot reports [1] a warning in usb_submit_urb() triggered by inconsistencies between expected and actually present endpoints in gl620a driver. Since genelink_bind() does not properly verify whether specified eps are in fact provided by the device, in this case, an artificially manufactured one, one may get a mismatch. Fix the issue by resorting to a usbnet utility function usbnet_get_endpoints(), usually reserved for this very problem. Check for endpoints and return early before proceeding further if any are missing. [1] Syzbot report: usb 5-1: Manufacturer: syz usb 5-1: SerialNumber: syz usb 5-1: config 0 descriptor?? gl620a 5-1:0.23 usb0: register 'gl620a' at usb-dummy_hcd.0-1, ... ------------[ cut here ]------------ usb 5-1: BOGUS urb xfer, pipe 3 != type 1 WARNING: CPU: 2 PID: 1841 at drivers/usb/core/urb.c:503 usb_submit_urb+0xe4b/0x1730 drivers/usb/core/urb.c:503 Modules linked in: CPU: 2 UID: 0 PID: 1841 Comm: kworker/2:2 Not tainted 6.12.0-syzkaller-07834-g06afb0f36106 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Workqueue: mld mld_ifc_work RIP: 0010:usb_submit_urb+0xe4b/0x1730 drivers/usb/core/urb.c:503 ... Call Trace: <TASK> usbnet_start_xmit+0x6be/0x2780 drivers/net/usb/usbnet.c:1467 __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x9a/0x7b0 net/core/dev.c:3606 sch_direct_xmit+0x1ae/0xc30 net/sched/sch_generic.c:343 __dev_xmit_skb net/core/dev.c:3827 [inline] __dev_queue_xmit+0x13d4/0x43e0 net/core/dev.c:4400 dev_queue_xmit include/linux/netdevice.h:3168 [inline] neigh_resolve_output net/core/neighbour.c:1514 [inline] neigh_resolve_output+0x5bc/0x950 net/core/neighbour.c:1494 neigh_output include/net/neighbour.h:539 [inline] ip6_finish_output2+0xb1b/0x2070 net/ipv6/ip6_output.c:141 __ip6_finish_output net/ipv6/ip6_output.c:215 [inline] ip6_finish_output+0x3f9/0x1360 net/ipv6/ip6_output.c:226 NF_HOOK_COND include/linux/netfilter.h:303 [inline] ip6_output+0x1f8/0x540 net/ipv6/ip6_output.c:247 dst_output include/net/dst.h:450 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] mld_sendpack+0x9f0/0x11d0 net/ipv6/mcast.c:1819 mld_send_cr net/ipv6/mcast.c:2120 [inline] mld_ifc_work+0x740/0xca0 net/ipv6/mcast.c:2651 process_one_work+0x9c5/0x1ba0 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Reported-by: syzbot+d693c07c6f647e0388d3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d693c07c6f647e0388d3 Fixes: 47ee3051c856 ("[PATCH] USB: usbnet (5/9) module for genesys gl620a cables") Cc: stable@vger.kernel.org Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://patch.msgid.link/20250224172919.1220522-1-n.zhandarovich@fintech.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-27efivarfs: allow creation of zero length filesJames Bottomley
Temporarily allow the creation of zero length files in efivarfs so the 'fwupd' user space firmware update tool can continue to operate. This hack should be reverted as soon as the fwupd mechanisms for updating firmware have been fixed. fwupd has been coded to open a firmware file, close it, remove the immutable bit and write to it. Since commit 908af31f4896 ("efivarfs: fix error on write to new variable leaving remnants") this behaviour results in the first close removing the file which causes the second write to fail. To allow fwupd to keep working code up an indicator of size 1 if a write fails and only remove the file on that condition (so create at zero size is allowed). Tested-by: Richard Hughes <richard@hughsie.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> [ardb: replace LVFS with fwupd, as suggested by Richard] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-02-27x86/platform: Only allow CONFIG_EISA for 32-bitArnd Bergmann
The CONFIG_EISA menu was cleaned up in 2018, but this inadvertently brought the option back on 64-bit machines: ISA remains guarded by a CONFIG_X86_32 check, but EISA no longer depends on ISA. The last Intel machines ith EISA support used a 82375EB PCI/EISA bridge from 1993 that could be paired with the 440FX chipset on early Pentium-II CPUs, long before the first x86-64 products. Fixes: 6630a8e50105 ("eisa: consolidate EISA Kconfig entry in drivers/eisa") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-11-arnd@kernel.org
2025-02-27x86/pci: Remove old STA2x11 supportArnd Bergmann
ST ConneXt STA2x11 was an interface chip for Atom E6xx processors, using a number of components usually found on Arm SoCs. Most of this was merged upstream, but it was never complete enough to actually work and has been abandoned for many years. We already had an agreement on removing it in 2022, but nobody ever submitted the patch to do it. Without STA2x11, CONFIG_X86_32_NON_STANDARD no longer has any use - remove it. Suggested-by: Davide Ciminaghi <ciminaghi@gnudd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-10-arnd@kernel.org
2025-02-27x86/cpu: Document CONFIG_X86_INTEL_MID as 64-bit-onlyArnd Bergmann
The X86_INTEL_MID code was originally introduced for the 32-bit Moorestown/Medfield/Clovertrail platform, later the 64-bit Merrifield/Moorefield variants were added, but the final Morganfield 14nm platform was canceled before it hit the market. To help users understand what the option actually refers to, update the help text, and add a dependency on 64-bit kernels. Ferry confirmed that all the hardware can run 64-bit kernels these days, but is still testing 32-bit kernels on the Intel Edison board, so this remains possible, but is guarded by a CONFIG_EXPERT dependency now, to gently push remaining users towards using CONFIG_64BIT. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Andy Shevchenko <andy@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-9-arnd@kernel.org
2025-02-27x86/mm: Drop support for CONFIG_HIGHPTEArnd Bergmann
With the maximum amount of RAM now 4GB, there is very little point to still have PTE pages in highmem. Drop this for simplification. The only other architecture supporting HIGHPTE is 32-bit arm, and once that feature is removed as well, the highpte logic can be dropped from common code as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-8-arnd@kernel.org
2025-02-27x86/mm: Drop CONFIG_SWIOTLB for PAEArnd Bergmann
Since kernels with and without CONFIG_X86_PAE are now limited to the low 4GB of physical address space, there is no need to use swiotlb any more, so stop selecting this. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-7-arnd@kernel.org
2025-02-27x86/mm: Remove CONFIG_HIGHMEM64G supportArnd Bergmann
HIGHMEM64G support was added in linux-2.3.25 to support (then) high-end Pentium Pro and Pentium III Xeon servers with more than 4GB of addressing, NUMA and PCI-X slots started appearing. I have found no evidence of this ever being used in regular dual-socket servers or consumer devices, all the users seem obsolete these days, even by i386 standards: - Support for NUMA servers (NUMA-Q, IBM x440, unisys) was already removed ten years ago. - 4+ socket non-NUMA servers based on Intel 450GX/450NX, HP F8 and ServerWorks ServerSet/GrandChampion could theoretically still work with 8GB, but these were exceptionally rare even 20 years ago and would have usually been equipped with than the maximum amount of RAM. - Some SKUs of the Celeron D from 2004 had 64-bit mode fused off but could still work in a Socket 775 mainboard designed for the later Core 2 Duo and 8GB. Apparently most BIOSes at the time only allowed 64-bit CPUs. - The rare Xeon LV "Sossaman" came on a few motherboards with registered DDR2 memory support up to 16GB. - In the early days of x86-64 hardware, there was sometimes the need to run a 32-bit kernel to work around bugs in the hardware drivers, or in the syscall emulation for 32-bit userspace. This likely still works but there should never be a need for this any more. PAE mode is still required to get access to the 'NX' bit on Atom 'Pentium M' and 'Core Duo' CPUs. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-6-arnd@kernel.org
2025-02-27x86/cpu: Drop configuration options for early 64-bit CPUsArnd Bergmann
The x86 CPU selection menu is confusing for a number of reasons: When configuring 32-bit kernels, it shows a small number of early 64-bit microarchitectures (K8, Core 2) but not the regular generic 64-bit target that is the normal default. There is no longer a reason to run 32-bit kernels on production 64-bit systems, so only actual 32-bit CPUs need to be shown here. When configuring 64-bit kernels, the options also pointless as there is no way to pick any CPU from the past 15 years, leaving GENERIC_CPU as the only sensible choice. Address both of the above by removing the obsolete options and making all 64-bit kernels run on both Intel and AMD CPUs from any generation. Testing generic 32-bit kernels on 64-bit hardware remains possible, just not building a 32-bit kernel that requires a 64-bit CPU. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-5-arnd@kernel.org
2025-02-27x86/build: Rework CONFIG_GENERIC_CPU compiler flagsArnd Bergmann
Building an x86-64 kernel with CONFIG_GENERIC_CPU is documented to run on all CPUs, but the Makefile does not actually pass an -march= argument, instead relying on the default that was used to configure the toolchain. In many cases, gcc will be configured to -march=x86-64 or -march=k8 for maximum compatibility, but in other cases a distribution default may be either raised to a more recent ISA, or set to -march=native to build for the CPU used for compilation. This still works in the case of building a custom kernel for the local machine. The point where it breaks down is building a kernel for another machine that is older the the default target. Changing the default to -march=x86-64 would make it work reliable, but possibly produce worse code on distros that intentionally default to a newer ISA. To allow reliably building a kernel for either the oldest x86-64 CPUs, pass the -march=x86-64 flag to the compiler. This was not possible in early versions of x86-64 gcc, but works on all currently supported versions down to at least gcc-5. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-4-arnd@kernel.org
2025-02-27x86/smp: Drop 32-bit "bigsmp" machine supportArnd Bergmann
The x86-32 kernel used to support multiple platforms with more than eight logical CPUs, from the 1999-2003 timeframe: Sequent NUMA-Q, IBM Summit, Unisys ES7000 and HP F8. Support for all except the latter was dropped back in 2014, leaving only the F8 based DL740 and DL760 G2 machines in this catery, with up to eight single-core Socket-603 Xeon-MP processors with hyperthreading. Like the already removed machines, the HP F8 servers at the time cost upwards of $100k in typical configurations, but were quickly obsoleted by their 64-bit Socket-604 cousins and the AMD Opteron. Earlier servers with up to 8 Pentium Pro or Xeon processors remain fully supported as they had no hyperthreading. Similarly, the more common 4-socket Xeon-MP machines with hyperthreading using Intel or ServerWorks chipsets continue to work without this, and all the multi-core Xeon processors also run 64-bit kernels. While the "bigsmp" support can also be used to run on later 64-bit machines (including VM guests), it seems best to discourage that and get any remaining users to update their kernels to 64-bit builds on these. As a side-effect of this, there is also no more need to support NUMA configurations on 32-bit x86, as all true 32-bit NUMA platforms are already gone. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250226213714.4040853-3-arnd@kernel.org
2025-02-27x86/Kconfig: Add cmpxchg8b support back to Geode CPUsArnd Bergmann
An older cleanup of mine inadvertently removed geode-gx1 and geode-lx from the list of CPUs that are known to support a working cmpxchg8b. Fixes: 88a2b4edda3d ("x86/Kconfig: Rework CONFIG_X86_PAE dependency") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250226213714.4040853-2-arnd@kernel.org
2025-02-27Merge branch 'x86/mm' into x86/cpu, to avoid conflictsIngo Molnar
We are going to apply a new series that conflicts with pending work in x86/mm, so merge in x86/mm to avoid it, and also to refresh the x86/cpu branch with fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-02-27MIPS: Ignore relocs against __ex_table for relocatable kernelXi Ruoyao
Since commit 6f2c2f93a190 ("scripts/sorttable: Remove unneeded Elf_Rel"), sorttable no longer clears relocs against __ex_table, claiming "it was never used." But in fact MIPS relocatable kernel had been implicitly depending on this behavior, so after this commit the MIPS relocatable kernel has started to spit oops like: CPU 1 Unable to handle kernel paging request at virtual address 000000fffbbdbff8, epc == ffffffff818f9a6c, ra == ffffffff813ad7d0 ... ... Call Trace: [<ffffffff818f9a6c>] __raw_copy_from_user+0x48/0x2fc [<ffffffff813ad7d0>] cp_statx+0x1a0/0x1e0 [<ffffffff813ae528>] do_statx_fd+0xa8/0x118 [<ffffffff813ae670>] sys_statx+0xd8/0xf8 [<ffffffff81156cc8>] syscall_common+0x34/0x58 So ignore those relocs on our own to fix the issue. Fixes: 6f2c2f93a190 ("scripts/sorttable: Remove unneeded Elf_Rel") Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2025-02-27x86/mm: Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RWMatthew Wilcox (Oracle)
The bit pattern of _PAGE_DIRTY set and _PAGE_RW clear is used to mark shadow stacks. This is currently checked for in mk_pte() but not pfn_pte(). If we add the check to pfn_pte(), it catches vfree() calling set_direct_map_invalid_noflush() which calls __change_page_attr() which loads the old protection bits from the PTE, clears the specified bits and uses pfn_pte() to construct the new PTE. We should, therefore, for kernel mappings, clear the _PAGE_DIRTY bit consistently whenever we clear _PAGE_RW. I opted to do it in the callers in case we want to use __change_page_attr() to create shadow stacks inside the kernel at some point in the future. Arguably, we might also want to clear _PAGE_ACCESSED here. Note that the 3 functions involved: __set_pages_np() kernel_map_pages_in_pgd() kernel_unmap_pages_in_pgd() Only ever manipulate non-swappable kernel mappings, so maintaining the DIRTY:1|RW:0 special pattern for shadow stacks and DIRTY:0 pattern for non-shadow-stack entries can be maintained consistently and doesn't result in the unintended clearing of a live dirty bit that could corrupt (destroy) dirty bit information for user mappings. Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/174051422675.10177.13226545170101706336.tip-bot2@tip-bot2 Closes: https://lore.kernel.org/oe-lkp/202502241646.719f4651-lkp@intel.com
2025-02-27drm/fbdev-dma: Add shadow buffering for deferred I/OThomas Zimmermann
DMA areas are not necessarily backed by struct page, so we cannot rely on it for deferred I/O. Allocate a shadow buffer for drivers that require deferred I/O and use it as framebuffer memory. Fixes driver errors about being "Unable to handle kernel NULL pointer dereference at virtual address" or "Unable to handle kernel paging request at virtual address". The patch splits drm_fbdev_dma_driver_fbdev_probe() in an initial allocation, which creates the DMA-backed buffer object, and a tail that sets up the fbdev data structures. There is a tail function for direct memory mappings and a tail function for deferred I/O with the shadow buffer. It is no longer possible to use deferred I/O without shadow buffer. It can be re-added if there exists a reliably test for usable struct page in the allocated DMA-backed buffer object. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Nuno Gonçalves <nunojpg@gmail.com> CLoses: https://lore.kernel.org/dri-devel/CAEXMXLR55DziAMbv_+2hmLeH-jP96pmit6nhs6siB22cpQFr9w@mail.gmail.com/ Tested-by: Nuno Gonçalves <nunojpg@gmail.com> Fixes: 5ab91447aa13 ("drm/tiny/ili9225: Use fbdev-dma") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: <stable@vger.kernel.org> # v6.11+ Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211090643.74250-1-tzimmermann@suse.de
2025-02-27dmaengine: Revert "dmaengine: qcom: bam_dma: Avoid writing unavailable register"Caleb Connolly
This commit causes a hard crash on sdm845 and likely other platforms. Revert it until a proper fix is found. This reverts commit 57a7138d0627: ("dmaengine: qcom: bam_dma: Avoid writing unavailable register") Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> Fixes: 57a7138d0627 ("dmaengine: qcom: bam_dma: Avoid writing unavailable register") Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on sdm845-DB845c Tested-by: David Heidelberg <david@ixit.cz> Link: https://lore.kernel.org/r/20250208223112.142567-1-caleb.connolly@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2025-02-26Merge branch 'mlx5-misc-fixes-2025-02-25'Jakub Kicinski
Tariq Toukan says: ==================== mlx5 misc fixes 2025-02-25 This small patchset provides misc bug fixes from the team to the mlx5 core driver. ==================== Link: https://patch.msgid.link/20250225072608.526866-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26net/mlx5: IRQ, Fix null string in debug printShay Drory
irq_pool_alloc() debug print can print a null string. Fix it by providing a default string to print. Fixes: 71e084e26414 ("net/mlx5: Allocating a pool of MSI-X vectors for SFs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501141055.SwfIphN0-lkp@intel.com/ Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250225072608.526866-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>