summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-15net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361Peter Rashleigh
According to the Marvell datasheet the 88E6361 has two VTU pages (4k VIDs per page) so the max_vid should be 8191, not 4095. In the current implementation mv88e6xxx_vtu_walk() gives unexpected results because of this error. I verified that mv88e6xxx_vtu_walk() works correctly on the MV88E6361 with this patch in place. Fixes: 12899f299803 ("net: dsa: mv88e6xxx: enable support for 88E6361 switch") Signed-off-by: Peter Rashleigh <peter@rashleigh.ca> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241014204342.5852-1-peter@rashleigh.ca Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().Kuniyuki Iwashima
Martin KaFai Lau reported use-after-free [0] in reqsk_timer_handler(). """ We are seeing a use-after-free from a bpf prog attached to trace_tcp_retransmit_synack. The program passes the req->sk to the bpf_sk_storage_get_tracing kernel helper which does check for null before using it. """ The commit 83fccfc3940c ("inet: fix potential deadlock in reqsk_queue_unlink()") added timer_pending() in reqsk_queue_unlink() not to call del_timer_sync() from reqsk_timer_handler(), but it introduced a small race window. Before the timer is called, expire_timers() calls detach_timer(timer, true) to clear timer->entry.pprev and marks it as not pending. If reqsk_queue_unlink() checks timer_pending() just after expire_timers() calls detach_timer(), TCP will miss del_timer_sync(); the reqsk timer will continue running and send multiple SYN+ACKs until it expires. The reported UAF could happen if req->sk is close()d earlier than the timer expiration, which is 63s by default. The scenario would be 1. inet_csk_complete_hashdance() calls inet_csk_reqsk_queue_drop(), but del_timer_sync() is missed 2. reqsk timer is executed and scheduled again 3. req->sk is accept()ed and reqsk_put() decrements rsk_refcnt, but reqsk timer still has another one, and inet_csk_accept() does not clear req->sk for non-TFO sockets 4. sk is close()d 5. reqsk timer is executed again, and BPF touches req->sk Let's not use timer_pending() by passing the caller context to __inet_csk_reqsk_queue_drop(). Note that reqsk timer is pinned, so the issue does not happen in most use cases. [1] [0] BUG: KFENCE: use-after-free read in bpf_sk_storage_get_tracing+0x2e/0x1b0 Use-after-free read at 0x00000000a891fb3a (in kfence-#1): bpf_sk_storage_get_tracing+0x2e/0x1b0 bpf_prog_5ea3e95db6da0438_tcp_retransmit_synack+0x1d20/0x1dda bpf_trace_run2+0x4c/0xc0 tcp_rtx_synack+0xf9/0x100 reqsk_timer_handler+0xda/0x3d0 run_timer_softirq+0x292/0x8a0 irq_exit_rcu+0xf5/0x320 sysvec_apic_timer_interrupt+0x6d/0x80 asm_sysvec_apic_timer_interrupt+0x16/0x20 intel_idle_irq+0x5a/0xa0 cpuidle_enter_state+0x94/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb kfence-#1: 0x00000000a72cc7b6-0x00000000d97616d9, size=2376, cache=TCPv6 allocated by task 0 on cpu 9 at 260507.901592s: sk_prot_alloc+0x35/0x140 sk_clone_lock+0x1f/0x3f0 inet_csk_clone_lock+0x15/0x160 tcp_create_openreq_child+0x1f/0x410 tcp_v6_syn_recv_sock+0x1da/0x700 tcp_check_req+0x1fb/0x510 tcp_v6_rcv+0x98b/0x1420 ipv6_list_rcv+0x2258/0x26e0 napi_complete_done+0x5b1/0x2990 mlx5e_napi_poll+0x2ae/0x8d0 net_rx_action+0x13e/0x590 irq_exit_rcu+0xf5/0x320 common_interrupt+0x80/0x90 asm_common_interrupt+0x22/0x40 cpuidle_enter_state+0xfb/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb freed by task 0 on cpu 9 at 260507.927527s: rcu_core_si+0x4ff/0xf10 irq_exit_rcu+0xf5/0x320 sysvec_apic_timer_interrupt+0x6d/0x80 asm_sysvec_apic_timer_interrupt+0x16/0x20 cpuidle_enter_state+0xfb/0x273 cpu_startup_entry+0x15e/0x260 start_secondary+0x8a/0x90 secondary_startup_64_no_verify+0xfa/0xfb Fixes: 83fccfc3940c ("inet: fix potential deadlock in reqsk_queue_unlink()") Reported-by: Martin KaFai Lau <martin.lau@kernel.org> Closes: https://lore.kernel.org/netdev/eb6684d0-ffd9-4bdc-9196-33f690c25824@linux.dev/ Link: https://lore.kernel.org/netdev/b55e2ca0-42f2-4b7c-b445-6ffd87ca74a0@linux.dev/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20241014223312.4254-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15drm/msm/a6xx+: Insert a fence wait before SMMU table updateRob Clark
The CP_SMMU_TABLE_UPDATE _should_ be waiting for idle, but on some devices (x1-85, possibly others), it seems to pass that barrier while there are still things in the event completion FIFO waiting to be written back to memory. Work around that by adding a fence wait before context switch. The CP_EVENT_WRITE that writes the fence is the last write from a submit, so seeing this value hit memory is a reliable indication that it is safe to proceed with the context switch. v2: Only emit CP_WAIT_TIMESTAMP on a7xx, as it is not supported on a6xx. Conversely, I've not been able to reproduce this issue on a6xx, so hopefully it is limited to a7xx, or perhaps just certain a7xx devices. Fixes: af66706accdf ("drm/msm/a6xx: Add skeleton A7xx support") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/63 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2024-10-15net: bcmasp: fix potential memory leak in bcmasp_xmit()Wang Hai
The bcmasp_xmit() returns NETDEV_TX_OK without freeing skb in case of mapping fails, add dev_kfree_skb() to fix it. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241014145901.48940-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-16powerpc/powernv: Free name on error in opal_event_init()Michael Ellerman
In opal_event_init() if request_irq() fails name is not freed, leading to a memory leak. The code only runs at boot time, there's no way for a user to trigger it, so there's no security impact. Fix the leak by freeing name in the error path. Reported-by: 2639161967 <2639161967@qq.com> Closes: https://lore.kernel.org/linuxppc-dev/87wmjp3wig.fsf@mail.lhotse Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20240920093520.67997-1-mpe@ellerman.id.au
2024-10-15drm/msm/dpu: don't always program merge_3d blockJessica Zhang
Only program the merge_3d block for the video phys encoder when the 3d blend mode is not NONE Fixes: 3e79527a33a8 ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250") Suggested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/619095/ Link: https://lore.kernel.org/r/20241009-merge3d-fix-v1-1-0d0b6f5c244e@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2024-10-15drm/msm/dpu: Don't always set merge_3d pending flushJessica Zhang
Don't set the merge_3d pending flush bits if the mode_3d is BLEND_3D_NONE. Always flushing merge_3d can cause timeout issues when there are multiple commits with concurrent writeback enabled. This is because the video phys enc waits for the hw_ctl flush register to be completely cleared [1] in its wait_for_commit_done(), but the WB encoder always sets the merge_3d pending flush during each commit regardless of if the merge_3d is actually active. This means that the hw_ctl flush register will never be 0 when there are multiple CWB commits and the video phys enc will hit vblank timeout errors after the first CWB commit. [1] commit fe9df3f50c39 ("drm/msm/dpu: add real wait_for_commit_done()") Fixes: 3e79527a33a8 ("drm/msm/dpu: enable merge_3d support on sm8150/sm8250") Fixes: d7d0e73f7de3 ("drm/msm/dpu: introduce the dpu_encoder_phys_* for writeback") Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/619092/ Link: https://lore.kernel.org/r/20241009-mode3d-fix-v1-1-c0258354fadc@quicinc.com Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2024-10-15irqchip/renesas-rzg2l: Fix missing put_deviceFabrizio Castro
rzg2l_irqc_common_init() calls of_find_device_by_node(), but the corresponding put_device() call is missing. This also gets reported by make coccicheck. Make use of the cleanup interfaces from cleanup.h to call into __free_put_device(), which in turn calls into put_device when leaving function rzg2l_irqc_common_init() and variable "dev" goes out of scope. To prevent that the device is put on successful completion, assign NULL to "dev" to prevent __free_put_device() from calling into put_device() within the successful path. "make coccicheck" will still complain about missing put_device() calls, but those are false positives now. Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241011172003.1242841-1-fabrizio.castro.jz@renesas.com
2024-10-15irqchip/riscv-intc: Fix SMP=n boot with ACPISunil V L
When CONFIG_SMP is disabled, the static array rintc_acpi_data with size NR_CPUS is not sufficient to hold all RINTC structures passed from the firmware. All RINTC structures are required to configure IMSIC/APLIC/PLIC properly irrespective of SMP in the OS. So, allocate dynamic memory based on the number of RINTC structures in MADT to fix this issue. Fixes: f8619b66bdb1 ("irqchip/riscv-intc: Add ACPI support for AIA") Reported-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/all/20241014065739.656959-1-sunilvl@ventanamicro.com Closes: https://github.com/linux-riscv/linux-riscv/actions/runs/11280997511/job/31375229012
2024-10-15Merge tag 'scmi-fixes-6.12' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm SCMI fixes for v6.12 Couple of fixes to address the issues found and reported on Broadcom STB platforms following the recent refactor of all the SCMI transports as standalone drivers. One of the issue is that the effective timeout value is much less than the intended value due to the way mailbox messages are queues in the mailbox framework. Since we block or serialise the shmem access anyway, there is no point in utilizing mailbox queues. The issue is fixed with exclusive lock on the channel when sending the message. The other issues is actually non-issue for upstream, but the workaround is just changing the link order of the transport drivers which enables Broadcom STB platforms to run both upstream and custom downstream kernel without any device tree changes. So pushing this to help them test upstream seamlessly as it has no practical or theoretical impact for others. There is also a fix to address possible double freeing of the name string in scmi_debugfs_common_cleanup() when devm_add_action_or_reset() fails. * tag 'scmi-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Queue in scmi layer for mailbox implementation firmware: arm_scmi: Give SMC transport precedence over mailbox firmware: arm_scmi: Fix the double free in scmi_debugfs_common_setup() Link: https://lore.kernel.org/r/20241015185128.1000604-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-15Merge tag 'ffa-fixes-6.12' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fixes for v6.12 Couple of fixes to avoid string-fortify warnings in export_uuid() and memcpy() from the recently added functions to support FFA_MSG_SEND_DIRECT_REQ2 and FFA_MSG_SEND_DIRECT_RESP2. * tag 'ffa-fixes-6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Avoid string-fortify warning caused by memcpy() firmware: arm_ffa: Avoid string-fortify warning in export_uuid() Link: https://lore.kernel.org/r/20241015185037.1000435-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-15Merge tag 'mvebu-fixes-6.12-1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes mvebu fixes for 6.12 (part 1) Fix cp0 mdio pin numbers on SolidRun CN9130 SoM * tag 'mvebu-fixes-6.12-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu: arm64: dts: marvell: cn9130-sr-som: fix cp0 mdio pin numbers Link: https://lore.kernel.org/r/87ldyud25o.fsf@BLaptop.bootlin.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-15Merge tag 'reset-fixes-for-v6.12' of git://git.pengutronix.de/pza/linux into ↵Arnd Bergmann
arm/fixes Reset controller fixes for v6.12 Fix a NULL pointer dereference in reset-starfive-jh71x0 and replace two accidental commas at line endings with semicolons in reset-npcm. * tag 'reset-fixes-for-v6.12' of git://git.pengutronix.de/pza/linux: reset: starfive: jh71x0: Fix accessing the empty member on JH7110 SoC reset: npcm: convert comma to semicolon Link: https://lore.kernel.org/r/20240930165733.1541936-1-p.zabel@pengutronix.de Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-10-15net: systemport: fix potential memory leak in bcm_sysport_xmit()Wang Hai
The bcm_sysport_xmit() returns NETDEV_TX_OK without freeing skb in case of dma_map_single() fails, add dev_kfree_skb() to fix it. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://patch.msgid.link/20241014145115.44977-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15Merge tag 'trace-ringbuffer-v6.12-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer fixes from Steven Rostedt: - Fix ref counter of buffers assigned at boot up A tracing instance can be created from the kernel command line. If it maps to memory, it is considered permanent and should not be deleted, or bad things can happen. If it is not mapped to memory, then the user is fine to delete it via rmdir from the instances directory. But the ref counts assumed 0 was free to remove and greater than zero was not. But this was not the case. When an instance is created, it should have the reference of 1, and if it should not be removed, it must be greater than 1. The boot up code set normal instances with a ref count of 0, which could get removed if something accessed it and then released it. And memory mapped instances had a ref count of 1 which meant it could be deleted, and bad things happen. Keep normal instances ref count as 1, and set memory mapped instances ref count to 2. - Protect sub buffer size (order) updates from other modifications When a ring buffer is changing the size of its sub-buffers, no other operations should be performed on the ring buffer. That includes reading it. But the locking only grabbed the buffer->mutex that keeps some operations from touching the ring buffer. It also must hold the cpu_buffer->reader_lock as well when updates happen as other paths use that to do some operations on the ring buffer. * tag 'trace-ringbuffer-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Fix reader locking when changing the sub buffer order ring-buffer: Fix refcount setting of boot mapped buffers
2024-10-15Merge branch 'fix-truncation-bug-in-coerce_reg_to_size_sx-and-extend-selftests'Alexei Starovoitov
Dimitar Kanaliev says: ==================== Fix truncation bug in coerce_reg_to_size_sx and extend selftests. This patch series addresses a truncation bug in the eBPF verifier function coerce_reg_to_size_sx(). The issue was caused by the incorrect ordering of assignments between 32-bit and 64-bit min/max values, leading to improper truncation when updating the register state. This issue has been reported previously by Zac Ecob[1] , but was not followed up on. The first patch fixes the assignment order in coerce_reg_to_size_sx() to ensure correct truncation. The subsequent patches add selftests for coerce_{reg,subreg}_to_size_sx. Changelog: v1 -> v2: - Moved selftests inside the conditional check for cpuv4 [1] (https://lore.kernel.org/bpf/h3qKLDEO6m9nhif0eAQX4fVrqdO0D_OPb0y5HfMK9jBePEKK33wQ3K-bqSVnr0hiZdFZtSJOsbNkcEQGpv_yJk61PAAiO8fUkgMRSO-lB50=@protonmail.com/) ==================== Link: https://lore.kernel.org/r/20241014121155.92887-1-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-15selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx()Dimitar Kanaliev
Add a test for unsigned ranges after signed extension instruction. This case isn't currently covered by existing tests in verifier_movsx.c. Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241014121155.92887-4-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-15selftests/bpf: Add test for truncation after sign extension in ↵Dimitar Kanaliev
coerce_reg_to_size_sx() Add test that checks whether unsigned ranges deduced by the verifier for sign extension instruction is correct. Without previous patch that fixes truncation in coerce_reg_to_size_sx() this test fails. Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241014121155.92887-3-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-15bpf: Fix truncation bug in coerce_reg_to_size_sx()Dimitar Kanaliev
coerce_reg_to_size_sx() updates the register state after a sign-extension operation. However, there's a bug in the assignment order of the unsigned min/max values, leading to incorrect truncation: 0: (85) call bpf_get_prandom_u32#7 ; R0_w=scalar() 1: (57) r0 &= 1 ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1)) 2: (07) r0 += 254 ; R0_w=scalar(smin=umin=smin32=umin32=254,smax=umax=smax32=umax32=255,var_off=(0xfe; 0x1)) 3: (bf) r0 = (s8)r0 ; R0_w=scalar(smin=smin32=-2,smax=smax32=-1,umin=umin32=0xfffffffe,umax=0xffffffff,var_off=(0xfffffffffffffffe; 0x1)) In the current implementation, the unsigned 32-bit min/max values (u32_min_value and u32_max_value) are assigned directly from the 64-bit signed min/max values (s64_min and s64_max): reg->umin_value = reg->u32_min_value = s64_min; reg->umax_value = reg->u32_max_value = s64_max; Due to the chain assigmnent, this is equivalent to: reg->u32_min_value = s64_min; // Unintended truncation reg->umin_value = reg->u32_min_value; reg->u32_max_value = s64_max; // Unintended truncation reg->umax_value = reg->u32_max_value; Fixes: 1f9a1ea821ff ("bpf: Support new sign-extension load insns") Reported-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Reported-by: Zac Ecob <zacecob@protonmail.com> Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Reviewed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20241014121155.92887-2-dimitar.kanaliev@siteground.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-15Merge tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: - New metadata version inode_has_child_snapshots This fixes bugs with handling of unlinked inodes + snapshots, in particular when an inode is reattached after taking a snapshot; deleted inodes now get correctly cleaned up across snapshots. - Disk accounting rewrite fixes - validation fixes for when a device has been removed - fix journal replay failing with "journal_reclaim_would_deadlock" - Some more small fixes for erasure coding + device removal - Assorted small syzbot fixes * tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs: (27 commits) bcachefs: Fix sysfs warning in fstests generic/730,731 bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev bcachefs: Fix kasan splat in new_stripe_alloc_buckets() bcachefs: Add missing validation for bch_stripe.csum_granularity_bits bcachefs: Fix missing bounds checks in bch2_alloc_read() bcachefs: fix uaf in bch2_dio_write_done() bcachefs: Improve check_snapshot_exists() bcachefs: Fix bkey_nocow_lock() bcachefs: Fix accounting replay flags bcachefs: Fix invalid shift in member_to_text() bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry bcachefs: Check if stuck in journal_res_get() closures: Add closure_wait_event_timeout() bcachefs: Fix state lock involved deadlock bcachefs: Fix NULL pointer dereference in bch2_opt_to_text bcachefs: Release transaction before wake up bcachefs: add check for btree id against max in try read node bcachefs: Disk accounting device validation fixes bcachefs: bch2_inode_or_descendents_is_open() ...
2024-10-15net: ethernet: rtsn: fix potential memory leak in rtsn_start_xmit()Wang Hai
The rtsn_start_xmit() returns NETDEV_TX_OK without freeing skb in case of skb->len being too long, add dev_kfree_skb_any() to fix it. Fixes: b0d3969d2b4d ("net: ethernet: rtsn: Add support for Renesas Ethernet-TSN") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241014144250.38802-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: xilinx: axienet: fix potential memory leak in axienet_start_xmit()Wang Hai
The axienet_start_xmit() returns NETDEV_TX_OK without freeing skb in case of dma_map_single() fails, add dev_kfree_skb_any() to fix it. Fixes: 71791dc8bdea ("net: axienet: Check for DMA mapping errors") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20241014143704.31938-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15Merge branch 'mptcp-prevent-mpc-handshake-on-port-based-signal-endpoints'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: prevent MPC handshake on port-based signal endpoints MPTCP connection requests toward a listening socket created by the in-kernel PM for a port based signal endpoint will never be accepted, they need to be explicitly rejected. - Patch 1: Explicitly reject such requests. A fix for >= v5.12. - Patch 2: Cover this case in the MPTCP selftests to avoid regressions. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> v1: https://lore.kernel.org/20240908180620.822579-1-xiyou.wangcong@gmail.com Link: https://lore.kernel.org/a5289a0d-2557-40b8-9575-6f1a0bbf06e4@redhat.com ==================== Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-0-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15selftests: mptcp: join: test for prohibited MPC to port-based endpPaolo Abeni
Explicitly verify that MPC connection attempts towards a port-based signal endpoint fail with a reset. Note that this new test is a bit different from the other ones, not using 'run_tests'. It is then needed to add the capture capability, and the picking the right port which have been extracted into three new helpers. The info about the capture can also be printed from a single point, which simplifies the exit paths in do_transfer(). The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15mptcp: prevent MPC handshake on port-based signal endpointsPaolo Abeni
Syzkaller reported a lockdep splat: ============================================ WARNING: possible recursive locking detected 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Not tainted -------------------------------------------- syz-executor364/5113 is trying to acquire lock: ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 but task is already holding lock: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(k-slock-AF_INET); lock(k-slock-AF_INET); *** DEADLOCK *** May be due to missing lock nesting notation 7 locks held by syz-executor364/5113: #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline] #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x153/0x1b10 net/mptcp/protocol.c:1806 #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline] #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x11f/0x530 net/mptcp/protocol.c:1727 #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5f/0x1b80 net/ipv4/ip_output.c:470 #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1390 net/ipv4/ip_output.c:228 #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: local_lock_acquire include/linux/local_lock_internal.h:29 [inline] #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x33b/0x15b0 net/core/dev.c:6104 #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline] #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline] #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x230/0x5f0 net/ipv4/ip_input.c:232 #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 stack backtrace: CPU: 0 UID: 0 PID: 5113 Comm: syz-executor364 Not tainted 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 check_deadlock kernel/locking/lockdep.c:3061 [inline] validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3855 __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328 mptcp_sk_clone_init+0x32/0x13c0 net/mptcp/protocol.c:3279 subflow_syn_recv_sock+0x931/0x1920 net/mptcp/subflow.c:874 tcp_check_req+0xfe4/0x1a20 net/ipv4/tcp_minisocks.c:853 tcp_v4_rcv+0x1c3e/0x37f0 net/ipv4/tcp_ipv4.c:2267 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775 process_backlog+0x662/0x15b0 net/core/dev.c:6108 __napi_poll+0xcb/0x490 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:6963 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 do_softirq+0x11b/0x1e0 kernel/softirq.c:455 </IRQ> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline] __dev_queue_xmit+0x1763/0x3e90 net/core/dev.c:4450 dev_queue_xmit include/linux/netdevice.h:3105 [inline] neigh_hh_output include/net/neighbour.h:526 [inline] neigh_output include/net/neighbour.h:540 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:235 ip_local_out net/ipv4/ip_output.c:129 [inline] __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:535 __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6542 [inline] tcp_rcv_state_process+0x2c32/0x4570 net/ipv4/tcp_input.c:6729 tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1934 sk_backlog_rcv include/net/sock.h:1111 [inline] __release_sock+0x214/0x350 net/core/sock.c:3004 release_sock+0x61/0x1f0 net/core/sock.c:3558 mptcp_sendmsg_fastopen+0x1ad/0x530 net/mptcp/protocol.c:1733 mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1812 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597 ___sys_sendmsg net/socket.c:2651 [inline] __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737 __do_sys_sendmmsg net/socket.c:2766 [inline] __se_sys_sendmmsg net/socket.c:2763 [inline] __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f04fb13a6b9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd651f42d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f04fb13a6b9 RDX: 0000000000000001 RSI: 0000000020000d00 RDI: 0000000000000004 RBP: 00007ffd651f4310 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000020000080 R11: 0000000000000246 R12: 00000000000f4240 R13: 00007f04fb187449 R14: 00007ffd651f42f4 R15: 00007ffd651f4300 </TASK> As noted by Cong Wang, the splat is false positive, but the code path leading to the report is an unexpected one: a client is attempting an MPC handshake towards the in-kernel listener created by the in-kernel PM for a port based signal endpoint. Such connection will be never accepted; many of them can make the listener queue full and preventing the creation of MPJ subflow via such listener - its intended role. Explicitly detect this scenario at initial-syn time and drop the incoming MPC request. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Cc: stable@vger.kernel.org Reported-by: syzbot+f4aacdfef2c6a6529c3e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f4aacdfef2c6a6529c3e Cc: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-1-7faea8e6b6ae@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net/smc: Fix searching in list of known pnetids in smc_pnet_add_pnetidLi RongQing
pnetid of pi (not newly allocated pe) should be compared Fixes: e888a2e8337c ("net/smc: introduce list of pnetids for Ethernet devices") Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Link: https://patch.msgid.link/20241014115321.33234-1-lirongqing@baidu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: macb: Avoid 20s boot delay by skipping MDIO bus registration for ↵Oleksij Rempel
fixed-link PHY A boot delay was introduced by commit 79540d133ed6 ("net: macb: Fix handling of fixed-link node"). This delay was caused by the call to `mdiobus_register()` in cases where a fixed-link PHY was present. The MDIO bus registration triggered unnecessary PHY address scans, leading to a 20-second delay due to attempts to detect Clause 45 (C45) compatible PHYs, despite no MDIO bus being attached. The commit 79540d133ed6 ("net: macb: Fix handling of fixed-link node") was originally introduced to fix a regression caused by commit 7897b071ac3b4 ("net: macb: convert to phylink"), which caused the driver to misinterpret fixed-link nodes as PHY nodes. This resulted in warnings like: mdio_bus f0028000.ethernet-ffffffff: fixed-link has invalid PHY address mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 0 ... mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 31 This patch reworks the logic to avoid registering and allocation of the MDIO bus when: - The device tree contains a fixed-link node. - There is no "mdio" child node in the device tree. If a child node named "mdio" exists, the MDIO bus will be registered to support PHYs attached to the MACB's MDIO bus. Otherwise, with only a fixed-link, the MDIO bus is skipped. Tested on a sama5d35 based system with a ksz8863 switch attached to macb0. Fixes: 79540d133ed6 ("net: macb: Fix handling of fixed-link node") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241013052916.3115142-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15riscv: dts: thead: remove enabled property for spi0Drew Fustini
There are currently no nodes that use spi0 so remove the enabled property for it in the beaglev ahead and lpi4a dts files. It can be re-enabled in the future if any peripherals will use it. The definition of spi0 remains in the th1520.dtsi file. Suggested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Add missing GPIO clock-namesEmil Renner Berthing
The gpio-dwapb looks for clock named "bus" so add clock-names property for the gpio controller nodes. Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> [dfustini: add two more lines to the commit message] Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dtb: thead: Add BeagleV Ahead LEDsEmil Renner Berthing
Add nodes for the 5 user controllable LEDs on the BeagleV Ahead board. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Add TH1520 pinctrl settings for UART0Emil Renner Berthing
Add pinctrl settings for UART0 used as the default debug console on both the Lichee Pi 4A and BeagleV Ahead boards. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Add Lichee Pi 4M GPIO line namesEmil Renner Berthing
Add names for the GPIO00-GPIO14 lines of the SO-DIMM module. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Adjust TH1520 GPIO labelsEmil Renner Berthing
Adjust labels for the TH1520 GPIO controllers such that GPIOs can be referenced by the names used by the documentation. Eg. GPIO0_X -> <&gpio0 X Y> GPIO1_X -> <&gpio1 X Y> GPIO2_X -> <&gpio2 X Y> GPIO3_X -> <&gpio3 X Y> GPIO4_X -> <&gpio4 X Y> AOGPIO_X -> <&aogpio X Y> Remove labels for the parent GPIO devices that shouldn't need to be referenced. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Add TH1520 GPIO rangesEmil Renner Berthing
Add gpio-ranges properties to the TH1520 device tree, so user space can change basic pinconf settings for GPIOs and are not allowed to use pads already used by other functions. Adjust number of GPIOs available for the different controllers. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15riscv: dts: thead: Add TH1520 pin control nodesEmil Renner Berthing
Add nodes for pin controllers on the T-Head TH1520 RISC-V SoC. Add the missing aonsys_clk for the always-on pin controller as there is not yet an aon subsys clock controller driver. Acked-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com> Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> [dfustini: modify description as there is now an ap_subsys clk driver] Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-10-15net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit()Wang Hai
The greth_start_xmit_gbit() returns NETDEV_TX_OK without freeing skb in case of skb->len being too long, add dev_kfree_skb() to fix it. Fixes: d4c41139df6e ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver") Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Link: https://patch.msgid.link/20241012110434.49265-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15netdevsim: use cond_resched() in nsim_dev_trap_report_work()Eric Dumazet
I am still seeing many syzbot reports hinting that syzbot might fool nsim_dev_trap_report_work() with hundreds of ports [1] Lets use cond_resched(), and system_unbound_wq instead of implicit system_wq. [1] INFO: task syz-executor:20633 blocked for more than 143 seconds. Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:25856 pid:20633 tgid:20633 ppid:1 flags:0x00004006 ... NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 16760 Comm: kworker/1:0 Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: events nsim_dev_trap_report_work RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x70 kernel/kcov.c:210 Code: 89 fb e8 23 00 00 00 48 8b 3d 04 fb 9c 0c 48 89 de 5b e9 c3 c7 5d 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <f3> 0f 1e fa 48 8b 04 24 65 48 8b 0c 25 c0 d7 03 00 65 8b 15 60 f0 RSP: 0018:ffffc90000a187e8 EFLAGS: 00000246 RAX: 0000000000000100 RBX: ffffc90000a188e0 RCX: ffff888027d3bc00 RDX: ffff888027d3bc00 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88804a2e6000 R08: ffffffff8a4bc495 R09: ffffffff89da3577 R10: 0000000000000004 R11: ffffffff8a4bc2b0 R12: dffffc0000000000 R13: ffff88806573b503 R14: dffffc0000000000 R15: ffff8880663cca00 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc90a747f98 CR3: 000000000e734000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 000000000000002b DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 spin_unlock_bh include/linux/spinlock.h:396 [inline] nsim_dev_trap_report drivers/net/netdevsim/dev.c:820 [inline] nsim_dev_trap_report_work+0x75d/0xaa0 drivers/net/netdevsim/dev.c:850 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: ba5e1272142d ("netdevsim: avoid potential loop in nsim_dev_trap_report_work()") Reported-by: syzbot+d383dc9579a76f56c251@syzkaller.appspotmail.com Reported-by: syzbot+c596faae21a68bf7afd0@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20241012094230.3893510-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15macsec: don't increment counters for an unrelated SASabrina Dubroca
On RX, we shouldn't be incrementing the stats for an arbitrary SA in case the actual SA hasn't been set up. Those counters are intended to track packets for their respective AN when the SA isn't currently configured. Due to the way MACsec is implemented, we don't keep counters unless the SA is configured, so we can't track those packets, and those counters will remain at 0. The RXSC's stats keeps track of those packets without telling us which AN they belonged to. We could add counters for non-existent SAs, and then find a way to integrate them in the dump to userspace, but I don't think it's worth the effort. Fixes: 91ec9bd57f35 ("macsec: Fix traffic counters/statistics") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/f5ac92aaa5b89343232615f4c03f9f95042c6aa0.1728657709.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15gpu: host1x: Set up device DMA parametersThierry Reding
In order to store device DMA parameters, the DMA framework depends on the device's dma_parms field to point at a valid memory location. Add backing storage for this in struct host1x_memory_context and point to it. Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240916133320.368620-1-thierry.reding@gmail.com (cherry picked from commit b4ad4ef374d66cc8df3188bb1ddb65bce5fc9e50) Signed-off-by: Thierry Reding <treding@nvidia.com>
2024-10-15drm/amdgpu/swsmu: Only force workload setup on initAlex Deucher
Needed to set the workload type at init time so that we can apply the navi3x margin optimization. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 580ad7cbd4b7be8d2cb5ab5c1fca6bb76045eb0e) Cc: stable@vger.kernel.org
2024-10-15drm/radeon: Fix encoder->possible_clonesVille Syrjälä
Include the encoder itself in its possible_clones bitmask. In the past nothing validated that drivers were populating possible_clones correctly, but that changed in commit 74d2aacbe840 ("drm: Validate encoder->possible_clones"). Looks like radeon never got the memo and is still not following the rules 100% correctly. This results in some warnings during driver initialization: Bogus possible_clones: [ENCODER:46:TV-46] possible_clones=0x4 (full encoder mask=0x7) WARNING: CPU: 0 PID: 170 at drivers/gpu/drm/drm_mode_config.c:615 drm_mode_config_validate+0x113/0x39c ... Cc: Alex Deucher <alexander.deucher@amd.com> Cc: amd-gfx@lists.freedesktop.org Fixes: 74d2aacbe840 ("drm: Validate encoder->possible_clones") Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://lore.kernel.org/dri-devel/20241009000321.418e4294@yea/ Tested-by: Erhard Furtner <erhard_f@mailbox.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3b6e7d40649c0d75572039aff9d0911864c689db) Cc: stable@vger.kernel.org
2024-10-15drm/amdgpu/smu13: always apply the powersave optimizationAlex Deucher
It can avoid margin issues in some very demanding applications. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Fixes: c50fe289ed72 ("drm/amdgpu/swsmu: always force a state reprogram on init") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 62f38b4ccaa6aa063ca781d80b10aacd39dc5c76) Cc: stable@vger.kernel.org
2024-10-15drm/amdkfd: Accounting pdd vram_usage for svmPhilip Yang
Process device data pdd->vram_usage is read by rocm-smi via sysfs, this is currently missing the svm_bo usage accounting, so "rocm-smi --showpids" per process VRAM usage report is incorrect. Add pdd->vram_usage accounting when svm_bo allocation and release, change to atomic64_t type because it is updated outside process mutex now. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 98c0b0efcc11f2a5ddf3ce33af1e48eedf808b04)
2024-10-15drm/amd/amdgpu: Fix double unlock in amdgpu_mes_add_ringSrinivasan Shanmugam
This patch addresses a double unlock issue in the amdgpu_mes_add_ring function. The mutex was being unlocked twice under certain error conditions, which could lead to undefined behavior. The fix ensures that the mutex is unlocked only once before jumping to the clean_up_memory label. The unlock operation is moved to just before the goto statement within the conditional block that checks the return value of amdgpu_ring_init. This prevents the second unlock attempt after the clean_up_memory label, which is no longer necessary as the mutex is already unlocked by this point in the code flow. This change resolves the potential double unlock and maintains the correct mutex handling throughout the function. Fixes below: Commit d0c423b64765 ("drm/amdgpu/mes: use ring for kernel queue submission"), leads to the following Smatch static checker warning: drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1240 amdgpu_mes_add_ring() warn: double unlock '&adev->mes.mutex_hidden' (orig line 1213) drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 1143 int amdgpu_mes_add_ring(struct amdgpu_device *adev, int gang_id, 1144 int queue_type, int idx, 1145 struct amdgpu_mes_ctx_data *ctx_data, 1146 struct amdgpu_ring **out) 1147 { 1148 struct amdgpu_ring *ring; 1149 struct amdgpu_mes_gang *gang; 1150 struct amdgpu_mes_queue_properties qprops = {0}; 1151 int r, queue_id, pasid; 1152 1153 /* 1154 * Avoid taking any other locks under MES lock to avoid circular 1155 * lock dependencies. 1156 */ 1157 amdgpu_mes_lock(&adev->mes); 1158 gang = idr_find(&adev->mes.gang_id_idr, gang_id); 1159 if (!gang) { 1160 DRM_ERROR("gang id %d doesn't exist\n", gang_id); 1161 amdgpu_mes_unlock(&adev->mes); 1162 return -EINVAL; 1163 } 1164 pasid = gang->process->pasid; 1165 1166 ring = kzalloc(sizeof(struct amdgpu_ring), GFP_KERNEL); 1167 if (!ring) { 1168 amdgpu_mes_unlock(&adev->mes); 1169 return -ENOMEM; 1170 } 1171 1172 ring->ring_obj = NULL; 1173 ring->use_doorbell = true; 1174 ring->is_mes_queue = true; 1175 ring->mes_ctx = ctx_data; 1176 ring->idx = idx; 1177 ring->no_scheduler = true; 1178 1179 if (queue_type == AMDGPU_RING_TYPE_COMPUTE) { 1180 int offset = offsetof(struct amdgpu_mes_ctx_meta_data, 1181 compute[ring->idx].mec_hpd); 1182 ring->eop_gpu_addr = 1183 amdgpu_mes_ctx_get_offs_gpu_addr(ring, offset); 1184 } 1185 1186 switch (queue_type) { 1187 case AMDGPU_RING_TYPE_GFX: 1188 ring->funcs = adev->gfx.gfx_ring[0].funcs; 1189 ring->me = adev->gfx.gfx_ring[0].me; 1190 ring->pipe = adev->gfx.gfx_ring[0].pipe; 1191 break; 1192 case AMDGPU_RING_TYPE_COMPUTE: 1193 ring->funcs = adev->gfx.compute_ring[0].funcs; 1194 ring->me = adev->gfx.compute_ring[0].me; 1195 ring->pipe = adev->gfx.compute_ring[0].pipe; 1196 break; 1197 case AMDGPU_RING_TYPE_SDMA: 1198 ring->funcs = adev->sdma.instance[0].ring.funcs; 1199 break; 1200 default: 1201 BUG(); 1202 } 1203 1204 r = amdgpu_ring_init(adev, ring, 1024, NULL, 0, 1205 AMDGPU_RING_PRIO_DEFAULT, NULL); 1206 if (r) 1207 goto clean_up_memory; 1208 1209 amdgpu_mes_ring_to_queue_props(adev, ring, &qprops); 1210 1211 dma_fence_wait(gang->process->vm->last_update, false); 1212 dma_fence_wait(ctx_data->meta_data_va->last_pt_update, false); 1213 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1214 1215 r = amdgpu_mes_add_hw_queue(adev, gang_id, &qprops, &queue_id); 1216 if (r) 1217 goto clean_up_ring; ^^^^^^^^^^^^^^^^^^ 1218 1219 ring->hw_queue_id = queue_id; 1220 ring->doorbell_index = qprops.doorbell_off; 1221 1222 if (queue_type == AMDGPU_RING_TYPE_GFX) 1223 sprintf(ring->name, "gfx_%d.%d.%d", pasid, gang_id, queue_id); 1224 else if (queue_type == AMDGPU_RING_TYPE_COMPUTE) 1225 sprintf(ring->name, "compute_%d.%d.%d", pasid, gang_id, 1226 queue_id); 1227 else if (queue_type == AMDGPU_RING_TYPE_SDMA) 1228 sprintf(ring->name, "sdma_%d.%d.%d", pasid, gang_id, 1229 queue_id); 1230 else 1231 BUG(); 1232 1233 *out = ring; 1234 return 0; 1235 1236 clean_up_ring: 1237 amdgpu_ring_fini(ring); 1238 clean_up_memory: 1239 kfree(ring); --> 1240 amdgpu_mes_unlock(&adev->mes); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 1241 return r; 1242 } Fixes: d0c423b64765 ("drm/amdgpu/mes: use ring for kernel queue submission") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Jack Xiao <Jack.Xiao@amd.com> Reported by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bfaf1883605fd0c0dbabacd67ed49708470d5ea4)
2024-10-15drm/amdgpu/mes: fix issue of writing to the same log buffer from 2 MES pipesMichael Chen
With Unified MES enabled in gfx12, need separate event log buffer for the 2 MES pipes to avoid data overwrite. Signed-off-by: Michael Chen <michael.chen@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 144df260f3daab42c4611021f929b3342de516e5) Cc: stable@vger.kernel.org # 6.11.x
2024-10-15drm/amdgpu: prevent BO_HANDLES error from being overwrittenMohammed Anees
Before this patch, if multiple BO_HANDLES chunks were submitted, the error -EINVAL would be correctly set but could be overwritten by the return value from amdgpu_cs_p1_bo_handles(). This patch ensures that if there are multiple BO_HANDLES, we stop. Fixes: fec5f8e8c6bc ("drm/amdgpu: disallow multiple BO_HANDLES chunks in one submit") Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 40f2cd98828f454bdc5006ad3d94330a5ea164b7) Cc: stable@vger.kernel.org
2024-10-15drm/amdgpu: enable enforce_isolation sysfs node on VFsAlex Deucher
It should be enabled on both bare metal and VFs. Fixes: e189be9b2e38 ("drm/amdgpu: Add enforce_isolation sysfs attribute") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Cc: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> (cherry picked from commit dc8847b054fd6679866ed4ee861e069e54c10799)
2024-10-15nvme-multipath: defer partition scanningKeith Busch
We need to suppress the partition scan from occuring within the controller's scan_work context. If a path error occurs here, the IO will wait until a path becomes available or all paths are torn down, but that action also occurs within scan_work, so it would deadlock. Defer the partion scan to a different context that does not block scan_work. Reported-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-10-15ring-buffer: Fix reader locking when changing the sub buffer orderPetr Pavlu
The function ring_buffer_subbuf_order_set() updates each ring_buffer_per_cpu and installs new sub buffers that match the requested page order. This operation may be invoked concurrently with readers that rely on some of the modified data, such as the head bit (RB_PAGE_HEAD), or the ring_buffer_per_cpu.pages and reader_page pointers. However, no exclusive access is acquired by ring_buffer_subbuf_order_set(). Modifying the mentioned data while a reader also operates on them can then result in incorrect memory access and various crashes. Fix the problem by taking the reader_lock when updating a specific ring_buffer_per_cpu in ring_buffer_subbuf_order_set(). Link: https://lore.kernel.org/linux-trace-kernel/20240715145141.5528-1-petr.pavlu@suse.com/ Link: https://lore.kernel.org/linux-trace-kernel/20241010195849.2f77cc3f@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20241011112850.17212b25@gandalf.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241015112440.26987-1-petr.pavlu@suse.com Fixes: 8e7b58c27b3c ("ring-buffer: Just update the subbuffers when changing their allocation order") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15io_uring/sqpoll: close race on waiting for sqring entriesJens Axboe
When an application uses SQPOLL, it must wait for the SQPOLL thread to consume SQE entries, if it fails to get an sqe when calling io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done with io_uring_sqring_wait(). There's a natural expectation that once this call returns, a new SQE entry can be retrieved, filled out, and submitted. However, the kernel uses the cached sq head to determine if the SQRING is full or not. If the SQPOLL thread is currently in the process of submitting SQE entries, it may have updated the cached sq head, but not yet committed it to the SQ ring. Hence the kernel may find that there are SQE entries ready to be consumed, and return successfully to the application. If the SQPOLL thread hasn't yet committed the SQ ring entries by the time the application returns to userspace and attempts to get a new SQE, it will fail getting a new SQE. Fix this by having io_sqring_full() always use the user visible SQ ring head entry, rather than the internally cached one. Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/discussions/1267 Reported-by: Benedek Thaler <thaler@thaler.hu> Signed-off-by: Jens Axboe <axboe@kernel.dk>