summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-05gpiolib: of: add polarity quirk for TSC2005Dmitry Torokhov
DTS for Nokia N900 incorrectly specifies "active high" polarity for the reset line, while the chip documentation actually specifies it as "active low". In the past the driver fudged gpiod API and inverted the logic internally, but it was changed in d0d89493bff8. Fixes: d0d89493bff8 ("Input: tsc2004/5 - switch to using generic device properties") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/ZoWXwYtwgJIxi-hD@google.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-07-05Merge tag 'kvm-s390-master-6.10-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix z16 support The z16 support might fail with the lpswey instruction. Provide a handler.
2024-07-05tpm: Address !chip->auth in tpm_buf_append_hmac_session*()Jarkko Sakkinen
Unless tpm_chip_bootstrap() was called by the driver, !chip->auth can cause a null derefence in tpm_buf_hmac_session*(). Thus, address !chip->auth in tpm_buf_hmac_session*() and remove the fallback implementation for !TCG_TPM2_HMAC. Cc: stable@vger.kernel.org # v6.9+ Reported-by: Stefan Berger <stefanb@linux.ibm.com> Closes: https://lore.kernel.org/linux-integrity/20240617193408.1234365-1-stefanb@linux.ibm.com/ Fixes: 1085b8276bb4 ("tpm: Add the rest of the session HMAC API") Tested-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-07-05tpm: Address !chip->auth in tpm_buf_append_name()Jarkko Sakkinen
Unless tpm_chip_bootstrap() was called by the driver, !chip->auth can cause a null derefence in tpm_buf_append_name(). Thus, address !chip->auth in tpm_buf_append_name() and remove the fallback implementation for !TCG_TPM2_HMAC. Cc: stable@vger.kernel.org # v6.10+ Reported-by: Stefan Berger <stefanb@linux.ibm.com> Closes: https://lore.kernel.org/linux-integrity/20240617193408.1234365-1-stefanb@linux.ibm.com/ Fixes: d0a25bb961e6 ("tpm: Add HMAC session name/handle append") Tested-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-07-05tpm: Address !chip->auth in tpm2_*_auth_session()Jarkko Sakkinen
Unless tpm_chip_bootstrap() was called by the driver, !chip->auth can cause a null derefence in tpm2_*_auth_session(). Thus, address !chip->auth in tpm2_*_auth_session(). Cc: stable@vger.kernel.org # v6.9+ Reported-by: Stefan Berger <stefanb@linux.ibm.com> Closes: https://lore.kernel.org/linux-integrity/20240617193408.1234365-1-stefanb@linux.ibm.com/ Fixes: 699e3efd6c64 ("tpm: Add HMAC session start and end functions") Tested-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-07-04Merge tag 'for-6.10-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix folio refcounting when releasing them (encoded write, dummy extent buffer) - fix out of bounds read when checking qgroup inherit data - fix how configurable chunk size is handled in zoned mode - in the ref-verify tool, fix uninitialized return value when checking extent owner ref and simple quota are not enabled * tag 'for-6.10-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix folio refcount in __alloc_dummy_extent_buffer() btrfs: fix folio refcount in btrfs_do_encoded_write() btrfs: fix uninitialized return value in the ref-verify tool btrfs: always do the basic checks for btrfs_qgroup_inherit structure btrfs: zoned: fix calc_available_free_space() for zoned mode
2024-07-04Merge tag 'net-6.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, wireless and netfilter. There's one fix for power management with Intel's e1000e here, Thorsten tells us there's another problem that started in v6.9. We're trying to wrap that up but I don't think it's blocking. Current release - new code bugs: - wifi: mac80211: disable softirqs for queued frame handling - af_unix: fix uninit-value in __unix_walk_scc(), with the new garbage collection algo Previous releases - regressions: - Bluetooth: - qca: fix BT enable failure for QCA6390 after warm reboot - add quirk to ignore reserved PHY bits in LE Extended Adv Report, abused by some Broadcom controllers found on Apple machines - wifi: wilc1000: fix ies_len type in connect path Previous releases - always broken: - tcp: fix DSACK undo in fast recovery to call tcp_try_to_open(), avoid premature timeouts - net: make sure skb_datagram_iter maps fragments page by page, in case we somehow get compound highmem mixed in - eth: bnx2x: fix multiple UBSAN array-index-out-of-bounds when more queues are used Misc: - MAINTAINERS: Remembering Larry Finger" * tag 'net-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits) bnxt_en: Fix the resource check condition for RSS contexts mlxsw: core_linecards: Fix double memory deallocation in case of invalid INI file inet_diag: Initialize pad field in struct inet_diag_req_v2 tcp: Don't flag tcp_sk(sk)->rx_opt.saw_unknown for TCP AO. selftests: make order checking verbose in msg_zerocopy selftest selftests: fix OOM in msg_zerocopy selftest ice: use proper macro for testing bit ice: Reject pin requests with unsupported flags ice: Don't process extts if PTP is disabled ice: Fix improper extts handling selftest: af_unix: Add test case for backtrack after finalising SCC. af_unix: Fix uninit-value in __unix_walk_scc() bonding: Fix out-of-bounds read in bond_option_arp_ip_targets_set() net: rswitch: Avoid use-after-free in rswitch_poll() netfilter: nf_tables: unconditionally flush pending work before notifier wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference wifi: iwlwifi: mvm: avoid link lookup in statistics wifi: iwlwifi: mvm: don't wake up rx_sync_waitq upon RFKILL wifi: iwlwifi: properly set WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK wifi: wilc1000: fix ies_len type in connect path ...
2024-07-04Merge tag 's390-6.10-8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix and add physical to virtual address translations in dasd and virtio_ccw drivers. For virtio_ccw this is just a minimal fix. More code cleanup will follow. - Small defconfig updates * tag 's390-6.10-8' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: Fix invalid dereferencing of indirect CCW data pointer s390/vfio_ccw: Fix target addresses of TIC CCWs s390: Update defconfigs
2024-07-04Merge tag 'platform-drivers-x86-v6.10-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Hans de Goede: - Fix regression in toshiba_acpi introduced in 6.10-rc1 * tag 'platform-drivers-x86-v6.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: toshiba_acpi: Fix quickstart quirk handling
2024-07-04Merge tag 'kselftest-fix-2024-07-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull Kselftest fix from Mickaël Salaün: "Fix Kselftests timeout. We can't use CLONE_VFORK, since that blocks the parent - and thus the timeout handling - until the child exits or execve's. Go back to using plain fork()" * tag 'kselftest-fix-2024-07-04' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/harness: Fix tests timeout and race condition
2024-07-04Merge tag 'mm-hotfixes-stable-2024-07-03-22-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from, Andrew Morton: "6 hotfies, all cc:stable. Some fixes for longstanding nilfs2 issues and three unrelated MM fixes" * tag 'mm-hotfixes-stable-2024-07-03-22-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: nilfs2: fix incorrect inode allocation from reserved inodes nilfs2: add missing check for inode numbers on directory entries nilfs2: fix inode number range checks mm: avoid overflows in dirty throttling logic Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again" mm: optimize the redundant loop of mm_update_owner_next()
2024-07-04Merge tag 'drm-misc-fixes-2024-07-04' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc7: - Add panel quirks. - Firmware sysfb refcount fix. - Another null pointer mode deref fix for nouveau. - Panthor sync and uobj fixes. - Fix fbdev regression since v6.7. - Delay free imported bo in ttm to fix lockdep splat. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ffba0c63-2798-40b6-948d-361cd3b14e9f@linux.intel.com
2024-07-04Merge tag 'drm-xe-fixes-2024-07-04' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - One copy/paste mistake fix. - One error path fix causing an error pointer dereference. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZoZ-wD66lgjiNh72@fedora
2024-07-04bnxt_en: Fix the resource check condition for RSS contextsPavan Chebbi
While creating a new RSS context, bnxt_rfs_capable() currently makes a strict check to see if the required VNICs are already available. If the current VNICs are not what is required, either too many or not enough, it will call the firmware to reserve the exact number required. There is a bug in the firmware when the driver tries to relinquish some reserved VNICs and RSS contexts. It will cause the default VNIC to lose its RSS configuration and cause receive packets to be placed incorrectly. Workaround this problem by skipping the resource reduction. The driver will not reduce the VNIC and RSS context reservations when a context is deleted. The resources will be available for use when new contexts are created later. Potentially, this workaround can cause us to run out of VNIC and RSS contexts if there are a lot of VF functions creating and deleting RSS contexts. In the future, we will conditionally disable this workaround when the firmware fix is available. Fixes: 438ba39b25fe ("bnxt_en: Improve RSS context reservation infrastructure") Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20240625010210.2002310-1-kuba@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240703180112.78590-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04mlxsw: core_linecards: Fix double memory deallocation in case of invalid INI ↵Aleksandr Mishin
file In case of invalid INI file mlxsw_linecard_types_init() deallocates memory but doesn't reset pointer to NULL and returns 0. In case of any error occurred after mlxsw_linecard_types_init() call, mlxsw_linecards_init() calls mlxsw_linecard_types_fini() which performs memory deallocation again. Add pointer reset to NULL. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b217127e5e4e ("mlxsw: core_linecards: Add line card objects and implement provisioning") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://patch.msgid.link/20240703203251.8871-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04Merge tag 'wireless-2024-07-04' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.10 Hopefully the last fixes for v6.10. Fix a regression in wilc1000 where bitrate Information Elements longer than 255 bytes were broken. Few fixes also to mac80211 and iwlwifi. * tag 'wireless-2024-07-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference wifi: iwlwifi: mvm: avoid link lookup in statistics wifi: iwlwifi: mvm: don't wake up rx_sync_waitq upon RFKILL wifi: iwlwifi: properly set WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK wifi: wilc1000: fix ies_len type in connect path wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP ==================== Link: https://patch.msgid.link/20240704111431.11DEDC3277B@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04Merge tag 'drm-intel-fixes-2024-07-02' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes drm/i915 fixes for v6.10-rc7: - Skip unnecessary MG programming, avoiding warnings (Imre) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87frss9ozs.fsf@intel.com
2024-07-04Merge tag 'nf-24-07-04' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains a oneliner patch to inconditionally flush workqueue containing stale objects to be released, syzbot managed to trigger UaF. Patch from Florian Westphal. netfilter pull request 24-07-04 * tag 'nf-24-07-04' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: unconditionally flush pending work before notifier ==================== Link: https://patch.msgid.link/20240703223304.1455-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-04inet_diag: Initialize pad field in struct inet_diag_req_v2Shigeru Yoshida
KMSAN reported uninit-value access in raw_lookup() [1]. Diag for raw sockets uses the pad field in struct inet_diag_req_v2 for the underlying protocol. This field corresponds to the sdiag_raw_protocol field in struct inet_diag_req_raw. inet_diag_get_exact_compat() converts inet_diag_req to inet_diag_req_v2, but leaves the pad field uninitialized. So the issue occurs when raw_lookup() accesses the sdiag_raw_protocol field. Fix this by initializing the pad field in inet_diag_get_exact_compat(). Also, do the same fix in inet_diag_dump_compat() to avoid the similar issue in the future. [1] BUG: KMSAN: uninit-value in raw_lookup net/ipv4/raw_diag.c:49 [inline] BUG: KMSAN: uninit-value in raw_sock_get+0x657/0x800 net/ipv4/raw_diag.c:71 raw_lookup net/ipv4/raw_diag.c:49 [inline] raw_sock_get+0x657/0x800 net/ipv4/raw_diag.c:71 raw_diag_dump_one+0xa1/0x660 net/ipv4/raw_diag.c:99 inet_diag_cmd_exact+0x7d9/0x980 inet_diag_get_exact_compat net/ipv4/inet_diag.c:1404 [inline] inet_diag_rcv_msg_compat+0x469/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 netlink_rcv_skb+0x537/0x670 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x35/0x40 net/core/sock_diag.c:297 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xe74/0x1240 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10c6/0x1260 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x332/0x3d0 net/socket.c:745 ____sys_sendmsg+0x7f0/0xb70 net/socket.c:2585 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2639 __sys_sendmsg net/socket.c:2668 [inline] __do_sys_sendmsg net/socket.c:2677 [inline] __se_sys_sendmsg net/socket.c:2675 [inline] __x64_sys_sendmsg+0x27e/0x4a0 net/socket.c:2675 x64_sys_call+0x135e/0x3ce0 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd9/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Uninit was stored to memory at: raw_sock_get+0x650/0x800 net/ipv4/raw_diag.c:71 raw_diag_dump_one+0xa1/0x660 net/ipv4/raw_diag.c:99 inet_diag_cmd_exact+0x7d9/0x980 inet_diag_get_exact_compat net/ipv4/inet_diag.c:1404 [inline] inet_diag_rcv_msg_compat+0x469/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 netlink_rcv_skb+0x537/0x670 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x35/0x40 net/core/sock_diag.c:297 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline] netlink_unicast+0xe74/0x1240 net/netlink/af_netlink.c:1361 netlink_sendmsg+0x10c6/0x1260 net/netlink/af_netlink.c:1905 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x332/0x3d0 net/socket.c:745 ____sys_sendmsg+0x7f0/0xb70 net/socket.c:2585 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2639 __sys_sendmsg net/socket.c:2668 [inline] __do_sys_sendmsg net/socket.c:2677 [inline] __se_sys_sendmsg net/socket.c:2675 [inline] __x64_sys_sendmsg+0x27e/0x4a0 net/socket.c:2675 x64_sys_call+0x135e/0x3ce0 arch/x86/include/generated/asm/syscalls_64.h:47 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd9/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Local variable req.i created at: inet_diag_get_exact_compat net/ipv4/inet_diag.c:1396 [inline] inet_diag_rcv_msg_compat+0x2a6/0x530 net/ipv4/inet_diag.c:1426 sock_diag_rcv_msg+0x23d/0x740 net/core/sock_diag.c:282 CPU: 1 PID: 8888 Comm: syz-executor.6 Not tainted 6.10.0-rc4-00217-g35bb670d65fc #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Fixes: 432490f9d455 ("net: ip, diag -- Add diag interface for raw sockets") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240703091649.111773-1-syoshida@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-04tcp: Don't flag tcp_sk(sk)->rx_opt.saw_unknown for TCP AO.Kuniyuki Iwashima
When we process segments with TCP AO, we don't check it in tcp_parse_options(). Thus, opt_rx->saw_unknown is set to 1, which unconditionally triggers the BPF TCP option parser. Let's avoid the unnecessary BPF invocation. Fixes: 0a3a809089eb ("net/tcp: Verify inbound TCP-AO signed segments") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20240703033508.6321-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-04drm/xe/mcr: Avoid clobbering DSS steeringMatt Roper
A couple copy/paste mistakes in the code that selects steering targets for OADDRM and INSTANCE0 unintentionally clobbered the steering target for DSS ranges in some cases. The OADDRM/INSTANCE0 values were also not assigned as intended, although that mistake wound up being harmless since the desired values for those specific ranges were '0' which the kzalloc of the GT structure should have already taken care of implicitly. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626210536.1620176-2-matthew.d.roper@intel.com (cherry picked from commit 4f82ac6102788112e599a6074d2c1f2afce923df) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-07-04drm/xe: fix error handling in xe_migrate_update_pgtablesMatthew Auld
Don't call drm_suballoc_free with sa_bo pointing to PTR_ERR. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2120 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620102025.127699-2-matthew.auld@intel.com (cherry picked from commit ce6b63336f79ec5f3996de65f452330e395f99ae) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-07-04drm/ttm: Always take the bo delayed cleanup path for imported bosThomas Hellström
Bos can be put with multiple unrelated dma-resv locks held. But imported bos attempt to grab the bo dma-resv during dma-buf detach that typically happens during cleanup. That leads to lockde splats similar to the below and a potential ABBA deadlock. Fix this by always taking the delayed workqueue cleanup path for imported bos. Requesting stable fixes from when the Xe driver was introduced, since its usage of drm_exec and wide vm dma_resvs appear to be the first reliable trigger of this. [22982.116427] ============================================ [22982.116428] WARNING: possible recursive locking detected [22982.116429] 6.10.0-rc2+ #10 Tainted: G U W [22982.116430] -------------------------------------------- [22982.116430] glxgears:sh0/5785 is trying to acquire lock: [22982.116431] ffff8c2bafa539a8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_detach+0x3b/0xf0 [22982.116438] but task is already holding lock: [22982.116438] ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116442] other info that might help us debug this: [22982.116442] Possible unsafe locking scenario: [22982.116443] CPU0 [22982.116444] ---- [22982.116444] lock(reservation_ww_class_mutex); [22982.116445] lock(reservation_ww_class_mutex); [22982.116447] *** DEADLOCK *** [22982.116447] May be due to missing lock nesting notation [22982.116448] 5 locks held by glxgears:sh0/5785: [22982.116449] #0: ffff8c2d9aba58c8 (&xef->vm.lock){+.+.}-{3:3}, at: xe_file_close+0xde/0x1c0 [xe] [22982.116507] #1: ffff8c2e28cc8480 (&vm->lock){++++}-{3:3}, at: xe_vm_close_and_put+0x161/0x9b0 [xe] [22982.116578] #2: ffff8c2e31982970 (&val->lock){.+.+}-{3:3}, at: xe_validation_ctx_init+0x6d/0x70 [xe] [22982.116647] #3: ffffacdc469478a8 (reservation_ww_class_acquire){+.+.}-{0:0}, at: xe_vma_destroy_unlocked+0x7f/0xe0 [xe] [22982.116716] #4: ffff8c2d9aba6da8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x49/0x2b0 [drm_exec] [22982.116719] stack backtrace: [22982.116720] CPU: 8 PID: 5785 Comm: glxgears:sh0 Tainted: G U W 6.10.0-rc2+ #10 [22982.116721] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [22982.116723] Call Trace: [22982.116724] <TASK> [22982.116725] dump_stack_lvl+0x77/0xb0 [22982.116727] __lock_acquire+0x1232/0x2160 [22982.116730] lock_acquire+0xcb/0x2d0 [22982.116732] ? dma_buf_detach+0x3b/0xf0 [22982.116734] ? __lock_acquire+0x417/0x2160 [22982.116736] __ww_mutex_lock.constprop.0+0xd0/0x13b0 [22982.116738] ? dma_buf_detach+0x3b/0xf0 [22982.116741] ? dma_buf_detach+0x3b/0xf0 [22982.116743] ? ww_mutex_lock+0x2b/0x90 [22982.116745] ww_mutex_lock+0x2b/0x90 [22982.116747] dma_buf_detach+0x3b/0xf0 [22982.116749] drm_prime_gem_destroy+0x2f/0x40 [drm] [22982.116775] xe_ttm_bo_destroy+0x32/0x220 [xe] [22982.116818] ? __mutex_unlock_slowpath+0x3a/0x290 [22982.116821] drm_exec_unlock_all+0xa1/0xd0 [drm_exec] [22982.116823] drm_exec_fini+0x12/0xb0 [drm_exec] [22982.116824] xe_validation_ctx_fini+0x15/0x40 [xe] [22982.116892] xe_vma_destroy_unlocked+0xb1/0xe0 [xe] [22982.116959] xe_vm_close_and_put+0x41a/0x9b0 [xe] [22982.117025] ? xa_find+0xe3/0x1e0 [22982.117028] xe_file_close+0x10a/0x1c0 [xe] [22982.117074] drm_file_free+0x22a/0x280 [drm] [22982.117099] drm_release_noglobal+0x22/0x70 [drm] [22982.117119] __fput+0xf1/0x2d0 [22982.117122] task_work_run+0x59/0x90 [22982.117125] do_exit+0x330/0xb40 [22982.117127] do_group_exit+0x36/0xa0 [22982.117129] get_signal+0xbd2/0xbe0 [22982.117131] arch_do_signal_or_restart+0x3e/0x240 [22982.117134] syscall_exit_to_user_mode+0x1e7/0x290 [22982.117137] do_syscall_64+0xa1/0x180 [22982.117139] ? lock_acquire+0xcb/0x2d0 [22982.117140] ? __set_task_comm+0x28/0x1e0 [22982.117141] ? find_held_lock+0x2b/0x80 [22982.117144] ? __set_task_comm+0xe1/0x1e0 [22982.117145] ? lock_release+0xca/0x290 [22982.117147] ? __do_sys_prctl+0x245/0xab0 [22982.117149] ? lockdep_hardirqs_on_prepare+0xde/0x190 [22982.117150] ? syscall_exit_to_user_mode+0xb0/0x290 [22982.117152] ? do_syscall_64+0xa1/0x180 [22982.117154] ? __lock_acquire+0x417/0x2160 [22982.117155] ? reacquire_held_locks+0xd1/0x1f0 [22982.117156] ? do_user_addr_fault+0x30c/0x790 [22982.117158] ? lock_acquire+0xcb/0x2d0 [22982.117160] ? find_held_lock+0x2b/0x80 [22982.117162] ? do_user_addr_fault+0x357/0x790 [22982.117163] ? lock_release+0xca/0x290 [22982.117164] ? do_user_addr_fault+0x361/0x790 [22982.117166] ? trace_hardirqs_off+0x4b/0xc0 [22982.117168] ? clear_bhb_loop+0x45/0xa0 [22982.117170] ? clear_bhb_loop+0x45/0xa0 [22982.117172] ? clear_bhb_loop+0x45/0xa0 [22982.117174] entry_SYSCALL_64_after_hwframe+0x76/0x7e [22982.117176] RIP: 0033:0x7f943d267169 [22982.117192] Code: Unable to access opcode bytes at 0x7f943d26713f. [22982.117193] RSP: 002b:00007f9430bffc80 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca [22982.117195] RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f943d267169 [22982.117196] RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00005622f89579d0 [22982.117197] RBP: 00007f9430bffcb0 R08: 0000000000000000 R09: 00000000ffffffff [22982.117198] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [22982.117199] R13: 0000000000000000 R14: 0000000000000000 R15: 00005622f89579d0 [22982.117202] </TASK> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628153848.4989-1-thomas.hellstrom@linux.intel.com
2024-07-03Merge branch 'fix-oom-and-order-check-in-msg_zerocopy-selftest'Jakub Kicinski
Zijian Zhang says: ==================== fix OOM and order check in msg_zerocopy selftest In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg on a socket with MSG_ZEROCOPY flag, and it will recv the notifications until the socket is not writable. Typically, it will start the receiving process after around 30+ sendmsgs. However, as the introduction of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable and does not get any chance to run recv notifications. The selftest always exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a different value to trigger OOM on older kernels too. Thus, we introduce "cfg_notification_limit" to force sender to receive notifications after some number of sendmsgs. And, we find that when lock debugging is on, notifications may not come in order. Thus, we have order checking outputs managed by cfg_verbose, to avoid too many outputs in this case. ==================== Link: https://patch.msgid.link/20240701225349.3395580-1-zijianzhang@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03selftests: make order checking verbose in msg_zerocopy selftestZijian Zhang
We find that when lock debugging is on, notifications may not come in order. Thus, we have order checking outputs managed by cfg_verbose, to avoid too many outputs in this case. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240701225349.3395580-3-zijianzhang@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03selftests: fix OOM in msg_zerocopy selftestZijian Zhang
In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg on a socket with MSG_ZEROCOPY flag, and it will recv the notifications until the socket is not writable. Typically, it will start the receiving process after around 30+ sendmsgs. However, as the introduction of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable and does not get any chance to run recv notifications. The selftest always exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a different value to trigger OOM on older kernels too. Thus, we introduce "cfg_notification_limit" to force sender to receive notifications after some number of sendmsgs. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240701225349.3395580-2-zijianzhang@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03Merge branch 'intel-wired-lan-driver-updates-2024-06-25-ice'Jakub Kicinski
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-06-25 (ice) This series contains updates to ice driver only. Milena adds disabling of extts events when PTP is disabled. Jake prevents possible NULL pointer by checking that timestamps are ready before processing extts events and adds checks for unsupported PTP pin configuration. Petr Oros replaces _test_bit() with the correct test_bit() macro. v1: https://lore.kernel.org/netdev/20240625170248.199162-1-anthony.l.nguyen@intel.com/ ==================== Link: https://patch.msgid.link/20240702171459.2606611-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03ice: use proper macro for testing bitPetr Oros
Do not use _test_bit() macro for testing bit. The proper macro for this is one without underline. _test_bit() is what test_bit() was prior to const-optimization. It directly calls arch_test_bit(), i.e. the arch-specific implementation (or the generic one). It's strictly _internal_ and shouldn't be used anywhere outside the actual test_bit() macro. test_bit() is a wrapper which checks whether the bitmap and the bit number are compile-time constants and if so, it calls the optimized function which evaluates this call to a compile-time constant as well. If either of them is not a compile-time constant, it just calls _test_bit(). test_bit() is the actual function to use anywhere in the kernel. IOW, calling _test_bit() avoids potential compile-time optimizations. The sensors is not a compile-time constant, thus most probably there are no object code changes before and after the patch. But anyway, we shouldn't call internal wrappers instead of the actual API. Fixes: 4da71a77fc3b ("ice: read internal temperature sensor") Acked-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240702171459.2606611-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03ice: Reject pin requests with unsupported flagsJacob Keller
The driver receives requests for configuring pins via the .enable callback of the PTP clock object. These requests come into the driver with flags which modify the requested behavior from userspace. Current implementation in ice does not reject flags that it doesn't support. This causes the driver to incorrectly apply requests with such flags as PTP_PEROUT_DUTY_CYCLE, or any future flags added by the kernel which it is not yet aware of. Fix this by properly validating flags in both ice_ptp_cfg_perout and ice_ptp_cfg_extts. Ensure that we check by bit-wise negating supported flags rather than just checking and rejecting known un-supported flags. This is preferable, as it ensures better compatibility with future kernels. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240702171459.2606611-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03ice: Don't process extts if PTP is disabledJacob Keller
The ice_ptp_extts_event() function can race with ice_ptp_release() and result in a NULL pointer dereference which leads to a kernel panic. Panic occurs because the ice_ptp_extts_event() function calls ptp_clock_event() with a NULL pointer. The ice driver has already released the PTP clock by the time the interrupt for the next external timestamp event occurs. To fix this, modify the ice_ptp_extts_event() function to check the PTP state and bail early if PTP is not ready. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240702171459.2606611-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03ice: Fix improper extts handlingMilena Olech
Extts events are disabled and enabled by the application ts2phc. However, in case where the driver is removed when the application is running, a specific extts event remains enabled and can cause a kernel crash. As a side effect, when the driver is reloaded and application is started again, remaining extts event for the channel from a previous run will keep firing and the message "extts on unexpected channel" might be printed to the user. To avoid that, extts events shall be disabled when PTP is released. Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240702171459.2606611-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03selftest: af_unix: Add test case for backtrack after finalising SCC.Kuniyuki Iwashima
syzkaller reported a KMSAN splat in __unix_walk_scc() while backtracking edge_stack after finalising SCC. Let's add a test case exercising the path. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Link: https://patch.msgid.link/20240702160428.10153-2-syoshida@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03af_unix: Fix uninit-value in __unix_walk_scc()Shigeru Yoshida
KMSAN reported uninit-value access in __unix_walk_scc() [1]. In the list_for_each_entry_reverse() loop, when the vertex's index equals it's scc_index, the loop uses the variable vertex as a temporary variable that points to a vertex in scc. And when the loop is finished, the variable vertex points to the list head, in this case scc, which is a local variable on the stack (more precisely, it's not even scc and might underflow the call stack of __unix_walk_scc(): container_of(&scc, struct unix_vertex, scc_entry)). However, the variable vertex is used under the label prev_vertex. So if the edge_stack is not empty and the function jumps to the prev_vertex label, the function will access invalid data on the stack. This causes the uninit-value access issue. Fix this by introducing a new temporary variable for the loop. [1] BUG: KMSAN: uninit-value in __unix_walk_scc net/unix/garbage.c:478 [inline] BUG: KMSAN: uninit-value in unix_walk_scc net/unix/garbage.c:526 [inline] BUG: KMSAN: uninit-value in __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 __unix_walk_scc net/unix/garbage.c:478 [inline] unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2589/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was stored to memory at: unix_walk_scc net/unix/garbage.c:526 [inline] __unix_gc+0x2adf/0x3c20 net/unix/garbage.c:584 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xade/0x1bf0 kernel/workqueue.c:3312 worker_thread+0xeb6/0x15b0 kernel/workqueue.c:3393 kthread+0x3c4/0x530 kernel/kthread.c:389 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Local variable entries created at: ref_tracker_free+0x48/0xf30 lib/ref_tracker.c:222 netdev_tracker_free include/linux/netdevice.h:4058 [inline] netdev_put include/linux/netdevice.h:4075 [inline] dev_put include/linux/netdevice.h:4101 [inline] update_gid_event_work_handler+0xaa/0x1b0 drivers/infiniband/core/roce_gid_mgmt.c:813 CPU: 1 PID: 12763 Comm: kworker/u8:31 Not tainted 6.10.0-rc4-00217-g35bb670d65fc #32 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Workqueue: events_unbound __unix_gc Fixes: 3484f063172d ("af_unix: Detect Strongly Connected Components.") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20240702160428.10153-1-syoshida@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03bonding: Fix out-of-bounds read in bond_option_arp_ip_targets_set()Sam Sun
In function bond_option_arp_ip_targets_set(), if newval->string is an empty string, newval->string+1 will point to the byte after the string, causing an out-of-bound read. BUG: KASAN: slab-out-of-bounds in strlen+0x7d/0xa0 lib/string.c:418 Read of size 1 at addr ffff8881119c4781 by task syz-executor665/8107 CPU: 1 PID: 8107 Comm: syz-executor665 Not tainted 6.7.0-rc7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc1/0x5e0 mm/kasan/report.c:475 kasan_report+0xbe/0xf0 mm/kasan/report.c:588 strlen+0x7d/0xa0 lib/string.c:418 __fortify_strlen include/linux/fortify-string.h:210 [inline] in4_pton+0xa3/0x3f0 net/core/utils.c:130 bond_option_arp_ip_targets_set+0xc2/0x910 drivers/net/bonding/bond_options.c:1201 __bond_opt_set+0x2a4/0x1030 drivers/net/bonding/bond_options.c:767 __bond_opt_set_notify+0x48/0x150 drivers/net/bonding/bond_options.c:792 bond_opt_tryset_rtnl+0xda/0x160 drivers/net/bonding/bond_options.c:817 bonding_sysfs_store_option+0xa1/0x120 drivers/net/bonding/bond_sysfs.c:156 dev_attr_store+0x54/0x80 drivers/base/core.c:2366 sysfs_kf_write+0x114/0x170 fs/sysfs/file.c:136 kernfs_fop_write_iter+0x337/0x500 fs/kernfs/file.c:334 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x96a/0xd80 fs/read_write.c:584 ksys_write+0x122/0x250 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b ---[ end trace ]--- Fix it by adding a check of string length before using it. Fixes: f9de11a16594 ("bonding: add ip checks when store ip target") Signed-off-by: Yue Sun <samsun1006219@gmail.com> Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20240702-bond-oob-v6-1-2dfdba195c19@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03net: rswitch: Avoid use-after-free in rswitch_poll()Radu Rendec
The use-after-free is actually in rswitch_tx_free(), which is inlined in rswitch_poll(). Since `skb` and `gq->skbs[gq->dirty]` are in fact the same pointer, the skb is first freed using dev_kfree_skb_any(), then the value in skb->len is used to update the interface statistics. Let's move around the instructions to use skb->len before the skb is freed. This bug is trivial to reproduce using KFENCE. It will trigger a splat every few packets. A simple ARP request or ICMP echo request is enough. Fixes: 271e015b9153 ("net: rswitch: Add unmap_addrs instead of dma address in each desc") Signed-off-by: Radu Rendec <rrendec@redhat.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Link: https://patch.msgid.link/20240702210838.2703228-1-rrendec@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04btrfs: fix folio refcount in __alloc_dummy_extent_buffer()Boris Burkov
Another improper use of __folio_put() in an error path after freshly allocating pages/folios which returns them with the refcount initialized to 1. The refactor from __free_pages() -> __folio_put() (instead of folio_put) removed a refcount decrement found in __free_pages() and folio_put but absent from __folio_put(). Fixes: 13df3775efca ("btrfs: cleanup metadata page pointer usage") CC: stable@vger.kernel.org # 6.8+ Tested-by: Ed Tomlinson <edtoml@gmail.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-04btrfs: fix folio refcount in btrfs_do_encoded_write()Boris Burkov
The conversion to folios switched __free_page() to __folio_put() in the error path in btrfs_do_encoded_write(). However, this gets the page refcounting wrong. If we do hit that error path (I reproduced by modifying btrfs_do_encoded_write to pretend to always fail in a way that jumps to out_folios and running the fstests case btrfs/281), then we always hit the following BUG freeing the folio: BUG: Bad page state in process btrfs pfn:40ab0b page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff) raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: nonzero _refcount Call Trace: <TASK> dump_stack_lvl+0x3d/0xe0 bad_page+0xea/0xf0 free_unref_page+0x8e1/0x900 ? __mem_cgroup_uncharge+0x69/0x90 __folio_put+0xe6/0x190 btrfs_do_encoded_write+0x445/0x780 ? current_time+0x25/0xd0 btrfs_do_write_iter+0x2cc/0x4b0 btrfs_ioctl_encoded_write+0x2b6/0x340 It turns out __free_page() decreases the page reference count while __folio_put() does not. Switch __folio_put() to folio_put() which decreases the folio reference count first. Fixes: 400b172b8cdc ("btrfs: compression: migrate compression/decompression paths to folios") Tested-by: Ed Tomlinson <edtoml@gmail.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-04netfilter: nf_tables: unconditionally flush pending work before notifierFlorian Westphal
syzbot reports: KASAN: slab-uaf in nft_ctx_update include/net/netfilter/nf_tables.h:1831 KASAN: slab-uaf in nft_commit_release net/netfilter/nf_tables_api.c:9530 KASAN: slab-uaf int nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597 Read of size 2 at addr ffff88802b0051c4 by task kworker/1:1/45 [..] Workqueue: events nf_tables_trans_destroy_work Call Trace: nft_ctx_update include/net/netfilter/nf_tables.h:1831 [inline] nft_commit_release net/netfilter/nf_tables_api.c:9530 [inline] nf_tables_trans_destroy_work+0x152b/0x1750 net/netfilter/nf_tables_api.c:9597 Problem is that the notifier does a conditional flush, but its possible that the table-to-be-removed is still referenced by transactions being processed by the worker, so we need to flush unconditionally. We could make the flush_work depend on whether we found a table to delete in nf-next to avoid the flush for most cases. AFAICS this problem is only exposed in nf-next, with commit e169285f8c56 ("netfilter: nf_tables: do not store nft_ctx in transaction objects"), with this commit applied there is an unconditional fetch of table->family which is whats triggering the above splat. Fixes: 2c9f0293280e ("netfilter: nf_tables: flush pending destroy work before netlink notifier") Reported-and-tested-by: syzbot+4fd66a69358fc15ae2ad@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4fd66a69358fc15ae2ad Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-07-04i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isrPiotr Wojtaszczyk
When del_timer_sync() is called in an interrupt context it throws a warning because of potential deadlock. The timer is used only to exit from wait_for_completion() after a timeout so replacing the call with wait_for_completion_timeout() allows to remove the problematic timer and its related functions altogether. Fixes: 41561f28e76a ("i2c: New Philips PNX bus driver") Signed-off-by: Piotr Wojtaszczyk <piotr.wojtaszczyk@timesys.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-07-03Merge tag 'trace-v6.10-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix ioctl conflict with memmapped ring buffer ioctl It was reported that the ioctl() number used to update the ring buffer memory mapping conflicted with the TCGETS ioctl causing strace to report: $ strace -e ioctl stty ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0 Since this ioctl hasn't been in a full release yet, change it from "T", 0x1 to "R" 0x20, and also reserve 0x20-0x2F for future ioctl commands, as some more are being worked on for the future" * tag 'trace-v6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have memmapped ring buffer use ioctl of "R" range 0x20-2F
2024-07-03tracing: Have memmapped ring buffer use ioctl of "R" range 0x20-2FSteven Rostedt (Google)
To prevent conflicts with other ioctl numbers to allow strace to have an idea of what is happening, add the range of ioctls for the trace buffer mapping from _IO("T", 0x1) to the range of "R" 0x20 - 0x2F. Link: https://lore.kernel.org/linux-trace-kernel/20240630105322.GA17573@altlinux.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240630213626.GA23566@altlinux.org/ Cc: Jonathan Corbet <corbet@lwn.net> Fixes: cf9f0f7c4c5bb ("tracing: Allow user-space mapping of the ring-buffer") Link: https://lore.kernel.org/20240702153354.367861db@rorschach.local.home Reported-by: "Dmitry V. Levin" <ldv@strace.io> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-07-03riscv: kexec: Avoid deadlock in kexec crash pathSong Shuai
If the kexec crash code is called in the interrupt context, the machine_kexec_mask_interrupts() function will trigger a deadlock while trying to acquire the irqdesc spinlock and then deactivate irqchip in irq_set_irqchip_state() function. Unlike arm64, riscv only requires irq_eoi handler to complete EOI and keeping irq_set_irqchip_state() will only leave this possible deadlock without any use. So we simply remove it. Link: https://lore.kernel.org/linux-riscv/20231208111015.173237-1-songshuaishuai@tinylab.org/ Fixes: b17d19a5314a ("riscv: kexec: Fixup irq controller broken in kexec crash path") Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Ryo Takakura <takakura@valinux.co.jp> Link: https://lore.kernel.org/r/20240626023316.539971-1-songshuaishuai@tinylab.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03riscv: stacktrace: fix usage of ftrace_graph_ret_addr()Puranjay Mohan
ftrace_graph_ret_addr() takes an `idx` integer pointer that is used to optimize the stack unwinding. Pass it a valid pointer to utilize the optimizations that might be available in the future. The commit is making riscv's usage of ftrace_graph_ret_addr() match x86_64. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20240618145820.62112-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03riscv: selftests: Fix vsetivli args for clangCharlie Jenkins
Clang does not support implicit LMUL in the vset* instruction sequences. Introduce an explicit LMUL in the vsetivli instruction. Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Fixes: 9d5328eeb185 ("riscv: selftests: Add signal handling vector tests") Link: https://lore.kernel.org/r/20240702-fix_sigreturn_test-v1-1-485f88a80612@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03Merge patch series "Assorted fixes in RISC-V PMU driver"Palmer Dabbelt
Atish Patra <atishp@rivosinc.com> says: This series contains 3 fixes out of which the first one is a new fix for invalid event data reported in lkml[2]. The last two are v3 of Samuel's patch[1]. I added the RB/TB/Fixes tag and moved 1 unrelated change to its own patch. I also changed an error message in kvm vcpu_pmu from pr_err to pr_debug to avoid redundant failure error messages generated due to the boot time quering of events implemented in the patch[1] Here is the original cover letter for the patch[1] Before this patch: $ perf list hw List of pre-defined events (to be used in -e or -M): branch-instructions OR branches [Hardware event] branch-misses [Hardware event] bus-cycles [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] ref-cycles [Hardware event] stalled-cycles-backend OR idle-cycles-backend [Hardware event] stalled-cycles-frontend OR idle-cycles-frontend [Hardware event] $ perf stat -ddd true Performance counter stats for 'true': 4.36 msec task-clock # 0.744 CPUs utilized 1 context-switches # 229.325 /sec 0 cpu-migrations # 0.000 /sec 38 page-faults # 8.714 K/sec 4,375,694 cycles # 1.003 GHz (60.64%) 728,945 instructions # 0.17 insn per cycle 79,199 branches # 18.162 M/sec 17,709 branch-misses # 22.36% of all branches 181,734 L1-dcache-loads # 41.676 M/sec 5,547 L1-dcache-load-misses # 3.05% of all L1-dcache accesses <not counted> LLC-loads (0.00%) <not counted> LLC-load-misses (0.00%) <not counted> L1-icache-loads (0.00%) <not counted> L1-icache-load-misses (0.00%) <not counted> dTLB-loads (0.00%) <not counted> dTLB-load-misses (0.00%) <not counted> iTLB-loads (0.00%) <not counted> iTLB-load-misses (0.00%) <not counted> L1-dcache-prefetches (0.00%) <not counted> L1-dcache-prefetch-misses (0.00%) 0.005860375 seconds time elapsed 0.000000000 seconds user 0.010383000 seconds sys After this patch: $ perf list hw List of pre-defined events (to be used in -e or -M): branch-instructions OR branches [Hardware event] branch-misses [Hardware event] cache-misses [Hardware event] cache-references [Hardware event] cpu-cycles OR cycles [Hardware event] instructions [Hardware event] $ perf stat -ddd true Performance counter stats for 'true': 5.16 msec task-clock # 0.848 CPUs utilized 1 context-switches # 193.817 /sec 0 cpu-migrations # 0.000 /sec 37 page-faults # 7.171 K/sec 5,183,625 cycles # 1.005 GHz 961,696 instructions # 0.19 insn per cycle 85,853 branches # 16.640 M/sec 20,462 branch-misses # 23.83% of all branches 243,545 L1-dcache-loads # 47.203 M/sec 5,974 L1-dcache-load-misses # 2.45% of all L1-dcache accesses <not supported> LLC-loads <not supported> LLC-load-misses <not supported> L1-icache-loads <not supported> L1-icache-load-misses <not supported> dTLB-loads 19,619 dTLB-load-misses <not supported> iTLB-loads 6,831 iTLB-load-misses <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 0.006085625 seconds time elapsed 0.000000000 seconds user 0.013022000 seconds sys [1] https://lore.kernel.org/linux-riscv/20240418014652.1143466-1-samuel.holland@sifive.com/ [2] https://lore.kernel.org/all/CC51D53B-846C-4D81-86FC-FBF969D0A0D6@pku.edu.cn/ * b4-shazam-merge: perf: RISC-V: Check standard event availability drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus drivers/perf: riscv: Do not update the event data if uptodate Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-0-e01cfddcf035@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03perf: RISC-V: Check standard event availabilitySamuel Holland
The RISC-V SBI PMU specification defines several standard hardware and cache events. Currently, all of these events are exposed to userspace, even when not actually implemented. They appear in the `perf list` output, and commands like `perf stat` try to use them. This is more than just a cosmetic issue, because the PMU driver's .add function fails for these events, which causes pmu_groups_sched_in() to prematurely stop scheduling in other (possibly valid) hardware events. Add logic to check which events are supported by the hardware (i.e. can be mapped to some counter), so only usable events are reported to userspace. Since the kernel does not know the mapping between events and possible counters, this check must happen during boot, when no counters are in use. Make the check asynchronous to minimize impact on boot time. Fixes: e9991434596f ("RISC-V: Add perf platform driver based on SBI PMU extension") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-3-e01cfddcf035@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpusSamuel Holland
Currently, we stop all the counters while a new cpu is brought online. However, the hpmevent to counter mappings are not reset. The firmware may have some stale encoding in their mapping structure which may lead to undesirable results. We have not encountered such scenario though. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-2-e01cfddcf035@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03drivers/perf: riscv: Do not update the event data if uptodateAtish Patra
In case of an counter overflow, the event data may get corrupted if called from an external overflow handler. This happens because we can't update the counter without starting it when SBI PMU extension is in use. However, the prev_count has been already updated at the first pass while the counter value is still the old one. The solution is simple where we don't need to update it again if it is already updated which can be detected using hwc state. The event state in the overflow handler is updated in the following patch. Thus, this fix can't be backported to kernel version where overflow support was added. Fixes: a8625217a054 ("drivers/perf: riscv: Implement SBI PMU snapshot function") Closes:https://lore.kernel.org/all/CC51D53B-846C-4D81-86FC-FBF969D0A0D6@pku.edu.cn/ Reported-by: garthlei@pku.edu.cn Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-1-e01cfddcf035@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03nilfs2: fix incorrect inode allocation from reserved inodesRyusuke Konishi
If the bitmap block that manages the inode allocation status is corrupted, nilfs_ifile_create_inode() may allocate a new inode from the reserved inode area where it should not be allocated. Previous fix commit d325dc6eb763 ("nilfs2: fix use-after-free bug of struct nilfs_root"), fixed the problem that reserved inodes with inode numbers less than NILFS_USER_INO (=11) were incorrectly reallocated due to bitmap corruption, but since the start number of non-reserved inodes is read from the super block and may change, in which case inode allocation may occur from the extended reserved inode area. If that happens, access to that inode will cause an IO error, causing the file system to degrade to an error state. Fix this potential issue by adding a wraparound option to the common metadata object allocation routine and by modifying nilfs_ifile_create_inode() to disable the option so that it only allocates inodes with inode numbers greater than or equal to the inode number read in "nilfs->ns_first_ino", regardless of the bitmap status of reserved inodes. Link: https://lkml.kernel.org/r/20240623051135.4180-4-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03nilfs2: add missing check for inode numbers on directory entriesRyusuke Konishi
Syzbot reported that mounting and unmounting a specific pattern of corrupted nilfs2 filesystem images causes a use-after-free of metadata file inodes, which triggers a kernel bug in lru_add_fn(). As Jan Kara pointed out, this is because the link count of a metadata file gets corrupted to 0, and nilfs_evict_inode(), which is called from iput(), tries to delete that inode (ifile inode in this case). The inconsistency occurs because directories containing the inode numbers of these metadata files that should not be visible in the namespace are read without checking. Fix this issue by treating the inode numbers of these internal files as errors in the sanity check helper when reading directory folios/pages. Also thanks to Hillf Danton and Matthew Wilcox for their initial mm-layer analysis. Link: https://lkml.kernel.org/r/20240623051135.4180-3-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+d79afb004be235636ee8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d79afb004be235636ee8 Reported-by: Jan Kara <jack@suse.cz> Closes: https://lkml.kernel.org/r/20240617075758.wewhukbrjod5fp5o@quack3 Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>