summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-09-13Merge branch 'tcp-bind-fixes'David S. Miller
Kuniyuki Iwashima says: ==================== tcp: Fix bind() regression for v4-mapped-v6 address Since bhash2 was introduced, bind() is broken in two cases related to v4-mapped-v6 address. This series fixes the regression and adds test to cover the cases. Changes: v2: * Added patch 1 to factorise duplicated comparison (Eric Dumazet) v1: https://lore.kernel.org/netdev/20230911165106.39384-1-kuniyu@amazon.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.Kuniyuki Iwashima
We add these 8 test cases in bind_wildcard.c to check bind() conflicts. 1st bind() 2nd bind() --------- --------- 0.0.0.0 ::FFFF:0.0.0.0 ::FFFF:0.0.0.0 0.0.0.0 0.0.0.0 ::FFFF:127.0.0.1 ::FFFF:127.0.0.1 0.0.0.0 127.0.0.1 ::FFFF:0.0.0.0 ::FFFF:0.0.0.0 127.0.0.1 127.0.0.1 ::FFFF:127.0.0.1 ::FFFF:127.0.0.1 127.0.0.1 All test passed without bhash2 and with bhash2 and this series. Before bhash2: $ uname -r 6.0.0-rc1-00393-g0bf73255d3a3 $ ./bind_wildcard ... # PASSED: 16 / 16 tests passed. Just after bhash2: $ uname -r 6.0.0-rc1-00394-g28044fc1d495 $ ./bind_wildcard ... ok 15 bind_wildcard.v4_local_v6_v4mapped_local.v4_v6 not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4 # FAILED: 15 / 16 tests passed. On net.git: $ ./bind_wildcard ... not ok 14 bind_wildcard.v4_local_v6_v4mapped_any.v6_v4 not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4 # FAILED: 13 / 16 tests passed. With this series: $ ./bind_wildcard ... # PASSED: 16 / 16 tests passed. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.Kuniyuki Iwashima
This is a preparation patch for the following patch. Let's define expected_errno in each test case so that we can add other test cases easily. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13selftest: tcp: Fix address length in bind_wildcard.c.Kuniyuki Iwashima
The selftest passes the IPv6 address length for an IPv4 address. We should pass the correct length. Note inet_bind_sk() does not check if the size is larger than sizeof(struct sockaddr_in), so there is no real bug in this selftest. Fixes: 13715acf8ab5 ("selftest: Add test for bind() conflicts.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.Kuniyuki Iwashima
Since bhash2 was introduced, the example below does not work as expected. These two bind() should conflict, but the 2nd bind() now succeeds. from socket import * s1 = socket(AF_INET6, SOCK_STREAM) s1.bind(('::ffff:127.0.0.1', 0)) s2 = socket(AF_INET, SOCK_STREAM) s2.bind(('127.0.0.1', s1.getsockname()[1])) During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find() fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates a new tb2 for the 2nd socket. Then, we call inet_csk_bind_conflict() that checks conflicts in the new tb2 by inet_bhash2_conflict(). However, the new tb2 does not include the 1st socket, thus the bind() finally succeeds. In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find() returns the 1st socket's tb2. Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1, the 2nd bind() fails properly for the same reason mentinoed in the previous commit. Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13tcp: Fix bind() regression for v4-mapped-v6 wildcard address.Kuniyuki Iwashima
Andrei Vagin reported bind() regression with strace logs. If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4 socket to 127.0.0.1, the 2nd bind() should fail but now succeeds. from socket import * s1 = socket(AF_INET6, SOCK_STREAM) s1.bind(('::ffff:0.0.0.0', 0)) s2 = socket(AF_INET, SOCK_STREAM) s2.bind(('127.0.0.1', s1.getsockname()[1])) During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check if tb has the v4-mapped-v6 wildcard address. The example above does not work after commit 5456262d2baa ("net: Fix incorrect address comparison when searching for a bind2 bucket"), but the blamed change is not the commit. Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated as 0.0.0.0, and the sequence above worked by chance. Technically, this case has been broken since bhash2 was introduced. Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0, the 2nd bind() fails properly because we fall back to using bhash to detect conflicts for the v4-mapped-v6 address. Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address") Reported-by: Andrei Vagin <avagin@google.com> Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13tcp: Factorise sk_family-independent comparison in ↵Kuniyuki Iwashima
inet_bind2_bucket_match(_addr_any). This is a prep patch to make the following patches cleaner that touch inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any(). Both functions have duplicated comparison for netns, port, and l3mdev. Let's factorise them. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-12bpf, cgroup: fix multiple kernel-doc warningsRandy Dunlap
Fix missing or extra function parameter kernel-doc warnings in cgroup.c: kernel/bpf/cgroup.c:1359: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_skb' kernel/bpf/cgroup.c:1359: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_skb' kernel/bpf/cgroup.c:1439: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sk' kernel/bpf/cgroup.c:1439: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sk' kernel/bpf/cgroup.c:1467: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sock_addr' kernel/bpf/cgroup.c:1467: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sock_addr' kernel/bpf/cgroup.c:1512: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sock_ops' kernel/bpf/cgroup.c:1512: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sock_ops' kernel/bpf/cgroup.c:1685: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sysctl' kernel/bpf/cgroup.c:1685: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sysctl' kernel/bpf/cgroup.c:795: warning: Excess function parameter 'type' description in '__cgroup_bpf_replace' kernel/bpf/cgroup.c:795: warning: Function parameter or member 'new_prog' not described in '__cgroup_bpf_replace' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230912060812.1715-1-rdunlap@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-12selftests/bpf: fix unpriv_disabled check in test_verifierArtem Savkov
Commit 1d56ade032a49 changed the function get_unpriv_disabled() to return its results as a bool instead of updating a global variable, but test_verifier was not updated to keep in line with these changes. Thus unpriv_disabled is always false in test_verifier and unprivileged tests are not properly skipped on systems with unprivileged bpf disabled. Fixes: 1d56ade032a49 ("selftests/bpf: Unprivileged tests for test_loader.c") Signed-off-by: Artem Savkov <asavkov@redhat.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230912120631.213139-1-asavkov@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-12bpf: Fix a erroneous check after snprintf()Christophe JAILLET
snprintf() does not return negative error code on error, it returns the number of characters which *would* be generated for the given input. Fix the error handling check. Fixes: 57539b1c0ac2 ("bpf: Enable annotating trusted nested pointers") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/393bdebc87b22563c08ace094defa7160eb7a6c0.1694190795.git.christophe.jaillet@wanadoo.fr Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-12tpm: Fix typo in tpmrm class definitionJustin M. Forbes
Commit d2e8071bed0be ("tpm: make all 'class' structures const") unfortunately had a typo for the name on tpmrm. Fixes: d2e8071bed0b ("tpm: make all 'class' structures const") Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-09-12Merge tag 'for-6.6-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - several fixes for handling directory item (inserting, removing, iteration, error handling) - fix transaction commit stalls when auto relocation is running and blocks other tasks that want to commit - fix a build error when DEBUG is enabled - fix lockdep warning in inode number lookup ioctl - fix race when finishing block group creation - remove link to obsolete wiki in several files * tag 'for-6.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: MAINTAINERS: remove links to obsolete btrfs.wiki.kernel.org btrfs: assert delayed node locked when removing delayed item btrfs: remove BUG() after failure to insert delayed dir index item btrfs: improve error message after failure to add delayed dir index item btrfs: fix a compilation error if DEBUG is defined in btree_dirty_folio btrfs: check for BTRFS_FS_ERROR in pending ordered assert btrfs: fix lockdep splat and potential deadlock after failure running delayed items btrfs: do not block starts waiting on previous transaction commit btrfs: release path before inode lookup during the ino lookup ioctl btrfs: fix race between finishing block group creation and its item update
2023-09-12Merge tag 'platform-drivers-x86-v6.6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - various platform/mellanox fixes - one new DMI quirk for asus-wmi * tag 'platform-drivers-x86-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: asus-wmi: Support 2023 ROG X16 tablet mode platform/mellanox: NVSW_SN2201 should depend on ACPI platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events platform/mellanox: mlxbf-pmc: Fix potential buffer overflows platform/mellanox: mlxbf-tmfifo: Drop jumbo frames platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors
2023-09-12ipv6: fix ip6_sock_set_addr_preferences() typoEric Dumazet
ip6_sock_set_addr_preferences() second argument should be an integer. SUNRPC attempts to set IPV6_PREFER_SRC_PUBLIC were translated to IPV6_PREFER_SRC_TMP Fixes: 18d5ad623275 ("ipv6: add ip6_sock_set_addr_preferences") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230911154213.713941-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12Merge tag 'linux-kselftest-next-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: - kselftest runner script to propagate SIGTERM to runner child to avoid kselftest hang - install symlinks required for test execution to avoid test failures - kselftest dependency checker script argument parsing * tag 'linux-kselftest-next-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: Keep symlinks, when possible selftests: fix dependency checker script kselftest/runner.sh: Propagate SIGTERM to runner child selftests/ftrace: Correctly enable event in instance-event.tc
2023-09-12Merge tag 'linux-kselftest-kunit-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit fixes from Shuah Khan: "Fixes to possible memory leak, null-ptr-deref, wild-memory-access, and error path bugs" * tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Fix possible memory leak in kunit_filter_suites() kunit: Fix possible null-ptr-deref in kunit_parse_glob_filter() kunit: Fix the wrong err path and add goto labels in kunit_filter_suites() kunit: Fix wild-memory-access bug in kunit_free_suite_set() kunit: test: Make filter strings in executor_test writable
2023-09-12Merge tag 'ovl-fixes-6.6-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fixes from Amir Goldstein: "Two fixes for pretty old regressions" * tag 'ovl-fixes-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fix incorrect fdput() on aio completion ovl: fix failed copyup of fileattr on a symlink
2023-09-12linux/export: fix reference to exported functions for parisc64Masahiro Yamada
John David Anglin reported parisc has been broken since commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost"). Like ia64, parisc64 uses a function descriptor. The function references must be prefixed with P%. Also, symbols prefixed $$ from the library have the symbol type STT_LOPROC instead of STT_FUNC. They should be handled as functions too. Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") Reported-by: John David Anglin <dave.anglin@bell.net> Tested-by: John David Anglin <dave.anglin@bell.net> Tested-by: Helge Deller <deller@gmx.de> Closes: https://lore.kernel.org/linux-parisc/1901598a-e11d-f7dd-a5d9-9a69d06e6b6e@bell.net/T/#u Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>
2023-09-12selftests/bpf: ensure all CI arches set CONFIG_BPF_KPROBE_OVERRIDE=yAndrii Nakryiko
Turns out CONFIG_BPF_KPROBE_OVERRIDE=y is only enabled in x86-64 CI, but is not set on aarch64, causing CI failures ([0]). Move CONFIG_BPF_KPROBE_OVERRIDE=y to arch-agnostic CI config. [0] https://github.com/kernel-patches/bpf/actions/runs/6122324047/job/16618390535 Fixes: 7182e56411b9 ("selftests/bpf: Add kprobe_multi override test") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230912055928.1704269-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-12veth: Update XDP feature set when bringing up deviceToke Høiland-Jørgensen
There's an early return in veth_set_features() if the device is in a down state, which leads to the XDP feature flags not being updated when enabling GRO while the device is down. Which in turn leads to XDP_REDIRECT not working, because the redirect code now checks the flags. Fix this by updating the feature flags after bringing the device up. Before this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: no NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: no After this patch: NETDEV_XDP_ACT_BASIC: yes NETDEV_XDP_ACT_REDIRECT: yes NETDEV_XDP_ACT_NDO_XMIT: yes NETDEV_XDP_ACT_XSK_ZEROCOPY: no NETDEV_XDP_ACT_HW_OFFLOAD: no NETDEV_XDP_ACT_RX_SG: yes NETDEV_XDP_ACT_NDO_XMIT_SG: yes Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag") Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230911135826.722295-1-toke@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()Jinjie Ruan
Inject fault while probing btrfs.ko, if kstrdup() fails in eventfs_prepare_ef() in eventfs_add_dir(), it will return ERR_PTR to assign file->ef. But the eventfs_remove() check NULL in trace_module_remove_events(), which causes the below NULL pointer dereference. As both Masami and Steven suggest, allocater side should handle the error carefully and remove it, so fix the places where it failed. Could not create tracefs 'raid56_write' directory Btrfs loaded, zoned=no, fsverity=no Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000102544000 [000000000000001c] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: btrfs(-) libcrc32c xor xor_neon raid6_pq cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: btrfs] CPU: 15 PID: 1343 Comm: rmmod Tainted: G N 6.5.0+ #40 Hardware name: linux,dummy-virt (DT) pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : eventfs_remove_rec+0x24/0xc0 lr : eventfs_remove+0x68/0x1d8 sp : ffff800082d63b60 x29: ffff800082d63b60 x28: ffffb84b80ddd00c x27: ffffb84b3054ba40 x26: 0000000000000002 x25: ffff800082d63bf8 x24: ffffb84b8398e440 x23: ffffb84b82af3000 x22: dead000000000100 x21: dead000000000122 x20: ffff800082d63bf8 x19: fffffffffffffff4 x18: ffffb84b82508820 x17: 0000000000000000 x16: 0000000000000000 x15: 000083bc876a3166 x14: 000000000000006d x13: 000000000000006d x12: 0000000000000000 x11: 0000000000000001 x10: 00000000000017e0 x9 : 0000000000000001 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb84b84289804 x5 : 0000000000000000 x4 : 9696969696969697 x3 : ffff33a5b7601f38 x2 : 0000000000000000 x1 : ffff800082d63bf8 x0 : fffffffffffffff4 Call trace: eventfs_remove_rec+0x24/0xc0 eventfs_remove+0x68/0x1d8 remove_event_file_dir+0x88/0x100 event_remove+0x140/0x15c trace_module_notify+0x1fc/0x230 notifier_call_chain+0x98/0x17c blocking_notifier_call_chain+0x4c/0x74 __arm64_sys_delete_module+0x1a4/0x298 invoke_syscall+0x44/0x100 el0_svc_common.constprop.1+0x68/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x3c/0xc4 el0t_64_sync_handler+0xa0/0xc4 el0t_64_sync+0x174/0x178 Code: 5400052c a90153b3 aa0003f3 aa0103f4 (f9401400) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception SMP: stopping secondary CPUs Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x384b00c00000 from 0xffff800080000000 PHYS_OFFSET: 0xffffcc5b80000000 CPU features: 0x88000203,3c020000,1000421b Memory Limit: none Rebooting in 1 seconds.. Link: https://lore.kernel.org/linux-trace-kernel/20230912134752.1838524-1-ruanjinjie@huawei.com Link: https://lore.kernel.org/all/20230912025808.668187-1-ruanjinjie@huawei.com/ Link: https://lore.kernel.org/all/20230911052818.1020547-1-ruanjinjie@huawei.com/ Link: https://lore.kernel.org/all/20230909072817.182846-1-ruanjinjie@huawei.com/ Link: https://lore.kernel.org/all/20230908074816.3724716-1-ruanjinjie@huawei.com/ Cc: Ajay Kaher <akaher@vmware.com> Fixes: 5bdcd5f5331a ("eventfs: Implement removal of meta data from eventfs") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-12net: macb: fix sleep inside spinlockSascha Hauer
macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate() which can sleep. This results in: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 | pps pps1: new PPS source ptp1 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3 | preempt_count: 1, expected: 0 | RCU nest depth: 0, expected: 0 | 4 locks held by kworker/u4:3/40: | #0: ffff000003409148 | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered. | ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c | #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8 | #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac | irq event stamp: 113998 | hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64 | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8 | softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4 | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368 | Hardware name: ... ZynqMP ... (DT) | Workqueue: events_power_efficient phylink_resolve | Call trace: | dump_backtrace+0x98/0xf0 | show_stack+0x18/0x24 | dump_stack_lvl+0x60/0xac | dump_stack+0x18/0x24 | __might_resched+0x144/0x24c | __might_sleep+0x48/0x98 | __mutex_lock+0x58/0x7b0 | mutex_lock_nested+0x24/0x30 | clk_prepare_lock+0x4c/0xa8 | clk_set_rate+0x24/0x8c | macb_mac_link_up+0x25c/0x2ac | phylink_resolve+0x178/0x4e8 | process_one_work+0x1ec/0x51c | worker_thread+0x1ec/0x3e4 | kthread+0x120/0x124 | ret_from_fork+0x10/0x20 The obvious fix is to move the call to macb_set_tx_clk() out of the protected area. This seems safe as rx and tx are both disabled anyway at this point. It is however not entirely clear what the spinlock shall protect. It could be the read-modify-write access to the NCFGR register, but this is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well without holding the spinlock. It could also be the register accesses done in mog_init_rings() or macb_init_buffers(), but again these functions are called without holding the spinlock in macb_hresp_error_task(). The locking seems fishy in this driver and it might deserve another look before this patch is applied. Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/r/20230908112913.1701766-1-s.hauer@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-12net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()Liu Jian
I got the below warning when do fuzzing test: BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470 Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9 CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G OE Hardware name: linux,dummy-virt (DT) Workqueue: pencrypt_parallel padata_parallel_worker Call trace: dump_backtrace+0x0/0x420 show_stack+0x34/0x44 dump_stack+0x1d0/0x248 __kasan_report+0x138/0x140 kasan_report+0x44/0x6c __asan_load4+0x94/0xd0 scatterwalk_copychunks+0x320/0x470 skcipher_next_slow+0x14c/0x290 skcipher_walk_next+0x2fc/0x480 skcipher_walk_first+0x9c/0x110 skcipher_walk_aead_common+0x380/0x440 skcipher_walk_aead_encrypt+0x54/0x70 ccm_encrypt+0x13c/0x4d0 crypto_aead_encrypt+0x7c/0xfc pcrypt_aead_enc+0x28/0x84 padata_parallel_worker+0xd0/0x2dc process_one_work+0x49c/0xbdc worker_thread+0x124/0x880 kthread+0x210/0x260 ret_from_fork+0x10/0x18 This is because the value of rec_seq of tls_crypto_info configured by the user program is too large, for example, 0xffffffffffffff. In addition, TLS is asynchronously accelerated. When tls_do_encryption() returns -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow, skmsg is released before the asynchronous encryption process ends. As a result, the UAF problem occurs during the asynchronous processing of the encryption module. If the operation is asynchronous and the encryption module returns EINPROGRESS, do not free the record information. Fixes: 635d93981786 ("net/tls: free record only on encryption error") Signed-off-by: Liu Jian <liujian56@huawei.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20230909081434.2324940-1-liujian56@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-11Merge branch 'Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init'Martin KaFai Lau
Eduard Zingerman says: ==================== For a device bound BPF program with flag BPF_F_XDP_DEV_BOUND_ONLY, in case if device does not support offload, __bpf_prog_dev_bound_init() creates a dummy bpf_offload_netdev struct with .offdev field set to NULL. This dummy struct might be reused for programs without this flag bound to the same device. However, bpf_prog_offload_verifier_prep() that uses bpf_offload_netdev assumes that .offdev field cannot be NULL. This bug was reported by syzbot in [1]. [1] https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-09-11selftests/bpf: Offloaded prog after non-offloaded should not cause BUGEduard Zingerman
Check what happens if non-offloaded dev bound BPF program is followed by offloaded dev bound program. Test case adapated from syzbot report [1]. [1] https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230912005539.2248244-3-eddyz87@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-09-11bpf: Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_initEduard Zingerman
Fix for a bug observable under the following sequence of events: 1. Create a network device that does not support XDP offload. 2. Load a device bound XDP program with BPF_F_XDP_DEV_BOUND_ONLY flag (such programs are not offloaded). 3. Load a device bound XDP program with zero flags (such programs are offloaded). At step (2) __bpf_prog_dev_bound_init() associates with device (1) a dummy bpf_offload_netdev struct with .offdev field set to NULL. At step (3) __bpf_prog_dev_bound_init() would reuse dummy struct allocated at step (2). However, downstream usage of the bpf_offload_netdev assumes that .offdev field can't be NULL, e.g. in bpf_prog_offload_verifier_prep(). Adjust __bpf_prog_dev_bound_init() to require bpf_offload_netdev with non-NULL .offdev for offloaded BPF programs. Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230912005539.2248244-2-eddyz87@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-09-11tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()Steven Rostedt (Google)
The eventfs files list is protected by SRCU. In earlier iterations it was protected with just RCU, but because it needed to also call sleepable code, it had to be switch to SRCU. The dcache_dir_open_wrapper() list_for_each_rcu() was missed and did not get converted over to list_for_each_srcu(). That needs to be fixed. Link: https://lore.kernel.org/linux-trace-kernel/20230911120053.ca82f545e7f46ea753deda18@kernel.org/ Link: https://lore.kernel.org/linux-trace-kernel/20230911200654.71ce927c@gandalf.local.home Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ajay Kaher <akaher@vmware.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Reported-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-11bpf: Avoid deadlock when using queue and stack maps from NMIToke Høiland-Jørgensen
Sysbot discovered that the queue and stack maps can deadlock if they are being used from a BPF program that can be called from NMI context (such as one that is attached to a perf HW counter event). To fix this, add an in_nmi() check and use raw_spin_trylock() in NMI context, erroring out if grabbing the lock fails. Fixes: f1a2e44a3aec ("bpf: add queue and stack maps") Reported-by: Hsin-Wei Hung <hsinweih@uci.edu> Tested-by: Hsin-Wei Hung <hsinweih@uci.edu> Co-developed-by: Hsin-Wei Hung <hsinweih@uci.edu> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230911132815.717240-1-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11tracing/synthetic: Print out u64 values properlyTero Kristo
The synth traces incorrectly print pointer to the synthetic event values instead of the actual value when using u64 type. Fix by addressing the contents of the union properly. Link: https://lore.kernel.org/linux-trace-kernel/20230911141704.3585965-1-tero.kristo@linux.intel.com Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts") Cc: stable@vger.kernel.org Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-11tracing/synthetic: Fix order of struct trace_dynamic_infoSteven Rostedt (Google)
To make handling BIG and LITTLE endian better the offset/len of dynamic fields of the synthetic events was changed into a structure of: struct trace_dynamic_info { #ifdef CONFIG_CPU_BIG_ENDIAN u16 offset; u16 len; #else u16 len; u16 offset; #endif }; to replace the manual changes of: data_offset = offset & 0xffff; data_offest = len << 16; But if you look closely, the above is: <len> << 16 | offset Which in little endian would be in memory: offset_lo offset_hi len_lo len_hi and in big endian: len_hi len_lo offset_hi offset_lo Which if broken into a structure would be: struct trace_dynamic_info { #ifdef CONFIG_CPU_BIG_ENDIAN u16 len; u16 offset; #else u16 offset; u16 len; #endif }; Which is the opposite of what was defined. Fix this and just to be safe also add "__packed". Link: https://lore.kernel.org/all/20230908154417.5172e343@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20230908163929.2c25f3dc@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Fixes: ddeea494a16f3 ("tracing/synthetic: Use union instead of casts") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-11selftests/bpf: Update bpf_clone_redirect expected return codeStanislav Fomichev
Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped packets") started propagating proper NET_XMIT_DROP error to the caller which means it's now possible to get positive error code when calling bpf_clone_redirect() in this particular test. Update the test to reflect that. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230911194731.286342-2-sdf@google.com
2023-09-11bpf: Clarify error expectations from bpf_clone_redirectStanislav Fomichev
Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped packets") exposed the fact that bpf_clone_redirect is capable of returning raw NET_XMIT_XXX return codes. This is in the conflict with its UAPI doc which says the following: "0 on success, or a negative error in case of failure." Update the UAPI to reflect the fact that bpf_clone_redirect can return positive error numbers, but don't explicitly define their meaning. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230911194731.286342-1-sdf@google.com
2023-09-11Merge branch 'fix-the-unmatched-unit_size-of-bpf_mem_cache'Alexei Starovoitov
Hou Tao says: ==================== Fix the unmatched unit_size of bpf_mem_cache From: Hou Tao <houtao1@huawei.com> Hi, The patchset aims to fix the reported warning [0] when the unit_size of bpf_mem_cache is mismatched with the object size of underly slab-cache. Patch #1 fixes the warning by adjusting size_index according to the value of KMALLOC_MIN_SIZE, so bpf_mem_cache with unit_size which is smaller than KMALLOC_MIN_SIZE or is not aligned with KMALLOC_MIN_SIZE will be redirected to bpf_mem_cache with bigger unit_size. Patch #2 doesn't do prefill for these redirected bpf_mem_cache to save memory. Patch #3 adds further error check in bpf_mem_alloc_init() to ensure the unit_size and object_size are always matched and to prevent potential issues due to the mismatch. Please see individual patches for more details. And comments are always welcome. [0]: https://lore.kernel.org/bpf/87jztjmmy4.fsf@all.your.base.are.belong.to.us ==================== Link: https://lore.kernel.org/r/20230908133923.2675053-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11selftests/bpf: Test all valid alloc sizes for bpf mem allocatorHou Tao
Add a test to test all possible and valid allocation size for bpf memory allocator. For each possible allocation size, the test uses the following two steps to test the alloc and free path: 1) allocate N (N > high_watermark) objects to trigger the refill executed in irq_work. 2) free N objects to trigger the freeing executed in irq_work. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230908133923.2675053-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11bpf: Ensure unit_size is matched with slab cache object sizeHou Tao
Add extra check in bpf_mem_alloc_init() to ensure the unit_size of bpf_mem_cache is matched with the object_size of underlying slab cache. If these two sizes are unmatched, print a warning once and return -EINVAL in bpf_mem_alloc_init(), so the mismatch can be found early and the potential issue can be prevented. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230908133923.2675053-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11bpf: Don't prefill for unused bpf_mem_cacheHou Tao
When the unit_size of a bpf_mem_cache is unmatched with the object_size of the underlying slab cache, the bpf_mem_cache will not be used, and the allocation will be redirected to a bpf_mem_cache with a bigger unit_size instead, so there is no need to prefill for these unused bpf_mem_caches. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230908133923.2675053-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11bpf: Adjust size_index according to the value of KMALLOC_MIN_SIZEHou Tao
The following warning was reported when running "./test_progs -a link_api -a linked_list" on a RISC-V QEMU VM: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 261 at kernel/bpf/memalloc.c:342 bpf_mem_refill Modules linked in: bpf_testmod(OE) CPU: 3 PID: 261 Comm: test_progs- ... 6.5.0-rc5-01743-gdcb152bb8328 #2 Hardware name: riscv-virtio,qemu (DT) epc : bpf_mem_refill+0x1fc/0x206 ra : irq_work_single+0x68/0x70 epc : ffffffff801b1bc4 ra : ffffffff8015fe84 sp : ff2000000001be20 gp : ffffffff82d26138 tp : ff6000008477a800 t0 : 0000000000046600 t1 : ffffffff812b6ddc t2 : 0000000000000000 s0 : ff2000000001be70 s1 : ff5ffffffffe8998 a0 : ff5ffffffffe8998 a1 : ff600003fef4b000 a2 : 000000000000003f a3 : ffffffff80008250 a4 : 0000000000000060 a5 : 0000000000000080 a6 : 0000000000000000 a7 : 0000000000735049 s2 : ff5ffffffffe8998 s3 : 0000000000000022 s4 : 0000000000001000 s5 : 0000000000000007 s6 : ff5ffffffffe8570 s7 : ffffffff82d6bd30 s8 : 000000000000003f s9 : ffffffff82d2c5e8 s10: 000000000000ffff s11: ffffffff82d2c5d8 t3 : ffffffff81ea8f28 t4 : 0000000000000000 t5 : ff6000008fd28278 t6 : 0000000000040000 [<ffffffff801b1bc4>] bpf_mem_refill+0x1fc/0x206 [<ffffffff8015fe84>] irq_work_single+0x68/0x70 [<ffffffff8015feb4>] irq_work_run_list+0x28/0x36 [<ffffffff8015fefa>] irq_work_run+0x38/0x66 [<ffffffff8000828a>] handle_IPI+0x3a/0xb4 [<ffffffff800a5c3a>] handle_percpu_devid_irq+0xa4/0x1f8 [<ffffffff8009fafa>] generic_handle_domain_irq+0x28/0x36 [<ffffffff800ae570>] ipi_mux_process+0xac/0xfa [<ffffffff8000a8ea>] sbi_ipi_handle+0x2e/0x88 [<ffffffff8009fafa>] generic_handle_domain_irq+0x28/0x36 [<ffffffff807ee70e>] riscv_intc_irq+0x36/0x4e [<ffffffff812b5d3a>] handle_riscv_irq+0x54/0x86 [<ffffffff812b6904>] do_irq+0x66/0x98 ---[ end trace 0000000000000000 ]--- The warning is due to WARN_ON_ONCE(tgt->unit_size != c->unit_size) in free_bulk(). The direct reason is that a object is allocated and freed by bpf_mem_caches with different unit_size. The root cause is that KMALLOC_MIN_SIZE is 64 and there is no 96-bytes slab cache in the specific VM. When linked_list test allocates a 72-bytes object through bpf_obj_new(), bpf_global_ma will allocate it from a bpf_mem_cache with 96-bytes unit_size, but this bpf_mem_cache is backed by 128-bytes slab cache. When the object is freed, bpf_mem_free() uses ksize() to choose the corresponding bpf_mem_cache. Because the object is allocated from 128-bytes slab cache, ksize() returns 128, bpf_mem_free() chooses a 128-bytes bpf_mem_cache to free the object and triggers the warning. A similar warning will also be reported when using CONFIG_SLAB instead of CONFIG_SLUB in a x86-64 kernel. Because CONFIG_SLUB defines KMALLOC_MIN_SIZE as 8 but CONFIG_SLAB defines KMALLOC_MIN_SIZE as 32. An alternative fix is to use kmalloc_size_round() in bpf_mem_alloc() to choose a bpf_mem_cache which has the same unit_size with the backing slab cache, but it may introduce performance degradation, so fix the warning by adjusting the indexes in size_index according to the value of KMALLOC_MIN_SIZE just like setup_kmalloc_cache_index_table() does. Fixes: 822fb26bdb55 ("bpf: Add a hint to allocated objects.") Reported-by: Björn Töpel <bjorn@kernel.org> Closes: https://lore.kernel.org/bpf/87jztjmmy4.fsf@all.your.base.are.belong.to.us Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230908133923.2675053-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-11platform/x86: asus-wmi: Support 2023 ROG X16 tablet modeLuke D. Jones
Add quirk for ASUS ROG X16 (GV601V, 2023 versions) Flow 2-in-1 to enable tablet mode with lid flip (all screen rotations). Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20230905082813.13470-1-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: NVSW_SN2201 should depend on ACPIGeert Uytterhoeven
The only probing method supported by the Nvidia SN2201 platform driver is probing through an ACPI match table. Hence add a dependency on ACPI, to prevent asking the user about this driver when configuring a kernel without ACPI support. Fixes: 662f24826f95 ("platform/mellanox: Add support for new SN2201 system") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Acked-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/ec5a4071691ab08d58771b7732a9988e89779268.1693828363.git.geert+renesas@glider.be Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: mlxbf-bootctl: add NET dependency into KconfigDavid Thompson
The latest version of the mlxbf_bootctl driver utilizes "sysfs_format_mac", and this API is only available if NET is defined in the kernel configuration. This patch changes the mlxbf_bootctl Kconfig to depend on NET. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202309031058.JvwNDBKt-lkp@intel.com/ Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/20230905133243.31550-1-davthompson@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed eventsShravan Kumar Ramani
This fix involves 2 changes: - All event regs have a reset value of 0, which is not a valid event_number as per the event_list for most blocks and hence seen as an error. Add a "disable" event with event_number 0 for all blocks. - The enable bit for each counter need not be checked before reading the event info, and hence removed. Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/04d0213932d32681de1c716b54320ed894e52425.1693917738.git.shravankr@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: mlxbf-pmc: Fix potential buffer overflowsShravan Kumar Ramani
Replace sprintf with sysfs_emit where possible. Size check in mlxbf_pmc_event_list_show should account for "\0". Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver") Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/bef39ef32319a31b32f999065911f61b0d3b17c3.1693917738.git.shravankr@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: mlxbf-tmfifo: Drop jumbo framesLiming Sun
This commit drops over-sized network packets to avoid tmfifo queue stuck. Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc") Signed-off-by: Liming Sun <limings@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/9318936c2447f76db475c985ca6d91f057efcd41.1693322547.git.limings@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptorsLiming Sun
This commit fixes tmfifo console stuck issue when the virtual networking interface is in down state. In such case, the network Rx descriptors runs out and causes the Rx network packet staying in the head of the tmfifo thus blocking the console packets. The fix is to drop the Rx network packet when no more Rx descriptors. Function name mlxbf_tmfifo_release_pending_pkt() is also renamed to mlxbf_tmfifo_release_pkt() to be more approperiate. Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc") Signed-off-by: Liming Sun <limings@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/8c0177dc938ae03f52ff7e0b62dbeee74b7bec09.1693322547.git.limings@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-09-11net: ethernet: mtk_eth_soc: fix pse_port configuration for MT7988Lorenzo Bianconi
MT7988 SoC support 3 NICs. Fix pse_port configuration in mtk_flow_set_output_device routine if the traffic is offloaded to eth2. Rely on mtk_pse_port definitions. Fixes: 88efedf517e6 ("net: ethernet: mtk_eth_soc: enable nft hw flowtable_offload for MT7988 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-11net: ethernet: mtk_eth_soc: fix uninitialized variableDaniel Golle
Variable dma_addr in function mtk_poll_rx can be uninitialized on some of the error paths. In practise this doesn't matter, even random data present in uninitialized stack memory can safely be used in the way it happens in the error path. However, in order to make Smatch happy make sure the variable is always initialized. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-11netfilter: nf_tables: disallow element removal on anonymous setsPablo Neira Ayuso
Anonymous sets need to be populated once at creation and then they are bound to rule since 938154b93be8 ("netfilter: nf_tables: reject unbound anonymous set before commit phase"), otherwise transaction reports EINVAL. Userspace does not need to delete elements of anonymous sets that are not yet bound, reject this with EOPNOTSUPP. From flush command path, skip anonymous sets, they are expected to be bound already. Otherwise, EINVAL is hit at the end of this transaction for unbound sets. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-09-11kcm: Fix memory leak in error path of kcm_sendmsg()Shigeru Yoshida
syzbot reported a memory leak like below: BUG: memory leak unreferenced object 0xffff88810b088c00 (size 240): comm "syz-executor186", pid 5012, jiffies 4294943306 (age 13.680s) hex dump (first 32 bytes): 00 89 08 0b 81 88 ff ff 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff83e5d5ff>] __alloc_skb+0x1ef/0x230 net/core/skbuff.c:634 [<ffffffff84606e59>] alloc_skb include/linux/skbuff.h:1289 [inline] [<ffffffff84606e59>] kcm_sendmsg+0x269/0x1050 net/kcm/kcmsock.c:815 [<ffffffff83e479c6>] sock_sendmsg_nosec net/socket.c:725 [inline] [<ffffffff83e479c6>] sock_sendmsg+0x56/0xb0 net/socket.c:748 [<ffffffff83e47f55>] ____sys_sendmsg+0x365/0x470 net/socket.c:2494 [<ffffffff83e4c389>] ___sys_sendmsg+0xc9/0x130 net/socket.c:2548 [<ffffffff83e4c536>] __sys_sendmsg+0xa6/0x120 net/socket.c:2577 [<ffffffff84ad7bb8>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff84ad7bb8>] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd In kcm_sendmsg(), kcm_tx_msg(head)->last_skb is used as a cursor to append newly allocated skbs to 'head'. If some bytes are copied, an error occurred, and jumped to out_error label, 'last_skb' is left unmodified. A later kcm_sendmsg() will use an obsoleted 'last_skb' reference, corrupting the 'head' frag_list and causing the leak. This patch fixes this issue by properly updating the last allocated skb in 'last_skb'. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-and-tested-by: syzbot+6f98de741f7dbbfc4ccb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6f98de741f7dbbfc4ccb Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-11r8152: check budget for r8152_poll()Hayes Wang
According to the document of napi, there is no rx process when the budget is 0. Therefore, r8152_poll() has to return 0 directly when the budget is equal to 0. Fixes: d2187f8e4454 ("r8152: divide the tx and rx bottom functions") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-11Merge branch 'sha1105-regressions'David S. Miller
Vladimir Oltean says: ==================== Fixes for SJA1105 DSA FDB regressions A report by Yanan Yang has prompted an investigation into the sja1105 driver's behavior w.r.t. multicast. The report states that when adding multicast L2 addresses with "bridge mdb add", only the most recently added address works - the others seem to be overwritten. This is solved by patch 3/5 (with patch 2/5 as a dependency for it). Patches 4/5 and 5/5 fix a series of race conditions introduced during the same patch set as the bug above, namely this one: https://patchwork.kernel.org/project/netdevbpf/cover/20211024171757.3753288-1-vladimir.oltean@nxp.com/ Finally, patch 1/5 fixes an issue found ever since the introduction of multicast forwarding offload in sja1105, which is that the multicast addresses are visible (with the "self" flag) in "bridge fdb show". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>