summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-11netdev: replace simple napi_schedule_prep/__napi_schedule to napi_scheduleChristian Marangi
Replace drivers that still use napi_schedule_prep/__napi_schedule with napi_schedule helper as it does the same exact check and call. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231009133754.9834-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11Merge branch 'Add cgroup sockaddr hooks for unix sockets'Martin KaFai Lau
Daan De Meyer says: ==================== Changes since v10: * Removed extra check from bpf_sock_addr_set_sun_path() again in favor of calling unix_validate_addr() everywhere in af_unix.c before calling the hooks. Changes since v9: * Renamed bpf_sock_addr_set_unix_addr() to bpf_sock_addr_set_sun_path() and rennamed arguments to match the new name. * Added an extra check to bpf_sock_addr_set_sun_path() to disallow changing the address of an unnamed unix socket. * Removed unnecessary NULL check on uaddrlen in __cgroup_bpf_run_filter_sock_addr(). Changes since v8: * Added missing test programs to last patch Changes since v7: * Fixed formatting nit in comment * Renamed from cgroup/connectun to cgroup/connect_unix (and similar for all other hooks) Changes since v6: * Actually removed bpf_bind() helper for AF_UNIX hooks. * Fixed merge conflict * Updated comment to mention uaddrlen is read-only for AF_INET[6] * Removed unnecessary forward declaration of struct sock_addr_test * Removed unused BPF_CGROUP_RUN_PROG_UNIX_CONNECT() * Fixed formatting nit reported by checkpatch * Added more information to commit message about recvmsg() on connected socket Changes since v5: * Fixed kernel version in bpftool documentation (6.3 => 6.7). * Added connection mode socket recvmsg() test. * Removed bpf_bind() helper for AF_UNIX hooks. * Added missing getpeernameun and getsocknameun BPF test programs. * Added note for bind() test being unused currently. Changes since v4: * Dropped support for intercepting bind() as when using bind() with unix sockets and a pathname sockaddr, bind() will create an inode in the filesystem that needs to be cleaned up. If the address is rewritten, users might try to clean up the wrong file and leak the actual socket file in the filesystem. * Changed bpf_sock_addr_set_unix_addr() to use BTF_KFUNC_HOOK_CGROUP_SKB instead of BTF_KFUNC_HOOK_COMMON. * Removed unix socket related changes from BPF_CGROUP_PRE_CONNECT_ENABLED() as unix sockets do not support pre-connect. * Added tests for getpeernameun and getsocknameun hooks. * We now disallow an empty sockaddr in bpf_sock_addr_set_unix_addr() similar to unix_validate_addr(). * Removed unnecessary cgroup_bpf_enabled() checks * Removed unnecessary error checks Changes since v3: * Renamed bpf_sock_addr_set_addr() to bpf_sock_addr_set_unix_addr() and made it only operate on AF_UNIX sockaddrs. This is because for the other families, users usually want to configure more than just the address so a generic interface will not fit the bill here. e.g. for AF_INET and AF_INET6, users would generally also want to be able to configure the port which the current interface doesn't support. So we expose an AF_UNIX specific function instead. * Made the tests in the new sock addr tests more generic (similar to test_sock_addr.c), this should make it easier to migrate the other sock addr tests in the future. * Removed the new kfunc hook and attached to BTF_KFUNC_HOOK_COMMON instead * Set uaddrlen to 0 when the family is AF_UNSPEC * Pass in the addrlen to the hook from IPv6 code * Fixed mount directory mkdir() to ignore EEXIST Changes since v2: * Configuring the sock addr is now done via a new kfunc bpf_sock_addr_set() * The addrlen is exposed as u32 in bpf_sock_addr_kern * Selftests are updated to use the new kfunc * Selftests are now added as a new sock_addr test in prog_tests/ * Added BTF_KFUNC_HOOK_SOCK_ADDR for BPF_PROG_TYPE_CGROUP_SOCK_ADDR * __cgroup_bpf_run_filter_sock_addr() now returns the modified addrlen Changes since v1: * Split into multiple patches instead of one single patch * Added unix support for all socket address hooks instead of only connect() * Switched approach to expose the socket address length to the bpf hook instead of recalculating the socket address length in kernelspace to properly support abstract unix socket addresses * Modified socket address hook tests to calculate the socket address length once and pass it around everywhere instead of recalculating the actual unix socket address length on demand. * Added some missing section name tests for getpeername()/getsockname() This patch series extends the cgroup sockaddr hooks to include support for unix sockets. To add support for unix sockets, struct bpf_sock_addr_kern is extended to expose the socket address length to the bpf program. Along with that, a new kfunc bpf_sock_addr_set_unix_addr() is added to safely allow modifying an AF_UNIX sockaddr from bpf programs. I intend to use these new hooks in systemd to reimplement the LogNamespace= feature, which allows running multiple instances of systemd-journald to process the logs of different services. systemd-journald also processes syslog messages, so currently, using log namespaces means all services running in the same log namespace have to live in the same private mount namespace so that systemd can mount the journal namespace's associated syslog socket over /dev/log to properly direct syslog messages from all services running in that log namespace to the correct systemd-journald instance. We want to relax this requirement so that processes running in disjoint mount namespaces can still run in the same log namespace. To achieve this, we can use these new hooks to rewrite the socket address of any connect(), sendto(), ... syscalls to /dev/log to the socket address of the journal namespace's syslog socket instead, which will transparently do the redirection without requiring use of a mount namespace and mounting over /dev/log. Aside from the above usecase, these hooks can more generally be used to transparently redirect unix sockets to different addresses as required by services. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11selftests/bpf: Add tests for cgroup unix socket address hooksDaan De Meyer
These selftests are written in prog_tests style instead of adding them to the existing test_sock_addr tests. Migrating the existing sock addr tests to prog_tests style is left for future work. This commit adds support for testing bind() sockaddr hooks, even though there's no unix socket sockaddr hook for bind(). We leave this code intact for when the INET and INET6 tests are migrated in the future which do support intercepting bind(). Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-10-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11selftests/bpf: Make sure mount directory existsDaan De Meyer
The mount directory for the selftests cgroup tree might not exist so let's make sure it does exist by creating it ourselves if it doesn't exist. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-9-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11documentation/bpf: Document cgroup unix socket address hooksDaan De Meyer
Update the documentation to mention the new cgroup unix sockaddr hooks. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-8-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11bpftool: Add support for cgroup unix socket address hooksDaan De Meyer
Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into bpftool. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20231011185113.140426-7-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11libbpf: Add support for cgroup unix socket address hooksDaan De Meyer
Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11bpf: Implement cgroup sockaddr hooks for unix socketsDaan De Meyer
These hooks allows intercepting connect(), getsockname(), getpeername(), sendmsg() and recvmsg() for unix sockets. The unix socket hooks get write access to the address length because the address length is not fixed when dealing with unix sockets and needs to be modified when a unix socket address is modified by the hook. Because abstract socket unix addresses start with a NUL byte, we cannot recalculate the socket address in kernelspace after running the hook by calculating the length of the unix socket path using strlen(). These hooks can be used when users want to multiplex syscall to a single unix socket to multiple different processes behind the scenes by redirecting the connect() and other syscalls to process specific sockets. We do not implement support for intercepting bind() because when using bind() with unix sockets with a pathname address, this creates an inode in the filesystem which must be cleaned up. If we rewrite the address, the user might try to clean up the wrong file, leaking the socket in the filesystem where it is never cleaned up. Until we figure out a solution for this (and a use case for intercepting bind()), we opt to not allow rewriting the sockaddr in bind() calls. We also implement recvmsg() support for connected streams so that after a connect() that is modified by a sockaddr hook, any corresponding recmvsg() on the connected socket can also be modified to make the connected program think it is connected to the "intended" remote. Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-5-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11bna: replace deprecated strncpy with strscpy_padJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. bfa_ioc_get_adapter_manufacturer() simply copies a string literal into `manufacturer`. Another implementation of bfa_ioc_get_adapter_manufacturer() from drivers/scsi/bfa/bfa_ioc.c uses memset + strscpy: | void | bfa_ioc_get_adapter_manufacturer(struct bfa_ioc_s *ioc, char *manufacturer) | { | memset((void *)manufacturer, 0, BFA_ADAPTER_MFG_NAME_LEN); | strscpy(manufacturer, BFA_MFG_NAME, BFA_ADAPTER_MFG_NAME_LEN); | } Let's use `strscpy_pad` to eliminate some redundant work while still NUL-terminating and NUL-padding the destination buffer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231009-strncpy-drivers-net-ethernet-brocade-bna-bfa_ioc-c-v2-1-78e0f47985d3@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: dsa: lantiq_gswip: replace deprecated strncpy with ethtool_sprintfJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. ethtool_sprintf() is designed specifically for get_strings() usage. Let's replace strncpy in favor of this more robust and easier to understand interface. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231009-strncpy-drivers-net-dsa-lantiq_gswip-c-v1-1-d55a986a14cc@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: dsa: mt7530: replace deprecated strncpy with ethtool_sprintfJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. ethtool_sprintf() is designed specifically for get_strings() usage. Let's replace strncpy in favor of this more robust and easier to understand interface. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20231009-strncpy-drivers-net-dsa-mt7530-c-v1-1-ec6677a6436a@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: tcp: fix crashes trying to free half-baked MTU probesJakub Kicinski
tcp_stream_alloc_skb() initializes the skb to use tcp_tsorted_anchor which is a union with the destructor. We need to clean that TCP-iness up before freeing. Fixes: 736013292e3c ("tcp: let tcp_mtu_probe() build headless packets") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231010173651.3990234-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: mvpp2: replace deprecated strncpy with strscpyJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect `irqname` to be NUL-terminated based on its use with of_irq_get_byname() -> of_property_match_string() wherein it is used with a format string and a `strcmp`: | pr_debug("comparing %s with %s\n", string, p); | if (strcmp(string, p) == 0) | return i; /* Found it; return index */ NUL-padding is not required as is evident by other assignments to `irqname` which do not NUL-pad: | if (port->flags & MVPP2_F_DT_COMPAT) | snprintf(irqname, sizeof(irqname), "tx-cpu%d", i); | else | snprintf(irqname, sizeof(irqname), "hif%d", i); Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231010-strncpy-drivers-net-ethernet-marvell-mvpp2-mvpp2_main-c-v1-1-51be96ad0324@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11octeontx2-af: replace deprecated strncpy with strscpyJustin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We can see that linfo->lmac_type is expected to be NUL-terminated based on the `... - 1`'s present in the current code. Presumably making room for a NUL-byte at the end of the buffer. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Let's also prefer the more idiomatic strscpy usage of (dest, src, sizeof(dest)) rather than (dest, src, SOME_LEN). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231010-strncpy-drivers-net-ethernet-marvell-octeontx2-af-cgx-c-v1-1-a443e18f9de8@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11Merge tag 'ieee802154-for-net-2023-10-10' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2023-10-10 Just one small fix this time around. Dinghao Liu fixed a potential use-after-free in the ca8210 driver probe function. * tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan: ieee802154: ca8210: Fix a potential UAF in ca8210_probe ==================== Link: https://lore.kernel.org/r/20231010200943.82225-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpfDaan De Meyer
As prep for adding unix socket support to the cgroup sockaddr hooks, let's add a kfunc bpf_sock_addr_set_sun_path() that allows modifying a unix sockaddr from bpf. While this is already possible for AF_INET and AF_INET6, we'll need this kfunc when we add unix socket support since modifying the address for those requires modifying both the address and the sockaddr length. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-4-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11bpf: Propagate modified uaddrlen from cgroup sockaddr programsDaan De Meyer
As prep for adding unix socket support to the cgroup sockaddr hooks, let's propagate the sockaddr length back to the caller after running a bpf cgroup sockaddr hook program. While not important for AF_INET or AF_INET6, the sockaddr length is important when working with AF_UNIX sockaddrs as the size of the sockaddr cannot be determined just from the address family or the sockaddr's contents. __cgroup_bpf_run_filter_sock_addr() is modified to take the uaddrlen as an input/output argument. After running the program, the modified sockaddr length is stored in the uaddrlen pointer. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-3-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11Merge tag 'fs_for_v6.6-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota regression fix from Jan Kara. * tag 'fs_for_v6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: Fix slow quotaoff
2023-10-11Merge tag 'for-6.6-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A revert of recent mount option parsing fix, this breaks mounts with security options. The second patch is a flexible array annotation" * tag 'for-6.6-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add __counted_by for struct btrfs_delayed_item and use struct_size() Revert "btrfs: reject unknown mount options early"
2023-10-11Merge tag 'ata-6.6-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: - Three fixes for the pata_parport driver to address a typo in the code, a missing operation implementation and port reset handling in the presence of slave devices (Ondrej) - Fix handling of ATAPI devices reset with the fit3 protocol driver of the pata_parport driver (Ondrej) - A follow up fix for the recent suspend/resume corrections to avoid attempting rescanning on resume the scsi device associated with an ata disk when the request queue of the scsi device is still suspended (in addition to not doing the rescan if the scsi device itself is still suspended) (me) * tag 'ata-6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: scsi: Do not rescan devices with a suspended queue ata: pata_parport: fit3: implement IDE command set registers ata: pata_parport: add custom version of wait_after_reset ata: pata_parport: implement set_devctl ata: pata_parport: fix pata_parport_devchk
2023-10-11tools: ynl: use ynl-gen -o instead of stdout in MakefileJakub Kicinski
Jiri added more careful handling of output of the code generator to avoid wiping out existing files in commit f65f305ae008 ("tools: ynl-gen: use temporary file for rendering") Make use of the -o option in the Makefiles, it is already used by ynl-regen.sh. Link: https://lore.kernel.org/r/20231010202714.4045168-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11Merge tag 'for-linus-2023101101' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - regression fix for i2c-hid when used on DT platforms (Johan Hovold) - kernel crash fix on removal of the Logitech USB receiver (Hans de Goede) * tag 'for-linus-2023101101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: logitech-hidpp: Fix kernel crash on receiver USB disconnect HID: i2c-hid: fix handling of unpopulated devices
2023-10-11netlink: specs: don't allow version to be specified for genetlinkJiri Pirko
There is no good reason to specify the version for new protocols. Forbid it in genetlink schema. If the future proves me wrong, this restriction could be easily lifted. Move the version definition in between legacy properties in genetlink-legacy. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231010074810.191177-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11Merge branch 'add-vf-fault-detect-support-for-hns3-ethernet-driver'Jakub Kicinski
Jijie Shao says: ==================== add vf fault detect support for HNS3 ethernet driver ==================== Link: https://lore.kernel.org/r/20231007031215.1067758-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: hns3: add vf fault detect supportJie Wang
Currently hns3 driver supports vf fault detect feature. Several ras caused by VF resources don't need to do PF function reset for recovery. The driver only needs to reset the specified VF. So this patch adds process in ras module. New process will get detailed information about ras and do the most correct measures based on these accurate information. Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20231007031215.1067758-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11net: hns3: add hns3 vf fault detect cap bit supportJie Wang
Currently hns3 driver is designed to support VF fault detect feature in new hardwares. For code compatibility, vf fault detect cap bit is added to the driver. Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20231007031215.1067758-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11selftests/bpf: Add missing section name tests for getpeername/getsocknameDaan De Meyer
These were missed when these hooks were first added so add them now instead to make sure every sockaddr hook has a matching section name test. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-2-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11Merge tag 'printk-for-6.6-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk regression fix from Petr Mladek: - Avoid unnecessary wait and try to flush messages before checking pending ones * tag 'printk-for-6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: flush consoles before checking progress
2023-10-11Merge branch 'rework/misc-cleanups' into for-linusPetr Mladek
2023-10-11Merge branch 'skb_segment-testing'David S. Miller
Willem de Bruijn says: ==================== add skb_segment kunit coverage As discussed at netconf last week. Some kernel code is exercised in many different ways. skb_segment is a prime example. This ~350 line function has 49 different patches in git blame with 28 different authors. When making a change, e.g., to fix a bug in one specific use case, it is hard to establish through analysis alone that the change does not break the many other paths through the code. It is impractical to exercise all code paths through regression testing from userspace. Add the minimal infrastructure needed to add KUnit tests to networking, and add code coverage for this function. Patch 1 adds the infra and the first simple test case: a linear skb Patch 2 adds variants with frags[] Patch 3 adds variants with frag_list skbs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: expand skb_segment unit test with frag_list coverageWillem de Bruijn
Expand the test with these variants that use skb frag_list: - GSO_TEST_FRAG_LIST: frag_skb length is gso_size - GSO_TEST_FRAG_LIST_PURE: same, data exclusively in frag skbs - GSO_TEST_FRAG_LIST_NON_UNIFORM: frag_skb length may vary - GSO_TEST_GSO_BY_FRAGS: frag_skb length defines gso_size, i.e., segs may have varying sizes. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: parametrize skb_segment unit test to expand coverageWillem de Bruijn
Expand the test with variants - GSO_TEST_NO_GSO: payload size less than or equal to gso_size - GSO_TEST_FRAGS: payload in both linear and page frags - GSO_TEST_FRAGS_PURE: payload exclusively in page frags - GSO_TEST_GSO_PARTIAL: produce one gso segment of multiple of gso_size, plus optionally one non-gso trailer segment Define a test struct that encodes the input gso skb and output segs. Input in terms of linear and fragment lengths. Output as length of each segment. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: add skb_segment kunit testWillem de Bruijn
Add unit testing for skb segment. This function is exercised by many different code paths, such as GSO_PARTIAL or GSO_BY_FRAGS, linear (with or without head_frag), frags or frag_list skbs, etc. It is infeasible to manually run tests that cover all code paths when making changes. The long and complex function also makes it hard to establish through analysis alone that a patch has no unintended side-effects. Add code coverage through kunit regression testing. Introduce kunit infrastructure for tests under net/core, and add this first test. This first skb_segment test exercises a simple case: a linear skb. Follow-on patches will parametrize the test and add more variants. Tested: Built and ran the test with make ARCH=um mrproper ./tools/testing/kunit/kunit.py run \ --kconfig_add CONFIG_NET=y \ --kconfig_add CONFIG_DEBUG_KERNEL=y \ --kconfig_add CONFIG_DEBUG_INFO=y \ --kconfig_add=CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y \ net_core_gso Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11btrfs: add __counted_by for struct btrfs_delayed_item and use struct_size()Gustavo A. R. Silva
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). While there, use struct_size() helper, instead of the open-coded version, to calculate the size for the allocation of the whole flexible structure, including of course, the flexible-array member. This code was found with the help of Coccinelle, and audited and fixed manually. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-11net/smc: Fix pos miscalculation in statisticsNils Hoppmann
SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) will calculate wrong bucket positions for payloads of exactly 4096 bytes and (1 << (m + 12)) bytes, with m == SMC_BUF_MAX - 1. Intended bucket distribution: Assume l == size of payload, m == SMC_BUF_MAX - 1. Bucket 0 : 0 < l <= 2^13 Bucket n, 1 <= n <= m-1 : 2^(n+12) < l <= 2^(n+13) Bucket m : l > 2^(m+12) Current solution: _pos = fls64((l) >> 13) [...] _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m For l == 4096, _pos == -1, but should be _pos == 0. For l == (1 << (m + 12)), _pos == m, but should be _pos == m - 1. In order to avoid special treatment of these corner cases, the calculation is adjusted. The new solution first subtracts the length by one, and then calculates the correct bucket by shifting accordingly, i.e. _pos = fls64((l - 1) >> 13), l > 0. This not only fixes the issues named above, but also makes the whole bucket assignment easier to follow. Same is done for SMC_STAT_RMB_SIZE_SUB(_smc_stats, _tech, k, _len), where the calculation of the bucket position is similar to the one named above. Fixes: e0e4b8fa5338 ("net/smc: Add SMC statistics support") Suggested-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Nils Hoppmann <niho@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Dust Li <dust.li@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11nfp: flower: avoid rmmod nfp crash issuesYanguo Li
When there are CT table entries, and you rmmod nfp, the following events can happen: task1: nfp_net_pci_remove ↓ nfp_flower_stop->(asynchronous)tcf_ct_flow_table_cleanup_work(3) ↓ nfp_zone_table_entry_destroy(1) task2: nfp_fl_ct_handle_nft_flow(2) When the execution order is (1)->(2)->(3), it will crash. Therefore, in the function nfp_fl_ct_del_flow, nf_flow_table_offload_del_cb needs to be executed synchronously. At the same time, in order to solve the deadlock problem and the problem of rtnl_lock sometimes failing, replace rtnl_lock with the private nfp_fl_lock. Fixes: 7cc93d888df7 ("nfp: flower-ct: remove callback delete deadlock") Cc: stable@vger.kernel.org Signed-off-by: Yanguo Li <yanguo.li@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11wifi: brcmfmac: fweh: Add __counted_by for struct brcmf_fweh_queue_item and ↵Gustavo A. R. Silva
use struct_size() Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Also, relocate `event->datalen = datalen;` to before calling `memcpy(event->data, data, datalen);`, so that the __counted_by annotation has effect, and flex-array member `data` can be properly bounds-checked at run-time. While there, use struct_size() helper, instead of the open-coded version, to calculate the size for the allocation of the whole flexible structure, including of course, the flexible-array member. This code was found with the help of Coccinelle, and audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/ZSRzrIe0345eymk2@work
2023-10-11wifi: hostap: Add __counted_by for struct prism2_download_data and use ↵Gustavo A. R. Silva
struct_size() Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). While there, use struct_size() helper, instead of the open-coded version, to calculate the size for the allocation of the whole flexible structure, including of course, the flexible-array member. This code was found with the help of Coccinelle, and audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/ZSRXXvWMMkm7qqRW@work
2023-10-11wifi: rtw88: Remove duplicate NULL check before calling usb_kill/free_urb()Jinjie Ruan
Both usb_kill_urb() and usb_free_urb() do the NULL check itself, so there is no need to duplicate it prior to calling. Fixes: a82dfd33d123 ("wifi: rtw88: Add common USB chip support") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231008025852.1239450-1-ruanjinjie@huawei.com
2023-10-11net/core: Introduce netdev_core_stats_inc()Yajun Deng
Although there is a kfree_skb_reason() helper function that can be used to find the reason why this skb is dropped, but most callers didn't increase one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. For the users, people are more concerned about why the dropped in ip is increasing. Introduce netdev_core_stats_inc() for trace the caller of dev_core_stats_*_inc(). Also, add __code to netdev_core_stats_alloc(), as it's called with small probability. And add noinline make sure netdev_core_stats_inc was never inlined. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11Merge branch 'dsa-validate-remove'David S. Miller
Russell King says: ==================== net: dsa: remove validate method These three patches remove DSA's phylink .validate method which becomes unnecessary once the last two drivers provide phylink capabilities, which this patch set adds. Both of these are best guesses. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: dsa: remove dsa_port_phylink_validate()Russell King (Oracle)
As all drivers now provide phylink capabilities (including MAC), the if() condition in dsa_port_phylink_validate() will always be true. We will always use the generic validator, which phylink will call itself if the .validate method isn't populated. Thus, there is now no need to implement the .validate method, so this implementation can be removed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: dsa: dsa_loop: add phylink capabilitiesRussell King (Oracle)
Add phylink capabilities for dsa_loop, which I believe being a software construct means that it supports essentially all interface types and all speeds. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11net: dsa: vsc73xx: add phylink capabilitiesRussell King (Oracle)
Add phylink capabilities for vsc73xx. Although this switch driver does populates the .adjust_link method, dsa_slave_phy_setup() will still be used to create phylink instances for the LAN ports, although phylink won't be used for shared links. There are two different classes of switch - 5+1 and 8 port. The 5+1 port switches uses port indicies 0-4 for the user interfaces and 6 for the CPU port. The 8 port is confusing - some comments in the driver imply that port index 7 is used, but the driver actually still uses 6, so that is what we go with. Also, there appear to be no DTs in the kernel tree that are using the 8 port variety. It also looks like port 5 is always skipped. The switch supports 10M, 100M and 1G speeds. It is not clear whether all these speeds are supported on the CPU interface. It also looks like symmetric pause is supported, whether asymmetric pause is as well is unclear. However, it looks like the pause configuration is entirely static, and doesn't depend on negotiation results. So, let's do the best effort we can based on the information found in the driver when creating vsc73xx_phylink_get_caps(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11hv_netvsc: fix netvsc_send_completion to avoid multiple message length checksSonia Sharma
The switch statement in netvsc_send_completion() is incorrectly validating the length of incoming network packets by falling through to the next case. Avoid the fallthrough. Instead break after a case match and then process the complete() call. The current code has not caused any known failures. But nonetheless, the code should be corrected as a different ordering of the switch cases might cause a length check to fail when it should not. Signed-off-by: Sonia Sharma <sonia.sharma@linux.microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11Merge branch 'virtio-net-interrupt-moderation'David S. Miller
Heng Qi says: ==================== virtio-net: Fix and update interrupt moderation The setting of virtio coalescing parameters involves all-queues and per queue, so we must be careful to synchronize the two. Regarding napi_tx switching, this patch set is not only compatible with the previous way of using tx-frames to switch napi_tx, but also improves the user experience when setting interrupt parameters. This patch set has been tested and was part of the previous netdim patch set[1] and is now being split to be rolled out in steps. [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/ --- v2->v3: 1. Fix a tiny comment. v1->v2: 1. Fix some minor comments and add ack tags. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11virtio-net: a tiny comment updateHeng Qi
Update a comment because virtio-net now supports both VIRTIO_NET_F_NOTF_COAL and VIRTIO_NET_F_VQ_NOTF_COAL. Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11virtio-net: fix the vq coalescing setting for vq resizeHeng Qi
According to the definition of virtqueue coalescing spec[1]: Upon disabling and re-enabling a transmit virtqueue, the device MUST set the coalescing parameters of the virtqueue to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set any TX coalescing parameters, to 0. Upon disabling and re-enabling a receive virtqueue, the device MUST set the coalescing parameters of the virtqueue to those configured through the VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command, or, if the driver did not set any RX coalescing parameters, to 0. We need to add this setting for vq resize (ethtool -G) where vq_reset happens. [1] https://lists.oasis-open.org/archives/virtio-dev/202303/msg00415.html Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command") Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11virtio-net: fix per queue coalescing parameter settingHeng Qi
When the user sets a non-zero coalescing parameter to 0 for a specific virtqueue, it does not work as expected, so let's fix this. Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command") Reported-by: Xiaoming Zhao <zxm377917@alibaba-inc.com> Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-11virtio-net: consistently save parameters for per-queueHeng Qi
When using .set_coalesce interface to set all queue coalescing parameters, we need to update both per-queue and global save values. Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command") Cc: Gavin Li <gavinl@nvidia.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>