Age | Commit message | Author
2024-12-18Merge tag 'for-6.13-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - tree-checker catches invalid number of inline extent references - zoned mode fixes: - enhance zone append IO command so it also detects emulated writes - handle bio splitting at sectorsize boundary - when deleting a snapshot, fix a condition for visiting nodes in reloc trees * tag 'for-6.13-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: tree-checker: reject inline extent items with 0 ref count btrfs: split bios to the fs sector size boundary btrfs: use bio_is_zone_append() in the completion handler btrfs: fix improper generation check in snapshot delete
2024-12-18Merge tag 'cxl-fixes-6.13-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Ira Weiny: - prevent probe failure when non-critical RAS unmasking fails - fix CXL 1.1 link status sysfs attribute - fix 4 way (and greater) switch interleave region creation * tag 'cxl-fixes-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix region creation for greater than x2 switches cxl/pci: Check dport->regs.rcd_pcie_cap availability before accessing cxl/pci: Fix potential bogus return value upon successful probing
2024-12-18Merge tag 'selinux-pr-20241217' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to improve our handling of unknown extended permissions by safely ignoring them. Not only does this make it easier to support newer SELinux policy on older kernels in the future, it removes two BUG() calls from the SELinux code." * tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: ignore unknown extended permissions
2024-12-18Merge branch 'selftests-net-packetdrill-import-multiple-tests'Jakub Kicinski
Soham Chakradeo says: ==================== selftests/net: packetdrill: import multiple tests Import tests for the following features (folder names in brackets): ECN (ecn) : RFC 3168 Close (close) : RFC 9293 TCP_INFO (tcp_info) : RFC 9293 Fast recovery (fast_recovery) : RFC 5681 Timestamping (timestamping) : RFC 1323 Nagle (nagle) : RFC 896 Selective Acknowledgments (sack) : RFC 2018 Recent Timestamp (ts_recent) : RFC 1323 Send file (sendfile) Syscall bad arg (syscall_bad_arg) Validate (validate) Blocking (blocking) Splice (splice) End of record (eor) Limited transmit (limited_transmit) Procedure to import and test the packetdrill tests into upstream linux is explained in the first patch of this series. These tests have many authors. We only import them here from github.com/google/packetdrill. Thanks to the following authors for their contributions over the years to these tests: Neal Cardwell, Shuo Chen, Yuchung Cheng, Jerry Chu, Eric Dumazet, Luke Hsiao, Priyaranjan Jha, Chonggang Li, Tanner Love, John Sperbeck, Wei Wang and Maciej Żenczykowski. For more info see the original github commits, such as https://github.com/google/packetdrill/commit/8229c94928ac. ==================== Link: https://patch.msgid.link/20241217185203.297935-1-sohamch.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18selftests/net: packetdrill: import tcp/user_timeout, tcp/validate, ↵Soham Chakradeo
tcp/sendfile, tcp/limited-transmit, tcp/syscall_bad_arg Use the standard import and testing method, as described in the import of tcp/ecn and tcp/close , tcp/sack , tcp/tcp_info. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Soham Chakradeo <sohamch@google.com> Link: https://patch.msgid.link/20241217185203.297935-5-sohamch.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18selftests/net: packetdrill: import tcp/eor, tcp/splice, tcp/ts_recent, ↵Soham Chakradeo
tcp/blocking Use the standard import and testing method, as described in the import of tcp/ecn and tcp/close , tcp/sack , tcp/tcp_info. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Soham Chakradeo <sohamch@google.com> Link: https://patch.msgid.link/20241217185203.297935-4-sohamch.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18selftests/net: packetdrill: import tcp/fast_recovery, tcp/nagle, ↵Soham Chakradeo
tcp/timestamping Use the standard import and testing method, as described in the import of tcp/ecn , tcp/close , tcp/sack , tcp/tcp_info. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Soham Chakradeo <sohamch@google.com> Link: https://patch.msgid.link/20241217185203.297935-3-sohamch.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18selftests/net: packetdrill: import tcp/ecn, tcp/close, tcp/sack, tcp/tcp_infoSoham Chakradeo
Same as initial tests, import verbatim from github.com/google/packetdrill, aside from: - update `source ./defaults.sh` path to adjust for flat dir - add SPDX headers - remove author statements if any - drop blank lines at EOF Same test process as previous tests. Both with and without debug mode. Recording the steps once: make mrproper vng --build \ --config tools/testing/selftests/net/packetdrill/config \ --config kernel/configs/debug.config vng -v --run . --user root --cpus 4 -- \ make -C tools/testing/selftests TARGETS=net/packetdrill run_tests Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Soham Chakradeo <sohamch@google.com> Link: https://patch.msgid.link/20241217185203.297935-2-sohamch.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18wifi: wlcore: sysfs: constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-3-ec460b91f274@weissschuh.net
2024-12-18wifi: brcmfmac: clarify unmodifiable headroom log messageAlex Shumsky
Replace misleading log "insufficient headroom (0)" with more clear "unmodifiable headroom". Signed-off-by: Alex Shumsky <alexthreed@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241213081402.625003-1-alexthreed@gmail.com
2024-12-18Merge tag 'trace-v6.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Replace trace_check_vprintf() with test_event_printk() and ignore_event() The function test_event_printk() checks on boot up if the trace event printf() formats dereference any pointers, and if they do, it then looks at the arguments to make sure that the pointers they dereference will exist in the event on the ring buffer. If they do not, it issues a WARN_ON() as it is a likely bug. But this isn't the case for the strings that can be dereferenced with "%s", as some trace events (notably RCU and some IPI events) save a pointer to a static string in the ring buffer. As the string it points to lives as long as the kernel is running, it is not a bug to reference it, as it is guaranteed to be there when the event is read. But it is also possible (and a common bug) to point to some allocated string that could be freed before the trace event is read and the dereference is to bad memory. This case requires a run time check. The previous way to handle this was with trace_check_vprintf() that would process the printf format piece by piece and send what it didn't care about to vsnprintf() to handle arguments that were not strings. This kept it from having to reimplement vsnprintf(). But it relied on va_list implementation and for architectures that copied the va_list and did not pass it by reference, it wasn't even possible to do this check and it would be skipped. As 64bit x86 passed va_list by reference, most events were tested and this kept out bugs where strings would have been dereferenced after being freed. Instead of relying on the implementation of va_list, extend the boot up test_event_printk() function to validate all the "%s" strings that can be validated at boot, and for the few events that point to strings outside the ring buffer, flag both the event and the field that is dereferenced as "needs_test". 
Then before the event is printed, a call to ignore_event() is made, and if the event has the flag set, it iterates all its fields and for every field that is to be tested, it will read the pointer directly from the event in the ring buffer and make sure that it is valid. If the pointer is not valid, it will print a WARN_ON(), print out to the trace that the event has unsafe memory and ignore the print format. With this new update, the trace_check_vprintf() can be safely removed and now all events can be verified regardless of architecture" * tag 'trace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Check "%s" dereference via the field and not the TP_printk format tracing: Add "%s" check in test_event_printk() tracing: Add missing helper functions in event pointer dereference check tracing: Fix test_event_printk() to process entire print argument
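The run-time check described above can be modelled in a few lines. This is a user-space sketch, not the kernel code: it borrows the name ignore_event() from the message, while the `Field` and `Event` classes and the `is_valid_ptr` predicate are hypothetical, and pointer validity is reduced to whatever the caller supplies.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    name: str
    needs_test: bool = False  # set at "boot" for "%s" fields pointing outside the buffer


@dataclass
class Event:
    fields: list = field(default_factory=list)
    needs_test: bool = False  # set when at least one field needs a run-time check


def ignore_event(event, record, is_valid_ptr):
    """Return True when the event must not be formatted.

    `record` maps field names to the raw pointer value stored in the
    ring buffer; `is_valid_ptr` is a hypothetical validity predicate.
    """
    if not event.needs_test:
        return False  # fast path: everything was validated up front
    for f in event.fields:
        if f.needs_test and not is_valid_ptr(record[f.name]):
            return True  # unsafe memory: skip the print format
    return False
```

Only flagged events pay for the per-field walk, which matches the message's point that most "%s" uses can be validated once at boot.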
2024-12-18Merge tag 'hyperv-fixes-signed-20241217' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Various fixes to Hyper-V tools in the kernel tree (Dexuan Cui, Olaf Hering, Vitaly Kuznetsov) - Fix a bug in the Hyper-V TSC page based sched_clock() (Naman Jain) - Two bug fixes in the Hyper-V utility functions (Michael Kelley) - Convert open-coded timeouts to secs_to_jiffies() in Hyper-V drivers (Easwar Hariharan) * tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: tools/hv: reduce resource usage in hv_kvp_daemon tools/hv: add a .gitignore file tools/hv: reduce resouce usage in hv_get_dns_info helper hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet Drivers: hv: util: Don't force error code to ENODEV in util_probe() tools/hv: terminate fcopy daemon if read from uio fails drivers: hv: Convert open-coded timeouts to secs_to_jiffies() tools: hv: change permissions of NetworkManager configuration file x86/hyperv: Fix hv tsc page based sched_clock for hibernation tools: hv: Fix a complier warning in the fcopy uio daemon
2024-12-18x86/static-call: fix 32-bit buildJuergen Gross
In 32-bit x86 builds CONFIG_STATIC_CALL_INLINE isn't set, leading to static_call_initialized not being available. Define it as "0" in that case. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 0ef8047b737d ("x86/static-call: provide a way to do very early static-call updates") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-18net: Remove bouncing hippi listDr. David Alan Gilbert
linux-hippi is bouncing with: <linux-hippi@sunsite.dk>: Sorry, no mailbox here by that name. (#5.1.1) Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-18net: dsa: qca8k: Fix inconsistent use of jiffies vs millisecondsAndrew Lunn
wait_for_completion_timeout() expects a timeout in jiffies. Within the driver, some call sites converted QCA8K_ETHERNET_TIMEOUT to jiffies, others did not. Make the code consistent by changing the #define to include a call to msecs_to_jiffies(), and remove all other calls to msecs_to_jiffies(). Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: from Christian would be very welcome. Signed-off-by: David S. Miller <davem@davemloft.net>
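Why the unit mismatch matters can be seen with a small user-space sketch. It assumes a simplified round-up conversion and hypothetical HZ and timeout values; it is neither the kernel's msecs_to_jiffies() implementation nor the driver's actual constant.

```python
def msecs_to_jiffies(ms: int, hz: int) -> int:
    """Simplified model: round up so a timeout never expires early."""
    return (ms * hz + 999) // 1000


def jiffies_to_msecs(ticks: int, hz: int) -> int:
    """Wall-clock duration of `ticks` jiffies at tick rate `hz`."""
    return ticks * 1000 // hz


TIMEOUT_MS = 5  # hypothetical timeout, in milliseconds

for hz in (100, 250, 1000):
    converted = msecs_to_jiffies(TIMEOUT_MS, hz)
    # Passing the raw millisecond count as jiffies waits this long instead:
    unconverted_wait = jiffies_to_msecs(TIMEOUT_MS, hz)
    print(hz, converted, unconverted_wait)
```

The two interpretations only agree when HZ is 1000, which is why folding the conversion into the #define removes the inconsistency at every call site at once.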
2024-12-18pwm: stm32: Fix complementary output in round_waveform_tohw()Fabrice Gasnier
When the timer supports complementary output, the CCxNE bit must be set additionally to the CCxE bit. So to not overwrite the latter use |= instead of = to set the former. Fixes: deaba9cff809 ("pwm: stm32: Implementation of the waveform callbacks") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Link: https://lore.kernel.org/r/20241217150021.2030213-1-fabrice.gasnier@foss.st.com [ukleinek: Slightly improve commit log] Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
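The difference between `=` and `|=` here is plain bit arithmetic. The following is an illustrative sketch only: the bit positions and the helper names `enable_buggy` and `enable_fixed` are assumptions, not taken from the driver.

```python
CC1E = 1 << 0   # output enable bit (position assumed for illustration)
CC1NE = 1 << 2  # complementary output enable bit (position assumed)


def enable_buggy(complementary: bool) -> int:
    ccer = CC1E
    if complementary:
        ccer = CC1NE   # plain assignment discards CC1E
    return ccer


def enable_fixed(complementary: bool) -> int:
    ccer = CC1E
    if complementary:
        ccer |= CC1NE  # OR-assignment keeps CC1E and adds CC1NE
    return ccer
```

With the buggy variant a timer that supports complementary output ends up with only the complementary bit set, so the main output stays disabled.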
2024-12-18Merge patch series "can: m_can: set init flag earlier in probe"Marc Kleine-Budde
This series fixes problems in the m_can_pci driver found on the Intel Elkhart Lake processor. Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18can: m_can: fix missed interrupts with m_can_pciMatthias Schiffer
The interrupt line of PCI devices is interpreted as edge-triggered, however the interrupt signal of the m_can controller integrated in Intel Elkhart Lake CPUs appears to be generated level-triggered. Consider the following sequence of events: - IR register is read, interrupt X is set - A new interrupt Y is triggered in the m_can controller - IR register is written to acknowledge interrupt X. Y remains set in IR As at no point in this sequence is IR free of interrupt flags, the m_can interrupt line will never become deasserted, and no edge will ever be observed to trigger another run of the ISR. This was observed to result in the TX queue of the EHL m_can getting stuck under high load, because frames were queued to the hardware in m_can_start_xmit(), but m_can_finish_tx() was never run to account for their successful transmission. On an Elkhart Lake based board with the two CAN interfaces connected to each other, the following script can reproduce the issue: ip link set can0 up type can bitrate 1000000 ip link set can1 up type can bitrate 1000000 cangen can0 -g 2 -I 000 -L 8 & cangen can0 -g 2 -I 001 -L 8 & cangen can0 -g 2 -I 002 -L 8 & cangen can0 -g 2 -I 003 -L 8 & cangen can0 -g 2 -I 004 -L 8 & cangen can0 -g 2 -I 005 -L 8 & cangen can0 -g 2 -I 006 -L 8 & cangen can0 -g 2 -I 007 -L 8 & cangen can1 -g 2 -I 100 -L 8 & cangen can1 -g 2 -I 101 -L 8 & cangen can1 -g 2 -I 102 -L 8 & cangen can1 -g 2 -I 103 -L 8 & cangen can1 -g 2 -I 104 -L 8 & cangen can1 -g 2 -I 105 -L 8 & cangen can1 -g 2 -I 106 -L 8 & cangen can1 -g 2 -I 107 -L 8 & stress-ng --matrix 0 & To fix the issue, repeatedly read and acknowledge interrupts at the start of the ISR until no interrupt flags are set, so the next incoming interrupt will also result in an edge on the interrupt line. 
While we have received a report that even with this patch, the TX queue can become stuck under certain (currently unknown) circumstances on the Elkhart Lake, this patch completely fixes the issue with the above reproducer, and it is unclear whether the remaining issue has a similar cause at all. Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
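The X/Y sequence from this message can be replayed in a toy model. This is a sketch under stated assumptions, not driver code: the `Controller` class and both ISR functions are hypothetical, the line is modelled as asserted while any IR flag is set, and an edge is counted only on a 0 to non-zero transition.

```python
class Controller:
    """Toy model: the interrupt line is asserted while IR is non-zero."""

    def __init__(self):
        self.ir = 0
        self.edges = 0  # what an edge-triggered receiver actually observes

    def raise_irq(self, bit):
        was_asserted = self.ir != 0
        self.ir |= bit
        if not was_asserted:
            self.edges += 1

    def ack(self, bits):
        self.ir &= ~bits


def isr_single_ack(ctrl, arriving=0):
    """Read IR once and acknowledge only what was read."""
    pending = ctrl.ir
    if arriving:
        ctrl.raise_irq(arriving)  # new flag lands between read and ack
    ctrl.ack(pending)
    return pending


def isr_loop_ack(ctrl, arriving=0):
    """Read and acknowledge repeatedly until IR is clear."""
    handled = 0
    while True:
        pending = ctrl.ir
        if not pending:
            return handled
        if arriving:
            ctrl.raise_irq(arriving)
            arriving = 0
        ctrl.ack(pending)
        handled |= pending
```

In the single-acknowledge variant the late flag stays set, the line never drops, and no further edge is generated; the loop variant leaves IR clear so the next interrupt produces a fresh edge.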
2024-12-18can: m_can: set init flag earlier in probeMatthias Schiffer
While an m_can controller usually already has the init flag from a hardware reset, no such reset happens on the integrated m_can_pci of the Intel Elkhart Lake. If the CAN controller is found in an active state, m_can_dev_setup() would fail because m_can_niso_supported() calls m_can_cccr_update_bits(), which refuses to modify any other configuration bits when CCCR_INIT is not set. To avoid this issue, set CCCR_INIT before attempting to modify any other configuration flags. Fixes: cd5a46ce6fa6 ("can: m_can: don't enable transceiver when probing") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
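The ordering problem can be sketched as a register that refuses configuration writes unless the init bit is already set. Everything below is a simplified model: the `MCan` class, the boolean return value, and the bit positions are assumptions for illustration, not the driver's m_can_cccr_update_bits().

```python
CCCR_INIT = 1 << 0   # init flag (bit position assumed)
CCCR_NISO = 1 << 15  # example configuration bit (position assumed)


class MCan:
    def __init__(self, cccr: int):
        self.cccr = cccr

    def update_bits(self, mask: int, val: int) -> bool:
        """Refuse to touch non-INIT bits while INIT is clear."""
        touches_config = bool(mask & ~CCCR_INIT)
        if touches_config and not (self.cccr & CCCR_INIT):
            return False
        self.cccr = (self.cccr & ~mask) | (val & mask)
        return True


def probe(dev: MCan) -> bool:
    # Set INIT first, so an already-active controller can be configured.
    dev.update_bits(CCCR_INIT, CCCR_INIT)
    return dev.update_bits(CCCR_NISO, CCCR_NISO)
```

A controller that comes out of a hardware reset already has INIT set and passes either way; one found active, as on the integrated part described above, only passes when INIT is set first.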
2024-12-17Merge branch 'support-some-features-for-the-hibmcge-driver'Jakub Kicinski
Jijie Shao says: ==================== Support some features for the HIBMCGE driver In this patch series, the HIBMCGE driver implements some functions such as dump register, unicast MAC address filtering, debugfs and reset. ==================== Link: https://patch.msgid.link/20241216040532.1566229-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: hibmcge: Add nway_reset supported in this moduleJijie Shao
Add nway_reset supported in this module Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241216040532.1566229-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: hibmcge: Add reset supported in this moduleJijie Shao
Sometimes, if the port doesn't work, we can try to fix it by resetting it. This patch supports reset triggered by ethtool or FLR of PCIe. For example: ethtool --reset eth0 dedicated echo 1 > /sys/bus/pci/devices/0000\:83\:00.1/reset We hope that the reset can be performed only when the port is down, and the port cannot be up during the reset. Therefore, the entire reset process is protected by the rtnl lock. After the reset is complete, the hardware registers are restored to their default values. Therefore, some rebuild operations are required to rewrite the user configuration to the registers. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216040532.1566229-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: hibmcge: Add pauseparam supported in this moduleJijie Shao
The MAC can automatically send or respond to pause frames. This patch supports the function of enabling pause frames by using ethtool. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241216040532.1566229-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: hibmcge: Add register dump supported in this moduleJijie Shao
Dumping registers is an effective way to analyze problems. To keep the code flexible, each dumped register carries its type, offset, and value. ethtool pretty-prints the dump based on this information. Because the type and offset travel with each entry, the driver can add or remove dumped registers in the future and ethtool can still pretty-print them. With a specific version of ethtool, the output looks like this: [root@localhost sjj]# ./ethtool -d enp131s0f1 [SPEC] VALID [0x0000]: 0x00000001 [SPEC] EVENT_REQ [0x0004]: 0x00000000 [SPEC] MAC_ID [0x0008]: 0x00000002 [SPEC] PHY_ADDR [0x000c]: 0x00000002 [SPEC] MAC_ADDR_L [0x0010]: 0x00000808 [SPEC] MAC_ADDR_H [0x0014]: 0x08080802 [SPEC] UC_MAX_NUM [0x0018]: 0x00000004 [SPEC] MAX_MTU [0x0028]: 0x00000fc2 [SPEC] MIN_MTU [0x002c]: 0x00000100 [SPEC] TX_FIFO_NUM [0x0030]: 0x00000040 [SPEC] RX_FIFO_NUM [0x0034]: 0x0000007f [SPEC] VLAN_LAYERS [0x0038]: 0x00000002 [MDIO] COMMAND_REG [0x0000]: 0x0000185f [MDIO] ADDR_REG [0x0004]: 0x00000000 [MDIO] WDATA_REG [0x0008]: 0x0000a000 [MDIO] RDATA_REG [0x000c]: 0x00000000 [MDIO] STA_REG [0x0010]: 0x00000000 [GMAC] DUPLEX_TYPE [0x0008]: 0x00000001 [GMAC] FD_FC_TYPE [0x000c]: 0x00008808 [GMAC] FC_TX_TIMER [0x001c]: 0x000000ff [GMAC] FD_FC_ADDR_LOW [0x0020]: 0xc2000001 [GMAC] FD_FC_ADDR_HIGH [0x0024]: 0x00000180 [GMAC] MAX_FRM_SIZE [0x003c]: 0x000005f6 [GMAC] PORT_MODE [0x0040]: 0x00000002 [GMAC] PORT_EN [0x0044]: 0x00000006 ... Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216040532.1566229-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
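The self-describing layout (type, offset, value per register) is what lets a generic printer handle entries it has never seen. A minimal sketch of that idea follows; the `RegEntry` type and `format_dump` function are hypothetical, and the real driver structure and ethtool formatting may differ.

```python
from typing import NamedTuple


class RegEntry(NamedTuple):
    reg_type: str  # register block, e.g. "SPEC" or "MDIO"
    name: str
    offset: int
    value: int


def format_dump(entries):
    """Render each entry without any per-register knowledge in the printer."""
    return [
        f"[{e.reg_type}] {e.name} [0x{e.offset:04x}]: 0x{e.value:08x}"
        for e in entries
    ]


# Sample values copied from the dump shown in the commit message.
sample = [
    RegEntry("SPEC", "VALID", 0x0000, 0x00000001),
    RegEntry("MDIO", "COMMAND_REG", 0x0000, 0x0000185F),
    RegEntry("GMAC", "PORT_EN", 0x0044, 0x00000006),
]

for line in format_dump(sample):
    print(line)
```

Adding or dropping an entry in `sample` needs no change to `format_dump`, which is the flexibility the message describes.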
2024-12-17net: hibmcge: Add unicast frame filter supported in this moduleJijie Shao
MAC supports filtering unmatched unicast packets according to the MAC address table. This patch adds the support for unicast frame filtering. To support automatic restoration of MAC entries after reset, the driver saves a copy of MAC entries in the driver. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://patch.msgid.link/20241216040532.1566229-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: hibmcge: Add irq_info file to debugfsJijie Shao
The driver requests three interrupts: "tx", "rx", "err". The err interrupt is a summary interrupt. We distinguish different errors based on the status register and mask. With "cat /proc/interrupts | grep hibmcge", we can't distinguish the detailed cause of the error, so we added this file to debugfs. The output looks like this: [root@localhost sjj]# cat /sys/kernel/debug/hibmcge/0000\:83\:00.1/irq_info RX : enabled: true , logged: false, count: 0 TX : enabled: true , logged: false, count: 0 MAC_MII_FIFO_ERR : enabled: false, logged: true , count: 0 MAC_PCS_RX_FIFO_ERR : enabled: false, logged: true , count: 0 MAC_PCS_TX_FIFO_ERR : enabled: false, logged: true , count: 0 MAC_APP_RX_FIFO_ERR : enabled: false, logged: true , count: 0 MAC_APP_TX_FIFO_ERR : enabled: false, logged: true , count: 0 SRAM_PARITY_ERR : enabled: true , logged: true , count: 0 TX_AHB_ERR : enabled: true , logged: true , count: 0 RX_BUF_AVL : enabled: true , logged: false, count: 0 REL_BUF_ERR : enabled: true , logged: true , count: 0 TXCFG_AVL : enabled: true , logged: false, count: 0 TX_DROP : enabled: true , logged: false, count: 0 RX_DROP : enabled: true , logged: false, count: 0 RX_AHB_ERR : enabled: true , logged: true , count: 0 MAC_FIFO_ERR : enabled: true , logged: false, count: 0 RBREQ_ERR : enabled: true , logged: false, count: 0 WE_ERR : enabled: true , logged: false, count: 0 The irq framework of hibmcge driver also includes tx/rx interrupts. Therefore, TX and RX are not moved separately from this file. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216040532.1566229-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
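Decoding a summary interrupt from a status register and a mask amounts to testing individual bits and counting per cause. The sketch below illustrates that; the bit positions and the `decode_err_irq` helper are assumptions, with only the cause names taken from the output above.

```python
# Cause names come from the irq_info output; bit positions are assumed.
IRQ_BITS = {
    "SRAM_PARITY_ERR": 1 << 0,
    "TX_AHB_ERR": 1 << 1,
    "RX_AHB_ERR": 1 << 2,
}


def decode_err_irq(status: int, mask: int, counts: dict) -> list:
    """Return the enabled causes set in `status` and bump their counters."""
    fired = [name for name, bit in IRQ_BITS.items() if status & mask & bit]
    for name in fired:
        counts[name] = counts.get(name, 0) + 1
    return fired
```

A per-cause counter of this kind is the detail that a single "err" line in /proc/interrupts cannot provide.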
2024-12-17net: hibmcge: Add debugfs supported in this moduleJijie Shao
This patch initializes debugfs and creates root directory for each device. The tx_ring and rx_ring debugfs files are implemented together. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216040532.1566229-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17Merge branch 'lan78xx-preparations-for-phylink'Jakub Kicinski
Oleksij Rempel says: ==================== lan78xx: Preparations for PHYlink This patch set is a third part of the preparatory work for migrating the lan78xx USB Ethernet driver to the PHYlink framework. During extensive testing, I observed that resetting the USB adapter can lead to various read/write errors. While the errors themselves are acceptable, they generate excessive log messages, resulting in significant log spam. This set improves error handling to reduce logging noise by addressing errors directly and returning early when necessary. ==================== Link: https://patch.msgid.link/20241216120941.1690908-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: usb: lan78xx: Improve error handling in WoL operationsOleksij Rempel
Enhance error handling in Wake-on-LAN (WoL) operations: - Log a warning in `lan78xx_get_wol` if `lan78xx_read_reg` fails. - Check and handle errors from `device_set_wakeup_enable` and `phy_ethtool_set_wol` in `lan78xx_set_wol`. - Ensure proper cleanup with a unified error handling path. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241216120941.1690908-7-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: usb: lan78xx: remove PHY register access from ethtool get_regsOleksij Rempel
Remove PHY register handling from `lan78xx_get_regs` and `lan78xx_get_regs_len`. Since the controller can have different PHYs attached, the first 32 registers are not universally relevant or the most interesting. Simplify the implementation to focus on MAC and device registers. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241216120941.1690908-6-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: usb: lan78xx: rename phy_mutex to mdiobus_mutexOleksij Rempel
Rename `phy_mutex` to `mdiobus_mutex` for clarity, as the mutex protects MDIO bus access rather than PHY-specific operations. Update all references to ensure consistency. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241216120941.1690908-5-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: usb: lan78xx: Use action-specific label in lan78xx_mac_resetOleksij Rempel
Rename the generic `done` label to the action-specific `exit_unlock` label in `lan78xx_mac_reset`. This improves clarity by indicating the specific cleanup action (mutex unlock) and aligns with best practices for error handling and cleanup labels. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Link: https://patch.msgid.link/20241216120941.1690908-4-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: usb: lan78xx: Use ETIMEDOUT instead of ETIME in lan78xx_stop_hwOleksij Rempel
Update lan78xx_stop_hw to return -ETIMEDOUT instead of -ETIME when a timeout occurs. While -ETIME indicates a general timer expiration, -ETIMEDOUT is more commonly used for signaling operation timeouts and provides better consistency with standard error handling in the driver. The -ETIME checks in tx_complete() and rx_complete() are unrelated to this error handling change. In these functions, the error values are derived from urb->status, which reflects USB transfer errors. The error value from lan78xx_stop_hw will be exposed in the following cases: - usb_driver::suspend - net_device_ops::ndo_stop (potentially, though currently the return value is not used). Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Link: https://patch.msgid.link/20241216120941.1690908-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
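The two error codes are distinct values with distinct standard descriptions, which can be checked from user space. This sketch uses Python's `errno` module on a Linux host; the `describe` helper is hypothetical, and the exact numbers and message strings are platform-dependent.

```python
import errno
import os


def describe(code: int) -> tuple:
    """Return the symbolic name and the C library's message for an errno."""
    return errno.errorcode[code], os.strerror(code)


for code in (errno.ETIME, errno.ETIMEDOUT):
    name, message = describe(code)
    print(f"{name} ({code}): {message}")
```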
2024-12-17net: usb: lan78xx: Add error handling to lan78xx_get_regsOleksij Rempel
Update `lan78xx_get_regs` to handle errors during register and PHY reads. Log warnings for failed reads and exit the function early if an error occurs. Drop all previously logged registers to signal inconsistent readings to the user space. This ensures that invalid data is not returned to users. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241216120941.1690908-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
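The "warn, drop what was read, return early" pattern can be sketched as follows. This is a user-space model with hypothetical names (`get_regs`, `read_reg`, and the fake reader); the real function fills a kernel buffer rather than returning a list.

```python
def get_regs(read_reg, offsets):
    """Read every register, or return an all-zero buffer on the first failure."""
    buf = [0] * len(offsets)
    for i, off in enumerate(offsets):
        try:
            buf[i] = read_reg(off)
        except OSError as err:
            print(f"warning: failed to read register 0x{off:03x}: {err}")
            # Drop the values read so far so user space never sees a
            # mix of real and stale data.
            return [0] * len(offsets)
    return buf


def make_reader(fail_at=None):
    """Build a fake register reader that fails at one offset (illustration only)."""
    def read_reg(off):
        if off == fail_at:
            raise OSError("simulated read error")
        return off + 1
    return read_reg
```

Returning zeros for the whole buffer makes a failed dump obvious, instead of leaving the reader to guess which entries are trustworthy.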
2024-12-17niu: Use page->private instead of page->indexMatthew Wilcox (Oracle)
We are close to removing page->index. Use page->private instead, which is least likely to be removed. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://patch.msgid.link/20241216155124.3114-1-willy@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17mlxsw: Switch to napi_gro_receive()Ido Schimmel
Benefit from the recent conversion of the driver to NAPI and enable GRO support through the use of napi_gro_receive(). Pass the NAPI pointer from the bus driver (mlxsw_pci) to the switch driver (mlxsw_spectrum) through the skb control block where various packet metadata is already encoded. The main motivation is to improve forwarding performance through the use of GRO fraglist [1]. In my testing, when the forwarding data path is simple (routing between two ports) there is not much difference in forwarding performance between GRO disabled and GRO enabled with fraglist. The improvement becomes more noticeable as the data path becomes more complex since it is traversed fewer times with GRO enabled. For example, with 10 ingress and 10 egress flower filters with different priorities on the two ports between which routing is performed, there is an improvement of about 140% in forwarded bandwidth. [1] https://lore.kernel.org/netdev/20200125102645.4782-1-steffen.klassert@secunet.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/21258fe55f608ccf1ee2783a5a4534220af28903.1734354812.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17Merge branch 'inetpeer-reduce-false-sharing-and-atomic-operations'Jakub Kicinski
Eric Dumazet says: ==================== inetpeer: reduce false sharing and atomic operations After commit 8c2bd38b95f7 ("icmp: change the order of rate limits"), there is a risk that a host receiving packets from a unique source targeting closed ports is using a common inet_peer structure from many cpus. All these cpus have to acquire/release a refcount and update the inet_peer timestamp (p->dtime). Switch to pure RCU to avoid changing the refcount, and update p->dtime only once per jiffy. Tested: DUT : 128 cores, 32 hw rx queues. receiving 8,400,000 UDP packets per second, targeting closed ports. Before the series: - napi poll can not keep up, NIC drops 1,200,000 packets per second. - We use 20 % of cpu cycles After this series: - All packets are received (no more hw drops) - We use 12 % of cpu cycles. v1: https://lore.kernel.org/20241213130212.1783302-1-edumazet@google.com ==================== Link: https://patch.msgid.link/20241215175629.1248773-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
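Updating the timestamp "only once per jiffy" means skipping the store when the value would not change, so a contended cache line is written far less often. A single-threaded sketch of that idea follows; the `Peer` class, the `touch` helper, and the write counter are hypothetical and stand in for the shared structure.

```python
class Peer:
    def __init__(self):
        self.dtime = 0
        self.writes = 0  # counts stores, standing in for cache-line dirtying


def touch(peer: Peer, now_jiffies: int) -> None:
    """Store the timestamp only when it actually changes."""
    if peer.dtime != now_jiffies:
        peer.dtime = now_jiffies
        peer.writes += 1


peer = Peer()
for _ in range(1000):
    touch(peer, 42)  # many lookups within the same jiffy
touch(peer, 43)      # first lookup in the next jiffy
print(peer.writes)
```

Reads of an unchanged value can stay shared between CPUs, whereas every store forces exclusive ownership of the line, which is the false sharing the series targets.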
2024-12-17inetpeer: do not get a refcount in inet_getpeer()Eric Dumazet
All inet_getpeer() callers except ip4_frag_init() don't need to acquire a permanent refcount on the inetpeer. They can switch to full RCU protection. Move the refcount_inc_not_zero() into ip4_frag_init(), so that all the other callers no longer have to perform a pair of expensive atomic operations on a possibly contended cache line. inet_putpeer() no longer needs to be exported. After this patch, my DUT can receive 8,400,000 UDP packets per second targeting closed ports, using 50% fewer cpu cycles than before. Also replace two calls to l3mdev_master_ifindex() with l3mdev_master_ifindex_rcu() (Ido's idea). Fixes: 8c2bd38b95f7 ("icmp: change the order of rate limits") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241215175629.1248773-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17inetpeer: update inetpeer timestamp in inet_getpeer()Eric Dumazet
inet_putpeer() will be removed in the following patch, because we will no longer use refcounts. Update inetpeer timestamp (p->dtime) at lookup time. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241215175629.1248773-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17inetpeer: remove create argument of inet_getpeer()Eric Dumazet
All callers of inet_getpeer() want to create an inetpeer. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241215175629.1248773-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17inetpeer: remove create argument of inet_getpeer_v[46]()Eric Dumazet
All callers of inet_getpeer_v4() and inet_getpeer_v6() want to create an inetpeer. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241215175629.1248773-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17Merge branch 'net-constify-struct-bin_attribute'Jakub Kicinski
Thomas Weißschuh says: ==================== net: constify 'struct bin_attribute' The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. ==================== Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-0-ec460b91f274@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17netxen_nic: constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-4-ec460b91f274@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: phy: ks8995: constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-2-ec460b91f274@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: bridge: constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241216-sysfs-const-bin_attr-net-v1-1-ec460b91f274@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17rtnetlink: Try the outer netns attribute in rtnl_get_peer_net().Kuniyuki Iwashima
Xiao Liang reported that the cited commit changed netns handling in newlink() of netkit, veth, and vxcan.

Before the patch, if we didn't find a netns attribute in the peer device attributes, we tried to find another netns attribute in the outer netlink attributes by passing it to rtnl_link_get_net().

Let's restore the original behaviour.

Fixes: 48327566769a ("rtnetlink: fix double call of rtnl_link_get_net_ifla()")
Reported-by: Xiao Liang <shaw.leon@gmail.com>
Closes: https://lore.kernel.org/netdev/CABAhCORBVVU8P6AHcEkENMj+gD2d3ce9t=A_o48E0yOQp8_wUQ@mail.gmail.com/#t
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Xiao Liang <shaw.leon@gmail.com>
Link: https://patch.msgid.link/20241216110432.51488-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: netdevsim: fix nsim_pp_hold_write()Eric Dumazet
nsim_pp_hold_write() has two problems:

1) It may return with rtnl held, as found by syzbot.

2) Its return value does not propagate an error if any.

Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241216083703.1859921-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
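Both problems named above are commonly avoided with a single unlock label and one return variable. The following is a generic userspace sketch of that shape, not the netdevsim code: cfg_lock()/cfg_unlock() are a hypothetical counter standing in for rtnl, and do_hold() and hold_write() are invented for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock stand-in: depth must be back to 0 after every call. */
static int lock_depth;

static void cfg_lock(void)   { lock_depth++; }
static void cfg_unlock(void) { lock_depth--; }

/* Hypothetical operation that can fail while the lock is held. */
static int do_hold(bool device_up)
{
    return device_up ? 0 : -ENETDOWN;
}

/*
 * Returns 'count' on success or a negative errno. Every path that took
 * the lock leaves through the 'unlock' label, and 'ret' carries the
 * error back to the caller instead of being overwritten by 'count'.
 */
static long hold_write(bool valid_input, bool device_up, long count)
{
    long ret;

    if (!valid_input)
        return -EINVAL;     /* fails before the lock is taken */

    cfg_lock();
    ret = do_hold(device_up);
    if (ret)
        goto unlock;        /* error: still release the lock */
    ret = count;            /* success: report bytes consumed */
unlock:
    cfg_unlock();
    return ret;
}
```

Funnelling all locked paths through one label makes "returned with the lock held" a structural impossibility rather than something each early return has to remember.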
2024-12-17net: page_pool: rename page_pool_is_last_ref()Jakub Kicinski
page_pool_is_last_ref() releases a reference while the name, to me at least, suggests it just checks if the refcount is 1. The semantics of the function are the same as those of atomic_dec_and_test() and refcount_dec_and_test(), so just use the _and_test() suffix. Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20241215212938.99210-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
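The naming concern is easier to see with the two behaviours side by side. This is a userspace sketch with C11 atomics, assuming hypothetical helper names ref_is_last() and ref_dec_and_test(); it is not the page_pool code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* What an "is_last_ref" name suggests: a read-only check, count unchanged. */
static bool ref_is_last(atomic_long *ref)
{
    return atomic_load(ref) == 1;
}

/*
 * What a "_dec_and_test" helper does: drop one reference and report
 * whether that was the last one (the count reached zero).
 */
static bool ref_dec_and_test(atomic_long *ref)
{
    return atomic_fetch_sub(ref, 1) == 1;
}
```

Calling the second helper has a side effect the first does not, which is why the commit moves to the _and_test() suffix already used by atomic_dec_and_test() and refcount_dec_and_test().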
2024-12-17hexagon: Disable constant extender optimization for LLVM prior to 19.1.0Nathan Chancellor
The Hexagon-specific constant extender optimization in LLVM may crash on Linux kernel code [1], such as fs/bcachefs/btree_io.c after commit 32ed4a620c54 ("bcachefs: Btree path tracepoints") in 6.12:

  clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed.
  Stack dump:
  0.      Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c
  1.      <eof> parser at end of file
  2.      Code generation
  3.      Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'.
  4.      Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath'

Without assertions enabled, there is just a hang during compilation.

This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM 19.1.0 [3] but the kernel supports LLVM 13.0.1 and newer, so disable the constant extender optimization using the '-mllvm' option when using a toolchain that is not fixed.

Cc: stable@vger.kernel.org
Link: https://github.com/llvm/llvm-project/issues/99714 [1]
Link: https://github.com/llvm/llvm-project/commit/68df06a0b2998765cb0a41353fcf0919bbf57ddb [2]
Link: https://github.com/llvm/llvm-project/commit/2ab8d93061581edad3501561722ebd5632d73892 [3]
Reviewed-by: Brian Cain <bcain@quicinc.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-17idpf: trigger SW interrupt when exiting wb_on_itr modeJoshua Hay
There is a race condition between exiting wb_on_itr and completion write backs. For example, we are in wb_on_itr mode and a Tx completion is generated by HW, ready to be written back, as we are re-enabling interrupts:

	HW                      SW
	|                       |
	|                       | idpf_tx_splitq_clean_all
	|                       | napi_complete_done
	|                       |
	| tx_completion_wb      | idpf_vport_intr_update_itr_ena_irq

That tx_completion_wb happens before the vector is fully re-enabled. Continuing with this example, it is a UDP stream and the tx_completion_wb is the last one in the flow (there are no rx packets). Because the HW generated the completion before the interrupt is fully enabled, the HW will not fire the interrupt once the timer expires and the write back will not happen. NAPI poll won't be called. We have indicated we're back in interrupt mode but nothing else will trigger the interrupt. Therefore, the completion goes unprocessed, triggering a Tx timeout.

To mitigate this, fire a SW triggered interrupt upon exiting wb_on_itr. This interrupt will catch the rogue completion and avoid the timeout. Add logic to set the appropriate bits in the vector's dyn_ctl register.

Fixes: 9c4a27da0ecc ("idpf: enable WB_ON_ITR")
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>