linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2024-10-09	Merge branch 'net-improve-multicast-group-join-performance'	David S. Miller
	Jonas Rebmann says: ==================== improve multicast join group performance This series seeks to improve performance on updating igmp group memberships such as with IP_ADD_MEMBERSHIP or MCAST_JOIN_SOURCE_GROUP. Our use case was to add 2000 multicast memberships on a TQMLS1046A which took about 3.6 seconds for the membership additions alone. Our userspace reproducer tool was instrumented to log runtimes of the individual setsockopt invocations which clearly indicated quadratic complexity of setting up the membership with regard to the total number of multicast groups to be joined. We used perf to locate the hotspots and subsequently optimized the most costly sections of code. This series includes a patch to Linux igmp handling as well as a patch to the DPAA/Freescale driver. With both patches applied, our memberships can be set up in only about 87 miliseconds, which corresponds to a speedup of around 40. While we have acheived practically linear run-time complexity on the kernel side, a small quadratic factor remains in parts of the freescale driver code which we haven't yet optimized. We have by now payed little attention to the optimization potential in dropping group memberships, yet the dpaa patch applies to joining and leaving groups alike. Overall, this patch series brings great improvements in use cases involving large numbers of multicast groups, particularly when using the fsl_dpa driver, without noteworthy drawbacks in other scenarios. ==================== Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: dpaa: use __dev_mc_sync in dpaa_set_rx_mode()	Jonas Rebmann
	The original driver first unregisters then re-registers all multicast addresses in the struct net_device_ops::ndo_set_rx_mode() callback. As the networking stack calls ndo_set_rx_mode() if a single multicast address change occurs, a significant amount of time may be used to first unregister and then re-register unchanged multicast addresses. This leads to performance issues when tracking large numbers of multicast addresses. Replace the unregister and register loop and the hand crafted mc_addr_list list handling with __dev_mc_sync(), to only update entries which have changed. On profiling with an fsl_dpa NIC, this patch presented a speedup of around 40 when successively setting up 2000 multicast groups using setsockopt(), without drawbacks on smaller numbers of multicast groups. Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: ipv4: igmp: optimize ____ip_mc_inc_group() using mc_hash	Jonas Rebmann
	The runtime cost of joining a single multicast group in the current implementation of ____ip_mc_inc_group grows linearly with the number of existing memberships. This is caused by the linear search for an existing group record in the multicast address list. This linear complexity results in quadratic complexity when successively adding memberships, which becomes a performance bottleneck when setting up large numbers of multicast memberships. If available, use the existing multicast hash map mc_hash to quickly search for an existing group membership record. This leads to near-constant complexity on the addition of a new multicast record, significantly improving performance for workloads involving many multicast memberships. On profiling with a loopback device, this patch presented a speedup of around 6 when successively setting up 2000 multicast groups using setsockopt without measurable drawbacks on smaller numbers of multicast groups. Signed-off-by: Jonas Rebmann <jre@pengutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: amd: mvme147: Fix probe banner message	Daniel Palmer
	Currently this driver prints this line with what looks like a rogue format specifier when the device is probed: [ 2.840000] eth%d: MVME147 at 0xfffe1800, irq 12, Hardware Address xx:xx:xx:xx:xx:xx Change the printk() for netdev_info() and move it after the registration has completed so it prints out the name of the interface properly. Signed-off-by: Daniel Palmer <daniel@0x0f.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: phy: realtek: Fix MMD access on RTL8126A-integrated PHY	Heiner Kallweit
	All MMD reads return 0 for the RTL8126A-integrated PHY. Therefore phylib assumes it doesn't support EEE, what results in higher power consumption, and a significantly higher chip temperature in my case. To fix this split out the PHY driver for the RTL8126A-integrated PHY and set the read_mmd/write_mmd callbacks to read from vendor-specific registers. Fixes: 5befa3728b85 ("net: phy: realtek: add support for RTL8126A-integrated 5Gbps PHY") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	btrfs: fix clear_dirty and writeback ordering in submit_one_sector()	Naohiro Aota
	This commit is a replay of commit 6252690f7e1b ("btrfs: fix invalid mapping of extent xarray state"). We need to call btrfs_folio_clear_dirty() before btrfs_set_range_writeback(), so that xarray DIRTY tag is cleared. With a refactoring commit 8189197425e7 ("btrfs: refactor __extent_writepage_io() to do sector-by-sector submission"), it screwed up and the order is reversed and causing the same hang. Fix the ordering now in submit_one_sector(). Fixes: 8189197425e7 ("btrfs: refactor __extent_writepage_io() to do sector-by-sector submission") Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-09	btrfs: zoned: fix missing RCU locking in error message when loading zone info	Filipe Manana
	At btrfs_load_zone_info() we have an error path that is dereferencing the name of a device which is a RCU string but we are not holding a RCU read lock, which is incorrect. Fix this by using btrfs_err_in_rcu() instead of btrfs_err(). The problem is there since commit 08e11a3db098 ("btrfs: zoned: load zone's allocation offset"), back then at btrfs_load_block_group_zone_info() but then later on that code was factored out into the helper btrfs_load_zone_info() by commit 09a46725cc84 ("btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info"). Fixes: 08e11a3db098 ("btrfs: zoned: load zone's allocation offset") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-09	net: ti: icssg-prueth: Fix race condition for VLAN table access	MD Danish Anwar
	The VLAN table is a shared memory between the two ports/slices in a ICSSG cluster and this may lead to race condition when the common code paths for both ports are executed in different CPUs. Fix the race condition access by locking the shared memory access Fixes: 487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	ptp: Add support for the AMZNC10C 'vmclock' device	David Woodhouse
	The vmclock device addresses the problem of live migration with precision clocks. The tolerances of a hardware counter (e.g. TSC) are typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that counter against an external source of 'real' time, and track the precise frequency of the counter as it changes with environmental conditions. When a guest is live migrated, anything it knows about the frequency of the underlying counter becomes invalid. It may move from a host where the counter running at -50PPM of its nominal frequency, to a host where it runs at +50PPM. There will also be a step change in the value of the counter, as the correctness of its absolute value at migration is limited by the accuracy of the source and destination host's time synchronization. In its simplest form, the device merely advertises a 'disruption_marker' which indicates that the guest should throw away any NTP synchronization it thinks it has, and start again. Because the shared memory region can be exposed all the way to userspace through the /dev/vmclock0 node, applications can still use time from a fast vDSO 'system call', and check the disruption marker to be sure that their timestamp is indeed truthful. The structure also allows for the precise time, as known by the host, to be exposed directly to guests so that they don't have to wait for NTP to resync from scratch. The PTP driver consumes this information if present. Like the KVM PTP clock, this PTP driver can convert TSC-based cross timestamps into KVM clock values. Unlike the KVM PTP clock, it does so only when such is actually helpful. The values and fields are based on the nascent virtio-rtc specification, and the intent is that a version (hopefully precisely this version) of this structure will be included as an optional part of that spec. In the meantime, this driver supports the simple ACPI form of the device which is being shipped in certain commercial hypervisors (and submitted for inclusion in QEMU). Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	Merge branch 'pcs-xpcs-cleanups-batch-2'	David S. Miller
	Russell King says: ==================== net: pcs: xpcs: cleanups batch 2 This is the second cleanup series for XPCS. Patch 1 removes the enum indexing the dw_xpcs_compat array. The index is never used except to place entries in the array and to size the array. Patch 2 removes the interface arrays - each of which only contain one interface. Patch 3 makes xpcs_find_compat() take the xpcs structure rather than the ID - the previous series removed the reason for xpcs_find_compat needing to take the ID. Patch 4 provides a helper to convert xpcs structure to a regular phylink_pcs structure, which leads to patch 5. Patch 5 moves the definition of struct dw_xpcs to the private xpcs header - with patch 4 in place, nothing outside of the xpcs driver accesses the contents of the dw_xpcs structure. Patch 6 renames xpcs_get_id() to xpcs_read_id() since it's reading the ID, rather than doing anything further with it. (Prior versions of this series renamed it to xpcs_read_phys_id() since that more accurately described that it was reading the physical ID registers.) Patch 7 moves the searching of the ID list out of line as this is a separate functional block. Patch 8 converts xpcs to use the bitmap macros, which eliminates the need for _SHIFT definitions. Patch 9 adds and uses _modify() accessors as there are a large amount of read-modify-write operations in this driver. This conversion found a bug in xpcs-wx code that has been reported and already fixed. Patch 10 converts xpcs to use read_poll_timeout() rather than open coding that. Patch 11 converts all printed messages to use the dev_*() functions so the driver and devie name are always printed. Patch 12 moves DW_VR_MII_DIG_CTRL1_2G5_EN to the correct place in the header file, rather than amongst another register's definitions. Patch 13 moves the Wangxun workaround to a common location rather than duplicating it in two places. We also reformat this to fit within 80 columns. ==================== Tested-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: move Wangxun VR_XS_PCS_DIG_CTRL1 configuration	Russell King (Oracle)
	According to commits 2a22b7ae2fa3 ("net: pcs: xpcs: adapt Wangxun NICs for SGMII mode") and 2deea43f386d ("net: pcs: xpcs: add 1000BASE-X AN interrupt support"), Wangxun devices need special VR_XS_PCS_DIG_CTRL1 settings for SGMII and 1000BASE-X. Both SGMII and 1000BASE-X use the same settings. Rather than placing these in the individual xpcs_config_*() functions, move it to where we already test for the Wangxun devices in xpcs_do_config(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: correctly place DW_VR_MII_DIG_CTRL1_2G5_EN	Russell King (Oracle)
	Place DW_VR_MII_DIG_CTRL1_2G5_EN with the other DW_VR_MII_DIG_CTRL1 definitions rather than in the middle of a register list. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: use dev_*() to print messages	Russell King (Oracle)
	Use the dev_*() family of functions to print all messages from the XPCS driver so we know which instance issues the messages. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: convert to use read_poll_timeout()	Russell King (Oracle)
	Convert the xpcs driver to use read_poll_timeout() when waiting for reset to complete, rather than open-coding this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: add _modify() accessors	Russell King (Oracle)
	The xpcs driver does a lot of read-modify-write operations on registers, which leads to long-winded code to read the register, check whether the read was successful, modify the value in some way, and then write it back. We have a mdiodev _modify() accessor that encapsulates this, and does the register modification under the MDIO bus lock ensuring that the modification is atomic with respect to other bus operations. Convert the xpcs driver to use this accessor. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: use FIELD_PREP() and FIELD_GET()	Russell King (Oracle)
	Convert xpcs to use the bitfield macros rather than definining the bitfield shifts and open-coding the insertion and extraction of these bitfields. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: move searching ID list out of line	Russell King (Oracle)
	Move the searching of the physical ID out of xpcs_create() and into its own xpcs_identify() function, which makes it self contained. This reduces the complexity in xpcs_craete(), making it easier to follow, rather than having a lot of once-run code in the big for() loop. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: rename xpcs_get_id()	Russell King (Oracle)
	Rename xpcs_get_id() to xpcs_read_id() which more closely reflects the purpose of this function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: move definition of struct dw_xpcs to private header	Russell King (Oracle)
	There should be no reason for anything outside the XPCS code to know the contents of struct dw_xpcs - this is a private structure to XPCS. Move the definition to the private pcs-xpcs.h header, leaving a declaration in the global pcs/pcs-xpcs.h Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: provide a helper to get the phylink pcs given xpcs	Russell King (Oracle)
	Provide a helper to provide the pointer to the phylink_pcs struct given a valid xpcs pointer. This will be necessary when we make struct dw_xpcs private to pcs-xpcs.c Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: pass xpcs instead of xpcs->id to xpcs_find_compat()	Russell King (Oracle)
	xpcs_find_compat() is now always passed xpcs->id. Rather than always dereferencing this in the caller, move it into xpcs_find_compat(), thus making this function consistent with most of the other xpcs functions in taking an xpcs pointer. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: don't use array for interface	Russell King (Oracle)
	Currently, xpcs uses an array of interfaces that each "compat" entry supports. When looking up the compat entry for an interface, we iterate over the compat entries and then over each interface. Since each compat entry only has a single interface in its interfaces array, replace the array with a single member in the compat structure. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	net: pcs: xpcs: remove dw_xpcs_compat enum	Russell King (Oracle)
	There is no reason for the struct dw_xpcs_compat arrays to be a fixed size other than the way we iterate over them. The index into the array isn't used for anything, and having them fixed size needlessly wastes space. Remove the enum that defines their size, and instead use an empty array entry (with NULL ->supported) to mark the end of the array. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-09	xfs: fix a typo	Andrew Kreimer
	Fix a typo in comments. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-10-09	xfs: don't free cowblocks from under dirty pagecache on unshare	Brian Foster
	fallocate unshare mode explicitly breaks extent sharing. When a command completes, it checks the data fork for any remaining shared extents to determine whether the reflink inode flag and COW fork preallocation can be removed. This logic doesn't consider in-core pagecache and I/O state, however, which means we can unsafely remove COW fork blocks that are still needed under certain conditions. For example, consider the following command sequence: xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \ -c "pwrite 0 32k" -c "funshare 0 1k" <file> This allocates a data block at offset 0, shares it, and then overwrites it with a larger buffered write. The overwrite triggers COW fork preallocation, 32 blocks by default, which maps the entire 32k write to delalloc in the COW fork. All but the shared block at offset 0 remains hole mapped in the data fork. The unshare command redirties and flushes the folio at offset 0, removing the only shared extent from the inode. Since the inode no longer maps shared extents, unshare purges the COW fork before the remaining 28k may have written back. This leaves dirty pagecache backed by holes, which writeback quietly skips, thus leaving clean, non-zeroed pagecache over holes in the file. To verify, fiemap shows holes in the first 32k of the file and reads return different data across a remount: $ xfs_io -c "fiemap -v" <file> <file>: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS ... 1: [8..511]: hole 504 ... $ xfs_io -c "pread -v 4k 8" <file> 00001000: cd cd cd cd cd cd cd cd ........ $ umount <mnt>; mount <dev> <mnt> $ xfs_io -c "pread -v 4k 8" <file> 00001000: 00 00 00 00 00 00 00 00 ........ To avoid this problem, make unshare follow the same rules used for background cowblock scanning and never purge the COW fork for inodes with dirty pagecache or in-flight I/O. Fixes: 46afb0628b86347 ("xfs: only flush the unshared range in xfs_reflink_unshare") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-10-08	net: ibm: emac: mal: fix wrong goto	Rosen Penev
	dcr_map is called in the previous if and therefore needs to be unmapped. Fixes: 1ff0fcfcb1a6 ("ibm_newemac: Fix new MAL feature handling") Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20241007235711.5714-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: microchip_t1: SQI support for LAN887x	Tarun Alle
	Add support for measuring Signal Quality Index for LAN887x T1 PHY. Signal Quality Index (SQI) is measure of Link Channel Quality from 0 to 7, with 7 as the best. By default, a link loss event shall indicate an SQI of 0. Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241007063943.3233-1-tarun.alle@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	Merge branch 'net-phy-marvell-88q2xxx-enable-auto-negotiation-for-mv88q2110'	Jakub Kicinski
	Niklas Söderlund says: ==================== net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110 This series enables auto negotiation for the mv88q2110 device. Previously this feature have been disabled for mv88q2110, while enabled for other devices supported by this driver. The initial driver implementation states this is due to the configuration sequence provided by the vendor did not work. By comparing the initialization sequence of other devices this driver supports and the out-of-tree PHY driver for mv88q2110 found in the Renesas BSP [1] I was able to figure out a working configuration. As I have no access to the datasheets of either of these devices it would be super if someone who has could sanity check the initialization sequence. With this series I'm able to auto negotiate both 1000Mbps and 100Mbps links without issue. # ethtool eth0 Settings for eth0: Supported ports: [ ] Supported link modes: 100baseT1/Full 1000baseT1/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 100baseT1/Full 1000baseT1/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Link partner advertised link modes: 100baseT1/Full 1000baseT1/Full Link partner advertised pause frame use: No Link partner advertised auto-negotiation: Yes Link partner advertised FEC modes: Not reported Speed: 1000Mb/s Duplex: Full Auto-negotiation: on master-slave cfg: preferred master master-slave status: slave Port: Twisted Pair PHYAD: 0 Transceiver: external MDI-X: Unknown Link detected: yes SQI: 15/15 And the performance is good too. Without this change I was not able to manually configure a 1000Mbps link, only 100Mbps ones. So this gives a huge performance boost for my use-case. [ 5] local 10.1.0.2 port 5201 connected to 10.1.0.1 port 38346 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 96.8 MBytes 812 Mbits/sec 0 469 KBytes [ 5] 1.00-2.00 sec 94.3 MBytes 791 Mbits/sec 0 469 KBytes [ 5] 2.00-3.00 sec 96.1 MBytes 806 Mbits/sec 0 469 KBytes [ 5] 3.00-4.00 sec 98.3 MBytes 825 Mbits/sec 0 469 KBytes [ 5] 4.00-5.00 sec 98.4 MBytes 825 Mbits/sec 0 469 KBytes [ 5] 5.00-6.00 sec 98.4 MBytes 826 Mbits/sec 0 469 KBytes [ 5] 6.00-7.00 sec 98.9 MBytes 830 Mbits/sec 0 469 KBytes [ 5] 7.00-8.00 sec 91.7 MBytes 769 Mbits/sec 0 469 KBytes [ 5] 8.00-9.00 sec 99.4 MBytes 834 Mbits/sec 0 747 KBytes [ 5] 9.00-10.00 sec 101 MBytes 851 Mbits/sec 0 747 KBytes Patch 1/3 and 2/3 are preparation patches that align and move functions around as the mv88q2110 code paths can now reuses much of what is done for mv88q2220. While patch 3/3 adds the new initialization sequence and removes the auto negotiation limit for mv88q2110. 1. https://github.com/renesas-rcar/linux-bsp/commit/2a1f07d0e722a18188cfe62842b61f2fbc0ba812 ==================== Link: https://patch.msgid.link/20241005112412.544360-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: marvell-88q2xxx: Enable auto negotiation for mv88q2110	Niklas Söderlund
	The initial marvell-88q2xxx driver only supported the Marvell 88Q2110 PHY without auto negotiation support. The reason documented states that the provided initialization sequence did not to work. Now a method to enable auto negotiation have been found by comparing the initialization of other supported devices and an out-of-tree PHY driver. Perform the minimal needed initialization of the PHY to get auto negotiation working and remove the limitation that disables the auto negotiation feature for the mv88q2110 device. With this change a 1000Mbps full duplex link is able to be negotiated between two mv88q2110 and the link works perfectly. The other side also reflects the manually configure settings of the master device. # ethtool eth0 Settings for eth0: Supported ports: [ ] Supported link modes: 100baseT1/Full 1000baseT1/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 100baseT1/Full 1000baseT1/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Link partner advertised link modes: 100baseT1/Full 1000baseT1/Full Link partner advertised pause frame use: No Link partner advertised auto-negotiation: Yes Link partner advertised FEC modes: Not reported Speed: 1000Mb/s Duplex: Full Auto-negotiation: on master-slave cfg: preferred master master-slave status: slave Port: Twisted Pair PHYAD: 0 Transceiver: external MDI-X: Unknown Link detected: yes SQI: 15/15 Before this change I was not able to manually configure 1000Mbps link, only a 100Mpps link so this change providers an improvement in performance for this device. [ 5] local 10.1.0.2 port 5201 connected to 10.1.0.1 port 38346 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 96.8 MBytes 812 Mbits/sec 0 469 KBytes [ 5] 1.00-2.00 sec 94.3 MBytes 791 Mbits/sec 0 469 KBytes [ 5] 2.00-3.00 sec 96.1 MBytes 806 Mbits/sec 0 469 KBytes [ 5] 3.00-4.00 sec 98.3 MBytes 825 Mbits/sec 0 469 KBytes [ 5] 4.00-5.00 sec 98.4 MBytes 825 Mbits/sec 0 469 KBytes [ 5] 5.00-6.00 sec 98.4 MBytes 826 Mbits/sec 0 469 KBytes [ 5] 6.00-7.00 sec 98.9 MBytes 830 Mbits/sec 0 469 KBytes [ 5] 7.00-8.00 sec 91.7 MBytes 769 Mbits/sec 0 469 KBytes [ 5] 8.00-9.00 sec 99.4 MBytes 834 Mbits/sec 0 747 KBytes [ 5] 9.00-10.00 sec 101 MBytes 851 Mbits/sec 0 747 KBytes Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Tested-by: Stefan Eichenberger <eichest@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241005112412.544360-4-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: marvell-88q2xxx: Make register writer function generic	Niklas Söderlund
	In preparation to adding auto negotiation support to mv88q2110 move and rename the helper function used to write an array of register values to the PHY. Just as for mv88q2220 devices this helper will be needed to for the initial configuration of the mv88q2110 to support auto negotiation. The function is moved verbatim, there is no change in behavior. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com> Tested-by: Stefan Eichenberger <eichest@gmail.com> Link: https://patch.msgid.link/20241005112412.544360-3-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: marvell-88q2xxx: Align soft reset for mv88q2110 and mv88q2220	Niklas Söderlund
	The soft reset implementations for mv88q2110 and mv88q2220 differ as the later need to consider that auto negation is supported on mv88q2220 devices. In preparation of enabling auto negotiation on mv88q2110 merge the two rest functions into a device generic one. The mv88q2220 behavior is kept as is but extended to wait for the reset bit to be clears before continuing, as was done previously on mv88q2220. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com> Tested-by: Stefan Eichenberger <eichest@gmail.com> Link: https://patch.msgid.link/20241005112412.544360-2-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	fsl/fman: Fix a typo	Andrew Kreimer
	Fix a typo in comments: bellow -> below. Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Kreimer <algonell@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241006130829.13967-1-algonell@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: aquantia: allow forcing order of MDI pairs	Daniel Golle
	Despite supporting Auto MDI-X, it looks like Aquantia only supports swapping pair (1,2) with pair (3,6) like it used to be for MDI-X on 100MBit/s networks. When all 4 pairs are in use (for 1000MBit/s or faster) the link does not come up with pair order is not configured correctly, either using MDI_CFG pin or using the "PMA Receive Reserved Vendor Provisioning 1" register. Normally, the order of MDI pairs being either ABCD or DCBA is configured by pulling the MDI_CFG pin. However, some hardware designs require overriding the value configured by that bootstrap pin. The PHY allows doing that by setting a bit in "PMA Receive Reserved Vendor Provisioning 1" register which allows ignoring the state of the MDI_CFG pin and another bit configuring whether the order of MDI pairs should be normal (ABCD) or reverse (DCBA). Pair polarity is not affected and remains identical in both settings. Introduce property "marvell,mdi-cfg-order" which allows forcing either normal or reverse order of the MDI pairs from DT. If the property isn't present, the behavior is unchanged and MDI pair order configuration is untouched (ie. either the result of MDI_CFG pin pull-up/pull-down, or pair order override already configured by the bootloader before Linux is started). Forcing normal pair order is required on the Adtran SDG-8733A Wi-Fi 7 residential gateway. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/9ed760ff87d5fc456f31e407ead548bbb754497d.1728058550.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	dt-bindings: net: marvell,aquantia: add property to override MDI_CFG	Daniel Golle
	Usually the MDI pair order reversal configuration is defined by bootstrap pin MDI_CFG. Some designs, however, require overriding the MDI pair order and force either normal or reverse order. Add property 'marvell,mdi-cfg-order' to allow forcing either normal or reverse order of the MDI pairs. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/7ccf25d6d7859f1ce9983c81a2051cfdfb0e0a99.1728058550.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net/sched: accept TCA_STAB only for root qdisc	Eric Dumazet
	Most qdiscs maintain their backlog using qdisc_pkt_len(skb) on the assumption it is invariant between the enqueue() and dequeue() handlers. Unfortunately syzbot can crash a host rather easily using a TBF + SFQ combination, with an STAB on SFQ [1] We can't support TCA_STAB on arbitrary level, this would require to maintain per-qdisc storage. [1] [ 88.796496] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 88.798611] #PF: supervisor read access in kernel mode [ 88.799014] #PF: error_code(0x0000) - not-present page [ 88.799506] PGD 0 P4D 0 [ 88.799829] Oops: Oops: 0000 [#1] SMP NOPTI [ 88.800569] CPU: 14 UID: 0 PID: 2053 Comm: b371744477 Not tainted 6.12.0-rc1-virtme #1117 [ 88.801107] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 88.801779] RIP: 0010:sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq [ 88.802544] Code: 0f b7 50 12 48 8d 04 d5 00 00 00 00 48 89 d6 48 29 d0 48 8b 91 c0 01 00 00 48 c1 e0 03 48 01 c2 66 83 7a 1a 00 7e c0 48 8b 3a <4c> 8b 07 4c 89 02 49 89 50 08 48 c7 47 08 00 00 00 00 48 c7 07 00 All code ======== 0: 0f b7 50 12 movzwl 0x12(%rax),%edx 4: 48 8d 04 d5 00 00 00 lea 0x0(,%rdx,8),%rax b: 00 c: 48 89 d6 mov %rdx,%rsi f: 48 29 d0 sub %rdx,%rax 12: 48 8b 91 c0 01 00 00 mov 0x1c0(%rcx),%rdx 19: 48 c1 e0 03 shl $0x3,%rax 1d: 48 01 c2 add %rax,%rdx 20: 66 83 7a 1a 00 cmpw $0x0,0x1a(%rdx) 25: 7e c0 jle 0xffffffffffffffe7 27: 48 8b 3a mov (%rdx),%rdi 2a:* 4c 8b 07 mov (%rdi),%r8 <-- trapping instruction 2d: 4c 89 02 mov %r8,(%rdx) 30: 49 89 50 08 mov %rdx,0x8(%r8) 34: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi) 3b: 00 3c: 48 rex.W 3d: c7 .byte 0xc7 3e: 07 (bad) ... Code starting with the faulting instruction =========================================== 0: 4c 8b 07 mov (%rdi),%r8 3: 4c 89 02 mov %r8,(%rdx) 6: 49 89 50 08 mov %rdx,0x8(%r8) a: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi) 11: 00 12: 48 rex.W 13: c7 .byte 0xc7 14: 07 (bad) ... [ 88.803721] RSP: 0018:ffff9a1f892b7d58 EFLAGS: 00000206 [ 88.804032] RAX: 0000000000000000 RBX: ffff9a1f8420c800 RCX: ffff9a1f8420c800 [ 88.804560] RDX: ffff9a1f81bc1440 RSI: 0000000000000000 RDI: 0000000000000000 [ 88.805056] RBP: ffffffffc04bb0e0 R08: 0000000000000001 R09: 00000000ff7f9a1f [ 88.805473] R10: 000000000001001b R11: 0000000000009a1f R12: 0000000000000140 [ 88.806194] R13: 0000000000000001 R14: ffff9a1f886df400 R15: ffff9a1f886df4ac [ 88.806734] FS: 00007f445601a740(0000) GS:ffff9a2e7fd80000(0000) knlGS:0000000000000000 [ 88.807225] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.807672] CR2: 0000000000000000 CR3: 000000050cc46000 CR4: 00000000000006f0 [ 88.808165] Call Trace: [ 88.808459] <TASK> [ 88.808710] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) [ 88.809261] ? page_fault_oops (arch/x86/mm/fault.c:715) [ 88.809561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) [ 88.809806] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) [ 88.810074] ? sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq [ 88.810411] sfq_reset (net/sched/sch_sfq.c:525) sch_sfq [ 88.810671] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036) [ 88.810950] tbf_reset (./include/linux/timekeeping.h:169 net/sched/sch_tbf.c:334) sch_tbf [ 88.811208] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036) [ 88.811484] netif_set_real_num_tx_queues (./include/linux/spinlock.h:396 ./include/net/sch_generic.h:768 net/core/dev.c:2958) [ 88.811870] __tun_detach (drivers/net/tun.c:590 drivers/net/tun.c:673) [ 88.812271] tun_chr_close (drivers/net/tun.c:702 drivers/net/tun.c:3517) [ 88.812505] __fput (fs/file_table.c:432 (discriminator 1)) [ 88.812735] task_work_run (kernel/task_work.c:230) [ 88.813016] do_exit (kernel/exit.c:940) [ 88.813372] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4)) [ 88.813639] ? handle_mm_fault (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/memcontrol.h:1022 ./include/linux/memcontrol.h:1045 ./include/linux/memcontrol.h:1052 mm/memory.c:5928 mm/memory.c:6088) [ 88.813867] do_group_exit (kernel/exit.c:1070) [ 88.814138] __x64_sys_exit_group (kernel/exit.c:1099) [ 88.814490] x64_sys_call (??:?) [ 88.814791] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) [ 88.815012] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) [ 88.815495] RIP: 0033:0x7f44560f1975 Fixes: 175f9c1bba9b ("net_sched: Add size table for qdiscs") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20241007184130.3960565-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	Merge branch 'selftests-mlxsw-stabilize-red-tests'	Jakub Kicinski
	Petr Machata says: ==================== selftests: mlxsw: Stabilize RED tests Tweak the mlxsw-specific RED selftests to increase stability on Spectrum-3 and Spectrum-4 machines. ==================== Link: https://patch.msgid.link/cover.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	selftests: mlxsw: sch_red_core: Lower TBF rate	Petr Machata
	The RED test uses a pair of TBF shapers. The first to get predictably-sized stream of traffic, and second to get a 100% saturated chokepoint. To this chokepoint it injects individual packets. Because the chokepoint is saturated, these additional packets go straight to the backlog. This allows the test to check RED behavior across various queue sizes. The shapers are rated at 1Gbps, for historical reasons (before mlxsw supported TBF offload, the test used port speed to create the chokepoints). Machines with a low-power CPU may have trouble consistently generating 1Gbps of traffic, and the test then spuriously fails. Instead, drop the rate to 200Mbps (Spectrum has a guaranteed shaper rate granularity of 200Mbps, so anything lower is not guaranteed to work well). Because that means fewer packets will be mirrored in the ECN-mark test, adjust the passing condition accordingly. Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/c6712f9c5de75ae0bc2ab3d8ea7d92aaaf93af95.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	selftests: mlxsw: sch_red_core: Send more packets for drop tests	Petr Machata
	This test works by injecting into a port with a maxed-out queue a couple packets and checks if a corresponding number of packets were dropped. This has worked well on Spectrum<4, but on Spectrum-4 it has been noisy. This is in line with the observation that on Spectrum-4, queue size tends to fluctuate more. A handful of packets could then still be accepted to the queue even though it was nominally full just recently. In order to accommodate this behavior, send many more packets. The buffer can fit N extra packets, but not N% packets. This therefore allows us to set wider absolute margins, while actually narrowing them relatively. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/abc869b9f6003d400d6293ddd5edb2f4517f44d5.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	selftests: mlxsw: sch_red_core: Sleep before querying queue depth	Petr Machata
	The qdisc stats are taken from the port's periodic HW stats, which are updated once a second. We try to accommodate the latency by using busywait in build_backlog(). The issue in that seems to be that when do_mark_test() builds the backlog, it makes the decision whether to send more packets based on the first instance of the queue depth stat exceeding the current value, when in fact more traffic is on the way and the queue depth would increase further. This leads to failures in TC 1 of mark-mirror test, where we see the following failure: TEST: TC 0: marked packets mirror'd [ OK ] TEST: TC 1: marked packets mirror'd [FAIL] Spurious packets (1680 -> 2290) observed without buffer pressure Fix by waiting for the full second before reading the queue depth for the first time, to make sure it reflects all in-flight traffic. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/321dcf8b3e9a1f0766429c8cf3e3f1746f1bc375.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	selftests: mlxsw: sch_red_core: Increase backlog size tolerance	Petr Machata
	Backlog fluctuates on Spectrum-4 much more than on <4. In practice we can sample queue depth values going from about -12% to about +7% of the configured RED limit. The test which checks the queue size has a limit of +-10%, and as a result often fails. We attempted to fix the issue by busywaiting for several seconds hoping to get within the bounds, but that still proved to be too noisy (or the wait time would be impractically long). Unfortunately we have to bump the value tolerance from 10% to 15%, which in this patch do. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/f54950df2a8fcba46c3ddc1053376352fa2e592b.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	selftests: mlxsw: sch_red_ets: Increase required backlog	Petr Machata
	Backlog fluctuates on Spectrum-4 much more than on <4. Increasing the desired backlog seems to help, as the constant fluctuations do not overlap into the territory where packets are marked. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Link: https://patch.msgid.link/0821fb3aa8bb6a6c0d3000baab04995517c9a0cc.1728316370.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: phy: smsc: use devm_clk_get_optional_enabled_with_rate()	Bartosz Golaszewski
	Fold the separate call to clk_set_rate() into the clock getter. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241007134100.107921-1-brgl@bgdev.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	chelsio/chtls: Remove unused chtls_set_tcb_tflag	Dr. David Alan Gilbert
	chtls_set_tcb_tflag() has been unused since 2021's commit 827d329105bf ("chtls: Remove invalid set_tcb call") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241007004652.150065-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	caif: Remove unused cfsrvl_getphyid	Dr. David Alan Gilbert
	cfsrvl_getphyid() has been unused since 2011's commit f36214408470 ("caif: Use RCU and lists in cfcnfg.c for managing caif link layers") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241007004456.149899-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net-timestamp: namespacify the sysctl_tstamp_allow_data	Jason Xing
	Let it be tuned in per netns by admins. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241005222609.94980-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	net: dsa: mv88e6xxx: Add FID map cache	Aryan Srivastava
	Add a cached FID bitmap. This mitigates the need to walk all VTU entries to find the next free FID. When flushing the VTU (during init), zero the FID bitmap. Use and manipulate this bitmap from now on, instead of reading HW for the FID map. The repeated VTU walks are costly and can take ~40 mins if ~4000 vlans are added. Caching the FID map reduces this time to <2 mins. Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241006212905.3142976-1-aryan.srivastava@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08	e1000: Link NAPI instances to queues and IRQs	Joe Damato
	Add support for netdev-genl, allowing users to query IRQ, NAPI, and queue information. After this patch is applied, note the IRQ assigned to my NIC: $ cat /proc/interrupts \| grep enp0s8 \| cut -f1 --delimiter=':' 18 Note the output from the cli: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump napi-get --json='{"ifindex": 2}' [{'id': 513, 'ifindex': 2, 'irq': 18}] This device supports only 1 rx and 1 tx queue, so querying that: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump queue-get --json='{"ifindex": 2}' [{'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'rx'}, {'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'tx'}] Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08	e1000e: Link NAPI instances to queues and IRQs	Joe Damato
	Add support for netdev-genl, allowing users to query IRQ, NAPI, and queue information. After this patch is applied, note the IRQs assigned to my NIC: $ cat /proc/interrupts \| grep ens \| cut -f1 --delimiter=':' 50 51 52 While e1000e allocates 3 IRQs (RX, TX, and other), it looks like e1000e only has a single NAPI, so I've associated the NAPI with the RX IRQ (50 on my system, seen above). Note the output from the cli: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump napi-get --json='{"ifindex": 2}' [{'id': 145, 'ifindex': 2, 'irq': 50}] This device supports only 1 rx and 1 tx queue. so querying that: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ --dump queue-get --json='{"ifindex": 2}' [{'id': 0, 'ifindex': 2, 'napi-id': 145, 'type': 'rx'}, {'id': 0, 'ifindex': 2, 'napi-id': 145, 'type': 'tx'}] Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08	e1000e: Remove duplicated writel() in e1000_configure_tx/rx()	Takamitsu Iwai
	Duplicated register initialization codes exist in e1000_configure_tx() and e1000_configure_rx(). For example, writel(0, tx_ring->head) writes 0 to tx_ring->head, which is adapter->hw.hw_addr + E1000_TDH(0). This initialization is already done in ew32(TDH(0), 0). ew32(TDH(0), 0) is equivalent to __ew32(hw, E1000_TDH(0), 0). It executes writel(0, hw->hw_addr + E1000_TDH(0)). Since variable hw is set to &adapter->hw, it is equal to writel(0, tx_ring->head). We can remove similar four writel() in e1000_configure_tx() and e1000_configure_rx(). commit 0845d45e900c ("e1000e: Modify Tx/Rx configurations to avoid null pointer dereferences in e1000_open") has introduced these writel(). This commit moved register writing to e1000_configure_tx/rx(), and as result, it caused duplication in e1000_configure_tx/rx(). This patch modifies the sequence of register writing, but removing these writes is safe because the same writes were already there before the commit. I also have checked the datasheets [0] [1] and have not found any description that we need to write RDH, RDT, TDH and TDT registers twice at initialization. Furthermore, we have tested this patch on an I219-V device physically. Link: https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82577-gbe-phy-datasheet.pdf [0] Link: https://www.intel.com/content/www/us/en/content-details/613460/intel-82583v-gbe-controller-datasheet.html [1] Tested-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08	igb: Cleanup unused declarations	Yue Haibing
	e1000_init_function_pointers_82575() is never implemented and used since commit 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver"). And commit 9835fd7321a6 ("igb: Add new function to read part number from EEPROM in string format") removed igb_read_part_num() implementation. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>