summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-11-15enic: Move enic resource adjustments to separate functionNelson Escobar
Move the enic resource adjustments out of enic_set_intr_mode() and into its own function, enic_adjust_resources(). Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-6-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15enic: Adjust used MSI-X wq/rq/cq/interrupt resources in a more robust wayNelson Escobar
Instead of failing to use MSI-X if resources aren't configured exactly right, use the resources we do have. Since we could start using large numbers of rq resources, we do limit the rq count to what netif_get_num_default_rss_queues() recommends. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-5-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15enic: Allocate arrays in enic struct based on VIC configNelson Escobar
Allocate wq, rq, cq, intr, and napi arrays based on the number of resources configured in the VIC. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-4-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15enic: Save resource counts we read from HWNelson Escobar
Save the resources counts for wq,rq,cq, and interrupts in *_avail variables so that we don't lose the information when adjusting the counts we are actually using. Report the wq_avail and rq_avail as the channel maximums in 'ethtool -l' output. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-3-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15enic: Make MSI-X I/O interrupts come after the other required onesNelson Escobar
The VIC hardware has a constraint that the MSIX interrupt used for errors be specified as a 7 bit number. Before this patch, it was allocated after the I/O interrupts, which would cause a problem if 128 or more I/O interrupts are in use. So make the required interrupts come before the I/O interrupts to guarantee the error interrupt offset never exceeds 7 bits. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-2-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15enic: Create enic_wq/rq structures to bundle per wq/rq dataNelson Escobar
Bundling the wq/rq specific data into dedicated enic_wq/rq structures cleans up the enic structure and simplifies future changes related to wq/rq. Co-developed-by: John Daley <johndale@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-1-a34cf8570c67@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15gve: Flow steering trigger reset only for timeout errorZiwei Xiao
When configuring flow steering rules, the driver is currently going through a reset for all errors from the device. Instead, the driver should only reset when there's a timeout error from the device. Fixes: 57718b60df9b ("gve: Add flow steering adminq commands") Cc: stable@vger.kernel.org Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241113175930.2585680-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: phy: microchip_t1: Clause-45 PHY loopback support for LAN887xTarun Alle
Adds support for clause-45 PHY loopback for the Microchip LAN887x driver. Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241114101951.382996-1-Tarun.Alle@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15dt-bindings: net: dsa: microchip,ksz: Drop undocumented "id"Rob Herring (Arm)
"id" is not a documented property, so drop it. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225642.1783485-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15net: phy: fix phylib's dual eee_enabledRussell King (Oracle)
phylib has two eee_enabled members. Some parts of the code are using phydev->eee_enabled, other parts are using phydev->eee_cfg.eee_enabled. This leads to incorrect behaviour as their state goes out of sync. ethtool --show-eee shows incorrect information, and --set-eee sometimes doesn't take effect. Fix this by only having one eee_enabled member - that in eee_cfg. Fixes: 49168d1980e2 ("net: phy: Add phy_support_eee() indicating MAC support EEE") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/E1tBXAF-00341F-EQ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15xsk: Free skb when TX metadata options are invalidFelix Maurer
When a new skb is allocated for transmitting an xsk descriptor, i.e., for every non-multibuf descriptor or the first frag of a multibuf descriptor, but the descriptor is later found to have invalid options set for the TX metadata, the new skb is never freed. This can leak skbs until the send buffer is full which makes sending more packets impossible. Fix this by freeing the skb in the error path if we are currently dealing with the first frag, i.e., an skb allocated in this iteration of xsk_build_skb. Fixes: 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") Reported-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Felix Maurer <fmaurer@redhat.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/edb9b00fb19e680dff5a3350cd7581c5927975a8.1731581697.git.fmaurer@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15bnxt_en: optimize gettimex64Vadim Fedorenko
Current implementation of gettimex64() makes at least 3 PCIe reads to get current PHC time. It takes at least 2.2us to get this value back to userspace. At the same time there is cached value of upper bits of PHC available for packet timestamps already. This patch reuses cached value to speed up reading of PHC time. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241114114820.1411660-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge tag 'nf-24-11-14' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Update .gitignore in selftest to skip conntrack_reverse_clash, from Li Zhijian. 2) Fix conntrack_dump_flush return values, from Guan Jing. 3) syzbot found that ipset's bitmap type does not properly checks for bitmap's first ip, from Jeongjun Park. * tag 'nf-24-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ipset: add missing range check in bitmap_ip_uadt selftests: netfilter: Fix missing return values in conntrack_dump_flush selftests: netfilter: Add missing gitignore file ==================== Link: https://patch.msgid.link/20241114125723.82229-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15netdev-genl: Hold rcu_read_lock in napi_setJoe Damato
Hold rcu_read_lock during netdev_nl_napi_set_doit, which calls napi_by_id and requires rcu_read_lock to be held. Closes: https://lore.kernel.org/netdev/719083c2-e277-447b-b6ea-ca3acb293a03@redhat.com/ Fixes: 1287c1ae0fc2 ("netdev-genl: Support setting per-NAPI config values") Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241114175600.18882-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15netdev-genl: Hold rcu_read_lock in napi_getJoe Damato
Hold rcu_read_lock in netdev_nl_napi_get_doit, which calls napi_by_id and is required to be called under rcu_read_lock. Cc: stable@vger.kernel.org Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi") Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241114175157.16604-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge tag 'for-net-next-2024-11-14' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - btusb: add Foxconn 0xe0fc for Qualcomm WCN785x - btmtk: Fix ISO interface handling - Add quirk for ATS2851 - btusb: Add RTL8852BE device 0489:e123 - ISO: Do not emit LE PA/BIG Create Sync if previous is pending - btusb: Add USB HW IDs for MT7920/MT7925 - btintel_pcie: Add handshake between driver and firmware - btintel_pcie: Add recovery mechanism - hci_conn: Use disable_delayed_work_sync - SCO: Use kref to track lifetime of sco_conn - ISO: Use kref to track lifetime of iso_conn - btnxpuart: Add GPIO support to power save feature - btusb: Add 0x0489:0xe0f3 and 0x13d3:0x3623 for Qualcomm WCN785x * tag 'for-net-next-2024-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits) Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNC Bluetooth: fix use-after-free in device_for_each_child() Bluetooth: btintel: Direct exception event to bluetooth stack Bluetooth: hci_core: Fix calling mgmt_device_connected Bluetooth: hci_bcm: Use the devm_clk_get_optional() helper Bluetooth: ISO: Send BIG Create Sync via hci_sync Bluetooth: hci_conn: Remove alloc from critical section Bluetooth: ISO: Use kref to track lifetime of iso_conn Bluetooth: SCO: Use kref to track lifetime of sco_conn Bluetooth: HCI: Add IPC(11) bus type Bluetooth: btusb: Add 3 HWIDs for MT7925 Bluetooth: btusb: Add new VID/PID 0489/e124 for MT7925 Bluetooth: ISO: Update hci_conn_hash_lookup_big for Broadcast slave Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending Bluetooth: ISO: Fix matching parent socket for BIS slave Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pending Bluetooth: btrtl: Decrease HCI_OP_RESET timeout from 10 s to 2 s Bluetooth: btbcm: fix missing of_node_put() in btbcm_get_board_name() Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925 Bluetooth: btmtk: adjust the position to init iso data anchor ... ==================== Link: https://patch.msgid.link/20241114214731.1994446-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15Merge tag 'nf-next-24-11-15' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Extended netlink error reporting if nfnetlink attribute parser fails, from Donald Hunter. 2) Incorrect request_module() module, from Simon Horman. 3) A series of patches to reduce memory consumption for set element transactions. Florian Westphal says: "When doing a flush on a set or mass adding/removing elements from a set, each element needs to allocate 96 bytes to hold the transactional state. In such cases, virtually all the information in struct nft_trans_elem is the same. Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem can hold multiple set element pointers. The number of elements that can be stored in one nft_trans_elem is limited by the slab allocator, this series limits the compaction to at most 62 elements as it caps the reallocation to 2048 bytes of memory." 4) A series of patches to prepare the transition to dscp_t in .flowi_tos. From Guillaume Nault. 5) Support for bitwise operations with two source registers, from Jeremy Sowden. * tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: bitwise: add support for doing AND, OR and XOR directly netfilter: bitwise: rename some boolean operation functions netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t. netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t. netfilter: rpfilter: Convert rpfilter_mt() to dscp_t. netfilter: flow_offload: Convert nft_flow_route() to dscp_t. netfilter: ipv4: Convert ip_route_me_harder() to dscp_t. netfilter: nf_tables: allocate element update information dynamically netfilter: nf_tables: switch trans_elem to real flex array netfilter: nf_tables: prepare nft audit for set element compaction netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure netfilter: nf_tables: add nft_trans_commit_list_add_elem helper netfilter: bpf: Pass string literal as format argument of request_module() netfilter: nfnetlink: Report extack policy errors for batched ops ==================== Link: https://patch.msgid.link/20241115133207.8907-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15netfilter: bitwise: add support for doing AND, OR and XOR directlyJeremy Sowden
Hitherto, these operations have been converted in user space to mask-and-xor operations on one register and two immediate values, and it is the latter which have been evaluated by the kernel. We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register. Pablo made a few changes to the original patch: - EINVAL if NFTA_BITWISE_SREG2 is used with fast version. - Allow _AND,_OR,_XOR with _DATA != sizeof(u32) - Dump _SREG2 or _DATA with _AND,_OR,_XOR Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: bitwise: rename some boolean operation functionsJeremy Sowden
In the next patch we add support for doing AND, OR and XOR operations directly in the kernel, so rename some functions and an enum constant related to mask-and-xor boolean operations. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.Guillaume Nault
Use ip4h_dscp() instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: flow_offload: Convert nft_flow_route() to dscp_t.Guillaume Nault
Use ip4h_dscp()instead of reading ip_hdr()->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Also, remove the comment about the net/ip.h include file, since it's now required for the ip4h_dscp() helper too. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.Guillaume Nault
Use ip4h_dscp()instead of reading iph->tos directly. ip4h_dscp() returns a dscp_t value which is temporarily converted back to __u8 with inet_dscp_to_dsfield(). When converting ->flowi4_tos to dscp_t in the future, we'll only have to remove that inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-15xfrm: Fix acquire state insertion.Steffen Klassert
A recent commit jumped over the dst hash computation and left the symbol uninitialized. Fix this by explicitly computing the dst hash before it is used. Fixes: 0045e3d80613 ("xfrm: Cache used outbound xfrm states at the policy.") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-11-14Merge branch 'net-make-rss-rxnfc-semantics-more-explicit'Jakub Kicinski
Edward Cree says: ==================== net: make RSS+RXNFC semantics more explicit The original semantics of ntuple filters with FLOW_RSS were not fully understood by all drivers, some ignoring the ring_cookie from the flow rule. Require this support to be explicitly declared by the driver for filters relying on it to be inserted, and add self- test coverage for this functionality. Also teach ethtool_check_max_channel() about this. ==================== Link: https://patch.msgid.link/cover.1731499021.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14selftest: extend test_rss_context_queue_reconfigure for action additionEdward Cree
The combination of ntuple action (ring_cookie) and RSS context can cause an ntuple rule to target a higher queue than appears in any RSS indirection table or directly in the ntuple rule, since the two numbers are added together. Verify the logic that prevents reducing the queue count in this case. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/58276b800ab78c0a79c1918046ccae7fe45ba802.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14selftest: validate RSS+ntuple filters with nonzero ring_cookieEdward Cree
Test creates an ntuple filter with 'action 2' and an RSS context whose indirection table has entries 0 and 1. Resulting traffic should go to queues 2 and 3; verify that it never hits queues 0 and 1. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/114afdf4d2867f72ed27751e8e08fe8b128a8529.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14selftest: include dst-ip in ethtool ntuple rulesEdward Cree
sfc hardware does not support filters with only ipproto + dst-port; adding dst-ip to the flow spec allows the rss_ctx test to be run on these devices. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://patch.msgid.link/8e5d23c8f21310c23c080cc7bcd31b76f8fd3096.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: ethtool: account for RSS+RXNFC add semantics when checking channel countEdward Cree
In ethtool_check_max_channel(), the new RX count must not only cover the max queue indices in RSS indirection tables and RXNFC destinations separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum of the destination queue and the maximum index in the associated RSS context's indirection table, since that is the highest queue that the rule can actually deliver traffic to. It could be argued that the max queue across all custom RSS contexts (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since any context to which packets can actually be delivered will be targeted by some RXNFC rule and its max will thus be allowed for by ethtool_get_max_rxnfc_channel(). For simplicity we keep both checks, so even RSS contexts unused by any RXNFC rule must fit the channel count. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/43257d375434bef388e36181492aa4c458b88336.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts inEdward Cree
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters. However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://patch.msgid.link/cc3da0844083b0e301a33092a6299e4042b65221.1731499022.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14i40e: Fix handling changed priv flagsPeter Große
After assembling the new private flags on a PF, the operation to determine the changed flags uses the wrong bitmaps. Instead of xor-ing orig_flags with new_flags, it uses the still unchanged pf->flags, thus changed_flags is always 0. Fix it by using the correct bitmaps. The issue was discovered while debugging why disabling source pruning stopped working with release 6.7. Although the new flags will be copied to pf->flags later on in that function, disabling source pruning requires a reset of the PF, which was skipped due to this bug. Disabling source pruning: $ sudo ethtool --set-priv-flags eno1 disable-source-pruning on $ sudo ethtool --show-priv-flags eno1 Private flags for eno1: MFP : off total-port-shutdown : off LinkPolling : off flow-director-atr : on veb-stats : off hw-atr-eviction : off link-down-on-close : off legacy-rx : off disable-source-pruning: on disable-fw-lldp : off rs-fec : off base-r-fec : off vf-vlan-pruning : off Regarding reproducing: I observed the issue with a rather complicated lab setup, where * two VLAN interfaces are created on eno1 * each with a different MAC address assigned * each moved into a separate namespace * both VLANs are bridged externally, so they form a single layer 2 network The external bridge is done via a channel emulator adding packet loss and delay and the application in the namespaces tries to send/receive traffic and measure the performance. Sender and receiver are separated by namespaces, yet the network card "sees its own traffic" send back to it. To make that work, source pruning has to be disabled. Cc: stable@vger.kernel.org Fixes: 70756d0a4727 ("i40e: Use DECLARE_BITMAP for flags and hw_features fields in i40e_pf") Signed-off-by: Peter Große <pegro@friiks.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20241113210705.1296408-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14dt-bindings: net: sff,sfp: Fix "interrupts" property typoRob Herring (Arm)
The example has "interrupt" property which is not a defined property. It should be "interrupts" instead. "interrupts" also should not contain a phandle. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225825.1785588-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14dt-bindings: net: mdio-mux-gpio: Drop undocumented "marvell,reg-init"Rob Herring (Arm)
"marvell,reg-init" is not yet documented by schema. It's irrelevant to the example, so just drop it. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241113225713.1784118-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: sparx5: add missing lan969x Kconfig dependencyArnd Bergmann
The sparx5 switchdev driver can be built either with or without support for the Lan969x switch. However, it cannot be built-in when the lan969x driver is a loadable module because of a link-time dependency: arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/sparx5/sparx5_main.o:(.rodata+0xd44): undefined reference to `lan969x_desc' Add a Kconfig dependency to reflect this in Kconfig, allowing all the valid configurations but forcing sparx5 to be a loadable module as well if lan969x is. Fixes: 98a01119608d ("net: sparx5: add compatible string for lan969x") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241113115513.4132548-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: enetc: clean up before returning in probe()Dan Carpenter
We recently added this error path. We need to call enetc_pci_remove() before returning. It cleans up the resources from enetc_pci_probe(). Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/93888efa-c838-4682-a7e5-e6bf318e844e@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: phy: dp83869: fix status reporting for 1000base-x autonegotiationRomain Gantois
The DP83869 PHY transceiver supports converting from RGMII to 1000base-x. In this operation mode, autonegotiation can be performed, as described in IEEE802.3. The DP83869 has a set of fiber-specific registers located at offset 0xc00. When the transceiver is configured in RGMII-to-1000base-x mode, these registers are mapped onto offset 0, which should make reading the autonegotiation status transparent. However, the fiber registers at offset 0xc04 and 0xc05 follow the bit layout specified in Clause 37, and genphy_read_status() assumes a Clause 22 layout. Thus, genphy_read_status() doesn't properly read the capabilities advertised by the link partner, resulting in incorrect link parameters. Similarly, genphy_config_aneg() doesn't properly write advertised capabilities. Fix the 1000base-x autonegotiation procedure by replacing genphy_read_status() and genphy_config_aneg() with their Clause 37 equivalents. Fixes: a29de52ba2a1 ("net: dp83869: Add ability to advertise Fiber connection") Cc: stable@vger.kernel.org Signed-off-by: Romain Gantois <romain.gantois@bootlin.com> Link: https://patch.msgid.link/20241112-dp83869-1000base-x-v3-1-36005f4ab0d9@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14mdio: Remove mdio45_ethtool_gset_npage()Alistair Francis
The mdio45_ethtool_gset_npage() function isn't called, so let's remove it. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20241112105430.438491-2-alistair@alistair23.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14include: mdio: Remove mdio45_ethtool_gset()Alistair Francis
mdio45_ethtool_gset() is never called, so let's remove it. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20241112105430.438491-1-alistair@alistair23.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2024-11-14 We've added 9 non-merge commits during the last 4 day(s) which contain a total of 3 files changed, 226 insertions(+), 84 deletions(-). The main changes are: 1) Fixes to bpf_msg_push/pop_data and test_sockmap. The changes has dependency on the other changes in the bpf-next/net branch, from Zijian Zhang. 2) Drop netns codes from mptcp test. Reuse the common helpers in test_progs, from Geliang Tang. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: bpf, sockmap: Fix sk_msg_reset_curr bpf, sockmap: Several fixes to bpf_msg_pop_data bpf, sockmap: Several fixes to bpf_msg_push_data selftests/bpf: Add more tests for test_txmsg_push_pop in test_sockmap selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap selftests/bpf: Fix SENDPAGE data logic in test_sockmap selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap selftests/bpf: Drop netns helpers in mptcp ==================== Link: https://patch.msgid.link/20241114202832.3187927-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge branch 'ipv4-prepare-bpf-helpers-to-flowi4_tos-conversion'Jakub Kicinski
Guillaume Nault says: ==================== ipv4: Prepare bpf helpers to .flowi4_tos conversion. Continue the process of making a dscp_t variable available when setting .flowi4_tos. This series focuses on the BPF helpers that initialise a struct flowi4 manually. The objective is to eventually convert .flowi4_tos to dscp_t, (to get type annotation and prevent ECN bits from interfering with DSCP). ==================== Link: https://patch.msgid.link/cover.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion.Guillaume Nault
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/8338a12377c44f698a651d1ce357dd92bdf18120.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14bpf: ipv4: Prepare __bpf_redirect_neigh_v4() to future .flowi4_tos conversion.Guillaume Nault
Use ip4h_dscp() to get the DSCP from the IPv4 header, then convert the dscp_t value to __u8 with inet_dscp_to_dsfield(). Then, when we'll convert .flowi4_tos to dscp_t, we'll just have to drop the inet_dscp_to_dsfield() call. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/35eacc8955003e434afb1365d404193cc98a9579.1731064982.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge branch 'tools-net-ynl-rework-async-notification-handling'Jakub Kicinski
Donald Hunter says: ==================== tools/net/ynl: rework async notification handling Revert patch 1bf70e6c3a53 which modified check_ntf() and instead add a new poll_ntf() with async notification semantics. See patch 2 for a detailed description. ==================== Link: https://patch.msgid.link/20241113090843.72917-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14tools/net/ynl: add async notification handlingDonald Hunter
The notification handling in ynl is currently very simple, using sleep() to wait a period of time and then handling all the buffered messages in a single batch. This patch adds async notification handling so that messages can be processed as they are received. This makes it possible to use ynl as a library that supplies notifications in a timely manner. - Add poll_ntf() to be a generator that yields 1 notification at a time and blocks until a notification is available. - Add a --duration parameter to the CLI, with --sleep as an alias. ./tools/net/ynl/cli.py \ --spec <SPEC> --subscribe <TOPIC> [ --duration <SECS> ] The cli will report any notifications for duration seconds and then exit. If duration is not specified, then it will poll forever, until interrupted. Here is an example python snippet that shows how to use ynl as a library for receiving notifications: ynl = YnlFamily(f"{dir}/rt_route.yaml") ynl.ntf_subscribe('rtnlgrp-ipv4-route') for event in ynl.poll_ntf(): handle(event) Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20241113090843.72917-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Revert "tools/net/ynl: improve async notification handling"Donald Hunter
This reverts commit 1bf70e6c3a5346966c25e0a1ff492945b25d3f80. This modification to check_ntf() is being reverted so that its behaviour remains equivalent to ynl_ntf_check() in the C YNL. Instead a new poll_ntf() will be added in a separate patch. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20241113090843.72917-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14Merge branch ↵Jakub Kicinski
'net-phy-switch-eee_broken_modes-to-linkmode-bitmap-and-add-accessor' Heiner Kallweit says: ==================== net: phy: switch eee_broken_modes to linkmode bitmap and add accessor eee_broken_modes has a eee_cap1 register layout currently. This doesn't allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome this limitation switch eee_broken_modes to a linkmode bitmap. Add an accessor for the bitmap and use it in r8169. ==================== Link: https://patch.msgid.link/405734c5-0ed4-40e4-9ac9-91084b9536d6@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14r8169: copy vendor driver 2.5G/5G EEE advertisement constraintsHeiner Kallweit
Vendor driver r8125 doesn't advertise 2.5G EEE on RTL8125A, and r8126 doesn't advertise 5G EEE. Likely there are compatibility issues, therefore do the same in r8169. With this change we don't have to disable 2.5G EEE advertisement in rtl8125a_config_eee_phy() any longer. We use new phylib accessor phy_set_eee_broken() to mark the respective EEE modes as broken. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/ce185e10-8a2f-4cf8-a49b-fd8fb3c3c8a1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: phy: add phy_set_eee_brokenHeiner Kallweit
Add an accessor for eee_broken_modes, so that drivers don't have to deal with phylib internals. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/0f8ee279-d40d-4489-a3b0-d993472d744a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14net: phy: convert eee_broken_modes to a linkmode bitmapHeiner Kallweit
eee_broken_modes has a eee_cap1 register layout currently. This doen't allow to flag e.g. 2.5Gbps or 5Gbps BaseT EEE as broken. To overcome this limitation switch eee_broken_modes to a linkmode bitmap. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/dfe0c9ff-84b0-4328-86d7-e917ebc084a1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>