summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
2022-11-09ice: Fix spurious interrupt during removal of trusted VFNorbert Zulinski
Previously, during removal of trusted VF when VF is down there was number of spurious interrupt equal to number of queues on VF. Add check if VF already has inactive queues. If VF is disabled and has inactive rx queues then do not disable rx queues. Add check in ice_vsi_stop_tx_ring if it's VF's vsi and if VF is disabled. Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04igb: Proactively round up to kmalloc bucket sizeKees Cook
In preparation for removing the "silently change allocation size" users of ksize(), explicitly round up all q_vector allocations so that allocations can be correctly compared to ksize(). Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04igb: Do not free q_vector unless new one was allocatedKees Cook
Avoid potential use-after-free condition under memory pressure. If the kzalloc() fails, q_vector will be freed but left in the original adapter->q_vector[v_idx] array position. Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04ixgbevf: Add error messages on vlan errorJan Sokolowski
ixgbevf did not provide an error in dmesg if VLAN addition failed. Add two descriptive failure messages in the kernel log. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04ixgbe: Remove unneeded semicolonYang Li
./drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c:1305:2-3: Unneeded semicolon Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2688 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04ixgbe: Remove local variableAnirudh Venkataramanan
Remove local variable "match" and directly return evaluated conditional instead. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-04ixgbe: change MAX_RXD/MAX_TXD based on adapter typeDaniel Willenson
Set the length limit for the receive descriptor buffer and transmit descriptor buffer based on the controller type. The values used are called out in the controller datasheets as a 'Note:' in the RDLEN and TDLEN register descriptions. This allows the user to use ethtool to allocate larger descriptor buffers in the case where data is received or transmitted too quickly for the driver to keep up. Signed-off-by: Daniel Willenson <daniel@veobot.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-03Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-11-02 (i40e, iavf) This series contains updates to i40e and iavf drivers. Joe Damato adds tracepoint information to i40e_napi_poll to expose helpful debug information for users who'd like to get a better understanding of how their NIC is performing as they adjust various parameters and tuning knobs. Note: this does not touch any XDP related code paths. This tracepoint will only work when not using XDP. Care has been taken to avoid changing control flow in i40e_napi_poll with this change. Alicja adds error messaging for unsupported duplex settings for i40e. Ye Xingchen replaces use of __FUNCTION__ with __func__ for iavf. Bartosz changes tense of device removal message to be more clear on the action for iavf. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: Change information about device removal in dmesg iavf: Replace __FUNCTION__ with __func__ i40e: Add appropriate error message logged for incorrect duplex setting i40e: Add i40e_napi_poll tracepoint i40e: Record number of RXes cleaned during NAPI i40e: Record number TXes cleaned during NAPI i40e: Store the irq number in i40e_q_vector ==================== Link: https://lore.kernel.org/r/20221102211011.2944983-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-11-02 (e1000e, e1000, igc) This series contains updates to e1000e, e1000, and igc drivers. For e1000e, Sasha adds a new board type to help distinguish platforms and adds device id support for upcoming platforms. He also adds trace points for CSME flows to aid in debugging. Ani removes unnecessary kmap_atomic call for e1000 and e1000e. Muhammad sets speed based transmit offsets for launchtime functionality to reduce latency for igc. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: Correct the launchtime offset e1000: Remove unnecessary use of kmap_atomic() e1000e: Remove unnecessary use of kmap_atomic() e1000e: Add e1000e trace module e1000e: Add support for the next LOM generation e1000e: Separate MTP board type from ADP ==================== Link: https://lore.kernel.org/r/20221102203957.2944396-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: remove unused ndo_get_devlink_portJiri Pirko
Remove ndo_get_devlink_port which is no longer used alongside with the implementations in drivers. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_portJiri Pirko
Benefit from the previously implemented tracking of netdev events in devlink code and instead of calling devlink_port_type_eth_set() and devlink_port_type_clear() to set devlink port type and link to related netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port pointer to netdevice which is about to be registered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02igc: Correct the launchtime offsetMuhammad Husaini Zulkifli
The launchtime offset should be corrected according to sections 7.5.2.6 Transmit Scheduling Latency of the Intel Ethernet I225/I226 Software User Manual. Software can compensate the latency between the transmission scheduling and the time that packet is transmitted to the network by setting this GTxOffset register. Without setting this register, there may be a significant delay between the packet scheduling and the network point. This patch helps to reduce the latency for each of the link speed. Before: 10Mbps : 11000 - 13800 nanosecond 100Mbps : 1300 - 1700 nanosecond 1000Mbps : 190 - 600 nanosecond 2500Mbps : 1400 - 1700 nanosecond After: 10Mbps : less than 750 nanosecond 100Mbps : less than 192 nanosecond 1000Mbps : less than 128 nanosecond 2500Mbps : less than 128 nanosecond Test Setup: Talker : Use l2_tai.c to generate the launchtime into packet payload. Listener: Use timedump.c to compute the delta between packet arrival and LaunchTime packet payload. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02e1000: Remove unnecessary use of kmap_atomic()Anirudh Venkataramanan
buffer_info->rxbuf.page accessed in e1000_clean_jumbo_rx_irq() is allocated using GFP_ATOMIC. Pages allocated with GFP_ATOMIC can't come from highmem and so there's no need to kmap() them. Just use page_address(). I don't have access to a 32-bit system so did some limited testing on qemu (qemu-system-i386 -m 4096 -smp 4 -device e1000e) with a 32-bit Debian 11.04 image. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02e1000e: Remove unnecessary use of kmap_atomic()Anirudh Venkataramanan
alloc_rx_buf() allocates ps_page->page and buffer_info->page using either GFP_ATOMIC or GFP_KERNEL. Memory allocated with GFP_KERNEL/GFP_ATOMIC can't come from highmem and so there's no need to kmap() them. Just use page_address(). I don't have access to a 32-bit system so did some limited testing on qemu (qemu-system-i386 -m 4096 -smp 4 -device e1000e) with a 32-bit Debian 11.04 image. Cc: Ira Weiny <ira.weiny@intel.com> Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02e1000e: Add e1000e trace moduleSasha Neftin
Add tracepoints to the driver via a new file e1000e_trace.h and some new trace calls added in interesting places in the driver. Add some tracing for s0ix flows to help in a debug of shared resources with the CSME firmware. The idea here is that tracepoints have such low performance cost when disabled that we can leave these in the upstream driver. Performance not affected, and this can be very useful for debugging and adding new trace events to paths in the future. Usage: echo "e1000e_trace:*" > /sys/kernel/debug/tracing/set_event echo 1 > /sys/kernel/debug/tracing/events/e1000e_trace/enable Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02e1000e: Add support for the next LOM generationSasha Neftin
Add devices IDs for the next LOM generations that will be available on the next Intel Client platforms. This patch provides the initial support for these devices. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02e1000e: Separate MTP board type from ADPSasha Neftin
We have the same LAN controller on different PCH's. Separate MTP board type from an ADP which will allow for specific fixes to be applied for MTP platforms. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02iavf: Change information about device removal in dmesgBartosz Staszewski
Changed information about device removal in dmesg. In function iavf_remove changed printed message from "Remove" to "Removing" after hot vf plug/unplug. Reason for this change is that, that "Removing" word is better because it is clearer for the user that the device is already being removed rather than implying that the user should remove this device. Signed-off-by: Bartosz Staszewski <bartoszx.staszewski@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02iavf: Replace __FUNCTION__ with __func__ye xingchen
__FUNCTION__ exists only for backwards compatibility reasons with old gcc versions. Replace it with __func__. Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02i40e: Add appropriate error message logged for incorrect duplex settingAlicja Kowalska
Nothing logged in dmesg for attempting to set incorrect duplex. Add appropriate error message logged for incorrect duplex setting. Signed-off-by: Alicja Kowalska <alicjax.kowalska@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02i40e: Add i40e_napi_poll tracepointJoe Damato
Add a tracepoint for i40e_napi_poll that allows users to get detailed information about the amount of work done. This information can help users better tune the correct NAPI parameters (like weight and budget), as well as debug NIC settings like rx-usecs and tx-usecs, etc. When perf is attached, this tracepoint only fires when not in XDP mode. An example of the output from this tracepoint: $ sudo perf trace -e i40e:i40e_napi_poll -a --call-graph=fp --libtraceevent_print [..snip..] 388.258 :0/0 i40e:i40e_napi_poll(i40e_napi_poll on dev eth2 q i40e-eth2-TxRx-9 irq 346 irq_mask 00000000,00000000,00000000,00000000,00000000,00800000 curr_cpu 23 budget 64 bpr 64 rx_cleaned 28 tx_cleaned 0 rx_clean_complete 1 tx_clean_complete 1) i40e_napi_poll ([i40e]) i40e_napi_poll ([i40e]) __napi_poll ([kernel.kallsyms]) net_rx_action ([kernel.kallsyms]) __do_softirq ([kernel.kallsyms]) common_interrupt ([kernel.kallsyms]) asm_common_interrupt ([kernel.kallsyms]) intel_idle_irq ([kernel.kallsyms]) cpuidle_enter_state ([kernel.kallsyms]) cpuidle_enter ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) [0x243fd8] ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02i40e: Record number of RXes cleaned during NAPIJoe Damato
Adjust i40e_clean_rx_irq to accept an out parameter which records the number of RX packets cleaned. No XDP related code is modified and care has been taken to avoid changing control flow. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02i40e: Record number TXes cleaned during NAPIJoe Damato
Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores the number TXs cleaned. No XDP related TX code is touched. Care has been taken to avoid changing the control flow of i40e_clean_tx_irq and i40e_napi_poll. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-02i40e: Store the irq number in i40e_q_vectorJoe Damato
Make it easy to figure out the IRQ number for a particular i40e_q_vector by storing the assigned IRQ in the structure itself. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-10-31ptp: introduce helpers to adjust by scaled parts per millionJacob Keller
Many drivers implement the .adjfreq or .adjfine PTP op function with the same basic logic: 1. Determine a base frequency value 2. Multiply this by the abs() of the requested adjustment, then divide by the appropriate divisor (1 billion, or 65,536 billion). 3. Add or subtract this difference from the base frequency to calculate a new adjustment. A few drivers need the difference and direction rather than the combined new increment value. I recently converted the Intel drivers to .adjfine and the scaled parts per million (65.536 parts per billion) logic. To avoid overflow with minimal loss of precision, mul_u64_u64_div_u64 was used. The basic logic used by all of these drivers is very similar, and leads to a lot of duplicate code to perform the same task. Rather than keep this duplicate code, introduce diff_by_scaled_ppm and adjust_by_scaled_ppm. These helper functions calculate the difference or adjustment necessary based on the scaled parts per million input. The diff_by_scaled_ppm function returns true if the difference should be subtracted, and false otherwise. Update the Intel drivers to use the new helper functions. Other vendor drivers will be converted to .adjfine and this helper function in the following changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28ice: Add additional CSR registers to ETHTOOL_GREGSLukasz Czapnik
In the event of a Tx hang it can be useful to read a variety of hardware registers to capture some state about why the transmit queue got stuck. Extend the ETHTOOL_GREGS dump provided by the ice driver with several CSR registers that provide such relevant information regarding the hardware Tx state. This enables capturing relevant data to enable debugging such a Tx hang. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20221027104239.1691549-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).Thomas Gleixner
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c 2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion") abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start") 8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27ice: Add support Flex RXDMichal Jaron
Add new VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC flag, opcode VIRTCHNL_OP_GET_SUPPORTED_RXDIDS and add member rxdid in struct virtchnl_rxq_info to support AVF Flex RXD extension. Add support to allow VF to query flexible descriptor RXDIDs supported by DDP package and configure Rx queues with selected RXDID for IAVF. Add code to allow VIRTCHNL_OP_GET_SUPPORTED_RXDIDS message to be processed. Add necessary macros for registers. Signed-off-by: Leyi Rong <leyi.rong@intel.com> Signed-off-by: Xu Ting <ting.xu@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221025161252.1952939-1-jacob.e.keller@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-25i40e: Fix flow-type by setting GL_HASH_INSET registersSlawomir Laba
Fix setting bits for specific flow_type for GLQF_HASH_INSET register. In previous version all of the bits were set only in hena register, while in inset only one bit was set. In order for this working correctly on all types of cards these bits needs to be set correctly for both hena and inset registers. Fixes: eb0dd6e4a3b3 ("i40e: Allow RSS Hash set with less than four parameters") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-3-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-25i40e: Fix VF hang when reset is triggered on another VFSylwester Dziedziuch
When a reset was triggered on one VF with i40e_reset_vf global PF state __I40E_VF_DISABLE was set on a PF until the reset finished. If immediately after triggering reset on one VF there is a request to reset on another it will cause a hang on VF side because VF will be notified of incoming reset but the reset will never happen because of this global state, we will get such error message: [ +4.890195] iavf 0000:86:02.1: Never saw reset and VF will hang waiting for the reset to be triggered. Fix this by introducing new VF state I40E_VF_STATE_RESETTING that will be set on a VF if it is currently resetting instead of the global __I40E_VF_DISABLE PF state. Fixes: 3ba9bcb4b68f ("i40e: add locking around VF reset") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-2-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-25i40e: Fix ethtool rx-flow-hash setting for X722Slawomir Laba
When enabling flow type for RSS hash via ethtool: ethtool -N $pf rx-flow-hash tcp4|tcp6|udp4|udp6 s|d the driver would fail to setup this setting on X722 device since it was using the mask on the register dedicated for X710 devices. Apply a different mask on the register when setting the RSS hash for the X722 device. When displaying the flow types enabled via ethtool: ethtool -n $pf rx-flow-hash tcp4|tcp6|udp4|udp6 the driver would print wrong values for X722 device. Fix this issue by testing masks for X722 device in i40e_get_rss_hash_opts function. Fixes: eb0dd6e4a3b3 ("i40e: Allow RSS Hash set with less than four parameters") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20221024100526.1874914-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-25ice: Enable RX queue selection using skbedit actionAmritha Nambiar
This patch uses TC skbedit queue_mapping action to support forwarding packets to a device queue. Such filters with action forward to queue will be the highest priority switch filter in HW. Example: $ tc filter add dev ens4f0 protocol ip ingress flower\ dst_ip 192.168.1.12 ip_proto tcp dst_port 5001\ action skbedit queue_mapping 5 skip_sw The above command adds an ingress filter, incoming packets qualifying the match will be accepted into queue 5. The queue number is in decimal format. Refactored ice_add_tc_flower_adv_fltr() to consolidate code with action FWD_TO_VSI and FWD_TO QUEUE. Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-14i40e: Fix DMA mappings leakJan Sokolowski
During reallocation of RX buffers, new DMA mappings are created for those buffers. steps for reproduction: while : do for ((i=0; i<=8160; i=i+32)) do ethtool -G enp130s0f0 rx $i tx $i sleep 0.5 ethtool -g enp130s0f0 done done This resulted in crash: i40e 0000:01:00.1: Unable to allocate memory for the Rx descriptor ring, size=65536 Driver BUG WARNING: CPU: 0 PID: 4300 at net/core/xdp.c:141 xdp_rxq_info_unreg+0x43/0x50 Call Trace: i40e_free_rx_resources+0x70/0x80 [i40e] i40e_set_ringparam+0x27c/0x800 [i40e] ethnl_set_rings+0x1b2/0x290 genl_family_rcv_msg_doit.isra.15+0x10f/0x150 genl_family_rcv_msg+0xb3/0x160 ? rings_fill_reply+0x1a0/0x1a0 genl_rcv_msg+0x47/0x90 ? genl_family_rcv_msg+0x160/0x160 netlink_rcv_skb+0x4c/0x120 genl_rcv+0x24/0x40 netlink_unicast+0x196/0x230 netlink_sendmsg+0x204/0x3d0 sock_sendmsg+0x4c/0x50 __sys_sendto+0xee/0x160 ? handle_mm_fault+0xbe/0x1e0 ? syscall_trace_enter+0x1d3/0x2c0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca RIP: 0033:0x7f5eac8b035b Missing register, driver bug WARNING: CPU: 0 PID: 4300 at net/core/xdp.c:119 xdp_rxq_info_unreg_mem_model+0x69/0x140 Call Trace: xdp_rxq_info_unreg+0x1e/0x50 i40e_free_rx_resources+0x70/0x80 [i40e] i40e_set_ringparam+0x27c/0x800 [i40e] ethnl_set_rings+0x1b2/0x290 genl_family_rcv_msg_doit.isra.15+0x10f/0x150 genl_family_rcv_msg+0xb3/0x160 ? rings_fill_reply+0x1a0/0x1a0 genl_rcv_msg+0x47/0x90 ? genl_family_rcv_msg+0x160/0x160 netlink_rcv_skb+0x4c/0x120 genl_rcv+0x24/0x40 netlink_unicast+0x196/0x230 netlink_sendmsg+0x204/0x3d0 sock_sendmsg+0x4c/0x50 __sys_sendto+0xee/0x160 ? handle_mm_fault+0xbe/0x1e0 ? syscall_trace_enter+0x1d3/0x2c0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca RIP: 0033:0x7f5eac8b035b This was caused because of new buffers with different RX ring count should substitute older ones, but those buffers were freed in i40e_configure_rx_ring and reallocated again with i40e_alloc_rx_bi, thus kfree on rx_bi caused leak of already mapped DMA. Fix this by reallocating ZC with rx_bi_zc struct when BPF program loads. Additionally reallocate back to rx_bi when BPF program unloads. If BPF program is loaded/unloaded and XSK pools are created, reallocate RX queues accordingly in XSP_SETUP_XSK_POOL handler. Fixes: be1222b585fd ("i40e: Separate kernel allocated rx_bi rings from AF_XDP rings") Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Chandan <chandanx.rout@intel.com> (A Contingent Worker at Intel) Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-29Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-09-28 (ice) Arkadiusz implements a single pin initialization function, checking feature bits, instead of having separate device functions and updates sub-device IDs for recognizing E810T devices. Martyna adds support for switchdev filters on VLAN priority field. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for VLAN priority filters in switchdev ice: support features on new E810T variants ice: Merge pin initialization of E810 and E810T adapters ==================== Link: https://lore.kernel.org/r/20220928203217.411078-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28net: drop the weight argument from netif_napi_addJakub Kicinski
We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28ice: Add support for VLAN priority filters in switchdevMartyna Szapar-Mudlaw
Enable support for adding TC rules that filter on the VLAN priority in switchdev mode. VLAN priority are the first 3 bits of 16b switch field vector word which contain also vlan id value within its last 12 bits. When getting vlan priority value from tc match.key it has to be shifted first to proper bits positions (by VLAN_PRIO_SHIFT) and then can be added to the joint 'vlan' field in ice_vlan_hdr in lookup element. The mask of lookup changes accordingly. 0x0FFF - when only vlan id is added in filter 0xE000 - when only vlan priority is added in filter 0xEFFF - when both these values are specified Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-28ice: support features on new E810T variantsArkadiusz Kubalewski
Add new sub-device ids required for proper initialization of features on E810T devices supported by ice driver. Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-28ice: Merge pin initialization of E810 and E810T adaptersArkadiusz Kubalewski
Remove separate function initializing pins for E810T-based adapters and initialize pins based on feature bits. Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27ice: xsk: drop power of 2 ring size restriction for AF_XDPMaciej Fijalkowski
We had multiple customers in the past months that reported commit 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") makes them unable to use ring size of 8160 in conjunction with AF_XDP. Remove this restriction. Fixes: 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") CC: Alasdair McWilliam <alasdair.mcwilliam@outlook.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27ice: xsk: change batched Tx descriptor cleaningMaciej Fijalkowski
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy" way - descriptors are not cleaned immediately after send. We rather hold on with cleaning until we see that free space in ring drops below particular threshold. This was supposed to reduce the amount of unnecessary work related to cleaning and instead of keeping the ring empty, ring was rather saturated. In AF_XDP realm cleaning Tx descriptors implies producing them to CQ. This is a way of letting know user space that particular descriptor has been sent, as John points out in [0]. We tried to implement serial descriptor cleaning which would be used in conjunction with batched cleaning but it made code base more convoluted and probably harder to maintain in future. Therefore we step away from batched cleaning in a current form in favor of an approach where we set RS bit on every last descriptor from a batch and clean always at the beginning of ice_xmit_zc(). This means that we give up a bit of Tx performance, but this doesn't hurt l2fwd scenario which is way more meaningful than txonly as this can be treaten as AF_XDP based packet generator. l2fwd is not hurt due to the fact that Tx side is much faster than Rx and Rx is the one that has to catch Tx up. FWIW Tx descriptors are still produced in a batched way. [0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/ Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27ice: reorder PF/representor devlink port register/unregister flowsJiri Pirko
Make sure that netdevice is registered/unregistered while devlink port is registered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/freescale/fec.h 7b15515fc1ca ("Revert "fec: Restart PPS after link state change"") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/ drivers/pinctrl/pinctrl-ocelot.c c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") 181f604b33cd ("pinctrl: ocelot: add ability to be used in a non-mmio configuration") https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/ tools/testing/selftests/drivers/net/bonding/Makefile bbb774d921e2 ("net: Add tests for bonding and team address list management") 152e8ec77640 ("selftests/bonding: add a test for bonding lladdr target") https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/ drivers/net/can/usb/gs_usb.c 5440428b3da6 ("can: gs_usb: gs_can_open(): fix race dev->can.state condition") 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") https://lore.kernel.org/all/84f45a7d-92b6-4dc5-d7a1-072152fab6ff@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-09-20 (ice) Michal re-sets TC configuration when changing number of queues. Mateusz moves the check and call for link-down-on-close to the specific path for downing/closing the interface. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix interface being down after reset with link-down-on-close flag on ice: config netdev tc before setting queues number ==================== Link: https://lore.kernel.org/r/20220920205344.1860934-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficientLarysa Zaremba
The original patch added the static branch to handle the situation, when assigning an XDP TX queue to every CPU is not possible, so they have to be shared. However, in the XDP transmit handler ice_xdp_xmit(), an error was returned in such cases even before static condition was checked, thus making queue sharing still impossible. Fixes: 22bf877e528f ("ice: introduce XDP_TX fallback path") Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20220919134346.25030-1-larysa.zaremba@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20ice: Add low latency Tx timestamp readKarol Kolacinski
E810 products can support low latency Tx timestamp register read. This requires usage of threaded IRQ instead of kthread to reduce the kthread start latency (spikes up to 20 ms). Add a check for the device capability and use the new method if supported. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220916201728.241510-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20ice: Fix interface being down after reset with link-down-on-close flag onMateusz Palczewski
When performing a reset on ice driver with link-down-on-close flag on interface would always stay down. Fix this by moving a check of this flag to ice_stop() that is called only when user wants to bring interface down. Fixes: ab4ab73fc1ec ("ice: Add ethtool private flag to make forcing link down optional") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Petr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20ice: config netdev tc before setting queues numberMichal Swiatkowski
After lowering number of tx queues the warning appears: "Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled!" Example command to reproduce: ethtool -L enp24s0f0 tx 36 rx 36 Fix this by setting correct tc mapping before setting real number of queues on netdev. Fixes: 0754d65bd4be5 ("ice: Add infrastructure for mqprio support via ndo_setup_tc") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20ice: Add L2TPv3 hardware offload supportMarcin Szycik
Add support for offloading packets based on L2TPv3 session id in switchdev mode. Example filter: tc filter add dev $PF1 ingress prio 1 protocol ip flower ip_proto l2tp \ l2tpv3_sid 1234 skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to be able to specify l2tpv3_sid. ICE COMMS DDP package is required to create a filter as it contains L2TPv3 profiles. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>