summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)Author
12 daysice: fix Rx page leak on multi-buffer framesJacob Keller
The ice_put_rx_mbuf() function handles calling ice_put_rx_buf() for each buffer in the current frame. This function was introduced as part of handling multi-buffer XDP support in the ice driver. It works by iterating over the buffers from first_desc up to 1 plus the total number of fragments in the frame, cached from before the XDP program was executed. If the hardware posts a descriptor with a size of 0, the logic used in ice_put_rx_mbuf() breaks. Such descriptors get skipped and don't get added as fragments in ice_add_xdp_frag. Since the buffer isn't counted as a fragment, we do not iterate over it in ice_put_rx_mbuf(), and thus we don't call ice_put_rx_buf(). Because we don't call ice_put_rx_buf(), we don't attempt to re-use the page or free it. This leaves a stale page in the ring, as we don't increment next_to_alloc. The ice_reuse_rx_page() assumes that the next_to_alloc has been incremented properly, and that it always points to a buffer with a NULL page. Since this function doesn't check, it will happily recycle a page over the top of the next_to_alloc buffer, losing track of the old page. Note that this leak only occurs for multi-buffer frames. The ice_put_rx_mbuf() function always handles at least one buffer, so a single-buffer frame will always get handled correctly. It is not clear precisely why the hardware hands us descriptors with a size of 0 sometimes, but it happens somewhat regularly with "jumbo frames" used by 9K MTU. To fix ice_put_rx_mbuf(), we need to make sure to call ice_put_rx_buf() on all buffers between first_desc and next_to_clean. Borrow the logic of a similar function in i40e used for this same purpose. Use the same logic also in ice_get_pgcnts(). Instead of iterating over just the number of fragments, use a loop which iterates until the current index reaches to the next_to_clean element just past the current frame. Unlike i40e, the ice_put_rx_mbuf() function does call ice_put_rx_buf() on the last buffer of the frame indicating the end of packet. For non-linear (multi-buffer) frames, we need to take care when adjusting the pagecnt_bias. An XDP program might release fragments from the tail of the frame, in which case that fragment page is already released. Only update the pagecnt_bias for the first descriptor and fragments still remaining post-XDP program. Take care to only access the shared info for fragmented buffers, as this avoids a significant cache miss. The xdp_xmit value only needs to be updated if an XDP program is run, and only once per packet. Drop the xdp_xmit pointer argument from ice_put_rx_mbuf(). Instead, set xdp_xmit in the ice_clean_rx_irq() function directly. This avoids needing to pass the argument and avoids an extra bit-wise OR for each buffer in the frame. Move the increment of the ntc local variable to ensure its updated *before* all calls to ice_get_pgcnts() or ice_put_rx_mbuf(), as the loop logic requires the index of the element just after the current frame. Now that we use an index pointer in the ring to identify the packet, we no longer need to track or cache the number of fragments in the rx_ring. Cc: Christoph Petrausch <christoph.petrausch@deepl.com> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Closes: https://lore.kernel.org/netdev/CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com/ Fixes: 743bbd93cf29 ("ice: put Rx buffers after being done with current frame") Tested-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Tested-by: Priya Singh <priyax.singh@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ice: fix NULL access of tx->in_use in ice_ll_ts_intrJacob Keller
Recent versions of the E810 firmware have support for an extra interrupt to handle report of the "low latency" Tx timestamps coming from the specialized low latency firmware interface. Instead of polling the registers, software can wait until the low latency interrupt is fired. This logic makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ll_ts_intr() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the issues fixed in the ice_ptp_ts_irq() function. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: 82e71b226e0e ("ice: Enable SW interrupt from FW for LL TS") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ice: fix NULL access of tx->in_use in ice_ptp_ts_irqJacob Keller
The E810 device has support for a "low latency" firmware interface to access and read the Tx timestamps. This interface does not use the standard Tx timestamp logic, due to the latency overhead of proxying sideband command requests over the firmware AdminQ. The logic still makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ptp_ts_irq() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the following: [245977.278756] BUG: kernel NULL pointer dereference, address: 0000000000000000 [245977.278774] RIP: 0010:_find_first_bit+0x19/0x40 [245977.278796] Call Trace: [245977.278809] ? ice_misc_intr+0x364/0x380 [ice] This can occur if a Tx timestamp interrupt races with the driver reset logic. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: f9472aaabd1f ("ice: Process TSYN IRQ in a separate function") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: fix incorrect counter for buffer allocation failuresMichal Kubiak
Currently, the driver increments `alloc_page_failed` when buffer allocation fails in `ice_clean_rx_irq()`. However, this counter is intended for page allocation failures, not buffer allocation issues. This patch corrects the counter by incrementing `alloc_buf_failed` instead, ensuring accurate statistics reporting for buffer allocation failures. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reported-by: Jacob Keller <jacob.e.keller@intel.com> Suggested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: use fixed adapter index for E825C embedded devicesJacob Keller
The ice_adapter structure is used by the ice driver to connect multiple physical functions of a device in software. It was introduced by commit 0e2bddf9e5f9 ("ice: add ice_adapter for shared data across PFs on the same NIC") and is primarily used for PTP support, as well as for handling certain cross-PF synchronization. The original design of ice_adapter used PCI address information to determine which devices should be connected. This was extended to support E825C devices by commit fdb7f54700b1 ("ice: Initial support for E825C hardware in ice_adapter"), which used the device ID for E825C devices instead of the PCI address. Later, commit 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") replaced the use of Bus/Device/Function addressing with use of the device serial number. E825C devices may appear in "Dual NAC" configuration which has multiple physical devices tied to the same clock source and which need to use the same ice_adapter. Unfortunately, each "NAC" has its own NVM which has its own unique Device Serial Number. Thus, use of the DSN for connecting ice_adapter does not work properly. It "worked" in the pre-production systems because the DSN was not initialized on the test NVMs and all the NACs had the same zero'd serial number. Since we cannot rely on the DSN, lets fall back to the logic in the original E825C support which used the device ID. This is safe for E825C only because of the embedded nature of the device. It isn't a discreet adapter that can be plugged into an arbitrary system. All E825C devices on a given system are connected to the same clock source and need to be configured through the same PTP clock. To make this separation clear, reserve bit 63 of the 64-bit index values as a "fixed index" indicator. Always clear this bit when using the device serial number as an index. For E825C, use a fixed value defined as the 0x579C E825C backplane device ID bitwise ORed with the fixed index indicator. This is slightly different than the original logic of just using the device ID directly. Doing so prevents a potential issue with systems where only one of the NACs is connected with an external PHY over SGMII. In that case, one NAC would have the E825C_SGMII device ID, but the other would not. Separate the determination of the full 64-bit index from the 32-bit reduction logic. Provide both ice_adapter_index() and a wrapping ice_adapter_xa_index() which handles reducing the index to a long on 32-bit systems. As before, cache the full index value in the adapter structure to warn about collisions. This fixes issues with E825C not initializing PTP on both NACs, due to failure to connect the appropriate devices to the same ice_adapter. Fixes: 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: don't leave device non-functional if Tx scheduler config failsJacob Keller
The ice_cfg_tx_topo function attempts to apply Tx scheduler topology configuration based on NVM parameters, selecting either a 5 or 9 layer topology. As part of this flow, the driver acquires the "Global Configuration Lock", which is a hardware resource associated with programming the DDP package to the device. This "lock" is implemented by firmware as a way to guarantee that only one PF can program the DDP for a device. Unlike a traditional lock, once a PF has acquired this lock, no other PF will be able to acquire it again (including that PF) until a CORER of the device. Future requests to acquire the lock report that global configuration has already completed. The following flow is used to program the Tx topology: * Read the DDP package for scheduler configuration data * Acquire the global configuration lock * Program Tx scheduler topology according to DDP package data * Trigger a CORER which clears the global configuration lock This is followed by the flow for programming the DDP package: * Acquire the global configuration lock (again) * Download the DDP package to the device * Release the global configuration lock. However, if configuration of the Tx topology fails, (i.e. ice_get_set_tx_topo returns an error code), the driver exits ice_cfg_tx_topo() immediately, and fails to trigger CORER. While the global configuration lock is held, the firmware rejects most AdminQ commands, as it is waiting for the DDP package download (or Tx scheduler topology programming) to occur. The current driver flows assume that the global configuration lock has been reset by CORER after programming the Tx topology. Thus, the same PF attempts to acquire the global lock again, and fails. This results in the driver reporting "an unknown error occurred when loading the DDP package". It then attempts to enter safe mode, but ultimately fails to finish ice_probe() since nearly all AdminQ command report error codes, and the driver stops loading the device at some point during its initialization. The only currently known way that ice_get_set_tx_topo() can fail is with certain older DDP packages which contain invalid topology configuration, on firmware versions which strictly validate this data. The most recent releases of the DDP have resolved the invalid data. However, it is still poor practice to essentially brick the device, and prevent access to the device even through safe mode or recovery mode. It is also plausible that this command could fail for some other reason in the future. We cannot simply release the global lock after a failed call to ice_get_set_tx_topo(). Releasing the lock indicates to firmware that global configuration (downloading of the DDP) has completed. Future attempts by this or other PFs to load the DDP will fail with a report that the DDP package has already been downloaded. Then, PFs will enter safe mode as they realize that the package on the device does not meet the minimum version requirement to load. The reported error messages are confusing, as they indicate the version of the default "safe mode" package in the NVM, rather than the version of the file loaded from /lib/firmware. Instead, we need to trigger CORER to clear global configuration. This is the lowest level of hardware reset which clears the global configuration lock and related state. It also clears any already downloaded DDP. Crucially, it does *not* clear the Tx scheduler topology configuration. Refactor ice_cfg_tx_topo() to always trigger a CORER after acquiring the global lock, regardless of success or failure of the topology configuration. We need to re-initialize the HW structure when we trigger the CORER. Thus, it makes sense for this to be the responsibility of ice_cfg_tx_topo() rather than its caller, ice_init_tx_topology(). This avoids needless re-initialization in cases where we don't attempt to update the Tx scheduler topology, such as if it has already been programmed. There is one catch: failure to re-initialize the HW struct should stop ice_probe(). If this function fails, we won't have a valid HW structure and cannot ensure the device is functioning properly. To handle this, ensure ice_cfg_tx_topo() returns a limited set of error codes. Set aside one specifically, -ENODEV, to indicate that the ice_init_tx_topology() should fail and stop probe. Other error codes indicate failure to apply the Tx scheduler topology. This is treated as a non-fatal error, with an informational message informing the system administrator that the updated Tx topology did not apply. This allows the device to load and function with the default Tx scheduler topology, rather than failing to load entirely. Note that this use of CORER will not result in loops with future PFs attempting to also load the invalid Tx topology configuration. The first PF will acquire the global configuration lock as part of programming the DDP. Each PF after this will attempt to acquire the global lock as part of programming the Tx topology, and will fail with the indication from firmware that global configuration is already complete. Tx scheduler topology configuration is only performed during driver init (probe or devlink reload) and not during cleanup for a CORER that happens after probe completes. Fixes: 91427e6d9030 ("ice: Support 5 layer topology") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: fix NULL pointer dereference in ice_unplug_aux_dev() on resetEmil Tantilov
Issuing a reset when the driver is loaded without RDMA support, will results in a crash as it attempts to remove RDMA's non-existent auxbus device: echo 1 > /sys/class/net/<if>/device/reset BUG: kernel NULL pointer dereference, address: 0000000000000008 ... RIP: 0010:ice_unplug_aux_dev+0x29/0x70 [ice] ... Call Trace: <TASK> ice_prepare_for_reset+0x77/0x260 [ice] pci_dev_save_and_disable+0x2c/0x70 pci_reset_function+0x88/0x130 reset_store+0x5a/0xa0 kernfs_fop_write_iter+0x15e/0x210 vfs_write+0x273/0x520 ksys_write+0x6b/0xe0 do_syscall_64+0x79/0x3b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ice_unplug_aux_dev() checks pf->cdev_info->adev for NULL pointer, but pf->cdev_info will also be NULL, leading to the deref in the trace above. Introduce a flag to be set when the creation of the auxbus device is successful, to avoid multiple NULL pointer checks in ice_unplug_aux_dev(). Fixes: c24a65b6a27c7 ("iidc/ice/irdma: Update IDC to support multiple consumers") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-25Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== libie: commonize adminq structure Michal Swiatkowski says: It is a prework to allow reusing some specific Intel code (eq. fwlog). Move common *_aq_desc structure to libie header and changing it in ice, ixgbe, i40e and iavf. Only generic adminq commands can be easily moved to common header, as rest is slightly different. Format remains the same. It will be better to correctly move it when it will be needed to commonize other part of the code. Move *_aq_str() to new libie module (libie_adminq) and use it across drivers. The functions are exactly the same in each driver. Some more adminq helpers/functions can be moved to libie_adminq when needed. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: i40e: use libie_aq_str iavf: use libie_aq_str ice: use libie_aq_str libie: add adminq helper for converting err to str iavf: use libie adminq descriptors i40e: use libie adminq descriptors ixgbe: use libie adminq descriptors ice, libie: move generic adminq descriptors to lib ==================== Link: https://patch.msgid.link/20250724182826.3758850-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25net: Fix typosBjorn Helgaas
Fix typos in comments and error messages. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250723201528.2908218-1-helgaas@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc8). Conflicts: drivers/net/ethernet/microsoft/mana/gdma_main.c 9669ddda18fb ("net: mana: Fix warnings for missing export.h header inclusion") 755391121038 ("net: mana: Allocate MSI-X vectors dynamically") https://lore.kernel.org/20250711130752.23023d98@canb.auug.org.au Adjacent changes: drivers/net/ethernet/ti/icssg/icssg_prueth.h 6e86fb73de0f ("net: ti: icssg-prueth: Fix buffer allocation for ICSSG") ffe8a4909176 ("net: ti: icssg-prueth: Read firmware-names from device tree") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24ice: use libie_aq_strMichal Swiatkowski
Simple: s/ice_aq_str/libie_aq_str Add libie_aminq module in ice Kconfig. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-24i40e: use libie adminq descriptorsMichal Swiatkowski
Use libie_aq_desc instead of i40e_aq_desc. Do needed changes to allow clean build. Get version descriptor is a little less detailed on i40e. To not mess up with shifting or union inside libie desc use get version descriptor from i40e. Move additional caps for i40e to libie. Fix RCT in declaration that is using libie_aq_desc; Use libie_aq_raw() wherever it can be used. The libie aq error is extended, cover it in ice driver just to clean build. In next patches the libie code for that will be used in each of intel driver. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-24ice, libie: move generic adminq descriptors to libMichal Swiatkowski
The descriptor structure is the same in ice, ixgbe and i40e. Move it to common libie header to use it across different driver. Leave device specific adminq commands in separate folders. This lead to a change that need to be done in filling/getting descriptor: - previous: struct specific_desc *cmd; cmd = &desc.params.specific_desc; - now: struct specific_desc *cmd; cmd = libie_aq_raw(&desc); Do this changes across the driver to allow clean build. The casting only have to be done in case of specific descriptors, for generic one union can still be used. Changes beside code moving: - change ICE_ prefix to LIBIE_ prefix (ice_ and libie_ too) - remove shift variables not otherwise needed (in libie_aq_flags) - fill/get descriptor data based on desc.params.raw whenever the descriptor isn't defined in libie - move defines from the libie_aq_sth structure outside - add libie_aq_raw helper and use it instead of explicit casting Reviewed by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-21ice: Fix a null pointer dereference in ice_copy_and_init_pkg()Haoxiang Li
Add check for the return value of devm_kmemdup() to prevent potential null pointer dereference. Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: breakout common LAG code into helpersDave Ertman
In the VF handling code, parts of the code for lag can be broken out into helper functions to reduce code duplication. Break this code out into helper functions Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: convert ice_add_prof() to bitmapJesse Brandeburg
Previously the ice_add_prof() took an array of u8 and looped over it with for_each_set_bit(), examining each 8 bit value as a bitmap. This was just hard to understand and unnecessary, and was triggering undefined behavior sanitizers with unaligned accesses within bitmap fields (on our internal tools/builds). Since the @ptype being passed in was already declared as a bitmap, refactor this to use native types with the advantage of simplifying the code to use a single loop. Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> CC: Jesse Brandeburg <jbrandeburg@cloudflare.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: add E835 device IDsDawid Osuchowski
E835 is an enhanced version of the E830. It continues to use the same set of commands, registers and interfaces as other devices in the 800 Series. Following device IDs are added: - 0x1248: Intel(R) Ethernet Controller E835-CC for backplane - 0x1249: Intel(R) Ethernet Controller E835-CC for QSFP - 0x124A: Intel(R) Ethernet Controller E835-CC for SFP - 0x1261: Intel(R) Ethernet Controller E835-C for backplane - 0x1262: Intel(R) Ethernet Controller E835-C for QSFP - 0x1263: Intel(R) Ethernet Controller E835-C for SFP - 0x1265: Intel(R) Ethernet Controller E835-L for backplane - 0x1266: Intel(R) Ethernet Controller E835-L for QSFP - 0x1267: Intel(R) Ethernet Controller E835-L for SFP Reviewed-by: Konrad Knitter <konrad.knitter@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: add 40G speed to Admin Command GET PORT OPTIONAleksandr Loktionov
Introduce the ICE_AQC_PORT_OPT_MAX_LANE_40G constant and update the code to process this new option in both the devlink and the Admin Queue Command GET PORT OPTION (opcode 0x06EA) message, similar to existing constants like ICE_AQC_PORT_OPT_MAX_LANE_50G, ICE_AQC_PORT_OPT_MAX_LANE_100G, and so on. This feature allows the driver to correctly report configuration options for 2x40G on E823 and other cards in the future via devlink. Example command: devlink port split pci/0000:01:00.0/0 count 2 Example dmesg: ice 0000:01:00.0: Available port split options and max port speeds (Gbps): ice 0000:01:00.0: Status Split Quad 0 Quad 1 ice 0000:01:00.0: count L0 L1 L2 L3 L4 L5 L6 L7 ice 0000:01:00.0: 2 40 - - - 40 - - - ice 0000:01:00.0: 2 50 - 50 - - - - - ice 0000:01:00.0: 4 25 25 25 25 - - - - ice 0000:01:00.0: 4 25 25 - - 25 25 - - ice 0000:01:00.0: Active 8 10 10 10 10 10 10 10 10 ice 0000:01:00.0: 1 100 - - - - - - - Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc7). Conflicts: Documentation/netlink/specs/ovpn.yaml 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets") af52020fc599 ("ovpn: reject unexpected netlink attributes") drivers/net/phy/phy_device.c a44312d58e78 ("net: phy: Don't register LEDs for genphy") f0f2b992d818 ("net: phy: Don't register LEDs for genphy") https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org drivers/net/wireless/intel/iwlwifi/fw/regulatory.c drivers/net/wireless/intel/iwlwifi/mld/regulatory.c 5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap") ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware") net/ipv6/mcast.c ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()") a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()") https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15ice: check correct pointer in fwlog debugfsMichal Swiatkowski
pf->ice_debugfs_pf_fwlog should be checked for an error here. Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-15ice: add NULL check in eswitch lag checkDave Ertman
The function ice_lag_is_switchdev_running() is being called from outside of the LAG event handler code. This results in the lag->upper_netdev being NULL sometimes. To avoid a NULL-pointer dereference, there needs to be a check before it is dereferenced. Fixes: 776fe19953b0 ("ice: block default rule setting on LAG interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: introduce ice_get_vf_by_dev() wrapperJacob Keller
The ice_get_vf_by_id() function is used to obtain a reference to a VF structure based on its ID. The ice_sriov_set_msix_vec_count() function needs to get a VF reference starting from the VF PCI device, and uses pci_iov_vf_id() to get the VF ID. This pattern is currently uncommon in the ice driver. However, the live migration module will introduce many more such locations. Add a helper wrapper ice_get_vf_by_dev() which takes the VF PCI device and calls ice_get_vf_by_id() using pci_iov_vf_id() to get the VF ID. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: avoid rebuilding if MSI-X vector count is unchangedJacob Keller
Commit 05c16687e0cc ("ice: set MSI-X vector count on VF") added support to change the vector count for VFs as part of ice_sriov_set_msix_vec_count(). This function modifies and rebuilds the target VF with the requested number of MSI-X vectors. Future support for live migration will add a call to ice_sriov_set_msix_vec_count() to ensure that a migrated VF has the proper MSI-X vector count. In most cases, this request will be to set the MSI-X vector count to its current value. In that case, no work is necessary. Rather than requiring the caller to check this, update the function to check and exit early if the vector count is already at the requested value. This avoids an unnecessary VF rebuild. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: use pci_iov_vf_id() to get VF IDJacob Keller
The ice_sriov_set_msix_vec_count() obtains the VF device ID in a strange way by iterating over the possible VF IDs and calling pci_iov_virtfn_devfn to calculate the device and function combos and compare them to the pdev->devfn. This is unnecessary. The pci_iov_vf_id() helper already exists which does the reverse calculation of pci_iov_virtfn_devfn(), which is much simpler and avoids the loop construction. Use this instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: expose VF functions used by live migrationJacob Keller
The live migration process will require configuring the target VF with the data provided from the source host. A few helper functions in ice_sriov.c and ice_virtchnl.c will be needed for this process, but are currently static. Expose these functions in their respective headers so that the live migration module can use them during the migration process. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: move ice_vsi_update_l2tsel to ice_lib.cJacob Keller
A future change is going to need to call ice_vsi_update_l2tsel from a new context outside of ice_virtchnl.c Since this function deals with a generic VSI, move it into ice_lib.c to enable calling it from other places in the ice driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: save RSS hash configuration for migrationJacob Keller
The VF can program the RSS hash configuration over virtchnl. It does this by sending a u64 bitmask which represents the current hash configuration. It is not trivial to reverse the hardware configuration back to this hash set for migration. Instead, save the value to the ice_vf structure when its modified by the VF. The rss_hashcfg value is an 8-byte field. Make room for it in ice_vf by re-arranging some of the existing fields. There is a 4-byte gap after the first_vector_idx, and a 4-byte gap between max_tx_rate and vf_states. Move first_vector_idx into the later 4-byte gap, creating an 8 byte area where rss_hashcfg can be placed. Also move the num_msix field near min_tx_rate, filling 2 bytes of a 3 byte hole. The end result of these changes enables placing the rss_hashcfg field into the structure while also saving 8 bytes in size. It looks like there are a handful of more possible cleanups to reduce the size even further, but those have been left as a future cleanup. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: add functions to get and set Tx queue contextJacob Keller
The live migration driver will need to save and restore the Tx queue context state from the hardware registers. This state contains both static fields which do not change during Tx traffic as well as dynamic fields which may change during Tx traffic. Unlike the Rx context, the Tx queue context is accessed indirectly from GLCOMM_QTX_CNTX_CTL and GLCOMM_QTX_CNTX_DATA registers. These registers are shared by multiple PFs on the same PCIe card. Multiple PFs cannot safely access the registers simultaneously, and there is no hardware semaphore or logic to control access. To handle this, introduce the txq_ctx_lock to the ice_adapter structure. This is similar to the ptp_gltsyn_time_lock. All PFs on the same adapter share this structure, and use it to serialize access to the registers to prevent error. Add a new functions to get and set the Tx queue context through the GLCOMM_QTX_CNTX_CTL interface. The hardware context values are stored in the registers using the same packed format as the Admin Queue buffer. The hardware buffer is 40 bytes wide, as it contains an additional 18 bytes of internal state not sent with the Admin Queue buffer. For this reason, a separate typedef and packing function must be used. We can share the same packed fields definitions because we never need to unpack the internal state. This is preferred, as it ensures the internal state is zero'd when writing into HW, and avoids issues with reading by u32 registers into a buffer of 22 bytes in length. Thanks to the typedefs, misuse of the API with the wrong size buffer can easily be caught at compile time. Note reading this data from hardware is essential because the current Tx queue context may be different from the context as initially programmed by the driver during VF initialization. When migrating a VF we must ensure the target VF has identical context as the source VF did. Co-developed-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Yahui Cao <yahui.cao@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-10ice: add support for reading and unpacking Rx queue contextJacob Keller
In order to support live migration, the ice driver will need to read certain data from the Rx queue context. This is stored in the hardware in a packed format. Since we use <linux/packing.h> for the mapping between the packed hardware format and the unpacked structure, it is trivial to enable unpacking support via the unpack_fields() function. Add the ice_unpack_rxq_ctx() function based on the unpack_fields() API. Re-use the same field definitions from the packing implementation. Add ice_copy_rxq_ctx_from_hw() to copy the Rx queue context data from the hardware registers. Use these to implement ice_read_rxq_ctx() which will return the Rx queue context to the caller in its unpacked ice_rlan_ctx struct. This will enable the migration logic access to the relevant data about the Rx device queues. It can easily be copied to the target system as part of the migration payload, where it will be used to configure the Rx queues. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-08eth: ice: drop the dead code related to rss_contextsJakub Kicinski
ICE appears to have some odd form of rss_context use plumbed in for .get_rxfh. The .set_rxfh side does not support creating contexts, however, so this must be dead code. For at least a year now (since commit 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")) we have not been calling .get_rxfh with a non-zero rss_context. We just get the info from the RSS XArray under dev->ethtool. Remove what must be dead code in the driver, clear the support flags. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707184115.2285277-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-03ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()Vladimir Oltean
New timestamping API was introduced in commit 66f7223039c0 ("net: add NDOs for configuring hardware timestamping") from kernel v6.6. It is time to convert the Intel ice driver to the new API, so that timestamping configuration can be removed from the ndo_eth_ioctl() path completely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Milena Olech <milena.olech@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-27ice: add ref-sync dpll pinsArkadiusz Kubalewski
Implement reference sync input pin get/set callbacks, allow user space control over dpll pin pairs capable of reference sync support. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://patch.msgid.link/20250626135219.1769350-4-arkadiusz.kubalewski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-26ice: default to TIME_REF instead of TXCO on E825-CJacob Keller
The driver currently defaults to the internal oscillator as the clock source for E825-C hardware. While this clock source is labeled TCXO, indicating a temperature compensated oscillator, this is only true for some board designs. Many board designs have a less capable oscillator. The E825-C hardware may also have its clock source set to the TIME_REF pin. This pin is connected to the DPLL and is often a more stable clock source. The choice of the internal oscillator is not suitable for all systems, especially those which want to enable SyncE support. There is currently no interface available for users to configure the clock source. Other variants of the E82x board have the clock source configured in the NVM, but E825-C lacks this capability, so different board designs cannot select a different default clock via firmware. In most setups, the TIME_REF is a suitable default clock source. Additionally, we now fall back to the internal oscillator automatically if the TIME_REF clock source cannot be locked. Change the default clock source for E825-C to TIME_REF. Note that the driver logs a dev_dbg message upon configuring the TSPLL which includes the clock source and frequency. This can be enabled to confirm which clock source is in use. Longterm a proper interface to dynamically introspect and change the clock source will be designed (perhaps some extension of the DPLL subsystem?) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: move TSPLL init calls to ice_ptp.cKarol Kolacinski
Initialize TSPLL after initializing PHC in ice_ptp.c instead of calling for each product in PHC init in ice_ptp_hw.c. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: fall back to TCXO on TSPLL lock failKarol Kolacinski
TSPLL can fail when trying to lock to TIME_REF as a clock source, e.g. when the external clock source is not stable or connected to the board. To continue operation after failure, try to lock again to internal TCXO and inform user about this. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: wait before enabling TSPLLKarol Kolacinski
To ensure proper operation, wait for 10 to 20 microseconds before enabling TSPLL. Adjust wait time after enabling TSPLL from 1-5 ms to 1-2 ms. Those values are empirical and tested on multiple HW configurations. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: add multiple TSPLL helpersKarol Kolacinski
Add helpers for checking TSPLL params, disabling sticky bits, configuring TSPLL and getting default clock frequency to simplify the code flows. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: use bitfields instead of unions for CGU regsKarol Kolacinski
Switch from unions with bitfield structs to definitions with bitfield masks. This is necessary, because some registers have different field definitions or even use a different register for the same fields based on HW type. Remove unused register fields. Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: read TSPLL registers again before reporting statusJacob Keller
After programming the TSPLL, re-read the registers before reporting status. This ensures the debug log message will show what was actually programmed, rather than relying on a cached value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-26ice: clear time_sync_en field for E825-C during reprogrammingJacob Keller
When programming the Clock Generation Unit for E285-C hardware, we need to clear the time_sync_en bit of the DWORD 9 before we set the frequency. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-21Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Separate TSPLL from PTP and clean up [part] Jake Keller says: Separate TSPLL related functions and definitions from all PTP-related files and clean up the code by implementing multiple helpers. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: add TSPLL log config helper ice: use designated initializers for TSPLL consts ice: remove ice_tspll_params_e825 definitions ice: fix E825-C TSPLL register definitions ice: rename TSPLL and CGU functions and definitions ice: move TSPLL functions to a separate file ==================== Link: https://patch.msgid.link/20250618174231.3100231-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18udp_tunnel: remove rtnl_lock dependencyStanislav Fomichev
Drivers that are using ops lock and don't depend on RTNL lock still need to manage it because udp_tunnel's RTNL dependency. Introduce new udp_tunnel_nic_lock and use it instead of rtnl_lock. Drop non-UDP_TUNNEL_NIC_INFO_MAY_SLEEP mode from udp_tunnel infra (udp_tunnel_nic_device_sync_work needs to grab udp_tunnel_nic_lock mutex and might sleep). Cover more places in v4: - netlink - udp_tunnel_notify_add_rx_port (ndo_open) - triggers udp_tunnel_nic_device_sync_work - udp_tunnel_notify_del_rx_port (ndo_stop) - triggers udp_tunnel_nic_device_sync_work - udp_tunnel_get_rx_info (__netdev_update_features) - triggers NETDEV_UDP_TUNNEL_PUSH_INFO - udp_tunnel_drop_rx_info (__netdev_update_features) - triggers NETDEV_UDP_TUNNEL_DROP_INFO - udp_tunnel_nic_reset_ntf (ndo_open) - notifiers - udp_tunnel_nic_netdevice_event, depending on the event: - triggers NETDEV_UDP_TUNNEL_PUSH_INFO - triggers NETDEV_UDP_TUNNEL_DROP_INFO - ethnl_tunnel_info_reply_size - udp_tunnel_nic_set_port_priv (two intel drivers) Cc: Michael Chan <michael.chan@broadcom.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com> Link: https://patch.msgid.link/20250616162117.287806-4-stfomichev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18ice: add TSPLL log config helperKarol Kolacinski
Add a helper function to print new/current TSPLL config. This helps avoid unnecessary casts from u8 to enums. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-18ice: use designated initializers for TSPLL constsKarol Kolacinski
Instead of multiple comments, use designated initializers for TSPLL consts. Adjust ice_tspll_params_e82x fields sizes. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-18ice: remove ice_tspll_params_e825 definitionsKarol Kolacinski
Remove ice_tspll_params_e825 definitions as according to EDS (Electrical Design Specification) doc, E825 devices support only 156.25 MHz TSPLL frequency for both TCXO and TIME_REF clock source. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-18ice: fix E825-C TSPLL register definitionsJacob Keller
The E825-C hardware has a slightly different register layout for register 19 of the Clock Generation Unit and TSPLL. The fbdiv_intgr value can be 10 bits wide. Additionally, most of the fields that were in register 24 are made available in register 23 instead. The programming logic already has a corrected definition for register 23, but it incorrectly still used the 8-bit definition of fbdiv_intgr. This results in truncating some of the values of fbdiv_intgr, including the value used for the 156.25MHz signal. The driver only used register 24 to obtain the enable status, which we should read from register 23. This results in an incorrect output for the log messages, but does not change any functionality besides disabled-by-default dynamic debug messages. Fix the register definitions, and adjust the code to properly reflect the enable/disable status in the log messages. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-18ice: rename TSPLL and CGU functions and definitionsKarol Kolacinski
Rename TSPLL and CGU functions, definitions etc. to match the file name and have consistent naming scheme. Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-18ice: move TSPLL functions to a separate fileKarol Kolacinski
Collect TSPLL related functions and definitions and move them to a separate file to have all TSPLL functionality in one place. Move CGU related functions and definitions to ice_common.* Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-17ice: fix eswitch code memory leak in reset scenarioGrzegorz Nitka
Add simple eswitch mode checker in attaching VF procedure and allocate required port representor memory structures only in switchdev mode. The reset flows triggers VF (if present) detach/attach procedure. It might involve VF port representor(s) re-creation if the device is configured is switchdev mode (not legacy one). The memory was blindly allocated in current implementation, regardless of the mode and not freed if in legacy mode. Kmemeleak trace: unreferenced object (percpu) 0x7e3bce5b888458 (size 40): comm "bash", pid 1784, jiffies 4295743894 hex dump (first 32 bytes on cpu 45): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 0): pcpu_alloc_noprof+0x4c4/0x7c0 ice_repr_create+0x66/0x130 [ice] ice_repr_create_vf+0x22/0x70 [ice] ice_eswitch_attach_vf+0x1b/0xa0 [ice] ice_reset_all_vfs+0x1dd/0x2f0 [ice] ice_pci_err_resume+0x3b/0xb0 [ice] pci_reset_function+0x8f/0x120 reset_store+0x56/0xa0 kernfs_fop_write_iter+0x120/0x1b0 vfs_write+0x31c/0x430 ksys_write+0x61/0xd0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Testing hints (ethX is PF netdev): - create at least one VF echo 1 > /sys/class/net/ethX/device/sriov_numvfs - trigger the reset echo 1 > /sys/class/net/ethX/device/reset Fixes: 415db8399d06 ("ice: make representor code generic") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>