Age | Commit message (Collapse) | Author |
|
[ Upstream commit 001ba0902046cb6c352494df610718c0763e77a5 ]
The fec_enet_update_cbd function calls page_pool_dev_alloc_pages but did
not handle the case when it returned NULL. There was a WARN_ON(!new_page)
but it would still proceed to use the NULL pointer and then crash.
This case does seem somewhat rare but when the system is under memory
pressure it can happen. One case where I can duplicate this with some
frequency is when writing over a smbd share to a SATA HDD attached to an
imx6q.
Setting /proc/sys/vm/min_free_kbytes to higher values also seems to solve
the problem for my test case. But it still seems wrong that the fec driver
ignores the memory allocation error and can crash.
This commit handles the allocation error by dropping the current packet.
Fixes: 95698ff6177b5 ("net: fec: using page pool to manage RX buffers")
Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20250113154846.1765414-1-kgroeneveld@lenbrook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c17ff476f53afb30f90bb3c2af77de069c81a622 ]
If coalesce_count is greater than 255 it will not fit in the register and
will overflow. This can be reproduced by running
# ethtool -C ethX rx-frames 256
which will result in a timeout of 0us instead. Fix this by checking for
invalid values and reporting an error.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 16ebb6f5b6295c9688749862a39a4889c56227f8 ]
The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could
potentially have an integer wrapping bug on 32bit systems. Check for
this and return an error.
Fixes: 9816dd35ecec ("nfp: bpf: perf event output helpers support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 258f5f905815979f15d5151d2ea4f20d8e057fe1 ]
Driver always naively assumes, that for PTP purposes, PHY lane to
configure is corresponding to PF ID.
This is not true for some port configurations, e.g.:
- 2x50G per quad, where lanes used are 0 and 2 on each quad, but PF IDs
are 0 and 1
- 100G per quad on 2 quads, where lanes used are 0 and 4, but PF IDs are
0 and 1
Use correct PHY lane assignment by getting and parsing port options.
This is read from the NVM by the FW and provided to the driver with
the indication of active port split.
Remove ice_is_muxed_topo(), which is no longer needed.
Fixes: 4409ea1726cb ("ice: Adjust PTP init for 2x50G E825C devices")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Arkadiusz Kubalewski <Arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e800654e85b5b27966fc6493201f5f8cf658beb6 ]
Use struct ice_adapter to hold shared PTP data and control PTP
related actions instead of auxbus. This allows significant code
simplification and faster access to the container fields used in
the PTP support code.
Move the PTP port list to the ice_adapter container to simplify
the code and avoid race conditions which could occur due to the
synchronous nature of the initialization/access and
certain memory saving can be achieved by moving PTP data into
the ice_adapter itself.
Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Stable-dep-of: 258f5f905815 ("ice: Add correct PHY lane assignment")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 97ed20a01f5b96e8738b53f56ae84b06953a2853 ]
Add ice_get_ctrl_ptp() wrapper to simplify the PTP support code
in the functions that do not use ctrl_pf directly.
Add the control PF pointer to struct ice_adapter
Rearrange fields in struct ice_adapter
Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Stable-dep-of: 258f5f905815 ("ice: Add correct PHY lane assignment")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5e0776451d89eefe66b19e010e48ece1cca07e58 ]
Introduce ice_get_phy_model() to improve code readability
Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Stable-dep-of: 258f5f905815 ("ice: Add correct PHY lane assignment")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2e60560f1ec9b722f9c6699ec5d966f1732d14dd ]
Fix ETH56G FC-FEC incorrect Rx offset value by changing it from -255.96
to -469.26 ns.
Those values are derived from HW spec and reflect internal delays.
Hex value is a fixed point representation in Q23.9 format.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc26548d729e5f732197d2b210fb77c745b01495 ]
Quad registers are read/written incorrectly. E825 devices always use
quad 0 address and differentiate between the PHYs by changing SBQ
destination device (phy_0 or phy_0_peer).
Add helpers for reading/writing PTP registers shared per quad and use
correct quad address and SBQ destination device based on port.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d79c304c76e9b30ff5527afc176b5c4f9f0374b6 ]
Current implementation checks revision of all PHYs on all PFs, which is
incorrect and may result in initialization failure. Check only the
revision of the current PHY.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e7e441a4100e4bc90b52f80494a28a9667993975 ]
When tx_max_frame_size was added to struct ravb_hw_info, no value was
set in ravb_rzv2m_hw_info so the default value of zero was used.
The maximum MTU is set by subtracting from tx_max_frame_size to allow
space for headers and frame checksums. As ndev->max_mtu is unsigned,
this subtraction wraps around leading to a ridiculously large positive
value that is obviously incorrect.
Before tx_max_frame_size was introduced, the maximum MTU was based on
rx_max_frame_size. So, we can restore the correct maximum MTU by copying
the rx_max_frame_size value into tx_max_frame_size for RZ/V2M.
Fixes: 1d63864299ca ("net: ravb: Fix maximum TX frame size for GbEth devices")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://patch.msgid.link/20250109113706.1409149-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f0aa6a37a3dbb40b272df5fc6db93c114688adcd ]
Recalculate features when XDP is detached.
Before:
# ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
# ip li set dev eth0 xdp off
# ethtool -k eth0 | grep gro
rx-gro-hw: off [requested on]
After:
# ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp
# ip li set dev eth0 xdp off
# ethtool -k eth0 | grep gro
rx-gro-hw: on
The fact that HW-GRO doesn't get re-enabled automatically is just
a minor annoyance. The real issue is that the features will randomly
come back during another reconfiguration which just happens to invoke
netdev_update_features(). The driver doesn't handle reconfiguring
two things at a time very robustly.
Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
__bnxt_reserve_rings()") we only reconfigure the RSS hash table
if the "effective" number of Rx rings has changed. If HW-GRO is
enabled "effective" number of rings is 2x what user sees.
So if we are in the bad state, with HW-GRO re-enablement "pending"
after XDP off, and we lower the rings by / 2 - the HW-GRO rings
doing 2x and the ethtool -L doing / 2 may cancel each other out,
and the:
if (old_rx_rings != bp->hw_resc.resv_rx_rings &&
condition in __bnxt_reserve_rings() will be false.
The RSS map won't get updated, and we'll crash with:
BUG: kernel NULL pointer dereference, address: 0000000000000168
RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0
bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180
__bnxt_setup_vnic_p5+0x58/0x110
bnxt_init_nic+0xb72/0xf50
__bnxt_open_nic+0x40d/0xab0
bnxt_open_nic+0x2b/0x60
ethtool_set_channels+0x18c/0x1d0
As we try to access a freed ring.
The issue is present since XDP support was added, really, but
prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in
__bnxt_reserve_rings()") it wasn't causing major issues.
Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.")
Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://patch.msgid.link/20250109043057.2888953-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 03d120f27d050336f7e7d21879891542c4741f81 ]
CPSW ALE has 75-bit ALE entries stored across three 32-bit words.
The cpsw_ale_get_field() and cpsw_ale_set_field() functions support
ALE field entries spanning up to two words at the most.
The cpsw_ale_get_field() and cpsw_ale_set_field() functions work as
expected when ALE field spanned across word1 and word2, but fails when
ALE field spanned across word2 and word3.
For example, while reading the ALE field spanned across word2 and word3
(i.e. bits 62 to 64), the word3 data shifted to an incorrect position
due to the index becoming zero while flipping.
The same issue occurred when setting an ALE entry.
This issue has not been seen in practice but will be an issue in the future
if the driver supports accessing ALE fields spanning word2 and word3
Fix the methods to handle getting/setting fields spanning up to two words.
Fixes: b685f1a58956 ("net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()")
Signed-off-by: Sudheer Kumar Doredla <s-doredla@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20250108172433.311694-1-s-doredla@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0e2909c6bec9048f49d0c8e16887c63b50b14647 ]
When cmd_alloc_index(), fails cmd_work_handler() needs
to complete ent->slotted before returning early.
Otherwise the task which issued the command may hang:
mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2055272e3ae01a954e41a5afb437c5d76f758e0b ]
The pci_irq_vector() function never returns zero. It returns negative
error codes or a positive non-zero IRQ number. Fix the error checking to
test for negatives.
Fixes: a36e9f5cfe9e ("rtase: Add support for a pci table in this module")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/f2ecc88d-af13-4651-9820-7cc665230019@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 426046e2d62dd19533808661e912b8e8a9eaec16 ]
Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be
written to the MGBE_WRAP_AXI_ASID0_CTRL register.
The current driver is hard coded to use MGBE0's SID for all controllers.
This causes softirq time outs and kernel panics when using controllers
other than MGBE0.
Example dmesg errors when an ethernet cable is connected to MGBE1:
[ 116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms
[ 121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter.
[ 121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0
[ 121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171)
[ 121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features
[ 121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported
[ 121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock
[ 121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode
[ 125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx
[ 181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 181.921404] rcu: 7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337
[ 181.921684] rcu: (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8)
[ 181.921878] Sending NMI from CPU 4 to CPUs 7:
[ 181.921886] NMI backtrace for cpu 7
[ 181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
[ 181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
[ 181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 181.922847] pc : handle_softirqs+0x98/0x368
[ 181.922978] lr : __do_softirq+0x18/0x20
[ 181.923095] sp : ffff80008003bf50
[ 181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
[ 181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
[ 181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
[ 181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
[ 181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
[ 181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
[ 181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
[ 181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74
[ 181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1
[ 181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
[ 181.967591] Call trace:
[ 181.970043] handle_softirqs+0x98/0x368 (P)
[ 181.974240] __do_softirq+0x18/0x20
[ 181.977743] ____do_softirq+0x14/0x28
[ 181.981415] call_on_irq_stack+0x24/0x30
[ 181.985180] do_softirq_own_stack+0x20/0x30
[ 181.989379] __irq_exit_rcu+0x114/0x140
[ 181.993142] irq_exit_rcu+0x14/0x28
[ 181.996816] el1_interrupt+0x44/0xb8
[ 182.000316] el1h_64_irq_handler+0x14/0x20
[ 182.004343] el1h_64_irq+0x80/0x88
[ 182.007755] cpuidle_enter_state+0xc4/0x4a8 (P)
[ 182.012305] cpuidle_enter+0x3c/0x58
[ 182.015980] cpuidle_idle_call+0x128/0x1c0
[ 182.020005] do_idle+0xe0/0xf0
[ 182.023155] cpu_startup_entry+0x3c/0x48
[ 182.026917] secondary_start_kernel+0xdc/0x120
[ 182.031379] __secondary_switched+0x74/0x78
[ 212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/.
[ 212.985935] rcu: blocking rcu_node structures (internal RCU debug):
[ 212.992758] Sending NMI from CPU 0 to CPUs 7:
[ 212.998539] NMI backtrace for cpu 7
[ 213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6
[ 213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024
[ 213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 213.040528] pc : handle_softirqs+0x98/0x368
[ 213.046563] lr : __do_softirq+0x18/0x20
[ 213.051293] sp : ffff80008003bf50
[ 213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000
[ 213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0
[ 213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70
[ 213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000
[ 213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000
[ 213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d
[ 213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160
[ 213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608
[ 213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04
[ 213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000
[ 213.162063] Call trace:
[ 213.165494] handle_softirqs+0x98/0x368 (P)
[ 213.171256] __do_softirq+0x18/0x20
[ 213.177291] ____do_softirq+0x14/0x28
[ 213.182017] call_on_irq_stack+0x24/0x30
[ 213.186565] do_softirq_own_stack+0x20/0x30
[ 213.191815] __irq_exit_rcu+0x114/0x140
[ 213.196891] irq_exit_rcu+0x14/0x28
[ 213.202401] el1_interrupt+0x44/0xb8
[ 213.207741] el1h_64_irq_handler+0x14/0x20
[ 213.213519] el1h_64_irq+0x80/0x88
[ 213.217541] cpuidle_enter_state+0xc4/0x4a8 (P)
[ 213.224364] cpuidle_enter+0x3c/0x58
[ 213.228653] cpuidle_idle_call+0x128/0x1c0
[ 213.233993] do_idle+0xe0/0xf0
[ 213.237928] cpu_startup_entry+0x3c/0x48
[ 213.243791] secondary_start_kernel+0xdc/0x120
[ 213.249830] __secondary_switched+0x74/0x78
This bug has existed since the dwmac-tegra driver was added in Dec 2022
(See Fixes tag below for commit hash).
The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit
only uses MGBE0 which is why the bug was not found previously. Connect Tech
has many products that use 2 (or more) MGBE controllers.
The solution is to read the controller's SID from the existing "iommus"
device tree property. The 2nd field of the "iommus" device tree property
is the controller's SID.
Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
smmu_niso0: iommu@12000000 {
compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500";
...
}
/* MGBE1 */
ethernet@6900000 {
compatible = "nvidia,tegra234-mgbe";
...
iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>;
...
}
Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in
the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the
SID using the tegra_dev_iommu_get_stream_id() helper function found in
linux/iommu.h.
Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus"
property is removed from the device tree or the IOMMU is disabled.
While the Tegra234 SOC technically supports bypassing the IOMMU, it is not
supported by the current firmware, has not been tested and not recommended.
More detailed discussion with Thierry Reding from Nvidia linked below.
Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support")
Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com
Signed-off-by: Parker Newman <pnewman@connecttech.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360.git.pnewman@connecttech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9741e72b2286de8b38de9db685588ac421a95c87 ]
Currently, HIP08 devices does not register the ptp devices, so the
hdev->ptp is NULL. But the tx process would still try to set hardware time
stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash.
[ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
...
[ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge]
[ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge]
[ 128.292938] sp : ffff800059b93140
[ 128.297200] x29: ffff800059b93140 x28: 0000000000003280
[ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080
[ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001
[ 128.315969] x23: 0000000000000000 x22: 0000000000000194
[ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000
[ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000
[ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24
[ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000
[ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368
[ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02
[ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0
[ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000
[ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff
[ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294
[ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080
[ 128.390626] Call trace:
[ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge]
[ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3]
[ 128.405468] xmit_one.constprop.0+0xc4/0x200
[ 128.410600] dev_hard_start_xmit+0x54/0xf0
[ 128.415556] sch_direct_xmit+0xe8/0x634
[ 128.420246] __dev_queue_xmit+0x224/0xc70
[ 128.425101] dev_queue_xmit+0x1c/0x40
[ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch]
[ 128.435409] do_output+0x60/0x17c [openvswitch]
[ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch]
[ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch]
[ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch]
[ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch]
[ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch]
[ 128.471460] xmit_one.constprop.0+0xc4/0x200
[ 128.476561] dev_hard_start_xmit+0x54/0xf0
[ 128.481489] __dev_queue_xmit+0x968/0xc70
[ 128.486330] dev_queue_xmit+0x1c/0x40
[ 128.490856] ip_finish_output2+0x250/0x570
[ 128.495810] __ip_finish_output+0x170/0x1e0
[ 128.500832] ip_finish_output+0x3c/0xf0
[ 128.505504] ip_output+0xbc/0x160
[ 128.509654] ip_send_skb+0x58/0xd4
[ 128.513892] udp_send_skb+0x12c/0x354
[ 128.518387] udp_sendmsg+0x7a8/0x9c0
[ 128.522793] inet_sendmsg+0x4c/0x8c
[ 128.527116] __sock_sendmsg+0x48/0x80
[ 128.531609] __sys_sendto+0x124/0x164
[ 128.536099] __arm64_sys_sendto+0x30/0x5c
[ 128.540935] invoke_syscall+0x50/0x130
[ 128.545508] el0_svc_common.constprop.0+0x10c/0x124
[ 128.551205] do_el0_svc+0x34/0xdc
[ 128.555347] el0_svc+0x20/0x30
[ 128.559227] el0_sync_handler+0xb8/0xc0
[ 128.563883] el0_sync+0x160/0x180
Fixes: 0bf5eb788512 ("net: hns3: add support for PTP")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-8-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7997ddd46c54408bcba5e37fe18b4d832e45d4d4 ]
The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs
1024-1279 are in different BAR space addresses. However,
hclge_fetch_pf_reg does not distinguish the tqp space information when
reading the tqp space information. When the number of TQPs is greater
than 1024, access bar space overwriting occurs.
The problem of different segments has been considered during the
initialization of tqp.io_base. Therefore, tqp.io_base is directly used
when the queue is read in hclge_fetch_pf_reg.
The error message:
Unable to handle kernel paging request at virtual address ffff800037200000
pc : hclge_fetch_pf_reg+0x138/0x250 [hclge]
lr : hclge_get_regs+0x84/0x1d0 [hclge]
Call trace:
hclge_fetch_pf_reg+0x138/0x250 [hclge]
hclge_get_regs+0x84/0x1d0 [hclge]
hns3_get_regs+0x2c/0x50 [hns3]
ethtool_get_regs+0xf4/0x270
dev_ethtool+0x674/0x8a0
dev_ioctl+0x270/0x36c
sock_do_ioctl+0x110/0x2a0
sock_ioctl+0x2ac/0x530
__arm64_sys_ioctl+0xa8/0x100
invoke_syscall+0x4c/0x124
el0_svc_common.constprop.0+0x140/0x15c
do_el0_svc+0x30/0xd0
el0_svc+0x1c/0x2c
el0_sync_handler+0xb0/0xb4
el0_sync+0x168/0x180
Fixes: 939ccd107ffc ("net: hns3: move dump regs function to a separate file")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-7-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 247fd1e33e1cd156aabe444e932d2648d33f1245 ]
Currently the misc irq is initialized before reset_timer setup. But
it will access the reset_timer in the irq handler. So initialize
the reset_timer earlier.
Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-6-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 98b1e3b27734139c76295754b6c317aa4df6d32e ]
Currently, there is a time window between misc irq enabled
and service task inited. If an interrupte is reported at
this time, it will cause warning like below:
[ 16.324639] Call trace:
[ 16.324641] __queue_delayed_work+0xb8/0xe0
[ 16.324643] mod_delayed_work_on+0x78/0xd0
[ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge]
[ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge]
[ 16.324666] __handle_irq_event_percpu+0x64/0x1e0
[ 16.324667] handle_irq_event+0x80/0x170
[ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc
[ 16.324671] __handle_domain_irq+0x84/0xfc
[ 16.324673] gic_handle_irq+0x88/0x2c0
[ 16.324674] el1_irq+0xb8/0x140
[ 16.324677] arch_cpu_idle+0x18/0x40
[ 16.324679] default_idle_call+0x5c/0x1bc
[ 16.324682] cpuidle_idle_call+0x18c/0x1c4
[ 16.324684] do_idle+0x174/0x17c
[ 16.324685] cpu_startup_entry+0x30/0x6c
[ 16.324687] secondary_start_kernel+0x1a4/0x280
[ 16.324688] ---[ end trace 6aa0bff672a964aa ]---
So don't auto enable misc vector when request irq..
Fixes: 7be1b9f3e99f ("net: hns3: make hclge_service use delayed workqueue")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-5-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5191a8d3c2ab5bc01930ea3425e06a739af5b0e9 ]
This patch modifies the implementation of debugfs:
When the user process stops unexpectedly, not all data of the file system
is read. In this case, the save_buf pointer is not released. When the
user process is called next time, save_buf is used to copy the cached
data to the user space. As a result, the queried data is stale.
To solve this problem, this patch implements .open() and .release() handler
for debugfs file_operations. moving allocation buffer and execution
of the cmd to the .open() handler and freeing in to the .release() handler.
Allocate separate buffer for each reader and associate the buffer
with the file pointer.
When different user read processes no longer share the buffer,
the stale data problem is fixed.
Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ac1e2836fe294c2007ca81cf7006862c3bdf0510 ]
Currently, the netdev->features is configured in hns3_nic_set_features.
As a result, __netdev_update_features considers that there is no feature
difference, and the procedures of the real features are missing.
Fixes: 2a7556bb2b73 ("net: hns3: implement ndo_features_check ops for hns3 driver")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250106143642.539698-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5a4b584c67699a69981f0740618a144965a63237 ]
When a reset type that is not supported by the driver is input, a reset
pending flag bit of the HNAE3_NONE_RESET type is generated in
reset_pending. The driver does not have a mechanism to clear this type
of error. As a result, the driver considers that the reset is not
complete. This patch provides a mechanism to clear the
HNAE3_NONE_RESET flag and the parameter of
hnae3_ae_ops.set_default_reset_request is verified.
The error message:
hns3 0000:39:01.0: cmd failed -16
hns3 0000:39:01.0: hclge device re-init failed, VF is disabled!
hns3 0000:39:01.0: failed to reset VF stack
hns3 0000:39:01.0: failed to reset VF(4)
hns3 0000:39:01.0: prepare reset(2) wait done
hns3 0000:39:01.0 eth4: already uninitialized
Use the crash tool to view struct hclgevf_dev:
struct hclgevf_dev {
...
default_reset_request = 0x20,
reset_level = HNAE3_NONE_RESET,
reset_pending = 0x100,
reset_type = HNAE3_NONE_RESET,
...
};
Fixes: 720bd5837e37 ("net: hns3: add set_default_reset_request in the hnae3_ae_ops")
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20250106143642.539698-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit db78475ba0d3c66d430f7ded2388cc041078a542 ]
Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine")
added routines to inform the core about XDP flag changes.
GVE support was added around the same time and missed using them.
GVE only changes the flags on error recover or resume.
Presumably the flags may change during resume if VM migrated.
User would not get the notification and upper devices would
not get a chance to recalculate their flags.
Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format")
Reviewed-By: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bd2776e39c2a82ef4681d02678bb77b3d41e79be ]
When booting with a dock connected, the igc driver may get stuck for ~40
seconds if PCIe link is lost during initialization.
This happens because the driver access device after EECD register reads
return all F's, indicating failed reads. Consequently, hw->hw_addr is set
to NULL, which impacts subsequent rd32() reads. This leads to the driver
hanging in igc_get_hw_semaphore_i225(), as the invalid hw->hw_addr
prevents retrieving the expected value.
To address this, a validation check and a corresponding return value
catch is added for the EECD register read result. If all F's are
returned, indicating PCIe link loss, the driver will return -ENXIO
immediately. This avoids the 40-second hang and significantly improves
boot time when using a dock with an igc NIC.
Log before the patch:
[ 0.911913] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 0.912386] igc 0000:70:00.0: PTM enabled, 4ns granularity
[ 1.571098] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
[ 43.449095] igc_get_hw_semaphore_i225: igc 0000:70:00.0 (unnamed net_device) (uninitialized): Driver can't access device - SMBI bit is set.
[ 43.449186] igc 0000:70:00.0: probe with driver igc failed with error -13
[ 46.345701] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 46.345777] igc 0000:70:00.0: PTM enabled, 4ns granularity
Log after the patch:
[ 1.031000] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 1.032097] igc 0000:70:00.0: PTM enabled, 4ns granularity
[ 1.642291] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
[ 5.480490] igc 0000:70:00.0: enabling device (0000 -> 0002)
[ 5.480516] igc 0000:70:00.0: PTM enabled, 4ns granularity
Fixes: ab4056126813 ("igc: Add NVM support")
Cc: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6c5b989116083a98f45aada548ff54e7a83a9c2d ]
ptp4l application reports too high offset when ran on E823 device
with a 100GB/s link. Those values cannot go under 100ns, like in a
working case when using 100 GB/s cable.
This is due to incorrect frequency settings on the PHY clocks for
100 GB/s speed. Changes are introduced to align with the internal
hardware documentation, and correctly initialize frequency in PHY
clocks with the frequency values that are in our HW spec.
To reproduce the issue run ptp4l as a Time Receiver on E823 device,
and observe the offset, which will never approach values seen
in the PTP working case.
Reproduction output:
ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf
ptp4l[5278.775]: master offset 12470 s2 freq +41288 path delay -3002
ptp4l[5278.837]: master offset 10525 s2 freq +39202 path delay -3002
ptp4l[5278.900]: master offset -24840 s2 freq -20130 path delay -3002
ptp4l[5278.963]: master offset 10597 s2 freq +37908 path delay -3002
ptp4l[5279.025]: master offset 8883 s2 freq +36031 path delay -3002
ptp4l[5279.088]: master offset 7267 s2 freq +34151 path delay -3002
ptp4l[5279.150]: master offset 5771 s2 freq +32316 path delay -3002
ptp4l[5279.213]: master offset 4388 s2 freq +30526 path delay -3002
ptp4l[5279.275]: master offset -30434 s2 freq -28485 path delay -3002
ptp4l[5279.338]: master offset -28041 s2 freq -27412 path delay -3002
ptp4l[5279.400]: master offset 7870 s2 freq +31118 path delay -3002
Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support")
Reviewed-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 65104599b3a8ed42d85b3f8f27be650afe1f3a7e ]
Mask admin command returned max phase adjust value for both input and
output pins. Only 31 bits are relevant, last released data sheet wrongly
points that 32 bits are valid - see [1] 3.2.6.4.1 Get CCU Capabilities
Command for reference. Fix of the datasheet itself is in progress.
Fix the min/max assignment logic, previously the value was wrongly
considered as negative value due to most significant bit being set.
Example of previous broken behavior:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
--do pin-get --json '{"id":1}'| grep phase-adjust
'phase-adjust': 0,
'phase-adjust-max': 16723,
'phase-adjust-min': -16723,
Correct behavior with the fix:
$ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \
--do pin-get --json '{"id":1}'| grep phase-adjust
'phase-adjust': 0,
'phase-adjust-max': 2147466925,
'phase-adjust-min': -2147466925,
[1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true
Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4c1224501e9d6c5fd12d83752f1c1b444e0e3418 ]
During ARP failure, tid is not inserted but _c4iw_free_ep()
attempts to remove tid which results in error.
This patch fixes the issue by avoiding removal of uninserted tid.
Fixes: 59437d78f088 ("cxgb4/chtls: fix ULD connection failures due to wrong TID base")
Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40452969a50652e3cbf89dac83d54eebf2206d27 ]
DIM work will call the firmware to adjust the coalescing parameters on
the RX rings. We should cancel DIM work before we call the firmware
to free the RX rings. Otherwise, FW will reject the call from DIM
work if the RX ring has been freed. This will generate an error
message like this:
bnxt_en 0000:21:00.1 ens2f1np1: hwrm req_type 0x53 seq id 0x6fca error 0x2
and cause unnecessary concern for the user. It is also possible to
modify the coalescing parameters of the wrong ring if the ring has
been re-allocated.
To prevent this, cancel DIM work right before freeing the RX rings.
We also have to add a check in NAPI poll to not schedule DIM if the
RX rings are shutting down. Check that the VNIC is active before we
schedule DIM. The VNIC is always disabled before we free the RX rings.
Fixes: 0bc0b97fca73 ("bnxt_en: cleanup DIM work on device shutdown")
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250104043849.3482067-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c8dafb0e4398dacc362832098a04b97da3b0395b ]
When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop()
which could cause a memory leak.
Fixes: bbf33d1d9805 ("bnxt_en: update all firmware calls to use the new APIs")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8c817eb26230dc0ae553cee16ff43a4a895f6756 ]
Add an array size limit to the for-loop to be sure we don't try
to reference a fw_version string off the end of the fw info names
array. We know that our firmware only has a limited number
of firmware slot names, but we shouldn't leave this unchecked.
Fixes: 45d76f492938 ("pds_core: set up device and adminq")
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8ce4f287524c74a118b0af1eebd4b24a8efca57a ]
The existing SW-FW interaction flow on the driver is wrong. Follow this
wrong flow, driver would never return error if there is a unknown command.
Since firmware writes back 'firmware ready' and 'unknown command' in the
mailbox message if there is an unknown command sent by driver. So reading
'firmware ready' does not timeout. Then driver would mistakenly believe
that the interaction has completed successfully.
It tends to happen with the use of custom firmware. Move the check for
'unknown command' out of the poll timeout for 'firmware ready'. And adjust
the debug log so that mailbox messages are always printed when commands
timeout.
Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit fb3a9a1165cea104b5ab3753e88218e4497b01c1 upstream.
Commit ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI")
moved XSK TX processing to be part of the RX NAPI. However, that commit
did not include triggering the RX NAPI in gve_xsk_wakeup. This is
necessary because the TX NAPI only processes TX completions, meaning
that a TX wakeup would not actually trigger XSK descriptor processing.
Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as
the NAPI should be scheduled whether the wakeup is for RX or TX.
Fixes: ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Link: https://patch.msgid.link/20241221032807.302244-1-pkaligineedi@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ff7c2dea9dd1a436fc79d6273adffdcc4a7ffea3 upstream.
In GVE, dedicated XDP queues only exist when an XDP program is installed
and the interface is up. As such, the NDO XDP XMIT callback should
return early if either of these conditions are false.
In the case of no loaded XDP program, priv->num_xdp_queues=0 which can
cause a divide-by-zero error, and in the case of interface down,
num_xdp_queues remains untouched to persist XDP queue count for the next
interface up, but the TX pointer itself would be NULL.
The XDP xmit callback also needs to synchronize with a device
transitioning from open to close. This synchronization will happen via
the GVE_PRIV_FLAGS_NAPI_ENABLED bit along with a synchronize_net() call,
which waits for any RCU critical sections at call-time to complete.
Fixes: 39a7f4aa3e4a ("gve: Add XDP REDIRECT support for GQI-QPL format")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Shailend Chand <shailend@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit de63ac44a527b2c5067551dbd70d939fe151325a upstream.
This patch fixes a number of consistency issues in the queue allocation
path related to XDP.
As it stands, the number of allocated XDP queues changes in three
different scenarios.
1) Adding an XDP program while the interface is up via
gve_add_xdp_queues
2) Removing an XDP program while the interface is up via
gve_remove_xdp_queues
3) After queues have been allocated and the old queue memory has been
removed in gve_queues_start.
However, the requirement for the interface to be up for
gve_(add|remove)_xdp_queues to be called, in conjunction with the fact
that the number of queues stored in priv isn't updated until _after_ XDP
queues have been allocated in the normal queue allocation path means
that if an XDP program is added while the interface is down, XDP queues
won't be added until the _second_ if_up, not the first.
Given the expectation that the number of XDP queues is equal to the
number of RX queues, scenario (3) has another problematic implication.
When changing the number of queues while an XDP program is loaded, the
number of XDP queues must be updated as well, as there is logic in the
driver (gve_xdp_tx_queue_id()) which relies on every RX queue having a
corresponding XDP TX queue. However, the number of XDP queues stored in
priv would not be updated until _after_ a close/open leading to a
mismatch in the number of XDP queues reported vs the number of XDP
queues which actually exist after the queue count update completes.
This patch remedies these issues by doing the following:
1) The allocation config getter function is set up to retrieve the
_expected_ number of XDP queues to allocate instead of relying
on the value stored in `priv` which is only updated once the queues
have been allocated.
2) When adjusting queues, XDP queues are adjusted to match the number of
RX queues when XDP is enabled. This only works in the case when
queues are live, so part (1) of the fix must still be available in
the case that queues are adjusted when there is an XDP program and
the interface is down.
Fixes: 5f08cd3d6423 ("gve: Alloc before freeing when adjusting queues")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Shailend Chand <shailend@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 40338d7987d810fcaa95c500b1068a52b08eec9b upstream.
This patch predicates the enabling and disabling of XSK pools on the
existence of queues. As it stands, if the interface is down, disabling
or enabling XSK pools would result in a crash, as the RX queue pointer
would be NULL. XSK pool registration will occur as part of the next
interface up.
Similarly, xsk_wakeup needs be guarded against queues disappearing
while the function is executing, so a check against the
GVE_PRIV_FLAGS_NAPI_ENABLED flag is added to synchronize with the
disabling of the bit and the synchronize_net() in gve_turndown.
Fixes: fd8e40321a12 ("gve: Add AF_XDP zero-copy support for GQI-QPL format")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Shailend Chand <shailend@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6321f5fb70d502d95de8a212a7b484c297ec9644 upstream.
When stopping XDP TX rings, the XDP clean function needs to be called to
clean out the entire queue, similar to what happens in the normal TX
queue case. Otherwise, the FIFO won't be cleared correctly, and
xsk_tx_completed won't be reported.
Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ba0925c34e0fa6fe02d3d642bc02ab099ab312c7 upstream.
When busy polling is enabled, xsk_sendmsg for AF_XDP zero copy marks
the NAPI ID corresponding to the memory pool allocated for the socket.
In GVE, this NAPI ID will never correspond to a NAPI ID of one of the
dedicated XDP TX queues registered with the umem because XDP TX is not
set up to share a NAPI with a corresponding RX queue.
This patch moves XSK TX descriptor processing from the TX NAPI to the RX
NAPI, and the gve_xsk_wakeup callback is updated to use the RX NAPI
instead of the TX NAPI, accordingly. The branch on if the wakeup is for
TX is removed, as the NAPI poll should be invoked whether the wakeup is
for TX or for RX.
Fixes: fd8e40321a12 ("gve: Add AF_XDP zero-copy support for GQI-QPL format")
Cc: stable@vger.kernel.org
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4a4d38ace1fb0586bffd2aab03caaa05d6011748 upstream.
The Host Port (i.e. CPU facing port) of CPSW receives traffic from Linux
via TX DMA Channels which are Hardware Queues consisting of traffic
categorized according to their priority. The Host Port is configured to
dequeue traffic from these Hardware Queues on the basis of priority i.e.
as long as traffic exists on a Hardware Queue of a higher priority, the
traffic on Hardware Queues of lower priority isn't dequeued. An alternate
operation is also supported wherein traffic can be dequeued by the Host
Port in a Round-Robin manner.
Until commit under Fixes, the am65-cpsw driver enabled a single TX DMA
Channel, due to which, unless modified by user via "ethtool", all traffic
from Linux is transmitted on DMA Channel 0. Therefore, configuring
the Host Port for priority based dequeuing or Round-Robin operation
is identical since there is a single DMA Channel.
Since commit under Fixes, all 8 TX DMA Channels are enabled by default.
Additionally, the default "tc mapping" doesn't take into account
the possibility of different traffic profiles which various users
might have. This results in traffic starvation at the Host Port
due to the priority based dequeuing which has been enabled by default
since the inception of the driver. The traffic starvation triggers
NETDEV WATCHDOG timeout for all TX DMA Channels that haven't been serviced
due to the presence of traffic on the higher priority TX DMA Channels.
Fix this by defaulting to Round-Robin dequeuing at the Host Port, which
shall ensure that traffic is dequeued from all TX DMA Channels irrespective
of the traffic profile. This will address the NETDEV WATCHDOG timeouts.
At the same time, users can still switch from Round-Robin to Priority
based dequeuing at the Host Port with the help of the "p0-rx-ptype-rrobin"
private flag of "ethtool". Users are expected to setup an appropriate
"tc mapping" that suits their traffic profile when switching to priority
based dequeuing at the Host Port.
Fixes: be397ea3473d ("net: ethernet: am65-cpsw: Set default TX channels to maximum")
Cc: <stable@vger.kernel.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20241220075618.228202-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 03c8d0af2e409e15c16130b185e12b5efba0a6b9 upstream.
A Marvell 88E8075 ethernet controller has this device ID instead of
11ab:4370 and works fine with the sky2 driver.
Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9b115361248dc6cce182a2dc030c1c70b0a9639e ]
When ICSSG interfaces are brought down and brought up again, the
pru cores are shut down and booted again, flushing out all the memories
and start again in a clean state. Hence it is expected that the
IEP_CMP_CFG register needs to be flushed during iep_init() to ensure
that the existing residual configuration doesn't cause any unusual
behavior. If the register is not cleared, existing IEP_CMP_CFG set for
CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values.
After bringing the interface up, calling PPS enable doesn't work as
the driver believes PPS is already enabled, (iep->pps_enabled is not
cleared during interface bring down) and driver will just return true
even though there is no signal. Fix this by disabling pps and perout.
Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9facce84f4062f782ebde18daa7006a23d40b607 ]
Timesync related operations are ran in PRU0 cores for both ICSSG SLICE0
and SLICE1. Currently whenever any ICSSG interface comes up we load the
respective firmwares to PRU cores and whenever interface goes down, we
stop the resective cores. Due to this, when SLICE0 goes down while
SLICE1 is still active, PRU0 firmwares are unloaded and PRU0 core is
stopped. This results in clock jump for SLICE1 interface as the timesync
related operations are no longer running.
As there are interdependencies between SLICE0 and SLICE1 firmwares,
fix this by running both PRU0 and PRU1 firmwares as long as at least 1
ICSSG interface is up. Add new flag in prueth struct to check if all
firmwares are running and remove the old flag (fw_running).
Use emacs_initialized as reference count to load the firmwares for the
first and last interface up/down. Moving init_emac_mode and fw_offload_mode
API outside of icssg_config to icssg_common_start API as they need
to be called only once per firmware boot.
Change prueth_emac_restart() to return error code and add error prints
inside the caller of this functions in case of any failures.
Move prueth_emac_stop() from common to sr1 driver.
sr1 and sr2 drivers have different logic handling for stopping
the firmwares. While sr1 driver is dependent on emac structure
to stop the corresponding pru cores for that slice, for sr2
all the pru cores of both the slices are stopped and is not
dependent on emac. So the prueth_emac_stop() function is no
longer common and can be moved to sr1 driver.
Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a8620de72e5676993ec3a3b975f7c10908f5f60f ]
In efx_tc_ct_zone_ht_params, the key_len was previously set to
offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect
because it includes any padding between the zone field and the linkage
field due to structure alignment, which can vary between systems.
This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone)
, ensuring that the hash table correctly uses the zone as the key. This fix
prevents potential hash lookup errors and improves connection tracking
reliability.
Fixes: c3bb5c6acd4e ("sfc: functions to register for conntrack zone offload")
Signed-off-by: Liang Jie <liangjie@lixiang.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ad5c318086e2e23b577eca33559c5ebf89bc7eb9 ]
Current implementation of mv643xx_eth_shared_of_add_port() calls
of_parse_phandle(), but does not release the refcount on error. Call
of_node_put() in the error path and in mv643xx_eth_shared_of_remove().
This bug was found by an experimental verification tool that I am
developing.
Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b255ef45fcc2141c1bf98456796abb956d843a27 ]
Check the return value of clk_prepare_enable to ensure that priv->clk has
been successfully enabled.
If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume,
or bcm_sysport_open, it must not be disabled in any subsequent execution
paths.
Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources")
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2a4f56fbcc473d8faeb29b73082df39efbe5893c ]
In the cited commit, when changing from switchdev to legacy mode,
uplink representor's netdev is kept, and its profile is replaced with
nic profile, so netdev is detached from old profile, then attach to
new profile.
During profile change, the hardware resources allocated by the old
profile will be cleaned up. However, the cleanup is relying on the
related kernel modules. And they may need to flush themselves first,
which is triggered by netdev events, for example, NETDEV_UNREGISTER.
However, netdev is kept, or netdev_register is called after the
cleanup, which may cause troubles because the resources are still
referred by kernel modules.
The same process applies to all the caes when uplink is leaving
switchdev mode, including devlink eswitch mode set legacy, driver
unload and devlink reload. For the first one, it can be blocked and
returns failure to users, whenever possible. But it's hard for the
others. Besides, the attachment to nic profile is unnecessary as the
netdev will be unregistered anyway for such cases.
So in this patch, the original behavior is kept only for devlink
eswitch set mode legacy. For the others, moves netdev unregistration
before the profile change.
Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5a03b368562a7ff5f5f1f63b5adf8309cbdbd5be ]
During driver unload, unregister_netdev is called after unloading
vport rep. So, the mlx5e_rep_priv is already freed while trying to get
rpriv->netdev, or walk rpriv->tc_ht, which results in use-after-free.
So add the checking to make sure access the data of vport rep which is
still loaded.
Fixes: d1569537a837 ("net/mlx5e: Modify and restore TC rules for IPSec TX rules")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241220081505.1286093-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8c6254479b3d5bd788d2b5fefaa48fb194331ed0 ]
In MACsec, it is possible to create multiple active TX SAs on a SC,
but only one such SA can be used at a time for transmission. This SA
is selected through the encoding_sa link parameter.
When there are 2 or more active TX SAs configured (encoding_sa=0):
ip macsec add macsec0 tx sa 0 pn 1 on key 00 <KEY1>
ip macsec add macsec0 tx sa 1 pn 1 on key 00 <KEY2>
... the traffic should be still sent via TX SA 0 as the encoding_sa was
not changed. However, the driver ignores the encoding_sa and overrides
it to SA 1 by installing the flow steering id of the newly created TX SA
into the SCI -> flow steering id hash map. The future packet tx
descriptors will point to the incorrect flow steering rule (SA 1).
This patch fixes the issue by avoiding the creation of the flow steering
rule for an active TX SA that is not the encoding_sa. The driver side
tx_sa object and the FW side macsec object are still created. When the
encoding_sa link parameter is changed to another active TX SA, only the
new flow steering rule will be created in the mlx5e_macsec_upd_txsa()
handler.
Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241220081505.1286093-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 050a4c011b0dfeb91664a5d7bd3647ff38db08ce ]
When creating a software steering completion queue (CQ), an arbitrary
MSIX vector n is selected. This results in the CQ sharing the same
Ethernet traffic channel n associated with the chosen vector. However,
the value of n is often unpredictable, which can introduce complications
for interrupt monitoring and verification tools.
Moreover, SW steering uses polling rather than event-driven interrupts.
Therefore, there is no need to select any MSIX vector other than the
existing vector 0 for CQ creation.
In light of these factors, and to enhance predictability, we modify the
code to consistently select MSIX vector 0 for CQ creation.
Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241220081505.1286093-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b5a7b661a073727219fedc35f5619f62418ffe72 ]
The device denoted by tunnel->parms.link resides in the underlay net
namespace. Therefore pass tunnel->net to ip_tunnel_init_flow().
Fixes: db53cd3d88dc ("net: Handle l3mdev in ip_tunnel_init_flow")
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20241219130336.103839-1-shaw.leon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|