summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-07Merge branch 'net-stmmac-correct-mac-propagation-delay'Jakub Kicinski
Johannes Zink says: ==================== net: stmmac: correct MAC propagation delay Changes in v3: - work in Richard's review feedback. Thank you for reviewing my patch: - as some of the hardware may have no or invalid correction value registers: introduce feature switch which can be enabled in the glue code drivers depending on the actual hardware support - only enable the feature on the i.MX8MP for the time being, as the patch improves timing accuracy and is tested for this hardware - Link to v2: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de Changes in v2: - fix builds for 32bit, this was found by the kernel build bot Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/ - while at it also fix an overflow by shifting a u32 constant from macro by 10bits by casting the constant to u64 - Link to v1: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v1-1-768aa4d09334@pengutronix.de Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # imx8mp ==================== Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-0-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07net: stmmac: dwmac-imx: enable MAC propagation delay correction for i.MX8MPJohannes Zink
As the i.MX8MP supports reading MAC propagation delay and correcting the Hardware timestamp counter for additional delays [1], enable the feature for this SoC. This reduces phase error of the PPS output from the PTP Hardware Clock from approx 150ns to 100ns. [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp correction" Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-2-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07net: stmmac: correct MAC propagation delayJohannes Zink
The IEEE1588 Standard specifies that the timestamps of Packets must be captured when the PTP message timestamp point (leading edge of first octet after the start of frame delimiter) crosses the boundary between the node and the network. As the MAC latches the timestamp at an internal point, the captured timestamp must be corrected for the additional data transmission latency, as described in the publicly available datasheet [1]. This patch only corrects for the MAC-Internal delay, which can be read out from the MAC_Ingress_Timestamp_Latency register on DWMAC version 5, since the Phy framework currently does not support querying the Phy ingress and egress latency. The Closs Domain Crossing Circuits errors as indicated in [1] are already being accounted in the stmmac_get_tx_hwtstamp() function and are not corrected here. As the Latency varies for different link speeds and MII modes of operation, the correction value needs to be updated on each link state change. As the delay also causes a phase shift in the timestamp counter compared to the rest of the network, this correction will also reduce phase error when generating PPS outputs from the timestamp counter. Since the correction registers may be unavailable on some hardware and no feature bits are documented for dynamically detection of the MAC propagation delay readout, introduce a feature bit to explicitely enable MAC delay Correction in the gluecode driver. [1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp correction" Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-1-61e63427735e@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07net/mlx5e: Add capability check for vnic countersLama Kayal
Add missing capability check for each of the vnic counters exposed by devlink health reporter, and thus avoid unexpected behavior due to invalid access to registers. While at it, read only the exact number of bits for each counter whether it was 32 bits or 64 bits. Fixes: b0bc615df488 ("net/mlx5: Add vnic devlink health reporter to PFs/VFs") Fixes: a33682e4e78e ("net/mlx5e: Expose catastrophic steering error counters") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Reload auxiliary devices in pci error handlersMoshe Shemesh
Handling pci errors should fully teardown and load back auxiliary devices, same as done through mlx5 health recovery flow. Fixes: 72ed5d5624af ("net/mlx5: Suspend auxiliary devices only in case of PCI device suspend") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Skip clock update work when device is in error stateMoshe Shemesh
When device is in error state, marked by the flag MLX5_DEVICE_STATE_INTERNAL_ERROR, the HW and PCI may not be accessible and so clock update work should be skipped. Furthermore, such access through PCI in error state, after calling mlx5_pci_disable_device() can result in failing to recover from pci errors. Fixes: ef9814deafd0 ("net/mlx5e: Add HW timestamping (TS) support") Reported-and-tested-by: Ganesh G R <ganeshgr@linux.ibm.com> Closes: https://lore.kernel.org/netdev/9bdb9b9d-140a-7a28-f0de-2e64e873c068@nvidia.com Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: LAG, Check correct bucket when modifying LAGShay Drory
Cited patch introduced buckets in hash mode, but missed to update the ports/bucket check when modifying LAG. Fix the check. Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5e: Unoffload post act rule when handling FIB eventsChris Mi
If having the following tc rule on stack device: filter parent ffff: protocol ip pref 3 flower chain 1 filter parent ffff: protocol ip pref 3 flower chain 1 handle 0x1 dst_mac 24:25:d0:e1:00:00 src_mac 02:25:d0:25:01:02 eth_type ipv4 ct_state +trk+new in_hw in_hw_count 1 action order 1: ct commit zone 0 pipe index 2 ref 1 bind 1 installed 3807 sec used 3779 sec firstused 3800 sec Action statistics: Sent 120 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 2: tunnel_key set src_ip 192.168.1.25 dst_ip 192.168.1.26 key_id 4 dst_port 4789 csum pipe index 3 ref 1 bind 1 installed 3807 sec used 3779 sec firstused 3800 sec Action statistics: Sent 120 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed action order 3: mirred (Egress Redirect to device vxlan1) stolen index 9 ref 1 bind 1 installed 3807 sec used 3779 sec firstused 3800 sec Action statistics: Sent 120 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 used_hw_stats delayed When handling FIB events, the rule in post act will not be deleted. And because the post act rule has packet reformat and modify header actions, also will hit the following syndromes: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:829:(pid 11613): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444), err(-22) mlx5_core 0000:08:00.0: mlx5_cmd_out_err:829:(pid 11613): DEALLOC_PACKET_REFORMAT_CONTEXT(0x93e) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x179e84), err(-22) Fix it by unoffloading post act rule when handling FIB events. Fixes: 314e1105831b ("net/mlx5e: Add post act offload/unoffload API") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Fix devlink controller number for ECVFDaniel Jurgens
The controller number for ECVFs is always 0, because the ECPF must be the eswitch owner for EC VFs to be enabled. Fixes: dc13180824b7 ("net/mlx5: Enable devlink port for embedded cpu VF vports") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Allow 0 for total host VFsDaniel Jurgens
When querying eswitch functions 0 is a valid number of host VFs. After introducing ARM SRIOV falling through to getting the max value from PCI results in using the total VFs allowed on the ARM for the host. Fixes: 86eec50beaf3 ("net/mlx5: Support querying max VFs from device"); Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Return correct EC_VF function IDDaniel Jurgens
The ECVF function ID range is 1..max_ec_vfs. Currently mlx5_vport_to_func_id returns 0..max_ec_vfs - 1. Which results in a syndrome when querying the caps with more recent firmware, or reading incorrect caps with older firmware that supports EC VFs. Fixes: 9ac0b128248e ("net/mlx5: Update vport caps query/set for EC VFs") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: DR, Fix wrong allocation of modify hdr patternYevgeny Kliteynik
Fixing wrong calculation of the modify hdr pattern size, where the previously calculated number would not be enough to accommodate the required number of actions. Fixes: da5d0027d666 ("net/mlx5: DR, Add cache for modify header pattern") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5e: TC, Fix internal port memory leakJianbo Liu
The flow rule can be splited, and the extra post_act rules are added to post_act table. It's possible to trigger memleak when the rule forwards packets from internal port and over tunnel, in the case that, for example, CT 'new' state offload is allowed. As int_port object is assigned to the flow attribute of post_act rule, and its refcnt is incremented by mlx5e_tc_int_port_get(), but mlx5e_tc_int_port_put() is not called, the refcnt is never decremented, then int_port is never freed. The kmemleak reports the following error: unreferenced object 0xffff888128204b80 (size 64): comm "handler20", pid 50121, jiffies 4296973009 (age 642.932s) hex dump (first 32 bytes): 01 00 00 00 19 00 00 00 03 f0 00 00 04 00 00 00 ................ 98 77 67 41 81 88 ff ff 98 77 67 41 81 88 ff ff .wgA.....wgA.... backtrace: [<00000000e992680d>] kmalloc_trace+0x27/0x120 [<000000009e945a98>] mlx5e_tc_int_port_get+0x3f3/0xe20 [mlx5_core] [<0000000035a537f0>] mlx5e_tc_add_fdb_flow+0x473/0xcf0 [mlx5_core] [<0000000070c2cec6>] __mlx5e_add_fdb_flow+0x7cf/0xe90 [mlx5_core] [<000000005cc84048>] mlx5e_configure_flower+0xd40/0x4c40 [mlx5_core] [<000000004f8a2031>] mlx5e_rep_indr_offload.isra.0+0x10e/0x1c0 [mlx5_core] [<000000007df797dc>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core] [<0000000016c15cc3>] tc_setup_cb_add+0x1cf/0x410 [<00000000a63305b4>] fl_hw_replace_filter+0x38f/0x670 [cls_flower] [<000000008bc9e77c>] fl_change+0x1fd5/0x4430 [cls_flower] [<00000000e7f766e4>] tc_new_tfilter+0x867/0x2010 [<00000000e101c0ef>] rtnetlink_rcv_msg+0x6fc/0x9f0 [<00000000e1111d44>] netlink_rcv_skb+0x12c/0x360 [<0000000082dd6c8b>] netlink_unicast+0x438/0x710 [<00000000fc568f70>] netlink_sendmsg+0x794/0xc50 [<0000000016e92590>] sock_sendmsg+0xc5/0x190 So fix this by moving int_port cleanup code to the flow attribute free helper, which is used by all the attribute free cases. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5e: Take RTNL lock when needed before calling xdp_set_features()Gal Pressman
Hold RTNL lock when calling xdp_set_features() with a registered netdev, as the call triggers the netdev notifiers. This could happen when switching from uplink rep to nic profile for example. This resolves the following call trace: RTNL: assertion failed at net/core/dev.c (1953) WARNING: CPU: 6 PID: 112670 at net/core/dev.c:1953 call_netdevice_notifiers_info+0x7c/0x80 Modules linked in: sch_mqprio sch_mqprio_lib act_tunnel_key act_mirred act_skbedit cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress bonding ib_umad ip_gre rdma_ucm mlx5_vfio_pci ipip tunnel4 ip6_gre gre mlx5_ib vfio_pci vfio_pci_core vfio_iommu_type1 ib_uverbs vfio mlx5_core ib_ipoib geneve nf_tables ip6_tunnel tunnel6 iptable_raw openvswitch nsh rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: ib_uverbs] CPU: 6 PID: 112670 Comm: devlink Not tainted 6.4.0-rc7_for_upstream_min_debug_2023_06_28_17_02 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:call_netdevice_notifiers_info+0x7c/0x80 Code: 90 ff 80 3d 2d 6b f7 00 00 75 c5 ba a1 07 00 00 48 c7 c6 e4 ce 0b 82 48 c7 c7 c8 f4 04 82 c6 05 11 6b f7 00 01 e8 a4 7c 8e ff <0f> 0b eb a2 0f 1f 44 00 00 55 48 89 e5 41 54 48 83 e4 f0 48 83 ec RSP: 0018:ffff8882a21c3948 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff82e6f880 RCX: 0000000000000027 RDX: ffff88885f99b5c8 RSI: 0000000000000001 RDI: ffff88885f99b5c0 RBP: 0000000000000028 R08: ffff88887ffabaa8 R09: 0000000000000003 R10: ffff88887fecbac0 R11: ffff88887ff7bac0 R12: ffff8882a21c3968 R13: ffff88811c018940 R14: 0000000000000000 R15: ffff8881274401a0 FS: 00007fe141c81800(0000) GS:ffff88885f980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f787c28b948 CR3: 000000014bcf3005 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn+0x79/0x120 ? call_netdevice_notifiers_info+0x7c/0x80 ? report_bug+0x17c/0x190 ? handle_bug+0x3c/0x60 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? call_netdevice_notifiers_info+0x7c/0x80 ? call_netdevice_notifiers_info+0x7c/0x80 call_netdevice_notifiers+0x2e/0x50 mlx5e_set_xdp_feature+0x21/0x50 [mlx5_core] mlx5e_nic_init+0xf1/0x1a0 [mlx5_core] mlx5e_netdev_init_profile+0x76/0x110 [mlx5_core] mlx5e_netdev_attach_profile+0x1f/0x90 [mlx5_core] mlx5e_netdev_change_profile+0x92/0x160 [mlx5_core] mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core] mlx5e_vport_rep_unload+0xaa/0xc0 [mlx5_core] __esw_offloads_unload_rep+0x52/0x60 [mlx5_core] mlx5_esw_offloads_rep_unload+0x52/0x70 [mlx5_core] esw_offloads_unload_rep+0x34/0x70 [mlx5_core] esw_offloads_disable+0x2b/0x90 [mlx5_core] mlx5_eswitch_disable_locked+0x1b9/0x210 [mlx5_core] mlx5_devlink_eswitch_mode_set+0xf5/0x630 [mlx5_core] ? devlink_get_from_attrs_lock+0x9e/0x110 devlink_nl_cmd_eswitch_set_doit+0x60/0xe0 genl_family_rcv_msg_doit.isra.0+0xc2/0x110 genl_rcv_msg+0x17d/0x2b0 ? devlink_get_from_attrs_lock+0x110/0x110 ? devlink_nl_cmd_eswitch_get_doit+0x290/0x290 ? devlink_pernet_pre_exit+0xf0/0xf0 ? genl_family_rcv_msg_doit.isra.0+0x110/0x110 netlink_rcv_skb+0x54/0x100 genl_rcv+0x24/0x40 netlink_unicast+0x1f6/0x2c0 netlink_sendmsg+0x232/0x4a0 sock_sendmsg+0x38/0x60 ? _copy_from_user+0x2a/0x60 __sys_sendto+0x110/0x160 ? __count_memcg_events+0x48/0x90 ? handle_mm_fault+0x161/0x260 ? do_user_addr_fault+0x278/0x6e0 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x3d/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7fe141b1340a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007fff61d03de8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000afab00 RCX: 00007fe141b1340a RDX: 0000000000000038 RSI: 0000000000afab00 RDI: 0000000000000003 RBP: 0000000000afa910 R08: 00007fe141d80200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 </TASK> Fixes: 4d5ab0ad964d ("net/mlx5e: take into account device reconfiguration for xdp_features flag") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07tpm/tpm_tis: Disable interrupts for Lenovo P620 devicesJonathan McDowell
The Lenovo ThinkStation P620 suffers from an irq storm issue like various other Lenovo machines, so add an entry for it to tpm_tis_dmi_table and force polling. It is worth noting that 481c2d14627d (tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs) does not seem to fix the problem on this machine, but setting 'tpm_tis.interrupts=0' on the kernel command line does. [jarkko@kernel.org: truncated the commit ID in the description to 12 characters] Cc: stable@vger.kernel.org # v6.4+ Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Signed-off-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07tpm: Disable RNG for all AMD fTPMsMario Limonciello
The TPM RNG functionality is not necessary for entropy when the CPU already supports the RDRAND instruction. The TPM RNG functionality was previously disabled on a subset of AMD fTPM series, but reports continue to show problems on some systems causing stutter root caused to TPM RNG functionality. Expand disabling TPM RNG use for all AMD fTPMs whether they have versions that claim to have fixed or not. To accomplish this, move the detection into part of the TPM CRB registration and add a flag indicating that the TPM should opt-out of registration to hwrng. Cc: stable@vger.kernel.org # 6.1.y+ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs") Reported-by: daniil.stas@posteo.net Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719 Reported-by: bitlord0xff@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07sysctl: set variable key_sysctls storage-class-specifier to staticTom Rix
smatch reports security/keys/sysctl.c:12:18: warning: symbol 'key_sysctls' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07tpm/tpm_tis: Disable interrupts for TUXEDO InfinityBook S 15/17 Gen7Takashi Iwai
TUXEDO InfinityBook S 15/17 Gen7 suffers from an IRQ problem on tpm_tis like a few other laptops. Add an entry for the workaround. Cc: stable@vger.kernel.org Fixes: e644b2f498d2 ("tpm, tpm_tis: Enable interrupt test") Link: https://bugzilla.suse.com/show_bug.cgi?id=1213645 Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-08-07net/mlx5: Bridge, Only handle registered netdev bridge eventsRoi Dayan
Don't handle bridge events for a netdev that doesn't belong to an eswitch that registered for bridge events. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: E-Switch, Remove redundant arg ignore_flow_lvlRoi Dayan
The arg is always passed as true and thus redundant. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Fix typo reminder -> remainderGal Pressman
Fix a typo in esw_qos_devlink_rate_to_mbps(): reminder -> remainder. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: remove many unnecessary NULL valuesRuan Jinjie
There are many pointers assigned first, which need not to be initialized, so remove the NULL assignments. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Allocate completion EQs dynamicallyMaher Sanalla
This commit enables the dynamic allocation of EQs at runtime, allowing for more flexibility in managing completion EQs and reducing the memory overhead of driver load. Whenever a CQ is created for a given vector index, the driver will lookup to see if there is an already mapped completion EQ for that vector, if so, utilize it. Otherwise, allocate a new EQ on demand and then utilize it for the CQ completion events. Add a protection lock to the EQ table to protect from concurrent EQ creation attempts. While at it, replace mlx5_vector2irqn()/mlx5_vector2eqn() with mlx5_comp_eqn_get() and mlx5_comp_irqn_get() which will allocate an EQ on demand if no EQ is found for the given vector. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Handle SF IRQ request in the absence of SF IRQ poolMaher Sanalla
In case the SF IRQ pool is not available due to setup limitations, SF currently relies on the already allocated PF IRQs to fulfill its IRQ vector requests. However, with the dynamic EQ allocation introduced in the next patch, it is possible that not all IRQs of PF will be allocated after the driver is loaded. In such case, if a SF requests a completion IRQ without having its own independent IRQ pool, SF will lack a PF IRQ to utilize. To address this scenario, allocate an IRQ for the SF from the PF's IRQ pool on demand. The new IRQ will be shared between the SF and it's PF. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Rename mlx5_comp_vectors_count() to mlx5_comp_vectors_max()Maher Sanalla
To accurately represent its purpose, rename the function that retrieves the value of maximum vectors from mlx5_comp_vectors_count() to mlx5_comp_vectors_max(). Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Add IRQ vector to CPU lookup functionMaher Sanalla
Currently, once driver load completes, IRQ requests were performed for all vectors. However, as we move to support dynamic creation of EQs, this will not be the case as some IRQs will not exist at this stage. Thus, in such case, use the default CPU to IRQ mapping which is the serial mapping based on IRQ vector index. Meaning, the n'th vector gets mapped to the n'th CPU. Introduce an API function mlx5_comp_vector_cpu() that takes an IRQ index and provides the corresponding CPU mapping. It utilizes the existing IRQ affinity if defined, or resorts to the default serialized CPU mapping otherwise. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Introduce mlx5_cpumask_default_spreadMaher Sanalla
For better code readability in the completion IRQ request code, define the cpu lookup per completion vector logic in a separate function. The new method mlx5_cpumask_default_spread() given a vector index 'n' will return the 'nth' cpu. This new method will be used also in the next patch. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Implement single completion EQ create/destroy methodsMaher Sanalla
Currently, create_comp_eqs() function handles the creation of all completion EQs for all the vectors on driver load. While on driver unload, destroy_comp_eqs() performs the equivalent job. In preparation for dynamic EQ creation, replace create_comp_eqs() / destroy_comp_eqs() with create_comp_eq() / destroy_comp_eq() functions which will receive a vector index and allocate/destroy an EQ for that specific vector. Thus, allowing more flexibility in the management of completion EQs. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Use xarray to store and manage completion EQsMaher Sanalla
Use xarray to store the completion EQs instead of a linked list. The xarray offers more scalability, reduced memory overhead, and facilitates the lookup of a certain EQ given a vector index. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Refactor completion IRQ request/release handlers in EQ layerMaher Sanalla
Break the completion IRQ request/release functions into per-vector handlers for both PCI devices and SFs in the EQ layer. On EQ table creation, loop over all vectors and request an IRQ for each one using the new per-vector functions. Perform the symmetrical change when releasing IRQs on EQ table cleanup. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Use xarray to store and manage completion IRQsMaher Sanalla
Use xarray to store the completion IRQs instead of a fixed-size allocated array as not all completion IRQs will be requested on driver load, but rather on demand when an EQ is created. The xarray offers more scalability, reduced memory overhead, and provides the ability to dynamically resize the array when needed. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Refactor completion IRQ request/release APIMaher Sanalla
Introduce a per-vector completion IRQ request API that requests a single IRQ for a given vector index instead of multiple IRQs request API. On driver load, loop over all completion vectors and request an IRQ for each one via the newly introduced API. Symmetrically, introduce an IRQ release API per vector. On driver unload, loop over all vectors and release each completion IRQ via the new per-vector API. As IRQ vectors will be requested dynamically later in the patchset, add a cpumask of the bounded CPUs to avoid the possible mapping of two IRQs of the same device to the same cpu. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07net/mlx5: Track the current number of completion EQsMaher Sanalla
In preparation to allocate completion EQs, add a counter to track the number of completion EQs currently allocated. Store the maximum number of EQs in max_comp_eqs variable. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86: - Fix SEV race condition ARM: - Fixes for the configuration of SVE/SME traps when hVHE mode is in use - Allow use of pKVM on systems with FF-A implementations that are v1.0 compatible - Request/release percpu IRQs (arch timer, vGIC maintenance) correctly when pKVM is in use - Fix function prototype after __kvm_host_psci_cpu_entry() rename - Skip to the next instruction when emulating writes to TCR_EL1 on AmpereOne systems Selftests: - Fix missing include" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests/rseq: Fix build with undefined __weak KVM: SEV: remove ghcb variable declarations KVM: SEV: only access GHCB fields once KVM: SEV: snapshot the GHCB before accessing it KVM: arm64: Skip instruction after emulating write to TCR_EL1 KVM: arm64: fix __kvm_host_psci_cpu_entry() prototype KVM: arm64: Fix resetting SME trap values on reset for (h)VHE KVM: arm64: Fix resetting SVE trap values on reset for hVHE KVM: arm64: Use the appropriate feature trap register when activating traps KVM: arm64: Helper to write to appropriate feature trap register based on mode KVM: arm64: Disable SME traps for (h)VHE at setup KVM: arm64: Use the appropriate feature trap register for SVE at EL2 setup KVM: arm64: Factor out code for checking (h)VHE mode into a macro KVM: arm64: Rephrase percpu enable/disable tracking in terms of hyp KVM: arm64: Fix hardware enable/disable flows for pKVM KVM: arm64: Allow pKVM on v1.0 compatible FF-A implementations
2023-08-07Merge tag 'mmc-v6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - moxart: Fix big-endian conversion for SCR structure - sdhci-f-sdh30: Replace with sdhci_pltfm to fix PM support * tag 'mmc-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-f-sdh30: Replace with sdhci_pltfm mmc: moxart: read scr register without changing byte order
2023-08-07gfs2: Don't use filemap_splice_readBob Peterson
Starting with patch 2cb1e08985, gfs2 started using the new function filemap_splice_read rather than the old (and subsequently deleted) function generic_file_splice_read. filemap_splice_read works by taking references to a number of folios in the page cache and splicing those folios into a pipe. The folios are then read from the pipe and the folio references are dropped. This can take an arbitrary amount of time. We cannot allow that in gfs2 because those folio references will pin the inode glock to the node and prevent it from being demoted, which can lead to cluster-wide deadlocks. Instead, use copy_splice_read. (In addition, the old generic_file_splice_read called into ->read_iter, which called gfs2_file_read_iter, which took the inode glock during the operation. The new filemap_splice_read interface does not take the inode glock anymore. This is fixable, but it still wouldn't prevent cluster-wide deadlocks.) Fixes: 2cb1e08985e3 ("splice: Use filemap_splice_read() instead of generic_file_splice_read()") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-08-07gfs2: Fix freeze consistency check in gfs2_trans_add_metaAndreas Gruenbacher
Function gfs2_trans_add_meta() checks for the SDF_FROZEN flag to make sure that no buffers are added to a transaction while the filesystem is frozen. With the recent freeze/thaw rework, the SDF_FROZEN flag is cleared after thaw_super() is called, which is sufficient for serializing freeze/thaw. However, other filesystem operations started after thaw_super() may now be calling gfs2_trans_add_meta() before the SDF_FROZEN flag is cleared, which will trigger the SDF_FROZEN check in gfs2_trans_add_meta(). Fix that by checking the s_writers.frozen state instead. In addition, make sure not to call gfs2_assert_withdraw() with the sd_log_lock spin lock held. Check for a withdrawn filesystem before checking for a frozen filesystem, and don't pin/add buffers to the current transaction in case of a failure in either case. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2023-08-07x86/srso: Tie SBPB bit setting to microcode patch detectionBorislav Petkov (AMD)
The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode patch has been applied so set X86_FEATURE_SBPB only then. Otherwise, guests would attempt to set that bit and #GP on the MSR write. While at it, make SMT detection more robust as some guests - depending on how and what CPUID leafs their report - lead to cpu_smt_control getting set to CPU_SMT_NOT_SUPPORTED but SRSO_NO should be set for any guest incarnation where one simply cannot do SMT, for whatever reason. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2023-08-07udp/udplite: Remove unused function declarations udp{,lite}_get_port()Yue Haibing
Commit 6ba5a3c52da0 ("[UDP]: Make full use of proto.h.udp_hash innovation.") removed these implementations but leave declarations. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07net: sfp: Remove unused function declaration sfp_link_configure()Yue Haibing
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") declared but never implemented it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07ndisc: Remove unused ndisc_ifinfo_sysctl_strategy() declarationYue Haibing
Commit f8572d8f2a2b ("sysctl net: Remove unused binary sysctl code") left behind this declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07net: pkt_cls: Remove unused inline helpersYue Haibing
Commit acb674428c3d ("net: sched: introduce per-block callbacks") implemented these but never used it. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07neighbour: Remove unused function declaration pneigh_for_each()Yue Haibing
pneigh_for_each() is never implemented since the beginning of git history. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07net/tls: Remove unused function declarationsYue Haibing
Commit 3c4d7559159b ("tls: kernel TLS support") declared but never implemented these functions. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07Revert "riscv: dts: allwinner: d1: Add CAN controller nodes"Marc Kleine-Budde
It turned out the dtsi changes were not quite ready, revert them for now. This reverts commit 6ea1ad888f5900953a21853e709fa499fdfcb317. Link: https://lore.kernel.org/all/2690764.mvXUDI8C0e@jernej-laptop Suggested-by: Jernej Škrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/all/20230807-riscv-allwinner-d1-revert-can-controller-nodes-v1-1-eb3f70b435d9@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-08-06Linux 6.5-rc5v6.5-rc5Linus Torvalds
2023-08-07dmaengine: xilinx: xdma: Fix typoMiquel Raynal
Probably a copy/paste error with the previous block, here we are actually managing C2H IRQs. Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20230731101442.792514-3-miquel.raynal@bootlin.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07dmaengine: xilinx: xdma: Fix interrupt vector settingMiquel Raynal
A couple of hardware registers need to be set to reflect which interrupts have been allocated to the device. Each register is 32-bit wide and can receive four 8-bit values. If we provide any other interrupt number than four, the irq_num variable will never be 0 within the while check and the while block will loop forever. There is an easy way to prevent this: just break the for loop when we reach "irq_num == 0", which anyway means all interrupts have been processed. Cc: stable@vger.kernel.org Fixes: 17ce252266c7 ("dmaengine: xilinx: xdma: Add xilinx xdma driver") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20230731101442.792514-2-miquel.raynal@bootlin.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07dmaengine: owl-dma: Modify mismatched function nameZhang Jianhua
No functional modification involved. drivers/dma/owl-dma.c:208: warning: expecting prototype for struct owl_dma_pchan. Prototype was for struct owl_dma_vchan instead HDRTEST usr/include/sound/asequencer.h Fixes: 47e20577c24d ("dmaengine: Add Actions Semi Owl family S900 DMA driver") Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230722153244.2086949-1-chris.zjh@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2023-08-07dmaengine: idxd: Clear PRS disable flag when disabling IDXD deviceFenghua Yu
Disabling IDXD device doesn't reset Page Request Service (PRS) disable flag to its initial value 0. This may cause user confusion because once PRS is disabled user will see PRS still remains the previous setting (i.e. disabled) via sysfs interface even after the device is disabled. To eliminate user confusion, reset PRS disable flag to ensure that the PRS flag bit reflects correct state after the device is disabled. Additionally, simplify the code by setting wq->flags to 0, which clears all flag bits, including any future additions. Fixes: f2dc327131b5 ("dmaengine: idxd: add per wq PRS disable") Tested-by: Tony Zhu <tony.zhu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230712193505.3440752-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>