summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-14mlxsw: spectrum_router: Access nhgi->rif through a helperPetr Machata
To abstract away deduction of RIF from the corresponding next hop group info (NHGI), mlxsw currently uses a macro. In its current form, that macro is impossible to extend to more general computation. Therefore introduce a helper, mlxsw_sp_nhgi_rif(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14mlxsw: spectrum_router: Access nh->rif->dev through a helperPetr Machata
In order to abstract away deduction of netdevice from the corresponding next hop, introduce a helper, mlxsw_sp_nexthop_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14mlxsw: spectrum_router: Access rif->dev from params in mlxsw_sp_rif_create()Petr Machata
The previous patch added a helper to access a netdevice given a RIF. Using this helper in mlxsw_sp_rif_create() is unreasonable: the netdevice was given in RIF creation parameters. Just take it there. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14mlxsw: spectrum_router: Access rif->dev through a helperPetr Machata
In order to abstract away deduction of netdevice from the corresponding RIF, introduce a helper, mlxsw_sp_rif_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14mlxsw: spectrum_router: Add a helper specifically for joining a LAGPetr Machata
Currently, joining a LAG very simply means that the LAG RIF should be joined by the subport representing untagged traffic. If the RIF does not exist, it does not have to be created: if the user wants there to be RIF for the LAG device, they are supposed to add an IP address, and they are supposed to do it after tha LAG becomes mlxsw upper. We can also assume that the LAG has no uppers, otherwise the enslavement is not allowed. In the future, these ordering dependencies should be removed. That means that joining LAG will be more complex operation, possibly involving a lazy RIF creation, and possibly joining / lazily creating RIFs for VLAN uppers of the LAG. It will be handy to have a dedicated function that handles all this. The new function mlxsw_sp_router_port_join_lag() is that. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14mlxsw: spectrum_router: Extract a helper from mlxsw_sp_port_vlan_router_join()Petr Machata
Split out of mlxsw_sp_port_vlan_router_join() the part that checks for RIF and dispatches to __mlxsw_sp_port_vlan_router_join(), leaving it as wrapper that just manages the router lock. The new function, mlxsw_sp_port_vlan_router_join_existing(), will be useful as an atom in later patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate stepDanielle Ratson
Setting the IPv6 address generation mode of a net device during its creation never worked, but after commit b0ad3c179059 ("rtnetlink: call validate_linkmsg in rtnl_create_link") it explicitly fails [1]. The failure is caused by the fact that validate_linkmsg() is called before the net device is registered, when it still does not have an 'inet6_dev'. Likewise, raising the net device before setting the address generation mode is meaningless, because by the time the mode is set, the address has already been generated. Therefore, fix the test to first create the net device, then set its IPv6 address generation mode and finally bring it up. [1] # ip link add name mydev addrgenmode eui64 type dummy RTNETLINK answers: Address family not supported by protocol Fixes: ba95e7930957 ("selftests: forwarding: hw_stats_l3: Add a new test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/f3b05d85b2bc0c3d6168fe8f7207c6c8365703db.1686580046.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14Merge branch 'net-sched-fix-race-conditions-in-mini_qdisc_pair_swap'Paolo Abeni
Peilin Ye says: ==================== net/sched: Fix race conditions in mini_qdisc_pair_swap() These 2 patches fix race conditions for ingress and clsact Qdiscs as reported [1] by syzbot, split out from another [2] series (last 2 patches of it). Per-patch changelog omitted. Patch 1 hasn't been touched since last version; I just included everybody's tag. Patch 2 bases on patch 6 v1 of [2], with comments and commit log slightly changed. We also need rtnl_dereference() to load ->qdisc_sleeping since commit d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping"), so I changed that; please take yet another look, thanks! Patch 2 has been tested with the new reproducer Pedro posted [3]. [1] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b [2] https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/ [3] https://lore.kernel.org/r/7879f218-c712-e9cc-57ba-665990f5f4c9@mojatatu.com/ ==================== Link: https://lore.kernel.org/r/cover.1686355297.git.peilin.ye@bytedance.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: qdisc_destroy() old ingress and clsact Qdiscs before graftingPeilin Ye
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized in ingress_init() to point to net_device::miniq_ingress. ingress Qdiscs access this per-net_device pointer in mini_qdisc_pair_swap(). Similar for clsact Qdiscs and miniq_egress. Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER requests (thanks Hillf Danton for the hint), when replacing ingress or clsact Qdiscs, for example, the old Qdisc ("@old") could access the same miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"), causing race conditions [1] including a use-after-free bug in mini_qdisc_pair_swap() reported by syzbot: BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901 ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319 print_report mm/kasan/report.c:430 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:536 mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 tcf_chain_head_change_item net/sched/cls_api.c:495 [inline] tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509 tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline] tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline] tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266 ... @old and @new should not affect each other. In other words, @old should never modify miniq_{in,e}gress after @new, and @new should not update @old's RCU state. Fixing without changing sch_api.c turned out to be difficult (please refer to Closes: for discussions). Instead, make sure @new's first call always happen after @old's last call (in {ingress,clsact}_destroy()) has finished: In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests, and call qdisc_destroy() for @old before grafting @new. Introduce qdisc_refcount_dec_if_one() as the counterpart of qdisc_refcount_inc_nz() used for filter requests. Introduce a non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check, just like qdisc_put() etc. Depends on patch "net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs". [1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under TC_H_ROOT (no longer possible after commit c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8 transmission queues: Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2), then adds a flower filter X to A. Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and b2) to replace A, then adds a flower filter Y to B. Thread 1 A's refcnt Thread 2 RTM_NEWQDISC (A, RTNL-locked) qdisc_create(A) 1 qdisc_graft(A) 9 RTM_NEWTFILTER (X, RTNL-unlocked) __tcf_qdisc_find(A) 10 tcf_chain0_head_change(A) mini_qdisc_pair_swap(A) (1st) | | RTM_NEWQDISC (B, RTNL-locked) RCU sync 2 qdisc_graft(B) | 1 notify_and_destroy(A) | tcf_block_release(A) 0 RTM_NEWTFILTER (Y, RTNL-unlocked) qdisc_destroy(A) tcf_chain0_head_change(B) tcf_chain0_head_change_cb_del(A) mini_qdisc_pair_swap(B) (2nd) mini_qdisc_pair_swap(A) (3rd) | ... ... Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to its mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress packets on eth0 will not find filter Y in sch_handle_ingress(). This is just one of the possible consequences of concurrently accessing miniq_{in,e}gress pointers. Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops") Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops") Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/ Cc: Hillf Danton <hdanton@sina.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: Refactor qdisc_graft() for ingress and clsact QdiscsPeilin Ye
Grafting ingress and clsact Qdiscs does not need a for-loop in qdisc_graft(). Refactor it. No functional changes intended. Tested-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: act_ct: Fix promotion of offloaded unreplied tuplePaul Blakey
Currently UNREPLIED and UNASSURED connections are added to the nf flow table. This causes the following connection packets to be processed by the flow table which then skips conntrack_in(), and thus such the connections will remain UNREPLIED and UNASSURED even if reply traffic is then seen. Even still, the unoffloaded reply packets are the ones triggering hardware update from new to established state, and if there aren't any to triger an update and/or previous update was missed, hardware can get out of sync with sw and still mark packets as new. Fix the above by: 1) Not skipping conntrack_in() for UNASSURED packets, but still refresh for hardware, as before the cited patch. 2) Try and force a refresh by reply-direction packets that update the hardware rules from new to established state. 3) Remove any bidirectional flows that didn't failed to update in hardware for re-insertion as bidrectional once any new packet arrives. Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections") Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regressionHugh Dickins
Lockdep on 6.4-rc on ThinkPad X1 Carbon 5th says ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 6.4.0-rc5 #1 Not tainted ----------------------------------------------------- kworker/3:1/49 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire: ffff8881066fa368 (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2}, at: rs_drv_get_rate+0x46/0xe7 and this task is already holding: ffff8881066f80a8 (&sta->rate_ctrl_lock){+.-.}-{2:2}, at: rate_control_get_rate+0xbd/0x126 which would create a new lock dependency: (&sta->rate_ctrl_lock){+.-.}-{2:2} -> (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2} but this new dependency connects a SOFTIRQ-irq-safe lock: (&sta->rate_ctrl_lock){+.-.}-{2:2} etc. etc. etc. Changing the spin_lock() in rs_drv_get_rate() to spin_lock_bh() was not enough to pacify lockdep, but changing them all on pers.lock has worked. Fixes: a8938bc881d2 ("wifi: iwlwifi: mvm: Add locking to the rate read flow") Signed-off-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/79ffcc22-9775-cb6d-3ffd-1a517c40beef@google.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-13ethtool: ioctl: account for sopass diff in set_wolJustin Chen
sopass won't be set if wolopt doesn't change. This means the following will fail to set the correct sopass. ethtool -s eth0 wol s sopass 11:22:33:44:55:66 ethtool -s eth0 wol s sopass 22:44:55:66:77:88 Make sure we call into the driver layer set_wol if sopass is different. Fixes: 55b24334c0f2 ("ethtool: ioctl: improve error checking for set_wol") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Link: https://lore.kernel.org/r/1686605822-34544-1-git-send-email-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13Merge branch 'fix-small-bugs-and-annoyances-in-tc-testing'Jakub Kicinski
Vlad Buslov says: ==================== Fix small bugs and annoyances in tc-testing ==================== Link: https://lore.kernel.org/r/20230612075712.2861848-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Remove configs that no longer existVlad Buslov
Some qdiscs and classifiers have recently been retired from kernel. However, tc-testing config is still cluttered with them which causes noise when using merge_config.sh script to update existing config for tc-testing compatibility. Remove the config settings for affected qdiscs and classifiers. Fixes: fb38306ceb9e ("net/sched: Retire ATM qdisc") Fixes: 051d44209842 ("net/sched: Retire CBQ qdisc") Fixes: bbe77c14ee61 ("net/sched: Retire dsmark qdisc") Fixes: 265b4da82dbf ("net/sched: Retire rsvp classifier") Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix SFB db testVlad Buslov
Setting very small value of db like 10ms introduces rounding errors when converting to/from jiffies on some kernel configs. For example, on 250hz the actual value will be set to 12ms which causes the test to fail: # $ sudo ./tdc.py -d eth2 -e 3410 # -- ns/SubPlugin.__init__ # Test 3410: Create SFB with db setting # # All test results: # # 1..1 # not ok 1 3410 - Create SFB with db setting # Could not match regex pattern. Verify command output: # qdisc sfb 1: root refcnt 2 rehash 600s db 12ms limit 1000p max 25p target 20p increment 0.000503548 decrement 4.57771e-05 penalty_rate 10pps penalty_burst 20p Set the value to 100ms instead which currently seem to work on 100hz, 250hz, 300hz and 1000hz kernel configs. Fixes: 6ad92dc56fca ("selftests/tc-testing: add selftests for sfb qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: failed to find target LOGVlad Buslov
Add missing netfilter config dependency. Fixes following example error when running tests via tdc.sh for all XT tests: # $ sudo ./tdc.py -d eth2 -e 2029 # Test 2029: Add xt action with log-prefix # exit: 255 # exit: 0 # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # -----> teardown stage *** Could not execute: "$TC actions flush action xt" # # -----> teardown stage *** Error message: "Error: Cannot flush unknown TC action. # We have an error flushing # " # returncode 1; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', ' failed to find target LOG\n\nbad action parsing\nparse_action: bad value (7:xt)!\nIllegal "action"\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 2029 Add xt action with log-prefix stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # --------------- # # All test results: # # 1..1 # ok 1 2029 - Add xt action with log-prefix # skipped - "-----> teardown stage" did not complete successfully Fixes: 910d504bc187 ("selftests/tc-testings: add selftests for xt action") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.Vlad Buslov
All TEQL tests assume that sch_teql module is loaded. Load module in tdc.sh before running qdisc tests. Fixes following example error when running tests via tdc.sh for all TEQL tests: # $ sudo ./tdc.py -d eth2 -e 84a0 # -- ns/SubPlugin.__init__ # Test 84a0: Create TEQL with default setting # exit: 2 # exit: 0 # Error: Specified qdisc kind is unknown. # # -----> teardown stage *** Could not execute: "$TC qdisc del dev $DUMMY handle 1: root" # # -----> teardown stage *** Error message: "Error: Invalid handle. # " # returncode 2; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', 'Error: Specified qdisc kind is unknown.\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 84a0 Create TEQL with default setting stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # Error: Specified qdisc kind is unknown. # # --------------- # # All test results: # # 1..1 # ok 1 84a0 - Create TEQL with default setting # skipped - "-----> teardown stage" did not complete successfully Fixes: cc62fbe114c9 ("selftests/tc-testing: add selftests for teql qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13net: ethernet: ti: am65-cpsw: Call of_node_put() on error pathDan Carpenter
This code returns directly but it should instead call of_node_put() to drop some reference counts. Fixes: dab2b265dd23 ("net: ethernet: ti: am65-cpsw: Add support for SERDES configuration") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/e3012f0c-1621-40e6-bf7d-03c276f6e07f@kili.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13mctp i2c: Switch back to use struct i2c_driver's .probe()Uwe Kleine-König
After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230612071641.836976-1-u.kleine-koenig@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13Merge tag 'nios2_fix_v6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull NIOS2 dts fix from Dinh Nguyen: - Fix tse_mac "max-frame-size" property * tag 'nios2_fix_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: dts: Fix tse_mac "max-frame-size" property
2023-06-13nios2: dts: Fix tse_mac "max-frame-size" propertyJanne Grunau
The given value of 1518 seems to refer to the layer 2 ethernet frame size without 802.1Q tag. Actual use of the "max-frame-size" including in the consumer of the "altr,tse-1.0" compatible is the MTU. Fixes: 95acd4c7b69c ("nios2: Device tree support") Fixes: 61c610ec61bb ("nios2: Add Max10 device tree") Cc: <stable@vger.kernel.org> Signed-off-by: Janne Grunau <j@jannau.net> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2023-06-13spi: dw: Replace incorrect spi_get_chipselect with setAbe Kohandel
Commit 445164e8c136 ("spi: dw: Replace spi->chip_select references with function calls") replaced direct access to spi.chip_select with spi_*_chipselect calls but incorrectly replaced a set instance with a get instance, replace the incorrect instance. Fixes: 445164e8c136 ("spi: dw: Replace spi->chip_select references with function calls") Signed-off-by: Abe Kohandel <abe.kohandel@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20230613162103.569812-1-abe.kohandel@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-13Merge branch 'tools-ynl-gen-improvements-for-dpll'Jakub Kicinski
Jakub Kicinski says: ==================== tools: ynl-gen: improvements for DPLL A set of improvements needed to generate the kernel code for DPLL. ==================== Link: https://lore.kernel.org/r/20230612155920.1787579-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13tools: ynl-gen: inherit policy in multi-attrJakub Kicinski
Instead of reimplementing policies in MutliAttr for every underlying type forward the calls to the base type. This will be needed for DPLL which uses a multi-attr nest, and currently gets an invalid NLA_NEST policy generated. Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13tools: ynl-gen: correct enum policiesJakub Kicinski
Scalar range validation assumes enums start at 0. Teach it to properly calculate the value range. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13Merge tag 'devicetree-fixes-for-6.4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix missing of_node_put() in init_overlay_changeset() - Fix schema for qcom,pmic-mpp "qcom,paired" property - Fix 'additionalProperties' in silvaco,i3c-master binding - usage-model.rst: Use documented "arm,primecell" compatible string - Update Damien Le Moal's email address - Fixes in Realtek Bluetooth binding * tag 'devicetree-fixes-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired" dt-bindings: i3c: silvaco,i3c-master: fix missing schema restriction of: overlay: Fix missing of_node_put() in error case of init_overlay_changeset() docs: zh_CN/devicetree: sync usage-model fix docs: dt: fix documented Primecell compatible string dt-bindings: Change Damien Le Moal's contact email dt-bindings: net: realtek-bluetooth: Fix double RTL8723CS in desc dt-bindings: net: realtek-bluetooth: Fix RTL8821CS binding
2023-06-13dt-bindings: pinctrl: qcom,pmic-mpp: Fix schema for "qcom,paired"Rob Herring
The "qcom,paired" schema is all wrong. First, it's a list rather than an object(dictionary). Second, it is missing a required type. The meta-schema normally catches this, but schemas under "$defs" was not getting checked. A fix for that is pending. Fixes: f9a06b810951 ("dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230418150606.1528107-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-06-13regmap: regcache: Don't sync read-only registersTakashi Iwai
regcache_maple_sync() tries to sync all cached values no matter whether it's writable or not. OTOH, regache_sync_val() does care the wrtability and returns -EIO for a read-only register. This results in an error message like: snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2f0009. -5 and the sync loop is aborted incompletely. This patch adds the writable register check to regcache_sync_val() for addressing the bug above. Note that, although we may add the check in the caller side (regcache_maple_sync()), here we put in regcache_sync_val(), so that a similar case like this can be avoided in future. Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Link: https://lore.kernel.org/r/877cs7g6f1.wl-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20230613112240.3361-1-tiwai@suse.de Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-13amd-xgbe: extend 10Mbps support to MAC version 21HRaju Rangoju
MAC version 21H supports the 10Mbps speed. So, extend support to platforms that support it. Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13Merge branch 'octeontx2-updates'David S. Miller
Naveen Mamindlapalli says: ==================== RVU NIX AF driver updates This patch series includes a few enhancements and other updates to the RVU NIX AF driver. The first patch adds devlink option to configure NPC MCAM high priority zone entries reservation. This is useful when the requester needs more high priority entries than default reserved entries. The second patch adds support for RSS hash computation using L3 SRC or DST only, or L4 SRC or DST only. The third patch updates DWRR MTU configuration for CN10KB silicon. HW uses the DWRR MTU to compute DWRR weight. Patch 4 configures the LBK link in TL3_TL2 configuration only when switch mode is enabled. Patch 5 adds an option in the mailbox request to enable/disable DROP_RE bit which drops packets with L2 errors when set. Patch 6 updates SMQ flush mechanism to stop other child nodes from enqueuing any packets while SMQ flush is active. Otherwise SMQ flush may timeout. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: Set XOFF on other child transmit schedulers during SMQ flushNaveen Mamindlapalli
When multiple transmit scheduler queues feed a TL1 transmit link, the SMQ flush initiated on a low priority queue might get stuck when a high priority queue fully subscribes the transmit link. This inturn effects interface teardown. To avoid this, temporarily XOFF all TL1's other immediate child transmit scheduler queues and also clear any rate limit configuration on all the scheduler queues in SMQ(flush) hierarchy. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: add option to toggle DROP_RE enable in rx cfgNithin Dabilpuram
Add option to toggle DROP_RE bit in rx cfg mbox. This helps in modifying the config runtime as opposed to setting available via nix_lf_alloc() mbox at NIX LF init time. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: Enable LBK links only when switch mode is on.Subbaraya Sundeep
Currently, all the TL3_TL2 nodes are being configured to enable switch LBK channel 63 in them. Instead enable them only when switch mode is enabled. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: cn10k: Set NIX DWRR MTU for CN10KB siliconSunil Goutham
The DWRR MTU config added for SDP and RPM/LBK links on CN10K silicon is further extended on CK10KB silicon variant and made it configurable. Now there are 4 DWRR MTU config to choose while setting transmit scheduler's RR_WEIGHT. Here we are reserving one config for each of RPM, SDP and LBK. NIXX_AF_DWRR_MTUX(0) ---> RPM NIXX_AF_DWRR_MTUX(1) ---> SDP NIXX_AF_DWRR_MTUX(2) ---> LBK PF/VF drivers can choose the DWRR_MTU to be used by setting SMQX_CFG[pkt_link_type] to one of above. TLx_SCHEDULE[RR_WEIGHT] is to be as configured 'quantum / 2^DWRR_MTUX[MTU]'. DWRR_MTU of each link is exposed to PF/VF drivers via mailbox for RR_WEIGHT calculation. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: extend RSS supported offload typesKiran Kumar K
Add support to select L3 SRC or DST only, L4 SRC or DST only for RSS calculation. AF consumer may have requirement as we can select only SRC or DST data for RSS calculation in L3, L4 layers. With this requirement there will be following combinations, IPV[4,6]_SRC_ONLY, IPV[4,6]_DST_ONLY, [TCP,UDP,SCTP]_SRC_ONLY, [TCP,UDP,SCTP]_DST_ONLY. So, instead of creating a bit for each combination, we are using upper 4 bits (31:28) in the flow_key_cfg to represent the SRC, DST selection. 31 => L3_SRC, 30 => L3_DST, 29 => L4_SRC, 28 => L4_DST. These won't be part of flow_cfg, so that we don't need to change the existing ABI. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13octeontx2-af: Add devlink option to adjust mcam high prio zone entriesNaveen Mamindlapalli
The NPC MCAM entries are currently divided into three priority zones in AF driver: high, mid, and low. The high priority zone and low priority zone take up 1/8th (each) of the available MCAM entries, and remaining going to the mid priority zone. The current allocation scheme may not meet certain requirements, such as when a requester needs more high priority zone entries than are reserved. This patch adds a devlink configurable option to increase the number of high priority zone entries that can be allocated by requester. The max number of entries that can be reserved for high priority usage is 100% of available MCAM entries. Usage: 1) Change high priority zone percentage to 75%: devlink -p dev param set pci/0002:01:00.0 name npc_mcam_high_zone_percent \ value 75 cmode runtime 2) Read high priority zone percentage: devlink -p dev param show pci/0002:01:00.0 name npc_mcam_high_zone_percent The devlink set configuration is only permitted when no MCAM entries are assigned, i.e., all MCAM entries are free, indicating that no PF/VF driver is loaded. So user must unload/unbind PF/VF driver/devices before modifying the high priority zone percentage. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-13selftests: forwarding: Fix layer 2 miss test syntaxIdo Schimmel
The test currently specifies "l2_miss" as "true" / "false", but the version that eventually landed in iproute2 uses "1" / "0" [1]. Align the test accordingly. [1] https://lore.kernel.org/netdev/20230607153550.3829340-1-idosch@nvidia.com/ Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-12Merge branch 'splice-net-some-miscellaneous-msg_splice_pages-changes'Jakub Kicinski
David Howells says: ==================== splice, net: Some miscellaneous MSG_SPLICE_PAGES changes Now that the splice_to_socket() has been rewritten so that nothing now uses the ->sendpage() file op[1], some further changes can be made, so here are some miscellaneous changes that can now be done. (1) Remove the ->sendpage() file op. (2) Remove hash_sendpage*() from AF_ALG. (3) Make sunrpc send multiple pages in single sendmsg() call rather than calling sendpage() in TCP (or maybe TLS). (4) Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(). (5) Make AF_KCM use sendmsg() when calling down to TCP and then make it send entire fragment lists in single sendmsg calls. Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=fd5f4d7da29218485153fd8b4c08da7fc130c79f [1] ==================== Link: https://lore.kernel.org/r/20230609100221.2620633-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12kcm: Send multiple frags in one sendmsg()David Howells
Rewrite the AF_KCM transmission loop to send all the fragments in a single skb or frag_list-skb in one sendmsg() with MSG_SPLICE_PAGES set. The list of fragments in each skb is conveniently a bio_vec[] that can just be attached to a BVEC iter. Note: I'm working out the size of each fragment-skb by adding up bv_len for all the bio_vecs in skb->frags[] - but surely this information is recorded somewhere? For the skbs in head->frag_list, this is equal to skb->data_len, but not for the head. head->data_len includes all the tail frags too. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12kcm: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpageDavid Howells
When transmitting data, call down into the transport socket using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than using sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES)David Howells
Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(MSG_SPLICE_PAGES) rather than a loop calling tcp_sendpage(). sendpage() will be removed in the future. Signed-off-by: David Howells <dhowells@redhat.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jakub Sitnicki <jakub@cloudflare.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpageDavid Howells
When transmitting data, call down into TCP using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: Jeff Layton <jlayton@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12algif: Remove hash_sendpage*()David Howells
Remove hash_sendpage*() as nothing should now call it since the rewrite of splice_to_socket()[1]. Signed-off-by: David Howells <dhowells@redhat.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=2dc334f1a63a8839b88483a3e73c0f27c9c1791c [1] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12Remove file->f_op->sendpageDavid Howells
Remove file->f_op->sendpage as splicing to a socket now calls sendmsg rather than sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12Merge branch 'net-flower-add-cfm-support'Jakub Kicinski
Zahari Doychev says: ==================== net: flower: add cfm support The first patch adds cfm support to the flow dissector. The second adds the flower classifier support. The third adds a selftest for the flower cfm functionality. ==================== Link: https://lore.kernel.org/r/20230608105648.266575-1-zahari.doychev@linux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12selftests: net: add tc flower cfm testZahari Doychev
New cfm flower test case is added to the net forwarding selfttests. Example output: # ./tc_flower_cfm.sh p1 p2 TEST: CFM opcode match test [ OK ] TEST: CFM level match test [ OK ] TEST: CFM opcode and level match test [ OK ] Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12net: flower: add support for matching cfm fieldsZahari Doychev
Add support to the tc flower classifier to match based on fields in CFM information elements like level and opcode. tc filter add dev ens6 ingress protocol 802.1q \ flower vlan_id 698 vlan_ethtype 0x8902 cfm mdl 5 op 46 \ action drop Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12net: flow_dissector: add support for cfm packetsZahari Doychev
Add support for dissecting cfm packets. The cfm packet header fields maintenance domain level and opcode can be dissected. Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-12Merge branch 'mptcp-fixes'Jakub Kicinski
Matthieu Baerts says: ==================== selftests: mptcp: skip tests not supported by old kernels (part 3) After a few years of increasing test coverage in the MPTCP selftests, we realised [1] the last version of the selftests is supposed to run on old kernels without issues. Supporting older versions is not that easy for this MPTCP case: these selftests are often validating the internals by checking packets that are exchanged, when some MIB counters are incremented after some actions, how connections are getting opened and closed in some cases, etc. In other words, it is not limited to the socket interface between the userspace and the kernelspace. In addition to that, the current MPTCP selftests run a lot of different sub-tests but the TAP13 protocol used in the selftests don't support sub-tests: one failure in sub-tests implies that the whole selftest is seen as failed at the end because sub-tests are not tracked. It is then important to skip sub-tests not supported by old kernels. To minimise the modifications and reduce the complexity to support old versions, the idea is to look at external signs and skip the whole selftest or just some sub-tests before starting them. This cannot be applied in all cases. Similar to the second part, this third one focuses on marking different sub-tests as skipped if some MPTCP features are not supported. This time, only in "mptcp_join.sh" selftest, the remaining one, is modified. Several techniques are used here to achieve this task: - Before starting some tests: - Check if a file (sysctl knob) is present: that's what patch 12/17 is doing for the userspace PM feature. - Check if a required kernel symbol is present in /proc/kallsyms: patches 9, 10, 14 and 15/17 are using this technique. - Check if it is possible to setup a particular network environment requiring Netfilter or TC: if the preparation step fail, the linked sub-test is marked as skipped. Patch 5/17 is doing that. - Check if a MIB counter is available: patches 7 and 13/17 do that. - Check if the kernel version is newer than a specific one: patch 1/17 adds some helpers in mptcp_lib.sh to ease its use. That's not ideal and it is only used as last resort but as mentioned above, it is important to skip tests if they are not supported not to have the whole selftest always being marked as failed on old kernels. Patches 11 and 17/17 are checking the kernel version. An alternative would be to ignore the results for some sub-tests but that's not ideal too. Note that SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK env var can be set to 1 not to skip these tests if the running kernel doesn't have a supported version. - After having launched the tests: - Adapt the expectations depending on the presence of a kernel symbol (patch 6/17) or a kernel version (patch 8/17). - Check is a MIB counter is available and skip the verification if not. Patch 4/17 is using this technique. Before skipping tests, SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var value is checked: if it is set to 1, the test is marked as "failed" instead of "skipped". MPTCP public CI expects to have all features supported and it sets this env var to 1 to catch regressions in these new checks. Patch 2/17 uses 'iptables-legacy' if available because it might be needed when using an older kernel not supporting iptables-nft. Patch 3/17 adds some helpers used in the other patches mentioned to easily mark sub-tests as skipped. Patch 16/17 uniforms MPTCP Join "listener" tests: it was imported code from userspace_pm.sh but without using the "code style" and ways of using tools and printing messages from MPTCP Join selftest. Link: https://lore.kernel.org/stable/CA+G9fYtDGpgT4dckXD-y-N92nqUxuvue_7AtDdBcHrbOMsDZLg@mail.gmail.com/ [1] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 ==================== Link: https://lore.kernel.org/r/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>