summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-23Merge branch 'net-sysctl-allow-dump_cpumask-to-handle-higher-numbers-of-cpus'Paolo Abeni
Antoine Tenart says: ==================== net: sysctl: allow dump_cpumask to handle higher numbers of CPUs The main goal of this series is to allow dump_cpumask to handle higher numbers of CPUs (patch 3). While doing so I had the opportunity to make the function a bit simpler, which is done in patches 1-2. None of those is net material IMO. ==================== Link: https://patch.msgid.link/20241017152422.487406-1-atenart@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-23net: sysctl: allow dump_cpumask to handle higher numbers of CPUsAntoine Tenart
This fixes the output of rps_default_mask and flow_limit_cpu_bitmap when the CPU count is > 448, as it was truncated. The underlying values are actually stored correctly when writing to these sysctl but displaying them uses a fixed length temporary buffer in dump_cpumask. This buffer can be too small if the CPU count is > 448. Fix this by dynamically allocating the buffer in dump_cpumask, using a guesstimate of what we need. Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-23net: sysctl: do not reserve an extra char in dump_cpumask temporary bufferAntoine Tenart
When computing the length we'll be able to use out of the buffers, one char is removed from the temporary one to make room for a newline. It should be removed from the output buffer length too, but in reality this is not needed as the later call to scnprintf makes sure a null char is written at the end of the buffer which we override with the newline. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-23net: sysctl: remove always-true conditionAntoine Tenart
Before adding a new line at the end of the temporary buffer in dump_cpumask, a length check is performed to ensure there is space for it. len = min(sizeof(kbuf) - 1, *lenp); len = scnprintf(kbuf, len, ...); if (len < *lenp) kbuf[len++] = '\n'; Note that the check is currently logically wrong, the written length is compared against the output buffer, not the temporary one. However this has no consequence as this is always true, even if fixed: scnprintf includes a null char at the end of the buffer but the returned length do not include it and there is always space for overriding it with a newline. Remove the condition. Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-23net: use sock_valbool_flag() only in __sock_set_timestamps()Yajun Deng
sock_{,re}set_flag() are contained in sock_valbool_flag(), it would be cleaner to just use sock_valbool_flag(). Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Link: https://patch.msgid.link/20241017133435.2552-1-yajun.deng@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-23netdevsim: macsec: pad u64 to correct length in logsAles Nezbeda
Commit 02b34d03a24b ("netdevsim: add dummy macsec offload") pads u64 number to 8 characters using "%08llx" format specifier. Changing format specifier to "%016llx" ensures that no matter the value the representation of number in log is always the same length. Before this patch, entry in log for value '1' would say: removing SecY with SCI 00000001 at index 2 After this patch is applied, entry in log will say: removing SecY with SCI 0000000000000001 at index 2 Signed-off-by: Ales Nezbeda <anezbeda@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20241017131933.136971-1-anezbeda@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: mv643xx: use ethtool_putsRosen Penev
Allows simplifying get_strings and avoids manual pointer manipulation. Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Message-ID: <20241018200522.12506-1-rosenp@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22net: atlantic: support reading SFP module infoLorenz Brun
Add support for reading SFP module info and digital diagnostic monitoring data if supported by the module. The only Aquantia controller without an integrated PHY is the AQC100 which belongs to the B0 revision, that's why it's only implemented there. The register information was extracted from a diagnostic tool made publicly available by Dell, but all code was written from scratch by me. This has been tested to work with a variety of both optical and direct attach modules I had lying around and seems to work fine with all of them, including the diagnostics if supported by an optical module. All tests have been done with an AQC100 on an TL-NT521F card on firmware version 3.1.121 (current at the time of this patch). Signed-off-by: Lorenz Brun <lorenz@brun.one> Reviewed-by: Simon Horman <horms@kernel.org> Message-ID: <20241018171721.2577386-1-lorenz@brun.one> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_dcbnl.cDipendra Khadka
Add error pointer check after calling otx2_mbox_get_rsp(). Fixes: 8e67558177f8 ("octeontx2-pf: PFC config support with DCBx") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_dmac_flt.cDipendra Khadka
Add error pointer checks after calling otx2_mbox_get_rsp(). Fixes: 79d2be385e9e ("octeontx2-pf: offload DMAC filters to CGX/RPM block") Fixes: fa5e0ccb8f3a ("octeontx2-pf: Add support for exact match table.") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in cn10k.cDipendra Khadka
Add error pointer check after calling otx2_mbox_get_rsp(). Fixes: 2ca89a2c3752 ("octeontx2-pf: TC_MATCHALL ingress ratelimiting offload") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_flows.cDipendra Khadka
Adding error pointer check after calling otx2_mbox_get_rsp(). Fixes: 9917060fc30a ("octeontx2-pf: Cleanup flow rule management") Fixes: f0a1913f8a6f ("octeontx2-pf: Add support for ethtool ntuple filters") Fixes: 674b3e164238 ("octeontx2-pf: Add additional checks while configuring ucast/bcast/mcast rules") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_ethtool.cDipendra Khadka
Add error pointer check after calling otx2_mbox_get_rsp(). Fixes: 75f36270990c ("octeontx2-pf: Support to enable/disable pause frames via ethtool") Fixes: d0cf9503e908 ("octeontx2-pf: ethtool fec mode support") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22octeontx2-pf: handle otx2_mbox_get_rsp errors in otx2_common.cDipendra Khadka
Add error pointer check after calling otx2_mbox_get_rsp(). Fixes: ab58a416c93f ("octeontx2-pf: cn10k: Get max mtu supported from admin function") Signed-off-by: Dipendra Khadka <kdipendra88@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
2024-10-22virtchnl: fix m68k build.Paolo Abeni
The kernel test robot reported a build failure on m68k in the intel driver due to the recent shapers-related changes. The mentioned arch has funny alignment properties, let's be explicit about the binary layout expectation introducing a padding field. Fixes: 608a5c05c39b ("virtchnl: support queue rate limit and quanta size configuration") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410131710.71Wt6LKO-lkp@intel.com/ Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/e45d1c9f17356d431b03b419f60b8b763d2ff768.1729000481.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22Merge branch 'net-netconsole-refactoring-and-warning-fix'Paolo Abeni
Breno Leitao says: ==================== net: netconsole refactoring and warning fix The netconsole driver was showing a warning related to userdata information, depending on the message size being transmitted: ------------[ cut here ]------------ WARNING: CPU: 13 PID: 3013042 at drivers/net/netconsole.c:1122 write_ext_msg+0x3b6/0x3d0 ? write_ext_msg+0x3b6/0x3d0 console_flush_all+0x1e9/0x330 ... Identifying the cause of this warning proved to be non-trivial due to: * The write_ext_msg() function being over 100 lines long * Extensive use of pointer arithmetic * Inconsistent naming conventions and concept application The send_ext_msg() function grew organically over time: * Initially, the UDP packet consisted of a header and body * Later additions included release prepend and userdata * Naming became inconsistent (e.g., "body" excludes userdata, "header" excludes prepended release) This lack of consistency made investigating issues like the above warning more challenging than what it should be. To address these issues, the following steps were taken: * Breaking down write_ext_msg() into smaller functions with clear scopes * Improving readability and reasoning about the code * Simplifying and clarifying naming conventions Warning Fix ----------- The warning occurred when there was insufficient buffer space to append userdata. While this scenario is acceptable (as userdata can be sent in a separate packet later), the kernel was incorrectly raising a warning. A one-line fix has been implemented to resolve this issue. The fix was already sent to net, and is already available in net-next also. v4: * https://lore.kernel.org/all/20240930131214.3771313-1-leitao@debian.org/ v3: * https://lore.kernel.org/all/20240910100410.2690012-1-leitao@debian.org/ v2: * https://lore.kernel.org/all/20240909130756.2722126-1-leitao@debian.org/ v1: * https://lore.kernel.org/all/20240903140757.2802765-1-leitao@debian.org/ ==================== Link: https://patch.msgid.link/20241017095028.3131508-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: split send_msg_fragmentedBreno Leitao
Refactor the send_msg_fragmented() function by extracting the logic for sending the message body into a new function called send_fragmented_body(). Now, send_msg_fragmented() handles appending the release and header, and then delegates the task of breaking up the body and sending the fragments to send_fragmented_body(). This is the final flow now: When send_ext_msg_udp() is called to send a message, it will: - call send_msg_no_fragmentation() if no fragmentation is needed or - call send_msg_fragmented() if fragmentation is needed * send_msg_fragmented() appends the header to the buffer, which is be persisted until the function returns * call send_fragmented_body() to iterate and populate the body of the message. It will not touch the header, and it will only replace the body, writing the msgbody and/or userdata. Also add some comment to make the code easier to review. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: do not pass userdata up to the tailBreno Leitao
Do not pass userdata to send_msg_fragmented, since we can get it later. This will be more useful in the next patch, where send_msg_fragmented() will be split even more, and userdata is only necessary in the last function. Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: extract release appending into separate functionBreno Leitao
Refactor the code by extracting the logic for appending the release into the buffer into a separate function. The goal is to reduce the size of send_msg_fragmented() and improve code readability. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: track explicitly if msgbody was written to bufferBreno Leitao
The current check to determine if the message body was fully sent is difficult to follow. To improve clarity, introduce a variable that explicitly tracks whether the message body (msgbody) has been completely sent, indicating when it's time to begin sending userdata. Additionally, add comments to make the code more understandable for others who may work with it. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: introduce variable to track body lengthBreno Leitao
This new variable tracks the total length of the data to be sent, encompassing both the message body (msgbody) and userdata, which is collectively called body. By explicitly defining body_len, the code becomes clearer and easier to reason about, simplifying offset calculations and improving overall readability of the function. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: rename body to msg_bodyBreno Leitao
With the introduction of the userdata concept, the term body has become ambiguous and less intuitive. To improve clarity, body is renamed to msg_body, making it clear that the body is not the only content following the header. In an upcoming patch, the term body_len will also be revised for further clarity. The current packet structure is as follows: release, header, body, [msg_body + userdata] Here, [msg_body + userdata] collectively forms what is currently referred to as "body." This renaming helps to distinguish and better understand each component of the packet. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: separate fragmented message handling in send_ext_msgBreno Leitao
Following the previous change, where the non-fragmented case was moved to its own function, this update introduces a new function called send_msg_fragmented to specifically manage scenarios where message fragmentation is required. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: split send_ext_msg_udp() functionBreno Leitao
The send_ext_msg_udp() function has become quite large, currently spanning 102 lines. Its complexity, along with extensive pointer and offset manipulation, makes it difficult to read and error-prone. The function has evolved over time, and it’s now due for a refactor. To improve readability and maintainability, isolate the case where no message fragmentation occurs into a separate function, into a new send_msg_no_fragmentation() function. This scenario covers about 95% of the messages. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: netconsole: remove msg_ready variableBreno Leitao
Variable msg_ready is useless, since it does not represent anything. Get rid of it, using buf directly instead. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22tools: ynl-gen: use big-endian netlink attribute typesAsbjørn Sloth Tønnesen
Change ynl-gen-c.py to use NLA_BE16 and NLA_BE32 types to represent big-endian u16 and u32 ynl types. Doing this enables those attributes to have range checks applied, as the validator will then convert to host endianness prior to validation. The autogenerated kernel/uapi code have been regenerated by running: ./tools/net/ynl/ynl-regen.sh -f This changes the policy types of the following attributes: FOU_ATTR_PORT (NLA_U16 -> NLA_BE16) FOU_ATTR_PEER_PORT (NLA_U16 -> NLA_BE16) These two are used with nla_get_be16/nla_put_be16(). MPTCP_PM_ADDR_ATTR_ADDR4 (NLA_U32 -> NLA_BE32) This one is used with nla_get_in_addr/nla_put_in_addr(), which uses nla_get_be32/nla_put_be32(). IOWs the generated changes are AFAICT aligned with their implementations. The generated userspace code remains identical, and have been verified by comparing the output generated by the following command: make -C tools/net/ynl/generated Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241017094704.3222173-1-ast@fiberby.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22Merge branch 'selftests-net-introduce-deferred-commands'Paolo Abeni
Petr Machata says: ==================== selftests: net: Introduce deferred commands Recently, a defer helper was added to Python selftests. The idea is to keep cleanup commands close to their dirtying counterparts, thereby making it more transparent what is cleaning up what, making it harder to miss a cleanup, and make the whole cleanup business exception safe. All these benefits are applicable to bash as well, exception safety can be interpreted in terms of safety vs. a SIGINT. This patchset therefore introduces a framework of several helpers that serve to schedule cleanups in bash selftests. - Patch #1 has more details about the primitives being introduced. Patch #2 adds a fallback cleanup() function to lib.sh, because ideally selftests wouldn't need to introduce a dedicated cleanup function at all. - Patch #3 adds a parameter to stop_traffic(), which makes it possible to start other background processes after the traffic is started without confusing the cleanup. - Patches #4 to #10 convert a number of selftests. The goal was to convert all tests that use start_traffic / stop_traffic to the defer framework. Leftover traffic generators are a particularly painful sort of a missed cleanup. Normal unfinished cleanups can usually be cleaned up simply by rerunning the test and interrupting it early to let the cleanups run again / in full. This does not work with stop_traffic, because it is only issued at the end of the test case that starts the traffic. At the same time, leftover traffic generators influence follow-up test runs, and are hard to notice. The tests were however converted whole-sale, not just their traffic bits. Thus they form a proof of concept of the defer framework. ==================== Link: https://patch.msgid.link/cover.1729157566.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: mlxsw: devlink_trap_police: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Note that the start_traffic commands in __burst_test() are each sending a fixed number of packets (note the -c flag) and then ending. They therefore do not need a matching stop_traffic. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: mlxsw: qos_max_descriptors: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: mlxsw: qos_ets_strict: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: mlxsw: qos_mc_aware: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: ETS: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: TBF: Use defer for test cleanupPetr Machata
Use the defer framework to schedule cleanups as soon as the command is executed. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: RED: Use defer for test cleanupPetr Machata
Instead of having a suite of dedicated cleanup functions, use the defer framework to schedule cleanups right as their setup functions are run. The sleep after stop_traffic() in mlxsw selftests is necessary, but scheduling it as "defer sleep; defer stop_traffic" is silly. Instead, add a local helper to stop traffic and sleep afterwards. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: forwarding: lib: Allow passing PID to stop_traffic()Petr Machata
Now that it is possible to schedule a deferral of stop_traffic() right after the traffic is started, we do not have to rely on the %% magic to kill the background process that was started last. Instead we can just give the PID explicitly. This makes it possible to start other background processes after the traffic is started without confusing the cleanup. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: forwarding: Add a fallback cleanup()Petr Machata
Consistent use of defers obviates the need for a separate test-specific cleanup function -- everything is just taken care of in defers. So in this patch, introduce a cleanup() helper in the forwarding lib.sh, which calls just pre_cleanup() and defer_scopes_cleanup(). Selftests are obviously still free to override the function. Since pre_cleanup() is too entangled with forwarding-specific minutia, the function cannot currently be in net/lib.sh. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22selftests: net: lib: Introduce deferred commandsPetr Machata
In commit 8510801a9dbd ("selftests: drv-net: add ability to schedule cleanup with defer()"), a defer helper was added to Python selftests. The idea is to keep cleanup commands close to their dirtying counterparts, thereby making it more transparent what is cleaning up what, making it harder to miss a cleanup, and make the whole cleanup business exception safe. All these benefits are applicable to bash as well, exception safety can be interpreted in terms of safety vs. a SIGINT. This patch therefore introduces a framework of several helpers that serve to schedule cleanups in bash selftests: - defer_scope_push(), defer_scope_pop(): Deferred statements can be batched together in scopes. When a scope is popped, the deferred commands scheduled in that scope are executed in the order opposite to order of their scheduling. - defer(): Schedules a defer to the most recently pushed scope (or the default scope if none was pushed.) - defer_prio(): Schedules a defer on the priority track. The priority defer queue is run before the default defer queue when scope is popped. The issue that this is addressing is specifically the one of restoring devlink shared buffer threshold type. When setting up static thresholds, one has to first change the threshold type to static, then override the individual thresholds. When cleaning up, it would be natural to reset the threshold values first, then change the threshold type. But the values that are valid for dynamic thresholds are generally invalid for static thresholds and vice versa. Attempts to restore the values first would be bounced. Thus one has to first reset the threshold type, then adjust the thresholds. (You could argue that the shared buffer threshold type API is broken and you would be right, but here we are.) This cannot be solved by pure defers easily. I considered making it possible to disable an existing defer, so that one could then schedule a new defer and disable the original. But this forward-shifting of the defer job would have to take place after every threshold-adjusting command, which would make it very awkward to schedule these jobs. - defer_scopes_cleanup(): Pops any unpopped scopes, including the default one. The selftests that use defer should run this in their exit trap. This is important to get cleanups of interrupted scripts. - in_defer_scope(): Sometimes a function would like to introduce a new defer scope, then run whatever it is that it wants to run, and then pop the scope to run the deferred cleanups. The helper in_defer_scope() can be used to run another command within such environment, such that any scheduled defers run after the command finishes. The framework is added as a separate file lib/sh/defer.sh so that it can be used by all bash selftests, including those that do not currently use lib.sh. lib.sh however includes the file by default, because ideally all tests would use these helpers instead of hand-rolling their cleanups. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: phy: marvell: Add mdix status reportingPaul Davey
Report MDI-X resolved state after link up. Tested on Linkstreet 88E6193X internal PHYs. Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241017015026.255224-1-paul.davey@alliedtelesis.co.nz Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22net: stmmac: Programming sequence for VLAN packets with split headerAbhishek Chauhan
Currently reset state configuration of split header works fine for non-tagged packets and we see no corruption in payload of any size We need additional programming sequence with reset configuration to handle VLAN tagged packets to avoid corruption in payload for packets of size greater than 256 bytes. Without this change ping application complains about corruption in payload when the size of the VLAN packet exceeds 256 bytes. With this change tagged and non-tagged packets of any size works fine and there is no corruption seen. Current configuration which has the issue for VLAN packet ---------------------------------------------------------- Split happens at the position at Layer 3 header |MAC-DA|MAC-SA|Vlan Tag|Ether type|IP header|IP data|Rest of the payload| 2 bytes ^ | With the fix we are making sure that the split happens now at Layer 2 which is end of ethernet header and start of IP payload Ip traffic split ----------------- Bits which take care of this are SPLM and SPLOFST SPLM = Split mode is set to Layer 2 SPLOFST = These bits indicate the value of offset from the beginning of Length/Type field at which header split should take place when the appropriate SPLM is selected. Reset value is 2bytes. Un-tagged data (without VLAN) |MAC-DA|MAC-SA|Ether type|IP header|IP data|Rest of the payload| 2bytes ^ | Tagged data (with VLAN) |MAC-DA|MAC-SA|VLAN Tag|Ether type|IP header|IP data|Rest of the payload| 2bytes ^ | Non-IP traffic split such AV packet ------------------------------------ Bits which take care of this are SAVE = Split AV Enable SAVO = Split AV Offset, similar to SPLOFST but this is for AVTP packets. |Preamble|MAC-DA|MAC-SA|VLAN tag|Ether type|IEEE 1722 payload|CRC| 2bytes ^ | Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241016234313.3992214-1-quic_abchauha@quicinc.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22Merge branch 'rtnetlink-refactor-rtnl_-new-del-set-link-for-per-netns-rtnl'Paolo Abeni
Kuniyuki Iwashima says: ==================== rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL. This is a prep for the next series where we will push RTNL down to rtnl_{new,del,set}link(). That means, for example, __rtnl_newlink() is always under RTNL, but rtnl_newlink() has a non-RTNL section. As a prerequisite for per-netns RTNL, we will move netns validation (and RTNL-independent validations if possible) to that section. rtnl_link_ops and rtnl_af_ops will be protected with SRCU not to depend on RTNL. Changes: v2: * Add Eric's Reviewed-by to patch 1-4,6,8-11, (no tag on 5,7,12-14) * Patch 7 * Handle error of init_srcu_struct(). * Call cleanup_srcu_struct() after synchronize_srcu(). * Patch 12 * Move put_net() before errorout label * Patch 13 * Newly added as prep for patch 14 * Patch 14 * Handle error of init_srcu_struct(). * Call cleanup_srcu_struct() after synchronize_srcu(). v1: https://lore.kernel.org/netdev/20241009231656.57830-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20241016185357.83849-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Protect struct rtnl_af_ops with SRCU.Kuniyuki Iwashima
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to guarantee that rtnl_af_ops is alive during inflight RTM_SETLINK even when its module is being unloaded. Let's use SRCU to protect ops. rtnl_af_lookup() now iterates rtnl_af_ops under RCU and returns SRCU-protected ops pointer. The caller must call rtnl_af_put() to release the pointer after the use. Also, rtnl_af_unregister() unlinks the ops first and calls synchronize_srcu() to wait for inflight RTM_SETLINK requests to complete. Note that rtnl_af_ops needs to be protected by its dedicated lock when RTNL is removed. Note also that BUG_ON() in do_setlink() is changed to the normal error handling as a different af_ops might be found after validate_linkmsg(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Return int from rtnl_af_register().Kuniyuki Iwashima
The next patch will add init_srcu_struct() in rtnl_af_register(), then we need to handle its error. Let's add the error handling in advance to make the following patch cleaner. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().Kuniyuki Iwashima
We will push RTNL down to rtnl_setlink(). RTM_SETLINK could call rtnl_link_get_net_capable() in do_setlink() to move a dev to a new netns, but the netns needs to be fetched before holding rtnl_net_lock(). Let's move it to rtnl_setlink() and pass the netns to do_setlink(). Now, RTM_NEWLINK paths (rtnl_changelink() and rtnl_group_changelink()) can pass the prefetched netns to do_setlink(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Clean up rtnl_setlink().Kuniyuki Iwashima
We will push RTNL down to rtnl_setlink(). Let's unify the error path to make it easy to place rtnl_net_lock(). While at it, keep the variables in reverse xmas order. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Clean up rtnl_dellink().Kuniyuki Iwashima
We will push RTNL down to rtnl_delink(). Let's unify the error path to make it easy to place rtnl_net_lock(). While at it, keep the variables in reverse xmas order. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().Kuniyuki Iwashima
Another netns option for RTM_NEWLINK is IFLA_LINK_NETNSID and is fetched in rtnl_newlink_create(). This must be done before holding rtnl_net_lock(). Let's move IFLA_LINK_NETNSID processing to rtnl_newlink(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().Kuniyuki Iwashima
As a prerequisite of per-netns RTNL, we must fetch netns before looking up dev or moving it to another netns. rtnl_link_get_net_capable() is called in rtnl_newlink_create() and do_setlink(), but both of them need to be moved to the RTNL-independent region, which will be rtnl_newlink(). Let's call rtnl_link_get_net_capable() in rtnl_newlink() and pass the netns down to where needed. Note that the latter two have not passed the nets to do_setlink() yet but will do so after the remaining rtnl_link_get_net_capable() is moved to rtnl_setlink() later. While at it, dest_net is renamed to tgt_net in rtnl_newlink_create() to align with rtnl_{del,set}link(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Protect struct rtnl_link_ops with SRCU.Kuniyuki Iwashima
Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to guarantee that rtnl_link_ops is alive during inflight RTM_NEWLINK even when its module is being unloaded. Let's use SRCU to protect ops. rtnl_link_ops_get() now iterates link_ops under RCU and returns SRCU-protected ops pointer. The caller must call rtnl_link_ops_put() to release the pointer after the use. Also, __rtnl_link_unregister() unlinks the ops first and calls synchronize_srcu() to wait for inflight RTM_NEWLINK requests to complete. Note that link_ops needs to be protected by its dedicated lock when RTNL is removed. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Move ops->validate to rtnl_newlink().Kuniyuki Iwashima
ops->validate() does not require RTNL. Let's move it to rtnl_newlink(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().Kuniyuki Iwashima
Currently, if neither dev nor rtnl_link_ops is found in __rtnl_newlink(), we release RTNL and redo the whole process after request_module(), which complicates the logic. The ops will be RTNL-independent later. Let's move the ops lookup to rtnl_newlink() and do the retry earlier. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>