summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-30net: sparx5: add compatible string for lan969xDaniel Machon
Add lan9691-switch compatible string to mchp_sparx5_match. Guard it with IS_ENABLED(CONFIG_LAN969X_SWITCH) to make sure Sparx5 can be compiled on its own. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-14-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30dt-bindings: net: add compatible strings for lan969x targetsDaniel Machon
Add compatible strings for the twelve different lan969x targets that we support. Either a sparx5-switch or lan9691-switch compatible string provided on their own, or any lan969x-switch compatible string with a fallback to lan9691-switch. Also, add myself as a maintainer. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-13-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: use is_sparx5() macro throughoutDaniel Machon
Use the is_sparx5() macro (introduced in earlier series [1]), in places where we need to handle things a bit differently on lan969x. These places are: - in sparx5_dsm_calendar_update() we need to switch the calendar from a to b on lan969x. - in sparx5_start() we need to make sure the HSCH_SYS_CLK_PER register is only touched on Sparx5. - in sparx5_start() we need to disable VCAP and FDMA for lan969x (will come in later series). - in sparx5_mirror_port_get() we must make sure the ANA_AC_PROBE_PORT_CFG1 register is only read on Sparx5. - sparx5_netdev.c and sparx5_packet.c we need to use different IFH (Internal Frame Header) offsets for lan969x. - in sparx5_port_fifo_sz() we must bail out on lan969x. - in sparx5_port_config_low_set() we must configure the phase detection registers. - in sparx5_port_config() and sparx5_port_init() we must do some additional configuration of the port devices. - in sparx5_dwrr_conf_set() we must derive the scheduling layer [1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-8-d3290f581663@microchip.com/ Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-12-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add function for calculating the DSM calendarDaniel Machon
Lan969x has support for RedBox / HSR / PRP (not implemented yet). In order to accommodate for this in the future, we need to give lan969x it's own function for calculating the DSM calendar. The function calculates the calendar for each taxi bus. The calendar is used for bandwidth allocation towards the ports attached to the taxi bus. A calendar configuration consists of up-to 64 slots, which may be allocated to ports or left unused. Each slot accounts for 1 clock cycle. Also expose sparx5_cal_speed_to_value(), sparx5_get_port_cal_speed, sparx5_cal_bw and SPX5_DSM_CAL_EMPTY for use by lan969x. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-11-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add PTP handler functionDaniel Machon
Add PTP IRQ handler for lan969x. This is required, as the PTP registers are placed in two different targets on Sparx5 and lan969x. The implementation is otherwise the same as on Sparx5. Also, expose sparx5_get_hwtimestamp() for use by lan969x. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-10-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add lan969x ops to match dataDaniel Machon
Add a bunch of small lan969x ops in bulk. These ops are explained in detail in a previous series [1]. [1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-8-d3290f581663@microchip.com/ Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-9-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add constants to match dataDaniel Machon
Add the lan969x constants to match data. These are already used throughout the Sparx5 code (introduced in earlier series [1]), so no need to update any code use. [1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com/ Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-8-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add register diffs to match dataDaniel Machon
Add new file lan969x_regs.c that defines all the register differences for lan969x, and add it to the lan969x match data. GW_DEV2G5_PHASE_DETECTOR_CTRL, FP_DEV2G5_PHAD_CTRL_PHAD_ENA and FP_DEV2G5_PHAD_CTRL_PHAD_FAILED are required by the new register macros which was introduced earlier. Add these for Sparx5 also. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-7-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: lan969x: add match data for lan969xDaniel Machon
Add match data for lan969x, with initial fields for iomap, iomap_size and ioranges. Add new Kconfig symbol CONFIG_LAN969X_CONFIG for compiling the lan969x driver. It has been decided to give lan969x its own Kconfig symbol, as a considerable amount of code is needed, beside the Sparx5 code, to add full chip support (and more will be added in future series). Also this makes it possible to compile Sparx5 without lan969x. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-6-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: add registers required by lan969xDaniel Machon
Lan969x will require a few additional registers for certain operations. Some are shared, some are not. Add these. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-5-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: add sparx5 context pointer to a few functionsDaniel Machon
In preparation for lan969x, add the sparx5 context pointer to certain IFH (Internal Frame Header) functions. This is required, as the is_sparx5() function will be used here in a subsequent patch. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-4-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: change frequency calculation for SDLB'sDaniel Machon
In preparation for lan969x, rework the function that calculates the SDLB (Service Dual Leacky Bucket) clock. This is required, as the HSCH_SYS_CLK_PER register is Sparx5-exclusive. Instead derive the clock from the core clock, using the sparx5_clk_period() function. The clock stays the same before and after this patch, only now, sparx5_sdlb_clk_hz_get() can be used for lan969x too. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-3-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: change spx5_wr to spx5_rmw in cal update()Daniel Machon
In preparation for lan969x, use spx5_rmw() for enabling the update of the calendar. This is required to not overwrite the DSM_TAXI_CAL_CFG register, as an additional write will be added before this one, in a subsequent patch. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-2-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sparx5: add support for lan969x targets and core clockDaniel Machon
In preparation for lan969x, add lan969x targets to sparx5_target_chiptype and set the core clock frequency for these throughout. Lan969x only supports a core clock frequency of 328MHz. Also, set the policer update internal (pol_upd_int) matching the 328 MHz frequency of the lan969x targets. Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-1-a0b5fae88a0f@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30gve: change to use page_pool_put_full_page when recycling pagesHarshitha Ramamurthy
The driver currently uses page_pool_put_page() to recycle page pool pages. Since gve uses split pages, if the fragment being recycled is not the last fragment in the page, there is no dma sync operation. When the last fragment is recycled, dma sync is performed by page pool infra according to the value passed as dma_sync_size which right now is set to the size of fragment. But the correct thing to do is to dma sync the entire page when the last fragment is recycled. Hence change to using page_pool_put_full_page(). Link: https://lore.kernel.org/netdev/89d7ce83-cc1d-4791-87b5-6f7af29a031d@huawei.com/ Suggested-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Fixes: ebdfae0d377b ("gve: adopt page pool for DQ RDA mode") Link: https://patch.msgid.link/20241023221141.3008011-1-pkaligineedi@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30Merge branch 'refactoring-rvu-nic-driver'Jakub Kicinski
Geetha sowjanya says: ==================== Refactoring RVU NIC driver This is a preparation pathset for follow-up "Introducing RVU representors driver" patches. The RVU representor driver creates representor netdev of each rvu device when switch dev mode is enabled. RVU representor and NIC have a similar set of HW resources(NIX_LF,RQ/SQ/CQ) and implements a subset of NIC functionality. This patch set groups hw resources and queue configuration code into single API and export the existing functions so, that code can be shared between NIC and representor drivers. These patches are part of "Introduce RVU representors" patchset. https://lore.kernel.org/all/ZsdJ-w00yCI4NQ8T@nanopsycho.orion/T/ As suggested by "Jiri Pirko", submitting as separate patchset. ==================== Link: https://patch.msgid.link/20241023161843.15543-1-gakula@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30octeontx2-pf: Move shared APIs to header fileGeetha sowjanya
Move mbox, hw resources and interrupt configuration functions to common header file. So, that they can be used later by the RVU representor driver. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241023161843.15543-5-gakula@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30octeontx2-pf: Reuse PF max mtu valueGeetha sowjanya
Reuse the maximum support HW MTU value that is fetch during probe. Instead of fetching through mbox each time mtu is changed as the value is fixed for interface. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241023161843.15543-4-gakula@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30octeontx2-pf: Add new APIs for queue memory alloc/free.Geetha sowjanya
Group the queue(RX/TX/CQ) memory allocation and free code to single APIs. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241023161843.15543-3-gakula@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30octeontx2-pf: Define common API for HW resources configurationGeetha sowjanya
Define new API "otx2_init_rsrc" and move the HW blocks NIX/NPA resources configuration code under this API. So, that it can be used by the RVU representor driver that has similar resources of RVU NIC. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241023161843.15543-2-gakula@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30Merge branch 'mirroring-to-dsa-cpu-port'Jakub Kicinski
Vladimir Oltean says: ==================== Mirroring to DSA CPU port Users of the NXP LS1028A SoC (drivers/net/dsa/ocelot L2 switch inside) have requested to mirror packets from the ingress of a switch port to software. Both port-based and flow-based mirroring is required. The simplest way I could come up with was to set up tc mirred actions towards a dummy net_device, and make the offloading of that be accepted by the driver. Currently, the pattern in drivers is to reject mirred towards ports they don't know about, but I'm now permitting that, precisely by mirroring "to the CPU". For testers, this series depends on commit 34d35b4edbbe ("net/sched: act_api: deny mismatched skip_sw/skip_hw flags for actions created by classifiers") from net/main, which is absent from net-next as of the day of posting (Oct 23). Without the bug fix it is possible to create invalid configurations which are not rejected by the kernel. v2: https://lore.kernel.org/20241017165215.3709000-1-vladimir.oltean@nxp.com RFC: https://lore.kernel.org/20240913152915.2981126-1-vladimir.oltean@nxp.com For historical purposes, link to a much older (and much different) attempt: https://lore.kernel.org/20191002233750.13566-1-olteanv@gmail.com ==================== Link: https://patch.msgid.link/20241023135251.1752488-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: mscc: ocelot: allow tc-flower mirred action towards foreign interfacesVladimir Oltean
Debugging certain flows in the offloaded switch data path can be done by installing two tc-mirred filters for mirroring: one in the hardware data path, which copies the frames to the CPU, and one which takes the frame from there and mirrors it to a virtual interface like a dummy device, where it can be seen with tcpdump. The effect of having 2 filters run on the same packet can be obtained by default using tc, by not specifying either the 'skip_sw' or 'skip_hw' keywords. Instead of refusing to offload mirroring/redirecting packets towards interfaces that aren't switch ports, just treat every other destination for what it is: something that is handled in software, behind the CPU port. Usage: $ ip link add dummy0 type dummy; ip link set dummy0 up $ tc qdisc add dev swp0 clsact $ tc filter add dev swp0 ingress protocol ip flower action mirred ingress mirror dev dummy0 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-7-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: dsa: allow matchall mirroring rules towards the CPUVladimir Oltean
If the CPU bandwidth capacity permits, it may be useful to mirror the entire ingress of a user port to software. This is in fact possible to express even if there is no net_device representation for the CPU port. In fact, that approach was already exhausted and that representation wouldn't have even helped [1]. The idea behind implementing this is that currently, we refuse to offload any mirroring towards a non-DSA target net_device. But if we acknowledge the fact that to reach any foreign net_device, the switch must send the packet to the CPU anyway, then we can simply offload just that part, and let the software do the rest. There is only one condition we need to uphold: the filter needs to be present in the software data path as well (no skip_sw). There are 2 actions to consider: FLOW_ACTION_MIRRED (redirect to egress of target interface) and FLOW_ACTION_MIRRED_INGRESS (redirect to ingress of target interface). We don't have the ability/API to offload FLOW_ACTION_MIRRED_INGRESS when the target port is also a DSA user port, but we could also permit that through mirred to the CPU + software. Example: $ ip link add dummy0 type dummy; ip link set dummy0 up $ tc qdisc add dev swp0 clsact $ tc filter add dev swp0 ingress matchall action mirred ingress mirror dev dummy0 Any DSA driver with a ds->ops->port_mirror_add() implementation can now make use of this with no additional change. [1] https://lore.kernel.org/netdev/20191002233750.13566-1-olteanv@gmail.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: dsa: add more extack messages in dsa_user_add_cls_matchall_mirred()Vladimir Oltean
Do not leave -EOPNOTSUPP errors without an explanation. It is confusing for the user to figure out what is wrong otherwise. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: dsa: use "extack" as argument to flow_action_basic_hw_stats_check()Vladimir Oltean
We already have an "extack" stack variable in dsa_user_add_cls_matchall_police() and dsa_user_add_cls_matchall_mirred(), there is no need to retrieve it again from cls->common.extack. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: dsa: clean up dsa_user_add_cls_matchall()Vladimir Oltean
The body is a bit hard to read, hard to extend, and has duplicated conditions. Clean up the "if (many conditions) else if (many conditions, some of them repeated)" pattern by: - Moving the repeated conditions out - Replacing the repeated tests for the same variable with a switch/case - Moving the protocol check inside the dsa_user_add_cls_matchall_mirred() function call. This is pure refactoring, no logic has been changed, though some tests were reordered. The order does not matter - they are independent things to be tested for. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241023135251.1752488-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: sched: propagate "skip_sw" flag to struct flow_cls_common_offloadVladimir Oltean
Background: switchdev ports offload the Linux bridge, and most of the packets they handle will never see the CPU. The ports between which there exists no hardware data path are considered 'foreign' to switchdev. These can either be normal physical NICs without switchdev offload, or incompatible switchdev ports, or virtual interfaces like veth/dummy/etc. In some cases, an offloaded filter can only do half the work, and the rest must be handled by software. Redirecting/mirroring from the ingress of a switchdev port towards a foreign interface is one example of combined hardware/software data path. The most that the switchdev port can do is to extract the matching packets from its offloaded data path and send them to the CPU. From there on, the software filter runs (a second time, after the first run in hardware) on the packet and performs the mirred action. It makes sense for switchdev drivers which allow this kind of "half offloading" to sense the "skip_sw" flag of the filter/action pair, and deny attempts from the user to install a filter that does not run in software, because that simply won't work. In fact, a mirred action on a switchdev port towards a dummy interface appears to be a valid way of (selectively) monitoring offloaded traffic that flows through it. IFF_PROMISC was also discussed years ago, but (despite initial disagreement) there seems to be consensus that this flag should not affect the destination taken by packets, but merely whether or not the NIC discards packets with unknown MAC DA for local processing. [1] https://lore.kernel.org/netdev/20190830092637.7f83d162@ceranb/ [2] https://lore.kernel.org/netdev/20191002233750.13566-1-olteanv@gmail.com/ Suggested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/netdev/ZxUo0Dc0M5Y6l9qF@shredder.mtl.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241023135251.1752488-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30Merge branch 'ptp-driver-for-s390-clocks'Jakub Kicinski
Sven Schnelle says: ==================== PtP driver for s390 clocks these patches add support for using the s390 physical and TOD clock as ptp clock. To do so, the first patch adds a clock id to the s390 TOD clock, while the second patch adds the PtP driver itself. ==================== Link: https://patch.msgid.link/20241023065601.449586-1-svens@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30s390/time: Add PtP driverSven Schnelle
Add a small PtP driver which allows user space to get the values of the physical and tod clock. This allows programs like chrony to use STP as clock source and steer the kernel clock. The physical clock can be used as a debugging aid to get the clock without any additional offsets like STP steering or LPAR offset. Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Link: https://patch.msgid.link/20241023065601.449586-3-svens@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30s390/time: Add clocksource id to TOD clockSven Schnelle
To allow specifying the clock source in the upcoming PtP driver, add a clocksource ID to the s390 TOD clock. Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Link: https://patch.msgid.link/20241023065601.449586-2-svens@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30tests: hsr: Increase timeout to 50 secondsYunshui Jiang
The HSR test, hsr_ping.sh, actually needs 7 min to run. Around 375s to be exact, and even more on a debug kernel or kernel with other network security limits. The timeout setting for the kselftest is currently 45 seconds, which is way too short to integrate hsr tests to run_kselftest infrastructure. However, timeout of hundreds of seconds is quite a long time, especially in a CI/CD environment. It seems that we need accelerate the test and balance with timeout setting. The most time-consuming func is do_ping_long, where ping command sends 10 packages to the given address. The default interval between two ping packages is 1s according to the ping Mannual. There isn't any operation between pings thus we could pass -i 0.1 to ping to make it 10 times faster. While even with this short interval, the test still need about 46.4 seconds to finish because of the two HSR interfaces, each of which is tested by calling do_ping func 12 times and do_ping_long func 19 times and sleep for 3s. So, an explicit setting is also needed to slightly increase the timeout. And to leave us some slack, use 50 as default timeout. Signed-off-by: Yunshui Jiang <jiangyunshui@kylinos.cn> Link: https://patch.msgid.link/20241028082757.2945232-1-jiangyunshui@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30x86/uaccess: Avoid barrier_nospec() in 64-bit copy_from_user()Linus Torvalds
The barrier_nospec() in 64-bit copy_from_user() is slow. Instead use pointer masking to force the user pointer to all 1's for an invalid address. The kernel test robot reports a 2.6% improvement in the per_thread_ops benchmark [1]. This is a variation on a patch originally by Josh Poimboeuf [2]. Link: https://lore.kernel.org/202410281344.d02c72a2-oliver.sang@intel.com [1] Link: https://lore.kernel.org/5b887fe4c580214900e21f6c61095adf9a142735.1730166635.git.jpoimboe@kernel.org [2] Tested-and-reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-30Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update more header copies with the kernel sources, including const.h, msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf trace' - Fix up the python binding build on architectures without HAVE_KVM_STAT_SUPPORT - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to collect syscall pointer arguments in 'perf trace') to make the resulting bytecode to pass the kernel BPF verifier, allowing us to go back accepting clang 12.0.1 as the minimum version required for compiling BPF sources - Add __NR_capget for x86 to fix a regression on running perf + intel PT (hw tracing) as non-root setting up the capabilities as described in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html - Fix missing syscalltbl in non-explicitly listed architectures, noticed on ARM 32-bit, that still needs a .tbl generator for the syscall id<->name tables, should be added for v6.13 - Handle 'perf test' failure when handling broken DWARF for ASM files * tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cap: Add __NR_capget to arch/x86 unistd tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT perf test: Handle perftool-testsuite_probe failure due to broken DWARF tools headers UAPI: Sync kvm headers with the kernel sources perf trace: Fix non-listed archs in the syscalltbl routines perf build: Change the clang check back to 12.0.1 perf trace augmented_raw_syscalls: Add more checks to pass the verifier perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers perf trace: The return from 'write' isn't a pid tools headers UAPI: Sync linux/const.h with the kernel headers
2024-10-30Merge branch 'fixes-for-bits-iterator'Alexei Starovoitov
Hou Tao says: ==================== The patch set fixes several issues in bits iterator. Patch #1 fixes the kmemleak problem of bits iterator. Patch #2~#3 fix the overflow problem of nr_bits. Patch #4 fixes the potential stack corruption when bits iterator is used on 32-bit host. Patch #5 adds more test cases for bits iterator. Please see the individual patches for more details. And comments are always welcome. --- v4: * patch #1: add ack from Yafang * patch #3: revert code-churn like changes: (1) compute nr_bytes and nr_bits before the check of nr_words. (2) use nr_bits == 64 to check for single u64, preventing build warning on 32-bit hosts. * patch #4: use "BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" v3: https://lore.kernel.org/bpf/20241025013233.804027-1-houtao@huaweicloud.com/T/#t * split the bits-iterator related patches from "Misc fixes for bpf" patch set * patch #1: use "!nr_bits || bits >= nr_bits" to stop the iteration * patch #2: add a new helper for the overflow problem * patch #3: decrease the limitation from 512 to 511 and check whether nr_bytes is too large for bpf memory allocator explicitly * patch #5: add two more test cases for bit iterator v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241030100516.3633640-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30selftests/bpf: Add three test cases for bits_iterHou Tao
Add more test cases for bits iterator: (1) huge word test Verify the multiplication overflow of nr_bits in bits_iter. Without the overflow check, when nr_words is 67108865, nr_bits becomes 64, causing bpf_probe_read_kernel_common() to corrupt the stack. (2) max word test Verify correct handling of maximum nr_words value (511). (3) bad word test Verify early termination of bits iteration when bits iterator initialization fails. Also rename bits_nomem to bits_too_big to better reflect its purpose. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Use __u64 to save the bits in bits iteratorHou Tao
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption. The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new(). Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Check the validity of nr_words in bpf_iter_bits_new()Hou Tao
Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008). Fix it by limiting the maximum value of nr_words to 511. The value is derived from the current implementation of BPF memory allocator. To ensure compatibility if the BPF memory allocator's size limitation changes in the future, use the helper bpf_mem_alloc_check_size() to check whether nr_bytes is too larger. And return -E2BIG instead of -ENOMEM for oversized nr_bytes. Fixes: 4665415975b0 ("bpf: Add bits iterator") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Add bpf_mem_alloc_check_size() helperHou Tao
Introduce bpf_mem_alloc_check_size() to check whether the allocation size exceeds the limitation for the kmalloc-equivalent allocator. The upper limit for percpu allocation is LLIST_NODE_SZ bytes larger than non-percpu allocation, so a percpu argument is added to the helper. The helper will be used in the following patch to check whether the size parameter passed to bpf_mem_alloc() is too big. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()Hou Tao
bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect and may cause a kmemleak as shown below: unreferenced object 0xffff88812628c8c0 (size 32): comm "swapper/0", pid 1, jiffies 4294727320 hex dump (first 32 bytes): b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U........... f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 .............. backtrace (crc 781e32cc): [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 [<00000000b119f72f>] kernel_init+0x20/0x160 [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 That is because nr_bits will be set as zero in bpf_iter_bits_next() after all bits have been iterated. Fix the issue by setting kit->bit to kit->nr_bits instead of setting kit->nr_bits to zero when the iteration completes in bpf_iter_bits_next(). In addition, use "!nr_bits || bits >= nr_bits" to check whether the iteration is complete and still use "nr_bits > 64" to indicate whether bits are dynamically allocated. The "!nr_bits" check is necessary because bpf_iter_bits_new() may fail before setting kit->nr_bits, and this condition will stop the iteration early instead of accessing the zeroed or freed kit->bits. Considering the initial value of kit->bits is -1 and the type of kit->nr_bits is unsigned int, change the type of kit->nr_bits to int. The potential overflow problem will be handled in the following patch. Fixes: 4665415975b0 ("bpf: Add bits iterator") Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecsSungwoo Kim
Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-10-30Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, both in drivers (ufs and scsi_debug)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix another deadlock during RTC update scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length
2024-10-30selftests/bpf: Add a selftest for bpf_csum_diff()Puranjay Mohan
Add a selftest for the bpf_csum_diff() helper. This selftests runs the helper in all three configurations(push, pull, and diff) and verifies its output. The correct results have been computed by hand and by the helper's older implementation. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241026125339.26459-5-puranjay@kernel.org
2024-10-30selftests/bpf: Don't mask result of bpf_csum_diff() in test_verifierPuranjay Mohan
The bpf_csum_diff() helper has been fixed to return a 16-bit value for all archs, so now we don't need to mask the result. This commit is basically reverting the below: commit 6185266c5a85 ("selftests/bpf: Mask bpf_csum_diff() return value to 16 bits in test_verifier") Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241026125339.26459-4-puranjay@kernel.org
2024-10-30bpf: bpf_csum_diff: Optimize and homogenize for all archsPuranjay Mohan
1. Optimization ------------ The current implementation copies the 'from' and 'to' buffers to a scratchpad and it takes the bitwise NOT of 'from' buffer while copying. In the next step csum_partial() is called with this scratchpad. so, mathematically, the current implementation is doing: result = csum(to - from) Here, 'to' and '~ from' are copied in to the scratchpad buffer, we need it in the scratchpad buffer because csum_partial() takes a single contiguous buffer and not two disjoint buffers like 'to' and 'from'. We can re write this equation to: result = csum(to) - csum(from) using the distributive property of csum(). this allows 'to' and 'from' to be at different locations and therefore this scratchpad and copying is not needed. This in C code will look like: result = csum_sub(csum_partial(to, to_size, seed), csum_partial(from, from_size, 0)); 2. Homogenization -------------- The bpf_csum_diff() helper calls csum_partial() which is implemented by some architectures like arm and x86 but other architectures rely on the generic implementation in lib/checksum.c The generic implementation in lib/checksum.c returns a 16 bit value but the arch specific implementations can return more than 16 bits, this works out in most places because before the result is used, it is passed through csum_fold() that turns it into a 16-bit value. bpf_csum_diff() directly returns the value from csum_partial() and therefore the returned values could be different on different architectures. see discussion in [1]: for the int value 28 the calculated checksums are: x86 : -29 : 0xffffffe3 generic (arm64, riscv) : 65507 : 0x0000ffe3 arm : 131042 : 0x0001ffe2 Pass the result of bpf_csum_diff() through from32to16() before returning to homogenize this result for all architectures. NOTE: from32to16() is used instead of csum_fold() because csum_fold() does from32to16() + bitwise NOT of the result, which is not what we want to do here. [1] https://lore.kernel.org/bpf/CAJ+HfNiQbOcqCLxFUP2FMm5QrLXUUaj852Fxe3hn_2JNiucn6g@mail.gmail.com/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241026125339.26459-3-puranjay@kernel.org
2024-10-30net: checksum: Move from32to16() to generic headerPuranjay Mohan
from32to16() is used by lib/checksum.c and also by arch/parisc/lib/checksum.c. The next patch will use it in the bpf_csum_diff helper. Move from32to16() to the include/net/checksum.h as csum_from32to16() and remove other implementations. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20241026125339.26459-2-puranjay@kernel.org
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-2-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30ALSA: usb-audio: Add quirks for Dell WD19 dockJan Schär
The WD19 family of docks has the same audio chipset as the WD15. This change enables jack detection on the WD19. We don't need the dell_dock_mixer_init quirk for the WD19. It is only needed because of the dell_alc4020_map quirk for the WD15 in mixer_maps.c, which disables the volume controls. Even for the WD15, this quirk was apparently only needed when the dock firmware was not updated. Signed-off-by: Jan Schär <jan@jschaer.ch> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029221249.15661-1-jan@jschaer.ch Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30Merge tag 'asoc-fix-v6.12-rc5' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.12 The biggest set of changes here is Hans' fixes and quirks for various Baytrail based platforms with RT5640 CODECs, and there's one core fix for a missed length assignment for __counted_by() checking. Otherwise it's small device specific fixes, several of them in the DT bindings.
2024-10-30Merge branch 'tcp-warn-once'David S. Miller
Jason Xing says: ==================== tcp: add tcp_warn_once() common helper Paolo Abeni suggested we can introduce a new helper to cover more cases in the future for better debug. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>