summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-09nfp: support IPsec offloading for NFP3800Huanhuan Wang
Add IPsec offloading support for NFP3800. Include data plane and control plane. Data plane: add IPsec packet process flow in NFP3800 datapath (NFDk). Control plane: add an algorithm support distinction flow in xfrm hook function xdo_dev_state_add(), as NFP3800 has a different set of IPsec algorithm support. This matches existing support for the NFP6000/NFP4000 and their NFD3 datapath. In addition, fixup the md_bytes calculation for NFD3 datapath to make sure the two datapahts are keept in sync. Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230208091000.4139974-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09Merge branch 'net-introduce-rps_default_mask'Jakub Kicinski
Paolo Abeni says: ==================== net: introduce rps_default_mask Real-time setups try hard to ensure proper isolation between time critical applications and e.g. network processing performed by the network stack in softirq and RPS is used to move the softirq activity away from the isolated core. If the network configuration is dynamic, with netns and devices routinely created at run-time, enforcing the correct RPS setting on each newly created device allowing to transient bad configuration became complex. Additionally, when multi-queue devices are involved, configuring rps in user-space on each queue easily becomes very expensive, e.g. some setups use veths with per cpu queues. These series try to address the above, introducing a new sysctl knob: rps_default_mask. The new sysctl entry allows configuring a netns-wide RPS mask, to be enforced since receive queue creation time without any fourther per device configuration required. Additionally, a simple self-test is introduced to check the rps_default_mask behavior. ==================== Link: https://lore.kernel.org/r/cover.1675789134.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09self-tests: introduce self-tests for RPS default maskPaolo Abeni
Ensure that RPS default mask changes take place on all newly created netns/devices and don't affect existing ones. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09net: introduce default_rps_mask netns attributePaolo Abeni
If RPS is enabled, this allows configuring a default rps mask, which is effective since receive queue creation time. A default RPS mask allows the system admin to ensure proper isolation, avoiding races at network namespace or device creation time. The default RPS mask is initially empty, and can be modified via a newly added sysctl entry. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09net-sysctl: factor-out rpm mask manipulation helpersPaolo Abeni
Will simplify the following patch. No functional change intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09net-sysctl: factor out cpumask parsing helperPaolo Abeni
Will be used by the following patch to avoid code duplication. No functional changes intended. The only difference is that now flow_limit_cpu_sysctl() will always compute the flow limit mask on each read operation, even when read() will not return any byte to user-space. Note that the new helper is placed under a new #ifdef at the file start to better fit the usage in the later patch Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09Bluetooth: btintel: Set Per Platform Antenna Gain(PPAG)Seema Sreemantha
Antenna gain is defined as the antenna’s ability to increase the Tx power in a given direction. Intel is certifying its products with fixed reference antenna peak gain values (3/5dBi). The feature takes into account the actual antenna gain, and increases output power values, which results in a performance improvement. After firmware download is completed, driver reads from ACPI table and configures PPAG as required. ACPI table entry for PPAG is defined as below. Name (PPAG, Package (0x02) { 0x00000001, Package (0x02) { 0x00000012, /* Bluetooth Domain */ 0x00000001 /* 1 - Enable PPAG, 0 - Disable PPAG */ } }) btmon log: < HCI Command: Intel Configure Per Platform Antenna Gain (0x3f|0x0219) plen 12 Mcc: 0x00000000 Selector: Enable Delta: 0x00000000 > HCI Event: Command Complete (0x0e) plen 4 Intel Configure Per Platform Antenna Gain (0x3f|0x0219) ncmd 1 Status: Success (0x00) Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Seema Sreemantha <seema.sreemantha@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: Make sure LE create conn cancel is sent when timeoutArchie Pusaka
When sending LE create conn command, we set a timer with a duration of HCI_LE_CONN_TIMEOUT before timing out and calling create_le_conn_complete. Additionally, when receiving the command complete, we also set a timer with the same duration to call le_conn_timeout. Usually the latter will be triggered first, which then sends a LE create conn cancel command. However, due to the nature of racing, it is possible for the former to be called first, thereby calling the chain hci_conn_failed -> hci_conn_del -> cancel_delayed_work, thereby preventing LE create conn cancel to be sent. In this situation, the controller will be stuck in trying the LE connection. This patch flushes le_conn_timeout on create_le_conn_complete to make sure we always send LE create connection cancel, if necessary. Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Ying Hsu <yinghsu@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: Free potentially unfreed SCO connectionArchie Pusaka
It is possible to initiate a SCO connection while deleting the corresponding ACL connection, e.g. in below scenario: (1) < hci setup sync connect command (2) > hci disconn complete event (for the acl connection) (3) > hci command complete event (for(1), failure) When it happens, hci_cs_setup_sync_conn won't be able to obtain the reference to the SCO connection, so it will be stuck and potentially hinder subsequent connections to the same device. This patch prevents that by also deleting the SCO connection if it is still not established when the corresponding ACL connection is deleted. Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Ying Hsu <yinghsu@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: hci_qca: get wakeup status from serdev device handleZhengping Jiang
Bluetooth controller attached via the UART is handled by the serdev driver. Get the wakeup status from the device handle through serdev, instead of the parent path. Fixes: c1a74160eaf1 ("Bluetooth: hci_qca: Add device_may_wakeup support") Signed-off-by: Zhengping Jiang <jiangzp@google.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: L2CAP: Fix potential user-after-freeLuiz Augusto von Dentz
This fixes all instances of which requires to allocate a buffer calling alloc_skb which may release the chan lock and reacquire later which makes it possible that the chan is disconnected in the meantime. Fixes: a6a5568c03c4 ("Bluetooth: Lock the L2CAP channel when sending") Reported-by: Alexander Coffin <alex.coffin@matician.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: MGMT: add CIS feature bits to controller informationPauli Virtanen
Userspace needs to know whether the adapter has feature support for Connected Isochronous Stream - Central/Peripheral, so it can set up LE Audio features accordingly. Expose these feature bits as settings in MGMT controller info. Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: hci_conn: Refactor hci_bind_bis() since it always succeedsKees Cook
The compiler thinks "conn" might be NULL after a call to hci_bind_bis(), which cannot happen. Avoid any confusion by just making it not return a value since it cannot fail. Fixes the warnings seen with GCC 13: In function 'arch_atomic_dec_and_test', inlined from 'atomic_dec_and_test' at ../include/linux/atomic/atomic-instrumented.h:576:9, inlined from 'hci_conn_drop' at ../include/net/bluetooth/hci_core.h:1391:6, inlined from 'hci_connect_bis' at ../net/bluetooth/hci_conn.c:2124:3: ../arch/x86/include/asm/rmwcc.h:37:9: warning: array subscript 0 is outside array bounds of 'atomic_t[0]' [-Warray-bounds=] 37 | asm volatile (fullop CC_SET(cc) \ | ^~~ ... In function 'hci_connect_bis': cc1: note: source object is likely at address zero Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: HCI: Replace zero-length arrays with flexible-array membersGustavo A. R. Silva
Zero-length arrays are deprecated[1] and we are moving towards adopting C99 flexible-array members instead. So, replace zero-length arrays in a couple of structures with flex-array members. This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [2]. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [2] Link: https://github.com/KSPP/linux/issues/78 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: qca: Fix sparse warningsLuiz Augusto von Dentz
This fixes the following warnings: drivers/bluetooth/hci_qca.c:1014:26: warning: cast to restricted __le16 drivers/bluetooth/hci_qca.c:1028:37: warning: cast to restricted __le32 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: btusb: Add VID:PID 13d3:3529 for Realtek RTL8821CEMoises Cardona
This patch adds VID:PID 13d3:3529 to the btusb.c file. This VID:PID is found in the Realtek RTL8821CE module (M.2 module AW-CB304NF on an ASUS E210MA laptop) Output of /sys/kernel/debug/usb/devices: T: Bus=01 Lev=01 Prnt=01 Port=07 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3529 Rev= 1.10 S: Manufacturer=Realtek S: Product=Bluetooth Radio C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Moises Cardona <moisesmcardona@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: btusb: Add new PID/VID 0489:e0f2 for MT7921Mario Limonciello
This bluetooth device is found in a combo WLAN/BT card for a MediaTek 7921e. The device information: T: Bus=01 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0489 ProdID=e0f2 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us Cc: Sean Wang <sean.wang@mediatek.com> Cc: Anson Tsao <anson.tsao@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Bluetooth: Fix issue with Actions Semi ATS2851 based devicesMarcel Holtmann
Their devices claim to support the erroneous data reporting, but don't actually support the required commands. So blacklist them and add a quirk. < HCI Command: Read Default Erroneous Data Reporting (0x03|0x005a) plen 0 > HCI Event: Command Status (0x0f) plen 4 Read Default Erroneous Data Reporting (0x03|0x005a) ncmd 1 Status: Unknown HCI Command (0x01) T: Bus=02 Lev=02 Prnt=08 Port=02 Cnt=01 Dev#= 10 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=10d7 ProdID=b012 Rev=88.91 S: Manufacturer=Actions S: Product=general adapter S: SerialNumber=ACTIONS1234 C:* #Ifs= 2 Cfg#= 1 Atr=c0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=01(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/devlink/leftover.c / net/core/devlink.c: 565b4824c39f ("devlink: change port event netdev notifier from per-net to global") f05bd8ebeb69 ("devlink: move code to a dedicated directory") 687125b5799c ("devlink: split out core code") https://lore.kernel.org/all/20230208094657.379f2b1a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09tools/resolve_btfids: Pass HOSTCFLAGS as EXTRA_CFLAGS to prepare targetsJiri Olsa
Thorsten reported build issue with command line that defined extra HOSTCFLAGS that were not passed into 'prepare' targets, but were used to build resolve_btfids objects. This results in build fail when these objects are linked together: /usr/bin/ld: /build.../tools/bpf/resolve_btfids//libbpf/libbpf.a(libbpf-in.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE \ object; recompile with -fPIE Fixing this by passing HOSTCFLAGS in EXTRA_CFLAGS as part of HOST_OVERRIDES variable for prepare targets. [1] https://lore.kernel.org/bpf/f7922132-6645-6316-5675-0ece4197bfff@leemhuis.info/ Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program") Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/bpf/20230209143735.4112845-1-jolsa@kernel.org
2023-02-09net: enable usercopy for skb_small_head_cacheEric Dumazet
syzbot and other bots reported that we have to enable user copy to/from skb->head. [1] We can prevent access to skb_shared_info, which is a nice improvement over standard kmem_cache. Layout of these kmem_cache objects is: < SKB_SMALL_HEAD_HEADROOM >< struct skb_shared_info > usercopy: Kernel memory overwrite attempt detected to SLUB object 'skbuff_small_head' (offset 32, size 20)! ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:102 ! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc6-syzkaller-01425-gcb6b2e11a42d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 RIP: 0010:usercopy_abort+0xbd/0xbf mm/usercopy.c:102 Code: e8 ee ad ba f7 49 89 d9 4d 89 e8 4c 89 e1 41 56 48 89 ee 48 c7 c7 20 2b 5b 8a ff 74 24 08 41 57 48 8b 54 24 20 e8 7a 17 fe ff <0f> 0b e8 c2 ad ba f7 e8 7d fb 08 f8 48 8b 0c 24 49 89 d8 44 89 ea RSP: 0000:ffffc90000067a48 EFLAGS: 00010286 RAX: 000000000000006b RBX: ffffffff8b5b6ea0 RCX: 0000000000000000 RDX: ffff8881401c0000 RSI: ffffffff8166195c RDI: fffff5200000cf3b RBP: ffffffff8a5b2a60 R08: 000000000000006b R09: 0000000000000000 R10: 0000000080000000 R11: 0000000000000000 R12: ffffffff8bf2a925 R13: ffffffff8a5b29a0 R14: 0000000000000014 R15: ffffffff8a5b2960 FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000000c48e000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __check_heap_object+0xdd/0x110 mm/slub.c:4761 check_heap_object mm/usercopy.c:196 [inline] __check_object_size mm/usercopy.c:251 [inline] __check_object_size+0x1da/0x5a0 mm/usercopy.c:213 check_object_size include/linux/thread_info.h:199 [inline] check_copy_size include/linux/thread_info.h:235 [inline] copy_from_iter include/linux/uio.h:186 [inline] copy_from_iter_full include/linux/uio.h:194 [inline] memcpy_from_msg include/linux/skbuff.h:3977 [inline] qrtr_sendmsg+0x65f/0x970 net/qrtr/af_qrtr.c:965 sock_sendmsg_nosec net/socket.c:722 [inline] sock_sendmsg+0xde/0x190 net/socket.c:745 say_hello+0xf6/0x170 net/qrtr/ns.c:325 qrtr_ns_init+0x220/0x2b0 net/qrtr/ns.c:804 qrtr_proto_init+0x59/0x95 net/qrtr/af_qrtr.c:1296 do_one_initcall+0x141/0x790 init/main.c:1306 do_initcall_level init/main.c:1379 [inline] do_initcalls init/main.c:1395 [inline] do_basic_setup init/main.c:1414 [inline] kernel_init_freeable+0x6f9/0x782 init/main.c:1634 kernel_init+0x1e/0x1d0 init/main.c:1522 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 </TASK> Fixes: bf9f1baa279f ("net: add dedicated kmem_cache for typical/small skb->head") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Link: https://lore.kernel.org/linux-next/CA+G9fYs-i-c2KTSA7Ai4ES_ZESY1ZnM=Zuo8P1jN00oed6KHMA@mail.gmail.com Link: https://lore.kernel.org/r/20230208142508.3278406-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-09Merge tag 'net-6.2-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and ipsec subtrees. Current release - regressions: - sched: fix off by one in htb_activate_prios() - eth: mana: fix accessing freed irq affinity_hint - eth: ice: fix out-of-bounds KASAN warning in virtchnl Current release - new code bugs: - eth: mtk_eth_soc: enable special tag when any MAC uses DSA Previous releases - always broken: - core: fix sk->sk_txrehash default - neigh: make sure used and confirmed times are valid - mptcp: be careful on subflow status propagation on errors - xfrm: prevent potential spectre v1 gadget in xfrm_xlate32_attr() - phylink: move phy_device_free() to correctly release phy device - eth: mlx5: - fix crash unsetting rx-vlan-filter in switchdev mode - fix hang on firmware reset - serialize module cleanup with reload and remove" * tag 'net-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) selftests: forwarding: lib: quote the sysctl values net: mscc: ocelot: fix all IPv6 getting trapped to CPU when PTP timestamping is used rds: rds_rm_zerocopy_callback() use list_first_entry() net: txgbe: Update support email address selftests: Fix failing VXLAN VNI filtering test selftests: mptcp: stop tests earlier selftests: mptcp: allow more slack for slow test-case mptcp: be careful on subflow status propagation on errors mptcp: fix locking for in-kernel listener creation mptcp: fix locking for setsockopt corner-case mptcp: do not wait for bare sockets' timeout net: ethernet: mtk_eth_soc: fix DSA TX tag hwaccel for switch port 0 nfp: ethtool: fix the bug of setting unsupported port speed txhash: fix sk->sk_txrehash default net: ethernet: mtk_eth_soc: fix wrong parameters order in __xdp_rxq_info_reg() net: ethernet: mtk_eth_soc: enable special tag when any MAC uses DSA net: sched: sch: Fix off by one in htb_activate_prios() igc: Add ndo_tx_timeout support net: mana: Fix accessing freed irq affinity_hint hv_netvsc: Allocate memory in netvsc_dma_map() with GFP_ATOMIC ...
2023-02-09Merge tag 'for-linus-2023020901' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential infinite loop with a badly crafted HID device (Xin Zhao) - fix regression from 6.1 in USB logitech devices potentially making their mouse wheel not working (Bastien Nocera) - clean up in AMD sensors, which fixes a long time resume bug (Mario Limonciello) - few device small fixes and quirks * tag 'for-linus-2023020901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for ELAN touchscreen 29DF on HP HID: amd_sfh: if no sensors are enabled, clean up HID: logitech: Disable hi-res scrolling on USB HID: core: Fix deadloop in hid_apply_multiplier. HID: Ignore battery for Elan touchscreen on Asus TP420IA HID: elecom: add support for TrackBall 056E:011C
2023-02-09Merge tag '6.2-rc8-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifx fix from Steve French: "Small fix for use after free" * tag '6.2-rc8-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix use-after-free in rdata->read_into_pages()
2023-02-09selftests: forwarding: lib: quote the sysctl valuesHangbin Liu
When set/restore sysctl value, we should quote the value as some keys may have multi values, e.g. net.ipv4.ping_group_range Fixes: f5ae57784ba8 ("selftests: forwarding: lib: Add sysctl_set(), sysctl_restore()") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20230208032110.879205-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-09net: mscc: ocelot: fix all IPv6 getting trapped to CPU when PTP timestamping ↵Vladimir Oltean
is used While running this selftest which usually passes: ~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0 TEST: swp0: Unicast IPv4 to primary MAC address [ OK ] TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] TEST: swp0: Multicast IPv4 to joined group [ OK ] TEST: swp0: Multicast IPv4 to unknown group [ OK ] TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ] TEST: swp0: Multicast IPv6 to joined group [ OK ] TEST: swp0: Multicast IPv6 to unknown group [ OK ] TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ] if I start PTP timestamping then run it again (debug prints added by me), the unknown IPv6 MC traffic is seen by the CPU port even when it should have been dropped: ~/selftests/drivers/net/dsa# ptp4l -i swp0 -2 -P -m ptp4l[225.410]: selected /dev/ptp1 as PTP clock [ 225.445746] mscc_felix 0000:00:00.5: ocelot_l2_ptp_trap_add: port 0 adding L2 PTP trap [ 225.453815] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_add: port 0 adding IPv4 PTP event trap [ 225.462703] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_add: port 0 adding IPv4 PTP general trap [ 225.471768] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_add: port 0 adding IPv6 PTP event trap [ 225.480651] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_add: port 0 adding IPv6 PTP general trap ptp4l[225.488]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[225.488]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE ^C ~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0 TEST: swp0: Unicast IPv4 to primary MAC address [ OK ] TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] TEST: swp0: Multicast IPv4 to joined group [ OK ] TEST: swp0: Multicast IPv4 to unknown group [ OK ] TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ] TEST: swp0: Multicast IPv6 to joined group [ OK ] TEST: swp0: Multicast IPv6 to unknown group [FAIL] reception succeeded, but should have failed TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ] The PGID_MCIPV6 is configured correctly to not flood to the CPU, I checked that. Furthermore, when I disable back PTP RX timestamping (ptp4l doesn't do that when it exists), packets are RX filtered again as they should be: ~/selftests/drivers/net/dsa# hwstamp_ctl -i swp0 -r 0 [ 218.202854] mscc_felix 0000:00:00.5: ocelot_l2_ptp_trap_del: port 0 removing L2 PTP trap [ 218.212656] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_del: port 0 removing IPv4 PTP event trap [ 218.222975] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_del: port 0 removing IPv4 PTP general trap [ 218.233133] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_del: port 0 removing IPv6 PTP event trap [ 218.242251] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_del: port 0 removing IPv6 PTP general trap current settings: tx_type 1 rx_filter 12 new settings: tx_type 1 rx_filter 0 ~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0 TEST: swp0: Unicast IPv4 to primary MAC address [ OK ] TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ] TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] TEST: swp0: Multicast IPv4 to joined group [ OK ] TEST: swp0: Multicast IPv4 to unknown group [ OK ] TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ] TEST: swp0: Multicast IPv6 to joined group [ OK ] TEST: swp0: Multicast IPv6 to unknown group [ OK ] TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ] TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ] So it's clear that something in the PTP RX trapping logic went wrong. Looking a bit at the code, I can see that there are 4 typos, which populate "ipv4" VCAP IS2 key filter fields for IPv6 keys. VCAP IS2 keys of type OCELOT_VCAP_KEY_IPV4 and OCELOT_VCAP_KEY_IPV6 are handled by is2_entry_set(). OCELOT_VCAP_KEY_IPV4 looks at &filter->key.ipv4, and OCELOT_VCAP_KEY_IPV6 at &filter->key.ipv6. Simply put, when we populate the wrong key field, &filter->key.ipv6 fields "proto.mask" and "proto.value" remain all zeroes (or "don't care"). So is2_entry_set() will enter the "else" of this "if" condition: if (msk == 0xff && (val == IPPROTO_TCP || val == IPPROTO_UDP)) and proceed to ignore the "proto" field. The resulting rule will match on all IPv6 traffic, trapping it to the CPU. This is the reason why the local_termination.sh selftest sees it, because control traps are stronger than the PGID_MCIPV6 used for flooding (from the forwarding data path). But the problem is in fact much deeper. We trap all IPv6 traffic to the CPU, but if we're bridged, we set skb->offload_fwd_mark = 1, so software forwarding will not take place and IPv6 traffic will never reach its destination. The fix is simple - correct the typos. I was intentionally inaccurate in the commit message about the breakage occurring when any PTP timestamping is enabled. In fact it only happens when L4 timestamping is requested (HWTSTAMP_FILTER_PTP_V2_EVENT or HWTSTAMP_FILTER_PTP_V2_L4_EVENT). But ptp4l requests a larger RX timestamping filter than it needs for "-2": HWTSTAMP_FILTER_PTP_V2_EVENT. I wanted people skimming through git logs to not think that the bug doesn't affect them because they only use ptp4l in L2 mode. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230207183117.1745754-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-09rds: rds_rm_zerocopy_callback() use list_first_entry()Pietro Borrello
rds_rm_zerocopy_callback() uses list_entry() on the head of a list causing a type confusion. Use list_first_entry() to actually access the first element of the rs_zcookie_queue list. Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230202-rds-zerocopy-v3-1-83b0df974f9a@diag.uniroma1.it Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-08Merge tag 'ipsec-2023-02-08' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== ipsec 2023-02-08 1) Fix policy checks for nested IPsec tunnels when using xfrm interfaces. From Benedict Wong. 2) Fix netlink message expression on 32=>64-bit messages translators. From Anastasia Belova. 3) Prevent potential spectre v1 gadget in xfrm_xlate32_attr. From Eric Dumazet. 4) Always consistently use time64_t in xfrm_timer_handler. From Eric Dumazet. 5) Fix KCSAN reported bug: Multiple cpus can update use_time at the same time. From Eric Dumazet. 6) Fix SCP copy from IPv4 to IPv6 on interfamily tunnel. From Christian Hopps. * tag 'ipsec-2023-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: fix bug with DSCP copy to v6 from v4 tunnel xfrm: annotate data-race around use_time xfrm: consistently use time64_t in xfrm_timer_handler() xfrm/compat: prevent potential spectre v1 gadget in xfrm_xlate32_attr() xfrm: compat: change expression for switch in xfrm_xlate64 Fix XFRM-I support for nested ESP tunnels ==================== Link: https://lore.kernel.org/r/20230208114322.266510-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08Merge tag 'linux-can-next-for-6.3-20230208' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== can-next 2023-02-08 The 1st patch is by Oliver Hartkopp and cleans up the CAN_RAW's raw_setsockopt() for CAN_RAW_FD_FRAMES. The 2nd patch is by me and fixes the compilation if CONFIG_CAN_CALC_BITTIMING is disabled. (Problem introduced in last pull request to next-next.) * tag 'linux-can-next-for-6.3-20230208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: bittiming: can_calc_bittiming(): add missing parameter to no-op function can: raw: use temp variable instead of rolling back config ==================== Link: https://lore.kernel.org/r/20230208210014.3169347-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08Merge tag 'mlx5-next-netdev-deadlock' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next-netdev-deadlock This series from Jiri solves a deadlock when removing a network namespace with mlx5 devlink instance being in it. The deadlock is between: 1) mlx5_ib->unregister_netdevice_notifier() AND 2) mlx5_core->devlink_reload->cleanup_net() To slove this introduced mlx5 netdev added/removed events to track uplink netdev to be used for register_netdevice_notifier_dev_net() purposes. * tag 'mlx5-next-netdev-deadlock' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: RDMA/mlx5: Track netdev to avoid deadlock during netdev notifier unregister net/mlx5e: Propagate an internal event in case uplink netdev changes net/mlx5e: Fix trap event handling net/mlx5: Introduce CQE error syndrome ==================== Link: https://lore.kernel.org/r/20230208005626.72930-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08net: libwx: Remove unneeded semicolonYang Li
./drivers/net/ethernet/wangxun/libwx/wx_lib.c:683:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3976 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230208004959.47553-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08net: libwx: clean up one inconsistent indentingYang Li
drivers/net/ethernet/wangxun/libwx/wx_lib.c:1835 wx_setup_all_rx_resources() warn: inconsistent indenting Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3981 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230208013227.111605-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08net: txgbe: Update support email addressJiawen Wu
Update new email address for Wangxun 10Gb NIC support team. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20230208023035.3371250-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08RDMA/mlx5: Track netdev to avoid deadlock during netdev notifier unregisterJiri Pirko
When removing a network namespace with mlx5 devlink instance being in it, following callchain is performed: cleanup_net (takes down_read(&pernet_ops_rwsem) devlink_pernet_pre_exit() devlink_reload() mlx5_devlink_reload_down() mlx5_unload_one_devl_locked() mlx5_detach_device() del_adev() mlx5r_remove() __mlx5_ib_remove() mlx5_ib_roce_cleanup() mlx5_remove_netdev_notifier() unregister_netdevice_notifier (takes down_write(&pernet_ops_rwsem) This deadlocks. Resolve this by converting to register_netdevice_notifier_dev_net() which does not take pernet_ops_rwsem and moves the notifier block around according to netdev it takes as arg. Use previously introduced netdev added/removed events to track uplink netdev to be used for register_netdevice_notifier_dev_net() purposes. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-08net/mlx5e: Propagate an internal event in case uplink netdev changesJiri Pirko
Whenever uplink netdev is set/cleared, propagate newly introduced event to inform notifier blocks netdev was added/removed. Move the set() helper to core.c from header, introduce clear() and netdev_added_event_replay() helpers. The last one is going to be called from rdma driver, so export it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-08net/mlx5e: Fix trap event handlingJiri Pirko
Current code does not return correct return value from event handler. Fix it by returning NOTIFY_* and propagate err over newly introduce ctx structure. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-08Merge tag 'mlx5-fixes-2023-02-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-02-07 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Serialize module cleanup with reload and remove net/mlx5: fw_tracer, Zero consumer index when reloading the tracer net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers net/mlx5: Expose SF firmware pages counter net/mlx5: Store page counters in a single array net/mlx5e: IPoIB, Show unknown speed instead of error net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode net/mlx5: Bridge, fix ageing of peer FDB entries net/mlx5: DR, Fix potential race in dr_rule_create_rule_nic net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change ==================== Link: https://lore.kernel.org/r/20230208030302.95378-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08Merge tag 'mlx5-updates-2023-02-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-02-07 1) Minor and trivial code Cleanups 2) Minor fixes for net-next 3) From Shay: dynamic FW trace strings update. * tag 'mlx5-updates-2023-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: fw_tracer, Add support for unrecognized string net/mlx5: fw_tracer, Add support for strings DB update event net/mlx5: fw_tracer, allow 0 size string DBs net/mlx5: fw_tracer: Fix debug print net/mlx5: fs, Remove redundant assignment of size net/mlx5: fs_core, Remove redundant variable err net/mlx5: Fix memory leak in error flow of port set buffer net/mlx5e: Remove incorrect debugfs_create_dir NULL check in TLS net/mlx5e: Remove incorrect debugfs_create_dir NULL check in hairpin net/mlx5: fs, Remove redundant vport_number assignment net/mlx5e: Remove redundant code for handling vlan actions net/mlx5e: Don't listen to remove flows event net/mlx5: fw reset: Skip device ID check if PCI link up failed net/mlx5: Remove redundant health work lock mlx5: reduce stack usage in mlx5_setup_tc ==================== Link: https://lore.kernel.org/r/20230208003712.68386-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08selftests: Fix failing VXLAN VNI filtering testIdo Schimmel
iproute2 does not recognize the "group6" and "remote6" keywords. Fix by using "group" and "remote" instead. Before: # ./test_vxlan_vnifiltering.sh [...] Tests passed: 25 Tests failed: 2 After: # ./test_vxlan_vnifiltering.sh [...] Tests passed: 27 Tests failed: 0 Fixes: 3edf5f66c12a ("selftests: add new tests for vxlan vnifiltering") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20230207141819.256689-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-08samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sampleRong Tao
Commit fe3300897cbf("samples: bpf: fix syscall_tp due to unused syscall") added openat() syscall tracepoints. This patch adds support for openat2() as well. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/tencent_9381CB1A158ED7ADD12C4406034E21A3AC07@qq.com
2023-02-08libbpf: Add sample_period to creation optionsJon Doron
Add option to set when the perf buffer should wake up, by default the perf buffer becomes signaled for every event that is being pushed to it. In case of a high throughput of events it will be more efficient to wake up only once you have X events ready to be read. So your application can wakeup once and drain the entire perf buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230207081916.3398417-1-arilou@gmail.com
2023-02-08can: bittiming: can_calc_bittiming(): add missing parameter to no-op functionMarc Kleine-Budde
In commit 286c0e09e8e0 ("can: bittiming: can_changelink() pass extack down callstack") a new parameter was added to can_calc_bittiming(), however the static inline no-op (which is used if CONFIG_CAN_CALC_BITTIMING is disabled) wasn't converted. Add the new parameter to the static inline no-op of can_calc_bittiming(). Fixes: 286c0e09e8e0 ("can: bittiming: can_changelink() pass extack down callstack") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/20230207201734.2905618-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-02-08can: raw: use temp variable instead of rolling back configOliver Hartkopp
Introduce a temporary variable to check for an invalid configuration attempt from user space. Before this patch the value was copied to the real config variable and rolled back in the case of an error. Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230203090807.97100-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-02-08sfc: move xdp_features configuration in efx_pci_probe_post_io()Lorenzo Bianconi
Move xdp_features configuration from efx_pci_probe() to efx_pci_probe_post_io() since it is where all the other basic netdev features are initialised. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/9bd31c9a29bcf406ab90a249a28fc328e5578fd1.1675875404.git.lorenzo@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-08bpf, docs: Add note about type conventionDave Thaler
Add explanation about use of "u64", "u32", etc. as the type convention used in BPF documentation. Signed-off-by: Dave Thaler <dthaler@microsoft.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230127014706.1005-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-08bpf/docs: Update design QA to be consistent with kfunc lifecycle docsToke Høiland-Jørgensen
Cong pointed out that there are some inconsistencies between the BPF design QA and the lifecycle expectations documentation we added for kfuncs. Let's update the QA file to be consistent with the kfunc docs, and add references where it makes sense. Also document that modules may export kfuncs now. v3: - Grammar nit + ack from David v2: - Fix repeated word (s/defined defined/defined/) Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230208164143.286392-1-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-08Merge branch 'taprio-auto-qmaxsdu-new-tx'David S. Miller
Vladimir Oltean says: ==================== taprio automatic queueMaxSDU and new TXQ selection procedure This patch set addresses 2 design limitations in the taprio software scheduler: 1. Software scheduling fundamentally prioritizes traffic incorrectly, in a way which was inspired from Intel igb/igc drivers and does not follow the inputs user space gives (traffic classes and TC to TXQ mapping). Patch 05/15 handles this, 01/15 - 04/15 are preparations for this work. 2. Software scheduling assumes that the gate for a traffic class closes as soon as the next interval begins. But this isn't true. If consecutive schedule entries have that traffic class gate open, there is no "gate close" event and taprio should keep dequeuing from that TC without interruptions. Patches 06/15 - 15/15 handle this. Patch 10/15 is a generic Qdisc change required for this to work. Future development directions which depend on this patch set are: - Propagating the automatic queueMaxSDU calculation down to offloading device drivers, instead of letting them calculate this, as vsc9959_tas_guard_bands_update() does today. - A software data path for tc-taprio with preemptible traffic and Hold/Release events. v1 at: https://patchwork.kernel.org/project/netdevbpf/cover/20230128010719.2182346-1-vladimir.oltean@nxp.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: don't segment unnecessarilyVladimir Oltean
Improve commit 497cc00224cf ("taprio: Handle short intervals and large packets") to only perform segmentation when skb->len exceeds what taprio_dequeue() expects. In practice, this will make the biggest difference when a traffic class gate is always open in the schedule. This is because the max_frm_len will be U32_MAX, and such large skb->len values as Kurt reported will be sent just fine unsegmented. What I don't seem to know how to handle is how to make sure that the segmented skbs themselves are smaller than the maximum frame size given by the current queueMaxSDU[tc]. Nonetheless, we still need to drop those, otherwise the Qdisc will hang. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: split segmentation logic from qdisc_enqueue()Vladimir Oltean
The majority of the taprio_enqueue()'s function is spent doing TCP segmentation, which doesn't look right to me. Compilers shouldn't have a problem in inlining code no matter how we write it, so move the segmentation logic to a separate function. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: automatically calculate queueMaxSDU based on TC gate ↵Vladimir Oltean
durations taprio today has a huge problem with small TC gate durations, because it might accept packets in taprio_enqueue() which will never be sent by taprio_dequeue(). Since not much infrastructure was available, a kludge was added in commit 497cc00224cf ("taprio: Handle short intervals and large packets"), which segmented large TCP segments, but the fact of the matter is that the issue isn't specific to large TCP segments (and even worse, the performance penalty in segmenting those is absolutely huge). In commit a54fc09e4cba ("net/sched: taprio: allow user input of per-tc max SDU"), taprio gained support for queueMaxSDU, which is precisely the mechanism through which packets should be dropped at qdisc_enqueue() if they cannot be sent. After that patch, it was necessary for the user to manually limit the maximum MTU per TC. This change adds the necessary logic for taprio to further limit the values specified (or not specified) by the user to some minimum values which never allow oversized packets to be sent. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>