summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-12-15net: dsa: mv88e6xxx: Add "rmon" counter group supportTobias Waldekranz
Report the applicable subset of an mv88e6xxx port's counters using ethtool's standardized "rmon" counter group. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Limit histogram counters to ingress trafficTobias Waldekranz
Chips in this family only have one set of histogram counters, which can be used to count ingressing and/or egressing traffic. mv88e6xxx has, up until this point, kept the hardware default of counting both directions. In the mean time, standard counter group support has been added to ethtool. Via that interface, drivers may report ingress-only and egress-only histograms separately - but not combined. In order for mv88e6xxx to maximize amount of diagnostic information that can be exported via standard interfaces, we opt to limit the histogram counters to ingress traffic only. Which will allow us to export them via the standard "rmon" group in an upcoming commit. The reason for choosing ingress-only over egress-only, is to be compatible with RFC2819 (RMON MIB). Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Add "eth-mac" counter group supportTobias Waldekranz
Report the applicable subset of an mv88e6xxx port's counters using ethtool's standardized "eth-mac" counter group. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Give each hw stat an IDTobias Waldekranz
With the upcoming standard counter group support, we are no longer reading out the whole set of counters, but rather mapping a subset to the requested group. Therefore, create an enum with an ID for each stat, such that mv88e6xxx_hw_stats[] can be subscripted with a human-readable ID corresponding to the counter's name. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Fix mv88e6352_serdes_get_stats error pathTobias Waldekranz
mv88e6xxx_get_stats, which collects stats from various sources, expects all callees to return the number of stats read. If an error occurs, 0 should be returned. Prevent future mishaps of this kind by updating the return type to reflect this contract. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Create API to read a single stat counterTobias Waldekranz
This change contains no functional change. We simply push the hardware specific stats logic to a function reading a single counter, rather than the whole set. This is a preparatory change for the upcoming standard ethtool statistics support (i.e. "eth-mac", "eth-ctrl" etc.). Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: dsa: mv88e6xxx: Push locking into stats snapshottingTobias Waldekranz
This is more consistent with the driver's general structure. Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge branch 'net-optmem_max-changes'David S. Miller
Eric Dumazet says: ==================== net: optmem_max changes optmem_max default value is too small for tx zerocopy workloads. First patch increases default from 20KB to 128 KB, which is the value we have used for seven years. Second patch makes optmem_max sysctl per netns. Last patch tweaks two tests accordingly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15selftests/net: optmem_max became per netnsEric Dumazet
/proc/sys/net/core/optmem_max is now per netns, change two tests that were saving/changing/restoring its value on the parent netns. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: Namespace-ify sysctl_optmem_maxEric Dumazet
optmem_max being used in tx zerocopy, we want to be able to control it on a netns basis. Following patch changes two tests. Tested: oqq130:~# cat /proc/sys/net/core/optmem_max 131072 oqq130:~# echo 1000000 >/proc/sys/net/core/optmem_max oqq130:~# cat /proc/sys/net/core/optmem_max 1000000 oqq130:~# unshare -n oqq130:~# cat /proc/sys/net/core/optmem_max 131072 oqq130:~# exit logout oqq130:~# cat /proc/sys/net/core/optmem_max 1000000 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: increase optmem_max default valueEric Dumazet
For many years, /proc/sys/net/core/optmem_max default value on a 64bit kernel has been 20 KB. Regular usage of TCP tx zerocopy needs a bit more. Google has used 128KB as the default value for 7 years without any problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge branch 'mlxsw-CFF-flood-mode'David S. Miller
Petr Machata says: ==================== mlxsw: CFF flood mode: NVE underlay configuration Recently, support for CFF flood mode (for Compressed FID Flooding) was added to the mlxsw driver. The most recent patchset has a detailed coverage of what CFF is and what has changed and how: https://lore.kernel.org/netdev/cover.1701183891.git.petrm@nvidia.com/ In CFF flood mode, each FID allocates a handful (in our implementation two or three) consecutive PGT entries. One entry holds the flood vector for unknown-UC traffic, one for MC, one for BC. To determine how to look up flood vectors, the CFF flood mode uses a concept of flood profiles, which are IDs that reference mappings from traffic types to offsets. In the case of CFF flood mode, the offset in question is applied to the PGT address configured at a FID. The same mechanism is used by NVE underlay for flooding. Again the profile ID and the traffic type determine the offset to apply, this time to KVD address used to look up flooding entries. Since mlxsw configures NVE underlay flood the same regardless of traffic type, only one offset was ever needed: the zero, which is the default, and thus no explicit configuration was needed. Now that CFF uses profiles as well, it would be better to configure the profile used by NVE explicitly, to make the configuration visible in the source code. In this patchset, add the register support (in patch #1), add a new traffic type to refer to "any traffic at all" (in patch #2) and finally configure the NVE profile explicitly for FIDs (in patch #3). So far, the implicitly configured flood profile was the ID 0. With this patchset, it changes to 3, leaving the 0 free to allow us to spot missed configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15mlxsw: spectrum_fid: Set NVE flood profile as part of FID configurationPetr Machata
The NVE flood profile is used for determining of offset applied to KVD address for NVE flood. We currently do not set it, leaving it at the default value of 0. That is not an issue: all the traffic-type-to-offset mappings (as configured by SFFP) default to offset of 0. This is what we need anyway, as mlxsw only allocates a single KVD entry for NVE underlay. The field is only relevant on Spectrum-2 and above. So to be fully consistent, we should split the existing controlled ops to Spectrum-1 and Spectrum>1 variants, with only the latter setting the field. But that seems like a lot of overhead for a single field whose meaning is "everything is the default". So instead pretend that the NVE flood profile does not exist in the controlled flood mode, like we have so far, and only set it when flood mode is CFF. Setting this at all serves dual purpose. First, it is now clear which profile belongs to NVE, because in the CFF mode, we have multiple users. This should prevent bugs in flood profile management. Second, using specifically non-zero value means there will be no valid uses of the profile 0, which we can therefore use as a sentinel. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15mlxsw: spectrum_fid: Add an "any" packet typePetr Machata
Flood profiles have been used prior to CFF support for NVE underlay. Like is the case with FID flooding, an NVE profile describes at which offset a datum is located given traffic type. mlxsw currently only ever uses one KVD entry for NVE lookup, i.e. regardless of traffic type, the offset is always zero. To be able to describe this, add a traffic type enumerator describing "any traffic type". Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15mlxsw: reg: Add nve_flood_prf_id field to SFMRPetr Machata
The field is used for setting a flood profile for lookup of KVD entry for NVE underlay. As the other uses of flood profile, this references a traffic type-to-offset mapping, except here it is not applied to PGT offsets, but KVD offsets. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15ethernet: atheros: fix a memleak in atl1e_setup_ring_resourcesZhipeng Lu
In the error handling of 'offset > adapter->ring_size', the tx_ring->tx_buffer allocated by kzalloc should be freed, instead of 'goto failed' instantly. Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Reviewed-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: sched: ife: fix potential use-after-freeEric Dumazet
ife_decode() calls pskb_may_pull() two times, we need to reload ifehdr after the second one, or risk use-after-free as reported by syzbot: BUG: KASAN: slab-use-after-free in __ife_tlv_meta_valid net/ife/ife.c:108 [inline] BUG: KASAN: slab-use-after-free in ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131 Read of size 2 at addr ffff88802d7300a4 by task syz-executor.5/22323 CPU: 0 PID: 22323 Comm: syz-executor.5 Not tainted 6.7.0-rc3-syzkaller-00804-g074ac38d5b95 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 __ife_tlv_meta_valid net/ife/ife.c:108 [inline] ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131 tcf_ife_decode net/sched/act_ife.c:739 [inline] tcf_ife_act+0x4e3/0x1cd0 net/sched/act_ife.c:879 tc_act include/net/tc_wrapper.h:221 [inline] tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079 tcf_exts_exec include/net/pkt_cls.h:344 [inline] mall_classify+0x201/0x310 net/sched/cls_matchall.c:42 tc_classify include/net/tc_wrapper.h:227 [inline] __tcf_classify net/sched/cls_api.c:1703 [inline] tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800 hfsc_classify net/sched/sch_hfsc.c:1147 [inline] hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546 dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739 __dev_xmit_skb net/core/dev.c:3828 [inline] __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311 dev_queue_xmit include/linux/netdevice.h:3165 [inline] packet_xmit+0x237/0x350 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fe9acc7cae9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fe9ada450c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fe9acd9bf80 RCX: 00007fe9acc7cae9 RDX: 000000000000fce0 RSI: 00000000200002c0 RDI: 0000000000000003 RBP: 00007fe9accc847a R08: 0000000020000140 R09: 0000000000000014 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007fe9acd9bf80 R15: 00007ffd5427ae78 </TASK> Allocated by task 22323: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:198 [inline] __do_kmalloc_node mm/slab_common.c:1007 [inline] __kmalloc_node_track_caller+0x5a/0x90 mm/slab_common.c:1027 kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582 __alloc_skb+0x12b/0x330 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1298 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Freed by task 22323: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1800 [inline] slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826 slab_free mm/slub.c:3809 [inline] __kmem_cache_free+0xc0/0x180 mm/slub.c:3822 skb_kfree_head net/core/skbuff.c:950 [inline] skb_free_head+0x110/0x1b0 net/core/skbuff.c:962 pskb_expand_head+0x3c5/0x1170 net/core/skbuff.c:2130 __pskb_pull_tail+0xe1/0x1830 net/core/skbuff.c:2655 pskb_may_pull_reason include/linux/skbuff.h:2685 [inline] pskb_may_pull include/linux/skbuff.h:2693 [inline] ife_decode+0x394/0x4f0 net/ife/ife.c:82 tcf_ife_decode net/sched/act_ife.c:727 [inline] tcf_ife_act+0x43b/0x1cd0 net/sched/act_ife.c:879 tc_act include/net/tc_wrapper.h:221 [inline] tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079 tcf_exts_exec include/net/pkt_cls.h:344 [inline] mall_classify+0x201/0x310 net/sched/cls_matchall.c:42 tc_classify include/net/tc_wrapper.h:227 [inline] __tcf_classify net/sched/cls_api.c:1703 [inline] tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800 hfsc_classify net/sched/sch_hfsc.c:1147 [inline] hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546 dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739 __dev_xmit_skb net/core/dev.c:3828 [inline] __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311 dev_queue_xmit include/linux/netdevice.h:3165 [inline] packet_xmit+0x237/0x350 net/packet/af_packet.c:276 packet_snd net/packet/af_packet.c:3081 [inline] packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 __do_sys_sendto net/socket.c:2202 [inline] __se_sys_sendto net/socket.c:2198 [inline] __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b The buggy address belongs to the object at ffff88802d730000 which belongs to the cache kmalloc-8k of size 8192 The buggy address is located 164 bytes inside of freed 8192-byte region [ffff88802d730000, ffff88802d732000) The buggy address belongs to the physical page: page:ffffea0000b5cc00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d730 head:ffffea0000b5cc00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 00fff00000000840 ffff888013042280 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 22323, tgid 22320 (syz-executor.5), ts 950317230369, free_ts 950233467461 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1544 prep_new_page mm/page_alloc.c:1551 [inline] get_page_from_freelist+0xa28/0x3730 mm/page_alloc.c:3319 __alloc_pages+0x22e/0x2420 mm/page_alloc.c:4575 alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133 alloc_slab_page mm/slub.c:1870 [inline] allocate_slab mm/slub.c:2017 [inline] new_slab+0x283/0x3c0 mm/slub.c:2070 ___slab_alloc+0x979/0x1500 mm/slub.c:3223 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322 __slab_alloc_node mm/slub.c:3375 [inline] slab_alloc_node mm/slub.c:3468 [inline] __kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517 __do_kmalloc_node mm/slab_common.c:1006 [inline] __kmalloc_node_track_caller+0x4a/0x90 mm/slab_common.c:1027 kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582 __alloc_skb+0x12b/0x330 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1298 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 packet_alloc_skb net/packet/af_packet.c:2930 [inline] packet_snd net/packet/af_packet.c:3024 [inline] packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 __sys_sendto+0x255/0x340 net/socket.c:2190 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1144 [inline] free_unref_page_prepare+0x53c/0xb80 mm/page_alloc.c:2354 free_unref_page+0x33/0x3b0 mm/page_alloc.c:2494 __unfreeze_partials+0x226/0x240 mm/slub.c:2655 qlink_free mm/kasan/quarantine.c:168 [inline] qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187 kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294 __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] slab_alloc mm/slub.c:3486 [inline] __kmem_cache_alloc_lru mm/slub.c:3493 [inline] kmem_cache_alloc_lru+0x219/0x6f0 mm/slub.c:3509 alloc_inode_sb include/linux/fs.h:2937 [inline] ext4_alloc_inode+0x28/0x650 fs/ext4/super.c:1408 alloc_inode+0x5d/0x220 fs/inode.c:261 new_inode_pseudo fs/inode.c:1006 [inline] new_inode+0x22/0x260 fs/inode.c:1032 __ext4_new_inode+0x333/0x5200 fs/ext4/ialloc.c:958 ext4_symlink+0x5d7/0xa20 fs/ext4/namei.c:3398 vfs_symlink fs/namei.c:4464 [inline] vfs_symlink+0x3e5/0x620 fs/namei.c:4448 do_symlinkat+0x25f/0x310 fs/namei.c:4490 __do_sys_symlinkat fs/namei.c:4506 [inline] __se_sys_symlinkat fs/namei.c:4503 [inline] __x64_sys_symlinkat+0x97/0xc0 fs/namei.c:4503 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 Fixes: d57493d6d1be ("net: sched: ife: check on metadata length") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Alexander Aring <aahringo@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: Return error from sk_stream_wait_connect() if sk_wait_event() failsShigeru Yoshida
The following NULL pointer dereference issue occurred: BUG: kernel NULL pointer dereference, address: 0000000000000000 <...> RIP: 0010:ccid_hc_tx_send_packet net/dccp/ccid.h:166 [inline] RIP: 0010:dccp_write_xmit+0x49/0x140 net/dccp/output.c:356 <...> Call Trace: <TASK> dccp_sendmsg+0x642/0x7e0 net/dccp/proto.c:801 inet_sendmsg+0x63/0x90 net/ipv4/af_inet.c:846 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x83/0xe0 net/socket.c:745 ____sys_sendmsg+0x443/0x510 net/socket.c:2558 ___sys_sendmsg+0xe5/0x150 net/socket.c:2612 __sys_sendmsg+0xa6/0x120 net/socket.c:2641 __do_sys_sendmsg net/socket.c:2650 [inline] __se_sys_sendmsg net/socket.c:2648 [inline] __x64_sys_sendmsg+0x45/0x50 net/socket.c:2648 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x43/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b sk_wait_event() returns an error (-EPIPE) if disconnect() is called on the socket waiting for the event. However, sk_stream_wait_connect() returns success, i.e. zero, even if sk_wait_event() returns -EPIPE, so a function that waits for a connection with sk_stream_wait_connect() may misbehave. In the case of the above DCCP issue, dccp_sendmsg() is waiting for the connection. If disconnect() is called in concurrently, the above issue occurs. This patch fixes the issue by returning error from sk_stream_wait_connect() if sk_wait_event() fails. Fixes: 419ce133ab92 ("tcp: allow again tcp_disconnect() when threads are waiting") Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reported-by: syzbot+c71bc336c5061153b502@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge branch 'net-at803x-cleanups'David S. Miller
Christian Marangi says: ==================== net: phy: at803x: additional cleanup for qca808x This small series is a preparation for the big code split. While the qca808x code is waiting to be reviwed and merged, we can further cleanup and generalize shared functions between at803x and qca808x. With these last 2 patch everything is ready to move the driver to a dedicated directory and split the code by creating a library module for the few shared functions between the 2 driver. Eventually at803x can be further cleaned and generalized but everything will be already self contained and related only to at803x family of PHYs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: phy: at803x: make read specific status function more genericChristian Marangi
Rework read specific status function to be more generic. The function apply different speed mask based on the PHY ID. Make it more generic by adding an additional arg to pass the specific speed (ss) mask and use the provided mask to parse the speed value. This is needed to permit an easier deatch of qca808x code from the at803x driver. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: phy: at803x: move specific qca808x config_aneg to dedicated functionChristian Marangi
Move specific qca808x config_aneg to dedicated function to permit easier split of qca808x portion from at803x driver. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge branch 'vsock-credit-update'David S. Miller
Arseniy Krasnov says: ==================== send credit update during setting SO_RCVLOWAT DESCRIPTION This patchset fixes old problem with hungup of both rx/tx sides and adds test for it. This happens due to non-default SO_RCVLOWAT value and deferred credit update in virtio/vsock. Link to previous old patchset: https://lore.kernel.org/netdev/39b2e9fd-601b-189d-39a9-914e5574524c@sberdevices.ru/ Here is what happens step by step: TEST INITIAL CONDITIONS 1) Vsock buffer size is 128KB. 2) Maximum packet size is also 64KB as defined in header (yes it is hardcoded, just to remind about that value). 3) SO_RCVLOWAT is default, e.g. 1 byte. STEPS SENDER RECEIVER 1) sends 128KB + 1 byte in a single buffer. 128KB will be sent, but for 1 byte sender will wait for free space at peer. Sender goes to sleep. 2) reads 64KB, credit update not sent 3) sets SO_RCVLOWAT to 64KB + 1 4) poll() -> wait forever, there is only 64KB available to read. So in step 4) receiver also goes to sleep, waiting for enough data or connection shutdown message from the sender. Idea to fix it is that rx kicks tx side to continue transmission (and may be close connection) when rx changes number of bytes to be woken up (e.g. SO_RCVLOWAT) and this value is bigger than number of available bytes to read. I've added small test for this, but not sure as it uses hardcoded value for maximum packet length, this value is defined in kernel header and used to control deferred credit update. And as this is not available to userspace, I can't control test parameters correctly (if one day this define will be changed - test may become useless). Head for this patchset is: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9bab51bd662be4c3ebb18a28879981d69f3ef15a Link to v1: https://lore.kernel.org/netdev/20231108072004.1045669-1-avkrasnov@salutedevices.com/ Link to v2: https://lore.kernel.org/netdev/20231119204922.2251912-1-avkrasnov@salutedevices.com/ Link to v3: https://lore.kernel.org/netdev/20231122180510.2297075-1-avkrasnov@salutedevices.com/ Link to v4: https://lore.kernel.org/netdev/20231129212519.2938875-1-avkrasnov@salutedevices.com/ Link to v5: https://lore.kernel.org/netdev/20231130130840.253733-1-avkrasnov@salutedevices.com/ Link to v6: https://lore.kernel.org/netdev/20231205064806.2851305-1-avkrasnov@salutedevices.com/ Link to v7: https://lore.kernel.org/netdev/20231206211849.2707151-1-avkrasnov@salutedevices.com/ Link to v8: https://lore.kernel.org/netdev/20231211211658.2904268-1-avkrasnov@salutedevices.com/ Link to v9: https://lore.kernel.org/netdev/20231214091947.395892-1-avkrasnov@salutedevices.com/ Changelog: v1 -> v2: * Patchset rebased and tested on new HEAD of net-next (see hash above). * New patch is added as 0001 - it removes return from SO_RCVLOWAT set callback in 'af_vsock.c' when transport callback is set - with that we can set 'sk_rcvlowat' only once in 'af_vsock.c' and in future do not copy-paste it to every transport. It was discussed in v1. * See per-patch changelog after ---. v2 -> v3: * See changelog after --- in 0003 only (0001 and 0002 still same). v3 -> v4: * Patchset rebased and tested on new HEAD of net-next (see hash above). * See per-patch changelog after ---. v4 -> v5: * Change patchset tag 'RFC' -> 'net-next'. * See per-patch changelog after ---. v5 -> v6: * New patch 0003 which sends credit update during reading bytes from socket. * See per-patch changelog after ---. v6 -> v7: * Patchset rebased and tested on new HEAD of net-next (see hash above). * See per-patch changelog after ---. v7 -> v8: * See per-patch changelog after ---. v8 -> v9: * Patchset rebased and tested on new HEAD of net-next (see hash above). * Add 'Fixes' tag for the current 0002. * Reorder patches by moving two fixes first. v9 -> v10: * Squash 0002 and 0003 and update commit message in result. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15vsock/test: two tests to check credit update logicArseniy Krasnov
Both tests are almost same, only differs in two 'if' conditions, so implemented in a single function. Tests check, that credit update message is sent: 1) During setting SO_RCVLOWAT value of the socket. 2) When number of 'rx_bytes' become smaller than SO_RCVLOWAT value. Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15virtio/vsock: send credit update during setting SO_RCVLOWATArseniy Krasnov
Send credit update message when SO_RCVLOWAT is updated and it is bigger than number of bytes in rx queue. It is needed, because 'poll()' will wait until number of bytes in rx queue will be not smaller than O_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup for tx/rx is possible: sender waits for free space and receiver is waiting data in 'poll()'. Rename 'set_rcvlowat' callback to 'notify_set_rcvlowat' and set 'sk->sk_rcvlowat' only in one place (i.e. 'vsock_set_rcvlowat'), so the transport doesn't need to do it. Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages") Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15virtio/vsock: fix logic which reduces credit update messagesArseniy Krasnov
Add one more condition for sending credit update during dequeue from stream socket: when number of bytes in the rx queue is smaller than SO_RCVLOWAT value of the socket. This is actual for non-default value of SO_RCVLOWAT (e.g. not 1) - idea is to "kick" peer to continue data transmission, because we need at least SO_RCVLOWAT bytes in our rx queue to wake up user for reading data (in corner case it is also possible to stuck both tx and rx sides, this is why 'Fixes' is used). Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages") Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15octeontx2-pf: Fix graceful exit during PFC configuration failureSuman Ghosh
During PFC configuration failure the code was not handling a graceful exit. This patch fixes the same and add proper code for a graceful exit. Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15ipmr: support IP_PKTINFO on cache report IGMP msgLeone Fernando
In order to support IP_PKTINFO on those packets, we need to call ipv4_pktinfo_prepare. When sending mrouted/pimd daemons a cache report IGMP msg, it is unnecessary to set dst on the newly created skb. It used to be necessary on older versions until commit d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") which changed the way IP_PKTINFO struct is been retrieved. Changes from v1: 1. Undo changes in ipv4_pktinfo_prepare function. use it directly and copy the control block. Fixes: d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") Signed-off-by: Leone Fernando <leone4fernando@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: mana: add msix index sharing between EQsKonstantin Taranov
This patch allows to assign and poll more than one EQ on the same msix index. It is achieved by introducing a list of attached EQs in each IRQ context. It also removes the existing msix_index map that tried to ensure that there is only one EQ at each msix_index. This patch exports symbols for creating EQs from other MANA kernel modules. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15octeontx2-af: Fix multicast/mirror group lock/unlock issueSuman Ghosh
As per the existing implementation, there exists a race between finding a multicast/mirror group entry and deleting that entry. The group lock was taken and released independently by rvu_nix_mcast_find_grp_elem() function. Which is incorrect and group lock should be taken during the entire operation of group updation/deletion. This patch fixes the same. Fixes: 51b2804c19cd ("octeontx2-af: Add new mbox to support multicast/mirror offload") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: libwx: fix memory leak on free pageduanqiangwen
ifconfig ethx up, will set page->refcount larger than 1, and then ifconfig ethx down, calling __page_frag_cache_drain() to free pages, it is not compatible with page pool. So deleting codes which changing page->refcount. Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI") Signed-off-by: duanqiangwen <duanqiangwen@net-swift.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge tag 'mlx5-updates-2023-12-13' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-12-13 Preparation for mlx5e socket direct feature. Socket direct will allow multiple PF devices attached to different NUMA nodes but sharing the same physical port. The following series is a small refactoring series in preparation to support socket direct in the following submission. Highlights: - Define required device registers and bits related to socket direct - Flow steering re-arrangements - Generalize TX objects (TISs) and store them in a common object, will be useful in the next series for per function object management. - Decouple raw CQ objects from their parent netdev priv - Prepare devcom for Socket Direct device group discovery. Please see the individual patches for more information. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15Merge branch 'net-phy-rust'David S. Miller
FUJITA Tomonori says: ==================== Rust abstractions for network PHY drivers No functional change since v10; only comment and commit log updates. This patchset adds Rust abstractions for phylib. It doesn't fully cover the C APIs yet but I think that it's already useful. I implement two PHY drivers (Asix AX88772A PHYs and Realtek Generic FE-GE). Seems they work well with real hardware. The first patch introduces Rust bindings for phylib. The second patch adds a macro to declare a kernel module for PHYs drivers. The third adds the Rust ETHERNET PHY LIBRARY entry to MAINTAINERS file; adds the binding file and me as a maintainer (as Andrew Lunn suggested) with Trevor Gross as a reviewer. The last patch introduces the Rust version of Asix PHY driver, drivers/net/phy/ax88796b.c. The features are equivalent to the C version. You can choose C (by default) or Rust version on kernel configuration. v11: - adds Andrew, Alice, and Trevor's Reviewed-by - comment update v10: https://lore.kernel.org/netdev/20231210234924.1453917-1-fujita.tomonori@gmail.com/T/ - adds Trevor's SoB to the third patch - adds Benno's Reviewed-by to the second patch v9: https://lore.kernel.org/netdev/20231205.124531.842372711631366729.fujita.tomonori@gmail.com/T/ - adds a workaround to access to a bit field in phy_device - fixes a comment typo v8: https://lore.kernel.org/netdev/20231123050412.1012252-1-fujita.tomonori@gmail.com/ - updates the safety comments on Device and its related code - uses _phy_start_aneg instead of phy_start_aneg - drops the patch for enum synchronization - moves Sync From Registration to DriverVTable - fixes doctest errors - minor cleanups v7: https://lore.kernel.org/netdev/20231026001050.1720612-1-fujita.tomonori@gmail.com/T/ - renames get_link() to is_link_up() - improves the macro format - improves the commit log in the third patch - improves comments v6: https://lore.kernel.org/netdev/20231025.090243.1437967503809186729.fujita.tomonori@gmail.com/T/ - improves comments - makes the requirement of phy_drivers_register clear - fixes Makefile of the third patch v5: https://lore.kernel.org/all/20231019.094147.1808345526469629486.fujita.tomonori@gmail.com/T/ - drops the rustified-enum option, writes match by hand; no *risk* of UB - adds Miguel's patch for enum checking - moves CONFIG_RUST_PHYLIB_ABSTRACTIONS to drivers/net/phy/Kconfig - adds a new entry for this abstractions in MAINTAINERS - changes some of Device's methods to take &mut self - comment improvment v4: https://lore.kernel.org/netdev/20231012125349.2702474-1-fujita.tomonori@gmail.com/T/ - split the core patch - making Device::from_raw() private - comment improvement with code update - commit message improvement - avoiding using bindings::phy_driver in public functions - using an anonymous constant in module_phy_driver macro v3: https://lore.kernel.org/netdev/20231011.231607.1747074555988728415.fujita.tomonori@gmail.com/T/ - changes the base tree to net-next from rust-next - makes this feature optional; only enabled with CONFIG_RUST_PHYLIB_BINDINGS=y - cosmetic code and comment improvement - adds copyright v2: https://lore.kernel.org/netdev/20231006094911.3305152-2-fujita.tomonori@gmail.com/T/ - build failure fix - function renaming v1: https://lore.kernel.org/netdev/20231002085302.2274260-3-fujita.tomonori@gmail.com/T/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: phy: add Rust Asix PHY driverFUJITA Tomonori
This is the Rust implementation of drivers/net/phy/ax88796b.c. The features are equivalent. You can choose C or Rust version kernel configuration. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15MAINTAINERS: add Rust PHY abstractions for ETHERNET PHY LIBRARYFUJITA Tomonori
Adds me as a maintainer and Trevor as a reviewer. The files are placed at rust/kernel/ directory for now but the files are likely to be moved to net/ directory once a new Rust build system is implemented. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Signed-off-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15rust: net::phy add module_phy_driver macroFUJITA Tomonori
This macro creates an array of kernel's `struct phy_driver` and registers it. This also corresponds to the kernel's `MODULE_DEVICE_TABLE` macro, which embeds the information for module loading into the module binary file. A PHY driver should use this macro. Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Trevor Gross <tmgross@umich.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15rust: core abstractions for network PHY driversFUJITA Tomonori
This patch adds abstractions to implement network PHY drivers; the driver registration and bindings for some of callback functions in struct phy_driver and many genphy_ functions. This feature is enabled with CONFIG_RUST_PHYLIB_ABSTRACTIONS=y. This patch enables unstable const_maybe_uninit_zeroed feature for kernel crate to enable unsafe code to handle a constant value with uninitialized data. With the feature, the abstractions can initialize a phy_driver structure with zero easily; instead of initializing all the members by hand. It's supposed to be stable in the not so distant future. Link: https://github.com/rust-lang/rust/pull/116218 Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15net: stmmac: don't create a MDIO bus if unnecessaryAndrew Halaney
Currently a MDIO bus is created if the devicetree description is either: 1. Not fixed-link 2. fixed-link but contains a MDIO bus as well The "1" case above isn't always accurate. If there's a phy-handle, it could be referencing a phy on another MDIO controller's bus[1]. In this case, where the MDIO bus is not described at all, currently stmmac will make a MDIO bus and scan its address space to discover phys (of which there are none). This process takes time scanning a bus that is known to be empty, delaying time to complete probe. There are also a lot of upstream devicetrees[2] that expect a MDIO bus to be created, scanned for phys, and the first one found connected to the MAC. This case can be inferred from the platform description by not having a phy-handle && not being fixed-link. This hits case "1" in the current driver's logic, and must be handled in any logic change here since it is a valid legacy dt-binding. Let's improve the logic to create a MDIO bus if either: - Devicetree contains a MDIO bus - !fixed-link && !phy-handle (legacy handling) This way the case where no MDIO bus should be made is handled, as well as retaining backwards compatibility with the valid cases. Below devicetree snippets can be found that explain some of the cases above more concretely. Here's[0] a devicetree example where the MAC is both fixed-link and driving a switch on MDIO (case "2" above). This needs a MDIO bus to be created: &fec1 { phy-mode = "rmii"; fixed-link { speed = <100>; full-duplex; }; mdio1: mdio { switch0: switch0@0 { compatible = "marvell,mv88e6190"; pinctrl-0 = <&pinctrl_gpio_switch0>; }; }; }; Here's[1] an example where there is no MDIO bus or fixed-link for the ethernet1 MAC, so no MDIO bus should be created since ethernet0 is the MDIO master for ethernet1's phy: &ethernet0 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy0>; mdio { compatible = "snps,dwmac-mdio"; sgmii_phy0: phy@8 { compatible = "ethernet-phy-id0141.0dd4"; reg = <0x8>; device_type = "ethernet-phy"; }; sgmii_phy1: phy@a { compatible = "ethernet-phy-id0141.0dd4"; reg = <0xa>; device_type = "ethernet-phy"; }; }; }; &ethernet1 { phy-mode = "sgmii"; phy-handle = <&sgmii_phy1>; }; Finally there's descriptions like this[2] which don't describe the MDIO bus but expect it to be created and the whole address space scanned for a phy since there's no phy-handle or fixed-link described: &gmac { phy-supply = <&vcc_lan>; phy-mode = "rmii"; snps,reset-gpio = <&gpio3 RK_PB4 GPIO_ACTIVE_HIGH>; snps,reset-active-low; snps,reset-delays-us = <0 10000 1000000>; }; [0] https://elixir.bootlin.com/linux/v6.5-rc5/source/arch/arm/boot/dts/nxp/vf/vf610-zii-ssmb-dtu.dts [1] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/qcom/sa8775p-ride.dts [2] https://elixir.bootlin.com/linux/v6.6-rc5/source/arch/arm64/boot/dts/rockchip/rk3368-r88.dts#L164 Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Co-developed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-14cxl/pmu: Ensure put_device on pmu devicesIra Weiny
The following kmemleaks were detected when removing the cxl module stack: unreferenced object 0xffff88822616b800 (size 1024): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000448d1afc>] devm_cxl_pmu_add+0x3a/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882260abcc0 (size 16): ... hex dump (first 16 bytes): 70 6d 75 5f 6d 65 6d 30 2e 30 00 26 82 88 ff ff pmu_mem0.0.&.... backtrace: ... [<00000000152b5e98>] dev_set_name+0x43/0x50 [<00000000c228798b>] devm_cxl_pmu_add+0x102/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882272af200 (size 256): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000a14d1813>] device_add+0x4ea/0x890 [<00000000a3f07b47>] devm_cxl_pmu_add+0xbe/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... devm_cxl_pmu_add() correctly registers a device remove function but it only calls device_del() which is only part of device unregistration. Properly call device_unregister() to free up the memory associated with the device. Fixes: 1ad3f701c399 ("cxl/pci: Find and register CXL PMU devices") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231016-pmu-unregister-fix-v1-1-1e2eb2fa3c69@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-15drm/nouveau/kms/nv50-: Don't allow inheritance of headless iorsLyude Paul
Turns out we made a silly mistake when coming up with OR inheritance on nouveau. On pre-DCB 4.1, iors are statically routed to output paths via the DCB. On later generations iors are only routed to an output path if they're actually being used. Unfortunately, it appears with NVIF_OUTP_INHERIT_V0 we make the mistake of assuming the later is true on all generations, which is currently leading us to return bogus ior -> head assignments through nvif, which causes WARN_ON(). So - fix this by verifying that we actually know that there's a head assigned to an ior before allowing it to be inherited through nvif. This -should- hopefully fix the WARN_ON on GT218 reported by Borislav. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231214004359.1028109-1-lyude@redhat.com
2023-12-15drm/nouveau: Fixup gk20a instobj hierarchyThierry Reding
Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") uses container_of() to cast from struct nvkm_memory to struct nvkm_instobj, assuming that all instance objects are derived from struct nvkm_instobj. For the gk20a family that's not the case and they are derived from struct nvkm_memory instead. This causes some subtle data corruption (nvkm_instobj.preserve ends up mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents suspend/resume from working. Fix this by making struct gk20a_instobj derive from struct nvkm_instobj instead. Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com
2023-12-14Merge tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Address OOBs and NULL dereference found by Dr. Morris's recent analysis and fuzzing. All marked for stable as well" * tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix OOB in smb2_query_reparse_point() smb: client: fix NULL deref in asn1_ber_decoder() smb: client: fix potential OOBs in smb2_parse_contexts() smb: client: fix OOB in receive_encrypted_standard()
2023-12-14bpf: xdp: Register generic_kfunc_set with XDP programsDaniel Xu
Registering generic_kfunc_set with XDP programs enables some of the newer BPF features inside XDP -- namely tree based data structures and BPF exceptions. The current motivation for this commit is to enable assertions inside XDP bpf progs. Assertions are a standard and useful tool to encode intent. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/d07d4614b81ca6aada44fcb89bb6b618fb66e4ca.1702594357.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14Merge tag 'wireless-2023-12-14' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== * add (and fix) certificate for regdb handover to Chen-Yu Tsai * fix rfkill GPIO handling * a few driver (iwlwifi, mt76) crash fixes * logic fixes in the stack * tag 'wireless-2023-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: fix certs build to not depend on file order wifi: mt76: fix crash with WED rx support enabled wifi: iwlwifi: pcie: avoid a NULL pointer dereference wifi: mac80211: mesh_plink: fix matches_local logic wifi: mac80211: mesh: check element parsing succeeded wifi: mac80211: check defragmentation succeeded wifi: mac80211: don't re-add debugfs during reconfig net: rfkill: gpio: set GPIO direction wifi: mac80211: check if the existing link config remains unchanged wifi: cfg80211: Add my certificate wifi: iwlwifi: pcie: add another missing bh-disable for rxq->lock wifi: ieee80211: don't require protected vendor action frames ==================== Link: https://lore.kernel.org/r/20231214111515.60626-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14Merge tag 'mlx5-fixes-2023-12-13' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-12-13 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Correct snprintf truncation handling for fw_version buffer used by representors net/mlx5e: Correct snprintf truncation handling for fw_version buffer net/mlx5e: Fix error codes in alloc_branch_attr() net/mlx5e: Fix error code in mlx5e_tc_action_miss_mapping_get() net/mlx5: Refactor mlx5_flow_destination->rep pointer to vport num net/mlx5: Fix fw tracer first block check net/mlx5e: XDP, Drop fragmented packets larger than MTU size net/mlx5e: Decrease num_block_tc when unblock tc offload net/mlx5e: Fix overrun reported by coverity net/mlx5e: fix a potential double-free in fs_udp_create_groups net/mlx5e: Fix a race in command alloc flow net/mlx5e: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list() net/mlx5e: fix double free of encap_header Revert "net/mlx5e: fix double free of encap_header" Revert "net/mlx5e: fix double free of encap_header in update funcs" ==================== Link: https://lore.kernel.org/r/20231214012505.42666-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-12-13 (ice, i40e) This series contains updates to ice and i40e drivers. Michal Schmidt prevents possible out-of-bounds access for ice. Ivan Vecera corrects value for MDIO clause 45 on i40e. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: i40e: Fix ST code value for Clause 45 ice: fix theoretical out-of-bounds access in ethtool link modes ==================== Link: https://lore.kernel.org/r/20231213220827.1311772-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14i40e: remove fake support of rx-frames-irqJason Xing
Since we never support this feature for I40E driver, we don't have to display the value when using 'ethtool -c eth0'. Before this patch applied, the rx-frames-irq is 256 which is consistent with tx-frames-irq. Apparently it could mislead users. Signed-off-by: Jason Xing <kernelxing@tencent.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20231213184406.1306602-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14Merge branch 'mdio-mux-cleanup'Jakub Kicinski
Vladimir Oltean says: ==================== MDIO mux cleanup This small patch set resolves some technical debt in the MDIO mux driver which was discovered during the investigation for commit 1f9f2143f24e ("net: mdio-mux: fix C45 access returning -EIO after API change"). The patches have been sitting for 2 months in the NXP SDK kernel and haven't caused issues. ==================== Link: https://lore.kernel.org/r/20231213152712.320842-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14net: mdio-mux: be compatible with parent buses which only support C45Vladimir Oltean
After the mii_bus API conversion to a split read() / read_c45(), there might be MDIO parent buses which only populate the read_c45() and write_c45() function pointers but not the C22 variants. We haven't seen these in the wild paired with MDIO multiplexers, but Andrew points out we should treat the corner case. Link: https://lore.kernel.org/netdev/4ccd7dc9-b611-48aa-865f-68d3a1327ce8@lunn.ch/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213152712.320842-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14net: mdio-mux: show errors on probe failureVladimir Oltean
Showing the precise error symbols can help debugging probe issues, such as the recent -EIO error in of_mdiobus_register() caused by the lack of bus->read_c45() and bus->write_c45() methods. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20231213152712.320842-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14net: atlantic: eliminate double free in error handling logicIgor Russkikh
Driver has a logic leak in ring data allocation/free, where aq_ring_free could be called multiple times on same ring, if system is under stress and got memory allocation error. Ring pointer was used as an indicator of failure, but this is not correct since only ring data is allocated/deallocated. Ring itself is an array member. Changing ring allocation functions to return error code directly. This simplifies error handling and eliminates aq_ring_free on higher layer. Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Link: https://lore.kernel.org/r/20231213095044.23146-1-irusskikh@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>