summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2023-12-13net/sched: act_api: skip idr replace on bound actionsPedro Tammela
tcf_idr_insert_many will replace the allocated -EBUSY pointer in tcf_idr_check_alloc with the real action pointer, exposing it to all operations. This operation is only needed when the action pointer is created (ACT_P_CREATED). For actions which are bound to (returned 0), the pointer already resides in the idr making such operation a nop. Even though it's a nop, it's still not a cheap operation as internally the idr code walks the idr and then does a replace on the appropriate slot. So if the action was bound, better skip the idr replace entirely. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20231211181807.96028-3-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net/sched: act_api: rely on rcu in tcf_idr_check_allocPedro Tammela
Instead of relying only on the idrinfo->lock mutex for bind/alloc logic, rely on a combination of rcu + mutex + atomics to better scale the case where multiple rtnl-less filters are binding to the same action object. Action binding happens when an action index is specified explicitly and an action exists which such index exists. Example: tc actions add action drop index 1 tc filter add ... matchall action drop index 1 tc filter add ... matchall action drop index 1 tc filter add ... matchall action drop index 1 tc filter ls ... filter protocol all pref 49150 matchall chain 0 filter protocol all pref 49150 matchall chain 0 handle 0x1 not_in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 4 bind 3 filter protocol all pref 49151 matchall chain 0 filter protocol all pref 49151 matchall chain 0 handle 0x1 not_in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 4 bind 3 filter protocol all pref 49152 matchall chain 0 filter protocol all pref 49152 matchall chain 0 handle 0x1 not_in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 4 bind 3 When no index is specified, as before, grab the mutex and allocate in the idr the next available id. In this version, as opposed to before, it's simplified to store the -EBUSY pointer instead of the previous alloc + replace combination. When an index is specified, rely on rcu to find if there's an object in such index. If there's none, fallback to the above, serializing on the mutex and reserving the specified id. If there's one, it can be an -EBUSY pointer, in which case we just try again until it's an action, or an action. Given the rcu guarantees, the action found could be dead and therefore we need to bump the refcount if it's not 0, handling the case it's in fact 0. As bind and the action refcount are already atomics, these increments can happen without the mutex protection while many tcf_idr_check_alloc race to bind to the same action instance. In case binding encounters a parallel delete or add, it will return -EAGAIN in order to try again. Both filter and action apis already have the retry machinery in-place. In case it's an unlocked filter it retries under the rtnl lock. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20231211181807.96028-2-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13bpf: syzkaller found null ptr deref in unix_bpf proto addJohn Fastabend
I added logic to track the sock pair for stream_unix sockets so that we ensure lifetime of the sock matches the time a sockmap could reference the sock (see fixes tag). I forgot though that we allow af_unix unconnected sockets into a sock{map|hash} map. This is problematic because previous fixed expected sk_pair() to exist and did not NULL check it. Because unconnected sockets have a NULL sk_pair this resulted in the NULL ptr dereference found by syzkaller. BUG: KASAN: null-ptr-deref in unix_stream_bpf_update_proto+0x72/0x430 net/unix/unix_bpf.c:171 Write of size 4 at addr 0000000000000080 by task syz-executor360/5073 Call Trace: <TASK> ... sock_hold include/net/sock.h:777 [inline] unix_stream_bpf_update_proto+0x72/0x430 net/unix/unix_bpf.c:171 sock_map_init_proto net/core/sock_map.c:190 [inline] sock_map_link+0xb87/0x1100 net/core/sock_map.c:294 sock_map_update_common+0xf6/0x870 net/core/sock_map.c:483 sock_map_update_elem_sys+0x5b6/0x640 net/core/sock_map.c:577 bpf_map_update_value+0x3af/0x820 kernel/bpf/syscall.c:167 We considered just checking for the null ptr and skipping taking a ref on the NULL peer sock. But, if the socket is then connected() after being added to the sockmap we can cause the original issue again. So instead this patch blocks adding af_unix sockets that are not in the ESTABLISHED state. Reported-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+e8030702aefd3444fb9e@syzkaller.appspotmail.com Fixes: 8866730aed51 ("bpf, sockmap: af_unix stream sockets need to hold ref for pair sock") Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20231201180139.328529-2-john.fastabend@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-13xdp: Add VLAN tag hintLarysa Zaremba
Implement functionality that enables drivers to expose VLAN tag to XDP code. VLAN tag is represented by 2 variables: - protocol ID, which is passed to bpf code in BE - VLAN TCI, in host byte order Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20231205210847.28460-10-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13xsk: add functions to fill control bufferMaciej Fijalkowski
Commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk") has added a buffer for custom data to xdp_buff_xsk. Particularly, this memory is used for data, consumed by XDP hints kfuncs. It does not always change on a per-packet basis and some parts can be set for example, at the same time as RX queue info. Add functions to fill all cbs in xsk_buff_pool with the same metadata. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-8-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"Jakub Kicinski
This reverts commit f3f32a356c0d2379d4431364e74f101f8f075ce3. Paolo reports that the change disables autocorking even after the userspace sets TCP_CORK. Fixes: f3f32a356c0d ("tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set") Link: https://lore.kernel.org/r/0d30d5a41d3ac990573016308aaeacb40a9dc79f.camel@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is setSalvatore Dipietro
Based on the tcp man page, if TCP_NODELAY is set, it disables Nagle's algorithm and packets are sent as soon as possible. However in the `tcp_push` function where autocorking is evaluated the `nonagle` value set by TCP_NODELAY is not considered which can trigger unexpected corking of packets and induce delays. For example, if two packets are generated as part of a server's reply, if the first one is not transmitted on the wire quickly enough, the second packet can trigger the autocorking in `tcp_push` and be delayed instead of sent as soon as possible. It will either wait for additional packets to be coalesced or an ACK from the client before transmitting the corked packet. This can interact badly if the receiver has tcp delayed acks enabled, introducing 40ms extra delay in completion times. It is not always possible to control who has delayed acks set, but it is possible to adjust when and how autocorking is triggered. Patch prevents autocorking if the TCP_NODELAY flag is set on the socket. Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu 22.04 and Apache Tomcat 9.0.83 running the basic servlet below: import java.io.IOException; import java.io.OutputStreamWriter; import java.io.PrintWriter; import javax.servlet.ServletException; import javax.servlet.http.HttpServlet; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletResponse; public class HelloWorldServlet extends HttpServlet { @Override protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html;charset=utf-8"); OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8"); String s = "a".repeat(3096); osw.write(s,0,s.length()); osw.flush(); } } Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS c6i.8xlarge instance. With the current auto-corking behavior and TCP_NODELAY set an additional 40ms latency from P99.99+ values are observed. With the patch applied we see no occurrences of 40ms latencies. The patch has also been tested with iperf and uperf benchmarks and no regression was observed. # No patch with tcp_autocorking=1 and TCP_NODELAY set on all sockets ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello' ... 50.000% 0.91ms 75.000% 1.12ms 90.000% 1.46ms 99.000% 1.73ms 99.900% 1.96ms 99.990% 43.62ms <<< 40+ ms extra latency 99.999% 48.32ms 100.000% 49.34ms # With patch ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello' ... 50.000% 0.89ms 75.000% 1.13ms 90.000% 1.44ms 99.000% 1.67ms 99.900% 1.78ms 99.990% 2.27ms <<< no 40+ ms extra latency 99.999% 3.71ms 100.000% 4.57ms Fixes: f54b311142a9 ("tcp: auto corking") Signed-off-by: Salvatore Dipietro <dipiets@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-12net: Remove acked SYN flag from packet in the transmit queue correctlyDong Chenchen
syzkaller report: kernel BUG at net/core/skbuff.c:3452! invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e7762ad2-dirty #135 RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452) Call Trace: icmp_glue_bits (net/ipv4/icmp.c:357) __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165) ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341) icmp_push_reply (net/ipv4/icmp.c:370) __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772) ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577) __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295) ip_output (net/ipv4/ip_output.c:427) __ip_queue_xmit (net/ipv4/ip_output.c:535) __tcp_transmit_skb (net/ipv4/tcp_output.c:1462) __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387) tcp_retransmit_skb (net/ipv4/tcp_output.c:3404) tcp_retransmit_timer (net/ipv4/tcp_timer.c:604) tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716) The panic issue was trigered by tcp simultaneous initiation. The initiation process is as follows: TCP A TCP B 1. CLOSED CLOSED 2. SYN-SENT --> <SEQ=100><CTL=SYN> ... 3. SYN-RECEIVED <-- <SEQ=300><CTL=SYN> <-- SYN-SENT 4. ... <SEQ=100><CTL=SYN> --> SYN-RECEIVED 5. SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ... // TCP B: not send challenge ack for ack limit or packet loss // TCP A: close tcp_close tcp_send_fin if (!tskb && tcp_under_memory_pressure(sk)) tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN; // set FIN flag 6. FIN_WAIT_1 --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // TCP B: send challenge ack to SYN_FIN_ACK 7. ... <SEQ=301><ACK=101><CTL=ACK> <-- SYN-RECEIVED //challenge ack // TCP A: <SND.UNA=101> 8. FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic __tcp_retransmit_skb //skb->len=0 tcp_trim_head len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100 __pskb_trim_head skb->data_len -= len // skb->len=-1, wrap around ... ... ip_fragment icmp_glue_bits //BUG_ON If we use tcp_trim_head() to remove acked SYN from packet that contains data or other flags, skb->len will be incorrectly decremented. We can remove SYN flag that has been acked from rtx_queue earlier than tcp_trim_head(), which can fix the problem mentioned above. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com> Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13net: 9p: avoid freeing uninit memory in p9pdu_vreadfFedor Pchelkin
If some of p9pdu_readf() calls inside case 'T' in p9pdu_vreadf() fails, the error path is not handled properly. *wnames or members of *wnames array may be left uninitialized and invalidly freed. Initialize *wnames to NULL in beginning of case 'T'. Initialize the first *wnames array element to NULL and nullify the failing *wnames element so that the error path freeing loop stops on the first NULL element and doesn't proceed further. Found by Linux Verification Center (linuxtesting.org). Fixes: ace51c4dd2f9 ("9p: add new protocol support code") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Message-ID: <20231206200913.16135-1-pchelkin@ispras.ru> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2023-12-12Merge branch 'vfs.file' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs into for-6.8/io_uring Merge vfs.file from the VFS tree to avoid conflicts with receive_fd() now having 3 arguments rather than just 2. * 'vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: file: remove __receive_fd() file: stop exposing receive_fd_user() fs: replace f_rcuhead with f_task_work file: remove pointless wrapper file: s/close_fd_get_file()/file_close_fd()/g Improve __fget_files_rcu() code generation (and thus __fget_light()) file: massage cleanup of files that failed to open
2023-12-12file: stop exposing receive_fd_user()Christian Brauner
Not every subsystem needs to have their own specialized helper. Just us the __receive_fd() helper. Link: https://lore.kernel.org/r/20231130-vfs-files-fixes-v1-4-e73ca6f4ea83@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-12net/rose: Fix Use-After-Free in rose_ioctlHyunwoo Kim
Because rose_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with rose_accept(). A use-after-free for skb occurs with the following flow. ``` rose_ioctl() -> skb_peek() rose_accept() -> skb_dequeue() -> kfree_skb() ``` Add sk->sk_receive_queue.lock to rose_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209100538.GA407321@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-12atm: Fix Use-After-Free in do_vcc_ioctlHyunwoo Kim
Because do_vcc_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with vcc_recvmsg(). A use-after-free for skb occurs with the following flow. ``` do_vcc_ioctl() -> skb_peek() vcc_recvmsg() -> skb_recv_datagram() -> skb_free_datagram() ``` Add sk->sk_receive_queue.lock to do_vcc_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209094210.GA403126@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-12net: dns_resolver: the module is called dns_resolver, not dnsresolverAhelenia Ziemiańska
$ modinfo dnsresolver dns_resolver | grep name modinfo: ERROR: Module dnsresolver not found. filename: /lib/modules/6.1.0-9-amd64/kernel/net/dns_resolver/dns_resolver.ko name: dns_resolver Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Link: https://lore.kernel.org/r/gh4sxphjxbo56n2spgmc66vtazyxgiehpmv5f2gkvgicy6f4rs@tarta.nabijaczleweli.xyz Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-12wifi: mac80211: drop spurious WARN_ON() in ieee80211_ibss_csa_beacon()Dmitry Antipov
The WARN_ON() in subject was actually seen only once, with 5.10.200 under syzkaller. It looks like a weird artifact of (ab?)using the syzkaller itself [1], and hopefully may be safely removed. [1] https://lore.kernel.org/linux-wireless/1bd8f266-dee0-4d4e-9b50-e22546b55763@yandex.ru/T/#u Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://msgid.link/20231208153130.107409-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: don't set ESS capab bit in assoc requestJohannes Berg
The ESS capability bit is reserved in frames transmitted by the client, so we shouldn't set it. Since we've set it for decades, keep that old behaviour unless we're connection to a new EHT AP. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.65005aba900b.I3d00c8741400572a89a7508b5ae612c968874ad7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: consume both probe response and beacon IEsBenjamin Berg
When doing a channel switch, cfg80211_update_known_bss may be called with a BSS where both proberesp_ies and beacon_ies is set. If that happens, both need to be consumed. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.07a88656d7df.I0fe9fc599382de0eccf96455617e377d9c231966@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: generate an ML element for per-STA profilesBenjamin Berg
The specification says that this information should not be explicitly included in the per-STA profile. However, we need this information readily available in the BSS for userspace and also internally when associating. As such, append the appropriate element before adding/updating the BSS. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.abde63d9cc6d.I3d346be0f84f51dccf4f4f92a3e997e6102b9456@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: Replace ENOTSUPP with EOPNOTSUPPAndrei Otcheretianski
ENOTSUPP isn't a standard error code, don't use it. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.0214b6c79756.I2536bc8426ae15c8cff7ad199e57f06e2e404f13@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: Replace ENOTSUPP with EOPNOTSUPPAndrei Otcheretianski
ENOTSUP isn't a standard error code. EOPNOTSUPP should be used instead. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.3841b71c867d.Idf2ad01d9dfe8d6d6c352bf02deb06e49701ad1d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: add a flag to disallow puncturingJohannes Berg
There may be cases where puncturing isn't possible, and a connection needs to be downgraded. Add a hardware flag to support this. This is likely temporary: it seems we will need to move puncturing to the chandef/channel context. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.c1e89ea55e93.I37b8ca0ee64d5d7699e351785a9010afc106da3c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: Add support for setting TID to link mappingIlan Peer
Add support for setting the TID to link mapping for a non-AP MLD station. This is useful in cases user space needs to restrict the possible set of active links, e.g., since it got a BSS Transition Management request forcing to use only a subset of the valid links etc. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.da4d56a5f3ff.Iacf88e943326bf9c169c49b728c4a3445fdedc97@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: add BSS usage reportingJohannes Berg
Sometimes there may be reasons for which a BSS that's actually found in scan cannot be used to connect to, for example a nonprimary link of an NSTR mobile AP MLD cannot be used for normal direct connections to it. Not indicating these to userspace as we do now of course avoids being able to connect to them, but it's better if they're shown to userspace and it can make an appropriate decision, without e.g. doing an additional ML probe. Thus add an indication of what a BSS can be used for, currently "normal" and "MLD link", including a reason bitmap for it being not usable. The latter can be extended later for certain BSSes if there are other reasons they cannot be used. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.0464f25e0b1d.I9f70ca9f1440565ad9a5207d0f4d00a20cca67e7@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: nl80211: Extend del pmksa support for SAE and OWE securityVinayak Yadawad
Current handling of del pmksa with SSID is limited to FILS security. In the current change the del pmksa support is extended to SAE/OWE security offloads as well. For OWE/SAE offloads, the PMK is generated and cached at driver/FW, so user app needs the capability to request cache deletion based on SSID for drivers supporting SAE/OWE offload. Signed-off-by: Vinayak Yadawad <vinayak.yadawad@broadcom.com> Link: https://msgid.link/ecdae726459e0944c377a6a6f6cb2c34d2e057d0.1701262123.git.vinayak.yadawad@broadcom.com [drop whitespace-damaged rdev_ops pointer completely, enabling tracing] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: cleanup airtime arithmetic with ieee80211_sta_keep_active()Dmitry Antipov
Prefer native jiffies-wide 'unsigned long' for the 'last_active' field of 'struct airtime_info' and introduce 'ieee80211_sta_keep_active()' for airtime check in 'ieee80211_txq_keep_active()' and 'ieee80211_sta_register_airtime()'. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://msgid.link/20231206060935.612241-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: Add support for WBRF featuresEvan Quan
To support the WBRF mechanism, Wifi adapters utilized in the system must register the frequencies in use (or unregister those frequencies no longer used) via the dedicated calls. So that, other drivers responding to the frequencies can take proper actions to mitigate possible interference. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Co-developed-by: Evan Quan <quanliangl@hotmail.com> Signed-off-by: Evan Quan <quanliangl@hotmail.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Link: https://msgid.link/20231211100630.2170152-5-Jun.Ma2@amd.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: expose nl80211_chan_width_to_mhz for wide sharingEvan Quan
The newly added WBRF feature needs this interface for channel width calculation. Signed-off-by: Evan Quan <quanliangl@hotmail.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://msgid.link/20231211100630.2170152-4-Jun.Ma2@amd.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: mesh_plink: fix matches_local logicJohannes Berg
During refactoring the "else" here got lost, add it back. Fixes: c99a89edb106 ("mac80211: factor out plink event gathering") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.795480fa0e0b.I017d501196a5bbdcd9afd33338d342d6fe1edd79@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: mesh: check element parsing succeededJohannes Berg
ieee802_11_parse_elems() can return NULL, so we must check for the return value. Fixes: 5d24828d05f3 ("mac80211: always allocate struct ieee802_11_elems") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.93dea364f3d3.Ie87781c6c48979fb25a744b90af4a33dc2d83a28@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: check defragmentation succeededJohannes Berg
We need to check that cfg80211_defragment_element() didn't return an error, since it can fail due to bad input, and we didn't catch that before. Fixes: 8eb8dd2ffbbb ("wifi: mac80211: Support link removal using Reconfiguration ML element") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.8595a6b67fc0.I1225edd8f98355e007f96502e358e476c7971d8c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: don't re-add debugfs during reconfigJohannes Berg
If we're doing reconfig, then we cannot add the debugfs files that are already there from before the reconfig. Skip that in drv_change_sta_links() during reconfig. Fixes: d2caad527c19 ("wifi: mac80211: add API to show the link STAs in debugfs") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Reviewed-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20231211085121.88a950f43e16.Id71181780994649219685887c0fcad33d387cc78@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12net: rfkill: gpio: set GPIO directionRouven Czerwinski
Fix the undefined usage of the GPIO consumer API after retrieving the GPIO description with GPIO_ASIS. The API documentation mentions that GPIO_ASIS won't set a GPIO direction and requires the user to set a direction before using the GPIO. This can be confirmed on i.MX6 hardware, where rfkill-gpio is no longer able to enabled/disable a device, presumably because the GPIO controller was never configured for the output direction. Fixes: b2f750c3a80b ("net: rfkill: gpio: prevent value glitch during probe") Cc: stable@vger.kernel.org Signed-off-by: Rouven Czerwinski <r.czerwinski@pengutronix.de> Link: https://msgid.link/20231207075835.3091694-1-r.czerwinski@pengutronix.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: mac80211: check if the existing link config remains unchangedEdward Adam Davis
[Syz report] WARNING: CPU: 1 PID: 5067 at net/mac80211/rate.c:48 rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48 Modules linked in: CPU: 1 PID: 5067 Comm: syz-executor413 Not tainted 6.7.0-rc3-syzkaller-00014-gdf60cee26a2e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:rate_control_rate_init+0x540/0x690 net/mac80211/rate.c:48 Code: 48 c7 c2 00 46 0c 8c be 08 03 00 00 48 c7 c7 c0 45 0c 8c c6 05 70 79 0b 05 01 e8 1b a0 6f f7 e9 e0 fd ff ff e8 61 b3 8f f7 90 <0f> 0b 90 e9 36 ff ff ff e8 53 b3 8f f7 e8 5e 0b 78 f7 31 ff 89 c3 RSP: 0018:ffffc90003c57248 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff888016bc4000 RCX: ffffffff89f7d519 RDX: ffff888076d43b80 RSI: ffffffff89f7d6df RDI: 0000000000000005 RBP: ffff88801daaae20 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000001 R13: 0000000000000000 R14: ffff888020030e20 R15: ffff888078f08000 FS: 0000555556b94380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000005fdeb8 CR3: 0000000076d22000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> sta_apply_auth_flags.constprop.0+0x4b7/0x510 net/mac80211/cfg.c:1674 sta_apply_parameters+0xaf1/0x16c0 net/mac80211/cfg.c:2002 ieee80211_add_station+0x3fa/0x6c0 net/mac80211/cfg.c:2068 rdev_add_station net/wireless/rdev-ops.h:201 [inline] nl80211_new_station+0x13ba/0x1a70 net/wireless/nl80211.c:7603 genl_family_rcv_msg_doit+0x1fc/0x2e0 net/netlink/genetlink.c:972 genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x561/0x800 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0x53b/0x810 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b [Analysis] It is inappropriate to make a link configuration change judgment on an non-existent and non new link. [Fix] Quickly exit when there is a existent link and the link configuration has not changed. Fixes: b303835dabe0 ("wifi: mac80211: accept STA changes without link changes") Reported-and-tested-by: syzbot+62d7eef57b09bfebcd84@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Link: https://msgid.link/tencent_DE67FF86DB92ED465489A36ECD2EDDCC8C06@qq.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-12wifi: cfg80211: Add my certificateChen-Yu Tsai
As announced [1][2], I have taken over maintainership of the wireless-regdb project. Add my certificate so that newer releases are valid to the kernel. Seth's certificate should be kept around for awhile, at least until a few new releases by me happen. This should also be applied to stable trees so that stable kernels can utilize newly released database binaries. [1] https://lore.kernel.org/linux-wireless/CAGb2v657baNMPKU3QADijx7hZa=GUcSv2LEDdn6N=QQaFX8r-g@mail.gmail.com/ [2] https://lore.kernel.org/linux-wireless/ZWmRR5ul7EDfxCan@wens.tw/ Cc: stable@vger.kernel.org Signed-off-by: Chen-Yu Tsai <wens@kernel.org> Acked-by: Seth Forshee <sforshee@kernel.org> Link: https://msgid.link/ZXHGsqs34qZyzZng@wens.tw Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-11net/sched: cls_api: conditional notification of eventsPedro Tammela
As of today tc-filter/chain events are unconditionally built and sent to RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check before-hand if they are really needed. This will help to alleviate system pressure when filters are concurrently added without the rtnl lock as in tc-flower. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20231208192847.714940-8-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-11net/sched: cls_api: remove 'unicast' argument from delete notificationPedro Tammela
This argument is never called while set to true, so remove it as there's no need for it. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-7-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-11net/sched: act_api: conditional notification of eventsPedro Tammela
As of today tc-action events are unconditionally built and sent to RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check before-hand if they are really needed. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-6-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-11net/sched: act_api: don't open code max()Pedro Tammela
Use max() in a couple of places that are open coding it with the ternary operator. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-5-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-11ipv6: annotate data-races around np->ucast_oifEric Dumazet
np->ucast_oif is read locklessly in some contexts. Make all accesses to this field lockless, adding appropriate annotations. This also makes setsockopt( IPV6_UNICAST_IF ) lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-11ipv6: annotate data-races around np->mcast_oifEric Dumazet
np->mcast_oif is read locklessly in some contexts. Make all accesses to this field lockless, adding appropriate annotations. This also makes setsockopt( IPV6_MULTICAST_IF ) lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-11Revert "net: rtnetlink: remove local list in __linkwatch_run_queue()"Johannes Berg
This reverts commit b8dbbbc535a9 ("net: rtnetlink: remove local list in __linkwatch_run_queue()"). It's evidently broken when there's a non-urgent work that gets added back, and then the loop can never finish. While reverting, add a note about that. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: b8dbbbc535a9 ("net: rtnetlink: remove local list in __linkwatch_run_queue()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-11net/sched: act_ct: Take per-cb reference to tcf_ct_flow_tableVlad Buslov
The referenced change added custom cleanup code to act_ct to delete any callbacks registered on the parent block when deleting the tcf_ct_flow_table instance. However, the underlying issue is that the drivers don't obtain the reference to the tcf_ct_flow_table instance when registering callbacks which means that not only driver callbacks may still be on the table when deleting it but also that the driver can still have pointers to its internal nf_flowtable and can use it concurrently which results either warning in netfilter[0] or use-after-free. Fix the issue by taking a reference to the underlying struct tcf_ct_flow_table instance when registering the callback and release the reference when unregistering. Expose new API required for such reference counting by adding two new callbacks to nf_flowtable_type and implementing them for act_ct flowtable_ct type. This fixes the issue by extending the lifetime of nf_flowtable until all users have unregistered. [0]: [106170.938634] ------------[ cut here ]------------ [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis try overlay mlx5_core [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1 [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00 [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202 [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48 [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000 [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700 [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58 [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000 [106170.951616] FS: 0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000 [106170.952329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0 [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [106170.954766] Call Trace: [106170.955057] <TASK> [106170.955315] ? __warn+0x79/0x120 [106170.955648] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.956172] ? report_bug+0x17c/0x190 [106170.956537] ? handle_bug+0x3c/0x60 [106170.956891] ? exc_invalid_op+0x14/0x70 [106170.957264] ? asm_exc_invalid_op+0x16/0x20 [106170.957666] ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core] [106170.958172] ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core] [106170.958788] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.959339] ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core] [106170.959854] ? mapping_remove+0x154/0x1d0 [mlx5_core] [106170.960342] ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core] [106170.960927] mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core] [106170.961441] mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core] [106170.962001] mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core] [106170.962524] mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core] [106170.963034] mlx5e_flow_put+0x73/0xe0 [mlx5_core] [106170.963506] mlx5e_put_flow_list+0x38/0x70 [mlx5_core] [106170.964002] mlx5e_rep_update_flows+0xec/0x290 [mlx5_core] [106170.964525] mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core] [106170.965056] process_one_work+0x13a/0x2c0 [106170.965443] worker_thread+0x2e5/0x3f0 [106170.965808] ? rescuer_thread+0x410/0x410 [106170.966192] kthread+0xc6/0xf0 [106170.966515] ? kthread_complete_and_exit+0x20/0x20 [106170.966970] ret_from_fork+0x2d/0x50 [106170.967332] ? kthread_complete_and_exit+0x20/0x20 [106170.967774] ret_from_fork_asm+0x11/0x20 [106170.970466] </TASK> [106170.970726] ---[ end trace 0000000000000000 ]--- Fixes: 77ac5e40c44e ("net/sched: act_ct: remove and free nf_table callbacks") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-09io_uring/af_unix: disable sending io_uring over socketsPavel Begunkov
File reference cycles have caused lots of problems for io_uring in the past, and it still doesn't work exactly right and races with unix_stream_read_generic(). The safest fix would be to completely disallow sending io_uring files via sockets via SCM_RIGHT, so there are no possible cycles invloving registered files and thus rendering SCM accounting on the io_uring side unnecessary. Cc: stable@vger.kernel.org Fixes: 0091bfc81741b ("io_uring/af_unix: defer registered files gc to io_uring release") Reported-and-suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-08tcp: fix tcp_disordered_ack() vs usec TS resolutionEric Dumazet
After commit 939463016b7a ("tcp: change data receiver flowlabel after one dup") we noticed an increase of TCPACKSkippedPAWS events. Neal Cardwell tracked the issue to tcp_disordered_ack() assumption about remote peer TS clock. RFC 1323 & 7323 are suggesting the following: "timestamp clock frequency in the range 1 ms to 1 sec per tick between 1ms and 1sec." This has to be adjusted for 1 MHz clock frequency. This hints at reorders of SACK packets on send side, this might deserve a future patch. (skb->ooo_okay is always set for pure ACK packets) Fixes: 614e8316aa4c ("tcp: add support for usec resolution in TCP TS values") Co-developed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Morley <morleyd@google.com> Link: https://lore.kernel.org/r/20231207181342.525181-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-08net: sysfs: fix locking in carrier readJohannes Berg
My previous patch added a call to linkwatch_sync_dev(), but that of course needs to be called under RTNL, which I missed earlier, but now saw RCU warnings from. Fix that by acquiring the RTNL in a similar fashion to how other files do it here. Fixes: facd15dfd691 ("net: core: synchronize link-watch when carrier is queried") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20231206172122.859df6ba937f.I9c80608bcfbab171943ff4942b52dbd5e97fe06e@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-08Merge tag 'io_uring-6.7-2023-12-08' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Two minor fixes for issues introduced in this release cycle, and two fixes for issues or potential issues that are heading to stable. One of these ends up disabling passing io_uring file descriptors via SCM_RIGHTS. There really shouldn't be an overlap between that kind of historic use case and modern usage of io_uring, which is why this was deemed appropriate" * tag 'io_uring-6.7-2023-12-08' of git://git.kernel.dk/linux: io_uring/af_unix: disable sending io_uring over sockets io_uring/kbuf: check for buffer list readiness after NULL check io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring() io_uring: fix mutex_unlock with unreferenced ctx
2023-12-08Use READ/WRITE_ONCE() for IP local_port_range.David Laight
Commit 227b60f5102cd added a seqlock to ensure that the low and high port numbers were always updated together. This is overkill because the two 16bit port numbers can be held in a u32 and read/written in a single instruction. More recently 91d0b78c5177f added support for finer per-socket limits. The user-supplied value is 'high << 16 | low' but they are held separately and the socket options protected by the socket lock. Use a u32 containing 'high << 16 | low' for both the 'net' and 'sk' fields and use READ_ONCE()/WRITE_ONCE() to ensure both values are always updated together. Change (the now trival) inet_get_local_port_range() to a static inline to optimise the calling code. (In particular avoiding returning integers by reference.) Signed-off-by: David Laight <david.laight@aculab.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Mat Martineau <martineau@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/4e505d4198e946a8be03fb1b4c3072b0@AcuMS.aculab.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-08ethtool: Implement ethtool_puts()justinstitt@google.com
Use strscpy() to implement ethtool_puts(). Functionally the same as ethtool_sprintf() when it's used with two arguments or with just "%s" format specifier. Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Madhuri Sripada <madhuri.sripada@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-08net: ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIXMaciej Żenczykowski
Lorenzo points out that we effectively clear all unknown flags from PIO when copying them to userspace in the netlink RTM_NEWPREFIX notification. We could fix this one at a time as new flags are defined, or in one fell swoop - I choose the latter. We could either define 6 new reserved flags (reserved1..6) and handle them individually (and rename them as new flags are defined), or we could simply copy the entire unmodified byte over - I choose the latter. This unfortunately requires some anonymous union/struct magic, so we add a static assert on the struct size for a little extra safety. Cc: David Ahern <dsahern@kernel.org> Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-08neighbour: Don't let neigh_forced_gc() disable preemption for longJudy Hsiao
We are seeing cases where neigh_cleanup_and_release() is called by neigh_forced_gc() many times in a row with preemption turned off. When running on a low powered CPU at a low CPU frequency, this has been measured to keep preemption off for ~10 ms. That's not great on a system with HZ=1000 which expects tasks to be able to schedule in with ~1ms latency. Suggested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Judy Hsiao <judyhsiao@chromium.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>