summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-24mm: fix incorrect vbq reference in purge_fragmented_blockZhaoyang Huang
xa_for_each() in _vm_unmap_aliases() loops through all vbs. However, since commit 062eacf57ad9 ("mm: vmalloc: remove a global vmap_blocks xarray") the vb from xarray may not be on the corresponding CPU vmap_block_queue. Consequently, purge_fragmented_block() might use the wrong vbq->lock to protect the free list, leading to vbq->free breakage. Incorrect lock protection can exhaust all vmalloc space as follows: CPU0 CPU1 +--------------------------------------------+ | +--------------------+ +-----+ | +--> | |---->| |------+ | CPU1:vbq free_list | | vb1 | +--- | |<----| |<-----+ | +--------------------+ +-----+ | +--------------------------------------------+ _vm_unmap_aliases() vb_alloc() new_vmap_block() xa_for_each(&vbq->vmap_blocks, idx, vb) --> vb in CPU1:vbq->freelist purge_fragmented_block(vb) spin_lock(&vbq->lock) spin_lock(&vbq->lock) --> use CPU0:vbq->lock --> use CPU1:vbq->lock list_del_rcu(&vb->free_list) list_add_tail_rcu(&vb->free_list, &vbq->free) __list_del(vb->prev, vb->next) next->prev = prev +--------------------+ | | | CPU1:vbq free_list | +---| |<--+ | +--------------------+ | +----------------------------+ __list_add(new, head->prev, head) +--------------------------------------------+ | +--------------------+ +-----+ | +--> | |---->| |------+ | CPU1:vbq free_list | | vb2 | +--- | |<----| |<-----+ | +--------------------+ +-----+ | +--------------------------------------------+ prev->next = next +--------------------------------------------+ |----------------------------+ | | +--------------------+ | +-----+ | +--> | |--+ | |------+ | CPU1:vbq free_list | | vb2 | +--- | |<----| |<-----+ | +--------------------+ +-----+ | +--------------------------------------------+ Here’s a list breakdown. All vbs, which were to be added to ‘prev’, cannot be used by list_for_each_entry_rcu(vb, &vbq->free, free_list) in vb_alloc(). Thus, vmalloc space is exhausted. This issue affects both erofs and f2fs, the stacktrace is as follows: erofs: [<ffffffd4ffb93ad4>] __switch_to+0x174 [<ffffffd4ffb942f0>] __schedule+0x624 [<ffffffd4ffb946f4>] schedule+0x7c [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24 [<ffffffd4ffb962ec>] __mutex_lock+0x374 [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14 [<ffffffd4ffb95954>] mutex_lock+0x24 [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44 [<ffffffd4fef25908>] alloc_vmap_area+0x2e0 [<ffffffd4fef24ea0>] vm_map_ram+0x1b0 [<ffffffd4ff1b46f4>] z_erofs_lz4_decompress+0x278 [<ffffffd4ff1b8ac4>] z_erofs_decompress_queue+0x650 [<ffffffd4ff1b8328>] z_erofs_runqueue+0x7f4 [<ffffffd4ff1b66a8>] z_erofs_read_folio+0x104 [<ffffffd4feeb6fec>] filemap_read_folio+0x6c [<ffffffd4feeb68c4>] filemap_fault+0x300 [<ffffffd4fef0ecac>] __do_fault+0xc8 [<ffffffd4fef0c908>] handle_mm_fault+0xb38 [<ffffffd4ffb9f008>] do_page_fault+0x288 [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40 [<ffffffd4fec39c78>] do_mem_abort+0x58 [<ffffffd4ffb8c3e4>] el0_ia+0x70 [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0 [<ffffffd4fec11588>] ret_to_user[jt]+0x0 f2fs: [<ffffffd4ffb93ad4>] __switch_to+0x174 [<ffffffd4ffb942f0>] __schedule+0x624 [<ffffffd4ffb946f4>] schedule+0x7c [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24 [<ffffffd4ffb962ec>] __mutex_lock+0x374 [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14 [<ffffffd4ffb95954>] mutex_lock+0x24 [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44 [<ffffffd4fef25908>] alloc_vmap_area+0x2e0 [<ffffffd4fef24ea0>] vm_map_ram+0x1b0 [<ffffffd4ff1a3b60>] f2fs_prepare_decomp_mem+0x144 [<ffffffd4ff1a6c24>] f2fs_alloc_dic+0x264 [<ffffffd4ff175468>] f2fs_read_multi_pages+0x428 [<ffffffd4ff17b46c>] f2fs_mpage_readpages+0x314 [<ffffffd4ff1785c4>] f2fs_readahead+0x50 [<ffffffd4feec3384>] read_pages+0x80 [<ffffffd4feec32c0>] page_cache_ra_unbounded+0x1a0 [<ffffffd4feec39e8>] page_cache_ra_order+0x274 [<ffffffd4feeb6cec>] do_sync_mmap_readahead+0x11c [<ffffffd4feeb6764>] filemap_fault+0x1a0 [<ffffffd4ff1423bc>] f2fs_filemap_fault+0x28 [<ffffffd4fef0ecac>] __do_fault+0xc8 [<ffffffd4fef0c908>] handle_mm_fault+0xb38 [<ffffffd4ffb9f008>] do_page_fault+0x288 [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40 [<ffffffd4fec39c78>] do_mem_abort+0x58 [<ffffffd4ffb8c3e4>] el0_ia+0x70 [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0 [<ffffffd4fec11588>] ret_to_user[jt]+0x0 To fix this, introducee cpu within vmap_block to record which this vb belongs to. Link: https://lkml.kernel.org/r/20240614021352.1822225-1-zhaoyang.huang@unisoc.com Link: https://lkml.kernel.org/r/20240607023116.1720640-1-zhaoyang.huang@unisoc.com Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks") Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Suggested-by: Hailong.Liu <hailong.liu@oppo.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-24Merge tag 'for-netdev' of ↵Jakub Kicinski
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-06-24 We've added 12 non-merge commits during the last 10 day(s) which contain a total of 10 files changed, 412 insertions(+), 16 deletions(-). The main changes are: 1) Fix a BPF verifier issue validating may_goto with a negative offset, from Alexei Starovoitov. 2) Fix a BPF verifier validation bug with may_goto combined with jump to the first instruction, also from Alexei Starovoitov. 3) Fix a bug with overrunning reservations in BPF ring buffer, from Daniel Borkmann. 4) Fix a bug in BPF verifier due to missing proper var_off setting related to movsx instruction, from Yonghong Song. 5) Silence unnecessary syzkaller-triggered warning in __xdp_reg_mem_model(), from Daniil Dulov. * tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: xdp: Remove WARN() from __xdp_reg_mem_model() selftests/bpf: Add tests for may_goto with negative offset. bpf: Fix may_goto with negative offset. selftests/bpf: Add more ring buffer test coverage bpf: Fix overrunning reservations in ringbuf selftests/bpf: Tests with may_goto and jumps to the 1st insn bpf: Fix the corner case with may_goto and jump to the 1st insn. bpf: Update BPF LSM maintainer list bpf: Fix remap of arena. selftests/bpf: Add a few tests to cover bpf: Add missed var_off setting in coerce_subreg_to_size_sx() bpf: Add missed var_off setting in set_sext32_default_val() ==================== Link: https://patch.msgid.link/20240624124330.8401-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24Merge branch 'locking-introduce-nested-bh-locking'Jakub Kicinski
Sebastian Andrzej Siewior says: ==================== locking: Introduce nested-BH locking. Disabling bottoms halves acts as per-CPU BKL. On PREEMPT_RT code within local_bh_disable() section remains preemtible. As a result high prior tasks (or threaded interrupts) will be blocked by lower-prio task (or threaded interrupts) which are long running which includes softirq sections. The proposed way out is to introduce explicit per-CPU locks for resources which are protected by local_bh_disable() and use those only on PREEMPT_RT so there is no additional overhead for !PREEMPT_RT builds. The series introduces the infrastructure and converts large parts of networking which is largest stake holder here. Once this done the per-CPU lock from local_bh_disable() on PREEMPT_RT can be lifted. Performance testing. Baseline is net-next as of commit 93bda33046e7a ("Merge branch'net-constify-ctl_table-arguments-of-utility-functions'") plus v6.10-rc1. A 10GiG link is used between two hosts. The command xdp-bench redirect-cpu --cpu 3 --remote-action drop eth1 -e was invoked on the receiving side with a ixgbe. The sending side uses pktgen_sample03_burst_single_flow.sh on i40e. Baseline: | eth1->? 9,018,604 rx/s 0 err,drop/s | receive total 9,018,604 pkt/s 0 drop/s 0 error/s | cpu:7 9,018,604 pkt/s 0 drop/s 0 error/s | enqueue to cpu 3 9,018,602 pkt/s 0 drop/s 7.00 bulk-avg | cpu:7->3 9,018,602 pkt/s 0 drop/s 7.00 bulk-avg | kthread total 9,018,606 pkt/s 0 drop/s 214,698 sched | cpu:3 9,018,606 pkt/s 0 drop/s 214,698 sched | xdp_stats 0 pass/s 9,018,606 drop/s 0 redir/s | cpu:3 0 pass/s 9,018,606 drop/s 0 redir/s | redirect_err 0 error/s | xdp_exception 0 hit/s perf top --sort cpu,symbol --no-children: | 18.14% 007 [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash | 13.29% 007 [k] ixgbe_poll | 12.66% 003 [k] cpu_map_kthread_run | 7.23% 003 [k] page_frag_free | 6.76% 007 [k] xdp_do_redirect | 3.76% 007 [k] cpu_map_redirect | 3.13% 007 [k] bq_flush_to_queue | 2.51% 003 [k] xdp_return_frame | 1.93% 007 [k] try_to_wake_up | 1.78% 007 [k] _raw_spin_lock | 1.74% 007 [k] cpu_map_enqueue | 1.56% 003 [k] bpf_prog_57cd311f2e27366b_cpumap_drop With this series applied: | eth1->? 10,329,340 rx/s 0 err,drop/s | receive total 10,329,340 pkt/s 0 drop/s 0 error/s | cpu:6 10,329,340 pkt/s 0 drop/s 0 error/s | enqueue to cpu 3 10,329,338 pkt/s 0 drop/s 8.00 bulk-avg | cpu:6->3 10,329,338 pkt/s 0 drop/s 8.00 bulk-avg | kthread total 10,329,321 pkt/s 0 drop/s 96,297 sched | cpu:3 10,329,321 pkt/s 0 drop/s 96,297 sched | xdp_stats 0 pass/s 10,329,321 drop/s 0 redir/s | cpu:3 0 pass/s 10,329,321 drop/s 0 redir/s | redirect_err 0 error/s | xdp_exception 0 hit/s perf top --sort cpu,symbol --no-children: | 20.90% 006 [k] bpf_prog_4f0ffbb35139c187_cpumap_l4_hash | 12.62% 006 [k] ixgbe_poll | 9.82% 003 [k] page_frag_free | 8.73% 003 [k] cpu_map_bpf_prog_run_xdp | 6.63% 006 [k] xdp_do_redirect | 4.94% 003 [k] cpu_map_kthread_run | 4.28% 006 [k] cpu_map_redirect | 4.03% 006 [k] bq_flush_to_queue | 3.01% 003 [k] xdp_return_frame | 1.95% 006 [k] _raw_spin_lock | 1.94% 003 [k] bpf_prog_57cd311f2e27366b_cpumap_drop This diff appears to be noise. v8: https://lore.kernel.org/all/20240619072253.504963-1-bigeasy@linutronix.de v7: https://lore.kernel.org/all/20240618072526.379909-1-bigeasy@linutronix.de v6: https://lore.kernel.org/all/20240612170303.3896084-1-bigeasy@linutronix.de v5: https://lore.kernel.org/all/20240607070427.1379327-1-bigeasy@linutronix.de v4: https://lore.kernel.org/all/20240604154425.878636-1-bigeasy@linutronix.de v3: https://lore.kernel.org/all/20240529162927.403425-1-bigeasy@linutronix.de v2: https://lore.kernel.org/all/20240503182957.1042122-1-bigeasy@linutronix.de v1: https://lore.kernel.org/all/20231215171020.687342-1-bigeasy@linutronix.de ==================== Link: https://patch.msgid.link/20240620132727.660738-1-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: Move per-CPU flush-lists to bpf_net_context on PREEMPT_RT.Sebastian Andrzej Siewior
The per-CPU flush lists, which are accessed from within the NAPI callback (xdp_do_flush() for instance), are per-CPU. There are subject to the same problem as struct bpf_redirect_info. Add the per-CPU lists cpu_map_flush_list, dev_map_flush_list and xskmap_map_flush_list to struct bpf_net_context. Add wrappers for the access. The lists initialized on first usage (similar to bpf_net_ctx_get_ri()). Cc: "Björn Töpel" <bjorn@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Yonghong Song <yonghong.song@linux.dev> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-16-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.Sebastian Andrzej Siewior
The XDP redirect process is two staged: - bpf_prog_run_xdp() is invoked to run a eBPF program which inspects the packet and makes decisions. While doing that, the per-CPU variable bpf_redirect_info is used. - Afterwards xdp_do_redirect() is invoked and accesses bpf_redirect_info and it may also access other per-CPU variables like xskmap_flush_list. At the very end of the NAPI callback, xdp_do_flush() is invoked which does not access bpf_redirect_info but will touch the individual per-CPU lists. The per-CPU variables are only used in the NAPI callback hence disabling bottom halves is the only protection mechanism. Users from preemptible context (like cpu_map_kthread_run()) explicitly disable bottom halves for protections reasons. Without locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. PREEMPT_RT has forced-threaded interrupts enabled and every NAPI-callback runs in a thread. If each thread has its own data structure then locking can be avoided. Create a struct bpf_net_context which contains struct bpf_redirect_info. Define the variable on stack, use bpf_net_ctx_set() to save a pointer to it, bpf_net_ctx_clear() removes it again. The bpf_net_ctx_set() may nest. For instance a function can be used from within NET_RX_SOFTIRQ/ net_rx_action which uses bpf_net_ctx_set() and NET_TX_SOFTIRQ which does not. Therefore only the first invocations updates the pointer. Use bpf_net_ctx_get_ri() as a wrapper to retrieve the current struct bpf_redirect_info. The returned data structure is zero initialized to ensure nothing is leaked from stack. This is done on first usage of the struct. bpf_net_ctx_set() sets bpf_redirect_info::kern_flags to 0 to note that initialisation is required. First invocation of bpf_net_ctx_get_ri() will memset() the data structure and update bpf_redirect_info::kern_flags. bpf_redirect_info::nh is excluded from memset because it is only used once BPF_F_NEIGH is set which also sets the nh member. The kern_flags is moved past nh to exclude it from memset. The pointer to bpf_net_context is saved task's task_struct. Using always the bpf_net_context approach has the advantage that there is almost zero differences between PREEMPT_RT and non-PREEMPT_RT builds. Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Yonghong Song <yonghong.song@linux.dev> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-15-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: Use nested-BH locking for bpf_scratchpad.Sebastian Andrzej Siewior
bpf_scratchpad is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Add a local_lock_t to the data structure and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-14-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24seg6: Use nested-BH locking for seg6_bpf_srh_states.Sebastian Andrzej Siewior
The access to seg6_bpf_srh_states is protected by disabling preemption. Based on the code, the entry point is input_action_end_bpf() and every other function (the bpf helper functions bpf_lwt_seg6_*()), that is accessing seg6_bpf_srh_states, should be called from within input_action_end_bpf(). input_action_end_bpf() accesses seg6_bpf_srh_states first at the top of the function and then disables preemption. This looks wrong because if preemption needs to be disabled as part of the locking mechanism then the variable shouldn't be accessed beforehand. Looking at how it is used via test_lwt_seg6local.sh then input_action_end_bpf() is always invoked from softirq context. If this is always the case then the preempt_disable() statement is superfluous. If this is not always invoked from softirq then disabling only preemption is not sufficient. Replace the preempt_disable() statement with nested-BH locking. This is not an equivalent replacement as it assumes that the invocation of input_action_end_bpf() always occurs in softirq context and thus the preempt_disable() is superfluous. Add a local_lock_t the data structure and use local_lock_nested_bh() for locking. Add lockdep_assert_held() to ensure the lock is held while the per-CPU variable is referenced in the helper functions. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: David Ahern <dsahern@kernel.org> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-13-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24lwt: Don't disable migration prio invoking BPF.Sebastian Andrzej Siewior
There is no need to explicitly disable migration if bottom halves are also disabled. Disabling BH implies disabling migration. Remove migrate_disable() and rely solely on disabling BH to remain on the same CPU. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-12-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24dev: Use nested-BH locking for softnet_data.process_queue.Sebastian Andrzej Siewior
softnet_data::process_queue is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. softnet_data::input_queue_head can be updated lockless. This is fine because this value is only update CPU local by the local backlog_napi thread. Add a local_lock_t to softnet_data and use local_lock_nested_bh() for locking of process_queue. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-11-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24dev: Remove PREEMPT_RT ifdefs from backlog_lock.*().Sebastian Andrzej Siewior
The backlog_napi locking (previously RPS) relies on explicit locking if either RPS or backlog NAPI is enabled. If both are disabled then locking was achieved by disabling interrupts except on PREEMPT_RT. PREEMPT_RT was excluded because the needed synchronisation was already provided local_bh_disable(). Since the introduction of backlog NAPI and making it mandatory for PREEMPT_RT the ifdef within backlog_lock.*() is obsolete and can be removed. Remove the ifdefs in backlog_lock.*(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-10-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: softnet_data: Make xmit per task.Sebastian Andrzej Siewior
Softirq is preemptible on PREEMPT_RT. Without a per-CPU lock in local_bh_disable() there is no guarantee that only one device is transmitting at a time. With preemption and multiple senders it is possible that the per-CPU `recursion' counter gets incremented by different threads and exceeds XMIT_RECURSION_LIMIT leading to a false positive recursion alert. The `more' member is subject to similar problems if set by one thread for one driver and wrongly used by another driver within another thread. Instead of adding a lock to protect the per-CPU variable it is simpler to make xmit per-task. Sending and receiving skbs happens always in thread context anyway. Having a lock to protected the per-CPU counter would block/ serialize two sending threads needlessly. It would also require a recursive lock to ensure that the owner can increment the counter further. Make the softnet_data.xmit a task_struct member on PREEMPT_RT. Add needed wrapper. Cc: Ben Segall <bsegall@google.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-9-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24netfilter: br_netfilter: Use nested-BH locking for brnf_frag_data_storage.Sebastian Andrzej Siewior
brnf_frag_data_storage is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Add a local_lock_t to the data structure and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Cc: Florian Westphal <fw@strlen.de> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: bridge@lists.linux.dev Cc: coreteam@netfilter.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-8-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net/ipv4: Use nested-BH locking for ipv4_tcp_sk.Sebastian Andrzej Siewior
ipv4_tcp_sk is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Make a struct with a sock member (original ipv4_tcp_sk) and a local_lock_t and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Cc: David Ahern <dsahern@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-7-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net/tcp_sigpool: Use nested-BH locking for sigpool_scratch.Sebastian Andrzej Siewior
sigpool_scratch is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Make a struct with a pad member (original sigpool_scratch) and a local_lock_t and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Cc: David Ahern <dsahern@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-6-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: Use nested-BH locking for napi_alloc_cache.Sebastian Andrzej Siewior
napi_alloc_cache is a per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Add a local_lock_t to the data structure and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-5-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24net: Use __napi_alloc_frag_align() instead of open coding it.Sebastian Andrzej Siewior
The else condition within __netdev_alloc_frag_align() is an open coded __napi_alloc_frag_align(). Use __napi_alloc_frag_align() instead of open coding it. Move fragsz assignment before page_frag_alloc_align() invocation because __napi_alloc_frag_align() also contains this statement. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-4-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24locking/local_lock: Add local nested BH locking infrastructure.Sebastian Andrzej Siewior
Add local_lock_nested_bh() locking. It is based on local_lock_t and the naming follows the preempt_disable_nested() example. For !PREEMPT_RT + !LOCKDEP it is a per-CPU annotation for locking assumptions based on local_bh_disable(). The macro is optimized away during compilation. For !PREEMPT_RT + LOCKDEP the local_lock_nested_bh() is reduced to the usual lock-acquire plus lockdep_assert_in_softirq() - ensuring that BH is disabled. For PREEMPT_RT local_lock_nested_bh() acquires the specified per-CPU lock. It does not disable CPU migration because it relies on local_bh_disable() disabling CPU migration. With LOCKDEP it performans the usual lockdep checks as with !PREEMPT_RT. Due to include hell the softirq check has been moved spinlock.c. The intention is to use this locking in places where locking of a per-CPU variable relies on BH being disabled. Instead of treating disabled bottom halves as a big per-CPU lock, PREEMPT_RT can use this to reduce the locking scope to what actually needs protecting. A side effect is that it also documents the protection scope of the per-CPU variables. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-3-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24locking/local_lock: Introduce guard definition for local_lock.Sebastian Andrzej Siewior
Introduce lock guard definition for local_lock_t. There are no users yet. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20240620132727.660738-2-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-24Merge tag 'input-for-v6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - fixes for ili210x and elantech drivers - new products IDs added to xpad controller driver - a tweak to i8042 driver to always keep keyboard in Ayaneo Kun handheld in raw mode - populated "id_table" in ads7846 touchscreen driver to make sure non-OF instantiated devices can properly determine the model data. * tag 'input-for-v6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - use spi_device_id table Input: xpad - add support for ASUS ROG RAIKIRI PRO Input: ili210x - fix ili251x_read_touch_data() return value Input: i8042 - add Ayaneo Kun to i8042 quirk table Input: elantech - fix touchpad state on resume for Lenovo N24
2024-06-24Merge tag 'pinctrl-v6.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Use flag saving spinlocks in the Renesas rzg2l driver. This fixes up PREEMPT_RT problems. - Remove broken Qualcomm PM8008 that clearly was never working. A new version will arrive in the next merge window. - Add a quirk for LP8764 regmap that was missed and made the TI J7200 board unusable. - Fix persistance on the BCM2835 GPIO outputs kernel parameter so this remains consisten across a booted kernel. - Fix a potential deadlock in create_pinctrl() - Fix some erroneous bitfields and pinmux reset in the Rockchip RK3328 driver. * tag 'pinctrl-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: rockchip: fix pinmux reset in rockchip_pmx_set pinctrl: rockchip: use dedicated pinctrl type for RK3328 pinctrl: rockchip: fix pinmux bits for RK3328 GPIO3-B pins pinctrl: rockchip: fix pinmux bits for RK3328 GPIO2-B pins pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER pinctrl: bcm2835: Fix permissions of persist_gpio_outputs pinctrl: tps6594: add missing support for LP8764 PMIC dt-bindings: pinctrl: qcom,pmic-gpio: drop pm8008 pinctrl: qcom: spmi-gpio: drop broken pm8008 support pinctrl: renesas: rzg2l: Use spin_{lock,unlock}_irq{save,restore}
2024-06-24netfilter: fix undefined reference to 'netfilter_lwtunnel_*' when ↵Jianguo Wu
CONFIG_SYSCTL=n if CONFIG_SYSFS is not enabled in config, we get the below compile error, All errors (new ones prefixed by >>): csky-linux-ld: net/netfilter/core.o: in function `netfilter_init': core.c:(.init.text+0x42): undefined reference to `netfilter_lwtunnel_init' >> csky-linux-ld: core.c:(.init.text+0x56): undefined reference to `netfilter_lwtunnel_fini' >> csky-linux-ld: core.c:(.init.text+0x70): undefined reference to `netfilter_lwtunnel_init' csky-linux-ld: core.c:(.init.text+0x78): undefined reference to `netfilter_lwtunnel_fini' Fixes: a2225e0250c5 ("netfilter: move the sysctl nf_hooks_lwtunnel into the netfilter core") Reported-by: Mirsad Todorovac <mtodorovac69@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202406210511.8vbByYj3-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202406210520.6HmrUaA2-lkp@intel.com/ Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-06-24ASoC: mediatek: mt8195: Add platform entry for ETDM1_OUT_BE dai linkChen-Yu Tsai
Commit e70b8dd26711 ("ASoC: mediatek: mt8195: Remove afe-dai component and rework codec link") removed the codec entry for the ETDM1_OUT_BE dai link entirely instead of replacing it with COMP_EMPTY(). This worked by accident as the remaining COMP_EMPTY() platform entry became the codec entry, and the platform entry became completely empty, effectively the same as COMP_DUMMY() since snd_soc_fill_dummy_dai() doesn't do anything for platform entries. This causes a KASAN out-of-bounds warning in mtk_soundcard_common_probe() in sound/soc/mediatek/common/mtk-soundcard-driver.c: for_each_card_prelinks(card, i, dai_link) { if (adsp_node && !strncmp(dai_link->name, "AFE_SOF", strlen("AFE_SOF"))) dai_link->platforms->of_node = adsp_node; else if (!dai_link->platforms->name && !dai_link->platforms->of_node) dai_link->platforms->of_node = platform_node; } where the code expects the platforms array to have space for at least one entry. Add an COMP_EMPTY() entry so that dai_link->platforms has space. Fixes: e70b8dd26711 ("ASoC: mediatek: mt8195: Remove afe-dai component and rework codec link") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20240624061257.3115467-1-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-24xdp: Remove WARN() from __xdp_reg_mem_model()Daniil Dulov
syzkaller reports a warning in __xdp_reg_mem_model(). The warning occurs only if __mem_id_init_hash_table() returns an error. It returns the error in two cases: 1. memory allocation fails; 2. rhashtable_init() fails when some fields of rhashtable_params struct are not initialized properly. The second case cannot happen since there is a static const rhashtable_params struct with valid fields. So, warning is only triggered when there is a problem with memory allocation. Thus, there is no sense in using WARN() to handle this error and it can be safely removed. WARNING: CPU: 0 PID: 5065 at net/core/xdp.c:299 __xdp_reg_mem_model+0x2d9/0x650 net/core/xdp.c:299 CPU: 0 PID: 5065 Comm: syz-executor883 Not tainted 6.8.0-syzkaller-05271-gf99c5f563c17 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:__xdp_reg_mem_model+0x2d9/0x650 net/core/xdp.c:299 Call Trace: xdp_reg_mem_model+0x22/0x40 net/core/xdp.c:344 xdp_test_run_setup net/bpf/test_run.c:188 [inline] bpf_test_run_xdp_live+0x365/0x1e90 net/bpf/test_run.c:377 bpf_prog_test_run_xdp+0x813/0x11b0 net/bpf/test_run.c:1267 bpf_prog_test_run+0x33a/0x3b0 kernel/bpf/syscall.c:4240 __sys_bpf+0x48d/0x810 kernel/bpf/syscall.c:5649 __do_sys_bpf kernel/bpf/syscall.c:5738 [inline] __se_sys_bpf kernel/bpf/syscall.c:5736 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Found by Linux Verification Center (linuxtesting.org) with syzkaller. Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping") Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/all/20240617162708.492159-1-d.dulov@aladdin.ru Link: https://lore.kernel.org/bpf/20240624080747.36858-1-d.dulov@aladdin.ru
2024-06-24selftests/bpf: Add tests for may_goto with negative offset.Alexei Starovoitov
Add few tests with may_goto and negative offset. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240619235355.85031-2-alexei.starovoitov@gmail.com
2024-06-24bpf: Fix may_goto with negative offset.Alexei Starovoitov
Zac's syzbot crafted a bpf prog that exposed two bugs in may_goto. The 1st bug is the way may_goto is patched. When offset is negative it should be patched differently. The 2nd bug is in the verifier: when current state may_goto_depth is equal to visited state may_goto_depth it means there is an actual infinite loop. It's not correct to prune exploration of the program at this point. Note, that this check doesn't limit the program to only one may_goto insn, since 2nd and any further may_goto will increment may_goto_depth only in the queued state pushed for future exploration. The current state will have may_goto_depth == 0 regardless of number of may_goto insns and the verifier has to explore the program until bpf_exit. Fixes: 011832b97b31 ("bpf: Introduce may_goto instruction") Reported-by: Zac Ecob <zacecob@protonmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Closes: https://lore.kernel.org/bpf/CAADnVQL-15aNp04-cyHRn47Yv61NXfYyhopyZtUyxNojUZUXpA@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20240619235355.85031-1-alexei.starovoitov@gmail.com
2024-06-24selftests/bpf: Add more ring buffer test coverageDaniel Borkmann
Add test coverage for reservations beyond the ring buffer size in order to validate that bpf_ringbuf_reserve() rejects the request with NULL, all other ring buffer tests keep passing as well: # ./vmtest.sh -- ./test_progs -t ringbuf [...] ./test_progs -t ringbuf [ 1.165434] bpf_testmod: loading out-of-tree module taints kernel. [ 1.165825] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.284001] tsc: Refined TSC clocksource calibration: 3407.982 MHz [ 1.286871] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc34e357, max_idle_ns: 440795379773 ns [ 1.289555] clocksource: Switched to clocksource tsc #274/1 ringbuf/ringbuf:OK #274/2 ringbuf/ringbuf_n:OK #274/3 ringbuf/ringbuf_map_key:OK #274/4 ringbuf/ringbuf_write:OK #274 ringbuf:OK #275 ringbuf_multi:OK [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> [ Test fixups for getting BPF CI back to work ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240621140828.18238-2-daniel@iogearbox.net
2024-06-24MAINTAINERS: adjust file entry in FREESCALE QORIQ DPAA FMAN DRIVERLukas Bulwahn
Commit 243996d172a6 ("dt-bindings: net: Convert fsl-fman to yaml") splits the previous dt text file into four yaml files. It adjusts a corresponding file entry in MAINTAINERS from txt to yaml, but this adjustment misses that the file was split and renamed. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a broken reference. Adjust the file entry to match the four yaml files resulting from this commit above. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-24net: usb: ax88179_178a: improve link status logsJose Ignacio Tornos Martinez
Avoid spurious link status logs that may ultimately be wrong; for example, if the link is set to down with the cable plugged, then the cable is unplugged and after this the link is set to up, the last new log that is appearing is incorrectly telling that the link is up. In order to avoid errors, show link status logs after link_reset processing, and in order to avoid spurious as much as possible, only show the link loss when some link status change is detected. cc: stable@vger.kernel.org Fixes: e2ca90c276e1 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver") Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23Linux 6.10-rc5v6.10-rc5Linus Torvalds
2024-06-23octeontx2-pf: Fix coverity and klockwork issues in octeon PF driverRatheesh Kannoth
Fix unintended sign extension and klockwork issues. These are not real issue but for sanity checks. Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23Merge tag 'i2c-for-6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "The core gains placeholders for recently added functions when CONFIG_I2C is not defined as well documentation fixes to start using inclusive terminology. The drivers get paths in DT bindings fixed as well as proper interrupt handling for the ocores driver" * tag 'i2c-for-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' pairs docs: i2c: summary: document 'local' and 'remote' targets docs: i2c: summary: document use of inclusive language docs: i2c: summary: update speed mode description docs: i2c: summary: update I2C specification link docs: i2c: summary: start sentences consistently. i2c: Add nop fwnode operations i2c: ocores: set IACK bit after core is enabled dt-bindings: i2c: google,cros-ec-i2c-tunnel: correct path to i2c-controller schema dt-bindings: i2c: atmel,at91sam: correct path to i2c-controller schema
2024-06-23Merge tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Five smb3 client fixes - three nets/fiolios cifs fixes - fix typo in module parameters description - fix incorrect swap warning" * tag '6.10-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Move the 'pid' from the subreq to the req cifs: Only pick a channel once per read request cifs: Defer read completion cifs: fix typo in module parameter enable_gcm_256 cifs: drop the incorrect assertion in cifs_swap_rw()
2024-06-23Merge tag 'fixes-2024-06-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix fragility in checks for unset node ID. Use numa_valid_node() function to verify that nid is a valid node ID instead of inconsistent comparisons with either NUMA_NO_NODE or MAX_NUMNODES" * tag 'fixes-2024-06-23' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock: use numa_valid_node() helper to check for invalid node ID
2024-06-23Merge tag 'mips-fixes_6.10_2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix lseek in o32 compat mode - fix for microMIPS MT ASE helpers * tag 'mips-fixes_6.10_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: fix compat_sys_lseek syscall MIPS: mipsmtregs: Fix target register for MFTC0
2024-06-23Merge tag 'x86_urgent_for_v6.10_rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - An ARM-relevant fix to not free default RMIDs of a resource control group - A randconfig build fix for the VMware virtual GPU driver * tag 'x86_urgent_for_v6.10_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Don't try to free nonexistent RMIDs drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency
2024-06-23Merge tag 'powerpc-6.10-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Prevent use-after-free in 64-bit KVM VFIO - Add generated Power8 crypto asm to .gitignore Thanks to Al Viro and Nathan Lynch. * tag 'powerpc-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV: Prevent UAF in kvm_spapr_tce_attach_iommu_group() powerpc/crypto: Add generated P8 asm to .gitignore
2024-06-23ice: Rebuild TC queues on VSI queue reconfigurationJan Sokolowski
TC queues needs to be correctly updated when the number of queues on a VSI is reconfigured, so netdev's queue and TC settings will be dynamically adjusted and could accurately represent the underlying hardware state after changes to the VSI queue counts. Fixes: 0754d65bd4be ("ice: Add infrastructure for mqprio support via ndo_setup_tc") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23dt-bindings: net: fman: remove ptp-timer from required listFrank Li
IEEE1588(ptp) is optional feature for network. Remove it from required list to fix below CHECK_DTBS warning. arch/arm64/boot/dts/freescale/fsl-ls1043a-qds.dtb: ethernet@f0000: 'ptp-timer' is a required property Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: prepare representor for SF support Michal Swiatkowski says: This is a series to prepare port representor for supporting also subfunctions. We need correct devlink locking and the possibility to update parent VSI after port representor is created. Refactor how devlink lock is taken to suite the subfunction use case. VSI configuration needs to be done after port representor is created. Port representor needs only allocated VSI. It doesn't need to be configured before. VSI needs to be reconfigured when update function is called. The code for this patchset was split from (too big) patchset [1]. [1] https://lore.kernel.org/netdev/20240213072724.77275-1-michal.swiatkowski@linux.intel.com/ --- Originally from https://lore.kernel.org/netdev/20240605-next-2024-06-03-intel-next-batch-v2-0-39c23963fa78@intel.com/ Changes: - delete ice_repr_get_by_vsi() from header - rephrase commit message in moving devlink locking ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23Merge branch 'phy-microchip-ksz-9897-errata'David S. Miller
Enguerrand de Ribaucourt says: ==================== Handle new Microchip KSZ 9897 Errata These patches implement some suggested workarounds from the Microchip KSZ 9897 Errata [1]. [1] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9897R-Errata-DS80000758.pdf --- v7: - use dev_crit_once instead of dev_crit_ratelimited - add a comment to help users understand the consequences of half-duplex errors v6: https://lore.kernel.org/netdev/20240614094642.122464-1-enguerrand.de-ribaucourt@savoirfairelinux.com/ - remove KSZ9897 phy_id workaround (was a configuration issue) - use macros for checking link down in monitoring function - check if VLAN is enabled before monitoring resources v5: https://lore.kernel.org/all/20240604092304.314636-1-enguerrand.de-ribaucourt@savoirfairelinux.com/ - use macros for bitfields - rewrap comments - check ksz_pread* return values - fix spelling mistakes - remove KSZ9477 suspend/resume deletion patch v4: https://lore.kernel.org/all/20240531142430.678198-1-enguerrand.de-ribaucourt@savoirfairelinux.com/ - Rebase on net/main - Add Fixes: tags to the patches - reverse x-mas tree order - use pseudo phy_id instead of match_phy_device v3: https://lore.kernel.org/all/20240530102436.226189-1-enguerrand.de-ribaucourt@savoirfairelinux.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23net: dsa: microchip: monitor potential faults in half-duplex modeEnguerrand de Ribaucourt
The errata DS80000754 recommends monitoring potential faults in half-duplex mode for the KSZ9477 family. half-duplex is not very common so I just added a critical message when the fault conditions are detected. The switch can be expected to be unable to communicate anymore in these states and a software reset of the switch would be required which I did not implement. Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23net: dsa: microchip: use collision based back pressure modeEnguerrand de Ribaucourt
Errata DS80000758 states that carrier sense back pressure mode can cause link down issues in 100BASE-TX half duplex mode. The datasheet also recommends to always use the collision based back pressure mode. Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com> Reviewed-by: Woojung Huh <Woojung.huh@microchip.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23net: phy: micrel: add Microchip KSZ 9477 to the device tableEnguerrand de Ribaucourt
PHY_ID_KSZ9477 was supported but not added to the device table passed to MODULE_DEVICE_TABLE. Fixes: fc3973a1fa09 ("phy: micrel: add Microchip KSZ 9477 Switch PHY support") Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23netlink: specs: Fix pse-set command attributesKory Maincent
Not all PSE attributes are used for the pse-set netlink command. Select only the ones used by ethtool. Fixes: f8586411e40e ("netlink: specs: Expand the pse netlink command with PoE interface") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-23Merge tag 'i2c-host-fixes-6.10-rc5' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current This pull request fixes the paths of the dt-schema to their complete locations for the ChromeOS EC tunnel driver and the Atmel at91sam drivers. Additionally, the OpenCores driver receives a fix for an issue that dates back to version 2.6.18. Specifically, the interrupts need to be acknowledged (clearing all pending interrupts) after enabling the core.
2024-06-22Merge tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linuxLinus Torvalds
Pull rust fix from Miguel Ojeda: - Avoid unused import warning in 'rusttest'. * tag 'rust-fixes-6.10' of https://github.com/Rust-for-Linux/linux: rust: avoid unused import warning in `rusttest`
2024-06-22Merge tag 'regulator-fix-v6.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A few driver specific fixes for incorrect device descriptions, plus a fix for a missing symbol export which causes build failures for some newly added drivers in other trees" * tag 'regulator-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: axp20x: AXP717: fix LDO supply rails and off-by-ones regulator: bd71815: fix ramp values regulator: core: Fix modpost error "regulator_get_regmap" undefined regulator: tps6594-regulator: Fix the number of irqs for TPS65224 and TPS6594
2024-06-22Merge tag 'spi-fix-v6.10-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A number of fixes that have built up for SPI, a bunch of driver specific ones including an unfortunate revert of an optimisation for the i.MX driver which was causing issues with some configurations, plus a couple of core fixes for the rarely used octal mode and for a bad interaction between multi-CS support and target mode" * tag 'spi-fix-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-imx: imx51: revert burst length calculation back to bits_per_word spi: Fix SPI slave probe failure spi: Fix OCTAL mode support spi: stm32: qspi: Clamp stm32_qspi_get_mode() output to CCR_BUSWIDTH_4 spi: stm32: qspi: Fix dual flash mode sanity test in stm32_qspi_setup() spi: cs42l43: Drop cs35l56 SPI speed down to 11MHz spi: cs42l43: Correct SPI root clock speed
2024-06-22Merge tag 'nfsd-6.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix crashes triggered by administrative operations on the server * tag 'nfsd-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: grab nfsd_mutex in nfsd_nl_rpc_status_get_dumpit() nfsd: fix oops when reading pool_stats before server is started
2024-06-22Merge tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Chandan Babu: - Fix assertion failure due to a race between unlink and cluster buffer instantiation. * tag 'xfs-6.10-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix unlink vs cluster buffer instantiation race