path: root/kernel
Age | Commit message | Author
2023-04-05 | sched/psi: Rearrange polling code in preparation | Domenico Cerasuolo
Move a few functions up in the file to avoid forward declaration needed in the patch implementing unprivileged PSI triggers. Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20230330105418.77061-2-cerasuolodomenico@gmail.com
2023-04-05 | sched/fair: Fix inaccurate tally of ttwu_move_affine | Libo Chen
There are scenarios where non-affine wakeups are incorrectly counted as affine wakeups by schedstats. When wake_affine_idle() returns prev_cpu, which is not equal to nr_cpumask_bits, it slips through the check target == nr_cpumask_bits in wake_affine() and is counted as if target == this_cpu in schedstats. Replace target == nr_cpumask_bits with target != this_cpu to make sure affine wakeups are accurately tallied. Fixes: 806486c377e33 (sched/fair: Do not migrate if the prev_cpu is idle) Suggested-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Libo Chen <libo.chen@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Link: https://lore.kernel.org/r/20220810223313.386614-1-libo.chen@oracle.com
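The difference between the two checks can be sketched in userspace C. This is only an illustration under stated assumptions: `NO_CPU`, `struct affine_stats`, `tally_old()` and `tally_new()` are hypothetical names that model the tail of wake_affine(), not the kernel's actual code.

```c
/* Hypothetical sentinel standing in for nr_cpumask_bits ("no CPU chosen"). */
enum { NO_CPU = 64 };

/* Hypothetical counters modelling the schedstats involved. */
struct affine_stats {
	unsigned int attempts;     /* every call counts as an affine attempt */
	unsigned int move_affine;  /* wakeups actually pulled to this_cpu */
};

/* Old check: only the sentinel is treated as "not affine". */
static int tally_old(struct affine_stats *st, int target, int this_cpu, int prev_cpu)
{
	(void)this_cpu;
	st->attempts++;
	if (target == NO_CPU)
		return prev_cpu;
	st->move_affine++;         /* also reached when target == prev_cpu */
	return target;
}

/* New check: anything other than this_cpu is a non-affine wakeup. */
static int tally_new(struct affine_stats *st, int target, int this_cpu, int prev_cpu)
{
	st->attempts++;
	if (target != this_cpu)
		return prev_cpu;
	st->move_affine++;
	return target;
}
```

With target == prev_cpu, both variants return prev_cpu, but only the old one bumps the affine counter.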
2023-04-05 | perf/core: Fix the same task check in perf_event_set_output | Kan Liang
The same-task check in perf_event_set_output() has some potential issues for some usages.

With the current perf code, there is a problem when using perf_event_open() to have multiple samples go into the same mmap’d memory when they are both attached to the same process.
https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/
This is because event->ctx is not ready when perf_event_set_output() is invoked in perf_event_open().

Besides the above issue, before commit bd2756811766 ("perf: Rewrite core context handling"), perf record can error out when sampling with a hardware event and a software event, as below.

  $ perf record -e cycles,dummy --per-thread ls
  failed to mmap with 22 (Invalid argument)

That is because, prior to that commit, a hardware event and a software event are from different task contexts.

The problem has likely existed for a long time, since commit c3f00c70276d ("perf: Separate find_get_context() from event initialization").

The task struct is stored in event->hw.target for each per-thread event. It is a more reliable way to determine whether two events are attached to the same task.

event->hw.target was also introduced several years ago by commit 50f16a8bf9d7 ("perf: Remove type specific target pointers"). It can be used not only to fix the issue with the current code, but also to backport the fix to older kernels.

Note: event->hw.target was introduced later than commit c3f00c70276d. The patch may not apply between commit c3f00c70276d and commit 50f16a8bf9d7. Anybody who wants to backport this to that period may have to find other solutions.

Fixes: c3f00c70276d ("perf: Separate find_get_context() from event initialization")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.com
2023-04-05 | perf: Optimize perf_pmu_migrate_context() | Peter Zijlstra
Thomas reported that offlining CPUs spends a lot of time in synchronize_rcu() as called from perf_pmu_migrate_context() even though he's not actually using uncore events. Turns out, the thing is unconditionally waiting for RCU, even if there's no actual events to migrate. Fixes: 0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()") Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lkml.kernel.org/r/20230403090858.GT4253@hirez.programming.kicks-ass.net
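The idea of skipping the grace-period wait when nothing needs migrating might look roughly like the sketch below. It is a userspace model only: `struct event`, `fake_synchronize_rcu()` and `migrate_events()` are hypothetical stand-ins, and the real function works on kernel list_head lists and perf contexts.

```c
#include <stddef.h>

/* Hypothetical minimal event list; the kernel uses struct list_head. */
struct event {
	struct event *next;
	int cpu;
};

static unsigned int rcu_waits;  /* counts simulated synchronize_rcu() calls */

/* Stand-in for synchronize_rcu(): here it only records that we waited. */
static void fake_synchronize_rcu(void)
{
	rcu_waits++;
}

/* Move every event on 'src' to 'dst_cpu'; return how many were moved. */
static int migrate_events(struct event *src, int dst_cpu)
{
	int moved = 0;

	if (!src)               /* nothing to migrate: skip the expensive wait */
		return 0;

	fake_synchronize_rcu(); /* wait for users of the old context */
	for (struct event *e = src; e; e = e->next) {
		e->cpu = dst_cpu;
		moved++;
	}
	return moved;
}
```

The design point is simply that the wait is paid only when there is at least one event whose old context references must drain.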
2023-04-04 | tracing: Fix ftrace_boot_snapshot command line logic | Steven Rostedt (Google)
The kernel command line ftrace_boot_snapshot by itself is supposed to trigger a snapshot at the end of boot up of the main top level trace buffer. A ftrace_boot_snapshot=foo will do the same for an instance called foo that was created by trace_instance=foo,... The logic was broken where if ftrace_boot_snapshot was by itself, it would trigger a snapshot for all instances that had tracing enabled, regardless if it asked for a snapshot or not. When a snapshot is requested for a buffer, the buffer's tr->allocated_snapshot is set to true. Use that to know if a trace buffer wants a snapshot at boot up or not. Since the top level buffer is part of the ftrace_trace_arrays list, there's no reason to treat it differently than the other buffers. Just iterate the list if ftrace_boot_snapshot was specified. Link: https://lkml.kernel.org/r/20230405022341.895334039@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ross Zwisler <zwisler@google.com> Fixes: 9c1c251d670bc ("tracing: Allow boot instances to have snapshot buffers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
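The "iterate the list and use allocated_snapshot" logic described above can be modelled as follows. This is a sketch, assuming a trimmed-down `struct trace_array` with a hypothetical `snapshot_taken` flag and a plain singly linked list in place of ftrace_trace_arrays.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical trimmed-down trace_array; only the fields the sketch needs. */
struct trace_array {
	const char *name;          /* NULL for the top-level buffer */
	bool allocated_snapshot;   /* set when a boot snapshot was requested */
	bool snapshot_taken;       /* hypothetical: records what the sketch did */
	struct trace_array *next;
};

/* Walk every buffer, top level included, and snapshot only those that asked. */
static int boot_snapshot_all(struct trace_array *list)
{
	int taken = 0;

	for (struct trace_array *tr = list; tr; tr = tr->next) {
		if (!tr->allocated_snapshot)
			continue;
		tr->snapshot_taken = true;
		taken++;
	}
	return taken;
}
```

Because the top-level buffer is just another list entry, no special case is needed for it.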
2023-04-04 | tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance | Steven Rostedt (Google)
If a trace instance has a failure with its snapshot code, the error message is to be written to that instance's buffer. But currently, the message is written to the top level buffer. Worse yet, it may also disable the top level buffer and not the instance that had the issue. Link: https://lkml.kernel.org/r/20230405022341.688730321@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ross Zwisler <zwisler@google.com> Fixes: 2824f50332486 ("tracing: Make the snapshot trigger work with instances") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-04 | kallsyms: Disable preemption for find_kallsyms_symbol_value | Jiri Olsa
Artem reported suspicious RCU usage [1]. The reason is that the verifier calls find_kallsyms_symbol_value() with preemption enabled, which triggers a suspicious RCU usage warning in the rcu_dereference_sched() call. Fix this by disabling preemption in find_kallsyms_symbol_value() and adding a __find_kallsyms_symbol_value() function. Fixes: 31bf1dbccfb0 ("bpf: Fix attaching fentry/fexit/fmod_ret/lsm to modules") Reported-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/bpf/20230403220254.2191240-1-jolsa@kernel.org [1] https://lore.kernel.org/bpf/ZBrPMkv8YVRiWwCR@samus.usersys.redhat.com/
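The wrapper pattern (a double-underscore inner function that expects preemption to be off, plus a public wrapper that turns it off) can be illustrated in userspace. Everything here is an assumption made for the sketch: `preempt_depth`, the `fake_preempt_*()` helpers, the symbol table and both lookup functions are hypothetical, and the kernel's real preemption and RCU machinery is not modelled.

```c
#include <string.h>

static int preempt_depth;  /* hypothetical stand-in for the preempt count */

static void fake_preempt_disable(void) { preempt_depth++; }
static void fake_preempt_enable(void)  { preempt_depth--; }

struct sym { const char *name; unsigned long addr; };

/* Hypothetical symbol table with made-up addresses. */
static const struct sym symtab[] = {
	{ "init_module", 0x1000 },
	{ "cleanup_module", 0x1040 },
};

/* Inner lookup: the caller must already have preemption disabled. */
static unsigned long __find_symbol_value(const char *name)
{
	if (preempt_depth <= 0)
		return 0;  /* models the condition the RCU warning flags */
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (strcmp(symtab[i].name, name) == 0)
			return symtab[i].addr;
	return 0;
}

/* Public wrapper: safe to call from preemptible context. */
static unsigned long find_symbol_value(const char *name)
{
	unsigned long ret;

	fake_preempt_disable();
	ret = __find_symbol_value(name);
	fake_preempt_enable();
	return ret;
}
```

Callers that already run with preemption disabled can use the inner function directly; everyone else goes through the wrapper.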
2023-04-04 | bpf: Undo strict enforcement for walking untagged fields. | Alexei Starovoitov
The commit 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.") broke several tracing bpf programs. Even in clang compiled kernels there are many fields that are not marked with __rcu that are safe to read and pass into helpers, but the verifier doesn't know that they're safe. Aggressively marking them as PTR_UNTRUSTED was premature. Fixes: 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-8-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Allowlist few fields similar to __rcu tag. | Alexei Starovoitov
Allow bpf programs to access cgrp->kn, mm->exe_file, skb->sk, req->sk. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-7-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Refactor NULL-ness check in check_reg_type(). | Alexei Starovoitov
check_reg_type() unconditionally disallows PTR_TO_BTF_ID | PTR_MAYBE_NULL. It's problematic for helpers that allow ARG_PTR_TO_BTF_ID_OR_NULL like bpf_sk_storage_get(). Allow passing PTR_TO_BTF_ID | PTR_MAYBE_NULL into such helpers. That technically includes bpf_kptr_xchg() helper, but in practice: bpf_kptr_xchg(..., bpf_cpumask_create()); is still disallowed because bpf_cpumask_create() returns ref counted pointer with ref_obj_id > 0. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-6-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Teach verifier that certain helpers accept NULL pointer. | Alexei Starovoitov
bpf_[sk|inode|task|cgrp]_storage_[get|delete]() and bpf_get_socket_cookie() helpers perform run-time check that sk|inode|task|cgrp pointer != NULL. Teach verifier about this fact and allow bpf programs to pass PTR_TO_BTF_ID | PTR_MAYBE_NULL into such helpers. It will be used in the subsequent patch that will do bpf_sk_storage_get(.., skb->sk, ...); Even when 'skb' pointer is trusted the 'sk' pointer may be NULL. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-5-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Refactor btf_nested_type_is_trusted(). | Alexei Starovoitov
btf_nested_type_is_trusted() tries to find a struct member at corresponding offset. It works for flat structures and falls apart in more complex structs with nested structs. The offset->member search is already performed by btf_struct_walk() including nested structs. Reuse this work and pass {field name, field btf id} into btf_nested_type_is_trusted() instead of offset to make BTF_TYPE_SAFE*() logic more robust. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-4-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Remove unused arguments from btf_struct_access(). | Alexei Starovoitov
Remove unused arguments from btf_struct_access() callback. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-3-alexei.starovoitov@gmail.com
2023-04-04 | bpf: Invoke btf_struct_access() callback only for writes. | Alexei Starovoitov
Remove duplicated if (atype == BPF_READ) btf_struct_access() from btf_struct_access() callback and invoke it only for writes. This is possible to do because currently btf_struct_access() custom callback always delegates to generic btf_struct_access() helper for BPF_READ accesses. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230404045029.82870-2-alexei.starovoitov@gmail.com
2023-04-04 | srcu: Fix long lines in srcu_funnel_gp_start() | Paul E. McKenney
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Fix long lines in srcu_gp_end() | Paul E. McKenney
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Fix long lines in cleanup_srcu_struct() | Paul E. McKenney
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Fix long lines in srcu_get_delay() | Paul E. McKenney
This commit creates an srcu_usage pointer named "sup" as a shorter synonym for the "ssp->srcu_sup" that was bloating several lines of code. Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Cc: Christoph Hellwig <hch@lst.de> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Check for readers at module-exit time | Paul E. McKenney
If a given statically allocated in-module srcu_struct structure was ever used for updates, srcu_module_going() will invoke cleanup_srcu_struct() at module-exit time. This will check for the error case of SRCU readers persisting past module-exit time. On the other hand, if this srcu_struct structure never went through a grace period, srcu_module_going() only invokes free_percpu(), which would result in strange failures if SRCU readers persisted past module-exit time. This commit therefore adds a srcu_readers_active() check to srcu_module_going(), splatting if readers have persisted and refraining from invoking free_percpu() in that case. Better to leak memory than to suffer silent memory corruption! [ paulmck: Apply Zhang, Qiang1 feedback on memory leak. ] Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
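The "leak rather than corrupt" decision can be sketched as below. This is a userspace illustration under loud assumptions: `struct fake_srcu`, its two counters, `readers_active()` and `module_going_cleanup()` are hypothetical simplifications, and the real srcu_readers_active() sums per-CPU counters rather than comparing two fields.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified srcu_struct for illustration only. */
struct fake_srcu {
	long lock_count;    /* total read-side lock operations */
	long unlock_count;  /* total read-side unlock operations */
	void *percpu_data;  /* stands in for the per-CPU srcu_data */
};

/* Readers are still active if some lock has no matching unlock. */
static bool readers_active(const struct fake_srcu *ssp)
{
	return ssp->lock_count != ssp->unlock_count;
}

/*
 * Module-exit cleanup: returns true if the per-CPU data was freed,
 * false if it was deliberately leaked because readers persisted.
 */
static bool module_going_cleanup(struct fake_srcu *ssp)
{
	if (readers_active(ssp))
		return false;  /* the kernel would splat here; leak, don't corrupt */
	free(ssp->percpu_data);
	ssp->percpu_data = NULL;
	return true;
}
```

Leaking a small allocation is a visible, bounded cost, whereas freeing memory that a reader still uses fails silently and much later.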
2023-04-04 | srcu: Move work-scheduling fields from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->reschedule_jiffies, ->reschedule_count, and ->work fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. However, this means that the container_of() calls cannot get a pointer to the srcu_struct because they are no longer in the srcu_struct. This issue is addressed by adding a ->srcu_ssp field in the srcu_usage structure that references the corresponding srcu_struct structure. And given the presence of the sup pointer to the srcu_usage structure, replace some ssp->srcu_usage-> instances with sup->. [ paulmck Apply feedback from kernel test robot. ] Link: https://lore.kernel.org/oe-kbuild-all/202303191400.iO5BOqka-lkp@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
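The container_of() problem and the back-pointer fix can be shown with a small userspace re-creation. The struct names ending in `_sketch`, `struct work` and `owner_of_work()` are hypothetical; only the ->srcu_ssp back-pointer idea is taken from the commit message.

```c
#include <stddef.h>

/* Userspace re-creation of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a work item. */
struct work { int pending; };

struct srcu_struct_sketch;

/* Update-side state, allocated separately from the reader-side structure. */
struct srcu_usage_sketch {
	unsigned long reschedule_count;
	struct work work;
	struct srcu_struct_sketch *srcu_ssp;  /* back pointer to the owner */
};

/* Reader-side structure, kept small. */
struct srcu_struct_sketch {
	int id;
	struct srcu_usage_sketch *srcu_sup;
};

/* Work handler: recover the usage structure from the work item, then the owner. */
static struct srcu_struct_sketch *owner_of_work(struct work *wp)
{
	struct srcu_usage_sketch *sup =
		container_of(wp, struct srcu_usage_sketch, work);

	return sup->srcu_ssp;
}
```

Once the work item lives in the usage structure, container_of() can only reach that structure, so the extra hop through the back pointer is what gets the handler to the owning srcu_struct.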
2023-04-04 | srcu: Move srcu_barrier() fields from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex, ->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->sda_is_static from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->sda_is_static field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move heuristics fields from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries, and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move grace-period fields from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed, ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->srcu_gp_mutex from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_gp_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->lock from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->lock field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->lock initialization after srcu_usage allocation | Paul E. McKenney
Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernels initialize the srcu_struct structure's ->lock before the srcu_usage structure has been allocated. This of course prevents the ->lock from being moved to the srcu_usage structure, so this commit moves the initialization into init_srcu_struct_fields(), after the srcu_usage structure has been allocated. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->srcu_cb_mutex from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_cb_mutex field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->srcu_size_state from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->srcu_size_state field from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Move ->level from srcu_struct to srcu_usage | Paul E. McKenney
This commit moves the ->level[] array from the srcu_struct structure to the srcu_usage structure to reduce the size of the former in order to improve cache locality. Suggested-by: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | srcu: Begin offloading srcu_struct fields to srcu_update | Paul E. McKenney
The current srcu_struct structure is on the order of 200 bytes in size (depending on architecture and .config), which is much better than the old-style 26K bytes, but still all too inconvenient when one is trying to achieve good cache locality on a fastpath involving SRCU readers. However, only a few fields in srcu_struct are used by SRCU readers. The remaining fields could be offloaded to a new srcu_update structure, thus shrinking the srcu_struct structure down to a few tens of bytes. This commit begins this noble quest, a quest that is complicated by open-coded initialization of the srcu_struct within the srcu_notifier_head structure. This complication is addressed by updating the srcu_notifier_head structure's open coding, given that there does not appear to be a straightforward way of abstracting that initialization. This commit moves only the ->node pointer to srcu_update. Later commits will move additional fields. [ paulmck: Fold in qiang1.zhang@intel.com's memory-leak fix. ] Link: https://lore.kernel.org/all/20230320055751.4120251-1-qiang1.zhang@intel.com/ Suggested-by: Christoph Hellwig <hch@lst.de> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
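The size effect of moving update-side fields behind a pointer can be sketched with two layouts. The field selection is purely hypothetical and far smaller than the real structures; the sketch only shows why the reader-facing structure shrinks.

```c
/* Hypothetical update-side fields; the real structure has different members. */
struct cold_update_state {
	void *node;
	unsigned long gp_seq;
	unsigned long gp_seq_needed;
	unsigned long barrier_seq;
	long size_jiffies;
};

/* Before: reader and updater fields share one structure. */
struct combined {
	unsigned int idx;               /* touched by readers */
	void *sda;                      /* touched by readers */
	struct cold_update_state cold;  /* embedded, rarely touched by readers */
};

/* After: readers touch only this small structure. */
struct split {
	unsigned int idx;
	void *sda;
	struct cold_update_state *sup;  /* updater state lives elsewhere */
};
```

The split layout pays one extra pointer dereference on the update path in exchange for a reader-side structure that fits in far fewer cache lines.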
2023-04-04 | srcu: Use static init for statically allocated in-module srcu_struct | Paul E. McKenney
Further shrinking the srcu_struct structure is eased by requiring that in-module srcu_struct structures rely more heavily on static initialization. In particular, this preserves the property that a module-load-time srcu_struct initialization can fail only due to memory-allocation failure of the per-CPU srcu_data structures. It might also slightly improve robustness by keeping the number of memory allocations that must succeed down to the percpu_alloc() call. This is in preparation for splitting an srcu_usage structure out of the srcu_struct structure. [ paulmck: Fold in qiang1.zhang@intel.com feedback. ] Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-04 | rcu-tasks: Fix warning for unused tasks_rcu_exit_srcu | Paul E. McKenney
The tasks_rcu_exit_srcu variable is used only by kernels built with CONFIG_TASKS_RCU=y, but is defined for all kernels with CONFIG_TASKS_RCU_GENERIC=y. Therefore, in kernels built with CONFIG_TASKS_RCU_GENERIC=y but CONFIG_TASKS_RCU=n, this gives a "defined but not used" warning. This commit therefore moves this variable under CONFIG_TASKS_RCU. Link: https://lore.kernel.org/oe-kbuild-all/202303191536.XzMSyzTl-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Tested-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-04-03 | bpf: Fix struct_meta lookup for bpf_obj_free_fields kfunc call | Dave Marchevsky
bpf_obj_drop_impl has a void return type. In check_kfunc_call, the "else if" which sets insn_aux->kptr_struct_meta for bpf_obj_drop_impl is surrounded by a larger if statement which checks btf_type_is_ptr. As a result:

  * The bpf_obj_drop_impl-specific code will never execute
  * The btf_struct_meta input to bpf_obj_drop is always NULL
  * __bpf_obj_drop_impl will always see a NULL btf_record when called from BPF program, and won't call bpf_obj_free_fields
  * program-allocated kptrs which have fields that should be cleaned up by bpf_obj_free_fields may instead leak resources

This patch adds a btf_type_is_void branch to the larger if and moves special handling for bpf_obj_drop_impl there, fixing the issue.

Fixes: ac9f06050a35 ("bpf: Introduce bpf_obj_drop")
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230403200027.2271029-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-03 | driver core: class: remove struct class_interface * from callbacks | Greg Kroah-Hartman
The add_dev and remove_dev callbacks in struct class_interface currently pass in a pointer back to the class_interface structure that is calling them, but none of the callback implementations actually use this pointer as it is pointless (the structure is known, the driver passed it in in the first place if it is really needed again.) So clean this up and just remove the pointer from the callbacks and fix up all callback functions. Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Allen Hubbe <allenbh@gmail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: John Stultz <jstultz@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Wang Weiyang <wangweiyang2@huawei.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Jakob Koschel <jakobkoschel@gmail.com> Cc: Cai Xinchen <caixinchen1@huawei.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Logan Gunthorpe <logang@deltatee.com> Link: https://lore.kernel.org/r/2023040250-pushover-platter-509c@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-03 | tracing/osnoise: Fix notify new tracing_max_latency | Daniel Bristot de Oliveira
osnoise/timerlat tracers are reporting new max latency on instances where the tracing is off, creating inconsistencies between the max reported values in the trace and in the tracing_max_latency. Thus only report new tracing_max_latency on active tracing instances. Link: https://lkml.kernel.org/r/ecd109fde4a0c24ab0f00ba1e9a144ac19a91322.1680104184.git.bristot@kernel.org Cc: stable@vger.kernel.org Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr") Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-03 | tracing/timerlat: Notify new max thread latency | Daniel Bristot de Oliveira
timerlat is not reporting a new tracing_max_latency for the thread latency. The reason is that it is not calling notify_new_max_latency() function after the new thread latency is sampled. Call notify_new_max_latency() after computing the thread latency. Link: https://lkml.kernel.org/r/16e18d61d69073d0192ace07bf61e405cca96e9c.1680104184.git.bristot@kernel.org Cc: stable@vger.kernel.org Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr") Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-03 | ring-buffer: Fix race while reader and writer are on the same page | Zheng Yejian
When a user reads the file 'trace_pipe', the kernel keeps printing the following logs, which warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in rb_get_reader_page(). It looks like there is an infinite loop in tracing_read_pipe(). This problem occurred several times on the arm64 platform when testing v5.10 and below.

  Call trace:
   rb_get_reader_page+0x248/0x1300
   rb_buffer_peek+0x34/0x160
   ring_buffer_peek+0xbc/0x224
   peek_next_entry+0x98/0xbc
   __find_next_entry+0xc4/0x1c0
   trace_find_next_entry_inc+0x30/0x94
   tracing_read_pipe+0x198/0x304
   vfs_read+0xb4/0x1e0
   ksys_read+0x74/0x100
   __arm64_sys_read+0x24/0x30
   el0_svc_common.constprop.0+0x7c/0x1bc
   do_el0_svc+0x2c/0x94
   el0_svc+0x20/0x30
   el0_sync_handler+0xb0/0xb4
   el0_sync+0x160/0x180

Then I dumped the vmcore and looked into the problematic per_cpu ring_buffer. I found that tail_page/commit_page/reader_page are on the same page while reader_page->read is obviously abnormal:

  tail_page == commit_page == reader_page == {
    .write = 0x100d20,
    .read = 0x8f9f4805,  // Far greater than 0xd20, obviously abnormal!!!
    .entries = 0x10004c,
    .real_end = 0x0,
    .page = {
      .time_stamp = 0x857257416af0,
      .commit = 0xd20,  // This page hasn't been full filled.
      // .data[0...0xd20] seems normal.
    }
  }

The root cause is most likely a race in which the reader and writer are on the same page and the reader sees an event that has not been fully committed by the writer.

To fix this, add memory barriers to make sure the reader can see the content of what is committed. Since commit a0fcaaed0c46 ("ring-buffer: Fix race between reset page and reading page") has added the read barrier in rb_get_reader_page(), here we just need to add the write barrier.

Link: https://lore.kernel.org/linux-trace-kernel/20230325021247.2923907-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes: 77ae365eca89 ("ring-buffer: make lockless")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
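The pairing of a write barrier on the commit side with a read barrier on the reader side can be sketched with C11 atomics. Note the substitution: release and acquire ordering are used here in place of the kernel's smp_wmb() and smp_rmb(), and `struct page_sketch`, `writer_commit()` and `reader_peek()` are hypothetical names, not ring-buffer code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-page buffer; 'commit' publishes how many bytes are valid. */
struct page_sketch {
	char data[64];
	atomic_size_t commit;
};

/* Writer: fill the payload first, then publish the new commit offset. */
static int writer_commit(struct page_sketch *pg, const char *ev, size_t len)
{
	size_t old = atomic_load_explicit(&pg->commit, memory_order_relaxed);

	if (len > sizeof(pg->data) - old)
		return -1;  /* page full */
	memcpy(pg->data + old, ev, len);
	/* Release ordering plays the role of the write barrier. */
	atomic_store_explicit(&pg->commit, old + len, memory_order_release);
	return 0;
}

/* Reader: load the commit offset with acquire, then read only up to it. */
static size_t reader_peek(struct page_sketch *pg, char *out, size_t cap)
{
	size_t n = atomic_load_explicit(&pg->commit, memory_order_acquire);

	if (n > cap)
		n = cap;
	memcpy(out, pg->data, n);
	return n;
}
```

Without the ordering on the writer side, a reader could observe the new commit offset before the event bytes it covers, which is the kind of half-committed view the commit message describes.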
2023-04-03  tracing/synthetic: Fix races on freeing last_cmd  Tze-nan Wu
Currently, the "last_cmd" variable can be accessed by multiple processes asynchronously when multiple users manipulate the synthetic_events node at the same time, which can lead to a use-after-free or a double-free.

This patch adds "lastcmd_mutex" to prevent "last_cmd" from being accessed asynchronously.

================================================================

It's easy to reproduce in a KASAN environment by running the two scripts below in different shells.

script 1:
    while :
    do
        echo -n -e '\x88' > /sys/kernel/tracing/synthetic_events
    done

script 2:
    while :
    do
        echo -n -e '\xb0' > /sys/kernel/tracing/synthetic_events
    done

================================================================
double-free scenario:

    process A                       process B
-------------------               ---------------
1.kstrdup last_cmd
                                  2.free last_cmd
3.free last_cmd(double-free)

================================================================
use-after-free scenario:

    process A                       process B
-------------------               ---------------
1.kstrdup last_cmd
                                  2.free last_cmd
3.tracing_log_err(use-after-free)

================================================================

Appendix 1. KASAN report double-free:

BUG: KASAN: double-free in kfree+0xdc/0x1d4
Free of addr ***** by task sh/4879
Call trace:
    ...
    kfree+0xdc/0x1d4
    create_or_delete_synth_event+0x60/0x1e8
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

Allocated by task 4879:
    ...
    kstrdup+0x5c/0x98
    create_or_delete_synth_event+0x6c/0x1e8
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

Freed by task 5464:
    ...
    kfree+0xdc/0x1d4
    create_or_delete_synth_event+0x60/0x1e8
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

================================================================
Appendix 2. KASAN report use-after-free:

BUG: KASAN: use-after-free in strlen+0x5c/0x7c
Read of size 1 at addr ***** by task sh/5483
sh: CPU: 7 PID: 5483 Comm: sh
    ...
    __asan_report_load1_noabort+0x34/0x44
    strlen+0x5c/0x7c
    tracing_log_err+0x60/0x444
    create_or_delete_synth_event+0xc4/0x204
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

Allocated by task 5483:
    ...
    kstrdup+0x5c/0x98
    create_or_delete_synth_event+0x80/0x204
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

Freed by task 5480:
    ...
    kfree+0xdc/0x1d4
    create_or_delete_synth_event+0x74/0x204
    trace_parse_run_command+0x2bc/0x4b8
    synth_events_write+0x20/0x30
    vfs_write+0x200/0x830
    ...

Link: https://lore.kernel.org/linux-trace-kernel/20230321110444.1587-1-Tze-nan.Wu@mediatek.com
Fixes: 27c888da9867 ("tracing: Remove size restriction on synthetic event cmd error logging")
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: "Tom Zanussi" <zanussi@kernel.org>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
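The race and its fix in the commit above can be modelled outside the kernel. What follows is a minimal Python sketch, not the kernel code: `ToyHeap` is a hypothetical stand-in for kstrdup()/kfree() that counts double frees (roughly what KASAN would flag), and `SynthCmdState` and `save_cmd()` are invented names for the state that the lock, here playing the role of lastcmd_mutex, protects.

```python
import threading


class ToyHeap:
    """Hypothetical allocator model: hands out handles and counts double frees."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the model's own bookkeeping
        self._live = set()
        self._next_id = 0
        self.double_frees = 0

    def strdup(self, text):
        with self._guard:
            self._next_id += 1
            self._live.add(self._next_id)
            return (self._next_id, text)

    def free(self, handle):
        if handle is None:  # freeing "NULL" is a no-op
            return
        with self._guard:
            if handle[0] in self._live:
                self._live.remove(handle[0])
            else:
                self.double_frees += 1

    def live_count(self):
        with self._guard:
            return len(self._live)


class SynthCmdState:
    """Models 'last_cmd' plus the mutex that serialises access to it."""

    def __init__(self, heap):
        self.heap = heap
        self.last_cmd = None
        self.lastcmd_mutex = threading.Lock()

    def save_cmd(self, text):
        # Free-then-replace has to be one atomic step; without the lock,
        # two writers could both free the same old handle.
        with self.lastcmd_mutex:
            self.heap.free(self.last_cmd)
            self.last_cmd = self.heap.strdup(text)


def run_writers(state, n_threads=4, iterations=2000):
    """Hammer save_cmd() from several threads, like the two shell loops."""
    def worker(tag):
        for _ in range(iterations):
            state.save_cmd(f"cmd-{tag}")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With the lock held around the free and the duplicate, every old handle is freed exactly once, so after `run_writers()` the model should report no double frees and a single live allocation.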
2023-04-03  printk: Remove obsoleted check for non-existent "user" object  Stanislav Kinsburskii
The original check for a non-NULL "user" object was introduced by commit e11fea92e13f ("kmsg: export printk records to the /dev/kmsg interface"), when "user" could be NULL if /dev/kmsg was opened for writing.

A subsequent change, 750afe7babd1 ("printk: add kernel parameter to control writes to /dev/kmsg"), made the "user" context required for files opened for write, but didn't remove the now-redundant checks for it being non-NULL.

This patch removes the dead code while preserving the current logic.

Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburski@gmail.com>
CC: Petr Mladek <pmladek@suse.com>
CC: Sergey Senozhatsky <senozhatsky@chromium.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: John Ogness <john.ogness@linutronix.de>
CC: linux-kernel@vger.kernel.org
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/167929571877.2810.9926967619100618792.stgit@skinsburskii.localdomain
2023-04-03  fork: use pidfd_prepare()  Christian Brauner
Stop open-coding get_unused_fd_flags() and anon_inode_getfile(). Open-coding them is brittle because the flags passed to both calls have to be kept in sync. Use the dedicated helper.

Message-Id: <20230327-pidfd-file-api-v1-2-5c0e9a3158e4@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-04-03  pid: add pidfd_prepare()  Christian Brauner
Add a new helper that reserves a pidfd and allocates a new pidfd file that stashes the provided struct pid. This will allow us to remove places that either open-code this function or call pidfd_create() and then have to call close_fd() because there are still failure points after pidfd_create() has been called.

Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230327-pidfd-file-api-v1-1-5c0e9a3158e4@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
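The "reserve first, publish last" pattern that this helper enables can be illustrated with a toy model. The sketch below is Python, not kernel code, and every name in it (`ToyFdTable`, `toy_pidfd_prepare()`, `toy_fork()`) is hypothetical; it only assumes that a reserved descriptor number stays invisible to userspace until it is explicitly installed.

```python
class ToyFdTable:
    """Hypothetical fd table: slots are reserved first and published later."""

    def __init__(self):
        self.reserved = set()   # numbers handed out but not yet visible
        self.installed = {}     # numbers visible to "userspace"
        self._next_fd = 3       # 0-2 are conventionally stdin/stdout/stderr

    def reserve(self):
        fd = self._next_fd
        self._next_fd += 1
        self.reserved.add(fd)
        return fd

    def put_unused(self, fd):
        # Give back a reservation that was never published.
        self.reserved.discard(fd)

    def install(self, fd, file):
        # Publish the file under the reserved number.
        self.reserved.remove(fd)
        self.installed[fd] = file


def toy_pidfd_prepare(table, pid):
    """Reserve an fd and build the file object, but publish neither."""
    fd = table.reserve()
    file = {"kind": "pidfd", "pid": pid}
    return fd, file


def toy_fork(table, pid, later_step_fails=False):
    """Caller with a failure point between 'prepare' and 'install'."""
    fd, file = toy_pidfd_prepare(table, pid)
    if later_step_fails:
        # Nothing was visible yet, so unwinding needs no close of a live fd.
        table.put_unused(fd)
        return None
    table.install(fd, file)  # point of no return
    return fd
```

Because installation happens only after the last failure point, the error path never has to close a descriptor that userspace might already have seen, which is the problem the commit message describes for callers of pidfd_create().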
2023-04-03  Merge 6.3-rc5 into driver-core-next  Greg Kroah-Hartman
We need the fixes in here for testing, as well as the driver core changes for documentation updates to build on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-02  bpf: compute hashes in bloom filter similar to hashmap  Anton Protopopov
If the value size in a bloom filter is a multiple of 4, then the jhash2() function is used to compute hashes. The length parameter of this function equals the number of 32-bit words in the input. Compute it in the hot path instead of pre-computing it, as this translates to one extra shift to divide the length by four versus one extra memory load of a pre-computed length.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Link: https://lore.kernel.org/r/20230402114340.3441-1-aspsk@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
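The length computation described in the commit above is a one-line shift. The Python sketch below only models the choice of function and length argument; `hash_call_for()` is a hypothetical name, and the fallback to jhash() with a byte length for other sizes is an assumption that the commit message itself does not state.

```python
def hash_call_for(value_size):
    """Return (function name, length argument) for a value of value_size bytes."""
    if value_size % 4 == 0:
        # jhash2() takes its length in 32-bit words: shift right by 2
        # to divide the byte count by four.
        return ("jhash2", value_size >> 2)
    # Assumed fallback: byte-oriented jhash() with the length in bytes.
    return ("jhash", value_size)
```

For example, an 8-byte value maps to `("jhash2", 2)` and a 6-byte value to `("jhash", 6)`.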
2023-04-01  bpf: optimize hashmap lookups when key_size is divisible by 4  Anton Protopopov
The BPF hashmap uses the jhash() hash function. There is an optimized version of this hash function which may be used if the key size is a multiple of 4. Apply this optimization to the hashmap in a similar way as it is done in the bloom filter map.

In practice the optimization is only noticeable for smaller key sizes, which, however, is sufficient for many applications. An example is listed in the following table of measurements (a hashmap of 65536 elements was used):

    --------------------------------------------------------------------
    | key_size | fullness | lookups /sec | lookups (opt) /sec |   gain |
    --------------------------------------------------------------------
    |        4 |      25% |      42.990M |            46.000M |   7.0% |
    |        4 |      50% |      37.910M |            39.094M |   3.1% |
    |        4 |      75% |      34.486M |            36.124M |   4.7% |
    |        4 |     100% |      31.760M |            32.719M |   3.0% |
    --------------------------------------------------------------------
    |        8 |      25% |      43.855M |            49.626M |  13.2% |
    |        8 |      50% |      38.328M |            42.152M |  10.0% |
    |        8 |      75% |      34.483M |            38.088M |  10.5% |
    |        8 |     100% |      31.306M |            34.686M |  10.8% |
    --------------------------------------------------------------------
    |       12 |      25% |      38.398M |            43.770M |  14.0% |
    |       12 |      50% |      33.336M |            37.712M |  13.1% |
    |       12 |      75% |      29.917M |            34.440M |  15.1% |
    |       12 |     100% |      27.322M |            30.480M |  11.6% |
    --------------------------------------------------------------------
    |       16 |      25% |      41.491M |            41.921M |   1.0% |
    |       16 |      50% |      36.206M |            36.474M |   0.7% |
    |       16 |      75% |      32.529M |            33.027M |   1.5% |
    |       16 |     100% |      29.581M |            30.325M |   2.5% |
    --------------------------------------------------------------------
    |       20 |      25% |      34.240M |            36.787M |   7.4% |
    |       20 |      50% |      30.328M |            32.663M |   7.7% |
    |       20 |      75% |      27.536M |            29.354M |   6.6% |
    |       20 |     100% |      24.847M |            26.505M |   6.7% |
    --------------------------------------------------------------------
    |       24 |      25% |      36.329M |            40.608M |  11.8% |
    |       24 |      50% |      31.444M |            35.059M |  11.5% |
    |       24 |      75% |      28.426M |            31.452M |  10.6% |
    |       24 |     100% |      26.278M |            28.741M |   9.4% |
    --------------------------------------------------------------------
    |       28 |      25% |      31.540M |            31.944M |   1.3% |
    |       28 |      50% |      27.739M |            28.063M |   1.2% |
    |       28 |      75% |      24.993M |            25.814M |   3.3% |
    |       28 |     100% |      23.513M |            23.500M |  -0.1% |
    --------------------------------------------------------------------
    |       32 |      25% |      32.116M |            33.953M |   5.7% |
    |       32 |      50% |      28.879M |            29.859M |   3.4% |
    |       32 |      75% |      26.227M |            26.948M |   2.7% |
    |       32 |     100% |      23.829M |            24.613M |   3.3% |
    --------------------------------------------------------------------
    |       64 |      25% |      22.535M |            22.554M |   0.1% |
    |       64 |      50% |      20.471M |            20.675M |   1.0% |
    |       64 |      75% |      19.077M |            19.146M |   0.4% |
    |       64 |     100% |      17.710M |            18.131M |   2.4% |
    --------------------------------------------------------------------

The following script was used to gather the results (SMT & frequency off):

    cd tools/testing/selftests/bpf
    for key_size in 4 8 12 16 20 24 28 32 64; do
            for nr_entries in `seq 16384 16384 65536`; do
                    fullness=$(printf '%3s' $((nr_entries*100/65536)))
                    echo -n "key_size=$key_size: $fullness% full: "
                    sudo ./bench -d2 -a bpf-hashmap-lookup --key_size=$key_size --nr_entries=$nr_entries --max_entries=65536 --nr_loops=2000000 --map_flags=0x40 | grep cpu
            done
            echo
    done

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Link: https://lore.kernel.org/r/20230401200602.3275-1-aspsk@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
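The gain column in the measurements above appears to be the relative change in lookup rate, rounded to one decimal place. The Python sketch below checks that reading against three rows copied from the table; `gain_percent()` is a hypothetical helper, not part of the benchmark.

```python
def gain_percent(base_mlookups, opt_mlookups):
    """Relative speed-up of the optimized run over the baseline, in percent."""
    return (opt_mlookups - base_mlookups) / base_mlookups * 100.0


# (key_size, fullness %, lookups/sec in millions, optimized lookups/sec in millions)
sample_rows = [
    (4, 25, 42.990, 46.000),
    (8, 25, 43.855, 49.626),
    (28, 100, 23.513, 23.500),
]

for key_size, fullness, base, opt in sample_rows:
    print(f"key_size={key_size} fullness={fullness}%: "
          f"gain={gain_percent(base, opt):.1f}%")
```

Under this formula the three sampled rows reproduce the published 7.0%, 13.2% and -0.1% figures.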
2023-04-01  bpf: Remove now-defunct task kfuncs  David Vernet
In commit 22df776a9a86 ("tasks: Extract rcu_users out of union"), the 'refcount_t rcu_users' field was extracted out of a union with the 'struct rcu_head rcu' field. This allows us to safely perform a refcount_inc_not_zero() on task->rcu_users when acquiring a reference on a task struct. A prior patch leveraged this by making struct task_struct an RCU-protected object in the verifier, and by changing bpf_task_acquire() to use the task->rcu_users field for synchronization.

Now that we can use RCU to protect tasks, we no longer need bpf_task_kptr_get() or bpf_task_acquire_not_zero(). bpf_task_kptr_get() is completely unnecessary, as we can just use RCU to get the object. bpf_task_acquire_not_zero() is now equivalent to bpf_task_acquire().

In addition to these changes, this patch also updates the associated selftests to no longer use these kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230331195733.699708-3-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-01  bpf: Make struct task_struct an RCU-safe type  David Vernet
struct task_struct objects are a bit interesting in terms of how their lifetime is protected by refcounts. task structs have two refcount fields:

1. refcount_t usage: Protects the memory backing the task struct. When this refcount drops to 0, the task is immediately freed, without waiting for an RCU grace period to elapse. This is the field that most callers in the kernel currently use to ensure that a task remains valid while it's being referenced, and is what's currently tracked with bpf_task_acquire() and bpf_task_release().

2. refcount_t rcu_users: A refcount field which, when it drops to 0, schedules an RCU callback that drops a reference held on the 'usage' field above (which is acquired when the task is first created). This field therefore provides a form of RCU protection on the task by ensuring that at least one 'usage' refcount will be held until an RCU grace period has elapsed. The qualifier "a form of" is important here, as a task can remain valid after task->rcu_users has dropped to 0 and the subsequent RCU grace period has elapsed.

In terms of BPF, we want to use task->rcu_users to protect tasks that function as referenced kptrs, and to allow tasks stored as referenced kptrs in maps to be accessed with RCU protection.

Let's first determine whether we can safely use task->rcu_users to protect tasks stored in maps. All of the bpf_task* kfuncs can only be called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS program types. For tracepoint and struct_ops programs, the struct task_struct passed to a program handler will always be trusted, so it will always be safe to call bpf_task_acquire() with any task passed to a program. Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL, as it is possible that the task has exited by the time the program is invoked, even if the pointer is still currently valid because the main kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks should never be passed as an argument to any program handler, so it should not be relevant.

The second question is whether it's safe to use RCU to access a task that was acquired with bpf_task_acquire() and stored in a map. Because bpf_task_acquire() now uses task->rcu_users, it follows that if the task is present in the map, it must have had at least one task->rcu_users refcount by the time the current RCU critical section was started. Therefore, it's safe to access that task until the end of the current RCU critical section.

With all that said, this patch makes struct task_struct an RCU-protected object. In doing so, we also change bpf_task_acquire() to be KF_ACQUIRE | KF_RCU | KF_RET_NULL, and adjust any selftests as necessary. A subsequent patch will remove bpf_task_kptr_get() and bpf_task_acquire_not_zero().

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
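The "increment unless already zero" rule that makes a NULL return from bpf_task_acquire() possible can be shown with a small model. The Python sketch below is not the kernel implementation: `ToyTask`, `inc_not_zero()`, `put()` and `toy_task_acquire()` are hypothetical names, a lock stands in for the atomicity of the real refcount operations, and the initial count of 1 is an assumption made only for the demonstration.

```python
import threading


class ToyTask:
    """Hypothetical stand-in for a task with an rcu_users-style refcount."""

    def __init__(self, pid):
        self.pid = pid
        self.rcu_users = 1  # assumed starting count for this demonstration
        self._lock = threading.Lock()  # models atomic refcount updates

    def inc_not_zero(self):
        """Take a reference only if at least one reference still exists."""
        with self._lock:
            if self.rcu_users == 0:
                return False
            self.rcu_users += 1
            return True

    def put(self):
        """Drop a reference; return True if this was the last one."""
        with self._lock:
            assert self.rcu_users > 0, "refcount underflow"
            self.rcu_users -= 1
            return self.rcu_users == 0


def toy_task_acquire(task):
    """Return the task on success, or None once the count has hit zero."""
    return task if task.inc_not_zero() else None
```

Once the count has reached zero it can never be raised again, so a caller that still holds a pointer to such a task gets `None` back and has to handle that case, which mirrors the reason the commit message gives for KF_RET_NULL.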
2023-03-31  iommu/sva: Move PASID helpers to sva code  Jacob Pan
In preparation for removing the IOASID infrastructure, PASID management will move under the SVA code. Decouple the mm code from IOASID.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230322200803.869130-3-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-30  Merge tag 'dma-mapping-6.3-2023-03-31' of git://git.infradead.org/users/hch/dma-mapping  Linus Torvalds
Pull dma-mapping fixes from Christoph Hellwig:

 - fix for swiotlb deadlock due to wrong alignment checks (GuoRui.Yu, Petr Tesarik)

* tag 'dma-mapping-6.3-2023-03-31' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: fix slot alignment checks
  swiotlb: use wrap_area_index() instead of open-coding it
  swiotlb: fix the deadlock in swiotlb_do_find_slots
2023-03-30  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Conflicts:

drivers/net/ethernet/mediatek/mtk_ppe.c
  3fbe4d8c0e53 ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting")
  924531326e2d ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>