path: root/kernel/trace
2021-04-15  tracing: Unify the logic for function tracing options  Yordan Karadzhov (VMware)
Currently the logic for dealing with the options for function tracing has two different implementations. One is used when we set the flags (in "static int func_set_flag()") and another is used when we initialize the tracer (in "static int function_trace_init()"). Those two implementations are meant to do essentially the same thing, and neither is convenient for adding new options. In this patch we add a helper function that provides a single implementation of the option-handling logic, written so that new options can be added easily. Link: https://lkml.kernel.org/r/20210415181854.147448-6-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
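For illustration, a minimal sketch of the shape such a helper can take (the helper name and option bits here are assumptions for the example, not necessarily the exact upstream patch):

	/* Map the current option flags to one tracing callback, so that
	 * func_set_flag() and function_trace_init() share the same logic
	 * and a new option only needs one more case here. */
	static ftrace_func_t select_trace_function(u32 flags_val)
	{
		switch (flags_val) {
		case TRACE_FUNC_NO_OPTS:
			return function_trace_call;
		case TRACE_FUNC_OPT_STACK:
			return function_stack_trace_call;
		default:
			return NULL;	/* unsupported combination */
		}
	}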
2021-04-15  tracing: Add method for recording "func_repeats" events  Yordan Karadzhov (VMware)
This patch only provides the implementation of the method. Later it will be used in combination with a new option for function tracing. Link: https://lkml.kernel.org/r/20210415181854.147448-5-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-15  tracing: Add "last_func_repeats" to struct trace_array  Yordan Karadzhov (VMware)
The field is used to keep track of consecutive calls (on the same CPU) of a single function. This information is needed in order to consolidate the function tracing record in the cases when a single function is called a number of times in a row. Link: https://lkml.kernel.org/r/20210415181854.147448-4-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-15  tracing: Define new ftrace event "func_repeats"  Yordan Karadzhov (VMware)
The event aims to consolidate the function tracing record in the cases when a single function is called a number of times consecutively: while (cond) do_func(); This may happen in various scenarios (busy waiting, for example). The new ftrace event can be used to show repeated function events with a single event and save space on the ring buffer. Link: https://lkml.kernel.org/r/20210415181854.147448-3-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
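As a sketch of the idea, an entry for such an event needs roughly the following data (field names and layout are illustrative assumptions, not a quote of the upstream definition):

	struct func_repeats_entry {
		unsigned long	ip;		/* the repeated function      */
		unsigned long	parent_ip;	/* its caller                 */
		u16		count;		/* consecutive calls folded
						 * into this single event     */
		u16		top_delta_ts;	/* high bits of the time delta
						 * to the last repeat         */
		u32		bottom_delta_ts;/* low bits of that delta     */
	};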
2021-04-15  tracing: Define static void trace_print_time()  Yordan Karadzhov (VMware)
The part of the code that prints the time of the trace record in "int trace_print_context()" is extracted into a static function. This is done as a preparation for a following patch, in which we will define a new ftrace event called "func_repeats". The new static method, defined here, will be used by this new event to print the time of the last repeat of a function that is called a number of times consecutively. Link: https://lkml.kernel.org/r/20210415181854.147448-2-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-13  Merge tag 'trace-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull tracing fix from Steven Rostedt: "Fix a memory leak in dyn_event_release(). An error path exited the function before freeing the allocated 'argv' variable" * tag 'trace-v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/dynevent: Fix a memory leak in an error handling path
2021-04-13  tracing/dynevent: Fix a memory leak in an error handling path  Christophe JAILLET
We must free 'argv' before returning, as already done in all the other paths of this function. Link: https://lkml.kernel.org/r/21e3594ccd7fc88c5c162c98450409190f304327.1618136448.git.christophe.jaillet@wanadoo.fr Fixes: d262271d0483 ("tracing/dynevent: Delegate parsing to create function") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-10  kernel: Initialize cpumask before parsing  Tetsuo Handa
KMSAN complains that new_value at cpumask_parse_user() from write_irq_affinity() from irq_affinity_proc_write() is uninitialized. [ 148.133411][ T5509] ===================================================== [ 148.135383][ T5509] BUG: KMSAN: uninit-value in find_next_bit+0x325/0x340 [ 148.137819][ T5509] [ 148.138448][ T5509] Local variable ----new_value.i@irq_affinity_proc_write created at: [ 148.140768][ T5509] irq_affinity_proc_write+0xc3/0x3d0 [ 148.142298][ T5509] irq_affinity_proc_write+0xc3/0x3d0 [ 148.143823][ T5509] ===================================================== Since bitmap_parse() from cpumask_parse_user() calls find_next_bit(), any alloc_cpumask_var() + cpumask_parse_user() sequence leaves open the possibility that find_next_bit() accesses an uninitialized cpumask variable. Fix this problem by replacing alloc_cpumask_var() with zalloc_cpumask_var(). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20210401055823.3929-1-penguin-kernel@I-love.SAKURA.ne.jp
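A minimal sketch of the fix in context (the surrounding handler shape is paraphrased from write_irq_affinity(), not quoted verbatim):

	cpumask_var_t new_value;
	int err;

	/* zalloc_cpumask_var() hands back a zeroed mask, so every bit is
	 * defined even where cpumask_parse_user() writes nothing before
	 * find_next_bit() later scans the whole mask. */
	if (!zalloc_cpumask_var(&new_value, GFP_KERNEL))
		return -ENOMEM;

	err = cpumask_parse_user(buffer, count, new_value);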
2021-04-09  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Conflicts:
  MAINTAINERS - keep Chandrasekar
  drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine
  include/linux/bpf.h - trivial
  include/linux/ethtool.h - trivial, fix kdoc while at it
  include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped
  net/core/skmsg.c - add the sk = sk // sk = NULL around calls
  net/tipc/crypto.c - trivial
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-02  Merge tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull tracing fix from Steven Rostedt: "Fix stack trace entry size to stop showing garbage. The macro that creates both the structure and the format displayed to user space for the stack trace event was changed a while ago to fix the parsing by user space tooling. But this change also modified the structure used to store the stack trace event. It changed the caller array field from [0] to [8]. Even though the size in the ring buffer is dynamic and can be something other than 8 (user space knows how to handle this), the 8 extra words were not accounted for when reserving the event on the ring buffer, which added 8 more entries, due to the calculation of "sizeof(*entry) + nr_entries * sizeof(long)", as the sizeof(*entry) now contains 8 entries. The size of the caller field needs to be subtracted from the size of the entry to create the correct allocation size" * tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix stack trace event size
2021-04-01  ftrace: Simplify the calculation of page number for ftrace_page->records some more  Steven Rostedt (VMware)
Commit b40c6eabfcd40 ("ftrace: Simplify the calculation of page number for ftrace_page->records") simplified the calculation of the number of pages needed for each page group without having any empty pages, but it can be simplified even further. Link: https://lore.kernel.org/lkml/CAHk-=wjt9b7kxQ2J=aDNKbR1QBMB3Hiqb_hYcZbKsxGRSEb+gQ@mail.gmail.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01  ftrace: Store the order of pages allocated in ftrace_page  Linus Torvalds
Instead of saving the size of the records field of the ftrace_page, store the order used to allocate the pages, as that is what is needed in order to free them. This simplifies the code. Link: https://lore.kernel.org/lkml/CAHk-=whyMxheOqXAORt9a7JK9gc9eHTgCJ55Pgs4p=X3RrQubQ@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ change log written by Steven Rostedt ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01  tracing: Remove unused argument from "ring_buffer_time_stamp()"  Yordan Karadzhov (VMware)
The "cpu" parameter is not being used by the function. Link: https://lkml.kernel.org/r/20210329130331.199402-1-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01  Merge branch 'trace/ftrace/urgent' into HEAD  Steven Rostedt (VMware)
Needed to merge trace/ftrace/urgent to get commit 59300b36f85 ("ftrace: Check if pages were allocated before calling free_pages()"), in order to clean up the code that is affected by it as well.
2021-04-01  tracing: Fix stack trace event size  Steven Rostedt (VMware)
Commit cbc3b92ce037 fixed an issue by modifying the macros of the stack trace event so that user space could parse it properly. Originally the stack trace format exposed to user space declared the caller stack as a dynamic array. But it is not actually a dynamic array, in the way that other dynamic event arrays work, and this broke user space parsing of it. The update was to make the array look to have 8 entries in it. Helper functions were added to parse it correctly, as the stack is dynamic, with its size determined by the size of the stored event. Although this fixed how user space reads the event, it changed the internal structure used for the stack trace event. It changed the array size from [0] to [8] (added 8 entries). This increased the size of the stack trace event by 8 words. The size reserved on the ring buffer was the size of the stack trace event plus the number of stack entries found in the stack trace. That commit caused the amount to be 8 more than what was needed because it did not expect the caller field to have any size. This produced 8 entries of garbage (random data read past the real stack trace) in the stack trace event: <idle>-0 [002] d... 1976396.837549: <stack trace> => trace_event_raw_event_sched_switch => __traceiter_sched_switch => __schedule => schedule_idle => do_idle => cpu_startup_entry => secondary_startup_64_no_verify => 0xc8c5e150ffff93de => 0xffff93de => 0 => 0 => 0xc8c5e17800000000 => 0x1f30affff93de => 0x00000004 => 0x200000000 Instead, subtract the size of the caller field from the size of the event to make sure that only the amount needed to store the stack trace is reserved. Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/ Cc: stable@vger.kernel.org Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly") Reported-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
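A sketch of the corrected reservation (paraphrased; the names follow the kernel's stack-trace path, but this is not the verbatim diff):

	/* Reserve the struct minus its static caller[8] array, plus only
	 * the stack entries actually captured. */
	size = nr_entries * sizeof(unsigned long);
	event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,
			(sizeof(*entry) - sizeof(entry->caller)) + size,
			trace_ctx);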
2021-03-31  Merge tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull ftrace fix from Steven Rostedt: "Add check of order < 0 before calling free_pages(). The function addresses that are traced by ftrace are stored in pages, and the size is held in a variable. If there's some error in creating them, the allocated ones will be freed. In this case, it is possible that the order of pages to be freed may end up being negative due to a size of zero passed to get_count_order(), and then that negative number will cause free_pages() to free a very large section. Make sure that does not happen" * tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Check if pages were allocated before calling free_pages()
2021-03-30  ftrace: Check if pages were allocated before calling free_pages()  Steven Rostedt (VMware)
It is possible that on error pg->size can be zero when getting its order, which would return a -1 value. It is dangerous to pass in an order of -1 to free_pages(). Check if order is greater than or equal to zero before calling free_pages(). Link: https://lore.kernel.org/lkml/20210330093916.432697c7@gandalf.local.home/ Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
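The guard is a one-liner; a sketch (the order computation is paraphrased from the surrounding ftrace code):

	order = get_count_order(pg->size / ENTRIES_PER_PAGE);
	/* on error pg->size can be 0, making order -1; handing -1 to
	 * free_pages() would free an enormous region */
	if (order >= 0)
		free_pages((unsigned long)pg->records, order);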
2021-03-25  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25  tracing: Update create_system_filter() kernel-doc comment  Qiujun Huang
Commit f306cc82a93d ("tracing: Update event filters for multibuffer") added the parameter @tr for create_system_filter(). Commit bb9ef1cb7d86 ("tracing: Change apply_subsystem_event_filter() paths to check file->system == dir") changed the parameter from @system to @dir. Link: https://lkml.kernel.org/r/20210325161911.123452-1-hqjagain@gmail.com Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-25  tracing: A minor cleanup for create_system_filter()  Qiujun Huang
The first two parameters should be reduced to one, as @tr is simply @dir->tr. Link: https://lkml.kernel.org/r/20210324205642.65e03248@oasis.local.home Link: https://lkml.kernel.org/r/20210325163752.128407-1-hqjagain@gmail.com Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-24  kernel: trace: Mundane typo fixes in the file trace_events_filter.c  Bhaskar Chowdhury
s/callin/calling/ Link: https://lkml.kernel.org/r/20210317095401.1854544-1-unixbhaskar@gmail.com Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> [ Other fixes already done by Ingo Molnar ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-23  tracing: Fix various typos in comments  Ingo Molnar
Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18  tracing: Add a verifier to check string pointers for trace events  Steven Rostedt (VMware)
It is a common mistake for someone writing a trace event to save a pointer to a string in the TP_fast_assign() and then display that string pointer in the TP_printk() with %s. The problem is that those two events may happen a long time apart, where the source of the string may no longer exist. The proper way to handle displaying any string that is not guaranteed to be in the kernel core rodata section is to copy it into the ring buffer via the __string(), __assign_str() and __get_str() helper macros. Add a check at run time while displaying the TP_printk() of events to make sure that every %s referenced is safe to dereference, and if it is not, trigger a warning and only show the address of the pointer, plus the dereferenced string if it can be safely retrieved with a strncpy_from_kernel_nofault() call. In order to not have to copy the parsing of vsnprintf() formats, or even export its code, the verifier relies on vsnprintf() being able to modify the va_list that is passed to it, such that it remains modified after the call. This is the case for some architectures like x86_64, but other architectures like x86_32 pass the va_list to vsnprintf() by value, not by reference, and the verifier can not use it to parse the non-string arguments. Thus, at boot up, it is checked whether vsnprintf() modifies the passed-in va_list or not, and a static branch will disable the verifier if it's not compatible. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
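The safe pattern the message refers to, as a self-contained sketch (the event name and argument are assumptions for the example; the macros are the real helpers):

	TRACE_EVENT(sample_open,
		TP_PROTO(const char *filename),
		TP_ARGS(filename),
		TP_STRUCT__entry(
			__string(name, filename)	/* reserves space in the event */
		),
		TP_fast_assign(
			__assign_str(name, filename);	/* copies into the ring buffer */
		),
		/* safe: %s reads the copy stored inside the event itself */
		TP_printk("name=%s", __get_str(name))
	);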
2021-03-18  tracing: Add check of trace event print fmts for dereferencing pointers  Steven Rostedt (VMware)
Trace events record data into the ring buffer at the time of the event. The trace event has printf logic to display the recorded data at a much later time, when the user reads the trace file. This makes dereferencing pointers unsafe if the dereferenced pointer points to the original source. The safe way to handle this is to create an array within the trace event and copy the source into the array. Then the dereferenced pointer may point to that array. As this is an easy mistake to make, a check is added to examine all trace event print fmts to make sure that they are safe to use. This only checks the various %p* dereferenced pointers like %pB, %pR, etc. It does not handle dereferencing of strings, as there are some use cases where it is OK to dereference the source. That will be dealt with differently. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
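A hedged sketch of the safe pattern for a dereferencing %p* specifier (event name and argument are invented for the example):

	TRACE_EVENT(sample_resource,
		TP_PROTO(struct resource *res),
		TP_ARGS(res),
		TP_STRUCT__entry(
			/* copy the whole struct; saving only the pointer and
			 * printing it later with %pR could dereference memory
			 * that is already gone */
			__field_struct(struct resource, res)
		),
		TP_fast_assign(
			__entry->res = *res;
		),
		/* %pR dereferences &__entry->res, which lives in the event */
		TP_printk("range=%pR", &__entry->res)
	);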
2021-03-18  tracing: Add tracing_event_time_stamp() API  Steven Rostedt (VMware)
Add a tracing_event_time_stamp() API that checks if the event passed in is not on the ring buffer but a pointer to the per CPU trace_buffered_event which does not have its time stamp set yet. If it is a pointer to the trace_buffered_event, then just return the current time stamp that the ring buffer would produce. Otherwise, return the time stamp from the event. Link: https://lkml.kernel.org/r/20210316164114.131996180@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
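Roughly, the described behavior amounts to the following sketch (paraphrased from the description above, not the verbatim upstream body):

	u64 tracing_event_time_stamp(struct trace_buffer *buffer,
				     struct ring_buffer_event *rbe)
	{
		/* the per-CPU trace_buffered_event has no time stamp yet,
		 * so report what the ring buffer would produce "now" */
		if (rbe == this_cpu_read(trace_buffered_event))
			return ring_buffer_time_stamp(buffer);

		return ring_buffer_event_time_stamp(buffer, rbe);
	}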
2021-03-18  ring-buffer: Add verifier for using ring_buffer_event_time_stamp()  Steven Rostedt (VMware)
The ring_buffer_event_time_stamp() must only be called for an event that has not been committed yet and is on the buffer that is passed in. This was used to help debug converting the histogram logic over to using the new time stamp code, and proved to be very useful. Add a verifier that can check that this is the case, and extra WARN_ONs to catch unexpected use cases. Link: https://lkml.kernel.org/r/20210316164113.987294354@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18  tracing: Use a no_filter_buffering_ref to stop using the filter buffer  Steven Rostedt (VMware)
Currently, the trace histograms rely on absolute time stamps being enabled to trigger tracing to not use the temp buffer when filters are set. That's because the histograms need the full timestamp that is saved in the ring buffer. That is no longer the case, as ring_buffer_event_time_stamp() can now return the time stamp for all events without every event carrying a full absolute time stamp. The absolute time stamp is now an unrelated dependency: there is nothing about having absolute timestamps that should prevent using the filter buffer. Instead, change the interface to explicitly state that filter buffering should be disabled, which the histogram logic can use. Link: https://lkml.kernel.org/r/20210316164113.847886563@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
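A sketch of what such an explicit interface can look like (the field name matches the commit title's no_filter_buffering_ref; the body is an assumption):

	void tracing_set_filter_buffering(struct trace_array *tr, bool set)
	{
		if (set)
			tr->no_filter_buffering_ref++;
		else if (!WARN_ON_ONCE(!tr->no_filter_buffering_ref))
			tr->no_filter_buffering_ref--;
	}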
2021-03-18  ring-buffer: Allow ring_buffer_event_time_stamp() to return time stamp of all events  Steven Rostedt (VMware)
Currently, ring_buffer_event_time_stamp() only returns an accurate time stamp of the event if it has an absolute extended time stamp attached to it. To make it more robust, use the event_stamp() in case the event does not have an absolute value attached to it. This will allow ring_buffer_event_time_stamp() to be used in more cases than just histograms, and it will also allow histograms to not require including absolute values all the time. Link: https://lkml.kernel.org/r/20210316164113.704830885@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18  tracing: Pass buffer of event to trigger operations  Steven Rostedt (VMware)
The ring_buffer_event_time_stamp() is going to be updated to extract the time stamp for the event without needing it to be set to have absolute values for all events. But to do so, it needs the buffer that the event is on as the buffer saves information for the event before it is committed to the buffer. If the trace buffer is disabled, a temporary buffer is used, and there's no access to this buffer from the current histogram triggers, even though it is passed to the trace event code. Pass the buffer that the event is on all the way down to the histogram triggers. Link: https://lkml.kernel.org/r/20210316164113.542448131@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18  ring-buffer: Add an event_stamp to cpu_buffer for each level of nesting  Steven Rostedt (VMware)
Add a place to save the current event time stamp for each level of nesting. This will be used to retrieve the time stamp of the current event before it is committed. Link: https://lkml.kernel.org/r/20210316164113.399089673@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18  ring-buffer: Separate out internal use of ring_buffer_event_time_stamp()  Steven Rostedt (VMware)
The exported use of ring_buffer_event_time_stamp() is going to become different than how it is used internally. Move the internal logic out into a static function called rb_event_time_stamp(), and have the internal callers call that instead. Link: https://lkml.kernel.org/r/20210316164113.257790481@goodmis.org Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-17  ftrace: Fix modify_ftrace_direct()  Alexei Starovoitov
The following sequence of commands: register_ftrace_direct(ip, addr1); modify_ftrace_direct(ip, addr1, addr2); unregister_ftrace_direct(ip, addr2); will cause the kernel to warn: [ 30.179191] WARNING: CPU: 2 PID: 1961 at kernel/trace/ftrace.c:5223 unregister_ftrace_direct+0x130/0x150 [ 30.180556] CPU: 2 PID: 1961 Comm: test_progs W O 5.12.0-rc2-00378-g86bc10a0a711-dirty #3246 [ 30.182453] RIP: 0010:unregister_ftrace_direct+0x130/0x150 When modify_ftrace_direct() changes the addr from old to new it should update the addr stored in ftrace_direct_funcs. Otherwise the final unregister_ftrace_direct() won't find the address and will cause the splat. Fixes: 0567d6809182 ("ftrace: Add modify_ftrace_direct()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20210316195815.34714-1-alexei.starovoitov@gmail.com
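Conceptually, the fix keeps the recorded direct-call address in sync; a paraphrased sketch (not the verbatim upstream diff):

	entry = __ftrace_lookup_ip(direct_functions, ip);
	if (entry)
		/* track the new target, so a later
		 * unregister_ftrace_direct(ip, new_addr) can find it */
		entry->direct = new_addr;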
2021-03-09  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  David S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-09 The following pull-request contains BPF updates for your *net-next* tree. We've added 90 non-merge commits during the last 17 day(s) which contain a total of 114 files changed, 5158 insertions(+), 1288 deletions(-). The main changes are:
  1) Faster bpf_redirect_map(), from Björn.
  2) skmsg cleanup, from Cong.
  3) Support for floating point types in BTF, from Ilya.
  4) Documentation for sys_bpf commands, from Joe.
  5) Support for sk_lookup in bpf_prog_test_run, from Lorenz.
  6) Enable task local storage for tracing programs, from Song.
  7) bpf_for_each_map_elem() helper, from Yonghong.
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-04  tracing: Skip selftests if tracing is disabled  Steven Rostedt (VMware)
If tracing is disabled for some reason (traceoff_on_warning, command line, etc), the ftrace selftests are guaranteed to fail, as their results are defined by trace data in the ring buffers. If the ring buffers are turned off, the tests will fail, due to lack of data. Because tracing being disabled is for a specific reason (warning, user decided to, etc), it does not make sense to enable tracing to run the self tests, as the test output may corrupt the reason for the tracing to be disabled. Instead, simply skip the self tests and report that they are being skipped due to tracing being disabled. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04  tracing: Fix memory leak in __create_synth_event()  Vamshi K Sthambamkadi
kmemleak report: unreferenced object 0xc5a6f708 (size 8): comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s) hex dump (first 8 bytes): 00 c1 3d 60 14 83 1f 8a ..=`.... backtrace: [<f0aa4ac4>] __kmalloc_track_caller+0x2a6/0x460 [<7d3d60a6>] kstrndup+0x37/0x70 [<45a0e739>] argv_split+0x1c/0x120 [<c17982f8>] __create_synth_event+0x192/0xb00 [<0708b8a3>] create_synth_event+0xbb/0x150 [<3d1941e1>] create_dyn_event+0x5c/0xb0 [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140 [<04deb2ef>] dyn_event_write+0x10/0x20 [<8779ac95>] vfs_write+0xa9/0x3c0 [<ed93722a>] ksys_write+0x89/0xc0 [<b9ca0507>] __ia32_sys_write+0x15/0x20 [<7ce02d85>] __do_fast_syscall_32+0x45/0x80 [<cb0ecb35>] do_fast_syscall_32+0x29/0x60 [<2467454a>] do_SYSENTER_32+0x15/0x20 [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc unreferenced object 0xc5a6f078 (size 8): comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s) hex dump (first 8 bytes): 08 f7 a6 c5 00 00 00 00 ........ backtrace: [<bbac096a>] __kmalloc+0x2b6/0x470 [<aa2624b4>] argv_split+0x82/0x120 [<c17982f8>] __create_synth_event+0x192/0xb00 [<0708b8a3>] create_synth_event+0xbb/0x150 [<3d1941e1>] create_dyn_event+0x5c/0xb0 [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140 [<04deb2ef>] dyn_event_write+0x10/0x20 [<8779ac95>] vfs_write+0xa9/0x3c0 [<ed93722a>] ksys_write+0x89/0xc0 [<b9ca0507>] __ia32_sys_write+0x15/0x20 [<7ce02d85>] __do_fast_syscall_32+0x45/0x80 [<cb0ecb35>] do_fast_syscall_32+0x29/0x60 [<2467454a>] do_SYSENTER_32+0x15/0x20 [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc In __create_synth_event(), while iterating field/type arguments, argv_split() will return an array of at least 2 elements even when zero arguments (argc=0) are passed, e.g. when there is a double delimiter or the string ends with a delimiter. To fix this, call argv_free() even when argc=0. Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
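The shape of the fix, sketched (variable names assumed from typical dyn-event parsing code):

	argv = argv_split(GFP_KERNEL, raw_command, &argc);
	if (!argv)
		return -ENOMEM;

	if (!argc) {
		/* argv_split() allocates even for zero tokens (double or
		 * trailing delimiters), so it must still be freed */
		argv_free(argv);
		return -EINVAL;
	}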
2021-03-04  ring-buffer: Add a little more information and a WARN when time stamp going backwards is detected  Steven Rostedt (VMware)
When CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is enabled and the time stamps are detected as not being valid, it reports information about the write_stamp, but does not show the before_stamp, which is still useful information. Also, it should give a warning once, so that tests detect this happening. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04  ring-buffer: Force before_stamp and write_stamp to be different on discard  Steven Rostedt (VMware)
Part of the logic of the new time stamp code depends on the before_stamp and the write_stamp being different if the write_stamp does not match the last event on the buffer, as the write_stamp will be used to calculate the delta of the next event written on the buffer. The discard logic depends on this: the next event to come in needs to inject a full timestamp, as it cannot rely on the last event timestamp in the buffer, which is unknown because the events after it were discarded. By changing the write_stamp back to the time before the discarded event, the next event is forced to use a full time stamp instead of relying on the write_stamp. The issue came when a full time stamp was used for the event, as rb_time_delta() returns zero in that case, and the update to the write_stamp (which subtracts delta) did not change it. Then, when the event was removed from the buffer, the before_stamp and write_stamp still matched, and the next event written would calculate its delta from the write_stamp; that would be wrong, as the write_stamp holds the time of the event that was discarded. In the case that the delta change being made to the write_stamp is zero, set the before_stamp to zero as well, which forces the next event to inject a full timestamp rather than use the current write_stamp. Cc: stable@vger.kernel.org Fixes: a389d86f7fd09 ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04  tracing: Fix help text of TRACEPOINT_BENCHMARK in Kconfig  Rolf Eike Beer
It's "cond_resched()" not "cond_sched()". Link: https://lkml.kernel.org/r/1863065.aFVDpXsuPd@devpool47 Signed-off-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04  tracing: Remove duplicate declaration from trace.h  Yordan Karadzhov (VMware)
A declaration of function "int trace_empty(struct trace_iterator *iter)" shows up twice in the header file kernel/trace/trace.h Link: https://lkml.kernel.org/r/20210304092348.208033-1-y.karadz@gmail.com Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-28  Merge tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block  Linus Torvalds
Pull more block updates from Jens Axboe: "A few stragglers (and one due to me missing it originally), and fixes for changes in this merge window mostly. In particular:
  - blktrace cleanups (Chaitanya, Greg)
  - Kill dead blk_pm_* functions (Bart)
  - Fixes for the bio alloc changes (Christoph)
  - Fix for the partition changes (Christoph, Ming)
  - Fix for turning off iopoll with polled IO inflight (Jeffle)
  - nbd disconnect fix (Josef)
  - loop fsync error fix (Mauricio)
  - kyber update depth fix (Yang)
  - max_sectors alignment fix (Mikulas)
  - Add bio_max_segs helper (Matthew)"
* tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
  block: Add bio_max_segs
  blktrace: fix documentation for blk_fill_rw()
  block: memory allocations in bounce_clone_bio must not fail
  block: remove the gfp_mask argument to bounce_clone_bio
  block: fix bounce_clone_bio for passthrough bios
  block-crypto-fallback: use a bio_set for splitting bios
  block: fix logging on capacity change
  blk-settings: align max_sectors on "logical_block_size" boundary
  block: reopen the device in blkdev_reread_part
  block: don't skip empty device in in disk_uevent
  blktrace: remove debugfs file dentries from struct blk_trace
  nbd: handle device refs for DESTROY_ON_DISCONNECT properly
  kyber: introduce kyber_depth_updated()
  loop: fix I/O error on fsync() in detached loop devices
  block: fix potential IO hang when turning off io_poll
  block: get rid of the trace rq insert wrapper
  blktrace: fix blk_rq_merge documentation
  blktrace: fix blk_rq_issue documentation
  blktrace: add blk_fill_rwbs documentation comment
  block: remove superfluous param in blk_fill_rwbs()
  ...
2021-02-26  bpf: Add bpf_for_each_map_elem() helper  Yonghong Song
The bpf_for_each_map_elem() helper is introduced; it iterates over all map elements with a callback function. The helper signature looks like long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags) and for each map element, the callback_fn will be called. For example, for a hashmap, the callback signature may look like long callback_fn(map, key, val, callback_ctx) There are two known use cases for this. One is from upstream ([1]) where a for_each_map_elem helper may help implement a timeout mechanism in a more generic way. Another is from our internal discussion for a firewall use case where a map contains all the rules. The packet data can be compared to all these rules to decide whether to allow or deny the packet. For array maps, users can already use a bounded loop to traverse elements; using this helper avoids the bounded loop. For other types of maps (e.g., hash maps) where a bounded loop is hard or impossible to use, this helper provides a convenient way to operate on all elements. For callback_fn, besides the map and the map element, a callback_ctx, allocated on the caller's stack, is also passed to the callback function. This callback_ctx argument can provide additional input and allows writing to the caller's stack for output. If the callback_fn returns 0, the helper will iterate through the next element if available. If the callback_fn returns 1, the helper will stop iterating and return to the bpf program. Other return values are not used for now. Currently, this helper is only available with jit. It is possible to make it work with the interpreter with some effort, but I leave that as future work. [1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/ Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com
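A BPF-side sketch matching the signatures described above (the map name, element type, and callback are assumptions for the example):

	struct cb_ctx {
		__u64 total;
	};

	static long sum_elem(struct bpf_map *map, __u32 *key,
			     __u64 *val, struct cb_ctx *ctx)
	{
		ctx->total += *val;
		return 0;	/* 0 = continue, 1 = stop iterating */
	}

	/* ...later, inside a BPF program: */
	struct cb_ctx ctx = {};
	long n = bpf_for_each_map_elem(&my_hash_map, sum_elem, &ctx, 0);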
2021-02-26  bpf: Enable task local storage for tracing programs  Song Liu
To access per-task data, BPF programs usually create a hash table with pid as the key. This is not ideal because:
  1. The user needs to estimate the proper size of the hash table, which may be inaccurate;
  2. Big hash tables are slow;
  3. To clean up the data properly during task terminations, the user needs to write extra logic.
Task local storage overcomes these issues and offers a better option for such per-task data. Task local storage was only available to BPF_LSM; now enable it for tracing programs. Unlike LSM programs, tracing programs can be called in IRQ contexts. Helpers that access task local storage are updated to use raw_spin_lock_irqsave() instead of raw_spin_lock_bh(). Tracing programs can attach to functions on the task free path, e.g. exit_creds(). To avoid allocating task local storage after bpf_task_storage_free(), bpf_task_storage_get() is updated to not allocate new storage when the task is not refcounted (task->usage == 0). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210225234319.336131-2-songliubraving@fb.com
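A hedged sketch of a tracing program using task local storage (the map, section, and program names are illustrative):

	struct {
		__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
		__uint(map_flags, BPF_F_NO_PREALLOC);
		__type(key, int);
		__type(value, __u64);
	} task_ctx_switches SEC(".maps");

	SEC("tp_btf/sched_switch")
	int BPF_PROG(on_switch, bool preempt, struct task_struct *prev,
		     struct task_struct *next)
	{
		__u64 *cnt;

		/* with the fix above, no new storage is created for an
		 * exiting task whose usage count already dropped to zero */
		cnt = bpf_task_storage_get(&task_ctx_switches, next, 0,
					   BPF_LOCAL_STORAGE_GET_F_CREATE);
		if (cnt)
			(*cnt)++;
		return 0;
	}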
2021-02-26  tracing: add error_report_end trace point  Alexander Potapenko
Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3. This patchset adds a tracepoint, error_report_end, that is to be used by KFENCE, KASAN, and potentially other bug detection tools, when they print an error report. One of the possible use cases is userspace collection of kernel error reports: interested parties can subscribe to the tracing event via tracefs, and get notified when an error report occurs. This patch (of 3): Introduce the error_report_end tracepoint. It can be used in debugging tools like KASAN, KFENCE, etc. to provide extensions to the error reporting mechanisms (e.g. allow tests to hook into error reporting, ease error report collection from production kernels). Another benefit would be making use of ftrace for debugging or benchmarking the tools themselves. Should we need it, the tracepoint name leaves us with the possibility to introduce a complementary error_report_start tracepoint in the future. Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Marco Elver <elver@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
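Usage on the tool side reduces to a single call when a report is printed; a lightly hedged example (the KASAN call site and variable are illustrative):

	#include <trace/events/error_report.h>

	/* at the end of the detector's report path */
	trace_error_report_end(ERROR_DETECTOR_KASAN,
			       (unsigned long)corrupted_addr);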
2021-02-24  blktrace: fix documentation for blk_fill_rw()  Chaitanya Kulkarni
Add the missing ":" after the rwbs function parameter documentation, which fixes the following warning: ./kernel/trace/blktrace.c:1877: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 1f83bb4b4914 ("blktrace: add blk_fill_rwbs documentation comment") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23  Merge tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux  Linus Torvalds
Pull more clang LTO updates from Kees Cook: "Clang LTO x86 enablement. Full disclosure: while this has _not_ been in linux-next (since it initially looked like the objtool dependencies weren't going to make v5.12), it has been under daily build and runtime testing by Sami for quite some time. These x86 portions have been discussed on lkml, with Peter, Josh, and others helping nail things down. The bulk of the changes are to get objtool working happily. The rest of the x86 enablement is very small. Summary:
  - Generate __mcount_loc in objtool (Peter Zijlstra)
  - Support running objtool against vmlinux.o (Sami Tolvanen)
  - Clang LTO enablement for x86 (Sami Tolvanen)"
Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/ Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/
* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: lto: force rebuilds when switching CONFIG_LTO
  x86, build: allow LTO to be selected
  x86, cpu: disable LTO for cpu.c
  x86, vdso: disable LTO only for vDSO
  kbuild: lto: postpone objtool
  objtool: Split noinstr validation from --vmlinux
  x86, build: use objtool mcount
  tracing: add support for objtool mcount
  objtool: Don't autodetect vmlinux.o
  objtool: Fix __mcount_loc generation with Clang's assembler
  objtool: Add a pass for generating __mcount_loc
2021-02-23  tracing: add support for objtool mcount  Sami Tolvanen
This change adds build support for using objtool to generate __mcount_loc sections. Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2021-02-23  Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux  Linus Torvalds
Pull module updates from Jessica Yu:
  - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export types were introduced between 2006 - 2008. All of the unused symbols have long been removed and gpl future symbols were converted to gpl quite a long time ago, and I don't believe these export types have been used ever since. So, I think it should be safe to retire those export types now (Christoph Hellwig)
  - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig)
  - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as it is the only caller (Christoph Hellwig)
  - Unexport find_module() and module_mutex and fix the last module callers to not rely on these anymore. Make module_mutex internal to the module loader (Christoph Hellwig)
  - Harden ELF checks on module load and validate ELF structures before checking the module signature (Frank van der Linden)
  - Fix undefined symbol warning for clang (Fangrui Song)
  - Fix smatch warning (Dan Carpenter)
* tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: potential uninitialized return in module_kallsyms_on_each_symbol()
  module: remove EXPORT_UNUSED_SYMBOL*
  module: remove EXPORT_SYMBOL_GPL_FUTURE
  module: move struct symsearch to module.c
  module: pass struct find_symbol_args to find_symbol
  module: merge each_symbol_section into find_symbol
  module: remove each_symbol_in_section
  module: mark module_mutex static
  kallsyms: only build {,module_}kallsyms_on_each_symbol when required
  kallsyms: refactor {,module_}kallsyms_on_each_symbol
  module: use RCU to synchronize find_module
  module: unexport find_module and module_mutex
  drm: remove drm_fb_helper_modinit
  powerpc/powernv: remove get_cxl_module
  module: harden ELF info handling
  module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
2021-02-23  Merge tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux  Linus Torvalds
Pull clang LTO updates from Kees Cook: "Clang Link Time Optimization. This is built on the work done preparing for LTO by arm64 folks, tracing folks, etc. This includes the core changes as well as the remaining pieces for arm64 (LTO has been the default build method on Android for about 3 years now, as it is the prerequisite for the Control Flow Integrity protections). While x86 LTO enablement is done, it depends on some pending objtool clean-ups. It's possible that I'll send a "part 2" pull request for LTO that includes x86 support. For merge log posterity, and as detailed in commit dc5723b02e52 ("kbuild: add support for Clang LTO"), here is the tl;dr to do an LTO build:
  make LLVM=1 LLVM_IAS=1 defconfig
  scripts/config -e LTO_CLANG_THIN
  make LLVM=1 LLVM_IAS=1
(To do a cross-compile of arm64, add "CROSS_COMPILE=aarch64-linux-gnu-" and "ARCH=arm64" to the "make" command lines.) Summary:
  - Clang LTO build infrastructure and arm64-specific enablement (Sami Tolvanen)
  - Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)"
* tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: prevent CC_FLAGS_LTO self-bloating on recursive rebuilds
  arm64: allow LTO to be selected
  arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
  arm64: vdso: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  efi/libstub: disable LTO
  scripts/mod: disable LTO for empty.c
  modpost: lto: strip .lto from module names
  PCI: Fix PREL32 relocations for LTO
  init: lto: fix PREL32 relocations
  init: lto: ensure initcall ordering
  kbuild: lto: add a default list of used symbols
  kbuild: lto: merge module sections
  kbuild: lto: limit inlining
  kbuild: lto: fix module versioning
  kbuild: add support for Clang LTO
  tracing: move function tracer options to Kconfig
2021-02-23  blktrace: remove debugfs file dentries from struct blk_trace  Greg Kroah-Hartman
These debugfs dentries do not need to be saved for anything as the whole directory and everything in it is properly cleaned up when the parent directory is removed. So remove them from struct blk_trace and don't save them when created as it's not necessary. Cc: Jens Axboe <axboe@kernel.dk> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-block@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22  Merge tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace  Linus Torvalds
Pull tracing updates from Steven Rostedt:
  - Update to the way irqs and preemption are tracked via the trace event PC field
  - Fix handling of an event unregister failing due to failure to allocate memory. This is only triggered by failure injection, as it is pretty much guaranteed to have less than a page allocation succeed.
  - Do not show the useless "filter" or "enable" files for the "ftrace" trace system, as they have no effect on doing anything.
  - Add a warning if kprobes are registered more than once.
  - Synthetic events now have their fields parsed by semicolons. Old formats without semicolons will still work, but new features will require them.
  - New option to allow trace events to show %p without hashing in the trace file. The trace file can only be read by root, and reading the raw event buffer does not hash any pointers, so this does not expose anything new.
  - New directory in tools called tools/tracing, with a new tool that reads sequential latency reports from the ftrace latency tracers.
  - Other minor fixes and cleanups.
* tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  kprobes: Fix to delay the kprobes jump optimization
  tracing/tools: Add the latency-collector to tools directory
  tracing: Make hash-ptr option default
  tracing: Add ptr-hash option to show the hashed pointer value
  tracing: Update the stage 3 of trace event macro comment
  tracing: Show real address for trace event arguments
  selftests/ftrace: Add '!event' synthetic event syntax check
  selftests/ftrace: Update synthetic event syntax errors
  tracing: Add a backward-compatibility check for synthetic event creation
  tracing: Update synth command errors
  tracing: Rework synthetic event command parsing
  tracing/dynevent: Delegate parsing to create function
  kprobes: Warn if the kprobe is reregistered
  ftrace: Remove unused ftrace_force_update()
  tracepoints: Code clean up
  tracepoints: Do not punish non static call users
  tracepoints: Remove unnecessary "data_args" macro parameter
  tracing: Do not create "enable" or "filter" files for ftrace event subsystem
  kernel: trace: preemptirq_delay_test: add cpu affinity
  tracepoint: Do not fail unregistering a probe due to memory failure
  ...