summaryrefslogtreecommitdiff
path: root/arch/s390/kernel/perf_cpum_sf.c
AgeCommit message (Collapse)Author
2025-06-17s390: Remove unnecessary include <linux/export.h>Heiko Carstens
Remove include <linux/export.h> from all files which do not contain an EXPORT_SYMBOL(). See commit 7d95680d64ac ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1") for more details. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-05-21s390/perf: Remove driver-specific throttle supportKan Liang
The throttle support has been added in the generic code. Remove the driver-specific throttle support. Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250520181644.2673067-8-kan.liang@linux.intel.com
2025-04-09s390/cpumf: Fix double free on error in cpumf_pmu_event_init()Thomas Richter
In PMU event initialization functions - cpumsf_pmu_event_init() - cpumf_pmu_event_init() - cfdiag_event_init() the partially created event had to be removed when an error was detected. The event::event_init() member function had to release all resources it allocated in case of error. event::destroy() had to be called on freeing an event after it was successfully created and event::event_init() returned success. With commit c70ca298036c ("perf/core: Simplify the perf_event_alloc() error path") this is not necessary anymore. The performance subsystem common code now always calls event::destroy() to clean up the allocated resources created during event initialization. Remove the event::destroy() invocation in PMU event initialization or that function is called twice for each event that runs into an error condition in event creation. This is the kernel log entry which shows up without the fix: ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 0 PID: 43388 at lib/refcount.c:87 refcount_dec_not_one+0x74/0x90 CPU: 0 UID: 0 PID: 43388 Comm: perf Not tainted 6.15.0-20250407.rc1.git0.300.fc41.s390x+git #1 NONE Hardware name: IBM 3931 A01 704 (LPAR) Krnl PSW : 0704c00180000000 00000209cb2c1b88 (refcount_dec_not_one+0x78/0x90) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000020900000027 0000020900000023 0000000000000026 0000018900000000 00000004a2200a00 0000000000000000 0000000000000057 ffffffffffffffea 00000002b386c600 00000002b3f5b3e0 00000209cc51f140 00000209cc7fc550 0000000001449d38 ffffffffffffffff 00000209cb2c1b84 00000189d67dfb80 Krnl Code: 00000209cb2c1b78: c02000506727 larl %r2,00000209cbcce9c6 00000209cb2c1b7e: c0e5ffbd4431 brasl %r14,00000209caa6a3e0 #00000209cb2c1b84: af000000 mc 0,0 >00000209cb2c1b88: a7480001 lhi %r4,1 00000209cb2c1b8c: ebeff0a00004 lmg %r14,%r15,160(%r15) 00000209cb2c1b92: ec243fbf0055 risbg %r2,%r4,63,191,0 00000209cb2c1b98: 07fe bcr 15,%r14 00000209cb2c1b9a: 47000700 bc 0,1792 Call Trace: [<00000209cb2c1b88>] refcount_dec_not_one+0x78/0x90 [<00000209cb2c1dc4>] refcount_dec_and_mutex_lock+0x24/0x90 [<00000209caa3c29e>] hw_perf_event_destroy+0x2e/0x80 [<00000209cacaf8b4>] __free_event+0x74/0x270 [<00000209cacb47c4>] perf_event_alloc.part.0+0x4a4/0x730 [<00000209cacbf3e8>] __do_sys_perf_event_open+0x248/0xc20 [<00000209cacc14a4>] __s390x_sys_perf_event_open+0x44/0x50 [<00000209cb8114de>] __do_syscall+0x12e/0x260 [<00000209cb81ce34>] system_call+0x74/0x98 Last Breaking-Event-Address: [<00000209caa6a4d2>] __warn_printk+0xf2/0x100 ---[ end trace 0000000000000000 ]--- Fixes: c70ca298036c ("perf/core: Simplify the perf_event_alloc() error path") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-12-09perf/core: Export perf_exclude_event()Namhyung Kim
While at it, rename the same function in s390 cpum_sf PMU. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20241203180441.1634709-2-namhyung@kernel.org
2024-11-21s390/cpum_sf: Simplify release of SDBs and SDBTsThomas Richter
Free_sampling_buffer() releases the Sampling Data Buffers (SDBs) and Sampling Data Buffer Table (SDBTs) allocated at event initialization. Both buffers are of PAGE_SIZE bytes. Each SDBT consists of 512 entries. The first 511 entries point to SDBs the last entry points to a successor SDBT. The last SDBT in the list points to the origin of all SDBTs. SDBTs do not contain holes, that is an entry always points to a SDB. If less than 511 SDBs have been allocation, the last entry points to the origin SDBT. Simplify the release of the SDBs and SDBTs, walk along the SDBT chain, release SDBs and SDBTs and stop when reaching the origin again. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12s390/perf_cpum_sf: Convert to use try_cmpxchg128()Heiko Carstens
Convert cmpxchg128() usages to try_cmpxchg128() in order to generate slightly better code. Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-31s390/cpum_sf: Fix and protect memory allocation of SDBs with mutexThomas Richter
Reservation of the PMU hardware is done at first event creation and is protected by a pair of mutex_lock() and mutex_unlock(). After reservation of the PMU hardware the memory required for the PMUs the event is to be installed on is allocated by allocate_buffers() and alloc_sampling_buffer(). This done outside of the mutex protection. Without mutex protection two or more concurrent invocations of perf_event_init() may run in parallel. This can lead to allocation of Sample Data Blocks (SDBs) multiple times for the same PMU. Prevent this and protect memory allocation of SDBs by mutex. Fixes: 8a6fe8f21ec4 ("s390/cpum_sf: Use refcount_t instead of atomic_t") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Rework call to sf_disable()Thomas Richter
Setup_pmc_cpu() function body consists of one single switch statement with two cases PMC_INIT and PMC_RELEASE. In both of these cases sf_disable() is invoked to turn off the CPU Measurement sampling facility. Move sf_disable() out of the switch statement. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Handle CPU hotplug remove during samplingThomas Richter
CPU hotplug remove handling triggers the following function call sequence: CPUHP_AP_PERF_S390_SF_ONLINE --> s390_pmu_sf_offline_cpu() ... CPUHP_AP_PERF_ONLINE --> perf_event_exit_cpu() The s390 CPUMF sampling CPU hotplug handler invokes: s390_pmu_sf_offline_cpu() +--> cpusf_pmu_setup() +--> setup_pmc_cpu() +--> deallocate_buffers() This function de-allocates all sampling data buffers (SDBs) allocated for that CPU at event initialization. It also clears the PMU_F_RESERVED bit. The CPU is gone and can not be sampled. With the event still being active on the removed CPU, the CPU event hotplug support in kernel performance subsystem triggers the following function calls on the removed CPU: perf_event_exit_cpu() +--> perf_event_exit_cpu_context() +--> __perf_event_exit_context() +--> __perf_remove_from_context() +--> event_sched_out() +--> cpumsf_pmu_del() +--> cpumsf_pmu_stop() +--> hw_perf_event_update() to stop and remove the event. During removal of the event, the sampling device driver tries to read out the remaining samples from the sample data buffers (SDBs). But they have already been freed (and may have been re-assigned). This may lead to a use after free situation in which case the samples are most likely invalid. In the best case the memory has not been reassigned and still contains valid data. Remedy this situation and check if the CPU is still in reserved state (bit PMU_F_RESERVED set). In this case the SDBs have not been released an contain valid data. This is always the case when the event is removed (and no CPU hotplug off occured). If the PMU_F_RESERVED bit is not set, the SDB buffers are gone. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Fix format string in pr_err()Thomas Richter
Fix format string in pr_err() and use the built-in hexadecimal prefix %#x to display a number with a leading hexadecimal indicator 0x. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Use sf_buffer_available()Thomas Richter
Use sf_buffer_available() consistently throughtout the code to test for the existence of sampling buffer. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Consistently use goto out for function exitThomas Richter
When the sampling buffer allocation fails in __hw_perf_event_init(), jump to the end of the function and return the result. This is consistent with the other error handling and return conditions in this function. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-By: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/cpum_sf: Do not re-enable event after deletionThomas Richter
Event delete removes an event from the event list, but common code invokes the PMU's enable function later on. This happens in event_sched_out() and leads to the following call sequence: event_sched_out() +--> cpumsf_pmu_del() +--> cpumsf_pmu_enable() In cpumsf_pmu_enable() return immediately when the event is not active. Also remove an unneeded if clause. That if() statement is only reached when flag PMU_F_IN_USE has been set in cpumsf_pmu_add(). And this function also sets cpuhw->event to a valid value. Remove WARN_ON_ONCE() statement which never triggered. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-10s390/cpum_sf: Set bit PMU_F_ENABLED enabled after lpp() invocationThomas Richter
Set PMU enabled bit after lpp() call. This ensures the proper task PID is loaded by the llp instruction into Program Parameter register from where it is copied to each sampling data buffer (SDB) sample entry after the sampling was enabled. The barrier() instruction is not needed as cpumsf_pmu_enable() changes a CPU specific variable. Only the CPU that task is running on changes structure members. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-08-22s390/cpum_sf: Remove WARN_ON_ONCE statementsThomas Richter
Remove WARN_ON_ONCE statements. These have not triggered in the past. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-22s390/cpum_sf: Rework debug_sprintf_event() messagesThomas Richter
Rework debug messages: - Remove most of the debug_sprintf_event() invocations. - Do not split string format statements - Remove colon after function name. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-21s390/cpum_sf: Ignore qsi() return codeThomas Richter
qsi() executes the instruction qsi (query sample information) and stores the result of the query in a sample information block pointed to by the function argument. The instruction does not change the condition code register. The return code is always zero. No need to check for errors. Remove now unreferenced macros PMC_FAILURE and RS_INIT_FAILURE_QSI. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-21s390/cpum_sf: Ignore lsctl() return code in sf_disable()Thomas Richter
sf_disable() returns the condition code of instruction lsctl (load sampling controls). However the parameter to lsctl() in sf_disable() is a sample control block containing all zeroes. This invocation of lsctl() does not fail and returns always zero even when there is no authorization for sampling on the machine. In short, sampling can be always turned off. Ignore the return code of sf_disable() and change the function return to void. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Use variable name cpuhw consistentlyThomas Richter
All functions but setup_pmc_cpu() use a local variable named cpuhw to refer to struct cpu_hw_sf. In setup_pmc_cpu() rename variable cpusf to cpuhw. This makes the naming scheme consistent with all other functions. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Define and initialize variableThomas Richter
Define and initialize a variable in one place. Remove space between cast and variable. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Use hwc as variable consistentlyThomas Richter
In hw_perf_event_update() and cpumsf_pmu_enable() use variable hwc consistently to access event's hardware related data. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Move defines from header file to source fileThomas Richter
Some defines in common header file arch/s390/include/asm/perf_event.h are only used in one source file arch/s390/kernel/perf_cpum_sf.c. Move these defines from header to source file. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Rename macro to consistent prefixThomas Richter
Rename macro SAMPLE_FREQ_MODE to SAMPL_FREQ_MODE to make its prefix consistent with all other macro starting with prefix SAMPL_XXX. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Remove unused defines REG_NONE and REG_OVERFLOWThomas Richter
Member hw_perf_event::reg.reg is set but never used, so remove it. Defines REG_NONE and REG_OVERFLOW are not referenced anymore. The initialization to zero takes place in function perf_event_alloc() where ... event = kmem_cache_alloc_node(perf_event_cache, GFP_KERNEL | __GFP_ZERO, node); ... makes sure memory allocated for the event is zero'ed. This is done in the kernel's common code in kernel/events/core.c The struct perf_event contains member hw_perf_event as in struct perf_event { .... struct hw_perf_event hw; .... }; This contained sub-structure is also initialized to zero. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/cpum_sf: Use refcount_t instead of atomic_tThomas Richter
Replace atomic_t by refcount_t for reference counting of events. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-06-18s390: Replace S390_lowcore by get_lowcore()Sven Schnelle
Replace all S390_lowcore usages in arch/s390/ by get_lowcore(). Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-07-04s390/cpum_sf: remove check on CPU being onlineThomas Richter
During sampling event initialization, a check is done if that particular CPU the event is to be installed on is actually online. This check is not necessary, as it is also performed in the system call entry point. Therefore remove this check. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-04s390/cpum_sf: handle casts consistentlyThomas Richter
The casts are written in two different notations: (cast) expression and (cast)expression Convert statements with the first notation to the second notation. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-04s390/cpum_sf: remove unnecessary debug statementThomas Richter
Remove debug_sprint_event() statement right after an pr_err() statement. No additional debug information is generated. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-04s390/cpum_sf: remove parameter in call to pr_errThomas Richter
The op argument is hardcoded in the parameter list of function pr_err. Make the op code part of the text printed by pr_err. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-04s390/cpum_sf: simplify function setup_pmu_cpuThomas Richter
Print the error message when the FAILURE flag is set. This saves on pr_err statement as the text of the error message is identical in both failures. Also observe reverse Xmas tree variable declarations in this function. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03s390: fix various typosHeiko Carstens
Fix various typos found with codespell. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03s390: include linux/io.h instead of asm/io.hHeiko Carstens
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes asm/io.h, so this shouldn't cause any problems. Instead this might help for some randconfig build errors which were reported due to some undefined io related functions. Also move the changed include so it stays grouped together with other includes from the same directory. For ctcm_mpc.c also remove not needed comments (actually questions). Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-28s390/mm: do not include <asm-generic/io.h> directlyBaoquan He
We should always include <asm/io.h> in ARCH, but not <asm-generic/io.h> directly. Otherwise, macro defined by ARCH won't be seen and could cause building error. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306100105.8GHnoMCP-lkp@intel.com/ Link: https://lore.kernel.org/all/ZIWrtFMUnRfVP5h0@MiWiFi-R3L-srv/ Signed-off-by: Baoquan He <bhe@redhat.com> [agordeev@linux.ibm.com changed patch description] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-05s390/cpum_sf: Convert to cmpxchg128()Peter Zijlstra
Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132324.058821078@infradead.org
2023-03-27s390/cpum_sf: remove flag PERF_CPUM_SF_FULL_BLOCKSThomas Richter
This flag is used to process only fully populated sampling buffers when an sampling event is stopped on a CPU. By default the last sampling buffer is also scanned for samples even if the sampling block full indicator is not set in the trailer entry of a sampling buffer page. This flag can be set via perf_event_attr::config1 field. It was never used and never documented. It is useless now. With PERF_CPUM_SF_FULL_BLOCKS: When a process is scheduled off the CPU, the sampling is stopped and the samples are copied to the perf ring buffer and marked invalid. When stopped at the last full sample buffer page (which is achieved with the PERF_CPUM_SF_FULL_BLOCKS options), the hardware sampling will resume at the first free sample entry in the current, partially filled sample buffer. Without PERF_CPUM_SF_FULL_BLOCKS (default behavior): The partially filled last sample buffer is scanned and valid samples are saved to the perf ring buffer. The valid samples are marked invalid. The sampling is resumed when the process is scheduled on this CPU. Again the hardware sampling will resume at the first free sample entry in the current, partially filled sample buffer. Now the next interrupt handler invocation scans the full sample block and saves the valid samples to the ring buffer. It omits the invalid samples at the top of the buffer. The default behavior is fully sufficient, therefore remove this feature. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-02-28s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchgHeiko Carstens
Use READ_ONCE_ALIGNED_128() to read the previous value in front of a 128-bit cmpxchg loop, instead of (mis-)using a 128-bit cmpxchg operation to do the same. This makes the code more readable and is faster. Link: https://lore.kernel.org/r/20230224100237.3247871-3-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-21Merge tag 's390-6.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Large cleanup of the con3270/tty3270 driver. Among others this fixes: - Background Color Support - ASCII Line Character Support - VT100 Support - Geometries other than 80x24 - Cleanup and improve cmpxchg() code. Also add cmpxchg_user_key() to uaccess functions, which will be used by KVM to access KVM guest memory with a specific storage key - Add support for user space events counting to CPUMF - Cleanup the vfio/ccw code, which also allows now to properly support 2K Format-2 IDALs - Move kernel page table allocation and initialization to decompressor, which finally allows to enter the kernel with dynamic address translation enabled. This in turn allows to get rid of code with special handling in the kernel, which has to distinguish if DAT is on or off - Replace kretprobe with rethook - Various improvements to vfio/ap queue resets: - Use TAPQ to verify completion of a reset in progress rather than multiple invocations of ZAPQ. - Check TAPQ response codes when verifying successful completion of ZAPQ. - Fix erroneous handling of some error response codes. - Increase the maximum amount of time to wait for successful completion of ZAPQ - Rework system call wrappers to get rid of alias functions, which were only left on s390 - Cleanup diag288_wdt watchdog driver. It has been agreed on with Guenter Roeck that this goes upstream via the s390 tree - Add missing loadparm parameter handling for list-directed ECKD ipl/reipl - Various improvements to memory detection code - Remove arch_cpu_idle_time() since the current implementation is broken, and allows user space observable accounted idle times which can temporarily decrease - Add Reset DAT-Protection support: (only) allow to change PTEs from RO to RW with a new RDP instruction. Unlike the currently used IPTE instruction, this does not necessarily guarantee that TLBs of all CPUs are synchronously flushed; and that remote CPUs can see spurious protection faults. The overall improvement for not requiring an all CPU synchronization, like it is required with IPTE, should be beneficial - Fix KFENCE page fault reporting - Smaller cleanups and improvement all over the place * tag 's390-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (182 commits) s390/irq,idle: simplify idle check s390/processor: add test_and_set_cpu_flag() and test_and_clear_cpu_flag() s390/processor: let cpu helper functions return boolean values s390/kfence: fix page fault reporting s390/zcrypt: introduce ctfm field in struct CPRBX s390: remove confusing comment from uapi types header file vfio/ccw: remove WARN_ON during shutdown s390/entry: remove toolchain dependent micro-optimization s390/mem_detect: do not truncate online memory ranges info s390/vx: remove __uint128_t type from __vector128 struct again s390/mm: add support for RDP (Reset DAT-Protection) s390/mm: define private VM_FAULT_* reasons from top bits Documentation: s390: correct spelling s390/ap: fix status returned by ap_qact() s390/ap: fix status returned by ap_aqic() s390: vfio-ap: tighten the NIB validity check Revert "s390/mem_detect: do not update output parameters on failure" s390/idle: remove arch_cpu_idle_time() and corresponding code s390/vx: use simple assignments to access __vector128 members s390/vx: add 64 and 128 bit members to __vector128 struct ...
2023-01-22s390/cpum_sf: diagnostic sampling buffer setup to handle virtual addressesThomas Richter
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers are organized as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB). SDBs contain the samples which are extracted and saved in the perf ring buffer. The SDBTs are chained using real addresses and refer to SDBs using real addresses. The diagnostic sampling setup uses buffers provided by the process which invokes perf_event_open system call. The buffers are memory mapped. The buffers have been allocated by the kernel event subsystem. Add proper virtual to phyiscal address translation to the buffer chaining. The current constraint which requires virtual equals real address layout is removed. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22s390/cpum_sf: rework macro AUX_SDB_NUM_xxxThomas Richter
Macro AUX_SDB_NUM() has three parameters. The first one is not used. Remove the first parameter. Also convert the macros to inline functions. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22s390/cpum_sf: sampling buffer setup to handle virtual addressesThomas Richter
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers are organized as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB). SDBs contain the samples which are extracted and saved in the perf ring buffer. The SDBTs are chained using real addresses and refer to SDBs using real addresses. Adds proper virtual to phyiscal address translation to the buffer chaining. The current constraint which requires virtual equals real address layout is removed. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22s390/cpum_sf: remove debug statements from function setup_pmc_cpuThomas Richter
Remove debug statements from function setup_pmc_cpu(). The debug statement displays a pointer value to a per cpu variable. This pointer value is printed nowhere else, so it has no use for cross reference. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22s390/cpum_sf: move functions from header file to source fileThomas Richter
Some inline helper functions are defined in a header file but used in only one source file. Move these functions to the source file. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-18perf/core: Introduce perf_prepare_header()Namhyung Kim
Factor out perf_prepare_header() so that it can call perf_prepare_sample() without a header if not needed. Also it checks the filtered_sample_type to avoid duplicate work when perf_prepare_sample() is called twice (or more). Suggested-by: Peter Zijlstr <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230118060559.615653-8-namhyung@kernel.org
2023-01-11s390/cpum_sf: add READ_ONCE() semantics to compare and swap loopsHeiko Carstens
The current cmpxchg_double() loops within the perf hw sampling code do not have READ_ONCE() semantics to read the old value from memory. This allows the compiler to generate code which reads the "old" value several times from memory, which again allows for inconsistencies. For example: /* Reset trailer (using compare-double-and-swap) */ do { te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK; te_flags |= SDB_TE_ALERT_REQ_MASK; } while (!cmpxchg_double(&te->flags, &te->overflow, te->flags, te->overflow, te_flags, 0ULL)); The compiler could generate code where te->flags used within the cmpxchg_double() call may be refetched from memory and which is not necessarily identical to the previous read version which was used to generate te_flags. Which in turn means that an incorrect update could happen. Fix this by adding READ_ONCE() semantics to all cmpxchg_double() loops. Given that READ_ONCE() cannot generate code on s390 which atomically reads 16 bytes, use a private compare-and-swap-double implementation to achieve that. Also replace cmpxchg_double() with the private implementation to be able to re-use the old value within the loops. As a side effect this converts the whole code to only use bit fields to read and modify bits within the hws trailer header. Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.ibm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/linux-s390/Y71QJBhNTIatvxUT@osiris/T/#ma14e2a5f7aa8ed4b94b6f9576799b3ad9c60f333 Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-17s390/cpumf: Support for CPU Measurement Sampling Facility LS bitThomas Richter
Adds support for the CPU Measurement Sampling Facility limit sampling bit in the sampling device driver. Limited samples have no valueable information are not collected. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-02-13s390/time: convert tod_clock_base to unionHeiko Carstens
Convert tod_clock_base to union tod_clock. This simplifies quite a bit of code and also fixes a bug in read_persistent_clock64(); void read_persistent_clock64(struct timespec64 *ts) { __u64 delta; delta = initial_leap_seconds + TOD_UNIX_EPOCH; get_tod_clock_ext(clk); *(__u64 *) &clk[1] -= delta; if (*(__u64 *) &clk[1] > delta) clk[0]--; ext_to_timespec64(clk, ts); } Assume &clk[1] == 3 and delta == 2; then after the substraction the if condition becomes true and the epoch part of the clock is decremented by one because of an assumed overflow, even though there is none. Fix this by using 128 bit arithmetics and let the compiler do the right thing: void read_persistent_clock64(struct timespec64 *ts) { union tod_clock clk; u64 delta; delta = initial_leap_seconds + TOD_UNIX_EPOCH; store_tod_clock_ext(&clk); clk.eitod -= delta; ext_to_timespec64(&clk, ts); } Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-11-17Merge tag 's390-5.10-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - fix system call exit path; avoid return to user space with any TIF/CIF/PIF set - fix file permission for cpum_sfb_size parameter - another small defconfig update * tag 's390-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_sf.c: fix file permission for cpum_sfb_size s390: update defconfigs s390: fix system call exit path
2020-11-12s390/cpum_sf.c: fix file permission for cpum_sfb_sizeThomas Richter
This file is installed by the s390 CPU Measurement sampling facility device driver to export supported minimum and maximum sample buffer sizes. This file is read by lscpumf tool to display the details of the device driver capabilities. The lscpumf tool might be invoked by a non-root user. In this case it does not print anything because the file contents can not be read. Fix this by allowing read access for all users. Reading the file contents is ok, changing the file contents is left to the root user only. For further reference and details see: [1] https://github.com/ibm-s390-tools/s390-tools/issues/97 Fixes: 69f239ed335a ("s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur") Cc: <stable@vger.kernel.org> # 3.14 Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09perf: Reduce stack usage of perf_output_begin()Peter Zijlstra
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_output_begin() must already have a perf_sample_data on-stack. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org