summaryrefslogtreecommitdiff
path: root/arch/s390/kernel
AgeCommit message (Collapse)Author
2022-01-23s390/nmi: handle vector validity failures for KVM guestsChristian Borntraeger
The machine check validity bit tells about the context. If a KVM guest was running the bit tells about the guest validity and the host state is not affected. As a guest can disable the guest validity this might result in unwanted host errors on machine checks. Cc: stable@vger.kernel.org Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest") Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-23s390/nmi: handle guarded storage validity failures for KVM guestsChristian Borntraeger
machine check validity bits reflect the state of the machine check. If a guest does not make use of guarded storage, the validity bit might be off. We can not use the host CR bit to decide if the validity bit must be on. So ignore "invalid" guarded storage controls for KVM guests in the host and rely on the machine check being forwarded to the guest. If no other errors happen from a host perspective everything is fine and no process must be killed and the host can continue to run. Cc: stable@vger.kernel.org Fixes: c929500d7a5a ("s390/nmi: s390: New low level handling for machine check happening in guest") Reported-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Tested-by: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-21Merge tag 's390-5.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - add Sven Schnelle as reviewer for s390 code - make uaccess code more readable - change cpu measurement facility code to also support counter second version number 7, and add discard support for limited samples * tag 's390-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: add Sven Schnelle as reviewer s390/uaccess: introduce bit field for OAC specifier s390/cpumf: Support for CPU Measurement Sampling Facility LS bit s390/cpumf: Support for CPU Measurement Facility CSVN 7
2022-01-17s390/cpumf: Support for CPU Measurement Sampling Facility LS bitThomas Richter
Adds support for the CPU Measurement Sampling Facility limit sampling bit in the sampling device driver. Limited samples have no valueable information are not collected. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-17s390/cpumf: Support for CPU Measurement Facility CSVN 7Thomas Richter
Adds support for the CPU Measurement Counter Facility second version number 7. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-17Merge branch 'signal-for-v5.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal/exit/ptrace updates from Eric Biederman: "This set of changes deletes some dead code, makes a lot of cleanups which hopefully make the code easier to follow, and fixes bugs found along the way. The end-game which I have not yet reached yet is for fatal signals that generate coredumps to be short-circuit deliverable from complete_signal, for force_siginfo_to_task not to require changing userspace configured signal delivery state, and for the ptrace stops to always happen in locations where we can guarantee on all architectures that the all of the registers are saved and available on the stack. Removal of profile_task_ext, profile_munmap, and profile_handoff_task are the big successes for dead code removal this round. A bunch of small bug fixes are included, as most of the issues reported were small enough that they would not affect bisection so I simply added the fixes and did not fold the fixes into the changes they were fixing. There was a bug that broke coredumps piped to systemd-coredump. I dropped the change that caused that bug and replaced it entirely with something much more restrained. Unfortunately that required some rebasing. Some successes after this set of changes: There are few enough calls to do_exit to audit in a reasonable amount of time. The lifetime of struct kthread now matches the lifetime of struct task, and the pointer to struct kthread is no longer stored in set_child_tid. The flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is removed. Issues where task->exit_code was examined with signal->group_exit_code should been examined were fixed. There are several loosely related changes included because I am cleaning up and if I don't include them they will probably get lost. The original postings of these changes can be found at: https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org I trimmed back the last set of changes to only the obviously correct once. Simply because there was less time for review than I had hoped" * 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits) ptrace/m68k: Stop open coding ptrace_report_syscall ptrace: Remove unused regs argument from ptrace_report_syscall ptrace: Remove second setting of PT_SEIZED in ptrace_attach taskstats: Cleanup the use of task->exit_code exit: Use the correct exit_code in /proc/<pid>/stat exit: Fix the exit_code for wait_task_zombie exit: Coredumps reach do_group_exit exit: Remove profile_handoff_task exit: Remove profile_task_exit & profile_munmap signal: clean up kernel-doc comments signal: Remove the helper signal_group_exit signal: Rename group_exit_task group_exec_task coredump: Stop setting signal->group_exit_task signal: Remove SIGNAL_GROUP_COREDUMP signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process signal: Make coredump handling explicit in complete_signal signal: Have prepare_signal detect coredumps using signal->core_state signal: Have the oom killer detect coredumps using signal->core_state exit: Move force_uaccess back into do_exit exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit ...
2022-01-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...
2022-01-15mm/mempolicy: wire up syscall set_mempolicy_home_nodeAneesh Kumar K.V
Link: https://lkml.kernel.org/r/20211202123810.267175-4-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15mm: defer kmemleak object creation of module_alloc()Kefeng Wang
Yongqiang reports a kmemleak panic when module insmod/rmmod with KASAN enabled(without KASAN_VMALLOC) on x86[1]. When the module area allocates memory, it's kmemleak_object is created successfully, but the KASAN shadow memory of module allocation is not ready, so when kmemleak scan the module's pointer, it will panic due to no shadow memory with KASAN check. module_alloc __vmalloc_node_range kmemleak_vmalloc kmemleak_scan update_checksum kasan_module_alloc kmemleak_ignore Note, there is no problem if KASAN_VMALLOC enabled, the modules area entire shadow memory is preallocated. Thus, the bug only exits on ARCH which supports dynamic allocation of module area per module load, for now, only x86/arm64/s390 are involved. Add a VM_DEFER_KMEMLEAK flags, defer vmalloc'ed object register of kmemleak in module_alloc() to fix this issue. [1] https://lore.kernel.org/all/6d41e2b9-4692-5ec4-b1cd-cbe29ae89739@huawei.com/ [wangkefeng.wang@huawei.com: fix build] Link: https://lkml.kernel.org/r/20211125080307.27225-1-wangkefeng.wang@huawei.com [akpm@linux-foundation.org: simplify ifdefs, per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com Link: https://lkml.kernel.org/r/20211124142034.192078-1-wangkefeng.wang@huawei.com Fixes: 793213a82de4 ("s390/kasan: dynamic shadow mem allocation for modules") Fixes: 39d114ddc682 ("arm64: add KASAN support") Fixes: bebf56a1b176 ("kasan: enable instrumentation of global variables") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reported-by: Yongqiang Liu <liuyongqiang13@huawei.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-10Merge tag 's390-5.17-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: "Besides all the small improvements and cleanups the most notable part is the fast vector/SIMD implementation of the ChaCha20 stream cipher, which is an adaptation of Andy Polyakov's code for the kernel. Summary: - add fast vector/SIMD implementation of the ChaCha20 stream cipher, which mainly adapts Andy Polyakov's code for the kernel - add status attribute to AP queue device so users can easily figure out its status - fix race in page table release code, and and lots of documentation - remove uevent suppress from cio device driver, since it turned out that it generated more problems than it solved problems - quite a lot of virtual vs physical address confusion fixes - various other small improvements and cleanups all over the place" * tag 's390-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits) s390/dasd: use default_groups in kobj_type s390/sclp_sd: use default_groups in kobj_type s390/pci: simplify __pciwb_mio() inline asm s390: remove unused TASK_SIZE_OF s390/crash_dump: fix virtual vs physical address handling s390/crypto: fix compile error for ChaCha20 module s390/mm: check 2KB-fragment page on release s390/mm: better annotate 2KB pagetable fragments handling s390/mm: fix 2KB pgtable release race s390/sclp: release SCLP early buffer after kernel initialization s390/nmi: disable interrupts on extended save area update s390/zcrypt: CCA control CPRB sending s390/disassembler: update opcode table s390/uv: fix memblock virtual vs physical address confusion s390/smp: fix memblock_phys_free() vs memblock_free() confusion s390/sclp: fix memblock_phys_free() vs memblock_free() confusion s390/exit: remove dead reference to do_exit from copy_thread s390/ap: add missing virt_to_phys address conversion s390/pgalloc: use pointers instead of unsigned long values s390/pgalloc: add virt/phys address handling to base asce functions ...
2022-01-10Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - KCSAN enabled for arm64. - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD. - Some more SVE clean-ups and refactoring in preparation for SME support (scalable matrix extensions). - BTI clean-ups (SYM_FUNC macros etc.) - arm64 atomics clean-up and codegen improvements. - HWCAPs for FEAT_AFP (alternate floating point behaviour) and FEAT_RPRESS (increased precision of reciprocal estimate and reciprocal square root estimate). - Use SHA3 instructions to speed-up XOR. - arm64 unwind code refactoring/unification. - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP == 1 (potentially set by a hypervisor; user-space already does this). - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU, Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups. - Other fixes and clean-ups; highlights: fix the handling of erratum 1418040, correct the calculation of the nomap region boundaries, introduce io_stop_wc() mapped to the new DGH instruction (data gathering hint). * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits) arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64: Enable KCSAN kselftest/arm64: Add pidbench for floating point syscall cases arm64/fp: Add comments documenting the usage of state restore functions kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME ...
2021-12-20s390/crash_dump: fix virtual vs physical address handlingHeiko Carstens
Signal processor STORE STATUS requires a physical address where register contents are supposed to be written to, however the kernel must read the data via the corresponding virtual address. Also the allocated save_area, where register contents are copied to, resides in virtual address space. Fix this by using proper __pa() conversion, or correct memblock_alloc() invocation. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16s390/nmi: disable interrupts on extended save area updateAlexander Gordeev
Updating of the pointer to machine check extended save area on the IPL CPU needs the lowcore protection to be disabled. Disable interrupts while the protection is off to avoid unnoticed writes to the lowcore. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16s390/disassembler: update opcode tableHeiko Carstens
Sync with binutils: update opcode table to reflect the instruction format update of the lpswey instruction, and add the qpaci instruction. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16s390/uv: fix memblock virtual vs physical address confusionHeiko Carstens
memblock_alloc_try_nid() returns a virtual address, however in error case the allocated memory is incorrectly freed with memblock_phys_free(). Properly use memblock_free() instead, and pass a physical address to uv_init() to fix this. Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16s390/smp: fix memblock_phys_free() vs memblock_free() confusionHeiko Carstens
memblock_phys_free() is used on a virtual address. Fix this by using memblock_free(). Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16s390/exit: remove dead reference to do_exit from copy_threadEric W. Biederman
My s390 assembly is not particularly good so I have read the history of the reference to do_exit copy_thread and have been able to verify that do_exit is not used. The general argument is that s390 has been changed to use the generic kernel_thread and kernel_execve and the generic versions do not call do_exit. So it is strange to see a do_exit reference sitting there. The history of the do_exit reference in s390's version of copy_thread seems conclusive that the do_exit reference is something that lingers and should have been removed several years ago. Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.") the s390 code made a call to the exit(2) system call when a kernel thread finished. Then kernel_thread_starter was added which branched directly to the value in register 11 when the kernel thread finshed. The value in register 11 was set in kernel_thread to "regs.gprs[11] = (unsigned long) do_exit" In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into ret_from_fork()") kernel_thread_starter was moved into entry.S and entry64.S unchanged (except for the syntax differences between inline assemly and in the assembly file). In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the assignment to "gprs[11]" was moved into copy_thread from the old kernel_thread. The helper kernel_thread_starter was still being used and was still branching to "%r11" at the end. In commit 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") kernel_thread_starter was changed to unconditionally branch to sysc_tracenogo instead to %r11 which held the value of do_exit. Unfortunately copy_thread was not updated to stop passing do_exit in "gprs[11]". In commit 56e62a737028 ("s390: convert to generic entry") kernel_thread_starter was replaced by __ret_from_fork. And the code still continued to pass do_exit in "gprs[11]" despite __ret_from_fork not caring in the slightest. Remove this dead reference to do_exit to make it clear that s390 is not doing anything with do_exit in copy_thread. History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Link: https://lore.kernel.org/r/20211208202532.16409-1-ebiederm@xmission.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-13exit: Add and use make_task_dead.Eric W. Biederman
There are two big uses of do_exit. The first is it's design use to be the guts of the exit(2) system call. The second use is to terminate a task after something catastrophic has happened like a NULL pointer in kernel code. Add a function make_task_dead that is initialy exactly the same as do_exit to cover the cases where do_exit is called to handle catastrophic failure. In time this can probably be reduced to just a light wrapper around do_task_dead. For now keep it exactly the same so that there will be no behavioral differences introducing this new concept. Replace all of the uses of do_exit that use it for catastraphic task cleanup with make_task_dead to make it clear what the code is doing. As part of this rename rewind_stack_do_exit rewind_stack_and_make_dead. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13exit/s390: Remove dead reference to do_exit from copy_threadEric W. Biederman
My s390 assembly is not particularly good so I have read the history of the reference to do_exit copy_thread and have been able to verify that do_exit is not used. The general argument is that s390 has been changed to use the generic kernel_thread and kernel_execve and the generic versions do not call do_exit. So it is strange to see a do_exit reference sitting there. The history of the do_exit reference in s390's version of copy_thread seems conclusive that the do_exit reference is something that lingers and should have been removed several years ago. Up through 8d19f15a60be ("[PATCH] s390 update (1/27): arch.") the s390 code made a call to the exit(2) system call when a kernel thread finished. Then kernel_thread_starter was added which branched directly to the value in register 11 when the kernel thread finshed. The value in register 11 was set in kernel_thread to "regs.gprs[11] = (unsigned long) do_exit" In commit 37fe5d41f640 ("s390: fold kernel_thread_helper() into ret_from_fork()") kernel_thread_starter was moved into entry.S and entry64.S unchanged (except for the syntax differences between inline assemly and in the assembly file). In commit f9a7e025dfc2 ("s390: switch to generic kernel_thread()") the assignment to "gprs[11]" was moved into copy_thread from the old kernel_thread. The helper kernel_thread_starter was still being used and was still branching to "%r11" at the end. In commit 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") kernel_thread_starter was changed to unconditionally branch to sysc_tracenogo instead to %r11 which held the value of do_exit. Unfortunately copy_thread was not updated to stop passing do_exit in "gprs[11]". In commit 56e62a737028 ("s390: convert to generic entry") kernel_thread_starter was replaced by __ret_from_fork. And the code still continued to pass do_exit in "gprs[11]" despite __ret_from_fork not caring in the slightest. Remove this dead reference to do_exit to make it clear that s390 is not doing anything with do_exit in copy_thread. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 30dcb0996e40 ("s390: switch to saner kernel_execve() semantics") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-12s390/entry: fix duplicate tracking of irq nesting levelSven Schnelle
In the current code, when exiting from idle, rcu_irq_enter() is called twice during irq entry: irq_entry_enter()-> rcu_irq_enter() irq_enter() -> rcu_irq_enter() This may lead to wrong results from rcu_is_cpu_rrupt_from_idle() because of a wrong dynticks nmi nesting count. Fix this by only calling irq_enter_rcu(). Cc: <stable@vger.kernel.org> # 5.12+ Reported-by: Mark Rutland <mark.rutland@arm.com> Fixes: 56e62a737028 ("s390: convert to generic entry") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-10s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()Alexander Egorenkov
Starting with gcc 11.3, the C compiler will generate PLT-relative function calls even if they are local and do not require it. Later on during linking, the linker will replace all PLT-relative calls to local functions with PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is not being linked as a regular executable or shared library would have been, and therefore, all PLT-relative addresses remain in the generated purgatory object code unresolved. This leads to the situation where the purgatory code is being executed during kdump with all PLT-relative addresses unresolved. And this results in endless loops within the purgatory code. Furthermore, the clang C compiler has always behaved like described above and this commit should fix kdump for kernels built with the latter. Because the purgatory code is no regular executable or shared library, contains only calls to local functions and has no PLT, all R_390_PLT32DBL relocation entries can be resolved just like a R_390_PC32DBL one. * https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699 Relocation entries of purgatory code generated with gcc 11.3 ------------------------------------------------------------ $ readelf -r linux/arch/s390/purgatory/purgatory.o Relocation section '.rela.text' at offset 0x370 contains 5 entries: Offset Info Type Sym. Value Sym. Name + Addend 00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2 00000000007a 000d00000014 R_390_PLT32DBL 0000000000000000 sha256_update + 2 00000000008c 000e00000014 R_390_PLT32DBL 0000000000000000 sha256_final + 2 000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2 0000000000a0 000f00000014 R_390_PLT32DBL 0000000000000000 memcmp + 2 Relocation entries of purgatory code generated with gcc 11.2 ------------------------------------------------------------ $ readelf -r linux/arch/s390/purgatory/purgatory.o Relocation section '.rela.text' at offset 0x368 contains 5 entries: Offset Info Type Sym. Value Sym. Name + Addend 00000000005c 000c00000013 R_390_PC32DBL 0000000000000000 purgatory_sha_regions + 2 00000000007a 000d00000013 R_390_PC32DBL 0000000000000000 sha256_update + 2 00000000008c 000e00000013 R_390_PC32DBL 0000000000000000 sha256_final + 2 000000000092 000800000013 R_390_PC32DBL 0000000000000000 .LC0 + 2 0000000000a0 000f00000013 R_390_PC32DBL 0000000000000000 memcmp + 2 Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reported-by: Tao Liu <ltao@redhat.com> Suggested-by: Philipp Rudo <prudo@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211209073817.82196-1-egorenar@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-10s390/ftrace: remove preempt_disable()/preempt_enable() pairJerome Marchand
It looks like commit ce5e48036c9e76a2 ("ftrace: disable preemption when recursion locked") missed a spot in kprobe_ftrace_handler() in arch/s390/kernel/ftrace.c. Remove the superfluous preempt_disable/enable_notrace() there too. Fixes: ce5e48036c9e76a2 ("ftrace: disable preemption when recursion locked") Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Link: https://lore.kernel.org/r/20211208151503.1510381-1-jmarchan@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-10s390/kexec_file: fix error handling when applying relocationsPhilipp Rudo
arch_kexec_apply_relocations_add currently ignores all errors returned by arch_kexec_do_relocs. This means that every unknown relocation is silently skipped causing unpredictable behavior while the relocated code runs. Fix this by checking for errors and fail kexec_file_load if an unknown relocation type is encountered. The problem was found after gcc changed its behavior and used R_390_PLT32DBL relocations for brasl instruction and relied on ld to resolve the relocations in the final link in case direct calls are possible. As the purgatory code is only linked partially (option -r) ld didn't resolve the relocations leaving them for arch_kexec_do_relocs. But arch_kexec_do_relocs doesn't know how to handle R_390_PLT32DBL relocations so they were silently skipped. This ultimately caused an endless loop in the purgatory as the brasl instructions kept branching to itself. Fixes: 71406883fd35 ("s390/kexec_file: Add kexec_file_load system call") Reported-by: Tao Liu <ltao@redhat.com> Signed-off-by: Philipp Rudo <prudo@redhat.com> Link: https://lore.kernel.org/r/20211208130741.5821-3-prudo@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-10s390/kexec_file: print some more error messagesPhilipp Rudo
Be kind and give some more information on what went wrong. Signed-off-by: Philipp Rudo <prudo@redhat.com> Link: https://lore.kernel.org/r/20211208130741.5821-2-prudo@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-10arch: Make ARCH_STACKWALK independent of STACKTRACEPeter Zijlstra
Make arch_stack_walk() available for ARCH_STACKWALK architectures without it being entangled in STACKTRACE. Link: https://lore.kernel.org/lkml/20211022152104.356586621@infradead.org/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [Mark: rebase, drop unnecessary arm change] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20211129142849.3056714-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-06s390/nmi: add missing __pa/__va address conversion of extended save areaHeiko Carstens
Add missing __pa/__va address conversion of machine check extended save area designation, which is an absolute address. Note: this currently doesn't fix a real bug, since virtual addresses are indentical to physical ones. Reported-by: Vineeth Vijayan <vneethv@linux.ibm.com> Tested-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-20Merge tag 's390-5.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add missing Kconfig option for ftrace direct multi sample, so it can be compiled again, and also add s390 support for this sample. - Update Christian Borntraeger's email address. - Various fixes for memory layout setup. Besides other this makes it possible to load shared DCSS segments again. - Fix copy to user space of swapped kdump oldmem. - Remove -mstack-guard and -mstack-size compile options when building vdso binaries. This can happen when CONFIG_VMAP_STACK is disabled and results in broken vdso code which causes more or less random exceptions. Also remove the not needed -nostdlib option. - Fix memory leak on cpu hotplug and return code handling in kexec code. - Wire up futex_waitv system call. - Replace snprintf with sysfs_emit where appropriate. * tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: ftrace/samples: add s390 support for ftrace direct multi sample ftrace/samples: add missing Kconfig option for ftrace direct multi sample MAINTAINERS: update email address of Christian Borntraeger s390/kexec: fix memory leak of ipl report buffer s390/kexec: fix return code handling s390/dump: fix copying to user-space of swapped kdump oldmem s390: wire up sys_futex_waitv system call s390/vdso: filter out -mstack-guard and -mstack-size s390/vdso: remove -nostdlib compiler flag s390: replace snprintf in show functions with sysfs_emit s390/boot: simplify and fix kernel memory layout setup s390/setup: re-arrange memblock setup s390/setup: avoid using memblock_enforce_memory_limit s390/setup: avoid reserving memory above identity mapping
2021-11-19signal: Replace force_fatal_sig with force_exit_sig when in doubtEric W. Biederman
Recently to prevent issues with SECCOMP_RET_KILL and similar signals being changed before they are delivered SA_IMMUTABLE was added. Unfortunately this broke debuggers[1][2] which reasonably expect to be able to trap synchronous SIGTRAP and SIGSEGV even when the target process is not configured to handle those signals. Add force_exit_sig and use it instead of force_fatal_sig where historically the code has directly called do_exit. This has the implementation benefits of going through the signal exit path (including generating core dumps) without the danger of allowing userspace to ignore or change these signals. This avoids userspace regressions as older kernels exited with do_exit which debuggers also can not intercept. In the future is should be possible to improve the quality of implementation of the kernel by changing some of these force_exit_sig calls to force_fatal_sig. That can be done where it matters on a case-by-case basis with careful analysis. Reported-by: Kyle Huey <me@kylehuey.com> Reported-by: kernel test robot <oliver.sang@intel.com> [1] https://lkml.kernel.org/r/CAP045AoMY4xf8aC_4QU_-j7obuEPYgTcnQQP3Yxk=2X90jtpjw@mail.gmail.com [2] https://lkml.kernel.org/r/20211117150258.GB5403@xsang-OptiPlex-9020 Fixes: 00b06da29cf9 ("signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed") Fixes: a3616a3c0272 ("signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die") Fixes: 83a1f27ad773 ("signal/powerpc: On swapcontext failure force SIGSEGV") Fixes: 9bc508cf0791 ("signal/s390: Use force_sigsegv in default_trap_handler") Fixes: 086ec444f866 ("signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig") Fixes: c317d306d550 ("signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails") Fixes: 695dd0d634df ("signal/x86: In emulate_vsyscall force a signal instead of calling do_exit") Fixes: 1fbd60df8a85 ("signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved.") Fixes: 941edc5bf174 ("exit/syscall_user_dispatch: Send ordinary signals on failure") Link: https://lkml.kernel.org/r/871r3dqfv8.fsf_-_@email.froward.int.ebiederm.org Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Kees Cook <keescook@chromium.org> Tested-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-11-18s390/kexec: fix memory leak of ipl report bufferBaoquan He
unreferenced object 0x38000195000 (size 4096): comm "kexec", pid 8548, jiffies 4294953647 (age 32443.270s) hex dump (first 32 bytes): 00 00 00 c8 20 00 00 00 00 00 00 c0 02 80 00 00 .... ........... 40 40 40 40 40 40 40 40 00 00 00 00 00 00 00 00 @@@@@@@@........ backtrace: [<0000000011a2f199>] __vmalloc_node_range+0xc0/0x140 [<0000000081fa2752>] vzalloc+0x5a/0x70 [<0000000063a4c92d>] ipl_report_finish+0x2c/0x180 [<00000000553304da>] kexec_file_add_ipl_report+0xf4/0x150 [<00000000862d033f>] kexec_file_add_components+0x124/0x160 [<000000000d2717bb>] arch_kexec_kernel_image_load+0x62/0x90 [<000000002e0373b6>] kimage_file_alloc_init+0x1aa/0x2e0 [<0000000060f2d14f>] __do_sys_kexec_file_load+0x17c/0x2c0 [<000000008c86fe5a>] __s390x_sys_kexec_file_load+0x40/0x50 [<000000001fdb9dac>] __do_syscall+0x1bc/0x1f0 [<000000003ee4258d>] system_call+0x78/0xa0 Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Philipp Rudo <prudo@redhat.com> Fixes: 99feaa717e55 ("s390/kexec_file: Create ipl report and pass to next kernel") Cc: <stable@vger.kernel.org> # v5.2: 20c76e242e70: s390/kexec: fix return code handling Cc: <stable@vger.kernel.org> # v5.2 Link: https://lore.kernel.org/r/20211116033101.GD21646@MiWiFi-R3L-srv Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-18s390/kexec: fix return code handlingHeiko Carstens
kexec_file_add_ipl_report ignores that ipl_report_finish may fail and can return an error pointer instead of a valid pointer. Fix this and simplify by returning NULL in case of an error and let the only caller handle this case. Fixes: 99feaa717e55 ("s390/kexec_file: Create ipl report and pass to next kernel") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-18s390/dump: fix copying to user-space of swapped kdump oldmemAlexander Egorenkov
This commit fixes a bug introduced by commit e9e7870f90e3 ("s390/dump: introduce boot data 'oldmem_data'"). OLDMEM_BASE was mistakenly replaced by oldmem_data.size instead of oldmem_data.start. This bug caused the following error during kdump: kdump.sh[878]: No program header covering vaddr 0x3434f5245found kexec bug? Fixes: e9e7870f90e3 ("s390/dump: introduce boot data 'oldmem_data'") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390: wire up sys_futex_waitv system callVasily Gorbik
Tested with futex kselftests. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/vdso: filter out -mstack-guard and -mstack-sizeSven Schnelle
When CONFIG_VMAP_STACK is disabled, the user can enable CONFIG_STACK_CHECK, which adds a stack overflow check to each C function in the kernel. This is also done for functions in the vdso page. These functions are run in user context and user stack sizes are usually different to what the kernel uses. This might trigger the stack check although the stack size is valid. Therefore filter the -mstack-guard and -mstack-size flags when compiling vdso C files. Cc: stable@kernel.org # 5.10+ Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO") Reported-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/vdso: remove -nostdlib compiler flagMasahiro Yamada
The -nostdlib option requests the compiler to not use the standard system startup files or libraries when linking. It is effective only when $(CC) is used as a linker driver. Since commit 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO"), $(LD) is directly used, hence -nostdlib is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20211107162111.323701-1-masahiroy@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/setup: re-arrange memblock setupVasily Gorbik
- Avoid using ULONG_MAX in memblock_remove, it has no functional change but makes memblock_dbg output a range which makes sense. - Actually finish memblock memory setup before doing amode31/cr/uv setup. - Move memblock_dump_all() debug output after memblock memory setup is complete. This gives us final "memory" regions if they were trimmed due to addressing limits and still "physmem" regions as original info which came from mem_detect. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/setup: avoid using memblock_enforce_memory_limitVasily Gorbik
There is a difference in how architectures treat "mem=" option. For some that is an amount of online memory, for s390 and x86 this is the limiting max address. Some memblock api like memblock_enforce_memory_limit() take limit argument and explicitly treat it as the size of online memory, and use __find_max_addr to convert it to an actual max address. Current s390 usage: memblock_enforce_memory_limit(memblock_end_of_DRAM()); yields different results depending on presence of memory holes (offline memory blocks in between online memory). If there are no memory holes limit == max_addr in memblock_enforce_memory_limit() and it does trim online memory and reserved memory regions. With memory holes present it actually does nothing. Since we already use memblock_remove() explicitly to trim online memory regions to potential limit (think mem=, kdump, addressing limits, etc.) drop the usage of memblock_enforce_memory_limit() altogether. Trimming reserved regions should not be required, since we now use memblock_set_current_limit() to limit allocations and any explicit memory reservations above the limit is an actual problem we should not hide. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16s390/setup: avoid reserving memory above identity mappingVasily Gorbik
Such reserved memory region, if not cleaned up later causes problems when memblock_free_all() is called to release free pages to the buddy allocator and those reserved regions are carried over to reserve_bootmem_region() which marks the pages as PageReserved. Instead use memblock_set_current_limit() to make sure memblock allocations do not go over identity mapping (which could happen when "mem=" option is used or during kdump). Cc: stable@vger.kernel.org Fixes: 73045a08cf55 ("s390: unify identity mapping limits handling") Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-13Merge tag 's390-5.16-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Add PCI automatic error recovery. - Fix tape driver timer initialization broken during timers api cleanup. - Fix bogus CPU measurement counters values on CPUs offlining. - Check the validity of subchanel before reading other fields in the schib in cio code. * tag 's390-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: check the subchannel validity for dev_busid s390/cpumf: cpum_cf PMU displays invalid value after hotplug remove s390/tape: fix timer initialization in tape_std_assign() s390/pci: implement minimal PCI error recovery PCI: Export pci_dev_lock() s390/pci: implement reset_slot for hotplug slot s390/pci: refresh function handle in iomap
2021-11-10Merge branch 'exit-cleanups-for-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exit cleanups from Eric Biederman: "While looking at some issues related to the exit path in the kernel I found several instances where the code is not using the existing abstractions properly. This set of changes introduces force_fatal_sig a way of sending a signal and not allowing it to be caught, and corrects the misuse of the existing abstractions that I found. A lot of the misuse of the existing abstractions are silly things such as doing something after calling a no return function, rolling BUG by hand, doing more work than necessary to terminate a kernel thread, or calling do_exit(SIGKILL) instead of calling force_sig(SIGKILL). In the review a deficiency in force_fatal_sig and force_sig_seccomp where ptrace or sigaction could prevent the delivery of the signal was found. I have added a change that adds SA_IMMUTABLE to change that makes it impossible to interrupt the delivery of those signals, and allows backporting to fix force_sig_seccomp And Arnd found an issue where a function passed to kthread_run had the wrong prototype, and after my cleanup was failing to build." * 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) soc: ti: fix wkup_m3_rproc_boot_thread return type signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV) exit/r8188eu: Replace the macro thread_exit with a simple return 0 exit/rtl8712: Replace the macro thread_exit with a simple return 0 exit/rtl8723bs: Replace the macro thread_exit with a simple return 0 signal/x86: In emulate_vsyscall force a signal instead of calling do_exit signal/sparc32: In setup_rt_frame and setup_fram use force_fatal_sig signal/sparc32: Exit with a fatal signal when try_to_clear_window_buffer fails exit/syscall_user_dispatch: Send ordinary signals on failure signal: Implement force_fatal_sig exit/kthread: Have kernel threads return instead of calling do_exit signal/s390: Use force_sigsegv in default_trap_handler signal/vm86_32: Properly send SIGSEGV when the vm86 state cannot be saved. signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON signal/sparc: In setup_tsb_params convert open coded BUG into BUG signal/powerpc: On swapcontext failure force SIGSEGV signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL) signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT signal/sparc32: Remove unreachable do_exit in do_sparc_fault ...
2021-11-08s390/cpumf: cpum_cf PMU displays invalid value after hotplug removeThomas Richter
When a CPU is hotplugged while the perf stat -e cycles command is running, a wrong (very large) value is displayed immediately after the CPU removal: Check the values, shouldn't be too high as in time counts unit events 1.001101919 29261846 cycles 2.002454499 17523405 cycles 3.003659292 24361161 cycles 4.004816983 18446744073638406144 cycles 5.005671647 <not counted> cycles ... The CPU hotplug off took place after 3 seconds. The issue is the read of the event count value after 4 seconds when the CPU is not available and the read of the counter returns an error. This is treated as a counter value of zero. This results in a very large value (0 - previous_value). Fix this by detecting the hotplugged off CPU and report 0 instead of a very large number. Cc: stable@vger.kernel.org Fixes: a029a4eab39e ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility") Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-11-06Merge tag 's390-5.16-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for ftrace with direct call and ftrace direct call samples. - Add support for kernel command lines longer than current 896 bytes and make its length configurable. - Add support for BEAR enhancement facility to improve last breaking event instruction tracking. - Add kprobes sanity checks and testcases to prevent kprobe in the mid of an instruction. - Allow concurrent access to /dev/hwc for the CPUMF users. - Various ftrace / jump label improvements. - Convert unwinder tests to KUnit. - Add s390_iommu_aperture kernel parameter to tweak the limits on concurrently usable DMA mappings. - Add ap.useirq AP module option which can be used to disable interrupt use. - Add add_disk() error handling support to block device drivers. - Drop arch specific and use generic implementation of strlcpy and strrchr. - Several __pa/__va usages fixes. - Various cio, crypto, pci, kernel doc and other small fixes and improvements all over the code. [ Merge fixup as per https://lore.kernel.org/all/YXAqZ%2FEszRisunQw@osiris/ ] * tag 's390-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (63 commits) s390: make command line configurable s390: support command lines longer than 896 bytes s390/kexec_file: move kernel image size check s390/pci: add s390_iommu_aperture kernel parameter s390/spinlock: remove incorrect kernel doc indicator s390/string: use generic strlcpy s390/string: use generic strrchr s390/ap: function rework based on compiler warning s390/cio: make ccw_device_dma_* more robust s390/vfio-ap: s390/crypto: fix all kernel-doc warnings s390/hmcdrv: fix kernel doc comments s390/ap: new module option ap.useirq s390/cpumf: Allow multiple processes to access /dev/hwc s390/bitops: return true/false (not 1/0) from bool functions s390: add support for BEAR enhancement facility s390: introduce nospec_uses_trampoline() s390: rename last_break to pgm_last_break s390/ptrace: add last_break member to pt_regs s390/sclp: sort out physical vs virtual pointers usage s390/setup: convert start and end initrd pointers to virtual ...
2021-11-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "257 patches. Subsystems affected by this patch series: scripts, ocfs2, vfs, and mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache, gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools, memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm, vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram, cleanups, kfence, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits) mm/damon: remove return value from before_terminate callback mm/damon: fix a few spelling mistakes in comments and a pr_debug message mm/damon: simplify stop mechanism Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Docs/admin-guide/mm/damon/start: simplify the content Docs/admin-guide/mm/damon/start: fix a wrong link Docs/admin-guide/mm/damon/start: fix wrong example commands mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on mm/damon: remove unnecessary variable initialization Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) selftests/damon: support watermarks mm/damon/dbgfs: support watermarks mm/damon/schemes: activate schemes based on a watermarks mechanism tools/selftests/damon: update for regions prioritization of schemes mm/damon/dbgfs: support prioritization weights mm/damon/vaddr,paddr: support pageout prioritization mm/damon/schemes: prioritize regions within the quotas mm/damon/selftests: support schemes quotas mm/damon/dbgfs: support quotas of schemes ...
2021-11-06memblock: allow to specify flags with memblock_add_node()David Hildenbrand
We want to specify flags when hotplugging memory. Let's prepare to pass flags to memblock_add_node() by adjusting all existing users. Note that when hotplugging memory the system is already up and running and we might have concurrent memblock users: for example, while we're hotplugging memory, kexec_file code might search for suitable memory regions to place kexec images. It's important to add the memory directly to memblock via a single call with the right flags, instead of adding the memory first and apply flags later: otherwise, concurrent memblock users might temporarily stumble over memblocks with wrong flags, which will be important in a follow-up patch that introduces a new flag to properly handle add_memory_driver_managed(). Link: https://lkml.kernel.org/r/20211004093605.5830-4-david@redhat.com Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Shahab Vahedi <shahab@synopsys.com> [arch/arc] Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jianyong Wu <Jianyong.Wu@arm.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06memblock: drop memblock_free_early_nid() and memblock_free_early()Mike Rapoport
memblock_free_early_nid() is unused and memblock_free_early() is an alias for memblock_free(). Replace calls to memblock_free_early() with calls to memblock_free() and remove memblock_free_early() and memblock_free_early_nid(). Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "ARM: - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us RISC-V: - New KVM port. x86: - New API to control TSC offset from userspace - TSC scaling for nested hypervisors on SVM - Switch masterclock protection from raw_spin_lock to seqcount - Clean up function prototypes in the page fault code and avoid repeated memslot lookups - Convey the exit reason to userspace on emulation failure - Configure time between NX page recovery iterations - Expose Predictive Store Forwarding Disable CPUID leaf - Allocate page tracking data structures lazily (if the i915 KVM-GT functionality is not compiled in) - Cleanups, fixes and optimizations for the shadow MMU code s390: - SIGP Fixes - initial preparations for lazy destroy of secure VMs - storage key improvements/fixes - Log the guest CPNC Starting from this release, KVM-PPC patches will come from Michael Ellerman's PPC tree" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) RISC-V: KVM: fix boolreturn.cocci warnings RISC-V: KVM: remove unneeded semicolon RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions RISC-V: KVM: Factor-out FP virtualization into separate sources KVM: s390: add debug statement for diag 318 CPNC data KVM: s390: pv: properly handle page flags for protected guests KVM: s390: Fix handle_sske page fault handling KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol KVM: x86: On emulation failure, convey the exit reason, etc. to userspace KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info KVM: x86: Clarify the kvm_run.emulation_failure structure layout KVM: s390: Add a routine for setting userspace CPU state KVM: s390: Simplify SIGP Set Arch handling KVM: s390: pv: avoid stalls when making pages secure KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm KVM: s390: pv: avoid double free of sida page KVM: s390: pv: add macros for UVC CC values s390/mm: optimize reset_guest_reference_bit() s390/mm: optimize set_guest_storage_key() s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present ...
2021-11-01Merge tag 'audit-pr-20211101' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "Add some additional audit logging to capture the openat2() syscall open_how struct info. Previous variations of the open()/openat() syscalls allowed audit admins to inspect the syscall args to get the information contained in the new open_how struct used in openat2()" * tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: return early if the filter rule has a lower priority audit: add OPENAT2 record to list "how" info audit: add support for the openat2 syscall audit: replace magic audit syscall class numbers with macros lsm_audit: avoid overloading the "key" audit field audit: Convert to SPDX identifier audit: rename struct node to struct audit_node to prevent future name collisions
2021-11-01Merge tag 'trace-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - kprobes: Restructured stack unwinder to show properly on x86 when a stack dump happens from a kretprobe callback. - Fix to bootconfig parsing - Have tracefs allow owner and group permissions by default (only denying others). There's been pressure to allow non root to tracefs in a controlled fashion, and using groups is probably the safest. - Bootconfig memory managament updates. - Bootconfig clean up to have the tools directory be less dependent on changes in the kernel tree. - Allow perf to be traced by function tracer. - Rewrite of function graph tracer to be a callback from the function tracer instead of having its own trampoline (this change will happen on an arch by arch basis, and currently only x86_64 implements it). - Allow multiple direct trampolines (bpf hooks to functions) be batched together in one synchronization. - Allow histogram triggers to add variables that can perform calculations against the event's fields. - Use the linker to determine architecture callbacks from the ftrace trampoline to allow for proper parameter prototypes and prevent warnings from the compiler. - Extend histogram triggers to key off of variables. - Have trace recursion use bit magic to determine preempt context over if branches. - Have trace recursion disable preemption as all use cases do anyway. - Added testing for verification of tracing utilities. - Various small clean ups and fixes. * tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits) tracing/histogram: Fix semicolon.cocci warnings tracing/histogram: Fix documentation inline emphasis warning tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together tracing: Show size of requested perf buffer bootconfig: Initialize ret in xbc_parse_tree() ftrace: do CPU checking after preemption disabled ftrace: disable preemption when recursion locked tracing/histogram: Document expression arithmetic and constants tracing/histogram: Optimize division by a power of 2 tracing/histogram: Covert expr to const if both operands are constants tracing/histogram: Simplify handling of .sym-offset in expressions tracing: Fix operator precedence for hist triggers expression tracing: Add division and multiplication support for hist triggers tracing: Add support for creating hist trigger variables from literal selftests/ftrace: Stop tracing while reading the trace file by default MAINTAINERS: Update KPROBES and TRACING entries test_kprobes: Move it from kernel/ to lib/ docs, kprobes: Remove invalid URL and add new reference samples/kretprobes: Fix return value if register_kretprobe() failed lib/bootconfig: Fix the xbc_get_info kerneldoc ...
2021-10-29signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV)Eric W. Biederman
Now that force_fatal_sig exists it is unnecessary and a bit confusing to use force_sigsegv in cases where the simpler force_fatal_sig is wanted. So change every instance we can to make the code clearer. Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Link: https://lkml.kernel.org/r/877de7jrev.fsf@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-29signal/s390: Use force_sigsegv in default_trap_handlerEric W. Biederman
Reading the history it is unclear why default_trap_handler calls do_exit. It is not even menthioned in the commit where the change happened. My best guess is that because it is unknown why the exception happened it was desired to guarantee the process never returned to userspace. Using do_exit(SIGSEGV) has the problem that it will only terminate one thread of a process, leaving the process in an undefined state. Use force_sigsegv(SIGSEGV) instead which effectively has the same behavior except that is uses the ordinary signal mechanism and terminates all threads of a process and is generally well defined. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Fixes: ca2ab03237ec ("[PATCH] s390: core changes") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lkml.kernel.org/r/20211020174406.17889-11-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>