summaryrefslogtreecommitdiff
path: root/arch/s390/kernel/early.c
AgeCommit message (Collapse)Author
2025-07-09s390/early: Copy last breaking event address to pt_regsHeiko Carstens
In case of an early crash the early program check handler also prints the last breaking event address which is contained within the pt_regs structure. However it is not initialized, and therefore a more or less random value is printed in case of a crash. Copy the last breaking event address from lowcore to pt_regs in case of an early program check to address this. This also makes it easier to analyze early crashes. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-06-26s390/nmi: Print additional informationHeiko Carstens
In case of an unrecoverable machine check only the machine check interrupt code is printed to the console before the machine is stopped. This makes root cause analysis sometimes hard. Print additional machine check information to make analysis easier. The output now looks like this: Unrecoverable machine check, code: 00400F5F4C3B0000 6.16.0-rc2-11605-g987a9431e53a-dirty HW: IBM 3931 A01 704 (z/VM 7.4.0) PSW: 0706C00180000000 000003FFE0F0462E PFX: 0000000000070000 LBA: 000003FFE0F0462A EDC: 0000000000000000 FSA: 0000000000000000 CRS: 0080000014966A12 0000000087CB41C7 0000000000BFF140 0000000000000000 000000000000FFFF 0000000000BFF140 0000000071000000 0000000087CB41C7 0000000000008000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 00000000024C0007 00000000DB000000 0000000000BFF000 GPRS: FFFFFFFF00000000 000003FFE0F0462E E10EA4F489F897A6 0000000000000000 7FFFFFF2C0413C4C 000003FFE19B7010 0000000000000000 0000000000000000 0000000000000000 00000001F76B3380 000003FFE15D4050 0000000000000005 0000000000000000 0000000000070000 000003FFE0F0586C 0000037FE00B7DA0 System stopped Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-03-31s390/asm-offsets: Remove ASM_OFFSETS_CHeiko Carstens
Remove ASM_OFFSETS_C which is used as guard in thread_info.h to decide if asm-offsets can be included or not. There is no reason to include asm-offsets.h in thread_info.h anymore. Remove the define and the not needed include. Explicitly include asm-offsets.h in all header files which require it, and where it used to be included implicitly via thread_info.h. This reduces header dependencies. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/alternatives: Add debug functionalityHeiko Carstens
Similar to x86 and loongarch add a "debug-alternative" command line parameter, which allows for alternative debugging. The parameter itself comes with architecture specific semantics: "debug-alternative" -> print debug message for every single alternative "debug-alternative=0;2" -> print debug message for all alternatives with type 0 and 2 "debug-alternative=0:0-7" -> print debug message for all alternatives with type 0 which have a facility number within the range of 0-7 "debug-alternative=0:!8;1" -> print debug message for all alternatives with type 0, for all facility numbers, except facility 8, and in addition print all alternatives with type 1 A defconfig build currently results in a kernel with more than 20.000 alternatives, where the majority is for the niai alternative (spinlocks), and the relocated lowcore alternative. The following kernel command like options limit alternative debug output, and enable dynamic debug messages: debug-alternative=0:!49;1:!0 earlyprintk bootdebug ignore_loglevel loglevel=8 dyndbg="file alternative.c +p" This results in output like this: alt: [0/ 11] 0000021b9ce8680c: c0f400000089 -> c00400000000 alt: [0/ 64] 0000021b9ce87e60: c0f400000043 -> c00400000000 alt: [0/133] 0000021b9ce88c56: c0f400000027 -> c00400000000 alt: [0/ 74] 0000021b9ce89410: c0f40000002a -> c00400000000 alt: [0/ 40] 0000021b9dc3720a: 47000000 -> b280d398 alt: [0/193] 0000021b9dc37306: 47000000 -> b201d2b0 alt: [0/193] 0000021b9dc37354: c00400000000 -> d20720c0d2b0 alt: [1/ 5] 0000038d720d7bf2: c0f400000016 -> c00400000000 With [<alternative type>/<alternative data>] <address> oldcode -> newcode Alternative data depends on the alternative type: for type 0 (ALT_TYPE_FACILITY) data is the facility. For type 1 (ALT_TYPE_FEATURE) data is the corresponding machine feature. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/setup: Add decompressor_handled_param() wrapperHeiko Carstens
Make decompressor_handled_param() a wrapper for __decompressor_handled_param(). __decompressor_handled_param() now takes two parameters: a function name and a parameter name, which do not necessarily match. This allows to use characters like "-", which are not allowed in function names, for command line parameters which are handled by the decompressor and should be ignored by the kernel. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/vx: Convert cpu_has_vx() to cpu feature functionHeiko Carstens
Instead of having a private cpu_has_vx() implementation use the new common cpu feature method. Move the facility detection to the decompressor so it matches all other cpu features. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390: Convert MACHINE_IS_[LPAR|VM|KVM], etc, machine_is_[lpar|vm|kvm]()Heiko Carstens
Move machine type detection to the decompressor and use static branches to implement and use machine_is_[lpar|vm|kvm]() instead of a runtime check via MACHINE_IS_[LPAR|VM|KVM]. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/diag: Convert MACHINE_HAS_DIAG9C to machine_has_diag9c()Heiko Carstens
Use static branch(es) to implement and use machine_has_diag9c() instead of a runtime check via MACHINE_HAS_DIAG9C. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/tx: Convert MACHINE_HAS_TE to machine_has_tx()Heiko Carstens
Use static branch(es) to implement and use machine_has_tx() instead of a runtime check with MACHINE_HAS_TE. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/time: Convert MACHINE_HAS_SCC to machine_has_scc()Heiko Carstens
Use static branch(es) to implement and use machine_has_scc() instead of a runtime check via MACHINE_HAS_SCC. This comes with a cleanup of early time initialization: the initial tod_clock_base value is now passed via the bootdata mechanism, instead of using absolute lowcore as transport vehicle from the decompressor to the kernel. Also the early tod clock initialization is moved to the decompressor which allows to use a static branch with machine_has_scc() within the kernel. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/pci: Get rid of MACHINE_HAS_PCI_MIOHeiko Carstens
Remove MACHINE_FLAG_PCI_MIO/MACHINE_HAS_PCI_MIO and implement the identical functionality with set_machine_feature(), clear_machine_feature() and test_machine_feature(). Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_IDTE to cpu_has_idte()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_EDAT2 to cpu_has_edat2()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_EDAT1 to cpu_has_edat1()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_TOPOLOGY to cpu_has_topology()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_TLB_LC to cpu_has_tlb_lc()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_NX to cpu_has_nx()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_GS to cpu_has_gs()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_RDP to cpu_has_rdp()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04s390/cpufeature: Convert MACHINE_HAS_SEQ_INSN to cpu_has_seq_insn()Heiko Carstens
Convert MACHINE_HAS_... to cpu_has_...() which uses test_facility() instead of testing the machine_flags lowcore member if the feature is present. test_facility() generates better code since it results in a static branch without accessing memory. The branch is patched via alternatives by the decompressor depending on the availability of the required facility. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-01-26s390: Use pr_info for "KernelAddressSanitizer initialized" messageVasily Gorbik
sclp_early_printk() ignores boot console debug settings and prints unconditionally. It also prints message without any timestamp or formatting. Convert it to pr_info(). Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26s390/boot: Add bootdebug option to control debug messagesVasily Gorbik
Suppress decompressor debug messages by default, similar to regular kernel debug messages that require 'DEBUG' or 'dyndbg' to be enabled (depending on CONFIG_DYNAMIC_DEBUG). Introduce a 'bootdebug' option to enable printing these messages when needed. All messages are still stored in the boot ring buffer regardless. To enable boot debug messages: bootdebug debug Or combine with 'earlyprintk' to print them without delay: bootdebug debug earlyprintk Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-08-29s390/setup: Recognize sequential instruction fetching facilityVasily Gorbik
When sequential instruction fetching facility is present, certain guarantees are provided for code patching. In particular, atomic overwrites within 8 aligned bytes is safe from an instruction-fetching point of view. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-22s390/early: Dump register contents and call trace for early crashesHeiko Carstens
If the early program check handler cannot resolve a program check dump register contents and a call trace to the console before loading a disabled wait psw. This makes debugging much easier. Emit an extra message with early_printk() for cases where regular printk() via the early console is not yet working so that at least some information is available. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-22s390/early: Add __init to __do_early_pgm_check()Heiko Carstens
__do_early_pgm_check() is a function which is only needed during early setup code. Mark it __init in order to save a few bytes. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/traps: Handle early warnings gracefullyHeiko Carstens
Add missing warning handling to the early program check handler. This way a warning is printed to the console as soon as the early console is setup, and the kernel continues to boot. Before this change a disabled wait psw was loaded instead and the machine was silently stopped without giving an idea about what happened. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/entry: Make early program check handler relocated lowcore awareHeiko Carstens
Add the missing pieces so the early program check handler also works with a relocated lowcore. Right now the result of an early program check in case of a relocated lowcore would be a program check loop. Fixes: 8f1e70adb1a3 ("s390/boot: Add cmdline option to relocate lowcore") Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390: Add infrastructure to patch lowcore accessesSven Schnelle
The s390 architecture defines two special per-CPU data pages called the "prefix area". In s390-linux terminology this is usually called "lowcore". This memory area contains system configuration data like old/new PSW's for system call/interrupt/machine check handlers and lots of other data. It is normally mapped to logical address 0. This area can only be accessed when in supervisor mode. This means that kernel code can dereference NULL pointers, because accesses to address 0 are allowed. Parts of lowcore can be write protected, but read accesses and write accesses outside of the write protected areas are not caught. To remove this limitation for debugging and testing, remap lowcore to another address and define a function get_lowcore() which simply returns the address where lowcore is mapped at. This would normally introduce a pointer dereference (=memory read). As lowcore is used for several very often used variables, add code to patch this function during runtime, so we avoid the memory reads. For C code get_lowcore() has to be used, for assembly code it is the GET_LC macro. When using this macro/function a reference is added to alternative patching. All these locations will be patched to the actual lowcore location when the kernel is booted or a module is loaded. To make debugging/bisecting problems easier, this patch adds all the infrastructure but the lowcore address is still hardwired to 0. This way the code can be converted on a per function basis, and the functionality is enabled in a patch after all the functions have been converted. Note that this requires at least z16 because the old lpsw instruction only allowed a 12 bit displacement. z16 introduced lpswey which allows 20 bits (signed), so the lowcore can effectively be mapped from address 0 - 0x7e000. To use 0x7e000 as address, a 6 byte lgfi instruction would have to be used in the alternative. To save two bytes, llilh can be used, but this only allows to set bits 16-31 of the address. In order to use the llilh instruction, use 0x70000 as alternative lowcore address. This is still large enough to catch NULL pointer dereferences into large arrays. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/alternatives: Remove alternative facility listHeiko Carstens
The alternative and the normal facility list are always identical. Remove the alternative facility list, which allows to simplify the alternatives code. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/nospec: Push down alternative handlingHeiko Carstens
The nospec implementation is deeply integrated into the alternatives code: only for nospec an alternative facility list is implemented and used by the alternative code, while it is modified by nospec specific needs. Push down the nospec alternative handling into the nospec by introducing a new alternative type and a specific nospec callback to decide if alternatives should be applied. Also introduce a new global nobp variable which together with facility 82 can be used to decide if nobp is enabled or not. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-06-18s390: Replace S390_lowcore by get_lowcore()Sven Schnelle
Replace all S390_lowcore usages in arch/s390/ by get_lowcore(). Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-02-16s390/fpu: move, rename, and merge header filesHeiko Carstens
Move, rename, and merge the fpu and vx header files. This way fpu header files have a consistent naming scheme (fpu*.h). Also get rid of the fpu subdirectory and move header files to asm directory, so that all fpu and vx header files can be found at the same location. Merge internal.h header file into other header files, since the internal helpers are used at many locations. so those helper functions are really not internal. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-12s390/acrs: cleanup access register handlingHeiko Carstens
save_access_regs() and restore_access_regs() are only available by including switch_to.h. This is done by a couple of C files, which have nothing to do with switch_to(), but only need these functions. Move both functions to a new header file and improve the implementation: - Get rid of typedef - Add memory access instrumentation support - Use long displacement instructions lamy/stamy instead of lam/stam - all current users end up with better code because of this Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-12-11s390/fpu: get rid of MACHINE_HAS_VXHeiko Carstens
Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx() which is a short readable wrapper for "test_facility(129)". Facility bit 129 is set if the vector facility is present. test_facility() returns also true for all bits which are set in the architecture level set of the cpu that the kernel is compiled for. This means that test_facility(129) is a compile time constant which returns true for z13 and later, since the vector facility bit is part of the z13 kernel ALS. In result the compiled code will have less runtime checks, and less code. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11s390/fpu: remove "novx" optionHeiko Carstens
Remove the "novx" kernel command line option: the vector code runs without any problems since many years. Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-05s390/cmma: move parsing of cmma kernel parameter to early boot codeHeiko Carstens
The "cmma=" kernel command line parameter needs to be parsed early for upcoming changes. Therefore move the parsing code. Note that EX_TABLE handling of cmma_test_essa() needs to be open-coded, since the early boot code doesn't have infrastructure for handling expected exceptions. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390: use control register bit definesHeiko Carstens
Use control register bit defines instead of plain numbers where possible. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/early: use system_ctl_set_bit() instead of local_ctl_set_bit()Heiko Carstens
Use system_ctl_set_bit() instead of local_ctl_set_bit() to reflect that the control register changes are supposed to be global. This change is just for documentation purposes, since it still results only in local control register contents being changed. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/ctlreg: add local and system prefix to some functionsHeiko Carstens
Add local and system prefix to some functions to clarify they change control register contents on either the local CPU or the on all CPUs. This results in the following API: Two defines which load and save multiple control registers. The defines correlate with the following C prototypes: void __local_ctl_load(unsigned long *, unsigned int cr_low, unsigned int cr_high); void __local_ctl_store(unsigned long *, unsigned int cr_low, unsigned int cr_high); Two functions which locally set or clear one bit for a specified control register: void local_ctl_set_bit(unsigned int cr, unsigned int bit); void local_ctl_clear_bit(unsigned int cr, unsigned int bit); Two functions which set or clear one bit for a specified control register on all CPUs: void system_ctl_set_bit(unsigned int cr, unsigned int bit); void system_ctl_clear_bit(unsigend int cr, unsigned int bit); Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/smp,mcck: fix early IPI handlingHeiko Carstens
Both the external call as well as the emergency signal submask bits in control register 0 are set before any interrupt handler is registered. Change the order and first register the interrupt handler and only then enable the interrupts by setting the corresponding bits in control register 0. This prevents that the second part of the machine check handler for early machine check handling is not executed: the machine check handler sends an IPI to the CPU it runs on. If the corresponding interrupts are enabled, but no interrupt handler is present, the interrupt is ignored. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-08-30s390/mm: simplify kernel mapping setupHeiko Carstens
The kernel mapping is setup in two stages: in the decompressor map all pages with RWX permissions, and within the kernel change all mappings to their final permissions, where most of the mappings are changed from RWX to RWNX. Change this and map all pages RWNX from the beginning, however without enabling noexec via control register modification. This means that effectively all pages are used with RWX permissions like before. When the final permissions have been applied to the kernel mapping enable noexec via control register modification. This allows to remove quite a bit of non-obvious code. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-30s390: remove "noexec" optionHeiko Carstens
Do the same like x86 with commit 76ea0025a214 ("x86/cpu: Remove "noexec"") and remove the "noexec" kernel command line option. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20s390/kasan: move shadow mapping to decompressorVasily Gorbik
Since regular paging structs are initialized in decompressor already move KASAN shadow mapping to decompressor as well. This helps to avoid allocating KASAN required memory in 1 large chunk, de-duplicate paging structs creation code and start the uncompressed kernel with KASAN instrumentation right away. This also allows to avoid all pitfalls accidentally calling KASAN instrumented code during KASAN initialization. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20s390/boot: remove non-functioning image bootable checkVasily Gorbik
check_image_bootable() has been introduced with commit 627c9b62058e ("s390/boot: block uncompressed vmlinux booting attempts") to make sure that users don't try to boot uncompressed vmlinux ELF image in qemu. It used to be possible quite some time ago. That commit prevented confusion with uncompressed vmlinux image starting to boot and even printing kernel messages until it crashed. Users might have tried to report the problem without realizing they are doing something which was not intended. Since commit f1d3c5323772 ("s390/boot: move sclp early buffer from fixed address in asm to C") check_image_bootable() doesn't function properly anymore, as well as booting uncompressed vmlinux image in qemu doesn't really produce any output and crashes. Moving forward it doesn't make sense to fix check_image_bootable() anymore, so simply remove it. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-27s390/setup: do not complain about parameters handled in decompressorVasily Gorbik
Currently there are several kernel command line parameters which are only parsed and handled in decompressor and not known to the kernel. This leads to the following error message during kernel boot: Unknown kernel command line parameters "mem=3G nokaslr", will be passed to user space. To avoid confusion, register those parameters with an empty stub so that kernel does not complain about them. Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-14s390/mm: add support for RDP (Reset DAT-Protection)Gerald Schaefer
RDP instruction allows to reset DAT-protection bit in a PTE, with less CPU synchronization overhead than IPTE instruction. In particular, IPTE can cause machine-wide synchronization overhead, and excessive IPTE usage can negatively impact machine performance. RDP can be used instead of IPTE, if the new PTE only differs in SW bits and _PAGE_PROTECT HW bit, for PTE protection changes from RO to RW. SW PTE bit changes are allowed, e.g. for dirty and young tracking, but none of the other HW-defined part of the PTE must change. This is because the architecture forbids such changes to an active and valid PTE, which is why invalidation with IPTE is always used first, before writing a new entry. The RDP optimization helps mainly for fault-driven SW dirty-bit tracking. Writable PTEs are initially always mapped with HW _PAGE_PROTECT bit set, to allow SW dirty-bit accounting on first write protection fault, where the DAT-protection would then be reset. The reset is now done with RDP instead of IPTE, if RDP instruction is available. RDP cannot always guarantee that the DAT-protection reset is propagated to all CPUs immediately. This means that spurious TLB protection faults on other CPUs can now occur. For this, common code provides a flush_tlb_fix_spurious_fault() handler, which will now be used to do a CPU-local TLB flush. However, this will clear the whole TLB of a CPU, and not just the affected entry. For more fine-grained flushing, by simply doing a (local) RDP again, flush_tlb_fix_spurious_fault() would need to also provide the PTE pointer. Note that spurious TLB protection faults cannot really be distinguished from racing pagetable updates, where another thread already installed the correct PTE. In such a case, the local TLB flush would be unnecessary overhead, but overall reduction of CPU synchronization overhead by not using IPTE is still expected to be beneficial. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-13s390/mm: start kernel with DAT enabledAlexander Gordeev
The setup of the kernel virtual address space is spread throughout the sources, boot stages and config options like this: 1. The available physical memory regions are queried and stored as mem_detect information for later use in the decompressor. 2. Based on the physical memory availability the virtual memory layout is established in the decompressor; 3. If CONFIG_KASAN is disabled the kernel paging setup code populates kernel pgtables and turns DAT mode on. It uses the information stored at step [1]. 4. If CONFIG_KASAN is enabled the kernel early boot kasan setup populates kernel pgtables and turns DAT mode on. It uses the information stored at step [1]. The kasan setup creates early_pg_dir directory and directly overwrites swapper_pg_dir entries to make shadow memory pages available. Move the kernel virtual memory setup to the decompressor and start the kernel with DAT turned on right from the very first istruction. That completely eliminates the boot phase when the kernel runs in DAT-off mode, simplies the overall design and consolidates pgtables setup. The identity mapping is created in the decompressor, while kasan shadow mappings are still created by the early boot kernel code. Share with decompressor the existing kasan memory allocator. It decreases the size of a newly requested memory block from pgalloc_pos and ensures that kernel image is not overwritten. pgalloc_low and pgalloc_pos pointers are made preserved boot variables for that. Use the bootdata infrastructure to setup swapper_pg_dir and invalid_pg_dir directories used by the kernel later. The interim early_pg_dir directory established by the kasan initialization code gets eliminated as result. As the kernel runs in DAT-on mode only the PSW_KERNEL_BITS define gets PSW_MASK_DAT bit by default. Additionally, the setup_lowcore_dat_off() and setup_lowcore_dat_on() routines get merged, since there is no DAT-off mode stage anymore. The memory mappings are created with RW+X protection that allows the early boot code setting up all necessary data and services for the kernel being booted. Just before the paging is enabled the memory protection is changed to RO+X for text, RO+NX for read-only data and RW+NX for kernel data and the identity mapping. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-13s390/early: fix sclp_early_sccb variable lifetimeAlexander Gordeev
Commit ada1da31ce34 ("s390/sclp: sort out physical vs virtual pointers usage") fixed the notion of virtual address for sclp_early_sccb pointer. However, it did not take into account that kasan_early_init() can also output messages and sclp_early_sccb should be adjusted by the time kasan_early_init() is called. Currently it is not a problem, since virtual and physical addresses on s390 are the same. Nevertheless, should they ever differ, this would cause an invalid pointer access. Fixes: ada1da31ce34 ("s390/sclp: sort out physical vs virtual pointers usage") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-08-30s390: move from strlcpy with unused retval to strscpyWolfram Sang
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220818205948.6360-1-wsa+renesas@sang-engineering.com Link: https://lore.kernel.org/r/20220818210102.7301-1-wsa+renesas@sang-engineering.com [gor@linux.ibm.com: squashed two changes linked above together] Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-05-25s390: simplify early program check handlerHeiko Carstens
Due to historic reasons the base program check handler calls a configurable function. Given that there is only the early program check handler left, simplify the code by directly calling that function. The only other user was removed with commit d485235b0054 ("s390: assume diag308 set always works"). Also rename all functions and the asm file to reflect this. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>