summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-03-04powerpc/irq: Use current_stack_pointer in check_stack_overflow()Christophe Leroy
The purpose of check_stack_overflow() is to verify that the stack has not overflowed. To really know whether the stack pointer is still within boundaries, the check must be done directly on the value of r1. So use current_stack_pointer, which returns the current value of r1, rather than current_stack_frame() which causes a frame to be created and then returns that value. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-3-mpe@ellerman.id.au
2020-03-04powerpc: Add current_stack_pointer as a register globalChristophe Leroy
current_stack_frame() doesn't return the stack pointer, but the caller's stack frame. See commit bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a function not a define") and commit acf620ecf56c ("powerpc: Rename __get_SP() to current_stack_pointer()") for details. In some cases this is overkill or incorrect, as it doesn't return the current value of r1. So add a current_stack_pointer register global to get the value of r1 directly. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Split out of other patch, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220115141.2707-2-mpe@ellerman.id.au
2020-03-04powerpc: Rename current_stack_pointer() to current_stack_frame()Michael Ellerman
current_stack_pointer(), which was called __get_SP(), used to just return the value in r1. But that caused problems in some cases, so it was turned into a function in commit bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a function not a define"). Because it's a function in a separate compilation unit to all its callers, it has the effect of causing a stack frame to be created, and then returns the address of that frame. This is good in some cases like those described in the above commit, but in other cases it's overkill, we just need to know what stack page we're on. On some other arches current_stack_pointer is just a register global giving the stack pointer, and we'd like to do that too. So rename our current_stack_pointer() to current_stack_frame() to make that possible. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Link: https://lore.kernel.org/r/20200220115141.2707-1-mpe@ellerman.id.au
2020-03-04powerpc/kernel/sysfs: Add new config option PMU_SYSFS to enable PMU SPRs ↵Kajol Jain
sysfs file creation Many of the performance monitoring unit (PMU) SPRs are exposed in the sysfs. This may not be a desirable since "perf" API is the primary interface to program PMU and collect counter data in the system. But that said, we cant remove these sysfs files since we dont whether anyone/anything is using them. So the patch adds a new CONFIG option 'CONFIG_PMU_SYSFS' (user selectable) to be used in sysfs file creation for PMU SPRs. New option by default is disabled, but can be enabled if user needs it. Tested this patch behaviour in powernv and pseries machines. Patch is also tested for pmac32_defconfig. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Nageswara R Sastry <nasastry@in.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200214080606.26872-2-kjain@linux.ibm.com
2020-03-04powerpc/kernel/sysfs: Refactor current sysfs.cMadhavan Srinivasan
An attempt to refactor the current sysfs.c file. To start with a big chuck of macro #defines and dscr functions are moved to start of the file. Secondly, HAS_ #define macros are cleanup based on CONFIG_ options Finally new HAS_ macro added: 1. HAS_PPC_PA6T (for PA6T) to separate out non-PMU SPRs. 2. HAS_PPC_PMC56 to separate out PMC SPR's from HAS_PPC_PMC_CLASSIC which come under CONFIG_PPC64. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200214080606.26872-1-kjain@linux.ibm.com
2020-03-04powerpc/powernv: Add explicit fast-reboot supportOliver O'Halloran
Add a way to manually invoke a fast-reboot rather than setting the NVRAM flag. The idea is to allow userspace to invoke a fast-reboot using the optional string argument to the reboot() system call, or using the xmon zr command so we don't need to leave around a persistent changes on a system to use the feature. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-2-oohall@gmail.com
2020-03-04powerpc/powernv: Treat an empty reboot string as defaultOliver O'Halloran
Treat an empty reboot cmd string the same as a NULL string. This squashes a spurious unsupported reboot message that sometimes gets out when using xmon. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217024833.30580-1-oohall@gmail.com
2020-03-04powerpc/Makefile: Mark phony targets as PHONYMichael Ellerman
Some of our phony targets are not marked as such. This can lead to confusing errors, eg: $ make clean $ touch install $ make install make: 'install' is up to date. $ Fix it by adding them to the PHONY variable which is marked phony in the top-level Makefile, or in scripts/Makefile.build for the boot Makefile. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200219000434.15872-1-mpe@ellerman.id.au
2020-03-04powerpc/mm: Don't kmap_atomic() in pte_offset_map() on PPC32Christophe Leroy
On PPC32, pte_offset_map() does a kmap_atomic() in order to support page tables allocated in high memory, just like ARM and x86/32. But since at least 2008 and commit 8054a3428fbe ("powerpc: Remove dead CONFIG_HIGHPTE"), page tables are never allocated in high memory. When the page is in low mem, kmap_atomic() just returns the page address but still disable preemption and pagefault. And it is not an inlined function, so we suffer function call for no reason. Make pte_offset_map() the same as pte_offset_kernel() and make pte_unmap() void, in the same way as PPC64 which doesn't have HIGHMEM. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/03c97f0f6b3790d164822563be80f2fd4713a955.1581932480.git.christophe.leroy@c-s.fr
2020-03-04powerpc/book3s64: Fix error handling in mm_iommu_do_alloc()Alexey Kardashevskiy
The last jump to free_exit in mm_iommu_do_alloc() happens after page pointers in struct mm_iommu_table_group_mem_t were already converted to physical addresses. Thus calling put_page() on these physical addresses will likely crash. This moves the loop which calculates the pageshift and converts page struct pointers to physical addresses later after the point when we cannot fail; thus eliminating the need to convert pointers back. Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock") Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191223060351.26359-1-aik@ozlabs.ru
2020-03-04powerpc/powernv: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-6-gregkh@linuxfoundation.org
2020-03-04powerpc/cell/axon_msi: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-5-gregkh@linuxfoundation.org
2020-03-04powerpc/mm: ptdump: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-4-gregkh@linuxfoundation.org
2020-03-04powerpc/mm: book3s64: hash_utils: no need to check return value of ↵Greg Kroah-Hartman
debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-3-gregkh@linuxfoundation.org
2020-03-04powerpc/kvm: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Because of this cleanup, we get to remove a few fields in struct kvm_arch that are now unused. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [mpe: Fix build error in kvm/timing.c, adapt kvmppc_remove_cpu_debugfs()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-2-gregkh@linuxfoundation.org
2020-03-04powerpc/kernel: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200209105901.1620958-1-gregkh@linuxfoundation.org
2020-03-04powerpc/83xx: Add some error handling in 'quirk_mpc8360e_qe_enet10()'Christophe JAILLET
In some error handling path, we should call "of_node_put(np_par)" or some resource may be leaking in case of error. Fixes: 8159df72d43e ("83xx: add support for the kmeter1 board.") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200208140920.7652-1-christophe.jaillet@wanadoo.fr
2020-03-04powerpc/83xx: Fix some typo in some warning messageChristophe JAILLET
"couldn;t" should be "couldn't". Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200208140904.7521-1-christophe.jaillet@wanadoo.fr
2020-02-28powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systemsDesnes A. Nunes do Rosario
PowerVM systems running compatibility mode on a few Power8 revisions are still vulnerable to the hardware defect that loses PMU exceptions arriving prior to a context switch. The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG cpu_feature bit, nevertheless this bit also needs to be set for PowerVM compatibility mode systems. Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG") Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com
2020-02-26powerpc/32: drop get_pteptr()Christophe Leroy
Commit 8d30c14cab30 ("powerpc/mm: Rework I$/D$ coherency (v3)") and commit 90ac19a8b21b ("[POWERPC] Abolish iopa(), mm_ptov(), io_block_mapping() from arch/powerpc") removed the use of get_pteptr() outside of mm/pgtable_32.c In mm/pgtable_32.c, the only user of get_pteptr() is change_page_attr() which operates on kernel context and on lowmem pages only. Make virt_to_kpte() available outside of mm/mem.c and use it instead of get_pteptr(), and drop get_pteptr() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/788378c6c3ba5c5298caab7c7f95e6c3c88244b8.1578558199.git.christophe.leroy@c-s.fr
2020-02-26powerpc/32: refactor pmd_offset(pud_offset(pgd_offset...Christophe Leroy
At several places pmd pointer is retrieved through the same action: pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr); or pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr); Refactor this by implementing two helpers pmd_ptr() and pmd_ptr_k() This will help when adding the p4d level. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7b065c5be35726af4066cab238ee35cabceda1fa.1578558199.git.christophe.leroy@c-s.fr
2020-02-26powerpc/32: don't restore r0, r6-r8 on exception entry path after ↵Christophe Leroy
trace_hardirqs_off() Since commit b86fb88855ea ("powerpc/32: implement fast entry for syscalls on non BOOKE") and commit 1a4b739bbb4f ("powerpc/32: implement fast entry for syscalls on BOOKE"), syscalls don't use the exception entry path anymore. It is therefore pointless to restore r0 and r6-r8 after calling trace_hardirqs_off(). In the meantime, drop the '2:' label which is unused and misleading. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d2c6dc65d27e83964eb05f16a126161ab6455eea.1578388585.git.christophe.leroy@c-s.fr
2020-02-24powerpc: Include .BTF sectionNaveen N. Rao
Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld: ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in section `.BTF' Include .BTF section in vmlinux explicitly to fix the same. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220113132.857132-1-naveen.n.rao@linux.vnet.ibm.com
2020-02-24powerpc/watchpoint: Don't call dar_within_range() for Book3SRavi Bangoria
DAR is set to the first byte of overlap between actual access and watched range at DSI on Book3S processor. But actual access range might or might not be within user asked range. So for Book3S, it must not call dar_within_range(). This revert portion of commit 39413ae00967 ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size."). Before patch: # ./tools/testing/selftests/powerpc/ptrace/perf-hwbreak ... TESTED: No overlap FAILED: Partial overlap: 0 != 2 TESTED: Partial overlap TESTED: No overlap FAILED: Full overlap: 0 != 2 failure: perf_hwbreak After patch: TESTED: No overlap TESTED: Partial overlap TESTED: Partial overlap TESTED: No overlap TESTED: Full overlap success: perf_hwbreak Fixes: 39413ae00967 ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200222082049.330435-1-ravi.bangoria@linux.ibm.com
2020-02-19powerpc/32s: Slenderize _tlbia() for powerpc 603/603eChristophe Leroy
_tlbia() is a function used only on 603/603e core, ie on CPUs which don't have a hash table. _tlbia() uses the tlbia macro which implements a loop of 1024 tlbie. On the 603/603e core, flushing the entire TLB requires no more than 32 tlbie. Replace tlbia by a loop of 32 tlbie. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/12f4f4f0ff89aeab3b937fc96c84fb35e1b2517e.1580748445.git.christophe.leroy@c-s.fr
2020-02-19powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailableLibor Pechacek
In guests without hotplugagble memory drmem structure is only zero initialized. Trying to manipulate DLPAR parameters results in a crash. $ echo "memory add count 1" > /sys/kernel/dlpar Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries ... NIP: c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000 REGS: c0000000fb9d3880 TRAP: 0300 Tainted: G E (5.5.0-rc6-2-default) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242428 XER: 20000000 CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0 ... NIP dlpar_memory+0x6e4/0xd00 LR dlpar_memory+0x698/0xd00 Call Trace: dlpar_memory+0x698/0xd00 (unreliable) handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd0/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x68 Taking closer look at the code, I can see that for_each_drmem_lmb is a macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <= &drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs is NULL, the loop would iterate through the whole address range if it weren't stopped by the NULL pointer dereference on the next line. This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range macro behavior with the common C semantics, where the end marker does not belong to the scanned range, and alters get_lmb_range() semantics. As a side effect, the wraparound observed in the crash is prevented. Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Libor Pechacek <lpechacek@suse.cz> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.de
2020-02-19powerpc: Don't use thread struct for saving SRR0/1 on syscall.Christophe Leroy
CR0 can be saved later, and CTR can also be used for saving. Keep SRR1 in r9 and stash SRR0 in CTR, this avoids using thread_struct in memory for that. Saves 3 cycles (ie 1%) in null_syscall selftest on 8xx. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b94c3bc03bac9431fec2dadb686384c481889422.1580470483.git.christophe.leroy@c-s.fr
2020-02-19powerpc/32: Warn and return ENOSYS on syscalls from kernelChristophe Leroy
Since commit b86fb88855ea ("powerpc/32: implement fast entry for syscalls on non BOOKE") and commit 1a4b739bbb4f ("powerpc/32: implement fast entry for syscalls on BOOKE"), syscalls from kernel are unexpected and can have catastrophic consequences as it will destroy the kernel stack. Test MSR_PR on syscall entry. In case syscall is from kernel, emit a warning and return ENOSYS error. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8ee3bdbbdfdfc64ca7001e90c43b2aee6f333578.1580470482.git.christophe.leroy@c-s.fr
2020-02-19powerpc/32s: Don't flush all TLBs when flushing one pageChristophe Leroy
When flushing any memory range, the flushing function flushes all TLBs. When (start) and (end - 1) are in the same memory page, flush that page instead. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b30b2eae6960502eaf0d9e36c60820b839693c33.1580542939.git.christophe.leroy@c-s.fr
2020-02-19powerpc/fadump: sysfs for fadump memory reservationSourabh Jain
Add a sys interface to allow querying the memory reserved by FADump for saving the crash dump. Also added Documentation/ABI for the new sysfs file. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-7-sourabhjain@linux.ibm.com
2020-02-19Documentation/ABI: Mark /sys/kernel/fadump_* sysfs files deprecatedSourabh Jain
Add a deprecation note in FADump sysfs ABI documentation files and move them from ABI/testing to ABI/obsolete directory. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> [mpe: Use a proper table to fix errors from the documentation build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-6-sourabhjain@linux.ibm.com
2020-02-19powerpc/powernv: Move core and fadump_release_opalcore under new kobjectSourabh Jain
The /sys/firmware/opal/core and /sys/kernel/fadump_release_opalcore sysfs files are used to export and release the OPAL memory on PowerNV platform. let's organize them into a new kobject under /sys/firmware/opal/mpipl/ directory. A symlink is added to maintain the backward compatibility for /sys/firmware/opal/core sysfs file. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-5-sourabhjain@linux.ibm.com
2020-02-19powerpc/fadump: Reorganize /sys/kernel/fadump_* sysfs filesSourabh Jain
As the number of FADump sysfs files increases it is hard to manage all of them inside /sys/kernel directory. It's better to have all the FADump related sysfs files in a dedicated directory /sys/kernel/fadump. But in order to maintain backward compatibility a symlink has been added for every sysfs that has moved to new location. As the FADump sysfs files are now part of a dedicated directory there is no need to prefix their name with fadump_, hence sysfs file names are also updated. For example fadump_enabled sysfs file is now referred as enabled. Also consolidate ABI documentation for all the FADump sysfs files in a single file Documentation/ABI/testing/sysfs-kernel-fadump. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Tested-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-4-sourabhjain@linux.ibm.com
2020-02-19sysfs: Wrap __compat_only_sysfs_link_entry_to_kobj function to change the ↵Sourabh Jain
symlink name The __compat_only_sysfs_link_entry_to_kobj function creates a symlink to a kobject but doesn't provide an option to change the symlink file name. This patch adds a wrapper function compat_only_sysfs_link_entry_to_kobj that extends the __compat_only_sysfs_link_entry_to_kobj functionality which allows function caller to customize the symlink name. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> [mpe: Fix compile error when CONFIG_SYSFS=n] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-3-sourabhjain@linux.ibm.com
2020-02-19Documentation/ABI: Add ABI documentation for /sys/kernel/fadump_*Sourabh Jain
Add missing ABI documentation for existing FADump sysfs files. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191211160910.21656-2-sourabhjain@linux.ibm.com
2020-02-19powerpc/process: Remove unneccessary #ifdef CONFIG_PPC64 in copy_thread_tls()Christophe Leroy
is_32bit_task() exists on both PPC64 and PPC32, no need of an ifdefery. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6ecbda05b4119c40222dc8ec284604e1597c9bff.1580327381.git.christophe.leroy@c-s.fr
2020-02-19powerpc/papr_scm: Mark papr_scm_ndctl() as staticVaibhav Jain
Function papr_scm_ndctl() is neither exported from the module nor called directly from outside 'papr.c' hence should be marked 'static'. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130040206.79998-1-vaibhav@linux.ibm.com
2020-02-19powerpc/pseries/Makefile: Remove CONFIG_PPC_PSERIES checkOliver O'Halloran
The pseries Makefile (arch/powerpc/platforms/pseries/Makefile) is only included by the platform Makefile (arch/powerpc/platform/Makefile) when CONFIG_PPC_PSERIES is selected, so checking for CONFIG_PPC_PSERIES in the pseries Makefile is pointless. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130063153.19915-2-oohall@gmail.com
2020-02-19powerpc/pseries/vio: Remove stray #ifdef CONFIG_PPC_PSERIESOliver O'Halloran
vio.c is in platforms/pseries, which is only built if PPC_PSERIES=y. In other words, this ifdef is pointless. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200130063153.19915-1-oohall@gmail.com
2020-02-19powerpc/entry: Fix an #if which should be an #ifdef in entry_32.SChristophe Leroy
Fixes: 12c3f1fd87bf ("powerpc/32s: get rid of CPU_FTR_601 feature") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a99fc0ad65b87a1ba51cfa3e0e9034ee294c3e07.1582034961.git.christophe.leroy@c-s.fr
2020-02-18powerpc/xmon: Fix whitespace handling in getstring()Oliver O'Halloran
The ls (lookup symbol) and zr (reboot) commands use xmon's getstring() helper to read a string argument from the xmon prompt. This function skips over leading whitespace, but doesn't check if the first "non-whitespace" character is a newline which causes some odd behaviour (<enter> indicates a the enter key was pressed): 0:mon> ls printk<enter> printk: c0000000001680c4 0:mon> ls<enter> printk<enter> Symbol ' printk' not found. 0:mon> With commit 2d9b332d99b ("powerpc/xmon: Allow passing an argument to ppc_md.restart()") we have a similar problem with the zr command. Previously zr took no arguments so "zr<enter> would trigger a reboot. With that patch applied a second newline needs to be sent in order for the reboot to occur. Fix this by checking if the leading whitespace ended on a newline: 0:mon> ls<enter> Symbol '' not found. Fixes: 2d9b332d99b2 ("powerpc/xmon: Allow passing an argument to ppc_md.restart()") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200217041343.2454-1-oohall@gmail.com
2020-02-18powerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACKChristophe Leroy
power_save_ppc32_restore() is called during exception entry, before re-enabling the MMU. It substracts KERNELBASE from the address of nap_save_msscr0 to access it. With CONFIG_VMAP_STACK enabled, data MMU translation has already been re-enabled, so power_save_ppc32_restore() has to access nap_save_msscr0 by its virtual address. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK") Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7bce32ccbab3ba3e3e0f27da6961bf6313df97ed.1581663140.git.christophe.leroy@c-s.fr
2020-02-18powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACKChristophe Leroy
With CONFIG_VMAP_STACK, data MMU has to be enabled to read data on the stack. Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d2330584f8c42d3039896e2b56f5d39676dc919c.1581669558.git.christophe.leroy@c-s.fr
2020-02-18powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACKChristophe Leroy
hash_page() needs to read page tables from kernel memory. When entire kernel memory is mapped by BATs, which is normally the case when CONFIG_STRICT_KERNEL_RWX is not set, it works even if the page hosting the page table is not referenced in the MMU hash table. However, if the page where the page table resides is not covered by a BAT, a DSI fault can be encountered from hash_page(), and it loops forever. This can happen when CONFIG_STRICT_KERNEL_RWX is selected and the alignment of the different regions is too small to allow covering the entire memory with BATs. This also happens when CONFIG_DEBUG_PAGEALLOC is selected or when booting with 'nobats' flag. Also, if the page containing the kernel stack is not present in the MMU hash table, registers cannot be saved and a recursive DSI fault is encountered. To allow hash_page() to properly do its job at all time and load the MMU hash table whenever needed, it must run with data MMU disabled. This means it must be called before re-enabling data MMU. To allow this, registers clobbered by hash_page() and create_hpte() have to be saved in the thread struct together with SRR0, SSR1, DAR and DSISR. It is also necessary to ensure that DSI prolog doesn't overwrite regs saved by prolog of the current running exception. That means: - DSI can only use SPRN_SPRG_SCRATCH0 - Exceptions must free SPRN_SPRG_SCRATCH0 before writing to the stack. This also fixes the Oops reported by Erhard when create_hpte() is called by add_hash_page(). Due to prolog size increase, a few more exceptions had to get split in two parts. Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK") Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Erhard F. <erhard_f@mailbox.org> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206501 Link: https://lore.kernel.org/r/64a4aa44686e9fd4b01333401367029771d9b231.1581761633.git.christophe.leroy@c-s.fr
2020-02-18powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal deliveryGustavo Luiz Duarte
After a treclaim, we expect to be in non-transactional state. If we don't clear the current thread's MSR[TS] before we get preempted, then tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in suspended transaction state. When handling a signal caught in transactional state, handle_rt_signal64() calls get_tm_stackpointer() that treclaims the transaction using tm_reclaim_current() but without clearing the thread's MSR[TS]. This can cause the TM Bad Thing exception below if later we pagefault and get preempted trying to access the user's sigframe, using __put_user(). Afterwards, when we are rescheduled back into do_page_fault() (but now in suspended state since the thread's MSR[TS] was not cleared), upon executing 'rfid' after completion of the page fault handling, the exception is raised because a transition from suspended to non-transactional state is invalid. Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033 Oops: Unrecoverable exception, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32 NIP: c00000000000de44 LR: c000000000034728 CTR: 0000000000000000 REGS: c00000003fe7bd70 TRAP: 0700 Not tainted (5.4.0-rc2) MSR: 8000000302a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[SE]> CR: 44000884 XER: 00000000 CFAR: c00000000000dda4 IRQMASK: 0 PACATMSCRATCH: 800000010280b033 GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78 GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0 GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000 GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140 GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800 GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000 NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0 LR [c000000000034728] handle_rt_signal64+0x78/0xc50 Call Trace: [c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable) [c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460 [c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74 Instruction dump: 7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088 60000000 60000000 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed0989 ---[ end trace 93094aa44b442f87 ]--- The simplified sequence of events that triggers the above exception is: ... # userspace in NON-TRANSACTIONAL state tbegin # userspace in TRANSACTIONAL state signal delivery # kernelspace in SUSPENDED state handle_rt_signal64() get_tm_stackpointer() treclaim # kernelspace in NON-TRANSACTIONAL state __put_user() page fault happens. We will never get back here because of the TM Bad Thing exception. page fault handling kicks in and we voluntarily preempt ourselves do_page_fault() __schedule() __switch_to(other_task) our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared __switch_to(our_task) switch_to_tm() tm_recheckpoint_new_task() trechkpt # kernelspace in SUSPENDED state The page fault handling resumes, but now we are in suspended transaction state do_page_fault() completes rfid <----- trying to get back where the page fault happened (we were non-transactional back then) TM Bad Thing # illegal transition from suspended to non-transactional This patch fixes that issue by clearing the current thread's MSR[TS] just after treclaim in get_tm_stackpointer() so that we stay in non-transactional state in case we are preempted. In order to make treclaim and clearing the thread's MSR[TS] atomic from a preemption perspective when CONFIG_PREEMPT is set, preempt_disable/enable() is used. It's also necessary to save the previous value of the thread's MSR before get_tm_stackpointer() is called so that it can be exposed to the signal handler later in setup_tm_sigcontexts() to inform the userspace MSR at the moment of the signal delivery. Found with tm-signal-context-force-tm kernel selftest. Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9 Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com
2020-02-17powerpc/8xx: Fix clearing of bits 20-23 in ITLB missChristophe Leroy
In ITLB miss handled the line supposed to clear bits 20-23 on the L2 ITLB entry is buggy and does indeed nothing, leading to undefined value which could allow execution when it shouldn't. Properly do the clearing with the relevant instruction. Fixes: 74fabcadfd43 ("powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4f70c2778163affce8508a210f65d140e84524b4.1581272050.git.christophe.leroy@c-s.fr
2020-02-17powerpc/hugetlb: Fix 8M hugepages on 8xxChristophe Leroy
With HW assistance all page tables must be 4k aligned, the 8xx drops the last 12 bits during the walk. Redefine HUGEPD_SHIFT_MASK to mask last 12 bits out. HUGEPD_SHIFT_MASK is used to for alignment of page table cache. Fixes: 22569b881d37 ("powerpc/8xx: Enable 8M hugepage support with HW assistance") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/778b1a248c4c7ca79640eeff7740044da6a220a0.1581264115.git.christophe.leroy@c-s.fr
2020-02-17powerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page sizeChristophe Leroy
Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW assistance") redefined pte_t as a struct of 4 pte_basic_t, because in 16K pages mode there are four identical entries in the page table. But the size of hugepage tables is calculated based of the size of (void *). Therefore, we end up with page tables of size 1k instead of 4k for 512k pages. As 512k hugepage tables are the same size as standard page tables, ie 4k, use the standard page tables instead of PGT_CACHE tables. Fixes: 3fb69c6a1a13 ("powerpc/8xx: Enable 512k hugepage support with HW assistance") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/90ec56a2315be602494619ed0223bba3b0b8d619.1580997007.git.christophe.leroy@c-s.fr
2020-02-17powerpc/eeh: Fix deadlock handling dead PHBSam Bobroff
Recovering a dead PHB can currently cause a deadlock as the PCI rescan/remove lock is taken twice. This is caused as part of an existing bug in eeh_handle_special_event(). The pe is processed while traversing the PHBs even though the pe is unrelated to the loop. This causes the pe to be, incorrectly, processed more than once. Untangling this section can move the pe processing out of the loop and also outside the locked section, correcting both problems. Fixes: 2e25505147b8 ("powerpc/eeh: Fix crash when edev->pdev changes") Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com
2020-02-16Linux 5.6-rc2v5.6-rc2Linus Torvalds