summaryrefslogtreecommitdiff
path: root/arch/riscv/include/asm/pgtable.h
AgeCommit message (Collapse)Author
2022-05-11riscv: add memory-type errata for T-HeadHeiko Stuebner
Some current cpus based on T-Head cores implement memory-types way different than described in the svpbmt spec even going so far as using PTE bits marked as reserved. Add the T-Head vendor-id and necessary errata code to replace the affected instructions. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20220511192921.2223629-13-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-11riscv: add RISC-V Svpbmt extension supportHeiko Stuebner
Svpbmt (the S should be capitalized) is the "Supervisor-mode: page-based memory types" extension that specifies attributes for cacheability, idempotency and ordering. The relevant settings are done in special bits in PTEs: Here is the svpbmt PTE format: | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 N MT RSW D A G U X W R V ^ Of the Reserved bits [63:54] in a leaf PTE, the high bit is already allocated (as the N bit), so bits [62:61] are used as the MT (aka MemType) field. This field specifies one of three memory types that are close equivalents (or equivalent in effect) to the three main x86 and ARMv8 memory types - as shown in the following table. RISC-V Encoding & MemType RISC-V Description ---------- ------------------------------------------------ 00 - PMA Normal Cacheable, No change to implied PMA memory type 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory 11 - Rsvd Reserved for future standard use As the extension will not be present on all implementations, implement a method to handle cpufeatures via alternatives to not incur runtime penalties on cpu variants not supporting specific extensions and patch relevant code parts at runtime. Co-developed-by: Wei Fu <wefu@redhat.com> Signed-off-by: Wei Fu <wefu@redhat.com> Co-developed-by: Liu Shaohua <liush@allwinnertech.com> Signed-off-by: Liu Shaohua <liush@allwinnertech.com> Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> [moved to use the alternatives mechanism] Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-10-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-11riscv: Fix accessing pfn bits in PTEs for non-32bit variantsHeiko Stuebner
On rv32 the PFN part of PTEs is defined to use bits [xlen-1:10] while on rv64 it is defined to use bits [53:10], leaving [63:54] as reserved. With upcoming optional extensions like svpbmt these previously reserved bits will get used so simply right-shifting the PTE to get the PFN won't be enough. So introduce a _PAGE_PFN_MASK constant to mask the correct bits for both rv32 and rv64 before shifting. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-9-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26riscv: compat: Support TASK_SIZE for compat modeGuo Ren
Make TASK_SIZE from const to dynamic detect TIF_32BIT flag function. Refer to arm64 to implement DEFAULT_MAP_WINDOW_64 for efi-stub. Limit 32-bit compatible process in 0-2GB virtual address range (which is enough for real scenarios), because it could avoid address sign extend problem when 32-bit enter 64-bit and ease software design. The standard 32-bit TASK_SIZE is 0x9dc00000:FIXADDR_START, and compared to a compatible 32-bit, it increases 476MB for the application's virtual address. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-11-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-25Merge tag 'riscv-for-linus-5.18-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for Sv57-based virtual memory. - Various improvements for the MicroChip PolarFire SOC and the associated Icicle dev board, which should allow upstream kernels to boot without any additional modifications. - An improved memmove() implementation. - Support for the new Ssconfpmf and SBI PMU extensions, which allows for a much more useful perf implementation on RISC-V systems. - Support for restartable sequences. * tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits) rseq/selftests: Add support for RISC-V RISC-V: Add support for restartable sequence MAINTAINERS: Add entry for RISC-V PMU drivers Documentation: riscv: Remove the old documentation RISC-V: Add sscofpmf extension support RISC-V: Add perf platform driver based on SBI PMU extension RISC-V: Add RISC-V SBI PMU extension definitions RISC-V: Add a simple platform driver for RISC-V legacy perf RISC-V: Add a perf core library for pmu drivers RISC-V: Add CSR encodings for all HPMCOUNTERS RISC-V: Remove the current perf implementation RISC-V: Improve /proc/cpuinfo output for ISA extensions RISC-V: Do no continue isa string parsing without correct XLEN RISC-V: Implement multi-letter ISA extension probing framework RISC-V: Extract multi-letter extension names from "riscv, isa" RISC-V: Minimal parser for "riscv, isa" strings RISC-V: Correctly print supported extensions riscv: Fixed misaligned memory access. Fixed pointer comparison. MAINTAINERS: update riscv/microchip entry riscv: dts: microchip: add new peripherals to icicle kit device tree ...
2022-03-03riscv: Fix is_linear_mapping with recent move of KASAN regionAlexandre Ghiti
The KASAN region was recently moved between the linear mapping and the kernel mapping, is_linear_mapping used to check the validity of an address by using the start of the kernel mapping, which is now wrong. Fix this by using the maximum size of the physical memory. Fixes: f7ae02333d13 ("riscv: Move KASAN mapping next to the kernel mapping") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14riscv: mm: Prepare pt_ops helper functions for sv57Qinglin Pan
This patch prepare some pt_ops helper functions which will be used in creating sv57 mappings during boot time. Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14riscv: mm: Control p4d's folding by pgtable_l5_enabledQinglin Pan
To determine pgtable level at boot time, we can not use helper functions in include/asm-generic/pgtable-nop4d.h and must implement these functions. This patch uses pgtable_l5_enabled variable instead of including pgtable-nop4d.h to controle p4d's folding, and implements corresponding helper functions. Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19RISC-V: Introduce sv48 support without relocatable kernelPalmer Dabbelt
This patchset allows to have a single kernel for sv39 and sv48 without being relocatable. The idea comes from Arnd Bergmann who suggested to do the same as x86, that is mapping the kernel to the end of the address space, which allows the kernel to be linked at the same address for both sv39 and sv48 and then does not require to be relocated at runtime. This implements sv48 support at runtime. The kernel will try to boot with 4-level page table and will fallback to 3-level if the HW does not support it. Folding the 4th level into a 3-level page table has almost no cost at runtime. Note that kasan region had to be moved to the end of the address space since its location must be known at compile-time and then be valid for both sv39 and sv48 (and sv57 that is coming). * riscv-sv48-v3: riscv: Explicit comment about user virtual address space size riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo riscv: Implement sv48 support asm-generic: Prepare for riscv use of pud_alloc_one and pud_free riscv: Allow to dynamically define VA_BITS riscv: Introduce functions to switch pt_ops riscv: Split early kasan mapping to prepare sv48 introduction riscv: Move KASAN mapping next to the kernel mapping riscv: Get rid of MAXPHYSMEM configs Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19riscv: Explicit comment about user virtual address space sizeAlexandre Ghiti
Define precisely the size of the user accessible virtual space size for sv32/39/48 mmu types and explain why the whole virtual address space is split into 2 equal chunks between kernel and user space. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19riscv: Implement sv48 supportAlexandre Ghiti
By adding a new 4th level of page table, give the possibility to 64bit kernel to address 2^48 bytes of virtual address: in practice, that offers 128TB of virtual address space to userspace and allows up to 64TB of physical memory. If the underlying hardware does not support sv48, we will automatically fallback to a standard 3-level page table by folding the new PUD level into PGDIR level. In order to detect HW capabilities at runtime, we use SATP feature that ignores writes with an unsupported mode. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19riscv: Allow to dynamically define VA_BITSAlexandre Ghiti
With 4-level page table folding at runtime, we don't know at compile time the size of the virtual address space so we must set VA_BITS dynamically so that sparsemem reserves the right amount of memory for struct pages. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19riscv: Move KASAN mapping next to the kernel mappingAlexandre Ghiti
Now that KASAN_SHADOW_OFFSET is defined at compile time as a config, this value must remain constant whatever the size of the virtual address space, which is only possible by pushing this region at the end of the address space next to the kernel mapping. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-07riscv/mm: Enable THP migrationNanyong Sun
Add two THP helpers required to create PMD migration swap entries, and enable THP migration via ARCH_ENABLE_THP_MIGRATION. This can reduce time of THP migration without splitting and guarantee the migrated pages are still contiguous. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-07riscv/mm: Adjust PAGE_PROT_NONE to comply with THP semanticsNanyong Sun
This is a preparation for enabling THP migration. As the commit b65399f6111b("arm64/mm: Change THP helpers to comply with generic MM semantics") mentioned, pmd_present() and pmd_trans_huge() are expected to behave in the following manner: ------------------------------------------------------------------------- | PMD states | pmd_present | pmd_trans_huge | ------------------------------------------------------------------------- | Mapped | Yes | Yes | ------------------------------------------------------------------------- | Splitting | Yes | Yes | ------------------------------------------------------------------------- | Migration/Swap | No | No | ------------------------------------------------------------------------- At present the PROT_NONE bit reuses the READ bit could not comply with above semantics with two problems: 1. When splitting a PMD THP, PMD is first invalidated with pmdp_invalidate()->pmd_mkinvalid(), which clears the PRESENT bit and PROT_NONE bit/READ bit, if the PMD is read-only, then the PAGE_LEAF property is also cleared, which results in pmd_present() return false. 2. When migrating, the swap entry only clear the PRESENT bit and PROT_NONE bit/READ bit, the W/X bit may be set, so _PAGE_LEAF may be true which results in pmd_present() return true. Solution: Adjust PROT_NONE bit from READ to GLOBAL bit can satisfy the above rules: 1. GLOBAL bit has no other meanings, not like the R/W/X bit, which is also relative with _PAGE_LEAF property. 2. GLOBAL bit is at bit 5, making swap entry start from bit 6, bit 0-5 are zero, which means the PRESENT, PROT_NONE, and PAGE_LEAF are all false, then the pmd_present() and pmd_trans_huge() return false when in migration/swap. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05riscv: Make vmalloc/vmemmap end equal to the start of the next regionAlexandre Ghiti
We used to define VMALLOC_END equal to the start of the next region *minus one* which is inconsistent with the use of this define in the core code (for example, see the definitions of VMALLOC_TOTAL and is_vmalloc_addr). And then make the definition of VMEMMAP_END consistent with VMALLOC_END and all other regions actually. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2021-10-26riscv: remove .text section size limitation for XIPVitaly Wool
Currently there's a limit of 8MB for the .text section of a RISC-V image in the XIP case. This breaks compilation of many automatic builds and is generally inconvenient. This patch removes that limitation and optimizes XIP image file size at the same time. Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-09Merge tag 'riscv-for-linus-5.14-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "We have a handful of new features for 5.14: - Support for transparent huge pages. - Support for generic PCI resources mapping. - Support for the mem= kernel parameter. - Support for KFENCE. - A handful of fixes to avoid W+X mappings in the kernel. - Support for VMAP_STACK based overflow detection. - An optimized copy_{to,from}_user" * tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits) riscv: xip: Fix duplicate included asm/pgtable.h riscv: Fix PTDUMP output now BPF region moved back to module region riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall riscv: add VMAP_STACK overflow detection riscv: ptrace: add argn syntax riscv: mm: fix build errors caused by mk_pmd() riscv: Introduce structure that group all variables regarding kernel mapping riscv: Map the kernel with correct permissions the first time riscv: Introduce set_kernel_memory helper riscv: Enable KFENCE for riscv64 RISC-V: Use asm-generic for {in,out}{bwlq} riscv: add ASID-based tlbflushing methods riscv: pass the mm_struct to __sbi_tlb_flush_range riscv: Add mem kernel parameter support riscv: Simplify xip and !xip kernel address conversion macros riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED riscv: Only initialize swiotlb when necessary riscv: fix typo in init.c riscv: Cleanup unused functions riscv: mm: Use better bitmap_zalloc() ...
2021-07-05riscv: mm: fix build errors caused by mk_pmd()Nanyong Sun
With "riscv: mm: add THP support on 64-bit", mk_pmd() function introduce build errors, 1.build with CONFIG_ARCH_RV32I=y: arch/riscv/include/asm/pgtable.h: In function 'mk_pmd': arch/riscv/include/asm/pgtable.h:513:9: error: implicit declaration of function 'pfn_pmd'; did you mean 'pfn_pgd'? [-Werror=implicit-function-declaration] 2.build with CONFIG_SPARSEMEM=y && CONFIG_SPARSEMEM_VMEMMAP=n arch/riscv/include/asm/pgtable.h: In function 'mk_pmd': include/asm-generic/memory_model.h:64:14: error: implicit declaration of function 'page_to_section'; did you mean 'present_section'? [-Werror=implicit-function-declaration] Move the definition of mk_pmd to pgtable-64.h to fix the first error. Use macro definition instead of inline function for mk_pmd to fix the second problem. It is similar to the mk_pte macro. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-01mm: define default value for FIRST_USER_ADDRESSAnshuman Khandual
Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the same code all over. Instead just define a generic default value (i.e 0UL) for FIRST_USER_ADDRESS and let the platforms override when required. This makes it much cleaner with reduced code. The default FIRST_USER_ADDRESS here would be skipped in <linux/pgtable.h> when the given platform overrides its value via <asm/pgtable.h>. Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30Merge branch 'riscv-wx-mappings' into for-nextPalmer Dabbelt
This contains both the short-term fix for the W+X boot mappings and the larger cleanup. * riscv-wx-mappings: riscv: Map the kernel with correct permissions the first time riscv: Introduce set_kernel_memory helper riscv: Simplify xip and !xip kernel address conversion macros riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED riscv: mm: Fix W+X mappings at boot
2021-06-18riscv: Ensure BPF_JIT_REGION_START aligned with PMD sizeJisheng Zhang
Andreas reported commit fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") breaks booting with one kind of defconfig, I reproduced a kernel panic with the defconfig: [ 0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220 [ 0.139159] Oops [#1] [ 0.139303] Modules linked in: [ 0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1 [ 0.139934] Hardware name: riscv-virtio,qemu (DT) [ 0.140193] epc : __memset+0xc4/0xfc [ 0.140416] ra : skb_flow_dissector_init+0x1e/0x82 [ 0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0 [ 0.140878] gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158 [ 0.141156] t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0 [ 0.141424] s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000 [ 0.141654] a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064 [ 0.141893] a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff [ 0.142126] s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088 [ 0.142353] s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438 [ 0.142584] s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac [ 0.142810] s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000 [ 0.143042] t5 : 0000000000000155 t6 : 00000000000003ff [ 0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f [ 0.143560] [<ffffffff8029806c>] __memset+0xc4/0xfc [ 0.143859] [<ffffffff8061e984>] init_default_flow_dissectors+0x22/0x60 [ 0.144092] [<ffffffff800010fc>] do_one_initcall+0x3e/0x168 [ 0.144278] [<ffffffff80600df0>] kernel_init_freeable+0x1c8/0x224 [ 0.144479] [<ffffffff804868a8>] kernel_init+0x12/0x110 [ 0.144658] [<ffffffff800022de>] ret_from_exception+0x0/0xc [ 0.145124] ---[ end trace f1e9643daa46d591 ]--- After some investigation, I think I found the root cause: commit 2bfc6cd81bd ("move kernel mapping outside of linear mapping") moves BPF JIT region after the kernel: | #define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end) The &_end is unlikely aligned with PMD size, so the front bpf jit region sits with part of kernel .data section in one PMD size mapping. But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is called to make the first bpf jit prog ROX, we will make part of kernel .data section RO too, so when we write to, for example memset the .data section, MMU will trigger a store page fault. To fix the issue, we need to ensure the BPF JIT region is PMD size aligned. This patch acchieve this goal by restoring the BPF JIT region to original position, I.E the 128MB before kernel .text section. The modification to kasan_init.c is inspired by Alexandre. Fixes: fc8504765ec5 ("riscv: bpf: Avoid breaking W^X") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11riscv: Simplify xip and !xip kernel address conversion macrosAlexandre Ghiti
To simplify the kernel address conversion code, make the same definition of kernel_mapping_pa_to_va and kernel_mapping_va_to_pa compatible for both xip and !xip kernel by defining XIP_OFFSET to 0 in !xip kernel. Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-08riscv: fix build error when CONFIG_SMP is disabledBixuan Cui
Fix build error when disable CONFIG_SMP: mm/pgtable-generic.o: In function `.L19': pgtable-generic.c:(.text+0x42): undefined reference to `flush_pmd_tlb_range' mm/pgtable-generic.o: In function `pmdp_huge_clear_flush': pgtable-generic.c:(.text+0x6c): undefined reference to `flush_pmd_tlb_range' mm/pgtable-generic.o: In function `pmdp_invalidate': pgtable-generic.c:(.text+0x162): undefined reference to `flush_pmd_tlb_range' Fixes: e88b333142e4 ("riscv: mm: add THP support on 64-bit") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Acked-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29riscv: Use global mappings for kernel pagesGuo Ren
We map kernel pages into all addresses spages, so they can be marked as global. This allows hardware to avoid flushing the kernel mappings when moving between address spaces. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Christoph Hellwig <hch@lst.de> [Palmer: commit text] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25riscv: Move setup_bootmem into paging_initKefeng Wang
Make setup_bootmem() static. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25riscv: mremap speedup - enable HAVE_MOVE_PUD and HAVE_MOVE_PMDJisheng Zhang
HAVE_MOVE_PUD enables remapping pages at the PUD level if both the source and destination addresses are PUD-aligned. HAVE_MOVE_PMD does similar speedup on the PMD level. With HAVE_MOVE_PUD enabled, there is about a 143x improvement on qemu With HAVE_MOVE_PMD enabled, there is about a 5x improvement on qemu Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22riscv: mm: add THP support on 64-bitNanyong Sun
Bring Transparent HugePage support to riscv. A transparent huge page is always represented as a pmd. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22riscv: mm: make pmd_bad() check leaf conditionNanyong Sun
In the definition in Documentation/vm/arch_pgtable_helpers.rst, pmd_bad() means test a non-table mapped PMD, so it should also return true when it is a leaf page. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22riscv: mm: add _PAGE_LEAF macroNanyong Sun
In riscv, a page table entry is leaf when any bit of read, write, or execute bit is set. So add a macro:_PAGE_LEAF instead of (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC), which is frequently used to determine if it is a leaf page. This make code easier to read, without any functional change. Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-01RISC-V: Always define XIP_FIXUPPalmer Dabbelt
XIP depends on MMU, but XIP_FIXUP is used throughout the kernel in order to avoid excessive ifdefs. This just makes sure to always define XIP_FIXUP, which will fix MMU=n builds. XIP_OFFSET is used by assembly but XIP_FIXUP is C-only, so they're split. Fixes: 44c922572952 ("RISC-V: enable XIP") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26RISC-V: enable XIPVitaly Wool
Introduce XIP (eXecute In Place) support for RISC-V platforms. It allows code to be executed directly from non-volatile storage directly addressable by the CPU, such as QSPI NOR flash which can be found on many RISC-V platforms. This makes way for significant optimization of RAM footprint. The XIP kernel is not compressed since it has to run directly from flash, so it will occupy more space on the non-volatile storage. The physical flash address used to link the kernel object files and for storing it has to be known at compile time and is represented by a Kconfig option. XIP on RISC-V will for the time being only work on MMU-enabled kernels. Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> [Alex: Rebase on top of "Move kernel mapping outside the linear mapping" ] Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> [Palmer: disable XIP for allyesconfig] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-26riscv: Move kernel mapping outside of linear mappingAlexandre Ghiti
This is a preparatory patch for relocatable kernel and sv48 support. The kernel used to be linked at PAGE_OFFSET address therefore we could use the linear mapping for the kernel mapping. But the relocated kernel base address will be different from PAGE_OFFSET and since in the linear mapping, two different virtual addresses cannot point to the same physical address, the kernel mapping needs to lie outside the linear mapping so that we don't have to copy it at the same physical offset. The kernel mapping is moved to the last 2GB of the address space, BPF is now always after the kernel and modules use the 2GB memory range right before the kernel, so BPF and modules regions do not overlap. KASLR implementation will simply have to move the kernel in the last 2GB range and just take care of leaving enough space for BPF. In addition, by moving the kernel to the end of the address space, both sv39 and sv48 kernels will be exactly the same without needing to be relocated at runtime. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> [Palmer: Squash the STRICT_RWX fix, and a !MMU fix] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-02-26Merge tag 'riscv-for-linus-5.12-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of new RISC-V related patches for this merge window: - A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. - Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. - NUMA support for RISC-V, which involves making the arm64 code generic. - Support for kasan on the vmalloc region. - A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. - Support for allocating ASIDs. - Preliminary support for kernels larger than 128MiB. - Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. * tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits) riscv: Improve kasan population by using hugepages when possible riscv: Improve kasan population function riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization riscv: Improve kasan definitions riscv: Get rid of MAX_EARLY_MAPPING_SIZE soc: canaan: Sort the Makefile alphabetically riscv: Disable KSAN_SANITIZE for vDSO riscv: Remove unnecessary declaration riscv: Add Canaan Kendryte K210 SD card defconfig riscv: Update Canaan Kendryte K210 defconfig riscv: Add Kendryte KD233 board device tree riscv: Add SiPeed MAIXDUINO board device tree riscv: Add SiPeed MAIX GO board device tree riscv: Add SiPeed MAIX DOCK board device tree riscv: Add SiPeed MAIX BiT board device tree riscv: Update Canaan Kendryte K210 device tree dt-bindings: add resets property to dw-apb-timer dt-bindings: fix sifive gpio properties dt-bindings: update sifive uart compatible string dt-bindings: update sifive clint compatible string ...
2021-01-14riscv: Add support pte_protnone and pmd_protnone if CONFIG_NUMA_BALANCINGGreentime Hu
These two functions are used to distinguish between PROT_NONENUMA protections and hinting fault protections. Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-01-14riscv: Separate memory init from paging initAtish Patra
Currently, we perform some memory init functions in paging init. But, that will be an issue for NUMA support where DT needs to be flattened before numa initialization and memblock_present can only be called after numa initialization. Move memory initialization related functions to a separate function. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-01-09riscv: Drop a duplicated PAGE_KERNEL_EXECKefeng Wang
commit b91540d52a08 ("RISC-V: Add EFI runtime services") add a duplicated PAGE_KERNEL_EXEC, kill it. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Fixes: b91540d52a08 ("RISC-V: Add EFI runtime services") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-12-15arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOCMike Rapoport
The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never fail. With this assumption is wouldn't be safe to allow general usage of this function. Moreover, some architectures that implement __kernel_map_pages() have this function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap pages when page allocation debugging is disabled at runtime. As all the users of __kernel_map_pages() were converted to use debug_pagealloc_map_pages() it is safe to make it available only when DEBUG_PAGEALLOC is set. Link: https://lkml.kernel.org/r/20201109192128.960-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Len Brown <len.brown@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-02RISC-V: Add EFI runtime servicesAtish Patra
This patch adds EFI runtime service support for RISC-V. Signed-off-by: Atish Patra <atish.patra@wdc.com> [ardb: - Remove the page check] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-10-02RISC-V: Move DT mapping outof fixmapAnup Patel
Currently, RISC-V reserves 1MB of fixmap memory for device tree. However, it maps only single PMD (2MB) space for fixmap which leaves only < 1MB space left for other kernel features such as early ioremap which requires fixmap as well. The fixmap size can be increased by another 2MB but it brings additional complexity and changes the virtual memory layout as well. If we require some additional feature requiring fixmap again, it has to be moved again. Technically, DT doesn't need a fixmap as the memory occupied by the DT is only used during boot. That's why, We map device tree in early page table using two consecutive PGD mappings at lower addresses (< PAGE_OFFSET). This frees lot of space in fixmap and also makes maximum supported device tree size supported as PGDIR_SIZE. Thus, init memory section can be used for the same purpose as well. This simplifies fixmap implementation. Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-06-09mm: consolidate pte_index() and pte_offset_*() definitionsMike Rapoport
All architectures define pte_index() as (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1) and all architectures define pte_offset_kernel() as an entry in the array of PTEs indexed by the pte_index(). For the most architectures the pte_offset_kernel() implementation relies on the availability of pmd_page_vaddr() that converts a PMD entry value to the virtual address of the page containing PTEs array. Let's move x86 definitions of the PTE accessors to the generic place in <linux/pgtable.h> and then simply drop the respective definitions from the other architectures. The architectures that didn't provide pmd_page_vaddr() are updated to have that defined. The generic implementation of pte_offset_kernel() can be overridden by an architecture and alpha makes use of this because it has special ordering requirements for its version of pte_offset_kernel(). [rppt@linux.ibm.com: v2] Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org [akpm@linux-foundation.org: fix x86 warning] [sfr@canb.auug.org.au: fix powerpc build] Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm: switch the test_vmalloc module to use __vmalloc_nodeChristoph Hellwig
No need to export the very low-level __vmalloc_node_range when the test module can use a slightly higher level variant. [akpm@linux-foundation.org: add missing `node' arg] [akpm@linux-foundation.org: fix riscv nommu build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-26-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-13riscv: pgtable: Fix __kernel_map_pages build error if NOMMUKefeng Wang
riscv64-none-linux-gnu-ld: mm/page_alloc.o: in function `.L0 ': page_alloc.c:(.text+0xd34): undefined reference to `__kernel_map_pages' riscv64-none-linux-gnu-ld: page_alloc.c:(.text+0x104a): undefined reference to `__kernel_map_pages' riscv64-none-linux-gnu-ld: mm/page_alloc.o: in function `__pageblock_pfn_to_page': page_alloc.c:(.text+0x145e): undefined reference to `__kernel_map_pages' Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-05-12riscv: Add pgprot_writecombine/device and PAGE_SHARED defination if NOMMUKefeng Wang
Some drivers use PAGE_SHARED, pgprot_writecombine()/pgprot_device(), add the defination to fix build error if NOMMU. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-03-26riscv: Add support to dump the kernel page tablesZong Li
In a similar manner to arm64, x86, powerpc, etc., it can traverse all page tables, and dump the page table layout with the memory types and permissions. Add a debugfs file at /sys/kernel/debug/kernel_page_tables to export the page table layout to userspace. Signed-off-by: Zong Li <zong.li@sifive.com> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-03-05RISC-V: Move all address space definition macros to one placeAtish Patra
If both CONFIG_KASAN and CONFIG_SPARSEMEM_VMEMMAP are set, we get the following compilation error. --------------------------------------------------------------- ./arch/riscv/include/asm/pgtable-64.h: In function ‘pud_page’: ./include/asm-generic/memory_model.h:54:29: error: ‘vmemmap’ undeclared (first use in this function); did you mean ‘mem_map’? #define __pfn_to_page(pfn) (vmemmap + (pfn)) ^~~~~~~ ./include/asm-generic/memory_model.h:82:21: note: in expansion of macro ‘__pfn_to_page’ #define pfn_to_page __pfn_to_page ^~~~~~~~~~~~~ ./arch/riscv/include/asm/pgtable-64.h:70:9: note: in expansion of macro ‘pfn_to_page’ return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT); --------------------------------------------------------------- Fix the compliation errors by moving all the address space definition macros before including pgtable-64.h. Fixes: 8ad8b72721d0 (riscv: Add KASAN support) Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-02-04riscv: mm: add p?d_leaf() definitionsSteven Price
walk_page_range() is going to be allowed to walk page tables other than those of user space. For this it needs to know when it has reached a 'leaf' entry in the page tables. This information is provided by the p?d_leaf() functions/macros. For riscv a page is a leaf page when it has a read, write or execute bit set on it. Link: http://lkml.kernel.org/r/20191218162402.45610-8-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Reviewed-by: Zong Li <zong.li@sifive.com> Acked-by: Paul Walmsley <paul.walmsley@sifive.com> [arch/riscv] Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-12-27 The following pull-request contains BPF updates for your *net-next* tree. We've added 127 non-merge commits during the last 17 day(s) which contain a total of 110 files changed, 6901 insertions(+), 2721 deletions(-). There are three merge conflicts. Conflicts and resolution looks as follows: 1) Merge conflict in net/bpf/test_run.c: There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro") which gets in the way with b590cb5f802d ("bpf: Switch to offsetofend in BPF_PROG_TEST_RUN"): <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) + sizeof_field(struct __sk_buff, priority), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority), >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c There are a few occasions that look similar to this. Always take the chunk with offsetofend(). Note that there is one where the fields differ in here: <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) + sizeof_field(struct __sk_buff, tstamp), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs), >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to 850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN"). 2) Merge conflict in arch/riscv/net/bpf_jit_comp.c: (I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.) <<<<<<< HEAD if (is_13b_check(off, insn)) return -1; emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx); ======= emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx); >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Result should look like: emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx); 3) Merge conflict in arch/riscv/include/asm/pgtable.h: <<<<<<< HEAD ======= #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. */ #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) #define vmemmap ((struct page *)VMEMMAP_START) >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Only take the BPF_* defines from there and move them higher up in the same file. Remove the rest from the chunk. The VMALLOC_* etc defines got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page calls"). Result: [...] #define __S101 PAGE_READ_EXEC #define __S110 PAGE_SHARED_EXEC #define __S111 PAGE_SHARED_EXEC #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. */ #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) [...] Let me know if there are any other issues. Anyway, the main changes are: 1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific to a provided BPF object file. This provides an alternative, simplified API compared to standard libbpf interaction. Also, add libbpf extern variable resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko. 2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also, add various BPF riscv JIT improvements, from Björn Töpel. 3) Extend bpftool to allow matching BPF programs and maps by name, from Paul Chaignon. 4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI flag for allowing updates without service interruption, from Andrey Ignatov. 5) Cleanup and simplification of ring access functions for AF_XDP with a bonus of 0-5% performance improvement, from Magnus Karlsson. 6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa. 7) Move and extend test_select_reuseport into BPF program tests under BPF selftests, from Jakub Sitnicki. 8) Various BPF sample improvements for xdpsock for customizing parameters to set up and benchmark AF_XDP, from Jay Jayatheerthan. 9) Improve libbpf to provide a ulimit hint on permission denied errors. Also change XDP sample programs to attach in driver mode by default, from Toke Høiland-Jørgensen. 10) Extend BPF test infrastructure to allow changing skb mark from tc BPF programs, from Nikita V. Shirokov. 11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King. 12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after libbpf conversion, from Jesper Dangaard Brouer. 13) Minor misc improvements from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20riscv: define vmemmap before pfn_to_page callsDavid Abdurachmanov
pfn_to_page & page_to_pfn depend on vmemmap being available before the calls if kernel is configured with CONFIG_SPARSEMEM_VMEMMAP=y. This was caused by NOMMU changes which moved vmemmap definition bellow functions definitions calling pfn_to_page & page_to_pfn. Noticed while compiled 5.5-rc2 kernel for Fedora/RISCV. v2: - Add a comment for vmemmap in source Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com> Fixes: 6bd33e1ece52 ("riscv: add nommu support") Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>