Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
|
|
Pull ARM updates from Russell King:
- Updates to AMBA bus subsystem to drop .owner struct device_driver
initialisations, moving that to code instead.
- Add LPAE privileged-access-never support
- Add support for Clang CFI
- clkdev: report over-sized device or connection strings
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: (36 commits)
ARM: 9398/1: Fix userspace enter on LPAE with CC_OPTIMIZE_FOR_SIZE=y
clkdev: report over-sized strings when creating clkdev entries
ARM: 9393/1: mm: Use conditionals for CFI branches
ARM: 9392/2: Support CLANG CFI
ARM: 9391/2: hw_breakpoint: Handle CFI breakpoints
ARM: 9390/2: lib: Annotate loop delay instructions for CFI
ARM: 9389/2: mm: Define prototypes for all per-processor calls
ARM: 9388/2: mm: Type-annotate all per-processor assembly routines
ARM: 9387/2: mm: Rewrite cacheflush vtables in CFI safe C
ARM: 9386/2: mm: Use symbol alias for cache functions
ARM: 9385/2: mm: Type-annotate all cache assembly routines
ARM: 9384/2: mm: Make tlbflush routines CFI safe
ARM: 9382/1: ftrace: Define ftrace_stub_graph
ARM: 9358/2: Implement PAN for LPAE by TTBR0 page table walks disablement
ARM: 9357/2: Reduce the number of #ifdef CONFIG_CPU_SW_DOMAIN_PAN
ARM: 9356/2: Move asm statements accessing TTBCR into C functions
ARM: 9355/2: Add TTBCR_* definitions to pgtable-3level-hwdef.h
ARM: 9379/1: coresight: tpda: drop owner assignment
ARM: 9378/1: coresight: etm4x: drop owner assignment
ARM: 9377/1: hwrng: nomadik: drop owner assignment
...
|
|
|
|
execmem does not depend on modules, on the contrary modules use
execmem.
To make execmem available when CONFIG_MODULES=n, for instance for
kprobes, split execmem_params initialization out from
arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
Extend execmem parameters to accommodate more complex overrides of
module_alloc() by architectures.
This includes specification of a fallback range required by arm, arm64
and powerpc, EXECMEM_MODULE_DATA type required by powerpc, support for
allocation of KASAN shadow required by s390 and x86 and support for
late initialization of execmem required by arm64.
The core implementation of execmem_alloc() takes care of suppressing
warnings when the initial allocation fails but there is a fallback range
defined.
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Tested-by: Liviu Dudau <liviu@dudau.co.uk>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Add cpufreq pressure feedback for the scheduler
- Rework misfit load-balancing wrt affinity restrictions
- Clean up and simplify the code around ::overutilized and
::overload access.
- Simplify sched_balance_newidle()
- Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
handling that changed the output.
- Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()
- Reorganize, clean up and unify most of the higher level
scheduler balancing function names around the sched_balance_*()
prefix
- Simplify the balancing flag code (sched_balance_running)
- Miscellaneous cleanups & fixes
* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
sched/pelt: Remove shift of thermal clock
sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
thermal/cpufreq: Remove arch_update_thermal_pressure()
sched/cpufreq: Take cpufreq feedback into account
cpufreq: Add a cpufreq pressure feedback for the scheduler
sched/fair: Fix update of rd->sg_overutilized
sched/vtime: Do not include <asm/vtime.h> header
s390/irq,nmi: Include <asm/vtime.h> header directly
s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
sched/vtime: Get rid of generic vtime_task_switch() implementation
sched/vtime: Remove confusing arch_vtime_task_switch() declaration
sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
sched/fair: Rename root_domain::overload to ::overloaded
sched/fair: Use helper functions to access root_domain::overload
sched/fair: Check root_domain::overload value before update
sched/fair: Combine EAS check with root_domain::overutilized access
sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
- Combine perf and BPF for fast evalution of HW breakpoint
conditions
- Add LBR capture support outside of hardware events
- Trigger IO signals for watermark_wakeup
- Add RAPL support for Intel Arrow Lake and Lunar Lake
- Optimize frequency-throttling
- Miscellaneous cleanups & fixes
* tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too
selftests/perf_events: Test FASYNC with watermark wakeups
perf/ring_buffer: Trigger IO signals for watermark_wakeup
perf: Move perf_event_fasync() to perf_event.h
perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines
selftest/bpf: Test a perf BPF program that suppresses side effects
perf/bpf: Allow a BPF program to suppress all sample side effects
perf/bpf: Remove unneeded uses_default_overflow_handler()
perf/bpf: Call BPF handler directly, not through overflow machinery
perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members
perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL
perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow()
perf/x86/rapl: Add support for Intel Lunar Lake
perf/x86/rapl: Add support for Intel Arrow Lake
perf/core: Reduce PMU access to adjust sample freq
perf/core: Optimize perf_adjust_freq_unthr_context()
perf/x86/amd: Don't reject non-sampling events with configured LBR
perf/x86/amd: Support capturing LBR from software events
perf/x86/amd: Avoid taking branches before disabling LBR
perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
...
|
|
This registers a breakpoint handler for the new breakpoint type
(0x03) inserted by LLVM CLANG for CFI breakpoints.
If we are in permissive mode, just print a backtrace and continue.
Example with CONFIG_CFI_PERMISSIVE enabled:
> echo CFI_FORWARD_PROTO > /sys/kernel/debug/provoke-crash/DIRECT
lkdtm: Performing direct entry CFI_FORWARD_PROTO
lkdtm: Calling matched prototype ...
lkdtm: Calling mismatched prototype ...
CFI failure at lkdtm_indirect_call+0x40/0x4c (target: 0x0; expected type: 0x00000000)
WARNING: CPU: 1 PID: 112 at lkdtm_indirect_call+0x40/0x4c
CPU: 1 PID: 112 Comm: sh Not tainted 6.8.0-rc1+ #150
Hardware name: ARM-Versatile Express
(...)
lkdtm: FAIL: survived mismatched prototype function call!
lkdtm: Unexpected! This kernel (6.8.0-rc1+ armv7l) was built with CONFIG_CFI_CLANG=y
As you can see the LKDTM test fails, but I expect that this would be
expected behaviour in the permissive mode.
We are currently not implementing target and type for the CFI
breakpoint as this requires additional operand bundling compiler
extensions.
CPUs without breakpoint support cannot handle breakpoints naturally,
in these cases the permissive mode will not work, CFI will fall over
on an undefined instruction:
Internal error: Oops - undefined instruction: 0 [#1] PREEMPT ARM
CPU: 0 PID: 186 Comm: ash Tainted: G W 6.9.0-rc1+ #7
Hardware name: Gemini (Device Tree)
PC is at lkdtm_indirect_call+0x38/0x4c
LR is at lkdtm_CFI_FORWARD_PROTO+0x30/0x6c
This is reasonable I think: it's the best CFI can do to ascertain
the the control flow is not broken on these CPUs.
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
We found below OOB crash:
[ 33.452494] ==================================================================
[ 33.453513] BUG: KASAN: stack-out-of-bounds in refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.454660] Write of size 164 at addr c1d03d30 by task swapper/0/0
[ 33.455515]
[ 33.455767] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.1.25-mainline #1
[ 33.456880] Hardware name: Generic DT based system
[ 33.457555] unwind_backtrace from show_stack+0x18/0x1c
[ 33.458326] show_stack from dump_stack_lvl+0x40/0x4c
[ 33.459072] dump_stack_lvl from print_report+0x158/0x4a4
[ 33.459863] print_report from kasan_report+0x9c/0x148
[ 33.460616] kasan_report from kasan_check_range+0x94/0x1a0
[ 33.461424] kasan_check_range from memset+0x20/0x3c
[ 33.462157] memset from refresh_cpu_vm_stats.constprop.0+0xcc/0x2ec
[ 33.463064] refresh_cpu_vm_stats.constprop.0 from tick_nohz_idle_stop_tick+0x180/0x53c
[ 33.464181] tick_nohz_idle_stop_tick from do_idle+0x264/0x354
[ 33.465029] do_idle from cpu_startup_entry+0x20/0x24
[ 33.465769] cpu_startup_entry from rest_init+0xf0/0xf4
[ 33.466528] rest_init from arch_post_acpi_subsys_init+0x0/0x18
[ 33.467397]
[ 33.467644] The buggy address belongs to stack of task swapper/0/0
[ 33.468493] and is located at offset 112 in frame:
[ 33.469172] refresh_cpu_vm_stats.constprop.0+0x0/0x2ec
[ 33.469917]
[ 33.470165] This frame has 2 objects:
[ 33.470696] [32, 76) 'global_zone_diff'
[ 33.470729] [112, 276) 'global_node_diff'
[ 33.471294]
[ 33.472095] The buggy address belongs to the physical page:
[ 33.472862] page:3cd72da8 refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x41d03
[ 33.473944] flags: 0x1000(reserved|zone=0)
[ 33.474565] raw: 00001000 ed741470 ed741470 00000000 00000000 00000000 ffffffff 00000001
[ 33.475656] raw: 00000000
[ 33.476050] page dumped because: kasan: bad access detected
[ 33.476816]
[ 33.477061] Memory state around the buggy address:
[ 33.477732] c1d03c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.478630] c1d03c80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[ 33.479526] >c1d03d00: 00 04 f2 f2 f2 f2 00 00 00 00 00 00 f1 f1 f1 f1
[ 33.480415] ^
[ 33.481195] c1d03d80: 00 00 00 00 00 00 00 00 00 00 04 f3 f3 f3 f3 f3
[ 33.482088] c1d03e00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[ 33.482978] ==================================================================
We find the root cause of this OOB is that arm does not clear stale stack
poison in the case of cpuidle.
This patch refer to arch/arm64/kernel/sleep.S to resolve this issue.
From cited commit [1] that explain the problem
Functions which the compiler has instrumented for KASAN place poison on
the stack shadow upon entry and remove this poison prior to returning.
In the case of cpuidle, CPUs exit the kernel a number of levels deep in
C code. Any instrumented functions on this critical path will leave
portions of the stack shadow poisoned.
If CPUs lose context and return to the kernel via a cold path, we
restore a prior context saved in __cpu_suspend_enter are forgotten, and
we never remove the poison they placed in the stack shadow area by
functions calls between this and the actual exit of the kernel.
Thus, (depending on stackframe layout) subsequent calls to instrumented
functions may hit this stale poison, resulting in (spurious) KASAN
splats to the console.
To avoid this, clear any stale poison from the idle thread for a CPU
prior to bringing a CPU online.
From cited commit [2]
Extend to check for CONFIG_KASAN_STACK
[1] commit 0d97e6d8024c ("arm64: kasan: clear stale stack poison")
[2] commit d56a9ef84bd0 ("kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK")
Signed-off-by: Boy Wu <boy.wu@mediatek.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Fixes: 5615f69bc209 ("ARM: 9016/2: Initialize the mapping of KASan shadow memory")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Patch series "Memory allocation profiling", v6.
Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.
Example output:
root@moria-kvm:~# sort -rn /proc/allocinfo
127664128 31168 mm/page_ext.c:270 func:alloc_page_ext
56373248 4737 mm/slub.c:2259 func:alloc_slab_page
14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded
14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio
9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node
4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
3940352 962 mm/memory.c:4214 func:alloc_anon_folio
2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
...
Usage:
kconfig options:
- CONFIG_MEM_ALLOC_PROFILING
- CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
- CONFIG_MEM_ALLOC_PROFILING_DEBUG
adds warnings for allocations that weren't accounted because of a
missing annotation
sysctl:
/proc/sys/vm/mem_profiling
Runtime info:
/proc/allocinfo
Notes:
[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y
Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:
kmalloc pgalloc
(1 baseline) 6.764s 16.902s
(2 default disabled) 6.793s (+0.43%) 17.007s (+0.62%)
(3 default enabled) 7.197s (+6.40%) 23.666s (+40.02%)
(4 runtime enabled) 7.405s (+9.48%) 23.901s (+41.41%)
(5 memcg) 13.388s (+97.94%) 48.460s (+186.71%)
(6 def disabled+memcg) 13.332s (+97.10%) 48.105s (+184.61%)
(7 def enabled+memcg) 13.446s (+98.78%) 54.963s (+225.18%)
Memory overhead:
Kernel size:
text data bss dec diff
(1) 26515311 18890222 17018880 62424413
(2) 26524728 19423818 16740352 62688898 264485
(3) 26524724 19423818 16740352 62688894 264481
(4) 26524728 19423818 16740352 62688898 264485
(5) 26541782 18964374 16957440 62463596 39183
Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags: 192 kB
PageExts: 262144 kB (256MB)
SlabExts: 9876 kB (9.6MB)
PcpuExts: 512 kB (0.5MB)
Total overhead is 0.2% of total memory.
Benchmarks:
Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
baseline disabled profiling enabled profiling
avg 0.3543 0.3559 (+0.0016) 0.3566 (+0.0023)
stdev 0.0137 0.0188 0.0077
hackbench -l 10000
baseline disabled profiling enabled profiling
avg 6.4218 6.4306 (+0.0088) 6.5077 (+0.0859)
stdev 0.0933 0.0286 0.0489
stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/
[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/
This patch (of 37):
The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.
[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Several architectures defines this stub for the graph tracer,
and it is needed for CFI, as it needs a separate symbol for it.
The trick from include/asm-generic/vmlinux.lds.h to define
ftrace_stub_graph to ftrace_stub isn't working when using CFI.
Commit 883bbbffa5a4 contains the details.
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
With LPAE enabled, privileged no-access cannot be enforced using CPU
domains as such feature is not available. This patch implements PAN
by disabling TTBR0 page table walks while in kernel mode.
The ARM architecture allows page table walks to be split between TTBR0
and TTBR1. With LPAE enabled, the split is defined by a combination of
TTBCR T0SZ and T1SZ bits. Currently, an LPAE-enabled kernel uses TTBR0
for user addresses and TTBR1 for kernel addresses with the VMSPLIT_2G
and VMSPLIT_3G configurations. The main advantage for the 3:1 split is
that TTBR1 is reduced to 2 levels, so potentially faster TLB refill
(though usually the first level entries are already cached in the TLB).
The PAN support on LPAE-enabled kernels uses TTBR0 when running in user
space or in kernel space during user access routines (TTBCR T0SZ and
T1SZ are both 0). When running user accesses are disabled in kernel
mode, TTBR0 page table walks are disabled by setting TTBCR.EPD0. TTBR1
is used for kernel accesses (including loadable modules; anything
covered by swapper_pg_dir) by reducing the TTBCR.T0SZ to the minimum
(2^(32-7) = 32MB). To avoid user accesses potentially hitting stale TLB
entries, the ASID is switched to 0 (reserved) by setting TTBCR.A1 and
using the ASID value in TTBR1. The difference from a non-PAN kernel is
that with the 3:1 memory split, TTBR1 always uses 3 levels of page
tables.
As part of the change we are using preprocessor elif definied() clauses
so balance these clauses by converting relevant precedingt ifdef
clauses to if defined() clauses.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Now that struct perf_event's orig_overflow_handler is gone, there's no need
for the functions and macros to support looking past overflow_handler to
orig_overflow_handler.
This patch is solely a refactoring and results in no behavior change.
Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240412015019.7060-6-khuey@kylehuey.com
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pull ARM updates from Russell King:
- remove a misuse of kernel-doc comment
- use "Call trace:" for backtraces like other architectures
- implement copy_from_kernel_nofault_allowed() to fix a LKDTM test
- add a "cut here" line for prefetch aborts
- remove unnecessary Kconfing entry for FRAME_POINTER
- remove iwmmxy support for PJ4/PJ4B cores
- use bitfield helpers in ptrace to improve readabililty
- check if folio is reserved before flushing
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9359/1: flush: check if the folio is reserved for no-mapping addresses
ARM: 9354/1: ptrace: Use bitfield helpers
ARM: 9352/1: iwmmxt: Remove support for PJ4/PJ4B cores
ARM: 9353/1: remove unneeded entry for CONFIG_FRAME_POINTER
ARM: 9351/1: fault: Add "cut here" line for prefetch aborts
ARM: 9350/1: fault: Implement copy_from_kernel_nofault_allowed()
ARM: 9349/1: unwind: Add missing "Call trace:" line
ARM: 9334/1: mm: init: remove misuse of kernel-doc comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory. Series
"implement "memmap on memory" feature on s390".
- More folio conversions from Matthew Wilcox in the series
"Convert memcontrol charge moving to use folios"
"mm: convert mm counter to take a folio"
- Chengming Zhou has optimized zswap's rbtree locking, providing
significant reductions in system time and modest but measurable
reductions in overall runtimes. The series is "mm/zswap: optimize the
scalability of zswap rb-tree".
- Chengming Zhou has also provided the series "mm/zswap: optimize zswap
lru list" which provides measurable runtime benefits in some
swap-intensive situations.
- And Chengming Zhou further optimizes zswap in the series "mm/zswap:
optimize for dynamic zswap_pools". Measured improvements are modest.
- zswap cleanups and simplifications from Yosry Ahmed in the series
"mm: zswap: simplify zswap_swapoff()".
- In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
contributed several DAX cleanups as well as adding a sysfs tunable to
control the memmap_on_memory setting when the dax device is
hotplugged as system memory.
- Johannes Weiner has added the large series "mm: zswap: cleanups",
which does that.
- More DAMON work from SeongJae Park in the series
"mm/damon: make DAMON debugfs interface deprecation unignorable"
"selftests/damon: add more tests for core functionalities and corner cases"
"Docs/mm/damon: misc readability improvements"
"mm/damon: let DAMOS feeds and tame/auto-tune itself"
- In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
extension" Rakie Kim has developed a new mempolicy interleaving
policy wherein we allocate memory across nodes in a weighted fashion
rather than uniformly. This is beneficial in heterogeneous memory
environments appearing with CXL.
- Christophe Leroy has contributed some cleanup and consolidation work
against the ARM pagetable dumping code in the series "mm: ptdump:
Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".
- Luis Chamberlain has added some additional xarray selftesting in the
series "test_xarray: advanced API multi-index tests".
- Muhammad Usama Anjum has reworked the selftest code to make its
human-readable output conform to the TAP ("Test Anything Protocol")
format. Amongst other things, this opens up the use of third-party
tools to parse and process out selftesting results.
- Ryan Roberts has added fork()-time PTE batching of THP ptes in the
series "mm/memory: optimize fork() with PTE-mapped THP". Mainly
targeted at arm64, this significantly speeds up fork() when the
process has a large number of pte-mapped folios.
- David Hildenbrand also gets in on the THP pte batching game in his
series "mm/memory: optimize unmap/zap with PTE-mapped THP". It
implements batching during munmap() and other pte teardown
situations. The microbenchmark improvements are nice.
- And in the series "Transparent Contiguous PTEs for User Mappings"
Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte
mappings"). Kernel build times on arm64 improved nicely. Ryan's
series "Address some contpte nits" provides some followup work.
- In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
fixed an obscure hugetlb race which was causing unnecessary page
faults. He has also added a reproducer under the selftest code.
- In the series "selftests/mm: Output cleanups for the compaction
test", Mark Brown did what the title claims.
- Kinsey Ho has added the series "mm/mglru: code cleanup and
refactoring".
- Even more zswap material from Nhat Pham. The series "fix and extend
zswap kselftests" does as claimed.
- In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
regression" Mathieu Desnoyers has cleaned up and fixed rather a mess
in our handling of DAX on archiecctures which have virtually aliasing
data caches. The arm architecture is the main beneficiary.
- Lokesh Gidra's series "per-vma locks in userfaultfd" provides
dramatic improvements in worst-case mmap_lock hold times during
certain userfaultfd operations.
- Some page_owner enhancements and maintenance work from Oscar Salvador
in his series
"page_owner: print stacks and their outstanding allocations"
"page_owner: Fixup and cleanup"
- Uladzislau Rezki has contributed some vmalloc scalability
improvements in his series "Mitigate a vmap lock contention". It
realizes a 12x improvement for a certain microbenchmark.
- Some kexec/crash cleanup work from Baoquan He in the series "Split
crash out from kexec and clean up related config items".
- Some zsmalloc maintenance work from Chengming Zhou in the series
"mm/zsmalloc: fix and optimize objects/page migration"
"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"
- Zi Yan has taught the MM to perform compaction on folios larger than
order=0. This a step along the path to implementaton of the merging
of large anonymous folios. The series is named "Enable >0 order folio
memory compaction".
- Christoph Hellwig has done quite a lot of cleanup work in the
pagecache writeback code in his series "convert write_cache_pages()
to an iterator".
- Some modest hugetlb cleanups and speedups in Vishal Moola's series
"Handle hugetlb faults under the VMA lock".
- Zi Yan has changed the page splitting code so we can split huge pages
into sizes other than order-0 to better utilize large folios. The
series is named "Split a folio to any lower order folios".
- David Hildenbrand has contributed the series "mm: remove
total_mapcount()", a cleanup.
- Matthew Wilcox has sought to improve the performance of bulk memory
freeing in his series "Rearrange batched folio freeing".
- Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
provides large improvements in bootup times on large machines which
are configured to use large numbers of hugetlb pages.
- Matthew Wilcox's series "PageFlags cleanups" does that.
- Qi Zheng's series "minor fixes and supplement for ptdesc" does that
also. S390 is affected.
- Cleanups to our pagemap utility functions from Peter Xu in his series
"mm/treewide: Replace pXd_large() with pXd_leaf()".
- Nico Pache has fixed a few things with our hugepage selftests in his
series "selftests/mm: Improve Hugepage Test Handling in MM
Selftests".
- Also, of course, many singleton patches to many things. Please see
the individual changelogs for details.
* tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits)
mm/zswap: remove the memcpy if acomp is not sleepable
crypto: introduce: acomp_is_async to expose if comp drivers might sleep
memtest: use {READ,WRITE}_ONCE in memory scanning
mm: prohibit the last subpage from reusing the entire large folio
mm: recover pud_leaf() definitions in nopmd case
selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements
selftests/mm: skip uffd hugetlb tests with insufficient hugepages
selftests/mm: dont fail testsuite due to a lack of hugepages
mm/huge_memory: skip invalid debugfs new_order input for folio split
mm/huge_memory: check new folio order when split a folio
mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure
mm: add an explicit smp_wmb() to UFFDIO_CONTINUE
mm: fix list corruption in put_pages_list
mm: remove folio from deferred split list before uncharging it
filemap: avoid unnecessary major faults in filemap_fault()
mm,page_owner: drop unnecessary check
mm,page_owner: check for null stack_record before bumping its refcount
mm: swap: fix race between free_swap_and_cache() and swapoff()
mm/treewide: align up pXd_leaf() retval across archs
mm/treewide: drop pXd_large()
...
|
|
Standardize scheduler load-balancing function names on the
sched_balance_() prefix.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://lore.kernel.org/r/20240308111819.1101550-5-mingo@kernel.org
|
|
PJ4 is a v7 core that incorporates a iWMMXt coprocessor. However, GCC
does not support this combination (its iWMMXt configuration always
implies v5te), and so there is no v6/v7 user space that actually makes
use of this, beyond generic support for things like setjmp() that
preserve/restore the iWMMXt register file using generic LDC/STC
instructions emitted in assembler. As [0] appears to imply, this logic
is triggered for the init process at boot, and so most user threads will
have a iWMMXt register context associated with it, even though it is
never used.
At this point, it is highly unlikely that such GCC support will ever
materialize (and Clang does not implement support for iWMMXt to begin
with).
This means that advertising iWMMXt support on these cores results in
context switch overhead without any associated benefit, and so it is
better to simply ignore the iWMMXt unit on these systems. So rip out the
support. Doing so also fixes the issue reported in [0] related to UNDEF
handling of co-processor #0/#1 instructions issued from user space
running in Thumb2 mode.
The PJ4 cores are used in four platforms: Armada 370/xp, Dove (Cubox,
d2plug), MMP2 (xo-1.75) and Berlin (Google TV). Out of these, only the
first is still widely used, but that one actually doesn't have iWMMXt
but instead has only VFPV3-D16, and so it is not impacted by this
change.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218427 [0]
Fixes: 8bcba70cb5c22 ("ARM: entry: Disregard Thumb undef exception ...")
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Every other architecture in Linux includes the line "Call trace:" before
backtraces. In some cases ARM would print "Backtrace:", but this was
only via 1 specific call path, and wasn't included in CPU Oops nor things
like KASAN, UBSAN, etc that called dump_stack(). Regularize this line
so CI systems and other things (like LKDTM) that depend on parsing
"Call trace:" out of dmesg will see it for ARM.
Before this patch:
UBSAN: array-index-out-of-bounds in ../drivers/misc/lkdtm/bugs.c:376:16
index 8 is out of range for type 'char [8]'
CPU: 0 PID: 1402 Comm: cat Not tainted 6.7.0-rc2 #1
Hardware name: Generic DT based system
dump_backtrace from show_stack+0x20/0x24
r7:00000042 r6:00000000 r5:60070013 r4:80cf5d7c
show_stack from dump_stack_lvl+0x88/0x98
dump_stack_lvl from dump_stack+0x18/0x1c
r7:00000042 r6:00000008 r5:00000008 r4:80fab118
dump_stack from ubsan_epilogue+0x10/0x3c
ubsan_epilogue from __ubsan_handle_out_of_bounds+0x80/0x84
...
After this patch:
UBSAN: array-index-out-of-bounds in ../drivers/misc/lkdtm/bugs.c:376:16
index 8 is out of range for type 'char [8]'
CPU: 0 PID: 1402 Comm: cat Not tainted 6.7.0-rc2 #1
Hardware name: Generic DT based system
Call trace:
dump_backtrace from show_stack+0x20/0x24
r7:00000042 r6:00000000 r5:60070013 r4:80cf5d7c
show_stack from dump_stack_lvl+0x88/0x98
dump_stack_lvl from dump_stack+0x18/0x1c
r7:00000042 r6:00000008 r5:00000008 r4:80fab118
dump_stack from ubsan_epilogue+0x10/0x3c
ubsan_epilogue from __ubsan_handle_out_of_bounds+0x80/0x84
...
Link: https://lore.kernel.org/r/20240110215554.work.460-kees@kernel.org
Reported-by: Mark Brown <broonie@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Keith Packard <keithpac@amazon.com>
Cc: Haibo Li <haibo.li@mediatek.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Nathan reported below building error:
=====
$ curl -LSso .config https://git.alpinelinux.org/aports/plain/community/linux-edge/config-edge.armv7
$ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- olddefconfig all
..
arm-linux-gnueabi-ld: arch/arm/kernel/machine_kexec.o: in function `arch_crash_save_vmcoreinfo':
machine_kexec.c:(.text+0x488): undefined reference to `vmcoreinfo_append_str'
====
On architecutres, like arm, s390, ppc, sh, function
arch_crash_save_vmcoreinfo() is located in machine_kexec.c and it can
only be compiled in when CONFIG_KEXEC_CORE=y.
That's not right because arch_crash_save_vmcoreinfo() is used to export
arch specific vmcoreinfo. CONFIG_VMCORE_INFO is supposed to control its
compiling in. However, CONFIG_VMVCORE_INFO could be independent of
CONFIG_KEXEC_CORE, e.g CONFIG_PROC_KCORE=y will select CONFIG_VMVCORE_INFO.
Or CONFIG_KEXEC/CONFIG_KEXEC_FILE is set while CONFIG_CRASH_DUMP is
not set, it will report linking error.
So, on arm, s390, ppc and sh, move arch_crash_save_vmcoreinfo out to
a new file vmcore_info.c. Let CONFIG_VMCORE_INFO decide if compiling in
arch_crash_save_vmcoreinfo().
[akpm@linux-foundation.org: remove stray newlines at eof]
Link: https://lkml.kernel.org/r/20240129135033.157195-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20240126045551.GA126645@dev-arch.thelio-3990X/T/#u
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Now crash codes under kernel/ folder has been split out from kexec
code, crash dumping can be separated from kexec reboot in config
items on arm with some adjustments.
Here use CONFIG_CRASH_RESERVE ifdef to replace CONFIG_KEXEC ifdef.
Link: https://lkml.kernel.org/r/20240124051254.67105-14-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The vDSO data page "union vdso_data_store" is defined in an ARM specific
header file and also defined in several other places.
Move the definition from the ARM header file into the generic vdso datapage
header to make it also usable for others and to prevent code duplication.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240219153939.75719-5-anna-maria@linutronix.de
|
|
Pull ARM SoC code updates from Arnd Bergmann:
"There are two notable changes this time:
- add a arch/arm/Kconfig.platforms file to simplify the platforms
that have no code except their Kconfig file (Andrew Davis)
- remove support for the ARM11MPCore CPU in the versatile/realview
platform. Since this is the last remaining one after removing
ox820, some core code can go as well (Linus Walleij)
The other changes are minor cleanups and bugfixes"
* tag 'soc-arm-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: davinci: always select CONFIG_CPU_ARM926T
soc: pxa: ssp: fix casts
ARM: debug: fix DEBUG_UNCOMPRESS help for !MULTIPLATFORM
ARM: MAINTAINERS: drop empty entries for removed boards
ARM: Delete ARM11MPCore perf leftovers
ARM: mach-nspire: Rework support and directory structure
ARM: mach-sunplus: Rework support and directory structure
ARM: mach-airoha: Rework support and directory structure
ARM: mach-moxart: Move MOXA ART support into Kconfig.platforms
ARM: mach-uniphier: Move Socionext UniPhier support into Kconfig.platforms
ARM: mach-rda: Move RDA Micro support into Kconfig.platforms
ARM: mach-asm9260: Move ASM9260 support into Kconfig.platforms
ARM: Kconfig: move platform selection into its own Kconfig file
ARM: Delete ARM11MPCore (ARM11 ARMv6K SMP) support
MAINTAINERS: add Marvell MBus driver to Marvell EBU SoCs support
ARM: mxs: Do not search for "fsl,clkctrl"
ARM: imx: Use device_get_match_data()
MAINTAINERS: add omap bus drivers to OMAP2+ SUPPORT
ARM: at91: pm: set soc_pm.data.mode in at91_pm_secure_init()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- Introduce the param_unknown_fn type and other clean ups (Andy
Shevchenko)
- Various __counted_by annotations (Christophe JAILLET, Gustavo A. R.
Silva, Kees Cook)
- Add KFENCE test to LKDTM (Stephen Boyd)
- Various strncpy() refactorings (Justin Stitt)
- Fix qnx4 to avoid writing into the smaller of two overlapping buffers
- Various strlcpy() refactorings
* tag 'hardening-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
qnx4: Use get_directory_fname() in qnx4_match()
qnx4: Extract dir entry filename processing into helper
atags_proc: Add __counted_by for struct buffer and use struct_size()
tracing/uprobe: Replace strlcpy() with strscpy()
params: Fix multi-line comment style
params: Sort headers
params: Use size_add() for kmalloc()
params: Do not go over the limit when getting the string length
params: Introduce the param_unknown_fn type
lkdtm: Add kfence read after free crash type
nvme-fc: replace deprecated strncpy with strscpy
nvdimm/btt: replace deprecated strncpy with strscpy
nvme-fabrics: replace deprecated strncpy with strscpy
drm/modes: replace deprecated strncpy with strscpy_pad
afs: Add __counted_by for struct afs_acl and use struct_size()
VMCI: Annotate struct vmci_handle_arr with __counted_by
i40e: Annotate struct i40e_qvlist_info with __counted_by
HID: uhid: replace deprecated strncpy with strscpy
samples: Replace strlcpy() with strscpy()
SUNRPC: Replace strlcpy() with strscpy()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"CPU features:
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye
olde Thunder-X machines
- Avoid mapping KPTI trampoline when it is not required
- Make CPU capability API more robust during early initialisation
Early idreg overrides:
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped
FPsimd:
- Restore kernel-mode fpsimd context lazily, allowing us to run
fpsimd code sequences in the kernel with pre-emption enabled
KBuild:
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y
- Makefile cleanups
LPA2 prep:
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
Misc:
- Remove dead code and fix a typo
MM:
- Pass NUMA node information for IRQ stack allocations
Perf:
- Add perf support for the Synopsys DesignWare PCIe PMU
- Add support for event counting thresholds (FEAT_PMUv3_TH)
introduced in Armv8.8
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations
RIP VPIPT:
- Remove what support we had for the obsolete VPIPT I-cache policy
Selftests:
- Improvements to the SVE and SME selftests
Stacktrace:
- Refactor kernel unwind logic so that it can used by BPF unwinding
and, eventually, reliable backtracing
Sysregs:
- Update a bunch of register definitions based on the latest XML drop
from Arm"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
kselftest/arm64: Don't probe the current VL for unsupported vector types
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
arm64: properly install vmlinuz.efi
arm64/sysreg: Add missing system instruction definitions for FGT
arm64/sysreg: Add missing system register definitions for FGT
arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
arm64: memory: remove duplicated include
arm: perf: Fix ARCH=arm build with GCC
arm64: Align boot cpucap handling with system cpucap handling
arm64: Cleanup system cpucap handling
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
arm64: irq: set the correct node for shadow call stack
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
...
|
|
My commit deleting the PB11MPCore apparently left a few dangling
structs in the perf event code. Fix it up.
Fixes: 2560cffd2134 ("ARM: Delete ARM11MPCore (ARM11 ARMv6K SMP) support")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://lore.kernel.org/r/20231224-drop-11mpcore-fix-v1-1-d8b16d1c1fae@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
This ARM11 SMP configuration was one of the first SMP configurations
the ARM kernel supported, but it has the downside of odd DMA handling,
odd cache tagging, and often (as of recent) completely broken cache
handling on the ARM RealView PB11MPCore test chips. To boot the
platform it was necessary to completely disable the cache.
When it comes to the EB 11MPCore it is unclear if this ever worked.
These reference designs are now the only ARMv6K SMP platforms.
As only reference designs of purely academic interest remain, and
since the special-cased DMA and PMU code is hard to maintain and
doesn't really work, it is not really worth our time.
Delete the ARM11MPCore support along with:
- The special DMA quirk CONFIG_DMA_CACHE_RWFO that is only used
on ARMv6K SMP, and we are the last ARMV6K system leaving the
building and the cache handling is awkward, so good-bye.
- The special PMU handling that was only used by ARM11MPCore.
The following is left behind:
- TIMER_OF_DECLARE(arm_twd_11mp, "arm,arm11mp-twd-timer", ...)
in arch/arm/kernel/smp_twd.c, this is still in use by Marvell MMP3
arch/arm/boot/dts/marvell/mmp3.dtsi
- IRQCHIP_DECLARE(arm11mp_gic, "arm,arm11mp-gic", ...)
in drivers/irqchip/irq-gic.c, this is still in use by Marvell MMP3
arch/arm/boot/dts/marvell/mmp3.dtsi
- A compatible for the arm11mpcore SCU, since this was mistakedly
used for the Cortex-A9 version of RealView EB.
These are unfortunate but will need to be kept around for
compatibility. New Marvell-specific compatibles should however probably
be added.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://lore.kernel.org/r/20231207-drop-11mpcore-v2-1-560b396f3bf5@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
-EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them.
This will allow __hw_perf_event_init() to return a different code or not
print that particular message for a different error in the next commit.
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-10-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Ignat Korchagin complained that a potential config regression was
introduced by commit 89cde455915f ("kexec: consolidate kexec and crash
options into kernel/Kconfig.kexec"). Before the commit, CONFIG_CRASH_DUMP
has no dependency on CONFIG_KEXEC. After the commit, CRASH_DUMP selects
KEXEC. That enforces system to have CONFIG_KEXEC=y as long as
CONFIG_CRASH_DUMP=Y which people may not want.
In Ignat's case, he sets CONFIG_CRASH_DUMP=y, CONFIG_KEXEC_FILE=y and
CONFIG_KEXEC=n because kexec_load interface could have security issue if
kernel/initrd has no chance to be signed and verified.
CRASH_DUMP has select of KEXEC because Eric, author of above commit, met a
LKP report of build failure when posting patch of earlier version. Please
see below link to get detail of the LKP report:
https://lore.kernel.org/all/3e8eecd1-a277-2cfb-690e-5de2eb7b988e@oracle.com/T/#u
In fact, that LKP report is triggered because arm's <asm/kexec.h> is
wrapped in CONFIG_KEXEC ifdeffery scope. That is wrong. CONFIG_KEXEC
controls the enabling/disabling of kexec_load interface, but not kexec
feature. Removing the wrongly added CONFIG_KEXEC ifdeffery scope in
<asm/kexec.h> of arm allows us to drop the select KEXEC for CRASH_DUMP.
Meanwhile, change arch/arm/kernel/Makefile to let machine_kexec.o
relocate_kernel.o depend on KEXEC_CORE.
Link: https://lkml.kernel.org/r/20231128054457.659452-1-bhe@redhat.com
Fixes: 89cde455915f ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com> [compile-time only]
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Eric DeVolder <eric_devolder@yahoo.com>
Tested-by: Eric DeVolder <eric_devolder@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently the 32-bit arm PMU drivers use the pmu_hw_events::lock spinlock in
their arm_pmu::{start,stop,enable,disable}() callbacks to protect hardware
state and event data.
This locking is not necessary as the perf core code already provides mutual
exclusion, disabling interrupts to serialize against the IRQ handler, and
using perf_event_context::lock to protect against concurrent modifications of
events cross-cpu.
The locking was removed from the arm64 (now PMUv3) PMU driver in commit:
2a0e2a02e4b7 ("arm64: perf: Remove PMU locking")
... and the same reasoning applies to all the 32-bit PMU drivers.
Remove the locking from the 32-bit PMU drivers.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20231115092805.737822-2-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
While there, use struct_size() helper, instead of the open-coded
version, to calculate the size for the allocation of the whole
flexible structure, including of course, the flexible-array member.
This code was found with the help of Coccinelle, and audited and
fixed manually.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/ZSVHurzo/4aFQcT3@work
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty and serial updates from Greg KH:
"Here is the big set of tty/serial driver changes for 6.7-rc1. Included
in here are:
- console/vgacon cleanups and removals from Arnd
- tty core and n_tty cleanups from Jiri
- lots of 8250 driver updates and cleanups
- sc16is7xx serial driver updates
- dt binding updates
- first set of port lock wrapers from Thomas for the printk fixes
coming in future releases
- other small serial and tty core cleanups and updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (193 commits)
serdev: Replace custom code with device_match_acpi_handle()
serdev: Simplify devm_serdev_device_open() function
serdev: Make use of device_set_node()
tty: n_gsm: add copyright Siemens Mobility GmbH
tty: n_gsm: fix race condition in status line change on dead connections
serial: core: Fix runtime PM handling for pending tx
vgacon: fix mips/sibyte build regression
dt-bindings: serial: drop unsupported samsung bindings
tty: serial: samsung: drop earlycon support for unsupported platforms
tty: 8250: Add note for PX-835
tty: 8250: Fix IS-200 PCI ID comment
tty: 8250: Add Brainboxes Oxford Semiconductor-based quirks
tty: 8250: Add support for Intashield IX cards
tty: 8250: Add support for additional Brainboxes PX cards
tty: 8250: Fix up PX-803/PX-857
tty: 8250: Fix port count of PX-257
tty: 8250: Add support for Intashield IS-100
tty: 8250: Add support for Brainboxes UP cards
tty: 8250: Add support for additional Brainboxes UC cards
tty: 8250: Remove UC-257 and UC-431
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"As usual, lots of singleton and doubleton patches all over the tree
and there's little I can say which isn't in the individual changelogs.
The lengthier patch series are
- 'kdump: use generic functions to simplify crashkernel reservation
in arch', from Baoquan He. This is mainly cleanups and
consolidation of the 'crashkernel=' kernel parameter handling
- After much discussion, David Laight's 'minmax: Relax type checks in
min() and max()' is here. Hopefully reduces some typecasting and
the use of min_t() and max_t()
- A group of patches from Oleg Nesterov which clean up and slightly
fix our handling of reads from /proc/PID/task/... and which remove
task_struct.thread_group"
* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
scripts/gdb/vmalloc: disable on no-MMU
scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
.mailmap: add address mapping for Tomeu Vizoso
mailmap: update email address for Claudiu Beznea
tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
.mailmap: map Benjamin Poirier's address
scripts/gdb: add lx_current support for riscv
ocfs2: fix a spelling typo in comment
proc: test ProtectionKey in proc-empty-vm test
proc: fix proc-empty-vm test with vsyscall
fs/proc/base.c: remove unneeded semicolon
do_io_accounting: use sig->stats_lock
do_io_accounting: use __for_each_thread()
ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
ocfs2: fix a typo in a comment
scripts/show_delta: add __main__ judgement before main code
treewide: mark stuff as __ro_after_init
fs: ocfs2: check status values
proc: test /proc/${pid}/statm
compiler.h: move __is_constexpr() to compiler.h
...
|
|
Pull ARM updates from Russell King:
- fix some kernel-doc warnings
- fix stack depot IRQ stack filter
- cast memset() byte to unsigned char
- explicitly include correct DI includes
- fix ARCH_LOW_ADDRESS_LIMIT with CONFIG_ZONE_DMA
- fix get_user() problems when linker uses a veneer
- make including linux/uaccess.h self-contained on ARM
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9326/1: make <linux/uaccess.h> self-contained for ARM
ARM: 9324/1: fix get_user() broken with veneer
ARM: 9323/1: mm: Fix ARCH_LOW_ADDRESS_LIMIT when CONFIG_ZONE_DMA
ARM: 9322/1: Explicitly include correct DT includes
ARM: 9321/1: memset: cast the constant byte to unsigned char
ARM: 9320/1: fix stack depot IRQ stack filter
ARM: 9319/1: sa1111: fix sa1111_probe kernel-doc warnings
|
|
Separating the VGA console screen_info from the EFI one unfortunately
caused a build failure for footbridge that I had never caught
with randconfig builds:
arch/arm/kernel/setup.c:932:27: error: static declaration of 'vgacon_screen_info' follows non-static declaration
932 | static struct screen_info vgacon_screen_info = {
| ^~~~~~~~~~~~~~~~~~
In file included from arch/arm/kernel/setup.c:44:
arch/arm/include/asm/setup.h:40:27: note: previous declaration of 'vgacon_screen_info' with type 'struct screen_info'
40 | extern struct screen_info vgacon_screen_info;
| ^~~~~~~~~~~~~~~~~~
arm-linux-gnueabi-ld: drivers/video/console/dummycon.o: in function `dummycon_init':
dummycon.c:(.text+0xe4): undefined reference to `screen_info'
Make sure the variable is global to avoid the conflict with the extern
declaration, and make it work in dummycon.c
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231017093947.3627976-2-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
To prepare for completely separating the VGA console screen_info from
the one used in EFI/sysfb, rename the vgacon instances and make them
local as much as possible.
ia64 and arm both have confurations with vgacon and efi, but the contents
never overlaps because ia64 has no EFI framebuffer, and arm only has
vga console on legacy platforms without EFI. Renaming these is required
before the EFI screen_info can be moved into drivers/firmware.
The ia64 vga console is actually registered in two places from
setup_arch(), but one of them is wrong, so drop the one in pcdp.c and
fix the one in setup.c to use the correct conditional.
x86 has to keep them together, as the boot protocol is used to switch
between VGA text console and framebuffer through the screen_info data.
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-7-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The vga console driver is fairly self-contained, and only used by
architectures that explicitly initialize the screen_info settings.
Chance every instance that picks the vga console by setting conswitchp
to call a function instead, and pass a reference to the screen_info
there.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Khalid Azzi <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-6-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The dummycon default console size used to be determined by architecture,
but now this is a Kconfig setting on everything except ARM. Tracing this
back in the historic git trees, this was used to match the size of VGA
console or VGA framebuffer on early machines, but nowadays that code is
no longer used, except probably on the old footbridge/netwinder since
that is the only one that supports vgacon.
On machines with a framebuffer, booting with DT so far results in always
using the hardcoded 80x30 size in dummycon, while on ATAGS the setting
can come from a bootloader specific override. Both seem to be worse
choices than the Kconfig setting, since the actual text size for fbcon
also depends on the selected font.
Make this work the same way as everywhere else and use the normal
Kconfig setting, except for the footbridge with vgacon, which keeps
using the traditional code. If vgacon is disabled, footbridge can
also ignore the setting. This means the screen_info only has to be
provided when either vgacon or EFI are enabled now.
To limit the amount of surprises on Arm, change the Kconfig default
to the previously used 80x30 setting instead of the usual 80x25.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-4-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)
Removed the sentinel as well as the explicit size from ctl_isa_vars. The
size is redundant as the initialization sets it. Changed
insn_emulation->sysctl from a 2 element array of struct ctl_table to a
simple struct. This has no consequence for the sysctl registration as it
is forwarded as a pointer. Removed sentinel from sve_defatul_vl_table,
sme_default_vl_table, tagged_addr_sysctl_table and
armv8_pmu_sysctl_table.
This removal is safe because register_sysctl_sz and register_sysctl use
the array size in addition to checking for the sentinel.
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added. Make adjustments in
all call sites of parse_crashkernel() in arch.
Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pull ARM updates from Russell King:
- Refactor VFP code and convert to C code (Ard Biesheuvel)
- Fix hardware breakpoint single-stepping using bpf_overflow_handler
- Make SMP stop calls asynchronous allowing panic from irq context to
work
- Fix for kernel-doc warnings for locomo
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
Revert part of ae1f8d793a19 ("ARM: 9304/1: add prototype for function called only from asm")
ARM: 9318/1: locomo: move kernel-doc to prevent warnings
ARM: 9317/1: kexec: Make smp stop calls asynchronous
ARM: 9316/1: hw_breakpoint: fix single-stepping when using bpf_overflow_handler
ARM: entry: Make asm coproc dispatch code NWFPE only
ARM: iwmmxt: Use undef hook to enable coprocessor for task
ARM: entry: Disregard Thumb undef exception in coproc dispatch
ARM: vfp: Use undef hook for handling VFP exceptions
ARM: kernel: Get rid of thread_info::used_cp[] array
ARM: vfp: Reimplement VFP exception entry in C code
ARM: vfp: Remove workaround for Feroceon CPUs
ARM: vfp: Record VFP bounces as perf emulation faults
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 shadow stack support from Dave Hansen:
"This is the long awaited x86 shadow stack support, part of Intel's
Control-flow Enforcement Technology (CET).
CET consists of two related security features: shadow stacks and
indirect branch tracking. This series implements just the shadow stack
part of this feature, and just for userspace.
The main use case for shadow stack is providing protection against
return oriented programming attacks. It works by maintaining a
secondary (shadow) stack using a special memory type that has
protections against modification. When executing a CALL instruction,
the processor pushes the return address to both the normal stack and
to the special permission shadow stack. Upon RET, the processor pops
the shadow stack copy and compares it to the normal stack copy.
For more information, refer to the links below for the earlier
versions of this patch set"
Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/
* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
x86/shstk: Change order of __user in type
x86/ibt: Convert IBT selftest to asm
x86/shstk: Don't retry vm_munmap() on -EINTR
x86/kbuild: Fix Documentation/ reference
x86/shstk: Move arch detail comment out of core mm
x86/shstk: Add ARCH_SHSTK_STATUS
x86/shstk: Add ARCH_SHSTK_UNLOCK
x86: Add PTRACE interface for shadow stack
selftests/x86: Add shadow stack test
x86/cpufeatures: Enable CET CR4 bit for shadow stack
x86/shstk: Wire in shadow stack interface
x86: Expose thread features in /proc/$PID/status
x86/shstk: Support WRSS for userspace
x86/shstk: Introduce map_shadow_stack syscall
x86/shstk: Check that signal frame is shadow stack mem
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Handle signals for shadow stack
x86/shstk: Introduce routines modifying shstk
x86/shstk: Handle thread shadow stack
x86/shstk: Add user-mode shadow stack support
...
|
|
Pull drm updates from Dave Airlie:
"The drm core grew a new generic gpu virtual address manager, and new
execution locking helpers. These are used by nouveau now to provide
uAPI support for the userspace Vulkan driver. AMD had a bunch of new
IP core support, loads of refactoring around fbdev, but mostly just
the usual amount of stuff across the board.
core:
- fix gfp flags in drmm_kmalloc
gpuva:
- add new generic GPU VA manager (for nouveau initially)
syncobj:
- add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl
dma-buf:
- acquire resv lock for mmap() in exporters
- support dma-buf self import automatically
- docs fixes
backlight:
- fix fbdev interactions
atomic:
- improve logging
prime:
- remove struct gem_prim_mmap plus driver updates
gem:
- drm_exec: add locking over multiple GEM objects
- fix lockdep checking
fbdev:
- make fbdev userspace interfaces optional
- use linux device instead of fbdev device
- use deferred i/o helper macros in various drivers
- Make FB core selectable without drivers
- Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
- Add helper macros and Kconfig tokens for DMA-allocated framebuffer
ttm:
- support init_on_free
- swapout fixes
panel:
- panel-edp: Support AUO B116XAB01.4
- Support Visionox R66451 plus DT bindings
- ld9040:
- Backlight support
- magic improved
- Kconfig fix
- Convert to of_device_get_match_data()
- Fix Kconfig dependencies
- simple:
- Set bpc value to fix warning
- Set connector type for AUO T215HVN01
- Support Innolux G156HCE-L01 plus DT bindings
- ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
- startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
- sitronix-st7789v:
- Support Inanbo T28CP45TN89 plus DT bindings
- Support EDT ET028013DMA plus DT bindings
- Various cleanups
- edp: Add timings for N140HCA-EAC
- Allow panels and touchscreens to power sequence together
- Fix Innolux G156HCE-L01 LVDS clock
bridge:
- debugfs for chains support
- dw-hdmi:
- Improve support for YUV420 bus format
- CEC suspend/resume
- update EDID on HDMI detect
- dw-mipi-dsi: Fix enable/disable of DSI controller
- lt9611uxc: Use MODULE_FIRMWARE()
- ps8640: Remove broken EDID code
- samsung-dsim: Fix command transfer
- tc358764:
- Handle HS/VS polarity
- Use BIT() macro
- Various cleanups
- adv7511: Fix low refresh rate
- anx7625:
- Switch to macros instead of hardcoded values
- locking fixes
- tc358767: fix hardware delays
- sitronix-st7789v:
- Support panel orientation
- Support rotation property
- Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings
amdgpu:
- SDMA 6.1.0 support
- HDP 6.1 support
- SMUIO 14.0 support
- PSP 14.0 support
- IH 6.1 support
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SMU 13.x fixes
- PSP 13.x fixes
- SubVP fixes
- GC 9.4.3 fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes
- initial freesync panel replay support
- revert zpos properly until igt regression is fixeed
- use TTM to manage doorbell BAR
- Expose both current and average power via hwmon if supported
amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups fof gfx11
- SVM fixes
- Convert older APUs to use dGPU path like newer APUs
- Drop IOMMUv2 path as it is no longer used
- TBA fix for aldebaran
i915:
- ICL+ DSI modeset sequence
- HDCP improvements
- MTL display fixes and cleanups
- HSW/BDW PSR1 restored
- Init DDI ports in VBT order
- General display refactors
- Start using plane scale factor for relative data rate
- Use shmem for dpt objects
- Expose RPS thresholds in sysfs
- Apply GuC SLPC min frequency softlimit correctly
- Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
- Fix a VMA UAF for multi-gt platform
- Do not use stolen on MTL due to HW bug
- Check HuC and GuC version compatibility on MTL
- avoid infinite GPU waits due to premature release of request memory
- Fixes and updates for GSC memory allocation
- Display SDVO fixes
- Take stolen handling out of FBC code
- Make i915_coherent_map_type GT-centric
- Simplify shmem_create_from_object map_type
msm:
- SM6125 MDSS support
- DPU: SM6125 DPU support
- DSI: runtime PM support, burst mode support
- DSI PHY: SM6125 support in 14nm DSI PHY driver
- GPU: prepare for a7xx
- fix a690 firmware
- disable relocs on a6xx and newer
radeon:
- Lots of checkpatch cleanups
ast:
- improve device-model detection
- Represent BMV as virtual connector
- Report DP connection status
nouveau:
- add new exec/bind interface to support Vulkan
- document some getparam ioctls
- improve VRAM detection
- various fixes/cleanups
- workraound DPCD issues
ivpu:
- MMU updates
- debugfs support
- Support vpu4
virtio:
- add sync object support
atmel-hlcdc:
- Support inverted pixclock polarity
etnaviv:
- runtime PM cleanups
- hang handling fixes
exynos:
- use fbdev DMA helpers
- fix possible NULL ptr dereference
komeda:
- always attach encoder
omapdrm:
- use fbdev DMA helpers
ingenic:
- kconfig regmap fixes
loongson:
- support display controller
mediatek:
- Small mtk-dpi cleanups
- DisplayPort: support eDP and aux-bus
- Fix coverity issues
- Fix potential memory leak if vmap() fail
mgag200:
- minor fixes
mxsfb:
- support disabling overlay planes
panfrost:
- fix sync in IRQ handling
ssd130x:
- Support per-controller default resolution plus DT bindings
- Reduce memory-allocation overhead
- Improve intermediate buffer size computation
- Fix allocation of temporary buffers
- Fix pitch computation
- Fix shadow plane allocation
tegra:
- use fbdev DMA helpers
- Convert to devm_platform_ioremap_resource()
- support bridge/connector
- enable PM
tidss:
- Support TI AM625 plus DT bindings
- Implement new connector model plus driver updates
vkms:
- improve write back support
- docs fixes
- support gamma LUT
zynqmp-dpsub:
- misc fixes"
* tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits)
drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()
drm/tests/drm_kunit_helpers: Place correct function name in the comment header
drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly
drm/nouveau: uvmm: fix unset region pointer on remap
drm/nouveau: sched: avoid job races between entities
drm/i915: Fix HPD polling, reenabling the output poll work as needed
drm: Add an HPD poll helper to reschedule the poll work
drm/i915: Fix TLB-Invalidation seqno store
drm/ttm/tests: Fix type conversion in ttm_pool_test
drm/msm/a6xx: Bail out early if setting GPU OOB fails
drm/msm/a6xx: Move LLC accessors to the common header
drm/msm/a6xx: Introduce a6xx_llc_read
drm/ttm/tests: Require MMU when testing
drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock
Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
drm/amdgpu: Add memory vendor information
drm/amd: flush any delayed gfxoff on suspend entry
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
drm/amdgpu: Remove gfxoff check in GFX v9.4.3
drm/amd/pm: Update pci link speed for smu v13.0.6
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull modules updates from Luis Chamberlain:
"Summary of the changes worth highlighting from most interesting to
boring below:
- Christoph Hellwig's symbol_get() fix to Nvidia's efforts to
circumvent the protection he put in place in year 2020 to prevent
proprietary modules from using GPL only symbols, and also ensuring
proprietary modules which export symbols grandfather their taint.
That was done through year 2020 commit 262e6ae7081d ("modules:
inherit TAINT_PROPRIETARY_MODULE"). Christoph's new fix is done by
clarifing __symbol_get() was only ever intended to prevent module
reference loops by Linux kernel modules and so making it only find
symbols exported via EXPORT_SYMBOL_GPL(). The circumvention tactic
used by Nvidia was to use symbol_get() to purposely swift through
proprietary module symbols and completely bypass our traditional
EXPORT_SYMBOL*() annotations and community agreed upon
restrictions.
A small set of preamble patches fix up a few symbols which just
needed adjusting for this on two modules, the rtc ds1685 and the
networking enetc module. Two other modules just needed some build
fixing and removal of use of __symbol_get() as they can't ever be
modular, as was done by Arnd on the ARM pxa module and Christoph
did on the mmc au1xmmc driver.
This is a good reminder to us that symbol_get() is just a hack to
address things which should be fixed through Kconfig at build time
as was done in the later patches, and so ultimately it should just
go.
- Extremely late minor fix for old module layout 055f23b74b20
("module: check for exit sections in layout_sections() instead of
module_init_section()") by James Morse for arm64. Note that this
layout thing is old, it is *not* Song Liu's commit ac3b43283923
("module: replace module_layout with module_memory"). The issue
however is very odd to run into and so there was no hurry to get
this in fast.
- Although the fix did not go through the modules tree I'd like to
highlight the fix by Peter Zijlstra in commit 54097309620e
("x86/static_call: Fix __static_call_fixup()") now merged in your
tree which came out of what was originally suspected to be a
fallout of the the newer module layout changes by Song Liu commit
ac3b43283923 ("module: replace module_layout with module_memory")
instead of module_init_section()"). Thanks to the report by
Christian Bricart and the debugging by Song Liu & Peter that turned
to be noted as a kernel regression in place since v5.19 through
commit ee88d363d156 ("x86,static_call: Use alternative RET
encoding").
I highlight this to reflect and clarify that we haven't seen more
fallout from ac3b43283923 ("module: replace module_layout with
module_memory").
- RISC-V toolchain got mapping symbol support which prefix symbols
with "$" to help with alignment considerations for disassembly.
This is used to differentiate between incompatible instruction
encodings when disassembling. RISC-V just matches what ARM/AARCH64
did for alignment considerations and Palmer Dabbelt extended
is_mapping_symbol() to accept these symbols for RISC-V. We already
had support for this for all architectures but it also checked for
the second character, the RISC-V check Dabbelt added was just for
the "$". After a bit of testing and fallout on linux-next and based
on feedback from Masahiro Yamada it was decided to simplify the
check and treat the first char "$" as unique for all architectures,
and so we no make is_mapping_symbol() for all archs if the symbol
starts with "$".
The most relevant commit for this for RISC-V on binutils was:
https://sourceware.org/pipermail/binutils/2021-July/117350.html
- A late fix by Andrea Righi (today) to make module zstd
decompression use vmalloc() instead of kmalloc() to account for
large compressed modules. I suspect we'll see similar things for
other decompression algorithms soon.
- samples/hw_breakpoint minor fixes by Rong Tao, Arnd Bergmann and
Chen Jiahao"
* tag 'modules-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module/decompress: use vmalloc() for zstd decompression workspace
kallsyms: Add more debug output for selftest
ARM: module: Use module_init_layout_section() to spot init sections
arm64: module: Use module_init_layout_section() to spot init sections
module: Expose module_init_layout_section()
modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff
net: enetc: use EXPORT_SYMBOL_GPL for enetc_phc_index
mmc: au1xmmc: force non-modular build and remove symbol_get usage
ARM: pxa: remove use of symbol_get()
samples/hw_breakpoint: mark sample_hbp as static
samples/hw_breakpoint: fix building without module unloading
samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'
modpost, kallsyms: Treat add '$'-prefixed symbols as mapping symbols
kernel: params: Remove unnecessary ‘0’ values from err
module: Ignore RISC-V mapping symbols too
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"I think we have a bit less than usual on the architecture side, but
that's somewhat balanced out by a large crop of perf/PMU driver
updates and extensions to our selftests.
CPU features and system registers:
- Advertise hinted conditional branch support (FEAT_HBC) to userspace
- Avoid false positive "SANITY CHECK" warning when xCR registers
differ outside of the length field
Documentation:
- Fix macro name typo in SME documentation
Entry code:
- Unmask exceptions earlier on the system call entry path
Memory management:
- Don't bother clearing PTE_RDONLY for dirty ptes in pte_wrprotect()
and pte_modify()
Perf and PMU drivers:
- Initial support for Coresight TRBE devices on ACPI systems (the
coresight driver changes will come later)
- Fix hw_breakpoint single-stepping when called from bpf
- Fixes for DDR PMU on i.MX8MP SoC
- Add NUMA-awareness to Hisilicon PCIe PMU driver
- Fix locking dependency issue in Arm DMC620 PMU driver
- Workaround Hisilicon erratum 162001900 in the SMMUv3 PMU driver
- Add support for Arm CMN-700 r3 parts to the CMN PMU driver
- Add support for recent Arm Cortex CPU PMUs
- Update Hisilicon PMU maintainers
Selftests:
- Add a bunch of new features to the hwcap test (JSCVT, PMULL, AES,
SHA1, etc)
- Fix SSVE test to leave streaming-mode after grabbing the signal
context
- Add new test for SVE vector-length changes with SME enabled
Miscellaneous:
- Allow compiler to warn on suspicious looking system register
expressions
- Work around SDEI firmware bug by aborting any running handlers on a
kernel crash
- Fix some harmless warnings when building with W=1
- Remove some unused function declarations
- Other minor fixes and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (62 commits)
drivers/perf: hisi: Update HiSilicon PMU maintainers
arm_pmu: acpi: Add a representative platform device for TRBE
arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
kselftest/arm64: Fix hwcaps selftest build
hw_breakpoint: fix single-stepping when using bpf_overflow_handler
arm64/sysreg: refactor deprecated strncpy
kselftest/arm64: add jscvt feature to hwcap test
kselftest/arm64: add pmull feature to hwcap test
kselftest/arm64: add AES feature check to hwcap test
kselftest/arm64: add SHA1 and related features to hwcap test
arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
kselftest/arm64: build BTI tests in output directory
perf/imx_ddr: don't enable counter0 if none of 4 counters are used
perf/imx_ddr: speed up overflow frequency of cycle
drivers/perf: hisi: Schedule perf session according to locality
kselftest/arm64: fix a memleak in zt_regs_run()
perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency
perf/smmuv3: Add MODULE_ALIAS for module auto loading
perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
...
|
|
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU. This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.
Let's extend the API to take a CPU ID to exclude instead of just a
boolean. This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.
Arguably, this new API also encourages safer behavior. Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.
[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:
# bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
Attaching 1 probe...
hit
hit
[...]
^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.
Signed-off-by: Tomislav Novak <tnovak@meta.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since commit 4e57a4ddf6b0 ("ARM: 9107/1: syscall: always store
thread_info->abi_syscall"), the seccomp selftests "syscall_errno"
and "syscall_faked" have been broken. Both seccomp and PTRACE depend
on using the special value of "-1" for skipping syscalls. This value
wasn't working because it was getting masked by __NR_SYSCALL_MASK in
both PTRACE_SET_SYSCALL and get_syscall_nr().
Explicitly test for -1 in PTRACE_SET_SYSCALL and get_syscall_nr(),
leaving it exposed when present, allowing tracers to skip syscalls
again.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 4e57a4ddf6b0 ("ARM: 9107/1: syscall: always store thread_info->abi_syscall")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230810195422.2304827-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|