Age | Commit message (Collapse) | Author |
|
Add a DEFINE_CLASS() for printbufs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helpers to avoid open coded loops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we don't finish journal replay we need to keep journal keys around
until the filesystem shuts down - otherwise e.g. -o norecovery, various
tools (dump, list) break, and eventually we'll be doing journal replay
in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We had a bug report where the errors from btree_node_check_topology()
don't seem to be getting printed; log_fsck_err() does some fancy
ratelimiting-type stuff that we don't want here.
Instead, just use bch2_count_fsck_err(); this is simpler, and modelled
after how we're currently handling bucket ref update errors in
buckets.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More self healing code: readdir will now notice if there are dirents
hashed incorrectly, and it'll repair them if errors=fix_safe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't track snapshot overwrites outside of fsck, so for this to be
called at runtime outside of fsck we need to create it on demand, when
we have repair to do.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Now uses bch2_get_snapshot_overwrites(), and much shorter.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
New helper for getting a list of snapshot IDs that have overwritten a
given key.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Recover from "journal and btree in same bucket".
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If snapshot deletion incorrectly missing some keys and leaves keys for
deleted snapshots, that causes a bit of a problem for data move - we
can't move an extent for a nonexistent snapshot, because the extent
might have to be fragmented, and maintaining correct visibility in child
snapshots doesn't work if it doesn't have a snapshot.
Previously we'd just skip these keys, but it turns out that causes
copygc to spin.
So we need runtime self healing, i.e. calling check_key_has_snapshot()
from the data move path.
Snapshot deletion v2 included sentinal values for deleted snapshot
nodes, so this is quite safe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
data_update_init() does need to do btree operations, delay doing the
unlock-before-io.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We'll be adding subtypes of these errors, and new error code tracing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of
creating a pte which addresses the first page in a folio and reduces
the amount of plumbing which architecture must implement to provide
this.
- "Misc folio patches for 6.16" from Matthew Wilcox is a shower of
largely unrelated folio infrastructure changes which clean things up
and better prepare us for future work.
- "memory,x86,acpi: hotplug memory alignment advisement" from Gregory
Price adds early-init code to prevent x86 from leaving physical
memory unused when physical address regions are not aligned to memory
block size.
- "mm/compaction: allow more aggressive proactive compaction" from
Michal Clapinski provides some tuning of the (sadly, hard-coded (more
sadly, not auto-tuned)) thresholds for our invokation of proactive
compaction. In a simple test case, the reduction of a guest VM's
memory consumption was dramatic.
- "Minor cleanups and improvements to swap freeing code" from Kemeng
Shi provides some code cleaups and a small efficiency improvement to
this part of our swap handling code.
- "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin
adds the ability for a ptracer to modify syscalls arguments. At this
time we can alter only "system call information that are used by
strace system call tampering, namely, syscall number, syscall
arguments, and syscall return value.
This series should have been incorporated into mm.git's "non-MM"
branch, but I goofed.
- "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from
Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl
against /proc/pid/pagemap. This permits CRIU to more efficiently get
at the info about guard regions.
- "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan
implements that fix. No runtime effect is expected because
validate_page_before_insert() happens to fix up this error.
- "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David
Hildenbrand basically brings uprobe text poking into the current
decade. Remove a bunch of hand-rolled implementation in favor of
using more current facilities.
- "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman
Khandual provides enhancements and generalizations to the pte dumping
code. This might be needed when 128-bit Page Table Descriptors are
enabled for ARM.
- "Always call constructor for kernel page tables" from Kevin Brodsky
ensures that the ctor/dtor is always called for kernel pgtables, as
it already is for user pgtables.
This permits the addition of more functionality such as "insert hooks
to protect page tables". This change does result in various
architectures performing unnecesary work, but this is fixed up where
it is anticipated to occur.
- "Rust support for mm_struct, vm_area_struct, and mmap" from Alice
Ryhl adds plumbing to permit Rust access to core MM structures.
- "fix incorrectly disallowed anonymous VMA merges" from Lorenzo
Stoakes takes advantage of some VMA merging opportunities which we've
been missing for 15 years.
- "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from
SeongJae Park optimizes process_madvise()'s TLB flushing.
Instead of flushing each address range in the provided iovec, we
batch the flushing across all the iovec entries. The syscall's cost
was approximately halved with a microbenchmark which was designed to
load this particular operation.
- "Track node vacancy to reduce worst case allocation counts" from
Sidhartha Kumar makes the maple tree smarter about its node
preallocation.
stress-ng mmap performance increased by single-digit percentages and
the amount of unnecessarily preallocated memory was dramaticelly
reduced.
- "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes
a few unnecessary things which Baoquan noted when reading the code.
- ""Enhance sysfs handling for memory hotplug in weighted interleave"
from Rakie Kim "enhances the weighted interleave policy in the memory
management subsystem by improving sysfs handling, fixing memory
leaks, and introducing dynamic sysfs updates for memory hotplug
support". Fixes things on error paths which we are unlikely to hit.
- "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory"
from SeongJae Park introduces new DAMOS quota goal metrics which
eliminate the manual tuning which is required when utilizing DAMON
for memory tiering.
- "mm/vmalloc.c: code cleanup and improvements" from Baoquan He
provides cleanups and small efficiency improvements which Baoquan
found via code inspection.
- "vmscan: enforce mems_effective during demotion" from Gregory Price
changes reclaim to respect cpuset.mems_effective during demotion when
possible. because presently, reclaim explicitly ignores
cpuset.mems_effective when demoting, which may cause the cpuset
settings to violated.
This is useful for isolating workloads on a multi-tenant system from
certain classes of memory more consistently.
- "Clean up split_huge_pmd_locked() and remove unnecessary folio
pointers" from Gavin Guo provides minor cleanups and efficiency gains
in in the huge page splitting and migrating code.
- "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache
for `struct mem_cgroup', yielding improved memory utilization.
- "add max arg to swappiness in memory.reclaim and lru_gen" from
Zhongkun He adds a new "max" argument to the "swappiness=" argument
for memory.reclaim MGLRU's lru_gen.
This directs proactive reclaim to reclaim from only anon folios
rather than file-backed folios.
- "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the
first step on the path to permitting the kernel to maintain existing
VMs while replacing the host kernel via file-based kexec. At this
time only memblock's reserve_mem is preserved.
- "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides
and uses a smarter way of looping over a pfn range. By skipping
ranges of invalid pfns.
- "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning
when a task is pinned a single NUMA mode.
Dramatic performance benefits were seen in some real world cases.
- "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank
Garg addresses a warning which occurs during memory compaction when
using JFS.
- "move all VMA allocation, freeing and duplication logic to mm" from
Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more
appropriate mm/vma.c.
- "mm, swap: clean up swap cache mapping helper" from Kairui Song
provides code consolidation and cleanups related to the folio_index()
function.
- "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that.
- "memcg: Fix test_memcg_min/low test failures" from Waiman Long
addresses some bogus failures which are being reported by the
test_memcontrol selftest.
- "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo
Stoakes commences the deprecation of file_operations.mmap() in favor
of the new file_operations.mmap_prepare().
The latter is more restrictive and prevents drivers from messing with
things in ways which, amongst other problems, may defeat VMA merging.
- "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples
the per-cpu memcg charge cache from the objcg's one.
This is a step along the way to making memcg and objcg charging
NMI-safe, which is a BPF requirement.
- "mm/damon: minor fixups and improvements for code, tests, and
documents" from SeongJae Park is yet another batch of miscellaneous
DAMON changes. Fix and improve minor problems in code, tests and
documents.
- "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg
stats to be irq safe. Another step along the way to making memcg
charging and stats updates NMI-safe, a BPF requirement.
- "Let unmap_hugepage_range() and several related functions take folio
instead of page" from Fan Ni provides folio conversions in the
hugetlb code.
* tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits)
mm: pcp: increase pcp->free_count threshold to trigger free_high
mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range()
mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page
mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page
mm/hugetlb: pass folio instead of page to unmap_ref_private()
memcg: objcg stock trylock without irq disabling
memcg: no stock lock for cpu hot-unplug
memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
memcg: make count_memcg_events re-entrant safe against irqs
memcg: make mod_memcg_state re-entrant safe against irqs
memcg: move preempt disable to callers of memcg_rstat_updated
memcg: memcg_rstat_updated re-entrant safe against irqs
mm: khugepaged: decouple SHMEM and file folios' collapse
selftests/eventfd: correct test name and improve messages
alloc_tag: check mem_profiling_support in alloc_tag_init
Docs/damon: update titles and brief introductions to explain DAMOS
selftests/damon/_damon_sysfs: read tried regions directories in order
mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject()
mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat()
mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs
...
|
|
Explain the restrictions imposed on ublk servers in two cases:
1. When UBLK_F_PER_IO_DAEMON is set (current ublk_drv)
2. When UBLK_F_PER_IO_DAEMON is not set (legacy)
Remove most references to per-queue daemons, as the new
UBLK_F_PER_IO_DAEMON feature renders that concept obsolete.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-9-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a new test_stress_06 for the per io daemons feature. This is just a
copy of test_stress_01 with the per_io_tasks flag added, with varying
amounts of nthreads. This test is able to reproduce a panic which was
caught manually during development [1]; in the current version of this
patch set, it passes.
Note that this commit also makes all stress tests using the
run_io_and_remove helper more stressful by additionally exercising the
batch submit (queue_rqs) path.
[1] https://lore.kernel.org/linux-block/aDgwGoGCEpwd1mFY@fedora/
Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-8-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a new test test_generic_12 which:
- sets up a ublk server with per_io_tasks and a different number of ublk
server threads and ublk_queues. This is possible now that these
objects are decoupled
- runs some I/O load from a single CPU
- verifies that all the ublk server threads handle some I/O
Before this changeset, this test fails, since I/O issued from one CPU is
always handled by the one ublk server thread. After this changeset, the
test passes.
In the future, the last check above may be strengthened to "verify that
all ublk server threads handle the same amount of I/O." However, this
requires some adjustments/bugfixes to tag allocation, so this work is
postponed to a followup.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-7-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add support in kublk for decoupled ublk_queues and ublk server threads.
kublk now has two modes of operation:
- (preexisting mode) threads and queues are paired 1:1, and each thread
services all the I/Os of one queue
- (new mode) thread and queue counts are independently configurable.
threads service I/Os in a way that balances load across threads even
if load is not balanced over queues.
The default is the preexisting mode. The new mode is activated by
passing the --per_io_tasks flag.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-6-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Towards the goal of decoupling ublk_queues from ublk server threads,
move resources/data that should be per-thread rather than per-queue out
of ublk_queue and into a new struct ublk_thread.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-5-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, each ublk server I/O handler thread initializes its own
queue. However, as we move towards decoupled ublk_queues and ublk server
threads, this model does not make sense anymore, as there will no longer
be a concept of a thread having "its own" queue. So lift queue
initialization out of the per-thread ublk_io_handler_fn and into a loop
in ublk_start_daemon (which runs once for each device).
There is a part of ublk_queue_init (ring initialization) which does
actually need to happen on the thread that will use the ring; that is
separated into a separate ublk_thread_init which is still called by each
I/O handler thread.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-4-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We currently have a helper ublk_queue_alloc_sqes which the ublk targets
use to allocate SQEs for their own operations. However, as we move
towards decoupled ublk_queues and ublk server threads, this helper does
not make sense anymore. SQEs are allocated from rings, and we will have
one ring per thread to avoid locking. Change the SQE allocation helper
to ublk_io_alloc_sqes. Currently this still allocates SQEs from the io's
queue's ring, but when we fully decouple threads and queues, it will
allocate from the io's thread's ring instead.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-3-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, when we process CQEs, we know which ublk_queue we are working
on because we know which ring we are working on, and ublk_queues and
rings are in 1:1 correspondence. However, as we decouple ublk_queues
from ublk server threads, ublk_queues and rings will no longer be in 1:1
correspondence - each ublk server thread will have a ring, and each
thread may issue commands against more than one ublk_queue. So in order
to know which ublk_queue a CQE refers to, plumb that information in the
associated SQE's user_data.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-2-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, ublk_drv associates to each hardware queue (hctx) a unique
task (called the queue's ubq_daemon) which is allowed to issue
COMMIT_AND_FETCH commands against the hctx. If any other task attempts
to do so, the command fails immediately with EINVAL. When considered
together with the block layer architecture, the result is that for each
CPU C on the system, there is a unique ublk server thread which is
allowed to handle I/O submitted on CPU C. This can lead to suboptimal
performance under imbalanced load generation. For an extreme example,
suppose all the load is generated on CPUs mapping to a single ublk
server thread. Then that thread may be fully utilized and become the
bottleneck in the system, while other ublk server threads are totally
idle.
This issue can also be addressed directly in the ublk server without
kernel support by having threads dequeue I/Os and pass them around to
ensure even load. But this solution requires inter-thread communication
at least twice for each I/O (submission and completion), which is
generally a bad pattern for performance. The problem gets even worse
with zero copy, as more inter-thread communication would be required to
have the buffer register/unregister calls to come from the correct
thread.
Therefore, address this issue in ublk_drv by allowing each I/O to have
its own daemon task. Two I/Os in the same queue are now allowed to be
serviced by different daemon tasks - this was not possible before.
Imbalanced load can then be balanced across all ublk server threads by
having the ublk server threads issue FETCH_REQs in a round-robin manner.
As a small toy example, consider a system with a single ublk device
having 2 queues, each of depth 4. A ublk server having 4 threads could
issue its FETCH_REQs against this device as follows (where each entry is
the qid,tag pair that the FETCH_REQ targets):
ublk server thread: T0 T1 T2 T3
0,0 0,1 0,2 0,3
1,3 1,0 1,1 1,2
This setup allows for load that is concentrated on one hctx/ublk_queue
to be spread out across all ublk server threads, alleviating the issue
described above.
Add the new UBLK_F_PER_IO_DAEMON feature to ublk_drv, which ublk servers
can use to essentially test for the presence of this change and tailor
their behavior accordingly.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-1-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"Many small but important fixes for special cases in the fbdev, fbcon
and vgacon code which were found with Syzkaller, Svace and other tools
by various people and teams (e.g. Linux Verification Center).
Some smaller code cleanups in the nvidiafb, arkfb, atyfb and viafb
drivers and two spelling fixes"
* tag 'fbdev-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: Fix fb_set_var to prevent null-ptr-deref in fb_videomode_to_var
fbdev: Fix do_register_framebuffer to prevent null-ptr-deref in fb_videomode_to_var
fbdev: sstfb.rst: Fix spelling mistake
fbdev: core: fbcvt: avoid division by 0 in fb_cvt_hperiod()
fbcon: Make sure modelist not set on unregistered console
vgacon: Add check for vc_origin address range in vgacon_scroll()
fbdev: arkfb: Cast ics5342_init() allocation type
fbdev: nvidiafb: Correct const string length in nvidiafb_setup()
fbdev: atyfb: Remove unused PCI vendor ID
fbdev: carminefb: Fix spelling mistake of CARMINE_TOTAL_DIPLAY_MEM
fbdev: via: use new GPIO line value setter callbacks
|
|
The newly added anon_inode_test test fails to build due to attempting to
include a nonexisting overlayfs/wrapper.h:
anon_inode_test.c:10:10: fatal error: overlayfs/wrappers.h: No such file or directory
10 | #include "overlayfs/wrappers.h"
| ^~~~~~~~~~~~~~~~~~~~~~
This is due to 0bd92b9fe538 ("selftests/filesystems: move wrapper.h out
of overlayfs subdir") which was added in the vfs-6.16.selftests branch
which was based on -rc5 and did not contain the newly added test so once
things were merged into mainline the build started failing - both parent
commits are fine.
Fixes: 3e406741b1989 ("Merge tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull compiler version requirement update from Arnd Bergmann:
"Require gcc-8 and binutils-2.30
x86 already uses gcc-8 as the minimum version, this changes all other
architectures to the same version. gcc-8 is used is Debian 10 and Red
Hat Enterprise Linux 8, both of which are still supported, and
binutils 2.30 is the oldest corresponding version on those.
Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as
the system compiler but additionally include toolchains that remain
supported.
With the new minimum toolchain versions, a number of workarounds for
older versions can be dropped, in particular on x86_64 and arm64.
Importantly, the updated compiler version allows removing two of the
five remaining gcc plugins, as support for sancov and structeak
features is already included in modern compiler versions.
I tried collecting the known changes that are possible based on the
new toolchain version, but expect that more cleanups will be possible.
Since this touches multiple architectures, I merged the patches
through the asm-generic tree."
* tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV
Documentation: update binutils-2.30 version reference
gcc-plugins: remove SANCOV gcc plugin
Kbuild: remove structleak gcc plugin
arm64: drop binutils version checks
raid6: skip avx512 checks
kbuild: require gcc-8 and binutils-2.30
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull sophgo SoC devicetree updates from Arnd Bergmann:
"The Sophgo SG2044 SoC is their second generation server chip with 64
cores, following the SG2042.
In addition, there are minor updates for the cv180x SoCs"
* tag 'soc-newsoc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
riscv: dts: sophgo: switch precise compatible for existed clock device for CV18XX
riscv: dts: sophgo: Add initial device tree of Sophgo SRD3-10
dt-bindings: riscv: sophgo: Add SG2044 compatible string
dt-bindings: interrupt-controller: Add Sophgo SG2044 PLIC
dt-bindings: interrupt-controller: Add Sophgo SG2044 CLINT mswi
riscv: dts: sopgho: use SOC_PERIPHERAL_IRQ to calculate interrupt number
riscv: dts: sophgo: rename header file cv18xx.dtsi to cv180x.dtsi
riscv: dts: sophgo: Move riscv cpu definition to a separate file
riscv: dts: sophgo: Move all soc specific device into soc dtsi file
riscv: sophgo: dts: Add spi controller for SG2042
riscv: dts: sophgo: sg2042: add pinctrl support
|
|
Pull SoC devicetree updates from Arnd Bergmann:
"There are 11 newly supported SoCs, but these are all either new
variants of existing designs, or straight reuses of the existing chip
in a new package:
- RK3562 is a new chip based on the old Cortex-A53 core, apparently a
low-cost version of the Cortex-A55 based RK3568/RK3566.
- NXP i.MX94 is a minor variation of i.MX93/i.MX95 with a different
set of on-chip peripherals.
- Renesas RZ/V2N (R9A09G056) is a new member of the larger RZ/V2
family
- Amlogic S6/S7/S7D
- Samsung Exynos7870 is an older chip similar to Exynos7885
- WonderMedia wm8950 is a minor variation on the wm8850 chip
- Amlogic s805y is almost idential to s805x
- Allwinner A523 is similar to A527 and T527
- Qualcomm MSM8926 is a variant of MSM8226
- Qualcomm Snapdragon X1P42100 is related to R1E80100
There are also 65 boards, including reference designs for the chips
above, this includes
- 12 new boards based on TI K3 series chips, most of them from
Toradex
- 10 devices using Rockchips RK35xx and PX30 chips
- 2 phones and 2 laptops based on Qualcomm Snapdragon designs
- 10 NXP i.MX8/i.MX9 boards, mostly for embedded/industrial uses
- 3 Samsung Galaxy phones based on Exynos7870
- 5 Allwinner based boards using a variety of ARMv8 chips
- 9 32-bit machines, each based on a different SoC family
Aside from the new hardware, there is the usual set of cleanups and
newly added hardware support on existing machines, for a total of 965
devicetree changesets"
* tag 'soc-dt-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (956 commits)
MAINTAINERS, mailmap: update Sven Peter's email address
arm64: dts: renesas: rzg3e-smarc-som: Reduce I2C2 clock frequency
arm64: dts: nuvoton: Add pinctrl
ARM: dts: samsung: sp5v210-aries: Align wifi node name with bindings
arm64: dts: blaize-blzp1600: Enable GPIO support
dt-bindings: clock: socfpga: convert to yaml
arm64: dts: rockchip: move rk3562 pinctrl node outside the soc node
arm64: dts: rockchip: fix rk3562 pcie unit addresses
arm64: dts: rockchip: move rk3528 pinctrl node outside the soc node
arm64: dts: rockchip: remove a double-empty line from rk3576 core dtsi
arm64: dts: rockchip: move rk3576 pinctrl node outside the soc node
arm64: dts: rockchip: fix rk3576 pcie unit addresses
arm64: dts: rockchip: Drop assigned-clock* from cpu nodes on rk3588
arm64: dts: rockchip: Add missing SFC power-domains to rk3576
Revert "arm64: dts: mediatek: mt8390-genio-common: Add firmware-name for scp0"
arm64: dts: mediatek: mt8188: Address binding warnings for MDP3 nodes
arm64: dts: mt6359: Rename RTC node to match binding expectations
arm64: dts: mt8365-evk: Add goodix touchscreen support
arm64: dts: mediatek: mt8188: Add missing #reset-cells property
arm64: dts: airoha: en7581: Add PCIe nodes to EN7581 SoC evaluation board
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
"The usual defconfig updates enable configuration options for drivers
that got added. A few SoC specific options are enabled in Kconfig
files instead, in place of the defconfig files"
* tag 'soc-defconfig-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: defconfig: enable ACPM protocol and Exynos mailbox
arm64: defconfig: Enable configs for MediaTek Genio EVK boards
arm64: defconfig: mediatek: enable PHY drivers
arm64: defconfig: Enable Rockchip SAI and ES8328
arm64: defconfig: Add Toradex Embedded Controller config
arm64: defconfig: Enable TPIC2810 GPIO expander
riscv: defconfig: spacemit: enable clock controller driver for SpacemiT K1
arm64: defconfig: Enable TMP102 as module
arm64: defconfig: Enable hwspinlock and eQEP for K3
arm64: defconfig: Add CDNS_DSI and CDNS_PHY config
riscv: defconfig: spacemit: enable gpio support for K1 SoC
arm64: defconfig: Enable IPQ5424 RDP466 base configs
riscv: Enable PM_GENERIC_DOMAINS for T-Head SoCs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC updates from Arnd Bergmann:
"The main update in size is the removal of the TI DaVinci DA830 SoC
support. DA830 is similar to DA850, which remain supported, but only
the reference board was ever supported, and we removed that one 3
years ago as it had never been converted to devicetree.
There are some other cleanups for OMAP4 and a few boards using old
GPIO interfaces"
* tag 'soc-arm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: s3c: stop including gpio.h
ARM: dts: davinci: da850-evm: Increase fifo threshold
ARM: OMAP2+: Fix l4ls clk domain handling in STANDBY
ARM: broadcom: MAINTAINERS: Cover bcm2712 files
bus: ti-sysc: PRUSS OCP configuration
ARM: davinci: remove support for da830
ARM: omap: pmic-cpcap: do not mess around without CPCAP or OMAP4
ARM: omap2plus_defconfig: enable I2C devices of GTA04
ARM: s3c/gpio: use new line value setter callbacks
ARM: scoop/gpio: use new line value setter callbacks
ARM: sa1100/gpio: use new line value setter callbacks
ARM: orion/gpio: use new line value setter callbacks
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC driver updates from Arnd Bergmann:
"Updates are across the usual driver subsystems with SoC specific
drivers:
- added soc specicific drivers for sophgo cv1800 and sg2044, qualcomm
sm8750, and amlogic c3 and s4 chips.
- cache controller updates for sifive chips, plus binding changes for
other cache descriptions.
- memory controller drivers for mediatek mt6893, stm32 and cleanups
for a few more drivers
- reset controller drivers for T-Head TH1502, Sophgo sg2044 and
Renesas RZ/V2H(P)
- SCMI firmware updates to better deal with buggy firmware, plus
better support for Qualcomm X1E and NXP i.MX specific interfaces
- a new platform driver for the crypto firmware on Cznic Turris
Omnia/MOX
- cleanups for the TEE firmware subsystem and amdtee driver
- minor updates and fixes for freescale/nxp, qualcomm, google,
aspeed, wondermedia, ti, nxp, renesas, hisilicon, mediatek,
broadcom and samsung SoCs"
* tag 'soc-drivers-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (133 commits)
soc: aspeed: Add NULL check in aspeed_lpc_enable_snoop()
soc: aspeed: lpc: Fix impossible judgment condition
ARM: aspeed: Don't select SRAM
docs: firmware: qcom_scm: Fix kernel-doc warning
soc: fsl: qe: Consolidate chained IRQ handler install/remove
firmware: qcom: scm: Allow QSEECOM for HP EliteBook Ultra G1q
dt-bindings: mfd: qcom,tcsr: Add compatible for ipq5018
dt-bindings: cache: add QiLai compatible to ax45mp
memory: stm32_omm: Fix error handling in stm32_omm_disable_child()
dt-bindings: cache: Convert marvell,tauros2-cache to DT schema
dt-bindings: cache: Convert marvell,{feroceon,kirkwood}-cache to DT schema
soc: samsung: exynos-pmu: enable CPU hotplug support for gs101
MAINTAINERS: Add google,gs101-pmu-intr-gen.yaml binding file
dt-bindings: soc: samsung: exynos-pmu: gs101: add google,pmu-intr-gen phandle
dt-bindings: soc: google: Add gs101-pmu-intr-gen binding documentation
bus: fsl-mc: Use strscpy() instead of strscpy_pad()
soc: fsl: qbman: Remove const from portal->cgrs allocation type
bus: fsl_mc: Fix driver_managed_dma check
bus: fsl-mc: increase MC_CMD_COMPLETION_TIMEOUT_MS value
bus: fsl-mc: drop useless cleanup
...
|
|
Change back printk format to 0x%08lx instead of %#08lx, since the latter
does not seem to reliably format the value to 8 hex chars.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.18+
Fixes: e5e9e7f222e5b ("parisc/unaligned: Enhance user-space visible output")
|
|
This reverts commit e436576b0231542f6f233279f0972989232575a8.
That commit is very broken, and seems to have missed the fact that
CONFIG_ARM_SMMU_V3 is not just a yes-or-no thing, but also can be
modular.
So it caused build errors on arm64 allmodconfig setups:
ERROR: modpost: "arm_smmu_make_cdtable_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined!
ERROR: modpost: "arm_smmu_make_s2_domain_ste" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined!
ERROR: modpost: "arm_smmu_make_s1_cd" [drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-test.ko] undefined!
...
(and six more symbols just the same).
Link: https://lore.kernel.org/all/CAHk-=wh4qRwm7AQ8sBmQj7qECzgAhj4r73RtCDfmHo5SdcN0Jw@mail.gmail.com/
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Rolf Eike Beer <eb@emlix.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Dropping symbols also meant the callchain maps wasn't populated, but
the callchain map is needed to find the DSO.
Plumb the symbols option better, falling back to thread__find_map()
rather than thread__find_symbol() when symbols are disabled.
Fixes: 02b2705017d2e5ad ("perf callchain: Allow symbols to be optional when resolving a callchain")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <martin.liska@hey.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Matt Fleming <matt@readmodwrite.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steve Clevenger <scclevenger@os.amperecomputing.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Cc: Zixian Cai <fzczx123@gmail.com>
Link: https://lore.kernel.org/r/20250529044000.759937-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Delaying kernel operations can be dangerous and the kernel may kill
(non-sleepable) BPF programs running for long in the future.
Limit the max delay to 10ms and update the document about it.
$ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true
lock delay is too long: 100000us (> 10ms)
Usage: perf lock contention [<options>]
-J, --inject-delay <TIME@FUNC>
Inject delays to specific locks
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If fb_add_videomode() in fb_set_var() fails to allocate memory for
fb_videomode, later it may lead to a null-ptr dereference in
fb_videomode_to_var(), as the fb_info is registered while not having the
mode in modelist that is expected to be there, i.e. the one that is
described in fb_info->var.
================================================================
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 30371 Comm: syz-executor.1 Not tainted 5.10.226-syzkaller #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:fb_videomode_to_var+0x24/0x610 drivers/video/fbdev/core/modedb.c:901
Call Trace:
display_to_var+0x3a/0x7c0 drivers/video/fbdev/core/fbcon.c:929
fbcon_resize+0x3e2/0x8f0 drivers/video/fbdev/core/fbcon.c:2071
resize_screen drivers/tty/vt/vt.c:1176 [inline]
vc_do_resize+0x53a/0x1170 drivers/tty/vt/vt.c:1263
fbcon_modechanged+0x3ac/0x6e0 drivers/video/fbdev/core/fbcon.c:2720
fbcon_update_vcs+0x43/0x60 drivers/video/fbdev/core/fbcon.c:2776
do_fb_ioctl+0x6d2/0x740 drivers/video/fbdev/core/fbmem.c:1128
fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1203
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:739
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x67/0xd1
================================================================
The reason is that fb_info->var is being modified in fb_set_var(), and
then fb_videomode_to_var() is called. If it fails to add the mode to
fb_info->modelist, fb_set_var() returns error, but does not restore the
old value of fb_info->var. Restore fb_info->var on failure the same way
it is done earlier in the function.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
fb_videomode_to_var
If fb_add_videomode() in do_register_framebuffer() fails to allocate
memory for fb_videomode, it will later lead to a null-ptr dereference in
fb_videomode_to_var(), as the fb_info is registered while not having the
mode in modelist that is expected to be there, i.e. the one that is
described in fb_info->var.
================================================================
general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 1 PID: 30371 Comm: syz-executor.1 Not tainted 5.10.226-syzkaller #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:fb_videomode_to_var+0x24/0x610 drivers/video/fbdev/core/modedb.c:901
Call Trace:
display_to_var+0x3a/0x7c0 drivers/video/fbdev/core/fbcon.c:929
fbcon_resize+0x3e2/0x8f0 drivers/video/fbdev/core/fbcon.c:2071
resize_screen drivers/tty/vt/vt.c:1176 [inline]
vc_do_resize+0x53a/0x1170 drivers/tty/vt/vt.c:1263
fbcon_modechanged+0x3ac/0x6e0 drivers/video/fbdev/core/fbcon.c:2720
fbcon_update_vcs+0x43/0x60 drivers/video/fbdev/core/fbcon.c:2776
do_fb_ioctl+0x6d2/0x740 drivers/video/fbdev/core/fbmem.c:1128
fb_ioctl+0xe7/0x150 drivers/video/fbdev/core/fbmem.c:1203
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:739
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x67/0xd1
================================================================
Even though fbcon_init() checks beforehand if fb_match_mode() in
var_to_display() fails, it can not prevent the panic because fbcon_init()
does not return error code. Considering this and the comment in the code
about fb_match_mode() returning NULL - "This should not happen" - it is
better to prevent registering the fb_info if its mode was not set
successfully. Also move fb_add_videomode() closer to the beginning of
do_register_framebuffer() to avoid having to do the cleanup on fail.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Fix spelling: "tweeks" to "tweaks"
Signed-off-by: Rujra Bhatt <braker.noob.kernel@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
In fb_find_mode_cvt(), iff mode->refresh somehow happens to be 0x80000000,
cvt.f_refresh will become 0 when multiplying it by 2 due to overflow. It's
then passed to fb_cvt_hperiod(), where it's used as a divider -- division
by 0 will result in kernel oops. Add a sanity check for cvt.f_refresh to
avoid such overflow...
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.
Fixes: 96fe6a2109db ("[PATCH] fbdev: Add VESA Coordinated Video Timings (CVT) support")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
It looks like attempting to write to the "store_modes" sysfs node will
run afoul of unregistered consoles:
UBSAN: array-index-out-of-bounds in drivers/video/fbdev/core/fbcon.c:122:28
index -1 is out of range for type 'fb_info *[32]'
...
fbcon_info_from_console+0x192/0x1a0 drivers/video/fbdev/core/fbcon.c:122
fbcon_new_modelist+0xbf/0x2d0 drivers/video/fbdev/core/fbcon.c:3048
fb_new_modelist+0x328/0x440 drivers/video/fbdev/core/fbmem.c:673
store_modes+0x1c9/0x3e0 drivers/video/fbdev/core/fbsysfs.c:113
dev_attr_store+0x55/0x80 drivers/base/core.c:2439
static struct fb_info *fbcon_registered_fb[FB_MAX];
...
static signed char con2fb_map[MAX_NR_CONSOLES];
...
static struct fb_info *fbcon_info_from_console(int console)
...
return fbcon_registered_fb[con2fb_map[console]];
If con2fb_map contains a -1 things go wrong here. Instead, return NULL,
as callers of fbcon_info_from_console() are trying to compare against
existing "info" pointers, so error handling should kick in correctly.
Reported-by: syzbot+a7d4444e7b6e743572f7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/679d0a8f.050a0220.163cdc.000c.GAE@google.com/
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Our in-house Syzkaller reported the following BUG (twice), which we
believed was the same issue with [1]:
==================================================================
BUG: KASAN: slab-out-of-bounds in vcs_scr_readw+0xc2/0xd0 drivers/tty/vt/vt.c:4740
Read of size 2 at addr ffff88800f5bef60 by task syz.7.2620/12393
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x72/0xa0 lib/dump_stack.c:106
print_address_description.constprop.0+0x6b/0x3d0 mm/kasan/report.c:364
print_report+0xba/0x280 mm/kasan/report.c:475
kasan_report+0xa9/0xe0 mm/kasan/report.c:588
vcs_scr_readw+0xc2/0xd0 drivers/tty/vt/vt.c:4740
vcs_write_buf_noattr drivers/tty/vt/vc_screen.c:493 [inline]
vcs_write+0x586/0x840 drivers/tty/vt/vc_screen.c:690
vfs_write+0x219/0x960 fs/read_write.c:584
ksys_write+0x12e/0x260 fs/read_write.c:639
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
...
</TASK>
Allocated by task 5614:
kasan_save_stack+0x20/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:374 [inline]
__kasan_kmalloc+0x8f/0xa0 mm/kasan/common.c:383
kasan_kmalloc include/linux/kasan.h:201 [inline]
__do_kmalloc_node mm/slab_common.c:1007 [inline]
__kmalloc+0x62/0x140 mm/slab_common.c:1020
kmalloc include/linux/slab.h:604 [inline]
kzalloc include/linux/slab.h:721 [inline]
vc_do_resize+0x235/0xf40 drivers/tty/vt/vt.c:1193
vgacon_adjust_height+0x2d4/0x350 drivers/video/console/vgacon.c:1007
vgacon_font_set+0x1f7/0x240 drivers/video/console/vgacon.c:1031
con_font_set drivers/tty/vt/vt.c:4628 [inline]
con_font_op+0x4da/0xa20 drivers/tty/vt/vt.c:4675
vt_k_ioctl+0xa10/0xb30 drivers/tty/vt/vt_ioctl.c:474
vt_ioctl+0x14c/0x1870 drivers/tty/vt/vt_ioctl.c:752
tty_ioctl+0x655/0x1510 drivers/tty/tty_io.c:2779
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:871 [inline]
__se_sys_ioctl+0x12d/0x190 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Last potentially related work creation:
kasan_save_stack+0x20/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0x94/0xa0 mm/kasan/generic.c:492
__call_rcu_common.constprop.0+0xc3/0xa10 kernel/rcu/tree.c:2713
netlink_release+0x620/0xc20 net/netlink/af_netlink.c:802
__sock_release+0xb5/0x270 net/socket.c:663
sock_close+0x1e/0x30 net/socket.c:1425
__fput+0x408/0xab0 fs/file_table.c:384
__fput_sync+0x4c/0x60 fs/file_table.c:465
__do_sys_close fs/open.c:1580 [inline]
__se_sys_close+0x68/0xd0 fs/open.c:1565
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x59/0x110 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x78/0xe2
Second to last potentially related work creation:
kasan_save_stack+0x20/0x40 mm/kasan/common.c:45
__kasan_record_aux_stack+0x94/0xa0 mm/kasan/generic.c:492
__call_rcu_common.constprop.0+0xc3/0xa10 kernel/rcu/tree.c:2713
netlink_release+0x620/0xc20 net/netlink/af_netlink.c:802
__sock_release+0xb5/0x270 net/socket.c:663
sock_close+0x1e/0x30 net/socket.c:1425
__fput+0x408/0xab0 fs/file_table.c:384
task_work_run+0x154/0x240 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:45 [inline]
do_exit+0x8e5/0x1320 kernel/exit.c:874
do_group_exit+0xcd/0x280 kernel/exit.c:1023
get_signal+0x1675/0x1850 kernel/signal.c:2905
arch_do_signal_or_restart+0x80/0x3b0 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x1b3/0x1e0 kernel/entry/common.c:218
do_syscall_64+0x66/0x110 arch/x86/entry/common.c:87
entry_SYSCALL_64_after_hwframe+0x78/0xe2
The buggy address belongs to the object at ffff88800f5be000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 2656 bytes to the right of
allocated 1280-byte region [ffff88800f5be000, ffff88800f5be500)
...
Memory state around the buggy address:
ffff88800f5bee00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800f5bee80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88800f5bef00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff88800f5bef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800f5bf000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
By analyzing the vmcore, we found that vc->vc_origin was somehow placed
one line prior to vc->vc_screenbuf when vc was in KD_TEXT mode, and
further writings to /dev/vcs caused out-of-bounds reads (and writes
right after) in vcs_write_buf_noattr().
Our further experiments show that in most cases, vc->vc_origin equals to
vga_vram_base when the console is in KD_TEXT mode, and it's around
vc->vc_screenbuf for the KD_GRAPHICS mode. But via triggerring a
TIOCL_SETVESABLANK ioctl beforehand, we can make vc->vc_origin be around
vc->vc_screenbuf while the console is in KD_TEXT mode, and then by
writing the special 'ESC M' control sequence to the tty certain times
(depends on the value of `vc->state.y - vc->vc_top`), we can eventually
move vc->vc_origin prior to vc->vc_screenbuf. Here's the PoC, tested on
QEMU:
```
int main() {
const int RI_NUM = 10; // should be greater than `vc->state.y - vc->vc_top`
int tty_fd, vcs_fd;
const char *tty_path = "/dev/tty0";
const char *vcs_path = "/dev/vcs";
const char escape_seq[] = "\x1bM"; // ESC + M
const char trigger_seq[] = "Let's trigger an OOB write.";
struct vt_sizes vt_size = { 70, 2 };
int blank = TIOCL_BLANKSCREEN;
tty_fd = open(tty_path, O_RDWR);
char vesa_mode[] = { TIOCL_SETVESABLANK, 1 };
ioctl(tty_fd, TIOCLINUX, vesa_mode);
ioctl(tty_fd, TIOCLINUX, &blank);
ioctl(tty_fd, VT_RESIZE, &vt_size);
for (int i = 0; i < RI_NUM; ++i)
write(tty_fd, escape_seq, sizeof(escape_seq) - 1);
vcs_fd = open(vcs_path, O_RDWR);
write(vcs_fd, trigger_seq, sizeof(trigger_seq));
close(vcs_fd);
close(tty_fd);
return 0;
}
```
To solve this problem, add an address range validation check in
vgacon_scroll(), ensuring vc->vc_origin never precedes vc_screenbuf.
Reported-by: syzbot+9c09fda97a1a65ea859b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9c09fda97a1a65ea859b [1]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Co-developed-by: Yi Yang <yiyang13@huawei.com>
Signed-off-by: Yi Yang <yiyang13@huawei.com>
Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
In preparation for making the kmalloc family of allocators type aware,
we need to make sure that the returned type from the allocation matches
the type of the variable being assigned. (Before, the allocator would
always return "void *", which can be implicitly cast to any pointer type.)
The assigned type is "struct dac_info *" but the returned type will be
"struct ics5342_info *", which has a larger allocation size. This is
by design, as struct ics5342_info contains struct dac_info as its first
member.
(patch slightly modified by Helge Deller)
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The actual length of const string "noaccel" is 7, but the strncmp()
branch in nvidiafb_setup() wrongly hard codes it as 6.
Fix by using actual length 7 as argument of the strncmp().
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The custom definition of PCI vendor ID in video/mach64.h is unused.
Remove it. Note, that the proper one is available in pci_ids.h.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
There is a spelling mistake in macro CARMINE_TOTAL_DIPLAY_MEM. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
The MSR offset calculations in intel_pmu_config_acr() are buggy.
To calculate fixed counter MSR addresses in intel_pmu_config_acr(),
the HW counter index "idx" is subtracted by INTEL_PMC_IDX_FIXED.
This leads to the ACR mask value of fixed counters to be incorrectly
saved to the positions of GP counters in acr_cfg_b[], e.g.
For fixed counter 0, its ACR counter mask should be saved to
acr_cfg_b[32], but it's saved to acr_cfg_b[0] incorrectly.
Fix this issue.
[ mingo: Clarified & improved the changelog. ]
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250529080236.2552247-2-dapeng1.mi@linux.intel.com
|
|
The following trace events are not used and defining them just wastes
memory:
x86_fpu_before_restore
x86_fpu_after_restore
x86_fpu_init_state
Simply remove them.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linux Trace Kernel <linux-trace-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home # background
Link: https://lore.kernel.org/r/20250529131024.7c2ef96f@gandalf.local.home # x86 submission
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer updates from Steven Rostedt:
- Allow the persistent ring buffer to be memory mapped
In the last merge window there was issues with the implementation of
mapping the persistent ring buffer because it was assumed that the
persistent memory was just physical memory without being part of the
kernel virtual address space. But this was incorrect and the
persistent ring buffer can be mapped the same way as the allocated
ring buffer is mapped.
The metadata for the persistent ring buffer is different than the
normal ring buffer and the organization of mapping it to user space
is a little different. Make the updates needed to the meta data to
allow the persistent ring buffer to be mapped to user space.
- Fix cpus_read_lock() with buffer->mutex and cpu_buffer->mapping_lock
Mapping the ring buffer to user space uses the
cpu_buffer->mapping_lock. The buffer->mutex can be taken when the
mapping_lock is held, giving the locking order of:
cpu_buffer->mapping_lock -->> buffer->mutex. But there also exists
the ordering:
buffer->mutex -->> cpus_read_lock()
mm->mmap_lock -->> cpu_buffer->mapping_lock
cpus_read_lock() -->> mm->mmap_lock
causing a circular chain of:
cpu_buffer->mapping_lock -> buffer->mutex -->> cpus_read_lock() -->>
mm->mmap_lock -->> cpu_buffer->mapping_lock
By moving the cpus_read_lock() outside the buffer->mutex where:
cpus_read_lock() -->> buffer->mutex, breaks the deadlock chain.
- Do not trigger WARN_ON() for commit overrun
When the ring buffer is user space mapped and there's a "commit
overrun" (where an interrupt preempted an event, and then added so
many events it filled the buffer having to drop events when it hit
the preempted event) a WARN_ON() was triggered if this was read via a
memory mapped buffer.
This is due to "missed events" being non zero when the reader page
ended up with the commit page. The idea was, if the writer is on the
reader page, there's only one page that has been written to and there
should be no missed events.
But if a commit overrun is done where the writer is off the commit
page and looped around to the commit page causing missed events, it
is possible that the reader page is the commit page with missed
events.
Instead of triggering a WARN_ON() when the reader page is the commit
page with missed events, trigger it when the reader page is the
tail_page with missed events. That's because the writer is always on
the tail_page if an event was interrupted (which holds the commit
event) and continues off the commit page.
- Reset the persistent buffer if it is fully consumed
On boot up, if the user fully consumes the last boot buffer of the
persistent buffer, if it reboots without enabling it, there will
still be events in the buffer which can cause confusion. Instead,
reset the buffer when it is fully consumed, so that the data is not
read again.
- Clean up some goto out jumps
There's a few cases that the code jumps to the "out:" label that
simply returns a value. There used to be more work done at those
labels but now that they simply return a value use a return instead
of jumping to a label.
- Use guard() to simplify some of the code
Add guard() around some locking instead of jumping to a label to do
the unlocking.
- Use free() to simplify some of the code
Use free(kfree) on variables that will get freed on error and use
return_ptr() to return the variable when its not freed. There's one
instance where free(kfree) simplifies the code on a temp variable
that was allocated just for the function use.
* tag 'trace-ringbuffer-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Simplify functions with __free(kfree) to free allocations
ring-buffer: Make ring_buffer_{un}map() simpler with guard(mutex)
ring-buffer: Simplify ring_buffer_read_page() with guard()
ring-buffer: Simplify reset_disabled_cpu_buffer() with use of guard()
ring-buffer: Remove jump to out label in ring_buffer_swap_cpu()
ring-buffer: Removed unnecessary if() goto out where out is the next line
tracing: Reset last-boot buffers when reading out all cpu buffers
ring-buffer: Allow reserve_mem persistent ring buffers to be mmapped
ring-buffer: Do not trigger WARN_ON() due to a commit_overrun
ring-buffer: Move cpus_read_lock() outside of buffer->mutex
|