summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-14ksmbd: Fix dangling pointer in krb_authenticateSean Heelan
krb_authenticate frees sess->user and does not set the pointer to NULL. It calls ksmbd_krb5_authenticate to reinitialise sess->user but that function may return without doing so. If that happens then smb2_sess_setup, which calls krb_authenticate, will be accessing free'd memory when it later uses sess->user. Cc: stable@vger.kernel.org Signed-off-by: Sean Heelan <seanheelan@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-14net: openvswitch: fix nested key length validation in the set() actionIlya Maximets
It's not safe to access nla_len(ovs_key) if the data is smaller than the netlink header. Check that the attribute is OK first. Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.") Reported-by: syzbot+b07a9da40df1576b8048@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b07a9da40df1576b8048 Tested-by: syzbot+b07a9da40df1576b8048@syzkaller.appspotmail.com Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/20250412104052.2073688-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== igc: Fix PTM timeout Christopher S M Hall says: There have been sporadic reports of PTM timeouts using i225/i226 devices These timeouts have been root caused to: 1) Manipulating the PTM status register while PTM is enabled and triggered 2) The hardware retrying too quickly when an inappropriate response is received from the upstream device The issue can be reproduced with the following: $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to quickly reproduce the issue. PHC2SYS exits with: "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction fails The first patch in this series also resolves an issue reported by Corinna Vinschen relating to kdump: This patch also fixes a hang in igc_probe() when loading the igc driver in the kdump kernel on systems supporting PTM. The igc driver running in the base kernel enables PTM trigger in igc_probe(). Therefore the driver is always in PTM trigger mode, except in brief periods when manually triggering a PTM cycle. When a crash occurs, the NIC is reset while PTM trigger is enabled. Due to a hardware problem, the NIC is subsequently in a bad busmaster state and doesn't handle register reads/writes. When running igc_probe() in the kdump kernel, the first register access to a NIC register hangs driver probing and ultimately breaks kdump. With this patch, igc has PTM trigger disabled most of the time, and the trigger is only enabled for very brief (10 - 100 us) periods when manually triggering a PTM cycle. Chances that a crash occurs during a PTM trigger are not zero, but extremly reduced. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: add lock preventing multiple simultaneous PTM transactions igc: cleanup PTP module if probe fails igc: handle the IGC_PTP_ENABLED flag correctly igc: move ktime snapshot into PTM retry loop igc: increase wait time before retrying PTM igc: fix PTM cycle trigger logic ==================== Link: https://patch.msgid.link/20250411162857.2754883-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14MAINTAINERS: update HUGETLB reviewersOscar Salvador
I have done quite some review on hugetlb code over the years, and some work on it as well, the latest being the hugetlb pagewalk unification which is a work in progress, and touches hugetlb code to great lengths. HugeTLB does not have many reviewers, so I would like to help out by offering myself as an official Reviewer. Signed-off-by: Oscar Salvador <osalvador@suse.de> Link: https://lkml.kernel.org/r/20250409082452.269180-1-osalvador@suse.de Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-14netlink: specs: ovs_vport: align with C codegen capabilitiesJakub Kicinski
We started generating C code for OvS a while back, but actually C codegen only supports fixed headers specified at the family level right now (schema also allows specifying them per op). ovs_flow and ovs_datapath already specify the fixed header at the family level but ovs_vport does it per op. Move the property, all ops use the same header. This ensures YNL C sees the correct hdr_len: const struct ynl_family ynl_ovs_vport_family = { .name = "ovs_vport", - .hdr_len = sizeof(struct genlmsghdr), + .hdr_len = sizeof(struct genlmsghdr) + sizeof(struct ovs_header), }; Fixes: 7c59c9c8f202 ("tools: ynl: generate code for ovs families") Link: https://patch.msgid.link/20250409145541.580674-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: don't mix device locking in dev_close_many() callsJakub Kicinski
Lockdep found the following dependency: &dev_instance_lock_key#3 --> &rdev->wiphy.mtx --> &net->xdp.lock --> &xs->mutex --> &dev_instance_lock_key#3 The first dependency is the problem. wiphy mutex should be outside the instance locks. The problem happens in notifiers (as always) for CLOSE. We only hold the instance lock for ops locked devices during CLOSE, and WiFi netdevs are not ops locked. Unfortunately, when we dev_close_many() during netns dismantle we may be holding the instance lock of _another_ netdev when issuing a CLOSE for a WiFi device. Lockdep's "Possible unsafe locking scenario" only prints 3 locks and we have 4, plus I think we'd need 3 CPUs, like this: CPU0 CPU1 CPU2 ---- ---- ---- lock(&xs->mutex); lock(&dev_instance_lock_key#3); lock(&rdev->wiphy.mtx); lock(&net->xdp.lock); lock(&xs->mutex); lock(&rdev->wiphy.mtx); lock(&dev_instance_lock_key#3); Tho, I don't think that's possible as CPU1 and CPU2 would be under rtnl_lock. Even if we have per-netns rtnl_lock and wiphy can span network namespaces - CPU0 and CPU1 must be in the same netns to see dev_instance_lock, so CPU0 can't be installing a socket as CPU1 is tearing the netns down. Regardless, our expected lock ordering is that wiphy lock is taken before instance locks, so let's fix this. Go over the ops locked and non-locked devices separately. Note that calling dev_close_many() on an empty list is perfectly fine. All processing (including RCU syncs) are conditional on the list not being empty, already. Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Reported-by: syzbot+6f588c78bf765b62b450@syzkaller.appspotmail.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250412233011.309762-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14gpiolib: Allow to use setters with return value for output-only gpiosMathieu Dubois-Briand
The gpiod_direction_output_raw_commit() function checks if any setter callback is present before doing anything. As the new GPIO setters with return values were introduced, make this check also succeed if one is present. Fixes: 98ce1eb1fd87 ("gpiolib: introduce gpio_chip setters that return values") Signed-off-by: Mathieu Dubois-Briand <mathieu.dubois-briand@bootlin.com> Link: https://lore.kernel.org/r/20250411-mdb-gpiolib-setters-fix-v2-1-9611280d8822@bootlin.com Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2025-04-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix hang in bnxt_re due to miscomputing the budget - Avoid a -Wformat-security message in dev_set_name() - Avoid an unused definition warning in fs.c with some kconfigs - Fix error handling in usnic and remove IS_ERR_OR_NULL() usage - Regression in RXE support foudn by blktests due to missing ODP exclusions - Set the dma_segment_size on HNS so it doesn't corrupt DMA when using very large IOs - Move a INIT_WORK to near when the work is allocated in cm.c to fix a racey crash where work in progress was being init'd - Use __GFP_NOWARN to not dump in kvcalloc() if userspace requests a very big MR * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Remove unusable nq variable RDMA/core: Silence oversized kvmalloc() warning RDMA/cma: Fix workqueue crash in cma_netevent_work_handler RDMA/hns: Fix wrong maximum DMA segment size RDMA/rxe: Fix null pointer dereference in ODP MR check RDMA/mlx5: Fix compilation warning when USER_ACCESS isn't set RDMA/usnic: Fix passing zero to PTR_ERR in usnic_ib_pci_probe() RDMA/ucaps: Avoid format-security warning RDMA/bnxt_re: Fix budget handling of notification queue
2025-04-14Merge tag 'vfs-6.15-rc3.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix NULL pointer dereference in virtiofs - Fix slab OOB access in hfs/hfsplus - Only create /proc/fs/netfs when CONFIG_PROC_FS is set - Fix getname_flags() to initialize pointer correctly - Convert dentry flags to enum - Don't allow datadir without lowerdir in overlayfs - Use namespace_{lock,unlock} helpers in dissolve_on_fput() instead of plain namespace_sem so unmounted mounts are properly cleaned up - Skip unnecessary ifs_block_is_uptodate check in iomap - Remove an unused forward declaration in overlayfs - Fix devpts uid/gid handling after converting to the new mount api - Fix afs_dynroot_readdir() to not use the RCU read lock - Fix mount_setattr() and open_tree_attr() to not pointlessly do path lookup or walk the mount tree if no mount option change has been requested * tag 'vfs-6.15-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: use namespace_{lock,unlock} in dissolve_on_fput() iomap: skip unnecessary ifs_block_is_uptodate check fs: Fix filename init after recent refactoring netfs: Only create /proc/fs/netfs with CONFIG_PROC_FS mount: ensure we don't pointlessly walk the mount tree dcache: convert dentry flag macros to enum afs: Fix afs_dynroot_readdir() to not use the RCU read lock hfs/hfsplus: fix slab-out-of-bounds in hfs_bnode_read_key virtiofs: add filesystem context source name check devpts: Fix type for uid and gid params ovl: remove unused forward declaration ovl: don't allow datadir only
2025-04-14Merge tag 'perf-tools-fixes-for-v6.15-2025-04-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "A couple of fixes and the usual tooling header updates: - fix a build error on ARM64 when libunwind is requested - fix an infinite loop with branch stack on AMD Zen3 - sync tooling headers with the kernel source" * tag 'perf-tools-fixes-for-v6.15-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf tools: Remove evsel__handle_error_quirks() perf libunwind arm64: Fix missing close parens in an if statement tools headers: Update the arch/x86/lib/memset_64.S copy with the kernel sources tools headers: Update the x86 headers with the kernel sources tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers: Update the uapi/asm-generic/mman-common.h copy with the kernel sources tools headers: Update the uapi/linux/prctl.h copy with the kernel sources tools headers: Update the syscall table with the kernel sources tools headers: Update the VFS headers with the kernel sources tools headers: Update the uapi/linux/perf_event.h copy with the kernel sources tools headers: Update the socket headers with the kernel sources tools headers: Update the KVM headers with the kernel sources
2025-04-14kunit: qemu_configs: SH: Respect kunit cmdlineThomas Weißschuh
The default SH kunit configuration sets CONFIG_CMDLINE_OVERWRITE which completely disregards the cmdline passed from the bootloader/QEMU in favor of the builtin CONFIG_CMDLINE. However the kunit tool needs to pass arguments to the in-kernel kunit core, for filters and other runtime parameters. Enable CONFIG_CMDLINE_EXTEND instead, so kunit arguments are respected. Link: https://lore.kernel.org/r/20250407-kunit-sh-v1-1-f5432a54cf2f@linutronix.de Fixes: 8110a3cab05e ("kunit: tool: Add support for SH under QEMU") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-04-14cpufreq: intel_pstate: Fix hwp_get_cpu_scaling()Rafael J. Wysocki
Commit b52aaeeadfac ("cpufreq: intel_pstate: Avoid SMP calls to get cpu-type") introduced two issues into hwp_get_cpu_scaling(). First, it made that function use the CPU type of the CPU running the code even though the target CPU is passed as the argument to it and second, it used smp_processor_id() for that even though hwp_get_cpu_scaling() runs in preemptible context. Fix both of these problems by simply passing "cpu" to cpu_data(). Fixes: b52aaeeadfac ("cpufreq: intel_pstate: Avoid SMP calls to get cpu-type") Link: https://lore.kernel.org/linux-pm/20250412103434.5321-1-xry111@xry111.site/ Reported-by: Xi Ruoyao <xry111@xry111.site> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/12659608.O9o76ZdvQC@rjwysocki.net
2025-04-14objtool/rust: add one more `noreturn` Rust function for Rust 1.86.0Miguel Ojeda
Starting with Rust 1.86.0 (see upstream commit b151b513ba2b ("Insert null checks for pointer dereferences when debug assertions are enabled") [1]), under some kernel configurations with `CONFIG_RUST_DEBUG_ASSERTIONS=y`, one may trigger a new `objtool` warning: rust/kernel.o: warning: objtool: _R..._6kernel9workqueue6system() falls through to next function _R...9workqueue14system_highpri() due to a call to the `noreturn` symbol: core::panicking::panic_null_pointer_dereference Thus add it to the list so that `objtool` knows it is actually `noreturn`. See commit 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions") for more details. Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Fixes: 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions") Link: https://github.com/rust-lang/rust/commit/b151b513ba2b65c7506ec1a80f2712bbd09154d1 [1] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250413002338.1741593-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-04-14vfio/pci: Virtualize zero INTx PIN if no pdev->irqAlex Williamson
Typically pdev->irq is consistent with whether the device itself supports INTx, where device support is reported via the PIN register. Therefore the PIN register is often already zero if pdev->irq is zero. Recently virtualization of the PIN register was expanded to include the case where the device supports INTx but the platform does not route the interrupt. This is reported by a value of IRQ_NOTCONNECTED on some architectures. Other architectures just report zero for pdev->irq. We already disallow INTx setup if pdev->irq is zero, therefore add this to the PIN register virtualization criteria so that a consistent view is provided to userspace through virtualized config space and ioctls. Reported-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://lore.kernel.org/all/174231895238.2295.12586708771396482526.stgit@linux.ibm.com/ Tested-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Link: https://lore.kernel.org/r/20250320194145.2816379-1-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-04-14block: fix resource leak in blk_register_queue() error pathZheng Qixing
When registering a queue fails after blk_mq_sysfs_register() is successful but the function later encounters an error, we need to clean up the blk_mq_sysfs resources. Add the missing blk_mq_sysfs_unregister() call in the error path to properly clean up these resources and prevent a memory leak. Fixes: 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism") Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250412092554.475218-1-zhengqixing@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-14block: add SPDX header line to blk-throttle.hBird, Tim
Add an SPDX license identifier line to blk-throttle.h Use 'GPL-2.0' as the identifier, since blk-throttle.c uses that, and blk.h (from which some material was copied when blk-throttle.h was created) also uses that identifier. Signed-off-by: Tim Bird <tim.bird@sony.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/MW5PR13MB5632EE4645BCA24ED111EC0EFDB62@MW5PR13MB5632.namprd13.prod.outlook.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-14rust: kasan/kbuild: fix missing flags on first buildMiguel Ojeda
If KASAN is enabled, and one runs in a clean repository e.g.: make LLVM=1 prepare make LLVM=1 prepare Then the Rust code gets rebuilt, which should not happen. The reason is some of the LLVM KASAN `rustc` flags are added in the second run: -Cllvm-args=-asan-instrumentation-with-call-threshold=10000 -Cllvm-args=-asan-stack=0 -Cllvm-args=-asan-globals=1 -Cllvm-args=-asan-kernel-mem-intrinsic-prefix=1 Further runs do not rebuild Rust because the flags do not change anymore. Rebuilding like that in the second run is bad, even if this just happens with KASAN enabled, but missing flags in the first one is even worse. The root issue is that we pass, for some architectures and for the moment, a generated `target.json` file. That file is not ready by the time `rustc` gets called for the flag test, and thus the flag test fails just because the file is not available, e.g.: $ ... --target=./scripts/target.json ... -Cllvm-args=... error: target file "./scripts/target.json" does not exist There are a few approaches we could take here to solve this. For instance, we could ensure that every time that the config is rebuilt, we regenerate the file and recompute the flags. Or we could use the LLVM version to check for these flags, instead of testing the flag (which may have other advantages, such as allowing us to detect renames on the LLVM side). However, it may be easier than that: `rustc` is aware of the `-Cllvm-args` regardless of the `--target` (e.g. I checked that the list printed is the same, plus that I can check for these flags even if I pass a completely unrelated target), and thus we can just eliminate the dependency completely. Thus filter out the target. This does mean that `rustc-option` cannot be used to test a flag that requires the right target, but we don't have other users yet, it is a minimal change and we want to get rid of custom targets in the future. We could only filter in the case `target.json` is used, to make it work in more cases, but then it would be harder to notice that it may not work in a couple architectures. Cc: Matthew Maurer <mmaurer@google.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: stable@vger.kernel.org Fixes: e3117404b411 ("kbuild: rust: Enable KASAN support") Tested-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250408220311.1033475-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-04-14rust: disable `clippy::needless_continue`Miguel Ojeda
Starting with Rust 1.86.0, Clippy's `needless_continue` lint complains about the last statement of a loop [1], including cases like: while ... { match ... { ... if ... => { ... return ...; } _ => continue, } } as well as nested `match`es in a loop. One solution is changing `continue` for `()` [2], but arguably using `continue` shows the intent better when it is alone in an arm like that. Moreover, I am not sure we want to force people to try to find other ways to write the code either, in cases when that applies. In addition, the help text does not really apply in the new cases the lint has introduced, e.g. here one cannot simply "drop" the expression: warning: this `continue` expression is redundant --> rust/macros/helpers.rs:85:18 | 85 | _ => continue, | ^^^^^^^^ | = help: consider dropping the `continue` expression = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_continue = note: requested on the command line with `-W clippy::needless-continue` The examples in the documentation do not show a case like this, either, so the second "help" line does not help. In addition, locally disabling the lint is not possible with `expect`, since the behavior differs across versions. Using `allow` would be possible, but, even then, an extra line just for this is a bit too much, especially if there are other ways to satisfy the lint. Finally, the lint is still in the "pedantic" category and disabled by default by Clippy. Thus disable the lint, at least for the time being. Feedback was submitted to upstream Clippy, in case this can be improved or perhaps the lint split into several [3]. Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust-clippy/pull/13891 [1] Link: https://lore.kernel.org/rust-for-linux/20250401221205.52381-1-ojeda@kernel.org/ [2] Link: https://github.com/rust-lang/rust-clippy/issues/14536 [3] Link: https://lore.kernel.org/r/20250403163805.67770-1-ojeda@kernel.org Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-04-14riscv: Provide all alternative macros all the timeAndrew Jones
We need to provide all six forms of the alternative macros (ALTERNATIVE, ALTERNATIVE_2, _ALTERNATIVE_CFG, _ALTERNATIVE_CFG_2, __ALTERNATIVE_CFG, __ALTERNATIVE_CFG_2) for all four cases derived from the two ifdefs (RISCV_ALTERNATIVE, __ASSEMBLY__) in order to ensure all configs can compile. Define this missing ones and ensure all are defined to consume all parameters passed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504130710.3IKz6Ibs-lkp@intel.com/ Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250414120947.135173-2-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14riscv: module: Allocate PLT entries for R_RISCV_PLT32Samuel Holland
apply_r_riscv_plt32_rela() may need to emit a PLT entry for the referenced symbol, so there must be space allocated in the PLT. Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250409171526.862481-2-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14riscv: module: Fix out-of-bounds relocation accessSamuel Holland
The current code allows rel[j] to access one element past the end of the relocation section. Simplify to num_relocations which is equivalent to the existing size expression. Fixes: 080c4324fa5e ("riscv: optimize ELF relocation function in riscv") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409171526.862481-1-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14riscv: Properly export reserved regions in /proc/iomemBjörn Töpel
The /proc/iomem represents the kernel's memory map. Regions marked with "Reserved" tells the user that the range should not be tampered with. Kexec-tools, when using the older kexec_load syscall relies on the "Reserved" regions to build the memory segments, that will be the target of the new kexec'd kernel. The RISC-V port tries to expose all reserved regions to userland, but some regions were not properly exposed: Regions that resided in both the "regular" and reserved memory block, e.g. the EFI Memory Map. A missing entry could result in reserved memory being overwritten. It turns out, that arm64, and loongarch had a similar issue a while back: commit d91680e687f4 ("arm64: Fix /proc/iomem for reserved but not memory regions") commit 50d7ba36b916 ("arm64: export memblock_reserve()d regions via /proc/iomem") Similar to the other ports, resolve the issue by splitting the regions in an arch initcall, since we need a working allocator. Fixes: ffe0e5261268 ("RISC-V: Improve init_resources()") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409182129.634415-1-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14riscv: Fix unaligned access info messagesAndrew Jones
Ensure we only print messages about command line parameters when the parameters are actually in use. Also complain about the use of the vector parameter when vector support isn't available. Fixes: aecb09e091dc ("riscv: Add parameter for skipping access speed tests") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Closes: https://lore.kernel.org/all/CAMuHMdVEp2_ho51gkpLLJG2HimqZ1gZ0fa=JA4uNNZjFFqaPMg@mail.gmail.com/ Closes: https://lore.kernel.org/all/CAMuHMdWVMP0MYCLFq+b7H_uz-2omdFiDDUZq0t_gw0L9rrJtkQ@mail.gmail.com/ Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250409153650.84433-2-ajones@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14spi: sun4i: add support for GPIO chip select linesMans Rullgard
Set use_gpio_descriptors to true so that GPIOs can be used for chip select in accordance with the DT binding. Signed-off-by: Mans Rullgard <mans@mansr.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20250410115303.5150-1-mans@mansr.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-14slab: ensure slab->obj_exts is clear in a newly allocated slab pageSuren Baghdasaryan
ktest recently reported crashes while running several buffered io tests with __alloc_tagging_slab_alloc_hook() at the top of the crash call stack. The signature indicates an invalid address dereference with low bits of slab->obj_exts being set. The bits were outside of the range used by page_memcg_data_flags and objext_flags and hence were not masked out by slab_obj_exts() when obtaining the pointer stored in slab->obj_exts. The typical crash log looks like this: 00510 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 00510 Mem abort info: 00510 ESR = 0x0000000096000045 00510 EC = 0x25: DABT (current EL), IL = 32 bits 00510 SET = 0, FnV = 0 00510 EA = 0, S1PTW = 0 00510 FSC = 0x05: level 1 translation fault 00510 Data abort info: 00510 ISV = 0, ISS = 0x00000045, ISS2 = 0x00000000 00510 CM = 0, WnR = 1, TnD = 0, TagAccess = 0 00510 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 00510 user pgtable: 4k pages, 39-bit VAs, pgdp=0000000104175000 00510 [0000000000000010] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 00510 Internal error: Oops: 0000000096000045 [#1] SMP 00510 Modules linked in: 00510 CPU: 10 UID: 0 PID: 7692 Comm: cat Not tainted 6.15.0-rc1-ktest-g189e17946605 #19327 NONE 00510 Hardware name: linux,dummy-virt (DT) 00510 pstate: 20001005 (nzCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00510 pc : __alloc_tagging_slab_alloc_hook+0xe0/0x190 00510 lr : __kmalloc_noprof+0x150/0x310 00510 sp : ffffff80c87df6c0 00510 x29: ffffff80c87df6c0 x28: 000000000013d1ff x27: 000000000013d200 00510 x26: ffffff80c87df9e0 x25: 0000000000000000 x24: 0000000000000001 00510 x23: ffffffc08041953c x22: 000000000000004c x21: ffffff80c0002180 00510 x20: fffffffec3120840 x19: ffffff80c4821000 x18: 0000000000000000 00510 x17: fffffffec3d02f00 x16: fffffffec3d02e00 x15: fffffffec3d00700 00510 x14: fffffffec3d00600 x13: 0000000000000200 x12: 0000000000000006 00510 x11: ffffffc080bb86c0 x10: 0000000000000000 x9 : ffffffc080201e58 00510 x8 : ffffff80c4821060 x7 : 0000000000000000 x6 : 0000000055555556 00510 x5 : 0000000000000001 x4 : 0000000000000010 x3 : 0000000000000060 00510 x2 : 0000000000000000 x1 : ffffffc080f50cf8 x0 : ffffff80d801d000 00510 Call trace: 00510 __alloc_tagging_slab_alloc_hook+0xe0/0x190 (P) 00510 __kmalloc_noprof+0x150/0x310 00510 __bch2_folio_create+0x5c/0xf8 00510 bch2_folio_create+0x2c/0x40 00510 bch2_readahead+0xc0/0x460 00510 read_pages+0x7c/0x230 00510 page_cache_ra_order+0x244/0x3a8 00510 page_cache_async_ra+0x124/0x170 00510 filemap_readahead.isra.0+0x58/0xa0 00510 filemap_get_pages+0x454/0x7b0 00510 filemap_read+0xdc/0x418 00510 bch2_read_iter+0x100/0x1b0 00510 vfs_read+0x214/0x300 00510 ksys_read+0x6c/0x108 00510 __arm64_sys_read+0x20/0x30 00510 invoke_syscall.constprop.0+0x54/0xe8 00510 do_el0_svc+0x44/0xc8 00510 el0_svc+0x18/0x58 00510 el0t_64_sync_handler+0x104/0x130 00510 el0t_64_sync+0x154/0x158 00510 Code: d5384100 f9401c01 b9401aa3 b40002e1 (f8227881) 00510 ---[ end trace 0000000000000000 ]--- 00510 Kernel panic - not syncing: Oops: Fatal exception 00510 SMP: stopping secondary CPUs 00510 Kernel Offset: disabled 00510 CPU features: 0x0000,000000e0,00000410,8240500b 00510 Memory Limit: none Investigation indicates that these bits are already set when we allocate slab page and are not zeroed out after allocation. We are not yet sure why these crashes start happening only recently but regardless of the reason, not initializing a field that gets used later is wrong. Fix it by initializing slab->obj_exts during slab page allocation. Fixes: 21c690a349ba ("mm: introduce slabobj_ext to support slab object extensions") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Tested-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250411155737.1360746-1-surenb@google.com Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-04-14xfs: compute buffer address correctly in xmbuf_map_backing_memDarrick J. Wong
Prior to commit e614a00117bc2d, xmbuf_map_backing_mem relied on folio_file_page to return the base page for the xmbuf's loff_t in the xfile, and set b_addr to the page_address of that base page. Now that folio_file_page has been removed from xmbuf_map_backing_mem, we always set b_addr to the folio_address of the folio. This is correct for the situation where the folio size matches the buffer size, but it's totally wrong if tmpfs uses large folios. We need to use offset_in_folio here. Found via xfs/801, which demonstrated evidence of corruption of an in-memory rmap btree block right after initializing an adjacent block. Fixes: e614a00117bc2d ("xfs: cleanup mapping tmpfs folios into the buffer cache") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-04-14xfs: add tunable threshold parameter for triggering zone GCHans Holmberg
Presently we start garbage collection late - when we start running out of free zones to backfill max_open_zones. This is a reasonable default as it minimizes write amplification. The longer we wait, the more blocks are invalidated and reclaim cost less in terms of blocks to relocate. Starting this late however introduces a risk of GC being outcompeted by user writes. If GC can't keep up, user writes will be forced to wait for free zones with high tail latencies as a result. This is not a problem under normal circumstances, but if fragmentation is bad and user write pressure is high (multiple full-throttle writers) we will "bottom out" of free zones. To mitigate this, introduce a zonegc_low_space tunable that lets the user specify a percentage of how much of the unused space that GC should keep available for writing. A high value will reclaim more of the space occupied by unused blocks, creating a larger buffer against write bursts. This comes at a cost as write amplification is increased. To illustrate this using a sample workload, setting zonegc_low_space to 60% avoids high (500ms) max latencies while increasing write amplification by 15%. Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-04-14xfs: mark xfs_buf_free as might_sleep()Christoph Hellwig
xfs_buf_free can call vunmap, which can sleep. The vunmap path is an unlikely one, so add might_sleep to ensure calling xfs_buf_free from atomic context gets caught more easily. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-04-14xfs: remove the leftover xfs_{set,clear}_li_failed infrastructureChristoph Hellwig
Marking a log item as failed kept a buffer reference around for resubmission of inode and dquote items. For inode items commit 298f7bec503f3 ("xfs: pin inode backing buffer to the inode log item") started pinning the inode item buffers unconditionally and removed the need for this. Later commit acc8f8628c37 ("xfs: attach dquot buffer to dquot log item buffer") did the same for dquot items but didn't fully clean up the xfs_clear_li_failed side for them. Stop adding the extra pin for dquot items and remove the helpers. This happens to fix a call to xfs_buf_free with the AIL lock held, which would be incorrect for the unlikely case freeing the buffer ends up calling vfree. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-04-14genksyms: Handle typeof_unqual keyword and __seg_{fs,gs} qualifiersUros Bizjak
Handle typeof_unqual, __typeof_unqual and __typeof_unqual__ keywords using TYPEOF_KEYW token in the same way as typeof keyword. Also ignore x86 __seg_fs and __seg_gs named address space qualifiers using X86_SEG_KEYW token in the same way as const, volatile or restrict qualifiers. Fixes: ac053946f5c4 ("compiler.h: introduce TYPEOF_UNQUAL() macro") Closes: https://lore.kernel.org/lkml/81a25a60-de78-43fb-b56a-131151e1c035@molgen.mpg.de/ Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Link: https://lore.kernel.org/r/20250413220749.270704-1-ubizjak@gmail.com
2025-04-13bcachefs: btree_root_unreadable_and_scan_found_nothing now AUTOFIXKent Overstreet
This will likely mean that the btree had only one node - there was nothing or almost nothing in it, and we should reconstruct and continue. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-13Revert "smb: client: fix TCP timers deadlock after rmmod"Kuniyuki Iwashima
This reverts commit e9f2517a3e18a54a3943c098d2226b245d488801. Commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") is intended to fix a null-ptr-deref in LOCKDEP, which is mentioned as CVE-2024-54680, but is actually did not fix anything; The issue can be reproduced on top of it. [0] Also, it reverted the change by commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") and introduced a real issue by reviving the kernel TCP socket. When a reconnect happens for a CIFS connection, the socket state transitions to FIN_WAIT_1. Then, inet_csk_clear_xmit_timers_sync() in tcp_close() stops all timers for the socket. If an incoming FIN packet is lost, the socket will stay at FIN_WAIT_1 forever, and such sockets could be leaked up to net.ipv4.tcp_max_orphans. Usually, FIN can be retransmitted by the peer, but if the peer aborts the connection, the issue comes into reality. I warned about this privately by pointing out the exact report [1], but the bogus fix was finally merged. So, we should not stop the timers to finally kill the connection on our side in that case, meaning we must not use a kernel socket for TCP whose sk->sk_net_refcnt is 0. The kernel socket does not have a reference to its netns to make it possible to tear down netns without cleaning up every resource in it. For example, tunnel devices use a UDP socket internally, but we can destroy netns without removing such devices and let it complete during exit. Otherwise, netns would be leaked when the last application died. However, this is problematic for TCP sockets because TCP has timers to close the connection gracefully even after the socket is close()d. The lifetime of the socket and its netns is different from the lifetime of the underlying connection. If the socket user does not maintain the netns lifetime, the timer could be fired after the socket is close()d and its netns is freed up, resulting in use-after-free. Actually, we have seen so many similar issues and converted such sockets to have a reference to netns. That's why I converted the CIFS client socket to have a reference to netns (sk->sk_net_refcnt == 1), which is somehow mentioned as out-of-scope of CIFS and technically wrong in e9f2517a3e18, but **is in-scope and right fix**. Regarding the LOCKDEP issue, we can prevent the module unload by bumping the module refcount when switching the LOCKDDEP key in sock_lock_init_class_and_name(). [2] For a while, let's revert the bogus fix. Note that now we can use sk_net_refcnt_upgrade() for the socket conversion, but I'll do so later separately to make backport easy. Link: https://lore.kernel.org/all/20250402020807.28583-1-kuniyu@amazon.com/ #[0] Link: https://lore.kernel.org/netdev/c08bd5378da647a2a4c16698125d180a@huawei.com/ #[1] Link: https://lore.kernel.org/lkml/20250402005841.19846-1-kuniyu@amazon.com/ #[2] Fixes: e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-13Revert "smb: client: Fix netns refcount imbalance causing leaks and ↵Kuniyuki Iwashima
use-after-free" This reverts commit 4e7f1644f2ac6d01dc584f6301c3b1d5aac4eaef. The commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") is not only a bogus fix for LOCKDEP null-ptr-deref but also introduces a real issue, TCP sockets leak, which will be explained in detail in the next revert. Also, CNA assigned CVE-2024-54680 to it but is rejecting it. [0] Thus, we are reverting the commit and its follow-up commit 4e7f1644f2ac ("smb: client: Fix netns refcount imbalance causing leaks and use-after-free"). Link: https://lore.kernel.org/all/2025040248-tummy-smilingly-4240@gregkh/ #[0] Fixes: 4e7f1644f2ac ("smb: client: Fix netns refcount imbalance causing leaks and use-after-free") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-13smb3 client: fix open hardlink on deferred close file errorChunjie Zhu
The following Python script results in unexpected behaviour when run on a CIFS filesystem against a Windows Server: # Create file fd = os.open('test', os.O_WRONLY|os.O_CREAT) os.write(fd, b'foo') os.close(fd) # Open and close the file to leave a pending deferred close fd = os.open('test', os.O_RDONLY|os.O_DIRECT) os.close(fd) # Try to open the file via a hard link os.link('test', 'new') newfd = os.open('new', os.O_RDONLY|os.O_DIRECT) The final open returns EINVAL due to the server returning STATUS_INVALID_PARAMETER. The root cause of this is that the client caches lease keys per inode, but the spec requires them to be related to the filename which causes problems when hard links are involved: From MS-SMB2 section 3.3.5.9.11: "The server MUST attempt to locate a Lease by performing a lookup in the LeaseTable.LeaseList using the LeaseKey in the SMB2_CREATE_REQUEST_LEASE_V2 as the lookup key. If a lease is found, Lease.FileDeleteOnClose is FALSE, and Lease.Filename does not match the file name for the incoming request, the request MUST be failed with STATUS_INVALID_PARAMETER" On client side, we first check the context of file open, if it hits above conditions, we first close all opening files which are belong to the same inode, then we do open the hard link file. Cc: stable@vger.kernel.org Signed-off-by: Chunjie Zhu <chunjie.zhu@cloud.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-13nfsd: decrease sc_count directly if fail to queue dl_recallLi Lingfeng
A deadlock warning occurred when invoking nfs4_put_stid following a failed dl_recall queue operation: T1 T2 nfs4_laundromat nfs4_get_client_reaplist nfs4_anylock_blockers __break_lease spin_lock // ctx->flc_lock spin_lock // clp->cl_lock nfs4_lockowner_has_blockers locks_owner_has_blockers spin_lock // flctx->flc_lock nfsd_break_deleg_cb nfsd_break_one_deleg nfs4_put_stid refcount_dec_and_lock spin_lock // clp->cl_lock When a file is opened, an nfs4_delegation is allocated with sc_count initialized to 1, and the file_lease holds a reference to the delegation. The file_lease is then associated with the file through kernel_setlease. The disassociation is performed in nfsd4_delegreturn via the following call chain: nfsd4_delegreturn --> destroy_delegation --> destroy_unhashed_deleg --> nfs4_unlock_deleg_lease --> kernel_setlease --> generic_delete_lease The corresponding sc_count reference will be released after this disassociation. Since nfsd_break_one_deleg executes while holding the flc_lock, the disassociation process becomes blocked when attempting to acquire flc_lock in generic_delete_lease. This means: 1) sc_count in nfsd_break_one_deleg will not be decremented to 0; 2) The nfs4_put_stid called by nfsd_break_one_deleg will not attempt to acquire cl_lock; 3) Consequently, no deadlock condition is created. Given that sc_count in nfsd_break_one_deleg remains non-zero, we can safely perform refcount_dec on sc_count directly. This approach effectively avoids triggering deadlock warnings. Fixes: 230ca758453c ("nfsd: put dl_stid if fail to queue dl_recall") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-04-13nfs: add missing selections of CONFIG_CRC32Eric Biggers
nfs.ko, nfsd.ko, and lockd.ko all use crc32_le(), which is available only when CONFIG_CRC32 is enabled. But the only NFS kconfig option that selected CONFIG_CRC32 was CONFIG_NFS_DEBUG, which is client-specific and did not actually guard the use of crc32_le() even on the client. The code worked around this bug by only actually calling crc32_le() when CONFIG_CRC32 is built-in, instead hard-coding '0' in other cases. This avoided randconfig build errors, and in real kernels the fallback code was unlikely to be reached since CONFIG_CRC32 is 'default y'. But, this really needs to just be done properly, especially now that I'm planning to update CONFIG_CRC32 to not be 'default y'. Therefore, make CONFIG_NFS_FS, CONFIG_NFSD, and CONFIG_LOCKD select CONFIG_CRC32. Then remove the fallback code that becomes unnecessary, as well as the selection of CONFIG_CRC32 from CONFIG_NFS_DEBUG. Fixes: 1264a2f053a3 ("NFS: refactor code for calculating the crc32 hash of a filehandle") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-04-13i2c: cros-ec-tunnel: defer probe if parent EC is not presentThadeu Lima de Souza Cascardo
When i2c-cros-ec-tunnel and the EC driver are built-in, the EC parent device will not be found, leading to NULL pointer dereference. That can also be reproduced by unbinding the controller driver and then loading i2c-cros-ec-tunnel module (or binding the device). [ 271.991245] BUG: kernel NULL pointer dereference, address: 0000000000000058 [ 271.998215] #PF: supervisor read access in kernel mode [ 272.003351] #PF: error_code(0x0000) - not-present page [ 272.008485] PGD 0 P4D 0 [ 272.011022] Oops: Oops: 0000 [#1] SMP NOPTI [ 272.015207] CPU: 0 UID: 0 PID: 3859 Comm: insmod Tainted: G S 6.15.0-rc1-00004-g44722359ed83 #30 PREEMPT(full) 3c7fb39a552e7d949de2ad921a7d6588d3a4fdc5 [ 272.030312] Tainted: [S]=CPU_OUT_OF_SPEC [ 272.034233] Hardware name: HP Berknip/Berknip, BIOS Google_Berknip.13434.356.0 05/17/2021 [ 272.042400] RIP: 0010:ec_i2c_probe+0x2b/0x1c0 [i2c_cros_ec_tunnel] [ 272.048577] Code: 1f 44 00 00 41 57 41 56 41 55 41 54 53 48 83 ec 10 65 48 8b 05 06 a0 6c e7 48 89 44 24 08 4c 8d 7f 10 48 8b 47 50 4c 8b 60 78 <49> 83 7c 24 58 00 0f 84 2f 01 00 00 48 89 fb be 30 06 00 00 4c 9 [ 272.067317] RSP: 0018:ffffa32082a03940 EFLAGS: 00010282 [ 272.072541] RAX: ffff969580b6a810 RBX: ffff969580b68c10 RCX: 0000000000000000 [ 272.079672] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff969580b68c00 [ 272.086804] RBP: 00000000fffffdfb R08: 0000000000000000 R09: 0000000000000000 [ 272.093936] R10: 0000000000000000 R11: ffffffffc0600000 R12: 0000000000000000 [ 272.101067] R13: ffffffffa666fbb8 R14: ffffffffc05b5528 R15: ffff969580b68c10 [ 272.108198] FS: 00007b930906fc40(0000) GS:ffff969603149000(0000) knlGS:0000000000000000 [ 272.116282] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.122024] CR2: 0000000000000058 CR3: 000000012631c000 CR4: 00000000003506f0 [ 272.129155] Call Trace: [ 272.131606] <TASK> [ 272.133709] ? acpi_dev_pm_attach+0xdd/0x110 [ 272.137985] platform_probe+0x69/0xa0 [ 272.141652] really_probe+0x152/0x310 [ 272.145318] __driver_probe_device+0x77/0x110 [ 272.149678] driver_probe_device+0x1e/0x190 [ 272.153864] __driver_attach+0x10b/0x1e0 [ 272.157790] ? driver_attach+0x20/0x20 [ 272.161542] bus_for_each_dev+0x107/0x150 [ 272.165553] bus_add_driver+0x15d/0x270 [ 272.169392] driver_register+0x65/0x110 [ 272.173232] ? cleanup_module+0xa80/0xa80 [i2c_cros_ec_tunnel 3a00532f3f4af4a9eade753f86b0f8dd4e4e5698] [ 272.182617] do_one_initcall+0x110/0x350 [ 272.186543] ? security_kernfs_init_security+0x49/0xd0 [ 272.191682] ? __kernfs_new_node+0x1b9/0x240 [ 272.195954] ? security_kernfs_init_security+0x49/0xd0 [ 272.201093] ? __kernfs_new_node+0x1b9/0x240 [ 272.205365] ? kernfs_link_sibling+0x105/0x130 [ 272.209810] ? kernfs_next_descendant_post+0x1c/0xa0 [ 272.214773] ? kernfs_activate+0x57/0x70 [ 272.218699] ? kernfs_add_one+0x118/0x160 [ 272.222710] ? __kernfs_create_file+0x71/0xa0 [ 272.227069] ? sysfs_add_bin_file_mode_ns+0xd6/0x110 [ 272.232033] ? internal_create_group+0x453/0x4a0 [ 272.236651] ? __vunmap_range_noflush+0x214/0x2d0 [ 272.241355] ? __free_frozen_pages+0x1dc/0x420 [ 272.245799] ? free_vmap_area_noflush+0x10a/0x1c0 [ 272.250505] ? load_module+0x1509/0x16f0 [ 272.254431] do_init_module+0x60/0x230 [ 272.258181] __se_sys_finit_module+0x27a/0x370 [ 272.262627] do_syscall_64+0x6a/0xf0 [ 272.266206] ? do_syscall_64+0x76/0xf0 [ 272.269956] ? irqentry_exit_to_user_mode+0x79/0x90 [ 272.274836] entry_SYSCALL_64_after_hwframe+0x55/0x5d [ 272.279887] RIP: 0033:0x7b9309168d39 [ 272.283466] Code: 5b 41 5c 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d af 40 0c 00 f7 d8 64 89 01 8 [ 272.302210] RSP: 002b:00007fff50f1a288 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 272.309774] RAX: ffffffffffffffda RBX: 000058bf9b50f6d0 RCX: 00007b9309168d39 [ 272.316905] RDX: 0000000000000000 RSI: 000058bf6c103a77 RDI: 0000000000000003 [ 272.324036] RBP: 00007fff50f1a2e0 R08: 00007fff50f19218 R09: 0000000021ec4150 [ 272.331166] R10: 000058bf9b50f7f0 R11: 0000000000000246 R12: 0000000000000000 [ 272.338296] R13: 00000000fffffffe R14: 0000000000000000 R15: 000058bf6c103a77 [ 272.345428] </TASK> [ 272.347617] Modules linked in: i2c_cros_ec_tunnel(+) [ 272.364585] gsmi: Log Shutdown Reason 0x03 Returning -EPROBE_DEFER will allow the device to be bound once the controller is bound, in the case of built-in drivers. Fixes: 9d230c9e4f4e ("i2c: ChromeOS EC tunnel driver") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250407-null-ec-parent-v1-1-f7dda62d3110@igalia.com
2025-04-13Linux 6.15-rc2v6.15-rc2Linus Torvalds
2025-04-13bcachefs: fix bch2_dev_usage_full_read_fast()Kent Overstreet
One reference to bch_dev_usage wasn't updated, which meant we weren't reading the full bch_dev_usage_full - oops. Fixes: 955ba7b5ea03 ("bcachefs: bch_dev_usage_full") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-13Merge tag 'erofs-for-6.15-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - Properly handle errors when file-backed I/O fails - Fix compilation issues on ARM platform (arm-linux-gnueabi) - Fix parsing of encoded extents - Minor cleanup * tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: remove duplicate code erofs: fix encoded extents handling erofs: add __packed annotation to union(__le16..) erofs: set error to bio if file-backed IO fails
2025-04-13bcachefs: Don't print data read retry success on non-errorsKent Overstreet
We may end up in the data read retry path when reading cached data and racing with invalidation, or on checksum error when we were reading into a userspace buffer that might have been modified while the read was in flight. These aren't real errors, so we shouldn't print the 'retry success' message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-13Merge tag 'ext4_for_linus-6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A few more miscellaneous ext4 bug fixes and cleanups including some syzbot failures and fixing a stale file handing refeencing an inode previously used as a regular file, but which has been deleted and reused as an ea_inode would result in ext4 erroneously considering this a case of fs corruption" * tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix off-by-one error in do_split ext4: make block validity check resistent to sb bh corruption ext4: avoid -Wflex-array-member-not-at-end warning Documentation: ext4: Add fields to ext4_super_block documentation ext4: don't treat fhandle lookup of ea_inode as FS corruption
2025-04-13Merge tag 'fixes-2025-04-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix build of memblock test. Add missing stubs for mutex and free_reserved_area() to memblock tests" * tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: Fix mutex related build error
2025-04-13bcachefs: Add missing error handlingAlan Huang
Reported-by: syzbot+d10151bf01574a09a915@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-13bcachefs: Prevent granting write refs when filesystem is read-onlyGabriel Shahrouzi
Fix a shutdown WARNING in bch2_dev_free caused by active write I/O references (ca->io_ref[WRITE]) on a device being freed. The problem occurs when: - The filesystem is marked read-only (BCH_FS_rw clear in c->flags). - A subsequent operation (e.g., error handling for device removal) incorrectly tries to grant write references back to a device. - During final shutdown, the read-only flag causes the system to skip stopping write I/O references (bch2_dev_io_ref_stop(ca, WRITE)). - The leftover active write reference triggers the WARN_ON in bch2_dev_free. Prevent this by checking if the filesystem is read-only before attempting to grant write references to a device in the problematic code path. Ensure consistency between the filesystem state flag and the device I/O reference state during shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-13clang-format: Update the ForEachMacros list for v6.15-rc1Ingo Molnar
One of my 'git grep' searches tripped on this file listing an already removed <linux/list.h> primitive. Refresh it. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-12ext4: fix off-by-one error in do_splitArtem Sadovnikov
Syzkaller detected a use-after-free issue in ext4_insert_dentry that was caused by out-of-bounds access due to incorrect splitting in do_split. BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 Write of size 251 at addr ffff888074572f14 by task syz-executor335/5847 CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106 ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154 make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455 ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796 ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431 vfs_symlink+0x137/0x2e0 fs/namei.c:4615 do_symlinkat+0x222/0x3a0 fs/namei.c:4641 __do_sys_symlink fs/namei.c:4662 [inline] __se_sys_symlink fs/namei.c:4660 [inline] __x64_sys_symlink+0x7a/0x90 fs/namei.c:4660 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> The following loop is located right above 'if' statement. for (i = count-1; i >= 0; i--) { /* is more than half of this entry in 2nd half of the block? */ if (size + map[i].size/2 > blocksize/2) break; size += map[i].size; move++; } 'i' in this case could go down to -1, in which case sum of active entries wouldn't exceed half the block size, but previous behaviour would also do split in half if sum would exceed at the very last block, which in case of having too many long name files in a single block could lead to out-of-bounds access and following use-after-free. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Cc: stable@vger.kernel.org Fixes: 5872331b3d91 ("ext4: fix potential negative array index in do_split()") Signed-off-by: Artem Sadovnikov <a.sadovnikov@ispras.ru> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250404082804.2567-3-a.sadovnikov@ispras.ru Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: make block validity check resistent to sb bh corruptionOjaswin Mujoo
Block validity checks need to be skipped in case they are called for journal blocks since they are part of system's protected zone. Currently, this is done by checking inode->ino against sbi->s_es->s_journal_inum, which is a direct read from the ext4 sb buffer head. If someone modifies this underneath us then the s_journal_inum field might get corrupted. To prevent against this, change the check to directly compare the inode with journal->j_inode. **Slight change in behavior**: During journal init path, check_block_validity etc might be called for journal inode when sbi->s_journal is not set yet. In this case we now proceed with ext4_inode_block_valid() instead of returning early. Since systems zones have not been set yet, it is okay to proceed so we can perform basic checks on the blocks. Suggested-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/0c06bc9ebfcd6ccfed84a36e79147bf45ff5adc1.1743142920.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. So, with these changes, fix the following warning: fs/ext4/mballoc.c:3041:40: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/Z-SF97N3AxcIMlSi@kspp Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12Documentation: ext4: Add fields to ext4_super_block documentationTom Vierjahn
Documentation and implementation of the ext4 super block have slightly diverged: Padding has been removed in order to make room for new fields that are still missing in the documentation. Add the new fields s_encryption_level, s_first_error_errorcode, s_last_error_errorcode to the documentation of the ext4 super block. Fixes: f542fbe8d5e8 ("ext4 crypto: reserve codepoints used by the ext4 encryption feature") Fixes: 878520ac45f9 ("ext4: save the error code which triggered an ext4_error() in the superblock") Signed-off-by: Tom Vierjahn <tom.vierjahn@acm.org> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20250324221004.5268-1-tom.vierjahn@acm.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>