summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-20io_uring/net: don't clear REQ_F_NEED_CLEANUP unconditionallyJens Axboe
io_req_msg_cleanup() relies on the fact that io_netmsg_recycle() will always fully recycle, but that may not be the case if the msg cache was already full. To ensure that normal cleanup always gets run, let io_netmsg_recycle() deal with clearing the relevant cleanup flags, as it knows exactly when that should be done. Cc: stable@vger.kernel.org Reported-by: David Wei <dw@davidwei.uk> Fixes: 75191341785e ("io_uring/net: add iovec recycling") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20Merge branch 'kvm-pre-tdx' into HEADPaolo Bonzini
- Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing.
2025-03-20Merge branch 'kvm-nvmx-and-vm-teardown' into HEADPaolo Bonzini
The immediate issue being fixed here is a nVMX bug where KVM fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI). However, checking for a pending interrupt accesses the legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit results in a NULL pointer deref; that's a prerequisite for the nVMX fix. The remaining patches attempt to bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. E.g. KVM currently unloads each vCPU's MMUs in a separate operation from destroying vCPUs, all because when guest SMP support was added, KVM had a kludgy MMU teardown flow that broke when a VM had more than one 1 vCPU. And that oddity lived on, for 18 years... Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-03-20drm/xe: Fix exporting xe buffers multiple timesTomasz Rusinowicz
The `struct ttm_resource->placement` contains TTM_PL_FLAG_* flags, but it was incorrectly tested for XE_PL_* flags. This caused xe_dma_buf_pin() to always fail when invoked for the second time. Fix this by checking the `mem_type` field instead. Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> (cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-03-20Merge tag 'kvmarm-6.15' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.15 - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events
2025-03-20Merge tag 'kvm-riscv-6.15-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 6.15 - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal
2025-03-20cgroup: rstat: Cleanup flushing functions and lockingYosry Ahmed
Now that the rstat lock is being re-acquired on every CPU iteration in cgroup_rstat_flush_locked(), having the initially acquire the lock is unnecessary and unclear. Inline cgroup_rstat_flush_locked() into cgroup_rstat_flush() and move the lock/unlock calls to the beginning and ending of the loop body to make the critical section obvious. cgroup_rstat_flush_hold/release() do not make much sense with the lock being dropped and reacquired internally. Since it has no external callers, remove it and explicitly acquire the lock in cgroup_base_stat_cputime_show() instead. This leaves the code with a single flushing function, cgroup_rstat_flush(). Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-03-20Merge tag 'net-6.14-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, bluetooth and ipsec. This contains a last minute revert of a recent GRE patch, mostly to allow me stating there are no known regressions outstanding. Current release - regressions: - revert "gre: Fix IPv6 link-local address generation." - eth: ti: am65-cpsw: fix NAPI registration sequence Previous releases - regressions: - ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw(). - mptcp: fix data stream corruption in the address announcement - bluetooth: fix connection regression between LE and non-LE adapters - can: - flexcan: only change CAN state when link up in system PM - ucan: fix out of bound read in strscpy() source Previous releases - always broken: - lwtunnel: fix reentry loops - ipv6: fix TCP GSO segmentation with NAT - xfrm: force software GSO only in tunnel mode - eth: ti: icssg-prueth: add lock to stats Misc: - add Andrea Mayer as a maintainer of SRv6" * tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits) MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6 Revert "gre: Fix IPv6 link-local address generation." Revert "selftests: Add IPv6 link-local address generation tests for GRE devices." net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES tools headers: Sync uapi/asm-generic/socket.h with the kernel sources mptcp: Fix data stream corruption in the address announcement selftests: net: test for lwtunnel dst ref loops net: ipv6: ioam6: fix lwtunnel_output() loop net: lwtunnel: fix recursion loops net: ti: icssg-prueth: Add lock to stats net: atm: fix use after free in lec_send() xsk: fix an integer overflow in xp_create_and_assign_umem() net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data selftests: drv-net: use defer in the ping test phy: fix xa_alloc_cyclic() error handling dpll: fix xa_alloc_cyclic() error handling devlink: fix xa_alloc_cyclic() error handling ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create(). ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw(). net: ipv6: fix TCP GSO segmentation with NAT ...
2025-03-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Collected driver fixes from the last few weeks, I was surprised how significant many of them seemed to be. - Fix rdma-core test failures due to wrong startup ordering in rxe - Don't crash in bnxt_re if the FW supports more than 64k QPs - Fix wrong QP table indexing math in bnxt_re - Calculate the max SRQs for userspace properly in bnxt_re - Don't try to do math on errno for mlx5's rate calculation - Properly allow userspace to control the VLAN in the QP state during INIT->RTR for bnxt_re - 6 bug fixes for HNS: - Soft lockup when processing huge MRs, add a cond_resched() - Fix missed error unwind for doorbell allocation - Prevent bad send queue parameters from userspace - Wrong error unwind in qp creation - Missed xa_destroy during driver shutdown - Fix reporting to userspace of max_sge_rd, hns doesn't have a read/write difference" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/hns: Fix wrong value of max_sge_rd RDMA/hns: Fix missing xa_destroy() RDMA/hns: Fix a missing rollback in error path of hns_roce_create_qp_common() RDMA/hns: Fix invalid sq params not being blocked RDMA/hns: Fix unmatched condition in error path of alloc_user_qp_db() RDMA/hns: Fix soft lockup during bt pages loop RDMA/bnxt_re: Avoid clearing VLAN_ID mask in modify qp path RDMA/mlx5: Handle errors returned from mlx5r_ib_rate() RDMA/bnxt_re: Fix reporting maximum SRQs on P7 chips RDMA/bnxt_re: Add missing paranthesis in map_qp_id_to_tbl_indx RDMA/bnxt_re: Fix allocation of QP table RDMA/rxe: Fix the failure of ibv_query_device() and ibv_query_device_ex() tests
2025-03-20Merge tag 'mmc-v6.14-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - sdhci-brcmstb: Fix CQE suspend/resume support - atmel-mci: Add a missing clk_disable_unprepare() in ->probe() * tag 'mmc-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-brcmstb: add cqhci suspend/resume to PM ops mmc: atmel-mci: Add missing clk_disable_unprepare()
2025-03-20Merge tag 'efi-fixes-for-v6.14-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: "Here's a final batch of EFI fixes for v6.14. The efivarfs ones are fixes for changes that were made this cycle. James's fix is somewhat of a band-aid, but it was blessed by the VFS folks, who are working with James to come up with something better for the next cycle. - Avoid physical address 0x0 for random page allocations - Add correct lockdep annotation when traversing efivarfs on resume - Avoid NULL mount in kernel_file_open() when traversing efivarfs on resume" * tag 'efi-fixes-for-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efivarfs: fix NULL dereference on resume efivarfs: use I_MUTEX_CHILD nested lock to traverse variables on resume efi/libstub: Avoid physical address 0x0 when doing random allocation
2025-03-20spi: dt-bindings: cdns,qspi-nor: ImproveMark Brown
Merge series from Miquel Raynal <miquel.raynal@bootlin.com>: While working with this controller I figured out a couple of small issues which I propose to fix. They are not super impacting, but I believe this goes into the right direction.
2025-03-20ASoC: wm8904: Add DMIC and DRC supportMark Brown
Merge series from Francesco Dolcini <francesco@dolcini.it>: This patch series adds DMIC and DRC support to the WM8904 driver, a new of_ helper is added to simplify the driver code. DRC functionality is added in the same patch series to provide the necessary dynamic range control to make DMIC support useful. The WM8904 supports digital microphones on two of its inputs: IN1L/DMICDAT1 and IN1R/DMICDAT2. These two inputs can either be connected to an ADC or to the DMIC system. There is an ADC for each line, and only one DMIC block. This DMIC block is either connected to DMICDAT1 or to DMICDAT2. One DMIC data line supports two digital microphones via time multiplexing. The pin's functionality is decided during hardware design (IN1L vs DMICDAT1 and IN1R vs DMICDAT2). This is reflected in the Device Tree. If one line is analog and one is DMIC, we need to be able to switch between ADC and DMIC at runtime. The DMIC source is known from the Device Tree. If both are DMIC inputs, we need to be able to switch the DMIC source. There is no need to switch between ADC and DMIC at runtime. Therefore, kcontrols are dynamically added by the driver depending on its Device Tree configuration. This is a heavy rework of a previous patch series provided by Alifer Moraes and Pierluigi Passaro, https://lore.kernel.org/lkml/20220307141041.27538-1-alifer.m@variscite.com.
2025-03-20Tidy up ASoC control get and put handlersMark Brown
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>: There is a lot of duplicated and occasionally slightly incorrect code around the ASoC control get and put handlers. This series add some kunit tests and then refactors the code to get all the tests passing and reduce some of the duplication. The focus here is on the volsw handlers, future work could still be done on some of the others but these were the ones that most required attention. Hopefully the only slightly controversal change is the very last patch which changes platform_max to be applied after the control type is determined, more discussion in the commit message for that one.
2025-03-20arm64: mm: Don't use %pK through printkThomas Weißschuh
Restricted pointers ("%pK") are not meant to be used through printk(). It can unintentionally expose security sensitive, raw pointer values. Use regular pointer formatting instead. Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/ Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20250217-restricted-pointers-arm64-v1-1-14bb1f516b01@linutronix.de Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-03-20MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6David Ahern
Andrea has made significant contributions to SRv6 support in Linux. Acknowledge the work and on-going interest in Srv6 support with a maintainers entry for these files so hopefully he is included on patches going forward. Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Andrea Mayer <andrea.mayer@uniroma2.it> Link: https://patch.msgid.link/20250312092212.46299-1-dsahern@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Merge branch 'gre-revert-ipv6-link-local-address-fix'Paolo Abeni
Guillaume Nault says: ==================== gre: Revert IPv6 link-local address fix. Following Paolo's suggestion, let's revert the IPv6 link-local address generation fix for GRE devices. The patch introduced regressions in the upstream CI, which are still under investigation. Start by reverting the kselftest that depend on that fix (patch 1), then revert the kernel code itself (patch 2). ==================== Link: https://patch.msgid.link/cover.1742418408.git.gnault@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Revert "gre: Fix IPv6 link-local address generation."Guillaume Nault
This reverts commit 183185a18ff96751db52a46ccf93fff3a1f42815. This patch broke net/forwarding/ip6gre_custom_multipath_hash.sh in some circumstances (https://lore.kernel.org/netdev/Z9RIyKZDNoka53EO@mini-arch/). Let's revert it while the problem is being investigated. Fixes: 183185a18ff9 ("gre: Fix IPv6 link-local address generation.") Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/8b1ce738eb15dd841aab9ef888640cab4f6ccfea.1742418408.git.gnault@redhat.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Revert "selftests: Add IPv6 link-local address generation tests for GRE ↵Guillaume Nault
devices." This reverts commit 6f50175ccad4278ed3a9394c00b797b75441bd6e. Commit 183185a18ff9 ("gre: Fix IPv6 link-local address generation.") is going to be reverted. So let's revert the corresponding kselftest first. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/259a9e98f7f1be7ce02b53d0b4afb7c18a8ff747.1742418408.git.gnault@redhat.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Merge tag 'ipsec-2025-03-19' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2025-03-19 1) Fix tunnel mode TX datapath in packet offload mode by directly putting it to the xmit path. From Alexandre Cassen. 2) Force software GSO only in tunnel mode in favor of potential HW GSO. From Cosmin Ratiu. ipsec-2025-03-19 * tag 'ipsec-2025-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm_output: Force software GSO only in tunnel mode xfrm: fix tunnel mode TX datapath in packet offload mode ==================== Link: https://patch.msgid.link/20250319065513.987135-1-steffen.klassert@secunet.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Merge patch series "pidfs: handle multi-threaded exec and premature ↵Christian Brauner
thread-group leader exit" Christian Brauner <brauner@kernel.org> says: This is another attempt at trying to make pidfd polling for multi-threaded exec and premature thread-group leader exit consistent. A quick recap of these two cases: (1) During a multi-threaded exec by a subthread, i.e., non-thread-group leader thread, all other threads in the thread-group including the thread-group leader are killed and the struct pid of the thread-group leader will be taken over by the subthread that called exec. IOW, two tasks change their TIDs. (2) A premature thread-group leader exit means that the thread-group leader exited before all of the other subthreads in the thread-group have exited. Both cases lead to inconsistencies for pidfd polling with PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the current thread-group leader may or may not see an exit notification on the file descriptor depending on when poll is performed. If the poll is performed before the exec of the subthread has concluded an exit notification is generated for the old thread-group leader. If the poll is performed after the exec of the subthread has concluded no exit notification is generated for the old thread-group leader. The correct behavior would be to simply not generate an exit notification on the struct pid of a subhthread exec because the struct pid is taken over by the subthread and thus remains alive. But this is difficult to handle because a thread-group may exit premature as mentioned in (2). In that case an exit notification is reliably generated but the subthreads may continue to run for an indeterminate amount of time and thus also may exec at some point. This tiny series tries to address this problem. If that works correctly then no exit notifications are generated for a PIDFD_THREAD pidfd for a thread-group leader until all subthreads have been reaped. If a subthread should exec before no exit notification will be generated until that task exits or it creates subthreads and repeates the cycle. * patches from https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-0-da678ce805bf@kernel.org: selftests/pidfd: third test for multi-threaded exec polling selftests/pidfd: second test for multi-threaded exec polling selftests/pidfd: first test for multi-threaded exec polling pidfs: improve multi-threaded exec and premature thread-group leader exit polling Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-0-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20selftests/pidfd: third test for multi-threaded exec pollingChristian Brauner
Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-4-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20selftests/pidfd: second test for multi-threaded exec pollingChristian Brauner
Ensure that during a multi-threaded exec and premature thread-group leader exit no exit notification is generated. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-3-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20selftests/pidfd: first test for multi-threaded exec pollingChristian Brauner
Add first test for premature thread-group leader exit. Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-2-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20pidfs: improve multi-threaded exec and premature thread-group leader exit ↵Christian Brauner
polling This is another attempt trying to make pidfd polling for multi-threaded exec and premature thread-group leader exit consistent. A quick recap of these two cases: (1) During a multi-threaded exec by a subthread, i.e., non-thread-group leader thread, all other threads in the thread-group including the thread-group leader are killed and the struct pid of the thread-group leader will be taken over by the subthread that called exec. IOW, two tasks change their TIDs. (2) A premature thread-group leader exit means that the thread-group leader exited before all of the other subthreads in the thread-group have exited. Both cases lead to inconsistencies for pidfd polling with PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the current thread-group leader may or may not see an exit notification on the file descriptor depending on when poll is performed. If the poll is performed before the exec of the subthread has concluded an exit notification is generated for the old thread-group leader. If the poll is performed after the exec of the subthread has concluded no exit notification is generated for the old thread-group leader. The correct behavior would be to simply not generate an exit notification on the struct pid of a subhthread exec because the struct pid is taken over by the subthread and thus remains alive. But this is difficult to handle because a thread-group may exit prematurely as mentioned in (2). In that case an exit notification is reliably generated but the subthreads may continue to run for an indeterminate amount of time and thus also may exec at some point. So far there was no way to distinguish between (1) and (2) internally. This tiny series tries to address this problem by discarding PIDFD_THREAD notification on premature thread-group leader exit. If that works correctly then no exit notifications are generated for a PIDFD_THREAD pidfd for a thread-group leader until all subthreads have been reaped. If a subthread should exec aftewards no exit notification will be generated until that task exits or it creates subthreads and repeates the cycle. Co-Developed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-1-da678ce805bf@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge tag 'batadv-net-pullrequest-20250318' of ↵Paolo Abeni
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is batman-adv bugfix: - Ignore own maximum aggregation size during RX, Sven Eckelmann * tag 'batadv-net-pullrequest-20250318' of git://git.open-mesh.org/linux-merge: batman-adv: Ignore own maximum aggregation size during RX ==================== Link: https://patch.msgid.link/20250318150035.35356-1-sw@simonwunderlich.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTESLin Ma
Previous commit 8b5c171bb3dc ("neigh: new unresolved queue limits") introduces new netlink attribute NDTPA_QUEUE_LENBYTES to represent approximative value for deprecated QUEUE_LEN. However, it forgot to add the associated nla_policy in nl_ntbl_parm_policy array. Fix it with one simple NLA_U32 type policy. Fixes: 8b5c171bb3dc ("neigh: new unresolved queue limits") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://patch.msgid.link/20250315165113.37600-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20fs: sort out fd allocation vs dup2 race commentary, take 2Mateusz Guzik
fd_install() has a questionable comment above it. While it correctly points out a possible race against dup2(), it states: > We need to detect this and fput() the struct file we are about to > overwrite in this case. > > It should never happen - if we allow dup2() do it, _really_ bad things > will follow. I have difficulty parsing the above. The first sentence would suggest fd_install() tries to detect and recover from the race (it does not), the next one claims the race needs to be dealt with (it is, by dup2()). Given that fd_install() does not suffer the burden, this patch removes the above and instead expands on the race in dup2() commentary. While here tidy up the docs around fd_install(). Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320102637.1924183-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20Merge patch series "further iomap large atomic writes changes"Christian Brauner
John Garry <john.g.garry@oracle.com> says: These iomap changes are spun-off the XFS large atomic writes series at https://lore.kernel.org/linux-xfs/86a64256-497a-453b-bbba-a5ac6b4cb056@oracle.com/T/#ma99c763221de9d49ea2ccfca9ff9b8d71c8b2677 The XFS parts there are not ready yet, but it is worth having the iomap changes queued in advance. Some much earlier changes from that same series were already queued in the vfs tree, and these patches rework those changes - specifically the first patch in this series does. The most other significant change is the patch to rework how the bio flags are set in the DIO patch. * patches from https://lore.kernel.org/r/20250320120250.4087011-1-john.g.garry@oracle.com: iomap: rework IOMAP atomic flags iomap: comment on atomic write checks in iomap_dio_bio_iter() iomap: inline iomap_dio_bio_opflags() Link: https://lore.kernel.org/r/20250320120250.4087011-1-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: rework IOMAP atomic flagsJohn Garry
Flag IOMAP_ATOMIC_SW is not really required. The idea of having this flag is that the FS ->iomap_begin callback could check if this flag is set to decide whether to do a SW (FS-based) atomic write. But the FS can set which ->iomap_begin callback it wants when deciding to do a FS-based atomic write. Furthermore, it was thought that IOMAP_ATOMIC_HW is not a proper name, as the block driver can use SW-methods to emulate an atomic write. So change back to IOMAP_ATOMIC. The ->iomap_begin callback needs though to indicate to iomap core that REQ_ATOMIC needs to be set, so add IOMAP_F_ATOMIC_BIO for that. These changes were suggested by Christoph Hellwig and Dave Chinner. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-4-john.g.garry@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: comment on atomic write checks in iomap_dio_bio_iter()John Garry
Help explain the code. Also clarify the comment for bio size check. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-3-john.g.garry@oracle.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20iomap: inline iomap_dio_bio_opflags()John Garry
It is neater to build blk_opf_t fully in one place, so inline iomap_dio_bio_opflags() in iomap_dio_bio_iter(). Also tidy up the logic in dealing with IOMAP_DIO_CALLER_COMP, in generally separate the logic in dealing with flags associated with reads and writes. Originally-from: Christoph Hellwig <hch@lst.de> Reviewed-by: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250320120250.4087011-2-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20tools headers: Sync uapi/asm-generic/socket.h with the kernel sourcesAlexander Mikhalitsyn
This also fixes a wrong definitions for SCM_TS_OPT_ID & SO_RCVPRIORITY. Accidentally found while working on another patchset. Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev> Cc: Willem de Bruijn <willemb@google.com> Cc: Jason Xing <kerneljasonxing@gmail.com> Cc: Anna Emese Nyiri <annaemesenyiri@gmail.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Paolo Abeni <pabeni@redhat.com> Fixes: a89568e9be75 ("selftests: txtimestamp: add SCM_TS_OPT_ID test") Fixes: e45469e594b2 ("sock: Introduce SO_RCVPRIORITY socket option") Link: https://lore.kernel.org/netdev/20250314195257.34854-1-kuniyu@amazon.com/ Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250314214155.16046-1-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mptcp: Fix data stream corruption in the address announcementArthur Mongodin
Because of the size restriction in the TCP options space, the MPTCP ADD_ADDR option is exclusive and cannot be sent with other MPTCP ones. For this reason, in the linked mptcp_out_options structure, group of fields linked to different options are part of the same union. There is a case where the mptcp_pm_add_addr_signal() function can modify opts->addr, but not ended up sending an ADD_ADDR. Later on, back in mptcp_established_options, other options will be sent, but with unexpected data written in other fields due to the union, e.g. in opts->ext_copy. This could lead to a data stream corruption in the next packet. Using an intermediate variable, prevents from corrupting previously established DSS option. The assignment of the ADD_ADDR option parameters is now done once we are sure this ADD_ADDR option can be set in the packet, e.g. after having dropped other suboptions. Fixes: 1bff1e43a30e ("mptcp: optimize out option generation") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Arthur Mongodin <amongodin@randorisec.fr> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> [ Matt: the commit message has been updated: long lines splits and some clarifications. ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-1-122dbb249db3@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20mmc: host: Wait for Vdd to settle on card power offErick Shepherd
The SD spec version 6.0 section 6.4.1.5 requires that Vdd must be lowered to less than 0.5V for a minimum of 1 ms when powering off a card. Increase wait to 15 ms so that voltage has time to drain down to 0.5V and cards can power off correctly. Issues with voltage drain time were only observed on Apollo Lake and Bay Trail host controllers so this fix is limited to those devices. Signed-off-by: Erick Shepherd <erick.shepherd@ni.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250314195021.1588090-1-erick.shepherd@ni.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-03-20i2c: amd-mp2: drop free_irq() of devm_request_irq() allocated irqYang Yingliang
irq allocated with devm_request_irq() will be freed in devm_irq_release(), using free_irq() in ->remove() will causes a dangling pointer, and a subsequent double free. So remove the free_irq() in the error path and remove path. Fixes: 969864efae78 ("i2c: amd-mp2: use msix/msi if the hardware supports") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20221103121146.99836-1-yangyingliang@huawei.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2025-03-20libfs: Fix duplicate directory entry in offset_dir_lookupYongjian Sun
There is an issue in the kernel: In tmpfs, when using the "ls" command to list the contents of a directory with a large number of files, glibc performs the getdents call in multiple rounds. If a concurrent unlink occurs between these getdents calls, it may lead to duplicate directory entries in the ls output. One possible reproduction scenario is as follows: Create 1026 files and execute ls and rm concurrently: for i in {1..1026}; do echo "This is file $i" > /tmp/dir/file$i done ls /tmp/dir rm /tmp/dir/file4 ->getdents(file1026-file5) ->unlink(file4) ->getdents(file5,file3,file2,file1) It is expected that the second getdents call to return file3 through file1, but instead it returns an extra file5. The root cause of this problem is in the offset_dir_lookup function. It uses mas_find to determine the starting position for the current getdents call. Since mas_find locates the first position that is greater than or equal to mas->index, when file4 is deleted, it ends up returning file5. It can be fixed by replacing mas_find with mas_find_rev, which finds the first position that is less than or equal to mas->index. Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories") Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Link: https://lore.kernel.org/r/20250320034417.555810-1-sunyongjian@huaweicloud.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20ASoC: wm8904: add DMIC supportErnest Van Hoecke
The WM8904 codec supports both ADC and DMIC inputs. Get input pin functionality from the platform data and add the necessary controls depending on the possible additional routing. The ADC and DMIC share the IN1L/DMICDAT1 and IN1R/DMICDAT2 pins. This leads to a few scenarios requiring different DAPM routing: - When both are connected to an analog input, only the ADC is used. - When one line is a DMIC and the other an analog input, the DMIC source is set from the platform data and a mux is added to select whether to use the ADC or DMIC. - When both are connected to a DMIC, another mux is added to this to select the DMIC source. Note that we still need to be able to select the ADC system for use with the IN2L, IN2R, IN3L and IN3R pins. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-6-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: get platform data from DTErnest Van Hoecke
Read in optional codec-specific properties from the device tree. The platform_data structure is not populated when using device trees. This change parses optional dts properties to populate it. - wlf,in1l-as-dmicdat1 - wlf,in1r-as-dmicdat2 - wlf,gpio-cfg - wlf,micbias-cfg - wlf,drc-cfg-regs - wlf,drc-cfg-names - wlf,retune-mobile-cfg-regs - wlf,retune-mobile-cfg-names - wlf,retune-mobile-cfg-hz Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-5-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: dt-bindings: wm8904: Add DMIC, GPIO, MIC and EQ supportErnest Van Hoecke
Add two properties to select the IN1L/DMICDAT1 and IN2R/DMICDAT2 functionality: - wlf,in1l-as-dmicdat1 - wlf,in1r-as-dmicdat2 Add a property to describe the GPIO configuration registers, that can be used to set the four multifunction pins: - wlf,gpio-cfg Add a property to describe the mic bias control registers: - wlf,micbias-cfg Add two properties to describe the Dynamic Range Controller (DRC), allowing multiple named configurations where each config sets the 4 DRC registers (R40-R43): - wlf,drc-cfg-regs - wlf,drc-cfg-names Add three properties to describe the equalizer (ReTune Mobile), allowing multiple named configurations (associated with a samplerate) that set the 24 (R134-R157) EQ registers: - wlf,retune-mobile-cfg-regs - wlf,retune-mobile-cfg-hz - wlf,retune-mobile-cfg-rates The set of names and configurations for DRC and ReTune Mobile are specified by system integrators. The names are exposed directly to userspace as options that can be selected at runtime. Adding the DRC and ReTune Mobile data to the DT eases the transition from pdata, which has handled them this way for over a decade. The parameters filled in here are almost certainly specific tuning for the hardware so it makes sense to ship them with the hardware description. Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319142059.46692-4-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: wm8904: Don't touch GPIO configs set to 0xFFFFErnest Van Hoecke
When updating the GPIO registers, do nothing for all fields of gpio_cfg that are "0xFFFF". This "do nothing" flag used to be 0 to easily check whether the gpio_cfg field was actually set inside pdata or left empty (default). However, 0 is a valid configuration for these registers, while 0xFFFF is not. With this change, users can explicitly set them to 0. Not setting gpio_cfg in the platform data will now lead to setting all GPIO registers to 0 instead of leaving them unset. No one is using this platform data with this codec. The change gets the driver ready to properly set gpio_cfg from the DT. Datasheet: https://statics.cirrus.com/pubs/proDatasheet/WM8904_Rev4.1.pdf Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-3-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20of: Add of_property_read_u16_indexErnest Van Hoecke
There is an of_property_read_u32_index and of_property_read_u64_index. This patch adds a similar helper for u16. Signed-off-by: Ernest Van Hoecke <ernest.vanhoecke@toradex.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20250319142059.46692-2-francesco@dolcini.it Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: spi-mem: Introduce a default ->exec_op() debug logMiquel Raynal
Many spi-mem controller drivers have a very similar debug log at the beginning of their ->exec_op() callback implementation. This debug log is effectively useful, so let's create one that is complete and concise enough, so developers no longer need to write their own. The verbosity being high, VERBOSE_DEBUG will be required in this case. Remove the debug log from individual drivers and propose a common one. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://patch.msgid.link/20250320115644.2231240-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Require some peripheral propertiesMiquel Raynal
There are 5 mandatory peripheral properties. They are described in a separate binding but not explicitly required. Make sure they are correctly marked required and update the example to reflect this. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-4-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Deprecate the Cadence compatible aloneMiquel Raynal
The initial SPI controller IP from Cadence has always been implemented into controllers from various hardware manufacturers and because of that, it has always been (rightfully) doubled with a more specific compatible. There are likely no reasons to keep this compatible legitimate, alone. Make sure people do not get mislead by officially deprecating this compatible. While at deprecating, let's update the examples to avoid documenting deprecated properties. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250319094651.1290509-3-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20spi: dt-bindings: cdns,qspi-nor: Be more descriptive regarding what this ↵Miquel Raynal
controller is Despite being very common in commit logs, SPI NOR controllers simply do not exist. At least, they are not as specific as the name implies. There are SPI memory controllers which are indeed "specialized" and optimized for handling "memories", but most of them are just generic and accept almost any kind of opcode, address, dummy and data cycles, making them as suitable for NANDs than NORs. Furthermore, this controller supports any kind of bus, from single to octal NAND, so make it clear. Also add a comment to mention that the initial compatible naming is too specific (but obviously kept for backward compatibility reasons). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20250319094651.1290509-2-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20fs: call inode_sb_list_add() outside of inode hash lockMateusz Guzik
As both locks are highly contended during significant inode churn, holding the inode hash lock while waiting for the sb list lock exacerbates the problem. Why moving it out is safe: the inode at hand still has I_NEW set and anyone who finds it through legitimate means waits for the bit to clear, by which time inode_sb_list_add() is guaranteed to have finished. This significantly drops hash lock contention for me when stating 20 separate trees in parallel, each with 1000 directories * 1000 files. However, no speed up was observed as contention increased on the other locks, notably dentry LRU. Even so, removal of the lock ordering will help making this faster later. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250320004643.1903287-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20ALSA: oxygen: Fix dependency on CONFIG_PM_SLEEPTakashi Iwai
The conversion to EXPORT_SIMPLE_DEV_PM_OPS() also replaced the pm ops assignment with pm_ptr() macro, but this made difference from the original code; it had conditional with ifdef CONFIG_PM_SLEEP, while we have now with CONFIG_PM, instead. This seems causing build errors with randomconfig. For fixing the inconsistency, replace pm_ptr() with pm_sleep_ptr(). Fixes: 5ea0a2206b58 ("ALSA: oxygen: Convert to EXPORT_SIMPLE_DEV_PM_OPS()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503201853.7kB0BPRw-lkp@intel.com/ Link: https://patch.msgid.link/20250320105721.10789-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-20Merge branch 'net-fix-lwtunnel-reentry-loops'Paolo Abeni
Justin Iurman says: ==================== net: fix lwtunnel reentry loops When the destination is the same after the transformation, we enter a lwtunnel loop. This is true for most of lwt users: ioam6, rpl, seg6, seg6_local, ila_lwt, and lwt_bpf. It can happen in their input() and output() handlers respectively, where either dst_input() or dst_output() is called at the end. It can also happen in xmit() handlers. Here is an example for rpl_input(): dump_stack_lvl+0x60/0x80 rpl_input+0x9d/0x320 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 [...] lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 lwtunnel_input+0x64/0xa0 ip6_sublist_rcv_finish+0x85/0x90 ip6_sublist_rcv+0x236/0x2f0 ... until rpl_do_srh() fails, which means skb_cow_head() failed. This series provides a fix at the core level of lwtunnel to catch such loops when they're not caught by the respective lwtunnel users, and handle the loop case in ioam6 which is one of the users. This series also comes with a new selftest to detect some dst cache reference loops in lwtunnel users. ==================== Link: https://patch.msgid.link/20250314120048.12569-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20selftests: net: test for lwtunnel dst ref loopsJustin Iurman
As recently specified by commit 0ea09cbf8350 ("docs: netdev: add a note on selftest posting") in net-next, the selftest is therefore shipped in this series. However, this selftest does not really test this series. It needs this series to avoid crashing the kernel. What it really tests, thanks to kmemleak, is what was fixed by the following commits: - commit c71a192976de ("net: ipv6: fix dst refleaks in rpl, seg6 and ioam6 lwtunnels") - commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") - commit c64a0727f9b1 ("net: ipv6: fix dst ref loop on input in seg6 lwt") - commit 13e55fbaec17 ("net: ipv6: fix dst ref loop on input in rpl lwt") - commit 0e7633d7b95b ("net: ipv6: fix dst ref loop in ila lwtunnel") - commit 5da15a9c11c1 ("net: ipv6: fix missing dst ref drop in ila lwtunnel") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>