summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-19smb: client: fix TCP timers deadlock after rmmodEnzo Matsumiya
Commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") fixed a netns UAF by manually enabled socket refcounting (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)). The reason the patch worked for that bug was because we now hold references to the netns (get_net_track() gets a ref internally) and they're properly released (internally, on __sk_destruct()), but only because sk->sk_net_refcnt was set. Problem: (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless if init_net or other) Setting sk->sk_net_refcnt=1 *manually* and *after* socket creation is not only out of cifs scope, but also technically wrong -- it's set conditionally based on user (=1) vs kernel (=0) sockets. And net/ implementations seem to base their user vs kernel space operations on it. e.g. upon TCP socket close, the TCP timers are not cleared because sk->sk_net_refcnt=1: (cf. commit 151c9c724d05 ("tcp: properly terminate timers for kernel sockets")) net/ipv4/tcp.c: void tcp_close(struct sock *sk, long timeout) { lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); if (!sk->sk_net_refcnt) inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); } Which will throw a lockdep warning and then, as expected, deadlock on tcp_write_timer(). A way to reproduce this is by running the reproducer from ef7134c7fc48 and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep warning shows up. Fix: We shouldn't mess with socket internals ourselves, so do not set sk_net_refcnt manually. Also change __sock_create() to sock_create_kern() for explicitness. As for non-init_net network namespaces, we deal with it the best way we can -- hold an extra netns reference for server->ssocket and drop it when it's released. This ensures that the netns still exists whenever we need to create/destroy server->ssocket, but is not directly tied to it. Fixes: ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-19smb: client: Deduplicate "select NETFS_SUPPORT" in KconfigDragan Simic
Repeating automatically selected options in Kconfig files is redundant, so let's delete repeated "select NETFS_SUPPORT" that was added accidentally. Fixes: 69c3c023af25 ("cifs: Implement netfslib hooks") Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-19smb: use macros instead of constants for leasekey size and default cifsattrs ↵Bharath SM
value Replace default hardcoded value for cifsAttrs with ATTR_ARCHIVE macro Use SMB2_LEASE_KEY_SIZE macro for leasekey size in smb2_lease_break Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-19drm/sched: Fix drm_sched_fini() docu generationBagas Sanjaya
Commit baf4afc5831438 ("drm/sched: Improve teardown documentation") added a list of drm_sched_fini()'s problems. The list triggers htmldocs warning (but renders correctly in htmldocs output): Documentation/gpu/drm-mm:571: ./drivers/gpu/drm/scheduler/sched_main.c:1359: ERROR: Unexpected indentation. Separate the list from the preceding paragraph by a blank line to fix the warning. While at it, also end the aforementioned paragraph by a colon. Fixes: baf4afc58314 ("drm/sched: Improve teardown documentation") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20241108175655.6d3fcfb7@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> [phasta: Adjust commit message] Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241217034915.62594-1-bagasdotme@gmail.com
2024-12-19pmdomain: core: add dummy release function to genpd deviceLucas Stach
The genpd device, which is really only used as a handle to lookup OPP, but not even registered to the device core otherwise and thus lifetime linked to the genpd struct it is contained in, is missing a release function. After b8f7bbd1f4ec ("pmdomain: core: Add missing put_device()") the device will be cleaned up going through the driver core device_release() function, which will warn when no release callback is present for the device. Add a dummy release function to shut up the warning. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Fixes: b8f7bbd1f4ec ("pmdomain: core: Add missing put_device()") Cc: stable@vger.kernel.org Message-ID: <20241218184433.1930532-1-l.stach@pengutronix.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-12-19pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()Joe Hattori
imx_gpcv2_probe() leaks an OF node reference obtained by of_get_child_by_name(). Fix it by declaring the device node with the __free(device_node) cleanup construct. This bug was found by an experimental static analysis tool that I am developing. Fixes: 03aa12629fc4 ("soc: imx: Add GPCv2 power gating driver") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Cc: stable@vger.kernel.org Message-ID: <20241215030159.1526624-1-joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-12-19mmc: sdhci-msm: fix crypto key evictionEric Biggers
Commit c7eed31e235c ("mmc: sdhci-msm: Switch to the new ICE API") introduced an incorrect check of the algorithm ID into the key eviction path, and thus qcom_ice_evict_key() is no longer ever called. Fix it. Fixes: c7eed31e235c ("mmc: sdhci-msm: Switch to the new ICE API") Cc: stable@vger.kernel.org Cc: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Message-ID: <20241213041958.202565-6-ebiggers@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-12-19selftests/bpf: Fix compilation error in get_uprobe_offset()Jerome Marchand
In get_uprobe_offset(), the call to procmap_query() use the constant PROCMAP_QUERY_VMA_EXECUTABLE, even if PROCMAP_QUERY is not defined. Define PROCMAP_QUERY_VMA_EXECUTABLE when PROCMAP_QUERY isn't. Fixes: 4e9e07603ecd ("selftests/bpf: make use of PROCMAP_QUERY ioctl if available") Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20241218175724.578884-1-jmarchan@redhat.com
2024-12-19accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal()Jacek Lawrynowicz
Move pm_runtime_set_active() to ivpu_pm_init() so when ivpu_ipc_send_receive_internal() is executed before ivpu_pm_enable() it already has correct runtime state, even if last resume was not successful. Fixes: 8ed520ff4682 ("accel/ivpu: Move set autosuspend delay to HW specific code") Cc: stable@vger.kernel.org # v6.7+ Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-4-jacek.lawrynowicz@linux.intel.com
2024-12-19accel/ivpu: Fix memory leak in ivpu_mmu_reserved_context_init()Jacek Lawrynowicz
Add appropriate error handling to ensure all allocated resources are released upon encountering an error. Fixes: a74f4d991352 ("accel/ivpu: Defer MMU root page table allocation") Cc: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-3-jacek.lawrynowicz@linux.intel.com
2024-12-19accel/ivpu: Fix general protection fault in ivpu_bo_list()Jacek Lawrynowicz
Check if ctx is not NULL before accessing its fields. Fixes: 37dee2a2f433 ("accel/ivpu: Improve buffer object debug logs") Cc: stable@vger.kernel.org # v6.8 Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-2-jacek.lawrynowicz@linux.intel.com
2024-12-19selftests/bpf: Use asm constraint "m" for LoongArchTiezhu Yang
Currently, LoongArch LLVM does not support the constraint "o" and no plan to support it, it only supports the similar constraint "m", so change the constraints from "nor" in the "else" case to arch-specific "nmr" to avoid the build error such as "unexpected asm memory constraint" for LoongArch. Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests") Suggested-by: Weining Lu <luweining@loongson.cn> Suggested-by: Li Chen <chenli@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Cc: stable@vger.kernel.org Link: https://llvm.org/docs/LangRef.html#supported-constraint-code-list Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp#L172 Link: https://lore.kernel.org/bpf/20241219111506.20643-1-yangtiezhu@loongson.cn
2024-12-19RDMA/bnxt_re: Fix the locking while accessing the QP tableSelvin Xavier
QP table handling is synchronized with destroy QP and Async event from the HW. The same needs to be synchronized during create_qp also. Use the same lock in create_qp also. Fixes: 76d3ddff7153 ("RDMA/bnxt_re: synchronize the qp-handle table array") Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error") Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-6-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19RDMA/bnxt_re: Fix MSN table size for variable wqe modeDamodharam Ammepalli
For variable size wqe mode, the MSN table size should be half the size of the SQ depth. Fixing this to avoid wrap around problems in the retransmission path. Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19RDMA/bnxt_re: Add send queue size check for variable wqeDamodharam Ammepalli
For the fixed WQE case, HW supports 0xFFFF WQEs. For variable Size WQEs, HW treats this number as the 16 bytes slots. The maximum supported WQEs needs to be adjusted based on the number of slots. Set a maximum WQE limit for variable WQE scenario. Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19RDMA/bnxt_re: Disable use of reserved wqesKalesh AP
Disabling the reserved wqes logic for Gen P5/P7 devices because this workaround is required only for legacy devices. Fixes: ecb53febfcad ("RDMA/bnxt_en: Enable RDMA driver support for 57500 chip") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19RDMA/bnxt_re: Fix max_qp_wrs reportedSelvin Xavier
While creating qps, driver adds one extra entry to the sq size passed by the ULPs in order to avoid queue full condition. When ULPs creates QPs with max_qp_wr reported, driver creates QP with 1 more than the max_wqes supported by HW. Create QP fails in this case. To avoid this error, reduce 1 entry in max_qp_wqes and report it to the stack. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19Merge tag 'thunderbolt-for-v6.13-rc4' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus Mika writes: thunderbolt: Fixes for v6.13-rc4 This includes following USB4/Thunderbolt fixes for v6.13-rc4: - Add Intel Panther Lake PCI IDs - Do not show nvm_version for retimers that are not supported - Fix redrive mode handling. All these have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.13-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Improve redrive mode handling thunderbolt: Don't display nvm_version unless upgrade supported thunderbolt: Add support for Intel Panther Lake-M/P
2024-12-19regulator: rename regulator-uv-survival-time-ms according to DT bindingAhmad Fatoum
The regulator bindings don't document regulator-uv-survival-time-ms, but the more descriptive regulator-uv-less-critical-window-ms instead. Looking back at v3[1] and v4[2] of the series adding the support, the property was indeed renamed between these patch series, but unfortunately the rename only made it into the DT bindings with the driver code still using the old name. Let's therefore rename the property in the driver code to follow suit. This will break backwards compatibility, but there are no upstream device trees using the property and we never documented the old name of the property anyway. ¯\_(ツ)_/¯" [1]: https://lore.kernel.org/all/20231025084614.3092295-7-o.rempel@pengutronix.de/ [2]: https://lore.kernel.org/all/20231026144824.4065145-5-o.rempel@pengutronix.de/ Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://patch.msgid.link/20241218-regulator-uv-survival-time-ms-rename-v1-1-6cac9c3c75da@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-19ASoC: mediatek: disable buffer pre-allocationChen-Yu Tsai
On Chromebooks based on Mediatek MT8195 or MT8188, the audio frontend (AFE) is limited to accessing a very small window (1 MiB) of memory, which is described as a reserved memory region in the device tree. On these two platforms, the maximum buffer size is given as 512 KiB. The MediaTek common code uses the same value for preallocations. This means that only the first two PCM substreams get preallocations, and then the whole space is exhausted, barring any other substreams from working. Since the substreams used are not always the first two, this means audio won't work correctly. This is observed on the MT8188 Geralt Chromebooks, on which the "mediatek,dai-link" property was dropped when it was upstreamed. That property causes the driver to only register the PCM substreams listed in the property, and in the order given. Instead of trying to compute an optimal value and figuring out which streams are used, simply disable preallocation. The PCM buffers are managed by the core and are allocated and released on the fly. There should be no impact to any of the other MediaTek platforms. Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20241219105303.548437-1-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-19net: mctp: handle skb cleanup on sock_queue failuresJeremy Kerr
Currently, we don't use the return value from sock_queue_rcv_skb, which means we may leak skbs if a message is not successfully queued to a socket. Instead, ensure that we're freeing the skb where the sock hasn't otherwise taken ownership of the skb by adding checks on the sock_queue_rcv_skb() to invoke a kfree on failure. In doing so, rather than using the 'rc' value to trigger the kfree_skb(), use the skb pointer itself, which is more explicit. Also, add a kunit test for the sock delivery failure cases. Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20241218-mctp-next-v2-1-1c1729645eaa@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19net: mdiobus: fix an OF node reference leakJoe Hattori
fwnode_find_mii_timestamper() calls of_parse_phandle_with_fixed_args() but does not decrement the refcount of the obtained OF node. Add an of_node_put() call before returning from the function. This bug was detected by an experimental static analysis tool that I am developing. Fixes: bc1bee3b87ee ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241218035106.1436405-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-19Merge tag 'drm-intel-fixes-2024-12-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Reset engine utilization buffer before registration (Umesh Nerlige Ramappa) - Ensure busyness counter increases motonically (Umesh Nerlige Ramappa) - Accumulate active runtime on gt reset (Umesh Nerlige Ramappa) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z2LppUZudGKXwWjW@linux
2024-12-19RDMA/siw: Remove direct link to net_deviceBernard Metzler
Do not manage a per device direct link to net_device. Rely on associated ib_devices net_device management, not doubling the effort locally. A badly managed local link to net_device was causing a 'KASAN: slab-use-after-free' exception during siw_query_port() call. Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Reported-by: syzbot+4b87489410b4efd181bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b87489410b4efd181bf Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://patch.msgid.link/20241212151848.564872-1-bmt@zurich.ibm.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19RDMA/nldev: Set error code in rdma_nl_notify_eventChiara Meiohas
In case of error set the error code before the goto. Fixes: 6ff57a2ea7c2 ("RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-rdma/a84a2fc3-33b6-46da-a1bd-3343fa07eaf9@stanley.mountain/ Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Link: https://patch.msgid.link/13eb25961923f5de9eb9ecbbc94e26113d6049ef.1733815944.git.leonro@nvidia.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-19Merge tag 'nf-24-12-19' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following series contains two fixes for Netfilter/IPVS: 1) Possible build failure in IPVS on systems with less than 512MB memory due to incorrect use of clamp(), from David Laight. 2) Fix bogus lockdep nesting splat with ipset list:set type, from Phil Sutter. netfilter pull request 24-12-19 * tag 'nf-24-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ipset: Fix for recursive locking warning ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems ==================== Link: https://patch.msgid.link/20241218234137.1687288-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-18octeontx2-pf: fix error handling of devlink port in rvu_rep_create()Harshit Mogalapalli
Unregister the devlink port when register_netdev() fails. Fixes: 9ed0343f561e ("octeontx2-pf: Add devlink port support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://patch.msgid.link/20241217052326.1086191-2-harshit.m.mogalapalli@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18octeontx2-pf: fix netdev memory leak in rvu_rep_create()Harshit Mogalapalli
When rvu_rep_devlink_port_register() fails, free_netdev(ndev) for this incomplete iteration before going to "exit:" label. Fixes: 9ed0343f561e ("octeontx2-pf: Add devlink port support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://patch.msgid.link/20241217052326.1086191-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18psample: adjust size if rate_as_probability is setAdrian Moreno
If PSAMPLE_ATTR_SAMPLE_PROBABILITY flag is to be sent, the available size for the packet data has to be adjusted accordingly. Also, check the error code returned by nla_put_flag. Fixes: 7b1b2b60c63f ("net: psample: allow using rate as probability") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241217113739.3929300-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18netdev-genl: avoid empty messages in queue dumpJakub Kicinski
Empty netlink responses from do() are not correct (as opposed to dump() where not dumping anything is perfectly fine). We should return an error if the target object does not exist, in this case if the netdev is down it has no queues. Fixes: 6b6171db7fc8 ("netdev-genl: Add netlink framework functions for queue") Reported-by: syzbot+0a884bc2d304ce4af70f@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241218022508.815344-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18net: dsa: restore dsa_software_vlan_untag() ability to operate on ↵Vladimir Oltean
VLAN-untagged traffic Robert Hodaszi reports that locally terminated traffic towards VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing the same symptoms as for commit 1f9fc48fd302 ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"). For context, the set merged as "VLAN fixes for Ocelot driver": https://lore.kernel.org/netdev/20240815000707.2006121-1-vladimir.oltean@nxp.com/ was developed in a slightly different form earlier this year, in January. Initially, the switch was unconditionally configured to set OCELOT_ES0_TAG when using ocelot-8021q, regardless of port operating mode. This led to the situation where VLAN-unaware bridge ports would always push their PVID - see ocelot_vlan_unaware_pvid() - a negligible value anyway - into RX packets. To strip this in software, we would have needed DSA to know what private VID the switch chose for VLAN-unaware bridge ports, and pushed into the packets. This was implemented downstream, and a remnant of it remains in the form of a comment mentioning ds->ops->get_private_vid(), as something which would maybe need to be considered in the future. However, for upstream, it was deemed inappropriate, because it would mean introducing yet another behavior for stripping VLAN tags from VLAN-unaware bridge ports, when one already existed (ds->untag_bridge_pvid). The latter has been marked as obsolete along with an explanation why it is logically broken, but still, it would have been confusing. So, for upstream, felix_update_tag_8021q_rx_rule() was developed, which essentially changed the state of affairs from "Felix with ocelot-8021q delivers all packets as VLAN-tagged towards the CPU" into "Felix with ocelot-8021q delivers all packets from VLAN-aware bridge ports towards the CPU". This was done on the premise that in VLAN-unaware mode, there's nothing useful in the VLAN tags, and we can avoid introducing ds->ops->get_private_vid() in the DSA receive path if we configure the switch to not push those VLAN tags into packets in the first place. Unfortunately, and this is when the trainwreck started, the selftests developed initially and posted with the series were not re-ran. dsa_software_vlan_untag() was initially written given the assumption that users of this feature would send _all_ traffic as VLAN-tagged. It was only partially adapted to the new scheme, by removing ds->ops->get_private_vid(), which also used to be necessary in standalone ports mode. Where the trainwreck became even worse is that I had a second opportunity to think about this, when the dsa_software_vlan_untag() logic change initially broke sja1105, in commit 1f9fc48fd302 ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"). I did not connect the dots that it also breaks ocelot-8021q, for pretty much the same reason that not all received packets will be VLAN-tagged. To be compatible with the optimized Felix control path which runs felix_update_tag_8021q_rx_rule() to only push VLAN tags when useful (in VLAN-aware mode), we need to restore the old dsa_software_vlan_untag() logic. The blamed commit introduced the assumption that dsa_software_vlan_untag() will see only VLAN-tagged packets, assumption which is false. What corrupts RX traffic is the fact that we call skb_vlan_untag() on packets which are not VLAN-tagged in the first place. Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Reported-by: Robert Hodaszi <robert.hodaszi@digi.com> Closes: https://lore.kernel.org/netdev/20241215163334.615427-1-robert.hodaszi@digi.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241216135059.1258266-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18Merge branch '200GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== idpf: trigger SW interrupt when exiting wb_on_itr mode Joshua Hay says: This patch series introduces SW triggered interrupt support for idpf, then uses said interrupt to fix a race condition between completion writebacks and re-enabling interrupts. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: idpf: trigger SW interrupt when exiting wb_on_itr mode idpf: add support for SW triggered interrupts ==================== Link: https://patch.msgid.link/20241217225715.4005644-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18selftests: openvswitch: fix tcpdump executionAdrian Moreno
Fix the way tcpdump is executed by: - Using the right variable for the namespace. Currently the use of the empty "ns" makes the command fail. - Waiting until it starts to capture to ensure the interesting traffic is caught on slow systems. - Using line-buffered output to ensure logs are available when the test is paused with "-p". Otherwise the last chunk of data might only be written when tcpdump is killed. Fixes: 74cc26f416b9 ("selftests: openvswitch: add interface support") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://patch.msgid.link/20241217211652.483016-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18mm: huge_memory: handle strsep not finding delimiterLeo Stone
split_huge_pages_write() does not handle the case where strsep finds no delimiter in the given string and sets the input buffer to NULL, which allows this reproducer to trigger a protection fault. Link: https://lkml.kernel.org/r/20241216042752.257090-2-leocstone@gmail.com Signed-off-by: Leo Stone <leocstone@gmail.com> Reported-by: syzbot+8a3da2f1bbf59227c289@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8a3da2f1bbf59227c289 Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUGSuren Baghdasaryan
It was recently noticed that set_codetag_empty() might be used not only to mark NULL alloctag references as empty to avoid warnings but also to reset valid tags (in clear_page_tag_ref()). Since set_codetag_empty() is defined as NOOP for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n, such use of set_codetag_empty() leads to subtle bugs. Fix set_codetag_empty() for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n to reset the tag reference. Link: https://lkml.kernel.org/r/20241130001423.1114965-2-surenb@google.com Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: David Wang <00107082@163.com> Closes: https://lore.kernel.org/lkml/20241124074318.399027-1-00107082@163.com/ Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18alloc_tag: fix module allocation tags populated area calculationSuren Baghdasaryan
vm_module_tags_populate() calculation of the populated area assumes that area starts at a page boundary and therefore when new pages are allocation, the end of the area is page-aligned as well. If the start of the area is not page-aligned then allocating a page and incrementing the end of the area by PAGE_SIZE leads to an area at the end but within the area boundary which is not populated. Accessing this are will lead to a kernel panic. Fix the calculation by down-aligning the start of the area and using that as the location allocated pages are mapped to. [gehao@kylinos.cn: fix vm_module_tags_populate's KASAN poisoning logic] Link: https://lkml.kernel.org/r/20241205170528.81000-1-hao.ge@linux.dev [gehao@kylinos.cn: fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled] Link: https://lkml.kernel.org/r/20241212072126.134572-1-hao.ge@linux.dev Link: https://lkml.kernel.org/r/20241130001423.1114965-1-surenb@google.com Fixes: 0f9b685626da ("alloc_tag: populate memory for module tags as needed") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202411132111.6a221562-lkp@intel.com Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm/codetag: clear tags before swapDavid Wang
When CONFIG_MEM_ALLOC_PROFILING_DEBUG is set, kernel WARN would be triggered when calling __alloc_tag_ref_set() during swap: alloc_tag was not cleared (got tag for mm/filemap.c:1951) WARNING: CPU: 0 PID: 816 at ./include/linux/alloc_tag.h... Clear code tags before swap can fix the warning. And this patch also fix a potential invalid address dereference in alloc_tag_add_check() when CONFIG_MEM_ALLOC_PROFILING_DEBUG is set and ref->ct is CODETAG_EMPTY, which is defined as ((void *)1). Link: https://lkml.kernel.org/r/20241213013332.89910-1-00107082@163.com Fixes: 51f43d5d82ed ("mm/codetag: swap tags when migrate pages") Signed-off-by: David Wang <00107082@163.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202412112227.df61ebb-lkp@intel.com Acked-by: Suren Baghdasaryan <surenb@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm/vmstat: fix a W=1 clang compiler warningBart Van Assche
Fix the following clang compiler warning that is reported if the kernel is built with W=1: ./include/linux/vmstat.h:518:36: error: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Werror,-Wenum-enum-conversion] 518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ Link: https://lkml.kernel.org/r/20241212213126.1269116-1-bvanassche@acm.org Fixes: 9d7ea9a297e6 ("mm/vmstat: add helpers to get vmstat item names for each enum type") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm: convert partially_mapped set/clear operations to be atomicUsama Arif
Other page flags in the 2nd page, like PG_hwpoison and PG_anon_exclusive can get modified concurrently. Changes to other page flags might be lost if they are happening at the same time as non-atomic partially_mapped operations. Hence, make partially_mapped operations atomic. Link: https://lkml.kernel.org/r/20241212183351.1345389-1-usamaarif642@gmail.com Fixes: 8422acdc97ed ("mm: introduce a pageflag for partially mapped folios") Reported-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/all/e53b04ad-1827-43a2-a1ab-864c7efecf6e@redhat.com/ Signed-off-by: Usama Arif <usamaarif642@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Barry Song <baohua@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18nilfs2: fix buffer head leaks in calls to truncate_inode_pages()Ryusuke Konishi
When block_invalidatepage was converted to block_invalidate_folio, the fallback to block_invalidatepage in folio_invalidate() if the address_space_operations method invalidatepage (currently invalidate_folio) was not set, was removed. Unfortunately, some pseudo-inodes in nilfs2 use empty_aops set by inode_init_always_gfp() as is, or explicitly set it to address_space_operations. Therefore, with this change, block_invalidatepage() is no longer called from folio_invalidate(), and as a result, the buffer_head structures attached to these pages/folios are no longer freed via try_to_free_buffers(). Thus, these buffer heads are now leaked by truncate_inode_pages(), which cleans up the page cache from inode evict(), etc. Three types of caches use empty_aops: gc inode caches and the DAT shadow inode used by GC, and b-tree node caches. Of these, b-tree node caches explicitly call invalidate_mapping_pages() during cleanup, which involves calling try_to_free_buffers(), so the leak was not visible during normal operation but worsened when GC was performed. Fix this issue by using address_space_operations with invalidate_folio set to block_invalidate_folio instead of empty_aops, which will ensure the same behavior as before. Link: https://lkml.kernel.org/r/20241212164556.21338-1-konishi.ryusuke@gmail.com Fixes: 7ba13abbd31e ("fs: Turn block_invalidatepage into block_invalidate_folio") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> [5.18+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18vmalloc: fix accounting with i915Matthew Wilcox (Oracle)
If the caller of vmap() specifies VM_MAP_PUT_PAGES (currently only the i915 driver), we will decrement nr_vmalloc_pages and MEMCG_VMALLOC in vfree(). These counters are incremented by vmalloc() but not by vmap() so this will cause an underflow. Check the VM_MAP_PUT_PAGES flag before decrementing either counter. Link: https://lkml.kernel.org/r/20241211202538.168311-1-willy@infradead.org Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in ↵David Hildenbrand
split_large_buddy() In split_large_buddy(), we might call pfn_to_page() on a PFN that might not exist. In corner cases, such as when freeing the highest pageblock in the last memory section, this could result with CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_EXTREME in __pfn_to_section() returning NULL and and __section_mem_map_addr() dereferencing that NULL pointer. Let's fix it, and avoid doing a pfn_to_page() call for the first iteration, where we already have the page. So far this was found by code inspection, but let's just CC stable as the fix is easy. Link: https://lkml.kernel.org/r/20241210093437.174413-1-david@redhat.com Fixes: fd919a85cd55 ("mm: page_isolation: prepare for hygienic freelists") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Vlastimil Babka <vbabka@suse.cz> Closes: https://lkml.kernel.org/r/e1a898ba-a717-4d20-9144-29df1a6c8813@suse.cz Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18fork: avoid inappropriate uprobe access to invalid mmLorenzo Stoakes
If dup_mmap() encounters an issue, currently uprobe is able to access the relevant mm via the reverse mapping (in build_map_info()), and if we are very unlucky with a race window, observe invalid XA_ZERO_ENTRY state which we establish as part of the fork error path. This occurs because uprobe_write_opcode() invokes anon_vma_prepare() which in turn invokes find_mergeable_anon_vma() that uses a VMA iterator, invoking vma_iter_load() which uses the advanced maple tree API and thus is able to observe XA_ZERO_ENTRY entries added to dup_mmap() in commit d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()"). This change was made on the assumption that only process tear-down code would actually observe (and make use of) these values. However this very unlikely but still possible edge case with uprobes exists and unfortunately does make these observable. The uprobe operation prevents races against the dup_mmap() operation via the dup_mmap_sem semaphore, which is acquired via uprobe_start_dup_mmap() and dropped via uprobe_end_dup_mmap(), and held across register_for_each_vma() prior to invoking build_map_info() which does the reverse mapping lookup. Currently these are acquired and dropped within dup_mmap(), which exposes the race window prior to error handling in the invoking dup_mm() which tears down the mm. We can avoid all this by just moving the invocation of uprobe_start_dup_mmap() and uprobe_end_dup_mmap() up a level to dup_mm() and only release this lock once the dup_mmap() operation succeeds or clean up is done. This means that the uprobe code can never observe an incompletely constructed mm and resolves the issue in this case. Link: https://lkml.kernel.org/r/20241210172412.52995-1-lorenzo.stoakes@oracle.com Fixes: d24062914837 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: syzbot+2d788f4f7cb660dac4b7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6756d273.050a0220.2477f.003d.GAE@google.com/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18nilfs2: prevent use of deleted inodeEdward Adam Davis
syzbot reported a WARNING in nilfs_rmdir. [1] Because the inode bitmap is corrupted, an inode with an inode number that should exist as a ".nilfs" file was reassigned by nilfs_mkdir for "file0", causing an inode duplication during execution. And this causes an underflow of i_nlink in rmdir operations. The inode is used twice by the same task to unmount and remove directories ".nilfs" and "file0", it trigger warning in nilfs_rmdir. Avoid to this issue, check i_nlink in nilfs_iget(), if it is 0, it means that this inode has been deleted, and iput is executed to reclaim it. [1] WARNING: CPU: 1 PID: 5824 at fs/inode.c:407 drop_nlink+0xc4/0x110 fs/inode.c:407 ... Call Trace: <TASK> nilfs_rmdir+0x1b0/0x250 fs/nilfs2/namei.c:342 vfs_rmdir+0x3a3/0x510 fs/namei.c:4394 do_rmdir+0x3b5/0x580 fs/namei.c:4453 __do_sys_rmdir fs/namei.c:4472 [inline] __se_sys_rmdir fs/namei.c:4470 [inline] __x64_sys_rmdir+0x47/0x50 fs/namei.c:4470 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Link: https://lkml.kernel.org/r/20241209065759.6781-1-konishi.ryusuke@gmail.com Fixes: d25006523d0b ("nilfs2: pathname operations") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9260555647a5132edd48 Tested-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18zram: fix uninitialized ZRAM not releasing backing deviceKairui Song
Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever. Fix this by always reset the ZRAM fully on rmmod or reset store. Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18zram: refuse to use zero sized block device as backing deviceKairui Song
Patch series "zram: fix backing device setup issue", v2. This series fixes two bugs of backing device setting: - ZRAM should reject using a zero sized (or the uninitialized ZRAM device itself) as the backing device. - Fix backing device leaking when removing a uninitialized ZRAM device. This patch (of 2): Setting a zero sized block device as backing device is pointless, and one can easily create a recursive loop by setting the uninitialized ZRAM device itself as its own backing device by (zram0 is uninitialized): echo /dev/zram0 > /sys/block/zram0/backing_dev It's definitely a wrong config, and the module will pin itself, kernel should refuse doing so in the first place. By refusing to use zero sized device we avoided misuse cases including this one above. Link: https://lkml.kernel.org/r/20241209165717.94215-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20241209165717.94215-2-ryncsn@gmail.com Fixes: 013bf95a83ec ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm: use clear_user_(high)page() for arch with special user folio handlingZi Yan
Some architectures have special handling after clearing user folios: architectures, which set cpu_dcache_is_aliasing() to true, require flushing dcache; arc, which sets cpu_icache_is_aliasing() to true, changes folio->flags to make icache coherent to dcache. So __GFP_ZERO using only clear_page() is not enough to zero user folios and clear_user_(high)page() must be used. Otherwise, user data will be corrupted. Fix it by always clearing user folios with clear_user_(high)page() when cpu_dcache_is_aliasing() is true or cpu_icache_is_aliasing() is true. Rename alloc_zeroed() to user_alloc_needs_zeroing() and invert the logic to clarify its intend. Link: https://lkml.kernel.org/r/20241209182326.2955963-2-ziy@nvidia.com Fixes: 5708d96da20b ("mm: avoid zeroing user movable page twice with init_on_alloc=1") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Closes: https://lore.kernel.org/linux-mm/CAMuHMdV1hRp_NtR5YnJo=HsfgKQeH91J537Gh4gKk3PFZhSkbA@mail.gmail.com/ Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Potapenko <glider@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm: introduce cpu_icache_is_aliasing() across all architecturesZi Yan
In commit eacd0e950dc2 ("ARC: [mm] Lazy D-cache flush (non aliasing VIPT)"), arc adds the need to flush dcache to make icache see the code page change. This also requires special handling for clear_user_(high)page(). Introduce cpu_icache_is_aliasing() to make MM code query special clear_user_(high)page() easier. This will be used by the following commit. Link: https://lkml.kernel.org/r/20241209182326.2955963-1-ziy@nvidia.com Fixes: 5708d96da20b ("mm: avoid zeroing user movable page twice with init_on_alloc=1") Signed-off-by: Zi Yan <ziy@nvidia.com> Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Potapenko <glider@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm: add RCU annotation to pte_offset_map(_lock)Petr Malat
RCU lock is taken by ___pte_offset_map() unless it returns NULL. Add this information to its inline callers to avoid sparse warning about context imbalance in pte_unmap(). Link: https://lkml.kernel.org/r/20241210000604.700710-1-oss@malat.biz Signed-off-by: Petr Malat <oss@malat.biz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-18mm: correctly reference merged VMALorenzo Stoakes
On second merge attempt on mmap() we incorrectly discard the possibly merged VMA, resulting in a possible use-after-free (and most certainly a reference to the wrong VMA) in this instance in the subsequent __mmap_complete() invocation. Correct this mistake by reassigning vma correctly if a merge succeeds in this case. Link: https://lkml.kernel.org/r/20241206215229.244413-1-lorenzo.stoakes@oracle.com Fixes: 5ac87a885aec ("mm: defer second attempt at merge on mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by: Jann Horn <jannh@google.com> Reported-by: syzbot+91cf8da9401355f946c3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67536a25.050a0220.a30f1.0149.GAE@google.com/ Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>