summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2025-08-20fhandle: raise FILEID_IS_DIR in handle_typeChristian Brauner
commit cc678bf7aa9e2e6c2356fd7f955513c1bd7d4c97 upstream. Currently FILEID_IS_DIR is raised in fh_flags which is wrong. Raise it in handle->handle_type were it's supposed to be. Link: https://lore.kernel.org/20250624-work-pidfs-fhandle-v2-1-d02a04858fe3@kernel.org Fixes: c374196b2b9f ("fs: name_to_handle_at() support for "explicit connectable" file handles") Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Cc: stable@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20smb: client: remove redundant lstrp update in negotiate protocolWang Zhaolong
commit e19d8dd694d261ac26adb2a26121a37c107c81ad upstream. Commit 34331d7beed7 ("smb: client: fix first command failure during re-negotiation") addressed a race condition by updating lstrp before entering negotiate state. However, this approach may have some unintended side effects. The lstrp field is documented as "when we got last response from this server", and updating it before actually receiving a server response could potentially affect other mechanisms that rely on this timestamp. For example, the SMB echo detection logic also uses lstrp as a reference point. In scenarios with frequent user operations during reconnect states, the repeated calls to cifs_negotiate_protocol() might continuously update lstrp, which could interfere with the echo detection timing. Additionally, commit 266b5d02e14f ("smb: client: fix race condition in negotiate timeout by using more precise timing") introduced a dedicated neg_start field specifically for tracking negotiate start time. This provides a more precise solution for the original race condition while preserving the intended semantics of lstrp. Since the race condition is now properly handled by the neg_start mechanism, the lstrp update in cifs_negotiate_protocol() is no longer necessary and can be safely removed. Fixes: 266b5d02e14f ("smb: client: fix race condition in negotiate timeout by using more precise timing") Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20smb3: fix for slab out of bounds on mount to ksmbdSteve French
commit 7d34ec36abb84fdfb6632a0f2cbda90379ae21fc upstream. With KASAN enabled, it is possible to get a slab out of bounds during mount to ksmbd due to missing check in parse_server_interfaces() (see below): BUG: KASAN: slab-out-of-bounds in parse_server_interfaces+0x14ee/0x1880 [cifs] Read of size 4 at addr ffff8881433dba98 by task mount/9827 CPU: 5 UID: 0 PID: 9827 Comm: mount Tainted: G OE 6.16.0-rc2-kasan #2 PREEMPT(voluntary) Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.13.1 06/14/2019 Call Trace: <TASK> dump_stack_lvl+0x9f/0xf0 print_report+0xd1/0x670 __virt_addr_valid+0x22c/0x430 ? parse_server_interfaces+0x14ee/0x1880 [cifs] ? kasan_complete_mode_report_info+0x2a/0x1f0 ? parse_server_interfaces+0x14ee/0x1880 [cifs] kasan_report+0xd6/0x110 parse_server_interfaces+0x14ee/0x1880 [cifs] __asan_report_load_n_noabort+0x13/0x20 parse_server_interfaces+0x14ee/0x1880 [cifs] ? __pfx_parse_server_interfaces+0x10/0x10 [cifs] ? trace_hardirqs_on+0x51/0x60 SMB3_request_interfaces+0x1ad/0x3f0 [cifs] ? __pfx_SMB3_request_interfaces+0x10/0x10 [cifs] ? SMB2_tcon+0x23c/0x15d0 [cifs] smb3_qfs_tcon+0x173/0x2b0 [cifs] ? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs] ? cifs_get_tcon+0x105d/0x2120 [cifs] ? do_raw_spin_unlock+0x5d/0x200 ? cifs_get_tcon+0x105d/0x2120 [cifs] ? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs] cifs_mount_get_tcon+0x369/0xb90 [cifs] ? dfs_cache_find+0xe7/0x150 [cifs] dfs_mount_share+0x985/0x2970 [cifs] ? check_path.constprop.0+0x28/0x50 ? save_trace+0x54/0x370 ? __pfx_dfs_mount_share+0x10/0x10 [cifs] ? __lock_acquire+0xb82/0x2ba0 ? __kasan_check_write+0x18/0x20 cifs_mount+0xbc/0x9e0 [cifs] ? __pfx_cifs_mount+0x10/0x10 [cifs] ? do_raw_spin_unlock+0x5d/0x200 ? cifs_setup_cifs_sb+0x29d/0x810 [cifs] cifs_smb3_do_mount+0x263/0x1990 [cifs] Reported-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15smb: server: Fix extension string in ksmbd_extract_shortname()Thorsten Blum
commit 8e7d178d06e8937454b6d2f2811fa6a15656a214 upstream. In ksmbd_extract_shortname(), strscpy() is incorrectly called with the length of the source string (excluding the NUL terminator) rather than the size of the destination buffer. This results in "__" being copied to 'extension' rather than "___" (two underscores instead of three). Use the destination buffer size instead to ensure that the string "___" (three underscores) is copied correctly. Cc: stable@vger.kernel.org Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15ksmbd: limit repeated connections from clients with the same IPNamjae Jeon
commit e6bb9193974059ddbb0ce7763fa3882bd60d4dc3 upstream. Repeated connections from clients with the same IP address may exhaust the max connections and prevent other normal client connections. This patch limit repeated connections from clients with the same IP. Reported-by: tianshuo han <hantianshuo233@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15smb: client: default to nonativesocket under POSIX mountsPaulo Alcantara
commit 6b445309eec2bc0594f3e24c7777aeef891d386e upstream. SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse points. Cc: linux-cifs@vger.kernel.org Cc: Ralph Boehme <slow@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Reported-by: Matthew Richardson <m.richardson@ed.ac.uk> Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15smb: client: set symlink type as native for POSIX mountsPaulo Alcantara
commit a967e758f8e9d8ce5ef096743393df5e6e51644b upstream. SMB3.1.1 POSIX mounts require symlinks to be created natively with IO_REPARSE_TAG_SYMLINK reparse point. Cc: linux-cifs@vger.kernel.org Cc: Ralph Boehme <slow@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Reported-by: Matthew Richardson <m.richardson@ed.ac.uk> Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15smb: client: fix netns refcount leak after net_passive changesWang Zhaolong
commit 59b33fab4ca4d7dacc03367082777627e05d0323 upstream. After commit 5c70eb5c593d ("net: better track kernel sockets lifetime"), kernel sockets now use net_passive reference counting. However, commit 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"") restored the manual socket refcount manipulation without adapting to this new mechanism, causing a memory leak. The issue can be reproduced by[1]: 1. Creating a network namespace 2. Mounting and Unmounting CIFS within the namespace 3. Deleting the namespace Some memory leaks may appear after a period of time following step 3. unreferenced object 0xffff9951419f6b00 (size 256): comm "ip", pid 447, jiffies 4294692389 (age 14.730s) hex dump (first 32 bytes): 1b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 80 77 c2 44 51 99 ff ff .........w.DQ... backtrace: __kmem_cache_alloc_node+0x30e/0x3d0 __kmalloc+0x52/0x120 net_alloc_generic+0x1d/0x30 copy_net_ns+0x86/0x200 create_new_namespaces+0x117/0x300 unshare_nsproxy_namespaces+0x60/0xa0 ksys_unshare+0x148/0x360 __x64_sys_unshare+0x12/0x20 do_syscall_64+0x59/0x110 entry_SYSCALL_64_after_hwframe+0x78/0xe2 ... unreferenced object 0xffff9951442e7500 (size 32): comm "mount.cifs", pid 475, jiffies 4294693782 (age 13.343s) hex dump (first 32 bytes): 40 c5 38 46 51 99 ff ff 18 01 96 42 51 99 ff ff @.8FQ......BQ... 01 00 00 00 6f 00 c5 07 6f 00 d8 07 00 00 00 00 ....o...o....... backtrace: __kmem_cache_alloc_node+0x30e/0x3d0 kmalloc_trace+0x2a/0x90 ref_tracker_alloc+0x8e/0x1d0 sk_alloc+0x18c/0x1c0 inet_create+0xf1/0x370 __sock_create+0xd7/0x1e0 generic_ip_connect+0x1d4/0x5a0 [cifs] cifs_get_tcp_session+0x5d0/0x8a0 [cifs] cifs_mount_get_session+0x47/0x1b0 [cifs] dfs_mount_share+0xfa/0xa10 [cifs] cifs_mount+0x68/0x2b0 [cifs] cifs_smb3_do_mount+0x10b/0x760 [cifs] smb3_get_tree+0x112/0x2e0 [cifs] vfs_get_tree+0x29/0xf0 path_mount+0x2d4/0xa00 __se_sys_mount+0x165/0x1d0 Root cause: When creating kernel sockets, sk_alloc() calls net_passive_inc() for sockets with sk_net_refcnt=0. The CIFS code manually converts kernel sockets to user sockets by setting sk_net_refcnt=1, but doesn't call the corresponding net_passive_dec(). This creates an imbalance in the net_passive counter, which prevents the network namespace from being destroyed when its last user reference is dropped. As a result, the entire namespace and all its associated resources remain allocated. Timeline of patches leading to this issue: - commit ef7134c7fc48 ("smb: client: Fix use-after-free of network namespace.") in v6.12 fixed the original netns UAF by manually managing socket refcounts - commit e9f2517a3e18 ("smb: client: fix TCP timers deadlock after rmmod") in v6.13 attempted to use kernel sockets but introduced TCP timer issues - commit 5c70eb5c593d ("net: better track kernel sockets lifetime") in v6.14-rc5 introduced the net_passive mechanism with sk_net_refcnt_upgrade() for proper socket conversion - commit 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"") in v6.15-rc3 reverted to manual refcount management without adapting to the new net_passive changes Fix this by using sk_net_refcnt_upgrade() which properly handles the net_passive counter when converting kernel sockets to user sockets. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220343 [1] Fixes: 95d2b9f693ff ("Revert "smb: client: fix TCP timers deadlock after rmmod"") Cc: stable@vger.kernel.org Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15ksmbd: fix corrupted mtime and ctime in smb2_openNamjae Jeon
commit 4f8ff9486fd94b9d6a4932f2aefb9f2fc3bd0cf6 upstream. If STATX_BASIC_STATS flags are not given as an argument to vfs_getattr, It can not get ctime and mtime in kstat. This causes a problem showing mtime and ctime outdated from cifs.ko. File: /xfstest.test/foo Size: 4096 Blocks: 8 IO Block: 1048576 regular file Device: 0,65 Inode: 2033391 Links: 1 Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root) Context: system_u:object_r:cifs_t:s0 Access: 2025-07-23 22:15:30.136051900 +0100 Modify: 1970-01-01 01:00:00.000000000 +0100 Change: 1970-01-01 01:00:00.000000000 +0100 Birth: 2025-07-23 22:15:30.136051900 +0100 Cc: stable@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15ksmbd: fix Preauh_HashValue race conditionNamjae Jeon
commit 44a3059c4c8cc635a1fb2afd692d0730ca1ba4b6 upstream. If client send multiple session setup requests to ksmbd, Preauh_HashValue race condition could happen. There is no need to free sess->Preauh_HashValue at session setup phase. It can be freed together with session at connection termination phase. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27661 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15ksmbd: fix null pointer dereference error in generate_encryptionkeyNamjae Jeon
commit 9b493ab6f35178afd8d619800df9071992f715de upstream. If client send two session setups with krb5 authenticate to ksmbd, null pointer dereference error in generate_encryptionkey could happen. sess->Preauth_HashValue is set to NULL if session is valid. So this patch skip generate encryption key if session is valid. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27654 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15nfsd: avoid ref leak in nfsd_open_local_fh()NeilBrown
commit e5a73150776f18547ee685c9f6bfafe549714899 upstream. If two calls to nfsd_open_local_fh() race and both successfully call nfsd_file_acquire_local(), they will both get an extra reference to the net to accompany the file reference stored in *pnf. One of them will fail to store (using xchg()) the file reference in *pnf and will drop that reference but WON'T drop the accompanying reference to the net. This leak means that when the nfs server is shut down it will hang in nfsd_shutdown_net() waiting for &nn->nfsd_net_free_done. This patch adds the missing nfsd_net_put(). Reported-by: Mike Snitzer <snitzer@kernel.org> Fixes: e6f7e1487ab5 ("nfs_localio: simplify interface to nfsd for getting nfsd_file") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Tested-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15nfsd: don't set the ctime on delegated atime updatesJeff Layton
commit f9a348e0de19226fc3c7e81de7677d3fa2c4b2d8 upstream. Clients will typically precede a DELEGRETURN for a delegation with delegated timestamp with a SETATTR to set the timestamps on the server to match what the client has. knfsd implements this by using the nfsd_setattr() infrastructure, which will set ATTR_CTIME on any update that goes to notify_change(). This is problematic as it means that the client will get a spurious ctime update when updating the atime. POSIX unfortunately doesn't phrase it succinctly, but updating the atime due to reads should not update the ctime. In this case, the client is sending a SETATTR to update the atime on the server to match its latest value. The ctime should not be advanced in this case as that would incorrectly indicate a change to the inode. Fix this by not implicitly setting ATTR_CTIME when ATTR_DELEG is set in __nfsd_setattr(). The decoder for FATTR4_WORD2_TIME_DELEG_MODIFY already sets ATTR_CTIME, so this is sufficient to make it skip setting the ctime on atime-only updates. Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-15smb: client: return an error if rdma_connect does not return within 5 secondsStefan Metzmacher
[ Upstream commit 03537826f77f1c829d0593d211b38b9c876c1722 ] This matches the timeout for tcp connections. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: client: let recv_done() avoid touching data_transfer after cleanup/moveStefan Metzmacher
[ Upstream commit 24eff17887cb45c25a427e662dda352973c5c171 ] Calling enqueue_reassembly() and wake_up_interruptible(&info->wait_reassembly_queue) or put_receive_buffer() means the response/data_transfer pointer might get re-used by another thread, which means these should be the last operations before calling return. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: client: let recv_done() cleanup before notifying the callers.Stefan Metzmacher
[ Upstream commit bdd7afc6dca5e0ebbb75583484aa6ea9e03fbb13 ] We should call put_receive_buffer() before waking up the callers. For the internal error case of response->type being unexpected, we now also call smbd_disconnect_rdma_connection() instead of not waking up the callers at all. Note that the SMBD_TRANSFER_DATA case still has problems, which will be addressed in the next commit in order to make it easier to review this one. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: client: make sure we call ib_dma_unmap_single() only if we called ↵Stefan Metzmacher
ib_dma_map_single already [ Upstream commit 047682c370b6f18fec818b57b0ed8b501bdb79f8 ] In case of failures either ib_dma_map_single() might not be called yet or ib_dma_unmap_single() was already called. We should make sure put_receive_buffer() only calls ib_dma_unmap_single() if needed. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: client: remove separate empty_packet_queueStefan Metzmacher
[ Upstream commit 24b6afc36db748467e853e166a385df07e443859 ] There's no need to maintain two lists, we can just have a single list of receive buffers, which are free to use. It just added unneeded complexity and resulted in ib_dma_unmap_single() not being called from recv_done() for empty keepalive packets. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: server: let recv_done() avoid touching data_transfer after cleanup/moveStefan Metzmacher
[ Upstream commit a6c015b7ac2d8c5233337e5793f50d04fac17669 ] Calling enqueue_reassembly() and wake_up_interruptible(&t->wait_reassembly_queue) or put_receive_buffer() means the recvmsg/data_transfer pointer might get re-used by another thread, which means these should be the last operations before calling return. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: server: let recv_done() consistently call ↵Stefan Metzmacher
put_recvmsg/smb_direct_disconnect_rdma_connection [ Upstream commit cfe76fdbb9729c650f3505d9cfb2f70ddda2dbdc ] We should call put_recvmsg() before smb_direct_disconnect_rdma_connection() in order to call it before waking up the callers. In all error cases we should call smb_direct_disconnect_rdma_connection() in order to avoid stale connections. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: server: make sure we call ib_dma_unmap_single() only if we called ↵Stefan Metzmacher
ib_dma_map_single already [ Upstream commit afb4108c92898350e66b9a009692230bcdd2ac73 ] In case of failures either ib_dma_map_single() might not be called yet or ib_dma_unmap_single() was already called. We should make sure put_recvmsg() only calls ib_dma_unmap_single() if needed. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15smb: server: remove separate empty_recvmsg_queueStefan Metzmacher
[ Upstream commit 01027a62b508c48c762096f347de925eedcbd008 ] There's no need to maintain two lists, we can just have a single list of receive buffers, which are free to use. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS/localio: nfs_uuid_put() fix the wake up after unlinking the fileTrond Myklebust
[ Upstream commit 4ec752ce6debd5a0e7e0febf6bcf780ccda6ab5e ] Use store_release_wake_up() instead of wake_up_var_locked(), because the waiter cannot retake the nfs_uuid->lock. Acked-by: Mike Snitzer <snitzer@kernel.org> Tested-by: Mike Snitzer <snitzer@kernel.org> Suggested-by: NeilBrown <neil@brown.name> Link: https://lore.kernel.org/all/175262948827.2234665.1891349021754495573@noble.neil.brown.name/ Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS/localio: nfs_uuid_put() fix races with nfs_open/close_local_fh()Trond Myklebust
[ Upstream commit fdd015de767977f21892329af5e12276eb80375f ] In order for the wait in nfs_uuid_put() to be safe, it is necessary to ensure that nfs_uuid_add_file() doesn't add a new entry once the nfs_uuid->net has been NULLed out. Also fix up the wake_up_var_locked() / wait_var_event_spinlock() to both use the nfs_uuid address, since nfl, and &nfl->uuid could be used elsewhere. Acked-by: Mike Snitzer <snitzer@kernel.org> Tested-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/all/175262893035.2234665.1735173020338594784@noble.neil.brown.name/ Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS/localio: nfs_close_local_fh() fix check for file closedTrond Myklebust
[ Upstream commit e144d53cf21fb9d02626c669533788c6bdc61ce3 ] If the struct nfs_file_localio is closed, its list entry will be empty, but the nfs_uuid->files list might still contain other entries. Acked-by: Mike Snitzer <snitzer@kernel.org> Tested-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Fixes: 21fb44034695 ("nfs_localio: protect race between nfs_uuid_put() and nfs_close_local_fh()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS: Fixup allocation flags for nfsiod's __GFP_NORETRYBenjamin Coddington
[ Upstream commit 99765233ab42bf7a4950377ad7894dce8a5c0e60 ] If the NFS client is doing writeback from a workqueue context, avoid using __GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or PF_MEMALLOC_NOFS. The combination of these flags makes memory allocation failures much more likely. We've seen those allocation failures show up when the loopback driver is doing writeback from a workqueue to a file on NFS, where memory allocation failure results in errors or corruption within the loopback device's filesystem. Suggested-by: Trond Myklebust <trondmy@kernel.org> Fixes: 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/f83ac1155a4bc670f2663959a7a068571e06afd9.1752111622.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFSv4.2: another fix for listxattrOlga Kornievskaia
[ Upstream commit 9acb237deff7667b0f6b10fe6b1b70c4429ea049 ] Currently, when the server supports NFS4.1 security labels then security.selinux label in included twice. Instead, only add it when the server doesn't possess security label support. Fixes: 243fea134633 ("NFSv4.2: fix listxattr to return selinux security label") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Link: https://lore.kernel.org/r/20250722205641.79394-1-okorniev@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS: Fix filehandle bounds checking in nfs_fh_to_dentry()Trond Myklebust
[ Upstream commit ef93a685e01a281b5e2a25ce4e3428cf9371a205 ] The function needs to check the minimal filehandle length before it can access the embedded filehandle. Reported-by: zhangjian <zhangjian496@huawei.com> Fixes: 20fa19027286 ("nfs: add export operations") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15NFS: Fix wakeup of __nfs_lookup_revalidate() in unblock_revalidate()Trond Myklebust
[ Upstream commit 1db3a48e83bb64a70bf27263b7002585574a9c2d ] Use store_release_wake_up() to add the appropriate memory barrier before calling wake_up_var(&dentry->d_fsdata). Reported-by: Lukáš Hejtmánek<xhejtman@ics.muni.cz> Suggested-by: Santosh Pradhan <santosh.pradhan@gmail.com> Link: https://lore.kernel.org/all/18945D18-3EDB-4771-B019-0335CE671077@ics.muni.cz/ Fixes: 99bc9f2eb3f7 ("NFS: add barriers when testing for NFS_FSDATA_BLOCKED") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15pNFS/flexfiles: don't attempt pnfs on fatal DS errorsTigran Mkrtchyan
[ Upstream commit f06bedfa62d57f7b67d44aacd6badad2e13a803f ] When an applications get killed (SIGTERM/SIGINT) while pNFS client performs a connection to DS, client ends in an infinite loop of connect-disconnect. This source of the issue, it that flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error on nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by rpc_signal_task, but the error is treated as transient, thus retried. The issue is reproducible with Ctrl+C the following script(there should be ~1000 files in a directory, client should must not have any connections to DSes): ``` echo 3 > /proc/sys/vm/drop_caches for i in * do head -1 $i done ``` The change aims to propagate the nfs4_ff_layout_prepare_ds error state to the caller that can decide whatever this is a retryable error or not. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Link: https://lore.kernel.org/r/20250627071751.189663-1-tigran.mkrtchyan@desy.de Fixes: 260f32adb88d ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15exfat: fdatasync flag should be same like generic_write_sync()Zhengxu Zhang
[ Upstream commit 2f2d42a17b5a6711378d39df74f1f69a831c5d4e ] Test: androbench by default setting, use 64GB sdcard. the random write speed: without this patch 3.5MB/s with this patch 7MB/s After patch "11a347fb6cef", the random write speed decreased significantly. the .write_iter() interface had been modified, and check the differences with generic_file_write_iter(), when calling generic_write_sync() and exfat_file_write_iter() to call vfs_fsync_range(), the fdatasync flag is wrong, and make not use the fdatasync mode, and make random write speed decreased. So use generic_write_sync() instead of vfs_fsync_range(). Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Signed-off-by: Zhengxu Zhang <zhengxu.zhang@unisoc.com> Acked-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to trigger foreground gc during f2fs_map_blocks() in lfs modeChao Yu
[ Upstream commit 1005a3ca28e90c7a64fa43023f866b960a60f791 ] w/ "mode=lfs" mount option, generic/299 will cause system panic as below: ------------[ cut here ]------------ kernel BUG at fs/f2fs/segment.c:2835! Call Trace: <TASK> f2fs_allocate_data_block+0x6f4/0xc50 f2fs_map_blocks+0x970/0x1550 f2fs_iomap_begin+0xb2/0x1e0 iomap_iter+0x1d6/0x430 __iomap_dio_rw+0x208/0x9a0 f2fs_file_write_iter+0x6b3/0xfa0 aio_write+0x15d/0x2e0 io_submit_one+0x55e/0xab0 __x64_sys_io_submit+0xa5/0x230 do_syscall_64+0x84/0x2f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0010:new_curseg+0x70f/0x720 The root cause of we run out-of-space is: in f2fs_map_blocks(), f2fs may trigger foreground gc only if it allocates any physical block, it will be a little bit later when there is multiple threads writing data w/ aio/dio/bufio method in parallel, since we always use OPU in lfs mode, so f2fs_map_blocks() does block allocations aggressively. In order to fix this issue, let's give a chance to trigger foreground gc in prior to block allocation in f2fs_map_blocks(). Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to calculate dirty data during has_not_enough_free_secs()Chao Yu
[ Upstream commit e194e140ab7de2ce2782e64b9e086a43ca6ff4f2 ] In lfs mode, dirty data needs OPU, we'd better calculate lower_p and upper_p w/ them during has_not_enough_free_secs(), otherwise we may encounter out-of-space issue due to we missed to reclaim enough free section w/ foreground gc. Fixes: 36abef4e796d ("f2fs: introduce mode=lfs mount option") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to update upper_p in __get_secs_required() correctlyChao Yu
[ Upstream commit 6840faddb65683b4e7bd8196f177b038a1e19faf ] Commit 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()") missed to calculate upper_p w/ data_secs, fix it. Fixes: 1acd73edbbfe ("f2fs: fix to account dirty data in __get_secs_required()") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: vm_unmap_ram() may be called from an invalid contextJan Prusakowski
[ Upstream commit 08a7efc5b02a0620ae16aa9584060e980a69cb55 ] When testing F2FS with xfstests using UFS backed virtual disks the kernel complains sometimes that f2fs_release_decomp_mem() calls vm_unmap_ram() from an invalid context. Example trace from f2fs/007 test: f2fs/007 5s ... [12:59:38][ 8.902525] run fstests f2fs/007 [ 11.468026] BUG: sleeping function called from invalid context at mm/vmalloc.c:2978 [ 11.471849] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 68, name: irq/22-ufshcd [ 11.475357] preempt_count: 1, expected: 0 [ 11.476970] RCU nest depth: 0, expected: 0 [ 11.478531] CPU: 0 UID: 0 PID: 68 Comm: irq/22-ufshcd Tainted: G W 6.16.0-rc5-xfstests-ufs-g40f92e79b0aa #9 PREEMPT(none) [ 11.478535] Tainted: [W]=WARN [ 11.478536] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 11.478537] Call Trace: [ 11.478543] <TASK> [ 11.478545] dump_stack_lvl+0x4e/0x70 [ 11.478554] __might_resched.cold+0xaf/0xbe [ 11.478557] vm_unmap_ram+0x21/0xb0 [ 11.478560] f2fs_release_decomp_mem+0x59/0x80 [ 11.478563] f2fs_free_dic+0x18/0x1a0 [ 11.478565] f2fs_finish_read_bio+0xd7/0x290 [ 11.478570] blk_update_request+0xec/0x3b0 [ 11.478574] ? sbitmap_queue_clear+0x3b/0x60 [ 11.478576] scsi_end_request+0x27/0x1a0 [ 11.478582] scsi_io_completion+0x40/0x300 [ 11.478583] ufshcd_mcq_poll_cqe_lock+0xa3/0xe0 [ 11.478588] ufshcd_sl_intr+0x194/0x1f0 [ 11.478592] ufshcd_threaded_intr+0x68/0xb0 [ 11.478594] ? __pfx_irq_thread_fn+0x10/0x10 [ 11.478599] irq_thread_fn+0x20/0x60 [ 11.478602] ? __pfx_irq_thread_fn+0x10/0x10 [ 11.478603] irq_thread+0xb9/0x180 [ 11.478605] ? __pfx_irq_thread_dtor+0x10/0x10 [ 11.478607] ? __pfx_irq_thread+0x10/0x10 [ 11.478609] kthread+0x10a/0x230 [ 11.478614] ? __pfx_kthread+0x10/0x10 [ 11.478615] ret_from_fork+0x7e/0xd0 [ 11.478619] ? __pfx_kthread+0x10/0x10 [ 11.478621] ret_from_fork_asm+0x1a/0x30 [ 11.478623] </TASK> This patch modifies in_task() check inside f2fs_read_end_io() to also check if interrupts are disabled. This ensures that pages are unmapped asynchronously in an interrupt handler. Fixes: bff139b49d9f ("f2fs: handle decompress only post processing in softirq") Signed-off-by: Jan Prusakowski <jprusakowski@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to avoid out-of-boundary access in devs.pathChao Yu
[ Upstream commit 5661998536af52848cc4d52a377e90368196edea ] - touch /mnt/f2fs/012345678901234567890123456789012345678901234567890123 - truncate -s $((1024*1024*1024)) \ /mnt/f2fs/012345678901234567890123456789012345678901234567890123 - touch /mnt/f2fs/file - truncate -s $((1024*1024*1024)) /mnt/f2fs/file - mkfs.f2fs /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \ -c /mnt/f2fs/file - mount /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \ /mnt/f2fs/loop [16937.192225] F2FS-fs (loop0): Mount Device [ 0]: /mnt/f2fs/012345678901234567890123456789012345678901234567890123\xff\x01, 511, 0 - 3ffff [16937.192268] F2FS-fs (loop0): Failed to find devices If device path length equals to MAX_PATH_LEN, sbi->devs.path[] may not end up w/ null character due to path array is fully filled, So accidently, fields locate after path[] may be treated as part of device path, result in parsing wrong device path. struct f2fs_dev_info { ... char path[MAX_PATH_LEN]; ... }; Let's add one byte space for sbi->devs.path[] to store null character of device path string. Fixes: 3c62be17d4f5 ("f2fs: support multiple devices") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to avoid panic in f2fs_evict_inodeChao Yu
[ Upstream commit a509a55f8eecc8970b3980c6f06886bbff0e2f68 ] As syzbot [1] reported as below: R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450 R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520 </TASK> ---[ end trace 0000000000000000 ]--- ================================================================== BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 Read of size 8 at addr ffff88812d962278 by task syz-executor/564 CPU: 1 PID: 564 Comm: syz-executor Tainted: G W 6.1.129-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack+0x21/0x24 lib/dump_stack.c:88 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106 print_address_description+0x71/0x210 mm/kasan/report.c:316 print_report+0x4a/0x60 mm/kasan/report.c:427 kasan_report+0x122/0x150 mm/kasan/report.c:531 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 __list_del_entry include/linux/list.h:134 [inline] list_del_init include/linux/list.h:206 [inline] f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731 write_inode fs/fs-writeback.c:1460 [inline] __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159 block_operations fs/f2fs/checkpoint.c:1269 [inline] f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668 deactivate_locked_super+0x98/0x100 fs/super.c:332 deactivate_super+0xaf/0xe0 fs/super.c:363 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193 task_work_run+0x1c6/0x230 kernel/task_work.c:203 exit_task_work include/linux/task_work.h:39 [inline] do_exit+0x9fb/0x2410 kernel/exit.c:871 do_group_exit+0x210/0x2d0 kernel/exit.c:1021 __do_sys_exit_group kernel/exit.c:1032 [inline] __se_sys_exit_group kernel/exit.c:1030 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 RIP: 0033:0x7f28b1b8e169 Code: Unable to access opcode bytes at 0x7f28b1b8e13f. RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360 R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360 R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520 </TASK> Allocated by task 569: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737 slab_alloc_node mm/slub.c:3398 [inline] slab_alloc mm/slub.c:3406 [inline] __kmem_cache_alloc_lru mm/slub.c:3413 [inline] kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429 alloc_inode_sb include/linux/fs.h:3245 [inline] f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419 alloc_inode fs/inode.c:261 [inline] iget_locked+0x186/0x880 fs/inode.c:1373 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690 lookup_slow+0x57/0x70 fs/namei.c:1707 walk_component+0x2e6/0x410 fs/namei.c:1998 lookup_last fs/namei.c:2455 [inline] path_lookupat+0x180/0x490 fs/namei.c:2479 filename_lookup+0x1f0/0x500 fs/namei.c:2508 vfs_statx+0x10b/0x660 fs/stat.c:229 vfs_fstatat fs/stat.c:267 [inline] vfs_lstat include/linux/fs.h:3424 [inline] __do_sys_newlstat fs/stat.c:423 [inline] __se_sys_newlstat+0xd5/0x350 fs/stat.c:417 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 Freed by task 13: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1724 [inline] slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750 slab_free mm/slub.c:3661 [inline] kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562 i_callback+0x4c/0x70 fs/inode.c:250 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574 handle_softirqs+0x178/0x500 kernel/softirq.c:578 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164 kthread+0x270/0x310 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Last potentially related work creation: kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845 destroy_inode fs/inode.c:316 [inline] evict+0x7da/0x870 fs/inode.c:720 iput_final fs/inode.c:1834 [inline] iput+0x62b/0x830 fs/inode.c:1860 do_unlinkat+0x356/0x540 fs/namei.c:4397 __do_sys_unlink fs/namei.c:4438 [inline] __se_sys_unlink fs/namei.c:4436 [inline] __x64_sys_unlink+0x49/0x50 fs/namei.c:4436 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 The buggy address belongs to the object at ffff88812d961f20 which belongs to the cache f2fs_inode_cache of size 1200 The buggy address is located 856 bytes inside of 1200-byte region [ffff88812d961f20, ffff88812d9623d0) The buggy address belongs to the physical page: page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960 head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x4000000000010200(slab|head|zone=1) raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500 raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532 prep_new_page mm/page_alloc.c:2539 [inline] get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605 alloc_slab_page include/linux/gfp.h:-1 [inline] allocate_slab mm/slub.c:1939 [inline] new_slab+0xec/0x4b0 mm/slub.c:1992 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180 __slab_alloc+0x5e/0xa0 mm/slub.c:3279 slab_alloc_node mm/slub.c:3364 [inline] slab_alloc mm/slub.c:3406 [inline] __kmem_cache_alloc_lru mm/slub.c:3413 [inline] kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429 alloc_inode_sb include/linux/fs.h:3245 [inline] f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419 alloc_inode fs/inode.c:261 [inline] iget_locked+0x186/0x880 fs/inode.c:1373 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293 mount_bdev+0x2ae/0x3e0 fs/super.c:1443 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642 legacy_get_tree+0xea/0x190 fs/fs_context.c:632 vfs_get_tree+0x89/0x260 fs/super.c:1573 do_new_mount+0x25a/0xa20 fs/namespace.c:3056 page_owner free stack trace missing Memory state around the buggy address: ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== [1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000 This bug can be reproduced w/ the reproducer [2], once we enable CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below, so the direct reason of this bug is the same as the one below patch [3] fixed. kernel BUG at fs/f2fs/inode.c:857! RIP: 0010:f2fs_evict_inode+0x1204/0x1a20 Call Trace: <TASK> evict+0x32a/0x7a0 do_unlinkat+0x37b/0x5b0 __x64_sys_unlink+0xad/0x100 do_syscall_64+0x5a/0xb0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 RIP: 0010:f2fs_evict_inode+0x1204/0x1a20 [2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000 [3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org Tracepoints before panic: f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1 f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0 f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0 f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05 f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3 f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0 f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4 f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4 f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0 f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2 The root cause is: in the fuzzed image, dnode #8 belongs to inode #7, after inode #7 eviction, dnode #8 was dropped. However there is dirent that has ino #8, so, once we unlink file3, in f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page() will fail due to we can not load node #8, result in we missed to call f2fs_inode_synced() to clear inode dirty status. Let's fix this by calling f2fs_inode_synced() in error path of f2fs_evict_inode(). PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129, but it failed in v6.16-rc4, this is because the testcase will stop due to other corruption has been detected by f2fs: F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366] F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing") Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000 Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to avoid UAF in f2fs_sync_inode_meta()Chao Yu
[ Upstream commit 7c30d79930132466f5be7d0b57add14d1a016bda ] syzbot reported an UAF issue as below: [1] [2] [1] https://syzkaller.appspot.com/text?tag=CrashReport&x=16594c60580000 ================================================================== BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 Read of size 8 at addr ffff888100567dc8 by task kworker/u4:0/8 CPU: 1 PID: 8 Comm: kworker/u4:0 Tainted: G W 6.1.129-syzkaller-00017-g642656a36791 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Workqueue: writeback wb_workfn (flush-7:0) Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x151/0x1b7 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:316 [inline] print_report+0x158/0x4e0 mm/kasan/report.c:427 kasan_report+0x13c/0x170 mm/kasan/report.c:531 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 __list_del_entry include/linux/list.h:134 [inline] list_del_init include/linux/list.h:206 [inline] f2fs_inode_synced+0x100/0x2e0 fs/f2fs/super.c:1553 f2fs_update_inode+0x72/0x1c40 fs/f2fs/inode.c:588 f2fs_update_inode_page+0x135/0x170 fs/f2fs/inode.c:706 f2fs_write_inode+0x416/0x790 fs/f2fs/inode.c:734 write_inode fs/fs-writeback.c:1460 [inline] __writeback_single_inode+0x4cf/0xb80 fs/fs-writeback.c:1677 writeback_sb_inodes+0xb32/0x1910 fs/fs-writeback.c:1903 __writeback_inodes_wb+0x118/0x3f0 fs/fs-writeback.c:1974 wb_writeback+0x3da/0xa00 fs/fs-writeback.c:2081 wb_check_background_flush fs/fs-writeback.c:2151 [inline] wb_do_writeback fs/fs-writeback.c:2239 [inline] wb_workfn+0xbba/0x1030 fs/fs-writeback.c:2266 process_one_work+0x73d/0xcb0 kernel/workqueue.c:2299 worker_thread+0xa60/0x1260 kernel/workqueue.c:2446 kthread+0x26d/0x300 kernel/kthread.c:386 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> Allocated by task 298: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_alloc_info+0x1f/0x30 mm/kasan/generic.c:505 __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:333 kasan_slab_alloc include/linux/kasan.h:202 [inline] slab_post_alloc_hook+0x53/0x2c0 mm/slab.h:768 slab_alloc_node mm/slub.c:3421 [inline] slab_alloc mm/slub.c:3431 [inline] __kmem_cache_alloc_lru mm/slub.c:3438 [inline] kmem_cache_alloc_lru+0x102/0x270 mm/slub.c:3454 alloc_inode_sb include/linux/fs.h:3255 [inline] f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437 alloc_inode fs/inode.c:261 [inline] iget_locked+0x18c/0x7e0 fs/inode.c:1373 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486 f2fs_lookup+0x3c1/0xb50 fs/f2fs/namei.c:484 __lookup_slow+0x2b9/0x3e0 fs/namei.c:1689 lookup_slow+0x5a/0x80 fs/namei.c:1706 walk_component+0x2e7/0x410 fs/namei.c:1997 lookup_last fs/namei.c:2454 [inline] path_lookupat+0x16d/0x450 fs/namei.c:2478 filename_lookup+0x251/0x600 fs/namei.c:2507 vfs_statx+0x107/0x4b0 fs/stat.c:229 vfs_fstatat fs/stat.c:267 [inline] vfs_lstat include/linux/fs.h:3434 [inline] __do_sys_newlstat fs/stat.c:423 [inline] __se_sys_newlstat+0xda/0x7c0 fs/stat.c:417 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417 x64_sys_call+0x52/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 Freed by task 0: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:516 ____kasan_slab_free+0x131/0x180 mm/kasan/common.c:241 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:249 kasan_slab_free include/linux/kasan.h:178 [inline] slab_free_hook mm/slub.c:1745 [inline] slab_free_freelist_hook mm/slub.c:1771 [inline] slab_free mm/slub.c:3686 [inline] kmem_cache_free+0x291/0x560 mm/slub.c:3711 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1584 i_callback+0x4b/0x70 fs/inode.c:250 rcu_do_batch+0x552/0xbe0 kernel/rcu/tree.c:2297 rcu_core+0x502/0xf40 kernel/rcu/tree.c:2557 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574 handle_softirqs+0x1db/0x650 kernel/softirq.c:624 __do_softirq kernel/softirq.c:662 [inline] invoke_softirq kernel/softirq.c:479 [inline] __irq_exit_rcu+0x52/0xf0 kernel/softirq.c:711 irq_exit_rcu+0x9/0x10 kernel/softirq.c:723 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1118 [inline] sysvec_apic_timer_interrupt+0xa9/0xc0 arch/x86/kernel/apic/apic.c:1118 asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:691 Last potentially related work creation: kasan_save_stack+0x3b/0x60 mm/kasan/common.c:45 __kasan_record_aux_stack+0xb4/0xc0 mm/kasan/generic.c:486 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496 __call_rcu_common kernel/rcu/tree.c:2807 [inline] call_rcu+0xdc/0x10f0 kernel/rcu/tree.c:2926 destroy_inode fs/inode.c:316 [inline] evict+0x87d/0x930 fs/inode.c:720 iput_final fs/inode.c:1834 [inline] iput+0x616/0x690 fs/inode.c:1860 do_unlinkat+0x4e1/0x920 fs/namei.c:4396 __do_sys_unlink fs/namei.c:4437 [inline] __se_sys_unlink fs/namei.c:4435 [inline] __x64_sys_unlink+0x49/0x50 fs/namei.c:4435 x64_sys_call+0x289/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 The buggy address belongs to the object at ffff888100567a10 which belongs to the cache f2fs_inode_cache of size 1360 The buggy address is located 952 bytes inside of 1360-byte region [ffff888100567a10, ffff888100567f60) The buggy address belongs to the physical page: page:ffffea0004015800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100560 head:ffffea0004015800 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x4000000000010200(slab|head|zone=1) raw: 4000000000010200 0000000000000000 dead000000000122 ffff8881002c4d80 raw: 0000000000000000 0000000080160016 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Reclaimable, gfp_mask 0xd2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_RECLAIMABLE), pid 298, tgid 298 (syz-executor330), ts 26489303743, free_ts 0 set_page_owner include/linux/page_owner.h:33 [inline] post_alloc_hook+0x213/0x220 mm/page_alloc.c:2637 prep_new_page+0x1b/0x110 mm/page_alloc.c:2644 get_page_from_freelist+0x3a98/0x3b10 mm/page_alloc.c:4539 __alloc_pages+0x234/0x610 mm/page_alloc.c:5837 alloc_slab_page+0x6c/0xf0 include/linux/gfp.h:-1 allocate_slab mm/slub.c:1962 [inline] new_slab+0x90/0x3e0 mm/slub.c:2015 ___slab_alloc+0x6f9/0xb80 mm/slub.c:3203 __slab_alloc+0x5d/0xa0 mm/slub.c:3302 slab_alloc_node mm/slub.c:3387 [inline] slab_alloc mm/slub.c:3431 [inline] __kmem_cache_alloc_lru mm/slub.c:3438 [inline] kmem_cache_alloc_lru+0x149/0x270 mm/slub.c:3454 alloc_inode_sb include/linux/fs.h:3255 [inline] f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437 alloc_inode fs/inode.c:261 [inline] iget_locked+0x18c/0x7e0 fs/inode.c:1373 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486 f2fs_fill_super+0x5360/0x6dc0 fs/f2fs/super.c:4488 mount_bdev+0x282/0x3b0 fs/super.c:1445 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4743 legacy_get_tree+0xf1/0x190 fs/fs_context.c:632 page_owner free stack trace missing Memory state around the buggy address: ffff888100567c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888100567d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888100567d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888100567e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888100567e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== [2] https://syzkaller.appspot.com/text?tag=CrashLog&x=13654c60580000 [ 24.675720][ T28] audit: type=1400 audit(1745327318.732:72): avc: denied { write } for pid=298 comm="syz-executor399" name="/" dev="loop0" ino=3 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1 [ 24.705426][ T296] ------------[ cut here ]------------ [ 24.706608][ T28] audit: type=1400 audit(1745327318.732:73): avc: denied { remove_name } for pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1 [ 24.711550][ T296] WARNING: CPU: 0 PID: 296 at fs/f2fs/inode.c:847 f2fs_evict_inode+0x1262/0x1540 [ 24.734141][ T28] audit: type=1400 audit(1745327318.732:74): avc: denied { rename } for pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1 [ 24.742969][ T296] Modules linked in: [ 24.765201][ T28] audit: type=1400 audit(1745327318.732:75): avc: denied { add_name } for pid=298 comm="syz-executor399" name="bus" scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1 [ 24.768847][ T296] CPU: 0 PID: 296 Comm: syz-executor399 Not tainted 6.1.129-syzkaller-00017-g642656a36791 #0 [ 24.799506][ T296] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 [ 24.809401][ T296] RIP: 0010:f2fs_evict_inode+0x1262/0x1540 [ 24.815018][ T296] Code: 34 70 4a ff eb 0d e8 2d 70 4a ff 4d 89 e5 4c 8b 64 24 18 48 8b 5c 24 28 4c 89 e7 e8 78 38 03 00 e9 84 fc ff ff e8 0e 70 4a ff <0f> 0b 4c 89 f7 be 08 00 00 00 e8 7f 21 92 ff f0 41 80 0e 04 e9 61 [ 24.834584][ T296] RSP: 0018:ffffc90000db7a40 EFLAGS: 00010293 [ 24.840465][ T296] RAX: ffffffff822aca42 RBX: 0000000000000002 RCX: ffff888110948000 [ 24.848291][ T296] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000 [ 24.856064][ T296] RBP: ffffc90000db7bb0 R08: ffffffff822ac6a8 R09: ffffed10200b005d [ 24.864073][ T296] R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888100580000 [ 24.871812][ T296] R13: dffffc0000000000 R14: ffff88810fef4078 R15: 1ffff920001b6f5c The root cause is w/ a fuzzed image, f2fs may missed to clear FI_DIRTY_INODE flag for target inode, after f2fs_evict_inode(), the inode is still linked in sbi->inode_list[DIRTY_META] global list, once it triggers checkpoint, f2fs_sync_inode_meta() may access the released inode. In f2fs_evict_inode(), let's always call f2fs_inode_synced() to clear FI_DIRTY_INODE flag and drop inode from global dirty list to avoid this UAF issue. Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing") Closes: https://syzkaller.appspot.com/bug?extid=849174b2efaf0d8be6ba Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to check upper boundary for gc_no_zoned_gc_percentChao Yu
[ Upstream commit a919ae794ad2dc6d04b3eea2f9bc86332c1630cc ] This patch adds missing upper boundary check while setting gc_no_zoned_gc_percent via sysfs. Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to check upper boundary for gc_valid_thresh_ratioChao Yu
[ Upstream commit 7a96d1d73ce9de5041e891a623b722f900651561 ] This patch adds missing upper boundary check while setting gc_valid_thresh_ratio via sysfs. Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC") Cc: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to check upper boundary for value of gc_boost_zoned_gc_percentyohan.joung
[ Upstream commit 10dcaa56ef93f2a45e4c3fec27d8e1594edad110 ] to check the upper boundary when setting gc_boost_zoned_gc_percent Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent") Signed-off-by: yohan.joung <yohan.joung@sk.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix KMSAN uninit-value in extent_info usageAbinash Singh
[ Upstream commit 154467f4ad033473e5c903a03e7b9bca7df9a0fa ] KMSAN reported a use of uninitialized value in `__is_extent_mergeable()` and `__is_back_mergeable()` via the read extent tree path. The root cause is that `get_read_extent_info()` only initializes three fields (`fofs`, `blk`, `len`) of `struct extent_info`, leaving the remaining fields uninitialized. This leads to undefined behavior when those fields are accessed later, especially during extent merging. Fix it by zero-initializing the `extent_info` struct before population. Reported-by: syzbot+b8c1d60e95df65e827d4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b8c1d60e95df65e827d4 Fixes: 94afd6d6e525 ("f2fs: extent cache: support unaligned extent") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Abinash Singh <abinashsinghlalotra@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dicZhiguo Niu
[ Upstream commit 39868685c2a94a70762bc6d77dc81d781d05bff5 ] The decompress_io_ctx may be released asynchronously after I/O completion. If this file is deleted immediately after read, and the kworker of processing post_read_wq has not been executed yet due to high workloads, It is possible that the inode(f2fs_inode_info) is evicted and freed before it is used f2fs_free_dic. The UAF case as below: Thread A Thread B - f2fs_decompress_end_io - f2fs_put_dic - queue_work add free_dic work to post_read_wq - do_unlink - iput - evict - call_rcu This file is deleted after read. Thread C kworker to process post_read_wq - rcu_do_batch - f2fs_free_inode - kmem_cache_free inode is freed by rcu - process_scheduled_works - f2fs_late_free_dic - f2fs_free_dic - f2fs_release_decomp_mem read (dic->inode)->i_compress_algorithm This patch store compress_algorithm and sbi in dic to avoid inode UAF. In addition, the previous solution is deprecated in [1] may cause system hang. [1] https://lore.kernel.org/all/c36ab955-c8db-4a8b-a9d0-f07b5f426c3f@kernel.org Cc: Daeho Jeong <daehojeong@google.com> Fixes: bff139b49d9f ("f2fs: handle decompress only post processing in softirq") Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Baocong Liu <baocong.liu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: compress: change the first parameter of page_array_{alloc,free} to sbiZhiguo Niu
[ Upstream commit 8e2a9b656474d67c55010f2c003ea2cf889a19ff ] No logic changes, just cleanup and prepare for fixing the UAF issue in f2fs_free_dic. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Baocong Liu <baocong.liu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 39868685c2a9 ("f2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix to avoid invalid wait context issueChao Yu
[ Upstream commit 90d5c9ba3ed91950f1546bf123a7a57cd958b452 ] ============================= [ BUG: Invalid wait context ] 6.13.0-rc1 #84 Tainted: G O ----------------------------- cat/56160 is trying to lock: ffff888105c86648 (&cprc->stat_lock){+.+.}-{3:3}, at: update_general_status+0x32a/0x8c0 [f2fs] other info that might help us debug this: context-{5:5} 2 locks held by cat/56160: #0: ffff88810a002a98 (&p->lock){+.+.}-{4:4}, at: seq_read_iter+0x56/0x4c0 #1: ffffffffa0462638 (f2fs_stat_lock){....}-{2:2}, at: stat_show+0x29/0x1020 [f2fs] stack backtrace: CPU: 0 UID: 0 PID: 56160 Comm: cat Tainted: G O 6.13.0-rc1 #84 Tainted: [O]=OOT_MODULE Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Call Trace: <TASK> dump_stack_lvl+0x88/0xd0 dump_stack+0x14/0x20 __lock_acquire+0x8d4/0xbb0 lock_acquire+0xd6/0x300 _raw_spin_lock+0x38/0x50 update_general_status+0x32a/0x8c0 [f2fs] stat_show+0x50/0x1020 [f2fs] seq_read_iter+0x116/0x4c0 seq_read+0xfa/0x130 full_proxy_read+0x66/0x90 vfs_read+0xc4/0x350 ksys_read+0x74/0xf0 __x64_sys_read+0x1d/0x20 x64_sys_call+0x17d9/0x1b80 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x67/0x6f RIP: 0033:0x7f2ca53147e2 - seq_read - stat_show - raw_spin_lock_irqsave(&f2fs_stat_lock, flags) : f2fs_stat_lock is raw_spinlock_t type variable - update_general_status - spin_lock(&sbi->cprc_info.stat_lock); : stat_lock is spinlock_t type variable The root cause is the lock order is incorrect [1], we should not acquire spinlock_t lock after raw_spinlock_t lock, as if CONFIG_PREEMPT_LOCK is on, spinlock_t is implemented based on rtmutex, which can sleep after holding the lock. To fix this issue, let's use change f2fs_stat_lock lock type from raw_spinlock_t to spinlock_t, it's safe due to: - we don't need to use raw version of spinlock as the path is not performance sensitive. - we don't need to use irqsave version of spinlock as it won't be used in irq context. Quoted from [1]: "Extend lockdep to validate lock wait-type context. The current wait-types are: LD_WAIT_FREE, /* wait free, rcu etc.. */ LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */ LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */ LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */ Where lockdep validates that the current lock (the one being acquired) fits in the current wait-context (as generated by the held stack). This ensures that there is no attempt to acquire mutexes while holding spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In other words, its a more fancy might_sleep()." [1] https://lore.kernel.org/all/20200321113242.427089655@linutronix.de Fixes: 98237fcda4a2 ("f2fs: use spin_lock to avoid hang") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: fix bio memleak when committing super blockSheng Yong
[ Upstream commit 554d9b7242a73d701ce121ac81bb578a3fca538e ] When committing new super block, bio is allocated but not freed, and kmemleak complains: unreferenced object 0xffff88801d185600 (size 192): comm "kworker/3:2", pid 128, jiffies 4298624992 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 80 67 c3 00 81 88 ff ff .........g...... 01 08 06 00 00 00 00 00 00 00 00 00 01 00 00 00 ................ backtrace (crc 650ecdb1): kmem_cache_alloc_noprof+0x3a9/0x460 mempool_alloc_noprof+0x12f/0x310 bio_alloc_bioset+0x1e2/0x7e0 __f2fs_commit_super+0xe0/0x370 f2fs_commit_super+0x4ed/0x8c0 f2fs_record_error_work+0xc7/0x190 process_one_work+0x7db/0x1970 worker_thread+0x518/0xea0 kthread+0x359/0x690 ret_from_fork+0x34/0x70 ret_from_fork_asm+0x1a/0x30 The issue can be reproduced by: mount /dev/vda /mnt i=0 while :; do echo '[h]abc' > /sys/fs/f2fs/vda/extension_list echo '[h]!abc' > /sys/fs/f2fs/vda/extension_list echo scan > /sys/kernel/debug/kmemleak dmesg | grep "new suspected memory leaks" [ $? -eq 0 ] && break i=$((i + 1)) echo "$i" done umount /mnt Fixes: 5bcde4557862 ("f2fs: get rid of buffer_head use") Signed-off-by: Sheng Yong <shengyong1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15f2fs: turn off one_time when forcibly set to foreground GCDaeho Jeong
[ Upstream commit 8142daf8a53806689186ee255cc02f89af7f8890 ] one_time mode is only for background GC. So, we need to set it back to false when foreground GC is enforced. Fixes: 9748c2ddea4a ("f2fs: do FG_GC when GC boosting is required for zoned devices") Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15squashfs: fix incorrect argument to sizeof in kmalloc_array callColin Ian King
[ Upstream commit 97103dcec292b8688de142f7a48bd0d46038d3f6 ] The sizeof(void *) is the incorrect argument in the kmalloc_array call, it best to fix this by using sizeof(*cache_folios) instead. Fortunately the sizes of void* and folio* happen to be the same, so this has not shown up as a run time issue. [akpm@linux-foundation.org: fix build] Link: https://lkml.kernel.org/r/20250708142604.1891156-1-colin.i.king@gmail.com Fixes: 2e227ff5e272 ("squashfs: add optional full compressed block caching") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Chanho Min <chanho.min@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15squashfs: use folios in squashfs_bio_read_cached()Matthew Wilcox (Oracle)
[ Upstream commit c9e3fb050e9cb0d3a833b2c62b35ea42cdd81e89 ] Remove an access to page->mapping and a few calls to the old page-based APIs. This doesn't support large folios, but it's still a nice improvement. Link: https://lkml.kernel.org/r/20250612143903.2849289-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 97103dcec292 ("squashfs: fix incorrect argument to sizeof in kmalloc_array call") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15jfs: fix metapage reference count leak in dbAllocCtlZheng Yu
[ Upstream commit 856db37592021e9155384094e331e2d4589f28b1 ] In dbAllocCtl(), read_metapage() increases the reference count of the metapage. However, when dp->tree.budmin < 0, the function returns -EIO without calling release_metapage() to decrease the reference count, leading to a memory leak. Add release_metapage(mp) before the error return to properly manage the metapage reference count and prevent the leak. Fixes: a5f5e4698f8abbb25fe4959814093fb5bfa1aa9d ("jfs: fix shift-out-of-bounds in dbSplit") Signed-off-by: Zheng Yu <zheng.yu@northwestern.edu> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>