summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2025-02-08btrfs: do proper folio cleanup when run_delalloc_nocow() failedQu Wenruo
commit c2b47df81c8e20a8e8cd94f0d7df211137ae94ed upstream. [BUG] With CONFIG_DEBUG_VM set, test case generic/476 has some chance to crash with the following VM_BUG_ON_FOLIO(): BTRFS error (device dm-3): cow_file_range failed, start 1146880 end 1253375 len 106496 ret -28 BTRFS error (device dm-3): run_delalloc_nocow failed, start 1146880 end 1253375 len 106496 ret -28 page: refcount:4 mapcount:0 mapping:00000000592787cc index:0x12 pfn:0x10664 aops:btrfs_aops [btrfs] ino:101 dentry name(?):"f1774" flags: 0x2fffff80004028(uptodate|lru|private|node=0|zone=2|lastcpupid=0xfffff) page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio)) ------------[ cut here ]------------ kernel BUG at mm/page-writeback.c:2992! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 2 UID: 0 PID: 3943513 Comm: kworker/u24:15 Tainted: G OE 6.12.0-rc7-custom+ #87 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] pc : folio_clear_dirty_for_io+0x128/0x258 lr : folio_clear_dirty_for_io+0x128/0x258 Call trace: folio_clear_dirty_for_io+0x128/0x258 btrfs_folio_clamp_clear_dirty+0x80/0xd0 [btrfs] __process_folios_contig+0x154/0x268 [btrfs] extent_clear_unlock_delalloc+0x5c/0x80 [btrfs] run_delalloc_nocow+0x5f8/0x760 [btrfs] btrfs_run_delalloc_range+0xa8/0x220 [btrfs] writepage_delalloc+0x230/0x4c8 [btrfs] extent_writepage+0xb8/0x358 [btrfs] extent_write_cache_pages+0x21c/0x4e8 [btrfs] btrfs_writepages+0x94/0x150 [btrfs] do_writepages+0x74/0x190 filemap_fdatawrite_wbc+0x88/0xc8 start_delalloc_inodes+0x178/0x3a8 [btrfs] btrfs_start_delalloc_roots+0x174/0x280 [btrfs] shrink_delalloc+0x114/0x280 [btrfs] flush_space+0x250/0x2f8 [btrfs] btrfs_async_reclaim_data_space+0x180/0x228 [btrfs] process_one_work+0x164/0x408 worker_thread+0x25c/0x388 kthread+0x100/0x118 ret_from_fork+0x10/0x20 Code: 910a8021 a90363f7 a9046bf9 94012379 (d4210000) ---[ end trace 0000000000000000 ]--- [CAUSE] The first two lines of extra debug messages show the problem is caused by the error handling of run_delalloc_nocow(). E.g. we have the following dirtied range (4K blocksize 4K page size): 0 16K 32K |//////////////////////////////////////| | Pre-allocated | And the range [0, 16K) has a preallocated extent. - Enter run_delalloc_nocow() for range [0, 16K) Which found range [0, 16K) is preallocated, can do the proper NOCOW write. - Enter fallback_to_fow() for range [16K, 32K) Since the range [16K, 32K) is not backed by preallocated extent, we have to go COW. - cow_file_range() failed for range [16K, 32K) So cow_file_range() will do the clean up by clearing folio dirty, unlock the folios. Now the folios in range [16K, 32K) is unlocked. - Enter extent_clear_unlock_delalloc() from run_delalloc_nocow() Which is called with PAGE_START_WRITEBACK to start page writeback. But folios can only be marked writeback when it's properly locked, thus this triggered the VM_BUG_ON_FOLIO(). Furthermore there is another hidden but common bug that run_delalloc_nocow() is not clearing the folio dirty flags in its error handling path. This is the common bug shared between run_delalloc_nocow() and cow_file_range(). [FIX] - Clear folio dirty for range [@start, @cur_offset) Introduce a helper, cleanup_dirty_folios(), which will find and lock the folio in the range, clear the dirty flag and start/end the writeback, with the extra handling for the @locked_folio. - Introduce a helper to clear folio dirty, start and end writeback - Introduce a helper to record the last failed COW range end This is to trace which range we should skip, to avoid double unlocking. - Skip the failed COW range for the error handling CC: stable@vger.kernel.org Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08btrfs: do proper folio cleanup when cow_file_range() failedQu Wenruo
commit 06f364284794f149d2abc167c11d556cf20c954b upstream. [BUG] When testing with COW fixup marked as BUG_ON() (this is involved with the new pin_user_pages*() change, which should not result new out-of-band dirty pages), I hit a crash triggered by the BUG_ON() from hitting COW fixup path. This BUG_ON() happens just after a failed btrfs_run_delalloc_range(): BTRFS error (device dm-2): failed to run delalloc range, root 348 ino 405 folio 65536 submit_bitmap 6-15 start 90112 len 106496: -28 ------------[ cut here ]------------ kernel BUG at fs/btrfs/extent_io.c:1444! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 0 UID: 0 PID: 434621 Comm: kworker/u24:8 Tainted: G OE 6.12.0-rc7-custom+ #86 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs] pc : extent_writepage_io+0x2d4/0x308 [btrfs] lr : extent_writepage_io+0x2d4/0x308 [btrfs] Call trace: extent_writepage_io+0x2d4/0x308 [btrfs] extent_writepage+0x218/0x330 [btrfs] extent_write_cache_pages+0x1d4/0x4b0 [btrfs] btrfs_writepages+0x94/0x150 [btrfs] do_writepages+0x74/0x190 filemap_fdatawrite_wbc+0x88/0xc8 start_delalloc_inodes+0x180/0x3b0 [btrfs] btrfs_start_delalloc_roots+0x174/0x280 [btrfs] shrink_delalloc+0x114/0x280 [btrfs] flush_space+0x250/0x2f8 [btrfs] btrfs_async_reclaim_data_space+0x180/0x228 [btrfs] process_one_work+0x164/0x408 worker_thread+0x25c/0x388 kthread+0x100/0x118 ret_from_fork+0x10/0x20 Code: aa1403e1 9402f3ef aa1403e0 9402f36f (d4210000) ---[ end trace 0000000000000000 ]--- [CAUSE] That failure is mostly from cow_file_range(), where we can hit -ENOSPC. Although the -ENOSPC is already a bug related to our space reservation code, let's just focus on the error handling. For example, we have the following dirty range [0, 64K) of an inode, with 4K sector size and 4K page size: 0 16K 32K 48K 64K |///////////////////////////////////////| |#######################################| Where |///| means page are still dirty, and |###| means the extent io tree has EXTENT_DELALLOC flag. - Enter extent_writepage() for page 0 - Enter btrfs_run_delalloc_range() for range [0, 64K) - Enter cow_file_range() for range [0, 64K) - Function btrfs_reserve_extent() only reserved one 16K extent So we created extent map and ordered extent for range [0, 16K) 0 16K 32K 48K 64K |////////|//////////////////////////////| |<- OE ->|##############################| And range [0, 16K) has its delalloc flag cleared. But since we haven't yet submit any bio, involved 4 pages are still dirty. - Function btrfs_reserve_extent() returns with -ENOSPC Now we have to run error cleanup, which will clear all EXTENT_DELALLOC* flags and clear the dirty flags for the remaining ranges: 0 16K 32K 48K 64K |////////| | | | | Note that range [0, 16K) still has its pages dirty. - Some time later, writeback is triggered again for the range [0, 16K) since the page range still has dirty flags. - btrfs_run_delalloc_range() will do nothing because there is no EXTENT_DELALLOC flag. - extent_writepage_io() finds page 0 has no ordered flag Which falls into the COW fixup path, triggering the BUG_ON(). Unfortunately this error handling bug dates back to the introduction of btrfs. Thankfully with the abuse of COW fixup, at least it won't crash the kernel. [FIX] Instead of immediately unlocking the extent and folios, we keep the extent and folios locked until either erroring out or the whole delalloc range finished. When the whole delalloc range finished without error, we just unlock the whole range with PAGE_SET_ORDERED (and PAGE_UNLOCK for !keep_locked cases), with EXTENT_DELALLOC and EXTENT_LOCKED cleared. And the involved folios will be properly submitted, with their dirty flags cleared during submission. For the error path, it will be a little more complex: - The range with ordered extent allocated (range (1)) We only clear the EXTENT_DELALLOC and EXTENT_LOCKED, as the remaining flags are cleaned up by btrfs_mark_ordered_io_finished()->btrfs_finish_one_ordered(). For folios we finish the IO (clear dirty, start writeback and immediately finish the writeback) and unlock the folios. - The range with reserved extent but no ordered extent (range(2)) - The range we never touched (range(3)) For both range (2) and range(3) the behavior is not changed. Now even if cow_file_range() failed halfway with some successfully reserved extents/ordered extents, we will keep all folios clean, so there will be no future writeback triggered on them. CC: stable@vger.kernel.org Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08btrfs: output the reason for open_ctree() failureQu Wenruo
commit d0f038104fa37380e2a725e669508e43d0c503e9 upstream. There is a recent ML report that mounting a large fs backed by hardware RAID56 controller (with one device missing) took too much time, and systemd seems to kill the mount attempt. In that case, the only error message is: BTRFS error (device sdj): open_ctree failed There is no reason on why the failure happened, making it very hard to understand the reason. At least output the error number (in the particular case it should be -EINTR) to provide some clue. Link: https://lore.kernel.org/linux-btrfs/9b9c4d2810abcca2f9f76e32220ed9a90febb235.camel@scientia.org/ Reported-by: Christoph Anton Mitterer <calestyo@scientia.org> Cc: stable@vger.kernel.org Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08xfs: don't shut down the filesystem for media failures beyond end of logDarrick J. Wong
commit f4ed93037966aea07ae6b10ab208976783d24e2e upstream. If the filesystem has an external log device on pmem and the pmem reports a media error beyond the end of the log area, don't shut down the filesystem because we don't use that space. Cc: <stable@vger.kernel.org> # v6.0 Fixes: 6f643c57d57c56 ("xfs: implement ->notify_failure() for XFS") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08xfs: release the dquot buf outside of qli_lockDarrick J. Wong
commit 1aacd3fac248902ea1f7607f2d12b93929a4833b upstream. Lai Yi reported a lockdep complaint about circular locking: Chain exists of: &lp->qli_lock --> &bch->bc_lock --> &l->lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&l->lock); lock(&bch->bc_lock); lock(&l->lock); lock(&lp->qli_lock); I /think/ the problem here is that xfs_dquot_attach_buf during quotacheck will release the buffer while it's holding the qli_lock. Because this is a cached buffer, xfs_buf_rele_cached takes b_lock before decrementing b_hold. Other threads have taught lockdep that a locking dependency chain is bp->b_lock -> bch->bc_lock -> l(ru)->lock; and that another chain is l(ru)->lock -> lp->qli_lock. Hence we do not want to take b_lock while holding qli_lock. Reported-by: syzbot+3126ab3db03db42e7a31@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> # v6.13-rc3 Fixes: ca378189fdfa89 ("xfs: convert quotacheck to attach dquot buffers") Tested-by: syzbot+3126ab3db03db42e7a31@syzkaller.appspotmail.com Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08xfs: don't over-report free space or inodes in statvfsDarrick J. Wong
commit 4b8d867ca6e2fc6d152f629fdaf027053b81765a upstream. Emmanual Florac reports a strange occurrence when project quota limits are enabled, free space is lower than the remaining quota, and someone runs statvfs: # mkfs.xfs -f /dev/sda # mount /dev/sda /mnt -o prjquota # xfs_quota -x -c 'limit -p bhard=2G 55' /mnt # mkdir /mnt/dir # xfs_io -c 'chproj 55' -c 'chattr +P' -c 'stat -vvvv' /mnt/dir # fallocate -l 19g /mnt/a # df /mnt /mnt/dir Filesystem Size Used Avail Use% Mounted on /dev/sda 20G 20G 345M 99% /mnt /dev/sda 2.0G 0 2.0G 0% /mnt I think the bug here is that xfs_fill_statvfs_from_dquot unconditionally assigns to f_bfree without checking that the filesystem has enough free space to fill the remaining project quota. However, this is a longstanding behavior of xfs so it's unclear what to do here. Cc: <stable@vger.kernel.org> # v2.6.18 Fixes: 932f2c323196c2 ("[XFS] statvfs component of directory/project quota support, code originally by Glen.") Reported-by: Emmanuel Florac <eflorac@intellique.com> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08xfs: fix mount hang during primary superblock recovery failureLong Li
commit efebe42d95fbba91dca6e3e32cb9e0612eb56de5 upstream. When mounting an image containing a log with sb modifications that require log replay, the mount process hang all the time and stack as follows: [root@localhost ~]# cat /proc/557/stack [<0>] xfs_buftarg_wait+0x31/0x70 [<0>] xfs_buftarg_drain+0x54/0x350 [<0>] xfs_mountfs+0x66e/0xe80 [<0>] xfs_fs_fill_super+0x7f1/0xec0 [<0>] get_tree_bdev_flags+0x186/0x280 [<0>] get_tree_bdev+0x18/0x30 [<0>] xfs_fs_get_tree+0x1d/0x30 [<0>] vfs_get_tree+0x2d/0x110 [<0>] path_mount+0xb59/0xfc0 [<0>] do_mount+0x92/0xc0 [<0>] __x64_sys_mount+0xc2/0x160 [<0>] x64_sys_call+0x2de4/0x45c0 [<0>] do_syscall_64+0xa7/0x240 [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e During log recovery, while updating the in-memory superblock from the primary SB buffer, if an error is encountered, such as superblock corruption occurs or some other reasons, we will proceed to out_release and release the xfs_buf. However, this is insufficient because the xfs_buf's log item has already been initialized and the xfs_buf is held by the buffer log item as follows, the xfs_buf will not be released, causing the mount thread to hang. xlog_recover_do_primary_sb_buffer xlog_recover_do_reg_buffer xlog_recover_validate_buf_type xfs_buf_item_init(bp, mp) The solution is straightforward, we simply need to allow it to be handled by the normal buffer write process. The filesystem will be shutdown before the submission of buffer_list in xlog_do_recovery_pass(), ensuring the correct release of the xfs_buf as follows: xlog_do_recovery_pass error = xlog_recover_process xlog_recover_process_data xlog_recover_process_ophdr xlog_recovery_process_trans ... xlog_recover_buf_commit_pass2 error = xlog_recover_do_primary_sb_buffer //Encounter error and return if (error) goto out_writebuf ... out_writebuf: xfs_buf_delwri_queue(bp, buffer_list) //add bp to list return error ... if (!list_empty(&buffer_list)) if (error) xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); //shutdown first xfs_buf_delwri_submit(&buffer_list); //submit buffers in list __xfs_buf_submit if (bp->b_mount->m_log && xlog_is_shutdown(bp->b_mount->m_log)) xfs_buf_ioend_fail(bp) //release bp correctly Fixes: 6a18765b54e2 ("xfs: update the file system geometry after recoverying superblock buffers") Cc: stable@vger.kernel.org # v6.12 Signed-off-by: Long Li <leo.lilong@huawei.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08xfs: check for dead buffers in xfs_buf_find_insertChristoph Hellwig
commit 07eae0fa67ca4bbb199ad85645e0f9dfaef931cd upstream. Commit 32dd4f9c506b ("xfs: remove a superflous hash lookup when inserting new buffers") converted xfs_buf_find_insert to use rhashtable_lookup_get_insert_fast and thus an operation that returns the existing buffer when an insert would duplicate the hash key. But this code path misses the check for a buffer with a reference count of zero, which could lead to reusing an about to be freed buffer. Fix this by using the same atomic_inc_not_zero pattern as xfs_buf_insert. Fixes: 32dd4f9c506b ("xfs: remove a superflous hash lookup when inserting new buffers") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: stable@vger.kernel.org # v6.0 Signed-off-by: Carlos Maiolino <cem@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08f2fs: Introduce linear search for dentriesDaniel Lee
commit 91b587ba79e1b68bb718d12b0758dbcdab4e9cb7 upstream. This patch addresses an issue where some files in case-insensitive directories become inaccessible due to changes in how the kernel function, utf8_casefold(), generates case-folded strings from the commit 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points"). F2FS uses these case-folded names to calculate hash values for locating dentries and stores them on disk. Since utf8_casefold() can produce different output across kernel versions, stored hash values and newly calculated hash values may differ. This results in affected files no longer being found via the hash-based lookup. To resolve this, the patch introduces a linear search fallback. If the initial hash-based search fails, F2FS will sequentially scan the directory entries. Fixes: 5c26d2f1d3f5 ("unicode: Don't special case ignorable code points") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Signed-off-by: Daniel Lee <chullee@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Daniel Rosenberg <drosen@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-08cifs: Fix getting and setting SACLs over SMB1Pali Rohár
[ Upstream commit 8b19dfb34d17e77a0809d433cc128b779282131b ] SMB1 callback get_cifs_acl_by_fid() currently ignores its last argument and therefore ignores request for SACL_SECINFO. Fix this issue by correctly propagating info argument from get_cifs_acl() and get_cifs_acl_by_fid() to CIFSSMBGetCIFSACL() function and pass SACL_SECINFO when requested. For accessing SACLs it is needed to open object with SYSTEM_SECURITY access. Pass this flag when trying to get or set SACLs. Same logic is in the SMB2+ code path. This change fixes getting and setting of "system.cifs_ntsd_full" and "system.smb3_ntsd_full" xattrs over SMB1 as currently it silentely ignored SACL part of passed xattr buffer. Fixes: 3970acf7ddb9 ("SMB3: Add support for getting and setting SACLs") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08cifs: Validate EAs for WSL reparse pointsPali Rohár
[ Upstream commit ef201e8759d20bf82b5943101147072de12bc524 ] Major and minor numbers for char and block devices are mandatory for stat. So check that the WSL EA $LXDEV is present for WSL CHR and BLK reparse points. WSL reparse point tag determinate type of the file. But file type is present also in the WSL EA $LXMOD. So check that both file types are same. Fixes: 78e26bec4d6d ("smb: client: parse uid, gid, mode and dev from WSL reparse points") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08hostfs: fix string handling in __dentry_name()Al Viro
[ Upstream commit 60a6002432448bb3f291d80768ae98d62efc9c77 ] strcpy() should not be used with destination potentially overlapping the source; what's more, strscpy() in there is pointless - we already know the amount we want to copy; might as well use memcpy(). Fixes: c278e81b8a02 "hostfs: Remove open coded strcpy()" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08ubifs: skip dumping tnc tree when zroot is nullpangliyuan
[ Upstream commit bdb0ca39e0acccf6771db49c3f94ed787d05f2d7 ] Clearing slab cache will free all znode in memory and make c->zroot.znode = NULL, then dumping tnc tree will access c->zroot.znode which cause null pointer dereference. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219624#c0 Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: pangliyuan <pangliyuan1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08NFSv4.2: mark OFFLOAD_CANCEL MOVEABLEOlga Kornievskaia
[ Upstream commit 668135b9348c53fd205f5e07d11e82b10f31b55b ] OFFLOAD_CANCEL should be marked MOVEABLE for when we need to move tasks off a non-functional transport. Fixes: c975c2092657 ("NFS send OFFLOAD_CANCEL when COPY killed") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08NFSv4.2: fix COPY_NOTIFY xdr buf size calculationOlga Kornievskaia
[ Upstream commit e8380c2d06055665b3df6c03964911375d7f9290 ] We need to include sequence size in the compound. Fixes: 0491567b51ef ("NFS: add COPY_NOTIFY operation") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08nfs: fix incorrect error handling in LOCALIOMike Snitzer
[ Upstream commit ead11ac50ad4b8ef1b64806e962ea984862d96ad ] nfs4_stat_to_errno() expects a NFSv4 error code as an argument and returns a POSIX errno. The problem is LOCALIO is passing nfs4_stat_to_errno() the POSIX errno return values from filp->f_op->read_iter(), filp->f_op->write_iter() and vfs_fsync_range(). So the POSIX errno that nfs_local_pgio_done() and nfs_local_commit_done() are passing to nfs4_stat_to_errno() are failing to match any NFSv4 error code, which results in nfs4_stat_to_errno() defaulting to returning -EREMOTEIO. This causes assertions in upper layers due to -EREMOTEIO not being a valid NFSv4 error code. Fix this by updating nfs_local_pgio_done() and nfs_local_commit_done() to use the new nfs_localio_errno_to_nfs4_stat() to map a POSIX errno to an NFSv4 error code. Care was taken to factor out nfs4_errtbl_common[] to avoid duplicating the same NFS error to errno table. nfs4_errtbl_common[] is checked first by both nfs4_stat_to_errno and nfs_localio_errno_to_nfs4_stat before they check their own more specialized tables (nfs4_errtbl[] and nfs4_errtbl_localio[] respectively). While auditing the associated error mapping tables, the (ab)use of -1 for the last table entry was removed in favor of using ARRAY_SIZE to iterate the nfs_errtbl[] and nfs4_errtbl[]. And 'errno_NFSERR_IO' was removed because it caused needless obfuscation. Fixes: 70ba381e1a431 ("nfs: add LOCALIO support") Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08nilfs2: handle errors that nilfs_prepare_chunk() may returnRyusuke Konishi
[ Upstream commit ee70999a988b8abc3490609142f50ebaa8344432 ] Patch series "nilfs2: fix issues with rename operations". This series fixes BUG_ON check failures reported by syzbot around rename operations, and a minor behavioral issue where the mtime of a child directory changes when it is renamed instead of moved. This patch (of 2): The directory manipulation routines nilfs_set_link() and nilfs_delete_entry() rewrite the directory entry in the folio/page previously read by nilfs_find_entry(), so error handling is omitted on the assumption that nilfs_prepare_chunk(), which prepares the buffer for rewriting, will always succeed for these. And if an error is returned, it triggers the legacy BUG_ON() checks in each routine. This assumption is wrong, as proven by syzbot: the buffer layer called by nilfs_prepare_chunk() may call nilfs_get_block() if necessary, which may fail due to metadata corruption or other reasons. This has been there all along, but improved sanity checks and error handling may have made it more reproducible in fuzzing tests. Fix this issue by adding missing error paths in nilfs_set_link(), nilfs_delete_entry(), and their caller nilfs_rename(). Link: https://lkml.kernel.org/r/20250111143518.7901-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20250111143518.7901-2-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+32c3706ebf5d95046ea1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=32c3706ebf5d95046ea1 Reported-by: syzbot+1097e95f134f37d9395c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1097e95f134f37d9395c Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08nilfs2: protect access to buffers with no active referencesRyusuke Konishi
[ Upstream commit 367a9bffabe08c04f6d725032cce3d891b2b9e1a ] nilfs_lookup_dirty_data_buffers(), which iterates through the buffers attached to dirty data folios/pages, accesses the attached buffers without locking the folios/pages. For data cache, nilfs_clear_folio_dirty() may be called asynchronously when the file system degenerates to read only, so nilfs_lookup_dirty_data_buffers() still has the potential to cause use after free issues when buffers lose the protection of their dirty state midway due to this asynchronous clearing and are unintentionally freed by try_to_free_buffers(). Eliminate this race issue by adjusting the lock section in this function. Link: https://lkml.kernel.org/r/20250107200202.6432-3-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08nilfs2: do not force clear folio if buffer is referencedRyusuke Konishi
[ Upstream commit ca76bb226bf47ff04c782cacbd299f12ddee1ec1 ] Patch series "nilfs2: protect busy buffer heads from being force-cleared". This series fixes the buffer head state inconsistency issues reported by syzbot that occurs when the filesystem is corrupted and falls back to read-only, and the associated buffer head use-after-free issue. This patch (of 2): Syzbot has reported that after nilfs2 detects filesystem corruption and falls back to read-only, inconsistencies in the buffer state may occur. One of the inconsistencies is that when nilfs2 calls mark_buffer_dirty() to set a data or metadata buffer as dirty, but it detects that the buffer is not in the uptodate state: WARNING: CPU: 0 PID: 6049 at fs/buffer.c:1177 mark_buffer_dirty+0x2e5/0x520 fs/buffer.c:1177 ... Call Trace: <TASK> nilfs_palloc_commit_alloc_entry+0x4b/0x160 fs/nilfs2/alloc.c:598 nilfs_ifile_create_inode+0x1dd/0x3a0 fs/nilfs2/ifile.c:73 nilfs_new_inode+0x254/0x830 fs/nilfs2/inode.c:344 nilfs_mkdir+0x10d/0x340 fs/nilfs2/namei.c:218 vfs_mkdir+0x2f9/0x4f0 fs/namei.c:4257 do_mkdirat+0x264/0x3a0 fs/namei.c:4280 __do_sys_mkdirat fs/namei.c:4295 [inline] __se_sys_mkdirat fs/namei.c:4293 [inline] __x64_sys_mkdirat+0x87/0xa0 fs/namei.c:4293 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f The other is when nilfs_btree_propagate(), which propagates the dirty state to the ancestor nodes of a b-tree that point to a dirty buffer, detects that the origin buffer is not dirty, even though it should be: WARNING: CPU: 0 PID: 5245 at fs/nilfs2/btree.c:2089 nilfs_btree_propagate+0xc79/0xdf0 fs/nilfs2/btree.c:2089 ... Call Trace: <TASK> nilfs_bmap_propagate+0x75/0x120 fs/nilfs2/bmap.c:345 nilfs_collect_file_data+0x4d/0xd0 fs/nilfs2/segment.c:587 nilfs_segctor_apply_buffers+0x184/0x340 fs/nilfs2/segment.c:1006 nilfs_segctor_scan_file+0x28c/0xa50 fs/nilfs2/segment.c:1045 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1216 [inline] nilfs_segctor_collect fs/nilfs2/segment.c:1540 [inline] nilfs_segctor_do_construct+0x1c28/0x6b90 fs/nilfs2/segment.c:2115 nilfs_segctor_construct+0x181/0x6b0 fs/nilfs2/segment.c:2479 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2587 [inline] nilfs_segctor_thread+0x69e/0xe80 fs/nilfs2/segment.c:2701 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Both of these issues are caused by the callbacks that handle the page/folio write requests, forcibly clear various states, including the working state of the buffers they hold, at unexpected times when they detect read-only fallback. Fix these issues by checking if the buffer is referenced before clearing the page/folio state, and skipping the clear if it is. Link: https://lkml.kernel.org/r/20250107200202.6432-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20250107200202.6432-2-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+b2b14916b77acf8626d7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b2b14916b77acf8626d7 Reported-by: syzbot+d98fd19acd08b36ff422@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=d98fd19acd08b36ff422 Fixes: 8c26c4e2694a ("nilfs2: fix issue with flush kernel thread after remount in RO mode because of driver's internal error or metadata corruption") Tested-by: syzbot+b2b14916b77acf8626d7@syzkaller.appspotmail.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08ocfs2: mark dquot as inactive if failed to start trans while releasing dquotSu Yue
[ Upstream commit 276c61385f6bc3223a5ecd307cf4aba2dfbb9a31 ] While running fstests generic/329, the kernel workqueue quota_release_workfn is dead looping in calling ocfs2_release_dquot(). The ocfs2 state is already readonly but ocfs2_release_dquot wants to start a transaction but fails and returns. ===================================================================== [ 2918.123602 ][ T275 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted. [ 2918.124034 ][ T275 ] (kworker/u135:1,275,11):ocfs2_release_dquot:765 ERROR: status = -30 [ 2918.124452 ][ T275 ] (kworker/u135:1,275,11):ocfs2_release_dquot:795 ERROR: status = -30 [ 2918.124883 ][ T275 ] (kworker/u135:1,275,11):ocfs2_start_trans:357 ERROR: status = -30 [ 2918.125276 ][ T275 ] OCFS2: abort (device dm-0): ocfs2_start_trans: Detected aborted journal [ 2918.125710 ][ T275 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted. ===================================================================== ocfs2_release_dquot() is much like dquot_release(), which is called by ext4 to handle similar situation. So here fix it by marking the dquot as inactive like what dquot_release() does. Link: https://lkml.kernel.org/r/20250106140653.92292-1-glass.su@suse.com Fixes: 9e33d69f553a ("ocfs2: Implementation of local and global quota file handling") Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08erofs: fix potential return value overflow of z_erofs_shrink_scan()Gao Xiang
[ Upstream commit db902986dee453bfb5835cbc8efa67154ab34caf ] z_erofs_shrink_scan() could return small numbers due to the mistyped `freed`. Although I don't think it has any visible impact. Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250114040058.459981-1-hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08cifs: Use cifs_autodisable_serverino() for disabling CIFS_MOUNT_SERVER_INUM ↵Pali Rohár
in readdir.c [ Upstream commit 015683d4ed0d23698c71f2194f09bd17dbfad044 ] In all other places is used function cifs_autodisable_serverino() for disabling CIFS_MOUNT_SERVER_INUM mount flag. So use is also in readir.c _initiate_cifs_search() function. Benefit of cifs_autodisable_serverino() is that it also prints dmesg message that server inode numbers are being disabled. Fixes: ec06aedd4454 ("cifs: clean up handling when server doesn't consistently support inode numbers") Fixes: f534dc994397 ("cifs: clear server inode number flag while autodisabling") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08smb: client: fix oops due to unset link speedPaulo Alcantara
[ Upstream commit be7a6a77669588bfa5022a470989702bbbb11e7f ] It isn't guaranteed that NETWORK_INTERFACE_INFO::LinkSpeed will always be set by the server, so the client must handle any values and then prevent oopses like below from happening: Oops: divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 UID: 0 PID: 1323 Comm: cat Not tainted 6.13.0-rc7 #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41 04/01/2014 RIP: 0010:cifs_debug_data_proc_show+0xa45/0x1460 [cifs] Code: 00 00 48 89 df e8 3b cd 1b c1 41 f6 44 24 2c 04 0f 84 50 01 00 00 48 89 ef e8 e7 d0 1b c1 49 8b 44 24 18 31 d2 49 8d 7c 24 28 <48> f7 74 24 18 48 89 c3 e8 6e cf 1b c1 41 8b 6c 24 28 49 8d 7c 24 RSP: 0018:ffffc90001817be0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88811230022c RCX: ffffffffc041bd99 RDX: 0000000000000000 RSI: 0000000000000567 RDI: ffff888112300228 RBP: ffff888112300218 R08: fffff52000302f5f R09: ffffed1022fa58ac R10: ffff888117d2c566 R11: 00000000fffffffe R12: ffff888112300200 R13: 000000012a15343f R14: 0000000000000001 R15: ffff888113f2db58 FS: 00007fe27119e740(0000) GS:ffff888148600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe2633c5000 CR3: 0000000124da0000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? die+0x2e/0x50 ? do_trap+0x159/0x1b0 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? do_error_trap+0x90/0x130 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? exc_divide_error+0x39/0x50 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? asm_exc_divide_error+0x1a/0x20 ? cifs_debug_data_proc_show+0xa39/0x1460 [cifs] ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs] ? seq_read_iter+0x42e/0x790 seq_read_iter+0x19a/0x790 proc_reg_read_iter+0xbe/0x110 ? __pfx_proc_reg_read_iter+0x10/0x10 vfs_read+0x469/0x570 ? do_user_addr_fault+0x398/0x760 ? __pfx_vfs_read+0x10/0x10 ? find_held_lock+0x8a/0xa0 ? __pfx_lock_release+0x10/0x10 ksys_read+0xd3/0x170 ? __pfx_ksys_read+0x10/0x10 ? __rcu_read_unlock+0x50/0x270 ? mark_held_locks+0x1a/0x90 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fe271288911 Code: 00 48 8b 15 01 25 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 20 ad 01 00 f3 0f 1e fa 80 3d b5 a7 10 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec RSP: 002b:00007ffe87c079d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007fe271288911 RDX: 0000000000040000 RSI: 00007fe2633c6000 RDI: 0000000000000003 RBP: 00007ffe87c07a00 R08: 0000000000000000 R09: 00007fe2713e6380 R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000040000 R13: 00007fe2633c6000 R14: 0000000000000003 R15: 0000000000000000 </TASK> Fix this by setting cifs_server_iface::speed to a sane value (1Gbps) by default when link speed is unset. Cc: Shyam Prasad N <nspmangalore@gmail.com> Cc: Tom Talpey <tom@talpey.com> Fixes: a6d8fb54a515 ("cifs: distribute channels across interfaces based on speed") Reported-by: Frank Sorenson <sorenson@redhat.com> Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08afs: Fix the fallback handling for the YFS.RemoveFile2 RPC callDavid Howells
[ Upstream commit e30458d690f35abb01de8b3cbc09285deb725d00 ] Fix a pair of bugs in the fallback handling for the YFS.RemoveFile2 RPC call: (1) Fix the abort code check to also look for RXGEN_OPCODE. The lack of this masks the second bug. (2) call->server is now not used for ordinary filesystem RPC calls that have an operation descriptor. Fix to use call->op->server instead. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/109541.1736865963@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08select: Fix unbalanced user_access_end()Christophe Leroy
[ Upstream commit 344af27715ddbf357cf76978d674428b88f8e92d ] While working on implementing user access validation on powerpc I got the following warnings on a pmac32_defconfig build: CC fs/select.o fs/select.o: warning: objtool: sys_pselect6+0x1bc: redundant UACCESS disable fs/select.o: warning: objtool: sys_pselect6_time32+0x1bc: redundant UACCESS disable On powerpc/32s, user_read_access_begin/end() are no-ops, but the failure path has a user_access_end() instead of user_read_access_end() which means an access end without any prior access begin. Replace that user_access_end() by user_read_access_end(). Fixes: 7e71609f64ec ("pselect6() and friends: take handling the combined 6th/7th args into helper") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/a7139e28d767a13e667ee3c79599a8047222ef36.1736751221.git.christophe.leroy@csgroup.eu Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08btrfs: subpage: fix the bitmap dump of the locked flagsQu Wenruo
[ Upstream commit 396294d1afee65a203d6cabd843d0782e5d7388e ] We're dumping the locked bitmap into the @checked_bitmap variable, printing incorrect values during debug. Thankfully even during my development I haven't hit a case where I need to dump the locked bitmap. But for the sake of consistency, fix it by dupping the locked bitmap into @locked_bitmap variable for output. Fixes: 75258f20fb70 ("btrfs: subpage: dump extra subpage bitmaps for debug") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08btrfs: improve the warning and error message for btrfs_remove_qgroup()Qu Wenruo
[ Upstream commit c0def46dec9c547679a25fe7552c4bcbec0b0dd2 ] [WARNING] There are several warnings about the recently introduced qgroup auto-removal that it triggers WARN_ON() for the non-zero rfer/excl numbers, e.g: ------------[ cut here ]------------ WARNING: CPU: 67 PID: 2882 at fs/btrfs/qgroup.c:1854 btrfs_remove_qgroup+0x3df/0x450 CPU: 67 UID: 0 PID: 2882 Comm: btrfs-cleaner Kdump: loaded Not tainted 6.11.6-300.fc41.x86_64 #1 RIP: 0010:btrfs_remove_qgroup+0x3df/0x450 Call Trace: <TASK> btrfs_qgroup_cleanup_dropped_subvolume+0x97/0xc0 btrfs_drop_snapshot+0x44e/0xa80 btrfs_clean_one_deleted_snapshot+0xc3/0x110 cleaner_kthread+0xd8/0x130 kthread+0xd2/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> ---[ end trace 0000000000000000 ]--- BTRFS warning (device sda): to be deleted qgroup 0/319 has non-zero numbers, rfer 258478080 rfer_cmpr 258478080 excl 0 excl_cmpr 0 [CAUSE] Although the root cause is still unclear, as if qgroup is consistent a fully dropped subvolume (with extra transaction committed) should lead to all zero numbers for the qgroup. My current guess is the subvolume drop triggered the new subtree drop threshold thus marked qgroup inconsistent, then rescan cleared it but some corner case is not properly handled during subvolume dropping. But at least for this particular case, since it's only the rfer/excl not properly reset to 0, and qgroup is already marked inconsistent, there is nothing to be worried for the end users. The user space tool utilizing qgroup would queue a rescan to handle everything, so the kernel wanring is a little overkilled. [ENHANCEMENT] Enhance the warning inside btrfs_remove_qgroup() by: - Only do WARN() if CONFIG_BTRFS_DEBUG is enabled As explained the kernel can handle inconsistent qgroups by simply do a rescan, there is nothing to bother the end users. - Treat the reserved space leak the same as non-zero numbers By outputting the values and trigger a WARN() if it's a debug build. So far I haven't experienced any case related to reserved space so I hope we will never need to bother them. Fixes: 839d6ea4f86d ("btrfs: automatically remove the subvolume qgroup") Link: https://github.com/kdave/btrfs-progs/issues/922 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08pstore/blk: trivial typo fixesEugen Hristev
[ Upstream commit 542243af7182efaeaf6d0f4643f7de437541a9af ] Fix trivial typos in comments. Fixes: 2a03ddbde1e1 ("pstore/blk: Move verify_size() macro out of function") Fixes: 17639f67c1d6 ("pstore/blk: Introduce backend for block devices") Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org> Link: https://lore.kernel.org/r/20250101111921.850406-1-eugen.hristev@linaro.org Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08fs: fix proc_handler for sysctl_nr_openJinliang Zheng
[ Upstream commit d727935cad9f6f52c8d184968f9720fdc966c669 ] Use proc_douintvec_minmax() instead of proc_dointvec_minmax() to handle sysctl_nr_open, because its data type is unsigned int, not int. Fixes: 9b80a184eaad ("fs/file: more unsigned file descriptors") Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com> Link: https://lore.kernel.org/r/20241124034636.325337-1-alexjlzheng@tencent.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08afs: Fix cleanup of immediately failed async callsDavid Howells
[ Upstream commit 9750be93b2be12b6d92323b97d7c055099d279e6 ] If we manage to begin an async call, but fail to transmit any data on it due to a signal, we then abort it which causes a race between the notification of call completion from rxrpc and our attempt to cancel the notification. The notification will be necessary, however, for async FetchData to terminate the netfs subrequest. However, since we get a notification from rxrpc upon completion of a call (aborted or otherwise), we can just leave it to that. This leads to calls not getting cleaned up, but appearing in /proc/net/rxrpc/calls as being aborted with code 6. Fix this by making the "error_do_abort:" case of afs_make_call() abort the call and then abandon it to the notification handler. Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-25-dhowells@redhat.com cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08afs: Fix directory format encoding structDavid Howells
[ Upstream commit 07a10767853adcbdbf436dc91393b729b52c4e81 ] The AFS directory format structure, union afs_xdr_dir_block::meta, has too many alloc counter slots declared and so pushes the hash table along and over the data. This doesn't cause a problem at the moment because I'm currently ignoring the hash table and only using the correct number of alloc_ctrs in the code anyway. In future, however, I should start using the hash table to try and speed up afs_lookup(). Fix this by using the correct constant to declare the counter array. Fixes: 4ea219a839bf ("afs: Split the directory content defs into a header") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-14-dhowells@redhat.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTYDavid Howells
[ Upstream commit b49194da2aff2c879dec9c59ef8dec0f2b0809ef ] AFS servers pass back a code indicating EEXIST when they're asked to remove a directory that is not empty rather than ENOTEMPTY because not all the systems that an AFS server can run on have the latter error available and AFS preexisted the addition of that error in general. Fix afs_rmdir() to translate EEXIST to ENOTEMPTY. Fixes: 260a980317da ("[AFS]: Add "directory write" support.") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241216204124.3752367-13-dhowells@redhat.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08dlm: fix srcu_read_lock() return type to intAlexander Aring
[ Upstream commit 57cdd1a1cf1464199678f9338049b63fb5d5b41c ] The return type of srcu_read_lock() is int and not bool. Whereas we using the ret variable only to evaluate a bool type of dlm_lowcomms_con_has_addr() to check if an address is already being set. Fixes: 6f0b0b5d7ae7 ("fs: dlm: remove dlm_node_addrs lookup list") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08dlm: fix removal of rsb struct that is master and dir recordAlexander Aring
[ Upstream commit 134129520beaf3339482c557361ea0bde709cf36 ] An rsb struct was not being removed in the case where it was both the master and the dir record. This case (master and dir node) was missed in the condition for doing add_scan() from deactivate_rsb(). Fixing this triggers a related WARN_ON that needs to be fixed, and requires adjusting where two del_scan() calls are made. Fixes: c217adfc8caa ("dlm: fix add_scan and del_scan usage") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-01smb: client: handle lack of EA support in smb2_query_path_info()Paulo Alcantara
commit 3681c74d342db75b0d641ba60de27bf73e16e66b upstream. If the server doesn't support both EAs and reparse point in a file, the SMB2_QUERY_INFO request will fail with either STATUS_NO_EAS_ON_FILE or STATUS_EAS_NOT_SUPPORT in the compound chain, so ignore it as long as reparse point isn't IO_REPARSE_TAG_LX_(CHR|BLK), which would require the EAs to know about major/minor numbers. Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01libfs: Use d_children list to iterate simple_offset directoriesChuck Lever
commit b9b588f22a0c049a14885399e27625635ae6ef91 upstream. The mtree mechanism has been effective at creating directory offsets that are stable over multiple opendir instances. However, it has not been able to handle the subtleties of renames that are concurrent with readdir. Instead of using the mtree to emit entries in the order of their offset values, use it only to map incoming ctx->pos to a starting entry. Then use the directory's d_children list, which is already maintained properly by the dcache, to find the next child to emit. One of the sneaky things about this is that when the mtree-allocated offset value wraps (which is very rare), looking up ctx->pos++ is not going to find the next entry; it will return NULL. Instead, by following the d_children list, the offset values can appear in any order but all of the entries in the directory will be visited eventually. Note also that the readdir() is guaranteed to reach the tail of this list. Entries are added only at the head of d_children, and readdir walks from its current position in that list towards its tail. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20241228175522.1854234-6-cel@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01libfs: Replace simple_offset end-of-directory detectionChuck Lever
commit 68a3a65003145644efcbb651e91db249ccd96281 upstream. According to getdents(3), the d_off field in each returned directory entry points to the next entry in the directory. The d_off field in the last returned entry in the readdir buffer must contain a valid offset value, but if it points to an actual directory entry, then readdir/getdents can loop. This patch introduces a specific fixed offset value that is placed in the d_off field of the last entry in a directory. Some user space applications assume that the EOD offset value is larger than the offsets of real directory entries, so the largest valid offset value is reserved for this purpose. This new value is never allocated by simple_offset_add(). When ->iterate_dir() returns, getdents{64} inserts the ctx->pos value into the d_off field of the last valid entry in the readdir buffer. When it hits EOD, offset_readdir() sets ctx->pos to the EOD offset value so the last entry is updated to point to the EOD marker. When trying to read the entry at the EOD offset, offset_readdir() terminates immediately. It is worth noting that using a Maple tree for directory offset value allocation does not guarantee a 63-bit range of values -- on platforms where "long" is a 32-bit type, the directory offset value range is still 0..(2^31 - 1). For broad compatibility with 32-bit user space, the largest tmpfs directory cookie value is now S32_MAX. Fixes: 796432efab1e ("libfs: getdents() should return 0 after reaching EOD") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20241228175522.1854234-5-cel@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01Revert "libfs: fix infinite directory reads for offset dir"Chuck Lever
commit b662d858131da9a8a14e68661656989b14dbf113 upstream. The current directory offset allocator (based on mtree_alloc_cyclic) stores the next offset value to return in octx->next_offset. This mechanism typically returns values that increase monotonically over time. Eventually, though, the newly allocated offset value wraps back to a low number (say, 2) which is smaller than other already- allocated offset values. Yu Kuai <yukuai3@huawei.com> reports that, after commit 64a7ce76fb90 ("libfs: fix infinite directory reads for offset dir"), if a directory's offset allocator wraps, existing entries are no longer visible via readdir/getdents because offset_readdir() stops listing entries once an entry's offset is larger than octx->next_offset. These entries vanish persistently -- they can be looked up, but will never again appear in readdir(3) output. The reason for this is that the commit treats directory offsets as monotonically increasing integer values rather than opaque cookies, and introduces this comparison: if (dentry2offset(dentry) >= last_index) { On 64-bit platforms, the directory offset value upper bound is 2^63 - 1. Directory offsets will monotonically increase for millions of years without wrapping. On 32-bit platforms, however, LONG_MAX is 2^31 - 1. The allocator can wrap after only a few weeks (at worst). Revert commit 64a7ce76fb90 ("libfs: fix infinite directory reads for offset dir") to prepare for a fix that can work properly on 32-bit systems and might apply to recent LTS kernels where shmem employs the simple_offset mechanism. Reported-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20241228175522.1854234-4-cel@kernel.org Reviewed-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01Revert "libfs: Add simple_offset_empty()"Chuck Lever
commit d7bde4f27ceef3dc6d72010a20d4da23db835a32 upstream. simple_empty() and simple_offset_empty() perform the same task. The latter's use as a canary to find bugs has not found any new issues. A subsequent patch will remove the use of the mtree for iterating directory contents, so revert back to using a similar mechanism for determining whether a directory is indeed empty. Only one such mechanism is ever needed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20241228175522.1854234-3-cel@kernel.org Reviewed-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01libfs: Return ENOSPC when the directory offset range is exhaustedChuck Lever
commit 903dc9c43a155e0893280c7472d4a9a3a83d75a6 upstream. Testing shows that the EBUSY error return from mtree_alloc_cyclic() leaks into user space. The ERRORS section of "man creat(2)" says: > EBUSY O_EXCL was specified in flags and pathname refers > to a block device that is in use by the system > (e.g., it is mounted). ENOSPC is closer to what applications expect in this situation. Note that the normal range of simple directory offset values is 2..2^63, so hitting this error is going to be rare to impossible. Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets") Cc: stable@vger.kernel.org # v6.9+ Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20241228175522.1854234-2-cel@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-01gfs2: Truncate address space when flipping GFS2_DIF_JDATA flagAndreas Gruenbacher
commit 7c9d9223802fbed4dee1ae301661bf346964c9d2 upstream. Truncate an inode's address space when flipping the GFS2_DIF_JDATA flag: depending on that flag, the pages in the address space will either use buffer heads or iomap_folio_state structs, and we cannot mix the two. Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-16Merge tag 'mm-hotfixes-stable-2025-01-16-21-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 singleton hotfixes. 6 are MM. Two are cc:stable and the remainder address post-6.12 issues" * tag 'mm-hotfixes-stable-2025-01-16-21-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: ocfs2: check dir i_size in ocfs2_find_entry mailmap: update entry for Ethan Carter Edwards mm: zswap: move allocations during CPU init outside the lock mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma mm: shmem: use signed int for version handling in casefold option alloc_tag: skip pgalloc_tag_swap if profiling is disabled mm: page_alloc: fix missed updates of lowmem_reserve in adjust_managed_page_count
2025-01-16Merge tag '6.13-rc7-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - fix double free when reconnect racing with closing session - fix SMB1 reconnect with password rotation * tag '6.13-rc7-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix double free of TCP_Server_Info::hostname cifs: support reconnect with alternate password for SMB1
2025-01-16Merge tag 'for-6.13-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: - handle d_path() errors when canonicalizing device mapper paths during device scan * tag 'for-6.13-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add the missing error handling inside get_canonical_dev_path
2025-01-15ocfs2: check dir i_size in ocfs2_find_entrySu Yue
syz reports an out of bounds read: ================================================================== BUG: KASAN: slab-out-of-bounds in ocfs2_match fs/ocfs2/dir.c:334 [inline] BUG: KASAN: slab-out-of-bounds in ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 Read of size 1 at addr ffff88804d8b9982 by task syz-executor.2/14802 CPU: 0 UID: 0 PID: 14802 Comm: syz-executor.2 Not tainted 6.13.0-rc4 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Sched_ext: serialise (enabled+all), task: runnable_at=-10ms Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x229/0x350 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x164/0x530 mm/kasan/report.c:489 kasan_report+0x147/0x180 mm/kasan/report.c:602 ocfs2_match fs/ocfs2/dir.c:334 [inline] ocfs2_search_dirblock+0x283/0x6e0 fs/ocfs2/dir.c:367 ocfs2_find_entry_id fs/ocfs2/dir.c:414 [inline] ocfs2_find_entry+0x1143/0x2db0 fs/ocfs2/dir.c:1078 ocfs2_find_files_on_disk+0x18e/0x530 fs/ocfs2/dir.c:1981 ocfs2_lookup_ino_from_name+0xb6/0x110 fs/ocfs2/dir.c:2003 ocfs2_lookup+0x30a/0xd40 fs/ocfs2/namei.c:122 lookup_open fs/namei.c:3627 [inline] open_last_lookups fs/namei.c:3748 [inline] path_openat+0x145a/0x3870 fs/namei.c:3984 do_filp_open+0xe9/0x1c0 fs/namei.c:4014 do_sys_openat2+0x135/0x1d0 fs/open.c:1402 do_sys_open fs/open.c:1417 [inline] __do_sys_openat fs/open.c:1433 [inline] __se_sys_openat fs/open.c:1428 [inline] __x64_sys_openat+0x15d/0x1c0 fs/open.c:1428 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf6/0x210 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f01076903ad Code: c3 e8 a7 2b 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f01084acfc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 00007f01077cbf80 RCX: 00007f01076903ad RDX: 0000000000105042 RSI: 0000000020000080 RDI: ffffffffffffff9c RBP: 00007f01077cbf80 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001ff R11: 0000000000000246 R12: 0000000000000000 R13: 00007f01077cbf80 R14: 00007f010764fc90 R15: 00007f010848d000 </TASK> ================================================================== And a general protection fault in ocfs2_prepare_dir_for_insert: ================================================================== loop0: detected capacity change from 0 to 32768 JBD2: Ignoring recovery information on journal ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode. Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 UID: 0 PID: 5096 Comm: syz-executor792 Not tainted 6.11.0-rc4-syzkaller-00002-gb0da640826ba #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:ocfs2_find_dir_space_id fs/ocfs2/dir.c:3406 [inline] RIP: 0010:ocfs2_prepare_dir_for_insert+0x3309/0x5c70 fs/ocfs2/dir.c:4280 Code: 00 00 e8 2a 25 13 fe e9 ba 06 00 00 e8 20 25 13 fe e9 4f 01 00 00 e8 16 25 13 fe 49 8d 7f 08 49 8d 5f 09 48 89 f8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 bd 23 00 00 48 89 d8 48 c1 e8 03 42 0f RSP: 0018:ffffc9000af9f020 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000009 RCX: ffff88801e27a440 RDX: 0000000000000000 RSI: 0000000000000400 RDI: 0000000000000008 RBP: ffffc9000af9f830 R08: ffffffff8380395b R09: ffffffff838090a7 R10: 0000000000000002 R11: ffff88801e27a440 R12: dffffc0000000000 R13: ffff88803c660878 R14: f700000000000088 R15: 0000000000000000 FS: 000055555a677380(0000) GS:ffff888020800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560bce569178 CR3: 000000001de5a000 CR4: 0000000000350ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ocfs2_mknod+0xcaf/0x2b40 fs/ocfs2/namei.c:292 vfs_mknod+0x36d/0x3b0 fs/namei.c:4088 do_mknodat+0x3ec/0x5b0 __do_sys_mknodat fs/namei.c:4166 [inline] __se_sys_mknodat fs/namei.c:4163 [inline] __x64_sys_mknodat+0xa7/0xc0 fs/namei.c:4163 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f2dafda3a99 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffe336a6658 EFLAGS: 00000246 ORIG_RAX: 0000000000000103 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2dafda3a99 RDX: 00000000000021c0 RSI: 0000000020000040 RDI: 00000000ffffff9c RBP: 00007f2dafe1b5f0 R08: 0000000000004480 R09: 000055555a6784c0 R10: 0000000000000103 R11: 0000000000000246 R12: 00007ffe336a6680 R13: 00007ffe336a68a8 R14: 431bde82d7b634db R15: 00007f2dafdec03b </TASK> ================================================================== The two reports are all caused invalid negative i_size of dir inode. For ocfs2, dir_inode can't be negative or zero. Here add a check in which is called by ocfs2_check_dir_for_entry(). It fixes the second report as ocfs2_check_dir_for_entry() must be called before ocfs2_prepare_dir_for_insert(). Also set a up limit for dir with OCFS2_INLINE_DATA_FL. The i_size can't be great than blocksize. Link: https://lkml.kernel.org/r/20250106140640.92260-1-glass.su@suse.com Reported-by: Jiacheng Xu <stitch@zju.edu.cn> Link: https://lore.kernel.org/ocfs2-devel/17a04f01.1ae74.19436d003fc.Coremail.stitch@zju.edu.cn/T/#u Reported-by: syzbot+5a64828fcc4c2ad9b04f@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005894f3062018caf1@google.com/T/ Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-15smb: client: fix double free of TCP_Server_Info::hostnamePaulo Alcantara
When shutting down the server in cifs_put_tcp_session(), cifsd thread might be reconnecting to multiple DFS targets before it realizes it should exit the loop, so @server->hostname can't be freed as long as cifsd thread isn't done. Otherwise the following can happen: RIP: 0010:__slab_free+0x223/0x3c0 Code: 5e 41 5f c3 cc cc cc cc 4c 89 de 4c 89 cf 44 89 44 24 08 4c 89 1c 24 e8 fb cf 8e 00 44 8b 44 24 08 4c 8b 1c 24 e9 5f fe ff ff <0f> 0b 41 f7 45 08 00 0d 21 00 0f 85 2d ff ff ff e9 1f ff ff ff 80 RSP: 0018:ffffb26180dbfd08 EFLAGS: 00010246 RAX: ffff8ea34728e510 RBX: ffff8ea34728e500 RCX: 0000000000800068 RDX: 0000000000800068 RSI: 0000000000000000 RDI: ffff8ea340042400 RBP: ffffe112041ca380 R08: 0000000000000001 R09: 0000000000000000 R10: 6170732e31303000 R11: 70726f632e786563 R12: ffff8ea34728e500 R13: ffff8ea340042400 R14: ffff8ea34728e500 R15: 0000000000800068 FS: 0000000000000000(0000) GS:ffff8ea66fd80000(0000) 000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc25376080 CR3: 000000012a2ba001 CR4: PKRU: 55555554 Call Trace: <TASK> ? show_trace_log_lvl+0x1c4/0x2df ? show_trace_log_lvl+0x1c4/0x2df ? __reconnect_target_unlocked+0x3e/0x160 [cifs] ? __die_body.cold+0x8/0xd ? die+0x2b/0x50 ? do_trap+0xce/0x120 ? __slab_free+0x223/0x3c0 ? do_error_trap+0x65/0x80 ? __slab_free+0x223/0x3c0 ? exc_invalid_op+0x4e/0x70 ? __slab_free+0x223/0x3c0 ? asm_exc_invalid_op+0x16/0x20 ? __slab_free+0x223/0x3c0 ? extract_hostname+0x5c/0xa0 [cifs] ? extract_hostname+0x5c/0xa0 [cifs] ? __kmalloc+0x4b/0x140 __reconnect_target_unlocked+0x3e/0x160 [cifs] reconnect_dfs_server+0x145/0x430 [cifs] cifs_handle_standard+0x1ad/0x1d0 [cifs] cifs_demultiplex_thread+0x592/0x730 [cifs] ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs] kthread+0xdd/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x29/0x50 </TASK> Fixes: 7be3248f3139 ("cifs: To match file servers, make sure the server hostname matches") Reported-by: Jay Shin <jaeshin@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-13btrfs: add the missing error handling inside get_canonical_dev_pathQu Wenruo
Inside function get_canonical_dev_path(), we call d_path() to get the final device path. But d_path() can return error, and in that case the next strscpy() call will trigger an invalid memory access. Add back the missing error handling for d_path(). Reported-by: Boris Burkov <boris@bur.io> Fixes: 7e06de7c83a7 ("btrfs: canonicalize the device path before adding it") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13Merge tag 'mm-hotfixes-stable-2025-01-13-00-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "18 hotfixes. 11 are cc:stable. 13 are MM and 5 are non-MM. All patches are singletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2025-01-13-00-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: fs/proc: fix softlockup in __read_vmcore (part 2) mm: fix assertion in folio_end_read() mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled. vmstat: disable vmstat_work on vmstat_cpu_down_prep() zram: fix potential UAF of zram table selftests/mm: set allocated memory to non-zero content in cow test mm: clear uffd-wp PTE/PMD state on mremap() module: fix writing of livepatch relocations in ROX text mm: zswap: properly synchronize freeing resources during CPU hotunplug Revert "mm: zswap: fix race between [de]compression and CPU hotunplug" hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode mm: fix div by zero in bdi_ratio_from_pages x86/execmem: fix ROX cache usage in Xen PV guests filemap: avoid truncating 64-bit offset to 32 bits tools: fix atomic_set() definition to set the value correctly mm/mempolicy: count MPOL_WEIGHTED_INTERLEAVE to "interleave_hit" scripts/decode_stacktrace.sh: fix decoding of lines with an additional info mm/kmemleak: fix percpu memory leak detection failure
2025-01-12cifs: support reconnect with alternate password for SMB1Meetakshi Setiya
SMB1 shares the mount and remount code paths with SMB2/3 and already supports password rotation in some scenarios. This patch extends the password rotation support to SMB1 reconnects as well. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-12fs/proc: fix softlockup in __read_vmcore (part 2)Rik van Riel
Since commit 5cbcb62dddf5 ("fs/proc: fix softlockup in __read_vmcore") the number of softlockups in __read_vmcore at kdump time have gone down, but they still happen sometimes. In a memory constrained environment like the kdump image, a softlockup is not just a harmless message, but it can interfere with things like RCU freeing memory, causing the crashdump to get stuck. The second loop in __read_vmcore has a lot more opportunities for natural sleep points, like scheduling out while waiting for a data write to happen, but apparently that is not always enough. Add a cond_resched() to the second loop in __read_vmcore to (hopefully) get rid of the softlockups. Link: https://lkml.kernel.org/r/20250110102821.2a37581b@fangorn Fixes: 5cbcb62dddf5 ("fs/proc: fix softlockup in __read_vmcore") Signed-off-by: Rik van Riel <riel@surriel.com> Reported-by: Breno Leitao <leitao@debian.org> Cc: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>