summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-06-12f2fs: compress: don't allow unaligned truncation on released compress inodeChao Yu
[ Upstream commit 29ed2b5dd521ce7c5d8466cd70bf0cc9d07afeee ] f2fs image may be corrupted after below testcase: - mkfs.f2fs -O extra_attr,compression -f /dev/vdb - mount /dev/vdb /mnt/f2fs - touch /mnt/f2fs/file - f2fs_io setflags compression /mnt/f2fs/file - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4 - f2fs_io release_cblocks /mnt/f2fs/file - truncate -s 8192 /mnt/f2fs/file - umount /mnt/f2fs - fsck.f2fs /dev/vdb [ASSERT] (fsck_chk_inode_blk:1256) --> ino: 0x5 has i_blocks: 0x00000002, but has 0x3 blocks [FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5] [FSCK] other corrupted bugs [Fail] The reason is: partial truncation assume compressed inode has reserved blocks, after partial truncation, valid block count may change w/o .i_blocks and .total_valid_block_count update, result in corruption. This patch only allow cluster size aligned truncation on released compress inode for fixing. Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: fix to release node block count in error path of f2fs_new_node_page()Chao Yu
[ Upstream commit 0fa4e57c1db263effd72d2149d4e21da0055c316 ] It missed to call dec_valid_node_count() to release node block count in error path, fix it. Fixes: 141170b759e0 ("f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem ↵Chao Yu
lock [ Upstream commit 0a4ed2d97cb6d044196cc3e726b6699222b41019 ] It needs to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock to avoid racing with checkpoint, otherwise, filesystem metadata including blkaddr in dnode, inode fields and .total_valid_block_count may be corrupted after SPO case. Fixes: ef8d563f184e ("f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS") Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: compress: fix error path of inc_valid_block_count()Chao Yu
[ Upstream commit 043c832371cd9023fbd725138ddc6c7f288dc469 ] If inc_valid_block_count() can not allocate all requested blocks, it needs to release block count in .total_valid_block_count and resevation blocks in inode. Fixes: 54607494875e ("f2fs: compress: fix to avoid inconsistence bewteen i_blocks and dnode") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: introduce get_available_block_count() for cleanupChao Yu
[ Upstream commit 0f1c6ede6da9f7c5dd7380b74a64850298279168 ] There are very similar codes in inc_valid_block_count() and inc_valid_node_count() which is used for available user block count calculation. This patch introduces a new helper get_available_block_count() to include those common codes, and used it to clean up codes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 043c832371cd ("f2fs: compress: fix error path of inc_valid_block_count()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: deprecate io_bitsJaegeuk Kim
[ Upstream commit 87161a2b0aed9e9b614bbf6fe8697ad560ceb0cb ] Let's deprecate an unused io_bits feature to save CPU cycles and memory. Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 043c832371cd ("f2fs: compress: fix error path of inc_valid_block_count()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: compress: fix to update i_compr_blocks correctlyChao Yu
[ Upstream commit 186e7d71534df4589405925caca5597af7626c12 ] Previously, we account reserved blocks and compressed blocks into @compr_blocks, then, f2fs_i_compr_blocks_update(,compr_blocks) will update i_compr_blocks incorrectly, fix it. Meanwhile, for the case all blocks in cluster were reserved, fix to update dn->ofs_in_node correctly. Fixes: eb8fbaa53374 ("f2fs: compress: fix to check unreleased compressed cluster") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: fix block migration when section is not aligned to pow2Wu Bo
[ Upstream commit aa4074e8fec4d2e686daee627fcafb3503efe365 ] As for zoned-UFS, f2fs section size is forced to zone size. And zone size may not aligned to pow2. Fixes: 859fca6b706e ("f2fs: swap: support migrating swapfile in aligned write mode") Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: support file pinning for zoned devicesDaeho Jeong
[ Upstream commit 9703d69d9d153bb230711d0d577454552aeb13d4 ] Support file pinning with conventional storage area for zoned devices Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: kill heap-based allocationJaegeuk Kim
[ Upstream commit 4e0197f9932f70cc7be8744aa0ed4dd9b5d97d85 ] No one uses this feature. Let's kill it. Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: separate f2fs_gc_range() to use GC for a rangeDaeho Jeong
[ Upstream commit 2f0209f579d12bd0ea43a01a8696e30a8eeec1da ] Make f2fs_gc_range() an extenal function to use it for GC for a range. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: use BLKS_PER_SEG, BLKS_PER_SEC, and SEGS_PER_SECJaegeuk Kim
[ Upstream commit a60108f7dfb5867da1ad9c777d2fbbe47e4dbdd7 ] No functional change. Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: support printk_ratelimited() in f2fs_printk()Chao Yu
[ Upstream commit b1c9d3f833ba60a288db111d7fe38edfeb9b8fbb ] This patch supports using printk_ratelimited() in f2fs_printk(), and wrap ratelimited f2fs_printk() into f2fs_{err,warn,info}_ratelimited(), then, use these new helps to clean up codes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: Clean up errors in segment.hKaiLong Wang
[ Upstream commit 37768434b7a7d00ac5a08b2c1d31aa7aaa0846a0 ] Fix the following errors reported by checkpatch: ERROR: spaces required around that ':' (ctx:VxW) Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: aa4074e8fec4 ("f2fs: fix block migration when section is not aligned to pow2") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ovl: remove upper umask handling from ovl_create_upper()Miklos Szeredi
[ Upstream commit 096802748ea1dea8b476938e0a8dc16f4bd2f1ad ] This is already done by vfs_prepare_mode() when creating the upper object by vfs_create(), vfs_mkdir() and vfs_mknod(). No regressions have been observed in xfstests run with posix acls turned off for the upper filesystem. Fixes: 1639a49ccdce ("fs: move S_ISGID stripping into the vfs_*() helpers") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12udf: Convert udf_expand_file_adinicb() to use a folioMatthew Wilcox (Oracle)
[ Upstream commit db6754090a4f99c67e05ae6b87343ba6e013531f ] Use the folio APIs throughout this function. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Fixes: 1eeceaec794e ("udf: Convert udf_expand_file_adinicb() to avoid kmap_atomic()") Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-4-willy@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12udf: Remove GFP_NOFS allocation in udf_expand_file_adinicb()Jan Kara
[ Upstream commit 38f8af2a7191e5da21c557210d810c6d0d34f6c4 ] udf_expand_file_adinicb() is called under inode->i_rwsem and mapping->invalidate_lock. i_rwsem is safe wrt fs reclaim, invalidate_lock on this inode is safe as well (we hold inode reference so reclaim will not touch it, furthermore even lockdep should not complain as invalidate_lock is acquired from udf_evict_inode() only when truncating inode which should not happen from fs reclaim). Signed-off-by: Jan Kara <jack@suse.cz> Stable-dep-of: db6754090a4f ("udf: Convert udf_expand_file_adinicb() to use a folio") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: fix to check pinfile flag in f2fs_move_file_range()Chao Yu
[ Upstream commit e07230da0500e0919a765037c5e81583b519be2c ] ioctl(F2FS_IOC_MOVE_RANGE) can truncate or punch hole on pinned file, fix to disallow it. Fixes: 5fed0be8583f ("f2fs: do not allow partial truncation on pinned file") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: fix to relocate check condition in f2fs_fallocate()Chao Yu
[ Upstream commit 278a6253a673611dbc8ab72a3b34b151a8e75822 ] compress and pinfile flag should be checked after inode lock held to avoid race condition, fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Fixes: 5fed0be8583f ("f2fs: do not allow partial truncation on pinned file") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: compress: fix to relocate check condition in f2fs_ioc_{,de}compress_file()Chao Yu
[ Upstream commit bd9ae4ae9e585061acfd4a169f2321706f900246 ] Compress flag should be checked after inode lock held to avoid racing w/ f2fs_setflags_common() , fix it. Fixes: 5fdb322ff2c2 ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE") Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/CAHJ8P3LdZXLc2rqeYjvymgYHr2+YLuJ0sLG9DdsJZmwO7deuhw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: compress: fix to relocate check condition in ↵Chao Yu
f2fs_{release,reserve}_compress_blocks() [ Upstream commit 7c5dffb3d90c5921b91981cc663e02757d90526e ] Compress flag should be checked after inode lock held to avoid racing w/ f2fs_setflags_common(), fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/CAHJ8P3LdZXLc2rqeYjvymgYHr2+YLuJ0sLG9DdsJZmwO7deuhw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: fix to wait on page writeback in __clone_blkaddrs()Chao Yu
[ Upstream commit d3876e34e7e789e2cbdd782360fef2a777391082 ] In below race condition, dst page may become writeback status in __clone_blkaddrs(), it needs to wait writeback before update, fix it. Thread A GC Thread - f2fs_move_file_range - filemap_write_and_wait_range(dst) - gc_data_segment - f2fs_down_write(dst) - move_data_page - set_page_writeback(dst_page) - f2fs_submit_page_write - f2fs_up_write(dst) - f2fs_down_write(dst) - __exchange_data_block - __clone_blkaddrs - f2fs_get_new_data_page - memcpy_page Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12f2fs: multidev: fix to recognize valid zero block addressChao Yu
[ Upstream commit 33e62cd7b4c281cd737c62e5d8c4f0e602a8c5c5 ] As reported by Yi Zhang in mailing list [1], kernel warning was catched during zbd/010 test as below: ./check zbd/010 zbd/010 (test gap zone support with F2FS) [failed] runtime ... 3.752s something found in dmesg: [ 4378.146781] run blktests zbd/010 at 2024-02-18 11:31:13 [ 4378.192349] null_blk: module loaded [ 4378.209860] null_blk: disk nullb0 created [ 4378.413285] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1) [ 4378.422334] scsi host15: scsi_debug: version 0191 [20210520] dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0 [ 4378.434922] scsi 15:0:0:0: Direct-Access-ZBC Linux scsi_debug 0191 PQ: 0 ANSI: 7 [ 4378.443343] scsi 15:0:0:0: Power-on or device reset occurred [ 4378.449371] sd 15:0:0:0: Attached scsi generic sg5 type 20 [ 4378.449418] sd 15:0:0:0: [sdf] Host-managed zoned block device ... (See '/mnt/tests/gitlab.com/api/v4/projects/19168116/repository/archive.zip/storage/blktests/blk/blktests/results/nodev/zbd/010.dmesg' WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 CPU: 22 PID: 44011 Comm: fio Not tainted 6.8.0-rc3+ #1 RIP: 0010:iomap_iter+0x32b/0x350 Call Trace: <TASK> __iomap_dio_rw+0x1df/0x830 f2fs_file_read_iter+0x156/0x3d0 [f2fs] aio_read+0x138/0x210 io_submit_one+0x188/0x8c0 __x64_sys_io_submit+0x8c/0x1a0 do_syscall_64+0x86/0x170 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Shinichiro Kawasaki helps to analyse this issue and proposes a potential fixing patch in [2]. Quoted from reply of Shinichiro Kawasaki: "I confirmed that the trigger commit is dbf8e63f48af as Yi reported. I took a look in the commit, but it looks fine to me. So I thought the cause is not in the commit diff. I found the WARN is printed when the f2fs is set up with multiple devices, and read requests are mapped to the very first block of the second device in the direct read path. In this case, f2fs_map_blocks() and f2fs_map_blocks_cached() modify map->m_pblk as the physical block address from each block device. It becomes zero when it is mapped to the first block of the device. However, f2fs_iomap_begin() assumes that map->m_pblk is the physical block address of the whole f2fs, across the all block devices. It compares map->m_pblk against NULL_ADDR == 0, then go into the unexpected branch and sets the invalid iomap->length. The WARN catches the invalid iomap->length. This WARN is printed even for non-zoned block devices, by following steps. - Create two (non-zoned) null_blk devices memory backed with 128MB size each: nullb0 and nullb1. # mkfs.f2fs /dev/nullb0 -c /dev/nullb1 # mount -t f2fs /dev/nullb0 "${mount_dir}" # dd if=/dev/zero of="${mount_dir}/test.dat" bs=1M count=192 # dd if="${mount_dir}/test.dat" of=/dev/null bs=1M count=192 iflag=direct ..." So, the root cause of this issue is: when multi-devices feature is on, f2fs_map_blocks() may return zero blkaddr in non-primary device, which is a verified valid block address, however, f2fs_iomap_begin() treats it as an invalid block address, and then it triggers the warning in iomap framework code. Finally, as discussed, we decide to use a more simple and direct way that checking (map.m_flags & F2FS_MAP_MAPPED) condition instead of (map.m_pblk != NULL_ADDR) to fix this issue. Thanks a lot for the effort of Yi Zhang and Shinichiro Kawasaki on this issue. [1] https://lore.kernel.org/linux-f2fs-devel/CAHj4cs-kfojYC9i0G73PRkYzcxCTex=-vugRFeP40g_URGvnfQ@mail.gmail.com/ [2] https://lore.kernel.org/linux-f2fs-devel/gngdj77k4picagsfdtiaa7gpgnup6fsgwzsltx6milmhegmjff@iax2n4wvrqye/ Reported-by: Yi Zhang <yi.zhang@redhat.com> Closes: https://lore.kernel.org/linux-f2fs-devel/CAHj4cs-kfojYC9i0G73PRkYzcxCTex=-vugRFeP40g_URGvnfQ@mail.gmail.com/ Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Fixes: 1517c1a7a445 ("f2fs: implement iomap operations") Fixes: 8d3c1fa3fa5e ("f2fs: don't rely on F2FS_MAP_* in f2fs_iomap_begin") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ext4: remove the redundant folio_wait_stable()Zhang Yi
[ Upstream commit df0b5afc62f3368d657a8fe4a8d393ac481474c2 ] __filemap_get_folio() with FGP_WRITEBEGIN parameter has already wait for stable folio, so remove the redundant folio_wait_stable() in ext4_da_write_begin(), it was left over from the commit cc883236b792 ("ext4: drop unnecessary journal handle in delalloc write") that removed the retry getting page logic. Fixes: cc883236b792 ("ext4: drop unnecessary journal handle in delalloc write") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240419023005.2719050-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ext4: fix potential unnitialized variableDan Carpenter
[ Upstream commit 3f4830abd236d0428e50451e1ecb62e14c365e9b ] Smatch complains "err" can be uninitialized in the caller. fs/ext4/indirect.c:349 ext4_alloc_branch() error: uninitialized symbol 'err'. Set the error to zero on the success path. Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/363a4673-0fb8-4adf-b4fb-90a499077276@moroto.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ext4: avoid excessive credit estimate in ext4_tmpfile()Jan Kara
[ Upstream commit 35a1f12f0ca857fee1d7a04ef52cbd5f1f84de13 ] A user with minimum journal size (1024 blocks these days) complained about the following error triggered by generic/697 test in ext4_tmpfile(): run fstests generic/697 at 2024-02-28 05:34:46 JBD2: vfstest wants too many credits credits:260 rsv_credits:0 max:256 EXT4-fs error (device loop0) in __ext4_new_inode:1083: error 28 Indeed the credit estimate in ext4_tmpfile() is huge. EXT4_MAXQUOTAS_INIT_BLOCKS() is 219, then 10 credits from ext4_tmpfile() itself and then ext4_xattr_credits_for_new_inode() adds more credits needed for security attributes and ACLs. Now the EXT4_MAXQUOTAS_INIT_BLOCKS() is in fact unnecessary because we've already initialized quotas with dquot_init() shortly before and so EXT4_MAXQUOTAS_TRANS_BLOCKS() is enough (which boils down to 3 credits). Fixes: af51a2ac36d1 ("ext4: ->tmpfile() support") Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Luis Henriques <lhenriques@suse.de> Tested-by: Disha Goel <disgoel@linux.ibm.com> Link: https://lore.kernel.org/r/20240307115320.28949-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: do_xmote fixesAndreas Gruenbacher
[ Upstream commit 9947a06d29c0a30da88cdc6376ca5fd87083e130 ] Function do_xmote() is called with the glock spinlock held. Commit 86934198eefa added a 'goto skip_inval' statement at the beginning of the function to further below where the glock spinlock is expected not to be held anymore. Then it added code there that requires the glock spinlock to be held. This doesn't make sense; fix this up by dropping and retaking the spinlock where needed. In addition, when ->lm_lock() returned an error, do_xmote() didn't fail the locking operation, and simply left the glock hanging; fix that as well. (This is a much older error.) Fixes: 86934198eefa ("gfs2: Clear flags when withdraw prevents xmote") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: finish_xmote cleanupAndreas Gruenbacher
[ Upstream commit 1cd28e15864054f3c48baee9eecda1c0441c48ac ] Currently, function finish_xmote() takes and releases the glock spinlock. However, all of its callers immediately take that spinlock again, so it makes more sense to take the spin lock before calling finish_xmote() already. With that, thaw_glock() is the only place that sets the GLF_HAVE_REPLY flag outside of the glock spinlock, but it also takes that spinlock immediately thereafter. Change that to set the bit when the spinlock is already held. This allows to switch from test_and_clear_bit() to test_bit() and clear_bit() in glock_work_func(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Rename gfs2_withdrawn to gfs2_withdrawing_or_withdrawnAndreas Gruenbacher
[ Upstream commit 4d927b03a68846e4e791ccde6b4c274df02f11e9 ] This function checks whether the filesystem has been been marked to be withdrawn eventually or has been withdrawn already. Rename this function to avoid confusing code like checking for gfs2_withdrawing() when gfs2_withdrawn() has already returned true. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Mark withdraws as unlikelyAndreas Gruenbacher
[ Upstream commit 015af1af44003fff797f8632e940824c07d282bf ] Mark the gfs2_withdrawn(), gfs2_withdrawing(), and gfs2_withdraw_in_prog() inline functions as likely to return %false. This allows to get rid of likely() and unlikely() annotations at the call sites of those functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Fix potential glock use-after-free on unmountAndreas Gruenbacher
[ Upstream commit d98779e687726d8f8860f1c54b5687eec5f63a73 ] When a DLM lockspace is released and there ares still locks in that lockspace, DLM will unlock those locks automatically. Commit fb6791d100d1b started exploiting this behavior to speed up filesystem unmount: gfs2 would simply free glocks it didn't want to unlock and then release the lockspace. This didn't take the bast callbacks for asynchronous lock contention notifications into account, which remain active until until a lock is unlocked or its lockspace is released. To prevent those callbacks from accessing deallocated objects, put the glocks that should not be unlocked on the sd_dead_glocks list, release the lockspace, and only then free those glocks. As an additional measure, ignore unexpected ast and bast callbacks if the receiving glock is dead. Fixes: fb6791d100d1b ("GFS2: skip dlm_unlock calls in unmount") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Remove ill-placed consistency checkAndreas Gruenbacher
[ Upstream commit 59f60005797b4018d7b46620037e0c53d690795e ] This consistency check was originally added by commit 9287c6452d2b1 ("gfs2: Fix occasional glock use-after-free"). It is ill-placed in gfs2_glock_free() because if it holds there, it must equally hold in __gfs2_glock_put() already. Either way, the check doesn't seem necessary anymore. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: No longer use 'extern' in function declarationsAndreas Gruenbacher
[ Upstream commit 0b2355fe91ac3756a9e29c8b833ba33f9affb520 ] For non-static function declarations, external linkage is implied and the 'extern' keyword isn't needed. Some static checkers complain about the overuse of 'extern', so clean up all the function declarations. In addition, remove 'extern' from the definition of free_local_statfs_inodes(); it isn't needed there, either. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Rename gfs2_lookup_{ simple => meta }Andreas Gruenbacher
[ Upstream commit 062fb903895a035ed382a0d3f9b9d459b2718217 ] Function gfs2_lookup_simple() is used for looking up inodes in the metadata directory tree, so rename it to gfs2_lookup_meta() to closer match its purpose. Clean the function up a little on the way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Convert gfs2_internal_read to foliosAndreas Gruenbacher
[ Upstream commit be7f6a6b0bca708999eef4f8e9f2b128c73b9e17 ] Change gfs2_internal_read() to use folios. Convert sizes to size_t. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Get rid of gfs2_alloc_blocks generation parameterAndreas Gruenbacher
[ Upstream commit 4c7b3f7fb7c8c66d669d107e717f9de41ef81e92 ] Get rid of the generation parameter of gfs2_alloc_blocks(): we only ever set the generation of the current inode while creating it, so do so directly. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Fix "ignore unlock failures after withdraw"Andreas Gruenbacher
[ Upstream commit 5d9231111966b6c5a65016d58dcbeab91055bc91 ] Commit 3e11e53041502 tries to suppress dlm_lock() lock conversion errors that occur when the lockspace has already been released. It does that by setting and checking the SDF_SKIP_DLM_UNLOCK flag. This conflicts with the intended meaning of the SDF_SKIP_DLM_UNLOCK flag, so check whether the lockspace is still allocated instead. (Given the current DLM API, checking for this kind of error after the fact seems easier that than to make sure that the lockspace is still allocated before calling dlm_lock(). Changing the DLM API so that users maintain the lockspace references themselves would be an option.) Fixes: 3e11e53041502 ("GFS2: ignore unlock failures after withdraw") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Don't forget to complete delayed withdrawAndreas Gruenbacher
[ Upstream commit b01189333ee91c1ae6cd96dfd1e3a3c2e69202f0 ] Commit fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") switched from gfs2_withdraw() to gfs2_withdraw_delayed() in gfs2_ail_error(), but failed to then check if a delayed withdraw had occurred. Fix that by adding the missing check in __gfs2_ail_flush(), where the spin locks are already dropped and a withdraw is possible. Fixes: fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12dlm: fix user space lock decision to copy lvbAlexander Aring
[ Upstream commit ad191e0eeebf64a60ca2d16ca01a223d2b1dd25e ] This patch fixes the copy lvb decision for user space lock requests. Checking dlm_lvb_operations is done earlier, where granted/requested lock modes are available to use in the matrix. The decision had been moved to the wrong location, where granted mode and requested mode where the same, which causes the dlm_lvb_operations matix to produce the wrong copy decision. For PW or EX requests, the caller could get invalid lvb data. Fixes: 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12jffs2: prevent xattr node from overflowing the eraseblockIlya Denisyev
[ Upstream commit c6854e5a267c28300ff045480b5a7ee7f6f1d913 ] Add a check to make sure that the requested xattr node size is no larger than the eraseblock minus the cleanmarker. Unlike the usual inode nodes, the xattr nodes aren't split into parts and spread across multiple eraseblocks, which means that a xattr node must not occupy more than one eraseblock. If the requested xattr value is too large, the xattr node can spill onto the next eraseblock, overwriting the nodes and causing errors such as: jffs2: argh. node added in wrong place at 0x0000b050(2) jffs2: nextblock 0x0000a000, expected at 0000b00c jffs2: error: (823) do_verify_xattr_datum: node CRC failed at 0x01e050, read=0xfc892c93, calc=0x000000 jffs2: notice: (823) jffs2_get_inode_nodes: Node header CRC failed at 0x01e00c. {848f,2fc4,0fef511f,59a3d171} jffs2: Node at 0x0000000c with length 0x00001044 would run over the end of the erase block jffs2: Perhaps the file system was created with the wrong erase size? jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000010: 0x1044 instead This breaks the filesystem and can lead to KASAN crashes such as: BUG: KASAN: slab-out-of-bounds in jffs2_sum_add_kvec+0x125e/0x15d0 Read of size 4 at addr ffff88802c31e914 by task repro/830 CPU: 0 PID: 830 Comm: repro Not tainted 6.9.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xc4/0x620 ? __virt_addr_valid+0x308/0x5b0 kasan_report+0xc1/0xf0 ? jffs2_sum_add_kvec+0x125e/0x15d0 ? jffs2_sum_add_kvec+0x125e/0x15d0 jffs2_sum_add_kvec+0x125e/0x15d0 jffs2_flash_direct_writev+0xa8/0xd0 jffs2_flash_writev+0x9c9/0xef0 ? __x64_sys_setxattr+0xc4/0x160 ? do_syscall_64+0x69/0x140 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: aa98d7cf59b5 ("[JFFS2][XATTR] XATTR support on JFFS2 (version. 5)") Signed-off-by: Ilya Denisyev <dev@elkcl.ru> Link: https://lore.kernel.org/r/20240412155357.237803-1-dev@elkcl.ru Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12nilfs2: fix out-of-range warningArnd Bergmann
[ Upstream commit c473bcdd80d4ab2ae79a7a509a6712818366e32a ] clang-14 points out that v_size is always smaller than a 64KB page size if that is configured by the CPU architecture: fs/nilfs2/ioctl.c:63:19: error: result of comparison of constant 65536 with expression of type '__u16' (aka 'unsigned short') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (argv->v_size > PAGE_SIZE) ~~~~~~~~~~~~ ^ ~~~~~~~~~ This is ok, so just shut up that warning with a cast. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240328143051.1069575-7-arnd@kernel.org Fixes: 3358b4aaa84f ("nilfs2: fix problems of memory allocation in ioctl") Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ecryptfs: Fix buffer size for tag 66 packetBrian Kubisiak
[ Upstream commit 85a6a1aff08ec9f5b929d345d066e2830e8818e5 ] The 'TAG 66 Packet Format' description is missing the cipher code and checksum fields that are packed into the message packet. As a result, the buffer allocated for the packet is 3 bytes too small and write_tag_66_packet() will write up to 3 bytes past the end of the buffer. Fix this by increasing the size of the allocation so the whole packet will always fit in the buffer. This fixes the below kasan slab-out-of-bounds bug: BUG: KASAN: slab-out-of-bounds in ecryptfs_generate_key_packet_set+0x7d6/0xde0 Write of size 1 at addr ffff88800afbb2a5 by task touch/181 CPU: 0 PID: 181 Comm: touch Not tainted 6.6.13-gnu #1 4c9534092be820851bb687b82d1f92a426598dc6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2/GNU Guix 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4c/0x70 print_report+0xc5/0x610 ? ecryptfs_generate_key_packet_set+0x7d6/0xde0 ? kasan_complete_mode_report_info+0x44/0x210 ? ecryptfs_generate_key_packet_set+0x7d6/0xde0 kasan_report+0xc2/0x110 ? ecryptfs_generate_key_packet_set+0x7d6/0xde0 __asan_store1+0x62/0x80 ecryptfs_generate_key_packet_set+0x7d6/0xde0 ? __pfx_ecryptfs_generate_key_packet_set+0x10/0x10 ? __alloc_pages+0x2e2/0x540 ? __pfx_ovl_open+0x10/0x10 [overlay 30837f11141636a8e1793533a02e6e2e885dad1d] ? dentry_open+0x8f/0xd0 ecryptfs_write_metadata+0x30a/0x550 ? __pfx_ecryptfs_write_metadata+0x10/0x10 ? ecryptfs_get_lower_file+0x6b/0x190 ecryptfs_initialize_file+0x77/0x150 ecryptfs_create+0x1c2/0x2f0 path_openat+0x17cf/0x1ba0 ? __pfx_path_openat+0x10/0x10 do_filp_open+0x15e/0x290 ? __pfx_do_filp_open+0x10/0x10 ? __kasan_check_write+0x18/0x30 ? _raw_spin_lock+0x86/0xf0 ? __pfx__raw_spin_lock+0x10/0x10 ? __kasan_check_write+0x18/0x30 ? alloc_fd+0xf4/0x330 do_sys_openat2+0x122/0x160 ? __pfx_do_sys_openat2+0x10/0x10 __x64_sys_openat+0xef/0x170 ? __pfx___x64_sys_openat+0x10/0x10 do_syscall_64+0x60/0xd0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 RIP: 0033:0x7f00a703fd67 Code: 25 00 00 41 00 3d 00 00 41 00 74 37 64 8b 04 25 18 00 00 00 85 c0 75 5b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 85 00 00 00 48 83 c4 68 5d 41 5c c3 0f 1f RSP: 002b:00007ffc088e30b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 00007ffc088e3368 RCX: 00007f00a703fd67 RDX: 0000000000000941 RSI: 00007ffc088e48d7 RDI: 00000000ffffff9c RBP: 00007ffc088e48d7 R08: 0000000000000001 R09: 0000000000000000 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 0000000000000000 R14: 00007ffc088e48d7 R15: 00007f00a7180040 </TASK> Allocated by task 181: kasan_save_stack+0x2f/0x60 kasan_set_track+0x29/0x40 kasan_save_alloc_info+0x25/0x40 __kasan_kmalloc+0xc5/0xd0 __kmalloc+0x66/0x160 ecryptfs_generate_key_packet_set+0x6d2/0xde0 ecryptfs_write_metadata+0x30a/0x550 ecryptfs_initialize_file+0x77/0x150 ecryptfs_create+0x1c2/0x2f0 path_openat+0x17cf/0x1ba0 do_filp_open+0x15e/0x290 do_sys_openat2+0x122/0x160 __x64_sys_openat+0xef/0x170 do_syscall_64+0x60/0xd0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: dddfa461fc89 ("[PATCH] eCryptfs: Public key; packet management") Signed-off-by: Brian Kubisiak <brian@kubisiak.com> Link: https://lore.kernel.org/r/5j2q56p6qkhezva6b2yuqfrsurmvrrqtxxzrnp3wqu7xrz22i7@hoecdztoplbl Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12openpromfs: finish conversion to the new mount APIEric Sandeen
[ Upstream commit 8f27829974b025d4df2e78894105d75e3bf349f0 ] The original mount API conversion inexplicably left out the change from ->remount_fs to ->reconfigure; do that now. Fixes: 7ab2fa7693c3 ("vfs: Convert openpromfs to use the new mount API") Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/90b968aa-c979-420f-ba37-5acc3391b28f@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ksmbd: fix uninitialized symbol 'share' in smb2_tree_connect()Namjae Jeon
[ Upstream commit bc642d7bfdac3bfd838a1cd6651955ae2eb8535a ] Fix uninitialized symbol 'share' in smb2_tree_connect(). Fixes: e9d8c2f95ab8 ("ksmbd: add continuous availability share parameter") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12epoll: be better about file lifetimesLinus Torvalds
[ Upstream commit 4efaa5acf0a1d2b5947f98abb3acf8bfd966422b ] epoll can call out to vfs_poll() with a file pointer that may race with the last 'fput()'. That would make f_count go down to zero, and while the ep->mtx locking means that the resulting file pointer tear-down will be blocked until the poll returns, it means that f_count is already dead, and any use of it won't actually get a reference to the file any more: it's dead regardless. Make sure we have a valid ref on the file pointer before we call down to vfs_poll() from the epoll routines. Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/ Reported-by: syzbot+045b454ab35fd82a35fb@syzkaller.appspotmail.com Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12ksmbd: ignore trailing slashes in share pathsNandor Kracser
commit 405ee4097c4bc3e70556520aed5ba52a511c2266 upstream. Trailing slashes in share paths (like: /home/me/Share/) caused permission issues with shares for clients on iOS and on Android TV for me, but otherwise they work fine with plain old Samba. Cc: stable@vger.kernel.org Signed-off-by: Nandor Kracser <bonifaido@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12ksmbd: avoid to send duplicate oplock break notificationsNamjae Jeon
commit c91ecba9e421e4f2c9219cf5042fa63a12025310 upstream. This patch fixes generic/011 when oplocks is enable. Avoid to send duplicate oplock break notifications like smb2 leases case. Fixes: 97c2ec64667b ("ksmbd: avoid to send duplicate lease break notifications") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12fs/ntfs3: Break dir enumeration if directory contents errorKonstantin Komarov
commit 302e9dca8428979c9c99f2dbb44dc1783f5011c3 upstream. If we somehow attempt to read beyond the directory size, an error is supposed to be returned. However, in some cases, read requests do not stop and instead enter into a loop. To avoid this, we set the position in the directory to the end. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12fs/ntfs3: Fix case when index is reused during tree transformationKonstantin Komarov
commit 05afeeebcac850a016ec4fb1f681ceda11963562 upstream. In most cases when adding a cluster to the directory index, they are placed at the end, and in the bitmap, this cluster corresponds to the last bit. The new directory size is calculated as follows: data_size = (u64)(bit + 1) << indx->index_bits; In the case of reusing a non-final cluster from the index, data_size is calculated incorrectly, resulting in the directory size differing from the actual size. A check for cluster reuse has been added, and the size update is skipped. Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block") Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12fs/ntfs3: Taking DOS names into account during link countingKonstantin Komarov
commit 110b24eb1a749bea3440f3ca2ff890a26179050a upstream. When counting and checking hard links in an ntfs file record, struct MFT_REC { struct NTFS_RECORD_HEADER rhdr; // 'FILE' __le16 seq; // 0x10: Sequence number for this record. >> __le16 hard_links; // 0x12: The number of hard links to record. __le16 attr_off; // 0x14: Offset to attributes. ... the ntfs3 driver ignored short names (DOS names), causing the link count to be reduced by 1 and messages to be output to dmesg. For Windows, such a situation is a minor error, meaning chkdsk does not report errors on such a volume, and in the case of using the /f switch, it silently corrects them, reporting that no errors were found. This does not affect the consistency of the file system. Nevertheless, the behavior in the ntfs3 driver is incorrect and changes the content of the file system. This patch should fix that. PS: most likely, there has been a confusion of concepts MFT_REC::hard_links and inode::__i_nlink. Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block") Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>