summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2019-11-28ocfs2: remove ocfs2_is_o2cb_active()Gang He
commit a634644751c46238df58bbfe992e30c1668388db upstream. Remove ocfs2_is_o2cb_active(). We have similar functions to identify which cluster stack is being used via osb->osb_cluster_stack. Secondly, the current implementation of ocfs2_is_o2cb_active() is not totally safe. Based on the design of stackglue, we need to get ocfs2_stack_lock before using ocfs2_stack related data structures, and that active_stack pointer can be NULL in the case of mount failure. Link: http://lkml.kernel.org/r/1495441079-11708-1-git-send-email-ghe@suse.com Signed-off-by: Gang He <ghe@suse.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Reviewed-by: Eric Ren <zren@suse.com> Acked-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-28dlm: don't leak kernel pointer to userspaceTycho Andersen
[ Upstream commit 9de30f3f7f4d31037cfbb7c787e1089c1944b3a7 ] In copy_result_to_user(), we first create a struct dlm_lock_result, which contains a struct dlm_lksb, the last member of which is a pointer to the lvb. Unfortunately, we copy the entire struct dlm_lksb to the result struct, which is then copied to userspace at the end of the function, leaking the contents of sb_lvbptr, which is a valid kernel pointer in some cases (indeed, later in the same function the data it points to is copied to userspace). It is an error to leak kernel pointers to userspace, as it undermines KASLR protections (see e.g. 65eea8edc31 ("floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl") for another example of this). Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28dlm: fix invalid freeTycho Andersen
[ Upstream commit d968b4e240cfe39d39d80483bac8bca8716fd93c ] dlm_config_nodes() does not allocate nodes on failure, so we should not free() nodes when it fails. Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28ocfs2: fix clusters leak in ocfs2_defrag_extent()Larry Chen
[ Upstream commit 6194ae4242dec0c9d604bc05df83aa9260a899e4 ] ocfs2_defrag_extent() might leak allocated clusters. When the file system has insufficient space, the number of claimed clusters might be less than the caller wants. If that happens, the original code might directly commit the transaction without returning clusters. This patch is based on code in ocfs2_add_clusters_in_btree(). [akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac] Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com Signed-off-by: Larry Chen <lchen@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28ocfs2: don't put and assigning null to bh allocated outsideChangwei Ge
[ Upstream commit cf76c78595ca87548ca5e45c862ac9e0949c4687 ] ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read several blocks from disk. Currently, the input argument *bhs* can be NULL or NOT. It depends on the caller's behavior. If the function fails in reading blocks from disk, the corresponding bh will be assigned to NULL and put. Obviously, above process for non-NULL input bh is not appropriate. Because the caller doesn't even know its bhs are put and re-assigned. If buffer head is managed by caller, ocfs2_read_blocks and ocfs2_read_blocks_sync() should not evaluate it to NULL. It will cause caller accessing illegal memory, thus crash. Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge <ge.changwei@h3c.com> Reviewed-by: Guozhonghua <guozhonghua@h3c.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28fs/hfs/extent.c: fix array out of bounds read of array extentColin Ian King
[ Upstream commit 6c9a3f843a29d6894dfc40df338b91dbd78f0ae3 ] Currently extent and index i are both being incremented causing an array out of bounds read on extent[i]. Fix this by removing the extraneous increment of extent. Ernesto said: : This is only triggered when deleting a file with a resource fork. I : may be wrong because the documentation isn't clear, but I don't think : you can create those under linux. So I guess nobody was testing them. : : > A disk space leak, perhaps? : : That's what it looks like in general. hfs_free_extents() won't do : anything if the block count doesn't add up, and the error will be : ignored. Now, if the block count randomly does add up, we could see : some corruption. Detected by CoverityScan, CID#711541 ("Out of bounds read") Link: http://lkml.kernel.org/r/20180831140538.31566-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ernesto A. Fernndez <ernesto.mnd.fernandez@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfs: fix return value of hfs_get_block()Ernesto A. Fernández
[ Upstream commit 1267a07be5ebbff2d2739290f3d043ae137c15b4 ] Direct writes to empty inodes fail with EIO. The generic direct-io code is in part to blame (a patch has been submitted as "direct-io: allow direct writes to empty inodes"), but hfs is worse affected than the other filesystems because the fallback to buffered I/O doesn't happen. The problem is the return value of hfs_get_block() when called with !create. Change it to be more consistent with the other modules. Link: http://lkml.kernel.org/r/4538ab8c35ea37338490525f0f24cbc37227528c.1539195310.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfsplus: fix return value of hfsplus_get_block()Ernesto A. Fernández
[ Upstream commit 839c3a6a5e1fbc8542d581911b35b2cb5cd29304 ] Direct writes to empty inodes fail with EIO. The generic direct-io code is in part to blame (a patch has been submitted as "direct-io: allow direct writes to empty inodes"), but hfsplus is worse affected than the other filesystems because the fallback to buffered I/O doesn't happen. The problem is the return value of hfsplus_get_block() when called with !create. Change it to be more consistent with the other modules. Link: http://lkml.kernel.org/r/2cd1301404ec7cf1e39c8f11a01a4302f1460ad6.1539195310.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfs: prevent btree data loss on ENOSPCErnesto A. Fernández
[ Upstream commit 54640c7502e5ed41fbf4eedd499e85f9acc9698f ] Inserting a new record in a btree may require splitting several of its nodes. If we hit ENOSPC halfway through, the new nodes will be left orphaned and their records will be lost. This could mean lost inodes or extents. Henceforth, check the available disk space before making any changes. This still leaves the potential problem of corruption on ENOMEM. There is no need to reserve space before deleting a catalog record, as we do for hfsplus. This difference is because hfs index nodes have fixed length keys. Link: http://lkml.kernel.org/r/ab5fc8a7d5ffccfd5f27b1cf2cb4ceb6c110da74.1536269131.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfsplus: prevent btree data loss on ENOSPCErnesto A. Fernández
[ Upstream commit d92915c35bfaf763d78bf1d5ac7f183420e3bd99 ] Inserting or deleting a record in a btree may require splitting several of its nodes. If we hit ENOSPC halfway through, the new nodes will be left orphaned and their records will be lost. This could mean lost inodes, extents or xattrs. Henceforth, check the available disk space before making any changes. This still leaves the potential problem of corruption on ENOMEM. The patch can be tested with xfstests generic/027. Link: http://lkml.kernel.org/r/4596eef22fbda137b4ffa0272d92f0da15364421.1536269129.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfs: fix BUG on bnode parent updateErnesto A. Fernández
[ Upstream commit ef75bcc5763d130451a99825f247d301088b790b ] hfs_brec_update_parent() may hit BUG_ON() if the first record of both a leaf node and its parent are changed, and if this forces the parent to be split. It is not possible for this to happen on a valid hfs filesystem because the index nodes have fixed length keys. For reasons I ignore, the hfs module does have support for a number of hfsplus features. A corrupt btree header may report variable length keys and trigger this BUG, so it's better to fix it. Link: http://lkml.kernel.org/r/cf9b02d57f806217a2b1bf5db8c3e39730d8f603.1535682463.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28hfsplus: fix BUG on bnode parent updateErnesto A. Fernández
[ Upstream commit 19a9d0f1acf75e8be8cfba19c1a34e941846fa2b ] Creating, renaming or deleting a file may hit BUG_ON() if the first record of both a leaf node and its parent are changed, and if this forces the parent to be split. This bug is triggered by xfstests generic/027, somewhat rarely; here is a more reliable reproducer: truncate -s 50M fs.iso mkfs.hfsplus fs.iso mount fs.iso /mnt i=1000 while [ $i -le 2400 ]; do touch /mnt/$i &>/dev/null ((++i)) done i=2400 while [ $i -ge 1000 ]; do mv /mnt/$i /mnt/$(perl -e "print $i x61") &>/dev/null ((--i)) done The issue is that a newly created bnode is being put twice. Reset new_node to NULL in hfs_brec_update_parent() before reaching goto again. Link: http://lkml.kernel.org/r/5ee1db09b60373a15890f6a7c835d00e76bf601d.1535682461.git.ernesto.mnd.fernandez@gmail.com Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28fs/ocfs2/dlm/dlmdebug.c: fix a sleep-in-atomic-context bug in ↵Jia-Ju Bai
dlm_print_one_mle() [ Upstream commit 999865764f5f128896402572b439269acb471022 ] The kernel module may sleep with holding a spinlock. The function call paths (from bottom to top) in Linux-4.16 are: [FUNC] get_zeroed_page(GFP_NOFS) fs/ocfs2/dlm/dlmdebug.c, 332: get_zeroed_page in dlm_print_one_mle fs/ocfs2/dlm/dlmmaster.c, 240: dlm_print_one_mle in __dlm_put_mle fs/ocfs2/dlm/dlmmaster.c, 255: __dlm_put_mle in dlm_put_mle fs/ocfs2/dlm/dlmmaster.c, 254: spin_lock in dlm_put_ml [FUNC] get_zeroed_page(GFP_NOFS) fs/ocfs2/dlm/dlmdebug.c, 332: get_zeroed_page in dlm_print_one_mle fs/ocfs2/dlm/dlmmaster.c, 240: dlm_print_one_mle in __dlm_put_mle fs/ocfs2/dlm/dlmmaster.c, 222: __dlm_put_mle in dlm_put_mle_inuse fs/ocfs2/dlm/dlmmaster.c, 219: spin_lock in dlm_put_mle_inuse To fix this bug, GFP_NOFS is replaced with GFP_ATOMIC. This bug is found by my static analysis tool DSAC. Link: http://lkml.kernel.org/r/20180901112528.27025-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28ceph: fix dentry leak in ceph_readdir_prepopulateYan, Zheng
[ Upstream commit c58f450bd61511d897efc2ea472c69630635b557 ] Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28btrfs: handle error of get_old_rootNikolay Borisov
[ Upstream commit 315bed43fea532650933e7bba316a7601d439edf ] In btrfs_search_old_slot get_old_root is always used with the assumption it cannot fail. However, this is not true in rare circumstance it can fail and return null. This will lead to null point dereference when the header is read. Fix this by checking the return value and properly handling NULL by setting ret to -EIO and returning gracefully. Coverity-id: 1087503 Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28gfs2: Fix marking bitmaps non-fullAndreas Gruenbacher
[ Upstream commit ec23df2b0cf3e1620f5db77972b7fb735f267eff ] Reservations in gfs can span multiple gfs2_bitmaps (but they won't span multiple resource groups). When removing a reservation, we want to clear the GBF_FULL flags of all involved gfs2_bitmaps, not just that of the first bitmap. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-28Revert "fs: ocfs2: fix possible null-pointer dereferences in ↵Joseph Qi
ocfs2_xa_prepare_entry()" commit 94b07b6f9e2e996afff7395de6b35f34f4cb10bf upstream. This reverts commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d. Commit 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") introduces a regression that fail to create directory with mount option user_xattr and acl. Actually the reported NULL pointer dereference case can be correctly handled by loc->xl_ops->xlo_add_entry(), so revert it. Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com Fixes: 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Thomas Voegtle <tv@lio96.de> Acked-by: Changwei Ge <gechangwei@live.cn> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-25GFS2: Flush the GFS2 delete workqueue before stopping the kernel threadsTim Smith
[ Upstream commit 1eb8d7387908022951792a46fa040ad3942b3b08 ] Flushing the workqueue can cause operations to happen which might call gfs2_log_reserve(), or get stuck waiting for locks taken by such operations. gfs2_log_reserve() can io_schedule(). If this happens, it will never wake because the only thing which can wake it is gfs2_logd() which was already stopped. This causes umount of a gfs2 filesystem to wedge permanently if, for example, the umount immediately follows a large delete operation. When this occured, the following stack trace was obtained from the umount command [<ffffffff81087968>] flush_workqueue+0x1c8/0x520 [<ffffffffa0666e29>] gfs2_make_fs_ro+0x69/0x160 [gfs2] [<ffffffffa0667279>] gfs2_put_super+0xa9/0x1c0 [gfs2] [<ffffffff811b7edf>] generic_shutdown_super+0x6f/0x100 [<ffffffff811b7ff7>] kill_block_super+0x27/0x70 [<ffffffffa0656a71>] gfs2_kill_sb+0x71/0x80 [gfs2] [<ffffffff811b792b>] deactivate_locked_super+0x3b/0x70 [<ffffffff811b79b9>] deactivate_super+0x59/0x60 [<ffffffff811d2998>] cleanup_mnt+0x58/0x80 [<ffffffff811d2a12>] __cleanup_mnt+0x12/0x20 [<ffffffff8108c87d>] task_work_run+0x7d/0xa0 [<ffffffff8106d7d9>] exit_to_usermode_loop+0x73/0x98 [<ffffffff81003961>] syscall_return_slowpath+0x41/0x50 [<ffffffff815a594c>] int_ret_from_sys_call+0x25/0x8f [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Tim Smith <tim.smith@citrix.com> Signed-off-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25proc/vmcore: Fix i386 build error of missing copy_oldmem_page_encrypted()Borislav Petkov
[ Upstream commit cf089611f4c446285046fcd426d90c18f37d2905 ] Lianbo reported a build error with a particular 32-bit config, see Link below for details. Provide a weak copy_oldmem_page_encrypted() function which architectures can override, in the same manner other functionality in that file is supplied. Reported-by: Lianbo Jiang <lijiang@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: x86@kernel.org Link: http://lkml.kernel.org/r/710b9d95-2f70-eadf-c4a1-c3dc80ee4ebb@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25NFSv4.x: fix lock recovery during delegation recallOlga Kornievskaia
[ Upstream commit 44f411c353bf6d98d5a34f8f1b8605d43b2e50b8 ] Running "./nfstest_delegation --runtest recall26" uncovers that client doesn't recover the lock when we have an appending open, where the initial open got a write delegation. Instead of checking for the passed in open context against the file lock's open context. Check that the state is the same. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25f2fs: return correct errno in f2fs_gcJaegeuk Kim
[ Upstream commit 61f7725aa148ee870436a29d3a24d5c00ab7e9af ] This fixes overriding error number in f2fs_gc. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25fuse: use READ_ONCE on congestion_threshold and max_backgroundKirill Tkhai
[ Upstream commit 2a23f2b8adbe4bd584f936f7ac17a99750eed9d7 ] Since they are of unsigned int type, it's allowed to read them unlocked during reporting to userspace. Let's underline this fact with READ_ONCE() macroses. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25kernfs: Fix range checks in kernfs_get_target_pathBernd Edlinger
[ Upstream commit a75e78f21f9ad4b810868c89dbbabcc3931591ca ] The terminating NUL byte is only there because the buffer is allocated with kzalloc(PAGE_SIZE, GFP_KERNEL), but since the range-check is off-by-one, and PAGE_SIZE==PATH_MAX, the returned string may not be zero-terminated if it is exactly PATH_MAX characters long. Furthermore also the initial loop may theoretically exceed PATH_MAX and cause a fault. Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updatedBob Peterson
[ Upstream commit 4f36cb36c9d14340bb200d2ad9117b03ce992cfe ] The GFS2_RDF_UPTODATE flag in the rgrp is used to determine when a rgrp buffer is valid. It's cleared when the glock is invalidated, signifying that the buffer data is now invalid. But before this patch, function update_rgrp_lvb was setting the flag when it determined it had a valid lvb. But that's an invalid assumption: just because you have a valid lvb doesn't mean you have valid buffers. After all, another node may have made the lvb valid, and this node just fetched it from the glock via dlm. Consider this scenario: 1. The file system is mounted with RGRPLVB option. 2. In gfs2_inplace_reserve it locks the rgrp glock EX, but thanks to GL_SKIP, it skips the gfs2_rgrp_bh_get. 3. Since loops == 0 and the allocation target (ap->target) is bigger than the largest known chunk of blocks in the rgrp (rs->rs_rbm.rgd->rd_extfail_pt) it skips that rgrp and bypasses the call to gfs2_rgrp_bh_get there as well. 4. update_rgrp_lvb sees the lvb MAGIC number is valid, so bypasses gfs2_rgrp_bh_get, but it still sets sets GFS2_RDF_UPTODATE due to this invalid assumption. 5. The next time update_rgrp_lvb is called, it sees the bit is set and just returns 0, assuming both the lvb and rgrp are both uptodate. But since this is a smaller allocation, or space has been freed by another node, thus adjusting the lvb values, it decides to use the rgrp for allocations, with invalid rd_free due to the fact it was never updated. This patch changes update_rgrp_lvb so it doesn't set the UPTODATE flag anymore. That way, it has no choice but to fetch the latest values. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable eitherAl Viro
commit 762c69685ff7ad5ad7fee0656671e20a0c9c864d upstream. We need to get the underlying dentry of parent; sure, absent the races it is the parent of underlying dentry, but there's nothing to prevent losing a timeslice to preemtion in the middle of evaluation of lower_dentry->d_parent->d_inode, having another process move lower_dentry around and have its (ex)parent not pinned anymore and freed on memory pressure. Then we regain CPU and try to fetch ->d_inode from memory that is freed by that point. dentry->d_parent *is* stable here - it's an argument of ->lookup() and we are guaranteed that it won't be moved anywhere until we feed it to d_add/d_splice_alias. So we safely go that way to get to its underlying dentry. Cc: stable@vger.kernel.org # since 2009 or so Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-25ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stableAl Viro
commit e72b9dd6a5f17d0fb51f16f8685f3004361e83d0 upstream. lower_dentry can't go from positive to negative (we have it pinned), but it *can* go from negative to positive. So fetching ->d_inode into a local variable, doing a blocking allocation, checking that now ->d_inode is non-NULL and feeding the value we'd fetched earlier to a function that won't accept NULL is not a good idea. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is deadTejun Heo
commit 65de03e251382306a4575b1779c57c87889eee49 upstream. cgroup writeback tries to refresh the associated wb immediately if the current wb is dead. This is to avoid keeping issuing IOs on the stale wb after memcg - blkcg association has changed (ie. when blkcg got disabled / enabled higher up in the hierarchy). Unfortunately, the logic gets triggered spuriously on inodes which are associated with dead cgroups. When the logic is triggered on dead cgroups, the attempt fails only after doing quite a bit of work allocating and initializing a new wb. While c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if mapping has no dirty pages") alleviated the issue significantly as it now only triggers when the inode has dirty pages. However, the condition can still be triggered before the inode is switched to a different cgroup and the logic simply doesn't make sense. Skip the immediate switching if the associated memcg is dying. This is a simplified version of the following two patches: * https://lore.kernel.org/linux-mm/20190513183053.GA73423@dennisz-mbp/ * http://lkml.kernel.org/r/156355839560.2063.5265687291430814589.stgit@buzz Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: e8a7abf5a5bd ("writeback: disassociate inodes from dying bdi_writebacks") Acked-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12NFSv4: Don't allow a cached open with a revoked delegationTrond Myklebust
[ Upstream commit be3df3dd4c70ee020587a943a31b98a0fb4b6424 ] If the delegation is marked as being revoked, we must not use it for cached opens. Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12configfs: fix a deadlock in configfs_symlink()Al Viro
commit 351e5d869e5ac10cb40c78b5f2d7dfc816ad4587 upstream. Configfs abuses symlink(2). Unlike the normal filesystems, it wants the target resolved at symlink(2) time, like link(2) would've done. The problem is that ->symlink() is called with the parent directory locked exclusive, so resolving the target inside the ->symlink() is easily deadlocked. Short of really ugly games in sys_symlink() itself, all we can do is to unlock the parent before resolving the target and relock it after. However, that invalidates the checks done by the caller of ->symlink(), so we have to * check that dentry is still where it used to be (it couldn't have been moved, but it could've been unhashed) * recheck that it's still negative (somebody else might've successfully created a symlink with the same name while we were looking the target up) * recheck the permissions on the parent directory. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12ceph: fix use-after-free in __ceph_remove_cap()Luis Henriques
commit ea60ed6fcf29eebc78f2ce91491e6309ee005a01 upstream. KASAN reports a use-after-free when running xfstest generic/531, with the following trace: [ 293.903362] kasan_report+0xe/0x20 [ 293.903365] rb_erase+0x1f/0x790 [ 293.903370] __ceph_remove_cap+0x201/0x370 [ 293.903375] __ceph_remove_caps+0x4b/0x70 [ 293.903380] ceph_evict_inode+0x4e/0x360 [ 293.903386] evict+0x169/0x290 [ 293.903390] __dentry_kill+0x16f/0x250 [ 293.903394] dput+0x1c6/0x440 [ 293.903398] __fput+0x184/0x330 [ 293.903404] task_work_run+0xb9/0xe0 [ 293.903410] exit_to_usermode_loop+0xd3/0xe0 [ 293.903413] do_syscall_64+0x1a0/0x1c0 [ 293.903417] entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happens because __ceph_remove_cap() may queue a cap release (__ceph_queue_cap_release) which can be scheduled before that cap is removed from the inode list with rb_erase(&cap->ci_node, &ci->i_caps); And, when this finally happens, the use-after-free will occur. This can be fixed by removing the cap from the inode list before being removed from the session list, and thus eliminating the risk of an UAF. Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10fs/dcache: move security_d_instantiate() behind attaching dentry to inodezhangyi (F)
During backport 1e2e547a93a "do d_instantiate/unlock_new_inode combinations safely", there was a error instantiating sequence of attaching dentry to inode and calling security_d_instantiate(). Before commit ce23e640133 "->getxattr(): pass dentry and inode as separate arguments" and b96809173e9 "security_d_instantiate(): move to the point prior to attaching dentry to inode", security_d_instantiate() should be called beind __d_instantiate(), otherwise it will trigger below problem when CONFIG_SECURITY_SMACK on ext4 was enabled because d_inode(dentry) used by ->getxattr() is NULL before __d_instantiate() instantiate inode. [ 31.858026] BUG: unable to handle kernel paging request at ffffffffffffff70 ... [ 31.882024] Call Trace: [ 31.882378] [<ffffffffa347f75c>] ext4_xattr_get+0x8c/0x3e0 [ 31.883195] [<ffffffffa3489454>] ext4_xattr_security_get+0x24/0x40 [ 31.884086] [<ffffffffa336a56b>] generic_getxattr+0x5b/0x90 [ 31.884907] [<ffffffffa3700514>] smk_fetch+0xb4/0x150 [ 31.885634] [<ffffffffa3700772>] smack_d_instantiate+0x1c2/0x550 [ 31.886508] [<ffffffffa36f9a5a>] security_d_instantiate+0x3a/0x80 [ 31.887389] [<ffffffffa3353b26>] d_instantiate_new+0x36/0x130 [ 31.888223] [<ffffffffa342b1ef>] ext4_mkdir+0x4af/0x6a0 [ 31.888928] [<ffffffffa3343470>] vfs_mkdir+0x100/0x280 [ 31.889536] [<ffffffffa334b086>] SyS_mkdir+0xb6/0x170 [ 31.890255] [<ffffffffa307c855>] ? trace_do_page_fault+0x95/0x2b0 [ 31.891134] [<ffffffffa3c5e078>] entry_SYSCALL_64_fastpath+0x18/0x73 Cc: <stable@vger.kernel.org> # 3.16, 4.4 Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occursDave Wysochanski
[ Upstream commit d46b0da7a33dd8c99d969834f682267a45444ab3 ] There's a deadlock that is possible and can easily be seen with a test where multiple readers open/read/close of the same file and a disruption occurs causing reconnect. The deadlock is due a reader thread inside cifs_strict_readv calling down_read and obtaining lock_sem, and then after reconnect inside cifs_reopen_file calling down_read a second time. If in between the two down_read calls, a down_write comes from another process, deadlock occurs. CPU0 CPU1 ---- ---- cifs_strict_readv() down_read(&cifsi->lock_sem); _cifsFileInfo_put OR cifs_new_fileinfo down_write(&cifsi->lock_sem); cifs_reopen_file() down_read(&cifsi->lock_sem); Fix the above by changing all down_write(lock_sem) calls to down_write_trylock(lock_sem)/msleep() loop, which in turn makes the second down_read call benign since it will never block behind the writer while holding lock_sem. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed--by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06xfs: Correctly invert xfs_buftarg LRU isolation logicVratislav Bendel
commit 19957a181608d25c8f4136652d0ea00b3738972d upstream. Due to an inverted logic mistake in xfs_buftarg_isolate() the xfs_buffers with zero b_lru_ref will take another trip around LRU, while isolating buffers with non-zero b_lru_ref. Additionally those isolated buffers end up right back on the LRU once they are released, because b_lru_ref remains elevated. Fix that circuitous route by leaving them on the LRU as originally intended. Signed-off-by: Vratislav Bendel <vbendel@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06fuse: truncate pending writes on O_TRUNCMiklos Szeredi
commit e4648309b85a78f8c787457832269a8712a8673e upstream. Make sure cached writes are not reordered around open(..., O_TRUNC), with the obvious wrong results. Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06fuse: flush dirty data/metadata before non-truncate setattrMiklos Szeredi
commit b24e7598db62386a95a3c8b9c75630c5d56fe077 upstream. If writeback cache is enabled, then writes might get reordered with chmod/chown/utimes. The problem with this is that performing the write in the fuse daemon might itself change some of these attributes. In such case the following sequence of operations will result in file ending up with the wrong mode, for example: int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL); write (fd, "1", 1); fchown (fd, 0, 0); fchmod (fd, 04755); close (fd); This patch fixes this by flushing pending writes before performing chown/chmod/utimes. Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06NFSv4: Fix leak of clp->cl_acceptor stringChuck Lever
[ Upstream commit 1047ec868332034d1fbcb2fae19fe6d4cb869ff2 ] Our client can issue multiple SETCLIENTID operations to the same server in some circumstances. Ensure that calls to nfs4_proc_setclientid() after the first one do not overwrite the previously allocated cl_acceptor string. unreferenced object 0xffff888461031800 (size 32): comm "mount.nfs", pid 2227, jiffies 4294822467 (age 1407.749s) hex dump (first 32 bytes): 6e 66 73 40 6b 6c 69 6d 74 2e 69 62 2e 31 30 31 nfs@klimt.ib.101 35 67 72 61 6e 67 65 72 2e 6e 65 74 00 00 00 00 5granger.net.... backtrace: [<00000000ab820188>] __kmalloc+0x128/0x176 [<00000000eeaf4ec8>] gss_stringify_acceptor+0xbd/0x1a7 [auth_rpcgss] [<00000000e85e3382>] nfs4_proc_setclientid+0x34e/0x46c [nfsv4] [<000000003d9cf1fa>] nfs40_discover_server_trunking+0x7a/0xed [nfsv4] [<00000000b81c3787>] nfs4_discover_server_trunking+0x81/0x244 [nfsv4] [<000000000801b55f>] nfs4_init_client+0x1b0/0x238 [nfsv4] [<00000000977daf7f>] nfs4_set_client+0xfe/0x14d [nfsv4] [<0000000053a68a2a>] nfs4_create_server+0x107/0x1db [nfsv4] [<0000000088262019>] nfs4_remote_mount+0x2c/0x59 [nfsv4] [<00000000e84a2fd0>] legacy_get_tree+0x2d/0x4c [<00000000797e947c>] vfs_get_tree+0x20/0xc7 [<00000000ecabaaa8>] fc_mount+0xe/0x36 [<00000000f15fafc2>] vfs_kern_mount+0x74/0x8d [<00000000a3ff4e26>] nfs_do_root_mount+0x8a/0xa3 [nfsv4] [<00000000d1c2b337>] nfs4_try_mount+0x58/0xad [nfsv4] [<000000004c9bddee>] nfs_fs_mount+0x820/0x869 [nfs] Fixes: f11b2a1cfbf5 ("nfs4: copy acceptor name from context ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06fs: ocfs2: fix a possible null-pointer dereference in ↵Jia-Ju Bai
ocfs2_info_scan_inode_alloc() [ Upstream commit 2abb7d3b12d007c30193f48bebed781009bebdd2 ] In ocfs2_info_scan_inode_alloc(), there is an if statement on line 283 to check whether inode_alloc is NULL: if (inode_alloc) When inode_alloc is NULL, it is used on line 287: ocfs2_inode_lock(inode_alloc, &bh, 0); ocfs2_inode_lock_full_nested(inode, ...) struct ocfs2_super *osb = OCFS2_SB(inode->i_sb); Thus, a possible null-pointer dereference may occur. To fix this bug, inode_alloc is checked on line 286. This bug is found by a static analysis tool STCheck written by us. Link: http://lkml.kernel.org/r/20190726033717.32359-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()Jia-Ju Bai
[ Upstream commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d ] In ocfs2_xa_prepare_entry(), there is an if statement on line 2136 to check whether loc->xl_entry is NULL: if (loc->xl_entry) When loc->xl_entry is NULL, it is used on line 2158: ocfs2_xa_add_entry(loc, name_hash); loc->xl_entry->xe_name_hash = cpu_to_le32(name_hash); loc->xl_entry->xe_name_offset = cpu_to_le16(loc->xl_size); and line 2164: ocfs2_xa_add_namevalue(loc, xi); loc->xl_entry->xe_value_size = cpu_to_le64(xi->xi_value_len); loc->xl_entry->xe_name_len = xi->xi_name_len; Thus, possible null-pointer dereferences may occur. To fix these bugs, if loc-xl_entry is NULL, ocfs2_xa_prepare_entry() abnormally returns with -EINVAL. These bugs are found by a static analysis tool STCheck written by us. [akpm@linux-foundation.org: remove now-unused ocfs2_xa_add_entry()] Link: http://lkml.kernel.org/r/20190726101447.9153-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06fs: cifs: mute -Wunused-const-variable messageAustin Kim
[ Upstream commit dd19c106a36690b47bb1acc68372f2b472b495b8 ] After 'Initial git repository build' commit, 'mapping_table_ERRHRD' variable has not been used. So 'mapping_table_ERRHRD' const variable could be removed to mute below warning message: fs/cifs/netmisc.c:120:40: warning: unused variable 'mapping_table_ERRHRD' [-Wunused-const-variable] static const struct smb_to_posix_error mapping_table_ERRHRD[] = { ^ Signed-off-by: Austin Kim <austindh.kim@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06exec: load_script: Do not exec truncated interpreter pathKees Cook
[ Upstream commit b5372fe5dc84235dbe04998efdede3c4daa866a9 ] Commit 8099b047ecc4 ("exec: load_script: don't blindly truncate shebang string") was trying to protect against a confused exec of a truncated interpreter path. However, it was overeager and also refused to truncate arguments as well, which broke userspace, and it was reverted. This attempts the protection again, but allows arguments to remain truncated. In an effort to improve readability, helper functions and comments have been added. Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Samuel Dionne-Riel <samuel@dionne-riel.com> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: Graham Christensen <graham@grahamc.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-29btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group()Qu Wenruo
commit 4b654acdae850f48b8250b9a578a4eaa518c7a6f upstream. In btrfs_read_block_groups(), if we have an invalid block group which has mixed type (DATA|METADATA) while the fs doesn't have MIXED_GROUPS feature, we error out without freeing the block group cache. This patch will add the missing btrfs_put_block_group() to prevent memory leak. Note for stable backports: the file to patch in versions <= 5.3 is fs/btrfs/extent-tree.c Fixes: 49303381f19a ("Btrfs: bail out if block group has different mixed flag") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29CIFS: avoid using MID 0xFFFFRoberto Bergantinos Corpas
commit 03d9a9fe3f3aec508e485dd3dcfa1e99933b4bdb upstream. According to MS-CIFS specification MID 0xFFFF should not be used by the CIFS client, but we actually do. Besides, this has proven to cause races leading to oops between SendReceive2/cifs_demultiplex_thread. On SMB1, MID is a 2 byte value easy to reach in CurrentMid which may conflict with an oplock break notification request coming from server Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17xfs: clear sb->s_fs_info on mount failureDave Chinner
commit c9fbd7bbc23dbdd73364be4d045e5d3612cf6e82 upstream. We recently had an oops reported on a 4.14 kernel in xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage and so the m_perag_tree lookup walked into lala land. Essentially, the machine was under memory pressure when the mount was being run, xfs_fs_fill_super() failed after allocating the xfs_mount and attaching it to sb->s_fs_info. It then cleaned up and freed the xfs_mount, but the sb->s_fs_info field still pointed to the freed memory. Hence when the superblock shrinker then ran it fell off the bad pointer. With the superblock shrinker problem fixed at teh VFS level, this stale s_fs_info pointer is still a problem - we use it unconditionally in ->put_super when the superblock is being torn down, and hence we can still trip over it after a ->fill_super call failure. Hence we need to clear s_fs_info if xfs-fs_fill_super() fails, and we need to check if it's valid in the places it can potentially be dereferenced after a ->fill_super failure. Signed-Off-By: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Ajay Kaher <akaher@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17CIFS: Force revalidate inode when dentry is stalePavel Shilovsky
[ Upstream commit c82e5ac7fe3570a269c0929bf7899f62048e7dbc ] Currently the client indicates that a dentry is stale when inode numbers or type types between a local inode and a remote file don't match. If this is the case attributes is not being copied from remote to local, so, it is already known that the local copy has stale metadata. That's why the inode needs to be marked for revalidation in order to tell the VFS to lookup the dentry again before openning a file. This prevents unexpected stale errors to be returned to the user space when openning a file. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-17cifs: Check uniqueid for SMB2+ and return -ESTALE if necessaryRoss Lagerwall
[ Upstream commit a108471b5730b52017e73b58c9f486319d2ac308 ] Commit 7196ac113a4f ("Fix to check Unique id and FileType when client refer file directly.") checks whether the uniqueid of an inode has changed when getting the inode info, but only when using the UNIX extensions. Add a similar check for SMB2+, since this can be done without an extra network roundtrip. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-17CIFS: Force reval dentry if LOOKUP_REVAL flag is setPavel Shilovsky
commit 0b3d0ef9840f7be202393ca9116b857f6f793715 upstream. Mark inode for force revalidation if LOOKUP_REVAL flag is set. This tells the client to actually send a QueryInfo request to the server to obtain the latest metadata in case a directory or a file were changed remotely. Only do that if the client doesn't have a lease for the file to avoid unneeded round trips to the server. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17CIFS: Gracefully handle QueryInfo errors during openPavel Shilovsky
commit 30573a82fb179420b8aac30a3a3595aa96a93156 upstream. Currently if the client identifies problems when processing metadata returned in CREATE response, the open handle is being leaked. This causes multiple problems like a file missing a lease break by that client which causes high latencies to other clients accessing the file. Another side-effect of this is that the file can't be deleted. Fix this by closing the file after the client hits an error after the file was opened and the open descriptor wasn't returned to the user space. Also convert -ESTALE to -EOPENSTALE to allow the VFS to revalidate a dentry and retry the open. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-17fuse: fix memleak in cuse_channel_openzhengbin
[ Upstream commit 9ad09b1976c562061636ff1e01bfc3a57aebe56b ] If cuse_send_init fails, need to fuse_conn_put cc->fc. cuse_channel_open->fuse_conn_init->refcount_set(&fc->count, 1) ->fuse_dev_alloc->fuse_conn_get ->fuse_dev_free->fuse_conn_put Fixes: cc080e9e9be1 ("fuse: introduce per-instance fuse_dev structure") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-17ceph: fix directories inode i_blkbits initializationLuis Henriques
[ Upstream commit 750670341a24cb714e624e0fd7da30900ad93752 ] When filling an inode with info from the MDS, i_blkbits is being initialized using fl_stripe_unit, which contains the stripe unit in bytes. Unfortunately, this doesn't make sense for directories as they have fl_stripe_unit set to '0'. This means that i_blkbits will be set to 0xff, causing an UBSAN undefined behaviour in i_blocksize(): UBSAN: Undefined behaviour in ./include/linux/fs.h:731:12 shift exponent 255 is too large for 32-bit type 'int' Fix this by initializing i_blkbits to CEPH_BLOCK_SHIFT if fl_stripe_unit is zero. Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-179p: avoid attaching writeback_fid on mmap with type PRIVATEChengguang Xu
[ Upstream commit c87a37ebd40b889178664c2c09cc187334146292 ] Currently on mmap cache policy, we always attach writeback_fid whether mmap type is SHARED or PRIVATE. However, in the use case of kata-container which combines 9p(Guest OS) with overlayfs(Host OS), this behavior will trigger overlayfs' copy-up when excute command inside container. Link: http://lkml.kernel.org/r/20190820100325.10313-1-cgxu519@zoho.com.cn Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Sasha Levin <sashal@kernel.org>