summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-25kcov: update pos before writing pc in trace functionCongyu Liu
In __sanitizer_cov_trace_pc(), previously we write pc before updating pos. However, some early interrupt code could bypass check_kcov_mode() check and invoke __sanitizer_cov_trace_pc(). If such interrupt is raised between writing pc and updating pos, the pc could be overitten by the recursive __sanitizer_cov_trace_pc(). As suggested by Dmitry, we cold update pos before writing pc to avoid such interleaving. Apply the same change to write_comp_data(). Link: https://lkml.kernel.org/r/20220523053531.1572793-1-liu3101@purdue.edu Signed-off-by: Congyu Liu <liu3101@purdue.edu> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25ocfs2: dlmfs: fix error handling of user_dlm_destroy_lockJunxiao Bi via Ocfs2-devel
When user_dlm_destroy_lock failed, it didn't clean up the flags it set before exit. For USER_LOCK_IN_TEARDOWN, if this function fails because of lock is still in used, next time when unlink invokes this function, it will return succeed, and then unlink will remove inode and dentry if lock is not in used(file closed), but the dlm lock is still linked in dlm lock resource, then when bast come in, it will trigger a panic due to user-after-free. See the following panic call trace. To fix this, USER_LOCK_IN_TEARDOWN should be reverted if fail. And also error should be returned if USER_LOCK_IN_TEARDOWN is set to let user know that unlink fail. For the case of ocfs2_dlm_unlock failure, besides USER_LOCK_IN_TEARDOWN, USER_LOCK_BUSY is also required to be cleared. Even though spin lock is released in between, but USER_LOCK_IN_TEARDOWN is still set, for USER_LOCK_BUSY, if before every place that waits on this flag, USER_LOCK_IN_TEARDOWN is checked to bail out, that will make sure no flow waits on the busy flag set by user_dlm_destroy_lock(), then we can simplely revert USER_LOCK_BUSY when ocfs2_dlm_unlock fails. Fix user_dlm_cluster_lock() which is the only function not following this. [ 941.336392] (python,26174,16):dlmfs_unlink:562 ERROR: unlink 004fb0000060000b5a90b8c847b72e1, error -16 from destroy [ 989.757536] ------------[ cut here ]------------ [ 989.757709] kernel BUG at fs/ocfs2/dlmfs/userdlm.c:173! [ 989.757876] invalid opcode: 0000 [#1] SMP [ 989.758027] Modules linked in: ksplice_2zhuk2jr_ib_ipoib_new(O) ksplice_2zhuk2jr(O) mptctl mptbase xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn cdc_ether usbnet mii ocfs2 jbd2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfsv3 nfs_acl nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc scsi_transport_fc sunrpc ipmi_devintf bridge stp llc rds_rdma rds bonding ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE) mlx4_vnic falcon_kal(E) falcon_lsm_pinned_13402(E) mlx4_ib ib_sa ib_mad ib_core ib_addr xenfs xen_privcmd dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 lpc_ich mfd_core ipmi_ssif i2c_core ipmi_si ipmi_msghandler [ 989.760686] ioatdma sg ext3 jbd mbcache sd_mod ahci libahci ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mlx4_core crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ksplice_2zhuk2jr_ib_ipoib_old] [ 989.761987] CPU: 10 PID: 19102 Comm: dlm_thread Tainted: P OE 4.1.12-124.57.1.el6uek.x86_64 #2 [ 989.762290] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30350100 06/17/2021 [ 989.762599] task: ffff880178af6200 ti: ffff88017f7c8000 task.ti: ffff88017f7c8000 [ 989.762848] RIP: e030:[<ffffffffc07d4316>] [<ffffffffc07d4316>] __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs] [ 989.763185] RSP: e02b:ffff88017f7cbcb8 EFLAGS: 00010246 [ 989.763353] RAX: 0000000000000000 RBX: ffff880174d48008 RCX: 0000000000000003 [ 989.763565] RDX: 0000000000120012 RSI: 0000000000000003 RDI: ffff880174d48170 [ 989.763778] RBP: ffff88017f7cbcc8 R08: ffff88021f4293b0 R09: 0000000000000000 [ 989.763991] R10: ffff880179c8c000 R11: 0000000000000003 R12: ffff880174d48008 [ 989.764204] R13: 0000000000000003 R14: ffff880179c8c000 R15: ffff88021db7a000 [ 989.764422] FS: 0000000000000000(0000) GS:ffff880247480000(0000) knlGS:ffff880247480000 [ 989.764685] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 989.764865] CR2: ffff8000007f6800 CR3: 0000000001ae0000 CR4: 0000000000042660 [ 989.765081] Stack: [ 989.765167] 0000000000000003 ffff880174d48040 ffff88017f7cbd18 ffffffffc07d455f [ 989.765442] ffff88017f7cbd88 ffffffff816fb639 ffff88017f7cbd38 ffff8800361b5600 [ 989.765717] ffff88021db7a000 ffff88021f429380 0000000000000003 ffffffffc0453020 [ 989.765991] Call Trace: [ 989.766093] [<ffffffffc07d455f>] user_bast+0x5f/0xf0 [ocfs2_dlmfs] [ 989.766287] [<ffffffff816fb639>] ? schedule_timeout+0x169/0x2d0 [ 989.766475] [<ffffffffc0453020>] ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb] [ 989.766738] [<ffffffffc045303a>] o2dlm_blocking_ast_wrapper+0x1a/0x20 [ocfs2_stack_o2cb] [ 989.767010] [<ffffffffc0864ec6>] dlm_do_local_bast+0x46/0xe0 [ocfs2_dlm] [ 989.767217] [<ffffffffc084f5cc>] ? dlm_lockres_calc_usage+0x4c/0x60 [ocfs2_dlm] [ 989.767466] [<ffffffffc08501f1>] dlm_thread+0xa31/0x1140 [ocfs2_dlm] [ 989.767662] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.767834] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768006] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768178] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768349] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768521] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.768693] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.768893] [<ffffffff816f78ce>] ? __schedule+0x23e/0x810 [ 989.769067] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.769241] [<ffffffff810ce4d0>] ? wait_woken+0x90/0x90 [ 989.769411] [<ffffffffc084f7c0>] ? dlm_kick_thread+0x80/0x80 [ocfs2_dlm] [ 989.769617] [<ffffffff810a8bbb>] kthread+0xcb/0xf0 [ 989.769774] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.769945] [<ffffffff816f78da>] ? __schedule+0x24a/0x810 [ 989.770117] [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180 [ 989.770321] [<ffffffff816fdaa1>] ret_from_fork+0x61/0x90 [ 989.770492] [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180 [ 989.770689] Code: d0 00 00 00 f0 45 7d c0 bf 00 20 00 00 48 89 83 c0 00 00 00 48 89 83 c8 00 00 00 e8 55 c1 8c c0 83 4b 04 10 48 83 c4 08 5b 5d c3 <0f> 0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83 [ 989.771892] RIP [<ffffffffc07d4316>] __user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs] [ 989.772174] RSP <ffff88017f7cbcb8> [ 989.772704] ---[ end trace ebd1e38cebcc93a8 ]--- [ 989.772907] Kernel panic - not syncing: Fatal exception [ 989.773173] Kernel Offset: disabled Link: https://lkml.kernel.org/r/20220518235224.87100-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lockJunxiao Bi
The following function is the only place that checks USER_LOCK_ATTACHED. This flag is set when lock request is granted through user_ast() and only the following function will clear it. Checking of this flag here is to make sure ocfs2_dlm_unlock is not issued if this lock is never granted. For example, lock file is created and then get removed, open file never happens. Clearing the flag here is not necessary because this is the only function that checks it, if another flow is executing user_dlm_destroy_lock(), it will bail out at the beginning because of USER_LOCK_IN_TEARDOWN and never check USER_LOCK_ATTACHED. Drop the clear, so we don't need take care of it for the following error handling patch. int user_dlm_destroy_lock(struct user_lock_res *lockres) { ... status = 0; if (!(lockres->l_flags & USER_LOCK_ATTACHED)) { spin_unlock(&lockres->l_lock); goto bail; } lockres->l_flags &= ~USER_LOCK_ATTACHED; lockres->l_flags |= USER_LOCK_BUSY; spin_unlock(&lockres->l_lock); status = ocfs2_dlm_unlock(conn, &lockres->l_lksb, DLM_LKF_VALBLK); if (status) { user_log_dlm_error("ocfs2_dlm_unlock", status, lockres); goto bail; } ... } V1 discussion with Joseph: https://lore.kernel.org/all/7b620c53-0c45-da2c-829e-26195cbe7d4e@linux.alibaba.com/T/ Link: https://lkml.kernel.org/r/20220518235224.87100-1-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fs/ntfs: remove redundant variable idxColin Ian King
The variable idx is assigned a value and is never read. The variable is not used and is redundant, remove it. Cleans up clang scan build warning: warning: Although the value stored to 'idx' is used in the enclosing expression, the value is never actually read from 'idx' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220517093646.93628-2-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: remove time truncations in vfat_create/vfat_mkdirChung-Chiang Cheng
All the timestamps in vfat_create() and vfat_mkdir() come from fat_time_fat2unix() which ensures time granularity. We don't need to truncate them to fit FAT's format. Moreover, fat_truncate_crtime() and fat_timespec64_trunc_10ms() are also removed because there is no caller anymore. Link: https://lkml.kernel.org/r/20220503152536.2503003-4-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: report creation time in statxChung-Chiang Cheng
creation time is no longer mixed with change time. Add an in-memory field for it, and report it in statx if supported. Link: https://lkml.kernel.org/r/20220503152536.2503003-3-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: ignore ctime updates, and keep ctime identical to mtime in memoryChung-Chiang Cheng
FAT supports creation time but not change time, and there was no corresponding timestamp for creation time in previous VFS. The original implementation took the compromise of saving the in-memory change time into the on-disk creation time field, but this would lead to compatibility issues with non-linux systems. To address this issue, this patch changes the behavior of ctime. It will no longer be loaded and stored from the creation time on disk. Instead of that, it'll be consistent with the in-memory mtime and share the same on-disk field. All updates to mtime will also be applied to ctime in memory, while all updates to ctime will be ignored. Link: https://lkml.kernel.org/r/20220503152536.2503003-2-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: split fat_truncate_time() into separate functionsChung-Chiang Cheng
Separate fat_truncate_time() to each timestamps for later creation time work. This patch does not introduce any functional changes, it's merely refactoring change. Link: https://lkml.kernel.org/r/20220503152536.2503003-1-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19MAINTAINERS: add Muchun as a memcg reviewerMuchun Song
I have been focusing on mm for the past two years. e.g. developing, fixing bugs, reviewing. I have fixed lots of races (including memcg). I would like to help people working on memcg or related by reviewing their work. Let me be Cc'd on patches related to memcg. Link: https://lkml.kernel.org/r/20220517143320.99649-1-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: FanJun Kong <bh1scw@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13proc/sysctl: make protected_* world readableJulius Hemanth Pitti
protected_* files have 600 permissions which prevents non-superuser from reading them. Container like "AWS greengrass" refuse to launch unless protected_hardlinks and protected_symlinks are set. When containers like these run with "userns-remap" or "--user" mapping container's root to non-superuser on host, they fail to run due to denied read access to these files. As these protections are hardly a secret, and do not possess any security risk, making them world readable. Though above greengrass usecase needs read access to only protected_hardlinks and protected_symlinks files, setting all other protected_* files to 644 to keep consistency. Link: http://lkml.kernel.org/r/20200709235115.56954-1-jpitti@cisco.com Fixes: 800179c9b8a1 ("fs: add link restrictions") Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12ia64: mca: drop redundant spinlock initializationHaowen Bai
mlogbuf_rlock has declared and initialized by DEFINE_SPINLOCK, so we don't need to spin_lock_init again, drop it. Link: https://lkml.kernel.org/r/1652176897-4754-1-git-send-email-baihaowen@meizu.com Signed-off-by: Haowen Bai <baihaowen@meizu.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12tty: fix deadlock caused by calling printk() under tty_port->lockQi Zheng
pty_write() invokes kmalloc() which may invoke a normal printk() to print failure message. This can cause a deadlock in the scenario reported by syz-bot below: CPU0 CPU1 CPU2 ---- ---- ---- lock(console_owner); lock(&port_lock_key); lock(&port->lock); lock(&port_lock_key); lock(&port->lock); lock(console_owner); As commit dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") said, such deadlock can be prevented by using printk_deferred() in kmalloc() (which is invoked in the section guarded by the port->lock). But there are too many printk() on the kmalloc() path, and kmalloc() can be called from anywhere, so changing printk() to printk_deferred() is too complicated and inelegant. Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so that printk() will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error: ====================================================== WARNING: possible circular locking dependency detected 5.4.143-00237-g08ccc19a-dirty #10 Not tainted ------------------------------------------------------ syz-executor.4/29420 is trying to acquire lock: ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline] ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023 but task is already holding lock: ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&port->lock){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock); tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47 serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767 serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key); serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870 serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156 [...] -> #1 (&port_lock_key){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198 <-- lock(&port_lock_key); call_console_drivers kernel/printk/printk.c:1819 [inline] console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504 vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner); vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 register_console+0x8b3/0xc10 kernel/printk/printk.c:2829 univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681 console_init+0x49d/0x6d3 kernel/printk/printk.c:2915 start_kernel+0x5e9/0x879 init/main.c:713 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 -> #0 (console_owner){....}-{0:0}: [...] lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734 console_trylock_spinning kernel/printk/printk.c:1773 [inline] <-- lock(console_owner); vprintk_emit+0x307/0x470 kernel/printk/printk.c:2023 vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x67b/0x7c0 lib/fault-inject.c:144 __should_failslab+0x152/0x1c0 mm/failslab.c:33 should_failslab+0x5/0x10 mm/slab_common.c:1224 slab_pre_alloc_hook mm/slab.h:468 [inline] slab_alloc_node mm/slub.c:2723 [inline] slab_alloc mm/slub.c:2807 [inline] __kmalloc+0x72/0x300 mm/slub.c:3871 kmalloc include/linux/slab.h:582 [inline] tty_buffer_alloc+0x23f/0x2a0 drivers/tty/tty_buffer.c:175 __tty_buffer_request_room+0x156/0x2a0 drivers/tty/tty_buffer.c:273 tty_insert_flip_string_fixed_flag+0x93/0x250 drivers/tty/tty_buffer.c:318 tty_insert_flip_string include/linux/tty_flip.h:37 [inline] pty_write+0x126/0x1f0 drivers/tty/pty.c:122 <-- lock(&port->lock); n_tty_write+0xa7a/0xfc0 drivers/tty/n_tty.c:2356 do_tty_write drivers/tty/tty_io.c:961 [inline] tty_write+0x512/0x930 drivers/tty/tty_io.c:1045 __vfs_write+0x76/0x100 fs/read_write.c:494 [...] other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &port->lock Link: https://lkml.kernel.org/r/20220511061951.1114-2-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220510113809.80626-2-zhengqi.arch@bytedance.com Fixes: b6da31b2c07c ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Jiri Slaby <jirislaby@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12relay: remove redundant assignment to pointer bufColin Ian King
Pointer buf is being assigned a value that is not being read, buf is being re-assigned in the next starement. The assignment is redundant and can be removed. Cleans up clang scan build warning: kernel/relay.c:443:8: warning: Although the value stored to 'buf' is used in the enclosing expression, the value is never actually read from 'buf' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220508212152.58753-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12fs/ntfs3: validate BOOT sectors_per_clustersRandy Dunlap
When the NTFS BOOT sectors_per_clusters field is > 0x80, it represents a shift value. Make sure that the shift value is not too large before using it (NTFS max cluster size is 2MB). Return -EVINVAL if it too large. This prevents negative shift values and shift values that are larger than the field size. Prevents this UBSAN error: UBSAN: shift-out-of-bounds in ../fs/ntfs3/super.c:673:16 shift exponent -192 is negative Link: https://lkml.kernel.org/r/20220502175342.20296-1-rdunlap@infradead.org Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: syzbot+1631f09646bc214d2e76@syzkaller.appspotmail.com Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Kari Argillander <kari.argillander@stargateuniverse.net> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12lib/string_helpers: fix not adding strarray to device's resource listPuyou Lu
Add allocated strarray to device's resource list. This is a must to automatically release strarray when the device disappears. Without this fix we have a memory leak in the few drivers which use devm_kasprintf_strarray(). Link: https://lkml.kernel.org/r/20220506044409.30066-1-puyou.lu@gmail.com Link: https://lkml.kernel.org/r/20220506073623.2679-1-puyou.lu@gmail.com Fixes: acdb89b6c87a ("lib/string_helpers: Introduce managed variant of kasprintf_strarray()") Signed-off-by: Puyou Lu <puyou.lu@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12kernel/crash_core.c: remove redundant check of ck_cmdlinelizhe
At the end of get_last_crashkernel(), the judgement of ck_cmdline is obviously unnecessary and causes redundance, let's clean it up. Link: https://lkml.kernel.org/r/20220506104116.259323-1-sensor1010@163.com Signed-off-by: lizhe <sensor1010@163.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-12ELF, uapi: fixup ELF_ST_TYPE definitionAlexey Dobriyan
This is very theoretical compile failure: ELF_ST_TYPE(st_info = A) Cast will bind first and st_info will stop being lvalue: error: lvalue required as left operand of assignment Given that the only use of this macro is ELF_ST_TYPE(sym->st_info) where st_info is "unsigned char" I've decided to remove cast especially given that companion macro ELF_ST_BIND doesn't use cast. Link: https://lkml.kernel.org/r/Ymv7G1BeX4kt3obz@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()Waiman Long
When running the stress-ng clone benchmark with multiple testing threads, it was found that there were significant spinlock contention in sget_fc(). The contended spinlock was the sb_lock. It is under heavy contention because the following code in the critcal section of sget_fc(): hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) { if (test(old, fc)) goto share_extant_sb; } After testing with added instrumentation code, it was found that the benchmark could generate thousands of ipc namespaces with the corresponding number of entries in the mqueue's fs_supers list where the namespaces are the key for the search. This leads to excessive time in scanning the list for a match. Looking back at the mqueue calling sequence leading to sget_fc(): mq_init_ns() => mq_create_mount() => fc_mount() => vfs_get_tree() => mqueue_get_tree() => get_tree_keyed() => vfs_get_super() => sget_fc() Currently, mq_init_ns() is the only mqueue function that will indirectly call mqueue_get_tree() with a newly allocated ipc namespace as the key for searching. As a result, there will never be a match with the exising ipc namespaces stored in the mqueue's fs_supers list. So using get_tree_keyed() to do an existing ipc namespace search is just a waste of time. Instead, we could use get_tree_nodev() to eliminate the useless search. By doing so, we can greatly reduce the sb_lock hold time and avoid the spinlock contention problem in case a large number of ipc namespaces are present. Of course, if the code is modified in the future to allow mqueue_get_tree() to be called with an existing ipc namespace instead of a new one, we will have to use get_tree_keyed() in this case. The following stress-ng clone benchmark command was run on a 2-socket 48-core Intel system: ./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20 The "bogo ops/s" increased from 5948.45 before patch to 9137.06 after patch. This is an increase of 54% in performance. Link: https://lkml.kernel.org/r/20220121172315.19652-1-longman@redhat.com Fixes: 935c6912b198 ("ipc: Convert mqueue fs to fs_context") Signed-off-by: Waiman Long <longman@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09ipc: update semtimedop() to use hrtimerPrakash Sangappa
semtimedop() should be converted to use hrtimer like it has been done for most of the system calls with timeouts. This system call already takes a struct timespec as an argument and can therefore provide finer granularity timed wait. Link: https://lkml.kernel.org/r/1651187881-2858-1-git-send-email-prakash.sangappa@oracle.com Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09ipc/sem: remove redundant assignmentsMichal Orzel
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220409101933.207157-1-michalorzel.eng@gmail.com Signed-off-by: Michal Orzel <michalorzel.eng@gmail.com> Reviewed-by: Tom Rix <trix@redhat.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09initramfs: support cpio extraction with file checksumsDavid Disseldorp
Add support for extraction of checksum-enabled "070702" cpio archives, specified in Documentation/driver-api/early-userspace/buffer-format.rst. Fail extraction if the calculated file data checksum doesn't match the value carried in the header. Link: https://lkml.kernel.org/r/20220404093429.27570-7-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Martin Wilck <mwilck@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09gen_init_cpio: support file checksum archivingDavid Disseldorp
Documentation/driver-api/early-userspace/buffer-format.rst includes the specification for checksum-enabled cpio archives. Implement support for this format in gen_init_cpio via a new '-c' parameter. Link: https://lkml.kernel.org/r/20220404093429.27570-6-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Martin Wilck <mwilck@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09gen_init_cpio: fix short read file handlingDavid Disseldorp
When processing a "file" entry, gen_init_cpio attempts to allocate a buffer large enough to stage the entire contents of the source file. It then attempts to fill the buffer via a single read() call and subsequently writes out the entire buffer length, without checking that read() returned the full length, potentially writing uninitialized buffer memory. Fix this by breaking up file I/O into 64k chunks and only writing the length returned by the prior read() call. Link: https://lkml.kernel.org/r/20220404093429.27570-5-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09initramfs: add INITRAMFS_PRESERVE_MTIME Kconfig optionDavid Disseldorp
initramfs cpio mtime preservation, as implemented in commit 889d51a10712 ("initramfs: add option to preserve mtime from initramfs cpio images"), uses a linked list to defer directory mtime processing until after all other items in the cpio archive have been processed. This is done to ensure that parent directory mtimes aren't overwritten via subsequent child creation. The lkml link below indicates that the mtime retention use case was for embedded devices with applications running exclusively out of initramfs, where the 32-bit mtime value provided a rough file version identifier. Linux distributions which discard an extracted initramfs immediately after the root filesystem has been mounted may want to avoid the unnecessary overhead. This change adds a new INITRAMFS_PRESERVE_MTIME Kconfig option, which can be used to disable on-by-default mtime retention and in turn speed up initramfs extraction, particularly for cpio archives with large directory counts. Benchmarks with a one million directory cpio archive extracted 20 times demonstrated: mean extraction time (s) std dev INITRAMFS_PRESERVE_MTIME=y 3.808 0.006 INITRAMFS_PRESERVE_MTIME unset 3.056 0.004 The above extraction times were measured using ftrace (initcall_finish - initcall_start) values for populate_rootfs() with initramfs_async disabled. [ddiss@suse.de: rebase atop dir_entry.name flexible array member and drop separate initramfs_mtime.h header] Link: https://lkml.org/lkml/2008/9/3/424 Link: https://lkml.kernel.org/r/20220404093429.27570-4-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09initramfs: make dir_entry.name a flexible array memberDavid Disseldorp
dir_entry.name is currently allocated via a separate kstrdup(). Change it to a flexible array member and allocate it along with struct dir_entry. Link: https://lkml.kernel.org/r/20220404093429.27570-3-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Wilck <mwilck@suse.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09initramfs: refactor do_header() cpio magic checksDavid Disseldorp
Patch series "initramfs: "crc" cpio format and INITRAMFS_PRESERVE_MTIME", v7. This patchset does some minor initramfs refactoring and allows cpio entry mtime preservation to be disabled via a new Kconfig INITRAMFS_PRESERVE_MTIME option. Patches 4/6 to 6/6 implement support for creation and extraction of "crc" cpio archives, which carry file data checksums. Basic tests for this functionality can be found at https://github.com/rapido-linux/rapido/pull/163 This patch (of 6): do_header() is called for each cpio entry and fails if the first six bytes don't match "newc" magic. The magic check includes a special case error message if POSIX.1 ASCII (cpio -H odc) magic is detected. This special case POSIX.1 check can be nested under the "newc" mismatch code path to avoid calling memcmp() twice in a non-error case. Link: https://lkml.kernel.org/r/20220404093429.27570-1-ddiss@suse.de Link: https://lkml.kernel.org/r/20220404093429.27570-2-ddiss@suse.de Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09proc: fix dentry/inode overinstantiating under /proc/${pid}/netAlexey Dobriyan
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_cache aren't. That's because respective PDEs have proc_misc_d_revalidate() hook which returns 1 and leaves dentries/inodes in the LRU. Force revalidation/lookup on everything under /proc/${pid}/net by inheriting proc_net_dentry_ops. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/YjdVHgildbWO7diJ@localhost.localdomain Fixes: c6c75deda813 ("proc: fix lookup in /proc/net subdirectories after setns(2)") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: hui li <juanfengpy@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29fs: sysv: check sbi->s_firstdatazone in complete_read_superLiu Shixin
sbi->s_firstinodezone is initialized to 2 and sbi->s_firstdatazone is read from sbd. There's no guarantee that sbi->s_firstdatazone must bigger than sbi->s_firstinodezone. If sbi->s_firstdatazone less than 2, the filesystem can still be mounted unexpetly. At this point, sbi->s_ninodes flip to very large value and this filesystem is broken. We can observe this by executing 'df' command. When we execute, we will get an error message: "sysv_count_free_inodes: unable to read inode table" Link: https://lkml.kernel.org/r/20220330104215.530223-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29kernel: make taskstats available from all net namespacesxu xin
If getdelays runs in a non-init network namespace, it will fail in getting delayacct stats even if it has privilege of root user, which seems to be not very reasonable. We can simply reproduce this by executing commands: unshare -n getdelays -d -p <pid> I don't think net namespace should be an obstacle to the normal execution of getdelay function. So let's make it available from all net namespaces. Link: https://lkml.kernel.org/r/20220412071946.2532318-1-xu.xin16@zte.com.cn Signed-off-by: xu xin <xu.xin16@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: "Dr. Thomas Orgis" <thomas.orgis@uni-hamburg.de> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Ismael Luceno <ismael@iodev.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29taskstats: version 12 with thread group and exe infoDr. Thomas Orgis
The task exit struct needs some crucial information to be able to provide an enhanced version of process and thread accounting. This change provides: 1. ac_tgid in additon to ac_pid 2. thread group execution walltime in ac_tgetime 3. flag AGROUP in ac_flag to indicate the last task in a thread group / process 4. device ID and inode of task's /proc/self/exe in ac_exe_dev and ac_exe_inode 5. tools/accounting/procacct as demonstrator When a task exits, taskstats are reported to userspace including the task's pid and ppid, but without the id of the thread group this task is part of. Without the tgid, the stats of single tasks cannot be correlated to each other as a thread group (process). The taskstats documentation suggests that on process exit a data set consisting of accumulated stats for the whole group is produced. But such an additional set of stats is only produced for actually multithreaded processes, not groups that had only one thread, and also those stats only contain data about delay accounting and not the more basic information about CPU and memory resource usage. Adding the AGROUP flag to be set when the last task of a group exited enables determination of process end also for single-threaded processes. My applicaton basically does enhanced process accounting with summed cputime, biggest maxrss, tasks per process. The data is not available with the traditional BSD process accounting (which is not designed to be extensible) and the taskstats interface allows more efficient on-the-fly grouping and summing of the stats, anyway, without intermediate disk writes. Furthermore, I do carry statistics on which exact program binary is used how often with associated resources, getting a picture on how important which parts of a collection of installed scientific software in different versions are, and how well they put load on the machine. This is enabled by providing information on /proc/self/exe for each task. I assume the two 64-bit fields for device ID and inode are more appropriate than the possibly large resolved path to keep the data volume down. Add the tgid to the stats to complete task identification, the flag AGROUP to mark the last task of a group, the group wallclock time, and inode-based identification of the associated executable file. Add tools/accounting/procacct.c as a simplified fork of getdelays.c to demonstrate process and thread accounting. [thomas.orgis@uni-hamburg.de: fix version number in comment] Link: https://lkml.kernel.org/r/20220405003601.7a5f6008@plasteblaster Link: https://lkml.kernel.org/r/20220331004106.64e5616b@plasteblaster Signed-off-by: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> Reviewed-by: Ismael Luceno <ismael@iodev.co.uk> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29rapidio: remove unnecessary use of list iteratorJakob Koschel
req->map is set in the valid case and always equals 'map' if the break was hit. It therefore is unnecessary to use the list iterator variable and the use of 'map' can be replaced with req->map. This is done in preparation to limit the scope of a list iterator to the list traversal loop [1]. Link: https://lore.kernel.org/all/YhdfEIwI4EdtHdym@kroah.com/ Link: https://lkml.kernel.org/r/20220319203344.2547702-1-jakobkoschel@gmail.com Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Brian Johannesmeyer" <bjohannesmeyer@gmail.com> Cc: Cristiano Giuffrida <c.giuffrida@vu.nl> Cc: "Bos, H.J." <h.j.bos@vu.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29kexec: remove redundant assignmentsMichal Orzel
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220326180948.192154-1-michalorzel.eng@gmail.com Signed-off-by: Michal Orzel <michalorzel.eng@gmail.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Michal Orzel <michalorzel.eng@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29MAINTAINERS: remove redundant file of PTRACE SUPPORT entryTiezhu Yang
In MAINTAINERS PTRACE SUPPORT entry, the file include/uapi/linux/ptrace.h is redundant, remove it. Link: https://lkml.kernel.org/r/1649240981-11024-4-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29ptrace: fix wrong comment of PT_DTRACETiezhu Yang
PT_DTRACE is only used on um now, fix the wrong comment. Link: https://lkml.kernel.org/r/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29ptrace: remove redudant check of #ifdef PTRACE_SINGLESTEPTiezhu Yang
Patch series "ptrace: do some cleanup". This patch (of 3): PTRACE_SINGLESTEP is always defined as 9 in include/uapi/linux/ptrace.h, remove redudant check of #ifdef PTRACE_SINGLESTEP. Link: https://lkml.kernel.org/r/1649240981-11024-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29fat: add ratelimit to fat*_ent_bread()OGAWA Hirofumi
fat*_ent_bread() can be the cause of too many report on I/O error path. So use fat_msg_ratelimit() instead. Link: https://lkml.kernel.org/r/87bkxogfeq.fsf@mail.parknet.co.jp Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: qianfan <qianfanguijin@163.com> Tested-by: qianfan <qianfanguijin@163.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29fatfs: add FAT messages to printk indexJonathan Lassoff
In order for end users to quickly react to new issues that come up in production, it is proving useful to leverage the printk indexing system. This printk index enables kernel developers to use calls to printk() with changeable ad-hoc format strings (as they always have; no change of expectations), while enabling end users to examine format strings to detect changes. Since end users are using regular expressions to match messages printed through printk(), being able to detect changes in chosen format strings from release to release provides a useful signal to review printk()-matching regular expressions for any necessary updates. So that detailed FAT messages are captured by this printk index, this patch wraps fat_msg with a macro. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/8aaa2dd7995e820292bb40d2120ab69756662c65.1648688136.git.jof@thejof.com Signed-off-by: Jonathan Lassoff <jof@thejof.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29fatfs: remove redundant judgmentYubo Feng
iput() has already judged the incoming parameter, so there is no need to repeat outside. Link: https://lkml.kernel.org/r/1648265418-76563-1-git-send-email-fengyubo3@huawei.com Signed-off-by: Yubo Feng <fengyubo3@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29init/Kconfig: remove USELIB syscall by defaultKees Cook
The uselib syscall has been long deprecated. There's no need to keep this enabled by default under X86_32. Link: https://lkml.kernel.org/r/20220412212519.4113845-1-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29list: fix a data-race around ep->rdllistKuniyuki Iwashima
ep_poll() first calls ep_events_available() with no lock held and checks if ep->rdllist is empty by list_empty_careful(), which reads rdllist->prev. Thus all accesses to it need some protection to avoid store/load-tearing. Note INIT_LIST_HEAD_RCU() already has the annotation for both prev and next. Commit bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.") added the first lockless ep_events_available(), and commit c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()") made some ep_events_available() calls lockless and added single call under a lock, finally commit e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout") made the last ep_events_available() lockless. BUG: KCSAN: data-race in do_epoll_wait / do_epoll_wait write to 0xffff88810480c7d8 of 8 bytes by task 1802 on cpu 0: INIT_LIST_HEAD include/linux/list.h:38 [inline] list_splice_init include/linux/list.h:492 [inline] ep_start_scan fs/eventpoll.c:622 [inline] ep_send_events fs/eventpoll.c:1656 [inline] ep_poll fs/eventpoll.c:1806 [inline] do_epoll_wait+0x4eb/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88810480c7d8 of 8 bytes by task 1799 on cpu 1: list_empty_careful include/linux/list.h:329 [inline] ep_events_available fs/eventpoll.c:381 [inline] ep_poll fs/eventpoll.c:1797 [inline] do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffff88810480c7d0 -> 0xffff888103c15098 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 1799 Comm: syz-fuzzer Tainted: G W 5.17.0-rc7-syzkaller-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Link: https://lkml.kernel.org/r/20220322002653.33865-3-kuniyu@amazon.co.jp Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout") Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()") Fixes: bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Reported-by: syzbot+bdd6e38a1ed5ee58d8bd@syzkaller.appspotmail.com Cc: Al Viro <viro@zeniv.linux.org.uk>, Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: "Soheil Hassas Yeganeh" <soheil@google.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Sridhar Samudrala" <sridhar.samudrala@intel.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29pipe: make poll_usage boolean and annotate its accessKuniyuki Iwashima
Patch series "Fix data-races around epoll reported by KCSAN." This series suppresses a false positive KCSAN's message and fixes a real data-race. This patch (of 2): pipe_poll() runs locklessly and assigns 1 to poll_usage. Once poll_usage is set to 1, it never changes in other places. However, concurrent writes of a value trigger KCSAN, so let's make KCSAN happy. BUG: KCSAN: data-race in pipe_poll / pipe_poll write to 0xffff8880042f6678 of 4 bytes by task 174 on cpu 3: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) write to 0xffff8880042f6678 of 4 bytes by task 177 on cpu 1: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 177 Comm: epoll_race Not tainted 5.17.0-58927-gf443e374ae13 #6 Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014 Link: https://lkml.kernel.org/r/20220322002653.33865-1-kuniyu@amazon.co.jp Link: https://lkml.kernel.org/r/20220322002653.33865-2-kuniyu@amazon.co.jp Fixes: 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: "Soheil Hassas Yeganeh" <soheil@google.com> Cc: "Sridhar Samudrala" <sridhar.samudrala@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29lib: remove back_str initializationTom Rix
Clang static analysis reports this false positive glob.c:48:32: warning: Assigned value is garbage or undefined char const *back_pat = NULL, *back_str = back_str; ^~~~~~~~ ~~~~~~~~ back_str is set after back_pat and it's use is protected by the !back_pat check. It is not necessary to initialize back_str, so remove the initialization. Link: https://lkml.kernel.org/r/20220402131546.3383578-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29lib/string.c: simplify str[c]spnRasmus Villemoes
Use strchr(), which makes them a lot shorter, and more obviously symmetric in their treatment of accept/reject. It also saves a little bit of .text; bloat-o-meter for an arm build says Function old new delta strcspn 92 76 -16 strspn 108 76 -32 While here, also remove a stray empty line before EXPORT_SYMBOL(). Link: https://lkml.kernel.org/r/20220328224119.3003834-2-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29lib/test_string.c: add strspn and strcspn testsRasmus Villemoes
Before refactoring strspn() and strcspn(), add some simple test cases. Link: https://lkml.kernel.org/r/20220328224119.3003834-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29lib/Kconfig.debug: remove more CONFIG_..._VALUE indirectionsRasmus Villemoes
As in "kernel/panic.c: remove CONFIG_PANIC_ON_OOPS_VALUE indirection", use the IS_ENABLED() helper rather than having a hidden config option. Link: https://lkml.kernel.org/r/20220321121301.1389693-1-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29lib/test_meminit: optimize do_kmem_cache_rcu_persistent() testXiaoke Wang
To make the test more robust, there are the following changes: 1. add a check for the return value of kmem_cache_alloc(). 2. properly release the object `buf` on several error paths. 3. release the objects of `used_objects` if we never hit `saved_ptr`. 4. destroy the created cache by default. Link: https://lkml.kernel.org/r/tencent_7CB95F1C3914BCE1CA4A61FF7C20E7CCB108@qq.com Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29get_maintainer: Honor mailmap for in file emailsRob Herring
Add support to also use the mailmap for 'in file' email addresses. Link: https://lkml.kernel.org/r/20220323193645.317514-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Reported-by: Marc Zyngier <maz@kernel.org> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29kernel: pid_namespace: use NULL instead of using plain integer as pointerHaowen Bai
This fixes the following sparse warnings: kernel/pid_namespace.c:55:77: warning: Using plain integer as NULL pointer Link: https://lkml.kernel.org/r/1647944288-2806-1-git-send-email-baihaowen@meizu.com Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29net: unexport csum_and_copy_{from,to}_userChristoph Hellwig
csum_and_copy_from_user and csum_and_copy_to_user are exported by a few architectures, but not actually used in modular code. Drop the exports. Link: https://lkml.kernel.org/r/20220421070440.1282704-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29vmcore: convert read_from_oldmem() to take an iov_iterMatthew Wilcox (Oracle)
Remove the read_from_oldmem() wrapper introduced earlier and convert all the remaining callers to pass an iov_iter. Link: https://lkml.kernel.org/r/20220408090636.560886-4-bhe@redhat.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>