summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-07-03fs/ntfs3: Code refactoringKonstantin Komarov
Check functions arguments. Use u8 instead of size_t for ntfs names, more consts and other. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Code formattingKonstantin Komarov
clang-format-15 was used to format code according kernel's .clang-format. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Do not update primary boot in ntfs_init_from_boot()Konstantin Komarov
'cause it may be faked boot. Let ntfs to be mounted and update boot later. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Alternative boot if primary boot is corruptedKonstantin Komarov
Some code refactoring added also. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Mark ntfs dirty when on-disk struct is corruptedKonstantin Komarov
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Fix ntfs_atomic_openKonstantin Komarov
This fixes xfstest 633/696. Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Correct checking while generating attr_listKonstantin Komarov
Correct slightly previous commit: Enhance sanity check while generating attr_list Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_load_attr_list()Tetsuo Handa
syzbot is reporting too large allocation at ntfs_load_attr_list(), for a crafted filesystem can have huge data_size. Reported-by: syzbot <syzbot+89dbb3a789a5b9711793@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=89dbb3a789a5b9711793 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs: ntfs3: Fix possible null-pointer dereferences in mi_read()Jia-Ju Bai
In a previous commit 2681631c2973 ("fs/ntfs3: Add null pointer check to attr_load_runs_vcn"), ni can be NULL in attr_load_runs_vcn(), and thus it should be checked before being used. However, in the call stack of this commit, mft_ni in mi_read() is aliased with ni in attr_load_runs_vcn(), and it is also used in mi_read() at two places: mi_read() rw_lock = &mft_ni->file.run_lock -> No check attr_load_runs_vcn(mft_ni, ...) ni (namely mft_ni) is checked in the previous commit attr_load_runs_vcn(..., &mft_ni->file.run) -> No check Thus, to avoid possible null-pointer dereferences, the related checks should be added. These bugs are reported by a static analysis tool implemented by myself, and they are found by extending a known bug fixed in the previous commit. Thus, they could be theoretical bugs. Signed-off-by: Jia-Ju Bai <baijiaju@buaa.edu.cn> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Return error for inconsistent extended attributesEdward Lo
ntfs_read_ea is called when we want to read extended attributes. There are some sanity checks for the validity of the EAs. However, it fails to return a proper error code for the inconsistent attributes, which might lead to unpredicted memory accesses after return. [ 138.916927] BUG: KASAN: use-after-free in ntfs_set_ea+0x453/0xbf0 [ 138.923876] Write of size 4 at addr ffff88800205cfac by task poc/199 [ 138.931132] [ 138.933016] CPU: 0 PID: 199 Comm: poc Not tainted 6.2.0-rc1+ #4 [ 138.938070] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 138.947327] Call Trace: [ 138.949557] <TASK> [ 138.951539] dump_stack_lvl+0x4d/0x67 [ 138.956834] print_report+0x16f/0x4a6 [ 138.960798] ? ntfs_set_ea+0x453/0xbf0 [ 138.964437] ? kasan_complete_mode_report_info+0x7d/0x200 [ 138.969793] ? ntfs_set_ea+0x453/0xbf0 [ 138.973523] kasan_report+0xb8/0x140 [ 138.976740] ? ntfs_set_ea+0x453/0xbf0 [ 138.980578] __asan_store4+0x76/0xa0 [ 138.984669] ntfs_set_ea+0x453/0xbf0 [ 138.988115] ? __pfx_ntfs_set_ea+0x10/0x10 [ 138.993390] ? kernel_text_address+0xd3/0xe0 [ 138.998270] ? __kernel_text_address+0x16/0x50 [ 139.002121] ? unwind_get_return_address+0x3e/0x60 [ 139.005659] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 139.010177] ? arch_stack_walk+0xa2/0x100 [ 139.013657] ? filter_irq_stacks+0x27/0x80 [ 139.017018] ntfs_setxattr+0x405/0x440 [ 139.022151] ? __pfx_ntfs_setxattr+0x10/0x10 [ 139.026569] ? kvmalloc_node+0x2d/0x120 [ 139.030329] ? kasan_save_stack+0x41/0x60 [ 139.033883] ? kasan_save_stack+0x2a/0x60 [ 139.037338] ? kasan_set_track+0x29/0x40 [ 139.040163] ? kasan_save_alloc_info+0x1f/0x30 [ 139.043588] ? __kasan_kmalloc+0x8b/0xa0 [ 139.047255] ? __kmalloc_node+0x68/0x150 [ 139.051264] ? kvmalloc_node+0x2d/0x120 [ 139.055301] ? vmemdup_user+0x2b/0xa0 [ 139.058584] __vfs_setxattr+0x121/0x170 [ 139.062617] ? __pfx___vfs_setxattr+0x10/0x10 [ 139.066282] __vfs_setxattr_noperm+0x97/0x300 [ 139.070061] __vfs_setxattr_locked+0x145/0x170 [ 139.073580] vfs_setxattr+0x137/0x2a0 [ 139.076641] ? __pfx_vfs_setxattr+0x10/0x10 [ 139.080223] ? __kasan_check_write+0x18/0x20 [ 139.084234] do_setxattr+0xce/0x150 [ 139.087768] setxattr+0x126/0x140 [ 139.091250] ? __pfx_setxattr+0x10/0x10 [ 139.094948] ? __virt_addr_valid+0xcb/0x140 [ 139.097838] ? __call_rcu_common.constprop.0+0x1c7/0x330 [ 139.102688] ? debug_smp_processor_id+0x1b/0x30 [ 139.105985] ? kasan_quarantine_put+0x5b/0x190 [ 139.109980] ? putname+0x84/0xa0 [ 139.113886] ? __kasan_slab_free+0x11e/0x1b0 [ 139.117961] ? putname+0x84/0xa0 [ 139.121316] ? preempt_count_sub+0x1c/0xd0 [ 139.124427] ? __mnt_want_write+0xae/0x100 [ 139.127836] ? mnt_want_write+0x8f/0x150 [ 139.130954] path_setxattr+0x164/0x180 [ 139.133998] ? __pfx_path_setxattr+0x10/0x10 [ 139.137853] ? __pfx_ksys_pwrite64+0x10/0x10 [ 139.141299] ? debug_smp_processor_id+0x1b/0x30 [ 139.145714] ? fpregs_assert_state_consistent+0x6b/0x80 [ 139.150796] __x64_sys_setxattr+0x71/0x90 [ 139.155407] do_syscall_64+0x3f/0x90 [ 139.159035] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 139.163843] RIP: 0033:0x7f108cae4469 [ 139.166481] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088 [ 139.183764] RSP: 002b:00007fff87588388 EFLAGS: 00000286 ORIG_RAX: 00000000000000bc [ 139.190657] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f108cae4469 [ 139.196586] RDX: 00007fff875883b0 RSI: 00007fff875883d1 RDI: 00007fff875883b6 [ 139.201716] RBP: 00007fff8758c530 R08: 0000000000000001 R09: 00007fff8758c618 [ 139.207940] R10: 0000000000000006 R11: 0000000000000286 R12: 00000000004004c0 [ 139.214007] R13: 00007fff8758c610 R14: 0000000000000000 R15: 0000000000000000 Signed-off-by: Edward Lo <loyuantsung@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Enhance sanity check while generating attr_listEdward Lo
ni_create_attr_list uses WARN_ON to catch error cases while generating attribute list, which only prints out stack trace and may not be enough. This repalces them with more proper error handling flow. [ 59.666332] BUG: kernel NULL pointer dereference, address: 000000000000000e [ 59.673268] #PF: supervisor read access in kernel mode [ 59.678354] #PF: error_code(0x0000) - not-present page [ 59.682831] PGD 8000000005ff1067 P4D 8000000005ff1067 PUD 7dee067 PMD 0 [ 59.688556] Oops: 0000 [#1] PREEMPT SMP KASAN PTI [ 59.692642] CPU: 0 PID: 198 Comm: poc Tainted: G B W 6.2.0-rc1+ #4 [ 59.698868] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 59.708795] RIP: 0010:ni_create_attr_list+0x505/0x860 [ 59.713657] Code: 7e 10 e8 5e d0 d0 ff 45 0f b7 76 10 48 8d 7b 16 e8 00 d1 d0 ff 66 44 89 73 16 4d 8d 75 0e 4c 89 f7 e8 3f d0 d0 ff 4c 8d8 [ 59.731559] RSP: 0018:ffff88800a56f1e0 EFLAGS: 00010282 [ 59.735691] RAX: 0000000000000001 RBX: ffff88800b7b5088 RCX: ffffffffb83079fe [ 59.741792] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffffbb7f9fc0 [ 59.748423] RBP: ffff88800a56f3a8 R08: ffff88800b7b50a0 R09: fffffbfff76ff3f9 [ 59.754654] R10: ffffffffbb7f9fc7 R11: fffffbfff76ff3f8 R12: ffff88800b756180 [ 59.761552] R13: 0000000000000000 R14: 000000000000000e R15: 0000000000000050 [ 59.768323] FS: 00007feaa8c96440(0000) GS:ffff88806d400000(0000) knlGS:0000000000000000 [ 59.776027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.781395] CR2: 00007f3a2e0b1000 CR3: 000000000a5bc000 CR4: 00000000000006f0 [ 59.787607] Call Trace: [ 59.790271] <TASK> [ 59.792488] ? __pfx_ni_create_attr_list+0x10/0x10 [ 59.797235] ? kernel_text_address+0xd3/0xe0 [ 59.800856] ? unwind_get_return_address+0x3e/0x60 [ 59.805101] ? __kasan_check_write+0x18/0x20 [ 59.809296] ? preempt_count_sub+0x1c/0xd0 [ 59.813421] ni_ins_attr_ext+0x52c/0x5c0 [ 59.817034] ? __pfx_ni_ins_attr_ext+0x10/0x10 [ 59.821926] ? __vfs_setxattr+0x121/0x170 [ 59.825718] ? __vfs_setxattr_noperm+0x97/0x300 [ 59.829562] ? __vfs_setxattr_locked+0x145/0x170 [ 59.833987] ? vfs_setxattr+0x137/0x2a0 [ 59.836732] ? do_setxattr+0xce/0x150 [ 59.839807] ? setxattr+0x126/0x140 [ 59.842353] ? path_setxattr+0x164/0x180 [ 59.845275] ? __x64_sys_setxattr+0x71/0x90 [ 59.848838] ? do_syscall_64+0x3f/0x90 [ 59.851898] ? entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 59.857046] ? stack_depot_save+0x17/0x20 [ 59.860299] ni_insert_attr+0x1ba/0x420 [ 59.863104] ? __pfx_ni_insert_attr+0x10/0x10 [ 59.867069] ? preempt_count_sub+0x1c/0xd0 [ 59.869897] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 59.874088] ? __create_object+0x3ae/0x5d0 [ 59.877865] ni_insert_resident+0xc4/0x1c0 [ 59.881430] ? __pfx_ni_insert_resident+0x10/0x10 [ 59.886355] ? kasan_save_alloc_info+0x1f/0x30 [ 59.891117] ? __kasan_kmalloc+0x8b/0xa0 [ 59.894383] ntfs_set_ea+0x90d/0xbf0 [ 59.897703] ? __pfx_ntfs_set_ea+0x10/0x10 [ 59.901011] ? kernel_text_address+0xd3/0xe0 [ 59.905308] ? __kernel_text_address+0x16/0x50 [ 59.909811] ? unwind_get_return_address+0x3e/0x60 [ 59.914898] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 59.920250] ? arch_stack_walk+0xa2/0x100 [ 59.924560] ? filter_irq_stacks+0x27/0x80 [ 59.928722] ntfs_setxattr+0x405/0x440 [ 59.932512] ? __pfx_ntfs_setxattr+0x10/0x10 [ 59.936634] ? kvmalloc_node+0x2d/0x120 [ 59.940378] ? kasan_save_stack+0x41/0x60 [ 59.943870] ? kasan_save_stack+0x2a/0x60 [ 59.947719] ? kasan_set_track+0x29/0x40 [ 59.951417] ? kasan_save_alloc_info+0x1f/0x30 [ 59.955733] ? __kasan_kmalloc+0x8b/0xa0 [ 59.959598] ? __kmalloc_node+0x68/0x150 [ 59.963163] ? kvmalloc_node+0x2d/0x120 [ 59.966490] ? vmemdup_user+0x2b/0xa0 [ 59.969060] __vfs_setxattr+0x121/0x170 [ 59.972456] ? __pfx___vfs_setxattr+0x10/0x10 [ 59.976008] __vfs_setxattr_noperm+0x97/0x300 [ 59.981562] __vfs_setxattr_locked+0x145/0x170 [ 59.986100] vfs_setxattr+0x137/0x2a0 [ 59.989964] ? __pfx_vfs_setxattr+0x10/0x10 [ 59.993616] ? __kasan_check_write+0x18/0x20 [ 59.997425] do_setxattr+0xce/0x150 [ 60.000304] setxattr+0x126/0x140 [ 60.002967] ? __pfx_setxattr+0x10/0x10 [ 60.006471] ? __virt_addr_valid+0xcb/0x140 [ 60.010461] ? __call_rcu_common.constprop.0+0x1c7/0x330 [ 60.016037] ? debug_smp_processor_id+0x1b/0x30 [ 60.021008] ? kasan_quarantine_put+0x5b/0x190 [ 60.025545] ? putname+0x84/0xa0 [ 60.027910] ? __kasan_slab_free+0x11e/0x1b0 [ 60.031483] ? putname+0x84/0xa0 [ 60.033986] ? preempt_count_sub+0x1c/0xd0 [ 60.036876] ? __mnt_want_write+0xae/0x100 [ 60.040738] ? mnt_want_write+0x8f/0x150 [ 60.044317] path_setxattr+0x164/0x180 [ 60.048096] ? __pfx_path_setxattr+0x10/0x10 [ 60.052096] ? strncpy_from_user+0x175/0x1c0 [ 60.056482] ? debug_smp_processor_id+0x1b/0x30 [ 60.059848] ? fpregs_assert_state_consistent+0x6b/0x80 [ 60.064557] __x64_sys_setxattr+0x71/0x90 [ 60.068892] do_syscall_64+0x3f/0x90 [ 60.072868] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 60.077523] RIP: 0033:0x7feaa86e4469 [ 60.080915] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088 [ 60.097353] RSP: 002b:00007ffdbd8311e8 EFLAGS: 00000286 ORIG_RAX: 00000000000000bc [ 60.103386] RAX: ffffffffffffffda RBX: 9461c5e290baac00 RCX: 00007feaa86e4469 [ 60.110322] RDX: 00007ffdbd831fe0 RSI: 00007ffdbd831305 RDI: 00007ffdbd831263 [ 60.116808] RBP: 00007ffdbd836180 R08: 0000000000000001 R09: 00007ffdbd836268 [ 60.123879] R10: 000000000000007d R11: 0000000000000286 R12: 0000000000400500 [ 60.130540] R13: 00007ffdbd836260 R14: 0000000000000000 R15: 0000000000000000 [ 60.136553] </TASK> [ 60.138818] Modules linked in: [ 60.141839] CR2: 000000000000000e [ 60.144831] ---[ end trace 0000000000000000 ]--- [ 60.149058] RIP: 0010:ni_create_attr_list+0x505/0x860 [ 60.153975] Code: 7e 10 e8 5e d0 d0 ff 45 0f b7 76 10 48 8d 7b 16 e8 00 d1 d0 ff 66 44 89 73 16 4d 8d 75 0e 4c 89 f7 e8 3f d0 d0 ff 4c 8d8 [ 60.172443] RSP: 0018:ffff88800a56f1e0 EFLAGS: 00010282 [ 60.176246] RAX: 0000000000000001 RBX: ffff88800b7b5088 RCX: ffffffffb83079fe [ 60.182752] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffffbb7f9fc0 [ 60.189949] RBP: ffff88800a56f3a8 R08: ffff88800b7b50a0 R09: fffffbfff76ff3f9 [ 60.196950] R10: ffffffffbb7f9fc7 R11: fffffbfff76ff3f8 R12: ffff88800b756180 [ 60.203671] R13: 0000000000000000 R14: 000000000000000e R15: 0000000000000050 [ 60.209595] FS: 00007feaa8c96440(0000) GS:ffff88806d400000(0000) knlGS:0000000000000000 [ 60.216299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 60.222276] CR2: 00007f3a2e0b1000 CR3: 000000000a5bc000 CR4: 00000000000006f0 Signed-off-by: Edward Lo <loyuantsung@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03fs/ntfs3: Use wrapper i_blocksize() in ntfs_zero_range()Yangtao Li
Convert to use i_blocksize() for readability. Signed-off-by: Yangtao Li <frank.li@vivo.com> [almaz.alexandrovich@paragon-software.com: the patch has been partially accepted for performance reasons] Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-03ntfs: Fix panic about slab-out-of-bounds caused by ntfs_listxattr()Zeng Heng
Here is a BUG report from syzbot: BUG: KASAN: slab-out-of-bounds in ntfs_list_ea fs/ntfs3/xattr.c:191 [inline] BUG: KASAN: slab-out-of-bounds in ntfs_listxattr+0x401/0x570 fs/ntfs3/xattr.c:710 Read of size 1 at addr ffff888021acaf3d by task syz-executor128/3632 Call Trace: ntfs_list_ea fs/ntfs3/xattr.c:191 [inline] ntfs_listxattr+0x401/0x570 fs/ntfs3/xattr.c:710 vfs_listxattr fs/xattr.c:457 [inline] listxattr+0x293/0x2d0 fs/xattr.c:804 Fix the logic of ea_all iteration. When the ea->name_len is 0, return immediately, or Add2Ptr() would visit invalid memory in the next loop. Fixes: be71b5cba2e6 ("fs/ntfs3: Add attrib operations") Reported-by: syzbot+9fcea5ef6dc4dc72d334@syzkaller.appspotmail.com Signed-off-by: Zeng Heng <zengheng4@huawei.com> [almaz.alexandrovich@paragon-software.com: lines of the patch have changed] Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2023-07-02Merge tag 'iomap-6.5-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull iomap updates from Darrick Wong: - Fix a type signature mismatch - Drop Christoph as maintainer * tag 'iomap-6.5-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: drop me [hch] from MAINTAINERS for iomap fs: iomap: Change the type of blocksize from 'int' to 'unsigned int' in iomap_file_buffered_write_punch_delalloc
2023-07-02Merge tag 'v6.5/vfs.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fix from Christian Brauner: "A fix for the backing file work from this cycle. When init_file() failed it would call file_free_rcu() on the file allocated by the caller of init_file(). It naively assumed that the correct cleanup operation would be called depending on whether it is a regular file or a backing file. However, that presupposes that the FMODE_BACKING flag would already be set which it won't be as that is done in the caller of init_file(). Fix that bug by moving the cleanup of the allocated file into the caller where it belongs in the first place. There's no good reason for init_file() to consume resources it didn't allocate. This is a mainline only fix and was reported by syzbot. The fix was validated by syzbot against the provided reproducer" * tag 'v6.5/vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: move cleanup from init_file() into its callers
2023-07-02Merge tag 'i2c-for-6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: - I2C has now a co-maintainer taking care of the host drivers. Welcome Andi Shyti and have fun! - platform remove callback converted to return void in drivers - simplify drivers by using devm_clk_get_enabled() - introduce i2c_get_match_data() to avoid more boilerplate code (especially since the core stopped delivering an i2c_device_id) - and the usual bunch of driver updates * tag 'i2c-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits) i2c: uniphier: Use devm_clk_get_enabled() i2c: uniphier-f: Use devm_clk_get_enabled() i2c: owl: Use devm_clk_get_enabled() i2c: lpc2k: Use devm_clk_get_enabled() i2c: hix5hd2: Use devm_clk_get_enabled() i2c: sun6i-p2wi: Use devm_clk_get_enabled() i2c: pasemi-platform: Use devm_clk_get_enabled() i2c: mt7621: Use devm_clk_get_enabled() i2c: xiic: Use devm_clk_get_enabled() i2c: davinci: Use platform table macro over module_alias i2c: ocores: use devm_ managed clks i2c: nomadik: Use dev_err_probe() whenever possible i2c: nomadik: Use devm_clk_get_enabled() i2c: nomadik: Remove unnecessary goto label usb: typec: ucsi: Mark dGPUs as DEVICE scope i2c: wmt: Use devm_platform_get_and_ioremap_resource() i2c: versatile: Use devm_platform_get_and_ioremap_resource() i2c: hix5hd2: Add I2C_M_STOP flag support for i2c-hix5hd2 driver. i2c: mpc: Use of_property_read_reg() to parse "reg" i2c: imx-lpi2c: Don't open-code DIV_ROUND_UP ...
2023-07-02Merge tag 'parisc-for-6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: - Add missing cacheflush() syscall - Fix STI console on 64-bit-only machines - Move kernel debug options to Kconfig.debug - Lots of warning fixes in arch/parisc/ and drivers/parisc/ when compiled with W=1 - Enable some more graphics drivers in refreshed defconfigs * tag 'parisc-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (29 commits) parisc: Refresh defconfigs parisc: irq: Add irq-related function declarations parisc: Move init function declarations into header file parisc: dino: Make dino_init() returning void parisc: lba_pci: Mark two variables __maybe_unused parisc: unaligned: Include header file to avoid missing prototype warnings parisc: signal: Mark do_notify_resume() and sys_rt_sigreturn() asmlinkage parisc: unwind: Mark start and stop variables __maybe_unused parisc: init: Drop unused variable end_paddr parisc: traps: Mark functions static parisc: processor: Fix kdoc for init_cpu_profiler() parisc: sys_parisc: parisc_personality() is called from asm code parisc: ccio-dma: Fix kdoc and compiler warnings parisc: pdc_stable: Fix kdoc and compiler warnings parisc: pci-dma: Make pcxl_alloc_range() static parisc: Mark image_size __maybe_unused in perf_write() parisc: module: Mark symindex __maybe_unused parisc: pdc_chassis: Fix kdoc warnings parisc: firmware: Fix kdoc warnings parisc: drivers: Fix kdoc warnings ...
2023-07-02xfs: fix the calculation for "end" and "length"Shiyang Ruan
The value of "end" should be "start + length - 1". Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-07-02xfs: fix xfs_btree_query_range callers to initialize btree rec fullyDarrick J. Wong
Use struct initializers to ensure that the xfs_btree_irecs passed into the query_range function are completely initialized. No functional changes, just closing some sloppy hygiene. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: validate fsmap offsets specified in the query keysDarrick J. Wong
Improve the validation of the fsmap offset fields in the query keys and move the validation to the top of the function now that we have pushed the low key adjustment code downwards. Also fix some indenting issues that aren't worth a separate patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: fix logdev fsmap query result filteringDarrick J. Wong
The external log device fsmap backend doesn't have an rmapbt to query, so it's wasteful to spend time initializing the rmap_irec objects. Worse yet, the log could (someday) be longer than 2^32 fsblocks, so using the rmap irec structure will result in integer overflows. Fix this mess by computing the start address that we want from keys[0] directly, and use the daddr-based record filtering algorithm that we also use for rtbitmap queries. Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: clean up the rtbitmap fsmap backendDarrick J. Wong
The rtbitmap fsmap backend doesn't query the rmapbt, so it's wasteful to spend time initializing the rmap_irec objects. Worse yet, the logic to query the rtbitmap is spread across three separate functions, which is unnecessarily difficult to follow. Compute the start rtextent that we want from keys[0] directly and combine the functions to avoid passing parameters around everywhere, and consolidate all the logic into a single function. At one point many years ago I intended to use __xfs_getfsmap_rtdev as the launching point for realtime rmapbt queries, but this hasn't been the case for a long time. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: fix getfsmap reporting past the last rt extentDarrick J. Wong
The realtime section ends at the last rt extent. If the user configures the rt geometry with an extent size that is not an integer factor of the number of rt blocks, it's possible for there to be rt blocks past the end of the last rt extent. These tail blocks cannot ever be allocated and will cause corruption reports if the last extent coincides with the end of an rt bitmap block, so do not report consider them for the GETFSMAP output. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: fix integer overflows in the fsmap rtbitmap and logdev backendsDarrick J. Wong
It's not correct to use the rmap irec structure to hold query key information to query the rtbitmap because the realtime volume can be longer than 2^32 fsblocks in length. Because the rt volume doesn't have allocation groups, introduce a daddr-based record filtering algorithm and compute the rtextent values using 64-bit variables. The same problem exists in the external log device fsmap implementation, so use the same solution to fix it too. After this patch, all the code that touches info->low and info->high under xfs_getfsmap_logdev and __xfs_getfsmap_rtdev are unnecessary. Cleaning this up will be done in subsequent patches. Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02xfs: fix interval filtering in multi-step fsmap queriesDarrick J. Wong
I noticed a bug in ranged GETFSMAP queries: # xfs_io -c 'fsmap -vvvv' /opt EXT: DEV BLOCK-RANGE OWNER FILE-OFFSET AG AG-OFFSET TOTAL 0: 8:80 [0..7]: static fs metadata 0 (0..7) 8 <snip> 9: 8:80 [192..223]: 137 0..31 0 (192..223) 32 # xfs_io -c 'fsmap -vvvv -d 208 208' /opt # That's not right -- we asked what block maps block 208, and we should've received a mapping for inode 137 offset 16. Instead, we get nothing. The root cause of this problem is a mis-interaction between the fsmap code and how btree ranged queries work. xfs_btree_query_range returns any btree record that overlaps with the query interval, even if the record starts before or ends after the interval. Similarly, GETFSMAP is supposed to return a recordset containing all records that overlap the range queried. However, it's possible that the recordset is larger than the buffer that the caller provided to convey mappings to userspace. In /that/ case, userspace is supposed to copy the last record returned to fmh_keys[0] and call GETFSMAP again. In this case, we do not want to return mappings that we have already supplied to the caller. The call to xfs_btree_query_range is the same, but now we ignore any records that start before fmh_keys[0]. Unfortunately, we didn't implement the filtering predicate correctly. The predicate should only be called when we're calling back for more records. Accomplish this by setting info->low.rm_blockcount to a nonzero value and ensuring that it is cleared as necessary. As a result, we no longer want to adjust dkeys[0] in the main setup function because that's confusing. This patch doesn't touch the logdev/rtbitmap backends because they have bigger problems that will be addressed by subsequent patches. Found via xfs/556 with parent pointers enabled. Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-02Merge branch 'octeontx2-af-fixes'David S. Miller
Hariprasad Kelam says: ==================== octeontx2-af: MAC block fixes for CN10KB This patch set contains fixes for the issues encountered in testing CN10KB MAC block RPM_USX. Patch1: firmware to kernel communication is not working due to wrong interrupt configuration. CSR addresses are corrected. Patch2: NIX to RVU PF mapping errors encountered due to wrong firmware config. Corrects this mapping error. Patch3: Driver is trying to access non exist cgx/lmac which is resulting in kernel panic. Address this issue by adding proper checks. Patch4: MAC features are not getting reset on FLR. Fix the issue by resetting the stale config. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02octeontx2-af: Reset MAC features in FLRHariprasad Kelam
AF driver configures MAC features like internal loopback and PFC upon receiving the request from PF and its VF netdev. But these features are not getting reset in FLR. This patch fixes the issue by resetting the same. Fixes: 23999b30ae67 ("octeontx2-af: Enable or disable CGX internal loopback") Fixes: 1121f6b02e7a ("octeontx2-af: Priority flow control configuration support") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02octeontx2-af: Add validation before accessing cgx and lmacHariprasad Kelam
with the addition of new MAC blocks like CN10K RPM and CN10KB RPM_USX, LMACs are noncontiguous and CGX blocks are also noncontiguous. But during RVU driver initialization, the driver is assuming they are contiguous and trying to access cgx or lmac with their id which is resulting in kernel panic. This patch fixes the issue by adding proper checks. [ 23.219150] pc : cgx_lmac_read+0x38/0x70 [ 23.219154] lr : rvu_program_channels+0x3f0/0x498 [ 23.223852] sp : ffff000100d6fc80 [ 23.227158] x29: ffff000100d6fc80 x28: ffff00010009f880 x27: 000000000000005a [ 23.234288] x26: ffff000102586768 x25: 0000000000002500 x24: fffffffffff0f000 Fixes: 91c6945ea1f9 ("octeontx2-af: cn10k: Add RPM MAC support") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02octeontx2-af: Fix mapping for NIX block from CGX connectionHariprasad Kelam
Firmware configures NIX block mapping for all MAC blocks. The current implementation reads the configuration and creates the mapping between RVU PF and NIX blocks. But this configuration is only valid for silicons that support multiple blocks. For all other silicons, all MAC blocks map to NIX0. This patch corrects the mapping by adding a check for the same. Fixes: c5a73b632b90 ("octeontx2-af: Map NIX block from CGX connection") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02octeontx2-af: cn10kb: fix interrupt csr addressesHariprasad Kelam
The current design is that, for asynchronous events like link_up and link_down firmware raises the interrupt to kernel. The previous patch which added RPM_USX driver has a bug where it uses old csr addresses for configuring interrupts. Which is resulting in losing interrupts from source firmware. This patch fixes the issue by correcting csr addresses. Fixes: b9d0fedc6234 ("octeontx2-af: cn10kb: Add RPM_USX MAC support") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02nvme-tcp: Fix comma-related oopsDavid Howells
Fix a comma that should be a semicolon. The comma is at the end of an if-body and thus makes the statement after (a bvec_set_page()) conditional too, resulting in an oops because we didn't fill out the bio_vec[]: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page ... Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] RIP: 0010:skb_splice_from_iter+0xf1/0x370 ... Call Trace: tcp_sendmsg_locked+0x3a6/0xdd0 tcp_sendmsg+0x31/0x50 inet_sendmsg+0x47/0x80 sock_sendmsg+0x99/0xb0 nvme_tcp_try_send_data+0x149/0x490 [nvme_tcp] nvme_tcp_try_send+0x1b7/0x300 [nvme_tcp] nvme_tcp_io_work+0x40/0xc0 [nvme_tcp] process_one_work+0x21c/0x430 worker_thread+0x54/0x3e0 kthread+0xf8/0x130 Fixes: 7769887817c3 ("nvme-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage") Reported-by: Aurelien Aptel <aaptel@nvidia.com> Link: https://lore.kernel.org/r/253mt0il43o.fsf@mtr-vdi-124.i-did-not-set--mail-host-address--so-tickle-me/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Sagi Grimberg <sagi@grimberg.me> cc: Willem de Bruijn <willemb@google.com> cc: Keith Busch <kbusch@kernel.org> cc: Jens Axboe <axboe@fb.com> cc: Christoph Hellwig <hch@lst.de> cc: Chaitanya Kulkarni <kch@nvidia.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-nvme@lists.infradead.org cc: netdev@vger.kernel.org Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-02fs: move cleanup from init_file() into its callersAmir Goldstein
The use of file_free_rcu() in init_file() to free the struct that was allocated by the caller was hacky and we got what we deserved. Let init_file() and its callers take care of cleaning up each after their own allocated resources on error. Fixes: 62d53c4a1dfe ("fs: use backing_file container for internal files with "fake" f_path") # mainline only Reported-and-tested-by: syzbot+ada42aab05cf51b00e98@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Message-Id: <20230701171134.239409-1-amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-01Merge tag 'csky-for-linus-6.5' of https://github.com/c-sky/csky-linuxLinus Torvalds
Pull arch/csky update from Guo Ren: - Correct thread.trap_no restore of uprobe * tag 'csky-for-linus-6.5' of https://github.com/c-sky/csky-linux: csky: uprobes: Restore thread.trap_no
2023-07-01perf evsel amd: Fix IBS error messageRavi Bangoria
AMD IBS can do per-process profiling[1] and is no longer restricted to per-cpu or systemwide only. Remove stale error message. Also, checking just exclude_kernel is not sufficient since IBS does not support any privilege filters. So include all exclude_* checks. And finally, move these checks under tools/perf/arch/x86/ from generic code. Before: $ sudo ./perf record -e ibs_op//k -C 0 Error: AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity After: $ sudo ./perf record -e ibs_op//k -C 0 Error: AMD IBS doesn't support privilege filtering. Try again without the privilege modifiers (like 'k') at the end. [1] https://git.kernel.org/torvalds/c/30093056f7b2 Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: ananth.narayan@amd.com Cc: sandipan.das@amd.com Cc: santosh.shukla@amd.com Cc: irogers@google.com Cc: peterz@infradead.org Cc: adrian.hunter@intel.com Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lore.kernel.org/r/20230630085230.437-1-ravi.bangoria@amd.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01Merge tag 'nfs-for-6.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Stable fixes and other bugfixes: - nfs: don't report STATX_BTIME in ->getattr - Revert 'NFSv4: Retry LOCK on OLD_STATEID during delegation return' since it breaks NFSv4 state recovery. - NFSv4.1: freeze the session table upon receiving NFS4ERR_BADSESSION - Fix the NFSv4.2 xattr cache shrinker_id - Force a ctime update after a NFSv4.2 SETXATTR call Features and cleanups: - NFS and RPC over TLS client code from Chuck Lever - Support for use of abstract unix socket addresses with the rpcbind daemon - Sysfs API to allow shutdown of the kernel RPC client and prevent umount() hangs if the server is known to be permanently down - XDR cleanups from Anna" * tag 'nfs-for-6.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (33 commits) Revert "NFSv4: Retry LOCK on OLD_STATEID during delegation return" NFS: Don't cleanup sysfs superblock entry if uninitialized nfs: don't report STATX_BTIME in ->getattr NFSv4.1: freeze the session table upon receiving NFS4ERR_BADSESSION NFSv4.2: fix wrong shrinker_id NFSv4: Clean up some shutdown loops NFS: Cancel all existing RPC tasks when shutdown NFS: add sysfs shutdown knob NFS: add a sysfs link to the acl rpc_client NFS: add a sysfs link to the lockd rpc_client NFS: Add sysfs links to sunrpc clients for nfs_clients NFS: add superblock sysfs entries NFS: Make all of /sys/fs/nfs network-namespace unique NFS: Open-code the nfs_kset kset_create_and_add() NFS: rename nfs_client_kobj to nfs_net_kobj NFS: rename nfs_client_kset to nfs_kset NFS: Add an "xprtsec=" NFS mount option NFS: Have struct nfs_client carry a TLS policy field SUNRPC: Add a TCP-with-TLS RPC transport class SUNRPC: Capture CMSG metadata on client-side receive ...
2023-07-01Merge tag 'x86-urgent-2023-07-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single regression fix for x86: Moving the invocation of arch_cpu_finalize_init() earlier in the boot process caused a boot regression on IBT enabled system. The root cause is not the move of arch_cpu_finalize_init() itself. The system fails to boot because the subsequent efi_enter_virtual_mode() code has a non-IBT safe EFI call inside. This was not noticed before because IBT was enabled after the EFI initialization. Switching the EFI call to use the IBT safe wrapper cures the problem" * tag 'x86-urgent-2023-07-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Make efi_set_virtual_address_map IBT safe
2023-07-01perf: unwind: Fix symfs with libdwVincent Whitchurch
Pass the full path including the symfs (if any) to libdw. Without this unwinding fails with errors like this when a symfs is used: unwind: failed with 'No such file or directory'" Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: kernel@axis.com Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230630-perf-libdw-symfs-v2-1-469760dd4d5b@axis.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01perf symbol: Fix uninitialized return value in symbols__find_by_name()James Clark
found_idx and s aren't initialized, so if no symbol is found then the assert at the end will index off the end of the array causing a segfault. The function also doesn't return NULL when the symbol isn't found even if the assert passes. Fix it by initializing the values and only setting them when something is found. Fixes the following test failure: $ perf test 1 1: vmlinux symtab matches kallsyms : FAILED! Fixes: 259dce914e93 ("perf symbol: Remove symbol_name_rb_node") Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230630153840.858668-1-james.clark@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01perf test: Test perf lock contention CSV outputNamhyung Kim
To verify CSV output, just check the number of separators (",") using the tr and wc commands like this. grep -v "^#" ${result} | tr -d -c | wc -c Now it expects 6 columns (and 5 separators) in the output, but it may be changed later so count the field in the header first and compare it to the actual output lines. $ cat ${result} # output: contended, total wait, max wait, avg wait, type, caller 1, 28787, 28787, 28787, spinlock, raw_spin_rq_lock_nested+0x1b The test looks like below now: $ sudo ./perf test -v contention 86: kernel lock contention analysis test : --- start --- test child forked, pid 2705822 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation Testing perf lock contention CSV output test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01perf lock contention: Add --output optionNamhyung Kim
To avoid formatting failures for example in CSV output due to debug messages, add --output option to put the result in a file. Unfortunately the short -o option was taken by the --owner already. $ sudo ./perf lock con -ab --output lock-out.txt -v sleep 1 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols $ head lock-out.txt contended total wait max wait avg wait type caller 3 76.79 us 26.89 us 25.60 us rwlock:R ep_poll_callback+0x2d 0xffffffff9a23f4b5 _raw_read_lock_irqsave+0x45 0xffffffff99bbd4dd ep_poll_callback+0x2d 0xffffffff999029f3 __wake_up_common+0x73 0xffffffff99902b82 __wake_up_common_lock+0x82 0xffffffff99fa5b1c sock_def_readable+0x3c 0xffffffff9a11521d unix_stream_sendmsg+0x18d 0xffffffff99f9fc9c sock_sendmsg+0x5c Suggested-by: Ian Rogers <irogers@google.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01perf lock contention: Add -x option for CSV style outputNamhyung Kim
Sometimes we want to process the output by external programs. Let's add the -x option to specify the field separator like perf stat. $ sudo ./perf lock con -ab -x, sleep 1 # output: contended, total wait, max wait, avg wait, type, caller 19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0 15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e 4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d 1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135 8, 67608, 27404, 8451, spinlock, __queue_work+0x174 3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff 3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248 2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad 1, 24619, 24619, 24619, spinlock, rcu_core+0xd4 The first line is a comment that shows the output format. Each line is separated by the given string ("," in this case). The time is printed in nsec without the unit so that it can be parsed easily. The characters can be used in the output like (":", "+" and ".") are not allowed for the -x option. $ ./perf lock con -x: Cannot use the separator that is already used Usage: perf lock contention [<options>] -x, --field-separator <separator> print result in CSV format with custom separator The stacktraces are printed in the same line separated by ":". The header is updated to show the stacktrace. Also the debug output is added at the end as a comment. $ sudo ./perf lock con -abv -x, -F wait_total sleep 1 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols # output: total wait, type, caller, stacktrace 37134, spinlock, rcu_core+0xd4, 0xffffffff9d0401e4 _raw_spin_lock_irqsave+0x44: 0xffffffff9c738114 rcu_core+0xd4: ... 21213, spinlock, raw_spin_rq_lock_nested+0x1b, 0xffffffff9d0407c0 _raw_spin_lock+0x30: 0xffffffff9c6d9cfb raw_spin_rq_lock_nested+0x1b: ... 20506, rwlock:W, ep_done_scan+0x2d, 0xffffffff9c9bc4dd ep_done_scan+0x2d: 0xffffffff9c9bd5f1 do_epoll_wait+0x6d1: ... 18044, rwlock:R, ep_poll_callback+0x2d, 0xffffffff9d040555 _raw_read_lock_irqsave+0x45: 0xffffffff9c9bc81d ep_poll_callback+0x2d: ... 17890, rwlock:W, do_epoll_wait+0x47b, 0xffffffff9c9bd39b do_epoll_wait+0x47b: 0xffffffff9c9be9ef __x64_sys_epoll_wait+0x6d1: ... 12114, spinlock, futex_wait_queue+0x60, 0xffffffff9d0407c0 _raw_spin_lock+0x30: 0xffffffff9d037cae __schedule+0xbe: ... # debug: total=7, bad=0, bad_task=0, bad_stack=0, bad_time=0, bad_data=0 Also note that some field (like lock symbols) can be empty. $ sudo ./perf lock con -abl -x, -E 10 sleep 1 # output: contended, total wait, max wait, avg wait, address, symbol, type 6, 275025, 61764, 45837, ffff9dcc9f7d60d0, , spinlock 18, 87716, 11196, 4873, ffff9dc540059000, , spinlock 2, 6472, 5499, 3236, ffff9dcc7f730e00, rq_lock, spinlock 3, 4429, 2341, 1476, ffff9dcc7f7b0e00, rq_lock, spinlock 3, 3974, 1635, 1324, ffff9dcc7f7f0e00, rq_lock, spinlock 4, 3290, 1326, 822, ffff9dc5f4e2cde0, , rwlock 3, 2894, 1023, 964, ffffffff9e0d7700, rcu_state, spinlock 1, 2567, 2567, 2567, ffff9dcc7f6b0e00, rq_lock, spinlock 4, 1259, 596, 314, ffff9dc69c2adde0, , rwlock 1, 934, 934, 934, ffff9dcc7f670e00, rq_lock, spinlock Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01perf lock: Remove stale commentsNamhyung Kim
The comment was for symbol_conf.sort_by_name which was deleted already. Let's get rid of the stale comments as well. Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-07-01Merge tag 'kbuild-v6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the deprecated rule to build *.dtbo from *.dts - Refactor section mismatch detection in modpost - Fix bogus ARM section mismatch detections - Fix error of 'make gtags' with O= option - Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with the latest LLVM version - Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed - Ignore more compiler-generated symbols for kallsyms - Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles - Enable more kernel-doc warnings with W=2 - Refactor <linux/export.h> by generating KSYMTAB data by modpost - Deprecate <asm/export.h> and <asm-generic/export.h> - Remove the EXPORT_DATA_SYMBOL macro - Move the check for static EXPORT_SYMBOL back to modpost, which makes the build faster - Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm - Warn missing MODULE_DESCRIPTION when building modules with W=1 - Make 'make clean' robust against too long argument error - Exclude more objects from GCOV to fix CFI failures with GCOV - Allow 'make modules_install' to install modules.builtin and modules.builtin.modinfo even when CONFIG_MODULES is disabled - Include modules.builtin and modules.builtin.modinfo in the linux-image Debian package even when CONFIG_MODULES is disabled - Revive "Entering directory" logging for the latest Make version * tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits) modpost: define more R_ARM_* for old distributions kbuild: revive "Entering directory" for Make >= 4.4.1 kbuild: set correct abs_srctree and abs_objtree for package builds scripts/mksysmap: Ignore prefixed KCFI symbols kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb kbuild: builddeb: always make modules_install, to install modules.builtin* modpost: continue even with unknown relocation type modpost: factor out Elf_Sym pointer calculation to section_rel() modpost: factor out inst location calculation to section_rel() kbuild: Disable GCOV for *.mod.o kbuild: Fix CFI failures with GCOV kbuild: make clean rule robust against too long argument error script: modpost: emit a warning when the description is missing kbuild: make modules_install copy modules.builtin(.modinfo) linux/export.h: rename 'sec' argument to 'license' modpost: show offset from symbol for section mismatch warnings modpost: merge two similar section mismatch warnings kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion modpost: use null string instead of NULL pointer for default namespace modpost: squash sym_update_namespace() into sym_add_exported() ...
2023-07-01Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Fix memory corruption (overwriting the kmalloc redzone) when saving the SVE state while in SVE streaming mode" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: sme: Use STR P to clear FFR context field in streaming SVE mode
2023-07-01Merge tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull CXL updates from Dan Williams: "The highlights in terms of new functionality are support for the standard CXL Performance Monitor definition that appeared in CXL 3.0, support for device sanitization (wiping all data from a device), secure-erase (re-keying encryption of user data), and support for firmware update. The firmware update support is notable as it reuses the simple sysfs_upload interface to just cat(1) a blob to a sysfs file and pipe that to the device. Additionally there are a substantial number of cleanups and reorganizations to get ready for RCH error handling (RCH == Restricted CXL Host == current shipping hardware generation / pre CXL-2.0 topologies) and type-2 (accelerator / vendor specific) devices. For vendor specific devices they implement a subset of what the generic type-3 (generic memory expander) driver expects. As a result the rework decouples optional infrastructure from the core driver context. For RCH topologies, where the specification working group did not want to confuse pre-CXL-aware operating systems, many of the standard registers are hidden which makes support standard bus features like AER (PCIe Advanced Error Reporting) difficult. The rework arranges for the driver to help the PCI-AER core. Bjorn is on board with this direction but a late regression disocvery means the completion of this functionality needs to cook a bit longer, so it is code reorganizations only for now. Summary: - Add infrastructure for supporting background commands along with support for device sanitization and firmware update - Introduce a CXL performance monitoring unit driver based on the common definition in the specification. - Land some preparatory cleanup and refactoring for the anticipated arrival of CXL type-2 (accelerator devices) and CXL RCH (CXL-v1.1 topology) error handling. - Rework CPU cache management with respect to region configuration (device hotplug or other dynamic changes to memory interleaving) - Fix region reconfiguration vs CXL decoder ordering rules" * tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (51 commits) cxl: Fix one kernel-doc comment cxl/pci: Use correct flag for sanitize polling docs: perf: Minimal introduction the the CXL PMU device and driver perf: CXL Performance Monitoring Unit driver tools/testing/cxl: add firmware update emulation to CXL memdevs tools/testing/cxl: Use named effects for the Command Effect Log tools/testing/cxl: Fix command effects for inject/clear poison cxl: add a firmware update mechanism using the sysfs firmware loader cxl/test: Add Secure Erase opcode support cxl/mem: Support Secure Erase cxl/test: Add Sanitize opcode support cxl/mem: Wire up Sanitization support cxl/mbox: Add sanitization handling machinery cxl/mem: Introduce security state sysfs file cxl/mbox: Allow for IRQ_NONE case in the isr Revert "cxl/port: Enable the HDM decoder capability for switch ports" cxl/memdev: Formalize endpoint port linkage cxl/pci: Unconditionally unmask 256B Flit errors cxl/region: Manage decoder target_type at decoder-attach time cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM ...
2023-07-01Merge tag 'libnvdimm-for-6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull nvdimm and DAX updates from Vishal Verma: "This is mostly small cleanups and fixes, with the biggest change being the change to the DAX fault handler allowing it to return VM_FAULT_HWPOISON. Summary: - DAX fixes and cleanups including a use after free, extra references, and device unregistration, and a redundant variable. - Allow the DAX fault handler to return VM_FAULT_HWPOISON - A few libnvdimm cleanups such as making some functions and variables static where sufficient. - Add a few missing prototypes for wrapped functions in tools/testing/nvdimm" * tag 'libnvdimm-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: enable dax fault handler to report VM_FAULT_HWPOISON nvdimm: make security_show static nvdimm: make nd_class variable static dax/kmem: Pass valid argument to memory_group_register_static fsdax: remove redundant variable 'error' dax: Cleanup extra dax_region references dax: Introduce alloc_dev_dax_id() dax: Use device_unregister() in unregister_dax_mapping() dax: Fix dax_mapping_release() use after free tools/testing/nvdimm: Drop empty platform remove function libnvdimm: mark 'security_show' static again testing: nvdimm: add missing prototypes for wrapped functions dax: fix missing-prototype warnings
2023-07-01Merge tag 'sysctl-fixes-v2-v6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull another sysctl fix from Luis Chamberlain: "Just one minor nit I forgot to merge" * tag 'sysctl-fixes-v2-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: sysctl: set variable sysctl_mount_point storage-class-specifier to static
2023-07-01Merge tag 'flex-array-transformations-6.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flexible-array update from Gustavo Silva: "Transform a zero-length array into a C99 flexible-array member. This addresses a build failure with Clang by fixing multiple '-Warray-bounds' warnings in drivers/staging/ks7010/ks_wlan_net.c" * tag 'flex-array-transformations-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: uapi: wireless: Replace zero-length array with flexible-array member
2023-07-01pid: use struct_size_t() helperChristian Brauner
Before commit d67790ddf021 ("overflow: Add struct_size_t() helper") only struct_size() existed, which expects a valid pointer instance containing the flexible array. However, when we determine the default struct pid allocation size for the associated kmem cache of a pid namespace we need to take the nesting depth of the pid namespace into account without an variable instance necessarily being available. In commit b69f0aeb0689 ("pid: Replace struct pid 1-element array with flex-array") we used to handle this the old fashioned way and cast NULL to a struct pid pointer type. However, we do apparently have a dedicated struct_size_t() helper for exactly this case. So switch to that. Suggested-by: Kees Cook <keescook@chromium.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-01mm: Update do_vmi_align_munmap() return semanticsLiam R. Howlett
Since do_vmi_align_munmap() will always honor the downgrade request on the success, the callers no longer have to deal with confusing return codes. Since all callers that request downgrade actually want the lock to be dropped, change the downgrade to an unlock request. Note that the lock still needs to be held in read mode during the page table clean up to avoid races with a map request. Update do_vmi_align_munmap() to return 0 for success. Clean up the callers and comments to always expect the unlock to be honored on the success path. The error path will always leave the lock untouched. As part of the cleanup, the wrapper function do_vmi_munmap() and callers to the wrapper are also updated. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-mm/20230629191414.1215929-1-willy@infradead.org/ Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>