summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-02-12Merge tag 'seccomp-v5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fixes from Kees Cook: "This fixes a corner case of fatal SIGSYS being ignored since v5.15. Along with the signal fix is a change to seccomp so that seeing another syscall after a fatal filter result will cause seccomp to kill the process harder. Summary: - Force HANDLER_EXIT even for SIGNAL_UNKILLABLE - Make seccomp self-destruct after fatal filter results - Update seccomp samples for easier behavioral demonstration" * tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: samples/seccomp: Adjust sample to also provide kill option seccomp: Invalidate seccomp mode to catch death failures signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
2022-02-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "5 patches. Subsystems affected by this patch series: binfmt, procfs, and mm (vmscan, memcg, and kfence)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kfence: make test case compatible with run time set sample interval mm: memcg: synchronize objcg lists with a dedicated spinlock mm: vmscan: remove deadlock due to throttling failing to make progress fs/proc: task_mmu.c: don't read mapcount for migration entry fs/binfmt_elf: fix PT_LOAD p_align values for loaders
2022-02-12kconfig: fix failing to generate auto.confJing Leng
When the KCONFIG_AUTOCONFIG is specified (e.g. export \ KCONFIG_AUTOCONFIG=output/config/auto.conf), the directory of include/config/ will not be created, so kconfig can't create deps files in it and auto.conf can't be generated. Signed-off-by: Jing Leng <jleng@ambarella.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-02-12Revert "usb: dwc2: drd: fix soft connect when gadget is unconfigured"Greg Kroah-Hartman
This reverts commit 269cbcf7b72de6f0016806d4a0cec1d689b55a87. It causes build errors as reported by the kernel test robot. Link: https://lore.kernel.org/r/202202112236.AwoOTtHO-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Fixes: 269cbcf7b72d ("usb: dwc2: drd: fix soft connect when gadget is unconfigured") Cc: stable@kernel.org Cc: Amelie Delaunay <amelie.delaunay@foss.st.com> Cc: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> Cc: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-11kfence: make test case compatible with run time set sample intervalPeng Liu
The parameter kfence_sample_interval can be set via boot parameter and late shell command, which is convenient for automated tests and KFENCE parameter optimization. However, KFENCE test case just uses compile-time CONFIG_KFENCE_SAMPLE_INTERVAL, which will make KFENCE test case not run as users desired. Export kfence_sample_interval, so that KFENCE test case can use run-time-set sample interval. Link: https://lkml.kernel.org/r/20220207034432.185532-1-liupeng256@huawei.com Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian Knig <christian.koenig@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-11mm: memcg: synchronize objcg lists with a dedicated spinlockRoman Gushchin
Alexander reported a circular lock dependency revealed by the mmap1 ltp test: LOCKDEP_CIRCULAR (suite: ltp, case: mtest06 (mmap1)) WARNING: possible circular locking dependency detected 5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1 Not tainted ------------------------------------------------------ mmap1/202299 is trying to acquire lock: 00000001892c0188 (css_set_lock){..-.}-{2:2}, at: obj_cgroup_release+0x4a/0xe0 but task is already holding lock: 00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sighand->siglock){-.-.}-{2:2}: __lock_acquire+0x604/0xbd8 lock_acquire.part.0+0xe2/0x238 lock_acquire+0xb0/0x200 _raw_spin_lock_irqsave+0x6a/0xd8 __lock_task_sighand+0x90/0x190 cgroup_freeze_task+0x2e/0x90 cgroup_migrate_execute+0x11c/0x608 cgroup_update_dfl_csses+0x246/0x270 cgroup_subtree_control_write+0x238/0x518 kernfs_fop_write_iter+0x13e/0x1e0 new_sync_write+0x100/0x190 vfs_write+0x22c/0x2d8 ksys_write+0x6c/0xf8 __do_syscall+0x1da/0x208 system_call+0x82/0xb0 -> #0 (css_set_lock){..-.}-{2:2}: check_prev_add+0xe0/0xed8 validate_chain+0x736/0xb20 __lock_acquire+0x604/0xbd8 lock_acquire.part.0+0xe2/0x238 lock_acquire+0xb0/0x200 _raw_spin_lock_irqsave+0x6a/0xd8 obj_cgroup_release+0x4a/0xe0 percpu_ref_put_many.constprop.0+0x150/0x168 drain_obj_stock+0x94/0xe8 refill_obj_stock+0x94/0x278 obj_cgroup_charge+0x164/0x1d8 kmem_cache_alloc+0xac/0x528 __sigqueue_alloc+0x150/0x308 __send_signal+0x260/0x550 send_signal+0x7e/0x348 force_sig_info_to_task+0x104/0x180 force_sig_fault+0x48/0x58 __do_pgm_check+0x120/0x1f0 pgm_check_handler+0x11e/0x180 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sighand->siglock); lock(css_set_lock); lock(&sighand->siglock); lock(css_set_lock); *** DEADLOCK *** 2 locks held by mmap1/202299: #0: 00000000ca3b3818 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x38/0x180 #1: 00000001892ad560 (rcu_read_lock){....}-{1:2}, at: percpu_ref_put_many.constprop.0+0x0/0x168 stack backtrace: CPU: 15 PID: 202299 Comm: mmap1 Not tainted 5.17.0-20220113.rc0.git0.f2211f194038.300.fc35.s390x+debug #1 Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: dump_stack_lvl+0x76/0x98 check_noncircular+0x136/0x158 check_prev_add+0xe0/0xed8 validate_chain+0x736/0xb20 __lock_acquire+0x604/0xbd8 lock_acquire.part.0+0xe2/0x238 lock_acquire+0xb0/0x200 _raw_spin_lock_irqsave+0x6a/0xd8 obj_cgroup_release+0x4a/0xe0 percpu_ref_put_many.constprop.0+0x150/0x168 drain_obj_stock+0x94/0xe8 refill_obj_stock+0x94/0x278 obj_cgroup_charge+0x164/0x1d8 kmem_cache_alloc+0xac/0x528 __sigqueue_alloc+0x150/0x308 __send_signal+0x260/0x550 send_signal+0x7e/0x348 force_sig_info_to_task+0x104/0x180 force_sig_fault+0x48/0x58 __do_pgm_check+0x120/0x1f0 pgm_check_handler+0x11e/0x180 INFO: lockdep is turned off. In this example a slab allocation from __send_signal() caused a refilling and draining of a percpu objcg stock, resulted in a releasing of another non-related objcg. Objcg release path requires taking the css_set_lock, which is used to synchronize objcg lists. This can create a circular dependency with the sighandler lock, which is taken with the locked css_set_lock by the freezer code (to freeze a task). In general it seems that using css_set_lock to synchronize objcg lists makes any slab allocations and deallocation with the locked css_set_lock and any intervened locks risky. To fix the problem and make the code more robust let's stop using css_set_lock to synchronize objcg lists and use a new dedicated spinlock instead. Link: https://lkml.kernel.org/r/Yfm1IHmoGdyUR81T@carbon.dhcp.thefacebook.com Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-11mm: vmscan: remove deadlock due to throttling failing to make progressMel Gorman
A soft lockup bug in kcompactd was reported in a private bugzilla with the following visible in dmesg; watchdog: BUG: soft lockup - CPU#33 stuck for 26s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 52s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 78s! [kcompactd0:479] watchdog: BUG: soft lockup - CPU#33 stuck for 104s! [kcompactd0:479] The machine had 256G of RAM with no swap and an earlier failed allocation indicated that node 0 where kcompactd was run was potentially unreclaimable; Node 0 active_anon:29355112kB inactive_anon:2913528kB active_file:0kB inactive_file:0kB unevictable:64kB isolated(anon):0kB isolated(file):0kB mapped:8kB dirty:0kB writeback:0kB shmem:26780kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 23480320kB writeback_tmp:0kB kernel_stack:2272kB pagetables:24500kB all_unreclaimable? yes Vlastimil Babka investigated a crash dump and found that a task migrating pages was trying to drain PCP lists; PID: 52922 TASK: ffff969f820e5000 CPU: 19 COMMAND: "kworker/u128:3" Call Trace: __schedule schedule schedule_timeout wait_for_completion __flush_work __drain_all_pages __alloc_pages_slowpath.constprop.114 __alloc_pages alloc_migration_target migrate_pages migrate_to_node do_migrate_pages cpuset_migrate_mm_workfn process_one_work worker_thread kthread ret_from_fork This failure is specific to CONFIG_PREEMPT=n builds. The root of the problem is that kcompact0 is not rescheduling on a CPU while a task that has isolated a large number of the pages from the LRU is waiting on kcompact0 to reschedule so the pages can be released. While shrink_inactive_list() only loops once around too_many_isolated, reclaim can continue without rescheduling if sc->skipped_deactivate == 1 which could happen if there was no file LRU and the inactive anon list was not low. Link: https://lkml.kernel.org/r/20220203100326.GD3301@suse.de Fixes: d818fca1cac3 ("mm/vmscan: throttle reclaim and compaction when too may pages are isolated") Signed-off-by: Mel Gorman <mgorman@suse.de> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-11fs/proc: task_mmu.c: don't read mapcount for migration entryYang Shi
The syzbot reported the below BUG: kernel BUG at include/linux/page-flags.h:785! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline] RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744 Call Trace: page_mapcount include/linux/mm.h:837 [inline] smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466 smaps_pte_entry fs/proc/task_mmu.c:538 [inline] smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601 walk_pmd_range mm/pagewalk.c:128 [inline] walk_pud_range mm/pagewalk.c:205 [inline] walk_p4d_range mm/pagewalk.c:240 [inline] walk_pgd_range mm/pagewalk.c:277 [inline] __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379 walk_page_vma+0x277/0x350 mm/pagewalk.c:530 smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768 smap_gather_stats fs/proc/task_mmu.c:741 [inline] show_smap+0xc6/0x440 fs/proc/task_mmu.c:822 seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272 seq_read+0x3e0/0x5b0 fs/seq_file.c:162 vfs_read+0x1b5/0x600 fs/read_write.c:479 ksys_read+0x12d/0x250 fs/read_write.c:619 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae The reproducer was trying to read /proc/$PID/smaps when calling MADV_FREE at the mean time. MADV_FREE may split THPs if it is called for partial THP. It may trigger the below race: CPU A CPU B ----- ----- smaps walk: MADV_FREE: page_mapcount() PageCompound() split_huge_page() page = compound_head(page) PageDoubleMap(page) When calling PageDoubleMap() this page is not a tail page of THP anymore so the BUG is triggered. This could be fixed by elevated refcount of the page before calling mapcount, but that would prevent it from counting migration entries, and it seems overkilling because the race just could happen when PMD is split so all PTE entries of tail pages are actually migration entries, and smaps_account() does treat migration entries as mapcount == 1 as Kirill pointed out. Add a new parameter for smaps_account() to tell this entry is migration entry then skip calling page_mapcount(). Don't skip getting mapcount for device private entries since they do track references with mapcount. Pagemap also has the similar issue although it was not reported. Fixed it as well. [shy828301@gmail.com: v4] Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com [nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()] Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()") Signed-off-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com Acked-by: David Hildenbrand <david@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-11fs/binfmt_elf: fix PT_LOAD p_align values for loadersMike Rapoport
Rui Salvaterra reported that Aisleroit solitaire crashes with "Wrong __data_start/_end pair" assertion from libgc after update to v5.17-rc1. Bisection pointed to commit 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") that fixed handling of static PIEs, but made the condition that guards load_bias calculation to exclude loader binaries. Restoring the check for presence of interpreter fixes the problem. Link: https://lkml.kernel.org/r/20220202121433.3697146-1-rppt@kernel.org Fixes: 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H.J. Lu" <hjl.tools@gmail.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-11atl1c: fix tx timeout after link flap on Mikrotik 10/25G NICGatis Peisenieks
If NIC had packets in tx queue at the moment link down event happened, it could result in tx timeout when link got back up. Since device has more than one tx queue we need to reset them accordingly. Fixes: 057f4af2b171 ("atl1c: add 4 RX/TX queue support for Mikrotik 10/25G NIC") Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Link: https://lore.kernel.org/r/20220211065123.4187615-1-gatis@mikrotik.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-11mctp: serial: Cancel pending work from ndo_uninit handlerJeremy Kerr
We cannot do the cancel_work_sync from after the unregister_netdev, as the dev pointer is no longer valid, causing a uaf on ldisc unregister (or device close). Instead, do the cancel_work_sync from the ndo_uninit op, where the dev still exists, but the queue has stopped. Fixes: 7bd9890f3d74 ("mctp: serial: cancel tx work on ldisc close") Reported-by: Luo Likang <luolikang@nsfocus.com> Tested-by: Luo Likang <luolikang@nsfocus.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20220211011552.1861886-1-jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-11net: dsa: lan9303: fix reset on probeMans Rullgard
The reset input to the LAN9303 chip is active low, and devicetree gpio handles reflect this. Therefore, the gpio should be requested with an initial state of high in order for the reset signal to be asserted. Other uses of the gpio already use the correct polarity. Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303") Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fianelil <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220209145454.19749-1-mans@mansr.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-11Merge tag 'soc-fixes-5.17-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "This is a fairly large set of bugfixes, most of which had been sent a while ago but only now made it into the soc tree: Maintainer file updates: - Claudiu Beznea now co-maintains the at91 soc family, replacing Ludovic Desroches. - Michael Walle maintains the sl28cpld drivers - Alain Volmat and Raphael Gallais-Pou take over some drivers for ST platforms - Alim Akhtar is an additional reviewer for Samsung platforms Code fixes: - Op-tee had a problem with object lifetime that needs a slightly complex fix, as well as another bug with error handling. - Several minor issues for the OMAP platform, including a regression with the timer - A Kconfig change to fix a build-time issue on Intel SoCFPGA Device tree fixes: - The Amlogic Meson platform fixes a boot regression on am1-odroid, a spurious interrupt, and a problem with reserved memory regions - In the i.MX platform, several bug fixes are needed to make devices work correctly: SD card detection, alarmtimer, and sound card on some board. One patch for the GPU got in there by accident and gets reverted again. - TI K3 needs a fix for J721S2 serial port numbers - ux500 needs a fix to mount the SD card as root on the Skomer phone" * tag 'soc-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (46 commits) Revert "arm64: dts: imx8mn-venice-gw7902: disable gpu" arm64: Remove ARCH_VULCAN MAINTAINERS: add myself as a maintainer for the sl28cpld MAINTAINERS: add IRC to ARM sub-architectures and Devicetree MAINTAINERS: arm: samsung: add Git tree and IRC ARM: dts: Fix boot regression on Skomer ARM: dts: spear320: Drop unused and undocumented 'irq-over-gpio' property soc: aspeed: lpc-ctrl: Block error printing on probe defer cases docs/ABI: testing: aspeed-uart-routing: Escape asterisk MAINTAINERS: update drm/stm drm/sti and cec/sti maintainers MAINTAINERS: Update Benjamin Gaignard maintainer status ARM: socfpga: fix missing RESET_CONTROLLER arm64: dts: meson-sm1-odroid: fix boot loop after reboot arm64: dts: meson-g12: drop BL32 region from SEI510/SEI610 arm64: dts: meson-g12: add ATF BL32 reserved-memory region arm64: dts: meson-gx: add ATF BL32 reserved-memory region arm64: dts: meson-sm1-bananapi-m5: fix wrong GPIO domain for GPIOE_2 arm64: dts: meson-sm1-odroid: use correct enable-gpio pin for tf-io regulator arm64: dts: meson-g12b-odroid-n2: fix typo 'dio2133' optee: use driver internal tee_context for some rpc ...
2022-02-11Merge branch 'bpf: fix a bpf_timer initialization issue'Alexei Starovoitov
Yonghong Song says: ==================== The patch [1] exposed a bpf_timer initialization bug in function check_and_init_map_value(). With bug fix here, the patch [1] can be applied with all selftests passed. Please see individual patches for fix details. [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/ Changelog: v3 -> v4: . move header file in patch #1 to avoid bpf-next merge conflict v2 -> v3: . switch patch #1 and patch #2 for better bisecting v1 -> v2: . add Fixes tag for patch #1 . rebase against bpf tree ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-02-11bpf: Fix a bpf_timer initialization issueYonghong Song
The patch in [1] intends to fix a bpf_timer related issue, but the fix caused existing 'timer' selftest to fail with hang or some random errors. After some debug, I found an issue with check_and_init_map_value() in the hashtab.c. More specifically, in hashtab.c, we have code l_new = bpf_map_kmalloc_node(&htab->map, ...) check_and_init_map_value(&htab->map, l_new...) Note that bpf_map_kmalloc_node() does not do initialization so l_new contains random value. The function check_and_init_map_value() intends to zero the bpf_spin_lock and bpf_timer if they exist in the map. But I found bpf_spin_lock is zero'ed but bpf_timer is not zero'ed. With [1], later copy_map_value() skips copying of bpf_spin_lock and bpf_timer. The non-zero bpf_timer caused random failures for 'timer' selftest. Without [1], for both bpf_spin_lock and bpf_timer case, bpf_timer will be zero'ed, so 'timer' self test is okay. For check_and_init_map_value(), why bpf_spin_lock is zero'ed properly while bpf_timer not. In bpf uapi header, we have struct bpf_spin_lock { __u32 val; }; struct bpf_timer { __u64 :64; __u64 :64; } __attribute__((aligned(8))); The initialization code: *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = (struct bpf_spin_lock){}; *(struct bpf_timer *)(dst + map->timer_off) = (struct bpf_timer){}; It appears the compiler has no obligation to initialize anonymous fields. For example, let us use clang with bpf target as below: $ cat t.c struct bpf_timer { unsigned long long :64; }; struct bpf_timer2 { unsigned long long a; }; void test(struct bpf_timer *t) { *t = (struct bpf_timer){}; } void test2(struct bpf_timer2 *t) { *t = (struct bpf_timer2){}; } $ clang -target bpf -O2 -c -g t.c $ llvm-objdump -d t.o ... 0000000000000000 <test>: 0: 95 00 00 00 00 00 00 00 exit 0000000000000008 <test2>: 1: b7 02 00 00 00 00 00 00 r2 = 0 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 3: 95 00 00 00 00 00 00 00 exit gcc11.2 does not have the above issue. But from INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x Programming languages — C http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1547.pdf page 157: Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization. To fix the problem, let use memset for bpf_timer case in check_and_init_map_value(). For consistency, memset is also used for bpf_spin_lock case. [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/ Fixes: 68134668c17f3 ("bpf: Add map side support for bpf timers.") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220211194953.3142152-1-yhs@fb.com
2022-02-11bpf: Emit bpf_timer in vmlinux BTFYonghong Song
Currently the following code in check_and_init_map_value() *(struct bpf_timer *)(dst + map->timer_off) = (struct bpf_timer){}; can help generate bpf_timer definition in vmlinuxBTF. But the code above may not zero the whole structure due to anonymour members and that code will be replaced by memset in the subsequent patch and bpf_timer definition will disappear from vmlinuxBTF. Let us emit the type explicitly so bpf program can continue to use it from vmlinux.h. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220211194948.3141529-1-yhs@fb.com
2022-02-11Merge branch 'Fix for crash due to overwrite in copy_map_value'Alexei Starovoitov
Kumar Kartikeya says: ==================== A fix for an oversight in copy_map_value that leads to kernel crash. Also, a question for BPF developers: It seems in arraymap.c, we always do check_and_free_timer_in_array after we do copy_map_value in map_update_elem callback, but the same is not done for hashtab.c. Is there a specific reason for this difference in behavior, or did I miss that it happens for hashtab.c as well? Changlog: --------- v1 -> v2: v1: https://lore.kernel.org/bpf/20220209051113.870717-1-memxor@gmail.com * Fix build error for selftests patch due to missing SYS_PREFIX in bpf tree ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-02-11selftests/bpf: Add test for bpf_timer overwriting crashKumar Kartikeya Dwivedi
Add a test that validates that timer value is not overwritten when doing a copy_map_value call in the kernel. Without the prior fix, this test triggers a crash. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220209070324.1093182-3-memxor@gmail.com
2022-02-11bpf: Fix crash due to incorrect copy_map_valueKumar Kartikeya Dwivedi
When both bpf_spin_lock and bpf_timer are present in a BPF map value, copy_map_value needs to skirt both objects when copying a value into and out of the map. However, the current code does not set both s_off and t_off in copy_map_value, which leads to a crash when e.g. bpf_spin_lock is placed in map value with bpf_timer, as bpf_map_update_elem call will be able to overwrite the other timer object. When the issue is not fixed, an overwriting can produce the following splat: [root@(none) bpf]# ./test_progs -t timer_crash [ 15.930339] bpf_testmod: loading out-of-tree module taints kernel. [ 16.037849] ================================================================== [ 16.038458] BUG: KASAN: user-memory-access in __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.038944] Write of size 8 at addr 0000000000043ec0 by task test_progs/325 [ 16.039399] [ 16.039514] CPU: 0 PID: 325 Comm: test_progs Tainted: G OE 5.16.0+ #278 [ 16.039983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.15.0-1 04/01/2014 [ 16.040485] Call Trace: [ 16.040645] <TASK> [ 16.040805] dump_stack_lvl+0x59/0x73 [ 16.041069] ? __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.041427] kasan_report.cold+0x116/0x11b [ 16.041673] ? __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.042040] __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.042328] ? memcpy+0x39/0x60 [ 16.042552] ? pv_hash+0xd0/0xd0 [ 16.042785] ? lockdep_hardirqs_off+0x95/0xd0 [ 16.043079] __bpf_spin_lock_irqsave+0xdf/0xf0 [ 16.043366] ? bpf_get_current_comm+0x50/0x50 [ 16.043608] ? jhash+0x11a/0x270 [ 16.043848] bpf_timer_cancel+0x34/0xe0 [ 16.044119] bpf_prog_c4ea1c0f7449940d_sys_enter+0x7c/0x81 [ 16.044500] bpf_trampoline_6442477838_0+0x36/0x1000 [ 16.044836] __x64_sys_nanosleep+0x5/0x140 [ 16.045119] do_syscall_64+0x59/0x80 [ 16.045377] ? lock_is_held_type+0xe4/0x140 [ 16.045670] ? irqentry_exit_to_user_mode+0xa/0x40 [ 16.046001] ? mark_held_locks+0x24/0x90 [ 16.046287] ? asm_exc_page_fault+0x1e/0x30 [ 16.046569] ? asm_exc_page_fault+0x8/0x30 [ 16.046851] ? lockdep_hardirqs_on+0x7e/0x100 [ 16.047137] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 16.047405] RIP: 0033:0x7f9e4831718d [ 16.047602] Code: b4 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b3 6c 0c 00 f7 d8 64 89 01 48 [ 16.048764] RSP: 002b:00007fff488086b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000023 [ 16.049275] RAX: ffffffffffffffda RBX: 00007f9e48683740 RCX: 00007f9e4831718d [ 16.049747] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fff488086d0 [ 16.050225] RBP: 00007fff488086f0 R08: 00007fff488085d7 R09: 00007f9e4cb594a0 [ 16.050648] R10: 0000000000000000 R11: 0000000000000206 R12: 00007f9e484cde30 [ 16.051124] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 16.051608] </TASK> [ 16.051762] ================================================================== Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com
2022-02-11Merge tag 'pci-v5.17-fixes-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fix from Bjorn Helgaas: "Revert a commit that reduced the number of IRQs used but resulted in interrupt storms (Bjorn Helgaas)" * tag 'pci-v5.17-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI/portdrv: Do not setup up IRQs if there are no users"
2022-02-11Revert "PCI/portdrv: Do not setup up IRQs if there are no users"Bjorn Helgaas
This reverts commit 0e8ae5a6ff5952253cd7cc0260df838ab4c21009. 0e8ae5a6ff59 ("PCI/portdrv: Do not setup up IRQs if there are no users") reduced usage of IRQs when we don't think we need them. But Joey, Sergiu, and David reported choppy GUI rendering, systems that became unresponsive every few seconds, incorrect values reported by cpufreq, and high IRQ 16 CPU usage. Joey bisected the issues to 0e8ae5a6ff59, so revert it until we figure out a better solution. Link: https://lore.kernel.org/r/20220210222717.GA658201@bhelgaas Link: https://bugzilla.kernel.org/show_bug.cgi?id=215533 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215546 Reported-by: Joey Corleone <joey.corleone@mail.ru> Reported-by: Sergiu Deitsch <sergiu.deitsch@gmail.com> Reported-by: David Spencer <dspencer577@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.16+ Cc: Jan Kiszka <jan.kiszka@siemens.com>
2022-02-11Merge tag 'riscv-for-linus-5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix to avoid undefined behavior when stack backtracing, which manifests in GCC as incorrect stack addresses - A few fixes for the XIP kernels - A fix to tracking NUMA state on CPU hotplug - Support for the recently relesaed binutils-2.38, which changed the default ISA version to one without CSRs or fence.i in 'I' extension * tag 'riscv-for-linus-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: fix build with binutils 2.38 riscv: cpu-hotplug: clear cpu from numa map when teardown riscv: extable: fix err reg writing in dedicated uaccess handler riscv/mm: Add XIP_FIXUP for riscv_pfn_base riscv/mm: Add XIP_FIXUP for phys_ram_base riscv: Fix XIP_FIXUP_FLASH_OFFSET riscv: eliminate unreliable __builtin_frame_address(1)
2022-02-11Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Enable Cortex-A510 erratum 2051678 by default as we do with other errata. - arm64 IORT: Check the node revision for PMCG resources to cope with old firmware based on a broken revision of the spec that had no way to describe the second register page (when an implementation is using the recommended RELOC_CTRS feature). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ACPI/IORT: Check node revision for PMCG resources arm64: Enable Cortex-A510 erratum 2051678 by default
2022-02-11Merge tag 'acpi-5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These revert two commits that turned out to be problematic and fix two issues related to wakeup from suspend-to-idle on x86. Specifics: - Revert a recent change that attempted to avoid issues with conflicting address ranges during PCI initialization, because it turned out to introduce a regression (Hans de Goede). - Revert a change that limited EC GPE wakeups from suspend-to-idle to systems based on Intel hardware, because it turned out that systems based on hardware from other vendors depended on that functionality too (Mario Limonciello). - Fix two issues related to the handling of wakeup interrupts and wakeup events signaled through the EC GPE during suspend-to-idle on x86 (Rafael Wysocki)" * tag 'acpi-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/PCI: revert "Ignore E820 reservations for bridge windows on newer systems" PM: s2idle: ACPI: Fix wakeup interrupts handling ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
2022-02-11Merge tag 'gfs2-v5.16-rc3-fixes2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: - Revert debug commit that causes unexpected data corruption - Fix muti-block reservation regression * tag 'gfs2-v5.16-rc3-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix gfs2_release for non-writers regression Revert "gfs2: check context in gfs2_glock_put"
2022-02-11Merge tag 'block-5.17-2022-02-11' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request - nvme-tcp: fix bogus request completion when failing to send AER (Sagi Grimberg) - add the missing nvme_complete_req tracepoint for batched completion (Bean Huo) - Revert of the loop async autoclear issue that has continued to plague us this release. A few patchsets exists to improve this, but they are too invasive to be considered at this point (Tetsuo) * tag 'block-5.17-2022-02-11' of git://git.kernel.dk/linux-block: loop: revert "make autoclear operation asynchronous" nvme-tcp: fix bogus request completion when failing to send AER nvme: add nvme_complete_req tracepoint for batched completion
2022-02-11Merge tag 'io_uring-5.17-2022-02-11' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix a false-positive warning from an older gcc (Alviro) - Allow oom killer invocations from io_uring_setup (Shakeel) * tag 'io_uring-5.17-2022-02-11' of git://git.kernel.dk/linux-block: mm: io_uring: allow oom-killer from io_uring_setup io_uring: Clean up a false-positive warning from GCC 9.3.0
2022-02-11Merge tag 'gpio-fixes-for-v5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - use sleeping variants of GPIO accessors where needed in gpio-aggregator - never return kernel's internal error codes to user-space in gpiolib core - use the correct register for reading output values in gpio-sifive - fix line hogging in gpio-sim * tag 'gpio-fixes-for-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: sim: fix hogs with custom chip labels gpio: sifive: use the correct register to read output values gpiolib: Never return internal error codes to user space gpio: aggregator: Fix calling into sleeping GPIO controllers
2022-02-11Merge tag 'ata-5.17-rc4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: "A couple of additional fixes for 5.17-rc4: - Fix compilation warnings in the sata_fsl driver (powerpc) (me) - Disable TRIM commands on M88V29 devices as these commands are failing despite the device reporting it supports TRIM (Zoltan)" * tag 'ata-5.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-core: Disable TRIM on M88V29 ata: sata_fsl: fix sscanf() and sysfs_emit() format strings
2022-02-11Merge tag 'drm-fixes-2022-02-11' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular fixes pull, mostly i915 and amd fixes, along with a maintainers update for fbdev core. Otherwise just some build fixes and vc4 HDMI fixes. fbdev: - MAINTAINERS: add Daniel as fbdev core module maintainer - build warning fix - implicit type cast fix panel: - simple: Fix assignments from panel_dpi_probe() privacy-screen: - fix docs warning i915: - non-x86 build fix - ttm error propogation fix - drrs on hsw/ivb disabled - BIOS readout fixes - missing stackdepot oops fix amd: - DCN 3.1 display fixes - GC 10.3.1 harvest fix - Page flip irq fix - hwmon label fix - DCN 2.0 display fix rockchip: - fix HDMI error cleanup - fix RK3399 VOP register fields vc4: - HDMI fixes - remove redundant code" * tag 'drm-fixes-2022-02-11' of git://anongit.freedesktop.org/drm/drm: (25 commits) drm/amdgpu/display: change pipe policy for DCN 2.0 drm/amd/pm: fix hwmon node of power1_label create issue drm/amd/display: keep eDP Vdd on when eDP stream is already enabled drm/amd/display: fix yellow carp wm clamping drm/amd/display: Cap pflip irqs per max otg number drm/amdgpu: add utcl2_harvest to gc 10.3.1 display/amd: decrease message verbosity about watermarks table failure drm/rockchip: vop: Correct RK3399 VOP register fields drm/rockchip: dw_hdmi: Do not leave clock enabled in error case MAINTAINERS: Add entry for fbdev core fbcon: Avoid 'cap' set but not used warning drm/privacy-screen: Fix sphinx warning drm/i915: Workaround broken BIOS DBUF configuration on TGL/RKL drm/i915: Populate pipe dbuf slices more accurately during readout drm/i915: Allow !join_mbus cases for adlp+ dbuf configuration drm/i915: Fix header test for !CONFIG_X86 drm/i915/ttm: Return some errors instead of trying memcpy move drm/i915: Disable DRRS on IVB/HSW port != A drm/i915: Fix oops due to missing stack depot drm/vc4: crtc: Fix redundant variable assignment ...
2022-02-11Merge tag 'trace-v5.17-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fixes to the RTLA tooling - A fix to a tp_printk overriding tp_printk_stop_on_boot on the command line * tag 'trace-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix tp_printk option related with tp_printk_stop_on_boot MAINTAINERS: Add RTLA entry rtla: Fix segmentation fault when failing to enable -t rtla/trace: Error message fixup rtla/utils: Fix session duration parsing rtla: Follow kernel version
2022-02-11KVM: SVM: fix race between interrupt delivery and AVIC inhibitionMaxim Levitsky
If svm_deliver_avic_intr is called just after the target vcpu's AVIC got inhibited, it might read a stale value of vcpu->arch.apicv_active which can lead to the target vCPU not noticing the interrupt. To fix this use load-acquire/store-release so that, if the target vCPU is IN_GUEST_MODE, we're guaranteed to see a previous disabling of the AVIC. If AVIC has been disabled in the meanwhile, proceed with the KVM_REQ_EVENT-based delivery. Incomplete IPI vmexit has the same races as svm_deliver_avic_intr, and in fact it can be handled in exactly the same way; the only difference lies in who has set IRR, whether svm_deliver_interrupt or the processor. Therefore, svm_complete_interrupt_delivery can be used to fix incomplete IPI vmexits as well. Co-developed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-11KVM: SVM: set IRR in svm_deliver_interruptPaolo Bonzini
SVM has to set IRR for both the AVIC and the software-LAPIC case, so pull it up to the common function that handles both configurations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-11KVM: SVM: extract avic_ring_doorbellMaxim Levitsky
The check on the current CPU adds an extra level of indentation to svm_deliver_avic_intr and conflates documentation on what happens if the vCPU exits (of interest to svm_deliver_avic_intr) and migrates (only of interest to avic_ring_doorbell, which calls get/put_cpu()). Extract the wrmsr to a separate function and rewrite the comment in svm_deliver_avic_intr(). Co-developed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-11selftests: kvm: Remove absent target fileMuhammad Usama Anjum
There is no vmx_pi_mmio_test file. Remove it to get rid of error while creation of selftest archive: rsync: [sender] link_stat "/kselftest/kvm/x86_64/vmx_pi_mmio_test" failed: No such file or directory (2) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3] Fixes: 6a58150859fd ("selftest: KVM: Add intra host migration tests") Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Message-Id: <20220210172352.1317554-1-usama.anjum@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-11Merge tag 'kvmarm-fixes-5.17-3' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.17, take #3 - Fix pending state read of a HW interrupt
2022-02-11gfs2: Fix gfs2_release for non-writers regressionBob Peterson
When a file is opened for writing, the vfs code (do_dentry_open) calls get_write_access for the inode, thus incrementing the inode's write count. That writer normally then creates a multi-block reservation for the inode (i_res) that can be re-used by other writers, which speeds up writes for applications that stupidly loop on open/write/close. When the writes are all done, the multi-block reservation should be deleted when the file is closed by the last "writer." Commit 0ec9b9ea4f83 broke that concept when it moved the call to gfs2_rs_delete before the check for FMODE_WRITE. Non-writers have no business removing the multi-block reservations of writers. In fact, if someone opens and closes the file for RO while a writer has a multi-block reservation, the RO closer will delete the reservation midway through the write, and this results in: kernel BUG at fs/gfs2/rgrp.c:677! (or thereabouts) which is: BUG_ON(rs->rs_requested); from function gfs2_rs_deltree. This patch moves the check back inside the check for FMODE_WRITE. Fixes: 0ec9b9ea4f83 ("gfs2: Check for active reservation in gfs2_release") Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-02-11Revert "gfs2: check context in gfs2_glock_put"Andreas Gruenbacher
It turns out that the might_sleep() call that commit 660a6126f8c3 adds is triggering occasional data corruption in testing. We're not sure about the root cause yet, but since this commit was added as a debugging aid only, revert it for now. This reverts commit 660a6126f8c3208f6df8d552039cda078a8426d1. Fixes: 660a6126f8c3 ("gfs2: check context in gfs2_glock_put") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-02-11Merge branch 'acpi-x86'Rafael J. Wysocki
Merge a revert of a problematic commit for 5.17-rc4. * acpi-x86: x86/PCI: revert "Ignore E820 reservations for bridge windows on newer systems"
2022-02-11Merge tag 'usb-serial-5.17-rc4' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for 5.17-rc4 Here are some new device ids for 5.17-rc4. All have been in linux-next with no reported issues. * tag 'usb-serial-5.17-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: cp210x: add CPI Bulk Coin Recycler id USB: serial: cp210x: add NCR Retail IO box id USB: serial: ftdi_sio: add support for Brainboxes US-159/235/320 USB: serial: option: add ZTE MF286D modem USB: serial: ch341: add support for GW Instek USB2.0-Serial devices
2022-02-11Merge tag 'wireless-2022-02-11' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless wireless fixes for v5.17 Second set of fixes for v5.17. This is the first pull request with both driver and stack patches. Most important here are a regression fix for brcmfmac USB devices and an iwlwifi fix for use after free when the firmware was missing. We have new maintainers for ath9k and wcn36xx as well as ath6kl is now orphaned. Also smaller fixes to iwlwifi and stack.
2022-02-11loop: revert "make autoclear operation asynchronous"Tetsuo Handa
The kernel test robot is reporting that xfstest which does umount ext2 on xfs umount xfs sequence started failing, for commit 322c4293ecc58110 ("loop: make autoclear operation asynchronous") removed a guarantee that fput() of backing file is processed before lo_release() from close() returns to user mode. And syzbot is reporting that deferring destroy_workqueue() from __loop_clr_fd() to a WQ context did not help [1]. Revert that commit. Link: https://syzkaller.appspot.com/bug?extid=831661966588c802aae9 [1] Reported-by: kernel test robot <oliver.sang@intel.com> Acked-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reported-by: syzbot <syzbot+831661966588c802aae9@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/20220211071554.3424-1-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-11bpf: Do not try bpf_msg_push_data with len 0Felix Maurer
If bpf_msg_push_data() is called with len 0 (as it happens during selftests/bpf/test_sockmap), we do not need to do anything and can return early. Calling bpf_msg_push_data() with len 0 previously lead to a wrong ENOMEM error: we later called get_order(copy + len); if len was 0, copy + len was also often 0 and get_order() returned some undefined value (at the moment 52). alloc_pages() caught that and failed, but then bpf_msg_push_data() returned ENOMEM. This was wrong because we are most probably not out of memory and actually do not need any additional memory. Fixes: 6fff607e2f14b ("bpf: sk_msg program helper bpf_msg_push_data") Signed-off-by: Felix Maurer <fmaurer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/df69012695c7094ccb1943ca02b4920db3537466.1644421921.git.fmaurer@redhat.com
2022-02-11Merge ra.kernel.org:/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Add selftest for nft_synproxy, from Florian Westphal. 2) xt_socket destroy path incorrectly disables IPv4 defrag for IPv6 traffic (typo), from Eric Dumazet. 3) Fix exit value selftest nft_concat_range.sh, from Hangbin Liu. 4) nft_synproxy disables the IPv4 hooks if the IPv6 hooks fail to be registered. 5) disable rp_filter on router in selftest nft_fib.sh, also from Hangbin Liu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hitEric Dumazet
trace_napi_poll_hit() is reading stat->dev while another thread can write on it from dropmon_net_event() Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already, we only have to take care of load/store tearing. BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1: dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579 notifier_call_chain kernel/notifier.c:84 [inline] raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392 call_netdevice_notifiers_info net/core/dev.c:1919 [inline] call_netdevice_notifiers_extack net/core/dev.c:1931 [inline] call_netdevice_notifiers net/core/dev.c:1945 [inline] unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415 ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123 vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515 ops_exit_list net/core/net_namespace.c:173 [inline] cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0: trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292 trace_napi_poll include/trace/events/napi.h:14 [inline] __napi_poll+0x36b/0x3f0 net/core/dev.c:6366 napi_poll net/core/dev.c:6432 [inline] net_rx_action+0x29e/0x650 net/core/dev.c:6519 __do_softirq+0x158/0x2de kernel/softirq.c:558 do_softirq+0xb1/0xf0 kernel/softirq.c:459 __local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline] _raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210 spin_unlock_bh include/linux/spinlock.h:394 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506 process_one_work+0x3f6/0x960 kernel/workqueue.c:2307 worker_thread+0x616/0xa70 kernel/workqueue.c:2454 kthread+0x1bf/0x1e0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 value changed: 0xffff88815883e000 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker Fixes: 4ea7e38696c7 ("dropmon: add ability to detect when hardware dropsrxpackets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11iio: buffer: Fix file related error handling in IIO_BUFFER_GET_FD_IOCTLMathias Krause
If we fail to copy the just created file descriptor to userland, we try to clean up by putting back 'fd' and freeing 'ib'. The code uses put_unused_fd() for the former which is wrong, as the file descriptor was already published by fd_install() which gets called internally by anon_inode_getfd(). This makes the error handling code leaving a half cleaned up file descriptor table around and a partially destructed 'file' object, allowing userland to play use-after-free tricks on us, by abusing the still usable fd and making the code operate on a dangling 'file->private_data' pointer. Instead of leaving the kernel in a partially corrupted state, don't attempt to explicitly clean up and leave this to the process exit path that'll release any still valid fds, including the one created by the previous call to anon_inode_getfd(). Simply return -EFAULT to indicate the error. Fixes: f73f7f4da581 ("iio: buffer: add ioctl() to support opening extra buffers for IIO device") Cc: stable@kernel.org Cc: Jonathan Cameron <jic23@kernel.org> Cc: Alexandru Ardelean <ardeleanalex@gmail.com> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Nuno Sa <Nuno.Sa@analog.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-11net/smc: Avoid overwriting the copies of clcsock callback functionsWen Gu
The callback functions of clcsock will be saved and replaced during the fallback. But if the fallback happens more than once, then the copies of these callback functions will be overwritten incorrectly, resulting in a loop call issue: clcsk->sk_error_report |- smc_fback_error_report() <------------------------------| |- smc_fback_forward_wakeup() | (loop) |- clcsock_callback() (incorrectly overwritten) | |- smc->clcsk_error_report() ------------------| So this patch fixes the issue by saving these function pointers only once in the fallback and avoiding overwriting. Reported-by: syzbot+4de3c0e8a263e1e499bc@syzkaller.appspotmail.com Fixes: 341adeec9ada ("net/smc: Forward wakeup to smc socket waitqueue after fallback") Link: https://lore.kernel.org/r/0000000000006d045e05d78776f6@google.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11KVM: arm64: vgic: Read HW interrupt pending state from the HWMarc Zyngier
It appears that a read access to GIC[DR]_I[CS]PENDRn doesn't always result in the pending interrupts being accurately reported if they are mapped to a HW interrupt. This is particularily visible when acking the timer interrupt and reading the GICR_ISPENDR1 register immediately after, for example (the interrupt appears as not-pending while it really is...). This is because a HW interrupt has its 'active and pending state' kept in the *physical* distributor, and not in the virtual one, as mandated by the spec (this is what allows the direct deactivation). The virtual distributor only caries the pending and active *states* (note the plural, as these are two independent and non-overlapping states). Fix it by reading the HW state back, either from the timer itself or from the distributor if necessary. Reported-by: Ricardo Koller <ricarkol@google.com> Tested-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220208123726.3604198-1-maz@kernel.org
2022-02-11usb: dwc2: drd: fix soft connect when gadget is unconfiguredFabrice Gasnier
When the gadget driver hasn't been (yet) configured, and the cable is connected to a HOST, the SFTDISCON gets cleared unconditionally, so the HOST tries to enumerate it. At the host side, this can result in a stuck USB port or worse. When getting lucky, some dmesg can be observed at the host side: new high-speed USB device number ... device descriptor read/64, error -110 Fix it in drd, by checking the enabled flag before calling dwc2_hsotg_core_connect(). It will be called later, once configured, by the normal flow: - udc_bind_to_driver - usb_gadget_connect - dwc2_hsotg_pullup - dwc2_hsotg_core_connect Fixes: 17f934024e84 ("usb: dwc2: override PHY input signals with usb role switch support") Cc: stable@kernel.org Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Link: https://lore.kernel.org/r/1644423353-17859-1-git-send-email-fabrice.gasnier@foss.st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-11usb: gadget: rndis: check size of RNDIS_MSG_SET commandGreg Kroah-Hartman
Check the size of the RNDIS_MSG_SET command given to us before attempting to respond to an invalid message size. Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com> Cc: stable@kernel.org Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>