summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-09-26cxgb4: fix missing unlock on ETHOFLD desc collect fail pathRafael Mendonca
The label passed to the QDESC_GET for the ETHOFLD TXQ, RXQ, and FLQ, is the 'out' one, which skips the 'out_unlock' label, and thus doesn't unlock the 'uld_mutex' before returning. Additionally, since commit 5148e5950c67 ("cxgb4: add EOTID tracking and software context dump"), the access to these ETHOFLD hardware queues should be protected by the 'mqprio_mutex' instead. Fixes: 2d0cb84dd973 ("cxgb4: add ETHOFLD hardware queue support") Fixes: 5148e5950c67 ("cxgb4: add EOTID tracking and software context dump") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Link: https://lore.kernel.org/r/20220922175109.764898-1-rafaelmendsr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26Merge tag 'ext4_for_linus_fixes2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull missed ext4 fix from Ted Ts'o: "Fix an potential unitialzied variable bug; this was a fixup that I had forgotten to apply before the last pull request for ext4. My bad" * tag 'ext4_for_linus_fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fixup possible uninitialized variable access in ext4_mb_choose_next_group_cr1()
2022-09-26net: sched: act_ct: fix possible refcount leak in tcf_ct_init()Hangyu Hua
nf_ct_put need to be called to put the refcount got by tcf_ct_fill_params to avoid possible refcount leak when tcf_ct_flow_table_get fails. Fixes: c34b961a2492 ("net/sched: act_ct: Create nf flow table per zone") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lore.kernel.org/r/20220923020046.8021-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26x86/uaccess: avoid check_object_size() in copy_from_user_nmi()Kees Cook
The check_object_size() helper under CONFIG_HARDENED_USERCOPY is designed to skip any checks where the length is known at compile time as a reasonable heuristic to avoid "likely known-good" cases. However, it can only do this when the copy_*_user() helpers are, themselves, inline too. Using find_vmap_area() requires taking a spinlock. The check_object_size() helper can call find_vmap_area() when the destination is in vmap memory. If show_regs() is called in interrupt context, it will attempt a call to copy_from_user_nmi(), which may call check_object_size() and then find_vmap_area(). If something in normal context happens to be in the middle of calling find_vmap_area() (with the spinlock held), the interrupt handler will hang forever. The copy_from_user_nmi() call is actually being called with a fixed-size length, so check_object_size() should never have been called in the first place. Given the narrow constraints, just replace the __copy_from_user_inatomic() call with an open-coded version that calls only into the sanitizers and not check_object_size(), followed by a call to raw_copy_from_user(). [akpm@linux-foundation.org: no instrument_copy_from_user() in my tree...] Link: https://lkml.kernel.org/r/20220919201648.2250764-1-keescook@chromium.org Link: https://lore.kernel.org/all/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b3EQ@mail.gmail.com Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Yu Zhao <yuzhao@google.com> Reported-by: Florian Lehner <dev@der-flo.net> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Florian Lehner <dev@der-flo.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm/page_isolation: fix isolate_single_pageblock() isolation behaviorZi Yan
set_migratetype_isolate() does not allow isolating MIGRATE_CMA pageblocks unless it is used for CMA allocation. isolate_single_pageblock() did not have the same behavior when it is used together with set_migratetype_isolate() in start_isolate_page_range(). This allows alloc_contig_range() with migratetype other than MIGRATE_CMA, like MIGRATE_MOVABLE (used by alloc_contig_pages()), to isolate first and last pageblock but fail the rest. The failure leads to changing migratetype of the first and last pageblock to MIGRATE_MOVABLE from MIGRATE_CMA, corrupting the CMA region. This can happen during gigantic page allocations. Like Doug said here: https://lore.kernel.org/linux-mm/a3363a52-883b-dcd1-b77f-f2bb378d6f2d@gmail.com/T/#u, for gigantic page allocations, the user would notice no difference, since the allocation on CMA region will fail as well as it did before. But it might hurt the performance of device drivers that use CMA, since CMA region size decreases. Fix it by passing migratetype into isolate_single_pageblock(), so that set_migratetype_isolate() used by isolate_single_pageblock() will prevent the isolation happening. Link: https://lkml.kernel.org/r/20220914023913.1855924-1-zi.yan@sent.com Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: Doug Berger <opendmb@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm,hwpoison: check mm when killing accessing processShuai Xue
The GHES code calls memory_failure_queue() from IRQ context to queue work into workqueue and schedule it on the current CPU. Then the work is processed in memory_failure_work_func() by kworker and calls memory_failure(). When a page is already poisoned, commit a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") make memory_failure() call kill_accessing_process() that: - holds mmap locking of current->mm - does pagetable walk to find the error virtual address - and sends SIGBUS to the current process with error info. However, the mm of kworker is not valid, resulting in a null-pointer dereference. So check mm when killing the accessing process. [akpm@linux-foundation.org: remove unrelated whitespace alteration] Link: https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@linux.alibaba.com Fixes: a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address") Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Bixuan Cui <cuibixuan@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm/hugetlb: correct demote page offset logicDoug Berger
With gigantic pages it may not be true that struct page structures are contiguous across the entire gigantic page. The nth_page macro is used here in place of direct pointer arithmetic to correct for this. Mike said: : This error could cause addressing exceptions. However, this is only : possible in configurations where CONFIG_SPARSEMEM && : !CONFIG_SPARSEMEM_VMEMMAP. Such a configuration option is rare and : unknown to be the default anywhere. Link: https://lkml.kernel.org/r/20220914190917.3517663-1-opendmb@gmail.com Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support") Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm: prevent page_frag_alloc() from corrupting the memoryMaurizio Lombardi
A number of drivers call page_frag_alloc() with a fragment's size > PAGE_SIZE. In low memory conditions, __page_frag_cache_refill() may fail the order 3 cache allocation and fall back to order 0; In this case, the cache will be smaller than the fragment, causing memory corruptions. Prevent this from happening by checking if the newly allocated cache is large enough for the fragment; if not, the allocation will fail and page_frag_alloc() will return NULL. Link: https://lkml.kernel.org/r/20220715125013.247085-1-mlombard@redhat.com Fixes: b63ae8ca096d ("mm/net: Rename and move page fragment handling from net/ to mm/") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Cc: Chen Lin <chen45464546@163.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm: bring back update_mmu_cache() to finish_fault()Sergei Antonov
Running this test program on ARMv4 a few times (sometimes just once) reproduces the bug. int main() { unsigned i; char paragon[SIZE]; void* ptr; memset(paragon, 0xAA, SIZE); ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED, -1, 0); if (ptr == MAP_FAILED) return 1; printf("ptr = %p\n", ptr); for (i=0;i<10000;i++){ memset(ptr, 0xAA, SIZE); if (memcmp(ptr, paragon, SIZE)) { printf("Unexpected bytes on iteration %u!!!\n", i); break; } } munmap(ptr, SIZE); } In the "ptr" buffer there appear runs of zero bytes which are aligned by 16 and their lengths are multiple of 16. Linux v5.11 does not have the bug, "git bisect" finds the first bad commit: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Before the commit update_mmu_cache() was called during a call to filemap_map_pages() as well as finish_fault(). After the commit finish_fault() lacks it. Bring back update_mmu_cache() to finish_fault() to fix the bug. Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more closely reproduce the code of alloc_set_pte() function that existed before the commit. On many platforms update_mmu_cache() is nop: x86, see arch/x86/include/asm/pgtable ARMv6+, see arch/arm/include/asm/tlbflush.h So, it seems, few users ran into this bug. Link: https://lkml.kernel.org/r/20220908204809.2012451-1-saproj@gmail.com Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Sergei Antonov <saproj@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26frontswap: don't call ->init if no ops are registeredChristoph Hellwig
If no frontswap module (i.e. zswap) was registered, frontswap_ops will be NULL. In such situation, swapon crashes with the following stack trace: Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=00000020a4fab000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 96000004 [#1] SMP Modules linked in: zram fsl_dpaa2_eth pcs_lynx phylink ahci_qoriq crct10dif_ce ghash_ce sbsa_gwdt fsl_mc_dpio nvme lm90 nvme_core at803x xhci_plat_hcd rtc_fsl_ftm_alarm xgmac_mdio ahci_platform i2c_imx ip6_tables ip_tables fuse Unloaded tainted modules: cppc_cpufreq():1 CPU: 10 PID: 761 Comm: swapon Not tainted 6.0.0-rc2-00454-g22100432cf14 #1 Hardware name: SolidRun Ltd. SolidRun CEX7 Platform, BIOS EDK II Jun 21 2022 pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : frontswap_init+0x38/0x60 lr : __do_sys_swapon+0x8a8/0x9f4 sp : ffff80000969bcf0 x29: ffff80000969bcf0 x28: ffff37bee0d8fc00 x27: ffff80000a7f5000 x26: fffffcdefb971e80 x25: ffffaba797453b90 x24: 0000000000000064 x23: ffff37c1f209d1a8 x22: ffff37bee880e000 x21: ffffaba797748560 x20: ffff37bee0d8fce4 x19: ffffaba797748488 x18: 0000000000000014 x17: 0000000030ec029a x16: ffffaba795a479b0 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000001 x11: ffff37c63c0aba18 x10: 0000000000000000 x9 : ffffaba7956b8c88 x8 : ffff80000969bcd0 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffaba79730f000 x2 : ffff37bee0d8fc00 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: frontswap_init+0x38/0x60 __do_sys_swapon+0x8a8/0x9f4 __arm64_sys_swapon+0x28/0x3c invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0xd4/0xf4 do_el0_svc+0x38/0x4c el0_svc+0x34/0x10c el0t_64_sync_handler+0x11c/0x150 el0t_64_sync+0x190/0x194 Code: d000e283 910003fd f9006c41 f946d461 (f9400021) ---[ end trace 0000000000000000 ]--- Link: https://lkml.kernel.org/r/20220909130829.3262926-1-hch@lst.de Fixes: 1da0d94a3ec8 ("frontswap: remove support for multiple ops") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all()Naoya Horiguchi
NULL pointer dereference is triggered when calling thp split via debugfs on the system with offlined memory blocks. With debug option enabled, the following kernel messages are printed out: page:00000000467f4890 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x121c000 flags: 0x17fffc00000000(node=0|zone=2|lastcpupid=0x1ffff) raw: 0017fffc00000000 0000000000000000 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: unmovable page page:000000007d7ab72e is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1248! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 16 PID: 20964 Comm: bash Tainted: G I 6.0.0-rc3-foll-numa+ #41 ... RIP: 0010:split_huge_pages_write+0xcf4/0xe30 This shows that page_to_nid() in page_zone() is unexpectedly called for an offlined memmap. Use pfn_to_online_page() to get struct page in PFN walker. Link: https://lkml.kernel.org/r/20220908041150.3430269-1-naoya.horiguchi@linux.dev Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> [5.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm: fix madivse_pageout mishandling on non-LRU pageMinchan Kim
MADV_PAGEOUT tries to isolate non-LRU pages and gets a warning from isolate_lru_page below. Fix it by checking PageLRU in advance. ------------[ cut here ]------------ trying to isolate tail page WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140 Modules linked in: CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:isolate_lru_page+0x130/0x140 Link: https://lore.kernel.org/linux-mm/485f8c33.2471b.182d5726afb.Coremail.hantianshuo@iie.ac.cn/ Link: https://lkml.kernel.org/r/20220908151204.762596-1-minchan@kernel.org Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT") Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: 韩天ç`• <hantianshuo@iie.ac.cn> Suggested-by: Yang Shi <shy828301@gmail.com> Acked-by: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flushYang Shi
The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will move to use RCU instead of disabling local interrupts in fast-GUP. Using an IPI is the old-styled way of serializing against fast-GUP although it still works as expected now. And fast-GUP now fixed the potential race with THP collapse by checking whether PMD is changed or not. So IPI broadcast in radix pmd collapse flush is not necessary anymore. But it is still needed for hash TLB. Link: https://lkml.kernel.org/r/20220907180144.555485-2-shy828301@gmail.com Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Yang Shi <shy828301@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26mm: gup: fix the fast GUP race against THP collapseYang Shi
Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer sufficient to handle concurrent GUP-fast in all cases, it only handles traditional IPI-based GUP-fast correctly. On architectures that send an IPI broadcast on TLB flush, it works as expected. But on the architectures that do not use IPI to broadcast TLB flush, it may have the below race: CPU A CPU B THP collapse fast GUP gup_pmd_range() <-- see valid pmd gup_pte_range() <-- work on pte pmdp_collapse_flush() <-- clear pmd and flush __collapse_huge_page_isolate() check page pinned <-- before GUP bump refcount pin the page check PTE <-- no change __collapse_huge_page_copy() copy data to huge page ptep_clear() install huge pmd for the huge page return the stale page discard the stale page The race can be fixed by checking whether PMD is changed or not after taking the page pin in fast GUP, just like what it does for PTE. If the PMD is changed it means there may be parallel THP collapse, so GUP should back off. Also update the stale comment about serializing against fast GUP in khugepaged. Link: https://lkml.kernel.org/r/20220907180144.555485-1-shy828301@gmail.com Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()") Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Yang Shi <shy828301@gmail.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26usbnet: Fix memory leak in usbnet_disconnect()Peilin Ye
Currently usbnet_disconnect() unanchors and frees all deferred URBs using usb_scuttle_anchored_urbs(), which does not free urb->context, causing a memory leak as reported by syzbot. Use a usb_get_from_anchor() while loop instead, similar to what we did in commit 19cfe912c37b ("Bluetooth: btusb: Fix memory leak in play_deferred"). Also free urb->sg. Reported-and-tested-by: syzbot+dcd3e13cf4472f2e0ba1@syzkaller.appspotmail.com Fixes: 69ee472f2706 ("usbnet & cdc-ether: Autosuspend for online devices") Fixes: 638c5115a794 ("USBNET: support DMA SG") Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Link: https://lore.kernel.org/r/20220923042551.2745-1-yepeilin.cs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26io_uring: register single issuer task at creationDylan Yudaken
Instead of picking the task from the first submitter task, rather use the creator task or in the case of disabled (IORING_SETUP_R_DISABLED) the enabling task. This approach allows a lot of simplification of the logic here. This removes init logic from the submission path, which can always be a bit confusing, but also removes the need for locking to write (or read) the submitter_task. Users that want to move a ring before submitting can create the ring disabled and then enable it on the submitting task. Signed-off-by: Dylan Yudaken <dylany@fb.com> Fixes: 97bbdc06a444 ("io_uring: add IORING_SETUP_SINGLE_ISSUER") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-26ext4: fixup possible uninitialized variable access in ↵Jan Kara
ext4_mb_choose_next_group_cr1() Variable 'grp' may be left uninitialized if there's no group with suitable average fragment size (or larger). Fix the problem by initializing it earlier. Link: https://lore.kernel.org/r/20220922091542.pkhedytey7wzp5fi@quack3 Fixes: 83e80a6e3543 ("ext4: use buckets for cr 1 block scan instead of rbtree") Cc: stable@kernel.org Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-26Revert "net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()"Sasha Levin
This reverts commit fe2c9c61f668cde28dac2b188028c5299cedcc1e. On Tue, Sep 13, 2022 at 05:48:58PM +0100, Russell King (Oracle) wrote: >What happens if this is built as a module, and the module is loaded, >binds (and creates the directory), then is removed, and then re- >inserted? Nothing removes the old directory, so doesn't >debugfs_create_dir() fail, resulting in subsequent failure to add >any subsequent debugfs entries? > >I don't think this patch should be backported to stable trees until >this point is addressed. Revert until a proper fix is available as the original behavior was better. Cc: Marcin Wojtas <mw@semihalf.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: stable@kernel.org Reported-by: Russell King <linux@armlinux.org.uk> Fixes: fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()") Signed-off-by: Sasha Levin <sashal@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220923234736.657413-1-sashal@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26drm/i915/gt: Restrict forced preemption to the active contextChris Wilson
When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts. This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset. If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context. Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com (cherry picked from commit 107ba1a2c705f4358f2602ec2f2fd821bb651f42) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-09-26perf tests powerpc: Fix branch stack sampling test to include sanity check ↵Athira Rajeev
for branch filter Commit b55878c90ab92a24 ("perf test: Add test for branch stack sampling") added test for branch stack sampling. There is a sanity check in the beginning to skip the test if the hardware doesn't support branch stack sampling. Snippet <<>> skip the test if the hardware doesn't support branch stack sampling perf record -b -o- -B true > /dev/null 2>&1 || exit 2 <<>> But the testcase also uses branch sample types: save_type, any. if any platform doesn't support the branch filters used in the test, the testcase will fail. In powerpc, currently mutliple branch filters are not supported and hence this test fails in powerpc. Fix the sanity check to look at the support for branch filters used in this test before proceeding with the test. Fixes: b55878c90ab92a24 ("perf test: Add test for branch stack sampling") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20220921145255.20972-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-26perf parse-events: Remove "not supported" hybrid cache eventsZhengjun Xing
By default, we create two hybrid cache events, one is for cpu_core, and another is for cpu_atom. But Some hybrid hardware cache events are only available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only available on cpu_core, while the 'L1-icache-loads' is only available on cpu_atom. We need to remove "not supported" hybrid cache events. By extending is_event_supported() to global API and using it to check if the hybrid cache events are supported before being created, we can remove the "not supported" hybrid cache events. Before: # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1 Performance counter stats for 'system wide': 52,570 cpu_core/L1-dcache-load-misses/ <not supported> cpu_atom/L1-dcache-load-misses/ <not supported> cpu_core/L1-icache-loads/ 1,471,817 cpu_atom/L1-icache-loads/ 1.004915229 seconds time elapsed After: # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1 Performance counter stats for 'system wide': 54,510 cpu_core/L1-dcache-load-misses/ 1,441,286 cpu_atom/L1-icache-loads/ 1.005114281 seconds time elapsed Fixes: 30def61f64bac5f5 ("perf parse-events: Create two hybrid cache events") Reported-by: Yi Ammy <ammy.yi@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220923030013.3726410-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-26perf print-events: Fix "perf list" can not display the PMU prefix for some ↵Zhengjun Xing
hybrid cache events Some hybrid hardware cache events are only available on one CPU PMU. For example, 'L1-dcache-load-misses' is only available on cpu_core. We have supported in the perf list clearly reporting this info, the function works fine before but recently the argument "config" in API is_event_supported() is changed from "u64" to "unsigned int" which caused a regression, the "perf list" then can not display the PMU prefix for some hybrid cache events. For the hybrid systems, the PMU type ID is stored at config[63:32], define config to "unsigned int" will miss the PMU type ID information, then the regression happened, the config should be defined as "u64". Before: # ./perf list |grep "Hardware cache event" L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] node-load-misses [Hardware cache event] node-loads [Hardware cache event] After: # ./perf list |grep "Hardware cache event" L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] cpu_atom/L1-icache-loads/ [Hardware cache event] cpu_core/L1-dcache-load-misses/ [Hardware cache event] cpu_core/node-load-misses/ [Hardware cache event] cpu_core/node-loads/ [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] Fixes: 9b7c7728f4e4ba8d ("perf parse-events: Break out tracepoint and printing") Reported-by: Yi Ammy <ammy.yi@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220923030013.3726410-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-26perf tools: Get a perf cgroup more portably in BPFNamhyung Kim
The perf_event_cgrp_id can be different on other configurations. To be more portable as CO-RE, it needs to get the cgroup subsys id using the bpf_core_enum_value() helper. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20220923063205.772936-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-26gpio: mvebu: Fix check for pwm support on non-A8K platformsPali Rohár
pwm support incompatible with Armada 80x0/70x0 API is not only in Armada 370, but also in Armada XP, 38x and 39x. So basically every non-A8K platform. Fix check for pwm support appropriately. Fixes: 85b7d8abfec7 ("gpio: mvebu: add pwm support for Armada 8K/7K") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-09-25Linux 6.0-rc7v6.0-rc7Linus Torvalds
2022-09-25Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Regression and bug fixes: - Performance regression fix from 5.18 on a Rasberry Pi - Fix extent parsing bug which triggers a BUG_ON when a (corrupted) extent tree has has a non-root node when zero entries. - Fix a livelock where in the right (wrong) circumstances a large number of nfsd threads can try to write to a nearly full file system, and retry for hours(!)" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: limit the number of retries after discarding preallocations blocks ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0 ext4: use buckets for cr 1 block scan instead of rbtree ext4: use locality group preallocation for small closed files ext4: make directory inode spreading reflect flexbg size ext4: avoid unnecessary spreading of allocations among groups ext4: make mballoc try target group first even with mb_optimize_scan
2022-09-25Merge tag 'dax-and-nvdimm-fixes-v6.0-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull NVDIMM and DAX fixes from Dan Williams: "A recently discovered one-line fix for devdax that further addresses a v5.5 regression, and (a bit embarrassing) a small batch of fixes that have been sitting in my fixes tree for weeks. The older fixes have soaked in linux-next during that time and address an fsdax infinite loop and some other minor fixups. - Fix a infinite loop bug in fsdax - Fix memory-type detection for devdax (EINJ regression) - Small cleanups" * tag 'dax-and-nvdimm-fixes-v6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: devdax: Fix soft-reservation memory description fsdax: Fix infinite loop in dax_iomap_rw() nvdimm/namespace: drop nested variable in create_namespace_pmem() ndtest: Cleanup all of blk namespace specific code pmem: fix a name collision
2022-09-25Merge tag 'i2c-for-6.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C driver bugfixes for mlxbf and imx, a few documentation fixes after the rework this cycle, and one hardening for the i2c-mux core" * tag 'i2c-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mux: harden i2c_mux_alloc() against integer overflows i2c: mlxbf: Fix frequency calculation i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction() i2c: mlxbf: incorrect base address passed during io write Documentation: i2c: fix references to other documents MAINTAINERS: remove Nehal Shah from AMD MP2 I2C DRIVER i2c: imx: If pm_runtime_get_sync() returned 1 device access is possible
2022-09-24Input: synaptics - disable Intertouch for Lenovo T14 and P14s AMD G1Mark Pearson
Since intertouch was enabled for the T14 and P14s AMD G1 laptops there have been a number of reports of touchpads not working well. Debugging this with Synaptics they noted that intertouch should not be enabled as SMBUS host notify is not available on these laptops. Reverting the previous commit (e4ce4d3a939d97bea045eafa13ad1195695f91ce) to restore functionality back to what it was. Note - we are working with Synaptics to see if there is a better solution, but nothing is confirmed as yet. Signed-off-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/20220920193936.8709-1-markpearson@lenovo.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-24Input: iqs62x-keys - drop unused device node referencesJeff LaBundy
Each call to device/fwnode_get_named_child_node() must be matched with a call to fwnode_handle_put() once the corresponding node is no longer in use. This ensures a reference count remains balanced in the case of dynamic device tree support. Currently, the driver never calls fwnode_handle_put(). This patch adds the missing calls. Fixes: ce1cb0eec85b ("input: keyboard: Add support for Azoteq IQS620A/621/622/624/625") Signed-off-by: Jeff LaBundy <jeff@labundy.com> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> Link: https://lore.kernel.org/r/YyYbYvlkq5cy55dc@nixie71 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-24Input: melfas_mip4 - fix return value check in mip4_probe()Yang Yingliang
devm_gpiod_get_optional() may return ERR_PTR(-EPROBE_DEFER), add a minus sign to fix it. Fixes: 6ccb1d8f78bd ("Input: add MELFAS MIP4 Touchscreen driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220924030715.1653538-1-yangyingliang@huawei.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-24Merge branch 'for-6.0/dax' into libnvdimm-fixesDan Williams
Pick up another "Soft Reservation" fix for v6.0-final on top of some straggling nvdimm fixes that missed v5.19.
2022-09-24devdax: Fix soft-reservation memory descriptionDan Williams
The "hmem" platform-devices that are created to represent the platform-advertised "Soft Reserved" memory ranges end up inserting a resource that causes the iomem_resource tree to look like this: 340000000-43fffffff : hmem.0 340000000-43fffffff : Soft Reserved 340000000-43fffffff : dax0.0 This is because insert_resource() reparents ranges when they completely intersect an existing range. This matters because code that uses region_intersects() to scan for a given IORES_DESC will only check that top-level 'hmem.0' resource and not the 'Soft Reserved' descendant. So, to support EINJ (via einj_error_inject()) to inject errors into memory hosted by a dax-device, be sure to describe the memory as IORES_DESC_SOFT_RESERVED. This is a follow-on to: commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP") ...that fixed EINJ support for "Soft Reserved" ranges in the first instance. Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration") Reported-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Tested-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Cc: <stable@vger.kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Omar Avelar <omar.avelar@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mark Gross <markgross@kernel.org> Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-09-24Merge tag 'kbuild-fixes-v6.0-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix build error for the combination of SYSTEM_TRUSTED_KEYRING=y and X509_CERTIFICATE_PARSER=m - Fix DEBUG_INFO_SPLIT to generate debug info for GCC 11+ and Clang 12+ - Revive debug info for assembly files - Remove unused code * tag 'kbuild-fixes-v6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: Makefile.debug: re-enable debug info for .S files Makefile.debug: set -g unconditional on CONFIG_DEBUG_INFO_SPLIT certs: make system keyring depend on built-in x509 parser Kconfig: remove unused function 'menu_get_root_menu' scripts/clang-tools: remove unused module
2022-09-24Merge tag 's390-6.0-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fix from Vasily Gorbik: - Fix potential hangs in VFIO AP driver * tag 's390-6.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vfio-ap: bypass unnecessary processing of AP resources
2022-09-24Merge tag 'pm-6.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix an uninitialized variable usage in the operating performance points code and add missing DT bindings for it. Specifics: - Fix uninitialized variable usage in dev_pm_opp_config_clks_simple() (Christophe JAILLET) - Add missing OPP DT properties (Rob Herring)" * tag 'pm-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: dt-bindings: opp: Add missing (unevaluated|additional)Properties on child nodes OPP: Fix an un-initialized variable usage
2022-09-24Merge tag 'char-misc-6.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are three tiny driver fixes for 6.0-rc7. They include: - phy driver reset bugfix - fpga memleak bugfix - counter irq config bugfix The first two have been in linux-next for a while, the last one has only been added to my tree in the past few days, but was in linux-next under a different commit id. I couldn't pull directly from the counter tree due to some gpg key propagation issue, so I took the commit directly from email instead" * tag 'char-misc-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: counter: 104-quad-8: Fix skipped IRQ lines during events configuration fpga: m10bmc-sec: Fix possible memory leak of flash_buf phy: marvell: phy-mvebu-a3700-comphy: Remove broken reset support
2022-09-24Merge tag 'tty-6.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small, and late, serial driver fixes for 6.0-rc7 to resolve some reported problems. Included in here are: - tegra icount accounting fixes, including a framework function that other drivers will be converted over to using in 6.1-rc1. - fsl_lpuart reset bugfix - 8250 omap 485 bugfix - sifive serial clock bugfix The last three patches have not shown up in linux-next due to them being added to my tree only 2 days ago, but they are tiny and self-contained and the developers say they resolve issues that they have with 6.0-rc. The other three have been in linux-next for a while with no reported issues" * tag 'tty-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: sifive: enable clocks for UART when probed serial: 8250: omap: Use serial8250_em485_supported serial: fsl_lpuart: Reset prior to registration serial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx accounting serial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting serial: Create uart_xmit_advance()
2022-09-24Merge tag 'cgroup-for-6.0-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Add Waiman Long as a cpuset maintainer - cgroup_get_from_id() could be fed a kernfs ID which doesn't point to a cgroup directory but a knob file and then crash. Error out if the lookup kernfs_node isn't a directory. * tag 'cgroup-for-6.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: cgroup_get_from_id() must check the looked-up kn is a directory cpuset: Add Waiman Long as a cpuset maintainer
2022-09-24Merge tag 'wq-for-6.0-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: "Just one patch to improve flush lockdep coverage" * tag 'wq-for-6.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: don't skip lockdep work dependency in cancel_work_sync()
2022-09-24Merge tag 'io_uring-6.0-2022-09-23' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "Just a single fix for an issue with un-reaped IOPOLL requests on ring exit" * tag 'io_uring-6.0-2022-09-23' of git://git.kernel.dk/linux: io_uring: ensure that cached task references are always put on exit
2022-09-24Merge tag 'block-6.0-2022-09-22' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Fix a regression that's been plaguing us by reverting the offending commit, as attempts to both reproduce the issue and fix it in a saner fashion have failed. Fix for a potential oops condition in the s390 dasd block driver" * tag 'block-6.0-2022-09-22' of git://git.kernel.dk/linux: Revert "block: freeze the queue earlier in del_gendisk" s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup
2022-09-23sfc: correct filter_table_remove method for EF10 PFsAndy Moreton
A previous patch added a wrapper function to take a lock around efx_mcdi_filter_table_remove(), but only changed EF10 VFs' method table to call it. Change it in the PF method table too. Fixes: 77eb40749d73 ("sfc: move table locking into filter_table_{probe,remove} methods") Signed-off-by: Andy Moreton <andy.moreton@amd.com> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220922211218.814-1-ecree@xilinx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-24Makefile.debug: re-enable debug info for .S filesNick Desaulniers
Alexey reported that the fraction of unknown filename instances in kallsyms grew from ~0.3% to ~10% recently; Bill and Greg tracked it down to assembler defined symbols, which regressed as a result of: commit b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") In that commit, I allude to restoring debug info for assembler defined symbols in a follow up patch, but it seems I forgot to do so in commit a66049e2cf0e ("Kbuild: make DWARF version a choice") Link: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=31bf18645d98b4d3d7357353be840e320649a67d Fixes: b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") Reported-by: Alexey Alexandrov <aalexand@google.com> Reported-by: Bill Wendling <morbo@google.com> Reported-by: Greg Thelen <gthelen@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-24Makefile.debug: set -g unconditional on CONFIG_DEBUG_INFO_SPLITNick Desaulniers
Dmitrii, Fangrui, and Mashahiro note: Before GCC 11 and Clang 12 -gsplit-dwarf implicitly uses -g2. Fix CONFIG_DEBUG_INFO_SPLIT for gcc-11+ & clang-12+ which now need -g specified in order for -gsplit-dwarf to work at all. -gsplit-dwarf has been mutually exclusive with -g since support for CONFIG_DEBUG_INFO_SPLIT was introduced in commit 866ced950bcd ("kbuild: Support split debug info v4") I don't think it ever needed to be. Link: https://lore.kernel.org/lkml/20220815013317.26121-1-dmitrii.bundin.a@gmail.com/ Link: https://lore.kernel.org/lkml/CAK7LNARPAmsJD5XKAw7m_X2g7Fi-CAAsWDQiP7+ANBjkg7R7ng@mail.gmail.com/ Link: https://reviews.llvm.org/D80391 Cc: Andi Kleen <ak@linux.intel.com> Reported-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com> Reported-by: Fangrui Song <maskray@google.com> Reported-by: Masahiro Yamada <masahiroy@kernel.org> Suggested-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-23io_uring: ensure that cached task references are always put on exitJens Axboe
io_uring caches task references to avoid doing atomics for each of them per request. If a request is put from the same task that allocated it, then we can maintain a per-ctx cache of them. This obviously relies on io_uring always pruning caches in a reliable way, and there's currently a case off io_uring fd release where we can miss that. One example is a ring setup with IOPOLL, which relies on the task polling for completions, which will free them. However, if such a task submits a request and then exits or closes the ring without reaping the completion, then ring release will reap and put. If release happens from that very same task, the completed request task refs will get put back into the cache pool. This is problematic, as we're now beyond the point of pruning caches. Manually drop these caches after doing an IOPOLL reap. This releases references from the current task, which is enough. If another task happens to be doing the release, then the caching will not be triggered and there's no issue. Cc: stable@vger.kernel.org Fixes: e98e49b2bbf7 ("io_uring: extend task put optimisations") Reported-by: Homin Rhee <hominlab@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-23Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "These are all very simple and self-contained, although the CFI jump-table fix touches the generic linker script as that's where the problematic macro lives. - Fix false positive "sleeping while atomic" warning resulting from the kPTI rework taking a mutex too early. - Fix possible overflow in AMU frequency calculation - Fix incorrect shift in CMN PMU driver which causes problems with newer versions of the IP - Reduce alignment of the CFI jump table to avoid huge kernel images and link errors with !4KiB page size configurations" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: vmlinux.lds.h: CFI: Reduce alignment of jump-table to function alignment perf/arm-cmn: Add more bits to child node address offset field arm64: topology: fix possible overflow in amu_fie_setup() arm64: mm: don't acquire mutex when rewriting swapper
2022-09-23ACPI: processor idle: Practically limit "Dummy wait" workaround to old Intel ↵Dave Hansen
systems Old, circa 2002 chipsets have a bug: they don't go idle when they are supposed to. So, a workaround was added to slow the CPU down and ensure that the CPU waits a bit for the chipset to actually go idle. This workaround is ancient and has been in place in some form since the original kernel ACPI implementation. But, this workaround is very painful on modern systems. The "inl()" can take thousands of cycles (see Link: for some more detailed numbers and some fun kernel archaeology). First and foremost, modern systems should not be using this code. Typical Intel systems have not used it in over a decade because it is horribly inferior to MWAIT-based idle. Despite this, people do seem to be tripping over this workaround on AMD system today. Limit the "dummy wait" workaround to Intel systems. Keep Modern AMD systems from tripping over the workaround. Remotely modern Intel systems use intel_idle instead of this code and will, in practice, remain unaffected by the dummy wait. Reported-by: K Prateek Nayak <kprateek.nayak@amd.com> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/all/20220921063638.2489-1-kprateek.nayak@amd.com/ Link: https://lkml.kernel.org/r/20220922184745.3252932-1-dave.hansen@intel.com
2022-09-24certs: make system keyring depend on built-in x509 parserMasahiro Yamada
Commit e90886291c7c ("certs: make system keyring depend on x509 parser") is not the right fix because x509_load_certificate_list() can be modular. The combination of CONFIG_SYSTEM_TRUSTED_KEYRING=y and CONFIG_X509_CERTIFICATE_PARSER=m still results in the following error: LD .tmp_vmlinux.kallsyms1 ld: certs/system_keyring.o: in function `load_system_certificate_list': system_keyring.c:(.init.text+0x8c): undefined reference to `x509_load_certificate_list' make: *** [Makefile:1169: vmlinux] Error 1 Fixes: e90886291c7c ("certs: make system keyring depend on x509 parser") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Adam Borowski <kilobyte@angband.pl>
2022-09-24Kconfig: remove unused function 'menu_get_root_menu'Zeng Heng
There is nowhere calling `menu_get_root_menu` function, so remove it. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>