summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-08-21Linux 5.18.19v5.18.19linux-5.18.yGreg Kroah-Hartman
Link: https://lore.kernel.org/r/20220819153710.430046927@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21arm64: kexec_file: use more system keyrings to verify kernel image signatureCoiby Xu
commit 0d519cadf75184a24313568e7f489a7fc9b1be3b upstream. Currently, when loading a kernel image via the kexec_file_load() system call, arm64 can only use the .builtin_trusted_keys keyring to verify a signature whereas x86 can use three more keyrings i.e. .secondary_trusted_keys, .machine and .platform keyrings. For example, one resulting problem is kexec'ing a kernel image would be rejected with the error "Lockdown: kexec: kexec of unsigned images is restricted; see man kernel_lockdown.7". This patch set enables arm64 to make use of the same keyrings as x86 to verify the signature kexec'ed kernel image. Fixes: 732b7b93d849 ("arm64: kexec_file: add kernel signature verification support") Cc: stable@vger.kernel.org # 105e10e2cf1c: kexec_file: drop weak attribute from functions Cc: stable@vger.kernel.org # 34d5960af253: kexec: clean up arch_kexec_kernel_verify_sig Cc: stable@vger.kernel.org # 83b7bb2d49ae: kexec, KEYS: make the code in bzImage64_verify_sig generic Acked-by: Baoquan He <bhe@redhat.com> Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Co-developed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Coiby Xu <coxu@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21kexec, KEYS: make the code in bzImage64_verify_sig genericCoiby Xu
commit c903dae8941deb55043ee46ded29e84e97cd84bb upstream. commit 278311e417be ("kexec, KEYS: Make use of platform keyring for signature verify") adds platform keyring support on x86 kexec but not arm64. The code in bzImage64_verify_sig uses the keys on the .builtin_trusted_keys, .machine, if configured and enabled, .secondary_trusted_keys, also if configured, and .platform keyrings to verify the signed kernel image as PE file. Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Reviewed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Coiby Xu <coxu@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()Qu Wenruo
commit f6065f8edeb25f4a9dfe0b446030ad995a84a088 upstream. [BUG] There is a small workload which will always fail with recent kernel: (A simplified version from btrfs/125 test case) mkfs.btrfs -f -m raid5 -d raid5 -b 1G $dev1 $dev2 $dev3 mount $dev1 $mnt xfs_io -f -c "pwrite -S 0xee 0 1M" $mnt/file1 sync umount $mnt btrfs dev scan -u $dev3 mount -o degraded $dev1 $mnt xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2 umount $mnt btrfs dev scan mount $dev1 $mnt btrfs balance start --full-balance $mnt umount $mnt The failure is always failed to read some tree blocks: BTRFS info (device dm-4): relocating block group 217710592 flags data|raid5 BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7 BTRFS error (device dm-4): parent transid verify failed on 38993920 wanted 9 found 7 ... [CAUSE] With the recently added debug output, we can see all RAID56 operations related to full stripe 38928384: 56.1183: raid56_read_partial: full_stripe=38928384 devid=2 type=DATA1 offset=0 opf=0x0 physical=9502720 len=65536 56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=16384 opf=0x0 physical=9519104 len=16384 56.1185: raid56_read_partial: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x0 physical=9551872 len=16384 56.1187: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=0 opf=0x1 physical=9502720 len=16384 56.1188: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=32768 opf=0x1 physical=9535488 len=16384 56.1188: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=0 opf=0x1 physical=30474240 len=16384 56.1189: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=32768 opf=0x1 physical=30507008 len=16384 56.1218: raid56_write_stripe: full_stripe=38928384 devid=3 type=DATA2 offset=49152 opf=0x1 physical=9551872 len=16384 56.1219: raid56_write_stripe: full_stripe=38928384 devid=1 type=PQ1 offset=49152 opf=0x1 physical=30523392 len=16384 56.2721: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 56.2723: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 56.2724: raid56_parity_recover: full stripe=38928384 eb=39010304 mirror=2 Before we enter raid56_parity_recover(), we have triggered some metadata write for the full stripe 38928384, this leads to us to read all the sectors from disk. Furthermore, btrfs raid56 write will cache its calculated P/Q sectors to avoid unnecessary read. This means, for that full stripe, after any partial write, we will have stale data, along with P/Q calculated using that stale data. Thankfully due to patch "btrfs: only write the sectors in the vertical stripe which has data stripes" we haven't submitted all the corrupted P/Q to disk. When we really need to recover certain range, aka in raid56_parity_recover(), we will use the cached rbio, along with its cached sectors (the full stripe is all cached). This explains why we have no event raid56_scrub_read_recover() triggered. Since we have the cached P/Q which is calculated using the stale data, the recovered one will just be stale. In our particular test case, it will always return the same incorrect metadata, thus causing the same error message "parent transid verify failed on 39010304 wanted 9 found 7" again and again. [BTRFS DESTRUCTIVE RMW PROBLEM] Test case btrfs/125 (and above workload) always has its trouble with the destructive read-modify-write (RMW) cycle: 0 32K 64K Data1: | Good | Good | Data2: | Bad | Bad | Parity: | Good | Good | In above case, if we trigger any write into Data1, we will use the bad data in Data2 to re-generate parity, killing the only chance to recovery Data2, thus Data2 is lost forever. This destructive RMW cycle is not specific to btrfs RAID56, but there are some btrfs specific behaviors making the case even worse: - Btrfs will cache sectors for unrelated vertical stripes. In above example, if we're only writing into 0~32K range, btrfs will still read data range (32K ~ 64K) of Data1, and (64K~128K) of Data2. This behavior is to cache sectors for later update. Incidentally commit d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible") has a bug which makes RAID56 to never trust the cached sectors, thus slightly improve the situation for recovery. Unfortunately, follow up fix "btrfs: update stripe_sectors::uptodate in steal_rbio" will revert the behavior back to the old one. - Btrfs raid56 partial write will update all P/Q sectors and cache them This means, even if data at (64K ~ 96K) of Data2 is free space, and only (96K ~ 128K) of Data2 is really stale data. And we write into that (96K ~ 128K), we will update all the parity sectors for the full stripe. This unnecessary behavior will completely kill the chance of recovery. Thankfully, an unrelated optimization "btrfs: only write the sectors in the vertical stripe which has data stripes" will prevent submitting the write bio for untouched vertical sectors. That optimization will keep the on-disk P/Q untouched for a chance for later recovery. [FIX] Although we have no good way to completely fix the destructive RMW (unless we go full scrub for each partial write), we can still limit the damage. With patch "btrfs: only write the sectors in the vertical stripe which has data stripes" now we won't really submit the P/Q of unrelated vertical stripes, so the on-disk P/Q should still be fine. Now we really need to do is just drop all the cached sectors when doing recovery. By this, we have a chance to read the original P/Q from disk, and have a chance to recover the stale data, while still keep the cache to speed up regular write path. In fact, just dropping all the cache for recovery path is good enough to allow the test case btrfs/125 along with the small script to pass reliably. The lack of metadata write after the degraded mount, and forced metadata COW is saving us this time. So this patch will fix the behavior by not trust any cache in __raid56_parity_recover(), to solve the problem while still keep the cache useful. But please note that this test pass DOES NOT mean we have solved the destructive RMW problem, we just do better damage control a little better. Related patches: - btrfs: only write the sectors in the vertical stripe - d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible") - btrfs: update stripe_sectors::uptodate in steal_rbio Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21btrfs: only write the sectors in the vertical stripe which has data stripesQu Wenruo
commit bd8f7e627703ca5707833d623efcd43f104c7b3f upstream. If we have only 8K partial write at the beginning of a full RAID56 stripe, we will write the following contents: 0 8K 32K 64K Disk 1 (data): |XX| | | Disk 2 (data): | | | Disk 3 (parity): |XXXXXXXXXXXXXXX|XXXXXXXXXXXXXXX| |X| means the sector will be written back to disk. Note that, although we won't write any sectors from disk 2, but we will write the full 64KiB of parity to disk. This behavior is fine for now, but not for the future (especially for RAID56J, as we waste quite some space to journal the unused parity stripes). So here we will also utilize the btrfs_raid_bio::dbitmap, anytime we queue a higher level bio into an rbio, we will update rbio::dbitmap to indicate which vertical stripes we need to writeback. And at finish_rmw(), we also check dbitmap to see if we need to write any sector in the vertical stripe. So after the patch, above example will only lead to the following writeback pattern: 0 8K 32K 64K Disk 1 (data): |XX| | | Disk 2 (data): | | | Disk 3 (parity): |XX| | | Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21net_sched: cls_route: disallow handle of 0Jamal Hadi Salim
commit 02799571714dc5dd6948824b9d080b44a295f695 upstream. Follows up on: https://lore.kernel.org/all/20220809170518.164662-1-cascardo@canonical.com/ handle of 0 implies from/to of universe realm which is not very sensible. Lets see what this patch will do: $sudo tc qdisc add dev $DEV root handle 1:0 prio //lets manufacture a way to insert handle of 0 $sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \ route to 0 from 0 classid 1:10 action ok //gets rejected... Error: handle of 0 is not valid. We have an error talking to the kernel, -1 //lets create a legit entry.. sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \ classid 1:10 action ok //what did the kernel insert? $sudo tc filter ls dev $DEV parent 1:0 filter protocol ip pref 100 route chain 0 filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 //Lets try to replace that legit entry with a handle of 0 $ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \ handle 0x000a8000 route to 0 from 0 classid 1:10 action drop Error: Replacing with handle of 0 is invalid. We have an error talking to the kernel, -1 And last, lets run Cascardo's POC: $ ./poc 0 0 -22 -22 -22 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21tee: add overflow check in register_shm_helper()Jens Wiklander
commit 573ae4f13f630d6660008f1974c0a8a29c30e18a upstream. With special lengths supplied by user space, register_shm_helper() has an integer overflow when calculating the number of pages covered by a supplied user space memory region. This causes internal_get_user_pages_fast() a helper function of pin_user_pages_fast() to do a NULL pointer dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 Modules linked in: CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pc : internal_get_user_pages_fast+0x474/0xa80 Call trace: internal_get_user_pages_fast+0x474/0xa80 pin_user_pages_fast+0x24/0x4c register_shm_helper+0x194/0x330 tee_shm_register_user_buf+0x78/0x120 tee_ioctl+0xd0/0x11a0 __arm64_sys_ioctl+0xa8/0xec invoke_syscall+0x48/0x114 Fix this by adding an an explicit call to access_ok() in tee_shm_register_user_buf() to catch an invalid user space address early. Fixes: 033ddf12bcf5 ("tee: add register user memory") Cc: stable@vger.kernel.org Reported-by: Nimish Mishra <neelam.nimish@gmail.com> Reported-by: Anirban Chakraborty <ch.anirban00727@gmail.com> Reported-by: Debdeep Mukhopadhyay <debdeep.mukhopadhyay@gmail.com> Suggested-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17Linux 5.18.18v5.18.18Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20220815180429.240518113@linuxfoundation.org Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Link: https://lore.kernel.org/r/20220816124604.978842485@linuxfoundation.org Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17mm: introduce clear_highpage_kasan_taggedAndrey Konovalov
commit d9da8f6cf55eeca642c021912af1890002464c64 upstream. Add a clear_highpage_kasan_tagged() helper that does clear_highpage() on a page potentially tagged by KASAN. This helper is used by the following patch. Link: https://lkml.kernel.org/r/4471979b46b2c487787ddcd08b9dc5fedd1b6ffd.1654798516.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regressionLuiz Augusto von Dentz
commit 332f1795ca202489c665a75e62e18ff6284de077 upstream. The patch d0be8347c623: "Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put" from Jul 21, 2022, leads to the following Smatch static checker warning: net/bluetooth/l2cap_core.c:1977 l2cap_global_chan_by_psm() error: we previously assumed 'c' could be null (see line 1996) Fixes: d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17io_uring: mem-account pbuf bucketsPavel Begunkov
commit cc18cc5e82033d406f54144ad6f8092206004684 upstream. Potentially, someone may create as many pbuf bucket as there are indexes in an xarray without any other restrictions bounding our memory usage, put memory needed for the buckets under memory accounting. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d34c452e45793e978d26e2606211ec9070d329ea.1659622312.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17f2fs: fix null-ptr-deref in f2fs_get_dnode_of_dataYe Bin
commit 4a2c5b7994960fac29cf8a3f4e62855bae1b27d4 upstream. There is issue as follows when test f2fs atomic write: F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock F2FS-fs (loop0): invalid crc_offset: 0 F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=1, run fsck to fix. F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix. ================================================================== BUG: KASAN: null-ptr-deref in f2fs_get_dnode_of_data+0xac/0x16d0 Read of size 8 at addr 0000000000000028 by task rep/1990 CPU: 4 PID: 1990 Comm: rep Not tainted 5.19.0-rc6-next-20220715 #266 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report.cold+0x49a/0x6bb kasan_report+0xa8/0x130 f2fs_get_dnode_of_data+0xac/0x16d0 f2fs_do_write_data_page+0x2a5/0x1030 move_data_page+0x3c5/0xdf0 do_garbage_collect+0x2015/0x36c0 f2fs_gc+0x554/0x1d30 f2fs_balance_fs+0x7f5/0xda0 f2fs_write_single_data_page+0xb66/0xdc0 f2fs_write_cache_pages+0x716/0x1420 f2fs_write_data_pages+0x84f/0x9a0 do_writepages+0x130/0x3a0 filemap_fdatawrite_wbc+0x87/0xa0 file_write_and_wait_range+0x157/0x1c0 f2fs_do_sync_file+0x206/0x12d0 f2fs_sync_file+0x99/0xc0 vfs_fsync_range+0x75/0x140 f2fs_file_write_iter+0xd7b/0x1850 vfs_write+0x645/0x780 ksys_write+0xf1/0x1e0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd As 3db1de0e582c commit changed atomic write way which new a cow_inode for atomic write file, and also mark cow_inode as FI_ATOMIC_FILE. When f2fs_do_write_data_page write cow_inode will use cow_inode's cow_inode which is NULL. Then will trigger null-ptr-deref. To solve above issue, introduce FI_COW_FILE flag for COW inode. Fiexes: 3db1de0e582c("f2fs: change the current atomic write way") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/vc4: change vc4_dma_range_matches from a global to staticTom Rix
commit 63569d90863ff26c8b10c8971d1271c17a45224b upstream. sparse reports drivers/gpu/drm/vc4/vc4_drv.c:270:27: warning: symbol 'vc4_dma_range_matches' was not declared. Should it be static? vc4_dma_range_matches is only used in vc4_drv.c, so it's storage class specifier should be static. Fixes: da8e393e23ef ("drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220629200101.498138-1-trix@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITEDaeho Jeong
commit 23339e5752d01a4b5e122759b002cf896d26f6c1 upstream. F2FS_IOC_ABORT_VOLATILE_WRITE was used to abort a atomic write before. However it was removed accidentally. So revive it by changing the name, since volatile write had gone. Signed-off-by: Daeho Jeong <daehojeong@google.com> Fiexes: 7bc155fec5b3("f2fs: kill volatile write support") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17net: phy: smsc: Disable Energy Detect Power-Down in interrupt modeLukas Wunner
commit 2642cc6c3bbe0900ba15bab078fd15ad8baccbc5 upstream. Simon reports that if two LAN9514 USB adapters are directly connected without an intermediate switch, the link fails to come up and link LEDs remain dark. The issue was introduced by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling"). The PHY suffers from a known erratum wherein link detection becomes unreliable if Energy Detect Power-Down is used. In poll mode, the driver works around the erratum by briefly disabling EDPD for 640 msec to detect a neighbor, then re-enabling it to save power. In interrupt mode, no interrupt is signaled if EDPD is used by both link partners, so it must not be enabled at all. We'll recoup the power savings by enabling SUSPEND1 mode on affected LAN95xx chips in a forthcoming commit. Fixes: 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling") Reported-by: Simon Han <z.han@kunbus.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/439a3f3168c2f9d44b5fd9bb8d2b551711316be6.1655714438.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17bpf: Suppress 'passing zero to PTR_ERR' warningKumar Kartikeya Dwivedi
commit 1ec5ee8c8a5a65ea377f8bea64bf4d5b743f6f79 upstream. Kernel Test Robot complains about passing zero to PTR_ERR for the said line, suppress it by using PTR_ERR_OR_ZERO. Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220521132620.1976921-1-memxor@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17bpf: Fix sparse warning for bpf_kptr_xchg_protoKumar Kartikeya Dwivedi
commit 5b74c690e1c55953ec99fd9dab74f72dbee4fe95 upstream. Kernel Test Robot complained about missing static storage class annotation for bpf_kptr_xchg_proto variable. sparse: symbol 'bpf_kptr_xchg_proto' was not declared. Should it be static? This caused by missing extern definition in the header. Add it to suppress the sparse warning. Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220511194654.765705-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/bridge: tc358767: Fix (e)DP bridge endpoint parsing in dedicated functionMarek Vasut
commit 9030a9e571b3ba250d3d450a98310e3c74ecaff4 upstream. Per toshiba,tc358767.yaml DT binding document, port@2 the output (e)DP port is optional. In case this port is not described in DT, the bridge driver operates in DPI-to-DP mode. The drm_of_find_panel_or_bridge() call in tc_probe_edp_bridge_endpoint() returns -ENODEV in case port@2 is not present in DT and this specific return value is incorrectly propagated outside of tc_probe_edp_bridge_endpoint() function. All other error values must be propagated and are propagated correctly. Return 0 in case the port@2 is missing instead, that reinstates the original behavior before the commit this patch fixes. Fixes: 8478095a8c4b ("drm/bridge: tc358767: Move (e)DP bridge endpoint parsing into dedicated function") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Jonas Karlman <jonas@kwiboo.se> Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Marek Vasut <marex@denx.de> Cc: Maxime Ripard <maxime@cerno.tech> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Robert Foss <robert.foss@linaro.org> Cc: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220428213132.447890-1-marex@denx.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17powerpc/kexec: Fix build failure from uninitialised variableRussell Currey
commit 83ee9f23763a432a4077bf20624ee35de87bce99 upstream. clang 14 won't build because ret is uninitialised and can be returned if both prop and fdtprop are NULL. Drop the ret variable and return an error in that failure case. Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220810054331.373761-1-ruscur@russell.cc Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17Revert "s390/smp: enforce lowcore protection on CPU restart"Alexander Gordeev
commit 953503751a426413ea8aee2299ae3ee971b70d9b upstream. This reverts commit 6f5c672d17f583b081e283927f5040f726c54598. This breaks normal crash dump when CPU0 is offline. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17net: dsa: felix: fix min gate len calculation for tc when its first gate is ↵Vladimir Oltean
closed commit 7e4babffa6f340a74c820d44d44d16511e666424 upstream. min_gate_len[tc] is supposed to track the shortest interval of continuously open gates for a traffic class. For example, in the following case: TC 76543210 t0 00000001b 200000 ns t1 00000010b 200000 ns min_gate_len[0] and min_gate_len[1] should be 200000, while min_gate_len[2-7] should be 0. However what happens is that min_gate_len[0] is 200000, but min_gate_len[1] ends up being 0 (despite gate_len[1] being 200000 at the point where the logic detects the gate close event for TC 1). The problem is that the code considers a "gate close" event whenever it sees that there is a 0 for that TC (essentially it's level rather than edge triggered). By doing that, any time a gate is seen as closed without having been open prior, gate_len, which is 0, will be written into min_gate_len. Once min_gate_len becomes 0, it's impossible for it to track anything higher than that (the length of actually open intervals). To fix this, we make the writing to min_gate_len[tc] be edge-triggered, which avoids writes for gates that are closed in consecutive intervals. However what this does is it makes us need to special-case the permanently closed gates at the end. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220804202817.1677572-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17tracing: Use a copy of the va_list for __assign_vstr()Steven Rostedt (Google)
commit 3a2dcbaf4d31023106975d6ae75b6df080c454cb upstream. If an instance of tracing enables the same trace event as another instance, or the top level instance, or even perf, then the va_list passed into some tracepoints can be used more than once. As va_list can only be traversed once, this can cause issues: # cat /sys/kernel/tracing/instances/qla2xxx/trace cat-56106 [012] ..... 2419873.470098: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered (null). cat-56106 [012] ..... 2419873.470101: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered ×+<96>²Ü<98>^H. cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0xde589000. # cat /sys/kernel/tracing/trace cat-56106 [012] ..... 2419873.470097: ql_dbg_log: qla2xxx [0000:05:00.0]-1054:14: Entered qla2x00_get_firmware_state. cat-56106 [012] ..... 2419873.470100: ql_dbg_log: qla2xxx [0000:05:00.0]-1000:14: Entered qla2x00_mailbox_command. cat-56106 [012] ..... 2419873.470102: ql_dbg_log: qla2xxx [0000:05:00.0]-1006:14: Prepare to issue mbox cmd=0x69. The instance version is corrupted because the top level instance iterated the va_list first. Use va_copy() in the __assign_vstr() macro to make sure that each trace event for each use case gets a fresh va_list. Link: https://lore.kernel.org/all/259d53a5-958e-6508-4e45-74dba2821242@marvell.com/ Link: https://lkml.kernel.org/r/20220719182004.21daa83e@gandalf.local.home Fixes: 0563231f93c6d ("tracing/events: Add __vstring() and __assign_vstr() helper macros") Reported-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17mptcp: refine memory schedulingPaolo Abeni
commit 69d93daec026cdda98e29e8edb12534bfa5b1a9b upstream. Similar to commit 7c80b038d23e ("net: fix sk_wmem_schedule() and sk_rmem_schedule() errors"), let the MPTCP receive path schedule exactly the required amount of memory. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv"Greg Kroah-Hartman
commit 5f8954e099b8ae96e7de1bb95950e00c85bedd40 upstream. This reverts commit a52ed4866d2b90dd5e4ae9dabd453f3ed8fa3cbc as it causes build problems in linux-next. It needs to be reintroduced in a way that can allow the api to evolve and not require a "flag day" to catch all users. Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au Cc: Duoming Zhou <duoming@zju.edu.cn> Cc: Brian Norris <briannorris@chromium.org> Cc: Johannes Berg <johannes@sipsolutions.net> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17raw: fix a typo in raw_icmp_error()Eric Dumazet
commit 97a4d46b1516250d640c1ae0c9e7129d160d6a1c upstream. I accidentally broke IPv4 traceroute, by swapping iph->saddr and iph->daddr. Probably because raw_icmp_error() and raw_v4_input() use different order for iph->saddr and iph->daddr. Fixes: ba44f8182ec2 ("raw: use more conventional iterators") Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220623193540.2851799-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17raw: remove unused variables from raw6_icmp_error()Eric Dumazet
commit c4fceb46add65481ef0dfb79cad24c3c269b4cad upstream. saddr and daddr are set but not used. Fixes: ba44f8182ec2 ("raw: use more conventional iterators") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/r/20220622032303.159394-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17crypto: lib/blake2s - reduce stack frame usage in self testJason A. Donenfeld
commit d6c14da474bf260d73953fbf7992c98d9112aec7 upstream. Using 3 blocks here doesn't give us much more than using 2, and it causes a stack frame size warning on certain compiler/config/arch combinations: lib/crypto/blake2s-selftest.c: In function 'blake2s_selftest': >> lib/crypto/blake2s-selftest.c:632:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=] 632 | } | ^ So this patch just reduces the block from 3 to 2, which makes the warning go away. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/linux-crypto/202206200851.gE3MHCgd-lkp@intel.com Fixes: 2d16803c562e ("crypto: blake2s - remove shash module") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17tcp: fix over estimation in sk_forced_mem_schedule()Eric Dumazet
commit c4ee118561a0f74442439b7b5b486db1ac1ddfeb upstream. sk_forced_mem_schedule() has a bug similar to ones fixed in commit 7c80b038d23e ("net: fix sk_wmem_schedule() and sk_rmem_schedule() errors") While this bug has little chance to trigger in old kernels, we need to fix it before the following patch. Fixes: d83769a580f1 ("tcp: fix possible deadlock in tcp_send_fin()") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17net_sched: cls_route: remove from list when handle is 0Thadeu Lima de Souza Cascardo
commit 9ad36309e2719a884f946678e0296be10f0bb4c1 upstream. When a route filter is replaced and the old filter has a 0 handle, the old one won't be removed from the hashtable, while it will still be freed. The test was there since before commit 1109c00547fc ("net: sched: RCU cls_route"), when a new filter was not allocated when there was an old one. The old filter was reused and the reinserting would only be necessary if an old filter was replaced. That was still wrong for the same case where the old handle was 0. Remove the old filter from the list independently from its handle value. This fixes CVE-2022-2588, also reported as ZDI-CAN-17440. Reported-by: Zhenpeng Lin <zplin@u.northwestern.edu> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Reviewed-by: Kamal Mostafa <kamal@canonical.com> Cc: <stable@vger.kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17btrfs: convert count_max_extents() to use fs_info->max_extent_sizeNaohiro Aota
commit 7d7672bc5d1038c745716c397d892d21e29de71c upstream. If count_max_extents() uses BTRFS_MAX_EXTENT_SIZE to calculate the number of extents needed, btrfs release the metadata reservation too much on its way to write out the data. Now that BTRFS_MAX_EXTENT_SIZE is replaced with fs_info->max_extent_size, convert count_max_extents() to use it instead, and fix the calculation of the metadata reservation. CC: stable@vger.kernel.org # 5.12+ Fixes: d8e3fb106f39 ("btrfs: zoned: use ZONE_APPEND write for zoned mode") Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17btrfs: join running log transaction when logging new nameFilipe Manana
commit 723df2bcc9e166ac7fb82b3932a53e09415dfcde upstream. When logging a new name, in case of a rename, we pin the log before changing it. We then either delete a directory entry from the log or insert a key range item to mark the old name for deletion on log replay. However when doing one of those log changes we may have another task that started writing out the log (at btrfs_sync_log()) and it started before we pinned the log root. So we may end up changing a log tree while its writeback is being started by another task syncing the log. This can lead to inconsistencies in a log tree and other unexpected results during log replay, because we can get some committed node pointing to a node/leaf that ends up not getting written to disk before the next log commit. The problem, conceptually, started to happen in commit 88d2beec7e53fc ("btrfs: avoid logging all directory changes during renames"), because there we started to update the log without joining its current transaction first. However the problem only became visible with commit 259c4b96d78dda ("btrfs: stop doing unnecessary log updates during a rename"), and that is because we used to pin the log at btrfs_rename() and then before entering btrfs_log_new_name(), when unlinking the old dentry, we ended up at btrfs_del_inode_ref_in_log() and btrfs_del_dir_entries_in_log(). Both of them join the current log transaction, effectively waiting for any log transaction writeout (due to acquiring the root's log_mutex). This made it safe even after leaving the current log transaction, because we remained with the log pinned when we called btrfs_log_new_name(). Then in commit 259c4b96d78dda ("btrfs: stop doing unnecessary log updates during a rename"), we removed the log pinning from btrfs_rename() and stopped calling btrfs_del_inode_ref_in_log() and btrfs_del_dir_entries_in_log() during the rename, and started to do all the needed work at btrfs_log_new_name(), but without joining the current log transaction, only pinning the log, which is racy because another task may have started writeout of the log tree right before we pinned the log. Both commits landed in kernel 5.18, so it doesn't make any practical difference which should be blamed, but I'm blaming the second commit only because with the first one, by chance, the problem did not happen due to the fact we joined the log transaction after pinning the log and unpinned it only after calling btrfs_log_new_name(). So make btrfs_log_new_name() join the current log transaction instead of pinning it, so that we never do log updates if it's writeout is starting. Fixes: 259c4b96d78dda ("btrfs: stop doing unnecessary log updates during a rename") CC: stable@vger.kernel.org # 5.18+ Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17powerpc: Fix eh field when calling lwarx on PPC32Christophe Leroy
commit 18db466a9a306406dab3b134014d9f6ed642471c upstream. Commit 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") properly handled the eh field of lwarx in asm/bitops.h but failed to clear it for PPC32 in asm/simple_spinlock.h So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64 but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which returns 1 when CONFIG_PPC64 is set and 0 otherwise. Fixes: 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") Cc: stable@vger.kernel.org # v5.15+ Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Use symbolic names, use 'n' constraint per Segher] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a1176e19e627dd6a1b8d24c6c457a8ab874b7d12.1659430931.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17xen-blkfront: Apply 'feature_persistent' parameter when connectSeongJae Park
commit 402c43ea6b34a1b371ffeed9adf907402569eaf5 upstream. In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grans or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check to not be repeated once it shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. Similar behavioral change has made on 'blkfront' by commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants"). This commit changes the behavior of the parameter to make effect for every connect, so that the previous behavior of 'blkfront' can be restored. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17xen-blkback: Apply 'feature_persistent' parameter when connectMaximilian Heyne
commit e94c6101e151b019b8babc518ac2a6ada644a5a1 upstream. In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grans or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check to not be repeated once it shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. This commit changes the behavior of the parameter to make effect for every connect, so that the previous workflow can work again as expected. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Reported-by: Andrii Chepurnyi <andrii.chepurnyi82@gmail.com> Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-3-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17xen-blkback: fix persistent grants negotiationSeongJae Park
commit fc9be616bb8f3ed9cf560308f86904f5c06be205 upstream. Persistent grants feature can be used only when both backend and the frontend supports the feature. The feature was always supported by 'blkback', but commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has introduced a parameter for disabling it runtime. To avoid the parameter be updated while being used by 'blkback', the commit caches the parameter into 'vbd->feature_gnt_persistent' in 'xen_vbd_create()', and then check if the guest also supports the feature and finally updates the field in 'connect_ring()'. However, 'connect_ring()' could be called before 'xen_vbd_create()', so later execution of 'xen_vbd_create()' can wrongly overwrite 'true' to 'vbd->feature_gnt_persistent'. As a result, 'blkback' could try to use 'persistent grants' feature even if the guest doesn't support the feature. This commit fixes the issue by moving the parameter value caching to 'xen_blkif_alloc()', which allocates the 'blkif'. Because the struct embeds 'vbd' object, which will be used by 'connect_ring()' later, this should be called before 'connect_ring()' and therefore this should be the right and safe place to do the caching. Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-2-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17tpm: Add check for Failure mode for TPM2 modulesMårten Lindahl
[ Upstream commit 863ed94c589fcd1984f4e3080f069d30508044bb ] In commit 0aa698787aa2 ("tpm: Add Upgrade/Reduced mode support for TPM2 modules") it was said that: "If the TPM is in Failure mode, it will successfully respond to both tpm2_do_selftest() and tpm2_startup() calls. Although, will fail to answer to tpm2_get_cc_attrs_tbl(). Use this fact to conclude that TPM is in Failure mode." But a check was never added in the commit when calling tpm2_get_cc_attrs_tbl() to conclude that the TPM is in Failure mode. This commit corrects this by adding a check. Fixes: 0aa698787aa2 ("tpm: Add Upgrade/Reduced mode support for TPM2 modules") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17tpm: eventlog: Fix section mismatch for DEBUG_SECTION_MISMATCHHuacai Chen
[ Upstream commit bed4593645366ad7362a3aa7bc0d100d8d8236a8 ] If DEBUG_SECTION_MISMATCH enabled, __calc_tpm2_event_size() will not be inlined, this cause section mismatch like this: WARNING: modpost: vmlinux.o(.text.unlikely+0xe30c): Section mismatch in reference from the variable L0 to the function .init.text:early_ioremap() The function L0() references the function __init early_memremap(). This is often because L0 lacks a __init annotation or the annotation of early_ioremap is wrong. Fix it by using __always_inline instead of inline for the called-once function __calc_tpm2_event_size(). Fixes: 44038bc514a2 ("tpm: Abstract crypto agile event size calculations") Cc: stable@vger.kernel.org # v5.3 Reported-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17KEYS: asymmetric: enforce SM2 signature use pkey algoTianjia Zhang
[ Upstream commit 0815291a8fd66cdcf7db1445d4d99b0d16065829 ] The signature verification of SM2 needs to add the Za value and recalculate sig->digest, which requires the detection of the pkey_algo in public_key_verify_signature(). As Eric Biggers said, the pkey_algo field in sig is attacker-controlled and should be use pkey->pkey_algo instead of sig->pkey_algo, and secondly, if sig->pkey_algo is NULL, it will also cause signature verification failure. The software_key_determine_akcipher() already forces the algorithms are matched, so the SM3 algorithm is enforced in the SM2 signature, although this has been checked, we still avoid using any algorithm information in the signature as input. Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification") Reported-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: fix race when reusing xattr blocksJan Kara
[ Upstream commit 65f8b80053a1b2fd602daa6814e62d6fa90e5e9b ] When ext4_xattr_block_set() decides to remove xattr block the following race can happen: CPU1 CPU2 ext4_xattr_block_set() ext4_xattr_release_block() new_bh = ext4_xattr_block_cache_find() lock_buffer(bh); ref = le32_to_cpu(BHDR(bh)->h_refcount); if (ref == 1) { ... mb_cache_entry_delete(); unlock_buffer(bh); ext4_free_blocks(); ... ext4_forget(..., bh, ...); jbd2_journal_revoke(..., bh); ext4_journal_get_write_access(..., new_bh, ...) do_get_write_access() jbd2_journal_cancel_revoke(..., new_bh); Later the code in ext4_xattr_block_set() finds out the block got freed and cancels reusal of the block but the revoke stays canceled and so in case of block reuse and journal replay the filesystem can get corrupted. If the race works out slightly differently, we can also hit assertions in the jbd2 code. Fix the problem by making sure that once matching mbcache entry is found, code dropping the last xattr block reference (or trying to modify xattr block in place) waits until the mbcache entry reference is dropped. This way code trying to reuse xattr block is protected from someone trying to drop the last reference to xattr block. Reported-and-tested-by: Ritesh Harjani <ritesh.list@gmail.com> CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-5-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: unindent codeblock in ext4_xattr_block_set()Jan Kara
[ Upstream commit fd48e9acdf26d0cbd80051de07d4a735d05d29b2 ] Remove unnecessary else (and thus indentation level) from a code block in ext4_xattr_block_set(). It will also make following code changes easier. No functional changes. CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-4-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: use kmemdup() to replace kmalloc + memcpyShuqi Zhang
[ Upstream commit 4efd9f0d120c55b08852ee5605dbb02a77089a5d ] Replace kmalloc + memcpy with kmemdup() Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com> Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220525030120.803330-1-zhangshuqi3@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: remove EA inode entry from mbcache on inode evictionJan Kara
[ Upstream commit 6bc0d63dad7f9f54d381925ee855b402f652fa39 ] Currently we remove EA inode from mbcache as soon as its xattr refcount drops to zero. However there can be pending attempts to reuse the inode and thus refcount handling code has to handle the situation when refcount increases from zero anyway. So save some work and just keep EA inode in mbcache until it is getting evicted. At that moment we are sure following iget() of EA inode will fail anyway (or wait for eviction to finish and load things from the disk again) and so removing mbcache entry at that moment is fine and simplifies the code a bit. CC: stable@vger.kernel.org Fixes: 82939d7999df ("ext4: convert to mbcache2") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220712105436.32204-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: make sure ext4_append() always allocates new blockLukas Czerner
[ Upstream commit b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b ] ext4_append() must always allocate a new block, otherwise we run the risk of overwriting existing directory block corrupting the directory tree in the process resulting in all manner of problems later on. Add a sanity check to see if the logical block is already allocated and error out if it is. Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220704142721.157985-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: check if directory block is within i_sizeLukas Czerner
[ Upstream commit 65f8ea4cd57dbd46ea13b41dc8bac03176b04233 ] Currently ext4 directory handling code implicitly assumes that the directory blocks are always within the i_size. In fact ext4_append() will attempt to allocate next directory block based solely on i_size and the i_size is then appropriately increased after a successful allocation. However, for this to work it requires i_size to be correct. If, for any reason, the directory inode i_size is corrupted in a way that the directory tree refers to a valid directory block past i_size, we could end up corrupting parts of the directory tree structure by overwriting already used directory blocks when modifying the directory. Fix it by catching the corruption early in __ext4_read_dirblock(). Addresses Red-Hat-Bugzilla: #2070205 CVE: CVE-2022-1184 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220704142721.157985-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: fix warning in ext4_iomap_begin as race between bmap and writeYe Bin
[ Upstream commit 51ae846cff568c8c29921b1b28eb2dfbcd4ac12d ] We got issue as follows: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 9310 at fs/ext4/inode.c:3441 ext4_iomap_begin+0x182/0x5d0 RIP: 0010:ext4_iomap_begin+0x182/0x5d0 RSP: 0018:ffff88812460fa08 EFLAGS: 00010293 RAX: ffff88811f168000 RBX: 0000000000000000 RCX: ffffffff97793c12 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: ffff88812c669160 R08: ffff88811f168000 R09: ffffed10258cd20f R10: ffff88812c669077 R11: ffffed10258cd20e R12: 0000000000000001 R13: 00000000000000a4 R14: 000000000000000c R15: ffff88812c6691ee FS: 00007fd0d6ff3740(0000) GS:ffff8883af180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd0d6dda290 CR3: 0000000104a62000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iomap_apply+0x119/0x570 iomap_bmap+0x124/0x150 ext4_bmap+0x14f/0x250 bmap+0x55/0x80 do_vfs_ioctl+0x952/0xbd0 __x64_sys_ioctl+0xc6/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Above issue may happen as follows: bmap write bmap ext4_bmap iomap_bmap ext4_iomap_begin ext4_file_write_iter ext4_buffered_write_iter generic_perform_write ext4_da_write_begin ext4_da_write_inline_data_begin ext4_prepare_inline_data ext4_create_inline_data ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); if (WARN_ON_ONCE(ext4_has_inline_data(inode))) ->trigger bug_on To solved above issue hold inode lock in ext4_bamp. Signed-off-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20220617013935.397596-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: correct the misjudgment in ext4_iget_extra_inodeBaokun Li
[ Upstream commit fd7e672ea98b95b9d4c9dae316639f03c16a749d ] Use the EXT4_INODE_HAS_XATTR_SPACE macro to more accurately determine whether the inode have xattr space. Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-5-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: correct max_inline_xattr_value_size computingBaokun Li
[ Upstream commit c9fd167d57133c5b748d16913c4eabc55e531c73 ] If the ext4 inode does not have xattr space, 0 is returned in the get_max_inline_xattr_value_size function. Otherwise, the function returns a negative value when the inode does not contain EXT4_STATE_XATTR. Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-4-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: fix use-after-free in ext4_xattr_set_entryBaokun Li
[ Upstream commit 67d7d8ad99beccd9fe92d585b87f1760dc9018e3 ] Hulk Robot reported a issue: ================================================================== BUG: KASAN: use-after-free in ext4_xattr_set_entry+0x18ab/0x3500 Write of size 4105 at addr ffff8881675ef5f4 by task syz-executor.0/7092 CPU: 1 PID: 7092 Comm: syz-executor.0 Not tainted 4.19.90-dirty #17 Call Trace: [...] memcpy+0x34/0x50 mm/kasan/kasan.c:303 ext4_xattr_set_entry+0x18ab/0x3500 fs/ext4/xattr.c:1747 ext4_xattr_ibody_inline_set+0x86/0x2a0 fs/ext4/xattr.c:2205 ext4_xattr_set_handle+0x940/0x1300 fs/ext4/xattr.c:2386 ext4_xattr_set+0x1da/0x300 fs/ext4/xattr.c:2498 __vfs_setxattr+0x112/0x170 fs/xattr.c:149 __vfs_setxattr_noperm+0x11b/0x2a0 fs/xattr.c:180 __vfs_setxattr_locked+0x17b/0x250 fs/xattr.c:238 vfs_setxattr+0xed/0x270 fs/xattr.c:255 setxattr+0x235/0x330 fs/xattr.c:520 path_setxattr+0x176/0x190 fs/xattr.c:539 __do_sys_lsetxattr fs/xattr.c:561 [inline] __se_sys_lsetxattr fs/xattr.c:557 [inline] __x64_sys_lsetxattr+0xc2/0x160 fs/xattr.c:557 do_syscall_64+0xdf/0x530 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x459fe9 RSP: 002b:00007fa5e54b4c08 EFLAGS: 00000246 ORIG_RAX: 00000000000000bd RAX: ffffffffffffffda RBX: 000000000051bf60 RCX: 0000000000459fe9 RDX: 00000000200003c0 RSI: 0000000020000180 RDI: 0000000020000140 RBP: 000000000051bf60 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000001009 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc73c93fc0 R14: 000000000051bf60 R15: 00007fa5e54b4d80 [...] ================================================================== Above issue may happen as follows: ------------------------------------- ext4_xattr_set ext4_xattr_set_handle ext4_xattr_ibody_find >> s->end < s->base >> no EXT4_STATE_XATTR >> xattr_check_inode is not executed ext4_xattr_ibody_set ext4_xattr_set_entry >> size_t min_offs = s->end - s->base >> UAF in memcpy we can easily reproduce this problem with the following commands: mkfs.ext4 -F /dev/sda mount -o debug_want_extra_isize=128 /dev/sda /mnt touch /mnt/file setfattr -n user.cat -v `seq -s z 4096|tr -d '[:digit:]'` /mnt/file In ext4_xattr_ibody_find, we have the following assignment logic: header = IHDR(inode, raw_inode) = raw_inode + EXT4_GOOD_OLD_INODE_SIZE + i_extra_isize is->s.base = IFIRST(header) = header + sizeof(struct ext4_xattr_ibody_header) is->s.end = raw_inode + s_inode_size In ext4_xattr_set_entry min_offs = s->end - s->base = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) last = s->first free = min_offs - ((void *)last - s->base) - sizeof(__u32) = s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize - sizeof(struct ext4_xattr_ibody_header) - sizeof(__u32) In the calculation formula, all values except s_inode_size and i_extra_size are fixed values. When i_extra_size is the maximum value s_inode_size - EXT4_GOOD_OLD_INODE_SIZE, min_offs is -4 and free is -8. The value overflows. As a result, the preceding issue is triggered when memcpy is executed. Therefore, when finding xattr or setting xattr, check whether there is space for storing xattr in the inode to resolve this issue. Cc: stable@kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220616021358.2504451-3-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: add EXT4_INODE_HAS_XATTR_SPACE macro in xattr.hBaokun Li
[ Upstream commit 179b14152dcb6a24c3415200603aebca70ff13af ] When adding an xattr to an inode, we must ensure that the inode_size is not less than EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad. Otherwise, the end position may be greater than the start position, resulting in UAF. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20220616021358.2504451-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-17ext4: fix extent status tree race in writeback error recovery pathEric Whitney
[ Upstream commit 7f0d8e1d607c1a4fa9a27362a108921d82230874 ] A race can occur in the unlikely event ext4 is unable to allocate a physical cluster for a delayed allocation in a bigalloc file system during writeback. Failure to allocate a cluster forces error recovery that includes a call to mpage_release_unused_pages(). That function removes any corresponding delayed allocated blocks from the extent status tree. If a new delayed write is in progress on the same cluster simultaneously, resulting in the addition of an new extent containing one or more blocks in that cluster to the extent status tree, delayed block accounting can be thrown off if that delayed write then encounters a similar cluster allocation failure during future writeback. Write lock the i_data_sem in mpage_release_unused_pages() to fix this problem. Ext4's block/cluster accounting code for bigalloc relies on i_data_sem for mutual exclusion, as is found in the delayed write path, and the locking in mpage_release_unused_pages() is missing. Cc: stable@kernel.org Reported-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>