summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-31btrfs: drop the -EBUSY case in __extent_writepage_ioJosef Bacik
Now that we only return 0 or -EAGAIN from btrfs_writepage_cow_fixup, we do not need this -EBUSY case. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31Btrfs: keep pages dirty when using btrfs_writepage_fixup_workerChris Mason
For COW, btrfs expects pages dirty pages to have been through a few setup steps. This includes reserving space for the new block allocations and marking the range in the state tree for delayed allocation. A few places outside btrfs will dirty pages directly, especially when unmapping mmap'd pages. In order for these to properly go through COW, we run them through a fixup worker to wait for stable pages, and do the delalloc prep. 87826df0ec36 added a window where the dirty pages were cleaned, but pending more action from the fixup worker. We clear_page_dirty_for_io() before we call into writepage, so the page is no longer dirty. The commit changed it so now we leave the page clean between unlocking it here and the fixup worker starting at some point in the future. During this window, page migration can jump in and relocate the page. Once our fixup work actually starts, it finds page->mapping is NULL and we end up freeing the page without ever writing it. This leads to crc errors and other exciting problems, since it screws up the whole statemachine for waiting for ordered extents. The fix here is to keep the page dirty while we're waiting for the fixup worker to get to work. This is accomplished by returning -EAGAIN from btrfs_writepage_cow_fixup if we queued the page up for fixup, which will cause the writepage function to redirty the page. Because we now expect the page to be dirty once it gets to the fixup worker we must adjust the error cases to call clear_page_dirty_for_io() on the page. That is the bulk of the patch, but it is not the fix, the fix is the -EAGAIN from btrfs_writepage_cow_fixup. We cannot separate these two changes out because the error conditions change with the new expectations. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: take overcommit into account in inc_block_group_roJosef Bacik
inc_block_group_ro does a calculation to see if we have enough room left over if we mark this block group as read only in order to see if it's ok to mark the block group as read only. The problem is this calculation _only_ works for data, where our used is always less than our total. For metadata we will overcommit, so this will almost always fail for metadata. Fix this by exporting btrfs_can_overcommit, and then see if we have enough space to remove the remaining free space in the block group we are trying to mark read only. If we do then we can mark this block group as read only. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: fix force usage in inc_block_group_roJosef Bacik
For some reason we've translated the do_chunk_alloc that goes into btrfs_inc_block_group_ro to force in inc_block_group_ro, but these are two different things. force for inc_block_group_ro is used when we are forcing the block group read only no matter what, for example when the underlying chunk is marked read only. We need to not do the space check here as this block group needs to be read only. btrfs_inc_block_group_ro() has a do_chunk_alloc flag that indicates that we need to pre-allocate a chunk before marking the block group read only. This has nothing to do with forcing, and in fact we _always_ want to do the space check in this case, so unconditionally pass false for force in this case. Then fixup inc_block_group_ro to honor force as it's expected and documented to do. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: Correctly handle empty trees in find_first_clear_extent_bitNikolay Borisov
Raviu reported that running his regular fs_trim segfaulted with the following backtrace: [ 237.525947] assertion failed: prev, in ../fs/btrfs/extent_io.c:1595 [ 237.525984] ------------[ cut here ]------------ [ 237.525985] kernel BUG at ../fs/btrfs/ctree.h:3117! [ 237.525992] invalid opcode: 0000 [#1] SMP PTI [ 237.525998] CPU: 4 PID: 4423 Comm: fstrim Tainted: G U OE 5.4.14-8-vanilla #1 [ 237.526001] Hardware name: ASUSTeK COMPUTER INC. [ 237.526044] RIP: 0010:assfail.constprop.58+0x18/0x1a [btrfs] [ 237.526079] Call Trace: [ 237.526120] find_first_clear_extent_bit+0x13d/0x150 [btrfs] [ 237.526148] btrfs_trim_fs+0x211/0x3f0 [btrfs] [ 237.526184] btrfs_ioctl_fitrim+0x103/0x170 [btrfs] [ 237.526219] btrfs_ioctl+0x129a/0x2ed0 [btrfs] [ 237.526227] ? filemap_map_pages+0x190/0x3d0 [ 237.526232] ? do_filp_open+0xaf/0x110 [ 237.526238] ? _copy_to_user+0x22/0x30 [ 237.526242] ? cp_new_stat+0x150/0x180 [ 237.526247] ? do_vfs_ioctl+0xa4/0x640 [ 237.526278] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [ 237.526283] do_vfs_ioctl+0xa4/0x640 [ 237.526288] ? __do_sys_newfstat+0x3c/0x60 [ 237.526292] ksys_ioctl+0x70/0x80 [ 237.526297] __x64_sys_ioctl+0x16/0x20 [ 237.526303] do_syscall_64+0x5a/0x1c0 [ 237.526310] entry_SYSCALL_64_after_hwframe+0x49/0xbe That was due to btrfs_fs_device::aloc_tree being empty. Initially I thought this wasn't possible and as a percaution have put the assert in find_first_clear_extent_bit. Turns out this is indeed possible and could happen when a file system with SINGLE data/metadata profile has a 2nd device added. Until balance is run or a new chunk is allocated on this device it will be completely empty. In this case find_first_clear_extent_bit should return the full range [0, -1ULL] and let the caller handle this i.e for trim the end will be capped at the size of actual device. Link: https://lore.kernel.org/linux-btrfs/izW2WNyvy1dEDweBICizKnd2KDwDiDyY2EYQr4YCwk7pkuIpthx-JRn65MPBde00ND6V0_Lh8mW0kZwzDiLDv25pUYWxkskWNJnVP0kgdMA=@protonmail.com/ Fixes: 45bfcfc168f8 ("btrfs: Implement find_first_clear_extent_bit") CC: stable@vger.kernel.org # 5.2+ Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31btrfs: flush write bio if we loop in extent_write_cache_pagesJosef Bacik
There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02e1423 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31Btrfs: fix race between adding and putting tree mod seq elements and nodesFilipe Manana
There is a race between adding and removing elements to the tree mod log list and rbtree that can lead to use-after-free problems. Consider the following example that explains how/why the problems happens: 1) Task A has mod log element with sequence number 200. It currently is the only element in the mod log list; 2) Task A calls btrfs_put_tree_mod_seq() because it no longer needs to access the tree mod log. When it enters the function, it initializes 'min_seq' to (u64)-1. Then it acquires the lock 'tree_mod_seq_lock' before checking if there are other elements in the mod seq list. Since the list it empty, 'min_seq' remains set to (u64)-1. Then it unlocks the lock 'tree_mod_seq_lock'; 3) Before task A acquires the lock 'tree_mod_log_lock', task B adds itself to the mod seq list through btrfs_get_tree_mod_seq() and gets a sequence number of 201; 4) Some other task, name it task C, modifies a btree and because there elements in the mod seq list, it adds a tree mod elem to the tree mod log rbtree. That node added to the mod log rbtree is assigned a sequence number of 202; 5) Task B, which is doing fiemap and resolving indirect back references, calls btrfs get_old_root(), with 'time_seq' == 201, which in turn calls tree_mod_log_search() - the search returns the mod log node from the rbtree with sequence number 202, created by task C; 6) Task A now acquires the lock 'tree_mod_log_lock', starts iterating the mod log rbtree and finds the node with sequence number 202. Since 202 is less than the previously computed 'min_seq', (u64)-1, it removes the node and frees it; 7) Task B still has a pointer to the node with sequence number 202, and it dereferences the pointer itself and through the call to __tree_mod_log_rewind(), resulting in a use-after-free problem. This issue can be triggered sporadically with the test case generic/561 from fstests, and it happens more frequently with a higher number of duperemove processes. When it happens to me, it either freezes the VM or it produces a trace like the following before crashing: [ 1245.321140] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [ 1245.321200] CPU: 1 PID: 26997 Comm: pool Not tainted 5.5.0-rc6-btrfs-next-52 #1 [ 1245.321235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 [ 1245.321287] RIP: 0010:rb_next+0x16/0x50 [ 1245.321307] Code: .... [ 1245.321372] RSP: 0018:ffffa151c4d039b0 EFLAGS: 00010202 [ 1245.321388] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8ae221363c80 RCX: 6b6b6b6b6b6b6b6b [ 1245.321409] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8ae221363c80 [ 1245.321439] RBP: ffff8ae20fcc4688 R08: 0000000000000002 R09: 0000000000000000 [ 1245.321475] R10: ffff8ae20b120910 R11: 00000000243f8bb1 R12: 0000000000000038 [ 1245.321506] R13: ffff8ae221363c80 R14: 000000000000075f R15: ffff8ae223f762b8 [ 1245.321539] FS: 00007fdee1ec7700(0000) GS:ffff8ae236c80000(0000) knlGS:0000000000000000 [ 1245.321591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1245.321614] CR2: 00007fded4030c48 CR3: 000000021da16003 CR4: 00000000003606e0 [ 1245.321642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1245.321668] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1245.321706] Call Trace: [ 1245.321798] __tree_mod_log_rewind+0xbf/0x280 [btrfs] [ 1245.321841] btrfs_search_old_slot+0x105/0xd00 [btrfs] [ 1245.321877] resolve_indirect_refs+0x1eb/0xc60 [btrfs] [ 1245.321912] find_parent_nodes+0x3dc/0x11b0 [btrfs] [ 1245.321947] btrfs_check_shared+0x115/0x1c0 [btrfs] [ 1245.321980] ? extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322029] extent_fiemap+0x59d/0x6d0 [btrfs] [ 1245.322066] do_vfs_ioctl+0x45a/0x750 [ 1245.322081] ksys_ioctl+0x70/0x80 [ 1245.322092] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1245.322113] __x64_sys_ioctl+0x16/0x20 [ 1245.322126] do_syscall_64+0x5c/0x280 [ 1245.322139] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1245.322155] RIP: 0033:0x7fdee3942dd7 [ 1245.322177] Code: .... [ 1245.322258] RSP: 002b:00007fdee1ec6c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1245.322294] RAX: ffffffffffffffda RBX: 00007fded40210d8 RCX: 00007fdee3942dd7 [ 1245.322314] RDX: 00007fded40210d8 RSI: 00000000c020660b RDI: 0000000000000004 [ 1245.322337] RBP: 0000562aa89e7510 R08: 0000000000000000 R09: 00007fdee1ec6d44 [ 1245.322369] R10: 0000000000000073 R11: 0000000000000246 R12: 00007fdee1ec6d48 [ 1245.322390] R13: 00007fdee1ec6d40 R14: 00007fded40210d0 R15: 00007fdee1ec6d50 [ 1245.322423] Modules linked in: .... [ 1245.323443] ---[ end trace 01de1e9ec5dff3cd ]--- Fix this by ensuring that btrfs_put_tree_mod_seq() computes the minimum sequence number and iterates the rbtree while holding the lock 'tree_mod_log_lock' in write mode. Also get rid of the 'tree_mod_seq_lock' lock, since it is now redundant. Fixes: bd989ba359f2ac ("Btrfs: add tree modification log functions") Fixes: 097b8a7c9e48e2 ("Btrfs: join tree mod log code with the code holding back delayed refs") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31thermal: stm32: fix spelling mistake "preprare" -> "prepare"Colin Ian King
There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200130100537.18069-1-colin.king@canonical.com
2020-01-31Documentation: cpu-idle-cooling: fix a SEVERE docs build failureRandy Dunlap
Sphinx ('make htmldocs') stops with a SEVERE error: Sphinx parallel build error: SystemMessage: /home/rdunlap/lnx/next/linux-next-20200120/Documentation/driver-api/thermal/cpu-idle-cooling.rst:69: (SEVERE/4) Unexpected section title. ^ | so fix the .rst file so that the SEVERE build error does not happen. Also fix another minor formatting warning (unexpected unindent). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: linux-pm@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/712c1152-56b5-307f-b3f3-ed03a30b804a@infradead.org
2020-01-31Merge branches 'pm-cpufreq' and 'pm-core'Rafael J. Wysocki
* pm-cpufreq: cpufreq: Avoid creating excessively large stack frames * pm-core: PM: core: Fix handling of devices deleted during system-wide resume
2020-01-30Merge tag 'mpx-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-mpx Pull x86 MPX removal from Dave Hansen: "MPX requires recompiling applications, which requires compiler support. Unfortunately, GCC 9.1 is expected to be be released without support for MPX. This means that there was only a relatively small window where folks could have ever used MPX. It failed to gain wide adoption in the industry, and Linux was the only mainstream OS to ever support it widely. Support for the feature may also disappear on future processors. This set completes the process that we started during the 5.4 merge window when the MPX prctl()s were removed. XSAVE support is left in place, which allows MPX-using KVM guests to continue to function" * tag 'mpx-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-mpx: x86/mpx: remove MPX from arch/x86 mm: remove arch_bprm_mm_init() hook x86/mpx: remove bounds exception code x86/mpx: remove build infrastructure x86/alternatives: add missing insn.h include
2020-01-30Merge tag 'mtd/for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "MTD core - block2mtd: page index should use pgoff_t - maps: physmap: minimal Runtime PM support - maps: pcmciamtd: avoid possible sleep-in-atomic-context bugs - concat: Fix a comment referring to an unknown symbol Raw NAND: - Macronix: Use match_string() helper - Atmel: switch to using devm_fwnode_gpiod_get() - Denali: rework the SKIP_BYTES feature and add reset controlling - Brcmnand: set appropriate DMA mask - Cadence: add unspecified HAS_IOMEM dependency - Various cleanup. Onenand: - Rename Samsung and Omap2 drivers to avoid possible build warnings - Enable compile testing - Various build issues - Kconfig cleanup SPI-NAND: - Support for Toshiba TC58CVG2S0HRAIJ SPI-NOR: - Add support for TB selection using SR bit 6, - Add support for few flashes" * tag 'mtd/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (41 commits) mtd: concat: Fix a comment referring to an unknown symbol mtd: rawnand: add unspecified HAS_IOMEM dependency mtd: block2mtd: page index should use pgoff_t mtd: maps: physmap: Add minimal Runtime PM support mtd: maps: pcmciamtd: fix possible sleep-in-atomic-context bugs in pcmciamtd_set_vpp() mtd: onenand: Rename omap2 driver to avoid a build warning mtd: onenand: Use a better name for samsung driver mtd: rawnand: atmel: switch to using devm_fwnode_gpiod_get() mtd: spinand: add support for Toshiba TC58CVG2S0HRAIJ mtd: rawnand: macronix: Use match_string() helper to simplify the code mtd: sharpslpart: Fix unsigned comparison to zero mtd: onenand: Enable compile testing of OMAP and Samsung drivers mtd: onenand: samsung: Fix printing format for size_t on 64-bit mtd: onenand: samsung: Fix pointer cast -Wpointer-to-int-cast warnings on 64 bit mtd: rawnand: denali: remove hard-coded DENALI_DEFAULT_OOB_SKIP_BYTES mtd: rawnand: denali_dt: add reset controlling dt-bindings: mtd: denali_dt: document reset property mtd: rawnand: denali_dt: Add support for configuring SPARE_AREA_SKIP_BYTES mtd: rawnand: denali_dt: error out if platform has no associated data mtd: rawnand: brcmnand: Set appropriate DMA mask ...
2020-01-30Merge tag 'upstream-5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI/UBIFS updates from Miquel Raynal: "This pull request contains mostly fixes for UBI and UBIFS: UBI: - Fixes for memory leaks in error paths - Fix for an logic error in a fastmap selfcheck UBIFS: - Fix for FS_IOC_SETFLAGS related to fscrypt flag - Support for FS_ENCRYPT_FL - Fix for a dead lock in bulk-read mode" Sent on behalf of Richard Weinberger who is traveling. * tag 'upstream-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: Fix an error pointer dereference in error handling code ubifs: Fix memory leak from c->sup_node ubifs: Fix ino_t format warnings in orphan_delete() ubifs: Fix deadlock in concurrent bulk-read and writepage ubifs: Fix wrong memory allocation ubi: Free the normal volumes in error paths of ubi_attach_mtd_dev() ubi: Check the presence of volume before call ubi_fastmap_destroy_checkmap() ubifs: Add support for FS_ENCRYPT_FL ubifs: Fix FS_IOC_SETFLAGS unexpectedly clearing encrypt flag ubi: wl: Remove set but not used variable 'prev_e' ubi: fastmap: Fix inverted logic in seen selfcheck
2020-01-30Merge tag 'f2fs-for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this series, we've implemented transparent compression experimentally. It supports LZO and LZ4, but will add more later as we investigate in the field more. At this point, the feature doesn't expose compressed space to user directly in order to guarantee potential data updates later to the space. Instead, the main goal is to reduce data writes to flash disk as much as possible, resulting in extending disk life time as well as relaxing IO congestion. Alternatively, we're also considering to add ioctl() to reclaim compressed space and show it to user after putting the immutable bit. Enhancements: - add compression support - avoid unnecessary locks in quota ops - harden power-cut scenario for zoned block devices - use private bio_set to avoid IO congestion - replace GC mutex with rwsem to serialize callers Bug fixes: - fix dentry consistency and memory corruption in rename()'s error case - fix wrong swap extent reports - fix casefolding bugs - change lock coverage to avoid deadlock - avoid GFP_KERNEL under f2fs_lock_op And, we've cleaned up sysfs entries to prepare no debugfs" * tag 'f2fs-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (31 commits) f2fs: fix race conditions in ->d_compare() and ->d_hash() f2fs: fix dcache lookup of !casefolded directories f2fs: Add f2fs stats to sysfs f2fs: delete duplicate information on sysfs nodes f2fs: change to use rwsem for gc_mutex f2fs: update f2fs document regarding to fsync_mode f2fs: add a way to turn off ipu bio cache f2fs: code cleanup for f2fs_statfs_project() f2fs: fix miscounted block limit in f2fs_statfs_project() f2fs: show the CP_PAUSE reason in checkpoint traces f2fs: fix deadlock allocating bio_post_read_ctx from mempool f2fs: remove unneeded check for error allocating bio_post_read_ctx f2fs: convert inline_dir early before starting rename f2fs: fix memleak of kobject f2fs: fix to add swap extent correctly f2fs: run fsck when getting bad inode during GC f2fs: support data compression f2fs: free sysfs kobject f2fs: declare nested quota_sem and remove unnecessary sems f2fs: don't put new_page twice in f2fs_rename ...
2020-01-30Merge tag 'for_v5.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF, quota, reiserfs, ext2 fixes and cleanups from Jan Kara: "A few assorted fixes and cleanups for udf, quota, reiserfs, and ext2" * tag 'for_v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs: remove unused macros fs/quota: remove unused macro udf: Clarify meaning of f_files in udf_statfs udf: Allow writing to 'Rewritable' partitions udf: Disallow R/W mode for disk with Metadata partition udf: Fix meaning of ENTITYID_FLAGS_* macros to be really bitwise-or flags udf: Fix free space reporting for metadata and virtual partitions udf: Update header files to UDF 2.60 udf: Move OSTA Identifier Suffix macros from ecma_167.h to osta_udf.h udf: Fix spelling in EXT_NEXT_EXTENT_ALLOCDESCS ext2: Adjust indentation in ext2_fill_super quota: avoid time_t in v1_disk_dqblk definition reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling reiserfs: Fix memory leak of journal device string ext2: set proper errno in error case of ext2_fill_super()
2020-01-30Merge tag 'xfs-5.6-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "In this release we clean out the last of the old 32-bit timestamp code, fix a number of bugs and memory corruptions on 32-bit platforms, and a refactoring of some of the extended attribute code. I think I'll be back next week with some refactoring of how the XFS buffer code returns error codes, however I prefer to hold onto that for another week to let it soak a while longer Summary: - Get rid of compat_time_t - Convert time_t to time64_t in quota code - Remove shadow variables - Prevent ATTR_ flag misuse in the attrmulti ioctls - Clean out strlen in the attr code - Remove some bogus asserts - Fix various file size limit calculation errors with 32-bit kernels - Pack xfs_dir2_sf_entry_t to fix build errors on arm oabi - Fix nowait inode locking calls for directio aio reads - Fix memory corruption bugs when invalidating remote xattr value buffers - Streamline remote attr value removal - Make the buffer log format size consistent across platforms - Strengthen buffer log format size checking - Fix messed up return types of xfs_inode_need_cow - Fix some unused variable warnings" * tag 'xfs-5.6-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (24 commits) xfs: remove unused variable 'done' xfs: fix uninitialized variable in xfs_attr3_leaf_inactive xfs: change return value of xfs_inode_need_cow to int xfs: check log iovec size to make sure it's plausibly a buffer log format xfs: make struct xfs_buf_log_format have a consistent size xfs: complain if anyone tries to create a too-large buffer log item xfs: clean up xfs_buf_item_get_format return value xfs: streamline xfs_attr3_leaf_inactive xfs: fix memory corruption during remote attr value buffer invalidation xfs: refactor remote attr value buffer invalidation xfs: fix IOCB_NOWAIT handling in xfs_file_dio_aio_read xfs: Add __packed to xfs_dir2_sf_entry_t definition xfs: fix s_maxbytes computation on 32-bit kernels xfs: truncate should remove all blocks, not just to the end of the page cache xfs: introduce XFS_MAX_FILEOFF xfs: remove bogus assertion when online repair isn't enabled xfs: Remove all strlen in all xfs_attr_* functions for attr names. xfs: fix misuse of the XFS_ATTR_INCOMPLETE flag xfs: also remove cached ACLs when removing the underlying attr xfs: reject invalid flags combinations in XFS_IOC_ATTRMULTI_BY_HANDLE ...
2020-01-30Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge window, we've added some performance improvements in how we handle inode locking in the read/write paths, and improving the performance of Direct I/O overwrites. We also now record the error code which caused the first and most recent ext4_error() report in the superblock, to make it easier to root cause problems in production systems. There are also many of the usual cleanups and miscellaneous bug fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (49 commits) jbd2: clean __jbd2_journal_abort_hard() and __journal_abort_soft() jbd2: make sure ESHUTDOWN to be recorded in the journal superblock ext4, jbd2: ensure panic when aborting with zero errno jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record jbd2_seq_info_next should increase position index jbd2: remove pointless assertion in __journal_remove_journal_head ext4,jbd2: fix comment and code style jbd2: delete the duplicated words in the comments ext4: fix extent_status trace points ext4: fix symbolic enum printing in trace output ext4: choose hardlimit when softlimit is larger than hardlimit in ext4_statfs_project() ext4: fix race conditions in ->d_compare() and ->d_hash() ext4: make dioread_nolock the default ext4: fix extent_status fragmentation for plain files jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal ext4: drop ext4_kvmalloc() ext4: Add EXT4_IOC_FSGETXATTR/EXT4_IOC_FSSETXATTR to compat_ioctl ext4: remove unused macro MPAGE_DA_EXTENT_TAIL ext4: add missing braces in ext4_ext_drop_refs() ext4: fix some nonstandard indentation in extents.c ...
2020-01-30cifs: fix soft mounts hanging in the reconnect codeRonnie Sahlberg
RHBZ: 1795429 In recent DFS updates we have a new variable controlling how many times we will retry to reconnect the share. If DFS is not used, then this variable is initialized to 0 in: static inline int dfs_cache_get_nr_tgts(const struct dfs_cache_tgt_list *tl) { return tl ? tl->tl_numtgts : 0; } This means that in the reconnect loop in smb2_reconnect() we will immediately wrap retries to -1 and never actually get to pass this conditional: if (--retries) continue; The effect is that we no longer reach the point where we fail the commands with -EHOSTDOWN and basically the kernel threads are virtually hung and unkillable. Fixes: a3a53b7603798fd8 (cifs: Add support for failover in smb2_reconnect()) Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org>
2020-01-30RDMA/core: Make the entire API tree staticJason Gunthorpe
Compilation of mlx5 driver without CONFIG_INFINIBAND_USER_ACCESS generates the following error. on x86_64: ld: drivers/infiniband/hw/mlx5/main.o: in function `mlx5_ib_handler_MLX5_IB_METHOD_VAR_OBJ_ALLOC': main.c:(.text+0x186d): undefined reference to `ib_uverbs_get_ucontext_file' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x2480): undefined reference to `uverbs_idr_class' ld: drivers/infiniband/hw/mlx5/main.o:(.rodata+0x24d8): undefined reference to `uverbs_destroy_def_handler' This is happening because some parts of the UAPI description are not static. This is a hold over from earlier code that relied on struct pointers to refer to object types, now object types are referenced by number. Remove the unused globals and add statics to the remaining UAPI description elements. Remove the redundent #ifdefs around mlx5_ib_*defs and obsolete mlx5_ib_get_devx_tree(). The compiler now trims alot more unused code, including the above problematic definitions when !CONFIG_INFINIBAND_USER_ACCESS. Fixes: 7be76bef320b ("IB/mlx5: Introduce VAR object and its alloc/destroy methods") Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-30Merge branch 'cve-2019-3016' into kvm-next-5.6Paolo Bonzini
From Boris Ostrovsky: The KVM hypervisor may provide a guest with ability to defer remote TLB flush when the remote VCPU is not running. When this feature is used, the TLB flush will happen only when the remote VPCU is scheduled to run again. This will avoid unnecessary (and expensive) IPIs. Under certain circumstances, when a guest initiates such deferred action, the hypervisor may miss the request. It is also possible that the guest may mistakenly assume that it has already marked remote VCPU as needing a flush when in fact that request had already been processed by the hypervisor. In both cases this will result in an invalid translation being present in a vCPU, potentially allowing accesses to memory locations in that guest's address space that should not be accessible. Note that only intra-guest memory is vulnerable. The five patches address both of these problems: 1. The first patch makes sure the hypervisor doesn't accidentally clear a guest's remote flush request 2. The rest of the patches prevent the race between hypervisor acknowledging a remote flush request and guest issuing a new one. Conflicts: arch/x86/kvm/x86.c [move from kvm_arch_vcpu_free to kvm_arch_vcpu_destroy]
2020-01-30x86/KVM: Clean up host's steal time structureBoris Ostrovsky
Now that we are mapping kvm_steal_time from the guest directly we don't need keep a copy of it in kvm_vcpu_arch.st. The same is true for the stime field. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missedBoris Ostrovsky
There is a potential race in record_steal_time() between setting host-local vcpu->arch.st.steal.preempted to zero (i.e. clearing KVM_VCPU_PREEMPTED) and propagating this value to the guest with kvm_write_guest_cached(). Between those two events the guest may still see KVM_VCPU_PREEMPTED in its copy of kvm_steal_time, set KVM_VCPU_FLUSH_TLB and assume that hypervisor will do the right thing. Which it won't. Instad of copying, we should map kvm_steal_time and that will guarantee atomicity of accesses to @preempted. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30x86/kvm: Cache gfn to pfn translationBoris Ostrovsky
__kvm_map_gfn()'s call to gfn_to_pfn_memslot() is * relatively expensive * in certain cases (such as when done from atomic context) cannot be called Stashing gfn-to-pfn mapping should help with both cases. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30x86/kvm: Introduce kvm_(un)map_gfn()Boris Ostrovsky
kvm_vcpu_(un)map operates on gfns from any current address space. In certain cases we want to make sure we are not mapping SMRAM and for that we can use kvm_(un)map_gfn() that we are introducing in this patch. This is part of CVE-2019-3016. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bitBoris Ostrovsky
kvm_steal_time_set_preempted() may accidentally clear KVM_VCPU_FLUSH_TLB bit if it is called more than once while VCPU is preempted. This is part of CVE-2019-3016. (This bug was also independently discovered by Jim Mattson <jmattson@google.com>) Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-30Merge tag 'kvm-ppc-next-5.6-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second KVM PPC update for 5.6 * Fix compile warning on 32-bit machines * Fix locking error in secure VM support
2020-01-30Merge tag 'kvmarm-5.6' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.6 - Fix MMIO sign extension - Fix HYP VA tagging on tag space exhaustion - Fix PSTATE/CPSR handling when generating exception - Fix MMU notifier's advertizing of young pages - Fix poisoned page handling - Fix PMU SW event handling - Fix TVAL register access - Fix AArch32 external abort injection - Fix ITS unmapped collection handling - Various cleanups
2020-01-30Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Davbe Airlie: "This is the main pull request for graphics for 5.6. Usual selection of changes all over. I've got one outstanding vmwgfx pull that touches mm so kept it separate until after all of this lands. I'll try and get it to you soon after this, but it might be early next week (nothing wrong with code, just my schedule is messy) This also hits a lot of fbdev drivers with some cleanups. Other notables: - vulkan timeline semaphore support added to syncobjs - nouveau turing secureboot/graphics support - Displayport MST display stream compression support Detailed summary: uapi: - dma-buf heaps added (and fixed) - command line add support for panel oreientation - command line allow overriding penguin count drm: - mipi dsi definition updates - lockdep annotations for dma_resv - remove dma-buf kmap/kunmap support - constify fb_ops in all fbdev drivers - MST fix for daisy chained hotplug- - CTA-861-G modes with VIC >= 193 added - fix drm_panel_of_backlight export - LVDS decoder support - more device based logging support - scanline alighment for dumb buffers - MST DSC helpers scheduler: - documentation fixes - job distribution improvements panel: - Logic PD type 28 panel support - Jimax8729d MIPI-DSI - igenic JZ4770 - generic DSI devicetree bindings - sony acx424AKP panel - Leadtek LTK500HD1829 - xinpeng XPP055C272 - AUO B116XAK01 - GiantPlus GPM940B0 - BOE NV140FHM-N49 - Satoz SAT050AT40H12R2 - Sharp LS020B1DD01D panels. ttm: - use blocking WW lock i915: - hw/uapi state separation - Lock annotation improvements - selftest improvements - ICL/TGL DSI VDSC support - VBT parsing improvments - Display refactoring - DSI updates + fixes - HDCP 2.2 for CFL - CML PCI ID fixes - GLK+ fbc fix - PSR fixes - GEN/GT refactor improvments - DP MST fixes - switch context id alloc to xarray - workaround updates - LMEM debugfs support - tiled monitor fixes - ICL+ clock gating programming removed - DP MST disable sequence fixed - LMEM discontiguous object maps - prefaulting for discontiguous objects - use LMEM for dumb buffers if possible - add LMEM mmap support amdgpu: - enable sync object timelines for vulkan - MST atomic routines - enable MST DSC support - add DMCUB display microengine support - DC OEM i2c support - Renoir DC fixes - Initial HDCP 2.x support - BACO support for Arcturus - Use BACO for runtime PM power save - gfxoff on navi10 - gfx10 golden updates and fixes - DCN support on POWER - GFXOFF for raven1 refresh - MM engine idle handlers cleanup - 10bpc EDP panel fixes - renoir watermark fixes - SR-IOV fixes - Arcturus VCN fixes - GDDR6 training fixes - freesync fixes - Pollock support amdkfd: - unify more codepath with amdgpu - use KIQ to setup HIQ rather than MMIO radeon: - fix vma fault handler race - PPC DMA fix - register check fixes for r100/r200 nouveau: - mmap_sem vs dma_resv fix - rewrite the ACR secure boot code for Turing - TU10x graphics engine support (TU11x pending) - Page kind mapping for turing - 10-bit LUT support - GP10B Tegra fixes - HD audio regression fix hisilicon/hibmc: - use generic fbdev code and helpers rockchip: - dsi/px30 support virtio: - fb damage support - static some functions vc4: - use dma_resv lock wrappers msm: - use dma_resv lock wrappers - sc7180 display + DSI support - a618 support - UBWC support improvements vmwgfx: - updates + new logging uapi exynos: - enable/disable callback cleanups etnaviv: - use dma_resv lock wrappers atmel-hlcdc: - clock fixes mediatek: - cmdq support - non-smooth cursor fixes - ctm property support sun4i: - suspend support - A64 mipi dsi support rcar-du: - Color management module support - LVDS encoder dual-link support - R8A77980 support analogic: - add support for an6345 ast: - atomic modeset support - primary plane garbage fix arcgpu: - fixes for fourcc handling tegra: - minor fixes and improvments mcde: - vblank support meson: - OSD1 plane AFBC commit gma500: - add pageflip support - reomve global drm_dev komeda: - tweak debugfs output - d32 support - runtime PM suppotr udl: - use generic shmem helpers - cleanup and fixes" * tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm: (1998 commits) drm/nouveau/fb/gp102-: allow module to load even when scrubber binary is missing drm/nouveau/acr: return error when registering LSF if ACR not supported drm/nouveau/disp/gv100-: not all channel types support reporting error codes drm/nouveau/disp/nv50-: prevent oops when no channel method map provided drm/nouveau: support synchronous pushbuf submission drm/nouveau: signal pending fences when channel has been killed drm/nouveau: reject attempts to submit to dead channels drm/nouveau: zero vma pointer even if we only unreference it rather than free drm/nouveau: Add HD-audio component notifier support drm/nouveau: fix build error without CONFIG_IOMMU_API drm/nouveau/kms/nv04: remove set but not used variable 'width' drm/nouveau/kms/nv50: remove set but not unused variable 'nv_connector' drm/nouveau/mmu: fix comptag memory leak drm/nouveau/gr/gp10b: Use gp100_grctx and gp100_gr_zbc drm/nouveau/pmu/gm20b,gp10b: Fix Falcon bootstrapping drm/exynos: Rename Exynos to lowercase drm/exynos: change callback names drm/mst: Don't do atomic checks over disabled managers drm/amdgpu: add the lost mutex_init back drm/amd/display: skip opp blank or unblank if test pattern enabled ...
2020-01-30Merge tag 'for-v5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply and reset updates from Sebastian Reichel: "Core: - Add battery internal resistance temperature table support Drivers: - sc27xx: Optimize the battery resistance with measuring temperature - max17042-battery: Add MAX17055 support - bq25890-charger: Add support of BQ25892 and BQ25896 chips - misc fixes" * tag 'for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (44 commits) power: supply: ipaq_micro_battery: remove unneeded semicolon power: supply: bq25890_charger: fix incorrect error return when bq25890_field_read fails power: supply: axp20x_usb_power: Only poll while offline power: supply: axp20x_usb_power: Add wakeup control power: supply: axp20x_usb_power: Allow offlining power: supply: axp20x_usb_power: Use a match structure power: suppy: ucs1002: Make the symbol 'ucs1002_regulator_enable' static power: reset: at91-poweroff: use proper master clock register offset power: reset: at91-poweroff: introduce struct shdwc_reg_config power: supply: bq25890_charger: Add DT and I2C ids for all supported chips dt-bindings: Add new chips to bq25890 binding documentation power: supply: bq25890_charger: Add support of BQ25892 and BQ25896 chips power: supply: core: Update sysfs-class-power ABI document power: supply: sbs-battery: Fix a signedness bug in sbs_get_battery_capacity() power: supply: ltc2941-battery-gauge: fix use-after-free power: supply: max17040: Correct IRQ wake handling power: supply: axp20x_usb_power: Remove unused device_node power: supply: axp20x_ac_power: Add wakeup control power: supply: axp20x_ac_power: Allow offlining power: supply: axp20x_ac_power: Fix reporting online status ...
2020-01-30Merge tag 'devicetree-for-5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: - Update dtc to upstream v1.5.1-22-gc40aeb60b47a (plus 1 revert) - Fix for DMA coherent devices on Power - Rework and simplify the DT phandle cache code - DT schema conversions for LEDS, gpio-leds, STM32 dfsdm, STM32 UART, STM32 ROMEM, STM32 watchdog, STM32 DMAs, STM32 mlahb, STM32 RTC, STM32 RCC, STM32 syscon, rs485, Renesas rCar CSI2, Faraday FTIDE010, DWC2, Arm idle-states, Allwinner legacy resets, PRCM and clocks, Allwinner H6 OPP, Allwinner AHCI, Allwinner MBUS, Allwinner A31 CSI, Allwinner h/w codec, Allwinner A10 system ctrl, Allwinner SRAM, Allwinner USB PHY, Renesas CEU, generic PCI host, Arm Versatile PCI - New binding schemas for SATA and PATA controllers, TI and Infineon VR controllers, MAX31730 - New compatible strings for i.MX8QM, WCN3991, renesas,r8a77961-wdt, renesas,etheravb-r8a77961 - Add USB 'super-speed-plus' as a documented speed - Vendor prefixes for broadmobi, calaosystems, kam, and mps - Clean-up the multiple flavors of ST-Ericsson vendor prefixes * tag 'devicetree-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (66 commits) scripts/dtc: Revert "yamltree: Ensure consistent bracketing of properties with phandles" of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc dt-bindings: leds: Convert gpio-leds to DT schema dt-bindings: leds: Convert common LED binding to schema dt-bindings: PCI: Convert generic host binding to DT schema dt-bindings: PCI: Convert Arm Versatile binding to DT schema dt-bindings: Be explicit about installing deps dt-bindings: stm32: convert dfsdm to json-schema dt-bindings: serial: Convert STM32 UART to json-schema dt-bindings: serial: Convert rs485 bindings to json-schema dt-bindings: timer: Use non-empty ranges in example dt-bindings: arm-boards: typo fix dt-bindings: Add TI and Infineon VR Controllers as trivial devices dt-binding: usb: add "super-speed-plus" dt-bindings: rcar-csi2: Convert bindings to json-schema dt-bindings: iio: adc: ad7606: Fix wrong maxItems value dt-bindings: Convert Faraday FTIDE010 to DT schema dt-bindings: Create DT bindings for PATA controllers dt-bindings: Create DT bindings for SATA controllers dt: bindings: add vendor prefix for Kamstrup A/S ...
2020-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Various mptcp fixupes from Florian Westphal and Geery Uytterhoeven. 2) Don't clear the node/port GUIDs after we've assigned the correct values to them. From Leon Romanovsky. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: net/core: Do not clear VF index for node/port GUIDs query mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6 net: drop_monitor: Use kstrdup udp: document udp_rcv_segment special case for looped packets mptcp: MPTCP_HMAC_TEST should depend on MPTCP mptcp: Fix incorrect IPV6 dependency check Revert "MAINTAINERS: mptcp@ mailing list is moderated" mptcp: handle tcp fallback when using syn cookies mptcp: avoid a lockdep splat when mcast group was joined mptcp: fix panic on user pointer access mptcp: defer freeing of cached ext until last moment net: mvneta: fix XDP support if sw bm is used as fallback sch_choke: Use kvcalloc mptcp: Fix build with PROC_FS disabled. MAINTAINERS: mptcp@ mailing list is moderated
2020-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ideLinus Torvalds
Pull IDE updates from David Miller: 1) Fix mem region name in tx4949ide driver, from Christophe JAILLET. 2) Make drive->dn read only, it should not be changeable by users. From Dan Carpenter. 3) Several cast fixups from Krzysztof Kozlowski. There is also going to be a removal of a now unused IDE driver, but that will come via the MIPS tree. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide: ide: make drive->dn read only ide: serverworks: potential overflow in svwks_set_pio_mode() cmd64x: potential buffer overflow in cmd64x_program_timings() ide: remove unneeded header include path to drivers/ide ide: qd65xx: Fix cast to pointer from integer of different size ide: ht6560b: Fix cast to pointer from integer of different size ide: remove set but not used variable 'hwif' ide: remove unnecessary touch_softlockup_watchdog ide: tx4939ide: Fix the name used in a 'devm_request_mem_region()' call ide: Use dev_get_drvdata where possible
2020-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc updates from David Miller: 1) Add a proper .exit.data section. 2) Fix ipc64_perm type definition, from Arnd Bergmann. 3) Support folded p4d page tables on sparc64, from Mike Rapport. 4) Remove uses of struct timex, also from Arnd Bergmann. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: y2038: sparc: remove use of struct timex sparc64: add support for folded p4d page tables sparc/console: kill off obsolete declarations sparc32: fix struct ipc64_perm type definition sparc32, leon: Stop adding vendor and device id to prom ambapp path components sparc: Add .exit.data section. sparc: remove unneeded uapi/asm/statfs.h
2020-01-30Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 KVM fix from Catalin Marinas: "Set the correct MDCR_EL2 register value on the first run of a vCPU" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: KVM: arm64: Write arch.mdcr_el2 changes since last vcpu_load on VHE
2020-01-30net/core: Do not clear VF index for node/port GUIDs queryLeon Romanovsky
VF numbers were assigned to node_guid and port_guid, but cleared right before such query calls were issued. It caused to return node/port GUIDs of VF index 0 for all VFs. Fixes: 30aad41721e0 ("net/core: Add support for getting VF GUIDs") Reported-by: Adrian Chiris <adrianc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30y2038: sparc: remove use of struct timexArnd Bergmann
'struct timex' is one of the last users of 'struct timeval' and is only referenced in one place in the kernel any more, to convert the user space timex into the kernel-internal version on sparc64, with a different tv_usec member type. As a preparation for hiding the time_t definition and everything using that in the kernel, change the implementation once more to only convert the timeval member, and then enclose the struct definition in an #ifdef. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Julian Calaby <julian.calaby@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30sparc64: add support for folded p4d page tablesMike Rapoport
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30ide: make drive->dn read onlyDan Carpenter
The IDE core always sets ->dn correctly so changing it is never required. Setting it to a different value than assigned by IDE core is very likely to result in data corruption (due to wrong transfer timings being set on the controller etc.) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Tested-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6Geert Uytterhoeven
If CONFIG_MPTCP=y, CONFIG_MPTCP_IPV6=n, and CONFIG_IPV6=m: ERROR: "mptcp_handle_ipv6_mapped" [net/ipv6/ipv6.ko] undefined! This does not happen if CONFIG_MPTCP_IPV6=y, as CONFIG_MPTCP_IPV6 selects CONFIG_IPV6, and thus forces CONFIG_IPV6 builtin. As exporting a symbol for an empty function would be a bit wasteful, fix this by providing a dummy version of mptcp_handle_ipv6_mapped() for the CONFIG_MPTCP_IPV6=n case. Rename mptcp_handle_ipv6_mapped() to mptcpv6_handle_mapped(), to make it clear this is a pure-IPV6 function, just like mptcpv6_init(). Fixes: cec37a6e41aae7bf ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30net: drop_monitor: Use kstrdupJoe Perches
Convert the equivalent but rather odd uses of kmemdup with __GFP_ZERO to the more common kstrdup and avoid unnecessary zeroing of copied over memory. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30udp: document udp_rcv_segment special case for looped packetsWillem de Bruijn
Commit 6cd021a58c18a ("udp: segment looped gso packets correctly") fixes an issue with rare udp gso multicast packets looped onto the receive path. The stable backport makes the narrowest change to target only these packets, when needed. As opposed to, say, expanding __udp_gso_segment, which is harder to reason to be free from unintended side-effects. But the resulting code is hardly self-describing. Document its purpose and rationale. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30mptcp: MPTCP_HMAC_TEST should depend on MPTCPGeert Uytterhoeven
As the MPTCP HMAC test is integrated into the MPTCP code, it can be built only when MPTCP is enabled. Hence when MPTCP is disabled, asking the user if the test code should be enabled is futile. Wrap the whole block of MPTCP-specific config options inside a check for MPTCP. While at it, drop the "default n" for MPTCP_HMAC_TEST, as that is the default anyway. Fixes: 65492c5a6ab5df50 ("mptcp: move from sha1 (v0) to sha256 (v1)") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30mptcp: Fix incorrect IPV6 dependency checkGeert Uytterhoeven
If CONFIG_MPTCP=y, CONFIG_MPTCP_IPV6=n, and CONFIG_IPV6=m: net/mptcp/protocol.o: In function `__mptcp_tcp_fallback': protocol.c:(.text+0x786): undefined reference to `inet6_stream_ops' Fix this by checking for CONFIG_MPTCP_IPV6 instead of CONFIG_IPV6, like is done in all other places in the mptcp code. Fixes: 8ab183deb26a3b79 ("mptcp: cope with later TCP fallback") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-30char: hpet: Fix out-of-bounds read bugGustavo A. R. Silva
Currently, there is an out-of-bounds read on array hpetp->hp_dev in the following for loop: 870 for (i = 0; i < hdp->hd_nirqs; i++) 871 hpetp->hp_dev[i].hd_hdwirq = hdp->hd_irq[i]; This is due to the recent change from one-element array to flexible-array member in struct hpets: 104 struct hpets { ... 113 struct hpet_dev hp_dev[]; 114 }; This change affected the total size of the dynamic memory allocation, decreasing it by one time the size of struct hpet_dev. Fix this by adjusting the allocation size when calling struct_size(). Fixes: 987f028b8637c ("char: hpet: Use flexible-array member") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Eric Biggers <ebiggers@kernel.org> Link: https://lore.kernel.org/r/20200129022613.GA24281@embeddedor.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-30Merge branch 'linux-5.6' of git://github.com/skeggsb/linux into drm-nextDave Airlie
A couple of OOPS fixes, fixes for TU1xx if firmware isn't available, better behaviour in the face of GPU faults, and a patch to make HD audio work again after runpm changes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Ben Skeggs <skeggsb@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv4xcLF6Ahh7UYEesn-wBEksd2da+ghusBAdODMrH7Sz2A@mail.gmail.com
2020-01-29Merge tag 'for-linus-hmm' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull mmu_notifier updates from Jason Gunthorpe: "This small series revises the names in mmu_notifier to make the code clearer and more readable" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/mmu_notifiers: Use 'interval_sub' as the variable for mmu_interval_notifier mm/mmu_notifiers: Use 'subscription' as the variable name for mmu_notifier mm/mmu_notifier: Rename struct mmu_notifier_mm to mmu_notifier_subscriptions
2020-01-29Merge tag 'threads-v5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread management updates from Christian Brauner: "Sargun Dhillon over the last cycle has worked on the pidfd_getfd() syscall. This syscall allows for the retrieval of file descriptors of a process based on its pidfd. A task needs to have ptrace_may_access() permissions with PTRACE_MODE_ATTACH_REALCREDS (suggested by Oleg and Andy) on the target. One of the main use-cases is in combination with seccomp's user notification feature. As a reminder, seccomp's user notification feature was made available in v5.0. It allows a task to retrieve a file descriptor for its seccomp filter. The file descriptor is usually handed of to a more privileged supervising process. The supervisor can then listen for syscall events caught by the seccomp filter of the supervisee and perform actions in lieu of the supervisee, usually emulating syscalls. pidfd_getfd() is needed to expand its uses. There are currently two major users that wait on pidfd_getfd() and one future user: - Netflix, Sargun said, is working on a service mesh where users should be able to connect to a dns-based VIP. When a user connects to e.g. 1.2.3.4:80 that runs e.g. service "foo" they will be redirected to an envoy process. This service mesh uses seccomp user notifications and pidfd to intercept all connect calls and instead of connecting them to 1.2.3.4:80 connects them to e.g. 127.0.0.1:8080. - LXD uses the seccomp notifier heavily to intercept and emulate mknod() and mount() syscalls for unprivileged containers/processes. With pidfd_getfd() more uses-cases e.g. bridging socket connections will be possible. - The patchset has also seen some interest from the browser corner. Right now, Firefox is using a SECCOMP_RET_TRAP sandbox managed by a broker process. In the future glibc will start blocking all signals during dlopen() rendering this type of sandbox impossible. Hence, in the future Firefox will switch to a seccomp-user-nofication based sandbox which also makes use of file descriptor retrieval. The thread for this can be found at https://sourceware.org/ml/libc-alpha/2019-12/msg00079.html With pidfd_getfd() it is e.g. possible to bridge socket connections for the supervisee (binding to a privileged port) and taking actions on file descriptors on behalf of the supervisee in general. Sargun's first version was using an ioctl on pidfds but various people pushed for it to be a proper syscall which he duely implemented as well over various review cycles. Selftests are of course included. I've also added instructions how to deal with merge conflicts below. There's also a small fix coming from the kernel mentee project to correctly annotate struct sighand_struct with __rcu to fix various sparse warnings. We've received a few more such fixes and even though they are mostly trivial I've decided to postpone them until after -rc1 since they came in rather late and I don't want to risk introducing build warnings. Finally, there's a new prctl() command PR_{G,S}ET_IO_FLUSHER which is needed to avoid allocation recursions triggerable by storage drivers that have userspace parts that run in the IO path (e.g. dm-multipath, iscsi, etc). These allocation recursions deadlock the device. The new prctl() allows such privileged userspace components to avoid allocation recursions by setting the PF_MEMALLOC_NOIO and PF_LESS_THROTTLE flags. The patch carries the necessary acks from the relevant maintainers and is routed here as part of prctl() thread-management." * tag 'threads-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim sched.h: Annotate sighand_struct with __rcu test: Add test for pidfd getfd arch: wire up pidfd_getfd syscall pid: Implement pidfd_getfd syscall vfs, fdtable: Add fget_task helper
2020-01-29Merge tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: - Support for various new opcodes (fallocate, openat, close, statx, fadvise, madvise, openat2, non-vectored read/write, send/recv, and epoll_ctl) - Faster ring quiesce for fileset updates - Optimizations for overflow condition checking - Support for max-sized clamping - Support for probing what opcodes are supported - Support for io-wq backend sharing between "sibling" rings - Support for registering personalities - Lots of little fixes and improvements * tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block: (64 commits) io_uring: add support for epoll_ctl(2) eventpoll: support non-blocking do_epoll_ctl() calls eventpoll: abstract out epoll_ctl() handler io_uring: fix linked command file table usage io_uring: support using a registered personality for commands io_uring: allow registering credentials io_uring: add io-wq workqueue sharing io-wq: allow grabbing existing io-wq io_uring/io-wq: don't use static creds/mm assignments io-wq: make the io_wq ref counted io_uring: fix refcounting with batched allocations at OOM io_uring: add comment for drain_next io_uring: don't attempt to copy iovec for READ/WRITE io_uring: honor IOSQE_ASYNC for linked reqs io_uring: prep req when do IOSQE_ASYNC io_uring: use labeled array init in io_op_defs io_uring: optimise sqe-to-req flags translation io_uring: remove REQ_F_IO_DRAINED io_uring: file switch work needs to get flushed on exit io_uring: hide uring_fd in ctx ...
2020-01-29Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This series is slightly unusual because it includes Arnd's compat ioctl tree here: 1c46a2cf2dbd Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue Excluding Arnd's changes, this is mostly an update of the usual drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas. There are a couple of core and base updates around error propagation and atomicity in the attribute container base we use for the SCSI transport classes. The rest is minor changes and updates" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (149 commits) scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity scsi: hisi_sas: Modify the file permissions of trigger_dump to write only scsi: hisi_sas: Replace magic number when handle channel interrupt scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock scsi: hisi_sas: use threaded irq to process CQ interrupts scsi: ufs: Use UFS device indicated maximum LU number scsi: ufs: Add max_lu_supported in struct ufs_dev_info scsi: ufs: Delete is_init_prefetch from struct ufs_hba scsi: ufs: Inline two functions into their callers scsi: ufs: Move ufshcd_get_max_pwr_mode() to ufshcd_device_params_init() scsi: ufs: Split ufshcd_probe_hba() based on its called flow scsi: ufs: Delete struct ufs_dev_desc scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails scsi: ufs-mediatek: enable low-power mode for hibern8 state scsi: ufs: export some functions for vendor usage scsi: ufs-mediatek: add dbg_register_dump implementation scsi: qla2xxx: Fix a NULL pointer dereference in an error path scsi: qla1280: Make checking for 64bit support consistent scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1 ...
2020-01-29Merge tag 'for-5.6/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Fix DM core's potential for q->make_request_fn NULL pointer in the unlikely case that a DM device is created without a DM table and then accessed due to upper-layer userspace code or user error. - Fix DM thin-provisioning's metadata_pre_commit_callback to not use memory after it is free'd. Also refactor code to disallow changing the thin-pool's data device once in use -- doing so guarantees smae lifetime of pool's data device relative to the pool metadata. - Fix DM space maps used by DM thinp and DM cache to avoid reuse of a already used block. This race was identified with extremely heavy snapshot use in the context of DM thin provisioning. - Fix DM raid's table status relative to an active rebuild. - Fix DM crypt to use GFP_NOIO rather than GFP_NOFS in call to skcipher_request_alloc(). Also fix benbi IV constructor crash if used in authenticated mode. - Add DM crypt support for Elephant diffuser to allow for Bitlocker compatibility. - Fix DM verity target to not prefetch hash blocks for data that has already been verified. - Fix DM writecache's incorrect flush sequence during commit when in SSD mode. - Improve DM writecache's sequential write performance on SSDs. - Add DM zoned target support for zone sizes smaller than 128MiB. - Add DM multipath 'queue_if_no_path_timeout_secs' module param to allow timeout if path isn't reinstated. This allows users a kernel safety-net against IO hanging indefinitely, due to no active paths, that has historically only been provided by multipathd userspace. - Various DM code cleanups to use true/false rather than 1/0, a variable rename in dm-dust, and fix for a math error in comment for DM thin metadata's ondisk format. * tag 'for-5.6/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (21 commits) dm: fix potential for q->make_request_fn NULL pointer dm writecache: improve performance of large linear writes on SSDs dm mpath: Add timeout mechanism for queue_if_no_path dm thin: change data device's flush_bio to be member of struct pool dm thin: don't allow changing data device during thin-pool reload dm thin: fix use-after-free in metadata_pre_commit_callback dm thin metadata: use pool locking at end of dm_pool_metadata_close dm writecache: fix incorrect flush sequence when doing SSD mode commit dm crypt: fix benbi IV constructor crash if used in authenticated mode dm crypt: Implement Elephant diffuser for Bitlocker compatibility dm space map common: fix to ensure new block isn't already in use dm verity: don't prefetch hash blocks for already-verified data dm crypt: fix GFP flags passed to skcipher_request_alloc() dm thin metadata: Fix trivial math error in on-disk format documentation dm thin metadata: use true/false for bool variable dm snapshot: use true/false for bool variable dm bio prison v2: use true/false for bool variable dm mpath: use true/false for bool variable dm zoned: support zone sizes smaller than 128MiB dm raid: table line rebuild status fixes ...