summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-07-26Btrfs: add missing inode version, ctime and mtime updates when punching holeFilipe Manana
commit 179006688a7e888cbff39577189f2e034786d06a upstream. If the range for which we are punching a hole covers only part of a page, we end up updating the inode item but we skip the update of the inode's iversion, mtime and ctime. Fix that by ensuring we update those properties of the inode. A patch for fstests test case generic/059 that tests this as been sent along with this fix. Fixes: 2aaa66558172b0 ("Btrfs: add hole punching") Fixes: e8c1c76e804b18 ("Btrfs: add missing inode update when punching hole") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26Btrfs: fix fsync not persisting dentry deletions due to inode evictionsFilipe Manana
commit 803f0f64d17769071d7287d9e3e3b79a3e1ae937 upstream. In order to avoid searches on a log tree when unlinking an inode, we check if the inode being unlinked was logged in the current transaction, as well as the inode of its parent directory. When any of the inodes are logged, we proceed to delete directory items and inode reference items from the log, to ensure that if a subsequent fsync of only the inode being unlinked or only of the parent directory when the other is not fsync'ed as well, does not result in the entry still existing after a power failure. That check however is not reliable when one of the inodes involved (the one being unlinked or its parent directory's inode) is evicted, since the logged_trans field is transient, that is, it is not stored on disk, so it is lost when the inode is evicted and loaded into memory again (which is set to zero on load). As a consequence the checks currently being done by btrfs_del_dir_entries_in_log() and btrfs_del_inode_ref_in_log() always return true if the inode was evicted before, regardless of the inode having been logged or not before (and in the current transaction), this results in the dentry being unlinked still existing after a log replay if after the unlink operation only one of the inodes involved is fsync'ed. Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ touch /mnt/dir/foo $ xfs_io -c fsync /mnt/dir/foo # Keep an open file descriptor on our directory while we evict inodes. # We just want to evict the file's inode, the directory's inode must not # be evicted. $ ( cd /mnt/dir; while true; do :; done ) & $ pid=$! # Wait a bit to give time to background process to chdir to our test # directory. $ sleep 0.5 # Trigger eviction of the file's inode. $ echo 2 > /proc/sys/vm/drop_caches # Unlink our file and fsync the parent directory. After a power failure # we don't expect to see the file anymore, since we fsync'ed the parent # directory. $ rm -f $SCRATCH_MNT/dir/foo $ xfs_io -c fsync /mnt/dir <power failure> $ mount /dev/sdb /mnt $ ls /mnt/dir foo $ --> file still there, unlink not persisted despite explicit fsync on dir Fix this by checking if the inode has the full_sync bit set in its runtime flags as well, since that bit is set everytime an inode is loaded from disk, or for other less common cases such as after a shrinking truncate or failure to allocate extent maps for holes, and gets cleared after the first fsync. Also consider the inode as possibly logged only if it was last modified in the current transaction (besides having the full_fsync flag set). Fixes: 3a5f1d458ad161 ("Btrfs: Optimize btree walking while logging inodes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26Btrfs: fix data loss after inode eviction, renaming it, and fsync itFilipe Manana
commit d1d832a0b51dd9570429bb4b81b2a6c1759e681a upstream. When we log an inode, regardless of logging it completely or only that it exists, we always update it as logged (logged_trans and last_log_commit fields of the inode are updated). This is generally fine and avoids future attempts to log it from having to do repeated work that brings no value. However, if we write data to a file, then evict its inode after all the dealloc was flushed (and ordered extents completed), rename the file and fsync it, we end up not logging the new extents, since the rename may result in logging that the inode exists in case the parent directory was logged before. The following reproducer shows and explains how this can happen: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/dir $ touch /mnt/dir/foo $ touch /mnt/dir/bar # Do a direct IO write instead of a buffered write because with a # buffered write we would need to make sure dealloc gets flushed and # complete before we do the inode eviction later, and we can not do that # from user space with call to things such as sync(2) since that results # in a transaction commit as well. $ xfs_io -d -c "pwrite -S 0xd3 0 4K" /mnt/dir/bar # Keep the directory dir in use while we evict inodes. We want our file # bar's inode to be evicted but we don't want our directory's inode to # be evicted (if it were evicted too, we would not be able to reproduce # the issue since the first fsync below, of file foo, would result in a # transaction commit. $ ( cd /mnt/dir; while true; do :; done ) & $ pid=$! # Wait a bit to give time for the background process to chdir. $ sleep 0.1 # Evict all inodes, except the inode for the directory dir because it is # currently in use by our background process. $ echo 2 > /proc/sys/vm/drop_caches # fsync file foo, which ends up persisting information about the parent # directory because it is a new inode. $ xfs_io -c fsync /mnt/dir/foo # Rename bar, this results in logging that this inode exists (inode item, # names, xattrs) because the parent directory is in the log. $ mv /mnt/dir/bar /mnt/dir/baz # Now fsync baz, which ends up doing absolutely nothing because of the # rename operation which logged that the inode exists only. $ xfs_io -c fsync /mnt/dir/baz <power failure> $ mount /dev/sdb /mnt $ od -t x1 -A d /mnt/dir/baz 0000000 --> Empty file, data we wrote is missing. Fix this by not updating last_sub_trans of an inode when we are logging only that it exists and the inode was not yet logged since it was loaded from disk (full_sync bit set), this is enough to make btrfs_inode_in_log() return false for this scenario and make us log the inode. The logged_trans of the inode is still always setsince that alone is used to track if names need to be deleted as part of unlink operations. Fixes: 257c62e1bce03e ("Btrfs: avoid tree log commit when there are no changes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26PCI: qcom: Ensure that PERST is asserted for at least 100 msNiklas Cassel
commit 64adde31c8e996a6db6f7a1a4131180e363aa9f2 upstream. Currently, there is only a 1 ms sleep after asserting PERST. Reading the datasheets for different endpoints, some require PERST to be asserted for 10 ms in order for the endpoint to perform a reset, others require it to be asserted for 50 ms. Several SoCs using this driver uses PCIe Mini Card, where we don't know what endpoint will be plugged in. The PCI Express Card Electromechanical Specification r2.0, section 2.2, "PERST# Signal" specifies: "On power up, the deassertion of PERST# is delayed 100 ms (TPVPERL) from the power rails achieving specified operating limits." Add a sleep of 100 ms before deasserting PERST, in order to ensure that we are compliant with the spec. Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Stanimir Varbanov <svarbanov@mm-sol.com> Cc: stable@vger.kernel.org # 4.5+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26PCI: Do not poll for PME if the device is in D3coldMika Westerberg
commit 000dd5316e1c756a1c028f22e01d06a38249dd4d upstream. PME polling does not take into account that a device that is directly connected to the host bridge may go into D3cold as well. This leads to a situation where the PME poll thread reads from a config space of a device that is in D3cold and gets incorrect information because the config space is not accessible. Here is an example from Intel Ice Lake system where two PCIe root ports are in D3cold (I've instrumented the kernel to log the PMCSR register contents): [ 62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff [ 62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff Since 0xffff is interpreted so that PME is pending, the root ports will be runtime resumed. This repeats over and over again essentially blocking all runtime power management. Prevent this from happening by checking whether the device is in D3cold before its PME status is read. Fixes: 71a83bd727cc ("PCI/PM: add runtime PM support to PCIe port") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: 3.6+ <stable@vger.kernel.org> # v3.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26PCI: hv: Fix a use-after-free bug in hv_eject_device_work()Dexuan Cui
commit 4df591b20b80cb77920953812d894db259d85bd7 upstream. Fix a use-after-free in hv_eject_device_work(). Fixes: 05f151a73ec2 ("PCI: hv: Fix a memory leak in hv_eject_device_work()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26intel_th: pci: Add Ice Lake NNPI supportAlexander Shishkin
commit 4aa5aed2b6f267592705a526f57518a5d715b769 upstream. This adds Ice Lake NNPI support to the Intel(R) Trace Hub. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20190621161930.60785-5-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26RDMA/srp: Accept again source addresses that do not have a port numberBart Van Assche
commit bcef5b7215681250c4bf8961dfe15e9e4fef97d0 upstream. The function srp_parse_in() is used both for parsing source address specifications and for target address specifications. Target addresses must have a port number. Having to specify a port number for source addresses is inconvenient. Make sure that srp_parse_in() supports again parsing addresses with no port number. Cc: <stable@vger.kernel.org> Fixes: c62adb7def71 ("IB/srp: Fix IPv6 address parsing") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26block: Fix potential overflow in blk_report_zones()Damien Le Moal
commit 113ab72ed4794c193509a97d7c6d32a6886e1682 upstream. For large values of the number of zones reported and/or large zone sizes, the sector increment calculated with blk_queue_zone_sectors(q) * n in blk_report_zones() loop can overflow the unsigned int type used for the calculation as both "n" and blk_queue_zone_sectors() value are unsigned int. E.g. for a device with 256 MB zones (524288 sectors), overflow happens with 8192 or more zones reported. Changing the return type of blk_queue_zone_sectors() to sector_t, fixes this problem and avoids overflow problem for all other callers of this helper too. The same change is also applied to the bdev_zone_sectors() helper. Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26block: Allow mapping of vmalloc-ed buffersDamien Le Moal
commit b4c5875d36178e8df409bdce232f270cac89fafe upstream. To allow the SCSI subsystem scsi_execute_req() function to issue requests using large buffers that are better allocated with vmalloc() rather than kmalloc(), modify bio_map_kern() to allow passing a buffer allocated with vmalloc(). To do so, detect vmalloc-ed buffers using is_vmalloc_addr(). For vmalloc-ed buffers, flush the buffer using flush_kernel_vmap_range(), use vmalloc_to_page() instead of virt_to_page() to obtain the pages of the buffer, and invalidate the buffer addresses with invalidate_kernel_vmap_range() on completion of read BIOs. This last point is executed using the function bio_invalidate_vmalloc_pages() which is defined only if the architecture defines ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE, that is, if the architecture actually needs the invalidation done. Fixes: 515ce6061312 ("scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation") Fixes: e76239a3748c ("block: add a report_zones method") Cc: stable@vger.kernel.org Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26drm/edid: parse CEA blocks embedded in DisplayIDAndres Rodriguez
commit e28ad544f462231d3fd081a7316339359efbb481 upstream. DisplayID blocks allow embedding of CEA blocks. The payloads are identical to traditional top level CEA extension blocks, but the header is slightly different. This change allows the CEA parser to find a CEA block inside a DisplayID block. Additionally, it adds support for parsing the embedded CTA header. No further changes are necessary due to payload parity. This change fixes audio support for the Valve Index HMD. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.15 Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619180901.17901-1-andresx7@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCsKim Phillips
commit 2f217d58a8a086d3399fecce39fb358848e799c4 upstream. Fill in the L3 performance event select register ThreadMask bitfield, to enable per hardware thread accounting. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gary Hook <Gary.Hook@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liska <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/20190628215906.4276-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCsKim Phillips
commit 16f4641166b10e199f0d7b68c2c5f004fef0bda3 upstream. The following commit: d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") enables L3 PMC events for all threads and slices by writing 1's in 'ChL3PmcCfg' (L3 PMC PERF_CTL) register fields. Those bitfields overlap with high order event select bits in the Data Fabric PMC control register, however. So when a user requests raw Data Fabric events (-e amd_df/event=0xYYY/), the two highest order bits get inadvertently set, changing the counter select to events that don't exist, and for which no counts are read. This patch changes the logic to write the L3 masks only when dealing with L3 PMC counters. AMD Family 16h and below Northbridge (NB) counters were not affected. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gary Hook <Gary.Hook@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liska <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") Link: https://lkml.kernel.org/r/20190628215906.4276-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26perf/x86/intel: Fix spurious NMI on fixed counterKan Liang
commit e4557c1a46b0d32746bd309e1941914b5a6912b4 upstream. If a user first sample a PEBS event on a fixed counter, then sample a non-PEBS event on the same fixed counter on Icelake, it will trigger spurious NMI. For example: perf record -e 'cycles:p' -a perf record -e 'cycles' -a The error message for spurious NMI: [June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2. [ +0.000000] Do you have a strange power saving mode enabled? [ +0.000000] Dazed and confused, but trying to continue The bug was introduced by the following commit: commit 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") The commit moves the intel_pmu_pebs_disable() after intel_pmu_disable_fixed(), which returns immediately. The related bit of PEBS_ENABLE MSR will never be cleared for the fixed counter. Then a non-PEBS event runs on the fixed counter, but the bit on PEBS_ENABLE is still set, which triggers spurious NMIs. Check and disable PEBS for fixed counters after intel_pmu_disable_fixed(). Reported-by: Yi, Ammy <ammy.yi@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") Link: https://lkml.kernel.org/r/20190625142135.22112-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26x86/boot: Fix memory leak in default_get_smp_config()David Rientjes
commit e74bd96989dd42a51a73eddb4a5510a6f5e42ac3 upstream. When default_get_smp_config() is called with early == 1 and mpf->feature1 is non-zero, mpf is leaked because the return path does not do early_memunmap(). Fix this and share a common exit routine. Fixes: 5997efb96756 ("x86/boot: Use memremap() to map the MPF and MPC data") Reported-by: Cfir Cohen <cfir@google.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907091942570.28240@chino.kir.corp.google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26rt2x00usb: fix rx queue hangSoeren Moch
commit 41a531ffa4c5aeb062f892227c00fabb3b4a9c91 upstream. Since commit ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is not running with interrupts disabled anymore. So this completion handler is not guaranteed to run completely before workqueue processing starts for the same queue entry. Be sure to set all other flags in the entry correctly before marking this entry ready for workqueue processing. This way we cannot miss error conditions that need to be signalled from the completion handler to the worker thread. Note that rt2x00usb_work_rxdone() processes all available entries, not only such for which queue_work() was called. This patch is similar to what commit df71c9cfceea ("rt2x00: fix order of entry flags modification") did for TX processing. This fixes a regression on a RT5370 based wifi stick in AP mode, which suddenly stopped data transmission after some period of heavy load. Also stopping the hanging hostapd resulted in the error message "ieee80211 phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush". Other operation modes are probably affected as well, this just was the used testcase. Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler") Cc: stable@vger.kernel.org # 4.20+ Signed-off-by: Soeren Moch <smoch@web.de> Acked-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-269p/virtio: Add cleanup path in p9_virtio_initYueHaibing
commit d4548543fc4ece56c6f04b8586f435fb4fd84c20 upstream. KASAN report this: BUG: unable to handle kernel paging request at ffffffffa0097000 PGD 3870067 P4D 3870067 PUD 3871063 PMD 2326e2067 PTE 0 Oops: 0000 [#1 CPU: 0 PID: 5340 Comm: modprobe Not tainted 5.1.0-rc7+ #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__list_add_valid+0x10/0x70 Code: c3 48 8b 06 55 48 89 e5 5d 48 39 07 0f 94 c0 0f b6 c0 c3 90 90 90 90 90 90 90 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 <48> 8b 32 48 39 f0 75 3a RSP: 0018:ffffc90000e23c68 EFLAGS: 00010246 RAX: ffffffffa00ad000 RBX: ffffffffa009d000 RCX: 0000000000000000 RDX: ffffffffa0097000 RSI: ffffffffa0097000 RDI: ffffffffa009d000 RBP: ffffc90000e23c68 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0097000 R13: ffff888231797180 R14: 0000000000000000 R15: ffffc90000e23e78 FS: 00007fb215285540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0097000 CR3: 000000022f144000 CR4: 00000000000006f0 Call Trace: v9fs_register_trans+0x2f/0x60 [9pnet ? 0xffffffffa0087000 p9_virtio_init+0x25/0x1000 [9pnet_virtio do_one_initcall+0x6c/0x3cc ? kmem_cache_alloc_trace+0x248/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb214d8e839 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 RSP: 002b:00007ffc96554278 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000055e67eed2aa0 RCX: 00007fb214d8e839 RDX: 0000000000000000 RSI: 000055e67ce95c2e RDI: 0000000000000003 RBP: 000055e67ce95c2e R08: 0000000000000000 R09: 000055e67eed2aa0 R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 R13: 000055e67eeda500 R14: 0000000000040000 R15: 000055e67eed2aa0 Modules linked in: 9pnet_virtio(+) 9pnet gre rfkill vmw_vsock_virtio_transport_common vsock [last unloaded: 9pnet_virtio CR2: ffffffffa0097000 ---[ end trace 4a52bb13ff07b761 If register_virtio_driver() fails in p9_virtio_init, we should call v9fs_unregister_trans() to do cleanup. Link: http://lkml.kernel.org/r/20190430115942.41840-1-yuehaibing@huawei.com Cc: stable@vger.kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: b530cc794024 ("9p: add virtio transport") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-269p/xen: Add cleanup path in p9_trans_xen_initYueHaibing
commit 80a316ff16276b36d0392a8f8b2f63259857ae98 upstream. If xenbus_register_frontend() fails in p9_trans_xen_init, we should call v9fs_unregister_trans() to do cleanup. Link: http://lkml.kernel.org/r/20190430143933.19368-1-yuehaibing@huawei.com Cc: stable@vger.kernel.org Fixes: 868eb122739a ("xen/9pfs: introduce Xen 9pfs transport driver") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26xen/events: fix binding user event channels to cpusJuergen Gross
commit bce5963bcb4f9934faa52be323994511d59fd13c upstream. When binding an interdomain event channel to a vcpu via IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be bound, but the affinity of the associated IRQi must be changed, too. Otherwise the IRQ and the event channel won't be moved to another vcpu in case the original vcpu they were bound to is going offline. Cc: <stable@vger.kernel.org> # 4.13 Fixes: c48f64ab472389df ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26dm zoned: fix zone state management raceDamien Le Moal
commit 3b8cafdd5436f9298b3bf6eb831df5eef5ee82b6 upstream. dm-zoned uses the zone flag DMZ_ACTIVE to indicate that a zone of the backend device is being actively read or written and so cannot be reclaimed. This flag is set as long as the zone atomic reference counter is not 0. When this atomic is decremented and reaches 0 (e.g. on BIO completion), the active flag is cleared and set again whenever the zone is reused and BIO issued with the atomic counter incremented. These 2 operations (atomic inc/dec and flag set/clear) are however not always executed atomically under the target metadata mutex lock and this causes the warning: WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags)); in dmz_deactivate_zone() to be displayed. This problem is regularly triggered with xfstests generic/209, generic/300, generic/451 and xfs/077 with XFS being used as the file system on the dm-zoned target device. Similarly, xfstests ext4/303, ext4/304, generic/209 and generic/300 trigger the warning with ext4 use. This problem can be easily fixed by simply removing the DMZ_ACTIVE flag and managing the "ACTIVE" state by directly looking at the reference counter value. To do so, the functions dmz_activate_zone() and dmz_deactivate_zone() are changed to inline functions respectively calling atomic_inc() and atomic_dec(), while the dmz_is_active() macro is changed to an inline function calling atomic_read(). Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Reported-by: Masato Suzuki <masato.suzuki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26padata: use smp_mb in padata_reorder to avoid orphaned padata jobsDaniel Jordan
commit cf144f81a99d1a3928f90b0936accfd3f45c9a0a upstream. Testing padata with the tcrypt module on a 5.2 kernel... # modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3 # modprobe tcrypt mode=211 sec=1 ...produces this splat: INFO: task modprobe:10075 blocked for more than 120 seconds. Not tainted 5.2.0-base+ #16 modprobe D 0 10075 10064 0x80004080 Call Trace: ? __schedule+0x4dd/0x610 ? ring_buffer_unlock_commit+0x23/0x100 schedule+0x6c/0x90 schedule_timeout+0x3b/0x320 ? trace_buffer_unlock_commit_regs+0x4f/0x1f0 wait_for_common+0x160/0x1a0 ? wake_up_q+0x80/0x80 { crypto_wait_req } # entries in braces added by hand { do_one_aead_op } { test_aead_jiffies } test_aead_speed.constprop.17+0x681/0xf30 [tcrypt] do_test+0x4053/0x6a2b [tcrypt] ? 0xffffffffa00f4000 tcrypt_mod_init+0x50/0x1000 [tcrypt] ... The second modprobe command never finishes because in padata_reorder, CPU0's load of reorder_objects is executed before the unlocking store in spin_unlock_bh(pd->lock), causing CPU0 to miss CPU1's increment: CPU0 CPU1 padata_reorder padata_do_serial LOAD reorder_objects // 0 INC reorder_objects // 1 padata_reorder TRYLOCK pd->lock // failed UNLOCK pd->lock CPU0 deletes the timer before returning from padata_reorder and since no other job is submitted to padata, modprobe waits indefinitely. Add a pair of full barriers to guarantee proper ordering: CPU0 CPU1 padata_reorder padata_do_serial UNLOCK pd->lock smp_mb() LOAD reorder_objects INC reorder_objects smp_mb__after_atomic() padata_reorder TRYLOCK pd->lock smp_mb__after_atomic is needed so the read part of the trylock operation comes after the INC, as Andrea points out. Thanks also to Andrea for help with writing a litmus test. Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: <stable@vger.kernel.org> Cc: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-arch@vger.kernel.org Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26drm/nouveau/i2c: Enable i2c pads & busses during preinitLyude Paul
commit 7cb95eeea6706c790571042a06782e378b2561ea upstream. It turns out that while disabling i2c bus access from software when the GPU is suspended was a step in the right direction with: commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") We also ended up accidentally breaking the vbios init scripts on some older Tesla GPUs, as apparently said scripts can actually use the i2c bus. Since these scripts are executed before initializing any subdevices, we end up failing to acquire access to the i2c bus which has left a number of cards with their fan controllers uninitialized. Luckily this doesn't break hardware - it just means the fan gets stuck at 100%. This also means that we've always been using our i2c busses before initializing them during the init scripts for older GPUs, we just didn't notice it until we started preventing them from being used until init. It's pretty impressive this never caused us any issues before! So, fix this by initializing our i2c pad and busses during subdev pre-init. We skip initializing aux busses during pre-init, as those are guaranteed to only ever be used by nouveau for DP aux transactions. Signed-off-by: Lyude Paul <lyude@redhat.com> Tested-by: Marc Meledandri <m.meledandri@gmail.com> Fixes: 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ARM: dts: gemini: Set DIR-685 SPI CS as active lowLinus Walleij
commit f90b8fda3a9d72a9422ea80ae95843697f94ea4a upstream. The SPI to the display on the DIR-685 is active low, we were just saved by the SPI library enforcing active low on everything before, so set it as active low to avoid ambiguity. Link: https://lore.kernel.org/r/20190715202101.16060-1-linus.walleij@linaro.org Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26kconfig: fix missing choice values in auto.confMasahiro Yamada
commit 8e2442a5f86e1f77b86401fce274a7f622740bc4 upstream. Since commit 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing"), Kconfig creates include/config/auto.conf in the defconfig stage when it is missing. Joonas Kylmälä reported incorrect auto.conf generation under some circumstances. To reproduce it, apply the following diff: | --- a/arch/arm/configs/imx_v6_v7_defconfig | +++ b/arch/arm/configs/imx_v6_v7_defconfig | @@ -345,14 +345,7 @@ CONFIG_USB_CONFIGFS_F_MIDI=y | CONFIG_USB_CONFIGFS_F_HID=y | CONFIG_USB_CONFIGFS_F_UVC=y | CONFIG_USB_CONFIGFS_F_PRINTER=y | -CONFIG_USB_ZERO=m | -CONFIG_USB_AUDIO=m | -CONFIG_USB_ETH=m | -CONFIG_USB_G_NCM=m | -CONFIG_USB_GADGETFS=m | -CONFIG_USB_FUNCTIONFS=m | -CONFIG_USB_MASS_STORAGE=m | -CONFIG_USB_G_SERIAL=m | +CONFIG_USB_FUNCTIONFS=y | CONFIG_MMC=y | CONFIG_MMC_SDHCI=y | CONFIG_MMC_SDHCI_PLTFM=y And then, run: $ make ARCH=arm mrproper imx_v6_v7_defconfig You will see CONFIG_USB_FUNCTIONFS=y is correctly contained in the .config, but not in the auto.conf. Please note drivers/usb/gadget/legacy/Kconfig is included from a choice block in drivers/usb/gadget/Kconfig. So USB_FUNCTIONFS is a choice value. This is probably a similar situation described in commit beaaddb62540 ("kconfig: tests: test defconfig when two choices interact"). When sym_calc_choice() is called, the choice symbol forgets the SYMBOL_DEF_USER unless all of its choice values are explicitly set by the user. The choice symbol is given just one chance to recall it because set_all_choice_values() is called if SYMBOL_NEED_SET_CHOICE_VALUES is set. When sym_calc_choice() is called again, the choice symbol forgets it forever, since SYMBOL_NEED_SET_CHOICE_VALUES is a one-time aid. Hence, we cannot call sym_clear_all_valid() again and again. It is crazy to repeat set and unset of internal flags. However, we cannot simply get rid of "sym->flags &= flags | ~SYMBOL_DEF_USER;" Doing so would re-introduce the problem solved by commit 5d09598d488f ("kconfig: fix new choices being skipped upon config update"). To work around the issue, conf_write_autoconf() stopped calling sym_clear_all_valid(). conf_write() must be changed accordingly. Currently, it clears SYMBOL_WRITE after the symbol is written into the .config file. This is needed to prevent it from writing the same symbol multiple times in case the symbol is declared in two or more locations. I added the new flag SYMBOL_WRITTEN, to track the symbols that have been written. Anyway, this is a cheesy workaround in order to suppress the issue as far as defconfig is concerned. Handling of choices is totally broken. sym_clear_all_valid() is called every time a user touches a symbol from the GUI interface. To reproduce it, just add a new symbol drivers/usb/gadget/legacy/Kconfig, then touch around unrelated symbols from menuconfig. USB_FUNCTIONFS will disappear from the .config file. I added the Fixes tag since it is more fatal than before. But, this has been broken since long long time before, and still it is. We should take a closer look to fix this correctly somehow. Fixes: 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing") Cc: linux-stable <stable@vger.kernel.org> # 4.19+ Reported-by: Joonas Kylmälä <joonas.kylmala@iki.fi> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Joonas Kylmälä <joonas.kylmala@iki.fi> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26i3c: fix i2c and i3c scl rate by bus modeVitor Soares
commit ecc8fb54bd443bf69996d9d5ddb8d90a50f14936 upstream. Currently the I3C framework limits SCL frequency to FM speed when dealing with a mixed slow bus, even if all I2C devices are FM+ capable. The core was also not accounting for I3C speed limitations when operating in mixed slow mode and was erroneously using FM+ speed as the max I2C speed when operating in mixed fast mode. Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure") Signed-off-by: Vitor Soares <vitor.soares@synopsys.com> Cc: Boris Brezillon <bbrezillon@kernel.org> Cc: <stable@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys ↵Radoslaw Burny
inodes. commit 5ec27ec735ba0477d48c80561cc5e856f0c5dfaf upstream. Normally, the inode's i_uid/i_gid are translated relative to s_user_ns, but this is not a correct behavior for proc. Since sysctl permission check in test_perm is done against GLOBAL_ROOT_[UG]ID, it makes more sense to use these values in u_[ug]id of proc inodes. In other words: although uid/gid in the inode is not read during test_perm, the inode logically belongs to the root of the namespace. I have confirmed this with Eric Biederman at LPC and in this thread: https://lore.kernel.org/lkml/87k1kzjdff.fsf@xmission.com Consequences ============ Since the i_[ug]id values of proc nodes are not used for permissions checks, this change usually makes no functional difference. However, it causes an issue in a setup where: * a namespace container is created without root user in container - hence the i_[ug]id of proc nodes are set to INVALID_[UG]ID * container creator tries to configure it by writing /proc/sys files, e.g. writing /proc/sys/kernel/shmmax to configure shared memory limit Kernel does not allow to open an inode for writing if its i_[ug]id are invalid, making it impossible to write shmmax and thus - configure the container. Using a container with no root mapping is apparently rare, but we do use this configuration at Google. Also, we use a generic tool to configure the container limits, and the inability to write any of them causes a failure. History ======= The invalid uids/gids in inodes first appeared due to 81754357770e (fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns). However, AFAIK, this did not immediately cause any issues. The inability to write to these "invalid" inodes was only caused by a later commit 0bd23d09b874 (vfs: Don't modify inodes with a uid or gid unknown to the vfs). Tested: Used a repro program that creates a user namespace without any mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside. Before the change, it shows the overflow uid, with the change it's 0. The overflow uid indicates that the uid in the inode is not correct and thus it is not possible to open the file for writing. Link: http://lkml.kernel.org/r/20190708115130.250149-1-rburny@google.com Fixes: 0bd23d09b874 ("vfs: Don't modify inodes with a uid or gid unknown to the vfs") Signed-off-by: Radoslaw Burny <rburny@google.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Seth Forshee <seth.forshee@canonical.com> Cc: John Sperbeck <jsperbeck@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26signal: Correct namespace fixups of si_pid and si_uidEric W. Biederman
commit 7a0cf094944e2540758b7f957eb6846d5126f535 upstream. The function send_signal was split from __send_signal so that it would be possible to bypass the namespace logic based upon current[1]. As it turns out the si_pid and the si_uid fixup are both inappropriate in the case of kill_pid_usb_asyncio so move that logic into send_signal. It is difficult to arrange but possible for a signal with an si_code of SI_TIMER or SI_SIGIO to be sent across namespace boundaries. In which case tests for when it is ok to change si_pid and si_uid based on SI_FROMUSER are incorrect. Replace the use of SI_FROMUSER with a new test has_si_pid_and_used based on siginfo_layout. Now that the uid fixup is no longer present after expanding SEND_SIG_NOINFO properly calculate the si_uid that the target task needs to read. [1] 7978b567d315 ("signals: add from_ancestor_ns parameter to send_signal()") Cc: stable@vger.kernel.org Fixes: 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary") Fixes: 6b550f949594 ("user namespace: make signal.c respect user namespaces") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26signal/usb: Replace kill_pid_info_as_cred with kill_pid_usb_asyncioEric W. Biederman
commit 70f1b0d34bdf03065fe869e93cc17cad1ea20c4a upstream. The usb support for asyncio encoded one of it's values in the wrong field. It should have used si_value but instead used si_addr which is not present in the _rt union member of struct siginfo. The practical result of this is that on a 64bit big endian kernel when delivering a signal to a 32bit process the si_addr field is set to NULL, instead of the expected pointer value. This issue can not be fixed in copy_siginfo_to_user32 as the usb usage of the the _sigfault (aka si_addr) member of the siginfo union when SI_ASYNCIO is set is incompatible with the POSIX and glibc usage of the _rt member of the siginfo union. Therefore replace kill_pid_info_as_cred with kill_pid_usb_asyncio a dedicated function for this one specific case. There are no other users of kill_pid_info_as_cred so this specialization should have no impact on the amount of code in the kernel. Have kill_pid_usb_asyncio take instead of a siginfo_t which is difficult and error prone, 3 arguments, a signal number, an errno value, and an address enconded as a sigval_t. The encoding of the address as a sigval_t allows the code that reads the userspace request for a signal to handle this compat issue along with all of the other compat issues. Add BUILD_BUG_ONs in kernel/signal.c to ensure that we can now place the pointer value at the in si_pid (instead of si_addr). That is the code now verifies that si_pid and si_addr always occur at the same location. Further the code veries that for native structures a value placed in si_pid and spilling into si_uid will appear in userspace in si_addr (on a byte by byte copy of siginfo or a field by field copy of siginfo). The code also verifies that for a 64bit kernel and a 32bit userspace the 32bit pointer will fit in si_pid. I have used the usbsig.c program below written by Alan Stern and slightly tweaked by me to run on a big endian machine to verify the issue exists (on sparc64) and to confirm the patch below fixes the issue. /* usbsig.c -- test USB async signal delivery */ #define _GNU_SOURCE #include <stdio.h> #include <fcntl.h> #include <signal.h> #include <string.h> #include <sys/ioctl.h> #include <unistd.h> #include <endian.h> #include <linux/usb/ch9.h> #include <linux/usbdevice_fs.h> static struct usbdevfs_urb urb; static struct usbdevfs_disconnectsignal ds; static volatile sig_atomic_t done = 0; void urb_handler(int sig, siginfo_t *info , void *ucontext) { printf("Got signal %d, signo %d errno %d code %d addr: %p urb: %p\n", sig, info->si_signo, info->si_errno, info->si_code, info->si_addr, &urb); printf("%s\n", (info->si_addr == &urb) ? "Good" : "Bad"); } void ds_handler(int sig, siginfo_t *info , void *ucontext) { printf("Got signal %d, signo %d errno %d code %d addr: %p ds: %p\n", sig, info->si_signo, info->si_errno, info->si_code, info->si_addr, &ds); printf("%s\n", (info->si_addr == &ds) ? "Good" : "Bad"); done = 1; } int main(int argc, char **argv) { char *devfilename; int fd; int rc; struct sigaction act; struct usb_ctrlrequest *req; void *ptr; char buf[80]; if (argc != 2) { fprintf(stderr, "Usage: usbsig device-file-name\n"); return 1; } devfilename = argv[1]; fd = open(devfilename, O_RDWR); if (fd == -1) { perror("Error opening device file"); return 1; } act.sa_sigaction = urb_handler; sigemptyset(&act.sa_mask); act.sa_flags = SA_SIGINFO; rc = sigaction(SIGUSR1, &act, NULL); if (rc == -1) { perror("Error in sigaction"); return 1; } act.sa_sigaction = ds_handler; sigemptyset(&act.sa_mask); act.sa_flags = SA_SIGINFO; rc = sigaction(SIGUSR2, &act, NULL); if (rc == -1) { perror("Error in sigaction"); return 1; } memset(&urb, 0, sizeof(urb)); urb.type = USBDEVFS_URB_TYPE_CONTROL; urb.endpoint = USB_DIR_IN | 0; urb.buffer = buf; urb.buffer_length = sizeof(buf); urb.signr = SIGUSR1; req = (struct usb_ctrlrequest *) buf; req->bRequestType = USB_DIR_IN | USB_TYPE_STANDARD | USB_RECIP_DEVICE; req->bRequest = USB_REQ_GET_DESCRIPTOR; req->wValue = htole16(USB_DT_DEVICE << 8); req->wIndex = htole16(0); req->wLength = htole16(sizeof(buf) - sizeof(*req)); rc = ioctl(fd, USBDEVFS_SUBMITURB, &urb); if (rc == -1) { perror("Error in SUBMITURB ioctl"); return 1; } rc = ioctl(fd, USBDEVFS_REAPURB, &ptr); if (rc == -1) { perror("Error in REAPURB ioctl"); return 1; } memset(&ds, 0, sizeof(ds)); ds.signr = SIGUSR2; ds.context = &ds; rc = ioctl(fd, USBDEVFS_DISCSIGNAL, &ds); if (rc == -1) { perror("Error in DISCSIGNAL ioctl"); return 1; } printf("Waiting for usb disconnect\n"); while (!done) { sleep(1); } close(fd); return 0; } Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oneukum@suse.com> Fixes: v2.3.39 Cc: stable@vger.kernel.org Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26arm64: irqflags: Add condition flags to inline asm clobber listJulien Thierry
commit f57065782f245ca96f1472209a485073bbc11247 upstream. Some of the inline assembly instruction use the condition flags and need to include "cc" in the clobber list. Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26arm64: tegra: Fix AGIC register rangeJon Hunter
commit ba24eee6686f6ed3738602b54d959253316a9541 upstream. The Tegra AGIC interrupt controller is an ARM GIC400 interrupt controller. Per the ARM GIC device-tree binding, the first address region is for the GIC distributor registers and the second address region is for the GIC CPU interface registers. The address space for the distributor registers is 4kB, but currently this is incorrectly defined as 8kB for the Tegra AGIC and overlaps with the CPU interface registers. Correct the address space for the distributor to be 4kB. Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Fixes: bcdbde433542 ("arm64: tegra: Add AGIC node for Tegra210") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: x86/vPMU: refine kvm_pmu err msg when event creation failedLike Xu
commit 6fc3977ccc5d3c22e851f2dce2d3ce2a0a843842 upstream. If a perf_event creation fails due to any reason of the host perf subsystem, it has no chance to log the corresponding event for guest which may cause abnormal sampling data in guest result. In debug mode, this message helps to understand the state of vPMC and we may not limit the number of occurrences but not in a spamming style. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: PPC: Book3S HV: Fix CR0 setting in TM emulationMichael Neuling
commit 3fefd1cd95df04da67c83c1cb93b663f04b3324f upstream. When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The code currently sets: CR0 <- 00 || MSR[TS] but according to the ISA it should be: CR0 <- 0 || MSR[TS] || 0 This fixes the bit shift to put the bits in the correct location. This is a data integrity issue as CR0 is corrupted. Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Cc: stable@vger.kernel.org # v4.17+ Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: PPC: Book3S HV: Clear pending decrementer exceptions on nested guest entrySuraj Jitindar Singh
commit 3c25ab35fbc8526ac0c9b298e8a78e7ad7a55479 upstream. If we enter an L1 guest with a pending decrementer exception then this is cleared on guest exit if the guest has writtien a positive value into the decrementer (indicating that it handled the decrementer exception) since there is no other way to detect that the guest has handled the pending exception and that it should be dequeued. In the event that the L1 guest tries to run a nested (L2) guest immediately after this and the L2 guest decrementer is negative (which is loaded by L1 before making the H_ENTER_NESTED hcall), then the pending decrementer exception isn't cleared and the L2 entry is blocked since L1 has a pending exception, even though L1 may have already handled the exception and written a positive value for it's decrementer. This results in a loop of L1 trying to enter the L2 guest and L0 blocking the entry since L1 has an interrupt pending with the outcome being that L2 never gets to run and hangs. Fix this by clearing any pending decrementer exceptions when L1 makes the H_ENTER_NESTED hcall since it won't do this if it's decrementer has gone negative, and anyway it's decrementer has been communicated to L0 in the hdec_expires field and L0 will return control to L1 when this goes negative by delivering an H_DECREMENTER exception. Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: PPC: Book3S HV: Signed extend decrementer value if not using large ↵Suraj Jitindar Singh
decrementer commit 869537709ebf1dc865e75c3fc97b23f8acf37c16 upstream. On POWER9 the decrementer can operate in large decrementer mode where the decrementer is 56 bits and signed extended to 64 bits. When not operating in this mode the decrementer behaves as a 32 bit decrementer which is NOT signed extended (as on POWER8). Currently when reading a guest decrementer value we don't take into account whether the large decrementer is enabled or not, and this means the value will be incorrect when the guest is not using the large decrementer. Fix this by sign extending the value read when the guest isn't using the large decrementer. Fixes: 95a6432ce903 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: VMX: check CPUID before allowing read/write of IA32_XSSWanpeng Li
commit 4d763b168e9c5c366b05812c7bba7662e5ea3669 upstream. Raise #GP when guest read/write IA32_XSS, but the CPUID bits say that it shouldn't exist. Fixes: 203000993de5 (kvm: vmx: add MSR logic for XSAVES) Reported-by: Xiaoyao Li <xiaoyao.li@linux.intel.com> Reported-by: Tao Xu <tao3.xu@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: VMX: Fix handling of #MC that occurs during VM-EntrySean Christopherson
commit beb8d93b3e423043e079ef3dda19dad7b28467a8 upstream. A previous fix to prevent KVM from consuming stale VMCS state after a failed VM-Entry inadvertantly blocked KVM's handling of machine checks that occur during VM-Entry. Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways, depending on when the #MC is recognoized. As it pertains to this bug fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to indicate the VM-Entry failed. If a machine-check event occurs during a VM entry, one of the following occurs: - The machine-check event is handled as if it occurred before the VM entry: ... - The machine-check event is handled after VM entry completes: ... - A VM-entry failure occurs as described in Section 26.7. The basic exit reason is 41, for "VM-entry failure due to machine-check event". Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit(). Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY in a sane fashion and also simplifies vmx_complete_atomic_exit() since VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh. Fixes: b060ca3b2e9e7 ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: nVMX: Always sync GUEST_BNDCFGS when it comes from vmcs01Sean Christopherson
commit 3b013a2972d5bc344d6eaa8f24fdfe268211e45f upstream. If L1 does not set VM_ENTRY_LOAD_BNDCFGS, then L1's BNDCFGS value must be propagated to vmcs02 since KVM always runs with VM_ENTRY_LOAD_BNDCFGS when MPX is supported. Because the value effectively comes from vmcs01, vmcs02 must be updated even if vmcs12 is clean. Fixes: 62cf9bd8118c4 ("KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS") Cc: stable@vger.kernel.org Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26KVM: nVMX: Don't dump VMCS if virtual APIC page can't be mappedSean Christopherson
commit 73cb85568433feadb79e963bf2efba9b3e9ae3df upstream. ... as a malicious userspace can run a toy guest to generate invalid virtual-APIC page addresses in L1, i.e. flood the kernel log with error messages. Fixes: 690908104e39d ("KVM: nVMX: allow tests to use bad virtual-APIC page address") Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26media: videobuf2-dma-sg: Prevent size from overflowingSakari Ailus
commit 14f28f5cea9e3998442de87846d1907a531b6748 upstream. buf->size is an unsigned long; casting that to int will lead to an overflow if buf->size exceeds INT_MAX. Fix this by changing the type to unsigned long instead. This is possible as the buf->size is always aligned to PAGE_SIZE, and therefore the size will never have values lesser than 0. Note on backporting to stable: the file used to be under drivers/media/v4l2-core, it was moved to the current location after 4.14. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26media: videobuf2-core: Prevent size alignment wrapping buffer size to 0Sakari Ailus
commit defcdc5d89ced780fb45196d539d6570ec5b1ba5 upstream. PAGE_ALIGN() may wrap the buffer size around to 0. Prevent this by checking that the aligned value is not smaller than the unaligned one. Note on backporting to stable: the file used to be under drivers/media/v4l2-core, it was moved to the current location after 4.14. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26media: coda: Remove unbalanced and unneeded mutex unlockEzequiel Garcia
commit 766b9b168f6c75c350dd87c3e0bc6a9b322f0013 upstream. The mutex unlock in the threaded interrupt handler is not paired with any mutex lock. Remove it. This bug has been here for a really long time, so it applies to any stable repo. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: stable@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26media: v4l2: Test type instead of cfg->type in v4l2_ctrl_new_custom()Boris Brezillon
commit 07d89227a983df957a6a7c56f7c040cde9ac571f upstream. cfg->type can be overridden by v4l2_ctrl_fill() and the new value is stored in the local type var. Fix the tests to use this local var. Fixes: 0996517cf8ea ("V4L/DVB: v4l2: Add new control handling framework") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> [hverkuil-cisco@xs4all.nl: change to !qmenu and !qmenu_int (checkpatch)] Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ceph: fix end offset in truncate_inode_pages_range callLuis Henriques
commit d31d07b97a5e76f41e00eb81dcca740e84aa7782 upstream. Commit e450f4d1a5d6 ("ceph: pass inclusive lend parameter to filemap_write_and_wait_range()") fixed the end offset parameter used to call filemap_write_and_wait_range and invalidate_inode_pages2_range. Unfortunately it missed truncate_inode_pages_range, introducing a regression that is easily detected by xfstest generic/130. The problem is that when doing direct IO it is possible that an extra page is truncated from the page cache when the end offset is page aligned. This can cause data loss if that page hasn't been sync'ed to the OSDs. While there, change code to use PAGE_ALIGN macro instead. Cc: stable@vger.kernel.org Fixes: e450f4d1a5d6 ("ceph: pass inclusive lend parameter to filemap_write_and_wait_range()") Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: hda/hdmi - Fix i915 reverse port/pin mappingTakashi Iwai
commit 3140aafb22edeab0cc41f15f53b12a118c0ac215 upstream. The recent fix for Icelake HDMI codec introduced the mapping from pin NID to the i915 gfx port number. However, it forgot the reverse mapping from the port number to the pin NID that is used in the ELD notifier callback. As a result, it's processed to a wrong widget and gives a warning like snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered This patch corrects it with a proper reverse mapping function. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204133 Fixes: b0d8bc50b9f2 ("ALSA: hda: hdmi - add Icelake support") Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: hda/hdmi - Remove duplicated defineTakashi Iwai
commit eb4177116bf568a413c544eca3f4446cb4064be9 upstream. INTEL_GET_VENDOR_VERB is defined twice identically. Let's remove a superfluous line. Fixes: b0d8bc50b9f2 ("ALSA: hda: hdmi - add Icelake support") Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: hda/realtek: apply ALC891 headset fixup to one Dell machineHui Wang
commit 4b4e0e32e4b09274dbc9d173016c1a026f44608c upstream. Without this patch, the headset-mic and headphone-mic don't work. Cc: <stable@vger.kernel.org> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platformKailang Yang
commit fbc571290d9f7bfe089c50f4ac4028dd98ebfe98 upstream. It assigned to wrong model. So, The headphone Mic can't work. Fixes: 3f640970a414 ("ALSA: hda - Fix headset mic detection problem for several Dell laptops") Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: hda - Don't resume forcibly i915 HDMI/DP codecTakashi Iwai
commit 4914da2fb0c89205790503f20dfdde854f3afdd8 upstream. We apply the codec resume forcibly at system resume callback for updating and syncing the jack detection state that may have changed during sleeping. This is, however, superfluous for the codec like Intel HDMI/DP, where the jack detection is managed via the audio component notification; i.e. the jack state change shall be reported sooner or later from the graphics side at mode change. This patch changes the codec resume callback to avoid the forcible resume conditionally with a new flag, codec->relaxed_resume, for reducing the resume time. The flag is set in the codec probe. Although this doesn't fix the entire bug mentioned in the bugzilla entry below, it's still a good optimization and some improvements are seen. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201901 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26ALSA: seq: Break too long mutex context in the write loopTakashi Iwai
commit ede34f397ddb063b145b9e7d79c6026f819ded13 upstream. The fix for the racy writes and ioctls to sequencer widened the application of client->ioctl_mutex to the whole write loop. Although it does unlock/relock for the lengthy operation like the event dup, the loop keeps the ioctl_mutex for the whole time in other situations. This may take quite long time if the user-space would give a huge buffer, and this is a likely cause of some weird behavior spotted by syzcaller fuzzer. This patch puts a simple workaround, just adding a mutex break in the loop when a large number of events have been processed. This shouldn't hit any performance drop because the threshold is set high enough for usual operations. Fixes: 7bd800915677 ("ALSA: seq: More protection for concurrent write and ioctl races") Reported-by: syzbot+97aae04ce27e39cbfca9@syzkaller.appspotmail.com Reported-by: syzbot+4c595632b98bb8ffcc66@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26raid5-cache: Need to do start() part job after adding journal deviceXiao Ni
commit d9771f5ec46c282d518b453c793635dbdc3a2a94 upstream. commit d5d885fd514f ("md: introduce new personality funciton start()") splits the init job to two parts. The first part run() does the jobs that do not require the md threads. The second part start() does the jobs that require the md threads. Now it just does run() in adding new journal device. It needs to do the second part start() too. Fixes: d5d885fd514f ("md: introduce new personality funciton start()") Cc: stable@vger.kernel.org #v4.9+ Reported-by: Michal Soltys <soltys@ziu.info> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>