linux/linux-stable.git - Linux kernel stable tree

Age	Commit message	Author
2020-06-24	nvme: fix possible deadlock when I/O is blocked	Sagi Grimberg
	Revert fab7772bfbcf ("nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns"). When adding a new namespace to the head disk (via nvme_mpath_set_live) we will see a partition scan, which triggers I/O on the mpath device node. This process is usually triggered from scan_work, which holds the scan_lock. If I/O blocks (for example, after an ANA change leaves paths available but none accessible), this can deadlock on the head disk bd_mutex, because both the partition scan I/O and the head disk revalidation take it; the revalidation takes it to check for resize and is also triggered from scan_work, on a different path. See trace [1]. The mpath disk revalidation was originally added to detect online disk size changes, but this is no longer needed since commit cb224c3af4df ("nvme: Convert to use set_capacity_revalidate_and_notify"), which already updates resize info without unnecessarily revalidating the disk (the mpath disk doesn't even implement the .revalidate_disk fop). [1]: -- kernel: INFO: task kworker/u65:9:494 blocked for more than 241 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u65:9 D 0 494 2 0x80004000 kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: schedule_preempt_disabled+0xe/0x10 kernel: __mutex_lock.isra.0+0x182/0x4f0 kernel: __mutex_lock_slowpath+0x13/0x20 kernel: mutex_lock+0x2e/0x40 kernel: revalidate_disk+0x63/0xa0 kernel: __nvme_revalidate_disk+0xfe/0x110 [nvme_core] kernel: nvme_revalidate_disk+0xa4/0x160 [nvme_core] kernel: ? evict+0x14c/0x1b0 kernel: revalidate_disk+0x2b/0xa0 kernel: nvme_validate_ns+0x49/0x940 [nvme_core] kernel: ? blk_mq_free_request+0xd2/0x100 kernel: ? 
__nvme_submit_sync_cmd+0xbe/0x1e0 [nvme_core] kernel: nvme_scan_work+0x24f/0x380 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x249/0x400 kernel: kthread+0x104/0x140 kernel: ? process_one_work+0x380/0x380 kernel: ? kthread_park+0x80/0x80 kernel: ret_from_fork+0x1f/0x40 ... kernel: INFO: task kworker/u65:1:2630 blocked for more than 241 seconds. kernel: Tainted: G OE 5.3.5-050305-generic #201910071830 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kernel: kworker/u65:1 D 0 2630 2 0x80004000 kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core] kernel: Call Trace: kernel: __schedule+0x2b9/0x6c0 kernel: schedule+0x42/0xb0 kernel: io_schedule+0x16/0x40 kernel: do_read_cache_page+0x438/0x830 kernel: ? __switch_to_asm+0x34/0x70 kernel: ? file_fdatawait_range+0x30/0x30 kernel: read_cache_page+0x12/0x20 kernel: read_dev_sector+0x27/0xc0 kernel: read_lba+0xc1/0x220 kernel: ? kmem_cache_alloc_trace+0x19c/0x230 kernel: efi_partition+0x1e6/0x708 kernel: ? vsnprintf+0x39e/0x4e0 kernel: ? snprintf+0x49/0x60 kernel: check_partition+0x154/0x244 kernel: rescan_partitions+0xae/0x280 kernel: __blkdev_get+0x40f/0x560 kernel: blkdev_get+0x3d/0x140 kernel: __device_add_disk+0x388/0x480 kernel: device_add_disk+0x13/0x20 kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core] kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core] kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core] kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core] kernel: ? nvme_update_ns_ana_state+0x60/0x60 [nvme_core] kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core] kernel: nvme_validate_ns+0x396/0x940 [nvme_core] kernel: ? blk_mq_free_request+0xd2/0x100 kernel: nvme_scan_work+0x24f/0x380 [nvme_core] kernel: process_one_work+0x1db/0x380 kernel: worker_thread+0x249/0x400 kernel: kthread+0x104/0x140 kernel: ? process_one_work+0x380/0x380 kernel: ? 
kthread_park+0x80/0x80 kernel: ret_from_fork+0x1f/0x40 -- Fixes: fab7772bfbcf ("nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns") Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	nvme-rdma: assign completion vector correctly	Max Gurtovoy
	The completion vector index given during CQ creation can't exceed the number of vectors supported by the underlying RDMA device. This violation can currently occur, for example, when one tries to connect with N regular read/write queues and M poll queues and N + M > num_supported_vectors. This leads to a failure to establish a connection to the remote target. Instead, in that case, share a completion vector between queues. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
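A minimal sketch of the vector-sharing idea described above, assuming a simple wrap-around mapping. The helper name `pick_comp_vector` and its parameters are hypothetical and are not taken from the driver; the actual patch may compute the index differently.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the driver's code): map a queue index onto one
 * of the device's completion vectors. The index wraps around, so queues
 * share vectors once there are more queues than vectors.
 */
int pick_comp_vector(int queue_idx, int num_comp_vectors)
{
	assert(queue_idx >= 0);
	assert(num_comp_vectors > 0);
	return queue_idx % num_comp_vectors;
}
```

With 4 vectors, queue 5 would land on vector 1 and share it with queue 1, so the index passed at CQ creation never exceeds what the device supports.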
2020-06-24	nvme-loop: initialize tagset numa value to the value of the ctrl	Max Gurtovoy
	Both the admin's and the drive's tagsets should be set according to the NUMA node of the controller. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	nvme-tcp: initialize tagset numa value to the value of the ctrl	Max Gurtovoy
	Both the admin's and the drive's tagsets should be set according to the NUMA node of the controller. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	nvme-pci: initialize tagset numa value to the value of the ctrl	Max Gurtovoy
	Both the admin's and the drive's tagsets should be set according to the NUMA node of the controller. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	nvme-pci: override the value of the controller's numa node	Max Gurtovoy
	Set the node value according to the PCI device numa node. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	nvme: set initial value for controller's numa node	Max Gurtovoy
	Initialize the node to NUMA_NO_NODE. Transports that are aware of NUMA node affinity can override it (e.g. the RDMA transport sets the affinity according to the RDMA HCA). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-06-24	usb: renesas_usbhs: getting residue from callback_result	Yoshihiro Shimoda
	This driver assumed that dmaengine_tx_status() could return the residue even if the transfer was completed. However, this was not correct usage [1], and getting the residue broke after commit 24461d9792c2 ("dmaengine: virt-dma: Fix access after free in vchan_complete()"). As a result, the driver can get a wrong received size if the USB controller gets a short packet. For example, the g_zero driver causes "bad OUT byte" errors. The usb-dmac driver will support callback_result, so this driver can use it to get the residue correctly. Note that even if the usb-dmac driver does not support callback_result yet, this patch doesn't cause any side effects. [1] https://lore.kernel.org/dmaengine/20200616165550.GP2324254@vkoul-mobl/ Reported-by: Hien Dang <hien.dang.eb@renesas.com> Fixes: 24461d9792c2 ("dmaengine: virt-dma: Fix access after free in vchan_complete()") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://lore.kernel.org/r/1592482277-19563-1-git-send-email-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	block: release bip in a right way in error path	Chengguang Xu
	Release the bip with kfree() in the error path when it was allocated by kmalloc(). Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-24	drm/radeon: fix fb_div check in ni_init_smc_spll_table()	Denis Efremov
	clk_s is checked twice in a row in ni_init_smc_spll_table(). fb_div should be checked instead. Fixes: 69e0b57a91ad ("drm/radeon/kms: add dpm support for cayman (v5)") Cc: stable@vger.kernel.org Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-06-24	Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"	Anand Moon
	This reverts commit 07f6842341abe978e6375078f84506ec3280ece5. The SCLK_SCLK_USBD300 suspend clock needs to be configured for the PHY module; I wrongly mapped this clock to the DWC3 code. Cc: Felipe Balbi <balbi@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Anand Moon <linux.amoon@gmail.com> Cc: stable <stable@vger.kernel.org> Fixes: 07f6842341ab ("usb: dwc3: exynos: Add support for Exynos5422 suspend clk") Link: https://lore.kernel.org/r/20200623074637.756-1-linux.amoon@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	xhci: Poll for U0 after disabling USB2 LPM	Kai-Heng Feng
	USB2 devices with LPM enabled may interrupt the system suspend: [ 932.510475] usb 1-7: usb suspend, wakeup 0 [ 932.510549] hub 1-0:1.0: hub_suspend [ 932.510581] usb usb1: bus suspend, wakeup 0 [ 932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended [ 932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended .. [ 932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03 .. [ 932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16 [ 932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16 [ 932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16 During system suspend, the USB core lets the HC suspend the device if it doesn't have remote wakeup enabled and doesn't have any children. However, the log above shows that usb 1-7 doesn't get bus suspended because it is not in U0. After a while the port finishes its U2 -> U0 transition and interrupts the suspend process. The observation is that after disabling LPM, the port doesn't transition to U0 immediately and can linger in U2. xHCI spec 4.23.5.2 states that the maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for the affected device is advertised as 400us, which is still not enough based on my testing result. So let's use the maximum permitted latency, 10000, to poll for U0 status to solve the issue. Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20200624135949.22611-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
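The polling approach described above can be sketched in user-space C as a bounded loop. Everything here is illustrative: `poll_for_u0`, `fake_port_read`, the `link_state` values, and the microsecond step are assumptions, and the sketch only accounts for elapsed time instead of actually delaying as a driver would.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical link-state values, for illustration only. */
enum link_state { LINK_U0 = 0, LINK_U2 = 2 };

/* Caller-supplied reader standing in for a port status register read. */
typedef enum link_state (*read_state_fn)(void *ctx);

/*
 * Poll until the port reports U0 or the time budget runs out.
 * step_us models the delay between reads; a real driver would sleep or
 * busy-wait between reads, which this sketch omits.
 */
bool poll_for_u0(read_state_fn read_state, void *ctx,
		 unsigned int timeout_us, unsigned int step_us)
{
	unsigned int waited_us = 0;

	assert(step_us > 0);
	for (;;) {
		if (read_state(ctx) == LINK_U0)
			return true;
		if (waited_us >= timeout_us)
			return false;
		waited_us += step_us;
	}
}

/* Test double: reports U2 for the first *ctx reads, then U0. */
enum link_state fake_port_read(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return LINK_U0;
	(*remaining)--;
	return LINK_U2;
}
```

A port that needs a few reads to settle succeeds with a 10000 budget, while a short budget such as 400 gives up on a port that lingers in U2, which mirrors the situation the commit message reports.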
2020-06-24	xhci: Return if xHCI doesn't support LPM	Kai-Heng Feng
	Just return if xHCI is quirked to disable LPM. This saves the time spent reading registers and taking spinlocks. Add a stable tag, as we want this patch together with the next one, "Poll for U0 after disabling USB2 LPM", which fixes a suspend issue for some USB2 LPM devices. Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20200624135949.22611-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	usb: host: xhci-mtk: avoid runtime suspend when removing hcd	Macpaul Lin
	When runtime suspend is enabled, a runtime suspend might happen while xhci is removing the hcd. This might cause a kernel panic when the hcd has been freed but the runtime PM suspend handler still needs to reference it. Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com> Reviewed-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20200624135949.22611-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	xhci: Fix enumeration issue when setting max packet size for FS devices.	Al Cooper
	Unable to complete the enumeration of a USB TV Tuner device. Per XHCI spec (4.6.5), the EP state field of the input context shall be cleared for a set address command. In the special case of an FS device that has "MaxPacketSize0 = 8", the Linux XHCI driver does not do this before evaluating the context. With an XHCI controller that checks the EP state field for parameter context error this causes a problem in cases such as the device getting reset again after enumeration. When that field is cleared, the problem does not occur. This was found and fixed by Sasi Kumar. Cc: stable@vger.kernel.org Signed-off-by: Al Cooper <alcooperx@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20200624135949.22611-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	xhci: Fix incorrect EP_STATE_MASK	Mathias Nyman
	EP_STATE_MASK should be 0x7 instead of 0xf. xHCI spec 6.2.3 shows that the EP state field in the endpoint context data structure consists of bits [2:0]. The old value included a bit from the next field, which fortunately is a RsvdZ region. So hopefully this hasn't caused too much harm. Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20200624135949.22611-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
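To make the mask difference concrete, here is a small sketch. The mask values 0x7 and 0xf come from the commit message; the helper names `ep_state` and `ep_state_old` and the sample dword values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EP_STATE_MASK_OLD 0xfu	/* bits [3:0]: one bit too wide */
#define EP_STATE_MASK     0x7u	/* bits [2:0], per the commit message */

/* Extract the EP state field from the context dword that holds it. */
uint32_t ep_state(uint32_t ep_info)
{
	return ep_info & EP_STATE_MASK;
}

/* Same extraction with the old, too-wide mask, for comparison. */
uint32_t ep_state_old(uint32_t ep_info)
{
	return ep_info & EP_STATE_MASK_OLD;
}
```

The two helpers agree as long as bit 3 is zero, which is why the reserved-zero neighbour made the old mask mostly harmless; they diverge as soon as that bit is set.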
2020-06-24	usb: cdns3: ep0: add spinlock for cdns3_check_new_setup	Peter Chen
	Another thread may access other endpoints while cdns3_check_new_setup is running; add a spinlock to protect it. Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Cc: <stable@vger.kernel.org> Reviewed-by: Pawel Laszczak <pawell@cadence.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Link: https://lore.kernel.org/r/20200623030918.8409-4-peter.chen@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	usb: cdns3: trace: using correct dir value	Peter Chen
	Use the correct direction value from the register instead of depending on the previous software setting. This fixes the wrong EP number in the trace when the TRBERR interrupt occurs for EP0IN. When the EP0IN IOC has finished, software prepares the setup packet request and the expected direction is OUT, but at that time the TRBERR for EP0IN may still occur: in DMULT mode, the DMA does not stop until it hits TRBERR. Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Cc: <stable@vger.kernel.org> Reviewed-by: Pawel Laszczak <pawell@cadence.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Link: https://lore.kernel.org/r/20200623030918.8409-3-peter.chen@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	usb: cdns3: ep0: fix the test mode set incorrectly	Peter Chen
	'tmode' is ctrl->wIndex; change it to the real test mode value before assigning it to the register. Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Cc: <stable@vger.kernel.org> Reviewed-by: Jun Li <jun.li@nxp.com> Reviewed-by: Pawel Laszczak <pawell@cadence.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Link: https://lore.kernel.org/r/20200623030918.8409-2-peter.chen@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	kselftest: arm64: Remove redundant clean target	Mark Brown
	The arm64 signal tests generate warnings during build since both they and the toplevel lib.mk define a clean target: Makefile:25: warning: overriding recipe for target 'clean' ../../lib.mk:126: warning: ignoring old recipe for target 'clean' Since the inclusion of lib.mk is in the signal Makefile there is no situation where this warning could be avoided so just remove the redundant clean target. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20200624104933.21125-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-06-24	arm64: kpti: Add KRYO{3, 4}XX silver CPU cores to kpti safelist	Sai Prakash Ranjan
	QCOM KRYO{3,4}XX silver/LITTLE CPU cores are based on Cortex-A55 and are meltdown safe, hence add them to kpti_safe_list[]. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Link: https://lore.kernel.org/r/20200624123406.3472-1-saiprakash.ranjan@codeaurora.org Signed-off-by: Will Deacon <will@kernel.org>
2020-06-24	arm64: Don't insert a BTI instruction at inner labels	Jean-Philippe Brucker
	Some ftrace features are broken since commit 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI"). For example the function_graph tracer: $ echo function_graph > /sys/kernel/debug/tracing/current_tracer [ 36.107016] WARNING: CPU: 0 PID: 115 at kernel/trace/ftrace.c:2691 ftrace_modify_all_code+0xc8/0x14c When ftrace_modify_graph_caller() attempts to write a branch at ftrace_graph_call, it finds the "BTI J" instruction inserted by SYM_INNER_LABEL() instead of a NOP, and aborts. It turns out we don't currently need the BTI landing pads inserted by SYM_INNER_LABEL: * ftrace_call and ftrace_graph_call are only used for runtime patching of the active tracer. The patched code is not reached from a branch. * install_el2_stub is reached from a CBZ instruction, which doesn't change PSTATE.BTYPE. * __guest_exit is reached from B instructions in the hyp-entry vectors, which aren't subject to BTI checks either. Remove the BTI annotation from SYM_INNER_LABEL. Fixes: 714a8d02ca4d ("arm64: asm: Override SYM_FUNC_START when building the kernel with BTI") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20200624112253.1602786-1-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-06-24	arm64: vdso: Don't use gcc plugins for building vgettimeofday.c	Alexander Popov
	Don't use gcc plugins for building arch/arm64/kernel/vdso/vgettimeofday.c to avoid unneeded instrumentation. Signed-off-by: Alexander Popov <alex.popov@linux.com> Link: https://lore.kernel.org/r/20200624123330.83226-4-alex.popov@linux.com Signed-off-by: Will Deacon <will@kernel.org>
2020-06-24	usbip: tools: fix module name in man page	Antonio Borneo
	Commit 64e62426f40d ("staging: usbip: edit Kconfig and rename CONFIG options") renamed the module usbip as usbip-host, but the example in the man page still reports the old module name. Fix the module name in usbipd.8 Fixes: 64e62426f40d ("staging: usbip: edit Kconfig and rename CONFIG options") Acked-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com> Acked-by: matt mooney <mfm@muteddisk.com> Link: https://lore.kernel.org/r/20200618000818.1048203-1-borneo.antonio@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	usbip: tools: fix build error for multiple definition	Antonio Borneo
	With GCC 10, building usbip triggers error for multiple definition of 'udev_context', in: - libsrc/vhci_driver.c:18 and - libsrc/usbip_host_common.c:27. Declare as extern the definition in libsrc/usbip_host_common.c. Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20200618000844.1048309-1-borneo.antonio@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
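For context, GCC 10 defaults to -fno-common, so a variable defined without `extern` in two translation units becomes a link error instead of being merged. The sketch below shows the declaration/definition split in a single file for brevity; which file carries which line in the real tree is described by the commit message, not by this example.

```c
#include <assert.h>
#include <stddef.h>

struct udev;	/* opaque type; only pointers to it are used here */

/* What every other user of the variable should carry: a declaration. */
extern struct udev *udev_context;

/* What exactly one .c file should carry: the definition. */
struct udev *udev_context;
```

Because only one definition remains, the linker sees a single `udev_context` symbol regardless of the -fcommon or -fno-common default.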
2020-06-24	USB: ch9: add "USB_" prefix in front of TEST defines	Greg Kroah-Hartman
	For some reason, the TEST_ defines in the usb/ch9.h files did not have the USB_ prefix on it, making it a bit confusing when reading the file, as well as not the nicest thing to do in a uapi file. So fix that up and add the USB_ prefix on to them, and fix up all in-kernel usages. This included deleting the duplicate copy in the net2272.h file. Cc: Felipe Balbi <balbi@kernel.org> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Mathias Nyman <mathias.nyman@intel.com> Cc: Pawel Laszczak <pawell@cadence.com> Cc: YueHaibing <yuehaibing@huawei.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Jason Yan <yanaijie@huawei.com> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jules Irenge <jbi.octave@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Cc: Rob Gill <rrobgill@protonmail.com> Cc: Macpaul Lin <macpaul.lin@mediatek.com> Acked-by: Minas Harutyunyan <hminas@synopsys.com> Acked-by: Bin Liu <b-liu@ti.com> Acked-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Acked-by: Peter Chen <peter.chen@nxp.com> Link: https://lore.kernel.org/r/20200618144206.2655890-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24	ALSA: usb-audio: Fix OOB access of mixer element list	Takashi Iwai
	The USB-audio mixer code holds a linked list of usb_mixer_elem_list, and several operations are performed for each mixer element. A few of them (snd_usb_mixer_notify_id() and snd_usb_mixer_interrupt_v2()) assume each mixer element is a usb_mixer_elem_info object, which is a subclass of usb_mixer_elem_list, cast it via container_of() and access its members. This may result in an out-of-bounds access when a non-standard list element has been added, as spotted by syzkaller recently. This patch adds a new field, is_std_info, in usb_mixer_elem_list to indicate whether the element is of the usb_mixer_elem_info type, and skips the access to such an element if needed. Reported-by: syzbot+fb14314433463ad51625@syzkaller.appspotmail.com Reported-by: syzbot+2405ca3401e943c538b5@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200624122340.9615-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
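The guard described above might look roughly like the following. The structures are heavily simplified stand-ins: only `is_std_info` is taken from the commit message, while the `cached` field, the `invalidate_std_elems` helper, and the local `container_of` macro are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the structures named in the commit message. */
struct usb_mixer_elem_list {
	struct usb_mixer_elem_list *next_id_elem;
	bool is_std_info;	/* true only when embedded in usb_mixer_elem_info */
};

struct usb_mixer_elem_info {
	struct usb_mixer_elem_list head;
	int cached;		/* hypothetical field touched by the walker */
};

/* Walk the list and clear 'cached' on standard elements only. */
int invalidate_std_elems(struct usb_mixer_elem_list *list)
{
	int touched = 0;

	for (; list; list = list->next_id_elem) {
		struct usb_mixer_elem_info *info;

		if (!list->is_std_info)
			continue;	/* not a usb_mixer_elem_info: skip the cast */
		info = container_of(list, struct usb_mixer_elem_info, head);
		info->cached = 0;
		touched++;
	}
	return touched;
}
```

Without the flag check, the cast on a bare list element would produce a pointer to memory that was never allocated as a usb_mixer_elem_info, which is the out-of-bounds access the commit addresses.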
2020-06-24	arm64: vdso: Only pass --no-eh-frame-hdr when linker supports it	Will Deacon
	Commit 87676cfca141 ("arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline") unconditionally passes the '--no-eh-frame-hdr' option to the linker when building the native vDSO in an attempt to prevent generation of the .eh_frame_hdr section, the presence of which has been implicated in segfaults originating from the libgcc unwinder. Unfortunately, not all versions of binutils support this option, which has been shown to cause build failures in linux-next: | CALL scripts/atomic/check-atomics.sh | CALL scripts/checksyscalls.sh | LD arch/arm64/kernel/vdso/vdso.so.dbg | ld: unrecognized option '--no-eh-frame-hdr' | ld: use the --help option for usage information | arch/arm64/kernel/vdso/Makefile:64: recipe for target | 'arch/arm64/kernel/vdso/vdso.so.dbg' failed | make[1]: *** [arch/arm64/kernel/vdso/vdso.so.dbg] Error 1 | arch/arm64/Makefile:175: recipe for target 'vdso_prepare' failed | make: *** [vdso_prepare] Error 2 Only link the vDSO with '--no-eh-frame-hdr' when the linker supports it. If we end up with the section due to linker defaults, the absence of CFI information in the sigreturn trampoline will prevent the unwinder from breaking. Link: https://lore.kernel.org/r/7a7e31a8-9a7b-2428-ad83-2264f20bdc2d@hisilicon.com Fixes: 87676cfca141 ("arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline") Reported-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-06-24	habanalabs: increase h/w timer when checking idle	Omer Shpigelman
	In GAUDI the current timer value for the hardware to check if it is in IDLE state is too low. As a result, there are occasions where the H/W wrongly reports it is not IDLE. The driver checks that before submitting work on behalf of the driver during initialization, so a false report might cause the driver to fail during device initialization. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	Revert "usb: dwc3: exynos: Add support for Exynos5422 suspend clk"	Anand Moon
	This reverts commit 07f6842341abe978e6375078f84506ec3280ece5. The SCLK_SCLK_USBD300 suspend clock needs to be configured for the PHY module; I wrongly mapped this clock to the DWC3 code. Cc: Felipe Balbi <balbi@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Anand Moon <linux.amoon@gmail.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	usb: gadget: udc: Potential Oops in error handling code	Dan Carpenter
	If this is in "transceiver" mode, the ->qwork isn't required and is a NULL pointer. This can lead to a NULL dereference when we call destroy_workqueue(udc->qwork). Fixes: 3517c31a8ece ("usb: gadget: mv_udc: use devm_xxx for probe") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	usb: phy: tegra: Fix unnecessary check in tegra_usb_phy_probe()	Tang Bin
	In tegra_usb_phy_probe(), if usb_add_phy_dev() fails, its return value is assigned to err, and if usb_add_phy_dev() succeeds, the return value is zero. Thus the repeated check here is unnecessary. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	usb: dwc3: pci: Fix reference count leak in dwc3_pci_resume_work	Aditya Pakki
	dwc3_pci_resume_work() calls pm_runtime_get_sync() that increments the reference counter. In case of failure, decrement the reference before returning. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Felipe Balbi <balbi@kernel.org>
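The leak pattern can be modelled in user space with a fake usage counter. All names below (`fake_dev`, `fake_get_sync`, `fake_put_noidle`, `resume_work`) are hypothetical stand-ins; the one behaviour borrowed from the kernel is that pm_runtime_get_sync() increments the usage counter even when it returns an error.

```c
#include <assert.h>

/* Minimal stand-in for a device with a runtime-PM usage counter. */
struct fake_dev {
	int usage_count;
	int resume_error;	/* error the fake resume reports, 0 for success */
};

/* Models the get_sync behaviour: the count is bumped even on failure. */
int fake_get_sync(struct fake_dev *dev)
{
	dev->usage_count++;
	return dev->resume_error;
}

/* Drops one reference without triggering any idle handling. */
void fake_put_noidle(struct fake_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Resume work that balances the reference on every path. */
int resume_work(struct fake_dev *dev)
{
	int ret = fake_get_sync(dev);

	if (ret < 0) {
		fake_put_noidle(dev);	/* drop the reference taken above */
		return ret;
	}
	/* ... the real work would go here ... */
	fake_put_noidle(dev);
	return 0;
}
```

If the error branch returned without the put, the counter would stay at 1 after a failed resume and the device could never runtime-suspend again.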
2020-06-24	usb: cdns3: ep0: add spinlock for cdns3_check_new_setup	Peter Chen
	Another thread may access other endpoints while cdns3_check_new_setup is running; add a spinlock to protect it. Cc: <stable@vger.kernel.org> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Pawel Laszczak <pawell@cadence.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	usb: cdns3: trace: using correct dir value	Peter Chen
	Use the correct direction value from the register instead of depending on the previous software setting. This fixes the wrong EP number in the trace when the TRBERR interrupt occurs for EP0IN. When the EP0IN IOC has finished, software prepares the setup packet request and the expected direction is OUT, but at that time the TRBERR for EP0IN may still occur: in DMULT mode, the DMA does not stop until it hits TRBERR. Cc: <stable@vger.kernel.org> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Pawel Laszczak <pawell@cadence.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	usb: cdns3: ep0: fix the test mode set incorrectly	Peter Chen
	'tmode' is ctrl->wIndex; change it to the real test mode value before assigning it to the register. Cc: <stable@vger.kernel.org> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver") Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
2020-06-24	soc: imx8m: fix build warning	Peng Fan
	Fix the build warning with x86_64-randconfig >> drivers/soc/imx/soc-imx8m.c:150:34: warning: unused variable >> 'imx8_soc_match' [-Wunused-const-variable] static const struct of_device_id imx8_soc_match[] = { ^ Fixes: fc40200ebf82 ("soc: imx: increase build coverage for imx8m soc driver") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-06-24	habanalabs: Correct handling when failing to enqueue CB	Ofir Bitton
	The fence release flow is different if the CS was never submitted. In that case, we don't have an hw_sob object attached that we need to "put". If the CS was aborted, however, we do need to "put" the hw_sob. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	habanalabs: increase GAUDI QMAN ARB WDT timeout	Oded Gabbay
	The current timeout is too low for some of the workloads and we see false errors as a result. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	habanalabs: rename mmu_write() to mmu_asid_va_write()	Oded Gabbay
	The function name conflicts with a static inline function in arch/m68k/include/asm/mcfmmu.h Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	habanalabs: use PI in MMU cache invalidation	Omer Shpigelman
	The PS flow for MMU cache invalidation caused timeouts in stress tests. Use PS + PI flow so no timeouts should happen whatsoever. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	habanalabs: block scalar load_and_exe on external queue	Oded Gabbay
	In Gaudi, the user can't execute scalar load_and_exe on external queue because it can be a security hole. The driver doesn't parse the commands being loaded and it can be msg_prot, which the user isn't allowed to use. Reviewed-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-06-24	scsi: mptscsih: Fix read sense data size	Tomas Henzl
	The sense data buffer in sense_buf_pool is allocated with size of MPT_SENSE_BUFFER_ALLOC(64) (multiplied by req_depth) while SNS_LEN(sc)(96) is used when reading the data. That may lead to a read from unallocated area, sometimes from another (unallocated) page. To fix this, limit the read size to MPT_SENSE_BUFFER_ALLOC. Link: https://lore.kernel.org/r/20200616150446.4840-1-thenzl@redhat.com Co-developed-by: Stanislav Saner <ssaner@redhat.com> Signed-off-by: Stanislav Saner <ssaner@redhat.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
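A minimal sketch of the size clamp described above. The 64 and 96 byte sizes come from the commit message; `copy_sense` and `SNS_LEN_BYTES` are hypothetical names, and the real driver's copy path is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MPT_SENSE_BUFFER_ALLOC 64u	/* per-request allocation size */
#define SNS_LEN_BYTES          96u	/* hypothetical name for the reader's size */

/*
 * Copy sense data without reading past the source allocation.
 * src must point to at least MPT_SENSE_BUFFER_ALLOC bytes.
 * Returns the number of bytes copied.
 */
size_t copy_sense(unsigned char *dst, size_t dst_len, const unsigned char *src)
{
	size_t n = dst_len < MPT_SENSE_BUFFER_ALLOC ? dst_len
						    : MPT_SENSE_BUFFER_ALLOC;

	memcpy(dst, src, n);
	return n;
}
```

With a 96-byte destination only the first 64 bytes are filled, so the read never leaves the 64-byte source slot.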
2020-06-24	scsi: zfcp: Fix panic on ERP timeout for previously dismissed ERP action	Steffen Maier
	Suppose that, for unrelated reasons, FSF requests on behalf of recovery are very slow and can run into the ERP timeout. In the case at hand, we did adapter recovery to a large degree. However due to the slowness a LUN open is pending so the corresponding fc_rport remains blocked. After fast_io_fail_tmo we trigger close physical port recovery for the port under which the LUN should have been opened. The new higher order port recovery dismisses the pending LUN open ERP action and dismisses the pending LUN open FSF request. Such dismissal decouples the ERP action from the pending corresponding FSF request by setting zfcp_fsf_req->erp_action to NULL (among other things) [zfcp_erp_strategy_check_fsfreq()]. If now the ERP timeout for the pending open LUN request runs out, we must not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a problem since v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"). Before that we intentionally only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Note: The lifetime of the corresponding zfcp_fsf_req object continues until a (late) response or an (unrelated) adapter recovery. Just like the regular response path ignores dismissed requests [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the ERP timeout handler now needs to ignore dismissed requests. So simply return early in the ERP timeout handler if the FSF request is marked as dismissed in its status flags. To protect against the race where zfcp_erp_strategy_check_fsfreq() dismisses and sets zfcp_fsf_req->erp_action to NULL after our previous status flag check, return early if zfcp_fsf_req->erp_action is NULL. After all, the former ERP action does not need to be woken up as that was already done as part of the dismissal above [zfcp_erp_action_dismiss()]. 
This fixes the following panic due to kernel page fault in IRQ context: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP Modules linked in: ... CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ... Hardware name: IBM 8561 T01 701 (LPAR) Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000 000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02 0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000 0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8 Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0 001fffff80549bd8: 4120a0f0 la %r2,240(%r10) #001fffff80549bdc: a53e0003 llilh %r3,3 >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2) 001fffff80549be4: a7740037 brc 7,1fffff80549c52 001fffff80549be8: e320b0180004 lg %r2,24(%r11) 001fffff80549bee: e31020e00004 lg %r1,224(%r2) 001fffff80549bf4: 412020e0 la %r2,224(%r2) Call Trace: [<001fffff80549be0>] zfcp_erp_notify+0x40/0xc0 [zfcp] [<00000985915e26f0>] call_timer_fn+0x38/0x190 [<00000985915e2944>] expire_timers+0xfc/0x190 [<00000985915e2ac4>] run_timer_softirq+0xec/0x218 [<0000098591ca7c4c>] __do_softirq+0x144/0x398 [<00000985915110aa>] do_softirq_own_stack+0x72/0x88 [<0000098591551b58>] irq_exit+0xb0/0xb8 [<0000098591510c6a>] do_IRQ+0x82/0xb0 [<0000098591ca7140>] ext_int_handler+0x128/0x12c [<0000098591722d98>] clear_subpage.constprop.13+0x38/0x60 ([<000009859172ae4c>] clear_huge_page+0xec/0x250) [<000009859177e7a2>] do_huge_pmd_anonymous_page+0x32a/0x768 [<000009859172a712>] __handle_mm_fault+0x88a/0x900 [<000009859172a860>] handle_mm_fault+0xd8/0x1b0 
[<0000098591529ef6>] do_dat_exception+0x136/0x3e8 [<0000098591ca6d34>] pgm_check_handler+0x1c8/0x220 Last Breaking-Event-Address: [<001fffff80549c88>] zfcp_erp_timeout_handler+0x10/0x18 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
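The two early returns described in the zfcp message above can be sketched in plain userspace C. This is a simplified model under stated assumptions, not the driver code: `struct fsf_req`, `struct erp_action`, `REQ_STATUS_DISMISSED`, `erp_notify()` and `erp_timeout_handler()` are hypothetical stand-ins for the zfcp types and functions, and the sketch is single-threaded, so it shows only the order of the checks and not the real concurrency handling.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "dismissed" status flag of an FSF request. */
#define REQ_STATUS_DISMISSED 0x1u

/* Hypothetical stand-in for an ERP action. */
struct erp_action {
	int notified;	/* counts wake-ups delivered to this action */
};

/* Hypothetical stand-in for an FSF request. */
struct fsf_req {
	unsigned int status;
	struct erp_action *erp_action;	/* set to NULL when dismissed */
};

/* Models waking the ERP action because its request timed out. */
void erp_notify(struct erp_action *act)
{
	act->notified++;
}

/*
 * Models the timeout handler taking the request as context.
 * Returns true if the ERP action was notified, false if the
 * timeout was ignored.
 */
bool erp_timeout_handler(struct fsf_req *req)
{
	struct erp_action *act;

	/* First check: a dismissed request no longer owns an ERP action. */
	if (req->status & REQ_STATUS_DISMISSED)
		return false;

	/*
	 * Second check: the dismissal may have happened after the flag
	 * test above, so read the pointer once and test it before use.
	 */
	act = req->erp_action;
	if (act == NULL)
		return false;

	erp_notify(act);
	return true;
}
```

In this model, ignoring the timeout loses nothing: as the message notes, the former ERP action was already woken up as part of the dismissal.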
2020-06-23	scsi: lpfc: Avoid another null dereference in lpfc_sli4_hba_unset()	SeongJae Park
	Commit cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu") introduced static checker warnings for potential null dereferences in 'lpfc_sli4_hba_unset()', and commit 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset") tried to fix them. However, one more potential null dereference remains. This commit fixes it. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Link: https://lore.kernel.org/r/20200623084122.30633-1-sjpark@amazon.com Fixes: 1ffdd2c0440d ("scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset") Fixes: cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu") Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-23	Merge branch 'Two-phylink-pause-fixes'	David S. Miller
	Russell King says: ==================== Two phylink pause fixes While testing, I discovered two issues with ethtool -A with phylink. First, if there is a PHY bound to the network device, we hit a deadlock when phylib tries to notify us of the link changing as a result of triggering a renegotiation. Second, when we are manually forcing the pause settings, and there is no renegotiation triggered, we do not update the MAC via the new mac_link_up approach. These two patches solve both problems, and will need to be backported to v5.7; they do not apply cleanly there due to the introduction of PCS in the v5.8 merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-23	net: phylink: ensure manual pause mode configuration takes effect	Russell King
	We have been relying on link events and mac_config() when the manual pause modes are changed. With recent developments, such as moving the programming of link state to mac_link_up(), this no longer works. To ensure that we update the MAC, we must generate a link-down followed by a link-up event; we can do that by setting mac_link_dropped and triggering a resolve. Fixes: 91a208f2185a ("net: phylink: propagate resolved link config via mac_link_up()") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
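Why a forced link-down/link-up cycle is needed can be illustrated with a toy model. The sketch below assumes a resolver that programs the MAC only on a down-to-up transition; `struct pl_model`, `resolve_once()`, `resolve()` and `set_manual_pause()` are hypothetical names, and only the `link_dropped` field is meant to mirror the `mac_link_dropped` flag named in the message. It is not the phylink implementation.

```c
#include <stdbool.h>

/* Toy model of the link state tracked by a resolver (hypothetical). */
struct pl_model {
	bool phy_link;		/* link state reported by the PHY */
	bool mac_up;		/* what the MAC was last told */
	bool link_dropped;	/* models mac_link_dropped */
	int pause_cfg;		/* requested manual pause mode */
	int mac_pause;		/* pause mode programmed into the MAC */
	int link_up_calls;	/* number of modelled link-up calls */
};

/* One resolver pass; returns true if another pass is required. */
bool resolve_once(struct pl_model *p)
{
	bool want_up = p->phy_link;
	bool rerun = false;

	if (p->link_dropped) {
		/* Report the link as down once, then resolve again. */
		p->link_dropped = false;
		rerun = want_up;
		want_up = false;
	}

	if (want_up != p->mac_up) {
		if (want_up) {
			/* The MAC is programmed only on link-up. */
			p->mac_pause = p->pause_cfg;
			p->link_up_calls++;
		}
		p->mac_up = want_up;
	}
	return rerun;
}

/* Run resolver passes until the state is stable. */
void resolve(struct pl_model *p)
{
	while (resolve_once(p))
		;
}

/* Change the manual pause mode, optionally forcing a down/up cycle. */
void set_manual_pause(struct pl_model *p, int pause, bool force_cycle)
{
	p->pause_cfg = pause;
	if (force_cycle)
		p->link_dropped = true;
	resolve(p);
}
```

With `force_cycle` false and the link already up, the resolver sees no transition and the MAC keeps its old pause mode. With `force_cycle` true, the modelled link goes down and up again, so the new pause mode reaches the MAC.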
2020-06-23	net: phylink: fix ethtool -A with attached PHYs	Russell King
	Fix a deadlock in phylink's ethtool set_pauseparam support caused by phylib interacting with phylink: we must not hold the state lock while calling phylib functions that may call into phylink_phy_change(). Fixes: f904f15ea9b5 ("net: phylink: allow ethtool -A to change flow control advertisement") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
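The locking rule in the message above (release the state lock before calling into code that may call back and take the same lock) can be shown with a toy model. The sketch is an assumption-laden illustration, not phylink code: the lock is a plain flag that records a re-entrant acquisition instead of blocking, and `struct link`, `state_lock()`, `link_phy_change()`, `phy_set_pause()` and both `set_pauseparam*()` variants are hypothetical names.

```c
#include <stdbool.h>

/* Toy model of a link object guarded by a state lock (hypothetical). */
struct link {
	bool locked;		/* true while the modelled state lock is held */
	bool deadlocked;	/* set when the lock is taken while already held */
	int pause;		/* configured pause mode */
	int changes;		/* change callbacks processed */
};

void state_lock(struct link *l)
{
	if (l->locked)
		l->deadlocked = true;	/* a real mutex would block forever here */
	l->locked = true;
}

void state_unlock(struct link *l)
{
	l->locked = false;
}

/* Callback invoked by the PHY layer; it takes the state lock itself. */
void link_phy_change(struct link *l)
{
	state_lock(l);
	l->changes++;
	state_unlock(l);
}

/* Models a PHY library call that may synchronously call back. */
void phy_set_pause(struct link *l)
{
	link_phy_change(l);
}

/* Problematic ordering: the callback runs while the lock is still held. */
void set_pauseparam_lock_held(struct link *l, int pause)
{
	state_lock(l);
	l->pause = pause;
	phy_set_pause(l);
	state_unlock(l);
}

/* Safe ordering: update state under the lock, drop it, then call out. */
void set_pauseparam(struct link *l, int pause)
{
	state_lock(l);
	l->pause = pause;
	state_unlock(l);
	phy_set_pause(l);
}
```

Calling `set_pauseparam_lock_held()` sets `deadlocked`, while `set_pauseparam()` completes with the callback processed exactly once.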
2020-06-23	scsi: libata: Fix the ata_scsi_dma_need_drain stub	Christoph Hellwig
	We not only need the stub when libata is disabled, but also if it is modular and there are built-in SAS drivers (which can happen when SCSI_SAS_ATA is disabled). Link: https://lore.kernel.org/r/20200620071302.462974-2-hch@lst.de Fixes: b8f1d1e05817 ("scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-23	scsi: qla2xxx: Keep initiator ports after RSCN	Roman Bolshakov
	The driver performs SCR (state change registration) in all modes, including pure target mode. For each RSCN, the scan_needed flag is set in qla2x00_handle_rscn() for the port mentioned in the RSCN and a fabric rescan is scheduled. During the rescan, the GNN_FT handler, qla24xx_async_gnnft_done(), deletes the session of the port that caused the RSCN. In target mode, the session deletion affects the ATIO handler, qlt_24xx_atio_pkt(). The target responds with SAM STATUS BUSY to I/O incoming from the deleted session. qlt_handle_cmd_for_atio() and qlt_handle_task_mgmt() return -EFAULT if they cannot find the session of the command/TMF, which results in the invocation of qlt_send_busy(): qlt_24xx_atio_pkt_all_vps: qla_target(0): type 6 ox_id 0014 qla_target(0): Unable to send command to target, sending BUSY status Such a response causes a command timeout on the initiator. An error handler thread is spawned on the initiator to abort the commands: scsi 23:0:0:0: tag#0 abort scheduled scsi 23:0:0:0: tag#0 aborting command qla2xxx [0000:af:00.0]-188c:23: Entered qla24xx_abort_command. qla2xxx [0000:af:00.0]-801c:23: Abort command issued nexus=23:0:0 -- 0 2003. The command abort is rejected by the target and fails (2003). The error handler then tries DEVICE RESET and TARGET RESET, but they are also doomed to fail because TMFs are ignored for the deleted sessions. The initiator then issues a BUS RESET that resets the link via qla2x00_full_login_lip(). The BUS RESET succeeds and brings the initiator port up; the SAN switch detects that and sends an RSCN to the target port, and the sequence fails again in the same way as described above. It never leaves the loop. The change breaks the RSCN loop by keeping initiator sessions mentioned in the RSCN payload in all modes, including dual and pure target mode. 
Link: https://lore.kernel.org/r/20200605144435.27023-1-r.bolshakov@yadro.com Fixes: 2037ce49d30a ("scsi: qla2xxx: Fix stale session") Cc: Quinn Tran <qutran@marvell.com> Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Daniel Wagner <dwagner@suse.de> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Martin Wilck <mwilck@suse.com> Cc: stable@vger.kernel.org # v5.4+ Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Shyam Sundar <ssundar@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>