summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-08-17Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command"Saurav Kashyap
FCoE adapter initialization failed for ISP8021 with the following patch applied. In addition, reproduction of the issue the patch originally tried to address has been unsuccessful. This reverts commit 3cb182b3fa8b7a61f05c671525494697cba39c6a. Link: https://lore.kernel.org/r/20200806111014.28434-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Fix null pointer access during disconnect from subsystemQuinn Tran
NVMEAsync command is being submitted to QLA while the same NVMe controller is in the middle of reset. The reset path has deleted the association and freed aen_op->fcp_req.private. Add a check for this private pointer before issuing the command. ... 6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e [exception RIP: qla_nvme_post_cmd+394] RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206 RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161 RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220 RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002 R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0 R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc] 8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc] 9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core] 10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437 11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef 12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402 13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255 -- PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451" 0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3 1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8 2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed 3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf 4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5 5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core] 6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc] 7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437 8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50 9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402 10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255 Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Check if FW supports MQ before enablingSaurav Kashyap
OS boot during Boot from SAN was stuck at dracut emergency shell after enabling NVMe driver parameter. For non-MQ support the driver was enabling MQ. Add a check to confirm if FW supports MQ. Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Fix WARN_ON in qla_nvme_register_hbaArun Easi
qla_nvme_register_hba() puts out a warning when there are not enough queue pairs available for FC-NVME. Just fail the NVME registration rather than a WARNING + call Trace. Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Allow ql2xextended_error_logging special value 1 to be set ↵Arun Easi
anytime ql2xextended_error_logging can now be set to 1 to get the default mask value, as opposed to at module load time only. Link: https://lore.kernel.org/r/20200806111014.28434-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Reduce noisy debug messageQuinn Tran
Update debug level and message for ELS IOCB done. Link: https://lore.kernel.org/r/20200806111014.28434-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Fix login timeoutQuinn Tran
Multipath errors were seen during failback due to login timeout. The remote device sent LOGO, the local host tore down the session and did relogin. The RSCN arrived indicates remote device is going through failover after which the relogin is in a 20s timeout phase. At this point the driver is stuck in the relogin process. Add a fix to delete the session as part of abort/flush the login. Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Indicate correct supported speeds for Mezz cardQuinn Tran
Correct the supported speeds for 16G Mezz card. Link: https://lore.kernel.org/r/20200806111014.28434-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Flush I/O on zone disableQuinn Tran
Perform implicit logout to flush I/O on zone disable. Link: https://lore.kernel.org/r/20200806111014.28434-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Flush all sessions on zone disableQuinn Tran
On Zone Disable, certain switches would ignore all commands. This causes timeout for both switch scan command and abort of that command. On detection of this condition, all sessions will be shutdown. Link: https://lore.kernel.org/r/20200806111014.28434-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: qla2xxx: Use MBX_TOV_SECONDS for mailbox command timeout valuesEnzo Matsumiya
Improves readability of qla_mbx.c. Link: https://lore.kernel.org/r/20200805200546.22497-1-ematsumiya@suse.de Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: scsi_debug: Fix scp is NULL errorsDouglas Gilbert
John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The problem was tracked down to a missing critical section on a "short circuit" path. Namely, the time to process the current command so far has already exceeded the requested command duration (i.e. the number of nanoseconds in the ndelay parameter). The random=1 parameter setting was pivotal in finding this error. The failure scenario involved first taking that "short circuit" path (due to a very short command duration) and then taking the more likely hrtimer_start() path (due to a longer command duration). With random=1 each command's duration is taken from the uniformly distributed [0..ndelay) interval. The fio utility also helped by reliably generating the error scenario at about once per minute on a RPi 4 (64 bit OS). Link: https://lore.kernel.org/r/20200813155738.109298-1-dgilbert@interlog.com Reported-by: John Garry <john.garry@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: zfcp: Fix use-after-free in request timeout handlersSteffen Maier
Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use timer_setup()"), we intentionally only passed zfcp_adapter as context argument to zfcp_fsf_request_timeout_handler(). Since we only trigger adapter recovery, it was unnecessary to sync against races between timeout and (late) completion. Likewise, we only passed zfcp_erp_action as context argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action, it was unnecessary to sync against races between timeout and (late) completion. Meanwhile the timeout handlers get timer_list as context argument and do a timer-specific container-of to zfcp_fsf_req which can have been freed. Fix it by making sure that any request timeout handlers, that might just have started before del_timer(), are completed by using del_timer_sync() instead. This ensures the request free happens afterwards. Space time diagram of potential use-after-free: Basic idea is to have 2 or more pending requests whose timeouts run out at almost the same time. req 1 timeout ERP thread req 2 timeout ---------------- ---------------- --------------------------------------- zfcp_fsf_request_timeout_handler fsf_req = from_timer(fsf_req, t, timer) adapter = fsf_req->adapter zfcp_qdio_siosl(adapter) zfcp_erp_adapter_reopen(adapter,...) zfcp_erp_strategy ... zfcp_fsf_req_dismiss_all list_for_each_entry_safe zfcp_fsf_req_complete 1 del_timer 1 zfcp_fsf_req_free 1 zfcp_fsf_req_complete 2 zfcp_fsf_request_timeout_handler del_timer 2 fsf_req = from_timer(fsf_req, t, timer) zfcp_fsf_req_free 2 adapter = fsf_req->adapter ^^^^^^^ already freed Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()") Cc: <stable@vger.kernel.org> #4.15+ Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: No need to send Abort Task if the task in DB was clearedBean Huo
If the bit corresponding to a task in the Doorbell register has been cleared, no need to poll the status of the task on the device side and to send an Abort Task TM. Instead, let it directly goto cleanup. In addition, to keep original debug output, move the goto below the debug print. Link: https://lore.kernel.org/r/20200811141859.27399-3-huobean@gmail.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Clean up completed request without interrupt notificationStanley Chu
If somehow no interrupt notification is raised for a completed request and its doorbell bit is cleared by host, UFS driver needs to cleanup its outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally in the following scenario: After ufshcd_abort() returns, this request will be requeued by SCSI layer with its outstanding bit set. Any future completed request will trigger ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At this time the "abnormal outstanding bit" will be detected and the "requeued request" will be chosen to execute request post-processing flow. This is wrong because this request is still "alive". Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com Reviewed-by: Can Guo <cang@codeaurora.org> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Improve interrupt handling for shared interruptsAdrian Hunter
For shared interrupts, the interrupt status might be zero, so check that first. Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Fix interrupt error message for shared interruptsAdrian Hunter
The interrupt might be shared, in which case it is not an error for the interrupt handler to be called when the interrupt status is zero, so don't print the message unless there was enabled interrupt status. Link: https://lore.kernel.org/r/20200811133936.19171-1-adrian.hunter@intel.com Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs-pci: Add quirk for broken auto-hibernate for Intel EHLAdrian Hunter
Intel EHL UFS host controller advertises auto-hibernate capability but it does not work correctly. Add a quirk for that. [mkp: checkpatch fix] Link: https://lore.kernel.org/r/20200810141024.28859-1-adrian.hunter@intel.com Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL") Acked-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs-mediatek: Fix incorrect time to wait link statusStanley Chu
Fix incorrect calculation of "ms" based waiting time in function ufs_mtk_setup_clocks(). Link: https://lore.kernel.org/r/20200809055702.20140-1-stanley.chu@mediatek.com Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Fix possible infinite loop in ufshcd_holdStanley Chu
In ufshcd_suspend(), after clk-gating is suspended and link is set as Hibern8 state, ufshcd_hold() is still possibly invoked before ufshcd_suspend() returns. For example, MediaTek's suspend vops may issue UIC commands which would call ufshcd_hold() during the command issuing flow. Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled, then ufshcd_hold() may enter infinite loops because there is no clk-ungating work scheduled or pending. In this case, ufshcd_hold() shall just bypass, and keep the link as Hibern8 state. Link: https://lore.kernel.org/r/20200809050734.18740-1-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Co-developed-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: fcoe: Fix I/O path allocationMike Christie
ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of GFP_KERNEL. Link: https://lore.kernel.org/r/1596831813-9839-1-git-send-email-michael.christie@oracle.com cc: Hannes Reinecke <hare@suse.de> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: ti-j721e-ufs: Fix error return in ti_j721e_ufs_probe()Jing Xiangfeng
Fix to return error code PTR_ERR() from the error handling case instead of 0. Link: https://lore.kernel.org/r/20200806070135.67797-1-jingxiangfeng@huawei.com Fixes: 22617e216331 ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes") Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Fix a race condition between error handler and runtime PM opsCan Guo
The current IRQ handler blocks SCSI requests before scheduling eh_work, when error handler calls pm_runtime_get_sync, if ufshcd_suspend/resume sends a SCSI cmd, most likely the SSU cmd, since SCSI requests are blocked, pm_runtime_get_sync() will never return because ufshcd_suspend/resume is blocked by the SCSI cmd. - In queuecommand path, hba->ufshcd_state check and ufshcd_send_command should stay under the same spin lock. This is to make sure that no more commands leak into doorbell after hba->ufshcd_state is changed. - Don't block SCSI requests before error handler starts to run, let error handler block SCSI requests when it is ready to start error recovery. - Don't let SCSI layer keep requeuing the SCSI cmds sent from HBA runtime PM ops, let them pass or fail them. Let them pass if eh_work is scheduled due to non-fatal errors. Fail them if eh_work is scheduled due to fatal errors, otherwise the cmds may eventually time out since UFS is in bad state, which gets error handler blocked for too long. If we fail the SCSI cmds sent from HBA runtime PM ops, HBA runtime PM ops fails too, but it does not hurt since error handler can recover HBA runtime PM error. Link: https://lore.kernel.org/r/1596975355-39813-9-git-send-email-cang@codeaurora.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Move dumps in IRQ handler to error handlerCan Guo
Performing dumps in the IRQ handler causes system stability issues. Move dumps to the error handler and only print basic host registers here. Link: https://lore.kernel.org/r/1596975355-39813-8-git-send-email-cang@codeaurora.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Recover HBA runtime PM error in error handlerCan Guo
The current error handler can not recover HBA runtime PM error if ufshcd_suspend/resume has failed due to UFS errors, e.g. hibern8 enter/exit error or SSU cmd error. When this happens, error handler may fail performing a full reset and restore because error handler always assumes that power, IRQs and clocks are ready after pm_runtime_get_sync returns, but actually they are not if ufshcd_resume fails[1]. If ufschd_suspend/resume fails due to UFS errors, runtime PM framework saves the error value to dev.power.runtime_error. After that, HBA dev runtime suspend/resume would not be invoked anymore unless runtime_error is cleared[2]. In case of ufshcd_suspend/resume fails due to UFS errors, for scenario [1], error handler cannot assume anything of pm_runtime_get_sync, meaning error handler should explicitly turn ON powers, IRQs and clocks again. To get the HBA runtime PM work as regard for scenario [2], error handler can clear the runtime_error by calling pm_runtime_set_active() if full reset and restore succeeds. And, more important, if pm_runtime_set_active() returns no error, which means runtime_error has been cleared, we also need to resume those scsi devices under HBA in case any of them has failed to be resumed due to HBA runtime resume failure. This is to unblock blk_queue_enter in case there are bios waiting inside it. Link: https://lore.kernel.org/r/1596975355-39813-7-git-send-email-cang@codeaurora.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Fix concurrency of error handler and other error recovery pathsCan Guo
Error recovery can be invoked from multiple code paths, including hibern8 enter/exit (from ufshcd_link_recovery), ufshcd_eh_host_reset_handler() and eh_work scheduled from IRQ context. Ultimately, these paths are all trying to invoke ufshcd_reset_and_restore() in either a synchronous or asynchronous manner. This causes problems: - If link recovery happens during ungate work, ufshcd_hold() would be called recursively. Although commit 53c12d0ef6fc ("scsi: ufs: fix error recovery after the hibern8 exit failure") fixed a deadlock due to recursive calls of ufshcd_hold() by adding a check of eh_in_progress into ufshcd_hold, this check allows eh_work to run in parallel while link recovery is running. - Similar concurrency can also happen when error recovery is invoked from ufshcd_eh_host_reset_handler and ufshcd_link_recovery. - Concurrency can even happen between eh_works. eh_work, currently queued on system_wq, is allowed to have multiple instances running in parallel, but we don't have proper protection for that. If any of above concurrency scenarios happen, error recovery would fail and lead ufs device and host into bad states. To fix the concurrency problem, this change queues eh_work on a single threaded workqueue and removes link recovery calls from the hibern8 enter/exit path. In addition, make use of eh_work in eh_host_reset_handler instead of calling ufshcd_reset_and_restore. This unifies the UFS error recovery mechanism. According to the UFSHCI JEDEC spec, hibern8 enter/exit error occurs when the link is broken. This essentially applies to any power mode change operations (since they all use PACP_PWR cmds in UniPro layer). So, if a power mode change operation (including AH8 enter/exit) fails, mark link state as UIC_LINK_BROKEN_STATE and schedule the eh_work. In this case, error handler needs to do a full reset and restore to recover the link back to active. Before the link state is recovered to active, ufshcd_uic_pwr_ctrl simply returns -ENOLINK to avoid more errors. Link: https://lore.kernel.org/r/1596975355-39813-6-git-send-email-cang@codeaurora.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Add some debug information to ufshcd_print_host_state()Can Guo
Information about the last interrupt status and timestamp is helpful when debugging system stability issues (IRQ starvation, for instance). Add this information to ufshcd_print_host_state() output. In addition, UFS device information such as model name and firmware version also comes in handy during debugging. This is printed as well. Link: https://lore.kernel.org/r/1596975355-39813-5-git-send-email-cang@codeaurora.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regsCan Guo
Dumping testbus registers outputs a lot of information and can cause stability issues. Remove the dump code. Link: https://lore.kernel.org/r/1596975355-39813-4-git-send-email-cang@codeaurora.org Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: ufs-qcom: Fix race conditions caused by ufs_qcom_testbus_config()Can Guo
If ufs_qcom_dump_dbg_regs() calls ufs_qcom_testbus_config() from ufshcd_suspend/resume and/or clk gate/ungate context, pm_runtime_get_sync() and ufshcd_hold() will cause a race condition. Fix this by removing the unnecessary calls of pm_runtime_get_sync() and ufshcd_hold(). Link: https://lore.kernel.org/r/1596975355-39813-3-git-send-email-cang@codeaurora.org Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-17scsi: ufs: Add checks before setting clk-gating statesCan Guo
Clock gating can be turned on/off selectively which means the associated state information is only correct if the feature is enabled. This change makes sure that we only look at state of clk-gating if it is enabled. Link: https://lore.kernel.org/r/1596975355-39813-2-git-send-email-cang@codeaurora.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Hongwu Su <hongwus@codeaurora.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-16Linux 5.9-rc1v5.9-rc1Linus Torvalds
2020-08-16Merge tag 'io_uring-5.9-2020-08-15' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "A few differerent things in here. Seems like syzbot got some more io_uring bits wired up, and we got a handful of reports and the associated fixes are in here. General fixes too, and a lot of them marked for stable. Lastly, a bit of fallout from the async buffered reads, where we now more easily trigger short reads. Some applications don't really like that, so the io_read() code now handles short reads internally, and got a cleanup along the way so that it's now easier to read (and documented). We're now passing tests that failed before" * tag 'io_uring-5.9-2020-08-15' of git://git.kernel.dk/linux-block: io_uring: short circuit -EAGAIN for blocking read attempt io_uring: sanitize double poll handling io_uring: internally retry short reads io_uring: retain iov_iter state over io_read/io_write calls task_work: only grab task signal lock when needed io_uring: enable lookup of links holding inflight files io_uring: fail poll arm on queue proc failure io_uring: hold 'ctx' reference around task_work queue + execute fs: RWF_NOWAIT should imply IOCB_NOIO io_uring: defer file table grabbing request cleanup for locked requests io_uring: add missing REQ_F_COMP_LOCKED for nested requests io_uring: fix recursive completion locking on oveflow flush io_uring: use TWA_SIGNAL for task_work uncondtionally io_uring: account locked memory before potential error case io_uring: set ctx sq/cq entry count earlier io_uring: Fix NULL pointer dereference in loop_rw_iter() io_uring: add comments on how the async buffered read retry works io_uring: io_async_buf_func() need not test page bit
2020-08-16parisc: fix PMD pages allocation by restoring pmd_alloc_one()Mike Rapoport
Commit 1355c31eeb7e ("asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()") converted parisc to use generic version of pmd_alloc_one() but it missed the fact that parisc uses order-1 pages for PMD. Restore the original version of pmd_alloc_one() for parisc, just use GFP_PGTABLE_KERNEL that implies __GFP_ZERO instead of GFP_KERNEL and memset. Fixes: 1355c31eeb7e ("asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()") Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Meelis Roos <mroos@linux.ee> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lkml.kernel.org/r/9f2b5ebd-e4a4-0fa1-6cd3-4b9f6892d1ad@linux.ee Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-15Merge tag 'block-5.9-2020-08-14' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A few fixes on the block side of things: - Discard granularity fix (Coly) - rnbd cleanups (Guoqing) - md error handling fix (Dan) - md sysfs fix (Junxiao) - Fix flush request accounting, which caused an IO slowdown for some configurations (Ming) - Properly propagate loop flag for partition scanning (Lennart)" * tag 'block-5.9-2020-08-14' of git://git.kernel.dk/linux-block: block: fix double account of flush request's driver tag loop: unset GENHD_FL_NO_PART_SCAN on LOOP_CONFIGURE rnbd: no need to set bi_end_io in rnbd_bio_map_kern rnbd: remove rnbd_dev_submit_io md-cluster: Fix potential error pointer dereference in resize_bitmaps() block: check queue's limits.discard_granularity in __blkdev_issue_discard() md: get sysfs entry after redundancy attr group create
2020-08-15Merge tag 'riscv-for-linus-5.9-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Palmer Dabbelt: "I collected a single fix during the merge window: we managed to break the early trap setup on !MMU, this fixes it" * tag 'riscv-for-linus-5.9-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Setup exception vector for nommu platform
2020-08-15Merge tag 'sh-for-5.9' of git://git.libc.org/linux-shLinus Torvalds
Pull arch/sh updates from Rich Felker: "Cleanup, SECCOMP_FILTER support, message printing fixes, and other changes to arch/sh" * tag 'sh-for-5.9' of git://git.libc.org/linux-sh: (34 commits) sh: landisk: Add missing initialization of sh_io_port_base sh: bring syscall_set_return_value in line with other architectures sh: Add SECCOMP_FILTER sh: Rearrange blocks in entry-common.S sh: switch to copy_thread_tls() sh: use the generic dma coherent remap allocator sh: don't allow non-coherent DMA for NOMMU dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig sh: unexport register_trapped_io and match_trapped_io_handler sh: don't include <asm/io_trapped.h> in <asm/io.h> sh: move the ioremap implementation out of line sh: move ioremap_fixed details out of <asm/io.h> sh: remove __KERNEL__ ifdefs from non-UAPI headers sh: sort the selects for SUPERH alphabetically sh: remove -Werror from Makefiles sh: Replace HTTP links with HTTPS ones arch/sh/configs: remove obsolete CONFIG_SOC_CAMERA* sh: stacktrace: Remove stacktrace_ops.stack() sh: machvec: Modernize printing of kernel messages sh: pci: Modernize printing of kernel messages ...
2020-08-15io_uring: short circuit -EAGAIN for blocking read attemptJens Axboe
One case was missed in the short IO retry handling, and that's hitting -EAGAIN on a blocking attempt read (eg from io-wq context). This is a problem on sockets that are marked as non-blocking when created, they don't carry any REQ_F_NOWAIT information to help us terminate them instead of perpetually retrying. Fixes: 227c0c9673d8 ("io_uring: internally retry short reads") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-15io_uring: sanitize double poll handlingJens Axboe
There's a bit of confusion on the matching pairs of poll vs double poll, depending on if the request is a pure poll (IORING_OP_POLL_ADD) or poll driven retry. Add io_poll_get_double() that returns the double poll waitqueue, if any, and io_poll_get_single() that returns the original poll waitqueue. With that, remove the argument to io_poll_remove_double(). Finally ensure that wait->private is cleared once the double poll handler has run, so that remove knows it's already been seen. Cc: stable@vger.kernel.org # v5.8 Reported-by: syzbot+7f617d4a9369028b8a2c@syzkaller.appspotmail.com Fixes: 18bceab101ad ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-08-15Merge tag 'perf-tools-2020-08-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: "Fixes: - Fixes for 'perf bench numa'. - Always memset source before memcpy in 'perf bench mem'. - Quote CC and CXX for their arguments to fix build in environments using those variables to pass more than just the compiler names. - Fix module symbol processing, addressing regression detected via "perf test". - Allow multiple probes in record+script_probe_vfs_getname.sh 'perf test' entry. Improvements: - Add script to autogenerate socket family name id->string table from copy of kernel header, used so far in 'perf trace'. - 'perf ftrace' improvements to provide similar options for this utility so that one can go from 'perf record', 'perf trace', etc to 'perf ftrace' just by changing the name of the subcommand. - Prefer new "sched:sched_waking" trace event when it exists in 'perf sched' post processing. - Update POWER9 metrics to utilize other metrics. - Fall back to querying debuginfod if debuginfo not found locally. Miscellaneous: - Sync various kvm headers with kernel sources" * tag 'perf-tools-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (40 commits) perf ftrace: Make option description initials all capital letters perf build-ids: Fall back to debuginfod query if debuginfo not found perf bench numa: Remove dead code in parse_nodes_opt() perf stat: Update POWER9 metrics to utilize other metrics perf ftrace: Add change log perf: ftrace: Add set_tracing_options() to set all trace options perf ftrace: Add option --tid to filter by thread id perf ftrace: Add option -D/--delay to delay tracing perf: ftrace: Allow set graph depth by '--graph-opts' perf ftrace: Add support for trace option tracing_thresh perf ftrace: Add option 'verbose' to show more info for graph tracer perf ftrace: Add support for tracing option 'irq-info' perf ftrace: Add support for trace option funcgraph-irqs perf ftrace: Add support for trace option sleep-time perf ftrace: Add support for tracing option 'func_stack_trace' perf tools: Add general function to parse sublevel options perf ftrace: Add option '--inherit' to trace children processes perf ftrace: Show trace column header perf ftrace: Add option '-m/--buffer-size' to set per-cpu buffer size perf ftrace: Factor out function write_tracing_file_int() ...
2020-08-15Merge tag 'x86-urgent-2020-08-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes and small updates all around the place: - Fix mitigation state sysfs output - Fix an FPU xstate/sxave code assumption bug triggered by Architectural LBR support - Fix Lightning Mountain SoC TSC frequency enumeration bug - Fix kexec debug output - Fix kexec memory range assumption bug - Fix a boundary condition in the crash kernel code - Optimize porgatory.ro generation a bit - Enable ACRN guests to use X2APIC mode - Reduce a __text_poke() IRQs-off critical section for the benefit of PREEMPT_RT" * tag 'x86-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Acquire pte lock with interrupts enabled x86/bugs/multihit: Fix mitigation reporting when VMX is not in use x86/fpu/xstate: Fix an xstate size check warning with architectural LBRs x86/purgatory: Don't generate debug info for purgatory.ro x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC kexec_file: Correctly output debugging information for the PT_LOAD ELF header kexec: Improve & fix crash_exclude_mem_range() to handle overlapping ranges x86/crash: Correct the address boundary of function parameters x86/acrn: Remove redundant chars from ACRN signature x86/acrn: Allow ACRN guest to use X2APIC mode
2020-08-15Merge tag 'sched-urgent-2020-08-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two fixes: fix a new tracepoint's output value, and fix the formatting of show-state syslog printouts" * tag 'sched-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Fix the alignment of the show-state debug output sched: Fix use of count for nr_running tracepoint
2020-08-15Merge tag 'perf-urgent-2020-08-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes, an expansion of perf syscall access to CAP_PERFMON privileged tools, plus a RAPL HW-enablement for Intel SPR platforms" * tag 'perf-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/rapl: Add support for Intel SPR platform perf/x86/rapl: Support multiple RAPL unit quirks perf/x86/rapl: Fix missing psys sysfs attributes hw_breakpoint: Remove unused __register_perf_hw_breakpoint() declaration kprobes: Remove show_registers() function prototype perf/core: Take over CAP_SYS_PTRACE creds to CAP_PERFMON capability
2020-08-15Merge tag 'locking-urgent-2020-08-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixlets from Ingo Molnar: "A documentation fix and a 'fallthrough' macro update" * tag 'locking-urgent-2020-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Convert to use the preferred 'fallthrough' macro Documentation/locking/locktypes: Fix a typo
2020-08-15Merge tag '9p-for-5.9-rc1' of git://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: - some code cleanup - a couple of static analysis fixes - setattr: try to pick a fid associated with the file rather than the dentry, which might sometimes matter * tag '9p-for-5.9-rc1' of git://github.com/martinetd/linux: 9p: Remove unneeded cast from memory allocation 9p: remove unused code in 9p net/9p: Fix sparse endian warning in trans_fd.c 9p: Fix memory leak in v9fs_mount 9p: retrieve fid from file when file instance exist.
2020-08-15Merge tag '5.9-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Three small cifs/smb3 fixes, one for stable fixing mkdir path with the 'idsfromsid' mount option" * tag '5.9-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: SMB3: Fix mkdir when idsfromsid configured on mount cifs: Convert to use the fallthrough macro cifs: Fix an error pointer dereference in cifs_mount()
2020-08-15Merge tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Stable fixes: - pNFS: Don't return layout segments that are being used for I/O - pNFS: Don't move layout segments off the active list when being used for I/O Features: - NFS: Add support for user xattrs through the NFSv4.2 protocol - NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNC - NFSv4.0 allow nconnect for v4.0 Bugfixes and cleanups: - nfs: ensure correct writeback errors are returned on close() - nfs: nfs_file_write() should check for writeback errors - nfs: Fix getxattr kernel panic and memory overflow - NFS: Fix the pNFS/flexfiles mirrored read failover code - SUNRPC: dont update timeout value on connection reset - freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS - sunrpc: destroy rpc_inode_cachep after unregister_filesystem" * tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (32 commits) NFS: Fix flexfiles read failover fs: nfs: delete repeated words in comments rpc_pipefs: convert comma to semicolon nfs: Fix getxattr kernel panic and memory overflow NFS: Don't return layout segments that are in use NFS: Don't move layouts to plh_return_segs list while in use NFS: Add layout segment info to pnfs read/write/commit tracepoints NFS: Add tracepoints for layouterror and layoutstats. NFS: Report the stateid + status in trace_nfs4_layoutreturn_on_close() SUNRPC dont update timeout value on connection reset nfs: nfs_file_write() should check for writeback errors nfs: ensure correct writeback errors are returned on close() NFSv4.2: xattr cache: get rid of cache discard work queue NFS: remove redundant initialization of variable result NFSv4.0 allow nconnect for v4.0 freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS sunrpc: destroy rpc_inode_cachep after unregister_filesystem NFSv4.2: add client side xattr caching. NFSv4.2: hook in the user extended attribute handlers NFSv4.2: add the extended attribute proc functions. ...
2020-08-15Merge tag 'edac_updates_for_5.9_pt2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull edac fix from Tony Luck: "Fix for the ie31200 driver that missed the first pull" * tag 'edac_updates_for_5.9_pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/ie31200: Fallback if host bridge device is already initialized
2020-08-15Merge tag 'devicetree-fixes-for-5.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: "Another round of 'allOf' removals and whitespace clean-ups of schemas" * tag 'devicetree-fixes-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: Remove more cases of 'allOf' containing a '$ref' dt-bindings: Whitespace clean-ups in schema files
2020-08-15Merge tag 'acpi-5.9-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI updates from Rafael Wysocki: "Add new hardware support to the ACPI driver for AMD SoCs, the x86 clk driver and the Designware i2c driver (changes from Akshu Agrawal and Pu Wen)" * tag 'acpi-5.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: clk: x86: Support RV architecture ACPI: APD: Add a fmw property is_raven clk: x86: Change name from ST to FCH ACPI: APD: Change name from ST to FCH i2c: designware: Add device HID for Hygon I2C controller
2020-08-15Merge tag 'pm-5.9-rc1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull one more power management update from Rafael Wysocki: "Modify the intel_pstate driver to allow it to work in the passive mode with hardware-managed P-states (HWP) enabled" * tag 'pm-5.9-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Implement passive mode with HWP enabled