summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-01-24scsi: hisi_sas: Remove useless DMA-32 fallback configurationChristophe JAILLET
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/1bf2d3660178b0e6f172e5208bc0bd68d31d9268.1642237482.git.christophe.jaillet@wanadoo.fr Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: 3w-sas: Remove useless DMA-32 fallback configurationChristophe JAILLET
As stated in [1], dma_set_mask() with a 64-bit mask never fails if dev->dma_mask is non-NULL. So, if it fails, the 32-bit case will also fail for the same reason. Simplify code and remove some dead code accordingly. [1]: https://lore.kernel.org/linux-kernel/YL3vSPK5DXTNvgdx@infradead.org/#t Link: https://lore.kernel.org/r/dbbe8671ca760972d80f8d35f3170b4609bee368.1642236763.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put()John Meneghini
The bnx2fc_destroy() functions are removing the interface before calling destroy_work. This results multiple WARNings from sysfs_remove_group() as the controller rport device attributes are removed too early. Replace the fcoe_port's destroy_work queue. It's not needed. The problem is easily reproducible with the following steps. Example: $ dmesg -w & $ systemctl enable --now fcoe $ fipvlan -s -c ens2f1 $ fcoeadm -d ens2f1.802 [ 583.464488] host2: libfc: Link down on port (7500a1) [ 583.472651] bnx2fc: 7500a1 - rport not created Yet!! [ 583.490468] ------------[ cut here ]------------ [ 583.538725] sysfs group 'power' not found for kobject 'rport-2:0-0' [ 583.568814] WARNING: CPU: 3 PID: 192 at fs/sysfs/group.c:279 sysfs_remove_group+0x6f/0x80 [ 583.607130] Modules linked in: dm_service_time 8021q garp mrp stp llc bnx2fc cnic uio rpcsec_gss_krb5 auth_rpcgss nfsv4 ... [ 583.942994] CPU: 3 PID: 192 Comm: kworker/3:2 Kdump: loaded Not tainted 5.14.0-39.el9.x86_64 #1 [ 583.984105] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013 [ 584.016535] Workqueue: fc_wq_2 fc_rport_final_delete [scsi_transport_fc] [ 584.050691] RIP: 0010:sysfs_remove_group+0x6f/0x80 [ 584.074725] Code: ff 5b 48 89 ef 5d 41 5c e9 ee c0 ff ff 48 89 ef e8 f6 b8 ff ff eb d1 49 8b 14 24 48 8b 33 48 c7 c7 ... [ 584.162586] RSP: 0018:ffffb567c15afdc0 EFLAGS: 00010282 [ 584.188225] RAX: 0000000000000000 RBX: ffffffff8eec4220 RCX: 0000000000000000 [ 584.221053] RDX: ffff8c1586ce84c0 RSI: ffff8c1586cd7cc0 RDI: ffff8c1586cd7cc0 [ 584.255089] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb567c15afc00 [ 584.287954] R10: ffffb567c15afbf8 R11: ffffffff8fbe7f28 R12: ffff8c1486326400 [ 584.322356] R13: ffff8c1486326480 R14: ffff8c1483a4a000 R15: 0000000000000004 [ 584.355379] FS: 0000000000000000(0000) GS:ffff8c1586cc0000(0000) knlGS:0000000000000000 [ 584.394419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 584.421123] CR2: 00007fe95a6f7840 CR3: 0000000107674002 CR4: 00000000000606e0 [ 584.454888] Call Trace: [ 584.466108] device_del+0xb2/0x3e0 [ 584.481701] device_unregister+0x13/0x60 [ 584.501306] bsg_unregister_queue+0x5b/0x80 [ 584.522029] bsg_remove_queue+0x1c/0x40 [ 584.541884] fc_rport_final_delete+0xf3/0x1d0 [scsi_transport_fc] [ 584.573823] process_one_work+0x1e3/0x3b0 [ 584.592396] worker_thread+0x50/0x3b0 [ 584.609256] ? rescuer_thread+0x370/0x370 [ 584.628877] kthread+0x149/0x170 [ 584.643673] ? set_kthread_struct+0x40/0x40 [ 584.662909] ret_from_fork+0x22/0x30 [ 584.680002] ---[ end trace 53575ecefa942ece ]--- Link: https://lore.kernel.org/r/20220115040044.1013475-1-jmeneghi@redhat.com Fixes: 0cbf32e1681d ("[SCSI] bnx2fc: Avoid calling bnx2fc_if_destroy with unnecessary locks") Tested-by: Guangwu Zhang <guazhang@redhat.com> Co-developed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devicesSteffen Maier
Suppose we have an environment with a number of non-NPIV FCP devices (virtual HBAs / FCP devices / zfcp "adapter"s) sharing the same physical FCP channel (HBA port) and its I_T nexus. Plus a number of storage target ports zoned to such shared channel. Now one target port logs out of the fabric causing an RSCN. Zfcp reacts with an ADISC ELS and subsequent port recovery depending on the ADISC result. This happens on all such FCP devices (in different Linux images) concurrently as they all receive a copy of this RSCN. In the following we look at one of those FCP devices. Requests other than FSF_QTCB_FCP_CMND can be slow until they get a response. Depending on which requests are affected by slow responses, there are different recovery outcomes. Here we want to fix failed recoveries on port or adapter level by avoiding recovery requests that can be slow. We need the cached N_Port_ID for the remote port "link" test with ADISC. Just before sending the ADISC, we now intentionally forget the old cached N_Port_ID. The idea is that on receiving an RSCN for a port, we have to assume that any cached information about this port is stale. This forces a fresh new GID_PN [FC-GS] nameserver lookup on any subsequent recovery for the same port. Since we typically can still communicate with the nameserver efficiently, we now reach steady state quicker: Either the nameserver still does not know about the port so we stop recovery, or the nameserver already knows the port potentially with a new N_Port_ID and we can successfully and quickly perform open port recovery. For the one case, where ADISC returns successfully, we re-initialize port->d_id because that case does not involve any port recovery. This also solves a problem if the storage WWPN quickly logs into the fabric again but with a different N_Port_ID. Such as on virtual WWPN takeover during target NPIV failover. [https://www.redbooks.ibm.com/abstracts/redp5477.html] In that case the RSCN from the storage FDISC was ignored by zfcp and we could not successfully recover the failover. On some later failback on the storage, we could have been lucky if the virtual WWPN got the same old N_Port_ID from the SAN switch as we still had cached. Then the related RSCN triggered a successful port reopen recovery. However, there is no guarantee to get the same N_Port_ID on NPIV FDISC. Even though NPIV-enabled FCP devices are not affected by this problem, this code change optimizes recovery time for gone remote ports as a side effect. The timely drop of cached N_Port_IDs prevents unnecessary slow open port attempts. While the problem might have been in code before v2.6.32 commit 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") this fix depends on the gid_pn_work introduced with that commit, so we mark it as culprit to satisfy fix dependencies. Note: Point-to-point remote port is already handled separately and gets its N_Port_ID from the cached peer_d_id. So resetting port->d_id in general does not affect PtP. Link: https://lore.kernel.org/r/20220118165803.3667947-1-maier@linux.ibm.com Fixes: 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") Cc: <stable@vger.kernel.org> #2.6.32+ Suggested-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: pm8001: Fix bogus FW crash for maxcpus=1John Garry
According to the comment in check_fw_ready() we should not check the IOP1_READY field in register SCRATCH_PAD_1 for 8008 or 8009 controllers. However we check this very field in process_oq() for processing the highest index interrupt vector. The highest interrupt vector is checked as the FW is programmed to signal fatal errors through this irq. Change that function to not check IOP1_READY for those mentioned controllers, but do check ILA_READY in both cases. The reason I assume that this was not hit earlier was because we always allocated 64 MSI(X), and just did not pass the vector index check in process_oq(), i.e. the handler never ran for vector index 63. Link: https://lore.kernel.org/r/1642508105-95432-1-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: qedf: Change context reset messages to ratelimitedSaurav Kashyap
If FCoE is not configured, libfc/libfcoe keeps on retrying FLOGI and after 3 retries driver does a context reset and tries fipvlan again. This leads to context reset message flooding the logs. Hence ratelimit the message to prevent flooding the logs. Link: https://lore.kernel.org/r/20220117135311.6256-4-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: qedf: Fix refcount issue when LOGO is received during TMFSaurav Kashyap
Hung task call trace was seen during LOGO processing. [ 974.309060] [0000:00:00.0]:[qedf_eh_device_reset:868]: 1:0:2:0: LUN RESET Issued... [ 974.309065] [0000:00:00.0]:[qedf_initiate_tmf:2422]: tm_flags 0x10 sc_cmd 00000000c16b930f op = 0x2a target_id = 0x2 lun=0 [ 974.309178] [0000:00:00.0]:[qedf_initiate_tmf:2431]: portid=016900 tm_flags =LUN RESET [ 974.309222] [0000:00:00.0]:[qedf_initiate_tmf:2438]: orig io_req = 00000000ec78df8f xid = 0x180 ref_cnt = 1. [ 974.309625] host1: rport 016900: Received LOGO request while in state Ready [ 974.309627] host1: rport 016900: Delete port [ 974.309642] host1: rport 016900: work event 3 [ 974.309644] host1: rport 016900: lld callback ev 3 [ 974.313243] [0000:61:00.2]:[qedf_execute_tmf:2383]:1: fcport is uploading, not executing flush. [ 974.313295] [0000:61:00.2]:[qedf_execute_tmf:2400]:1: task mgmt command success... [ 984.031088] INFO: task jbd2/dm-15-8:7645 blocked for more than 120 seconds. [ 984.031136] Not tainted 4.18.0-305.el8.x86_64 #1 [ 984.031166] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 984.031209] jbd2/dm-15-8 D 0 7645 2 0x80004080 [ 984.031212] Call Trace: [ 984.031222] __schedule+0x2c4/0x700 [ 984.031230] ? unfreeze_partials.isra.83+0x16e/0x1a0 [ 984.031233] ? bit_wait_timeout+0x90/0x90 [ 984.031235] schedule+0x38/0xa0 [ 984.031238] io_schedule+0x12/0x40 [ 984.031240] bit_wait_io+0xd/0x50 [ 984.031243] __wait_on_bit+0x6c/0x80 [ 984.031248] ? free_buffer_head+0x21/0x50 [ 984.031251] out_of_line_wait_on_bit+0x91/0xb0 [ 984.031257] ? init_wait_var_entry+0x50/0x50 [ 984.031268] jbd2_journal_commit_transaction+0x112e/0x19f0 [jbd2] [ 984.031280] kjournald2+0xbd/0x270 [jbd2] [ 984.031284] ? finish_wait+0x80/0x80 [ 984.031291] ? commit_timeout+0x10/0x10 [jbd2] [ 984.031294] kthread+0x116/0x130 [ 984.031300] ? kthread_flush_work_fn+0x10/0x10 [ 984.031305] ret_from_fork+0x1f/0x40 There was a ref count issue when LOGO is received during TMF. This leads to one of the I/Os hanging with the driver. Fix the ref count. Link: https://lore.kernel.org/r/20220117135311.6256-3-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: qedf: Add stag_work to all the vportsSaurav Kashyap
Call trace seen when creating NPIV ports, only 32 out of 64 show online. stag work was not initialized for vport, hence initialize the stag work. WARNING: CPU: 8 PID: 645 at kernel/workqueue.c:1635 __queue_delayed_work+0x68/0x80 CPU: 8 PID: 645 Comm: kworker/8:1 Kdump: loaded Tainted: G IOE --------- -- 4.18.0-348.el8.x86_64 #1 Hardware name: Dell Inc. PowerEdge MX740c/0177V9, BIOS 2.12.2 07/09/2021 Workqueue: events fc_lport_timeout [libfc] RIP: 0010:__queue_delayed_work+0x68/0x80 Code: 89 b2 88 00 00 00 44 89 82 90 00 00 00 48 01 c8 48 89 42 50 41 81 f8 00 20 00 00 75 1d e9 60 24 07 00 44 89 c7 e9 98 f6 ff ff <0f> 0b eb c5 0f 0b eb a1 0f 0b eb a7 0f 0b eb ac 44 89 c6 e9 40 23 RSP: 0018:ffffae514bc3be40 EFLAGS: 00010006 RAX: ffff8d25d6143750 RBX: 0000000000000202 RCX: 0000000000000002 RDX: ffff8d2e31383748 RSI: ffff8d25c000d600 RDI: ffff8d2e31383788 RBP: ffff8d2e31380de0 R08: 0000000000002000 R09: ffff8d2e31383750 R10: ffffffffc0c957e0 R11: ffff8d2624800000 R12: ffff8d2e31380a58 R13: ffff8d2d915eb000 R14: ffff8d25c499b5c0 R15: ffff8d2e31380e18 FS: 0000000000000000(0000) GS:ffff8d2d1fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055fd0484b8b8 CR3: 00000008ffc10006 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: queue_delayed_work_on+0x36/0x40 qedf_elsct_send+0x57/0x60 [qedf] fc_lport_enter_flogi+0x90/0xc0 [libfc] fc_lport_timeout+0xb7/0x140 [libfc] process_one_work+0x1a7/0x360 ? create_worker+0x1a0/0x1a0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 ---[ end trace 008f00f722f2c2ff ]-- Initialize stag work for all the vports. Link: https://lore.kernel.org/r/20220117135311.6256-2-njavali@marvell.com Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: ufs: ufshcd-pltfrm: Check the return value of devm_kstrdup()Xiaoke Wang
devm_kstrdup() returns pointer to allocated string on success, NULL on failure. So it is better to check the return value of it. Link: https://lore.kernel.org/r/tencent_4257E15D4A94FF9020DDCC4BB9B21C041408@qq.com Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: target: iscsi: Make sure the np under each tpg is uniqueZouMingzhe
iscsit_tpg_check_network_portal() has nested for_each loops and is supposed to return true when a match is found. However, the tpg loop will still continue after existing the tpg_np loop. If this tpg_np is not the last the match value will be changed. Break the outer loop after finding a match and make sure the np under each tpg is unique. Link: https://lore.kernel.org/r/20220111054742.19582-1-mingzhe.zou@easystack.cn Signed-off-by: ZouMingzhe <mingzhe.zou@easystack.cn> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-24scsi: elx: efct: Don't use GFP_KERNEL under spin lockYang Yingliang
GFP_KERNEL/GFP_DMA can't be used under a spin lock. According the comment, els_ios_lock is used to protect els ios list so we can move down the spin lock to avoid using this flag under the lock. Link: https://lore.kernel.org/r/20220111012441.3232527-1-yangyingliang@huawei.com Fixes: 8f406ef72859 ("scsi: elx: libefc: Extended link Service I/O handling") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-01-25drm/msm/dsi: invalid parameter check in msm_dsi_phy_enableJosé Expósito
The function performs a check on the "phy" input parameter, however, it is used before the check. Initialize the "dev" variable after the sanity check to avoid a possible NULL pointer dereference. Fixes: 5c8290284402b ("drm/msm/dsi: Split PHY drivers to separate files") Addresses-Coverity-ID: 1493860 ("Null pointer dereference") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220116181844.7400-1-jose.exposito89@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm/a6xx: Add missing suspend_count incrementRob Clark
Reported-by: Danylo Piliaiev <dpiliaiev@igalia.com> Fixes: 3ab1c5cc3939 ("drm/msm: Add param for userspace to query suspend count") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220113163215.215367-1-robdclark@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm: Fix wrong size calculationXianting Tian
For example, memory-region in .dts as below, reg = <0x0 0x50000000 0x0 0x20000000> We can get below values, struct resource r; r.start = 0x50000000; r.end = 0x6fffffff; So the size should be: size = r.end - r.start + 1 = 0x20000000 Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Fixes: 072f1f9168ed ("drm/msm: add support for "stolen" mem") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220112123334.749776-1-xianting.tian@linux.alibaba.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm/dpu: invalid parameter check in dpu_setup_dspp_pccJosé Expósito
The function performs a check on the "ctx" input parameter, however, it is used before the check. Initialize the "base" variable after the sanity check to avoid a possible NULL pointer dereference. Fixes: 4259ff7ae509e ("drm/msm/dpu: add support for pcc color block in dpu driver") Addresses-Coverity-ID: 1493866 ("Null pointer dereference") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://lore.kernel.org/r/20220109192431.135949-1-jose.exposito89@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm/hdmi: Fix missing put_device() call in msm_hdmi_get_phyMiaoqian Lin
The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the error handling path. Fixes: e00012b256d4 ("drm/msm/hdmi: Make HDMI core get its PHY") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220107085026.23831-1-linmq006@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm/dsi: Fix missing put_device() call in dsi_get_phyMiaoqian Lin
If of_find_device_by_node() succeeds, dsi_get_phy() doesn't a corresponding put_device(). Thus add put_device() to fix the exception handling. Fixes: ec31abf ("drm/msm/dsi: Separate PHY to another platform device") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20211230070943.18116-1-linmq006@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm: remove variable set but not usedYang Li
The code that uses variable mdss has been removed, So the declaration and assignment of the variable can be removed. Eliminate the following clang warning: drivers/gpu/drm/msm/msm_drv.c:513:19: warning: variable 'mdss' set but not used [-Wunused-but-set-variable] Reported-by: Abaci Robot <abaci@linux.alibaba.com> Fixes: 2027e5b3413d ("drm/msm: Initialize MDSS irq domain at probe time") Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20211216031103.34146-1-yang.lee@linux.alibaba.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-24hwmon: (nct6775) Fix crash in clear_caseopenGuenter Roeck
Paweł Marciniak reports the following crash, observed when clearing the chassis intrusion alarm. BUG: kernel NULL pointer dereference, address: 0000000000000028 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 3 PID: 4815 Comm: bash Tainted: G S 5.16.2-200.fc35.x86_64 #1 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P2.60A 05/03/2018 RIP: 0010:clear_caseopen+0x5a/0x120 [nct6775] Code: 68 70 e8 e9 32 b1 e3 85 c0 0f 85 d2 00 00 00 48 83 7c 24 ... RSP: 0018:ffffabcb02803dd8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: ffff8e8808192880 RSI: 0000000000000000 RDI: ffff8e87c7509a68 RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000000a R10: 000000000000000a R11: f000000000000000 R12: 000000000000001f R13: ffff8e87c7509828 R14: ffff8e87c7509a68 R15: ffff8e88494527a0 FS: 00007f4db9151740(0000) GS:ffff8e8ebfec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000028 CR3: 0000000166b66001 CR4: 00000000001706e0 Call Trace: <TASK> kernfs_fop_write_iter+0x11c/0x1b0 new_sync_write+0x10b/0x180 vfs_write+0x209/0x2a0 ksys_write+0x4f/0xc0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae The problem is that the device passed to clear_caseopen() is the hwmon device, not the platform device, and the platform data is not set in the hwmon device. Store the pointer to sio_data in struct nct6775_data and get if from there if needed. Fixes: 2e7b9886968b ("hwmon: (nct6775) Use superio_*() function pointers in sio_data.") Cc: Denis Pauk <pauk.denis@gmail.com> Cc: Bernhard Seibold <mail@bernhard-seibold.de> Reported-by: Paweł Marciniak <pmarciniak@lodz.home.pl> Tested-by: Denis Pauk <pauk.denis@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-01-24Merge tag 'linux-can-fixes-for-5.17-20220124' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2022-01-24 The first patch updates the email address of Brian Silverman from his former employer to his private address. The next patch fixes DT bindings information for the tcan4x5x SPI CAN driver. The following patch targets the m_can driver and fixes the introduction of FIFO bulk read support. Another patch for the tcan4x5x driver, which fixes the max register value for the regmap config. The last patch for the flexcan driver marks the RX mailbox support for the MCF5441X as support. * tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: flexcan: mark RX via mailboxes as supported on MCF5441X can: tcan4x5x: regmap: fix max register value can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0 dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config mailmap: update email address of Brian Silverman ==================== Link: https://lore.kernel.org/r/20220124175955.3464134-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24can: flexcan: mark RX via mailboxes as supported on MCF5441XMarc Kleine-Budde
Most flexcan IP cores support 2 RX modes: - FIFO - mailbox The flexcan IP core on the MCF5441X cannot receive CAN RTR messages via mailboxes. However the mailbox mode is more performant. The commit | 1c45f5778a3b ("can: flexcan: add ethtool support to change rx-rtr setting during runtime") added support to switch from FIFO to mailbox mode on these cores. After testing the mailbox mode on the MCF5441X by Angelo Dureghello, this patch marks it (without RTR capability) as supported. Further the IP core overview table is updated, that RTR reception via mailboxes is not supported. Link: https://lore.kernel.org/all/20220121084425.3141218-1-mkl@pengutronix.de Tested-by: Angelo Dureghello <angelo@kernel-space.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24can: tcan4x5x: regmap: fix max register valueMarc Kleine-Budde
The MRAM of the tcan4x5x has a size of 2K and starts at 0x8000. There are no further registers in the tcan4x5x making 0x87fc the biggest addressable register. This patch fixes the max register value of the regmap config from 0x8ffc to 0x87fc. Fixes: 6e1caaf8ed22 ("can: tcan4x5x: fix max register value") Link: https://lore.kernel.org/all/20220119064011.2943292-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if ↵Marc Kleine-Budde
length is 0 In order to optimize FIFO access, especially on m_can cores attached to slow busses like SPI, in patch | e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") bulk read/write support has been added to the m_can_fifo_{read,write} functions. That change leads to the tcan driver to call regmap_bulk_{read,write}() with a length of 0 (for CAN frames with 0 data length). regmap treats this as an error: | tcan4x5x spi1.0 tcan4x5x0: FIFO write returned -22 This patch fixes the problem by not calling the cdev->ops->{read,write)_fifo() in case of a 0 length read/write. Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") Link: https://lore.kernel.org/all/20220114155751.2651888-1-mkl@pengutronix.de Cc: stable@vger.kernel.org Cc: Matt Kline <matt@bitbashing.io> Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com> Reported-by: Michael Anochin <anochin@photo-meter.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO configMarc Kleine-Budde
This tcan4x5x only comes with 2K of MRAM, a RX FIFO with a dept of 32 doesn't fit into the MRAM. Use a depth of 16 instead. Fixes: 4edd396a1911 ("dt-bindings: can: tcan4x5x: Add DT bindings for TCAN4x5X driver") Link: https://lore.kernel.org/all/20220119062951.2939851-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24mailmap: update email address of Brian SilvermanMarc Kleine-Budde
Brian Silverman's address at bluerivertech.com is not valid anymore, use Brian's private email address instead. Link: https://lore.kernel.org/all/20220110082359.2019735-1-mkl@pengutronix.de Cc: Brian Silverman <bsilver16384@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24btrfs: update writeback index when starting defragFilipe Manana
When starting a defrag, we should update the writeback index of the inode's mapping in case it currently has a value beyond the start of the range we are defragging. This can help performance and often result in getting less extents after writeback - for e.g., if the current value of the writeback index sits somewhere in the middle of a range that gets dirty by the defrag, then after writeback we can get two smaller extents instead of a single, larger extent. We used to have this before the refactoring in 5.16, but it was removed without any reason to do so. Originally it was added in kernel 3.1, by commit 2a0f7f5769992b ("Btrfs: fix recursive auto-defrag"), in order to fix a loop with autodefrag resulting in dirtying and writing pages over and over, but some testing on current code did not show that happening, at least with the test described in that commit. So add back the behaviour, as at the very least it is a nice to have optimization. Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()") CC: stable@vger.kernel.org # 5.16 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-24btrfs: add back missing dirty page rate limiting to defragFilipe Manana
A defrag operation can dirty a lot of pages, specially if operating on the entire file or a large file range. Any task dirtying pages should periodically call balance_dirty_pages_ratelimited(), as stated in that function's comments, otherwise they can leave too many dirty pages in the system. This is what we did before the refactoring in 5.16, and it should have remained, just like in the buffered write path and relocation. So restore that behaviour. Fixes: 7b508037d4cac3 ("btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()") CC: stable@vger.kernel.org # 5.16 Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-24btrfs: fix deadlock when reserving space during defragFilipe Manana
When defragging we can end up collecting a range for defrag that has already pages under delalloc (dirty), as long as the respective extent map for their range is not mapped to a hole, a prealloc extent or the extent map is from an old generation. Most of the time that is harmless from a functional perspective at least, however it can result in a deadlock: 1) At defrag_collect_targets() we find an extent map that meets all requirements but there's delalloc for the range it covers, and we add its range to list of ranges to defrag; 2) The defrag_collect_targets() function is called at defrag_one_range(), after it locked a range that overlaps the range of the extent map; 3) At defrag_one_range(), while the range is still locked, we call defrag_one_locked_target() for the range associated to the extent map we collected at step 1); 4) Then finally at defrag_one_locked_target() we do a call to btrfs_delalloc_reserve_space(), which will reserve data and metadata space. If the space reservations can not be satisfied right away, the flusher might be kicked in and start flushing delalloc and wait for the respective ordered extents to complete. If this happens we will deadlock, because both flushing delalloc and finishing an ordered extent, requires locking the range in the inode's io tree, which was already locked at defrag_collect_targets(). So fix this by skipping extent maps for which there's already delalloc. Fixes: eb793cf857828d ("btrfs: defrag: introduce helper to collect target file extents") CC: stable@vger.kernel.org # 5.16 Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-24arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOLMasami Hiramatsu
Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL because this function is called from ftrace and lockdep to get the caller address via return_address(). The lockdep is used in kprobes, it should also be NOKPROBE_SYMBOL. Fixes: b07f3499661c ("arm64: stacktrace: Move start_backtrace() out of the header") Cc: <stable@vger.kernel.org> # 5.13.x Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@devnote2 Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-24arm64: errata: Update ARM64_ERRATUM_[2119858|2224489] with Cortex-X2 rangesAnshuman Khandual
Errata ARM64_ERRATUM_[2119858|2224489] also affect some Cortex-X2 ranges as well. Lets update these errata definition and detection to accommodate all new Cortex-X2 based cpu MIDR ranges. Cc: Will Deacon <will@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1642994138-25887-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-24arm64: Add Cortex-X2 CPU part definitionAnshuman Khandual
Add the CPU Partnumbers for the new Arm designs. Cc: Will Deacon <will@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1642994138-25887-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-24video: hyperv_fb: Fix validation of screen resolutionMichael Kelley
In the WIN10 version of the Synthetic Video protocol with Hyper-V, Hyper-V reports a list of supported resolutions as part of the protocol negotiation. The driver calculates the maximum width and height from the list of resolutions, and uses those maximums to validate any screen resolution specified in the video= option on the kernel boot line. This method of validation is incorrect. For example, the list of supported resolutions could contain 1600x1200 and 1920x1080, both of which fit in an 8 Mbyte frame buffer. But calculating the max width and height yields 1920 and 1200, and 1920x1200 resolution does not fit in an 8 Mbyte frame buffer. Unfortunately, this resolution is accepted, causing a kernel fault when the driver accesses memory outside the frame buffer. Instead, validate the specified screen resolution by calculating its size, and comparing against the frame buffer size. Delete the code for calculating the max width and height from the list of resolutions, since these max values have no use. Also add the frame buffer size to the info message to aid in understanding why a resolution might be rejected. Fixes: 67e7cdb4829d ("video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Acked-by: Helge Deller <deller@gmx.de> Link: https://lore.kernel.org/r/1642360711-2335-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-01-24KVM: remove async parameter of hva_to_pfn_remapped()Xianting Tian
The async parameter of hva_to_pfn_remapped() is not used, so remove it. Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Message-Id: <20220124020456.156386-1-xianting.tian@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-24x86,kvm/xen: Remove superfluous .fixup usagePeter Zijlstra
Commit 14243b387137 ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery") adds superfluous .fixup usage after the whole .fixup section was removed in commit e5eefda5aa51 ("x86: Remove .fixup section"). Fixes: 14243b387137 ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Message-Id: <20220123124219.GH20638@worktop.programming.kicks-ass.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-24KVM: VMX: Zero host's SYSENTER_ESP iff SYSENTER is NOT usedSean Christopherson
Zero vmcs.HOST_IA32_SYSENTER_ESP when initializing *constant* host state if and only if SYSENTER cannot be used, i.e. the kernel is a 64-bit kernel and is not emulating 32-bit syscalls. As the name suggests, vmx_set_constant_host_state() is intended for state that is *constant*. When SYSENTER is used, SYSENTER_ESP isn't constant because stacks are per-CPU, and the VMCS must be updated whenever the vCPU is migrated to a new CPU. The logic in vmx_vcpu_load_vmcs() doesn't differentiate between "never loaded" and "loaded on a different CPU", i.e. setting SYSENTER_ESP on VMCS load also handles setting correct host state when the VMCS is first loaded. Because a VMCS must be loaded before it is initialized during vCPU RESET, zeroing the field in vmx_set_constant_host_state() obliterates the value that was written when the VMCS was loaded. If the vCPU is run before it is migrated, the subsequent VM-Exit will zero out MSR_IA32_SYSENTER_ESP, leading to a #DF on the next 32-bit syscall. double fault: 0000 [#1] SMP CPU: 0 PID: 990 Comm: stable Not tainted 5.16.0+ #97 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 EIP: entry_SYSENTER_32+0x0/0xe7 Code: <9c> 50 eb 17 0f 20 d8 a9 00 10 00 00 74 0d 25 ff ef ff ff 0f 22 d8 EAX: 000000a2 EBX: a8d1300c ECX: a8d13014 EDX: 00000000 ESI: a8f87000 EDI: a8d13014 EBP: a8d12fc0 ESP: 00000000 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210093 CR0: 80050033 CR2: fffffffc CR3: 02c3b000 CR4: 00152e90 Fixes: 6ab8a4053f71 ("KVM: VMX: Avoid to rdmsrl(MSR_IA32_SYSENTER_ESP)") Cc: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220122015211.1468758-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-24quota: cleanup double word in commentTom Rix
Remove the second 'handle'. Link: https://lore.kernel.org/r/20220116125936.389767-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24udf: Restore i_lenAlloc when inode expansion failsJan Kara
When we fail to expand inode from inline format to a normal format, we restore inode to contain the original inline formatting but we forgot to set i_lenAlloc back. The mismatch between i_lenAlloc and i_size was then causing further problems such as warnings and lost data down the line. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: 7e49b6f2480c ("udf: Convert UDF to new truncate calling sequence") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24udf: Fix NULL ptr deref when converting from inline formatJan Kara
udf_expand_file_adinicb() calls directly ->writepage to write data expanded into a page. This however misses to setup inode for writeback properly and so we can crash on inode->i_wb dereference when submitting page for IO like: BUG: kernel NULL pointer dereference, address: 0000000000000158 #PF: supervisor read access in kernel mode ... <TASK> __folio_start_writeback+0x2ac/0x350 __block_write_full_page+0x37d/0x490 udf_expand_file_adinicb+0x255/0x400 [udf] udf_file_write_iter+0xbe/0x1b0 [udf] new_sync_write+0x125/0x1c0 vfs_write+0x28e/0x400 Fix the problem by marking the page dirty and going through the standard writeback path to write the page. Strictly speaking we would not even have to write the page but we want to catch e.g. ENOSPC errors early. Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> CC: stable@vger.kernel.org Fixes: 52ebea749aae ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24net: stmmac: remove unused members in struct stmmac_privJisheng Zhang
The tx_coalesce and mii_irq are not used at all now, so remove them. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24fsnotify: fix fsnotify hooks in pseudo filesystemsAmir Goldstein
Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression in pseudo filesystems, convert d_delete() calls to d_drop() (see commit 46c46f8df9aa ("devpts_pty_kill(): don't bother with d_delete()") and move the fsnotify hook after d_drop(). Add a missing fsnotify_unlink() hook in nfsdfs that was found during the audit of fsnotify hooks in pseudo filesystems. Note that the fsnotify hooks in simple_recursive_removal() follow d_invalidate(), so they require no change. Link: https://lore.kernel.org/r/20220120215305.282577-2-amir73il@gmail.com Reported-by: Ivan Delalande <colona@arista.com> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24fsnotify: invalidate dcache before IN_DELETE eventAmir Goldstein
Apparently, there are some applications that use IN_DELETE event as an invalidation mechanism and expect that if they try to open a file with the name reported with the delete event, that it should not contain the content of the deleted file. Commit 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") moved the fsnotify delete hook before d_delete() so fsnotify will have access to a positive dentry. This allowed a race where opening the deleted file via cached dentry is now possible after receiving the IN_DELETE event. To fix the regression, create a new hook fsnotify_delete() that takes the unlinked inode as an argument and use a helper d_delete_notify() to pin the inode, so we can pass it to fsnotify_delete() after d_delete(). Backporting hint: this regression is from v5.3. Although patch will apply with only trivial conflicts to v5.4 and v5.10, it won't build, because fsnotify_delete() implementation is different in each of those versions (see fsnotify_link()). A follow up patch will fix the fsnotify_unlink/rmdir() calls in pseudo filesystem that do not need to call d_delete(). Link: https://lore.kernel.org/r/20220120215305.282577-1-amir73il@gmail.com Reported-by: Ivan Delalande <colona@arista.com> Link: https://lore.kernel.org/linux-fsdevel/YeNyzoDM5hP5LtGW@visor/ Fixes: 49246466a989 ("fsnotify: move fsnotify_nameremove() hook out of d_delete()") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-01-24net: atlantic: Use the bitmap API instead of hand-writing itChristophe JAILLET
Simplify code by using bitmap_weight() and bitmap_zero() instead of hand-writing these functions. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ping: fix the sk_bound_dev_if match in ping_lookupXin Long
When 'ping' changes to use PING socket instead of RAW socket by: # sysctl -w net.ipv4.ping_group_range="0 100" the selftests 'router_broadcast.sh' will fail, as such command # ip vrf exec vrf-h1 ping -I veth0 198.51.100.255 -b can't receive the response skb by the PING socket. It's caused by mismatch of sk_bound_dev_if and dif in ping_rcv() when looking up the PING socket, as dif is vrf-h1 if dif's master was set to vrf-h1. This patch is to fix this regression by also checking the sk_bound_dev_if against sdif so that the packets can stil be received even if the socket is not bound to the vrf device but to the real iif. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24arm64: vdso: Fix "no previous prototype" warningVincenzo Frascino
If compiling the arm64 kernel with W=1 the following warning is produced: | arch/arm64/kernel/vdso/vgettimeofday.c:9:5: error: no previous prototype for ‘__kernel_clock_gettime’ [-Werror=missing-prototypes] | 9 | int __kernel_clock_gettime(clockid_t clock, | | ^~~~~~~~~~~~~~~~~~~~~~ | arch/arm64/kernel/vdso/vgettimeofday.c:15:5: error: no previous prototype for ‘__kernel_gettimeofday’ [-Werror=missing-prototypes] | 15 | int __kernel_gettimeofday(struct __kernel_old_timeval *tv, | | ^~~~~~~~~~~~~~~~~~~~~ | arch/arm64/kernel/vdso/vgettimeofday.c:21:5: error: no previous prototype for ‘__kernel_clock_getres’ [-Werror=missing-prototypes] | 21 | int __kernel_clock_getres(clockid_t clock_id, | | ^~~~~~~~~~~~~~~~~~~~~ This patch removes "-Wmissing-prototypes" and "-Wmissing-declarations" compilers flags from the compilation of vgettimeofday.c to make possible to build the kernel with CONFIG_WERROR enabled. Cc: Will Deacon <will@kernel.org> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220121121234.47273-1-vincenzo.frascino@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-24net/smc: Transitional solution for clcsock race issueWen Gu
We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace: <TASK> __sys_setsockopt+0xfc/0x190 __x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK> This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released before access. In case that a crash of the same reason happens in smc_getsockopt() or smc_switch_to_fallback(), this patch also checkes smc->clcsock in them too. And the caller of smc_switch_to_fallback() will identify whether fallback succeeds according to the return value. Fixes: fd57770dd198 ("net/smc: wait for pending work before clcsock release_sock") Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/ Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: remove unused ->wait_capabilitySukadev Bhattiprolu
With previous bug fix, ->wait_capability flag is no longer needed and can be removed. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: don't spin in taskletSukadev Bhattiprolu
ibmvnic_tasklet() continuously spins waiting for responses to all capability requests. It does this to avoid encountering an error during initialization of the vnic. However if there is a bug in the VIOS and we do not receive a response to one or more queries the tasklet ends up spinning continuously leading to hard lock ups. If we fail to receive a message from the VIOS it is reasonable to timeout the login attempt rather than spin indefinitely in the tasklet. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: init ->running_cap_crqs earlySukadev Bhattiprolu
We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should send out the next protocol message type. i.e when we get back responses to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs. Similiary, when we get responses to all the REQUEST_CAPABILITY crqs, we send out the QUERY_IP_OFFLOAD CRQ. We currently increment ->running_cap_crqs as we send out each CRQ and have the ibmvnic_tasklet() send out the next message type, when this running_cap_crqs count drops to 0. This assumes that all the CRQs of the current type were sent out before the count drops to 0. However it is possible that we send out say 6 CRQs, get preempted and receive all the 6 responses before we send out the remaining CRQs. This can result in ->running_cap_crqs count dropping to zero before all messages of the current type were sent and we end up sending the next protocol message too early. Instead initialize the ->running_cap_crqs upfront so the tasklet will only send the next protocol message after all responses are received. Use the cap_reqs local variable to also detect any discrepancy (either now or in future) in the number of capability requests we actually send. Currently only send_query_cap() is affected by this behavior (of sending next message early) since it is called from the worker thread (during reset) and from application thread (during ->ndo_open()) and they can be preempted. send_request_cap() is only called from the tasklet which processes CRQ responses sequentially, is not be affected. But to maintain the existing symmtery with send_query_capability() we update send_request_capability() also. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ibmvnic: Allow extra failures before disablingSukadev Bhattiprolu
If auto-priority-failover (APF) is enabled and there are at least two backing devices of different priorities, some resets like fail-over, change-param etc can cause at least two back to back failovers. (Failover from high priority backing device to lower priority one and then back to the higher priority one if that is still functional). Depending on the timimg of the two failovers it is possible to trigger a "hard" reset and for the hard reset to fail due to failovers. When this occurs, the driver assumes that the network is unstable and disables the VNIC for a 60-second "settling time". This in turn can cause the ethtool command to fail with "No such device" while the vnic automatically recovers a little while later. Given that it's possible to have two back to back failures, allow for extra failures before disabling the vnic for the settling time. Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24ipv4: fix ip option filtering for locally generated fragmentsJakub Kicinski
During IP fragmentation we sanitize IP options. This means overwriting options which should not be copied with NOPs. Only the first fragment has the original, full options. ip_fraglist_prepare() copies the IP header and options from previous fragment to the next one. Commit 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") moved sanitizing options before ip_fraglist_prepare() which means options are sanitized and then overwritten again with the old values. Fixing this is not enough, however, nor did the sanitization work prior to aforementioned commit. ip_options_fragment() (which does the sanitization) uses ipcb->opt.optlen for the length of the options. ipcb->opt of fragments is not populated (it's 0), only the head skb has the state properly built. So even when called at the right time ip_options_fragment() does nothing. This seems to date back all the way to v2.5.44 when the fast path for pre-fragmented skbs had been introduced. Prior to that ip_options_build() would have been called for every fragment (in fact ever since v2.5.44 the fragmentation handing in ip_options_build() has been dead code, I'll clean it up in -next). In the original patch (see Link) caixf mentions fixing the handling for fragments other than the second one, but I'm not sure how _any_ fragment could have had their options sanitized with the code as it stood. Tested with python (MTU on lo lowered to 1000 to force fragmentation): import socket s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) s.setsockopt(socket.IPPROTO_IP, socket.IP_OPTIONS, bytearray([7,4,5,192, 20|0x80,4,1,0])) s.sendto(b'1'*2000, ('127.0.0.1', 1234)) Before: IP (tos 0x0, ttl 64, id 1053, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.36500 > localhost.search-agent: UDP, length 2000 IP (tos 0x0, ttl 64, id 1053, offset 968, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 64, id 1053, offset 1936, flags [none], proto UDP (17), length 100, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp After: IP (tos 0x0, ttl 96, id 42549, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.51607 > localhost.search-agent: UDP, bad length 2000 > 960 IP (tos 0x0, ttl 96, id 42549, offset 968, flags [+], proto UDP (17), length 996, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 96, id 42549, offset 1936, flags [none], proto UDP (17), length 100, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp RA (20 | 0x80) is now copied as expected, RR (7) is "NOPed out". Link: https://lore.kernel.org/netdev/20220107080559.122713-1-ooppublic@163.com/ Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: caixf <ooppublic@163.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>