summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-08-16virtio_net: Revert "virtio_net: set the default max ring size by find_vqs()"Michael S. Tsirkin
This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7. This has been reported to trip up guests on GCP (Google Cloud). The reason is that virtio_find_vqs_ctx_size is broken on legacy devices. We can in theory fix virtio_find_vqs_ctx_size but in fact the patch itself has several other issues: - It treats unknown speed as < 10G - It leaves userspace no way to find out the ring size set by hypervisor - It tests speed when link is down - It ignores the virtio spec advice: Both \field{speed} and \field{duplex} can change, thus the driver is expected to re-read these values after receiving a configuration change notification. - It is not clear the performance impact has been tested properly Revert the patch for now. Reported-by: Andres Freund <andres@anarazel.de> Link: https://lore.kernel.org/r/20220814212610.GA3690074%40roeck-us.net Link: https://lore.kernel.org/r/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de Link: https://lore.kernel.org/r/3df6bb82-1951-455d-a768-e9e1513eb667%40www.fastmail.com Link: https://lore.kernel.org/r/FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE%40anarazel.de Fixes: 762faee5a267 ("virtio_net: set the default max ring size by find_vqs()") Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Message-Id: <20220816053602.173815-2-mst@redhat.com>
2022-08-15io_uring/notif: raise limit on notification slotsPavel Begunkov
1024 notification slots is rather an arbitrary value, raise it up, everything is accounted to memcg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/eb78a0a5f2fa5941f8e845cdae5fb399bf7ba0be.1660566179.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-15io_uring/net: improve zc addr import error handlingPavel Begunkov
We may account memory to a memcg of a request that didn't even got to the network layer. It's not a bug as it'll be routinely cleaned up on flush, but it might be confusing for the userspace. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b8aae61f4c3ddc4da97c1da876bb73871f352d50.1660566179.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-15io_uring/net: use right helpers for async recyclePavel Begunkov
We have a helper that checks for whether a request contains anything in ->async_data or not, namely req_has_async_data(). It's better to use it as it might have some extra considerations. Fixes: 43e0bbbd0b0e3 ("io_uring: add netmsg cache") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7414da4e7c3c32c31fc02dfd1355af4ccf4ca5f.1660566179.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-15Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-12 (iavf) This series contains updates to iavf driver only. Przemyslaw frees memory for admin queues in initialization error paths, prevents freeing of vf_res which is causing null pointer dereference, and adjusts calls in error path of reset to avoid iavf_close() which could cause deadlock. Ivan Vecera avoids deadlock that can occur when driver if part of failover. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix deadlock in initialization iavf: Fix reset error handling iavf: Fix NULL pointer dereference in iavf_get_link_ksettings iavf: Fix adminq error handling ==================== Link: https://lore.kernel.org/r/20220812172309.853230-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-15net: rtnetlink: fix module reference count leak issue in rtnetlink_rcv_msgZhengchao Shao
When bulk delete command is received in the rtnetlink_rcv_msg function, if bulk delete is not supported, module_put is not called to release the reference counting. As a result, module reference count is leaked. Fixes: a6cec0bcd342 ("net: rtnetlink: add bulk delete support flag") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220815024629.240367-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-15net: moxa: pass pdev instead of ndev to DMA functionsSergei Antonov
dma_map_single() calls fail in moxart_mac_setup_desc_ring() and moxart_mac_start_xmit() which leads to an incessant output of this: [ 16.043925] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.050957] moxart-ethernet 92000000.mac eth0: DMA mapping error [ 16.058229] moxart-ethernet 92000000.mac eth0: DMA mapping error Passing pdev to DMA is a common approach among net drivers. Fixes: 6c821bd9edc9 ("net: Add MOXA ART SoCs ethernet driver") Signed-off-by: Sergei Antonov <saproj@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220812171339.2271788-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-15ksmbd: don't remove dos attribute xattr on O_TRUNC openNamjae Jeon
When smb client open file in ksmbd share with O_TRUNC, dos attribute xattr is removed as well as data in file. This cause the FSCTL_SET_SPARSE request from the client fails because ksmbd can't update the dos attribute after setting ATTR_SPARSE_FILE. And this patch fix xfstests generic/469 test also. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-08-15ksmbd: remove unnecessary generic_fillattr in smb2_openHyunchul Lee
Remove unnecessary generic_fillattr to fix wrong AllocationSize of SMB2_CREATE response, And Move the call of ksmbd_vfs_getattr above the place where stat is needed because of truncate. This patch fixes wrong AllocationSize of SMB2_CREATE response. Because ext4 updates inode->i_blocks only when disk space is allocated, generic_fillattr does not set stat.blocks properly for delayed allocation. But ext4 returns the blocks that include the delayed allocation blocks when getattr is called. The issue can be reproduced with commands below: touch ${FILENAME} xfs_io -c "pwrite -S 0xAB 0 40k" ${FILENAME} xfs_io -c "stat" ${FILENAME} 40KB are written, but the count of blocks is 8. Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-08-16ata: libata-eh: Add missing command nameDamien Le Moal
Add the missing command name for ATA_CMD_NCQ_NON_DATA to ata_get_cmd_name(). Fixes: 661ce1f0c4a6 ("libata/libsas: Define ATA_CMD_NCQ_NON_DATA") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2022-08-15lib/cpumask: drop always-true preprocessor guardSander Vanheule
Since lib/cpumask.o is only built for CONFIG_SMP=y, NR_CPUS will always be greater than 1 at compile time. This makes checking for that condition unnecesarry, so it can be dropped. Signed-off-by: Sander Vanheule <sander@svanheule.net> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-08-15lib/cpumask: add inline cpumask_next_wrap() for UPSander Vanheule
In the uniprocessor case, cpumask_next_wrap() can be simplified, as the number of valid argument combinations is limited: - 'start' can only be 0 - 'n' can only be -1 or 0 The only valid CPU that can then be returned, if any, will be the first one set in the provided 'mask'. For NR_CPUS == 1, include/linux/cpumask.h now provides an inline definition of cpumask_next_wrap(), which will conflict with the one provided by lib/cpumask.c. Make building of lib/cpumask.o again depend on CONFIG_SMP=y (i.e. NR_CPUS > 1) to avoid the re-definition. Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Sander Vanheule <sander@svanheule.net> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-08-15cpumask: align signatures of UP implementationsSander Vanheule
Between the generic version, and their uniprocessor optimised implementations, the return types of cpumask_any_and_distribute() and cpumask_any_distribute() are not identical. Change the UP versions to 'unsigned int', to match the generic versions. Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Sander Vanheule <sander@svanheule.net> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-08-15mmc: sdhci-of-dwcmshc: Re-enable support for the BlueField-3 SoCLiming Sun
The commit 08f3dff799d4 (mmc: sdhci-of-dwcmshc: add rockchip platform support") introduces the use of_device_get_match_data() to check for some chips. Unfortunately, it also breaks the BlueField-3 FW, which uses ACPI. To fix the problem, let's add the ACPI match data and the corresponding quirks to re-enable the support for the BlueField-3 SoC. Reviewed-by: David Woods <davwoods@nvidia.com> Signed-off-by: Liming Sun <limings@nvidia.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 08f3dff799d4 ("mmc: sdhci-of-dwcmshc: add rockchip platform support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220809173742.178440-1-limings@nvidia.com [Ulf: Clarified the commit message a bit] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-08-15selftests/landlock: fix broken include of linux/landlock.hGuillaume Tucker
Revert part of the earlier changes to fix the kselftest build when using a sub-directory from the top of the tree as this broke the landlock test build as a side-effect when building with "make -C tools/testing/selftests/landlock". Reported-by: Mickaël Salaün <mic@digikod.net> Fixes: a917dd94b832 ("selftests/landlock: drop deprecated headers dependency") Fixes: f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL") Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-08-15netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specifiedPablo Neira Ayuso
Since f3a2181e16f1 ("netfilter: nf_tables: Support for sets with multiple ranged fields"), it possible to combine intervals and concatenations. Later on, ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") provides the NFT_SET_CONCAT flag for userspace to report that the set stores a concatenation. Make sure NFT_SET_CONCAT is set on if field_count is specified for consistency. Otherwise, if NFT_SET_CONCAT is specified with no field_count, bail out with EINVAL. Fixes: ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15nios2: add force_successful_syscall_return()Al Viro
If we use the ancient SysV syscall ABI, we'd better have tell the kernel how to claim that a negative return value is a success. Use ->orig_r2 for that - it's inaccessible via ptrace, so it's a fair game for changes and it's normally[*] non-negative on return from syscall. Set to -1; syscall is not going to be restart-worthy by definition, so we won't interfere with that use either. [*] the only exception is rt_sigreturn(), where we skip the entire messing with r1/r2 anyway. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: restarts apply only to the first sigframe we build...Al Viro
Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: fix syscall restart checksAl Viro
sys_foo() returns -512 (aka -ERESTARTSYS) => do_signal() sees 512 in r2 and 1 in r1. sys_foo() returns 512 => do_signal() sees 512 in r2 and 0 in r1. The former is restart-worthy; the latter obviously isn't. Fixes: b53e906d255d ("nios2: Signal handling support") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: traced syscall does need to check the syscall numberAl Viro
all checks done before letting the tracer modify the register state are worthless... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: don't leave NULLs in sys_call_table[]Al Viro
fill the gaps in there with sys_ni_syscall, as everyone does... Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15nios2: page fault et.al. are *not* restartable syscalls...Al Viro
make sure that ->orig_r2 is negative for everything except the syscalls. Fixes: 82ed08dd1b0e ("nios2: Exception handling") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2022-08-15netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and ↵Pablo Neira Ayuso
NFT_SET_ELEM_INTERVAL_END These flags are mutually exclusive, report EINVAL in this case. Fixes: aaa31047a6d2 ("netfilter: nftables: add catch-all set element support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flagsPablo Neira Ayuso
If the NFT_SET_CONCAT|NFT_SET_INTERVAL flags are set on, then the netlink attribute NFTA_SET_ELEM_KEY_END must be specified. Otherwise, NFTA_SET_ELEM_KEY_END should not be present. For catch-all element, NFTA_SET_ELEM_KEY_END should not be present. The NFT_SET_ELEM_INTERVAL_END is never used with this set flags combination. Fixes: 7b225d0b5c6d ("netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-15s390/ap: fix crash on older machines based on QCI info missingHarald Freudenberger
On older z series machines (z12 and older) there is no QCI info available. The AP code took care of this and the AP bus scan then switched to simple probing via TAPQ. With commit 283915850a44 ("s390/ap: notify drivers on config changed and scan complete callbacks") some code was introduced which silently assumed that the QCI info is always available. However, with KVM simulating an older machine (z12) the result was a kernel crash. Funnily the same crash does not happen on LPAR - maybe because NULL is a valid pointer and reading some data from address 0 also works fine. This fix now improves the code to be aware that the QCI instruction may not be available on older machines and thus the two pointers to QCI info structs may simple be NULL. However, on a machine not providing the QCI info the two callbacks to the zcrypt device drivers on_config_changed() and on_scan_complete() provide parameters which are pointers to a QCI info struct. These both callbacks are NOT served if there is no QCI info available. The only consumer of these callbacks is the vfio device driver. This driver only supports CEX4 and higher. All physical machines which are able to provide CEX4 cards have QCI support available. So there is no sense in for example fill the QCI info struct by hand with looping over cards and queues and TAPQ each APQN. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Cc: stable@vger.kernel.org Fixes: 283915850a44 ("s390/ap: notify drivers on config changed and scan complete callbacks") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-15s390/hypfs: avoid error message under KVMJuergen Gross
When booting under KVM the following error messages are issued: hypfs.7f5705: The hardware system does not support hypfs hypfs.7a79f0: Initialization of hypfs failed with rc=-61 Demote the severity of first message from "error" to "info" and issue the second message only in other error cases. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20220620094534.18967-1-jgross@suse.com [arch/s390/hypfs/hypfs_diag.c changed description] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-15ALSA: hda/realtek: Add quirks for ASUS Zenbooks using CS35L41Stefan Binding
These Asus Zenbook laptop use Realtek HDA codec combined with 2xCS35L41 Amplifiers using SPI. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20220815141953.25197-1-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-08-15mmc: meson-gx: Fix an error handling path in meson_mmc_probe()Christophe JAILLET
The commit in Fixes has introduced a new error handling which should goto the existing error handling path. Otherwise some resources leak. Fixes: 19c6beaa064c ("mmc: meson-gx: add device reset") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/be4b863bacf323521ba3a02efdc4fca9cdedd1a6.1659855351.git.christophe.jaillet@wanadoo.fr Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-08-15mmc: mtk-sd: Clear interrupts when cqe off/disableWenbin Mei
Currently we don't clear MSDC interrupts when cqe off/disable, which led to the data complete interrupt will be reserved for the next command. If the next command with data transfer after cqe off/disable, we process the CMD ready interrupt and trigger DMA start for data, but the data complete interrupt is already exists, then SW assume that the data transfer is complete, SW will trigger DMA stop, but the data may not be transmitted yet or is transmitting, so we may encounter the following error: mtk-msdc 11230000.mmc: CMD bus busy detected. Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com> Fixes: 88bd652b3c74 ("mmc: mediatek: command queue support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220728080048.21336-1-wenbin.mei@mediatek.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-08-15mmc: pxamci: Fix another error handling path in pxamci_probe()Christophe JAILLET
The commit in Fixes: has introduced an new error handling without branching to the existing error handling path. Update it now and release some resources if pxamci_init_ocr() fails. Fixes: 61951fd6cb49 ("mmc: pxamci: let mmc core handle regulators") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/07a2dcebf8ede69b484103de8f9df043f158cffd.1658862932.git.christophe.jaillet@wanadoo.fr Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-08-15mmc: pxamci: Fix an error handling path in pxamci_probe()Christophe JAILLET
The commit in Fixes: has moved some code around without updating gotos to the error handling path. Update it now and release some resources if pxamci_of_init() fails. Fixes: fa3a5115469c ("mmc: pxamci: call mmc_of_parse()") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/6d75855ad4e2470e9ed99e0df21bc30f0c925a29.1658862932.git.christophe.jaillet@wanadoo.fr Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-08-15selftests/powerpc: Add missing PMU selftests to .gitignoresRussell Currey
Some recently added selftests don't have their binaries in .gitignores, so add them. I also alphabetically sorted sampling_tests/.gitignore while I was in there. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220812071632.56095-1-ruscur@russell.cc
2022-08-15Merge branch 'mlxsw-fixes'David S. Miller
Petr Machata says: ==================== mlxsw: Fixes for PTP support This set fixes several issues in mlxsw PTP code. - Patch #1 fixes compilation warnings. - Patch #2 adjusts the order of operation during cleanup, thereby closing the window after PTP state was already cleaned in the ASIC for the given port, but before the port is removed, when the user could still in theory make changes to the configuration. - Patch #3 protects the PTP configuration with a custom mutex, instead of relying on RTNL, which is not held in all access paths. - Patch #4 forbids enablement of PTP only in RX or only in TX. The driver implicitly assumed this would be the case, but neglected to sanitize the configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Forbid PTP enablement only in RX or in TXAmit Cohen
Currently mlxsw driver configures one global PTP configuration for all ports. The reason is that the switch behaves like a transparent clock between CPU port and front-panel ports. When time stamp is enabled in any port, the hardware is configured to update the correction field. The fact that the configuration of CPU port affects all the ports, makes the correction field update to be global for all ports. Otherwise, user will see odd values in the correction field, as the switch will update the correction field in the CPU port, but not in all the front-panel ports. The CPU port is relevant in both RX and TX, so to avoid problematic configuration, forbid PTP enablement only in one direction, i.e., only in RX or TX. Without the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 0 rx_filter 0 new settings: tx_type 0 rx_filter 2 $ echo $? 0 With the change: $ hwstamp_ctl -i swp1 -r 12 -t 0 current settings: tx_type 1 rx_filter 2 SIOCSHWTSTAMP failed: Invalid argument Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Protect PTP configuration with a mutexAmit Cohen
Currently the functions mlxsw_sp2_ptp_{configure, deconfigure}_port() assume that they are called when RTNL is locked and they warn otherwise. The deconfigure function can be called when port is removed, for example as part of device reload, then there is no locked RTNL and the function warns [1]. To avoid such case, do not assume that RTNL protects this code, add a dedicated mutex instead. The mutex protects 'ptp_state->config' which stores the existing global configuration in hardware. Use this mutex also to protect the code which configures the hardware. Then, there will be only one configuration in any time, which will be updated in 'ptp_state' and a race will be avoided. [1]: RTNL: assertion failed at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c (1600) WARNING: CPU: 1 PID: 1583493 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1600 mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300 [mlxsw_spectrum] [...] CPU: 1 PID: 1583493 Comm: devlink Not tainted5.19.0-rc8-custom-127022-gb371dffda095 #789 Hardware name: Mellanox Technologies Ltd.MSN3420/VMOD0005, BIOS 5.11 01/06/2019 RIP: 0010:mlxsw_sp2_ptp_hwtstamp_set+0x2d3/0x300[mlxsw_spectrum] [...] Call Trace: <TASK> mlxsw_sp_port_remove+0x7e/0x190 [mlxsw_spectrum] mlxsw_sp_fini+0xd1/0x270 [mlxsw_spectrum] mlxsw_core_bus_device_unregister+0x55/0x280 [mlxsw_core] mlxsw_devlink_core_bus_device_reload_down+0x1c/0x30[mlxsw_core] devlink_reload+0x1ee/0x230 devlink_nl_cmd_reload+0x4de/0x580 genl_family_rcv_msg_doit+0xdc/0x140 genl_rcv_msg+0xd7/0x1d0 netlink_rcv_skb+0x49/0xf0 genl_rcv+0x1f/0x30 netlink_unicast+0x22f/0x350 netlink_sendmsg+0x208/0x440 __sys_sendto+0xf0/0x140 __x64_sys_sendto+0x1b/0x20 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 08ef8bc825d96 ("mlxsw: spectrum_ptp: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum: Clear PTP configuration after unregistering the netdeviceAmit Cohen
Currently as part of removing port, PTP API is called to clear the existing configuration and set the 'rx_filter' and 'tx_type' to zero. The clearing is done before unregistering the netdevice, which means that there is a window of time in which the user can reconfigure PTP in the port, and this configuration will not be cleared. Reorder the operations, clear PTP configuration after unregistering the netdevice. Fixes: 8748642751ede ("mlxsw: spectrum: PTP: Support SIOCGHWTSTAMP, SIOCSHWTSTAMP ioctls") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15mlxsw: spectrum_ptp: Fix compilation warningsAmit Cohen
In case that 'CONFIG_PTP_1588_CLOCK' is not enabled in the config file, there are implementations for the functions mlxsw_{sp,sp2}_ptp_txhdr_construct() as part of 'spectrum_ptp.h'. In this case, they should be defined as 'static' as they are not supposed to be used out of this file. Make the functions 'static', otherwise the following warnings are returned: "warning: no previous prototype for 'mlxsw_sp_ptp_txhdr_construct'" "warning: no previous prototype for 'mlxsw_sp2_ptp_txhdr_construct'" In addition, make the functions 'inline' for case that 'spectrum_ptp.h' will be included anywhere else and the functions would probably not be used, so compilation warnings about unused static will be returned. Fixes: 24157bc69f45 ("mlxsw: Send PTP packets as data packets to overcome a limitation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15net_sched: cls_route: disallow handle of 0Jamal Hadi Salim
Follows up on: https://lore.kernel.org/all/20220809170518.164662-1-cascardo@canonical.com/ handle of 0 implies from/to of universe realm which is not very sensible. Lets see what this patch will do: $sudo tc qdisc add dev $DEV root handle 1:0 prio //lets manufacture a way to insert handle of 0 $sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \ route to 0 from 0 classid 1:10 action ok //gets rejected... Error: handle of 0 is not valid. We have an error talking to the kernel, -1 //lets create a legit entry.. sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \ classid 1:10 action ok //what did the kernel insert? $sudo tc filter ls dev $DEV parent 1:0 filter protocol ip pref 100 route chain 0 filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10 action order 1: gact action pass random type none pass val 0 index 1 ref 1 bind 1 //Lets try to replace that legit entry with a handle of 0 $ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \ handle 0x000a8000 route to 0 from 0 classid 1:10 action drop Error: Replacing with handle of 0 is invalid. We have an error talking to the kernel, -1 And last, lets run Cascardo's POC: $ ./poc 0 0 -22 -22 -22 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15net: fix potential refcount leak in ndisc_router_discovery()Xin Xiong
The issue happens on specific paths in the function. After both the object `rt` and `neigh` are grabbed successfully, when `lifetime` is nonzero but the metric needs change, the function just deletes the route and set `rt` to NULL. Then, it may try grabbing `rt` and `neigh` again if above conditions hold. The function simply overwrite `neigh` if succeeds or returns if fails, without decreasing the reference count of previous `neigh`. This may result in memory leaks. Fix it by decrementing the reference count of `neigh` in place. Fixes: 6b2e04bc240f ("net: allow user to set metric on default route learned via Router Advertisement") Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/netDavid S. Miller
-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-11 (ice) This series contains updates to ice driver only. Benjamin corrects a misplaced parenthesis for a WARN_ON check. Michal removes WARN_ON from a check as its recoverable and not warranting of a call trace. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15neighbour: make proxy_queue.qlen limit per-deviceAlexander Mikhalitsyn
Right now we have a neigh_param PROXY_QLEN which specifies maximum length of neigh_table->proxy_queue. But in fact, this limitation doesn't work well because check condition looks like: tbl->proxy_queue.qlen > NEIGH_VAR(p, PROXY_QLEN) The problem is that p (struct neigh_parms) is a per-device thing, but tbl (struct neigh_table) is a system-wide global thing. It seems reasonable to make proxy_queue limit per-device based. v2: - nothing changed in this patch v3: - rebase to net tree Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@kernel.org> Cc: Yajun Deng <yajun.deng@linux.dev> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Christian Brauner <brauner@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: Konstantin Khorenko <khorenko@virtuozzo.com> Cc: kernel@openvz.org Cc: devel@openvz.org Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15neigh: fix possible DoS due to net iface start/stop loopDenis V. Lunev
Normal processing of ARP request (usually this is Ethernet broadcast packet) coming to the host is looking like the following: * the packet comes to arp_process() call and is passed through routing procedure * the request is put into the queue using pneigh_enqueue() if corresponding ARP record is not local (common case for container records on the host) * the request is processed by timer (within 80 jiffies by default) and ARP reply is sent from the same arp_process() using NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED condition (flag is set inside pneigh_enqueue()) And here the problem comes. Linux kernel calls pneigh_queue_purge() which destroys the whole queue of ARP requests on ANY network interface start/stop event through __neigh_ifdown(). This is actually not a problem within the original world as network interface start/stop was accessible to the host 'root' only, which could do more destructive things. But the world is changed and there are Linux containers available. Here container 'root' has an access to this API and could be considered as untrusted user in the hosting (container's) world. Thus there is an attack vector to other containers on node when container's root will endlessly start/stop interfaces. We have observed similar situation on a real production node when docker container was doing such activity and thus other containers on the node become not accessible. The patch proposed doing very simple thing. It drops only packets from the same namespace in the pneigh_queue_purge() where network interface state change is detected. This is enough to prevent the problem for the whole node preserving original semantics of the code. v2: - do del_timer_sync() if queue is empty after pneigh_queue_purge() v3: - rebase to net tree Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@kernel.org> Cc: Yajun Deng <yajun.deng@linux.dev> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Christian Brauner <brauner@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: Konstantin Khorenko <khorenko@virtuozzo.com> Cc: kernel@openvz.org Cc: devel@openvz.org Investigated-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15net: qrtr: start MHI channel after endpoit creationMaxim Kochetkov
MHI channel may generates event/interrupt right after enabling. It may leads to 2 race conditions issues. 1) Such event may be dropped by qcom_mhi_qrtr_dl_callback() at check: if (!qdev || mhi_res->transaction_status) return; Because dev_set_drvdata(&mhi_dev->dev, qdev) may be not performed at this moment. In this situation qrtr-ns will be unable to enumerate services in device. --------------------------------------------------------------- 2) Such event may come at the moment after dev_set_drvdata() and before qrtr_endpoint_register(). In this case kernel will panic with accessing wrong pointer at qcom_mhi_qrtr_dl_callback(): rc = qrtr_endpoint_post(&qdev->ep, mhi_res->buf_addr, mhi_res->bytes_xferd); Because endpoint is not created yet. -------------------------------------------------------------- So move mhi_prepare_for_transfer_autoqueue after endpoint creation to fix it. Fixes: a2e2cc0dbb11 ("net: qrtr: Start MHI channels during init") Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Hemant Kumar <quic_hemantk@quicinc.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-15drm/sun4i: dsi: Prevent underflow when computing packet sizesSamuel Holland
Currently, the packet overhead is subtracted using unsigned arithmetic. With a short sync pulse, this could underflow and wrap around to near the maximal u16 value. Fix this by using signed subtraction. The call to max() will correctly handle any negative numbers that are produced. Apply the same fix to the other timings, even though those subtractions are less likely to underflow. Fixes: 133add5b5ad4 ("drm/sun4i: Add Allwinner A31 MIPI-DSI controller support") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220812031623.34057-1-samuel@sholland.org
2022-08-15dt-bindings: display: sun4i: Add D1 TCONs to conditionalsSamuel Holland
When adding the D1 TCON bindings, I missed the conditional blocks that restrict the binding for TCON LCD vs TCON TV hardware. Add the D1 TCON variants to the appropriate blocks for DE2 TCON LCDs and TCON TVs. Fixes: ae5a5d26c15c ("dt-bindings: display: Add D1 display engine compatibles") Signed-off-by: Samuel Holland <samuel@sholland.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220812073702.57618-1-samuel@sholland.org
2022-08-15powerpc/pci: Fix get_phb_number() lockingMichael Ellerman
The recent change to get_phb_number() causes a DEBUG_ATOMIC_SLEEP warning on some systems: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by swapper/1: #0: c157efb0 (hose_spinlock){+.+.}-{2:2}, at: pcibios_alloc_controller+0x64/0x220 Preemption disabled at: [<00000000>] 0x0 CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0-yocto-standard+ #1 Call Trace: [d101dc90] [c073b264] dump_stack_lvl+0x50/0x8c (unreliable) [d101dcb0] [c0093b70] __might_resched+0x258/0x2a8 [d101dcd0] [c0d3e634] __mutex_lock+0x6c/0x6ec [d101dd50] [c0a84174] of_alias_get_id+0x50/0xf4 [d101dd80] [c002ec78] pcibios_alloc_controller+0x1b8/0x220 [d101ddd0] [c140c9dc] pmac_pci_init+0x198/0x784 [d101de50] [c140852c] discover_phbs+0x30/0x4c [d101de60] [c0007fd4] do_one_initcall+0x94/0x344 [d101ded0] [c1403b40] kernel_init_freeable+0x1a8/0x22c [d101df10] [c00086e0] kernel_init+0x34/0x160 [d101df30] [c001b334] ret_from_kernel_thread+0x5c/0x64 This is because pcibios_alloc_controller() holds hose_spinlock but of_alias_get_id() takes of_mutex which can sleep. The hose_spinlock protects the phb_bitmap, and also the hose_list, but it doesn't need to be held while get_phb_number() calls the OF routines, because those are only looking up information in the device tree. So fix it by having get_phb_number() take the hose_spinlock itself, only where required, and then dropping the lock before returning. pcibios_alloc_controller() then needs to take the lock again before the list_add() but that's safe, the order of the list is not important. Fixes: 0fe1e96fef0a ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220815065550.1303620-1-mpe@ellerman.id.au
2022-08-15cifs: missing directory in MAINTAINERS fileSteve French
The include/uapi/linux/cifs directory (not just fs/cifs and fs/smbfs_common) should be included in cifs entry in the MAINTAINERS file. Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-08-14Linux 6.0-rc1v6.0-rc1Linus Torvalds
2022-08-14radix-tree: replace gfp.h inclusion with gfp_types.hYury Norov
Radix tree header includes gfp.h for __GFP_BITS_SHIFT only. Now we have gfp_types.h for this. Fixes powerpc allmodconfig build: In file included from include/linux/nodemask.h:97, from include/linux/mmzone.h:17, from include/linux/gfp.h:7, from include/linux/radix-tree.h:12, from include/linux/idr.h:15, from include/linux/kernfs.h:12, from include/linux/sysfs.h:16, from include/linux/kobject.h:20, from include/linux/pci.h:35, from arch/powerpc/kernel/prom_init.c:24: include/linux/random.h: In function 'add_latent_entropy': >> include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first use in this function); did you mean 'add_latent_entropy'? 25 | add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy)); | ^~~~~~~~~~~~~~ | add_latent_entropy include/linux/random.h:25:46: note: each undeclared identifier is reported only once for each function it appears in Reported-by: kernel test robot <lkp@intel.com> CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-14Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs lseek fix from Al Viro: "Fix proc_reg_llseek() breakage. Always had been possible if somebody left NULL ->proc_lseek, became a practical issue now" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: take care to handle NULL ->proc_lseek()