summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-13s390/ipl: add missing intersection check to ipl_report handlingSven Schnelle
The code which handles the ipl report is searching for a free location in memory where it could copy the component and certificate entries to. It checks for intersection between the sections required for the kernel and the component/certificate data area, but fails to check whether the data structures linking these data areas together intersect. This might cause the iplreport copy code to overwrite the iplreport itself. Fix this by adding two addtional intersection checks. Cc: <stable@vger.kernel.org> Fixes: 9641b8cc733f ("s390/ipl: read IPL report at early boot") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-13tools/virtio: Ignore virtio-trace/trace-agentRong Tao
since commit 108fc82596e3("tools: Add guest trace agent as a user tool") introduce virtio-trace/trace-agent, it should be ignored in the git tree. Signed-off-by: Rong Tao <rongtao@cestc.cn> Message-Id: <tencent_52B2BC2F47540A5FEB46E710BD0C8485B409@qq.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-13vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_readyEugenio Pérez
Starting from an used_idx different than 0 is needed in use cases like virtual machine migration. Not doing so and letting the caller set an avail idx different than 0 causes destination device to try to use old buffers that source driver already recover and are not available anymore. Since vdpa_sim does not support receive inflight descriptors as a destination of a migration, let's set both avail_idx and used_idx the same at vq start. This is how vhost-user works in a VHOST_SET_VRING_BASE call. Although the simple fix is to set last_used_idx at vdpasim_set_vq_state, it would be reset at vdpasim_queue_ready. The last_avail_idx case is fixed with commit 0e84f918fac8 ("vdpa_sim: not reset state in vdpasim_queue_ready"). Since the only option is to make it equal to last_avail_idx, adding the only change needed here. This was discovered and tested live migrating the vdpa_sim_net device. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230302181857.925374-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-13vhost-vdpa: free iommu domain after last use during cleanupGautam Dawar
Currently vhost_vdpa_cleanup() unmaps the DMA mappings by calling `iommu_unmap(v->domain, map->start, map->size);` from vhost_vdpa_general_unmap() when the parent vDPA driver doesn't provide DMA config operations. However, the IOMMU domain referred to by `v->domain` is freed in vhost_vdpa_free_domain() before vhost_vdpa_cleanup() in vhost_vdpa_release() which results in NULL pointer de-reference. Accordingly, moving the call to vhost_vdpa_free_domain() in vhost_vdpa_cleanup() would makes sense. This will also help detaching the dma device in error handling of vhost_vdpa_alloc_domain(). This issue was observed on terminating QEMU with SIGQUIT. Fixes: 037d4305569a ("vhost-vdpa: call vhost_vdpa_cleanup during the release") Signed-off-by: Gautam Dawar <gautam.dawar@amd.com> Message-Id: <20230301163203.29883-1-gautam.dawar@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2023-03-12Linux 6.3-rc2v6.3-rc2Linus Torvalds
2023-03-12wifi: cfg80211: Partial revert "wifi: cfg80211: Fix use after free for wext"Hector Martin
This reverts part of commit 015b8cc5e7c4 ("wifi: cfg80211: Fix use after free for wext") This commit broke WPA offload by unconditionally clearing the crypto modes for non-WEP connections. Drop that part of the patch. Signed-off-by: Hector Martin <marcan@marcan.st> Reported-by: Ilya <me@0upti.me> Reported-and-tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Eric Curtin <ecurtin@redhat.com> Fixes: 015b8cc5e7c4 ("wifi: cfg80211: Fix use after free for wext") Cc: stable@kernel.org Link: https://lore.kernel.org/linux-wireless/ZAx0TWRBlGfv7pNl@kroah.com/T/#m11e6e0915ab8fa19ce8bc9695ab288c0fe018edf Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-12Merge tag 'tpm-v6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "Two additional bug fixes for v6.3" * tag 'tpm-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: disable hwrng for fTPM on some AMD designs tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
2023-03-12tpm: disable hwrng for fTPM on some AMD designsMario Limonciello
AMD has issued an advisory indicating that having fTPM enabled in BIOS can cause "stuttering" in the OS. This issue has been fixed in newer versions of the fTPM firmware, but it's up to system designers to decide whether to distribute it. This issue has existed for a while, but is more prevalent starting with kernel 6.1 because commit b006c439d58db ("hwrng: core - start hwrng kthread also for untrusted sources") started to use the fTPM for hwrng by default. However, all uses of /dev/hwrng result in unacceptable stuttering. So, simply disable registration of the defective hwrng when detecting these faulty fTPM versions. As this is caused by faulty firmware, it is plausible that such a problem could also be reproduced by other TPM interactions, but this hasn't been shown by any user's testing or reports. It is hypothesized to be triggered more frequently by the use of the RNG because userspace software will fetch random numbers regularly. Intentionally continue to register other TPM functionality so that users that rely upon PCR measurements or any storage of data will still have access to it. If it's found later that another TPM functionality is exacerbating this problem a module parameter it can be turned off entirely and a module parameter can be introduced to allow users who rely upon fTPM functionality to turn it on even though this problem is present. Link: https://www.amd.com/en/support/kb/faq/pa-410 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216989 Link: https://lore.kernel.org/all/20230209153120.261904-1-Jason@zx2c4.com/ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Cc: stable@vger.kernel.org Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Tested-by: reach622@mailcuk.com Tested-by: Bell <1138267643@qq.com> Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-03-12tpm/eventlog: Don't abort tpm_read_log on faulty ACPI addressMorten Linderud
tpm_read_log_acpi() should return -ENODEV when no eventlog from the ACPI table is found. If the firmware vendor includes an invalid log address we are unable to map from the ACPI memory and tpm_read_log() returns -EIO which would abort discovery of the eventlog. Change the return value from -EIO to -ENODEV when acpi_os_map_iomem() fails to map the event log. The following hardware was used to test this issue: Framework Laptop (Pre-production) BIOS: INSYDE Corp, Revision: 3.2 TPM Device: NTC, Firmware Revision: 7.2 Dump of the faulty ACPI TPM2 table: [000h 0000 4] Signature : "TPM2" [Trusted Platform Module hardware interface Table] [004h 0004 4] Table Length : 0000004C [008h 0008 1] Revision : 04 [009h 0009 1] Checksum : 2B [00Ah 0010 6] Oem ID : "INSYDE" [010h 0016 8] Oem Table ID : "TGL-ULT" [018h 0024 4] Oem Revision : 00000002 [01Ch 0028 4] Asl Compiler ID : "ACPI" [020h 0032 4] Asl Compiler Revision : 00040000 [024h 0036 2] Platform Class : 0000 [026h 0038 2] Reserved : 0000 [028h 0040 8] Control Address : 0000000000000000 [030h 0048 4] Start Method : 06 [Memory Mapped I/O] [034h 0052 12] Method Parameters : 00 00 00 00 00 00 00 00 00 00 00 00 [040h 0064 4] Minimum Log Length : 00010000 [044h 0068 8] Log Address : 000000004053D000 Fixes: 0cf577a03f21 ("tpm: Fix handling of missing event log") Tested-by: Erkki Eilonen <erkki@bearmetal.eu> Signed-off-by: Morten Linderud <morten@linderud.pw> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-03-12hwmon: tmp512: drop of_match_ptr for ID tableKrzysztof Kozlowski
The driver will match mostly by DT table (even thought there is regular ID table) so there is little benefit in of_match_ptr (this also allows ACPI matching via PRP0001, even though it might not be relevant here). This also fixes !CONFIG_OF error: drivers/hwmon/tmp513.c:610:34: error: ‘tmp51x_of_match’ defined but not used [-Werror=unused-const-variable=] Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230312193723.478032-2-krzysztof.kozlowski@linaro.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-03-12hwmon: (ucd90320) Add minimum delay between bus accessesLars-Peter Clausen
When probing the ucd90320 access to some of the registers randomly fails. Sometimes it NACKs a transfer, sometimes it returns just random data and the PEC check fails. Experimentation shows that this seems to be triggered by a register access directly back to back with a previous register write. Experimentation also shows that inserting a small delay after register writes makes the issue go away. Use a similar solution to what the max15301 driver does to solve the same problem. Create a custom set of bus read and write functions that make sure that the delay is added. Fixes: a470f11c5ba2 ("hwmon: (pmbus/ucd9000) Add support for UCD90320 Power Sequencer") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Link: https://lore.kernel.org/r/20230312160312.2227405-1-lars@metafoo.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-03-12x86/mce: Make sure logged MCEs are processed after sysfs updateYazen Ghannam
A recent change introduced a flag to queue up errors found during boot-time polling. These errors will be processed during late init once the MCE subsystem is fully set up. A number of sysfs updates call mce_restart() which goes through a subset of the CPU init flow. This includes polling MCA banks and logging any errors found. Since the same function is used as boot-time polling, errors will be queued. However, the system is now past late init, so the errors will remain queued until another error is found and the workqueue is triggered. Call mce_schedule_work() at the end of mce_restart() so that queued errors are processed. Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
2023-03-12hwmon: (ina3221) return prober error codeMarcus Folkesson
ret is set to 0 which do not indicate an error. Return -EINVAL instead. Fixes: a9e9dd9c6de5 ("hwmon: (ina3221) Read channel input source info from DT") Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com> Link: https://lore.kernel.org/r/20230310075035.246083-1-marcus.folkesson@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-03-12hwmon: (xgene) Fix use after free bug in xgene_hwmon_remove due to race ↵Zheng Wang
condition In xgene_hwmon_probe, &ctx->workq is bound with xgene_hwmon_evt_work. Then it will be started. If we remove the driver which will call xgene_hwmon_remove to clean up, there may be unfinished work. The possible sequence is as follows: Fix it by finishing the work before cleanup in xgene_hwmon_remove. CPU0 CPU1 |xgene_hwmon_evt_work xgene_hwmon_remove | kfifo_free(&ctx->async_msg_fifo);| | |kfifo_out_spinlocked |//use &ctx->async_msg_fifo Fixes: 2ca492e22cb7 ("hwmon: (xgene) Fix crash when alarm occurs before driver probe") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Link: https://lore.kernel.org/r/20230310084007.1403388-1-zyytlz.wz@163.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-03-12Merge tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: - Fix a crash if mount time quotacheck fails when there are inodes queued for garbage collection. - Fix an off by one error when discarding folios after writeback failure. * tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix off-by-one-block in xfs_discard_folio() xfs: quotacheck failure can race with background inode inactivation
2023-03-12Merge tag 'staging-6.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes and removal from Greg KH: "Here are four small staging driver fixes, and one big staging driver deletion for 6.3-rc2. The fixes are: - rtl8192e driver fixes for where the driver was attempting to execute various programs directly from the disk for unknown reasons - rtl8723bs driver fixes for issues found by Hans in testing The deleted driver is the removal of the r8188eu wireless driver as now in 6.3-rc1 we have a "real" wifi driver for one that includes support for many many more devices than this old driver did. So it's time to remove it as it is no longer needed. The maintainers of this driver all have acked its removal. Many thanks to them over the years for working to clean it up and keep it working while the real driver was being developed. All of these have been in linux-next this week with no reported problems" * tag 'staging-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: r8188eu: delete driver staging: rtl8723bs: Pass correct parameters to cfg80211_get_bss() staging: rtl8723bs: Fix key-store index handling staging: rtl8192e: Remove call_usermodehelper starting RadioPower.sh staging: rtl8192e: Remove function ..dm_check_ac_dc_power calling a script
2023-03-12Merge tag 'x86_urgent_for_v6.3_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "A single erratum fix for AMD machines: - Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No impact to anything as those machines will fallback to XSAVEC which is equivalent there" * tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Disable XSAVES on AMD family 0x17
2023-03-12Merge tag 'kernel.fork.v6.3-rc2' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull clone3 fix from Christian Brauner: "A simple fix for the clone3() system call. The CLONE_NEWTIME allows the creation of time namespaces. The flag reuses a bit from the CSIGNAL bits that are used in the legacy clone() system call to set the signal that gets sent to the parent after the child exits. The clone3() system call doesn't rely on CSIGNAL anymore as it uses a dedicated .exit_signal field in struct clone_args. So we blocked all CSIGNAL bits in clone3_args_valid(). When CLONE_NEWTIME was introduced and reused a CSIGNAL bit we forgot to adapt clone3_args_valid() causing CLONE_NEWTIME with clone3() to be rejected. Fix this" * tag 'kernel.fork.v6.3-rc2' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: selftests/clone3: test clone3 with CLONE_NEWTIME fork: allow CLONE_NEWTIME in clone3 flags
2023-03-12Merge tag 'vfs.misc.v6.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs fixes from Christian Brauner: - When allocating pages for a watch queue failed, we didn't return an error causing userspace to proceed even though all subsequent notifcations would be lost. Make sure to return an error. - Fix a misformed tree entry for the idmapping maintainers entry. - When setting file leases from an idmapped mount via generic_setlease() we need to take the idmapping into account otherwise taking a lease would fail from an idmapped mount. - Remove two redundant assignments, one in splice code and the other in locks code, that static checkers complained about. * tag 'vfs.misc.v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: filelocks: use mount idmapping for setlease permission check fs/locks: Remove redundant assignment to cmd splice: Remove redundant assignment to ret MAINTAINERS: repair a malformed T: entry in IDMAPPED MOUNTS watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
2023-03-12Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Bug fixes and regressions for ext4, the most serious of which is a potential deadlock during directory renames that was introduced during the merge window discovered by a combination of syzbot and lockdep" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: zero i_disksize when initializing the bootloader inode ext4: make sure fs error flag setted before clear journal error ext4: commit super block if fs record error when journal record without error ext4, jbd2: add an optimized bmap for the journal inode ext4: fix WARNING in ext4_update_inline_data ext4: move where set the MAY_INLINE_DATA flag is set ext4: Fix deadlock during directory rename ext4: Fix comment about the 64BIT feature docs: ext4: modify the group desc size to 64 ext4: fix another off-by-one fsmap error on 1k block filesystems ext4: fix RENAME_WHITEOUT handling for inline directories ext4: make kobj_type structures constant ext4: fix cgroup writeback accounting with fs-layer encryption
2023-03-12cpumask: relax sanity checking constraintsLinus Torvalds
The cpumask_check() was unnecessarily tight, and causes problems for the users of cpumask_next(). We have a number of users that take the previous return value of one of the bit scanning functions and subtract one to keep it in "range". But since the scanning functions end up returning up to 'small_cpumask_bits' instead of the tighter 'nr_cpumask_bits', the range really needs to be using that widened form. [ This "previous-1" behavior is also the reason we have all those comments about /* -1 is a legal arg here. */ and separate checks for that being ok. So we could have just made "small_cpumask_bits-1" be a similar special "don't check this" value. Tetsuo Handa even suggested a patch that only does that for cpumask_next(), since that seems to be the only actual case that triggers, but that all makes it even _more_ magical and special. So just relax the check ] One example of this kind of pattern being the 'c_start()' function in arch/x86/kernel/cpu/proc.c, but also duplicated in various forms on other architectures. Reported-by: syzbot+96cae094d90877641f32@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=96cae094d90877641f32 Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Link: https://lore.kernel.org/lkml/c1f4cc16-feea-b83c-82cf-1a1f007b7eb9@I-love.SAKURA.ne.jp/ Fixes: 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-11Merge tag 'i2c-for-6.3-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This marks the end of a transition to let I2C have the same probe semantics as other subsystems. Uwe took care that no drivers in the current tree nor in -next use the deprecated .probe call. So, it is a good time to switch to the new, standard semantics now. There is also a regression fix: - regression fix for the notifier handling of the I2C core - final coversions of drivers away from deprecated .probe - make .probe_new the standard probe and convert I2C core to use it * tag 'i2c-for-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: dev: Fix bus callback return values i2c: Convert drivers to new .probe() callback i2c: mux: Convert all drivers to new .probe() callback i2c: Switch .probe() to not take an id parameter media: i2c: ov2685: convert to i2c's .probe_new() media: i2c: ov5695: convert to i2c's .probe_new() w1: ds2482: Convert to i2c's .probe_new() serial: sc16is7xx: Convert to i2c's .probe_new() mtd: maps: pismo: Convert to i2c's .probe_new() misc: ad525x_dpot-i2c: Convert to i2c's .probe_new()
2023-03-11ubi: block: Fix missing blk_mq_end_requestRichard Weinberger
Switching to BLK_MQ_F_BLOCKING wrongly removed the call to blk_mq_end_request(). Add it back to have our IOs finished Fixes: 91cc8fbcc8c7 ("ubi: block: set BLK_MQ_F_BLOCKING") Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Daniel Palmer <daniel@0x0f.com> Link: https://lore.kernel.org/linux-mtd/CAHk-=wi29bbBNh3RqJKu3PxzpjDN5D5K17gEVtXrb7-6bfrnMQ@mail.gmail.com/ Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-11KVM: arm64: timers: Convert per-vcpu virtual offset to a global valueMarc Zyngier
Having a per-vcpu virtual offset is a pain. It needs to be synchronized on each update, and expands badly to a setup where different timers can have different offsets, or have composite offsets (as with NV). So let's start by replacing the use of the CNTVOFF_EL2 shadow register (which we want to reclaim for NV anyway), and make the virtual timer carry a pointer to a VM-wide offset. This simplifies the code significantly. It also addresses two terrible bugs: - The use of CNTVOFF_EL2 leads to some nice offset corruption when the sysreg gets reset, as reported by Joey. - The kvm mutex is taken from a vcpu ioctl, which goes against the locking rules... Reported-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230224173915.GA17407@e124191.cambridge.arm.com Tested-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20230224191640.3396734-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) nft_parse_register_load() gets an incorrect datatype size as input, from Jeremy Sowden. 2) incorrect maximum netlink attribute in nft_redir, also from Jeremy. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_redir: correct value of inet type `.maxattrs` netfilter: nft_redir: correct length for loading protocol registers netfilter: nft_masq: correct length for loading protocol registers netfilter: nft_nat: correct length for loading protocol registers ==================== Link: https://lore.kernel.org/r/20230309174655.69816-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-11ext4: zero i_disksize when initializing the bootloader inodeZhihao Cheng
If the boot loader inode has never been used before, the EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the i_size to 0. However, if the "never before used" boot loader has a non-zero i_size, then i_disksize will be non-zero, and the inconsistency between i_size and i_disksize can trigger a kernel warning: WARNING: CPU: 0 PID: 2580 at fs/ext4/file.c:319 CPU: 0 PID: 2580 Comm: bb Not tainted 6.3.0-rc1-00004-g703695902cfa RIP: 0010:ext4_file_write_iter+0xbc7/0xd10 Call Trace: vfs_write+0x3b1/0x5c0 ksys_write+0x77/0x160 __x64_sys_write+0x22/0x30 do_syscall_64+0x39/0x80 Reproducer: 1. create corrupted image and mount it: mke2fs -t ext4 /tmp/foo.img 200 debugfs -wR "sif <5> size 25700" /tmp/foo.img mount -t ext4 /tmp/foo.img /mnt cd /mnt echo 123 > file 2. Run the reproducer program: posix_memalign(&buf, 1024, 1024) fd = open("file", O_RDWR | O_DIRECT); ioctl(fd, EXT4_IOC_SWAP_BOOT); write(fd, buf, 1024); Fix this by setting i_disksize as well as i_size to zero when initiaizing the boot loader inode. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217159 Cc: stable@kernel.org Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Link: https://lore.kernel.org/r/20230308032643.641113-1-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: make sure fs error flag setted before clear journal errorYe Bin
Now, jounral error number maybe cleared even though ext4_commit_super() failed. This may lead to error flag miss, then fsck will miss to check file system deeply. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-3-yebin@huaweicloud.com
2023-03-11ext4: commit super block if fs record error when journal record without errorYe Bin
Now, 'es->s_state' maybe covered by recover journal. And journal errno maybe not recorded in journal sb as IO error. ext4_update_super() only update error information when 'sbi->s_add_error_count' large than zero. Then 'EXT4_ERROR_FS' flag maybe lost. To solve above issue just recover 'es->s_state' error flag after journal replay like error info. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307061703.245965-2-yebin@huaweicloud.com
2023-03-11ext4, jbd2: add an optimized bmap for the journal inodeTheodore Ts'o
The generic bmap() function exported by the VFS takes locks and does checks that are not necessary for the journal inode. So allow the file system to set a journal-optimized bmap function in journal->j_bmap. Reported-by: syzbot+9543479984ae9e576000@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=e4aaa78795e490421c79f76ec3679006c8ff4cf0 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: fix WARNING in ext4_update_inline_dataYe Bin
Syzbot found the following issue: EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. fscrypt: AES-256-CTS-CBC using implementation "cts-cbc-aes-aesni" fscrypt: AES-256-XTS using implementation "xts-aes-aesni" ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5071 at mm/page_alloc.c:5525 __alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 Modules linked in: CPU: 1 PID: 5071 Comm: syz-executor263 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:__alloc_pages+0x30a/0x560 mm/page_alloc.c:5525 RSP: 0018:ffffc90003c2f1c0 EFLAGS: 00010246 RAX: ffffc90003c2f220 RBX: 0000000000000014 RCX: 0000000000000000 RDX: 0000000000000028 RSI: 0000000000000000 RDI: ffffc90003c2f248 RBP: ffffc90003c2f2d8 R08: dffffc0000000000 R09: ffffc90003c2f220 R10: fffff52000785e49 R11: 1ffff92000785e44 R12: 0000000000040d40 R13: 1ffff92000785e40 R14: dffffc0000000000 R15: 1ffff92000785e3c FS: 0000555556c0d300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95d5e04138 CR3: 00000000793aa000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __alloc_pages_node include/linux/gfp.h:237 [inline] alloc_pages_node include/linux/gfp.h:260 [inline] __kmalloc_large_node+0x95/0x1e0 mm/slab_common.c:1113 __do_kmalloc_node mm/slab_common.c:956 [inline] __kmalloc+0xfe/0x190 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] kzalloc include/linux/slab.h:720 [inline] ext4_update_inline_data+0x236/0x6b0 fs/ext4/inline.c:346 ext4_update_inline_dir fs/ext4/inline.c:1115 [inline] ext4_try_add_inline_entry+0x328/0x990 fs/ext4/inline.c:1307 ext4_add_entry+0x5a4/0xeb0 fs/ext4/namei.c:2385 ext4_add_nondir+0x96/0x260 fs/ext4/namei.c:2772 ext4_create+0x36c/0x560 fs/ext4/namei.c:2817 lookup_open fs/namei.c:3413 [inline] open_last_lookups fs/namei.c:3481 [inline] path_openat+0x12ac/0x2dd0 fs/namei.c:3711 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_openat fs/open.c:1342 [inline] __se_sys_openat fs/open.c:1337 [inline] __x64_sys_openat+0x243/0x290 fs/open.c:1337 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue happens as follows: ext4_iget ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60 ext4_try_add_inline_entry __ext4_mark_inode_dirty ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44 ext4_xattr_shift_entries ->after shift i_inline_off is incorrect, actually is change to 176 ext4_try_add_inline_entry ext4_update_inline_dir get_max_inline_xattr_value_size if (EXT4_I(inode)->i_inline_off) entry = (struct ext4_xattr_entry *)((void *)raw_inode + EXT4_I(inode)->i_inline_off); free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size)); ->As entry is incorrect, then 'free' may be negative ext4_update_inline_data value = kzalloc(len, GFP_NOFS); -> len is unsigned int, maybe very large, then trigger warning when 'kzalloc()' To resolve the above issue we need to update 'i_inline_off' after 'ext4_xattr_shift_entries()'. We do not need to set EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty() already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA when it is needed may trigger a BUG_ON in ext4_writepages(). Reported-by: syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-11ext4: move where set the MAY_INLINE_DATA flag is setYe Bin
The only caller of ext4_find_inline_data_nolock() that needs setting of EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In ext4_write_inline_data_end() we just need to update inode->i_inline_off. Since we are going to add one more caller that does not need to set EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA out to ext4_iget_extra_inode(). Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-2-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-03-10Merge branch 'mptcp-fixes-for-6-3'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: fixes for 6.3 Patch 1 fixes a possible deadlock in subflow_error_report() reported by lockdep. The report was in fact a false positive but the modification makes sense and silences lockdep to allow syzkaller to find real issues. The regression has been introduced in v5.12. Patch 2 is a refactoring needed to be able to fix the two next issues. It improves the situation and can be backported up to v6.0. Patches 3 and 4 fix UaF reported by KASAN. It fixes issues potentially visible since v5.7 and v5.19 but only reproducible until recently (v6.0). These two patches depend on patch 2/7. Patch 5 fixes the order of the printed values: expected vs seen values. The regression has been introduced recently: v6.3-rc1. Patch 6 adds missing ro_after_init flags. A previous patch added them for other functions but these two have been missed. This previous patch has been backported to stable versions (up to v5.12) so probably better to do the same here. Patch 7 fixes tcp_set_state() being called twice in a row since v5.10. Patch 8 fixes another lockdep false positive issue but this time in MPTCP PM code. Same here, some modifications in the code has been made to silence this issue and help finding real ones later. This issue can be seen since v6.2. v1: https://lore.kernel.org/r/20230227-upstream-net-20230227-mptcp-fixes-v1-0-070e30ae4a8e@tessares.net ==================== Link: https://lore.kernel.org/r/20230227-upstream-net-20230227-mptcp-fixes-v2-0-47c2e95eada9@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: fix lockdep false positive in mptcp_pm_nl_create_listen_socket()Paolo Abeni
Christoph reports a lockdep splat in the mptcp_subflow_create_socket() error path, when such function is invoked by mptcp_pm_nl_create_listen_socket(). Such code path acquires two separates, nested socket lock, with the internal lock operation lacking the "nested" annotation. Adding that in sock_release() for mptcp's sake only could be confusing. Instead just add a new lockclass to the in-kernel msk socket, re-initializing the lockdep infra after the socket creation. Fixes: ad2171009d96 ("mptcp: fix locking for in-kernel listener creation") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/354 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: avoid setting TCP_CLOSE state twiceMatthieu Baerts
tcp_set_state() is called from tcp_done() already. There is then no need to first set the state to TCP_CLOSE, then call tcp_done(). Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/362 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: add ro_after_init for tcp{,v6}_prot_overrideGeliang Tang
Add __ro_after_init labels for the variables tcp_prot_override and tcpv6_prot_override, just like other variables adjacent to them, to indicate that they are initialised from the init hooks and no writes occur afterwards. Fixes: b19bc2945b40 ("mptcp: implement delegated actions") Cc: stable@vger.kernel.org Fixes: 51fa7f8ebf0e ("mptcp: mark ops structures as ro_after_init") Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10selftests: mptcp: userspace pm: fix printed valuesMatthieu Baerts
In case of errors, the printed message had the expected and the seen value inverted. This patch simply correct the order: first the expected value, then the one that has been seen. Fixes: 10d4273411be ("selftests: mptcp: userspace: print error details if any") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: fix UaF in listener shutdownPaolo Abeni
As reported by Christoph after having refactored the passive socket initialization, the mptcp listener shutdown path is prone to an UaF issue. BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x73/0xe0 Write of size 4 at addr ffff88810cb23098 by task syz-executor731/1266 CPU: 1 PID: 1266 Comm: syz-executor731 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report+0x16a/0x46f kasan_report+0xad/0x130 kasan_check_range+0x14a/0x1a0 _raw_spin_lock_bh+0x73/0xe0 subflow_error_report+0x6d/0x110 sk_error_report+0x3b/0x190 tcp_disconnect+0x138c/0x1aa0 inet_child_forget+0x6f/0x2e0 inet_csk_listen_stop+0x209/0x1060 __mptcp_close_ssk+0x52d/0x610 mptcp_destroy_common+0x165/0x640 mptcp_destroy+0x13/0x80 __mptcp_destroy_sock+0xe7/0x270 __mptcp_close+0x70e/0x9b0 mptcp_close+0x2b/0x150 inet_release+0xe9/0x1f0 __sock_release+0xd2/0x280 sock_close+0x15/0x20 __fput+0x252/0xa20 task_work_run+0x169/0x250 exit_to_user_mode_prepare+0x113/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc The msk grace period can legitly expire in between the last reference count dropped in mptcp_subflow_queue_clean() and the later eventual access in inet_csk_listen_stop() After the previous patch we don't need anymore special-casing msk listener socket cleanup: the mptcp worker will process each of the unaccepted msk sockets. Just drop the now unnecessary code. Please note this commit depends on the two parent ones: mptcp: refactor passive socket initialization mptcp: use the workqueue to destroy unaccepted sockets Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets") Cc: stable@vger.kernel.org Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/346 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: use the workqueue to destroy unaccepted socketsPaolo Abeni
Christoph reported a UaF at token lookup time after having refactored the passive socket initialization part: BUG: KASAN: use-after-free in __token_bucket_busy+0x253/0x260 Read of size 4 at addr ffff88810698d5b0 by task syz-executor653/3198 CPU: 1 PID: 3198 Comm: syz-executor653 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report+0x16a/0x46f kasan_report+0xad/0x130 __token_bucket_busy+0x253/0x260 mptcp_token_new_connect+0x13d/0x490 mptcp_connect+0x4ed/0x860 __inet_stream_connect+0x80e/0xd90 tcp_sendmsg_fastopen+0x3ce/0x710 mptcp_sendmsg+0xff1/0x1a20 inet_sendmsg+0x11d/0x140 __sys_sendto+0x405/0x490 __x64_sys_sendto+0xdc/0x1b0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc We need to properly clean-up all the paired MPTCP-level resources and be sure to release the msk last, even when the unaccepted subflow is destroyed by the TCP internals via inet_child_forget(). We can re-use the existing MPTCP_WORK_CLOSE_SUBFLOW infra, explicitly checking that for the critical scenario: the closed subflow is the MPC one, the msk is not accepted and eventually going through full cleanup. With such change, __mptcp_destroy_sock() is always called on msk sockets, even on accepted ones. We don't need anymore to transiently drop one sk reference at msk clone time. Please note this commit depends on the parent one: mptcp: refactor passive socket initialization Fixes: 58b09919626b ("mptcp: create msk early") Cc: stable@vger.kernel.org Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/347 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: refactor passive socket initializationPaolo Abeni
After commit 30e51b923e43 ("mptcp: fix unreleased socket in accept queue") unaccepted msk sockets go throu complete shutdown, we don't need anymore to delay inserting the first subflow into the subflow lists. The reference counting deserve some extra care, as __mptcp_close() is unaware of the request socket linkage to the first subflow. Please note that this is more a refactoring than a fix but because this modification is needed to include other corrections, see the following commits. Then a Fixes tag has been added here to help the stable team. Fixes: 30e51b923e43 ("mptcp: fix unreleased socket in accept queue") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mptcp: fix possible deadlock in subflow_error_reportPaolo Abeni
Christoph reported a possible deadlock while the TCP stack destroys an unaccepted subflow due to an incoming reset: the MPTCP socket error path tries to acquire the msk-level socket lock while TCP still owns the listener socket accept queue spinlock, and the reverse dependency already exists in the TCP stack. Note that the above is actually a lockdep false positive, as the chain involves two separate sockets. A different per-socket lockdep key will address the issue, but such a change will be quite invasive. Instead, we can simply stop earlier the socket error handling for orphaned or unaccepted subflows, breaking the critical lockdep chain. Error handling in such a scenario is a no-op. Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com> Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/355 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10Merge branch 'update-xdp_features-flag-according-to-nic-re-configuration'Jakub Kicinski
Lorenzo Bianconi says: ==================== update xdp_features flag according to NIC re-configuration Changes since v1: - rebase on top of net tree - remove NETDEV_XDP_ACT_NDO_XMIT_SG support in mlx5e driver - always enable NETDEV_XDP_ACT_NDO_XMIT support in mlx5e driver ==================== Link: https://lore.kernel.org/r/cover.1678364612.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10mvpp2: take care of xdp_features when reconfiguring queuesMatteo Croce
XDP is supported only if enough queues are present, so when reconfiguring the queues set xdp_features accordingly. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Suggested-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Matteo Croce <teknoraver@meta.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10net/mlx5e: take into account device reconfiguration for xdp_features flagLorenzo Bianconi
Take into account LRO and GRO configuration setting device xdp_features flag. Consider channel rq_wq_type enabling rx scatter-gatter support in xdp_features flag and disable NETDEV_XDP_ACT_NDO_XMIT_SG since it is not supported yet by the driver. Moreover always enable NETDEV_XDP_ACT_NDO_XMIT as the ndo_xdp_xmit callback does not require to load a dummy xdp program on the NIC. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Co-developed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10veth: take into account device reconfiguration for xdp_features flagLorenzo Bianconi
Take into account tx/rx queues reconfiguration setting device xdp_features flag. Moreover consider NETIF_F_GRO flag in order to enable ndo_xdp_xmit callback. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10net: ena: take into account xdp_features setting tx/rx queuesLorenzo Bianconi
ena nic allows xdp just if enough hw queues are available for XDP. Take into account queues configuration setting xdp_features. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Reviewed-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10net: thunderx: take into account xdp_features setting tx/rx queuesLorenzo Bianconi
thunderx nic allows xdp just if enough hw queues are available for XDP. Take into account queues configuration setting xdp_features. Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10xdp: add xdp_set_features_flag utility routineLorenzo Bianconi
Introduce xdp_set_features_flag utility routine in order to update dynamically xdp_features according to the dynamic hw configuration via ethtool (e.g. changing number of hw rx/tx queues). Add xdp_clear_features_flag() in order to clear all xdp_feature flag. Reviewed-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10tools: ynl: fix get_mask utility routineLorenzo Bianconi
Fix get_mask utility routine in order to take into account possible gaps in the elements list. Fixes: be5bea1cc0bf ("net: add basic C code generators for Netlink") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10tools: ynl: fix render-max for flags definitionLorenzo Bianconi
Properly manage render-max property for flags definition type introducing mask value and setting it to (last_element << 1) - 1 instead of adding max value set to last_element + 1 Fixes: be5bea1cc0bf ("net: add basic C code generators for Netlink") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10i40e: Fix kernel crash during reboot when adapter is in recovery modeIvan Vecera
If the driver detects during probe that firmware is in recovery mode then i40e_init_recovery_mode() is called and the rest of probe function is skipped including pci_set_drvdata(). Subsequent i40e_shutdown() called during shutdown/reboot dereferences NULL pointer as pci_get_drvdata() returns NULL. To fix call pci_set_drvdata() also during entering to recovery mode. Reproducer: 1) Lets have i40e NIC with firmware in recovery mode 2) Run reboot Result: [ 139.084698] i40e: Intel(R) Ethernet Connection XL710 Network Driver [ 139.090959] i40e: Copyright (c) 2013 - 2019 Intel Corporation. [ 139.108438] i40e 0000:02:00.0: Firmware recovery mode detected. Limiting functionality. [ 139.116439] i40e 0000:02:00.0: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode. [ 139.129499] i40e 0000:02:00.0: fw 8.3.64775 api 1.13 nvm 8.30 0x8000b78d 1.3106.0 [8086:1583] [15d9:084a] [ 139.215932] i40e 0000:02:00.0 enp2s0f0: renamed from eth0 [ 139.223292] i40e 0000:02:00.1: Firmware recovery mode detected. Limiting functionality. [ 139.231292] i40e 0000:02:00.1: Refer to the Intel(R) Ethernet Adapters and Devices User Guide for details on firmware recovery mode. [ 139.244406] i40e 0000:02:00.1: fw 8.3.64775 api 1.13 nvm 8.30 0x8000b78d 1.3106.0 [8086:1583] [15d9:084a] [ 139.329209] i40e 0000:02:00.1 enp2s0f1: renamed from eth0 ... [ 156.311376] BUG: kernel NULL pointer dereference, address: 00000000000006c2 [ 156.318330] #PF: supervisor write access in kernel mode [ 156.323546] #PF: error_code(0x0002) - not-present page [ 156.328679] PGD 0 P4D 0 [ 156.331210] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 156.335567] CPU: 26 PID: 15119 Comm: reboot Tainted: G E 6.2.0+ #1 [ 156.343126] Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.4 04/13/2022 [ 156.353369] RIP: 0010:i40e_shutdown+0x15/0x130 [i40e] [ 156.358430] Code: c1 fc ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 89 fd 53 48 8b 9f 48 01 00 00 <f0> 80 8b c2 06 00 00 04 f0 80 8b c0 06 00 00 08 48 8d bb 08 08 00 [ 156.377168] RSP: 0018:ffffb223c8447d90 EFLAGS: 00010282 [ 156.382384] RAX: ffffffffc073ee70 RBX: 0000000000000000 RCX: 0000000000000001 [ 156.389510] RDX: 0000000080000001 RSI: 0000000000000246 RDI: ffff95db49988000 [ 156.396634] RBP: ffff95db49988000 R08: ffffffffffffffff R09: ffffffff8bd17d40 [ 156.403759] R10: 0000000000000001 R11: ffffffff8a5e3d28 R12: ffff95db49988000 [ 156.410882] R13: ffffffff89a6fe17 R14: ffff95db49988150 R15: 0000000000000000 [ 156.418007] FS: 00007fe7c0cc3980(0000) GS:ffff95ea8ee80000(0000) knlGS:0000000000000000 [ 156.426083] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.431819] CR2: 00000000000006c2 CR3: 00000003092fc005 CR4: 0000000000770ee0 [ 156.438944] PKRU: 55555554 [ 156.441647] Call Trace: [ 156.444096] <TASK> [ 156.446199] pci_device_shutdown+0x38/0x60 [ 156.450297] device_shutdown+0x163/0x210 [ 156.454215] kernel_restart+0x12/0x70 [ 156.457872] __do_sys_reboot+0x1ab/0x230 [ 156.461789] ? vfs_writev+0xa6/0x1a0 [ 156.465362] ? __pfx_file_free_rcu+0x10/0x10 [ 156.469635] ? __call_rcu_common.constprop.85+0x109/0x5a0 [ 156.475034] do_syscall_64+0x3e/0x90 [ 156.478611] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 156.483658] RIP: 0033:0x7fe7bff37ab7 Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230309184509.984639-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>