summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-10-28ice: Add additional CSR registers to ETHTOOL_GREGSLukasz Czapnik
In the event of a Tx hang it can be useful to read a variety of hardware registers to capture some state about why the transmit queue got stuck. Extend the ETHTOOL_GREGS dump provided by the ice driver with several CSR registers that provide such relevant information regarding the hardware Tx state. This enables capturing relevant data to enable debugging such a Tx hang. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20221027104239.1691549-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'clean-up-sfp-register-definitions'Jakub Kicinski
Russell King says: ==================== Clean up SFP register definitions This two-part patch series cleans up the SFP register definitions by 1. converting them from hex to decimal, as all the definitions in the documents use decimal, this makes it easier to cross-reference. 2. moving the bit definitions for each register along side their register address definition ==================== Link: https://lore.kernel.org/r/Y1qFvaDlLVM1fHdG@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: sfp: move field definitions along side register indexRussell King (Oracle)
Just as we do for the A2h enum, arrange the A0h enum to have the field definitions next to their corresponding register index. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: sfp: convert register indexes from hex to decimalRussell King (Oracle)
The register indexes in the standards are in decimal rather than hex, so lets specify them in decimal in the header file so we can easily cross-reference without converting between hex and decimal. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: ethernet: adi: adin1110: Fix notifiersAlexandru Tachici
ADIN1110 was registering netdev_notifiers on each device probe. This leads to warnings/probe failures because of double registration of the same notifier when to adin1110/2111 devices are connected to the same system. Move the registration of netdev_notifiers in module init call, in this way multiple driver instances can use the same notifiers. Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Link: https://lore.kernel.org/r/20221027095655.89890-2-alexandru.tachici@analog.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'net-mtk_eth_soc-improve-pcs-implementation'Jakub Kicinski
Russell King says: ==================== net: mtk_eth_soc: improve PCS implementation As a result of invesigations from Frank Wunderlich, we know a lot more about the Mediatek "SGMII" PCS block, and can implement the PCS support correctly. This series achieves that, and Frank has tested the final result and reports that it works for him. The series could do with further testing by others, but I suspect that is unlikely to happen until it is merged based on past performances with this driver. Briefly, the patches in order: 1. Add a new helper to get the link timer duration in nanoseconds 2. Add definitions for the newly discovered registers and updates to bit definitions, including bitmasks for the BMCR, BMSR and two advertisement registers. 3. Remove unnecessary/unused error handling (functions always returning zero.) 4. Adding the missing pcs_get_state() implementation. 5. Converting the code to use regmap_update_bits() rather than open-coding read-modify-write sequences. 6. Adding out-of-band speed and duplex forcing for all non-inband modes not just the 802.3z link modes the code currently does. 7. Moving the release of the PHY power down to the main pcs_config() function. 8. Moving the interface speed selection to the main pcs_config() function. 9. Adding advertisement programming. 10. Adding correct link timer programming using the new helper in the first patch. 11. Adding support for 802.3z negotiation. There is one remaining issue - when configuring the PCS for in-band, for some reason the AN restart bit is always set. This should not be necessary, but requires further investigation with the hardware to find out whether it is really necessary. I suspect this was a work around for a previous poor implementation. ==================== Link: https://lore.kernel.org/r/Y1qDMw+DJLAJHT40@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add support for in-band 802.3z negotiationRussell King (Oracle)
As a result of help from Frank Wunderlich to investigate and test, we now know how to program this PCS for in-band 802.3z negotiation. Add support for this by moving the contents of the two functions into the common mtk_pcs_config() function and adding the register settings for 802.3z negotiation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move and correct link timer programmingRussell King (Oracle)
Program the link timer appropriately for the interface mode being used, using the newly introduced phylink helper that provides the nanosecond link timer interval. The intervals are 1.6ms for SGMII based protocols and 10ms for 802.3z based protocols. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add advertisement programmingRussell King (Oracle)
Program the advertisement into the mtk PCS block. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move interface speed selectionRussell King (Oracle)
Move the selection of the underlying interface speed to the pcs_config function, so we always program the interface speed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move PHY power upRussell King (Oracle)
The PHY power up is common to both configuration paths, so move it into the parent function. We need to do this for all serdes modes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add out of band forcing of speed and duplex in pcs_link_upRussell King (Oracle)
Add support for forcing the link speed and duplex setting in the pcs_link_up() method for out of band modes, which will be useful when we finish converting the pcs_config() method. Until then, we still have to force duplex for 802.3z modes to work correctly. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: convert mtk_sgmii to use regmap_update_bits()Russell King (Oracle)
mtk_sgmii does a lot of read-modify-write operations, for which there is a specific regmap function. Use this function instead of open-coding the operations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add pcs_get_state() implementationRussell King (Oracle)
Add a pcs_get_state() implementation which uses the advertisements to compute the resulting link modes, and BMSR contents to determine negotiation and link status. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: eliminate unnecessary error handlingRussell King (Oracle)
The functions called by the pcs_config() method always return zero, so there is no point trying to handle an error from these functions. Make these functions void, eliminate the "err" variable and simply return zero from the pcs_config() function itself. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add definitions for PCSRussell King (Oracle)
As a result of help from Frank Wunderlich to investigate and test, we know a bit more about the PCS on the Mediatek platforms. Update the definitions from this investigation. This PCS appears similar, but not identical to the Lynx PCS. Although not included in this patch, but for future reference, the PHY ID registers at offset 4 read as 0x4d544950 'MTIP'. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: phylink: add phylink_get_link_timer_ns() helperRussell King (Oracle)
Add a helper to convert the PHY interface mode to the required link timer setting as stated by the appropriate standard. Inappropriate interface modes return an error. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'a-few-corrections-for-sock_support_zc'Jakub Kicinski
Pavel Begunkov says: ==================== a few corrections for SOCK_SUPPORT_ZC There are several places/cases that got overlooked in regards to SOCK_SUPPORT_ZC. We're lacking the flag for IPv6 UDP sockets and accepted TCP sockets. We also should clear the flag when someone tries to hijack a socket by replacing the ->sk_prot callbacks. ==================== Link: https://lore.kernel.org/r/cover.1666825799.git.asml.silence@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: also flag accepted sockets supporting msghdr originated zerocopyStefan Metzmacher
Without this only the client initiated tcp sockets have SOCK_SUPPORT_ZC. The listening socket on the server also has it, but the accepted connections didn't, which meant IORING_OP_SEND[MSG]_ZC will always fails with -EOPNOTSUPP. Fixes: e993ffe3da4b ("net: flag sockets supporting msghdr originated zerocopy") Cc: <stable@vger.kernel.org> # 6.0 CC: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/io-uring/20221024141503.22b4e251@kernel.org/T/#m38aa19b0b825758fb97860a38ad13122051f9dda Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net/ulp: remove SOCK_SUPPORT_ZC from tls socketsPavel Begunkov
Remove SOCK_SUPPORT_ZC when we're setting ulp as it might not support msghdr::ubuf_info, e.g. like TLS replacing ->sk_prot with a new set of handlers. Cc: <stable@vger.kernel.org> # 6.0 Reported-by: Jakub Kicinski <kuba@kernel.org> Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: remove SOCK_SUPPORT_ZC from sockmapPavel Begunkov
sockmap replaces ->sk_prot with its own callbacks, we should remove SOCK_SUPPORT_ZC as the new proto doesn't support msghdr::ubuf_info. Cc: <stable@vger.kernel.org> # 6.0 Reported-by: Jakub Kicinski <kuba@kernel.org> Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28udp: advertise ipv6 udp support for msghdr::ubuf_infoPavel Begunkov
Mark udp ipv6 as supporting msghdr::ubuf_info. In the original commit SOCK_SUPPORT_ZC was supposed to be set by a udp_init_sock() call from udp6_init_sock(), but d38afeec26ed4 ("tcp/udp: Call inet6_destroy_sock() in IPv6 ...") removed it and so ipv6 udp misses the flag. Cc: <stable@vger.kernel.org> # 6.0 Fixes: e993ffe3da4bc ("net: flag sockets supporting msghdr originated zerocopy") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28enic: MAINTAINERS: Update enic maintainersGovindarajulu Varadarajan
Update enic maintainers. Signed-off-by: Govindarajulu Varadarajan <govind.varadar@gmail.com> Link: https://lore.kernel.org/r/20221028042159.735670-1-govind.varadar@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: openvswitch: add missing .resv_start_opJakub Kicinski
I missed one of the families in OvS when annotating .resv_start_op. This triggers the warning added in commit ce48ebdd5651 ("genetlink: limit the use of validation workarounds to old ops"). Reported-by: syzbot+40eb8c0447c0e47a7e9b@syzkaller.appspotmail.com Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes") Link: https://lore.kernel.org/r/20221028032501.2724270-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28netlink: hide validation union fields from kdocJakub Kicinski
Mark the validation fields as private, users shouldn't set them directly and they are too complicated to explain in a more succinct way (there's already a long explanation in the comment above). The strict_start_type field is set directly and has a dedicated comment so move that above the "private" section. Link: https://lore.kernel.org/r/20221027212107.2639255-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'net-remove-the-obsolte-u64_stats_fetch_-_irq'Jakub Kicinski
Sebastian Andrzej Siewior says: ==================== net: Remove the obsolte u64_stats_fetch_*_irq() This is the removal of u64_stats_fetch_*_irq() users in networking. The prerequisites are part of v6.1-rc1. The spi and bpf bits are not part of the series and have been routed directly. ==================== Link: https://lore.kernel.org/r/20221026132215.696950-1-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: Remove the obsolte u64_stats_fetch_*_irq() users (net).Thomas Gleixner
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).Thomas Gleixner
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Kalle Valo says:Jakub Kicinski
==================== pull-request: wireless-next-2022-10-28 First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7. All mac80211 driver are now converted to use internal TX queues, this might cause some regressions so we wanted to do this early in the cycle. Note: wireless tree was merged[1] to wireless-next to avoid some conflicts with mac80211 patches between the trees. Unfortunately there are still two smaller conflicts in net/mac80211/util.c which Stephen also reported[2]. In the first conflict initialise scratch_len to "params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and in the second conflict take the version which uses elems->scratch_pos. [1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91 [2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/ mac80211 - preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues - add API to show the link STAs in debugfs - all mac80211 drivers are now using mac80211 internal TX queues (iTXQs) rtw89 - support 8852BE rtl8xxxu - support RTL8188FU brmfmac - support two station interfaces concurrently bcma - support SPROM rev 11 ==================== Link: https://lore.kernel.org/r/20221028132943.304ECC433B5@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge tag 's390-6.1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Remove outdated linux390 link from MAINTAINERS - Add few missing EX_TABLE entries to inline assemblies - Fix raw data collection for pai_ext PMU - Add kernel image secure boot trailer for future firmware versions - Fix out-of-bounds access on cio_ignore free - Fix memory allocation of mdev_types array in vfio-ap * tag 's390-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vfio-ap: Fix memory allocation for mdev_types array s390/cio: fix out-of-bounds access on cio_ignore free s390/pai: fix raw data collection for PMU pai_ext s390/boot: add secure boot trailer s390/pci: add missing EX_TABLE entries to __pcistg_mio_inuser()/__pcilg_mio_inuser() s390/futex: add missing EX_TABLE entry to __futex_atomic_op() s390/uaccess: add missing EX_TABLE entries to __clear_user() MAINTAINERS: remove outdated linux390 link
2022-10-28Merge tag 'riscv-for-linus-6.1-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for a build warning in the jump_label code - One of the git://github -> https://github cleanups, for the SiFive drivers - A fix for the kasan initialization code, this still likely warrants some cleanups but that's a bigger problem and at least this fixes the crashes in the short term - A pair of fixes for extension support detection on mixed LLVM/GNU toolchains - A fix for a runtime warning in the /proc/cpuinfo code * tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Fix /proc/cpuinfo cpumask warning riscv: fix detection of toolchain Zihintpause support riscv: fix detection of toolchain Zicbom support riscv: mm: add missing memcpy in kasan_init MAINTAINERS: git://github.com -> https://github.com for sifive riscv: jump_label: mark arguments as const to satisfy asm constraints
2022-10-28Merge tag 'acpi-6.1-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and device properties fixes from Rafael Wysocki: "These fix device properties documentation and the ACPI PCC code, add a new IRQ override quirk for resource handling and add one more item to the list of device IDs to be ignored when returned by _DEP. Specifics: - Fix the documentation of the *_match_string() family of functions to properly cover the return value (Andy Shevchenko) - Fix a possible integer overflow during multiplication in the ACPI PCC code (Manank Patel) - Make the ACPI device resources code skip IRQ override on Asus Vivobook S5602ZA (Tamim Khan) - Add LATT2021 to the list of device IDs that are ignored when returned by _DEP, because there are no drivers for them in the kernel and no plans to add such drivers (Hans de Goede)" * tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: scan: Add LATT2021 to acpi_ignore_dep_ids[] ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA ACPI: PCC: Fix unintentional integer overflow device property: Fix documentation for *_match_string() APIs
2022-10-28Merge tag 'pm-6.1-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These make the intel_pstate driver work as expected on all hybrid platforms to date (regardless of possible platform firmware issues), fix hybrid sleep on systems using suspend-to-idle by default, make the generic power domains code handle disabled idle states properly and update pm-graph. Specifics: - Make intel_pstate use what is known about the hardware instead of relying on information from the platform firmware (ACPI CPPC in particular) to establish the relationship between the HWP CPU performance levels and frequencies on all hybrid platforms available to date (Rafael Wysocki) - Allow hybrid sleep to use suspend-to-idle as a system suspend method if it is the current suspend method of choice (Mario Limonciello) - Fix handling of unavailable/disabled idle states in the generic power domains code (Sudeep Holla) - Update the pm-graph suite of utilities to version 5.10 which is fixes-mostly and does not add any new features (Todd Brandt)" * tag 'pm-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: domains: Fix handling of unavailable/disabled idle states pm-graph v5.10 cpufreq: intel_pstate: hybrid: Use known scaling factor for P-cores cpufreq: intel_pstate: Read all MSRs on the target CPU PM: hibernate: Allow hybrid sleep to work with s2idle
2022-10-28bpf: check max_entries before allocating memoryFlorian Lehner
For maps of type BPF_MAP_TYPE_CPUMAP memory is allocated first before checking the max_entries argument. If then max_entries is greater than NR_CPUS additional work needs to be done to free allocated memory before an error is returned. This changes moves the check on max_entries before the allocation happens. Signed-off-by: Florian Lehner <dev@der-flo.net> Link: https://lore.kernel.org/r/20221028183405.59554-1-dev@der-flo.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-10-29random: use arch_get_random*_early() in random_init()Jean-Philippe Brucker
While reworking the archrandom handling, commit d349ab99eec7 ("random: handle archrandom with multiple longs") switched to the non-early archrandom helpers in random_init(), which broke initialization of the entropy pool from the arm64 random generator. Indeed at that point the arm64 CPU features, which verify that all CPUs have compatible capabilities, are not finalized so arch_get_random_seed_longs() is unsuccessful. Instead random_init() should use the _early functions, which check only the boot CPU on arm64. On other architectures the _early functions directly call the normal ones. Fixes: d349ab99eec7 ("random: handle archrandom with multiple longs") Cc: stable@vger.kernel.org Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-28tools/nolibc/string: Fix memcmp() implementationRasmus Villemoes
The C standard says that memcmp() must treat the buffers as consisting of "unsigned chars". If char happens to be unsigned, the casts are ok, but then obviously the c1 variable can never contain a negative value. And when char is signed, the casts are wrong, and there's still a problem with using an 8-bit quantity to hold the difference, because that can range from -255 to +255. For example, assuming char is signed, comparing two 1-byte buffers, one containing 0x00 and another 0x80, the current implementation would return -128 for both memcmp(a, b, 1) and memcmp(b, a, 1), whereas one of those should of course return something positive. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-10-28tools/nolibc: Fix missing strlen() definition and infinite loop with gcc-12Willy Tarreau
When built at -Os, gcc-12 recognizes an strlen() pattern in nolibc_strlen() and replaces it with a jump to strlen(), which is not defined as a symbol and breaks compilation. Worse, when the function is called strlen(), the function is simply replaced with a jump to itself, hence becomes an infinite loop. One way to avoid this is to always set -ffreestanding, but the calling code doesn't know this and there's no way (either via attributes or pragmas) to globally enable it from include files, effectively leaving a painful situation for the caller. Alexey suggested to place an empty asm() statement inside the loop to stop gcc from recognizing a well-known pattern, which happens to work pretty fine. At least it allows us to make sure our local definition is not replaced with a self jump. The function only needs to be renamed back to strlen() so that the symbol exists, which implies that nolibc_strlen() which is used on variable strings has to be declared as a macro that points back to it before the strlen() macro is redifined. It was verified to produce valid code with gcc 3.4 to 12.1 at different optimization levels, and both with constant and variable strings. In case this problem surfaces again in the future, an alternate approach consisting in adding an optimize("no-tree-loop-distribute-patterns") function attribute for gcc>=12 worked as well but is less pretty. Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/r/202210081618.754a77db-yujie.liu@intel.com Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Fixes: 96980b833a21 ("tools/nolibc/string: do not use __builtin_strlen() at -O0") Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-10-28mm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off regionSebastian Andrzej Siewior
lru_gen_add_mm() has been added within an IRQ-off region in the commit mentioned below. The other invocations of lru_gen_add_mm() are not within an IRQ-off region. The invocation within IRQ-off region is problematic on PREEMPT_RT because the function is using a spin_lock_t which must not be used within IRQ-disabled regions. The other invocations of lru_gen_add_mm() occur while task_struct::alloc_lock is acquired. Move lru_gen_add_mm() after interrupts are enabled and before task_unlock(). Link: https://lkml.kernel.org/r/20221026134830.711887-1-bigeasy@linutronix.de Fixes: bd74fdaea1460 ("mm: multi-gen LRU: support page table walks") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28lib: maple_tree: remove unneeded initialization in mtree_range_walk()Lukas Bulwahn
Before the do-while loop in mtree_range_walk(), the variables next, min, max need to be initialized. The variables last, prev_min and prev_max are set within the loop body before they are eventually used after exiting the loop body. As it is a do-while loop, the loop body is executed at least once, so the variables last, prev_min and prev_max do not need to be initialized before the loop body. Remove unneeded initialization of last and prev_min. The needless initialization was reported by clang-analyzer as Dead Stores. As the compiler already identifies these assignments as unneeded, it optimizes the assignments away. Hence: No functional change. No change in object code. Link: https://lkml.kernel.org/r/20221026120029.12555-2-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mmap: fix remap_file_pages() regressionLiam Howlett
When using the VMA iterator, the final execution will set the variable 'next' to NULL which causes the function to fail out. Restore the break in the loop to exit the VMA iterator early without clearing NULL fixes the issue. Link: https://lore.kernel.org/lkml/29344.1666681759@jrobl/ Link: https://lkml.kernel.org/r/20221025161222.2634030-1-Liam.Howlett@oracle.com Fixes: 763ecb035029 (mm: remove the vma linked list) Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: "J. R. Okajima" <hooanon05g@gmail.com> Tested-by: "J. R. Okajima" <hooanon05g@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm/shmem: ensure proper fallback if page faultsIra Weiny
The kernel test robot flagged a recursive lock as a result of a conversion from kmap_atomic() to kmap_local_folio()[Link] The cause was due to the code depending on the kmap_atomic() side effect of disabling page faults. In that case the code expects the fault to fail and take the fallback case. git archaeology implied that the recursion may not be an actual bug.[1] However, depending on the implementation of the mmap_lock and the condition of the call there may still be a deadlock.[2] So this is not purely a lockdep issue. Considering a single threaded call stack there are 3 options. 1) Different mm's are in play (no issue) 2) Readlock implementation is recursive and same mm is in play (no issue) 3) Readlock implementation is _not_ recursive (issue) The mmap_lock is recursive so with a single thread there is no issue. However, Matthew pointed out a deadlock scenario when you consider additional process' and threads thusly. "The readlock implementation is only recursive if nobody else has taken a write lock. If you have a multithreaded process, one of the other threads can call mmap() and that will prevent recursion (due to fairness). Even if it's a different process that you're trying to acquire the mmap read lock on, you can still get into a deadly embrace. eg: process A thread 1 takes read lock on own mmap_lock process A thread 2 calls mmap, blocks taking write lock process B thread 1 takes page fault, read lock on own mmap lock process B thread 2 calls mmap, blocks taking write lock process A thread 1 blocks taking read lock on process B process B thread 1 blocks taking read lock on process A Now all four threads are blocked waiting for each other." Regardless using pagefault_disable() ensures that no matter what locking implementation is used a deadlock will not occur. Add an explicit pagefault_disable() and a big comment to explain this for future souls looking at this code. [1] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/ [2] https://lore.kernel.org/lkml/Y1bXBtGTCym77%2FoD@casper.infradead.org/ Link: https://lkml.kernel.org/r/20221025220108.2366043-1-ira.weiny@intel.com Link: https://lore.kernel.org/r/202210211215.9dc6efb5-yujie.liu@intel.com Fixes: 7a7256d5f512 ("shmem: convert shmem_mfill_atomic_pte() to use a folio") Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: kernel test robot <yujie.liu@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm/userfaultfd: replace kmap/kmap_atomic() with kmap_local_page()Ira Weiny
kmap() and kmap_atomic() are being deprecated in favor of kmap_local_page() which is appropriate for any thread local context.[1] A recent locking bug report with userfaultfd showed that the conversion of the kmap_atomic()'s in those code flows requires care with regard to the prevention of deadlock.[2] git archaeology implied that the recursion may not be an actual bug.[3] However, depending on the implementation of the mmap_lock and the condition of the call there may still be a deadlock.[4] So this is not purely a lockdep issue. Considering a single threaded call stack there are 3 options. 1) Different mm's are in play (no issue) 2) Readlock implementation is recursive and same mm is in play (no issue) 3) Readlock implementation is _not_ recursive (issue) The mmap_lock is recursive so with a single thread there is no issue. However, Matthew pointed out a deadlock scenario when you consider additional process' and threads thusly. "The readlock implementation is only recursive if nobody else has taken a write lock. If you have a multithreaded process, one of the other threads can call mmap() and that will prevent recursion (due to fairness). Even if it's a different process that you're trying to acquire the mmap read lock on, you can still get into a deadly embrace. eg: process A thread 1 takes read lock on own mmap_lock process A thread 2 calls mmap, blocks taking write lock process B thread 1 takes page fault, read lock on own mmap lock process B thread 2 calls mmap, blocks taking write lock process A thread 1 blocks taking read lock on process B process B thread 1 blocks taking read lock on process A Now all four threads are blocked waiting for each other." Regardless using pagefault_disable() ensures that no matter what locking implementation is used a deadlock will not occur. Complete kmap conversion in userfaultfd by replacing the kmap() and kmap_atomic() calls with kmap_local_page(). When replacing the kmap_atomic() call ensure page faults continue to be disabled to support the correct fall back behavior and add a comment to inform future souls of the requirement. [1] https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com/ [2] https://lore.kernel.org/all/Y1Mh2S7fUGQ%2FiKFR@iweiny-desk3/ [3] https://lore.kernel.org/all/Y1MymJ%2FINb45AdaY@iweiny-desk3/ [4] https://lore.kernel.org/lkml/Y1bXBtGTCym77%2FoD@casper.infradead.org/ [ira.weiny@intel.com: v2] Link: https://lkml.kernel.org/r/20221025220136.2366143-1-ira.weiny@intel.com Link: https://lkml.kernel.org/r/20221024043452.1491677-1-ira.weiny@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28x86: fortify: kmsan: fix KMSAN fortify buildsAlexander Potapenko
Ensure that KMSAN builds replace memset/memcpy/memmove calls with the respective __msan_XXX functions, and that none of the macros are redefined twice. This should allow building kernel with both CONFIG_KMSAN and CONFIG_FORTIFY_SOURCE. Link: https://lkml.kernel.org/r/20221024212144.2852069-5-glider@google.com Link: https://github.com/google/kmsan/issues/89 Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Tamas K Lengyel <tamas.lengyel@zentific.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28x86: asm: make sure __put_user_size() evaluates pointer onceAlexander Potapenko
User access macros must ensure their arguments are evaluated only once if they are used more than once in the macro body. Adding instrument_put_user() to __put_user_size() resulted in double evaluation of the `ptr` argument, which led to correctness issues when performing e.g. unsafe_put_user(..., p++, ...). To fix those issues, evaluate the `ptr` argument of __put_user_size() at the beginning of the macro. Link: https://lkml.kernel.org/r/20221024212144.2852069-4-glider@google.com Fixes: 888f84a6da4d ("x86: asm: instrument usercopy in get_user() and put_user()") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: youling257 <youling257@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by defaultAlexander Potapenko
KMSAN adds a lot of instrumentation to the code, which results in increased stack usage (up to 2048 bytes and more in some cases). It's hard to predict how big the stack frames can be, so we disable the warnings for KMSAN instead. Link: https://lkml.kernel.org/r/20221024212144.2852069-3-glider@google.com Link: https://github.com/google/kmsan/issues/89 Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28x86/purgatory: disable KMSAN instrumentationAlexander Potapenko
The stand-alone purgatory.ro does not contain the KMSAN runtime, therefore it can't be built with KMSAN compiler instrumentation. Link: https://lkml.kernel.org/r/20221024212144.2852069-2-glider@google.com Link: https://github.com/google/kmsan/issues/89 Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm: kmsan: export kmsan_copy_page_meta()Alexander Potapenko
Certain modules call copy_user_highpage(), which calls kmsan_copy_page_meta() under KMSAN, so we need to export the latter. Link: https://lkml.kernel.org/r/20221024212144.2852069-1-glider@google.com Link: https://github.com/google/kmsan/issues/89 Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm: migrate: fix return value if all subpages of THPs are migrated successfullyBaolin Wang
During THP migration, if THPs are not migrated but they are split and all subpages are migrated successfully, migrate_pages() will still return the number of THP pages that were not migrated. This will confuse the callers of migrate_pages(). For example, the longterm pinning will failed though all pages are migrated successfully. Thus we should return 0 to indicate that all pages are migrated in this case Link: https://lkml.kernel.org/r/de386aa864be9158d2f3b344091419ea7c38b2f7.1666599848.git.baolin.wang@linux.alibaba.com Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Zi Yan <ziy@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm/uffd: fix vma check on userfault for wpPeter Xu
We used to have a report that pte-marker code can be reached even when uffd-wp is not compiled in for file memories, here: https://lore.kernel.org/all/YzeR+R6b4bwBlBHh@x1n/T/#u I just got time to revisit this and found that the root cause is we simply messed up with the vma check, so that for !PTE_MARKER_UFFD_WP system, we will allow UFFDIO_REGISTER of MINOR & WP upon shmem as the check was wrong: if (vm_flags & VM_UFFD_MINOR) return is_vm_hugetlb_page(vma) || vma_is_shmem(vma); Where we'll allow anything to pass on shmem as long as minor mode is requested. Axel did it right when introducing minor mode but I messed it up in b1f9e876862d when moving code around. Fix it. Link: https://lkml.kernel.org/r/20221024193336.1233616-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20221024193336.1233616-2-peterx@redhat.com Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs") Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28mm: prep_compound_tail() clear page->privateHugh Dickins
Although page allocation always clears page->private in the first page or head page of an allocation, it has never made a point of clearing page->private in the tails (though 0 is often what is already there). But now commit 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t during THP split") issues a warning when page_tail->private is found to be non-0 (unless it's swapcache). Change that warning to dump page_tail (which also dumps head), instead of just the head: so far we have seen dead000000000122, dead000000000003, dead000000000001 or 0000000000000002 in the raw output for tail private. We could just delete the warning, but today's consensus appears to want page->private to be 0, unless there's a good reason for it to be set: so now clear it in prep_compound_tail() (more general than just for THP; but not for high order allocation, which makes no pass down the tails). Link: https://lkml.kernel.org/r/1c4233bb-4e4d-5969-fbd4-96604268a285@google.com Fixes: 71e2d666ef85 ("mm/huge_memory: do not clobber swp_entry_t during THP split") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>