summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-22Merge tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "Lots of (mostly boring) fixes for syzbot bugs and rare(r) CI bugs. The LRU_TIME_BITS fix was slightly more involved; we only have 48 bits for the LRU position (we would prefer 64), so wraparound is possible for the cached data LRUs on a filesystem that has done sufficient (petabytes) reads; this is now handled. One notable user reported bugfix, where we were forgetting to correctly set the bucket data type, which should have been BCH_DATA_need_gc_gens instead of BCH_DATA_free; this was causing us to go emergency read-only on a filesystem that had seen heavy enough use to see bucket gen wraparoud. We're now starting to fix simple (safe) errors without requiring user intervention - i.e. a small incremental step towards full self healing. This is currently limited to just certain allocation information counters, and the error is still logged in the superblock; see that patch for more information. ("bcachefs: Fix safe errors by default")" * tag 'bcachefs-2024-06-22' of https://evilpiepirate.org/git/bcachefs: (22 commits) bcachefs: Move the ei_flags setting to after initialization bcachefs: Fix a UAF after write_super() bcachefs: Use bch2_print_string_as_lines for long err bcachefs: Fix I_NEW warning in race path in bch2_inode_insert() bcachefs: Replace bare EEXIST with private error codes bcachefs: Fix missing alloc_data_type_set() closures: Change BUG_ON() to WARN_ON() bcachefs: fix alignment of VMA for memory mapped files on THP bcachefs: Fix safe errors by default bcachefs: Fix bch2_trans_put() bcachefs: set_worker_desc() for delete_dead_snapshots bcachefs: Fix bch2_sb_downgrade_update() bcachefs: Handle cached data LRU wraparound bcachefs: Guard against overflowing LRU_TIME_BITS bcachefs: delete_dead_snapshots() doesn't need to go RW bcachefs: Fix early init error path in journal code bcachefs: Check for invalid btree IDs bcachefs: Fix btree ID bitmasks bcachefs: Fix shift overflow in read_one_super() bcachefs: Fix a locking bug in the do_discard_fast() path ...
2024-06-22Merge tag 'ata-6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - We currently enable DIPM (device initiated power management) in the device (using a SET FEATURES call to the device), regardless if the HBA supports any LPM states or not. It seems counter intuitive, and potentially dangerous to enable a device side feature, when the HBA does not have the corresponding support. Thus, make sure that we do not enable DIPM if the HBA does not support any LPM states. * tag 'ata-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: ahci: Do not enable LPM if no LPM states are supported by the HBA
2024-06-22Merge tag 'pwm/for-6.10-rc5-fixes-take2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fixes from Uwe Kleine-König: "Three fixes for the pwm-stm32 driver. The first patch prevents an integer wrap-around for small periods. In the second patch the calculation of the prescaler is fixed which resulted in values for the ARR register that don't fit into the corresponding register bit field. The last commit improves an error message that was wrongly copied from another error path" * tag 'pwm/for-6.10-rc5-fixes-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: stm32: Fix error message to not describe the previous error path pwm: stm32: Fix calculation of prescaler pwm: stm32: Refuse too small period requests
2024-06-22Merge tag 'arm-fixes-6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "There are seven oneline patches that each address a distinct problem on the NXP i.MX platform, mostly the popular i.MX8M variant. The only other two fixes are for error handling on the psci firmware driver and SD card support on the milkv duo riscv board" * tag 'arm-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: psci: Fix return value from psci_system_suspend() riscv: dts: sophgo: disable write-protection for milkv duo arm64: dts: imx8qm-mek: fix gpio number for reg_usdhc2_vmmc arm64: dts: freescale: imx8mm-verdin: enable hysteresis on slow input pin arm64: dts: imx93-11x11-evk: Remove the 'no-sdio' property arm64: dts: freescale: imx8mp-venice-gw73xx-2x: fix BT shutdown GPIO arm: dts: imx53-qsb-hdmi: Disable panel instead of deleting node arm64: dts: imx8mp: Fix TC9595 input clock on DH i.MX8M Plus DHCOM SoM arm64: dts: freescale: imx8mm-verdin: Fix GPU speed
2024-06-22Merge tag 'loongarch-fixes-6.10-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Some hw breakpoint fixes, an objtool build warnging fix, and a trivial cleanup" * tag 'loongarch-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Remove an unneeded semicolon LoongArch: Fix multiple hardware watchpoint issues LoongArch: Trigger user-space watchpoints correctly LoongArch: Fix watchpoint setting error LoongArch: Only allow OBJTOOL & ORC unwinder if toolchain supports -mthin-add-sub
2024-06-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix dangling references to a redistributor region if the vgic was prematurely destroyed. - Properly mark FFA buffers as released, ensuring that both parties can make forward progress. x86: - Allow getting/setting MSRs for SEV-ES guests, if they're using the pre-6.9 KVM_SEV_ES_INIT API. - Always sync pending posted interrupts to the IRR prior to IOAPIC route updates, so that EOIs are intercepted properly if the old routing table requested that. Generic: - Avoid __fls(0) - Fix reference leak on hwpoisoned page - Fix a race in kvm_vcpu_on_spin() by ensuring loads and stores are atomic. - Fix bug in __kvm_handle_hva_range() where KVM calls a function pointer that was intended to be a marker only (nothing bad happens but kind of a mine and also technically undefined behavior) - Do not bother accounting allocations that are small and freed before getting back to userspace. Selftests: - Fix compilation for RISC-V. - Fix a "shift too big" goof in the KVM_SEV_INIT2 selftest. - Compute the max mappable gfn for KVM selftests on x86 using GuestMaxPhyAddr from KVM's supported CPUID (if it's available)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SEV-ES: Fix svm_get_msr()/svm_set_msr() for KVM_SEV_ES_INIT guests KVM: Discard zero mask with function kvm_dirty_ring_reset virt: guest_memfd: fix reference leak on hwpoisoned page kvm: do not account temporary allocations to kmem MAINTAINERS: Drop Wanpeng Li as a Reviewer for KVM Paravirt support KVM: x86: Always sync PIR to IRR prior to scanning I/O APIC routes KVM: Stop processing *all* memslots when "null" mmu_notifier handler is found KVM: arm64: FFA: Release hyp rx buffer KVM: selftests: Fix RISC-V compilation KVM: arm64: Disassociate vcpus from redistributor region on teardown KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin() KVM: selftests: x86: Prioritize getting max_gfn from GuestPhysBits KVM: selftests: Fix shift of 32 bit unsigned int more than 32 bits
2024-06-22pwm: stm32: Fix error message to not describe the previous error pathUwe Kleine-König
"Failed to lock the clock" is an appropriate error message for clk_rate_exclusive_get() failing, but not for the clock running too fast for the driver's calculations. Adapt the error message accordingly. Fixes: d44d635635a7 ("pwm: stm32: Fix for settings using period > UINT32_MAX") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/285182163211203fc823a65b180761f46e828dcb.1718979150.git.u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-06-22pwm: stm32: Fix calculation of prescalerUwe Kleine-König
A small prescaler is beneficial, as this improves the resolution of the duty_cycle configuration. However if the prescaler is too small, the maximal possible period becomes considerably smaller than the requested value. One situation where this goes wrong is the following: With a parent clock rate of 208877930 Hz and max_arr = 0xffff = 65535, a request for period = 941243 ns currently results in PSC = 1. The value for ARR is then calculated to ARR = 941243 * 208877930 / (1000000000 * 2) - 1 = 98301 This value is bigger than 65535 however and so doesn't fit into the respective register field. In this particular case the PWM was configured for a period of 313733.4806027616 ns (with ARR = 98301 & 0xffff). Even if ARR was configured to its maximal value, only period = 627495.6861167669 ns would be achievable. Fix the calculation accordingly and adapt the comment to match the new algorithm. With the calculation fixed the above case results in PSC = 2 and so an actual period of 941229.1667195285 ns. Fixes: 8002fbeef1e4 ("pwm: stm32: Calculate prescaler with a division instead of a loop") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/b4d96b79917617434a540df45f20cb5de4142f88.1718979150.git.u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-06-22net: xilinx: axienet: Enable multicast by defaultSean Anderson
We support multicast addresses, so enable it by default. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-22ibmvnic: Free any outstanding tx skbs during scrq resetNick Child
There are 2 types of outstanding tx skb's: Type 1: Packets that are sitting in the drivers ind_buff that are waiting to be batch sent to the NIC. During a device reset, these are freed with a call to ibmvnic_tx_scrq_clean_buffer() Type 2: Packets that have been sent to the NIC and are awaiting a TX completion IRQ. These are free'd during a reset with a call to clean_tx_pools() During any reset which requires us to free the tx irq, ensure that the Type 2 skb references are freed. Since the irq is released, it is impossible for the NIC to inform of any completions. Furthermore, later in the reset process is a call to init_tx_pools() which marks every entry in the tx pool as free (ie not outstanding). So if the driver is to make a call to init_tx_pools(), it must first be sure that the tx pool is empty of skb references. This issue was discovered by observing the following in the logs during EEH testing: TX free map points to untracked skb (tso_pool 0 idx=4) TX free map points to untracked skb (tso_pool 0 idx=5) TX free map points to untracked skb (tso_pool 1 idx=36) Fixes: 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-22docs: i2c: summary: be clearer with 'controller/target' and 'adapter/client' ↵Wolfram Sang
pairs This not only includes rewording, but also where to put which emphasis on terms in this document. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-22docs: i2c: summary: document 'local' and 'remote' targetsWolfram Sang
Because Linux can be a target as well, add terminology to differentiate between Linux being the target and Linux accessing targets. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-22docs: i2c: summary: document use of inclusive languageWolfram Sang
We now have the updated I2C specs and our own Code of Conduct, so we have all we need to switch over to the inclusive terminology. Define them here. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-22docs: i2c: summary: update speed mode descriptionWolfram Sang
Fastest I2C mode is 5 MHz. Update the docs and reword the paragraph slightly. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-22docs: i2c: summary: update I2C specification linkWolfram Sang
Luckily, the specs are directly downloadable again, so update the link. Also update its title to the original name "I²C". Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-22docs: i2c: summary: start sentences consistently.Wolfram Sang
Change the first paragraphs to contain only one space after the end of the previous sentence like in the rest of the document. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-06-21Merge tag 'batadv-net-pullrequest-20240621' of ↵Jakub Kicinski
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - Don't accept TT entries for out-of-spec VIDs, by Sven Eckelmann - Revert "batman-adv: prefer kfree_rcu() over call_rcu() with free-only callbacks", by Linus Lüssing * tag 'batadv-net-pullrequest-20240621' of git://git.open-mesh.org/linux-merge: Revert "batman-adv: prefer kfree_rcu() over call_rcu() with free-only callbacks" batman-adv: Don't accept TT entries for out-of-spec VIDs ==================== Link: https://patch.msgid.link/20240621143915.49137-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-21Merge tag 'linux-can-fixes-for-6.10-20240621' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-06-21 The first patch is by Oleksij Rempel, it enhances the error handling for tightly received RTS message in the j1939 protocol. Shigeru Yoshida's patch fixes a kernel information leak in j1939_send_one() in the j1939 protocol. Followed by a patch by Oleksij Rempel for the j1939 protocol, to properly recover from a CAN bus error during BAM transmission. A patch by Chen Ni properly propagates errors in the kvaser_usb driver. The last patch is by Vitor Soares, that fixes an infinite loop in the mcp251xfd driver is SPI async sending fails during xmit. * tag 'linux-can-fixes-for-6.10-20240621' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: mcp251xfd: fix infinite loop when xmit fails can: kvaser_usb: fix return value for hif_usb_send_regout net: can: j1939: recover socket queue on CAN bus error during BAM transmission net: can: j1939: Initialize unused data in j1939_send_one() net: can: j1939: enhanced error handling for tightly received RTS messages in xtp_rx_rts_session_new ==================== Link: https://patch.msgid.link/20240621121739.434355-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-21Merge tag 'linux-can-next-for-6.11-20240621' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2024-06-21 The first 2 patches are by Andy Shevchenko, one cleans up the includes in the mcp251x driver, the other one updates the sja100 plx_pci driver to make use of predefines PCI subvendor ID. Mans Rullgard's patch cleans up the Kconfig help text of for the slcan driver. Oliver Hartkopp provides a patch to update the documentation, which removes the ISO 15675-2 specification version where possible. The next 2 patches are by Harini T and update the documentation of the xilinx_can driver. Francesco Valla provides documentation for the ISO 15765-2 protocol. A patch by Dr. David Alan Gilbert removes an unused struct from the mscan driver. 12 patches are by Martin Jocic. The first three add support for 3 new devices to the kvaser_usb driver. The remaining 9 first clean up the kvaser_pciefd driver, and then add support for MSI. Krzysztof Kozlowski contributes 3 patches simplifies the CAN SPI drivers by making use of spi_get_device_match_data(). The last patch is by Martin Hundebøll, which reworks the m_can driver to not enable the CAN transceiver during probe. * tag 'linux-can-next-for-6.11-20240621' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (24 commits) can: m_can: don't enable transceiver when probing can: mcp251xfd: simplify with spi_get_device_match_data() can: mcp251x: simplify with spi_get_device_match_data() can: hi311x: simplify with spi_get_device_match_data() can: kvaser_pciefd: Add MSI interrupts can: kvaser_pciefd: Move reset of DMA RX buffers to the end of the ISR can: kvaser_pciefd: Change name of return code variable can: kvaser_pciefd: Rename board_irq to pci_irq can: kvaser_pciefd: Add unlikely can: kvaser_pciefd: Add inline can: kvaser_pciefd: Remove unnecessary comment can: kvaser_pciefd: Skip redundant NULL pointer check in ISR can: kvaser_pciefd: Group #defines together can: kvaser_usb: Add support for Kvaser Mini PCIe 1xCAN can: kvaser_usb: Add support for Kvaser USBcan Pro 5xCAN can: kvaser_usb: Add support for Vining 800 can: mscan: remove unused struct 'mscan_state' Documentation: networking: document ISO 15765-2 can: xilinx_can: Document driver description to list all supported IPs can: isotp: remove ISO 15675-2 specification version where possible ... ==================== Link: https://patch.msgid.link/20240621080201.305471-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-21vxlan: Pull inner IP header in vxlan_xmit_one().Guillaume Nault
Ensure the inner IP header is part of the skb's linear data before setting old_iph. Otherwise, on a non-linear skb, old_iph could point outside of the packet data. Unlike classical VXLAN, which always encapsulates Ethernet packets, VXLAN-GPE can transport IP packets directly. In that case, we need to look at skb->protocol to figure out if an Ethernet header is present. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/2aa75f6fa62ac9dbe4f16ad5ba75dd04a51d4b99.1718804000.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-21Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fixes: one in the ufs driver fixing an obvious memory leak and the other (with a core flag based update) trying to prevent USB crashes by stopping the core from issuing a request for the I/O Hints mode page" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: usb: uas: Do not query the IO Advice Hints Grouping mode page for USB/UAS devices scsi: core: Introduce the BLIST_SKIP_IO_HINTS flag scsi: ufs: core: Free memory allocated for model before reinit
2024-06-21Merge tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Still pretty quiet, two weeks worth of amdgpu fixes, with one i915 and one xe. I didn't get the drm-misc-fixes tree PR this week, but there was only one fix queued and I think it can wait another week, so seems pretty normal. xe: - Fix for invalid register access i915: - Fix conditions for joiner usage, it's not possible with eDP MSO amdgpu: - Fix display idle optimization race - Fix GPUVM TLB flush locking scope - IPS fix - GFX 9.4.3 harvesting fix - Runtime pm fix for shared buffers - DCN 3.5.x fixes - USB4 fix - RISC-V clang fix - Silence UBSAN warnings - MES11 fix - PSP 14.0.x fix" * tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/vf: Don't touch GuC irq registers if using memory irqs drm/amdgpu: init TA fw for psp v14 drm/amdgpu: cleanup MES11 command submission drm/amdgpu: fix UBSAN warning in kv_dpm.c drm/radeon: fix UBSAN warning in kv_dpm.c drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clang drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIA drm/amd/display: change dram_clock_latency to 34us for dcn35 drm/amd/display: Change dram_clock_latency to 34us for dcn351 drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2 drm/amdgpu: Indicate CU havest info to CP drm/amd/display: prevent register access while in IPS drm/amdgpu: fix locking scope when flushing tlb drm/amd/display: Remove redundant idle optimization check drm/i915/mso: using joiner is not possible with eDP MSO
2024-06-21Merge tag 'ovl-fixes-6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fixes from Miklos Szeredi: "Fix two bugs, one originating in this cycle and one from 6.6" * tag 'ovl-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: fix encoding fid for lower only root ovl: fix copy-up in tmpfile
2024-06-21Merge tag 'io_uring-6.10-20240621' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "Just a single cleanup for the fixed buffer iov_iter import. More cosmetic than anything else, but let's get it cleaned up as it's confusing" * tag 'io_uring-6.10-20240621' of git://git.kernel.dk/linux: io_uring/rsrc: fix incorrect assignment of iter->nr_segs in io_import_fixed
2024-06-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Small bug fixes: - Prevent a crash in bnxt if the en and rdma drivers disagree on the MSI vectors - Have rxe memcpy inline data from the correct address - Fix rxe's validation of UD packets - Several mlx5 mr cache issues: bad lock balancing on error, missing propagation of the ATS property to the HW, wrong bucketing of freed mrs in some cases - Incorrect goto error unwind in mlx5 driver probe - Missed userspace input validation in mlx5 SRQ create - Incorrect uABI in MANA rejecting valid optional MR creation flags" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mana_ib: Ignore optional access flags for MRs RDMA/mlx5: Add check for srq max_sge attribute RDMA/mlx5: Fix unwind flow as part of mlx5_ib_stage_init_init RDMA/mlx5: Ensure created mkeys always have a populated rb_key RDMA/mlx5: Follow rb_key.ats when creating new mkeys RDMA/mlx5: Remove extra unlock on error path RDMA/rxe: Fix responder length checking for UD request packets RDMA/rxe: Fix data copy for IB_SEND_INLINE RDMA/bnxt_re: Fix the max msix vectors macro
2024-06-21bpf: Fix overrunning reservations in ringbufDaniel Borkmann
The BPF ring buffer internally is implemented as a power-of-2 sized circular buffer, with two logical and ever-increasing counters: consumer_pos is the consumer counter to show which logical position the consumer consumed the data, and producer_pos which is the producer counter denoting the amount of data reserved by all producers. Each time a record is reserved, the producer that "owns" the record will successfully advance producer counter. In user space each time a record is read, the consumer of the data advanced the consumer counter once it finished processing. Both counters are stored in separate pages so that from user space, the producer counter is read-only and the consumer counter is read-write. One aspect that simplifies and thus speeds up the implementation of both producers and consumers is how the data area is mapped twice contiguously back-to-back in the virtual memory, allowing to not take any special measures for samples that have to wrap around at the end of the circular buffer data area, because the next page after the last data page would be first data page again, and thus the sample will still appear completely contiguous in virtual memory. Each record has a struct bpf_ringbuf_hdr { u32 len; u32 pg_off; } header for book-keeping the length and offset, and is inaccessible to the BPF program. Helpers like bpf_ringbuf_reserve() return `(void *)hdr + BPF_RINGBUF_HDR_SZ` for the BPF program to use. Bing-Jhong and Muhammad reported that it is however possible to make a second allocated memory chunk overlapping with the first chunk and as a result, the BPF program is now able to edit first chunk's header. For example, consider the creation of a BPF_MAP_TYPE_RINGBUF map with size of 0x4000. Next, the consumer_pos is modified to 0x3000 /before/ a call to bpf_ringbuf_reserve() is made. This will allocate a chunk A, which is in [0x0,0x3008], and the BPF program is able to edit [0x8,0x3008]. Now, lets allocate a chunk B with size 0x3000. This will succeed because consumer_pos was edited ahead of time to pass the `new_prod_pos - cons_pos > rb->mask` check. Chunk B will be in range [0x3008,0x6010], and the BPF program is able to edit [0x3010,0x6010]. Due to the ring buffer memory layout mentioned earlier, the ranges [0x0,0x4000] and [0x4000,0x8000] point to the same data pages. This means that chunk B at [0x4000,0x4008] is chunk A's header. bpf_ringbuf_submit() / bpf_ringbuf_discard() use the header's pg_off to then locate the bpf_ringbuf itself via bpf_ringbuf_restore_from_rec(). Once chunk B modified chunk A's header, then bpf_ringbuf_commit() refers to the wrong page and could cause a crash. Fix it by calculating the oldest pending_pos and check whether the range from the oldest outstanding record to the newest would span beyond the ring buffer size. If that is the case, then reject the request. We've tested with the ring buffer benchmark in BPF selftests (./benchs/run_bench_ringbufs.sh) before/after the fix and while it seems a bit slower on some benchmarks, it is still not significantly enough to matter. Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg> Reported-by: Muhammad Ramdhan <ramdhan@starlabs.sg> Co-developed-by: Bing-Jhong Billy Jheng <billy@starlabs.sg> Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Bing-Jhong Billy Jheng <billy@starlabs.sg> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240621140828.18238-1-daniel@iogearbox.net
2024-06-21Merge tag 'sound-6.10-rc5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull more sound fixes from Takashi Iwai: "A follow-up fix for a random build issue, as well as another trivial HD-audio quirk" * tag 'sound-6.10-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: Use imply for suggesting CONFIG_SERIAL_MULTI_INSTANTIATE ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14AHP9
2024-06-21Merge tag 'acpi-6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These address a possible NULL pointer dereference in the ACPICA code and quirk camera enumeration on multiple platforms where incorrect data are present in the platform firmware. Specifics: - Undo an ACPICA code change that attempted to keep operation regions within a page boundary, but allowed accesses to unmapped memory to occur (Raju Rangoju) - Ignore MIPI camera graph port nodes created with the help of the information from the ACPI tables on all Dell Tiger, Alder and Raptor Lake models as that information is reported to be invalid on the platforms in question (Hans de Goede) - Use new Intel CPU model matching macros in the MIPI DisCo for Imaging part of ACPI device enumeration (Hans de Goede)" * tag 'acpi-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: mipi-disco-img: Switch to new Intel CPU model defines ACPI: scan: Ignore camera graph port nodes on all Dell Tiger, Alder and Raptor Lake models ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
2024-06-21selftests/bpf: Tests with may_goto and jumps to the 1st insnAlexei Starovoitov
Add few tests with may_goto and jumps to the 1st insn. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240619011859.79334-2-alexei.starovoitov@gmail.com
2024-06-21bpf: Fix the corner case with may_goto and jump to the 1st insn.Alexei Starovoitov
When the following program is processed by the verifier: L1: may_goto L2 goto L1 L2: w0 = 0 exit the may_goto insn is first converted to: L1: r11 = *(u64 *)(r10 -8) if r11 == 0x0 goto L2 r11 -= 1 *(u64 *)(r10 -8) = r11 goto L1 L2: w0 = 0 exit then later as the last step the verifier inserts: *(u64 *)(r10 -8) = BPF_MAX_LOOPS as the first insn of the program to initialize loop count. When the first insn happens to be a branch target of some jmp the bpf_patch_insn_data() logic will produce: L1: *(u64 *)(r10 -8) = BPF_MAX_LOOPS r11 = *(u64 *)(r10 -8) if r11 == 0x0 goto L2 r11 -= 1 *(u64 *)(r10 -8) = r11 goto L1 L2: w0 = 0 exit because instruction patching adjusts all jmps and calls, but for this particular corner case it's incorrect and the L1 label should be one instruction down, like: *(u64 *)(r10 -8) = BPF_MAX_LOOPS L1: r11 = *(u64 *)(r10 -8) if r11 == 0x0 goto L2 r11 -= 1 *(u64 *)(r10 -8) = r11 goto L1 L2: w0 = 0 exit and that's what this patch is fixing. After bpf_patch_insn_data() call adjust_jmp_off() to adjust all jmps that point to newly insert BPF_ST insn to point to insn after. Note that bpf_patch_insn_data() cannot easily be changed to accommodate this logic, since jumps that point before or after a sequence of patched instructions have to be adjusted with the full length of the patch. Conceptually it's somewhat similar to "insert" of instructions between other instructions with weird semantics. Like "insert" before 1st insn would require adjustment of CALL insns to point to newly inserted 1st insn, but not an adjustment JMP insns that point to 1st, yet still adjusting JMP insns that cross over 1st insn (point to insn before or insn after), hence use simple adjust_jmp_off() logic to fix this corner case. Ideally bpf_patch_insn_data() would have an auxiliary info to say where 'the start of newly inserted patch is', but it would be too complex for backport. Fixes: 011832b97b31 ("bpf: Introduce may_goto instruction") Reported-by: Zac Ecob <zacecob@protonmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Closes: https://lore.kernel.org/bpf/CAADnVQJ_WWx8w4b=6Gc2EpzAjgv+6A0ridnMz2TvS2egj4r3Gw@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20240619011859.79334-1-alexei.starovoitov@gmail.com
2024-06-21Merge tag 'thermal-6.10-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "These fix the Mediatek lvts_thermal driver, the Intel int340x driver, and the thermal core (two issues related to system suspend). Specifics: - Remove the filtered mode for mt8188 from lvts_thermal as it is not supported on this platform and fail the lvts_thermal initialization when the golden temperature is zero as that means the efuse data is not correctly set (Julien Panis) - Update the processor_thermal part of the Intel int340x driver to support shared interrupts as the processor thermal device interrupt may in fact be shared with PCI devices (Srinivas Pandruvada) - Synchronize the suspend-prepare and post-suspend actions of the thermal PM notifier to avoid a destructive race condition and change the priority of that notifier to the minimum to avoid interference between the work items spawned by it and the other PM notifiers during system resume (Rafael Wysocki)" * tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: int340x: processor_thermal: Support shared interrupts thermal: core: Change PM notifier priority to the minimum thermal: core: Synchronize suspend-prepare and post-suspend actions thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
2024-06-21Merge tag 'dmaengine-fix-6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - kmemleak, error path handling and missing kmem_cache_destroy() fixes for ioatdma driver - use after free fix for idxd driver - data synchronisation fix for xdma isr handling - fsl driver channel constraints and linking two fsl module fixes * tag 'dmaengine-fix-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: ioatdma: Fix missing kmem_cache_destroy() dt-bindings: dma: fsl-edma: fix dma-channels constraints dmaengine: fsl-edma: avoid linking both modules dmaengine: ioatdma: Fix kmemleak in ioat_pci_probe() dmaengine: ioatdma: Fix error path in ioat3_dma_probe() dmaengine: ioatdma: Fix leaking on version mismatch dmaengine: ti: k3-udma-glue: Fix of_k3_udma_glue_parse_chn_by_id() dmaengine: idxd: Fix possible Use-After-Free in irq_process_work_list dmaengine: xilinx: xdma: Fix data synchronisation in xdma_channel_isr()
2024-06-21Merge tag 'phy-fixes-6.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - Qualcomm QMP driver fixes for missing register offsets and correct N4 offsets for registers * tag 'phy-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom: qmp-combo: Switch from V6 to V6 N4 register offsets phy: qcom-qmp: pcs: Add missing v6 N4 register offsets phy: qcom-qmp: qserdes-txrx: Add missing registers offsets
2024-06-21Merge tag 'soundwire-6.10-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fix from Vinod Koul: - Single fix for calling fwnode_handle_put() on the returned fwnode pointer * tag 'soundwire-6.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: fix usages of device_get_named_child_node()
2024-06-21Merge tag 'kvm-riscv-fixes-6.10-2' of https://github.com/kvm-riscv/linux ↵Paolo Bonzini
into HEAD KVM/riscv fixes for 6.10, take #2 - Fix compilation for KVM selftests
2024-06-21ice: update representor when VSI is readyMichal Swiatkowski
In case of reset of VF VSI can be reallocated. To handle this case it should be properly updated. Reload representor as vsi->vsi_num can be different than the one stored when representor was created. Instead of only changing antispoof do whole VSI configuration for eswitch. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21ice: move VSI configuration outside repr setupMichal Swiatkowski
It is needed because subfunction port representor shouldn't configure the source VSI during representor creation. Move the code to separate function and call it only in case the VF port representor is being created. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21ice: move devlink locking outside the port creationMichal Swiatkowski
In case of subfunction lock will be taken for whole port creation and removing. Do the same in VF case. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21ice: store representor ID in bridge portMichal Swiatkowski
It is used to get representor structure during cleaning. Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21pwm: stm32: Refuse too small period requestsUwe Kleine-König
If period_ns is small, prd might well become 0. Catch that case because otherwise with regmap_write(priv->regmap, TIM_ARR, prd - 1); a few lines down quite a big period is configured. Fixes: 7edf7369205b ("pwm: Add driver for STM32 plaftorm") Cc: stable@vger.kernel.org Reviewed-by: Trevor Gamblin <tgamblin@baylibre.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/b86f62f099983646f97eeb6bfc0117bb2d0c340d.1718979150.git.u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-06-21bcachefs: Move the ei_flags setting to after initializationYouling Tang
`inode->ei_flags` setting and cleaning should be done after initialization, otherwise the operation is invalid. Fixes: 9ca4853b98af ("bcachefs: Fix quota support for snapshots") Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21bcachefs: Fix a UAF after write_super()Kent Overstreet
write_super() may reallocate the superblock buffer - but bch_sb_field_ext was referencing it; don't use it after the write_super call. Reported-by: syzbot+8992fc10a192067b8d8a@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21bcachefs: Use bch2_print_string_as_lines for long errKent Overstreet
printk strings get truncated to 1024 bytes; if we have a long error message (journal debug info) we need to use a helper. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21bcachefs: Fix I_NEW warning in race path in bch2_inode_insert()Kent Overstreet
discard_new_inode() is the correct interface for tearing down an indoe that was fully created but not made visible to other threads, but it expects I_NEW to be set, which we don't use. Reported-by: https://github.com/koverstreet/bcachefs/issues/690 Fixes: bcachefs: Fix race path in bch2_inode_insert() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21bcachefs: Replace bare EEXIST with private error codesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21bcachefs: Fix missing alloc_data_type_set()Kent Overstreet
Incorrect bucket state transition in the discard path; when incrementing a bucket's generation number that had already been discarded, we were forgetting to check if it should be need_gc_gens, not free. This was caught by the .invalid checks in the transaction commit path, causing us to go emergency read only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21closures: Change BUG_ON() to WARN_ON()Kent Overstreet
If a BUG_ON() can be hit in the wild, it shouldn't be a BUG_ON() For reference, this has popped up once in the CI, and we'll need more info to debug it: 03240 ------------[ cut here ]------------ 03240 kernel BUG at lib/closure.c:21! 03240 kernel BUG at lib/closure.c:21! 03240 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP 03240 Modules linked in: 03240 CPU: 15 PID: 40534 Comm: kworker/u80:1 Not tainted 6.10.0-rc4-ktest-ga56da69799bd #25570 03240 Hardware name: linux,dummy-virt (DT) 03240 Workqueue: btree_update btree_interior_update_work 03240 pstate: 00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 03240 pc : closure_put+0x224/0x2a0 03240 lr : closure_put+0x24/0x2a0 03240 sp : ffff0000d12071c0 03240 x29: ffff0000d12071c0 x28: dfff800000000000 x27: ffff0000d1207360 03240 x26: 0000000000000040 x25: 0000000000000040 x24: 0000000000000040 03240 x23: ffff0000c1f20180 x22: 0000000000000000 x21: ffff0000c1f20168 03240 x20: 0000000040000000 x19: ffff0000c1f20140 x18: 0000000000000001 03240 x17: 0000000000003aa0 x16: 0000000000003ad0 x15: 1fffe0001c326974 03240 x14: 0000000000000a1e x13: 0000000000000000 x12: 1fffe000183e402d 03240 x11: ffff6000183e402d x10: dfff800000000000 x9 : ffff6000183e402e 03240 x8 : 0000000000000001 x7 : 00009fffe7c1bfd3 x6 : ffff0000c1f2016b 03240 x5 : ffff0000c1f20168 x4 : ffff6000183e402e x3 : ffff800081391954 03240 x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000a8000000 03240 Call trace: 03240 closure_put+0x224/0x2a0 03240 bch2_check_for_deadlock+0x910/0x1028 03240 bch2_six_check_for_deadlock+0x1c/0x30 03240 six_lock_slowpath.isra.0+0x29c/0xed0 03240 six_lock_ip_waiter+0xa8/0xf8 03240 __bch2_btree_node_lock_write+0x14c/0x298 03240 bch2_trans_lock_write+0x6d4/0xb10 03240 __bch2_trans_commit+0x135c/0x5520 03240 btree_interior_update_work+0x1248/0x1c10 03240 process_scheduled_works+0x53c/0xd90 03240 worker_thread+0x370/0x8c8 03240 kthread+0x258/0x2e8 03240 ret_from_fork+0x10/0x20 03240 Code: aa1303e0 d63f0020 a94363f7 17ffff8c (d4210000) 03240 ---[ end trace 0000000000000000 ]--- 03240 Kernel panic - not syncing: Oops - BUG: Fatal exception 03240 SMP: stopping secondary CPUs 03241 SMP: failed to stop secondary CPUs 13,15 03241 Kernel Offset: disabled 03241 CPU features: 0x00,00000003,80000008,4240500b 03241 Memory Limit: none 03241 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]--- 03246 ========= FAILED TIMEOUT copygc_torture_no_checksum in 7200s Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-21Merge branch 'mlxsw-fixes'David S. Miller
Petr Machata says: ==================== mlxsw: Fixes This patchset fixes an issue with mlxsw driver initialization, and a memory corruption issue in shared buffer occupancy handling. v3: - Drop the core thermal fix, it's not relevant anymore. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21mlxsw: spectrum_buffers: Fix memory corruptions on Spectrum-4 systemsIdo Schimmel
The following two shared buffer operations make use of the Shared Buffer Status Register (SBSR): # devlink sb occupancy snapshot pci/0000:01:00.0 # devlink sb occupancy clearmax pci/0000:01:00.0 The register has two masks of 256 bits to denote on which ingress / egress ports the register should operate on. Spectrum-4 has more than 256 ports, so the register was extended by cited commit with a new 'port_page' field. However, when filling the register's payload, the driver specifies the ports as absolute numbers and not relative to the first port of the port page, resulting in memory corruptions [1]. Fix by specifying the ports relative to the first port of the port page. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_sb_occ_snapshot+0xb6d/0xbc0 Read of size 1 at addr ffff8881068cb00f by task devlink/1566 [...] Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xce/0x670 kasan_report+0xd7/0x110 mlxsw_sp_sb_occ_snapshot+0xb6d/0xbc0 mlxsw_devlink_sb_occ_snapshot+0x75/0xb0 devlink_nl_sb_occ_snapshot_doit+0x1f9/0x2a0 genl_family_rcv_msg_doit+0x20c/0x300 genl_rcv_msg+0x567/0x800 netlink_rcv_skb+0x170/0x450 genl_rcv+0x2d/0x40 netlink_unicast+0x547/0x830 netlink_sendmsg+0x8d4/0xdb0 __sys_sendto+0x49b/0x510 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f [...] Allocated by task 1: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 copy_verifier_state+0xbc2/0xfb0 do_check_common+0x2c51/0xc7e0 bpf_check+0x5107/0x9960 bpf_prog_load+0xf0e/0x2690 __sys_bpf+0x1a61/0x49d0 __x64_sys_bpf+0x7d/0xc0 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 1: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 poison_slab_object+0x109/0x170 __kasan_slab_free+0x14/0x30 kfree+0xca/0x2b0 free_verifier_state+0xce/0x270 do_check_common+0x4828/0xc7e0 bpf_check+0x5107/0x9960 bpf_prog_load+0xf0e/0x2690 __sys_bpf+0x1a61/0x49d0 __x64_sys_bpf+0x7d/0xc0 do_syscall_64+0xc1/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: f8538aec88b4 ("mlxsw: Add support for more than 256 ports in SBSR register") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-21mlxsw: pci: Fix driver initialization with Spectrum-4Ido Schimmel
Cited commit added support for a new reset flow ("all reset") which is deeper than the existing reset flow ("software reset") and allows the device's PCI firmware to be upgraded. In the new flow the driver first tells the firmware that "all reset" is required by issuing a new reset command (i.e., MRSR.command=6) and then triggers the reset by having the PCI core issue a secondary bus reset (SBR). However, due to a race condition in the device's firmware the device is not always able to recover from this reset, resulting in initialization failures [1]. New firmware versions include a fix for the bug and advertise it using a new capability bit in the Management Capabilities Mask (MCAM) register. Avoid initialization failures by reading the new capability bit and triggering the new reset flow only if the bit is set. If the bit is not set, trigger a normal PCI hot reset by skipping the call to the Management Reset and Shutdown Register (MRSR). Normal PCI hot reset is weaker than "all reset", but it results in a fully operational driver and allows users to flash a new firmware, if they want to. [1] mlxsw_spectrum4 0000:01:00.0: not ready 1023ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 2047ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 4095ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 8191ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 16383ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 32767ms after bus reset; waiting mlxsw_spectrum4 0000:01:00.0: not ready 65535ms after bus reset; giving up mlxsw_spectrum4 0000:01:00.0: PCI function reset failed with -25 mlxsw_spectrum4 0000:01:00.0: cannot register bus device mlxsw_spectrum4: probe of 0000:01:00.0 failed with error -25 Fixes: f257c73e5356 ("mlxsw: pci: Add support for new reset flow") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Maksym Yaremchuk <maksymy@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>