linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
9 days	usb: xhci: print xhci->xhc_state when queue_command failed	Su Hui
	When encounters some errors like these: xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment usb usb5-port6: couldn't allocate usb_device It's hard to know whether xhc_state is dying or halted. So it's better to print xhc_state's value which can help locate the resaon of the bug. Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17	usb: xhci: Avoid showing errors during surprise removal	Mario Limonciello
	When a USB4 dock is unplugged from a system it won't respond to ring events. The PCI core handles the surprise removal event and notifies all PCI drivers. The XHCI PCI driver sets a flag that the device is being removed as well. When that flag is set don't show messages in the cleanup path for marking the controller dead. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250717073107.488599-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19	Revert "usb: xhci: Implement xhci_handshake_check_state() helper"	Roy Luo
	This reverts commit 6ccb83d6c4972ebe6ae49de5eba051de3638362c. Commit 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") was introduced to workaround watchdog timeout issues on some platforms, allowing xhci_reset() to bail out early without waiting for the reset to complete. Skipping the xhci handshake during a reset is a dangerous move. The xhci specification explicitly states that certain registers cannot be accessed during reset in section 5.4.1 USB Command Register (USBCMD), Host Controller Reset (HCRST) field: "This bit is cleared to '0' by the Host Controller when the reset process is complete. Software cannot terminate the reset process early by writinga '0' to this bit and shall not write any xHC Operational or Runtime registers until while HCRST is '1'." This behavior causes a regression on SNPS DWC3 USB controller with dual-role capability. When the DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then tries to configure its hardware for device mode, the ongoing reset leads to register access issues; specifically, all register reads returns 0. These issues extend beyond the xhci register space (which is expected during a reset) and affect the entire DWC3 IP block, causing the DWC3 device mode to malfunction. Cc: stable <stable@kernel.org> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Signed-off-by: Roy Luo <royluo@google.com> Link: https://lore.kernel.org/r/20250522190912.457583-3-royluo@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21	usb: xhci: rename 'irq_pending' to 'iman'	Niklas Neronin
	The Interrupt Register Set contains Interrupt Management register (IMAN). The IMAN register contains the following fields: - Bit 0: Interrupt Pending (IP) - Bit 1: Interrupt Enable (IE) - Bits 31:2: RsvdP (Reserved and Preserved) Tn the xhci driver, the pointer currently named 'irq_pending' refers to the IMAN register. However, the name "irq_pending" only describes one of the fields within the IMAN register, rather than the entire register itself. To improve clarity and better align with the xHCI specification, the pointer is renamed to 'iman'. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250515135621.335595-22-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21	usb: xhci: remove '0' write to write-1-to-clear register	Niklas Neronin
	xHCI specification 1.2, section 5.5.2.1. Interrupt Pending bit is RW1C (Write-1-to-clear), which means that writing '0' to is has no effect and is removed. The Interrupt Pending (IP) bit is cleared at the start of interrupt handling; xhci_clear_interrupt_pending(). This could theoretically cause a new interrupt to be issued before the xhci driver reaches the interrupter disable functions. To address this, the IP bit is read after Interrupt Enable is disabled, and a debug message is issued if the IP bit is still set. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250515135621.335595-18-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21	usb: xhci: guarantee that IMAN register is flushed	Niklas Neronin
	Add read call to guarantee that the write to the IMAN register has been flushed. xHCI specification 1.2, section 5.5.2.1, Note: "Most systems have write buffers that minimize overhead, but this may require a read operation to guarantee that the write has been flushed from the posted buffer." Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250515135621.335595-17-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21	usb: xhci: Add debugfs support for xHCI port bandwidth	Xu Rao
	In many projects, you need to obtain the available bandwidth of the xhci roothub port. Refer to xhci rev1_2 and use the TRB_GET_BW command to obtain it. hardware tested: 03:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 (prog-if 30 [XHCI]) Subsystem: Huawei Technologies Co., Ltd. Raven USB 3.1 Flags: bus master, fast devsel, latency 0, IRQ 30 Memory at c0300000 (64-bit, non-prefetchable) [size=1M] Capabilities: [48] Vendor Specific Information: Len=08 <?> Capabilities: [50] Power Management version 3 Capabilities: [64] Express Endpoint, MSI 00 Capabilities: [a0] MSI: Enable- Count=1/8 Maskable- 64bit+ Capabilities: [c0] MSI-X: Enable+ Count=8 Masked- Kernel driver in use: xhci_hcd test progress: 1. cd /sys/kernel/debug/usb/xhci/0000:03:00.3/port_bandwidth# ls FS_BW HS_BW SS_BW 2. test fs speed device cat FS_BW port[1] available bw: 90%. port[2] available bw: 90%. port[3] available bw: 90%. port[4] available bw: 90%. port[5] available bw: 0%. port[6] available bw: 0%. port[7] available bw: 0%. port[8] available bw: 0%. plug in fs usb audio ID 0d8c:013c cat FS_BW port[1] available bw: 76%. port[2] available bw: 76%. port[3] available bw: 76%. port[4] available bw: 76%. port[5] available bw: 0%. port[6] available bw: 0%. port[7] available bw: 0%. port[8] available bw: 0%. 3. test hs speed device cat HS_BW port[1] available bw: 79%. port[2] available bw: 79%. port[3] available bw: 79%. port[4] available bw: 79%. port[5] available bw: 0%. port[6] available bw: 0%. port[7] available bw: 0%. port[8] available bw: 0%. plug in hs usb video ID 0408:1040 cat HS_BW port[1] available bw: 39%. port[2] available bw: 39%. port[3] available bw: 39%. port[4] available bw: 39%. port[5] available bw: 0%. port[6] available bw: 0%. port[7] available bw: 0%. port[8] available bw: 0%. 4.cat SS_BW port[1] available bw: 0%. port[2] available bw: 0%. port[3] available bw: 0%. port[4] available bw: 0%. port[5] available bw: 90%. port[6] available bw: 90%. port[7] available bw: 90%. port[8] available bw: 90%. Signed-off-by: Xu Rao <raoxu@uniontech.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250515135621.335595-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-21	usb: xhci: Don't log transfer ring segment list on errors	Michal Pecio
	The error message above used to span two lines, rarely more. A recent cleanup concentrated useful information from it in one line, but then it added printing the list of all ring segments, which is even longer than before. It provides no new information in usual cases and little in unusual ones, but adds noise to the log. Drop it. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250515135621.335595-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-13	Merge 6.15-rc6 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-05	usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue	Michal Pecio
	VIA VL805 doesn't bother updating the EP Context cycle bit when the endpoint halts. This is seen by patching xhci_move_dequeue_past_td() to print the cycle bits of the EP Context and the TRB at hw_dequeue and then disconnecting a flash drive while reading it. Actual cycle state is random as expected, but the EP Context bit is always 1. This means that the cycle state produced by this function is wrong half the time, and then the endpoint stops working. Work around it by looking at the cycle bit of TD's end_trb instead of believing the Endpoint or Stream Context. Specifically: - rename cycle_found to hw_dequeue_found to avoid confusion - initialize new_cycle from td->end_trb instead of hw_dequeue - switch new_cycle toggling to happen after end_trb is found Now a workload which regularly stalls the device works normally for a few hours and clearly demonstrates the HW bug - the EP Context bit is not updated in a new cycle until Set TR Dequeue overwrites it: [ +0,000298] sd 10:0:0:0: [sdc] Attached SCSI disk [ +0,011758] cycle bits: TRB 1 EP Ctx 1 [ +5,947138] cycle bits: TRB 1 EP Ctx 1 [ +0,065731] cycle bits: TRB 0 EP Ctx 1 [ +0,064022] cycle bits: TRB 0 EP Ctx 0 [ +0,063297] cycle bits: TRB 0 EP Ctx 0 [ +0,069823] cycle bits: TRB 0 EP Ctx 0 [ +0,063390] cycle bits: TRB 1 EP Ctx 0 [ +0,063064] cycle bits: TRB 1 EP Ctx 1 [ +0,062293] cycle bits: TRB 1 EP Ctx 1 [ +0,066087] cycle bits: TRB 0 EP Ctx 1 [ +0,063636] cycle bits: TRB 0 EP Ctx 0 [ +0,066360] cycle bits: TRB 0 EP Ctx 0 Also tested on the buggy ASM1042 which moves EP Context dequeue to the next TRB after errors, one problem case addressed by the rework that implemented this loop. In this case hw_dequeue can be enqueue, so simply picking the cycle bit of TRB at hw_dequeue wouldn't work. Commit 5255660b208a ("xhci: add quirk for host controllers that don't update endpoint DCS") tried to solve the stale cycle problem, but it was more complex and got reverted due to a reported issue. Cc: Jonathan Bell <jonathan@raspberrypi.org> Cc: Oliver Neukum <oneukum@suse.com> Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250505125630.561699-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-28	Merge 6.15-rc4 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well, and this resolves the following merge conflicts that were reported in linux-next: drivers/usb/chipidea/ci_hdrc_imx.c drivers/usb/host/xhci.h Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11	usb: xhci: Fix invalid pointer dereference in Etron workaround	Michal Pecio
	This check is performed before prepare_transfer() and prepare_ring(), so enqueue can already point at the final link TRB of a segment. And indeed it will, some 0.4% of times this code is called. Then enqueue + 1 is an invalid pointer. It will crash the kernel right away or load some junk which may look like a link TRB and cause the real link TRB to be replaced with a NOOP. This wouldn't end well. Use a functionally equivalent test which doesn't dereference the pointer and always gives correct result. Something has crashed my machine twice in recent days while playing with an Etron HC, and a control transfer stress test ran for confirmation has just crashed it again. The same test passes with this patch applied. Fixes: 5e1c67abc930 ("xhci: Fix control transfer error on Etron xHCI host") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Reviewed-by: Kuangyi Chiang <ki.chiang65@gmail.com> Link: https://lore.kernel.org/r/20250410151828.2868740-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11	usb: xhci: Fix Short Packet handling rework ignoring errors	Michal Pecio
	A Short Packet event before the last TRB of a TD is followed by another event on the final TRB on spec-compliant HCs, which is most of them. A 'last_td_was_short' flag was added to know if a TD has just completed as Short Packet and another event is to come. The flag was cleared after seeing the event (unless no TDs are pending, but that's a separate bug) or seeing a new TD complete as something other than Short Packet. A rework replaced the flag with an 'old_trb_comp_code' variable. When an event doesn't match the pending TD and the previous event was Short Packet, the new event is silently ignored. To preserve old behavior, 'old_trb_comp_code' should be cleared at this point, but instead it is being set to current comp code, which is often Short Packet again. This can cause more events to be silently ignored, even though they are no longer connected with the old TD that completed short and indicate a serious problem with the driver or the xHC. Common device classes like UAC in async mode, UVC, serial or the UAS status pipe complete as Short Packet routinely and could be affected. Clear 'old_trb_comp_code' to zero, which is an invalid completion code and the same value the variable starts with. This restores original behavior on Short Packet and also works for illegal Etron events, which the code has been extended to cover too. Fixes: b331a3d8097f ("xhci: Handle spurious events on Etron host isoc enpoints") Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250410151828.2868740-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11	Revert "xhci: Prevent early endpoint restart when handling STALL errors."	Mathias Nyman
	This reverts commit 860f5d0d3594005d4588240028f42e8d2bfc725b. Paul Menzel reported that the two EP_STALLED patches in 6.15-rc1 cause regression. Turns out that the new flag may never get cleared after reset-resume, preventing xhci from restarting the endpoint. Revert this to take a proper look at it. Link: https://lore.kernel.org/linux-usb/84b400f8-2943-44e0-8803-f3aac3b670af@molgen.mpg.de cc: Paul Menzel <pmenzel@molgen.mpg.de> cc: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250410151828.2868740-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-11	usb: host: xhci-mem: Cleanup pending secondary event ring events	Wesley Cheng
	As part of xHCI bus suspend, the xHCI is halted. However, if there are pending events in the secondary event ring, it is observed that the xHCI controller stops responding to further commands upon host or device initiated bus resume. Iterate through all pending events and update the dequeue pointer to the beginning of the event ring. Tested-by: Puma Hsu <pumahsu@google.com> Tested-by: Daehwan Jung <dh10.jung@samsung.com> Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250409194804.3773260-3-quic_wcheng@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-11	usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running	Michal Pecio
	Nothing prevents a broken HC from claiming that an endpoint is Running and repeatedly rejecting Stop Endpoint with Context State Error. Avoid infinite retries and give back cancelled TDs. No such cases known so far, but HCs have bugs. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250311154551.4035726-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-11	usb: xhci: Don't change the status of stalled TDs on failed Stop EP	Michal Pecio
	When the device stalls an endpoint, current TD is assigned -EPIPE status and Reset Endpoint is queued. If a Stop Endpoint is pending at the time, it will run before Reset Endpoint and fail due to the stall. Its handler will change TD's status to -EPROTO before Reset Endpoint handler runs and initiates giveback. Check if the stall has already been handled and don't try to do it again. Since xhci_handle_halted_endpoint() performs this check too, not overwriting td->status is the only difference. I haven't seen this case yet, but I have seen a related one where the xHC has already executed Reset Endpoint, EP Context state is now Stopped and EP_HALTED is set. If the xHC took a bit longer to execute Reset Endpoint, said case would become this one. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250311154551.4035726-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	xhci: Handle spurious events on Etron host isoc enpoints	Mathias Nyman
	Unplugging a USB3.0 webcam from Etron hosts while streaming results in errors like this: [ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 [ 2.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 [ 2.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13 [ 2.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 Etron xHC generates two transfer events for the TRB if an error is detected while processing the last TRB of an isoc TD. The first event can be any sort of error (like USB Transaction or Babble Detected, etc), and the final event is Success. The xHCI driver will handle the TD after the first event and remove it from its internal list, and then print an "Transfer event TRB DMA ptr not part of current TD" error message after the final event. Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a transaction error mid TD.") is designed to address isoc transaction errors, but unfortunately it doesn't account for this scenario. This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a success event follows a 'short transfer' event, but the TD the event points to is already given back. Expand the spurious success 'short transfer' event handling to cover the spurious success after error on Etron hosts. Kuangyi Chiang reported this issue and submitted a different solution based on using error_mid_td. This commit message is mostly taken from that patch. Reported-by: Kuangyi Chiang <ki.chiang65@gmail.com> Closes: https://lore.kernel.org/linux-usb/20241028025337.6372-6-ki.chiang65@gmail.com/ Tested-by: Kuangyi Chiang <ki.chiang65@gmail.com> Tested-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-16-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Unify duplicate inc_enq() code	Michal Pecio
	Extract a block of code copied from inc_enq() into a separate function and call it from inc_enq() and the other function which used this code. Remove the pointless 'next' variable which only aliases ring->enqueue. Note: I don't know if any 0.95 xHC ever reached series production, but "AMD 0.96 host" appears to be the "Llano" family APU. Example dmesg at https://linux-hardware.org/?probe=79d5cfd4fd&log=dmesg pci 0000:00:10.0: [1022:7812] type 00 class 0x0c0330 hcc params 0x014042c3 hci version 0x96 quirks 0x0000000000000608 Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-15-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	xhci: Prevent early endpoint restart when handling STALL errors.	Mathias Nyman
	Ensure that an endpoint halted due to device STALL is not restarted before a Clear_Feature(ENDPOINT_HALT) request is sent to the device. The host side of the endpoint may otherwise be started early by the 'Set TR Deq' command completion handler which is called if dequeue is moved past a cancelled or halted TD. Prevent this with a new flag set for bulk and interrupt endpoints when a Stall Error is received. Clear it in hcd->endpoint_reset() which is called after Clear_Feature(ENDPOINT_HALT) is sent. Also add a debug message if a class driver queues a new URB after the STALL. Note that class driver might not be aware of the STALL yet when it submits the URB as URBs are given back in BH. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-13-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: move debug capabilities from trb_in_td() to handle_tx_event()	Niklas Neronin
	Function trb_in_td() currently includes debug capabilities that are triggered when its debug argument is set to true. The only consumer of these debug capabilities is handle_tx_event(), which calls trb_in_td() twice, once for its primary functionality and a second time solely for debugging purposes if the first call returns 'NULL'. This approach is inefficient and can lead to confusion, as trb_in_td() executes the same code with identical arguments twice, differing only in the debug output during the second execution. To enhance clarity and efficiency, move the debug capabilities out of trb_in_td() and integrates them directly into handle_tx_event(). This change reduces the argument count of trb_in_td() and ensures that debug steps are executed only when necessary, streamlining the function's operation. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-12-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: refactor trb_in_td() to be static	Niklas Neronin
	Relocate trb_in_td() and marks it as static, as it's exclusively utilized in xhci-ring.c. This adjustment lays the groundwork for future rework of the function. The function's logic remains unchanged; only its access specifier is altered to static and a redundant "else" is removed on line 325 (due to checkpatch.pl complaining). Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-11-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Skip only one TD on Ring Underrun/Overrun	Michal Pecio
	If skipping is deferred to events other than Missed Service Error itsef, it means we are running on an xHCI 1.0 host and don't know how many TDs were missed until we reach some ordinary transfer completion event. And in case of ring xrun, we can't know where the xrun happened either. If we skip all pending TDs, we may prematurely give back TDs added after the xrun had occurred, risking data loss or buffer UAF by the xHC. If we skip none, a driver may become confused and stop working when all its URBs are missed and appear to be "in flight" forever. Skip exactly one TD on each xrun event - the first one that was missed, as we can now be sure that the HC has finished processing it. Provided that one more TD is queued before any subsequent doorbell ring, it will become safe to skip another TD by the time we get an xrun again. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-8-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Expedite skipping missed isoch TDs on modern HCs	Michal Pecio
	xHCI spec rev. 1.0 allowed the TRB pointer of Missed Service events to be NULL. Having no idea which of the queued TDs were missed and which are waiting, we can only set a flag to skip missed TDs later. But HCs are also allowed to give us pointer to the last missed TRB, and this became mandatory in spec rev. 1.1 and later. Use this pointer, if available, to immediately skip all missed TDs. This reduces latency and risk of skipping-related bugs, because we can now leave the skip flag cleared for future events. Handle Missed Service Error events as 'error mid TD', if applicable, because rev. 1.0 spec excplicitly says so in notes to 4.10.3.2 and later revs in 4.10.3.2 and 4.11.2.5.2. Notes to 4.9.1 seem to apply. Tested on ASM1142 and ASM3142 v1.1 xHCs which provide TRB pointers. Tested on AMD, Etron, Renesas v1.0 xHCs which provide TRB pointers. Tested on NEC v0.96 and VIA v1.0 xHCs which send a NULL pointer. Change inspired by a discussion about realtime USB audio. Link: https://lore.kernel.org/linux-usb/76e1a191-020d-4a76-97f6-237f9bd0ede0@gmx.net/T/ Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-7-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Fix isochronous Ring Underrun/Overrun event handling	Michal Pecio
	The TRB pointer of these events points at enqueue at the time of error occurrence on xHCI 1.1+ HCs or it's NULL on older ones. By the time we are handling the event, a new TD may be queued at this ring position. I can trigger this race by rising interrupt moderation to increase IRQ handling delay. Similar delay may occur naturally due to system load. If this ever happens after a Missed Service Error, missed TDs will be skipped and the new TD processed as if it matched the event. It could be given back prematurely, risking data loss or buffer UAF by the xHC. Don't complete TDs on xrun events and don't warn if queued TDs don't match the event's TRB pointer, which can be NULL or a link/no-op TRB. Don't warn if there are no queued TDs at all. Now that it's safe, also handle xrun events if the skip flag is clear. This ensures completion of any TD stuck in 'error mid TD' state right before the xrun event, which could happen if a driver submits a finite number of URBs to a buggy HC and then an error occurs on the last TD. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Complete 'error mid TD' transfers when handling Missed Service	Michal Pecio
	Missed Service Error after an error mid TD means that the failed TD has already been passed by the xHC without acknowledgment of the final TRB, a known hardware bug. So don't wait any more and give back the TD. Reproduced on NEC uPD720200 under conditions of ludicrously bad USB link quality, confirmed to behave as expected using dynamic debug. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: Don't skip on Stopped - Length Invalid	Michal Pecio
	Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") in v6.11, the driver didn't skip missed isochronous TDs when handling Stoppend and Stopped - Length Invalid events. Instead, it erroneously cleared the skip flag, which would cause the ring to get stuck, as future events won't match the missed TD which is never removed from the queue until it's cancelled. This buggy logic seems to have been in place substantially unchanged since the 3.x series over 10 years ago, which probably speaks first and foremost about relative rarity of this case in normal usage, but by the spec I see no reason why it shouldn't be possible. After d56b0b2ab142, TDs are immediately skipped when handling those Stopped events. This poses a potential problem in case of Stopped - Length Invalid, which occurs either on completed TDs (likely already given back) or Link and No-Op TRBs. Such event won't be recognized as matching any TD (unless it's the rare Link TRB inside a TD) and will result in skipping all pending TDs, giving them back possibly before they are done, risking isoc data loss and maybe UAF by HW. As a compromise, don't skip and don't clear the skip flag on this kind of event. Then the next event will skip missed TDs. A downside of not handling Stopped - Length Invalid on a Link inside a TD is that if the TD is cancelled, its actual length will not be updated to account for TRBs (silently) completed before the TD was stopped. I had no luck producing this sequence of completion events so there is no compelling demonstration of any resulting disaster. It may be a very rare, obscure condition. The sole motivation for this patch is that if such unlikely event does occur, I'd rather risk reporting a cancelled partially done isoc frame as empty than gamble with UAF. This will be fixed more properly by looking at Stopped event's TRB pointer when making skipping decisions, but such rework is unlikely to be backported to v6.12, which will stay around for a few years. Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-06	usb: xhci: remove redundant update_ring_for_set_deq_completion() function	Niklas Neronin
	The function is a remnant from a previous implementation and is now redundant. There is no longer a need to search for the dequeue pointer, as both the TRB and segment dequeue pointers are saved within 'queued_deq_seg' and 'queued_deq_ptr'. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250306144954.3507700-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-15	USB: host: Use str_enable_disable-like helpers	Krzysztof Kozlowski
	Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-2-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	xhci: Add command completion parameter support	Mathias Nyman
	xHC hosts can pass 24 bits of data with a command completion event TRB as the completion code only uses 8 bits of the 32 bit status field. Only Configure Endpoint, and Get Extended Property commands utilize this "command completion parameter" 24 bit field. For other command completion events the xHC should keep it RsvdZ. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-27	usb: xhci: Fix NULL pointer dereference on certain command aborts	Michal Pecio
	If a command is queued to the final usable TRB of a ring segment, the enqueue pointer is advanced to the subsequent link TRB and no further. If the command is later aborted, when the abort completion is handled the dequeue pointer is advanced to the first TRB of the next segment. If no further commands are queued, xhci_handle_stopped_cmd_ring() sees the ring pointers unequal and assumes that there is a pending command, so it calls xhci_mod_cmd_timer() which crashes if cur_cmd was NULL. Don't attempt timer setup if cur_cmd is NULL. The subsequent doorbell ring likely is unnecessary too, but it's harmless. Leave it alone. This is probably Bug 219532, but no confirmation has been received. The issue has been independently reproduced and confirmed fixed using a USB MCU programmed to NAK the Status stage of SET_ADDRESS forever. Everything continued working normally after several prevented crashes. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219532 Fixes: c311e391a7ef ("xhci: rework command timeout and cancellation,") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241227120142.1035206-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-17	xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic	Mathias Nyman
	xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running. The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped. See commit 42b758137601 ("usb: xhci: Limit Stop Endpoint retries") for details CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Avoid queuing redundant Stop Endpoint commands	Michal Pecio
	Stop Endpoint command on an already stopped endpoint fails and may be misinterpreted as a known hardware bug by the completion handler. This results in an unnecessary delay with repeated retries of the command. Avoid queuing this command when endpoint state flags indicate that it's stopped or halted and the command will fail. If commands are pending on the endpoint, their completion handlers will process cancelled TDs so it's done. In case of waiting for external operations like clearing TT buffer, the endpoint is stopped and cancelled TDs can be processed now. This eliminates practically all unnecessary retries because an endpoint with pending URBs is maintained in Running state by the driver, unless aforementioned commands or other operations are pending on it. This is guaranteed by xhci_ring_ep_doorbell() and by the fact that it is called every time any of those operations completes. The only known exceptions are hardware bugs (the endpoint never starts at all) and Stream Protocol errors not associated with any TRB, which cause an endpoint reset not followed by restart. Sounds like a bug. Generally, these retries are only expected to happen when the endpoint fails to start for unknown/no reason, which is a worse problem itself, and fixing the bug eliminates the retries too. All cases were tested and found to work as expected. SET_DEQ_PENDING was produced by patching uvcvideo to unlink URBs in 100us intervals, which then runs into this case very often. EP_HALTED was produced by restarting 'cat /dev/ttyUSB0' on a serial dongle with broken cable. EP_CLEARING_TT by the same, with the dongle on an external hub. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-34-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Fix TD invalidation under pending Set TR Dequeue	Michal Pecio
	xhci_invalidate_cancelled_tds() may not work correctly if the hardware is modifying endpoint or stream contexts at the same time by executing a Set TR Dequeue command. And even if it worked, it would be unable to queue Set TR Dequeue for the next stream, failing to clear xHC cache. On stream endpoints, a chain of Set TR Dequeue commands may take some time to execute and we may want to cancel more TDs during this time. Currently this leads to Stop Endpoint completion handler calling this function without testing for SET_DEQ_PENDING, which will trigger the aforementioned problems when it happens. On all endpoints, a halt condition causes Reset Endpoint to be queued and an error status given to the class driver, which may unlink more URBs in response. Stop Endpoint is queued and its handler may execute concurrently with Set TR Dequeue queued by Reset Endpoint handler. (Reset Endpoint handler calls this function too, but there seems to be no possibility of it running concurrently with Set TR Dequeue). Fix xhci_invalidate_cancelled_tds() to work correctly under a pending Set TR Dequeue. Bail out of the function when SET_DEQ_PENDING is set, then make the completion handler call the function again and also call xhci_giveback_invalidated_tds(), which needs to be called next. This seems to fix another potential bug, where the handler would call xhci_invalidate_cancelled_tds(), which may clear some deferred TDs if a sanity check fails, and the TDs wouldn't be given back promptly. Said sanity check seems to be wrong and prone to false positives when the endpoint halts, but fixing it is beyond the scope of this change, besides ensuring that cleared TDs are given back properly. Fixes: 5ceac4402f5d ("xhci: Handle TD clearing for multiple streams case") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-33-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Limit Stop Endpoint retries	Michal Pecio
	Some host controllers fail to atomically transition an endpoint to the Running state on a doorbell ring and enter a hidden "Restarting" state, which looks very much like Stopped, with the important difference that it will spontaneously transition to Running anytime soon. A Stop Endpoint command queued in the Restarting state typically fails with Context State Error and the completion handler sees the Endpoint Context State as either still Stopped or already Running. Even a case of Halted was observed, when an error occurred right after the restart. The Halted state is already recovered from by resetting the endpoint. The Running state is handled by retrying Stop Endpoint. The Stopped state was recognized as a problem on NEC controllers and worked around also by retrying, because the endpoint soon restarts and then stops for good. But there is a risk: the command may fail if the endpoint is "stopped for good" already, and retries will fail forever. The possibility of this was not realized at the time, but a number of cases were discovered later and reproduced. Some proved difficult to deal with, and it is outright impossible to predict if an endpoint may fail to ever start at all due to a hardware bug. One such bug (albeit on ASM3142, not on NEC) was found to be reliably triggered simply by toggling an AX88179 NIC up/down in a tight loop for a few seconds. An endless retries storm is quite nasty. Besides putting needless load on the xHC and CPU, it causes URBs never to be given back, paralyzing the device and connection/disconnection logic for the whole bus if the device is unplugged. User processes waiting for URBs become unkillable, drivers and kworker threads lock up and xhci_hcd cannot be reloaded. For peace of mind, impose a timeout on Stop Endpoint retries in this case. If they don't succeed in 100ms, consider the endpoint stopped permanently for some reason and just give back the unlinked URBs. This failure case is rare already and work is under way to make it rarer. Start this work today by also handling one simple case of race with Reset Endpoint, because it costs just two lines to implement. Fixes: fd9d55d190c0 ("xhci: retry Stop Endpoint on buggy NEC controllers") CC: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-32-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: add help function xhci_dequeue_td()	Niklas Neronin
	Add xhci_dequeue_td() helper function to reduce code duplication. Function xhci_dequeue_td() advances the dequeue pointer past the specified Transfer Descriptor (TD) and releases the TD. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-30-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: refactor xhci_td_cleanup() to return void	Niklas Neronin
	The function is modified to return 'void' instead of an integer since it invariably returns '0'. Additionally, multiple functions which only return xhci_td_cleanup() are also refactored to return void. This change eliminates the need for callers to handle a return value that does not convey meaningful information and improve code readability, as it becomes immediately clear that the function does not produce a significant output. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-29-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: remove unused arguments from td_to_noop()	Niklas Neronin
	Function td_to_noop() does not utilize arguments 'xhci' and 'ep_ring'. These unused arguments are removed to clean up the code. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-28-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: simplify TDs start and end naming scheme in struct 'xhci_td'	Niklas Neronin
	Old names: * start_seg - last_trb_seg * first_trb - last_trb New names: * start_seg - end_seg * start_trb - end_trb A Transfer Descriptor (TD) in the xhci driver is a data structure that represents a single transaction to be performed by the USB host controller. This transaction is defined by TRBs from 'start_trb' in 'start_seg' to 'end_trb' in 'end_seg'. The terms "start" and "end" were chosen over "first" and "last" for ease of searching within the codebase. The ring structure uses 'first_seg' and 'last_seg', while the TD structure uses 'start_seg' and 'end_seg'. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-24-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	xhci: Fix control transfer error on Etron xHCI host	Kuangyi Chiang
	Performing a stability stress test on a USB3.0 2.5G ethernet adapter results in errors like this: [ 91.441469] r8152 2-3:1.0 eth3: get_registers -71 [ 91.458659] r8152 2-3:1.0 eth3: get_registers -71 [ 91.475911] r8152 2-3:1.0 eth3: get_registers -71 [ 91.493203] r8152 2-3:1.0 eth3: get_registers -71 [ 91.510421] r8152 2-3:1.0 eth3: get_registers -71 The r8152 driver will periodically issue lots of control-IN requests to access the status of ethernet adapter hardware registers during the test. This happens when the xHCI driver enqueue a control TD (which cross over the Link TRB between two ring segments, as shown) in the endpoint zero's transfer ring. Seems the Etron xHCI host can not perform this TD correctly, causing the USB transfer error occurred, maybe the upper driver retry that control-IN request can solve problem, but not all drivers do this. \| \| ------- \| TRB \| Setup Stage ------- \| TRB \| Link ------- ------- \| TRB \| Data Stage ------- \| TRB \| Status Stage ------- \| \| To work around this, the xHCI driver should enqueue a No Op TRB if next available TRB is the Link TRB in the ring segment, this can prevent the Setup and Data Stage TRB to be breaked by the Link TRB. Check if the XHCI_ETRON_HOST quirk flag is set before invoking the workaround in xhci_queue_ctrl_tx(). Fixes: d0e96f5a71a0 ("USB: xhci: Control transfer support.") Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-20-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	xhci: trace stream context at Set TR Deq command completion	Mathias Nyman
	A successful 'Set TR Deq' command writes a new dequeue pointer to the stream context for stream rings, show the content of the stream ctx at command completion. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-9-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	xhci: Don't trace ring at every enqueue or dequeue increase	Mathias Nyman
	Don't trace ring pointers every time driver increases enqueue pointer after queuing a TRB, or dequeue pointer after handling an event. These xhci_inc_deq: and xhci_inc_enq: trace entries fill up the trace and provides little useful info now that the TRB DMA address is printed with the TRB itself. Only trace ring during xhci_inc_enq() and xhci_inc_deq() in case we move to a new ring segment. Also don't show both segment and TRB addess in trace. Segment is just TRB with 0xfff masked off. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-7-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	xhci: show DMA address of TRB when tracing TRBs	Mathias Nyman
	The DMA address of a queued TRB is essential when looking at traces as both transfer events and command completion events refer to the command or transfer based on its DMA address. Previously the TRB address was figured out from xhci_inc_enq and xhci_inc_deq trace entries seen after queuing or handlong a TRB. Now that DMA address is shown in TRB tracing we can get rid of most of the xhci_inc_enq and xhci_inc_deq traces thus decreasing trace size. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-6-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Fix sum_trb_lengths()	Michal Pecio
	This function is supposed to sum the lengths of all transfer TRBs in a TD up to a point, but it starts summing at the current dequeue since it only ever gets called on the first pending TD. This won't work when there are cancelled TDs at the beginning of the ring. The function tries to exclude No-Ops from the count, but not all cancelled TDs are No-Op'ed - not those the HW stopped on. The absolutely obvious fix is to start counting at the TD's first TRB. And remove the now-useless 'ring' parameter, and 'xhci' too. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06	usb: xhci: Remove unused parameters of next_trb()	Michal Pecio
	The function has two parameters which it doesn't use and hasn't ever used. One caller even puts NULL there, knowing it will work anyway. Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-05	Merge v6.12-rc6 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well, and this resolves a merge conflict in: drivers/usb/typec/tcpm/tcpm.c Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/20241101150730.090dc30f@canb.auug.org.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-29	xhci: Fix Link TRB DMA in command ring stopped completion event	Faisal Hassan
	During the aborting of a command, the software receives a command completion event for the command ring stopped, with the TRB pointing to the next TRB after the aborted command. If the command we abort is located just before the Link TRB in the command ring, then during the 'command ring stopped' completion event, the xHC gives the Link TRB in the event's cmd DMA, which causes a mismatch in handling command completion event. To address this situation, move the 'command ring stopped' completion event check slightly earlier, since the specific command it stopped on isn't of significant concern. Fixes: 7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.") Cc: stable@vger.kernel.org Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241022155631.1185-1-quic_faisalh@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-21	Merge 6.12-rc4 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17	usb: xhci: Fix handling errors mid TD followed by other errors	Michal Pecio
	Some host controllers fail to produce the final completion event on an isochronous TD which experienced an error mid TD. We deal with it by flagging such TDs and checking if the next event points at the flagged TD or at the next one, and giving back the flagged TD if the latter. This is not enough, because the next TD may be missed by the xHC. Or there may be no next TD but a ring underrun. We also need to get such TD quickly out of the way, or errors on later TDs may be handled wrong. If the next TD experiences a Missed Service Error, we will set the skip flag on the endpoint and then attempt skipping TDs when yet another event arrives. In such scenario, we ought to report the 'error mid TD' transfer as such rather than skip it. Another problem case are Stopped events. If we see one after an error mid TD, we naively assume that it's a Force Stopped Event because it doesn't match the pending TD, but in reality it might be an ordinary Stopped event for the next TD, which we fail to recognize and handle. Fix this by moving error mid TD handling before the whole TD skipping loop. Remove unnecessary conditions, always give back the TD if the new event points to any TRB outside it or if the pointer is NULL, as may be the case in Ring Underrun and Overrun events on 1st gen hardware. Only if the pending TD isn't flagged, consider other actions like skipping. As a side effect of reordering with skip and FSE cases, error mid TD is reordered with last_td_was_short check. This is harmless, because the two cases are mutually exclusive - only one can happen in any given run of handle_tx_event(). Tested on the NEC host and a USB camera with flaky cable. Dynamic debug confirmed that Transaction Errors are sometimes seen, sometimes mid-TD, sometimes followed by Missed Service. In such cases, they were finished properly before skipping began. [Rebase on 6.12-rc1 -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17	xhci: Mitigate failed set dequeue pointer commands	Mathias Nyman
	Avoid xHC host from processing a cancelled URB by always turning cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command. If the command fails then xHC will start processing the cancelled TD instead of skipping it once endpoint is restarted, causing issues like Babble error. This is not a complete solution as a failed 'Set TR Deq' command does not guarantee xHC TRB caches are cleared. Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>