summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-07ice: fix incorrect PHY settings for 100 GB/sPrzemyslaw Korba
ptp4l application reports too high offset when ran on E823 device with a 100GB/s link. Those values cannot go under 100ns, like in a working case when using 100 GB/s cable. This is due to incorrect frequency settings on the PHY clocks for 100 GB/s speed. Changes are introduced to align with the internal hardware documentation, and correctly initialize frequency in PHY clocks with the frequency values that are in our HW spec. To reproduce the issue run ptp4l as a Time Receiver on E823 device, and observe the offset, which will never approach values seen in the PTP working case. Reproduction output: ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf ptp4l[5278.775]: master offset 12470 s2 freq +41288 path delay -3002 ptp4l[5278.837]: master offset 10525 s2 freq +39202 path delay -3002 ptp4l[5278.900]: master offset -24840 s2 freq -20130 path delay -3002 ptp4l[5278.963]: master offset 10597 s2 freq +37908 path delay -3002 ptp4l[5279.025]: master offset 8883 s2 freq +36031 path delay -3002 ptp4l[5279.088]: master offset 7267 s2 freq +34151 path delay -3002 ptp4l[5279.150]: master offset 5771 s2 freq +32316 path delay -3002 ptp4l[5279.213]: master offset 4388 s2 freq +30526 path delay -3002 ptp4l[5279.275]: master offset -30434 s2 freq -28485 path delay -3002 ptp4l[5279.338]: master offset -28041 s2 freq -27412 path delay -3002 ptp4l[5279.400]: master offset 7870 s2 freq +31118 path delay -3002 Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support") Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-07ice: fix max values for dpll pin phase adjustArkadiusz Kubalewski
Mask admin command returned max phase adjust value for both input and output pins. Only 31 bits are relevant, last released data sheet wrongly points that 32 bits are valid - see [1] 3.2.6.4.1 Get CCU Capabilities Command for reference. Fix of the datasheet itself is in progress. Fix the min/max assignment logic, previously the value was wrongly considered as negative value due to most significant bit being set. Example of previous broken behavior: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get --json '{"id":1}'| grep phase-adjust 'phase-adjust': 0, 'phase-adjust-max': 16723, 'phase-adjust-min': -16723, Correct behavior with the fix: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get --json '{"id":1}'| grep phase-adjust 'phase-adjust': 0, 'phase-adjust-max': 2147466925, 'phase-adjust-min': -2147466925, [1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-07topology: Keep the cpumask unchanged when printing cpumapLi Huafei
During fuzz testing, the following warning was discovered: different return values (15 and 11) from vsnprintf("%*pbl ", ...) test:keyward is WARNING in kvasprintf WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130 Call Trace: kvasprintf+0x121/0x130 kasprintf+0xa6/0xe0 bitmap_print_to_buf+0x89/0x100 core_siblings_list_read+0x7e/0xb0 kernfs_file_read_iter+0x15b/0x270 new_sync_read+0x153/0x260 vfs_read+0x215/0x290 ksys_read+0xb9/0x160 do_syscall_64+0x56/0x100 entry_SYSCALL_64_after_hwframe+0x78/0xe2 The call trace shows that kvasprintf() reported this warning during the printing of core_siblings_list. kvasprintf() has several steps: (1) First, calculate the length of the resulting formatted string. (2) Allocate a buffer based on the returned length. (3) Then, perform the actual string formatting. (4) Check whether the lengths of the formatted strings returned in steps (1) and (2) are consistent. If the core_cpumask is modified between steps (1) and (3), the lengths obtained in these two steps may not match. Indeed our test includes cpu hotplugging, which should modify core_cpumask while printing. To fix this issue, cache the cpumask into a temporary variable before calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged during the printing process. Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: stable <stable@kernel.org> Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241114110141.94725-1-lihuafei1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07sysfs: constify macro BIN_ATTRIBUTE_GROUPS()Thomas Weißschuh
As there is only one in-tree user, avoid a transition phase and switch that user in the same commit. As there are some interdependencies between the constness of the different symbols in the s390 driver, covert the whole driver at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241205-sysfs-const-bin_attr-groups_macro-v1-1-ac5e855031e8@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07MAINTAINERS: add include/linux/sysfs.hThomas Weißschuh
include/linux/sysfs.h is maintained as part of "DRIVER CORE, KOBJECT, DEBUGFS AND SYSFS. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241205-sysfs-maintainers-v1-1-53694b690d67@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07kernel/ksysfs.c: simplify bin_attribute definitionThomas Weißschuh
The notes attribute can be implemented in terms of BIN_ATTR_SIMPLE(). This saves memory at runtime and is a preparation for the constification of struct bin_attribute. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241121-sysfs-const-bin_attr-ksysfs-v1-1-972faced149d@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftestChristoph Schlameuss
Fixup the uc_attr_mem_limit test case to also cover the KVM_HAS_DEVICE_ATTR ioctl. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-7-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-7-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07KVM: s390: selftests: Add ucontrol gis routing testChristoph Schlameuss
Add a selftests for the interrupt routing configuration when using ucontrol VMs. Calling the test may trigger a null pointer dereferences on kernels not containing the fixes in this patch series. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-5-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-5-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMsChristoph Schlameuss
Prevent null pointer dereference when processing KVM_IRQ_ROUTING_S390_ADAPTER routing entries. The ioctl cannot be processed for ucontrol VMs. Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages") Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-4-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-4-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07KVM: s390: selftests: Add ucontrol flic attr selftestsChristoph Schlameuss
Add some superficial selftests for the floating interrupt controller when using ucontrol VMs. These tests are intended to cover very basic calls only. Some of the calls may trigger null pointer dereferences on kernels not containing the fixes in this patch series. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-3-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-3-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07KVM: s390: Reject setting flic pfault attributes on ucontrol VMsChristoph Schlameuss
Prevent null pointer dereference when processing the KVM_DEV_FLIC_APF_ENABLE and KVM_DEV_FLIC_APF_DISABLE_WAIT ioctls in the interrupt controller. Fixes: 3c038e6be0e2 ("KVM: async_pf: Async page fault support on s390") Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-2-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-2-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-01-07KVM: s390: vsie: fix virtual/physical address in unpin_scb()Claudio Imbrenda
In commit 77b533411595 ("KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page"), only pin_scb() has been updated. This means that in unpin_scb() a virtual address was still used directly as physical address without conversion. The resulting physical address is obviously wrong and most of the time also invalid. Since commit d0ef8d9fbebe ("KVM: s390: Use kvm_release_page_dirty() to unpin "struct page" memory"), unpin_guest_page() will directly use kvm_release_page_dirty(), instead of kvm_release_pfn_dirty(), which has since been removed. One of the checks that were performed by kvm_release_pfn_dirty() was to verify whether the page was valid at all, and silently return successfully without doing anything if the page was invalid. When kvm_release_pfn_dirty() was still used, the invalid page was thus silently ignored. Now the check is gone and the result is an Oops. This also means that when running with a V!=R kernel, the page was not released, causing a leak. The solution is simply to add the missing virt_to_phys(). Fixes: 77b533411595 ("KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page") Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Link: https://lore.kernel.org/r/20241210083948.23963-1-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20241210083948.23963-1-imbrenda@linux.ibm.com>
2025-01-07platform/x86: intel/pmc: Fix ioremap() of bad addressDavid E. Box
In pmc_core_ssram_get_pmc(), the physical addresses for hidden SSRAM devices are retrieved from the MMIO region of the primary SSRAM device. If additional devices are not present, the address returned is zero. Currently, the code does not check for this condition, resulting in ioremap() incorrectly attempting to map address 0. Add a check for a zero address and return 0 if no additional devices are found, as it is not an error for the device to be absent. Fixes: a01486dc4bb1 ("platform/x86/intel/pmc: Cleanup SSRAM discovery") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20250106174653.1497128-1-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-07platform/x86: ISST: Add Clearwater Forest to support listSrinivas Pandruvada
Add Clearwater Forest (INTEL_ATOM_DARKMONT_X) to SST support list by adding to isst_cpu_ids. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250103155255.1488139-2-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-07platform/x86/intel: power-domains: Add Clearwater Forest supportSrinivas Pandruvada
Add Clearwater Forest support (INTEL_ATOM_DARKMONT_X) to tpmi_cpu_ids to support domaid id mappings. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250103155255.1488139-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-07platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled itMaciej S. Szmigiero
Wakeup for IRQ1 should be disabled only in cases where i8042 had actually enabled it, otherwise "wake_depth" for this IRQ will try to drop below zero and there will be an unpleasant WARN() logged: kernel: atkbd serio0: Disabling IRQ1 wakeup source to avoid platform firmware bug kernel: ------------[ cut here ]------------ kernel: Unbalanced IRQ 1 wake disable kernel: WARNING: CPU: 10 PID: 6431 at kernel/irq/manage.c:920 irq_set_irq_wake+0x147/0x1a0 The PMC driver uses DEFINE_SIMPLE_DEV_PM_OPS() to define its dev_pm_ops which sets amd_pmc_suspend_handler() to the .suspend, .freeze, and .poweroff handlers. i8042_pm_suspend(), however, is only set as the .suspend handler. Fix the issue by call PMC suspend handler only from the same set of dev_pm_ops handlers as i8042_pm_suspend(), which currently means just the .suspend handler. To reproduce this issue try hibernating (S4) the machine after a fresh boot without putting it into s2idle first. Fixes: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Link: https://lore.kernel.org/r/c8f28c002ca3c66fbeeb850904a1f43118e17200.1736184606.git.mail@maciej.szmigiero.name [ij: edited the commit message.] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-01-07debugfs: fix missing mutex_destroy() in short_fops caseAl Viro
we need that in ->real_fops == NULL, ->short_fops != NULL case Fixes: 8dc6d81c6b2a "debugfs: add small file operations for most files" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20241229081223.3193228-1-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07fs: debugfs: differentiate short fops with proxy opsJohannes Berg
Geert reported that my previous short fops debugfs changes broke m68k, because it only has mandatory alignement of two, so we can't stash the "is it short" information into the pointer (as we already did with the "is it real" bit.) Instead, exploit the fact that debugfs_file_get() called on an already open file will already find that the fsdata is no longer the real fops but rather the allocated data that already distinguishes full/short ops, so only open() needs to be able to distinguish. We can achieve that by using two different open functions. Unfortunately this requires another set of full file ops, increasing the size by 536 bytes (x86-64), but that's still a reasonable trade-off given that only converting some of the wireless stack gained over 28k. This brings the total cost of this to around 1k, for wins of 28k (all x86-64). Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/CAMuHMdWu_9-L2Te101w8hU7H_2yobJFPXSwwUmGHSJfaPWDKiQ@mail.gmail.com Fixes: 8dc6d81c6b2a ("debugfs: add small file operations for most files") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20241129121536.30989-2-johannes@sipsolutions.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07afs: Fix the maximum cell name lengthDavid Howells
The kafs filesystem limits the maximum length of a cell to 256 bytes, but a problem occurs if someone actually does that: kafs tries to create a directory under /proc/net/afs/ with the name of the cell, but that fails with a warning: WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405 because procfs limits the maximum filename length to 255. However, the DNS limits the maximum lookup length and, by extension, the maximum cell name, to 255 less two (length count and trailing NUL). Fix this by limiting the maximum acceptable cellname length to 253. This also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too. Further, split the YFS VL record cell name maximum to be the 256 allowed by the protocol and ignore the record retrieved by YFSVL.GetCellName if it exceeds 253. Fixes: c3e9f888263b ("afs: Implement client support for the YFSVL.GetCellName RPC op") Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-07Merge tag 'fuse-fixes-6.13-rc7' of ↵Christian Brauner
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi <mszeredi@redhat.com>: - Fix fuse_get_user_pages() allocation failure handling. - Fix direct-io folio offset and length calculation. * tag 'fuse-fixes-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure fuse: fix direct io folio offset and length calculation Link: https://lore.kernel.org/r/CAJfpegu7o_X%3DSBWk_C47dUVUQ1mJZDEGe1MfD0N3wVJoUBWdmg@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-07staging: gpib: refer to correct config symbol in tnt4882 MakefileLukas Bulwahn
Commit 79d2e1919a27 ("staging: gpib: fix Makefiles") uses the corresponding config symbols to let Makefiles include the driver sources appropriately in the kernel build. Unfortunately, the Makefile in the tnt4882 directory refers to the non-existing config GPIB_TNT4882. The actual config name for this driver is GPIB_NI_PCI_ISA, as can be observed in the gpib Makefile. Probably, this is caused by the subtle differences between the config names, directory names and file names in ./drivers/staging/gpib/, where often config names and directory names are identical or at least close in naming, but in this case, it is not. Change the reference in the tnt4882 Makefile from the non-existing config GPIB_TNT4882 to the existing config GPIB_NI_PCI_ISA. Fixes: 79d2e1919a27 ("staging: gpib: fix Makefiles") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Link: https://lore.kernel.org/r/20250107135032.34424-1-lukas.bulwahn@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07io_uring: silence false positive warningsPavel Begunkov
If we kill a ring and then immediately exit the task, we'll get cancellattion running by the task and a kthread in io_ring_exit_work. For DEFER_TASKRUN, we do want to limit it to only one entity executing it, however it's currently not an issue as it's protected by uring_lock. Silence lockdep assertions for now, we'll return to it later. Reported-by: syzbot+1bcb75613069ad4957fc@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e5f68281acb0f081f65fde435833c68a3b7e02f.1736257837.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-07net: don't dump Tx and uninitialized NAPIsJakub Kicinski
We use NAPI ID as the key for continuing dumps. We also depend on the NAPIs being sorted by ID within the driver list. Tx NAPIs (which don't have an ID assigned) break this expectation, it's not currently possible to dump them reliably. Since Tx NAPIs are relatively rare, and can't be used in doit (GET or SET) hide them from the dump API as well. Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103183207.1216004-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()GONG Ruiqi
The error handling for the case `con_index == 0` should involve dropping the pm usage counter, as ucsi_ccg_sync_control() gets it at the beginning. Fix it. Cc: stable <stable@kernel.org> Fixes: e56aac6e5a25 ("usb: typec: fix potential array underflow in ucsi_ccg_sync_control()") Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250107015750.2778646-1-gongruiqi1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07usb-storage: Add max sectors quirk for Nokia 208Lubomir Rintel
This fixes data corruption when accessing the internal SD card in mass storage mode. I am actually not too sure why. I didn't figure a straightforward way to reproduce the issue, but i seem to get garbage when issuing a lot (over 50) of large reads (over 120 sectors) are done in a quick succession. That is, time seems to matter here -- larger reads are fine if they are done with some delay between them. But I'm not great at understanding this sort of things, so I'll assume the issue other, smarter, folks were seeing with similar phones is the same problem and I'll just put my quirk next to theirs. The "Software details" screen on the phone is as follows: V 04.06 07-08-13 RM-849 (c) Nokia TL;DR version of the device descriptor: idVendor 0x0421 Nokia Mobile Phones idProduct 0x06c2 bcdDevice 4.06 iManufacturer 1 Nokia iProduct 2 Nokia 208 The patch assumes older firmwares are broken too (I'm unable to test, but no biggie if they aren't I guess), and I have no idea if newer firmware exists. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: stable <stable@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250101212206.2386207-1-lkundrak@v3.sk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07usb: gadget: midi2: Reverse-select at the right placeTakashi Iwai
We should do reverse selection of other components from CONFIG_USB_F_MIDI2 which is tristate, instead of CONFIG_USB_CONFIGFS_F_MIDI2 which is bool, for satisfying subtle module dependencies. Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20250101131124.27599-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07rust: driver: address soundness issue in `RegistrationOps`Danilo Krummrich
The `RegistrationOps` trait holds some obligations to the caller and implementers. While being documented, the trait and the corresponding functions haven't been marked as unsafe. Hence, markt the trait and functions unsafe and add the corresponding safety comments. This patch does not include any fuctional changes. Reported-by: Gary Guo <gary@garyguo.net> Closes: https://lore.kernel.org/rust-for-linux/20241224195821.3b43302b.gary@garyguo.net/ Signed-off-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20250103164655.96590-4-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07rust: io: move module entry to its correct locationDanilo Krummrich
The module entry of `io` falsely ended up in the "use" block instead of the "mod" block, hence move it to its correct location. Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250103164655.96590-3-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-07rust: pci: do not depend on CONFIG_PCI_MSIDanilo Krummrich
The PCI abstractions do not actually depend on CONFIG_PCI_MSI; it also breaks drivers that only depend on CONFIG_PCI, hence drop it. While at it, move the module entry to its correct location. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501030744.4ucqC1cB-lkp@intel.com/ Fixes: 1bd8b6b2c5d3 ("rust: pci: add basic PCI device / driver abstractions") Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250103164655.96590-2-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-06cxgb4: Avoid removal of uninserted tidAnumula Murali Mohan Reddy
During ARP failure, tid is not inserted but _c4iw_free_ep() attempts to remove tid which results in error. This patch fixes the issue by avoiding removal of uninserted tid. Fixes: 59437d78f088 ("cxgb4/chtls: fix ULD connection failures due to wrong TID base") Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06Merge branch 'bnxt_en-2-bug-fixes'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: 2 Bug fixes The first patch fixes a potential memory leak when sending a FW message for the RoCE driver. The second patch fixes the potential issue of DIM modifying the coalescing parameters of a ring that has been freed. ==================== Link: https://patch.msgid.link/20250104043849.3482067-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06bnxt_en: Fix DIM shutdownMichael Chan
DIM work will call the firmware to adjust the coalescing parameters on the RX rings. We should cancel DIM work before we call the firmware to free the RX rings. Otherwise, FW will reject the call from DIM work if the RX ring has been freed. This will generate an error message like this: bnxt_en 0000:21:00.1 ens2f1np1: hwrm req_type 0x53 seq id 0x6fca error 0x2 and cause unnecessary concern for the user. It is also possible to modify the coalescing parameters of the wrong ring if the ring has been re-allocated. To prevent this, cancel DIM work right before freeing the RX rings. We also have to add a check in NAPI poll to not schedule DIM if the RX rings are shutting down. Check that the VNIC is active before we schedule DIM. The VNIC is always disabled before we free the RX rings. Fixes: 0bc0b97fca73 ("bnxt_en: cleanup DIM work on device shutdown") Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250104043849.3482067-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06bnxt_en: Fix possible memory leak when hwrm_req_replace failsKalesh AP
When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop() which could cause a memory leak. Fixes: bbf33d1d9805 ("bnxt_en: update all firmware calls to use the new APIs") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06pds_core: limit loop over fw name listShannon Nelson
Add an array size limit to the for-loop to be sure we don't try to reference a fw_version string off the end of the fw info names array. We know that our firmware only has a limited number of firmware slot names, but we shouldn't leave this unchecked. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06drm/amdgpu: Add a lock when accessing the buddy trim functionArunpravin Paneer Selvam
When running YouTube videos and Steam games simultaneously, the tester found a system hang / race condition issue with the multi-display configuration setting. Adding a lock to the buddy allocator's trim function would be the solution. <log snip> [ 7197.250436] general protection fault, probably for non-canonical address 0xdead000000000108 [ 7197.250447] RIP: 0010:__alloc_range+0x8b/0x340 [amddrm_buddy] [ 7197.250470] Call Trace: [ 7197.250472] <TASK> [ 7197.250475] ? show_regs+0x6d/0x80 [ 7197.250481] ? die_addr+0x37/0xa0 [ 7197.250483] ? exc_general_protection+0x1db/0x480 [ 7197.250488] ? drm_suballoc_new+0x13c/0x93d [drm_suballoc_helper] [ 7197.250493] ? asm_exc_general_protection+0x27/0x30 [ 7197.250498] ? __alloc_range+0x8b/0x340 [amddrm_buddy] [ 7197.250501] ? __alloc_range+0x109/0x340 [amddrm_buddy] [ 7197.250506] amddrm_buddy_block_trim+0x1b5/0x260 [amddrm_buddy] [ 7197.250511] amdgpu_vram_mgr_new+0x4f5/0x590 [amdgpu] [ 7197.250682] amdttm_resource_alloc+0x46/0xb0 [amdttm] [ 7197.250689] ttm_bo_alloc_resource+0xe4/0x370 [amdttm] [ 7197.250696] amdttm_bo_validate+0x9d/0x180 [amdttm] [ 7197.250701] amdgpu_bo_pin+0x15a/0x2f0 [amdgpu] [ 7197.250831] amdgpu_dm_plane_helper_prepare_fb+0xb2/0x360 [amdgpu] [ 7197.251025] ? try_wait_for_completion+0x59/0x70 [ 7197.251030] drm_atomic_helper_prepare_planes.part.0+0x2f/0x1e0 [ 7197.251035] drm_atomic_helper_prepare_planes+0x5d/0x70 [ 7197.251037] drm_atomic_helper_commit+0x84/0x160 [ 7197.251040] drm_atomic_nonblocking_commit+0x59/0x70 [ 7197.251043] drm_mode_atomic_ioctl+0x720/0x850 [ 7197.251047] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [ 7197.251049] drm_ioctl_kernel+0xb9/0x120 [ 7197.251053] ? srso_alias_return_thunk+0x5/0xfbef5 [ 7197.251056] drm_ioctl+0x2d4/0x550 [ 7197.251058] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [ 7197.251063] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu] [ 7197.251186] __x64_sys_ioctl+0xa0/0xf0 [ 7197.251190] x64_sys_call+0x143b/0x25c0 [ 7197.251193] do_syscall_64+0x7f/0x180 [ 7197.251197] ? srso_alias_return_thunk+0x5/0xfbef5 [ 7197.251199] ? amdgpu_display_user_framebuffer_create+0x215/0x320 [amdgpu] [ 7197.251329] ? drm_internal_framebuffer_create+0xb7/0x1a0 [ 7197.251332] ? srso_alias_return_thunk+0x5/0xfbef5 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Fixes: 4a5ad08f5377 ("drm/amdgpu: Add address alignment support to DCC buffers") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3318ba94e56b9183d0304577c74b33b6b01ce516) Cc: stable@vger.kernel.org
2025-01-06drm/amd/pm: fix BUG: scheduling while atomicKun Liu
atomic scheduling will be triggered in interrupt handler for AC/DC mode switch as following backtrace. Call Trace: <IRQ> dump_stack_lvl __schedule_bug __schedule schedule schedule_preempt_disabled __mutex_lock smu_cmn_send_smc_msg_with_param smu_v13_0_irq_process amdgpu_irq_dispatch amdgpu_ih_process amdgpu_irq_handler __handle_irq_event_percpu handle_irq_event handle_edge_irq __common_interrupt common_interrupt </IRQ> <TASK> asm_common_interrupt Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Kun Liu <Kun.Liu2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 03cc84b102d1a832e8dfc59344346dedcebcdf42) Cc: stable@vger.kernel.org
2025-01-06drm/amdkfd: wq_release signals dma_fence only when availableZhu Lingshan
kfd_process_wq_release() signals eviction fence by dma_fence_signal() which wanrs if dma_fence is NULL. kfd_process->ef is initialized by kfd_process_device_init_vm() through ioctl. That means the fence is NULL for a new created kfd_process, and close a kfd_process right after open it will trigger the warning. This commit conditionally signals the eviction fence in kfd_process_wq_release() only when it is available. [ 503.660882] WARNING: CPU: 0 PID: 9 at drivers/dma-buf/dma-fence.c:467 dma_fence_signal+0x74/0xa0 [ 503.782940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu] [ 503.789640] RIP: 0010:dma_fence_signal+0x74/0xa0 [ 503.877620] Call Trace: [ 503.880066] <TASK> [ 503.882168] ? __warn+0xcd/0x260 [ 503.885407] ? dma_fence_signal+0x74/0xa0 [ 503.889416] ? report_bug+0x288/0x2d0 [ 503.893089] ? handle_bug+0x53/0xa0 [ 503.896587] ? exc_invalid_op+0x14/0x50 [ 503.900424] ? asm_exc_invalid_op+0x16/0x20 [ 503.904616] ? dma_fence_signal+0x74/0xa0 [ 503.908626] kfd_process_wq_release+0x6b/0x370 [amdgpu] [ 503.914081] process_one_work+0x654/0x10a0 [ 503.918186] worker_thread+0x6c3/0xe70 [ 503.921943] ? srso_alias_return_thunk+0x5/0xfbef5 [ 503.926735] ? srso_alias_return_thunk+0x5/0xfbef5 [ 503.931527] ? __kthread_parkme+0x82/0x140 [ 503.935631] ? __pfx_worker_thread+0x10/0x10 [ 503.939904] kthread+0x2a8/0x380 [ 503.943132] ? __pfx_kthread+0x10/0x10 [ 503.946882] ret_from_fork+0x2d/0x70 [ 503.950458] ? __pfx_kthread+0x10/0x10 [ 503.954210] ret_from_fork_asm+0x1a/0x30 [ 503.958142] </TASK> [ 503.960328] ---[ end trace 0000000000000000 ]--- Fixes: 967d226eaae8 ("dma-buf: add WARN_ON() illegal dma-fence signaling") Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2774ef7625adb5fb9e9265c26a59dca7b8fd171e) Cc: stable@vger.kernel.org
2025-01-06drm/amd/display: Add check for granularity in dml ceil/floor helpersRoman Li
[Why] Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2() should check for granularity is non zero to avoid assert and divide-by-zero error in dcn_bw_ functions. [How] Add check for granularity 0. Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f6e09701c3eb2ccb8cb0518e0b67f1c69742a4ec) Cc: stable@vger.kernel.org
2025-01-06drm/amdkfd: fixed page fault when enable MES shader debuggerJesse.zhang@amd.com
Initialize the process context address before setting the shader debugger. [ 260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.781236] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40 [ 260.781270] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.781284] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 260.781296] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.781308] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.781320] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.781332] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782039] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41 [ 260.782073] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.782087] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1 [ 260.782098] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.782110] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.782122] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.782137] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782166] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5b231f5bc9ff02ec5737f2ec95cdf15ac95088e9) Cc: stable@vger.kernel.org
2025-01-06drm/amd/display: fix divide error in DM plane scale calcsMelissa Wen
dm_get_plane_scale doesn't take into account plane scaled size equal to zero, leading to a kernel oops due to division by zero. Fix by setting out-scale size as zero when the dst size is zero, similar to what is done by drm_calc_scale(). This issue started with the introduction of cursor ovelay mode that uses this function to assess cursor mode changes via dm_crtc_get_cursor_mode() before checking plane state. [Dec17 17:14] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI [ +0.000018] CPU: 5 PID: 1660 Comm: surface-DP-1 Not tainted 6.10.0+ #231 [ +0.000007] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000004] RIP: 0010:dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000553] Code: 44 0f b7 41 3a 44 0f b7 49 3e 83 e0 0f 48 0f a3 c2 73 21 69 41 28 e8 03 00 00 31 d2 41 f7 f1 31 d2 89 06 69 41 2c e8 03 00 00 <41> f7 f0 89 07 e9 d7 d8 7e e9 44 89 c8 45 89 c1 41 89 c0 eb d4 66 [ +0.000005] RSP: 0018:ffffa8df0de6b8a0 EFLAGS: 00010246 [ +0.000006] RAX: 00000000000003e8 RBX: ffff9ac65c1f6e00 RCX: ffff9ac65d055500 [ +0.000003] RDX: 0000000000000000 RSI: ffffa8df0de6b8b0 RDI: ffffa8df0de6b8b4 [ +0.000004] RBP: ffff9ac64e7a5800 R08: 0000000000000000 R09: 0000000000000a00 [ +0.000003] R10: 00000000000000ff R11: 0000000000000054 R12: ffff9ac6d0700010 [ +0.000003] R13: ffff9ac65d054f00 R14: ffff9ac65d055500 R15: ffff9ac64e7a60a0 [ +0.000004] FS: 00007f869ea00640(0000) GS:ffff9ac970080000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000055ca701becd0 CR3: 000000010e7f2000 CR4: 0000000000350ef0 [ +0.000004] Call Trace: [ +0.000007] <TASK> [ +0.000006] ? __die_body.cold+0x19/0x27 [ +0.000009] ? die+0x2e/0x50 [ +0.000007] ? do_trap+0xca/0x110 [ +0.000007] ? do_error_trap+0x6a/0x90 [ +0.000006] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000504] ? exc_divide_error+0x38/0x50 [ +0.000005] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000488] ? asm_exc_divide_error+0x1a/0x20 [ +0.000011] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000593] dm_crtc_get_cursor_mode+0x33f/0x430 [amdgpu] [ +0.000562] amdgpu_dm_atomic_check+0x2ef/0x1770 [amdgpu] [ +0.000501] drm_atomic_check_only+0x5e1/0xa30 [drm] [ +0.000047] drm_mode_atomic_ioctl+0x832/0xcb0 [drm] [ +0.000050] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000047] drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000062] drm_ioctl+0x27a/0x4f0 [drm] [ +0.000049] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000055] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu] [ +0.000360] __x64_sys_ioctl+0x97/0xd0 [ +0.000010] do_syscall_64+0x82/0x190 [ +0.000008] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000044] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000040] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __check_object_size+0x50/0x220 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? drm_ioctl+0x2a4/0x4f0 [drm] [ +0.000039] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000043] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __pm_runtime_suspend+0x69/0xc0 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu] [ +0.000366] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? syscall_exit_to_user_mode+0x77/0x210 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000008] RIP: 0033:0x55bb7cd962bc [ +0.000007] Code: 4c 89 6c 24 18 4c 89 64 24 20 4c 89 74 24 28 0f 57 c0 0f 11 44 24 30 89 c7 48 8d 54 24 08 b8 10 00 00 00 be bc 64 38 c0 0f 05 <49> 89 c7 48 83 3b 00 74 09 4c 89 c7 ff 15 62 64 99 00 48 83 7b 18 [ +0.000005] RSP: 002b:00007f869e9f4da0 EFLAGS: 00000217 ORIG_RAX: 0000000000000010 [ +0.000007] RAX: ffffffffffffffda RBX: 00007f869e9f4fb8 RCX: 000055bb7cd962bc [ +0.000004] RDX: 00007f869e9f4da8 RSI: 00000000c03864bc RDI: 000000000000003b [ +0.000003] RBP: 000055bb9ddcbcc0 R08: 00007f86541b9920 R09: 0000000000000009 [ +0.000004] R10: 0000000000000004 R11: 0000000000000217 R12: 00007f865406c6b0 [ +0.000003] R13: 00007f86541b5290 R14: 00007f865410b700 R15: 000055bb9ddcbc18 [ +0.000009] </TASK> Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3729 Reported-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Co-developed-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ab75a0d2e07942ae15d32c0a5092fd336451378c) Cc: stable@vger.kernel.org
2025-01-06drm/amd/display: increase MAX_SURFACES to the value supported by hwMelissa Wen
As the hw supports up to 4 surfaces, increase the maximum number of surfaces to prevent the DC error when trying to use more than three planes. [drm:dc_state_add_plane [amdgpu]] *ERROR* Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b8d6daffc871a42026c3c20bff7b8fa0302298c1) Cc: stable@vger.kernel.org
2025-01-06drm/amd/display: fix page fault due to max surface definition mismatchMelissa Wen
DC driver is using two different values to define the maximum number of surfaces: MAX_SURFACES and MAX_SURFACE_NUM. Consolidate MAX_SURFACES as the unique definition for surface updates across DC. It fixes page fault faced by Cosmic users on AMD display versions that support two overlay planes, since the introduction of cursor overlay mode. [Nov26 21:33] BUG: unable to handle page fault for address: 0000000051d0f08b [ +0.000015] #PF: supervisor read access in kernel mode [ +0.000006] #PF: error_code(0x0000) - not-present page [ +0.000005] PGD 0 P4D 0 [ +0.000007] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI [ +0.000006] CPU: 4 PID: 71 Comm: kworker/u32:6 Not tainted 6.10.0+ #300 [ +0.000006] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000007] Workqueue: events_unbound commit_work [drm_kms_helper] [ +0.000040] RIP: 0010:copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000847] Code: 8b 10 49 89 94 24 f8 00 00 00 48 8b 50 08 49 89 94 24 00 01 00 00 8b 40 10 41 89 84 24 08 01 00 00 49 8b 45 78 48 85 c0 74 0b <0f> b6 00 41 88 84 24 90 64 00 00 49 8b 45 60 48 85 c0 74 3b 48 8b [ +0.000010] RSP: 0018:ffffc203802f79a0 EFLAGS: 00010206 [ +0.000009] RAX: 0000000051d0f08b RBX: 0000000000000004 RCX: ffff9f964f0a8070 [ +0.000004] RDX: ffff9f9710f90e40 RSI: ffff9f96600c8000 RDI: ffff9f964f000000 [ +0.000004] RBP: ffffc203802f79f8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000005] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f96600c8000 [ +0.000004] R13: ffff9f9710f90e40 R14: ffff9f964f000000 R15: ffff9f96600c8000 [ +0.000004] FS: 0000000000000000(0000) GS:ffff9f9970000000(0000) knlGS:0000000000000000 [ +0.000005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000005] CR2: 0000000051d0f08b CR3: 00000002e6a20000 CR4: 0000000000350ef0 [ +0.000005] Call Trace: [ +0.000011] <TASK> [ +0.000010] ? __die_body.cold+0x19/0x27 [ +0.000012] ? page_fault_oops+0x15a/0x2d0 [ +0.000014] ? exc_page_fault+0x7e/0x180 [ +0.000009] ? asm_exc_page_fault+0x26/0x30 [ +0.000013] ? copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000739] ? dc_commit_state_no_check+0xd6c/0xe70 [amdgpu] [ +0.000470] update_planes_and_stream_state+0x49b/0x4f0 [amdgpu] [ +0.000450] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? commit_minimal_transition_state+0x239/0x3d0 [amdgpu] [ +0.000446] update_planes_and_stream_v2+0x24a/0x590 [amdgpu] [ +0.000464] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? sort+0x31/0x50 [ +0.000007] ? amdgpu_dm_atomic_commit_tail+0x159f/0x3a30 [amdgpu] [ +0.000508] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? amdgpu_crtc_get_scanout_position+0x28/0x40 [amdgpu] [ +0.000377] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x160/0x390 [drm] [ +0.000058] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_default_wait+0x8c/0x260 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? wait_for_completion_timeout+0x13b/0x170 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_wait_timeout+0x108/0x140 [ +0.000010] ? commit_tail+0x94/0x130 [drm_kms_helper] [ +0.000024] ? process_one_work+0x177/0x330 [ +0.000008] ? worker_thread+0x266/0x3a0 [ +0.000006] ? __pfx_worker_thread+0x10/0x10 [ +0.000004] ? kthread+0xd2/0x100 [ +0.000006] ? __pfx_kthread+0x10/0x10 [ +0.000006] ? ret_from_fork+0x34/0x50 [ +0.000004] ? __pfx_kthread+0x10/0x10 [ +0.000005] ? ret_from_fork_asm+0x1a/0x30 [ +0.000011] </TASK> Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode") Suggested-by: Leo Li <sunpeng.li@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1c86c81a86c60f9b15d3e3f43af0363cf56063e7) Cc: stable@vger.kernel.org
2025-01-06drm/amd/display: Remove unnecessary amdgpu_irq_get/putAlex Hung
[WHY & HOW] commit 7fb363c57522 ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts") lets drm_crtc_vblank_* to manage interrupts in amdgpu_dm_crtc_set_vblank, and amdgpu_irq_get/put do not need to be called here. Part of that patch got lost somehow, so fix it up. Fixes: 7fb363c57522 ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3782305ce5807c18fbf092124b9e8303cf1723ae) Cc: stable@vger.kernel.org
2025-01-06Merge tag 'vfs-6.13-rc7.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Relax assertions on failure to encode file handles The ->encode_fh() method can fail for various reasons. None of them warrant a WARN_ON(). - Fix overlayfs file handle encoding by allowing encoding an fid from an inode without an alias - Make sure fuse_dir_open() handles FOPEN_KEEP_CACHE. If it's not specified fuse needs to invaludate the directory inode page cache - Fix qnx6 so it builds with gcc-15 - Various fixes for netfslib and ceph and nfs filesystems: - Ignore silly rename files from afs and nfs when building header archives - Fix read result collection in netfslib with multiple subrequests - Handle ENOMEM for netfslib buffered reads - Fix oops in nfs_netfs_init_request() - Parse the secctx command immediately in cachefiles - Remove a redundant smp_rmb() in netfslib - Handle recursion in read retry in netfslib - Fix clearing of folio_queue - Fix missing cancellation of copy-to_cache when the cache for a file is temporarly disabled in netfslib - Sanity check the hfs root record - Fix zero padding data issues in concurrent write scenarios - Fix is_mnt_ns_file() after converting nsfs to path_from_stashed() - Fix missing declaration of init_files - Increase I/O priority when writing revoke records in jbd2 - Flush filesystem device before updating tail sequence in jbd2 * tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry fuse: respect FOPEN_KEEP_CACHE on opendir netfs: Fix is-caching check in read-retry netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled netfs: Fix ceph copy to cache on write-begin netfs: Work around recursion by abandoning retry if nothing read netfs: Fix missing barriers by using clear_and_wake_up_bit() netfs: Remove redundant use of smp_rmb() cachefiles: Parse the "secctx" immediately nfs: Fix oops in nfs_netfs_init_request() when copying to cache netfs: Fix enomem handling in buffered reads netfs: Fix non-contiguous donation between completed reads kheaders: Ignore silly-rename files fs: relax assertions on failure to encode file handles fs: fix missing declaration of init_files fs: fix is_mnt_ns_file() iomap: fix zero padding data issue in concurrent append writes iomap: pass byte granular end position to iomap_add_to_ioend jbd2: flush filesystem device before updating tail sequence ...
2025-01-06btrfs: zlib: fix avail_in bytes for s390 zlib HW compression pathMikhail Zaslonko
Since the input data length passed to zlib_compress_folios() can be arbitrary, always setting strm.avail_in to a multiple of PAGE_SIZE may cause read-in bytes to exceed the input range. Currently this triggers an assert in btrfs_compress_folios() on the debug kernel (see below). Fix strm.avail_in calculation for S390 hardware acceleration path. assertion failed: *total_in <= orig_len, in fs/btrfs/compression.c:1041 ------------[ cut here ]------------ kernel BUG at fs/btrfs/compression.c:1041! monitor event: 0040 ilc:2 [#1] PREEMPT SMP CPU: 16 UID: 0 PID: 325 Comm: kworker/u273:3 Not tainted 6.13.0-20241204.rc1.git6.fae3b21430ca.300.fc41.s390x+debug #1 Hardware name: IBM 3931 A01 703 (z/VM 7.4.0) Workqueue: btrfs-delalloc btrfs_work_helper Krnl PSW : 0704d00180000000 0000021761df6538 (btrfs_compress_folios+0x198/0x1a0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: 0000000080000000 0000000000000001 0000000000000047 0000000000000000 0000000000000006 ffffff01757bb000 000001976232fcc0 000000000000130c 000001976232fcd0 000001976232fcc8 00000118ff4a0e30 0000000000000001 00000111821ab400 0000011100000000 0000021761df6534 000001976232fb58 Krnl Code: 0000021761df6528: c020006f5ef4 larl %r2,0000021762be2310 0000021761df652e: c0e5ffbd09d5 brasl %r14,00000217615978d8 #0000021761df6534: af000000 mc 0,0 >0000021761df6538: 0707 bcr 0,%r7 0000021761df653a: 0707 bcr 0,%r7 0000021761df653c: 0707 bcr 0,%r7 0000021761df653e: 0707 bcr 0,%r7 0000021761df6540: c004004bb7ec brcl 0,000002176276d518 Call Trace: [<0000021761df6538>] btrfs_compress_folios+0x198/0x1a0 ([<0000021761df6534>] btrfs_compress_folios+0x194/0x1a0) [<0000021761d97788>] compress_file_range+0x3b8/0x6d0 [<0000021761dcee7c>] btrfs_work_helper+0x10c/0x160 [<0000021761645760>] process_one_work+0x2b0/0x5d0 [<000002176164637e>] worker_thread+0x20e/0x3e0 [<000002176165221a>] kthread+0x15a/0x170 [<00000217615b859c>] __ret_from_fork+0x3c/0x60 [<00000217626e72d2>] ret_from_fork+0xa/0x38 INFO: lockdep is turned off. Last Breaking-Event-Address: [<0000021761597924>] _printk+0x4c/0x58 Kernel panic - not syncing: Fatal exception: panic_on_oops Fixes: fd1e75d0105d ("btrfs: make compression path to be subpage compatible") CC: stable@vger.kernel.org # 6.12+ Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06btrfs: zoned: calculate max_extent_size properly on non-zoned setupChristoph Hellwig
Since commit 559218d43ec9 ("block: pre-calculate max_zone_append_sectors"), queue_limits's max_zone_append_sectors is default to be 0 and it is only updated when there is a zoned device. So, we have lim->max_zone_append_sectors = 0 when there is no zoned device in the filesystem. That leads to fs_info->max_zone_append_size and thus fs_info->max_extent_size to be 0, which is wrong and can for example lead to a divide by zero in count_max_extents(). Fix this by only capping fs_info->max_extent_size to fs_info->max_zone_append_size when it is non-zero. Based on a patch from Naohiro Aota <naohiro.aota@wdc.com>, from which much of this commit message is stolen as well. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: 559218d43ec9 ("block: pre-calculate max_zone_append_sectors") Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06btrfs: avoid NULL pointer dereference if no valid extent treeQu Wenruo
[BUG] Syzbot reported a crash with the following call trace: BTRFS info (device loop0): scrub: started on devid 1 BUG: kernel NULL pointer dereference, address: 0000000000000208 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022 RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs] Call Trace: <TASK> scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs] scrub_simple_mirror+0x175/0x260 [btrfs] scrub_stripe+0x5d4/0x6c0 [btrfs] scrub_chunk+0xbb/0x170 [btrfs] scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs] btrfs_scrub_dev+0x240/0x600 [btrfs] btrfs_ioctl+0x1dc8/0x2fa0 [btrfs] ? do_sys_openat2+0xa5/0xf0 __x64_sys_ioctl+0x97/0xc0 do_syscall_64+0x4f/0x120 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> [CAUSE] The reproducer is using a corrupted image where extent tree root is corrupted, thus forcing to use "rescue=all,ro" mount option to mount the image. Then it triggered a scrub, but since scrub relies on extent tree to find where the data/metadata extents are, scrub_find_fill_first_stripe() relies on an non-empty extent root. But unfortunately scrub_find_fill_first_stripe() doesn't really expect an NULL pointer for extent root, it use extent_root to grab fs_info and triggered a NULL pointer dereference. [FIX] Add an extra check for a valid extent root at the beginning of scrub_find_fill_first_stripe(). The new error path is introduced by 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots"), but that's pretty old, and later commit b979547513ff ("btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe") changed how we do scrub. So for kernels older than 6.6, the fix will need manual backport. Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google.com/ Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06Merge tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson: - Fix a missed order alignment requirement of the pfn when inserting mappings through the new huge fault handler introduced in v6.12 (Alex Williamson) * tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio: vfio/pci: Fallback huge faults for unaligned pfn
2025-01-06Merge patch series "Fix encoding overlayfs fid for fanotify delete events"Christian Brauner
Amir Goldstein <amir73il@gmail.com> says: This is a followup fix to the reported regression [1] that was introduced by overlayfs non-decodable file handles support in v6.6. The first fix posted two weeks ago [2] was a quick band aid which is justified on its own and is still queued on your vfs.fixes branch. This followup fix fixes the root cause of overlayfs file handle encoding failure and it also solves a bug with fanotify FAN_DELETE_SELF events on overlayfs, that was discovered from analysis of the first report. The fix to fanotify delete events was verified with a new LTP test [3]. [1] https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ [2] https://lore.kernel.org/linux-fsdevel/20241219115301.465396-1-amir73il@gmail.com/ [3] https://github.com/amir73il/ltp/commits/ovl_encode_fid/ * patches from https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com: ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry Link: https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06ovl: support encoding fid from inode with no aliasAmir Goldstein
Dmitry Safonov reported that a WARN_ON() assertion can be trigered by userspace when calling inotify_show_fdinfo() for an overlayfs watched inode, whose dentry aliases were discarded with drop_caches. The WARN_ON() assertion in inotify_show_fdinfo() was removed, because it is possible for encoding file handle to fail for other reason, but the impact of failing to encode an overlayfs file handle goes beyond this assertion. As shown in the LTP test case mentioned in the link below, failure to encode an overlayfs file handle from a non-aliased inode also leads to failure to report an fid with FAN_DELETE_SELF fanotify events. As Dmitry notes in his analyzis of the problem, ovl_encode_fh() fails if it cannot find an alias for the inode, but this failure can be fixed. ovl_encode_fh() seldom uses the alias and in the case of non-decodable file handles, as is often the case with fanotify fid info, ovl_encode_fh() never needs to use the alias to encode a file handle. Defer finding an alias until it is actually needed so ovl_encode_fh() will not fail in the common case of FAN_DELETE_SELF fanotify events. Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles") Reported-by: Dmitry Safonov <dima@arista.com> Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>