Age | Commit message (Collapse) | Author |
|
Ido Schimmel says:
====================
selftests: forwarding: Various fixes
Fix various problems in the forwarding selftests so that they will pass
in the netdev CI instead of being ignored. See commit messages for
details.
====================
Link: https://lore.kernel.org/r/20240208155529.1199729-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The redirection test case fails in the netdev CI on debug kernels
because an FDB entry is learned despite the presence of a tc filter that
redirects incoming traffic [1].
I am unable to reproduce the failure locally, but I can see how it can
happen given that learning is first enabled and only then the ingress tc
filter is configured. On debug kernels the time window between these two
operations is longer compared to regular kernels, allowing random
packets to be transmitted and trigger learning.
Fix by reversing the order and configure the ingress tc filter before
enabling learning.
[1]
[...]
# TEST: Locked port MAB redirect [FAIL]
# Locked entry created for redirected traffic
Fixes: 38c43a1ce758 ("selftests: forwarding: Add test case for traffic redirection from a locked port")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Suppress the following grep warnings:
[...]
INFO: # Port group entries configuration tests - (*, G)
TEST: Common port group entries configuration tests (IPv4 (*, G)) [ OK ]
TEST: Common port group entries configuration tests (IPv6 (*, G)) [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv4 (*, G) port group entries configuration tests [ OK ]
grep: warning: stray \ before /
grep: warning: stray \ before /
grep: warning: stray \ before /
TEST: IPv6 (*, G) port group entries configuration tests [ OK ]
[...]
They do not fail the test, but do clutter the output.
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: IPv4 host entries forwarding tests [FAIL]
# Packet locally received after flood
Fixes: b6d00da08610 ("selftests: forwarding: Add bridge MDB test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After enabling a multicast querier on the bridge (like the test is
doing), the bridge will wait for the Max Response Delay before starting
to forward according to its MDB in order to let Membership Reports
enough time to be received and processed.
Currently, the test is waiting for exactly the default Max Response
Delay (10 seconds) which is racy and leads to failures [1].
Fix by reducing the Max Response Delay to 1 second.
[1]
[...]
# TEST: L2 miss - Multicast (IPv4) [FAIL]
# Unregistered multicast filter was hit after adding MDB entry
Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208155529.1199729-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The test toggles the carrier of a bridge port in order to test the
bridge backup port feature.
Due to the linkwatch delayed work the carrier change is not always
reflected fast enough to the bridge driver and packets are not forwarded
as the test expects, resulting in failures [1].
Fix by busy waiting on the bridge port state until it changes to the
desired state following the carrier change.
[1]
# Backup port
# -----------
[...]
# TEST: swp1 carrier off [ OK ]
# TEST: No forwarding out of swp1 [FAIL]
[ 641.995910] br0: port 1(swp1) entered disabled state
# TEST: No forwarding out of vx0 [ OK ]
Fixes: b408453053fb ("selftests: net: Add bridge backup port and backup nexthop ID test")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240208123110.1063930-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Space reservations for metadata are, most of the time, pessimistic as we
reserve space for worst possible cases - where tree heights are at the
maximum possible height (8), we need to COW every extent buffer in a tree
path, need to split extent buffers, etc.
For data, we generally reserve the exact amount of space we are going to
allocate. The exception here is when using compression, in which case we
reserve space matching the uncompressed size, as the compression only
happens at writeback time and in the worst possible case we need that
amount of space in case the data is not compressible.
This means that when there's not available space in the corresponding
space_info object, we may need to allocate a new block group, and then
that block group might not be used after all. In this case the block
group is never added to the list of unused block groups and ends up
never being deleted - except if we unmount and mount again the fs, as
when reading block groups from disk we add unused ones to the list of
unused block groups (fs_info->unused_bgs). Otherwise a block group is
only added to the list of unused block groups when we deallocate the
last extent from it, so if no extent is ever allocated, the block group
is kept around forever.
This also means that if we have a bunch of tasks reserving space in
parallel we can end up allocating many block groups that end up never
being used or kept around for too long without being used, which has
the potential to result in ENOSPC failures in case for example we over
allocate too many metadata block groups and then end up in a state
without enough unallocated space to allocate a new data block group.
This is more likely to happen with metadata reservations as of kernel
6.7, namely since commit 28270e25c69a ("btrfs: always reserve space for
delayed refs when starting transaction"), because we started to always
reserve space for delayed references when starting a transaction handle
for a non-zero number of items, and also to try to reserve space to fill
the gap between the delayed block reserve's reserved space and its size.
So to avoid this, when finishing the creation a new block group, add the
block group to the list of unused block groups if it's still unused at
that time. This way the next time the cleaner kthread runs, it will delete
the block group if it's still unused and not needed to satisfy existing
space reservations.
Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
Link: https://lore.kernel.org/linux-btrfs/9cdbf0ca9cdda1b4c84e15e548af7d7f9f926382.camel@intelfx.name/
CC: stable@vger.kernel.org # 6.7+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Before deleting a block group that is in the list of unused block groups
(fs_info->unused_bgs), we check if the block group became used before
deleting it, as extents from it may have been allocated after it was added
to the list.
However even if the block group was not yet used, there may be tasks that
have only reserved space and have not yet allocated extents, and they
might be relying on the availability of the unused block group in order
to allocate extents. The reservation works first by increasing the
"bytes_may_use" field of the corresponding space_info object (which may
first require flushing delayed items, allocating a new block group, etc),
and only later a task does the actual allocation of extents.
For metadata we usually don't end up using all reserved space, as we are
pessimistic and typically account for the worst cases (need to COW every
single node in a path of a tree at maximum possible height, etc). For
data we usually reserve the exact amount of space we're going to allocate
later, except when using compression where we always reserve space based
on the uncompressed size, as compression is only triggered when writeback
starts so we don't know in advance how much space we'll actually need, or
if the data is compressible.
So don't delete an unused block group if the total size of its space_info
object minus the block group's size is less then the sum of used space and
space that may be used (space_info->bytes_may_use), as that means we have
tasks that reserved space and may need to allocate extents from the block
group. In this case, besides skipping the deletion, re-add the block group
to the list of unused block groups so that it may be reconsidered later,
in case the tasks that reserved space end up not needing to allocate
extents from it.
Allowing the deletion of the block group while we have reserved space, can
result in tasks failing to allocate metadata extents (-ENOSPC) while under
a transaction handle, resulting in a transaction abort, or failure during
writeback for the case of data extents.
CC: stable@vger.kernel.org # 6.0+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Add a helper function to determine if a block group is being used and make
use of it at btrfs_delete_unused_bgs(). This helper will also be used in
future code changes.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
While running the CI for an unrelated change I hit the following panic
with generic/648 on btrfs_holes_spacecache.
assertion failed: block_start != EXTENT_MAP_HOLE, in fs/btrfs/extent_io.c:1385
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent_io.c:1385!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 2695096 Comm: fsstress Kdump: loaded Tainted: G W 6.8.0-rc2+ #1
RIP: 0010:__extent_writepage_io.constprop.0+0x4c1/0x5c0
Call Trace:
<TASK>
extent_write_cache_pages+0x2ac/0x8f0
extent_writepages+0x87/0x110
do_writepages+0xd5/0x1f0
filemap_fdatawrite_wbc+0x63/0x90
__filemap_fdatawrite_range+0x5c/0x80
btrfs_fdatawrite_range+0x1f/0x50
btrfs_write_out_cache+0x507/0x560
btrfs_write_dirty_block_groups+0x32a/0x420
commit_cowonly_roots+0x21b/0x290
btrfs_commit_transaction+0x813/0x1360
btrfs_sync_file+0x51a/0x640
__x64_sys_fdatasync+0x52/0x90
do_syscall_64+0x9c/0x190
entry_SYSCALL_64_after_hwframe+0x6e/0x76
This happens because we fail to write out the free space cache in one
instance, come back around and attempt to write it again. However on
the second pass through we go to call btrfs_get_extent() on the inode to
get the extent mapping. Because this is a new block group, and with the
free space inode we always search the commit root to avoid deadlocking
with the tree, we find nothing and return a EXTENT_MAP_HOLE for the
requested range.
This happens because the first time we try to write the space cache out
we hit an error, and on an error we drop the extent mapping. This is
normal for normal files, but the free space cache inode is special. We
always expect the extent map to be correct. Thus the second time
through we end up with a bogus extent map.
Since we're deprecating this feature, the most straightforward way to
fix this is to simply skip dropping the extent map range for this failed
range.
I shortened the test by using error injection to stress the area to make
it easier to reproduce. With this patch in place we no longer panic
with my error injection test.
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fix missing TLB flush during early boot on SPARSEMEM_VMEMMAP
configurations
- fixes to correctly implement the break-before-make behavior requried
by the ISA for NAPOT mappings
- fix a missing TLB flush on intermediate mapping changes
- fix build warning about a missing declaration of overflow_stack
- fix performace regression related to incorrect tracking of completed
batch TLB flushes
* tag 'riscv-for-linus-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix arch_tlbbatch_flush() by clearing the batch cpumask
riscv: declare overflow_stack as exported from traps.c
riscv: Fix arch_hugetlb_migration_supported() for NAPOT
riscv: Flush the tlb when a page directory is freed
riscv: Fix hugetlb_mask_last_page() when NAPOT is enabled
riscv: Fix set_huge_pte_at() for NAPOT mapping
riscv: mm: execute local TLB flush after populating vmemmap
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix broken direct trampolines being called when another callback is
attached the same function.
ARM 64 does not support FTRACE_WITH_REGS, and when it added direct
trampoline calls from ftrace, it removed the "WITH_REGS" flag from
the ftrace_ops for direct trampolines. This broke x86 as x86 requires
direct trampolines to have WITH_REGS.
This wasn't noticed because direct trampolines work as long as the
function it is attached to is not shared with other callbacks (like
the function tracer). When there are other callbacks, a helper
trampoline is called, to call all the non direct callbacks and when
it returns, the direct trampoline is called.
For x86, the direct trampoline sets a flag in the regs field to tell
the x86 specific code to call the direct trampoline. But this only
works if the ftrace_ops had WITH_REGS set. ARM does things
differently that does not require this. For now, set WITH_REGS if the
arch supports WITH_REGS (which ARM does not), and this makes it work
for both ARM64 and x86.
- Fix wasted memory in the saved_cmdlines logic.
The saved_cmdlines is a cache that maps PIDs to COMMs that tracing
can use. Most trace events only save the PID in the event. The
saved_cmdlines file lists PIDs to COMMs so that the tracing tools can
show an actual name and not just a PID for each event. There's an
array of PIDs that map to a small set of saved COMM strings. The
array is set to PID_MAX_DEFAULT which is usually set to 32768. When a
PID comes in, it will add itself to this array along with the index
into the COMM array (note if the system allows more than
PID_MAX_DEFAULT, this cache is similar to cache lines as an update of
a PID that has the same PID_MAX_DEFAULT bits set will flush out
another task with the same matching bits set).
A while ago, the size of this cache was changed to be dynamic and the
array was moved into a structure and created with kmalloc(). But this
new structure had the size of 131104 bytes, or 0x20020 in hex. As
kmalloc allocates in powers of two, it was actually allocating
0x40000 bytes (262144) leaving 131040 bytes of wasted memory. The
last element of this structure was a pointer to the COMM string array
which defaulted to just saving 128 COMMs.
By changing the last field of this structure to a variable length
string, and just having it round up to fill the allocated memory, the
default size of the saved COMM cache is now 8190. This not only uses
the wasted space, but actually saves space by removing the extra
allocation for the COMM names.
* tag 'trace-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix wasted memory in saved_cmdlines logic
ftrace: Fix DIRECT_CALLS to use SAVE_REGS by default
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- remove unnecessary initial values of kprobes local variables
- probe-events parser bug fixes:
- calculate the argument size and format string after setting type
information from BTF, because BTF can change the size and format
string.
- show $comm parse error correctly instead of failing silently.
* tag 'probes-fixes-v6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobes: Remove unnecessary initial values of variables
tracing/probes: Fix to set arg size and fmt after setting type from BTF
tracing/probes: Fix to show a parse error for bad type for $comm
|
|
The commit noted in fixes added a bogus requirement that runtime PM managed
devices need to be in the RPM_ACTIVE state for PME polling. In fact, only
devices in low power states should be polled.
However there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present. It's not sufficient to assume
the bridge remains in D0 and cases have been observed where the bridge
passes the D0 test, but the PM state indicates RPM_SUSPENDING and config
space of the polled device becomes inaccessible during pci_pme_wakeup().
Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.
This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.
Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com
Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S <sanath.s@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sanath S <sanath.s@amd.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"The only notable change here is the patch that changes the way we deal
with spurious errors from the EFI memory attribute protocol. This will
be backported to v6.6, and is intended to ensure that we will not
paint ourselves into a corner when we tighten this further in order to
comply with MS requirements on signed EFI code.
Note that this protocol does not currently exist in x86 production
systems in the field, only in Microsoft's fork of OVMF, but it will be
mandatory for Windows logo certification for x86 PCs in the future.
- Tighten ELF relocation checks on the RISC-V EFI stub
- Give up if the new EFI memory attributes protocol fails spuriously
on x86
- Take care not to place the kernel in the lowest 16 MB of DRAM on
x86
- Omit special purpose EFI memory from memblock
- Some fixes for the CXL CPER reporting code
- Make the PE/COFF layout of mixed-mode capable images comply with a
strict interpretation of the spec"
* tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
cxl/trace: Remove unnecessary memcpy's
cxl/cper: Fix errant CPER prints for CXL events
efi: Don't add memblocks for soft-reserved memory
efi: runtime: Fix potential overflow of soft-reserved region size
efi/libstub: Add one kernel-doc comment
x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR
x86/efistub: Give up if memory attribute protocol returns an error
riscv/efistub: Tighten ELF relocation check
riscv/efistub: Ensure GP-relative addressing is not used
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Fix an unintentional truncation of DWC MSI-X address to 32 bits and
update similar MSI code to match (Dan Carpenter)
* tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: dwc: Clean up dw_pcie_ep_raise_msi_irq() alignment
PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- coretemp: Various fixes, and increase number of supported CPU cores
- aspeed-pwm-tacho: Add missing mutex protection
* tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Enlarge per package core count limit
hwmon: (coretemp) Fix bogus core_id to attr name mapping
hwmon: (coretemp) Fix out-of-bounds memory access
hwmon: (aspeed-pwm-tacho) mutex for tach reading
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Allow non-sleeping read-only slot-gpio
MMC host:
- sdhci-pci-o2micro: Fix a warm reboot BIOS issue"
* tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: slot-gpio: Allow non-sleeping GPIO ro
mmc: sdhci-pci-o2micro: Fix a warm reboot issue that disk can't be detected by BIOS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"Core:
- Move the unused cleanup to a _sync initcall
Providers:
- mediatek: Fix race conditions at probe/remove with genpd
- renesas: r8a77980-sysc: CR7 must be always on"
* tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: mediatek: fix race conditions with genpd
pmdomain: renesas: r8a77980-sysc: CR7 must be always on
pmdomain: core: Move the unused cleanup to a _sync initcall
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
- remove the new GPIO device from the global list unconditionally in
error path in core GPIOLIB
* tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: remove GPIO device from the list unconditionally in error path
|
|
Pull drm fixes from Dave Airlie:
"Regular weekly fixes, xe, amdgpu and msm are most of them, with some
misc in i915, ivpu and nouveau, scattered but nothing too intense at
this point.
i915:
- gvt: docs fix, uninit var, MAINTAINERS
ivpu:
- add aborted job status
- disable d3 hot delay
- mmu fixes
nouveau:
- fix gsp rpc size request
- fix dma buffer leaks
- use common code for gsp mem ctor
xe:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix
msm:
- DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
- DP:
- fixes to handle unknown bpc case correctly for DP
- fix for MISC0 programming
- GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable"
* tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm: (43 commits)
drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
drm/xe/vm: don't ignore error when in_kthread
drm/xe: Assume large page size if VMA not yet bound
drm/xe/display: Fix memleak in display initialization
drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
drm/xe: circumvent bogus stringop-overflow warning
drm/xe: Pick correct userptr VMA to repin on REMAP op failure
drm/xe: Take a reference in xe_exec_queue_last_fence_get()
drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
drm/amd/display: Implement bounds check for stream encoder creation in DCN301
drm/amd/display: Increase frame-larger-than for all display_mode_vba files
drm/amd/display: Clear phantom stream count and plane count
drm/amdgpu: Avoid fetching VRAM vendor info
drm/amd/display: Disable ODM by default for DCN35
drm/amd/display: Update phantom pipe enable / disable sequence
drm/amd/display: Fix MST Null Ptr for RV
drm/amdgpu: Fix shared buff copy to user
drm/amd/display: Increase eval/entry delay for DCN35
drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
...
|
|
AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
report incorrect child count. The failing crosspoints report 8 children
while they only have two.
When the driver tries to access the inexistent child nodes, it believes it
has reached an invalid node type and probing fails. The workaround is to
ignore those incorrect child nodes and continue normally.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
[ rm: rewrote simpler generalised version ]
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The generic constraint "i" seems to be copied from x86 or arm (and with
a redundant generic operand modifier "c"). It works with -fno-PIE but
not with -fPIE/-fPIC in GCC's aarch64 port.
The machine constraint "S", which denotes a symbol or label reference
with a constant offset, supports PIC and has been available in GCC since
2012 and in Clang since 7.0. However, Clang before 19 does not support
"S" on a symbol with a constant offset [1] (e.g.
`static_key_false(&nf_hooks_needed[pf][hook])` in
include/linux/netfilter.h), so we use "i" as a fallback.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Fangrui Song <maskray@google.com>
Link: https://github.com/llvm/llvm-project/pull/80255 [1]
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240206074552.541154-1-maskray@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
fix typo in comments
thath -> that
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Link: https://lore.kernel.org/r/20240202013306.883777-1-sgsu.park@samsung.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
S2M NDR BI-ConflictAck opcode is described as 4 in the CXL
r3.0 3.3.9 Table 3.43. However, it is defined as 3 in macro definition.
Fixes: 5d7107c72796 ("perf: CXL Performance Monitoring Unit driver")
Signed-off-by: Hojin Nam <hj96.nam@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240208013415epcms2p2904187c8a863f4d0d2adc980fb91a2dc@epcms2p2
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When we are in a syscall we will only save the FPSIMD subset even though
the task still has access to the full register set, and on context switch
we will only remove TIF_SVE when loading the register state. This means
that the signal handling code should not assume that TIF_SVE means that
the register state is stored in SVE format, it should instead check the
format that was recorded during save.
Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240130-arm64-sve-signal-regs-v2-1-9fc6f9502782@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The kernel built with MCRUSOE is unbootable on Transmeta Crusoe. It shows
the following error message:
This kernel requires an i686 CPU, but only detected an i586 CPU.
Unable to boot - please use a kernel appropriate for your CPU.
Remove MCRUSOE from the condition introduced in commit in Fixes, effectively
changing X86_MINIMUM_CPU_FAMILY back to 5 on that machine, which matches the
CPU family given by CPUID.
[ bp: Massage commit message. ]
Fixes: 25d76ac88821 ("x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig")
Signed-off-by: Aleksander Mazur <deweloper@wp.pl>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240123134309.1117782-1-deweloper@wp.pl
|
|
the driver currently fails to compile on 6.8-rc3 due to:
| spi-ppc4xx.c: In function ‘spi_ppc4xx_of_probe’:
| @346:36: error: invalid use of undefined type ‘struct platform_device’
| 346 | struct device_node *np = op->dev.of_node;
| | ^~
| ... (more similar errors)
it was working with 6.7. Looks like it only needed the include
and its compiling fine!
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Link: https://lore.kernel.org/r/3eb3f9c4407ba99d1cd275662081e46b9e839173.1707490664.git.chunkeey@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver never uses the IRQ1_CFG register so there's no need to provide
a default value. It's set as a readable register only for debugging
through the regmap registers file.
A system-specific firmware could overwrite this register with a non-default
value. Therefore the driver can't hardcode what the initial value actually
is. As the register is only for debugging the value can be left unknown
until someone wants to read it through debugfs.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20240209145700.1555950-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Commit 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic
CS35L56 amplifier") adds configs SND_HDA_SCODEC_CS35L56_{I2C,SPI},
which selects the non-existing config CS_DSP. Note the renaming in
commit d7cfdf17cb9d ("firmware: cs_dsp: Rename KConfig symbol CS_DSP ->
FW_CS_DSP"), though.
Select the intended config FW_CS_DSP.
This broken select command probably was not noticed as the configs also
select SND_HDA_CS_DSP_CONTROLS and this then selects FW_CS_DSP. So, the
select FW_CS_DSP could actually be dropped, but we will keep this
redundancy in place as the author originally also intended to have this
redundancy of selects in place.
Fixes: 73cfbfa9caea ("ALSA: hda/cs35l56: Add driver for Cirrus Logic CS35L56 amplifier")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20240209082044.3981-1-lukas.bulwahn@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
While looking at improving the saved_cmdlines cache I found a huge amount
of wasted memory that should be used for the cmdlines.
The tracing data saves pids during the trace. At sched switch, if a trace
occurred, it will save the comm of the task that did the trace. This is
saved in a "cache" that maps pids to comms and exposed to user space via
the /sys/kernel/tracing/saved_cmdlines file. Currently it only caches by
default 128 comms.
The structure that uses this creates an array to store the pids using
PID_MAX_DEFAULT (which is usually set to 32768). This causes the structure
to be of the size of 131104 bytes on 64 bit machines.
In hex: 131104 = 0x20020, and since the kernel allocates generic memory in
powers of two, the kernel would allocate 0x40000 or 262144 bytes to store
this structure. That leaves 131040 bytes of wasted space.
Worse, the structure points to an allocated array to store the comm names,
which is 16 bytes times the amount of names to save (currently 128), which
is 2048 bytes. Instead of allocating a separate array, make the structure
end with a variable length string and use the extra space for that.
This is similar to a recommendation that Linus had made about eventfs_inode names:
https://lore.kernel.org/all/20240130190355.11486-5-torvalds@linux-foundation.org/
Instead of allocating a separate string array to hold the saved comms,
have the structure end with: char saved_cmdlines[]; and round up to the
next power of two over sizeof(struct saved_cmdline_buffers) + num_cmdlines * TASK_COMM_LEN
It will use this extra space for the saved_cmdline portion.
Now, instead of saving only 128 comms by default, by using this wasted
space at the end of the structure it can save over 8000 comms and even
saves space by removing the need for allocating the other array.
Link: https://lore.kernel.org/linux-trace-kernel/20240209063622.1f7b6d5f@rorschach.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Mete Durlu <meted@linux.ibm.com>
Fixes: 939c7a4f04fcd ("tracing: Introduce saved_cmdlines_size file")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Similar to the existing "ports" node name, coresight device tree bindings
have added "in-ports" and "out-ports" as standard node names for a
collection of ports.
Add support for these name to of_graph_get_port_parent() so that
remote-endpoint parsing can find the correct parent node for these
coresight ports too.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-4-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
remote-endpoint properties created a fwnode link from the consumer device
to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
trying to create device links or detecting cycles. So, improve this the
same way we improved finding the consumer of a remote-endpoint property.
Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-3-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
We have a more accurate function to find the right consumer of a
remote-endpoint property instead of searching for a parent with
compatible string property. So, use that instead. While at it, make the
code to find the consumer a bit more flexible and based on the property
being parsed.
Fixes: f7514a663016 ("of: property: fw_devlink: Add support for remote-endpoint")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-2-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
This reverts commit 398aa9a7e77cf23c2a6f882ddd3dcd96f21771dc.
The update to the gadget API to support EBC feature is incomplete. It's
missing at least the following:
* New usage documentation
* Gadget capability check
* Condition for the user to check how many and which endpoints can be
used as "fifo_mode"
* Description of how it can affect completed request (e.g. dwc3 won't
update TRB on completion -- ie. how it can affect request's actual
length report)
Let's revert this until it's ready.
Fixes: 398aa9a7e77c ("usb: dwc3: Support EBC feature of DWC_usb31")
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/3042f847ff904b4dd4e4cf66a1b9df470e63439e.1707441690.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The commit 60c8971899f3 ("ftrace: Make DIRECT_CALLS work WITH_ARGS
and !WITH_REGS") changed DIRECT_CALLS to use SAVE_ARGS when there
are multiple ftrace_ops at the same function, but since the x86 only
support to jump to direct_call from ftrace_regs_caller, when we set
the function tracer on the same target function on x86, ftrace-direct
does not work as below (this actually works on arm64.)
At first, insmod ftrace-direct.ko to put a direct_call on
'wake_up_process()'.
# insmod kernel/samples/ftrace/ftrace-direct.ko
# less trace
...
<idle>-0 [006] ..s1. 564.686958: my_direct_func: waking up rcu_preempt-17
<idle>-0 [007] ..s1. 564.687836: my_direct_func: waking up kcompactd0-63
<idle>-0 [006] ..s1. 564.690926: my_direct_func: waking up rcu_preempt-17
<idle>-0 [006] ..s1. 564.696872: my_direct_func: waking up rcu_preempt-17
<idle>-0 [007] ..s1. 565.191982: my_direct_func: waking up kcompactd0-63
Setup a function filter to the 'wake_up_process' too, and enable it.
# cd /sys/kernel/tracing/
# echo wake_up_process > set_ftrace_filter
# echo function > current_tracer
# less trace
...
<idle>-0 [006] ..s3. 686.180972: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s3. 686.186919: wake_up_process <-call_timer_fn
<idle>-0 [002] ..s3. 686.264049: wake_up_process <-call_timer_fn
<idle>-0 [002] d.h6. 686.515216: wake_up_process <-kick_pool
<idle>-0 [002] d.h6. 686.691386: wake_up_process <-kick_pool
Then, only function tracer is shown on x86.
But if you enable 'kprobe on ftrace' event (which uses SAVE_REGS flag)
on the same function, it is shown again.
# echo 'p wake_up_process' >> dynamic_events
# echo 1 > events/kprobes/p_wake_up_process_0/enable
# echo > trace
# less trace
...
<idle>-0 [006] ..s2. 2710.345919: p_wake_up_process_0: (wake_up_process+0x4/0x20)
<idle>-0 [006] ..s3. 2710.345923: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s1. 2710.345928: my_direct_func: waking up rcu_preempt-17
<idle>-0 [006] ..s2. 2710.349931: p_wake_up_process_0: (wake_up_process+0x4/0x20)
<idle>-0 [006] ..s3. 2710.349934: wake_up_process <-call_timer_fn
<idle>-0 [006] ..s1. 2710.349937: my_direct_func: waking up rcu_preempt-17
To fix this issue, use SAVE_REGS flag for multiple ftrace_ops flag of
direct_call by default.
Link: https://lore.kernel.org/linux-trace-kernel/170484558617.178953.1590516949390270842.stgit@devnote2
Fixes: 60c8971899f3 ("ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS")
Cc: stable@vger.kernel.org
Cc: Florent Revest <revest@chromium.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
__le32 and __le64 types should be used with le32_to_cpu() and
le64_to_cpu() and __user is needed for pointers referencing
guest memory, as sparse helpfully points out.
Fixes: e9f12b5fff8a ("RISC-V: KVM: Implement SBI STA extension")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401020142.lwFEDK5v-lkp@intel.com/
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
__le32 and __le64 types should be used with le32_to_cpu() and
le64_to_cpu(), as sparse helpfully points out.
Fixes: fdf68acccfc6 ("RISC-V: paravirt: Implement steal-time support")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401011933.hL9zqmKo-lkp@intel.com/
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
steal_time is not used outside paravirt.c, make it static,
as sparse suggested.
Fixes: fdf68acccfc6 ("RISC-V: paravirt: Implement steal-time support")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The reuseport_addr_any.sh is currently skipping DCCP tests and
pmtu.sh is skipping all the FOU/GUE related cases: add the missing
options.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/38d3ca7f909736c1aef56e6244d67c82a9bba6ff.1707326987.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Command example string is not read as command.
Fix command annotation.
Fixes: a8ce7b26a51e ("devlink: Expose port function commands to control migratable")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206161717.466653-1-parav@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Do not report the XDP capability NETDEV_XDP_ACT_XSK_ZEROCOPY as the
bonding driver does not support XDP and AF_XDP in zero-copy mode even
if the real NIC drivers do.
Note that the driver used to report everything as supported before a
device was bonded. Instead of just masking out the zero-copy support
from this, have the driver report that no XDP feature is supported
until a real device is bonded. This seems to be more truthful as it is
the real drivers that decide what XDP features are supported.
Fixes: cb9e6e584d58 ("bonding: add xdp_features support")
Reported-by: Prashant Batra <prbatra.mail@gmail.com>
Link: https://lore.kernel.org/all/CAJ8uoz2ieZCopgqTvQ9ZY6xQgTbujmC6XkMTamhp68O-h_-rLg@mail.gmail.com/T/
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20240207084737.20890-1-magnus.karlsson@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, handshake_req_destroy_test1 started failing:
Expected handshake_req_destroy_test == req, but
handshake_req_destroy_test == 0000000000000000
req == 0000000060f99b40
not ok 11 req_destroy works
This is because "sock_release(sock)" was replaced with "fput(filp)"
to address a memory leak. Note that sock_release() is synchronous
but fput() usually delays the final close and clean-up.
The delay is not consequential in the other cases that were changed
but handshake_req_destroy_test1 is testing that handshake_req_cancel()
followed by closing the file actually does call the ->hp_destroy
method. Thus the PTR_EQ test at the end has to be sure that the
final close is complete before it checks the pointer.
We cannot use a completion here because if ->hp_destroy is never
called (ie, there is an API bug) then the test will hang.
Reported by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/netdev/ZcKDd1to4MPANCrn@tissot.1015granger.net/T/#mac5c6299f86799f1c71776f3a07f9c566c7c3c40
Fixes: 4a0f07d71b04 ("net/handshake: Fix memory leak in __sock_create() and sock_alloc_file()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/170724699027.91401.7839730697326806733.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I managed to hit following use after free warning recently:
[ 2169.711665] ==================================================================
[ 2169.714009] BUG: KASAN: slab-use-after-free in __run_timers.part.0+0x179/0x4c0
[ 2169.716293] Write of size 8 at addr ffff88812b326a70 by task swapper/4/0
[ 2169.719022] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 6.8.0-rc2jiri+ #2
[ 2169.720974] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 2169.722457] Call Trace:
[ 2169.722756] <IRQ>
[ 2169.723024] dump_stack_lvl+0x58/0xb0
[ 2169.723417] print_report+0xc5/0x630
[ 2169.723807] ? __virt_addr_valid+0x126/0x2b0
[ 2169.724268] kasan_report+0xbe/0xf0
[ 2169.724667] ? __run_timers.part.0+0x179/0x4c0
[ 2169.725116] ? __run_timers.part.0+0x179/0x4c0
[ 2169.725570] __run_timers.part.0+0x179/0x4c0
[ 2169.726003] ? call_timer_fn+0x320/0x320
[ 2169.726404] ? lock_downgrade+0x3a0/0x3a0
[ 2169.726820] ? kvm_clock_get_cycles+0x14/0x20
[ 2169.727257] ? ktime_get+0x92/0x150
[ 2169.727630] ? lapic_next_deadline+0x35/0x60
[ 2169.728069] run_timer_softirq+0x40/0x80
[ 2169.728475] __do_softirq+0x1a1/0x509
[ 2169.728866] irq_exit_rcu+0x95/0xc0
[ 2169.729241] sysvec_apic_timer_interrupt+0x6b/0x80
[ 2169.729718] </IRQ>
[ 2169.729993] <TASK>
[ 2169.730259] asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 2169.730755] RIP: 0010:default_idle+0x13/0x20
[ 2169.731190] Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 9a 7f 1f 02 85 c0 7e 07 0f 00 2d cf 69 43 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 c0 93 04 00
[ 2169.732759] RSP: 0018:ffff888100dbfe10 EFLAGS: 00000242
[ 2169.733264] RAX: 0000000000000001 RBX: ffff888100d9c200 RCX: ffffffff8241bd62
[ 2169.733925] RDX: ffffed109a848b15 RSI: 0000000000000004 RDI: ffffffff8127ac55
[ 2169.734566] RBP: 0000000000000004 R08: 0000000000000000 R09: ffffed109a848b14
[ 2169.735200] R10: ffff8884d42458a3 R11: 000000000000ba7e R12: ffffffff83d7d3a0
[ 2169.735835] R13: 1ffff110201b7fc6 R14: 0000000000000000 R15: ffff888100d9c200
[ 2169.736478] ? ct_kernel_exit.constprop.0+0xa2/0xc0
[ 2169.736954] ? do_idle+0x285/0x290
[ 2169.737323] default_idle_call+0x63/0x90
[ 2169.737730] do_idle+0x285/0x290
[ 2169.738089] ? arch_cpu_idle_exit+0x30/0x30
[ 2169.738511] ? mark_held_locks+0x1a/0x80
[ 2169.738917] ? lockdep_hardirqs_on_prepare+0x12e/0x200
[ 2169.739417] cpu_startup_entry+0x30/0x40
[ 2169.739825] start_secondary+0x19a/0x1c0
[ 2169.740229] ? set_cpu_sibling_map+0xbd0/0xbd0
[ 2169.740673] secondary_startup_64_no_verify+0x15d/0x16b
[ 2169.741179] </TASK>
[ 2169.741686] Allocated by task 1098:
[ 2169.742058] kasan_save_stack+0x1c/0x40
[ 2169.742456] kasan_save_track+0x10/0x30
[ 2169.742852] __kasan_kmalloc+0x83/0x90
[ 2169.743246] mlx5_dpll_probe+0xf5/0x3c0 [mlx5_dpll]
[ 2169.743730] auxiliary_bus_probe+0x62/0xb0
[ 2169.744148] really_probe+0x127/0x590
[ 2169.744534] __driver_probe_device+0xd2/0x200
[ 2169.744973] device_driver_attach+0x6b/0xf0
[ 2169.745402] bind_store+0x90/0xe0
[ 2169.745761] kernfs_fop_write_iter+0x1df/0x2a0
[ 2169.746210] vfs_write+0x41f/0x790
[ 2169.746579] ksys_write+0xc7/0x160
[ 2169.746947] do_syscall_64+0x6f/0x140
[ 2169.747333] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 2169.748049] Freed by task 1220:
[ 2169.748393] kasan_save_stack+0x1c/0x40
[ 2169.748789] kasan_save_track+0x10/0x30
[ 2169.749188] kasan_save_free_info+0x3b/0x50
[ 2169.749621] poison_slab_object+0x106/0x180
[ 2169.750044] __kasan_slab_free+0x14/0x50
[ 2169.750451] kfree+0x118/0x330
[ 2169.750792] mlx5_dpll_remove+0xf5/0x110 [mlx5_dpll]
[ 2169.751271] auxiliary_bus_remove+0x2e/0x40
[ 2169.751694] device_release_driver_internal+0x24b/0x2e0
[ 2169.752191] unbind_store+0xa6/0xb0
[ 2169.752563] kernfs_fop_write_iter+0x1df/0x2a0
[ 2169.753004] vfs_write+0x41f/0x790
[ 2169.753381] ksys_write+0xc7/0x160
[ 2169.753750] do_syscall_64+0x6f/0x140
[ 2169.754132] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 2169.754847] Last potentially related work creation:
[ 2169.755315] kasan_save_stack+0x1c/0x40
[ 2169.755709] __kasan_record_aux_stack+0x9b/0xf0
[ 2169.756165] __queue_work+0x382/0x8f0
[ 2169.756552] call_timer_fn+0x126/0x320
[ 2169.756941] __run_timers.part.0+0x2ea/0x4c0
[ 2169.757376] run_timer_softirq+0x40/0x80
[ 2169.757782] __do_softirq+0x1a1/0x509
[ 2169.758387] Second to last potentially related work creation:
[ 2169.758924] kasan_save_stack+0x1c/0x40
[ 2169.759322] __kasan_record_aux_stack+0x9b/0xf0
[ 2169.759773] __queue_work+0x382/0x8f0
[ 2169.760156] call_timer_fn+0x126/0x320
[ 2169.760550] __run_timers.part.0+0x2ea/0x4c0
[ 2169.760978] run_timer_softirq+0x40/0x80
[ 2169.761381] __do_softirq+0x1a1/0x509
[ 2169.761998] The buggy address belongs to the object at ffff88812b326a00
which belongs to the cache kmalloc-256 of size 256
[ 2169.763061] The buggy address is located 112 bytes inside of
freed 256-byte region [ffff88812b326a00, ffff88812b326b00)
[ 2169.764346] The buggy address belongs to the physical page:
[ 2169.764866] page:000000000f2b1e89 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12b324
[ 2169.765731] head:000000000f2b1e89 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 2169.766484] anon flags: 0x200000000000840(slab|head|node=0|zone=2)
[ 2169.767048] page_type: 0xffffffff()
[ 2169.767422] raw: 0200000000000840 ffff888100042b40 0000000000000000 dead000000000001
[ 2169.768183] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[ 2169.768899] page dumped because: kasan: bad access detected
[ 2169.769649] Memory state around the buggy address:
[ 2169.770116] ffff88812b326900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.770805] ffff88812b326980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.771485] >ffff88812b326a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2169.772173] ^
[ 2169.772787] ffff88812b326a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2169.773477] ffff88812b326b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.774160] ==================================================================
[ 2169.774845] ==================================================================
I didn't manage to reproduce it. Though the issue seems to be obvious.
There is a chance that the mlx5_dpll_remove() calls
cancel_delayed_work() when the work runs and manages to re-arm itself.
In that case, after delay timer triggers next attempt to queue it,
it works with freed memory.
Fix this by using cancel_delayed_work_sync() instead which makes sure
that work is done when it returns.
Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206164328.360313-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, I've been hitting following deadlock warning during dpll pin
dump:
[52804.637962] ======================================================
[52804.638536] WARNING: possible circular locking dependency detected
[52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
[52804.639529] ------------------------------------------------------
[52804.640104] python3/2984 is trying to acquire lock:
[52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
[52804.641417]
but task is already holding lock:
[52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
[52804.642747]
which lock already depends on the new lock.
[52804.643551]
the existing dependency chain (in reverse order) is:
[52804.644259]
-> #1 (dpll_lock){+.+.}-{3:3}:
[52804.644836] lock_acquire+0x174/0x3e0
[52804.645271] __mutex_lock+0x119/0x1150
[52804.645723] dpll_lock_dumpit+0x13/0x20
[52804.646169] genl_start+0x266/0x320
[52804.646578] __netlink_dump_start+0x321/0x450
[52804.647056] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.647575] genl_rcv_msg+0x1ed/0x3b0
[52804.648001] netlink_rcv_skb+0xdc/0x210
[52804.648440] genl_rcv+0x24/0x40
[52804.648831] netlink_unicast+0x2f1/0x490
[52804.649290] netlink_sendmsg+0x36d/0x660
[52804.649742] __sock_sendmsg+0x73/0xc0
[52804.650165] __sys_sendto+0x184/0x210
[52804.650597] __x64_sys_sendto+0x72/0x80
[52804.651045] do_syscall_64+0x6f/0x140
[52804.651474] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.652001]
-> #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
[52804.652650] check_prev_add+0x1ae/0x1280
[52804.653107] __lock_acquire+0x1ed3/0x29a0
[52804.653559] lock_acquire+0x174/0x3e0
[52804.653984] __mutex_lock+0x119/0x1150
[52804.654423] netlink_dump+0xb3/0x780
[52804.654845] __netlink_dump_start+0x389/0x450
[52804.655321] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.655842] genl_rcv_msg+0x1ed/0x3b0
[52804.656272] netlink_rcv_skb+0xdc/0x210
[52804.656721] genl_rcv+0x24/0x40
[52804.657119] netlink_unicast+0x2f1/0x490
[52804.657570] netlink_sendmsg+0x36d/0x660
[52804.658022] __sock_sendmsg+0x73/0xc0
[52804.658450] __sys_sendto+0x184/0x210
[52804.658877] __x64_sys_sendto+0x72/0x80
[52804.659322] do_syscall_64+0x6f/0x140
[52804.659752] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.660281]
other info that might help us debug this:
[52804.661077] Possible unsafe locking scenario:
[52804.661671] CPU0 CPU1
[52804.662129] ---- ----
[52804.662577] lock(dpll_lock);
[52804.662924] lock(nlk_cb_mutex-GENERIC);
[52804.663538] lock(dpll_lock);
[52804.664073] lock(nlk_cb_mutex-GENERIC);
[52804.664490]
The issue as follows: __netlink_dump_start() calls control->start(cb)
with nlk->cb_mutex held. In control->start(cb) the dpll_lock is taken.
Then nlk->cb_mutex is released and taken again in netlink_dump(), while
dpll_lock still being held. That leads to ABBA deadlock when another
CPU races with the same operation.
Fix this by moving dpll_lock taking into dumpit() callback which ensures
correct lock taking order.
Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc4
DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
DP:
- fixes to handle unknown bpc case correctly for DP. The current code was
spilling over into other bits of DP configuration register, had to be
fixed to avoid the extra shifts which were causing the spill over
- fix for MISC0 programming in DP driver to program the correct
colorimetry value
GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv+tb1+_cp7ftxcMZbbxE9810rvxeaC50eL=msQ+zkm0g@mail.gmail.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.8-2024-02-08:
amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208165500.4887-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Just includes gvt-fixes-2024-02-05
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcTETgXsejwVwat6@jlahtine-mobl.ger.corp.intel.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer
in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcS2LllawGifubsk@fedora
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A null pointer dereference fix for v3d, a TTM pool initialization fix,
several fixes for nouveau around register size, DMA buffer leaks and API
consistency, a multiple fixes for ivpu around MMU setup, initialization
and firmware interactions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4wsi2i6kgkqdu7nzp4g7hxasbswnrmc5cakgf5zzvnix53u7lr@4rmp7hwblow3
|