When the msk socket is cloned at MPC handshake time, a few
fields are initialized in a racy way outside mptcp_sk_clone()
and the msk socket lock.
The above is due to historical reasons: before commit a88d0092b24b
("mptcp: simplify subflow_syn_recv_sock()"), the first subflow socket
carrying all the needed data was not available yet at msk creation
time.
We can now refactor the code, moving the missing initialization bits
under the socket lock, removing the init race and avoiding some
code duplication.
This will also simplify the next patch, as all msk->first write
accesses are now under the msk socket lock.
Fixes: 0397c6d85f9c ("mptcp: keep unaccepted MPC subflow into join list")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MPTCP can access the first subflow socket in a few spots
outside the socket lock scope. That is actually safe, as MPTCP
will delete the socket itself only after the msk sock close().
Still, such accesses cause a few KCSAN splats, as reported
by Christoph. Silence the harmless warnings by adding a few annotations
around the relevant accesses.
Fixes: 71ba088ce0aa ("mptcp: cleanup accept and poll")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/402
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ondrej reported a functional issue with timeout handling on connect,
along with a nice reproducer.
The problem is that the current mptcp connect waits for both the
MPTCP socket level timeout and the first subflow socket timeout.
The latter is not influenced/touched by the exposed setsockopt().
Overall, the above makes SO_SNDTIMEO a no-op on connect.
Since mptcp_connect is invoked via inet_stream_connect and the
latter properly handles the MPTCP level timeout, we can address the
issue by making the nested subflow level connect always non-blocking.
This also allows simplifying the code a bit, dropping an ugly hack
to handle the fastopen and custom proto_ops connect.
The issue predates the blamed commit below, but the current resolution
requires the infrastructure introduced there.
Fixes: 54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Xin Long says:
====================
rtnetlink: a couple of fixes in linkmsg validation
validate_linkmsg() was introduced to do linkmsg validation for existing
links. However, newly created links also need this linkmsg validation.
Add a validate_linkmsg() check for link creation in Patch 1, and add more
tb checks into validate_linkmsg() in Patch 2 and 3.
====================
Link: https://lore.kernel.org/r/cover.1685548598.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This fixes the issue that dev gro_max_size and gso_ipv4_max_size
can be set to a huge value:
# ip link add dummy1 type dummy
# ip link set dummy1 gro_max_size 4294967295
# ip -d link show dummy1
dummy addrgenmode eui64 ... gro_max_size 4294967295
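The missing bounds check can be sketched as follows. This is a minimal Python model, not the kernel code: `MAX_SIZE_LIMIT` and `validate_max_sizes()` are hypothetical names, and the limit value is an assumed placeholder rather than the kernel's actual constant.

```python
import errno

# Hypothetical upper bound, chosen only for illustration.
MAX_SIZE_LIMIT = 8 * 65535


def validate_max_sizes(attrs: dict) -> int:
    """Return 0 if every size attribute is within bounds, else -EINVAL."""
    for name in ("gro_max_size", "gso_ipv4_max_size"):
        value = attrs.get(name)
        # Attributes that are absent are not validated.
        if value is not None and value > MAX_SIZE_LIMIT:
            return -errno.EINVAL
    return 0
```

With such a check in place, a request like `gro_max_size 4294967295` would be rejected before it reaches the device.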
Fixes: 0fe79f28bfaf ("net: allow gro_max_size to exceed 65536")
Fixes: 9eefedd58ae1 ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These IFLA_GSO_* tb checks should also be done for newly created links;
otherwise, they can be set to a huge value when creating links:
# ip link add dummy1 gso_max_size 4294967295 type dummy
# ip -d link show dummy1
dummy addrgenmode eui64 ... gso_max_size 4294967295
Fixes: 46e6b992c250 ("rtnetlink: allow GSO maximums to be set on device creation")
Fixes: 9eefedd58ae1 ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
validate_linkmsg() was introduced by commit 1840bb13c22f5b ("[RTNL]:
Validate hardware and broadcast address attribute for RTM_NEWLINK")
to validate tb[IFLA_ADDRESS/BROADCAST] for existing links. The same
check should also be done for newly created links.
This patch adds a validate_linkmsg() call in rtnl_create_link() to
avoid an invalid address being set when creating some devices, like:
# ip link add dummy0 type dummy
# ip link add link dummy0 name mac0 address 01:02 type macsec
Fixes: 0e06877c6fdb ("[RTNETLINK]: rtnl_link: allow specifying initial device address")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ice driver caches the next_to_clean value at the beginning of
ice_clean_rx_irq() in order to remember the first buffer that has to be
freed/recycled after the main Rx processing loop. The end boundary is
the first descriptor of the frame at which the Rx processing loop
stopped. Note that if the loop ended in the middle of gathering a
multi-buffer frame, next_to_clean would point to a descriptor in
the middle of that frame, BUT the freeing/recycling stage would stop at
the frame's first descriptor. This means that the next iteration of
ice_clean_rx_irq() will miss the (first_desc, next_to_clean - 1) entries.
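The effect described above can be reproduced with a small model. This is a simplified Python sketch, not the driver code: `clean_rx()`, `run()`, the `eop_flags` list and the `available` count are hypothetical, and ring wrap-around is ignored.

```python
def clean_rx(state, eop_flags, available, cache_first_desc):
    """Process `available` descriptors, then recycle the finished buffers.

    state holds 'ntc' (next_to_clean) and 'first_desc' indices.
    eop_flags[i] is True when descriptor i ends a frame.
    Returns the descriptor indices recycled by this call.
    """
    # Start of the recycle range: first_desc when cache_first_desc is set,
    # otherwise next_to_clean (the behaviour described as problematic).
    cached = state["first_desc"] if cache_first_desc else state["ntc"]
    for _ in range(available):
        idx = state["ntc"]
        state["ntc"] = idx + 1
        if eop_flags[idx]:
            # Frame complete: the next frame starts at the new ntc.
            state["first_desc"] = state["ntc"]
    # Recycling stops at the first descriptor of the unfinished frame.
    return list(range(cached, state["first_desc"]))


def run(cache_first_desc):
    # One 3-buffer frame (descriptors 0..2) followed by a 1-buffer frame.
    eop_flags = [False, False, True, True]
    state = {"ntc": 0, "first_desc": 0}
    recycled = clean_rx(state, eop_flags, 2, cache_first_desc)   # stops mid-frame
    recycled += clean_rx(state, eop_flags, 2, cache_first_desc)  # finishes both frames
    return recycled
```

In this model, `run(False)` never recycles descriptors 0 and 1, while `run(True)` recycles all four.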
When running various 9K MTU workloads, such splats were observed:
[ 540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 540.787787] #PF: supervisor read access in kernel mode
[ 540.793002] #PF: error_code(0x0000) - not-present page
[ 540.798218] PGD 0 P4D 0
[ 540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G W 6.3.0-rc7+ #96
[ 540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[ 540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
[ 540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
[ 540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
[ 540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
[ 540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[ 540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
[ 540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
[ 540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
[ 540.890253] FS: 00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
[ 540.898465] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
[ 540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 540.926032] PKRU: 55555554
[ 540.928790] Call Trace:
[ 540.931276] <TASK>
[ 540.933418] ice_napi_poll+0x4ca/0x6d0 [ice]
[ 540.937804] ? __pfx_ice_napi_poll+0x10/0x10 [ice]
[ 540.942716] napi_busy_loop+0xd7/0x320
[ 540.946537] xsk_recvmsg+0x143/0x170
[ 540.950178] sock_recvmsg+0x99/0xa0
[ 540.953729] __sys_recvfrom+0xa8/0x120
[ 540.957543] ? do_futex+0xbd/0x1d0
[ 540.961008] ? __x64_sys_futex+0x73/0x1d0
[ 540.965083] __x64_sys_recvfrom+0x20/0x30
[ 540.969155] do_syscall_64+0x38/0x90
[ 540.972796] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 540.977934] RIP: 0033:0x7f6de5f27934
To fix this, set cached_ntc to first_desc so that at the end, when
freeing/recycling buffers, descriptors from first to ntc are not missed.
Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230531154457.3216621-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The interrupt fix in commit 97a89ed101bb should be applied on all variants
of GPY2xx PHY and GPY115C.
Fixes: 97a89ed101bb ("net: phy: mxl-gpy: disable interrupts on GPY215 by default")
Signed-off-by: Xu Liang <lxu@maxlinear.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531074822.39136-1-lxu@maxlinear.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix return value in the error path of rswitch_start_xmit(). If TX
queues are full, this function should return NETDEV_TX_BUSY.
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20230529073817.1145208-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
marvell_nfc_setup_interface() uses the frequency retrieved from the
clock associated with the NAND interface to determine the timings that
will be used. Changing the NAND frequency select without reflecting
this in the clock configuration means that the calculated timings
don't correctly meet the requirements of the NAND chip. This hasn't been
an issue up to now because a different bug was stopping the
timings from being updated after they were initially set.
Fixes: b25251414f6e ("mtd: rawnand: marvell: Stop implementing ->select_chip()")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230525003154.2303012-2-chris.packham@alliedtelesis.co.nz
|
|
When new timing values are calculated in marvell_nfc_setup_interface()
ensure that they will be applied in marvell_nfc_select_target() by
clearing the selected_chip pointer.
Fixes: b25251414f6e ("mtd: rawnand: marvell: Stop implementing ->select_chip()")
Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230525003154.2303012-1-chris.packham@alliedtelesis.co.nz
|
|
The addition of the mtdchar_read_ioctl() function caused the stack usage
of mtdchar_ioctl() to grow beyond the warning limit on 32-bit architectures
with gcc-13:
drivers/mtd/mtdchar.c: In function 'mtdchar_ioctl':
drivers/mtd/mtdchar.c:1229:1: error: the frame size of 1488 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Mark both the read and write portions as noinline_for_stack to ensure
they don't get inlined and use separate stack slots to reduce the
maximum usage, both in the mtdchar_ioctl() and combined with any
of its callees.
Fixes: 095bb6e44eb1 ("mtdchar: add MEMREAD ioctl")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230417205654.1982368-1-arnd@kernel.org
|
|
Naga no longer works for AMD/Xilinx and there is no activity from him to
continue maintaining Xilinx related drivers. Add myself instead to be kept
in the loop if there is any need for testing.
Signed-off-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
[<miquel.raynal@bootlin.com>: Manually apply on top of the latest -rc,
where the MAINTAINERS file got sorted]
Link: https://lore.kernel.org/linux-mtd/06df49c300c53a27423260e99acc217b06d4e588.1684827820.git.michal.simek@amd.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
"A single patch to use a flexible array rather than a zero-length one"
* tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: Replace zero-length array with flexible-array member
|
|
git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox fix from Jassi Brar:
"Fix missing mutex unlock in mailbox-test"
* tag 'mailbox-fixes-6.4-rc5' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()
|
|
A switch held in reset by default needs to wait longer until we can
reliably detect it.
An issue was observed when testing on the Marvell 88E6393X (Link Street).
The driver failed to detect the switch on some startups. Increasing the
wait time after reset deactivation solves this issue.
The updated wait time is now also the same as the wait time in the
mv88e6xxx_hardware_reset function.
Fixes: 7b75e49de424 ("net: dsa: mv88e6xxx: wait after reset deactivation")
Signed-off-by: Andreas Svensson <andreas.svensson@axis.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230530145223.1223993-1-andreas.svensson@axis.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Zero-length and one-element arrays are deprecated, and we are moving
towards adopting C99 flexible-array members, instead.
Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
sound/firewire/amdtp-stream.c: In function ‘build_it_pkt_header’:
sound/firewire/amdtp-stream.c:694:17: warning: ‘generate_cip_header’ accessing 8 bytes in a region of size 0 [-Wstringop-overflow=]
694 | generate_cip_header(s, cip_header, data_block_counter, syt);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sound/firewire/amdtp-stream.c:694:17: note: referencing argument 2 of type ‘__be32[2]’ {aka ‘unsigned int[2]’}
sound/firewire/amdtp-stream.c:667:13: note: in a call to function ‘generate_cip_header’
667 | static void generate_cip_header(struct amdtp_stream *s, __be32 cip_header[2],
| ^~~~~~~~~~~~~~~~~~~
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/303
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/ZHT0V3SpvHyxCv5W@work
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
[BUG]
After commit e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror()
to scrub_stripe infrastructure"), scrub no longer works for zoned device
at all.
Even an empty zoned btrfs cannot be replaced:
# mkfs.btrfs -f /dev/nvme0n1
# mount /dev/nvme0n1 /mnt/btrfs
# btrfs replace start -Bf 1 /dev/nvme0n2 /mnt/btrfs
Resetting device zones /dev/nvme1n1 (160 zones) ...
ERROR: ioctl(DEV_REPLACE_START) failed on "/mnt/btrfs/": Input/output error
And we can hit a kernel crash related to that:
BTRFS info (device nvme1n1): host-managed zoned block device /dev/nvme3n1, 160 zones of 134217728 bytes
BTRFS info (device nvme1n1): dev_replace from /dev/nvme2n1 (devid 2) to /dev/nvme3n1 started
nvme3n1: Zone Management Append(0x7d) @ LBA 65536, 4 blocks, Zone Is Full (sct 0x1 / sc 0xb9) DNR
I/O error, dev nvme3n1, sector 786432 op 0xd:(ZONE_APPEND) flags 0x4000 phys_seg 3 prio class 2
BTRFS error (device nvme1n1): bdev /dev/nvme3n1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
BUG: kernel NULL pointer dereference, address: 00000000000000a8
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
Call Trace:
<IRQ>
btrfs_lookup_ordered_extent+0x31/0x190
btrfs_record_physical_zoned+0x18/0x40
btrfs_simple_end_io+0xaf/0xc0
blk_update_request+0x153/0x4c0
blk_mq_end_request+0x15/0xd0
nvme_poll_cq+0x1d3/0x360
nvme_irq+0x39/0x80
__handle_irq_event_percpu+0x3b/0x190
handle_irq_event+0x2f/0x70
handle_edge_irq+0x7c/0x210
__common_interrupt+0x34/0xa0
common_interrupt+0x7d/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40
[CAUSE]
Dev-replace reuses scrub code to iterate all extents and write the
existing content back to the new device.
And for zoned devices, we call fill_writer_pointer_gap() to make sure
all the writes into the zoned device are sequential, even if there may be
some gaps between the writes.
However, we have several different bugs, all related to zoned dev-replace:
- We are using the ZONE_APPEND operation for metadata style write back
  For zoned devices, btrfs has two ways to write data:
  * ZONE_APPEND for data
    This allows higher queue depth, but we will not know where
    the write will land.
    Thus we need to grab the real on-disk physical location in its endio.
  * WRITE for metadata
    This requires single queue depth (new writes can only be submitted
    after the previous one has finished), and all writes must be sequential.
  For scrub, we go single queue depth, but still go with ZONE_APPEND,
  which requires btrfs_bio::inode being populated.
  This is the cause of that crash.
- No correct tracking of write_pointer
  After a write has finished, we should forward sctx->write_pointer, or
  fill_writer_pointer_gap() would not work properly and would zero out
  more than necessary, filling the whole zone prematurely.
- Incorrect physical bytenr passed to fill_writer_pointer_gap()
  In scrub_write_sectors(), one call site passes a logical address, which
  is completely wrong.
  The other call site passes the physical address of the current sector, but
  we should pass the physical address of the btrfs_bio we're submitting.
  This is the cause of the -EIO errors.
[FIX]
- Do not use ZONE_APPEND for btrfs_submit_repair_write().
- Manually forward sctx->write_pointer after successful writeback
- Use the physical address of the to-be-submitted btrfs_bio for
fill_writer_pointer_gap()
Now zoned device replace would work as expected.
Reported-by: Christoph Hellwig <hch@lst.de>
Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- Regression fix for overlong timeouts during initialization on
some Logitech Unifying devices (Bastien Nocera)
- error handling and overflow fixes for Wacom driver (Denis Arefev,
Jason Gerecke, Nikita Zhandarovich)
* tag 'for-linus-2023060101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-hidpp: Handle timeout differently from busy
HID: wacom: Add error check to wacom_parse_and_register()
HID: google: add jewel USB id
HID: wacom: avoid integer overflow in wacom_intuos_inout()
HID: wacom: Check for string overflow from strscpy calls
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fix from Damien Le Moal:
- Fix ata_find_dev() use of the device number to find a struct
ata_device for a port. This addresses issues with some passthrough
commands with libsas managed devices.
* tag 'ata-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: libata-scsi: Use correct device no in ata_find_dev()
|
|
Pull smb server fixes from Steve French:
"Eight server fixes (most also for stable):
- Two fixes for uninitialized pointer reads (rename and link)
- Fix potential UAF in oplock break
- Two fixes for potential out of bound reads in negotiate
- Fix crediting bug
- Two fixes for xfstests (allocation size fix for test 694 and lookup
issue shown by test 464)"
* tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: call putname after using the last component
ksmbd: fix incorrect AllocationSize set in smb2_get_info
ksmbd: fix UAF issue from opinfo->conn
ksmbd: fix multiple out-of-bounds read during context decoding
ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate
ksmbd: fix credit count leakage
ksmbd: fix uninitialized pointer read in smb2_create_link()
ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
|
|
IPA_STATUS_SIZE was introduced in commit b8dc7d0eea5a as a replacement
for the size of the removed struct ipa_status which had size
sizeof(__le32[8]). Use this value as IPA_STATUS_SIZE.
Fixes: b8dc7d0eea5a ("net: ipa: stop using sizeof(status)")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531103618.102608-1-spasswolf@web.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In this patch, we mainly try to handle sending a compressed ack
correctly if it's deferred.
Here are more details on the old logic:
When sack compression is triggered in tcp_compressed_ack_kick(),
if the sock is owned by the user, it will set TCP_DELACK_TIMER_DEFERRED
and then defer to the release cb phase. Later, once the user releases
the sock, tcp_delack_timer_handler() should send an ack as expected,
which, however, cannot happen due to the lack of the ICSK_ACK_TIMER flag.
Therefore, the receiver would not send an ack until the sender's
retransmission timeout. This adds unnecessary latency.
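The old logic can be modelled as a small state machine. This is a rough Python sketch under stated assumptions: the `Sock` class and all function names are hypothetical stand-ins for the kernel paths named above, and the `handle_compressed` switch is only one plausible remedy added for illustration, since the message does not spell out the actual change.

```python
class Sock:
    def __init__(self):
        self.owned_by_user = False
        self.deferred = False     # models TCP_DELACK_TIMER_DEFERRED
        self.ack_timer = False    # models the ICSK_ACK_TIMER flag
        self.compressed_ack = 0   # ACKs held back by SACK compression
        self.acks_sent = 0


def send_ack(sk):
    sk.acks_sent += 1
    sk.compressed_ack = 0


def compressed_ack_kick(sk):
    if sk.owned_by_user:
        sk.deferred = True        # defer the work to the release phase
    elif sk.compressed_ack:
        send_ack(sk)


def delack_timer_handler(sk, handle_compressed):
    # Assumed remedy: send the held-back ACK before checking the timer flag.
    if handle_compressed and sk.compressed_ack:
        send_ack(sk)
        return
    if not sk.ack_timer:
        return                    # old logic: nothing is sent on this path
    sk.ack_timer = False
    send_ack(sk)


def release(sk, handle_compressed):
    sk.owned_by_user = False
    if sk.deferred:
        sk.deferred = False
        delack_timer_handler(sk, handle_compressed)


def scenario(handle_compressed):
    sk = Sock()
    sk.owned_by_user = True
    sk.compressed_ack = 3
    compressed_ack_kick(sk)       # socket is busy, so the ACK is deferred
    release(sk, handle_compressed)
    return sk.acks_sent
```

In this model, `scenario(False)` sends no ACK after the release, which mirrors the stall until the sender's retransmission timeout.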
Fixes: 5d9f4262b7ea ("tcp: add SACK compression")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: fuyuanli <fuyuanli@didiglobal.com>
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If we send two TCA_FLOWER_KEY_ENC_OPTS_GENEVE packets and their total
size is 252 bytes (key->enc_opts.len = 252), then
key->enc_opts.len = opt->length = data_len / 4 = 0 when the third
TCA_FLOWER_KEY_ENC_OPTS_GENEVE packet enters fl_set_geneve_opt. This
bypasses the next bounds check and results in an out-of-bounds access.
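The arithmetic above can be replayed with a byte-level model. This is a Python sketch built on assumptions that are not stated in the message: a 255-byte option area immediately followed by a one-byte length field, and a 4-byte option header whose last byte holds the length. `set_opt()`, `replay()` and the `guarded` check are hypothetical.

```python
import errno

DATA_MAX = 255         # assumed size of the option data area
LEN_OFFSET = DATA_MAX  # assumed: the one-byte length field follows the data
HDR_SIZE = 4           # assumed header: class (2), type (1), length (1)


def set_opt(key: bytearray, data_len: int, guarded: bool) -> int:
    """Append one option at the current length; return 0 or -ERANGE."""
    offset = key[LEN_OFFSET]
    if guarded and offset > DATA_MAX - HDR_SIZE:
        return -errno.ERANGE  # the header would not fit in the data area
    # Mirrors opt->length = data_len / 4; at offset 252 this lands on LEN_OFFSET.
    key[offset + 3] = data_len // 4
    # Bounds check on the accumulated length, performed after the write.
    new_len = key[LEN_OFFSET] + HDR_SIZE + data_len
    if new_len > DATA_MAX:
        return -errno.ERANGE
    key[LEN_OFFSET] = new_len
    return 0


def replay(guarded: bool):
    key = bytearray(DATA_MAX + 1)
    set_opt(key, 124, guarded)      # first option: 128 bytes in total
    set_opt(key, 120, guarded)      # second option: length reaches 252
    ret = set_opt(key, 0, guarded)  # third option: header starts at 252
    return ret, key[LEN_OFFSET]
```

Without the guard, the third call overwrites the stored length with 0 before the check runs, so the check passes with a length that no longer reflects the 252 bytes already used.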
Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Link: https://lore.kernel.org/r/20230531102805.27090-1-hbh25y@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If an IOMMU domain was never attached, it lacks any linkage to the
actual IOMMU hardware. Attempting to do flush_iotlb_all() on it will
result in a NULL pointer dereference. This seems to happen after the
recent IOMMU core rework in v6.4-rc1.
Unable to handle kernel read from unreadable memory at virtual address 0000000000000018
Call trace:
mtk_iommu_flush_iotlb_all+0x20/0x80
iommu_create_device_direct_mappings.part.0+0x13c/0x230
iommu_setup_default_domain+0x29c/0x4d0
iommu_probe_device+0x12c/0x190
of_iommu_configure+0x140/0x208
of_dma_configure_id+0x19c/0x3c0
platform_dma_configure+0x38/0x88
really_probe+0x78/0x2c0
Check if the "bank" field has been filled in before actually attempting
the IOTLB flush to avoid it. The IOTLB is also flushed when the device
comes out of runtime suspend, so it should have a clean initial state.
Fixes: 08500c43d4f7 ("iommu/mediatek: Adjust the structure")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20230526085402.394239-1-wenst@chromium.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Clearing out report id and timestamp as a means to detect unlanded reports
only works if the report size is a power of 2. That is, only when the report
size is a sub-multiple of the OA buffer size can we be certain that reports
will land at the same place each time in the OA buffer (after rewind). If the
report size is not a power of 2, we need to zero out the entire report to be
able to detect unlanded reports reliably.
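The alignment argument can be checked with modular arithmetic. This is a minimal Python sketch: `report_starts()` is a hypothetical helper, the buffer and report sizes are made-up numbers, and reports are assumed to be written back to back and to wrap at the end of the buffer.

```python
def report_starts(buffer_size: int, report_size: int, lap: int) -> set:
    """Offsets inside the buffer at which reports begin during one lap."""
    starts = set()
    k = 0
    while k * report_size < (lap + 1) * buffer_size:
        pos = k * report_size            # position in the unwrapped stream
        if pos >= lap * buffer_size:
            starts.add(pos % buffer_size)
        k += 1
    return starts


# Report size divides the buffer size: every lap reuses the same offsets.
aligned_same = report_starts(1024, 256, 0) == report_starts(1024, 256, 1)

# Report size does not divide the buffer size: the offsets shift each lap.
unaligned_same = report_starts(1024, 192, 0) == report_starts(1024, 192, 1)
```

When the offsets shift between laps, a field cleared at a fixed position inside one report can end up in the middle of a later report, which is why only zeroing the whole report is reliable in that case.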
v2: Add Fixes tag (Umesh)
Fixes: 1cc064dce4ed ("drm/i915/perf: Add support for OA media units")
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230523204042.4180641-1-ashutosh.dixit@intel.com
(cherry picked from commit 09a36015d9a0940214c080f95afc605c47648bbd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
There are 4 lanes in the single instance of J784S4 SERDES. Each SERDES
lane mux can select up to 4 different IPs. Define all the possible
functions.
Signed-off-by: Matt Ranostay <mranostay@ti.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Link: https://lore.kernel.org/r/755a14f1-92ad-ce4b-3fde-2a4b0650475c@axentia.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it
is possible to test-build any driver which depends on OF on any
architecture by explicitly selecting OF. Therefore depending on
COMPILE_TEST as an alternative is no longer needed.
It is actually better to always build such drivers with OF enabled,
so that the test builds are closer to how each driver will actually be
built on its intended target. Building them without OF may not test
much as the compiler will optimize out potentially large parts of the
code. In the worst case, this could even pop false positive warnings.
Dropping COMPILE_TEST here improves the quality of our testing and
avoids wasting time on non-existent issues.
As a minor optimization, this also lets us drop of_match_ptr(), as we
now know what it will resolve to, we might as well save cpp some work.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Peter Rosin <peda@axentia.se>
Link: https://lore.kernel.org/r/bc790b4e-1cb4-4ef5-3da8-9d0e6b613bc7@axentia.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Failure ladders weren't exactly unwinding what the function had done up
to that point; most seriously, when we encountered an already offloaded
rule, the failure path tried to remove the new rule from the hashtable,
which would in fact remove the already-present 'old' rule (since it has
the same key) from the table, and leak its resources.
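The hashtable problem can be illustrated with a dictionary. This is a Python sketch of the pattern only, not the driver's code: both functions and the string rules are hypothetical.

```python
import errno


def offload_rule_buggy(table: dict, key, rule) -> int:
    """Insert rule; the duplicate-key error path wrongly deletes by key."""
    old = table.setdefault(key, rule)
    if old is not rule:
        # The unwind step "removes the new rule" by key, but the entry
        # stored under that key is the pre-existing rule.
        del table[key]
        return -errno.EEXIST
    return 0


def offload_rule_fixed(table: dict, key, rule) -> int:
    """Insert rule; on a duplicate key nothing was added, so nothing is removed."""
    old = table.setdefault(key, rule)
    if old is not rule:
        return -errno.EEXIST
    return 0
```

In the buggy variant, a failed insert leaves the table without the old rule, and whatever the old rule owned is no longer reachable through the table.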
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202305200745.xmIlkqjH-lkp@intel.com/
Fixes: d902e1a737d4 ("sfc: bare bones TC offload on EF100")
Fixes: 17654d84b47c ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230530202527.53115-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During driver load, the driver reads the embedded_cpu bit from the
initialization segment, but the initialization segment is readable only
after the initialization bit is cleared.
Move the call to mlx5_read_embedded_cpu() right after the initialization
bit is cleared.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Fixes: 591905ba9679 ("net/mlx5: Introduce Mellanox SmartNIC and modify page management logic")
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Allocation failure is outside the critical lock section and should
return immediately rather than jumping to the unlock section.
Also unlock as soon as required and remove the now redundant jump label.
Fixes: 80a2a9026b24 ("net/mlx5e: Add a lock on tir list")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
[ 9.837087] mlx5_core 0000:02:00.0: firmware version: 16.35.2000
[ 9.843126] mlx5_core 0000:02:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
[ 10.311515] mlx5_core 0000:02:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 10.321948] mlx5_core 0000:02:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
[ 10.344324] mlx5_core 0000:02:00.0: mlx5_pcie_event:301:(pid 88): PCIe slot advertised sufficient power (27W).
[ 10.354339] BUG: unable to handle page fault for address: ffffffff8ff0ade0
[ 10.361206] #PF: supervisor read access in kernel mode
[ 10.366335] #PF: error_code(0x0000) - not-present page
[ 10.371467] PGD 81ec39067 P4D 81ec39067 PUD 81ec3a063 PMD 114b07063 PTE 800ffff7e10f5062
[ 10.379544] Oops: 0000 [#1] PREEMPT SMP PTI
[ 10.383721] CPU: 0 PID: 117 Comm: kworker/0:6 Not tainted 6.3.0-13028-g7222f123c983 #1
[ 10.391625] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0b 06/12/2017
[ 10.398750] Workqueue: events work_for_cpu_fn
[ 10.403108] RIP: 0010:__bitmap_or+0x10/0x26
[ 10.407286] Code: 85 c0 0f 95 c0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 89 c9 31 c0 48 83 c1 3f 48 c1 e9 06 39 c>
[ 10.426024] RSP: 0000:ffffb45a0078f7b0 EFLAGS: 00010097
[ 10.431240] RAX: 0000000000000000 RBX: ffffffff8ff0adc0 RCX: 0000000000000004
[ 10.438365] RDX: ffff9156801967d0 RSI: ffffffff8ff0ade0 RDI: ffff9156801967b0
[ 10.445489] RBP: ffffb45a0078f7e8 R08: 0000000000000030 R09: 0000000000000000
[ 10.452613] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000ec
[ 10.459737] R13: ffffffff8ff0ade0 R14: 0000000000000001 R15: 0000000000000020
[ 10.466862] FS: 0000000000000000(0000) GS:ffff9165bfc00000(0000) knlGS:0000000000000000
[ 10.474936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.480674] CR2: ffffffff8ff0ade0 CR3: 00000001011ae003 CR4: 00000000003706f0
[ 10.487800] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 10.494922] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 10.502046] Call Trace:
[ 10.504493] <TASK>
[ 10.506589] ? matrix_alloc_area.constprop.0+0x43/0x9a
[ 10.511729] ? prepare_namespace+0x84/0x174
[ 10.515914] irq_matrix_reserve_managed+0x56/0x10c
[ 10.520699] x86_vector_alloc_irqs+0x1d2/0x31e
[ 10.525146] irq_domain_alloc_irqs_hierarchy+0x39/0x3f
[ 10.530284] irq_domain_alloc_irqs_parent+0x1a/0x2a
[ 10.535155] intel_irq_remapping_alloc+0x59/0x5e9
[ 10.539859] ? kmem_cache_debug_flags+0x11/0x26
[ 10.544383] ? __radix_tree_lookup+0x39/0xb9
[ 10.548649] irq_domain_alloc_irqs_hierarchy+0x39/0x3f
[ 10.553779] irq_domain_alloc_irqs_parent+0x1a/0x2a
[ 10.558650] msi_domain_alloc+0x8c/0x120
[ 10.567697] irq_domain_alloc_irqs_locked+0x11d/0x286
[ 10.572741] __irq_domain_alloc_irqs+0x72/0x93
[ 10.577179] __msi_domain_alloc_irqs+0x193/0x3f1
[ 10.581789] ? __xa_alloc+0xcf/0xe2
[ 10.585273] msi_domain_alloc_irq_at+0xa8/0xfe
[ 10.589711] pci_msix_alloc_irq_at+0x47/0x5c
The crash is due to matrix_alloc_area() attempting to access per-CPU
memory for CPUs that are not present on the system. The CPU mask
passed into reserve_managed_vector() via its @irqd parameter is
corrupted because it contains uninitialized stack data.
Fixes: bbac70c74183 ("net/mlx5: Use newer affinity descriptor")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When dynamic IRQ allocation is not supported all IRQs are allocated up
front in mlx5_irq_table_create() instead of dynamically as part of
mlx5_irq_alloc(). In the latter dynamic case irq->map.index is set
via the mapping returned by pci_msix_alloc_irq_at(). In the static case
and prior to commit 1da438c0ae02 ("net/mlx5: Fix indexing of mlx5_irq")
irq->map.index was set in mlx5_irq_alloc() twice, once initially to 0 and
then to the requested index before storing in the xarray. After this
commit it is only set to 0, which breaks all other IRQ mappings.
Fix this by setting irq->map.index to the requested index together with
irq->map.virq and improve the related comment to make it clearer which
cases it deals with.
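As a rough illustration of the static (pre-allocated) case, the sketch below fills in both fields together. It is an assumption-laden model, not the mlx5 code: struct msi_map_sketch, lookup_virq_sketch() and map_preallocated() are hypothetical, and the virq numbering is invented for the example.

```c
#include <assert.h>

/* Hypothetical model of an MSI mapping: vector index plus Linux IRQ number. */
struct msi_map_sketch {
	int index;
	int virq;
};

/* Hypothetical lookup of the IRQ number for a vector allocated up front. */
static int lookup_virq_sketch(int index)
{
	return 100 + index;	/* made-up numbering for illustration */
}

/*
 * Static case: the vector already exists, so record the requested
 * index alongside the virq instead of leaving the index at 0.
 */
static struct msi_map_sketch map_preallocated(int requested_index)
{
	struct msi_map_sketch map;

	map.index = requested_index;
	map.virq = lookup_virq_sketch(requested_index);
	return map;
}
```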
Cc: Chuck Lever III <chuck.lever@oracle.com>
Tested-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Fixes: 1da438c0ae02 ("net/mlx5: Fix indexing of mlx5_irq")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
mlx5 adds IRQs to rmap upon MSIX request, and mlx5 removes them from
rmap only if msi_map.index is populated. However, msi_map.index is
populated only when dynamic MSIX is supported. This results in freeing
IRQs without removing them from rmap, which triggers the below
WARN_ON[1].
rmap is a feature which has no relation to dynamic MSIX.
Hence, remove the check of msi_map.index when removing IRQ from rmap.
[1]
[ 200.307160 ] WARNING: CPU: 20 PID: 1702 at kernel/irq/manage.c:2034 free_irq+0x2ac/0x358
[ 200.316990 ] CPU: 20 PID: 1702 Comm: modprobe Not tainted 6.4.0-rc3_for_upstream_min_debug_2023_05_24_14_02 #1
[ 200.318939 ] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 200.321659 ] pc : free_irq+0x2ac/0x358
[ 200.322400 ] lr : free_irq+0x20/0x358
[ 200.337865 ] Call trace:
[ 200.338360 ] free_irq+0x2ac/0x358
[ 200.339029 ] irq_release+0x58/0xd0 [mlx5_core]
[ 200.340093 ] mlx5_irqs_release_vectors+0x80/0xb0 [mlx5_core]
[ 200.341344 ] destroy_comp_eqs+0x120/0x170 [mlx5_core]
[ 200.342469 ] mlx5_eq_table_destroy+0x1c/0x38 [mlx5_core]
[ 200.343645 ] mlx5_unload+0x8c/0xc8 [mlx5_core]
[ 200.344652 ] mlx5_uninit_one+0x78/0x118 [mlx5_core]
[ 200.345745 ] remove_one+0x80/0x108 [mlx5_core]
[ 200.346752 ] pci_device_remove+0x40/0xd8
[ 200.347554 ] device_remove+0x50/0x88
[ 200.348272 ] device_release_driver_internal+0x1c4/0x228
[ 200.349312 ] driver_detach+0x54/0xa0
[ 200.350030 ] bus_remove_driver+0x74/0x100
[ 200.350833 ] driver_unregister+0x34/0x68
[ 200.351619 ] pci_unregister_driver+0x28/0xa0
[ 200.352476 ] mlx5_cleanup+0x14/0x2210 [mlx5_core]
[ 200.353536 ] __arm64_sys_delete_module+0x190/0x2e8
[ 200.354495 ] el0_svc_common.constprop.0+0x6c/0x1d0
[ 200.355455 ] do_el0_svc+0x38/0x98
[ 200.356122 ] el0_svc+0x1c/0x80
[ 200.356739 ] el0t_64_sync_handler+0xb4/0x130
[ 200.357604 ] el0t_64_sync+0x174/0x178
[ 200.358345 ] ---[ end trace 0000000000000000 ]---
Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add IP GC 11.0.1 to the list of targets that have
tmz enabled by default.
Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
|
|
Pull smb client fixes from Steve French:
"Four small smb3 client fixes:
- two small fixes suggested by kernel test robot
- small cleanup fix
- update Paulo's email address in the maintainer file"
* tag '6.4-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: address unused variable warning
smb: delete an unnecessary statement
smb3: missing null check in SMB2_change_notify
smb3: update a reviewer email in MAINTAINERS file
|
|
During reboot tests on an arm64 platform, boot may fail.
The error messages are as follows:
[ 1.706570][ 3] [ T273] [drm:si_thermal_enable_alert [amdgpu]] *ERROR* Could not enable thermal interrupts.
[ 1.716547][ 3] [ T273] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22
[ 1.727064][ 3] [ T273] amdgpu 0000:02:00.0: amdgpu_device_ip_late_init failed
[ 1.734367][ 3] [ T273] amdgpu 0000:02:00.0: Fatal error during GPU init
v2: squash in build warning fix (Alex)
Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Add ras_poison_irq and functions. And fix the amdgpu_irq_put
call trace in jpeg_v4_0_hw_fini.
[ 50.497562] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu]
[ 50.497619] RSP: 0018:ffffaa2400fcfcb0 EFLAGS: 00010246
[ 50.497620] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 50.497621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 50.497621] RBP: ffffaa2400fcfcd0 R08: 0000000000000000 R09: 0000000000000000
[ 50.497622] R10: 0000000000000000 R11: 0000000000000000 R12: ffff99b2105242d8
[ 50.497622] R13: 0000000000000000 R14: ffff99b210500000 R15: ffff99b210500000
[ 50.497623] FS: 0000000000000000(0000) GS:ffff99b518480000(0000) knlGS:0000000000000000
[ 50.497623] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 50.497624] CR2: 00007f9d32aa91e8 CR3: 00000001ba210000 CR4: 0000000000750ee0
[ 50.497624] PKRU: 55555554
[ 50.497625] Call Trace:
[ 50.497625] <TASK>
[ 50.497627] jpeg_v4_0_hw_fini+0x43/0xc0 [amdgpu]
[ 50.497693] jpeg_v4_0_suspend+0x13/0x30 [amdgpu]
[ 50.497751] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu]
[ 50.497802] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu]
[ 50.497854] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu]
[ 50.497905] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu]
[ 50.498005] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu]
[ 50.498060] process_one_work+0x21f/0x400
[ 50.498063] worker_thread+0x200/0x3f0
[ 50.498064] ? process_one_work+0x400/0x400
[ 50.498065] kthread+0xee/0x120
[ 50.498067] ? kthread_complete_and_exit+0x20/0x20
[ 50.498068] ret_from_fork+0x22/0x30
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add ras_poison_irq and functions.
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Separate jpeg RAS poison consumption handling from the instance irq, and
register dedicated ras_poison_irq src and funcs for UVD_POISON.
v2:
- Separate ras irq from jpeg instance irq
- Improve the subject and code comments
v3:
- Split the patch into three parts
- Improve the code comments
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add ras_poison_irq and functions. And fix the amdgpu_irq_put
call trace in vcn_v4_0_hw_fini.
[ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu]
[ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246
[ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000
[ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8
[ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8
[ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000
[ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0
[ 44.563633] PKRU: 55555554
[ 44.563633] Call Trace:
[ 44.563634] <TASK>
[ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu]
[ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu]
[ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu]
[ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu]
[ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu]
[ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu]
[ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu]
[ 44.564061] process_one_work+0x21f/0x400
[ 44.564062] worker_thread+0x200/0x3f0
[ 44.564063] ? process_one_work+0x400/0x400
[ 44.564064] kthread+0xee/0x120
[ 44.564065] ? kthread_complete_and_exit+0x20/0x20
[ 44.564066] ret_from_fork+0x22/0x30
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add ras_poison_irq and functions.
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Separate vcn RAS poison consumption handling from the instance irq, and
register dedicated ras_poison_irq src and funcs for UVD_POISON.
v2:
- Separate ras irq from vcn instance irq
- Improve the subject and code comments
v3:
- Split the patch into three parts
- Improve the code comments
Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit 474f01015ffdb74e01c2eb3584a2822c64e7b2be.
Caused a regression:
Samsung Odyssey Neo G9, running at 5120x1440@240/VRR, connected to Navi
21 via DisplayPort, blanks and the GPU hangs while starting the Steam
game Assetto Corsa Competizione (via Proton 7.0).
Example dmesg excerpt:
amdgpu 0000:0c:00.0: [drm] ERROR [CRTC:82:crtc-0] flip_done timed out
NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
[...]
RIP: 0010:amdgpu_device_rreg.part.0+0x2f/0xf0 [amdgpu]
Code: 41 54 44 8d 24 b5 00 00 00 00 55 89 f5 53 48 89 fb 4c 3b a7 60 0b 00 00 73 6a 83 e2 02 74 29 4c 03 a3 68 0b 00 00 45 8b 24 24 <48> 8b 43 08 0f b7 70 3e 66 90 44 89 e0 5b 5d 41 5c 31 d2 31 c9 31
RSP: 0000:ffffb39a119dfb88 EFLAGS: 00000086
RAX: ffffffffc0eb96a0 RBX: ffff9e7963dc0000 RCX: 0000000000007fff
RDX: 0000000000000000 RSI: 0000000000004ff6 RDI: ffff9e7963dc0000
RBP: 0000000000004ff6 R08: ffffb39a119dfc40 R09: 0000000000000010
R10: ffffb39a119dfc40 R11: ffffb39a119dfc44 R12: 00000000000e05ae
R13: 0000000000000000 R14: ffff9e7963dc0010 R15: 0000000000000000
FS: 000000001012f6c0(0000) GS:ffff9e805eb80000(0000) knlGS:000000007fd40000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000461ca000 CR3: 00000002a8a20000 CR4: 0000000000350ee0
Call Trace:
<TASK>
dm_read_reg_func+0x37/0xc0 [amdgpu]
generic_reg_get2+0x22/0x60 [amdgpu]
optc1_get_crtc_scanoutpos+0x6a/0xc0 [amdgpu]
dc_stream_get_scanoutpos+0x74/0x90 [amdgpu]
dm_crtc_get_scanoutpos+0x82/0xf0 [amdgpu]
amdgpu_display_get_crtc_scanoutpos+0x91/0x190 [amdgpu]
? dm_read_reg_func+0x37/0xc0 [amdgpu]
amdgpu_get_vblank_counter_kms+0xb4/0x1a0 [amdgpu]
dm_pflip_high_irq+0x213/0x2f0 [amdgpu]
amdgpu_dm_irq_handler+0x8a/0x200 [amdgpu]
amdgpu_irq_dispatch+0xd4/0x220 [amdgpu]
amdgpu_ih_process+0x7f/0x110 [amdgpu]
amdgpu_irq_handler+0x1f/0x70 [amdgpu]
__handle_irq_event_percpu+0x46/0x1b0
handle_irq_event+0x34/0x80
handle_edge_irq+0x9f/0x240
__common_interrupt+0x66/0x110
common_interrupt+0x5c/0xd0
asm_common_interrupt+0x22/0x40
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit ce560ac40272a5c8b5b68a9d63a75edd9e66aed2.
It depends on its parent commit, which we want to revert.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
[Hamza: fix a whitespace issue in dcn30_prepare_bandwidth()]
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch reverses the DPM clock levels output of pp_dpm_mclk
and pp_dpm_fclk for renoir.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clock levels are
given in reversed order by PMFW, like the memory DPM clocks
that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talk to the driver then have to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clock levels from the
driver consistently.
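The reordering described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: reverse_levels() is a hypothetical helper, not a function from the driver, and it assumes the firmware table lists levels from highest to lowest.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy 'count' clock levels from 'fw' into 'out' in the opposite order.
 * 'fw' is assumed to be sorted highest clock first, so 'out' ends up
 * lowest clock first, matching what dGPUs and older APUs expose.
 */
static void reverse_levels(const unsigned int *fw, unsigned int *out,
			   size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		out[i] = fw[count - 1 - i];
}
```

Writing into a separate output buffer leaves the firmware table untouched, so other consumers of the table are not affected by the presentation order.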
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This patch reverses the DPM clock levels output of pp_dpm_mclk
and pp_dpm_fclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clock levels that come from
the DFPstateTable are given in reversed order by PMFW, like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talk to the driver then have to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clock levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This patch reverses the DPM clock levels output of pp_dpm_mclk
and pp_dpm_fclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clock levels that come from
the DFPstateTable are given in reversed order by PMFW, like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talk to the driver then have to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clock levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
This patch reverses the DPM clock levels output of pp_dpm_mclk.
On dGPUs and older APUs we expose the levels from lowest clocks
to highest clocks. But for some APUs, the clock levels that come from
the DFPstateTable are given in reversed order by PMFW, like the
memory DPM clocks that are exposed by pp_dpm_mclk.
It's not intuitive that they are reversed on these APUs. All tools
and software that talk to the driver then have to know different ways
to interpret the data depending on the asic.
So we need to reverse them to expose the clock levels from the
driver consistently.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|