Age | Commit message (Collapse) | Author |
|
DAMON debugfs interface iterates current monitoring targets in
'dbgfs_target_ids_read()' while holding the corresponding
'kdamond_lock'. However, it also destructs the monitoring targets in
'dbgfs_before_terminate()' without holding the lock. This can result in
a use_after_free bug. This commit avoids the race by protecting the
destruction with the corresponding 'kdamond_lock'.
Link: https://lkml.kernel.org/r/20211221094447.2241-1-sj@kernel.org
Reported-by: Sangwoo Bae <sangwoob@amazon.com>
Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The second parameter of alloc_pages_exact_nid is the one indicating the
size of memory pointed by the returned pointer.
Link: https://lkml.kernel.org/r/YbjEgwhn4bGblp//@coeus
Fixes: abd58f38dfb4 ("mm/page_alloc: add __alloc_size attributes for better bounds checking")
Signed-off-by: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Levente Polyak <levente@leventepolyak.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It is not easily reproducible, but on 5.16-rc I have several times hit
the VM_BUG_ON_PAGE(PageTail(page), page) in
page_cache_add_speculative(): usually from filemap_get_read_batch() for
an ext4 read, yesterday from next_uptodate_page() from
filemap_map_pages() for a shmem fault.
That BUG used to be placed where page_ref_add_unless() had succeeded,
but now it is placed before folio_ref_add_unless() is attempted: that is
not safe, since it is only the acquired reference which makes the page
safe from racing THP collapse or split.
We could keep the BUG, checking PageTail only when
folio_ref_try_add_rcu() has succeeded; but I don't think it adds much
value - just delete it.
Link: https://lkml.kernel.org/r/8b98fc6f-3439-8614-c3f3-945c659a1aba@google.com
Fixes: 020853b6f5ea ("mm: Add folio_try_get_rcu()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When a memory error hits a tail page of a free hugepage,
__page_handle_poison() is expected to be called to isolate the error in
4kB unit, but it's not called due to the outdated if-condition in
memory_failure_hugetlb(). This loses the chance to isolate the error in
the finer unit, so it's not optimal. Drop the condition.
This "(p != head && TestSetPageHWPoison(head)" condition is based on the
old semantics of PageHWPoison on hugepage (where PG_hwpoison flag was
set on the subpage), so it's not necessray any more. By getting to set
PG_hwpoison on head page for hugepages, concurrent error events on
different subpages in a single hugepage can be prevented by
TestSetPageHWPoison(head) at the beginning of memory_failure_hugetlb().
So dropping the condition should not reopen the race window originally
mentioned in commit b985194c8c0a ("hwpoison, hugetlb:
lock_page/unlock_page does not match for handling a free hugepage")
[naoya.horiguchi@linux.dev: fix "HardwareCorrupted" counter]
Link: https://lkml.kernel.org/r/20211220084851.GA1460264@u2004
Link: https://lkml.kernel.org/r/20211210110208.879740-1-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Fei Luo <luofei@unicloud.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org> [5.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some lists that are moderated are not marked as moderated consistently,
so mark them all as moderated.
Link: https://lkml.kernel.org/r/20211209001330.18558-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Conor Culhane <conor.culhane@silvaco.com>
Cc: Ryder Lee <ryder.lee@mediatek.com>
Cc: Jianjun Wang <jianjun.wang@mediatek.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When booting with crashkernel= on the kernel command line a warning
similar to
Kernel command line: ro console=ttyS0 crashkernel=256M
Unknown kernel command line parameters "crashkernel=256M", will be passed to user space.
is printed.
This comes from crashkernel= being parsed independent from the kernel
parameter handling mechanism. So the code in init/main.c doesn't know
that crashkernel= is a valid kernel parameter and prints this incorrect
warning.
Suppress the warning by adding a dummy early_param handler for
crashkernel=.
Link: https://lkml.kernel.org/r/20211208133443.6867-1-prudo@redhat.com
Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
alloc_pages_vma() may try to allocate THP page on the local NUMA node
first:
page = __alloc_pages_node(hpage_node,
gfp | __GFP_THISNODE | __GFP_NORETRY, order);
And if the allocation fails it retries allowing remote memory:
if (!page && (gfp & __GFP_DIRECT_RECLAIM))
page = __alloc_pages_node(hpage_node,
gfp, order);
However, this retry allocation completely ignores memory policy nodemask
allowing allocation to escape restrictions.
The first appearance of this bug seems to be the commit ac5b2c18911f
("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings").
The bug disappeared later in the commit 89c83fb539f9 ("mm, thp:
consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") and
reappeared again in slightly different form in the commit 76e654cc91bb
("mm, page_alloc: allow hugepage fallback to remote nodes when
madvised")
Fix this by passing correct nodemask to the __alloc_pages() call.
The demonstration/reproducer of the problem:
$ mount -oremount,size=4G,huge=always /dev/shm/
$ echo always > /sys/kernel/mm/transparent_hugepage/defrag
$ cat mbind_thp.c
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <numaif.h>
#define SIZE 2ULL << 30
int main(int argc, char **argv)
{
int fd;
unsigned long long i;
char *addr;
pid_t pid;
char buf[100];
unsigned long nodemask = 1;
fd = open("/dev/shm/test", O_RDWR|O_CREAT);
assert(fd > 0);
assert(ftruncate(fd, SIZE) == 0);
addr = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
MAP_SHARED, fd, 0);
assert(mbind(addr, SIZE, MPOL_BIND, &nodemask, 2, MPOL_MF_STRICT|MPOL_MF_MOVE)==0);
for (i = 0; i < SIZE; i+=4096) {
addr[i] = 1;
}
pid = getpid();
snprintf(buf, sizeof(buf), "grep shm /proc/%d/numa_maps", pid);
system(buf);
sleep(10000);
return 0;
}
$ gcc mbind_thp.c -o mbind_thp -lnuma
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 1918 MB
node 0 free: 1595 MB
node 1 cpus: 1 3
node 1 size: 2014 MB
node 1 free: 1731 MB
node distances:
node 0 1
0: 10 20
1: 20 10
$ rm -f /dev/shm/test; taskset -c 0 ./mbind_thp
7fd970a00000 bind:0 file=/dev/shm/test dirty=524288 active=0 N0=396800 N1=127488 kernelpagesize_kB=4
Link: https://lkml.kernel.org/r/20211208165343.22349-1-arbn@yandex-team.com
Fixes: ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings")
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Hulk robot reported a kmemleak problem:
unreferenced object 0xffff93d1d8cc02e8 (size 248):
comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
hex dump (first 32 bytes):
00 40 85 19 d4 93 ff ff 00 10 00 00 00 00 00 00 .@..............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
seq_open+0x2a/0x80
full_proxy_open+0x167/0x1e0
do_dentry_open+0x1e1/0x3a0
path_openat+0x961/0xa20
do_filp_open+0xae/0x120
do_sys_openat2+0x216/0x2f0
do_sys_open+0x57/0x80
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff93d419854000 (size 4096):
comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s)
hex dump (first 32 bytes):
6b 66 65 6e 63 65 2d 23 32 35 30 3a 20 30 78 30 kfence-#250: 0x0
30 30 30 30 30 30 30 37 35 34 62 64 61 31 32 2d 0000000754bda12-
backtrace:
seq_read_iter+0x313/0x440
seq_read+0x14b/0x1a0
full_proxy_read+0x56/0x80
vfs_read+0xa5/0x1b0
ksys_read+0xa0/0xf0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
I find that we can easily reproduce this problem with the following
commands:
cat /sys/kernel/debug/kfence/objects
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
The leaked memory is allocated in the stack below:
do_syscall_64
do_sys_open
do_dentry_open
full_proxy_open
seq_open ---> alloc seq_file
vfs_read
full_proxy_read
seq_read
seq_read_iter
traverse ---> alloc seq_buf
And it should have been released in the following process:
do_syscall_64
syscall_exit_to_user_mode
exit_to_user_mode_prepare
task_work_run
____fput
__fput
full_proxy_release ---> free here
However, the release function corresponding to file_operations is not
implemented in kfence. As a result, a memory leak occurs. Therefore,
the solution to this problem is to implement the corresponding release
function.
Link: https://lkml.kernel.org/r/20211206133628.2822545-1-libaokun1@huawei.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
NFT_COUNTER was removed since
390ad4295aa ("netfilter: nf_tables: make counter support built-in")
LKP/0Day will check if all configs listing under selftests are able to
be enabled properly.
For the missing configs, it will report something like:
LKP WARN miss config CONFIG_NFT_COUNTER= of net/mptcp/config
- it's not reasonable to keep the deprecated configs.
- configs under kselftests are recommended by corresponding tests.
So if some configs are missing, it will impact the testing results
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ma Xinjian <xinjianx.ma@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to delay the endpoint free by calling call_rcu() to fix
another use-after-free issue in sctp_sock_dump():
BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20
Call Trace:
__lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218
lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
spin_lock_bh include/linux/spinlock.h:334 [inline]
__lock_sock+0x203/0x350 net/core/sock.c:2253
lock_sock_nested+0xfe/0x120 net/core/sock.c:2774
lock_sock include/net/sock.h:1492 [inline]
sctp_sock_dump+0x122/0xb20 net/sctp/diag.c:324
sctp_for_each_transport+0x2b5/0x370 net/sctp/socket.c:5091
sctp_diag_dump+0x3ac/0x660 net/sctp/diag.c:527
__inet_diag_dump+0xa8/0x140 net/ipv4/inet_diag.c:1049
inet_diag_dump+0x9b/0x110 net/ipv4/inet_diag.c:1065
netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244
__netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352
netlink_dump_start include/linux/netlink.h:216 [inline]
inet_diag_handler_cmd+0x2ce/0x3f0 net/ipv4/inet_diag.c:1170
__sock_diag_cmd net/core/sock_diag.c:232 [inline]
sock_diag_rcv_msg+0x31d/0x410 net/core/sock_diag.c:263
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477
sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:274
This issue occurs when asoc is peeled off and the old sk is freed after
getting it by asoc->base.sk and before calling lock_sock(sk).
To prevent the sk free, as a holder of the sk, ep should be alive when
calling lock_sock(). This patch uses call_rcu() and moves sock_put and
ep free into sctp_endpoint_destroy_rcu(), so that it's safe to try to
hold the ep under rcu_read_lock in sctp_transport_traverse_process().
If sctp_endpoint_hold() returns true, it means this ep is still alive
and we have held it and can continue to dump it; If it returns false,
it means this ep is dead and can be freed after rcu_read_unlock, and
we should skip it.
In sctp_sock_dump(), after locking the sk, if this ep is different from
tsp->asoc->ep, it means during this dumping, this asoc was peeled off
before calling lock_sock(), and the sk should be skipped; If this ep is
the same with tsp->asoc->ep, it means no peeloff happens on this asoc,
and due to lock_sock, no peeloff will happen either until release_sock.
Note that delaying endpoint free won't delay the port release, as the
port release happens in sctp_endpoint_destroy() before calling call_rcu().
Also, freeing endpoint by call_rcu() makes it safe to access the sk by
asoc->base.sk in sctp_assocs_seq_show() and sctp_rcv().
Thanks Jones to bring this issue up.
v1->v2:
- improve the changelog.
- add kfree(ep) into sctp_endpoint_destroy_rcu(), as Jakub noticed.
Reported-by: syzbot+9276d76e83e3bcde6c99@syzkaller.appspotmail.com
Reported-by: Lee Jones <lee.jones@linaro.org>
Fixes: d25adbeb0cdb ("sctp: fix an use-after-free issue in sctp_sock_dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The fixed_phy_get_gpiod function() returns NULL, it doesn't return error
pointers, using NULL checking to fix this.i
Fixes: 5468e82f7034 ("net: phy: fixed-phy: Drop GPIO from fixed_phy_add()")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20211224021500.10362-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull ARM fixes from Russell King:
- fix nommu after getting rid of mini-stack for ARMv7
- fix Thumb2 bug in iWMMXt exception handling
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9169/1: entry: fix Thumb2 bug in iWMMXt exception handling
ARM: 9160/1: NOMMU: Reload __secondary_data after PROCINFO_INITFUNC
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Various bug-fixes"
* tag 'platform-drivers-x86-v5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: intel_pmc_core: fix memleak on registration failure
platform/x86/intel: Remove X86_PLATFORM_DRIVERS_INTEL
platform/x86: system76_acpi: Guard System76 EC specific functionality
platform/x86: apple-gmux: use resource_size() with res
platform/x86: amd-pmc: only use callbacks for suspend
platform/mellanox: mlxbf-pmc: Fix an IS_ERR() vs NULL bug in mlxbf_pmc_map_counters
|
|
Commit 85bf17b28f97 ("recordmcount.pl: look for jgnop instruction as well
as bcrl on s390") added a new alternative mnemonic for the existing brcl
instruction. This is required for the combination old gcc version (pre 9.0)
and binutils since version 2.37.
However at the same time this commit introduced a typo, replacing brcl with
bcrl. As a result no mcount locations are detected anymore with old gcc
versions (pre 9.0) and binutils before version 2.37.
Fix this by using the correct mnemonic again.
Reported-by: Miroslav Benes <mbenes@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: <stable@vger.kernel.org>
Fixes: 85bf17b28f97 ("recordmcount.pl: look for jgnop instruction as well as bcrl on s390")
Link: https://lore.kernel.org/r/alpine.LSU.2.21.2112230949520.19849@pobox.suse.cz
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The below referenced commit correctly updated the computation of number
of segments (gso_size) by using only the gso payload size and
removing the header lengths.
With this change the regression test started failing. Update
the tests to match this new behavior.
Both IPv4 and IPv6 tests are updated, as a separate patch in this series
will update udp_v6_send_skb to match this change in udp_send_skb.
Fixes: 158390e45612 ("udp: using datalen to cap max gso segments")
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-2-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The max number of UDP gso segments is intended to cap to
UDP_MAX_SEGMENTS, this is checked in udp_send_skb().
skb->len contains network and transport header len here, we should use
only data len instead.
This is the ipv6 counterpart to the below referenced commit,
which missed the ipv6 change
Fixes: 158390e45612 ("udp: using datalen to cap max gso segments")
Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20211223222441.2975883-1-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2021-12-22
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
net/mlx5e: Delete forward rule for ct or sample action
net/mlx5e: Fix ICOSQ recovery flow for XSK
net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow
net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled
net/mlx5e: Wrap the tx reporter dump callback to extract the sq
net/mlx5: Fix tc max supported prio for nic mode
net/mlx5: Fix SF health recovery flow
net/mlx5: Fix error print in case of IRQ request failed
net/mlx5: Use first online CPU instead of hard coded CPU
net/mlx5: DR, Fix querying eswitch manager vport for ECPF
net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources
====================
Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull ksmbd fixes from Steve French:
"Three ksmbd fixes, all for stable as well.
Two fix potential unitialized memory and one fixes a security problem
where encryption is unitentionally disabled from some clients"
* tag '5.16-rc5-ksmbd-fixes' of git://git.samba.org/ksmbd:
ksmbd: disable SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
ksmbd: fix uninitialized symbol 'pntsd_size'
ksmbd: fix error code in ndr_read_int32()
|
|
Pull drm fixes from Dave Airlie:
"Happy Xmas. Nothing major, one mediatek and a couple of i915 locking
fixes. There might be a few stragglers over next week or so but I
don't expect much before next release.
mediatek:
- NULL pointer check
i915:
- guc submission locking fixes"
* tag 'drm-fixes-2021-12-24' of git://anongit.freedesktop.org/drm/drm:
drm/i915/guc: Only assign guc_id.id when stealing guc_id
drm/i915/guc: Use correct context lock when callig clr_context_registered
drm/mediatek: hdmi: Perform NULL pointer check for mtk_hdmi_conf
|
|
Pull io_uring fix from Jens Axboe:
"Single fix for not clearing kiocb->ki_pos back to 0 for a stream,
destined for stable as well"
* tag 'io_uring-5.16-2021-12-23' of git://git.kernel.dk/linux-block:
io_uring: zero iocb->ki_pos for stream file types
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucount fix from Eric Biederman:
"This fixes a silly logic bug in the ucount rlimits code, where it was
comparing against the wrong limit"
* 'ucount-rlimit-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ucounts: Fix rlimit max values check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- revert "tipc: use consistent GFP flags"
Previous releases - regressions:
- igb: fix deadlock caused by taking RTNL in runtime resume path
- accept UFOv6 packages in virtio_net_hdr_to_skb
- netfilter: fix regression in looped (broad|multi)cast's MAC
handling
- bridge: fix ioctl old_deviceless bridge argument
- ice: xsk: do not clear status_error0 for ntu + nb_buffs descriptor,
avoid stalls when multiple sockets use an interface
Previous releases - always broken:
- inet: fully convert sk->sk_rx_dst to RCU rules
- veth: ensure skb entering GRO are not cloned
- sched: fix zone matching for invalid conntrack state
- bonding: fix ad_actor_system option setting to default
- nf_tables: fix use-after-free in nft_set_catchall_destroy()
- lantiq_xrx200: increase buffer reservation to avoid mem corruption
- ice: xsk: avoid leaking app buffers during clean up
- tun: avoid double free in tun_free_netdev"
* tag 'net-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M
r8152: sync ocp base
r8152: fix the force speed doesn't work for RTL8156
net: bridge: fix ioctl old_deviceless bridge argument
net: stmmac: ptp: fix potentially overflowing expression
net: dsa: tag_ocelot: use traffic class to map priority on injected header
veth: ensure skb entering GRO are not cloned.
asix: fix wrong return value in asix_check_host_enable()
asix: fix uninit-value in asix_mdio_read()
sfc: falcon: Check null pointer of rx_queue->page_ring
sfc: Check null pointer of rx_queue->page_ring
net: ks8851: Check for error irq
drivers: net: smc911x: Check for error irq
fjes: Check for error irq
bonding: fix ad_actor_system option setting to default
igb: fix deadlock caused by taking RTNL in RPM resume path
gve: Correct order of processing device options
net: skip virtio_net_hdr_set_proto if protocol already set
net: accept UFOv6 packages in virtio_net_hdr_to_skb
docs: networking: replace skb_hwtstamp_tx with skb_tstamp_tx
...
|
|
In case device registration fails during module initialisation, the
platform device structure needs to be freed using platform_device_put()
to properly free all resources (e.g. the device name).
Fixes: 938835aa903a ("platform/x86: intel_pmc_core: do not create a static struct device")
Cc: stable@vger.kernel.org # 5.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20211222105023.6205-1-johan@kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a
value, which is 0. Fix from BIT(0) to 0.
Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Fixes: b38dd98ff8d0 ("net: stmmac: Add Toshiba Visconti SoCs glue driver")
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hayes Wang says:
====================
r8152: fix bugs
Patch #1 fix the issue of force speed mode for RTL8156.
Patch #2 fix the issue of unexpected ocp_base.
====================
Link: https://lore.kernel.org/r/20211223092702.23841-386-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are some chances that the actual base of hardware is different
from the value recorded by driver, so we have to reset the variable
of ocp_base to sync it.
Set ocp_base to -1. Then, it would be updated and the new base would be
set to the hardware next time.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It needs to set mdio force mode. Otherwise, link off always occurs when
setting force speed.
Fixes: 195aae321c82 ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Quite a few small fixes, hopefully the last batch for 5.16.
Most of them are device-specific quirks and/or fixes, and nothing
looks scary for the late stage"
* tag 'sound-5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Fix quirk for Clevo NJ51CU
ALSA: rawmidi - fix the uninitalized user_pversion
ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2
ALSA: hda: intel-sdw-acpi: harden detection of controller
ALSA: hda/hdmi: Disable silent stream on GLK
ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
ASoC: meson: aiu: Move AIU_I2S_MISC hold setting to aiu-fifo-i2s
ASoC: meson: aiu: fifo: Add missing dma_coerce_mask_and_coherent()
ASoC: tas2770: Fix setting of high sample rates
ASoC: rt5682: fix the wrong jack type detected
ALSA: hda/realtek: Add new alc285-hp-amp-init model
ALSA: hda/realtek: Amp init fixup for HP ZBook 15 G6
ASoC: tegra: Restore headphones jack name on Nyan Big
ASoC: tegra: Add DAPM switches for headphones and mic jack
ALSA: jack: Check the return value of kstrdup()
ALSA: drivers: opl3: Fix incorrect use of vp->state
ASoC: SOF: Intel: pci-tgl: add new ADL-P variant
ASoC: SOF: Intel: pci-tgl: add ADL-N support
|
|
Commit 561d8352818f ("bridge: use ndo_siocdevprivate") changed the
source and destination arguments of copy_{to,from}_user in bridge's
old_deviceless() from args[1] to uarg breaking SIOC{G,S}IFBR ioctls.
Commit cbd7ad29a507 ("net: bridge: fix ioctl old_deviceless bridge
argument") fixed only BRCTL_{ADD,DEL}_BRIDGES commands leaving
BRCTL_GET_BRIDGES one untouched.
The fixes BRCTL_GET_BRIDGES as well and has been tested with busybox's
brctl.
Example of broken brctl:
$ brctl show
bridge name bridge id STP enabled interfaces
brctl: can't get bridge name for index 0: No such device or address
Example of fixed brctl:
$ brctl show
bridge name bridge id STP enabled interfaces
br0 8000.000000000000 no
Fixes: 561d8352818f ("bridge: use ndo_siocdevprivate")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/all/20211223153139.7661-2-repk@triplefau.lt/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert the u32 variable to type u64 in a context where expression of
type u64 is required to avoid potential overflow.
Fixes: e9e3720002f6 ("net: stmmac: ptp: update tas basetime after ptp adjust")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For Ocelot switches, the CPU injected frames have an injection header
where it can specify the QoS class of the packet and the DSA tag, now it
uses the SKB priority to set that. If a traffic class to priority
mapping is configured on the netdevice (with mqprio for example ...), it
won't be considered for CPU injected headers. This patch make the QoS
class aligned to the priority to traffic class mapping if it exists.
Fixes: 8dce89aa5f32 ("net: dsa: ocelot: add tagger for Ocelot/Felix switches")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Marouen Ghodhbane <marouen.ghodhbane@nxp.com>
Link: https://lore.kernel.org/r/20211223072211.33130-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix interrupts when replugging the device in gpio-dln2
- remove the arbitrary timeout on virtio requests from gpio-virtio
* tag 'gpio-fixes-for-v5.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: virtio: remove timeout
gpio: dln2: Fix interrupts when replugging the device
|
|
After commit d3256efd8e8b ("veth: allow enabling NAPI even without XDP"),
if GRO is enabled on a veth device and TSO is disabled on the peer
device, TCP skbs will go through the NAPI callback. If there is no XDP
program attached, the veth code does not perform any share check, and
shared/cloned skbs could enter the GRO engine.
Ignat reported a BUG triggered later-on due to the above condition:
[ 53.970529][ C1] kernel BUG at net/core/skbuff.c:3574!
[ 53.981755][ C1] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 53.982634][ C1] CPU: 1 PID: 19 Comm: ksoftirqd/1 Not tainted 5.16.0-rc5+ #25
[ 53.982634][ C1] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 53.982634][ C1] RIP: 0010:skb_shift+0x13ef/0x23b0
[ 53.982634][ C1] Code: ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0
7f 08 84 c0 0f 85 41 0c 00 00 41 80 7f 02 00 4d 8d b5 d0 00 00 00 0f
85 74 f5 ff ff <0f> 0b 4d 8d 77 20 be 04 00 00 00 4c 89 44 24 78 4c 89
f7 4c 89 8c
[ 53.982634][ C1] RSP: 0018:ffff8881008f7008 EFLAGS: 00010246
[ 53.982634][ C1] RAX: 0000000000000000 RBX: ffff8881180b4c80 RCX: 0000000000000000
[ 53.982634][ C1] RDX: 0000000000000002 RSI: ffff8881180b4d3c RDI: ffff88810bc9cac2
[ 53.982634][ C1] RBP: ffff8881008f70b8 R08: ffff8881180b4cf4 R09: ffff8881180b4cf0
[ 53.982634][ C1] R10: ffffed1022999e5c R11: 0000000000000002 R12: 0000000000000590
[ 53.982634][ C1] R13: ffff88810f940c80 R14: ffff88810f940d50 R15: ffff88810bc9cac0
[ 53.982634][ C1] FS: 0000000000000000(0000) GS:ffff888235880000(0000) knlGS:0000000000000000
[ 53.982634][ C1] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 53.982634][ C1] CR2: 00007ff5f9b86680 CR3: 0000000108ce8004 CR4: 0000000000170ee0
[ 53.982634][ C1] Call Trace:
[ 53.982634][ C1] <TASK>
[ 53.982634][ C1] tcp_sacktag_walk+0xaba/0x18e0
[ 53.982634][ C1] tcp_sacktag_write_queue+0xe7b/0x3460
[ 53.982634][ C1] tcp_ack+0x2666/0x54b0
[ 53.982634][ C1] tcp_rcv_established+0x4d9/0x20f0
[ 53.982634][ C1] tcp_v4_do_rcv+0x551/0x810
[ 53.982634][ C1] tcp_v4_rcv+0x22ed/0x2ed0
[ 53.982634][ C1] ip_protocol_deliver_rcu+0x96/0xaf0
[ 53.982634][ C1] ip_local_deliver_finish+0x1e0/0x2f0
[ 53.982634][ C1] ip_sublist_rcv_finish+0x211/0x440
[ 53.982634][ C1] ip_list_rcv_finish.constprop.0+0x424/0x660
[ 53.982634][ C1] ip_list_rcv+0x2c8/0x410
[ 53.982634][ C1] __netif_receive_skb_list_core+0x65c/0x910
[ 53.982634][ C1] netif_receive_skb_list_internal+0x5f9/0xcb0
[ 53.982634][ C1] napi_complete_done+0x188/0x6e0
[ 53.982634][ C1] gro_cell_poll+0x10c/0x1d0
[ 53.982634][ C1] __napi_poll+0xa1/0x530
[ 53.982634][ C1] net_rx_action+0x567/0x1270
[ 53.982634][ C1] __do_softirq+0x28a/0x9ba
[ 53.982634][ C1] run_ksoftirqd+0x32/0x60
[ 53.982634][ C1] smpboot_thread_fn+0x559/0x8c0
[ 53.982634][ C1] kthread+0x3b9/0x490
[ 53.982634][ C1] ret_from_fork+0x22/0x30
[ 53.982634][ C1] </TASK>
Address the issue by skipping the GRO stage for shared or cloned skbs.
To reduce the chance of OoO, try to unclone the skbs before giving up.
v1 -> v2:
- use avoid skb_copy and fallback to netif_receive_skb - Eric
Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b5f61c5602aab01bac8d711d8d1bfab0a4817db7.1640197544.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Disable card detect during shutdown
MMC host:
- mmci: Fixup tuning support for stm32_sdmmc
- meson-mx-sdhc: Fix support for multi-block SDIO commands
- sdhci-tegra: Fix support for eMMC HS400ES mode"
* tag 'mmc-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mmci: stm32: clear DLYB_CR after sending tuning command
mmc: meson-mx-sdhc: Set MANUAL_STOP for multi-block SDIO commands
mmc: core: Disable card detect during shutdown
mmc: sdhci-tegra: Fix switch to HS400ES mode
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is my last set of fixes for 5.16, including
- multiple code fixes for the op-tee firmware driver
- Two patches for allwinner SoCs, one fixing the phy mode on a board,
the other one fixing a driver bug in the "RSB" bus driver. This was
originally targeted for 5.17, but seemed worth moving to 5.16
- Two small fixes for devicetree files on i.MX platforms, resolving
problems with ethernet and i2c"
* tag 'arm-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
optee: Suppress false positive kmemleak report in optee_handle_rpc()
tee: optee: Fix incorrect page free bug
arm64: dts: lx2160a: fix scl-gpios property name
tee: handle lookup of shm with reference count 0
ARM: dts: imx6qdl-wandboard: Fix Ethernet support
bus: sunxi-rsb: Fix shutdown
arm64: dts: allwinner: orangepi-zero-plus: fix PHY mode
|
|
While introduction of this menu brings a nice view in the configuration tools,
it brought more issues than solves, i.e. it prevents to locate files in the
intel/ subfolder without touching non-related Kconfig dependencies elsewhere.
Drop X86_PLATFORM_DRIVERS_INTEL altogether.
Note, on x86 it's enabled by default and it's quite unlikely anybody wants to
disable all of the modules in this submenu.
Fixes: 8bd836feb6ca ("platform/x86: intel_skl_int3472: Move to intel/ subfolder")
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211222194941.76054-1-andriy.shevchenko@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Certain functionality or its implementation in System76 EC firmware may
be different to the proprietary ODM EC firmware. Introduce a new bool,
`has_open_ec`, to guard our specific logic. Detect the use of this by
looking for a custom ACPI method name used in System76 firmware.
Signed-off-by: Tim Crawford <tcrawford@system76.com>
Link: https://lore.kernel.org/r/20211222185154.4560-1-tcrawford@system76.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out'
where 'flow_flag_set(flow, FAILED);' is called.
All but the new error handling paths added by the commits given in the
Fixes tag below.
Fix these error handling paths and branch to 'err_out'.
Fixes: 166f431ec6be ("net/mlx5e: Add indirect tc offload of ovs internal port")
Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
(cherry picked from commit 31108d142f3632970f6f3e0224bd1c6781c9f87d)
|
|
When there is ct or sample action, the ct or sample rule will be deleted
and return. But if there is an extra mirror action, the forward rule can't
be deleted because of the return.
Fix it by removing the return.
Fixes: 69e2916ebce4 ("net/mlx5: CT: Add support for mirroring")
Fixes: f94d6389f6a8 ("net/mlx5e: TC, Add support to offload sample action")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
There are two ICOSQs per channel: one is needed for RX, and the other
for async operations (XSK TX, kTLS offload). Currently, the recovery
flow for both is the same, and async ICOSQ is mistakenly treated like
the regular ICOSQ.
This patch prevents running the regular ICOSQ recovery on async ICOSQ.
The purpose of async ICOSQ is to handle XSK wakeup requests and post
kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs,
so the regular recovery sequence is not applicable here.
Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing
recovery for the ICOSQ, don't forget to deactivate XSKRQ.
XSK can be opened and closed while channels are active, so a new mutex
prevents the ICOSQ recovery from running at the same time. The ICOSQ
recovery deactivates and reactivates XSKRQ, so any parallel change in
XSK state would break consistency. As the regular RQ is running, it's
not enough to just flush the recovery work, because it can be
rescheduled.
Fixes: be5323c8379f ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in
Kconfig), the mlx5e_rep_tc_receive() function which is responsible for
passing the skb to the stack (or freeing it) is defined as a nop, and
results in leaking the skb memory. Replace the nop with a call to
napi_gro_receive() to resolve the leak.
Fixes: 28e7606fa8f1 ("net/mlx5e: Refactor rx handler of represetor device")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.
mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
[mlx5_core]
Call Trace:
mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
devlink_health_do_dump.part.91+0x71/0xd0
devlink_health_report+0x157/0x1b0
mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
[mlx5_core]
? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
? update_load_avg+0x19b/0x550
? set_next_entity+0x72/0x80
? pick_next_task_fair+0x227/0x340
? finish_task_switch+0xa2/0x280
mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
process_one_work+0x1de/0x3a0
worker_thread+0x2d/0x3c0
? process_one_work+0x3a0/0x3a0
kthread+0x115/0x130
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x30
--[ end trace 51ccabea504edaff ]---
RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
end Kernel panic - not syncing: Fatal exception
To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.
Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Only prio 1 is supported if firmware doesn't support ignore flow
level for nic mode. The offending commit removed the check wrongly.
Add it back.
Fixes: 9a99c8f1253a ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
SF do not directly control the PCI device. During recovery flow SF
should not be allowed to do pci disable or pci reset, its PF will do it.
It fixes the following kernel trace:
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
status bad resource state(0x9), syndrome (0x658908)
mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed
Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
In case IRQ layer failed to find or to request irq, the driver is
printing the first cpu of the provided affinity as part of the error
print. Empty affinity is a valid input for the IRQ layer, and it is
an error to call cpumask_first() on empty affinity.
Remove the first cpu print from the error message.
Fixes: c36326d38d93 ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Hard coded CPU (0 in our case) might be offline. Hence, use the first
online CPU instead.
Fixes: f891b7cdbdcd ("net/mlx5: Enable single IRQ for PCI Function")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
On BlueField the E-Switch manager is the ECPF (vport 0xFFFE), but when
querying capabilities of ECPF eswitch manager, need to query vport 0
with other_vport = 0.
Fixes: 9091b821aaa4 ("net/mlx5: DR, Handle eswitch manager and uplink vports separately")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The mlx5_get_uars_page() function returns error pointers.
Using IS_ERR() to check the return value to fix this.
Fixes: 4ec9e7b02697 ("net/mlx5: DR, Expose steering domain functionality")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The PVSCSI implementation in the VMware hypervisor under specific
configuration ("SCSI Bus Sharing" set to "Physical") returns zero dataLen
in the completion descriptor for READ CAPACITY(16). As a result, the kernel
can not detect proper disk geometry. This can be recognized by the kernel
message:
[ 0.776588] sd 1:0:0:0: [sdb] Sector size 0 reported, assuming 512.
The PVSCSI implementation in QEMU does not set dataLen at all, keeping it
zeroed. This leads to a boot hang as was reported by Shmulik Ladkani.
It is likely that the controller returns the garbage at the end of the
buffer. Residual length should be set by the driver in that case. The SCSI
layer will erase corresponding data. See commit bdb2b8cab439 ("[SCSI] erase
invalid data returned by device") for details.
Commit e662502b3a78 ("scsi: vmw_pvscsi: Set correct residual data length")
introduced the issue by setting residual length unconditionally, causing
the SCSI layer to erase the useful payload beyond dataLen when this value
is returned as 0.
As a result, considering existing issues in implementations of PVSCSI
controllers, we do not want to call scsi_set_resid() when dataLen ==
0. Calling scsi_set_resid() has no effect if dataLen equals buffer length.
Link: https://lore.kernel.org/lkml/20210824120028.30d9c071@blondie/
Link: https://lore.kernel.org/r/20211220190514.55935-1-amakhalov@vmware.com
Fixes: e662502b3a78 ("scsi: vmw_pvscsi: Set correct residual data length")
Cc: Matt Wang <wwentao@vmware.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Vishal Bhakta <vbhakta@vmware.com>
Cc: VMware PV-Drivers <pv-drivers@vmware.com>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: stable@vger.kernel.org
Reported-and-suggested-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|