Age | Commit message (Collapse) | Author |
|
Pull kvm fixes from Paolo Bonzini:
"Changes that were posted too late for 6.1, or after the release.
x86:
- several fixes to nested VMX execution controls
- fixes and clarification to the documentation for Xen emulation
- do not unnecessarily release a pmu event with zero period
- MMU fixes
- fix Coverity warning in kvm_hv_flush_tlb()
selftests:
- fixes for the ucall mechanism in selftests
- other fixes mostly related to compilation with clang"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (41 commits)
KVM: selftests: restore special vmmcall code layout needed by the harness
Documentation: kvm: clarify SRCU locking order
KVM: x86: fix deadlock for KVM_XEN_EVTCHN_RESET
KVM: x86/xen: Documentation updates and clarifications
KVM: x86/xen: Add KVM_XEN_INVALID_GPA and KVM_XEN_INVALID_GFN to uapi
KVM: x86/xen: Simplify eventfd IOCTLs
KVM: x86/xen: Fix SRCU/RCU usage in readers of evtchn_ports
KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly
KVM: x86/xen: Fix memory leak in kvm_xen_write_hypercall_page()
KVM: Delete extra block of "};" in the KVM API documentation
kvm: x86/mmu: Remove duplicated "be split" in spte.h
kvm: Remove the unused macro KVM_MMU_READ_{,UN}LOCK()
MAINTAINERS: adjust entry after renaming the vmx hyperv files
KVM: selftests: Mark correct page as mapped in virt_map()
KVM: arm64: selftests: Don't identity map the ucall MMIO hole
KVM: selftests: document the default implementation of vm_vaddr_populate_bitmap
KVM: selftests: Use magic value to signal ucall_alloc() failure
KVM: selftests: Disable "gnu-variable-sized-type-not-at-end" warning
KVM: selftests: Include lib.mk before consuming $(CC)
KVM: selftests: Explicitly disable builtins for mem*() overrides
...
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.2
- fix various problems in handling the Command Supported and Effects log
(Christoph Hellwig)
- don't allow unprivileged passthrough of commands that don't transfer
data but modify logical block content (Christoph Hellwig)
- add a features and quirks policy document (Christoph Hellwig)
- fix some really nasty code that was correct but made smatch complain
(Sagi Grimberg)"
* tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme:
nvme-auth: fix smatch warning complaints
nvme: consult the CSE log page for unprivileged passthrough
nvme: also return I/O command effects from nvme_command_effects
nvmet: don't defer passthrough commands with trivial effects to the workqueue
nvmet: set the LBCC bit for commands that modify data
nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it
nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
docs, nvme: add a feature and quirk policy document
|
|
Many of the structs recently added to track field info for linked-list
head are useful as-is for rbtree root. So let's do a mechanical renaming
of list_head-related types and fields:
include/linux/bpf.h:
struct btf_field_list_head -> struct btf_field_graph_root
list_head -> graph_root in struct btf_field union
kernel/bpf/btf.c:
list_head -> graph_root in struct btf_field_info
This is a nonfunctional change, functionality to actually use these
fields for rbtree will be added in further patches.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20221217082506.1570898-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add few static text to explain how one can bring up the search dialog
box by pressing the forward slash key anywhere on this interface.
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Instead of counting on prior allocations to have sized allocations to
the next kmalloc bucket size, always perform a krealloc that is at least
ksize(dst) in size (which is a no-op), so the size can be correctly
tracked by all the various allocation size trackers (KASAN,
__alloc_size, etc).
Reported-by: Hyunwoo Kim <v4bel@theori.io>
Link: https://lore.kernel.org/bpf/20221223094551.GA1439509@ubuntu
Fixes: ceb35b666d42 ("bpf/verifier: Use kmalloc_size_roundup() to match ksize() usage")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221223182836.never.866-kees@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Kui-Feng Lee says:
====================
This issue is related to task iterators over vma. A system crash can
occur when a task iterator travels through vma of tasks as the death
of a task will clear the pointer to its mm, even though the
task_struct is still held. As a result, an unexpected crash happens
due to a null pointer. To address this problem, a reference to mm is
kept on the iterator to make sure that the pointer is always
valid. This patch set provides a solution for this crash by properly
referencing mm on task iterators over vma.
The major changes from v1 are:
- Fix commit logs of the test case.
- Use reverse Christmas tree coding style.
- Remove unnecessary error handling for time().
v1: https://lore.kernel.org/bpf/20221216015912.991616-1-kuifeng@meta.com/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When a task iterator traverses vma(s), it is possible task->mm might
become invalid in the middle of traversal and this may cause kernel
misbehave (e.g., crash)
This test case creates iterators repeatedly and forks short-lived
processes in the background to detect this bug. The test will last
for 3 seconds to get the chance to trigger the issue.
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221216221855.4122288-3-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix the system crash that happens when a task iterator travel through
vma of tasks.
In task iterators, we used to access mm by following the pointer on
the task_struct; however, the death of a task will clear the pointer,
even though we still hold the task_struct. That can cause an
unexpected crash for a null pointer when an iterator is visiting a
task that dies during the visit. Keeping a reference of mm on the
iterator ensures we always have a valid pointer to mm.
Co-developed-by: Song Liu <song@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Reported-by: Nathan Slingerland <slinger@meta.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221216221855.4122288-2-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In the ensure_good_fd function, if the fcntl function succeeds but
the close function fails, ensure_good_fd returns a normal fd and
sets errno, which may cause users to misunderstand. The close
failure is not a serious problem, and the correct FD has been
handed over to the upper-layer application. Let's restore errno here.
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Link: https://lore.kernel.org/r/20221223133618.10323-1-liuxin350@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 7443b296e699 ("x86/percpu: Move cpu_number next to current_task")
moved global per_cpu variable 'cpu_number' into pcpu_hot structure.
Therefore this part of var_data test is no longer valid.
Disable it until better solution is found.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In the scenario where livepatch and kretfunc coexist, the pageattr of
im->image is rox after arch_prepare_bpf_trampoline in
bpf_trampoline_update, and then modify_fentry or register_fentry returns
-EAGAIN from bpf_tramp_ftrace_ops_func, the BPF_TRAMP_F_ORIG_STACK flag
will be configured, and arch_prepare_bpf_trampoline will be re-executed.
At this time, because the pageattr of im->image is rox,
arch_prepare_bpf_trampoline will read and write im->image, which causes
a fault. as follows:
insmod livepatch-sample.ko # samples/livepatch/livepatch-sample.c
bpftrace -e 'kretfunc:cmdline_proc_show {}'
BUG: unable to handle page fault for address: ffffffffa0206000
PGD 322d067 P4D 322d067 PUD 322e063 PMD 1297e067 PTE d428061
Oops: 0003 [#1] PREEMPT SMP PTI
CPU: 2 PID: 270 Comm: bpftrace Tainted: G E K 6.1.0 #5
RIP: 0010:arch_prepare_bpf_trampoline+0xed/0x8c0
RSP: 0018:ffffc90001083ad8 EFLAGS: 00010202
RAX: ffffffffa0206000 RBX: 0000000000000020 RCX: 0000000000000000
RDX: ffffffffa0206001 RSI: ffffffffa0206000 RDI: 0000000000000030
RBP: ffffc90001083b70 R08: 0000000000000066 R09: ffff88800f51b400
R10: 000000002e72c6e5 R11: 00000000d0a15080 R12: ffff8880110a68c8
R13: 0000000000000000 R14: ffff88800f51b400 R15: ffffffff814fec10
FS: 00007f87bc0dc780(0000) GS:ffff88803e600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa0206000 CR3: 0000000010b70000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
bpf_trampoline_update+0x25a/0x6b0
__bpf_trampoline_link_prog+0x101/0x240
bpf_trampoline_link_prog+0x2d/0x50
bpf_tracing_prog_attach+0x24c/0x530
bpf_raw_tp_link_attach+0x73/0x1d0
__sys_bpf+0x100e/0x2570
__x64_sys_bpf+0x1c/0x30
do_syscall_64+0x5b/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
With this patch, when modify_fentry or register_fentry returns -EAGAIN
from bpf_tramp_ftrace_ops_func, the pageattr of im->image will be reset
to nx+rw.
Cc: stable@vger.kernel.org
Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221224133146.780578-1-nashuiliang@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 0d4e8ed139d8 ("net/mlx5: Lag, avoid lockdep warnings")
accidentally removed a call to cancel delayed bond work thus it may
cause queued delay to expire and fall on an already destroyed work
queue.
Fix by restoring the call cancel_delayed_work_sync() before
destroying the workqueue.
This prevents call trace such as this:
[ 329.230417] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 329.231444] #PF: supervisor write access in kernel mode
[ 329.232233] #PF: error_code(0x0002) - not-present page
[ 329.233007] PGD 0 P4D 0
[ 329.233476] Oops: 0002 [#1] SMP
[ 329.234012] CPU: 5 PID: 145 Comm: kworker/u20:4 Tainted: G OE 6.0.0-rc5_mlnx #1
[ 329.235282] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 329.236868] Workqueue: mlx5_cmd_0000:08:00.1 cmd_work_handler [mlx5_core]
[ 329.237886] RIP: 0010:_raw_spin_lock+0xc/0x20
[ 329.238585] Code: f0 0f b1 17 75 02 f3 c3 89 c6 e9 6f 3c 5f ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 02 f3 c3 89 c6 e9 45 3c 5f ff 0f 1f 44 00 00 0f 1f
[ 329.241156] RSP: 0018:ffffc900001b0e98 EFLAGS: 00010046
[ 329.241940] RAX: 0000000000000000 RBX: ffffffff82374ae0 RCX: 0000000000000000
[ 329.242954] RDX: 0000000000000001 RSI: 0000000000000014 RDI: 0000000000000000
[ 329.243974] RBP: ffff888106ccf000 R08: ffff8881004000c8 R09: ffff888100400000
[ 329.244990] R10: 0000000000000000 R11: ffffffff826669f8 R12: 0000000000002000
[ 329.246009] R13: 0000000000000005 R14: ffff888100aa7ce0 R15: ffff88852ca80000
[ 329.247030] FS: 0000000000000000(0000) GS:ffff88852ca80000(0000) knlGS:0000000000000000
[ 329.248260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 329.249111] CR2: 0000000000000000 CR3: 000000016d675001 CR4: 0000000000770ee0
[ 329.250133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 329.251152] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 329.252176] PKRU: 55555554
Fixes: 0d4e8ed139d8 ("net/mlx5: Lag, avoid lockdep warnings")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited patch added support of matching on geneve option by setting
geneve_tlv_option_0_data mask and key but didn't set geneve_tlv_option_0_exist
bit which is required on some HWs when matching geneve_tlv_option_0_data parameter,
this may cause in some cases for packets to wrongly match on rules with different
geneve option.
Example of such case is packet with geneve_tlv_object class=789 and data=456
will wrongly match on rule with match geneve_tlv_object class=123 and data=456.
Fix it by setting geneve_tlv_option_0_exist bit when supported by the HW when matching
on geneve_tlv_option_0_data parameter.
Fixes: 9272e3df3023 ("net/mlx5e: Geneve, Add support for encap/decap flows offload")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Current xdp xmit functions logic (mlx5e_xmit_xdp_frame_mpwqe or
mlx5e_xmit_xdp_frame), validates xdp packet length by comparing it to
hw mtu (configured at xdp sq allocation) before xmiting it. This check
does not account for ethernet fcs length (calculated and filled by the
nic). Hence, when we try sending packets with length > (hw-mtu -
ethernet-fcs-size), the device port drops it and tx_errors_phy is
incremented. Desired behavior is to catch these packets and drop them
by the driver.
Fix this behavior in XDP SQ allocation function (mlx5e_alloc_xdpsq) by
subtracting ethernet FCS header size (4 Bytes) from current hw mtu
value, since ethernet FCS is calculated and written to ethernet frames
by the nic.
Fixes: d8bec2b29a82 ("net/mlx5e: Support bpf_xdp_adjust_head()")
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit introduced a bug for multiple encapsulations flow.
If one dest encap becomes invalid, the flow is set slow path flag.
But when other dests encap become invalid, they are not cleared due
to slow path flag of the flow. When neigh-update-add is running, it
will use invalid encap.
Fix it by checking slow path flag after clearing dest encap.
Fixes: 9a5f9cc794e1 ("net/mlx5e: Fix possible use-after-free deleting fdb rule")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Need to use sprintf to build a string instead of sscanf. Otherwise
dirname is null and both "ct_nic" and "ct_fdb" won't be created.
But its redundant anyway as driver could be in switchdev mode but
still add nic rules. So use "ct" as folder name.
Fixes: 77422a8f6f61 ("net/mlx5e: CT: Add ct driver counters")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
RX reporter mistakenly reads from the regular (inactive) RQ
when XSK RQ is active. Fix it here.
Fixes: 3db4c85cde7a ("net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
mlx5e_build_nic_params will turn CQE compression on if the hardware
capability is enabled and the slow_pci_heuristic condition is detected.
As IPoIB doesn't support CQE compression, make sure to disable the
feature in the IPoIB profile init.
Please note that the feature is not exposed to the user for IPoIB
interfaces, so it can't be subsequently turned on.
Fixes: b797a684b0dd ("net/mlx5e: Enable CQE compression when PCI is slower than link")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
mlx5 PF can disable RoCE for its VFs and SFs. In such case RoCE is
marked as unsupported on those VFs/SFs.
The cited patch added an option for disable (and enable) RoCE at HCA
level. However, that commit didn't check whether RoCE is supported on
the HCA and enabled user to try and set RoCE to on.
Fix it by checking whether the HCA supports RoCE.
Fixes: fbfa97b4d79f ("net/mlx5: Disable roce at HCA level")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Currently, recovery is done without considering whether the device is
still in probe flow.
This may lead to recovery before device have finished probed
successfully. e.g.: while mlx5_init_one() is running. Recovery flow is
using functionality that is loaded only by mlx5_init_one(), and there
is no point in running recovery without mlx5_init_one() finished
successfully.
Fix it by waiting for probe flow to finish and checking whether the
device is probed before trying to perform recovery.
Fixes: 51d138c2610a ("net/mlx5: Fix health error state handling")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
io_eq_size and event_eq_size params are of param type
DEVLINK_PARAM_TYPE_U32. But, the validation callback is addressing them
as DEVLINK_PARAM_TYPE_U16.
This cause mismatch in validation in big-endian systems, in which
values in range were rejected while 268500991 was accepted.
Fix it by checking the U32 value in the validation callback.
Fixes: 0844fa5f7b89 ("net/mlx5: Let user configure io_eq_size param")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
There are two cleanup calls missing in mlx5_init_once() error path.
Add them making the error path flow to be the same as
mlx5_cleanup_once().
Fixes: 52ec462eca9b ("net/mlx5: Add reserved-gids support")
Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix SRIOV VST mode behavior to insert cvlan when a guest tag is already
present in the frame. Previous VST mode behavior was to drop packets or
override existing tag, depending on the device version.
In this patch we fix this behavior by correctly building the HW steering
rule with a push vlan action, or for older devices we ask the FW to stack
the vlan when a vlan is already present.
Fixes: 07bab9502641 ("net/mlx5: E-Switch, Refactor eswitch ingress acl codes")
Fixes: dfcb1ed3c331 ("net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When initializing auth context, there may be no secrets passed
by the user. Make return code explicit when returning successfully.
smatch warnings:
drivers/nvme/host/auth.c:950 nvme_auth_init_ctrl() warn: missing error code? 'ret'
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Commands like Write Zeros can change the contents of a namespaces without
actually transferring data. To protect against this, check the Commands
Supported and Effects log is supported by the controller for any
unprivileg command passthrough and refuse unprivileged passthrough if the
command has any effects that can change data or metadata.
Note: While the Commands Support and Effects log page has only been
mandatory since NVMe 2.0, it is widely supported because Windows requires
it for any command passthrough from userspace.
Fixes: e4fbcf32c860 ("nvme: identify-namespace without CAP_SYS_ADMIN")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
|
|
To be able to use the Commands Supported and Effects Log for allowing
unprivileged passtrough, it needs to be corretly reported for I/O
commands as well. Return the I/O command effects from
nvme_command_effects, and also add a default list of effects for the
NVM command set. For other command sets, the Commands Supported and
Effects log is required to be present already.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
|
|
Mask out the "Command Supported" and "Logical Block Content Change" bits
and only defer execution of commands that have non-trivial effects to
the workqueue for synchronous execution. This allows to execute admin
commands asynchronously on controllers that provide a Command Supported
and Effects log page, and will keep allowing to execute Write commands
asynchronously once command effects on I/O commands are taken into
account.
Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
|
|
Write, Write Zeroes, Zone append and a Zone Reset through
Zone Management Send modify the logical block content of a namespace,
so make sure the LBCC bit is reported for them.
Fixes: b5d0b38c0475 ("nvmet: add Command Set Identifier support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
|
Use NVME_CMD_EFFECTS_CSUPP instead of open coding it and assign a
single value to multiple array entries instead of repeated assignments.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
|
3 << 16 does not generate the correct mask for bits 16, 17 and 18.
Use the GENMASK macro to generate the correct mask instead.
Fixes: 84fef62d135b ("nvme: check admin passthru command effects")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
|
|
This adds a document about what specification features are supported by
the Linux NVMe driver, and what qualifies for a quirk if an implementation
has problems following the specification.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
|
|
The recent code refactoring for HD-audio HDMI codec driver caused a
regression on AMD/ATI HDMI codecs; namely, PulseAudioand pipewire
don't recognize HDMI outputs any longer while the direct output via
ALSA raw access still works.
The problem turned out that, after the code refactoring, the driver
assumes only the dynamic PCM assignment, and when a PCM stream that
still isn't assigned to any pin gets opened, the driver tries to
assign any free converter to the PCM stream. This behavior is OK for
Intel and other codecs, as they have arbitrary connections between
pins and converters. OTOH, on AMD chips that have a 1:1 mapping
between pins and converters, this may end up with blocking the open of
the next PCM stream for the pin that is tied with the formerly taken
converter.
Also, with the code refactoring, more PCM streams are exposed than
necessary as we assume all converters can be used, while this isn't
true for AMD case. This may change the PCM stream assignment and
confuse users as well.
This patch fixes those problems by:
- Introducing a flag spec->static_pcm_mapping, and if it's set, the
driver applies the static mapping between pins and converters at the
probe time
- Limiting the number of PCM streams per pins, too; this avoids the
superfluous PCM streams
Fixes: ef6f5494faf6 ("ALSA: hda/hdmi: Use only dynamic PCM device allocation")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216836
Co-developed-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20221228125714.16329-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
x86:
* several fixes to nested VMX execution controls
* fixes and clarification to the documentation for Xen emulation
* do not unnecessarily release a pmu event with zero period
* MMU fixes
* fix Coverity warning in kvm_hv_flush_tlb()
selftests:
* fixes for the ucall mechanism in selftests
* other fixes mostly related to compilation with clang
|
|
Commit 8fda37cf3d41 ("KVM: selftests: Stuff RAX/RCX with 'safe' values
in vmmcall()/vmcall()", 2022-11-21) broke the svm_nested_soft_inject_test
because it placed a "pop rbp" instruction after vmmcall. While this is
correct and mimics what is done in the VMX case, this particular test
expects a ud2 instruction right after the vmmcall, so that it can skip
over it in the L1 part of the test.
Inline a suitably-modified version of vmmcall() to restore the
functionality of the test.
Fixes: 8fda37cf3d41 ("KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()"
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221130181147.9911-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Rudi reports a compilation failure on x86_64 when CONFIG_NET_CLS or
CONFIG_NET_CLS_ACT is not set but CONFIG_RETPOLINE is set.
A misplaced '#endif' was causing the issue.
Fixes: 7f0e810220e2 ("net/sched: add retpoline wrapper for tc")
Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Follow the advice of the Documentation/filesystems/sysfs.rst
and show() should only use sysfs_emit() or sysfs_emit_at()
when formatting the value to be returned to user space.
Signed-off-by: Xuezhi Zhang <zhangxuezhi1@coolpad.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Chunhao Lin says:
====================
r8169: fix dmar pte write access is not set error
This series fixes dmar pte write access is not set error.
Chunhao Lin (2):
r8169: move rtl_wol_enable_rx() and rtl_prepare_power_down()
r8169: fix dmar pte write access is not set error
v2:
-update commit message
-adjust the code according to current kernel code
v3:
-update title and commit message
-split the patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When close device, if wol is enabled, rx will be enabled. When open
device it will cause rx packet to be dma to the wrong memory address
after pci_set_master() and system log will show blow messages.
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write] Request device [02:00.0] PASID ffffffff fault addr
ffdd4000 [fault reason 05] PTE Write access is not set
In this patch, driver disable tx/rx when close device. If wol is
enabled, only enable rx filter and disable rxdv_gate(if support) to
let hardware only receive packet to fifo but not to dma it.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no functional change. Moving these two functions for following
patch "r8169: fix dmar pte write access is not set error".
Signed-off-by: Chunhao Lin <hau@realtek.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniil Tatianin says:
====================
net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers
This series fixes a potential NULL dereference in ethtool_get_phy_stats
while also attempting to refactor/split said function into multiple
helpers so that it's easier to reason about what's going on.
I've taken Andrew Lunn's suggestions on the previous version of this
patch and added a bit of my own.
Changes since v1:
- Remove an extra newline in the first patch
- Move WARN_ON_ONCE into the if check as it already returns the
result of the comparison
- Actually split ethtool_get_phy_stats instead of attempting to
refactor it
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
So that it's easier to follow and make sense of the branching and
various conditions.
Stats retrieval has been split into two separate functions
ethtool_get_phy_stats_phydev & ethtool_get_phy_stats_ethtool.
The former attempts to retrieve the stats using phydev & phy_ops, while
the latter uses ethtool_ops.
Actual n_stats validation & array allocation has been moved into a new
ethtool_vzalloc_stats_array helper.
This also fixes a potential NULL dereference of
ops->get_ethtool_phy_stats where it was getting called in an else branch
unconditionally without making sure it was actually present.
Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that we always early return if we don't have any stats we can remove
these checks as they're no longer necessary.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's not very useful to copy back an empty ethtool_stats struct and
return 0 if we didn't actually have any stats. This also allows for
further simplification of this function in the future commits.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently only the locking order of SRCU vs kvm->slots_arch_lock
and kvm->slots_lock is documented. Extend this to kvm->lock
since Xen emulation got it terribly wrong.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
While KVM_XEN_EVTCHN_RESET is usually called with no vCPUs running,
if that happened it could cause a deadlock. This is due to
kvm_xen_eventfd_reset() doing a synchronize_srcu() inside
a kvm->lock critical section.
To avoid this, first collect all the evtchnfd objects in an
array and free all of them once the kvm->lock critical section
is over and th SRCU grace period has expired.
Reported-by: Michal Luczaj <mhal@rbox.co>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The virtblk_map_data() function returns negative error codes, however, the
'nents' field of vbr->sg_table is an unsigned int, which causes the error
handling not to work correctly.
Cc: stable@vger.kernel.org
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Suwan Kim <suwan.kim027@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
In the receive_filter(), should not drop the packet with the
broadcast/multicast address. Add the check for this
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221214054306.24145-1-lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
After commit bda324fd037a ("vdpasim: control virtqueue support"),
vdpasim->iommu became an array of IOTLB, so we should clean the
mappings of each free one by one instead of just deleting the ranges
in the first IOTLB which may leak maps.
Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221213090717.61529-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gautam Dawar <gautam.dawar@amd.com>
|
|
For the device without multiqueue feature, we will read 0 as
max_virtqueue_pairs from the config. So if we fill
VDPA_ATTR_DEV_NET_CFG_MAX_VQP with the value we read from the config
we will confuse the user.
Fixing this by only filling the value when multiqueue is offered by
the device so userspace can assume 1 when the attr is not provided.
Fixes: 13b00b135665c("vdpa: Add support for querying vendor statistics")
Cc: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220907060110.4511-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
|
|
In vp_vdpa_remove(), the code kfree(&vp_vdpa_mgtdev->mgtdev.id_table) uses
a reference of pointer as the argument of kfree, which is the wrong pointer
and then may hit crash like this:
Unable to handle kernel paging request at virtual address 00ffff003363e30c
Internal error: Oops: 96000004 [#1] SMP
Call trace:
rb_next+0x20/0x5c
ext4_readdir+0x494/0x5c4 [ext4]
iterate_dir+0x168/0x1b4
__se_sys_getdents64+0x68/0x170
__arm64_sys_getdents64+0x24/0x30
el0_svc_common.constprop.0+0x7c/0x1bc
do_el0_svc+0x2c/0x94
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb4
el0_sync+0x160/0x180
Code: 54000220 f9400441 b4000161 aa0103e0 (f9400821)
SMP: stopping secondary CPUs
Starting crashdump kernel...
Fixes: ffbda8e9df10 ("vdpa/vp_vdpa : add vdpa tool support in vp_vdpa")
Signed-off-by: Rong Wang <wangrong68@huawei.com>
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Message-Id: <20221207120813.2837529-1-sunnanyong@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cindy Lu <lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|