Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Select some options in Kconfig
- Give a chance to build with !CONFIG_SMP
- Switch to use built-in rustc target
- Add new supported device nodes to dts
- Some bug fixes and other small changes
- Update the default config file
* tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Update Loongson-3 default config file
LoongArch: dts: Add new supported device nodes to Loongson-2K2000
LoongArch: dts: Add new supported device nodes to Loongson-2K0500
LoongArch: dts: Remove "disabled" state of clock controller node
LoongArch: rust: Switch to use built-in rustc target
LoongArch: Fix callchain parse error with kernel tracepoint events again
LoongArch: Give a chance to build with !CONFIG_SMP
LoongArch: Select THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE
LoongArch: Select ARCH_WANT_DEFAULT_BPF_JIT
LoongArch: Select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
LoongArch: Select ARCH_HAS_FAST_MULTIPLIER
|
|
This loop is supposed to check if ext->subset_ext_ids[j] is valid, rather
than if ext->subset_ext_ids[i] is valid, before setting the extension
id ext->subset_ext_ids[j] in isainfo->isa.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Fixes: 0d8295ed975b ("riscv: add ISA extension parsing for scalar crypto")
Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-2-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The riscv_cpuinfo struct that contains mvendorid and marchid is not
populated until all harts are booted which happens after the DT parsing.
Use the mvendorid/marchid from the boot hart to determine if the DT
contains an invalid V.
Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-1-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Pull microblaze updates from Michal Simek:
- Cleanup code around removed early_printk
* tag 'microblaze-v6.10' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Remove early printk call from cpuinfo-static.c
microblaze: Remove gcc flag for non existing early_printk.c file
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs
Pull overlayfs updates from Miklos Szeredi:
- Add tmpfile support
- Clean up include
* tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
ovl: remove duplicate included header
ovl: remove upper umask handling from ovl_create_upper()
ovl: implement tmpfile
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Add fs-verity support (Richard Fung)
- Add multi-queue support to virtio-fs (Peter-Jan Gootzen)
- Fix a bug in NOTIFY_RESEND handling (Hou Tao)
- page -> folio cleanup (Matthew Wilcox)
* tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
virtio-fs: add multi-queue support
virtio-fs: limit number of request queues
fuse: clear FR_SENT when re-adding requests into pending list
fuse: set FR_PENDING atomically in fuse_resend()
fuse: Add initial support for fs-verity
fuse: Convert fuse_readpages_end() to use folio_end_read()
|
|
__kernel_map_pages() is a debug function which clears the valid bit in page
table entry for deallocated pages to detect illegal memory accesses to
freed pages.
This function set/clear the valid bit using __set_memory(). __set_memory()
acquires init_mm's semaphore, and this operation may sleep. This is
problematic, because __kernel_map_pages() can be called in atomic context,
and thus is illegal to sleep. An example warning that this causes:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c
Rewrite this function with apply_to_existing_page_range(). It is fine to
not have any locking, because __kernel_map_pages() works with pages being
allocated/deallocated and those pages are not changed by anyone else in the
meantime.
Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
debug_pagealloc is a debug feature which clears the valid bit in page table
entry for freed pages to detect illegal accesses to freed memory.
For this feature to work, virtual mapping must have PAGE_SIZE resolution.
(No, we cannot map with huge pages and split them only when needed; because
pages can be allocated/freed in atomic context and page splitting cannot be
done in atomic context)
Force linear mapping to use small pages if debug_pagealloc is enabled.
Note that it is not necessary to force the entire linear mapping, but only
those that are given to memory allocator. Some parts of memory can keep
using huge page mapping (for example, kernel's executable code). But these
parts are minority, so keep it simple. This is just a debug feature, some
extra overhead should be acceptable.
Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/2e391fa6c6f9b3fcf1b41cefbace02ee4ab4bf59.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
[Why]
The vram width value is 0.
Because the integratedsysteminfo table in VBIOS has updated to 2.3.
[How]
Driver needs a new intergrated info v2.3 table too.
Then the vram width value will be correct.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Our applications, built on Elasticsearch[0], frequently create and
delete files. These applications operate within containers, some with a
memory limit exceeding 100GB. Over prolonged periods, the accumulation
of negative dentries within these containers can amount to tens of
gigabytes.
Upon container exit, directories are deleted. However, due to the
numerous associated dentries, this process can be time-consuming. Our
users have expressed frustration with this prolonged exit duration,
which constitutes our first issue.
Simultaneously, other processes may attempt to access the parent
directory of the Elasticsearch directories. Since the task responsible
for deleting the dentries holds the inode lock, processes attempting
directory lookup experience significant delays. This issue, our second
problem, is easily demonstrated:
- Task 1 generates negative dentries:
$ pwd
~/test
$ mkdir es && cd es/ && ./create_and_delete_files.sh
[ After generating tens of GB dentries ]
$ cd ~/test && rm -rf es
[ It will take a long duration to finish ]
- Task 2 attempts to lookup the 'test/' directory
$ pwd
~/test
$ ls
The 'ls' command in Task 2 experiences prolonged execution as Task 1
is deleting the dentries.
We've devised a solution to address both issues by deleting associated
dentry when removing a file. Interestingly, we've noted that a similar
patch was proposed years ago[1], although it was rejected citing the
absence of tangible issues caused by negative dentries. Given our
current challenges, we're resubmitting the proposal. All relevant
stakeholders from previous discussions have been included for reference.
Some alternative solutions are also under discussion[2][3], such as
shrinking child dentries outside of the parent inode lock or even
asynchronously shrinking child dentries. However, given the
straightforward nature of the current solution, I believe this approach
is still necessary.
[ NOTE! This is a pretty fundamental change in how we deal with
unlinking dentries, and it doesn't change the fact that you can have
lots of negative dentries from just doing negative lookups.
But the kernel test robot is at least initially happy with this from a
performance angle, so I'm applying this ASAP just to get more testing
and as a "known fix for an issue people hit in real life".
Put another way: we should still look at the alternatives, and this
patch may get reverted if somebody finds a performance regression on
some other load. - Linus ]
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://github.com/elastic/elasticsearch [0]
Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [1]
Link: https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [2]
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ [3]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Wangkai <wangkai86@huawei.com>
Cc: Colin Walters <walters@verbum.org>
Tested-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
[bug]
In the virtio_pci_common.c function vp_del_vqs, vp_dev->is_avq is involved
to determine whether it is admin virtqueue, but this function vp_dev->is_avq
may be empty. For installations, virtio_pci_legacy does not assign a value
to vp_dev->is_avq.
[fix]
Check whether it is vp_dev->is_avq before use.
[test]
Test with virsh Attach device
Before this patch, the following command would crash the guest system
After applying the patch, everything seems to be working fine.
Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
Message-Id: <1710566754-3532-1-git-send-email-zhanglikernel@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This adds support for virtio-net to vduse.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
When request_irq() fails, error path calls vp_del_vqs(). There, as vq is
present in the list, free_irq() is called for the same vector. That
causes following splat:
[ 0.414355] Trying to free already-free IRQ 27
[ 0.414403] WARNING: CPU: 1 PID: 1 at kernel/irq/manage.c:1899 free_irq+0x1a1/0x2d0
[ 0.414510] Modules linked in:
[ 0.414540] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.0-rc4+ #27
[ 0.414540] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
[ 0.414540] RIP: 0010:free_irq+0x1a1/0x2d0
[ 0.414540] Code: 1e 00 48 83 c4 08 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 90 8b 74 24 04 48 c7 c7 98 80 6c b1 e8 00 c9 f7 ff 90 <0f> 0b 90 90 48 89 ee 4c 89 ef e8 e0 20 b8 00 49 8b 47 40 48 8b 40
[ 0.414540] RSP: 0000:ffffb71480013ae0 EFLAGS: 00010086
[ 0.414540] RAX: 0000000000000000 RBX: ffffa099c2722000 RCX: 0000000000000000
[ 0.414540] RDX: 0000000000000000 RSI: ffffb71480013998 RDI: 0000000000000001
[ 0.414540] RBP: 0000000000000246 R08: 00000000ffffdfff R09: 0000000000000001
[ 0.414540] R10: 00000000ffffdfff R11: ffffffffb18729c0 R12: ffffa099c1c91760
[ 0.414540] R13: ffffa099c1c916a4 R14: ffffa099c1d2f200 R15: ffffa099c1c91600
[ 0.414540] FS: 0000000000000000(0000) GS:ffffa099fec40000(0000) knlGS:0000000000000000
[ 0.414540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.414540] CR2: 0000000000000000 CR3: 0000000008e3e001 CR4: 0000000000370ef0
[ 0.414540] Call Trace:
[ 0.414540] <TASK>
[ 0.414540] ? __warn+0x80/0x120
[ 0.414540] ? free_irq+0x1a1/0x2d0
[ 0.414540] ? report_bug+0x164/0x190
[ 0.414540] ? handle_bug+0x3b/0x70
[ 0.414540] ? exc_invalid_op+0x17/0x70
[ 0.414540] ? asm_exc_invalid_op+0x1a/0x20
[ 0.414540] ? free_irq+0x1a1/0x2d0
[ 0.414540] vp_del_vqs+0xc1/0x220
[ 0.414540] vp_find_vqs_msix+0x305/0x470
[ 0.414540] vp_find_vqs+0x3e/0x1a0
[ 0.414540] vp_modern_find_vqs+0x1b/0x70
[ 0.414540] init_vqs+0x387/0x600
[ 0.414540] virtnet_probe+0x50a/0xc80
[ 0.414540] virtio_dev_probe+0x1e0/0x2b0
[ 0.414540] really_probe+0xc0/0x2c0
[ 0.414540] ? __pfx___driver_attach+0x10/0x10
[ 0.414540] __driver_probe_device+0x73/0x120
[ 0.414540] driver_probe_device+0x1f/0xe0
[ 0.414540] __driver_attach+0x88/0x180
[ 0.414540] bus_for_each_dev+0x85/0xd0
[ 0.414540] bus_add_driver+0xec/0x1f0
[ 0.414540] driver_register+0x59/0x100
[ 0.414540] ? __pfx_virtio_net_driver_init+0x10/0x10
[ 0.414540] virtio_net_driver_init+0x90/0xb0
[ 0.414540] do_one_initcall+0x58/0x230
[ 0.414540] kernel_init_freeable+0x1a3/0x2d0
[ 0.414540] ? __pfx_kernel_init+0x10/0x10
[ 0.414540] kernel_init+0x1a/0x1c0
[ 0.414540] ret_from_fork+0x31/0x50
[ 0.414540] ? __pfx_kernel_init+0x10/0x10
[ 0.414540] ret_from_fork_asm+0x1a/0x30
[ 0.414540] </TASK>
Fix this by calling deleting the current vq when request_irq() fails.
Fixes: 0b0f9dc52ed0 ("Revert "virtio_pci: use shared interrupts for virtqueues"")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240426150845.3999481-1-jiri@resnulli.us>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add myself as a reviewer of some VirtIO areas I'm interested.
Until this point I've been scanning manually the list looking for
series that touches this area. Adding myself to make this task easier.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240213182450.106796-1-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_max() is inclusive. So a -1 has been added when needed.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <horms@kernel.org>
Message-Id: <67c2edf49788c27d5f7a49fc701520b9fcf739b5.1713088999.git.christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
When there is a ctlq and it doesn't require interrupt
callbacks,the original method of calculating vectors
wastes hardware msi or msix resources as well as system
IRQ resources.
When conducting performance testing using testpmd in the
guest os, it was found that the performance was lower compared
to directly using vfio-pci to passthrough the device
In scenarios where the virtio device in the guest os does
not utilize interrupts, the vdpa driver still configures
the hardware's msix vector. Therefore, the hardware still
sends interrupts to the host os. Because of this unnecessary
action by the hardware, hardware performance decreases, and
it also affects the performance of the host os.
Before modification:(interrupt mode)
32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[0000:00:02.0]-0
33: 0 0 0 0 PCI-MSI 32769-edge vp-vdpa[0000:00:02.0]-1
34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[0000:00:02.0]-2
35: 0 0 0 0 PCI-MSI 32771-edge vp-vdpa[0000:00:02.0]-config
After modification:(interrupt mode)
32: 0 0 1 7 PCI-MSI 32768-edge vp-vdpa[0000:00:02.0]-0
33: 36 0 3 0 PCI-MSI 32769-edge vp-vdpa[0000:00:02.0]-1
34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[0000:00:02.0]-config
Before modification:(virtio pmd mode for guest os)
32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[0000:00:02.0]-0
33: 0 0 0 0 PCI-MSI 32769-edge vp-vdpa[0000:00:02.0]-1
34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[0000:00:02.0]-2
35: 0 0 0 0 PCI-MSI 32771-edge vp-vdpa[0000:00:02.0]-config
After modification:(virtio pmd mode for guest os)
32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[0000:00:02.0]-config
To verify the use of the virtio PMD mode in the guest operating
system, the following patch needs to be applied to QEMU:
https://lore.kernel.org/all/20240408073311.2049-1-yuxue.liu@jaguarmicro.com
Signed-off-by: Yuxue Liu <yuxue.liu@jaguarmicro.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Message-Id: <20240410033020.1310-1-yuxue.liu@jaguarmicro.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-25-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Anton Yakovlev <anton.yakovlev@opensynergy.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-24-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-23-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-22-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-21-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-20-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-19-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-18-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-17-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-16-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-15-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-14-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-13-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-12-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-11-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-10-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-9-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-8-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-7-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-6-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-5-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-4-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-3-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio core already sets the .owner, so driver does not need to.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Message-Id: <20240331-module-owner-virtio-v2-2-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Treat stats requests as wakeup events to ensure that the driver responds
to device requests in a timely manner.
Signed-off-by: David Stevens <stevensd@chromium.org>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240321012445.1593685-3-stevensd@google.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Wakeup sources don't support nesting multiple events, so sharing a
single object between multiple drivers can result in one driver
overriding the wakeup event processing period specified by another
driver. Have the virtio balloon driver use the wakeup source of the
device it is bound to rather than the wakeup source of the parent
device, to avoid conflicts with the transport layer.
Note that although the virtio balloon's virtio_device itself isn't what
actually wakes up the device, it is responsible for processing wakeup
events. In the same way that EPOLLWAKEUP uses a dedicated wakeup_source
to prevent suspend when userspace is processing wakeup events, a
dedicated wakeup_source is necessary when processing wakeup events in a
higher layer in the kernel.
Fixes: b12fbc3f787e ("virtio_balloon: stay awake while adjusting balloon")
Signed-off-by: David Stevens <stevensd@chromium.org>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240321012445.1593685-2-stevensd@google.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
With virtio-mem, primarily hibernation is problematic: as the machine shuts
down, the virtio-mem device loses its state. Powering the machine back up
is like losing a bunch of DIMMs. While there would be ways to add limited
support, suspend+resume is more commonly used for VMs and "easier" to
support cleanly.
s2idle can be supported without any device dependencies. Similarly, one
would expect suspend-to-ram (i.e., S3) to work out of the box. However,
QEMU currently unplugs all device memory when resuming the VM, using a
cold reset on the "wakeup" path. In order to support S3, we need a feature
flag for the device to tell us if memory remains plugged when waking up. In
the future, QEMU will implement this feature.
So let's always support s2idle and support S3 with plugged memory only if
the device indicates support. Block hibernation early using the PM
notifier.
Trying to hibernate now fails early:
# echo disk > /sys/power/state
[ 26.455369] PM: hibernation: hibernation entry
[ 26.458271] virtio_mem virtio0: hibernation is not supported.
[ 26.462498] PM: hibernation: hibernation exit
-bash: echo: write error: Operation not permitted
s2idle works even without the new feature bit:
# echo s2idle > /sys/power/mem_sleep
# echo mem > /sys/power/state
[ 52.083725] PM: suspend entry (s2idle)
[ 52.095950] Filesystems sync: 0.010 seconds
[ 52.101493] Freezing user space processes
[ 52.104213] Freezing user space processes completed (elapsed 0.001 seconds)
[ 52.106520] OOM killer disabled.
[ 52.107655] Freezing remaining freezable tasks
[ 52.110880] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 52.113296] printk: Suspending console(s) (use no_console_suspend to debug)
S3 does not work without the feature bit when memory is plugged:
# echo deep > /sys/power/mem_sleep
# echo mem > /sys/power/state
[ 32.788281] PM: suspend entry (deep)
[ 32.816630] Filesystems sync: 0.027 seconds
[ 32.820029] Freezing user space processes
[ 32.823870] Freezing user space processes completed (elapsed 0.001 seconds)
[ 32.827756] OOM killer disabled.
[ 32.829608] Freezing remaining freezable tasks
[ 32.833842] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 32.837953] printk: Suspending console(s) (use no_console_suspend to debug)
[ 32.916172] virtio_mem virtio0: suspend+resume with plugged memory is not supported
[ 32.916181] virtio-pci 0000:00:02.0: PM: pci_pm_suspend(): virtio_pci_freeze+0x0/0x50 returns -1
[ 32.916197] virtio-pci 0000:00:02.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -1
[ 32.916210] virtio-pci 0000:00:02.0: PM: failed to suspend async: error -1
But S3 works with the new feature bit when memory is plugged (patched
QEMU):
# echo deep > /sys/power/mem_sleep
# echo mem > /sys/power/state
[ 33.983694] PM: suspend entry (deep)
[ 34.009828] Filesystems sync: 0.024 seconds
[ 34.013589] Freezing user space processes
[ 34.016722] Freezing user space processes completed (elapsed 0.001 seconds)
[ 34.019092] OOM killer disabled.
[ 34.020291] Freezing remaining freezable tasks
[ 34.023549] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 34.026090] printk: Suspending console(s) (use no_console_suspend to debug)
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240318120645.105664-1-david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This removes the signal/coredump hacks added for vhost_tasks in:
Commit f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
When that patch was added vhost_tasks did not handle SIGKILL and would
try to ignore/clear the signal and continue on until the device's close
function was called. In the previous patches vhost_tasks and the vhost
drivers were converted to support SIGKILL by cleaning themselves up and
exiting. The hacks are no longer needed so this removes them.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-10-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Instead of lingering until the device is closed, this has us handle
SIGKILL by:
1. marking the worker as killed so we no longer try to use it with
new virtqueues and new flush operations.
2. setting the virtqueue to worker mapping so no new works are queued.
3. running all the exiting works.
Suggested-by: Edward Adam Davis <eadavis@qq.com>
Reported-and-tested-by: syzbot+98edc2df894917b3431f@syzkaller.appspotmail.com
Message-Id: <tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-9-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
In the next patches where the worker can be killed while in use, we
need to be able to take the worker mutex and kill queued works for
new IO and flushes, and set some new flags to prevent new
__vhost_vq_attach_worker calls from swapping in/out killed workers.
If we are holding the worker mutex during a flush and the flush's work
is still in the queue, the worker code that will handle the SIGKILL
cleanup won't be able to take the mutex and perform it's cleanup. So
this patch has us drop the worker mutex while waiting for the flush
to complete.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-8-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
__vhost_vq_attach_worker uses the vhost_dev mutex to serialize the
swapping of a virtqueue's worker. This was done for simplicity because
we are already holding that mutex.
In the next patches where the worker can be killed while in use, we need
finer grained locking because some drivers will hold the vhost_dev mutex
while flushing. However in the SIGKILL handler in the next patches, we
will need to be able to swap workers (set current one to NULL), kill
queued works and stop new flushes while flushes are in progress.
To prepare us, this has us use the virtqueue mutex for swapping workers
instead of the vhost_dev one.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-7-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
vhost_vq_work_queue will never fail when queueing the TMF's response
handling because a guest can only send us TMFs when the device is fully
setup so there is always a worker at that time. In the next patches we
will modify the worker code so it handles SIGKILL by exiting before
outstanding commands/TMFs have sent their responses. In that case
vhost_vq_work_queue can fail when we try to send a response.
This has us just free the TMF's resources since at this time the guest
won't be able to get a response even if we could send it.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-6-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
vhost_vq_flush is no longer used so remove it.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-5-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We flush all the workers that are not also used by the ctl vq to make
sure that responses queued by LIO before the TMF response are sent
before the TMF response. This requires a special vhost_vq_flush
function which, in the next patches where we handle SIGKILL killing
workers while in use, will require extra locking/complexity. To avoid
that, this patch has us flush the entire device from the system work
queue, then queue up sending the response from there.
This is a little less optimal since we now flush all workers but this
will be ok since commands have already timed out and perf is not a
concern.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-4-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|