Age | Commit message (Collapse) | Author |
|
Use the added target io handling helpers for simplifying loop io
completion.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-8-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Enable zero copy for null target so that we can evaluate performance
from zero copy or not.
Also this should be the simplest ublk zero copy implementation, which
can be served as zc example.
Add test for covering 'add -t null -z'.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
- pass 'truct dev_ctx *ctx' to target init function
- add 'private_data' to 'struct ublk_dev' for storing target specific data
- add 'private_data' to 'struct ublk_io' for storing per-IO data
- add 'tgt_ios' to 'struct ublk_io' for counting how many io_uring ios
for handling the current io command
- add helper ublk_get_io() for supporting stripe target
- add two helpers for simplifying target io handling
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-6-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Move two functions for initializing & de-initializing backing file
into common.c.
Also move one common helper into kublk.h.
Prepare for supporting ublk-stripe.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Increase max buffer size to 1MB, and 64KB is too small to evaluate
performance with builtin ublk server implementation.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unify the sqe allocator helper, and we will use it for supporting
more cases, such as ublk stripe, in which variable sqe allocation
is required.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
block layer, ublk and io_uring might re-order IO in the past
- plug
- queue ublk io command via task work
Add one test for verifying if sequential WRITE IO is dispatched in order.
- null target is taken, so we can just observe io order from
`tracepoint:block:block_rq_complete` which represents the dispatch order
- WRITE IO is taken because READ may come from system-wide utility
Cc: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250322093218.431419-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Firstly ublk char device node may not be created by udev yet, so wait
a while until it can be opened or timeout.
Secondly delete created ublk device in case of start failure, otherwise
the device becomes zombie.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321135324.259677-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
By default, io_uring marks a waiting task as being in iowait, if it's
sleeping waiting on events and there are pending requests. This isn't
necessarily always useful, and may be confusing on non-storage setups
where iowait isn't expected. It can also cause extra power usage, by
preventing the CPU from entering lower sleep states.
This adds a new enter flag, IORING_ENTER_NO_IOWAIT. If set, then
io_uring will not account the sleeping task as being in iowait. If the
kernel supports this feature, then it will be marked by having the
IORING_FEAT_NO_IOWAIT feature flag set.
As the kernel currently does not support separating the iowait
accounting and CPU frequency boosting, the IORING_ENTER_NO_IOWAIT
controls both of these at the same time. In the future, if those do end
up being split, then it'd be possible to control them separately.
However, it seems more likely that the kernel will decouple iowait and
CPU frequency boosting anyway.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For loop target, write cache isn't enabled, and each write isn't be
marked as DSYNC too.
Fix it by enabling write cache, meantime fix FLUSH implementation
by not taking LBA range into account, and there isn't such info
for FLUSH command.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321004758.152572-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some user decides test result by exit code only, and wouldn't like to be
bothered by the test result.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_drv may be built-in, so don't show modprobe failure, and we
do check `/dev/ublk-control` for skipping test if ublk_drv isn't
enabled.
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add one dependency helper which can include new uapi definition which
isn't synced from kernel.
This way also helps a lot for downstream test deployment.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The original support for incrementally consumed buffers didn't allow it
to be used with bundles, with the assumption being that incremental
buffers are generally larger, and hence there's less of a nedd to
support it.
But that assumption may not be correct - it's perfectly viable to use
smaller buffers with incremental consumption, and there may be valid
reasons for an application or framework to do so.
As there's really no need to explicitly disable bundles with
incrementally consumed buffers, allow it. This actually makes the peek
side cheaper and simpler, with the completion side basically the same,
just needing to iterate for the consumed length.
Reported-by: Norman Maurer <norman_maurer@apple.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This reverts commit 2a51c327d4a4a2eb62d67f4ea13a17efd0f25c5c.
The kernel registered bvecs do use the iov_iter_advance() API, so we
can't rely on this simplification anymore.
Fixes: 27cb27b6d5ea40 ("io_uring: add support for kernel registered bvecs")
Reported-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250310184825.569371-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add UBLK_TEST_QUIET, so we can print test result(PASS/SKIP/FAIL) only.
Also always run from test script's current directory, then the same test
script can be started from other work directory.
This way helps a lot to reuse this test source code and scripts for
other projects(liburing, blktests, ...)
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-12-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add stress_test_01 for running IO vs. killing ublk server, so io_uring exit &
cancel code path can be covered, same with ublk's cancel code path.
Especially IO buffer lifetime is one big thing for ublk zero copy, the added
test can verify if this area works as expected.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-11-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add stress_test_01 for running IO vs. removing device for verifying that
ublk device removal can work as expected when heavy IO workloads are in
progress.
null, loop and loop/zc are covered in this tests.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-10-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Load ublk_drv module in _prep_test(), and unload it in _cleanup_test(),
so that test can always be done in consistent state.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-9-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Move zero copy feature check into _add_ublk_dev() since we will have
more tests which requires to cover zero copy.
Then one check function of _check_add_dev() has to be added for dealing
with cleanup since '_add_ublk_dev()' is run in sub-shell, and we can't
exit from it to terminal shell.
Meantime always return error code from _add_ublk_dev().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-8-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
More devices can be created in single tests, so simply remove all
ublk devices in _cleanup_test(), meantime remove the ${dev_id} argument
of _cleanup_test().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add shellcheck, meantime fixes all warnings.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250303124324.3563605-6-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The argument of '-a' doesn't follow any value, so fix it by putting it
with '-z' together.
Fixes: bedc9cbc5f97 ("selftests: ublk: add ublk zero copy test")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250303124324.3563605-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add --foreground command for helping to debug.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250303124324.3563605-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes the following build failure:
ublk//file_backed.c: In function ‘backing_file_tgt_init’:
ublk//file_backed.c:28:42: error: ‘O_DIRECT’ undeclared (first use in this function); did you mean ‘O_DIRECTORY’?
28 | fd = open(file, O_RDWR | O_DIRECT);
| ^~~~~~~~
| O_DIRECTORY
when trying to reuse this same utility for liburing test.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250303124324.3563605-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Improve ublk_stop_io_daemon() in the following ways:
- don't wait if ->ublksrv_pid becomes -1, which means that the disk
has been stopped
- don't wait if ublk char device doesn't exist any more, so we can
avoid to rely on inoitfy for wait until the char device is closed
And this way may reduce time of delete command a lot.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250303124324.3563605-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit ef623a647f42 ("io_uring: Move old async data allocation helper
to header") leave behind this unused declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20250305013454.3635021-1-yuehaibing@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a helper function io_cache_free() that returns an allocation to a
io_alloc_cache, falling back on kfree() if the io_alloc_cache is full.
This is the inverse of io_cache_alloc(), which takes an allocation from
an io_alloc_cache and falls back on kmalloc() if the cache is empty.
Convert 4 callers to use the helper.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Suggested-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20250304194814.2346705-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_rsrc_node's of type IORING_RSRC_FILE always have a file attached
immediately after they are allocated. IORING_RSRC_BUFFER nodes won't be
returned from io_sqe_buffer_register()/io_buffer_register_bvec() until
they have a io_mapped_ubuf attached.
So remove the checks for a NULL file/buffer in io_free_rsrc_node().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-5-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The done: label is only reachable if node is non-NULL. So don't bother
checking, just call io_free_node().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-4-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_sqe_buffer_register() currently calls io_put_rsrc_node() if it fails
to fully set up the io_rsrc_node. io_put_rsrc_node() is more involved
than necessary, since we already know the reference count will reach 0
and no io_mapped_ubuf has been attached to the node yet.
So just call io_free_node() to release the node's memory. This also
avoids the need to temporarily set the node's buf pointer to NULL.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-3-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_rsrc_node_alloc() calls io_cache_alloc(), which uses kmalloc() to
allocate the node. So it can be freed with kfree() instead of kvfree().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-2-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Split the freeing of the io_rsrc_node from io_free_rsrc_node(), for use
with nodes that haven't been fully initialized.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228235916.670437-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring/rsrc.h uses several types from include/linux/io_uring_types.h.
Include io_uring_types.h explicitly in rsrc.h to avoid depending on
users of rsrc.h including io_uring_types.h first.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20250301183612.937529-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_buffer_register_bvec() takes index as an unsigned int argument, but
ublk_register_io_buf() casts ub_cmd->addr (a u64) to int. Remove the
misleading cast and instead pass index as an unsigned value to
ublk_register_io_buf() and ublk_unregister_io_buf().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250301190317.950208-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Call io_find_buf_node() to avoid duplicating it in io_nop().
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250301001610.678223-2-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Declare io_find_buf_node() in io_uring/rsrc.h so it can be called from
other files.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250301001610.678223-1-csander@purestorage.com
[axboe: keep the inline for local hot path usage]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Indicate to userspace applications if a UBLK_IO_UNREGISTER_IO_BUF
command specifies an invalid buffer index by returning an error code.
Return -EINVAL if no buffer is registered with the given index, and
-EBUSY if the registered buffer is not a kernel bvec.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228231432.642417-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The cmd_to_io_kiocb() macro applies a pointer cast to its input without
parenthesizing it. Currently all inputs are variable names, so this has
the intended effect. But since casts have relatively high precedence,
the macro would apply the cast to the wrong value if the input was a
pointer addition, for example.
Turn the macro into a static inline function to ensure the pointer cast
is applied to the full input value.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228230305.630885-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring_cmd_import_fixed() takes a struct io_uring_cmd *, but the type
of the ioucmd parameter is void *. Make the pointer type explicit so the
compiler can type check it.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228221514.604350-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The macro rq_data_dir() already computes a request's data direction.
Use it in place of the if-else to set imu->dir.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Link: https://lore.kernel.org/r/20250228223057.615284-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Enable zero copy on file backed target, meantime add one fio test for
covering write verify, another test for mkfs/mount/umount.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250228161919.2869102-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add file backed ublk target code, meantime add one fio test for
covering write verify, another test for mkfs/mount/umount.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250228161919.2869102-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Both ublk driver and userspace heavily depends on io_uring subsystem,
and tools/testing/selftests/ should be the best place for holding this
cross-subsystem tests.
Add basic read/write IO test over this ublk null disk, and make sure ublk
working.
More tests will be added.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250228161919.2869102-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Frequent alloc/free cycles on these is pretty costly. Use an io cache to
more efficiently reuse these buffers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-7-kbusch@meta.com
[axboe: fix imu leak]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Provide new operations for the user to request mapping an active request
to an io uring instance's buf_table. The user has to provide the index
it wants to install the buffer.
A reference count is taken on the request to ensure it can't be
completed while it is active in a ring's buf_table.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-6-kbusch@meta.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Provide an interface for the kernel to leverage the existing
pre-registered buffers that io_uring provides. User space can reference
these later to achieve zero-copy IO.
User space must register an empty fixed buffer table with io_uring in
order for the kernel to make use of it.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-5-kbusch@meta.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When using kernel registered bvec fixed buffers, the "address" is
actually the offset into the bvec rather than userspace address.
Therefore it can be 0.
We can skip checking whether the address is NULL before mapping
uring_cmd data. Bad userspace address will be handled properly later when
the user buffer is imported.
With this patch, we will be able to use the kernel registered bvec fixed
buffers in io_uring NVMe passthru with ublk zero-copy support.
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Xinyu Zhang <xizhang@purestorage.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-4-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Registered buffers may depend on a linked command, which makes the prep
path too early to import. Move to the issue path when the node is
actually needed like all the other users of fixed buffers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-3-kbusch@meta.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Cleans up the generic rw prep to not require the do_import flag. Use a
different prep function for callers that might need buffer select.
Based-on-a-patch-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20250227223916.143006-2-kbusch@meta.com
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|