summaryrefslogtreecommitdiff
path: root/io_uring/io_uring.c
AgeCommit message (Collapse)Author
2022-07-24io_uring: remove extra io_commit_cqring()Pavel Begunkov
We don't post events in __io_commit_cqring_flush() anymore but send all requests to tw, so no need to do io_commit_cqring() there. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f2481e32375e749be89c42e4804268b608722cef.1655637157.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: clean up tracing eventsPavel Begunkov
We have lots of trace events accepting an io_uring request and wanting to print some of its fields like user_data, opcode, flags and so on. However, as trace points were unaware of io_uring structures, we had to pass all the fields as arguments. Teach trace/events/io_uring.h about struct io_kiocb and stop the misery of passing a horde of arguments to trace helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/40ff72f92798114e56d400f2b003beb6cde6ef53.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: kill extra io_uring_types.h includesPavel Begunkov
io_uring/io_uring.h already includes io_uring_types.h, no need to include it every time. Kill it in a bunch of places, it prepares us for following patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: change ->cqe_cached invariant for CQE32Pavel Begunkov
With IORING_SETUP_CQE32 ->cqe_cached doesn't store a real address but rather an implicit offset into cqes. Store the real cqe pointer and increment it accordingly if CQE32. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: introduce io_req_cqe_overflow()Pavel Begunkov
__io_fill_cqe_req() is hot and inlined, we want it to be as small as possible. Add io_req_cqe_overflow() accepting only a request and doing all overflow accounting, and replace with it two calls to 6 argument io_cqring_event_overflow(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: don't inline __io_get_cqe()Pavel Begunkov
__io_get_cqe() is not as hot as io_get_cqe(), no need to inline it, it sheds ~500B from the binary. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: don't expose io_fill_cqe_aux()Pavel Begunkov
Deduplicate some code and add a helper for filling an aux CQE, locking and notification. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: mutex locked poll hashingPavel Begunkov
Currently we do two extra spin lock/unlock pairs to add a poll/apoll request to the cancellation hash table and remove it from there. On the submission side we often already hold ->uring_lock and tw completion is likely to hold it as well. Add a second cancellation hash table protected by ->uring_lock. In concerns for latency because of a need to have the mutex locked on the completion side, use the new table only in following cases: 1) IORING_SETUP_SINGLE_ISSUER: only one task grabs uring_lock, so there is little to no contention and so the main tw hander will almost always end up grabbing it before calling callbacks. 2) IORING_SETUP_SQPOLL: same as with single issuer, only one task is a major user of ->uring_lock. 3) apoll: we normally grab the lock on the completion side anyway to execute the request, so it's free. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1bbad9c78c454b7b92f100bbf46730a37df7194f.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: introduce a struct for hash tablePavel Begunkov
Instead of passing around a pointer to hash buckets, add a bit of type safety and wrap it into a structure. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d65bc3faba537ec2aca9eabf334394936d44bd28.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: add IORING_SETUP_SINGLE_ISSUERPavel Begunkov
Add a new IORING_SETUP_SINGLE_ISSUER flag and the userspace visible part of it, i.e. put limitations of submitters. Also, don't allow it together with IOPOLL as we're not going to put it to good use. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4bcc41ee467fdf04c8aab8baf6ce3ba21858c3d4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: clean up io_ring_ctx_allocPavel Begunkov
Add a variable for the number of hash buckets in io_ring_ctx_alloc(), makes it more readable. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/993926ed0d614ba9a76b2a85bebae2babcb13983.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: limit the number of cancellation bucketsPavel Begunkov
Don't allocate to many hash/cancellation buckets, there might be too many, clamp it to 8 bits, or 256 * 64B = 16KB. We don't usually have too many requests, and 256 buckets should be enough, especially since we do hash search only in the cancellation path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b9620c8072ba61a2d50eba894b89bd93a94a9abd.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: switch cancel_hash to use per entry spinlockHao Xu
Add a new io_hash_bucket structure so that each bucket in cancel_hash has separate spinlock. Use per entry lock for cancel_hash, this removes some completion lock invocation and remove contension between different cancel_hash entries. Signed-off-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/05d1e135b0c8bce9d1441e6346776589e5783e26.1655371007.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: refactor io_req_task_complete()Pavel Begunkov
Clean up io_req_task_complete() and deduplicate io_put_kbuf() calls. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ae3148ac7eb5cce3e06895cde306e9e959d6f6ae.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: kill REQ_F_COMPLETE_INLINEPavel Begunkov
REQ_F_COMPLETE_INLINE is only needed to delay queueing into the completion list to io_queue_sqe() as __io_req_complete() is inlined and we don't want to bloat the kernel. As now we complete in a more centralised fashion in io_issue_sqe() we can get rid of the flag and queue to the list directly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: remove unused IO_REQ_CACHE_SIZE definedJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: don't set REQ_F_COMPLETE_INLINE in twPavel Begunkov
io_req_task_complete() enqueues requests for state completion itself, no need for REQ_F_COMPLETE_INLINE, which is only serve the purpose of not bloating the kernel. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/aca80f71464ad02c06f1311d998a2d6ee0b31573.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: remove check_cq checking from hot pathsPavel Begunkov
All ctx->check_cq events are slow path, don't test every single flag one by one in the hot path, but add a common guarding if. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dff026585cea7ff3a172a7c83894a3b0111bbf6a.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: never defer-complete multi-apollPavel Begunkov
Luckily, nnobody completes multi-apoll requests outside the polling functions, but don't set IO_URING_F_COMPLETE_DEFER in any case as there is nobody who is catching REQ_F_COMPLETE_INLINE, and so will leak requests if used. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a65ed3f5effd9321ee06e6edea294a03be3e15a0.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move small helpers to headersPavel Begunkov
There is a bunch of inline helpers that will be useful not only to the core of io_uring, move them to headers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move opcode table to opdef.cJens Axboe
We already have the declarations in opdef.h, move the rest into its own file rather than in the main io_uring.c file. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move read/write related opcodes to its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move remaining file table manipulation to filetable.cJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move rsrc related data, core, and commandsJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split provided buffers handling into its own fileJens Axboe
Move both the opcodes related to it, and the internals code dealing with it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move cancelation into its own fileJens Axboe
This also helps cleanup the io_uring.h cancel parts, as we can make things static in the cancel.c file, mostly. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move poll handling into its own fileJens Axboe
Add a io_poll_issue() rather than export the general task_work locking and io_issue_sqe(), and put the io_op_defs definition and structure into a separate header file so that poll can use it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: add opcode name to io_op_defsJens Axboe
This kills the last per-op switch. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: include and forward-declaration sanitationJens Axboe
Remove some dead headers we no longer need, and get rid of the io_ring_ctx and io_uring_fops forward declarations. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move io_uring_task (tctx) helpers into its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move fdinfo helpers to its own fileJens Axboe
This also means moving a bit more of the fixed file handling to the filetable side, which makes sense separately too. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: use io_is_uring_fops() consistentlyJens Axboe
Convert the last spots that check for io_uring_fops to use the provided helper instead. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move SQPOLL related handling into its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move timeout opcodes and handling into its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move our reference counting into a headerJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move msg_ring into its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split network related opcodes into its own fileJens Axboe
While at it, convert the handlers to just use io_eopnotsupp_prep() if CONFIG_NET isn't set. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move statx handling to its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move epoll handler to its own fileJens Axboe
Would be nice to sort out Kconfig for this and don't even compile epoll.c if we don't have epoll configured. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: add a dummy -EOPNOTSUPP prep handlerJens Axboe
Add it and use it for the epoll handling, if epoll isn't configured. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move uring_cmd handling to its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split out open/close operationsJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: separate out file table handling codeJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split out fadvise/madvise operationsJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split out fs related sync/fallocate functionsJens Axboe
This splits out sync_file_range, fsync, and fallocate. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split out splice related operationsJens Axboe
This splits out splice and tee support. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: split out filesystem related operationsJens Axboe
This splits out renameat, unlinkat, mkdirat, symlinkat, and linkat. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move nop into its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: move xattr related opcodes to its own fileJens Axboe
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24io_uring: handle completions in the coreJens Axboe
Normally request handlers complete requests themselves, if they don't return an error. For the latter case, the core will complete it for them. This is unhandy for pushing opcode handlers further out, as we don't want a bunch of inline completion code and we don't want to make the completion path slower than it is now. Let the core handle any completion, unless the handler explicitly asks us not to. Signed-off-by: Jens Axboe <axboe@kernel.dk>