path: root/io_uring/rsrc.c
Age  Commit message  Author
2023-04-20  Revert "io_uring/rsrc: disallow multi-source reg buffers"  Jens Axboe
This reverts commit edd478269640b360c6f301f2baa04abdda563ef3. There's really no specific need to disallow multiple sources of buffers, and io_uring really should not be mandating this by itself. We should be able to solely rely on GUP making these decisions. As this also stands in the way of a cleanup where io_uring is the odd one out, kill it. Link: https://lore.kernel.org/all/61ded378-51a8-1dcb-b631-fda1903248a9@gmail.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: disassociate nodes and rsrc_data  Pavel Begunkov
Make rsrc nodes independent from rsrc_data; for that we keep ctx and rsrc type in nodes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4f259abe9cd4eea6a3b4ed83508635218acd3c3f.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: devirtualise rsrc put callbacks  Pavel Begunkov
We only have two rsrc types, buffers and files, replace virtual callbacks for putting resources down with a switch..case. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/02ca727bf8e5f7f820c2f404e95ae88c8f472930.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
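The devirtualisation described above can be illustrated with a small user-space model. This is a sketch only: `enum rsrc_type`, `struct rsrc_node`, `file_put()`, `buffer_put()` and `rsrc_put()` are hypothetical names chosen for the example and are not taken from the kernel source.

```c
/* Hypothetical user-space model; none of these names come from the kernel. */
enum rsrc_type { RSRC_FILE, RSRC_BUFFER };

struct rsrc_node {
	enum rsrc_type type;  /* the type tag lives in the node itself */
	int puts_seen;        /* how many times the resource was put */
	int last_kind;        /* which branch ran: 1 = file, 2 = buffer */
};

static void file_put(struct rsrc_node *n)
{
	n->last_kind = 1;
	n->puts_seen++;
}

static void buffer_put(struct rsrc_node *n)
{
	n->last_kind = 2;
	n->puts_seen++;
}

/*
 * With only two resource types, a switch on the type tag replaces an
 * indirect call through a per-type function pointer.
 */
static int rsrc_put(struct rsrc_node *n)
{
	switch (n->type) {
	case RSRC_FILE:
		file_put(n);
		return 0;
	case RSRC_BUFFER:
		buffer_put(n);
		return 0;
	default:
		return -1;  /* unknown type: nothing was put */
	}
}
```

A direct switch keeps both targets visible to the compiler, which is one plausible reason to prefer it over an indirect call when the set of types is small and closed.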
2023-04-18  io_uring/rsrc: pass node to io_rsrc_put_work()  Pavel Begunkov
Instead of passing rsrc_data and a resource to io_rsrc_put_work() just forward node, that's all the function needs. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/791e8edd28d78797240b74d34e99facbaad62f3b.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: inline io_rsrc_put_work()  Pavel Begunkov
io_rsrc_put_work() is simple enough to be open coded into its only caller. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1b36dd46766ced39a9b160767babfa2fce07b8f8.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: add empty flag in rsrc_node  Pavel Begunkov
Unless a node was flushed by io_rsrc_ref_quiesce(), it'll carry a resource. Replace ->inline_items with an empty flag, which is initialised to false and only raised in io_rsrc_ref_quiesce(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/75d384c9d2252e12af73b9cf8a44e1699106aeb1.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: merge nodes and io_rsrc_put  Pavel Begunkov
struct io_rsrc_node carries a number of resources represented by struct io_rsrc_put. That was handy before for sync overhead amortisation, but all complexity is gone and nodes are simple and lightweight. Let's allocate a separate node for each resource. Nodes and io_rsrc_put are not much different in size, and the former are cached, so node allocation should work better. That also removes some overhead for nested iteration in io_rsrc_node_ref_zero() / __io_rsrc_put_work(). Another reason for the patch is that it greatly reduces complexity by moving io_rsrc_node_switch[_start]() inside io_queue_rsrc_removal(), so users don't have to care about it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c7d3a45b30cc14cd93700a710dd112edc703db98.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-18  io_uring/rsrc: infer node from ctx on io_queue_rsrc_removal  Pavel Begunkov
For io_queue_rsrc_removal() we should always use the current active rsrc node, don't pass it directly but let the function grab it from the context. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d15939b4afea730978b4925685c2577538b823bb.1681822823.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: refactor io_queue_rsrc_removal  Pavel Begunkov
We can queue up a rsrc into a list in io_queue_rsrc_removal() while allocating io_rsrc_put and so simplify the function. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/36bd708ee25c0e2e7992dc19b17db166eea9ac40.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: clean up __io_sqe_buffers_update()  Pavel Begunkov
Inline offset variable, so we don't use it without subjecting it to array_index_nospec() first. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/77936d9ed23755588810c5eafcea7e1c3b90e3cd.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: inline switch_start fast path  Pavel Begunkov
Inline the part of io_rsrc_node_switch_start() that checks whether the cache is empty or not, as most of the time it will have some number of entries in there. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9619c1717a0e01f22c5fce2f1ba2735f804da0f2.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: remove rsrc_data refs  Pavel Begunkov
Instead of waiting for rsrc_data->refs to be downed to zero, check whether there are rsrc nodes queued for completion, that's easier than maintaining references. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8e33fd143d83e11af3e386aea28eb6d6c6a1be10.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: fix DEFER_TASKRUN rsrc quiesce  Pavel Begunkov
For io_rsrc_ref_quiesce() to progress it should execute all task_work items, including deferred ones. However, currently nobody would wake us, and so let's set ctx->cq_wait_nr, so io_req_local_work_add() would wake us up. Fixes: c0e0d6ba25f18 ("io_uring: add IORING_SETUP_DEFER_TASKRUN") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f1a90d1bc5ebf096475b018fed52e54f3b89d4af.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: use wq for quiescing  Pavel Begunkov
Replace completions with waitqueues for rsrc data quiesce, the main wakeup condition is when data refs hit zero. Note that data refs are only changed under ->uring_lock, so we prepare before mutex_unlock() reacquire it after taking the lock back. This change will be needed in the next patch. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1d0dbc74b3b4fd67c8f01819e680c5e0da252956.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: refactor io_rsrc_ref_quiesce  Pavel Begunkov
Refactor io_rsrc_ref_quiesce() by moving the first mutex_unlock(), so we don't have to have a second mutex_unlock() further in the loop. It prepares us for the next patch. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/65bc876271fb16bf550a53a4c76c91aacd94e52e.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: remove io_rsrc_node::done  Pavel Begunkov
Kill io_rsrc_node::done and check refs instead, it's set when the node's refcount hits zero, and it won't change afterwards. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bbde361f4010f7e8bf196f1ecca27a763b79926f.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-15  io_uring/rsrc: use nospec'ed indexes  Pavel Begunkov
We use array_index_nospec() for registered buffer indexes, but don't use it while poking into rsrc tags, fix that. Fixes: 634d00df5e1cf ("io_uring: add full-fledged dynamic buffers support") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f02fafc5a9c0dd69be2b0618c38831c078232ff0.1681395792.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
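The idea behind clamping an index before it is used for a lookup can be sketched in user space as follows. This is an illustration of branchless index masking only, assuming `size` does not exceed `LONG_MAX`; `index_mask()`, `clamp_index()` and `tag_lookup()` are hypothetical names, and a user-space version like this offers no real speculation guarantees, since those depend on compiler and architecture support that the kernel helper provides.

```c
#include <limits.h>

#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Returns all ones when index < size and zero otherwise, without a
 * conditional branch. Assumes size <= LONG_MAX.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
	/* Top bit is set iff index >= size (via wrap-around) or index is huge. */
	unsigned long oob = (index | (size - 1UL - index)) >> (ULONG_BITS - 1);

	return oob - 1UL;  /* 0 -> all ones, 1 -> 0 */
}

/* In-bounds indexes pass through; out-of-bounds ones collapse to 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
	return index & index_mask(index, size);
}

/*
 * Hypothetical tag lookup: the architectural bounds check comes first,
 * then the index is clamped before it touches the array.
 */
static int tag_lookup(const unsigned long long *tags, unsigned long nr,
		      unsigned long idx, unsigned long long *out)
{
	if (idx >= nr)
		return -1;
	idx = clamp_index(idx, nr);
	*out = tags[idx];
	return 0;
}
```

The point the commit makes is that every array reached through a user-supplied index needs the clamped value, not only the first one; in this sketch that would mean reusing the clamped `idx` for any second array indexed alongside `tags`.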
2023-04-12  io_uring/rsrc: extract SCM file put helper  Pavel Begunkov
SCM file accounting is a slow path and is only used for UNIX files. Extract a helper out of io_rsrc_file_put() that does the SCM unaccounting. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/58cc7bffc2ee96bec8c2b89274a51febcbfa5556.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-12  io_uring/rsrc: refactor io_rsrc_node_switch  Pavel Begunkov
We use io_rsrc_node_switch() coupled with io_rsrc_node_switch_start() for a bunch of cases including initialising ctx->rsrc_node, i.e. by passing NULL instead of rsrc_data. Leave it to only deal with actual node changing. For that, first remove it from io_uring_create() and add a function allocating the first node. Then also remove all calls to io_rsrc_node_switch() from files/buffers register as we already have a node installed and it does essentially nothing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d146fe306ff98b1a5a60c997c252534f03d423d7.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-12  io_uring/rsrc: zero node's rsrc data on alloc  Pavel Begunkov
struct io_rsrc_node::rsrc_data field is initialised on rsrc removal and shouldn't be used before that, still let's play safe and zero the field on alloc. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/09bd03cedc8da8a7974c5e6e4bf0489fd16593ab.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-12  io_uring/rsrc: consolidate node caching  Pavel Begunkov
We store one pre-allocated rsrc node in ->rsrc_backup_node, merge it with ->rsrc_node_cache. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6d5410e51ccd29be7a716be045b51d6b371baef6.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-12  io_uring/rsrc: add lockdep checks  Pavel Begunkov
Add a lockdep check to make sure that file and buffer updates hold ->uring_lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/961bbe6e433ec9bc0375127f23468b37b729df99.1681210788.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: optimise io_rsrc_data refcounting  Pavel Begunkov
Every struct io_rsrc_node takes a struct io_rsrc_data reference, which means all rsrc updates do 2 extra atomics. Replace atomic refcounting with an int as it's all done under ->uring_lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e73c3d6820cf679532696d790b5b8fae23537213.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: add lockdep sanity checks  Pavel Begunkov
We should hold ->uring_lock while putting nodes with io_put_rsrc_node(), add a lockdep check for that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b50d5f156ac41450029796738c1dfd22a521df7a.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: cache struct io_rsrc_node  Pavel Begunkov
Add allocation cache for struct io_rsrc_node, it's always allocated and put under ->uring_lock, so it doesn't need any extra synchronisation around caches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/252a9d9ef9654e6467af30fdc02f57c0118fb76e.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
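An allocation cache of this kind can be modelled in user space as a bounded free list. The sketch below is an assumption-laden illustration, not the kernel's cache: `struct node`, `struct node_cache`, `node_alloc()` and `node_free()` are hypothetical names, and the code deliberately contains no locking because, as in the commit's reasoning, the caller is assumed to already hold one lock around both paths.

```c
#include <stdlib.h>

/* Hypothetical node type; only the fields needed for the example. */
struct node {
	struct node *next;  /* free-list link, meaningful only while cached */
	int refs;
};

/* Bounded LIFO cache; the caller is assumed to hold a lock around every call. */
struct node_cache {
	struct node *head;
	unsigned int nr;    /* entries currently cached */
	unsigned int max;   /* upper bound on cached entries */
};

static struct node *node_alloc(struct node_cache *c)
{
	struct node *n = c->head;

	if (n) {
		/* Fast path: reuse a cached entry, no allocator call. */
		c->head = n->next;
		c->nr--;
	} else {
		n = malloc(sizeof(*n));
		if (!n)
			return NULL;
	}
	n->next = NULL;
	n->refs = 1;
	return n;
}

static void node_free(struct node_cache *c, struct node *n)
{
	if (c->nr < c->max) {
		/* Keep the entry for the next allocation. */
		n->next = c->head;
		c->head = n;
		c->nr++;
	} else {
		free(n);
	}
}
```

Because allocation and release happen under the same lock, the free list needs no atomics or per-CPU state of its own, which is the property the commit message relies on.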
2023-04-04  io_uring/rsrc: don't offload node free  Pavel Begunkov
struct delayed_work rsrc_put_work was previously used to offload node freeing because io_rsrc_node_ref_zero() was previously called by RCU in the IRQ context. Now, as percpu refcounting is gone, we can do it eagerly at the spot without pushing it to a worker. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/13fb1aac1e8d068ad8fd4a0c6d0d157ab61b90c0.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: optimise io_rsrc_put allocation  Pavel Begunkov
Every io_rsrc_node keeps a list of items to put, and all entries are kmalloc()'ed. However, it's quite common to queue up only one entry per node, so let's add an inline entry there to avoid extra allocations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c482c1c652c45c85ac52e67c974bc758a50fed5f.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: rename rsrc_list  Pavel Begunkov
We have too many "rsrc" around which makes the name of struct io_rsrc_node::rsrc_list confusing. The field is responsible for keeping a list of files or buffers, so call it item_list and add comments around. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3e34d4dfc1fdbb6b520f904ee6187c2ccf680efe.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: kill rsrc_ref_lock  Pavel Begunkov
We use ->rsrc_ref_lock spinlock to protect ->rsrc_ref_list in io_rsrc_node_ref_zero(). Now we removed pcpu refcounting, which means io_rsrc_node_ref_zero() is not executed from the irq context as an RCU callback anymore, and we also put it under ->uring_lock. io_rsrc_node_switch(), which queues up nodes into the list, is also protected by ->uring_lock, so we can safely get rid of ->rsrc_ref_lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6b60af883c263551190b526a55ff2c9d5ae07141.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: protect node refs with uring_lock  Pavel Begunkov
Currently, for nodes we have an atomic counter and some cached (non-atomic) refs protected by uring_lock. Let's put all ref manipulations under uring_lock and get rid of the atomic part. It's free as in all cases we care about we already hold the lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/25b142feed7d831008257d90c8b17c0115d4fc15.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: keep cached refs per node  Pavel Begunkov
We cache refs of the current node (i.e. ctx->rsrc_node) in ctx->rsrc_cached_refs. We'll be moving away from atomics, so move the cached refs in struct io_rsrc_node for now. It's a prep patch and shouldn't change anything in practice. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9edc3669c1d71b06c2dca78b2b2b8bb9292738b9.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-04-04  io_uring/rsrc: use non-pcpu refcounts for nodes  Pavel Begunkov
One problem with the current rsrc infra is that often updates will generate lots of rsrc nodes, each carrying pcpu refs. That takes quite a lot of memory, especially if there is a stall, and takes lots of CPU cycles. pcpu allocations alone take >50% of CPU with a naive benchmark updating files in a loop. Replace pcpu refs with normal refcounting. There is already a hot path avoiding atomics / refs, but following patches will further improve it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e9ed8a9457b331a26555ff9443afc64cdaab7247.1680576071.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-30  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Conflicts: drivers/net/ethernet/mediatek/mtk_ppe.c 3fbe4d8c0e53 ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting") 924531326e2d ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22  io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()  Savino Dicanosa
When fixed files are unregistered, file_alloc_end and alloc_hint are not cleared. This can later cause a NULL pointer dereference in io_file_bitmap_get() if auto index selection is enabled via IORING_FILE_INDEX_ALLOC:

[ 6.519129] BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
[ 6.541468] RIP: 0010:_find_next_zero_bit+0x1a/0x70
[...]
[ 6.560906] Call Trace:
[ 6.561322] <TASK>
[ 6.561672] io_file_bitmap_get+0x38/0x60
[ 6.562281] io_fixed_fd_install+0x63/0xb0
[ 6.562851] ? __pfx_io_socket+0x10/0x10
[ 6.563396] io_socket+0x93/0xf0
[ 6.563855] ? __pfx_io_socket+0x10/0x10
[ 6.564411] io_issue_sqe+0x5b/0x3d0
[ 6.564914] io_submit_sqes+0x1de/0x650
[ 6.565452] __do_sys_io_uring_enter+0x4fc/0xb20
[ 6.566083] ? __do_sys_io_uring_register+0x11e/0xd80
[ 6.566779] do_syscall_64+0x3c/0x90
[ 6.567247] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[...]

To fix the issue, set file alloc range and alloc_hint to zero after file tables are freed.

Cc: stable@vger.kernel.org Fixes: 4278a0deb1f6 ("io_uring: defer alloc_hint update to io_file_bitmap_set()") Signed-off-by: Savino Dicanosa <sd7.dev@pm.me> [axboe: add explicit bitmap == NULL check as well] Signed-off-by: Jens Axboe <axboe@kernel.dk>
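The shape of this fix can be shown with a simplified user-space model. Everything here is an assumption made for illustration: `struct file_table`, `table_alloc()`, `table_free()` and `slot_get()` are hypothetical names, the bitmap is one `bool` per slot rather than packed bits, and errors are reported as a plain -1 rather than a kernel error code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified table: one bool per slot instead of packed bits. */
struct file_table {
	bool *bitmap;
	unsigned int alloc_hint;
	unsigned int alloc_start;
	unsigned int alloc_end;
};

static int table_alloc(struct file_table *t, unsigned int nr)
{
	t->bitmap = calloc(nr, sizeof(*t->bitmap));
	if (!t->bitmap)
		return -1;
	t->alloc_start = 0;
	t->alloc_end = nr;
	t->alloc_hint = 0;
	return 0;
}

static void table_free(struct file_table *t)
{
	free(t->bitmap);
	t->bitmap = NULL;
	/* The fix: do not leave a stale range or hint behind. */
	t->alloc_start = 0;
	t->alloc_end = 0;
	t->alloc_hint = 0;
}

/* Returns a free slot index, or -1 if there is no table or no free slot. */
static int slot_get(const struct file_table *t)
{
	unsigned int i;

	if (!t->bitmap)  /* explicit NULL check as a second line of defence */
		return -1;
	for (i = t->alloc_hint; i < t->alloc_end; i++)
		if (!t->bitmap[i])
			return (int)i;
	for (i = t->alloc_start; i < t->alloc_hint && i < t->alloc_end; i++)
		if (!t->bitmap[i])
			return (int)i;
	return -1;
}
```

Either measure alone would stop the crash in this model; doing both mirrors the commit, where the range reset fixes the stale state and the NULL check guards against any other path that reaches the search without a table.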
2023-03-17  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
net/wireless/nl80211.c b27f07c50a73 ("wifi: nl80211: fix puncturing bitmap policy") cbbaf2bb829b ("wifi: nl80211: add a command to enable/disable HW timestamping") https://lore.kernel.org/all/20230314105421.3608efae@canb.auug.org.au tools/testing/selftests/net/Makefile 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 13715acf8ab5 ("selftest: Add test for bind() conflicts.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-16  io_uring/rsrc: fix folio accounting  Pavel Begunkov
| BUG: Bad page state in process kworker/u8:0 pfn:5c001
| page:00000000bfda61c8 refcount:0 mapcount:0 mapping:0000000000000000 index:0x20001 pfn:0x5c001
| head:0000000011409842 order:9 entire_mapcount:0 nr_pages_mapped:0 pincount:1
| anon flags: 0x3fffc00000b0004(uptodate|head|mappedtodisk|swapbacked|node=0|zone=0|lastcpupid=0xffff)
| raw: 03fffc0000000000 fffffc0000700001 ffffffff00700903 0000000100000000
| raw: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000
| head: 03fffc00000b0004 dead000000000100 dead000000000122 ffff00000a809dc1
| head: 0000000000020000 0000000000000000 00000000ffffffff 0000000000000000
| page dumped because: nonzero pincount
| CPU: 3 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc2-00001-gc6811bf0cd87 #1
| Hardware name: linux,dummy-virt (DT)
| Workqueue: events_unbound io_ring_exit_work
| Call trace:
| dump_backtrace+0x13c/0x208
| show_stack+0x34/0x58
| dump_stack_lvl+0x150/0x1a8
| dump_stack+0x20/0x30
| bad_page+0xec/0x238
| free_tail_pages_check+0x280/0x350
| free_pcp_prepare+0x60c/0x830
| free_unref_page+0x50/0x498
| free_compound_page+0xcc/0x100
| free_transhuge_page+0x1f0/0x2b8
| destroy_large_folio+0x80/0xc8
| __folio_put+0xc4/0xf8
| gup_put_folio+0xd0/0x250
| unpin_user_page+0xcc/0x128
| io_buffer_unmap+0xec/0x2c0
| __io_sqe_buffers_unregister+0xa4/0x1e0
| io_ring_exit_work+0x68c/0x1188
| process_one_work+0x91c/0x1a58
| worker_thread+0x48c/0xe30
| kthread+0x278/0x2f0
| ret_from_fork+0x10/0x20

Mark reports an issue with the recent patches coalescing compound pages while registering them in io_uring. The reason is that we try to drop excessive references with folio_put_refs(), but pages were acquired with pin_user_pages(), which has extra accounting and so should be put down with matching unpin_user_pages() or at least gup_put_folio(). As a fix unpin_user_pages() all but first page instead, and let's figure out a better API after.

Fixes: 57bebf807e2abcf8 ("io_uring/rsrc: optimise registered huge pages") Reported-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/10efd5507d6d1f05ea0f3c601830e08767e189bd.1678980230.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-15  io_uring: rsrc: Optimize return value variable 'ret'  Li zeming
The initial assignment of the variable ret is changed to 0; it is only changed before 'goto fail;'. Use the ret variable as the function return value. Signed-off-by: Li zeming <zeming@nfschina.com> Link: https://lore.kernel.org/r/20230317182538.3027-1-zeming@nfschina.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-03-09  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Jakub Kicinski
Documentation/bpf/bpf_devel_QA.rst b7abcd9c656b ("bpf, doc: Link to submitting-patches.rst for general patch submission info") d56b0c461d19 ("bpf, docs: Fix link to netdev-FAQ target") https://lore.kernel.org/all/20230307095812.236eb1be@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08  net: reclaim skb->scm_io_uring bit  Eric Dumazet
Commit 0091bfc81741 ("io_uring/af_unix: defer registered files gc to io_uring release") added one bit to struct sk_buff. This structure is critical for networking, and we try very hard to not add bloat on it, unless absolutely required. For instance, we can use a specific destructor as a wrapper around unix_destruct_scm(), to identify skbs that unix_gc() has to special case. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Jens Axboe <axboe@kernel.dk> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-24  io_uring/rsrc: always initialize 'folio' to NULL  Jens Axboe
Smatch complains that: smatch warnings: io_uring/rsrc.c:1262 io_sqe_buffer_register() error: uninitialized symbol 'folio'. 'folio' may be used uninitialized, which can happen if we end up with a single page mapped. Ensure that we clear folio to NULL at the top so it's always set. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202302241432.YML1CD5C-lkp@intel.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-22  io_uring/rsrc: optimise registered huge pages  Pavel Begunkov
When registering huge pages, internally io_uring will split them into many PAGE_SIZE bvec entries. That's bad for performance as drivers need to eventually dma-map the data and will do it individually for each bvec entry. Coalesce huge pages into one large bvec. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
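The effect of coalescing can be sketched with a toy model in which each page records which larger unit it belongs to. This is an assumption-based illustration rather than the kernel's logic: `struct page_desc`, `struct seg`, `build_segs()` and the `folio_id` field are hypothetical, `PAGE_SZ` is fixed at 4096 for the example, and the model assumes pages sharing a `folio_id` are contiguous and given in order.

```c
#include <stddef.h>

#define PAGE_SZ 4096UL  /* example page size, an assumption of this sketch */

/* Hypothetical page descriptor: pages of one huge page share a folio_id. */
struct page_desc {
	unsigned long folio_id;
};

/* One I/O segment: index of its first page and its length in bytes. */
struct seg {
	size_t first_page;
	size_t len;
};

/*
 * Writes segments to out (capacity >= nr_pages) and returns how many
 * were written. If every page belongs to the same folio, one large
 * segment covers them all; otherwise there is one segment per page.
 */
static size_t build_segs(const struct page_desc *pages, size_t nr_pages,
			 struct seg *out)
{
	size_t i;
	int same = nr_pages > 1;

	for (i = 1; i < nr_pages; i++) {
		if (pages[i].folio_id != pages[0].folio_id) {
			same = 0;
			break;
		}
	}
	if (same) {
		out[0].first_page = 0;
		out[0].len = nr_pages * PAGE_SZ;
		return 1;
	}
	for (i = 0; i < nr_pages; i++) {
		out[i].first_page = i;
		out[i].len = PAGE_SZ;
	}
	return nr_pages;
}
```

With one segment instead of hundreds, a consumer that processes each segment separately (the commit mentions dma-mapping in drivers) has proportionally less per-segment work to do.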
2023-02-22  io_uring/rsrc: optimise single entry advance  Pavel Begunkov
Iterating within the first bvec entry should be essentially free, but we use iov_iter_advance() for that, which shows up in benchmark profiles taking up to 0.5% of CPU. Replace it with a hand coded version. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
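A hand-coded fast path of this kind can be sketched with a minimal segment iterator. The types and functions below (`struct seg`, `struct seg_iter`, `seg_iter_advance()`, `seg_iter_advance_slow()`) are hypothetical stand-ins, not the kernel's iov_iter API; the sketch only shows the idea of handling "stay inside the current segment" with two arithmetic updates and falling back to a general walk otherwise.

```c
#include <stddef.h>

/* Hypothetical segment and iterator types for the illustration. */
struct seg {
	size_t len;
};

struct seg_iter {
	const struct seg *seg;  /* current segment */
	size_t nr_segs;         /* segments left, including the current one */
	size_t seg_off;         /* offset into the current segment */
	size_t count;           /* total bytes left */
};

/* General path: walk across as many segments as needed. */
static void seg_iter_advance_slow(struct seg_iter *it, size_t n)
{
	it->count -= n;
	n += it->seg_off;
	while (it->nr_segs && n >= it->seg->len) {
		n -= it->seg->len;
		it->seg++;
		it->nr_segs--;
	}
	it->seg_off = n;
}

static void seg_iter_advance(struct seg_iter *it, size_t n)
{
	if (n > it->count)
		n = it->count;
	/* Fast path: the new position is still inside the current segment. */
	if (it->nr_segs && n < it->seg->len - it->seg_off) {
		it->seg_off += n;
		it->count -= n;
		return;
	}
	seg_iter_advance_slow(it, n);
}
```

For a buffer backed by one large segment, nearly every advance takes the fast path, so the loop in the slow path is rarely entered.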
2023-02-22  io_uring/rsrc: disallow multi-source reg buffers  Pavel Begunkov
If two or more mappings go back to back to each other they can be passed into io_uring to be registered as a single registered buffer. That would even work if mappings came from different sources, e.g. it's possible to mix in this way anon pages and pages from shmem or hugetlb. That is not a problem but it'd rather be less error-prone if we forbid such mixing. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-22  io_uring/rsrc: fix a comment in io_import_fixed()  Pavel Begunkov
io_import_fixed() supports offsets, but "may not" means the opposite. Replace it with "might not" so the comment rather speaks about possible cases. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/5b5f79958456caa6dc532f6205f75f224b232c81.1676902343.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-03  io_uring: use bvec_set_page to initialize a bvec  Christoph Hellwig
Use the bvec_set_page helper to initialize a bvec. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20230203150634.3199647-19-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-07  io_uring: use tw for putting rsrc  Pavel Begunkov
Use task_work for completing rsrc removals, it'll be needed later for spinlock optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cbba5d53a11ee6fc2194dacea262c1d733c8b529.1670384893.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-30  io_uring: don't reinstall quiesce node for each tw  Pavel Begunkov
There is no need to reinit data and install a new rsrc node every time we get a task_work, it's detrimental, just execute it and continue waiting. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/3895d3344164cd9b3a0bbb24a6e357e20a13434b.1669821213.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-30  io_uring: improve rsrc quiesce refs checks  Pavel Begunkov
Do a little bit of refactoring of io_rsrc_ref_quiesce(), flatten the data refs checks and so get rid of a conditional weird unlock-else-break construct. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d21283e9f88a77612c746ed526d86fe3bfb58a70.1669821213.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25  io_uring: remove overflow param from io_post_aux_cqe  Dylan Yudaken
The only call sites which would not allow overflow are also call sites which would use the io_aux_cqe as they care about ordering. So remove this parameter from io_post_aux_cqe. Signed-off-by: Dylan Yudaken <dylany@meta.com> Link: https://lore.kernel.org/r/20221124093559.3780686-9-dylany@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-21  io_uring: do not always force run task_work in io_uring_register  Dylan Yudaken
Running task work when not needed can unnecessarily delay operations. Specifically IORING_SETUP_DEFER_TASKRUN tries to avoid running task work until the user requests it. Therefore do not run it in io_uring_register any more. The one catch is that io_rsrc_ref_quiesce expects it to have run in order to process all outstanding references, and so reorder its loop to do this. Signed-off-by: Dylan Yudaken <dylany@meta.com> Link: https://lore.kernel.org/r/20221107123349.4106213-1-dylany@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>