summaryrefslogtreecommitdiff
path: root/fs/bcachefs/ec.c
AgeCommit message (Collapse)Author
2023-10-22bcachefs: Fix stripe reuse pathKent Overstreet
It's possible that we reuse a stripe that doesn't have quite the same configuration as the stripe_head we're allocating from. In that case, we have to make sure that the new stripe uses the settings from the stripe we resue, not the stripe head, and make sure the buffer is allocated correctly. This fixes the ec_mixed_tiers test. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: RESERVE_stripeKent Overstreet
Rework stripe creation path - new algorithm for deciding when to create new stripes or reuse existing stripes. We add a new allocation watermark, RESERVE_stripe, above RESERVE_none. Then we always try to create a new stripe by doing RESERVE_stripe allocations; if this fails, we reuse an existing stripe and allocate buckets for it with the reserve watermark for the given write (RESERVE_none or RESERVE_movinggc). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: More stripe create cleanup/fixesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Plumb alloc_reserve through stripe create pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: ec: Improve error message for btree node in stripeKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: ec: Ensure new stripe is closed in error pathKent Overstreet
This fixes a use-after-free bug. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: ec: zero_out_rest_of_ec_bucket()Kent Overstreet
Occasionally, we won't write to an entire bucket. This fixes the EC code to handle this case, zeroing out the rest of the bucket as needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve bch2_stripe_to_text()Kent Overstreet
We now print pointers as bucket:offset, the same as how we print extent pointers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: get_stripe_key_trans()Kent Overstreet
Another nested btree_trans fix Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix erasure coding shutdown pathKent Overstreet
It's possible when shutting down to for a stripe head to have a new stripe that doesn't yet have any blocks allocated - we just need to free it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix buffer overrun in ec_stripe_update_extent()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Simplify ec stripes heapKent Overstreet
Now that we have a separate data structure for tracking open stripes, the stripes heap can track all existing stripes, which is a nice simplification. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Erasure coding: Track open stripesKent Overstreet
This adds a new hash table for stripes being created or updated, instead of hackily relying on the stripes heap. This lets us reserve the slot for the new stripe up front, at the same time as we would pick an existing stripe - if we were updating an existing stripe - making the overall code more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Stripe deletion now checks what it's deletingKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve c->writes refcounting for stripe create pathKent Overstreet
This makes our handling of c->writes more consistent with other asynchronous work items. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Switch ec_stripes_heap_lock to a mutexKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix erasure coding lockingKent Overstreet
This adds a new helper, bch2_trans_mutex_lock(), for locking a mutex - dropping and retaking btree locks as needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't block on ec_stripe_head_lock with btree locks heldKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Erasure coding now uses bch2_bucket_alloc_transKent Overstreet
This code predates plumbing btree_trans through the bucket allocation path: switching to it fixes a deadlock due to using multiple btree_trans at the same time, which we never want to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Change bkey_invalid() rw param to flagsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't use key cache during fsckKent Overstreet
The btree key cache mainly helps with lock contention, at the cost of additional memory overhead. During some fsck passes the memory overhead really matters, but fsck is single threaded so lock contention is an issue - so skipping the key cache during fsck will help with performance. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Delete in memory ec backpointersKent Overstreet
Post btree backpointers, these aren't needed anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Erasure coding now uses backpointersKent Overstreet
This is only a start to updating erasure coding for backpointers - it's still not working yet. The subsequent patch will delete our old in memory backpointers for copygc, and this fixes a spurious EPERM bug/error message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Debug mode for c->writes referencesKent Overstreet
This adds a debug mode where we split up the c->writes refcount into distinct refcounts for every codepath that takes a reference, and adds sysfs code to print the value of each ref. This will make it easier to debug shutdown hangs due to refcount leaks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: ec_stripe_delete_work() now takes ref on c->writesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use for_each_btree_key_upto() more consistentlyKent Overstreet
It's important that in BTREE_ITER_FILTER_SNAPSHOTS mode we always use peek_upto() and provide an end for the interval we're searching for - otherwise, when we hit the end of the inode the next inode be in a different subvolume and not have any keys in the current snapshot, and we'd iterate over arbitrarily many keys before returning one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Convert EROFS errors to private error codesKent Overstreet
More error code improvements - this gets us more useful error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New btree helpersKent Overstreet
This introduces some new conveniences, to help cut down on boilerplate: - bch2_trans_kmalloc_nomemzero() - performance optimiation - bch2_bkey_make_mut() - bch2_bkey_get_mut() - bch2_bkey_get_mut_typed() - bch2_bkey_alloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: More errcode cleanupKent Overstreet
We shouldn't be overloading standard error codes now that we have provisions for bcachefs-specific errorcodes: this patch converts super.c and super-io.c to per error site errcodes, with a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: New bpos_cmp(), bkey_cmp() replacementsKent Overstreet
This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Ratelimit ec error messageKent Overstreet
We should fix this, but for now this makes this more usable. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add private error codes for ENOSPCKent Overstreet
Continuing the saga of introducing private dedicated error codes for each error path, this patch converts ENOSPC to error codes that are subtypes of ENOSPC. We've recently had a test failure where we got -ENOSPC where we shouldn't have, and didn't have enough information to tell where it came from, so this patch will solve that problem. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: EINTR -> BCH_ERR_transaction_restartKent Overstreet
Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Use bch2_err_str() in error messagesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: ec_stripe_bkey_insert() -> for_each_btree_key_norestart()Kent Overstreet
With the upcoming patches to add assertions for incorrect nested transaction restart handling, this code is now bogus. Switch it to for_each_btree_key_norestart() so that transaction restarts are only handled in one place. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert erasure coding to for_each_btree_key_commit()Kent Overstreet
The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Always use percpu_ref_tryget_live() on c->writesKent Overstreet
If we're trying to get a ref and the refcount has been killed, it means we're doing an emergency shutdown - we always want tryget_live(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Printbuf reworkKent Overstreet
This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Initialize ec work structs earlyKent Overstreet
We need to ensure that work structs in bch_fs always get initialized - otherwise an error in filesystem initialization can pop a warning in the workqueue code when we try to cancel a work struct that wasn't initialized. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Add rw to .key_invalid()Kent Overstreet
This adds a new parameter to .key_invalid() methods for whether the key is being read or written; the idea being that methods can do more aggressive checks when a key is newly created and being written, when we wouldn't want to delete the key because of those checks. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Convert .key_invalid methods to printbufsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Kill allocator threads & freelistsKent Overstreet
Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: x-macroize alloc_reserve enumKent Overstreet
This makes an array of strings available, like our other enums. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Heap allocate printbufsKent Overstreet
This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BTREE_ITER_WITH_JOURNALKent Overstreet
This adds a new btree iterator flag, BTREE_ITER_WITH_JOURNAL, that is automatically enabled when initializing a btree iterator before journal replay has completed - it overlays the contents of the journal with the btree. This lets us delete bch2_btree_and_journal_walk() and just use the normal btree iterator interface instead - which also lets us delete a significant amount of duplicated code. Note that BTREE_ITER_WITH_JOURNAL is still unoptimized in this patch - we're redoing the binary search over keys in the journal every time we call bch2_btree_iter_peek(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Add iter_flags arg to bch2_btree_delete_range()Kent Overstreet
Will be used by the new snapshot tests, to pass in BTREE_ITER_ALL_SNAPSHOTS. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Kill bch2_ec_mem_alloc()Kent Overstreet
bch2_ec_mem_alloc() was only used by GC, and there's no real need to preallocate the stripes radix tree since we can cope fine with memory allocation failure when we use the radix tree. This deletes a fair bit of code, and it's also needed for the upcoming patch because bch2_btree_iter_peek_prev() won't be working before journal replay completes (and using it was incorrect previously, as well). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Refactor open_bucket codeKent Overstreet
Prep work for adding a hash table of open buckets - instead of embedding a bch_extent_ptr, we need to refer to the bucket directly so that we're not calling sector_to_bucket() in the hash table lookup code, which has an expensive divide. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Turn encoded_extent_max into a regular optionKent Overstreet
It'll now be handled at format time and in sysfs like other options - it still can only be set at format time, though. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22bcachefs: Don't erasure code cached ptrsKent Overstreet
It doesn't make much sense to be erasure coding cached pointers, we should be erasure coding one of the dirty pointers in an extent. This patch makes sure we're passing BCH_WRITE_CACHED when we expect the new pointer to be a cached pointer, and tweaks the write path to not allocate from a stripe when BCH_WRITE_CACHED is set - and fixes an assertion we were hitting in the ec path where when adding the stripe to an extent and deleting the other pointers the pointer to the stripe didn't exist (because dropping all dirty pointers from an extent turns it into a KEY_TYPE_error key). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>