Age | Commit message (Collapse) | Author |
|
This fixes a performance regression in journal replay; without
colaescing accounting keys we have multiple keys at the same position,
which means journal_keys_peek_upto() has to skip past many overwritten
keys - turning journal replay into an O(n^2) algorithm.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Until accounting keys hit the btree, they are deltas, not new versions
of the existing key; this means we have to teach journal replay to
accumulate them.
Additionally, the journal doesn't track precisely which entries have
been flushed to the btree; it only tracks a range of entries that may
possibly still need to be flushed.
That means we need to compare accounting keys against the version in the
btree and only flush updates that are newer.
There's another wrinkle with the write buffer: if the write buffer
starts flushing accounting keys before journal replay has finished
flushing accounting keys, journal replay will see the version number
from the new updates and updates from the journal will be lost.
To avoid this, journal replay has to flush accounting keys first, and
we'll be adding a flag so that write buffer flush knows to hold
accounting keys until then.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
debug helper
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Don't pick a pivot that's going to be deleted.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
btree_and_journal_iter is old code that we want to get rid of, but we're
not ready to yet.
lack of btree node prefetching is, it turns out, a real performance
issue for fsck on spinning rust, so - add it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
we now always have a btree_trans when using a btree_and_journal_iter;
prep work for adding prefetching to btree_and_journal_iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The btree iterator code overlays keys from the journal until journal
replay is finished; since we're now starting copygc/rebalance etc.
before replay is finished, this is multithreaded access and thus needs
refcounting.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Split out a new file from recovery.c for managing the list of keys we
read from the journal: before journal replay finishes the btree iterator
code needs to be able to iterate over and return keys from the journal
as well, so there's a fair bit of code here.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|