Age | Commit message (Collapse) | Author |
|
We had a bug where the root inode of a subvolume was erronously deleted:
bch2_evict_inode() called bch2_inode_rm(), meaning the VFS inode's
i_nlink was somehow set to 0 when it shouldn't have - the inode in the
btree indicated it clearly was not unlinked.
This has been addressed with additional safety checks in
bch2_inode_rm() - pulling in the safety checks we already were doing
when deleting unlinked inodes in recovery - but the really disastrous
bug was in check_subvols(), which on finding a dangling subvol (subvol
with a missing root inode) would delete the subvolume.
I assume this bug dates from early check_directory_structure() code,
which originally handled subvolumes and normal paths - the idea being
that still live contents of the subvolume would get reattached
somewhere.
But that's incorrect, and disastrously so; deleting a subvolume triggers
deleting the snapshot ID it points to, deleting the entire contents.
The correct way to repair is to recreate the root inode if it's missing;
then any contents will get reattached under that subvolume's lost+found.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a superblock flag to temporarily disable ratelimiting for a recovery
pass.
This will be used to make check_key_has_snapshot safer: we don't want to
delete a key for a missing snapshot unless we know that the snapshots
and subvolumes btrees are consistent, i.e. check_snapshots and
check_subvols have run recently.
Changing those btrees - creating/deleting a subvolume or snapshot - will
set the "disable ratelimit" flag, i.e. ensuring that those passes run if
check_key_has_snapshot discovers an error.
We're only disabling ratelimiting in the snapshot/subvol delete paths,
we're not so concerned about the create paths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a tracepoint for any time we return an error and unwind.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The new guard(), scoped_guard() allow for more natural code.
Some of the uses with creative flow control have been left.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Consolidate the run_explicit_recovery_pass() interfaces by adding a
flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit
flag.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we detect an error that requires running a recovery pass, and we're
not in recovery, we won't be able to fix it until the next mount - make
sure we're noting in the superblock that it needs to run.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Drop the single-purpose write ref code in bcachefs.h, and convert to
enumarated refs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Instead of going emegency read only with a bch2_fs_inconsistent() call,
log the error and recovery pass appropriately.
If we're still in recovery it'll be repaired immediately, otherwise
it'll be repaired on the next mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
_init_early() is for initialization that cannot fail, and often must
happen for teardown partway through initialization to work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
snapshot
Fix this repair path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was planned to be done ages ago, now finally completed; there are
places where we have quite a few btree_trans objects on the stack, so
this reduces stack usage somewhat.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes recursive subvolume removal.
Subvolume deletion is asynchronous; fs_path_parent, and thus the entry
in the subvolume_children btree, need to be cleared when the subvolume
is unlinked from the fs heirarchy - else we'll spuriously think a
subvolume has children and deletion will fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
For some unknown reason, checks on struct bkey_s_c_snapshot and struct
bkey_s_c_snapshot_tree pointers are missing.
Therefore, I think it would be appropriate to fix the incorrect pointer checking
through this patch.
Fixes: 4bd06f07bcb5 ("bcachefs: Fixes for snapshot_tree.master_subvol")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Ensure that snapshot_tree.master_subvol is cleared when we delete the
master subvolume in a tree of snapshots, and allow for snapshot trees
that don't have a master subvolume in fsck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a new parameter to bkey validate functions, and use it to improve
invalid bkey error messages: we can now print the btree and depth it
came from, or if it came from the journal, or is a btree root.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We unconditionally go read-write, if we're going to do so, before
journal replay: lazy_rw is obsolete.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Using commit_do() to call alloc_sectors_start_trans() breaks when we're
randomly injecting transaction restarts - the restart in the commit
causes us to leak the lock that alloc_sectorS_start_trans() takes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.
This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
A couple small error handling fixes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bkey_fsck_err() was added as an interface that looks like fsck_err(),
but previously all it did was ensure that the appropriate error counter
was incremented in the superblock.
This is a cleanup and bugfix patch that converts it to a wrapper around
fsck_err(). This is needed to fix an issue with the upgrade path to
disk_accounting_v3, where the "silent fix" error list now includes
bkey_fsck errors; fsck_err() handles this in a unified way, and since we
need to change printing of bkey fsck errors from the caller to the inner
bkey_fsck_err() calls, this ends up being a pretty big change.
Als,, rename .invalid() methods to .validate(), for clarity, while we're
changing the function signature anyways (to drop the printbuf argument).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The commit 65bd44239727 ("bcachefs: bch2_btree_insert_trans() no longer
specifies BTREE_ITER_cached") removes BTREE_ITER_cached from
bch2_btree_insert_trans, which causes the update_inode function from
bcachefs-tools to take a long time (~20s). Add an iter_flags parameter
to bch2_btree_insert, so the users can specify iter update trigger
flags, such as BTREE_ITER_cached.
Signed-off-by: Ariel Miculas <ariel.miculas@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
fsck_err() now optionally takes a btree_trans; if the current thread has
one, it is required that it be passed.
The next patch will use this to unlock when waiting for user input.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes an assertion pop in btree_iter.c that checks for forgetting
to pass a snapshot ID when iterating over snapshots btrees.
Reported-by: syzbot+0dfe05235e38653e2aee@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're about to start using bch_validate_flags for superblock section
validation - it's no longer bkey specific.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When building with clang's -Wincompatible-function-pointer-types-strict
(a warning designed to catch potential kCFI failures at build time),
there are several warnings along the lines of:
fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict]
118 | BCH_BKEY_TYPES()
| ^~~~~~~~~~~~~~~~
fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES'
394 | x(inode, 8) \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name,
| ^~~~~~~~~~~~~~~~~~~~
<scratch space>:277:1: note: expanded from here
277 | bch2_bkey_ops_inode
| ^~~~~~~~~~~~~~~~~~~
fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode'
26 | .trigger = bch2_trigger_inode, \
| ^~~~~~~~~~~~~~~~~~
There are several functions that did not have their flags parameter
converted to 'enum btree_iter_update_trigger_flags' in the recent
unification, which will cause kCFI failures at runtime because the
types, while ABI compatible (hence no warning from the non-strict
version of this warning), do not match exactly.
Fix up these functions (as well as a few other obvious functions that
should have it, even if there are no warnings currently) to resolve the
warnings and potential kCFI runtime failures.
Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Combine iter/update/trigger/str_hash flags into a single enum, and
x-macroize them for a to_text() function later.
These flags are all for a specific iter/key/update context, so it makes
sense to group them together - iter/update/trigger flags were already
given distinct bits, this cleans up and unifies that handling.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We've grown a fair amount of code for managing recovery passes; tracking
which ones we're running, which ones need to be run, and flagging in the
superblock which ones need to be run on the next recovery.
So it's worth splitting out into its own file, this code is pretty
different from the code in recovery.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Recursively destroying subvolumes isn't allowed yet.
Fixes: https://github.com/koverstreet/bcachefs/issues/634
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a btree to record a parent -> child subvolume relationships,
according to the filesystem heirarchy.
The subvolume_children btree is a bitset btree: if a bit is set at pos
p, that means p.offset is a child of subvolume p.inode.
This will be used for efficiently listing subvolumes, as well as
recursive deletion.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Record the filesystem path heirarchy for subvolumes in bch_subvolume
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bit of renaming, prep for adding a fs path parent
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Subvolumes and subvolume root inodes point to each other: this verifies
the subvolume -> inode -> subvolme path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less
error prone.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
BTREE_INSERT flags are actually transaction commit flags - rename them
for clarity.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add checks to all the VFS paths for "are we in a RO snapshot?".
Note - we don't check this when setting inode options via our xattr
interface, since those generally only affect data placement, not
contents of data.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
|
|
This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.
Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Be a bit more careful about when bch2_delete_dead_snapshots needs to
run: it only needs to run synchronously if we're running fsck, and it
only needs to run at all if we have snapshot nodes to delete or if fsck
has noticed that it needs to run.
Also:
Rename BCH_FS_HAVE_DELETED_SNAPSHOTS -> BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS
Kill bch2_delete_dead_snapshots_hook(), move functionality to
bch2_mark_snapshot()
Factor out bch2_check_snapshot_needs_deletion(), to explicitly check
if we need to be running snapshot deletion.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're using more stack than we'd like in a number of functions, and
btree_trans is the biggest object that we stack allocate.
But we have to do a heap allocatation to initialize it anyways, so
there's no real downside to heap allocating the entire thing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
callbacks
When building bcachefs with -Wincompatible-function-pointer-types-strict,
a clang warning designed to catch issues with mismatched function
pointer types, which will be fatal at runtime due to kernel Control Flow
Integrity (kCFI), there are several instances along the lines of:
fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(const struct bch_fs *, struct bkey_s_c, enum bkey_invalid_flags, struct printbuf *)' with an expression of type 'int (const struct bch_fs *, struct bkey_s_c, unsigned int, struct printbuf *)' [-Werror,-Wincompatible-function-pointer-types-strict]
118 | BCH_BKEY_TYPES()
| ^~~~~~~~~~~~~~~~
fs/bcachefs/bcachefs_format.h:342:2: note: expanded from macro 'BCH_BKEY_TYPES'
342 | x(deleted, 0) \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name,
| ^~~~~~~~~~~~~~~~~~~~
<scratch space>:206:1: note: expanded from here
206 | bch2_bkey_ops_deleted
| ^~~~~~~~~~~~~~~~~~~~~
fs/bcachefs/bkey_methods.c:34:17: note: expanded from macro 'bch2_bkey_ops_deleted'
34 | .key_invalid = deleted_key_invalid, \
| ^~~~~~~~~~~~~~~~~~~
The flags parameter should be of type 'enum bkey_invalid_flags', not
'unsigned int'. Adjust the type everywhere so that there is no more
warning.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
subvolume.c has gotten a bit large, this splits out a separate file just
for managing snapshot trees - BTREE_ID_snapshots.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This introduces bch2_run_explicit_recovery_pass() and uses it for when
fsck detects that we need to re-run dead snaphots cleanup, and makes
dead snapshot cleanup more like a normal recovery pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|