summaryrefslogtreecommitdiff
path: root/fs/bcachefs/subvolume.c
AgeCommit message (Collapse)Author
2025-06-04bcachefs: Fix subvol to missing root repairKent Overstreet
We had a bug where the root inode of a subvolume was erronously deleted: bch2_evict_inode() called bch2_inode_rm(), meaning the VFS inode's i_nlink was somehow set to 0 when it shouldn't have - the inode in the btree indicated it clearly was not unlinked. This has been addressed with additional safety checks in bch2_inode_rm() - pulling in the safety checks we already were doing when deleting unlinked inodes in recovery - but the really disastrous bug was in check_subvols(), which on finding a dangling subvol (subvol with a missing root inode) would delete the subvolume. I assume this bug dates from early check_directory_structure() code, which originally handled subvolumes and normal paths - the idea being that still live contents of the subvolume would get reattached somewhere. But that's incorrect, and disastrously so; deleting a subvolume triggers deleting the snapshot ID it points to, deleting the entire contents. The correct way to repair is to recreate the root inode if it's missing; then any contents will get reattached under that subvolume's lost+found. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-04bcachefs: Add flags to subvolume_to_text()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: BCH_RECOVERY_PASS_NO_RATELIMITKent Overstreet
Add a superblock flag to temporarily disable ratelimiting for a recovery pass. This will be used to make check_key_has_snapshot safer: we don't want to delete a key for a missing snapshot unless we know that the snapshots and subvolumes btrees are consistent, i.e. check_snapshots and check_subvols have run recently. Changing those btrees - creating/deleting a subvolume or snapshot - will set the "disable ratelimit" flag, i.e. ensuring that those passes run if check_key_has_snapshot discovers an error. We're only disabling ratelimiting in the snapshot/subvol delete paths, we're not so concerned about the create paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02bcachefs: bch_err_throw()Kent Overstreet
Add a tracepoint for any time we return an error and unwind. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-01bcachefs: Replace rcu_read_lock() with guardsKent Overstreet
The new guard(), scoped_guard() allow for more natural code. Some of the uses with creative flow control have been left. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_run_explicit_recovery_pass() cleanupKent Overstreet
Consolidate the run_explicit_recovery_pass() interfaces by adding a flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit flag. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: snapshot delete progress indicatorKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Run most explicit recovery passes persistentKent Overstreet
If we detect an error that requires running a recovery pass, and we're not in recovery, we won't be able to fix it until the next mount - make sure we're noting in the superblock that it needs to run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch_fs.writes -> enumerated_refsKent Overstreet
Drop the single-purpose write ref code in bcachefs.h, and convert to enumarated refs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Flag for repair on missing subvolumeKent Overstreet
Instead of going emegency read only with a bch2_fs_inconsistent() call, log the error and recovery pass appropriately. If we're still in recovery it'll be repaired immediately, otherwise it'll be repaired on the next mount. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Move various init code to _init_early()Kent Overstreet
_init_early() is for initialization that cannot fail, and often must happen for teardown partway through initialization to work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_subvolume_wait_for_pagecache_and_delete() cleanupKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28bcachefs: Run BCH_RECOVERY_PASS_reconstruct_snapshots on missing subvol -> ↵Kent Overstreet
snapshot Fix this repair path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02bcachefs: Kill btree_iter.transKent Overstreet
This was planned to be done ages ago, now finally completed; there are places where we have quite a few btree_trans objects on the stack, so this reduces stack usage somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29bcachefs: Clear fs_path_parent on subvolume unlinkKent Overstreet
This fixes recursive subvolume removal. Subvolume deletion is asynchronous; fs_path_parent, and thus the entry in the subvolume_children btree, need to be cleared when the subvolume is unlinked from the fs heirarchy - else we'll spuriously think a subvolume has children and deletion will fail. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06bcachefs: fix incorrect pointer check in __bch2_subvolume_delete()Jeongjun Park
For some unknown reason, checks on struct bkey_s_c_snapshot and struct bkey_s_c_snapshot_tree pointers are missing. Therefore, I think it would be appropriate to fix the incorrect pointer checking through this patch. Fixes: 4bd06f07bcb5 ("bcachefs: Fixes for snapshot_tree.master_subvol") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09bcachefs: Fixes for snapshot_tree.master_subvolKent Overstreet
Ensure that snapshot_tree.master_subvol is cleared when we delete the master subvolume in a tree of snapshots, and allow for snapshot trees that don't have a master subvolume in fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: kill flags param to bch2_subvolume_get()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: struct bkey_validate_contextKent Overstreet
Add a new parameter to bkey validate functions, and use it to improve invalid bkey error messages: we can now print the btree and depth it came from, or if it came from the journal, or is a btree root. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21bcachefs: Kill BCH_TRANS_COMMIT_lazy_rwKent Overstreet
We unconditionally go read-write, if we're going to do so, before journal replay: lazy_rw is obsolete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18bcachefs: Don't use commit_do() unnecessarilyKent Overstreet
Using commit_do() to call alloc_sectors_start_trans() breaks when we're randomly injecting transaction restarts - the restart in the commit causes us to leak the lock that alloc_sectorS_start_trans() takes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04bcachefs: Add warn param to subvol_get_snapshot, peek_inodeKent Overstreet
These shouldn't always be fatal errors - logged op resume, in particular, and we want it as a parameter there. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04bcachefs: Kill snapshot arg to fsck_write_inode()Kent Overstreet
It was initially believed that it would be better to be explicit about the snapshot we're updating when writing inodes in fsck; however, it turns out that passing around the snapshot separately is more error prone and we're usually updating the inode in the same snapshow we read it from. This is different from normal filesystem paths, where we do the update in the snapshot of the subvolume we're in. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27bcachefs: Fix iterator leak in check_subvol()Kent Overstreet
A couple small error handling fixes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()Kent Overstreet
bkey_fsck_err() was added as an interface that looks like fsck_err(), but previously all it did was ensure that the appropriate error counter was incremented in the superblock. This is a cleanup and bugfix patch that converts it to a wrapper around fsck_err(). This is needed to fix an issue with the upgrade path to disk_accounting_v3, where the "silent fix" error list now includes bkey_fsck errors; fsck_err() handles this in a unified way, and since we need to change printing of bkey fsck errors from the caller to the inner bkey_fsck_err() calls, this ends up being a pretty big change. Als,, rename .invalid() methods to .validate(), for clarity, while we're changing the function signature anyways (to drop the printbuf argument). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: bch2_btree_insert() - add btree iter flagsAriel Miculas
The commit 65bd44239727 ("bcachefs: bch2_btree_insert_trans() no longer specifies BTREE_ITER_cached") removes BTREE_ITER_cached from bch2_btree_insert_trans, which causes the update_inode function from bcachefs-tools to take a long time (~20s). Add an iter_flags parameter to bch2_btree_insert, so the users can specify iter update trigger flags, such as BTREE_ITER_cached. Signed-off-by: Ariel Miculas <ariel.miculas@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: fsck_err() may now take a btree_transKent Overstreet
fsck_err() now optionally takes a btree_trans; if the current thread has one, it is required that it be passed. The next patch will use this to unlock when waiting for user input. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-20bcachefs: Check for subvolues with bogus snapshot/inode fieldsKent Overstreet
This fixes an assertion pop in btree_iter.c that checks for forgetting to pass a snapshot ID when iterating over snapshots btrees. Reported-by: syzbot+0dfe05235e38653e2aee@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: s/bkey_invalid_flags/bch_validate_flagsKent Overstreet
We're about to start using bch_validate_flags for superblock section validation - it's no longer bkey specific. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Fix type of flags parameter for some ->trigger() implementationsNathan Chancellor
When building with clang's -Wincompatible-function-pointer-types-strict (a warning designed to catch potential kCFI failures at build time), there are several warnings along the lines of: fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict] 118 | BCH_BKEY_TYPES() | ^~~~~~~~~~~~~~~~ fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES' 394 | x(inode, 8) \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x' 117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name, | ^~~~~~~~~~~~~~~~~~~~ <scratch space>:277:1: note: expanded from here 277 | bch2_bkey_ops_inode | ^~~~~~~~~~~~~~~~~~~ fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode' 26 | .trigger = bch2_trigger_inode, \ | ^~~~~~~~~~~~~~~~~~ There are several functions that did not have their flags parameter converted to 'enum btree_iter_update_trigger_flags' in the recent unification, which will cause kCFI failures at runtime because the types, while ABI compatible (hence no warning from the non-strict version of this warning), do not match exactly. Fix up these functions (as well as a few other obvious functions that should have it, even if there are no warnings currently) to resolve the warnings and potential kCFI runtime failures. Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31bcachefs: Split out recovery_passes.cKent Overstreet
We've grown a fair amount of code for managing recovery passes; tracking which ones we're running, which ones need to be run, and flagging in the superblock which ones need to be run on the next recovery. So it's worth splitting out into its own file, this code is pretty different from the code in recovery.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: Check for subvolume children when deleting subvolumesKent Overstreet
Recursively destroying subvolumes isn't allowed yet. Fixes: https://github.com/koverstreet/bcachefs/issues/634 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: BTREE_ID_subvolume_childrenKent Overstreet
Add a btree to record a parent -> child subvolume relationships, according to the filesystem heirarchy. The subvolume_children btree is a bitset btree: if a bit is set at pos p, that means p.offset is a child of subvolume p.inode. This will be used for efficiently listing subvolumes, as well as recursive deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: bch_subvolume::fs_path_parentKent Overstreet
Record the filesystem path heirarchy for subvolumes in bch_subvolume Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: bch_subvolume::parent -> creation_parentKent Overstreet
bit of renaming, prep for adding a fs path parent Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13bcachefs: Check subvol <-> inode pointers in check_subvol()Kent Overstreet
Subvolumes and subvolume root inodes point to each other: this verifies the subvolume -> inode -> subvolme path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_btree_key() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_err_(fn|msg) check if should printKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Explicity go RW for fsckKent Overstreet
This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less error prone. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename BTREE_INSERT flagsKent Overstreet
BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: make RO snapshots actually ROKent Overstreet
Add checks to all the VFS paths for "are we in a RO snapshot?". Note - we don't check this when setting inode options via our xattr interface, since those generally only affect data placement, not contents of data. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: "Carl E. Thompson" <list-bcachefs@carlthompson.net>
2023-11-01bcachefs: Enumerate fsck errorsKent Overstreet
This patch adds a superblock error counter for every distinct fsck error; this means that when analyzing filesystems out in the wild we'll be able to see what sorts of inconsistencies are being found and repair, and hence what bugs to look for. Errors validating bkeys are not yet considered distinct fsck errors, but this patch adds a new helper, bkey_fsck_err(), in order to add distinct error types for them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: Don't run bch2_delete_dead_snapshots() unnecessarilyKent Overstreet
Be a bit more careful about when bch2_delete_dead_snapshots needs to run: it only needs to run synchronously if we're running fsck, and it only needs to run at all if we have snapshot nodes to delete or if fsck has noticed that it needs to run. Also: Rename BCH_FS_HAVE_DELETED_SNAPSHOTS -> BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS Kill bch2_delete_dead_snapshots_hook(), move functionality to bch2_mark_snapshot() Factor out bch2_check_snapshot_needs_deletion(), to explicitly check if we need to be running snapshot deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Heap allocate btree_transKent Overstreet
We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix -Wincompatible-function-pointer-types-strict from key_invalid ↵Nathan Chancellor
callbacks When building bcachefs with -Wincompatible-function-pointer-types-strict, a clang warning designed to catch issues with mismatched function pointer types, which will be fatal at runtime due to kernel Control Flow Integrity (kCFI), there are several instances along the lines of: fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(const struct bch_fs *, struct bkey_s_c, enum bkey_invalid_flags, struct printbuf *)' with an expression of type 'int (const struct bch_fs *, struct bkey_s_c, unsigned int, struct printbuf *)' [-Werror,-Wincompatible-function-pointer-types-strict] 118 | BCH_BKEY_TYPES() | ^~~~~~~~~~~~~~~~ fs/bcachefs/bcachefs_format.h:342:2: note: expanded from macro 'BCH_BKEY_TYPES' 342 | x(deleted, 0) \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x' 117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name, | ^~~~~~~~~~~~~~~~~~~~ <scratch space>:206:1: note: expanded from here 206 | bch2_bkey_ops_deleted | ^~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/bkey_methods.c:34:17: note: expanded from macro 'bch2_bkey_ops_deleted' 34 | .key_invalid = deleted_key_invalid, \ | ^~~~~~~~~~~~~~~~~~~ The flags parameter should be of type 'enum bkey_invalid_flags', not 'unsigned int'. Adjust the type everywhere so that there is no more warning. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Convert more code to bch_err_msg()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Split out snapshot.cKent Overstreet
subvolume.c has gotten a bit large, this splits out a separate file just for managing snapshot trees - BTREE_ID_snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bch2_run_explicit_recovery_pass()Kent Overstreet
This introduces bch2_run_explicit_recovery_pass() and uses it for when fsck detects that we need to re-run dead snaphots cleanup, and makes dead snapshot cleanup more like a normal recovery pass. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Inline bch2_snapshot_is_ancestor() fast pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>