summaryrefslogtreecommitdiff
path: root/fs/bcachefs/snapshot.c
AgeCommit message (Collapse)Author
2024-10-12bcachefs: Improve check_snapshot_exists()Kent Overstreet
Check if we have snapshot_trees or subvolumes that refer to the snapshot node being reconstructed, and use them. With this, the kill_btree_root test that blows away the snapshots btree now passes, and we're able to successfully reconstruct. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09bcachefs: Kill bch2_propagate_key_to_snapshot_leaves()Kent Overstreet
Dead code now. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-23bcachefs: Fix infinite loop in propagate_key_to_snapshot_leaves()Kent Overstreet
As we iterate we need to mark that we no longer need iterators - otherwise we'll infinite loop via the "too many iters" check when there's many snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21bcachefs: Hold read lock in bch2_snapshot_tree_oldest_subvol()Ahmed Ehab
Syzbot reports a problem that a warning is triggered due to suspicious use of rcu_dereference_check(). That is triggered by a call of bch2_snapshot_tree_oldest_subvol(). The cause of the warning is that inside bch2_snapshot_tree_oldest_subvol(), snapshot_t() is called which calls rcu_dereference() that requires a read lock to be held. Also, the call of bch2_snapshot_tree_next() eventually calls snapshot_t(). To fix this, call rcu_read_lock() before calling snapshot_t(). Then, release the lock after the termination of the while loop. Reported-by: <syzbot+f7c41a878676b72c16a6@syzkaller.appspotmail.com> Signed-off-by: Ahmed Ehab <bottaawesome633@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()Kent Overstreet
bkey_fsck_err() was added as an interface that looks like fsck_err(), but previously all it did was ensure that the appropriate error counter was incremented in the superblock. This is a cleanup and bugfix patch that converts it to a wrapper around fsck_err(). This is needed to fix an issue with the upgrade path to disk_accounting_v3, where the "silent fix" error list now includes bkey_fsck errors; fsck_err() handles this in a unified way, and since we need to change printing of bkey fsck errors from the caller to the inner bkey_fsck_err() calls, this ends up being a pretty big change. Als,, rename .invalid() methods to .validate(), for clarity, while we're changing the function signature anyways (to drop the printbuf argument). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: fsck_err() may now take a btree_transKent Overstreet
fsck_err() now optionally takes a btree_trans; if the current thread has one, it is required that it be passed. The next patch will use this to unlock when waiting for user input. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25bcachefs: Fix kmalloc bug in __snapshot_t_mutPei Li
When allocating too huge a snapshot table, we should fail gracefully in __snapshot_t_mut() instead of fail in kmalloc(). Reported-by: syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=770e99b65e26fa023ab1 Tested-by: syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com Signed-off-by: Pei Li <peili.dev@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-19bcachefs: set_worker_desc() for delete_dead_snapshotsKent Overstreet
this is long running - help users see what's going on Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-19bcachefs: delete_dead_snapshots() doesn't need to go RWKent Overstreet
We've been moving away from going RW lazily; if we want to go RW we do that in set_may_go_rw(), and if we didn't go RW we don't need to delete dead snapshots. Reported-by: syzbot+4366624c0b5aac4906cf@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Run check_key_has_snapshot in snapshot_delete_keys()Kent Overstreet
delete_dead_snapshots now runs before the main fsck.c passes which check for keys for invalid snapshots; thus, it needs those checks as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Refactor delete_dead_snapshots()Kent Overstreet
Consolidate per-key work into delete_dead_snapshots_process_key(), so we now walk all keys once, not twice. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09bcachefs: s/bkey_invalid_flags/bch_validate_flagsKent Overstreet
We're about to start using bch_validate_flags for superblock section validation - it's no longer bkey specific. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: eliminate the uninitialized compilation warning in ↵Hongbo Li
bch2_reconstruct_snapshots When compiling the bcachefs-tools, the following compilation warning is reported: libbcachefs/snapshot.c: In function ‘bch2_reconstruct_snapshots’: libbcachefs/snapshot.c:915:19: warning: ‘tree_id’ may be used uninitialized in this function [-Wmaybe-uninitialized] 915 | snapshot->v.tree = cpu_to_le32(tree_id); libbcachefs/snapshot.c:903:6: note: ‘tree_id’ was declared here 903 | u32 tree_id; | ^~~~~~~ This is a false alert, because @tree_id is changed in bch2_snapshot_tree_create after it returns 0. And if this function returns other value, @tree_id wouldn't be used. Thus there should be nothing wrong in logical. Although the report itself is a false alert, we can still make it more explicit by setting the initial value of @tree_id to 0 (an invalid tree ID). Fixes: a292be3b68f3 ("bcachefs: Reconstruct missing snapshot nodes") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Fix type of flags parameter for some ->trigger() implementationsNathan Chancellor
When building with clang's -Wincompatible-function-pointer-types-strict (a warning designed to catch potential kCFI failures at build time), there are several warnings along the lines of: fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict] 118 | BCH_BKEY_TYPES() | ^~~~~~~~~~~~~~~~ fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES' 394 | x(inode, 8) \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x' 117 | #define x(name, nr) [KEY_TYPE_##name] = bch2_bkey_ops_##name, | ^~~~~~~~~~~~~~~~~~~~ <scratch space>:277:1: note: expanded from here 277 | bch2_bkey_ops_inode | ^~~~~~~~~~~~~~~~~~~ fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode' 26 | .trigger = bch2_trigger_inode, \ | ^~~~~~~~~~~~~~~~~~ There are several functions that did not have their flags parameter converted to 'enum btree_iter_update_trigger_flags' in the recent unification, which will cause kCFI failures at runtime because the types, while ABI compatible (hence no warning from the non-strict version of this warning), do not match exactly. Fix up these functions (as well as a few other obvious functions that should have it, even if there are no warnings currently) to resolve the warnings and potential kCFI runtime failures. Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: allow for custom action in fsck error messagesKent Overstreet
Be more explicit to the user about what we're doing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04bcachefs: Make snapshot_is_ancestor() safeKent Overstreet
Snapshot table accesses generally need to be checking for invalid snapshot ID now, fix one that was missed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-03bcachefs: Reconstruct missing snapshot nodesKent Overstreet
When the snapshots btree is going, we'll have to delete huge amounts of data - unless we can reconstruct it by looking at the keys that refer to it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31bcachefs: Improve -o norecovery; opts.recovery_pass_limitKent Overstreet
This adds opts.recovery_pass_limit, and redoes -o norecovery to make use of it; this fixes some issues with -o norecovery so it can be safely used for data recovery. Norecovery means "don't do journal replay"; it's an important data recovery tool when we're getting stuck in journal replay. When using it this way we need to make sure we don't free journal keys after startup, so we continue to overlay them: thus it needs to imply retain_recovery_info, as well as nochanges. recovery_pass_limit is an explicit option for telling recovery to exit after a specific recovery pass; this is a much cleaner way of implementing -o norecovery, as well as being a useful debug feature in its own right. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31bcachefs: Add checks for invalid snapshot IDsKent Overstreet
Previously, we assumed that keys were consistent with the snapshots btree - but that's not correct as fsck may not have been run or may not be complete. This adds checks and error handling when using the in-memory snapshots table (that mirrors the snapshots btree). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31bcachefs: Move snapshot table size to struct snapshot_tableKent Overstreet
We need to add bounds checking for snapshot table accesses - it turns out there are cases where we do need to use the snapshots table before fsck checks have completed (and indeed, fsck may not have been run). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-17bcachefs: bch2_snapshot_is_ancestor() now safe to call in early recoveryKent Overstreet
this fixes an assertion pop in bch2_check_snapshot_trees() -> check_snapshot_tree() -> bch2_snapshot_tree_master_subvol() -> bch2_snapshot_is_ancestor() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-24bcachefs: Fix check_snapshot() memcpyKent Overstreet
check_snapshot() copies the bch_snapshot to a temporary to easily handle older versions that don't have all the fields of the current version, but it lacked a min() to correctly handle keys newer and larger than the current version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bch_snapshot::btimeKent Overstreet
Add a field to bch_snapshot for creation time; this will be important when we start exposing the snapshot tree to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exitSu Yue
bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut. It should be freed by kvfree not kfree. Or umount will triger: [ 406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008 [ 406.830676 ] #PF: supervisor read access in kernel mode [ 406.831643 ] #PF: error_code(0x0000) - not-present page [ 406.832487 ] PGD 0 P4D 0 [ 406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI [ 406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G OE 6.7.0-rc7-custom+ #90 [ 406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 406.835796 ] RIP: 0010:kfree+0x62/0x140 [ 406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6 [ 406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286 [ 406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4 [ 406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000 [ 406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001 [ 406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80 [ 406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000 [ 406.840451 ] FS: 00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000 [ 406.840851 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0 [ 406.841464 ] Call Trace: [ 406.841583 ] <TASK> [ 406.841682 ] ? __die+0x1f/0x70 [ 406.841828 ] ? page_fault_oops+0x159/0x470 [ 406.842014 ] ? fixup_exception+0x22/0x310 [ 406.842198 ] ? exc_page_fault+0x1ed/0x200 [ 406.842382 ] ? asm_exc_page_fault+0x22/0x30 [ 406.842574 ] ? bch2_fs_release+0x54/0x280 [bcachefs] [ 406.842842 ] ? kfree+0x62/0x140 [ 406.842988 ] ? kfree+0x104/0x140 [ 406.843138 ] bch2_fs_release+0x54/0x280 [bcachefs] [ 406.843390 ] kobject_put+0xb7/0x170 [ 406.843552 ] deactivate_locked_super+0x2f/0xa0 [ 406.843756 ] cleanup_mnt+0xba/0x150 [ 406.843917 ] task_work_run+0x59/0xa0 [ 406.844083 ] exit_to_user_mode_prepare+0x197/0x1a0 [ 406.844302 ] syscall_exit_to_user_mode+0x16/0x40 [ 406.844510 ] do_syscall_64+0x4e/0xf0 [ 406.844675 ] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 406.844907 ] RIP: 0033:0x7f0a2664e4fb Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: fsck_err()s don't need to manually check c->sb.version anymoreKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: mark now takes bkey_sKent Overstreet
Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: for_each_btree_key() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: darray_for_each() now declares loop iterKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch_err_(fn|msg) check if should printKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Fix snapshot.c assertion for online fsckKent Overstreet
c->curr_recovery_pass can go backwards; this adds a non rewinding version, c->recovery_pass_done. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Kill for_each_btree_key()Kent Overstreet
for_each_btree_key() handles transaction restarts, like for_each_btree_key2(), but only calls bch2_trans_begin() after a transaction restart - for_each_btree_key2() wraps every loop iteration in a transaction. The for_each_btree_key() behaviour is problematic when it leads to holding the SRCU lock that prevents key cache reclaim for an unbounded amount of time - there's no real need to keep it around. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Explicity go RW for fsckKent Overstreet
This eliminates a lot of BCH_TRANS_COMMIT_lazy_rw flags, and is less error prone. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: convert bch_fs_flags to x-macroKent Overstreet
Now we can print out filesystem flags in sysfs, useful for debugging various "what's my filesystem doing" issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Rename BTREE_INSERT flagsKent Overstreet
BTREE_INSERT flags are actually transaction commit flags - rename them for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24bcachefs: Fix an endianness conversionKent Overstreet
cpu_to_le32(), not le32_to_cpu() - fixes a sparse complaint. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01bcachefs: Fix deleted inodes btree in snapshot deletionKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01bcachefs: Enumerate fsck errorsKent Overstreet
This patch adds a superblock error counter for every distinct fsck error; this means that when analyzing filesystems out in the wild we'll be able to see what sorts of inconsistencies are being found and repair, and hence what bugs to look for. Errors validating bkeys are not yet considered distinct fsck errors, but this patch adds a new helper, bkey_fsck_err(), in order to add distinct error types for them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01bcachefs: Fix snapshot skiplistsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: Don't run bch2_delete_dead_snapshots() unnecessarilyKent Overstreet
Be a bit more careful about when bch2_delete_dead_snapshots needs to run: it only needs to run synchronously if we're running fsck, and it only needs to run at all if we have snapshot nodes to delete or if fsck has noticed that it needs to run. Also: Rename BCH_FS_HAVE_DELETED_SNAPSHOTS -> BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS Kill bch2_delete_dead_snapshots_hook(), move functionality to bch2_mark_snapshot() Factor out bch2_check_snapshot_needs_deletion(), to explicitly check if we need to be running snapshot deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: Fix lock ordering with snapshot_create_lockKent Overstreet
We must not hold btree locks while taking snapshot_create_lock - this fixes a lockdep splat. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: snapshot_create_lockKent Overstreet
Add a new lock for snapshot creation - this addresses a few races with logged operations and snapshot deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix snapshot skiplists during snapshot deletionKent Overstreet
In snapshot deleion, we have to pick new skiplist nodes for entries that point to nodes being deleted. The function that finds a new skiplist node, skipping over entries being deleted, was incorrect: if n = 0, but the parent node is being deleted, we also need to skip over that node. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix looping around bch2_propagate_key_to_snapshot_leaves()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make sure to initialize equiv when creating new snapshotsKent Overstreet
Previously, equiv was set in the snapshot deletion path, which is where it's needed - equiv, for snapshot ID equivalence classes, would ideally be a private data structure to the snapshot deletion path. But if a new snapshot is created while snapshot deletion is running, move_key_to_correct_snapshot() moves a key to snapshot id 0 - oops. Fixes: https://github.com/koverstreet/bcachefs/issues/593 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: snapshots: Use kvfree_rcu_mightsleep()Kent Overstreet
kvfree_rcu() was renamed - not removed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: snapshot: Add missing assignment in bch2_delete_dead_snapshots()Dan Carpenter
This code accidentally left out the "ret = " assignment so the errors from for_each_btree_key2() are not checked. Fixes: 53534482a250 ("bcachefs: for_each_btree_key2()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Heap allocate btree_transKent Overstreet
We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix W=12 build errorsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>