path: root/fs/bcachefs/debug.c
Age    Commit message    Author
2025-07-05  bcachefs: Fix bch2_btree_transactions_read() synchronization  (Kent Overstreet)
Since we're accessing btree_trans objects owned by another thread, we need to guard against using pointers to freed key cache entries: we need our own srcu read lock, and we should skip a btree_trans if it didn't hold the srcu lock (and thus it might have pointers to freed key cache entries).

00693 Mem abort info:
00693   ESR = 0x0000000096000005
00693   EC = 0x25: DABT (current EL), IL = 32 bits
00693   SET = 0, FnV = 0
00693   EA = 0, S1PTW = 0
00693   FSC = 0x05: level 1 translation fault
00693 Data abort info:
00693   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
00693   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
00693   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
00693 user pgtable: 4k pages, 39-bit VAs, pgdp=000000012e650000
00693 [000000008fb96218] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
00693 Internal error: Oops: 0000000096000005 [#1] SMP
00693 Modules linked in:
00693 CPU: 0 UID: 0 PID: 4307 Comm: cat Not tainted 6.16.0-rc2-ktest-g9e15af94fd86 #27578 NONE
00693 Hardware name: linux,dummy-virt (DT)
00693 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00693 pc : six_lock_counts+0x20/0xe8
00693 lr : bch2_btree_bkey_cached_common_to_text+0x38/0x130
00693 sp : ffffff80ca98bb60
00693 x29: ffffff80ca98bb60 x28: 000000008fb96200 x27: 0000000000000007
00693 x26: ffffff80eafd06b8 x25: 0000000000000000 x24: ffffffc080d75a60
00693 x23: ffffff80eafd0000 x22: ffffffc080bdfcc0 x21: ffffff80eafd0210
00693 x20: ffffff80c192ff08 x19: 000000008fb96200 x18: 00000000ffffffff
00693 x17: 0000000000000000 x16: 0000000000000000 x15: 00000000ffffffff
00693 x14: 0000000000000000 x13: ffffff80ceb5a29a x12: 20796220646c6568
00693 x11: 72205d3e303c5b20 x10: 0000000000000020 x9 : ffffffc0805fb6b0
00693 x8 : 0000000000000020 x7 : 0000000000000000 x6 : 0000000000000020
00693 x5 : ffffff80ceb5a29c x4 : 0000000000000001 x3 : 000000000000029c
00693 x2 : 0000000000000000 x1 : ffffff80ef66c000 x0 : 000000008fb96200
00693 Call trace:
00693  six_lock_counts+0x20/0xe8 (P)
00693  bch2_btree_bkey_cached_common_to_text+0x38/0x130
00693  bch2_btree_trans_to_text+0x260/0x2a8
00693  bch2_btree_transactions_read+0xac/0x1e8
00693  full_proxy_read+0x74/0xd8
00693  vfs_read+0x90/0x300
00693  ksys_read+0x6c/0x108
00693  __arm64_sys_read+0x20/0x30
00693  invoke_syscall.constprop.0+0x54/0xe8
00693  do_el0_svc+0x44/0xc8
00693  el0_svc+0x18/0x58
00693  el0t_64_sync_handler+0x104/0x130
00693  el0t_64_sync+0x154/0x158
00693 Code: 910003fd f9423c22 f90017e2 d2800002 (f9400c01)
00693 ---[ end trace 0000000000000000 ]---

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
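
A minimal sketch of the pattern described above, assuming the existing c->btree_trans_barrier SRCU and the trans->srcu_held flag; this is an illustration, not the actual patch:

    int srcu_idx = srcu_read_lock(&c->btree_trans_barrier);
    struct btree_trans *trans;

    list_for_each_entry(trans, &c->btree_trans_list, list) {
        /*
         * A trans that isn't holding the SRCU read lock may be pointing
         * at key cache entries that have since been freed - skip it:
         */
        if (!trans->srcu_held)
            continue;

        bch2_btree_trans_to_text(out, trans);
    }

    srcu_read_unlock(&c->btree_trans_barrier, srcu_idx);
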
2025-07-04  bcachefs: Tweak btree cache helpers for use by btree node scan  (Kent Overstreet)
Btree node scan must not use the btree node cache: that causes interference from prior failed reads and parallel workers. Instead we need to allocate btree nodes that don't live in the btree cache, so that we can call bch2_btree_node_read_done() directly.

This patch tweaks the low level helpers so they don't touch the btree cache lists.

Cc: Nikita Ofitserov <himikof@gmail.com>
Reviewed-by: Nikita Ofitserov <himikof@gmail.com>
Reported-and-tested-by: Edoardo Codeglia <bcachefs@404.blue>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-01  bcachefs: Replace rcu_read_lock() with guards  (Kent Overstreet)
The new guard() and scoped_guard() helpers allow for more natural code. Some of the uses with creative flow control have been left as-is.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
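
A small before/after illustration of the conversion, using the kernel's guard helpers; the pointer and helper names here are hypothetical, not code from debug.c:

    /* Before: explicit lock/unlock on every path */
    rcu_read_lock();
    obj = rcu_dereference(some_rcu_ptr);
    if (obj)
        use(obj);
    rcu_read_unlock();

    /*
     * After: the guard drops the RCU read lock automatically when the
     * scope exits, including on early returns:
     */
    scoped_guard(rcu) {
        obj = rcu_dereference(some_rcu_ptr);
        if (obj)
            use(obj);
    }

    /* guard(rcu)(); does the same for whole-function scope. */
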
2025-05-30  bcachefs: Include b->ob.nr in cached_btree_node_to_text()  (Kent Overstreet)
We have a bug report that looks like we might be leaking open buckets - let's check if they got left attached to the cached btree node.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: fix bch2_debugfs_flush_buf() when tabstops are in use  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: Single err message for btree node reads  (Kent Overstreet)
Like we just did with the data read path, emit a single error message per btree node read, nicely formatted, with all the actions we took grouped together.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: Async object debugging  (Kent Overstreet)
Debugging infrastructure for async objs: this lets us easily create fast_lists for various object types so they'll be visible in debugfs.

Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a pretty-printer wrapper in async_objs.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: bch_dev.io_ref -> enumerated_ref  (Kent Overstreet)
Convert device IO refs to enumerated_refs, for easier debugging of refcount issues.

Simple conversion: enumerate all users and convert to the new helpers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: Single device mode  (Kent Overstreet)
Single device filesystems are now identified by the block device name, not the UUID - and single device filesystems with the same UUID can be mounted simultaneously, without any special options.

This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which indicates whether a filesystem has ever been multi device.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21  bcachefs: trace bch2_trans_kmalloc()  (Kent Overstreet)
We're occasionally seeing the WARN_ON() for bump allocator usage exceeding BTREE_TRANS_MEM_MAX; add some tracing so we can see what's going on.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02  bcachefs: Split up bch_dev.io_ref  (Kent Overstreet)
We now have separate per device io_refs for read and write access.

This fixes a device removal bug where the discard workers were still running while we were removing alloc info for that device.

It's also a bit of hardening; we no longer allow writes to devices that are read-only.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: bch2_bkey_pick_read_device() can now specify a device  (Kent Overstreet)
To be used for scrub, where we want the read to come from a specific device.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-14  bcachefs: Move write_points to debugfs  (Kent Overstreet)
This was hitting the sysfs 4k limit.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-25  bcachefs: Improve journal pin flushing  (Kent Overstreet)
Running the preempt tiering tests with a lower than normal journal reclaim delay turned up a shutdown hang - a lost wakeup, caused because flushing a journal pin (e.g. key cache/write buffer) can generate a new journal pin.

The "simple" fix of adding the correct wakeup didn't work because of ordering issues; if we flush btree node pins too aggressively before other pins have completed, we end up spinning where each flush iteration generates new work.

So to fix this correctly:

- The list of flushed journal pins is now broken out by type, so that we can wait for key cache/write buffer pin flushing to complete before flushing dirty btree nodes.

- A new closure_waitlist is added for bch2_journal_flush_pins; this one is only used while holding or taking the journal lock, so it's pretty cheap to add rigorously correct wakeups to journal_pin_set() and journal_pin_drop().

Additionally, bch2_journal_seq_pins_to_text() is moved to journal_reclaim.c, where it belongs, along with a bit of other small renaming and refactoring.

Besides fixing the hang, the better ordering between key cache/write buffer flushing and btree node flushing should help with or fix the "unmount taking excessively long" issue a few users have been noticing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21  bcachefs: Avoid bch2_btree_id_str()  (Kent Overstreet)
Prefer bch2_btree_id_to_text() - it prints out the integer ID when unknown.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13  bcachefs: Convert for_each_btree_node() to lockrestart_do()  (Kent Overstreet)
for_each_btree_node() now works similarly to for_each_btree_key(): the loop body is passed as an argument to lockrestart_do().

This now calls trans_begin() on every loop iteration - which fixes an SRCU warning in backpointers fsck.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
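
For context, a simplified sketch of what lockrestart_do() does; the real macro in the btree iterator code has more to it, so treat this as an approximation:

    #define lockrestart_do(_trans, _do)                                     \
    ({                                                                      \
        int _ret;                                                           \
                                                                            \
        do {                                                                \
            bch2_trans_begin(_trans);                                       \
            _ret = (_do);                                                   \
        } while (bch2_err_matches(_ret, BCH_ERR_transaction_restart));      \
                                                                            \
        _ret;                                                               \
    })
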
2024-06-28  bcachefs: Fix loop restart in bch2_btree_transactions_read()  (Kent Overstreet)
Accidental infinite loop; also fix btree_deadlock_to_text().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-23  bcachefs: Fix btree_trans list ordering  (Kent Overstreet)
The debug code relies on btree_trans_list being ordered so that it can resume on subsequent calls or lock restarts.

However, it was using trans->locking_wait.task.pid, which is incorrect since btree_trans objects are cached and reused - typically by different tasks.

Fix this by switching to pointer order, and also sort them lazily when required - speeding up the btree_trans_get() fastpath.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
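
A minimal sketch of pointer-order sorting with list_sort(), as described above; illustrative only:

    #include <linux/list_sort.h>

    /*
     * Compare by address: the list_head lives at a fixed offset inside
     * struct btree_trans, so ordering the list_head pointers orders the
     * containing objects as well.
     */
    static int cmp_btree_trans(void *priv, const struct list_head *a,
                               const struct list_head *b)
    {
        return a < b ? -1 : a > b ? 1 : 0;
    }

    /* Sorted lazily, only when a debugfs reader needs a stable order: */
    list_sort(NULL, &c->btree_trans_list, cmp_btree_trans);
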
2024-06-23  bcachefs: Fix race between trans_put() and btree_transactions_read()  (Kent Overstreet)
debug.c was using closure_get() on a different thread's closure where we don't know if the object being refcounted is alive.

We keep btree_trans objects on a list so they can be printed by debug code, and because it is cost prohibitive to touch the btree_trans list every time we allocate and free btree_trans objects, cached objects are also on this list.

However, we do not want the debug code to see cached but not in use btree_trans objects - critically because the btree_paths array will have been freed (if it was reallocated).

closure_get() is also incorrect to use when that get may race with it hitting zero, i.e. we must already have a ref on the object or know the ref can't currently hit 0 for other reasons (as used in the cycle detector).

To fix this, use the previously introduced closure_get_not_zero(), closure_return_sync(), and closure_init_stack_release(); the debug code can now only take a ref on a trans object if it's alive and in use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
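
A small sketch of the resulting pattern, assuming the trans->ref closure; illustrative, not the exact patch:

    struct btree_trans *trans;

    list_for_each_entry(trans, &c->btree_trans_list, list) {
        /* Only take a ref if the trans is alive and in use: */
        if (!closure_get_not_zero(&trans->ref))
            continue;

        bch2_btree_trans_to_text(out, trans);
        closure_put(&trans->ref);
    }
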
2024-06-23  bcachefs: Make btree_deadlock_to_text() clearer  (Kent Overstreet)
btree_deadlock_to_text() searches the list of btree transactions to find a deadlock - once it finds one it's done; unlike the other *_read() functions, it isn't printing each object.

Factor out btree_deadlock_to_text() to make this clearer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-23  bcachefs: fix seqmutex_relock()  (Kent Overstreet)
We were grabbing the sequence number before unlock incremented it - fix this by moving the increment to seqmutex_lock() (so the seqmutex_relock() failure path skips the mutex_trylock()), and returning the sequence number from unlock(), to make the API simpler and safer.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
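
A rough reconstruction of the resulting seqmutex API from the description above; simplified, not the verbatim header:

    struct seqmutex {
        struct mutex    lock;
        u32             seq;
    };

    static inline void seqmutex_lock(struct seqmutex *lock)
    {
        mutex_lock(&lock->lock);
        lock->seq++;                    /* bump on lock, not unlock */
    }

    /* Returns the sequence number to pass to seqmutex_relock(): */
    static inline u32 seqmutex_unlock(struct seqmutex *lock)
    {
        u32 seq = lock->seq;

        mutex_unlock(&lock->lock);
        return seq;
    }

    static inline bool seqmutex_relock(struct seqmutex *lock, u32 seq)
    {
        /* Cheap failure path: skip the trylock if seq already changed */
        if (lock->seq != seq || !mutex_trylock(&lock->lock))
            return false;

        if (lock->seq != seq) {
            mutex_unlock(&lock->lock);
            return false;
        }

        return true;
    }
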
2024-05-09  bcachefs: bch2_dev_get_ioref() checks for device not present  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09  bcachefs: bch2_dev_get_ioref2(); debug.c  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: member helper cleanups  (Kent Overstreet)
Some renaming for better consistency:

  bch2_member_exists   -> bch2_member_alive
  bch2_dev_exists      -> bch2_member_exists
  bch2_dev_exists2     -> bch2_dev_exists
  bch_dev_locked       -> bch2_dev_locked
  bch_dev_bkey_exists  -> bch2_dev_bkey_exists

New helper: bch2_dev_safe

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08  bcachefs: iter/update/trigger/str_hash flag cleanup  (Kent Overstreet)
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later.

These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits; this cleans up and unifies that handling.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
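
An illustrative x-macro of the kind described, generating both the enum and a string table for a to_text()-style pretty-printer; the flag names are made up, not the actual bcachefs list:

    #define EXAMPLE_UPDATE_FLAGS()      \
        x(slots)                        \
        x(intent)                       \
        x(cached)

    enum example_update_flags {
    #define x(n)    EXAMPLE_FLAG_##n,
        EXAMPLE_UPDATE_FLAGS()
    #undef x
        EXAMPLE_FLAG_NR,
    };

    /* String table generated from the same list, for pretty-printing: */
    static const char * const example_update_flag_strs[] = {
    #define x(n)    #n,
        EXAMPLE_UPDATE_FLAGS()
    #undef x
        NULL
    };
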
2024-05-08  bcachefs: prt_printf() now respects \r\n\t  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-04  bcachefs: Move btree_updates to debugfs  (Kent Overstreet)
sysfs is limited to PAGE_SIZE, and when we're debugging strange deadlocks/priority inversions we need to see the full list.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-03  bcachefs: create debugfs dir for each btree  (Thomas Bertschinger)
This creates a subdirectory for each individual btree under the btrees/ debugfs directory.

Directory structure, before:

/sys/kernel/debug/bcachefs/$FS_ID/btrees/
├── alloc
├── alloc-bfloat-failed
├── alloc-formats
├── backpointers
├── backpointers-bfloat-failed
├── backpointers-formats
...

Directory structure, after:

/sys/kernel/debug/bcachefs/$FS_ID/btrees/
├── alloc
│   ├── bfloat-failed
│   ├── formats
│   └── keys
├── backpointers
│   ├── bfloat-failed
│   ├── formats
│   └── keys
...

Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
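
A minimal sketch of how such a layout is typically built with the debugfs API; the fops names and the loop are illustrative, not the actual patch:

    #include <linux/debugfs.h>

    static void bch2_fs_debug_btrees_init(struct bch_fs *c, struct dentry *btrees_dir)
    {
        for (unsigned i = 0; i < BTREE_ID_NR; i++) {
            /* One subdirectory per btree: btrees/<name>/ */
            struct dentry *d = debugfs_create_dir(bch2_btree_id_str(i), btrees_dir);

            /* Hypothetical file_operations for each per-btree view: */
            debugfs_create_file("keys",          0400, d, c, &btree_keys_fops);
            debugfs_create_file("formats",       0400, d, c, &btree_formats_fops);
            debugfs_create_file("bfloat-failed", 0400, d, c, &bfloat_failed_fops);
        }
    }
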
2024-03-18  bcachefs: Improve bch2_fatal_error()  (Kent Overstreet)
Error messages should always include __func__.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13  bcachefs: kill kvpmalloc()  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-22  bcachefs: Add gfp flags param to bch2_prt_task_backtrace()  (Kent Overstreet)
Fixes: e6a2566f7a00 ("bcachefs: Better journal tracepoints")
Reported-by: smatch
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21  bcachefs: Prep work for variable size btree node buffers  (Kent Overstreet)
bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we have significant memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analogous to XFS's AGIs.

Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
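
A small illustration of the substitution described, assuming a btree_buf_bytes() helper that returns the size of a specific node's buffer; not the actual diff:

    /* Before: assumes every node buffer is the filesystem-wide maximum */
    bytes = c->opts.btree_node_size;

    /* After: use the size of this node's buffer, which may be smaller */
    bytes = btree_buf_bytes(b);
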
2024-01-05  bcachefs: Improve would_deadlock trace event  (Kent Overstreet)
We now include backtraces for every thread involved in the cycle.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05  bcachefs: kill useless return ret  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05  bcachefs: track transaction durations  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS  (Kent Overstreet)
- Some tweaks to greatly reduce locking overhead for the list of btree transactions, so that it can always be enabled: leave btree_trans objects on the list when they're on the percpu single item freelist, and only check for duplicates in the same process when CONFIG_BCACHEFS_DEBUG is enabled

- don't zero out the full btree_trans() unless we allocated it from the mempool

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: btree_iter -> btree_path_idx_t  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: for_each_btree_key() now declares loop iter  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01  bcachefs: Fix bch2_read_btree()  (Kent Overstreet)
In the debugfs code, we had an incorrect use of drop_locks_do(); on transaction restart we don't want to restart the current loop iteration, since we've already emitted the current key to the buffer for userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
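
For context, roughly what drop_locks_do() expands to (a simplification): the relock can return a transaction-restart error, which is why using it here re-ran the loop iteration and emitted the current key twice:

    #define drop_locks_do(_trans, _do)              \
    ({                                              \
        bch2_trans_unlock(_trans);                  \
        _do ?: bch2_trans_relock(_trans);           \
    })
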
2023-10-31  bcachefs: bch2_btree_id_str()  (Kent Overstreet)
Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix copy_to_user() usage in flush_buf()  (Kent Overstreet)
copy_to_user() returns the number of bytes it couldn't copy - not an errcode.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
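
For reference, the conventional patterns (hypothetical buffer names, not the actual fix): copy_to_user() returns the number of bytes left uncopied, so the caller either fails with -EFAULT or reports a short count:

    /* All-or-nothing: */
    if (copy_to_user(ubuf, buf, bytes))
        return -EFAULT;

    /* read(2)-style, turning a partial copy into a short count: */
    unsigned long not_copied = copy_to_user(ubuf, buf, bytes);
    size_t copied = bytes - not_copied;

    if (!copied && not_copied)
        return -EFAULT;
    return copied;
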
2023-10-22  bcachefs: Heap allocate btree_trans  (Kent Overstreet)
We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate.

But we have to do a heap allocation to initialize it anyway, so there's no real downside to heap allocating the entire thing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix W=12 build errors  (Kent Overstreet)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Break up io.c  (Kent Overstreet)
More reorganization: this splits up io.c into

- io_read.c
- io_misc.c - fallocate, fpunch, truncate
- io_write.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix more lockdep splats in debug.c  (Kent Overstreet)
Similar to previous fixes, we can't incur page faults while holding btree locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: seqmutex; fix a lockdep splat  (Kent Overstreet)
We can't be holding btree_trans_lock while copying to user space, which might incur a page fault. To fix this, convert it to a seqmutex so we can unlock/relock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: GFP_NOIO -> GFP_NOFS  (Kent Overstreet)
GFP_NOIO dates from the bcache days, when we operated under the block layer. Now, GFP_NOFS is more appropriate, so switch all GFP_NOIO uses to GFP_NOFS.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: bch2_btree_node_ondisk_to_text()  (Kent Overstreet)
Pulling out a helper from cmd_list.c, as the rest is being rewritten in Rust but we're not ready to rewrite lower-level btree code in Rust.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Drop some anonymous structs, unions  (Kent Overstreet)
Rust bindgen doesn't cope well with anonymous structs and unions. This patch drops the fancy anonymous structs & unions in bkey_i that let us use the same helpers for bkey_i and bkey_packed; since bkey_packed is an internal type that's never exposed to outside code, it's only a minor inconvenience.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>