summaryrefslogtreecommitdiff
path: root/kernel/locking
AgeCommit message (Collapse)Author
2022-08-21locking/lockdep: Fix lockdep_init_map_*() confusionPeter Zijlstra
[ Upstream commit eae6d58d67d9739be5f7ae2dbead1d0ef6528243 ] Commit dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t") added yet another lockdep_init_map_*() variant, but forgot to update all the existing users of the most complicated version. This could lead to a loss of lock_type and hence an incorrect report. Given the relative rarity of both local_lock and these annotations, this is unlikely to happen in practise, still, best fix things. Fixes: dfd5e3f5fe27 ("locking/lockdep: Mark local_lock_t") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/YqyEDtoan20K0CVD@worktop.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21lockdep: Allow tuning tracing capacity constants.Tetsuo Handa
commit 5dc33592e95534dc8455ce3e9baaaf3dae0fff82 upstream. Since syzkaller continues various test cases until the kernel crashes, syzkaller tends to examine more locking dependencies than normal systems. As a result, syzbot is reporting that the fuzz testing was terminated due to hitting upper limits lockdep can track [1] [2] [3]. Since analysis via /proc/lockdep* did not show any obvious culprit [4] [5], we have no choice but allow tuning tracing capacity constants. [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f [4] https://lkml.kernel.org/r/4b8f7a57-fa20-47bd-48a0-ae35d860f233@i-love.sakura.ne.jp [5] https://lkml.kernel.org/r/1c351187-253b-2d49-acaf-4563c63ae7d2@i-love.sakura.ne.jp References: https://lkml.kernel.org/r/1595640639-9310-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08locking/lockdep: Iterate lock_classes directly when reading lockdep filesWaiman Long
[ Upstream commit fb7275acd6fb988313dddd8d3d19efa70d9015ad ] When dumping lock_classes information via /proc/lockdep, we can't take the lockdep lock as the lock hold time is indeterminate. Iterating over all_lock_classes without holding lock can be dangerous as there is a slight chance that it may branch off to other lists leading to infinite loop or even access invalid memory if changes are made to all_lock_classes list in parallel. To avoid this problem, iteration of lock classes is now done directly on the lock_classes array itself. The lock_classes_in_use bitmap is checked to see if the lock class is being used. To avoid iterating the full array all the times, a new max_lock_class_idx value is added to track the maximum lock_class index that is currently being used. We can theoretically take the lockdep lock for iterating all_lock_classes when other lockdep files (lockdep_stats and lock_stat) are accessed as the lock hold time will be shorter for them. For consistency, they are also modified to iterate the lock_classes array directly. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220211035526.1329503-2-longman@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08locking/lockdep: Avoid potential access of invalid memory in lock_classWaiman Long
commit 61cc4534b6550997c97a03759ab46b29d44c0017 upstream. It was found that reading /proc/lockdep after a lockdep splat may potentially cause an access to freed memory if lockdep_unregister_key() is called after the splat but before access to /proc/lockdep [1]. This is due to the fact that graph_lock() call in lockdep_unregister_key() fails after the clearing of debug_locks by the splat process. After lockdep_unregister_key() is called, the lock_name may be freed but the corresponding lock_class structure still have a reference to it. That invalid memory pointer will then be accessed when /proc/lockdep is read by a user and a use-after-free (UAF) error will be reported if KASAN is enabled. To fix this problem, lockdep_unregister_key() is now modified to always search for a matching key irrespective of the debug_locks state and zap the corresponding lock class if a matching one is found. [1] https://lore.kernel.org/lkml/77f05c15-81b6-bddd-9650-80d5f23fe330@i-love.sakura.ne.jp/ Fixes: 8b39adbee805 ("locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' again") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Cc: Cheng-Jui Wang <cheng-jui.wang@mediatek.com> Link: https://lkml.kernel.org/r/20220103023558.1377055-1-longman@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-23lockdep: Correct lock_classes index mappingCheng Jui Wang
commit 28df029d53a2fd80c1b8674d47895648ad26dcfb upstream. A kernel exception was hit when trying to dump /proc/lockdep_chains after lockdep report "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!": Unable to handle kernel paging request at virtual address 00054005450e05c3 ... 00054005450e05c3] address between user and kernel address ranges ... pc : [0xffffffece769b3a8] string+0x50/0x10c lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c ... Call trace: string+0x50/0x10c vsnprintf+0x468/0x69c seq_printf+0x8c/0xd8 print_name+0x64/0xf4 lc_show+0xb8/0x128 seq_read_iter+0x3cc/0x5fc proc_reg_read_iter+0xdc/0x1d4 The cause of the problem is the function lock_chain_get_class() will shift lock_classes index by 1, but the index don't need to be shifted anymore since commit 01bb6f0af992 ("locking/lockdep: Change the range of class_idx in held_lock struct") already change the index to start from 0. The lock_classes[-1] located at chain_hlocks array. When printing lock_classes[-1] after the chain_hlocks entries are modified, the exception happened. The output of lockdep_chains are incorrect due to this problem too. Fixes: f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey") Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20220210105011.21712-1-cheng-jui.wang@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-01kernel: delete repeated words in commentsRandy Dunlap
[ Upstream commit c034f48e99907d5be147ac8f0f3e630a9307c2be ] Drop repeated words in kernel/events/. {if, the, that, with, time} Drop repeated words in kernel/locking/. {it, no, the} Drop repeated words in kernel/sched/. {in, not} Link: https://lkml.kernel.org/r/20210127023412.26292-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Will Deacon <will@kernel.org> [kernel/locking/] Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18lockdep: Let lock_is_held_type() detect recursive read as readSebastian Andrzej Siewior
[ Upstream commit 2507003a1d10917c9158077bf6030719d02c941e ] lock_is_held_type(, 1) detects acquired read locks. It only recognized locks acquired with lock_acquire_shared(). Read locks acquired with lock_acquire_shared_recursive() are not recognized because a `2' is stored as the read value. Rework the check to additionally recognise lock's read value one and two as a read held lock. Fixes: e918188611f07 ("locking: More accurate annotations for read_lock()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/20210903084001.lblecrvz4esl4mrr@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18locking/lockdep: Avoid RCU-induced noinstr failPeter Zijlstra
[ Upstream commit ce0b9c805dd66d5e49fd53ec5415ae398f4c56e6 ] vmlinux.o: warning: objtool: look_up_lock_class()+0xc7: call to rcu_read_lock_any_held() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210624095148.311980536@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15locking/lockdep: Mark local_lock_tPeter Zijlstra
[ Upstream commit dfd5e3f5fe27bda91d5cc028c86ffbb7a0614489 ] The local_lock_t's are special, because they cannot form IRQ inversions, make sure we can tell them apart from the rest of the locks. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15locking/mutex: Fix HANDOFF conditionPeter Zijlstra
[ Upstream commit 048661a1f963e9517630f080687d48af79ed784c ] Yanfei reported that setting HANDOFF should not depend on recomputing @first, only on @first state. Which would then give: if (ww_ctx || !first) first = __mutex_waiter_is_first(lock, &waiter); if (first) __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF); But because 'ww_ctx || !first' is basically 'always' and the test for first is relatively cheap, omit that first branch entirely. Reported-by: Yanfei Xu <yanfei.xu@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Waiman Long <longman@redhat.com> Reviewed-by: Yanfei Xu <yanfei.xu@windriver.com> Link: https://lore.kernel.org/r/20210630154114.896786297@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14lockdep: Fix wait-type for empty stackPeter Zijlstra
[ Upstream commit f8b298cc39f0619544c607eaef09fd0b2afd10f3 ] Even the very first lock can violate the wait-context check, consider the various IRQ contexts. Fixes: de8f5e4f2dc1 ("lockdep: Introduce wait-type checks") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20210617190313.256987481@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()Boqun Feng
[ Upstream commit 7b1f8c6179769af6ffa055e1169610b51d71edd5 ] In the step #3 of check_irq_usage(), we seach backwards to find a lock whose usage conflicts the usage of @target_entry1 on safe/unsafe. However, we should only keep the irq-unsafe usage of @target_entry1 into consideration, because it could be a case where a lock is hardirq-unsafe but soft-safe, and in check_irq_usage() we find it because its hardirq-unsafe could result into a hardirq-safe-unsafe deadlock, but currently since we don't filter out the other usage bits, so we may find a lock dependency path softirq-unsafe -> softirq-safe, which in fact doesn't cause a deadlock. And this may cause misleading lockdep splats. Fix this by only keeping LOCKF_ENABLED_IRQ_ALL bits when we try the backwards search. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-4-boqun.feng@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14locking/lockdep: Fix the dep path printing for backwards BFSBoqun Feng
[ Upstream commit 69c7a5fb2482636f525f016c8333fdb9111ecb9d ] We use the same code to print backwards lock dependency path as the forwards lock dependency path, and this could result into incorrect printing because for a backwards lock_list ->trace is not the call trace where the lock of ->class is acquired. Fix this by introducing a separate function on printing the backwards dependency path. Also add a few comments about the printing while we are at it. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210618170110.3699115-2-boqun.feng@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30locking/lockdep: Improve noinstr vs errorsPeter Zijlstra
[ Upstream commit 49faa77759b211fff344898edc23bb780707fff5 ] Better handle the failure paths. vmlinux.o: warning: objtool: debug_locks_off()+0x23: call to console_verbose() leaves .noinstr.text section vmlinux.o: warning: objtool: debug_locks_off()+0x19: call to __kasan_check_write() leaves .noinstr.text section debug_locks_off+0x19/0x40: instrument_atomic_write at include/linux/instrumented.h:86 (inlined by) __debug_locks_off at include/linux/debug_locks.h:17 (inlined by) debug_locks_off at lib/debug_locks.c:41 Fixes: 6eebad1ad303 ("lockdep: __always_inline more for noinstr") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.784404944@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to signalZqiang
[ Upstream commit 3a010c493271f04578b133de977e0e5dd2848cea ] When a interruptible mutex locker is interrupted by a signal without acquiring this lock and removed from the wait queue. if the mutex isn't contended enough to have a waiter put into the wait queue again, the setting of the WAITER bit will force mutex locker to go into the slowpath to acquire the lock every time, so if the wait queue is empty, the WAITER bit need to be clear. Fixes: 040a0a371005 ("mutex: Add support for wound/wait style locks") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210517034005.30828-1-qiang.zhang@windriver.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-26locking/lockdep: Correct calling tracepointsLeo Yan
[ Upstream commit 89e70d5c583c55088faa2201d397ee30a15704aa ] The commit eb1f00237aca ("lockdep,trace: Expose tracepoints") reverses tracepoints for lock_contended() and lock_acquired(), thus the ftrace log shows the wrong locking sequence that "acquired" event is prior to "contended" event: <idle>-0 [001] d.s3 20803.501685: lock_acquire: 0000000008b91ab4 &sg_policy->update_lock <idle>-0 [001] d.s3 20803.501686: lock_acquired: 0000000008b91ab4 &sg_policy->update_lock <idle>-0 [001] d.s3 20803.501689: lock_contended: 0000000008b91ab4 &sg_policy->update_lock <idle>-0 [001] d.s3 20803.501690: lock_release: 0000000008b91ab4 &sg_policy->update_lock This patch fixes calling tracepoints for lock_contended() and lock_acquired(). Fixes: eb1f00237aca ("lockdep,trace: Expose tracepoints") Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210512120937.90211-1-leo.yan@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-28locking/qrwlock: Fix ordering in queued_write_lock_slowpath()Ali Saidi
[ Upstream commit 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896 ] While this code is executed with the wait_lock held, a reader can acquire the lock without holding wait_lock. The writer side loops checking the value with the atomic_cond_read_acquire(), but only truly acquires the lock when the compare-and-exchange is completed successfully which isn’t ordered. This exposes the window between the acquire and the cmpxchg to an A-B-A problem which allows reads following the lock acquisition to observe values speculatively before the write lock is truly acquired. We've seen a problem in epoll where the reader does a xchg while holding the read lock, but the writer can see a value change out from under it. Writer | Reader -------------------------------------------------------------------------------- ep_scan_ready_list() | |- write_lock_irq() | |- queued_write_lock_slowpath() | |- atomic_cond_read_acquire() | | read_lock_irqsave(&ep->lock, flags); --> (observes value before unlock) | chain_epi_lockless() | | epi->next = xchg(&ep->ovflist, epi); | | read_unlock_irqrestore(&ep->lock, flags); | | | atomic_cmpxchg_relaxed() | |-- READ_ONCE(ep->ovflist); | A core can order the read of the ovflist ahead of the atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire semantics addresses this issue at which point the atomic_cond_read can be switched to use relaxed semantics. Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock") Signed-off-by: Ali Saidi <alisaidi@amazon.com> [peterz: use try_cmpxchg()] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steve Capper <steve.capper@arm.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Tested-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-21lockdep: Add a missing initialization hint to the "INFO: Trying to register ↵Tetsuo Handa
non-static key" message [ Upstream commit 3a85969e9d912d5dd85362ee37b5f81266e00e77 ] Since this message is printed when dynamically allocated spinlocks (e.g. kzalloc()) are used without initialization (e.g. spin_lock_init()), suggest to developers to check whether initialization functions for objects were called, before making developers wonder what annotation is missing. [ mingo: Minor tweaks to the message. ] Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210321064913.4619-1-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14lockdep: Address clang -Wformat warning printing for %hdArnd Bergmann
commit 6d48b7912cc72275dc7c59ff961c8bac7ef66a92 upstream. Clang doesn't like format strings that truncate a 32-bit value to something shorter: kernel/locking/lockdep.c:709:4: error: format specifies type 'short' but the argument has type 'int' [-Werror,-Wformat] In this case, the warning is a slightly questionable, as it could realize that both class->wait_type_outer and class->wait_type_inner are in fact 8-bit struct members, even though the result of the ?: operator becomes an 'int'. However, there is really no point in printing the number as a 16-bit 'short' rather than either an 8-bit or 32-bit number, so just change it to a normal %d. Fixes: de8f5e4f2dc1 ("lockdep: Introduce wait-type checks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210322115531.3987555-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handlingWaiman Long
[ Upstream commit 5de2055d31ea88fd9ae9709ac95c372a505a60fa ] The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx) combination is repetitive. In fact, ww_ctx should not be used at all if !use_ww_ctx. Simplify ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() an clear ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx && ww_ctx) by just (ww_ctx). Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Link: https://lore.kernel.org/r/20210316153119.13802-2-longman@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04locking/lockdep: Avoid unmatched unlockPeter Zijlstra
[ Upstream commit 7f82e631d236cafd28518b998c6d4d8dc2ef68f6 ] Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions") overlooked that print_usage_bug() releases the graph_lock and called it without the graph lock held. Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/YBfkuyIfB1+VRxXP@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-07locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEPPeter Zijlstra
[ Upstream commit 77ca93a6b1223e210e58e1000c09d8d420403c94 ] vmlinux.o: warning: objtool: lock_is_held_type()+0x60: call to check_flags.part.0() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210106144017.652218215@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-30rtmutex: Remove unused argument from rt_mutex_proxy_unlock()Thomas Gleixner
commit 2156ac1934166d6deb6cd0f6ffc4c1076ec63697 upstream Nothing uses the argument. Remove it as preparation to use pi_state_update_owner(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-27locking/lockdep: Cure noinstr failPeter Zijlstra
commit 0afda3a888dccf12557b41ef42eee942327d122b upstream. When the compiler doesn't feel like inlining, it causes a noinstr fail: vmlinux.o: warning: objtool: lock_is_held_type()+0xb: call to lockdep_enabled() leaves .noinstr.text section Fixes: 4d004099a668 ("lockdep: Fix lockdep recursion") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210106144017.592595176@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-09rwsem: Implement down_read_interruptibleEric W. Biederman
[ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ] In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_interruptible. This is needed for perf_event_open to be converted (with no semantic changes) from working on a mutex to wroking on a rwsem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-09rwsem: Implement down_read_killable_nestedEric W. Biederman
[ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ] In preparation for converting exec_update_mutex to a rwsem so that multiple readers can execute in parallel and not deadlock, add down_read_killable_nested. This is needed so that kcmp_lock can be converted from working on a mutexes to working on rw_semaphores. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-17lockdep: Put graph lock/unlock under lock_recursion protectionBoqun Feng
A warning was hit when running xfstests/generic/068 in a Hyper-V guest: [...] ------------[ cut here ]------------ [...] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) [...] WARNING: CPU: 2 PID: 1350 at kernel/locking/lockdep.c:5280 check_flags.part.0+0x165/0x170 [...] ... [...] Workqueue: events pwq_unbound_release_workfn [...] RIP: 0010:check_flags.part.0+0x165/0x170 [...] ... [...] Call Trace: [...] lock_is_held_type+0x72/0x150 [...] ? lock_acquire+0x16e/0x4a0 [...] rcu_read_lock_sched_held+0x3f/0x80 [...] __send_ipi_one+0x14d/0x1b0 [...] hv_send_ipi+0x12/0x30 [...] __pv_queued_spin_unlock_slowpath+0xd1/0x110 [...] __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20 [...] .slowpath+0x9/0xe [...] lockdep_unregister_key+0x128/0x180 [...] pwq_unbound_release_workfn+0xbb/0xf0 [...] process_one_work+0x227/0x5c0 [...] worker_thread+0x55/0x3c0 [...] ? process_one_work+0x5c0/0x5c0 [...] kthread+0x153/0x170 [...] ? __kthread_bind_mask+0x60/0x60 [...] ret_from_fork+0x1f/0x30 The cause of the problem is we have call chain lockdep_unregister_key() -> <irq disabled by raw_local_irq_save()> lockdep_unlock() -> arch_spin_unlock() -> __pv_queued_spin_unlock_slowpath() -> pv_kick() -> __send_ipi_one() -> trace_hyperv_send_ipi_one(). Although this particular warning is triggered because Hyper-V has a trace point in ipi sending, but in general arch_spin_unlock() may call another function having a trace point in it, so put the arch_spin_lock() and arch_spin_unlock() after lock_recursion protection to fix this problem and avoid similiar problems. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201113110512.1056501-1-boqun.feng@gmail.com
2020-11-10lockdep: Avoid to modify chain keys in validate_chain()Boqun Feng
Chris Wilson reported a problem spotted by check_chain_key(): a chain key got changed in validate_chain() because we modify the ->read in validate_chain() to skip checks for dependency adding, and ->read is taken into calculation for chain key since commit f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey"). Fix this by avoiding to modify ->read in validate_chain() based on two facts: a) since we now support recursive read lock detection, there is no need to skip checks for dependency adding for recursive readers, b) since we have a), there is only one case left (nest_lock) where we want to skip checks in validate_chain(), we simply remove the modification for ->read and rely on the return value of check_deadlock() to skip the dependency adding. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201102053743.450459-1-boqun.feng@gmail.com
2020-11-01Merge tag 'locking-urgent-2020-11-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A couple of locking fixes: - Fix incorrect failure injection handling in the fuxtex code - Prevent a preemption warning in lockdep when tracking local_irq_enable() and interrupts are already enabled - Remove more raw_cpu_read() usage from lockdep which causes state corruption on !X86 architectures. - Make the nr_unused_locks accounting in lockdep correct again" * tag 'locking-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Fix nr_unused_locks accounting locking/lockdep: Remove more raw_cpu_read() usage futex: Fix incorrect should_fail_futex() handling lockdep: Fix preemption WARN for spurious IRQ-enable
2020-10-30lockdep: Fix nr_unused_locks accountingPeter Zijlstra
Chris reported that commit 24d5a3bffef1 ("lockdep: Fix usage_traceoverflow") breaks the nr_unused_locks validation code triggered by /proc/lockdep_stats. By fully splitting LOCK_USED and LOCK_USED_READ it becomes a bad indicator for accounting nr_unused_locks; simplyfy by using any first bit. Fixes: 24d5a3bffef1 ("lockdep: Fix usage_traceoverflow") Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://lkml.kernel.org/r/20201027124834.GL2628@hirez.programming.kicks-ass.net
2020-10-30locking/lockdep: Remove more raw_cpu_read() usagePeter Zijlstra
I initially thought raw_cpu_read() was OK, since if it is !0 we have IRQs disabled and can't get migrated, so if we get migrated both CPUs must have 0 and it doesn't matter which 0 we read. And while that is true; it isn't the whole store, on pretty much all architectures (except x86) this can result in computing the address for one CPU, getting migrated, the old CPU continuing execution with another task (possibly setting recursion) and then the new CPU reading the value of the old CPU, which is no longer 0. Similer to: baffd723e44d ("lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables"") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201026152256.GB2651@hirez.programming.kicks-ass.net
2020-10-22lockdep: Fix preemption WARN for spurious IRQ-enablePeter Zijlstra
It is valid (albeit uncommon) to call local_irq_enable() without first having called local_irq_disable(). In this case we enter lockdep_hardirqs_on*() with IRQs enabled and trip a preemption warning for using __this_cpu_read(). Use this_cpu_read() instead to avoid the warning. Fixes: 4d004099a6 ("lockdep: Fix lockdep recursion") Reported-by: syzbot+53f8ce8bbc07924b6417@syzkaller.appspotmail.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-10-18Merge tag 'core-rcu-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: - Debugging for smp_call_function() - RT raw/non-raw lock ordering fixes - Strict grace periods for KASAN - New smp_call_function() torture test - Torture-test updates - Documentation updates - Miscellaneous fixes [ This doesn't actually pull the tag - I've dropped the last merge from the RCU branch due to questions about the series. - Linus ] * tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits) smp: Make symbol 'csd_bug_count' static kernel/smp: Provide CSD lock timeout diagnostics smp: Add source and destination CPUs to __call_single_data rcu: Shrink each possible cpu krcp rcu/segcblist: Prevent useless GP start if no CBs to accelerate torture: Add gdb support rcutorture: Allow pointer leaks to test diagnostic code rcutorture: Hoist OOM registry up one level refperf: Avoid null pointer dereference when buf fails to allocate rcutorture: Properly synchronize with OOM notifier rcutorture: Properly set rcu_fwds for OOM handling torture: Add kvm.sh --help and update help message rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05 torture: Update initrd documentation rcutorture: Replace HTTP links with HTTPS ones locktorture: Make function torture_percpu_rwsem_init() static torture: document --allcpus argument added to the kvm.sh script rcutorture: Output number of elapsed grace periods rcutorture: Remove KCSAN stubs rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp() ...
2020-10-09Merge branch 'locking/urgent' into locking/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-09lockdep: Fix lockdep recursionPeter Zijlstra
Steve reported that lockdep_assert*irq*(), when nested inside lockdep itself, will trigger a false-positive. One example is the stack-trace code, as called from inside lockdep, triggering tracing, which in turn calls RCU, which then uses lockdep_assert_irqs_disabled(). Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables") Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-09lockdep: Fix usage_traceoverflowPeter Zijlstra
Basically print_lock_class_header()'s for loop is out of sync with the the size of of ->usage_traces[]. Also clean things up a bit while at it, to avoid such mishaps in the future. Fixes: 23870f122768 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions") Reported-by: Qian Cai <cai@redhat.com> Debugged-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Qian Cai <cai@redhat.com> Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net
2020-10-09Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull v5.10 RCU changes from Paul E. McKenney: - Debugging for smp_call_function(). - Strict grace periods for KASAN. The point of this series is to find RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is further disabled by dfefault. Finally, the help text includes a goodly list of scary caveats. - New smp_call_function() torture test. - Torture-test updates. - Documentation updates. - Miscellaneous fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-09-29lockdep: Optimize the memory usage of circular queueBoqun Feng
Qian Cai reported a BFS_EQUEUEFULL warning [1] after read recursive deadlock detection merged into tip tree recently. Unlike the previous lockep graph searching, which iterate every lock class (every node in the graph) exactly once, the graph searching for read recurisve deadlock detection needs to iterate every lock dependency (every edge in the graph) once, as a result, the maximum memory cost of the circular queue changes from O(V), where V is the number of lock classes (nodes or vertices) in the graph, to O(E), where E is the number of lock dependencies (edges), because every lock class or dependency gets enqueued once in the BFS. Therefore we hit the BFS_EQUEUEFULL case. However, actually we don't need to enqueue all dependencies for the BFS, because every time we enqueue a dependency, we almostly enqueue all other dependencies in the same dependency list ("almostly" is because we currently check before enqueue, so if a dependency doesn't pass the check stage we won't enqueue it, however, we can always do in reverse ordering), based on this, we can only enqueue the first dependency from a dependency list and every time we want to fetch a new dependency to work, we can either: 1) fetch the dependency next to the current dependency in the dependency list or 2) if the dependency in 1) doesn't exist, fetch the dependency from the queue. With this approach, the "max bfs queue depth" for a x86_64_defconfig + lockdep and selftest config kernel can get descreased from: max bfs queue depth: 201 to (after apply this patch) max bfs queue depth: 61 While I'm at it, clean up the code logic a little (e.g. directly return other than set a "ret" value and goto the "exit" label). [1]: https://lore.kernel.org/lkml/17343f6f7f2438fc376125384133c5ba70c2a681.camel@redhat.com/ Reported-by: Qian Cai <cai@redhat.com> Reported-by: syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200917080210.108095-1-boqun.feng@gmail.com
2020-09-16locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_countHou Tao
The __this_cpu*() accessors are (in general) IRQ-unsafe which, given that percpu-rwsem is a blocking primitive, should be just fine. However, file_end_write() is used from IRQ context and will cause load-store issues on architectures where the per-cpu accessors are not natively irq-safe. Fix it by using the IRQ-safe this_cpu_*() for operations on read_count. This will generate more expensive code on a number of platforms, which might cause a performance regression for some of the other percpu-rwsem users. If any such is reported, we can consider alternative solutions. Fixes: 70fe2f48152e ("aio: fix freeze protection of aio writes") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com
2020-09-03locking/lockdep: Fix "USED" <- "IN-NMI" inversionspeterz@infradead.org
During the LPC RCU BoF Paul asked how come the "USED" <- "IN-NMI" detector doesn't trip over rcu_read_lock()'s lockdep annotation. Looking into this I found a very embarrasing typo in verify_lock_unused(): - if (!(class->usage_mask & LOCK_USED)) + if (!(class->usage_mask & LOCKF_USED)) fixing that will indeed cause rcu_read_lock() to insta-splat :/ The above typo means that instead of testing for: 0x100 (1 << LOCK_USED), we test for 8 (LOCK_USED), which corresponds to (1 << LOCK_ENABLED_HARDIRQ). So instead of testing for _any_ used lock, it will only match any lock used with interrupts enabled. The rcu_read_lock() annotation uses .check=0, which means it will not set any of the interrupt bits and will thus never match. In order to properly fix the situation and allow rcu_read_lock() to correctly work, split LOCK_USED into LOCK_USED and LOCK_USED_READ and by having .read users set USED_READ and test USED, pure read-recursive locks are permitted. Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20200902160323.GK1362448@hirez.programming.kicks-ass.net
2020-08-26lockdep: Take read/write status in consideration when generate chainkeyBoqun Feng
Currently, the chainkey of a lock chain is a hash sum of the class_idx of all the held locks, the read/write status are not taken in to consideration while generating the chainkey. This could result into a problem, if we have: P1() { read_lock(B); lock(A); } P2() { lock(A); read_lock(B); } P3() { lock(A); write_lock(B); } , and P1(), P2(), P3() run one by one. And when running P2(), lockdep detects such a lock chain A -> B is not a deadlock, then it's added in the chain cache, and then when running P3(), even if it's a deadlock, we could miss it because of the hit of chain cache. This could be confirmed by self testcase "chain cached mixed R-L/L-W ". To resolve this, we use concept "hlock_id" to generate the chainkey, the hlock_id is a tuple (hlock->class_idx, hlock->read), which fits in a u16 type. With this, the chainkeys are different is the lock sequences have the same locks but different read/write status. Besides, since we use "hlock_id" to generate chainkeys, the chain_hlocks array now store the "hlock_id"s rather than lock_class indexes. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-15-boqun.feng@gmail.com
2020-08-26lockdep: Add recursive read locks into dependency graphBoqun Feng
Since we have all the fundamental to handle recursive read locks, we now add them into the dependency graph. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-13-boqun.feng@gmail.com
2020-08-26lockdep: Fix recursive read lock related safe->unsafe detectionBoqun Feng
Currently, in safe->unsafe detection, lockdep misses the fact that a LOCK_ENABLED_IRQ_*_READ usage and a LOCK_USED_IN_IRQ_*_READ usage may cause deadlock too, for example: P1 P2 <irq disabled> write_lock(l1); <irq enabled> read_lock(l2); write_lock(l2); <in irq> read_lock(l1); Actually, all of the following cases may cause deadlocks: LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_* LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_* LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*_READ LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*_READ To fix this, we need to 1) change the calculation of exclusive_mask() so that READ bits are not dropped and 2) always call usage() in mark_lock_irq() to check usage deadlocks, even when the new usage of the lock is READ. Besides, adjust usage_match() and usage_acculumate() to recursive read lock changes. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-12-boqun.feng@gmail.com
2020-08-26lockdep: Adjust check_redundant() for recursive read changeBoqun Feng
check_redundant() will report redundancy if it finds a path could replace the about-to-add dependency in the BFS search. With recursive read lock changes, we certainly need to change the match function for the check_redundant(), because the path needs to match not only the lock class but also the dependency kinds. For example, if the about-to-add dependency @prev -> @next is A -(SN)-> B, and we find a path A -(S*)-> .. -(*R)->B in the dependency graph with __bfs() (for simplicity, we can also say we find an -(SR)-> path from A to B), we can not replace the dependency with that path in the BFS search. Because the -(SN)-> dependency can make a strong path with a following -(S*)-> dependency, however an -(SR)-> path cannot. Further, we can replace an -(SN)-> dependency with a -(EN)-> path, that means if we find a path which is stronger than or equal to the about-to-add dependency, we can report the redundancy. By "stronger", it means both the start and the end of the path are not weaker than the start and the end of the dependency (E is "stronger" than S and N is "stronger" than R), so that we can replace the dependency with that path. To make sure we find a path whose start point is not weaker than the about-to-add dependency, we use a trick: the ->only_xr of the root (start point) of __bfs() is initialized as @prev-> == 0, therefore if @prev is E, __bfs() will pick only -(E*)-> for the first dependency, otherwise, __bfs() can pick -(E*)-> or -(S*)-> for the first dependency. To make sure we find a path whose end point is not weaker than the about-to-add dependency, we replace the match function for __bfs() check_redundant(), we check for the case that either @next is R (anything is not weaker than it) or the end point of the path is N (which is not weaker than anything). Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-11-boqun.feng@gmail.com
2020-08-26lockdep: Support deadlock detection for recursive read locks in ↵Boqun Feng
check_noncircular() Currently, lockdep only has limit support for deadlock detection for recursive read locks. This patch support deadlock detection for recursive read locks. The basic idea is: We are about to add dependency B -> A in to the dependency graph, we use check_noncircular() to find whether we have a strong dependency path A -> .. -> B so that we have a strong dependency circle (a closed strong dependency path): A -> .. -> B -> A , which doesn't have two adjacent dependencies as -(*R)-> L -(S*)->. Since A -> .. -> B is already a strong dependency path, so if either B -> A is -(E*)-> or A -> .. -> B is -(*N)->, the circle A -> .. -> B -> A is strong, otherwise not. So we introduce a new match function hlock_conflict() to replace the class_equal() for the deadlock check in check_noncircular(). Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-10-boqun.feng@gmail.com
2020-08-26lockdep: Make __bfs(.match) return boolBoqun Feng
The "match" parameter of __bfs() is used for checking whether we hit a match in the search, therefore it should return a boolean value rather than an integer for better readability. This patch then changes the return type of the function parameter and the match functions to bool. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-9-boqun.feng@gmail.com
2020-08-26lockdep: Extend __bfs() to work with multiple types of dependenciesBoqun Feng
Now we have four types of dependencies in the dependency graph, and not all the pathes carry real dependencies (the dependencies that may cause a deadlock), for example: Given lock A and B, if we have: CPU1 CPU2 ============= ============== write_lock(A); read_lock(B); read_lock(B); write_lock(A); (assuming read_lock(B) is a recursive reader) then we have dependencies A -(ER)-> B, and B -(SN)-> A, and a dependency path A -(ER)-> B -(SN)-> A. In lockdep w/o recursive locks, a dependency path from A to A means a deadlock. However, the above case is obviously not a deadlock, because no one holds B exclusively, therefore no one waits for the other to release B, so who get A first in CPU1 and CPU2 will run non-blockingly. As a result, dependency path A -(ER)-> B -(SN)-> A is not a real/strong dependency that could cause a deadlock. From the observation above, we know that for a dependency path to be real/strong, no two adjacent dependencies can be as -(*R)-> -(S*)->. Now our mission is to make __bfs() traverse only the strong dependency paths, which is simple: we record whether we only have -(*R)-> for the previous lock_list of the path in lock_list::only_xr, and when we pick a dependency in the traverse, we 1) filter out -(S*)-> dependency if the previous lock_list only has -(*R)-> dependency (i.e. ->only_xr is true) and 2) set the next lock_list::only_xr to true if we only have -(*R)-> left after we filter out dependencies based on 1), otherwise, set it to false. With this extension for __bfs(), we now need to initialize the root of __bfs() properly (with a correct ->only_xr), to do so, we introduce some helper functions, which also cleans up a little bit for the __bfs() root initialization code. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-8-boqun.feng@gmail.com
2020-08-26lockdep: Introduce lock_list::depBoqun Feng
To add recursive read locks into the dependency graph, we need to store the types of dependencies for the BFS later. There are four types of dependencies: * Exclusive -> Non-recursive dependencies: EN e.g. write_lock(prev) held and try to acquire write_lock(next) or non-recursive read_lock(next), which can be represented as "prev -(EN)-> next" * Shared -> Non-recursive dependencies: SN e.g. read_lock(prev) held and try to acquire write_lock(next) or non-recursive read_lock(next), which can be represented as "prev -(SN)-> next" * Exclusive -> Recursive dependencies: ER e.g. write_lock(prev) held and try to acquire recursive read_lock(next), which can be represented as "prev -(ER)-> next" * Shared -> Recursive dependencies: SR e.g. read_lock(prev) held and try to acquire recursive read_lock(next), which can be represented as "prev -(SR)-> next" So we use 4 bits for the presence of each type in lock_list::dep. Helper functions and macros are also introduced to convert a pair of locks into lock_list::dep bit and maintain the addition of different types of dependencies. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-7-boqun.feng@gmail.com
2020-08-26lockdep: Reduce the size of lock_list::distanceBoqun Feng
lock_list::distance is always not greater than MAX_LOCK_DEPTH (which is 48 right now), so a u16 will fit. This patch reduces the size of lock_list::distance to save space, so that we can introduce other fields to help detect recursive read lock deadlocks without increasing the size of lock_list structure. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-6-boqun.feng@gmail.com
2020-08-26lockdep: Make __bfs() visit every dependency until a matchBoqun Feng
Currently, __bfs() will do a breadth-first search in the dependency graph and visit each lock class in the graph exactly once, so for example, in the following graph: A ---------> B | ^ | | +----------> C a __bfs() call starts at A, will visit B through dependency A -> B and visit C through dependency A -> C and that's it, IOW, __bfs() will not visit dependency C -> B. This is OK for now, as we only have strong dependencies in the dependency graph, so whenever there is a traverse path from A to B in __bfs(), it means A has strong dependencies to B (IOW, B depends on A strongly). So no need to visit all dependencies in the graph. However, as we are going to add recursive-read lock into the dependency graph, as a result, not all the paths mean strong dependencies, in the same example above, dependency A -> B may be a weak dependency and traverse A -> C -> B may be a strong dependency path. And with the old way of __bfs() (i.e. visiting every lock class exactly once), we will miss the strong dependency path, which will result into failing to find a deadlock. To cure this for the future, we need to find a way for __bfs() to visit each dependency, rather than each class, exactly once in the search until we find a match. The solution is simple: We used to mark lock_class::lockdep_dependency_gen_id to indicate a class has been visited in __bfs(), now we change the semantics a little bit: we now mark lock_class::lockdep_dependency_gen_id to indicate _all the dependencies_ in its lock_{after,before} have been visited in the __bfs() (note we only take one direction in a __bfs() search). In this way, every dependency is guaranteed to be visited until we find a match. Note: the checks in mark_lock_accessed() and lock_accessed() are removed, because after this modification, we may call these two functions on @source_entry of __bfs(), which may not be the entry in "list_entries" Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200807074238.1632519-5-boqun.feng@gmail.com