summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-25Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Expedited grace-period updates - kfree_rcu() updates - RCU list updates - Preemptible RCU updates - Torture-test updates - Miscellaneous fixes - Documentation updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-24Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', ↵Paul E. McKenney
'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD doc.2019.12.10a: Documentations updates exp.2019.12.09a: Expedited grace-period updates fixes.2020.01.24a: Miscellaneous fixes kfree_rcu.2020.01.24a: Batch kfree_rcu() work list.2020.01.10a: RCU-protected-list updates preempt.2020.01.24a: Preemptible RCU updates torture.2019.12.09a: Torture-test updates
2020-01-24rcu: Remove unused stop-machine #includePaul E. McKenney
Long ago, RCU used the stop-machine mechanism to implement expedited grace periods, but no longer does so. This commit therefore removes the no-longer-needed #includes of linux/stop_machine.h. Link: https://lwn.net/Articles/805317/ Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24powerpc: Remove comment about read_barrier_depends()Will Deacon
'read_barrier_depends()' doesn't exist anymore so stop talking about it. Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24.mailmap: Add entries for old paulmck@kernel.org addressesPaul E. McKenney
[ paulmck: Apply Florian Fainelli feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24srcu: Apply *_ONCE() to ->srcu_last_gp_endPaul E. McKenney
The ->srcu_last_gp_end field is accessed from any CPU at any time by synchronize_srcu(), so non-initialization references need to use READ_ONCE() and WRITE_ONCE(). This commit therefore makes that change. Reported-by: syzbot+08f3e9d26e5541e1ecf2@syzkaller.appspotmail.com Acked-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Switch force_qs_rnp() to for_each_leaf_node_cpu_mask()Paul E. McKenney
Currently, force_qs_rnp() uses a for_each_leaf_node_possible_cpu() loop containing a check of the current CPU's bit in ->qsmask. This works, but this commit saves three lines by instead using for_each_leaf_node_cpu_mask(), which combines the functionality of for_each_leaf_node_possible_cpu() and leaf_node_cpu_bit(). This commit also replaces the use of the local variable "bit" with rdp->grpmask. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Move rcu_{expedited,normal} definitions into rcupdate.hBen Dooks
This commit moves the rcu_{expedited,normal} definitions from kernel/rcu/update.c to include/linux/rcupdate.h to make sure they are in sync, and also to avoid the following warning from sparse: kernel/ksysfs.c:150:5: warning: symbol 'rcu_expedited' was not declared. Should it be static? kernel/ksysfs.c:167:5: warning: symbol 'rcu_normal' was not declared. Should it be static? Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.hLai Jiangshan
Only tree_stall.h needs to get name from GP state, so this commit moves the gp_state_names[] array and the gp_state_getname() from kernel/rcu/tree.h and kernel/rcu/tree.c, respectively, to kernel/rcu/tree_stall.h. While moving gp_state_names[], this commit uses the GCC syntax to ensure that the right string is associated with the right CPP macro. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove the declaration of call_rcu() in tree.hLai Jiangshan
The call_rcu() function is an external RCU API that is declared in include/linux/rcupdate.h. There is thus no point in redeclaring it in kernel/rcu/tree.h, so this commit removes that redundant declaration. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Fix tracepoint tracking RCU CPU kthread utilizationLai Jiangshan
In the call to trace_rcu_utilization() at the start of the loop in rcu_cpu_kthread(), "rcu_wait" is incorrect, plus this trace event needs to be hoisted above the loop to balance with either the "rcu_wait" or "rcu_yield", depending on how the loop exits. This commit therefore makes these changes. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Fix harmless omission of "CONFIG_" from #if conditionLai Jiangshan
The C preprocessor macros SRCU and TINY_RCU should instead be CONFIG_SRCU and CONFIG_TINY_RCU, respectively in the #f in kernel/rcu/rcu.h. But there is no harm when "TINY_RCU" is wrongly used, which are always non-defined, which makes "!defined(TINY_RCU)" always true, which means the code block is always included, and the included code block doesn't cause any compilation error so far in CONFIG_TINY_RCU builds. It is also the reason this change should not be taken in -stable. This commit adds the needed "CONFIG_" prefix to both macros. Not for -stable. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Avoid tick_dep_set_cpu() misorderingPaul E. McKenney
In the current code, rcu_nmi_enter_common() might decide to turn on the tick using tick_dep_set_cpu(), but be delayed just before doing so. Then the grace-period kthread might notice that the CPU in question had in fact gone through a quiescent state, thus turning off the tick using tick_dep_clear_cpu(). The later invocation of tick_dep_set_cpu() would then incorrectly leave the tick on. This commit therefore enlists the aid of the leaf rcu_node structure's ->lock to ensure that decisions to enable or disable the tick are carried out before they can be reversed. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Provide wrappers for uses of ->rcu_read_lock_nestingLai Jiangshan
This commit provides wrapper functions for uses of ->rcu_read_lock_nesting to improve readability and to ease future changes to support inlining of __rcu_read_lock() and __rcu_read_unlock(). Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Use READ_ONCE() for ->expmask in rcu_read_unlock_special()Paul E. McKenney
The rcu_node structure's ->expmask field is updated only when holding the ->lock, but is also accessed locklessly. This means that all ->expmask updates must use WRITE_ONCE() and all reads carried out without holding ->lock must use READ_ONCE(). This commit therefore changes the lockless ->expmask read in rcu_read_unlock_special() to use READ_ONCE(). Reported-by: syzbot+99f4ddade3c22ab0cf23@syzkaller.appspotmail.com Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Marco Elver <elver@google.com>
2020-01-24rcu: Clear ->rcu_read_unlock_special only onceLai Jiangshan
In rcu_preempt_deferred_qs_irqrestore(), ->rcu_read_unlock_special is cleared one piece at a time. Given that the "if" statements in this function use the copy in "special", this commit removes the clearing of the individual pieces in favor of clearing ->rcu_read_unlock_special in one go just after it has been determined to be non-zero. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Clear .exp_hint only when deferred quiescent state has been reportedLai Jiangshan
Currently, the .exp_hint flag is cleared in rcu_read_unlock_special(), which works, but which can also prevent subsequent rcu_read_unlock() calls from helping expedite the quiescent state needed by an ongoing expedited RCU grace period. This commit therefore defers clearing of .exp_hint from rcu_read_unlock_special() to rcu_preempt_deferred_qs_irqrestore(), thus ensuring that intervening calls to rcu_read_unlock() have a chance to help end the expedited grace period. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Rename some instance of CONFIG_PREEMPTION to CONFIG_PREEMPT_RCULai Jiangshan
CONFIG_PREEMPTION and CONFIG_PREEMPT_RCU are always identical, but some code depends on CONFIG_PREEMPTION to access to rcu_preempt functionality. This patch changes CONFIG_PREEMPTION to CONFIG_PREEMPT_RCU in these cases. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove kfree_call_rcu_nobatch()Joel Fernandes (Google)
Now that the kfree_rcu() special-casing has been removed from tree RCU, this commit removes kfree_call_rcu_nobatch() since it is no longer needed. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Remove kfree_rcu() special casing and lazy-callback handlingJoel Fernandes (Google)
This commit removes kfree_rcu() special-casing and the lazy-callback handling from Tree RCU. It moves some of this special casing to Tiny RCU, the removal of which will be the subject of later commits. This results in a nice negative delta. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Add support for debug_objects debugging for kfree_rcu()Joel Fernandes (Google)
This commit applies RCU's debug_objects debugging to the new batched kfree_rcu() implementations. The object is queued at the kfree_rcu() call and dequeued during reclaim. Tested that enabling CONFIG_DEBUG_OBJECTS_RCU_HEAD successfully detects double kfree_rcu() calls. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Fix IRQ per kbuild test robot <lkp@intel.com> feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Add multiple in-flight batches of kfree_rcu() workJoel Fernandes (Google)
During testing, it was observed that amount of memory consumed due kfree_rcu() batching is 300-400MB. Previously we had only a single head_free pointer pointing to the list of rcu_head(s) that are to be freed after a grace period. Until this list is drained, we cannot queue any more objects on it since such objects may not be ready to be reclaimed when the worker thread eventually gets to drainin g the head_free list. We can do better by maintaining multiple lists as done by this patch. Testing shows that memory consumption came down by around 100-150MB with just adding another list. Adding more than 1 additional list did not show any improvement. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Code style and initialization handling. ] [ paulmck: Fix field name, reported by kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Make kfree_rcu() use a non-atomic ->monitor_todoJoel Fernandes
Because the ->monitor_todo field is always protected by krcp->lock, this commit downgrades from xchg() to non-atomic unmarked assignment statements. Signed-off-by: Joel Fernandes <joel@joelfernandes.org> [ paulmck: Update to include early-boot kick code. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcuperf: Add kfree_rcu() performance TestsJoel Fernandes (Google)
This test runs kfree_rcu() in a loop to measure performance of the new kfree_rcu() batching functionality. The following table shows results when booting with arguments: rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000 rcuperf.kfree_rcu_test=1 rcuperf.kfree_no_batch=X rcuperf.kfree_no_batch=X # Grace Periods Test Duration (s) X=1 (old behavior) 9133 11.5 X=0 (new behavior) 1732 12.5 On a 16 CPU system with the above boot parameters, we see that the total number of grace periods that elapse during the test drops from 9133 when not batching to 1732 when batching (a 5X improvement). The kfree_rcu() flood itself slows down a bit when batching, though, as shown. Note that the active memory consumption during the kfree_rcu() flood does increase to around 200-250MB due to the batching (from around 50MB without batching). However, this memory consumption is relatively constant. In other words, the system is able to keep up with the kfree_rcu() load. The memory consumption comes down considerably if KFREE_DRAIN_JIFFIES is increased from HZ/50 to HZ/80. A later patch will reduce memory consumption further by using multiple lists. Also, when running the test, please disable CONFIG_DEBUG_PREEMPT and CONFIG_PROVE_RCU for realistic comparisons with/without batching. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-24rcu: Add basic support for kfree_rcu() batchingByungchul Park
Recently a discussion about stability and performance of a system involving a high rate of kfree_rcu() calls surfaced on the list [1] which led to another discussion how to prepare for this situation. This patch adds basic batching support for kfree_rcu(). It is "basic" because we do none of the slab management, dynamic allocation, code moving or any of the other things, some of which previous attempts did [2]. These fancier improvements can be follow-up patches and there are different ideas being discussed in those regards. This is an effort to start simple, and build up from there. In the future, an extension to use kfree_bulk and possibly per-slab batching could be done to further improve performance due to cache-locality and slab-specific bulk free optimizations. By using an array of pointers, the worker thread processing the work would need to read lesser data since it does not need to deal with large rcu_head(s) any longer. Torture tests follow in the next patch and show improvements of around 5x reduction in number of grace periods on a 16 CPU system. More details and test data are in that patch. There is an implication with rcu_barrier() with this patch. Since the kfree_rcu() calls can be batched, and may not be handed yet to the RCU machinery in fact, the monitor may not have even run yet to do the queue_rcu_work(), there seems no easy way of implementing rcu_barrier() to wait for those kfree_rcu()s that are already made. So this means a kfree_rcu() followed by an rcu_barrier() does not imply that memory will be freed once rcu_barrier() returns. Another implication is higher active memory usage (although not run-away..) until the kfree_rcu() flooding ends, in comparison to without batching. More details about this are in the second patch which adds an rcuperf test. Finally, in the near future we will get rid of kfree_rcu() special casing within RCU such as in rcu_do_batch and switch everything to just batching. Currently we don't do that since timer subsystem is not yet up and we cannot schedule the kfree_rcu() monitor as the timer subsystem's lock are not initialized. That would also mean getting rid of kfree_call_rcu_nobatch() entirely. [1] http://lore.kernel.org/lkml/20190723035725-mutt-send-email-mst@kernel.org [2] https://lkml.org/lkml/2017/12/19/824 Cc: kernel-team@android.com Cc: kernel-team@lge.com Co-developed-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Applied 0day and Paul Walmsley feedback on ->monitor_todo. ] [ paulmck: Make it work during early boot. ] [ paulmck: Add a crude early boot self-test. ] [ paulmck: Style adjustments and experimental docbook structure header. ] Link: https://lore.kernel.org/lkml/alpine.DEB.2.21.9999.1908161931110.32497@viisi.sifive.com/T/#me9956f66cb611b95d26ae92700e1d901f46e8c59 Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-01-23Merge tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray fixes from Matthew Wilcox: "Primarily bugfixes, mostly around handling index wrap-around correctly. A couple of doc fixes and adding missing APIs. I had an oops live on stage at linux.conf.au this year, and it turned out to be a bug in xas_find() which I can't prove isn't triggerable in the current codebase. Then in looking for the bug, I spotted two more bugs. The bots have had a few days to chew on this with no problems reported, and it passes the test-suite (which now has more tests to make sure these problems don't come back)" * tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-dax: XArray: Add xa_for_each_range XArray: Fix xas_find returning too many entries XArray: Fix xa_find_after with multi-index entries XArray: Fix infinite loop with entry at ULONG_MAX XArray: Add wrappers for nested spinlocks XArray: Improve documentation of search marks XArray: Fix xas_pause at ULONG_MAX
2020-01-23Merge tag 'trace-v5.5-rc6-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Various tracing fixes: - Fix a function comparison warning for a xen trace event macro - Fix a double perf_event linking to a trace_uprobe_filter for multiple events - Fix suspicious RCU warnings in trace event code for using list_for_each_entry_rcu() when the "_rcu" portion wasn't needed. - Fix a bug in the histogram code when using the same variable - Fix a NULL pointer dereference when tracefs lockdown enabled and calling trace_set_default_clock() - A fix to a bug found with the double perf_event linking patch" * tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/uprobe: Fix to make trace_uprobe_filter alignment safe tracing: Do not set trace clock if tracefs lockdown is in effect tracing: Fix histogram code when expression has same var as value tracing: trigger: Replace unneeded RCU-list traversals tracing/uprobe: Fix double perf_event linking on multiprobe uprobe tracing: xen: Ordered comparison of function pointers
2020-01-23Merge tag 'ceph-for-5.5-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A fix for a potential use-after-free from Jeff, marked for stable" * tag 'ceph-for-5.5-rc8' of https://github.com/ceph/ceph-client: ceph: hold extra reference to r_parent over life of request
2020-01-23Merge tag 'pm-5.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Prevent the kernel from crashing during resume from hibernation if free pages contain leftover data from the restore kernel and init_on_free is set (Alexander Potapenko)" * tag 'pm-5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: fix crashes with init_on_free=1
2020-01-23Merge tag 'pci-v5.5-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Mark ATS as broken on AMD Navi14 GPU rev 0xc5 (Alex Deucher)" * tag 'pci-v5.5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken
2020-01-23readdir: make user_access_begin() use the real access rangeLinus Torvalds
In commit 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()") I changed filldir to not do individual __put_user() accesses, but instead use unsafe_put_user() surrounded by the proper user_access_begin/end() pair. That make them enormously faster on modern x86, where the STAC/CLAC games make individual user accesses fairly heavy-weight. However, the user_access_begin() range was not really the exact right one, since filldir() has the unfortunate problem that it needs to not only fill out the new directory entry, it also needs to fix up the previous one to contain the proper file offset. It's unfortunate, but the "d_off" field in "struct dirent" is _not_ the file offset of the directory entry itself - it's the offset of the next one. So we end up backfilling the offset in the previous entry as we walk along. But since x86 didn't really care about the exact range, and used to be the only architecture that did anything fancy in user_access_begin() to begin with, the filldir[64]() changes did something lazy, and even commented on it: /* * Note! This range-checks 'previous' (which may be NULL). * The real range was checked in getdents */ if (!user_access_begin(dirent, sizeof(*dirent))) goto efault; and it all worked fine. But now 32-bit ppc is starting to also implement user_access_begin(), and the fact that we faked the range to only be the (possibly not even valid) previous directory entry becomes a problem, because ppc32 will actually be using the range that is passed in for more than just "check that it's user space". This is a complete rewrite of Christophe's original patch. By saving off the record length of the previous entry instead of a pointer to it in the filldir data structures, we can simplify the range check and the writing of the previous entry d_off field. No need for any conditionals in the user accesses themselves, although we retain the conditional EINTR checking for the "was this the first directory entry" signal handling latency logic. Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()") Link: https://lore.kernel.org/lkml/a02d3426f93f7eb04960a4d9140902d278cab0bb.1579697910.git.christophe.leroy@c-s.fr/ Link: https://lore.kernel.org/lkml/408c90c4068b00ea8f1c41cca45b84ec23d4946b.1579783936.git.christophe.leroy@c-s.fr/ Reported-and-tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-23readdir: be more conservative with directory entry namesLinus Torvalds
Commit 8a23eb804ca4 ("Make filldir[64]() verify the directory entry filename is valid") added some minimal validity checks on the directory entries passed to filldir[64](). But they really were pretty minimal. This fleshes out at least the name length check: we used to disallow zero-length names, but really, negative lengths or oevr-long names aren't ok either. Both could happen if there is some filesystem corruption going on. Now, most filesystems tend to use just an "unsigned char" or similar for the length of a directory entry name, so even with a corrupt filesystem you should never see anything odd like that. But since we then use the name length to create the directory entry record length, let's make sure it actually is half-way sensible. Note how POSIX states that the size of a path component is limited by NAME_MAX, but we actually use PATH_MAX for the check here. That's because while NAME_MAX is generally the correct maximum name length (it's 255, for the same old "name length is usually just a byte on disk"), there's nothing in the VFS layer that really cares. So the real limitation at a VFS layer is the total pathname length you can pass as a filename: PATH_MAX. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-22Merge tag 'leds-5.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds Pull LED fixes from Pavel Machek: "Jacek's fix for an uninitialized gpio label is why I'm requesting this pull; it fixes regression in debugging output in sysfs. Others are just bugfixes that should be safe. Everything has been in -next for while" * tag 'leds-5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: leds: lm3532: add pointer to documentation and fix typo leds: rb532: cleanup whitespace ledtrig-pattern: fix email address quoting in MODULE_AUTHOR() led: max77650: add of_match table leds-as3645a: Drop fwnode reference on ignored node leds: gpio: Fix uninitialized gpio label for fwnode based probe
2020-01-22Merge tag 'hwmon-for-v5.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - In hwmon core, do not use the hwmon parent device for device managed memory allocations, since parent device lifetime may not match hwmon device lifetime. - Fix discrepancy between read and write values in adt7475 driver. - Fix alarms and voltage limits in nct7802 driver. * tag 'hwmon-for-v5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (core) Do not use device managed functions for memory allocations hwmon: (adt7475) Make volt2reg return same reg as reg2volt input hwmon: (nct7802) Fix non-working alarm on voltages hwmon: (nct7802) Fix voltage limits to wrong registers
2020-01-22leds: lm3532: add pointer to documentation and fix typoPavel
Add pointer to datasheet and fix typo in printk message. Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22leds: rb532: cleanup whitespacePavel Machek
Trivial cleanup removing empty line at wrong place. Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22ledtrig-pattern: fix email address quoting in MODULE_AUTHOR()Pavel Machek
Apparently it is quite easy to forget ">" in quoting of email address. This fixes it. Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22led: max77650: add of_match tableBartosz Golaszewski
We need the of_match table if we want to use the compatible string in the pmic's child node and get the led driver loaded automatically. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22leds-as3645a: Drop fwnode reference on ignored nodeSakari Ailus
If a node is ignored, do not get a reference to it. Fix the bug by moving fwnode_handle_get() where a reference to an fwnode is saved for clarity. Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22leds: gpio: Fix uninitialized gpio label for fwnode based probeJacek Anaszewski
When switching to using generic LED name composition mechanism via devm_led_classdev_register_ext() API the part of code initializing struct gpio_led's template name property was removed alongside. It was however overlooked that the property was also passed to devm_fwnode_get_gpiod_from_child() in place of "label" parameter, which when set to NULL, results in gpio label being initialized to '?'. It could be observed in debugfs and failed to properly identify gpio association with LED consumer. Fix this shortcoming by updating the GPIO label after the LED is registered and its final name is known. Fixes: d7235f5feaa0 ("leds: gpio: Use generic support for composing LED names") Cc: Russell King <linux@armlinux.org.uk> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com> [fixed comment] Signed-off-by: Pavel Machek <pavel@ucw.cz>
2020-01-22Merge tag 'io_uring-5.5-2020-01-22' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fix from Jens Axboe: "This was supposed to have gone in last week, but due to a brain fart on my part, I forgot that we made this struct addition in the 5.5 cycle. So here it is for 5.5, to prevent having a 32 vs 64-bit compatability issue with the files_update command" * tag 'io_uring-5.5-2020-01-22' of git://git.kernel.dk/linux-block: io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
2020-01-22tracing/uprobe: Fix to make trace_uprobe_filter alignment safeMasami Hiramatsu
Commit 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe") moved trace_uprobe_filter on trace_probe_event. However, since it introduced a flexible data structure with char array and type casting, the alignment of trace_uprobe_filter can be broken. This changes the type of the array to trace_uprobe_filter data strucure to fix it. Link: http://lore.kernel.org/r/20200120124022.GA14897@hirez.programming.kicks-ass.net Link: http://lkml.kernel.org/r/157966340499.5107.10978352478952144902.stgit@devnote2 Fixes: 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-21ceph: hold extra reference to r_parent over life of requestJeff Layton
Currently, we just assume that it will stick around by virtue of the submitter's reference, but later patches will allow the syscall to return early and we can't rely on that reference at that point. While I'm not aware of any reports of it, Xiubo pointed out that this may fix a use-after-free. If the wait for a reply times out or is canceled via signal, and then the reply comes in after the syscall returns, the client can end up trying to access r_parent without a reference. Take an extra reference to the inode when setting r_parent and release it when releasing the request. Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-01-20io_uring: fix compat for IORING_REGISTER_FILES_UPDATEEugene Syromiatnikov
fds field of struct io_uring_files_update is problematic with regards to compat user space, as pointer size is different in 32-bit, 32-on-64-bit, and 64-bit user space. In order to avoid custom handling of compat in the syscall implementation, make fds __u64 and use u64_to_user_ptr in order to retrieve it. Also, align the field naturally and check that no garbage is passed there. Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20tracing: Do not set trace clock if tracefs lockdown is in effectMasami Ichikawa
When trace_clock option is not set and unstable clcok detected, tracing_set_default_clock() sets trace_clock(ThinkPad A285 is one of case). In that case, if lockdown is in effect, null pointer dereference error happens in ring_buffer_set_clock(). Link: http://lkml.kernel.org/r/20200116131236.3866925-1-masami256@gmail.com Cc: stable@vger.kernel.org Fixes: 17911ff38aa58 ("tracing: Add locked_down checks to the open calls of files created for tracefs") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1788488 Signed-off-by: Masami Ichikawa <masami256@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-20tracing: Fix histogram code when expression has same var as valueSteven Rostedt (VMware)
While working on a tool to convert SQL syntex into the histogram language of the kernel, I discovered the following bug: # echo 'first u64 start_time u64 end_time pid_t pid u64 delta' >> synthetic_events # echo 'hist:keys=pid:start=common_timestamp' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp-$start,start2=$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger Would not display any histograms in the sched_switch histogram side. But if I were to swap the location of "delta=common_timestamp-$start" with "start2=$start" Such that the last line had: # echo 'hist:keys=next_pid:start2=$start,delta=common_timestamp-$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger The histogram works as expected. What I found out is that the expressions clear out the value once it is resolved. As the variables are resolved in the order listed, when processing: delta=common_timestamp-$start The $start is cleared. When it gets to "start2=$start", it errors out with "unresolved symbol" (which is silent as this happens at the location of the trace), and the histogram is dropped. When processing the histogram for variable references, instead of adding a new reference for a variable used twice, use the same reference. That way, not only is it more efficient, but the order will no longer matter in processing of the variables. From Tom Zanussi: "Just to clarify some more about what the problem was is that without your patch, we would have two separate references to the same variable, and during resolve_var_refs(), they'd both want to be resolved separately, so in this case, since the first reference to start wasn't part of an expression, it wouldn't get the read-once flag set, so would be read normally, and then the second reference would do the read-once read and also be read but using read-once. So everything worked and you didn't see a problem: from: start2=$start,delta=common_timestamp-$start In the second case, when you switched them around, the first reference would be resolved by doing the read-once, and following that the second reference would try to resolve and see that the variable had already been read, so failed as unset, which caused it to short-circuit out and not do the trigger action to generate the synthetic event: to: delta=common_timestamp-$start,start2=$start With your patch, we only have the single resolution which happens correctly the one time it's resolved, so this can't happen." Link: https://lore.kernel.org/r/20200116154216.58ca08eb@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reviewed-by: Tom Zanuss <zanussi@kernel.org> Tested-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-20Merge tag 'fixes_for_v5.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull reiserfs fix from Jan Kara: "A fixup of a recently merged reiserfs fix which has caused problem when xattrs were not compiled in" * tag 'fixes_for_v5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr
2020-01-19Linux 5.5-rc7v5.5-rc7Linus Torvalds
2020-01-19Merge tag 'riscv/for-v5.5-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Three fixes for RISC-V: - Don't free and reuse memory containing the code that CPUs parked at boot reside in. - Fix rv64 build problems for ubsan and some modules by adding logical and arithmetic shift helpers for 128-bit values. These are from libgcc and are similar to what's present for ARM64. - Fix vDSO builds to clean up their own temporary files" * tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Less inefficient gcc tishift helpers (and export their symbols) riscv: delete temporary files riscv: make sure the cores stay looping in .Lsecondary_park
2020-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix non-blocking connect() in x25, from Martin Schiller. 2) Fix spurious decryption errors in kTLS, from Jakub Kicinski. 3) Netfilter use-after-free in mtype_destroy(), from Cong Wang. 4) Limit size of TSO packets properly in lan78xx driver, from Eric Dumazet. 5) r8152 probe needs an endpoint sanity check, from Johan Hovold. 6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from John Fastabend. 7) hns3 needs short frames padded on transmit, from Yunsheng Lin. 8) Fix netfilter ICMP header corruption, from Eyal Birger. 9) Fix soft lockup when low on memory in hns3, from Yonglong Liu. 10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan. 11) Fix memory leak in act_ctinfo, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: reject overlapped queues in TC-MQPRIO offload cxgb4: fix Tx multi channel port rate limit net: sched: act_ctinfo: fix memory leak bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal. bnxt_en: Fix ipv6 RFS filter matching logic. bnxt_en: Fix NTUPLE firmware command failures. net: systemport: Fixed queue mapping in internal ring map net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec net: dsa: sja1105: Don't error out on disabled ports with no phy-mode net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset net: hns: fix soft lockup when there is not enough memory net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key() net/sched: act_ife: initalize ife->metalist earlier netfilter: nat: fix ICMP header corruption on ICMP errors net: wan: lapbether.c: Use built-in RCU list checking netfilter: nf_tables: fix flowtable list del corruption netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks() netfilter: nf_tables: remove WARN and add NLA_STRING upper limits netfilter: nft_tunnel: ERSPAN_VERSION must not be null netfilter: nft_tunnel: fix null-attribute check ...