Age | Commit message (Collapse) | Author |
|
Complier may generate codes that re-read the tun->numqueues during
tun_select_queue(). This may be a race if vlan->numqueues were changed in the
same time and can lead unexpected result (e.g. very huge value).
We need prevent the compiler from generating such codes by adding an
ACCESS_ONCE() to make sure tun->numqueues were only read once.
Bug were introduced by commit c8d68e6be1c3b242f1c598595830890b65cea64a
(tuntap: multiqueue support).
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we decide not use zero-copy, msg.control should be set to NULL otherwise
macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs
wrongly.
Bug were introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84
(vhost-net: skip head management if no outstanding).
This solves the following warnings:
WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]()
Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun]
CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566
Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011
ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48
ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0
ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58
Call Trace:
[<ffffffff81796b73>] dump_stack+0x19/0x1e
[<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0
[<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20
[<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net]
[<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net]
[<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffff81061f46>] kthread+0xc6/0xd0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch reworks the UEFI anti-bricking code, including an effective
reversion of cc5a080c and 31ff2f20. It turns out that calling
QueryVariableInfo() from boot services results in some firmware
implementations jumping to physical addresses even after entering virtual
mode, so until we have 1:1 mappings for UEFI runtime space this isn't
going to work so well.
Reverting these gets us back to the situation where we'd refuse to create
variables on some systems because they classify deleted variables as "used"
until the firmware triggers a garbage collection run, which they won't do
until they reach a lower threshold. This results in it being impossible to
install a bootloader, which is unhelpful.
Feedback from Samsung indicates that the firmware doesn't need more than
5KB of storage space for its own purposes, so that seems like a reasonable
threshold. However, there's still no guarantee that a platform will attempt
garbage collection merely because it drops below this threshold. It seems
that this is often only triggered if an attempt to write generates a
genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
create a variable larger than the remaining space. This should fail, but if
it somehow succeeds we can then immediately delete it.
I've tested this on the UEFI machines I have available, but I don't have
a Samsung and so can't verify that it avoids the bricking problem.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
|
|
In Steven Rostedt's words:
> I've been debugging the last couple of days why my tests have been
> locking up. One of my tracing tests, runs all available tracers. The
> lockup always happened with the mmiotrace, which is used to trace
> interactions between priority drivers and the kernel. But to do this
> easily, when the tracer gets registered, it disables all but the boot
> CPUs. The lockup always happened after it got done disabling the CPUs.
>
> Then I decided to try this:
>
> while :; do
> for i in 1 2 3; do
> echo 0 > /sys/devices/system/cpu/cpu$i/online
> done
> for i in 1 2 3; do
> echo 1 > /sys/devices/system/cpu/cpu$i/online
> done
> done
>
> Well, sure enough, that locked up too, with the same users. Doing a
> sysrq-w (showing all blocked tasks):
>
> [ 2991.344562] task PC stack pid father
> [ 2991.344562] rcu_preempt D ffff88007986fdf8 0 10 2 0x00000000
> [ 2991.344562] ffff88007986fc98 0000000000000002 ffff88007986fc48 0000000000000908
> [ 2991.344562] ffff88007986c280 ffff88007986ffd8 ffff88007986ffd8 00000000001d3c80
> [ 2991.344562] ffff880079248a40 ffff88007986c280 0000000000000000 00000000fffd4295
> [ 2991.344562] Call Trace:
> [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66
> [ 2991.344562] [<ffffffff81541750>] schedule_timeout+0xbc/0xf9
> [ 2991.344562] [<ffffffff8154bec0>] ? ftrace_call+0x5/0x2f
> [ 2991.344562] [<ffffffff81049513>] ? cascade+0xa8/0xa8
> [ 2991.344562] [<ffffffff815417ab>] schedule_timeout_uninterruptible+0x1e/0x20
> [ 2991.344562] [<ffffffff810c980c>] rcu_gp_kthread+0x502/0x94b
> [ 2991.344562] [<ffffffff81062791>] ? __init_waitqueue_head+0x50/0x50
> [ 2991.344562] [<ffffffff810c930a>] ? rcu_gp_fqs+0x64/0x64
> [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9
> [ 2991.344562] [<ffffffff81091e31>] ? lock_release_holdtime.part.23+0x4e/0x55
> [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58
> [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0
> [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58
> [ 2991.344562] kworker/0:1 D ffffffff81a30680 0 47 2 0x00000000
> [ 2991.344562] Workqueue: events cpuset_hotplug_workfn
> [ 2991.344562] ffff880078dbbb58 0000000000000002 0000000000000006 00000000000000d8
> [ 2991.344562] ffff880078db8100 ffff880078dbbfd8 ffff880078dbbfd8 00000000001d3c80
> [ 2991.344562] ffff8800779ca5c0 ffff880078db8100 ffffffff81541fcf 0000000000000000
> [ 2991.344562] Call Trace:
> [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609
> [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66
> [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24
> [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609
> [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50
> [ 2991.344562] [<ffffffff8103d11b>] ? get_online_cpus+0x3c/0x50
> [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40
> [ 2991.344562] [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50
> [ 2991.344562] [<ffffffff810af7e6>] rebuild_sched_domains_locked+0x6e/0x3a8
> [ 2991.344562] [<ffffffff810b0ec6>] rebuild_sched_domains+0x1c/0x2a
> [ 2991.344562] [<ffffffff810b109b>] cpuset_hotplug_workfn+0x1c7/0x1d3
> [ 2991.344562] [<ffffffff810b0ed9>] ? cpuset_hotplug_workfn+0x5/0x1d3
> [ 2991.344562] [<ffffffff81058e07>] process_one_work+0x2d4/0x4d1
> [ 2991.344562] [<ffffffff81058d3a>] ? process_one_work+0x207/0x4d1
> [ 2991.344562] [<ffffffff8105964c>] worker_thread+0x2e7/0x3b5
> [ 2991.344562] [<ffffffff81059365>] ? rescuer_thread+0x332/0x332
> [ 2991.344562] [<ffffffff81061cdb>] kthread+0xb1/0xb9
> [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58
> [ 2991.344562] [<ffffffff8154c1dc>] ret_from_fork+0x7c/0xb0
> [ 2991.344562] [<ffffffff81061c2a>] ? __init_kthread_worker+0x58/0x58
> [ 2991.344562] bash D ffffffff81a4aa80 0 2618 2612 0x10000000
> [ 2991.344562] ffff8800379abb58 0000000000000002 0000000000000006 0000000000000c2c
> [ 2991.344562] ffff880077fea140 ffff8800379abfd8 ffff8800379abfd8 00000000001d3c80
> [ 2991.344562] ffff8800779ca5c0 ffff880077fea140 ffffffff81541fcf 0000000000000000
> [ 2991.344562] Call Trace:
> [ 2991.344562] [<ffffffff81541fcf>] ? __mutex_lock_common+0x3d4/0x609
> [ 2991.344562] [<ffffffff815437ba>] schedule+0x64/0x66
> [ 2991.344562] [<ffffffff81543a39>] schedule_preempt_disabled+0x18/0x24
> [ 2991.344562] [<ffffffff81541fcf>] __mutex_lock_common+0x3d4/0x609
> [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e
> [ 2991.344562] [<ffffffff81530078>] ? rcu_cpu_notify+0x2f5/0x86e
> [ 2991.344562] [<ffffffff815422ff>] mutex_lock_nested+0x3b/0x40
> [ 2991.344562] [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e
> [ 2991.344562] [<ffffffff81091c99>] ? __lock_is_held+0x32/0x53
> [ 2991.344562] [<ffffffff81548912>] notifier_call_chain+0x6b/0x98
> [ 2991.344562] [<ffffffff810671fd>] __raw_notifier_call_chain+0xe/0x10
> [ 2991.344562] [<ffffffff8103cf64>] __cpu_notify+0x20/0x32
> [ 2991.344562] [<ffffffff8103cf8d>] cpu_notify_nofail+0x17/0x36
> [ 2991.344562] [<ffffffff815225de>] _cpu_down+0x154/0x259
> [ 2991.344562] [<ffffffff81522710>] cpu_down+0x2d/0x3a
> [ 2991.344562] [<ffffffff81526351>] store_online+0x4e/0xe7
> [ 2991.344562] [<ffffffff8134d764>] dev_attr_store+0x20/0x22
> [ 2991.344562] [<ffffffff811b3c5f>] sysfs_write_file+0x108/0x144
> [ 2991.344562] [<ffffffff8114c5ef>] vfs_write+0xfd/0x158
> [ 2991.344562] [<ffffffff8114c928>] SyS_write+0x5c/0x83
> [ 2991.344562] [<ffffffff8154c494>] tracesys+0xdd/0xe2
>
> As well as held locks:
>
> [ 3034.728033] Showing all locks held in the system:
> [ 3034.728033] 1 lock held by rcu_preempt/10:
> [ 3034.728033] #0: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff810c9471>] rcu_gp_kthread+0x167/0x94b
> [ 3034.728033] 4 locks held by kworker/0:1/47:
> [ 3034.728033] #0: (events){.+.+.+}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1
> [ 3034.728033] #1: (cpuset_hotplug_work){+.+.+.}, at: [<ffffffff81058d3a>] process_one_work+0x207/0x4d1
> [ 3034.728033] #2: (cpuset_mutex){+.+.+.}, at: [<ffffffff810b0ec1>] rebuild_sched_domains+0x17/0x2a
> [ 3034.728033] #3: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103d11b>] get_online_cpus+0x3c/0x50
> [ 3034.728033] 1 lock held by mingetty/2563:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
> [ 3034.728033] 1 lock held by mingetty/2565:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
> [ 3034.728033] 1 lock held by mingetty/2569:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
> [ 3034.728033] 1 lock held by mingetty/2572:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
> [ 3034.728033] 1 lock held by mingetty/2575:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
> [ 3034.728033] 7 locks held by bash/2618:
> [ 3034.728033] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8114bc3f>] file_start_write+0x2a/0x2c
> [ 3034.728033] #1: (&buffer->mutex#2){+.+.+.}, at: [<ffffffff811b3b93>] sysfs_write_file+0x3c/0x144
> [ 3034.728033] #2: (s_active#54){.+.+.+}, at: [<ffffffff811b3c3e>] sysfs_write_file+0xe7/0x144
> [ 3034.728033] #3: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff810217c2>] cpu_hotplug_driver_lock+0x17/0x19
> [ 3034.728033] #4: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8103d196>] cpu_maps_update_begin+0x17/0x19
> [ 3034.728033] #5: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8103cfd8>] cpu_hotplug_begin+0x2c/0x6d
> [ 3034.728033] #6: (rcu_preempt_state.onoff_mutex){+.+...}, at: [<ffffffff81530078>] rcu_cpu_notify+0x2f5/0x86e
> [ 3034.728033] 1 lock held by bash/2980:
> [ 3034.728033] #0: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff8131e28a>] n_tty_read+0x252/0x7e8
>
> Things looked a little weird. Also, this is a deadlock that lockdep did
> not catch. But what we have here does not look like a circular lock
> issue:
>
> Bash is blocked in rcu_cpu_notify():
>
> 1961 /* Exclude any attempts to start a new grace period. */
> 1962 mutex_lock(&rsp->onoff_mutex);
>
>
> kworker is blocked in get_online_cpus(), which makes sense as we are
> currently taking down a CPU.
>
> But rcu_preempt is not blocked on anything. It is simply sleeping in
> rcu_gp_kthread (really rcu_gp_init) here:
>
> 1453 #ifdef CONFIG_PROVE_RCU_DELAY
> 1454 if ((prandom_u32() % (rcu_num_nodes * 8)) == 0 &&
> 1455 system_state == SYSTEM_RUNNING)
> 1456 schedule_timeout_uninterruptible(2);
> 1457 #endif /* #ifdef CONFIG_PROVE_RCU_DELAY */
>
> And it does this while holding the onoff_mutex that bash is waiting for.
>
> Doing a function trace, it showed me where it happened:
>
> [ 125.940066] rcu_pree-10 3.... 28384115273: schedule_timeout_uninterruptible <-rcu_gp_kthread
> [...]
> [ 125.940066] rcu_pree-10 3d..3 28384202439: sched_switch: prev_comm=rcu_preempt prev_pid=10 prev_prio=120 prev_state=D ==> next_comm=watchdog/3 next_pid=38 next_prio=120
>
> The watchdog ran, and then:
>
> [ 125.940066] watchdog-38 3d..3 28384692863: sched_switch: prev_comm=watchdog/3 prev_pid=38 prev_prio=120 prev_state=P ==> next_comm=modprobe next_pid=2848 next_prio=118
>
> Not sure what modprobe was doing, but shortly after that:
>
> [ 125.940066] modprobe-2848 3d..3 28385041749: sched_switch: prev_comm=modprobe prev_pid=2848 prev_prio=118 prev_state=R+ ==> next_comm=migration/3 next_pid=40 next_prio=0
>
> Where the migration thread took down the CPU:
>
> [ 125.940066] migratio-40 3d..3 28389148276: sched_switch: prev_comm=migration/3 prev_pid=40 prev_prio=0 prev_state=P ==> next_comm=swapper/3 next_pid=0 next_prio=120
>
> which finally did:
>
> [ 125.940066] <idle>-0 3...1 28389282142: arch_cpu_idle_dead <-cpu_startup_entry
> [ 125.940066] <idle>-0 3...1 28389282548: native_play_dead <-arch_cpu_idle_dead
> [ 125.940066] <idle>-0 3...1 28389282924: play_dead_common <-native_play_dead
> [ 125.940066] <idle>-0 3...1 28389283468: idle_task_exit <-play_dead_common
> [ 125.940066] <idle>-0 3...1 28389284644: amd_e400_remove_cpu <-play_dead_common
>
>
> CPU 3 is now offline, the rcu_preempt thread that ran on CPU 3 is still
> doing a schedule_timeout_uninterruptible() and it registered it's
> timeout to the timer base for CPU 3. You would think that it would get
> migrated right? The issue here is that the timer migration happens at
> the CPU notifier for CPU_DEAD. The problem is that the rcu notifier for
> CPU_DOWN is blocked waiting for the onoff_mutex to be released, which is
> held by the thread that just put itself into a uninterruptible sleep,
> that wont wake up until the CPU_DEAD notifier of the timer
> infrastructure is called, which wont happen until the rcu notifier
> finishes. Here's our deadlock!
This commit breaks this deadlock cycle by substituting a shorter udelay()
for the previous schedule_timeout_uninterruptible(), while at the same
time increasing the probability of the delay. This maintains the intensity
of the testing.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
|
|
This commit fixes a lockdep-detected deadlock by moving a wake_up()
call out from a rnp->lock critical section. Please see below for
the long version of this story.
On Tue, 2013-05-28 at 16:13 -0400, Dave Jones wrote:
> [12572.705832] ======================================================
> [12572.750317] [ INFO: possible circular locking dependency detected ]
> [12572.796978] 3.10.0-rc3+ #39 Not tainted
> [12572.833381] -------------------------------------------------------
> [12572.862233] trinity-child17/31341 is trying to acquire lock:
> [12572.870390] (rcu_node_0){..-.-.}, at: [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
> [12572.878859]
> but task is already holding lock:
> [12572.894894] (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0
> [12572.903381]
> which lock already depends on the new lock.
>
> [12572.927541]
> the existing dependency chain (in reverse order) is:
> [12572.943736]
> -> #4 (&ctx->lock){-.-...}:
> [12572.960032] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12572.968337] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
> [12572.976633] [<ffffffff8113c987>] __perf_event_task_sched_out+0x2e7/0x5e0
> [12572.984969] [<ffffffff81088953>] perf_event_task_sched_out+0x93/0xa0
> [12572.993326] [<ffffffff816ea0bf>] __schedule+0x2cf/0x9c0
> [12573.001652] [<ffffffff816eacfe>] schedule_user+0x2e/0x70
> [12573.009998] [<ffffffff816ecd64>] retint_careful+0x12/0x2e
> [12573.018321]
> -> #3 (&rq->lock){-.-.-.}:
> [12573.034628] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12573.042930] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
> [12573.051248] [<ffffffff8108e6a7>] wake_up_new_task+0xb7/0x260
> [12573.059579] [<ffffffff810492f5>] do_fork+0x105/0x470
> [12573.067880] [<ffffffff81049686>] kernel_thread+0x26/0x30
> [12573.076202] [<ffffffff816cee63>] rest_init+0x23/0x140
> [12573.084508] [<ffffffff81ed8e1f>] start_kernel+0x3f1/0x3fe
> [12573.092852] [<ffffffff81ed856f>] x86_64_start_reservations+0x2a/0x2c
> [12573.101233] [<ffffffff81ed863d>] x86_64_start_kernel+0xcc/0xcf
> [12573.109528]
> -> #2 (&p->pi_lock){-.-.-.}:
> [12573.125675] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12573.133829] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90
> [12573.141964] [<ffffffff8108e881>] try_to_wake_up+0x31/0x320
> [12573.150065] [<ffffffff8108ebe2>] default_wake_function+0x12/0x20
> [12573.158151] [<ffffffff8107bbf8>] autoremove_wake_function+0x18/0x40
> [12573.166195] [<ffffffff81085398>] __wake_up_common+0x58/0x90
> [12573.174215] [<ffffffff81086909>] __wake_up+0x39/0x50
> [12573.182146] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50
> [12573.190119] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0
> [12573.198023] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930
> [12573.205860] [<ffffffff8107a91d>] kthread+0xed/0x100
> [12573.213656] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0
> [12573.221379]
> -> #1 (&rsp->gp_wq){..-.-.}:
> [12573.236329] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12573.243783] [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90
> [12573.251178] [<ffffffff810868f3>] __wake_up+0x23/0x50
> [12573.258505] [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50
> [12573.265891] [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0
> [12573.273248] [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930
> [12573.280564] [<ffffffff8107a91d>] kthread+0xed/0x100
> [12573.287807] [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0
Notice the above call chain.
rcu_start_future_gp() is called with the rnp->lock held. Then it calls
rcu_start_gp_advance, which does a wakeup.
You can't do wakeups while holding the rnp->lock, as that would mean
that you could not do a rcu_read_unlock() while holding the rq lock, or
any lock that was taken while holding the rq lock. This is because...
(See below).
> [12573.295067]
> -> #0 (rcu_node_0){..-.-.}:
> [12573.309293] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0
> [12573.316568] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12573.323825] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
> [12573.331081] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
> [12573.338377] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0
> [12573.345648] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0
> [12573.352942] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0
> [12573.360211] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0
> [12573.367514] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10
> [12573.374816] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2
Notice the above trace.
perf took its own ctx->lock, which can be taken while holding the rq
lock. While holding this lock, it did a rcu_read_unlock(). The
perf_lock_task_context() basically looks like:
rcu_read_lock();
raw_spin_lock(ctx->lock);
rcu_read_unlock();
Now, what looks to have happened, is that we scheduled after taking that
first rcu_read_lock() but before taking the spin lock. When we scheduled
back in and took the ctx->lock, the following rcu_read_unlock()
triggered the "special" code.
The rcu_read_unlock_special() takes the rnp->lock, which gives us a
possible deadlock scenario.
CPU0 CPU1 CPU2
---- ---- ----
rcu_nocb_kthread()
lock(rq->lock);
lock(ctx->lock);
lock(rnp->lock);
wake_up();
lock(rq->lock);
rcu_read_unlock();
rcu_read_unlock_special();
lock(rnp->lock);
lock(ctx->lock);
**** DEADLOCK ****
> [12573.382068]
> other info that might help us debug this:
>
> [12573.403229] Chain exists of:
> rcu_node_0 --> &rq->lock --> &ctx->lock
>
> [12573.424471] Possible unsafe locking scenario:
>
> [12573.438499] CPU0 CPU1
> [12573.445599] ---- ----
> [12573.452691] lock(&ctx->lock);
> [12573.459799] lock(&rq->lock);
> [12573.467010] lock(&ctx->lock);
> [12573.474192] lock(rcu_node_0);
> [12573.481262]
> *** DEADLOCK ***
>
> [12573.501931] 1 lock held by trinity-child17/31341:
> [12573.508990] #0: (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0
> [12573.516475]
> stack backtrace:
> [12573.530395] CPU: 1 PID: 31341 Comm: trinity-child17 Not tainted 3.10.0-rc3+ #39
> [12573.545357] ffffffff825b4f90 ffff880219f1dbc0 ffffffff816e375b ffff880219f1dc00
> [12573.552868] ffffffff816dfa5d ffff880219f1dc50 ffff88023ce4d1f8 ffff88023ce4ca40
> [12573.560353] 0000000000000001 0000000000000001 ffff88023ce4d1f8 ffff880219f1dcc0
> [12573.567856] Call Trace:
> [12573.575011] [<ffffffff816e375b>] dump_stack+0x19/0x1b
> [12573.582284] [<ffffffff816dfa5d>] print_circular_bug+0x200/0x20f
> [12573.589637] [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0
> [12573.596982] [<ffffffff810918f5>] ? sched_clock_cpu+0xb5/0x100
> [12573.604344] [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
> [12573.611652] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0
> [12573.619030] [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
> [12573.626331] [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0
> [12573.633671] [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
> [12573.640992] [<ffffffff811390ed>] ? perf_lock_task_context+0x7d/0x2d0
> [12573.648330] [<ffffffff810b429e>] ? put_lock_stats.isra.29+0xe/0x40
> [12573.655662] [<ffffffff813095a0>] ? delay_tsc+0x90/0xe0
> [12573.662964] [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0
> [12573.670276] [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0
> [12573.677622] [<ffffffff81139070>] ? __perf_event_enable+0x370/0x370
> [12573.684981] [<ffffffff8113938e>] find_get_context+0x4e/0x1f0
> [12573.692358] [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0
> [12573.699753] [<ffffffff8108cd9d>] ? get_parent_ip+0xd/0x50
> [12573.707135] [<ffffffff810b71fd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [12573.714599] [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10
> [12573.721996] [<ffffffff816f4dd4>] tracesys+0xdd/0xe2
This commit delays the wakeup via irq_work(), which is what
perf and ftrace use to perform wakeups in critical sections.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
__DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which
may safely be invoked from what RCU considers to be an idle CPU.
However, these _rcuidle() tracepoints may -not- be invoked from the
handler of an irq taken from idle, because rcu_idle_enter() zeroes
RCU's nesting-level counter, so that the rcu_irq_exit() returning to
idle will trigger a WARN_ON_ONCE().
This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit()
and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle()
tracepoints usable from irq handlers as well as from process context.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
|
|
Pablo Neira Ayuso says:
====================
The following patchset contains four fixes for Netfilter and one fix
for IPVS, they are:
* Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from
Dan Carpenter.
* Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the
violation of RFC879, from Phil Oester.
* Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout,
from myself.
* Fix missing HW protocol in packets passed to user-space via NFQUEUE,
from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few nasty issues, particularly a race with the interrupt controller
in the xilinx driver, together with a couple of more minor fixes and a
much needed move of the mailing list away from sourceforge."
* tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: hspi: fixup long delay time
spi: spi-xilinx: Remove ISR race condition
spi: topcliff-pch: fix error return code in pch_spi_probe()
spi: topcliff-pch: Pass correct pointer to free_irq()
spi: Move mailing list to vger
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes for regressions:
- xen/tmem stopped working after a certain combination of
modprobe/swapon was used
- cpu online/offlining would trigger WARN_ON."
* tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it.
xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"The biggest fix here is Lars-Peter's fix for custom locking callbacks
which is pretty localised but important for those devices that use the
feature. Otherwise we've got a couple of fairly small cleanups which
would have been sent sooner were it not for letting Lars-Peter's patch
soak for a while"
* tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: rbtree: Fixed node range check on sync
regmap: regcache: Fixup locking for custom lock callbacks
regmap: debugfs: Check return value of regmap_write()
|
|
Pull crypto fixes from Herbert Xu:
"This fixes a build problem in sahara and temporarily disables two new
optimisations because of performance regressions until a permanent fix
is ready"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sahara - fix building as module
crypto: blowfish - disable AVX2 implementation
crypto: twofish - disable AVX2 implementation
|
|
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace in the OOM error
path.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Do not use uninitialised termios data to determine when to configure the
device at open.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace.
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
arch_ftrace_update_code and ftrace_modify_all_code are only
available if CONFIG_DYNAMIC_FTRACE is selected.
Fixes the following build problem on MIPS randconfig:
arch/mips/kernel/ftrace.c: In function 'arch_ftrace_update_code':
arch/mips/kernel/ftrace.c:31:2: error: implicit declaration of function
'ftrace_modify_all_code' [-Werror=implicit-function-declaration]
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5435/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
The kvm_* symbols are only available if KVM is selected.
Fixes the following linking problem on a randconfig:
arch/mips/built-in.o: In function `local_flush_tlb_mm':
(.text+0x18a94): undefined reference to `kvm_local_flush_tlb_all'
arch/mips/built-in.o: In function `local_flush_tlb_range':
(.text+0x18d0c): undefined reference to `kvm_local_flush_tlb_all'
kernel/built-in.o: In function `__schedule':
core.c:(.sched.text+0x2a00): undefined reference to `kvm_local_flush_tlb_all'
mm/built-in.o: In function `use_mm':
(.text+0x30214): undefined reference to `kvm_local_flush_tlb_all'
fs/built-in.o: In function `flush_old_exec':
(.text+0xf0a0): undefined reference to `kvm_local_flush_tlb_all'
make: *** [vmlinux] Error 1
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5437/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Only an interrupt can wake the core from 'wait', enable interrupts
locally before executing 'wait'.
[ralf@linux-mips.org: This leave the race between an interrupt that's
setting TIF_NEED_RESCHEd and entering the WAIT status. but at least it's
going to bring Alchemy back from the dead, so I'm going to apply this
patch.]
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/5408/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
Faraday fotg210 udc driver supports only Bulk transfer so far.
fotg210 could be configured as an USB2.0 peripheral.
This driver is tested with mass storage gadget driver on Faraday
EVB a369.
Signed-off-by: Yuan-Hsin Chen <yhchen@faraday-tech.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_rndis learns about configfs so we can, eventually,
remove in-kernel gadget drivers.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use new usb_gstrings_attach interface
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
This is required in preparation for using usb_gstrings_attach.
The rndis initialization so far has been performed on the first
occurence of rndis_bind(), but the condition to check it (first
or not first) was "borrowed" from strings handling.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use new interface so old one can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
compatibility
Converting rndis to the new function interface requires converting
the USB rndis' function code and its users.
This patch converts the f_rndis.c to the new function interface.
The file is now compiled into a separate usb_f_rndis.ko module.
The old function interface is provided by means of a preprocessor
conditional directives. After all users are converted, the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_subset learns about configfs so we can, eventually,
remove in-kernel gadget drivers.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use the new usb_gstrings_attach interface.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
teach ethernet code about the new interface of f_subset so
the old one can eventually be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
compatibility
Converting ecm subset to the new function interface requires converting
the USB subset's function code and its users.
This patch converts the f_subset.c to the new function interface.
The file is now compiled into a separate usb_f_subset.ko module.
The old function interface is provided by means of a preprocessor
conditional directives. After all users are converted, the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
cleanup only, no functional changes.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_eem learns about our configfs interface so we
can remove in-kernel gadget drivers in future.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use the new usb_gstrings_attach interface
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
There are no old function interface users left, so the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use new interface so old one can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
compatibility
Converting eem to the new function interface requires converting
the USB eem's function code and its users.
This patch converts the f_eem.c to the new function interface.
The file is now compiled into a separate usb_f_eem.ko module.
The old function interface is provided by means of a preprocessor
conditional directives. After all users are converted, the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
moving to new interface so we can remove the older one.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
helper function to copy MAC address to proper place.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_phonet learns about configfs so we can remove
in-kernel gadget drivers.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
this will let us deprecate (and remove) the old interface.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
There are no old function interface users left, so the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use the new interface which will allow us to deprecate the
legacy way of binding functions.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
compatibility
Converting f_phonet to the new function interface requires converting
the f_phonet's function code and its users.
This patch converts the f_phonet.c to the new function interface.
The file is now compiled into a separate usb_f_phonet.ko module.
The old function interface is provided by means of preprocessor
conditional directives. After all users are converted, the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
cleanup patch only in preparation for configfs.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
There are no old function interface users left, so the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
preparation to use configfs-based approach on g_nokia.ko
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use the new usb_gstrings_attach interface
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_ecm learns about our new configfs-based binding.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
use the new usb_gstrings_attach interface.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
f_ecm has been converted to new configfs infrastructure,
fixing cdc2 gadget driver.
[ balbi@ti.com : fixed a bunch of errors when adding ECM function ]
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
compatibility
Converting ecm to the new function interface requires converting
the USB ecm's function code and its users.
This patch converts the f_ecm.c to the new function interface.
The file is now compiled into a separate usb_f_ecm.ko module.
The old function interface is provided by means of a preprocessor
conditional directives. After all users are converted, the old interface
can be removed.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
All USB Ethernet functions will have very similar attributes in configfs.
This patch provides helper definitions to ease writing the functions and
reduce source code duplication.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
Add configfs support to the NCM function driver so
that we can, eventually, get rid of kernel-based
gadget drivers.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
|