| field | value | date |
|---|---|---|
| author | Preeti U Murthy <preeti@linux.vnet.ibm.com> | 2015-03-30 14:59:19 +0530 |
| committer | Ingo Molnar <mingo@kernel.org> | 2015-04-02 14:25:39 +0200 |
| commit | 345527b1edce8df719e0884500c76832a18211c3 | |
| tree | 386a6b25b2437bd94cf63df6d02d95f729eab7cc /kernel | |
| parent | 9eed56e889d8a0bb7870e1216d8d4326dd63ec50 | |
clockevents: Fix cpu_down() race for hrtimer based broadcasting
A hotplug stress test on POWER showed that the machine hit either
softlockups or rcu_sched stall warnings.  The issue was traced to
commit:
  7cba160ad789 ("powernv/cpuidle: Redesign idle states management")
which exposed the cpu_down() race with hrtimer based broadcast mode:
  5d1638acb9f6 ("tick: Introduce hrtimer based broadcast")
The race is the following:
Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
before it is taken down.
	CPU0					CPU1
	cpu_down()				take_cpu_down()
						disable_interrupts()
	cpu_die()
	while (CPU1 != CPU_DEAD) {
		msleep(100);
		switch_to_idle();
		stop_cpu_timer();
		schedule_broadcast();
	}
	tick_cleanup_cpu_dead()
		take_over_broadcast()
Once CPU1 has disabled interrupts it can no longer handle the
broadcast hrtimer, so CPU0, which relies on that broadcast to wake
up from msleep(), will be stuck forever.
Fix this by explicitly taking over broadcast duty before cpu_die().
This is a temporary workaround. What we really want is a callback
in the clockevent device which allows us to do that from the dying
CPU by pushing the hrtimer onto a different CPU. That might involve
an IPI and is definitely more complex than this immediate fix.
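
The ordering problem described above can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: every name in it (`hotplug_model`, `model_cpu_down`, `broadcast_wakeup_fires`, `check_model`) is hypothetical and not part of the kernel, and the whole broadcast mechanism is reduced to "the wakeup fires only if the CPU owning the broadcast hrtimer still has interrupts enabled".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum { CPU0, CPU1, NR_MODEL_CPUS };

struct hotplug_model {
	int  bc_owner;                 /* CPU servicing the broadcast hrtimer */
	bool irqs_on[NR_MODEL_CPUS];   /* per-CPU interrupt state */
};

/* Assumption: the broadcast wakeup fires only if its owner can take interrupts. */
static bool broadcast_wakeup_fires(const struct hotplug_model *m)
{
	return m->irqs_on[m->bc_owner];
}

/*
 * Model cpu_down() running on CPU0 while CPU1 goes offline.
 * Returns true if CPU0 wakes from its sleep, false if it is stuck.
 */
static bool model_cpu_down(bool pull_broadcast_first)
{
	struct hotplug_model m = {
		.bc_owner = CPU1,              /* CPU1 holds broadcast duty */
		.irqs_on  = { true, true },
	};

	m.irqs_on[CPU1] = false;       /* CPU1: take_cpu_down() disables interrupts */

	if (pull_broadcast_first)
		m.bc_owner = CPU0;     /* the fix: CPU0 takes over broadcast duty */

	/* CPU0: msleep() idles and relies on the broadcast wakeup */
	return broadcast_wakeup_fires(&m);
}

/* Self-check of the two orderings. */
static void check_model(void)
{
	assert(!model_cpu_down(false));  /* original ordering: CPU0 never wakes */
	assert(model_cpu_down(true));    /* duty pulled first: CPU0 wakes */
}
```

Calling `check_model()` exercises both orderings. The model deliberately ignores timing, idle states and the real clockevents code; it only captures why the takeover has to happen before CPU0 starts waiting.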
Changelog was picked up from:
    https://lkml.org/lkml/2015/2/16/213
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: nicolas.pitre@linaro.org
Cc: peterz@infradead.org
Cc: rjw@rjwysocki.net
Fixes: http://linuxppc.10917.n7.nabble.com/offlining-cpus-breakage-td88619.html
Link: http://lkml.kernel.org/r/20150330092410.24979.59887.stgit@preeti.in.ibm.com
[ Merged it to the latest timer tree, renamed the callback, tidied up the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'kernel')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | kernel/cpu.c | 2 |
| -rw-r--r-- | kernel/time/tick-broadcast.c | 19 |
2 files changed, 13 insertions, 8 deletions
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1972b161c61e..af5db20e5803 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -20,6 +20,7 @@
 #include <linux/gfp.h>
 #include <linux/suspend.h>
 #include <linux/lockdep.h>
+#include <linux/tick.h>
 #include <trace/events/power.h>
 
 #include "smpboot.h"
@@ -411,6 +412,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 	while (!idle_cpu(cpu))
 		cpu_relax();
 
+	hotplug_cpu__broadcast_tick_pull(cpu);
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);
 
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 19cfb381faa9..f5e0fd5652dc 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -680,14 +680,19 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
 	clockevents_set_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
 }
 
-static void broadcast_move_bc(int deadcpu)
+void hotplug_cpu__broadcast_tick_pull(int deadcpu)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	struct clock_event_device *bc;
+	unsigned long flags;
 
-	if (!bc || !broadcast_needs_cpu(bc, deadcpu))
-		return;
-	/* This moves the broadcast assignment to this cpu */
-	clockevents_program_event(bc, bc->next_event, 1);
+	raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
+	bc = tick_broadcast_device.evtdev;
+
+	if (bc && broadcast_needs_cpu(bc, deadcpu)) {
+		/* This moves the broadcast assignment to this CPU: */
+		clockevents_program_event(bc, bc->next_event, 1);
+	}
+	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 
 /*
@@ -924,8 +929,6 @@ void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
 	cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
 	cpumask_clear_cpu(cpu, tick_broadcast_force_mask);
 
-	broadcast_move_bc(cpu);
-
 	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }
 
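
The tick-broadcast.c hunk also changes the shape of the helper: since it is no longer called from inside tick_shutdown_broadcast_oneshot(), which already held tick_broadcast_lock, it now takes the lock itself, reads the device pointer under it, and replaces the early return with a positive condition so that the unlock sits on a single exit path. A minimal user-space sketch of that pattern follows. All names (`fake_lock`, `fake_bc_device`, `fake_broadcast_tick_pull`) are hypothetical, the lock is a plain depth counter rather than a real spinlock, and the `broadcast_needs_cpu()` check is assumed here to reduce to a simple ownership comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive lock: only tracks balance. */
struct fake_lock { int depth; };

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(l->depth == 0);   /* would deadlock if already held */
	l->depth++;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->depth == 1);   /* must be held exactly once */
	l->depth--;
}

/* Hypothetical stand-in for the broadcast clock event device. */
struct fake_bc_device {
	int owner_cpu;           /* CPU the broadcast timer is bound to */
	int reprogram_count;     /* how often the event was reprogrammed */
};

static struct fake_lock bc_lock;
static struct fake_bc_device *bc_dev;   /* may be NULL: no broadcast device */

/* Pull broadcast duty from deadcpu to this_cpu, if deadcpu holds it. */
static void fake_broadcast_tick_pull(int deadcpu, int this_cpu)
{
	struct fake_bc_device *bc;

	fake_lock_acquire(&bc_lock);
	bc = bc_dev;             /* read the device pointer under the lock */

	/* Positive condition instead of early return: one unlock path. */
	if (bc && bc->owner_cpu == deadcpu) {
		bc->owner_cpu = this_cpu;
		bc->reprogram_count++;
	}
	fake_lock_release(&bc_lock);
}
```

With no device registered, or when the dying CPU does not own the broadcast, the function takes and releases the lock without touching the device, so repeated calls are harmless in this model.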
