Age | Commit message (Collapse) | Author |
|
Nothing is using its return value so change it to return void.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix to return a negative error code -ENOMEM from kcalloc() error
handling case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
0day found a boot warning triggered in rcu_perf_writer() on !SMP kernel:
WARN_ON(rcu_gp_is_normal() && gp_exp);
, the root cause of which is trying to measure expedited grace
periods(by setting gp_exp to true by default) when all the grace periods
are normal(TINY RCU only has normal grace periods).
However, such a mis-setting would only result in failing to measure the
performance for a specific kind of grace periods, therefore using a
WARN_ON to check this is a little overkilling. We could handle this
inside rcuperf module via some error messages to tell users about the
mis-settings.
Therefore this patch removes the WARN_ON in rcu_perf_writer() and
handles those checkings in rcu_perf_init() with plain if() code.
Moreover, this patch changes the default value of gp_exp to 1) align
with rcutorture tests and 2) make the default setting work for all RCU
implementations by default.
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Fixes: http://lkml.kernel.org/r/57411b10.mFvG0+AgcrMXGtcj%fengguang.wu@intel.com
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
If the whole system has only one cpu, that cpu won't be able to be
offlined, so there is no need onoff task is stil running.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This commit breaks torture_online() and torture_offline() out of
torture_onoff() in preparation for allowing waketorture finer-grained
control of its CPU-hotplug workload.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This commit removes CONFIG_RCU_TORTURE_TEST_RUNNABLE in favor of the
already-existing rcutorture.torture_runnable kernel boot parameter.
It also converts an #ifdef into IS_ENABLED(), saving a few lines of code.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This commit applies the infamous IS_ENABLED() macro to eliminate a #ifdef.
It also eliminates the RCU_PERF_TEST_RUNNABLE Kconfig option in favor
of the already-existing rcuperf.perf_runnable kernel boot parameter.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
People have been having some difficulty finding their way around the
RCU code. This commit therefore pulls some of the expedited grace-period
code from tree_plugin.h to a new tree_exp.h file. This commit is strictly
code movement.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
People have been having some difficulty finding their way around the
RCU code. This commit therefore pulls some of the expedited grace-period
code from tree.c to a new tree_exp.h file. This commit is strictly code
movement, with the exception of a forward declaration that was added
for the sync_sched_exp_online_cleanup() function.
A subsequent commit will move the remaining expedited grace-period code
from tree_plugin.h to tree_exp.h.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
I think you'll find this condition is superfluous, as the whole function
is under #ifdef of that same.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
In the past, RCU grace-period initialization excluded CPU-hotplug
operations, but this is no longer the case. This commit therefore
removed an outdated comment in rcu_gp_init() claiming that these
are excluded.
Reported-by: Lihao Liang <lihao.liang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The comment header for rcu_scheduler_active states that it is used
to optimize synchronize_sched() at early boot. This is incorrect.
The synchronize_sched() function instead checks the number of online
CPUs. This commit therefore replaces the comment's synchronize_sched()
with synchronize_rcu(), which really does use rcu_scheduler_active for
this purpose.
Reported-by: Lihao Liang <lihao.liang@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
When RET_TRACE triggers, a tracer may change a syscall into something that
should be filtered by seccomp. This re-runs seccomp after a trace event
to make sure things continue to pass.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
|
|
Since nothing is using the 2-phase API, and it adds more complexity than
benefit, remove it.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
|
|
Currently, if arch code wants to supply seccomp_data directly to
seccomp (which is generally much faster than having seccomp do it
using the syscall_get_xyz() API), it has to use the two-phase
seccomp hooks. Add it to the easy hooks, too.
Cc: linux-arch@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
processing sysrq-w
Lengthy output of sysrq-w may take a lot of time on slow serial console.
Currently we reset NMI-watchdog on the current CPU to avoid spurious
lockup messages. Sometimes this doesn't work since softlockup watchdog
might trigger on another CPU which is waiting for an IPI to proceed.
We reset softlockup watchdogs on all CPUs, but we do this only after
listing all tasks, and this may be too late on a busy system.
So, reset watchdogs CPUs earlier, in for_each_process_thread() loop.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1465474805-14641-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
I see a hang when enabling sched events:
echo 1 > /sys/kernel/debug/tracing/events/sched/enable
The printk buffer shows:
BUG: spinlock recursion on CPU#1, swapper/1/0
lock: 0xffff88007d5d8c00, .magic: dead4ead, .owner: swapper/1/0, .owner_cpu: 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
...
Call Trace:
<IRQ> [<ffffffff8143d663>] dump_stack+0x85/0xc2
[<ffffffff81115948>] spin_dump+0x78/0xc0
[<ffffffff81115aea>] do_raw_spin_lock+0x11a/0x150
[<ffffffff81891471>] _raw_spin_lock+0x61/0x80
[<ffffffff810e5466>] ? try_to_wake_up+0x256/0x4e0
[<ffffffff810e5466>] try_to_wake_up+0x256/0x4e0
[<ffffffff81891a0a>] ? _raw_spin_unlock_irqrestore+0x4a/0x80
[<ffffffff810e5705>] wake_up_process+0x15/0x20
[<ffffffff810cebb4>] insert_work+0x84/0xc0
[<ffffffff810ced7f>] __queue_work+0x18f/0x660
[<ffffffff810cf9a6>] queue_work_on+0x46/0x90
[<ffffffffa00cd95b>] drm_fb_helper_dirty.isra.11+0xcb/0xe0 [drm_kms_helper]
[<ffffffffa00cdac0>] drm_fb_helper_sys_imageblit+0x30/0x40 [drm_kms_helper]
[<ffffffff814babcd>] soft_cursor+0x1ad/0x230
[<ffffffff814ba379>] bit_cursor+0x649/0x680
[<ffffffff814b9d30>] ? update_attr.isra.2+0x90/0x90
[<ffffffff814b5e6a>] fbcon_cursor+0x14a/0x1c0
[<ffffffff81555ef8>] hide_cursor+0x28/0x90
[<ffffffff81558b6f>] vt_console_print+0x3bf/0x3f0
[<ffffffff81122c63>] call_console_drivers.constprop.24+0x183/0x200
[<ffffffff811241f4>] console_unlock+0x3d4/0x610
[<ffffffff811247f5>] vprintk_emit+0x3c5/0x610
[<ffffffff81124bc9>] vprintk_default+0x29/0x40
[<ffffffff811e965b>] printk+0x57/0x73
[<ffffffff810f7a9e>] enqueue_entity+0xc2e/0xc70
[<ffffffff810f7b39>] enqueue_task_fair+0x59/0xab0
[<ffffffff8106dcd9>] ? kvm_sched_clock_read+0x9/0x20
[<ffffffff8103fb39>] ? sched_clock+0x9/0x10
[<ffffffff810e3fcc>] activate_task+0x5c/0xa0
[<ffffffff810e4514>] ttwu_do_activate+0x54/0xb0
[<ffffffff810e5cea>] sched_ttwu_pending+0x7a/0xb0
[<ffffffff810e5e51>] scheduler_ipi+0x61/0x170
[<ffffffff81059e7f>] smp_trace_reschedule_interrupt+0x4f/0x2a0
[<ffffffff81893ba6>] trace_reschedule_interrupt+0x96/0xa0
<EOI> [<ffffffff8106e0d6>] ? native_safe_halt+0x6/0x10
[<ffffffff8110fb1d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81040ac0>] default_idle+0x20/0x1a0
[<ffffffff8104147f>] arch_cpu_idle+0xf/0x20
[<ffffffff81102f8f>] default_idle_call+0x2f/0x50
[<ffffffff8110332e>] cpu_startup_entry+0x37e/0x450
[<ffffffff8105af70>] start_secondary+0x160/0x1a0
Note the hang only occurs when echoing the above from a physical serial
console, not from an ssh session.
The bug is caused by a deadlock where the task is trying to grab the rq
lock twice because printk()'s aren't safe in sched code.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: cb2517653fcc ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
Link: http://lkml.kernel.org/r/20160613073209.gdvdybiruljbkn3p@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
With the modified semantics of spin_unlock_wait() a number of
explicit barriers can be removed. Also update the comment for the
do_exit() usecase, as that was somewhat stale/obscure.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Introduce smp_acquire__after_ctrl_dep(), this construct is not
uncommon, but the lack of this barrier is.
Use it to better express smp_rmb() uses in WRITE_ONCE(), the IPC
semaphore code and the qspinlock code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This new form allows using hardware assisted waiting.
Some hardware (ARM64 and x86) allow monitoring an address for changes,
so by providing a pointer we can use this to replace the cpu_relax()
with hardware optimized methods in the future.
Requested-by: Will Deacon <will.deacon@arm.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The NMI watchdog uses either the fixed cycles or a generic cycles
counter. This causes a lot of conflicts with users of the PMU who want
to run a full group including the cycles fixed counter, for example
the --topdown support recently added to perf stat. The code needs to
fall back to not use groups, which can cause measurement inaccuracy
due to multiplexing errors.
This patch switches the NMI watchdog to use reference cycles
on Intel systems. This is actually more accurate than cycles,
because cycles can tick faster than the measured CPU Frequency
due to Turbo mode.
The ref cycles always tick at their frequency, or slower when
the system is idling. That means the NMI watchdog can never
expire too early, unlike with cycles.
The reference cycles tick roughly at the frequency of the TSC,
so the same period computation can be used.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1465478079-19993-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This patch adds guest steal-time support to full dynticks CPU
time accounting. After the following commit:
ff9a9b4c4334 ("sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity")
... time sampling became jiffy based, even if we do the sampling from the
context tracking code, so steal_account_process_tick() can be reused
to account how many 'ticks' are stolen-time, after the last accumulation.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1465813966-3116-4-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Commit:
e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")
... set rq->prev_* to 0 after a CPU hotplug comes back, in order to
fix the case where (after CPU hotplug) steal time is smaller than
rq->prev_steal_time.
However, this should never happen. Steal time was only smaller because of the
KVM-specific bug fixed by the previous patch. Worse, the previous patch
triggers a bug on CPU hot-unplug/plug operation: because
rq->prev_steal_time is cleared, all of the CPU's past steal time will be
accounted again on hot-plug.
Since the root cause has been fixed, we can just revert commit e9532e69b8d1.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 'commit e9532e69b8d1 ("sched/cputime: Fix steal time accounting vs. CPU hotplug")'
Link: http://lkml.kernel.org/r/1465813966-3116-3-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Chris Wilson reported a divide by 0 at:
post_init_entity_util_avg():
> 725 if (cfs_rq->avg.util_avg != 0) {
> 726 sa->util_avg = cfs_rq->avg.util_avg * se->load.weight;
> -> 727 sa->util_avg /= (cfs_rq->avg.load_avg + 1);
> 728
> 729 if (sa->util_avg > cap)
> 730 sa->util_avg = cap;
> 731 } else {
Which given the lack of serialization, and the code generated from
update_cfs_rq_load_avg() is entirely possible:
if (atomic_long_read(&cfs_rq->removed_load_avg)) {
s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
sa->load_avg = max_t(long, sa->load_avg - r, 0);
sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
removed_load = 1;
}
turns into:
ffffffff81087064: 49 8b 85 98 00 00 00 mov 0x98(%r13),%rax
ffffffff8108706b: 48 85 c0 test %rax,%rax
ffffffff8108706e: 74 40 je ffffffff810870b0
ffffffff81087070: 4c 89 f8 mov %r15,%rax
ffffffff81087073: 49 87 85 98 00 00 00 xchg %rax,0x98(%r13)
ffffffff8108707a: 49 29 45 70 sub %rax,0x70(%r13)
ffffffff8108707e: 4c 89 f9 mov %r15,%rcx
ffffffff81087081: bb 01 00 00 00 mov $0x1,%ebx
ffffffff81087086: 49 83 7d 70 00 cmpq $0x0,0x70(%r13)
ffffffff8108708b: 49 0f 49 4d 70 cmovns 0x70(%r13),%rcx
Which you'll note ends up with 'sa->load_avg - r' in memory at
ffffffff8108707a.
By calling post_init_entity_util_avg() under rq->lock we're sure to be
fully serialized against PELT updates and cannot observe intermediate
state like this.
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yuyang Du <yuyang.du@intel.com>
Cc: bsegall@google.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Fixes: 2b8c41daba32 ("sched/fair: Initiate a new task's util avg to a bounded value")
Link: http://lkml.kernel.org/r/20160609130750.GQ30909@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Kasan causes the compiler to instrument C code and is used at runtime to
detect accesses to memory that has been freed, or not yet allocated.
The code in snapshot.c saves and restores memory when hibernating. This will
access whole pages in the slab cache that have both free and allocated
areas, resulting in a large number of false positives from Kasan.
Disable instrumentation of this file.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
First drop of irqchip updates for 4.8 from Marc Zyngier:
- Fix a few bugs in configuring the default trigger from the irqdomain layer
- Make the genirq layer PM aware
- Add PM capability to the ARM GIC driver
- Add support for 2-level translation tables to the GICv3 ITS driver
|
|
Some IRQ chips may be located in a power domain outside of the CPU
subsystem and hence will require device specific runtime power
management. In order to support such IRQ chips, add a pointer for a
device structure to the irq_chip structure, and if this pointer is
populated by the IRQ chip driver and CONFIG_PM is selected in the kernel
configuration, then the pm_runtime_get/put APIs for this chip will be
called when an IRQ is requested/freed, respectively.
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Some IRQ chips, such as GPIO controllers or secondary level interrupt
controllers, may require require additional runtime power management
control to ensure they are accessible. For such IRQ chips, it makes sense
to enable the IRQ chip when interrupts are requested and disabled them
again once all interrupts have been freed.
When mapping an IRQ, the IRQ type settings are read and then programmed.
The mapping of the IRQ happens before the IRQ is requested and so the
programming of the type settings occurs before the IRQ is requested. This
is a problem for IRQ chips that require additional power management
control because they may not be accessible yet. Therefore, when mapping
the IRQ, don't program the type settings, just save them and then program
these saved settings when the IRQ is requested (so long as if they are not
overridden via the call to request the IRQ).
Add a stub function for irq_domain_free_irqs() to avoid any compilation
errors when CONFIG_IRQ_DOMAIN_HIERARCHY is not selected.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
As we now do for non-percpu interrupt, perform a lookup of the
interrupt trigger if the user doesn't supply one. The difference
here is that we can only do it at enable time (trigger configuration
can be per-cpu as well).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
For some devices the IRQ trigger type for a device is read from
firmware, such as device-tree. The IRQ trigger type is typically read
when the mapping for IRQ is created, which is before the IRQ is
requested. Hence, the IRQ trigger type is programmed when mapping the
IRQ and not when requesting the IRQ.
Although this works for most cases, in order to support IRQ chips which
require runtime power management, which may not be accessible prior
to requesting the IRQ, it is desirable to look-up the IRQ trigger type
when it is requested. Therefore, if the IRQ trigger type is not
specified when __setup_irq() is called, look-up the saved IRQ trigger
type. This will allow us to defer the programming of the trigger type
from when the IRQ is mapped to when it is actually requested.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
When mapping an IRQ, it is possible that a mapping for the IRQ already
exists. If mapping does exist then there are the following issues with
regard to the handling of the IRQ type settings ...
1. If the domain is part of a hierarchy, then:
a. We do not check that the type settings for the existing mapping
match those of the new mapping.
b. We do not check to see if the type settings have been programmed
yet (and they might not have been) and so we may never set the
type.
2. If the domain is NOT part of a hierarchy, we will overwrite the
current type settings programmed if they are different from the
previous mapping. Please note that irq_create_mapping()
calls irq_find_mapping() to check if a mapping already exists.
Although, it may be unlikely that the type settings for a shared
interrupt would not match, nonetheless we should check for this.
Therefore, to fix this check if a mapping exists (regardless of whether
the domain is part of a hierarchy or not) and if it does then:
1. Return the IRQ number if the type settings match or are not
specified.
2. Program the type settings and return the IRQ number if the type
settings have not been programmed yet.
3. Otherwise if the type setting do not match, then print a warning
and don't return the IRQ number.
Furthermore, add a warning if the type return by irq_domain_translate()
has bits outside the sense mask set and then clear these bits. If these
bits are not cleared then this will cause the comparision of the type
settings for an existing mapping to fail with that of the new mapping
even if the sense bit themselves match. The reason being is that the
existing type settings are read by calling irq_get_trigger_type() which
will clear any bits outside the sense mask. This will allow us to detect
irqchips that are not correctly clearing these bits and fix them.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Merge filesystem stacking fixes from Jann Horn.
* emailed patches from Jann Horn <jannh@google.com>:
sched: panic on corrupted stack end
ecryptfs: forbid opening files without mmap handler
proc: prevent stacking filesystems on top
|
|
Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g. via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).
Just panic directly.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Two scheduler debugging fixes"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/debug: Fix 'schedstats=enable' cmdline option
sched/debug: Fix /proc/sched_debug regression
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A handful of tooling fixes, two PMU driver fixes and a cleanup of
redundant code that addresses a security analyzer false positive"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Remove a redundant check
perf/x86/intel/uncore: Remove SBOX support for Broadwell server
perf ctf: Convert invalid chars in a string before set value
perf record: Fix crash when kptr is restricted
perf symbols: Check kptr_restrict for root
perf/x86/intel/rapl: Fix pmus free during cleanup
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Misc fixes:
- a file-based futex fix
- one more spin_unlock_wait() fix
- a ww-mutex deadlock detection improvement/fix
- and a raw_read_seqcount_latch() barrier fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Calculate the futex key based on a tail page for file-based futexes
locking/qspinlock: Fix spin_unlock_wait() some more
locking/ww_mutex: Report recursive ww_mutex locking early
locking/seqcount: Re-fix raw_read_seqcount_latch()
|
|
Pull networking fixes from David Miller:
1) nfnetlink timestamp taken from wrong skb, fix from Florian Westphal.
2) Revert some msleep conversions in rtlwifi as these spots are in
atomic context, from Larry Finger.
3) Validate that NFTA_SET_TABLE attribute is actually specified when we
call nf_tables_getset(). From Phil Turnbull.
4) Don't do mdio_reset in stmmac driver with spinlock held as that can
sleep, from Vincent Palatin.
5) sk_filter() does things other than run a BPF filter, so we should
not elide it's call just because sk->sk_filter is NULL. Fix from
Eric Dumazet.
6) Fix missing backlog updates in several packet schedulers, from Cong
Wang.
7) bnx2x driver should allow VLAN add/remove while the interface is
down, from Michal Schmidt.
8) Several RDS/TCP race fixes from Sowmini Varadhan.
9) fq_codel scheduler doesn't return correct queue length in dumps,
from Eric Dumazet.
10) Fix TCP stats for tail loss probe and early retransmit in ipv6, from
Yuchung Cheng.
11) Properly initialize udp_tunnel_socket_cfg in l2tp_tunnel_create(),
from Guillaume Nault.
12) qfq scheduler leaks SKBs if a kzalloc fails, fix from Florian
Westphal.
13) sock_fprog passed into PACKET_FANOUT_DATA needs compat handling,
from Willem de Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
vmxnet3: segCnt can be 1 for LRO packets
packet: compat support for sock_fprog
stmmac: fix parameter to dwmac4_set_umac_addr()
net/mlx5e: Fix blue flame quota logic
net/mlx5e: Use ndo_stop explicitly at shutdown flow
net/mlx5: E-Switch, always set mc_promisc for allmulti vports
net/mlx5: E-Switch, Modify node guid on vf set MAC
net/mlx5: E-Switch, Fix vport enable flow
net/mlx5: E-Switch, Use the correct error check on returned pointers
net/mlx5: E-Switch, Use the correct free() function
net/mlx5: Fix E-Switch flow steering capabilities check
net/mlx5: Fix flow steering NIC capabilities check
net/mlx5: Fix root flow table update
net/mlx5: Fix MLX5_CMD_OP_MAX to be defined correctly
net/mlx5: Fix masking of reserved bits in XRCD number
net/mlx5: Fix the size of modify QP mailbox
mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name()
mlxsw: spectrum: Make split flow match firmware requirements
wext: Fix 32 bit iwpriv compatibility issue with 64 bit Kernel
cfg80211: remove get/set antenna and tx power warnings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Stable-candidate fixes for the intel_pstate driver and the cpuidle
core.
Specifics:
- Fix two intel_pstate initialization issues, one of which was
introduced during the 4.4 cycle (Srinivas Pandruvada)
- Fix kernel build with CONFIG_UBSAN set and CONFIG_CPU_IDLE unset
(Catalin Marinas)"
* tag 'pm-4.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo
cpufreq: intel_pstate: Fix code ordering in intel_pstate_set_policy()
cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE
|
|
sprintf() and snprintf() implementation of kernel guarantees that
its result is terminated with null byte if size is larger than 0. So we
don't need to call memset() at all.
Signed-off-by: Weongyo Jeong <weongyo.linux@gmail.com>
Link: http://lkml.kernel.org/r/1459451703-5744-1-git-send-email-weongyo.linux@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
for_each_irq_desc() macro has already skipped NULL irq_desc, don't
bother to check it again.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Cc: mingo@kernel.org
Cc: yhlu.kernel@gmail.com
Link: http://lkml.kernel.org/r/1458395959-7046-1-git-send-email-nasa4836@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Only need CONFIG_NO_HZ_COMMON as this block is already in a
CONFIG_SMP block.
Signed-off-by: Pratyush Patel <pratyushpatel.1995@gmail.com>
Link: http://lkml.kernel.org/r/20160301172849.GA18152@cyborg
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Update the usleep_range() function comment to make it clear that it can
only be used in non-atomic context.
Previously we claimed usleep_range() was a drop-in replacement for udelay()
where wakeup is flexible. But that's only true in non-atomic contexts,
where it's possible to sleep instead of delay.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20160531212302.28502.44995.stgit@bhelgaas-glaptop2.roam.corp.google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
* pm-cpufreq-fixes:
cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo
cpufreq: intel_pstate: Fix code ordering in intel_pstate_set_policy()
* pm-cpuidle:
cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE
|
|
When relay_open_buf() fails in relay_open(), code will goto free_bufs,
but chan is nowhere freed.
Link: http://lkml.kernel.org/r/1464777927-19675-1-git-send-email-yizhouzhou@ict.ac.cn
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Instead of overloading the discard support with the REQ_SECURE flag.
Use the opportunity to rename the queue flag as well, and remove the
dead checks for this flag in the RAID 1 and RAID 10 drivers that don't
claim support for secure erase.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Mike Galbraith reported that the LTP test case futex_wake04 was broken
by commit 65d8fc777f6d ("futex: Remove requirement for lock_page()
in get_futex_key()").
This test case uses futexes backed by hugetlbfs pages and so there is an
associated inode with a futex stored on such pages. The problem is that
the key is being calculated based on the head page index of the hugetlbfs
page and not the tail page.
Prior to the optimisation, the page lock was used to stabilise mappings and
pin the inode is file-backed which is overkill. If the page was a compound
page, the head page was automatically looked up as part of the page lock
operation but the tail page index was used to calculate the futex key.
After the optimisation, the compound head is looked up early and the page
lock is only relied upon to identify truncated pages, special pages or a
shmem page moving to swapcache. The head page is looked up because without
the page lock, special care has to be taken to pin the inode correctly.
However, the tail page is still required to calculate the futex key so
this patch records the tail page.
On vanilla 4.6, the output of the test case is;
futex_wake04 0 TINFO : Hugepagesize 2097152
futex_wake04 1 TFAIL : futex_wake04.c:126: Bug: wait_thread2 did not wake after 30 secs.
With the patch applied
futex_wake04 0 TINFO : Hugepagesize 2097152
futex_wake04 1 TPASS : Hi hydra, thread2 awake!
Fixes: 65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160608132522.GM2469@suse.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|