Age | Commit message (Collapse) | Author |
|
bch_allocator_thread() is calling try_to_freeze(), but that's just an
expensive no-op given the fact that the thread is not marked freezable.
Bucket allocator has to be up and running to the very last stages of the
suspend, as the bcache I/O that's in flight (think of writing an
hibernation image to a swap device served by bcache).
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
bch_writeback_thread() is calling try_to_freeze(), but that's just an
expensive no-op given the fact that the thread is not marked freezable.
I/O helper kthreads, exactly such as the bcache writeback thread, actually
shouldn't be freezable, because they are potentially necessary for
finalizing the image write-out.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Add support for ALC295/ALC3254.
They are simply compatible with ALC225 chip.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The wrong field (xattr) is dumped here due to a copy-and-paste error.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Ezequiel reported that he's facing UBI going into read-only
mode after power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.
A possible trace can look like:
ubi0 warning: ubi_io_read_vid_hdr [ubi]: no VID header found at PEB 2323, only 0xFF bytes
ubi0 warning: ubi_eba_read_leb [ubi]: switch to read-only mode
CPU: 0 PID: 833 Comm: ubiupdatevol Not tainted 4.6.0-rc2-ARCH #4
Hardware name: SAMSUNG ELECTRONICS CO., LTD. 300E4C/300E5C/300E7C/NP300E5C-AD8AR, BIOS P04RAP 10/15/2012
0000000000000286 00000000eba949bd ffff8800c45a7b38 ffffffff8140d841
ffff8801964be000 ffff88018eaa4800 ffff8800c45a7bb8 ffffffffa003abf6
ffffffff850e2ac0 8000000000000163 ffff8801850e2ac0 ffff8801850e2ac0
Call Trace:
[<ffffffff8140d841>] dump_stack+0x63/0x82
[<ffffffffa003abf6>] ubi_eba_read_leb+0x486/0x4a0 [ubi]
[<ffffffffa00453b3>] ubi_check_volume+0x83/0xf0 [ubi]
[<ffffffffa0039d97>] ubi_open_volume+0x177/0x350 [ubi]
[<ffffffffa00375d8>] vol_cdev_open+0x58/0xb0 [ubi]
[<ffffffff8124b08e>] chrdev_open+0xae/0x1d0
[<ffffffff81243bcf>] do_dentry_open+0x1ff/0x300
[<ffffffff8124afe0>] ? cdev_put+0x30/0x30
[<ffffffff81244d36>] vfs_open+0x56/0x60
[<ffffffff812545f4>] path_openat+0x4f4/0x1190
[<ffffffff81256621>] do_filp_open+0x91/0x100
[<ffffffff81263547>] ? __alloc_fd+0xc7/0x190
[<ffffffff812450df>] do_sys_open+0x13f/0x210
[<ffffffff812451ce>] SyS_open+0x1e/0x20
[<ffffffff81a99e32>] entry_SYSCALL_64_fastpath+0x1a/0xa4
UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not a FTL it is
allowed to have mappings to empty PEBs, it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.
The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.
Cc: <stable@vger.kernel.org>
Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Set free_count to zero before walking through ai->erase list
in wl_init().
Found in U-Boot as U-Boot has no workqueue/threads, it immediately
calls erase_worker(), which increase for each erased block
free_count. Without this patch, free_count gets after
this initialized to zero in wl_init(), so the free_count
variable always has the maybe wrong value 0 in U-Boot.
Signed-off-by: Heiko Schocher <hs@denx.de>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
My static checker complains that "val" is uninitialized when kstrtoint()
fails.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
My static checker says that "err" can be uninitialized if
"vol->reserved_pebs" is <= 0. I don't think that can happen but
returning a literal is cleaner anyway.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Signed-off-by: z00189512 <abc.zhangliang@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Drop this paranoia check from the old days.
If our MTD driver or the flash is so bad that we even cannot
trust it to write data we have bigger problems.
If one really does not trust the flash and wants write-verify
she can enable UBI io checks using debugfs.
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
On serious situations, UBI may detect serious device corruption,
and switch to read-only mode to protect the data and allow debugging.
This commit exposes this ro-mode on sysfs, so it can be obtained
by userspace tools.
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Instead of having two functions for cycling through the E820 map in
order to count to be remapped pages and remap them later, just use one
function with a caller supplied sub-function called for each region to
be processed. This eliminates the possibility of a mismatch between
both loops which showed up in certain configurations.
Suggested-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
Commit ff1e22e7a638 ("xen/events: Mask a moving irq") open-coded
irq_move_irq() but left out checking if the IRQ is disabled. This broke
resuming from suspend since it tries to move a (disabled) irq without
holding the IRQ's desc->lock. Fix it by adding in a check for disabled
IRQs.
The resulting stacktrace was:
kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31!
invalid opcode: 0000 [#1] SMP
Modules linked in: xenfs xen_privcmd ...
CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016
task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000
RIP: 0010:[<ffffffff810e26e2>] [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0
RSP: 0018:ffff88003d7bfc50 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8
RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000
R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0
R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff
FS: 0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0
ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009
0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66
Call Trace:
[<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0
[<ffffffff814c8f66>] __startup_pirq+0xe6/0x150
[<ffffffff814ca659>] xen_irq_resume+0x319/0x360
[<ffffffff814c7e75>] xen_suspend+0xb5/0x180
[<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0
[<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80
[<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140
[<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220
[<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160
[<ffffffff810a3830>] ? sort_range+0x30/0x30
[<ffffffff810a0588>] kthread+0xd8/0xf0
[<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
[<ffffffff8182568f>] ret_from_fork+0x3f/0x70
[<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
b4ff8389ed14 is incomplete: relies on nr_legacy_irqs() to get the number
of legacy interrupts when actually nr_legacy_irqs() returns 0 after
probe_8259A(). Use NR_IRQS_LEGACY instead.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: stable@vger.kernel.org
|
|
The XEN UEFI code has become available on the ARM architecture
recently, but now causes a link-time warning:
ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
This seems harmless, because the efi code only uses 2-byte
characters when interacting with EFI, so we don't pass on those
strings to elsewhere in the system, and we just need to
silence the warning.
It is not clear to me whether we actually need to build the file
with the -fshort-wchar flag, but if we do, then we should also
pass --no-wchar-size-warning to the linker, to avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Fixes: 37060935dc04 ("ARM64: XEN: Add a function to initialize Xen specific UEFI runtime services")
|
|
IOCTL_GNTDEV_GRANT_COPY batches copy operations to reduce the number
of hypercalls. The stack is used to avoid a memory allocation in a
hot path. However, a batch size of 24 requires more than 1024 bytes of
stack which in some configurations causes a compiler warning.
xen/gntdev.c: In function ‘gntdev_ioctl_grant_copy’:
xen/gntdev.c:949:1: warning: the frame size of 1248 bytes is
larger than 1024 bytes [-Wframe-larger-than=]
This is a harmless warning as there is still plenty of stack spare,
but people keep trying to "fix" it. Reduce the batch size to 16 to
reduce stack usage to less than 1024 bytes. This should have minimal
impact on performance.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
|
On slow platforms with unreliable TSC, such as QEMU emulated machines,
it is possible for the kernel to request the next event in the past. In
that case, in the current implementation of xen_vcpuop_clockevent, we
simply return -ETIME. To be precise the Xen returns -ETIME and we pass
it on. However the result of this is a missed event, which simply causes
the kernel to hang.
Instead it is better to always ask the hypervisor for a timer event,
even if the timeout is in the past. That way there are no lost
interrupts and the kernel survives. To do that, remove the
VCPU_SSHOTTMR_future flag.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com>
|
|
Useful when tracing nested setups where the guest may trigger more than
the host usually does. But even some typical host exits were missing.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Specifically the change from hex to decimal helps correlating events.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These were supposed to be a bitwise operation but there is a typo.
The result is mostly harmless, but sparse correctly complains.
Fixes: 44a95dae1d22 ('KVM: x86: Detect and Initialize AVIC support')
Fixes: 18f40c53e10f ('svm: Add VMEXIT handlers for AVIC')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next
KVM/ARM Changes for v4.7 take 2
"The GIC is dead; Long live the GIC"
This set of changes include the new vgic, which is a reimplementation of
our horribly broken legacy vgic implementation. The two implementations
will live side-by-side (with the new being the configured default) for
one kernel release and then we'll remove it.
Also fixes a non-critical issue with virtual abort injection to guests.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements from Arnaldo Carvalho de Melo:
User visible changes:
- Add "srcline_from" and "srcline_to" branch sort keys to 'perf top' and
'perf report' (Andi Kleen)
Infrastructure changes:
- Make 'perf trace' auto-attach fd->name and ptr->name beautifiers based
on the name of syscall arguments, this way new syscalls that have
'const char * (path,pathname,filename)' will use the fd->name beautifier
(vfs_getname perf probe, if in place) and the 'fd->name' (vfs_getname
or via /proc/PID/fd/) (Arnaldo Carvalho de Melo)
- Infrastructure to read from a ring buffer in backward write mode (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
A recent addition to the DRM tree for 4.7 added 'extern "C"' guards
for c++ to all the DRM headers, and that now causes warnings
in 'make headers_check':
usr/include/drm/amdgpu_drm.h:38: userspace cannot reference function or variable defined in the kernel
usr/include/drm/drm.h:63: userspace cannot reference function or variable defined in the kernel
usr/include/drm/drm.h:699: userspace cannot reference function or variable defined in the kernel
usr/include/drm/drm_fourcc.h:30: userspace cannot reference function or variable defined in the kernel
usr/include/drm/drm_mode.h:33: userspace cannot reference function or variable defined in the kernel
usr/include/drm/drm_sarea.h:38: userspace cannot reference function or variable defined in the kernel
usr/include/drm/exynos_drm.h:21: userspace cannot reference function or variable defined in the kernel
usr/include/drm/i810_drm.h:7: userspace cannot reference function or variable defined in the kernel
This changes the headers_check.pl script to not warn about this.
I'm listing the merge commit as introducing the problem, because
there are several patches in this branch that each do this for
one file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 7c10ddf87472 ("Merge branch 'drm-uapi-extern-c-fixes' of https://github.com/evelikov/linux into drm-next")
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Merge yet more updates from Andrew Morton:
- Oleg's "wait/ptrace: assume __WALL if the child is traced". It's a
kernel-based workaround for existing userspace issues.
- A few hotfixes
- befs cleanups
- nilfs2 updates
- sys_wait() changes
- kexec updates
- kdump
- scripts/gdb updates
- the last of the MM queue
- a few other misc things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (84 commits)
kgdb: depends on VT
drm/amdgpu: make amdgpu_mn_get wait for mmap_sem killable
drm/radeon: make radeon_mn_get wait for mmap_sem killable
drm/i915: make i915_gem_mmap_ioctl wait for mmap_sem killable
uprobes: wait for mmap_sem for write killable
prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable
exec: make exec path waiting for mmap_sem killable
aio: make aio_setup_ring killable
coredump: make coredump_wait wait for mmap_sem for write killable
vdso: make arch_setup_additional_pages wait for mmap_sem for write killable
ipc, shm: make shmem attach/detach wait for mmap_sem killable
mm, fork: make dup_mmap wait for mmap_sem for write killable
mm, proc: make clear_refs killable
mm: make vm_brk killable
mm, elf: handle vm_brk error
mm, aout: handle vm_brk failures
mm: make vm_munmap killable
mm: make vm_mmap killable
mm: make mmap_sem for write waits killable for mm syscalls
MAINTAINERS: add co-maintainer for scripts/gdb
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This update for Kselftest adds:
- a new ftrace testcase
- fixes for ftrace and intel_pstate tests"
* tag 'linux-kselftest-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
tools: testing: define the _GNU_SOURCE macro
kselftests/ftrace: Add a test case for event pid filtering
kselftests/ftrace: Detect tracefs mount point
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Reviewing the selftest I recently submitted, I realize that the second
part of it uses my old hack to get the PID of the spawned background
tasks, which doesn't work for all shells, instead of the common use of
$!"
* tag 'trace-v4.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftracetest: Use proper logic to find process PID
|
|
Pull arch/tile updates from Chris Metcalf:
"This is an even quieter cycle than usual"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
Fix typo
Fix typo
Fix typo
tile: sort the "select" lines in the TILE/TILEGX configs
tile: clarify barrier semantics of atomic_add_return
tile/defconfigs: Remove CONFIG_IPV6_PRIVACY
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata sata_dwc_460ex updates from Tejun Heo:
"Patches to bring sata_dwc_460ex up to snuff.
It was a separate pull request because it depends on dmaengine dw
platform changes which are now in mainline"
* 'for-4.7-dw' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (24 commits)
ata: dwc: add DMADEVICES dependency
powerpc/4xx: Device tree update for the 460ex DWC SATA
ata: sata_dwc_460ex: make debug messages neat
ata: sata_dwc_460ex: supply physical address of FIFO to DMA
ata: sata_dwc_460ex: use devm_ioremap
ata: sata_dwc_460ex: tidy up sata_dwc_clear_dmacr()
ata: sata_dwc_460ex: use readl/writel_relaxed()
ata: sata_dwc_460ex: switch to new dmaengine_terminate_* API
ata: sata_dwc_460ex: add __iomem to register base pointer
ata: sata_dwc_460ex: get rid of incorrect cast
ata: sata_dwc_460ex: get rid of some pointless casts
ata: sata_dwc_460ex: remove empty libata callback
ata: sata_dwc_460ex: correct HOSTDEV{P}_FROM_*() macros
ata: sata_dwc_460ex: get rid of global data
ata: sata_dwc_460ex: add phy support
ata: sata_dwc_460ex: use "dmas" DT property to find dma channel
ata: sata_dwc_460ex: don't call ata_sff_qc_issue() on DMA commands
ata: sata_dwc_460ex: skip dma setup for non-dma commands
ata: sata_dwc_460ex: select only core part of DMA driver
ata: sata_dwc_460ex: DMA is always a flow controller
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata ZAC support from Tejun Heo:
"This contains Zone ATA Command support for Shingled Magnetic Recording
devices.
In addition to sending the new commands down to the device, as ZAC
commands depend on getting a lot of responses from the device, piping
up responses is beefed up too. However, it doesn't involve changes to
libata core mechanism or its interaction with upper layers, so I'm not
expecting too many fallouts.
Kudos to Hannes for driving SMR support"
* 'for-4.7-zac' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (28 commits)
libata: support host-aware and host-managed ZAC devices
libata: support device-managed ZAC devices
libata: NCQ encapsulation for ZAC MANAGEMENT OUT
libata: Implement ZBC OUT translation
libata: implement ZBC IN translation
libata: fixup ZAC device disabling
libata-scsi: Generate sense code for disabled devices
libata-trace: decode subcommands
libata: Check log page directory before accessing pages
libata: Add command definitions for NCQ Encapsulation for READ LOG DMA EXT
libata: Separate out ata_dev_config_ncq_send_recv()
libata/libsas: Define ATA_CMD_NCQ_NON_DATA
libsas: enable FPDMA SEND/RECEIVE
libata: do not attempt to retrieve sense code twice
libata-scsi: Set information sense field for invalid parameter
libata-scsi: set bit pointer for sense code information
libata-scsi: Set field pointer in sense code
scsi: add scsi_set_sense_field_pointer()
libata: Implement control mode page to select sense format
libata-scsi: generate correct ATA pass-through sense
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull more security subsystem updates from James Morris:
"Minor updates for the keys code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
MAINTAINERS: Update keyrings record and add asymmetric keys record
lib: asn1_decoder - add MODULE_LICENSE("GPL")
KEYS: The PKCS#7 test key type should use the secondary keyring
|
|
With VT=n, the kernel build fails with:
drivers/built-in.o: In function `kgdboc_pre_exp_handler':
kgdboc.c:(.text+0x7b5aa): undefined reference to `fg_console'
kgdboc.c:(.text+0x7b5ce): undefined reference to `vc_cons'
kgdboc.c:(.text+0x7b5d5): undefined reference to `vc_cons'
kgdboc.o is built when KGDB_SERIAL_CONSOLE is set. So make
KGDB_SERIAL_CONSOLE depend on HW_CONSOLE which includes those symbols.
Link: http://lkml.kernel.org/r/1459412955-4696-1-git-send-email-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: "Jim Davis" <jim.epost@gmail.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
amdgpu_mn_get which is called during ioct path relies on mmap_sem for
write. If the waiting task gets killed by the oom killer it would block
oom_reaper from asynchronous address space reclaim and reduce the
chances of timely OOM resolving. Wait for the lock in the killable mode
and return with EINTR if the task got killed while waiting.
[arnd@arndb.de: use ERR_PTR() to return from amdgpu_mn_get]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
radeon_mn_get which is called during ioct path relies on mmap_sem for
write. If the waiting task gets killed by the oom killer it would block
oom_reaper from asynchronous address space reclaim and reduce the
chances of timely OOM resolving. Wait for the lock in the killable mode
and return with EINTR if the task got killed while waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
i915_gem_mmap_ioctl relies on mmap_sem for write. If the waiting task
gets killed by the oom killer it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving. Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
xol_add_vma needs mmap_sem for write. If the waiting task gets killed
by the oom killer it would block oom_reaper from asynchronous address
space reclaim and reduce the chances of timely OOM resolving. Wait for
the lock in the killable mode and return with EINTR if the task got
killed while waiting.
Do not warn in dup_xol_work if __create_xol_area failed due to fatal
signal pending because this is usually considered a kernel issue.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
PR_SET_THP_DISABLE requires mmap_sem for write. If the waiting task
gets killed by the oom killer it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving. Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
setup_arg_pages requires mmap_sem for write. If the waiting task gets
killed by the oom killer it would block oom_reaper from asynchronous
address space reclaim and reduce the chances of timely OOM resolving.
Wait for the lock in the killable mode and return with EINTR if the task
got killed while waiting. All the callers are already handling error
path and the fatal signal doesn't need any additional treatment.
The same applies to __bprm_mm_init.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
aio_setup_ring waits for mmap_sem in writable mode. If the waiting task
gets killed by the oom killer it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving. Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting. This will also expedite the
return to the userspace and do_exit.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Benamin LaHaise <bcrl@kvack.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
coredump_wait waits for mmap_sem for write currently which can prevent
oom_reaper to reclaim the oom victims address space asynchronously
because that requires mmap_sem for read. This might happen if the oom
victim is multi threaded and some thread(s) is holding mmap_sem for read
(e.g. page fault) and it is stuck in the page allocator while other
thread(s) reached coredump_wait already.
This patch simply uses down_write_killable and bails out with EINTR if
the lock got interrupted by the fatal signal. do_coredump will return
right away and do_group_exit will take care to zap the whole thread
group.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
most architectures are relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
shmat and shmdt rely on mmap_sem for write. If the waiting task gets
killed by the oom killer it would block oom_reaper from asynchronous
address space reclaim and reduce the chances of timely OOM resolving.
Wait for the lock in the killable mode and return with EINTR if the task
got killed while waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
dup_mmap needs to lock current's mm mmap_sem for write. If the waiting
task gets killed by the oom killer it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving. Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
CLEAR_REFS_MM_HIWATER_RSS and CLEAR_REFS_SOFT_DIRTY are relying on
mmap_sem for write. If the waiting task gets killed by the oom killer
and it would operate on the current's mm it would block oom_reaper from
asynchronous address space reclaim and reduce the chances of timely OOM
resolving. Wait for the lock in the killable mode and return with EINTR
if the task got killed while waiting. This will also expedite the
return to the userspace and do_exit even if the mm is remote.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Petr Cermak <petrcermak@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Now that all the callers handle vm_brk failure we can change it wait for
mmap_sem killable to help oom_reaper to not get blocked just because
vm_brk gets blocked behind mmap_sem readers.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
load_elf_library doesn't handle vm_brk failure although nothing really
indicates it cannot do that because the function is allowed to fail due
to vm_mmap failures already. This might be not a problem now but later
patch will make vm_brk killable (resp. mmap_sem for write waiting will
become killable) and so the failure will be more probable.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
vm_brk is allowed to fail but load_aout_binary simply ignores the error
and happily continues. I haven't noticed any problem from that in real
life but later patches will make the failure more likely because vm_brk
will become killable (resp. mmap_sem for write waiting will become
killable) so we should be more careful now.
The error handling should be quite straightforward because there are
calls to vm_mmap which check the error properly already. The only
notable exception is set_brk which is called after beyond_if label. But
nothing indicates that we cannot move it above set_binfmt as the two do
not depend on each other and fail before we do set_binfmt and alter
reference counting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Almost all current users of vm_munmap are ignoring the return value and
so they do not handle potential error. This means that some VMAs might
stay behind. This patch doesn't try to solve those potential problems.
Quite contrary it adds a new failure mode by using down_write_killable
in vm_munmap. This should be safer than other failure modes, though,
because the process is guaranteed to die as soon as it leaves the kernel
and exit_mmap will clean the whole address space.
This will help in the OOM conditions when the oom victim might be stuck
waiting for the mmap_sem for write which in turn can block oom_reaper
which relies on the mmap_sem for read to make a forward progress and
reclaim the address space of the victim.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
All the callers of vm_mmap seem to check for the failure already and
bail out in one way or another on the error which means that we can
change it to use killable version of vm_mmap_pgoff and return -EINTR if
the current task gets killed while waiting for mmap_sem. This also
means that vm_mmap_pgoff can be killable by default and drop the
additional parameter.
This will help in the OOM conditions when the oom victim might be stuck
waiting for the mmap_sem for write which in turn can block oom_reaper
which relies on the mmap_sem for read to make a forward progress and
reclaim the address space of the victim.
Please note that load_elf_binary is ignoring vm_mmap error for
current->personality & MMAP_PAGE_ZERO case but that shouldn't be a
problem because the address is not used anywhere and we never return to
the userspace if we got killed.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is a follow up work for oom_reaper [1]. As the async OOM killing
depends on oom_sem for read we would really appreciate if a holder for
write didn't stood in the way. This patchset is changing many of
down_write calls to be killable to help those cases when the writer is
blocked and waiting for readers to release the lock and so help
__oom_reap_task to process the oom victim.
Most of the patches are really trivial because the lock is help from a
shallow syscall paths where we can return EINTR trivially and allow the
current task to die (note that EINTR will never get to the userspace as
the task has fatal signal pending). Others seem to be easy as well as
the callers are already handling fatal errors and bail and return to
userspace which should be sufficient to handle the failure gracefully.
I am not familiar with all those code paths so a deeper review is really
appreciated.
As this work is touching more areas which are not directly connected I
have tried to keep the CC list as small as possible and people who I
believed would be familiar are CCed only to the specific patches (all
should have received the cover though).
This patchset is based on linux-next and it depends on
down_write_killable for rw_semaphores which got merged into tip
locking/rwsem branch and it is merged into this next tree. I guess it
would be easiest to route these patches via mmotm because of the
dependency on the tip tree but if respective maintainers prefer other
way I have no objections.
I haven't covered all the mmap_write(mm->mmap_sem) instances here
$ git grep "down_write(.*\<mmap_sem\>)" next/master | wc -l
98
$ git grep "down_write(.*\<mmap_sem\>)" | wc -l
62
I have tried to cover those which should be relatively easy to review in
this series because this alone should be a nice improvement. Other
places can be changed on top.
[0] http://lkml.kernel.org/r/1456752417-9626-1-git-send-email-mhocko@kernel.org
[1] http://lkml.kernel.org/r/1452094975-551-1-git-send-email-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1456750705-7141-1-git-send-email-mhocko@kernel.org
This patch (of 18):
This is the first step in making mmap_sem write waiters killable. It
focuses on the trivial ones which are taking the lock early after
entering the syscall and they are not changing state before.
Therefore it is very easy to change them to use down_write_killable and
immediately return with -EINTR. This will allow the waiter to pass away
without blocking the mmap_sem which might be required to make a forward
progress. E.g. the oom reaper will need the lock for reading to
dismantle the OOM victim address space.
The only tricky function in this patch is vm_mmap_pgoff which has many
call sites via vm_mmap. To reduce the risk keep vm_mmap with the
original non-killable semantic for now.
vm_munmap callers do not bother checking the return value so open code
it into the munmap syscall path for now for simplicity.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add myself as a co-maintainer for scripts/gdb supporting Jan Kizka
Link: http://lkml.kernel.org/r/fb5d34ce563f33d2f324f26f592b24ded30032ee.1462865983.git.jan.kiszka@siemens.com
Signed-off-by: Kieran Bingham <kieran@bingham.xyz>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|