summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-09-02Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: "A small set of updates for core code: - Prevent tracing in functions which are called from trace patching via stop_machine() to prevent executing half patched function trace entries. - Remove old GCC workarounds - Remove pointless includes of notifier.h" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Remove workaround for unreachable warnings from old GCC notifier: Remove notifier header file wherever not used watchdog: Mark watchdog touch functions as notrace
2018-09-02x86/microcode: Update the new microcode revision unconditionallyFilippo Sironi
Handle the case where microcode gets loaded on the BSP's hyperthread sibling first and the boot_cpu_data's microcode revision doesn't get updated because of early exit due to the siblings sharing a microcode engine. For that, simply write the updated revision on all CPUs unconditionally. Signed-off-by: Filippo Sironi <sironi@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: prarit@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1533050970-14385-1-git-send-email-sironi@amazon.de
2018-09-02x86/microcode: Make sure boot_cpu_data.microcode is up-to-datePrarit Bhargava
When preparing an MCE record for logging, boot_cpu_data.microcode is used to read out the microcode revision on the box. However, on systems where late microcode update has happened, the microcode revision output in a MCE log record is wrong because boot_cpu_data.microcode is not updated when the microcode gets updated. But, the microcode revision saved in boot_cpu_data's microcode member should be kept up-to-date, regardless, for consistency. Make it so. Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: sironi@amazon.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180731112739.32338-1-prarit@redhat.com
2018-09-02x86/pti: Fix section mismatch warning/errorRandy Dunlap
Fix the section mismatch warning in arch/x86/mm/pti.c: WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte() The function pti_clone_pgtable() references the function __init pti_user_pagetable_walk_pte(). This is often because pti_clone_pgtable lacks a __init annotation or the annotation of pti_user_pagetable_walk_pte is wrong. FATAL: modpost: Section mismatches detected. Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org
2018-09-02of/platform: initialise AMBA default DMA masksLinus Walleij
This addresses a v4.19-rc1 regression in the PL111 DRM driver in drivers/gpu/pl111/* The driver uses the CMA KMS helpers and will thus at some point call down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory for the framebuffer. It appears that in v4.18, it was OK that this (and other DMA mastering AMBA devices) left dev->coherent_dma_mask blank (zero). In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in dma_alloc_attrs() in include/linux/dma-mapping.h is triggered. The allocation later fails when get_coherent_dma_mask() is called from __dma_alloc() and __dma_alloc() returns NULL: drm-clcd-pl111 dev:20: coherent DMA mask is unset drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR* Failed to set fbdev configuration It turns out that in commit 4d8bde883bfb ("OF: Don't set default coherent DMA mask") the OF core stops setting the default DMA mask on new devices, especially those lines of the patch: - if (!dev->coherent_dma_mask) - dev->coherent_dma_mask = DMA_BIT_MASK(32); Robin Murphy solved a similar problem in a5516219b102 ("of/platform: Initialise default DMA masks") by simply assigning dev.coherent_dma_mask and the dev.dma_mask to point to the same when creating devices from the device tree, and introducing the same code into the code path creating AMBA/PrimeCell devices solved my problem, graphics now come up. The code simply assumes that the device can access all of the system memory by setting the coherent DMA mask to 0xffffffff when creating a device from the device tree, which is crude, but seems to be what kernel v4.18 assumed. The AMBA PrimeCells do not differ between coherent and streaming DMA so we can just assign the same to any DMA mask. Possibly drivers should augment their coherent DMA mask in accordance with "dma-ranges" from the device tree if more finegranular masking is needed. Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 4d8bde883bfb ("OF: Don't set default coherent DMA mask") Cc: Russell King <linux@armlinux.org.uk> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-02sparc: set a default 32-bit dma mask for OF devicesChristoph Hellwig
This keeps the historic default behavior for devices without a DMA mask, but removes the warning about a lacking DMA mask for doing DMA without a mask. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Guenter Roeck <linux@roeck-us.net>
2018-09-01Merge tag 'omap-for-v4.19/fixes-v2-signed' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Fixes for omap variants against v4.19-rc1 These are mostly fixes related to using ti-sysc interconnect target module driver for accessing right register offsets for sgx and cpsw and for no_console_suspend regression. There is also a droid4 emmc fix where emmc may not get detected for some models, and vibrator dts mismerge fix. And we have a file permission fix for am335x-osd3358-sm-red.dts that just got added. And we must tag RTC as system-power-controller for am437x for PMIC to shut down during poweroff. * tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap4-droid4: Fix emmc errors seen on some devices ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts arm: dts: am4372: setup rtc as system-power-controller ARM: dts: omap4-droid4: fix vibrations on Droid 4 bus: ti-sysc: Fix no_console_suspend handling bus: ti-sysc: Fix module register ioremap for larger offsets ARM: OMAP2+: Fix module address for modules using mpu_rt_idx ARM: OMAP2+: Fix null hwmod for ti-sysc debug Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-01ipv6: don't get lwtstate twice in ip6_rt_copy_init()Alexey Kodanev
Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info") partially fixed the kmemleak [1], lwtstate can be copied from fib6_info, with ip6_rt_copy_init(), and it should be done only once there. rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function ip6_rt_copy_init(), so there is no need to get it again at the end. With this patch, lwtstate also isn't copied from RTF_REJECT routes. [1]: unreferenced object 0xffff880b6aaa14e0 (size 64): comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s) hex dump (first 32 bytes): 01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000018664623>] lwtunnel_build_state+0x1bc/0x420 [<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0 [<00000000ee2c5d1f>] ip6_route_add+0x14/0x70 [<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0 [<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0 [<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0 [<000000004c893c76>] netlink_unicast+0x417/0x5a0 [<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30 [<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0 [<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950 [<000000001e7426c8>] __sys_sendmsg+0xde/0x170 [<00000000fe411443>] do_syscall_64+0x9f/0x4a0 [<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<000000006d21f353>] 0xffffffffffffffff Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-01x86/vdso: Fix lsl operand orderSamuel Neves
In the __getcpu function, lsl is using the wrong target and destination registers. Luckily, the compiler tends to choose %eax for both variables, so it has been working so far. Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt
2018-09-01Merge tag 'linux-watchdog-4.19-rc2' of ↵Linus Torvalds
git://www.linux-watchdog.org/linux-watchdog Pull watchdog fixlet from Wim Van Sebroeck: "Document support for r8a774a1" * tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog: dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support
2018-09-01Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two small fixes, one for the x86 Stoney SoC to get a more accurate clk frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: x86: Set default parent to 48Mhz clk: npcm7xx: fix memory allocation
2018-09-01random: make CPU trust a boot parameterKees Cook
Instead of forcing a distro or other system builder to choose at build time whether the CPU is trusted for CRNG seeding via CONFIG_RANDOM_TRUST_CPU, provide a boot-time parameter for end users to control the choice. The CONFIG will set the default state instead. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-09-01kernel/dma/direct: take DMA offset into account in dma_direct_supportedChristoph Hellwig
When a device has a DMA offset the dma capable result will change due to the difference between the physical and DMA address. Take that into account. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-09-01x86/mce: Fix set_mce_nospec() to avoid #GP faultLuckTony
The trick with flipping bit 63 to avoid loading the address of the 1:1 mapping of the poisoned page while the 1:1 map is updated used to work when unmapping the page. But it falls down horribly when attempting to directly set the page as uncacheable. The problem is that when the cache mode is changed to uncachable, the pages needs to be flushed from the cache first. But the decoy address is non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a #GP fault. Add code to change_page_attr_set_clr() to fix the address before calling flush. Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk
2018-08-31ibmvnic: Include missing return code checks in reset functionThomas Falcon
Check the return codes of these functions and halt reset in case of failure. The driver will remain in a dormant state until the next reset event, when device initialization will be re-attempted. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31selftests: pmtu: detect correct binary to ping ipv6 addressesSabrina Dubroca
Some systems don't have the ping6 binary anymore, and use ping for everything. Detect the absence of ping6 and try to use ping instead. Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31selftests: pmtu: maximum MTU for vti4 is 2^16-1-20Sabrina Dubroca
Since commit 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of the mysterious constant 0xFFF8. This makes this selftest fail. Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31tcp: do not restart timewait timer on rst receptionFlorian Westphal
RFC 1337 says: ''Ignore RST segments in TIME-WAIT state. If the 2 minute MSL is enforced, this fix avoids all three hazards.'' So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk expire rather than removing it instantly when a reset is received. However, Linux will also re-start the TIME-WAIT timer. This causes connect to fail when tying to re-use ports or very long delays (until syn retry interval exceeds MSL). packetdrill test case: // Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode. `sysctl net.ipv4.tcp_rfc1337=1` 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 // Receive first segment 0.310 < P. 1:1001(1000) ack 1 win 46 // Send one ACK 0.310 > . 1:1(0) ack 1001 // read 1000 byte 0.310 read(4, ..., 1000) = 1000 // Application writes 100 bytes 0.350 write(4, ..., 100) = 100 0.350 > P. 1:101(100) ack 1001 // ACK 0.500 < . 1001:1001(0) ack 101 win 257 // close the connection 0.600 close(4) = 0 0.600 > F. 101:101(0) ack 1001 win 244 // Our side is in FIN_WAIT_1 & waits for ack to fin 0.7 < . 1001:1001(0) ack 102 win 244 // Our side is in FIN_WAIT_2 with no outstanding data. 0.8 < F. 1001:1001(0) ack 102 win 244 0.8 > . 102:102(0) ack 1002 win 244 // Our side is now in TIME_WAIT state, send ack for fin. 0.9 < F. 1002:1002(0) ack 102 win 244 0.9 > . 102:102(0) ack 1002 win 244 // Peer reopens with in-window SYN: 1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> // Therefore, reply with ACK. 1.000 > . 102:102(0) ack 1002 win 244 // Peer sends RST for this ACK. Normally this RST results // in tw socket removal, but rfc1337=1 setting prevents this. 1.100 < R 1002:1002(0) win 244 // second syn. Due to rfc1337=1 expect another pure ACK. 31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 31.0 > . 102:102(0) ack 1002 win 244 // .. and another RST from peer. 31.1 < R 1002:1002(0) win 244 31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT` // third syn after one minute. Time-Wait socket should have expired by now. 63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> // so we expect a syn-ack & 3whs to proceed from here on. 63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> Without this patch, 'ss' shows restarts of tw timer and last packet is thus just another pure ack, more than one minute later. This restores the original code from commit 283fd6cf0be690a83 ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git . For some reason the else branch was removed/lost in 1f28b683339f7 ("Merge in TCP/UDP optimizations and [..]") and timer restart became unconditional. Reported-by: Michal Tesar <mtesar@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31net/rds: RDS is not Radio Data SystemPavel Machek
Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is easily confused with Radio Data System (which we may want to support in kernel, too). Signed-off-by: Pavel Machek <pavel@ucw.cz> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()Dexuan Cui
This patch fixes the race between netvsc_probe() and rndis_set_subchannel(), which can cause a deadlock. These are the related 3 paths which show the deadlock: path #1: Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus] Call Trace: schedule schedule_preempt_disabled __mutex_lock __device_attach bus_probe_device device_add vmbus_device_register vmbus_onoffer vmbus_onmessage_work process_one_work worker_thread kthread ret_from_fork path #2: schedule schedule_preempt_disabled __mutex_lock netvsc_probe vmbus_probe really_probe __driver_attach bus_for_each_dev driver_attach_async async_run_entry_fn process_one_work worker_thread kthread ret_from_fork path #3: Workqueue: events netvsc_subchan_work [hv_netvsc] Call Trace: schedule rndis_set_subchannel netvsc_subchan_work process_one_work worker_thread kthread ret_from_fork Before path #1 finishes, path #2 can start to run, because just before the "bus_probe_device(dev);" in device_add() in path #1, there is a line "object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can immediately try to load hv_netvsc and hence path #2 can start to run. Next, path #2 offloads the subchannal's initialization to a workqueue, i.e. path #3, so we can end up in a deadlock situation like this: Path #2 gets the device lock, and is trying to get the rtnl lock; Path #3 gets the rtnl lock and is waiting for all the subchannel messages to be processed; Path #1 is trying to get the device lock, but since #2 is not releasing the device lock, path #1 has to sleep; since the VMBus messages are processed one by one, this means the sub-channel messages can't be procedded, so #3 has to sleep with the rtnl lock held, and finally #2 has to sleep... Now all the 3 paths are sleeping and we hit the deadlock. With the patch, we can make sure #2 gets both the device lock and the rtnl lock together, gets its job done, and releases the locks, so #1 and #3 will not be blocked for ever. Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31nfp: wait for posted reconfigs when disabling the deviceJakub Kicinski
To avoid leaking a running timer we need to wait for the posted reconfigs after netdev is unregistered. In common case the process of deinitializing the device will perform synchronous reconfigs which wait for posted requests, but especially with VXLAN ports being actively added and removed there can be a race condition leaving a timer running after adapter structure is freed leading to a crash. Add an explicit flush after deregistering and for a good measure a warning to check if timer is running just before structures are freed. Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31Revert "packet: switch kvzalloc to allocate memory"Eric Dumazet
This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0. mmap()/munmap() can not be backed by kmalloced pages : We fault in : VM_BUG_ON_PAGE(PageSlab(page), page); unmap_single_vma+0x8a/0x110 unmap_vmas+0x4b/0x90 unmap_region+0xc9/0x140 do_munmap+0x274/0x360 vm_munmap+0x81/0xc0 SyS_munmap+0x2b/0x40 do_syscall_64+0x13e/0x1c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: John Sperbeck <jsperbeck@google.com> Bisected-by: John Sperbeck <jsperbeck@google.com> Cc: Zhang Yu <zhangyu31@baidu.com> Cc: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31md-cluster: release RESYNC lock after the last resync messageGuoqing Jiang
All the RESYNC messages are sent with resync lock held, the only exception is resync_finish which releases resync_lockres before send the last resync message, this should be changed as well. Otherwise, we can see deadlock issue as follows: clustermd2-gqjiang2:~ # cat /proc/mdstat Personalities : [raid10] [raid1] md0 : active raid1 sdg[0] sdf[1] 134144 blocks super 1.2 [2/2] [UU] [===================>.] resync = 99.6% (134144/134144) finish=0.0min speed=26K/sec bitmap: 1/1 pages [4KB], 65536KB chunk unused devices: <none> clustermd2-gqjiang2:~ # ps aux|grep md|grep D root 20497 0.0 0.0 0 0 ? D 16:00 0:00 [md0_raid1] clustermd2-gqjiang2:~ # cat /proc/20497/stack [<ffffffffc05ff51e>] dlm_lock_sync+0x8e/0xc0 [md_cluster] [<ffffffffc05ff7e8>] __sendmsg+0x98/0x130 [md_cluster] [<ffffffffc05ff900>] sendmsg+0x20/0x30 [md_cluster] [<ffffffffc05ffc35>] resync_info_update+0xb5/0xc0 [md_cluster] [<ffffffffc0593e84>] md_reap_sync_thread+0x134/0x170 [md_mod] [<ffffffffc059514c>] md_check_recovery+0x28c/0x510 [md_mod] [<ffffffffc060c882>] raid1d+0x42/0x800 [raid1] [<ffffffffc058ab61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9a0a5b3f>] kthread+0xff/0x140 [<ffffffff9a800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # ps aux|grep md|grep D root 20531 0.0 0.0 0 0 ? D 16:00 0:00 [md0_raid1] root 20537 0.0 0.0 0 0 ? D 16:00 0:00 [md0_cluster_rec] root 20676 0.0 0.0 0 0 ? D 16:01 0:00 [md0_resync] clustermd-gqjiang1:~ # cat /proc/mdstat Personalities : [raid10] [raid1] md0 : active raid1 sdf[1] sdg[0] 134144 blocks super 1.2 [2/2] [UU] [===================>.] resync = 97.3% (131072/134144) finish=8076.8min speed=0K/sec bitmap: 1/1 pages [4KB], 65536KB chunk unused devices: <none> clustermd-gqjiang1:~ # cat /proc/20531/stack [<ffffffffc080974d>] metadata_update_start+0xcd/0xd0 [md_cluster] [<ffffffffc079c897>] md_update_sb.part.61+0x97/0x820 [md_mod] [<ffffffffc079f15b>] md_check_recovery+0x29b/0x510 [md_mod] [<ffffffffc0816882>] raid1d+0x42/0x800 [raid1] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # cat /proc/20537/stack [<ffffffffc0813222>] freeze_array+0xf2/0x140 [raid1] [<ffffffffc080a56e>] recv_daemon+0x41e/0x580 [md_cluster] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # cat /proc/20676/stack [<ffffffffc080951e>] dlm_lock_sync+0x8e/0xc0 [md_cluster] [<ffffffffc080957f>] lock_token+0x2f/0xa0 [md_cluster] [<ffffffffc0809622>] lock_comm+0x32/0x90 [md_cluster] [<ffffffffc08098f5>] sendmsg+0x15/0x30 [md_cluster] [<ffffffffc0809c0a>] resync_info_update+0x8a/0xc0 [md_cluster] [<ffffffffc08130ba>] raid1_sync_request+0xa9a/0xb10 [raid1] [<ffffffffc079b8ea>] md_do_sync+0xbaa/0xf90 [md_mod] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-08-31RAID10 BUG_ON in raise_barrier when force is true and conf->barrier is 0Xiao Ni
In raid10 reshape_request it gets max_sectors in read_balance. If the underlayer disks have bad blocks, the max_sectors is less than last. It will call goto read_more many times. It calls raise_barrier(conf, sectors_done != 0) every time. In this condition sectors_done is not 0. So the value passed to the argument force of raise_barrier is true. In raise_barrier it checks conf->barrier when force is true. If force is true and conf->barrier is 0, it panic. In this case reshape_request submits bio to under layer disks. And in the callback function of the bio it calls lower_barrier. If the bio finishes before calling raise_barrier again, it can trigger the BUG_ON. Add one pair of raise_barrier/lower_barrier to fix this bug. Signed-off-by: Xiao Ni <xni@redhat.com> Suggested-by: Neil Brown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-08-31md/raid5-cache: disable reshape completelyShaohua Li
We don't support reshape yet if an array supports log device. Previously we determine the fact by checking ->log. However, ->log could be NULL after a log device is removed, but the array is still marked to support log device. Don't allow reshape in this case too. User can disable log device support by setting 'consistency_policy' to 'resync' then do reshape. Reported-by: Xiao Ni <xni@redhat.com> Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-08-31blkcg: use tryget logic when associating a blkg with a bioDennis Zhou (Facebook)
There is a very small change a bio gets caught up in a really unfortunate race between a task migration, cgroup exiting, and itself trying to associate with a blkg. This is due to css offlining being performed after the css->refcnt is killed which triggers removal of blkgs that reach their blkg->refcnt of 0. To avoid this, association with a blkg should use tryget and fallback to using the root_blkg. Fixes: 08e18eab0c579 ("block: add bi_blkg to the bio for cgroups") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-31blkcg: delay blkg destruction until after writeback has finishedDennis Zhou (Facebook)
Currently, blkcg destruction relies on a sequence of events: 1. Destruction starts. blkcg_css_offline() is called and blkgs release their reference to the blkcg. This immediately destroys the cgwbs (writeback). 2. With blkgs giving up their reference, the blkcg ref count should become zero and eventually call blkcg_css_free() which finally frees the blkcg. Jiufei Xue reported that there is a race between blkcg_bio_issue_check() and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent on the completion of all writeback associated with the blkcg. A count of the number of cgwbs is maintained and once that goes to zero, blkg destruction can follow. This should prevent premature blkg destruction related to writeback. The new process for blkcg cleanup is as follows: 1. Destruction starts. blkcg_css_offline() is called which offlines writeback. Blkg destruction is delayed on the cgwb_refcnt count to avoid punting potentially large amounts of outstanding writeback to root while maintaining any ongoing policies. Here, the base cgwb_refcnt is put back. 2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called and handles destruction of blkgs. This is where the css reference held by each blkg is released. 3. Once the blkcg ref count goes to zero, blkcg_css_free() is called. This finally frees the blkg. It seems in the past blk-throttle didn't do the most understandable things with taking data from a blkg while associating with current. So, the simplification and unification of what blk-throttle is doing caused this. Fixes: 08e18eab0c579 ("block: add bi_blkg to the bio for cgroups") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-31Revert "blk-throttle: fix race between blkcg_bio_issue_check() and ↵Dennis Zhou (Facebook)
cgroup_rmdir()" This reverts commit 4c6994806f708559c2812b73501406e21ae5dcd0. Destroying blkgs is tricky because of the nature of the relationship. A blkg should go away when either a blkcg or a request_queue goes away. However, blkg's pin the blkcg to ensure they remain valid. To break this cycle, when a blkcg is offlined, blkgs put back their css ref. This eventually lets css_free() get called which frees the blkcg. The above commit (4c6994806f70) breaks this order of events by trying to destroy blkgs in css_free(). As the blkgs still hold references to the blkcg, css_free() is never called. The race between blkcg_bio_issue_check() and cgroup_rmdir() will be addressed in the following patch by delaying destruction of a blkg until all writeback associated with the blkcg has been finished. Fixes: 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-31ARC: dma [IOC]: mark DMA devices connected as dma-coherentEugeniy Paltsev
Mark DMA devices on AXS103 and HSDK boards connected through IOC port as dma-coherent. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-08-31ARC: atomics: unbork atomic_fetch_##op()Will Deacon
In 4.19-rc1, Eugeniy reported weird boot and IO errors on ARC HSDK | INFO: task syslogd:77 blocked for more than 10 seconds. | Not tainted 4.19.0-rc1-00007-gf213acea4e88 #40 | "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this | message. | syslogd D 0 77 76 0x00000000 | | Stack Trace: | __switch_to+0x0/0xac | __schedule+0x1b2/0x730 | io_schedule+0x5c/0xc0 | __lock_page+0x98/0xdc | find_lock_entry+0x38/0x100 | shmem_getpage_gfp.isra.3+0x82/0xbfc | shmem_fault+0x46/0x138 | handle_mm_fault+0x5bc/0x924 | do_page_fault+0x100/0x2b8 | ret_from_exception+0x0/0x8 He bisected to 84c6591103db ("locking/atomics, asm-generic/bitops/lock.h: Rewrite using atomic_fetch_*()") This commit however only unmasked the real issue introduced by commit 4aef66c8ae9 ("locking/atomic, arch/arc: Fix build") which missed the retry-if-scond-failed branch in atomic_fetch_##op() macros. The bisected commit started using atomic_fetch_##op() macros for building the rest of atomics. Fixes: 4aef66c8ae9 ("locking/atomic, arch/arc: Fix build") Reported-by: Eugeniy Paltsev <paltsev@synopsys.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: wrote changelog]
2018-08-31MIPS: VDSO: Match data page cache colouring when D$ aliasesPaul Burton
When a system suffers from dcache aliasing a user program may observe stale VDSO data from an aliased cache line. Notably this can break the expectation that clock_gettime(CLOCK_MONOTONIC, ...) is, as its name suggests, monotonic. In order to ensure that users observe updates to the VDSO data page as intended, align the user mappings of the VDSO data page such that their cache colouring matches that of the virtual address range which the kernel will use to update the data page - typically its unmapped address within kseg0. This ensures that we don't introduce aliasing cache lines for the VDSO data page, and therefore that userland will observe updates without requiring cache invalidation. Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Hauke Mehrtens <hauke@hauke-m.de> Reported-by: Rene Nielsen <rene.nielsen@microsemi.com> Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Patchwork: https://patchwork.linux-mips.org/patch/20344/ Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Tested-by: Hauke Mehrtens <hauke@hauke-m.de> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.4+
2018-09-01kconfig: remove a spurious self-assignmentLukas Bulwahn
The self assignment was probably introduced by an automated code refactoring in commit 694c49a7c01c ("kconfig: drop localization support"). The issue was identified by a self-assign warning when running make menuconfig with clang. Fixes: 694c49a7c01c ("kconfig: drop localization support") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-09-01scripts/setlocalversion: git: Make -dirty check more robustGenki Sky
$(git diff-index) relies on the index being refreshed. This refreshing of the index used to happen, but was removed in cdf2bc632ebc ("scripts/setlocalversion on write-protected source tree", 2013-06-14) due to issues with a read-only filesystem. If the index is not refreshed, one runs into problems. E.g. as described in [0], git stores the uid in its index, so even if just the uid has changed (or git is tricked into thinking so), then we will think the tree is dirty. So as in [1], if you package linux-git with a system that uses fakeroot(1), you get a "-dirty" version. Unless you manually $(git update-index --refresh) themselves. The simplest solution seems to be $(git status --porcelain), with an additional flag saying "ignore untracked files". It seems clearer about what it does, and avoids issues regarding cached indexes and writable filesystems, but still has stable output for scripting. [0]: https://public-inbox.org/git/0190ae30-b6c8-2a8b-b1fb-fd9d84e6dfdf@oracle.com/ [1]: https://bbs.archlinux.org/viewtopic.php?id=236702 Signed-off-by: Genki Sky <sky@genki.is> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-31Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A few arm64 fixes came in this week, specifically fixing some nasty truncation of return values from firmware calls and resolving a VM_BUG_ON due to accessing uninitialised struct pages corresponding to NOMAP pages. Summary: - Fix typos in SVE documentation - Fix type-checking and implicit truncation for SMCCC calls - Force CONFIG_HOLES_IN_ZONE=y so that SLAB doesn't fall over NOMAP regions" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: always enable CONFIG_HOLES_IN_ZONE arm/arm64: smccc-1.1: Handle function result as parameters arm/arm64: smccc-1.1: Make return values unsigned long Documentation/arm64/sve: Couple of improvements and typos
2018-08-31x86/efi: Load fixmap GDT in efi_call_phys_epilog()Joerg Roedel
When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap for the simple reason that this address is also mapped for user-space. The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT to call EFI runtime services and switch back to the kernel GDT when they return. But the switch-back uses the writable GDT, not the fixmap GDT. When that happened and and the CPU returns to user-space it switches to the user %cr3 and tries to restore user segment registers. This fails because the writable GDT is not mapped in the user page-table, and without a GDT the fault handlers also can't be launched. The result is a triple fault and reboot of the machine. Fix that by restoring the GDT back to the fixmap GDT which is also mapped in the user page-table. Fixes: 7757d607c6b3 x86/pti: ('Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32') Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: hpa@zytor.com Cc: linux-efi@vger.kernel.org Link: https://lkml.kernel.org/r/1535702738-10971-1-git-send-email-joro@8bytes.org
2018-08-31Merge tag 'for-linus-4.19b-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - minor cleanup avoiding a warning when building with new gcc - a patch to add a new sysfs node for Xen frontend/backend drivers to make it easier to obtain the state of a pv device - two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable PTEs * tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove redundant variable save_pud xen: export device state to sysfs x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear x86/xen: don't write ptes directly in 32-bit PV guests
2018-08-31Merge tag 'm68k-for-v4.19-tag2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fix from Geert Uytterhoeven: "Just a single fix for a bug introduced during the merge window: fix wrong date and time on PMU-based Macs" * tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/mac: Use correct PMU response format
2018-08-31Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - regression fixes for i801 and designware - better API and leak fix for releasing DMA safe buffers - better greppable strings for the bitbang algorithm * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: sh_mobile: fix leak when using DMA bounce buffer i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow i2c: refactor function to release a DMA safe buffer i2c: algos: bit: make the error messages grepable i2c: designware: Re-init controllers with pm_disabled set on resume i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
2018-08-31x86/nmi: Fix NMI uaccess race against CR3 switchingAndy Lutomirski
A NMI can hit in the middle of context switching or in the middle of switch_mm_irqs_off(). In either case, CR3 might not match current->mm, which could cause copy_from_user_nmi() and friends to read the wrong memory. Fix it by adding a new nmi_uaccess_okay() helper and checking it in copy_from_user_nmi() and in __copy_from_user_nmi()'s callers. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jann Horn <jannh@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
2018-08-31x86: Allow generating user-space headers without a compilerBen Hutchings
When bootstrapping an architecture, it's usual to generate the kernel's user-space headers (make headers_install) before building a compiler. Move the compiler check (for asm goto support) to the archprepare target so that it is only done when building code for the target. Fixes: e501ce957a78 ("x86: Force asm-goto") Reported-by: Helmut Grohne <helmutg@debian.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180829194317.GA4765@decadent.org.uk
2018-08-31x86/dumpstack: Don't dump kernel memory based on usermode RIPJann Horn
show_opcodes() is used both for dumping kernel instructions and for dumping user instructions. If userspace causes #PF by jumping to a kernel address, show_opcodes() can be reached with regs->ip controlled by the user, pointing to kernel code. Make sure that userspace can't trick us into dumping kernel memory into dmesg. Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: security@kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180828154901.112726-1-jannh@google.com
2018-08-31of: Add device_type access helper functionsRob Herring
In preparation to remove direct access to device_node.type, add of_node_is_type() and of_node_get_device_type() helpers to check and retrieve the device type. Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
2018-08-31cpu/hotplug: Remove skip_onerr field from cpuhp_step structureMukesh Ojha
When notifiers were there, `skip_onerr` was used to avoid calling particular step startup/teardown callbacks in the CPU up/down rollback path, which made the hotplug asymmetric. As notifiers are gone now after the full state machine conversion, the `skip_onerr` field is no longer required. Remove it from the structure and its usage. Signed-off-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1535439294-31426-1-git-send-email-mojha@codeaurora.org
2018-08-31arm64: mm: always enable CONFIG_HOLES_IN_ZONEJames Morse
Commit 6d526ee26ccd ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA") only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was choking on the missing zone for nomap pages. This problem doesn't just apply to NUMA systems. If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will return true if the pfn is part of a valid sparsemem section. When working with multiple pages, the mm code uses pfn_valid_within() to test each page it uses within the sparsemem section is valid. On most systems memory comes in MAX_ORDER_NR_PAGES chunks which all have valid/initialised struct pages. In this case pfn_valid_within() is optimised out. Systems where this isn't true (e.g. due to nomap) should set HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each page as it works with it. Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to a VM_BUG_ON(): | page:fffffdff802e1780 is uninitialized and poisoned | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff | page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) | ------------[ cut here ]------------ | kernel BUG at include/linux/mm.h:978! | Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [...] | CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 40000085 (nZcv daIf -PAN -UAO) | pc : move_freepages_block+0x144/0x248 | lr : move_freepages_block+0x144/0x248 | sp : fffffe0071177680 [...] | Process dd (pid: 25236, stack limit = 0x0000000094cc07fb) | Call trace: | move_freepages_block+0x144/0x248 | steal_suitable_fallback+0x100/0x16c | get_page_from_freelist+0x440/0xb20 | __alloc_pages_nodemask+0xe8/0x838 | new_slab+0xd4/0x418 | ___slab_alloc.constprop.27+0x380/0x4a8 | __slab_alloc.isra.21.constprop.26+0x24/0x34 | kmem_cache_alloc+0xa8/0x180 | alloc_buffer_head+0x1c/0x90 | alloc_page_buffers+0x68/0xb0 | create_empty_buffers+0x20/0x1ec | create_page_buffers+0xb0/0xf0 | __block_write_begin_int+0xc4/0x564 | __block_write_begin+0x10/0x18 | block_write_begin+0x48/0xd0 | blkdev_write_begin+0x28/0x30 | generic_perform_write+0x98/0x16c | __generic_file_write_iter+0x138/0x168 | blkdev_write_iter+0x80/0xf0 | __vfs_write+0xe4/0x10c | vfs_write+0xb4/0x168 | ksys_write+0x44/0x88 | sys_write+0xc/0x14 | el0_svc_naked+0x30/0x34 | Code: aa1303e0 90001a01 91296421 94008902 (d4210000) | ---[ end trace 1601ba47f6e883fe ]--- Remove the NUMA dependency. Link: https://www.spinics.net/lists/arm-kernel/msg671851.html Cc: <stable@vger.kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-08-31gpio: Fix crash due to registration raceVincent Whitchurch
gpiochip_add_data_with_key() adds the gpiochip to the gpio_devices list before of_gpiochip_add() is called, but it's only the latter which sets the ->of_xlate function pointer. gpiochip_find() can be called by someone else between these two actions, and it can find the chip and call of_gpiochip_match_node_and_xlate() which leads to the following crash due to a NULL ->of_xlate(). Unhandled prefetch abort: page domain fault (0x01b) at 0x00000000 Modules linked in: leds_gpio(+) gpio_generic(+) CPU: 0 PID: 830 Comm: insmod Not tainted 4.18.0+ #43 Hardware name: ARM-Versatile Express PC is at (null) LR is at of_gpiochip_match_node_and_xlate+0x2c/0x38 Process insmod (pid: 830, stack limit = 0x(ptrval)) (of_gpiochip_match_node_and_xlate) from (gpiochip_find+0x48/0x84) (gpiochip_find) from (of_get_named_gpiod_flags+0xa8/0x238) (of_get_named_gpiod_flags) from (gpiod_get_from_of_node+0x2c/0xc8) (gpiod_get_from_of_node) from (devm_fwnode_get_index_gpiod_from_child+0xb8/0x144) (devm_fwnode_get_index_gpiod_from_child) from (gpio_led_probe+0x208/0x3c4 [leds_gpio]) (gpio_led_probe [leds_gpio]) from (platform_drv_probe+0x48/0x9c) (platform_drv_probe) from (really_probe+0x1d0/0x3d4) (really_probe) from (driver_probe_device+0x78/0x1c0) (driver_probe_device) from (__driver_attach+0x120/0x13c) (__driver_attach) from (bus_for_each_dev+0x68/0xb4) (bus_for_each_dev) from (bus_add_driver+0x1a8/0x268) (bus_add_driver) from (driver_register+0x78/0x10c) (driver_register) from (do_one_initcall+0x54/0x1fc) (do_one_initcall) from (do_init_module+0x64/0x1f4) (do_init_module) from (load_module+0x2198/0x26ac) (load_module) from (sys_finit_module+0xe0/0x110) (sys_finit_module) from (ret_fast_syscall+0x0/0x54) One way to fix this would be to rework the hairy registration sequence in gpiochip_add_data_with_key(), but since I'd probably introduce a couple of new bugs if I attempted that, simply add a check for a non-NULL of_xlate function pointer in of_gpiochip_match_node_and_xlate(). This works since the driver looking for the gpio will simply fail to find the gpio and defer its probe and be reprobed when the driver which is registering the gpiochip has fully completed its probe. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-08-31m68k/mac: Use correct PMU response formatFinn Thain
Now that the 68k Mac port has adopted the via-pmu driver, it must decode the PMU response accordingly otherwise the date and time will be wrong. Fixes: ebd722275f9cfc67 ("macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2018-08-30Merge tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular fixes pull: - Mediatek has a bunch of fixes to their RDMA and Overlay engines. - i915 has some Cannonlake/Geminilake watermark workarounds, LSPCON fix, HDCP free fix, audio fix and a ppgtt reference counting fix. - amdgpu has some SRIOV, Kasan, memory leaks and other misc fixes" * tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm: (35 commits) drm/i915/audio: Hook up component bindings even if displays are disabled drm/i915: Increase LSPCON timeout drm/i915: Stop holding a ref to the ppgtt from each vma drm/i915: Free write_buf that we allocated with kzalloc. drm/i915: Fix glk/cnl display w/a #1175 drm/amdgpu: Need to set moved to true when evict bo drm/amdgpu: Remove duplicated power source update drm/amd/display: Fix memory leak caused by missed dc_sink_release drm/amdgpu: fix holding mn_lock while allocating memory drm/amdgpu: Power on uvd block when hw_fini drm/amdgpu: Update power state at the end of smu hw_init. drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode drm/amdgpu: Adjust the VM size based on system memory size v2 drm/mediatek: fix connection from RDMA2 to DSI1 drm/mediatek: update some variable name from ovl to comp drm/mediatek: use layer_nr function to get layer number to init plane drm/mediatek: add function to return RDMA layer number drm/mediatek: add function to return OVL layer number drm/mediatek: add function to get layer number for component ...
2018-08-30disable stringop truncation warnings for nowStephen Rothwell
They are too noisy Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-30Merge tag 'pm-4.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These address a corner case in the menu cpuidle governor and fix error handling in the PM core's generic clock management code. Specifics: - Make the menu cpuidle governor avoid stopping the scheduler tick if the predicted idle duration exceeds the tick period length, but the selected idle state is shallow and deeper idle states with high target residencies are available (Rafael Wysocki). - Make the PM core's generic clock management code use a proper data type for one variable to make error handling work (Dan Carpenter)" * tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: menu: Retain tick when shallow state is selected PM / clk: signedness bug in of_pm_clk_add_clks()
2018-08-30arc: remove redundant GCC version checksMasahiro Yamada
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures. With GCC >= 4.6 assumed, 'upto_gcc44' is empty, 'atleast_gcc44' is y. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>