summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-12-06dt-bindings: eeprom: at25: Grammar s/are can/can/Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-06dt-bindings: Remove leading 0x from bindings notationMathieu Malaterre
Improve the binding example by removing all the leading 0x to fix the following dtc warnings: Warning (unit_address_format): Node /XXX unit name should not have leading "0x" Converted using the following command: find Documentation/devicetree/bindings -name "*.txt" -exec sed -i -e 's/([^ ])\@0x([0-9a-f])/$1\@$2/g' {} + This is a follow up to commit 48c926cd3414 Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-06of: overlay: Remove else after gotoGeert Uytterhoeven
If an "if" branch is terminated by a "goto", there's no need to have an "else" statement and an indented block of code. Remove the "else" statement to simplify the code flow for the casual reviewer. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-06of: Spelling s/changset/changeset/Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-06of: unittest: Remove bogus overlay mutex release from overlay_data_add()Geert Uytterhoeven
overlay_data_add() never takes the special overlay mutex, so it must not be released in the error patch. Presumably the call to of_overlay_mutex_unlock() is a relic from v1 of the patch. Fixes: f948d6d8b792bb90 ("of: overlay: avoid race condition between applying multiple overlays") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org>
2017-12-06drivers: net: dsa: remove duplicate includesPravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06rds: Fix NULL pointer dereference in __rds_rdma_mapHåkon Bugge
This is a fix for syzkaller719569, where memory registration was attempted without any underlying transport being loaded. Analysis of the case reveals that it is the setsockopt() RDS_GET_MR (2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable. Here is an example stack trace when the bug is hit: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0 IP: __rds_rdma_map+0x36/0x440 [rds] PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul c rc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd shpchp pcspkr i2c_i801 ioatd ma ipmi_msghandler wmi lpc_ich mfd_core auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror dm_region_hash dm_log dm_mod CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2 Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017 task: ffff882f9190db00 task.stack: ffffc9002b994000 RIP: 0010:__rds_rdma_map+0x36/0x440 [rds] RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580 RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002 R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580 R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc FS: 00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0 Call Trace: rds_get_mr+0x56/0x80 [rds] rds_setsockopt+0x172/0x340 [rds] ? __fget_light+0x25/0x60 ? __fdget+0x13/0x20 SyS_setsockopt+0x80/0xe0 do_syscall_64+0x67/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7fbff9b117f9 RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9 RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860 R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000 R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021 Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48 89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48> 83 b8 c0 00 00 00 00 0f 84 25 03 00 0 0 48 8b 06 48 8b 56 08 The fix is to check the existence of an underlying transport in __rds_rdma_map(). Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06net_sched: use macvlan real dev trans_start in dev_trans_start()Chris Dion
Macvlan devices are similar to vlans and do not update their own trans_start. In order for arp monitoring to work for a bond device when the slaves are macvlans, obtain its real device. Signed-off-by: Chris Dion <christopher.dion@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06x86/vdso: Change time() prototype to match __vdso_time()Arnd Bergmann
gcc-8 warns that time() is an alias for __vdso_time() but the two have different prototypes: arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias] int time(time_t *t) ^~~~ arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here I could not figure out whether this is intentional, but I see that changing it to return time_t avoids the warning. Returning 'int' from time() is also a bit questionable, as it causes an overflow in y2038 even on 64-bit architectures that use a 64-bit time_t type. On 32-bit architecture with 64-bit time_t, time() should always be implement by the C library by calling a (to be added) clock_gettime() variant that takes a sufficiently wide argument. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-07Merge branch 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes ttm and license fixes * 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux: drm/ttm: swap consecutive allocated pooled pages v4 drm/ttm: swap consecutive allocated cached pages v3 drm/ttm: roundup the shrink request to prevent skip huge pool drm/ttm: add page order support in ttm_pages_put drm/ttm: add set_pages_wb for handling page order more than zero drm/ttm: add page order in page pool drm/ttm: use NUM_PAGES_TO_ALLOC always drm/amdgpu: add license to files where it was missing drm/amdgpu: add license to Makefiles
2017-12-06xen-netback: Fix logging message with spurious period after newlineJoe Perches
Using a period after a newline causes bad output. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06net: thunderx: Fix TCP/UDP checksum offload for IPv4 pktsFlorian Westphal
Offload IP header checksum to NIC. This fixes a previous patch which disabled checksum offloading for both IPv4 and IPv6 packets. So L3 checksum offload was getting disabled for IPv4 pkts. And HW is dropping these pkts for some reason. Without this patch, IPv4 TSO appears to be broken: WIthout this patch I get ~16kbyte/s, with patch close to 2mbyte/s when copying files via scp from test box to my home workstation. Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs. This patch restores performance for me, ipv6 looks good too. Fixes: fa6d7cb5d76c ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts") Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-06arm64/sve: Avoid dereference of dead task_struct in KVM guest entryDave Martin
When deciding whether to invalidate FPSIMD state cached in the cpu, the backend function sve_flush_cpu_state() attempts to dereference __this_cpu_read(fpsimd_last_state). However, this is not safe: there is no guarantee that this task_struct pointer is still valid, because the task could have exited in the meantime. This means that we need another means to get the appropriate value of TIF_SVE for the associated task. This patch solves this issue by adding a cached copy of the TIF_SVE flag in fpsimd_last_state, which we can check without dereferencing the task pointer. In particular, although this patch is not a KVM fix per se, this means that this check is now done safely in the KVM world switch path (which is currently the only user of this code). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06Merge tag 'iommu-v4.15-rc3' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull IOMMU fix from Alex Williamson: "Fix VT-d handling of scatterlists where sg->offset exceeds PAGE_SIZE" * tag 'iommu-v4.15-rc3' of git://github.com/awilliam/linux-vfio: iommu/vt-d: Fix scatterlist offset handling
2017-12-06Merge tag 'sound-4.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "All fixes are small and for stable: - a PCM ioctl race fix - yet another USB-audio hardening for malicious descriptors - Realtek ALC257 codec support" * tag 'sound-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: pcm: prevent UAF in snd_pcm_info ALSA: hda/realtek - New codec support for ALC257 ALSA: usb-audio: Add check return value for usb_string() ALSA: usb-audio: Fix out-of-bound error ALSA: seq: Remove spurious WARN_ON() at timer check
2017-12-06x86: Fix Sparse warnings about non-static functionsColin Ian King
Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common() are local to the source and do not need to be in global scope, so make them static. Fixes up various sparse warnings. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Cc: travis@sgi.com Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06efi: Add comment to avoid future expanding of sysfs systabDave Young
/sys/firmware/efi/systab shows several different values, it breaks sysfs one file one value design. But since there are already userspace tools depend on it eg. kexec-tools so add code comment to alert future expanding of this file. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20171206095010.24170-4-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06efi/esrt: Use memunmap() instead of kfree() to free the remappingPan Bian
The remapping result of memremap() should be freed with memunmap(), not kfree(). Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: <stable@vger.kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20171206095010.24170-3-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06efi: Move some sysfs files to be read-only by rootGreg Kroah-Hartman
Thanks to the scripts/leaking_addresses.pl script, it was found that some EFI values should not be readable by non-root users. So make them root-only, and to do that, add a __ATTR_RO_MODE() macro to make this easier, and use it in other places at the same time. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Dave Young <dyoung@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Cc: stable <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20171206095010.24170-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06sched/fair: Update and fix the runnable propagation ruleVincent Guittot
Unlike running, the runnable part can't be directly propagated through the hierarchy when we migrate a task. The main reason is that runnable time can be shared with other sched_entities that stay on the rq and this runnable time will also remain on prev cfs_rq and must not be removed. Instead, we can estimate what should be the new runnable of the prev cfs_rq and check that this estimation stay in a possible range. The prop_runnable_sum is a good estimation when adding runnable_sum but fails most often when we remove it. Instead, we could use the formula below instead: gcfs_rq's runnable_sum = gcfs_rq->avg.load_sum / gcfs_rq->load.weight which assumes that tasks are equally runnable which is not true but easy to compute. Beside these estimates, we have several simple rules that help us to filter out wrong ones: - ge->avg.runnable_sum <= than LOAD_AVG_MAX - ge->avg.runnable_sum >= ge->avg.running_sum (ge->avg.util_sum << LOAD_AVG_MAX) - ge->avg.runnable_sum can't increase when we detach a task The effect of these fixes is better cgroups balancing. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ben Segall <bsegall@google.com> Cc: Chris Mason <clm@fb.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yuyang Du <yuyang.du@intel.com> Link: http://lkml.kernel.org/r/1510842112-21028-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06sched/wait: Fix add_wait_queue() behavioral changeOmar Sandoval
The following cleanup commit: 50816c48997a ("sched/wait: Standardize internal naming of wait-queue entries") ... unintentionally changed the behavior of add_wait_queue() from inserting the wait entry at the head of the wait queue to the tail of the wait queue. Beyond a negative performance impact this change in behavior theoretically also breaks wait queues which mix exclusive and non-exclusive waiters, as non-exclusive waiters will not be woken up if they are queued behind enough exclusive waiters. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@fb.com Fixes: ("sched/wait: Standardize internal naming of wait-queue entries") Link: http://lkml.kernel.org/r/a16c8ccffd39bd08fdaa45a5192294c784b803a7.1512544324.git.osandov@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06locking/lockdep: Fix possible NULL derefPeter Zijlstra
We can't invalidate xhlocks when we've not yet allocated any. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: f52be5708076 ("locking/lockdep: Untangle xhlock history save/restore from task independence") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06cpu/hotplug: Fix state name in takedown_cpu() commentBrendan Jackman
CPUHP_AP_SCHED_MIGRATE_DYING doesn't exist, it looks like this was supposed to refer to CPUHP_AP_SCHED_STARTING's teardown callback, i.e. sched_cpu_dying(). Signed-off-by: Brendan Jackman <brendan.jackman@arm.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171206105911.28093-1-brendan.jackman@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlbWill Deacon
enter_lazy_tlb is called when a kernel thread rides on the back of another mm, due to a context switch or an explicit call to unuse_mm where a call to switch_mm is elided. In these cases, it's important to keep the saved ttbr value up to date with the active mm, otherwise we can end up with a stale value which points to a potentially freed page table. This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0 is kept up-to-date with the active mm for kernel threads. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mmWill Deacon
update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper contains kernel mappings and should never be installed into ttbr0. However, this means that callers must avoid passing the init_mm to update_saved_ttbr0 which in turn can cause the saved ttbr0 value to be out-of-date in the context of the idle thread. For example, EFI runtime services may leave the saved ttbr0 pointing at the EFI page table, and kernel threads may end up with stale references to freed page tables. This patch changes update_saved_ttbr0 so that the init_mm points the saved ttbr0 value to the empty zero page, which always exists and never contains valid translations. EFI and switch can then call into update_saved_ttbr0 unconditionally. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu.Dave Martin
There is currently some duplicate logic to associate current's FPSIMD context with the cpu when loading FPSIMD state into the cpu regs. Subsequent patches will update that logic, so in order to ensure it only needs to be done in one place, this patch factors the relevant code out into a new function fpsimd_bind_to_cpu(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06arm64: fpsimd: Prevent registers leaking from dead tasksDave Martin
Currently, loading of a task's fpsimd state into the CPU registers is skipped if that task's state is already present in the registers of that CPU. However, the code relies on the struct fpsimd_state * (and by extension struct task_struct *) to unambiguously identify a task. There is a particular case in which this doesn't work reliably: when a task exits, its task_struct may be recycled to describe a new task. Consider the following scenario: 1) Task P loads its fpsimd state onto cpu C. per_cpu(fpsimd_last_state, C) := P; P->thread.fpsimd_state.cpu := C; 2) Task X is scheduled onto C and loads its fpsimd state on C. per_cpu(fpsimd_last_state, C) := X; X->thread.fpsimd_state.cpu := C; 3) X exits, causing X's task_struct to be freed. 4) P forks a new child T, which obtains X's recycled task_struct. T == X. T->thread.fpsimd_state.cpu == C (inherited from P). 5) T is scheduled on C. T's fpsimd state is not loaded, because per_cpu(fpsimd_last_state, C) == T (== X) && T->thread.fpsimd_state.cpu == C. (This is the check performed by fpsimd_thread_switch().) So, T gets X's registers because the last registers loaded onto C were those of X, in (2). This patch fixes the problem by ensuring that the sched-in check fails in (5): fpsimd_flush_task_state(T) is called when T is forked, so that T->thread.fpsimd_state.cpu == C cannot be true. This relies on the fact that T is not schedulable until after copy_thread() completes. Once T's fpsimd state has been loaded on some CPU C there may still be other cpus D for which per_cpu(fpsimd_last_state, D) == &X->thread.fpsimd_state. But D is necessarily != C in this case, and the check in (5) must fail. An alternative fix would be to do refcounting on task_struct. This would result in each CPU holding a reference to the last task whose fpsimd state was loaded there. It's not clear whether this is preferable, and it involves higher overhead than the fix proposed in this patch. It would also move all the task_struct freeing work into the context switch critical section, or otherwise some deferred cleanup mechanism would need to be introduced, neither of which seems obviously justified. Cc: <stable@vger.kernel.org> Fixes: 005f78cd8849 ("arm64: defer reloading a task's FPSIMD state to userland resume") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: word-smithed the comment so it makes more sense] Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-06KVM: x86: fix APIC page invalidationRadim Krčmář
Implementation of the unpinned APIC page didn't update the VMCS address cache when invalidation was done through range mmu notifiers. This became a problem when the page notifier was removed. Re-introduce the arch-specific helper and call it from ...range_start. Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Fixes: 38b9917350cb ("kvm: vmx: Implement set_apic_access_page_addr") Fixes: 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic v2") Cc: <stable@vger.kernel.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Wanpeng Li <wanpeng.li@hotmail.com> Tested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-12-06Merge tag 'kvm-s390-master-4.15-1' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: Fixes for 4.15 - SPDX tags - Fence storage key accesses from problem state - Make sure that irq_state.flags is not used in the future
2017-12-06xen/pvcalls: Fix a check in pvcalls_front_remove()Dan Carpenter
bedata->ref can't be less than zero because it's unsigned. This affects certain error paths in probe. We first set ->ref = -1 and then we set it to a valid value later. Fixes: 219681909913 ("xen/pvcalls: connect to the backend") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-12-06xen/pvcalls: check for xenbus_read() errorsDan Carpenter
Smatch complains that "len" is uninitialized if xenbus_read() fails so let's add some error handling. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-12-06drm/ttm: swap consecutive allocated pooled pages v4Christian König
When we detect consecutive allocation of pages swap them to avoid accidentally freeing them as huge page. v2: use swap v3: check if it's really the first allocated page v4: don't touch the loop variable Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Roger He <Hongbo.He@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07powerpc/xmon: Don't print hashed pointers in xmonMichael Ellerman
Since commit ad67b74d2469 ("printk: hash addresses printed with %p") pointers printed with %p are hashed, ie. you don't see the actual pointer value but rather a cryptographic hash of its value. In xmon we want to see the actual pointer values, because xmon is a debugger, so replace %p with %px which prints the actual pointer value. We justify doing this in xmon because 1) xmon is a kernel crash debugger, it's only accessible via the console 2) xmon doesn't print to dmesg, so the pointers it prints are not able to be leaked that way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-06powerpc/64s: Initialize ISAv3 MMU registers before setting partition tableNicholas Piggin
kexec can leave MMU registers set when booting into a new kernel, the PIDR (Process Identification Register) in particular. The boot sequence does not zero PIDR, so it only gets set when CPUs first switch to a userspace processes (until then it's running a kernel thread with effective PID = 0). This leaves a window where a process table entry and page tables are set up due to user processes running on other CPUs, that happen to match with a stale PID. The CPU with that PID may cause speculative accesses that address quadrant 0 (aka userspace addresses), which will result in cached translations and PWC (Page Walk Cache) for that process, on a CPU which is not in the mm_cpumask and so they will not be invalidated properly. The most common result is the kernel hanging in infinite page fault loops soon after kexec (usually in schedule_tail, which is usually the first non-speculative quadrant 0 access to a new PID) due to a stale PWC. However being a stale translation error, it could result in anything up to security and data corruption problems. Fix this by zeroing out PIDR at boot and kexec. Fixes: 7e381c0ff618 ("powerpc/mm/radix: Add mmu context handling callback for radix") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-06x86/power: Fix some ordering bugs in __restore_processor_context()Andy Lutomirski
__restore_processor_context() had a couple of ordering bugs. It restored GSBASE after calling load_gs_index(), and the latter can call into tracing code. It also tried to restore segment registers before restoring the LDT, which is straight-up wrong. Reorder the code so that we restore GSBASE, then the descriptor tables, then the segments. This fixes two bugs. First, it fixes a regression that broke resume under certain configurations due to irqflag tracing in native_load_gs_index(). Second, it fixes resume when the userspace process that initiated suspect had funny segments. The latter can be reproduced by compiling this: // SPDX-License-Identifier: GPL-2.0 /* * ldt_echo.c - Echo argv[1] while using an LDT segment */ int main(int argc, char **argv) { int ret; size_t len; char *buf; const struct user_desc desc = { .entry_number = 0, .base_addr = 0, .limit = 0xfffff, .seg_32bit = 1, .contents = 0, /* Data, grow-up */ .read_exec_only = 0, .limit_in_pages = 1, .seg_not_present = 0, .useable = 0 }; if (argc != 2) errx(1, "Usage: %s STRING", argv[0]); len = asprintf(&buf, "%s\n", argv[1]); if (len < 0) errx(1, "Out of memory"); ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)); if (ret < -1) errno = -ret; if (ret) err(1, "modify_ldt"); asm volatile ("movw %0, %%es" :: "rm" ((unsigned short)7)); write(1, buf, len); return 0; } and running ldt_echo >/sys/power/mem Without the fix, the latter causes a triple fault on resume. Fixes: ca37e57bbe0c ("x86/entry/64: Add missing irqflags tracing to native_load_gs_index()") Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/6b31721ea92f51ea839e79bd97ade4a75b1eeea2.1512057304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06x86/PCI: Make broadcom_postcore_init() check acpi_disabledRafael J. Wysocki
acpi_os_get_root_pointer() may return a valid address even if acpi_disabled is set, but the host bridge information from the ACPI tables is not going to be used in that case and the Broadcom host bridge initialization should not be skipped then, So make broadcom_postcore_init() check acpi_disabled too to avoid this issue. Fixes: 6361d72b04d1 (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan) Reported-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Linux PCI <linux-pci@vger.kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06x86/microcode/AMD: Add support for fam17h microcode loadingTom Lendacky
The size for the Microcode Patch Block (MPB) for an AMD family 17h processor is 3200 bytes. Add a #define for fam17h so that it does not default to 2048 bytes and fail a microcode load/update. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMDRudolf Marek
The latest AMD AMD64 Architecture Programmer's Manual adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]). If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers, thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs. Signed-off-by: Rudolf Marek <r.marek@assembler.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06drm: safely free connectors from connector_iterDaniel Vetter
In commit 613051dac40da1751ab269572766d3348d45a197 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Dec 14 00:08:06 2016 +0100 drm: locking&new iterators for connector_list we've went to extreme lengths to make sure connector iterations works in any context, without introducing any additional locking context. This worked, except for a small fumble in the implementation: When we actually race with a concurrent connector unplug event, and our temporary connector reference turns out to be the final one, then everything breaks: We call the connector release function from whatever context we happen to be in, which can be an irq/atomic context. And connector freeing grabs all kinds of locks and stuff. Fix this by creating a specially safe put function for connetor_iter, which (in this rare case) punts the cleanup to a worker. Reported-by: Ben Widawsky <ben@bwidawsk.net> Cc: Ben Widawsky <ben@bwidawsk.net> Fixes: 613051dac40d ("drm: locking&new iterators for connector_list") Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sean Paul <seanpaul@chromium.org> Cc: <stable@vger.kernel.org> # v4.11+ Reviewed-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171204204818.24745-1-daniel.vetter@ffwll.ch
2017-12-06KVM: s390: Fix skey emulation permission checkJanosch Frank
All skey functions call skey_check_enable at their start, which checks if we are in the PSTATE and injects a privileged operation exception if we are. Unfortunately they continue processing afterwards and perform the operation anyhow as skey_check_enable does not deliver an error if the exception injection was successful. Let's move the PSTATE check into the skey functions and exit them on such an occasion, also we now do not enable skey handling anymore in such a case. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: a7e19ab ("KVM: s390: handle missing storage-key facility") Cc: <stable@vger.kernel.org> # v4.8+ Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-12-06KVM: s390: mark irq_state.flags as non-usableChristian Borntraeger
Old kernels did not check for zero in the irq_state.flags field and old QEMUs did not zero the flag/reserved fields when calling KVM_S390_*_IRQ_STATE. Let's add comments to prevent future uses of these fields. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-12-06KVM: s390: Remove redundant license textGreg Kroah-Hartman
Now that the SPDX tag is in all arch/s390/kvm/ files, that identifies the license in a specific and legally-defined manner. So the extra GPL text wording can be removed as it is no longer needed at all. This is done on a quest to remove the 700+ different ways that files in the kernel describe the GPL license text. And there's unneeded stuff like the address (sometimes incorrect) for the FSF which is never needed. No copyright headers or other non-license-description text was removed. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20171124140043.10062-9-gregkh@linuxfoundation.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-12-06KVM: s390: add SPDX identifiers to the remaining filesGreg Kroah-Hartman
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the arch/s390/kvm/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20171124140043.10062-3-gregkh@linuxfoundation.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-12-06drm/i915/gvt: set max priority for gvt contextZhenyu Wang
This is to workaround guest driver hang regression after preemption enable that gvt hasn't enabled handling of that for guest workload. So in effect this disables preemption for gvt context now. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> (cherry picked from commit 1603660b3342269c95fcafee1945790342a8c28e)
2017-12-06drm/i915/gvt: Don't mark vgpu context as inactive when preemptedZhenyu Wang
We shouldn't mark inactive for vGPU context if preempted, which would still be re-scheduled later. So keep active state. Fixes: d6c0511300dc ("drm/i915/execlists: Distinguish the incomplete context notifies") Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> (cherry picked from commit da5f99eaccc10e30bf82eb02b1be74703b878720)
2017-12-06drm/i915/gvt: Limit read hw reg to active vgpuXiong Zhang
mmio_read_from_hw() let vgpu could read hw reg, if vgpu's workload is running on hw, things is good. Otherwise vgpu will get other vgpu's reg val, it is unsafe. This patch limit such hw access to active vgpu. If vgpu isn't running on hw, the reg read of this vgpu will get the last active val which saved at schedule_out. v2: ring timestamp is walking continuously even if the ring is idle. so read hw directly. (Zhenyu) Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> (cherry picked from commit 295764cd2ff41e2c1bc8af4050de77cec5e7a1c0)
2017-12-06drm/i915/gvt: Export intel_gvt_render_mmio_to_ring_id()Zhi Wang
Since many emulation logic needs to convert the offset of ring registers into ring id, we export it for other caller which might need it. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> (cherry picked from commit 62a6a53786fc4b4e7543cc63b704dbb3f7df4c0f)
2017-12-06drm/i915/gvt: Emulate PCI expansion ROM base address registerChangbin Du
Our vGPU doesn't have a device ROM, we need follow the PCI spec to report this info to drivers. Otherwise, we would see below errors. Inspecting possible rom at 0xfe049000 (vd=8086:1912 bdf=00:10.0) qemu-system-x86_64: vfio-pci: Cannot read device rom at 00000000-0000-0000-0000-000000000001 Device option ROM contents are probably invalid (check dmesg). Skip option ROM probe with rombar=0, or load from file with romfile=No option rom signature (got 4860) I will also send a improvement patch to PCI subsystem related to PCI ROM. But no idea to omit below error, since no pattern to detect vbios shadow without touch its content. 0000:00:10.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000 Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> (cherry picked from commit c4270d122ccff963a021d1beb893d6192336af96)
2017-12-05x86: don't hash faulting address in oops printoutLinus Torvalds
Things like this will probably keep showing up for other architectures and other special cases. I actually thought we already used %lx for this, and that is indeed _historically_ the case, but we moved to %p when merging the 32-bit and 64-bit cases as a convenient way to get the formatting right (ie automatically picking "%08lx" vs "%016lx" based on register size). So just turn this %p into %px. Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-05locking/refcounts: Do not force refcount_t usage as GPL-only exportKees Cook
The refcount_t protection on x86 was not intended to use the stricter GPL export. This adjusts the linkage again to avoid a regression in the availability of the refcount API. Reported-by: Dave Airlie <airlied@gmail.com> Fixes: 7a46ec0e2f48 ("locking/refcounts, x86/asm: Implement fast refcount overflow protection") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>