linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2015-05-08	Merge branch 'for-linus-4.1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fix from Chris Mason: "When an arm user reported crashes near page_address(page) in my new code, it became clear that I can't be trusted with GFP masks. Filipe beat me to the patch, and I'll just be in the corner with my dunce cap on" * 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix wrong mapping flags for free space inode
2015-05-08	Merge tag 'dm-4.1-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Two additional fixes for changes introduced via DM during the 4.1 merge window. The first reverts a dm-crypt change that wasn't correct. The second fixes a device format regression that impacted userspace" * tag 'dm-4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: init: fix regression by supporting devices with major:minor:offset format Revert "dm crypt: fix deadlock when async crypto algorithm returns -EBUSY"
2015-05-08	Merge branch 'for-linus' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: "A collection of fixes since the merge window; - fix for a double elevator module release, from Chao Yu. Ancient bug. - the splice() MORE flag fix from Christophe Leroy. - a fix for NVMe, fixing a patch that went in in the merge window. From Keith. - two fixes for blk-mq CPU hotplug handling, from Ming Lei. - bdi vs blockdev lifetime fix from Neil Brown, fixing and oops in md. - two blk-mq fixes from Shaohua, fixing a race on queue stop and a bad merge issue with FUA writes. - division-by-zero fix for writeback from Tejun. - a block bounce page accounting fix, making sure we inc/dec after bouncing so that pre/post IO pages match up. From Wang YanQing" * 'for-linus' of git://git.kernel.dk/linux-block: splice: sendfile() at once fails for big files blk-mq: don't lose requests if a stopped queue restarts blk-mq: fix FUA request hang block: destroy bdi before blockdev is unregistered. block:bounce: fix call inc_\|dec_zone_page_state on different pages confuse value of NR_BOUNCE elevator: fix double release of elevator module writeback: use \|1 instead of +1 to protect against div by zero blk-mq: fix CPU hotplug handling blk-mq: fix race between timeout and CPU hotplug NVMe: Fix VPD B0 max sectors translation
2015-05-08	Merge tag 'gpio-v4.1-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Here is a bunch of GPIO fixes that I collected since -rc1, nothing controversial, nothing special: - fix a memory leak for GPIO hotplug. - fix a signedness bug in the ACPI GPIO pin validation. - driver fixes: Qualcomm SPMI and OMAP MPUIO IRQ issues" * tag 'gpio-v4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: omap: Fix regression for MPUIO interrupts gpio: sysfs: fix memory leaks and device hotplug pinctrl: qcom-spmi-gpio: Fix input value report pinctrl: qcom-spmi-gpio: Fix output type configuration gpiolib: change gpio pin from unsigned to signed in acpi callback
2015-05-08	Merge tag 'mmc-4.1-rc2' of git://git.linaro.org/people/ulf.hansson/mmc	Linus Torvalds
	Pull MMC fixes from Ulf Hansson: "MMC core: - Don't access RPMB partitions for normal read/write - Fix hibernation restore sequence MMC host: - dw_mmc: Fix card detection for non removable cards - dw_mmc: Fix sglist issue in 32-bit mode - sh_mmcif: Fix timeout value for command request" * tag 'mmc-4.1-rc2' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: dw_mmc: dw_mci_get_cd check MMC_CAP_NONREMOVABLE mmc: dw_mmc: init desc in dw_mci_idmac_init mmc: card: Don't access RPMB partitions for normal read/write mmc: sh_mmcif: Fix timeout value for command request mmc: core: add missing pm event in mmc_pm_notify to fix hib restore
2015-05-08	Merge tag 'trace-fixes-v4.1-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "The newly added ftrace_print_array_seq() function had a bug in it. Luckily, the only user of it didn't make the 4.1 merge window. But the helper function should be fixed before 4.2 when the users start coming in" * tag 'trace-fixes-v4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Make ftrace_print_array_seq compute buf_len
2015-05-08	ARM: AM33xx+: hwmod: re-use omap4 implementations for reset functionality	Tero Kristo
	The reset code functionality is mostly a copy paste between OMAP4+ and AM33xx+. Re-use the omap4 code where possible, and just keep the special implementation for de-asserting the hardreset lines for AM33xx, as AM33xx+ devices have slightly different register layouts compared to OMAP4+. This patch also fixes the hardreset issues faced on AM43xx. Signed-off-by: Tero Kristo <t-kristo@ti.com> Reported-by: Dave Gerlach <d-gerlach@ti.com> Reported-by: Suman Anna <s-anna@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
2015-05-08	ARM: OMAP4+: PRM: add support for passing status register/bit info to reset	Tero Kristo
	AM43xx has slightly different reset register layout compared to OMAP4+, with varying status bit shifts and status register offsets. Current code assumes static offsets and identical status / reset control bit shifts, which is wrong. This patch adds PRM core support for passing the actual implementations from hwmod code. AM43xx mappings will be fixed in subsequent patch. Signed-off-by: Tero Kristo <t-kristo@ti.com> Reported-by: Dave Gerlach <d-gerlach@ti.com> Reported-by: Suman Anna <s-anna@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
2015-05-08	ARM: AM43xx: hwmod: add VPFE hwmod entries	Benoit Parrot
	This patch adds VPFE HWMOD data for AM43xx. Signed-off-by: Benoit Parrot <bparrot@ti.com> Signed-off-by: Darren Etheridge <detheridge@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Tested-by: Benoit Parrot <bparrot@ti.com> [paul@pwsan.com: updated to apply on v4.1-rc1] Signed-off-by: Paul Walmsley <paul@pwsan.com>
2015-05-09	ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for exynos5250-snow	Javier Martinez Canillas
	The Marvell mwifiex driver prevents the system to enter into a suspend state if the card power is not preserved during a suspend/resume cycle. So Suspend-to-RAM and Suspend-to-idle are failing on Exynos5250 Snow. Add the keep-power-in-suspend Power Management property to the SDIO/MMC node so the mwifiex suspend handler doesn't fail and the system is able to enter into a suspend state. Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-05-09	ARM: dts: Fix typo in trip point temperature for exynos5420/5440	Abhilash Kesavan
	Remove the extra zero in the "cpu-crit-0" trip point for exynos5420 and exynos5440. Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com> Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-05-09	ARM: dts: add 'rtc_src' clock to rtc node for exynos4412-odroid boards	Markus Reichl
	The Exynos4412 SoC has a s3c6410 RTC where the source clock is now a mandatory property. This patch fixes probe failure of s3c-rtc on Odroid-X2/U2/U3 boards. Signed-off-by: Markus Reichl <m.reichl@fivetechno.de> Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-05-09	ARM: dts: Make DP a consumer of DISP1 power domain on Exynos5420	Javier Martinez Canillas
	Commit ea08de16eb1b ("ARM: dts: Add DISP1 power domain for exynos5420") added a device node for the Exynos5420 DISP1 power domain but dit not make the DP controller a consumer of that power domain. This causes an "Unhandled fault: imprecise external abort" error if the exynos-dp driver tries to access the DP controller registers and the PD was turned off. This lead to a kernel panic and a complete system hang. Make the DP controller device node a consumer of the DISP1 power domain to ensure that the PD is turned on when the exynos-dp driver is probed. Fixes: ea08de16eb1b ("ARM: dts: Add DISP1 power domain for exynos5420") Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-05-08	MAINTAINERS: add Conexant Digicolor machines entry	Baruch Siach
	This adds Baruch as the maintainer for the Digicolor platform. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-05-08	MAINTAINERS: socfpga: update the git repo for SoCFPGA	Dinh Nguyen
	The git tree at rocketboards.org is going away. Update the entry to reflect the address of the new location. Also add an entry for all the socfpga_* dts files. Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-05-08	arm64: bpf: fix signedness bug in loading 64-bit immediate	Xi Wang
	Consider "(u64)insn1.imm << 32 \| imm" in the arm64 JIT. Since imm is signed 32-bit, it is sign-extended to 64-bit, losing the high 32 bits. The fix is to convert imm to u32 first, which will be zero-extended to u64 implicitly. Cc: Zi Shen Lim <zlim.lnx@gmail.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Fixes: 30d3d94cc3d5 ("arm64: bpf: add 'load 64-bit immediate' instruction") Signed-off-by: Xi Wang <xi.wang@gmail.com> [will: removed non-arm64 bits and redundant casting] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-05-08	Merge tag 'extcon-fixes-for-4.1-rc2' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus Chanwoo writes: Update extcon for v4.1-rc2 This patchset fixes the NULL pointer deference issue of extcon-usb-gpio.c to prevent the interrupt occur before device initialization.
2015-05-08	iio: adc: cc10001: Fix the channel number mapping	Naidu Tellapati
	When some of the ADC channels are reserved for remote CPUs, the scan index and the corresponding channel number doesn't match. This leads to convesion on the incorrect channel during triggered capture. Fix this by using a scan index to channel mapping encoded in the iio_chan_spec for this purpose while starting conversion on a particular ADC channel in trigger handler. Also, the channel_map is not really used anywhere but in probe(), so no need to keep track of it. Remove it from device structure. While here, add 1 to number of channels to register timestamp channel with the IIO core. Fixes: 1664f6a5b0c8 ("iio: adc: Cosmic Circuits 10001 ADC driver") Signed-off-by: Naidu Tellapati <naidu.tellapati@imgtec.com> Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2015-05-08	staging: vt6655: lock MACvWriteBSSIDAddress.	Malcolm Priestley
	This function selects page 1 and cause intermittent problems on interrupt handler. lock call with spin_lock_irqsave. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6655: CARDbUpdateTSF bss timestamp correct tsf counter value.	Malcolm Priestley
	The TSF counter is not set correctly. Use sync_tsf for last beacon value and get tsf local value. Remove qwLocalTSF variable and call CARDbGetCurrentTSF. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6655: vnt_tx_packet Correct TX order of OWNED_BY_NIC	Malcolm Priestley
	The state of m_td0TD0.f1Owner should change after the buff_addr has been filled otherwise the device grabs the buffer too early. m_td0TD0.f1Owner is protected by memory barriers on both sides of change. iTDUsed is best incremented after MACvTransmit. It appears that f1Owner actually polls to do the memory transfer. A back port patch will be needed for v3.19 Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v4.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6655: Fix 80211 control and management status reporting.	Malcolm Priestley
	Currently only TD_FLAGS_NETIF_SKB are reported back to mac80211. Move vnt_int_report_rate to report all frame types. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6655: implement IEEE80211_TX_STAT_NOACK_TRANSMITTED	Malcolm Priestley
	Make use of this macro for non ack frames. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v4.0 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6655: device_free_tx_buf use only ieee80211_tx_status_irqsafe	Malcolm Priestley
	TD_FLAGS_NETIF_SKB is only for data. Fixes issue of ack frames not being reported. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	staging: vt6656: use ieee80211_tx_info to select packet type.	Malcolm Priestley
	Information for packet type is in ieee80211_tx_info band IEEE80211_BAND_5GHZ for PK_TYPE_11A. IEEE80211_TX_RC_USE_CTS_PROTECT via tx_rate flags selects PK_TYPE_11GB This ensures that the packet is always the right type. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Cc: <stable@vger.kernel.org> # v3.17+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	serial: omap: Fix error handling in probe	Semen Protsenko
	There is pm_qos_add_request() being executed on serial_omap_probe(), which stores "&up->pm_qos_request" from omap-serial driver to "pm_qos_array[PM_QOS_CPU_DMA_LATENCY]->constraints". If serial_omap_probe() fails after pm_qos_add_request() (e.g. on uart_add_one_port() call), pm_qos_array still keeping pm_qos_request struct from omap-serial driver, which is not valid anymore (since driver failed). This leads further to kernel crash on pm_qos_update_target(), executing from some completely different driver. We were observing this while trying to run audio playback while having one of omap-serial driver instances failed on uart_add_one_port() call: Unable to handle kernel paging request at virtual address fffffffc Backtrace: (plist_add) from (pm_qos_update_target) (pm_qos_update_target) from (pm_qos_add_request) (pm_qos_add_request) from (snd_pcm_hw_params) (snd_pcm_hw_params) from (snd_pcm_common_ioctl1) (snd_pcm_common_ioctl1) from (snd_pcm_playback_ioctl1) (snd_pcm_playback_ioctl1) from (snd_pcm_playback_ioctl) (snd_pcm_playback_ioctl) from (do_vfs_ioctl) (do_vfs_ioctl) from (SyS_ioctl) (SyS_ioctl) from (ret_fast_syscall) This patch adds pm_qos_remove_request() on fail path in serial_omap_probe() in order to fix this issue. While at it, free the wakeup settings on fail path as well, just like it's done in serial_omap_remove(). Signed-off-by: Semen Protsenko <semen.protsenko@globallogic.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	earlycon: Revert log warnings	Peter Hurley
	Log warnings meant to help diagnose problems setting up earlycon are reporting false positives for 'console='. Revert to the previous behavior which reported nothing. Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-08	drm/tegra: Don't use vblank_disable_immediate on incapable driver.	Mario Kleiner
	Tegra would not only need a hardware vblank counter that increments at leading edge of vblank, but also support for instantaneous high precision vblank timestamp queries, ie. a proper implementation of dev->driver->get_vblank_timestamp(). Without these, there can be off-by-one errors during vblank disable/enable if the scanout is inside vblank at en/disable time, and additionally clients will never see any useable vblank timestamps when querying via drmWaitVblank ioctl. This would negatively affect swap scheduling under X11 and Wayland. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-08	Merge tag 'drm-amdkfd-fixes-2015-05-07' of ↵	Dave Airlie
	git://people.freedesktop.org/~gabbayo/linux into drm-fixes - Add missing initialization of SDMA vm register when creating an SDMA queue - Don't report local memory size, as we don't support local memory allocation yet. - Allow to unregister process with exisiting queues. Until now we blocked it with BUG_ON, which was also an error by itself. * tag 'drm-amdkfd-fixes-2015-05-07' of git://people.freedesktop.org/~gabbayo/linux: drm/amdkfd: Initialize sdma vm when creating sdma queue drm/amdkfd: Don't report local memory size drm/amdkfd: allow unregister process with queues
2015-05-08	Merge branch 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-fixes Mostly stability fixes for UVD and VCE, plus a few other bug and regression fixes. * 'drm-fixes-4.1' of git://people.freedesktop.org/~agd5f/linux: drm/radeon: stop trying to suspend UVD sessions drm/radeon: more strictly validate the UVD codec drm/radeon: make UVD handle checking more strict drm/radeon: make VCE handle check more strict drm/radeon: fix userptr lockup drm/radeon: fix userptr BO unpin bug v3 drm/radeon: don't setup audio on asics that don't support it drm/radeon: disable semaphores for UVD V1 (v2)
2015-05-08	ipc/mqueue: Implement lockless pipelined wakeups	Davidlohr Bueso
	This patch moves the wakeup_process() invocation so it is not done under the info->lock by making use of a lockless wake_q. With this change, the waiter is woken up once it is STATE_READY and it does not need to loop on SMP if it is still in STATE_PENDING. In the timeout case we still need to grab the info->lock to verify the state. This change should also avoid the introduction of preempt_disable() in -rt which avoids a busy-loop which pools for the STATE_PENDING -> STATE_READY change if the waiter has a higher priority compared to the waker. Additionally, this patch micro-optimizes wq_sleep by using the cheaper cousin of set_current_state(TASK_INTERRUPTABLE) as we will block no matter what, thus get rid of the implied barrier. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: George Spelvin <linux@horizon.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Mason <clm@fb.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1430748166.1940.17.camel@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	futex: Implement lockless wakeups	Davidlohr Bueso
	Given the overall futex architecture, any chance of reducing hb->lock contention is welcome. In this particular case, using wake-queues to enable lockless wakeups addresses very much real world performance concerns, even cases of soft-lockups in cases of large amounts of blocked tasks (which is not hard to find in large boxes, using but just a handful of futex). At the lowest level, this patch can reduce latency of a single thread attempting to acquire hb->lock in highly contended scenarios by a up to 2x. At lower counts of nr_wake there are no regressions, confirming, of course, that the wake_q handling overhead is practically non existent. For instance, while a fair amount of variation, the extended pef-bench wakeup benchmark shows for a 20 core machine the following avg per-thread time to wakeup its share of tasks: nr_thr ms-before ms-after 16 0.0590 0.0215 32 0.0396 0.0220 48 0.0417 0.0182 64 0.0536 0.0236 80 0.0414 0.0097 96 0.0672 0.0152 Naturally, this can cause spurious wakeups. However there is no core code that cannot handle them afaict, and furthermore tglx does have the point that other events can already trigger them anyway. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Mason <clm@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: George Spelvin <linux@horizon.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1430494072-30283-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched: Implement lockless wake-queues	Peter Zijlstra
	This is useful for locking primitives that can effect multiple wakeups per operation and want to avoid lock internal lock contention by delaying the wakeups until we've released the lock internal locks. Alternatively it can be used to avoid issuing multiple wakeups, and thus save a few cycles, in packet processing. Queue all target tasks and wakeup once you've processed all packets. That way you avoid waking the target task multiple times if there were multiple packets for the same task. Properties of a wake_q are: - Lockless, as queue head must reside on the stack. - Being a queue, maintains wakeup order passed by the callers. This can be important for otherwise, in scenarios where highly contended locks could affect any reliance on lock fairness. - A queued task cannot be added again until it is woken up. This patch adds the needed infrastructure into the scheduler code and uses the new wake_list to delay the futex wakeups until after we've released the hash bucket locks. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [tweaks, adjustments, comments, etc.] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Mason <clm@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: George Spelvin <linux@horizon.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1430494072-30283-2-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched, timer: Use the atomic task_cputime in thread_group_cputimer	Jason Low
	Recent optimizations were made to thread_group_cputimer to improve its scalability by keeping track of cputime stats without a lock. However, the values were open coded to the structure, causing them to be at a different abstraction level from the regular task_cputime structure. Furthermore, any subsequent similar optimizations would not be able to share the new code, since they are specific to thread_group_cputimer. This patch adds the new task_cputime_atomic data structure (introduced in the previous patch in the series) to thread_group_cputimer for keeping track of the cputime atomically, which also helps generalize the code. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/1430251224-5764-6-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched, timer: Provide an atomic 'struct task_cputime' data structure	Jason Low
	This patch adds an atomic variant of the 'struct task_cputime' data structure, which can be used to store and update task_cputime statistics without needing to do locking. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/1430251224-5764-5-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched, timer: Replace spinlocks with atomics in thread_group_cputimer(), to ↵	Jason Low
	improve scalability While running a database workload, we found a scalability issue with itimers. Much of the problem was caused by the thread_group_cputimer spinlock. Each time we account for group system/user time, we need to obtain a thread_group_cputimer's spinlock to update the timers. On larger systems (such as a 16 socket machine), this caused more than 30% of total time spent trying to obtain this kernel lock to update these group timer stats. This patch converts the timers to 64-bit atomic variables and use atomic add to update them without a lock. With this patch, the percent of total time spent updating thread group cputimer timers was reduced from 30% down to less than 1%. Note: On 32-bit systems using the generic 64-bit atomics, this causes sample_group_cputimer() to take locks 3 times instead of just 1 time. However, we tested this patch on a 32-bit system ARM system using the generic atomics and did not find the overhead to be much of an issue. An explanation for why this isn't an issue is that 32-bit systems usually have small numbers of CPUs, and cacheline contention from extra spinlocks called periodically is not really apparent on smaller systems. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Waiman Long <Waiman.Long@hp.com> Link: http://lkml.kernel.org/r/1430251224-5764-4-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched/numa: Document usages of mm->numa_scan_seq	Jason Low
	The p->mm->numa_scan_seq is accessed using READ_ONCE/WRITE_ONCE and modified without exclusive access. It is not clear why it is accessed this way. This patch provides some documentation on that. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Waiman Long <waiman.long@hp.com> Link: http://lkml.kernel.org/r/1430440094.2475.61.camel@j-VirtualBox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to ↵	Jason Low
	READ_ONCE()/WRITE_ONCE() ACCESS_ONCE doesn't work reliably on non-scalar types. This patch removes the rest of the existing usages of ACCESS_ONCE() in the scheduler, and use the new READ_ONCE() and WRITE_ONCE() APIs as appropriate. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Waiman Long <Waiman.Long@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1430251224-5764-2-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched/core: Remove unnecessary down/up conversion	Nicholas Mc Guire
	'rt_period_us' is automatically type converted from u64 to long and then cast back to u64 - this down/up conversion is unnecessary and can be removed to improve readability. This will also help us not truncate 'rt_period_us' to 32 bits on 32-bit kernels, should we ever have so large values. (unlikely, not the least due to procfs.) Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430643116-24049-1-git-send-email-hofrat@osadl.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	signals, ptrace, sched: Fix a misaligned load inside ptrace_attach()	Palmer Dabbelt
	The misaligned load exception arises when running ptrace_attach() on the RISC-V (which hasn't been upstreamed yet). The problem is that wait_on_bit() takes a void* but then proceeds to call test_bit(), which takes a long*. This allows an int-aligned pointer to be passed to test_bit(), which promptly fails. This will manifest on any other asm-generic port where unaligned loads trap, where sizeof(long) > sizeof(int), and where task_struct.jobctl ends up not being long-aligned. This patch changes task_struct.jobctl to be a long, which ensures it has the correct alignment. Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bobby.prani@gmail.com Cc: oleg@redhat.com Cc: paulmck@linux.vnet.ibm.com Cc: richard@nod.at Cc: vdavydov@parallels.com Link: http://lkml.kernel.org/r/1430453997-32459-2-git-send-email-palmer@dabbelt.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched/wait: Change wait_on_bit() to take an unsigned long , not a void *	Palmer Dabbelt
	The implementations of wait_on_bit() will only work with long-aligned memory on systems that don't support misaligned loads and stores. This patch changes the function prototypes to ensure that the compiler will enforce alignment. Running make defconfig make KFLAGS="-Werror" seems to indicate that, as of c56fb6564dcd ("Fix a misaligned load inside ptrace_attach()"), there are now no users of non-long-aligned calls to wait_on_bit(). I additionally tried a few "make randconfig" attempts, none of which failed to compile for this reason. Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bobby.prani@gmail.com Cc: oleg@redhat.com Cc: paulmck@linux.vnet.ibm.com Cc: richard@nod.at Cc: vdavydov@parallels.com Link: http://lkml.kernel.org/r/1430453997-32459-3-git-send-email-palmer@dabbelt.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	signals, sched: Change all uses of JOBCTL_* from 'int' to 'long'	Palmer Dabbelt
	c56fb6564dcd ("Fix a misaligned load inside ptrace_attach()") makes jobctl an "unsigned long". It makes sense to have the masks applied to it match that type. This is currently just a cosmetic change, but it will prevent the mask from being unexpectedly truncated if we ever end up with masks with more bits. One instance of "signr" is an int, but I left this alone because the mask ensures that it will never overflow. Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bobby.prani@gmail.com Cc: oleg@redhat.com Cc: paulmck@linux.vnet.ibm.com Cc: richard@nod.at Cc: vdavydov@parallels.com Link: http://lkml.kernel.org/r/1430453997-32459-4-git-send-email-palmer@dabbelt.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched: Move the loadavg code to a more obvious location	Peter Zijlstra
	I could not find the loadavg code.. turns out it was hidden in a file called proc.c. It further got mingled up with the cruft per rq load indexes (which we really want to get rid of). Move the per rq load indexes into the fair.c load-balance code (that's the only thing that uses them) and rename proc.c to loadavg.c so we can find it again. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> [ Did minor cleanups to the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	Merge branch 'sched/urgent' into sched/core, before applying new patches	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	perf/x86/intel: Fix SLM cache event list	Kan Liang
	iTLB-load-misses and LLC-load-misses count incorrectly on SLM. There is no ITLB.MISSES support on SLM. Event PAGE_WALKS.I_SIDE_WALK should be used to count iTLB-load-misses. This event counts when an instruction (I) page walk is completed or started. Since a page walk implies a TLB miss, the number of TLB misses can be counted by counting the number of pagewalks. DMND_DATA_RD counts both demand and DCU prefetch data reads. However, LLC-load-misses should only count demand reads. There is no way to not include prefetches with a single counter on SLM. So the LLC-load-misses support should be removed on SLM. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1429608881-5055-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	perf: Annotate inherited event ctx->mutex recursion	Peter Zijlstra
	While fuzzing Sasha tripped over another ctx->mutex recursion lockdep splat. Annotate this. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched/core: Remove __cpuinit section tag that crept back in	Paul Gortmaker
	We removed __cpuinit support (leaving no-op stubs) quite some time ago. However this one crept back in as of commit a803f0261bb2bb57aab ("sched: Initialize rq->age_stamp on processor start") Since we want to clobber the stubs too, get this removed now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Corey Minyard <cminyard@mvista.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1430174880-27958-2-git-send-email-paul.gortmaker@windriver.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched/core: Fix regression in cpuset_cpu_inactive() for suspend	Omar Sandoval
	Commit 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()"), a SCHED_DEADLINE bugfix, had a logic error that caused a regression in setting a CPU inactive during suspend. I ran into this when a program was failing pthread_setaffinity_np() with EINVAL after a suspend+wake up. A simple reproducer: $ ./a.out sched_setaffinity: Success $ systemctl suspend $ ./a.out sched_setaffinity: Invalid argument ... where ./a.out is: #define _GNU_SOURCE #include <errno.h> #include <sched.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> int main(void) { long num_cores; cpu_set_t cpu_set; int ret; num_cores = sysconf(_SC_NPROCESSORS_ONLN); CPU_ZERO(&cpu_set); CPU_SET(num_cores - 1, &cpu_set); errno = 0; ret = sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set); perror("sched_setaffinity"); return ret ? EXIT_FAILURE : EXIT_SUCCESS; } The mistake is that suspend is handled in the action == CPU_DOWN_PREPARE_FROZEN case of the switch statement in cpuset_cpu_inactive(). However, the commit in question masked out CPU_TASKS_FROZEN from the action, making this case dead. The fix is straightforward. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 3c18d447b3b3 ("sched/core: Check for available DL bandwidth in cpuset_cpu_inactive()") Link: http://lkml.kernel.org/r/1cb5ecb3d6543c38cce5790387f336f54ec8e2bc.1430733960.git.osandov@osandov.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	sched: Handle priority boosted tasks proper in setscheduler()	Thomas Gleixner
	Ronny reported that the following scenario is not handled correctly: T1 (prio = 10) lock(rtmutex); T2 (prio = 20) lock(rtmutex) boost T1 T1 (prio = 20) sys_set_scheduler(prio = 30) T1 prio = 30 .... sys_set_scheduler(prio = 10) T1 prio = 30 The last step is wrong as T1 should now be back at prio 20. Commit c365c292d059 ("sched: Consider pi boosting in setscheduler()") only handles the case where a boosted tasks tries to lower its priority. Fix it by taking the new effective priority into account for the decision whether a change of the priority is required. Reported-by: Ronny Meeus <ronny.meeus@gmail.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Fixes: c365c292d059 ("sched: Consider pi boosting in setscheduler()") Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08	Merge branch 'sched/urgent' into sched/core	Ingo Molnar
	So this isn't really a fix but a cleanup that can wait for v4.2. Signed-off-by: Ingo Molnar <mingo@kernel.org>