summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-09-21Btrfs: actually log directory we are fsync()'ingJosef Bacik
If you just create a directory and then fsync that directory and then pull the power plug you will come back up and the directory will not be there. That is because we won't actually create directories if we've logged files inside of them since they will be created on replay, but in this check we will set our logged_trans of our current directory if it happens to be a directory, making us think it doesn't need to be logged. Fix the logic to only do this to parent directories. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Btrfs: actually limit the size of delalloc rangeJosef Bacik
So forever we have had this thing to limit the amount of delalloc pages we'll setup to be written out to 128mb. This is because we have to lock all the pages in this range, so anything above this gets a bit unweildly, and also without a limit we'll happily allocate gigantic chunks of disk space. Turns out our check for this wasn't quite right, we wouldn't actually limit the chunk we wanted to write out, we'd just stop looking for more space after we went over the limit. So if you do a giant 20gb dd on my box with lots of ram I could get 2gig extents. This is fine normally, except when you go to relocate these extents and we can't find enough space to relocate these moster extents, since we have to be able to allocate exactly the same sized extent to move it around. So fix this by actually enforcing the limit. With this patch I'm no longer seeing giant 1.5gb extents. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Btrfs: allocate the free space by the existed max extent size when ENOSPCMiao Xie
By the current code, if the requested size is very large, and all the extents in the free space cache are small, we will waste lots of the cpu time to cut the requested size in half and search the cache again and again until it gets down to the size the allocator can return. In fact, we can know the max extent size in the cache after the first search, so we needn't cut the size in half repeatedly, and just use the max extent size directly. This way can save lots of cpu time and make the performance grow up when there are only fragments in the free space cache. According to my test, if there are only 4KB free space extents in the fs, and the total size of those extents are 256MB, we can reduce the execute time of the following test from 5.4s to 1.4s. dd if=/dev/zero of=<testfile> bs=1MB count=1 oflag=sync Changelog v2 -> v3: - fix the problem that we skip the block group with the space which is less than we need. Changelog v1 -> v2: - address the problem that we return a wrong start position when searching the free space in a bitmap. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21btrfs: add lockdep and tracing annotations for uuid treeDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21btrfs: show compiled-in config features at module load timeStefan Behrens
We want to know if there are debugging features compiled in, this may affect performance. The message is printed before the sanity checks. (This commit message is a copy of David Sterba's commit message when he introduced btrfs_print_info()). Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Btrfs: more efficient inode tree replace operationFilipe David Borba Manana
Instead of removing the current inode from the red black tree and then add the new one, just use the red black tree replace operation, which is more efficient. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Btrfs: do not add replace target to the alloc_listIlya Dryomov
If replace was suspended by the umount, replace target device is added to the fs_devices->alloc_list during a later mount. This is obviously wrong. ->is_tgtdev_for_dev_replace is supposed to guard against that, but ->is_tgtdev_for_dev_replace is (and can only ever be) initialized *after* everything is opened and fs_devices lists are populated. Fix this by checking the devid instead: for replace targets it's always equal to BTRFS_DEV_REPLACE_DEVID. Cc: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Btrfs: fixup error handling in btrfs_reloc_cowJosef Bacik
If we failed to actually allocate the correct size of the extent to relocate we will end up in an infinite loop because we won't return an error, we'll just move on to the next extent. So fix this up by returning an error, and then fix all the callers to return an error up the stack rather than BUG_ON()'ing. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-09-21Merge tag 'v3.11' into for-linusChris Mason
Linux 3.11
2013-09-21iio:buffer_cb: Add missing iio_buffer_init()Lars-Peter Clausen
Make sure to properly initialize the IIO buffer data structure. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Prevent race between IIO chardev opening and IIO device freeLars-Peter Clausen
Set the IIO device as the parent for the character device We need to make sure that the IIO device is not freed while the character device exists, otherwise the freeing of the IIO device might race against the file open callback. Do this by setting the character device's parent to the IIO device, this will cause the character device to grab a reference to the IIO device and only release it once the character device itself has been removed. Also move the registration of the character device before the registration of the IIO device to avoid the (rather theoretical case) that the IIO device is already freed again before we can add the character device and grab a reference to the IIO device. We also need to move the call to cdev_del() from iio_dev_release() to iio_device_unregister() (where it should have been in the first place anyway) to avoid a reference cycle. As iio_dev_release() is only called once all reference are dropped, but the character device holds a reference to the IIO device. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: fix: Keep a reference to the IIO device for open file descriptorsLars-Peter Clausen
Make sure that the IIO device is not freed while we still have file descriptors for it. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Stop sampling when the device is removedLars-Peter Clausen
Make sure to stop sampling when the device is removed, otherwise it will continue to sample forever. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Fix crash when scan_bytes is computed with active_scan_mask == NULLPeter Meerwald
if device has available_scan_masks set and the buffer is enabled without any scan_elements enabled, in a NULL pointer is dereferenced in iio_compute_scan_bytes() [ 18.993713] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 19.002593] pgd = debd4000 [ 19.005432] [00000000] *pgd=9ebc0831, *pte=00000000, *ppte=00000000 [ 19.012329] Internal error: Oops: 17 [#1] PREEMPT ARM [ 19.017639] Modules linked in: [ 19.020843] CPU: 0 Not tainted (3.9.11-00036-g75c888a-dirty #207) [ 19.027587] PC is at _find_first_bit_le+0xc/0x2c [ 19.032440] LR is at iio_compute_scan_bytes+0x2c/0xf4 [ 19.037719] pc : [<c021dc60>] lr : [<c03198d0>] psr: 200d0013 [ 19.037719] sp : debd9ed0 ip : 00000000 fp : 000802bc [ 19.049713] r10: 00000000 r9 : 00000000 r8 : deb67250 [ 19.055206] r7 : 00000000 r6 : 00000000 r5 : 00000000 r4 : deb67000 [ 19.062011] r3 : de96ec00 r2 : 00000000 r1 : 00000004 r0 : 00000000 [ 19.068847] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 19.076324] Control: 10c5387d Table: 9ebd4019 DAC: 00000015 problem is the rollback code in iio_update_buffers(), old_mask may be NULL (e.g. on first call) I'm not too confident about the fix; works for me... Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Fix mcp4725 dev-to-indio_dev conversion in suspend/resumePeter Meerwald
dev_to_iio_dev() is a false friend Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Fix bma180 dev-to-indio_dev conversion in suspend/resumePeter Meerwald
dev_to_iio_dev() is a false friend Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Cc: Oleksandr Kravchenko <o.v.kravchenko@globallogic.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21iio: Fix tmp006 dev-to-indio_dev conversion in suspend/resumePeter Meerwald
dev_to_iio_dev() is a false friend Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-20Merge tag 'pm+acpi-3.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: 1) Four fixes for cpufreq regressions introduced by the changes that removed Device Tree parsing for CPU device nodes from cpufreq drivers from Sudeep KarkadaNagesha. 2) Two fixes for recent cpufreq regressions introduced by changes related to the preservation of sysfs attributes over system suspend/resume cycles from Viresh Kumar. 3) Fix for ACPI-based wakeup signaling in the PCI subsystem that fails to stop PME polling for devices put into the D3cold power state from Rafael J Wysocki. 4) Fix for bad interactions between cpufreq and udev on systems supporting intel_pstate where acpi-cpufreq is available as well from Yinghai Lu. * tag 'pm+acpi-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: return EEXIST instead of EBUSY for second registering PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup ARM: shmobile: change dev_id to cpu0 while registering cpu clock ARM: i.MX: change dev_id to cpu0 while registering cpu clock cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device cpufreq: unlock correct rwsem while updating policy->cpu cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
2013-09-20Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull vhost updates from Michael Tsirkin: "vhost: minor changes on top of 3.12-rc1 This fixes module loading for vhost-scsi, and tweaks locking in vhost core a bit. Both of these are not exactly release blockers but it's early in the cycle so I think it's a good idea to apply them now" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-scsi: whitespace tweak vhost/scsi: use vmalloc for order-10 allocation vhost: wake up worker outside spin_lock
2013-09-20CacheFiles: Don't try to dump the index key if the cookie has been clearedDavid Howells
Don't try to dump the index key that distinguishes an object if netfs data in the cookie the object refers to has been cleared (ie. the cookie has passed most of the way through __fscache_relinquish_cookie()). Since the netfs holds the index key, we can't get at it once the ->def and ->netfs_data pointers have been cleared - and a NULL pointer exception will ensue, usually just after a: CacheFiles: Error: Unexpected object collision error is reported. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-20CacheFiles: Fix memory leak in cachefiles_check_auxdata error pathsJosh Boyer
In cachefiles_check_auxdata(), we allocate auxbuf but fail to free it if we determine there's an error or that the data is stale. Further, assigning the output of vfs_getxattr() to auxbuf->len gives problems with checking for errors as auxbuf->len is a u16. We don't actually need to set auxbuf->len, so keep the length in a variable for now. We shouldn't need to check the upper limit of the buffer as an overflow there should be indicated by -ERANGE. While we're at it, fscache_check_aux() returns an enum value, not an int, so assign it to an appropriately typed variable rather than to ret. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Hongyi Jia <jiayisuse@gmail.com> cc: Milosz Tanski <milosz@adfin.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-20lockref: use cmpxchg64 explicitly for lockless updatesWill Deacon
The cmpxchg() function tends not to support 64-bit arguments on 32-bit architectures. This could be either due to use of unsigned long arguments (like on ARM) or lack of instruction support (cmpxchgq on x86). However, these architectures may implement a specific cmpxchg64() function to provide 64-bit cmpxchg support instead. Since the lockref code requires a 64-bit cmpxchg and relies on the architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64 instead of cmpxchg and allow 32-bit architectures to make use of the lockless lockref implementation. Cc: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-20dm mpath: disable WRITE SAME if it failsMike Snitzer
Workaround the SCSI layer's problematic WRITE SAME heuristics by disabling WRITE SAME in the DM multipath device's queue_limits if an underlying device disabled it. The WRITE SAME heuristics, with both the original commit 5db44863b6eb ("[SCSI] sd: Implement support for WRITE SAME") and the updated commit 66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling WRITE SAME(10) even without successfully determining it is supported. After the first failed WRITE SAME the SCSI layer will disable WRITE SAME for the device (by setting sdkp->device->no_write_same which results in 'max_write_same_sectors' in device's queue_limits to be set to 0). When a device is stacked ontop of such a SCSI device any changes to that SCSI device's queue_limits do not automatically propagate up the stack. As such, a DM multipath device will not have its WRITE SAME support disabled. This causes the block layer to continue to issue WRITE SAME requests to the mpath device which causes paths to fail and (if mpath IO isn't configured to queue when no paths are available) it will result in actual IO errors to the upper layers. This fix doesn't help configurations that have additional devices stacked ontop of the mpath device (e.g. LVM created linear DM devices ontop). A proper fix that restacks all the queue_limits from the bottom of the device stack up will need to be explored if SCSI will continue to use this model of optimistically allowing op codes and then disabling them after they fail for the first time. Before this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. device-mapper: multipath: Failing path 8:112. end_request: I/O error, dev dm-6, sector 4616 dm-6: WRITE SAME failed. Manually zeroing. end_request: I/O error, dev dm-6, sector 4616 end_request: I/O error, dev dm-6, sector 5640 end_request: I/O error, dev dm-6, sector 6664 end_request: I/O error, dev dm-6, sector 7688 end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. end_request: I/O error, dev dm-6, sector 524296 Aborting journal on device dm-6-8. end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 33553920 After this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 0 It should be noted that WRITE SAME support wasn't enabled in DM multipath until v3.10. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org # 3.10+
2013-09-20dm-snapshot: fix performance degradation due to small hash sizeMikulas Patocka
LVM2, since version 2.02.96, creates origin with zero size, then loads the snapshot driver and then loads the origin. Consequently, the snapshot driver sees the origin size zero and sets the hash size to the lower bound 64. Such small hash table causes performance degradation. This patch changes it so that the hash size is determined by the size of snapshot volume, not minimum of origin and snapshot size. It doesn't make sense to set the snapshot size significantly larger than the origin size, so we do not need to take origin size into account when calculating the hash size. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2013-09-20dm snapshot: workaround for a false positive lockdep warningMikulas Patocka
The kernel reports a lockdep warning if a snapshot is invalidated because it runs out of space. The lockdep warning was triggered by commit 0976dfc1d0cd80a4e9dfaf87bd87 ("workqueue: Catch more locking problems with flush_work()") in v3.5. The warning is false positive. The real cause for the warning is that the lockdep engine treats different instances of md->lock as a single lock. This patch is a workaround - we use flush_workqueue instead of flush_work. This code path is not performance sensitive (it is called only on initialization or invalidation), thus it doesn't matter that we flush the whole workqueue. The real fix for the problem would be to teach the lockdep engine to treat different instances of md->lock as separate locks. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.5+
2013-09-20Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: cpufreq: return EEXIST instead of EBUSY for second registering ARM: shmobile: change dev_id to cpu0 while registering cpu clock ARM: i.MX: change dev_id to cpu0 while registering cpu clock cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device cpufreq: unlock correct rwsem while updating policy->cpu cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
2013-09-20Merge branch 'acpi-pci'Rafael J. Wysocki
* acpi-pci: PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup
2013-09-20Merge tag 'arm64-stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull ARM64 fixes from Catalin Marinas: - Compat register fault reporting fix - Documentation clarification on tagged pointers - hwcap widened to 64-bit (user space already reading it as 64-bit) * tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: arm64: Widen hwcap to be 64 bit arm64: Correctly report LR and SP for compat tasks arm64: documentation: tighten up tagged pointer documentation arm64: Make do_bad_area() function static
2013-09-20sched/balancing: Fix cfs_rq->task_h_load calculationVladimir Davydov
Patch a003a2 (sched: Consider runnable load average in move_tasks()) sets all top-level cfs_rqs' h_load to rq->avg.load_avg_contrib, which is always 0. This mistype leads to all tasks having weight 0 when load balancing in a cpu-cgroup enabled setup. There obviously should be sum of weights of all runnable tasks there instead. Fix it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reviewed-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379173186-11944-1-git-send-email-vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in ↵Vladimir Davydov
fix_small_imbalance() In busiest->group_imb case we can come to fix_small_imbalance() with local->avg_load > busiest->avg_load. This can result in wrong imbalance fix-up, because there is the following check there where all the members are unsigned: if (busiest->avg_load - local->avg_load + scaled_busy_load_per_task >= (scaled_busy_load_per_task * imbn)) { env->imbalance = busiest->load_per_task; return; } As a result we can end up constantly bouncing tasks from one cpu to another if there are pinned tasks. Fix it by substituting the subtraction with an equivalent addition in the check. [ The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. ] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ef167822e5c5b2d96cf5b0e3e4f4bdff3f0414a2.1379252740.git.vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20sched/balancing: Fix 'local->avg_load > sds->avg_load' case in ↵Vladimir Davydov
calculate_imbalance() In busiest->group_imb case we can come to calculate_imbalance() with local->avg_load >= busiest->avg_load >= sds->avg_load. This can result in imbalance overflow, because it is calculated as follows env->imbalance = min( max_pull * busiest->group_power, (sds->avg_load - local->avg_load) * local->group_power) / SCHED_POWER_SCALE; As a result we can end up constantly bouncing tasks from one cpu to another if there are pinned tasks. Fix this by skipping the assignment and assuming imbalance=0 in case local->avg_load > sds->avg_load. [ The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. ] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/8f596cc6bc0e5e655119dc892c9bfcad26e971f4.1379252740.git.vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20arm64: Widen hwcap to be 64 bitSteve Capper
Under arm64 elf_hwcap is a 32 bit quantity, but it is stored in a 64 bit auxiliary ELF field and glibc reads hwcap as 64 bit. This patch widens elf_hwcap to be 64 bit. Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-09-20arm64: Correctly report LR and SP for compat tasksCatalin Marinas
When a task crashes and we print debugging information, ensure that compat tasks show the actual AArch32 LR and SP registers rather than the AArch64 ones. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-09-20arm64: documentation: tighten up tagged pointer documentationWill Deacon
Commit d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0") added support for tagged pointers in userspace, but the corresponding update to Documentation/ contained some imprecise statements. This patch fixes up some minor ambiguities in the text, hopefully making it more clear about exactly what the kernel expects from user virtual addresses. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-09-20arm64: Make do_bad_area() function staticCatalin Marinas
This function is only called from arch/arm64/mm/fault.c. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-09-20Merge tag 'efi-urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent Pull EFI fix from Matt Flemin: " * Fix WARNING on i386 by only enabling a workaround for x86-64 since we've never encountered the bug on i386 - Josh Boyer " Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'Peter Zijlstra
Solve the problems around the broken definition of perf_event_mmap_page:: cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially fixed by: 860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'") The problem with the fix (merged in v3.12-rc1 and not yet released officially), noticed by Vince Weaver is that the new behavior is not detectable by new user-space, and that due to the reuse of the field names it's easy to mis-compile a binary if old headers are used on a new kernel or new headers are used on an old kernel. To solve all that make this change explicit, detectable and self-contained, by iterating the ABI the following way: - Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not confuse old user-space binaries. RDPMC will be marked as unavailable to old binaries but that's within the ABI, this is a capability bit. - Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new libraries can reliably detect that bit 0 is deprecated and perma-zero without having to check the kernel version. - Use bits 2, 3, 4 for the newly defined, correct functionality: cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts */ cap_user_time : 1, /* The time_* fields are used */ cap_user_time_zero : 1, /* The time_zero field is used */ - Rename all the bitfield names in perf_event.h to be different from the old names, to make sure it's not possible to mis-compile it accidentally with old assumptions. The 'size' field can then be used in the future to add new fields and it will act as a natural ABI version indicator as well. Also adjust tools/perf/ userspace for the new definitions, noticed by Adrian Hunter. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Also-Fixed-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20perf/x86/intel/uncore: Don't use smp_processor_id() in validate_group()Yan, Zheng
uncore_validate_group() can't call smp_processor_id() because it is in preemptible context. Pass NUMA_NO_NODE to the allocator instead. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379400493-11505-1-git-send-email-zheng.z.yan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20perf: Update ABI commentPeter Zijlstra
For some mysterious reason the sample_id field of PERF_RECORD_MMAP went AWOL. Reported-by: Vince Weaver <vince@deater.net> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Check for SIGINT in more loops, allowing tools such as 'perf report' to react faster to Ctrl+C, from Arnaldo Carvalho de Melo. * Fix objdump line parsing offset validation in the annotate code, from Adrian Hunter. * Fix buildid cache handling of kallsyms with kcore, from Adrian Hunter. * Fix compile with libelf without get_phdrnum, from Adrian Hunter. * Sharpen the libaudit dependencies test, refusing to build with older libraries that doesn't have all the functions used by 'perf trace", fix from Ingo Molnar. * Fill in new definitions for madvise()/mmap() flags to fix the build in older systems, from Ingo Molnar. * Fix old GCC build error in older systems in the kallsyms parsing code in trace-event-parse.c, from Ingo Molnar. * Ignore DWARF declaration tags, allowing, for instance, that the $ perf probe -L getname command succeeds in showing the source code for the 'getname' kernel function, telling in which lines probes can be inserted, fix from Masami Hiramatsu. * Fix linux/magic.h related build breakage in some systems, fix from Vinson Lee. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-19Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "A set of fixes for ARM platforms for 3.12. Among them: - A fix for build breakage in the MTD subsystem for some PXA devices. David Woodhouse has this patch in his for-next branch but has not been responding to our requests to send it up so here it is. I should have amended the commit message to describe the build failure for CONFIG_OF=n setups, but forgot and now it's down in the stack of commits. - Added device-tree for the BeagleBone Black. Turns out people have been using the older "regualar" bone DT for the newer boards, and there's risk of damaging hardware that way. - Misc DT and regular fixes for OMAP. - Fix to make the ST-Ericsson "snowball" boards boot with multi_v7_defconfig, and enable one of the ST-E reference boards on the same config. - Kconfig cleanup for u300 to hide submenus when the platform isn't enabled. - Enable ARM_ATAG_DTB_COMPAT to let firmware override command line when booting with an appended devicetree on non-DT-enabled firmware (needed to boot snowball)" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (26 commits) ARM: multi_v7: add HREFv60 to multi_v7 defconfig ARM: OMAP2+: mux: fix trivial typo in name ARM: OMAP4 SMP: Corrected a typo fucntions to functions ARM: OMAP4: cpuidle: fix: call cpu_cluster_pm_exit conditionally mailbox: remove unnecessary platform_set_drvdata() ARM: mach-omap2: gpmc: Fix warning when CONFIG_ARM_LPAE=y ARM: OMAP: fix return value check in omap_device_build_from_dt() ARM: OMAP4: Fix clock_get error for GPMC during boot ARM: sa1100: collie.c: fall back to jedec_probe flash detection ARM: u300: hide submenus ARM: dts: igep00x0: Add pinmux configuration for MCBSP2 ARM: dts: Fix muxing and regulator for wl12xx on the SDIO bus for blaze ARM: dts: Fix muxing and regulator for wl12xx on the SDIO bus for pandaboard mtd: nand: pxa3xx: Remove unneeded ifdef CONFIG_OF ARM: multi_v7_defconfig: enable ARM_ATAG_DTB_COMPAT ARM: ux500: disable outer cache debug ARM: dts: OMAP5: fix ocp2scp DTS data ARM: dts: OMAP5: fix reg property size ARM: dts: am335x-bone*: add DT for BeagleBone Black ARM: dts: omap3-beagle-xm: fix string error in compatible property ...
2013-09-20Merge branch 'msm-fixes-3.12' of ↵Dave Airlie
git://people.freedesktop.org/~robclark/linux into drm-fixes A couple small msm fixes. Plus drop of set_need_resched(). * 'msm-fixes-3.12' of git://people.freedesktop.org/~robclark/linux: drm/msm: drop unnecessary set_need_resched() drm/msm: fix potential NULL pointer dereference drm/msm: workaround for missing irq drm/msm: return -EBUSY if bo still active drm/msm: fix return value check in ERR_PTR() drm/msm: fix cmdstream size check drm/msm: hangcheck harder drm/msm: handle read vs write fences
2013-09-20Merge branch 'exynos-drm-fixes' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Just small fixes, and code cleanups. * 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos: drm/exynos: fix return value check in lowlevel_buffer_allocate() drm/exynos: Fix address space warnings in exynos_drm_fbdev.c drm/exynos: Fix address space warning in exynos_drm_buf.c drm/exynos: Remove redundant OF dependency
2013-09-20Merge tag 'drm-intel-fixes-2013-09-19' of ↵Dave Airlie
git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Some more dealock fixes around pageflips and gpu hangs, fixes for hsw hangs when doing modesets/dpms. And a few minor things to rectify issues with our modeset state tracking which the checker spotted. * tag 'drm-intel-fixes-2013-09-19' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Don't enable the cursor on a disable pipe drm/i915: do not update cursor in crtc mode set drm/i915: kill set_need_resched drm/i915/dvo: set crtc timings again for panel fixed modes drm/i915/sdvo: Robustify the dtd<->drm_mode conversions drm/i915/sdvo: Fully translate sync flags in the dtd->mode conversion drm/i915: Use proper print format for debug prints drm/i915: fix wait_for_pending_flips vs gpu hang deadlock drm/i915: Track pfit enable state separately from size
2013-09-20cpufreq: return EEXIST instead of EBUSY for second registeringYinghai Lu
On systems that support intel_pstate, acpi_cpufreq fails to load, and udev keeps trying until trace gets filled up and kernel crashes. The root cause is driver return ret from cpufreq_register_driver(), because when some other driver takes over before, it will return EBUSY and then udev will keep trying ... cpufreq_register_driver() should return EEXIST instead so that the system can boot without appending intel_pstate=disable and still use intel_pstate. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-20Revert "drm: mark context support as a legacy subsystem"Dave Airlie
This reverts commit 7c510133d93dd6f15ca040733ba7b2891ed61fd1. Well looks like not enough digging was done, libdrm_nouveau before 2.4.33 used contexts, 292da616fe1f936ca78a3fa8e1b1b19883e343b6 nouveau: pull in major libdrm rewrite got rid of them, Reported-by: Paul Zimmerman <Paul.Zimmerman@synopsys.com> Reported-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-09-20PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeupRafael J. Wysocki
Commit 448bd85 (PCI/PM: add PCIe runtime D3cold support) added a piece of code to pci_acpi_wake_dev() causing that function to behave in a special way for devices in D3cold (so that their configuration registers are not accessed before those devices are resumed). However, it didn't take the clearing of the pme_poll flag into account. That has to be done for all devices, even if they are in D3cold, or pci_pme_list_scan() will not know that wakeup has been signaled for the device and will poll its PME Status bit unnecessarily. Fix the problem by moving the clearing of the pme_poll flag in pci_acpi_wake_dev() before the code introduced by commit 448bd85. Reported-and-tested-by: David E. Box <david.e.box@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
2013-09-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) If the local_df boolean is set on an SKB we have to allocate a unique ID even if IP_DF is set in the ipv4 headers, from Ansis Atteka. 2) Some fixups for the new chipset support that went into the sfc driver, from Ben Hutchings. 3) Because SCTP bypasses a good chunk of, and actually duplicates, the logic of the ipv6 output path, some IPSEC things don't get done properly. Integrate SCTP better into the ipv6 output path so that these problems are fixed and such issues don't get missed in the future either. From Daniel Borkmann. 4) Fix skge regressions added by the DMA mapping error return checking added in v3.10, from Mikulas Patocka. 5) Kill some more IRQF_DISABLED references, from Michael Opdenacker. 6) Fix races and deadlocks in the bridging code, from Hong Zhiguo. 7) Fix error handling in tun_set_iff(), in particular don't leak resources. From Jason Wang. 8) Prevent format-string injection into xen-netback driver, from Kees Cook. 9) Fix regression added to netpoll ARP packet handling, in particular check for the right ETH_P_ARP protocol code. From Sonic Zhang. 10) Try to deal with AMD IOMMU errors when using r8169 chips, from Francois Romieu. 11) Cure freezes due to recent changes in the rt2x00 wireless driver, from Stanislaw Gruszka. 12) Don't do SPI transfers (which can sleep) in interrupt context in cw1200 driver, from Solomon Peachy. 13) Fix LEDs handling bug in 5720 tg3 chips already handled for 5719. From Nithin Sujir. 14) Make xen_netbk_count_skb_slots() count the actual number of slots that will be used, taking into consideration packing and other issues that the transmit path will run into. From David Vrabel. 15) Use the correct maximum age when calculating the bridge message_age_timer, from Chris Healy. 16) Get rid of memory leaks in mcs7780 IRDA driver, from Alexey Khoroshilov. 17) Netfilter conntrack extensions were converted to RCU but are not always freed properly using kfree_rcu(). Fix from Michal Kubecek. 18) VF reset recovery not being done correctly in qlcnic driver, from Manish Chopra. 19) Fix inverted test in ATM nicstar driver, from Andy Shevchenko. 20) Missing workqueue destroy in cxgb4 error handling, from Wei Yang. 21) Internal switch not initialized properly in bgmac driver, from Rafał Miłecki. 22) Netlink messages report wrong local and remote addresses in IPv6 tunneling, from Ding Zhi. 23) ICMP redirects should not generate socket errors in DCCP and SCTP. We're still working out how this should be handled for RAW and UDP sockets. From Daniel Borkmann and Duan Jiong. 24) We've had several bugs wherein the network namespace's loopback device gets accessed after it is free'd, NULL it out so that we can catch these problems more readily. From Eric W Biederman. 25) Fix regression in TCP RTO calculations, from Neal Cardwell. 26) Fix too early free of xen-netback network device when VIFs still exist. From Paul Durrant. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) netconsole: fix a deadlock with rtnl and netconsole's mutex netpoll: fix NULL pointer dereference in netpoll_cleanup skge: fix broken driver ip: generate unique IP identificator if local fragmentation is allowed ip: use ip_hdr() in __ip_make_skb() to retrieve IP header xen-netback: Don't destroy the netdev until the vif is shut down net:dccp: do not report ICMP redirects to user space cnic: Fix crash in cnic_bnx2x_service_kcq() bnx2x, cnic, bnx2i, bnx2fc: Fix bnx2i and bnx2fc regressions. vxlan: Avoid creating fdb entry with NULL destination tcp: fix RTO calculated from cached RTT drivers: net: phy: cicada.c: clears warning Use #include <linux/io.h> instead of <asm/io.h> net loopback: Set loopback_dev to NULL when freed batman-adv: set the TAG flag for the vid passed to BLA netfilter: nfnetlink_queue: use network skb for sequence adjustment net: sctp: rfc4443: do not report ICMP redirects to user space net: usb: cdc_ether: use usb.h macros whenever possible net: usb: cdc_ether: fix checkpatch errors and warnings net: usb: cdc_ether: Use wwan interface for Telit modules ip6_tunnels: raddr and laddr are inverted in nl msg ...
2013-09-19netconsole: fix a deadlock with rtnl and netconsole's mutexNikolay Aleksandrov
This bug was introduced by commit 7a163bfb7ce50895bbe67300ea610d31b9c09230 ("netconsole: avoid a crash with multiple sysfs writers"). In store_enabled() we have the following sequence: acquire nt->mutex then rtnl, but in the netconsole netdev notifier we have rtnl then nt->mutex effectively leading to a deadlock. The NULL pointer dereference that the above commit tries to fix is actually due to another bug in netpoll_cleanup(). This is fixed by dropping the mutex from the netdev notifier as it's already protected by rtnl. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-19netpoll: fix NULL pointer dereference in netpoll_cleanupNikolay Aleksandrov
I've been hitting a NULL ptr deref while using netconsole because the np->dev check and the pointer manipulation in netpoll_cleanup are done without rtnl and the following sequence happens when having a netconsole over a vlan and we remove the vlan while disabling the netconsole: CPU 1 CPU2 removes vlan and calls the notifier enters store_enabled(), calls netdev_cleanup which checks np->dev and then waits for rtnl executes the netconsole netdev release notifier making np->dev == NULL and releases rtnl continues to dereference a member of np->dev which at this point is == NULL Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>