summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-04-15Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/urgent Pull perf/urgent fixes from Jiri Olsa: * Instead of redirecting flex output, use -o (Cody P Schafer) * Fix double free in perf test 21 (Adrian Hunter) Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-15ipv4: add a sock pointer to ip_queue_xmit()Eric Dumazet
ip_queue_xmit() assumes the skb it has to transmit is attached to an inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership") changed l2tp to not change skb ownership and thus broke this assumption. One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(), so that we do not assume skb->sk points to the socket used by l2tp tunnel. Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") Reported-by: Zhan Jianyu <nasa4836@gmail.com> Tested-by: Zhan Jianyu <nasa4836@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15xen/manage: Poweroff forcefully if user-space is not yet up.Konrad Rzeszutek Wilk
The user can launch the guest in this sequence: xl create -p /vm.cfg [launch, but pause it] xl shutdown latest [sets control/shutdown=poweroff] xl unpause latest xl console latest [and see that the guest has completely ignored the shutdown request] In reality the guest hasn't ignored it. It registers a watch and gets a notification that there is value. It then calls the shutdown_handler which ends up calling orderly_shutdown. Unfortunately that is so early in the bootup that there are no user-space. Which means that the orderly_shutdown fails. But since the force flag was set to false it continues on without reporting. What we really want to is to use the force when we are in the SYSTEM_BOOTING state and not use the 'force' when SYSTEM_RUNNING. However, if we are in the running state - and the shutdown command has been given before the user-space has been setup, there is nothing we can do. Worst yet, we stop ignoring the 'xl shutdown' requests! As such, the other part of this patch is to only stop ignoring the 'xl shutdown' when we are truly in the power off sequence. That means the user can do multiple 'xl shutdown' and we will try to act on them instead of ignoring them. Fixes-Bug: http://bugs.xenproject.org/xen/bug/6 Reported-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.Konrad Rzeszutek Wilk
The 'read_reply' works with 'process_msg' to read of a reply in XenBus. 'process_msg' is running from within the 'xenbus' thread. Whenever a message shows up in XenBus it is put on a xs_state.reply_list list and 'read_reply' picks it up. The problem is if the backend domain or the xenstored process is killed. In which case 'xenbus' is still awaiting - and 'read_reply' if called - stuck forever waiting for the reply_list to have some contents. This is normally not a problem - as the backend domain can come back or the xenstored process can be restarted. However if the domain is in process of being powered off/restarted/halted - there is no point of waiting on it coming back - as we are effectively being terminated and should not impede the progress. This patch solves this problem by checking whether the guest is the right domain. If it is an initial domain and hurtling towards death - there is no point of continuing the wait. All other type of guests continue with their behavior (as Xenstore is expected to still be running in another domain). Fixes-Bug: http://bugs.xenproject.org/xen/bug/8 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15xen/spinlock: Don't enable them unconditionally.Konrad Rzeszutek Wilk
The git commit a945928ea2709bc0e8e8165d33aed855a0110279 ('xen: Do not enable spinlocks before jump_label_init() has executed') was added to deal with the jump machinery. Earlier the code that turned on the jump label was only called by Xen specific functions. But now that it had been moved to the initcall machinery it gets called on Xen, KVM, and baremetal - ouch!. And the detection machinery to only call it on Xen wasn't remembered in the heat of merge window excitement. This means that the slowpath is enabled on baremetal while it should not be. Reported-by: Waiman Long <waiman.long@hp.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> CC: stable@vger.kernel.org CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15xen-pciback: silence an unwanted debug printkDan Carpenter
There is a missing curly brace here so we might print some extra debug information. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15xen: fix memory leak in __xen_pcibk_add_pci_dev()Daeseok Youn
It need to free dev_entry when it failed to assign to a new slot on the virtual PCI bus. smatch says: drivers/xen/xen-pciback/vpci.c:142 __xen_pcibk_add_pci_dev() warn: possible memory leak of 'dev_entry' Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15irqchip: vic: Properly chain the cascaded IRQsLinus Walleij
We are flagging the parent IRQ as chained, then we must also make sure to call the chained_irq_[enter|exit] functions for things to work smoothly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: http://lkml.kernel.org/r/1397550484-7119-1-git-send-email-linus.walleij@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-04-15x86/xen: Fix 32-bit PV guests's usage of kernel_stackBoris Ostrovsky
Commit 198d208df4371734ac4728f69cb585c284d20a15 ("x86: Keep thread_info on thread stack in x86_32") made 32-bit kernels use kernel_stack to point to thread_info. That change missed a couple of updates needed by Xen's 32-bit PV guests: 1. kernel_stack needs to be initialized for secondary CPUs 2. GET_THREAD_INFO() now uses %fs register which may not be the kernel's version when executing xen_iret(). With respect to the second issue, we don't need GET_THREAD_INFO() anymore: we used it as an intermediate step to get to per_cpu xen_vcpu and avoid referencing %fs. Now that we are going to use %fs anyway we may as well go directly to xen_vcpu. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-04-15drm/omap: fix the handling of fb ref countsTomi Valkeinen
With the recent primary-plane changes for drm, the primary plane's framebuffer needs to be ref counted the same way as for non-primary-planes. This was not done by the omapdrm driver, which caused the ref count to drop to 0 too early, causing problems. This patch moves the fb unref and ref from omap_plane_update to omap_plane_mode_set. This way the fb refs are updated for both primary and non-primary cases, as omap_plane_update calls omap_plane_mode_set. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15perf tools: Instead of redirecting flex output, use -oCody P Schafer
This gives us a real filename instead of having '<stdout>' show up all over the place when debugging. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1396652539-2416-1-git-send-email-cody@linux.vnet.ibm.com Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-15perf tools: Fix double free in perf test 21 (code-reading.c)Adrian Hunter
perf_evlist__delete() deletes attached cpu and thread maps but the test is still using them, so remove them from the evlist before deleting it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/53465E3E.8070201@intel.com Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-15iommu/arm-smmu: fix panic in arm_smmu_alloc_init_pteBin Wang
kernel panic happened when iommu_unmap a buffer larger than 2MB, more than expected pmd entries got “invalidated”, due to a wrong range passed to arm_smmu_alloc_init_pte. it was likely a typo, now we fix it, passing the correct "end" address to arm_smmu_alloc_init_pte. Signed-off-by: Bin Wang <binw@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-04-15iommu/arm-smmu: Return 0 on unmap failureLaurent Pinchart
The IOMMU core expects the unmap operation to return the number of bytes that have been unmapped or 0 on failure, a negative return value being treated like a number of bytes. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-04-15drm/omap: protect omap_crtc's event with event_lock spinlockArchit Taneja
The vblank_cb callback and the page_flip ioctl can occur together in different CPU contexts. vblank_cb uses takes tje drm device's event_lock spinlock when sending the vblank event and updating omap_crtc->event and omap_crtc->od_fb. Use the same spinlock in page_flip, to make sure the above omap_crtc parameters are configured sequentially. Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: Use old_fb to synchronize between successive page flipsArchit Taneja
omap_crtc->old_fb is used to check whether the previous page flip has completed or not. However, it's never initialized to anything, so it's always NULL. This results in the check to always succeed, and the page_flip to proceed. Initialize old_fb to the fb that we intend to flip to through page_flip, and therefore prevent a future page flip to proceed if the last one didn't complete. Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: Fix crash when using LCD3 overlay managerArchit Taneja
The channel_names list didn't have a string populated for LCD3 manager, this results in a crash when the display's output is connected to LCD3. Add an entry for LCD3. Reported-by: Somnath Mukherjee <somnath@ti.com> Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: gem sync: wait on correct eventsArchit Taneja
A waiter of the type OMAP_GEM_READ should wait for a buffer to be completely written, and only then proceed with reading it. A similar logic applies for waiters with OMAP_GEM_WRITE flag. Currently the function is_waiting() waits on the read_complete/read_target counts in the sync object. This should be the other way round, as a reader should wait for users who are 'writing' to this buffer, and vice versa. Make readers of the buffer(OMAP_GEM_READ) wait on the write counters, and writers to the buffer(OMAP_GEM_WRITE) wait on the read counters in is_waiting() Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: Fix memory leak in omap_gem_op_asyncSubhajit Paul
In omap_gem_op_async(), if a waiter is not added to the wait list, it needs to be free'd in the function itself. Make sure we free the waiter for this case. Signed-off-by: Subhajit Paul <subhajit_paul@ti.com> Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: remove warn from debugfsTomi Valkeinen
Patch dfe96ddcfa22b44100814b9435770f6ff1309d37 (omapdrm: simplify locking in the fb debugfs file) removed taking locks when using omapdrm's debugfs to dump fb objects. However, in omap_gem_describe we give a WARN is the lock has not been taken, so that WARN is now seen every time omapdrm debugfs is used. So, presuming the removal of locks is ok, we can also remove the WARN. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: remove extra plane->destroy from crtc destroyTomi Valkeinen
All the planes, including primary planes, are now destroyed by the drm framework. Thus we no longer need the explicit call to plane->destroy from the crtc's destroy function. This patch removes the call, thus fixing the crash caused by double freeing the plane. remove omap_crtc->plane->funcs->destroy(omap_crtc->plane) Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15drm/omap: print warning when rotating non-TILER fbTomi Valkeinen
Print a warning when the user tries to rotate a non-TILER framebuffer. Also set the rotation to 0, to avoid constant flood of the warnings in case of page flipping. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-15video: bf54x-lq043fb: fix build errorSteven Miao
Fix build error by including linux/gpio.h. Also drop asm/gpio.h which is not needed. Signed-off-by: Steven Miao <realmz6@gmail.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-04-14iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptorsJiang Liu
Commit "59ce0515cdaf iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens" introduces a bug, which fails to match PCI devices with DMAR device scope entries if PCI path array in the entry has more than one level. For example, it fails to handle [1D2h 0466 1] Device Scope Entry Type : 01 [1D3h 0467 1] Entry Length : 0A [1D4h 0468 2] Reserved : 0000 [1D6h 0470 1] Enumeration ID : 00 [1D7h 0471 1] PCI Bus Number : 00 [1D8h 0472 2] PCI Path : 1C,04 [1DAh 0474 2] PCI Path : 00,02 And cause DMA failure on HP DL980 as: DMAR:[fault reason 02] Present bit in context entry is clear dmar: DRHD: handling fault status reg 602 dmar: DMAR:[DMA Read] Request device [02:00.2] fault addr 7f61e000 Reported-and-tested-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2014-04-14iommu/vt-d: Fix get_domain_for_dev() handling of upstream PCIe bridgesDavid Woodhouse
Commit 146922ec79 ("iommu/vt-d: Make get_domain_for_dev() take struct device") introduced new variables bridge_bus and bridge_devfn to identify the upstream PCIe to PCI bridge responsible for the given target device. Leaving the original bus/devfn variables to identify the target device itself, now that it is no longer assumed to be PCI and we can no longer trivially find that information. However, the patch failed to correctly use the new variables in all cases; instead using the as-yet-uninitialised 'bus' and 'devfn' variables. Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2014-04-15driver/net: cosa driver uses udelay incorrectlyLi, Zhen-Hua
In cosa driver, udelay with more than 20000 may cause __bad_udelay. Use msleep for instead. Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15at86rf230: fix __at86rf230_read_subreg functionAlexander Aring
The __at86rf230_read_subreg function don't mask and shift register contents which it should do. This patch adds the necessary masks and shift operations in this function. Since we have csma support this can make some trouble on state changes. Since CSMA support turned on some bits in the TRX_STATUS register that used to be zero, not masking broke checking of the TRX_STATUS field after commanding a state change. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reviewed-by: Werner Almesberger <werner@almesberger.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15at86rf230: remove check if AVDD settledAlexander Aring
The AVDD regulator is only enabled when the RF section is active TX_ON (PLL_ON) state. Since commit 7dcbd22a97eb0689e6c583ad630ae0e7341e34c1 ("ieee802154: ensure that first RF212 state comes from TRX_OFF"). We are in TRX_OFF state at the time at86rf230_hw_init is run. Note that this test would only fail in case of a severe hardware malfunction (faulty/shorted power supply, etc.) so it wasn't all that useful in the first place. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Reviewed-by: Werner Almesberger <werner@almesberger.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15net: cadence: Add architecture dependenciesJean Delvare
The Cadence ethernet chipsets are only used on specific ARM architectures. Add Kconfig dependencies so that drivers for these chipsets are only buildable on the relevant architectures. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Marcelo Tosatti: - Fix for guest triggerable BUG_ON (CVE-2014-0155) - CR4.SMAP support - Spurious WARN_ON() fix * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: remove WARN_ON from get_kernel_ns() KVM: Rename variable smep to cr4_smep KVM: expose SMAP feature to guest KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode KVM: Add SMAP support when setting CR4 KVM: Remove SMAP bit from CR4_RESERVED_BITS KVM: ioapic: try to recover if pending_eoi goes out of range KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)
2014-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull bmc2835 crypto fix from Herbert Xu: "This fixes a potential boot crash on bcm2835 due to the recent change that now causes hardware RNGs to be accessed on registration" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: bcm2835 - fix oops when rng h/w is accessed during registration
2014-04-14user namespace: fix incorrect memory barriersMikulas Patocka
smp_read_barrier_depends() can be used if there is data dependency between the readers - i.e. if the read operation after the barrier uses address that was obtained from the read operation before the barrier. In this file, there is only control dependency, no data dependecy, so the use of smp_read_barrier_depends() is incorrect. The code could fail in the following way: * the cpu predicts that idx < entries is true and starts executing the body of the for loop * the cpu fetches map->extent[0].first and map->extent[0].count * the cpu fetches map->nr_extents * the cpu verifies that idx < extents is true, so it commits the instructions in the body of the for loop The problem is that in this scenario, the cpu read map->extent[0].first and map->nr_extents in the wrong order. We need a full read memory barrier to prevent it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains three Netfilter fixes for your net tree, they are: * Fix missing generation sequence initialization which results in a splat if lockdep is enabled, it was introduced in the recent works to improve nf_conntrack scalability, from Andrey Vagin. * Don't flush the GRE keymap list in nf_conntrack when the pptp helper is disabled otherwise this crashes due to a double release, from Andrey Vagin. * Fix nf_tables cmp fast in big endian, from Patrick McHardy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14net: Start with correct mac_len in skb_network_protocolVlad Yasevich
Sometimes, when the packet arrives at skb_mac_gso_segment() its skb->mac_len already accounts for some of the mac lenght headers in the packet. This seems to happen when forwarding through and OpenSSL tunnel. When we start looking for any vlan headers in skb_network_protocol() we seem to ignore any of the already known mac headers and start with an ETH_HLEN. This results in an incorrect offset, dropped TSO frames and general slowness of the connection. We can start counting from the known skb->mac_len and return at least that much if all mac level headers are known and accounted for. Fixes: 53d6471cef17262d3ad1c7ce8982a234244f68ec (net: Account for all vlan headers in skb_mac_gso_segment) CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Daniel Borkman <dborkman@redhat.com> Tested-by: Martin Filip <nexus+kernel@smoula.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversalMike Qiu
3bc955987fb3 ("powerpc/PCI: Use list_for_each_entry() for bus traversal") caused a NULL pointer dereference because the loop body set the iterator to NULL: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc000000000041d78 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c000000000041d78] .sys_pciconfig_iobase+0x68/0x1f0 LR [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 Call Trace: [c0000003b4787db0] [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 (unreliable) [c0000003b4787e30] [c000000000009ed8] syscall_exit+0x0/0x98 Fix it by using a temporary variable for the iterator. [bhelgaas: changelog, drop tmp_bus initialization] Fixes: 3bc955987fb3 powerpc/PCI: Use list_for_each_entry() for bus traversal Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-04-14KVM: x86: remove WARN_ON from get_kernel_ns()Marcelo Tosatti
Function and callers can be preempted. https://bugzilla.kernel.org/show_bug.cgi?id=73721 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2014-04-14KVM: Rename variable smep to cr4_smepFeng Wu
Rename variable smep to cr4_smep, which can better reflect the meaning of the variable. Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14KVM: expose SMAP feature to guestFeng Wu
This patch exposes SMAP feature to guest Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14KVM: Disable SMAP for guests in EPT realmode and EPT unpaging modeFeng Wu
SMAP is disabled if CPU is in non-paging mode in hardware. However KVM always uses paging mode to emulate guest non-paging mode with TDP. To emulate this behavior, SMAP needs to be manually disabled when guest switches to non-paging mode. Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14KVM: Add SMAP support when setting CR4Feng Wu
This patch adds SMAP handling logic when setting CR4 for guests Thanks a lot to Paolo Bonzini for his suggestion to use the branchless way to detect SMAP violation. Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14KVM: Remove SMAP bit from CR4_RESERVED_BITSFeng Wu
This patch removes SMAP bit from CR4_RESERVED_BITS. Signed-off-by: Feng Wu <feng.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-04-14Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the ↵Daniel Borkmann
receiver's buffer" This reverts commit ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") as it introduced a serious performance regression on SCTP over IPv4 and IPv6, though a not as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs. Current state: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 17:56:21 GMT Connecting to host 192.168.241.3, port 5201 Cookie: Lab200slot2.1397238981.812898.548918 [ 4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.09 sec 20.8 MBytes 161 Mbits/sec [ 4] 1.09-2.13 sec 10.8 MBytes 86.8 Mbits/sec [ 4] 2.13-3.15 sec 3.57 MBytes 29.5 Mbits/sec [ 4] 3.15-4.16 sec 4.33 MBytes 35.7 Mbits/sec [ 4] 4.16-6.21 sec 10.4 MBytes 42.7 Mbits/sec [ 4] 6.21-6.21 sec 0.00 Bytes 0.00 bits/sec [ 4] 6.21-7.35 sec 34.6 MBytes 253 Mbits/sec [ 4] 7.35-11.45 sec 22.0 MBytes 45.0 Mbits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec [ 4] 11.45-12.51 sec 16.0 MBytes 126 Mbits/sec [ 4] 12.51-13.59 sec 20.3 MBytes 158 Mbits/sec [ 4] 13.59-14.65 sec 13.4 MBytes 107 Mbits/sec [ 4] 14.65-16.79 sec 33.3 MBytes 130 Mbits/sec [ 4] 16.79-16.79 sec 0.00 Bytes 0.00 bits/sec [ 4] 16.79-17.82 sec 5.94 MBytes 48.7 Mbits/sec (etc) [root@Lab200slot2 ~]# iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64 Time: Fri, 11 Apr 2014 19:08:41 GMT Connecting to host 2001:db8:0:f101::1, port 5201 Cookie: Lab200slot2.1397243321.714295.2b3f7c [ 4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201 Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec [ 4] 1.00-2.00 sec 201 MBytes 1.69 Gbits/sec [ 4] 2.00-3.00 sec 188 MBytes 1.58 Gbits/sec [ 4] 3.00-4.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 4.00-5.00 sec 165 MBytes 1.39 Gbits/sec [ 4] 5.00-6.00 sec 199 MBytes 1.67 Gbits/sec [ 4] 6.00-7.00 sec 163 MBytes 1.36 Gbits/sec [ 4] 7.00-8.00 sec 174 MBytes 1.46 Gbits/sec [ 4] 8.00-9.00 sec 193 MBytes 1.62 Gbits/sec [ 4] 9.00-10.00 sec 196 MBytes 1.65 Gbits/sec [ 4] 10.00-11.00 sec 157 MBytes 1.31 Gbits/sec [ 4] 11.00-12.00 sec 175 MBytes 1.47 Gbits/sec [ 4] 12.00-13.00 sec 192 MBytes 1.61 Gbits/sec [ 4] 13.00-14.00 sec 199 MBytes 1.67 Gbits/sec (etc) After patch: [root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60 iperf version 3.0.1 (10 January 2014) Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64 Time: Mon, 14 Apr 2014 16:40:48 GMT Connecting to host 192.168.240.3, port 5201 Cookie: Lab200slot2.1397493648.413274.65e131 [ 4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201 Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test [ ID] Interval Transfer Bandwidth [ 4] 0.00-1.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 1.00-2.00 sec 239 MBytes 2.01 Gbits/sec [ 4] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 3.00-4.00 sec 239 MBytes 2.00 Gbits/sec [ 4] 4.00-5.00 sec 245 MBytes 2.05 Gbits/sec [ 4] 5.00-6.00 sec 240 MBytes 2.01 Gbits/sec [ 4] 6.00-7.00 sec 240 MBytes 2.02 Gbits/sec [ 4] 7.00-8.00 sec 239 MBytes 2.01 Gbits/sec With the reverted patch applied, the SCTP/IPv4 performance is back to normal on latest upstream for IPv4 and IPv6 and has same throughput as 3.4.2 test kernel, steady and interval reports are smooth again. Fixes: ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer") Reported-by: Peter Butler <pbutler@sonusnet.com> Reported-by: Dongsheng Song <dongsheng.song@gmail.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Peter Butler <pbutler@sonusnet.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com> Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14cxgb4: Save the correct mac addr for hw-loopback connections in the L2TSteve Wise
Hardware needs the local device mac address to support hw loopback for rdma loopback connections. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_WDaniel Borkmann
While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has been wrongly decoded by commit a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS although it should have been decoded as BPF_LD|BPF_W|BPF_ABS. In practice, this should not have much side-effect though, as such conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse operation PR_GET_SECCOMP will only return the current seccomp mode, but not the filter itself. Since the transition to the new BPF infrastructure, it's also not used anymore, so we can simply remove this as it's unreachable. Fixes: a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPFDaniel Borkmann
Linus reports that on 32-bit x86 Chromium throws the following seccomp resp. audit log messages: audit: type=1326 audit(1397359304.356:28108): auid=500 uid=500 gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023 pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=172 compat=0 ip=0xb2dd9852 code=0x30000 audit: type=1326 audit(1397359304.356:28109): auid=500 uid=500 gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023 pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5 compat=0 ip=0xb2dd9852 code=0x50000 These audit messages are being triggered via audit_seccomp() through __secure_computing() in seccomp mode (BPF) filter with seccomp return codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO) during filter runtime. Moreover, Linus reports that x86_64 Chromium seems fine. The underlying issue that explains this is that the implementation of populate_seccomp_data() is wrong. Our seccomp data structure sd that is being shared with user ABI is: struct seccomp_data { int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6]; }; Therefore, a simple cast to 'unsigned long *' for storing the value of the syscall argument via syscall_get_arguments() is just wrong as on 32-bit x86 (or any other 32bit arch), it would result in storing a0-a5 at wrong offsets in args[] member, and thus i) could leak stack memory to user space and ii) tampers with the logic of seccomp BPF programs that read out and check for syscall arguments: syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]); Tested on 32-bit x86 with Google Chrome, unfortunately only via remote test machine through slow ssh X forwarding, but it fixes the issue on my side. So fix it up by storing args in type correct variables, gcc is clever and optimizes the copy away in other cases, e.g. x86_64. Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set") Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14wl18xx: align event mailbox with current fwEliad Peller
Some fields are missing from the event mailbox struct definitions, which cause issues when trying to handle some events. Add the missing fields in order to align the struct size (without adding actual support for the new fields). Reported-and-tested-by: Imre Kaloz <kaloz@openwrt.org> Cc: stable@vger.kernel.org # 3.14+ Fixes: 028e724 ("wl18xx: move to new firmware (wl18xx-fw-3.bin)") Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-04-14rsi: Fix a potential memory leak in rsi_send_auto_rate_request()Christian Engelmayer
Fix a potential memory leak in the error path of function rsi_send_auto_rate_request(). In case memory allocation for array 'selected_rates' fails, the error path exits and leaves the previously allocated skb in place. Detected by Coverity: CID 1195575. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-04-14cw1200: Fix cw1200_debug_link_idFrederic Danis
This array is used in debug string to display cw1200_link_status defined in drivers/net/wireless/cw1200/cw1200.h. Add missing strings for CW1200_LINK_RESET and CW1200_LINK_RESET_REMAP. Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-04-14wlcore: ignore dummy packet events in PLT modeLuciano Coelho
Sometimes the firmware sends a dummy packet event while we are in PLT mode. This doesn't make sense, it's a firmware bug. Fix this by ignoring dummy packet events when we're PLT mode. Reported-by: Yegor Yefremov <yegorslists@googlemail.com> Reported-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Luciano Coelho <luca@coelho.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-04-14rsi: Fix a potential memory leak in rsi_set_channel()Christian Engelmayer
Fix a potential memory leak in function rsi_set_channel() that is used to program channel changes. The channel check block for the frequency bands directly exits the function in case of an error, thus leaving an already allocated skb unreferenced. Move the checks above allocating the skb. Detected by Coverity: CID 1195576. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Signed-off-by: John W. Linville <linville@tuxdriver.com>