Age | Commit message | Author
2017-04-12 | ipv6: addrconf: fix 48 bit 6lowpan autoconfiguration | Alexander Aring
This patch adds support for 48 bit 6LoWPAN address length autoconfiguration which is the case for BTLE 6LoWPAN. Signed-off-by: Alexander Aring <aar@pengutronix.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2017-04-12 | 6lowpan: iphc: override l2 packet information | Alexander Aring
skb->pkt_type needs to be set by L2, but some 6LoWPAN link layers, e.g. BTLE, have no multicast addressing. Whether a packet is multicast is detected from the multicast bit in the IPHC header. The IPv6 layer evaluates pkt_type, so we force-set it while uncompressing. Should be okay for 802.15.4 as well. Signed-off-by: Alexander Aring <aar@pengutronix.de> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2017-04-12 | 6lowpan: Set MAC address length according to LOWPAN_LLTYPE | Patrik Flykt
Set MAC address length according to the 6LoWPAN link layer in use. Bluetooth Low Energy uses 48 bit addressing while IEEE802.15.4 uses 64 bits. Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
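As a rough illustration of the rule this commit describes, the address length can be looked up from the link-layer type. This is a minimal user-space sketch, not the kernel code: `LinkType`, `ADDR_LEN` and `mac_addr_len` are hypothetical names, and only the two bit widths come from the commit message.

```python
from enum import Enum


class LinkType(Enum):
    """Hypothetical stand-in for the kernel's 6LoWPAN link-layer types."""
    BTLE = "btle"
    IEEE802154 = "ieee802154"


# Address lengths in bytes, taken from the commit message:
# 48 bits for Bluetooth Low Energy, 64 bits for IEEE 802.15.4.
ADDR_LEN = {
    LinkType.BTLE: 48 // 8,
    LinkType.IEEE802154: 64 // 8,
}


def mac_addr_len(lltype: LinkType) -> int:
    """Return the MAC address length in bytes for the given link layer."""
    return ADDR_LEN[lltype]
```

Keeping the mapping in one table means code that previously assumed 8 bytes everywhere has a single place to ask for the right length.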
2017-04-12 | bluetooth: Set 6 byte device addresses | Patrik Flykt
Set BTLE MAC addresses that are 6 bytes long and not 8 bytes that are used in other places with 6lowpan. Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2017-04-12 | Bluetooth: hci_bcm: Fix clock (un)prepare | John Keeping
The hci_bcm driver currently does not prepare/unprepare the clock and goes directly to enable, but as the documentation for clk_enable says, clk_prepare must be called before clk_enable. Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
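The ordering requirement can be pictured with a small state model. This is only a sketch under simplifying assumptions: the `Clock` class and the `power_on`/`power_off` helpers are hypothetical, and raising an exception stands in for whatever the real clock framework does when the order is violated.

```python
class Clock:
    """Toy model of a clock that must be prepared before it is enabled."""

    def __init__(self):
        self.prepared = False
        self.enabled = False

    def prepare(self):
        self.prepared = True

    def enable(self):
        # Enabling an unprepared clock is the ordering bug described above.
        if not self.prepared:
            raise RuntimeError("enable() called before prepare()")
        self.enabled = True

    def disable(self):
        self.enabled = False

    def unprepare(self):
        if self.enabled:
            raise RuntimeError("unprepare() called while still enabled")
        self.prepared = False


def power_on(clk: Clock) -> None:
    """Prepare first, then enable."""
    clk.prepare()
    clk.enable()


def power_off(clk: Clock) -> None:
    """Tear down in the reverse order: disable, then unprepare."""
    clk.disable()
    clk.unprepare()
```

The kernel's combined helpers `clk_prepare_enable()` and `clk_disable_unprepare()` follow the same prepare-then-enable and disable-then-unprepare ordering.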
2017-04-12 | Bluetooth: convert rfcomm_dlc.refcnt from atomic_t to refcount_t | Elena Reshetova
The refcount_t type and its corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This helps avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
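Why a dedicated reference-counter type helps can be shown with a simplified model. The sketch below is not the kernel API: `WrappingCounter` and `SaturatingRefcount` are hypothetical classes, and the saturate-at-maximum behavior is a simplification of the overflow protection the commit message refers to.

```python
UINT_MAX = 2**32 - 1


class WrappingCounter:
    """Models a plain 32-bit counter that silently wraps on overflow."""

    def __init__(self, value=1):
        self.value = value

    def inc(self):
        self.value = (self.value + 1) & UINT_MAX

    def dec_and_test(self):
        self.value = (self.value - 1) & UINT_MAX
        return self.value == 0


class SaturatingRefcount:
    """Models a counter that sticks at UINT_MAX instead of wrapping."""

    def __init__(self, value=1):
        self.value = value

    def inc(self):
        # Once the maximum is reached the counter stays there.
        if self.value < UINT_MAX:
            self.value += 1

    def dec_and_test(self):
        # A saturated counter never reports "last reference dropped";
        # a counter already at zero is left alone in this model.
        if self.value in (0, UINT_MAX):
            return False
        self.value -= 1
        return self.value == 0
```

With the wrapping counter, one increment too many brings the value back to zero, so a later put can free an object that is still referenced. The saturating model leaks the object instead, which is the safer failure.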
2017-04-12 | Bluetooth: btmrvl: fix spelling mistake: "unregester" -> "unregister" | Colin Ian King
trivial fix to spelling mistake in debug message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2017-04-12 | mm: Tighten x86 /dev/mem with zeroing reads | Kees Cook
Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is disallowed. However, on x86, the first 1MB was always allowed for BIOS and similar things, regardless of it actually being System RAM. It was possible for heap to end up getting allocated in low 1MB RAM, and then read by things like x86info or dd, which would trip hardened usercopy: usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes) This changes the x86 exception for the low 1MB by reading back zeros for System RAM areas instead of blindly allowing them. More work is needed to extend this to mmap, but currently mmap doesn't go through usercopy, so hardened usercopy won't Oops the kernel. Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com> Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: Kees Cook <keescook@chromium.org>
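The read policy described here can be modelled in a few lines. This is a byte-granular sketch under stated assumptions, not the kernel implementation: `read_devmem`, `phys_mem` and `system_ram` are hypothetical names, and the real code may well decide per page instead of per byte.

```python
LOW_1MB = 0x100000


def read_devmem(phys_mem, system_ram, offset, length):
    """Model of a strict /dev/mem read on x86.

    phys_mem:   bytes-like object standing in for physical memory
    system_ram: list of (start, end) ranges that are System RAM
    """
    out = bytearray()
    for addr in range(offset, offset + length):
        in_ram = any(start <= addr < end for start, end in system_ram)
        if in_ram and addr < LOW_1MB:
            # Low 1MB exception: readable, but System RAM reads back as zeros.
            out.append(0)
        elif in_ram:
            # Above 1MB, System RAM stays off limits.
            raise PermissionError("System RAM is not readable via /dev/mem")
        else:
            # Non-RAM areas (BIOS and similar) are passed through.
            out.append(phys_mem[addr])
    return bytes(out)
```

A tool reading across a RAM boundary in the low 1MB therefore sees real bytes for the BIOS-style area and zeros for any heap that happens to live there.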
2017-04-12 | net: make struct net_device::min_header_len 8-bit | Alexey Dobriyan
This field is never big enough to warrant 16-bitness. 8-bit accesses enjoy a shorter encoding on i386/x86_64 than 16-bit accesses:
    add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10)
    function                old     new   delta
    loopback_setup          169     164      -5
    ether_setup             148     143      -5
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | net: neigh: make ->hh_len 32-bit | Alexey Dobriyan
Using 16-bit ->hh_len doesn't save any memory, save some .text instead:
    add/remove: 0/0 grow/shrink: 1/6 up/down: 2/-19 (-17)
    function                old     new   delta
    neigh_update           2312    2314      +2
    fwnet_header_cache      199     197      -2
    eth_header_cache        101      99      -2
    ip6_finish_output2     2371    2368      -3
    vrf_finish_output6     1522    1518      -4
    vrf_finish_output      1413    1409      -4
    ip_finish_output2      1627    1623      -4
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | gso: Support frag_list splitting with head_frag | Ilan Tayari
A driver may use build_skb() for received packets. These SKBs then have a head_frag. Since commit d7e8883cfcf4 ("net: make GRO aware of skb->head_frag"), GRO may build frag_list SKBs out of head_frag received SKBs. In such a case, the chained SKBs end up with a head_frag. Commit 07b26c9454a2 ("gso: Support partial splitting at the frag_list pointer") adds partial segmentation of frag_list SKB chains into individual SKBs. However, this is not done if the chained SKBs have any linear part, because the device may not be able to DMA the private linear buffer. A chained frag_list SKB with head_frag is wrongly detected in this case as having a private linear part and thus falls back to software GSO, while in fact the linear part is backed by a DMA page just like any other frag. This causes low performance when forwarding those packets that were built with build_skb(). Allow partial segmentation at the frag_list pointer for chained SKBs with head_frag. Note that such SKBs can only be created by GRO, when applied to received packets with head_frag. Also note that this change only affects the data path that performs the partial segmentation at frag_list pointer, and not any of the other more common data paths. Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ipv6: Fix idev->addr_list corruption | Rabin Vincent
addrconf_ifdown() removes elements from the idev->addr_list without holding the idev->lock. If this happens while the loop in __ipv6_dev_get_saddr() is handling the same element, that function ends up in an infinite loop: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [test:1719] Call Trace: ipv6_get_saddr_eval+0x13c/0x3a0 __ipv6_dev_get_saddr+0xe4/0x1f0 ipv6_dev_get_saddr+0x1b4/0x204 ip6_dst_lookup_tail+0xcc/0x27c ip6_dst_lookup_flow+0x38/0x80 udpv6_sendmsg+0x708/0xba8 sock_sendmsg+0x18/0x30 SyS_sendto+0xb8/0xf8 syscall_common+0x34/0x58 Fixes: 6a923934c33 (Revert "ipv6: Revert optional address flusing on ifdown.") Signed-off-by: Rabin Vincent <rabinv@axis.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | drm/etnaviv: fix missing unlock on error in etnaviv_gpu_submit() | Wei Yongjun
Add the missing unlock before return from function etnaviv_gpu_submit() in the error handling case. lst: fixed label name. Fixes: f3cd1b064f11 ("drm/etnaviv: (re-)protect fence allocation with GPU mutex") CC: stable@vger.kernel.org #4.9+ Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-12 | Merge branch 'l2tp-const' | David S. Miller
Guillaume Nault says: ==================== l2tp: constify l2tp_session_get*() and l2tp_tunnel_find*() Declare parameters of these functions as "const" where possible. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | l2tp: define parameters of l2tp_tunnel_find*() as "const" | Guillaume Nault
l2tp_tunnel_find() and l2tp_tunnel_find_nth() don't modify "net". Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | l2tp: define parameters of l2tp_session_get*() as "const" | Guillaume Nault
Make l2tp_pernet()'s parameter constant, so that l2tp_session_get*() can declare their "net" variable as "const". Also constify "ifname" in l2tp_session_get_by_ifname(). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | net: xdp: don't export dev_change_xdp_fd() | Johannes Berg
Since dev_change_xdp_fd() is only used in rtnetlink, which must be built-in, there's no reason to export dev_change_xdp_fd(). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes | James Bottomley
2017-04-12 | Merge branch 'ftgmac100-rework-batch4-misc' | David S. Miller
Benjamin Herrenschmidt says: ==================== ftgmac100: Rework batch 4 - Misc This is v2 of the fourth batch of updates to the ftgmac100 driver. This is a bunch of misc cleanups and fixes, such as properly disabling HW checksum generation on AST2400 where it's known to be broken and some chip init updates. This also adds the ability to turn HW checksum on/off and configure the ring sizes via ethtool. v2 Fixes patch 1/10 (NETIF_F_HW_CSUM conversion) The next (and last) batch will add a few more "features" such as netpoll, multicast/promist, vlan offload... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Set default ring sizes to 128 entries | Benjamin Herrenschmidt
I haven't seen any improvement above that size on the machines I've tested with. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Make ring sizes configurable via ethtool | Benjamin Herrenschmidt
We set an arbitrary max at 1024 since we pre-allocate the actual descriptor arrays and skb arrays to the full size to keep the code a bit simpler and avoid allocation failures in the reset task. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Add more register inits in ftgmac100_init_hw() | Benjamin Herrenschmidt
Clear stale interrupts on entry, configure FIFO sizes, set FIFO thresholds, configure interrupt mitigation. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Open code remaining register writes | Benjamin Herrenschmidt
The helpers just take space but don't provide much value. Simple one line comments are more explanatory. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Rename ftgmac100_setup_mac to ftgmac100_initial_mac | Benjamin Herrenschmidt
To remove more confusion. This function is about obtaining the initial MAC address at driver probe time. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Rename ftgmac100_set_mac to ftgmac100_write_mac_addr | Benjamin Herrenschmidt
To avoid confusion with the ndo callback and generally be clearer about the purpose of that function Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Set netdev->hw_features | Benjamin Herrenschmidt
So features can be turned on/off via ethtool Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Disable HW checksum generation on AST2400, enable on others | Benjamin Herrenschmidt
We found out that HW checksum generation only works from AST2500 onward. This disables it on AST2400 and removes the "no-hw-checksum" properties in the device-trees. The problem we had wasn't related to NC-SI. Also rework the logic testing for that property so it can be used to disable HW checksum generation and checking regardless of whether NC-SI is used or not in case other variants out there need this. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Use device "compatible" property, not machine. | Benjamin Herrenschmidt
We test for aspeed chips to handle a couple of special cases, but we do that by checking the machine type which isn't right. Instead check the actual device compatible property. This also updates the dtsi files for the aspeed SoC to match. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | ftgmac100: Upgrade to NETIF_F_HW_CSUM | Benjamin Herrenschmidt
The documentation describes NETIF_F_IP_CSUM as deprecated so let's switch to NETIF_F_HW_CSUM and use the helper to handle unhandled protocols. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-12 | Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit | Linus Torvalds
Pull audit fix from Paul Moore: "One more small audit fix, this should be the last for v4.11. Seth Forshee noticed a problem where the audit retry queue wasn't being flushed properly when audit was enabled and the audit daemon wasn't running; this patch fixes the problem (see the commit description for more details on the change). Both Seth and I have tested this and everything looks good" * 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit: audit: make sure we don't let the retry queue grow without bounds
2017-04-11 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending | Linus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "There has been work in a number of different areas over the last weeks, including: - Fix target-core-user (TCMU) back-end bi-directional handling (Xiubo Li + Mike Christie + Ilias Tsitsimpis) - Fix iscsi-target TMR reference leak during session shutdown (Rob Millner + Chu Yuan Lin) - Fix target_core_fabric_configfs.c race between LUN shutdown + mapped LUN creation (James Shen) - Fix target-core unknown fabric callback queue-full errors (Potnuri Bharat Teja) - Fix iscsi-target + iser-target queue-full handling in order to support iw_cxgb4 RNICs. (Potnuri Bharat Teja + Sagi Grimberg) - Fix ALUA transition state race between multiple initiator (Mike Christie) - Drop work-around for legacy GlobalSAN initiator, to allow QLogic 57840S + 579xx offload HBAs to work out-of-the-box in MSFT environments. (Martin Svec + Arun Easi) Note that a number are CC'ed for stable, and although the queue-full bug-fixes required for iser-target to work with iw_cxgb4 aren't CC'ed here, they'll be posted to Greg-KH separately" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case iscsi-target: Drop work-around for legacy GlobalSAN initiator target: Fix ALUA transition state race between multiple initiators iser-target: avoid posting a recv buffer twice iser-target: Fix queue-full response handling iscsi-target: Propigate queue_data_in + queue_status errors target: Fix unknown fabric callback queue-full errors tcmu: Fix wrongly calculating of the base_command_size tcmu: Fix possible overwrite of t_data_sg's last iov[] target: Avoid mappedlun symlink creation during lun shutdown iscsi-target: Fix TMR reference leak during session shutdown usb: gadget: Correct usb EP argument for BOT status request tcmu: Allow cmd_time_out to be set to zero (disabled)
2017-04-11 | Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup | Linus Torvalds
Pull cgroup fixes from Tejun Heo: "This contains fixes for two long standing subtle bugs: - kthread_bind() on a new kthread binds it to specific CPUs and prevents userland from messing with the affinity or cgroup membership. Unfortunately, for cgroup membership, there's a window between kthread creation and kthread_bind*() invocation where the kthread can be moved into a non-root cgroup by userland. Depending on what controllers are in effect, this can assign the kthread unexpected attributes. For example, in the reported case, workqueue workers ended up in a non-root cpuset cgroups and had their CPU affinities overridden. This broke workqueue invariants and led to workqueue stalls. Fixed by closing the window between kthread creation and kthread_bind() as suggested by Oleg. - There was a bug in cgroup mount path which could allow two competing mount attempts to attach the same cgroup_root to two different superblocks. This was caused by mishandling return value from kernfs_pin_sb(). Fixed" * 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: avoid attaching a cgroup root to two different superblocks cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups
2017-04-11 | Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata | Linus Torvalds
Pull libata fixes from Tejun Heo: "Two libata fixes. One to disable hotplug on VT6420 which never worked properly. The other reverts an earlier patch which disabled the second port on SB600/700. There were some confusions due to earlier datasheets which incorrectly indicated that the second port is not implemented on both SB600 and 700" * 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: sata_via: Enable hotplug only on VT6421 Revert "pata_atiixp: Don't use unconnected secondary port on SB600/SB700"
2017-04-11 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid | Linus Torvalds
Pull HID fixes from Jiri Kosina: - revert of a commit that switched all Synaptics touchpads over to be driven by hid-rmi. It turns out that this caused several user-visible regressions, and therefore we revert back to the original state until all the reported issues have been fixed. - a new uclogic device ID addition, from Xiaolei Yu. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: Revert "HID: rmi: Handle all Synaptics touchpads using hid-rmi" HID: uclogic: add support for Ugee Tablet EX07S
2017-04-11 | Merge branch 'net-smc-next' | David S. Miller
Ursula Braun says: ==================== net/smc: patches for net-next here are some patches for net/smc. Most important are improvements for socket closing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: do not use IB_SEND_INLINE together with mapped data | Ursula Braun
smc specifies IB_SEND_INLINE for IB_WR_SEND ib_post_send calls, but provides a mapped buffer to be sent. This is inconsistent, since IB_SEND_INLINE works without mapped buffer. Problem has not been detected in the past, because tests had been limited to Connect X3 cards from Mellanox, whose mlx4 driver just ignored the IB_SEND_INLINE flag. For now, the IB_SEND_INLINE flag is removed. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: destruct non-accepted sockets | Ursula Braun
Make sure sockets never accepted are removed cleanly. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: remove duplicate unhash | Ursula Braun
unhash is already called in sock_put_work. Remove the second call. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: guarantee ConnClosed send after shutdown SHUT_WR | Ursula Braun
State SMC_CLOSED should be reached only if ConnClosed has been sent to the peer. If ConnClosed is received from the peer, a socket with shutdown SHUT_WR done erroneously switches to state SMC_CLOSED, which means the peer socket is dangling. The local SMC socket is supposed to switch to state APPFINCLOSEWAIT to make sure smc_close_final() is called during socket close. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: no socket state changes in tasklet context | Ursula Braun
Several state changes occur during SMC socket closing. Currently state changes triggered locally occur in process context with lock_sock() taken while state changes triggered by peer occur in tasklet context with bh_lock_sock() taken. bh_lock_sock() does not wait till a lock_sock() task in process context is finished. This may lead to races in socket state transitions resulting in dangling SMC-sockets, or it may lead to duplicate SMC socket freeing. This patch introduces a closing worker to run all state changes under lock_sock(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: always call the POLL_IN part of sk_wake_async | Ursula Braun
Wake up reading file descriptors for a closing socket as well, otherwise some socket applications may stall. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: guarantee reset of write_blocked for heavy workload | Ursula Braun
If peer indicates write_blocked, the cursor state of the received data should be sent to the peer immediately (in smc_tx_consumer_update()). Afterwards the write_blocked indicator is cleared. If there is no free slot for another write request, sending is postponed to worker smc_tx_work, and the write_blocked indicator is not cleared. Therefore another clearing check is needed in smc_tx_work(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: return active RoCE port only | Ursula Braun
SMC requires an active ib port on the RoCE device. smc_pnet_find_roce_resource() determines the matching RoCE device port according to the configured PNET table. Do not return the found RoCE device port, if it is not flagged active. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: remove useless smc_ib_devices_list check | Ursula Braun
The global event handler is created only, if the ib_device has already been used by at least one link group. It is guaranteed that there exists the corresponding entry in the smc_ib_devices list. Get rid of this superfluous check. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net/smc: get rid of old comment | Ursula Braun
This patch removes an outdated comment. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | Merge branch 'bridge-register-netdev-before-changelink' | David S. Miller
Ido Schimmel says: ==================== bridge: Fix kernel oops during bridge creation First patch adds a missing ndo_uninit() in the bridge driver, which is a prerequisite for the second patch that actually fixes the oops. Please consider both patches for 4.4.y, 4.9.y and 4.10.y ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | bridge: netlink: register netdevice before executing changelink | Ido Schimmel
Peter reported a kernel oops when executing the following command:
    $ ip link add name test type bridge vlan_default_pvid 1
    [13634.939408] BUG: unable to handle kernel NULL pointer dereference at 0000000000000190
    [13634.939436] IP: __vlan_add+0x73/0x5f0
    [...]
    [13634.939783] Call Trace:
    [13634.939791] ? pcpu_next_unpop+0x3b/0x50
    [13634.939801] ? pcpu_alloc+0x3d2/0x680
    [13634.939810] ? br_vlan_add+0x135/0x1b0
    [13634.939820] ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
    [13634.939834] ? br_changelink+0x120/0x4e0
    [13634.939844] ? br_dev_newlink+0x50/0x70
    [13634.939854] ? rtnl_newlink+0x5f5/0x8a0
    [13634.939864] ? rtnl_newlink+0x176/0x8a0
    [13634.939874] ? mem_cgroup_commit_charge+0x7c/0x4e0
    [13634.939886] ? rtnetlink_rcv_msg+0xe1/0x220
    [13634.939896] ? lookup_fast+0x52/0x370
    [13634.939905] ? rtnl_newlink+0x8a0/0x8a0
    [13634.939915] ? netlink_rcv_skb+0xa1/0xc0
    [13634.939925] ? rtnetlink_rcv+0x24/0x30
    [13634.939934] ? netlink_unicast+0x177/0x220
    [13634.939944] ? netlink_sendmsg+0x2fe/0x3b0
    [13634.939954] ? _copy_from_user+0x39/0x40
    [13634.939964] ? sock_sendmsg+0x30/0x40
    [13634.940159] ? ___sys_sendmsg+0x29d/0x2b0
    [13634.940326] ? __alloc_pages_nodemask+0xdf/0x230
    [13634.940478] ? mem_cgroup_commit_charge+0x7c/0x4e0
    [13634.940592] ? mem_cgroup_try_charge+0x76/0x1a0
    [13634.940701] ? __handle_mm_fault+0xdb9/0x10b0
    [13634.940809] ? __sys_sendmsg+0x51/0x90
    [13634.940917] ? entry_SYSCALL_64_fastpath+0x1e/0xad
The problem is that the bridge's VLAN group is created after setting the default PVID, when registering the netdevice and executing its ndo_init(). Fix this by changing the order of both operations, so that br_changelink() is only processed after the netdevice is registered, when the VLAN group is already initialized. Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Peter V. Saveliev <peter@svinota.eu> Tested-by: Peter V. Saveliev <peter@svinota.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
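The ordering problem in this commit can be reproduced in a toy model. The sketch below is an illustration only: `Bridge`, `newlink_buggy` and `newlink_fixed` are hypothetical names, a Python exception stands in for the NULL pointer dereference, and the unregister-on-failure step is an assumption based on the cleanup mentioned in the companion ndo_uninit() patch.

```python
class Bridge:
    """Toy bridge whose VLAN group only exists after registration."""

    def __init__(self):
        self.vlan_group = None
        self.registered = False

    def register(self):
        # Registration runs the init hook, which creates the VLAN group.
        self.vlan_group = {}
        self.registered = True

    def unregister(self):
        self.vlan_group = None
        self.registered = False

    def changelink(self, default_pvid):
        # Stands in for the oops: using the VLAN group before it exists.
        if self.vlan_group is None:
            raise RuntimeError("VLAN group not initialized")
        self.vlan_group[default_pvid] = "pvid"


def newlink_buggy(bridge, default_pvid):
    """Old order: apply attributes first, register afterwards."""
    bridge.changelink(default_pvid)
    bridge.register()


def newlink_fixed(bridge, default_pvid):
    """New order: register first, then apply attributes."""
    bridge.register()
    try:
        bridge.changelink(default_pvid)
    except Exception:
        # Assumed cleanup path: undo the registration on failure.
        bridge.unregister()
        raise
```

Calling `newlink_buggy(Bridge(), 1)` fails because the VLAN group is still missing, while `newlink_fixed(Bridge(), 1)` leaves a registered bridge with PVID 1 configured.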
2017-04-11 | bridge: implement missing ndo_uninit() | Ido Schimmel
While the bridge driver implements an ndo_init(), it was missing a symmetric ndo_uninit(), causing the different de-initialization operations to be scattered around its dellink() and destructor(). Implement a symmetric ndo_uninit() and remove the overlapping operations from its dellink() and destructor(). This is a prerequisite for the next patch, as it allows us to have a proper cleanup upon changelink() failure during the bridge's newlink(). Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | net: stmmac: use netif_set_real_num_{rx,tx}_queues | Joao Pinto
In the submission of the latest multiple buffer patch set, this fix was lost. I am sending this patch to put it right again. The fix was originally proposed by Arnd Bergmann. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION | Mauricio Faria de Oliveira
On a dual controller setup with multipath enabled, some MEDIUM ERRORs caused both paths to be failed, thus I/O got queued/blocked since the 'queue_if_no_path' feature is enabled by default on IPR controllers. This example disabled 'queue_if_no_path' so the I/O failure is seen at the sg_dd program. Notice that after the sg_dd test-case, both paths are in 'failed' state, and both path/priority groups are in 'enabled' state (not 'active') -- which would block I/O with 'queue_if_no_path'.
    # sg_dd if=/dev/dm-2 bs=4096 count=1 dio=1 verbose=4 blk_sgio=0
    <...>
    read(unix): count=4096, res=-1
    sg_dd: reading, skip=0 : Input/output error
    <...>
    # dmesg
    [...] sd 2:2:16:0: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [...] sd 2:2:16:0: [sds] Sense Key : Medium Error [current]
    [...] sd 2:2:16:0: [sds] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] sd 2:2:16:0: [sds] CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
    [...] blk_update_request: I/O error, dev sds, sector 0
    [...] device-mapper: multipath: Failing path 65:32.
    <...>
    [...] device-mapper: multipath: Failing path 65:224.
    # multipath -l
    1IBM_IPR-0_59C2AE0000001F80 dm-2 IBM ,IPR-0 59C2AE00
    size=5.2T features='0' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=enabled
    | `- 2:2:16:0 sds 65:32 failed undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0 sdae 65:224 failed undef running
This is not the desired behavior. The dm-multipath explicitly checks for the MEDIUM ERROR case (and a few others) so not to fail the path (e.g., I/O to other sectors could potentially happen without problems). See dm-mpath.c :: do_end_io_bio() -> noretry_error() !->! fail_path(). The problem trace is:
    1) ipr_scsi_done() // SENSE KEY/CHECK CONDITION detected, go to..
    2) ipr_erp_start() // ipr_is_gscsi() and masked_ioasc OK, go to..
    3) ipr_gen_sense() // masked_ioasc is IPR_IOASC_MED_DO_NOT_REALLOC,
                       // so set DID_PASSTHROUGH.
    4) scsi_decide_disposition() // check for DID_PASSTHROUGH and return
                                 // early on, faking a DID_OK.. *instead*
                                 // of reaching scsi_check_sense().
                                 // Had it reached the latter, that would
                                 // set host_byte to DID_MEDIUM_ERROR.
    5) scsi_finish_command()
    6) scsi_io_completion()
    7) __scsi_error_from_host_byte() // That would be converted to -ENODATA
    <...>
    8) dm_softirq_done()
    9) multipath_end_io()
    10) do_end_io()
    11) noretry_error() // And that is checked in dm-mpath :: noretry_error()
                        // which would cause fail_path() not to be called.
With this patch applied, the I/O is failed but the paths are not. This multipath device continues accepting more I/O requests without blocking. (and notice the different host byte/driver byte handling per SCSI layer).
    # dmesg
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 40 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 10 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] Buffer I/O error on dev dm-6, logical block 0, async page read
    # multipath -l 1IBM_IPR-0_59C2AE0000001F80
    1IBM_IPR-0_59C2AE0000001F80 dm-6 IBM ,IPR-0 59C2AE00
    size=5.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=active
    | `- 2:2:7:0 sdaf 65:240 active undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0 sdh 8:112 active undef running
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
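The decision chain traced in this commit can be condensed into a small model. This is a sketch under stated assumptions, not the SCSI or dm-mpath code: the three functions are hypothetical simplifications, host bytes are plain strings, the generic error for the pre-patch path is assumed to be -EIO, and only -ENODATA is listed as a no-retry error even though the commit message says the real check covers a few others.

```python
import errno


def decide_host_byte(host_byte, sense_key):
    """Model of the disposition step: DID_PASSTHROUGH short-circuits."""
    if host_byte == "DID_PASSTHROUGH":
        # Early return: the sense data is never examined.
        return "DID_OK"
    if sense_key == "MEDIUM_ERROR":
        return "DID_MEDIUM_ERROR"
    return host_byte


def error_from_host_byte(host_byte):
    """Model of translating a failed command's host byte to an errno."""
    if host_byte == "DID_MEDIUM_ERROR":
        return -errno.ENODATA
    return -errno.EIO  # assumed generic error for the failed read


def should_fail_path(error):
    """Model of the multipath check: some errors must not fail the path."""
    noretry_errors = {-errno.ENODATA}
    return error not in noretry_errors


def multipath_fails_path(driver_host_byte, sense_key="MEDIUM_ERROR"):
    host_byte = decide_host_byte(driver_host_byte, sense_key)
    return should_fail_path(error_from_host_byte(host_byte))
```

In this model `multipath_fails_path("DID_PASSTHROUGH")` is true (the pre-patch behavior, where both paths end up failed), while `multipath_fails_path("DID_OK")` is false because the medium error is recognized and only the I/O is failed.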