summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-04-10ftgmac100: Cleanup tx queue handlingBenjamin Herrenschmidt
We have a private lock which isn't terribly useful, and we maintain a "tx_pending" counter for information that's already available via a trivial arithmetic operation. Then we unconditionaly wake the queue even when not stopped. Finally our code in tx isn't really safe vs. a concurrent reclaim. The aspeed chips aren't SMP today but I prefer the code being right and future proof. So rip that out and replace it with more "standard" queue handling, currently with a threshold of 1 queue element, which will be increased when we implement fragmented sends. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Store tx skbs in a separate arrayBenjamin Herrenschmidt
Rather than in the descriptor. The descriptor is mapped non-cachable and rather slow to access. Since to do that we need to keep track of the tx "pointer" we also have no use of all the accesors to manipulate it, just open code it, it's as clear and will help when adding fragmented sends. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Pad small frames properlyBenjamin Herrenschmidt
Rather than just transmitting garbage past the end of the small packet. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Factor tx packet dropping pathBenjamin Herrenschmidt
Use a simple goto to a drop path at the tail of the function, it will be used in a few more cases soon Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Merge ftgmac100_xmit() into ftgmac100_hard_start_xmit()Benjamin Herrenschmidt
This will make subsequent rework of the tx path simpler Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Move ftgmac100_hard_start_xmit() aroundBenjamin Herrenschmidt
Move it below ftgmac100_xmit() and the rest of the tx path No code change. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10ftgmac100: Add a tx timeout handlerBenjamin Herrenschmidt
We have a reset task to reset our chip, use it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-10Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a number of bugs in the caam driver: - device creation fails after release - error-path NULL-pointer dereference - spurious hardware error in RNG deinstantiation" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: caam - fix RNG deinstantiation error checking crypto: caam - fix invalid dereference in caam_rsa_init_tfm() crypto: caam - fix JR platform device subsequent (re)creations
2017-04-10x86/vdso: Plug race between mapping and ELF header setupThomas Gleixner
The vsyscall32 sysctl can racy against a concurrent fork when it switches from disabled to enabled: arch_setup_additional_pages() if (vdso32_enabled) --> No mapping sysctl.vsysscall32() --> vdso32_enabled = true create_elf_tables() ARCH_DLINFO_IA32 if (vdso32_enabled) { --> Add VDSO entry with NULL pointer Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for the newly forked process or not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mathias Krause <minipli@googlemail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-10x86/vdso: Ensure vdso32_enabled gets set to valid values onlyMathias Krause
vdso_enabled can be set to arbitrary integer values via the kernel command line 'vdso32=' parameter or via 'sysctl abi.vsyscall32'. load_vdso32() only maps VDSO if vdso_enabled == 1, but ARCH_DLINFO_IA32 merily checks for vdso_enabled != 0. As a consequence the AT_SYSINFO_EHDR auxiliary vector for the VDSO_ENTRY is emitted with a NULL pointer which causes a segfault when the application tries to use the VDSO. Restrict the valid arguments on the command line and the sysctl to 0 and 1. Fixes: b0b49f2673f0 ("x86, vdso: Remove compat vdso support") Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Roland McGrath <roland@redhat.com> Link: http://lkml.kernel.org/r/1491424561-7187-1-git-send-email-minipli@googlemail.com Link: http://lkml.kernel.org/r/20170410151723.518412863@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-10audit: make sure we don't let the retry queue grow without boundsPaul Moore
The retry queue is intended to provide a temporary buffer in the case of transient errors when communicating with auditd, it is not meant as a long life queue, that functionality is provided by the hold queue. This patch fixes a problem identified by Seth where the retry queue could grow uncontrollably if an auditd instance did not connect to the kernel to drain the queues. This commit fixes this by doing the following: * Make sure we always call auditd_reset() if we decide the connection with audit is really dead. There were some cases in kauditd_hold_skb() where we did not reset the connection, this patch relocates the reset calls to kauditd_thread() so all the error conditions are caught and the connection reset. As a side effect, this means we could move auditd_reset() and get rid of the forward definition at the top of kernel/audit.c. * We never checked the status of the auditd connection when processing the main audit queue which meant that the retry queue could grow unchecked. This patch adds a call to auditd_reset() after the main queue has been processed if auditd is not connected, the auditd_reset() call will make sure the retry and hold queues are correctly managed/flushed so that the retry queue remains reasonable. Cc: <stable@vger.kernel.org> # 4.10.x-: 5b52330bbfe6 Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-04-10pinctrl: samsung: Add missing part for PINCFG_TYPE_DRV of Exynos5433Chanwoo Choi
The commit 1259feddd0f8("pinctrl: samsung: Fix the width of PINCFG_TYPE_DRV bitfields for Exynos5433") already fixed the different width of PINCFG_TYPE_DRV from previous Exynos SoC. However wrong merge conflict resolution was chosen in commit 7f36f5d11cda ("Merge tag 'v4.10-rc6' into devel") effectively dropping the changes for PINCFG_TYPE_DRV. Re-do them here. The macro EXYNOS_PIN_BANK_EINTW is no longer used so remove it. Fixes: 7f36f5d11cda ("Merge tag 'v4.10-rc6' into devel") Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-04-09net: dsa: mt7530: Include gpio/consumer.h for GPIO functionsFlorian Fainelli
Fixes build errors seen with CONFIG_GPIOLIB disabled and warnings enabled: drivers/net/dsa/mt7530.c: In function 'mt7530_setup': drivers/net/dsa/mt7530.c:948:3: error: implicit declaration of function 'gpiod_set_value_cansleep' [-Werror=implicit-function-declaration] gpiod_set_value_cansleep(priv->reset, 0); ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mt7530.c: In function 'mt7530_probe': drivers/net/dsa/mt7530.c:1068:17: error: implicit declaration of function 'devm_gpiod_get_optional' [-Werror=implicit-function-declaration] priv->reset = devm_gpiod_get_optional(&mdiodev->dev, "reset", ^~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mt7530.c:1069:13: error: 'GPIOD_OUT_LOW' undeclared (first use in this function) GPIOD_OUT_LOW); ^~~~~~~~~~~~~ drivers/net/dsa/mt7530.c:1069:13: Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09tcp: clear saved_syn in tcp_disconnect()Eric Dumazet
In the (very unlikely) case a passive socket becomes a listener, we do not want to duplicate its saved SYN headers. This would lead to double frees, use after free, and please hackers and various fuzzers Tested: 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, IPPROTO_TCP, TCP_SAVE_SYN, [1], 4) = 0 +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 5) = 0 +0 < S 0:0(0) win 32972 <mss 1460,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.1 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 +0 connect(4, AF_UNSPEC, ...) = 0 +0 close(3) = 0 +0 bind(4, ..., ...) = 0 +0 listen(4, 5) = 0 +0 < S 0:0(0) win 32972 <mss 1460,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.1 < . 1:1(0) ack 1 win 257 Fixes: cd8ae85299d5 ("tcp: provide SYN headers for passive connections") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09bpf: fix comment typoAlexander Alemayhu
o s/bpf_bpf_get_socket_cookie/bpf_get_socket_cookie Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09netvsc: use napi_consume_skbstephen hemminger
This allows using deferred skb freeing and with NAPI. And get buffer recycling. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09Merge tag 'wireless-drivers-next-for-davem-2017-04-07' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.12 Lots of bugfixes as usual but also some new features. Major changes: ath10k * improve firmware download time for QCA6174 and QCA9377, especially helps resume time ath9k_htc * add support AirTies 1eda:2315 AR9271 device rt2x00 * add support MT7620 mwifiex * enable auto deep sleep mode for USB chipsets brcmfmac * add support for network namespaces (WIPHY_FLAG_NETNS_OK) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09Revert "rtnl: Add support for netdev event to link messages"David S. Miller
This reverts commit def12888c161e6fec0702e5ec9c3962846e3a21d. As per discussion between Roopa Prabhu and David Ahern, it is advisable that we instead have the code collect the setlink triggered events into a bitmask emitted in the IFLA_EVENT netlink attribute. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-04-08 This series contains updates to i40e and i40evf only. Mitch fixes an issue where the client driver (i40iw) was attempting to load on x710 devices (which do not support iWARP), so only register with the client if iWARP is supported. Jake fixes up error messages to better clarify to the user when adding a invalid flow type. Updates the driver to look up the MAC address from eth_get_platform_mac_address() first before checking what the firmware provides. Cleans up code so we are not repeating a duplicate loop, by checking both transmit and receive queues in a single loop. Also cleans up flags never used, so remove the definitions. Alex does cleanup so that we are always updating pf->flags when a change is made to the private flags. Adds support for 3K buffers to the receive path so that we can provide the additional padding needed in the event of NET_IP_ALIGN being non-zero or a cache line being greater than 64. Adds support for build_skb() to i40e/i40evf. Maciej adjusts the scope of the rtnl lock held during reset because it was stopping other PFs from running their reset procedures. Alan reduces code complexity in i40e_detect_recover_hung_queue(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09Linux 4.11-rc6v4.11-rc6Linus Torvalds
2017-04-09Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull CIFS fixes from Steve French: "This is a set of CIFS/SMB3 fixes for stable. There is another set of four SMB3 reconnect fixes for stable in progress but they are still being reviewed/tested, so didn't want to wait any longer to send these five below" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Reset TreeId to zero on SMB2 TREE_CONNECT CIFS: Fix build failure with smb2 Introduce cifs_copy_file_range() SMB3: Rename clone_range to copychunk_range Handle mismatched open calls
2017-04-09Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "A number of ARM fixes: - prevent oopses caused by dma_get_sgtable() and declared DMA coherent memory - fix boot failure on nommu caused by ID_PFR1 access - a number of kprobes fixes from Jon Medhurst and Masami Hiramatsu" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8665/1: nommu: access ID_PFR1 only if CPUID scheme ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory arm: kprobes: Align stack to 8-bytes in test code arm: kprobes: Fix the return address of multiple kretprobes arm: kprobes: Skip single-stepping in recursing path if possible arm: kprobes: Allow to handle reentered kprobe on single-stepping
2017-04-09Merge tag 'driver-core-4.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are 3 small fixes for 4.11-rc6. One resolves a reported issue with sysfs files that NeilBrown found, one is a documenatation fix for the stable kernel rules, and the last is a small MAINTAINERS file update for kernfs" * tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: MAINTAINERS: separate out kernfs maintainership sysfs: be careful of error returns from ops->show() Documentation: stable-kernel-rules: fix stable-tag format
2017-04-09Merge tag 'staging-4.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO driver rfixes from Greg KH: "Here are a number of small IIO and staging driver fixes for 4.11-rc6. Nothing big here, just iio fixes for reported issues, and an ashmem fix for a very old bug that has been reported by a number of Android vendors" * tag 'staging-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: android: ashmem: lseek failed due to no FMODE_LSEEK. iio: hid-sensor-attributes: Fix sensor property setting failure. iio: accel: hid-sensor-accel-3d: Fix duplicate scan index error iio: core: Fix IIO_VAL_FRACTIONAL_LOG2 for negative values iio: st_pressure: initialize lps22hb bootime iio: bmg160: reset chip when probing iio: cros_ec_sensors: Fix return value to get raw and calibbias data.
2017-04-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "statx followup fixes and a fix for stack-smashing on alpha" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: alpha: fix stack smashing in old_adjtimex(2) statx: Include a mask for stx_attributes in struct statx statx: Reserve the top bit of the mask for future struct expansion xfs: report crtime and attribute flags to statx ext4: Add statx support statx: optimize copy of struct statx to userspace statx: remove incorrect part of vfs_statx() comment statx: reject unknown flags when using NULL path Documentation/filesystems: fix documentation for ->getattr()
2017-04-08netfilter: nf_ct_expect: use proper RCU list traversal/update APIsLiping Zhang
We should use proper RCU list APIs to manipulate help->expectations, as we can dump the conntrack's expectations via nfnetlink, i.e. in ctnetlink_exp_ct_dump_table(), where only rcu_read_lock is acquired. So for list traversal, use hlist_for_each_entry_rcu; for list add/del, use hlist_add_head_rcu and hlist_del_rcu. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULLLiping Zhang
For IPCTNL_MSG_EXP_GET, if the CTA_EXPECT_MASTER attr is specified, then the NLM_F_DUMP request will dump the expectations related to this connection tracking. But we forget to check whether the conntrack has nf_conn_help or not, so if nfct_help(ct) is NULL, oops will happen: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: ctnetlink_exp_ct_dump_table+0xf9/0x1e0 [nf_conntrack_netlink] Call Trace: ? ctnetlink_exp_ct_dump_table+0x75/0x1e0 [nf_conntrack_netlink] netlink_dump+0x124/0x2a0 __netlink_dump_start+0x161/0x190 ctnetlink_dump_exp_ct+0x16c/0x1bc [nf_conntrack_netlink] ? ctnetlink_exp_fill_info.constprop.33+0xf0/0xf0 [nf_conntrack_netlink] ? ctnetlink_glue_seqadj+0x20/0x20 [nf_conntrack_netlink] ctnetlink_get_expect+0x32e/0x370 [nf_conntrack_netlink] ? debug_lockdep_rcu_enabled+0x1d/0x20 nfnetlink_rcv_msg+0x60a/0x6a9 [nfnetlink] ? nfnetlink_rcv_msg+0x1b9/0x6a9 [nfnetlink] [...] Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08netfilter: make it safer during the inet6_dev->addr_list traversalLiping Zhang
inet6_dev->addr_list is protected by inet6_dev->lock, so only using rcu_read_lock is not enough, we should acquire read_lock_bh(&idev->lock) before the inet6_dev->addr_list traversal. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08netfilter: ctnetlink: make it safer when checking the ct helper nameLiping Zhang
One CPU is doing ctnetlink_change_helper(), while another CPU is doing unhelp() at the same time. So even if help->helper is not NULL at first, the later statement strcmp(help->helper->name, ...) may still access the NULL pointer. So we must use rcu_read_lock and rcu_dereference to avoid such _bad_ thing happen. Fixes: f95d7a46bc57 ("netfilter: ctnetlink: Fix regression in CTA_HELP processing") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_findGao Feng
When invoke __nf_conntrack_helper_find, it needs the rcu lock to protect the helper module which would not be unloaded. Now there are two caller nf_conntrack_helper_try_module_get and ctnetlink_create_expect which don't hold rcu lock. And the other callers left like ctnetlink_change_helper, ctnetlink_create_conntrack, and ctnetlink_glue_attach_expect, they already hold the rcu lock or spin_lock_bh. Remove the rcu lock in functions nf_ct_helper_expectfn_find_by_name and nf_ct_helper_expectfn_find_by_symbol. Because they return one pointer which needs rcu lock, so their caller should hold the rcu lock, not in these two functions. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08netfilter: ctnetlink: using bit to represent the ct eventLiping Zhang
Otherwise, creating a new conntrack via nfnetlink: # conntrack -I -p udp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 will emit the wrong ct events(where UPDATE should be NEW): # conntrack -E [UPDATE] udp 17 10 src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0 Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08Merge branch 'dsa-receive-path-simplifications'David S. Miller
Florian Fainelli says: ==================== net: dsa: Receive path simplifications This patch series does factor the common code found in all tag implementations into dsa_switch_rcv(). The original motivation was to add GRO support, but this may be a lot of work with unclear benefits at this point. Changes in v2: - take care of tag_mtk.c in the process ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08net: dsa: Factor bottom tag receive functionsFlorian Fainelli
All DSA tag receive functions do strictly the same thing after they have located the originating source port from their tag specific protocol: - push ETH_HLEN bytes - set pkt_type to PACKET_HOST - call eth_type_trans() - bump up counters - call netif_receive_skb() Factor all of that into dsa_switch_rcv(). This also makes us return a pointer to a sk_buff, which makes us symetric with the xmit function. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08net: dsa: Move skb_unshare() to dsa_switch_rcv()Florian Fainelli
All DSA tag receive functions need to unshare the skb before mangling it, move this to the generic dsa_switch_rcv() function which will allow us to make the tag receive function return their mangled skb without caring about freeing a NULL skb. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08net: dsa: Do not check for NULL dst in tag parsersFlorian Fainelli
dsa_switch_rcv() already tests for dst == NULL, so there is no need to duplicate the same check within the tag receive functions. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08netfilter: xt_TCPMSS: add more sanity tests on tcph->doffEric Dumazet
Denys provided an awesome KASAN report pointing to an use after free in xt_TCPMSS I have provided three patches to fix this issue, either in xt_TCPMSS or in xt_tcpudp.c. It seems xt_TCPMSS patch has the smallest possible impact. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-08skbuff: Extend gso_type to unsigned int.Steffen Klassert
All available gso_type flags are currently in use, so extend gso_type from 'unsigned short' to 'unsigned int' to be able to add further flags. We reorder the struct skb_shared_info to use two bytes of the four byte hole before dataref. All fields before dataref are cleared, i.e. four bytes more than before the change. The remaining two byte hole is moved to the beginning of the structure, this protects us from immediate overwites on out of bound writes to the sk_buff head. Structure layout on x86-64 before the change: struct skb_shared_info { unsigned char nr_frags; /* 0 1 */ __u8 tx_flags; /* 1 1 */ short unsigned int gso_size; /* 2 2 */ short unsigned int gso_segs; /* 4 2 */ short unsigned int gso_type; /* 6 2 */ struct sk_buff * frag_list; /* 8 8 */ struct skb_shared_hwtstamps hwtstamps; /* 16 8 */ u32 tskey; /* 24 4 */ __be32 ip6_frag_id; /* 28 4 */ atomic_t dataref; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ void * destructor_arg; /* 40 8 */ skb_frag_t frags[17]; /* 48 272 */ /* --- cacheline 5 boundary (320 bytes) --- */ /* size: 320, cachelines: 5, members: 12 */ /* sum members: 316, holes: 1, sum holes: 4 */ }; Structure layout on x86-64 after the change: struct skb_shared_info { short unsigned int _unused; /* 0 2 */ unsigned char nr_frags; /* 2 1 */ __u8 tx_flags; /* 3 1 */ short unsigned int gso_size; /* 4 2 */ short unsigned int gso_segs; /* 6 2 */ struct sk_buff * frag_list; /* 8 8 */ struct skb_shared_hwtstamps hwtstamps; /* 16 8 */ unsigned int gso_type; /* 24 4 */ u32 tskey; /* 28 4 */ __be32 ip6_frag_id; /* 32 4 */ atomic_t dataref; /* 36 4 */ void * destructor_arg; /* 40 8 */ skb_frag_t frags[17]; /* 48 272 */ /* --- cacheline 5 boundary (320 bytes) --- */ /* size: 320, cachelines: 5, members: 13 */ }; Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Here's a pull request for 4.11-rc, fixing a set of issues mostly centered around the new scheduling framework. These have been brewing for a while, but split up into what we absolutely need in 4.11, and what we can defer until 4.12. These are well tested, on both single queue and multiqueue setups, and with and without shared tags. They fix several hangs that have happened in testing. This is obviously larger than I would have preferred at this point in time, but I don't think we can shave much off this and still get the desired results. In detail, this pull request contains: - a set of five fixes for NVMe, mostly from Christoph and one from Roland. - a series from Bart, fixing issues with dm-mq and SCSI shared tags and scheduling. Note that one of those patches commit messages may read like an optimization, but it is in fact an important fix for queue restarts in particular. - a series from Omar, most importantly fixing a hang with multiple hardware queues when we fail to get a driver tag. Another important fix in there is for resizing hardware queues, which nbd does when handling multiple sockets for one connection. - fixing an imbalance in putting the ctx for hctx request allocations from Minchan" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: Restart a single queue if tag sets are shared dm rq: Avoid that request processing stalls sporadically scsi: Avoid that SCSI queues get stuck blk-mq: Introduce blk_mq_delay_run_hw_queue() blk-mq: remap queues when adding/removing hardware queues blk-mq-sched: fix crash in switch error path blk-mq-sched: set up scheduler tags when bringing up new queues blk-mq-sched: refactor scheduler initialization blk-mq: use the right hctx when getting a driver tag fails nvmet: fix byte swap in nvmet_parse_io_cmd nvmet: fix byte swap in nvmet_execute_write_zeroes nvmet: add missing byte swap in nvmet_get_smart_log nvme: add missing byte swap in nvme_setup_discard nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 block: do not put mq context in blk_mq_alloc_request_hctx
2017-04-08Merge tag 'pinctrl-v4.11-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fix from Linus Walleij: "This late fix for pin control is hopefully the last I send this cycle. The problem was detected early in the v4.11 release cycle and there has been some back and forth on how to solve it. Sadly the proper fix arrives late, but at least not too late. An issue was detected with pin control on the Freescale i.MX after the refactorings for more general group and function handling. We now have the proper fix for this" * tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()
2017-04-08Merge tag 'powerpc-4.11-7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.11: Headed to stable: - disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash triggered by a hostile guest, but only in configurations that no one uses - don't try to fix up misaligned load-with-reservation instructions - fix flush_(d|i)cache_range() called from modules on little endian kernels - add missing global TLB invalidate if cxl is active - fix missing preempt_disable() in crc32c-vpmsum And a fix for selftests build changes that went in this release: - selftests/powerpc: Fix standalone powerpc build Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras" * tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() powerpc/mm: Add missing global TLB invalidate if cxl is active powerpc/64: Fix flush_(d|i)cache_range() called from modules powerpc: Don't try to fix up misaligned load-with-reservation instructions powerpc: Disable HFSCR[TM] if TM is not supported selftests/powerpc: Fix standalone powerpc build
2017-04-08mm/mempolicy.c: fix error handling in set_mempolicy and mbind.Chris Salls
In the case that compat_get_bitmap fails we do not want to copy the bitmap to the user as it will contain uninitialized stack data and leak sensitive data. Signed-off-by: Chris Salls <salls@cs.ucsb.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvecLiping Zhang
Currently, inputting the following command will succeed but actually the value will be truncated: # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat This is not friendly to the user, so instead, we should report error when the value is larger than UINT_MAX. Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08MAINTAINERS: separate out kernfs maintainershipTejun Heo
Separate out kernfs from driver core and add myself as a co-maintainer. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08Merge tag 'linux-can-fixes-for-4.12-20170404' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-04-04 this is a pull request of two patches for net/master. The first patch by Markus Marb fixes a register read access in the ifi driver. The second patch by Geert Uytterhoeven for the rcar driver remove the printing of a kernel virtual address. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08liquidio: fix VF incorrectly indicating that it successfully set its VLANFelix Manlunas
For security reasons, NIC firmware does not allow VF to set its VLAN if PF set it already. Firmware allows VF to set its VLAN if PF did not set it. After the VF instructs the firmware to set the VLAN, VF always indicates (via return 0) that the operation is successful--even for the times when it isn't. Put in a mechanism for the VF's set VLAN function to receive the firmware response code, then make that function return -EPERM if the firmware forbids the operation. Make that mechanism available for other functions that may, in the future, be interested in receiving the response code from the firmware. That mechanism involves adding new fields to struct octnic_ctrl_pkt, so make all users of struct octnic_ctrl_pkt initialize the struct to zero before using it; otherwise, the mechanism might act on uninitialized garbage. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08sysfs: be careful of error returns from ops->show()NeilBrown
ops->show() can return a negative error code. Commit 65da3484d9be ("sysfs: correctly handle short reads on PREALLOC attrs.") (in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors would look like large numbers. As a result, if an error is returned, sysfs_kf_read() will return the value of 'count', typically 4096. Commit 17d0774f8068 ("sysfs: correctly handle read offset on PREALLOC attrs") (in v4.8) extended this error to use the unsigned large 'len' as a size for memmove(). Consequently, if ->show returns an error, then the first read() on the sysfs file will return 4096 and could return uninitialized memory to user-space. If the application performs a subsequent read, this will trigger a memmove() with extremely large count, and is likely to crash the machine is bizarre ways. This bug can currently only be triggered by reading from an md sysfs attribute declared with __ATTR_PREALLOC() during the brief period between when mddev_put() deletes an mddev from the ->all_mddevs list, and when mddev_delayed_delete() - which is scheduled on a workqueue - completes. Before this, an error won't be returned by the ->show() After this, the ->show() won't be called. I can reproduce it reliably only by putting delay like usleep_range(500000,700000); early in mddev_delayed_delete(). Then after creating an md device md0 run echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state The bug can be triggered without the usleep. Fixes: 65da3484d9be ("sysfs: correctly handle short reads on PREALLOC attrs.") Fixes: 17d0774f8068 ("sysfs: correctly handle read offset on PREALLOC attrs") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08Documentation: stable-kernel-rules: fix stable-tag formatJohan Hovold
A patch documenting how to specify which kernels a particular fix should be backported to (seemingly) inadvertently added a minus sign after the kernel version. This particular stable-tag format had never been used prior to this patch, and was neither present when the patch in question was first submitted (it was added in v2 without any comment). Drop the minus sign to avoid any confusion. Fixes: fdc81b7910ad ("stable_kernel_rules: Add clause about specification of kernel versions to patch.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08netvsc: Initialize all channel related state prior to opening the channelK. Y. Srinivasan
Prior to opening the channel we should have all the state setup to handle interrupts. The current code does not do that; fix the bug. This bug can result in faults in the interrupt path. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08net: dsa: mv88e6xxx: Make SMI c22/c45 read/write functions staticFlorian Fainelli
The SMI clause 22 & 45 read/write operations are local to the global2.c file, so make them static. This eliminates the following warning: drivers/net/dsa/mv88e6xxx/global2.c:571:5: warning: no previous prototype for 'mv88e6xxx_g2_smi_phy_read_c45' [-Wmissing-prototypes] int mv88e6xxx_g2_smi_phy_read_c45(struct mv88e6xxx_chip *chip, int addr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.c:602:5: warning: no previous prototype for 'mv88e6xxx_g2_smi_phy_read_c22' [-Wmissing-prototypes] int mv88e6xxx_g2_smi_phy_read_c22(struct mv88e6xxx_chip *chip, int addr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.c:635:5: warning: no previous prototype for 'mv88e6xxx_g2_smi_phy_write_c45' [-Wmissing-prototypes] int mv88e6xxx_g2_smi_phy_write_c45(struct mv88e6xxx_chip *chip, int addr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/global2.c:664:5: warning: no previous prototype for 'mv88e6xxx_g2_smi_phy_write_c22' [-Wmissing-prototypes] int mv88e6xxx_g2_smi_phy_write_c22(struct mv88e6xxx_chip *chip, int addr, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skbGao Feng
Because TCP_MIB_OUTRSTS is an important count, so always increase it whatever send it successfully or not. Now move the increment of TCP_MIB_OUTRSTS to the top of tcp_send_active_reset to make sure it is increased always even though fail to alloc skb. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>