linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2019-08-09	rxrpc: Don't bother generating maxSkew in the ACK packet	David Howells
	Don't bother generating maxSkew in the ACK packet as it has been obsolete since AFS 3.1. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
2019-08-09	rxrpc: Fix local endpoint refcounting	David Howells
	The object lifetime management on the rxrpc_local struct is broken in that the rxrpc_local_processor() function is expected to clean up and remove an object - but it may get requeued by packets coming in on the backing UDP socket once it starts running. This may result in the assertion in rxrpc_local_rcu() firing because the memory has been scheduled for RCU destruction whilst still queued: rxrpc: Assertion failed ------------[ cut here ]------------ kernel BUG at net/rxrpc/local_object.c:468! Note that if the processor comes around before the RCU free function, it will just do nothing because ->dead is true. Fix this by adding a separate refcount to count active users of the endpoint that causes the endpoint to be destroyed when it reaches 0. The original refcount can then be used to refcount objects through the work processor and cause the memory to be rcu freed when that reaches 0. Fixes: 4f95dd78a77e ("rxrpc: Rework local endpoint management") Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com>
2019-08-09	MAINTAINERS: handle fbdev changes through drm-misc tree	Bartlomiej Zolnierkiewicz
	fbdev patches will now go to upstream through drm-misc tree (IOW starting with v5.4 merge window fbdev changes will be included in DRM pull request) for improved maintainership and better integration testing. Update MAINTAINERS file accordingly. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2019-08-09	bcache: Revert "bcache: use sysfs_match_string() instead of ↵	Coly Li
	__sysfs_match_string()" This reverts commit 89e0341af082dbc170019f908846f4a424efc86b. In drivers/md/bcache/sysfs.c:bch_snprint_string_list(), NULL pointer at the end of list is necessary. Remove the NULL from last element of each lists will cause the following panic, [ 4340.455652] bcache: register_cache() registered cache device nvme0n1 [ 4340.464603] bcache: register_bdev() registered backing device sdk [ 4421.587335] bcache: bch_cached_dev_run() cached dev sdk is running already [ 4421.587348] bcache: bch_cached_dev_attach() Caching sdk as bcache0 on set 354e1d46-d99f-4d8b-870b-078b80dc88a6 [ 5139.247950] general protection fault: 0000 [#1] SMP NOPTI [ 5139.247970] CPU: 9 PID: 5896 Comm: cat Not tainted 4.12.14-95.29-default #1 SLE12-SP4 [ 5139.247988] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 04/18/2019 [ 5139.248006] task: ffff888fb25c0b00 task.stack: ffff9bbacc704000 [ 5139.248021] RIP: 0010:string+0x21/0x70 [ 5139.248030] RSP: 0018:ffff9bbacc707bf0 EFLAGS: 00010286 [ 5139.248043] RAX: ffffffffa7e432e3 RBX: ffff8881c20da02a RCX: ffff0a00ffffff04 [ 5139.248058] RDX: 3f00656863616362 RSI: ffff8881c20db000 RDI: ffffffffffffffff [ 5139.248075] RBP: ffff8881c20db000 R08: 0000000000000000 R09: ffff8881c20da02a [ 5139.248090] R10: 0000000000000004 R11: 0000000000000000 R12: ffff9bbacc707c48 [ 5139.248104] R13: 0000000000000fd6 R14: ffffffffc0665855 R15: ffffffffc0665855 [ 5139.248119] FS: 00007faf253b8700(0000) GS:ffff88903f840000(0000) knlGS:0000000000000000 [ 5139.248137] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5139.248149] CR2: 00007faf25395008 CR3: 0000000f72150006 CR4: 00000000007606e0 [ 5139.248164] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5139.248179] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5139.248193] PKRU: 55555554 [ 5139.248200] Call Trace: [ 5139.248210] vsnprintf+0x1fb/0x510 [ 5139.248221] snprintf+0x39/0x40 [ 5139.248238] bch_snprint_string_list.constprop.15+0x5b/0x90 [bcache] [ 5139.248256] __bch_cached_dev_show+0x44d/0x5f0 [bcache] [ 5139.248270] ? __alloc_pages_nodemask+0xb2/0x210 [ 5139.248284] bch_cached_dev_show+0x2c/0x50 [bcache] [ 5139.248297] sysfs_kf_seq_show+0xbb/0x190 [ 5139.248308] seq_read+0xfc/0x3c0 [ 5139.248317] __vfs_read+0x26/0x140 [ 5139.248327] vfs_read+0x87/0x130 [ 5139.248336] SyS_read+0x42/0x90 [ 5139.248346] do_syscall_64+0x74/0x160 [ 5139.248358] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 5139.248370] RIP: 0033:0x7faf24eea370 [ 5139.248379] RSP: 002b:00007fff82d03f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 5139.248395] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007faf24eea370 [ 5139.248411] RDX: 0000000000020000 RSI: 00007faf25396000 RDI: 0000000000000003 [ 5139.248426] RBP: 00007faf25396000 R08: 00000000ffffffff R09: 0000000000000000 [ 5139.248441] R10: 000000007c9d4d41 R11: 0000000000000246 R12: 00007faf25396000 [ 5139.248456] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff [ 5139.248892] Code: ff ff ff 0f 1f 80 00 00 00 00 49 89 f9 48 89 cf 48 c7 c0 e3 32 e4 a7 48 c1 ff 30 48 81 fa ff 0f 00 00 48 0f 46 d0 48 85 ff 74 45 <44> 0f b6 02 48 8d 42 01 45 84 c0 74 38 48 01 fa 4c 89 cf eb 0e The simplest way to fix is to revert commit 89e0341af082 ("bcache: use sysfs_match_string() instead of __sysfs_match_string()"). This bug was introduced in Linux v5.2, so this fix only applies to Linux v5.2 is enough for stable tree maintainer. Fixes: 89e0341af082 ("bcache: use sysfs_match_string() instead of __sysfs_match_string()") Cc: stable@vger.kernel.org Cc: Alexandru Ardelean <alexandru.ardelean@analog.com> Reported-by: Peifeng Lin <pflin@suse.com> Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-09	netfilter: nf_flow_table: teardown flow timeout race	Pablo Neira Ayuso
	Flows that are in teardown state (due to RST / FIN TCP packet) still have their offload flag set on. Hence, the conntrack garbage collector may race to undo the timeout adjustment that the fixup routine performs, leaving the conntrack entry in place with the internal offload timeout (one day). Update teardown flow state to ESTABLISHED and set tracking to liberal, then once the offload bit is cleared, adjust timeout if it is more than the default fixup timeout (conntrack might already have set a lower timeout from the packet path). Fixes: da5984e51063 ("netfilter: nf_flow_table: add support for sending flows back to the slow path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-09	netfilter: nf_flow_table: conntrack picks up expired flows	Pablo Neira Ayuso
	Update conntrack entry to pick up expired flows, otherwise the conntrack entry gets stuck with the internal offload timeout (one day). The TCP state also needs to be adjusted to ESTABLISHED state and tracking is set to liberal mode in order to give conntrack a chance to pick up the expired flow. Fixes: ac2a66665e23 ("netfilter: add generic flow table infrastructure") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-09	netfilter: nf_tables: use-after-free in failing rule with bound set	Pablo Neira Ayuso
	If a rule that has already a bound anonymous set fails to be added, the preparation phase releases the rule and the bound set. However, the transaction object from the abort path still has a reference to the set object that is stale, leading to a use-after-free when checking for the set->bound field. Add a new field to the transaction that specifies if the set is bound, so the abort path can skip releasing it since the rule command owns it and it takes care of releasing it. After this update, the set->bound field is removed. [ 24.649883] Unable to handle kernel paging request at virtual address 0000000000040434 [ 24.657858] Mem abort info: [ 24.660686] ESR = 0x96000004 [ 24.663769] Exception class = DABT (current EL), IL = 32 bits [ 24.669725] SET = 0, FnV = 0 [ 24.672804] EA = 0, S1PTW = 0 [ 24.675975] Data abort info: [ 24.678880] ISV = 0, ISS = 0x00000004 [ 24.682743] CM = 0, WnR = 0 [ 24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000 [ 24.692207] [0000000000040434] pgd=0000000000000000 [ 24.697119] Internal error: Oops: 96000004 [#1] SMP [...] [ 24.889414] Call trace: [ 24.891870] __nf_tables_abort+0x3f0/0x7a0 [ 24.895984] nf_tables_abort+0x20/0x40 [ 24.899750] nfnetlink_rcv_batch+0x17c/0x588 [ 24.904037] nfnetlink_rcv+0x13c/0x190 [ 24.907803] netlink_unicast+0x18c/0x208 [ 24.911742] netlink_sendmsg+0x1b0/0x350 [ 24.915682] sock_sendmsg+0x4c/0x68 [ 24.919185] ___sys_sendmsg+0x288/0x2c8 [ 24.923037] __sys_sendmsg+0x7c/0xd0 [ 24.926628] __arm64_sys_sendmsg+0x2c/0x38 [ 24.930744] el0_svc_common.constprop.0+0x94/0x158 [ 24.935556] el0_svc_handler+0x34/0x90 [ 24.939322] el0_svc+0x8/0xc [ 24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863) [ 24.948336] ---[ end trace cebbb9dcbed3b56f ]--- Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-09	omap-dma/omap_vout_vrfb: fix off-by-one fi value	Hans Verkuil
	The OMAP 4 TRM specifies that when using double-index addressing the address increases by the ES plus the EI value minus 1 within a frame. When a full frame is transferred, the address increases by the ES plus the frame index (FI) value minus 1. The omap-dma code didn't account for the 'minus 1' in the FI register. To get correct addressing, add 1 to the src_icg value. This was found when testing a hacked version of the media m2m-deinterlace.c driver on a Pandaboard. The only other source that uses this feature is omap_vout_vrfb.c, and that adds a + 1 when setting the dst_icg. This is a workaround for the broken omap-dma.c behavior. So remove the workaround at the same time that we fix omap-dma.c. I tested the omap_vout driver with a Beagle XM board to check that the '+ 1' in omap_vout_vrfb.c was indeed a workaround for the omap-dma bug. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Link: https://lore.kernel.org/r/952e7f51-f208-9333-6f58-b7ed20d2ea0b@xs4all.nl Signed-off-by: Vinod Koul <vkoul@kernel.org>
2019-08-09	ALSA: hda - Apply workaround for another AMD chip 1022:1487	Takashi Iwai
	MSI MPG X570 board is with another AMD HD-audio controller (PCI ID 1022:1487) and it requires the same workaround applied for X370, etc (PCI ID 1022:1457). BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=195303 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-08-09	s390/vdso: map vdso also for statically linked binaries	Heiko Carstens
	s390 does not map the vdso for statically linked binaries, assuming that this doesn't make sense. See commit fc5243d98ac2 ("[S390] arch_setup_additional_pages arguments"). However with glibc commit d665367f596d ("linux: Enable vDSO for static linking as default (BZ#19767)") and commit 5e855c895401 ("s390: Enable VDSO for static linking") the vdso is also used for statically linked binaries - if the kernel would make it available. Therefore map the vdso always, just like all other architectures. Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-09	drm/i915: Use after free in error path in intel_vgpu_create_workload()	Dan Carpenter
	We can't free "workload" until after the printk or it's a use after free. Fixes: 2089a76ade90 ("drm/i915/gvt: Checking workload's gma earlier") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2019-08-09	KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable	Alexandru Elisei
	A HW mapped level sensitive interrupt asserted by a device will not be put into the ap_list if it is disabled at the VGIC level. When it is enabled again, it will be inserted into the ap_list and written to a list register on guest entry regardless of the state of the device. We could argue that this can also happen on real hardware, when the command to enable the interrupt reached the GIC before the device had the chance to de-assert the interrupt signal; however, we emulate the distributor and redistributors in software and we can do better than that. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-08-09	KVM: arm: Don't write junk to CP15 registers on reset	Marc Zyngier
	At the moment, the way we reset CP15 registers is mildly insane: We write junk to them, call the reset functions, and then check that we have something else in them. The "fun" thing is that this can happen while the guest is running (PSCI, for example). If anything in KVM has to evaluate the state of a CP15 register while junk is in there, bad thing may happen. Let's stop doing that. Instead, we track that we have called a reset function for that register, and assume that the reset function has done something. In the end, the very need of this reset check is pretty dubious, as it doesn't check everything (a lot of the CP15 reg leave outside of the cp15_regs[] array). It may well be axed in the near future. Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-08-09	KVM: arm64: Don't write junk to sysregs on reset	Marc Zyngier
	At the moment, the way we reset system registers is mildly insane: We write junk to them, call the reset functions, and then check that we have something else in them. The "fun" thing is that this can happen while the guest is running (PSCI, for example). If anything in KVM has to evaluate the state of a system register while junk is in there, bad thing may happen. Let's stop doing that. Instead, we track that we have called a reset function for that register, and assume that the reset function has done something. This requires fixing a couple of sysreg refinition in the trap table. In the end, the very need of this reset check is pretty dubious, as it doesn't check everything (a lot of the sysregs leave outside of the sys_regs[] array). It may well be axed in the near future. Tested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-08-09	Merge tag 'drm-intel-fixes-2019-08-08' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.3-rc4: - Fix GLK DSI escape clock setting - Fix a memleak on HDCP revoked Ksv error path Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87pnlghz79.fsf@intel.com
2019-08-09	Merge tag 'drm-misc-fixes-2019-08-08' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.3-rc4: - Suspend fix for rockchip - Fix unterminated strncpy cmdline mode parser Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ace294a6-6bb2-d9b1-695d-3260e1d60831@linux.intel.com
2019-08-08	net: tundra: tsi108: use spin_lock_irqsave instead of spin_lock_irq in IRQ ↵	Fuqian Huang
	context As spin_unlock_irq will enable interrupts. Function tsi108_stat_carry is called from interrupt handler tsi108_irq. Interrupts are enabled in interrupt handler. Use spin_lock_irqsave/spin_unlock_irqrestore instead of spin_(un)lock_irq in IRQ context to avoid this. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	team: Add vlan tx offload to hw_enc_features	YueHaibing
	We should also enable team's vlan tx offload in hw_enc_features, pass the vlan packets to the slave devices with vlan tci, let the slave handle vlan tunneling offload implementation. Fixes: 3268e5cb494d ("team: Advertise tunneling offload features") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09	Merge branch 'vmwgfx-fixes-5.3' of ↵	Dave Airlie
	git://people.freedesktop.org/~thomash/linux into drm-fixes One single memory leak fix. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom "VMware" <thomas@shipmail.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190808094615.31040-1-thomas@shipmail.org
2019-08-08	net/tls: prevent skb_orphan() from leaking TLS plain text with offload	Jakub Kicinski
	sk_validate_xmit_skb() and drivers depend on the sk member of struct sk_buff to identify segments requiring encryption. Any operation which removes or does not preserve the original TLS socket such as skb_orphan() or skb_clone() will cause clear text leaks. Make the TCP socket underlying an offloaded TLS connection mark all skbs as decrypted, if TLS TX is in offload mode. Then in sk_validate_xmit_skb() catch skbs which have no socket (or a socket with no validation) and decrypted flag set. Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and sk->sk_validate_xmit_skb are slightly interchangeable right now, they all imply TLS offload. The new checks are guarded by CONFIG_TLS_DEVICE because that's the option guarding the sk_buff->decrypted member. Second, smaller issue with orphaning is that it breaks the guarantee that packets will be delivered to device queues in-order. All TLS offload drivers depend on that scheduling property. This means skb_orphan_partial()'s trick of preserving partial socket references will cause issues in the drivers. We need a full orphan, and as a result netem delay/throttling will cause all TLS offload skbs to be dropped. Reusing the sk_buff->decrypted flag also protects from leaking clear text when incoming, decrypted skb is redirected (e.g. by TC). See commit 0608c69c9a80 ("bpf: sk_msg, sock{map\|hash} redirect through ULP") for justification why the internal flag is safe. The only location which could leak the flag in is tcp_bpf_sendmsg(), which is taken care of by clearing the previously unused bit. v2: - remove superfluous decrypted mark copy (Willem); - remove the stale doc entry (Boris); - rely entirely on EOR marking to prevent coalescing (Boris); - use an internal sendpages flag instead of marking the socket (Boris). v3 (Willem): - reorganize the can_skb_orphan_partial() condition; - fix the flag leak-in through tcp_bpf_sendmsg. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	Merge branch 'skbedit-batch-fixes'	David S. Miller
	Roman Mashak says: ==================== Fix batched event generation for skbedit action When adding or deleting a batch of entries, the kernel sends up to TCA_ACT_MAX_PRIO (defined to 32 in kernel) entries in an event to user space. However it does not consider that the action sizes may vary and require different skb sizes. For example, consider the following script adding 32 entries with all supported skbedit parameters and cookie (in order to maximize netlink messages size): % cat tc-batch.sh TC="sudo /mnt/iproute2.git/tc/tc" $TC actions flush action skbedit for i in `seq 1 $1`; do cmd="action skbedit queue_mapping 2 priority 10 mark 7/0xaabbccdd \ ptype host inheritdsfield \ index $i cookie aabbccddeeff112233445566778800a1 " args=$args$cmd done $TC actions add $args % % ./tc-batch.sh 32 Error: Failed to fill netlink attributes while adding TC action. We have an error talking to the kernel % patch 1 adds callback in tc_action_ops of skbedit action, which calculates the action size, and passes size to tcf_add_notify()/tcf_del_notify(). patch 2 updates the TDC test suite with relevant skbedit test cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	tc-testing: updated skbedit action tests with batch create/delete	Roman Mashak
	Update TDC tests with cases varifying ability of TC to install or delete batches of skbedit actions. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	net sched: update skbedit action for batched events operations	Roman Mashak
	Add get_fill_size() routine used to calculate the action size when building a batch of events. Fixes: ca9b0e27e ("pkt_action: add new action skbedit") Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	net: dsa: sja1105: remove set but not used variables 'tx_vid' and 'rx_vid'	YueHaibing
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump: drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning: variable tx_vid set but not used [-Wunused-but-set-variable] drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning: variable rx_vid set but not used [-Wunused-but-set-variable] They are not used since commit 6d7c7d948a2e ("net: dsa: sja1105: Fix broken learning with vlan_filtering disabled") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	bonding: Add vlan tx offload to hw_enc_features	YueHaibing
	As commit 30d8177e8ac7 ("bonding: Always enable vlan tx offload") said, we should always enable bonding's vlan tx offload, pass the vlan packets to the slave devices with vlan tci, let them to handle vlan implementation. Now if encapsulation protocols like VXLAN is used, skb->encapsulation may be set, then the packet is passed to vlan device which based on bonding device. However in netif_skb_features(), the check of hw_enc_features: if (skb->encapsulation) features &= dev->hw_enc_features; clears NETIF_F_HW_VLAN_CTAG_TX/NETIF_F_HW_VLAN_STAG_TX. This results in same issue in commit 30d8177e8ac7 like this: vlan_dev_hard_start_xmit -->dev_queue_xmit -->validate_xmit_skb -->netif_skb_features //NETIF_F_HW_VLAN_CTAG_TX is cleared -->validate_xmit_vlan -->__vlan_hwaccel_push_inside //skb->tci is cleared ... --> bond_start_xmit --> bond_xmit_hash //BOND_XMIT_POLICY_ENCAP34 --> __skb_flow_dissect // nhoff point to IP header --> case htons(ETH_P_8021Q) // skb_vlan_tag_present is false, so vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan), //vlan point to ip header wrongly Fixes: b2a103e6d0af ("bonding: convert to ndo_fix_features") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	net: sched: sch_taprio: fix memleak in error path for sched list parse	Ivan Khoronzhuk
	In error case, all entries should be freed from the sched list before deleting it. For simplicity use rcu way. Fixes: 5a781ccbd19e46 ("tc: Add support for configuring the taprio scheduler") Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09	soundwire: fix regmap dependencies and align with other serial links	Pierre-Louis Bossart
	The existing code has a mixed select/depend usage which makes no sense. config SOUNDWIRE_BUS tristate select REGMAP_SOUNDWIRE config REGMAP_SOUNDWIRE tristate depends on SOUNDWIRE_BUS Let's remove one layer of Kconfig definitions and align with the solutions used by all other serial links. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20190718230215.18675-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2019-08-08	net: docs: replace IPX in tuntap documentation	Stephen Hemminger
	IPX is no longer supported, but the example in the documentation might useful. Replace it with IPv6. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	docs: admin-guide: remove references to IPX and token-ring	Stephen Hemminger
	Both IPX and TR have not been supported for a while now. Remove them from the /proc/sys/net documentation. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	xen/netback: Reset nr_frags before freeing skb	Ross Lagerwall
	At this point nr_frags has been incremented but the frag does not yet have a page assigned so freeing the skb results in a crash. Reset nr_frags before freeing the skb to prevent this. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	dt-bindings: riscv: fix the schema compatible string for the HiFive ↵	Paul Walmsley
	Unleashed board The YAML binding document for SiFive boards has an incorrect compatible string for the HiFive Unleashed board. Change it to match the name of the board on the SiFive web site: https://www.sifive.com/boards/hifive-unleashed which also matches the contents of the board DT data file: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts#n13 Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Acked-by: Rob Herring <robh@kernel.org>
2019-08-08	dt-bindings: riscv: remove obsolete cpus.txt	Paul Walmsley
	Remove the now-obsolete riscv/cpus.txt DT binding document, since we are using YAML binding documentation instead. While doing so, transfer the explanatory text about 'harts' (with some edits) into the YAML file, at Rob's request. Link: https://lore.kernel.org/linux-riscv/CAL_JsqJs6MtvmuyAknsUxQymbmoV=G+=JfS1PQj9kNHV7fjC9g@mail.gmail.com/ Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Rob Herring <robh@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org>
2019-08-08	RISC-V: Remove udivdi3	Palmer Dabbelt
	This should never have landed in the first place: it was added as part of 64-bit divide support for 32-bit systems, but the kernel doesn't allow this sort of division. I must have forgotten to remove it. This patch removes the support. Since this routine only worked on 64-bit platforms but was only built on 32-bit platforms, it's essentially just nonsense anyway. Signed-off-by: Palmer Dabbelt <palmer@sifive.com> Acked-by: Nicolas Pitre <nico@fluxnic.net> Link: https://lore.kernel.org/linux-riscv/nycvar.YSQ.7.76.1908061413360.19480@knanqh.ubzr/T/#t Reported-by: Eric Lin <tesheng@andestech.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-08	riscv: delay: use do_div() instead of __udivdi3()	Paul Walmsley
	In preparation for removing __udivdi3() from the RISC-V architecture-specific files, convert its one user to use do_div(). This avoids breaking the RV32 build after __udivdi3() is removed. This second version removes the assignment of the remainder to an unused temporary variable. Thanks to Nicolas Pitre <nico@fluxnic.net> for the suggestion. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Nicolas Pitre <nico@fluxnic.net>
2019-08-08	dt-bindings: Update the riscv,isa string description	Atish Patra
	Since the RISC-V specification states that ISA description strings are case-insensitive, there's no functional difference between mixed-case, upper-case, and lower-case ISA strings. Thus, to simplify parsing, specify that the letters present in "riscv,isa" must be all lowercase. Suggested-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-08	inet: frags: re-introduce skb coalescing for local delivery	Guillaume Nault
	Before commit d4289fcc9b16 ("net: IP6 defrag: use rbtrees for IPv6 defrag"), a netperf UDP_STREAM test[0] using big IPv6 datagrams (thus generating many fragments) and running over an IPsec tunnel, reported more than 6Gbps throughput. After that patch, the same test gets only 9Mbps when receiving on a be2net nic (driver can make a big difference here, for example, ixgbe doesn't seem to be affected). By reusing the IPv4 defragmentation code, IPv6 lost fragment coalescing (IPv4 fragment coalescing was dropped by commit 14fe22e33462 ("Revert "ipv4: use skb coalescing in defragmentation"")). Without fragment coalescing, be2net runs out of Rx ring entries and starts to drop frames (ethtool reports rx_drops_no_frags errors). Since the netperf traffic is only composed of UDP fragments, any lost packet prevents reassembly of the full datagram. Therefore, fragments which have no possibility to ever get reassembled pile up in the reassembly queue, until the memory accounting exeeds the threshold. At that point no fragment is accepted anymore, which effectively discards all netperf traffic. When reassembly timeout expires, some stale fragments are removed from the reassembly queue, so a few packets can be received, reassembled and delivered to the netperf receiver. But the nic still drops frames and soon the reassembly queue gets filled again with stale fragments. These long time frames where no datagram can be received explain why the performance drop is so significant. Re-introducing fragment coalescing is enough to get the initial performances again (6.6Gbps with be2net): driver doesn't drop frames anymore (no more rx_drops_no_frags errors) and the reassembly engine works at full speed. This patch is quite conservative and only coalesces skbs for local IPv4 and IPv6 delivery (in order to avoid changing skb geometry when forwarding). Coalescing could be extended in the future if need be, as more scenarios would probably benefit from it. [0]: Test configuration Sender: ip xfrm policy flush ip xfrm state flush ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1 ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir in tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1 ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir out tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow netserver -D -L fc00:2::1 Receiver: ip xfrm policy flush ip xfrm state flush ip xfrm state add src fc00:2::1 dst fc00:1::1 proto esp spi 0x1001 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:2::1 dst fc00:1::1 ip xfrm policy add src fc00:2::1 dst fc00:1::1 dir in tmpl src fc00:2::1 dst fc00:1::1 proto esp mode transport action allow ip xfrm state add src fc00:1::1 dst fc00:2::1 proto esp spi 0x1000 aead 'rfc4106(gcm(aes))' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b 96 mode transport sel src fc00:1::1 dst fc00:2::1 ip xfrm policy add src fc00:1::1 dst fc00:2::1 dir out tmpl src fc00:1::1 dst fc00:2::1 proto esp mode transport action allow netperf -H fc00:2::1 -f k -P 0 -L fc00:1::1 -l 60 -t UDP_STREAM -I 99,5 -i 5,5 -T5,5 -6 Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08	Merge tag 'nfs-for-5.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds
	Pull NFS client fixes from Trond Myklebust: "Highlights include: Stable fixes: - NFSv4: Ensure we check the return value of update_open_stateid() so we correctly track active open state. - NFSv4: Fix for delegation state recovery to ensure we recover all open modes that are active. - NFSv4: Fix an Oops in nfs4_do_setattr Fixes: - NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts - NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() - NFSv4: Fix a credential refcount leak in nfs41_check_delegation_stateid - pNFS: Report errors from the call to nfs4_select_rw_stateid() - NFSv4: Various other delegation and open stateid recovery fixes - NFSv4: Fix state recovery behaviour when server connection times out" * tag 'nfs-for-5.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Ensure state recovery handles ETIMEDOUT correctly NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts NFSv4: Fix an Oops in nfs4_do_setattr NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim() NFSv4: Check the return value of update_open_stateid() NFSv4.1: Only reap expired delegations NFSv4.1: Fix open stateid recovery NFSv4: Report the error from nfs4_select_rw_stateid() NFSv4: When recovering state fails with EAGAIN, retry the same recovery NFSv4: Print an error in the syslog when state is marked as irrecoverable NFSv4: Fix delegation state recovery NFSv4: Fix a credential refcount leak in nfs41_check_delegation_stateid
2019-08-08	clk: samsung: exynos542x: Move MSCL subsystem clocks to its sub-CMU	Marek Szyprowski
	M2M scaler clocks require special handling of their parent bus clock during power domain on/off sequences. MSCL clocks were not initially added to the sub-CMU handler, because that time there was no driver for the M2M scaler device and it was not possible to test it. This patch fixes this issue. Parent clock for M2M scaler devices is now properly preserved during MSC power domain on/off sequence. This gives M2M scaler devices proper performance: fullHD XRGB32 image 1000 rotations test takes 3.17s instead of 45.08s. Fixes: b06a532bf1fa ("clk: samsung: Add Exynos5 sub-CMU clock driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lkml.kernel.org/r/20190808121839.23892-1-m.szyprowski@samsung.com Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-08-08	clk: samsung: exynos5800: Move MAU subsystem clocks to MAU sub-CMU	Sylwester Nawrocki
	This patch fixes broken sound on Exynos5422/5800 platforms after system/suspend resume cycle in cases where the audio root clock is derived from MAU_EPLL_CLK. In order to preserve state of the USER_MUX_MAU_EPLL_CLK clock mux during system suspend/resume cycle for Exynos5800 we group the MAU block input clocks in "MAU" sub-CMU and add the clock mux control bit to .suspend_regs. This ensures that user configuration of the mux is not lost after the PMU block changes the mux setting to OSC_DIV when switching off the MAU power domain. Adding the SRC_TOP9 register to exynos5800_clk_regs[] array is not sufficient as at the time of the syscore_ops suspend call MAU power domain is already turned off and we already save and subsequently restore an incorrect register's value. Fixes: b06a532bf1fa ("clk: samsung: Add Exynos5 sub-CMU clock driver") Reported-by: Jaafar Ali <jaafarkhalaf@gmail.com> Suggested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Jaafar Ali <jaafarkhalaf@gmail.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Link: https://lkml.kernel.org/r/20190808144929.18685-2-s.nawrocki@samsung.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-08-08	clk: samsung: Change signature of exynos5_subcmus_init() function	Sylwester Nawrocki
	In order to make it easier in subsequent patch to create different subcmu lists for exynos5420 and exynos5800 SoCs the code is rewritten so we pass an array of pointers to the subcmus initialization function. Fixes: b06a532bf1fa ("clk: samsung: Add Exynos5 sub-CMU clock driver") Tested-by: Jaafar Ali <jaafarkhalaf@gmail.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Link: https://lkml.kernel.org/r/20190808144929.18685-1-s.nawrocki@samsung.com Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-08-08	Merge tag 'perf-urgent-for-mingo-5.3-20190808' of ↵	Thomas Gleixner
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent perf/urgent fixes: db-export: Adrian Hunter: - Fix thread__exec_comm() picking of main thread COMM for pre-existing, synthesized from /proc data records. annotate: Arnaldo Carvalho de Melo: - Fix printing of unaugmented disassembled instructions from BPF, some lines were leaving leftovers from the previous screen, due to use of newlines by binutils's libopcode disassembler. perf ftrace: He Zhe: - Fix cpumask problems when only one CPU is present. PMU events: Jin Yao: - Add missing "cpu_clk_unhalted.core" event. perf bench: Jiri Olsa: - Fix cpu0 binding in the NUMA benchmarks. s390: Thomas Richter: - Fix module size calculations. build system: Masanari Iida: - Fix a typo in a variable name in the Documentation Makefile misc: Ian Rogers: - Fix include paths in ui directory. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-08	net/mlx5e: Remove redundant check in CQE recovery flow of tx reporter	Aya Levin
	Remove check of recovery bit, in the beginning of the CQE recovery function. This test is already performed right before the reporter is invoked, when CQE error is detected. Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: Fix error flow of CQE recovery on tx reporter	Aya Levin
	CQE recovery function begins with test and set of recovery bit. Add an error flow which ensures clearing of this bit when leaving the recovery function, to allow further recoveries to take place. This allows removal of clearing recovery bit on sq activate. Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: Fix false negative indication on tx reporter CQE recovery	Aya Levin
	Remove wrong error return value when SQ is not in error state. CQE recovery on TX reporter queries the sq state. If the sq is not in error state, the sq is either in ready or reset state. Ready state is good state which doesn't require recovery and reset state is a temporal state which ends in ready state. With this patch, CQE recovery in this scenario is successful. Fixes: de8650a82071 ("net/mlx5e: Add tx reporter support") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: kTLS, Fix tisn field placement	Tariq Toukan
	Shift the tisn field in the WQE control segment, per the HW specification. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: kTLS, Fix tisn field name	Tariq Toukan
	Use the proper tisn field name from the union in struct mlx5_wqe_ctrl_seg. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: kTLS, Fix progress params context WQE layout	Tariq Toukan
	The TLS progress params context WQE should not include an Eth segment, drop it. In addition, align the tls_progress_params layout with the HW specification document: - fix the tisn field name. - remove the valid bit. Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures") Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5: kTLS, Fix wrong TIS opmod constants	Tariq Toukan
	Fix the used constants for TLS TIS opmods, per the HW specification. Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5: crypto, Fix wrong offset in encryption key command	Tariq Toukan
	Fix the 128b key offset in key encryption key creation command, per the HW specification. Fixes: 45d3b55dc665 ("net/mlx5: Add crypto library to support create/destroy encryption key") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-08	net/mlx5e: ethtool, Avoid setting speed to 56GBASE when autoneg off	Mohamad Heib
	Setting speed to 56GBASE is allowed only with auto-negotiation enabled. This patch prevent setting speed to 56GBASE when auto-negotiation disabled. Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality") Signed-off-by: Mohamad Heib <mohamadh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>