summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-08RDMA/mlx5: Set PD pointers for the error flow unwindLeon Romanovsky
ib_pd is accessed internally during destroy of the TIR/TIS, but PD can be not set yet. This leading to the following kernel panic. BUG: kernel NULL pointer dereference, address: 0000000000000074 PGD 8000000079eaa067 P4D 8000000079eaa067 PUD 7ae81067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 709 Comm: syz-executor.0 Not tainted 5.8.0-rc3 #41 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:destroy_raw_packet_qp_tis drivers/infiniband/hw/mlx5/qp.c:1189 [inline] RIP: 0010:destroy_raw_packet_qp drivers/infiniband/hw/mlx5/qp.c:1527 [inline] RIP: 0010:destroy_qp_common+0x2ca/0x4f0 drivers/infiniband/hw/mlx5/qp.c:2397 Code: 00 85 c0 74 2e e8 56 18 55 ff 48 8d b3 28 01 00 00 48 89 ef e8 d7 d3 ff ff 48 8b 43 08 8b b3 c0 01 00 00 48 8b bd a8 0a 00 00 <0f> b7 50 74 e8 0d 6a fe ff e8 28 18 55 ff 49 8d 55 50 4c 89 f1 48 RSP: 0018:ffffc900007bbac8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88807949e800 RCX: 0000000000000998 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88807c180140 RBP: ffff88807b50c000 R08: 000000000002d379 R09: ffffc900007bba00 R10: 0000000000000001 R11: 000000000002d358 R12: ffff888076f37000 R13: ffff88807949e9c8 R14: ffffc900007bbe08 R15: ffff888076f37000 FS: 00000000019bf940(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000074 CR3: 0000000076d68004 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_ib_create_qp+0xf36/0xf90 drivers/infiniband/hw/mlx5/qp.c:3014 _ib_create_qp drivers/infiniband/core/core_priv.h:333 [inline] create_qp+0x57f/0xd20 drivers/infiniband/core/uverbs_cmd.c:1443 ib_uverbs_create_qp+0xcf/0x100 drivers/infiniband/core/uverbs_cmd.c:1564 ib_uverbs_write+0x5fa/0x780 drivers/infiniband/core/uverbs_main.c:664 __vfs_write+0x3f/0x90 fs/read_write.c:495 vfs_write+0xc7/0x1f0 fs/read_write.c:559 ksys_write+0x5e/0x110 fs/read_write.c:612 do_syscall_64+0x3e/0x70 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x466479 Code: Bad RIP value. RSP: 002b:00007ffd057b62b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000466479 RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000019bf8fc R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000bf6 R14: 00000000004cb859 R15: 00000000006fefc0 Fixes: 6c41965d647a ("RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path") Link: https://lore.kernel.org/r/20200707110612.882962-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08IB/mlx5: Fix 50G per lane indicationAya Levin
Some released FW versions mistakenly don't set the capability that 50G per lane link-modes are supported for VFs (ptys_extended_ethernet capability bit). Use PTYS.ext_eth_proto_capability instead, as this indication is always accurate. If PTYS.ext_eth_proto_capability is valid (has a non-zero value) conclude that the HCA supports 50G per lane. Otherwise, conclude that the HCA doesn't support 50G per lane. Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes") Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08selftests: kmod: Add module address visibility testKees Cook
Make sure we don't regress the CAP_SYSLOG behavior of the module address visibility via /proc/modules nor /sys/module/*/sections/*. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()Kees Cook
When evaluating access control over kallsyms visibility, credentials at open() time need to be used, not the "current" creds (though in BPF's case, this has likely always been the same). Plumb access to associated file->f_cred down through bpf_dump_raw_ok() and its callers now that kallsysm_show_value() has been refactored to take struct cred. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: bpf@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump") Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08kprobes: Do not expose probe addresses to non-CAP_SYSLOGKees Cook
The kprobe show() functions were using "current"'s creds instead of the file opener's creds for kallsyms visibility. Fix to use seq_file->file->f_cred. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does") Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms does") Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08module: Do not expose section addresses to non-CAP_SYSLOGKees Cook
The printing of section addresses in /sys/module/*/sections/* was not using the correct credentials to evaluate visibility. Before: # cat /sys/module/*/sections/.*text 0xffffffffc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0xffffffffc0458000 ... After: # cat /sys/module/*/sections/*.text 0xffffffffc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0x0000000000000000 ... Additionally replaces the existing (safe) /proc/modules check with file->f_cred for consistency. Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com> Fixes: be71eda5383f ("module: Fix display of wrong module .text address") Cc: stable@vger.kernel.org Tested-by: Jessica Yu <jeyu@kernel.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08module: Refactor section attr into bin attributeKees Cook
In order to gain access to the open file's f_cred for kallsym visibility permission checks, refactor the module section attributes to use the bin_attribute instead of attribute interface. Additionally removes the redundant "name" struct member. Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Jessica Yu <jeyu@kernel.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08kallsyms: Refactor kallsyms_show_value() to take credKees Cook
In order to perform future tests against the cred saved during open(), switch kallsyms_show_value() to operate on a cred, and have all current callers pass current_cred(). This makes it very obvious where callers are checking the wrong credential in their "read" contexts. These will be fixed in the coming patches. Additionally switch return value to bool, since it is always used as a direct permission check, not a 0-on-success, negative-on-error style function return. Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08cxgb4: fix all-mask IP address comparisonRahul Lakkireddy
Convert all-mask IP address to Big Endian, instead, for comparison. Fixes: f286dd8eaad5 ("cxgb4: use correct type for all-mask IP address comparison") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08tipc: fix retransmission on unicast linksHamish Martin
A scenario has been observed where a 'bc_init' message for a link is not retransmitted if it fails to be received by the peer. This leads to the peer never establishing the link fully and it discarding all other data received on the link. In this scenario the message is lost in transit to the peer. The issue is traced to the 'nxt_retr' field of the skb not being initialised for links that aren't a bc_sndlink. This leads to the comparison in tipc_link_advance_transmq() that gates whether to attempt retransmission of a message performing in an undesirable way. Depending on the relative value of 'jiffies', this comparison: time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr) may return true or false given that 'nxt_retr' remains at the uninitialised value of 0 for non bc_sndlinks. This is most noticeable shortly after boot when jiffies is initialised to a high value (to flush out rollover bugs) and we compare a jiffies of, say, 4294940189 to zero. In that case time_before returns 'true' leading to the skb not being retransmitted. The fix is to ensure that all skbs have a valid 'nxt_retr' time set for them and this is achieved by refactoring the setting of this value into a central function. With this fix, transmission losses of 'bc_init' messages do not stall the link establishment forever because the 'bc_init' message is retransmitted and the link eventually establishes correctly. Fixes: 382f598fb66b ("tipc: reduce duplicate packets for unicast traffic") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08l2tp: remove skb_dst_set() from l2tp_xmit_skb()Xin Long
In the tx path of l2tp, l2tp_xmit_skb() calls skb_dst_set() to set skb's dst. However, it will eventually call inet6_csk_xmit() or ip_queue_xmit() where skb's dst will be overwritten by: skb_dst_set_noref(skb, dst); without releasing the old dst in skb. Then it causes dst/dev refcnt leak: unregister_netdevice: waiting for eth0 to become free. Usage count = 1 This can be reproduced by simply running: # modprobe l2tp_eth && modprobe l2tp_ip # sh ./tools/testing/selftests/net/l2tp.sh So before going to inet6_csk_xmit() or ip_queue_xmit(), skb's dst should be dropped. This patch is to fix it by removing skb_dst_set() from l2tp_xmit_skb() and moving skb_dst_drop() into l2tp_xmit_core(). Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: James Chapman <jchapman@katalix.com> Tested-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08nbd: Fix memory leak in nbd_add_socketZheng Bin
When adding first socket to nbd, if nsock's allocation failed, the data structure member "config->socks" was reallocated, but the data structure member "config->num_connections" was not updated. A memory leak will occur then because the function "nbd_config_put" will free "config->socks" only when "config->num_connections" is not zero. Fixes: 03bf73c315ed ("nbd: prevent memory leak") Reported-by: syzbot+934037347002901b8d2a@syzkaller.appspotmail.com Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08arm64: Documentation: Fix broken table in generated HTMLSuzuki K Poulose
cpu-feature-registers.rst is missing a new line before a couple of tables listing the visible fields, causing broken tables in the HTML documentation generated by "make htmldocs". Fix this by adding the missing new line. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200707143152.154541-1-suzuki.poulose@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: kgdb: Fix single-step exception handling oopsWei Li
After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will delay installing breakpoints, do single-step first), it won't work correctly, and it will enter kdb due to oops. It's because the reason gotten in kdb_stub() is not as expected, and it seems that the ex_vector for single-step should be 0, like what arch powerpc/sh/parisc has implemented. Before the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc [3]kdb> ss Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 Oops: (null) due to oops @ 0xffff800010082ab8 CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6 Hardware name: linux,dummy-virt (DT) pstate: 00000085 (nzcv daIf -PAN -UAO) pc : el1_irq+0x78/0x180 lr : __handle_sysrq+0x80/0x190 sp : ffff800015003bf0 x29: ffff800015003d20 x28: ffff0000fa878040 x27: 0000000000000000 x26: ffff80001126b1f0 x25: ffff800011b6a0d8 x24: 0000000000000000 x23: 0000000080200005 x22: ffff8000101486cc x21: ffff800015003d30 x20: 0000ffffffffffff x19: ffff8000119f2000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : ffff800015003e50 x7 : 0000000000000002 x6 : 00000000380b9990 x5 : ffff8000106e99e8 x4 : ffff0000fadd83c0 x3 : 0000ffffffffffff x2 : ffff800011b6a0d8 x1 : ffff800011b6a000 x0 : ffff80001130c9d8 Call trace: el1_irq+0x78/0x180 printk+0x0/0x84 write_sysrq_trigger+0xb0/0x118 proc_reg_write+0xb4/0xe0 __vfs_write+0x18/0x40 vfs_write+0xb0/0x1b8 ksys_write+0x64/0xf0 __arm64_sys_write+0x14/0x20 el0_svc_common.constprop.2+0xb0/0x168 do_el0_svc+0x20/0x98 el0_sync_handler+0xec/0x1a8 el0_sync+0x140/0x180 [3]kdb> After the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> g Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> ss Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8 [0]kdb> Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support") Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: entry: Tidy up block comments and label numbersWill Deacon
Continually butchering our entry code with CPU errata workarounds has led to it looking a little scruffy. Consistently used /* */ comment style for multi-line block comments and ensure that small numeric labels use consecutive integers. No functional change, but the state of things was irritating. Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: Rework ARM_ERRATUM_1414080 handlingMarc Zyngier
The current handling of erratum 1414080 has the side effect that cntkctl_el1 can get changed for both 32 and 64bit tasks. This isn't a problem so far, but if we ever need to mitigate another of these errata on the 64bit side, we'd better keep the messing with cntkctl_el1 local to 32bit tasks. For that, make sure that on entering the kernel from a 32bit tasks, userspace access to cntvct gets enabled, and disabled returning to userspace, while it never gets changed for 64bit tasks. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200706163802.1836732-5-maz@kernel.org [will: removed branch instructions per Mark's review comments] Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: arch_timer: Disable the compat vdso for cores affected by ↵Marc Zyngier
ARM64_WORKAROUND_1418040 ARM64_WORKAROUND_1418040 requires that AArch32 EL0 accesses to the virtual counter register are trapped and emulated by the kernel. This makes the vdso pretty pointless, and in some cases livelock prone. Provide a workaround entry that limits the vdso to 64bit tasks. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200706163802.1836732-4-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: arch_timer: Allow an workaround descriptor to disable compat vdsoMarc Zyngier
As we are about to disable the vdso for compat tasks in some circumstances, let's allow a workaround descriptor to express exactly that. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200706163802.1836732-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: Introduce a way to disable the 32bit vdsoMarc Zyngier
We have a class of errata (grouped under the ARM64_WORKAROUND_1418040 banner) that force the trapping of counter access from 32bit EL0. We would normally disable the whole vdso for such defect, except that it would disable it for 64bit userspace as well, which is a shame. Instead, add a new vdso_clock_mode, which signals that the vdso isn't usable for compat tasks. This gets checked in the new vdso_clocksource_ok() helper, now provided for the 32bit vdso. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200706163802.1836732-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: entry: Fix the typo in the comment of el1_dbg()Kevin Hao
The function name should be local_daif_mask(). Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Mark Rutlamd <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200417103212.45812-2-haokexin@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08drivers/firmware/psci: Assign @err directly in hotplug_tests()Gavin Shan
The return value of down_and_up_cpus() can be assigned to @err directly. With that, the useless assignment to @err with zero can be dropped. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20200630075943.203954-1-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups()Gavin Shan
The CPU mask (@tmp) should be released on failing to allocate @cpu_groups or any of its elements. Otherwise, it leads to memory leakage because the CPU mask variable is dynamically allocated when CONFIG_CPUMASK_OFFSTACK is enabled. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20200630075227.199624-1-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08KVM: arm64: Fix definition of PAGE_HYP_DEVICEWill Deacon
PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1 pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set. Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits at EL2. Acked-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08Merge branch 'net-smc-fixes'David S. Miller
Karsten Graul says: ==================== net/smc: fixes 2020-07-08 Please apply the following patch series for smc to netdev's net tree. The patches fix problems found during more testing of SMC functionality, resulting in hang conditions and unneeded link deactivations. The clc module was hardened to be prepared for possible future SMCD versions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net/smc: tolerate future SMCD versionsUrsula Braun
CLC proposal messages of future SMCD versions could be larger than SMCD V1 CLC proposal messages. To enable toleration in SMC V1 the receival of CLC proposal messages is adapted: * accept larger length values in CLC proposal * check trailing eye catcher for incoming CLC proposal with V1 length only * receive the whole CLC proposal even in cases it does not fit into the V1 buffer Fixes: e7b7a64a8493d ("smc: support variable CLC proposal messages") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net/smc: switch smcd_dev_list spinlock to mutexUrsula Braun
The similar smc_ib_devices spinlock has been converted to a mutex. Protecting the smcd_dev_list by a mutex is possible as well. This patch converts the smcd_dev_list spinlock to a mutex. Fixes: c6ba7c9ba43d ("net/smc: add base infrastructure for SMC-D and ISM") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net/smc: fix sleep bug in smc_pnet_find_roce_resource()Ursula Braun
Tests showed this BUG: [572555.252867] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935 [572555.252876] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 131031, name: smcapp [572555.252879] INFO: lockdep is turned off. [572555.252883] CPU: 1 PID: 131031 Comm: smcapp Tainted: G O 5.7.0-rc3uschi+ #356 [572555.252885] Hardware name: IBM 3906 M03 703 (LPAR) [572555.252887] Call Trace: [572555.252896] [<00000000ac364554>] show_stack+0x94/0xe8 [572555.252901] [<00000000aca1f400>] dump_stack+0xa0/0xe0 [572555.252906] [<00000000ac3c8c10>] ___might_sleep+0x260/0x280 [572555.252910] [<00000000acdc0c98>] __mutex_lock+0x48/0x940 [572555.252912] [<00000000acdc15c2>] mutex_lock_nested+0x32/0x40 [572555.252975] [<000003ff801762d0>] mlx5_lag_get_roce_netdev+0x30/0xc0 [mlx5_core] [572555.252996] [<000003ff801fb3aa>] mlx5_ib_get_netdev+0x3a/0xe0 [mlx5_ib] [572555.253007] [<000003ff80063848>] smc_pnet_find_roce_resource+0x1d8/0x310 [smc] [572555.253011] [<000003ff800602f0>] __smc_connect+0x1f0/0x3e0 [smc] [572555.253015] [<000003ff80060634>] smc_connect+0x154/0x190 [smc] [572555.253022] [<00000000acbed8d4>] __sys_connect+0x94/0xd0 [572555.253025] [<00000000acbef620>] __s390x_sys_socketcall+0x170/0x360 [572555.253028] [<00000000acdc6800>] system_call+0x298/0x2b8 [572555.253030] INFO: lockdep is turned off. Function smc_pnet_find_rdma_dev() might be called from smc_pnet_find_roce_resource(). It holds the smc_ib_devices list spinlock while calling infiniband op get_netdev(). At least for mlx5 the get_netdev operation wants mutex serialization, which conflicts with the smc_ib_devices spinlock. This patch switches the smc_ib_devices spinlock into a mutex to allow sleeping when calling get_netdev(). Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net/smc: fix work request handlingKarsten Graul
Wait for pending sends only when smc_switch_conns() found a link to move the connections to. Do not wait during link freeing, this can lead to permanent hang situations. And refuse to provide a new tx slot on an unusable link. Fixes: c6f02ebeea3a ("net/smc: switch connections to alternate link") Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net/smc: separate LLC wait queues for flow and messagesKarsten Graul
There might be races in scenarios where both SMC link groups are on the same system. Prevent that by creating separate wait queues for LLC flows and messages. Switch to non-interruptable versions of wait_event() and wake_up() for the llc flow waiter to make sure the waiters get control sequentially. Fine tune the llc_flow_lock to include the assignment of the message. Write to system log when an unexpected message was dropped. And remove an extra indirection and use the existing local variable lgr in smc_llc_enqueue(). Fixes: 555da9af827d ("net/smc: add event-based llc_flow framework") Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08net: atlantic: fix ip dst and ipv6 address filtersDmitry Bogdanov
This patch fixes ip dst and ipv6 address filters. There were 2 mistakes in the code, which led to the issue: * invalid register was used for ipv4 dst address; * incorrect write order of dwords for ipv6 addresses. Fixes: 23e7a718a49b ("net: aquantia: add rx-flow filter definitions") Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08Documentation: update for gcc 4.9 requirementRandy Dunlap
Update Documentation for the gcc v4.9 upgrade requirement. Fixes: 5429ef62bcf3 ("compiler/gcc: Raise minimum GCC version for kernel builds to 4.8") Fixes: 6ec4476ac825 ("Raise gcc version requirement to 4.9") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-08Merge tag 'sound-5.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small, mostly device-specific fixes. The significant one is the regression fix for USB-audio implicit feedback devices due to the incorrect frame size calculation, which landed in 5.8 and stable trees. In addition, a few usual HD-audio and USB-audio quirks, Intel HDMI fixes, ASoC fsl and rt5682 fixes, as well as the fix in compress-offload partial drain operation" * tag 'sound-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: compress: fix partial_drain completion state ALSA: usb-audio: Add implicit feedback quirk for RTX6001 ALSA: usb-audio: add quirk for MacroSilicon MS2109 ALSA: hda/realtek: Enable headset mic of Acer Veriton N4660G with ALC269VC ALSA: hda/realtek: Enable headset mic of Acer C20-820 with ALC269VC ALSA: hda/realtek - Enable audio jacks of Acer vCopperbox with ALC269VC ALSA: hda/realtek - Fix Lenovo Thinkpad X1 Carbon 7th quirk subdevice id ALSA: hda/hdmi: improve debug traces for stream lookups ALSA: hda/hdmi: fix failures at PCM open on Intel ICL and later ALSA: opl3: fix infoleak in opl3 ALSA: usb-audio: Replace s/frame/packet/ where appropriate ALSA: usb-audio: Fix packet size calculation AsoC: amd: add missing snd- module prefix to the acp3x-rn driver kernel module ALSA: hda - let hs_mic be picked ahead of hp_mic ASoC: rt5682: fix the pop noise while OMTP type headset plugin ASoC: fsl_mqs: Fix unchecked return value for clk_prepare_enable ASoC: fsl_mqs: Don't check clock is NULL before calling clk API
2020-07-08Raise gcc version requirement to 4.9Linus Torvalds
I realize that we fairly recently raised it to 4.8, but the fact is, 4.9 is a much better minimum version to target. We have a number of workarounds for actual bugs in pre-4.9 gcc versions (including things like internal compiler errors on ARM), but we also have some syntactic workarounds for lacking features. In particular, raising the minimum to 4.9 means that we can now just assume _Generic() exists, which is likely the much better replacement for a lot of very convoluted built-time magic with conditionals on sizeof and/or __builtin_choose_expr() with same_type() etc. Using _Generic also means that you will need to have a very recent version of 'sparse', but thats easy to build yourself, and much less of a hassle than some old gcc version can be. The latest (in a long string) of reasons for minimum compiler version upgrades was commit 5435f73d5c4a ("efi/x86: Fix build with gcc 4"). Ard points out that RHEL 7 uses gcc-4.8, but the people who stay back on old RHEL versions persumably also don't build their own kernels anyway. And maybe they should cross-built or just have a little side affair with a newer compiler? Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-08dm: use noio when sending kobject eventMikulas Patocka
kobject_uevent may allocate memory and it may be called while there are dm devices suspended. The allocation may recurse into a suspended device, causing a deadlock. We must set the noio flag when sending a uevent. The observed deadlock was reported here: https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html Reported-by: Khazhismel Kumykov <khazhy@google.com> Reported-by: Tahsin Erdogan <tahsin@google.com> Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm zoned: Fix zone reclaim triggerDamien Le Moal
Only triggering reclaim based on the percentage of unmapped cache zones can fail to detect cases where reclaim is needed, e.g. if the target has only 2 or 3 cache zones and only one unmapped cache zone, the percentage of free cache zones is higher than DMZ_RECLAIM_LOW_UNMAP_ZONES (30%) and reclaim does not trigger. This problem, combined with the fact that dmz_schedule_reclaim() is called from dmz_handle_bio() without the map lock held, leads to a race between zone allocation and dmz_should_reclaim() result. Depending on the workload applied, this race can lead to the write path waiting forever for a free zone without reclaim being triggered. Fix this by moving dmz_schedule_reclaim() inside dmz_alloc_zone() under the map lock. This results in checking the need for zone reclaim whenever a new data or buffer zone needs to be allocated. Also fix dmz_reclaim_percentage() to always return 0 if the number of unmapped cache (or random) zones is less than or equal to 1. Suggested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm zoned: fix unused but set variable warningsWei Yongjun
Fix unused but set variable warnings: drivers/md/dm-zoned-reclaim.c:504:42: warning: variable nr_rnd set but not used [-Wunused-but-set-variable] 504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0; | ^~~~~~ drivers/md/dm-zoned-reclaim.c:504:24: warning: variable nr_unmap_rnd set but not used [-Wunused-but-set-variable] 504 | unsigned int p_unmap, nr_unmap_rnd = 0, nr_rnd = 0; | ^~~~~~~~~~~~ Fixes: f97809aec589 ("dm zoned: per-device reclaim") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm writecache: reject asynchronous pmem devicesMichal Suchanek
DM writecache does not handle asynchronous pmem. Reject it when supplied as cache. Link: https://lore.kernel.org/linux-nvdimm/87lfk5hahc.fsf@linux.ibm.com/ Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08dm: use bio_uninit instead of bio_disassociate_blkgChristoph Hellwig
bio_uninit is the proper API to clean up a BIO that has been allocated on stack or inside a structure that doesn't come from the BIO allocator. Switch dm to use that instead of bio_disassociate_blkg, which really is an implementation detail. Note that the bio_uninit calls are also moved to the two callers of __send_empty_flush, so that they better pair with the bio_init calls used to initialize them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08mmc: sdhci-msm: Override DLL_CONFIG only if the valid value is suppliedVeerabhadrarao Badiganti
During DLL initialization, the DLL_CONFIG register value would be updated with the value supplied from the device-tree. Override this register only if a valid value is supplied. Fixes: 03591160ca19 ("mmc: sdhci-msm: Read and use DLL Config property from device tree file") Signed-off-by: Veerabhadrarao Badiganti <vbadigan@codeaurora.org> Link: https://lore.kernel.org/r/1594213888-2780-1-git-send-email-vbadigan@codeaurora.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-07-08RDMA/siw: Fix reporting vendor_part_idKamal Heib
Move the initialization of the vendor_part_id to be before calling ib_register_device(), this is needed because the query_device() callback is called from the context of ib_register_device() before initializing the vendor_part_id, so the reported value is wrong. Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20200707130931.444724-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08powerpc/64s/exception: Fix 0x1500 interrupt handler crashNicholas Piggin
A typo caused the interrupt handler to branch immediately to the common "unknown interrupt" handler and skip the special case test for denormal cause. This does not affect KVM softpatch handling (e.g., for POWER9 TM assist) because the KVM test was moved to common code by commit 9600f261acaa ("powerpc/64s/exception: Move KVM test to common code") just before this bug was introduced. Fixes: 3f7fbd97d07d ("powerpc/64s/exception: Clean up SRR specifiers") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> [mpe: Split selftest into a separate patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200708074942.1713396-1-npiggin@gmail.com
2020-07-08drm/hisilicon/hibmc: Move drm_fbdev_generic_setup() down to avoid the splatZenghui Yu
The HiSilicon hibmc driver triggers a splat at boot time as below [ 14.137806] ------------[ cut here ]------------ [ 14.142405] hibmc-drm 0000:0a:00.0: Device has not been registered. [ 14.148661] WARNING: CPU: 0 PID: 496 at drivers/gpu/drm/drm_fb_helper.c:2233 drm_fbdev_generic_setup+0x15c/0x1b8 [ 14.158787] [...] [ 14.278307] Call trace: [ 14.280742] drm_fbdev_generic_setup+0x15c/0x1b8 [ 14.285337] hibmc_pci_probe+0x354/0x418 [ 14.289242] local_pci_probe+0x44/0x98 [ 14.292974] work_for_cpu_fn+0x20/0x30 [ 14.296708] process_one_work+0x1c4/0x4e0 [ 14.300698] worker_thread+0x2c8/0x528 [ 14.304431] kthread+0x138/0x140 [ 14.307646] ret_from_fork+0x10/0x18 [ 14.311205] ---[ end trace a2000ec2d838af4d ]--- This turned out to be due to the fbdev device hasn't been registered when drm_fbdev_generic_setup() is invoked. Let's fix the splat by moving it down after drm_dev_register() which will follow the "Display driver example" documented by commit de99f0600a79 ("drm/drv: DOC: Add driver example code"). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200706144713.1123-1-yuzenghui@huawei.com
2020-07-08smb3: fix unneeded error message on change notifySteve French
We should not be logging a warning repeatedly on change notify. CC: Stable <stable@vger.kernel.org> # v5.6+ Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-07-08xtensa: simplify xtensa_pmu_irq_handlerXu Wang
Use for_each_set_bit() instead of open-coding it to simplify the code. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Message-Id: <20200708062023.7986-1-vulab@iscas.ac.cn> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2020-07-08fs: remove __vfs_readChristoph Hellwig
Fold it into the two callers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08fs: implement kernel_read using __kernel_readChristoph Hellwig
Consolidate the two in-kernel read helpers to make upcoming changes easier. The only difference are the missing call to rw_verify_area in kernel_read, and an access_ok check that doesn't make sense for kernel buffers to start with. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08integrity/ima: switch to using __kernel_readChristoph Hellwig
__kernel_read has a bunch of additional sanity checks, and this moves the set_fs out of non-core code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08fs: add a __kernel_read helperChristoph Hellwig
This is the counterpart to __kernel_write, and skip the rw_verify_area call compared to kernel_read. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08fs: remove __vfs_writeChristoph Hellwig
Fold it into the two callers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08fs: implement kernel_write using __kernel_writeChristoph Hellwig
Consolidate the two in-kernel write helpers to make upcoming changes easier. The only difference are the missing call to rw_verify_area in kernel_write, and an access_ok check that doesn't make sense for kernel buffers to start with. Signed-off-by: Christoph Hellwig <hch@lst.de>