summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-10-29um-ubd: Entrust re-queue to the upper layersAnton Ivanov
Fixes crashes due to ubd requeue logic conflicting with the block-mq logic. Crash is reproducible in 5.0 - 5.3. Fixes: 53766defb8c8 ("um: Clean-up command processing in UML UBD driver") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29nvme-multipath: remove unused groups_only mode in ana logAnton Eidelman
groups_only mode in nvme_read_ana_log() is no longer used: remove it. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29nvme-multipath: fix possible io hang after ctrl reconnectAnton Eidelman
The following scenario results in an IO hang: 1) ctrl completes a request with NVME_SC_ANA_TRANSITION. NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered. 2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl. This fails because ctrl disconnects. Therefore nvme_update_ns_ana_state() is not called and NVME_NS_ANA_PENDING bit in ns->flags is not cleared. 3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls nvme_read_ana_log(ctrl, groups_only=true). However, nvme_update_ana_state() does not update namespaces because nr_nsids = 0 (due to groups_only mode). 4) scan_work calls nvme_validate_ns() finds the ns and re-validates OK. Result: The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set. Consequently ctrl will never be considered a viable path by __nvme_find_path(). IO will hang if ctrl is the only or the last path to the namespace. More generally, while ctrl is reconnecting, its ANA state may change. And because nvme_mpath_init() requests ANA log in groups_only mode, these changes are not propagated to the existing ctrl namespaces. This may result in a mal-function or an IO hang. Solution: nvme_mpath_init() will nvme_read_ana_log() with groups_only set to false. This will not harm the new ctrl case (no namespaces present), and will make sure the ANA state of namespaces gets updated after reconnect. Note: Another option would be for nvme_mpath_init() to invoke nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29powerpc/powernv: Fix CPU idle to be called with IRQs disabledNicholas Piggin
Commit e78a7614f3876 ("idle: Prevent late-arriving interrupts from disrupting offline") changes arch_cpu_idle_dead to be called with interrupts disabled, which triggers the WARN in pnv_smp_cpu_kill_self. Fix this by fixing up irq_happened after hard disabling, rather than requiring there are no pending interrupts, similarly to what was done done until commit 2525db04d1cc5 ("powerpc/powernv: Simplify lazy IRQ handling in CPU offline"). Fixes: e78a7614f3876 ("idle: Prevent late-arriving interrupts from disrupting offline") Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add unexpected_mask rather than checking for known bad values, change the WARN_ON() to a WARN_ON_ONCE()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191022115814.22456-1-npiggin@gmail.com
2019-10-29sched/topology: Allow sched_asym_cpucapacity to be disabledValentin Schneider
While the static key is correctly initialized as being disabled, it will remain forever enabled once turned on. This means that if we start with an asymmetric system and hotplug out enough CPUs to end up with an SMP system, the static key will remain set - which is obviously wrong. We should detect this and turn off things like misfit migration and capacity aware wakeups. As Quentin pointed out, having separate root domains makes this slightly trickier. We could have exclusive cpusets that create an SMP island - IOW, the domains within this root domain will not see any asymmetry. This means we can't just disable the key on domain destruction, we need to count how many asymmetric root domains we have. Consider the following example using Juno r0 which is 2+4 big.LITTLE, where two identical cpusets are created: they both span both big and LITTLE CPUs: asym0 asym1 [ ][ ] L L B L L B $ cgcreate -g cpuset:asym0 $ cgset -r cpuset.cpus=0,1,3 asym0 $ cgset -r cpuset.mems=0 asym0 $ cgset -r cpuset.cpu_exclusive=1 asym0 $ cgcreate -g cpuset:asym1 $ cgset -r cpuset.cpus=2,4,5 asym1 $ cgset -r cpuset.mems=0 asym1 $ cgset -r cpuset.cpu_exclusive=1 asym1 $ cgset -r cpuset.sched_load_balance=0 . (the CPU numbering may look odd because on the Juno LITTLEs are CPUs 0,3-5 and bigs are CPUs 1-2) If we make one of those SMP (IOW remove asymmetry) by e.g. hotplugging its big core, we would end up with an SMP cpuset and an asymmetric cpuset - the static key must remain set, because we still have one asymmetric root domain. With the above example, this could be done with: $ echo 0 > /sys/devices/system/cpu/cpu2/online Which would result in: asym0 asym1 [ ][ ] L L B L L When both SMP and asymmetric cpusets are present, all CPUs will observe sched_asym_cpucapacity being set (it is system-wide), but not all CPUs observe asymmetry in their sched domain hierarchy: per_cpu(sd_asym_cpucapacity, <any CPU in asym0>) == <some SD at DIE level> per_cpu(sd_asym_cpucapacity, <any CPU in asym1>) == NULL Change the simple key enablement to an increment, and decrement the key counter when destroying domains that cover asymmetric CPUs. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: df054e8445a4 ("sched/topology: Add static_key for asymmetric CPU capacity optimizations") Link: https://lkml.kernel.org/r/20191023153745.19515-3-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/topology: Don't try to build empty sched domainsValentin Schneider
Turns out hotplugging CPUs that are in exclusive cpusets can lead to the cpuset code feeding empty cpumasks to the sched domain rebuild machinery. This leads to the following splat: Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 235 Comm: kworker/5:2 Not tainted 5.4.0-rc1-00005-g8d495477d62e #23 Hardware name: ARM Juno development board (r0) (DT) Workqueue: events cpuset_hotplug_workfn pstate: 60000005 (nZCv daif -PAN -UAO) pc : build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) lr : build_sched_domains (kernel/sched/topology.c:1966) Call trace: build_sched_domains (./include/linux/arch_topology.h:23 kernel/sched/topology.c:1898 kernel/sched/topology.c:1969) partition_sched_domains_locked (kernel/sched/topology.c:2250) rebuild_sched_domains_locked (./include/linux/bitmap.h:370 ./include/linux/cpumask.h:538 kernel/cgroup/cpuset.c:955 kernel/cgroup/cpuset.c:978 kernel/cgroup/cpuset.c:1019) rebuild_sched_domains (kernel/cgroup/cpuset.c:1032) cpuset_hotplug_workfn (kernel/cgroup/cpuset.c:3205 (discriminator 2)) process_one_work (./arch/arm64/include/asm/jump_label.h:21 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:114 kernel/workqueue.c:2274) worker_thread (./include/linux/compiler.h:199 ./include/linux/list.h:268 kernel/workqueue.c:2416) kthread (kernel/kthread.c:255) ret_from_fork (arch/arm64/kernel/entry.S:1167) Code: f860dae2 912802d6 aa1603e1 12800000 (f8616853) The faulty line in question is: cap = arch_scale_cpu_capacity(cpumask_first(cpu_map)); and we're not checking the return value against nr_cpu_ids (we shouldn't have to!), which leads to the above. Prevent generate_sched_domains() from returning empty cpumasks, and add some assertion in build_sched_domains() to scream bloody murder if it happens again. The above splat was obtained on my Juno r0 with the following reproducer: $ cgcreate -g cpuset:asym $ cgset -r cpuset.cpus=0-3 asym $ cgset -r cpuset.mems=0 asym $ cgset -r cpuset.cpu_exclusive=1 asym $ cgcreate -g cpuset:smp $ cgset -r cpuset.cpus=4-5 smp $ cgset -r cpuset.mems=0 smp $ cgset -r cpuset.cpu_exclusive=1 smp $ cgset -r cpuset.sched_load_balance=0 . $ echo 0 > /sys/devices/system/cpu/cpu4/online $ echo 0 > /sys/devices/system/cpu/cpu5/online Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hannes@cmpxchg.org Cc: lizefan@huawei.com Cc: morten.rasmussen@arm.com Cc: qperret@google.com Cc: tj@kernel.org Cc: vincent.guittot@linaro.org Fixes: 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY flag detection") Link: https://lkml.kernel.org/r/20191023153745.19515-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29USB: gadget: Reject endpoints with 0 maxpacket valueAlan Stern
Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that a UDC will crash or hang when trying to handle a non-zero-length usb_request for such an endpoint. Indeed, dummy-hcd gets a divide error when trying to calculate the remainder of a transfer length by the maxpacket value, as discovered by the syzbot fuzzer. Currently the gadget core does not check for endpoints having a maxpacket value of 0. This patch adds a check to usb_ep_enable(), preventing such endpoints from being used. As far as I know, none of the gadget drivers in the kernel tries to create an endpoint with maxpacket = 0, but until now there has been nothing to prevent userspace programs under gadgetfs or configfs from doing it. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: syzbot+8ab8bf161038a8768553@syzkaller.appspotmail.com CC: <stable@vger.kernel.org> Acked-by: Felipe Balbi <balbi@kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281052370.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29powerpc/prom_init: Undo relocation before entering secure modeThiago Jung Bauermann
The ultravisor will do an integrity check of the kernel image but we relocated it so the check will fail. Restore the original image by relocating it back to the kernel virtual base address. This works because during build vmlinux is linked with an expected virtual runtime address of KERNELBASE. Fixes: 6a9c930bd775 ("powerpc/prom_init: Add the ESM call to prom_init") Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Tested-by: Michael Anderson <andmike@linux.ibm.com> [mpe: Add IS_ENABLED() to fix the CONFIG_RELOCATABLE=n build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190911163433.12822-1-bauerman@linux.ibm.com
2019-10-28scsi: qla2xxx: stop timer in shutdown pathNicholas Piggin
In shutdown/reboot paths, the timer is not stopped: qla2x00_shutdown pci_device_shutdown device_shutdown kernel_restart_prepare kernel_restart sys_reboot This causes lockups (on powerpc) when firmware config space access calls are interrupted by smp_send_stop later in reboot. Fixes: e30d1756480dc ("[SCSI] qla2xxx: Addition of shutdown callback handler.") Link: https://lore.kernel.org/r/20191024063804.14538-1-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28hwmon: (ina3221) Fix read timeout issueNicolin Chen
After introducing "samples" to the calculation of wait time, the driver might timeout at the regmap_field_read_poll_timeout call, because the wait time could be longer than the 100000 usec limit due to a large "samples" number. So this patch sets the timeout limit to 2 times of the wait time in order to fix this issue. Fixes: 5c090abf945b ("hwmon: (ina3221) Add averaging mode support") Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Link: https://lore.kernel.org/r/20191022005922.30239-1-nicoleotsuka@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-10-28Merge tag 'batadv-net-for-davem-20191025' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says: ==================== Here are two batman-adv bugfixes: * Fix free/alloc race for OGM and OGMv2, by Sven Eckelmann (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()Daniel Wagner
lan78xx_status() will run with interrupts enabled due to the change in ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler"). generic_handle_irq() expects to be run with IRQs disabled. [ 4.886203] 000: irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts [ 4.886243] 000: WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x154/0x168 [ 4.896294] 000: Modules linked in: [ 4.896301] 000: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.6 #39 [ 4.896310] 000: Hardware name: Raspberry Pi 3 Model B+ (DT) [ 4.896315] 000: pstate: 60000005 (nZCv daif -PAN -UAO) [ 4.896321] 000: pc : __handle_irq_event_percpu+0x154/0x168 [ 4.896331] 000: lr : __handle_irq_event_percpu+0x154/0x168 [ 4.896339] 000: sp : ffff000010003cc0 [ 4.896346] 000: x29: ffff000010003cc0 x28: 0000000000000060 [ 4.896355] 000: x27: ffff000011021980 x26: ffff00001189c72b [ 4.896364] 000: x25: ffff000011702bc0 x24: ffff800036d6e400 [ 4.896373] 000: x23: 000000000000004f x22: ffff000010003d64 [ 4.896381] 000: x21: 0000000000000000 x20: 0000000000000002 [ 4.896390] 000: x19: ffff8000371c8480 x18: 0000000000000060 [ 4.896398] 000: x17: 0000000000000000 x16: 00000000000000eb [ 4.896406] 000: x15: ffff000011712d18 x14: 7265746e69206465 [ 4.896414] 000: x13: ffff000010003ba0 x12: ffff000011712df0 [ 4.896422] 000: x11: 0000000000000001 x10: ffff000011712e08 [ 4.896430] 000: x9 : 0000000000000001 x8 : 000000000003c920 [ 4.896437] 000: x7 : ffff0000118cc410 x6 : ffff0000118c7f00 [ 4.896445] 000: x5 : 000000000003c920 x4 : 0000000000004510 [ 4.896453] 000: x3 : ffff000011712dc8 x2 : 0000000000000000 [ 4.896461] 000: x1 : 73a3f67df94c1500 x0 : 0000000000000000 [ 4.896466] 000: Call trace: [ 4.896471] 000: __handle_irq_event_percpu+0x154/0x168 [ 4.896481] 000: handle_irq_event_percpu+0x50/0xb0 [ 4.896489] 000: handle_irq_event+0x40/0x98 [ 4.896497] 000: handle_simple_irq+0xa4/0xf0 [ 4.896505] 000: generic_handle_irq+0x24/0x38 [ 4.896513] 000: intr_complete+0xb0/0xe0 [ 4.896525] 000: __usb_hcd_giveback_urb+0x58/0xd8 [ 4.896533] 000: usb_giveback_urb_bh+0xd0/0x170 [ 4.896539] 000: tasklet_action_common.isra.0+0x9c/0x128 [ 4.896549] 000: tasklet_hi_action+0x24/0x30 [ 4.896556] 000: __do_softirq+0x120/0x23c [ 4.896564] 000: irq_exit+0xb8/0xd8 [ 4.896571] 000: __handle_domain_irq+0x64/0xb8 [ 4.896579] 000: bcm2836_arm_irqchip_handle_irq+0x60/0xc0 [ 4.896586] 000: el1_irq+0xb8/0x140 [ 4.896592] 000: arch_cpu_idle+0x10/0x18 [ 4.896601] 000: do_idle+0x200/0x280 [ 4.896608] 000: cpu_startup_entry+0x20/0x28 [ 4.896615] 000: rest_init+0xb4/0xc0 [ 4.896623] 000: arch_call_rest_init+0xc/0x14 [ 4.896632] 000: start_kernel+0x454/0x480 Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler") Cc: Woojung Huh <woojung.huh@microchip.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Stefan Wahren <wahrenst@gmx.net> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Daniel Wagner <dwagner@suse.de> Tested-by: Stefan Wahren <wahrenst@gmx.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: dsa: sja1105: improve NET_DSA_SJA1105_TAS dependencyArnd Bergmann
An earlier bugfix introduced a dependency on CONFIG_NET_SCH_TAPRIO, but this missed the case of NET_SCH_TAPRIO=m and NET_DSA_SJA1105=y, which still causes a link error: drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio': sja1105_tas.c:(.text+0x5c): undefined reference to `taprio_offload_free' sja1105_tas.c:(.text+0x3b4): undefined reference to `taprio_offload_get' drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown': sja1105_tas.c:(.text+0x6ec): undefined reference to `taprio_offload_free' Change the dependency to only allow selecting the TAS code when it can link against the taprio code. Fixes: a8d570de0cc6 ("net: dsa: sja1105: Add dependency for NET_DSA_SJA1105_TAS") Fixes: 317ab5b86c8e ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: ethernet: ftgmac100: Fix DMA coherency issue with SW checksumBenjamin Herrenschmidt
We are calling the checksum helper after the dma_map_single() call to map the packet. This is incorrect as the checksumming code will touch the packet from the CPU. This means the cache won't be properly flushes (or the bounce buffering will leave us with the unmodified packet to DMA). This moves the calculation of the checksum & vlan tags to before the DMA mapping. This also has the side effect of fixing another bug: If the checksum helper fails, we goto "drop" to drop the packet, which will not unmap the DMA mapping. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Fixes: 05690d633f30 ("ftgmac100: Upgrade to NETIF_F_HW_CSUM") Reviewed-by: Vijay Khemka <vijaykhemka@fb.com> Tested-by: Vijay Khemka <vijaykhemka@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: fix sk_page_frag() recursion from memory reclaimTejun Heo
sk_page_frag() optimizes skb_frag allocations by using per-task skb_frag cache when it knows it's the only user. The condition is determined by seeing whether the socket allocation mask allows blocking - if the allocation may block, it obviously owns the task's context and ergo exclusively owns current->task_frag. Unfortunately, this misses recursion through memory reclaim path. Please take a look at the following backtrace. [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10 ... tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 sock_xmit.isra.24+0xa1/0x170 [nbd] nbd_send_cmd+0x1d2/0x690 [nbd] nbd_queue_rq+0x1b5/0x3b0 [nbd] __blk_mq_try_issue_directly+0x108/0x1b0 blk_mq_request_issue_directly+0xbd/0xe0 blk_mq_try_issue_list_directly+0x41/0xb0 blk_mq_sched_insert_requests+0xa2/0xe0 blk_mq_flush_plug_list+0x205/0x2a0 blk_flush_plug_list+0xc3/0xf0 [1] blk_finish_plug+0x21/0x2e _xfs_buf_ioapply+0x313/0x460 __xfs_buf_submit+0x67/0x220 xfs_buf_read_map+0x113/0x1a0 xfs_trans_read_buf_map+0xbf/0x330 xfs_btree_read_buf_block.constprop.42+0x95/0xd0 xfs_btree_lookup_get_block+0x95/0x170 xfs_btree_lookup+0xcc/0x470 xfs_bmap_del_extent_real+0x254/0x9a0 __xfs_bunmapi+0x45c/0xab0 xfs_bunmapi+0x15/0x30 xfs_itruncate_extents_flags+0xca/0x250 xfs_free_eofblocks+0x181/0x1e0 xfs_fs_destroy_inode+0xa8/0x1b0 destroy_inode+0x38/0x70 dispose_list+0x35/0x50 prune_icache_sb+0x52/0x70 super_cache_scan+0x120/0x1a0 do_shrink_slab+0x120/0x290 shrink_slab+0x216/0x2b0 shrink_node+0x1b6/0x4a0 do_try_to_free_pages+0xc6/0x370 try_to_free_mem_cgroup_pages+0xe3/0x1e0 try_charge+0x29e/0x790 mem_cgroup_charge_skmem+0x6a/0x100 __sk_mem_raise_allocated+0x18e/0x390 __sk_mem_schedule+0x2a/0x40 [0] tcp_sendmsg_locked+0x8eb/0xe10 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x26d/0x2b0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 In [0], tcp_send_msg_locked() was using current->page_frag when it called sk_wmem_schedule(). It already calculated how many bytes can be fit into current->page_frag. Due to memory pressure, sk_wmem_schedule() called into memory reclaim path which called into xfs and then IO issue path. Because the filesystem in question is backed by nbd, the control goes back into the tcp layer - back into tcp_sendmsg_locked(). nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes sense - it's in the process of freeing memory and wants to be able to, e.g., drop clean pages to make forward progress. However, this confused sk_page_frag() called from [2]. Because it only tests whether the allocation allows blocking which it does, it now thinks current->page_frag can be used again although it already was being used in [0]. After [2] used current->page_frag, the offset would be increased by the used amount. When the control returns to [0], current->page_frag's offset is increased and the previously calculated number of bytes now may overrun the end of allocated memory leading to silent memory corruptions. Fix it by adding gfpflags_normal_context() which tests sleepable && !reclaim and use it to determine whether to use current->task_frag. v2: Eric didn't like gfp flags being tested twice. Introduce a new helper gfpflags_normal_context() and combine the two tests. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28udp: fix data-race in udp_set_dev_scratch()Eric Dumazet
KCSAN reported a data-race in udp_set_dev_scratch() [1] The issue here is that we must not write over skb fields if skb is shared. A similar issue has been fixed in commit 89c22d8c3b27 ("net: Fix skb csum races when peeking") While we are at it, use a helper only dealing with udp_skb_scratch(skb)->csum_unnecessary, as this allows udp_set_dev_scratch() to be called once and thus inlined. [1] BUG: KCSAN: data-race in udp_set_dev_scratch / udpv6_recvmsg write to 0xffff888120278317 of 1 bytes by task 10411 on cpu 1: udp_set_dev_scratch+0xea/0x200 net/ipv4/udp.c:1308 __first_packet_length+0x147/0x420 net/ipv4/udp.c:1556 first_packet_length+0x68/0x2a0 net/ipv4/udp.c:1579 udp_poll+0xea/0x110 net/ipv4/udp.c:2720 sock_poll+0xed/0x250 net/socket.c:1256 vfs_poll include/linux/poll.h:90 [inline] do_select+0x7d0/0x1020 fs/select.c:534 core_sys_select+0x381/0x550 fs/select.c:677 do_pselect.constprop.0+0x11d/0x160 fs/select.c:759 __do_sys_pselect6 fs/select.c:784 [inline] __se_sys_pselect6 fs/select.c:769 [inline] __x64_sys_pselect6+0x12e/0x170 fs/select.c:769 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff888120278317 of 1 bytes by task 10413 on cpu 0: udp_skb_csum_unnecessary include/net/udp.h:358 [inline] udpv6_recvmsg+0x43e/0xe90 net/ipv6/udp.c:310 inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 10413 Comm: syz-executor.0 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: dpaa2: Use the correct style for SPDX License IdentifierNishad Kamdar
This patch corrects the SPDX License Identifier style in header files related to DPAA2 Ethernet driver supporting Freescale SoCs with DPAA2. For C header files Documentation/process/license-rules.rst mandates C-like comments (opposed to C source files where C++ style should be used) Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28Merge branch 'net-avoid-KCSAN-splats'David S. Miller
Eric Dumazet says: ==================== net: avoid KCSAN splats Often times we use skb_queue_empty() without holding a lock, meaning that other cpus (or interrupt) can change the queue under us. This is fine, but we need to properly annotate the lockless intent to make sure the compiler wont over optimize things. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: add READ_ONCE() annotation in __skb_wait_for_more_packets()Eric Dumazet
__skb_wait_for_more_packets() can be called while other cpus can feed packets to the socket receive queue. KCSAN reported : BUG: KCSAN: data-race in __skb_wait_for_more_packets / __udp_enqueue_schedule_skb write to 0xffff888102e40b58 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2d7/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 read to 0xffff888102e40b58 of 8 bytes by task 13035 on cpu 1: __skb_wait_for_more_packets+0xfa/0x320 net/core/datagram.c:100 __skb_recv_udp+0x374/0x500 net/ipv4/udp.c:1683 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13035 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: use skb_queue_empty_lockless() in busy poll contextsEric Dumazet
Busy polling usually runs without locks. Let's use skb_queue_empty_lockless() instead of skb_queue_empty() Also uses READ_ONCE() in __skb_try_recv_datagram() to address a similar potential problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: use skb_queue_empty_lockless() in poll() handlersEric Dumazet
Many poll() handlers are lockless. Using skb_queue_empty_lockless() instead of skb_queue_empty() is more appropriate. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28udp: use skb_queue_empty_lockless()Eric Dumazet
syzbot reported a data-race [1]. We should use skb_queue_empty_lockless() to document that we are not ensuring a mutual exclusion and silence KCSAN. [1] BUG: KCSAN: data-race in __skb_recv_udp / __udp_enqueue_schedule_skb write to 0xffff888122474b50 of 8 bytes by interrupt on cpu 0: __skb_insert include/linux/skbuff.h:1852 [inline] __skb_queue_before include/linux/skbuff.h:1958 [inline] __skb_queue_tail include/linux/skbuff.h:1991 [inline] __udp_enqueue_schedule_skb+0x2c1/0x410 net/ipv4/udp.c:1470 __udp_queue_rcv_skb net/ipv4/udp.c:1940 [inline] udp_queue_rcv_one_skb+0x7bd/0xc70 net/ipv4/udp.c:2057 udp_queue_rcv_skb+0xb5/0x400 net/ipv4/udp.c:2074 udp_unicast_rcv_skb.isra.0+0x7e/0x1c0 net/ipv4/udp.c:2233 __udp4_lib_rcv+0xa44/0x17c0 net/ipv4/udp.c:2300 udp_rcv+0x2b/0x40 net/ipv4/udp.c:2470 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5010 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5124 process_backlog+0x1d3/0x420 net/core/dev.c:5955 read to 0xffff888122474b50 of 8 bytes by task 8921 on cpu 1: skb_queue_empty include/linux/skbuff.h:1494 [inline] __skb_recv_udp+0x18d/0x500 net/ipv4/udp.c:1653 udp_recvmsg+0xe1/0xb10 net/ipv4/udp.c:1712 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec+0x5c/0x70 net/socket.c:871 ___sys_recvmsg+0x1a0/0x3e0 net/socket.c:2480 do_recvmmsg+0x19a/0x5c0 net/socket.c:2601 __sys_recvmmsg+0x1ef/0x200 net/socket.c:2680 __do_sys_recvmmsg net/socket.c:2703 [inline] __se_sys_recvmmsg net/socket.c:2696 [inline] __x64_sys_recvmmsg+0x89/0xb0 net/socket.c:2696 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 8921 Comm: syz-executor.4 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: add skb_queue_empty_lockless()Eric Dumazet
Some paths call skb_queue_empty() without holding the queue lock. We must use a barrier in order to not let the compiler do strange things, and avoid KCSAN splats. Adding a barrier in skb_queue_empty() might be overkill, I prefer adding a new helper to clearly identify points where the callers might be lockless. This might help us finding real bugs. The corresponding WRITE_ONCE() should add zero cost for current compilers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28Merge tag 'arc-5.4-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "Small fixes for ARC: - perf fix for Big Endian build [Alexey] - hadk platform enable soem peripherals [Eugeniy]" * tag 'arc-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: perf: Accommodate big-endian CPU ARC: [plat-hsdk]: Enable on-boardi SPI ADC IC ARC: [plat-hsdk]: Enable on-board SPI NOR flash IC
2019-10-28RDMA/hns: Prevent memory leaks of eq->buf_listLijun Ou
eq->buf_list->buf and eq->buf_list should also be freed when eqe_hop_num is set to 0, or there will be memory leaks. Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Link: https://lore.kernel.org/r/1572072995-11277-3-git-send-email-liweihang@hisilicon.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RISC-V: Add PCIe I/O BAR memory mappingYash Shah
For legacy I/O BARs (non-MMIO BARs) to work correctly on RISC-V Linux, we need to establish a reserved memory region for them, so that drivers that wish to use the legacy I/O BARs can issue reads and writes against a memory region that is mapped to the host PCIe controller's I/O BAR mapping. Signed-off-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-10-28RDMA/iw_cxgb4: Avoid freeing skb twice in arp failure casePotnuri Bharat Teja
_put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by process_work(). fix double free by freeing the skb only in process_work(). Fixes: 1dad0ebeea1c ("iw_cxgb4: Avoid touch after free error in ARP failure handlers") Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.com Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Use irq xarray locking for mkey_tableJason Gunthorpe
The mkey_table xarray is touched by the reg_mr_callback() function which is called from a hard irq. Thus all other uses of xa_lock must use the _irq variants. WARNING: inconsistent lock state 5.4.0-rc1 #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30 {IN-HARDIRQ-W} state was registered at: lock_acquire+0xe1/0x200 _raw_spin_lock_irqsave+0x35/0x50 reg_mr_callback+0x2dd/0x450 [mlx5_ib] mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core] mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core] [..] Possible unsafe locking scenario: CPU0 ---- lock(&(&xa->xa_lock)->rlock#3); <Interrupt> lock(&(&xa->xa_lock)->rlock#3); *** DEADLOCK *** 2 locks held by python3/343: #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs] #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs] stack backtrace: CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x86/0xca print_usage_bug.cold.50+0x2e5/0x355 mark_lock+0x871/0xb50 ? match_held_lock+0x20/0x250 ? check_usage_forwards+0x240/0x240 __lock_acquire+0x7de/0x23a0 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xae/0xb50 ? mark_held_locks+0xb0/0xb0 ? find_held_lock+0xca/0xf0 lock_acquire+0xe1/0x200 ? xa_erase+0x12/0x30 _raw_spin_lock+0x2a/0x40 ? xa_erase+0x12/0x30 xa_erase+0x12/0x30 mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib] uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs] uverbs_free_mw+0x1a/0x20 [ib_uverbs] destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs] [..] Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases") Link: https://lore.kernel.org/r/20191024234910.GA9038@ziepe.ca Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28UAS: Revert commit 3ae62a42090f ("UAS: fix alignment of scatter/gather ↵Alan Stern
segments") Commit 3ae62a42090f ("UAS: fix alignment of scatter/gather segments"), copying a similar commit for usb-storage, attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices. However, it now turns out that the analogous change in usb-storage interacted badly with commit 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is: ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots) There is no longer any reason to keep the virt_boundary_mask setting in the uas driver. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But: High-speed or slower connections never use a bulk maxpacket value larger than 512; The SCSI layer does not handle block devices with a block size smaller than 512 bytes; All the host controllers capable of SuperSpeed operation can handle fully general SG; Since commit ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG. Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to head off potential problems similar to those affecting usb-storage, this patch reverts commit 3ae62a42090f. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Oliver Neukum <oneukum@suse.com> CC: <stable@vger.kernel.org> Acked-by: Christoph Hellwig <hch@lst.de> Fixes: 3ae62a42090f ("UAS: fix alignment of scatter/gather segments") Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910231132470.1878-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28usb-storage: Revert commit 747668dbc061 ("usb-storage: Set ↵Alan Stern
virt_boundary_mask to avoid SG overflows") Commit 747668dbc061 ("usb-storage: Set virt_boundary_mask to avoid SG overflows") attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices. However, it now turns out that this interacts badly with commit 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is: ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots) There is no longer any reason to keep the virt_boundary_mask setting for usb-storage. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But: High-speed or slower connections never use a bulk maxpacket value larger than 512; The SCSI layer does not handle block devices with a block size smaller than 512 bytes; All the host controllers capable of SuperSpeed operation can handle fully general SG; Since commit ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG. Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to fix the swiotlb problem, this patch reverts commit 747668dbc061. Reported-and-tested-by: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de> Link: https://marc.info/?l=linux-usb&m=157134199501202&w=2 Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Seth Bollinger <Seth.Bollinger@digi.com> CC: <stable@vger.kernel.org> Fixes: 747668dbc061 ("usb-storage: Set virt_boundary_mask to avoid SG overflows") Acked-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910211145520.1673-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28usbip: Fix free of unallocated memory in vhci txSuwan Kim
iso_buffer should be set to NULL after use and free in the while loop. In the case of isochronous URB in the while loop, iso_buffer is allocated and after sending it to server, buffer is deallocated. And then, if the next URB in the while loop is not a isochronous pipe, iso_buffer still holds the previously deallocated buffer address and kfree tries to free wrong buffer address. Fixes: ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Reviewed-by: Julia Lawall <julia.lawall@lip6.fr> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20191022093017.8027-1-suwan.kim027@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28usbip: tools: Fix read_usb_vudc_device() error path handlingGwanYeong Kim
This isn't really accurate right. fread() doesn't always return 0 in error. It could return < number of elements and set errno. Signed-off-by: GwanYeong Kim <gy741.kim@gmail.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20191018032223.4644-1-gy741.kim@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28usb: xhci: fix __le32/__le64 accessors in debugfs codeBen Dooks (Codethink)
It looks like some of the xhci debug code is passing u32 to functions directly from __le32/__le64 fields. Fix this by using le{32,64}_to_cpu() on these to fix the following sparse warnings; xhci-debugfs.c:205:62: warning: incorrect type in argument 1 (different base types) xhci-debugfs.c:205:62: expected unsigned int [usertype] field0 xhci-debugfs.c:205:62: got restricted __le32 xhci-debugfs.c:206:62: warning: incorrect type in argument 2 (different base types) xhci-debugfs.c:206:62: expected unsigned int [usertype] field1 xhci-debugfs.c:206:62: got restricted __le32 ... [Trim down commit message, sparse warnings were similar -Mathias] Cc: <stable@vger.kernel.org> # 4.15+ Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-4-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28usb: xhci: fix Immediate Data Transfer endiannessSamuel Holland
The arguments to queue_trb are always byteswapped to LE for placement in the ring, but this should not happen in the case of immediate data; the bytes copied out of transfer_buffer are already in the correct order. Add a complementary byteswap so the bytes end up in the ring correctly. This was observed on BE ppc64 with a "Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller [104c:8241]" as a ch341 usb-serial adapter ("1a86:7523 QinHeng Electronics HL-340 USB-Serial adapter") always transmitting the same character (generally NUL) over the serial link regardless of the key pressed. Cc: <stable@vger.kernel.org> # 5.2+ Fixes: 33e39350ebd2 ("usb: xhci: add Immediate Data Transfer support") Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-3-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28xhci: Fix use-after-free regression in xhci clear hub TT implementationMathias Nyman
commit ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") schedules work to clear TT buffer, but causes a use-after-free regression at the same time Make sure hub_tt_work finishes before endpoint is disabled, otherwise the work will dereference already freed endpoint and device related pointers. This was triggered when usb core failed to read the configuration descriptor of a FS/LS device during enumeration. xhci driver queued clear_tt_work while usb core freed and reallocated a new device for the next enumeration attempt. EHCI driver implents ehci_endpoint_disable() that makes sure clear_tt_work has finished before it returns, but xhci lacks this support. usb core will call hcd->driver->endpoint_disable() callback before disabling endpoints, so we want this in xhci as well. The added xhci_endpoint_disable() is based on ehci_endpoint_disable() Fixes: ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") Cc: <stable@vger.kernel.org> # v5.3 Reported-by: Johan Hovold <johan@kernel.org> Suggested-by: Johan Hovold <johan@kernel.org> Reviewed-by: Johan Hovold <johan@kernel.org> Tested-by: Johan Hovold <johan@kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/1572013829-14044-2-git-send-email-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28USB: ldusb: fix control-message timeoutJohan Hovold
USB control-message timeouts are specified in milliseconds, not jiffies. Waiting 83 minutes for a transfer to complete is a bit excessive. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Cc: stable <stable@vger.kernel.org> # 2.6.13 Reported-by: syzbot+a4fbb3bb76cda0ea4e58@syzkaller.appspotmail.com Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022153127.22295-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28USB: ldusb: use unsigned size format specifiersJohan Hovold
A recent info-leak bug manifested itself along with warning about a negative buffer overflow: ldusb 1-1:0.28: Read buffer overflow, -131383859965943 bytes dropped when it was really a rather large positive one. A sanity check that prevents this has now been put in place, but let's fix up the size format specifiers, which should all be unsigned. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022143203.5260-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28USB: ldusb: fix ring-buffer lockingJohan Hovold
The custom ring-buffer implementation was merged without any locking or explicit memory barriers, but a spinlock was later added by commit 9d33efd9a791 ("USB: ldusb bugfix"). The lock did not cover the update of the tail index once the entry had been processed, something which could lead to memory corruption on weakly ordered architectures or due to compiler optimisations. Specifically, a completion handler running on another CPU might observe the incremented tail index and update the entry before ld_usb_read() is done with it. Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Fixes: 9d33efd9a791 ("USB: ldusb bugfix") Cc: stable <stable@vger.kernel.org> # 2.6.13 Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20191022143203.5260-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28USB: Skip endpoints with 0 maxpacket lengthAlan Stern
Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that an HCD will crash or hang when trying to handle an URB for such an endpoint. Currently the USB core does not check for endpoints having a maxpacket value of 0. This patch adds a check, printing a warning and skipping over any endpoints it catches. Now, the USB spec does not rule out endpoints having maxpacket = 0. But since they wouldn't have any practical use, there doesn't seem to be any good reason for us to accept them. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281050420.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28Merge tag 'fixes-for-v5.4-rc5' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: USB: fixes for v5.4-rc5 Not much here, only 14 commits in different drivers. As for the specifics, Roger Quadros fixed an important bug in cdns3 where the driver was making decisions about data pull-up management behind the UDC framework's back. The Atmel UDC got a fix for interrupt storm in FIFO mode, this was done by Cristian Brisan. Apart from these, we have the usual set of non-critical fixes. Signed-off-by: Felipe Balbi <balbi@kernel.org> * tag 'fixes-for-v5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: usb: cdns3: gadget: Don't manage pullups usb: dwc3: remove the call trace of USBx_GFLADJ usb: gadget: configfs: fix concurrent issue between composite APIs usb: dwc3: pci: prevent memory leak in dwc3_pci_probe usb: gadget: composite: Fix possible double free memory bug usb: gadget: udc: atmel: Fix interrupt storm in FIFO mode. usb: renesas_usbhs: fix type of buf usb: renesas_usbhs: Fix warnings in usbhsg_recip_handler_std_set_device() usb: gadget: udc: renesas_usb3: Fix __le16 warnings usb: renesas_usbhs: fix __le16 warnings usb: cdns3: include host-export,h for cdns3_host_init usb: mtu3: fix missing include of mtu3_dr.h usb: fsl: Check memory resource before releasing it usb: dwc3: select CONFIG_REGMAP_MMIO
2019-10-28io_uring: don't touch ctx in setup after ring fd installJens Axboe
syzkaller reported an issue where it looks like a malicious app can trigger a use-after-free of reading the ctx ->sq_array and ->rings value right after having installed the ring fd in the process file table. Defer ring fd installation until after we're done reading those values. Fixes: 75b28affdd6a ("io_uring: allocate the two rings together") Reported-by: syzbot+6f03d895a6cd0d06187f@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - HID++ device support regression fixes (race condition during cleanup, device detection fix, opps fix) from Andrey Smirnov - disable PM on i2c-hid, as it's causing problems with a lot of devices; other OSes apparently don't implement/enable it either; from Kai-Heng Feng - error handling fix in intel-ish driver, from Zhang Lixu - syzbot fuzzer fix for HID core code from Alan Stern - a few other tiny fixups (printk message cleanup, new device ID) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: i2c-hid: add Trekstor Primebook C11B to descriptor override HID: logitech-hidpp: do all FF cleanup in hidpp_ff_destroy() HID: logitech-hidpp: rework device validation HID: logitech-hidpp: split g920_get_config() HID: i2c-hid: Remove runtime power management HID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring() HID: google: add magnemite/masterball USB ids HID: Fix assumption that devices have inputs HID: prodikeys: make array keys static const, makes object smaller HID: fix error message in hid_open_report()
2019-10-28Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Some minor fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vringh: fix copy direction of vringh_iov_push_kern() vsock/virtio: remove unused 'work' field from 'struct virtio_vsock_pkt' virtio_ring: fix stalls for packed rings
2019-10-28Revert "ALSA: hda: Flush interrupts on disabling"Takashi Iwai
This reverts commit caa8422d01e983782548648e125fd617cadcec3f. It turned out that this commit caused a regression at shutdown / reboot, as the synchronize_irq() calls seems blocking the whole shutdown. Also another part of the change about shuffling the call order looks suspicious; the azx_stop_chip() call disables the CORB / RIRB while the others may still need the CORB/RIRB update. Since the original commit itself was a cargo-fix, let's revert the whole patch. Fixes: caa8422d01e9 ("ALSA: hda: Flush interrupts on disabling") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205333 BugLinK: https://bugs.freedesktop.org/show_bug.cgi?id=111174 Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://lore.kernel.org/r/20191028081056.22010-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-10-28perf/headers: Fix spelling s/EACCESS/EACCES/, s/privilidge/privilege/Geert Uytterhoeven
As per POSIX, the correct spelling of the error code is EACCES: include/uapi/asm-generic/errno-base.h:#define EACCES 13 /* Permission denied */ Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/20191024122904.12463-1-geert+renesas@glider.be Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28perf/x86/uncore: Fix event group supportKan Liang
The events in the same group don't start or stop simultaneously. Here is the ftrace when enabling event group for uncore_iio_0: # perf stat -e "{uncore_iio_0/event=0x1/,uncore_iio_0/event=0xe/}" <idle>-0 [000] d.h. 8959.064832: read_msr: a41, value b2b0b030 //Read counter reg of IIO unit0 counter0 <idle>-0 [000] d.h. 8959.064835: write_msr: a48, value 400001 //Write Ctrl reg of IIO unit0 counter0 to enable counter0. <------ Although counter0 is enabled, Unit Ctrl is still freezed. Nothing will count. We are still good here. <idle>-0 [000] d.h. 8959.064836: read_msr: a40, value 30100 //Read Unit Ctrl reg of IIO unit0 <idle>-0 [000] d.h. 8959.064838: write_msr: a40, value 30000 //Write Unit Ctrl reg of IIO unit0 to enable all counters in the unit by clear Freeze bit <------Unit0 is un-freezed. Counter0 has been enabled. Now it starts counting. But counter1 has not been enabled yet. The issue starts here. <idle>-0 [000] d.h. 8959.064846: read_msr: a42, value 0 //Read counter reg of IIO unit0 counter1 <idle>-0 [000] d.h. 8959.064847: write_msr: a49, value 40000e //Write Ctrl reg of IIO unit0 counter1 to enable counter1. <------ Now, counter1 just starts to count. Counter0 has been running for a while. Current code un-freezes the Unit Ctrl right after the first counter is enabled. The subsequent group events always loses some counter values. Implement pmu_enable and pmu_disable support for uncore, which can help to batch hardware accesses. No one uses uncore_enable_box and uncore_disable_box. Remove them. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-drivers-review@eclists.intel.com Cc: linux-perf@eclists.intel.com Fixes: 087bfbb03269 ("perf/x86: Add generic Intel uncore PMU support") Link: https://lkml.kernel.org/r/1572014593-31591-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h)Kim Phillips
This saves us writing the IBS control MSR twice when disabling the event. I searched revision guides for all families since 10h, and did not find occurrence of erratum #420, nor anything remotely similar: so we isolate the secondary MSR write to family 10h only. Also unconditionally update the count mask for IBS Op implementations that have read & writeable current count (CurCnt) fields in addition to the MaxCnt field. These bits were reserved on prior implementations, and therefore shouldn't have negative impact. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: c9574fe0bdb9 ("perf/x86-ibs: Implement workaround for IBS erratum #420") Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise ↵Kim Phillips
RIP validity The loop that reads all the IBS MSRs into *buf stopped one MSR short of reading the IbsOpData register, which contains the RipInvalid status bit. Fix the offset_max assignment so the MSR gets read, so the RIP invalid evaluation is based on what the IBS h/w output, instead of what was left in memory. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: d47e8238cd76 ("perf/x86-ibs: Take instruction pointer from ibs sample") Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28perf/core: Start rejecting the syscall with attr.__reserved_2 setAlexander Shishkin
Commit: 1a5941312414c ("perf: Add wakeup watermark control to the AUX area") added attr.__reserved_2 padding, but forgot to add an ABI check to reject attributes with this field set. Fix that. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: mathieu.poirier@linaro.org Link: https://lkml.kernel.org/r/20191025121636.75182-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28vringh: fix copy direction of vringh_iov_push_kern()Jason Wang
We want to copy from iov to buf, so the direction was wrong. Note: no real user for the helper, but it will be used by future features. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>