linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2019-10-07	tpm: use tpm_try_get_ops() in tpm-sysfs.c.	Jarkko Sakkinen
	commit 2677ca98ae377517930c183248221f69f771c921 upstream Use tpm_try_get_ops() in tpm-sysfs.c so that we can consider moving other decorations (locking, localities, power management for example) inside it. This direction can be of course taken only after other call sites for tpm_transmit() have been treated in the same way. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	drm/amd/display: Restore backlight brightness after system resume	Kai-Heng Feng
	commit bb264220d9316f6bd7c1fd84b8da398c93912931 upstream. Laptops with AMD APU doesn't restore display backlight brightness after system resume. This issue started when DC was introduced. Let's use BL_CORE_SUSPENDRESUME so the backlight core calls update_status callback after system resume to restore the backlight level. Tested on Dell Inspiron 3180 (Stoney Ridge) and Dell Latitude 5495 (Raven Ridge). Cc: <stable@vger.kernel.org> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	md/raid0: avoid RAID0 data corruption due to layout confusion.	NeilBrown
	[ Upstream commit c84a1372df929033cb1a0441fb57bd3932f39ac9 ] If the drives in a RAID0 are not all the same size, the array is divided into zones. The first zone covers all drives, to the size of the smallest. The second zone covers all drives larger than the smallest, up to the size of the second smallest - etc. A change in Linux 3.14 unintentionally changed the layout for the second and subsequent zones. All the correct data is still stored, but each chunk may be assigned to a different device than in pre-3.14 kernels. This can lead to data corruption. It is not possible to determine what layout to use - it depends which kernel the data was written by. So we add a module parameter to allow the old (0) or new (1) layout to be specified, and refused to assemble an affected array if that parameter is not set. Fixes: 20d0189b1012 ("block: Introduce new bio_split()") cc: stable@vger.kernel.org (3.14+) Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	i2c: riic: Clear NACK in tend isr	Chris Brandt
	commit a71e2ac1f32097fbb2beab098687a7a95c84543e upstream. The NACKF flag should be cleared in INTRIICNAKI interrupt processing as description in HW manual. This issue shows up quickly when PREEMPT_RT is applied and a device is probed that is not plugged in (like a touchscreen controller). The result is endless interrupts that halt system boot. Fixes: 310c18a41450 ("i2c: riic: add driver") Cc: stable@vger.kernel.org Reported-by: Chien Nguyen <chien.nguyen.eb@rvc.renesas.com> Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	hwrng: core - don't wait on add_early_randomness()	Laurent Vivier
	commit 78887832e76541f77169a24ac238fccb51059b63 upstream. add_early_randomness() is called by hwrng_register() when the hardware is added. If this hardware and its module are present at boot, and if there is no data available the boot hangs until data are available and can't be interrupted. For instance, in the case of virtio-rng, in some cases the host can be not able to provide enough entropy for all the guests. We can have two easy ways to reproduce the problem but they rely on misconfiguration of the hypervisor or the egd daemon: - if virtio-rng device is configured to connect to the egd daemon of the host but when the virtio-rng driver asks for data the daemon is not connected, - if virtio-rng device is configured to connect to the egd daemon of the host but the egd daemon doesn't provide data. The guest kernel will hang at boot until the virtio-rng driver provides enough data. To avoid that, call rng_get_data() in non-blocking mode (wait=0) from add_early_randomness(). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Fixes: d9e797261933 ("hwrng: add randomness to system from rng...") Cc: <stable@vger.kernel.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	/dev/mem: Bail out upon SIGKILL.	Tetsuo Handa
	commit 8619e5bdeee8b2c685d686281f2d2a6017c4bc15 upstream. syzbot found that a thread can stall for minutes inside read_mem() or write_mem() after that thread was killed by SIGKILL [1]. Reading from iomem areas of /dev/mem can be slow, depending on the hardware. While reading 2GB at one read() is legal, delaying termination of killed thread for minutes is bad. Thus, allow reading/writing /dev/mem and /dev/kmem to be preemptible and killable. [ 1335.912419][T20577] read_mem: sz=4096 count=2134565632 [ 1335.943194][T20577] read_mem: sz=4096 count=2134561536 [ 1335.978280][T20577] read_mem: sz=4096 count=2134557440 [ 1336.011147][T20577] read_mem: sz=4096 count=2134553344 [ 1336.041897][T20577] read_mem: sz=4096 count=2134549248 Theoretically, reading/writing /dev/mem and /dev/kmem can become "interruptible". But this patch chose "killable". Future patch will make them "interruptible" so that we can revert to "killable" if some program regressed. [1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@vger.kernel.org> Reported-by: syzbot <syzbot+8ab2d0f39fb79fe6ca40@syzkaller.appspotmail.com> Link: https://lore.kernel.org/r/1566825205-10703-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	md: only call set_in_sync() when it is expected to succeed.	NeilBrown
	commit 480523feae581ab714ba6610388a3b4619a2f695 upstream. Since commit 4ad23a976413 ("MD: use per-cpu counter for writes_pending"), set_in_sync() is substantially more expensive: it can wait for a full RCU grace period which can be 10s of milliseconds. So we should only call it when the cost is justified. md_check_recovery() currently calls set_in_sync() every time it finds anything to do (on non-external active arrays). For an array performing resync or recovery, this will be quite often. Each call will introduce a delay to the md thread, which can noticeable affect IO submission latency. In md_check_recovery() we only need to call set_in_sync() if 'safemode' was non-zero at entry, meaning that there has been not recent IO. So we save this "safemode was nonzero" state, and only call set_in_sync() if it was non-zero. This measurably reduces mean and maximum IO submission latency during resync/recovery. Reported-and-tested-by: Jack Wang <jinpu.wang@cloud.ionos.com> Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending") Cc: stable@vger.kernel.org (v4.12+) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	md: don't report active array_state until after revalidate_disk() completes.	NeilBrown
	commit 9d4b45d6af442237560d0bb5502a012baa5234b7 upstream. Until revalidate_disk() has completed, the size of a new md array will appear to be zero. So we shouldn't report, through array_state, that the array is active until that time. udev rules check array_state to see if the array is ready. As soon as it appear to be zero, fsck can be run. If it find the size to be zero, it will fail. So add a new flag to provide an interlock between do_md_run() and array_state_show(). This flag is set while do_md_run() is active and it prevents array_state_show() from reporting that the array is active. Before do_md_run() is called, ->pers will be NULL so array is definitely not active. After do_md_run() is called, revalidate_disk() will have run and the array will be completely ready. We also move various sysfs_notify*() calls out of md_run() into do_md_run() after MD_NOT_READY is cleared. This ensure the information is ready before the notification is sent. Prior to v4.12, array_state_show() was called with the mddev->reconfig_mutex held, which provided exclusion with do_md_run(). Note that MD_NOT_READY cleared twice. This is deliberate to cover both success and error paths with minimal noise. Fixes: b7b17c9b67e5 ("md: remove mddev_lock() from md_attr_show()") Cc: stable@vger.kernel.org (v4.12++) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	md/raid6: Set R5_ReadError when there is read failure on parity disk	Xiao Ni
	commit 143f6e733b73051cd22dcb80951c6c929da413ce upstream. 7471fb77ce4d ("md/raid6: Fix anomily when recovering a single device in RAID6.") avoids rereading P when it can be computed from other members. However, this misses the chance to re-write the right data to P. This patch sets R5_ReadError if the re-read fails. Also, when re-read is skipped, we also missed the chance to reset rdev->read_errors to 0. It can fail the disk when there are many read errors on P member disk (other disks don't have read error) V2: upper layer read request don't read parity/Q data. So there is no need to consider such situation. This is Reported-by: kbuild test robot <lkp@intel.com> Fixes: 7471fb77ce4d ("md/raid6: Fix anomily when recovering a single device in RAID6.") Cc: <stable@vger.kernel.org> #4.4+ Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask	Stefan Assmann
	commit a7542b87607560d0b89e7ff81d870bd6ff8835cb upstream. While testing VF spawn/destroy the following panic occurred. BUG: unable to handle kernel NULL pointer dereference at 0000000000000029 [...] Workqueue: i40e i40e_service_task [i40e] RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e] [...] Call Trace: ? __switch_to_asm+0x35/0x70 ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x35/0x70 ? _cond_resched+0x15/0x30 i40e_sync_filters_subtask+0x56/0x70 [i40e] i40e_service_task+0x382/0x11b0 [i40e] ? __switch_to_asm+0x41/0x70 ? __switch_to_asm+0x41/0x70 process_one_work+0x1a7/0x3b0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_bind+0x30/0x30 ret_from_fork+0x35/0x40 Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get accessed by the watchdog via i40e_sync_filters_subtask() although i40e_free_vfs() already free'd pf->vf. To avoid this the call to i40e_sync_vsi_filters() in i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE, which is also used by i40e_free_vfs(). Note: put the __I40E_VF_DISABLE check after the __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to trigger. CC: stable@vger.kernel.org Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	efifb: BGRT: Improve efifb_bgrt_sanity_check	Hans de Goede
	commit 51677dfcc17f88ed754143df670ff064eae67f84 upstream. For various reasons, at least with x86 EFI firmwares, the xoffset and yoffset in the BGRT info are not always reliable. Extensive testing has shown that when the info is correct, the BGRT image is always exactly centered horizontally (the yoffset variable is more variable and not always predictable). This commit simplifies / improves the bgrt_sanity_check to simply check that the BGRT image is exactly centered horizontally and skips (re)drawing it when it is not. This fixes the BGRT image sometimes being drawn in the wrong place. Cc: stable@vger.kernel.org Fixes: 88fe4ceb2447 ("efifb: BGRT: Do not copy the boot graphics for non native resolutions") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Cc: Peter Jones <pjones@redhat.com>, Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190721131918.10115-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	regulator: Defer init completion for a while after late_initcall	Mark Brown
	commit 55576cf1853798e86f620766e23b604c9224c19c upstream. The kernel has no way of knowing when we have finished instantiating drivers, between deferred probe and systems that build key drivers as modules we might be doing this long after userspace has booted. This has always been a bit of an issue with regulator_init_complete since it can power off hardware that's not had it's driver loaded which can result in user visible effects, the main case is powering off displays. Practically speaking it's not been an issue in real systems since most systems that use the regulator API are embedded and build in key drivers anyway but with Arm laptops coming on the market it's becoming more of an issue so let's do something about it. In the absence of any better idea just defer the powering off for 30s after late_initcall(), this is obviously a hack but it should mask the issue for now and it's no more arbitrary than late_initcall() itself. Ideally we'd have some heuristics to detect if we're on an affected system and tune or skip the delay appropriately, and there may be some need for a command line option to be added. Link: https://lore.kernel.org/r/20190904124250.25844-1-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Tested-by: Lee Jones <lee.jones@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	media: don't drop front-end reference count for ->detach	Arnd Bergmann
	commit 14e3cdbb00a885eedc95c0cf8eda8fe28d26d6b4 upstream. A bugfix introduce a link failure in configurations without CONFIG_MODULES: In file included from drivers/media/usb/dvb-usb/pctv452e.c:20:0: drivers/media/usb/dvb-usb/pctv452e.c: In function 'pctv452e_frontend_attach': drivers/media/dvb-frontends/stb0899_drv.h:151:36: error: weak declaration of 'stb0899_attach' being applied to a already existing, static definition The problem is that the !IS_REACHABLE() declaration of stb0899_attach() is a 'static inline' definition that clashes with the weak definition. I further observed that the bugfix was only done for one of the five users of stb0899_attach(), the other four still have the problem. This reverts the bugfix and instead addresses the problem by not dropping the reference count when calling '->detach()', instead we call this function directly in dvb_frontend_put() before dropping the kref on the front-end. I first submitted this in early 2018, and after some discussion it was apparently discarded. While there is a long-term plan in place, that plan is obviously not nearing completion yet, and the current kernel is still broken unless this patch is applied. Link: https://patchwork.kernel.org/patch/10140175/ Link: https://patchwork.linuxtv.org/patch/54831/ Cc: Max Kellermann <max.kellermann@gmail.com> Cc: Wolfgang Rohdewald <wolfgang@rohdewald.de> Cc: stable@vger.kernel.org Fixes: f686c14364ad ("[media] stb0899: move code to "detach" callback") Fixes: 6cdeaed3b142 ("media: dvb_usb_pctv452e: module refcount changes were unbalanced") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table	Hans de Goede
	commit 7e0bb5828311f811309bed5749528ca04992af2f upstream. Like a bunch of other MSI laptops the MS-1039 uses a 0c45:627b SN9C201 + OV7660 webcam which is mounted upside down. Add it to the sn9c20x flip_dmi_table to deal with this. Cc: stable@vger.kernel.org Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	parisc: Disable HP HSC-PCI Cards to prevent kernel crash	Helge Deller
	commit 5fa1659105fac63e0f3c199b476025c2e04111ce upstream. The HP Dino PCI controller chip can be used in two variants: as on-board controller (e.g. in B160L), or on an Add-On card ("Card-Mode") to bridge PCI components to systems without a PCI bus, e.g. to a HSC/GSC bus. One such Add-On card is the HP HSC-PCI Card which has one or more DEC Tulip PCI NIC chips connected to the on-card Dino PCI controller. Dino in Card-Mode has a big disadvantage: All PCI memory accesses need to go through the DINO_MEM_DATA register, so Linux drivers will not be able to use the ioremap() function. Without ioremap() many drivers will not work, one example is the tulip driver which then simply crashes the kernel if it tries to access the ports on the HP HSC card. This patch disables the HP HSC card if it finds one, and as such fixes the kernel crash on a HP D350/2 machine. Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: Phil Scarr <phil.scarr@pm.me> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	scsi: implement .cleanup_rq callback	Ming Lei
	[ Upstream commit b7e9e1fb7a9227be34ad4a5e778022c3164494cf ] Implement .cleanup_rq() callback for freeing driver private part of the request. Then we can avoid to leak this part if the request isn't completed by SCSI, and freed by blk-mq or upper layer(such as dm-rq) finally. Cc: Ewan D. Milne <emilne@redhat.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: <stable@vger.kernel.org> Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	blk-mq: add callback of .cleanup_rq	Ming Lei
	[ Upstream commit 226b4fc75c78f9c497c5182d939101b260cfb9f3 ] SCSI maintains its own driver private data hooked off of each SCSI request, and the pridate data won't be freed after scsi_queue_rq() returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver (e.g. dm-rq) may need to retry these SCSI requests, before SCSI has fully dispatched them, due to a lower level SCSI driver's resource limitation identified in scsi_queue_rq(). Currently SCSI's per-request private data is leaked when the upper layer driver (dm-rq) frees and then retries these requests in response to BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE returns from scsi_queue_rq(). This usecase is so specialized that it doesn't warrant training an existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to account for freeing its driver private data -- doing so would add an extra branch for handling a special case that all other consumers of SCSI (and blk-mq) won't ever need to worry about. So the most pragmatic way forward is to delegate freeing SCSI driver private data to the upper layer driver (dm-rq). Do so by adding new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method from dm-rq. A following commit will implement the .cleanup_rq() hook in scsi_mq_ops. Cc: Ewan D. Milne <emilne@redhat.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: <stable@vger.kernel.org> Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	IB/hfi1: Define variables as unsigned long to fix KASAN warning	Ira Weiny
	commit f8659d68e2bee5b86a1beaf7be42d942e1fc81f4 upstream. Define the working variables to be unsigned long to be compatible with for_each_set_bit and change types as needed. While we are at it remove unused variables from a couple of functions. This was found because of the following KASAN warning: ================================================================== BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70 Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889 CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W 5.3.0-rc2-mm1+ #2 Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x9a/0xf0 ? find_first_bit+0x19/0x70 print_address_description+0x6c/0x332 ? find_first_bit+0x19/0x70 ? find_first_bit+0x19/0x70 __kasan_report.cold.6+0x1a/0x3b ? find_first_bit+0x19/0x70 kasan_report+0xe/0x12 find_first_bit+0x19/0x70 pma_get_opa_portstatus+0x5cc/0xa80 [hfi1] ? ret_from_fork+0x3a/0x50 ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1] ? stack_trace_consume_entry+0x80/0x80 hfi1_process_mad+0x39b/0x26c0 [hfi1] ? __lock_acquire+0x65e/0x21b0 ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? check_chain_key+0x1d7/0x2e0 ? lock_downgrade+0x3a0/0x3a0 ? match_held_lock+0x2e/0x250 ib_mad_recv_done+0x698/0x15e0 [ib_core] ? clear_linkup_counters+0xb0/0xb0 [hfi1] ? ib_mad_send_done+0xc80/0xc80 [ib_core] ? mark_held_locks+0x79/0xa0 ? _raw_spin_unlock_irqrestore+0x44/0x60 ? rvt_poll_cq+0x1e1/0x340 [rdmavt] __ib_process_cq+0x97/0x100 [ib_core] ib_cq_poll_work+0x31/0xb0 [ib_core] process_one_work+0x4ee/0xa00 ? pwq_dec_nr_in_flight+0x110/0x110 ? do_raw_spin_lock+0x113/0x1d0 worker_thread+0x57/0x5a0 ? process_one_work+0xa00/0xa00 kthread+0x1bb/0x1e0 ? kthread_create_on_node+0xc0/0xc0 ret_from_fork+0x3a/0x50 The buggy address belongs to the page: page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x17ffffc0000000() raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame: pma_get_opa_portstatus+0x0/0xa80 [hfi1] this frame has 1 object: [32, 36) 'vl_select_mask' Memory state around the buggy address: ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00 ^ ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 ================================================================== Cc: <stable@vger.kernel.org> Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	IB/mlx5: Free mpi in mp_slave mode	Danit Goldberg
	commit 5d44adebbb7e785939df3db36ac360f5e8b73e44 upstream. ib_add_slave_port() allocates a multiport struct but never frees it. Don't leak memory, free the allocated mpi struct during driver unload. Cc: <stable@vger.kernel.org> Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.org Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag	Quinn Tran
	commit 8b5292bcfcacf15182a77a973a98d310e76fd58b upstream. Relogin fails to move forward due to scan_state flag indicating device is not there. Before relogin process, Session delete process accidently modified the scan_state flag. [mkp: typos plus corrected Fixes: sha as reported by sfr] Fixes: 2dee5521028c ("scsi: qla2xxx: Fix login state machine freeze") Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	scsi: scsi_dh_rdac: zero cdb in send_mode_select()	Martin Wilck
	commit 57adf5d4cfd3198aa480e7c94a101fc8c4e6109d upstream. cdb in send_mode_select() is not zeroed and is only partially filled in rdac_failover_get(), which leads to some random data getting to the device. Users have reported storage responding to such commands with INVALID FIELD IN CDB. Code before commit 327825574132 was not affected, as it called blk_rq_set_block_pc(). Fix this by zeroing out the cdb first. Identified & fix proposed by HPE. Fixes: 327825574132 ("scsi_dh_rdac: switch to scsi_execute_req_flags()") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20190904155205.1666-1-martin.wilck@suse.com Signed-off-by: Martin Wilck <mwilck@suse.com> Acked-by: Ales Novak <alnovak@suse.cz> Reviewed-by: Shane Seymour <shane.seymour@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36	Luca Coelho
	commit fddbfeece9c7882cc47754c7da460fe427e3e85b upstream. The intention was to have the GEO_TX_POWER_LIMIT command in FW version 36 as well, but not all 8000 family got this feature enabled. The 8000 family is the only one using version 36, so skip this version entirely. If we try to send this command to the firmwares that do not support it, we get a BAD_COMMAND response from the firmware. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=204151. Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-05	PM / devfreq: passive: fix compiler warning	MyungJoo Ham
	[ Upstream commit 0465814831a926ce2f83e8f606d067d86745234e ] The recent commit of PM / devfreq: passive: Use non-devm notifiers had incurred compiler warning, "unused variable 'dev'". Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	media: omap3isp: Set device on omap3isp subdevs	Sakari Ailus
	[ Upstream commit e9eb103f027725053a4b02f93d7f2858b56747ce ] The omap3isp driver registered subdevs without the dev field being set. Do that now. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	iommu/amd: Override wrong IVRS IOAPIC on Raven Ridge systems	Kai-Heng Feng
	[ Upstream commit 93d051550ee02eaff9a2541d825605a7bd778027 ] Raven Ridge systems may have malfunction touchpad or hang at boot if incorrect IVRS IOAPIC is provided by BIOS. Users already found correct "ivrs_ioapic=" values, let's put them inside kernel to workaround buggy BIOS. BugLink: https://bugs.launchpad.net/bugs/1795292 BugLink: https://bugs.launchpad.net/bugs/1837688 Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	media: ttusb-dec: Fix info-leak in ttusb_dec_send_command()	Tomas Bortoli
	[ Upstream commit a10feaf8c464c3f9cfdd3a8a7ce17e1c0d498da1 ] The function at issue does not always initialize each byte allocated for 'b' and can therefore leak uninitialized memory to a USB device in the call to usb_bulk_msg() Use kzalloc() instead of kmalloc() Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> Reported-by: syzbot+0522702e9d67142379f1@syzkaller.appspotmail.com Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2)	Ahzo
	[ Upstream commit f659bb6dae58c113805f92822e4c16ddd3156b79 ] This fixes screen corruption/flickering on 75 Hz displays. v2: make print statement debug only (Alex) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102646 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Ahzo <Ahzo@tutanota.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	e1000e: add workaround for possible stalled packet	Kai-Heng Feng
	[ Upstream commit e5e9a2ecfe780975820e157b922edee715710b66 ] This works around a possible stalled packet issue, which may occur due to clock recovery from the PCH being too slow, when the LAN is transitioning from K1 at 1G link speed. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	libertas: Add missing sentinel at end of if_usb.c fw_table	Kevin Easton
	[ Upstream commit 764f3f1ecffc434096e0a2b02f1a6cc964a89df6 ] This sentinel tells the firmware loading process when to stop. Reported-and-tested-by: syzbot+98156c174c5a2cad9f8f@syzkaller.appspotmail.com Signed-off-by: Kevin Easton <kevin@guarana.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	raid5: don't increment read_errors on EILSEQ return	Nigel Croxon
	[ Upstream commit b76b4715eba0d0ed574f58918b29c1b2f0fa37a8 ] While MD continues to count read errors returned by the lower layer. If those errors are -EILSEQ, instead of -EIO, it should NOT increase the read_errors count. When RAID6 is set up on dm-integrity target that detects massive corruption, the leg will be ejected from the array. Even if the issue is correctable with a sector re-write and the array has necessary redundancy to correct it. The leg is ejected because it runs up the rdev->read_errors beyond conf->max_nr_stripes. The return status in dm-drypt when there is a data integrity error is -EILSEQ (BLK_STS_PROTECTION). Signed-off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	mmc: dw_mmc: Re-store SDIO IRQs mask at system resume	Ulf Hansson
	[ Upstream commit 7c526608d5afb62cbc967225e2ccaacfdd142e9d ] In cases when SDIO IRQs have been enabled, runtime suspend is prevented by the driver. However, this still means dw_mci_runtime_suspend\|resume() gets called during system suspend/resume, via pm_runtime_force_suspend\|resume(). This means during system suspend/resume, the register context of the dw_mmc device most likely loses its register context, even in cases when SDIO IRQs have been enabled. To re-enable the SDIO IRQs during system resume, the dw_mmc driver currently relies on the mmc core to re-enable the SDIO IRQs when it resumes the SDIO card, but this isn't the recommended solution. Instead, it's better to deal with this locally in the dw_mmc driver, so let's do that. Tested-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	mmc: sdhci: Fix incorrect switch to HS mode	Al Cooper
	[ Upstream commit c894e33ddc1910e14d6f2a2016f60ab613fd8b37 ] When switching from any MMC speed mode that requires 1.8v (HS200, HS400 and HS400ES) to High Speed (HS) mode, the system ends up configured for SDR12 with a 50MHz clock which is an illegal mode. This happens because the SDHCI_CTRL_VDD_180 bit in the SDHCI_HOST_CONTROL2 register is left set and when this bit is set, the speed mode is controlled by the SDHCI_CTRL_UHS field in the SDHCI_HOST_CONTROL2 register. The SDHCI_CTRL_UHS field will end up being set to 0 (SDR12) by sdhci_set_uhs_signaling() because there is no UHS mode being set. The fix is to change sdhci_set_uhs_signaling() to set the SDHCI_CTRL_UHS field to SDR25 (which is the same as HS) for any switch to HS mode. This was found on a new eMMC controller that does strict checking of the speed mode and the corresponding clock rate. It caused the switch to HS400 mode to fail because part of the sequence to switch to HS400 requires a switch from HS200 to HS before going to HS400. Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Al Cooper <alcooperx@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	mmc: core: Clarify sdio_irq_pending flag for MMC_CAP2_SDIO_IRQ_NOTHREAD	Ulf Hansson
	[ Upstream commit 36d57efb4af534dd6b442ea0b9a04aa6dfa37abe ] The sdio_irq_pending flag is used to let host drivers indicate that it has signaled an IRQ. If that is the case and we only have a single SDIO func that have claimed an SDIO IRQ, our assumption is that we can avoid reading the SDIO_CCCR_INTx register and just call the SDIO func irq handler immediately. This makes sense, but the flag is set/cleared in a somewhat messy order, let's fix that up according to below. First, the flag is currently set in sdio_run_irqs(), which is executed as a work that was scheduled from sdio_signal_irq(). To make it more implicit that the host have signaled an IRQ, let's instead immediately set the flag in sdio_signal_irq(). This also makes the behavior consistent with host drivers that uses the legacy, mmc_signal_sdio_irq() API. This have no functional impact, because we don't expect host drivers to call sdio_signal_irq() until after the work (sdio_run_irqs()) have been executed anyways. Second, currently we never clears the flag when using the sdio_run_irqs() work, but only when using the sdio_irq_thread(). Let make the behavior consistent, by moving the flag to be cleared inside the common process_sdio_pending_irqs() function. Additionally, tweak the behavior of the flag slightly, by avoiding to clear it unless we processed the SDIO IRQ. The purpose with this at this point, is to keep the information about whether there have been an SDIO IRQ signaled by the host, so at system resume we can decide to process it without reading the SDIO_CCCR_INTx register. Tested-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	raid5: don't set STRIPE_HANDLE to stripe which is in batch list	Guoqing Jiang
	[ Upstream commit 6ce220dd2f8ea71d6afc29b9a7524c12e39f374a ] If stripe in batch list is set with STRIPE_HANDLE flag, then the stripe could be set with STRIPE_ACTIVE by the handle_stripe function. And if error happens to the batch_head at the same time, break_stripe_batch_list is called, then below warning could happen (the same report in [1]), it means a member of batch list was set with STRIPE_ACTIVE. [7028915.431770] stripe state: 2001 [7028915.431815] ------------[ cut here ]------------ [7028915.431828] WARNING: CPU: 18 PID: 29089 at drivers/md/raid5.c:4614 break_stripe_batch_list+0x203/0x240 [raid456] [...] [7028915.431879] CPU: 18 PID: 29089 Comm: kworker/u82:5 Tainted: G O 4.14.86-1-storage #4.14.86-1.2~deb9 [7028915.431881] Hardware name: Supermicro SSG-2028R-ACR24L/X10DRH-iT, BIOS 3.1 06/18/2018 [7028915.431888] Workqueue: raid5wq raid5_do_work [raid456] [7028915.431890] task: ffff9ab0ef36d7c0 task.stack: ffffb72926f84000 [7028915.431896] RIP: 0010:break_stripe_batch_list+0x203/0x240 [raid456] [7028915.431898] RSP: 0018:ffffb72926f87ba8 EFLAGS: 00010286 [7028915.431900] RAX: 0000000000000012 RBX: ffff9aaa84a98000 RCX: 0000000000000000 [7028915.431901] RDX: 0000000000000000 RSI: ffff9ab2bfa15458 RDI: ffff9ab2bfa15458 [7028915.431902] RBP: ffff9aaa8fb4e900 R08: 0000000000000001 R09: 0000000000002eb4 [7028915.431903] R10: 00000000ffffffff R11: 0000000000000000 R12: ffff9ab1736f1b00 [7028915.431904] R13: 0000000000000000 R14: ffff9aaa8fb4e900 R15: 0000000000000001 [7028915.431906] FS: 0000000000000000(0000) GS:ffff9ab2bfa00000(0000) knlGS:0000000000000000 [7028915.431907] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [7028915.431908] CR2: 00007ff953b9f5d8 CR3: 0000000bf4009002 CR4: 00000000003606e0 [7028915.431909] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [7028915.431910] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [7028915.431910] Call Trace: [7028915.431923] handle_stripe+0x8e7/0x2020 [raid456] [7028915.431930] ? __wake_up_common_lock+0x89/0xc0 [7028915.431935] handle_active_stripes.isra.58+0x35f/0x560 [raid456] [7028915.431939] raid5_do_work+0xc6/0x1f0 [raid456] Also commit 59fc630b8b5f9f ("RAID5: batch adjacent full stripe write") said "If a stripe is added to batch list, then only the first stripe of the list should be put to handle_list and run handle_stripe." So don't set STRIPE_HANDLE to stripe which is already in batch list, otherwise the stripe could be put to handle_list and run handle_stripe, then the above warning could be triggered. [1]. https://www.spinics.net/lists/raid/msg62552.html Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	platform/x86: intel_pmc_core: Do not ioremap RAM	M. Vefa Bicakci
	[ Upstream commit 7d505758b1e556cdf65a5e451744fe0ae8063d17 ] On a Xen-based PVH virtual machine with more than 4 GiB of RAM, intel_pmc_core fails initialization with the following warning message from the kernel, indicating that the driver is attempting to ioremap RAM: ioremap on RAM at 0x00000000fe000000 - 0x00000000fe001fff WARNING: CPU: 1 PID: 434 at arch/x86/mm/ioremap.c:186 __ioremap_caller.constprop.0+0x2aa/0x2c0 ... Call Trace: ? pmc_core_probe+0x87/0x2d0 [intel_pmc_core] pmc_core_probe+0x87/0x2d0 [intel_pmc_core] This issue appears to manifest itself because of the following fallback mechanism in the driver: if (lpit_read_residency_count_address(&slp_s0_addr)) pmcdev->base_addr = PMC_BASE_ADDR_DEFAULT; The validity of address PMC_BASE_ADDR_DEFAULT (i.e., 0xFE000000) is not verified by the driver, which is what this patch introduces. With this patch, if address PMC_BASE_ADDR_DEFAULT is in RAM, then the driver will not attempt to ioremap the aforementioned address. Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	dmaengine: ti: edma: Do not reset reserved paRAM slots	Peter Ujfalusi
	[ Upstream commit c5dbe60664b3660f5ac5854e21273ea2e7ff698f ] Skip resetting paRAM slots marked as reserved as they might be used by other cores. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20190823125618.8133-2-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	md/raid1: fail run raid1 array when active disk less than one	Yufen Yu
	[ Upstream commit 07f1a6850c5d5a65c917c3165692b5179ac4cb6b ] When run test case: mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal mdadm -S /dev/md1 mdadm -A /dev/md1 /dev/sd[b-c] --run --force mdadm --zero /dev/sda mdadm /dev/md1 -a /dev/sda echo offline > /sys/block/sdc/device/state echo offline > /sys/block/sdb/device/state sleep 5 mdadm -S /dev/md1 echo running > /sys/block/sdb/device/state echo running > /sys/block/sdc/device/state mdadm -A /dev/md1 /dev/sd[a-c] --run --force mdadm run fail with kernel message as follow: [ 172.986064] md: kicking non-fresh sdb from array! [ 173.004210] md: kicking non-fresh sdc from array! [ 173.022383] md/raid1:md1: active with 0 out of 4 mirrors [ 173.022406] md1: failed to create bitmap (-5) In fact, when active disk in raid1 array less than one, we need to return fail in raid1_run(). Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	hwmon: (acpi_power_meter) Change log level for 'unsafe software power cap'	Wang Shenran
	[ Upstream commit 6e4d91aa071810deac2cd052161aefb376ecf04e ] At boot time, the acpi_power_meter driver logs the following error level message: "Ignoring unsafe software power cap". Having read about it from a few sources, it seems that the error message can be quite misleading. While the message can imply that Linux is ignoring the fact that the system is operating in potentially dangerous conditions, the truth is the driver found an ACPI_PMC object that supports software power capping. The driver simply decides not to use it, perhaps because it doesn't support the object. The best solution is probably changing the log level from error to warning. All sources I have found, regarding the error, have downplayed its significance. There is not much of a reason for it to be on error level, while causing potential confusions or misinterpretations. Signed-off-by: Wang Shenran <shenran268@gmail.com> Link: https://lore.kernel.org/r/20190724080110.6952-1-shenran268@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	closures: fix a race on wakeup from closure_sync	Kent Overstreet
	[ Upstream commit a22a9602b88fabf10847f238ff81fde5f906fef7 ] The race was when a thread using closure_sync() notices cl->s->done == 1 before the thread calling closure_put() calls wake_up_process(). Then, it's possible for that thread to return and exit just before wake_up_process() is called - so we're trying to wake up a process that no longer exists. rcu_read_lock() is sufficient to protect against this, as there's an rcu barrier somewhere in the process teardown path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	ACPI / PCI: fix acpi_pci_irq_enable() memory leak	Wenwen Wang
	[ Upstream commit 29b49958cf73b439b17fa29e9a25210809a6c01c ] In acpi_pci_irq_enable(), 'entry' is allocated by kzalloc() in acpi_pci_irq_check_entry() (invoked from acpi_pci_irq_lookup()). However, it is not deallocated if acpi_pci_irq_valid() returns false, leading to a memory leak. To fix this issue, free 'entry' before returning 0. Fixes: e237a5518425 ("x86/ACPI/PCI: Recognize that Interrupt Line 255 means "not connected"") Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	ACPI: custom_method: fix memory leaks	Wenwen Wang
	[ Upstream commit 03d1571d9513369c17e6848476763ebbd10ec2cb ] In cm_write(), 'buf' is allocated through kzalloc(). In the following execution, if an error occurs, 'buf' is not deallocated, leading to memory leaks. To fix this issue, free 'buf' before returning the error. Fixes: 526b4af47f44 ("ACPI: Split out custom_method functionality into an own driver") Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	iommu/iova: Avoid false sharing on fq_timer_on	Eric Dumazet
	[ Upstream commit 0d87308cca2c124f9bce02383f1d9632c9be89c4 ] In commit 14bd9a607f90 ("iommu/iova: Separate atomic variables to improve performance") Jinyu Qi identified that the atomic_cmpxchg() in queue_iova() was causing a performance loss and moved critical fields so that the false sharing would not impact them. However, avoiding the false sharing in the first place seems easy. We should attempt the atomic_cmpxchg() no more than 100 times per second. Adding an atomic_read() will keep the cache line mostly shared. This false sharing came with commit 9a005a800ae8 ("iommu/iova: Add flush timer"). Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 9a005a800ae8 ('iommu/iova: Add flush timer') Cc: Jinyu Qi <jinyuqi@huawei.com> Cc: Joerg Roedel <jroedel@suse.de> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	libata/ahci: Drop PCS quirk for Denverton and beyond	Dan Williams
	[ Upstream commit c312ef176399e04fc5f7f2809d9a589751fbf6d9 ] The Linux ahci driver has historically implemented a configuration fixup for platforms / platform-firmware that fails to enable the ports prior to OS hand-off at boot. The fixup was originally implemented way back before ahci moved from drivers/scsi/ to drivers/ata/, and was updated in 2007 via commit 49f290903935 "ahci: update PCS programming". The quirk sets a port-enable bitmap in the PCS register at offset 0x92. This quirk could be applied generically up until the arrival of the Denverton (DNV) platform. The DNV AHCI controller architecture supports more than 6 ports and along with that the PCS register location and format were updated to allow for more possible ports in the bitmap. DNV AHCI expands the register to 32-bits and moves it to offset 0x94. As it stands there are no known problem reports with existing Linux trying to set bits at offset 0x92 which indicates that the quirk is not applicable. Likely it is not applicable on a wider range of platforms, but it is difficult to discern which platforms if any still depend on the quirk. Rather than try to fix the PCS quirk to consider the DNV register layout instead require explicit opt-in. The assumption is that the OS driver need not touch this register, and platforms can be added with a new boad_ahci_pcs7 board-id when / if problematic platforms are found in the future. The logic in ahci_intel_pcs_quirk() looks for all Intel AHCI instances with "legacy" board-ids and otherwise skips the quirk if the board was matched by class-code. Reported-by: Stephen Douthit <stephend@silicom-usa.com> Cc: Christoph Hellwig <hch@infradead.org> Reviewed-by: Stephen Douthit <stephend@silicom-usa.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	iommu/amd: Silence warnings under memory pressure	Qian Cai
	[ Upstream commit 3d708895325b78506e8daf00ef31549476e8586a ] When running heavy memory pressure workloads, the system is throwing endless warnings, smartpqi 0000:23:00.0: AMD-Vi: IOMMU mapping error in map_sg (io-pages: 5 reason: -12) Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 swapper/10: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0,4 Call Trace: <IRQ> dump_stack+0x62/0x9a warn_alloc.cold.43+0x8a/0x148 __alloc_pages_nodemask+0x1a5c/0x1bb0 get_zeroed_page+0x16/0x20 iommu_map_page+0x477/0x540 map_sg+0x1ce/0x2f0 scsi_dma_map+0xc6/0x160 pqi_raid_submit_scsi_cmd_with_io_request+0x1c3/0x470 [smartpqi] do_IRQ+0x81/0x170 common_interrupt+0xf/0xf </IRQ> because the allocation could fail from iommu_map_page(), and the volume of this call could be huge which may generate a lot of serial console output and cosumes all CPUs. Fix it by silencing the warning in this call site, and there is still a dev_err() later to notify the failure. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	nvme-multipath: fix ana log nsid lookup when nsid is not found	Anton Eidelman
	[ Upstream commit e01f91dff91c7b16a6e3faf2565017d497a73f83 ] ANA log parsing invokes nvme_update_ana_state() per ANA group desc. This updates the state of namespaces with nsids in desc->nsids[]. Both ctrl->namespaces list and desc->nsids[] array are sorted by nsid. Hence nvme_update_ana_state() performs a single walk over ctrl->namespaces: - if current namespace matches the current desc->nsids[n], this namespace is updated, and n is incremented. - the process stops when it encounters the end of either ctrl->namespaces end or desc->nsids[] In case desc->nsids[n] does not match any of ctrl->namespaces, the remaining nsids following desc->nsids[n] will not be updated. Such situation was considered abnormal and generated WARN_ON_ONCE. However ANA log MAY contain nsids not (yet) found in ctrl->namespaces. For example, lets consider the following scenario: - nvme0 exposes namespaces with nsids = [2, 3] to the host - a new namespace nsid = 1 is added dynamically - also, a ANA topology change is triggered - NS_CHANGED aen is generated and triggers scan_work - before scan_work discovers nsid=1 and creates a namespace, a NOTICE_ANA aen was issues and ana_work receives ANA log with nsids=[1, 2, 3] Result: ana_work fails to update ANA state on existing namespaces [2, 3] Solution: Change the way nvme_update_ana_state() namespace list walk checks the current namespace against desc->nsids[n] as follows: a) ns->head->ns_id < desc->nsids[n]: keep walking ctrl->namespaces. b) ns->head->ns_id == desc->nsids[n]: match, update the namespace c) ns->head->ns_id >= desc->nsids[n]: skip to desc->nsids[n+1] This enables correct operation in the scenario described above. This also allows ANA log to contain nsids currently invisible to the host, i.e. inactive nsids. Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	nvmet: fix data units read and written counters in SMART log	Tom Wu
	[ Upstream commit 3bec2e3754becebd4c452999adb49bc62c575ea4 ] In nvme spec 1.3 there is a definition for data write/read counters from SMART log, (See section 5.14.1.2): This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes read) and is rounded up. However, in nvme target where value is reported with actual units, but not thousands of units as the spec requires. Signed-off-by: Tom Wu <tomwu@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	ACPI / CPPC: do not require the _PSD method	Al Stone
	[ Upstream commit 4c4cdc4c63853fee48c02e25c8605fb65a6c9924 ] According to the ACPI 6.3 specification, the _PSD method is optional when using CPPC. The underlying assumption is that each CPU can change frequency independently from all other CPUs; _PSD is provided to tell the OS that some processors can NOT do that. However, the acpi_get_psd() function returns ENODEV if there is no _PSD method present, or an ACPI error status if an error occurs when evaluating _PSD, if present. This makes _PSD mandatory when using CPPC, in violation of the specification, and only on Linux. This has forced some firmware writers to provide a dummy _PSD, even though it is irrelevant, but only because Linux requires it; other OSPMs follow the spec. We really do not want to have OS specific ACPI tables, though. So, correct acpi_get_psd() so that it does not return an error if there is no _PSD method present, but does return a failure when the method can not be executed properly. This allows _PSD to be optional as it should be. Signed-off-by: Al Stone <ahs3@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	media: ov9650: add a sanity check	Mauro Carvalho Chehab
	[ Upstream commit 093347abc7a4e0490e3c962ecbde2dc272a8f708 ] As pointed by cppcheck: [drivers/media/i2c/ov9650.c:706]: (error) Shifting by a negative value is undefined behaviour [drivers/media/i2c/ov9650.c:707]: (error) Shifting by a negative value is undefined behaviour [drivers/media/i2c/ov9650.c:721]: (error) Shifting by a negative value is undefined behaviour Prevent mangling with gains with invalid values. As pointed by Sylvester, this should never happen in practice, as min value of V4L2_CID_GAIN control is 16 (gain is always >= 16 and m is always >= 0), but it is too hard for a static analyzer to get this, as the logic with validates control min/max is elsewhere inside V4L2 core. Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	media: saa7134: fix terminology around saa7134_i2c_eeprom_md7134_gate()	Maciej S. Szmigiero
	[ Upstream commit 9d802222a3405599d6e1984d9324cddf592ea1f4 ] saa7134_i2c_eeprom_md7134_gate() function and the associated comment uses an inverted i2c gate open / closed terminology. Let's fix this. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> [hverkuil-cisco@xs4all.nl: fix alignment checkpatch warning] Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05	media: cpia2_usb: fix memory leaks	Wenwen Wang
	[ Upstream commit 1c770f0f52dca1a2323c594f01f5ec6f1dddc97f ] In submit_urbs(), 'cam->sbuf[i].data' is allocated through kmalloc_array(). However, it is not deallocated if the following allocation for urbs fails. To fix this issue, free 'cam->sbuf[i].data' if usb_alloc_urb() fails. Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>