summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-03-10xen/events: avoid handling the same event on two cpus at the same timeJuergen Gross
When changing the cpu affinity of an event it can happen today that (with some unlucky timing) the same event will be handled on the old and the new cpu at the same time. Avoid that by adding an "event active" flag to the per-event data and call the handler only if this flag isn't set. Cc: stable@vger.kernel.org Reported-by: Julien Grall <julien@xen.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-10xen/events: don't unmask an event channel when an eoi is pendingJuergen Gross
An event channel should be kept masked when an eoi is pending for it. When being migrated to another cpu it might be unmasked, though. In order to avoid this keep three different flags for each event channel to be able to distinguish "normal" masking/unmasking from eoi related masking/unmasking and temporary masking. The event channel should only be able to generate an interrupt if all flags are cleared. Cc: stable@vger.kernel.org Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework") Reported-by: Julien Grall <julien@xen.org> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com> Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com [boris -- corrected Fixed tag format] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-10drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=mAlex Deucher
Need to check the module variant as well. Acked-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/radeon: fix AGP dependencyChristian König
When AGP is compiled as module radeon must be compiled as module as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_tableChristian König
Otherwise we will run into a NULL ptr deref. Signed-off-by: Christian König <christian.koenig@amd.com> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.11.x
2021-03-10drm/amd/pm: correct the watermark settings for PolarisEvan Quan
The "/ 10" should be applied to the right-hand operand instead of the left-hand one. Signed-off-by: Evan Quan <evan.quan@amd.com> Noticed-by: Georgios Toptsidis <gtoptsid@gmail.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amd/pm: bug fix for pcie dpmKenneth Feng
Currently the pcie dpm has two problems. 1. Only the high dpm level speed/width can be overrided if the requested values are out of the pcie capability. 2. The high dpm level is always overrided though sometimes it's not necesarry. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amdgpu: fb BO should be ttm_bo_type_deviceNirmoy Das
FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10drm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wmZhan Liu
[Why] For DGPU Navi, the wm_table.nv_entries are used. These entires are not populated for DCN301 Vangogh APU, but instead wm_table.entries are. [How] Use DCN21 Renoir style wm calculations. Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Zhan Liu <zhan.liu@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10drm/amd/display: Enabled pipe harvesting in dcn30Dillon Varone
[Why & How] Ported logic from dcn21 for reading in pipe fusing to dcn30. Supported configurations are 1 and 6 pipes. Invalid fusing will revert to 1 pipe being enabled. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10drm/amd/display: Revert dram_clock_change_latency for DCN2.1Sung Lee
[WHY & HOW] Using values provided by DF for latency may cause hangs in multi display configurations. Revert change to previous value. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Sung Lee <sung.lee@amd.com> Reviewed-by: Haonan Wang <Haonan.Wang2@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10Merge tag 's390-5.12-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - fix various user space visible copy_to_user() instances which return the number of bytes left to copy instead of -EFAULT - make TMPFS_INODE64 available again for s390 and alpha, now that both architectures have been switched to 64-bit ino_t (see commit 96c0a6a72d18: "s390,alpha: switch to 64-bit ino_t") - make sure to release a shared hypervisor resource within the zcore device driver also on restart and power down; also remove unneeded surrounding debugfs_create return value checks - for the new hardware counter set device driver rename the uapi header file to be a bit more generic; also remove 60 second read limit which is not really necessary and without the limit the interface can be easier tested - some small cleanups, the largest being to convert all long long in our time and idle code to longs - update defconfigs * tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig s390: update defconfigs s390,alpha: make TMPFS_INODE64 available again s390/cio: return -EFAULT if copy_to_user() fails s390/tty3270: avoid comma separated statements s390/cpumf: remove unneeded semicolon s390/crypto: return -EFAULT if copy_to_user() fails s390/cio: return -EFAULT if copy_to_user() fails s390/cpumf: rename header file to hwctrset.h s390/zcore: release dump save area on restart or power down s390/zcore: no need to check return value of debugfs_create functions s390/cpumf: remove 60 seconds read limit s390/topology: remove always false if check s390/time,idle: get rid of unsigned long long
2021-03-10drm/amd/display: Enable pflip interrupt upon pipe enableQingqing Zhuo
[Why] pflip interrupt would not be enabled promptly if a pipe is disabled and re-enabled, causing flip_done timeout error during DP compliance tests [How] Enable pflip interrupt upon pipe enablement Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10drm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp()Holger Hoffstätte
After fixing nested FPU contexts caused by 41401ac67791 we're still seeing complaints about spurious kernel_fpu_end(). As it turns out this was already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved forward to dcn21. Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth()Holger Hoffstätte
Commit 41401ac67791 added FPU wrappers to dcn21_validate_bandwidth(), which was correct. Unfortunately a nested function alredy contained DC_FP_START()/DC_FP_END() calls, which results in nested FPU context enter/exit and complaints by kernel_fpu_begin_mask(). This can be observed e.g. with 5.10.20, which backported 41401ac67791 and now emits the following warning on boot: WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0 Call Trace: dcn21_calculate_wm+0x47/0xa90 [amdgpu] dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu] dcn21_validate_bandwidth+0x29/0x40 [amdgpu] dc_validate_global_state+0x3c7/0x4c0 [amdgpu] The warning is emitted due to the additional DC_FP_START/END calls in patch_bounding_box(), which is inlined into dcn21_calculate_wm(), its only caller. Removing the calls brings the code in line with dcn20 and makes the warning disappear. Fixes: 41401ac67791 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()") Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amd/display: Add a backlight module optionTakashi Iwai
There seem devices that don't work with the aux channel backlight control. For allowing such users to test with the other backlight control method, provide a new module option, aux_backlight, to specify enabling or disabling the aux backport support explicitly. As default, the aux support is detected by the hardware capability. v2: make the backlight option generic in case we add future backlight types (Alex) BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438 Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amdgpu/display: handle aux backlight in backlight_get_brightnessAlex Deucher
Need to fetch it via aux. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amdgpu/display: don't assert in set backlight functionAlex Deucher
It just spams the logs. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10drm/amdgpu/display: simplify backlight settingAlex Deucher
Avoid the extra wrapper function. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-03-10usbip: fix vudc usbip_sockfd_store races leading to gpfShuah Khan
usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb gadget device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export. Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. Device status is shared between host and client. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid. Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync. Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence. Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/b1c08b983ffa185449c9f0f7d1021dc8c8454b60.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10usbip: fix vhci_hcd attach_store() races leading to gpfShuah Khan
attach_store() is invoked when user requests import (attach) a device from usbip host. Attach and detach are governed by local state and shared state - Shared state (usbip device status) - Device status is used to manage the attach and detach operations on import-able devices. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - Device has to be in the right state to be attached and detached. Attach sequence includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the host to get access to the imported device. rx and tx threads depends on local and shared state to be correct and in sync. Detach sequence shuts the socket down and stops the rx and tx threads. Detach sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current attach sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to VDEV_ST_NOTASSIGNED. This opens up a race condition between the threads, port connect, and detach handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold vhci and usbip_device locks to update local and shared states after creating rx and tx threads. - Update usbip_device status to VDEV_ST_NOTASSIGNED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the attach sequence. - Update usbip_device tcp_rx and tcp_tx pointers holding vhci and usbip_device locks. Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000 Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/bb434bd5d7a64fbec38b5ecfb838a6baef6eb12b.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10usbip: fix stub_dev usbip_sockfd_store() races leading to gpfShuah Khan
usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export. Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid. Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync. Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence. Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000 Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/268a0668144d5ff36ec7d87fdfa90faf583b7ccc.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10usbip: fix vudc to check for stream socketShuah Khan
Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/387a670316002324113ac7ea1e8b53f4085d0c95.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10usbip: fix vhci_hcd to check for stream socketShuah Khan
Fix attach_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/52712aa308915bda02cece1589e04ee8b401d1f3.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10usbip: fix stub_dev to check for stream socketShuah Khan
Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/e942d2bd03afb8e8552bd2a5d84e18d17670d521.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10Revert "mm, slub: consider rest of partial list if acquire_slab() fails"Linus Torvalds
This reverts commit 8ff60eb052eeba95cfb3efe16b08c9199f8121cf. The kernel test robot reports a huge performance regression due to the commit, and the reason seems fairly straightforward: when there is contention on the page list (which is what causes acquire_slab() to fail), we do _not_ want to just loop and try again, because that will transfer the contention to the 'n->list_lock' spinlock we hold, and just make things even worse. This is admittedly likely a problem only on big machines - the kernel test robot report comes from a 96-thread dual socket Intel Xeon Gold 6252 setup, but the regression there really is quite noticeable: -47.9% regression of stress-ng.rawpkt.ops_per_sec and the commit that was marked as being fixed (7ced37197196: "slub: Acquire_slab() avoid loop") actually did the loop exit early very intentionally (the hint being that "avoid loop" part of that commit message), exactly to avoid this issue. The correct thing to do may be to pick some kind of reasonable middle ground: instead of breaking out of the loop on the very first sign of contention, or trying over and over and over again, the right thing may be to re-try _once_, and then give up on the second failure (or pick your favorite value for "once"..). Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/ Cc: Jann Horn <jannh@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-10Merge tag 'for-linus-2021-03-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull detached mounts fix from Christian Brauner: "Creating a series of detached mounts, attaching them to the filesystem, and unmounting them can be used to trigger an integer overflow in ns->mounts causing the kernel to block any new mounts in count_mounts() and returning ENOSPC because it falsely assumes that the maximum number of mounts in the mount namespace has been reached, i.e. it thinks it can't fit the new mounts into the mount namespace anymore. Without this fix heavy use of the new mount API with move_mount() will cause the host to become unuseable and thus blocks some xfstest patches I want to resend. Depending on the number of mounts in your system, this can be reproduced on any kernel that supportes open_tree() and move_mount(). A reproducer has been sent for inclusion with xfstests. It takes care to do this in another mount namespace, not in the host's mount namespace so there shouldn't be any risk in running it but if one did run it on the host it would require a reboot in order to be able to mount again. See https://lore.kernel.org/fstests/20210309121041.753359-1-christian.brauner@ubuntu.com The root cause of this is that detached mounts aren't handled correctly when source and target mount are identical and reside on a shared mount causing a broken mount tree where the detached source itself is propagated which propagation prevents for regular bind-mounts and new mounts. This ultimately leads to a miscalculation of the number of mounts in the mount namespace. Detached mounts created via 'open_tree(fd, path, OPEN_TREE_CLONE)' are essentially like an unattached bind-mount. They can then later on be attached to the filesystem via move_mount() which calls into attach_recursive_mount(). Part of attaching it to the filesystem is making sure that mounts get correctly propagated in case the destination mountpoint is MS_SHARED, i.e. is a shared mountpoint. This is done by calling into propagate_mnt() which walks the list of peers calling propagate_one() on each mount in this list making sure it receives the propagation event. The propagate_one() function thereby skips both new mounts and bind mounts to not propagate them "into themselves". Both are identified by checking whether the mount is already attached to any mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is responsible for. However, detached mounts have an anonymous mount namespace attached to them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't realize they need to be skipped causing the mount to propagate "into itself" breaking the mount table and causing a disconnect between the number of mounts recorded as being beneath or reachable from the target mountpoint and the number of mounts actually recorded/counted in ns->mounts ultimately causing an overflow which in turn prevents any new mounts via the ENOSPC issue. So teach propagation to handle detached mounts by making it aware of them. I've been tracking this issue down for the last couple of days and then verifying that the fix is correct by unmounting everything in my current mount table leaving only /proc and /sys mounted and running the reproducer above overnight verifying the number of mounts counted in ns->mounts. With this fix the counts are correct and the ENOSPC issue can't be reproduced. This change will only have an effect on mounts created with the new mount API since detached mounts cannot be created with the old mount API so regressions are extremely unlikely. Here's an illustration: #### mount(): ubuntu@f1-vm:~$ sudo mount --bind /mnt/ /mnt/ ubuntu@f1-vm:~$ findmnt | grep -i mnt ├─/mnt /dev/sda2[/mnt] ext4 rw,relatime #### open_tree(OPEN_TREE_CLONE) + move_mount() with bug: ubuntu@f1-vm:~$ sudo ./mount-new /mnt/ /mnt/ ubuntu@f1-vm:~$ findmnt | grep -i mnt ├─/mnt /dev/sda2[/mnt] ext4 rw,relatime │ └─/mnt /dev/sda2[/mnt] ext4 rw,relatime #### open_tree(OPEN_TREE_CLONE) + move_mount() with the fix: ubuntu@f1-vm:~$ sudo ./mount-new /mnt /mnt ubuntu@f1-vm:~$ findmnt | grep -i mnt └─/mnt /dev/sda2[/mnt] ext4 rw,relatime" * tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: mount: fix mounting of detached mounts onto targets that reside on shared mounts
2021-03-10Merge tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Six cifs/smb3 fixes, three of them for stable, including some important mulitchannel crediting fixes, and a fix for statfs error handling" * tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: do not send close in compound create+close requests cifs: return proper error code in statfs(2) cifs: change noisy error message to FYI cifs: print MIDs in decimal notation cifs: ask for more credit on async read/write code paths cifs: fix credit accounting for extra channel
2021-03-10misc/pvpanic: Export module FDT device tableShile Zhang
Export the module FDT device table to ensure the FDT compatible strings are listed in the module alias. This help the pvpanic driver can be loaded on boot automatically not only the ACPI device, but also the FDT device. Fixes: 46f934c9a12fc ("misc/pvpanic: add support to get pvpanic device info FDT") Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com> Link: https://lore.kernel.org/r/20210218123116.207751-1-shile.zhang@linux.alibaba.com Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10misc: fastrpc: restrict user apps from sending kernel RPC messagesDmitry Baryshkov
Verify that user applications are not using the kernel RPC message handle to restrict them from directly attaching to guest OS on the remote subsystem. This is a port of CVE-2019-2308 fix. Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Jonathan Marek <jonathan@marek.ca> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20210212192658.3476137-1-dmitry.baryshkov@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10virt: acrn: Correct type casting of argument of copy_from_user()Shuo Liu
hsm.c:336:50: warning: incorrect type in argument 2 (different address spaces) hsm.c:336:50: expected void const [noderef] __user *from hsm.c:336:50: got void * This patch fixes above sparse warning. Fixes: 3d679d5aec64 ("virt: acrn: Introduce interfaces to query C-states and P-states allowed by hypervisor") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shuo Liu <shuo.a.liu@intel.com> Link: https://lore.kernel.org/r/20210310153708.17451-1-shuo.a.liu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-10x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" caseSean Christopherson
Initialize x86_pmu.guest_get_msrs to return 0/NULL to handle the "nop" case. Patching in perf_guest_get_msrs_nop() during setup does not work if there is no PMU, as setup bails before updating the static calls, leaving x86_pmu.guest_get_msrs NULL and thus a complete nop. Ultimately, this causes VMX abort on VM-Exit due to KVM putting random garbage from the stack into the MSR load list. Add a comment in KVM to note that nr_msrs is valid if and only if the return value is non-NULL. Fixes: abd562df94d1 ("x86/perf: Use static_call for x86_pmu.guest_get_msrs") Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot+cce9ef2dd25246f815ee@syzkaller.appspotmail.com Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309171019.1125243-1-seanjc@google.com
2021-03-10block: rsxx: fix error return code of rsxx_pci_probe()Jia-Ju Bai
When create_singlethread_workqueue returns NULL to card->event_wq, no error return code of rsxx_pci_probe() is assigned. To fix this bug, st is assigned with -ENOMEM in this case. Fixes: 8722ff8cdbfa ("block: IBM RamSan 70/80 device driver") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Link: https://lore.kernel.org/r/20210310033017.4023-1-baijiaju1990@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10block: Fix REQ_OP_ZONE_RESET_ALL handlingDamien Le Moal
Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: remove indirect ctx into sqo injectionPavel Begunkov
We use ->ctx_new_list to notify sqo about new ctx pending, then sqo should stop and splice it to its sqd->ctx_list, paired with ->sq_thread_comp. The last one is broken because nobody reinitialises it, and trying to fix it would only add more complexity and bugs. And the first isn't really needed as is done under park(), that protects from races well. Add ctx into sqd->ctx_list directly (under park()), it's much simpler and allows to kill both, ctx_new_list and sq_thread_comp. note: apparently there is no real problem at the moment, because sq_thread_comp is used only by io_sq_thread_finish() followed by parking, where list_del(&ctx->sqd_list) removes it well regardless whether it's in the new or the active list. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: fix invalid ctx->sq_thread_idlePavel Begunkov
We have to set ctx->sq_thread_idle before adding a ring to an SQ task, otherwise sqd races for seeing zero and accounting it as such. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10kernel: make IO threads unfreezable by defaultJens Axboe
The io-wq threads were already marked as no-freeze, but the manager was not. On resume, we perpetually have signal_pending() being true, and hence the manager will loop and spin 100% of the time. Just mark the tasks created by create_io_thread() as PF_NOFREEZE by default, and remove any knowledge of it in io-wq and io_uring. Reported-by: Kevin Locke <kevin@kevinlocke.name> Tested-by: Kevin Locke <kevin@kevinlocke.name> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: always wait for sqd exited when stopping SQPOLL threadJens Axboe
We have a tiny race where io_put_sq_data() calls io_sq_thead_stop() and finds the thread gone, but the thread has indeed not fully exited or called complete() yet. Close it up by always having io_sq_thread_stop() wait on completion of the exit event. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: remove unneeded variable 'ret'Yang Li
Fix the following coccicheck warning: ./fs/io_uring.c:8984:5-8: Unneeded variable: "ret". Return "0" on line 8998 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/1615271441-33649-1-git-send-email-yang.lee@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: move all io_kiocb init early in io_init_req()Jens Axboe
If we hit an error path in the function, make sure that the io_kiocb is fully initialized at that point so that freeing the request always sees a valid state. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io-wq: fix ref leak for req in case of exit cancelationsyangerkun
do_work such as io_wq_submit_work that cancel the work may leave a ref of req as 1 if we have links. Fix it by call io_run_cancel. Fixes: 4fb6ac326204 ("io-wq: improve manager/worker handling over exec") Signed-off-by: yangerkun <yangerkun@huawei.com> Link: https://lore.kernel.org/r/20210309030410.3294078-1-yangerkun@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: fix complete_post races for linked reqPavel Begunkov
Calling io_queue_next() after spin_unlock in io_req_complete_post() races with the other side extracting and reusing this request. Hand coded parts of io_req_find_next() considering that io_disarm_next() and io_req_task_queue() have (and safe) to be called with completion_lock held. It already does io_commit_cqring() and io_cqring_ev_posted(), so just reuse it for post io_disarm_next(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5672a62f3150ee7c55849f40c0037655c4f2840f.1615250156.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: add io_disarm_next() helperPavel Begunkov
A preparation patch placing all preparations before extracting a next request into a separate helper io_disarm_next(). Also, don't spuriously do ev_posted in a rare case where REQ_F_FAIL_LINK is set but there are no requests linked (i.e. after cancelling a linked timeout or setting IOSQE_IO_LINK on a last request of a submission batch). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/44ecff68d6b47e1c4e6b891bdde1ddc08cfc3590.1615250156.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: fix io_sq_offload_create error handlingPavel Begunkov
Don't set IO_SQ_THREAD_SHOULD_STOP when io_sq_offload_create() has failed on io_uring_alloc_task_context() but leave everything to io_sq_thread_finish(), because currently io_sq_thread_finish() hangs on trying to park it. That's great it stalls there, because otherwise the following io_sq_thread_stop() would be skipped on IO_SQ_THREAD_SHOULD_STOP check and the sqo would race for sqd with freeing ctx. A simple error injection gives something like this. [ 245.463955] INFO: task sqpoll-test-hang:523 blocked for more than 122 seconds. [ 245.463983] Call Trace: [ 245.463990] __schedule+0x36b/0x950 [ 245.464005] schedule+0x68/0xe0 [ 245.464013] schedule_timeout+0x209/0x2a0 [ 245.464032] wait_for_completion+0x8b/0xf0 [ 245.464043] io_sq_thread_finish+0x44/0x1a0 [ 245.464049] io_uring_setup+0x9ea/0xc80 [ 245.464058] __x64_sys_io_uring_setup+0x16/0x20 [ 245.464064] do_syscall_64+0x38/0x50 [ 245.464073] entry_SYSCALL_64_after_hwframe+0x44/0xae Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io-wq: remove unused 'user' member of io_wqJens Axboe
Previous patches killed the last user of this, now it's just a dead member in the struct. Get rid of it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: Convert personality_idr to XArrayMatthew Wilcox (Oracle)
You can't call idr_remove() from within a idr_for_each() callback, but you can call xa_erase() from an xa_for_each() loop, so switch the entire personality_idr from the IDR to the XArray. This manifests as a use-after-free as idr_for_each() attempts to walk the rest of the node after removing the last entry from it. Fixes: 071698e13ac6 ("io_uring: allow registering credentials") Cc: stable@vger.kernel.org # 5.6+ Reported-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> [Pavel: rebased (creds load was moved into io_init_req())] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7ccff36e1375f2b0ebf73d957f037b43becc0dde.1615212806.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: clean R_DISABLED startup messPavel Begunkov
There are enough of problems with IORING_SETUP_R_DISABLED, including the burden of checking and kicking off the SQO task all over the codebase -- for exit/cancel/etc. Rework it, always start the thread but don't do submit unless the flag is gone, that's much easier. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: fix unrelated ctx reqs cancellationPavel Begunkov
io-wq now is per-task, so cancellations now should match against request's ctx. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10io_uring: SQPOLL parking fixesJens Axboe
We keep running into weird dependency issues between the sqd lock and the parking state. Disentangle the SQPOLL thread from the last bits of the kthread parking inheritance, and just replace the parking state, and two associated locks, with a single rw mutex. The SQPOLL thread keeps the mutex for read all the time, except if someone has marked us needing to park. Then we drop/re-acquire and try again. This greatly simplifies the parking state machine (by just getting rid of it), and makes it a lot more obvious how it works - if you need to modify the ctx list, then you simply park the thread which will grab the lock for writing. Fold in fix from Hillf Danton on not setting STOP on a fatal signal. Fixes: e54945ae947f ("io_uring: SQPOLL stop error handling fixes") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-10software node: Fix device_add_software_node()Heikki Krogerus
The function device_add_software_node() was meant to register the node supplied to it, but only if that node wasn't already registered. Right now the function attempts to always register the node. That will cause a failure with nodes that are already registered. Fixing that by incrementing the reference count of the nodes that have already been registered, and only registering the new nodes. Also, clarifying the behaviour in the function documentation. Fixes: e68d0119e328 ("software node: Introduce device_add_software_node()") Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>