Age | Commit message (Collapse) | Author |
|
[ Upstream commit c6d3ee209b9e863c6251f72101511340451ca324 ]
Correct SCSI midlayer sending more requests than exposed host queue depth
causing firmware ASSERT and lockup issues by enabling host-wide tags.
Note: This also results in better performance.
Link: https://lore.kernel.org/r/161549369787.25025.8975999483518581619.stgit@brunhilda
Suggested-by: Ming Lei <ming.lei@redhat.com>
Suggested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
bounce
[ Upstream commit 751faedf06e895a17e985a88ef5b6364ffd797ed ]
Adds 80000 us sleep when the usb cable is plugged in to hopefully avoid
bouncing contacts.
Upon pluging in the usb cable vbus will bounce for some time, causing cpcap to
dissconnect charging due to detecting an undervoltage condition. This is a
scope of vbus on xt894 while quickly inserting the usb cable with firm force,
probed at the far side of the usb socket and vbus loaded with approx 1k:
http://uvos.xyz/maserati/usbplug.jpg.
As can clearly be seen, vbus is all over the place for the first 15 ms or so
with a small blip at ~40 ms this causes the cpcap to trip up and disable
charging again.
The delay helps cpcap_usb_detect avoid the worst of this. It is, however, still
not ideal as strong vibrations can cause the issue to reapear any time during
charging. I have however not been able to cause the device to stop charging due
to this in practice as it is hard to vibrate the device such that the vbus pins
start bouncing again but cpcap_usb_detect is not called again due to a detected
disconnect/reconnect event.
Signed-off-by: Carl Philipp Klemm <philipp@uvos.xyz>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1205b688c92558a04d8dd4cbc2b213e0fceba5db ]
Dan reported following static checker warnings
tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals()
warn: 'bw_imc' unsigned <= 0
tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals()
warn: 'bw_resc_end' unsigned <= 0
These warnings are reported because
1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long
variables
2. Return value of get_mem_bw_imc() and get_mem_bw_resctrl() are assigned
to 'bw_imc' and 'bw_resc_end' respectively
3. The returned values are checked for <= 0 to see if the calls failed
Checking for < 0 for an unsigned value doesn't make any sense.
Fix this issue by changing the implementation of get_mem_bw_imc() and
get_mem_bw_resctrl() such that they now accept reference to a variable
and set the variable appropriately upon success and return 0, else return
< 0 on error.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d81343b5eedf84be71a4313e8fd073d0c510afcf ]
iMC (Integrated Memory Controller) counters are usually at
"/sys/bus/event_source/devices/" and are named as "uncore_imc_<n>".
num_of_imcs() function tries to count number of such iMC counters so that
it could appropriately initialize required number of perf_attr structures
that could be used to read these iMC counters.
num_of_imcs() function assumes that all the directories under this path
that start with "uncore_imc" are iMC counters. But, on some systems there
could be directories named as "uncore_imc_free_running" which aren't iMC
counters. Trying to read from such directories will result in "not found
file" errors and MBM/MBA tests will fail.
Hence, fix the logic in num_of_imcs() such that it looks at the first
character after "uncore_imc_" to check if it's a numerical digit or not. If
it's a digit then the directory represents an iMC counter, else, skip the
directory.
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ee0415681eb661efa1eb2db7acc263f2c7df1e23 ]
Resctrl test suite before running any unit test (like cmt, cat, mbm and
mba) should first check if the feature is enabled (by kernel and not just
supported by H/W) on the platform or not.
validate_resctrl_feature_request() is supposed to do that. This function
intends to grep for relevant flags in /proc/cpuinfo but there are several
issues here
1. validate_resctrl_feature_request() calls fgrep() to get flags from
/proc/cpuinfo. But, fgrep() can only return a string with maximum of 255
characters and hence the complete cpu flags are never returned.
2. The substring search logic is also busted. If strstr() finds requested
resctrl feature in the cpu flags, it returns pointer to the first
occurrence. But, the logic negates the return value of strstr() and
hence validate_resctrl_feature_request() returns false if the feature is
present in the cpu flags and returns true if the feature is not present.
3. validate_resctrl_feature_request() checks if a resctrl feature is
reported in /proc/cpuinfo flags or not. Having a cpu flag means that the
H/W supports the feature, but it doesn't mean that the kernel enabled
it. A user could selectively enable only a subset of resctrl features
using kernel command line arguments. Hence, /proc/cpuinfo isn't a
reliable source to check if a feature is enabled or not.
The 3rd issue being the major one and fixing it requires changing the way
validate_resctrl_feature_request() works. Since, /proc/cpuinfo isn't the
right place to check if a resctrl feature is enabled or not, a more
appropriate place is /sys/fs/resctrl/info directory. Change
validate_resctrl_feature_request() such that,
1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not
2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not
3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and
check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy
4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and
check if /sys/fs/resctrl/info/L3_MON/mon_features has
mbm_<total/local>_bytes
Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2
variants can be added later.
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d7af3d0d515cbdf63b6c3398a3c15ecb1bc2bd38 ]
resctrl test suite accepts command line arguments (like -b, -t, -n and -p)
as documented in the help. But passing -n and -p throws an invalid option
error. This happens because -n and -p are missing in the list of
characters that getopt() recognizes as valid arguments. Hence, they are
treated as invalid options.
Fix this by adding them to the list of characters that getopt() recognizes
as valid arguments. Please note that the main() function already has the
logic to deal with the values passed as part of these arguments and hence
no changes are needed there.
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2428673638ea28fa93d2a38b1c3e8d70122b00ee ]
Checking resctrl features call strcmp() to compare feature strings
(e.g. "mba", "cat" etc). The checkings are error prone and don't have
good coding style. Define the constant strings in macros and call
strncmp() to solve the potential issues.
Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 896016d2ad051811ff9c9c087393adc063322fbc ]
Reinette reported following compilation issue on Fedora 32, gcc version
10.1.1
/usr/bin/ld: resctrl_tests.o:<src_dir>/resctrl.h:65: multiple definition
of `bm_pid'; cache.o:<src_dir>/resctrl.h:65: first defined here
Other variables are ppid, tests_run, llc_occup_path, is_amd. Compiler
isn't happy because these variables are defined globally in two .c files
but are not declared as extern.
To fix issues for the global variables, declare them as extern.
Chang Log:
- Split this patch from v4's patch 1 (Shuah).
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8236c51d85a64643588505a6791e022cc8d84864 ]
Reinette reported following compilation issue on Fedora 32, gcc version
10.1.1
/usr/bin/ld: cqm_test.o:<src_dir>/cqm_test.c:22: multiple definition of
`cache_size'; cat_test.o:<src_dir>/cat_test.c:23: first defined here
The same issue is reported for long_mask, cbm_mask, count_of_bits etc
variables as well. Compiler isn't happy because these variables are
defined globally in two .c files namely cqm_test.c and cat_test.c and
the compiler during compilation finds that the variable is already
defined (multiple definition error).
Taking a closer look at the usage of these variables reveals that these
variables are used only locally in functions such as cqm_resctrl_val()
(defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c).
These variables are not shared between those functions. So, there is no
need for these variables to be global. Hence, fix this issue by making
them static variables.
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a9d26a302dea29eb84f491b1340a57e56c631a71 ]
David reported a buffer overflow error in the check_results() function of
the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler
option to automatically detect any such errors.
Feature Test Macros man page describes_FORTIFY_SOURCE as below
"Defining this macro causes some lightweight checks to be performed to
detect some buffer overflow errors when employing various string and memory
manipulation functions (for example, memcpy, memset, stpcpy, strcpy,
strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and
wide character variants thereof). For some functions, argument consistency
is checked; for example, a check is made that open has been supplied with a
mode argument when the specified flags include O_CREAT. Not all problems
are detected, just some common cases.
If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc
-O1) and above, checks that shouldn't change the behavior of conforming
programs are performed.
With _FORTIFY_SOURCE set to 2, some more checking is added, but some
conforming programs might fail.
Some of the checks can be performed at compile time (via macros logic
implemented in header files), and result in compiler warnings; other checks
take place at run time, and result in a run-time error if the check fails.
Use of this macro requires compiler support, available with gcc since
version 4.0."
Fix the buffer overflow error in the check_results() function of the cmt
unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer
overflow errors.
Reported-by: David Binderman <dcb314@hotmail.com>
Suggested-by: David Binderman <dcb314@hotmail.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 79695dcd9ad4463a82def7f42960e6d7baa76f0b ]
Return NVME_SC_INVALID_FIELD from discovery controller like normal
controller when executing identify or get log page command.
Signed-off-by: Hou Pu <houpu.main@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a1c3be890440a1769ed6f822376a3e3ab0d42994 ]
Another issue found by KASAN. The bit finding is buried inside the
dp_for_each_set_bit() macro (that passes on to for_each_set_bit() that
calls the bit stuff. These bit functions want an unsigned long pointer
as input and just dumbly casting leads to out-of-bounds accesses.
This fixes that.
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: James Qian Wang <james.qian.wang@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210204131102.68658-1-carsten.haitzler@foss.arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 234e6d2c18f5b080cde874483c4c361f3ae7cffe ]
On Hisilicon Kunpeng920, ESP is set to 1 by default for all ports of
SATA controller. In some scenarios, some ports are not external SATA ports,
and it cause disks connected to these ports to be identified as removable
disks. So disable the SXS capability on the software side to prevent users
from mistakenly considering non-removable disks as removable disks and
performing related operations.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1615544676-61926-1-git-send-email-luojiaxing@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f0bdf98fab058efe7bf49732f70a0f26d1143154 ]
Remove the CQHCI_QUIRK_SHORT_TXFR_DESC_SZ quirk because the
latest chips have this fixed and earlier chips have other
CQE problems that prevent the feature from being enabled.
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210325192834.42955-1-alcooperx@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ee629112be8b4eff71d4d3d108a28bc7dc877e13 ]
Add PCI IDs for Intel LKF eMMC and SD card host controllers.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20210322055356.24923-1-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f410ee0aa2df050a9505f5c261953e9b18e21206 ]
When imx_data->pinctrl is not a valid pointer, pinctrl_lookup_state
will trigger kernel panic.
When we boot Dual OS on Jailhouse hypervisor, we let the 1st Linux to
configure pinmux ready for the 2nd OS, so the 2nd OS not have pinctrl
settings.
Similar to this commit b62eee9f804e ("mmc: sdhci-esdhc-imx: no fail when no pinctrl available").
Reviewed-by: Bough Chen <haobo.chen@nxp.com>
Reviewed-by: Alice Guo <alice.guo@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/1614222604-27066-6-git-send-email-peng.fan@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2ce35c0821afc2acd5ee1c3f60d149f8b2520ce8 ]
On bsg command completion, bsg_job_done() was called while qla driver
continued to access the bsg_job buffer. bsg_job_done() would free up
resources that ended up being reused by other task while the driver
continued to access the buffers. As a result, driver was reading garbage
data.
localhost kernel: BUG: KASAN: use-after-free in sg_next+0x64/0x80
localhost kernel: Read of size 8 at addr ffff8883228a3330 by task swapper/26/0
localhost kernel:
localhost kernel: CPU: 26 PID: 0 Comm: swapper/26 Kdump:
loaded Tainted: G OE --------- - - 4.18.0-193.el8.x86_64+debug #1
localhost kernel: Hardware name: HP ProLiant DL360
Gen9/ProLiant DL360 Gen9, BIOS P89 08/12/2016
localhost kernel: Call Trace:
localhost kernel: <IRQ>
localhost kernel: dump_stack+0x9a/0xf0
localhost kernel: print_address_description.cold.3+0x9/0x23b
localhost kernel: kasan_report.cold.4+0x65/0x95
localhost kernel: debug_dma_unmap_sg.part.12+0x10d/0x2d0
localhost kernel: qla2x00_bsg_sp_free+0xaf6/0x1010 [qla2xxx]
Link: https://lore.kernel.org/r/20210329085229.4367-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b4142fc4d52d051d4d8df1fb6c569e5b445d369e ]
vkms_vblank_simulate() uses WARN_ON for timing-dependent condition
(timer overrun). This is a mis-use of WARN_ON, WARN_ON must be used
to denote kernel bugs. Use pr_warn() instead.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+4fc21a003c8332eb0bdd@syzkaller.appspotmail.com
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Melissa Wen <melissa.srw@gmail.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210320132840.1315853-1-dvyukov@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a2b2cc660822cae08c351c7f6b452bfd1330a4f7 ]
This patch fixes the following Coverity warning:
CID 361199 (#1 of 1): Unchecked return value (CHECKED_RETURN)
3. check_return: Calling qla24xx_get_isp_stats without checking return
value (as is done elsewhere 4 out of 5 times).
Link: https://lore.kernel.org/r/20210320232359.941-7-bvanassche@acm.org
Cc: Quinn Tran <qutran@marvell.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Lee Duncan <lduncan@suse.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8ee0fea4baf90e43efe2275de208a7809f9985bc ]
Incorrect variable used, missing initialization during validation.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4710430a779e6077d81218ac768787545bff8c49 ]
[Why]
When unplugging a display, the underflow counter can be seen to
increase because PSTATE switch is allowed even when some planes are not
blanked.
[How]
Check that all planes are not active instead of all streams before
allowing PSTATE change.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6a30a92997eee49554f72b462dce90abe54a496f ]
[Why]
dc_cursor_position do not initialise position.translate_by_source when
crtc or plane->state->fb is NULL. UBSAN caught this error in
dce110_set_cursor_position, as the value was garbage.
[How]
Initialise dc_cursor_position structure elements to 0 in handle_cursor_update
before calling get_cursor_position.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1471
Reported-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0979d43259e13846d86ba17e451e17fec185d240 ]
Workload number mapped to the correct one.
This issue is only on vega10.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c8941550aa66b2a90f4b32c45d59e8571e33336e ]
This recent change introduce SDMA interrupt info printing with irq->process function.
These functions do not require a set function to enable/disable the irq
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 50e2fc36e72d4ad672032ebf646cecb48656efe0 ]
If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
doing a shift operation where the number of bits shifted equals
number of bits in the operand. This behaviour is undefined.
Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the
count is >= number of bits in the operand.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472
Reported-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4ac5617c4b7d0f0a8f879997f8ceaa14636d7554 ]
The psp supplies the link type in the upper 2 bits of the psp xgmi node
information num_hops field. With a new link type, Aldebaran has these
bits set to a non-zero value (1 = xGMI3) so the KFD topology will report
the incorrect IO link weights without proper masking.
The actual number of hops is located in the 3 least significant bits of
this field so mask if off accordingly before passing it to the KFD.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Amber Lin <amber.lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4d6e9cdff7fbb6bef3e5559596fab3eeffaf95ca ]
Currently, for WLED5, the FSC (Full scale current) setting is not
updated properly due to driver toggling the wrong register after
an FSC update.
On WLED5 we should only toggle the MOD_SYNC bit after a brightness
update. For an FSC update we need to toggle the SYNC bits instead.
Fix it by adopting the common wled3_sync_toggle() for WLED5 and
introducing new code to the brightness update path to compensate.
Signed-off-by: Kiran Gunda <kgunda@codeaurora.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cdfd4c689e2a52c313b35ddfc1852ff274f91acb ]
WLED3_SINK_REG_SYNC is, as the name implies, a sink register offset.
Therefore, use the sink address as base instead of the ctrl address.
This fixes the sync toggle on wled4, which can be observed by the fact
that adjusting brightness now works.
It has no effect on wled3 because sink and ctrl base addresses are the
same. This allows adjusting the brightness without having to disable
then reenable the module.
Signed-off-by: Obeida Shamoun <oshmoun100@googlemail.com>
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Kiran Gunda <kgunda@codeaurora.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2469b836fa835c67648acad17d62bc805236a6ea ]
Fixes coccicheck error:
drivers/power/supply/pm2301_charger.c:1089:7-27: ERROR:
drivers/power/supply/lp8788-charger.c:502:8-28: ERROR:
drivers/power/supply/tps65217_charger.c:239:8-33: ERROR:
drivers/power/supply/tps65090-charger.c:303:8-33: ERROR:
Threaded IRQ with no primary handler requested without IRQF_ONESHOT
Signed-off-by: dongjian <dongjian@yulong.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ea1611ba3a544b34f89ffa3d1e833caab30a3f09 ]
The V4L2_CID_STATELESS_FWHT_PARAMS compound control was missing a
proper initialization of the flags field, so after loading the vicodec
module for the first time, running v4l2-compliance for the stateless
decoder would fail on this control because the initial control value
was considered invalid by the vicodec driver.
Initializing the flags field to sane values fixes this.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eaaea4681984c79d2b2b160387b297477f0c1aab ]
act_len can be uninitialized if usb_bulk_msg() returns an error.
Set it to 0 to avoid a KMSAN error.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: syzbot+a4e309017a5f3a24c7b3@syzkaller.appspotmail.com
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c759b2970c561e3b56aa030deb13db104262adfe ]
Add a fix for the memory leak bugs that can occur when the
saa7164_encoder_register() function fails.
The function allocates memory without explicitly freeing
it when errors occur.
Add a better error handling that deallocate the unused buffers before the
function exits during a fail.
Signed-off-by: Daniel Niv <danielniv3@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e5b499f6fb17bc95a813e85d0796522280203806 ]
We must free/disable all interrupts and cancel all pending works
before doing further cleanup.
Before this commit arizona_extcon_remove() was doing several
register writes to shut things down before disabling the IRQs
and it was cancelling only 1 of the 3 different works used.
Move all the register-writes shutting things down to after
the disabling of the IRQs and add the 2 missing
cancel_delayed_work_sync() calls.
This fixes various possible races on driver unbind. One of which
would always trigger on devices using the mic-clamp feature for
jack detection. The ARIZONA_MICD_CLAMP_MODE_MASK update was
done before disabling the IRQs, causing:
1. arizona_jackdet() to run
2. detect a jack being inserted (clamp disabled means jack inserted)
3. call arizona_start_mic() which:
3.1 Enables the MICVDD regulator
3.2 takes a pm_runtime_reference
And this was all happening after the ARIZONA_MICD_ENA bit clearing,
which would undo 3.1 and 3.2 because the ARIZONA_MICD_CLAMP_MODE_MASK
update was being done after the ARIZONA_MICD_ENA bit clearing.
So this means that arizona_extcon_remove() would exit with
1. MICVDD enabled and 2. The pm_runtime_reference being unbalanced.
MICVDD still being enabled caused the following oops when the
regulator is released by the devm framework:
[ 2850.745757] ------------[ cut here ]------------
[ 2850.745827] WARNING: CPU: 2 PID: 2098 at drivers/regulator/core.c:2123 _regulator_put.part.0+0x19f/0x1b0
[ 2850.745835] Modules linked in: extcon_arizona ...
...
[ 2850.746909] Call Trace:
[ 2850.746932] regulator_put+0x2d/0x40
[ 2850.746946] release_nodes+0x22a/0x260
[ 2850.746984] __device_release_driver+0x190/0x240
[ 2850.747002] driver_detach+0xd4/0x120
...
[ 2850.747337] ---[ end trace f455dfd7abd9781f ]---
Note this oops is just one of various theoretically possible races caused
by the wrong ordering inside arizona_extcon_remove(), this fixes the
ordering fixing all possible races, including the reported oops.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
been unplugged
[ Upstream commit c309a3e8793f7e01c4a4ec7960658380572cb576 ]
When the jack is partially inserted and then removed again it may be
removed while the hpdet code is running. In this case the following
may happen:
1. The "JACKDET rise" or ""JACKDET fall" IRQ triggers
2. arizona_jackdet runs and takes info->lock
3. The "HPDET" IRQ triggers
4. arizona_hpdet_irq runs, blocks on info->lock
5. arizona_jackdet calls arizona_stop_mic() and clears info->hpdet_done
6. arizona_jackdet releases info->lock
7. arizona_hpdet_irq now can continue running and:
7.1 Calls arizona_start_mic() (if a mic was detected)
7.2 sets info->hpdet_done
Step 7 is undesirable / a bug:
7.1 causes the device to stay in a high power-state (with MICVDD enabled)
7.2 causes hpdet to not run on the next jack insertion, which in turn
causes the EXTCON_JACK_HEADPHONE state to never get set
This fixes both issues by skipping these 2 steps when arizona_hpdet_irq
runs after the jack has been unplugged.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c4d57c22ac65bd503716062a06fad55a01569cac ]
On all newer bq27xxx ICs, the AveragePower register contains a signed
value; in addition to handling the raw value as unsigned, the driver
code also didn't convert it to µW as expected.
At least for the BQ28Z610, the reference manual incorrectly states that
the value is in units of 1mW and not 10mW. I have no way of knowing
whether the manuals of other supported ICs contain the same error, or if
there are models that actually use 1mW. At least, the new code shouldn't
be *less* correct than the old version for any device.
power_avg is removed from the cache structure, se we don't have to
extend it to store both a signed value and an error code. Always getting
an up-to-date value may be desirable anyways, as it avoids inconsistent
current and power readings when switching between charging and
discharging.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1f6c45ac5fd70ab59136ab5babc7def269f3f509 ]
In practice, IA_CSS_PIPE_ID_NUM should never be used when
calling atomisp_q_video_buffers_to_css(), as the driver should
discover the right pipe before calling it.
Yet, if some pipe parsing issue happens, it could end using
it.
So, add a WARN_ON() to prevent such case.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cc271b6754691af74d710b761eaf027e3743e243 ]
The correct return code to report an invalid pipeline configuration is
-EPIPE. Return it instead of -EINVAL from __capture_legacy_try_fmt()
when the capture format doesn't match the media bus format of the
connected subdev.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5f864cfbf59bfed2057bd214ce7fbf6ad420d54b ]
The folowing AMD IOMMU are affected by the RiSC engine stall, requiring a
reset to maintain continual operation. After being added to the
broken_dev_id list the systems are functional long term.
0x1481 is the PCI ID for the IOMMU found on Starship/Matisse
0x1419 is the PCI ID for the IOMMU found on 15h (Models 10h-1fh) family
0x5a23 is the PCI ID for the IOMMU found on RD890S/RD990
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9c39be40c0155c43343f53e3a439290c0fec5542 ]
syzbot reported memory leak in zr364xx_probe()[1].
The problem was in invalid error handling order.
All error conditions rigth after v4l2_ctrl_handler_init()
must call v4l2_ctrl_handler_free().
Reported-by: syzbot+efe9aefc31ae1e6f7675@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 24df8b74c8b2fb42c49ffe8585562da0c96446ff ]
When STA2X11_VIP is enabled, and GPIOLIB is disabled,
Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for VIDEO_ADV7180
Depends on [n]: MEDIA_SUPPORT [=y] && GPIOLIB [=n] && VIDEO_V4L2 [=y] && I2C [=y]
Selected by [y]:
- STA2X11_VIP [=y] && MEDIA_SUPPORT [=y] && MEDIA_PCI_SUPPORT [=y] && MEDIA_CAMERA_SUPPORT [=y] && PCI [=y] && VIDEO_V4L2 [=y] && VIRT_TO_BUS [=y] && I2C [=y] && (STA2X11 [=n] || COMPILE_TEST [=y]) && MEDIA_SUBDRV_AUTOSELECT [=y]
This is because STA2X11_VIP selects VIDEO_ADV7180
without selecting or depending on GPIOLIB,
despite VIDEO_ADV7180 depending on GPIOLIB.
Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 28c7afb07ccfc0a939bb06ac1e7afe669901c65a ]
It's best if this condition is reported.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fd48c056a32ed6e7754c7c475490f3bed54ed378 ]
This fixes a compilation warning in pscsi_complete_cmd():
drivers/target/target_core_pscsi.c: In function ‘pscsi_complete_cmd’:
drivers/target/target_core_pscsi.c:624:5: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
; /* XXX: TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE */
Link: https://lore.kernel.org/r/20210228055645.22253-5-chaitanya.kulkarni@wdc.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 377f8331d0565e6f71ba081c894029a92d0c7e77 ]
virtio_gpu_object array is not freed or unlocked in some
failed cases.
Signed-off-by: xndcn <xndchn@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210305151819.14330-1-xndchn@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ee6ddf58475cce8a3d3697614679cd8cb4a6f583 ]
Running an rcuscale stress-suite can lead to "Out of memory" of a
system. This can happen under high memory pressure with a small amount
of physical memory.
For example, a KVM test configuration with 64 CPUs and 512 megabytes
can result in OOM when running rcuscale with below parameters:
../kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig CONFIG_NR_CPUS=64 \
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 \
rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make
<snip>
[ 12.054448] kworker/1:1H invoked oom-killer: gfp_mask=0x2cc0(GFP_KERNEL|__GFP_NOWARN), order=0, oom_score_adj=0
[ 12.055303] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[ 12.055416] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[ 12.056485] Workqueue: events_highpri fill_page_cache_func
[ 12.056485] Call Trace:
[ 12.056485] dump_stack+0x57/0x6a
[ 12.056485] dump_header+0x4c/0x30a
[ 12.056485] ? del_timer_sync+0x20/0x30
[ 12.056485] out_of_memory.cold.47+0xa/0x7e
[ 12.056485] __alloc_pages_slowpath.constprop.123+0x82f/0xc00
[ 12.056485] __alloc_pages_nodemask+0x289/0x2c0
[ 12.056485] __get_free_pages+0x8/0x30
[ 12.056485] fill_page_cache_func+0x39/0xb0
[ 12.056485] process_one_work+0x1ed/0x3b0
[ 12.056485] ? process_one_work+0x3b0/0x3b0
[ 12.060485] worker_thread+0x28/0x3c0
[ 12.060485] ? process_one_work+0x3b0/0x3b0
[ 12.060485] kthread+0x138/0x160
[ 12.060485] ? kthread_park+0x80/0x80
[ 12.060485] ret_from_fork+0x22/0x30
[ 12.062156] Mem-Info:
[ 12.062350] active_anon:0 inactive_anon:0 isolated_anon:0
[ 12.062350] active_file:0 inactive_file:0 isolated_file:0
[ 12.062350] unevictable:0 dirty:0 writeback:0
[ 12.062350] slab_reclaimable:2797 slab_unreclaimable:80920
[ 12.062350] mapped:1 shmem:2 pagetables:8 bounce:0
[ 12.062350] free:10488 free_pcp:1227 free_cma:0
...
[ 12.101610] Out of memory and no killable processes...
[ 12.102042] Kernel panic - not syncing: System is deadlocked on memory
[ 12.102583] CPU: 1 PID: 377 Comm: kworker/1:1H Not tainted 5.11.0-rc3+ #510
[ 12.102600] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
<snip>
Because kvfree_rcu() has a fallback path, memory allocation failure is
not the end of the world. Furthermore, the added overhead of aggressive
GFP settings must be balanced against the overhead of the fallback path,
which is a cache miss for double-argument kvfree_rcu() and a call to
synchronize_rcu() for single-argument kvfree_rcu(). The current choice
of GFP_KERNEL|__GFP_NOWARN can result in longer latencies than a call
to synchronize_rcu(), so less-tenacious GFP flags would be helpful.
Here is the tradeoff that must be balanced:
a) Minimize use of the fallback path,
b) Avoid pushing the system into OOM,
c) Bound allocation latency to that of synchronize_rcu(), and
d) Leave the emergency reserves to use cases lacking fallbacks.
This commit therefore changes GFP flags from GFP_KERNEL|__GFP_NOWARN to
GFP_KERNEL|__GFP_NORETRY|__GFP_NOMEMALLOC|__GFP_NOWARN. This combination
leaves the emergency reserves alone and can initiate reclaim, but will
not invoke the OOM killer.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
diameter > 2
[ Upstream commit 585b6d2723dc927ebc4ad884c4e879e4da8bc21f ]
As long as NUMA diameter > 2, building sched_domain by sibling's child
domain will definitely create a sched_domain with sched_group which will
span out of the sched_domain:
+------+ +------+ +-------+ +------+
| node | 12 |node | 20 | node | 12 |node |
| 0 +---------+1 +--------+ 2 +-------+3 |
+------+ +------+ +-------+ +------+
domain0 node0 node1 node2 node3
domain1 node0+1 node0+1 node2+3 node2+3
+
domain2 node0+1+2 |
group: node0+1 |
group:node2+3 <-------------------+
when node2 is added into the domain2 of node0, kernel is using the child
domain of node2's domain2, which is domain1(node2+3). Node 3 is outside
the span of the domain including node0+1+2.
This will make load_balance() run based on screwed avg_load and group_type
in the sched_group spanning out of the sched_domain, and it also makes
select_task_rq_fair() pick an idle CPU outside the sched_domain.
Real servers which suffer from this problem include Kunpeng920 and 8-node
Sun Fire X4600-M2, at least.
Here we move to use the *child* domain of the *child* domain of node2's
domain2 as the new added sched_group. At the same, we re-use the lower
level sgc directly.
+------+ +------+ +-------+ +------+
| node | 12 |node | 20 | node | 12 |node |
| 0 +---------+1 +--------+ 2 +-------+3 |
+------+ +------+ +-------+ +------+
domain0 node0 node1 +- node2 node3
|
domain1 node0+1 node0+1 | node2+3 node2+3
|
domain2 node0+1+2 |
group: node0+1 |
group:node2 <-------------------+
While the lower level sgc is re-used, this patch only changes the remote
sched_groups for those sched_domains playing grandchild trick, therefore,
sgc->next_update is still safe since it's only touched by CPUs that have
the group span as local group. And sgc->imbalance is also safe because
sd_parent remains the same in load_balance and LB only tries other CPUs
from the local group.
Moreover, since local groups are not touched, they are still getting
roughly equal size in a TL. And should_we_balance() only matters with
local groups, so the pull probability of those groups are still roughly
equal.
Tested by the below topology:
qemu-system-aarch64 -M virt -nographic \
-smp cpus=8 \
-numa node,cpus=0-1,nodeid=0 \
-numa node,cpus=2-3,nodeid=1 \
-numa node,cpus=4-5,nodeid=2 \
-numa node,cpus=6-7,nodeid=3 \
-numa dist,src=0,dst=1,val=12 \
-numa dist,src=0,dst=2,val=20 \
-numa dist,src=0,dst=3,val=22 \
-numa dist,src=1,dst=2,val=22 \
-numa dist,src=2,dst=3,val=12 \
-numa dist,src=1,dst=3,val=24 \
-m 4G -cpu cortex-a57 -kernel arch/arm64/boot/Image
w/o patch, we get lots of "groups don't span domain->span":
[ 0.802139] CPU0 attaching sched-domain(s):
[ 0.802193] domain-0: span=0-1 level=MC
[ 0.802443] groups: 0:{ span=0 cap=1013 }, 1:{ span=1 cap=979 }
[ 0.802693] domain-1: span=0-3 level=NUMA
[ 0.802731] groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 }
[ 0.802811] domain-2: span=0-5 level=NUMA
[ 0.802829] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 }
[ 0.802881] ERROR: groups don't span domain->span
[ 0.803058] domain-3: span=0-7 level=NUMA
[ 0.803080] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 }
[ 0.804055] CPU1 attaching sched-domain(s):
[ 0.804072] domain-0: span=0-1 level=MC
[ 0.804096] groups: 1:{ span=1 cap=979 }, 0:{ span=0 cap=1013 }
[ 0.804152] domain-1: span=0-3 level=NUMA
[ 0.804170] groups: 0:{ span=0-1 cap=1992 }, 2:{ span=2-3 cap=1943 }
[ 0.804219] domain-2: span=0-5 level=NUMA
[ 0.804236] groups: 0:{ span=0-3 cap=3935 }, 4:{ span=4-7 cap=3937 }
[ 0.804302] ERROR: groups don't span domain->span
[ 0.804520] domain-3: span=0-7 level=NUMA
[ 0.804546] groups: 0:{ span=0-5 mask=0-1 cap=5843 }, 6:{ span=4-7 mask=6-7 cap=4077 }
[ 0.804677] CPU2 attaching sched-domain(s):
[ 0.804687] domain-0: span=2-3 level=MC
[ 0.804705] groups: 2:{ span=2 cap=934 }, 3:{ span=3 cap=1009 }
[ 0.804754] domain-1: span=0-3 level=NUMA
[ 0.804772] groups: 2:{ span=2-3 cap=1943 }, 0:{ span=0-1 cap=1992 }
[ 0.804820] domain-2: span=0-5 level=NUMA
[ 0.804836] groups: 2:{ span=0-3 mask=2-3 cap=3991 }, 4:{ span=0-1,4-7 mask=4-5 cap=5985 }
[ 0.804944] ERROR: groups don't span domain->span
[ 0.805108] domain-3: span=0-7 level=NUMA
[ 0.805134] groups: 2:{ span=0-5 mask=2-3 cap=5899 }, 6:{ span=0-1,4-7 mask=6-7 cap=6125 }
[ 0.805223] CPU3 attaching sched-domain(s):
[ 0.805232] domain-0: span=2-3 level=MC
[ 0.805249] groups: 3:{ span=3 cap=1009 }, 2:{ span=2 cap=934 }
[ 0.805319] domain-1: span=0-3 level=NUMA
[ 0.805336] groups: 2:{ span=2-3 cap=1943 }, 0:{ span=0-1 cap=1992 }
[ 0.805383] domain-2: span=0-5 level=NUMA
[ 0.805399] groups: 2:{ span=0-3 mask=2-3 cap=3991 }, 4:{ span=0-1,4-7 mask=4-5 cap=5985 }
[ 0.805458] ERROR: groups don't span domain->span
[ 0.805605] domain-3: span=0-7 level=NUMA
[ 0.805626] groups: 2:{ span=0-5 mask=2-3 cap=5899 }, 6:{ span=0-1,4-7 mask=6-7 cap=6125 }
[ 0.805712] CPU4 attaching sched-domain(s):
[ 0.805721] domain-0: span=4-5 level=MC
[ 0.805738] groups: 4:{ span=4 cap=984 }, 5:{ span=5 cap=924 }
[ 0.805787] domain-1: span=4-7 level=NUMA
[ 0.805803] groups: 4:{ span=4-5 cap=1908 }, 6:{ span=6-7 cap=2029 }
[ 0.805851] domain-2: span=0-1,4-7 level=NUMA
[ 0.805867] groups: 4:{ span=4-7 cap=3937 }, 0:{ span=0-3 cap=3935 }
[ 0.805915] ERROR: groups don't span domain->span
[ 0.806108] domain-3: span=0-7 level=NUMA
[ 0.806130] groups: 4:{ span=0-1,4-7 mask=4-5 cap=5985 }, 2:{ span=0-3 mask=2-3 cap=3991 }
[ 0.806214] CPU5 attaching sched-domain(s):
[ 0.806222] domain-0: span=4-5 level=MC
[ 0.806240] groups: 5:{ span=5 cap=924 }, 4:{ span=4 cap=984 }
[ 0.806841] domain-1: span=4-7 level=NUMA
[ 0.806866] groups: 4:{ span=4-5 cap=1908 }, 6:{ span=6-7 cap=2029 }
[ 0.806934] domain-2: span=0-1,4-7 level=NUMA
[ 0.806953] groups: 4:{ span=4-7 cap=3937 }, 0:{ span=0-3 cap=3935 }
[ 0.807004] ERROR: groups don't span domain->span
[ 0.807312] domain-3: span=0-7 level=NUMA
[ 0.807386] groups: 4:{ span=0-1,4-7 mask=4-5 cap=5985 }, 2:{ span=0-3 mask=2-3 cap=3991 }
[ 0.807686] CPU6 attaching sched-domain(s):
[ 0.807710] domain-0: span=6-7 level=MC
[ 0.807750] groups: 6:{ span=6 cap=1017 }, 7:{ span=7 cap=1012 }
[ 0.807840] domain-1: span=4-7 level=NUMA
[ 0.807870] groups: 6:{ span=6-7 cap=2029 }, 4:{ span=4-5 cap=1908 }
[ 0.807952] domain-2: span=0-1,4-7 level=NUMA
[ 0.807985] groups: 6:{ span=4-7 mask=6-7 cap=4077 }, 0:{ span=0-5 mask=0-1 cap=5843 }
[ 0.808045] ERROR: groups don't span domain->span
[ 0.808257] domain-3: span=0-7 level=NUMA
[ 0.808571] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6125 }, 2:{ span=0-5 mask=2-3 cap=5899 }
[ 0.808848] CPU7 attaching sched-domain(s):
[ 0.808860] domain-0: span=6-7 level=MC
[ 0.808880] groups: 7:{ span=7 cap=1012 }, 6:{ span=6 cap=1017 }
[ 0.808953] domain-1: span=4-7 level=NUMA
[ 0.808974] groups: 6:{ span=6-7 cap=2029 }, 4:{ span=4-5 cap=1908 }
[ 0.809034] domain-2: span=0-1,4-7 level=NUMA
[ 0.809055] groups: 6:{ span=4-7 mask=6-7 cap=4077 }, 0:{ span=0-5 mask=0-1 cap=5843 }
[ 0.809128] ERROR: groups don't span domain->span
[ 0.810361] domain-3: span=0-7 level=NUMA
[ 0.810400] groups: 6:{ span=0-1,4-7 mask=6-7 cap=5961 }, 2:{ span=0-5 mask=2-3 cap=5903 }
w/ patch, we don't get "groups don't span domain->span" any more:
[ 1.486271] CPU0 attaching sched-domain(s):
[ 1.486820] domain-0: span=0-1 level=MC
[ 1.500924] groups: 0:{ span=0 cap=980 }, 1:{ span=1 cap=994 }
[ 1.515717] domain-1: span=0-3 level=NUMA
[ 1.515903] groups: 0:{ span=0-1 cap=1974 }, 2:{ span=2-3 cap=1989 }
[ 1.516989] domain-2: span=0-5 level=NUMA
[ 1.517124] groups: 0:{ span=0-3 cap=3963 }, 4:{ span=4-5 cap=1949 }
[ 1.517369] domain-3: span=0-7 level=NUMA
[ 1.517423] groups: 0:{ span=0-5 mask=0-1 cap=5912 }, 6:{ span=4-7 mask=6-7 cap=4054 }
[ 1.520027] CPU1 attaching sched-domain(s):
[ 1.520097] domain-0: span=0-1 level=MC
[ 1.520184] groups: 1:{ span=1 cap=994 }, 0:{ span=0 cap=980 }
[ 1.520429] domain-1: span=0-3 level=NUMA
[ 1.520487] groups: 0:{ span=0-1 cap=1974 }, 2:{ span=2-3 cap=1989 }
[ 1.520687] domain-2: span=0-5 level=NUMA
[ 1.520744] groups: 0:{ span=0-3 cap=3963 }, 4:{ span=4-5 cap=1949 }
[ 1.520948] domain-3: span=0-7 level=NUMA
[ 1.521038] groups: 0:{ span=0-5 mask=0-1 cap=5912 }, 6:{ span=4-7 mask=6-7 cap=4054 }
[ 1.522068] CPU2 attaching sched-domain(s):
[ 1.522348] domain-0: span=2-3 level=MC
[ 1.522606] groups: 2:{ span=2 cap=1003 }, 3:{ span=3 cap=986 }
[ 1.522832] domain-1: span=0-3 level=NUMA
[ 1.522885] groups: 2:{ span=2-3 cap=1989 }, 0:{ span=0-1 cap=1974 }
[ 1.523043] domain-2: span=0-5 level=NUMA
[ 1.523092] groups: 2:{ span=0-3 mask=2-3 cap=4037 }, 4:{ span=4-5 cap=1949 }
[ 1.523302] domain-3: span=0-7 level=NUMA
[ 1.523352] groups: 2:{ span=0-5 mask=2-3 cap=5986 }, 6:{ span=0-1,4-7 mask=6-7 cap=6102 }
[ 1.523748] CPU3 attaching sched-domain(s):
[ 1.523774] domain-0: span=2-3 level=MC
[ 1.523825] groups: 3:{ span=3 cap=986 }, 2:{ span=2 cap=1003 }
[ 1.524009] domain-1: span=0-3 level=NUMA
[ 1.524086] groups: 2:{ span=2-3 cap=1989 }, 0:{ span=0-1 cap=1974 }
[ 1.524281] domain-2: span=0-5 level=NUMA
[ 1.524331] groups: 2:{ span=0-3 mask=2-3 cap=4037 }, 4:{ span=4-5 cap=1949 }
[ 1.524534] domain-3: span=0-7 level=NUMA
[ 1.524586] groups: 2:{ span=0-5 mask=2-3 cap=5986 }, 6:{ span=0-1,4-7 mask=6-7 cap=6102 }
[ 1.524847] CPU4 attaching sched-domain(s):
[ 1.524873] domain-0: span=4-5 level=MC
[ 1.524954] groups: 4:{ span=4 cap=958 }, 5:{ span=5 cap=991 }
[ 1.525105] domain-1: span=4-7 level=NUMA
[ 1.525153] groups: 4:{ span=4-5 cap=1949 }, 6:{ span=6-7 cap=2006 }
[ 1.525368] domain-2: span=0-1,4-7 level=NUMA
[ 1.525428] groups: 4:{ span=4-7 cap=3955 }, 0:{ span=0-1 cap=1974 }
[ 1.532726] domain-3: span=0-7 level=NUMA
[ 1.532811] groups: 4:{ span=0-1,4-7 mask=4-5 cap=6003 }, 2:{ span=0-3 mask=2-3 cap=4037 }
[ 1.534125] CPU5 attaching sched-domain(s):
[ 1.534159] domain-0: span=4-5 level=MC
[ 1.534303] groups: 5:{ span=5 cap=991 }, 4:{ span=4 cap=958 }
[ 1.534490] domain-1: span=4-7 level=NUMA
[ 1.534572] groups: 4:{ span=4-5 cap=1949 }, 6:{ span=6-7 cap=2006 }
[ 1.534734] domain-2: span=0-1,4-7 level=NUMA
[ 1.534783] groups: 4:{ span=4-7 cap=3955 }, 0:{ span=0-1 cap=1974 }
[ 1.536057] domain-3: span=0-7 level=NUMA
[ 1.536430] groups: 4:{ span=0-1,4-7 mask=4-5 cap=6003 }, 2:{ span=0-3 mask=2-3 cap=3896 }
[ 1.536815] CPU6 attaching sched-domain(s):
[ 1.536846] domain-0: span=6-7 level=MC
[ 1.536934] groups: 6:{ span=6 cap=1005 }, 7:{ span=7 cap=1001 }
[ 1.537144] domain-1: span=4-7 level=NUMA
[ 1.537262] groups: 6:{ span=6-7 cap=2006 }, 4:{ span=4-5 cap=1949 }
[ 1.537553] domain-2: span=0-1,4-7 level=NUMA
[ 1.537613] groups: 6:{ span=4-7 mask=6-7 cap=4054 }, 0:{ span=0-1 cap=1805 }
[ 1.537872] domain-3: span=0-7 level=NUMA
[ 1.537998] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6102 }, 2:{ span=0-5 mask=2-3 cap=5845 }
[ 1.538448] CPU7 attaching sched-domain(s):
[ 1.538505] domain-0: span=6-7 level=MC
[ 1.538586] groups: 7:{ span=7 cap=1001 }, 6:{ span=6 cap=1005 }
[ 1.538746] domain-1: span=4-7 level=NUMA
[ 1.538798] groups: 6:{ span=6-7 cap=2006 }, 4:{ span=4-5 cap=1949 }
[ 1.539048] domain-2: span=0-1,4-7 level=NUMA
[ 1.539111] groups: 6:{ span=4-7 mask=6-7 cap=4054 }, 0:{ span=0-1 cap=1805 }
[ 1.539571] domain-3: span=0-7 level=NUMA
[ 1.539610] groups: 6:{ span=0-1,4-7 mask=6-7 cap=6102 }, 2:{ span=0-5 mask=2-3 cap=5845 }
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Meelis Roos <mroos@linux.ee>
Link: https://lkml.kernel.org/r/20210224030944.15232-1-song.bao.hua@hisilicon.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b89997aa88f0b07d8a6414c908af75062103b8c9 ]
Being called for each dequeue, util_est reduces the number of its updates
by filtering out when the EWMA signal is different from the task util_avg
by less than 1%. It is a problem for a sudden util_avg ramp-up. Due to the
decay from a previous high util_avg, EWMA might now be close enough to
the new util_avg. No update would then happen while it would leave
ue.enqueued with an out-of-date value.
Taking into consideration the two util_est members, EWMA and enqueued for
the filtering, ensures, for both, an up-to-date value.
This is for now an issue only for the trace probe that might return the
stale value. Functional-wise, it isn't a problem, as the value is always
accessed through max(enqueued, ewma).
This problem has been observed using LISA's UtilConvergence:test_means on
the sd845c board.
No regression observed with Hackbench on sd845c and Perf-bench sched pipe
on hikey/hikey960.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210225165820.1377125-1-vincent.donnefort@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bb0cd09be45ea457f25fdcbcb3d6cf2230f26c46 ]
When unloading driver after killing some applications, it will hit sdma
flush tlb job timeout which is called by ttm_bo_delay_delete. So
to avoid the job submit after fence driver fini, call ttm_bo_lock_delayed_workqueue
before fence driver fini. And also put drm_sched_fini before waiting fence.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 309b477462df7542355ac984674a6e89c01c89aa ]
While testing target port swap test with ADISC enabled, several nodes
remain in UNUSED state. These nodes are never freed and rmmod hangs for
long time before finising with "0233 Nodelist not empty" error.
During PLOGI completion lpfc_plogi_confirm_nport() looks for existing nodes
with same WWPN. If found, the existing node is used to continue discovery.
The node on which plogi was performed is freed. When ADISC is enabled, an
ADISC els request is triggered in response to an RSCN. It's possible that
the ADISC may be rejected by the remote port causing the ADISC completion
handler to clear the port and node name in the node. If this occurs, if a
PLOGI is received it causes a node lookup based on wwpn to now fail,
causing the port swap logic to kick in which allocates a new node and swaps
to it. This effectively orphans the original node structure.
Fix the situation by detecting when the lookup fails and forgo the node
swap and node allocation by using the node on which the PLOGI was issued.
Link: https://lore.kernel.org/r/20210301171821.3427-15-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 143753059b8b957f1cf4355338a3e3a32f3a85bf ]
The driver is seeing a scenario where PLOGI response was issued and traffic
is arriving while the adapter is still setting up the login context. This
is resulting in errors handling the traffic.
Change the driver so that PLOGI response is sent after the login context
has been setup to avoid the situation.
Link: https://lore.kernel.org/r/20210301171821.3427-14-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 148bc64d38fe314475a074c4f757ec9d84537d1c ]
An unlikely error exit path from lpfc_els_retry() returns incorrect status
to a caller, erroneously indicating that a retry has been successfully
issued or scheduled.
Change error exit path to indicate no retry.
Link: https://lore.kernel.org/r/20210301171821.3427-12-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|