summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-03net/smc: reset sndbuf_desc if freedUrsula Braun
When an SMC connection is created, and there is a problem to create an RMB or DMB, the previously created send buffer is thrown away as well including buffer descriptor freeing. Make sure the connection no longer references the freed buffer descriptor, otherwise bugs like this are possible: [71556.835148] ============================================================================= [71556.835168] BUG kmalloc-128 (Tainted: G B OE ): Poison overwritten [71556.835172] ----------------------------------------------------------------------------- [71556.835179] INFO: 0x00000000d20894be-0x00000000aaef63e9 @offset=2724. First byte 0x0 instead of 0x6b [71556.835215] INFO: Allocated in __smc_buf_create+0x184/0x578 [smc] age=0 cpu=5 pid=46726 [71556.835234] ___slab_alloc+0x5a4/0x690 [71556.835239] __slab_alloc.constprop.0+0x70/0xb0 [71556.835243] kmem_cache_alloc_trace+0x38e/0x3f8 [71556.835250] __smc_buf_create+0x184/0x578 [smc] [71556.835257] smc_buf_create+0x2e/0xe8 [smc] [71556.835264] smc_listen_work+0x516/0x6a0 [smc] [71556.835275] process_one_work+0x280/0x478 [71556.835280] worker_thread+0x66/0x368 [71556.835287] kthread+0x17a/0x1a0 [71556.835294] ret_from_fork+0x28/0x2c [71556.835301] INFO: Freed in smc_buf_create+0xd8/0xe8 [smc] age=0 cpu=5 pid=46726 [71556.835307] __slab_free+0x246/0x560 [71556.835311] kfree+0x398/0x3f8 [71556.835318] smc_buf_create+0xd8/0xe8 [smc] [71556.835324] smc_listen_work+0x516/0x6a0 [smc] [71556.835328] process_one_work+0x280/0x478 [71556.835332] worker_thread+0x66/0x368 [71556.835337] kthread+0x17a/0x1a0 [71556.835344] ret_from_fork+0x28/0x2c [71556.835348] INFO: Slab 0x00000000a0744551 objects=51 used=51 fp=0x0000000000000000 flags=0x1ffff00000010200 [71556.835352] INFO: Object 0x00000000563480a1 @offset=2688 fp=0x00000000289567b2 [71556.835359] Redzone 000000006783cde2: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835363] Redzone 00000000e35b876e: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835367] Redzone 0000000023074562: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835372] Redzone 00000000b9564b8c: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835376] Redzone 00000000810c6362: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835380] Redzone 0000000065ef52c3: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835384] Redzone 00000000c5dd6984: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835388] Redzone 000000004c480f8f: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ [71556.835392] Object 00000000563480a1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835397] Object 000000009c479d06: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835401] Object 000000006e1dce92: 6b 6b 6b 6b 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b kkkk....kkkkkkkk [71556.835405] Object 00000000227f7cf8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835410] Object 000000009a701215: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835414] Object 000000003731ce76: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835418] Object 00000000f7085967: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk [71556.835422] Object 0000000007f99927: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. [71556.835427] Redzone 00000000579c4913: bb bb bb bb bb bb bb bb ........ [71556.835431] Padding 00000000305aef82: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ [71556.835435] Padding 00000000b1cdd722: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ [71556.835438] Padding 00000000c7568199: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ [71556.835442] Padding 00000000fad4c4d4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ [71556.835451] CPU: 0 PID: 47939 Comm: kworker/0:15 Tainted: G B OE 5.9.0-rc1uschi+ #54 [71556.835456] Hardware name: IBM 3906 M03 703 (LPAR) [71556.835464] Workqueue: events smc_listen_work [smc] [71556.835470] Call Trace: [71556.835478] [<00000000d5eaeb10>] show_stack+0x90/0xf8 [71556.835493] [<00000000d66fc0f8>] dump_stack+0xa8/0xe8 [71556.835499] [<00000000d61a511c>] check_bytes_and_report+0x104/0x130 [71556.835504] [<00000000d61a57b2>] check_object+0x26a/0x2e0 [71556.835509] [<00000000d61a59bc>] alloc_debug_processing+0x194/0x238 [71556.835514] [<00000000d61a8c14>] ___slab_alloc+0x5a4/0x690 [71556.835519] [<00000000d61a9170>] __slab_alloc.constprop.0+0x70/0xb0 [71556.835524] [<00000000d61aaf66>] kmem_cache_alloc_trace+0x38e/0x3f8 [71556.835530] [<000003ff80549bbc>] __smc_buf_create+0x184/0x578 [smc] [71556.835538] [<000003ff8054a396>] smc_buf_create+0x2e/0xe8 [smc] [71556.835545] [<000003ff80540c16>] smc_listen_work+0x516/0x6a0 [smc] [71556.835549] [<00000000d5f0f448>] process_one_work+0x280/0x478 [71556.835554] [<00000000d5f0f6a6>] worker_thread+0x66/0x368 [71556.835559] [<00000000d5f18692>] kthread+0x17a/0x1a0 [71556.835563] [<00000000d6abf3b8>] ret_from_fork+0x28/0x2c [71556.835569] INFO: lockdep is turned off. [71556.835573] FIX kmalloc-128: Restoring 0x00000000d20894be-0x00000000aaef63e9=0x6b [71556.835577] FIX kmalloc-128: Marking all objects used Fixes: fd7f3a746582 ("net/smc: remove freed buffer from list") Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03net/smc: set rx_off for SMCR explicitlyUrsula Braun
SMC tries to make use of SMCD first. If a problem shows up, it tries to switch to SMCR. If the SMCD initializing problem shows up after the SMCD connection has already been initialized, field rx_off keeps the wrong SMCD value for SMCR, which results in corrupted data at the receiver. This patch adds an explicit (re-)setting of field rx_off to zero if the connection uses SMCR. Fixes: be244f28d22f ("net/smc: add SMC-D support in data transfer") Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03net/smc: fix toleration of fake add_link messagesKarsten Graul
Older SMCR implementations had no link failover support and used one link only. Because the handshake protocol requires to try the establishment of a second link the old code sent a fake add_link message and declined any server response afterwards. The current code supports multiple links and inspects the received fake add_link message more closely. To tolerate the fake add_link messages smc_llc_is_local_add_link() needs an improved check of the message to be able to separate between locally enqueued and fake add_link messages. And smc_llc_cli_add_link() needs to check if the provided qp_mtu size is invalid and reject the add_link request in that case. Fixes: c48254fa48e5 ("net/smc: move add link processing for new device into llc layer") Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03tg3: Fix soft lockup when tg3_reset_task() fails.Michael Chan
If tg3_reset_task() fails, the device state is left in an inconsistent state with IFF_RUNNING still set but NAPI state not enabled. A subsequent operation, such as ifdown or AER error can cause it to soft lock up when it tries to disable NAPI state. Fix it by bringing down the device to !IFF_RUNNING state when tg3_reset_task() fails. tg3_reset_task() running from workqueue will now call tg3_close() when the reset fails. We need to modify tg3_reset_task_cancel() slightly to avoid tg3_close() calling cancel_work_sync() to cancel tg3_reset_task(). Otherwise cancel_work_sync() will wait forever for tg3_reset_task() to finish. Reported-by: David Christensen <drc@linux.vnet.ibm.com> Reported-by: Baptiste Covolato <baptiste@arista.com> Fixes: db2199737990 ("tg3: Schedule at most one tg3_reset_task run") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03perf tools: Add bpf image check to __map__is_kmoduleJiri Olsa
When validating kcore modules the do_validate_kcore_modules function checks on every kernel module dso against modules record. The __map__is_kmodule check is used to get only kernel module dso objects through. Currently the bpf images are slipping through the check and making the validation to fail, so report falls back from kcore usage to kallsyms. Adding __map__is_bpf_image check for bpf image and adding it to __map__is_kmodule check. Fixes: 3c29d4483e85 ("perf annotate: Add basic support for bpf_image") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200826213017.818788-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf record/stat: Explicitly call out event modifiers in the documentationKim Phillips
Event modifiers are not mentioned in the perf record or perf stat manpages. Add them to orient new users more effectively by pointing them to the perf list manpage for details. Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tony Jones <tonyj@suse.de> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf bench: The do_run_multi_threaded() function must use ↵YueHaibing
IS_ERR(perf_session__new()) In case of error, the function perf_session__new() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR() Committer notes: This wasn't compiling due to an extraneous '{' not matched by a '}', fix it. Fixes: 13edc237200c ("perf bench: Add a multi-threaded synthesize benchmark") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200902140526.26916-1-yuehaibing@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf stat: Turn off summary for interval mode by defaultJin Yao
There's a risk that outputting interval mode summaries by default breaks CSV consumers. It already broke pmu-tools/toplev. So now we turn off the summary by default but we create a new option '--summary' to enable the summary. This is active even when not using CSV mode. Before: root@kbl-ppc:~# perf stat -I1000 --interval-count 2 # time counts unit events 1.000265904 8,005.73 msec cpu-clock # 8.006 CPUs utilized 1.000265904 601 context-switches # 0.075 K/sec 1.000265904 10 cpu-migrations # 0.001 K/sec 1.000265904 0 page-faults # 0.000 K/sec 1.000265904 66,746,521 cycles # 0.008 GHz 1.000265904 71,874,398 instructions # 1.08 insn per cycle 1.000265904 13,356,781 branches # 1.668 M/sec 1.000265904 298,756 branch-misses # 2.24% of all branches 2.001857667 8,012.52 msec cpu-clock # 8.013 CPUs utilized 2.001857667 164 context-switches # 0.020 K/sec 2.001857667 10 cpu-migrations # 0.001 K/sec 2.001857667 2 page-faults # 0.000 K/sec 2.001857667 5,822,188 cycles # 0.001 GHz 2.001857667 2,186,170 instructions # 0.38 insn per cycle 2.001857667 442,378 branches # 0.055 M/sec 2.001857667 44,750 branch-misses # 10.12% of all branches Performance counter stats for 'system wide': 16,018.25 msec cpu-clock # 7.993 CPUs utilized 765 context-switches # 0.048 K/sec 20 cpu-migrations # 0.001 K/sec 2 page-faults # 0.000 K/sec 72,568,709 cycles # 0.005 GHz 74,060,568 instructions # 1.02 insn per cycle 13,799,159 branches # 0.861 M/sec 343,506 branch-misses # 2.49% of all branches 2.004118489 seconds time elapsed After: root@kbl-ppc:~# perf stat -I1000 --interval-count 2 # time counts unit events 1.001336393 8,013.28 msec cpu-clock # 8.013 CPUs utilized 1.001336393 82 context-switches # 0.010 K/sec 1.001336393 8 cpu-migrations # 0.001 K/sec 1.001336393 0 page-faults # 0.000 K/sec 1.001336393 4,199,121 cycles # 0.001 GHz 1.001336393 1,373,991 instructions # 0.33 insn per cycle 1.001336393 270,681 branches # 0.034 M/sec 1.001336393 31,659 branch-misses # 11.70% of all branches 2.003905006 8,020.52 msec cpu-clock # 8.021 CPUs utilized 2.003905006 184 context-switches # 0.023 K/sec 2.003905006 8 cpu-migrations # 0.001 K/sec 2.003905006 2 page-faults # 0.000 K/sec 2.003905006 5,446,190 cycles # 0.001 GHz 2.003905006 2,312,547 instructions # 0.42 insn per cycle 2.003905006 451,691 branches # 0.056 M/sec 2.003905006 37,925 branch-misses # 8.40% of all branches root@kbl-ppc:~# perf stat -I1000 --interval-count 2 --summary # time counts unit events 1.001313128 8,013.20 msec cpu-clock # 8.013 CPUs utilized 1.001313128 83 context-switches # 0.010 K/sec 1.001313128 8 cpu-migrations # 0.001 K/sec 1.001313128 0 page-faults # 0.000 K/sec 1.001313128 4,470,950 cycles # 0.001 GHz 1.001313128 1,440,045 instructions # 0.32 insn per cycle 1.001313128 283,222 branches # 0.035 M/sec 1.001313128 33,576 branch-misses # 11.86% of all branches 2.003857385 8,020.34 msec cpu-clock # 8.020 CPUs utilized 2.003857385 154 context-switches # 0.019 K/sec 2.003857385 8 cpu-migrations # 0.001 K/sec 2.003857385 2 page-faults # 0.000 K/sec 2.003857385 4,515,676 cycles # 0.001 GHz 2.003857385 2,180,449 instructions # 0.48 insn per cycle 2.003857385 435,254 branches # 0.054 M/sec 2.003857385 31,179 branch-misses # 7.16% of all branches Performance counter stats for 'system wide': 16,033.53 msec cpu-clock # 7.992 CPUs utilized 237 context-switches # 0.015 K/sec 16 cpu-migrations # 0.001 K/sec 2 page-faults # 0.000 K/sec 8,986,626 cycles # 0.001 GHz 3,620,494 instructions # 0.40 insn per cycle 718,476 branches # 0.045 M/sec 64,755 branch-misses # 9.01% of all branches 2.006124542 seconds time elapsed Fixes: c7e5b328a8d4 ("perf stat: Report summary for interval mode") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200903010113.32232-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03drm/amdgpu/mmhub2.0: print client id string for mmhubAlex Deucher
Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/gmc9: print client id string for mmhubAlex Deucher
Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/gmc10: print client id string for gfxhubAlex Deucher
Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/gmc9: print client id string for gfxhubAlex Deucher
Print the name of the client rather than the number. This makes it easier to debug what block is causing the fault. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/gfx10: Delete some duplicated argument to '|'Ye Bin
1. gfx_v10_0_soft_reset GRBM_STATUS__SPI_BUSY_MASK 2. gfx_v10_0_update_gfx_clock_gating AMD_CG_SUPPORT_GFX_CGLS Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: add ta firmware load in psp_v12_0 for renoirChangfeng
It needs to load renoir_ta firmware because hdcp is enabled by default for renoir now. This can avoid error:DTM TA is not initialized Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: fix max_entries calculation v4Christian König
Calculate the correct value for max_entries or we might run after the page_address array. v2: Xinhui pointed out we don't need the shift v3: use local copy of start and simplify some calculation v4: fix the case that we map less VA range than BO size Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes Reviewed-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: Remove superfluous NULL checkLuben Tuikov
The DRM device is a static member of the amdgpu device structure and as such always exists, so long as the PCI and thus the amdgpu device exist. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: enable ih1 ih2 for Arcturus onlyAlex Sierra
Enable multi-ring ih1 and ih2 for Arcturus only. For Navi10 family multi-ring has been disabled. Apparently, having multi-ring enabled in Navi was causing continus page fault interrupts. Further investigation is needed to get to the root cause. Related issue link: https://gitlab.freedesktop.org/drm/amd/-/issues/1279 Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amd/display: Fix a list corruptionxinhui pan
Remove the private obj from the internal list before we free aconnector. [ 56.925828] BUG: unable to handle page fault for address: ffff8f84a870a560 [ 56.933272] #PF: supervisor read access in kernel mode [ 56.938801] #PF: error_code(0x0000) - not-present page [ 56.944376] PGD 18e605067 P4D 18e605067 PUD 86a614067 PMD 86a4d0067 PTE 800ffff8578f5060 [ 56.953260] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 56.958815] CPU: 6 PID: 1407 Comm: bash Tainted: G O 5.9.0-rc2+ #46 [ 56.967092] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019 [ 56.977162] RIP: 0010:__list_del_entry_valid+0x31/0xa0 [ 56.982768] Code: 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2d 49 8b 30 48 39 fe 75 3d <48> 8b 52 08 48 39 f2 75 4c b8 01 00 00 00 5d c3 48 89 7 [ 57.003327] RSP: 0018:ffffb40c81687c90 EFLAGS: 00010246 [ 57.009048] RAX: dead000000000122 RBX: ffff8f84ea41f4f0 RCX: 0000000000000006 [ 57.016871] RDX: ffff8f84a870a558 RSI: ffff8f84ea41f4f0 RDI: ffff8f84ea41f4f0 [ 57.024672] RBP: ffffb40c81687c90 R08: ffff8f84ea400998 R09: 0000000000000001 [ 57.032490] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000006 [ 57.040287] R13: ffff8f84ea422a90 R14: ffff8f84b4129a20 R15: fffffffffffffff2 [ 57.048105] FS: 00007f550d885740(0000) GS:ffff8f8509600000(0000) knlGS:0000000000000000 [ 57.056979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 57.063260] CR2: ffff8f84a870a560 CR3: 00000007e5144001 CR4: 00000000003706e0 [ 57.071053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 57.078849] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 57.086684] Call Trace: [ 57.089381] drm_atomic_private_obj_fini+0x29/0x82 [drm] [ 57.095247] amdgpu_dm_fini+0x83/0x170 [amdgpu] [ 57.100264] dm_hw_fini+0x23/0x30 [amdgpu] [ 57.104814] amdgpu_device_fini+0x1df/0x4fe [amdgpu] [ 57.110271] amdgpu_driver_unload_kms+0x43/0x70 [amdgpu] [ 57.116136] amdgpu_pci_remove+0x3b/0x60 [amdgpu] [ 57.121291] pci_device_remove+0x3e/0xb0 [ 57.125583] device_release_driver_internal+0xff/0x1d0 [ 57.131223] device_release_driver+0x12/0x20 [ 57.135903] pci_stop_bus_device+0x70/0xa0 [ 57.140401] pci_stop_and_remove_bus_device_locked+0x1b/0x30 [ 57.146571] remove_store+0x7b/0x90 [ 57.150429] dev_attr_store+0x17/0x30 [ 57.154441] sysfs_kf_write+0x4b/0x60 [ 57.158479] kernfs_fop_write+0xe8/0x1d0 [ 57.162788] vfs_write+0xf5/0x230 [ 57.166426] ksys_write+0x70/0xf0 [ 57.170087] __x64_sys_write+0x1a/0x20 [ 57.174219] do_syscall_64+0x38/0x90 [ 57.178145] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Feifei Xu <Feifei Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: Fix a redundant kfreexinhui pan
drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free itself. Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it. So drm_dev_release try to free a wrong pointer then. Also driver's release trys to free adev, but drm_dev_release will access dev after call drvier's release. To fix it, remove driver's release and set managed.final_kfree to adev. [ 36.269348] BUG: unable to handle page fault for address: ffffa0c279940028 [ 36.276841] #PF: supervisor read access in kernel mode [ 36.282434] #PF: error_code(0x0000) - not-present page [ 36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060 [ 36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G O 5.9.0-rc2+ #46 [ 36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019 [ 36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm] [ 36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5 [ 36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282 [ 36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006 [ 36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010 [ 36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001 [ 36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010 [ 36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2 [ 36.391845] FS: 00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000 [ 36.400669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0 [ 36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 36.430354] Call Trace: [ 36.433044] drm_dev_put.part.0+0x40/0x60 [drm] [ 36.438017] drm_dev_put+0x13/0x20 [drm] [ 36.442398] amdgpu_pci_remove+0x56/0x60 [amdgpu] [ 36.447528] pci_device_remove+0x3e/0xb0 [ 36.451807] device_release_driver_internal+0xff/0x1d0 [ 36.457416] device_release_driver+0x12/0x20 [ 36.462094] pci_stop_bus_device+0x70/0xa0 [ 36.466588] pci_stop_and_remove_bus_device_locked+0x1b/0x30 [ 36.472786] remove_store+0x7b/0x90 [ 36.476614] dev_attr_store+0x17/0x30 [ 36.480646] sysfs_kf_write+0x4b/0x60 [ 36.484655] kernfs_fop_write+0xe8/0x1d0 [ 36.488952] vfs_write+0xf5/0x230 [ 36.492562] ksys_write+0x70/0xf0 [ 36.496206] __x64_sys_write+0x1a/0x20 [ 36.500292] do_syscall_64+0x38/0x90 [ 36.504219] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Alex Deucher <alexancer.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: block ring buffer access during GPU recoveryDennis Li
When GPU is in reset, its status isn't stable and ring buffer also need be reset when resuming. Therefore driver should protect GPU recovery thread from ring buffer accessed by other threads. Otherwise GPU will randomly hang during recovery. v2: correct indent Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: handle manual fan readback on SMU11Alex Deucher
Need to read back from registers for manual mode rather than using the metrics table. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1164 Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: add smu11 helper to get manual fan speed (v2)Alex Deucher
Will be used to fetch the fan speeds when manual fan mode is set. v2: squash in a Coverity fix from Colin Ian King Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)Alex Deucher
No longer needed as we can calculate it based on the fan's max rpm. v2: minor code rework Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: drop get_fan_speed_percent (v2)Alex Deucher
No longer needed as we can calculate it based on the fan's max rpm. v2: rework code to avoid possible uninitialized variable use. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: add get_fan_parameters callbacks for smu11 asicsAlex Deucher
grab the value from the pptable. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu/swsmu: add new callback for getting fan parametersAlex Deucher
To fetch the max rpm from pptable. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: disable gpu-sched load balance for uvdNirmoy Das
On hardware with multiple uvd instances, dependent uvd jobs may get scheduled to different uvd instances. Because uvd_enc jobs retain hw context, dependent jobs should always run on the same uvd instance. This patch disables GPU scheduler's load balancer for a context that binds jobs from the same context to a uvd instance. v2: Squash in uvd_enc fix Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03libtraceevent: Fix build warning on 32-bit archesTzvetomir Stoyanov (VMware)
Fixed a compilation warning for casting to pointer from integer of different size on 32-bit platforms. Reported-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: linux-trace-devel@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf jevents: Fix suspicious code in fixregex()Namhyung Kim
The new string should have enough space for the original string and the back slashes IMHO. Fixes: fbc2844e84038ce3 ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: William Cohen <wcohen@redhat.com> Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03perf parse-events: Use uintptr_t when casting numbers to pointersArnaldo Carvalho de Melo
To address these errors found when cross building from x86_64 to MIPS little endian 32-bit: CC /tmp/build/perf/util/parse-events-bison.o util/parse-events.y: In function 'parse_events_parse': util/parse-events.y:514:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 514 | (void *) $2, $6, $4); | ^ util/parse-events.y:531:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 531 | (void *) $2, NULL, $4)) { | ^ util/parse-events.y:547:6: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 547 | (void *) $2, $4, 0); | ^ util/parse-events.y:564:7: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 564 | (void *) $2, NULL, 0)) { | ^ Fixes: cabbf26821aa210f ("perf parse: Before yyabort-ing free components") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03doc: net: dsa: Fix typo in config code samplePaul Barker
In the "single port" example code for configuring a DSA switch without tagging support from userspace the command to bring up the "lan2" link was typo'd. Signed-off-by: Paul Barker <pbarker@konsulko.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-03Merge tag 'fixes-2020-09-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull misc build failure fixes from Mike Rapoport: "Fix min_low_pfn/max_low_pfn build errors on ia64 and microblaze. Some configurations of ia64 and microblaze use min_low_pfn and max_low_pfn in pfn_valid(). This causes build failures for modules that use pfn_valid(). The fix is to add EXPORT_SYMBOL() for these variables on ia64 and microblaze" * tag 'fixes-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: ia64: fix min_low_pfn/max_low_pfn build errors microblaze: fix min_low_pfn/max_low_pfn build errors
2020-09-03Merge tag 'affs-for-5.9-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull affs fix from David Sterba: "One fix to make permissions work the same way as on AmigaOS" * tag 'affs-for-5.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: fix basic permission bits to actually work
2020-09-03xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt filesDarrick J. Wong
The realtime flag only applies to the data fork, so don't use the realtime block number checks on the attr fork of a realtime file. Fixes: 30b0984d9117 ("xfs: refactor bmap record validation") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-09-03Merge tag 'media/v5.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - a compilation fix issue with ti-vpe on arm 32 bits - two Kconfig fixes for imx214 and max9286 drivers - a kernel information leak at v4l2-core on time32 compat ioctls - some fixes at rc core unbind logic - a fix at mceusb driver for it to not use GFP_ATOMIC - fixes at cedrus and vicodec drivers at the control handling logic - a fix at gpio-ir-tx to avoid disabling interruts on a spinlock * tag 'media/v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: mceusb: Avoid GFP_ATOMIC where it is not needed media: gpio-ir-tx: spinlock is not needed to disable interrupts media: rc: do not access device via sysfs after rc_unregister_device() media: rc: uevent sysfs file races with rc_unregister_device() media: max9286: Depend on OF_GPIO media: i2c: imx214: select V4L2_FWNODE media: cedrus: Add missing v4l2_ctrl_request_hdl_put() media: vicodec: add missing v4l2_ctrl_request_hdl_put() media: media/v4l2-core: Fix kernel-infoleak in video_put_user() media: ti-vpe: cal: Fix compilation on 32-bit ARM
2020-09-03drm/i915: fix regression leading to display audio probe failure on GLKKai Vehmanen
In commit 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking to separate function") the order of force_min_cdclk_changed check and intel_modeset_checks(), was reversed. This broke the mechanism to immediately force a new CDCLK minimum, and lead to driver probe errors for display audio on GLK platform with 5.9-rc1 kernel. Fix the issue by moving intel_modeset_checks() call later. [vsyrjala: It also broke the ability of planes to bump up the cdclk and thus could lead to underruns when eg. flipping from 32bpp to 64bpp framebuffer. To be clear, we still compute the new cdclk correctly but fail to actually program it to the hardware due to intel_set_cdclk_{pre,post}_plane_update() not getting called on account of state->modeset==false.] Fixes: 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking to separate function") BugLink: https://github.com/thesofproject/linux/issues/2410 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200901151036.1312357-1-kai.vehmanen@linux.intel.com
2020-09-03ALSA: hda/realtek - Improved routing for Thinkpad X1 7th/8th GenTakashi Iwai
There've been quite a few regression reports about the lowered volume (reduced to ca 65% from the previous level) on Lenovo Thinkpad X1 after the commit d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen"). Although the commit itself does the right thing from HD-audio POV in order to have a volume control for bass speakers, it seems that the machine has some secret recipe under the hood. Through experiments, Benjamin Poirier found out that the following routing gives the best result: * DAC1 (NID 0x02) -> Speaker pin (NID 0x14) * DAC2 (NID 0x03) -> Shared by both Bass Speaker pin (NID 0x17) & Headphone pin (0x21) * DAC3 (NID 0x06) -> Unused DAC1 seems to have some equalizer internally applied, and you'd get again the output in a bad quality if you connect this to the headphone pin. Hence the headphone is connected to DAC2, which is now shared with the bass speaker pin. DAC3 has no volume amp, hence it's not connected at all. For achieving the routing above, this patch introduced a couple of workarounds: * The connection list of bass speaker pin (NID 0x17) is reduced not to include DAC3 (NID 0x06) * Pass preferred_pairs array to specify the fixed connection Here, both workarounds are needed because the generic parser prefers the individual DAC assignment over others. When the routing above is applied, the generic parser creates the two volume controls "Front" and "Bass Speaker". Since we have only two DACs for three output pins, those are not fully controlling each output individually, and it would confuse PulseAudio. For avoiding the pitfall, in this patch, we rename those volume controls to some unique ones ("DAC1" and "DAC2"). Then PulseAudio ignore them and concentrate only on the still good-working "Master" volume control. If a user still wants to control each DAC volume, they can still change manually via "DAC1" and "DAC2" volume controls. Fixes: d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen") Reported-by: Benjamin Poirier <benjamin.poirier@gmail.com> Reviewed-by: Jaroslav Kysela <perex@perex.cz> Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com> Cc: <stable@vger.kernel.org> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207407#c10 BugLink: https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3214171 BugLink: https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3276276 Link: https://lore/kernel.org/r/20200829112746.3118-1-benjamin.poirier@gmail.com Link: https://lore.kernel.org/r/20200903083300.6333-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-09-03MIPS: SNI: Fix SCSI interruptThomas Bogendoerfer
On RM400(a20r) machines ISA and SCSI interrupts share the same interrupt line. Commit 49e6e07e3c80 ("MIPS: pass non-NULL dev_id on shared request_irq()") accidently dropped the IRQF_SHARED bit, which breaks registering SCSI interrupt. Put back IRQF_SHARED and add dev_id for ISA interrupt. Fixes: 49e6e07e3c80 ("MIPS: pass non-NULL dev_id on shared request_irq()") Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-09-03MIPS: add missing MSACSR and upper MSA initializationHuang Pei
In cc97ab235f3f ("MIPS: Simplify FP context initialization), init_fp_ctx just initialize the fp/msa context, and own_fp_inatomic just restore FCSR and 64bit FP regs from it, but miss MSACSR and upper MSA regs for MSA, so MSACSR and MSA upper regs's value from previous task on current cpu can leak into current task and cause unpredictable behavior when MSA context not initialized. Fixes: cc97ab235f3f ("MIPS: Simplify FP context initialization") Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-09-03Revert "kbuild: use -flive-patching when CONFIG_LIVEPATCH is enabled"Josh Poimboeuf
Use of the new -flive-patching flag was introduced with the following commit: 43bd3a95c98e ("kbuild: use -flive-patching when CONFIG_LIVEPATCH is enabled") This flag has several drawbacks: - It disables some optimizations, so it can have a negative effect on performance. - According to the GCC documentation it's not compatible with LTO, which will become a compatibility issue as LTO support gets upstreamed in the kernel. - It was intended to be used for source-based patch generation tooling, as opposed to binary-based patch generation tooling (e.g., kpatch-build). It probably should have at least been behind a separate config option so as not to negatively affect other livepatch users. - Clang doesn't have the flag, so as far as I can tell, this method of generating patches is incompatible with Clang, which like LTO is becoming more mainstream. - It breaks GCC's implicit noreturn detection for local functions. This is the cause of several "unreachable instruction" objtool warnings. - The broken noreturn detection is an obvious GCC regression, but we haven't yet gotten GCC developers to acknowledge that, which doesn't inspire confidence in their willingness to keep the feature working as optimizations are added or changed going forward. - While there *is* a distro which relies on this flag for their distro livepatch module builds, there's not a publicly documented way to create safe livepatch modules with it. Its use seems to be based on tribal knowledge. It serves no benefit to those who don't know how to use it. (In fact, I believe the current livepatch documentation and samples are misleading and dangerous, and should be corrected. Or at least amended with a disclaimer. But I don't feel qualified to make such changes.) Also, we have an idea for using objtool to detect function changes, which could potentially obsolete the need for this flag anyway. At this point the flag has no benefits for upstream which would counteract the above drawbacks. Revert it until it becomes more ready. This reverts commit 43bd3a95c98e1a86b8b55d97f745c224ecff02b9. Fixes: 43bd3a95c98e ("kbuild: use -flive-patching when CONFIG_LIVEPATCH is enabled") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/696262e997359666afa053fe7d1a9fb2bb373964.1595010490.git.jpoimboe@redhat.com
2020-09-03x86/mm/32: Bring back vmalloc faulting on x86_32Joerg Roedel
One can not simply remove vmalloc faulting on x86-32. Upstream commit: 7f0a002b5a21 ("x86/mm: remove vmalloc faulting") removed it on x86 alltogether because previously the arch_sync_kernel_mappings() interface was introduced. This interface added synchronization of vmalloc/ioremap page-table updates to all page-tables in the system at creation time and was thought to make vmalloc faulting obsolete. But that assumption was incredibly naive. It turned out that there is a race window between the time the vmalloc or ioremap code establishes a mapping and the time it synchronizes this change to other page-tables in the system. During this race window another CPU or thread can establish a vmalloc mapping which uses the same intermediate page-table entries (e.g. PMD or PUD) and does no synchronization in the end, because it found all necessary mappings already present in the kernel reference page-table. But when these intermediate page-table entries are not yet synchronized, the other CPU or thread will continue with a vmalloc address that is not yet mapped in the page-table it currently uses, causing an unhandled page fault and oops like below: BUG: unable to handle page fault for address: fe80c000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page *pde = 33183067 *pte = a8648163 Oops: 0002 [#1] SMP CPU: 1 PID: 13514 Comm: cve-2017-17053 Tainted: G ... Call Trace: ldt_dup_context+0x66/0x80 dup_mm+0x2b3/0x480 copy_process+0x133b/0x15c0 _do_fork+0x94/0x3e0 __ia32_sys_clone+0x67/0x80 __do_fast_syscall_32+0x3f/0x70 do_fast_syscall_32+0x29/0x60 do_SYSENTER_32+0x15/0x20 entry_SYSENTER_32+0x9f/0xf2 EIP: 0xb7eef549 So the arch_sync_kernel_mappings() interface is racy, but removing it would mean to re-introduce the vmalloc_sync_all() interface, which is even more awful. Keep arch_sync_kernel_mappings() in place and catch the race condition in the page-fault handler instead. Do a partial revert of above commit to get vmalloc faulting on x86-32 back in place. Fixes: 7f0a002b5a21 ("x86/mm: remove vmalloc faulting") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200902155904.17544-1-joro@8bytes.org
2020-09-03x86/cmdline: Disable jump tables for cmdline.cArvind Sankar
When CONFIG_RETPOLINE is disabled, Clang uses a jump table for the switch statement in cmdline_find_option (jump tables are disabled when CONFIG_RETPOLINE is enabled). This function is called very early in boot from sme_enable() if CONFIG_AMD_MEM_ENCRYPT is enabled. At this time, the kernel is still executing out of the identity mapping, but the jump table will contain virtual addresses. Fix this by disabling jump tables for cmdline.c when AMD_MEM_ENCRYPT is enabled. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200903023056.3914690-1-nivedita@alum.mit.edu
2020-09-03dmaengine: ti: k3-udma: Update rchan_oes_offset for am654 SYSFW ABI 3.0Peter Ujfalusi
SYSFW ABI 3.0 has changed the rchan_oes_offset value for am654 to support SR2. Since the kernel now needs SYSFW API 3.0 to work because the merged irqchip update, we need to also update the am654 rchan_oes_offset. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20200831091019.25273-1-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-09-03drm/nouveau/kms/nv50-gp1xx: add WAR for EVO push buffer HW bugBen Skeggs
Thanks to NVIDIA for confirming this workaround, and clarifying which HW is affected. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Alexander Kapshuk <alexander.kapshuk@gmail.com>
2020-09-03drm/nouveau/kms/nv50-gp1xx: disable notifies again after core updateBen Skeggs
This was lost during the header conversion. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-09-03drm/nouveau/kms/nv50-: add some whitespace before debug messageBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-09-03drm/nouveau/kms/gv100-: Include correct push header in crcc37d.cLyude Paul
Looks like when we converted everything over to Nvidia's class headers, we mistakenly included the nvif/push507b.h instead of nvif/pushc37b.h, which resulted in breaking CRC reporting for volta+: nouveau 0000:1f:00.0: disp: chid 0 stat 10003361 reason 3 [RESERVED_METHOD] mthd 0d84 data 00000000 code 00000000 nouveau 0000:1f:00.0: disp: chid 0 stat 10003360 reason 3 [RESERVED_METHOD] mthd 0d80 data 00000000 code 00000000 nouveau 0000:1f:00.0: DRM: CRC notifier ctx for head 3 not finished after 50ms So, fix that. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: c4b27bc8682c ("drm/nouveau/kms/nv50-: convert core crc_set_src() to new push macros") Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-09-03drm/radeon: Prefer lower feedback dividersKai-Heng Feng
Commit 2e26ccb119bd ("drm/radeon: prefer lower reference dividers") fixed screen flicker for HP Compaq nx9420 but breaks other laptops like Asus X50SL. Turns out we also need to favor lower feedback dividers. Users confirmed this change fixes the regression and doesn't regress the original fix. Fixes: 2e26ccb119bd ("drm/radeon: prefer lower reference dividers") BugLink: https://bugs.launchpad.net/bugs/1791312 BugLink: https://bugs.launchpad.net/bugs/1861554 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: Fix bug in reporting voltage for CIKSandeep Raghuraman
On my R9 390, the voltage was reported as a constant 1000 mV. This was due to a bug in smu7_hwmgr.c, in the smu7_read_sensor() function, where some magic constants were used in a condition, to determine whether the voltage should be read from PLANE2_VID or PLANE1_VID. The VDDC mask was incorrectly used, instead of the VDDGFX mask. This patch changes the code to use the correct defined constants (and apply the correct bitshift), thus resulting in correct voltage reporting. Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03drm/amdgpu: Specify get_argument function for ci_smu_funcsSandeep Raghuraman
Starting in Linux 5.8, the graphics and memory clock frequency were not being reported for CIK cards. This is a regression, since they were reported correctly in Linux 5.7. After investigation, I discovered that the smum_send_msg_to_smc() function, attempts to call the corresponding get_argument() function of ci_smu_funcs. However, the get_argument() function is not defined in ci_smu_funcs. This patch fixes the bug by specifying the correct get_argument() function. Fixes: a0ec225633d9f6 ("drm/amd/powerplay: unified interfaces for message issuing and response checking") Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org