summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-11-29drm/amd/display: Use DRAM speed from validation for dummy p-stateAlvin Lee
[Description] When choosing which dummy p-state latency to use, we need to use the DRAM speed from validation. The DRAMSpeed DML variable can change because we use different input params to DML when populating watermarks set B. Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Fix MPCC 1DLUT programmingIlya Bakoulin
[Why] Wrong function is used to translate LUT values to HW format, leading to visible artifacting in some cases. [How] Use the correct cm3_helper function. Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Ilya Bakoulin <ilya.bakoulin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Feed SR and Z8 watermarks into DML2 for DCN35Nicholas Kazlauskas
[Why] We've updated the table but the values aren't being reflected in DML2 calculation. [How] Pass them into the bbox overrides. Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdgpu: Force order between a read and write to the same addressAlex Sierra
Setting register to force ordering to prevent read/write or write/read hazards for un-cached modes. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-11-29drm/amdgpu: Do not issue gpu reset from nbio v7_9 bif interruptHawking Zhang
In nbio v7_9, host driver should not issu gpu reset Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Add Z8 watermarks for DML2 bbox overridesNicholas Kazlauskas
[Why] We can override SR watermarks but not Z8 ones. [How] Add new parameters for Z8 matching the SR ones and feed them into the states. These also weren't being applied to every state, so make sure that we loop over and update all SOC states if given an override. Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdgpu: optimize RLC powerdown notification on VangoghPerry Yuan
The smu needs to get the rlc power down message to sync the rlc state with smu, the rlc state updating message need to be sent at while smu begin suspend sequence , otherwise SMU will crash while RLC state is not notified by driver, and rlc state probally changed after that notification, so it needs to notify rlc state to smu at the end of the suspend sequence in amdgpu_device_suspend() that can make sure the rlc state is correctly set to SMU. [ 101.000590] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 101.000598] amdgpu 0000:03:00.0: amdgpu: Failed to disable gfxoff! [ 110.838026] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 110.838035] amdgpu 0000:03:00.0: amdgpu: Failed to disable smu features. [ 110.838039] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 110.838040] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62 [ 110.884394] PM: suspend of devices aborted after 21213.620 msecs [ 110.884402] PM: start suspend of devices aborted after 21213.882 msecs [ 110.884405] PM: Some devices failed to suspend, or early wake event detected Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: fix a pipe mapping error in dcn32_fpuWenjing Liu
[why] In dcn32 DML pipes are ordered the same as dc pipes but only for used pipes. For example, if dc pipe 1 and 2 are used, their dml pipe indices would be 0 and 1 respectively. However update_pipe_slice_table_with_split_flags doesn't skip indices for free pipes. This causes us to not reference correct dml pipe output when building pipe topology. [how] Use two variables to iterate dc and dml pipes respectively and only increment dml pipe index when current dc pipe is not free. Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Update DCN35 watermarksNicholas Kazlauskas
[Why & How] Update to the new values per HW team request. Affects both stutter and z8. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdgpu: update xgmi num links info post gc9.4.2Jonathan Kim
GC IP 9.4.2 and up support TA reporting of the number of xGMI links between peers. Tested-by: Vignesh Chander <vignesh.chander@amd.com> Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Add z-state support policy for dcn35Nicholas Kazlauskas
[Why] DML2 means that the dcn3x policy for calculating z-state support no longer runs from validate_bandwidth. This means we are unconditionally allowing Z8, the hardware default. [How] Port the policy over to DCN35, but with a few modifications: - Don't use min_dst_y_next_start as a check for Z8/Z10 allow - Add support for overriding the Z10 stutter period per ASIC - Cleanup the code to make the policy assignment more clear Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amd/display: Include udelay when waiting for INBOX0 ACKAlvin Lee
When waiting for the ACK for INBOX0 message, we have to ensure to include the udelay for proper wait time Cc: stable@vger.kernel.org # 6.1+ Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29cpufreq/amd-pstate: Only print supported EPP values for performance governorAyush Jain
show_energy_performance_available_preferences() to show only supported values which is performance in performance governor policy. -------Before-------- $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver amd-pstate-epp $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences default performance balance_performance balance_power power -------After-------- $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver amd-pstate-epp $ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference performance $ cat /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_available_preferences performance Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors") Suggested-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Ayush Jain <ayush.jain3@amd.com> Reviewed-by: Wyes Karny <wyes.karny@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29dm-flakey: start allocating with MAX_ORDERMikulas Patocka
Commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") changed the meaning of MAX_ORDER from exclusive to inclusive. So, we can allocate compound pages with up to 1 << MAX_ORDER pages. Reflect this change in dm-flakey and start trying to allocate compound pages with MAX_ORDER. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm-verity: align struct dm_verity_fec_io properlyMikulas Patocka
dm_verity_fec_io is placed after the end of two hash digests. If the hash digest has unaligned length, struct dm_verity_fec_io could be unaligned. This commit fixes the placement of struct dm_verity_fec_io, so that it's aligned. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: a739ff3f543a ("dm verity: add support for forward error correction") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm verity: don't perform FEC for failed readahead IOWu Bo
We found an issue under Android OTA scenario that many BIOs have to do FEC where the data under dm-verity is 100% complete and no corruption. Android OTA has many dm-block layers, from upper to lower: dm-verity dm-snapshot dm-origin & dm-cow dm-linear ufs DM tables have to change 2 times during Android OTA merging process. When doing table change, the dm-snapshot will be suspended for a while. During this interval, many readahead IOs are submitted to dm_verity from filesystem. Then the kverity works are busy doing FEC process which cost too much time to finish dm-verity IO. This causes needless delay which feels like system is hung. After adding debugging it was found that each readahead IO needed around 10s to finish when this situation occurred. This is due to IO amplification: dm-snapshot suspend erofs_readahead // 300+ io is submitted dm_submit_bio (dm_verity) dm_submit_bio (dm_snapshot) bio return EIO bio got nothing, it's empty verity_end_io verity_verify_io forloop range(0, io->n_blocks) // each io->nblocks ~= 20 verity_fec_decode fec_decode_rsb fec_read_bufs forloop range(0, v->fec->rsn) // v->fec->rsn = 253 new_read submit_bio (dm_snapshot) end loop end loop dm-snapshot resume Readahead BIOs get nothing while dm-snapshot is suspended, so all of them will cause verity's FEC. Each readahead BIO needs to verify ~20 (io->nblocks) blocks. Each block needs to do FEC, and every block needs to do 253 (v->fec->rsn) reads. So during the suspend interval(~200ms), 300 readahead BIOs trigger ~1518000 (300*20*253) IOs to dm-snapshot. As readahead IO is not required by userspace, and to fix this issue, it is best to pass readahead errors to upper layer to handle it. Cc: stable@vger.kernel.org Fixes: a739ff3f543a ("dm verity: add support for forward error correction") Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29dm verity: initialize fec io before freeing itWu Bo
If BIO error, verity_end_io() can call verity_finish_io() before verity_fec_init_io(). Therefore, fec_io->rs is not initialized and may crash when doing memory freeing in verity_fec_finish_io(). Crash call stack: die+0x90/0x2b8 __do_kernel_fault+0x260/0x298 do_bad_area+0x2c/0xdc do_translation_fault+0x3c/0x54 do_mem_abort+0x54/0x118 el1_abort+0x38/0x5c el1h_64_sync_handler+0x50/0x90 el1h_64_sync+0x64/0x6c free_rs+0x18/0xac fec_rs_free+0x10/0x24 mempool_free+0x58/0x148 verity_fec_finish_io+0x4c/0xb0 verity_end_io+0xb8/0x150 Cc: stable@vger.kernel.org # v6.0+ Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature") Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-11-29block: Document the role of the two attribute groupsBart Van Assche
It is nontrivial to derive the role of the two attribute groups in source file block/blk-sysfs.c. Hence add a comment that explains their roles. See also commit 6d85ebf95c44 ("blk-sysfs: add a new attr_group for blk_mq"). Cc: Christoph Hellwig <hch@lst.de> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20231128194019.72762-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-29tools: ynl-gen: always construct struct ynl_req_stateJakub Kicinski
struct ynl_req_state carries reply-related info from generated code into generic YNL code. While we don't need reply info to execute a request without a reply, we still need to pass in the struct, because it's also where we get the pointer to struct ynl_sock from. Passing NULL results in crashes if kernel returns an error or an unexpected reply. Fixes: dc0956c98f11 ("tools: ynl-gen: move the response reading logic into YNL") Link: https://lore.kernel.org/r/20231126225858.2144136-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29ethtool: don't propagate EOPNOTSUPP from dumpsJakub Kicinski
The default dump handler needs to clear ret before returning. Otherwise if the last interface returns an inconsequential error this error will propagate to user space. This may confuse user space (ethtool CLI seems to ignore it, but YNL doesn't). It will also terminate the dump early for mutli-skb dump, because netlink core treats EOPNOTSUPP as a real error. Fixes: 728480f12442 ("ethtool: default handlers for GET requests") Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231126225806.2143528-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq updateWyes Karny
When amd_pstate is running, writing to scaling_min_freq and scaling_max_freq has no effect. These values are only passed to the policy level, but not to the platform level. This means that the platform does not know about the frequency limits set by the user. To fix this, update the min_perf and max_perf values at the platform level whenever the user changes the scaling_min_freq and scaling_max_freq values. Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors") Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-29drm/panel: nt36523: fix return value check in nt36523_probe()Yang Yingliang
mipi_dsi_device_register_full() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR(). Fixes: 0993234a0045 ("drm/panel: Add driver for Novatek NT36523") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20231129090715.856263-1-yangyingliang@huaweicloud.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231129090715.856263-1-yangyingliang@huaweicloud.com
2023-11-29drm/panel: starry-2081101qfh032011-53g: Fine tune the panel power sequencexiazhengqiao
For the "starry, 2081101qfh032011-53g" panel, it is stipulated in the panel spec that MIPI needs to keep the LP11 state before the lcm_reset pin is pulled high. Fixes: 6069b66cd962 ("drm/panel: support for STARRY 2081101QFH032011-53G MIPI-DSI panel") Signed-off-by: xiazhengqiao <xiazhengqiao@huaqin.corp-partner.google.com> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20231129084115.7918-1-xiazhengqiao@huaqin.corp-partner.google.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231129084115.7918-1-xiazhengqiao@huaqin.corp-partner.google.com
2023-11-29Merge tag 'pinctrl-v6.7-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix a really interesting potential core bug in the list iterator requireing the use of READ_ONCE() discovered when testing kernel compiles with clang. - Check devm_kcalloc() return value and an array bounds in the STM32 driver. - Fix an exotic string truncation issue in the s32cc driver, found by the kernel test robot (impressive!) - Fix an undocumented struct member in the cy8c95x0 driver. - Fix a symbol overlap with MIPS in the Lochnagar driver, MIPS defines a global symbol "RST" which is a bit too generic and collide with stuff. OK this one should be renamed too, we will fix that as well. - Fix erroneous branch taking in the Realtek driver. - Fix the mail address in MAINTAINERS for the s32g2 driver. * tag 'pinctrl-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: dt-bindings: pinctrl: s32g2: change a maintainer email address pinctrl: realtek: Fix logical error when finding descriptor pinctrl: lochnagar: Don't build on MIPS pinctrl: avoid reload of p state in list iteration pinctrl: cy8c95x0: Fix doc warning pinctrl: s32cc: Avoid possible string truncation pinctrl: stm32: fix array read out of bound pinctrl: stm32: Add check for devm_kcalloc
2023-11-29KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registersNicholas Piggin
Before running a guest, the host process (e.g., QEMU) FP/VEC registers are saved if they were being used, similarly to when the kernel uses FP registers. The guest values are then loaded into regs, and the host process registers will be restored lazily when it uses FP/VEC. KVM HV has a bug here: the host process registers do get saved, but the user MSR bits remain enabled, which indicates the registers are valid for the process. After they are clobbered by running the guest, this valid indication causes the host process to take on the FP/VEC register values of the guest. Fixes: 34e119c96b2b ("KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231122025811.2973-1-npiggin@gmail.com
2023-11-29ALSA: hda/realtek: Add supported ALC257 for ChromeOSKailang Yang
ChromeOS want to support ALC257. Add codec ID to some relation function. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/99a88a7dbdb045fd9d934abeb6cec15f@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-11-29drm/i915: Call intel_pre_plane_updates() also for pipes getting enabledVille Syrjälä
We used to call intel_pre_plane_updates() for any pipe going through a modeset whether the pipe was previously enabled or not. This in fact needed to apply all the necessary clock gating workarounds/etc. Restore the correct behaviour. Fixes: 39919997322f ("drm/i915: Disable all planes before modesetting any pipes") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231121054324.9988-3-ville.syrjala@linux.intel.com (cherry picked from commit e0d5ce11ed0a21bb2bf328ad82fd261783c7ad88) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-29drm/i915: Also check for VGA converter in eDP probeVille Syrjälä
Unfortunately even the HPD based detection added in commit cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") fails to detect that the VBT's eDP/DDI-A is a ghost on Asus B360M-A (CFL+CNP). On that board eDP/DDI-A has its HPD asserted despite nothing being actually connected there :( The straps/fuses also indicate that the eDP port is present. So if one boots with a VGA monitor connected the eDP probe will mistake the DP->VGA converter hooked to DDI-E for an eDP panel on DDI-A. As a last resort check what kind of DP device we've detected, and if it looks like a DP->VGA converter then conclude that the eDP port should be ignored. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9636 Fixes: cfe5bdfb27fa ("drm/i915: Check HPD live state during eDP probe") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231114142333.15799-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit fcd479a79120bf0cd507d85f898297a3b868dda6) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-29drm/i915/gsc: Mark internal GSC engine with reserved uabi classTvrtko Ursulin
The GSC CS is not exposed to the user, so we skipped assigning a uabi class number for it. However, the trace logs use the uabi class and instance to identify the engine, so leaving uabi class unset makes the GSC CS show up as the RCS in those logs. Given that the engine is not exposed to the user, we can't add a new case in the uabi enum, so we insted internally define a kernel internal class as -1. At the same time remove special handling for the name and complete the uabi_classes array so internal class is automatically correctly assigned. Engine will show as 65535:0 other0 in the logs/traces which should be unique enough. v2: * Fix uabi class u8 vs u16 type confusion. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 194babe26bdc ("drm/i915/mtl: don't expose GSC command streamer to the user") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231116084456.291533-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit dfed6b58d54f3a5d7e6bc1fb060e2c936330eba2) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2023-11-28bcachefs: Extra kthread_should_stop() calls for copygcKent Overstreet
This fixes a bug where going read-only was taking longer than it should have due to copygc forgetting to check kthread_should_stop() Additionally: fix a missing is_kthread check in bch2_move_ratelimit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Convert gc_alloc_start() to for_each_btree_key2()Kent Overstreet
This eliminates some SRCU warnings: for_each_btree_key2() runs every loop iteration in a distinct transaction context. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Fix race between btree writes and metadata dropKent Overstreet
btree writes update the btree node key after every write, in order to update sectors_written, and they also might need to drop pointers if one of the writes failed in a replicated btree node. But the btree node might also have had a pointer dropped while the write was in flight, by bch2_dev_metadata_drop(), and thus there was a bug where the btree node write would ovewrite the btree node's key with what it had at the start of the write. Fix this by dropping pointers not currently in the btree node key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: move journal seq assertionKent Overstreet
journal_cur_seq() can legitimately be used outside of the journal lock, where this assert can race Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: -EROFS doesn't count as move_extent_start_failKent Overstreet
The automated tests check if we've hit too many slowpath/error path events and fail the test - if we're just shutting down, that naturally shouldn't count. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28ravb: Fix races between ravb_tx_timeout_work() and net related opsYoshihiro Shimoda
Fix races between ravb_tx_timeout_work() and functions of net_device_ops and ethtool_ops by using rtnl_trylock() and rtnl_unlock(). Note that since ravb_close() is under the rtnl lock and calls cancel_work_sync(), ravb_tx_timeout_work() should calls rtnl_trylock(). Otherwise, a deadlock may happen in ravb_tx_timeout_work() like below: CPU0 CPU1 ravb_tx_timeout() schedule_work() ... __dev_close_many() // Under rtnl lock ravb_close() cancel_work_sync() // Waiting ravb_tx_timeout_work() rtnl_lock() // This is possible to cause a deadlock If rtnl_trylock() fails, rescheduling the work with sleep for 1 msec. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20231127122420.3706751-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-28smb: client: report correct st_size for SMB and NFS symlinksPaulo Alcantara
We can't rely on FILE_STANDARD_INFORMATION::EndOfFile for reparse points as they will be always zero. Set it to symlink target's length as specified by POSIX. This will make stat() family of syscalls return the correct st_size for such files. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-11-28smb: client: fix missing mode bits for SMB symlinksPaulo Alcantara
When instantiating inodes for SMB symlinks, add the mode bits from @cifs_sb->ctx->file_mode as we already do for the other special files. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-11-29nouveau/gsp: replace zero-length array with flex-array member and use ↵Gustavo A. R. Silva
__counted_by Fake flexible arrays (zero-length and one-element arrays) are deprecated, and should be replaced by flexible-array members. So, replace zero-length array with a flexible-array member in `struct PACKED_REGISTRY_TABLE`. Also annotate array `entries` with `__counted_by()` to prepare for the coming implementation by GCC and Clang of the `__counted_by` attribute. Flexible array members annotated with `__counted_by` can have their accesses bounds-checked at run-time via `CONFIG_UBSAN_BOUNDS` (for array indexing) and `CONFIG_FORTIFY_SOURCE` (for strcpy/memcpy-family functions). This fixes multiple -Warray-bounds warnings: drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1069:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1070:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1071:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1072:29: warning: array subscript 0 is outside array bounds of 'PACKED_REGISTRY_ENTRY[0]' [-Warray-bounds=] While there, also make use of the struct_size() helper, and address checkpatch.pl warning: WARNING: please, no spaces at the start of a line This results in no differences in binary output. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZVZbX7C5suLMiBf+@work
2023-11-29nouveau/gsp/r535: remove a stray unlock in r535_gsp_rpc_send()Dan Carpenter
This unlock doesn't belong here and it leads to a double unlock in the caller, r535_gsp_rpc_push(). Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/a0293812-c05d-45f0-a535-3f24fe582c02@moroto.mountain
2023-11-29nouveau: find the smallest page allocation to cover a buffer alloc.Dave Airlie
With the new uapi we don't have the comp flags on the allocation, so we shouldn't be using the first size that works, we should be iterating until we get the correct one. This reduces allocations from 2MB to 64k in lots of places. Fixes dEQP-VK.memory.allocation.basic.size_8KiB.forward.count_4000 on my ampere/gsp system. Cc: stable@vger.kernel.org # v6.6 Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230811031520.248341-1-airlied@gmail.com
2023-11-28bcachefs: trace_move_extent_start_fail() now includes errcodeKent Overstreet
Renamed from trace_move_extent_alloc_mem_fail, because there are other reasons we colud fail (disk space allocation failure). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Fix split_race livelockKent Overstreet
bch2_btree_update_start() calculates which nodes are going to have to be split/rewritten, so that we know how many nodes to reserve and how deep in the tree we have to take locks. But btree node merges require inserting two keys into the parent node, not just splits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Fix bucket data type for stripe bucketsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Add missing validation for jset_entry_data_usageKent Overstreet
Validation was completely missing for replicas entries in the journal (not the superblock replicas section) - we can't have replicas entries pointing to invalid devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28bcachefs: Fix zstd compress workspace sizeKent Overstreet
zstd apparently lies about the size of the compression workspace it requires; if we double it compression succeeds. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28Merge tag 'for-6.7-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few fixes and message updates: - for simple quotas, handle the case when a snapshot is created and the target qgroup already exists - fix a warning when file descriptor given to send ioctl is not writable - fix off-by-one condition when checking chunk maps - free pages when page array allocation fails during compression read, other cases were handled - fix memory leak on error handling path in ref-verify debugging feature - copy missing struct member 'version' in 64/32bit compat send ioctl - tree-checker verifies inline backref ordering - print messages to syslog on first mount and last unmount - update error messages when reading chunk maps" * tag 'for-6.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: send: ensure send_fd is writable btrfs: free the allocated memory if btrfs_alloc_page_array() fails btrfs: fix 64bit compat send ioctl arguments not initializing version member btrfs: make error messages more clear when getting a chunk map btrfs: fix off-by-one when checking chunk map includes logical address btrfs: ref-verify: fix memory leaks in btrfs_ref_tree_mod() btrfs: add dmesg output for first mount and last unmount of a filesystem btrfs: do not abort transaction if there is already an existing qgroup btrfs: tree-checker: add type and sequence check for inline backrefs
2023-11-28block: warn once for each partition in bio_check_ro()Yu Kuai
Commit 1b0a151c10a6 ("blk-core: use pr_warn_ratelimited() in bio_check_ro()") fix message storm by limit the rate, however, there will still be lots of message in the long term. Fix it better by warn once for each partition. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231128123027.971610-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-28block: move .bd_inode into 1st cacheline of block_deviceMing Lei
The .bd_inode field of block_device is used in IO fast path of blkdev_write_iter() and blkdev_llseek(), so it is more efficient to keep it into the 1st cacheline. .bd_openers is only touched in open()/close(), and .bd_size_lock is only for updating bdev capacity, which is in slow path too. So swap .bd_inode layout with .bd_openers & .bd_size_lock to move .bd_inode into the 1st cache line. Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231128123027.971610-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-28io_uring: use fget/fput consistentlyJens Axboe
Normally within a syscall it's fine to use fdget/fdput for grabbing a file from the file table, and it's fine within io_uring as well. We do that via io_uring_enter(2), io_uring_register(2), and then also for cancel which is invoked from the latter. io_uring cannot close its own file descriptors as that is explicitly rejected, and for the cancel side of things, the file itself is just used as a lookup cookie. However, it is more prudent to ensure that full references are always grabbed. For anything threaded, either explicitly in the application itself or through use of the io-wq worker threads, this is what happens anyway. Generalize it and use fget/fput throughout. Also see the below link for more details. Link: https://lore.kernel.org/io-uring/CAG48ez1htVSO3TqmrF8QcX2WFuYTRM-VZ_N10i-VZgbtg=NNqw@mail.gmail.com/ Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-11-28io_uring: free io_buffer_list entries via RCUJens Axboe
mmap_lock nests under uring_lock out of necessity, as we may be doing user copies with uring_lock held. However, for mmap of provided buffer rings, we attempt to grab uring_lock with mmap_lock already held from do_mmap(). This makes lockdep, rightfully, complain: WARNING: possible circular locking dependency detected 6.7.0-rc1-00009-gff3337ebaf94-dirty #4438 Not tainted ------------------------------------------------------ buf-ring.t/442 is trying to acquire lock: ffff00020e1480a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_validate_mmap_request.isra.0+0x4c/0x140 but task is already holding lock: ffff0000dc226190 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x124/0x264 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&mm->mmap_lock){++++}-{3:3}: __might_fault+0x90/0xbc io_register_pbuf_ring+0x94/0x488 __arm64_sys_io_uring_register+0x8dc/0x1318 invoke_syscall+0x5c/0x17c el0_svc_common.constprop.0+0x108/0x130 do_el0_svc+0x2c/0x38 el0_svc+0x4c/0x94 el0t_64_sync_handler+0x118/0x124 el0t_64_sync+0x168/0x16c -> #0 (&ctx->uring_lock){+.+.}-{3:3}: __lock_acquire+0x19a0/0x2d14 lock_acquire+0x2e0/0x44c __mutex_lock+0x118/0x564 mutex_lock_nested+0x20/0x28 io_uring_validate_mmap_request.isra.0+0x4c/0x140 io_uring_mmu_get_unmapped_area+0x3c/0x98 get_unmapped_area+0xa4/0x158 do_mmap+0xec/0x5b4 vm_mmap_pgoff+0x158/0x264 ksys_mmap_pgoff+0x1d4/0x254 __arm64_sys_mmap+0x80/0x9c invoke_syscall+0x5c/0x17c el0_svc_common.constprop.0+0x108/0x130 do_el0_svc+0x2c/0x38 el0_svc+0x4c/0x94 el0t_64_sync_handler+0x118/0x124 el0t_64_sync+0x168/0x16c From that mmap(2) path, we really just need to ensure that the buffer list doesn't go away from underneath us. For the lower indexed entries, they never go away until the ring is freed and we can always sanely reference those as long as the caller has a file reference. For the higher indexed ones in our xarray, we just need to ensure that the buffer list remains valid while we return the address of it. Free the higher indexed io_buffer_list entries via RCU. With that we can avoid needing ->uring_lock inside mmap(2), and simply hold the RCU read lock around the buffer list lookup and address check. To ensure that the arrayed lookup either returns a valid fully formulated entry via RCU lookup, add an 'is_ready' flag that we access with store and release memory ordering. This isn't needed for the xarray lookups, but doesn't hurt either. Since this isn't a fast path, retain it across both types. Similarly, for the allocated array inside the ctx, ensure we use the proper load/acquire as setup could in theory be running in parallel with mmap. While in there, add a few lockdep checks for documentation purposes. Cc: stable@vger.kernel.org Fixes: c56e022c0a27 ("io_uring: add support for user mapped provided buffer ring") Signed-off-by: Jens Axboe <axboe@kernel.dk>