linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2024-10-02	ASoC: amd: acp: drop bogus NULL check from i2s_irq_handler	Murad Masimov
	When i2s_irq_handler is called, it's guaranteed that adata is not NULL, since IRQ handlers are guaranteed to be provided with a valid data pointer. Moreover, adata pointer is being dereferenced right before the NULL check, which makes the check pointless, even if adata could be NULL. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://patch.msgid.link/20241001190848.711-1-m.masimov@maxima.ru Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	ASoC: intel: sof_sdw: Add check devm_kasprintf() returned value	Charles Han
	devm_kasprintf() can return a NULL pointer on failure but this returned value is not checked. Fixes: b359760d95ee ("ASoC: intel: sof_sdw: Add simple DAI link creation helper") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://patch.msgid.link/20240925080030.11262-1-hanchunchao@inspur.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m	Hui Wang
	In most Linux distribution kernels, the SND is set to m, in such a case, when booting the kernel on i.MX8MP EVK board, there is a warning calltrace like below: Call trace: snd_card_init+0x484/0x4cc [snd] snd_card_new+0x70/0xa8 [snd] snd_soc_bind_card+0x310/0xbd0 [snd_soc_core] snd_soc_register_card+0xf0/0x108 [snd_soc_core] devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core] That is because the card.owner is not set, a warning calltrace is raised in the snd_card_init() due to it. Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver") Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://patch.msgid.link/20241002025659.723544-1-hui.wang@canonical.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	ASoC: dt-bindings: davinci-mcasp: Fix interrupts property	Miquel Raynal
	My understanding of the interrupts property is that it can either be: 1/ - TX 2/ - TX - RX 3/ - Common/combined. There are very little chances that either: - TX - Common/combined or even - TX - RX - Common/combined could be a thing. Looking at the interrupt-names definition (which uses oneOf instead of anyOf), it makes indeed little sense to use anyOf in the interrupts definition. I believe this is just a mistake, hence let's fix it. Fixes: 8be90641a0bb ("ASoC: dt-bindings: davinci-mcasp: convert McASP bindings to yaml schema") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241001204749.390054-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	ASoC: qcom: sm8250: add qrb4210-rb2-sndcard compatible string	Alexey Klimov
	Add "qcom,qrb4210-rb2-sndcard" to the list of recognizable devices. Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://patch.msgid.link/20241002022015.867031-3-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	ASoC: dt-bindings: qcom,sm8250: add qrb4210-rb2-sndcard	Alexey Klimov
	Add adsp-backed soundcard compatible for QRB4210 RB2 platform, which as of now looks fully compatible with SM8250. Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241002022015.867031-2-alexey.klimov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-02	udf: fix uninit-value use in udf_get_fileshortad	Gianfranco Trad
	Check for overflow when computing alen in udf_current_aext to mitigate later uninit-value use in udf_get_fileshortad KMSAN bug[1]. After applying the patch reproducer did not trigger any issue[2]. [1] https://syzkaller.appspot.com/bug?extid=8901c4560b7ab5c2f9df [2] https://syzkaller.appspot.com/x/log.txt?x=10242227980000 Reported-by: syzbot+8901c4560b7ab5c2f9df@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8901c4560b7ab5c2f9df Tested-by: syzbot+8901c4560b7ab5c2f9df@syzkaller.appspotmail.com Suggested-by: Jan Kara <jack@suse.com> Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240925074613.8475-3-gianf.trad@gmail.com
2024-10-02	udf: refactor inode_bmap() to handle error	Zhao Mengmeng
	Refactor inode_bmap() to handle error since udf_next_aext() can return error now. On situations like ftruncate, udf_extend_file() can now detect errors and bail out early without resorting to checking for particular offsets and assuming internal behavior of these functions. Reported-by: syzbot+7a4842f0b1801230a989@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7a4842f0b1801230a989 Tested-by: syzbot+7a4842f0b1801230a989@syzkaller.appspotmail.com Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-4-zhaomzhao@126.com
2024-10-02	udf: refactor udf_next_aext() to handle error	Zhao Mengmeng
	Since udf_current_aext() has error handling, udf_next_aext() should have error handling too. Besides, when too many indirect extents found in one inode, return -EFSCORRUPTED; when reading block failed, return -EIO. Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-3-zhaomzhao@126.com
2024-10-02	ALSA: hda: fix trigger_tstamp_latched	Jaroslav Kysela
	When the trigger_tstamp_latched flag is set, the PCM core code assumes that the low-level driver handles the trigger timestamping itself. Ensure that runtime->trigger_tstamp is always updated. Buglink: https://github.com/alsa-project/alsa-lib/issues/387 Reported-by: Zeno Endemann <zeno.endemann@mailbox.org> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://patch.msgid.link/20241002081306.1788405-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-02	udf: refactor udf_current_aext() to handle error	Zhao Mengmeng
	As Jan suggested in links below, refactor udf_current_aext() to differentiate between error, hit EOF and success, it now takes pointer to etype to store the extent type, return 1 when getting etype success, return 0 when hitting EOF and return -errno when err. Link: https://lore.kernel.org/all/20240912111235.6nr3wuqvktecy3vh@quack3/ Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-2-zhaomzhao@126.com
2024-10-02	gpio: davinci: Fix condition for irqchip registration	Vignesh Raghavendra
	Since commit d29e741cad3f ("gpio: davinci: drop platform data support"), irqchip is no longer being registered on platforms what don't use unbanked gpios. Fix this. Reported-by: Sabeeh Khan <sabeeh-khan@ti.com> Fixes: d29e741cad3f ("gpio: davinci: drop platform data support") Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Link: https://lore.kernel.org/r/20241002071901.2752757-1-vigneshr@ti.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-10-02	kconfig: qconf: fix buffer overflow in debug links	Masahiro Yamada
	If you enable "Option -> Show Debug Info" and click a link, the program terminates with the following error: * buffer overflow detected *: terminated The buffer overflow is caused by the following line: strcat(data, "$"); The buffer needs one more byte to accommodate the additional character. Fixes: c4f7398bee9c ("kconfig: qconf: make debug links work again") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-10-02	ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200	Abhishek Tamboli
	Add the quirk for HP Pavilion Gaming laptop 15z-ec200 for enabling the mute led. The fix apply the ALC285_FIXUP_HP_MUTE_LED quirk for this model. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219303 Signed-off-by: Abhishek Tamboli <abhishektamboli9@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20240930145300.4604-1-abhishektamboli9@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-02	ALSA: hda/generic: Drop obsoleted obey_preferred_dacs flag	Takashi Iwai
	Now we evaluate directly with preferred_dacs table, the flag is no longer used and merely a placeholder. Let's drop the definition and its users. Link: https://patch.msgid.link/20241001121439.26060-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-02	ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs	Takashi Iwai
	Some time ago, we introduced the obey_preferred_dacs flag for choosing the DAC/pin pairs specified by the driver instead of parsing the paths. This works as expected, per se, but there have been a few cases where we forgot to set this flag while preferred_dacs table is already set up. It ended up with incorrect wiring and made us wondering why it doesn't work. Basically, when the preferred_dacs table is provided, it means that the driver really wants to wire up to follow that. That is, the presence of the preferred_dacs table itself is already a "do-it" flag. In this patch, we simply replace the evaluation of obey_preferred_dacs flag with the presence of preferred_dacs table for fixing the misbehavior. Another patch to drop of the obsoleted flag will follow. Fixes: 242d990c158d ("ALSA: hda/generic: Add option to enforce preferred_dacs pairs") Link: https://bugzilla.suse.com/show_bug.cgi?id=1219803 Link: https://patch.msgid.link/20241001121439.26060-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-02	ufs_rename(): fix bogus argument of folio_release_kmap()	Al Viro
	new_dir does NOT point into dir_folio - it's an inode, not a pointer to ufs directory entry. Fixes: 516b97cf03dd6 "ufs: Convert directory handling to kmap_local" Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-10-01	parisc: get rid of private asm/unaligned.h	Al Viro
	Declarations local to arch//kernel/.c are better off not in a public header - arch/parisc/kernel/unaligned.h is just fine for those bits. With that done parisc asm/unaligned.h is reduced to include of asm-generic/unaligned.h and can be removed - unaligned.h is in mandatory-y in include/asm-generic/Kbuild. Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-10-01	smb: client: use actual path when queryfs	wangrong
	Due to server permission control, the client does not have access to the shared root directory, but can access subdirectories normally, so users usually mount the shared subdirectories directly. In this case, queryfs should use the actual path instead of the root directory to avoid the call returning an error (EACCES). Signed-off-by: wangrong <wangrong@uniontech.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-02	spi: Fix pm_runtime_set_suspended() with runtime pm	Mark Brown
	Merge series from Jinjie Ruan <ruanjinjie@huawei.com>: Fix pm_runtime_set_suspended() with runtime pm enabled, and fix the missing check for spi-cadence. Jinjie Ruan (3): spi: spi-imx: Fix pm_runtime_set_suspended() with runtime pm enabled spi: spi-cadence: Fix pm_runtime_set_suspended() with runtime pm enabled spi: spi-cadence: Fix missing spi_controller_is_target() check drivers/spi/spi-cadence.c \| 8 +++++--- drivers/spi/spi-imx.c \| 2 +- 2 files changed, 6 insertions(+), 4 deletions(-) -- 2.34.1
2024-10-01	drm/amd/display: Fix system hang while resume with TBT monitor	Tom Chung
	[Why] Connected with a Thunderbolt monitor and do the suspend and the system may hang while resume. The TBT monitor HPD will be triggered during the resume procedure and call the drm_client_modeset_probe() while struct drm_connector connector->dev->master is NULL. It will mess up the pipe topology after resume. [How] Skip the TBT monitor HPD during the resume procedure because we currently will probe the connectors after resume by default. Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85) Cc: stable@vger.kernel.org
2024-10-01	drm/amd/display: Enable idle workqueue for more IPS modes	Leo Li
	[Why] There are more IPS modes other than DMUB_IPS_ENABLE that enables IPS. We need to enable the hotplug detect idle workqueue for those modes as well. [How] Modify the if condition to initialize the workqueue in all IPS modes except for DMUB_IPS_DISABLE_ALL. Fixes: 65444581a4ae ("drm/amd/display: Determine IPS mode by ASIC and PMFW versions") Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 181db30bcfed097ecc680539b1eabe935c11f57f) Cc: stable@vger.kernel.org
2024-10-01	drm/amd/display: Add HDR workaround for specific eDP	Alex Hung
	[WHY & HOW] Some eDP panels suffer from flicking when HDR is enabled in KDE. This quirk works around it by skipping VSC that is incompatible with eDP panels. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376) Cc: stable@vger.kernel.org
2024-10-01	drm/amd/display: avoid set dispclk to 0	Charlene Liu
	[why] set dispclk to 0 cause stability issue. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1c6b16ebf5eb2bc5740be9e37b3a69f1dfe1dded) Cc: stable@vger.kernel.org
2024-10-01	drm/amd/display: Restore Optimized pbn Value if Failed to Disable DSC	Fangzhi Zuo
	Existing last step of dsc policy is to restore pbn value under minimum compression when try to greedily disable dsc for a stream failed to fit in MST bw. Optimized dsc params result from optimization step is not necessarily the minimum compression, therefore it is not correct to restore the pbn under minimum compression rate. Restore the pbn under minimum compression instead of the value from optimized pbn could result in the dsc params not correct at the modeset where atomic_check failed due to not enough bw. One or more monitors connected could not light up in such case. Restore the optimized pbn value, instead of using the pbn value under minimum compression. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 352c3165d2b75030169e012461a16bcf97f392fc) Cc: stable@vger.kernel.org
2024-10-01	drm/amd/display: update DML2 policy ↵	Yihan Zhu
	EnhancedPrefetchScheduleAccelerationFinal DCN35 [WHY & HOW] Mismatch in DCN35 DML2 cause bw validation failed to acquire unexpected DPP pipe to cause grey screen and system hang. Remove EnhancedPrefetchScheduleAccelerationFinal value override to match HW spec. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba) Cc: stable@vger.kernel.org
2024-10-01	rust: kunit: use C-string literals to clean warning	Miguel Ojeda
	Starting with upstream Rust commit a5e3a3f9b6bd ("move `manual_c_str_literals` to complexity"), to be released in Rust 1.83.0 [1], Clippy now warns on `manual_c_str_literals` by default, e.g.: error: manually constructing a nul-terminated string --> rust/kernel/kunit.rs:21:13 \| 21 \| b"\x013%pA\0".as_ptr() as _, \| ^^^^^^^^^^^^^ help: use a `c""` literal: `c"\x013%pA"` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_c_str_literals = note: `-D clippy::manual-c-str-literals` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_c_str_literals)]` Apply the suggestion to clean up the warnings. Link: https://github.com/rust-lang/rust-clippy/pull/13263 [1] Reviewed-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Link: https://lore.kernel.org/r/20240927164414.560906-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-10-01	bcachefs: Fix bad shift in bch2_read_flag_list()	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-01	riscv: Fix kernel stack size when KASAN is enabled	Alexandre Ghiti
	We use Kconfig to select the kernel stack size, doubling the default size if KASAN is enabled. But that actually only works if KASAN is selected from the beginning, meaning that if KASAN config is added later (for example using menuconfig), CONFIG_THREAD_SIZE_ORDER won't be updated, keeping the default size, which is not enough for KASAN as reported in [1]. So fix this by moving the logic to compute the right kernel stack into a header. Fixes: a7555f6b62e7 ("riscv: stack: Add config of thread stack size") Reported-by: syzbot+ba9eac24453387a9d502@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000eb301906222aadc2@google.com/ [1] Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240917150328.59831-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-01	ksmbd: Use struct_size() to improve smb_direct_rdma_xmit()	Thorsten Blum
	Use struct_size() to calculate the number of bytes to allocate for a new message. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-01	ksmbd: Annotate struct copychunk_ioctl_req with __counted_by_le()	Thorsten Blum
	Add the __counted_by_le compiler attribute to the flexible array member Chunks to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Change the data type of the flexible array member Chunks from __u8[] to struct srv_copychunk[] for ChunkCount to match the number of elements in the Chunks array. (With __u8[], each srv_copychunk would occupy 24 array entries and the __counted_by compiler attribute wouldn't be applicable.) Use struct_size() to calculate the size of the copychunk_ioctl_req. Read Chunks[0] after checking that ChunkCount is not 0. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-01	ksmbd: Use struct_size() to improve get_file_alternate_info()	Thorsten Blum
	Use struct_size() to calculate the output buffer length. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-01	ACPI: video: Add backlight=native quirk for Dell OptiPlex 5480 AIO	Hans de Goede
	Dell All In One (AIO) models released after 2017 may use a backlight controller board connected to an UART. In DSDT this uart port will be defined as: Name (_HID, "DELL0501") Name (_CID, EisaId ("PNP0501") The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with the above _HID + _CID. Loading the dell-uart-backlight driver fails with the following errors: [ 18.261353] dell_uart_backlight serial0-0: Timed out waiting for response. [ 18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version [ 18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110 Indicating that there is no backlight controller board attached to the UART, while the GPU's native backlight control method does work. Add a quirk to use the GPU's native backlight control method on this model. Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240918153849.37221-1-hdegoede@redhat.com [ rjw: Changelog edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-01	cpufreq: Avoid a bad reference count on CPU node	Miquel Sabaté Solà
	In the parse_perf_domain function, if the call to of_parse_phandle_with_args returns an error, then the reference to the CPU device node that was acquired at the start of the function would not be properly decremented. Address this by declaring the variable with the __free(device_node) cleanup attribute. Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20240917134246.584026-1-mikisabate@gmail.com Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-01	cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock	Uwe Kleine-König
	notify_hwp_interrupt() is called via sysvec_thermal() -> smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context. For this reason it must not use a simple spin_lock that sleeps with PREEMPT_RT enabled. So convert it to a raw spinlock. Reported-by: xiao sheng wen <atzlinux@sina.com> Link: https://bugs.debian.org/1076483 Signed-off-by: Uwe Kleine-König <ukleinek@debian.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: xiao sheng wen <atzlinux@sina.com> Link: https://patch.msgid.link/20240919081121.10784-2-ukleinek@debian.org Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-01	spi: s3c64xx: fix timeout counters in flush_fifo	Ben Dooks
	In the s3c64xx_flush_fifo() code, the loops counter is post-decremented in the do { } while(test && loops--) condition. This means the loops is left at the unsigned equivalent of -1 if the loop times out. The test after will never pass as if tests for loops == 0. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Fixes: 230d42d422e7 ("spi: Add s3c64xx SPI Controller driver") Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://patch.msgid.link/20240924134009.116247-2-ben.dooks@codethink.co.uk Signed-off-by: Mark Brown <broonie@kernel.org>
2024-10-01	ASoC: Intel: soc-acpi: Fix missing empty terminators	Mark Brown
	Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: There is no links_num in struct snd_soc_acpi_mach {}, and we test !link->num_adr as a condition to end the loop in hda_sdw_machine_select(). So an empty item in struct snd_soc_acpi_link_adr array is required.
2024-10-01	btrfs: disable rate limiting when debug enabled	Leo Martins
	Disable ratelimiting for btrfs_printk when CONFIG_BTRFS_DEBUG is enabled. This allows for more verbose output which is often needed by functions like btrfs_dump_space_info(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: wait for fixup workers before stopping cleaner kthread during umount	Filipe Manana
	During unmount, at close_ctree(), we have the following steps in this order: 1) Park the cleaner kthread - this doesn't destroy the kthread, it basically halts its execution (wake ups against it work but do nothing); 2) We stop the cleaner kthread - this results in freeing the respective struct task_struct; 3) We call btrfs_stop_all_workers() which waits for any jobs running in all the work queues and then free the work queues. Syzbot reported a case where a fixup worker resulted in a crash when doing a delayed iput on its inode while attempting to wake up the cleaner at btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread was already freed. This can happen during unmount because we don't wait for any fixup workers still running before we call kthread_stop() against the cleaner kthread, which stops and free all its resources. Fix this by waiting for any fixup workers at close_ctree() before we call kthread_stop() against the cleaner and run pending delayed iputs. The stack traces reported by syzbot were the following: BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52 CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: btrfs-fixup btrfs_work_helper Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154 btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842 btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 2: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:247 [inline] slab_post_alloc_hook mm/slub.c:4086 [inline] slab_alloc_node mm/slub.c:4135 [inline] kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1107 copy_process+0x5d1/0x3d50 kernel/fork.c:2206 kernel_clone+0x223/0x880 kernel/fork.c:2787 kernel_thread+0x1bc/0x240 kernel/fork.c:2849 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:765 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 61: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:2343 [inline] slab_free mm/slub.c:4580 [inline] kmem_cache_free+0x1a2/0x420 mm/slub.c:4682 put_task_struct include/linux/sched/task.h:144 [inline] delayed_put_task_struct+0x125/0x300 kernel/exit.c:228 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 handle_softirqs+0x2c5/0x980 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1037 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1037 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 Last potentially related work creation: kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541 __call_rcu_common kernel/rcu/tree.c:3086 [inline] call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190 context_switch kernel/sched/core.c:5318 [inline] __schedule+0x184b/0x4ae0 kernel/sched/core.c:6675 schedule_idle+0x56/0x90 kernel/sched/core.c:6793 do_idle+0x56a/0x5d0 kernel/sched/idle.c:354 cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424 start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314 common_startup_64+0x13e/0x147 The buggy address belongs to the object at ffff8880272a8000 which belongs to the cache task_struct of size 7424 The buggy address is located 2584 bytes inside of freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x272a8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0xfff00000000040(head\|node=0\|zone=1\|lastcpupid=0x7ff) page_type: f5(slab) raw: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000 head: 00fff00000000040 ffff88801bafa500 dead000000000122 0000000000000000 head: 0000000000000000 0000000080040004 00000001f5000000 0000000000000000 head: 00fff00000000003 ffffea00009caa01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO\|__GFP_FS\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_COMP\|__GFP_NOMEMALLOC), pid 2, tgid 2 (kthreadd), ts 71247381401, free_ts 71214998153 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1545 [inline] get_page_from_freelist+0x3039/0x3180 mm/page_alloc.c:3457 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4733 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x120 mm/slub.c:2413 allocate_slab+0x5a/0x2f0 mm/slub.c:2579 new_slab mm/slub.c:2632 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3819 __slab_alloc+0x58/0xa0 mm/slub.c:3909 __slab_alloc_node mm/slub.c:3962 [inline] slab_alloc_node mm/slub.c:4123 [inline] kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4187 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1107 copy_process+0x5d1/0x3d50 kernel/fork.c:2206 kernel_clone+0x223/0x880 kernel/fork.c:2787 kernel_thread+0x1bc/0x240 kernel/fork.c:2849 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:765 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 page last free pid 5230 tgid 5230 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1108 [inline] free_unref_page+0xcd0/0xf00 mm/page_alloc.c:2638 discard_slab mm/slub.c:2678 [inline] __put_partials+0xeb/0x130 mm/slub.c:3146 put_cpu_partial+0x17c/0x250 mm/slub.c:3221 __slab_free+0x2ea/0x3d0 mm/slub.c:4450 qlink_free mm/kasan/quarantine.c:163 [inline] qlist_free_all+0x9a/0x140 mm/kasan/quarantine.c:179 kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286 __kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:329 kasan_slab_alloc include/linux/kasan.h:247 [inline] slab_post_alloc_hook mm/slub.c:4086 [inline] slab_alloc_node mm/slub.c:4135 [inline] kmem_cache_alloc_noprof+0x135/0x2a0 mm/slub.c:4142 getname_flags+0xb7/0x540 fs/namei.c:139 do_sys_openat2+0xd2/0x1d0 fs/open.c:1409 do_sys_open fs/open.c:1430 [inline] __do_sys_openat fs/open.c:1446 [inline] __se_sys_openat fs/open.c:1441 [inline] __x64_sys_openat+0x247/0x2a0 fs/open.c:1441 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffff8880272a8900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880272a8980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880272a8a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880272a8a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880272a8b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Reported-by: syzbot+8aaf2df2ef0164ffe1fb@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/66fb36b1.050a0220.aab67.003b.GAE@google.com/ CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: fix a NULL pointer dereference when failed to start a new trasacntion	Qu Wenruo
	[BUG] Syzbot reported a NULL pointer dereference with the following crash: FAULT_INJECTION: forcing a failure. start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676 prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642 relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678 ... BTRFS info (device loop0): balance: ended with status: -12 Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667] RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926 Call Trace: <TASK> commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496 btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430 del_balance_item fs/btrfs/volumes.c:3678 [inline] reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742 btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f [CAUSE] The allocation failure happens at the start_transaction() inside prepare_to_relocate(), and during the error handling we call unset_reloc_control(), which makes fs_info->balance_ctl to be NULL. Then we continue the error path cleanup in btrfs_balance() by calling reset_balance_state() which will call del_balance_item() to fully delete the balance item in the root tree. However during the small window between set_reloc_contrl() and unset_reloc_control(), we can have a subvolume tree update and created a reloc_root for that subvolume. Then we go into the final btrfs_commit_transaction() of del_balance_item(), and into btrfs_update_reloc_root() inside commit_fs_roots(). That function checks if fs_info->reloc_ctl is in the merge_reloc_tree stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer dereference. [FIX] Just add extra check on fs_info->reloc_ctl inside btrfs_update_reloc_root(), before checking fs_info->reloc_ctl->merge_reloc_tree. That DEAD_RELOC_TREE handling is to prevent further modification to the reloc tree during merge stage, but since there is no reloc_ctl at all, we do not need to bother that. Reported-by: syzbot+283673dbc38527ef9f3d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/66f6bfa7.050a0220.38ace9.0019.GAE@google.com/ CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: send: fix invalid clone operation for file that got its size decreased	Filipe Manana
	During an incremental send we may end up sending an invalid clone operation, for the last extent of a file which ends at an unaligned offset that matches the final i_size of the file in the send snapshot, in case the file had its initial size (the size in the parent snapshot) decreased in the send snapshot. In this case the destination will fail to apply the clone operation because its end offset is not sector size aligned and it ends before the current size of the file. Sending the truncate operation always happens when we finish processing an inode, after we process all its extents (and xattrs, names, etc). So fix this by ensuring the file has a valid size before we send a clone operation for an unaligned extent that ends at the final i_size of the file. The size we truncate to matches the start offset of the clone range but it could be any value between that start offset and the final size of the file since the clone operation will expand the i_size if the current size is smaller than the end offset. The start offset of the range was chosen because it's always sector size aligned and avoids a truncation into the middle of a page, which results in dirtying the page due to filling part of it with zeroes and then making the clone operation at the receiver trigger IO. The following test reproduces the issue: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi mkfs.btrfs -f $DEV mount $DEV $MNT # Create a file with a size of 256K + 5 bytes, having two extents, one # with a size of 128K and another one with a size of 128K + 5 bytes. last_ext_size=$((128 * 1024 + 5)) xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \ -c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \ $MNT/foo # Another file which we will later clone foo into, but initially with # a larger size than foo. xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar btrfs subvolume snapshot -r $MNT/ $MNT/snap1 # Now resize bar and clone foo into it. xfs_io -c "truncate 0" \ -c "reflink $MNT/foo" $MNT/bar btrfs subvolume snapshot -r $MNT/ $MNT/snap2 rm -f /tmp/send-full /tmp/send-inc btrfs send -f /tmp/send-full $MNT/snap1 btrfs send -p $MNT/snap1 -f /tmp/send-inc $MNT/snap2 umount $MNT mkfs.btrfs -f $DEV mount $DEV $MNT btrfs receive -f /tmp/send-full $MNT btrfs receive -f /tmp/send-inc $MNT umount $MNT Running it before this patch: $ ./test.sh (...) At subvol snap1 At snapshot snap2 ERROR: failed to clone extents to bar: Invalid argument A test case for fstests will be sent soon. Reported-by: Ben Millwood <thebenmachine@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAJhrHS2z+WViO2h=ojYvBPDLsATwLbg+7JaNCyYomv0fUxEpQQ@mail.gmail.com/ Fixes: 46a6e10a1ab1 ("btrfs: send: allow cloning non-aligned extent if it ends at i_size") CC: stable@vger.kernel.org # 6.11 Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent ↵	Filipe Manana
	event class While running checkpatch.pl against a patch that modifies the btrfs_qgroup_extent event class, it complained about using a comma instead of a semicolon: $ ./scripts/checkpatch.pl qgroups/0003-btrfs-qgroups-remove-bytenr-field-from-struct-btrfs_.patch WARNING: Possible comma where semicolon could be used #215: FILE: include/trace/events/btrfs.h:1720: + __entry->bytenr = bytenr, __entry->num_bytes = rec->num_bytes; total: 0 errors, 1 warnings, 184 lines checked So replace the comma with a semicolon to silence checkpatch and possibly other tools. It also makes the code consistent with the rest. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: drop the backref cache during relocation if we commit	Josef Bacik
	Since the inception of relocation we have maintained the backref cache across transaction commits, updating the backref cache with the new bytenr whenever we COWed blocks that were in the cache, and then updating their bytenr once we detected a transaction id change. This works as long as we're only ever modifying blocks, not changing the structure of the tree. However relocation does in fact change the structure of the tree. For example, if we are relocating a data extent, we will look up all the leaves that point to this data extent. We will then call do_relocation() on each of these leaves, which will COW down to the leaf and then update the file extent location. But, a key feature of do_relocation() is the pending list. This is all the pending nodes that we modified when we updated the file extent item. We will then process all of these blocks via finish_pending_nodes, which calls do_relocation() on all of the nodes that led up to that leaf. The purpose of this is to make sure we don't break sharing unless we absolutely have to. Consider the case that we have 3 snapshots that all point to this leaf through the same nodes, the initial COW would have created a whole new path. If we did this for all 3 snapshots we would end up with 3x the number of nodes we had originally. To avoid this we will cycle through each of the snapshots that point to each of these nodes and update their pointers to point at the new nodes. Once we update the pointer to the new node we will drop the node we removed the link for and all of its children via btrfs_drop_subtree(). This is essentially just btrfs_drop_snapshot(), but for an arbitrary point in the snapshot. The problem with this is that we will never reflect this in the backref cache. If we do this btrfs_drop_snapshot() for a node that is in the backref tree, we will leave the node in the backref tree. This becomes a problem when we change the transid, as now the backref cache has entire subtrees that no longer exist, but exist as if they still are pointed to by the same roots. In the best case scenario you end up with "adding refs to an existing tree ref" errors from insert_inline_extent_backref(), where we attempt to link in nodes on roots that are no longer valid. Worst case you will double free some random block and re-use it when there's still references to the block. This is extremely subtle, and the consequences are quite bad. There isn't a way to make sure our backref cache is consistent between transid's. In order to fix this we need to simply evict the entire backref cache anytime we cross transid's. This reduces performance in that we have to rebuild this backref cache every time we change transid's, but fixes the bug. This has existed since relocation was added, and is a pretty critical bug. There's a lot more cleanup that can be done now that this functionality is going away, but this patch is as small as possible in order to fix the problem and make it easy for us to backport it to all the kernels it needs to be backported to. Followup series will dismantle more of this code and simplify relocation drastically to remove this functionality. We have a reproducer that reproduced the corruption within a few minutes of running. With this patch it survives several iterations/hours of running the reproducer. Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance") CC: stable@vger.kernel.org Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: also add stripe entries for NOCOW writes	Johannes Thumshirn
	NOCOW writes do not generate stripe_extent entries in the RAID stripe tree, as the RAID stripe-tree feature initially was designed with a zoned filesystem in mind and on a zoned filesystem, we do not allow NOCOW writes. But the RAID stripe-tree feature is independent from the zoned feature, so we must also do NOCOW writes for RAID stripe-tree filesystems. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	btrfs: send: fix buffer overflow detection when copying path to cache entry	Filipe Manana
	Starting with commit c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()") we annotated the variable length array "name" from the name_cache_entry structure with __counted_by() to improve overflow detection. However that alone was not correct, because the length of that array does not match the "name_len" field - it matches that plus 1 to include the NUL string terminator, so that makes a fortified kernel think there's an overflow and report a splat like this: strcpy: detected buffer overflow: 20 byte write of buffer size 19 WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50 CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1 Hardware name: CompuLab Ltd. sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018 RIP: 0010:__fortify_report+0x45/0x50 Code: 48 8b 34 (...) RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246 RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027 RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8 RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400 R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8 FS: 00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0 Call Trace: <TASK> ? __warn+0x12a/0x1d0 ? __fortify_report+0x45/0x50 ? report_bug+0x154/0x1c0 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x1a/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? __fortify_report+0x45/0x50 __fortify_panic+0x9/0x10 __get_cur_name_and_parent+0x3bc/0x3c0 get_cur_path+0x207/0x3b0 send_extent_data+0x709/0x10d0 ? find_parent_nodes+0x22df/0x25d0 ? mas_nomem+0x13/0x90 ? mtree_insert_range+0xa5/0x110 ? btrfs_lru_cache_store+0x5f/0x1e0 ? iterate_extent_inodes+0x52d/0x5a0 process_extent+0xa96/0x11a0 ? __pfx_lookup_backref_cache+0x10/0x10 ? __pfx_store_backref_cache+0x10/0x10 ? __pfx_iterate_backrefs+0x10/0x10 ? __pfx_check_extent_item+0x10/0x10 changed_cb+0x6fa/0x930 ? tree_advance+0x362/0x390 ? memcmp_extent_buffer+0xd7/0x160 send_subvol+0xf0a/0x1520 btrfs_ioctl_send+0x106b/0x11d0 ? __pfx___clone_root_cmp_sort+0x10/0x10 _btrfs_ioctl_send+0x1ac/0x240 btrfs_ioctl+0x75b/0x850 __se_sys_ioctl+0xca/0x150 do_syscall_64+0x85/0x160 ? __count_memcg_events+0x69/0x100 ? handle_mm_fault+0x1327/0x15c0 ? __se_sys_rt_sigprocmask+0xf1/0x180 ? syscall_exit_to_user_mode+0x75/0xa0 ? do_syscall_64+0x91/0x160 ? do_user_addr_fault+0x21d/0x630 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fae145eeb4f Code: 00 48 89 (...) RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004 RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927 R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8 R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004 </TASK> Fix this by not storing the NUL string terminator since we don't actually need it for name cache entries, this way "name_len" corresponds to the actual size of the "name" array. This requires marking the "name" array field with __nonstring and using memcpy() instead of strcpy() as recommended by the guidelines at: https://github.com/KSPP/linux/issues/90 Reported-by: David Arendt <admin@prnet.org> Link: https://lore.kernel.org/linux-btrfs/cee4591a-3088-49ba-99b8-d86b4242b8bd@prnet.org/ Fixes: c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()") CC: stable@vger.kernel.org # 6.11 Tested-by: David Arendt <admin@prnet.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-01	drm/panthor: Don't add write fences to the shared BOs	Boris Brezillon
	The only user (the mesa gallium driver) is already assuming explicit synchronization and doing the export/import dance on shared BOs. The only reason we were registering ourselves as writers on external BOs is because Xe, which was the reference back when we developed Panthor, was doing so. Turns out Xe was wrong, and we really want bookkeep on all registered fences, so userspace can explicitly upgrade those to read/write when needed. Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Simona Vetter <simona.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905070155.3254011-1-boris.brezillon@collabora.com
2024-10-01	drm/panthor: Don't declare a queue blocked if deferred operations are pending	Boris Brezillon
	If deferred operations are pending, we want to wait for those to land before declaring the queue blocked on a SYNC_WAIT. We need this to deal with the case where the sync object is signalled through a deferred SYNC_{ADD,SET} from the same queue. If we don't do that and the group gets scheduled out before the deferred SYNC_{SET,ADD} is executed, we'll end up with a timeout, because no external SYNC_{SET,ADD} will make the scheduler reconsider the group for execution. Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905071914.3278599-1-boris.brezillon@collabora.com
2024-10-01	drm/panthor: Fix access to uninitialized variable in tick_ctx_cleanup()	Boris Brezillon
	The group variable can't be used to retrieve ptdev in our second loop, because it points to the previously iterated list_head, not a valid group. Get the ptdev object from the scheduler instead. Cc: <stable@vger.kernel.org> Fixes: d72f049087d4 ("drm/panthor: Allow driver compilation") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202409302306.UDikqa03-lkp@intel.com/ Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930163742.87036-1-boris.brezillon@collabora.com
2024-10-01	drm/panthor: Lock the VM resv before calling drm_gpuvm_bo_obtain_prealloc()	Boris Brezillon
	drm_gpuvm_bo_obtain_prealloc() will call drm_gpuvm_bo_put() on our pre-allocated BO if the <BO,VM> association exists. Given we only have one ref on preallocated_vm_bo, drm_gpuvm_bo_destroy() will be called immediately, and we have to hold the VM resv lock when calling this function. Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240913112722.492144-1-boris.brezillon@collabora.com
2024-10-01	drm/panthor: Add FOP_UNSIGNED_OFFSET to fop_flags	Liviu Dudau
	Since commit 641bb4394f40 ("fs: move FMODE_UNSIGNED_OFFSET to fop_flags") the FMODE_UNSIGNED_OFFSET flag has been moved to fop_flags and renamed, but the patch failed to make the changes for the panthor driver. When user space opens the render node the WARN() added by the patch gets triggered. Fixes: 641bb4394f40 ("fs: move FMODE_UNSIGNED_OFFSET to fop_flags") Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240920102802.2483367-1-liviu.dudau@arm.com