Age | Commit message | Author
2025-02-13 | sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx | Chuyi Zhou

Now when we use scx_bpf_task_cgroup() in ops.tick() to get the cgroup of the current task, the following error will occur:

    scx_foo[3795244] triggered exit kind 1024: runtime error (called on a task not being operated on)

The reason is that we are using SCX_CALL_OP() instead of SCX_CALL_OP_TASK() when calling ops.tick(), which triggers the error during the subsequent scx_kf_allowed_on_arg_tasks() check.

SCX_CALL_OP_TASK() was first introduced in commit 36454023f50b ("sched_ext: Track tasks that are subjects of the in-flight SCX operation") to ensure task's rq lock is held when accessing task's sched_group. Since ops.tick() is marked as SCX_KF_TERMINAL and task_tick_scx() is protected by the rq lock, we can use SCX_CALL_OP_TASK() to avoid the above issue. Similarly, the same changes should be made for ops.disable() and ops.exit_task(), as they are also protected by task_rq_lock() and it's safe to access the task's task_group.

Fixes: 36454023f50b ("sched_ext: Track tasks that are subjects of the in-flight SCX operation")
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13 | sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h. | Chuyi Zhou

Now BPF only supports the bpf_list_push_{front,back}_impl kfuncs, not bpf_list_push_{front,back}. This patch fixes this issue. Without this patch, if we use a bpf_list kfunc in scx, the BPF verifier would complain:

    libbpf: extern (func ksym) 'bpf_list_push_back': not found in kernel or module BTFs
    libbpf: failed to load object 'scx_foo'
    libbpf: failed to load BPF skeleton 'scx_foo': -EINVAL

With this patch, the bpf list kfunc will work as expected.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Fixes: 2a52ca7c98960 ("sched_ext: Add scx_simple and scx_example_qmap example schedulers")
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13 | sched_ext: selftests: Fix grammar in tests description | Devaansh Kumar

Fixed grammar for a few tests of sched_ext.

Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-10 | sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq() | Tejun Heo

While fixing migration disabled task handling, 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches") assumed that a migration disabled task's ->cpus_ptr would only have the pinned CPU. While this is eventually true for migration disabled tasks that are switched out, ->cpus_ptr update is performed by migrate_disable_switch() which is called right before context_switch() in __schedule(). However, the task is enqueued earlier during pick_next_task() via put_prev_task_scx(), so there is a race window where another CPU can see the task on a DSQ.

If the CPU tries to dispatch the migration disabled task while in that window, task_allowed_on_cpu() will succeed and task_can_run_on_remote_rq() will subsequently trigger SCHED_WARN(is_migration_disabled()).

    WARNING: CPU: 8 PID: 1837 at kernel/sched/ext.c:2466 task_can_run_on_remote_rq+0x12e/0x140
    Sched_ext: layered (enabled+all), task: runnable_at=-10ms
    RIP: 0010:task_can_run_on_remote_rq+0x12e/0x140
    ...
    <TASK>
    consume_dispatch_q+0xab/0x220
    scx_bpf_dsq_move_to_local+0x58/0xd0
    bpf_prog_84dd17b0654b6cf0_layered_dispatch+0x290/0x1cfa
    bpf__sched_ext_ops_dispatch+0x4b/0xab
    balance_one+0x1fe/0x3b0
    balance_scx+0x61/0x1d0
    prev_balance+0x46/0xc0
    __pick_next_task+0x73/0x1c0
    __schedule+0x206/0x1730
    schedule+0x3a/0x160
    __do_sys_sched_yield+0xe/0x20
    do_syscall_64+0xbb/0x1e0
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fix it by converting the SCHED_WARN() back to a regular failure path. Also, perform the migration disabled test before task_allowed_on_cpu() test so that BPF schedulers which fail to handle migration disabled tasks can be noticed easily.

While at it, adjust scx_ops_error() message for !task_allowed_on_cpu() case for brevity and consistency.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches")
Acked-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Jake Hillion <jakehillion@meta.com>
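The check ordering this commit describes can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct model_task`, `enum model_verdict`, and `model_can_run_on_remote_cpu()` are hypothetical names invented for illustration, a 64-bit bitmap stands in for ->cpus_ptr, and none of it is the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the task state discussed in the commit. */
struct model_task {
	uint64_t cpus_mask;       /* models ->cpus_ptr, one bit per CPU (CPUs 0..63) */
	bool migration_disabled;  /* models is_migration_disabled(p) */
};

enum model_verdict {
	MODEL_OK,                  /* task may be moved to the target CPU */
	MODEL_MIGRATION_DISABLED,  /* rejected: task is pinned to its current CPU */
	MODEL_CPU_NOT_ALLOWED,     /* rejected: target CPU is not in the mask */
};

/*
 * Test migration-disabled first, then the allowed-CPU mask. During the race
 * window the mask may still contain the target CPU, so a mask-only check
 * would let a pinned task through.
 */
static enum model_verdict
model_can_run_on_remote_cpu(const struct model_task *p, int target_cpu)
{
	if (p->migration_disabled)
		return MODEL_MIGRATION_DISABLED;
	if (!(p->cpus_mask & (UINT64_C(1) << target_cpu)))
		return MODEL_CPU_NOT_ALLOWED;
	return MODEL_OK;
}
```

In this model, a task that is migration disabled but whose mask has not yet been narrowed is rejected as pinned rather than passing the mask test and surprising a later assertion.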
2025-02-08 | sched_ext: Fix migration disabled handling in targeted dispatches | Tejun Heo

A dispatch operation that can target a specific local DSQ - scx_bpf_dsq_move_to_local() or scx_bpf_dsq_move() - checks whether the task can be migrated to the target CPU using task_can_run_on_remote_rq(). If the task can't be migrated to the targeted CPU, it is bounced through a global DSQ.

task_can_run_on_remote_rq() assumes that the task is on a CPU that's different from the targeted CPU but the callers don't uphold the assumption and may call the function when the task is already on the target CPU. When such a task has migration disabled, task_can_run_on_remote_rq() ends up returning %false incorrectly, unnecessarily bouncing the task to a global DSQ.

Fix it by updating the callers to only call task_can_run_on_remote_rq() when the task is on a different CPU than the target CPU. As this is a bit subtle, for clarity and documentation:

- Make task_can_run_on_remote_rq() trigger SCHED_WARN_ON() if the task is on the same CPU as the target CPU.
- is_migration_disabled() test in task_can_run_on_remote_rq() cannot trigger if the task is on a different CPU than the target CPU as the preceding task_allowed_on_cpu() test should fail beforehand. Convert the test into SCHED_WARN_ON().

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Fixes: 0366017e0973 ("sched_ext: Use task_can_run_on_remote_rq() test in dispatch_to_local_dsq()")
Cc: stable@vger.kernel.org # v6.12+
2025-02-08 | sched_ext: Implement auto local dispatching of migration disabled tasks | Tejun Heo

Migration disabled tasks are special and pinned to their previous CPUs. They tripped up some unsuspecting BPF schedulers as their ->nr_cpus_allowed may not agree with the bits set in ->cpus_ptr. Make it easier for BPF schedulers by automatically dispatching them to the pinned local DSQs by default. If a BPF scheduler wants to handle migration disabled tasks explicitly, it can set SCX_OPS_ENQ_MIGRATION_DISABLED.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
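The default-versus-opt-in behaviour described in this commit can be sketched as a tiny decision function. The names below (`MODEL_OPS_ENQ_MIGRATION_DISABLED`, `enum model_enq_path`, `model_enqueue_path()`) are hypothetical and the flag's bit position is arbitrary; the sketch only mirrors the rule stated in the commit message, not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; the real SCX_OPS_ENQ_MIGRATION_DISABLED value is not reproduced here. */
#define MODEL_OPS_ENQ_MIGRATION_DISABLED (UINT64_C(1) << 0)

enum model_enq_path {
	MODEL_ENQ_LOCAL_DSQ,  /* dispatched automatically to the pinned CPU's local DSQ */
	MODEL_ENQ_BPF,        /* handed to the BPF scheduler's enqueue path */
};

/*
 * A migration disabled task bypasses the BPF scheduler unless the scheduler
 * opted in to handling such tasks itself.
 */
static enum model_enq_path
model_enqueue_path(uint64_t ops_flags, bool migration_disabled)
{
	if (migration_disabled && !(ops_flags & MODEL_OPS_ENQ_MIGRATION_DISABLED))
		return MODEL_ENQ_LOCAL_DSQ;
	return MODEL_ENQ_BPF;
}
```

A scheduler that reasons about ->nr_cpus_allowed would leave the flag unset in this model and never see pinned tasks; one that sets the flag takes responsibility for them.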
2025-02-02 | sched_ext: Fix incorrect time delta calculation in time_delta() | Changwoo Min

When (s64)(after - before) > 0, the code returns the result of (s64)(after - before) > 0 while the intended result should be (s64)(after - before). That happens because the middle operand of the ternary operator was omitted incorrectly, returning the result of (s64)(after - before) > 0. Thus, add the middle operand -- (s64)(after - before) -- to return the correct time calculation.

Fixes: d07be814fc71 ("sched_ext: Add time helpers for BPF schedulers")
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
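The effect of the omitted middle operand can be reproduced in plain C. The sketch below is an assumption-laden illustration, not the helper's source: `time_delta_buggy()` and `time_delta_fixed()` are names invented here, and the buggy variant spells out what an omitted middle operand evaluates to (the value of the condition itself) so that it compiles without the GNU extension.

```c
#include <stdint.h>

typedef int64_t s64;
typedef uint64_t u64;

/*
 * Models the buggy form. With the middle operand omitted, the expression
 * yields the value of the condition, i.e. 1 when the comparison is true.
 */
static inline s64 time_delta_buggy(u64 after, u64 before)
{
	s64 cond = (s64)(after - before) > 0;	/* 1 or 0, not a time delta */

	return cond ? cond : 0;
}

/* Models the fixed form: the middle operand is the delta itself. */
static inline s64 time_delta_fixed(u64 after, u64 before)
{
	return (s64)(after - before) > 0 ? (s64)(after - before) : 0;
}
```

With after = 150 and before = 100, the buggy model returns 1 while the fixed model returns 50; both return 0 when `after` precedes `before`.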
2025-01-27 | sched_ext: Fix lock imbalance in dispatch_to_local_dsq() | Andrea Righi

While performing the rq locking dance in dispatch_to_local_dsq(), we may trigger the following lock imbalance condition, in particular when multiple tasks are rapidly changing CPU affinity (i.e., running a `stress-ng --race-sched 0`):

    [ 13.413579] =====================================
    [ 13.413660] WARNING: bad unlock balance detected!
    [ 13.413729] 6.13.0-virtme #15 Not tainted
    [ 13.413792] -------------------------------------
    [ 13.413859] kworker/1:1/80 is trying to release lock (&rq->__lock) at:
    [ 13.413954] [<ffffffff873c6c48>] dispatch_to_local_dsq+0x108/0x1a0
    [ 13.414111] but there are no more locks to release!
    [ 13.414176]
    [ 13.414176] other info that might help us debug this:
    [ 13.414258] 1 lock held by kworker/1:1/80:
    [ 13.414318] #0: ffff8b66feb41698 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x20/0x90
    [ 13.414612]
    [ 13.414612] stack backtrace:
    [ 13.415255] CPU: 1 UID: 0 PID: 80 Comm: kworker/1:1 Not tainted 6.13.0-virtme #15
    [ 13.415505] Workqueue: 0x0 (events)
    [ 13.415567] Sched_ext: dsp_local_on (enabled+all), task: runnable_at=-2ms
    [ 13.415570] Call Trace:
    [ 13.415700] <TASK>
    [ 13.415744] dump_stack_lvl+0x78/0xe0
    [ 13.415806] ? dispatch_to_local_dsq+0x108/0x1a0
    [ 13.415884] print_unlock_imbalance_bug+0x11b/0x130
    [ 13.415965] ? dispatch_to_local_dsq+0x108/0x1a0
    [ 13.416226] lock_release+0x231/0x2c0
    [ 13.416326] _raw_spin_unlock+0x1b/0x40
    [ 13.416422] dispatch_to_local_dsq+0x108/0x1a0
    [ 13.416554] flush_dispatch_buf+0x199/0x1d0
    [ 13.416652] balance_one+0x194/0x370
    [ 13.416751] balance_scx+0x61/0x1e0
    [ 13.416848] prev_balance+0x43/0xb0
    [ 13.416947] __pick_next_task+0x6b/0x1b0
    [ 13.417052] __schedule+0x20d/0x1740

This happens because dispatch_to_local_dsq() is racing with dispatch_dequeue() and, when the latter wins, we incorrectly assume that the task has been moved to dst_rq.

Fix by properly tracking the currently locked rq.

Fixes: 4d3ca89bdd31 ("sched_ext: Refactor consume_remote_task()")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
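The idea of tracking the currently locked rq can be shown with a toy, single-threaded model. Everything here is an assumption made for illustration: `struct model_rq`, `model_lock()`, `model_unlock()`, and `model_dispatch()` are hypothetical, locks are plain flags, and the control flow is a simplification of the scenario in the commit message rather than the kernel function.

```c
#include <stdbool.h>

/* Toy runqueue: 'held' is 1 while the model lock is taken. */
struct model_rq {
	int held;
};

static void model_lock(struct model_rq *rq)
{
	rq->held = 1;
}

/* Returns false on an unbalanced unlock (releasing a lock that is not held). */
static bool model_unlock(struct model_rq *rq)
{
	if (!rq->held)
		return false;
	rq->held = 0;
	return true;
}

/*
 * The caller enters with @rq locked and expects it locked again on return.
 * @lost_race: the dequeue side won, so the task never moved to @dst_rq.
 * @track_locked_rq: release whatever is actually locked (fixed behaviour)
 * instead of assuming @dst_rq is locked (buggy behaviour).
 * Returns true if every unlock was balanced.
 */
static bool model_dispatch(struct model_rq *rq, struct model_rq *src_rq,
			   struct model_rq *dst_rq, bool lost_race,
			   bool track_locked_rq)
{
	struct model_rq *locked_rq = rq;
	struct model_rq *to_release;
	bool balanced = true;

	/* Switch from the current rq to the task's source rq. */
	if (src_rq != locked_rq) {
		balanced &= model_unlock(locked_rq);
		model_lock(src_rq);
		locked_rq = src_rq;
	}

	/* The move to dst_rq only happens if the race was not lost. */
	if (!lost_race && dst_rq != locked_rq) {
		balanced &= model_unlock(locked_rq);
		model_lock(dst_rq);
		locked_rq = dst_rq;
	}

	/* Switch back to the rq the caller expects to hold. */
	to_release = track_locked_rq ? locked_rq : dst_rq;
	if (to_release != rq) {
		balanced &= model_unlock(to_release);
		model_lock(rq);
	}
	return balanced;
}
```

With three distinct runqueues and `lost_race` set, the untracked variant tries to release a lock it never took and leaves the source rq locked, which is the kind of imbalance lockdep reports above; the tracked variant stays balanced.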
2025-01-27 | sched_ext: selftests/dsp_local_on: Fix selftest on UP systems | Andrea Righi

In UP systems p->migration_disabled is not available. Fix this by using the portable helper is_migration_disabled(p).

Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 | tools/sched_ext: Add helper to check task migration state | Andrea Righi

Introduce a new helper for BPF schedulers to determine whether a task can migrate or not (supporting both SMP and UP systems).

Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27 | sched_ext: Fix incorrect autogroup migration detection | Tejun Heo

scx_move_task() is called from sched_move_task() and tells the BPF scheduler that cgroup migration is being committed. sched_move_task() is used by both cgroup and autogroup migrations and scx_move_task() tried to filter out autogroup migrations by testing the destination cgroup and PF_EXITING but this is not enough. In fact, without explicitly tagging the thread which is doing the cgroup migration, there is no good way to tell apart scx_move_task() invocations for racing migration to the root cgroup and an autogroup migration.

This led to scx_move_task() incorrectly ignoring a migration from non-root cgroup to an autogroup of the root cgroup triggering the following warning:

    WARNING: CPU: 7 PID: 1 at kernel/sched/ext.c:3725 scx_cgroup_can_attach+0x196/0x340
    ...
    Call Trace:
    <TASK>
    cgroup_migrate_execute+0x5b1/0x700
    cgroup_attach_task+0x296/0x400
    __cgroup_procs_write+0x128/0x140
    cgroup_procs_write+0x17/0x30
    kernfs_fop_write_iter+0x141/0x1f0
    vfs_write+0x31d/0x4a0
    __x64_sys_write+0x72/0xf0
    do_syscall_64+0x82/0x160
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fix it by adding an argument to sched_move_task() that indicates whether the moving is for a cgroup or autogroup migration. After the change, scx_move_task() is called only for cgroup migrations and renamed to scx_cgroup_move_task().

Link: https://github.com/sched-ext/scx/issues/370
Fixes: 819513666966 ("sched_ext: Add cgroup support")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 | sched_ext: selftests/dsp_local_on: Fix sporadic failures | Tejun Heo

dsp_local_on has several incorrect assumptions, one of which is that p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task is scheduled out while migration is disabled - p->cpus_ptr is temporarily overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.

This led to sporadic test failures when dsp_local_on_dispatch() tries to put a migration disabled task to a different CPU. Fix it by keeping the previous CPU when migration is disabled.

There are SCX schedulers that make use of p->nr_cpus_allowed. They should also implement explicit handling for p->migration_disabled.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Changwoo Min <changwoo@igalia.com>
2025-01-24 | selftests/sched_ext: Fix enum resolution | Andrea Righi

All scx enums are now automatically generated from vmlinux.h and they must be initialized using the SCX_ENUM_INIT() macro. Fix the scx selftests to use this macro to properly initialize these values.

Fixes: 8da7bf2cee27 ("tools/sched_ext: Receive updates from SCX repo")
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 | sched_ext: Include task weight in the error state dump | Andrea Righi

Report the task weight when dumping the task state during an error exit. Moreover, adjust the output format to display dsq_vtime, slice, and weight on the same line. This can help identify whether certain tasks were excessively prioritized or de-prioritized due to large niceness gaps.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 | sched_ext: Fixes typos in comments | Atul Kumar Pant

Fixes some spelling errors in the comments.

Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24 | Merge tag 'auxdisplay-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay | Linus Torvalds

Pull auxdisplay updates from Andy Shevchenko:

 - A couple of cleanups to img-ascii-lcd driver

* tag 'auxdisplay-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
  auxdisplay: img-ascii-lcd: Constify struct img_ascii_lcd_config
  auxdisplay: img-ascii-lcd: Remove an unused field in struct img_ascii_lcd_ctx
2025-01-24 | Merge tag 'sound-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound | Linus Torvalds

Pull sound updates from Takashi Iwai:
 "This was a relatively calm cycle, and most of the changes are rather small device-specific fixes. Here are highlights:

  Core:
   - Further enhancements of ALSA rawmidi and sequencer APIs for MIDI 2.0
   - compress-offload API extensions for ASRC support

  ASoC:
   - Allow clocking on each DAI in an audio graph card to be configured separately
   - Improved power management for Renesas RZ-SSI
   - KUnit testing for the Cirrus DSP framework
   - Memory to memory operation support for Freescale/NXP platforms
   - Support for pause operations in SOF
   - Support for Allwinner suniv F1C100s, Awinic AW88083, Realtek ALC5682I-VE

  HD- and USB-audio:
   - Add support for Focusrite Scarlett 4th Gen 16i16, 18i16, and 18i20 interfaces via new FCP driver
   - TAS2781 SPI HD-audio sub-codec support
   - Various device-specific quirks as usual"

* tag 'sound-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (235 commits)
  ALSA: hda: tas2781-spi: Fix bogus error handling in tas2781_hda_spi_probe()
  ALSA: hda: tas2781-spi: Fix error code in tas2781_read_acpi()
  ALSA: hda: tas2781-spi: Delete some dead code
  ALSA: usb: fcp: Fix return code from poll ops
  ALSA: usb: fcp: Fix incorrect resp->opcode retrieval
  ALSA: usb: fcp: Fix meter_levels type to __le32
  ALSA: hda/realtek: Enable Mute LED on HP Laptop 14s-fq1xxx
  ALSA: hda: tas2781-spi: Fix -Wsometimes-uninitialized in tasdevice_spi_switch_book()
  ALSA: ctxfi: Simplify dao_clear_{left,right}_input() functions
  ALSA: hda: tas2781-spi: select CRC32 instead of CRC32_SARWATE
  ALSA: usb: fcp: Fix hwdep read ops types
  ALSA: scarlett2: Add device_setup option to use FCP driver
  ALSA: FCP: Add Focusrite Control Protocol driver
  ALSA: hda/tas2781: Add tas2781 hda SPI driver
  ALSA: hda/realtek - Fixed headphone distorted sound on Acer Aspire A115-31 laptop
  ASoC: xilinx: xlnx_spdif: Simpify using devm_clk_get_enabled()
  ALSA: hda: Support for Ideapad hotkey mute LEDs
  ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83JX, 83MC and 83NM
  ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83LC
  ASoC: dapm: add support for preparing streams
  ...
2025-01-24 | Merge tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 | Linus Torvalds

Pull crypto updates from Herbert Xu:
 "API:
   - Remove physical address skcipher walking
   - Fix boot-up self-test race

  Algorithms:
   - Optimisations for x86/aes-gcm
   - Optimisations for x86/aes-xts
   - Remove VMAC
   - Remove keywrap

  Drivers:
   - Remove n2

  Others:
   - Fixes for padata UAF
   - Fix potential rhashtable deadlock by moving schedule_work outside lock"

* tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (75 commits)
  rhashtable: Fix rhashtable_try_insert test
  dt-bindings: crypto: qcom,inline-crypto-engine: Document the SM8750 ICE
  dt-bindings: crypto: qcom,prng: Document SM8750 RNG
  dt-bindings: crypto: qcom-qce: Document the SM8750 crypto engine
  crypto: asymmetric_keys - Remove unused key_being_used_for[]
  padata: avoid UAF for reorder_work
  padata: fix UAF in padata_reorder
  padata: add pd get/put refcnt helper
  crypto: skcipher - call cond_resched() directly
  crypto: skcipher - optimize initializing skcipher_walk fields
  crypto: skcipher - clean up initialization of skcipher_walk::flags
  crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt()
  crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW
  crypto: skcipher - remove redundant clamping to page size
  crypto: skcipher - remove unnecessary page alignment of bounce buffer
  crypto: skcipher - document skcipher_walk_done() and rename some vars
  crypto: omap - switch from scatter_walk to plain offset
  crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data
  crypto: bcm - Drop unused setting of local 'ptr' variable
  crypto: hisilicon/qm - support new function communication
  ...
2025-01-24 | Merge tag 'tpmdd-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd | Linus Torvalds

Pull TPM update from Jarkko Sakkinen.

* tag 'tpmdd-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Change to kvalloc() in eventlog/acpi.c
2025-01-24 | Merge tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm | Linus Torvalds

Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add support for naming idlestates through DT

  pmdomain providers:
   - arm: Explicitly request the current state at init for the SCMI PM domain
   - mediatek: Add Airoha CPU PM Domain support for CPU frequency scaling
   - ti: Add per-device latency constraint management to the ti_sci PM domain

  cpuidle-psci:
   - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"

* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
  pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
  pmdomain: airoha: Add Airoha CPU PM Domain support
  pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
  pmdomain: ti_sci: add wakeup constraint management
  pmdomain: ti_sci: add per-device latency constraint management
  pmdomain: imx-gpcv2: Suppress bind attrs
  pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
  pmdomain: core: Support naming idle states
  dt-bindings: power: domain-idle-state: Allow idle-state-name
  cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
2025-01-24 | Merge tag 'pinctrl-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl | Linus Torvalds

Pull pin control updates from Linus Walleij:
 "No core changes this time

  New drivers:
   - New subdriver for the Qualcomm MSM8917 SoC TLMM
   - New subdriver for the Mediatek MT7988 SoC
   - New subdriver for the Rockchip RK3562 SoC
   - New subdriver for the Renesas RZ/G3E SoC

  Improvements:
   - Fix some missing pins in the Qualcomm IPQ5424 TLMM
   - Fix some missing LVDS pins in the Sunxi A100/A133
   - Support Sunxi V853 (simple compatible string)
   - Cleanups in the Samsung driver
   - Fix some AMD suspend behaviour
   - Cleanups"

* tag 'pinctrl-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (29 commits)
  dt-bindings: pinctrl: sunxi: add compatible for V853
  pinctrl: Use str_enable_disable-like helpers
  dt-bindings: pinctrl: Correct indentation and style in DTS example
  pinctrl: amd: Take suspend type into consideration which pins are non-wake
  pinctrl: stm32: Add check for clk_enable()
  pinctrl: renesas: rzg2l: Fix PFC_MASK for RZ/V2H and RZ/G3E
  pinctrl: sunxi: add missed lvds pins for a100/a133
  pinctrl: mediatek: Drop mtk_pinconf_bias_set_pd()
  pinctrl: renesas: rzg2l: Add support for RZ/G3E SoC
  pinctrl: renesas: rzg2l: Update r9a09g057_variable_pin_cfg table
  dt-bindings: pinctrl: renesas: Document RZ/G3E SoC
  dt-bindings: pinctrl: renesas: Add alpha-numerical port support for RZ/V2H
  pinctrl: rockchip: add rk3562 support
  dt-bindings: pinctrl: Add rk3562 pinctrl support
  pinctrl: Fix the clean up on pinconf_apply_setting failure
  dt-bindings: pinctrl: add binding for MT7988 SoC
  pinctrl: mediatek: add MT7988 pinctrl driver
  pinctrl: mediatek: add support for MTK_PULL_PD_TYPE
  pinctrl: ocelot: Constify some structures
  pinctrl: renesas: rzg2l: Add audio clock pins on RZ/G3S
  ...
2025-01-24 | Merge tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux | Linus Torvalds

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - PASID support for the blocked_domain

  ARM-SMMU Updates:
   - SMMUv2:
      - Implement per-client prefetcher configuration on Qualcomm SoCs
      - Support for the Adreno SMMU on Qualcomm's SDM670 SOC
   - SMMUv3:
      - Pretty-printing of event records
      - Drop the ->domain_alloc_paging implementation in favour of domain_alloc_paging_flags(flags==0)
   - IO-PGTable:
      - Generalisation of the page-table walker to enable external walkers (e.g. for debugging unexpected page-faults from the GPU)
      - Minor fix for handling concatenated PGDs at stage-2 with 16KiB pages
   - Misc:
      - Clean-up device probing and replace the crufty probe-deferral hack with a more robust implementation of arm_smmu_get_by_fwnode()
      - Device-tree binding updates for a bunch of Qualcomm platforms

  Intel VT-d Updates:
   - Remove domain_alloc_paging()
   - Remove capability audit code
   - Draining PRQ in sva unbind path when FPD bit set
   - Link cache tags of same iommu unit together

  AMD-Vi Updates:
   - Use CMPXCHG128 to update DTE
   - Cleanups of the domain_alloc_paging() path

  RiscV IOMMU:
   - Platform MSI support
   - Shutdown support

  Rockchip IOMMU:
   - Add DT bindings for Rockchip RK3576

  More smaller fixes and cleanups"

* tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits)
  iommu: Use str_enable_disable-like helpers
  iommu/amd: Fully decode all combinations of alloc_paging_flags
  iommu/amd: Move the nid to pdom_setup_pgtable()
  iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
  iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
  iommu/amd: Remove dev == NULL checks
  iommu/amd: Remove domain_alloc()
  iommu/amd: Remove unused amd_iommu_domain_update()
  iommu/riscv: Fixup compile warning
  iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h
  iommu/arm-smmu-v3: Use str_read_write helper w/ logs
  iommu/io-pgtable-arm: Add way to debug pgtable walk
  iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys
  iommu/io-pgtable-arm: Make pgtable walker more generic
  iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500
  iommu/arm-smmu: Introduce ACTLR custom prefetcher settings
  iommu/arm-smmu: Add support for PRR bit setup
  iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer
  iommu/arm-smmu: Re-enable context caching in smmu reset operation
  iommu/vt-d: Link cache tags of same iommu unit together
  ...
2025-01-24 | Merge tag 'platform-drivers-x86-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 | Linus Torvalds

Pull x86 platform driver updates from Ilpo Järvinen:
 "acer-wmi:
   - Add support for PH14-51, PH16-72, and Nitro AN515-58
   - Add proper hwmon support
   - Improve error handling when reading "gaming system info"
   - Replace direct EC reads for the current platform profile with WMI calls to handle EC address variations
   - Replace custom platform_profile cycling with the generic one

  ACPI:
   - platform_profile: Major refactoring and improvements
   - Support registering multiple platform_profile handlers concurrently to avoid the need to quirk which handler takes precedence
   - Support reporting "custom" profile for cases where the current profile is ambiguous or when settings tweaks are done outside the pre-defined profile
   - Abstract and layer platform_profile API better using the class_dev and drvdata
   - Various minor improvements
   - Add Documentation and kerneldoc

  amd/hsmp:
   - Add support for HSMP protocol v7

  amd/pmc:
   - Support AMD 1Ah family 70h
   - Support STB with Ryzen desktop SoCs

  amd/pmf:
   - Support Custom BIOS inputs for PMF TA
   - Support passing SRA sensor data from AMD SFH (HID) to PMF TA

  dell-smo8800:
   - Move SMO88xx quirk away from the generic i2c-i801 driver
   - Add accelerometer support for Dell Latitude E6330/E6430 and XPS 9550
   - Support probing accelerometer for models yet to be listed in the DMI mapping table because ACPI lacks i2c-address for the accelerometer (behind a module parameter because probing might be dangerous)

  HID:
   - amd_sfh: Add support for exporting SRA sensor data

  hp-wmi:
   - Add fan and thermal support for Victus 16-s1000

  input:
   - Add key for phone linking
   - i8042: Add context for the i8042 filter to enable cleaning up the filter related global variables from pdx86 drivers

  lenovo-wmi-camera:
   - Use SW_CAMERA_LENS_COVER instead of KEY_CAMERA_ACCESS

  mellanox mlxbf-pmc:
   - Add support for monitoring cycle count
   - Add Documentation

  thinkpad_acpi:
   - Add support for phone link key

  tools/power/x86/intel-speed-select:
   - Fix Turbo Ratio Limit restore

  x86-android-tables:
   - Add support for Vexia EDU ATLA 10 Bluetooth and EC battery driver

  And miscellaneous cleanups / refactoring / improvements"

* tag 'platform-drivers-x86-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (133 commits)
  platform/x86: acer-wmi: Fix initialization of last_non_turbo_profile
  platform/x86: acer-wmi: Ignore AC events
  platform/mellanox: mlxreg-io: use sysfs_emit() instead of sprintf()
  platform/mellanox: mlxreg-hotplug: use sysfs_emit() instead of sprintf()
  platform/mellanox: mlxbf-bootctl: use sysfs_emit() instead of sprintf()
  platform/x86: hp-wmi: Add fan and thermal profile support for Victus 16-s1000
  ACPI: platform_profile: Add a prefix to log messages
  ACPI: platform_profile: Add documentation
  ACPI: platform_profile: Clean platform_profile_handler
  ACPI: platform_profile: Move platform_profile_handler
  ACPI: platform_profile: Remove platform_profile_handler from exported symbols
  platform/x86: thinkpad_acpi: Use devm_platform_profile_register()
  platform/x86: inspur_platform_profile: Use devm_platform_profile_register()
  platform/x86: hp-wmi: Use devm_platform_profile_register()
  platform/x86: ideapad-laptop: Use devm_platform_profile_register()
  platform/x86: dell-pc: Use devm_platform_profile_register()
  platform/x86: asus-wmi: Use devm_platform_profile_register()
  platform/x86: amd: pmf: sps: Use devm_platform_profile_register()
  platform/x86: acer-wmi: Use devm_platform_profile_register()
  platform/surface: surface_platform_profile: Use devm_platform_profile_register()
  ...
2025-01-24 | Merge tag 'x86_tdx_for_6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds

Pull x86 TDX updates from Dave Hansen:
 "Intel Trust Domain updates.

  The existing TDX code needs a _bit_ of metadata from the TDX module. But KVM is going to need a bunch more very shortly. Rework the interface with the TDX module to be more consistent and handle the new higher volume.

  The TDX module has added a few new features. The first is a promise not to clobber RBP under any circumstances. Basically the kernel now will refuse to use any modules that don't have this promise. Second, enable the new "REDUCE_VE" feature. This ensures that the TDX module will not send some silly virtualization exceptions that the guest had no good way to handle anyway.

   - Centralize global metadata infrastructure
   - Use new TDX module features for exception suppression and RBP clobbering"

* tag 'x86_tdx_for_6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation
  x86/virt/tdx: Switch to use auto-generated global metadata reading code
  x86/virt/tdx: Use dedicated struct members for PAMT entry sizes
  x86/virt/tdx: Use auto-generated code to read global metadata
  x86/virt/tdx: Start to track all global metadata in one structure
  x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better
  x86/tdx: Dump attributes and TD_CTLS on boot
  x86/tdx: Disable unnecessary virtualization exceptions
2025-01-24 | Merge tag 'x86-boot-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds

Pull x86 boot updates from Ingo Molnar:

 - A large and involved preparatory series to pave the way to add exception handling for relocate_kernel - which will be a debugging facility that has aided in the field to debug an exceptionally hard to debug early boot bug. Plus assorted cleanups and fixes that were discovered along the way, by David Woodhouse:
     - Clean up and document register use in relocate_kernel_64.S
     - Use named labels in swap_pages in relocate_kernel_64.S
     - Only swap pages for ::preserve_context mode
     - Allocate PGD for x86_64 transition page tables separately
     - Copy control page into place in machine_kexec_prepare()
     - Invoke copy of relocate_kernel() instead of the original
     - Move relocate_kernel to kernel .data section
     - Add data section to relocate_kernel
     - Drop page_list argument from relocate_kernel()
     - Eliminate writes through kernel mapping of relocate_kernel page
     - Clean up register usage in relocate_kernel()
     - Mark relocate_kernel page as ROX instead of RWX
     - Disable global pages before writing to control page
     - Ensure preserve_context flag is set on return to kernel
     - Use correct swap page in swap_pages function
     - Fix stack and handling of re-entry point for ::preserve_context
     - Mark machine_kexec() with __nocfi
     - Cope with relocate_kernel() not being at the start of the page
     - Use typedef for relocate_kernel_fn function prototype
     - Fix location of relocate_kernel with -ffunction-sections (fix by Nathan Chancellor)

 - A series to remove the last remaining absolute symbol references from .head.text, and enforce this at build time, by Ard Biesheuvel:
     - Avoid WARN()s and panic()s in early boot code
     - Don't hang but terminate on failure to remap SVSM CA
     - Determine VA/PA offset before entering C code
     - Avoid intentional absolute symbol references in .head.text
     - Disable UBSAN in early boot code
     - Move ENTRY_TEXT to the start of the image
     - Move .head.text into its own output section
     - Reject absolute references in .head.text

 - The above build-time enforcement uncovered a handful of bugs of essentially non-working code, and a workaround for a toolchain bug, fixed by Ard Biesheuvel as well:
     - Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
     - Disable UBSAN on SEV code that may execute very early
     - Disable ftrace branch profiling in SEV startup code

 - And miscellaneous cleanups:
     - kexec_core: Add and update comments regarding the KEXEC_JUMP flow (Rafael J. Wysocki)
     - x86/sysfs: Constify 'struct bin_attribute' (Thomas Weißschuh)

* tag 'x86-boot-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  x86/sev: Disable ftrace branch profiling in SEV startup code
  x86/kexec: Use typedef for relocate_kernel_fn function prototype
  x86/kexec: Cope with relocate_kernel() not being at the start of the page
  kexec_core: Add and update comments regarding the KEXEC_JUMP flow
  x86/kexec: Mark machine_kexec() with __nocfi
  x86/kexec: Fix location of relocate_kernel with -ffunction-sections
  x86/kexec: Fix stack and handling of re-entry point for ::preserve_context
  x86/kexec: Use correct swap page in swap_pages function
  x86/kexec: Ensure preserve_context flag is set on return to kernel
  x86/kexec: Disable global pages before writing to control page
  x86/sev: Don't hang but terminate on failure to remap SVSM CA
  x86/sev: Disable UBSAN on SEV code that may execute very early
  x86/boot/64: Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
  x86/sysfs: Constify 'struct bin_attribute'
  x86/kexec: Mark relocate_kernel page as ROX instead of RWX
  x86/kexec: Clean up register usage in relocate_kernel()
  x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page
  x86/kexec: Drop page_list argument from relocate_kernel()
  x86/kexec: Add data section to relocate_kernel
  x86/kexec: Move relocate_kernel to kernel .data section
  ...
2025-01-24Merge tag 'perf-tools-for-v6.14-2025-01-21' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf-tools updates from Namhyung Kim: "There are a lot of changes in the perf tools in this cycle. build: - Use generic syscall table to generate syscall numbers on supported archs - This also enables to get rid of libaudit which was used for syscall numbers - Remove python2 support as it's deprecated for years - Fix issues on static build with libzstd perf record: - Intel-PT supports "aux-action" config term to pause or resume tracing in the aux-buffer. Users can start the intel_pt event as "started-paused" and configure other events to control the Intel-PT tracing: # perf record --kcore -e intel_pt/aux-action=start-paused/ \ -e syscalls:sys_enter_newuname/aux-action=resume/ \ -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname This requires kernel support (which was added in v6.13) perf lock: - 'perf lock contention' command has an ability to symbolize locks in dynamically allocated objects using slab cache name when it runs with BPF. Those dynamic locks would have "&" prefix in the name to distinguish them from ordinary (static) locks # perf lock con -abl -E 5 sleep 1 contended total wait max wait avg wait address symbol 2 1.95 us 1.77 us 975 ns ffff9d5e852d3498 &task_struct (mutex) 1 1.18 us 1.18 us 1.18 us ffff9d5e852d3538 &task_struct (mutex) 4 1.12 us 354 ns 279 ns ffff9d5e841ca800 &kmalloc-cg-512 (mutex) 2 859 ns 617 ns 429 ns ffffffffa41c3620 delayed_uprobe_lock (mutex) 3 691 ns 388 ns 230 ns ffffffffa41c0940 pack_mutex (mutex) This also requires kernel/BPF support (which was added in v6.13) perf ftrace: - 'perf ftrace latency' command gets a couple of options to support linear buckets instead of exponential. 
Also it's possible to specify max and min latency for the linear buckets: # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100 \ --min-latency=200 --max-latency=800 -- sleep 1 # DURATION | COUNT | GRAPH | 0 - 200 ns | 186 | ### | 200 - 300 ns | 256 | ##### | 300 - 400 ns | 364 | ####### | 400 - 500 ns | 223 | #### | 500 - 600 ns | 111 | ## | 600 - 700 ns | 41 | | 700 - 800 ns | 141 | ## | 800 - ... ns | 169 | ### | # statistics (in nsec) total time: 2162212 avg time: 967 max time: 16817 min time: 132 count: 2236 - As you can see in the above example, it now shows the statistics at the end so that users can see the avg/max/min latencies easily - 'perf ftrace profile' command has --graph-opts option like 'perf ftrace trace' so that it can control the tracing behaviors in the same way. For example, it can limit the function call depth or threshold perf script: - Improve physical memory resolution in 'mem-phys-addr' script by parsing /proc/iomem file # perf script mem-phys-addr -- find / ... Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ---------- ---------- 100000000-85f7fffff : System RAM 8929 69.7 547600000-54785d23f : Kernel data 1240 9.7 546a00000-5474bdfff : Kernel rodata 490 3.8 5480ce000-5485fffff : Kernel bss 121 0.9 0-fff : Reserved 3860 30.1 100000-89c01fff : System RAM 18 0.1 8a22c000-8df6efff : System RAM 5 0.0 Others: - 'perf test' gets --runs-per-test option to run the test cases repeatedly. This would be helpful to see if it's flaky - Add 'parse_events' method to Python perf extension module, so that users can use the same event parsing logic in the python code. One more step towards implementing perf tools in Python. :) - Support opening tracepoint events without libtraceevent. 
This will be helpful if it won't use the tracing data like in 'perf stat' - Update ARM Neoverse N2/V2 JSON events and metrics" * tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits) perf test: Update event_groups test to use instructions perf bench: Fix undefined behavior in cmpworker() perf annotate: Prefer passing evsel to evsel->core.idx perf lock: Rename fields in lock_type_table perf lock: Add percpu-rwsem for type filter perf lock: Fix parse_lock_type which only retrieve one lock flag perf lock: Fix return code for functions in __cmd_contention perf hist: Fix width calculation in hpp__fmt() perf hist: Fix bogus profiles when filters are enabled perf hist: Deduplicate cmp/sort/collapse code perf test: Improve verbose documentation perf test: Add a runs-per-test flag perf test: Fix parallel/sequential option documentation perf test: Send list output to stdout rather than stderr perf test: Rename functions and variables for better clarity perf tools: Expose quiet/verbose variables in Makefile.perf perf config: Add a function to set one variable in .perfconfig perf test perftool_testsuite: Return correct value for skipping perf test perftool_testsuite: Add missing description perf test record+probe_libc_inet_pton: Make test resilient ...
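The linear buckets shown in the 'perf ftrace latency' example above can be sketched as a small mapping from a latency to a bucket index. This is only an illustration of the layout visible in that output (one bucket below the minimum, fixed-width buckets up to the maximum, one open-ended bucket above it); the function name linear_bucket() and its parameters are hypothetical and not taken from the perf sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the perf implementation: map a latency in
 * nanoseconds to a linear bucket index.
 *
 *   bucket 0              : [0, min_ns)
 *   buckets 1 .. n_linear : fixed-width slices of range_ns each
 *   bucket n_linear + 1   : [max_ns, ...)
 */
static unsigned int linear_bucket(uint64_t lat_ns, uint64_t min_ns,
                                  uint64_t max_ns, uint64_t range_ns)
{
    assert(range_ns > 0 && max_ns > min_ns);

    /* Number of fixed-width buckets between min and max, rounded up. */
    uint64_t n_linear = (max_ns - min_ns + range_ns - 1) / range_ns;

    if (lat_ns < min_ns)
        return 0;
    if (lat_ns >= max_ns)
        return (unsigned int)(n_linear + 1);
    return (unsigned int)(1 + (lat_ns - min_ns) / range_ns);
}
```

With --bucket-range=100, --min-latency=200 and --max-latency=800 this yields eight buckets, which matches the eight rows in the histogram above: the reported minimum of 132 ns lands in the first row and the maximum of 16817 ns in the open-ended last row.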
2025-01-23Merge tag 'sched_ext-for-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - scx_bpf_now() added so that BPF scheduler can access the cached timestamp in struct rq to avoid reading TSC multiple times within a locked scheduling operation. - Minor updates to the built-in idle CPU selection logic. - tool/sched_ext updates and other misc changes. * tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: fix kernel-doc warnings sched_ext: Use time helpers in BPF schedulers sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now() sched_ext: Add time helpers for BPF schedulers sched_ext: Add scx_bpf_now() for BPF scheduler sched_ext: Implement scx_bpf_now() sched_ext: Relocate scx_enabled() related code sched_ext: Add option -l in selftest runner to list all available tests sched_ext: Include remaining task time slice in error state dump sched_ext: update scx_bpf_dsq_insert() doc for SCX_DSQ_LOCAL_ON sched_ext: idle: small CPU iteration refactoring sched_ext: idle: introduce check_builtin_idle_enabled() helper sched_ext: idle: clarify comments sched_ext: idle: use assign_cpu() to update the idle cpumask sched_ext: Use str_enabled_disabled() helper in update_selcpu_topology() sched_ext: Use sizeof_field for key_len in dsq_hash_params tools/sched_ext: Receive updates from SCX repo sched_ext: Use the NUMA scheduling domain for NUMA optimizations
2025-01-23Merge tag 'trace-ringbuffer-v6.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull trace ring buffer fix from Steven Rostedt: "Fix atomic64 operations on some architectures for the tracing ring buffer: - Have emulating atomic64 use arch_spin_locks instead of raw_spin_locks The tracing ring buffer events have a small timestamp that holds the delta between itself and the event before it. But this can be tricky to update when interrupts come in. It originally just set the deltas to zero for events that interrupted the adding of another event which made all the events in the interrupt have the same timestamp as the event it interrupted. This was not suitable for many tools, so it was eventually fixed. But that fix required adding an atomic64 cmpxchg on the timestamp in cases where an event was added while another event was in the process of being added. Originally, for 32 bit architectures, the manipulation of the 64 bit timestamp was done by a structure that held multiple 32bit words to hold parts of the timestamp and a counter. But as updates to the ring buffer were done, maintaining this became too complex and was replaced by the atomic64 generic operations which are now used by both 64bit and 32bit architectures. Shortly after that, it was reported that riscv32 and other 32 bit architectures that just used the generic atomic64 were locking up. This was because the generic atomic64 operations defined in lib/atomic64.c use a raw_spin_lock() to emulate an atomic64 operation. The problem here was that raw_spin_lock() can also be traced by the function tracer (which is commonly used for debugging raw spin locks). Since the function tracer uses the tracing ring buffer, which now is being traced internally, this was triggering a recursion and setting off a warning that the spin locks were recursing. There's no reason for the code that emulates atomic64 operations to be using raw_spin_locks which have a lot of debugging infrastructure attached to them (depending on the config options). 
Instead it should be using the arch_spin_lock() which does not have any infrastructure attached to them and is used by low level infrastructure like RCU locks, lockdep and of course tracing. Using arch_spin_lock()s fixes this issue. - Do not trace in NMI if the architecture uses emulated atomic64 operations Another issue with using the emulated atomic64 operations that uses spin locks to emulate the atomic64 operations is that they cannot be used in NMI context. As an NMI can trigger while holding the atomic64 spin locks it can try to take the same lock and cause a deadlock. Have the ring buffer fail recording events if in NMI context and the architecture uses the emulated atomic64 operations" * tag 'trace-ringbuffer-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: atomic64: Use arch_spin_locks instead of raw_spin_locks ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
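The two ring buffer fixes above both follow from how a 64-bit compare-and-exchange is emulated with a lock. The following is a minimal userspace sketch of that idea, not the code in lib/atomic64.c: struct emu_atomic64, emu_cmpxchg64(), try_update_timestamp() and the in_nmi_context flag are all hypothetical names, and a C11 atomic_flag stands in for the architecture spin lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical emulated 64-bit atomic: a plain value guarded by a lock. */
struct emu_atomic64 {
    atomic_flag lock;   /* stands in for the spin lock */
    uint64_t value;
};

/* Hypothetical stand-in for the kernel's NMI-context check. */
static bool in_nmi_context;

/* Store newval if the current value equals old; return the value seen. */
static uint64_t emu_cmpxchg64(struct emu_atomic64 *v, uint64_t old,
                              uint64_t newval)
{
    uint64_t seen;

    assert(v);
    /* Spin until the lock is free. If an NMI interrupted the holder and
     * called this function again on the same CPU, it would spin forever. */
    while (atomic_flag_test_and_set_explicit(&v->lock, memory_order_acquire))
        ;
    seen = v->value;
    if (seen == old)
        v->value = newval;
    atomic_flag_clear_explicit(&v->lock, memory_order_release);
    return seen;
}

/* Refuse to touch the lock-based emulation from NMI context. */
static bool try_update_timestamp(struct emu_atomic64 *ts, uint64_t expected,
                                 uint64_t now)
{
    if (in_nmi_context)
        return false;   /* the event is dropped rather than risking deadlock */
    return emu_cmpxchg64(ts, expected, now) == expected;
}
```

The design point is that the lock inside emu_cmpxchg64() is not re-entrant, so the only safe choice for an NMI that may have interrupted the lock holder is to skip the operation, which is what the second fix does by failing to record the event.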
2025-01-23Merge tag 'ftrace-v6.14-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull fgraph updates from Steven Rostedt: "Remove calltime and rettime from fgraph infrastructure The calltime and rettime were used by the function graph tracer to calculate the timings of functions where it traced their entry and exit. The calltime and rettime were stored in the generic structures that were used for the mechanisms to add an entry and exit callback. Now that function graph infrastructure is used by other subsystems than just the tracer, the calltime and rettime are not needed for them. Remove the calltime and rettime from the generic fgraph infrastructure and have the callers that require them handle them" * tag 'ftrace-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fgraph: Remove calltime and rettime from generic operations
2025-01-23Merge tag 'trace-v6.14-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Cleanup with guard() and free() helpers There were several places in the code that had a lot of "goto out" in the error paths to either unlock a lock or free some memory that was allocated. But this is error prone. Convert the code over to use the guard() and free() helpers that let the compiler unlock locks or free memory when the function exits. - Update the Rust tracepoint code to use the C code too There was some duplication of the tracepoint code for Rust that did the same logic as the C code. Add a helper that makes it possible for both algorithms to use the same logic in one place. - Add poll to trace event hist files It is useful to know when an event is triggered, or even with some filtering. Since hist files of events get updated when active and the event is triggered, allow applications to poll the hist file and wake up when an event is triggered. This will let the application know that the event it is waiting for happened. - Add :mod: command to enable events for current or future modules The function tracer already has a way to enable functions to be traced in modules by writing ":mod:<module>" into set_ftrace_filter. That will enable either all the functions for the module if it is loaded, or if it is not, it will cache that command, and when the module is loaded that matches <module>, its functions will be enabled. This also allows init functions to be traced. But currently events do not have that feature. Add the command where if ':mod:<module>' is written into set_event, then either all the modules events are enabled if it is loaded, or cache it so that the module's events are enabled when it is loaded. This also works from the kernel command line, where "trace_event=:mod:<module>", when the module is loaded at boot up, its events will be enabled then. 
* tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits) tracing: Fix output of set_event for some cached module events tracing: Fix allocation of printing set_event file content tracing: Rename update_cache() to update_mod_cache() tracing: Fix #if CONFIG_MODULES to #ifdef CONFIG_MODULES selftests/ftrace: Add test that tests event :mod: commands tracing: Cache ":mod:" events for modules not loaded yet tracing: Add :mod: command to enabled module events selftests/tracing: Add hist poll() support test tracing/hist: Support POLLPRI event for poll on histogram tracing/hist: Add poll(POLLIN) support on hist file tracing: Fix using ret variable in tracing_set_tracer() tracepoint: Reduce duplication of __DO_TRACE_CALL tracing/string: Create and use __free(argv_free) in trace_dynevent.c tracing: Switch trace_stat.c code over to use guard() tracing: Switch trace_stack.c code over to use guard() tracing: Switch trace_osnoise.c code over to use guard() and __free() tracing: Switch trace_events_synth.c code over to use guard() tracing: Switch trace_events_filter.c code over to use guard() tracing: Switch trace_events_trigger.c code over to use guard() tracing: Switch trace_events_hist.c code over to use guard() ...
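The guard() and free() conversion described above relies on the compiler running a cleanup function when a variable goes out of scope. A rough userspace sketch of the pattern, assuming GCC or Clang and their cleanup variable attribute, might look like the following; TOY_GUARD, TOY_FREE, struct toy_lock and copy_name() are hypothetical names for illustration and are not the kernel's cleanup.h macros.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy lock: just a flag, enough to show scope-based release. */
struct toy_lock { int held; };

static struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
    return l;
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void toy_lock_release(struct toy_lock **l) { (*l)->held = 0; }
static void free_ptr(char **p) { free(*p); }

/* Take the lock now, release it automatically at end of scope. */
#define TOY_GUARD(l) \
    struct toy_lock *toy_guard_ \
        __attribute__((cleanup(toy_lock_release), unused)) = toy_lock_acquire(l)

/* Free the annotated pointer automatically at end of scope. */
#define TOY_FREE __attribute__((cleanup(free_ptr)))

static int copy_name(struct toy_lock *l, const char *name,
                     char *out, size_t out_len)
{
    TOY_GUARD(l);
    char *tmp TOY_FREE = malloc(out_len);

    if (!tmp)
        return -1;      /* no "goto out": lock is released on return */
    if (strlen(name) >= out_len)
        return -2;      /* tmp is freed and the lock released here too */

    strcpy(tmp, name);
    memcpy(out, tmp, strlen(tmp) + 1);
    return 0;
}
```

Every return path, including the two early error returns, releases the lock and frees the buffer without a goto label, which is the error-prone bookkeeping the series removes.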
2025-01-23Merge tag 'ktest-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull ktest updates from Steven Rostedt: - Fix use of KERNEL_VERSION in newly created output directory If a new output directory is created (O=/dir), and one of the options uses KERNEL_VERSION which will run a "make kernelversion" in the output directory, it will fail because there is no config file yet. In this case, have it do a "make allnoconfig" which is the minimal needed to run the "make kernelversion". - Remove unused variables - Fix some typos * tag 'ktest-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest.pl: Fix typo "accesing" ktest.pl: Fix typo in comment ktest.pl: Remove unused declarations in run_bisect_test function ktest.pl: Check kernelrelease return in get_version
2025-01-23Merge tag 'probes-v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - kprobes: Cleanups using guard() and __free(): Use cleanup.h macros to cleanup code and remove all gotos from kprobes code. - tracing/probes: Also cleanups tracing/*probe events code with guard() and __free(). These patches are just to simplify the parser codes. - kprobes: Reduce preempt disable scope in check_kprobe_access_safe() This reduces preempt disable time to only when getting the module refcount in check_kprobe_access_safe(). Previously it disabled preempt needlessly for other checks including jump_label_text_reserved(), which took a long time because of the linear search. * tag 'probes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/kprobes: Simplify __trace_kprobe_create() by removing gotos tracing: Use __free() for kprobe events to cleanup tracing: Use __free() in trace_probe for cleanup kprobes: Remove remaining gotos kprobes: Remove unneeded goto kprobes: Use guard for rcu_read_lock kprobes: Use guard() for external locks jump_label: Define guard() for jump_label_lock tracing/eprobe: Adopt guard() and scoped_guard() tracing/uprobe: Adopt guard() and scoped_guard() tracing/kprobe: Adopt guard() and scoped_guard() kprobes: Adopt guard() and scoped_guard() kprobes: Reduce preempt disable scope in check_kprobe_access_safe()
2025-01-23Merge tag 'v6.14-rc-smb3-client-fixes-part' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - Fix oops in DebugData when link speed 0 - Two reparse point fixes - Ten DFS (global namespace) fixes - Symlink error handling fix - Two SMB1 fixes - Four cleanup fixes - Improved debugging of status codes - Fix incorrect output of tracepoints for compounding, and add missing compounding tracepoint * tag 'v6.14-rc-smb3-client-fixes-part' of git://git.samba.org/sfrench/cifs-2.6: (23 commits) smb: client: handle lack of EA support in smb2_query_path_info() smb: client: don't check for @leaf_fullpath in match_server() smb: client: get rid of TCP_Server_Info::refpath_lock cifs: Remove duplicate struct reparse_symlink_data and SYMLINK_FLAG_RELATIVE cifs: Do not attempt to call CIFSGetSrvInodeNumber() without CAP_INFOLEVEL_PASSTHRU cifs: Do not attempt to call CIFSSMBRenameOpenFile() without CAP_INFOLEVEL_PASSTHRU cifs: Remove declaration of dead CIFSSMBQuerySymLink function cifs: Fix printing Status code into dmesg cifs: Add missing NT_STATUS_* codes from nterr.h to nterr.c cifs: Fix endian types in struct rfc1002_session_packet cifs: Use cifs_autodisable_serverino() for disabling CIFS_MOUNT_SERVER_INUM in readdir.c smb3: add missing tracepoint for querying wsl EAs smb: client: fix order of arguments of tracepoints smb: client: fix oops due to unset link speed smb: client: correctly handle ErrorContextData as a flexible array smb: client: don't retry DFS targets on server shutdown smb: client: fix return value of parse_dfs_referrals() smb: client: optimize referral walk on failed link targets smb: client: provide dns_resolve_{unc,name} helpers smb: client: parse DNS domain name from domain= option ...
2025-01-23Merge tag 'v6.14-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server updates from Steve French: "Three ksmbd server fixes: - Fix potential memory corruption in IPC calls - Support FSCTL_QUERY_INTERFACE_INFO for more configurations - Remove some unused functions" * tag 'v6.14-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix integer overflows on 32 bit systems ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTL ksmbd: Remove unused functions
2025-01-23Merge tag 'fsnotify_hsm_for_v6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify pre-content notification support from Jan Kara: "This introduces a new fsnotify event (FS_PRE_ACCESS) that gets generated before a file contents is accessed. The event is synchronous so if there is listener for this event, the kernel waits for reply. On success the execution continues as usual, on failure we propagate the error to userspace. This allows userspace to fill in file content on demand from slow storage. The context in which the events are generated has been picked so that we don't hold any locks and thus there's no risk of a deadlock for the userspace handler. The new pre-content event is available only for users with global CAP_SYS_ADMIN capability (similarly to other parts of fanotify functionality) and it is an administrator responsibility to make sure the userspace event handler doesn't do stupid stuff that can DoS the system. Based on your feedback from the last submission, fsnotify code has been improved and now file->f_mode encodes whether pre-content event needs to be generated for the file so the fast path when nobody wants pre-content event for the file just grows the additional file->f_mode check. As a bonus this also removes the checks whether the old FS_ACCESS event needs to be generated from the fast path. Also the place where the event is generated during page fault has been moved so now filemap_fault() generates the event if and only if there is no uptodate folio in the page cache. 
Also we have dropped FS_PRE_MODIFY event as current real-world users of the pre-content functionality don't really use it so let's start with the minimal useful feature set" * tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits) fanotify: Fix crash in fanotify_init(2) fs: don't block write during exec on pre-content watched files fs: enable pre-content events on supported file systems ext4: add pre-content fsnotify hook for DAX faults btrfs: disable defrag on pre-content watched files xfs: add pre-content fsnotify hook for DAX faults fsnotify: generate pre-content permission event on page fault mm: don't allow huge faults for files with pre content watches fanotify: disable readahead if we have pre-content watches fanotify: allow to set errno in FAN_DENY permission response fanotify: report file range info with pre-content events fanotify: introduce FAN_PRE_ACCESS permission event fsnotify: generate pre-content permission event on truncate fsnotify: pass optional file access range in pre-content event fsnotify: introduce pre-content permission events fanotify: reserve event bit of deprecated FAN_DIR_MODIFY fanotify: rename a misnamed constant fanotify: don't skip extra event info if no info_mode is set fsnotify: check if file is actually being watched for pre-content events on open fsnotify: opt-in for permission events at file open time ...
2025-01-23Merge tag 'fs_for_v6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull isofs update from Jan Kara: "Partial conversion of isofs to folios" * tag 'fs_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: isofs: Partially convert zisofs_read_folio to use a folio
2025-01-23Merge tag 'fsnotify_for_v6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull inotify update from Jan Kara: "A small inotify strcpy() cleanup" * tag 'fsnotify_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: inotify: Use strscpy() for event->name copies
2025-01-23Merge tag 'xfs-merge-6.14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull XFS updates from Carlos Maiolino: "This is mostly focused on the implementation of reflink and reverse-mapping support for XFS's real-time devices. It also includes several bugfixes. - Implement reflink support for the realtime device - Implement reverse-mapping support for the realtime device - Several bug fixes and cleanups" * tag 'xfs-merge-6.14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits) xfs: fix buffer lookup vs release race xfs: check for dead buffers in xfs_buf_find_insert xfs: add a b_iodone callback to struct xfs_buf xfs: move b_li_list based retry handling to common code xfs: simplify xfsaild_resubmit_item xfs: always complete the buffer inline in xfs_buf_submit xfs: remove the extra buffer reference in xfs_buf_submit xfs: move invalidate_kernel_vmap_range to xfs_buf_ioend xfs: simplify buffer I/O submission xfs: move in-memory buftarg handling out of _xfs_buf_ioapply xfs: move write verification out of _xfs_buf_ioapply xfs: remove xfs_buf_delwri_submit_buffers xfs: simplify xfs_buf_delwri_pushbuf xfs: move xfs_buf_iowait out of (__)xfs_buf_submit xfs: remove the incorrect comment about the b_pag field xfs: remove the incorrect comment above xfs_buf_free_maps xfs: fix a double completion for buffers on in-memory targets xfs/libxfs: replace kmalloc() and memcpy() with kmemdup() xfs: constify feature checks xfs: refactor xfs_fs_statfs ...
2025-01-23Merge tag 'bpf-next-6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "A smaller than usual release cycle. The main changes are: - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai) In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile only mode. Half of the tests are failing, since support for btf_decl_tag is still WIP, but this is a great milestone. - Convert various samples/bpf to selftests/bpf/test_progs format (Alexis Lothoré and Bastien Curutchet) - Teach verifier to recognize that array lookup with constant in-range index will always succeed (Daniel Xu) - Cleanup migrate disable scope in BPF maps (Hou Tao) - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao) - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin KaFai Lau) - Refactor verifier lock support (Kumar Kartikeya Dwivedi) This is a prerequisite for upcoming resilient spin lock. - Remove excessive 'may_goto +0' instructions in the verifier that LLVM leaves when unrolls the loops (Yonghong Song) - Remove unhelpful bpf_probe_write_user() warning message (Marco Elver) - Add fd_array_cnt attribute for prog_load command (Anton Protopopov) This is a prerequisite for upcoming support for static_branch" * tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits) selftests/bpf: Add some tests related to 'may_goto 0' insns bpf: Remove 'may_goto 0' instruction in opt_remove_nops() bpf: Allow 'may_goto 0' instruction in verifier selftests/bpf: Add test case for the freeing of bpf_timer bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT bpf: Free element after unlock in __htab_map_lookup_and_delete_elem() bpf: Bail out early in __htab_map_lookup_and_delete_elem() bpf: Free special fields after unlock in htab_lru_map_delete_node() tools: Sync if_xdp.h uapi tooling header libbpf: Work around kernel inconsistently stripping '.llvm.' 
suffix bpf: selftests: verifier: Add nullness elision tests bpf: verifier: Support eliding map lookup nullness bpf: verifier: Refactor helper access type tracking bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write bpf: verifier: Add missing newline on verbose() call selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED libbpf: Fix return zero when elf_begin failed selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test veristat: Load struct_ops programs only once ...
2025-01-23Merge tag 'caps-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux Pull capabilities updates from Serge Hallyn: - remove the cap_mmap_file() hook, as it simply returned the default return value and so doesn't need to exist (Paul Moore) - add a trace event for cap_capable() (Jordan Rome) * tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux: security: add trace event for cap_capable capabilities: remove cap_mmap_file()
2025-01-23tpm: Change to kvalloc() in eventlog/acpi.cJarkko Sakkinen
The following failure was reported on HPE ProLiant DL320: [ 10.693310][ T1] tpm_tis STM0925:00: 2.0 TPM (device-id 0x3, rev-id 0) [ 10.848132][ T1] ------------[ cut here ]------------ [ 10.853559][ T1] WARNING: CPU: 59 PID: 1 at mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330 [ 10.862827][ T1] Modules linked in: [ 10.866671][ T1] CPU: 59 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-lp155.2.g52785e2-default #1 openSUSE Tumbleweed (unreleased) 588cd98293a7c9eba9013378d807364c088c9375 [ 10.882741][ T1] Hardware name: HPE ProLiant DL320 Gen12/ProLiant DL320 Gen12, BIOS 1.20 10/28/2024 [ 10.892170][ T1] RIP: 0010:__alloc_pages_noprof+0x2ca/0x330 [ 10.898103][ T1] Code: 24 08 e9 4a fe ff ff e8 34 36 fa ff e9 88 fe ff ff 83 fe 0a 0f 86 b3 fd ff ff 80 3d 01 e7 ce 01 00 75 09 c6 05 f8 e6 ce 01 01 <0f> 0b 45 31 ff e9 e5 fe ff ff f7 c2 00 00 08 00 75 42 89 d9 80 e1 [ 10.917750][ T1] RSP: 0000:ffffb7cf40077980 EFLAGS: 00010246 [ 10.923777][ T1] RAX: 0000000000000000 RBX: 0000000000040cc0 RCX: 0000000000000000 [ 10.931727][ T1] RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000040cc0 The above transcript shows that ACPI pointed to a 16 MiB buffer for the log events because RSI maps to the 'order' parameter of __alloc_pages_noprof(). Address the bug by moving from devm_kmalloc() to kvmalloc() and devm_add_action(). Suggested-by: Ard Biesheuvel <ardb@kernel.org> Cc: stable@vger.kernel.org # v2.6.16+ Fixes: 55a82ab3181b ("[PATCH] tpm: add bios measurement log") Reported-by: Andy Liang <andy.liang@hpe.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219495 Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Tested-by: Andy Liang <andy.liang@hpe.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
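The 16 MiB figure in the commit message above follows from the register dump: RSI holds 0xc, an allocation order of 12, meaning 2^12 contiguous pages. A small worked calculation, assuming 4 KiB pages and a maximum page order of 10 (both are assumptions about the reporting machine's configuration, not values stated in the message), could be written as:

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_PAGE_SIZE      4096u  /* assumption: 4 KiB pages */
#define ASSUMED_MAX_PAGE_ORDER 10u    /* assumption: default order limit */

/* Size in bytes of a physically contiguous allocation of 2^order pages. */
static uint64_t order_to_bytes(unsigned int order)
{
    assert(order < 52);   /* keep the shift within 64 bits */
    return (uint64_t)ASSUMED_PAGE_SIZE << order;
}

/* True if the order exceeds what the page allocator is assumed to serve. */
static int order_too_large(unsigned int order)
{
    return order > ASSUMED_MAX_PAGE_ORDER;
}
```

Under these assumptions order_to_bytes(0xc) is 4096 << 12 = 16777216 bytes, that is 16 MiB, which is above the assumed limit of 4 MiB for order 10. That is consistent with the page allocator warning, and with the fix of switching to kvmalloc(), which does not require physically contiguous memory.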
2025-01-22Merge tag 'AT_EXECVE_CHECK-v6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull AT_EXECVE_CHECK from Kees Cook: - Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün) - Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits (Mickaël Salaün) - Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün) * tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ima: instantiate the bprm_creds_for_exec() hook samples/check-exec: Add an enlighten "inc" interpreter and 28 tests selftests: ktap_helpers: Fix uninitialized variable samples/check-exec: Add set-exec selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
2025-01-22Merge tag 'hardening-v6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - stackleak: Use str_enabled_disabled() helper (Thorsten Blum) - Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven) - Add task_prctl_unknown tracepoint (Marco Elver) * tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC stackleak: Use str_enabled_disabled() helper in stack_erasing_sysctl() tracing: Remove pid in task_rename tracing output tracing: Add task_prctl_unknown tracepoint
2025-01-22Merge tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyoLinus Torvalds
Pull tomoyo updates from Tetsuo Handa: "Small changes to improve usability" * tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: automatically use patterns for several situations in learning mode tomoyo: use realpath if symlink's pathname refers to procfs tomoyo: don't emit warning in tomoyo_write_control()
2025-01-22Merge tag 'landlock-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mostly factors out some Landlock code and prepares for upcoming audit support. Because files with invalid modes might be visible after filesystem corruption, Landlock now handles those weird files too. A few sample and test issues are also fixed" * tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add layout1.umount_sandboxer tests selftests/landlock: Add wrappers.h selftests/landlock: Fix error message landlock: Optimize file path walks and prepare for audit support selftests/landlock: Add test to check partial access in a mount tree landlock: Align partial refer access checks with final ones landlock: Simplify initially denied access rights landlock: Move access types landlock: Factor out check_access_path() selftests/landlock: Fix build with non-default pthread linking landlock: Use scoped guards for ruleset in landlock_add_rule() landlock: Use scoped guards for ruleset landlock: Constify get_mode_access() landlock: Handle weird files samples/landlock: Fix possible NULL dereference in parse_path() selftests/landlock: Remove unused macros in ptrace_test.c
2025-01-22Merge tag 'crc-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull CRC updates from Eric Biggers: - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be directly accessible via the library API, instead of requiring the crypto API. This is much simpler and more efficient. - Convert some users such as ext4 to use the CRC32 library API instead of the crypto API. More conversions like this will come later. - Add a KUnit test that tests and benchmarks multiple CRC variants. Remove older, less-comprehensive tests that are made redundant by this. - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm volunteering to maintain it. I have additional cleanups and optimizations planned for future cycles. * tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits) MAINTAINERS: add entry for CRC library powerpc/crc: delete obsolete crc-vpmsum_test.c lib/crc32test: delete obsolete crc32test.c lib/crc16_kunit: delete obsolete crc16_kunit.c lib/crc_kunit.c: add KUnit test suite for CRC library functions powerpc/crc-t10dif: expose CRC-T10DIF function through lib arm64/crc-t10dif: expose CRC-T10DIF function through lib arm/crc-t10dif: expose CRC-T10DIF function through lib x86/crc-t10dif: expose CRC-T10DIF function through lib crypto: crct10dif - expose arch-optimized lib function lib/crc-t10dif: add support for arch overrides lib/crc-t10dif: stop wrapping the crypto API scsi: target: iscsi: switch to using the crc32c library f2fs: switch to using the crc32 library jbd2: switch to using the crc32c library ext4: switch to using the crc32c library lib/crc32: make crc32c() go directly to lib bcachefs: Explicitly select CRYPTO from BCACHEFS_FS x86/crc32: expose CRC32 functions through lib x86/crc32: update prototype for crc32_pclmul_le_16() ...
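For readers unfamiliar with what the CRC library functions above compute, here is a bit-at-a-time reference sketch of CRC-32C (Castagnoli). It is deliberately the slow textbook form, not the kernel's architecture-optimized code, and the name crc32c_bitwise() and its signature are hypothetical; in particular it applies the conventional initial value and final inversion itself rather than taking a seed argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference implementation of CRC-32C, processed one bit at a
 * time with the reflected polynomial 0x82F63B78. Initial value and final
 * XOR are both 0xFFFFFFFF, as in the common definition of CRC-32C.
 */
static uint32_t crc32c_bitwise(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and fold in the polynomial. */
            crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
        }
    }
    return ~crc;
}
```

The standard check value for this definition is 0xE3069283 for the nine ASCII bytes "123456789". A simple reference like this is useful as a baseline when testing faster table-driven or instruction-accelerated variants against each other.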
2025-01-22Merge tag 'keys-next-6.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull keys updates from Jarkko Sakkinen. Avoid using stack addresses for sg lists. And a cleanup. * tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y keys: drop shadowing dead prototype
2025-01-22smb: client: handle lack of EA support in smb2_query_path_info()Paulo Alcantara
If the server doesn't support both EAs and reparse point in a file, the SMB2_QUERY_INFO request will fail with either STATUS_NO_EAS_ON_FILE or STATUS_EAS_NOT_SUPPORT in the compound chain, so ignore it as long as reparse point isn't IO_REPARSE_TAG_LX_(CHR|BLK), which would require the EAs to know about major/minor numbers. Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-22smb: client: don't check for @leaf_fullpath in match_server()Paulo Alcantara
The matching of DFS connections is already handled by @dfs_conn, so remove @leaf_fullpath matching altogether. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-22smb: client: get rid of TCP_Server_Info::refpath_lockPaulo Alcantara
TCP_Server_Info::leaf_fullpath is allocated in cifs_get_tcp_session() and never changed afterwards, so there is no need to serialize its access. Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>