Age | Commit message (Collapse) | Author |
|
Now when we use scx_bpf_task_cgroup() in ops.tick() to get the cgroup of
the current task, the following error will occur:
scx_foo[3795244] triggered exit kind 1024:
runtime error (called on a task not being operated on)
The reason is that we are using SCX_CALL_OP() instead of SCX_CALL_OP_TASK()
when calling ops.tick(), which triggers the error during the subsequent
scx_kf_allowed_on_arg_tasks() check.
SCX_CALL_OP_TASK() was first introduced in commit 36454023f50b ("sched_ext:
Track tasks that are subjects of the in-flight SCX operation") to ensure
task's rq lock is held when accessing task's sched_group. Since ops.tick()
is marked as SCX_KF_TERMINAL and task_tick_scx() is protected by the rq
lock, we can use SCX_CALL_OP_TASK() to avoid the above issue. Similarly,
the same changes should be made for ops.disable() and ops.exit_task(), as
they are also protected by task_rq_lock() and it's safe to access the
task's task_group.
Fixes: 36454023f50b ("sched_ext: Track tasks that are subjects of the in-flight SCX operation")
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Now BPF only supports bpf_list_push_{front,back}_impl kfunc, not bpf_list_
push_{front,back}.
This patch fix this issue. Without this patch, if we use bpf_list kfunc
in scx, the BPF verifier would complain:
libbpf: extern (func ksym) 'bpf_list_push_back': not found in kernel or
module BTFs
libbpf: failed to load object 'scx_foo'
libbpf: failed to load BPF skeleton 'scx_foo': -EINVAL
With this patch, the bpf list kfunc will work as expected.
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Fixes: 2a52ca7c98960 ("sched_ext: Add scx_simple and scx_example_qmap example schedulers")
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Fixed grammar for a few tests of sched_ext.
Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
task_can_run_on_remote_rq()
While fixing migration disabled task handling, 32966821574c ("sched_ext: Fix
migration disabled handling in targeted dispatches") assumed that a
migration disabled task's ->cpus_ptr would only have the pinned CPU. While
this is eventually true for migration disabled tasks that are switched out,
->cpus_ptr update is performed by migrate_disable_switch() which is called
right before context_switch() in __scheduler(). However, the task is
enqueued earlier during pick_next_task() via put_prev_task_scx(), so there
is a race window where another CPU can see the task on a DSQ.
If the CPU tries to dispatch the migration disabled task while in that
window, task_allowed_on_cpu() will succeed and task_can_run_on_remote_rq()
will subsequently trigger SCHED_WARN(is_migration_disabled()).
WARNING: CPU: 8 PID: 1837 at kernel/sched/ext.c:2466 task_can_run_on_remote_rq+0x12e/0x140
Sched_ext: layered (enabled+all), task: runnable_at=-10ms
RIP: 0010:task_can_run_on_remote_rq+0x12e/0x140
...
<TASK>
consume_dispatch_q+0xab/0x220
scx_bpf_dsq_move_to_local+0x58/0xd0
bpf_prog_84dd17b0654b6cf0_layered_dispatch+0x290/0x1cfa
bpf__sched_ext_ops_dispatch+0x4b/0xab
balance_one+0x1fe/0x3b0
balance_scx+0x61/0x1d0
prev_balance+0x46/0xc0
__pick_next_task+0x73/0x1c0
__schedule+0x206/0x1730
schedule+0x3a/0x160
__do_sys_sched_yield+0xe/0x20
do_syscall_64+0xbb/0x1e0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fix it by converting the SCHED_WARN() back to a regular failure path. Also,
perform the migration disabled test before task_allowed_on_cpu() test so
that BPF schedulers which fail to handle migration disabled tasks can be
noticed easily.
While at it, adjust scx_ops_error() message for !task_allowed_on_cpu() case
for brevity and consistency.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 32966821574c ("sched_ext: Fix migration disabled handling in targeted dispatches")
Acked-by: Andrea Righi <arighi@nvidia.com>
Reported-by: Jake Hillion <jakehillion@meta.com>
|
|
A dispatch operation that can target a specific local DSQ -
scx_bpf_dsq_move_to_local() or scx_bpf_dsq_move() - checks whether the task
can be migrated to the target CPU using task_can_run_on_remote_rq(). If the
task can't be migrated to the targeted CPU, it is bounced through a global
DSQ.
task_can_run_on_remote_rq() assumes that the task is on a CPU that's
different from the targeted CPU but the callers doesn't uphold the
assumption and may call the function when the task is already on the target
CPU. When such task has migration disabled, task_can_run_on_remote_rq() ends
up returning %false incorrectly unnecessarily bouncing the task to a global
DSQ.
Fix it by updating the callers to only call task_can_run_on_remote_rq() when
the task is on a different CPU than the target CPU. As this is a bit subtle,
for clarity and documentation:
- Make task_can_run_on_remote_rq() trigger SCHED_WARN_ON() if the task is on
the same CPU as the target CPU.
- is_migration_disabled() test in task_can_run_on_remote_rq() cannot trigger
if the task is on a different CPU than the target CPU as the preceding
task_allowed_on_cpu() test should fail beforehand. Convert the test into
SCHED_WARN_ON().
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Fixes: 0366017e0973 ("sched_ext: Use task_can_run_on_remote_rq() test in dispatch_to_local_dsq()")
Cc: stable@vger.kernel.org # v6.12+
|
|
Migration disabled tasks are special and pinned to their previous CPUs. They
tripped up some unsuspecting BPF schedulers as their ->nr_cpus_allowed may
not agree with the bits set in ->cpus_ptr. Make it easier for BPF schedulers
by automatically dispatching them to the pinned local DSQs by default. If a
BPF scheduler wants to handle migration disabled tasks explicitly, it can
set SCX_OPS_ENQ_MIGRATION_DISABLED.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
|
|
When (s64)(after - before) > 0, the code returns the result of
(s64)(after - before) > 0 while the intended result should be
(s64)(after - before). That happens because the middle operand of
the ternary operator was omitted incorrectly, returning the result of
(s64)(after - before) > 0. Thus, add the middle operand
-- (s64)(after - before) -- to return the correct time calculation.
Fixes: d07be814fc71 ("sched_ext: Add time helpers for BPF schedulers")
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
While performing the rq locking dance in dispatch_to_local_dsq(), we may
trigger the following lock imbalance condition, in particular when
multiple tasks are rapidly changing CPU affinity (i.e., running a
`stress-ng --race-sched 0`):
[ 13.413579] =====================================
[ 13.413660] WARNING: bad unlock balance detected!
[ 13.413729] 6.13.0-virtme #15 Not tainted
[ 13.413792] -------------------------------------
[ 13.413859] kworker/1:1/80 is trying to release lock (&rq->__lock) at:
[ 13.413954] [<ffffffff873c6c48>] dispatch_to_local_dsq+0x108/0x1a0
[ 13.414111] but there are no more locks to release!
[ 13.414176]
[ 13.414176] other info that might help us debug this:
[ 13.414258] 1 lock held by kworker/1:1/80:
[ 13.414318] #0: ffff8b66feb41698 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x20/0x90
[ 13.414612]
[ 13.414612] stack backtrace:
[ 13.415255] CPU: 1 UID: 0 PID: 80 Comm: kworker/1:1 Not tainted 6.13.0-virtme #15
[ 13.415505] Workqueue: 0x0 (events)
[ 13.415567] Sched_ext: dsp_local_on (enabled+all), task: runnable_at=-2ms
[ 13.415570] Call Trace:
[ 13.415700] <TASK>
[ 13.415744] dump_stack_lvl+0x78/0xe0
[ 13.415806] ? dispatch_to_local_dsq+0x108/0x1a0
[ 13.415884] print_unlock_imbalance_bug+0x11b/0x130
[ 13.415965] ? dispatch_to_local_dsq+0x108/0x1a0
[ 13.416226] lock_release+0x231/0x2c0
[ 13.416326] _raw_spin_unlock+0x1b/0x40
[ 13.416422] dispatch_to_local_dsq+0x108/0x1a0
[ 13.416554] flush_dispatch_buf+0x199/0x1d0
[ 13.416652] balance_one+0x194/0x370
[ 13.416751] balance_scx+0x61/0x1e0
[ 13.416848] prev_balance+0x43/0xb0
[ 13.416947] __pick_next_task+0x6b/0x1b0
[ 13.417052] __schedule+0x20d/0x1740
This happens because dispatch_to_local_dsq() is racing with
dispatch_dequeue() and, when the latter wins, we incorrectly assume that
the task has been moved to dst_rq.
Fix by properly tracking the currently locked rq.
Fixes: 4d3ca89bdd31 ("sched_ext: Refactor consume_remote_task()")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
In UP systems p->migration_disabled is not available. Fix this by using
the portable helper is_migration_disabled(p).
Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Introduce a new helper for BPF schedulers to determine whether a task
can migrate or not (supporting both SMP and UP systems).
Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
scx_move_task() is called from sched_move_task() and tells the BPF scheduler
that cgroup migration is being committed. sched_move_task() is used by both
cgroup and autogroup migrations and scx_move_task() tried to filter out
autogroup migrations by testing the destination cgroup and PF_EXITING but
this is not enough. In fact, without explicitly tagging the thread which is
doing the cgroup migration, there is no good way to tell apart
scx_move_task() invocations for racing migration to the root cgroup and an
autogroup migration.
This led to scx_move_task() incorrectly ignoring a migration from non-root
cgroup to an autogroup of the root cgroup triggering the following warning:
WARNING: CPU: 7 PID: 1 at kernel/sched/ext.c:3725 scx_cgroup_can_attach+0x196/0x340
...
Call Trace:
<TASK>
cgroup_migrate_execute+0x5b1/0x700
cgroup_attach_task+0x296/0x400
__cgroup_procs_write+0x128/0x140
cgroup_procs_write+0x17/0x30
kernfs_fop_write_iter+0x141/0x1f0
vfs_write+0x31d/0x4a0
__x64_sys_write+0x72/0xf0
do_syscall_64+0x82/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fix it by adding an argument to sched_move_task() that indicates whether the
moving is for a cgroup or autogroup migration. After the change,
scx_move_task() is called only for cgroup migrations and renamed to
scx_cgroup_move_task().
Link: https://github.com/sched-ext/scx/issues/370
Fixes: 819513666966 ("sched_ext: Add cgroup support")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
dsp_local_on has several incorrect assumptions, one of which is that
p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task
is scheduled out while migration is disabled - p->cpus_ptr is temporarily
overridden to the previous CPU while p->nr_cpus_allowed remains unchanged.
This led to sporadic test faliures when dsp_local_on_dispatch() tries to put
a migration disabled task to a different CPU. Fix it by keeping the previous
CPU when migration is disabled.
There are SCX schedulers that make use of p->nr_cpus_allowed. They should
also implement explicit handling for p->migration_disabled.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Changwoo Min <changwoo@igalia.com>
|
|
All scx enums are now automatically generated from vmlinux.h and they
must be initialized using the SCX_ENUM_INIT() macro.
Fix the scx selftests to use this macro to properly initialize these
values.
Fixes: 8da7bf2cee27 ("tools/sched_ext: Receive updates from SCX repo")
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Report the task weight when dumping the task state during an error exit.
Moreover, adjust the output format to display dsq_vtime, slice, and
weight on the same line.
This can help identify whether certain tasks were excessively
prioritized or de-prioritized due to large niceness gaps.
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Fixes some spelling errors in the comments.
Signed-off-by: Atul Kumar Pant <atulpant.linux@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay
Pull auxdisplay updates from Andy Shevchenko:
- A couple of cleanups to img-ascii-lcd driver
* tag 'auxdisplay-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-auxdisplay:
auxdisplay: img-ascii-lcd: Constify struct img_ascii_lcd_config
auxdisplay: img-ascii-lcd: Remove an unused field in struct img_ascii_lcd_ctx
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"This was a relatively calm cycle, and most of changes are rather small
device-specific fixes. Here are highlights:
Core:
- Further enhancements of ALSA rawmidi and sequencer APIs for MIDI
2.0
- compress-offload API extensions for ASRC support
ASoC:
- Allow clocking on each DAI in an audio graph card to be configured
separately
- Improved power management for Renesas RZ-SSI
- KUnit testing for the Cirrus DSP framework
- Memory to meory operation support for Freescale/NXP platforms
- Support for pause operations in SOF
- Support for Allwinner suinv F1C100s, Awinc AW88083, Realtek
ALC5682I-VE
HD- and USB-audio:
- Add support for Focusrite Scarlett 4th Gen 16i16, 18i16, and 18i20
interfaces via new FCP driver
- TAS2781 SPI HD-audio sub-codec support
- Various device-specific quirks as usual"
* tag 'sound-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (235 commits)
ALSA: hda: tas2781-spi: Fix bogus error handling in tas2781_hda_spi_probe()
ALSA: hda: tas2781-spi: Fix error code in tas2781_read_acpi()
ALSA: hda: tas2781-spi: Delete some dead code
ALSA: usb: fcp: Fix return code from poll ops
ALSA: usb: fcp: Fix incorrect resp->opcode retrieval
ALSA: usb: fcp: Fix meter_levels type to __le32
ALSA: hda/realtek: Enable Mute LED on HP Laptop 14s-fq1xxx
ALSA: hda: tas2781-spi: Fix -Wsometimes-uninitialized in tasdevice_spi_switch_book()
ALSA: ctxfi: Simplify dao_clear_{left,right}_input() functions
ALSA: hda: tas2781-spi: select CRC32 instead of CRC32_SARWATE
ALSA: usb: fcp: Fix hwdep read ops types
ALSA: scarlett2: Add device_setup option to use FCP driver
ALSA: FCP: Add Focusrite Control Protocol driver
ALSA: hda/tas2781: Add tas2781 hda SPI driver
ALSA: hda/realtek - Fixed headphone distorted sound on Acer Aspire A115-31 laptop
ASoC: xilinx: xlnx_spdif: Simpify using devm_clk_get_enabled()
ALSA: hda: Support for Ideapad hotkey mute LEDs
ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83JX, 83MC and 83NM
ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 83LC
ASoC: dapm: add support for preparing streams
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Remove physical address skcipher walking
- Fix boot-up self-test race
Algorithms:
- Optimisations for x86/aes-gcm
- Optimisations for x86/aes-xts
- Remove VMAC
- Remove keywrap
Drivers:
- Remove n2
Others:
- Fixes for padata UAF
- Fix potential rhashtable deadlock by moving schedule_work outside
lock"
* tag 'v6.14-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (75 commits)
rhashtable: Fix rhashtable_try_insert test
dt-bindings: crypto: qcom,inline-crypto-engine: Document the SM8750 ICE
dt-bindings: crypto: qcom,prng: Document SM8750 RNG
dt-bindings: crypto: qcom-qce: Document the SM8750 crypto engine
crypto: asymmetric_keys - Remove unused key_being_used_for[]
padata: avoid UAF for reorder_work
padata: fix UAF in padata_reorder
padata: add pd get/put refcnt helper
crypto: skcipher - call cond_resched() directly
crypto: skcipher - optimize initializing skcipher_walk fields
crypto: skcipher - clean up initialization of skcipher_walk::flags
crypto: skcipher - fold skcipher_walk_skcipher() into skcipher_walk_virt()
crypto: skcipher - remove redundant check for SKCIPHER_WALK_SLOW
crypto: skcipher - remove redundant clamping to page size
crypto: skcipher - remove unnecessary page alignment of bounce buffer
crypto: skcipher - document skcipher_walk_done() and rename some vars
crypto: omap - switch from scatter_walk to plain offset
crypto: powerpc/p10-aes-gcm - simplify handling of linear associated data
crypto: bcm - Drop unused setting of local 'ptr' variable
crypto: hisilicon/qm - support new function communication
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull TPM update from Jarkko Sakkinen.
* tag 'tpmdd-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: Change to kvalloc() in eventlog/acpi.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain updates from Ulf Hansson:
"pmdomain core:
- Add support for naming idlestates through DT
pmdomain providers:
- arm: Explicitly request the current state at init for the SCMI PM
domain
- mediatek: Add Airoha CPU PM Domain support for CPU frequency
scaling
- ti: Add per-device latency constraint management to the ti_sci PM
domain
cpuidle-psci:
- Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"
* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
pmdomain: airoha: Add Airoha CPU PM Domain support
pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
pmdomain: ti_sci: add wakeup constraint management
pmdomain: ti_sci: add per-device latency constraint management
pmdomain: imx-gpcv2: Suppress bind attrs
pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
pmdomain: core: Support naming idle states
dt-bindings: power: domain-idle-state: Allow idle-state-name
cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"No core changes this time
New drivers:
- New subdriver for the Qualcomm MSM8917 SoC TLMM
- New subdriver for the Mediatek MT7988 SoC
- New subdriver for the Rockchip RK3562 SoC
- New subdriver for the Renesas RZ/G3E SoC
Improvements:
- Fix some missing pins in the Qualcomm IPQ5424 TLMM
- Fix some missing LVDS pins in the Sunxi A100/A133
- Support Sunxi V853 (simple compatible string)
- Cleanups in the Samsung driver
- Fix some AMD suspend behaviour
- Cleanups"
* tag 'pinctrl-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (29 commits)
dt-bindings: pinctrl: sunxi: add compatible for V853
pinctrl: Use str_enable_disable-like helpers
dt-bindings: pinctrl: Correct indentation and style in DTS example
pinctrl: amd: Take suspend type into consideration which pins are non-wake
pinctrl: stm32: Add check for clk_enable()
pinctrl: renesas: rzg2l: Fix PFC_MASK for RZ/V2H and RZ/G3E
pinctrl: sunxi: add missed lvds pins for a100/a133
pinctrl: mediatek: Drop mtk_pinconf_bias_set_pd()
pinctrl: renesas: rzg2l: Add support for RZ/G3E SoC
pinctrl: renesas: rzg2l: Update r9a09g057_variable_pin_cfg table
dt-bindings: pinctrl: renesas: Document RZ/G3E SoC
dt-bindings: pinctrl: renesas: Add alpha-numerical port support for RZ/V2H
pinctrl: rockchip: add rk3562 support
dt-bindings: pinctrl: Add rk3562 pinctrl support
pinctrl: Fix the clean up on pinconf_apply_setting failure
dt-bindings: pinctrl: add binding for MT7988 SoC
pinctrl: mediatek: add MT7988 pinctrl driver
pinctrl: mediatek: add support for MTK_PULL_PD_TYPE
pinctrl: ocelot: Constify some structures
pinctrl: renesas: rzg2l: Add audio clock pins on RZ/G3S
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
"Core changes:
- PASID support for the blocked_domain
ARM-SMMU Updates:
- SMMUv2:
- Implement per-client prefetcher configuration on Qualcomm SoCs
- Support for the Adreno SMMU on Qualcomm's SDM670 SOC
- SMMUv3:
- Pretty-printing of event records
- Drop the ->domain_alloc_paging implementation in favour of
domain_alloc_paging_flags(flags==0)
- IO-PGTable:
- Generalisation of the page-table walker to enable external
walkers (e.g. for debugging unexpected page-faults from the GPU)
- Minor fix for handling concatenated PGDs at stage-2 with 16KiB
pages
- Misc:
- Clean-up device probing and replace the crufty probe-deferral
hack with a more robust implementation of
arm_smmu_get_by_fwnode()
- Device-tree binding updates for a bunch of Qualcomm platforms
Intel VT-d Updates:
- Remove domain_alloc_paging()
- Remove capability audit code
- Draining PRQ in sva unbind path when FPD bit set
- Link cache tags of same iommu unit together
AMD-Vi Updates:
- Use CMPXCHG128 to update DTE
- Cleanups of the domain_alloc_paging() path
RiscV IOMMU:
- Platform MSI support
- Shutdown support
Rockchip IOMMU:
- Add DT bindings for Rockchip RK3576
More smaller fixes and cleanups"
* tag 'iommu-updates-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (66 commits)
iommu: Use str_enable_disable-like helpers
iommu/amd: Fully decode all combinations of alloc_paging_flags
iommu/amd: Move the nid to pdom_setup_pgtable()
iommu/amd: Change amd_iommu_pgtable to use enum protection_domain_mode
iommu/amd: Remove type argument from do_iommu_domain_alloc() and related
iommu/amd: Remove dev == NULL checks
iommu/amd: Remove domain_alloc()
iommu/amd: Remove unused amd_iommu_domain_update()
iommu/riscv: Fixup compile warning
iommu/arm-smmu-v3: Add missing #include of linux/string_choices.h
iommu/arm-smmu-v3: Use str_read_write helper w/ logs
iommu/io-pgtable-arm: Add way to debug pgtable walk
iommu/io-pgtable-arm: Re-use the pgtable walk for iova_to_phys
iommu/io-pgtable-arm: Make pgtable walker more generic
iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500
iommu/arm-smmu: Introduce ACTLR custom prefetcher settings
iommu/arm-smmu: Add support for PRR bit setup
iommu/arm-smmu: Refactor qcom_smmu structure to include single pointer
iommu/arm-smmu: Re-enable context caching in smmu reset operation
iommu/vt-d: Link cache tags of same iommu unit together
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Ilpo Järvinen:
"acer-wmi:
- Add support for PH14-51, PH16-72, and Nitro AN515-58
- Add proper hwmon support
- Improve error handling when reading "gaming system info"
- Replace direct EC reads for the current platform profile with WMI
calls to handle EC address variations
- Replace custom platform_profile cycling with the generic one
ACPI:
- platform_profile: Major refactoring and improvements
- Support registering multiple platform_profile handlers concurrently
to avoid the need to quirk which handler takes precedence
- Support reporting "custom" profile for cases where the current
profile is ambiguous or when settings tweaks are done outside the
pre-defined profile
- Abstract and layer platform_profile API better using the class_dev
and drvdata
- Various minor improvements
- Add Documentation and kerneldoc
amd/hsmp:
- Add support for HSMP protocol v7
amd/pmc:
- Support AMD 1Ah family 70h
- Support STB with Ryzen desktop SoCs
amd/pmf:
- Support Custom BIOS inputs for PMF TA
- Support passing SRA sensor data from AMD SFH (HID) to PMF TA
dell-smo8800:
- Move SMO88xx quirk away from the generic i2c-i801 driver
- Add accelerometer support for Dell Latitude E6330/E6430 and XPS
9550
- Support probing accelerometer for models yet to be listed in the
DMI mapping table because ACPI lacks i2c-address for the
accelerometer (behind a module parameter because probing might be
dangerous)
HID:
- amd_sfh: Add support for exporting SRA sensor data
hp-wmi:
- Add fan and thermal support for Victus 16-s1000
input:
- Add key for phone linking
- i8042: Add context for the i8042 filter to enable cleaning up the
filter related global variables from pdx86 drivers
lenovo-wmi-camera:
- Use SW_CAMERA_LENS_COVER instead of KEY_CAMERA_ACCESS
mellanox mlxbf-pmc:
- Add support for monitoring cycle count
- Add Documentation
thinkpad_acpi:
- Add support for phone link key
tools/power/x86/intel-speed-select:
- Fix Turbo Ratio Limit restore
x86-android-tables:
- Add support for Vexia EDU ATLA 10 Bluetooth and EC battery driver
And miscellaneous cleanups / refactoring / improvements"
* tag 'platform-drivers-x86-v6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (133 commits)
platform/x86: acer-wmi: Fix initialization of last_non_turbo_profile
platform/x86: acer-wmi: Ignore AC events
platform/mellanox: mlxreg-io: use sysfs_emit() instead of sprintf()
platform/mellanox: mlxreg-hotplug: use sysfs_emit() instead of sprintf()
platform/mellanox: mlxbf-bootctl: use sysfs_emit() instead of sprintf()
platform/x86: hp-wmi: Add fan and thermal profile support for Victus 16-s1000
ACPI: platform_profile: Add a prefix to log messages
ACPI: platform_profile: Add documentation
ACPI: platform_profile: Clean platform_profile_handler
ACPI: platform_profile: Move platform_profile_handler
ACPI: platform_profile: Remove platform_profile_handler from exported symbols
platform/x86: thinkpad_acpi: Use devm_platform_profile_register()
platform/x86: inspur_platform_profile: Use devm_platform_profile_register()
platform/x86: hp-wmi: Use devm_platform_profile_register()
platform/x86: ideapad-laptop: Use devm_platform_profile_register()
platform/x86: dell-pc: Use devm_platform_profile_register()
platform/x86: asus-wmi: Use devm_platform_profile_register()
platform/x86: amd: pmf: sps: Use devm_platform_profile_register()
platform/x86: acer-wmi: Use devm_platform_profile_register()
platform/surface: surface_platform_profile: Use devm_platform_profile_register()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 TDX updates from Dave Hansen:
"Intel Trust Domain updates.
The existing TDX code needs a _bit_ of metadata from the TDX module.
But KVM is going to need a bunch more very shortly. Rework the
interface with the TDX module to be more consistent and handle the new
higher volume.
The TDX module has added a few new features. The first is a promise
not to clobber RBP under any circumstances. Basically the kernel now
will refuse to use any modules that don't have this promise. Second,
enable the new "REDUCE_VE" feature. This ensures that the TDX module
will not send some silly virtualization exceptions that the guest had
no good way to handle anyway.
- Centralize global metadata infrastructure
- Use new TDX module features for exception suppression and RBP
clobbering"
* tag 'x86_tdx_for_6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation
x86/virt/tdx: Switch to use auto-generated global metadata reading code
x86/virt/tdx: Use dedicated struct members for PAMT entry sizes
x86/virt/tdx: Use auto-generated code to read global metadata
x86/virt/tdx: Start to track all global metadata in one structure
x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better
x86/tdx: Dump attributes and TD_CTLS on boot
x86/tdx: Disable unnecessary virtualization exceptions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
- A large and involved preparatory series to pave the way to add
exception handling for relocate_kernel - which will be a debugging
facility that has aided in the field to debug an exceptionally hard
to debug early boot bug. Plus assorted cleanups and fixes that were
discovered along the way, by David Woodhouse:
- Clean up and document register use in relocate_kernel_64.S
- Use named labels in swap_pages in relocate_kernel_64.S
- Only swap pages for ::preserve_context mode
- Allocate PGD for x86_64 transition page tables separately
- Copy control page into place in machine_kexec_prepare()
- Invoke copy of relocate_kernel() instead of the original
- Move relocate_kernel to kernel .data section
- Add data section to relocate_kernel
- Drop page_list argument from relocate_kernel()
- Eliminate writes through kernel mapping of relocate_kernel page
- Clean up register usage in relocate_kernel()
- Mark relocate_kernel page as ROX instead of RWX
- Disable global pages before writing to control page
- Ensure preserve_context flag is set on return to kernel
- Use correct swap page in swap_pages function
- Fix stack and handling of re-entry point for ::preserve_context
- Mark machine_kexec() with __nocfi
- Cope with relocate_kernel() not being at the start of the page
- Use typedef for relocate_kernel_fn function prototype
- Fix location of relocate_kernel with -ffunction-sections (fix by Nathan Chancellor)
- A series to remove the last remaining absolute symbol references from
.head.text, and enforce this at build time, by Ard Biesheuvel:
- Avoid WARN()s and panic()s in early boot code
- Don't hang but terminate on failure to remap SVSM CA
- Determine VA/PA offset before entering C code
- Avoid intentional absolute symbol references in .head.text
- Disable UBSAN in early boot code
- Move ENTRY_TEXT to the start of the image
- Move .head.text into its own output section
- Reject absolute references in .head.text
- The above build-time enforcement uncovered a handful of bugs of
essentially non-working code, and a wrokaround for a toolchain bug,
fixed by Ard Biesheuvel as well:
- Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
- Disable UBSAN on SEV code that may execute very early
- Disable ftrace branch profiling in SEV startup code
- And miscellaneous cleanups:
- kexec_core: Add and update comments regarding the KEXEC_JUMP flow (Rafael J. Wysocki)
- x86/sysfs: Constify 'struct bin_attribute' (Thomas Weißschuh)"
* tag 'x86-boot-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
x86/sev: Disable ftrace branch profiling in SEV startup code
x86/kexec: Use typedef for relocate_kernel_fn function prototype
x86/kexec: Cope with relocate_kernel() not being at the start of the page
kexec_core: Add and update comments regarding the KEXEC_JUMP flow
x86/kexec: Mark machine_kexec() with __nocfi
x86/kexec: Fix location of relocate_kernel with -ffunction-sections
x86/kexec: Fix stack and handling of re-entry point for ::preserve_context
x86/kexec: Use correct swap page in swap_pages function
x86/kexec: Ensure preserve_context flag is set on return to kernel
x86/kexec: Disable global pages before writing to control page
x86/sev: Don't hang but terminate on failure to remap SVSM CA
x86/sev: Disable UBSAN on SEV code that may execute very early
x86/boot/64: Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
x86/sysfs: Constify 'struct bin_attribute'
x86/kexec: Mark relocate_kernel page as ROX instead of RWX
x86/kexec: Clean up register usage in relocate_kernel()
x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page
x86/kexec: Drop page_list argument from relocate_kernel()
x86/kexec: Add data section to relocate_kernel
x86/kexec: Move relocate_kernel to kernel .data section
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf-tools updates from Namhyung Kim:
"There are a lot of changes in the perf tools in this cycle.
build:
- Use generic syscall table to generate syscall numbers on supported
archs
- This also enables to get rid of libaudit which was used for syscall
numbers
- Remove python2 support as it's deprecated for years
- Fix issues on static build with libzstd
perf record:
- Intel-PT supports "aux-action" config term to pause or resume
tracing in the aux-buffer. Users can start the intel_pt event as
"started-paused" and configure other events to control the Intel-PT
tracing:
# perf record --kcore -e intel_pt/aux-action=start-paused/ \
-e syscalls:sys_enter_newuname/aux-action=resume/ \
-e syscalls:sys_exit_newuname/aux-action=pause/ -- uname
This requires kernel support (which was added in v6.13)
perf lock:
- 'perf lock contention' command has an ability to symbolize locks in
dynamically allocated objects using slab cache name when it runs
with BPF. Those dynamic locks would have "&" prefix in the name to
distinguish them from ordinary (static) locks
# perf lock con -abl -E 5 sleep 1
contended total wait max wait avg wait address symbol
2 1.95 us 1.77 us 975 ns ffff9d5e852d3498 &task_struct (mutex)
1 1.18 us 1.18 us 1.18 us ffff9d5e852d3538 &task_struct (mutex)
4 1.12 us 354 ns 279 ns ffff9d5e841ca800 &kmalloc-cg-512 (mutex)
2 859 ns 617 ns 429 ns ffffffffa41c3620 delayed_uprobe_lock (mutex)
3 691 ns 388 ns 230 ns ffffffffa41c0940 pack_mutex (mutex)
This also requires kernel/BPF support (which was added in v6.13)
perf ftrace:
- 'perf ftrace latency' command gets a couple of options to support
linear buckets instead of exponential. Also it's possible to
specify max and min latency for the linear buckets:
# perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100 \
--min-latency=200 --max-latency=800 -- sleep 1
# DURATION | COUNT | GRAPH |
0 - 200 ns | 186 | ### |
200 - 300 ns | 256 | ##### |
300 - 400 ns | 364 | ####### |
400 - 500 ns | 223 | #### |
500 - 600 ns | 111 | ## |
600 - 700 ns | 41 | |
700 - 800 ns | 141 | ## |
800 - ... ns | 169 | ### |
# statistics (in nsec)
total time: 2162212
avg time: 967
max time: 16817
min time: 132
count: 2236
- As you can see in the above example, it nows shows the statistics
at the end so that users can see the avg/max/min latencies easily
- 'perf ftrace profile' command has --graph-opts option like 'perf
ftrace trace' so that it can control the tracing behaviors in the
same way. For example, it can limit the function call depth or
threshold
perf script:
- Improve physical memory resolution in 'mem-phys-addr' script by
parsing /proc/iomem file
# perf script mem-phys-addr -- find /
...
Event: mem_inst_retired.all_loads:P
Memory type count percentage
---------------------------------------- ---------- ----------
100000000-85f7fffff : System RAM 8929 69.7
547600000-54785d23f : Kernel data 1240 9.7
546a00000-5474bdfff : Kernel rodata 490 3.8
5480ce000-5485fffff : Kernel bss 121 0.9
0-fff : Reserved 3860 30.1
100000-89c01fff : System RAM 18 0.1
8a22c000-8df6efff : System RAM 5 0.0
Others:
- 'perf test' gets --runs-per-test option to run the test cases
repeatedly. This would be helpful to see if it's flaky
- Add 'parse_events' method to Python perf extension module, so that
users can use the same event parsing logic in the python code. One
more step towards implementing perf tools in Python. :)
- Support opening tracepoint events without libtraceevent. This will
be helpful if it won't use the tracing data like in 'perf stat'
- Update ARM Neoverse N2/V2 JSON events and metrics"
* tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits)
perf test: Update event_groups test to use instructions
perf bench: Fix undefined behavior in cmpworker()
perf annotate: Prefer passing evsel to evsel->core.idx
perf lock: Rename fields in lock_type_table
perf lock: Add percpu-rwsem for type filter
perf lock: Fix parse_lock_type which only retrieve one lock flag
perf lock: Fix return code for functions in __cmd_contention
perf hist: Fix width calculation in hpp__fmt()
perf hist: Fix bogus profiles when filters are enabled
perf hist: Deduplicate cmp/sort/collapse code
perf test: Improve verbose documentation
perf test: Add a runs-per-test flag
perf test: Fix parallel/sequential option documentation
perf test: Send list output to stdout rather than stderr
perf test: Rename functions and variables for better clarity
perf tools: Expose quiet/verbose variables in Makefile.perf
perf config: Add a function to set one variable in .perfconfig
perf test perftool_testsuite: Return correct value for skipping
perf test perftool_testsuite: Add missing description
perf test record+probe_libc_inet_pton: Make test resilient
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext updates from Tejun Heo:
- scx_bpf_now() added so that BPF scheduler can access the cached
timestamp in struct rq to avoid reading TSC multiple times within a
locked scheduling operation.
- Minor updates to the built-in idle CPU selection logic.
- tool/sched_ext updates and other misc changes.
* tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: fix kernel-doc warnings
sched_ext: Use time helpers in BPF schedulers
sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now()
sched_ext: Add time helpers for BPF schedulers
sched_ext: Add scx_bpf_now() for BPF scheduler
sched_ext: Implement scx_bpf_now()
sched_ext: Relocate scx_enabled() related code
sched_ext: Add option -l in selftest runner to list all available tests
sched_ext: Include remaining task time slice in error state dump
sched_ext: update scx_bpf_dsq_insert() doc for SCX_DSQ_LOCAL_ON
sched_ext: idle: small CPU iteration refactoring
sched_ext: idle: introduce check_builtin_idle_enabled() helper
sched_ext: idle: clarify comments
sched_ext: idle: use assign_cpu() to update the idle cpumask
sched_ext: Use str_enabled_disabled() helper in update_selcpu_topology()
sched_ext: Use sizeof_field for key_len in dsq_hash_params
tools/sched_ext: Receive updates from SCX repo
sched_ext: Use the NUMA scheduling domain for NUMA optimizations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull trace fing buffer fix from Steven Rostedt:
"Fix atomic64 operations on some architectures for the tracing ring
buffer:
- Have emulating atomic64 use arch_spin_locks instead of
raw_spin_locks
The tracing ring buffer events have a small timestamp that holds
the delta between itself and the event before it. But this can be
tricky to update when interrupts come in. It originally just set
the deltas to zero for events that interrupted the adding of
another event which made all the events in the interrupt have the
same timestamp as the event it interrupted. This was not suitable
for many tools, so it was eventually fixed. But that fix required
adding an atomic64 cmpxchg on the timestamp in cases where an event
was added while another event was in the process of being added.
Originally, for 32 bit architectures, the manipulation of the 64
bit timestamp was done by a structure that held multiple 32bit
words to hold parts of the timestamp and a counter. But as updates
to the ring buffer were done, maintaining this became too complex
and was replaced by the atomic64 generic operations which are now
used by both 64bit and 32bit architectures. Shortly after that, it
was reported that riscv32 and other 32 bit architectures that just
used the generic atomic64 were locking up. This was because the
generic atomic64 operations defined in lib/atomic64.c uses a
raw_spin_lock() to emulate an atomic64 operation. The problem here
was that raw_spin_lock() can also be traced by the function tracer
(which is commonly used for debugging raw spin locks). Since the
function tracer uses the tracing ring buffer, which now is being
traced internally, this was triggering a recursion and setting off
a warning that the spin locks were recusing.
There's no reason for the code that emulates atomic64 operations to
be using raw_spin_locks which have a lot of debugging
infrastructure attached to them (depending on the config options).
Instead it should be using the arch_spin_lock() which does not have
any infrastructure attached to them and is used by low level
infrastructure like RCU locks, lockdep and of course tracing. Using
arch_spin_lock()s fixes this issue.
- Do not trace in NMI if the architecture uses emulated atomic64
operations
Another issue with using the emulated atomic64 operations that uses
spin locks to emulate the atomic64 operations is that they cannot
be used in NMI context. As an NMI can trigger while holding the
atomic64 spin locks it can try to take the same lock and cause a
deadlock.
Have the ring buffer fail recording events if in NMI context and
the architecture uses the emulated atomic64 operations"
* tag 'trace-ringbuffer-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
atomic64: Use arch_spin_locks instead of raw_spin_locks
ring-buffer: Do not allow events in NMI with generic atomic64 cmpxchg()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull fgraph updates from Steven Rostedt:
"Remove calltime and rettime from fgraph infrastructure
The calltime and rettime were used by the function graph tracer to
calculate the timings of functions where it traced their entry and
exit. The calltime and rettime were stored in the generic structures
that were used for the mechanisms to add an entry and exit callback.
Now that function graph infrastructure is used by other subsystems
than just the tracer, the calltime and rettime are not needed for
them. Remove the calltime and rettime from the generic fgraph
infrastructure and have the callers that require them handle them"
* tag 'ftrace-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
fgraph: Remove calltime and rettime from generic operations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Cleanup with guard() and free() helpers
There were several places in the code that had a lot of "goto out" in
the error paths to either unlock a lock or free some memory that was
allocated. But this is error prone. Convert the code over to use the
guard() and free() helpers that let the compiler unlock locks or free
memory when the function exits.
- Update the Rust tracepoint code to use the C code too
There was some duplication of the tracepoint code for Rust that did
the same logic as the C code. Add a helper that makes it possible for
both algorithms to use the same logic in one place.
- Add poll to trace event hist files
It is useful to know when an event is triggered, or even with some
filtering. Since hist files of events get updated when active and the
event is triggered, allow applications to poll the hist file and wake
up when an event is triggered. This will let the application know
that the event it is waiting for happened.
- Add :mod: command to enable events for current or future modules
The function tracer already has a way to enable functions to be
traced in modules by writing ":mod:<module>" into set_ftrace_filter.
That will enable either all the functions for the module if it is
loaded, or if it is not, it will cache that command, and when the
module is loaded that matches <module>, its functions will be
enabled. This also allows init functions to be traced. But currently
events do not have that feature.
Add the command where if ':mod:<module>' is written into set_event,
then either all the modules events are enabled if it is loaded, or
cache it so that the module's events are enabled when it is loaded.
This also works from the kernel command line, where
"trace_event=:mod:<module>", when the module is loaded at boot up,
its events will be enabled then.
* tag 'trace-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
tracing: Fix output of set_event for some cached module events
tracing: Fix allocation of printing set_event file content
tracing: Rename update_cache() to update_mod_cache()
tracing: Fix #if CONFIG_MODULES to #ifdef CONFIG_MODULES
selftests/ftrace: Add test that tests event :mod: commands
tracing: Cache ":mod:" events for modules not loaded yet
tracing: Add :mod: command to enabled module events
selftests/tracing: Add hist poll() support test
tracing/hist: Support POLLPRI event for poll on histogram
tracing/hist: Add poll(POLLIN) support on hist file
tracing: Fix using ret variable in tracing_set_tracer()
tracepoint: Reduce duplication of __DO_TRACE_CALL
tracing/string: Create and use __free(argv_free) in trace_dynevent.c
tracing: Switch trace_stat.c code over to use guard()
tracing: Switch trace_stack.c code over to use guard()
tracing: Switch trace_osnoise.c code over to use guard() and __free()
tracing: Switch trace_events_synth.c code over to use guard()
tracing: Switch trace_events_filter.c code over to use guard()
tracing: Switch trace_events_trigger.c code over to use guard()
tracing: Switch trace_events_hist.c code over to use guard()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Fix use of KERNEL_VERSION in newly created output directory
If a new output directory is created (O=/dir), and one of the options
uses KERNEL_VERSION which will run a "make kernelversion" in the
output directory, it will fail because there is no config file yet.
In this case, have it do a "make allnoconfig" which is the minimal
needed to run the "make kernelversion".
- Remove unused variables
- Fix some typos
* tag 'ktest-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest.pl: Fix typo "accesing"
ktest.pl: Fix typo in comment
ktest.pl: Remove unused declarations in run_bisect_test function
ktest.pl: Check kernelrelease return in get_version
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
- kprobes: Cleanups using guard() and __free(): Use cleanup.h macros to
cleanup code and remove all gotos from kprobes code.
- tracing/probes: Also cleanups tracing/*probe events code with guard()
and __free(). These patches are just to simplify the parser codes.
- kprobes: Reduce preempt disable scope in check_kprobe_access_safe()
This reduces preempt disable time to only when getting the module
refcount in check_kprobe_access_safe().
Previously it disabled preempt needlessly for other checks including
jump_label_text_reserved(), which took a long time because of the
linear search.
* tag 'probes-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/kprobes: Simplify __trace_kprobe_create() by removing gotos
tracing: Use __free() for kprobe events to cleanup
tracing: Use __free() in trace_probe for cleanup
kprobes: Remove remaining gotos
kprobes: Remove unneeded goto
kprobes: Use guard for rcu_read_lock
kprobes: Use guard() for external locks
jump_label: Define guard() for jump_label_lock
tracing/eprobe: Adopt guard() and scoped_guard()
tracing/uprobe: Adopt guard() and scoped_guard()
tracing/kprobe: Adopt guard() and scoped_guard()
kprobes: Adopt guard() and scoped_guard()
kprobes: Reduce preempt disable scope in check_kprobe_access_safe()
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Fix oops in DebugData when link speed 0
- Two reparse point fixes
- Ten DFS (global namespace) fixes
- Symlink error handling fix
- Two SMB1 fixes
- Four cleanup fixes
- Improved debugging of status codes
- Fix incorrect output of tracepoints for compounding, and add missing
compounding tracepoint
* tag 'v6.14-rc-smb3-client-fixes-part' of git://git.samba.org/sfrench/cifs-2.6: (23 commits)
smb: client: handle lack of EA support in smb2_query_path_info()
smb: client: don't check for @leaf_fullpath in match_server()
smb: client: get rid of TCP_Server_Info::refpath_lock
cifs: Remove duplicate struct reparse_symlink_data and SYMLINK_FLAG_RELATIVE
cifs: Do not attempt to call CIFSGetSrvInodeNumber() without CAP_INFOLEVEL_PASSTHRU
cifs: Do not attempt to call CIFSSMBRenameOpenFile() without CAP_INFOLEVEL_PASSTHRU
cifs: Remove declaration of dead CIFSSMBQuerySymLink function
cifs: Fix printing Status code into dmesg
cifs: Add missing NT_STATUS_* codes from nterr.h to nterr.c
cifs: Fix endian types in struct rfc1002_session_packet
cifs: Use cifs_autodisable_serverino() for disabling CIFS_MOUNT_SERVER_INUM in readdir.c
smb3: add missing tracepoint for querying wsl EAs
smb: client: fix order of arguments of tracepoints
smb: client: fix oops due to unset link speed
smb: client: correctly handle ErrorContextData as a flexible array
smb: client: don't retry DFS targets on server shutdown
smb: client: fix return value of parse_dfs_referrals()
smb: client: optimize referral walk on failed link targets
smb: client: provide dns_resolve_{unc,name} helpers
smb: client: parse DNS domain name from domain= option
...
|
|
Pull smb server updates from Steve French:
"Three ksmbd server fixes:
- Fix potential memory corruption in IPC calls
- Support FSCTL_QUERY_INTERFACE_INFO for more configurations
- Remove some unused functions"
* tag 'v6.14-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix integer overflows on 32 bit systems
ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTL
ksmbd: Remove unused functions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify pre-content notification support from Jan Kara:
"This introduces a new fsnotify event (FS_PRE_ACCESS) that gets
generated before a file contents is accessed.
The event is synchronous so if there is listener for this event, the
kernel waits for reply. On success the execution continues as usual,
on failure we propagate the error to userspace. This allows userspace
to fill in file content on demand from slow storage. The context in
which the events are generated has been picked so that we don't hold
any locks and thus there's no risk of a deadlock for the userspace
handler.
The new pre-content event is available only for users with global
CAP_SYS_ADMIN capability (similarly to other parts of fanotify
functionality) and it is an administrator responsibility to make sure
the userspace event handler doesn't do stupid stuff that can DoS the
system.
Based on your feedback from the last submission, fsnotify code has
been improved and now file->f_mode encodes whether pre-content event
needs to be generated for the file so the fast path when nobody wants
pre-content event for the file just grows the additional file->f_mode
check. As a bonus this also removes the checks whether the old
FS_ACCESS event needs to be generated from the fast path. Also the
place where the event is generated during page fault has been moved so
now filemap_fault() generates the event if and only if there is no
uptodate folio in the page cache.
Also we have dropped FS_PRE_MODIFY event as current real-world users
of the pre-content functionality don't really use it so let's start
with the minimal useful feature set"
* tag 'fsnotify_hsm_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (21 commits)
fanotify: Fix crash in fanotify_init(2)
fs: don't block write during exec on pre-content watched files
fs: enable pre-content events on supported file systems
ext4: add pre-content fsnotify hook for DAX faults
btrfs: disable defrag on pre-content watched files
xfs: add pre-content fsnotify hook for DAX faults
fsnotify: generate pre-content permission event on page fault
mm: don't allow huge faults for files with pre content watches
fanotify: disable readahead if we have pre-content watches
fanotify: allow to set errno in FAN_DENY permission response
fanotify: report file range info with pre-content events
fanotify: introduce FAN_PRE_ACCESS permission event
fsnotify: generate pre-content permission event on truncate
fsnotify: pass optional file access range in pre-content event
fsnotify: introduce pre-content permission events
fanotify: reserve event bit of deprecated FAN_DIR_MODIFY
fanotify: rename a misnamed constant
fanotify: don't skip extra event info if no info_mode is set
fsnotify: check if file is actually being watched for pre-content events on open
fsnotify: opt-in for permission events at file open time
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull isofs update from Jan Kara:
"Partial conversion of isofs to folios"
* tag 'fs_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
isofs: Partially convert zisofs_read_folio to use a folio
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull inotify update from Jan Kara:
"A small inotify strcpy() cleanup"
* tag 'fsnotify_for_v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
inotify: Use strscpy() for event->name copies
|
|
Pull XFS updates from Carlos Maiolino:
"This is mostly focused on the implementation of reflink and
reverse-mapping support for XFS's real-time devices.
It also includes several bugfixes.
- Implement reflink support for the realtime device
- Implement reverse-mapping support for the realtime device
- Several bug fixes and cleanups"
* tag 'xfs-merge-6.14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits)
xfs: fix buffer lookup vs release race
xfs: check for dead buffers in xfs_buf_find_insert
xfs: add a b_iodone callback to struct xfs_buf
xfs: move b_li_list based retry handling to common code
xfs: simplify xfsaild_resubmit_item
xfs: always complete the buffer inline in xfs_buf_submit
xfs: remove the extra buffer reference in xfs_buf_submit
xfs: move invalidate_kernel_vmap_range to xfs_buf_ioend
xfs: simplify buffer I/O submission
xfs: move in-memory buftarg handling out of _xfs_buf_ioapply
xfs: move write verification out of _xfs_buf_ioapply
xfs: remove xfs_buf_delwri_submit_buffers
xfs: simplify xfs_buf_delwri_pushbuf
xfs: move xfs_buf_iowait out of (__)xfs_buf_submit
xfs: remove the incorrect comment about the b_pag field
xfs: remove the incorrect comment above xfs_buf_free_maps
xfs: fix a double completion for buffers on in-memory targets
xfs/libxfs: replace kmalloc() and memcpy() with kmemdup()
xfs: constify feature checks
xfs: refactor xfs_fs_statfs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"A smaller than usual release cycle.
The main changes are:
- Prepare selftest to run with GCC-BPF backend (Ihor Solodrai)
In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile
only mode. Half of the tests are failing, since support for
btf_decl_tag is still WIP, but this is a great milestone.
- Convert various samples/bpf to selftests/bpf/test_progs format
(Alexis Lothoré and Bastien Curutchet)
- Teach verifier to recognize that array lookup with constant
in-range index will always succeed (Daniel Xu)
- Cleanup migrate disable scope in BPF maps (Hou Tao)
- Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao)
- Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin
KaFai Lau)
- Refactor verifier lock support (Kumar Kartikeya Dwivedi)
This is a prerequisite for upcoming resilient spin lock.
- Remove excessive 'may_goto +0' instructions in the verifier that
LLVM leaves when unrolls the loops (Yonghong Song)
- Remove unhelpful bpf_probe_write_user() warning message (Marco
Elver)
- Add fd_array_cnt attribute for prog_load command (Anton Protopopov)
This is a prerequisite for upcoming support for static_branch"
* tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits)
selftests/bpf: Add some tests related to 'may_goto 0' insns
bpf: Remove 'may_goto 0' instruction in opt_remove_nops()
bpf: Allow 'may_goto 0' instruction in verifier
selftests/bpf: Add test case for the freeing of bpf_timer
bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
bpf: Bail out early in __htab_map_lookup_and_delete_elem()
bpf: Free special fields after unlock in htab_lru_map_delete_node()
tools: Sync if_xdp.h uapi tooling header
libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
bpf: selftests: verifier: Add nullness elision tests
bpf: verifier: Support eliding map lookup nullness
bpf: verifier: Refactor helper access type tracking
bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
bpf: verifier: Add missing newline on verbose() call
selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED
libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
libbpf: Fix return zero when elf_begin failed
selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test
veristat: Load struct_ops programs only once
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities updates from Serge Hallyn:
- remove the cap_mmap_file() hook, as it simply returned the default
return value and so doesn't need to exist (Paul Moore)
- add a trace event for cap_capable() (Jordan Rome)
* tag 'caps-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
security: add trace event for cap_capable
capabilities: remove cap_mmap_file()
|
|
The following failure was reported on HPE ProLiant D320:
[ 10.693310][ T1] tpm_tis STM0925:00: 2.0 TPM (device-id 0x3, rev-id 0)
[ 10.848132][ T1] ------------[ cut here ]------------
[ 10.853559][ T1] WARNING: CPU: 59 PID: 1 at mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330
[ 10.862827][ T1] Modules linked in:
[ 10.866671][ T1] CPU: 59 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-lp155.2.g52785e2-default #1 openSUSE Tumbleweed (unreleased) 588cd98293a7c9eba9013378d807364c088c9375
[ 10.882741][ T1] Hardware name: HPE ProLiant DL320 Gen12/ProLiant DL320 Gen12, BIOS 1.20 10/28/2024
[ 10.892170][ T1] RIP: 0010:__alloc_pages_noprof+0x2ca/0x330
[ 10.898103][ T1] Code: 24 08 e9 4a fe ff ff e8 34 36 fa ff e9 88 fe ff ff 83 fe 0a 0f 86 b3 fd ff ff 80 3d 01 e7 ce 01 00 75 09 c6 05 f8 e6 ce 01 01 <0f> 0b 45 31 ff e9 e5 fe ff ff f7 c2 00 00 08 00 75 42 89 d9 80 e1
[ 10.917750][ T1] RSP: 0000:ffffb7cf40077980 EFLAGS: 00010246
[ 10.923777][ T1] RAX: 0000000000000000 RBX: 0000000000040cc0 RCX: 0000000000000000
[ 10.931727][ T1] RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000040cc0
The above transcript shows that ACPI pointed a 16 MiB buffer for the log
events because RSI maps to the 'order' parameter of __alloc_pages_noprof().
Address the bug by moving from devm_kmalloc() to devm_add_action() and
kvmalloc() and devm_add_action().
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org # v2.6.16+
Fixes: 55a82ab3181b ("[PATCH] tpm: add bios measurement log")
Reported-by: Andy Liang <andy.liang@hpe.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219495
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Andy Liang <andy.liang@hpe.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull AT_EXECVE_CHECK from Kees Cook:
- Implement AT_EXECVE_CHECK flag to execveat(2) (Mickaël Salaün)
- Implement EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
(Mickaël Salaün)
- Add selftests and samples for AT_EXECVE_CHECK (Mickaël Salaün)
* tag 'AT_EXECVE_CHECK-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
ima: instantiate the bprm_creds_for_exec() hook
samples/check-exec: Add an enlighten "inc" interpreter and 28 tests
selftests: ktap_helpers: Fix uninitialized variable
samples/check-exec: Add set-exec
selftests/landlock: Add tests for execveat + AT_EXECVE_CHECK
selftests/exec: Add 32 tests for AT_EXECVE_CHECK and exec securebits
security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits
exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- stackleak: Use str_enabled_disabled() helper (Thorsten Blum)
- Document GCC INIT_STACK_ALL_PATTERN behavior (Geert Uytterhoeven)
- Add task_prctl_unknown tracepoint (Marco Elver)
* tag 'hardening-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
hardening: Document INIT_STACK_ALL_PATTERN behavior with GCC
stackleak: Use str_enabled_disabled() helper in stack_erasing_sysctl()
tracing: Remove pid in task_rename tracing output
tracing: Add task_prctl_unknown tracepoint
|
|
Pull tomoyo updates from Tetsuo Handa:
"Small changes to improve usability"
* tag 'tomoyo-pr-20250123' of git://git.code.sf.net/p/tomoyo/tomoyo:
tomoyo: automatically use patterns for several situations in learning mode
tomoyo: use realpath if symlink's pathname refers to procfs
tomoyo: don't emit warning in tomoyo_write_control()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This mostly factors out some Landlock code and prepares for upcoming
audit support.
Because files with invalid modes might be visible after filesystem
corruption, Landlock now handles those weird files too.
A few sample and test issues are also fixed"
* tag 'landlock-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Add layout1.umount_sandboxer tests
selftests/landlock: Add wrappers.h
selftests/landlock: Fix error message
landlock: Optimize file path walks and prepare for audit support
selftests/landlock: Add test to check partial access in a mount tree
landlock: Align partial refer access checks with final ones
landlock: Simplify initially denied access rights
landlock: Move access types
landlock: Factor out check_access_path()
selftests/landlock: Fix build with non-default pthread linking
landlock: Use scoped guards for ruleset in landlock_add_rule()
landlock: Use scoped guards for ruleset
landlock: Constify get_mode_access()
landlock: Handle weird files
samples/landlock: Fix possible NULL dereference in parse_path()
selftests/landlock: Remove unused macros in ptrace_test.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
directly accessible via the library API, instead of requiring the
crypto API. This is much simpler and more efficient.
- Convert some users such as ext4 to use the CRC32 library API instead
of the crypto API. More conversions like this will come later.
- Add a KUnit test that tests and benchmarks multiple CRC variants.
Remove older, less-comprehensive tests that are made redundant by
this.
- Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
volunteering to maintain it. I have additional cleanups and
optimizations planned for future cycles.
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
MAINTAINERS: add entry for CRC library
powerpc/crc: delete obsolete crc-vpmsum_test.c
lib/crc32test: delete obsolete crc32test.c
lib/crc16_kunit: delete obsolete crc16_kunit.c
lib/crc_kunit.c: add KUnit test suite for CRC library functions
powerpc/crc-t10dif: expose CRC-T10DIF function through lib
arm64/crc-t10dif: expose CRC-T10DIF function through lib
arm/crc-t10dif: expose CRC-T10DIF function through lib
x86/crc-t10dif: expose CRC-T10DIF function through lib
crypto: crct10dif - expose arch-optimized lib function
lib/crc-t10dif: add support for arch overrides
lib/crc-t10dif: stop wrapping the crypto API
scsi: target: iscsi: switch to using the crc32c library
f2fs: switch to using the crc32 library
jbd2: switch to using the crc32c library
ext4: switch to using the crc32c library
lib/crc32: make crc32c() go directly to lib
bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
x86/crc32: expose CRC32 functions through lib
x86/crc32: update prototype for crc32_pclmul_le_16()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull keys updates from Jarkko Sakkinen.
Avoid using stack addresses for sg lists. And a cleanup.
* tag 'keys-next-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
KEYS: trusted: dcp: fix improper sg use with CONFIG_VMAP_STACK=y
keys: drop shadowing dead prototype
|
|
If the server doesn't support both EAs and reparse point in a file,
the SMB2_QUERY_INFO request will fail with either
STATUS_NO_EAS_ON_FILE or STATUS_EAS_NOT_SUPPORT in the compound chain,
so ignore it as long as reparse point isn't
IO_REPARSE_TAG_LX_(CHR|BLK), which would require the EAs to know about
major/minor numbers.
Reported-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The matching of DFS connections is already handled by @dfs_conn, so
remove @leaf_fullpath matching altogether.
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
TCP_Server_Info::leaf_fullpath is allocated in cifs_get_tcp_session()
and never changed afterwards, so there is no need to serialize its
access.
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|