Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracepoint cleanup from Steven Rostedt:
"Remove or hide unused tracepoints
Tracepoints take up memory (around 5K per tracepoint) even when they
are unused. Changes are being made to detect when a tracepoint is
defined but unused and a warning is shown at build. But those changes
are not yet ready for inclusion.
- Fix some of the unused tracepoints that it detected
Some tracepoints were removed and others were hidden by config
settings to match the config settings of where they are
instantiated. Some tracepoints were moved into architecture
specific code as only one architecture used them.
- Call the ftrace_test_filter tracepoint in an unreachable if
statement
The ftrace_test_filter tracepoint which is defined when ftrace
selftests are configured and is used to test the filter logic, but
the tracepoint is not actually called. It is put into an if
statement to not have it get compiled out, but also not warn for
not being used"
* tag 'trace-unused-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: sched: Hide numa events under CONFIG_NUMA_BALANCING
powerpc/thp: tracing: Hide hugepage events under CONFIG_PPC_BOOK3S_64
tracing: Call trace_ftrace_test_filter() for the event
tracing: arm: arm64: Hide trace events ipi_raise, ipi_entry and ipi_exit
binder: Remove unused binder lock events
PM: tracing: Hide power_domain_target event under ARCH_OMAP2PLUS
PM: tracing: Hide device_pm_callback events under PM_SLEEP
PM: tracing: Hide psci_domain_idle events under ARM_PSCI_CPUIDLE
PM: cpufreq: powernv/tracing: Move powernv_throttle trace event
alarmtimer: Hide alarmtimer_suspend event when RTC_CLASS is not configured
tracing, AER: Hide PCIe AER event when PCIEAER is not configured
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"This adds support for the AMD hardware feedback interface (HFI), by
Perry Yuan"
* tag 'x86-platform-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/itmt: Add debugfs file to show core priorities
platform/x86/amd: hfi: Add debugfs support
platform/x86/amd: hfi: Set ITMT priority from ranking data
cpufreq/amd-pstate: Disable preferred cores on designs with workload classification
x86/process: Clear hardware feedback history for AMD processors
platform/x86: hfi: Add power management callback
platform/x86: hfi: Add online and offline callback support
platform/x86: hfi: Init per-cpu scores for each class
platform/x86: hfi: Parse CPU core ranking data from shared memory
platform/x86: hfi: Introduce AMD Hardware Feedback Interface Driver
x86/msr-index: Add AMD workload classification MSRs
MAINTAINERS: Add maintainer entry for AMD Hardware Feedback Driver
Documentation/x86: Add AMD Hardware Feedback Interface documentation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core updates from Danilo Krummrich:
"debugfs:
- Remove unneeded debugfs_file_{get,put}() instances
- Remove last remnants of debugfs_real_fops()
- Allow storing non-const void * in struct debugfs_inode_info::aux
sysfs:
- Switch back to attribute_group::bin_attrs (treewide)
- Switch back to bin_attribute::read()/write() (treewide)
- Constify internal references to 'struct bin_attribute'
Support cache-ids for device-tree systems:
- Add arch hook arch_compact_of_hwid()
- Use arch_compact_of_hwid() to compact MPIDR values on arm64
Rust:
- Device:
- Introduce CoreInternal device context (for bus internal methods)
- Provide generic drvdata accessors for bus devices
- Provide Driver::unbind() callbacks
- Use the infrastructure above for auxiliary, PCI and platform
- Implement Device::as_bound()
- Rename Device::as_ref() to Device::from_raw() (treewide)
- Implement fwnode and device property abstractions
- Implement example usage in the Rust platform sample driver
- Devres:
- Remove the inner reference count (Arc) and use pin-init instead
- Replace Devres::new_foreign_owned() with devres::register()
- Require T to be Send in Devres<T>
- Initialize the data kept inside a Devres last
- Provide an accessor for the Devres associated Device
- Device ID:
- Add support for ACPI device IDs and driver match tables
- Split up generic device ID infrastructure
- Use generic device ID infrastructure in net::phy
- DMA:
- Implement the dma::Device trait
- Add DMA mask accessors to dma::Device
- Implement dma::Device for PCI and platform devices
- Use DMA masks from the DMA sample module
- I/O:
- Implement abstraction for resource regions (struct resource)
- Implement resource-based ioremap() abstractions
- Provide platform device accessors for I/O (remap) requests
- Misc:
- Support fallible PinInit types in Revocable
- Implement Wrapper<T> for Opaque<T>
- Merge pin-init blanket dependencies (for Devres)
Misc:
- Fix OF node leak in auxiliary_device_create()
- Use util macros in device property iterators
- Improve kobject sample code
- Add device_link_test() for testing device link flags
- Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
- Hint to prefer container_of_const() over container_of()"
* tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (84 commits)
rust: io: fix broken intra-doc links to `platform::Device`
rust: io: fix broken intra-doc link to missing `flags` module
rust: io: mem: enable IoRequest doc-tests
rust: platform: add resource accessors
rust: io: mem: add a generic iomem abstraction
rust: io: add resource abstraction
rust: samples: dma: set DMA mask
rust: platform: implement the `dma::Device` trait
rust: pci: implement the `dma::Device` trait
rust: dma: add DMA addressing capabilities
rust: dma: implement `dma::Device` trait
rust: net::phy Change module_phy_driver macro to use module_device_table macro
rust: net::phy represent DeviceId as transparent wrapper over mdio_device_id
rust: device_id: split out index support into a separate trait
device: rust: rename Device::as_ref() to Device::from_raw()
arm64: cacheinfo: Provide helper to compress MPIDR value into u32
cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
cacheinfo: Set cache 'id' based on DT data
container_of: Document container_of() is not to be used in new code
driver core: auxiliary bus: fix OF node leak
...
|
|
AMU counters on certain CPPC-based platforms tend to yield inaccurate
delivered performance measurements on systems that are idle/mostly idle.
This results in an inaccurate frequency being stored by cpufreq in its
policy structure when the CPU is brought online. [1]
Consequently, if the userspace governor tries to set the frequency to a
new value, there is a possibility that it would be the erroneous value
stored earlier. In such a scenario, cpufreq would assume that the
requested frequency has already been set and return early, resulting in
the correct/new frequency request never making it to the hardware.
Since the operating frequency is liable to this sort of inconsistency,
mark the CPPC driver with CPUFREQ_NEED_UPDATE_LIMITS so that it is always
invoked when a target frequency update is requested.
Link: https://lore.kernel.org/linux-pm/20250619000925.415528-3-pmalani@google.com/ [1]
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Prashant Malani <pmalani@google.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250722055611.130574-2-pmalani@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As the trace event powernv_throttle is only used by the powernv code, move
it to a separate include file and have that code directly enable it.
Trace events can take up around 5K of memory when they are defined
regardless if they are used or not. It wastes memory to have them defined
in configurations where the tracepoint is not used.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/20250612145407.906308844@goodmis.org
Fixes: 0306e481d479a ("cpufreq: powernv/tracing: Add powernv_throttle tracepoint")
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge CPUFreq updates for 6.17 from Viresh Kumar:
"- tegra124: Allow building as a module (Aaron Kling).
- Minor cleanups for Rust cpufreq and cpumask APIs and fix MAINTAINERS
entry for cpu.rs (Abhinav Ananthu, Ritvik Gupta, and Lukas Bulwahn).
- Minor cleanups for miscellaneous cpufreq drivers (Arnd Bergmann, Dan
Carpenter, Krzysztof Kozlowski, Sven Peter, and Svyatoslav Ryhel)."
* tag 'cpufreq-arm-updates-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
drivers: cpufreq: add Tegra114 support
rust: cpumask: Replace `MaybeUninit` and `mem::zeroed` with `Opaque` APIs
cpufreq: tegra124: Allow building as a module
cpufreq: dt: Add register helper
cpufreq: Export disable_cpufreq()
cpufreq: armada-8k: Fix off by one in armada_8k_cpufreq_free_table()
cpufreq: armada-8k: make both cpu masks static
rust: cpufreq: use c_ types from kernel prelude
rust: cpufreq: Ensure C ABI compatibility in all unsafe
cpufreq: brcmstb-avs: Fully open-code compatible for grepping
cpufreq: apple: drop default ARCH_APPLE in Kconfig
MAINTAINERS: adjust file entry in CPU HOTPLUG
|
|
Tegra114 is fully compatible with existing Tegra124 cpufreq driver.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Detect the result of starting old governor in cpufreq_set_policy(). If it
fails, exit the governor and clear policy->governor.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpufreq_verify_current_freq()
Move the check of cpufreq_driver->get into cpufreq_verify_current_freq() in
case of calling it without check.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In cpufreq_policy_put_kobj(), policy->rwsem is used. But in
cpufreq_policy_alloc(), if freq_qos_add_notifier() returns an error, error
path via err_kobj_remove or err_min_qos_notifier will be reached and
cpufreq_policy_put_kobj() will be called before policy->rwsem is
initialized. Thus, the calling of init_rwsem() should be moved to where
before these two error paths can be reached.
Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The cpufreq-based invariance is enabled in cpufreq_register_driver(),
but never disabled after registration fails. Move the invariance
initialization to where all other initializations have been successfully
done to solve this problem.
Fixes: 874f63531064 ("cpufreq: report whether cpufreq supports Frequency Invariance (FI)")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250709104145.2348017-2-zhenglifeng1@huawei.com
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The has_target() checks in __cpufreq_offline() are duplicate.
Remove one of them and put the operations of exiting governor together
with storing last governor's name.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250623133402.3120230-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After commit c034b02e213d ("cpufreq: expose scaling_cur_freq sysfs file
for set_policy() drivers"), the file scaling_cur_freq is exposed to all
drivers.
No need to create this file separately. It's better to be contained in
cpufreq_attrs.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250623133402.3120230-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Users may disable HWP in firmware, in which case intel_pstate
wouldn't load unless the CPU model is explicitly supported.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://patch.msgid.link/20250623105601.3924-1-lirongqing@baidu.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the passive mode, intel_cpufreq_update_pstate() sets HWP_MIN_PERF in
accordance with the target frequency to ensure delivering adequate
performance, but it sets HWP_DESIRED_PERF to 0, so the processor has no
indication that the desired performance level is actually equal to the
floor one. This may cause it to choose a performance point way above
the desired level.
Moreover, this is inconsistent with intel_cpufreq_adjust_perf() which
actually sets HWP_DESIRED_PERF in accordance with the target performance
value.
Address this by adjusting intel_cpufreq_update_pstate() to pass
target_pstate as both the minimum and the desired performance levels
to intel_cpufreq_hwp_update().
Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/6173276.lOV4Wx5bFT@rjwysocki.net
|
|
This requires four changes:
* Using the cpufreq-dt register helper to establish a hard dependency
for depmod to track
* Adding a remove routine to remove the cpufreq-dt device
* Adding a exit routine to handle cleaning up the driver
* Populating module license
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
Cpufreq-dt currently exports no functions. This means that drivers that
are based on cpufreq-dt have no way of establishing a depmod dependency
on it. This helper allows that link.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
This is used by the tegra124-cpufreq driver.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
classification
On designs that have workload classification, it's preferred that
the amd-hfi driver is used to provide hints to the scheduler of
which cores to use instead of the amd-pstate driver.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/20250609200518.3616080-11-superm1@kernel.org
|
|
The freq_tables[] array has num_possible_cpus() elements so, to avoid an
out of bounds access, this loop should be capped at "< nb_cpus" instead
of "<= nb_cpus". The freq_tables[] array is allocated in
armada_8k_cpufreq_init().
Cc: stable@vger.kernel.org
Fixes: f525a670533d ("cpufreq: ap806: add cpufreq driver for Armada 8K")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
We need the driver-core fixes that are in 6.16-rc3 into here as well
to build on top of.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
An earlier patch marked one of the two CPU masks as 'static' to reduce stack
usage, but if CONFIG_NR_CPUS is large enough, the function still produces
a warning for compile testing:
drivers/cpufreq/armada-8k-cpufreq.c: In function 'armada_8k_cpufreq_init':
drivers/cpufreq/armada-8k-cpufreq.c:203:1: error: the frame size of 1416 bytes is larger than 1408 bytes [-Werror=frame-larger-than=]
Normally this should be done using alloc_cpumask_var(), but since the
driver already has a static mask and the probe function is not called
concurrently, use the same trick for both.
Fixes: 1ffec650d07f ("cpufreq: armada-8k: Avoid excessive stack usage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
It is very useful to find driver implementing compatibles with `git grep
compatible`, so driver should not use defines for that string, even if
this means string will be effectively duplicated.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
When the first driver for Apple Silicon was upstreamed we accidentally
included `default ARCH_APPLE` in its Kconfig which then spread to almost
every subsequent driver. As soon as ARCH_APPLE is set to y this will
pull in many drivers as built-ins which is not what we want.
Thus, drop `default ARCH_APPLE` from Kconfig.
Signed-off-by: Sven Peter <sven@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
When the userspace governor is used, the user intends to set a fixed CPU
frequency for a policy, for whatever reason. The CPUFREQ_GOV_STRICT_TARGET
flag is the required behaviour. Without this flag, the intel_pstate driver,
with HWP enabled, will set HWP_MIN_PERF to the target frequency and
HWP_MAX_PERF to the policy maximum, when configuring the HWP_REQUEST MSR.
This lets the hardware choose any frequency between the target frequency
and the policy maximum, which is not the intended behaviour. To fix this,
`cat scaling_setspeed > scaling_max_freq` had to be done. With this patch,
that is no longer necessary. Setting scaling_setspeed is sufficient, as
expected.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Link: https://patch.msgid.link/20250527-userspace-governor-doc-v2-1-0e22c69920f2@sony.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cppc_cpufreq_register_em() is only used in populate_efficiency_class(). A
forward declaration of it is not necessary.
Move cppc_cpufreq_register_em() in front of populate_efficiency_class()
and remove the forward declaration of cppc_cpufreq_register_em().
No functional change.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250526113057.3086513-4-zhenglifeng1@huawei.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The return value of populate_efficiency_class() is never needed and the
result of it doesn't affect the initialization of cppc_cpufreq.
It makes more sense to change it into a void function.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250526113057.3086513-3-zhenglifeng1@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After commit a28b2bfc099c ("cppc_cpufreq: replace per-cpu data array with a
list"), cpu_data can be got from policy->driver_data, so cpu_data_list is
not actually needed and can be removed.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20250526113057.3086513-2-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The new FwNode abstraction will be used for accessing all device
properties.
It would be possible to duplicate the methods on the device itself, but
since some of the methods on Device would have different type sigatures
as the ones on FwNode, this would only lead to inconsistency and
confusion. For this reason, property_present is removed from Device and
existing users are updated.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Remo Senekowitsch <remo@buenzli.dev>
Link: https://lore.kernel.org/r/20250611102908.212514-4-remo@buenzli.dev
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Use the newly defined `CpuId` abstraction instead of raw CPU numbers.
This also fixes a doctest failure for configurations where `nr_cpu_ids <
4`.
The C `cpumask_{set|clear}_cpu()` APIs emit a warning when given an
invalid CPU number — but only if `CONFIG_DEBUG_PER_CPU_MAPS=y` is set.
Meanwhile, `cpumask_weight()` only considers CPUs up to `nr_cpu_ids`,
which can cause inconsistencies: a CPU number greater than `nr_cpu_ids`
may be set in the mask, yet the weight calculation won't reflect it.
This leads to doctest failures when `nr_cpu_ids < 4`, as the test tries
to set CPUs 2 and 3:
rust_doctest_kernel_cpumask_rs_0.location: rust/kernel/cpumask.rs:180
rust_doctest_kernel_cpumask_rs_0: ASSERTION FAILED at rust/kernel/cpumask.rs:190
Fixes: 8961b8cb3099 ("rust: cpumask: Add initial abstractions")
Reported-by: Miguel Ojeda <ojeda@kernel.org>
Closes: https://lore.kernel.org/rust-for-linux/CANiq72k3ozKkLMinTLQwvkyg9K=BeRxs1oYZSKhJHY-veEyZdg@mail.gmail.com/
Reported-by: Andreas Hindborg <a.hindborg@kernel.org>
Closes: https://lore.kernel.org/all/87qzzy3ric.fsf@kernel.org/
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
|
|
Move this API to the canonical timer_*() namespace.
[ tglx: Redone against pre rc1 ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "hung_task: extend blocking task stacktrace dump to semaphore" from
Lance Yang enhances the hung task detector.
The detector presently dumps the blocking tasks's stack when it is
blocked on a mutex. Lance's series extends this to semaphores
- "nilfs2: improve sanity checks in dirty state propagation" from
Wentao Liang addresses a couple of minor flaws in nilfs2
- "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn
fixes a couple of issues in the gdb scripts
- "Support kdump with LUKS encryption by reusing LUKS volume keys" from
Coiby Xu addresses a usability problem with kdump.
When the dump device is LUKS-encrypted, the kdump kernel may not have
the keys to the encrypted filesystem. A full writeup of this is in
the series [0/N] cover letter
- "sysfs: add counters for lockups and stalls" from Max Kellermann adds
/sys/kernel/hardlockup_count and /sys/kernel/hardlockup_count and
/sys/kernel/rcu_stall_count
- "fork: Page operation cleanups in the fork code" from Pasha Tatashin
implements a number of code cleanups in fork.c
- "scripts/gdb/symbols: determine KASLR offset on s390 during early
boot" from Ilya Leoshkevich fixes some s390 issues in the gdb
scripts
* tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits)
llist: make llist_add_batch() a static inline
delayacct: remove redundant code and adjust indentation
squashfs: add optional full compressed block caching
crash_dump, nvme: select CONFIGFS_FS as built-in
scripts/gdb/symbols: determine KASLR offset on s390 during early boot
scripts/gdb/symbols: factor out pagination_off()
scripts/gdb/symbols: factor out get_vmlinux()
kernel/panic.c: format kernel-doc comments
mailmap: update and consolidate Casey Connolly's name and email
nilfs2: remove wbc->for_reclaim handling
fork: define a local GFP_VMAP_STACK
fork: check charging success before zeroing stack
fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code
fork: clean-up ifdef logic around stack allocation
kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count
x86/crash: make the page that stores the dm crypt keys inaccessible
x86/crash: pass dm crypt keys to kdump kernel
Revert "x86/mm: Remove unused __set_memory_prot()"
crash_dump: retrieve dm crypt keys in kdump kernel
...
|
|
Merge Rust support for cpufreq and OPP, a new Rust-based cpufreq-dt
driver, an SCMI cpufreq driver cleanup, and an ACPI cpufreq driver
regression fix:
- Add Rust abstractions for CPUFreq framework (Viresh Kumar).
- Add Rust abstractions for OPP framework (Viresh Kumar).
- Add basic Rust abstractions for Clk and Cpumask frameworks (Viresh
Kumar).
- Clean up the SCMI cpufreq driver somewhat (Mike Tipton).
- Use KHz as the nominal_freq units in get_max_boost_ratio() in the
ACPI cpufreq driver (iGautham Shenoy).
* pm-cpufreq:
acpi-cpufreq: Fix nominal_freq units to KHz in get_max_boost_ratio()
rust: opp: Move `cfg(CONFIG_OF)` attribute to the top of doc test
rust: opp: Make the doctest example depend on CONFIG_OF
cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs
cpufreq: Add Rust-based cpufreq-dt driver
rust: opp: Extend OPP abstractions with cpufreq support
rust: cpufreq: Extend abstractions for driver registration
rust: cpufreq: Extend abstractions for policy and driver ops
rust: cpufreq: Add initial abstractions for cpufreq framework
rust: opp: Add abstractions for the configuration options
rust: opp: Add abstractions for the OPP table
rust: opp: Add initial abstractions for OPP framework
rust: cpu: Add from_cpu()
rust: macros: enable use of hyphens in module names
rust: clk: Add initial abstractions
rust: clk: Add helpers for Rust code
MAINTAINERS: Add entry for Rust cpumask API
rust: cpumask: Add initial abstractions
rust: cpumask: Add few more helpers
|
|
commit 083466754596 ("cpufreq: ACPI: Fix max-frequency computation")
modified get_max_boost_ratio() to return the nominal_freq advertised
in the _CPC object. This was for the purposes of computing the maximum
frequency. The frequencies advertised in _CPC objects are in
MHz. However, cpufreq expects the frequency to be in KHz. Since the
nominal_freq returned by get_max_boost_ratio() was not in KHz but
instead in MHz,the cpuinfo_max_frequency that was computed using this
nominal_freq was incorrect and an invalid value which resulted in
cpufreq reporting the P0 frequency as the cpuinfo_max_freq.
Fix this by converting the nominal_freq to KHz before returning the
same from get_max_boost_ratio().
Reported-by: Manu Bretelle <chantr4@gmail.com>
Closes: https://lore.kernel.org/lkml/aDaB63tDvbdcV0cg@HQ-GR2X1W2P57/
Fixes: 083466754596 ("cpufreq: ACPI: Fix max-frequency computation")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Cc: 6.14+ <stable@vger.kernel.org> # 6.14+
Link: https://patch.msgid.link/20250529085143.709-1-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Once again, the changes are dominated by cpufreq updates, but this
time the majority of them are cpufreq core changes, mostly related to
the introduction of policy locking guards and __free() usage, and
fixes related to boost handling.
Still, there is also a significant update of the intel_pstate driver
making it register an energy model when running on a hybrid platform
which is used for enabling energy-aware scheduling (EAS) if the driver
operates in the passive mode (and schedutil is used as the cpufreq
governor for all CPUs which is the passive mode default).
There are some amd-pstate driver updates too, for a good measure,
including the "Requested CPU Min frequency" BIOS option support and
new online/offline callbacks.
In the cpuidle space, the most significant change is the addition of a
C1 demotion on/off sysfs knob to intel_idle which should help some
users to configure their systems more precisely. There is also the
conversion of the PSCI cpuidle driver to a faux device one and there
are two small updates of cpuidle governors.
Device power management is also modified quite a bit, especially the
handling of devices with asynchronous suspend and resume enabled
during system transitions. They are now going to be handled more
asynchronously during suspend transitions and somewhat less
aggressively during resume transitions.
Apart from the above, the operating performance points (OPP) library
is now going to use mutex locking guards and scope-based cleanup
helpers and there is the usual bunch of assorted fixes and code
cleanups.
Specifics:
- Fix potential division-by-zero error in em_compute_costs() (Yaxiong
Tian)
- Fix typos in energy model documentation and example driver code
(Moon Hee Lee, Atul Kumar Pant)
- Rearrange the energy model management code and add a new function
for adjusting a CPU energy model after adjusting the capacity of
the given CPU to it (Rafael Wysocki)
- Refactor cpufreq_online(), add and use cpufreq policy locking
guards, use __free() in policy reference counting, and clean up
core cpufreq code on top of that (Rafael Wysocki)
- Fix boost handling on CPU suspend/resume and sysfs updates (Viresh
Kumar)
- Fix des_perf clamping with max_perf in amd_pstate_update()
(Dhananjay Ugwekar)
- Add offline, online and suspend callbacks to the amd-pstate driver,
rename and use the existing amd_pstate_epp callbacks in it
(Dhananjay Ugwekar)
- Add support for the "Requested CPU Min frequency" BIOS option to
the amd-pstate driver (Dhananjay Ugwekar)
- Reset amd-pstate driver mode after running selftests (Swapnil
Sapkal)
- Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan
Chancellor)
- Add helper for governor checks to the schedutil cpufreq governor
and move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki)
- Populate the cpu_capacity sysfs entries from the intel_pstate
driver after registering asym capacity support (Ricardo Neri)
- Add support for enabling Energy-aware scheduling (EAS) to the
intel_pstate driver when operating in the passive mode on a hybrid
platform (Rafael Wysocki)
- Drop redundant cpus_read_lock() from store_local_boost() in the
cpufreq core (Seyediman Seyedarab)
- Replace sscanf() with kstrtouint() in the cpufreq code and use a
symbol instead of a raw number in it (Bowen Yu)
- Add support for autonomous CPU performance state selection to the
CPPC cpufreq driver (Lifeng Zheng)
- OPP: Add dev_pm_opp_set_level() (Praveen Talari)
- Introduce scope-based cleanup headers and mutex locking guards in
OPP core (Viresh Kumar)
- Switch OPP to use kmemdup_array() (Zhang Enpei)
- Optimize bucket assignment when next_timer_ns equals KTIME_MAX in
the menu cpuidle governor (Zhongqiu Han)
- Convert the cpuidle PSCI driver to a faux device one (Sudeep Holla)
- Add C1 demotion on/off sysfs knob to the intel_idle driver (Artem
Bityutskiy)
- Fix typos in two comments in the teo cpuidle governor (Atul Kumar
Pant)
- Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja
Kalla)
- Move debug runtime PM attributes to runtime_attrs[] (Rafael
Wysocki)
- Add new devm_ functions for enabling runtime PM and runtime PM
reference counting (Bence Csókás)
- Remove size arguments from strscpy() calls in the hibernation core
code (Thorsten Blum)
- Adjust the handling of devices with asynchronous suspend enabled
during system suspend and resume to start resuming them immediately
after resuming their parents and to start suspending such a device
immediately after suspending its first child (Rafael Wysocki)
- Adjust messages printed during tasks freezing to avoid using
pr_cont() (Andrew Sayers, Paul Menzel)
- Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan
Zhang)
- Add missing wakeup source attribute relax_count to sysfs and remove
the space character at the end ofi the string produced by
pm_show_wakelocks() (Zijun Hu)
- Add configurable pm_test delay for hibernation (Zihuan Zhang)
- Disable asynchronous suspend in ucsi_ccg_probe() to prevent the
cypd4226 device on Tegra boards from suspending prematurely (Jon
Hunter)
- Unbreak printing PM debug messages during hibernation and clean up
some related code (Rafael Wysocki)
- Add a systemd service to run cpupower and change cpupower binding's
Makefile to use -lcpupower (John B. Wyatt IV, Francesco Poli)"
* tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
cpufreq: CPPC: Add support for autonomous selection
cpufreq: Update sscanf() to kstrtouint()
cpufreq: Replace magic number
OPP: switch to use kmemdup_array()
PM: freezer: Rewrite restarting tasks log to remove stray *done.*
PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn()
cpufreq: drop redundant cpus_read_lock() from store_local_boost()
cpupower: do not install files to /etc/default/
cpupower: do not call systemctl at install time
cpupower: do not write DESTDIR to cpupower.service
PM: sleep: Introduce pm_sleep_transition_in_progress()
cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver()
cpufreq: intel_pstate: Document hybrid processor support
cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache
cpufreq: intel_pstate: EAS support for hybrid platforms
PM: EM: Introduce em_adjust_cpu_capacity()
PM: EM: Move CPU capacity check to em_adjust_new_capacity()
PM: EM: Documentation: Fix typos in example driver code
cpufreq: Drop policy locking from cpufreq_policy_is_good_for_eas()
PM: sleep: Introduce pm_suspend_in_progress()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"The most significant part of these changes is an ACPICA update
covering two upstream ACPICA releases, 20241212 and 20250404, that
have not been included into the kernel code base yet.
Among other things, it adds definitions needed to address GCC 15's
-Wunterminated-string-initialization warnings, adds support for three
new tables (MRRM, ERDT, RIMT), extends support for two tables (RAS2,
DMAR), and fixes some issues.
On top of the above, there is a new parser for the MRRM table, more
changes related to GCC 15's -Wunterminated-string-initialization
warnings, a CPPC library update including functions related to
autonomous CPU performance state selection, a couple of new quirks,
some assorted fixes and some code cleanups.
Specifics:
- Fix two ACPICA SLAB cache leaks (Seunghun Han)
- Add EINJv2 get error type action and define Error Injection Actions
in hex values to avoid inconsistencies between the specification
and the code (Zaid Alali)
- Fix typo in comments for SRAT structures (Adam Lackorzynski)
- Prevent possible loss of data in ACPICA because of u32 to u8
conversions (Saket Dumbre)
- Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin)
- Add support for printing AML arguments when the ACPICA debug level
is ACPI_LV_TRACE_POINT (Mario Limonciello)
- Drop a stale comment about the file content from actbl2.h (Sudeep
Holla)
- Apply pack(1) to union aml_resource (Tamir Duberstein)
- Fix overflow check in the ACPICA version of vsnprintf() (gldrk)
- Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
specification (Alexey Neyman)
- Add typedef and other definitions related to MRRM to ACPICA (Tony
Luck)
- Add definitions for RIMT to ACPICA (Sunil V L)
- Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
utilities code (Colin Ian King)
- Add typedef and other definitions for ERDT to ACPICA (Tony Luck)
- Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem)
- Rename structure and field names of the RAS2 table in actbl2.h
(Shiju Jose)
- Fix up whitespace in acpica/utcache.c (Zhe Qiao)
- Avoid sequence overread in a call to strncmp() in
ap_get_table_length() and replace strncpy() with memcpy() in ACPICA
in some places (Ahmed Salem)
- Update copyright year in all ACPICA files (Saket Dumbre)
- Add __nonstring annotations for unterminated strings in the static
ACPI tables parsing code (Kees Cook)
- Add support for parsing the MRRM ACPI table and sysfs files to
describe memory regions listed in it (Tony Luck, Anil
Keshavamurthy)
- Remove an (explicitly) unused header file include from the VIOT
ACPI table parser file (Andy Shevchenko)
- Improve logging around acpi_initialize_tables() (Bartosz
Szczepanek)
- Clean up the initialization of CPU data structures in the ACPI
processor driver (Zhang Rui)
- Remove an obsolete comment regarding the C-states handling in the
ACPI processor driver (Giovanni Gherdovich)
- Simplify PCC shared memory region handling (Sudeep Holla)
- Rework and extend functions for reading CPPC register values and
for updating CPPC registers (Lifeng Zheng)
- Add three functions related to autonomous CPU performance state
selection to the CPPC library (Lifeng Zheng)
- Turn the acpi_pci_root_remap_iospace() fwnode_handle parameter into
a const pointer (Pei Xiao)
- Round battery capacity percengate in the ACPI battery driver to the
closest integer to avoid user confusion (shitao)
- Make the ACPI battery driver report the current as a negative
number to the power supply framework when the battery is
discharging as documented (Peter Marheine)
- Add TUXEDO InfinityBook Pro AMD Gen9 to the acpi_ec_no_wakeup[]
list to prevent spurious wakeups from suspend-to-idle (Werner
Sembach)
- Convert the APEI EINJ driver to a faux device one (Sudeep Holla,
Jon Hunter)
- Remove redundant calls to einj_get_available_error_type() from the
APEI EINJ driver (Zaid Alali)
- Fix a typo for MECHREVO in irq1_edge_low_force_override[] (Mingcong
Bai)
- Add an LPS0 check() callback to the AMD pinctrl driver and fix up
config symbol dependencies in it (Mario Limonciello, Rafael
Wysocki)
- Avoid initializing the ACPI platform profile driver on non-ACPI
platforms (Alexandre Ghiti)
- Document that references to ACPI data (non-device) nodes should use
string-only references in hierarchical data node packages (Sakari
Ailus)
- Fail the ACPI bus registration if acpi_kobj registration fails
(Armin Wolf)"
* tag 'acpi-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
ACPI: MRRM: Fix default max memory region
ACPI: bus: Bail out if acpi_kobj registration fails
ACPI: platform_profile: Avoid initializing on non-ACPI platforms
pinctrl: amd: Fix hibernation support with CONFIG_SUSPEND unset
ACPI: tables: Improve logging around acpi_initialize_tables()
ACPI: VIOT: Remove (explicitly) unused header
ACPI: Add documentation for exposing MRRM data
ACPI: MRRM: Add /sys files to describe memory ranges
ACPI: MRRM: Minimal parse of ACPI MRRM table
ACPICA: Update copyright year
ACPICA: Logfile: Changes for version 20250404
ACPICA: Replace strncpy() with memcpy()
ACPICA: Apply ACPI_NONSTRING in more places
ACPICA: Avoid sequence overread in call to strncmp()
ACPICA: Adjust the position of code lines
ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table
ACPICA: Apply ACPI_NONSTRING
ACPICA: Introduce ACPI_NONSTRING
ACPICA: actbl2.h: ERDT: Add typedef and other definitions
ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core x86 updates from Ingo Molnar:
"Boot code changes:
- A large series of changes to reorganize the x86 boot code into a
better isolated and easier to maintain base of PIC early startup
code in arch/x86/boot/startup/, by Ard Biesheuvel.
Motivation & background:
| Since commit
|
| c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
|
| dated Jun 6 2017, we have been using C code on the boot path in a way
| that is not supported by the toolchain, i.e., to execute non-PIC C
| code from a mapping of memory that is different from the one provided
| to the linker. It should have been obvious at the time that this was a
| bad idea, given the need to sprinkle fixup_pointer() calls left and
| right to manipulate global variables (including non-pointer variables)
| without crashing.
|
| This C startup code has been expanding, and in particular, the SEV-SNP
| startup code has been expanding over the past couple of years, and
| grown many of these warts, where the C code needs to use special
| annotations or helpers to access global objects.
This tree includes the first phase of this work-in-progress x86
boot code reorganization.
Scalability enhancements and micro-optimizations:
- Improve code-patching scalability (Eric Dumazet)
- Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)
CPU features enumeration updates:
- Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S.
Darwish)
- Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish,
Thomas Gleixner)
- Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)
Memory management changes:
- Allow temporary MMs when IRQs are on (Andy Lutomirski)
- Opt-in to IRQs-off activate_mm() (Andy Lutomirski)
- Simplify choose_new_asid() and generate better code (Borislav
Petkov)
- Simplify 32-bit PAE page table handling (Dave Hansen)
- Always use dynamic memory layout (Kirill A. Shutemov)
- Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)
- Make 5-level paging support unconditional (Kirill A. Shutemov)
- Stop prefetching current->mm->mmap_lock on page faults (Mateusz
Guzik)
- Predict valid_user_address() returning true (Mateusz Guzik)
- Consolidate initmem_init() (Mike Rapoport)
FPU support and vector computing:
- Enable Intel APX support (Chang S. Bae)
- Reorgnize and clean up the xstate code (Chang S. Bae)
- Make task_struct::thread constant size (Ingo Molnar)
- Restore fpu_thread_struct_whitelist() to fix
CONFIG_HARDENED_USERCOPY=y (Kees Cook)
- Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg
Nesterov)
- Always preserve non-user xfeatures/flags in __state_perm (Sean
Christopherson)
Microcode loader changes:
- Help users notice when running old Intel microcode (Dave Hansen)
- AMD: Do not return error when microcode update is not necessary
(Annie Li)
- AMD: Clean the cache if update did not load microcode (Boris
Ostrovsky)
Code patching (alternatives) changes:
- Simplify, reorganize and clean up the x86 text-patching code (Ingo
Molnar)
- Make smp_text_poke_batch_process() subsume
smp_text_poke_batch_finish() (Nikolay Borisov)
- Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)
Debugging support:
- Add early IDT and GDT loading to debug relocate_kernel() bugs
(David Woodhouse)
- Print the reason for the last reset on modern AMD CPUs (Yazen
Ghannam)
- Add AMD Zen debugging document (Mario Limonciello)
- Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)
- Stop decoding i64 instructions in x86-64 mode at opcode (Masami
Hiramatsu)
CPU bugs and bug mitigations:
- Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)
- Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)
- Restructure and harmonize the various CPU bug mitigation methods
(David Kaplan)
- Fix spectre_v2 mitigation default on Intel (Pawan Gupta)
MSR API:
- Large MSR code and API cleanup (Xin Li)
- In-kernel MSR API type cleanups and renames (Ingo Molnar)
PKEYS:
- Simplify PKRU update in signal frame (Chang S. Bae)
NMI handling code:
- Clean up, refactor and simplify the NMI handling code (Sohil Mehta)
- Improve NMI duration console printouts (Sohil Mehta)
Paravirt guests interface:
- Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)
SEV support:
- Share the sev_secrets_pa value again (Tom Lendacky)
x86 platform changes:
- Introduce the <asm/amd/> header namespace (Ingo Molnar)
- i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to
<asm/amd/fch.h> (Mario Limonciello)
Fixes and cleanups:
- x86 assembly code cleanups and fixes (Uros Bizjak)
- Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy
Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav
Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David
Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf,
Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan
Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank
Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)"
* tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits)
x86/bugs: Fix spectre_v2 mitigation default on Intel
x86/bugs: Restructure ITS mitigation
x86/xen/msr: Fix uninitialized variable 'err'
x86/msr: Remove a superfluous inclusion of <asm/asm.h>
x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only
x86/mm/64: Make 5-level paging support unconditional
x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model
x86/mm/64: Always use dynamic memory layout
x86/bugs: Fix indentation due to ITS merge
x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor()
x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter
x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter
x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()
x86/cpuid: Rename have_cpuid_p() to cpuid_feature()
x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header
x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h>
x86/msr: Add rdmsrl_on_cpu() compatibility wrapper
x86/mm: Fix kernel-doc descriptions of various pgtable methods
x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too
x86/boot: Defer initialization of VM space related global variables
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core & fair scheduler changes:
- Tweak wait_task_inactive() to force dequeue sched_delayed tasks
(John Stultz)
- Adhere to place_entity() constraints (Peter Zijlstra)
- Allow decaying util_est when util_avg > CPU capacity (Pierre
Gondois)
- Fix up wake_up_sync() vs DELAYED_DEQUEUE (Xuewen Yan)
Energy management:
- Introduce sched_update_asym_prefer_cpu() (K Prateek Nayak)
- cpufreq/amd-pstate: Update asym_prefer_cpu when core rankings
change (K Prateek Nayak)
- Align uclamp and util_est and call before freq update (Xuewen Yan)
CPU isolation:
- Make use of more than one housekeeping CPU (Phil Auld)
RT scheduler:
- Fix race in push_rt_task() (Harshit Agarwal)
- Add kernel cmdline option for rt_group_sched (Michal Koutný)
Scheduler topology support:
- Improve topology_span_sane speed (Steve Wahl)
Scheduler debugging:
- Move and extend the sched_process_exit() tracepoint (Andrii
Nakryiko)
- Add RT_GROUP WARN checks for non-root task_groups (Michal Koutný)
- Fix trace_sched_switch(.prev_state) (Peter Zijlstra)
- Untangle cond_resched() and live-patching (Peter Zijlstra)
Fixes and cleanups:
- Misc fixes and cleanups (K Prateek Nayak, Michal Koutný, Peter
Zijlstra, Xuewen Yan)"
* tag 'sched-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
sched/uclamp: Align uclamp and util_est and call before freq update
sched/util_est: Simplify condition for util_est_{en,de}queue()
sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE
sched,livepatch: Untangle cond_resched() and live-patching
sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks
sched/fair: Adhere to place_entity() constraints
sched/debug: Print the local group's asym_prefer_cpu
cpufreq/amd-pstate: Update asym_prefer_cpu when core rankings change
sched/topology: Introduce sched_update_asym_prefer_cpu()
sched/fair: Use READ_ONCE() to read sg->asym_prefer_cpu
sched/isolation: Make use of more than one housekeeping cpu
sched/rt: Fix race in push_rt_task
sched: Add annotations to RT_GROUP_SCHED fields
sched: Add RT_GROUP WARN checks for non-root task_groups
sched: Do not construct nor expose RT_GROUP_SCHED structures if disabled
sched: Bypass bandwitdh checks with runtime disabled RT_GROUP_SCHED
sched: Skip non-root task_groups with disabled RT_GROUP_SCHED
sched: Add commadline option for RT_GROUP_SCHED toggling
sched: Always initialize rt_rq's task_group
sched: Remove unneeed macro wrap
...
|
|
Merge ACPI processor driver updates and ACPI CPPC library updates for
6.16-rc1:
- Clean up the initialization of CPU data structures in the ACPI
processor driver (Zhang Rui).
- Remove an obsolete comment regarding the C-states handling in the
ACPI processor driver (Giovanni Gherdovich).
- Simplify PCC shared memory region handling (Sudeep Holla).
- Rework and extend functions for reading CPPC register values and for
updating CPPC registers (Lifeng Zheng).
- Add three functions related to autonomous CPU performance state
selection to the CPPC library (Lifeng Zheng).
* acpi-processor:
ACPI: processor: idle: Remove redundant pr->power.count assignment
ACPI: processor: idle: Set pr->flags.power unconditionally
ACPI: processor: idle: Remove obsolete comment
* acpi-cppc:
ACPI: CPPC: Add three functions related to autonomous selection
ACPI: CPPC: Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel()
ACPI: CPPC: Refactor register value get and set ABIs
ACPI: CPPC: Add cppc_set_reg_val()
ACPI: CPPC: Extract cppc_get_reg_val_in_pcc()
ACPI: CPPC: Rename cppc_get_perf() to cppc_get_reg_val()
ACPI: CPPC: Optimize cppc_get_perf()
ACPI: CPPC: Add IS_OPTIONAL_CPC_REG macro to judge if a cpc_reg is optional
ACPI: CPPC: Simplify PCC shared memory region handling
ACPI: PCC: Simplify PCC shared memory region handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge ARM CPUFreq updates for 6.16 from Viresh Kumar:
"- Rust abstractions for CPUFreq framework (Viresh Kumar).
- Rust abstractions for OPP framework (Viresh Kumar).
- Basic Rust abstractions for Clk and Cpumask frameworks (Viresh Kumar).
- Minor cleanup to the SCMI cpufreq driver (Mike Tipton)."
* tag 'cpufreq-arm-updates-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: (24 commits)
cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs
cpufreq: Add Rust-based cpufreq-dt driver
rust: opp: Extend OPP abstractions with cpufreq support
rust: cpufreq: Extend abstractions for driver registration
rust: cpufreq: Extend abstractions for policy and driver ops
rust: cpufreq: Add initial abstractions for cpufreq framework
rust: opp: Add abstractions for the configuration options
rust: opp: Add abstractions for the OPP table
rust: opp: Add initial abstractions for OPP framework
rust: cpu: Add from_cpu()
rust: macros: enable use of hyphens in module names
rust: clk: Add initial abstractions
rust: clk: Add helpers for Rust code
MAINTAINERS: Add entry for Rust cpumask API
rust: cpumask: Add initial abstractions
rust: cpumask: Add few more helpers
rust: devres: require a bound device
rust: pci: move iomap_region() to impl Device<Bound>
rust: device: implement Bound device context
rust: pci: preserve device context in AsRef
...
|
|
Add sysfs interfaces for CPPC autonomous selection in the cppc_cpufreq
driver.
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250507031941.2812701-1-zhenglifeng1@huawei.com
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In store_scaling_setspeed(), sscanf is still used to read to sysfs.
Newer kstrtox provide more features including overflow protection,
better errorhandling and allows for other systems of numeration. It
is therefore better to update sscanf() to kstrtouint().
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070938.931396-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Setting the length of str_governor with a magic number could cause
overflow when max length increases, it is better to use the defined
macro in this case.
Signed-off-by: Bowen Yu <yubowen8@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20250519070908.930879-1-yubowen8@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently, all SCMI devices with performance domains attempt to register
a cpufreq driver, even if their performance domains aren't used to
control the CPUs. The cpufreq framework only supports registering a
single driver, so only the first device will succeed. And if that device
isn't used for the CPUs, then cpufreq will scale the wrong domains.
To avoid this, return early from scmi_cpufreq_probe() if the probing
SCMI device isn't referenced by the CPU device phandles.
This keeps the existing assumption that all CPUs are controlled by a
single SCMI device.
Signed-off-by: Mike Tipton <quic_mdtipton@quicinc.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
|
|
Introduce a Rust-based implementation of the cpufreq-dt driver, covering
most of the functionality provided by the existing C version. Some
features, such as retrieving platform data from `cpufreq-dt-platdev.c`,
are still pending.
The driver has been tested with QEMU, and frequency scaling works as
expected.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge an amd-pstate driver fix for 6.16 (5/15/25) from Mario
Liminciello:
"Fix an error caught with -Werror in amd-pstate-ut."
* tag 'amd-pstate-v6.16-2025-05-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver()
|
|
Lockdep reports a possible circular locking dependency [1] when
cpu_hotplug_lock is acquired inside store_local_boost(), after
policy->rwsem has already been taken by store().
However, the boost update is strictly per-policy and does not
access shared state or iterate over all policies.
Since policy->rwsem is already held, this is enough to serialize
against concurrent topology changes for the current policy.
Remove the cpus_read_lock() to resolve the lockdep warning and
avoid unnecessary locking.
[1]
======================================================
WARNING: possible circular locking dependency detected
6.15.0-rc6-debug-gb01fc4eca73c #1 Not tainted
------------------------------------------------------
power-profiles-/588 is trying to acquire lock:
ffffffffb3a7d910 (cpu_hotplug_lock){++++}-{0:0}, at: store_local_boost+0x56/0xd0
but task is already holding lock:
ffff8b6e5a12c380 (&policy->rwsem){++++}-{4:4}, at: store+0x37/0x90
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&policy->rwsem){++++}-{4:4}:
down_write+0x29/0xb0
cpufreq_online+0x7e8/0xa40
cpufreq_add_dev+0x82/0xa0
subsys_interface_register+0x148/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #1 (subsys mutex#3){+.+.}-{4:4}:
__mutex_lock+0xc2/0x930
subsys_interface_register+0x7f/0x160
cpufreq_register_driver+0x15d/0x260
amd_pstate_register_driver+0x36/0x90
amd_pstate_init+0x1e7/0x270
do_one_initcall+0x68/0x2b0
kernel_init_freeable+0x231/0x270
kernel_init+0x15/0x130
ret_from_fork+0x2c/0x50
ret_from_fork_asm+0x11/0x20
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
__lock_acquire+0x10ed/0x1850
lock_acquire.part.0+0x69/0x1b0
cpus_read_lock+0x2a/0xc0
store_local_boost+0x56/0xd0
store+0x50/0x90
kernfs_fop_write_iter+0x132/0x200
vfs_write+0x2b3/0x590
ksys_write+0x74/0xf0
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x56/0x5e
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://patch.msgid.link/20250513015726.1497-1-ImanDevel@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Clang warns (or errors with CONFIG_WERROR=y):
drivers/cpufreq/amd-pstate-ut.c:262:6: error: variable 'ret' is uninitialized when used here [-Werror,-Wuninitialized]
262 | if (ret)
| ^~~
ret is declared at the top of the function and in the for loop so the
initialization of ret is local to the loop. Remove the declaration in
the for loop so that ret is always used initialized.
Fixes: d26d16438bc5 ("amd-pstate-ut: Reset amd-pstate driver mode after running selftests")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505072226.g2QclhaR-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250512-amd-pstate-ut-uninit-ret-v1-1-fcb4104f502e@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
|
|
On some hybrid platforms some efficient CPUs (E-cores) are not connected
to the L3 cache, but there are no other differences between them and the
other E-cores that use L3. In that case, it is generally more efficient
to run "light" workloads on the E-cores that do not use L3 and allow all
of the cores using L3, including P-cores, to go into idle states.
For this reason, slightly increase the cost for all CPUs sharing the L3
cache to make EAS prefer CPUs that do not use it to the other CPUs of
the same type (if any).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2032776.usQuhbGJ8B@rjwysocki.net
|