summaryrefslogtreecommitdiff
path: root/drivers/cpufreq
AgeCommit message (Collapse)Author
2021-04-09cpufreq: armada-37xx: Remove cur_frequency variablePali Rohár
Variable cur_frequency in armada37xx_cpufreq_driver_init() is unused. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix determining base CPU frequencyPali Rohár
When current CPU load is not L0 then loading armada-37xx-cpufreq.ko driver fails with following error: # modprobe armada-37xx-cpufreq [ 502.702097] Unsupported CPU frequency 250 MHz This issue was partially fixed by commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp"), but only for calculating CPU frequency for opp. Fix this also for determination of base CPU frequency. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix driver cleanup when registration failedPali Rohár
Commit 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") changed calculation of frequency passed to the dev_pm_opp_add() function call. But the code for dev_pm_opp_remove() function call was not updated, so the driver cleanup phase does not work when registration fails. This fixes the issue by using the same frequency in both calls. Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 8db82563451f ("cpufreq: armada-37xx: fix frequency calculation for opp") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix the AVS value for load L1Pali Rohár
The original CPU voltage value for load L1 is too low for Armada 37xx SoC when base CPU frequency is 1000 or 1200 MHz. It leads to instabilities where CPU gets stuck soon after dynamic voltage scaling from load L1 to L0. Update the CPU voltage value for load L1 accordingly when base frequency is 1000 or 1200 MHz. The minimal L1 value for base CPU frequency 1000 MHz is updated from the original 1.05V to 1.108V and for 1200 MHz is updated to 1.155V. This minimal L1 value is used only in the case when it is lower than value for L0. This change fixes CPU instability issues on 1 GHz and 1.2 GHz variants of Espressobin and 1 GHz Turris Mox. Marvell previously for 1 GHz variant of Espressobin provided a patch [1] suitable only for their Marvell Linux kernel 4.4 fork which workarounded this issue. Patch forced CPU voltage value to 1.108V in all loads. But such change does not fix CPU instability issues on 1.2 GHz variants of Armada 3720 SoC. During testing we come to the conclusion that using 1.108V as minimal value for L1 load makes 1 GHz variants of Espressobin and Turris Mox boards stable. And similarly 1.155V for 1.2 GHz variant of Espressobin. These two values 1.108V and 1.155V are documented in Armada 3700 Hardware Specifications as typical initial CPU voltage values. Discussion about this issue is also at the Armbian forum [2]. [1] - https://github.com/MarvellEmbeddedProcessors/linux-marvell/commit/dc33b62c90696afb6adc7dbcc4ebbd48bedec269 [2] - https://forum.armbian.com/topic/10429-how-to-make-espressobin-v7-stable/ Signed-off-by: Pali Rohár <pali@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 1c3528232f4b ("cpufreq: armada-37xx: Add AVS support") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-09cpufreq: armada-37xx: Fix setting TBG parent for load levelsMarek Behún
With CPU frequency determining software [1] we have discovered that after this driver does one CPU frequency change, the base frequency of the CPU is set to the frequency of TBG-A-P clock, instead of the TBG that is parent to the CPU. This can be reproduced on EspressoBIN and Turris MOX: cd /sys/devices/system/cpu/cpufreq/policy0 echo powersave >scaling_governor echo performance >scaling_governor Running the mhz tool before this driver is loaded reports 1000 MHz, and after loading the driver and executing commands above the tool reports 800 MHz. The change of TBG clock selector is supposed to happen in function armada37xx_cpufreq_dvfs_setup. Before the function returns, it does this: parent = clk_get_parent(clk); clk_set_parent(clk, parent); The armada-37xx-periph clock driver has the .set_parent method implemented correctly for this, so if the method was actually called, this would work. But since the introduction of the common clock framework in commit b2476490ef11 ("clk: introduce the common clock..."), the clk_set_parent function checks whether the parent is actually changing, and if the requested new parent is same as the old parent (which is obviously the case for the code above), the .set_parent method is not called at all. This patch fixes this issue by filling the correct TBG clock selector directly in the armada37xx_cpufreq_dvfs_setup during the filling of other registers at the same address. But the determination of CPU TBG index cannot be done via the common clock framework, therefore we need to access the North Bridge Peripheral Clock registers directly in this driver. [1] https://github.com/wtarreau/mhz Signed-off-by: Marek Behún <kabel@kernel.org> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Tested-by: Pali Rohár <pali@kernel.org> Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Tested-by: Anders Trier Olesen <anders.trier.olesen@gmail.com> Tested-by: Philip Soares <philips@netisense.com> Fixes: 92ce45fb875d ("cpufreq: Add DVFS support for Armada 37xx") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-04-08cpufreq: Remove unused for_each_policy macroShaokun Zhang
Macro 'for_each_policy' has become unused since commit f963735a3ca3 ("cpufreq: Create for_each_{in}active_policy()"), so remove it. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-29cpufreq: scmi: Port driver to the new scmi_perf_proto_ops interfaceCristian Marussi
Port driver to the new SCMI perf interface based on protocol handles and common devm_get_ops(). Link: https://lore.kernel.org/r/20210316124903.35011-13-cristian.marussi@arm.com Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-26cpufreq: Fix scaling_{available,boost}_frequencies_show() commentsGeert Uytterhoeven
The function names in the comment blocks for the functions scaling_available_frequencies_show() and scaling_boost_frequencies_show() do not match the actual names. Fixes: 6f19efc0a1ca08bc ("cpufreq: Add boost frequency support in core") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-25cpufreq: dt: dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFERQuanyang Wang
The function dev_pm_opp_of_cpumask_add_table() may return -EPROBE_DEFER, which needs to be propagated to the caller to try probing the driver later on. Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> [ Viresh: Massage changelog/subject, improve code. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-23cpufreq: intel_pstate: Clean up frequency computationsRafael J. Wysocki
Notice that some computations related to frequency in intel_pstate can be simplified if (a) intel_pstate_get_hwp_max() updates the relevant members of struct cpudata by itself and (b) the "turbo disabled" check is moved from it to its callers, so modify the code accordingly and while at it rename intel_pstate_get_hwp_max() to intel_pstate_get_hwp_cap() which better reflects its purpose and provide a simplified variat of it, __intel_pstate_get_hwp_cap(), suitable for the initialization path. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-03-22cpufreq: cppc: simplify default delay_us settingTom Saeger
Simplify case when setting default in cppc_cpufreq_get_transition_delay_us. Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-22cpufreq: Rudimentary typos fix in the file s5pv210-cpufreq.cBhaskar Chowdhury
Trivial spelling fixes throughout the file. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Reviewed-by: Tom Saeger <tom.saeger@oracle.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> [ Viresh: Capitalize two words. ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-22cpufreq: CPPC: Add support for frequency invarianceViresh Kumar
The Frequency Invariance Engine (FIE) is providing a frequency scaling correction factor that helps achieve more accurate load-tracking. Normally, this scaling factor can be obtained directly with the help of the cpufreq drivers as they know the exact frequency the hardware is running at. But that isn't the case for CPPC cpufreq driver. Another way of obtaining that is using the arch specific counter support, which is already present in kernel, but that hardware is optional for platforms. This patch updates the CPPC driver to register itself with the topology core to provide its own implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which gets called by the scheduler on every tick. Note that the arch specific counters have higher priority than CPPC counters, if available, though the CPPC driver doesn't need to have any special handling for that. On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we reach here from hard-irq context), which then schedules a normal work item and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable based on the counter updates since the last tick. To allow platforms to disable this CPPC counter-based frequency invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE, which is enabled by default. This also exports sched_setattr_nocheck() as the CPPC driver can be built as a module. Cc: linux-acpi@vger.kernel.org Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-19ia64: fix format string for ia64-acpi-cpu-freqSergei Trofimovich
Fix warning with %lx / s64 mismatch: CC [M] drivers/cpufreq/ia64-acpi-cpufreq.o drivers/cpufreq/ia64-acpi-cpufreq.c: In function 'processor_get_pstate': warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 's64' {aka 'long long int'} [-Wformat=] Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-15scmi-cpufreq: Get opp_shared_cpus from opp-v2 for EMNicola Mazzucato
By design, SCMI performance domains define the granularity of performance controls, they do not describe any underlying hardware dependencies (although they may match in many cases). It is therefore possible to have some platforms where hardware may have the ability to control CPU performance at different granularity and choose to describe fine-grained performance control through SCMI. In such situations, the energy model would be provided with inaccurate information based on controls, while it still needs to know the performance boundaries. To restore correct functionality, retrieve information of CPUs under the same performance domain from operating-points-v2 in DT, and pass it on to EM. Link: https://lore.kernel.org/r/20210218222326.15788-3-nicola.mazzucato@arm.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Nicola Mazzucato <nicola.mazzucato@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-15scmi-cpufreq: Remove deferred probeNicola Mazzucato
The current implementation of the scmi_cpufreq_init() function returns -EPROBE_DEFER when the OPP table is not populated. In practice the cpufreq core cannot handle this error code. Therefore, fix the return value and clarify the error message. Link: https://lore.kernel.org/r/20210218222326.15788-2-nicola.mazzucato@arm.com Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Nicola Mazzucato <nicola.mazzucato@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-03-08cpufreq: blacklist Arm Vexpress platforms in cpufreq-dt-platdevSudeep Holla
Add "arm,vexpress" to cpufreq-dt-platdev blacklist since the actual scaling is handled by the firmware cpufreq drivers(scpi, scmi and vexpress-spc). Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-08cpufreq: qcom-hw: Fix return value check in qcom_cpufreq_hw_cpu_init()Wei Yongjun
In case of error, the function ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-03-08cpufreq: qcom-hw: fix dereferencing freed memory 'data'Shawn Guo
Commit 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") introduces an issue of dereferencing freed memory 'data'. Fix it. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-02-24Merge tag 'sfi-removal-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull Simple Firmware Interface (SFI) support removal from Rafael Wysocki: "Drop support for depercated platforms using SFI, drop the entire support for SFI that has been long deprecated too and make some janitorial changes on top of that (Andy Shevchenko)" * tag 'sfi-removal-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/platform/intel-mid: Update Copyright year and drop file names x86/platform/intel-mid: Remove unused header inclusion in intel-mid.h x86/platform/intel-mid: Drop unused __intel_mid_cpu_chip and Co. x86/platform/intel-mid: Get rid of intel_scu_ipc_legacy.h x86/PCI: Describe @reg for type1_access_ok() x86/PCI: Get rid of custom x86 model comparison sfi: Remove framework for deprecated firmware cpufreq: sfi-cpufreq: Remove driver for deprecated firmware media: atomisp: Remove unused header mfd: intel_msic: Remove driver for deprecated platform x86/apb_timer: Remove driver for deprecated platform x86/platform/intel-mid: Remove unused leftovers (vRTC) x86/platform/intel-mid: Remove unused leftovers (msic) x86/platform/intel-mid: Remove unused leftovers (msic_thermal) x86/platform/intel-mid: Remove unused leftovers (msic_power_btn) x86/platform/intel-mid: Remove unused leftovers (msic_gpio) x86/platform/intel-mid: Remove unused leftovers (msic_battery) x86/platform/intel-mid: Remove unused leftovers (msic_ocd) x86/platform/intel-mid: Remove unused leftovers (msic_audio) platform/x86: intel_scu_wdt: Drop mistakenly added const
2021-02-23Merge branches 'pm-cpufreq' and 'pm-opp'Rafael J. Wysocki
* pm-cpufreq: cpufreq: Fix typo in kerneldoc comment cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit() cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks * pm-opp: opp: Don't skip freq update for different frequency
2021-02-19cpufreq: Fix typo in kerneldoc commentYue Hu
Change 'Terget' to 'Target'. Should be Target. Signed-off-by: Yue Hu <huyue2@yulong.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-18Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq fix for 5.12 from Viresh Kumar: "Single patch to fix issue with cpu hotplug and policy recreation for qcom-cpufreq-hw driver." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
2021-02-18cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is knownRafael J. Wysocki
Commit 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") attempted to address a performance issue involving acpi-cpufreq, the schedutil governor and scale-invariance on x86 by extending the frequency tables created by acpi-cpufreq to cover the entire range of "turbo" (or "boost") frequencies, but that caused frequencies reported via /proc/cpuinfo and the scaling_cur_freq attribute in sysfs to change which may confuse users and monitoring tools. For this reason, revert the part of commit 3c55e94c0ade adding the extra entry to the frequency table and use the observation that in principle cpuinfo.max_freq need not be equal to the maximum frequency listed in the frequency table for the given policy. Namely, modify cpufreq_frequency_table_cpuinfo() to allow cpufreq drivers to set their own cpuinfo.max_freq above that frequency and change acpi-cpufreq to set cpuinfo.max_freq to the maximum boost frequency found via CPPC. This should be sufficient to let all of the cpufreq subsystem know the real maximum frequency of the CPU without changing frequency reporting. Link: https://bugzilla.kernel.org/show_bug.cgi?id=211305 Fixes: 3c55e94c0ade ("cpufreq: ACPI: Extend frequency tables to cover boost frequencies") Reported-by: Matt McDonald <gardotd426@gmail.com> Tested-by: Matt McDonald <gardotd426@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Tested-by: Michael Larabel <Michael@phoronix.com> Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
2021-02-18cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooksShawn Guo
Commit f17b3e44320b ("cpufreq: qcom-hw: Use devm_platform_ioremap_resource() to simplify code") introduces a regression on platforms using the driver, by failing to initialise a policy, when one is created post hotplug. When all the CPUs of a policy are hoptplugged out, the call to .exit() and later to devm_iounmap() does not release the memory region that was requested during devm_platform_ioremap_resource(). Therefore, a subsequent call to .init() will result in the following error, which will prevent a new policy to be initialised: [ 3395.915416] CPU4: shutdown [ 3395.938185] psci: CPU4 killed (polled 0 ms) [ 3399.071424] CPU5: shutdown [ 3399.094316] psci: CPU5 killed (polled 0 ms) [ 3402.139358] CPU6: shutdown [ 3402.161705] psci: CPU6 killed (polled 0 ms) [ 3404.742939] CPU7: shutdown [ 3404.765592] psci: CPU7 killed (polled 0 ms) [ 3411.492274] Detected VIPT I-cache on CPU4 [ 3411.492337] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000 [ 3411.492448] CPU4: Booted secondary processor 0x0000000400 [0x516f802d] [ 3411.503654] qcom-cpufreq-hw 17d43000.cpufreq: can't request region for resource [mem 0x17d45800-0x17d46bff] With that being said, the original code was tricky and skipping memory region request intentionally to hide this issue. The true cause is that those devm_xxx() device managed functions shouldn't be used for cpufreq init/exit hooks, because &pdev->dev is alive across the hooks and will not trigger auto resource free-up. Let's drop the use of device managed functions and manually allocate/free resources, so that the issue can be fixed properly. Cc: v5.10+ <stable@vger.kernel.org> # v5.10+ Fixes: f17b3e44320b ("cpufreq: qcom-hw: Use devm_platform_ioremap_resource() to simplify code") Suggested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-02-15cpufreq: sfi-cpufreq: Remove driver for deprecated firmwareAndy Shevchenko
SFI-based platforms are gone. So does this driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-15Merge branch 'pm-opp' into pmRafael J. Wysocki
* pm-opp: (37 commits) PM / devfreq: Add required OPPs support to passive governor PM / devfreq: Cache OPP table reference in devfreq OPP: Add function to look up required OPP's for a given OPP opp: Replace ENOTSUPP with EOPNOTSUPP opp: Fix "foo * bar" should be "foo *bar" opp: Don't ignore clk_get() errors other than -ENOENT opp: Update bandwidth requirements based on scaling up/down opp: Allow lazy-linking of required-opps opp: Remove dev_pm_opp_set_bw() devfreq: tegra30: Migrate to dev_pm_opp_set_opp() drm: msm: Migrate to dev_pm_opp_set_opp() cpufreq: qcom: Migrate to dev_pm_opp_set_opp() opp: Implement dev_pm_opp_set_opp() opp: Update parameters of _set_opp_custom() opp: Allow _generic_set_opp_clk_only() to work for non-freq devices opp: Allow _generic_set_opp_regulator() to work for non-freq devices opp: Allow _set_opp() to work for non-freq devices opp: Split _set_opp() out of dev_pm_opp_set_rate() opp: Keep track of currently programmed OPP opp: No need to check clk for errors ...
2021-02-10Merge back cpufreq updates for v5.12.Rafael J. Wysocki
2021-02-08Merge branch 'cpufreq/arm/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq changes for v5.12 from Viresh Kumar: "- Removal of Tango driver as the platform got removed (Arnd Bergmann). - Use resource managed APIs for tegra20 (Dmitry Osipenko). - Generic cleanups for brcmstb (Christophe JAILLET). - Enable boost support for qcom-hw (Shawn Guo)." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: remove tango driver cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove() cpufreq: brcmstb-avs-cpufreq: Free resources in error path cpufreq: qcom-hw: enable boost support cpufreq: tegra20: Use resource-managed API
2021-02-08cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not thereRafael J. Wysocki
If the maximum performance level taken for computing the arch_max_freq_ratio value used in the x86 scale-invariance code is higher than the one corresponding to the cpuinfo.max_freq value coming from the acpi_cpufreq driver, the scale-invariant utilization falls below 100% even if the CPU runs at cpuinfo.max_freq or slightly faster, which causes the schedutil governor to select a frequency below cpuinfo.max_freq. That frequency corresponds to a frequency table entry below the maximum performance level necessary to get to the "boost" range of CPU frequencies which prevents "boost" frequencies from being used in some workloads. While this issue is related to scale-invariance, it may be amplified by commit db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") from the 5.10 development cycle which made it extremely easy to default to schedutil even if the preferred driver is acpi_cpufreq as long as intel_pstate is built too, because the mere presence of the latter effectively removes the ondemand governor from the defaults. Distro kernels are likely to include both intel_pstate and acpi_cpufreq on x86, so their users who cannot use intel_pstate or choose to use acpi_cpufreq may easily be affectecd by this issue. If CPPC is available, it can be used to address this issue by extending the frequency tables created by acpi_cpufreq to cover the entire available frequency range (including "boost" frequencies) for each CPU, but if CPPC is not there, acpi_cpufreq has no idea what the maximum "boost" frequency is and the frequency tables created by it cannot be extended in a meaningful way, so in that case make it ask the arch scale-invariance code to to use the "nominal" performance level for CPU utilization scaling in order to avoid the issue at hand. Fixes: db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2021-02-08cpufreq: ACPI: Extend frequency tables to cover boost frequenciesRafael J. Wysocki
A severe performance regression on AMD EPYC processors when using the schedutil scaling governor was discovered by Phoronix.com and attributed to the following commits: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") 976df7e5730e ("x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC") The source of the problem is that the maximum performance level taken for computing the arch_max_freq_ratio value used in the x86 scale- invariance code is higher than the one corresponding to the cpuinfo.max_freq value coming from the acpi_cpufreq driver. This effectively causes the scale-invariant utilization to fall below 100% even if the CPU runs at cpuinfo.max_freq or slightly faster, so the schedutil governor selects a frequency below cpuinfo.max_freq then. That frequency corresponds to a frequency table entry below the maximum performance level necessary to get to the "boost" range of CPU frequencies. However, if the cpuinfo.max_freq value coming from acpi_cpufreq was higher, the schedutil governor would select higher frequencies which in turn would allow acpi_cpufreq to set more adequate performance levels and to get to the "boost" range of CPU frequencies more often. This issue affects any systems where acpi_cpufreq is used and the "boost" (or "turbo") frequencies are enabled, not just AMD EPYC. Moreover, commit db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") from the 5.10 development cycle made it extremely easy to default to schedutil even if the preferred driver is acpi_cpufreq as long as intel_pstate is built too, because the mere presence of the latter effectively removes the ondemand governor from the defaults. Distro kernels are likely to include both intel_pstate and acpi_cpufreq on x86, so their users who cannot use intel_pstate or choose to use acpi_cpufreq may easily be affectecd by this issue. To address this issue, extend the frequency table constructed by acpi_cpufreq for each CPU to cover the entire range of available frequencies (including the "boost" ones) if CPPC is available and indicates that "boost" (or "turbo") frequencies are enabled. That causes cpuinfo.max_freq to become the maximum "boost" frequency of the given CPU (instead of the maximum frequency returned by the ACPI _PSS object that corresponds to the "nominal" performance level). Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") Fixes: 976df7e5730e ("x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC") Fixes: db865272d9c4 ("cpufreq: Avoid configuring old governors as default with intel_pstate") Link: https://www.phoronix.com/scan.php?page=article&item=linux511-amd-schedutil&num=1 Link: https://lore.kernel.org/linux-pm/20210203135321.12253-2-ggherdovich@suse.cz/ Reported-by: Michael Larabel <Michael@phoronix.com> Diagnosed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Reviewed-by: Giovanni Gherdovich <ggherdovich@suse.cz> Tested-by: Michael Larabel <Michael@phoronix.com>
2021-02-04cpufreq: Remove unused flag CPUFREQ_PM_NO_WARNViresh Kumar
This flag is set by one of the drivers but it isn't used in the code otherwise. Remove the unused flag and update the driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-04cpufreq: Remove CPUFREQ_STICKY flagViresh Kumar
During cpufreq driver's registration, if the ->init() callback for all the CPUs fail then there is not much point in keeping the driver around as it will only account for more of unnecessary noise, for example cpufreq core will try to suspend/resume the driver which never got registered properly. The removal of such a driver is avoided if the driver carries the CPUFREQ_STICKY flag. This was added way back [1] in 2004 and perhaps no one should ever need it now. A lot of drivers do set this flag, probably because they just copied it from other drivers. This was added earlier for some platforms [2] because their cpufreq drivers were getting registered before the CPUs were registered with subsys framework. And hence they used to fail. The same isn't true anymore though. The current code flow in the kernel is: start_kernel() -> kernel_init() -> kernel_init_freeable() -> do_basic_setup() -> driver_init() -> cpu_dev_init() -> subsys_system_register() //For CPUs -> do_initcalls() -> cpufreq_register_driver() Clearly, the CPUs will always get registered with subsys framework before any cpufreq driver can get probed. Remove the flag and update the relevant drivers. Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/include/linux/cpufreq.h?id=7cc9f0d9a1ab04cedc60d64fd8dcf7df224a3b4d # [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/arch/arm/mach-sa1100/cpu-sa1100.c?id=f59d3bbe35f6268d729f51be82af8325d62f20f5 # [2] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-02cpufreq: qcom: Migrate to dev_pm_opp_set_opp()Viresh Kumar
dev_pm_opp_set_bw() is getting removed and dev_pm_opp_set_opp() should be used instead. Migrate to the new API. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Dmitry Osipenko <digetx@gmail.com>
2021-01-22cpufreq: intel_pstate: Remove repeated wordNigel Christian
In the comment for trace in passive mode there is an unnecessary "the". Eradicate it. Signed-off-by: Nigel Christian <nigel.l.christian@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-21cpufreq: remove tango driverArnd Bergmann
The tango platform is getting removed, so the driver is no longer needed. Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> Cc: Mans Rullgard <mans@mansr.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> [ Viresh: Update cpufreq-dt-platdev.c as well ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-01-18cpufreq: brcmstb-avs-cpufreq: Fix resource leaks in ->remove()Christophe JAILLET
If 'cpufreq_unregister_driver()' fails, just WARN and continue, so that other resources are freed. Fixes: de322e085995 ("cpufreq: brcmstb-avs-cpufreq: AVS CPUfreq driver for Broadcom STB SoCs") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ Viresh: Updated Subject ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-01-18cpufreq: brcmstb-avs-cpufreq: Free resources in error pathChristophe JAILLET
If 'cpufreq_register_driver()' fails, we must release the resources allocated in 'brcm_avs_prepare_init()' as already done in the remove function. To do that, introduce a new function 'brcm_avs_prepare_uninit()' in order to avoid code duplication. This also makes the code more readable (IMHO). Fixes: de322e085995 ("cpufreq: brcmstb-avs-cpufreq: AVS CPUfreq driver for Broadcom STB SoCs") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ Viresh: Updated Subject ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-01-18cpufreq: qcom-hw: enable boost supportShawn Guo
At least on sdm850, the 2956800 khz is detected as a boost frequency in function qcom_cpufreq_hw_read_lut(). Let's enable boost support by calling cpufreq_enable_boost_support(), so that we can get the boost frequency by switching it on via 'boost' sysfs entry like below. $ echo 1 > /sys/devices/system/cpu/cpufreq/boost Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Steev Klimaszewski <steev@kali.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-01-18cpufreq: tegra20: Use resource-managed APIDmitry Osipenko
Switch cpufreq-tegra20 driver to use resource-managed API. This removes the need to get opp_table pointer using dev_pm_opp_get_opp_table() in order to release OPP table that was requested by dev_pm_opp_set_supported_hw(), making the code a bit more straightforward. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2021-01-12cpufreq: intel_pstate: Get per-CPU max freq via MSR_HWP_CAPABILITIES if ↵Chen Yu
available Currently, when turbo is disabled (either by BIOS or by the user), the intel_pstate driver reads the max non-turbo frequency from the package-wide MSR_PLATFORM_INFO(0xce) register. However, on asymmetric platforms it is possible in theory that small and big core with HWP enabled might have different max non-turbo CPU frequency, because MSR_HWP_CAPABILITIES is per-CPU scope according to Intel Software Developer Manual. The turbo max freq is already per-CPU in current code, so make similar change to the max non-turbo frequency as well. Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> [ rjw: Subject and changelog edits ] Cc: 4.18+ <stable@vger.kernel.org> # 4.18+: a45ee4d4e13b: cpufreq: intel_pstate: Change intel_pstate_get_hwp_max() argument Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-12cpufreq: intel_pstate: Rename two functionsRafael J. Wysocki
Rename intel_cpufreq_adjust_hwp() and intel_cpufreq_adjust_perf_ctl() to intel_cpufreq_hwp_update() and intel_cpufreq_perf_ctl_update(), respectively, to avoid possible confusion with the ->adjist_perf() callback function, intel_cpufreq_adjust_perf(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-01-12cpufreq: intel_pstate: Change intel_pstate_get_hwp_max() argumentRafael J. Wysocki
All of the callers of intel_pstate_get_hwp_max() access the struct cpudata object that corresponds to the given CPU already and the function itself needs to access that object (in order to update hwp_cap_cached), so modify the code to pass a struct cpudata pointer to it instead of the CPU number. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-01-12cpufreq: intel_pstate: Always read hwp_cap_cached with READ_ONCE()Rafael J. Wysocki
Because intel_pstate_get_hwp_max() which updates hwp_cap_cached may run in parallel with the readers of it, annotate all of the read accesses to it with READ_ONCE(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com>
2021-01-07cpufreq: intel_pstate: remove obsolete functionsLukas Bulwahn
percent_fp() was used in intel_pstate_pid_reset(), which was removed in commit 9d0ef7af1f2d ("cpufreq: intel_pstate: Do not use PID-based P-state selection") and hence, percent_fp() is unused since then. percent_ext_fp() was last used in intel_pstate_update_perf_limits(), which was refactored in commit 1a4fe38add8b ("cpufreq: intel_pstate: Remove max/min fractions to limit performance"), and hence, percent_ext_fp() is unused since then. make CC=clang W=1 points us those unused functions: drivers/cpufreq/intel_pstate.c:79:23: warning: unused function 'percent_fp' [-Wunused-function] static inline int32_t percent_fp(int percent) ^ drivers/cpufreq/intel_pstate.c:94:23: warning: unused function 'percent_ext_fp' [-Wunused-function] static inline int32_t percent_ext_fp(int percent) ^ Remove those obsolete functions. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-07cpufreq: powernow-k8: pass policy rather than use cpufreq_cpu_get()Colin Ian King
Currently there is an unlikely case where cpufreq_cpu_get() returns a NULL policy and this will cause a NULL pointer dereference later on. Fix this by passing the policy to transition_frequency_fidvid() from the caller and hence eliminating the need for the cpufreq_cpu_get() and cpufreq_cpu_put(). Thanks to Viresh Kumar for suggesting the fix. Addresses-Coverity: ("Dereference null return") Fixes: b43a7ffbf33b ("cpufreq: Notify all policy->cpus in cpufreq_notify_transition()") Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-07cpufreq: intel_pstate: Use HWP capabilities in intel_cpufreq_adjust_perf()Rafael J. Wysocki
If turbo P-states cannot be used, either due to the configuration of the processor, or because intel_pstate is not allowed to used them, the maximum available P-state with HWP enabled corresponds to the HWP_CAP.GUARANTEED value which is not static. It can be adjusted by an out-of-band agent or during an Intel Speed Select performance level change, so long as it remains less than or equal to HWP_CAP.MAX. However, if turbo P-states cannot be used, intel_cpufreq_adjust_perf() always uses pstate.max_pstate (set during the initialization of the driver only) as the maximum available P-state, so it may miss a change of the HWP_CAP.GUARANTEED value. Prevent that from happening by modifyig intel_cpufreq_adjust_perf() to always read the "guaranteed" and "maximum turbo" performance levels from the cached HWP_CAP value. Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2020-12-30cpufreq: intel_pstate: Fix fast-switch fallback pathRafael J. Wysocki
When sugov_update_single_perf() falls back to the "frequency" path due to the missing scale-invariance, it will call cpufreq_driver_fast_switch() via sugov_fast_switch() and the driver's ->fast_switch() callback will be invoked, so it must not be NULL. However, after commit a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") intel_pstate sets ->fast_switch() to NULL when it is going to use intel_cpufreq_adjust_perf(), which is a mistake, because on x86 the scale-invariance may be turned off dynamically, so modify it to retain the original ->adjust_perf() callback pointer. Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") Reported-by: Kenneth R. Crudup <kenny@panix.com> Tested-by: Kenneth R. Crudup <kenny@panix.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-22Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: cpufreq: intel_pstate: Use most recent guaranteed performance values cpufreq: intel_pstate: Implement the ->adjust_perf() callback cpufreq: Add special-purpose fast-switching callback for drivers cpufreq: schedutil: Add util to struct sg_cpu cppc_cpufreq: replace per-cpu data array with a list cppc_cpufreq: expose information on frequency domains cppc_cpufreq: clarify support for coordination types cppc_cpufreq: use policy->cpu as driver of frequency setting ACPI: processor: fix NONE coordination for domain mapping failure ACPI: processor: Drop duplicate setting of shared_cpu_map
2020-12-21cpufreq: intel_pstate: Use most recent guaranteed performance valuesRafael J. Wysocki
When turbo has been disabled by the BIOS, but HWP_CAP.GUARANTEED is changed later, user space may want to take advantage of this increased guaranteed performance. HWP_CAP.GUARANTEED is not a static value. It can be adjusted by an out-of-band agent or during an Intel Speed Select performance level change. The HWP_CAP.MAX is still the maximum achievable performance with turbo disabled by the BIOS, so HWP_CAP.GUARANTEED can still change as long as it remains less than or equal to HWP_CAP.MAX. When HWP_CAP.GUARANTEED is changed, the sysfs base_frequency attribute shows the most recent guaranteed frequency value. This attribute can be used by user space software to update the scaling min/max limits of the CPU. Currently, the ->setpolicy() callback already uses the latest HWP_CAP values when setting HWP_REQ, but the ->verify() callback will restrict the user settings to the to old guaranteed performance value which prevents user space from making use of the extra CPU capacity theoretically available to it after increasing HWP_CAP.GUARANTEED. To address this, read HWP_CAP in intel_pstate_verify_cpu_policy() to obtain the maximum P-state that can be used and use that to confine the policy max limit instead of using the cached and possibly stale pstate.max_freq value for this purpose. For consistency, update intel_pstate_update_perf_limits() to use the maximum available P-state returned by intel_pstate_get_hwp_max() to compute the maximum frequency instead of using the return value of intel_pstate_get_max_freq() which, again, may be stale. This issue is a side-effect of fixing the scaling frequency limits in commit eacc9c5a927e ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled") which corrected the setting of the reduced scaling frequency values, but caused stale HWP_CAP.GUARANTEED to be used in the case at hand. Fixes: eacc9c5a927e ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled") Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 5.8+ <stable@vger.kernel.org> # 5.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>