Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth, CAN and Netfilter.
Current release - regressions:
- two fixes for the netdev per-instance locking
- batman-adv: fix double-hold of meshif when getting enabled
Current release - new code bugs:
- Bluetooth: increment TX timestamping tskey always for stream
sockets
- wifi: static analysis and build fixes for the new Intel sub-driver
Previous releases - regressions:
- net: fib_rules: fix iif / oif matching on L3 master (VRF) device
- ipv6: add exception routes to GC list in rt6_insert_exception()
- netfilter: conntrack: fix erroneous removal of offload bit
- Bluetooth:
- fix sending MGMT_EV_DEVICE_FOUND for invalid address
- l2cap: process valid commands in too long frame
- btnxpuart: Revert baudrate change in nxp_shutdown
Previous releases - always broken:
- ethtool: fix memory corruption during SFP FW flashing
- eth:
- hibmcge: fixes for link and MTU handling, pause frames etc
- igc: fixes for PTM (PCIe timestamping)
- dsa: b53: enable BPDU reception for management port
Misc:
- fixes for Netlink protocol schemas"
* tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settings
net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for 100Mbps
net: ethernet: mtk_eth_soc: reapply mdc divider on reset
net: ti: icss-iep: Fix possible NULL pointer dereference for perout request
net: ti: icssg-prueth: Fix possible NULL pointer dereference inside emac_xmit_xdp_frame()
net: ti: icssg-prueth: Fix kernel warning while bringing down network interface
netfilter: conntrack: fix erronous removal of offload bit
net: don't try to ops lock uninitialized devs
ptp: ocp: fix start time alignment in ptp_ocp_signal_set
net: dsa: avoid refcount warnings when ds->ops->tag_8021q_vlan_del() fails
net: dsa: free routing table on probe failure
net: dsa: clean up FDB, MDB, VLAN entries on unbind
net: dsa: mv88e6xxx: fix -ENOENT when deleting VLANs and MST is unsupported
net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered
net: txgbe: fix memory leak in txgbe_probe() error path
net: bridge: switchdev: do not notify new brentries as changed
net: b53: enable BPDU reception for management port
netlink: specs: rt-neigh: prefix struct nfmsg members with ndm
netlink: specs: rt-link: adjust mctp attribute naming
netlink: specs: rtnetlink: attribute naming corrections
...
|
|
Subvolume roots and the dirents that point to them are special; they
don't obey the normal snapshot versioning rules because they cross
snapshot boundaries.
We don't keep around older versions of subvolume dirents on rename - we
don't need to, because subvolume dirents are only visible in the parent
subvolume, and we wouldn't be able to match up the different dirent and
inode versions due to crossing the snapshot ID boundary.
That means that when we rename a subvolume, that's been snapshotted, the
older version of the subvolume root will become dangling - it won't have
a dirent that points to it.
That's expected, we just need to tell fsck that this is ok.
Fixes: https://github.com/koverstreet/bcachefs/issues/856
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
kbuf imports have the front offset adjusted and segments removed, but
the tail segments are still included in the segment count that gets
passed in the iov_iter. As the segments aren't necessarily all the
same size, move importing to a separate helper and iterate the
mapped length to get an exact count.
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Android has mounted the v1 cpuset controller using filesystem type
"cpuset" (not "cgroup") since 2015 [1], and depends on the resulting
behavior where the controller name is not added as a prefix for cgroupfs
files. [2]
Later, a problem was discovered where cpu hotplug onlining did not
affect the cpuset/cpus files, which Android carried an out-of-tree patch
to address for a while. An attempt was made to upstream this patch, but
the recommendation was to use the "cpuset_v2_mode" mount option
instead. [3]
An effort was made to do so, but this fails with "cgroup: Unknown
parameter 'cpuset_v2_mode'" because commit e1cba4b85daa ("cgroup: Add
mount flag to enable cpuset to use v2 behavior in v1 cgroup") did not
update the special cased cpuset_mount(), and only the cgroup (v1)
filesystem type was updated.
Add parameter parsing to the cpuset filesystem type so that
cpuset_v2_mode works like the cgroup filesystem type:
$ mkdir /dev/cpuset
$ mount -t cpuset -ocpuset_v2_mode none /dev/cpuset
$ mount|grep cpuset
none on /dev/cpuset type cgroup (rw,relatime,cpuset,noprefix,cpuset_v2_mode,release_agent=/sbin/cpuset_release_agent)
[1] https://cs.android.com/android/_/android/platform/system/core/+/b769c8d24fd7be96f8968aa4c80b669525b930d3
[2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libprocessgroup/setup/cgroup_map_write.cpp;drc=2dac5d89a0f024a2d0cc46a80ba4ee13472f1681;l=192
[3] https://lore.kernel.org/lkml/f795f8be-a184-408a-0b5a-553d26061385@redhat.com/T/
Fixes: e1cba4b85daa ("cgroup: Add mount flag to enable cpuset to use v2 behavior in v1 cgroup")
Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Waiman Long <longman@redhat.com>
Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"Just a single fix for the Xen multicall driver avoiding a percpu
variable referencing initdata by its initializer"
* tag 'for-linus-6.15a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: fix multicall debug feature
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull fwctl fixes from Jason Gunthorpe:
"Three small changes from further build testing:
- Don't rely on the userspace uuid.h for the uapi header
- Fix sparse warnings in pds
- Typo in log message"
* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
fwctl: Fix repeated device word in log message
pds_fwctl: Fix type and endian complaints
fwctl/cxl: Fix uuid_t usage in uapi
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All are device-specific like quirks, new
IDs, and other safe (or rather boring) changes"
* tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
firmware: cs_dsp: test_bin_error: Fix uninitialized data used as fw version
ASoC: codecs: Add of_match_table for aw888081 driver
ASoC: fsl: fsl_qmc_audio: Reset audio data pointers on TRIGGER_START event
mailmap: Add entry for Srinivas Kandagatla
MAINTAINERS: use kernel.org alias
ASoC: cs42l43: Reset clamp override on jack removal
ALSA: hda/realtek - Fixed ASUS platform headset Mic issue
ALSA: hda/cirrus_scodec_test: Don't select dependencies
ALSA: azt2320: Replace deprecated strcpy() with strscpy()
ASoC: hdmi-codec: use RTD ID instead of DAI ID for ELD entry
ASoC: Intel: avs: Constrain path based on BE capabilities
ALSA: hda/tas2781: Remove unnecessary NULL check before release_firmware()
ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe()
ASoC: fsl_asrc_dma: get codec or cpu dai from backend
ASoC: qcom: Fix sc7280 lpass potential buffer overflow
ASoC: dwc: always enable/disable i2s irqs
ASoC: Intel: sof_sdw: Add quirk for Asus Zenbook S16
ASoC: codecs:lpass-wsa-macro: Fix logic of enabling vi channels
ASoC: codecs:lpass-wsa-macro: Fix vi feedback rate
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
"Fixes:
- amd/pmf: Fix STT limits
- asus-laptop: Fix an uninitialized variable
- intel_pmc_ipc: Allow building without ACPI
- mlxbf-bootctl: Use sysfs_emit_at() in secure_boot_fuse_state_show()
- msi-wmi-platform: Add locking to workaround ACPI firmware bug
New HW support:
- alienware-wmi-wmax:
- Extended thermal control support to:
- Alienware Area-51m R2
- Alienware m16 R1
- Alienware m16 R2
- Dell G16 7630
- Dell G5 5505 SE
- G-Mode support to Alienware m16 R1
- x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data"
* tag 'platform-drivers-x86-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug
platform/x86: msi-wmi-platform: Rename "data" variable
platform/x86: alienware-wmi-wmax: Extend support to more laptops
platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1
platform/x86: amd: pmf: Fix STT limits
mlxbf-bootctl: use sysfs_emit_at() in secure_boot_fuse_state_show()
platform/x86: x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data
platform/x86: x86-android-tablets: Add "9v" to Vexia EDU ATLA 10 tablet symbols
asus-laptop: Fix an uninitialized variable
platform/x86: intel_pmc_ipc: add option to build without ACPI
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Small drivers fixes, except for ufs which has two large updates, one
for exposing the device level feature, which is a new addition to the
device spec and the other reworking the exynos driver to fix coherence
issues on some android phones"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: Driver version update to 07.734.00.00-rc1
scsi: megaraid_sas: Block zero-length ATA VPD inquiry
scsi: scsi_transport_srp: Replace min/max nesting with clamp()
scsi: ufs: core: Add device level exception support
scsi: ufs: core: Rename ufshcd_wb_presrv_usrspc_keep_vcc_on()
scsi: smartpqi: Use is_kdump_kernel() to check for kdump
scsi: pm80xx: Set phy_attached to zero when device is gone
scsi: ufs: exynos: gs101: Put UFS device in reset on .suspend()
scsi: ufs: exynos: Move phy calls to .exit() callback
scsi: ufs: exynos: Enable PRDT pre-fetching with UFSHCD_CAP_CRYPTO
scsi: ufs: exynos: Ensure consistent phy reference counts
scsi: ufs: exynos: Disable iocc if dma-coherent property isn't set
scsi: ufs: exynos: Move UFS shareability value to drvdata
scsi: ufs: exynos: Ensure pre_link() executes before exynos_ufs_phy_init()
scsi: iscsi: Fix missing scsi_host_put() in error path
scsi: ufs: core: Fix a race condition related to device commands
scsi: hisi_sas: Fix I/O errors caused by hardware port ID changes
scsi: hisi_sas: Enable force phy when SATA disk directly connected
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Damien Le Moal:
- Fix how sense data from the sense data for successfull NCQ commands
log page is used to fully initialize the result_tf of a completed
command, so that the sense data returned to the scsi layer is fully
initialized with all the device provided information (from Niklas)
* tag 'ata-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-sata: Save all fields from sense data descriptor
|
|
When adding folio_memcg function call in the zram module for
Android16-6.12, the following error occurs during compilation:
ERROR: modpost: "cgroup_mutex" [../soc-repo/zram.ko] undefined!
This error is caused by the indirect call to lockdep_is_held(&cgroup_mutex)
within folio_memcg. The export setting for cgroup_mutex is controlled by
the CONFIG_PROVE_RCU macro. If CONFIG_LOCKDEP is enabled while
CONFIG_PROVE_RCU is not, this compilation error will occur.
To resolve this issue, add a parallel macro CONFIG_LOCKDEP control to
ensure cgroup_mutex is properly exported when needed.
Signed-off-by: gao xu <gaoxu2@honor.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Pull XFS fixes from Carlos Maiolino:
"This mostly includes fixes and documentation for the zoned allocator
feature merged during previous merge window, but it also adds a sysfs
tunable for the zone garbage collector.
There is also a fix for a regression to the RT device that we'd like
to fix ASAP now that we're getting more users on the RT zoned
allocator"
* tag 'xfs-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: document zoned rt specifics in admin-guide
xfs: fix fsmap for internal zoned devices
xfs: Fix spelling mistake "drity" -> "dirty"
xfs: compute buffer address correctly in xmbuf_map_backing_mem
xfs: add tunable threshold parameter for triggering zone GC
xfs: mark xfs_buf_free as might_sleep()
xfs: remove the leftover xfs_{set,clear}_li_failed infrastructure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- handle encoded read ioctl returning EAGAIN so it does not mistakenly
free the work structure
- escape subvolume path in mount option list so it cannot be wrongly
parsed when the path contains ","
- remove folio size assertions when writing super block to device with
enabled large folios
* tag 'for-6.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: remove folio order ASSERT()s in super block writeback path
btrfs: correctly escape subvol in btrfs_show_options()
btrfs: ioctl: don't free iov when btrfs_encoded_read() returns -EAGAIN
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
- Stable fix adding zero initialization of slab->obj_ext to prevent
crashes with allocation profiling (Suren Baghdasaryan)
* tag 'slab-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slab: ensure slab->obj_exts is clear in a newly allocated slab page
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux
Merge amd-pstate content for 6.15 (4/15/25) from Mario Limonciello:
"Add a fix for X3D processors where depending upon what BIOS was
set initially rankings might be set improperly.
Add a fix for changing min/max limits while on the performance
governor."
* tag 'amd-pstate-v6.15-2025-04-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux:
cpufreq/amd-pstate: Enable ITMT support after initializing core rankings
cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
|
|
Since cpufreq_driver_resolve_freq() can run in parallel with
cpufreq_set_policy() and there is no synchronization between them,
the former may access policy->min and policy->max while the latter
is updating them and it may see intermediate values of them due
to the way the update is carried out. Also the compiler is free
to apply any optimizations it wants both to the stores in
cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq()
which may result in additional inconsistencies.
To address this, use WRITE_ONCE() when updating policy->min and
policy->max in cpufreq_set_policy() and use READ_ONCE() for reading
them in cpufreq_driver_resolve_freq(). Moreover, rearrange the update
in cpufreq_set_policy() to avoid storing intermediate values in
policy->min and policy->max with the help of the observation that
their new values are expected to be properly ordered upfront.
Also modify cpufreq_driver_resolve_freq() to take the possible reverse
ordering of policy->min and policy->max, which may happen depending on
the ordering of operations when this function and cpufreq_set_policy()
run concurrently, into account by always honoring the max when it
turns out to be less than the min (in case it comes from thermal
throttling or similar).
Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements")
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/5907080.DvuYhMxLoT@rjwysocki.net
|
|
Notice that ignore_dl_rate_limit() need not piggy back on the
limits_changed handling to achieve its goal (which is to enforce a
frequency update before its due time).
Namely, if sugov_should_update_freq() is updated to check
sg_policy->need_freq_update and return 'true' if it is set when
sg_policy->limits_changed is not set, ignore_dl_rate_limit() may
set the former directly instead of setting the latter, so it can
avoid hitting the memory barrier in sugov_should_update_freq().
Update the code accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/10666429.nUPlyArG6x@rjwysocki.net
|
|
The handling of the limits_changed flag in struct sugov_policy needs to
be explicitly synchronized to ensure that cpufreq policy limits updates
will not be missed in some cases.
Without that synchronization it is theoretically possible that
the limits_changed update in sugov_should_update_freq() will be
reordered with respect to the reads of the policy limits in
cpufreq_driver_resolve_freq() and in that case, if the limits_changed
update in sugov_limits() clobbers the one in sugov_should_update_freq(),
the new policy limits may not take effect for a long time.
Likewise, the limits_changed update in sugov_limits() may theoretically
get reordered with respect to the updates of the policy limits in
cpufreq_set_policy() and if sugov_should_update_freq() runs between
them, the policy limits change may be missed.
To ensure that the above situations will not take place, add memory
barriers preventing the reordering in question from taking place and
add READ_ONCE() and WRITE_ONCE() annotations around all of the
limits_changed flag updates to prevent the compiler from messing up
with that code.
Fixes: 600f5badb78c ("cpufreq: schedutil: Don't skip freq update when limits change")
Cc: 5.3+ <stable@vger.kernel.org> # 5.3+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/3376719.44csPzL39Z@rjwysocki.net
|
|
Commit 8e461a1cb43d ("cpufreq: schedutil: Fix superfluous updates caused
by need_freq_update") modified sugov_should_update_freq() to set the
need_freq_update flag only for drivers with CPUFREQ_NEED_UPDATE_LIMITS
set, but that flag generally needs to be set when the policy limits
change because the driver callback may need to be invoked for the new
limits to take effect.
However, if the return value of cpufreq_driver_resolve_freq() after
applying the new limits is still equal to the previously selected
frequency, the driver callback needs to be invoked only in the case
when CPUFREQ_NEED_UPDATE_LIMITS is set (which means that the driver
specifically wants its callback to be invoked every time the policy
limits change).
Update the code accordingly to avoid missing policy limits changes for
drivers without CPUFREQ_NEED_UPDATE_LIMITS.
Fixes: 8e461a1cb43d ("cpufreq: schedutil: Fix superfluous updates caused by need_freq_update")
Closes: https://lore.kernel.org/lkml/Z_Tlc6Qs-tYpxWYb@linaro.org/
Reported-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/3010358.e9J7NaK4W3@rjwysocki.net
|
|
The QDMA packet scheduler suffers from a performance issue.
Fix this by picking up changes from MediaTek's SDK which change to use
Token Bucket instead of Leaky Bucket and fix the SPEED_1000 configuration.
Fixes: 160d3a9b1929 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/18040f60f9e2f5855036b75b28c4332a2d2ebdd8.1744764277.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
100Mbps
Without this patch, the maximum weight of the queue limit will be
incorrect when linked at 100Mbps due to an apparent typo.
Fixes: f63959c7eec31 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/74111ba0bdb13743313999ed467ce564e8189006.1744764277.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the current method, the MDC divider was reset to the default setting
of 2.5MHz after the NETSYS SER. Therefore, we need to reapply the MDC
divider configuration function in mtk_hw_init() after reset.
Fixes: c0a440031d431 ("net: ethernet: mtk_eth_soc: set MDIO bus clock frequency")
Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/8ab7381447e6cdcb317d5b5a6ddd90a1734efcb0.1744764277.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sending exact nr_segs, avoids bio split check and processing in
block layer, which takes around 5%[1] of overall CPU utilization.
In our setup, we see overall improvement of IOPS from 7.15M to 7.65M [2]
and 5% less CPU utilization.
[1]
3.52% io_uring [kernel.kallsyms] [k] bio_split_rw_at
1.42% io_uring [kernel.kallsyms] [k] bio_split_rw
0.62% io_uring [kernel.kallsyms] [k] bio_submit_split
[2]
sudo taskset -c 0,1 ./t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2
-r4 /dev/nvme0n1 /dev/nvme1n1
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
[Pavel: fixed for kbuf, rebased and reworked on top of cleanups]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7a1a49a8d053bd617c244291d63dbfbc07afde36.1744882081.git.asml.silence@gmail.com
[axboe: fold in fix factoring in buf reg offset]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fix for net
The following batch contains one Netfilter fix for net:
1) conntrack offload bit is erroneously unset in a race scenario,
from Florian Westphal.
netfilter pull request 25-04-17
* tag 'nf-25-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: conntrack: fix erronous removal of offload bit
====================
Link: https://patch.msgid.link/20250417102847.16640-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Arch-PEBS retires IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs, so
intel_pmu_pebs_enable/disable() and intel_pmu_pebs_enable/disable_all()
are not needed to call for ach-PEBS.
To make the code cleaner, introduce static calls
x86_pmu_pebs_enable/disable() and x86_pmu_pebs_enable/disable_all()
instead of adding "x86_pmu.arch_pebs" check directly in these helpers.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-7-dapeng1.mi@linux.intel.com
|
|
Since architectural PEBS would be introduced in subsequent patches,
rename x86_pmu.pebs to x86_pmu.ds_pebs for distinguishing with the
upcoming architectural PEBS.
Besides restrict reserve_ds_buffers() helper to work only for the
legacy DS based PEBS and avoid it to corrupt the pebs_active flag and
release PEBS buffer incorrectly for arch-PEBS since the later patch
would reuse these flags and alloc/release_pebs_buffer() helpers for
arch-PEBS.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-6-dapeng1.mi@linux.intel.com
|
|
Move x86_pmu.bts flag initialization into bts_init() from
intel_ds_init() and rename intel_ds_init() to intel_pebs_init() since it
fully initializes PEBS now after removing the x86_pmu.bts
initialization.
It's safe to move x86_pmu.bts into bts_init() since all x86_pmu.bts flag
are called after bts_init() execution.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-5-dapeng1.mi@linux.intel.com
|
|
CPUID archPerfmonExt (0x23) leaves are supported to enumerate CPU
level's PMU capabilities on non-hybrid processors as well.
This patch supports to parse archPerfmonExt leaves on non-hybrid
processors. Architectural PEBS leverages archPerfmonExt sub-leaves 0x4
and 0x5 to enumerate the PEBS capabilities as well. This patch is a
precursor of the subsequent arch-PEBS enabling patches.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-4-dapeng1.mi@linux.intel.com
|
|
From the PMU's perspective, Clearwater Forest is similar to the previous
generation Sierra Forest.
The key differences are the ARCH PEBS feature and the new added 3 fixed
counters for topdown L1 metrics events.
The ARCH PEBS is supported in the following patches. This patch provides
support for basic perfmon features and 3 new added fixed counters.
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-3-dapeng1.mi@linux.intel.com
|
|
Due to an oversight in merging:
da916e96e2de ("perf: Make perf_pmu_unregister() useable")
on top of:
a3c3c66670ce ("perf/core: Fix child_total_time_enabled accounting bug at task exit")
the timekeeping fix from this latter patch got undone.
Redo it.
Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250417080815.GI38216@noisy.programming.kicks-ass.net
|
|
Due to an oversight in merging:
da916e96e2de ("perf: Make perf_pmu_unregister() useable")
on top of:
56799bc03565 ("perf: Fix hang while freeing sigtrap event")
.. it is now possible to hit put_event(EVENT_TOMBSTONE), which makes
the computer sad.
This also means that for the event->parent == EVENT_TOMBSTONE, the
put_event() matching inherit_event() has gone missing.
Previously this was done in perf_event_release_kernel() after calling
perf_remove_from_context(), but with it delegated to put_event(), this
case is now entirely missed, leading to leaks.
Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: kernel test robot <oliver.sang@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/oe-lkp/202504131701.941039cd-lkp@intel.com
Link: https://lkml.kernel.org/r/20250415131446.GN5600@noisy.programming.kicks-ass.net
|
|
So there are three situations:
* If perf_event_free_task() has removed all the children from the parent list
before perf_event_release_kernel() got a chance to even iterate them, then
it's all good as there is no get_ctx() pending.
* If perf_event_release_kernel() iterates a child event, but it gets freed
meanwhile by perf_event_free_task() while the mutexes are temporarily
unlocked, it's all good because while locking again the ctx mutex,
perf_event_release_kernel() observes TASK_TOMBSTONE.
* But if perf_event_release_kernel() frees the child event before
perf_event_free_task() got a chance, we may face this scenario:
perf_event_release_kernel() perf_event_free_task()
-------------------------- ------------------------
mutex_lock(&event->child_mutex)
get_ctx(child->ctx)
mutex_unlock(&event->child_mutex)
mutex_lock(ctx->mutex)
mutex_lock(&event->child_mutex)
perf_remove_from_context(child)
mutex_unlock(&event->child_mutex)
mutex_unlock(ctx->mutex)
// This lock acquires ctx->refcount == 2
// visibility
mutex_lock(ctx->mutex)
ctx->task = TASK_TOMBSTONE
mutex_unlock(ctx->mutex)
wait_var_event()
// enters prepare_to_wait() since
// ctx->refcount == 2
// is guaranteed to be seen
set_current_state(TASK_INTERRUPTIBLE)
smp_mb()
if (ctx->refcount != 1)
schedule()
put_ctx()
// NOT fully ordered! Only RELEASE semantics
refcount_dec_and_test()
atomic_fetch_sub_release()
// So TASK_TOMBSTONE is not guaranteed to be seen
if (ctx->task == TASK_TOMBSTONE)
wake_up_var()
Basically it's a broken store buffer:
perf_event_release_kernel() perf_event_free_task()
-------------------------- ------------------------
ctx->task = TASK_TOMBSTONE smp_store_release(&ctx->refcount, ctx->refcount - 1)
smp_mb()
READ_ONCE(ctx->refcount) READ_ONCE(ctx->task)
So we need a smp_mb__after_atomic() before looking at ctx->task.
Fixes: 59f3aa4a3ee2 ("perf: Simplify perf_event_free_task() wait")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/Z_ZvmEhjkAhplCBE@localhost.localdomain
|
|
In the zeal to adjust all event->state checks to include the new
REVOKED state, one adjustment was made in error. Notably it resulted
in read() on the perf filedesc to stop working for any state lower
than ERROR, specifically EXIT.
This leads to problems with (among others) perf-stat, which wants to
read the counts after a program has finished execution.
Fixes: da916e96e2de ("perf: Make perf_pmu_unregister() useable")
Reported-by: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
Reported-by: James Clark <james.clark@linaro.org>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/77036114-8723-4af9-a068-1d535f4e2e81@linaro.org
Link: https://lore.kernel.org/r/20250417080725.GH38216@noisy.programming.kicks-ass.net
|
|
io_import_fixed is a mess. Even though we know the final len of the
iterator, we still assign offset + len and do some magic after to
correct for that.
Do offset calculation first and finalise it with iov_iter_bvec at the
end.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2d5107fed24f8b23245ef2ede9a5a7f7c426df61.1744882081.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Kernel registered buffers are special because segments are not uniform
in size, and we have a bunch of optimisations based on that uniformity
for normal buffers. Handle kbuf separately, it'll be cleaner this way.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4e9e5990b0ab5aee723c0be5cd9b5bcf810375f9.1744882081.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Don't optimise for requests with offset=0. Large registered buffers are
the preference and hence the user is likely to pass an offset, and the
adjustments are not expensive and will be made even cheaper in following
patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1c2beb20470ee3c886a363d4d8340d3790db19f3.1744882081.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
From PMU's perspective, Panther Lake is similar to the previous
generation Lunar Lake. Both are hybrid platforms, with e-core and
p-core.
The key differences are the ARCH PEBS feature and several new events.
The ARCH PEBS is supported in the following patches.
The new events will be supported later in perf tool.
Share the code path with the Lunar Lake. Only update the name.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20250415114428.341182-2-dapeng1.mi@linux.intel.com
|
|
Currently when a user samples user space GPRs (--user-regs option) with
PEBS, the user space GPRs actually always come from software PMI
instead of from PEBS hardware. This leads to the sampled GPRs to
possibly be inaccurate for single PEBS record case because of the
skid between counter overflow and GPRs sampling on PMI.
For the large PEBS case, it is even worse. If user sets the
exclude_kernel attribute, large PEBS would be used to sample user space
GPRs, but since PEBS GPRs group is not really enabled, it leads to all
samples in the large PEBS record to share the same piece of user space
GPRs, like this reproducer shows:
$ perf record -e branches:pu --user-regs=ip,ax -c 100000 ./foo
$ perf report -D | grep "AX"
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
.... AX 0x000000003a0d4ead
So enable GPRs group for user space GPRs sampling and prioritize reading
GPRs from PEBS. If the PEBS sampled GPRs is not user space GPRs (single
PEBS record case), perf_sample_regs_user() modifies them to user space
GPRs.
[ mingo: Clarified the changelog. ]
Fixes: c22497f5838c ("perf/x86/intel: Support adaptive PEBS v4")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250415104135.318169-2-dapeng1.mi@linux.intel.com
|
|
The below code would always unconditionally clear other status bits like
perf metrics overflow bit once PEBS buffer overflows:
status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
This is incorrect. Perf metrics overflow bit should be cleared only when
fixed counter 3 in PEBS counter group. Otherwise perf metrics overflow
could be missed to handle.
Closes: https://lore.kernel.org/all/20250225110012.GK31462@noisy.programming.kicks-ass.net/
Fixes: 7b2c05a15d29 ("perf/x86/intel: Generic support for hardware TopDown metrics")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250415104135.318169-1-dapeng1.mi@linux.intel.com
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.15
- fix scan failure for non-ANA multipath controllers (Hannes Reinecke)
- fix multipath sysfs links creation for some cases (Hannes Reinecke)
- PCIe endpoint fixes (Damien Le Moal)
- use NULL instead of 0 in the auth code (Damien Le Moal)"
* tag 'nvme-6.15-2025-04-17' of git://git.infradead.org/nvme:
nvmet: pci-epf: cleanup link state management
nvmet: pci-epf: clear CC and CSTS when disabling the controller
nvmet: pci-epf: always fully initialize completion entries
nvmet: auth: use NULL to clear a pointer in nvmet_auth_sq_free()
nvme-multipath: sysfs links may not be created for devices
nvme: fixup scan failure for non-ANA multipath controllers
|
|
Add check for the return value of spi_imx_setupxfer().
spi_imx->rx and spi_imx->tx function pointer can be NULL when
spi_imx_setupxfer() return error, and make NULL pointer dereference.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Call trace:
0x0
spi_imx_pio_transfer+0x50/0xd8
spi_imx_transfer_one+0x18c/0x858
spi_transfer_one_message+0x43c/0x790
__spi_pump_transfer_message+0x238/0x5d4
__spi_sync+0x2b0/0x454
spi_write_then_read+0x11c/0x200
Signed-off-by: Tamura Dai <kirinode0@gmail.com>
Reviewed-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20250417011700.14436-1-kirinode0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- l2cap: Process valid commands in too long frame
- vhci: Avoid needless snprintf() calls
* tag 'for-net-2025-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: vhci: Avoid needless snprintf() calls
Bluetooth: l2cap: Process valid commands in too long frame
====================
Link: https://patch.msgid.link/20250416210126.2034212-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The scale of IIO bandwidth in free running counters is inherited from
the ICX. The counter increments for every 32 bytes rather than 4 bytes.
The IIO bandwidth out free running counters don't increment with a
consistent size. The increment depends on the requested size. It's
impossible to find a fixed increment. Remove it from the event_descs.
Fixes: 0378c93a92e2 ("perf/x86/intel/uncore: Support IIO free-running counters on Sapphire Rapids server")
Reported-by: Tang Jun <dukang.tj@alibaba-inc.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250416142426.3933977-3-kan.liang@linux.intel.com
|
|
There was a mistake in the ICX uncore spec too. The counter increments
for every 32 bytes rather than 4 bytes.
The same as SNR, there are 1 ioclk and 8 IIO bandwidth in free running
counters. Reuse the snr_uncore_iio_freerunning_events().
Fixes: 2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Reported-by: Tang Jun <dukang.tj@alibaba-inc.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250416142426.3933977-2-kan.liang@linux.intel.com
|
|
There was a mistake in the SNR uncore spec. The counter increments for
every 32 bytes of data sent from the IO agent to the SOC, not 4 bytes
which was documented in the spec.
The event list has been updated:
"EventName": "UNC_IIO_BANDWIDTH_IN.PART0_FREERUN",
"BriefDescription": "Free running counter that increments for every 32
bytes of data sent from the IO agent to the SOC",
Update the scale of the IIO bandwidth in free running counters as well.
Fixes: 210cc5f9db7a ("perf/x86/intel/uncore: Add uncore support for Snow Ridge server")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250416142426.3933977-1-kan.liang@linux.intel.com
|
|
Meghana Malladi says:
====================
Bug fixes from XDP and perout series
This patch series consists of bug fixes from the XDP series:
1. Fixes a kernel warning that occurs when bringing down the
network interface.
2. Resolves a potential NULL pointer dereference in the
emac_xmit_xdp_frame() function.
3. Resolves a potential NULL pointer dereference in the
icss_iep_perout_enable() function
v3: https://lore.kernel.org/all/20250328102403.2626974-1-m-malladi@ti.com/
====================
Link: https://patch.msgid.link/20250415090543.717991-1-m-malladi@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The ICSS IEP driver tracks perout and pps enable state with flags.
Currently when disabling pps and perout signals during icss_iep_exit(),
results in NULL pointer dereference for perout.
To fix the null pointer dereference issue, the icss_iep_perout_enable_hw
function can be modified to directly clear the IEP CMP registers when
disabling PPS or PEROUT, without referencing the ptp_perout_request
structure, as its contents are irrelevant in this case.
Fixes: 9b115361248d ("net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/7b1c7c36-363a-4085-b26c-4f210bee1df6@stanley.mountain/
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250415090543.717991-4-m-malladi@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
emac_xmit_xdp_frame()
There is an error check inside emac_xmit_xdp_frame() function which
is called when the driver wants to transmit XDP frame, to check if
the allocated tx descriptor is NULL, if true to exit and return
ICSSG_XDP_CONSUMED implying failure in transmission.
In this case trying to free a descriptor which is NULL will result
in kernel crash due to NULL pointer dereference. Fix this error handling
and increase netdev tx_dropped stats in the caller of this function
if the function returns ICSSG_XDP_CONSUMED.
Fixes: 62aa3246f462 ("net: ti: icssg-prueth: Add XDP support")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/70d8dd76-0c76-42fc-8611-9884937c82f5@stanley.mountain/
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250415090543.717991-3-m-malladi@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
During network interface initialization, the NIC driver needs to register
its Rx queue with the XDP, to ensure the incoming XDP buffer carries a
pointer reference to this info and is stored inside xdp_rxq_info.
While this struct isn't tied to XDP prog, if there are any changes in
Rx queue, the NIC driver needs to stop the Rx queue by unregistering
with XDP before purging and reallocating memory. Drop page_pool destroy
during Rx channel reset as this is already handled by XDP during
xdp_rxq_info_unreg (Rx queue unregister), failing to do will cause the
following warning:
warning logs: https://gist.github.com/MeghanaMalladiTI/eb627e5dc8de24e42d7d46572c13e576
Fixes: 46eeb90f03e0 ("net: ti: icssg-prueth: Use page_pool API for RX buffer allocation")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250415090543.717991-2-m-malladi@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|