|
Merge tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Remove usermode driver (UMD) framework (Thomas Weißschuh)
- Introduce Strongly Connected Component (SCC) in the verifier to
detect loops and refine register liveness (Eduard Zingerman)
- Allow 'void *' cast using bpf_rdonly_cast() and corresponding
'__arg_untrusted' for global function parameters (Eduard Zingerman)
- Improve precision for BPF_ADD and BPF_SUB operations in the verifier
(Harishankar Vishwanathan)
- Teach the verifier that constant pointer to a map cannot be NULL
(Ihor Solodrai)
- Introduce BPF streams for error reporting of various conditions
detected by BPF runtime (Kumar Kartikeya Dwivedi)
- Teach the verifier to insert runtime speculation barrier (lfence on
x86) to mitigate speculative execution instead of rejecting the
programs (Luis Gerhorst)
- Various improvements for 'veristat' (Mykyta Yatsenko)
- Warn on internal verifier errors when CONFIG_DEBUG_KERNEL is enabled, to
improve bug detection by syzbot (Paul Chaignon)
- Support BPF private stack on arm64 (Puranjay Mohan)
- Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's
node (Song Liu)
- Introduce kfuncs for read-only string operations (Viktor Malik)
- Implement show_fdinfo() for bpf_links (Tao Chen)
- Reduce verifier's stack consumption (Yonghong Song)
- Implement mprog API for cgroup-bpf programs (Yonghong Song)
* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
bpf: Add log for attaching tracing programs to functions in deny list
bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
bpf: Fix various typos in verifier.c comments
bpf: Add third round of bounds deduction
selftests/bpf: Test invariants on JSLT crossing sign
selftests/bpf: Test cross-sign 64bits range refinement
selftests/bpf: Update reg_bound range refinement logic
bpf: Improve bounds when s64 crosses sign boundary
bpf: Simplify bounds refinement from s32
selftests/bpf: Enable private stack tests for arm64
bpf, arm64: JIT support for private stack
bpf: Move bpf_jit_get_prog_name() to core.c
bpf, arm64: Fix fp initialization for exception boundary
umd: Remove usermode driver framework
bpf/preload: Don't select USERMODE_DRIVER
selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
selftests/bpf: Increase xdp data size for arm64 64K page size
...
|
|
Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workqueue
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
|
|
Delete fexit_noreturns.c files and migrate the cases into
tracing_failure.c files.
The result:
$ tools/testing/selftests/bpf/test_progs -t tracing_failure/fexit_noreturns
#467/4 tracing_failure/fexit_noreturns:OK
#467 tracing_failure:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-5-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a selftest for attaching tracing programs to functions in the deny list.
The result:
$ tools/testing/selftests/bpf/test_progs -t tracing_failure/tracing_deny
#468/3 tracing_failure/tracing_deny:OK
#468 tracing_failure:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-4-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Show the precise rejected function when attaching fexit/fmod_ret to
__noreturn functions.
With this change, the log reports the exact rejected function name:
$ ./fexit
libbpf: prog 'fexit': BPF program load failed: -EINVAL
libbpf: prog 'fexit': -- BEGIN PROG LOAD LOG --
Attaching fexit/fmod_ret to __noreturn function 'do_exit' is rejected.
Suggested-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250724151454.499040-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Merge tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs bpf updates from Christian Brauner:
"These changes allow bpf to read extended attributes from cgroupfs.
This is useful in redirecting AF_UNIX socket connections based on
cgroup membership of the socket. One use-case is the ability to
implement log namespaces in systemd so services and containers are
redirected to different journals"
* tag 'vfs-6.17-rc1.bpf' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/kernfs: test xattr retrieval
selftests/bpf: Add tests for bpf_cgroup_read_xattr
bpf: Mark cgroup_subsys_state->cgroup RCU safe
bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
kernfs: remove iattr_mutex
|
|
Commit d7f008738171 ("bpf: try harder to deduce register bounds from
different numeric domains") added a second call to __reg_deduce_bounds
in reg_bounds_sync because a single call wasn't enough to converge to a
fixed point in terms of register bounds.
With patch "bpf: Improve bounds when s64 crosses sign boundary" from
this series, Eduard noticed that calling __reg_deduce_bounds twice isn't
enough anymore to converge. The first selftest added in "selftests/bpf:
Test cross-sign 64bits range refinement" highlights the need for a third
call to __reg_deduce_bounds. After instruction 7, reg_bounds_sync
performs the following bounds deduction:
reg_bounds_sync entry: scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
__update_reg_bounds: scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146)
__reg_deduce_bounds:
__reg32_deduce_bounds: scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
__reg64_deduce_bounds: scalar(smin=-655,smax=0xeffffeee,smin32=-783,smax32=-146,umin32=0xfffffcf1,umax32=0xffffff6e)
__reg_deduce_mixed_bounds: scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
__reg_deduce_bounds:
__reg32_deduce_bounds: scalar(smin=-655,smax=0xeffffeee,umin=umin32=0xfffffcf1,umax=0xffffffffffffff6e,smin32=-783,smax32=-146,umax32=0xffffff6e)
__reg64_deduce_bounds: scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
__reg_deduce_mixed_bounds: scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e)
__reg_bound_offset: scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))
__update_reg_bounds: scalar(smin=-655,smax=smax32=-146,umin=0xfffffffffffffd71,umax=0xffffffffffffff6e,smin32=-783,umin32=0xfffffcf1,umax32=0xffffff6e,var_off=(0xfffffffffffffc00; 0x3ff))
In particular, notice how:
1. In the first call to __reg_deduce_bounds, __reg32_deduce_bounds
learns new u32 bounds.
2. __reg64_deduce_bounds is unable to improve bounds at this point.
3. __reg_deduce_mixed_bounds derives new u64 bounds from the u32 bounds.
4. In the second call to __reg_deduce_bounds, __reg64_deduce_bounds
improves the smax and umin bounds thanks to patch "bpf: Improve
bounds when s64 crosses sign boundary" from this series.
5. Subsequent functions are unable to improve the ranges further (only
tnums). Yet, a better smin32 bound could be learned from the smin
bound.
__reg32_deduce_bounds is able to improve smin32 from smin, but for that
we need a third call to __reg_deduce_bounds.
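For reference, a minimal sketch of the resulting call sequence, assuming
the helper names shown in the trace above (not a verbatim copy of the
kernel source):
static void reg_bounds_sync(struct bpf_reg_state *reg)
{
        /* clamp bounds against each other and the tnum first */
        __update_reg_bounds(reg);
        __reg_deduce_bounds(reg);
        __reg_deduce_bounds(reg);
        __reg_deduce_bounds(reg);  /* third round: lets __reg32_deduce_bounds
                                    * refine smin32 from the smin learned in
                                    * the second round */
        __reg_bound_offset(reg);
        __update_reg_bounds(reg);
}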
As discussed in [1], there may be a better way to organize the deduction
rules to learn the same information with fewer calls to the same
functions. Such an optimization requires further analysis and is
orthogonal to the present patchset.
Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Co-developed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/79619d3b42e5525e0e174ed534b75879a5ba15de.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The improved u64/s64 range refinement fixes the invariant violation
that was happening in this test for BPF_JSLT when crossing the sign
boundary.
After this patch, we have one test remaining with a known invariant
violation. It's the same test as fixed here but for 32-bit ranges.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/ad046fb0016428f1a33c3b81617aabf31b51183f.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds coverage for the new cross-sign 64bits range refinement
logic. The three tests cover the cases when the u64 and s64 ranges
overlap (1) in the negative portion of s64, (2) in the positive portion
of s64, and (3) in both portions.
The first test is a simplified version of a BPF program generated by
syzkaller that caused an invariant violation [1]. It looks like
syzkaller could not extract the reproducer itself (and therefore didn't
report it to the mailing list), but I was able to extract it from the
console logs of a crash.
The principle is similar to the invariant violation described in
commit 6279846b9b25 ("bpf: Forget ranges when refining tnum after
JSET"): the verifier walks a dead branch, uses the condition to refine
ranges, and ends up with inconsistent ranges. In this case, the dead
branch is the one where we fall through on both jumps. The new refinement logic
improves the bounds such that the second jump is properly detected as
always-taken and the verifier doesn't end up walking a dead branch.
The second and third tests are inspired by the first, but rely on
condition jumps to prepare the bounds instead of ALU instructions. An
R10 write is used to trigger a verifier error when the bounds can't be
refined.
Link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf [1]
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/a0e17b00dab8dabcfa6f8384e7e151186efedfdd.1753695655.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
As the arm64 JIT now supports private stacks, make sure all relevant tests
run on the arm64 architecture.
Relevant tests:
#415/1 struct_ops_private_stack/private_stack:OK
#415/2 struct_ops_private_stack/private_stack_fail:OK
#415/3 struct_ops_private_stack/private_stack_recur:OK
#415 struct_ops_private_stack:OK
#549/1 verifier_private_stack/Private stack, single prog:OK
#549/2 verifier_private_stack/Private stack, subtree > MAX_BPF_STACK:OK
#549/3 verifier_private_stack/No private stack:OK
#549/4 verifier_private_stack/Private stack, callback:OK
#549/5 verifier_private_stack/Private stack, exception in mainprog:OK
#549/6 verifier_private_stack/Private stack, exception in subprog:OK
#549/7 verifier_private_stack/Private stack, async callback, not nested:OK
#549/8 verifier_private_stack/Private stack, async callback, potential nesting:OK
#549 verifier_private_stack:OK
Summary: 2/11 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250724120257.7299-4-puranjay@kernel.org
|
|
For the arm64 64K page size, the xdp data size was set to more than 64K
in one of the previous patches. This causes bpf_dynptr_memset() to fail.
Since the bpf_dynptr_memset() failure is expected with a 64K page size,
return success.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250725043440.209266-1-yonghong.song@linux.dev
|
|
For the arm64 64K page size, the bpf_dynptr_copy() in test dynptr/test_dynptr_copy_xdp
will succeed, whereas it fails with a 4K page size. This patch makes a change
so the copy fails as expected for both 4K and 64K page sizes.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://patch.msgid.link/20250725043435.208974-1-yonghong.song@linux.dev
|
|
This patch adds selftests to cover invalid narrower loads on the
context. These used to cause kernel warnings before the previous patch.
To trigger the warning, the load had to be aligned, read an affected
context field (e.g., skb->sk), and not start at the beginning of the
field.
The nine new cases all fail without the previous patch.
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/44cd83ea9c6868079943f0a436c6efa850528cc1.1753194596.git.paul.chaignon@gmail.com
|
|
Prepare for bucket resume tests for established TCP sockets by creating
a program to immediately destroy and remove sockets from the TCP ehash
table, since close() is not deterministic.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Add parentheses around loopback address check to fix up logic and make
the socket state filter configurable for the TCP socket iterators.
Iterators can skip the socket state check by setting ss to 0.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Prepare to test TCP socket iteration over both listening and established
sockets by allowing the BPF iterator programs to skip the port check.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
This patch adds coverage for the warning detected by syzkaller and fixed
in the previous patch. Without the previous patch, this test fails with:
verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds
violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0]
var_off=(0x0, 0x0)(1)
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/c7893be1170fdbcf64e0200c110cdbd360ce7086.1752171365.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add selftests for the new bpf_arena_reserve_pages kfunc.
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250709191312.29840-3-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The enum64 type used by the verifier_global_ptr_args test case requires
CONFIG_SCHED_CLASS_EXT. At the moment selftests do not depend on this
option. There are just a few enum64 types in the kernel. Instead of
tying selftests to implementation details of unrelated sub-systems,
just remove the enum64 test case. Simple enums are covered and that should
be sufficient.
Fixes: 68cca81fd57f ("selftests/bpf: tests for __arg_untrusted void * global func params")
Reported-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20250708220856.3059578-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a negative test case for the following verifier error:
expected prog array map for tail call
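For illustration, a hedged sketch of one program shape that misuses
bpf_tail_call() with a non-prog-array map; whether it hits this exact
message or another rejection path depends on where the verifier catches
it:
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);   /* not BPF_MAP_TYPE_PROG_ARRAY */
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
} not_a_prog_array SEC(".maps");

SEC("socket")
int bad_tail_call(struct __sk_buff *skb)
{
        /* rejected: bpf_tail_call() requires a prog array map */
        bpf_tail_call(skb, &not_a_prog_array, 0);
        return 0;
}

char _license[] SEC("license") = "GPL";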
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/aGu0i1X_jII-3aFa@mail.gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add the following tests:
1. A test with an (unimportant) ldimm64 (16-byte insn) and a
Spectre-v4-induced nospec that clarifies and serves as a basic
Spectre v4 test.
2. Make sure a Spectre v4 nospec_result does not prevent a Spectre v1
nospec from being added before the dangerous instruction (tests that
[1] is fixed).
3. Combine the two, which is the combination that triggers the warning
in [2]. This is because the unanalyzed stack write has nospec_result
set, but the ldimm64 (which was just analyzed) had incremented
insn_idx by 2. That violates the assertion that nospec_result is only
used after insns that increment insn_idx by 1 (i.e., stack writes).
[1] https://lore.kernel.org/bpf/4266fd5de04092aa4971cbef14f1b4b96961f432.camel@gmail.com/
[2] https://lore.kernel.org/bpf/685b3c1b.050a0220.2303ee.0010.GAE@google.com/
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Link: https://lore.kernel.org/r/20250705190908.1756862-3-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Check usage of __arg_untrusted parameters of primitive type:
- passing of {trusted, untrusted, map value, scalar value, values with
variable offset} to untrusted `void *`, `char *` or enum is ok;
- verifier represents such parameters as rdonly_untrusted_mem(sz=0).
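A hedged sketch of the kind of global function and caller being
exercised; __arg_untrusted is assumed to come from the selftests'
bpf_misc.h annotations, and everything else here is illustrative:
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"           /* assumed to provide __arg_untrusted */

char g_byte;

/* Parameter is treated as rdonly_untrusted_mem(sz=0): read-only loads at
 * any offset are allowed, and callers may pass trusted pointers, map
 * values or plain scalars. */
__weak int read_untrusted_byte(char *p __arg_untrusted)
{
        if (!p)
                return 0;
        return *p;
}

SEC("socket")
int caller(struct __sk_buff *skb)
{
        return read_untrusted_byte(&g_byte);   /* map value pointer is accepted */
}

char _license[] SEC("license") = "GPL";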
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250704230354.1323244-9-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Check usage of __arg_untrusted parameters with PTR_TO_BTF_ID:
- combining __arg_untrusted with other tags is forbidden;
- non-kernel (program local) types for __arg_untrusted are forbidden;
- passing of {trusted, untrusted, map value, scalar value, values with
variable offset} to untrusted is ok;
- passing of PTR_TO_BTF_ID with a different type to untrusted is ok;
- passing of untrusted to trusted is forbidden.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250704230354.1323244-7-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Validate that reading a PTR_TO_BTF_ID field produces a value of type
PTR_TO_MEM|MEM_RDONLY|PTR_UNTRUSTED, if the field is a pointer to a
primitive type.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250704230354.1323244-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add selftests to stress test the various facets of the stream API and its
memory allocation pattern, and to ensure dumping support is tested and
functional.
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250703204818.925464-13-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add tests to verify the behavior of bpf_dynptr_memset():
* normal memset 0
* normal memset non-0
* memset with an offset
* memset in dynptr that was adjusted
* error: size overflow
* error: offset+size overflow
* error: readonly dynptr
* memset into non-linear xdp dynptr
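A hedged sketch of the simplest case above ("normal memset 0"); the kfunc
is assumed to take (dynptr, offset, size, value), and the prototype below
is written out by hand for illustration:
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* assumed kfunc prototype */
extern int bpf_dynptr_memset(struct bpf_dynptr *p, __u32 offset, __u32 size,
                             __u8 val) __weak __ksym;

char buf[16];

SEC("syscall")
int dynptr_memset_zero(void *ctx)
{
        struct bpf_dynptr d;

        bpf_dynptr_from_mem(buf, sizeof(buf), 0, &d);
        return bpf_dynptr_memset(&d, 0, sizeof(buf), 0);   /* expect 0 on success */
}

char _license[] SEC("license") = "GPL";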
Signed-off-by: Ihor Solodrai <isolodrai@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/bpf/20250702210309.3115903-3-isolodrai@meta.com
|
|
Add a test case checking that the verifier does not assume
rdonly_untrusted_mem values to be non-null.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250702073620.897517-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Add tests for different scenarios with bpf_cgroup_read_xattr:
1. Read cgroup xattr from bpf_cgroup_from_id;
2. Read cgroup xattr from bpf_cgroup_ancestor;
3. Read cgroup xattr from css_iter;
4. Use bpf_cgroup_read_xattr in LSM hook security_socket_connect.
5. Use bpf_cgroup_read_xattr in cgroup program.
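A hedged sketch combining scenarios 1 and 4 above; the kfunc is assumed
to take a cgroup pointer, an xattr name and a dynptr for the value, and
the xattr name used here is purely illustrative:
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

extern struct cgroup *bpf_cgroup_from_id(u64 cgid) __weak __ksym;
extern void bpf_cgroup_release(struct cgroup *cgrp) __weak __ksym;
/* assumed signature for the new kfunc */
extern int bpf_cgroup_read_xattr(struct cgroup *cgroup, const char *name,
                                 struct bpf_dynptr *value_p) __weak __ksym;

char value_buf[64];

SEC("lsm.s/socket_connect")
int BPF_PROG(read_own_cgroup_xattr, struct socket *sock,
             struct sockaddr *address, int addrlen)
{
        struct bpf_dynptr value;
        struct cgroup *cgrp;

        cgrp = bpf_cgroup_from_id(bpf_get_current_cgroup_id());
        if (!cgrp)
                return 0;
        bpf_dynptr_from_mem(value_buf, sizeof(value_buf), 0, &value);
        bpf_cgroup_read_xattr(cgrp, "user.bpf_demo", &value);
        bpf_cgroup_release(cgrp);
        return 0;
}

char _license[] SEC("license") = "GPL";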
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/20250623063854.1896364-5-song@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
There are spelling mistakes in the description text. Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250630125528.563077-1-colin.i.king@gmail.com
|
|
Tests with aligned and misaligned memory accesses of different sizes via a
pointer returned by bpf_rdonly_cast().
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250627015539.1439656-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
cgroup_xattr/read_cgroupfs_xattr has two issues:
1. cgroup_xattr/read_cgroupfs_xattr messes up lo without creating a netns
first. This causes issues with other tests.
Fix this by using a different hook (lsm.s/file_open) and not messing
with lo.
2. cgroup_xattr/read_cgroupfs_xattr sets up cgroups without proper
mount namespaces.
Fix this by using the existing cgroup helpers. A new helper
set_cgroup_xattr() is added to set xattr on cgroup files.
Fixes: f4fba2d6d282 ("selftests/bpf: Add tests for bpf_cgroup_read_xattr")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Closes: https://lore.kernel.org/bpf/CAADnVQ+iqMi2HEj_iH7hsx+XJAsqaMWqSDe4tzcGAnehFWA9Sw@mail.gmail.com/
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20250627191221.765921-1-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Merge branch 'vfs-6.17.bpf' of https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Merge branch 'vfs-6.17.bpf' from vfs tree into bpf-next/master
and resolve conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2025-06-27
We've added 6 non-merge commits during the last 8 day(s) which contain
a total of 6 files changed, 120 insertions(+), 20 deletions(-).
The main changes are:
1) Fix RCU usage in task_cls_state() for BPF programs using helpers like
bpf_get_cgroup_classid_curr() outside of networking, from Charalampos
Mitrodimas.
2) Fix a sockmap race between map_update and a pending workqueue from
an earlier map_delete freeing the old psock where both pointed to the
same psock->sk, from Jiayuan Chen.
3) Fix a data corruption issue when using bpf_msg_pop_data() in kTLS which
failed to recalculate the ciphertext length, also from Jiayuan Chen.
4) Remove xdp_redirect_map{,_err} trace events since they are unused and
also hide XDP trace events under CONFIG_BPF_SYSCALL, from Steven Rostedt.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
xdp: tracing: Hide some xdp events under CONFIG_BPF_SYSCALL
xdp: Remove unused events xdp_redirect_map and xdp_redirect_map_err
net, bpf: Fix RCU usage in task_cls_state() for BPF programs
selftests/bpf: Add test to cover ktls with bpf_msg_pop_data
bpf, ktls: Fix data corruption when using bpf_msg_pop_data() in ktls
bpf, sockmap: Fix psock incorrectly pointing to sk
====================
Link: https://patch.msgid.link/20250626230111.24772-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Modify existing veristat tests to verify that array presets are applied
as expected.
Introduce a few negative tests as well to check that common error modes
are handled.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250625165904.87820-4-mykyta.yatsenko5@gmail.com
|
|
Cross-merge BPF, perf and other fixes after downstream PRs.
It restores BPF CI to green after the critical fix in
commit bc4394e5e79c ("perf: Fix the throttle error of some clock events").
No conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add both positive and negative test cases using string kfuncs added in
the previous patches.
Positive tests check that the functions work as expected.
Negative tests pass various incorrect strings to the kfuncs and check
for the expected error codes:
-E2BIG when passing too long strings
-EFAULT when trying to read inaccessible kernel memory
-ERANGE when passing userspace pointers on arches with non-overlapping
address spaces
A majority of the tests use the RUN_TESTS helper which executes BPF
programs with BPF_PROG_TEST_RUN and checks for the expected return value.
An exception to this is the tests for long strings, as we need to memset the
long string from userspace (at least I haven't found an ergonomic way to
memset it from a BPF program), which cannot be done using the RUN_TESTS
infrastructure.
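As a hedged sketch of a positive case (the kfunc name and strcmp-like
signature are assumptions based on this series' description, not taken
verbatim from the patch):
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* assumed kfunc prototype */
extern int bpf_strcmp(const char *s1, const char *s2) __weak __ksym;

char known[] = "hello";

SEC("syscall")
int strcmp_positive(void *ctx)
{
        return bpf_strcmp(known, "hello");   /* expected to return 0 */
}

char _license[] SEC("license") = "GPL";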
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Link: https://lore.kernel.org/r/090451a2e60c9ae1dceb4d1bfafa3479db5c7481.1750917800.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Allow macro expansion for values passed to the `__retval` and
`__retval_unpriv` attributes. This is especially useful for testing
programs which return various error codes.
With this change, the code for parsing special literals can be made
simpler, as the literals are defined via macros. The only exception is
INT_MIN which expands to (-INT_MAX -1), which is not single number and
cannot be parsed by strtol. So, we instead use a prefixed literal
_INT_MIN in __retval and handle it separately (assign the expected
return to INT_MIN). Also, strtol cannot handle the "ll" suffix so change
the value of POINTER_VALUE from 0xcafe4all to 0xbadcafe.
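A hedged illustration of what this enables in a test; __success/__retval
come from the selftests' bpf_misc.h, and the program body is made up:
#include <errno.h>
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"

SEC("syscall")
__success __retval(-E2BIG)   /* the macro now expands, no special literal needed */
int expects_e2big(void *ctx)
{
        return -E2BIG;
}

char _license[] SEC("license") = "GPL";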
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Link: https://lore.kernel.org/r/a6c6b551ae0575351faa7b7a1df52f9341a5cbe8.1750917800.git.vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The following cases are tested:
- it is ok to load memory at any offset from rdonly_untrusted_mem;
- rdonly_untrusted_mem offset/bounds are not tracked;
- writes into rdonly_untrusted_mem are forbidden;
- atomic operations on rdonly_untrusted_mem are forbidden;
- rdonly_untrusted_mem can't be passed as a memory argument of a
helper or kfunc;
- it is ok to use PTR_TO_MEM and PTR_TO_BTF_ID in the same load
instruction.
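A hedged sketch of the first three points (program shape and offsets are
illustrative):
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

extern void *bpf_rdonly_cast(const void *obj, u32 btf_id) __weak __ksym;

u8 sink;

SEC("tp_btf/sys_enter")
int rdonly_untrusted_load(u64 *ctx)
{
        struct task_struct *task = bpf_get_current_task_btf();
        /* btf_id == 0: the result is rdonly_untrusted_mem */
        char *p = bpf_rdonly_cast(task, 0);

        sink = p[128];        /* ok: load at an arbitrary, untracked offset */
        /* p[128] = 0;           would be rejected: writes are forbidden */
        return 0;
}

char _license[] SEC("license") = "GPL";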
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250625182414.30659-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
BPF_NEG now has range tracking logic. Add selftests for BPF_NEG.
Specifically, the return value of LSM hook lsm.s/socket_connect is used to
show that the verifier tracks that BPF_NEG(1) falls in the [-4095, 0] range,
while BPF_NEG(100000) does not fall in that range.
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250625164025.3310203-3-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add range tracking for instruction BPF_NEG. Without this logic, a trivial
program like the following will fail
volatile bool found_value_b;
SEC("lsm.s/socket_connect")
int BPF_PROG(test_socket_connect)
{
if (!found_value_b)
return -1;
return 0;
}
with verifier log:
"At program exit the register R0 has smin=0 smax=4294967295 should have
been in [-4095, 0]".
This is because range information is lost in BPF_NEG:
0: R1=ctx() R10=fp0
; if (!found_value_b) @ xxxx.c:24
0: (18) r1 = 0xffa00000011e7048 ; R1_w=map_value(...)
2: (71) r0 = *(u8 *)(r1 +0) ; R0_w=scalar(smin32=0,smax=255)
3: (a4) w0 ^= 1 ; R0_w=scalar(smin32=0,smax=255)
4: (84) w0 = -w0 ; R0_w=scalar(range info lost)
Note that the log above is manually modified to highlight relevant bits.
Fix this by maintaining proper range information with BPF_NEG, so that
the verifier will know:
4: (84) w0 = -w0 ; R0_w=scalar(smin32=-255,smax=0)
Also updated selftests based on the expected behavior.
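A minimal sketch of the idea, not the exact kernel implementation (the
real patch also has to keep the unsigned, 32-bit and tnum state
consistent):
static void scalar_min_max_neg_sketch(struct bpf_reg_state *dst_reg)
{
        s64 smin = dst_reg->smin_value;
        s64 smax = dst_reg->smax_value;

        if (smin == S64_MIN) {
                /* -S64_MIN overflows; fall back to unknown signed bounds */
                dst_reg->smin_value = S64_MIN;
                dst_reg->smax_value = S64_MAX;
                return;
        }
        /* negating [smin, smax] yields [-smax, -smin] */
        dst_reg->smin_value = -smax;
        dst_reg->smax_value = -smin;
}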
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250625164025.3310203-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The `name` field in `obj->externs` points into the BTF data at initial
open time. However, some functions may invalidate this after opening and
before loading (e.g. `bpf_map__set_value_size`), which results in
pointers into freed memory and undefined behavior.
The simplest solution is to simply `strdup` these strings, similar to
the `essent_name`, and free them at the same time.
In order to test this path, the `global_map_resize` BPF selftest is
modified slightly to ensure the presence of an extern, which causes this
test to fail prior to the fix. Given there isn't an obvious API or error
to test against, I opted to add this to the existing test as an aspect
of the resizing feature rather than duplicate the test.
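A hedged sketch of the user-visible sequence that used to leave extern
names dangling (object path and map name are illustrative):
#include <errno.h>
#include <bpf/libbpf.h>

static int resize_then_load(const char *path)
{
        struct bpf_object *obj;
        struct bpf_map *data_map;
        int err;

        obj = bpf_object__open_file(path, NULL);
        if (!obj)
                return -errno;

        data_map = bpf_object__find_map_by_name(obj, ".data.array");
        if (!data_map) {
                bpf_object__close(obj);
                return -ENOENT;
        }

        /* may reallocate BTF data; extern names must not keep pointing into it */
        err = bpf_map__set_value_size(data_map, 8192);
        if (!err)
                err = bpf_object__load(obj);

        bpf_object__close(obj);
        return err;
}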
Fixes: 9d0a23313b1a ("libbpf: Add capability for resizing datasec maps")
Signed-off-by: Adin Scannell <amscanne@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250625050215.2777374-1-amscanne@meta.com
|
|
The previous commit improves the precision in scalar(32)_min_max_add,
and scalar(32)_min_max_sub. The improvement in precision occurs in cases
when all outcomes overflow or underflow, respectively.
This commit adds selftests that exercise those cases.
This commit also adds selftests for cases where the output register
state bounds for u(32)_min/u(32)_max are conservatively set to unbounded
(when there is partial overflow or underflow).
Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu>
Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu>
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250623040359.343235-3-harishankar.vishwanathan@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add tests for different scenarios with bpf_cgroup_read_xattr:
1. Read cgroup xattr from bpf_cgroup_from_id;
2. Read cgroup xattr from bpf_cgroup_ancestor;
3. Read cgroup xattr from css_iter;
4. Use bpf_cgroup_read_xattr in LSM hook security_socket_connect.
5. Use bpf_cgroup_read_xattr in cgroup program.
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/20250623063854.1896364-5-song@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add selftest cases that validate bpftool's expected behavior when
accessing maps protected from modification via security_bpf_map.
The test includes a BPF program attached to security_bpf_map with two maps:
- A protected map that only allows read-only access
- An unprotected map that allows full access
The test script attaches the BPF program to security_bpf_map and
verifies that for the bpftool map command:
- Read access works on both maps
- Write access fails on the protected map
- Write access succeeds on the unprotected map
- These behaviors remain consistent when the maps are pinned
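A hedged sketch of the shape of such an LSM program; the real test only
denies writes to one specific map, and the FMODE_WRITE value (0x2) is
spelled out because the macro is not visible through BTF:
#include "vmlinux.h"
#include <errno.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("lsm/bpf_map")
int BPF_PROG(deny_map_writes, struct bpf_map *map, fmode_t fmode)
{
        /* read-only access (e.g. bpftool map dump) is still allowed */
        if (fmode & 0x2 /* FMODE_WRITE, assumed value */)
                return -EPERM;
        return 0;
}

char _license[] SEC("license") = "GPL";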
Signed-off-by: Slava Imameev <slava.imameev@crowdstrike.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20250620151812.13952-2-slava.imameev@crowdstrike.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The underlying lookup_user_key() function uses a signed 32 bit integer
for key serial numbers because legitimate serial numbers are positive
(and > 3) and keyrings are negative. Using a u32 for the keyring in
the bpf function doesn't currently cause any conversion problems but
will start to trip the signed to unsigned conversion warnings when the
kernel enables them, so convert the argument to signed (and update the
tests accordingly) before it acquires more users.
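A hedged sketch of why the signed serial matters to callers (kfunc
prototypes written out here for illustration):
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

extern struct bpf_key *bpf_lookup_user_key(s32 serial, u64 flags) __weak __ksym;
extern void bpf_key_put(struct bpf_key *key) __weak __ksym;

SEC("lsm.s/bprm_check_security")
int BPF_PROG(lookup_session_keyring, struct linux_binprm *bprm)
{
        /* keyring IDs are negative, e.g. KEY_SPEC_SESSION_KEYRING == -3;
         * an s32 serial passes them without implicit unsigned conversion */
        struct bpf_key *key = bpf_lookup_user_key(-3, 0);

        if (key)
                bpf_key_put(key);
        return 0;
}

char _license[] SEC("license") = "GPL";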
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/84cdb0775254d297d75e21f577089f64abdfbd28.camel@HansenPartnership.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
LSM hooks such as security_path_mknod() and security_inode_rename() have
access to a newly allocated negative dentry, which has a NULL d_inode.
Therefore, it is necessary to do a NULL pointer check on d_inode.
Also add selftests that check that the verifier enforces the NULL pointer
check.
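A hedged sketch of the pattern the verifier now enforces (hook arguments
follow security_path_mknod; names are illustrative):
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

u64 last_ino;

SEC("lsm.s/path_mknod")
int BPF_PROG(check_new_inode, const struct path *dir, struct dentry *dentry,
             umode_t mode, unsigned int dev)
{
        struct inode *inode = dentry->d_inode;

        /* the dentry is newly allocated and negative, so d_inode may be NULL;
         * dereferencing it without this check is now rejected */
        if (!inode)
                return 0;

        last_ino = inode->i_ino;
        return 0;
}

char _license[] SEC("license") = "GPL";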
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20250613052857.1992233-1-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A test case to check if both branches of jset are explored when
computing program CFG.
At 'if r1 & 0x7 ...':
- register 'r2' is computed alive only if jump branch of jset
instruction is followed;
- register 'r0' is computed alive only if fallthrough branch of jset
instruction is followed.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250613175331.3238739-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When running BPF selftests on arm64 with a 64K page size, I encountered
the following two test failures:
sockmap_basic/sockmap skb_verdict change tail:FAIL
tc_change_tail:FAIL
With further debugging, I identified the root cause in the following
kernel code within __bpf_skb_change_tail():
u32 max_len = BPF_SKB_MAX_LEN;
u32 min_len = __bpf_skb_min_len(skb);
int ret;
if (unlikely(flags || new_len > max_len || new_len < min_len))
return -EINVAL;
With a 4K page size, new_len = 65535 and max_len = 16064, so the function
returns -EINVAL. However, with a 64K page size, max_len increases to
261824, allowing execution to proceed further in the function. This is
because BPF_SKB_MAX_LEN scales with the page size and larger page sizes
result in higher max_len values.
Updating the new_len parameter in both tests based on the actual kernel
page size resolved both failures.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250612035037.2207911-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The bpf selftest xdp_adjust_tail/xdp_adjust_frags_tail_grow failed on
arm64 with 64KB page:
xdp_adjust_tail/xdp_adjust_frags_tail_grow:FAIL
In bpf_prog_test_run_xdp(), the xdp->frame_sz is set to 4K, but later on
when constructing frags, with 64K page size, the frag data_len could
be more than 4K. This will cause problems in bpf_xdp_frags_increase_tail().
To fix the failure, the xdp->frame_sz is set to PAGE_SIZE so the kernel
can handle different page sizes properly. With the kernel change, the user
space and bpf prog need adjustment. Currently, the MAX_SKB_FRAGS default
value is 17, so for 4K page, the maximum packet size will be less than 68K.
To test 64K page, a bigger maximum packet size than 68K is desired. So two
different functions are implemented for subtest xdp_adjust_frags_tail_grow.
Depending on the page size, different data input/output sizes are used.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250612035032.2207498-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test case absent_mark_in_the_middle_state is equivalent of the
following C program:
1: r8 = bpf_get_prandom_u32();
2: r6 = -32;
3: bpf_iter_num_new(&fp[-8], 0, 10);
4: if (unlikely(bpf_get_prandom_u32()))
5: r6 = -31;
6: for (;;) {
7: if (!bpf_iter_num_next(&fp[-8]))
8: break;
9: if (unlikely(bpf_get_prandom_u32()))
10: *(u64 *)(fp + r6) = 7;
11: }
12: bpf_iter_num_destroy(&fp[-8]);
13: return 0;
W/o a fix that instructs verifier to ignore branches count for loop
entries verification proceeds as follows:
- 1-4, state is {r6=-32,fp-8=active};
- 6, checkpoint A is created with {r6=-32,fp-8=active};
- 7, checkpoint B is created with {r6=-32,fp-8=active},
push state {r6=-32,fp-8=active} from 7 to 9;
- 8,12,13, {r6=-32,fp-8=drained}, exit;
- pop state with {r6=-32,fp-8=active} from 7 to 9;
- 9, push state {r6=-32,fp-8=active} from 9 to 10;
- 6, checkpoint C is created with {r6=-32,fp-8=active};
- 7, checkpoint A is hit, no precision propagated for r6 to C;
- pop state {r6=-32,fp-8=active} from 9 to 10;
- 10, state is {r6=-31,fp-8=active}, r6 is marked as read and precise,
these marks are propagated to checkpoints A and B (but not C, as
it is not the parent of the current state);
- 6, {r6=-31,fp-8=active} checkpoint C is hit, because r6 is not
marked precise for this checkpoint;
- the program is accepted, despite a possibility of unaligned u64
stack access at offset -31.
The test case absent_mark_in_the_middle_state2 is similar except the
following change:
r8 = bpf_get_prandom_u32();
r6 = -32;
bpf_iter_num_new(&fp[-8], 0, 10);
if (unlikely(bpf_get_prandom_u32())) {
r6 = -31;
+ jump_into_loop:
+ goto +0;
+ goto loop;
+ }
+ if (unlikely(bpf_get_prandom_u32()))
+ goto jump_into_loop;
+ loop:
for (;;) {
if (!bpf_iter_num_next(&fp[-8]))
break;
if (unlikely(bpf_get_prandom_u32()))
*(u64 *)(fp + r6) = 7;
}
bpf_iter_num_destroy(&fp[-8])
return 0
The goal is to check that read/precision marks are propagated to the
checkpoint created at 'goto +0' that resides outside of the loop.
The test case absent_mark_in_the_middle_state3 is a bit different and
is equivalent to the C program below:
int absent_mark_in_the_middle_state3(void)
{
bpf_iter_num_new(&fp[-8], 0, 10);
loop1(-32, &fp[-8]);
loop1_wrapper(&fp[-8]);
bpf_iter_num_destroy(&fp[-8]);
}
int loop1(num, iter)
{
while (bpf_iter_num_next(iter)) {
if (unlikely(bpf_get_prandom_u32()))
*(fp + num) = 7;
}
return 0;
}
int loop1_wrapper(iter)
{
r6 = -32;
if (unlikely(bpf_get_prandom_u32()))
r6 = -31;
loop1(r6, iter);
return 0;
}
The unsafe state is reached in a similar manner, but the loop is
located inside a subprogram that is called from two locations in the
main subprogram. This detail is important for exercising
bpf_scc_visit->backedges memory management.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250611200836.4135542-11-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|