Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes, all in drivers. They're all one or two lines except
for the ufs one, but that's a simple revert of a previous feature"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param()
scsi: qla2xxx: Fix memory leak in qla2x00_probe_one()
scsi: mpi3mr: Handle soft reset in progress fault code (0xF002)
scsi: Revert "scsi: ufs: core: Initialize devfreq synchronously"
|
|
Pull block fixes from Jens Axboe:
- Ensure that ublk always reads the whole sqe upfront (me)
- Fix for a block size probing issue with ublk (Ming)
- Fix for the bio based polling (Keith)
- NVMe pull request via Christoph:
- fix discard support without oncs (Keith Busch)
- Partition scan error handling regression fix (Yu)
* tag 'block-6.3-2023-04-06' of git://git.kernel.dk/linux:
block: don't set GD_NEED_PART_SCAN if scan partition failed
block: ublk: make sure that block size is set correctly
ublk: read any SQE values upfront
nvme: fix discard support without oncs
blk-mq: directly poll requests
|
|
Pull io_uring fixes from Jens Axboe:
"Just two minor fixes for provided buffers - one where we could
potentially leak a buffer, and one where the returned values was
off-by-one in some cases"
* tag 'io_uring-6.3-2023-04-06' of git://git.kernel.dk/linux:
io_uring: fix memory leak when removing provided buffers
io_uring: fix return value when removing provided buffers
|
|
git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
- fix a braino in the swiotlb alignment check fix (Petr Tesarik)
* tag 'dma-mapping-6.3-2023-04-08' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: fix a braino in the alignment check fix
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
"A couple more minor fixes:
- Reset direct->addr back to its original value on error in updating
the direct trampoline code
- Make lastcmd_mutex static"
* tag 'trace-v6.3-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/synthetic: Make lastcmd_mutex static
ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"28 hotfixes.
23 are cc:stable and the other five address issues which were
introduced during this merge cycle.
20 are for MM and the remainder are for other subsystems"
* tag 'mm-hotfixes-stable-2023-04-07-16-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits)
maple_tree: fix a potential concurrency bug in RCU mode
maple_tree: fix get wrong data_end in mtree_lookup_walk()
mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
nilfs2: fix sysfs interface lifetime
mm: take a page reference when removing device exclusive entries
mm: vmalloc: avoid warn_alloc noise caused by fatal signal
nilfs2: initialize "struct nilfs_binfo_dat"->bi_pad field
nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()
zsmalloc: document freeable stats
zsmalloc: document new fullness grouping
fsdax: force clear dirty mark if CoW
mm/hugetlb: fix uffd wr-protection for CoW optimization path
mm: enable maple tree RCU mode by default
maple_tree: add RCU lock checking to rcu callback functions
maple_tree: add smp_rmb() to dead node detection
maple_tree: fix write memory barrier of nodes once dead for RCU mode
maple_tree: remove extra smp_wmb() from mas_dead_leaves()
maple_tree: fix freeing of nodes in rcu mode
maple_tree: detect dead nodes in mas_start()
maple_tree: be more cautious about dead nodes
...
|
|
When memory is a little tight on my system, it's pretty easy to see
warnings that look like this.
ksoftirqd/0: page allocation failure: order:3, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
...
Call trace:
dump_backtrace+0x0/0x1e8
show_stack+0x20/0x2c
dump_stack_lvl+0x60/0x78
dump_stack+0x18/0x38
warn_alloc+0x104/0x174
__alloc_pages+0x588/0x67c
alloc_rx_agg+0xa0/0x190 [r8152 ...]
r8152_poll+0x270/0x760 [r8152 ...]
__napi_poll+0x44/0x1ec
net_rx_action+0x100/0x300
__do_softirq+0xec/0x38c
run_ksoftirqd+0x38/0xec
smpboot_thread_fn+0xb8/0x248
kthread+0x134/0x154
ret_from_fork+0x10/0x20
On a fragmented system it's normal that order 3 allocations will
sometimes fail, especially atomic ones. The driver handles these
failures fine and the WARN just creates spam in the logs for this
case. The __GFP_NOWARN flag is exactly for this situation, so add it
to the allocation.
NOTE: my testing is on a 5.15 system, but there should be no reason
that this would be fundamentally different on a mainline kernel.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230406171411.1.I84dbef45786af440fd269b71e9436a96a8e7a152@changeid
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 217f69743681 ("net: busy-poll: allow preemption
in sk_busy_loop()"), a thread willing to use busy polling
is not hurting other threads anymore in a non preempt kernel.
I think it is safe to remove CAP_NET_ADMIN check.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230406194634.1804691-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simon Horman says:
====================
net: stmmac: dwmac-anarion: address issues flagged by sparse
Two minor enhancements to dwmac-anarion to address issues flagged by
sparse.
1. Always return struct anarion_gmac * from anarion_config_dt()
2. Add __iomem annotation to register base
No functional change intended.
Compile tested only.
====================
Link: https://lore.kernel.org/r/20230406-dwmac-anarion-sparse-v1-0-b0c866c8be9d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
anarion_config_dt()
Always return struct anarion_gmac * from anarion_config_dt().
In the case where ctl_block was an error pointer it was being
returned directly. Which sparse flags as follows:
.../dwmac-anarion.c:73:24: warning: incorrect type in return expression (different address spaces)
.../dwmac-anarion.c:73:24: expected struct anarion_gmac *
.../dwmac-anarion.c:73:24: got void [noderef] __iomem *[assigned] ctl_block
Avoid this by converting the error pointer to an error.
And then reversing the conversion.
As a side effect, the error can be used for logging purposes,
subjectively, leading to a minor cleanup.
No functional change intended.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use __iomem annotation the register base: the ctl_block field of struct
anarion_gmac. I believe this is the normal practice for such variables.
By doing so some casting is avoided.
And sparse no longer reports:
.../dwmac-anarion.c:29:23: warning: incorrect type in argument 1 (different address spaces)
.../dwmac-anarion.c:29:23: expected void const volatile [noderef] __iomem *addr
.../dwmac-anarion.c:29:23: got void *
.../dwmac-anarion.c:34:22: warning: incorrect type in argument 2 (different address spaces)
.../dwmac-anarion.c:34:22: expected void volatile [noderef] __iomem *addr
.../dwmac-anarion.c:34:22: got void *
No functional change intended.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This was never used since the commit that added it - e58bb43f5e43
("stmmac: initial support to manage pcs modes").
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230406125412.48790-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
Improve IPsec limits, ESN and replay window
This series overcomes existing hardware limitations in Mellanox ConnectX
devices around handling IPsec soft and hard limits.
In addition, the ESN logic is tied and added an interface to configure
replay window sequence numbers through existing iproute2 interface.
ip xfrm state ... [ replay-seq SEQ ] [ replay-oseq SEQ ]
[ replay-seq-hi SEQ ] [ replay-oseq-hi SEQ ]
Link: https://lore.kernel.org/all/cover.1680162300.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
* tag 'ipsec-esn-replay' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5e: Simulate missing IPsec TX limits hardware functionality
net/mlx5e: Generalize IPsec work structs
net/mlx5e: Reduce contention in IPsec workqueue
net/mlx5e: Set IPsec replay sequence numbers
net/mlx5e: Remove ESN callbacks if it is not supported
xfrm: don't require advance ESN callback for packet offload
net/mlx5e: Overcome slow response for first IPsec ASO WQE
net/mlx5e: Add SW implementation to support IPsec 64 bit soft and hard limits
net/mlx5e: Prevent zero IPsec soft/hard limits
net/mlx5e: Factor out IPsec ASO update function
====================
Link: https://lore.kernel.org/r/20230406071902.712388-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Any multiplication between GENMASK(31, 0) and a number bigger than 1
will be truncated because of the overflow, if the size of unsigned long
is 32 bits.
Replaced GENMASK with GENMASK_ULL to make sure that multiplication will
be between 64 bits values.
Cc: <stable@vger.kernel.org> # 5.15+
Fixes: 514def5dd339 ("phy: nxp-c45-tja11xx: add timestamping support")
Signed-off-by: Radu Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230406095953.75622-1-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Siddharth Vadapalli says:
====================
Add support for J784S4 CPSW9G
This series adds a new compatible to am65-cpsw driver for the CPSW9G
instance of the CPSW Ethernet Switch on TI's J784S4 SoC which has 8
external ports and 1 internal host port.
The CPSW9G instance supports QSGMII and USXGMII modes for which driver
support is added.
Additionally, the interface mode specific configurations are moved to the
am65_cpsw_nuss_mac_config() callback. Also, a TODO comment is added for
verifying whether in-band mode is necessary for 10 Mbps RGMII mode.
NOTE:
I have verified that the mac_config() operations are preserved across
link up and link down events for SGMII and USXGMII mode with the new
implementation in this series, as suggested by:
Russell King <linux@armlinux.org.uk>
For patches 1 and 3 of this series, I believe that the following tag:
Suggested-by: Russell King <linux@armlinux.org.uk>
should be added. However, I did not add it since I did not yet get
the permission to do so. I will be happy if the tags are added, since
the new implementation is almost entirely based on Russell's suggestion,
with minor changes made by me.
v2:
https://lore.kernel.org/r/20230403110106.983994-1-s-vadapalli@ti.com/
v1:
https://lore.kernel.org/r/20230331065110.604516-1-s-vadapalli@ti.com/
====================
Link: https://lore.kernel.org/r/20230404061459.1100519-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TI's J784S4 SoC supports USXGMII mode. Add USXGMII mode to the
extra_modes member of the J784S4 SoC data.
Configure MAC control register for supporting USXGMII mode and add
MAC_5000FD in the "mac_capabilities" member of struct "phylink_config".
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TI's J784S4 SoC supports QSGMII mode with the CPSW9G instance of the
CPSW Ethernet Switch. Add a new compatible for J784S4 SoC and enable
QSGMII support for it by adding QSGMII mode to the extra_modes member of
the "j784s4_cpswxg_pdata" SoC data.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the interface mode specific configuration to the mac_config()
callback am65_cpsw_nuss_mac_config().
Also, do not reset the MAC Control register on mac_link_down(). Only
clear those bits that can possibly be set in mac_link_up().
Let the MAC remain in IDLE state after mac_link_down(). Bring it out of
the IDLE state on mac_link_up().
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server
when following the actions below the host has a chance of getting a cpu
stuck in a infinite loop:
1. send a large amount of parallel requests to the http server (around
3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
stop the server, just kill the namespace)
there is a low chance that this will cause the below kernel cpu stuck
message. If this does not happen just retry.
Below there is also the output of bpftrace for the functions mentioned
in the output.
The series of events happening here is:
1. the network namespace is deleted calling
`unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
`ovs_netdev_detach_dev` (if a vport is found, which is the case for
the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
packages to be sent to the device
6. `dp_device_event` then queues the vport deletion to work in
background as a ovs_lock is needed that we do not hold in the
unregistration path
7. `unregister_netdevice_many_notify` continues to call
`netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport
If after 7. but before 9. a packet is send to the ovs vport (which is
not deleted at this point in time) which forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.
To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).
Additionally we now produce a warning in `skb_tx_hash` if we will hit
the infinite loop.
bpftrace (first word is function name):
__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604
stuck message:
watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS: 00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
<IRQ>
netdev_core_pick_tx+0xa4/0xb0
__dev_queue_xmit+0xf8/0x510
? __bpf_prog_exit+0x1e/0x30
dev_queue_xmit+0x10/0x20
ovs_vport_send+0xad/0x170 [openvswitch]
do_output+0x59/0x180 [openvswitch]
do_execute_actions+0xa80/0xaa0 [openvswitch]
? kfree+0x1/0x250
? kfree+0x1/0x250
? kprobe_perf_func+0x4f/0x2b0
? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
ovs_execute_actions+0x4c/0x120 [openvswitch]
ovs_dp_process_packet+0xa1/0x200 [openvswitch]
? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
ovs_vport_receive+0x77/0xd0 [openvswitch]
? __htab_map_lookup_elem+0x4e/0x60
? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
? trace_call_bpf+0xc8/0x150
? kfree+0x1/0x250
? kfree+0x1/0x250
? kprobe_perf_func+0x4f/0x2b0
? kprobe_perf_func+0x4f/0x2b0
? __mod_memcg_lruvec_state+0x63/0xe0
netdev_port_receive+0xc4/0x180 [openvswitch]
? netdev_port_receive+0x180/0x180 [openvswitch]
netdev_frame_hook+0x1f/0x40 [openvswitch]
__netif_receive_skb_core.constprop.0+0x23d/0xf00
__netif_receive_skb_one_core+0x3f/0xa0
__netif_receive_skb+0x15/0x60
process_backlog+0x9e/0x170
__napi_poll+0x33/0x180
net_rx_action+0x126/0x280
? ttwu_do_activate+0x72/0xf0
__do_softirq+0xd9/0x2e7
? rcu_report_exp_cpu_mult+0x1b0/0x1b0
do_softirq+0x7d/0xb0
</IRQ>
<TASK>
__local_bh_enable_ip+0x54/0x60
ip_finish_output2+0x191/0x460
__ip_finish_output+0xb7/0x180
ip_finish_output+0x2e/0xc0
ip_output+0x78/0x100
? __ip_finish_output+0x180/0x180
ip_local_out+0x5e/0x70
__ip_queue_xmit+0x184/0x440
? tcp_syn_options+0x1f9/0x300
ip_queue_xmit+0x15/0x20
__tcp_transmit_skb+0x910/0x9c0
? __mod_memcg_state+0x44/0xa0
tcp_connect+0x437/0x4e0
? ktime_get_with_offset+0x60/0xf0
tcp_v4_connect+0x436/0x530
__inet_stream_connect+0xd4/0x3a0
? kprobe_perf_func+0x4f/0x2b0
? aa_sk_perm+0x43/0x1c0
inet_stream_connect+0x3b/0x60
__sys_connect_file+0x63/0x70
__sys_connect+0xa6/0xd0
? setfl+0x108/0x170
? do_fcntl+0xe8/0x5a0
__x64_sys_connect+0x18/0x20
do_syscall_64+0x5c/0xc0
? __x64_sys_fcntl+0xa9/0xd0
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x69/0xc0
? __sys_setsockopt+0xea/0x1e0
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? __x64_sys_setsockopt+0x1f/0x30
? do_syscall_64+0x69/0xc0
? irqentry_exit+0x1d/0x30
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
</TASK>
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <luca.czesla@mail.schwarz>
Signed-off-by: Luca Czesla <luca.czesla@mail.schwarz>
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-04-08
We've added 4 non-merge commits during the last 11 day(s) which contain
a total of 5 files changed, 39 insertions(+), 6 deletions(-).
The main changes are:
1) Fix BPF TCP socket iterator to use correct helper for dropping
socket's refcount, that is, sock_gen_put instead of sock_put,
from Martin KaFai Lau.
2) Fix a BTI exception splat in BPF trampoline-generated code on arm64,
from Xu Kuohai.
3) Fix a LongArch JIT error from missing BPF_NOSPEC no-op, from George Guo.
4) Fix dynamic XDP feature detection of veth in xdp_redirect selftest,
from Lorenzo Bianconi.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: fix xdp_redirect xdp-features selftest for veth driver
bpf, arm64: Fixed a BTI error on returning to patched function
LoongArch, bpf: Fix jit to skip speculation barrier opcode
bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
====================
Link: https://lore.kernel.org/r/20230407224642.30906-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following example forces veristat to loop indefinitely:
$ cat two-ok
file_name,prog_name,verdict,total_states
file-a,a,success,12
file-b,b,success,67
$ cat add-failure
file_name,prog_name,verdict,total_states
file-a,a,success,12
file-b,b,success,67
file-b,c,failure,32
$ veristat -C two-ok add-failure
<does not return>
The loop is caused by handle_comparison_mode() not checking if `base`
variable points to `fallback_stats` prior advancing joined results
using `base`.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230407154125.896927-1-eddyz87@gmail.com
|
|
After commit d6e6286a12e7 ("libbpf: disassociate section handler on explicit
bpf_program__set_type() call"), bpf_program__set_type() will force cleanup
the program's SEC() definition, this commit fixed the test helper but missed
the bpftool, which leads to bpftool prog autoattach broken as follows:
$ bpftool prog load spi-xfer-r1v1.o /sys/fs/bpf/test autoattach
Program spi_xfer_r1v1 does not support autoattach, falling back to pinning
This patch fix bpftool to set program type only if it differs.
Fixes: d6e6286a12e7 ("libbpf: disassociate section handler on explicit bpf_program__set_type() call")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230407081427.2621590-1-weiyongjun@huaweicloud.com
|
|
perf_event with type=PERF_TYPE_RAW and config=0x1b00 turned out to be not
reliable in ensuring LBR is active. Thus, test_progs:get_branch_snapshot is
not reliable in some systems. Replace it with PERF_COUNT_HW_CPU_CYCLES
event, which gives more consistent results.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230407190130.2093736-1-song@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix irq handling in gpio-davinci
- fix Kconfig dependencies for gpio-regmap
* tag 'gpio-fixes-for-v6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: davinci: Add irq chip flag to skip set wake
gpio: davinci: Do not clear the bank intr enable bit in save_context
gpio: GPIO_REGMAP: select REGMAP instead of depending on it
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Fix the ACPI backlight override mechanism for the cases when
acpi_backlight=video is set through the kernel command line or a DMI
quirk and add backlight quirks for Apple iMac14,1 and iMac14,2 and
Lenovo ThinkPad W530 (Hans de Goede)"
* tag 'acpi-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: video: Add acpi_backlight=video quirk for Lenovo ThinkPad W530
ACPI: video: Add acpi_backlight=video quirk for Apple iMac14,1 and iMac14,2
ACPI: video: Make acpi_backlight=video work independent from GPU driver
ACPI: video: Add auto_detect arg to __acpi_video_get_backlight_type()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix uninitialised variable warning (from smatch) in the arm64 compat
alignment fixup code"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: compat: Work around uninitialized variable warning
|
|
Pull ksmbd server fixes from Steve French:
"Four fixes, three for stable:
- slab out of bounds fix
- lock cancellation fix
- minor cleanup to address clang warning
- fix for xfstest 551 (wrong parms passed to kvmalloc)"
* tag '6.3-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr
ksmbd: delete asynchronous work from list
ksmbd: remove unused is_char_allowed function
ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN
|
|
The VLAN filters info is currently being held in a list and 2 bitmaps
(active_cvlans and active_svlans). We are experiencing some racing where
data is not in sync in the list and bitmaps. For example, the VLAN is
initially added to the list but only when the PF replies, it is added to
the bitmap. If a user adds many V2 VLANS before the PF responds:
while [ $((i++)) ]
ip l add l eth0 name eth0.$i type vlan id $i
we might end up with more VLAN list entries than the designated limit.
Also, The "ip link show" will show more links added than the PF limit.
On the other and, the bitmaps are only used to check the number of VLAN
filters and to re-enable the filters when the interface goes from DOWN to
UP.
This patch gets rid of the bitmaps and uses the list only. To do that,
the states of the VLAN filter are modified:
1 - IAVF_VLAN_REMOVE: the entry needs to be totally removed after informing
the PF. This is the "ip link del eth0.$i" path.
2 - IAVF_VLAN_DISABLE: (new) the netdev went down. The filter needs to be
removed from the PF and then marked INACTIVE.
3 - IAVF_VLAN_INACTIVE: (new) no PF filter exists, but the user did not
delete the VLAN.
Fixes: 48ccc43ecf10 ("iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config")
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The VLAN filter states are currently being saved as individual bits.
This is error prone as multiple bits might be mistakenly set.
Fix by replacing the bits with a single state enum. Also, add an
"ACTIVE" state for filters that are accepted by the PF.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
HWmon core receives an array of pointers to hwmon_channel_info and it
does not modify it, thus it can be array of const pointers for safety.
This allows drivers to make them also const.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Hangbin Liu says:
====================
bonding: fix ns validation on backup slaves
The first patch fixed a ns validation issue on backup slaves. The second
patch re-format the bond option test and add a test lib file. The third
patch add the arp validate regression test for the kernel patch.
Here is the new bonding option test without the kernel fix:
]# ./bond_options.sh
TEST: prio (active-backup miimon primary_reselect 0) [ OK ]
TEST: prio (active-backup miimon primary_reselect 1) [ OK ]
TEST: prio (active-backup miimon primary_reselect 2) [ OK ]
TEST: prio (active-backup arp_ip_target primary_reselect 0) [ OK ]
TEST: prio (active-backup arp_ip_target primary_reselect 1) [ OK ]
TEST: prio (active-backup arp_ip_target primary_reselect 2) [ OK ]
TEST: prio (active-backup ns_ip6_target primary_reselect 0) [ OK ]
TEST: prio (active-backup ns_ip6_target primary_reselect 1) [ OK ]
TEST: prio (active-backup ns_ip6_target primary_reselect 2) [ OK ]
TEST: prio (balance-tlb miimon primary_reselect 0) [ OK ]
TEST: prio (balance-tlb miimon primary_reselect 1) [ OK ]
TEST: prio (balance-tlb miimon primary_reselect 2) [ OK ]
TEST: prio (balance-tlb arp_ip_target primary_reselect 0) [ OK ]
TEST: prio (balance-tlb arp_ip_target primary_reselect 1) [ OK ]
TEST: prio (balance-tlb arp_ip_target primary_reselect 2) [ OK ]
TEST: prio (balance-tlb ns_ip6_target primary_reselect 0) [ OK ]
TEST: prio (balance-tlb ns_ip6_target primary_reselect 1) [ OK ]
TEST: prio (balance-tlb ns_ip6_target primary_reselect 2) [ OK ]
TEST: prio (balance-alb miimon primary_reselect 0) [ OK ]
TEST: prio (balance-alb miimon primary_reselect 1) [ OK ]
TEST: prio (balance-alb miimon primary_reselect 2) [ OK ]
TEST: prio (balance-alb arp_ip_target primary_reselect 0) [ OK ]
TEST: prio (balance-alb arp_ip_target primary_reselect 1) [ OK ]
TEST: prio (balance-alb arp_ip_target primary_reselect 2) [ OK ]
TEST: prio (balance-alb ns_ip6_target primary_reselect 0) [ OK ]
TEST: prio (balance-alb ns_ip6_target primary_reselect 1) [ OK ]
TEST: prio (balance-alb ns_ip6_target primary_reselect 2) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 0) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 1) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 2) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 3) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 4) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 5) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 6) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 0) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 1) [ OK ]
TEST: arp_validate (interface eth1 mii_status DOWN) [FAIL]
TEST: arp_validate (interface eth2 mii_status DOWN) [FAIL]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 2) [FAIL]
TEST: arp_validate (interface eth1 mii_status DOWN) [FAIL]
TEST: arp_validate (interface eth2 mii_status DOWN) [FAIL]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 3) [FAIL]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 4) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 5) [ OK ]
TEST: arp_validate (interface eth1 mii_status DOWN) [FAIL]
TEST: arp_validate (interface eth2 mii_status DOWN) [FAIL]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 6) [FAIL]
Here is the test result after the kernel fix:
TEST: arp_validate (active-backup arp_ip_target arp_validate 0) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 1) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 2) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 3) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 4) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 5) [ OK ]
TEST: arp_validate (active-backup arp_ip_target arp_validate 6) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 0) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 1) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 2) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 3) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 4) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 5) [ OK ]
TEST: arp_validate (active-backup ns_ip6_target arp_validate 6) [ OK ]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch add bonding arp validate tests with mode active backup,
monitor arp_ip_target and ns_ip6_target. It also checks mii_status
to make sure all slaves are UP.
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To improve the testing process for bond options, A new bond topology lib
is added to our testing setup. The current option_prio.sh file will be
renamed to bond_options.sh so that all bonding options can be tested here.
Specifically, for priority testing, we will run all tests using modes
1, 5, and 6. These changes will help us streamline the testing process
and ensure that our bond options are rigorously evaluated.
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When arp_validate is set to 2, 3, or 6, validation is performed for
backup slaves as well. As stated in the bond documentation, validation
involves checking the broadcast ARP request sent out via the active
slave. This helps determine which slaves are more likely to function in
the event of an active slave failure.
However, when the target is an IPv6 address, the NS message sent from
the active interface is not checked on backup slaves. Additionally,
based on the bond_arp_rcv() rule b, we must reverse the saddr and daddr
when checking the NS message.
Note that when checking the NS message, the destination address is a
multicast address. Therefore, we must convert the target address to
solicited multicast in the bond_get_targets_ip6() function.
Prior to the fix, the backup slaves had a mii status of "down", but
after the fix, all of the slaves' mii status was updated to "UP".
Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a device is roaming between interfaces and a new flow entry is
created, we should assume that its output device is more up to date than
whatever entry existed already.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
WED version 2 (on MT7986 and later) can offload flows originating from
wireless devices.
In order to make that work, ndo_setup_tc needs to be implemented on the
netdevs. This adds the required code to offload flows coming in from WED,
while keeping track of the incoming wed index used for selecting the
correct PPE device.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
UBSAN: shift-out-of-bounds in net/ipv4/tcp_input.c:555:23
shift exponent 255 is too large for 32-bit type 'int'
CPU: 1 PID: 7907 Comm: ssh Not tainted 6.3.0-rc4-00161-g62bad54b26db-dirty #206
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x136/0x150
__ubsan_handle_shift_out_of_bounds+0x21f/0x5a0
tcp_init_transfer.cold+0x3a/0xb9
tcp_finish_connect+0x1d0/0x620
tcp_rcv_state_process+0xd78/0x4d60
tcp_v4_do_rcv+0x33d/0x9d0
__release_sock+0x133/0x3b0
release_sock+0x58/0x1b0
'maxwin' is int, shifting int for 32 or more bits is undefined behaviour.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Smatch reports: drivers/net/ethernet/sun/niu.c:4525
niu_alloc_channels() warn: missing unwind goto?
If niu_rbr_fill() fails, then we are directly returning 'err' without
freeing the channels.
Fix this by changing direct return to a goto 'out_err'.
Fixes: a3138df9f20e ("[NIU]: Add Sun Neptune ethernet driver.")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This lock was supposed to be an unlock.
Fixes: 6cc041e90c17 ("cifs: avoid races in parallel reconnects in smb1")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
set, and partition scan will be proceed again when blkdev_get_by_dev()
is called. However, this will cause a problem that re-assemble partitioned
raid device will creat partition for underlying disk.
Test procedure:
mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
sgdisk -n 0:0:+100MiB /dev/md0
blockdev --rereadpt /dev/sda
blockdev --rereadpt /dev/sdb
mdadm -S /dev/md0
mdadm -A /dev/md0 /dev/sda /dev/sdb
Test result: underlying disk partition and raid partition can be
observed at the same time
Note that this can still happen in come corner cases that
GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
device.
Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-04-05
From Paul:
- TC action parsing cleanups
- Correctly report stats for missed packets
- Many CT actions limitations removed due to hw misses will now
continue from the relevant tc ct action in software.
From Adham:
- RQ/SQ devlink health diagnostics layout fixes
From Gal And Rahul:
- PTP code cleanup and cyclecounter shift value improvement
* tag 'mlx5-updates-2023-04-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Fix SQ SW state layout in SQ devlink health diagnostics
net/mlx5e: Fix RQ SW state layout in RQ devlink health diagnostics
net/mlx5e: Rename misleading skb_pc/cc references in ptp code
net/mlx5: Update cyclecounter shift value to improve ptp free running mode precision
net/mlx5e: Remove redundant macsec code
net/mlx5e: TC, Remove sample and ct limitation
net/mlx5e: TC, Remove mirror and ct limitation
net/mlx5e: TC, Remove tuple rewrite and ct limitation
net/mlx5e: TC, Remove multiple ct actions limitation
net/mlx5e: TC, Remove special handling of CT action
net/mlx5e: TC, Remove CT action reordering
net/mlx5e: CT: Use per action stats
net/mlx5e: TC, Move main flow attribute cleanup to helper func
net/mlx5e: TC, Remove unused vf_tun variable
net/mlx5e: Set default can_offload action
====================
Link: https://lore.kernel.org/r/20230406020232.83844-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Running this test makes little sense if the enabled l3_stats are not
actually reported as "used". This can signify a failure of a driver to
install the necessary counters, or simply lack of support for enabling
in-HW counters on a given netdevice. It is generally impossible to tell
from the outside which it is. But more likely than not, if somebody is
running this on veth pairs, they do not intend to actually test that a
certain piece of HW can install in-HW counters for the veth. It is more
likely they are e.g. running the test by mistake.
Therefore detect that the counter has not been actually installed. In that
case, if the netdevice is one end of a veth pair, SKIP. Otherwise FAIL.
Suggested-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/a86817961903cca5cb0aebf2b2a06294b8aa7dea.1680704172.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A recent rearrangement of includes has lead to a problem on m68k
as flagged by the kernel test robot.
Resolve this by moving the block asm includes to below linux includes.
A side effect i that non-Sparc asm includes are now immediately
before Sparc asm includes, which seems nice.
Using sparse v0.6.4 I was able to reproduce this problem as follows
using the config provided by the kernel test robot:
$ wget https://download.01.org/0day-ci/archive/20230404/202304041748.0sQc4K4l-lkp@intel.com/config
$ cp config .config
$ make ARCH=m68k oldconfig
$ make ARCH=m68k C=2 M=drivers/net/ethernet/sun
CC [M] drivers/net/ethernet/sun/sunhme.o
In file included from drivers/net/ethernet/sun/sunhme.c:19:
./arch/m68k/include/asm/irq.h:78:11: error: expected ‘;’ before ‘void’
78 | asmlinkage void do_IRQ(int irq, struct pt_regs *regs);
| ^~~~~
| ;
./arch/m68k/include/asm/irq.h:78:40: warning: ‘struct pt_regs’ declared inside parameter list will not be visible outside of this definition or declaration
78 | asmlinkage void do_IRQ(int irq, struct pt_regs *regs);
| ^~~~~~~
Compile tested only.
Fixes: 1ff4f42aef60 ("net: sunhme: Alphabetize includes")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202304041748.0sQc4K4l-lkp@intel.com/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230405-sunhme-includes-fix-v1-1-bf17cc5de20d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
BPF helpers that take an ARG_PTR_TO_UNINIT_MEM must ensure that all of
the memory is set, including beyond the end of the string.
Signed-off-by: Barret Rhoden <brho@google.com>
Link: https://lore.kernel.org/r/20230407001808.1622968-1-brho@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
introduced a use-after-free bug in the bus removal cleanup. The issue was
found with kfence:
[ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70
[ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115):
[ 19.309677] pci_bus_release_domain_nr+0x10/0x70
[ 19.309691] dw_pcie_host_deinit+0x28/0x78
[ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194]
[ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194]
[ 19.309752] platform_probe+0x90/0xd8
...
[ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k
[ 19.311469] allocated by task 96 on cpu 10 at 19.279323s:
[ 19.311562] __kmem_cache_alloc_node+0x260/0x278
[ 19.311571] kmalloc_trace+0x24/0x30
[ 19.311580] pci_alloc_bus+0x24/0xa0
[ 19.311590] pci_register_host_bridge+0x48/0x4b8
[ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8
[ 19.311613] pci_host_probe+0x18/0xc0
[ 19.311623] dw_pcie_host_init+0x2c0/0x568
[ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194]
[ 19.311647] platform_probe+0x90/0xd8
...
[ 19.311782] freed by task 96 on cpu 10 at 19.285833s:
[ 19.311799] release_pcibus_dev+0x30/0x40
[ 19.311808] device_release+0x30/0x90
[ 19.311814] kobject_put+0xa8/0x120
[ 19.311832] device_unregister+0x20/0x30
[ 19.311839] pci_remove_bus+0x78/0x88
[ 19.311850] pci_remove_root_bus+0x5c/0x98
[ 19.311860] dw_pcie_host_deinit+0x28/0x78
[ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194]
[ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194]
[ 19.311900] platform_probe+0x90/0xd8
...
[ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4
[ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022
[ 19.325852] Workqueue: events_unbound deferred_probe_work_func
The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't
directly call pci_bus_release_domain_nr(). The issue turns out to be in
pci_remove_root_bus() which first calls pci_remove_bus() which frees the
struct pci_bus when its struct device is released. Then
pci_bus_release_domain_nr() is called and accesses the freed struct
pci_bus. Reordering these fixes the issue.
Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
Link: https://lore.kernel.org/r/20230329123835.2724518-1-robh@kernel.org
Link: https://lore.kernel.org/r/b529cb69-0602-9eed-fc02-2f068707a006@nvidia.com
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org # v6.2+
Cc: Pali Rohár <pali@kernel.org>
|
|
variables'
Yonghong Song says:
====================
LLVM commit [1] introduced hoistMinMax optimization like
(i < VIRTIO_MAX_SGS) && (i < out_sgs)
to
upper = MIN(VIRTIO_MAX_SGS, out_sgs)
... i < upper ...
and caused the verification failure. Commit [2] workarounded the issue by
adding some bpf assembly code to prohibit the above optimization.
This patch improved verifier such that verification can succeed without
the above workaround.
Without [2], the current verifier will hit the following failures:
...
119: (15) if r1 == 0x0 goto pc+1
The sequence of 8193 jumps is too complex.
verification time 525829 usec
stack depth 64
processed 156616 insns (limit 1000000) max_states_per_insn 8 total_states 1754 peak_states 1712 mark_read 12
-- END PROG LOAD LOG --
libbpf: prog 'trace_virtqueue_add_sgs': failed to load: -14
libbpf: failed to load object 'loop6.bpf.o'
...
The failure is due to verifier inadequately handling '<const> <cond_op> <non_const>' which will
go through both pathes and generate the following verificaiton states:
...
89: (07) r2 += 1 ; R2_w=5
90: (79) r8 = *(u64 *)(r10 -48) ; R8_w=scalar() R10=fp0
91: (79) r1 = *(u64 *)(r10 -56) ; R1_w=scalar(umax=5,var_off=(0x0; 0x7)) R10=fp0
92: (ad) if r2 < r1 goto pc+41 ; R0_w=scalar() R1_w=scalar(umin=6,umax=5,var_off=(0x4; 0x3))
R2_w=5 R6_w=scalar(id=385) R7_w=0 R8_w=scalar() R9_w=scalar(umax=21474836475,var_off=(0x0; 0x7ffffffff))
R10=fp0 fp-8=mmmmmmmm fp-16=mmmmmmmm fp-24=mmmm???? fp-32= fp-40_w=4 fp-48=mmmmmmmm fp-56= fp-64=mmmmmmmm
...
89: (07) r2 += 1 ; R2_w=6
90: (79) r8 = *(u64 *)(r10 -48) ; R8_w=scalar() R10=fp0
91: (79) r1 = *(u64 *)(r10 -56) ; R1_w=scalar(umax=5,var_off=(0x0; 0x7)) R10=fp0
92: (ad) if r2 < r1 goto pc+41 ; R0_w=scalar() R1_w=scalar(umin=7,umax=5,var_off=(0x4; 0x3))
R2_w=6 R6=scalar(id=388) R7=0 R8_w=scalar() R9_w=scalar(umax=25769803770,var_off=(0x0; 0x7ffffffff))
R10=fp0 fp-8=mmmmmmmm fp-16=mmmmmmmm fp-24=mmmm???? fp-32= fp-40=5 fp-48=mmmmmmmm fp-56= fp-64=mmmmmmmm
...
89: (07) r2 += 1 ; R2_w=4088
90: (79) r8 = *(u64 *)(r10 -48) ; R8_w=scalar() R10=fp0
91: (79) r1 = *(u64 *)(r10 -56) ; R1_w=scalar(umax=5,var_off=(0x0; 0x7)) R10=fp0
92: (ad) if r2 < r1 goto pc+41 ; R0=scalar() R1=scalar(umin=4089,umax=5,var_off=(0x0; 0x7))
R2=4088 R6=scalar(id=12634) R7=0 R8=scalar() R9=scalar(umax=17557826301960,var_off=(0x0; 0xfffffffffff))
R10=fp0 fp-8=mmmmmmmm fp-16=mmmmmmmm fp-24=mmmm???? fp-32= fp-40=4087 fp-48=mmmmmmmm fp-56= fp-64=mmmmmmmm
Patch 3 fixed the above issue by handling '<const> <cond_op> <non_const>' properly.
During developing selftests for Patch 3, I found some issues with bound deduction with
BPF_EQ/BPF_NE and fixed the issue in Patch 1.
After the above issue is fixed, the second issue shows up.
...
67: (07) r1 += -16 ; R1_w=fp-16
; bpf_probe_read_kernel(&sgp, sizeof(sgp), sgs + i);
68: (b4) w2 = 8 ; R2_w=8
69: (85) call bpf_probe_read_kernel#113 ; R0_w=scalar() fp-16=mmmmmmmm
; return sgp;
70: (79) r6 = *(u64 *)(r10 -16) ; R6=scalar() R10=fp0
; for (n = 0, sgp = get_sgp(sgs, i); sgp && (n < SG_MAX);
71: (15) if r6 == 0x0 goto pc-49 ; R6=scalar()
72: (b4) w1 = 0 ; R1_w=0
73: (05) goto pc-46
; for (i = 0; (i < VIRTIO_MAX_SGS) && (i < out_sgs); i++) {
28: (bc) w7 = w1 ; R1_w=0 R7_w=0
; bpf_probe_read_kernel(&len, sizeof(len), &sgp->length);
...
23: (79) r3 = *(u64 *)(r10 -40) ; R3_w=2 R10=fp0
; for (i = 0; (i < VIRTIO_MAX_SGS) && (i < out_sgs); i++) {
24: (07) r3 += 1 ; R3_w=3
; for (i = 0; (i < VIRTIO_MAX_SGS) && (i < out_sgs); i++) {
25: (79) r1 = *(u64 *)(r10 -56) ; R1_w=scalar(umax=5,var_off=(0x0; 0x7)) R10=fp0
26: (ad) if r3 < r1 goto pc+34 61: R0=scalar() R1_w=scalar(umin=4,umax=5,var_off=(0x4; 0x1)) R3_w=3 R6=scalar(id=1658)
R7=0 R8=scalar(id=1653) R9=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=mmmmmmmm fp-16=mmmmmmmm
fp-24=mmmm???? fp-32= fp-40=2 fp-56= fp-64=mmmmmmmm
; if (sg_is_chain(&sg))
61: (7b) *(u64 *)(r10 -40) = r3 ; R3_w=3 R10=fp0 fp-40_w=3
...
67: (07) r1 += -16 ; R1_w=fp-16
; bpf_probe_read_kernel(&sgp, sizeof(sgp), sgs + i);
68: (b4) w2 = 8 ; R2_w=8
69: (85) call bpf_probe_read_kernel#113 ; R0_w=scalar() fp-16=mmmmmmmm
; return sgp;
70: (79) r6 = *(u64 *)(r10 -16)
; for (n = 0, sgp = get_sgp(sgs, i); sgp && (n < SG_MAX);
infinite loop detected at insn 71
verification time 90800 usec
stack depth 64
processed 25017 insns (limit 1000000) max_states_per_insn 20 total_states 491 peak_states 169 mark_read 12
-- END PROG LOAD LOG --
libbpf: prog 'trace_virtqueue_add_sgs': failed to load: -22
Further analysis found the index variable 'i' is spilled but since it is not marked as precise.
This is more tricky as identifying induction variable is not easy in verifier. Although a heuristic
is possible, let us leave it for now.
[1] https://reviews.llvm.org/D143726
[2] Commit 3c2611bac08a ("selftests/bpf: Fix trace_virtqueue_add_sgs test issue with LLVM 17.")
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
<non_const>'
Add various tests for code pattern '<const> <cond_op> <non_const>' to
exercise the previous verifier patch.
The following are veristat changed number of processed insns stat
comparing the previous patch vs. this patch:
File Program Insns (A) Insns (B) Insns (DIFF)
----------------------------------------------------- ---------------------------------------------------- --------- --------- -------------
test_seg6_loop.bpf.linked3.o __add_egr_x 12423 12314 -109 (-0.88%)
Only one program is affected with minor change.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164510.1047757-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, the verifier does not handle '<const> <cond_op> <non_const>' well.
For example,
...
10: (79) r1 = *(u64 *)(r10 -16) ; R1_w=scalar() R10=fp0
11: (b7) r2 = 0 ; R2_w=0
12: (2d) if r2 > r1 goto pc+2
13: (b7) r0 = 0
14: (95) exit
15: (65) if r1 s> 0x1 goto pc+3
16: (0f) r0 += r1
...
At insn 12, verifier decides both true and false branch are possible, but
actually only false branch is possible.
Currently, the verifier already supports patterns '<non_const> <cond_op> <const>.
Add support for patterns '<const> <cond_op> <non_const>' in a similar way.
Also fix selftest 'verifier_bounds_mix_sign_unsign/bounds checks mixing signed and unsigned, variant 10'
due to this change.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164505.1046801-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add various tests for code pattern '<non-const> NE/EQ <const>' implemented
in the previous verifier patch. Without the verifier patch, these new
tests will fail.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164500.1045715-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, for BPF_JEQ/BPF_JNE insn, verifier determines
whether the branch is taken or not only if both operands
are constants. Therefore, for the following code snippet,
0: (85) call bpf_ktime_get_ns#5 ; R0_w=scalar()
1: (a5) if r0 < 0x3 goto pc+2 ; R0_w=scalar(umin=3)
2: (b7) r2 = 2 ; R2_w=2
3: (1d) if r0 == r2 goto pc+2 6
At insn 3, since r0 is not a constant, verifier assumes both branch
can be taken which may lead inproper verification failure.
Add comparing umin/umax value and the constant. If the umin value
is greater than the constant, or umax value is smaller than the constant,
for JEQ the branch must be not-taken, and for JNE the branch must be taken.
The jmp32 mode JEQ/JNE branch taken checking is also handled similarly.
The following lists the veristat result w.r.t. changed number
of processes insns during verification:
File Program Insns (A) Insns (B) Insns (DIFF)
----------------------------------------------------- ---------------------------------------------------- --------- --------- ---------------
test_cls_redirect.bpf.linked3.o cls_redirect 64980 73472 +8492 (+13.07%)
test_seg6_loop.bpf.linked3.o __add_egr_x 12425 12423 -2 (-0.02%)
test_tcp_hdr_options.bpf.linked3.o estab 2634 2558 -76 (-2.89%)
test_parse_tcp_hdr_opt.bpf.linked3.o xdp_ingress_v6 1421 1420 -1 (-0.07%)
test_parse_tcp_hdr_opt_dynptr.bpf.linked3.o xdp_ingress_v6 1238 1237 -1 (-0.08%)
test_tc_dtime.bpf.linked3.o egress_fwdns_prio100 414 411 -3 (-0.72%)
Mostly a small improvement but test_cls_redirect.bpf.linked3.o has a 13% regression.
I checked with verifier log and found it this is due to pruning.
For some JEQ/JNE branches impacted by this patch,
one branch is explored and the other has state equivalence and
pruned.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164455.1045294-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|