Age | Commit message (Collapse) | Author |
|
Change min() macros to min_t() which has compare type specified and it
helps avoid precision loss.
In some cases there was precision loss during calls or assignments.
Some fields in structs were unnecessarily large and gave multiple
warnings.
There were also some minor type differences which are now fixed as well as
some cases where a simple cast was needed.
Callers were were passing data that is a u16 to
ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8.
Fix that by changing the function to take a u16.
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When printing the ice status or AQ error codes, instead of printing out the
numerical value, provide the description of the error code. This provides
more info about the issue than a number.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
As soon as the driver registers the PF netdev, userspace utilities
like NetworkManager try to bring up the associated interface. When
this happens, the driver may not have finished initializing fully,
resulting in a bunch of errors in the interface up flow.
The driver already has a mechanism to indicate if it's not up yet;
by setting the __ICE_DOWN bit in pf->state, but this bit gets
cleared too early in the current flow. So clear this bit only when
the driver is fully up. Also check for the same bit in the ice_open
flow, and return -EBUSY if the bit is set.
Also in ice_open, replace references of vsi->back with a local
variable.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently, the ice driver is setting a PHY configuration,
which causes a link drop, and then additionally it calls
for a nway_reset, which restarts auto-negotiation on the
link, which also causes a link drop. These two link
events in such close timing is causing the FW to not be
able to generate a link interrupt for the driver to
respond to.
Remove the unnecessary auto-negotiation restart from the
set pauseparams flow. Also remove error path that
would have performed an ice_down/ice_up as that is
also unnecessary.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The current implementation for contiguous TC check
is assuming that the UPs will be mapped to TCs in
a linear progressing fashion. This is obviously
not always true.
Change the check to allow for various UP2TC mapping
configurations.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When there's a Tx timeout for a queue which belongs to a PFC enabled TC,
then it's not because the queue is hung but because PFC is in action.
In PFC, peer sends a pause frame for a specified period of time when its
buffer threshold is exceeded (due to congestion). Netdev on the other
hand checks if ACK is received within a specified time for a TX packet, if
not, it'll invoke the tx_timeout routine.
Signed-off-by: Avinash JD <avinash.dayanand@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Implement promiscuous support for VF VSIs. Behaviour of promiscuous support
is based on VF trust as well as the, introduced, vf-true-promisc flag.
A trusted VF with vf-true-promisc disabled will be the default VSI, which
means that all traffic without a matching destination MAC address in the
device's internal switch will be forwarded to this VF VSI.
A trusted VF with vf-true-promisc enabled will go into "true promiscuous
mode". This amounts to the VF receiving all ingress and egress traffic
that hits the device's internal switch.
An untrusted VF will only receive traffic destined for that VF.
The vf-true-promisc-support flag cannot be toggled while any VF is in
promiscuous mode. This flag should be set prior to loading the iavf driver
or spawning VF(s).
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Create a boost TCAM entry for each tunnel port in order to get a tunnel
PTYPE. Update netdev feature flags and implement the appropriate logic to
get and set values for hardware offloads.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The flash memory for the ice hardware contains a block of information
used for link management called the Netlist module.
As this essentially represents another section of firmware, add its
version information to the output of the driver's .info_get handler.
This includes both a version and the first few bytes of a hash of the
module contents.
fw.netlist -> the version information extracted from the netlist module
fw.netlist.build-> first 4 bytes of the hash of the contents, similar
to fw.mgmt.build
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When attaching a flow dissector program to a network namespace with
bpf(BPF_PROG_ATTACH, ...) we grab a reference to bpf_prog.
If netns gets destroyed while a flow dissector is still attached, and there
are no other references to the prog, we leak the reference and the program
remains loaded.
Leak can be reproduced by running flow dissector tests from selftests/bpf:
# bpftool prog list
# ./test_flow_dissector.sh
...
selftests: test_flow_dissector [PASS]
# bpftool prog list
4: flow_dissector name _dissect tag e314084d332a5338 gpl
loaded_at 2020-05-20T18:50:53+0200 uid 0
xlated 552B jited 355B memlock 4096B map_ids 3,4
btf_id 4
#
Fix it by detaching the flow dissector program when netns is going away.
Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20200521083435.560256-1-jakub@cloudflare.com
|
|
John Fastabend says:
====================
This series adds logic to the verifier to handle the case where a
pointer is known to be non-null but then the verifier encountesr a
instruction, such as 'if ptr == 0 goto X' or 'if ptr != 0 goto X',
where the pointer is compared against 0. Because the verifier tracks
if a pointer may be null in many cases we can improve the branch
tracking by following the case known to be true.
The first patch adds the verifier logic and patches 2-4 add the
test cases.
v1->v2: fix verifier logic to return -1 indicating both paths must
still be walked if value is not zero. Move mark_precision skip for
this case into caller of mark_precision to ensure mark_precision can
still catch other misuses. And add PTR_TYPE_BTF_ID to our list of
supported types. Finally add one more test to catch the value not
equal zero case. Thanks to Andrii for original review.
Also fixed up commit messages hopefully its better now.
====================
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adding a printk to test_sk_lookup_kern created the reported failure
where a pointer type is checked twice for NULL. Lets add it to the
progs test test_sk_lookup_kern.c so we test the case from C all the
way into the verifier.
We already have printk's in selftests so seems OK to add another one.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009170603.6313.1715279795045285176.stgit@john-Precision-5820-Tower
|
|
When we have pointer type that is known to be non-null we only follow
the non-null branch. This adds tests to cover the map_value pointer
returned from a map lookup. To force an error if both branches are
followed we do an ALU op on R10.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009168650.6313.7434084136067263554.stgit@john-Precision-5820-Tower
|
|
When we have pointer type that is known to be non-null and comparing
against zero we only follow the non-null branch. This adds tests to
cover this case for reference tracking. Also add the other case when
comparison against a non-zero value and ensure we still fail with
unreleased reference.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159009166599.6313.1593680633787453767.stgit@john-Precision-5820-Tower
|
|
Currently, when considering the branches that may be taken for a jump
instruction if the register being compared is a pointer the verifier
assumes both branches may be taken. But, if the jump instruction
is comparing if a pointer is NULL we have this information in the
verifier encoded in the reg->type so we can do better in these cases.
Specifically, these two common cases can be handled.
* If the instruction is BPF_JEQ and we are comparing against a
zero value. This test is 'if ptr == 0 goto +X' then using the
type information in reg->type we can decide if the ptr is not
null. This allows us to avoid pushing both branches onto the
stack and instead only use the != 0 case. For example
PTR_TO_SOCK and PTR_TO_SOCK_OR_NULL encode the null pointer.
Note if the type is PTR_TO_SOCK_OR_NULL we can not learn anything.
And also if the value is non-zero we learn nothing because it
could be any arbitrary value a different pointer for example
* If the instruction is BPF_JNE and ware comparing against a zero
value then a similar analysis as above can be done. The test in
asm looks like 'if ptr != 0 goto +X'. Again using the type
information if the non null type is set (from above PTR_TO_SOCK)
we know the jump is taken.
In this patch we extend is_branch_taken() to consider this extra
information and to return only the branch that will be taken. This
resolves a verifier issue reported with C code like the following.
See progs/test_sk_lookup_kern.c in selftests.
sk = bpf_sk_lookup_tcp(skb, tuple, tuple_len, BPF_F_CURRENT_NETNS, 0);
bpf_printk("sk=%d\n", sk ? 1 : 0);
if (sk)
bpf_sk_release(sk);
return sk ? TC_ACT_OK : TC_ACT_UNSPEC;
In the above the bpf_printk() will resolve the pointer from
PTR_TO_SOCK_OR_NULL to PTR_TO_SOCK. Then the second test guarding
the release will cause the verifier to walk both paths resulting
in the an unreleased sock reference. See verifier/ref_tracking.c
in selftests for an assembly version of the above.
After the above additional logic is added the C code above passes
as expected.
Reported-by: Andrey Ignatov <rdna@fb.com>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159009164651.6313.380418298578070501.stgit@john-Precision-5820-Tower
|
|
Björn Töpel says:
====================
Overview
========
Driver adoption for AF_XDP has been slow. The amount of code required
to proper support AF_XDP is substantial and the driver/core APIs are
vague or even non-existing. Drivers have to manually adjust data
offsets, updating AF_XDP handles differently for different modes
(aligned/unaligned).
This series attempts to improve the situation by introducing an AF_XDP
buffer allocation API. The implementation is based on a single core
(single producer/consumer) buffer pool for the AF_XDP UMEM.
A buffer is allocated using the xsk_buff_alloc() function, and
returned using xsk_buff_free(). If a buffer is disassociated with the
pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is
said to be released. Currently, the release function is only used by
the AF_XDP internals and not visible to the driver.
Drivers using this API should register the XDP memory model with the
new MEM_TYPE_XSK_BUFF_POOL type, which will supersede the
MEM_TYPE_ZERO_COPY type.
The buffer type is struct xdp_buff, and follows the lifetime of
regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to
a NAPI context. In other words, the API is not replacing xdp_frames.
DMA mapping/synching is folded into the buffer handling as well.
@JeffK The Intel drivers changes should go through the bpf-next tree,
and not your regular Intel tree, since multiple (non-Intel)
drivers are affected.
The outline of the series is as following:
Patch 1 is a fix for xsk_umem_xdp_frame_sz().
Patch 2 to 4 are restructures/clean ups. The XSKMAP implementation is
moved to net/xdp/. Functions/defines/enums that are only used by the
AF_XDP internals are moved from the global include/net/xdp_sock.h to
net/xdp/xsk.h. We are also introducing a new "driver include file",
include/net/xdp_sock_drv.h, which is the only file NIC driver
developers adding AF_XDP zero-copy support should care about.
Patch 5 adds the new API, and migrates the "copy-mode"/skb-mode AF_XDP
path to the new API.
Patch 6 to 11 migrates the existing zero-copy drivers to the new API.
Patch 12 removes the MEM_TYPE_ZERO_COPY memory type, and the "handle"
member of struct xdp_buff.
Patch 13 simplifies the xdp_return_{frame,frame_rx_napi,buff}
functions.
Patch 14 is a performance patch, where some functions are inlined.
Finally, patch 15 updates the MAINTAINERS file to correctly mirror the
new file layout.
Note that this series removes the "handle" member from struct
xdp_buff, which reduces the xdp_buff size.
After this series, the diff stat of drivers/net/ is:
27 files changed, 419 insertions(+), 1288 deletions(-)
This series is a first step of simplifying the driver side of
AF_XDP. I think more of the AF_XDP logic can be moved from the drivers
to the AF_XDP core, e.g. the "need wakeup" set/clear functionality.
Statistics when allocation fails can now be added to the socket
statistics via the XDP_STATISTICS getsockopt(). This will be added in
a follow up series.
Performance
===========
As a nice side effect, performance is up a bit as well.
* i40e: 3% higher pps for rxdrop, zero-copy, aligned and unaligned
(40 GbE, 64B packets).
* mlx5: RX +0.8 Mpps, TX +0.4 Mpps
Changelog
=========
v4->v5:
* Fix various kdoc and GCC warnings (W=1). (Jakub)
v3->v4:
* mlx5: Remove unused variable num_xsk_frames. (Jakub)
* i40e: Made i40e_fd_handle_status() static. (kbuild test robot)
v2->v3:
* Added xsk_umem_xdp_frame_sz() fix to the series. (Björn)
* Initialize struct xdp_buff member frame_sz. (Björn)
* Add API to query the DMA address of a frame. (Maxim)
* Do DMA sync for CPU till the end of the frame to handle possible
growth (frame_sz). (Maxim)
* mlx5: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use
xsk_buff API for DMA sync on TX, add performance numbers. (Maxim)
v1->v2:
* mlx5: Fix DMA address handling, set XDP metadata to invalid. (Maxim)
* ixgbe: Fixed xdp_buff data_end update. (Björn)
* Swapped SoBs in patch 4. (Maxim)
rfc->v1:
* Fixed build errors/warnings for m68k and riscv. (kbuild test
robot)
* Added headroom/chunk size getter. (Maxim/Björn)
* mlx5: Put back the sanity check for XSK params, use XSK API to get
the total headroom size. (Maxim)
* Fixed spelling in commit message. (Björn)
* Make sure xp_validate_desc() is inlined for Tx perf. (Maxim)
* Sorted file entries. (Joe)
* Added xdp_return_{frame,frame_rx_napi,buff} simplification (Björn)
Thanks for all the comments/input/help!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Update MAINTAINERS to correctly mirror the current AF_XDP socket file
layout. Also, add the AF_XDP files of libbpf.
rfc->v1: Sorted file entries. (Joe)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/bpf/20200520192103.355233-16-bjorn.topel@gmail.com
|
|
In order to reduce the number of function calls, the struct
xsk_buff_pool definition is moved to xsk_buff_pool.h. The functions
xp_get_dma(), xp_dma_sync_for_cpu(), xp_dma_sync_for_device(),
xp_validate_desc() and various helper functions are explicitly
inlined.
Further, move xp_get_handle() and xp_release() to xsk.c, to allow for
the compiler to perform inlining.
rfc->v1: Make sure xp_validate_desc() is inlined for Tx perf. (Maxim)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-15-bjorn.topel@gmail.com
|
|
The xdp_return_{frame,frame_rx_napi,buff} function are never used,
except in xdp_convert_zc_to_xdp_frame(), by the MEM_TYPE_XSK_BUFF_POOL
memory type.
To simplify and reduce code, change so that
xdp_convert_zc_to_xdp_frame() calls xsk_buff_free() directly since the
type is know, and remove MEM_TYPE_XSK_BUFF_POOL from the switch
statement in __xdp_return() function.
Suggested-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-14-bjorn.topel@gmail.com
|
|
There are no users of MEM_TYPE_ZERO_COPY. Remove all corresponding
code, including the "handle" member of struct xdp_buff.
rfc->v1: Fixed spelling in commit message. (Björn)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-13-bjorn.topel@gmail.com
|
|
Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in
mlx5e. It allows to drop a lot of code from the driver (which is now
common in AF_XDP core and was related to XSK RX frame allocation, DMA
mapping, etc.) and slightly improve performance (RX +0.8 Mpps, TX +0.4
Mpps).
rfc->v1: Put back the sanity check for XSK params, use XSK API to get
the total headroom size. (Maxim)
v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim)
v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff
API for DMA sync on TX, add performance numbers. (Maxim)
v3->v4: Remove unused variable num_xsk_frames. (Jakub)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-12-bjorn.topel@gmail.com
|
|
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.
v1->v2: Fixed xdp_buff data_end update. (Björn)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-11-bjorn.topel@gmail.com
|
|
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.
v4->v5: Fixed "warning: Excess function parameter 'alloc' description
in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function
parameter 'xdp' description in
'ice_construct_skb_zc'". (Jakub)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com
|
|
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff
pointer.
v4->v5: Fixed "warning: Excess function parameter 'bi' description in
'i40e_construct_skb_zc'". (Jakub)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
|
|
Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP
zero-copy/sk_buff rx_bi rings are now separate. Functions to properly
allocate the different rings are added as well.
v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot)
v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
|
|
As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL
APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to
use an accessor function.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
|
|
In order to simplify AF_XDP zero-copy enablement for NIC driver
developers, a new AF_XDP buffer allocation API is added. The
implementation is based on a single core (single producer/consumer)
buffer pool for the AF_XDP UMEM.
A buffer is allocated using the xsk_buff_alloc() function, and
returned using xsk_buff_free(). If a buffer is disassociated with the
pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is
said to be released. Currently, the release function is only used by
the AF_XDP internals and not visible to the driver.
Drivers using this API should register the XDP memory model with the
new MEM_TYPE_XSK_BUFF_POOL type.
The API is defined in net/xdp_sock_drv.h.
The buffer type is struct xdp_buff, and follows the lifetime of
regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to
a NAPI context. In other words, the API is not replacing xdp_frames.
In addition to introducing the API and implementations, the AF_XDP
core is migrated to use the new APIs.
rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test
robot)
Added headroom/chunk size getter. (Maxim/Björn)
v1->v2: Swapped SoBs. (Maxim)
v2->v3: Initialize struct xdp_buff member frame_sz. (Björn)
Add API to query the DMA address of a frame. (Maxim)
Do DMA sync for CPU till the end of the frame to handle
possible growth (frame_sz). (Maxim)
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-6-bjorn.topel@gmail.com
|
|
Move the XSK_NEXT_PG_CONTIG_{MASK,SHIFT}, and
XDP_UMEM_USES_NEED_WAKEUP defines from xdp_sock.h to the AF_XDP
internal xsk.h file. Also, start using the BIT{,_ULL} macro instead of
explicit shifts.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-5-bjorn.topel@gmail.com
|
|
Move the AF_XDP zero-copy driver interface to its own include file
called xdp_sock_drv.h. This, hopefully, will make it more clear for
NIC driver implementors to know what functions to use for zero-copy
support.
v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub)
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
|
|
The XSKMAP is partly implemented by net/xdp/xsk.c. Move xskmap.c from
kernel/bpf/ to net/xdp/, which is the logical place for AF_XDP related
code. Also, move AF_XDP struct definitions, and function declarations
only used by AF_XDP internals into net/xdp/xsk.h.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-3-bjorn.topel@gmail.com
|
|
Calculating the "data_hard_end" for an XDP buffer coming from AF_XDP
zero-copy mode, the return value of xsk_umem_xdp_frame_sz() is added
to "data_hard_start".
Currently, the chunk size of the UMEM is returned by
xsk_umem_xdp_frame_sz(). This is not correct, if the fixed UMEM
headroom is non-zero. Fix this by returning the chunk_size without the
UMEM headroom.
Fixes: 2a637c5b1aaf ("xdp: For Intel AF_XDP drivers add XDP frame_sz")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200520192103.355233-2-bjorn.topel@gmail.com
|
|
git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.7-2020-05-21:
amdgpu:
- DP fix
- Floating point fix
- Fix cursor stutter issue
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521222415.4122-1-alexander.deucher@amd.com
|
|
In the function devm_platform_ioremap_resource(), if get resource
failed, the return value is ERR_PTR() not NULL. Thus it must be
replaced by IS_ERR(), or else it may result in crashes if a critical
error path is encountered.
Fixes: 0ce5ebd24d25 ("mfd: ioc3: Add driver for SGI IOC3 chip")
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case we can't find a ->dumpit callback for the requested
(family,type) pair, we fall back to (PF_UNSPEC,type). In effect, we're
in the same situation as if userspace had requested a PF_UNSPEC
dump. For RTM_GETROUTE, that handler is rtnl_dump_all, which calls all
the registered RTM_GETROUTE handlers.
The requested table id may or may not exist for all of those
families. commit ae677bbb4441 ("net: Don't return invalid table id
error when dumping all families") fixed the problem when userspace
explicitly requests a PF_UNSPEC dump, but missed the fallback case.
For example, when we pass ipv6.disable=1 to a kernel with
CONFIG_IP_MROUTE=y and CONFIG_IP_MROUTE_MULTIPLE_TABLES=y,
the (PF_INET6, RTM_GETROUTE) handler isn't registered, so we end up in
rtnl_dump_all, and listing IPv6 routes will unexpectedly print:
# ip -6 r
Error: ipv4: MR table does not exist.
Dump terminated
commit ae677bbb4441 introduced the dump_all_families variable, which
gets set when userspace requests a PF_UNSPEC dump. However, we can't
simply set the family to PF_UNSPEC in rtnetlink_rcv_msg in the
fallback case to get dump_all_families == true, because some messages
types (for example RTM_GETRULE and RTM_GETNEIGH) only register the
PF_UNSPEC handler and use the family to filter in the kernel what is
dumped to userspace. We would then export more entries, that userspace
would have to filter. iproute does that, but other programs may not.
Instead, this patch removes dump_all_families and updates the
RTM_GETROUTE handlers to check if the family that is being dumped is
their own. When it's not, which covers both the intentional PF_UNSPEC
dumps (as dump_all_families did) and the fallback case, ignore the
missing table id error.
Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
into drm-fixes
two fixes:
- memory leak fix when userspace passes a invalid softpin address
- off-by-one crashing the kernel in the perfmon domain iteration when
the GPU core has both 2D and 3D capabilities
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/ca7bce5c8aff0fcbbdc3bf2b9723df5f687c8924.camel@pengutronix.de
|
|
In case of error with MPLS support the code is misusing AF_INET
instead of AF_MPLS.
Fixes: 1b69e7e6c4da ("ipip: support MPLS over IPv4")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vadim Fedorenko says:
====================
net/tls: fix encryption error path
The problem with data stream corruption was found in KTLS
transmit path with small socket send buffers and large
amount of data. bpf_exec_tx_verdict() frees open record
on any type of error including EAGAIN, ENOMEM and ENOSPC
while callers are able to recover this transient errors.
Also wrong error code was returned to user space in that
case. This patchset fixes the problems.
====================
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We cannot free record on any transient error because it leads to
losing previos data. Check socket error to know whether record must
be freed or not.
Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bpf_exec_tx_verdict() can return negative value for copied
variable. In that case this value will be pushed back to caller
and the real error code will be lost. Fix it using signed type and
checking for positive value.
Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error")
Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Oleksij Rempel says:
====================
provide KAPI for SQI
This patches are extending ethtool netlink interface to export Signal
Quality Index (SQI). SQI provided by 100Base-T1 PHYs and can be used for
cable diagnostic. Compared to a typical cable tests, this value can be
only used after link is established.
changes v3:
- rename __ethtool_get_sqi* to linkstate_get_sqi*. And move this
functions to the net/ethtool/linkstate.c
- protect linkstate_get_sqi* with locking
changes v2:
- use u32 instead of u8 for SQI
- add SQI_MAX field and callbacks
- some style fixes in the rst.
- do not convert index to shifted index.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements reading of the Signal Quality Index for better
cable/link troubleshooting.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signal Quality Index is a mandatory value required by "OPEN Alliance
SIG" for the 100Base-T1 PHYs [1]. This indicator can be used for cable
integrity diagnostic and investigating other noise sources and
implement by at least two vendors: NXP[2] and TI[3].
[1] http://www.opensig.org/download/document/218/Advanced_PHY_features_for_automotive_Ethernet_V1.0.pdf
[2] https://www.nxp.com/docs/en/data-sheet/TJA1100.pdf
[3] https://www.ti.com/product/DP83TC811R-Q1
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Wei Yongjun says:
====================
net: ethernet: ti: fix some return value check
This patchset convert cpsw_ale_create() to return PTR_ERR() only, and
changed all the caller to check IS_ERR() instead of NULL.
Since v2:
1) rebased on net.git, as Jakub's suggest
2) split am65-cpsw-nuss.c changes, as Grygorii's suggest
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert to using IS_ERR() instead of NULL test for cpsw_ale_create()
error handling. Also fix to return negative error code from this error
handling case instead of 0 in.
Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
cpsw_ale_create() can return both NULL and PTR_ERR(), but all of
the caller only check NULL for error handling. This patch convert
it to only return PTR_ERR() in all error cases, and the caller using
IS_ERR() instead of NULL test.
Fixes: 4b41d3436796 ("net: ethernet: ti: cpsw: allow untagged traffic on host port")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Yuval Basson says:
====================
qed: Add xrc core support for RoCE
This patch adds support for configuring XRC and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the XRC feature.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for XRC-SRQ's and XRC-QP's for upper layer driver.
We maintain separate bitmaps for resource management for srq and
xrc-srq, However, the range in FW is one, The xrc-srq's are first
and then the srq's follow. Therefore we maintain a srq-id offset.
v2: perform cleanups if XRC bitmpas allocation fail.
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
First ILT page for TSDM client is allocated for XRC-SRQ's.
For regular SRQ's skip first ILT page that is reserved for
XRC-SRQ's.
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Once the traversal of the list is completed with list_for_each_entry(),
the iterator (node) will point to an invalid object. So passing this to
qrtr_local_enqueue() which is outside of the iterator block is erroneous
eventhough the object is not used.
So fix this by passing NULL to qrtr_local_enqueue().
Fixes: bdabad3e363d ("net: Add Qualcomm IPC router")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, psample can only send the packet bits after decapsulation.
The tunnel information is lost. Add the tunnel support.
If the sampled packet has no tunnel info, the behavior is the same as
before. If it has, add a nested metadata field named PSAMPLE_ATTR_TUNNEL
and include the tunnel subfields if applicable.
Increase the metadata length for sampled packet with the tunnel info.
If new subfields of tunnel info should be included, update the metadata
length accordingly.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|