Age | Commit message (Collapse) | Author |
|
This patch adds the End.BPF action to the LWT seg6local infrastructure.
This action works like any other seg6local End action, meaning that an IPv6
header with SRH is needed, whose DA has to be equal to the SID of the
action. It will also advance the SRH to the next segment, the BPF program
does not have to take care of this.
Since the BPF program may not be a source of instability in the kernel, it
is important to ensure that the integrity of the packet is maintained
before yielding it back to the IPv6 layer. The hook hence keeps track if
the SRH has been altered through the helpers, and re-validates its
content if needed with seg6_validate_srh. The state kept for validation is
stored in a per-CPU buffer. The BPF program is not allowed to directly
write into the packet, and only some fields of the SRH can be altered
through the helper bpf_lwt_seg6_store_bytes.
Performances profiling has shown that the SRH re-validation does not induce
a significant overhead. If the altered SRH is deemed as invalid, the packet
is dropped.
This validation is also done before executing any action through
bpf_lwt_seg6_action, and will not be performed again if the SRH is not
modified after calling the action.
The BPF program may return 3 types of return codes:
- BPF_OK: the End.BPF action will look up the next destination through
seg6_lookup_nexthop.
- BPF_REDIRECT: if an action has been executed through the
bpf_lwt_seg6_action helper, the BPF program should return this
value, as the skb's destination is already set and the default
lookup should not be performed.
- BPF_DROP : the packet will be dropped.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The new bpf_lwt_push_encap helper should only be accessible within the
LWT BPF IN hook, and not the OUT one, as this may lead to a skb under
panic.
At the moment, both LWT BPF IN and OUT share the same list of helpers,
whose calls are authorized by the verifier. This patch separates the
verifier ops for the IN and OUT hooks, and allows the IN hook to call the
bpf_lwt_push_encap helper.
This patch is also the occasion to put all lwt_*_func_proto functions
together for clarity. At the moment, socks_op_func_proto is in the middle
of lwt_inout_func_proto and lwt_xmit_func_proto.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The BPF seg6local hook should be powerful enough to enable users to
implement most of the use-cases one could think of. After some thinking,
we figured out that the following actions should be possible on a SRv6
packet, requiring 3 specific helpers :
- bpf_lwt_seg6_store_bytes: Modify non-sensitive fields of the SRH
- bpf_lwt_seg6_adjust_srh: Allow to grow or shrink a SRH
(to add/delete TLVs)
- bpf_lwt_seg6_action: Apply some SRv6 network programming actions
(specifically End.X, End.T, End.B6 and
End.B6.Encap)
The specifications of these helpers are provided in the patch (see
include/uapi/linux/bpf.h).
The non-sensitive fields of the SRH are the following : flags, tag and
TLVs. The other fields can not be modified, to maintain the SRH
integrity. Flags, tag and TLVs can easily be modified as their validity
can be checked afterwards via seg6_validate_srh. It is not allowed to
modify the segments directly. If one wants to add segments on the path,
he should stack a new SRH using the End.B6 action via
bpf_lwt_seg6_action.
Growing, shrinking or editing TLVs via the helpers will flag the SRH as
invalid, and it will have to be re-validated before re-entering the IPv6
layer. This flag is stored in a per-CPU buffer, along with the current
header length in bytes.
Storing the SRH len in bytes in the control block is mandatory when using
bpf_lwt_seg6_adjust_srh. The Header Ext. Length field contains the SRH
len rounded to 8 bytes (a padding TLV can be inserted to ensure the 8-bytes
boundary). When adding/deleting TLVs within the BPF program, the SRH may
temporary be in an invalid state where its length cannot be rounded to 8
bytes without remainder, hence the need to store the length in bytes
separately. The caller of the BPF program can then ensure that the SRH's
final length is valid using this value. Again, a final SRH modified by a
BPF program which doesn’t respect the 8-bytes boundary will be discarded
as it will be considered as invalid.
Finally, a fourth helper is provided, bpf_lwt_push_encap, which is
available from the LWT BPF IN hook, but not from the seg6local BPF one.
This helper allows to encapsulate a Segment Routing Header (either with
a new outer IPv6 header, or by inlining it directly in the existing IPv6
header) into a non-SRv6 packet. This helper is required if we want to
offer the possibility to dynamically encapsulate a SRH for non-SRv6 packet,
as the BPF seg6local hook only works on traffic already containing a SRH.
This is the BPF equivalent of the seg6 LWT infrastructure, which achieves
the same purpose but with a static SRH per route.
These helpers require CONFIG_IPV6=y (and not =m).
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The function lookup_nexthop is essential to implement most of the seg6local
actions. As we want to provide a BPF helper allowing to apply some of these
actions on the packet being processed, the helper should be able to call
this function, hence the need to make it public.
Moreover, if one argument is incorrect or if the next hop can not be found,
an error should be returned by the BPF helper so the BPF program can adapt
its processing of the packet (return an error, properly force the drop,
...). This patch hence makes this function return dst->error to indicate a
possible error.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Acked-by: David Lebrun <dlebrun@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
include/net/seg6.h cannot be included in a source file if CONFIG_IPV6 is
not enabled:
include/net/seg6.h: In function 'seg6_pernet':
>> include/net/seg6.h:52:14: error: 'struct net' has no member named
'ipv6'; did you mean 'ipv4'?
return net->ipv6.seg6_data;
^~~~
ipv4
This commit makes seg6_pernet return NULL if IPv6 is not compiled, hence
allowing seg6.h to be included regardless of the configuration.
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).
To reproduce the issue: mount a filesystem with -o compress-force=zstd
and the following program will encounter return value of 1 from
ftruncate:
int main(void) {
char buf[256] = { 0 };
int ret;
int fd;
fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
if (fd == -1) {
perror("open");
return EXIT_FAILURE;
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
perror("write");
close(fd);
return EXIT_FAILURE;
}
if (fsync(fd) == -1) {
perror("fsync");
close(fd);
return EXIT_FAILURE;
}
ret = ftruncate(fd, 128);
if (ret) {
printf("ftruncate() returned %d\n", ret);
close(fd);
return EXIT_FAILURE;
}
close(fd);
return EXIT_SUCCESS;
}
Fixes: ddfae63cc8e0 ("btrfs: move btrfs_truncate_block out of trans handle")
CC: stable@vger.kernel.org # 4.15+
Reported-by: Jun Wu <quark@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Sandipan Das says:
====================
[1] Support for bpf-to-bpf function calls in the powerpc64 JIT compiler.
[2] Provide a way for resolving function calls because of the way JITed
images are allocated in powerpc64.
[3] Fix to get JITed instruction dumps for multi-function programs from
the bpf system call.
[4] Fix for bpftool to show delimited multi-function JITed image dumps.
v4:
- Incorporate review comments from Jakub.
- Fix JSON output for bpftool.
v3:
- Change base tree tag to bpf-next.
- Incorporate review comments from Alexei, Daniel and Jakub.
- Make sure that the JITed image does not grow or shrink after
the last pass due to the way the instruction sequence used
to load a callee's address maybe optimized.
- Make additional changes to the bpf system call and bpftool to
make multi-function JITed dumps easier to correlate.
v2:
- Incorporate review comments from Jakub.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This splits up the contiguous JITed dump obtained via the bpf
system call into more relatable chunks for each function in
the program. If the kernel symbols corresponding to these are
known, they are printed in the header for each JIT image dump
otherwise the masked start address is printed.
Before applying this patch:
# bpftool prog dump jited id 1
0: push %rbp
1: mov %rsp,%rbp
...
70: leaveq
71: retq
72: push %rbp
73: mov %rsp,%rbp
...
dd: leaveq
de: retq
# bpftool -p prog dump jited id 1
[{
"pc": "0x0",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0x71",
"operation": "retq",
"operands": [null
]
},{
"pc": "0x72",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0xde",
"operation": "retq",
"operands": [null
]
}
]
After applying this patch:
# echo 0 > /proc/sys/net/core/bpf_jit_kallsyms
# bpftool prog dump jited id 1
0xffffffffc02c7000:
0: push %rbp
1: mov %rsp,%rbp
...
70: leaveq
71: retq
0xffffffffc02cf000:
0: push %rbp
1: mov %rsp,%rbp
...
6b: leaveq
6c: retq
# bpftool -p prog dump jited id 1
[{
"name": "0xffffffffc02c7000",
"insns": [{
"pc": "0x0",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0x71",
"operation": "retq",
"operands": [null
]
}
]
},{
"name": "0xffffffffc02cf000",
"insns": [{
"pc": "0x0",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0x6c",
"operation": "retq",
"operands": [null
]
}
]
}
]
# echo 1 > /proc/sys/net/core/bpf_jit_kallsyms
# bpftool prog dump jited id 1
bpf_prog_b811aab41a39ad3d_foo:
0: push %rbp
1: mov %rsp,%rbp
...
70: leaveq
71: retq
bpf_prog_cf418ac8b67bebd9_F:
0: push %rbp
1: mov %rsp,%rbp
...
6b: leaveq
6c: retq
# bpftool -p prog dump jited id 1
[{
"name": "bpf_prog_b811aab41a39ad3d_foo",
"insns": [{
"pc": "0x0",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0x71",
"operation": "retq",
"operands": [null
]
}
]
},{
"name": "bpf_prog_cf418ac8b67bebd9_F",
"insns": [{
"pc": "0x0",
"operation": "push",
"operands": ["%rbp"
]
},{
...
},{
"pc": "0x6c",
"operation": "retq",
"operands": [null
]
}
]
}
]
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Syncing the bpf.h uapi header with tools so that struct
bpf_prog_info has the two new fields for passing on the
JITed image lengths of each function in a multi-function
program.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This adds new two new fields to struct bpf_prog_info. For
multi-function programs, these fields can be used to pass
a list of the JITed image lengths of each function for a
given program to userspace using the bpf system call with
the BPF_OBJ_GET_INFO_BY_FD command.
This can be used by userspace applications like bpftool
to split up the contiguous JITed dump, also obtained via
the system call, into more relatable chunks corresponding
to each function.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Currently, for multi-function programs, we cannot get the JITed
instructions using the bpf system call's BPF_OBJ_GET_INFO_BY_FD
command. Because of this, userspace tools such as bpftool fail
to identify a multi-function program as being JITed or not.
With the JIT enabled and the test program running, this can be
verified as follows:
# cat /proc/sys/net/core/bpf_jit_enable
1
Before applying this patch:
# bpftool prog list
1: kprobe name foo tag b811aab41a39ad3d gpl
loaded_at 2018-05-16T11:43:38+0530 uid 0
xlated 216B not jited memlock 65536B
...
# bpftool prog dump jited id 1
no instructions returned
After applying this patch:
# bpftool prog list
1: kprobe name foo tag b811aab41a39ad3d gpl
loaded_at 2018-05-16T12:13:01+0530 uid 0
xlated 216B jited 308B memlock 65536B
...
# bpftool prog dump jited id 1
0: nop
4: nop
8: mflr r0
c: std r0,16(r1)
10: stdu r1,-112(r1)
14: std r31,104(r1)
18: addi r31,r1,48
1c: li r3,10
...
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Currently, we resolve the callee's address for a JITed function
call by using the imm field of the call instruction as an offset
from __bpf_call_base. If bpf_jit_kallsyms is enabled, we further
use this address to get the callee's kernel symbol's name.
For some architectures, such as powerpc64, the imm field is not
large enough to hold this offset. So, instead of assigning this
offset to the imm field, the verifier now assigns the subprog
id. Also, a list of kernel symbol addresses for all the JITed
functions is provided in the program info. We now use the imm
field as an index for this list to lookup a callee's symbol's
address and resolve its name.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Syncing the bpf.h uapi header with tools so that struct
bpf_prog_info has the two new fields for passing on the
addresses of the kernel symbols corresponding to each
function in a program.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This adds new two new fields to struct bpf_prog_info. For
multi-function programs, these fields can be used to pass
a list of kernel symbol addresses for all functions in a
given program to userspace using the bpf system call with
the BPF_OBJ_GET_INFO_BY_FD command.
When bpf_jit_kallsyms is enabled, we can get the address
of the corresponding kernel symbol for a callee function
and resolve the symbol's name. The address is determined
by adding the value of the call instruction's imm field
to __bpf_call_base. This offset gets assigned to the imm
field by the verifier.
For some architectures, such as powerpc64, the imm field
is not large enough to hold this offset.
We resolve this by:
[1] Assigning the subprog id to the imm field of a call
instruction in the verifier instead of the offset of
the callee's symbol's address from __bpf_call_base.
[2] Determining the address of a callee's corresponding
symbol by using the imm field as an index for the
list of kernel symbol addresses now available from
the program info.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
This adds support for bpf-to-bpf function calls in the powerpc64
JIT compiler. The JIT compiler converts the bpf call instructions
to native branch instructions. After a round of the usual passes,
the start addresses of the JITed images for the callee functions
are known. Finally, to fixup the branch target addresses, we need
to perform an extra pass.
Because of the address range in which JITed images are allocated
on powerpc64, the offsets of the start addresses of these images
from __bpf_call_base are as large as 64 bits. So, for a function
call, we cannot use the imm field of the instruction to determine
the callee's address. Instead, we use the alternative method of
getting it from the list of function addresses in the auxiliary
data of the caller by using the off field as an index.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
For multi-function programs, loading the address of a callee
function to a register requires emitting instructions whose
count varies from one to five depending on the nature of the
address.
Since we come to know of the callee's address only before the
extra pass, the number of instructions required to load this
address may vary from what was previously generated. This can
make the JITed image grow or shrink.
To avoid this, we should generate a constant five-instruction
when loading function addresses by padding the optimized load
sequence with NOPs.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The imm field of a bpf instruction is a signed 32-bit integer.
For JITed bpf-to-bpf function calls, it holds the offset of the
start address of the callee's JITed image from __bpf_call_base.
For some architectures, such as powerpc64, this offset may be
as large as 64 bits and cannot be accomodated in the imm field
without truncation.
We resolve this by:
[1] Additionally using the auxiliary data of each function to
keep a list of start addresses of the JITed images for all
functions determined by the verifier.
[2] Retaining the subprog id inside the off field of the call
instructions and using it to index into the list mentioned
above and lookup the callee's address.
To make sure that the existing JIT compilers continue to work
without requiring changes, we keep the imm field as it is.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Sparse warning:
kernel/bpf/btf.c:1985:34: warning: Variable length array is used.
This patch directly uses ARRAY_SIZE().
Fixes: f80442a4cd18 ("bpf: btf: Change how section is supported in btf_header")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
When posted work request, it need to compute the length of
all sges of every wr and fill it into the msg_len field of
send wqe. Thus, While posting multiple wr,
tmp_len should be reinitialized to zero.
Fixes: 8b9b8d143b46 ("RDMA/hns: Fix the endian problem for hns")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
When use cq record db for kernel, it needs to set the hr_cq->db_en
to 1 and configure the dma address of record cq db of qp context.
Fixes: 86188a8810ed ("RDMA/hns: Support cq record doorbell for kernel space")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The err pointer comes from uverbs_attr_get, not from the uobject member,
which does not store an ERR_PTR.
Fixes: be934cca9e98 ("IB/uverbs: Add device memory registration ioctl support")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
|
|
Each user_context receives a separate dpi value and thus a different
address on the doorbell bar. The qedr_mmap function needs to validate
the address and map the doorbell bar accordingly.
The current implementation always checked against dpi=0 doorbell range
leading to a wrong mapping for doorbell bar. (It entered an else case
that mapped the address differently). qedr_mmap should only be used
for doorbells, so the else was actually wrong in the first place.
This only has an affect on arm architecture and not an issue on a
x86 based architecture.
This lead to doorbells not occurring on arm based systems and left
applications that use more than one dpi (or several applications
run simultaneously ) to hang.
Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree, they are:
1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.
2) Add support for map lookups to numgen, random and hash expressions,
from Laura Garcia.
3) Allow to register nat hooks for iptables and nftables at the same
time. Patchset from Florian Westpha.
4) Timeout support for rbtree sets.
5) ip6_rpfilter works needs interface for link-local addresses, from
Vincent Bernat.
6) Add nf_ct_hook and nf_nat_hook structures and use them.
7) Do not drop packets on packets raceing to insert conntrack entries
into hashes, this is particularly a problem in nfqueue setups.
8) Address fallout from xt_osf separation to nf_osf, patches
from Florian Westphal and Fernando Mancera.
9) Remove reference to struct nft_af_info, which doesn't exist anymore.
From Taehee Yoo.
This batch comes with is a conflict between 25fd386e0bc0 ("netfilter:
core: add missing __rcu annotation") in your tree and 2c205dd3981f
("netfilter: add struct nf_nat_hook and use it") coming in this batch.
This conflict can be solved by leaving the __rcu tag on
__netfilter_net_init() - added by 25fd386e0bc0 - and remove all code
related to nf_nat_decode_session_hook - which is gone after
2c205dd3981f, as described by:
diff --cc net/netfilter/core.c
index e0ae4aae96f5,206fb2c4c319..168af54db975
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
#endif /* CONFIG_NF_CONNTRACK */
- static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
-#ifdef CONFIG_NF_NAT_NEEDED
-void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
-EXPORT_SYMBOL(nf_nat_decode_session_hook);
-#endif
-
+ static void __net_init
+ __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
{
int h;
I can also merge your net-next tree into nf-next, solve the conflict and
resend the pull request if you prefer so.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tom Lendacky says:
====================
amd-xgbe: AMD XGBE driver updates 2018-05-21
The following updates are included in this driver update series:
- Fix the debug output for the max channels count
- Read (once) and save the port property registers during probe
- Remove the use of the comm_owned field
- Remove unused SFP diagnostic support indicator field
- Add ethtool --module-info support
- Add ethtool --show-ring/--set-ring support
- Update the driver in preparation for ethtool --set-channels support
- Add ethtool --show-channels/--set-channels support
- Update the driver to always perform link training in KR mode
- Advertise FEC support when using a KR re-driver
- Update the BelFuse quirk to now support SGMII
- Improve 100Mbps auto-negotiation for BelFuse parts
This patch series is based on net-next.
---
Changes since v1:
- Update the --set-channels support to the use of the combined, rx and
tx options as specified in the ethtool man page (in other words, don't
create combined channels based on the min of the tx and rx channels
specified).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After changing speed to 100Mbps as a result of auto-negotiation (AN),
some 10/100/1000Mbps SFPs indicate a successful link (no faults or loss
of signal), but cannot successfully transmit or receive data. These
SFPs required an extra auto-negotiation (AN) after the speed change in
order to operate properly. Add a quirk for these SFPs so that if the
outcome of the AN actually results in changing to a new speed, re-initiate
AN at that new speed.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of using a quirk to make the BelFuse 1GBT-SFP06 part look like
a 1000baseX part, program the SFP PHY to support SGMII and 10/100/1000
baseT.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a KR re-driver is present, indicate the FEC support is available
during auto-negotiation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Link training is always attempted when in KR mode, but the code is
structured to check if link training has been enabled before attempting
to perform it. Since that check will always be true, simplify the code
to always enable and start link training during KR auto-negotiation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add ethtool support to show and set the device channel configuration.
Changing the channel configuration will result in a device restart.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to support being able to dynamically set/change the number of
Rx and Tx channels, update the code to:
- Move alloc and free of device memory into callable functions
- Move setting of the real number of Rx and Tx channels to device startup
- Move mapping of the RSS channels to device startup
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add ethtool support to show and set the number of the Rx and Tx ring
descriptors. Changing the ring configuration will result in a device
restart.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support to get SFP module information using ethtool.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver currently sets an indication of whether the SFP supports, and
that the driver can obtain, diagnostics data. This isn't currently used
by the driver and the logic to set this indicator is flawed because the
field is cleared each time the SFP is checked and only set when a new SFP
is detected. Remove this field and the logic supporting it.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The comm_owned field can hide logic where double locking is attempted
and prevent multiple threads for the same device from accessing the
mutex properly. Remove the comm_owned field and use the mutex API
exclusively for gaining ownership. The current driver has been audited
and is obtaining communications ownership properly.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Read and save the port property registers once during the device probe
and then use the saved values as they are needed.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A debug output print statement uses the wrong variable to output the
maximum Rx channel count (cut and paste error, basically). Fix the
statement to use the proper variable.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ursula Braun says:
====================
patches 2018-05-23
here are more smc-patches for net-next:
Patch 1 fixes an ioctl problem detected by syzbot.
Patch 2 improves smc_lgr_list locking in case of abnormal link
group termination. If you want to receive a version for the net-tree,
please let me know. It would look somewhat different, since the port
terminate code has been moved to smc_core.c on net-next.
Patch 3 enables SMC to deal with urgent data.
Patch 4 is a minor improvement to avoid out-of-sync linkgroups
between 2 peers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Client link group creation always follows the server linkgroup creation.
If peer creates a new server link group, client has to create a new
client link group. If peer reuses a server link group for a new
connection, client has to reuse its client link group as well. To
avoid out-of-sync conditions for link groups a longer delay for
for client link group removal is defined to make sure this link group
still exists, once the peer decides to reuse a server link group.
Currently the client link group delay time is just 10 jiffies larger
than the server link group delay time. This patch increases the delay
difference to 10 seconds to have a better protection against
out-of-sync link groups.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for out of band data send and receive.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, smc_port_terminate() is not holding the lock of the lgr list
while it is traversing the list. This patch adds locking to this
function and changes smc_lgr_terminate() accordingly.
Signed-off-by: Hans Wippel <hwippel@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A connected SMC-socket contains addresses of descriptors for the
send buffer and the rmb (receive buffer). Fields of these descriptors
are used to determine the answer for certain ioctl requests.
Add extra handling for unconnected SMC socket states without valid
buffer descriptor addresses.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Reported-by: syzbot+e6714328fda813fc670f@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
trigger an L1 configure operation when a transceiver module
is inserted in order to cause current "sticky" options like
Requested Forward Error Correction to be reapplied.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
MDI Port Capabilities bit definitions were inconsistent with
regard to the MDI enum values. 2 bits used to define MDI in
the port capabilities are not really separable, it's a 2-bit
field with 4 different values. Change the port capability bit
definitions to be "AUTO" and "STRAIGHT" in order to get them
to line up with the enum's.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
For this round, we have various things all over the place, notably
* a fix for a race in aggregation, which I want to let
bake for a bit longer before sending to stable
* some new statistics (ACK RSSI, TXQ)
* TXQ configuration
* preparations for HE, particularly radiotap
* replace confusing "country IE" by "country element" since it's
not referring to Ireland
Note that I merged net-next to get a fix from mac80211 that got
there via net, to apply one patch that would otherwise conflict.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.
This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):
[ 154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[ 154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G I
[ 154.675856] -----------------------------------------------------
[ 154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[ 154.700927]
and this task is already holding:
[ 154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[ 154.718028] which would create a new lock dependency:
[ 154.723705] (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[ 154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 154.740798] (&(&cq->lock)->rlock){..-.}
[ 154.740800]
... which became SOFTIRQ-irq-safe at:
[ 154.752163] _raw_spin_lock_irqsave+0x3e/0x50
[ 154.757163] mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[ 154.762554] ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[ 154.815603] (&(&qp_table->lock)->rlock){+.+.}
[ 154.815604]
... which became SOFTIRQ-irq-unsafe at:
[ 154.827718] ...
[ 154.827720] _raw_spin_lock+0x35/0x50
[ 154.833912] mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[ 154.839302] mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]
Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.
Fixes: 6dc06c08bef1 ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michal Vokáč says:
====================
Add support for QCA8334 switch
This series basically adds support for a QCA8334 ethernet switch to the
qca8k driver. It is a four-port variant of the already supported seven
port QCA8337. Register map is the same for the whole familly and all chips
have the same device ID.
Major part of this series enhances the CPU port setting. Currently the CPU
port is not set to any sensible defaults compatible with the xGMII
interface. This series forces the CPU port to its maximum bandwidth and
also allows to adjust the new defaults using fixed-link device tree
sub-node.
Alongside these changes I fixed two checkpatch warnings regarding SPDX and
redundant parentheses.
Changes in v3:
- Rebased on latest net-next/master.
- Corrected fixed-link documentation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix warning reported by checkpatch.
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace the GPLv2 license boilerplate with the SPDX license identifier.
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement adjust_link function that allows to overwrite default CPU port
setting using fixed-link device tree subnode.
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
By default autonegotiation is enabled to configure MAC on all ports.
For the CPU port autonegotiation can not be used so we need to set
some sensible defaults manually.
This patch forces the default setting of the CPU port to 1000Mbps/full
duplex which is the chip maximum capability.
Also correct size of the bit field used to configure link speed.
Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family")
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|