Age | Commit message (Collapse) | Author |
|
Coprocessor context offsets are used by the assembly code that moves
coprocessor context between the individual fields of the
thread_info::xtregs_cp structure and coprocessor registers.
This fixes coprocessor context clobbering on flushing and reloading
during normal user code execution and user process debugging in the
presence of more than one coprocessor in the core configuration.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
|
|
coprocessor_flush_all may be called from a context of a thread that is
different from the thread being flushed. In that case contents of the
cpenable special register may not match ti->cpenable of the target
thread, resulting in unhandled coprocessor exception in the kernel
context.
Set cpenable special register to the ti->cpenable of the target register
for the duration of the flush and restore it afterwards.
This fixes the following crash caused by coprocessor register inspection
in native gdb:
(gdb) p/x $w0
Illegal instruction in kernel: sig: 9 [#1] PREEMPT
Call Trace:
___might_sleep+0x184/0x1a4
__might_sleep+0x41/0xac
exit_signals+0x14/0x218
do_exit+0xc9/0x8b8
die+0x99/0xa0
do_illegal_instruction+0x18/0x6c
common_exception+0x77/0x77
coprocessor_flush+0x16/0x3c
arch_ptrace+0x46c/0x674
sys_ptrace+0x2ce/0x3b4
system_call+0x54/0x80
common_exception+0x77/0x77
note: gdb[100] exited with preempt_count 1
Killed
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
|
|
The numa_slit variable used by node_distance is available to a
module as long as it is linked at compile-time. However, it is
not available to loadable modules. Leading to errors such as:
ERROR: "numa_slit" [drivers/nvme/host/nvme-core.ko] undefined!
The error above is caused by the nvme multipath code that makes
use of node_distance for its path calculation. When the patch was
added, the lightnvm subsystem would select nvme and always compile
it in, leading to the node_distance call to always succeed.
However, when this requirement was removed, nvme could be compiled
in as a module, which exposed this bug.
This patch extracts node_distance to a function and exports it.
Since ACPI is depending on node_distance being a simple lookup to
numa_slit, the previous behavior is exposed as slit_distance and its
users updated.
Fixes: f333444708f82 "nvme: take node locality into account when selecting a path"
Fixes: 73569e11032f "lightnvm: remove dependencies on BLK_DEV_NVME and PCI"
Signed-off-by: Matias Bjøring <mb@lightnvm.io>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Make the high-level BPF JIT entry a general 'catch-all' and add
architecture specific entries to make it more clear who actively
maintains which BPF JIT compiler. The list (L) address implies
that this eventually lands in the bpf patchwork bucket. Goal is
that this set of responsible developers listed here is always up
to date and a point of contact for helping out in e.g. feature
development, fixes, review or testing patches in order to help
long-term in ensuring quality of the BPF JITs and therefore BPF
core under a given architecture. Every new JIT in future /must/
have an entry here as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
Acked-by: Paul Burton <paul.burton@mips.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Wang YanQing <udknight@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Yonghong Song says:
====================
Commit 838e96904ff3 ("bpf: Introduce bpf_func_info")
added bpf func info support. The userspace is able
to get better ksym's for bpf programs with jit, and
is able to print out func prototypes.
For a program containing func-to-func calls, the existing
implementation returns user specified number of function
calls and BTF types if jit is enabled. If the jit is not
enabled, it only returns the type for the main function.
This is undesirable. Interpreter may still be used
and we should keep feature identical regardless of
whether jit is enabled or not.
This patch fixed this discrepancy.
The following example shows bpftool output for
the bpf program in selftests test_btf_haskv.o when jit
is disabled:
$ bpftool prog dump xlated id 1490
int _dummy_tracepoint(struct dummy_tracepoint_args * arg):
0: (85) call pc+2#__bpf_prog_run_args32
1: (b7) r0 = 0
2: (95) exit
int test_long_fname_1(struct dummy_tracepoint_args * arg):
3: (85) call pc+1#__bpf_prog_run_args32
4: (95) exit
int test_long_fname_2(struct dummy_tracepoint_args * arg):
5: (b7) r2 = 0
6: (63) *(u32 *)(r10 -4) = r2
7: (79) r1 = *(u64 *)(r1 +8)
8: (15) if r1 == 0x0 goto pc+9
9: (bf) r2 = r10
10: (07) r2 += -4
11: (18) r1 = map[id:1173]
13: (85) call bpf_map_lookup_elem#77088
14: (15) if r0 == 0x0 goto pc+3
15: (61) r1 = *(u32 *)(r0 +4)
16: (07) r1 += 1
17: (63) *(u32 *)(r0 +4) = r1
18: (95) exit
$ bpftool prog dump jited id 1490
no instructions returned
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The selftest test_btf is changed to test both jit and non-jit.
The test result should be the same regardless of whether jit
is enabled or not.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 838e96904ff3 ("bpf: Introduce bpf_func_info")
added bpf func info support. The userspace is able
to get better ksym's for bpf programs with jit, and
is able to print out func prototypes.
For a program containing func-to-func calls, the existing
implementation returns user specified number of function
calls and BTF types if jit is enabled. If the jit is not
enabled, it only returns the type for the main function.
This is undesirable. Interpreter may still be used
and we should keep feature identical regardless of
whether jit is enabled or not.
This patch fixed this discrepancy.
Fixes: 838e96904ff3 ("bpf: Introduce bpf_func_info")
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
We need to initialize the frame pointer register not just if it is
seen as a source operand, but also if it is seen as the destination
operand of a store or an atomic instruction (which effectively is a
source operand).
This is exercised by test_verifier's "non-invalid fp arithmetic"
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
On T4 and later sparc64 cpus we can use the fused compare and branch
instruction.
However, it can only be used if the branch destination is in the range
of a signed 10-bit immediate offset. This amounts to 1024
instructions forwards or backwards.
After the commit referenced in the Fixes: tag, the largest possible
size program seen by the JIT explodes by a significant factor.
As a result of this convergance takes many more passes since the
expanded "BPF_LDX | BPF_MSH | BPF_B" code sequence, for example,
contains several embedded branch on condition instructions.
On each pass, as suddenly new fused compare and branch instances
become valid, this makes thousands more in range for the next pass.
And so on and so forth.
This is most greatly exemplified by "BPF_MAXINSNS: exec all MSH" which
takes 35 passes to converge, and shrinks the image by about 64K.
To decrease the cost of this number of convergance passes, do the
convergance pass before we have the program image allocated, just like
other JITs (such as x86) do.
Fixes: e0cea7ce988c ("bpf: implement ld_abs/ld_ind in native bpf")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Daniel Borkmann says:
====================
This set contains a fix for arm64 BPF JIT. First patch generalizes
ppc64 way of retrieving subprog into bpf_jit_get_func_addr() as core
code and uses the same on arm64 in second patch. Tested on both arm64
and ppc64.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF
to BPF call offset can be too far away from core kernel in that relative
encoding into imm is not sufficient and could potentially be truncated,
see also fd045f6cd98e ("arm64: add support for module PLTs") which adds
spill-over space for module_alloc() and therefore bpf_jit_binary_alloc().
Therefore, use the recently added bpf_jit_get_func_addr() helper for
properly fetching the address through prog->aux->func[off]->bpf_func
instead. This also has the benefit to optimize normal helper calls since
their address can use the optimized emission. Tested on Cavium ThunderX
CN8890.
Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Make fetching of the BPF call address from ppc64 JIT generic. ppc64
was using a slightly different variant rather than through the insns'
imm field encoding as the target address would not fit into that space.
Therefore, the target subprog number was encoded into the insns' offset
and fetched through fp->aux->func[off]->bpf_func instead. Given there
are other JITs with this issue and the mechanism of fetching the address
is JIT-generic, move it into the core as a helper instead. On the JIT
side, we get information on whether the retrieved address is a fixed
one, that is, not changing through JIT passes, or a dynamic one. For
the former, JITs can optimize their imm emission because this doesn't
change jump offsets throughout JIT process.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
All lists that reach the tree_nodes_free() function have both zero
counter and true dead flag. The reason for this is that lists to be
release are selected by nf_conncount_gc_list() which already decrements
the list counter and sets on the dead flag. Therefore, this if statement
in tree_nodes_free() is unnecessary and wrong.
Fixes: 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
There is a reference counter to ensure that masquerade modules register
notifiers only once. However, the existing reference counter approach is
not safe, test commands are:
while :
do
modprobe ip6t_MASQUERADE &
modprobe nft_masq_ipv6 &
modprobe -rv ip6t_MASQUERADE &
modprobe -rv nft_masq_ipv6 &
done
numbers below represent the reference counter.
--------------------------------------------------------
CPU0 CPU1 CPU2 CPU3 CPU4
[insmod] [insmod] [rmmod] [rmmod] [insmod]
--------------------------------------------------------
0->1
register 1->2
returns 2->1
returns 1->0
0->1
register <--
unregister
--------------------------------------------------------
The unregistation of CPU3 should be processed before the
registration of CPU4.
In order to fix this, use a mutex instead of reference counter.
splat looks like:
[ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
[ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
[ 323.869574] irq event stamp: 194074
[ 323.898930] hardirqs last enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c
[ 323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 323.898930] softirqs last enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b
[ 323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0
[ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
[ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
[ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488
[ 323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000
[ 323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0
[ 323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001
[ 323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000
[ 323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0
[ 323.898930] FS: 00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[ 323.898930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0
[ 323.898930] Call Trace:
[ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0
[ 323.898930] ? down_read+0x150/0x150
[ 323.898930] ? sched_clock_cpu+0x126/0x170
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] register_netdevice_notifier+0xbb/0x790
[ 323.898930] ? __dev_close_many+0x2d0/0x2d0
[ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740
[ 323.898930] ? wait_for_completion+0x710/0x710
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? up_write+0x6c/0x210
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables]
[ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables]
[ ... ]
Fixes: 8dd33cc93ec9 ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables")
Fixes: be6b635cd674 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
register_{netdevice/inetaddr/inet6addr}_notifier may return an error
value, this patch adds the code to handle these error paths.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When ip6_route_me_harder is invoked, it resets outgoing interface of:
- link-local scoped packets sent by neighbor discovery
- multicast packets sent by MLD host
- multicast packets send by MLD proxy daemon that sets outgoing
interface through IPV6_PKTINFO ipi6_ifindex
Link-local and multicast packets must keep their original oif after
ip6_route_me_harder is called.
Signed-off-by: Alin Nastac <alin.nastac@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
'offset' is constant and if it is zero, no need to subtract it
from BPF_REG_TMP.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-11-26
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Extend BTF to support function call types and improve the BPF
symbol handling with this info for kallsyms and bpftool program
dump to make debugging easier, from Martin and Yonghong.
2) Optimize LPM lookups by making longest_prefix_match() handle
multiple bytes at a time, from Eric.
3) Adds support for loading and attaching flow dissector BPF progs
from bpftool, from Stanislav.
4) Extend the sk_lookup() helper to be supported from XDP, from Nitin.
5) Enable verifier to support narrow context loads with offset > 0
to adapt to LLVM code generation (currently only offset of 0 was
supported). Add test cases as well, from Andrey.
6) Simplify passing device functions for offloaded BPF progs by
adding callbacks to bpf_prog_offload_ops instead of ndo_bpf.
Also convert nfp and netdevsim to make use of them, from Quentin.
7) Add support for sock_ops based BPF programs to send events to
the perf ring-buffer through perf_event_output helper, from
Sowmini and Daniel.
8) Add read / write support for skb->tstamp from tc BPF and cg BPF
programs to allow for supporting rate-limiting in EDT qdiscs
like fq from BPF side, from Vlad.
9) Extend libbpf API to support map in map types and add test cases
for it as well to BPF kselftests, from Nikita.
10) Account the maximum packet offset accessed by a BPF program in
the verifier and use it for optimizing nfp JIT, from Jiong.
11) Fix error handling regarding kprobe_events in BPF sample loader,
from Daniel T.
12) Add support for queue and stack map type in bpftool, from David.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- fix temp4_type attribute permissions in w83795 driver
- fix tacho fault detection in mlxreg-fan driver
- fix current value calculations in ina2xx driver
- fix initial notification/warning in raspberrypi driver
- fix a NULL pointer access in ina2xx
* tag 'hwmon-for-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (w83795) temp4_type has writable permission
hwmon: (mlxreg-fan) Fix macros for tacho fault reading
hwmon: (ina2xx) Fix current value calculation
hwmon: (raspberrypi) Fix initial notify
hwmon (ina2xx) Fix NULL id pointer in probe()
|
|
syzbot was able to trigger the WARN in cttimeout_default_get() by
passing UDPLITE as l4protocol. Alias UDPLITE to UDP, both use
same timeout values.
Furthermore, also fetch GRE timeouts. GRE is a bit more complicated,
as it still can be a module and its netns_proto_gre struct layout isn't
visible outside of the gre module. Can't move timeouts around, it
appears conntrack sysctl unregister assumes net_generic() returns
nf_proto_net, so we get crash. Expose layout of netns_proto_gre instead.
A followup nf-next patch could make gre tracker be built-in as well
if needed, its not that large.
Last, make the WARN() mention the missing protocol value in case
anything else is missing.
Reported-by: syzbot+2fae8fa157dd92618cae@syzkaller.appspotmail.com
Fixes: 8866df9264a3 ("netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
ip_vs_dst_event is supposed to clean up all dst used in ipvs'
destinations when a net dev is going down. But it works only
when the dst's dev is the same as the dev from the event.
Now with the same priority but late registration,
ip_vs_dst_notifier is always called later than ipv6_dev_notf
where the dst's dev is set to lo for NETDEV_DOWN event.
As the dst's dev lo is not the same as the dev from the event
in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
Also as these dst have to wait for dest_trash_timer to clean
them up. It would cause some non-permanent kernel warnings:
unregister_netdevice: waiting for br0 to become free. Usage count = 3
To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.
Note that for ipv4 route fib_netdev_notifier doesn't set dst's
dev to lo in NETDEV_DOWN event, so this fix is only needed when
IP_VS_IPV6 is defined.
Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-11-25
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix an off-by-one bug when adjusting subprog start offsets after
patching, from Edward.
2) Fix several bugs such as overflow in size allocation in queue /
stack map creation, from Alexei.
3) Fix wrong IPv6 destination port byte order in bpf_sk_lookup_udp
helper, from Andrey.
4) Fix several bugs in bpftool such as preventing an infinite loop
in get_fdinfo, error handling and man page references, from Quentin.
5) Fix a warning in bpf_trace_printk() that wasn't catching an
invalid format string, from Martynas.
6) Fix a bug in BPF cgroup local storage where non-atomic allocation
was used in atomic context, from Roman.
7) Fix a NULL pointer dereference bug in bpftool from reallocarray()
error handling, from Jakub and Wen.
8) Add a copy of pkt_cls.h and tc_bpf.h uapi headers to the tools
include infrastructure so that bpftool compiles on older RHEL7-like
user space which does not ship these headers, from Yonghong.
9) Fix BPF kselftests for user space where to get ping test working
with ping6 and ping -6, from Li.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make the formatting for map_type_name array consistent.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
There is a spelling mistake in a btf_verifier_log_member message,
fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Building tags produces warning:
ctags: Warning: kernel/bpf/local_storage.c:10: null expansion of name pattern "\1"
Let's use the same fix as in commit 25528213fe9f ("tags: Fix DEFINE_PER_CPU
expansions"), even though it violates the usual code style.
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
|
|
I do not see how one can effectively use skb_insert() without holding
some kind of lock. Otherwise other cpus could have changed the list
right before we have a chance of acquiring list->lock.
Only existing user is in drivers/infiniband/hw/nes/nes_mgt.c and this
one probably meant to use __skb_insert() since it appears nesqp->pau_list
is protected by nesqp->pau_lock. This looks like nesqp->pau_lock
could be removed, since nesqp->pau_list.lock could be used instead.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma <linux-rdma@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A recent change added a null check on p->dev after p->dev was being
dereferenced by the ns_capable check on p->dev. It turns out that
neither the p->dev and p->br null checks are necessary, and can be
removed, which cleans up a static analyis warning.
As Nikolay Aleksandrov noted, these checks can be removed because:
"My reasoning of why it shouldn't be possible:
- On port add new_nbp() sets both p->dev and p->br before creating
kobj/sysfs
- On port del (trickier) del_nbp() calls kobject_del() before call_rcu()
to destroy the port which in turn calls sysfs_remove_dir() which uses
kernfs_remove() which deactivates (shouldn't be able to open new
files) and calls kernfs_drain() to drain current open/mmaped files in
the respective dir before continuing, thus making it impossible to
open a bridge port sysfs file with p->dev and p->br equal to NULL.
So I think it's safe to remove those checks altogether. It'd be nice to
get a second look over my reasoning as I might be missing something in
sysfs/kernfs call path."
Thanks to Nikolay Aleksandrov's suggestion to remove the check and
David Miller for sanity checking this.
Detected by CoverityScan, CID#751490 ("Dereference before null check")
Fixes: a5f3ea54f3cc ("net: bridge: add support for raw sysfs port options")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Heiner Kallweit says:
====================
r8169: make use of xmit_more and __netdev_sent_queue
This series adds helper __netdev_sent_queue to the core and makes use
of it in the r8169 driver.
Heiner Kallweit (2):
net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
r8169: make use of xmit_more and __netdev_sent_queue
v2:
- fix minor style issue
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make use of xmit_more and add the functionality introduced with
3e59020abf0f ("net: bql: add __netdev_tx_sent_queue()").
I used the mlx4 driver as template.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to netdev_sent_queue add helper __netdev_sent_queue as variant
of __netdev_tx_sent_queue.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
PPC KVM fixes for 4.20
This has a single 1-line patch which fixes a bug in the recently-merged
nested HV KVM support.
|
|
Pull dma-mapping fixes from Christoph Hellwig:
"Two dma-direct / swiotlb regressions fixes:
- zero is a valid physical address on some arm boards, we can't use
it as the error value
- don't try to cache flush the error return value (no matter what it
is)"
* tag 'dma-mapping-4.20-3' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: Skip cache maintenance on map error
dma-direct: Make DIRECT_MAPPING_ERROR viable for SWIOTLB
|
|
Pull NFS client bugfixes from Trond Myklebust:
- Fix a NFSv4 state manager deadlock when returning a delegation
- NFSv4.2 copy do not allocate memory under the lock
- flexfiles: Use the correct stateid for IO in the tightly coupled case
* tag 'nfs-for-4.20-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
flexfiles: use per-mirror specified stateid for IO
NFSv4.2 copy do not allocate memory under the lock
NFSv4: Fix a NFSv4 state manager deadlock
|
|
I'm taking over the maintainance of Sparse so add myself as
maintainer and move Christopher's info to CREDITS.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull XArray updates from Matthew Wilcox:
"We found some bugs in the DAX conversion to XArray (and one bug which
predated the XArray conversion). There were a couple of bugs in some
of the higher-level functions, which aren't actually being called in
today's kernel, but surfaced as a result of converting existing radix
tree & IDR users over to the XArray.
Some of the other changes to how the higher-level APIs work were also
motivated by converting various users; again, they're not in use in
today's kernel, so changing them has a low probability of introducing
a bug.
Dan can still trigger a bug in the DAX code with hot-offline/online,
and we're working on tracking that down"
* tag 'xarray-4.20-rc4' of git://git.infradead.org/users/willy/linux-dax:
XArray tests: Add missing locking
dax: Avoid losing wakeup in dax_lock_mapping_entry
dax: Fix huge page faults
dax: Fix dax_unlock_mapping_entry for PMD pages
dax: Reinstate RCU protection of inode
dax: Make sure the unlocking entry isn't locked
dax: Remove optimisation from dax_lock_mapping_entry
XArray tests: Correct some 64-bit assumptions
XArray: Correct xa_store_range
XArray: Fix Documentation
XArray: Handle NULL pointers differently for allocation
XArray: Unify xa_store and __xa_store
XArray: Add xa_store_bh() and xa_store_irq()
XArray: Turn xa_erase into an exported function
XArray: Unify xa_cmpxchg and __xa_cmpxchg
XArray: Regularise xa_reserve
nilfs2: Use xa_erase_irq
XArray: Export __xa_foo to non-GPL modules
XArray: Fix xa_for_each with a single element at 0
|
|
Packet sockets with PACKET_TX_RING send skbs with user data in frags.
Before commit 5cd8d46ea156 ("packet: copy user buffers before orphan
or clone") ring slots could be released prematurely, possibly allowing
a process to overwrite data still in flight.
This test opens two packet sockets, one to send and one to read.
The sender has a tx ring of one slot. It sends two packets with
different payload, then reads both and verifies their payload.
Before the above commit, both receive calls return the same data as
the send calls use the same buffer. From the commit, the clone
needed for looping onto a packet socket triggers an skb_copy_ubufs
to create a private copy. The separate sends each arrive correctly.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In ip packet generation, pagedlen is initialized for each skb at the
start of the loop in __ip(6)_append_data, before label alloc_new_skb.
Depending on compiler options, code can be generated that jumps to
this label, triggering use of an an uninitialized variable.
In practice, at -O2, the generated code moves the initialization below
the label. But the code should not rely on that for correctness.
Fixes: 15e36f5b8e98 ("udp: paged allocation with gso")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a qdisc setup including pacing FQ is dismantled and recreated,
some TCP packets are sent earlier than instructed by TCP stack.
TCP can be fooled when ACK comes back, because the following
operation can return a negative value.
tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
Some paths in TCP stack were not dealing properly with this,
this patch addresses four of them.
Fixes: ab408b6dc744 ("tcp: switch tcp and sch_fq to new earliest departure time model")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently dev is dereferenced by the call dev_net(dev) before dev is null
checked. Fix this by null checking dev before the potential null
pointer dereference.
Detected by CoverityScan, CID#1462955 ("Dereference before null check")
Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:5883:6:
warning: variable 'multitrc' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:8585:32:
warning: variable 'speed' set but not used [-Wunused-but-set-variable]
'multitrc' never used since introduction in
commit 8e3d04fd7d70 ("cxgb4: Add MPS tracing support")
'speed' never used since introduction in
commit c3168cabe1af ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Return code should be formally "netdev_tx_t".
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- revert of the high-resolution scrolling feature, as it breaks certain
hardware due to incompatibilities between Logitech and Microsoft
worlds. Peter Hutterer is working on a fixed implementation. Until
that is finished, revert by Benjamin Tissoires.
- revert of incorrect strncpy->strlcpy conversion in uhid, from David
Herrmann
- fix for buggy sendfile() implementation on uhid device node, from
Eric Biggers
- a few assorted device-ID specific quirks
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
Revert "Input: Add the `REL_WHEEL_HI_RES` event code"
Revert "HID: input: Create a utility class for counting scroll events"
Revert "HID: logitech: Add function to enable HID++ 1.0 "scrolling acceleration""
Revert "HID: logitech: Enable high-resolution scrolling on Logitech mice"
Revert "HID: logitech: Use LDJ_DEVICE macro for existing Logitech mice"
Revert "HID: logitech: fix a used uninitialized GCC warning"
Revert "HID: input: simplify/fix high-res scroll event handling"
HID: Add quirk for Primax PIXART OEM mice
HID: i2c-hid: Disable runtime PM for LG touchscreen
HID: multitouch: Add pointstick support for Cirque Touchpad
HID: steam: remove input device when a hid client is running.
Revert "HID: uhid: use strlcpy() instead of strncpy()"
HID: uhid: forbid UHID_CREATE under KERNEL_DS or elevated privileges
HID: input: Ignore battery reported by Symbol DS4308
HID: Add quirk for Microsoft PIXART OEM mouse
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas::
- Fix wrong conflict resolution around CONFIG_ARM64_SSBD
- Fix sparse warning on unsigned long constant
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD block
arm64: sysreg: fix sparse warnings
|
|
Pull networking fixes from David Miller:
1) Need to take mutex in ath9k_add_interface(), from Dan Carpenter.
2) Fix mt76 build without CONFIG_LEDS_CLASS, from Arnd Bergmann.
3) Fix socket wmem accounting in SCTP, from Xin Long.
4) Fix failed resume crash in ena driver, from Arthur Kiyanovski.
5) qed driver passes bytes instead of bits into second arg of
bitmap_weight(). From Denis Bolotin.
6) Fix reset deadlock in ibmvnic, from Juliet Kim.
7) skb_scrube_packet() needs to scrub the fwd marks too, from Petr
Machata.
8) Make sure older TCP stacks see enough dup ACKs, and avoid doing SACK
compression during this period, from Eric Dumazet.
9) Add atomicity to SMC protocol cursor handling, from Ursula Braun.
10) Don't leave dangling error pointer if bpf_prog_add() fails in
thunderx driver, from Lorenzo Bianconi. Also, when we unmap TSO
headers, set sq->tso_hdrs to NULL.
11) Fix race condition over state variables in act_police, from Davide
Caratti.
12) Disable guest csum in the presence of XDP in virtio_net, from Jason
Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits)
net: gemini: Fix copy/paste error
net: phy: mscc: fix deadlock in vsc85xx_default_config
dt-bindings: dsa: Fix typo in "probed"
net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue
net: amd: add missing of_node_put()
team: no need to do team_notify_peers or team_mcast_rejoin when disabling port
virtio-net: fail XDP set if guest csum is negotiated
virtio-net: disable guest csum during XDP set
net/sched: act_police: add missing spinlock initialization
net: don't keep lonely packets forever in the gro hash
net/ipv6: re-do dad when interface has IFF_NOARP flag change
packet: copy user buffers before orphan or clone
ibmvnic: Update driver queues after change in ring size support
ibmvnic: Fix RX queue buffer cleanup
net: thunderx: set xdp_prog to NULL if bpf_prog_add fails
net/dim: Update DIM start sample after each DIM iteration
net: faraday: ftmac100: remove netif_running(netdev) check before disabling interrupts
net/smc: use after free fix in smc_wr_tx_put_slot()
net/smc: atomic SMCD cursor handling
net/smc: add SMC-D shutdown signal
...
|
|
Pull xfs fixes from Darrick Wong:
"Dave and I have continued our work fixing corruption problems that can
be found when running long-term burn-in exercisers on xfs. Here are
some patches fixing most of the problems, but there will likely be
more. :/
- Numerous corruption fixes for copy on write
- Numerous corruption fixes for blocksize < pagesize writes
- Don't miscalculate AG reservations for small final AGs
- Fix page cache truncation to work properly for reflink and extent
shifting
- Fix use-after-free when retrying failed inode/dquot buffer logging
- Fix corruptions seen when using copy_file_range in directio mode"
* tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: readpages doesn't zero page tail beyond EOF
vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP
iomap: dio data corruption and spurious errors when pipes fill
iomap: sub-block dio needs to zeroout beyond EOF
iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents
xfs: delalloc -> unwritten COW fork allocation can go wrong
xfs: flush removing page cache in xfs_reflink_remap_prep
xfs: extent shifting doesn't fully invalidate page cache
xfs: finobt AG reserves don't consider last AG can be a runt
xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers
xfs: uncached buffer tracing needs to print bno
xfs: make xfs_file_remap_range() static
xfs: fix shared extent data corruption due to missing cow reservation
|
|
The TX stats should be started with the tx_stats_syncp,
there seems to be a copy/paste error in the driver.
Signed-off-by: Andreas Fiedler <andreas.fiedler@gmx.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The vsc85xx_default_config function called in the vsc85xx_config_init
function which is used by VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
mistakenly calls phy_read and phy_write in-between phy_select_page and
phy_restore_page.
phy_select_page and phy_restore_page actually take and release the MDIO
bus lock and phy_write and phy_read take and release the lock to write
or read to a PHY register.
Let's fix this deadlock by using phy_modify_paged which handles
correctly a read followed by a write in a non-standard page.
Fixes: 6a0bfbbe20b0 ("net: phy: mscc: migrate to phy_select/restore_page functions")
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The correct form is "can be probed", so fix the typo.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|