Age | Commit message | Author |
|
There is a race in the TUN driver between napi_busy_loop and
napi_gro_frags. This commit resolves the race by adding the NAPI struct
via netif_tx_napi_add, which disables busy polling for the NAPI struct,
instead of netif_napi_add.
KCSAN reported:
BUG: KCSAN: data-race in gro_normal_list.part.0 / napi_busy_loop
write to 0xffff8880b5d474b0 of 4 bytes by task 11205 on cpu 0:
gro_normal_list.part.0+0x77/0xb0 net/core/dev.c:5682
gro_normal_list net/core/dev.c:5678 [inline]
gro_normal_one net/core/dev.c:5692 [inline]
napi_frags_finish net/core/dev.c:5705 [inline]
napi_gro_frags+0x625/0x770 net/core/dev.c:5778
tun_get_user+0x2150/0x26a0 drivers/net/tun.c:1976
tun_chr_write_iter+0x79/0xd0 drivers/net/tun.c:2022
call_write_iter include/linux/fs.h:1895 [inline]
do_iter_readv_writev+0x487/0x5b0 fs/read_write.c:693
do_iter_write fs/read_write.c:970 [inline]
do_iter_write+0x13b/0x3c0 fs/read_write.c:951
vfs_writev+0x118/0x1c0 fs/read_write.c:1015
do_writev+0xe3/0x250 fs/read_write.c:1058
__do_sys_writev fs/read_write.c:1131 [inline]
__se_sys_writev fs/read_write.c:1128 [inline]
__x64_sys_writev+0x4e/0x60 fs/read_write.c:1128
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9
read to 0xffff8880b5d474b0 of 4 bytes by task 11168 on cpu 1:
gro_normal_list net/core/dev.c:5678 [inline]
napi_busy_loop+0xda/0x4f0 net/core/dev.c:6126
sk_busy_loop include/net/busy_poll.h:108 [inline]
__skb_recv_udp+0x4ad/0x560 net/ipv4/udp.c:1689
udpv6_recvmsg+0x29e/0xe90 net/ipv6/udp.c:288
inet6_recvmsg+0xbb/0x240 net/ipv6/af_inet6.c:592
sock_recvmsg_nosec net/socket.c:871 [inline]
sock_recvmsg net/socket.c:889 [inline]
sock_recvmsg+0x92/0xb0 net/socket.c:885
sock_read_iter+0x15f/0x1e0 net/socket.c:967
call_read_iter include/linux/fs.h:1889 [inline]
new_sync_read+0x389/0x4f0 fs/read_write.c:414
__vfs_read+0xb1/0xc0 fs/read_write.c:427
vfs_read fs/read_write.c:461 [inline]
vfs_read+0x143/0x2c0 fs/read_write.c:446
ksys_read+0xd5/0x1b0 fs/read_write.c:587
__do_sys_read fs/read_write.c:597 [inline]
__se_sys_read fs/read_write.c:595 [inline]
__x64_sys_read+0x4c/0x60 fs/read_write.c:595
do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 11168 Comm: syz-executor.0 Not tainted 5.4.0-rc6+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 943170998b20 ("tun: enable NAPI for TUN/TAP driver")
Signed-off-by: Petar Penkov <ppenkov@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we do not plan to use pthread_join() in the server do_accept()
loop, we had better create detached threads, or risk increasing the
memory footprint over time.
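A minimal sketch of the detached-thread pattern described above, using only
standard POSIX calls. The names handle_client and spawn_detached are
hypothetical and are not taken from the tcp_mmap program.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-connection worker; stands in for the real handler. */
static void *handle_client(void *arg)
{
	(void)arg;
	return NULL;
}

/*
 * Start a worker whose resources are reclaimed automatically when it
 * exits, so the accept loop never has to call pthread_join().
 * Returns 0 on success or a pthread error number.
 */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	pthread_t tid;
	int err;

	assert(fn != NULL);

	err = pthread_attr_init(&attr);
	if (err)
		return err;

	err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
	if (!err)
		err = pthread_create(&tid, &attr, fn, arg);

	/* The attribute object is no longer needed once the thread exists. */
	pthread_attr_destroy(&attr);
	return err;
}
```

A joinable thread that is never joined keeps its stack and bookkeeping
allocated after it exits, which is the growth the message refers to.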
Fixes: 192dc405f308 ("selftests: net: add tcp_mmap program")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nla_put_u16/nla_put_u32 make sure that
*attrlen is aligned. The call tree is:
nla_put_u16/nla_put_u32
-> nla_put attrlen = sizeof(u16) or sizeof(u32)
-> __nla_put attrlen
-> __nla_reserve attrlen
-> skb_put(skb, nla_total_size(attrlen))
nla_total_size returns the total length of attribute
including padding.
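The padding arithmetic can be sketched in userspace as follows. The macros
and helpers below are local stand-ins written for this example, assuming
the usual netlink layout of a 4-byte attribute header and 4-byte alignment;
they are not the kernel definitions themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the attribute header: 16-bit length plus 16-bit type. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

static_assert(sizeof(struct attr_hdr) == 4, "header is assumed to be 4 bytes");

#define ATTR_ALIGNTO ((size_t)4)
#define ATTR_ALIGN(n) (((n) + ATTR_ALIGNTO - 1) & ~(ATTR_ALIGNTO - 1))
#define ATTR_HDRLEN ATTR_ALIGN(sizeof(struct attr_hdr))

/* Header plus payload, without trailing padding. */
static size_t attr_size(size_t payload)
{
	return ATTR_HDRLEN + payload;
}

/* Header plus payload, rounded up to the alignment boundary. */
static size_t attr_total_size(size_t payload)
{
	return ATTR_ALIGN(attr_size(payload));
}
```

Under these assumptions a u16 payload occupies 4 + 2 = 6 bytes before
padding and 8 bytes after it, the same as a u32 payload, so the space
reserved in the buffer is already aligned.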
Cc: Joe Stringer <joe@ovn.org>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver calls release_resource in remove to match request_mem_region
in probe, which is incorrect.
Fix it by using the right one, release_mem_region.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vladimir Oltean says:
====================
DSA driver for Vitesse Felix switch
This series builds upon the previous "Accomodate DSA front-end into
Ocelot" topic and does the following:
- Reworks the Ocelot (VSC7514) driver to support one more switching core
(VSC9959), used in NPI mode. Some code which was thought to be
SoC-specific (ocelot_board.c) wasn't, and vice versa, so it is being
accordingly moved.
- Exports ocelot driver structures and functions to include/soc/mscc.
- Adds a DSA ocelot front-end for VSC9959, which is a PCI device and
uses the exported ocelot functionality for hardware configuration.
- Adds a tagger driver for the Vitesse injection/extraction DSA headers.
This is known to be compatible with at least Ocelot and Felix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This supports an Ethernet switching core from Vitesse / Microsemi /
Microchip (VSC9959) which is part of the Ocelot family (a brand name),
and whose code name is Felix. The switch can be (and is) integrated on
different SoCs as a PCIe endpoint device.
The functionality is provided by the core of the Ocelot switch driver
(drivers/net/ethernet/mscc). In this regard, the current driver is an
instance of Microsemi's Ocelot core driver, with a DSA front-end. It
inherits its name from VSC9959's code name, to distinguish itself from
the switchdev ocelot driver.
The patch adds the logic for probing a PCI device and defines the
register map for the VSC9959 switch core, since it has some differences
in register addresses and bitfield mappings compared to the other Ocelot
switches (VSC7511, VSC7512, VSC7513, VSC7514).
The Felix driver declares the register map as part of the "instance
table". Currently the VSC9959 inside NXP LS1028A is the only instance,
but presumably it can support other switches in the Ocelot family, when
used in DSA mode (Linux running on the external CPU, and not on the
embedded MIPS).
In a few cases, some h/w operations have to be done differently on
VSC9959 due to missing bitfields. This is the case for the switch core
reset and init. Because for this operation Ocelot uses some bits that
are not present on Felix, the latter has to use a register from the
global registers block (GCB) instead.
Although it is a PCI driver, it relies on DT bindings for compatibility
with DSA (CPU port link, PHY library). It does not have any custom
device tree bindings, since we would like to minimize its dependency on
device tree.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While it is entirely possible that this tagger format is in fact more
generic than just these 2 switch families, I don't have that knowledge.
The Seville switch in NXP T1040 has a similar frame format, but there
are enough differences (e.g. DEST field starts at bit 57 instead of 56)
that calling this file tag_vitesse.c is a bit of a stretch at the
moment. The frame format has been listed in a comment so that people who
add support for further Vitesse switches can rework this tagger while
keeping compatibility with Felix.
The "ocelot" name was chosen instead of "felix" because even the Ocelot
switch can act as a DSA device when it is used in NPI mode, and the Felix
tagger format is almost identical. Currently it is only used for the
Felix switch embedded in the NXP LS1028A chip.
The ABI for this tagger should be considered "not stable" at the moment.
The DSA tag is always placed before the Ethernet header and therefore,
we are using the long prefix for RX tags to avoid putting the DSA master
port in promiscuous mode. Once there will be an API in DSA for drivers
to request DSA masters to be in promiscuous mode unconditionally, we
will switch to the "no prefix" extraction frame header, which will save
16 padding bytes for each RX frame.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Felix DSA driver needs to write to SYS_RAM_INIT_RAM_INIT for its own
chip initialization process.
Also update the MAINTAINERS file such that the headers exported by the
ocelot driver are under the same maintainers' umbrella as the driver
itself.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We will be registering another switch driver based on ocelot, which
lives under drivers/net/dsa.
Make sure the Felix DSA front-end has the necessary abstractions to
implement a new Ocelot driver instantiation. This includes the function
prototypes for implementing DSA callbacks.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Felix switch has a different reset procedure, so a function pointer
needs to be created and added to the ocelot_ops structure.
The reset procedure has been moved into ocelot_init.
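A minimal sketch of dispatching a per-variant reset through an ops
structure. All names here (switch_ops, switch_dev, variant_a_reset and so
on) are hypothetical and only illustrate the pattern; they are not the
ocelot driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct switch_dev;

/* Hypothetical per-variant hook table; only the reset hook is shown. */
struct switch_ops {
	int (*reset)(struct switch_dev *dev);
};

struct switch_dev {
	const struct switch_ops *ops;
	int reset_done;	/* set by the hook so its effect is observable */
};

static int variant_a_reset(struct switch_dev *dev)
{
	dev->reset_done = 1;
	return 0;
}

static int variant_b_reset(struct switch_dev *dev)
{
	dev->reset_done = 2;
	return 0;
}

static const struct switch_ops variant_a_ops = { .reset = variant_a_reset };
static const struct switch_ops variant_b_ops = { .reset = variant_b_reset };

/*
 * Common init path: it does not hard-code a reset sequence and calls
 * whichever one the variant registered.
 */
static int switch_init(struct switch_dev *dev)
{
	assert(dev != NULL && dev->ops != NULL);

	if (!dev->ops->reset)
		return -1;

	return dev->ops->reset(dev);
}
```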
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When using the NPI port, the DSA tag is passed through Ethernet, so the
switch's MAC needs to accept it as it comes from the DSA master. Increase
the MTU on the external CPU port to account for the length of the
injection header.
Without this patch, MTU-sized frames are dropped by the switch's CPU
port on xmit, which is especially obvious in TCP sessions.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This constant will be used in a future patch to increase the MTU on NPI
ports, and will also be used in the tagger driver for Felix.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since in an NPI/DSA setup, not all ports will have the same MTU, we need
to make sure the watermarks for pause frames and/or tail dropping logic
that existed in the driver is still coherent for the new MTU values.
We need to do this because the NPI (aka external CPU) port needs an
increased MTU for the DSA tag. This will be done in a future patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It doesn't make sense to rewrite all these registers every time the PHY
library notifies us about a link state change.
In a future patch we will customize the MTU for the CPU port, and since
the MTU was previously configured from adjust_link, if we don't make
this change, its value would get overridden.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The adjust_link routine should be generic enough to be (re)used by
any SoC that integrates a switch core compatible with the Ocelot
core switch driver. Currently all configurations are generic except
for the PCS settings that are SoC specific. Move these out to the
Ocelot SoC/board instance.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Let's make this ioremap and regmap init code common. It should not
be platform dependent as it should be usable by PCI devices too.
Use better names where necessary to avoid clashes.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Karsten Graul says:
====================
net/smc: improve termination handling (part 3)
Part 3 of the SMC termination patches improves the link group
termination processing and introduces the ability to immediately
terminate a link group.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the SMC module is unloaded or an IB device is thrown away, the
immediate link group freeing introduced for SMCD is exploited for SMCR
as well. That means SMCR-specifics are added to smc_conn_kill().
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure all pending work requests are completed before freeing
a link.
Dismiss tx pending slots already when terminating a link group to
exploit termination shortcut in tx completion queue handler.
And kill the completion queue tasklets after destroy of the
completion queues, otherwise there is a time window for another
tasklet schedule of an already killed tasklet.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For abnormal termination issue an LLC DELETE_LINK without the
orderly flag.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Avoid waiting for a free work request buffer, if the link group
is already terminating.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the ism module is unloaded, return control from the exit routine only
once all link groups are freed.
If an IB device is thrown away, return control from device removal only
once all link groups belonging to this device are freed.
A counter for the total number of SMCD link groups per ISM device is
introduced. ism module unloading continues only if the total number of
SMCD link groups for all ISM devices is zero. ISM device
removal continues only if the total number of SMCD link groups per ISM
device has decreased to zero.
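The "continue only when the count reaches zero" rule can be sketched in
userspace with a mutex and a condition variable. This is an illustration
under that assumption, not the mechanism the SMC code uses, and every name
in it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical tracker: counts live groups and lets a teardown path
 * block until the count drops to zero.
 */
struct group_tracker {
	pthread_mutex_t lock;
	pthread_cond_t drained;
	int count;
};

#define GROUP_TRACKER_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

static void group_created(struct group_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	t->count++;
	pthread_mutex_unlock(&t->lock);
}

static void group_freed(struct group_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	assert(t->count > 0);
	if (--t->count == 0)
		pthread_cond_broadcast(&t->drained);	/* wake any waiter */
	pthread_mutex_unlock(&t->lock);
}

/* Teardown path: returns only once every group has been freed. */
static void wait_for_no_groups(struct group_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->count != 0)
		pthread_cond_wait(&t->drained, &t->lock);
	pthread_mutex_unlock(&t->lock);
}
```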
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A final cleanup due to SMCD device removal means immediate freeing
of all link groups belonging to this device in interrupt context.
This patch introduces a separate SMCD link group termination routine,
which terminates all link groups of an SMCD device.
This new routine smcd_terminate_all() is reused if the smc module is
unloaded.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SMCD link group termination is called when peer signals its shutdown
of its corresponding link group. For regular shutdowns no connections
exist anymore. For abnormal shutdowns connections must be killed and
their DMBs must be unregistered immediately. That means the SMCR method
to delay the link group freeing several seconds does not fit.
This patch adds immediate termination of a link group and its SMCD
connections and makes sure all SMCD link group related cleanup steps
are finished.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If peer announces shutdown, use the link group terminate worker for
local cleanup of link groups and connections to terminate link group
in proper context.
Make sure link groups are cleaned up first before destroying the
event queue of the SMCD device, because link group cleanup may
raise events.
Send signal shutdown only if peer has not done it already.
Send socket abort or close only if peer has not already announced
shutdown.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jose Abreu says:
====================
net: stmmac: CPU Performance Improvements
CPU Performance improvements for stmmac. Please check below for results
before and after the series.
Patch 1/7, allows RX Interrupt on Completion to be disabled and only use the
RX HW Watchdog.
Patch 2/7, sets up the default RX coalesce settings instead of using the
minimum value.
Patch 3/7 and 4/7, remove the unneeded computations for RX Flow Control
activation/de-activation, in some cases.
Patch 5/7, tunes-up the default coalesce settings.
Patch 6/7, re-works the TX coalesce timer activation logic.
Patch 7/7, removes the now unneeded TBU interrupt.
NetPerf UDP Results:
--------------------
Socket  Message  Elapsed  Messages                         CPU    Service
Size    Size     Time     Okay     Errors  Throughput      Util   Demand
bytes   bytes    secs     #        #       10^6bits/sec    % SS   us/KB
--- XGMAC@2.5G: Before
212992  1400     10.00    2100620  0       2351.7          36.69  5.112
212992           10.00    2100539          2351.6          26.18  3.648
--- XGMAC@2.5G: After
212992  1400     10.00    2108972  0       2361.5          21.73  3.015
212992           10.00    2097038          2348.1          19.21  2.666
--- GMAC5@1G: Before
212992  1400     10.00    786000   0       880.2           34.71  12.923
212992           10.00    786000           880.2           23.42  8.719
--- GMAC5@1G: After
212992  1400     10.00    842648   0       943.7           14.12  4.903
212992           10.00    842648           943.7           12.73  4.418
Perf TCP Results on RX Path:
----------------------------
--- XGMAC@2.5G: Before
22.51% swapper [stmmac] [k] dwxgmac2_dma_interrupt
10.82% swapper [stmmac] [k] dwxgmac2_host_mtl_irq_status
5.21% swapper [stmmac] [k] dwxgmac2_host_irq_status
4.67% swapper [stmmac] [k] dwxgmac3_safety_feat_irq_status
3.63% swapper [kernel.kallsyms] [k] stack_trace_consume_entry
2.74% iperf3 [kernel.kallsyms] [k] copy_user_enhanced_fast_string
2.52% swapper [kernel.kallsyms] [k] update_stack_state
1.94% ksoftirqd/0 [stmmac] [k] dwxgmac2_dma_interrupt
1.45% iperf3 [kernel.kallsyms] [k] queued_spin_lock_slowpath
1.26% swapper [kernel.kallsyms] [k] create_object
--- XGMAC@2.5G: After
7.43% swapper [kernel.kallsyms] [k] stack_trace_consume_entry
5.86% swapper [stmmac] [k] dwxgmac2_dma_interrupt
5.68% swapper [kernel.kallsyms] [k] update_stack_state
4.71% iperf3 [kernel.kallsyms] [k] copy_user_enhanced_fast_string
2.88% swapper [kernel.kallsyms] [k] create_object
2.69% swapper [stmmac] [k] dwxgmac2_host_mtl_irq_status
2.61% swapper [stmmac] [k] stmmac_napi_poll_rx
2.52% swapper [kernel.kallsyms] [k] unwind_next_frame.part.4
1.48% swapper [kernel.kallsyms] [k] unwind_get_return_address
1.38% swapper [kernel.kallsyms] [k] arch_stack_walk
--- GMAC5@1G: Before
31.29% swapper [stmmac] [k] dwmac4_dma_interrupt
14.57% swapper [stmmac] [k] dwmac4_irq_mtl_status
10.66% swapper [stmmac] [k] dwmac4_irq_status
1.97% swapper [kernel.kallsyms] [k] stack_trace_consume_entry
1.73% iperf3 [kernel.kallsyms] [k] copy_user_enhanced_fast_string
1.59% swapper [kernel.kallsyms] [k] update_stack_state
1.15% iperf3 [kernel.kallsyms] [k] do_syscall_64
1.01% ksoftirqd/0 [stmmac] [k] dwmac4_dma_interrupt
0.89% swapper [kernel.kallsyms] [k] __default_send_IPI_dest_field
0.75% swapper [stmmac] [k] stmmac_napi_poll_rx
--- GMAC5@1G: After
6.70% swapper [kernel.kallsyms] [k] stack_trace_consume_entry
5.79% swapper [stmmac] [k] dwmac4_dma_interrupt
5.29% swapper [kernel.kallsyms] [k] update_stack_state
3.52% iperf3 [kernel.kallsyms] [k] copy_user_enhanced_fast_string
2.83% swapper [stmmac] [k] dwmac4_irq_mtl_status
2.62% swapper [kernel.kallsyms] [k] create_object
2.46% swapper [stmmac] [k] stmmac_napi_poll_rx
2.32% swapper [kernel.kallsyms] [k] unwind_next_frame.part.4
2.19% swapper [stmmac] [k] dwmac4_irq_status
1.39% swapper [kernel.kallsyms] [k] unwind_get_return_address
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that TX Coalesce has been rewritten we no longer need this
additional interrupt enabled. This reduces CPU usage.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Coalesce logic currently increments the number of packets and sets the
IC bit when the coalesced packets have passed a given limit. This does
not reflect very well what coalesce was meant for as we can have a large
number of packets that are coalesced and then a single one, sent later
on that has the IC bit.
Rework the logic so that it coalesces only upon a limit of packets and
sets the IC bit for large number of packets.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tune-up the default coalesce settings for optimal values. This gives the
best performance in most of the use-cases.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
space we have, the later we can activate Flow Control. Let's use
hard-coded values for RFA and RFD for all FIFO sizes with the exception
of 4k, which is a special case.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFA and RFD should not be dependent on FIFO size. In fact, the more FIFO
space we have, the later we can activate Flow Control. Let's use
hard-coded values for RFA and RFD for all FIFO sizes with the exception
of 4k, which is a special case.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For performance reasons, sometimes using the minimum RX Coalesce value
is not optimal. Let's set up a default value that is optimal in most of
the use cases.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We may only want to use the RX Watchdog, so let's check if the RX Coalesce
settings are non-zero and only set the RX Interrupt on Completion bit if
they are.
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 0c3cbbf96def ("mlxsw: Add specific trap for packets routed via
invalid nexthops") allocated an adjacency entry during driver
initialization whose purpose is to discard packets hitting the route
pointing to it.
These adjacency entries are allocated from a resource called KVD linear
(KVDL). There are situations in which the user can decide to set the
size of this resource (via devlink-resource) to 0, in which case the
driver will not be able to load.
Therefore, instead of pre-allocating this adjacency entry, simply
allocate it only when needed. A variable indicating the validity of the
entry is added and is used to ensure it is only allocated and written
once and that it is freed after all the routes were flushed.
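A minimal userspace sketch of on-demand allocation guarded by a validity
flag. The struct and function names are hypothetical, and malloc() stands
in for the real resource allocation; this is not the mlxsw code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical holder for a shared entry that is created on demand. */
struct lazy_entry {
	bool valid;	/* true once the entry has been allocated */
	int *entry;	/* stand-in for the real resource */
	int allocs;	/* counts allocations, for illustration only */
};

/* Allocate the entry on first use; later calls are no-ops. */
static int lazy_entry_get(struct lazy_entry *le)
{
	assert(le != NULL);

	if (le->valid)
		return 0;

	le->entry = malloc(sizeof(*le->entry));
	if (!le->entry)
		return -1;

	*le->entry = 0;
	le->allocs++;
	le->valid = true;
	return 0;
}

/* Free the entry if it was ever allocated; safe to call when it was not. */
static void lazy_entry_put(struct lazy_entry *le)
{
	assert(le != NULL);

	if (!le->valid)
		return;

	free(le->entry);
	le->entry = NULL;
	le->valid = false;
}
```

With this shape, initialization no longer depends on the resource being
available: nothing is allocated until the first user asks for the entry.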
Fixes: 0c3cbbf96def ("mlxsw: Add specific trap for packets routed via invalid nexthops")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a malicious device gives a short MAC it can elicit up to
5 bytes of leaked memory out of the driver. We need to check for
ETH_ALEN instead.
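A minimal sketch of the length check, assuming a hypothetical helper
copy_mac() and a local MAC_LEN constant with the same value as ETH_ALEN
(6). It is not the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6	/* same value as the kernel's ETH_ALEN */

/*
 * Copy a device-reported address only if the device returned all six
 * bytes. Accepting a shorter reply would leave the remaining bytes of
 * the source buffer unwritten by the device, and copying them out is
 * how up to five bytes of stale memory could be exposed.
 */
static int copy_mac(unsigned char dst[MAC_LEN], const unsigned char *src,
		    size_t reported_len)
{
	assert(dst != NULL && src != NULL);

	if (reported_len < MAC_LEN)
		return -1;

	memcpy(dst, src, MAC_LEN);
	return 0;
}
```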
Reported-by: syzbot+a8d4acdad35e6bbca308@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
mlxsw does not support VXLAN devices with a physical device attached and
vetoes such configurations upon enslavement to an offloaded bridge.
Commit 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
changed the VXLAN device to be an upper of the physical device which
causes mlxsw to veto the creation of the VXLAN device with "Unknown
upper device type".
This is OK as this configuration is not supported, but it prevents us
from testing bad flows involving the enslavement of VXLAN devices with a
physical device to a bridge, regardless of whether the physical device is an
mlxsw netdev or not.
Adjust the test to use a dummy device as a physical device instead of a
mlxsw netdev.
Fixes: 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If PROC_FS is not set, gcc warns:
net/tls/tls_proc.c:23:12: warning:
'tls_statistics_seq_show' defined but not used [-Wunused-function]
Use #ifdef to guard this.
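A minimal sketch of the guard, using a made-up HAVE_STATS switch in place
of the real configuration symbol. The function names are hypothetical. The
point is that the static function and its only caller sit under the same
condition, so the function is never defined without being used.

```c
#include <assert.h>
#include <stdio.h>

/* HAVE_STATS is a made-up switch standing in for a config option. */
#ifdef HAVE_STATS
static int stats_show(FILE *out)
{
	return fprintf(out, "stats: ok\n");
}
#endif

static int stats_init(FILE *out)
{
	assert(out != NULL);
#ifdef HAVE_STATS
	/* Only reference stats_show() where it is actually compiled in. */
	return stats_show(out) < 0 ? -1 : 0;
#else
	return 0;
#endif
}
```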
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver forgets to destroy workqueue in remove() similarly to what is
done when probe() fails. Add a call to destroy_workqueue() to fix it.
Since unregistration will wait for the work to finish, we do not need to
cancel/flush the work instance in remove().
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191114023405.31477-1-hslester96@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
No timer must be left running when the device goes away.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reported-and-tested-by: syzbot+b6c55daa701fc389e286@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1573726121.17351.3.camel@suse.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Pull ceph fixes from Ilya Dryomov:
"Two fixes for the buffered reads and O_DIRECT writes serialization
patch that went into -rc1 and a fixup for a bogus warning on older gcc
versions"
* tag 'ceph-for-5.4-rc8' of git://github.com/ceph/ceph-client:
rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()
ceph: increment/decrement dio counter on async requests
ceph: take the inode lock before acquiring cap refs
|
|
When a lookup is done, the afs filesystem will perform a bulk status-fetch
operation on the requested vnode (file) plus the next 49 other vnodes from
the directory list (in AFS, directory contents are downloaded as blobs and
parsed locally). When the results are received, it will speculatively
populate the inode cache from the extra data.
However, if the lookup races with another lookup on the same directory, but
for a different file - one that's in the 49 extra fetches, then if the bulk
status-fetch operation finishes first, it will try and update the inode
from the other lookup.
If this other inode is still in the throes of being created, however, this
will cause an assertion failure in afs_apply_status():
BUG_ON(test_bit(AFS_VNODE_UNSET, &vnode->flags));
on or about fs/afs/inode.c:175 because it expects data to be there already
that it can compare to.
Fix this by skipping the update if the inode is being created as the
creator will presumably set up the inode with the same information.
Fixes: 39db9815da48 ("afs: Fix application of the results of a inline bulk status fetch")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"One trivial fix for -rc8/final that ensures that the script used to
detect RELR relocation support in the toolchain works correctly when
$CC contains quotes. Although it fails safely (by failing to detect
the support when it exists), it would be nice to have this fixed in
5.4 given that it was only introduced in the last merge window.
Summary:
- Handle CC variables containing quotes in tools-support-relr.sh
script"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
scripts/tools-support-relr.sh: un-quote variables
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A fix and simplification for SGI IP27 exception handlers, and a small
MAINTAINERS update for Broadcom MIPS systems"
* tag 'mips_fixes_5.4_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MAINTAINERS: Remove Kevin as maintainer of BMIPS generic platforms
MIPS: SGI-IP27: fix exception handler replication
|
|
Pull more KVM fixes from Paolo Bonzini:
- fixes for CONFIG_KVM_COMPAT=n
- two updates to the IFU erratum
- selftests build fix
- brown paper bag fix
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Add a comment describing the /dev/kvm no_compat handling
KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n
KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list()
selftests: kvm: fix build with glibc >= 2.30
kvm: x86: disable shattered huge page recovery for PREEMPT_RT.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
"Don't overwrite quirk flags in sdhci-of-at91 host driver"
* tag 'mmc-v5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-at91: fix quirk2 overwrite
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few small last-minute fixes for USB-audio and HD-audio as well as
for PCM core:
- A race fix for PCM core between stopping and closing a stream
- USB-audio regressions in the recent descriptor validation code and
relevant changes
- A read of uninitialized value in USB-audio spotted by fuzzer
- A fix for USB-audio race at stopping a stream
- Intel HD-audio platform fixes"
* tag 'sound-5.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Fix incorrect size check for processing/extension units
ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()
ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
ALSA: usb-audio: not submit urb for stopped endpoint
ALSA: hda: hdmi - fix pin setup on Tigerlake
ALSA: hda: Add Cometlake-S PCI ID
ALSA: usb-audio: Fix missing error check at mixer resolution test
|
|
Pull drm fixes from Dave Airlie:
"Here is this week's non-intel hw vuln fixes pull. Three drivers, all
small fixes.
i915:
- MOCS table fixes for EHL and TGL
- Update Display's rawclock on resume
- GVT's dmabuf reference drop fix
amdgpu:
- Fix a potential crash in firmware parsing
sun4i:
- One fix to the dotclock dividers range for sun4i"
* tag 'drm-fixes-2019-11-15' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: fix null pointer deref in firmware header printing
drm/i915/tgl: MOCS table update
Revert "drm/i915/ehl: Update MOCS table for EHL"
drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
drm/i915: update rawclk also on resume
drm/i915/gvt: fix dropping obj reference twice
|
|
Pull misc vfs fixes from Al Viro:
"Assorted fixes all over the place; some of that is -stable fodder,
some regressions from the last window"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
ecryptfs: fix unlink and rmdir in face of underlying fs modifications
audit_get_nd(): don't unlock parent too early
exportfs_decode_fh(): negative pinned may become positive without the parent locked
cgroup: don't put ERR_PTR() into fc->root
autofs: fix a leak in autofs_expire_indirect()
aio: Fix io_pgetevents() struct __compat_aio_sigset layout
fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()
|
|
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c: In function rtl8xxxu_gen2_config_channel:
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c:1266:13: warning: variable rsr set but not used [-Wunused-but-set-variable]
rsr is never used, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|