Age | Commit message (Collapse) | Author |
|
--bpffs is not suggested by bash completions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Drop my author comments, those are from the early days of
bpftool and make little sense in tree, where we have quite
a few people contributing and git to attribute the work.
While at it bump some copyrights.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Make bpf_program__next() skip over '.text' section if object file
has pseudo calls. The '.text' section is hardly a program in that
case, it's more of a storage for code of functions other than main.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
libbpf used to be able to load programs from the default section
called '.text'. It's not very common to leave sections unnamed,
but if it happens libbpf will fail to load the programs reporting
-EINVAL from the kernel. The -EINVAL comes from bpf_obj_name_cpy()
because since 48cca7e44f9f ("libbpf: add support for bpf_call")
libbpf does not resolve program names for programs in '.text',
defaulting to '.text'. '.text', however, does not pass the
(isalnum(*src) || *src == '_') check in bpf_obj_name_cpy().
With few extra lines of code we can limit the pseudo call
assumptions only to objects which actually contain code relocations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Users of bpf_object__open()/bpf_object__load() APIs may want to
load the programs and maps onto a device for offload. Allow
setting ifindex on those sub-objects.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Specify default section names for BPF_PROG_TYPE_LIRC_MODE2
and BPF_PROG_TYPE_LWT_SEG6LOCAL, these are the only two
missing right now.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Commit 4bfe3bd3cc35 ("tools/bpftool: use version from the kernel
source tree") added version to bpftool. The version used is
equal to the kernel version and obtained by running make kernelversion
against kernel source tree. Version is then communicated
to the sources with a command line define set in CFLAGS.
Use a simply expanded variable for the version, otherwise the
recursive make will run every time CFLAGS are used.
This brings the single-job compilation time for me from almost
16 sec down to less than 4 sec.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes and cleanups from Helge Deller:
"Nothing exiting in this patchset, just
- small cleanups of header files
- default to 4 CPUs when building a SMP kernel
- mark 16kB and 64kB page sizes broken
- addition of the new io_pgetevents syscall"
* 'parisc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Build kernel without -ffunction-sections
parisc: Reduce debug output in unwind code
parisc: Wire up io_pgetevents syscall
parisc: Default to 4 SMP CPUs
parisc: Convert printk(KERN_LEVEL) to pr_lvl()
parisc: Mark 16kB and 64kB page sizes BROKEN
parisc: Drop struct sigaction from not exported header file
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smaller batch for the end of the week (let's see if I can keep the
weekly cadence going for once).
All medium-grade fixes here, nothing worrisome:
- Fixes for some fairly old bugs around SD card write-protect
detection and GPIO interrupt assignments on Davinci.
- Wifi module suspend fix for Hikey.
- Minor DT tweaks to fix inaccuracies for Amlogic platforms, one
of which solves booting with third-party u-boot"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: dts: hikey960: Define wl1837 power capabilities
arm64: dts: hikey: Define wl1835 power capabilities
ARM64: dts: meson-gxl: fix Mali GPU compatible string
ARM64: dts: meson-axg: fix ethernet stability issue
ARM64: dts: meson-gx: fix ATF reserved memory region
ARM64: dts: meson-gxl-s905x-p212: Add phy-supply for usb0
ARM64: dts: meson: fix register ranges for SD/eMMC
ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
ARM: dts: da850: Fix interrups property for gpio
ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SD
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- introduce __diag_* macros and suppress -Wattribute-alias warnings
from GCC 8
- fix stack protector test script for x86_64
- fix line number handling in Kconfig
- document that '#' starts a comment in Kconfig
- handle P_SYMBOL property in dump debugging of Kconfig
- correct help message of LD_DEAD_CODE_DATA_ELIMINATION
- fix occasional segmentation faults in Kconfig
* tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: loop boundary condition fix
kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
kconfig: handle P_SYMBOL in print_symbol()
kconfig: document Kconfig source file comments
kconfig: fix line numbers for if-entries in menu tree
stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y
powerpc: Remove -Wattribute-alias pragmas
disable -Wattribute-alias warning for SYSCALL_DEFINEx()
kbuild: add macro for controlling warnings to linux/compiler.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"The biggest diffstat comes from self-test updates, plus there's entry
code fixes, 5-level paging related fixes, console debug output fixes,
and misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Clean up the printk()s in show_fault_oops()
x86/mm: Drop unneeded __always_inline for p4d page table helpers
x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
selftests/x86/sigreturn: Do minor cleanups
selftests/x86/sigreturn/64: Fix spurious failures on AMD CPUs
x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
x86/mm: Don't free P4D table when it is folded at runtime
x86/entry/32: Add explicit 'l' instruction suffix
x86/mm: Get rid of KERN_CONT in show_fault_oops()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Tooling fixes mostly, plus a build warning fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
perf/core: Move inline keyword at the beginning of declaration
tools/headers: Pick up latest kernel ABIs
perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE]
perf script: Fix crash because of missing evsel->priv
perf script: Add missing output fields in a hint
perf bench: Fix numa report output code
perf stat: Remove duplicate event counting
perf alias: Rebuild alias expression string to make it comparable
perf alias: Remove trailing newline when reading sysfs files
perf tools: Fix a clang 7.0 compilation error
tools include uapi: Synchronize bpf.h with the kernel
tools include uapi: Update if_link.h to pick IFLA_{BRPORT_ISOLATED,VXLAN_TTL_INHERIT}
tools include powerpc: Update arch/powerpc/include/uapi/asm/unistd.h copy to get 'rseq' syscall
perf tools: Update x86's syscall_64.tbl, adding 'io_pgetevents' and 'rseq'
tools headers uapi: Synchronize drm/drm.h
perf intel-pt: Fix packet decoding of CYC packets
perf tests: Add valid callback for parse-events test
perf tests: Add event parsing error handling to parse events test
perf report powerpc: Fix crash if callchain is empty
perf test session topology: Fix test on s390
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One fairly straightforward patch to fix a longstanding issue where a
process could stall while accessing files in selinuxfs and block
everyone else due to a held mutex.
The patch passes all our tests and looks to apply cleanly to your
current tree"
* tag 'selinux-pr-20180629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: move user accesses in selinuxfs out of locked regions
|
|
Pull block fixes from Jens Axboe:
"Small set of fixes for this series. Mostly just minor fixes, the only
oddball in here is the sg change.
The sg change came out of the stall fix for NVMe, where we added a
mempool and limited us to a single page allocation. CONFIG_SG_DEBUG
sort-of ruins that, since we'd need to account for that. That's
actually a generic problem, since lots of drivers need to allocate SG
lists. So this just removes support for CONFIG_SG_DEBUG, which I added
back in 2007 and to my knowledge it was never useful.
Anyway, outside of that, this pull contains:
- clone of request with special payload fix (Bart)
- drbd discard handling fix (Bart)
- SATA blk-mq stall fix (me)
- chunk size fix (Keith)
- double free nvme rdma fix (Sagi)"
* tag 'for-linus-20180629' of git://git.kernel.dk/linux-block:
sg: remove ->sg_magic member
drbd: Fix drbd_request_prepare() discard handling
blk-mq: don't queue more if we get a busy return
block: Fix cloning of requests with a special payload
nvme-rdma: fix possible double free of controller async event buffer
block: Fix transfer when chunk sectors exceeds max
|
|
After commit f9d4b0c1e969 ("fib_rules: move common handling of newrule
delrule msgs into fib_nl2rule"), rule_exists got replaced by rule_find
for existing rule lookup in both the add and del paths. While this
is good for the delete path, it solves a few problems but opens up
a few invalid key matches in the add path.
$ip -4 rule add table main tos 10 fwmark 1
$ip -4 rule add table main tos 10
RTNETLINK answers: File exists
The problem here is rule_find does not check if the key masks in
the new and old rule are the same and hence ends up matching a more
secific rule. Rule key masks cannot be easily compared today without
an elaborate if-else block. Its best to introduce key masks for easier
and accurate rule comparison in the future. Until then, due to fear of
regressions this patch re-introduces older loose rule_exists during add.
Also fixes both rule_exists and rule_find to cover missing attributes.
Fixes: f9d4b0c1e969 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Petr Machata says:
====================
mlxsw: Add resource scale tests
There are a number of tests that check features of the Linux networking
stack. By running them on suitable interfaces, one can exercise the
mlxsw offloading code. However none of these tests attempts to push
mlxsw to the limits supported by the ASIC.
As an additional wrinkle, the "limits supported by the ASIC" themselves
may not be a set of fixed numbers, but rather depend on a profile that
determines how the ASIC resources are allocated for different purposes.
This patchset introduces several tests that verify capability of mlxsw
to offload amounts of routes, flower rules, and mirroring sessions that
match predicted ASIC capacity, at different configuration profiles.
Additionally they verify that amounts exceeding the predicted capacity
can *not* be offloaded.
These are not generic tests, but ones that are tailored for mlxsw
specifically. For that reason they are not added to net/forwarding
selftests subdirectory, but rather to a newly-added drivers/net/mlxsw.
Patches #1, #2 and #3 tweak the generic forwarding/lib.sh to support the
new additions.
In patches #4 and #5, new libraries for interfacing with devlink are
introduced, first a generic one, then a Spectrum-specific one.
In patch #6, a devlink resource test is introduced.
Patches #7 and #8, #9 and #10, and #11 and #12 introduce three scale
tests: router, flower and mirror-to-gretap. The first of each pair of
patches introduces a generic portion of the test (mlxsw-specific), the
second introduces a Spectrum-specific wrapper.
Patch #13 then introduces a scale test driver that runs (possibly a
subset of) the tests introduced by patches from previous paragraph.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a scale test capable of validating that offloaded network
functionality is indeed functional at scale when configured to
the different KVD profiles available.
Start by testing offloaded routes are functional at scale by
passing traffic on each one of them in turn.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a wrapper around mlxsw/mirror_gre_scale.sh that parameterized number
of offloadable mirrors on Spectrum machines.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Test that it's possible to offload a given number of mirrors.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a wrapper around mlxsw/tc_flower_scale.sh that parameterizes the
generic tc flower scale test template with Spectrum-specific target
values.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add test of capacity to offload flower.
This is a generic portion of the test that is meant to be called from a
driver that supplies a particular number of rules to be tested with.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPv4 routes in Spectrum are based on the kvd single-hash, but as it's
a hash we need to assume we cannot reach 100% of its capacity.
Add a wrapper that provides us with good/bad target numbers for the
Spectrum ASIC.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
[petrm@mellanox.com: Drop shebang.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This test aims for both stand alone and internal usage by the resource
infra. The test receives the number routes to offload and checks:
- The routes were offloaded correctly
- Traffic for each route.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a selftest that can be used to perform basic sanity of the devlink
resource API as well as test the behavior of KVD manipulation in the
driver.
This is the first case of a HW-only test - in order to test the devlink
resource a driver capable of exposing resources has to be provided
first.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Extracted two patches out of this patch. Tweaked
commit message.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This library builds on top of devlink_lib.sh and contains functionality
specific to Spectrum ASICs, e.g., re-partitioning the various KVD
sub-parts.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch. Fix line length
in devlink_sp_read_kvd_defaults().]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This helper library contains wrappers to devlink functionality agnostic
to the underlying device.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
[petrm@mellanox.com: Split this out from another patch.]
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
setup_wait() and tc_offload_check() both assume that all NUM_NETIFS
interfaces are relevant for a given test. However, the scale test script
acts as an umbrella for a number of sub-tests, some of which may not
require all the interfaces.
Thus it's suboptimal for tc_offload_check() to query all the interfaces.
In case of setup_wait() it's incorrect, because the sub-test in question
of course doesn't configure any interfaces beyond what it needs, and
setup_wait() then ends up waiting indefinitely for the extraneous
interfaces to come up.
For that reason, give setup_wait() and tc_offload_check() an optional
parameter with a number of interfaces to probe. Fall back to global
NUM_NETIFS if the parameter is not given.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the scale testing scenarios, one usually has a condition that is
expected to either fail, or pass, depending on which side of the scale
is being tested.
To capture this logic, add a function check_err_fail(), which dispatches
either to check_err() or check_fail(), depending on the value of the
first argument, should_fail.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The devlink related scripts are mlxsw-specific. As a result, they'll
reside in a different directory - but would still need the common logic
implemented in lib.sh.
So as a preliminary step, allow lib.sh to be sourced from other
directories as well.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
nfp: flower updates and netconsole
This set contains assorted updates to driver base and flower.
First patch is a follow up to a fix to calculating counters which
went into net. For ethtool counters we should also make sure
they are visible even after ring reconfiguration. Next patch
is a safety measure in case we are running on a machine with a
broken BIOS we should fail the probe when risk of incorrect
operation is too high. The next two patches add netpoll support
and make use of napi_consume_skb(). Last we populate bus info
on all representors.
Pieter adds support for offload of the checksum action in flower.
John follows up to another fix he's done in net, we set TTL
values on tunnels to stack's default, now Johns does a full
route lookup to get a more precise information, he populates
ToS field as well. Last but not least he follows up on Jiri's
request to enable LAG offload in case the team driver is used
and then hash function is unknown.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the NFP fw only supports L3/L4 hashing so rejects the offload of
filters that output to LAG ports implementing other hash algorithms. Team,
however, uses a BPF function for the hash that is not defined. To support
Team offload, accept hashes that are defined as 'unknown' (only Team
defines such hash types). In this case, use the NFP default of L3/L4
hashing for egress port selection.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extract the tos and the tunnel flags from the tunnel key and offload these
action fields. Only the checksum and tunnel key flags are implemented in
fw so reject offloads of other flags. The tunnel key flag is always
considered set in the fw so enforce that it is set in the rule. Note that
the compulsory setting of the tunnel key flag and optional setting of
checksum is inline with how tc currently generates ipv4 udp tunnel
actions.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously the ttl for ipv4 udp tunnels was set to the namespace default.
Modify this to attempt to extract the ttl from a full route lookup on the
tunnel destination. If this is not possible then resort to the default.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hardware will automatically update csum in headers when a set action has
been performed. This means we could in the driver ignore the explicit
checksum action when performing a set action.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We used to leave bus-info in ethtool driver info empty for
representors in case multi-PCIe-to-single-host cards make
the association between PCIe device and NFP many to one.
It seems these attempts are futile, we need to link the
representors to one PCIe device in sysfs to get consistent
naming, plus devlink uses one PCIe as a handle, anyway.
The multi-PCIe-to-single-system support won't be clean,
if it ever comes.
Turns out some user space (RHEL tests) likes to read bus-info
so just populate it.
While at it remove unnecessary app NULL-check, representors
are spawned by an app, so it must exist.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use napi_consume_skb() in nfp_net_tx_complete() to get bulk free.
Pass 0 as budget for ctrl queue completion since it runs out of
a tasklet.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
NFP NAPI handling will only complete the TXed packets when called
with budget of 0, implement ndo_poll_controller by scheduling NAPI
on all TX queues.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On some platforms with broken ACPI tables we may not have access
to the Serial Number PCIe capability. This capability is crucial
for us for switchdev operation as we use serial number as switch ID,
and for communication with management FW where interface ID is used.
If we can't determine the Serial Number we have to fail device probe.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After user changes the ring count statistics for deactivated
rings disappear from ethtool -S output. This causes loss of
information to the user and means that ethtool stats may not
add up to interface stats. Always expose counters from all
the rings. Note that we allocate at most num_possible_cpus()
rings so number of rings should be reasonable.
The alternative of only listing stats for rings which were
ever in use could be confusing.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When doing device hotplug the sub channel must be async to avoid
deadlock issues because device is discovered in softirq context.
When doing changes to MTU and number of channels, the setup
must be synchronous to avoid races such as when MTU and device
settings are done in a single ip command.
Reported-by: Thomas Walker <Thomas.Walker@twosigma.com>
Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Fixes: 732e49850c5e ("netvsc: fix race on sub channel creation")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As noticed by Eric, we need to switch to the helper
dev_change_tx_queue_len() for SIOCSIFTXQLEN call path too,
otheriwse still miss dev_qdisc_change_tx_queue_len().
Fixes: 6a643ddb5624 ("net: introduce helper dev_change_tx_queue_len()")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Calling skb_unclone() is expensive as it triggers a memcpy operation.
Instead of calling skb_unclone() unconditionally, call it only when skb
has a shared frag_list. This improves tls rx throughout significantly.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Suggested-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pool can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/atm/zatm.c:1491 zatm_ioctl() warn: potential spectre issue
'zatm_dev->pool_info' (local cap)
Fix this by sanitizing pool before using it to index
zatm_dev->pool_info
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-06-29
please apply a few qeth fixes for -net and your 4.17 stable queue.
Patches 1-3 fix several issues wrt to MAC address management that were
introduced during the 4.17 cycle.
Patch 4 tackles a long-standing issue with busy multi-connection workloads
on devices in af_iucv mode.
Patch 5 makes sure to re-enable all active HW offloads, after a card was
previously set offline and thus lost its HW context.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit e830baa9c3f0 ("qeth: restore device features after recovery") and
commit ce3443564145 ("s390/qeth: rely on kernel for feature recovery")
made sure that the HW functions for device features get re-programmed
after recovery.
But we missed that the same handling is also required when a card is
first set offline (destroying all HW context), and then online again.
Fix this by moving the re-enable action out of the recovery-only path.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If qeth_qdio_output_handler() detects that a transmit requires async
completion, it replaces the pending buffer's metadata object
(qeth_qdio_out_buffer) so that this queue buffer can be re-used while
the data is pending completion.
Later when the CQ indicates async completion of such a metadata object,
qeth_qdio_cq_handler() tries to free any data associated with this
object (since HW has now completed the transfer). By calling
qeth_clear_output_buffer(), it erronously operates on the queue buffer
that _previously_ belonged to this transfer ... but which has been
potentially re-used several times by now.
This results in double-free's of the buffer's data, and failing
transmits as the buffer descriptor is scrubbed in mid-air.
The correct way of handling this situation is to
1. scrub the queue buffer when it is prepared for re-use, and
2. later obtain the data addresses from the async-completion notifier
(ie. the AOB), instead of the queue buffer.
All this only affects qeth devices used for af_iucv HiperTransport.
Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
*ether_addr*_64bits functions have been introduced to optimize
performance critical paths, which access 6-byte ethernet address as u64
value to get "nice" assembly. A harmless hack works nicely on ethernet
addresses shoved into a structure or a larger buffer, until busted by
Kasan on smth like plain (u8 *)[6].
qeth_l2_set_mac_address calls qeth_l2_remove_mac passing
u8 old_addr[ETH_ALEN] as an argument.
Adding/removing macs for an ethernet adapter is not that performance
critical. Moreover is_multicast_ether_addr_64bits itself on s390 is not
faster than is_multicast_ether_addr:
is_multicast_ether_addr(%r2) -> %r2
llc %r2,0(%r2)
risbg %r2,%r2,63,191,0
is_multicast_ether_addr_64bits(%r2) -> %r2
llgc %r2,0(%r2)
risbg %r2,%r2,63,191,0
So, let's just use is_multicast_ether_addr instead of
is_multicast_ether_addr_64bits.
Fixes: bcacfcbc82b4 ("s390/qeth: fix MAC address update sequence")
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When qeth_l2_set_mac_address() finds the card in a non-reachable state,
it merely copies the new MAC address into dev->dev_addr so that
__qeth_l2_set_online() can later register it with the HW.
But __qeth_l2_set_online() may very well be running concurrently, so we
can't trust the card state without appropriate locking:
If the online sequence is past the point where it registers
dev->dev_addr (but not yet in SOFTSETUP state), any address change needs
to be properly programmed into the HW. Otherwise the netdevice ends up
with a different MAC address than what's set in the HW, and inbound
traffic is not forwarded as expected.
This is most likely to occur for OSD in LPAR, where
commit 21b1702af12e ("s390/qeth: improve fallback to random MAC address")
now triggers eg. systemd to immediately change the MAC when the netdevice
is registered with a NET_ADDR_RANDOM address.
Fixes: bcacfcbc82b4 ("s390/qeth: fix MAC address update sequence")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit b7493e91c11a757cf0f8ab26989642ee4bb2c642.
On its own, querying RDEV for a MAC address works fine. But when upgrading
from a qeth that previously queried DDEV on a z/VM NIC (ie. any kernel with
commit ec61bd2fd2a2), the RDEV query now returns a _different_ MAC address
than the DDEV query.
If the NIC is configured with MACPROTECT, z/VM apparently requires us to
use the MAC that was initially returned (on DDEV) and registered. So after
upgrading to a kernel that uses RDEV, the SETVMAC registration cmd for the
new MAC address fails and we end up with a non-operabel interface.
To avoid regressions on upgrade, switch back to using DDEV for the MAC
address query. The downgrade path (first RDEV, later DDEV) is fine, in this
case both queries return the same MAC address.
Fixes: b7493e91c11a ("s390/qeth: use Read device to query hypervisor for MAC")
Reported-by: Michal Kubecek <mkubecek@suse.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The __alx_open function can be called from ndo_open, which is called
under RTNL, or from alx_resume, which isn't. Since commit d768319cd427,
we're calling the netif_set_real_num_{tx,rx}_queues functions, which
need to be called under RTNL.
This is similar to commit 0c2cc02e571a ("igb: Move the calls to set the
Tx and Rx queues into igb_open").
Fixes: d768319cd427 ("alx: enable multiple tx queues")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|