Halve the time limit used when aggregation is enabled on an RX
endpoint, to half a millisecond.
Use DIV_ROUND_CLOSEST() to compute the value that represents the
time period, to get better accuracy in the event the time limit is
not an even multiple of the granularity.
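As a minimal sketch (names here are illustrative, not the driver's exact
ones), the rounding change amounts to:

/* hypothetical helper: convert a time limit in microseconds into
 * granularity units, rounding to nearest instead of truncating
 */
static u32 aggr_time_limit_val(u32 limit_usec, u32 granularity_usec)
{
	return DIV_ROUND_CLOSEST(limit_usec, granularity_usec);
}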
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The timer used for aggregation makes use of an internal 32 kHz clock.
The granularity of the timer is programmed by a field whose value is
computed by ipa_aggr_granularity_val(). Redefine the way that value
is computed by using a new TIMER_FREQUENCY constant representing the
underlying clock frequency.
Add two BUILD_BUG_ON() calls to ensure the value used is valid.
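A rough sketch of the recomputed value and the compile-time checks, with
assumed constants and field width (not necessarily the driver's exact
code):

#define TIMER_FREQUENCY		32000	/* 32 kHz internal timer clock */
#define IPA_AGGR_GRANULARITY	500	/* microseconds; illustrative */

static inline u32 ipa_aggr_granularity_val(u32 usec)
{
	return DIV_ROUND_CLOSEST(usec * TIMER_FREQUENCY, USEC_PER_SEC) - 1;
}

static void ipa_validate_build(void)
{
	/* assumed checks: value must be non-zero and fit the register
	 * field (the 5-bit field width here is an assumption)
	 */
	BUILD_BUG_ON(!ipa_aggr_granularity_val(IPA_AGGR_GRANULARITY));
	BUILD_BUG_ON(ipa_aggr_granularity_val(IPA_AGGR_GRANULARITY) > 0x1f);
}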
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Denis Kirjanov says:
====================
xen networking: add XDP support to xen-netfront
The first patch adds a new extra type to enable proper synchronization
between an RX request/response pair.
The second patch implements the BPF interface for xen-netfront.
The third patch enables extra space for XDP processing.
v14:
- fixed compilation warnings
v13:
- fixed compilation due to previous rename
v12:
- xen-netback: rename netfront_xdp_headroom to xdp_headroom
v11:
- add the new headroom constant to netif.h
- xenbus_scanf check
- lock a bulk of packets in xennet_xdp_xmit()
v10:
- add a new xen_netif_extra_info type to enable proper synchronization
between an RX request/response pair.
- order local variable declarations
v9:
- assign an xdp program before switching to Reconfiguring
- minor cleanups
- address checkpatch issues
v8:
- add PAGE_POOL config dependency
- keep the state of XDP processing in netfront_xdp_enabled
- fixed allocator type in xdp_rxq_info_reg_mem_model()
- minor cleanups in xen-netback
v7:
- use page_pool_dev_alloc_pages() on page allocation
- remove the leftover break statement from netback_changed
v6:
- added the missing SOB line
- fixed subject
v5:
- split netfront/netback changes
- added a sync point between backend/frontend on switching to XDP
- added pagepool API
v4:
- added verbose patch description
- don't expose the XDP headroom offset to the domU guest
- add a modparam to netback to toggle XDP offset
- don't process jumbo frames for now
v3:
- added XDP_TX support (tested with xdping echoserver)
- added XDP_REDIRECT support (tested with modified xdp_redirect_kern)
- moved xdp negotiation to xen-netback
v2:
- avoid data copying while passing to XDP
- tell xen-netback that we need the headroom space
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds the offset adjustment and netfront state reading
needed to make XDP work on the netfront side.
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The patch adds basic XDP processing to the xen-netfront driver.
We run an XDP program for each RX response received from the netback
driver. We also request that xen-netback adjust the data offset to
provide bpf_xdp_adjust_head() header space for custom headers.
Synchronization between the frontend and backend parts is done by
using xenbus state switching:
Reconfiguring -> Reconfigured -> Connected
The UDP packet drop rate with an XDP program is around 310 kpps
using ./pktgen_sample04_many_flows.sh, versus 160 kpps without the patch.
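For illustration only (not the driver's exact code), the RX-side
handling follows the usual XDP verdict pattern:

	u32 act = bpf_prog_run_xdp(prog, &xdp);	/* attached program */

	switch (act) {
	case XDP_PASS:
		break;		/* continue on the normal skb path */
	case XDP_TX:
	case XDP_REDIRECT:
		/* transmit or redirect the buffer via XDP */
		break;
	case XDP_DROP:
	default:
		/* drop the packet */
		break;
	}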
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The patch adds a new extra type to be able to differentiate
between RX responses on the xen-netfront side with the adjusted offset
required for XDP processing.
The offset value from a guest is passed via xenstore.
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jesper Dangaard Brouer says:
====================
V3: Reorder patches to cause less code churn.
The BPF selftest 'test_progs' contains many tests that cover all the
different areas of the kernel where BPF is used. The CI system sees this
as one test, which is impractical for identifying which team/engineer is
responsible for debugging a problem.
This patchset adds some options that make it easier to create a shell
for-loop that invokes each (top-level) test available in test_progs. Then
each test's PASS/FAIL result can be presented to the CI system as a
separate bullet. (For the Red Hat use case in Beaker: https://beaker-project.org/)
Created a public script [1] that uses these features in an advanced way,
demonstrating how to reduce the number of (top-level) tests by grouping
tests together via the existing test pattern selection feature, and then
using the new --list feature combined with exclude (-b) to get a list of
the remaining test names that were not part of the groups.
[1] https://github.com/netoptimizer/prototype-kernel/blob/master/scripts/bpf_selftests_grouping.sh
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The program test_progs has a very useful ability to specify a list of
test name substrings for selecting which tests to run.
This patch adds the ability to list the selected test names without running
them. This is practical for seeing which tests get selected with given
select arguments (which can also contain an exclude list via --name-blacklist).
This output can also be used by shell scripts in a for-loop:
for N in $(./test_progs --list -t xdp); do \
  ./test_progs -t $N > result_test_${N}.log 2>&1 & \
done ; wait
This feature can also be used for looking up a test number and returning
a test name. If the selection was empty then shell EXIT_FAILURE is
returned. This is useful for scripting, e.g. like this:
n=1;
while [ $(./test_progs --list -n $n) ] ; do \
./test_progs -n $n ; n=$(( n+1 )); \
done
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985751.930467.9610992940793316982.stgit@firesoul
|
|
It can be practical to get the number of tests that test_progs contains.
This could, for example, be used to create a shell for-loop construct that
runs the individual tests.
Like:
for N in $(seq 1 $(./test_progs -c)); do
  ./test_progs -n $N > result_test_${N}.log 2>&1 &
done ; wait
V2: Add the ability to return the count for the selected tests. This is
useful for getting a count, e.g. after excluding some tests with option -b.
The current Beaker test script likes to report the max test count upfront.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363985244.930467.12617117873058936829.stgit@firesoul
|
|
When a user selects a non-existing test the summary is printed with
indication 0 for all info types, and shell "success" (EXIT_SUCCESS) is
indicated. This can be understood by a human end-user, but for shell
scripting it is useful to indicate a shell failure (EXIT_FAILURE).
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159363984736.930467.17956007131403952343.stgit@firesoul
|
|
Turn off -Wnested-externs to avoid annoying warnings in the BUILD_BUG_ON()
macro when compiling bpftool:
In file included from /data/users/andriin/linux/tools/include/linux/build_bug.h:5,
from /data/users/andriin/linux/tools/include/linux/kernel.h:8,
from /data/users/andriin/linux/kernel/bpf/disasm.h:10,
from /data/users/andriin/linux/kernel/bpf/disasm.c:8:
/data/users/andriin/linux/kernel/bpf/disasm.c: In function ‘__func_get_name’:
/data/users/andriin/linux/tools/include/linux/compiler.h:37:38: warning: nested extern declaration of ‘__compiletime_assert_0’ [-Wnested-externs]
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^~~~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:16:15: note: in definition of macro ‘__compiletime_assert’
extern void prefix ## suffix(void) __compiletime_error(msg); \
^~~~~~
/data/users/andriin/linux/tools/include/linux/compiler.h:37:2: note: in expansion of macro ‘_compiletime_assert’
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^~~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^~~~~~~~~~~~~~~~~~
/data/users/andriin/linux/tools/include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
^~~~~~~~~~~~~~~~
/data/users/andriin/linux/kernel/bpf/disasm.c:20:2: note: in expansion of macro ‘BUILD_BUG_ON’
BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
^~~~~~~~~~~~
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200701212816.2072340-1-andriin@fb.com
|
|
The test_vmlinux test uses hrtimer_nanosleep as a hook to test tracing
programs. But in a kernel built by clang, which performs more aggressive
inlining, that function gets inlined into its caller SyS_nanosleep.
Therefore, even though fentry and kprobe do hook on the function,
they aren't triggered by the call to nanosleep in the test.
A possible fix is switching to a function that is less likely to
be inlined, such as hrtimer_range_start_ns. EXPORT_SYMBOL functions
shouldn't be inlined, based on the description in [1], and are therefore
safe to use for this test. The arguments of this function also include
the duration of sleep, making it suitable for test verification.
[1] af3b56289be1 ("time: don't inline EXPORT_SYMBOL functions")
Tested:
In a clang-built kernel, before this change, the test fails:
test_vmlinux:PASS:skel_open 0 nsec
test_vmlinux:PASS:skel_attach 0 nsec
test_vmlinux:PASS:tp 0 nsec
test_vmlinux:PASS:raw_tp 0 nsec
test_vmlinux:PASS:tp_btf 0 nsec
test_vmlinux:FAIL:kprobe not called
test_vmlinux:FAIL:fentry not called
After switching to hrtimer_range_start_ns, the test passes:
test_vmlinux:PASS:skel_open 0 nsec
test_vmlinux:PASS:skel_attach 0 nsec
test_vmlinux:PASS:tp 0 nsec
test_vmlinux:PASS:raw_tp 0 nsec
test_vmlinux:PASS:tp_btf 0 nsec
test_vmlinux:PASS:kprobe 0 nsec
test_vmlinux:PASS:fentry 0 nsec
Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200701175315.1161242-1-haoluo@google.com
|
|
Add full support for the new version of the Ethtool API. The new API
allows use of the 2500base-T and 5000base-T supported and advertised
link speed modes.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There is a 4 seconds delay in ixgbe_diag_test() that is holding up other
ioctls such as SIOCGIFCONF that Oracle database applications use.
One of Oracle's products runs "ethtool -t ethX online" periodically for
system monitoring and that is impacting database applications that use
SIOCGIFCONF at that same time.
This 4 second delay was needed in our early 1 GbE parts to give the PHY
time to recover from a reset. This code was carried forward to the 10 GbE
driver even though it was not needed for the supported PHYs in the ixgbe driver.
CC: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
CC: Jack Vogel <jack.vogel@oracle.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Commit bac8486116b0 ("iavf: Refactor the watchdog state machine") inverted
the logic for when to update statistics. Statistics should be updated when
no other commands are pending; instead, they were only requested when a
command was processed. iavf_request_stats() would see a pending request
and not request statistics to be updated. This caused statistics to never
be updated; fix the logic.
Fixes: bac8486116b0 ("iavf: Refactor the watchdog state machine")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
|
|
Interfaces already exist for dumping Rx and Tx descriptor information.
Introduce another for doing the same for XDP descriptors.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Prior to this, only the Rx and Tx ring statistics were dumped. The XDP
ring statistics are now dumped as well.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Prior to this, only Rx and Tx ring statistics were accounted for.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Move the check for whether the HW Tx ring is full to outside the send
loop. Currently it is checked for every single descriptor that we
send. Instead, tell the send loop to only process a maximum number of
packets equal to the number of available slots in the Tx ring. This
way, we can remove the check inside the send loop and gain some
performance.
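A sketch of the idea with made-up names (the actual ring helpers
differ):

	/* compute the batch size once, instead of testing for a full
	 * ring on every descriptor
	 */
	unsigned int free_slots = ring_unused_descs(tx_ring); /* assumed helper */
	unsigned int nb_pkts = min(budget, free_slots);
	unsigned int i;

	for (i = 0; i < nb_pkts; i++)
		fill_tx_desc(tx_ring);	/* no per-descriptor full check */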
Suggested-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Eliminate a division in the napi_poll data path. This division is
executed even though it is only needed in the rare case when there are
not enough interrupt lines so they have to be shared between queue
pairs. Instead, just test for this case and only execute the division
if needed. The code has been lifted from the ice driver.
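Roughly, with illustrative names (the shape is the one the message says
was lifted from the ice driver):

	/* divide the budget only when queue pairs share an interrupt */
	if (q_vector->num_ring_rx > 1)
		budget_per_ring = max(budget / q_vector->num_ring_rx, 1);
	else
		budget_per_ring = budget;	/* common case: no division */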
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Improve the performance of the AF_XDP zero-copy Tx completion
path. When there are no XDP buffers being sent using XDP_TX or
XDP_REDIRECT, we do not have to go through the SW ring to clean up any
entries since the AF_XDP path does not use these. In these cases, just
fast forward the next-to-use counter and skip going through the SW
ring. The limit on the maximum number of entries to complete is also
removed since the algorithm is now O(1). To simplify the code path, the
maximum number of entries to complete for the XDP path is therefore
also increased from 256 to 512 (the default number of Tx HW
descriptors). This should be fine since the completion in the XDP path
is faster than in the SKB path that has 256 as the maximum number.
This patch provides around 4% throughput improvement for the l2fwd
application in xdpsock on my machine.
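The fast path can be sketched as follows, with assumed field names:

	/* no XDP_TX/XDP_REDIRECT buffers in flight: nothing to clean in
	 * the SW ring, so just advance next-to-clean in O(1)
	 */
	if (!tx_ring->xdp_tx_active) {
		tx_ring->next_to_clean = completed_head;
		return;
	}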
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: b66c7bc1cd4d ("iavf: Refactor init state machine")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
After the OS requests to bring down a link on a physical network port,
traffic is no longer processed, but the physical link with the link
partner is still established.
Currently there is a feature (link down on close) which allows the
link to be brought down physically (after an OS request).
With this patch, a new feature with a similar capability is introduced:
TOTAL_PORT_SHUTDOWN
It allows physically disabling the link on the NIC's port: if enabled,
no link, traffic, or LED activity is possible on that port (after a
link down request from the OS).
If I40E_FLAG_TOTAL_PORT_SHUTDOWN is enabled,
I40E_FLAG_LINK_DOWN_ON_CLOSE_ENABLED must be explicitly forced to
true and cannot be disabled at that time.
The functionalities are exclusive in terms of configuration, but they
also have similar behavior (allowing the physical link of the port to
be disabled), with the following differences:
- LINK_DOWN_ON_CLOSE_ENABLED is configurable at host OS run-time
and is supported by the whole family of 7xx Intel Ethernet Controllers
- TOTAL_PORT_SHUTDOWN may be enabled only before the OS loads (in the
BIOS), and only if the motherboard's BIOS and the NIC's FW support it
- when LINK_DOWN_ON_CLOSE_ENABLED is used, the link is brought down
by sending phy_type=0 to the NIC's FW
- when TOTAL_PORT_SHUTDOWN is used, phy_type is not altered; instead,
the link is brought down by clearing the I40E_AQ_PHY_ENABLE_LINK bit
in the abilities field of the i40e_aq_set_phy_config structure (see
the sketch below)
Introduced changes:
- new private flag I40E_FLAG_TOTAL_PORT_SHUTDOWN for handling the
feature
- probe of NVM if the feature was enabled at driver's port
initialization
- special handling in the link-down procedure to let the FW physically
shut down the port if the feature was enabled
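A minimal sketch of the two link-down flavors, using the AQ structure
and bit named above (the surrounding code is illustrative):

	struct i40e_aq_set_phy_config config = {};

	/* 'abilities' assumed read from the get-PHY-abilities response */
	if (total_port_shutdown)
		config.abilities = abilities & ~I40E_AQ_PHY_ENABLE_LINK;
	else
		config.phy_type = 0;	/* link-down-on-close path */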
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Convert all the remaining 'fall through' code comments to the newer
'fallthrough;' keyword.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Vaibhav Gupta says:
====================
net: ethernet: use generic power management
Linux Kernel Mentee: Remove Legacy Power Management.
The purpose of this patch series is to remove legacy power management callbacks
from net ethernet drivers.
The callbacks performing suspend() and resume() operations are still calling
pci_save_state(), pci_set_power_state(), etc. and handling the power management
themselves, which is not recommended.
The conversion requires removing those function calls, changing the
callback definitions accordingly, and making use of the dev_pm_ops structure.
All patches are compile-tested only.
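As a generic sketch of the conversion (driver names are placeholders,
not any specific driver's code), the legacy .suspend/.resume PCI hooks
become dev_pm_ops callbacks and the PCI core handles state save/restore:

#include <linux/pm.h>

static int __maybe_unused foo_suspend(struct device *dev)
{
	/* device-specific quiesce only; no pci_save_state() etc. */
	return 0;
}

static int __maybe_unused foo_resume(struct device *dev)
{
	/* device-specific re-init only */
	return 0;
}

static SIMPLE_DEV_PM_OPS(foo_pm_ops, foo_suspend, foo_resume);

static struct pci_driver foo_driver = {
	/* .name, .id_table, .probe, .remove, ... */
	.driver.pm = &foo_pm_ops,
};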
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call PCI helper functions like
pci_enable_device(); such calls are not recommended and have been removed.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Use "struct dev_pm_ops" variable to bind the callbacks.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call the PCI helper functions like
pci_enable_wake(), pci_save/restore_state() and
pci_set_power_state().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Use "struct dev_pm_ops" variable to bind the callbacks.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Drivers should not use legacy power management, as they then have to
manage power states and related operations for the device themselves.
This driver was handling them with the help of PCI helper functions.
With generic PM, all the essentials are handled by the PCI core. The
driver needs to do only device-specific operations.
The driver previously defined empty-body .suspend() and .resume()
callbacks. They can now be defined as NULL and bound with a
"struct dev_pm_ops" variable.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call the PCI helper functions like
pci_save/restore_state() and pci_set_power_state().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
Thus, there is no need to call the PCI helper functions like
pci_enable/disable_device(), pci_save/restore_state() and
pci_set_power_state().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states, and they used
PCI helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In this driver:
typhoon_resume() calls typhoon_wakeup(), which then calls the PCI helper
functions pci_set_power_state() and pci_restore_state(). The only other
function using typhoon_wakeup() is typhoon_open().
Thus, remove the pci_*() calls from typhoon_wakeup() and place them in
typhoon_open(), maintaining the order, to retain the normal behavior of
the function.
Now, typhoon_suspend() calls typhoon_sleep(), which then calls the PCI
helper functions pci_enable_wake(), pci_disable_device() and
pci_set_power_state(). Other functions:
- typhoon_open()
- typhoon_close()
- typhoon_init_one()
are also invoking typhoon_sleep(). Thus, in this case, we cannot simply
move the PCI helper function calls.
Hence, define a new function, typhoon_sleep_early(), which does all the
operations that typhoon_sleep() was doing before calling the PCI helper
functions. Now typhoon_sleep() calls typhoon_sleep_early() to do those
tasks; hence, the behavior of _open(), _close() and _init_one() remains
unchanged, and typhoon_suspend() only requires typhoon_sleep_early().
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix sparse build warning:
drivers/net/ethernet/qlogic/qed/qed_main.c:2480:6: warning:
symbol 'qed_hw_err_type_descr' was not declared. Should it be static?
Signed-off-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The variable err is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix build errors when CONFIG_INET is not set/enabled.
(.text+0x2b1b): undefined reference to `tcp_prot'
(.text+0x2b3b): undefined reference to `tcp_prot'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/b1a858ec-7e04-56bc-248a-62cb9bbee726@infradead.org
|
|
Song Liu says:
====================
This set introduces a new helper bpf_get_task_stack(). The primary use case
is to dump all /proc/*/stack to seq_file via bpf_iter__task.
A few different approaches have been explored and compared:
1. A simple wrapper around stack_trace_save_tsk(), as in v1 [1].
This approach introduces new syntax, which is different from the existing
helper bpf_get_stack(). Therefore, this is not ideal.
2. Extend get_perf_callchain() to support "task" as an argument.
This approach reuses most of bpf_get_stack(). However, extending
get_perf_callchain() requires non-trivial changes to architecture-specific
code, which is error prone.
3. The current (v2) approach leverages most of the existing bpf_get_stack()
and uses stack_trace_save_tsk() to handle architecture-specific logic.
[1] https://lore.kernel.org/netdev/20200623070802.2310018-1-songliubraving@fb.com/
Changes v4 => v5:
1. Rebase and work around git-am issue. (Alexei)
2. Update commit log for 4/4. (Yonghong)
Changes v3 => v4:
1. Simplify the selftests with bpf_iter.h. (Yonghong)
2. Add example output to commit log of 4/4. (Yonghong)
Changes v2 => v3:
1. Rebase on top of bpf-next. (Yonghong)
2. Sanitize get_callchain_entry(). (Peter)
3. Use has_callchain_buf for bpf_get_task_stack. (Andrii)
4. Other small clean up. (Yonghong, Andrii).
Changes v1 => v2:
1. Reuse most of bpf_get_stack() logic. (Andrii)
2. Fix unsigned long vs. u64 mismatch for 32-bit systems. (Yonghong)
3. Add %pB support in bpf_trace_printk(). (Daniel)
4. Fix buffer size to bytes.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The new test is similar to other bpf_iter tests. It dumps all
/proc/<pid>/stack to a seq_file. Here is some example output:
pid: 2873 num_entries: 3
[<0>] worker_thread+0xc6/0x380
[<0>] kthread+0x135/0x150
[<0>] ret_from_fork+0x22/0x30
pid: 2874 num_entries: 9
[<0>] __bpf_get_stack+0x15e/0x250
[<0>] bpf_prog_22a400774977bb30_dump_task_stack+0x4a/0xb3c
[<0>] bpf_iter_run_prog+0x81/0x170
[<0>] __task_seq_show+0x58/0x80
[<0>] bpf_seq_read+0x1c3/0x3b0
[<0>] vfs_read+0x9e/0x170
[<0>] ksys_read+0xa7/0xe0
[<0>] do_syscall_64+0x4c/0xa0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Note: bpf_iter test as-is doesn't print the contents of the seq_file. To
see the example above, it is necessary to add printf() to do_dummy_read.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-5-songliubraving@fb.com
|
|
This makes it easy to dump a stack trace in text.
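A hedged example of what this enables from a tracing program, per the
cover letter's mention of %pB support in bpf_trace_printk() (variable
names are illustrative):

	char fmt[] = "caller: %pB\n";

	/* prints a symbolized kernel address, e.g. func+0x10/0x80 */
	bpf_trace_printk(fmt, sizeof(fmt), (void *)ip);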
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-4-songliubraving@fb.com
|
|
Introduce the helper bpf_get_task_stack(), which dumps the stack trace of
a given task. This is different from bpf_get_stack(), which gets the stack
trace of the current task. One potential use case of bpf_get_task_stack()
is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a
seq_file.
bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for the kernel stack. The benefit of this choice is
that stack_trace_save_tsk() doesn't require changes in arch/. The downside
is that stack_trace_save_tsk() dumps the stack trace to an unsigned long
array. For 32-bit systems, we need to translate it to a u64 array.
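The 32-bit translation can be sketched as follows (hypothetical helper;
the in-kernel code may differ):

/* widen stack entries saved by stack_trace_save_tsk() into the u64
 * layout that bpf_get_stack() exposes
 */
static void widen_entries(const unsigned long *entries, u64 *out,
			  unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++)
		out[i] = entries[i];
}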
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
|
|
Sanitize and expose get/put_callchain_entry(). These will be used by the
bpf stack map.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-2-songliubraving@fb.com
|
|
bpf_free_used_maps() or close(map_fd) will trigger map_free callback.
bpf_free_used_maps() is called after bpf prog is no longer executing:
bpf_prog_put->call_rcu->bpf_prog_free->bpf_free_used_maps.
Hence there is no need to call synchronize_rcu() to protect map elements.
Note that hash_of_maps and array_of_maps update/delete inner maps via
sys_bpf() that calls maybe_wait_bpf_programs() and synchronize_rcu().
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/bpf/20200630043343.53195-2-alexei.starovoitov@gmail.com
|
|
Add simple selftest validating byte swap built-ins and compile-time macros.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-3-andriin@fb.com
|
|
Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from
users wanting to use bpf_endian.h in their BPF applications built with
CO-RE and vmlinux.h.
To achieve that, re-implement the byte swap macros and drop all the header
includes. This way it can be used both with Linux header includes and with
vmlinux.h.
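A minimal sketch of a header-free byte swap, assuming compiler built-ins
and the predefined __BYTE_ORDER__ macro (the actual bpf_endian.h
implementation may differ):

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define my_bpf_htons(x) __builtin_bswap16(x)	/* illustrative name */
#else
# define my_bpf_htons(x) (x)
#endif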
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com
|
|
Rahul Lakkireddy says:
====================
cxgb4: add mirror action support for TC-MATCHALL
This series of patches adds support for mirroring all ingress traffic
for the TC-MATCHALL ingress offload.
Patch 1 adds support to dynamically create a mirror Virtual Interface
(VI) that accepts all mirror ingress traffic when mirror action is
set in TC-MATCHALL offload.
Patch 2 adds support to allocate mirror Rxqs and setup RSS for the
mirror VI.
Patch 3 adds support to replicate all the main VI configuration to
mirror VI. This includes replicating MTU, promiscuous mode,
all-multicast mode, and enabled netdev Rx feature offloads.
v3:
- Replace mirror VI refcount_t with normal u32 variable in all patches.
- Add back calling cxgb4_port_mirror_start() in cxgb_open(), which
was there in v1, but got missed in v2 during refactoring, in patch
3.
v2:
- Add mutex to protect all mirror VI data, instead of just
mirror Rxqs, in patch 1 and 2.
- Remove the un-needed mirror Rxq mutex in patch 2.
- Simplify the replication code by refactoring t4_set_rxmode()
to handle mirror VI, instead of duplicating the t4_set_rxmode()
calls in multiple places in patch 3.
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When mirror VI is enabled, replicate various VI config params
enabled on main VI to mirror VI. These include replicating MTU,
promiscuous mode, all-multicast mode, and enabled netdev Rx
feature offloads.
v3:
- Replace mirror VI refcount_t with normal u32 variable.
- Add back calling cxgb4_port_mirror_start() in cxgb_open(), which
was there in v1, but got missed in v2 during refactoring.
v2:
- Simplify the replication code by refactoring t4_set_rxmode()
to handle mirror VI, instead of duplicating the t4_set_rxmode()
calls in multiple places.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When mirror VI is enabled, allocate the mirror Rxqs and setup the
mirror VI RSS table. The mirror Rxqs are allocated/freed when
the mirror VI is created/destroyed or when underlying port is
brought up/down, respectively.
v3:
- Replace mirror VI refcount_t with normal u32 variable.
v2:
- Use mutex to protect all mirror VI data, instead of just
mirror Rxqs.
- Remove the un-needed mirror Rxq mutex.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|