Age | Commit message (Collapse) | Author |
|
The check for flags is done to not pass empty lookups to adding switch
rule functions. Since metadata is always added to lookups there is no
need to check against the flag.
It is also fixing the problem with such rule:
$ tc filter add dev gtp_dev ingress protocol ip prio 0 flower \
enc_dst_port 2123 action drop
Switch block in case of GTP can't parse the destination port, because it
should always be set to GTP specific value. The same with ethertype. The
result is that there is no other matching criteria than GTP tunnel. In
this case flags is 0, rule can't be added only because of defensive
check against flags.
Fixes: 9a225f81f540 ("ice: Support GTP-U and GTP-C offload in switchdev")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In case of traffic going from the VF (so ingress for port representor)
source VSI should be consider during packet classification. It is
needed for hardware to not match packets from different ports with
filters added on other port.
It is only for "from VF" traffic, because other traffic direction
doesn't have source VSI.
Set correct ::src_vsi in rule_info to pass it to the hardware filter.
For example this rule should drop only ipv4 packets from eth10, not from
the others VF PRs. It is needed to check source VSI in this case.
$tc filter add dev eth10 ingress protocol ip flower skip_sw action drop
Fixes: 0d08a441fb1a ("ice: ndo_setup_tc implementation for PF")
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Serge Semin says:
====================
net: stmmac: Fix MAC-capabilities procedure
The series got born as a result of the discussions around the recent
Yanteng' series adding the Loongson LS7A1000, LS2K1000, LS7A2000, LS2K2000
MACs support:
Link: https://lore.kernel.org/netdev/fu3f6uoakylnb6eijllakeu5i4okcyqq7sfafhp5efaocbsrwe@w74xe7gb6x7p
In particular the Yanteng' patchset needed to implement the Loongson
MAC-specific constraints applied to the link speed and link duplex mode.
As a result of the discussion with Russel the next preliminary patch was
born:
Link: https://lore.kernel.org/netdev/df31e8bcf74b3b4ddb7ddf5a1c371390f16a2ad5.1712917541.git.siyanteng@loongson.cn
The patch above was a temporal solution utilized by Yanteng for further
developments and to move on with the on-going review. This patchset is a
refactored version of that single patch with formatting required for the
fixes patches.
In particular the series starts with fixing the half-duplex-less
constraint currently applied for all IP-cores. In fact it's specific for
the DW QoS Eth only (DW GMAC v4.x/v5.x).
The next patch fixes the MAC-capabilities setting up during the active
Tx/Rx queues re-initialization procedure. Particularly the procedure
missed the max-speed limit thus possibly activating speeds prohibited on
the respective platforms.
Third patch fixes the incorrect MAC-capabilities initialization for DW
MAC100, DW XGMAC and DW XLGMAC devices by moving the correct
initialization to the IP-core specific setup() methods.
That's it for now. Thanks for review and testing in advance.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Samuel Holland <samuel@sholland.org>
Cc: netdev@vger.kernel.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-sunxi@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
====================
Link: https://lore.kernel.org/r/20240412180340.7965-1-fancer.lancer@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Here is the list of the MAC capabilities specific to the particular DW MAC
IP-cores currently supported by the driver:
DW MAC100: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_10 | MAC_100
DW GMAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_10 | MAC_100 | MAC_1000
Allwinner sun8i MAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_10 | MAC_100 | MAC_1000
DW QoS Eth: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_10 | MAC_100 | MAC_1000 | MAC_2500FD
if there is more than 1 active Tx/Rx queues:
MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_10FD | MAC_100FD | MAC_1000FD | MAC_2500FD
DW XGMAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_1000FD | MAC_2500FD | MAC_5000FD | MAC_10000FD
DW XLGMAC: MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
MAC_1000FD | MAC_2500FD | MAC_5000FD | MAC_10000FD |
MAC_25000FD | MAC_40000FD | MAC_50000FD | MAC_100000FD
As you can see there are only two common capabilities:
MAC_ASYM_PAUSE | MAC_SYM_PAUSE.
Meanwhile what is currently implemented defines 10/100/1000 link speeds
for all IP-cores, which is definitely incorrect for DW MAC100, DW XGMAC
and DW XLGMAC devices.
Seeing the flow-control is implemented as a callback for each MAC IP-core
(see dwmac100_flow_ctrl(), dwmac1000_flow_ctrl(), sun8i_dwmac_flow_ctrl(),
etc) and since the MAC-specific setup() method is supposed to be called
for each available DW MAC-based device, the capabilities initialization
can be freely moved to these setup() functions, thus correctly setting up
the MAC-capabilities for each IP-core (including the Allwinner Sun8i). A
new stmmac_link::caps field was specifically introduced for that so to
have all link-specific info preserved in a single structure.
Note the suggested change fixes three earlier commits at a time. The
commit 5b0d7d7da64b ("net: stmmac: Add the missing speeds that XGMAC
supports") permitted the 10-100 link speeds and 1G half-duplex mode for DW
XGMAC IP-core even though it doesn't support them. The commit df7699c70c1b
("net: stmmac: Do not cut down 1G modes") incorrectly added the MAC1000
capability to the DW MAC100 IP-core. Similarly to the DW XGMAC the commit
8a880936e902 ("net: stmmac: Add XLGMII support") incorrectly permitted the
10-100 link speeds and 1G half-duplex mode for DW XLGMAC IP-core.
Fixes: 5b0d7d7da64b ("net: stmmac: Add the missing speeds that XGMAC supports")
Fixes: df7699c70c1b ("net: stmmac: Do not cut down 1G modes")
Fixes: 8a880936e902 ("net: stmmac: Add XLGMII support")
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
It's possible to have the maximum link speed being artificially limited on
the platform-specific basis. It's done either by setting up the
plat_stmmacenet_data::max_speed field or by specifying the "max-speed"
DT-property. In such cases it's required that any specific
MAC-capabilities re-initializations would take the limit into account. In
particular the link speed capabilities may change during the number of
active Tx/Rx queues re-initialization. But the currently implemented
procedure doesn't take the speed limit into account.
Fix that by calling phylink_limit_mac_speed() in the
stmmac_reinit_queues() method if the speed limitation was required in the
same way as it's done in the stmmac_phy_setup() function.
Fixes: 95201f36f395 ("net: stmmac: update MAC capabilities when tx queues are updated")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There are three DW MAC IP-cores which can have the multiple Tx/Rx queues
enabled:
DW GMAC v3.7+ with AV feature,
DW QoS Eth v4.x/v5.x,
DW XGMAC/XLGMAC
Based on the respective HW databooks, only the DW QoS Eth IP-core doesn't
support the half-duplex link mode in case if more than one queues enabled:
"In multiple queue/channel configurations, for half-duplex operation,
enable only the Q0/CH0 on Tx and Rx. For single queue/channel in
full-duplex operation, any queue/channel can be enabled."
The rest of the IP-cores don't have such constraint. Thus in order to have
the constraint applied for the DW QoS Eth MACs only, let's move the it'
implementation to the respective MAC-capabilities getter and make sure the
getter is called in the queues re-init procedure.
Fixes: b6cfffa7ad92 ("stmmac: fix DMA channel hang in half-duplex mode")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Romain Gantois <romain.gantois@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
No new changes will be added for minor version 2. Change the minor
version number to 2 and stop displaying log message for unsupported
minor version 2.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20240415220625.2828339-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Add Granite Rapids-D to hpm_cpu_ids, so that MSR 0x54 can be used.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20240415212853.2820470-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
ROG Zephyrus G14 advertises support for SPS notifications to the
BIOS but doesn't actually use them. Instead the asus-nb-wmi driver
utilizes such events.
Add a quirk to prevent the system from registering for ACPI platform
profile when this system is found to avoid conflicts.
Reported-by: al0uette@outlook.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218685
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240410140956.385-3-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
In the event of a BIOS bug add infrastructure that will be utilized
to override the return value for supported_funcs to avoid enabling
features.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240410140956.385-2-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
ASUS ROG Zephyrus G14 doesn't have _CRS in AMDI0102 device and so
there are no resources to walk. This is expected behavior because
it doesn't support Smart PC. Decrease error message to debug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218685
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240410140956.385-1-mario.limonciello@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Dmitry Safonov via says:
====================
selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests
Started as addressing the flakiness issues in rst_ipv*, that affect
netdev dashboard.
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
====================
Link: https://lore.kernel.org/r/20240413-tcp-ao-selftests-fixes-v1-0-f9c41c96949d@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces
> lib/setup.c: In function ‘__test_msg’:
> lib/setup.c:20:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 20 | ksft_print_msg(buf);
> | ^~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_ok’:
> lib/setup.c:26:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 26 | ksft_test_result_pass(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_fail’:
> lib/setup.c:32:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 32 | ksft_test_result_fail(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_xfail’:
> lib/setup.c:38:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 38 | ksft_test_result_xfail(buf);
> | ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_error’:
> lib/setup.c:44:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 44 | ksft_test_result_error(buf);
> | ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_skip’:
> lib/setup.c:50:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 50 | ksft_test_result_skip(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> cc1: some warnings being treated as errors
As the buffer was already pre-printed into, print it as a string
rather than a format-string.
Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces:
> lib/proc.c: In function ‘netstat_read_type’:
> lib/proc.c:89:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 89 | if (fscanf(fnetstat, type->header_name) == EOF)
> | ^~
> cc1: some warnings being treated as errors
Here the selftests lib parses header name, while expectes non-space word
ending with a column.
Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The structure is on the stack and has to be zero-initialized as
the kernel checks for:
> if (in.reserved != 0 || in.reserved2 != 0)
> return -EINVAL;
Fixes: b26660531cf6 ("selftests/net: Add test for TCP-AO add setsockopt() command")
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, "active reset" cases are flaky, because select() is called
for 3 sockets, while only 2 are expected to receive RST.
The idea of the third socket was to get into request_sock_queue,
but the test mistakenly attempted to connect() after the listener
socket was shut down.
Repair this test, it's important to check the different kernel
code-paths for signing RST TCP-AO segments.
Fixes: c6df7b2361d7 ("selftests/net: Add TCP-AO RST test")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix a potential tracepoint crash
- Fix NFSv4 GETATTR on big-endian platforms
* tag 'nfsd-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: fix endianness issue in nfsd4_encode_fattr4
SUNRPC: Fix rpcgss_context trace event acceptor field
|
|
Pull yet more bcachefs fixes from Kent Overstreet:
"This gets recovery working again for the affected user I've been
working with, and I'm still waiting to hear back on other bug reports
but should fix it for everyone else who's been having issues with
recovery.
- Various recovery fixes:
- fixes for the btree_insert_entry being resized on path
allocation btree_path array recently became dynamically
resizable, and btree_insert_entry along with it; this was being
observed during journal replay, when write buffer btree updates
don't use the write buffer and instead use the normal btree
update path
- multiple fixes for deadlock in recovery when we need to do lots
of btree node merges; excessive merges were clocking up the
whole pipeline
- write buffer path now correctly does btree node merges when
needed
- fix failure to go RW when superblock indicates recovery passes
needed (i.e. to complete an unfinished upgrade)
- Various unsafety fixes - test case contributed by a user who had
two drives out of a six drive array write out a whole bunch of
garbage after power failure
- New (tiny) on disk format feature: since it appears the btree node
scan tool will be a more regular thing (crappy hardware, user
error) - this adds a 64 bit per-device bitmap of regions that have
ever had btree nodes.
- A path->should_be_locked fix, from a larger patch series tightening
up invariants and assertions around btree transaction and path
locking state.
This particular fix prevents us from keeping around btree_paths
that are no longer needed"
* tag 'bcachefs-2024-04-15' of https://evilpiepirate.org/git/bcachefs: (24 commits)
bcachefs: set_btree_iter_dontneed also clears should_be_locked
bcachefs: fix error path of __bch2_read_super()
bcachefs: Check for backpointer bucket_offset >= bucket size
bcachefs: bch_member.btree_allocated_bitmap
bcachefs: sysfs internal/trigger_journal_flush
bcachefs: Fix bch2_btree_node_fill() for !path
bcachefs: add safety checks in bch2_btree_node_fill()
bcachefs: Interior known are required to have known key types
bcachefs: add missing bounds check in __bch2_bkey_val_invalid()
bcachefs: Fix btree node merging on write buffer btrees
bcachefs: Disable merges from interior update path
bcachefs: Run merges at BCH_WATERMARK_btree
bcachefs: Fix missing write refs in fs fio paths
bcachefs: Fix deadlock in journal replay
bcachefs: Go rw if running any explicit recovery passes
bcachefs: Standardize helpers for printing enum strs with bounds checks
bcachefs: don't queue btree nodes for rewrites during scan
bcachefs: fix race in bch2_btree_node_evict()
bcachefs: fix unsafety in bch2_stripe_to_text()
bcachefs: fix unsafety in bch2_extent_ptr_to_text()
...
|
|
This is part of a larger series cleaning up the semantics of
should_be_locked and adding assertions around it; if we don't need an
iterator/path anymore, it clearly doesn't need to be locked.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In __bch2_read_super(), if kstrdup() fails, it needs to release memory
in sb->holder, fix to call bch2_free_super() in the error path.
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"A fix to kselftest harness to prevent infinite loop triggered in an
assert in FIXTURE_TEARDOWN and a fix to a problem seen in being able
to stop subsystem-enable tests when sched events are being traced"
* tag 'linux_kselftest-fixes-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/harness: Prevent infinite loop due to Assert in FIXTURE_TEARDOWN
selftests/ftrace: Limit length in subsystem-enable tests
|
|
Allow the power-domains property to the PWM_DISP block as on some SoCs
this does need at most one power domain.
Fixes: b09b179bac0a ("dt-bindings: pwm: Convert pwm-mtk-disp.txt to mediatek,pwm-disp.yaml format")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240404081808.92199-1-angelogioacchino.delregno@collabora.com
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
|
|
With 16 channel pwm support, we're registering two instances of pwm_chip
with 8 channels each. We need to update PM functions to use both instances
of pwm_chip during power state transitions.
Introduce struct dwc_pwm_drvdata and use it as driver_data, which will
maintain both instances of pwm_chip along with dwc_pwm_info and allow us
to use them inside suspend/resume handles.
Fixes: ebf2c89eb95e ("pwm: dwc: Add 16 channel support for Intel Elkhart Lake")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240415074051.14681-1-raag.jadav@intel.com
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
|
|
Upon reviewing the flower control flags handling in
this driver, I notice that the key wasn't being used,
only the mask.
Ie. `tc flower ... ip_flags nofrag` was hardware
offloaded as `... ip_flags frag`.
Only compile tested, no access to HW.
Fixes: c672e3727989 ("octeontx2-pf: Add support to filter packet based on IP fragment")
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit under Fixes optimized the number of recv() calls
needed during RTM_GETROUTE dumps, but we got multiple
reports of applications hanging on recv() calls.
Applications expect that a route dump will be terminated
with a recv() reading an individual NLM_DONE message.
Coalescing NLM_DONE is perfectly legal in netlink,
but even tho reporters fixed the code in respective
projects, chances are it will take time for those
applications to get updated. So revert to old behavior
(for now)?
Old kernel (5.19):
$ ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \
--dump getroute --json '{"rtm-family": 2}'
Recv: read 692 bytes, 11 messages
nl_len = 68 (52) nl_flags = 0x22 nl_type = 24
...
nl_len = 60 (44) nl_flags = 0x22 nl_type = 24
Recv: read 20 bytes, 1 messages
nl_len = 20 (4) nl_flags = 0x2 nl_type = 3
Before (6.9-rc2):
$ ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \
--dump getroute --json '{"rtm-family": 2}'
Recv: read 712 bytes, 12 messages
nl_len = 68 (52) nl_flags = 0x22 nl_type = 24
...
nl_len = 60 (44) nl_flags = 0x22 nl_type = 24
nl_len = 20 (4) nl_flags = 0x2 nl_type = 3
After:
$ ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \
--dump getroute --json '{"rtm-family": 2}'
Recv: read 692 bytes, 11 messages
nl_len = 68 (52) nl_flags = 0x22 nl_type = 24
...
nl_len = 60 (44) nl_flags = 0x22 nl_type = 24
Recv: read 20 bytes, 1 messages
nl_len = 20 (4) nl_flags = 0x2 nl_type = 3
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://lore.kernel.org/all/20240315124808.033ff58d@elisabeth
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/all/02b50aae-f0e9-47a4-8365-a977a85975d3@ovn.org
Fixes: 4ce5dc9316de ("inet: switch inet_dump_fib() to RCU protection")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This adds a small (64 bit) per-device bitmap that tracks ranges that
have btree nodes, for accelerating btree node scan if it is ever needed.
- New helpers, bch2_dev_btree_bitmap_marked() and
bch2_dev_bitmap_mark(), for checking and updating the bitmap
- Interior btree update path updates the bitmaps when required
- The check_allocations pass has a new fsck_err check,
btree_bitmap_not_marked
- New on disk format version, mi_btree_mitmap, which indicates the new
bitmap is present
- Upgrade table lists the required recovery pass and expected fsck error
- Btree node scan uses the bitmap to skip ranges if we're on the new
version
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a sysfs knob for immediately flushing the entire journal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We shouldn't be doing the unlock/relock dance when we're not using a
path - this fixes an assertion pop when called from btree node scan.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
For forwards compatibilyt, we allow bkeys of unknown type in leaf nodes;
we can simply ignore metadata we don't understand. Pointers to btree
nodes must always be of known types, howwever.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull sysfs fix from Al Viro:
"Get rid of lockdep false positives around sysfs/overlayfs
syzbot has uncovered a class of lockdep false positives for setups
with sysfs being one of the backing layers in overlayfs. The root
cause is that of->mutex allocated when opening a sysfs file read-only
(which overlayfs might do) is confused with of->mutex of a file opened
writable (held in write to sysfs file, which overlayfs won't do).
Assigning them separate lockdep classes fixes that bunch and it's
obviously safe"
* tag 'pull-sysfs-annotation-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kernfs: annotate different lockdep class for of->mutex of writable files
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Follow up fixes for the BHI mitigations code
- Fix !SPECULATION_MITIGATIONS bug not turning off mitigations as
expected
- Work around an APIC emulation bug when the kernel is built with Clang
and run as a SEV guest
- Follow up x86 topology fixes
* tag 'x86-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/amd: Move TOPOEXT enablement into the topology parser
x86/cpu/amd: Make the NODEID_MSR union actually work
x86/cpu/amd: Make the CPUID 0x80000008 parser correct
x86/bugs: Replace CONFIG_SPECTRE_BHI_{ON,OFF} with CONFIG_MITIGATION_SPECTRE_BHI
x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto
x86/bugs: Clarify that syscall hardening isn't a BHI mitigation
x86/bugs: Fix BHI handling of RRSBA
x86/bugs: Rename various 'ia32_cap' variables to 'x86_arch_cap_msr'
x86/bugs: Cache the value of MSR_IA32_ARCH_CAPABILITIES
x86/bugs: Fix BHI documentation
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
x86/topology: Don't update cpu_possible_map in topo_set_cpuids()
x86/bugs: Fix return type of spectre_bhi_state()
x86/apic: Force native_apic_mem_read() to use the MOV instruction
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
- Address a (valid) W=1 build warning
- Fix timer self-tests
- Annotate a KCSAN warning wrt. accesses to the tick_do_timer_cpu
global variable
- Address a !CONFIG_BUG build warning
* tag 'timers-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests: kselftest: Fix build failure with NOLIBC
selftests: timers: Fix abs() warning in posix_timers test
selftests: kselftest: Mark functions that unconditionally call exit() as __noreturn
selftests: timers: Fix posix_timers ksft_print_msg() warning
selftests: timers: Fix valid-adjtimex signed left-shift undefined behavior
bug: Fix no-return-statement warning with !CONFIG_BUG
timekeeping: Use READ/WRITE_ONCE() for tick_do_timer_cpu
selftests/timers/posix_timers: Reimplement check_timer_distribution()
irqflags: Explicitly ignore lockdep_hrtimer_exit() argument
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fix from Ingo Molnar:
"Fix the x86 PMU multi-counter code returning invalid data in certain
circumstances"
* tag 'perf-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix out of range data
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix a PREEMPT_RT build bug"
* tag 'locking-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking: Make rwsem_assert_held_write_nolockdep() build with PREEMPT_RT=y
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"Fix a bug in the GIC irqchip driver"
* tag 'irq-urgent-2024-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1
|
|
Pull virtio bugfixes from Michael Tsirkin:
"Some small, obvious (in hindsight) bugfixes:
- new ioctl in vhost-vdpa has a wrong # - not too late to fix
- vhost has apparently been lacking an smp_rmb() - due to code
duplication :( The duplication will be fixed in the next merge
cycle, this is a minimal fix
- an error message in vhost talks about guest moving used index -
which of course never happens, guest only ever moves the available
index
- i2c-virtio didn't set the driver owner so it did not get refcounted
correctly"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: correct misleading printing information
vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE
virtio: store owner from modules with register_virtio_driver()
vhost: Add smp_rmb() in vhost_enable_notify()
vhost: Add smp_rmb() in vhost_vq_avail_empty()
|
|
git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- fix up swiotlb buffer padding even more (Petr Tesarik)
- fix for partial dma_sync on swiotlb (Michael Kelley)
- swiotlb debugfs fix (Dexuan Cui)
* tag 'dma-maping-6.9-2024-04-14' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: do not set total_used to 0 in swiotlb_create_debugfs_files()
swiotlb: fix swiotlb_bounce() to do partial sync's correctly
swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
|
|
The commit fc8b2a619469
("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation")
adds check of potential number of UDP segments vs
UDP_MAX_SEGMENTS in linux/virtio_net.h.
After this change certification test of USO guest-to-guest
transmit on Windows driver for virtio-net device fails,
for example with packet size of ~64K and mss of 536 bytes.
In general the USO should not be more restrictive than TSO.
Indeed, in case of unreasonably small mss a lot of segments
can cause queue overflow and packet loss on the destination.
Limit of 128 segments is good for any practical purpose,
with minimal meaningful mss of 536 the maximal UDP packet will
be divided to ~120 segments.
The number of segments for UDP packets is validated vs
UDP_MAX_SEGMENTS also in udp.c (v4,v6), this does not affect
quest-to-guest path but does affect packets sent to host, for
example.
It is important to mention that UDP_MAX_SEGMENTS is kernel-only
define and not available to user mode socket applications.
In order to request MSS smaller than MTU the applications
just uses setsockopt with SOL_UDP and UDP_SEGMENT and there is
no limitations on socket API level.
Fixes: fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation")
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The writable file /sys/power/resume may call vfs lookup helpers for
arbitrary paths and readonly files can be read by overlayfs from vfs
helpers when sysfs is a lower layer of overalyfs.
To avoid a lockdep warning of circular dependency between overlayfs
inode lock and kernfs of->mutex, use a different lockdep class for
writable and readonly kernfs files.
Reported-by: syzbot+9a5b0ced8b1bfb238b56@syzkaller.appspotmail.com
Fixes: 0fedefd4c4e3 ("kernfs: sysfs: support custom llseek method for sysfs entries")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The btree write buffer flush fastpath that avoids the main transaction
commit path had the unfortunate side effect of not doing btree node
merging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's been a bug in the btree write buffer where it wasn't triggering
btree node merges - and leaving behind a bunch of nearly empty btree
nodes.
Then during journal replay, when updates to the backpointers btree
aren't using the btree write buffer (because we require synchronization
with journal replay), we end up doing those merges all at once.
Then if it's the interior update path running them, we deadlock because
those run with the highest watermark.
There's no real need for the interior update path to be doing btree node
merges; other code paths can handle that at lower watermarks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes a deadlock where the interior update path during journal
replay ends up doing a ton of merges on the backpointers btree, and
deadlocking.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_journal_flush_seq requires us to have a write ref
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
btree_key_can_insert_cached() should be checking the watermark -
BCH_TRANS_COMMIT_journal_replay really means nonblocking mode when
watermark < reclaim, it was being used incorrectly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes a bug where we fail to start when upgrading/downgrading
because we forgot we needed to go rw.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|