Age | Commit message (Collapse) | Author |
|
WARN_ON("string") will unconditionally trigger a warning, but
not really do what it may look like. Use WARN(1, ...) instead
and add the mode number as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20240705133921.a50aa5b15ece.I9a25b7448b0498c0c2e503986978dae165c8bdf8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Do not check BSS color collision in following cases
1. already under a color change
2. color change is disabled
Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com>
Link: https://patch.msgid.link/20240705074346.11228-1-michael-cy.lee@mediatek.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The color change finalize work might be called after the link is
stopped, which might lead to a kernel crash.
Signed-off-by: Michael-CY Lee <michael-cy.lee@mediatek.com>
Link: https://patch.msgid.link/20240705074326.11172-1-michael-cy.lee@mediatek.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When user issues a connection with a different SSID than the one
virt_wifi has advertised, the __cfg80211_connect_result() will
trigger the warning: WARN_ON(bss_not_found).
The issue is because the connection code in virt_wifi does not
check the SSID from user space (it only checks the BSSID), and
virt_wifi will call cfg80211_connect_result() with WLAN_STATUS_SUCCESS
even if the SSID is different from the one virt_wifi has advertised.
Eventually cfg80211 won't be able to find the cfg80211_bss and generate
the warning.
Fixed it by checking the SSID (from user space) in the connection code.
Fixes: c7cdba31ed8b ("mac80211-next: rtnetlink wifi simulation device")
Reported-by: syzbot+d6eb9cee2885ec06f5e3@syzkaller.appspotmail.com
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
Link: https://patch.msgid.link/20240705023756.10954-1-en-wei.wu@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Avoid reusing stale driver data when an interface is brought down and up
again. In order to avoid having to duplicate the memset in every single
driver, do it here.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/20240704130947.48609-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In the current state, an erroneous call to
bpf_object__find_map_by_name(NULL, ...) leads to a segmentation
fault through the following call chain:
bpf_object__find_map_by_name(obj = NULL, ...)
-> bpf_object__for_each_map(pos, obj = NULL)
-> bpf_object__next_map((obj = NULL), NULL)
-> return (obj = NULL)->maps
While calling bpf_object__find_map_by_name with obj = NULL is
obviously incorrect, this should not lead to a segmentation
fault but rather be handled gracefully.
As __bpf_map__iter already handles this situation correctly, we
can delegate the check for the regular case there and only add
a check in case the prev or next parameter is NULL.
Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
|
|
Create a helper function that puts the data from struct
ieee80211_iface_combination to a nl80211 message.
This will be used for adding per-radio interface combination data.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/22a0eee19dbcf98627239328bc66decd3395122c.1719919832.git-series.nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The order in which lists are sorted in __thermal_zone_device_update()
is reverse with respect to what it should be due to a mistake in
thermal_trip_notify_cmp().
Fix it and observe that it is not necessary to sort the lists in
different orders. They can both be sorted in ascending order if
way_down_list is walked in reverse order which allows the code to
be slightly more straightforward (and less prone to silly mistakes).
Fixes: 7454f2c42cce ("thermal: core: Sort trip point crossing notifications by temperature")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12481676.O9o76ZdvQC@rjwysocki.net
|
|
Now that the s390x JIT supports exceptions, remove the respective tests
from the denylist.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-4-iii@linux.ibm.com
|
|
Implement the following three pieces required from the JIT:
- A "top-level" BPF prog (exception_boundary) must save all
non-volatile registers, and not only the ones that it clobbers.
- A "handler" BPF prog (exception_cb) must switch stack to that of
exception_boundary, and restore the registers that exception_boundary
saved.
- arch_bpf_stack_walk() must unwind the stack and provide the results
in a way that satisfies both bpf_throw() and exception_cb.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com
|
|
Using a mask instead of an array saves a small amount of memory and
allows marking multiple registers as seen with a simple "or". Another
positive side-effect is that it speeds up verification with jitterbug.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com
|
|
After commit 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU") this
loop always iterates exactly one time. Delete the for statement and pull
the code in a tab.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/ZoWJF51D4zWb6f5t@stanley.mountain
|
|
When BPF_TRAMP_F_CALL_ORIG is not set, stack space for passing arguments
on stack doesn't need to be reserved because the original function is
not called.
Only reserve space for stacked arguments when BPF_TRAMP_F_CALL_ORIG is
set.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240708114758.64414-1-puranjay@kernel.org
|
|
Set proper MAC capabilities for port 4 on LAN9371 and LAN9372 switches with
integrated 100BaseTX PHY. And introduce the is_lan937x_tx_phy() function to
reuse it where applicable.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Correct the example to match the help text from the devlink utility.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
At this time, conntrack either returns NF_ACCEPT or NF_DROP.
To improve debuging it would be nice to be able to replace NF_DROP verdict
with NF_DROP_REASON() helper,
This helper releases the skb instantly (so drop_monitor can pinpoint
exact location) and returns NF_STOLEN.
Prepare call sites to deal with this before introducing such changes
in conntrack and nat core.
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dsos__add would add at the end of the dso array possibly requiring a
later find to re-sort the array. Patterns of find then add were
becoming O(n*log n) due to the sorts. Change the add routine to be
O(n) rather than O(1) but to maintain the sorted-ness of the dsos
array so that later finds don't need the O(n*log n) sort.
Fixes: 3f4ac23a9908 ("perf dsos: Switch backing storage to array from rbtree/list")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Link: https://lore.kernel.org/r/20240703172117.810918-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The array is sorted, so just move the elements and insert in order.
Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Matt Fleming <matt@readmodwrite.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Matt Fleming <matt@readmodwrite.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240703172117.810918-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A set of clk fixes for the Qualcomm, Mediatek, and Allwinner drivers:
- Fix the Qualcomm Stromer Plus PLL set_rate() clk_op to explicitly
set the alpha enable bit and not set bits that don't exist
- Mark Qualcomm IPQ9574 crypto clks as voted to avoid stuck clk
warnings
- Fix the parent of some PLLs on Qualcomm sm6530 so their rate is
correct
- Fix the min/max rate clamping logic in the Allwinner driver that
got broken in v6.9
- Limit runtime PM enabling in the Mediatek driver to only
mt8183-mfgcfg so that system wide resume doesn't break on other
Mediatek SoCs"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: mediatek: mt8183: Only enable runtime PM on mt8183-mfgcfg
clk: sunxi-ng: common: Don't call hw_to_ccu_common on hw without common
clk: qcom: gcc-ipq9574: Add BRANCH_HALT_VOTED flag
clk: qcom: apss-ipq-pll: remove 'config_ctl_hi_val' from Stromer pll configs
clk: qcom: clk-alpha-pll: set ALPHA_EN bit for Stromer Plus PLLs
clk: qcom: gcc-sm6350: Fix gpll6* & gpll7 parents
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix unnecessary copy to 0 when kernel is booted at address 0
- Fix usercopy crash when dumping dtl via debugfs
- Avoid possible crash when PCI hotplug races with error handling
- Fix kexec crash caused by scv being disabled before other CPUs
call-in
- Fix powerpc selftests build with USERCFLAGS set
Thanks to Anjali K, Ganesh Goudar, Gautam Menghani, Jinglin Wen,
Nicholas Piggin, Sourabh Jain, Srikar Dronamraju, and Vishal Chourasia.
* tag 'powerpc-6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Fix build with USERCFLAGS set
powerpc/pseries: Fix scv instruction crash with kexec
powerpc/eeh: avoid possible crash when edev->pdev changes
powerpc/pseries: Whitelist dtl slub object for copying to userspace
powerpc/64s: Fix unnecessary copy to 0 when kernel is booted at address 0
|
|
Pull smb client fix from Steve French:
"Fix for smb3 readahead performance regression"
* tag '6.10-rc6-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix read-performance regression by dropping readahead expansion
|
|
Now working at Oracle.
Link: https://lkml.kernel.org/r/20240703092704.11571-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Even on 6.10-rc6, I've been seeing elusive "Bad page state"s (often on
flags when freeing, yet the flags shown are not bad: PG_locked had been
set and cleared??), and VM_BUG_ON_PAGE(page_ref_count(page) == 0)s from
deferred_split_scan()'s folio_put(), and a variety of other BUG and WARN
symptoms implying double free by deferred split and large folio migration.
6.7 commit 9bcef5973e31 ("mm: memcg: fix split queue list crash when large
folio migration") was right to fix the memcg-dependent locking broken in
85ce2c517ade ("memcontrol: only transfer the memcg data for migration"),
but missed a subtlety of deferred_split_scan(): it moves folios to its own
local list to work on them without split_queue_lock, during which time
folio->_deferred_list is not empty, but even the "right" lock does nothing
to secure the folio and the list it is on.
Fortunately, deferred_split_scan() is careful to use folio_try_get(): so
folio_migrate_mapping() can avoid the race by folio_undo_large_rmappable()
while the old folio's reference count is temporarily frozen to 0 - adding
such a freeze in the !mapping case too (originally, folio lock and
unmapping and no swap cache left an anon folio unreachable, so no freezing
was needed there: but the deferred split queue offers a way to reach it).
Link: https://lkml.kernel.org/r/29c83d1a-11ca-b6c9-f92e-6ccb322af510@google.com
Fixes: 9bcef5973e31 ("mm: memcg: fix split queue list crash when large folio migration")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
compat
On a system with Perl 5.12.1, commit 5ef6dc08cfde
("lib/build_OID_registry: don't mention the full path of the script in
output") causes the build to fail with the error below.
Bareword found where operator expected at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
syntax error at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
Execution of ./lib/build_OID_registry aborted due to compilation errors.
make[3]: *** [lib/Makefile:352: lib/oid_registry_data.c] Error 255
Ahmad Fatoum analyzed that non-destructive substitution is only supported since
Perl 5.13.2. Instead of dropping `r` and having the side effect of modifying
`$0`, introduce a dedicated variable to support older Perl versions.
Link: https://lkml.kernel.org/r/20240702223512.8329-2-pmenzel@molgen.mpg.de
Link: https://lkml.kernel.org/r/20240701155802.75152-1-pmenzel@molgen.mpg.de
Fixes: 5ef6dc08cfde ("lib/build_OID_registry: don't mention the full path of the script in output")
Link: https://lore.kernel.org/all/259f7a87-2692-480e-9073-1c1c35b52f67@molgen.mpg.de/
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Suggested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A kernel warning was reported when pinning folio in CMA memory when
launching SEV virtual machine. The splat looks like:
[ 464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
[ 464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
[ 464.325477] RIP: 0010:__get_user_pages+0x423/0x520
[ 464.325515] Call Trace:
[ 464.325520] <TASK>
[ 464.325523] ? __get_user_pages+0x423/0x520
[ 464.325528] ? __warn+0x81/0x130
[ 464.325536] ? __get_user_pages+0x423/0x520
[ 464.325541] ? report_bug+0x171/0x1a0
[ 464.325549] ? handle_bug+0x3c/0x70
[ 464.325554] ? exc_invalid_op+0x17/0x70
[ 464.325558] ? asm_exc_invalid_op+0x1a/0x20
[ 464.325567] ? __get_user_pages+0x423/0x520
[ 464.325575] __gup_longterm_locked+0x212/0x7a0
[ 464.325583] internal_get_user_pages_fast+0xfb/0x190
[ 464.325590] pin_user_pages_fast+0x47/0x60
[ 464.325598] sev_pin_memory+0xca/0x170 [kvm_amd]
[ 464.325616] sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
Per the analysis done by yangge, when starting the SEV virtual machine, it
will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory.
But the page is in CMA area, so fast GUP will fail then fallback to the
slow path due to the longterm pinnalbe check in try_grab_folio().
The slow path will try to pin the pages then migrate them out of CMA area.
But the slow path also uses try_grab_folio() to pin the page, it will
also fail due to the same check then the above warning is triggered.
In addition, the try_grab_folio() is supposed to be used in fast path and
it elevates folio refcount by using add ref unless zero. We are guaranteed
to have at least one stable reference in slow path, so the simple atomic add
could be used. The performance difference should be trivial, but the
misuse may be confusing and misleading.
Redefined try_grab_folio() to try_grab_folio_fast(), and try_grab_page()
to try_grab_folio(), and use them in the proper paths. This solves both
the abuse and the kernel warning.
The proper naming makes their usecase more clear and should prevent from
abusing in the future.
peterx said:
: The user will see the pin fails, for gpu-slow it further triggers the WARN
: right below that failure (as in the original report):
:
: folio = try_grab_folio(page, page_increm - 1,
: foll_flags);
: if (WARN_ON_ONCE(!folio)) { <------------------------ here
: /*
: * Release the 1st page ref if the
: * folio is problematic, fail hard.
: */
: gup_put_folio(page_folio(page), 1,
: foll_flags);
: ret = -EFAULT;
: goto out;
: }
[1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/
[shy828301@gmail.com: fix implicit declaration of function try_grab_folio_fast]
Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xjhA@mail.gmail.com
Link: https://lkml.kernel.org/r/20240628191458.2605553-1-yang@os.amperecomputing.com
Fixes: 57edfcfd3419 ("mm/gup: accelerate thp gup even for "pages != NULL"")
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
Reported-by: yangge <yangge1116@126.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"An i2c driver fix"
* tag 'i2c-for-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr
|
|
Currently building the powerpc selftests with USERCFLAGS set to anything
causes the build to break:
$ make -C tools/testing/selftests/powerpc V=1 USERCFLAGS=-Wno-error
...
gcc -Wno-error cache_shape.c ...
cache_shape.c:18:10: fatal error: utils.h: No such file or directory
18 | #include "utils.h"
| ^~~~~~~~~
compilation terminated.
This happens because the USERCFLAGS are added to CFLAGS in lib.mk, which
causes the check of CFLAGS in powerpc/flags.mk to skip setting CFLAGS at
all, resulting in none of the usual CFLAGS being passed. That can
be seen in the output above, the only flag passed to the compiler is
-Wno-error.
Fix it by dropping the conditional setting of CFLAGS in flags.mk.
Instead always set CFLAGS, but also append USERCFLAGS if they are set.
Note that appending to CFLAGS (with +=) wouldn't work, because flags.mk
is included by multiple Makefiles (to support partial builds), causing
CFLAGS to be appended to multiple times. Additionally that would place
the USERCFLAGS prior to the standard CFLAGS, meaning the USERCFLAGS
couldn't override the standard flags. Being able to override the
standard flags is desirable, for example for adding -Wno-error.
With the fix in place, the CFLAGS are set correctly, including the
USERCFLAGS:
$ make -C tools/testing/selftests/powerpc V=1 USERCFLAGS=-Wno-error
...
gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"v6.10-rc2-7-gdea17e7e56c3"'
-I/home/michael/linux/tools/testing/selftests/powerpc/include -Wno-error
cache_shape.c ...
Fixes: 5553a79387e9 ("selftests/powerpc: Add flags.mk to support pmu buildable")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240706120833.909853-1-mpe@ellerman.id.au
|
|
[syzbot reported]
BUG: KMSAN: uninit-value in sized_strscpy+0xc4/0x160
sized_strscpy+0xc4/0x160
copy_name+0x2af/0x320 fs/hfsplus/xattr.c:411
hfsplus_listxattr+0x11e9/0x1a50 fs/hfsplus/xattr.c:750
vfs_listxattr fs/xattr.c:493 [inline]
listxattr+0x1f3/0x6b0 fs/xattr.c:840
path_listxattr fs/xattr.c:864 [inline]
__do_sys_listxattr fs/xattr.c:876 [inline]
__se_sys_listxattr fs/xattr.c:873 [inline]
__x64_sys_listxattr+0x16b/0x2f0 fs/xattr.c:873
x64_sys_call+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:195
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at:
slab_post_alloc_hook mm/slub.c:3877 [inline]
slab_alloc_node mm/slub.c:3918 [inline]
kmalloc_trace+0x57b/0xbe0 mm/slub.c:4065
kmalloc include/linux/slab.h:628 [inline]
hfsplus_listxattr+0x4cc/0x1a50 fs/hfsplus/xattr.c:699
vfs_listxattr fs/xattr.c:493 [inline]
listxattr+0x1f3/0x6b0 fs/xattr.c:840
path_listxattr fs/xattr.c:864 [inline]
__do_sys_listxattr fs/xattr.c:876 [inline]
__se_sys_listxattr fs/xattr.c:873 [inline]
__x64_sys_listxattr+0x16b/0x2f0 fs/xattr.c:873
x64_sys_call+0x2ba0/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:195
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[Fix]
When allocating memory to strbuf, initialize memory to 0.
Reported-and-tested-by: syzbot+efde959319469ff8d4d7@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Link: https://lore.kernel.org/r/tencent_8BBB6433BC9E1C1B7B4BDF1BF52574BA8808@qq.com
Reported-and-tested-by: syzbot+01ade747b16e9c8030e0@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Kory Maincent says:
====================
net: pse-pd: Add new PSE c33 features
This patch series adds new c33 features to the PSE API.
- Expand the PSE PI informations status with power, class and failure
reason
- Add the possibility to get and set the PSE PIs power limit
v5: https://lore.kernel.org/r/20240628-feature_poe_power_cap-v5-0-5e1375d3817a@bootlin.com
v4: https://lore.kernel.org/r/20240625-feature_poe_power_cap-v4-0-b0813aad57d5@bootlin.com
v3: https://lore.kernel.org/r/20240614-feature_poe_power_cap-v3-0-a26784e78311@bootlin.com
v2: https://lore.kernel.org/r/20240607-feature_poe_power_cap-v2-0-c03c2deb83ab@bootlin.com
v1: https://lore.kernel.org/r/20240529-feature_poe_power_cap-v1-0-0c4b1d5953b8@bootlin.com
====================
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-0-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch expands PSE callbacks with newly introduced
pi_get/set_current_limit() and pi_get_voltage() callback.
It also add the power limit ranges description in the status returned.
The only way to set ps692x0 port power limit is by configure the power
class plus a small power supplement which maximum depends on each class.
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-7-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Expand the c33 PSE attributes with power limit to be able to set and get
the PSE Power Interface power limit.
./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get
--json '{"header":{"dev-name":"eth1"}}'
{'c33-pse-actual-pw': 1700,
'c33-pse-admin-state': 3,
'c33-pse-avail-pw-limit': 97500,
'c33-pse-pw-class': 4,
'c33-pse-pw-d-status': 4,
'c33-pse-pw-limit-ranges': [{'max': 18100, 'min': 15000},
{'max': 38000, 'min': 30000},
{'max': 65000, 'min': 60000},
{'max': 97500, 'min': 90000}],
'header': {'dev-index': 5, 'dev-name': 'eth1'}}
./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-set
--json '{"header":{"dev-name":"eth1"},
"c33-pse-avail-pw-limit":19000}'
None
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-6-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch expands the status information provided by ethtool for PSE c33
with available power limit and available power limit ranges. It also adds
a call to pse_ethtool_set_pw_limit() to configure the PSE control power
limit.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-5-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch add a way to get and set the power limit of a PSE PI.
For that it uses regulator API callbacks wrapper like get_voltage() and
get/set_current_limit() as power is simply V * I.
We used mW unit as defined by the IEEE 802.3-2022 standards.
set_current_limit() uses the voltage return by get_voltage() and the
desired power limit to calculate the current limit. get_voltage() callback
is then mandatory to set the power limit.
get_current_limit() callback is by default looking at a driver callback
and fallback to extracting the current limit from _pse_ethtool_get_status()
if the driver does not set its callback. We prefer let the user the choice
because ethtool_get_status return much more information than the current
limit.
expand pse status with c33_pw_limit_ranges to return the ranges available
to configure the power limit.
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-4-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This update expands pd692x0_ethtool_get_status() callback with newly
introduced details such as the detected class, current power delivered,
and extended state information.
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-3-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Expand the c33 PSE attributes with PSE class, extended state information
and power consumption.
./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get
--json '{"header":{"dev-name":"eth0"}}'
{'c33-pse-actual-pw': 1700,
'c33-pse-admin-state': 3,
'c33-pse-pw-class': 4,
'c33-pse-pw-d-status': 4,
'header': {'dev-index': 4, 'dev-name': 'eth0'}}
./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get
--json '{"header":{"dev-name":"eth0"}}'
{'c33-pse-admin-state': 3,
'c33-pse-ext-state': 'mr-mps-valid',
'c33-pse-ext-substate': 2,
'c33-pse-pw-d-status': 2,
'header': {'dev-index': 4, 'dev-name': 'eth0'}}
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-2-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This update expands the status information provided by ethtool for PSE c33.
It includes details such as the detected class, current power delivered,
and extended state information.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240704-feature_poe_power_cap-v6-1-320003204264@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Loss recovery undo_retrans bookkeeping had a long-standing bug where a
DSACK from a spurious TLP retransmit packet could cause an erroneous
undo of a fast recovery or RTO recovery that repaired a single
really-lost packet (in a sequence range outside that of the TLP
retransmit). Basically, because the loss recovery state machine didn't
account for the fact that it sent a TLP retransmit, the DSACK for the
TLP retransmit could erroneously be implicitly be interpreted as
corresponding to the normal fast recovery or RTO recovery retransmit
that plugged a real hole, thus resulting in an improper undo.
For example, consider the following buggy scenario where there is a
real packet loss but the congestion control response is improperly
undone because of this bug:
+ send packets P1, P2, P3, P4
+ P1 is really lost
+ send TLP retransmit of P4
+ receive SACK for original P2, P3, P4
+ enter fast recovery, fast-retransmit P1, increment undo_retrans to 1
+ receive DSACK for TLP P4, decrement undo_retrans to 0, undo (bug!)
+ receive cumulative ACK for P1-P4 (fast retransmit plugged real hole)
The fix: when we initialize undo machinery in tcp_init_undo(), if
there is a TLP retransmit in flight, then increment tp->undo_retrans
so that we make sure that we receive a DSACK corresponding to the TLP
retransmit, as well as DSACKs for all later normal retransmits, before
triggering a loss recovery undo. Note that we also have to move the
line that clears tp->tlp_high_seq for RTO recovery, so that upon RTO
we remember the tp->tlp_high_seq value until tcp_init_undo() and clear
it only afterward.
Also note that the bug dates back to the original 2013 TLP
implementation, commit 6ba8a3b19e76 ("tcp: Tail loss probe (TLP)").
However, this patch will only compile and work correctly with kernels
that have tp->tlp_retrans, which was added only in v5.8 in 2020 in
commit 76be93fc0702 ("tcp: allow at most one TLP probe per flight").
So we associate this fix with that later commit.
Fixes: 76be93fc0702 ("tcp: allow at most one TLP probe per flight")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Kevin Yang <yyd@google.com>
Link: https://patch.msgid.link/20240703171246.1739561-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adrian Moreno says:
====================
net: openvswitch: Add sample multicasting.
** Background **
Currently, OVS supports several packet sampling mechanisms (sFlow,
per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
userspace action that needs to be handled by ovs-vswitchd's handler
threads only to be forwarded to some third party application that
will somehow process the sample and provide observability on the
datapath.
A particularly interesting use-case is controller-driven
per-flow IPFIX sampling where the OpenFlow controller can add metadata
to samples (via two 32bit integers) and this metadata is then available
to the sample-collecting system for correlation.
** Problem **
The fact that sampled traffic share netlink sockets and handler thread
time with upcalls, apart from being a performance bottleneck in the
sample extraction itself, can severely compromise the datapath,
yielding this solution unfit for highly loaded production systems.
Users are left with little options other than guessing what sampling
rate will be OK for their traffic pattern and system load and dealing
with the lost accuracy.
Looking at available infrastructure, an obvious candidated would be
to use psample. However, it's current state does not help with the
use-case at stake because sampled packets do not contain user-defined
metadata.
** Proposal **
This series is an attempt to fix this situation by extending the
existing psample infrastructure to carry a variable length
user-defined cookie.
The main existing user of psample is tc's act_sample. It is also
extended to forward the action's cookie to psample.
Finally, a new OVS action (OVS_SAMPLE_ATTR_PSAMPLE) is created.
It accepts a group and an optional cookie and uses psample to
multicast the packet and the metadata.
====================
Link: https://patch.msgid.link/20240704085710.353845-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a test to verify sampling packets via psample works.
In order to do that, create a subcommand in ovs-dpctl.py to listen to
on the psample multicast group and print samples.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-11-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The trunc action was supported decode-able but not parse-able. Add
support for parsing the action string.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-10-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The userspace action lacks parsing support plus it contains a bug in the
name of one of its attributes.
This patch makes userspace action work.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-9-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add sample and psample action support to ovs-dpctl.py.
Refactor common attribute parsing logic into an external function.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-8-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a packet sample is observed, the sampling rate that was used is
important to estimate the real frequency of such event.
Store the probability of the parent sample action in the skb's cb area
and use it in psample action to pass it down to psample module.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-7-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for a new action: psample.
This action accepts a u32 group id and a variable-length cookie and uses
the psample multicast group to make the packet available for
observability.
The maximum length of the user-defined cookie is set to 16, same as
tc_cookie, to discourage using cookies that will not be offloadable.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-6-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Although not explicitly documented in the psample module itself, the
definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.
Quoting tc-sample(8):
"RATE of 100 will lead to an average of one sampled packet out of every
100 observed."
With this semantics, the rates that we can express with an unsigned
32-bits number are very unevenly distributed and concentrated towards
"sampling few packets".
For example, we can express a probability of 2.32E-8% but we
cannot express anything between 100% and 50%.
For sampling applications that are capable of sampling a decent
amount of packets, this sampling rate semantics is not very useful.
Add a new flag to the uAPI that indicates that the sampling rate is
expressed in scaled probability, this is:
- 0 is 0% probability, no packets get sampled.
- U32_MAX is 100% probability, all packets get sampled.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-5-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If nobody is listening on the multicast group, generating the sample,
which involves copying packet data, seems completely unnecessary.
Return fast in this case.
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-4-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the action has a user_cookie, pass it along to the sample so it can
be easily identified.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-3-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a user cookie to the sample metadata so that sample emitters can
provide more contextual information to samples.
If present, send the user cookie in a new attribute:
PSAMPLE_ATTR_USER_COOKIE.
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240704085710.353845-2-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement operations to get and set flow-control link parameters.
Both is done by simply calling phylink_ethtool_{get,set}_pauseparam().
Fix whitespace in mtk_ethtool_ops while at it.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Link: https://patch.msgid.link/e3ece47323444631d6cb479f32af0dfd6d145be0.1720088047.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|