Age | Commit message (Collapse) | Author |
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules fixes from Petr Pavlu:
"A single series to properly handle the module_kobject creation.
This fixes a problem with missing /sys/module/<module>/drivers for
built-in modules"
* tag 'modules-6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
drivers: base: handle module_kobject creation
kernel: globalize lookup_or_create_module_kobject()
kernel: refactor lookup_or_create_module_kobject()
kernel: param: rename locate_module_kobject
|
|
function
Function CIFSSMBSetPathInfo() is not supported by non-NT servers and
returns error. Fallback code via open filehandle and CIFSSMBSetFileInfo()
does not work neither because CIFS_open() works also only on NT server.
Therefore currently the whole smb_set_file_info() function as a SMB1
callback for the ->set_file_info() does not work with older non-NT SMB
servers, like Win9x and others.
This change implements fallback code in smb_set_file_info() which will
works with any server and allows to change time values and also to set or
clear read-only attributes.
To make existing fallback code via CIFSSMBSetFileInfo() working with also
non-NT servers, it is needed to change open function from CIFS_open()
(which is NT specific) to cifs_open_file() which works with any server
(this is just a open wrapper function which choose the correct open
function supported by the server).
CIFSSMBSetFileInfo() is working also on non-NT servers, but zero time
values are not treated specially. So first it is needed to fill all time
values if some of them are missing, via cifs_query_path_info() call.
There is another issue, opening file in write-mode (needed for changing
attributes) is not possible when the file has read-only attribute set.
The only option how to clear read-only attribute is via SMB_COM_SETATTR
command. And opening directory is not possible neither and here the
SMB_COM_SETATTR command is the only option how to change attributes.
And CIFSSMBSetFileInfo() does not honor setting read-only attribute, so
for setting is also needed to use SMB_COM_SETATTR command.
Existing code in cifs_query_path_info() is already using SMB_COM_GETATTR as
a fallback code path (function SMBQueryInformation()), so introduce a new
function SMBSetInformation which will implement SMB_COM_SETATTR command.
My testing showed that Windows XP SMB1 client is also using SMB_COM_SETATTR
command for setting or clearing read-only attribute against non-NT server.
So this can prove that this is the correct way how to do it.
With this change it is possible set all 4 time values and all attributes,
including clearing and setting read-only bit on non-NT SMB servers.
Tested against Win98 SMB1 server.
This change fixes "touch" command which was failing when called on existing
file. And fixes also "chmod +w" and "chmod -w" commands which were also
failing (as they are changing read-only attribute).
Note that this change depends on following change
"cifs: Improve cifs_query_path_info() and cifs_query_file_info()"
as it require to query all 4 time attribute values.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When CAP_NT_SMBS was not negotiated then do not issue CIFSSMBQPathInfo()
and CIFSSMBQFileInfo() commands. CIFSSMBQPathInfo() is not supported by
non-NT Win9x SMB server and CIFSSMBQFileInfo() returns from Win9x SMB
server bogus data in Attributes field (for example lot of files are marked
as reparse points, even Win9x does not support them and read-only bit is
not marked for read-only files). Correct information is returned by
CIFSFindFirst() or SMBQueryInformation() command.
So as a fallback in cifs_query_path_info() function use CIFSFindFirst()
with SMB_FIND_FILE_FULL_DIRECTORY_INFO level which is supported by both NT
and non-NT servers and as a last option use SMBQueryInformation() as it was
before.
And in function cifs_query_file_info() immediately returns -EOPNOTSUPP when
not communicating with NT server. Client then revalidate inode entry by the
cifs_query_path_info() call, which is working fine. So fstat() syscall on
already opened file will receive correct information.
Note that both fallback functions in non-UNICODE mode expands wildcards.
Therefore those fallback functions cannot be used on paths which contain
SMB wildcard characters (* ? " > <).
CIFSFindFirst() returns all 4 time attributes as opposite of
SMBQueryInformation() which returns only one.
With this change it is possible to query all 4 times attributes from Win9x
server and at the same time, client minimize sending of unsupported
commands to server.
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
SMB create requests issued via smb311_posix_mkdir() have an incorrect
length of zero bytes for the POSIX create context data. ksmbd server
rejects such requests and logs "cli req too short" causing mkdir to fail
with "invalid argument" on the client side. It also causes subsequent
rmmod to crash in cifs_destroy_request_bufs()
Inspection of packets sent by cifs.ko using wireshark show valid data for
the SMB2_POSIX_CREATE_CONTEXT is appended with the correct offset, but
with an incorrect length of zero bytes. Fails with ksmbd+cifs.ko only as
Windows server/client does not use POSIX extensions.
Fix smb311_posix_mkdir() to set req->CreateContextsLength as part of
appending the POSIX creation context to the request.
Signed-off-by: Jethro Donaldson <devel@jro.nz>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When turbo mode is unavailable on a Skylake-X system, executing the
command:
# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
results in an unchecked MSR access error:
WRMSR to 0x199 (attempted to write 0x0000000100001300).
This issue was reproduced on an OEM (Original Equipment Manufacturer)
system and is not a common problem across all Skylake-X systems.
This error occurs because the MSR 0x199 Turbo Engage Bit (bit 32) is set
when turbo mode is disabled. The issue arises when intel_pstate fails to
detect that turbo mode is disabled. Here intel_pstate relies on
MSR_IA32_MISC_ENABLE bit 38 to determine the status of turbo mode.
However, on this system, bit 38 is not set even when turbo mode is
disabled.
According to the Intel Software Developer's Manual (SDM), the BIOS sets
this bit during platform initialization to enable or disable
opportunistic processor performance operations. Logically, this bit
should be set in such cases. However, the SDM also specifies that "OS
and applications must use CPUID leaf 06H to detect processors with
opportunistic processor performance operations enabled."
Therefore, in addition to checking MSR_IA32_MISC_ENABLE bit 38, verify
that CPUID.06H:EAX[1] is 0 to accurately determine if turbo mode is
disabled.
Fixes: 4521e1a0ce17 ("cpufreq: intel_pstate: Reflect current no_turbo state correctly")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Before commit bca84a7b93fd ("PM: sleep: Use DPM_FLAG_SMART_SUSPEND
conditionally") the runtime PM status of the device in intel_resume()
had always been RPM_ACTIVE because setting DPM_FLAG_SMART_SUSPEND had
caused the core to call pm_runtime_set_active() for that device during
the "noirq" resume phase. For this reason, the pm_runtime_suspended()
check in intel_resume() had never triggered and the code depending on
it had never run. That had not caused any observable functional issues
to appear, so effectively the code in question had never been needed.
After commit bca84a7b93fd the core does not call pm_runtime_set_active()
for all devices with DPM_FLAG_SMART_SUSPEND set any more and the code
depending on the pm_runtime_suspended() check in intel_resume() runs if
the device is runtime-suspended prior to a system-wide suspend
transition. Unfortunately, when it runs, it breaks things due to the
attempt to runtime-resume bus->dev which most likely is not ready for a
runtime resume at that point.
It also does other more-or-less questionable things. Namely, it
calls pm_runtime_idle() for a device with a nonzero runtime PM usage
counter which has no effect (all devices have nonzero runtime PM
usage counters during system-wide suspend and resume). It also calls
pm_runtime_mark_last_busy() for the device even though devices cannot
runtime-suspend during system-wide suspend and resume (because their
runtime PM usage counters are nonzero) and an analogous call is made
in the same function later. Moreover, it sets the runtime PM status
of the device to RPM_ACTIVE before activating it.
For the reasons listed above, remove that code altogether.
On top of that, add a pm_runtime_disable() call to intel_suspend() to
prevent the device from being runtime-resumed at any point after
intel_suspend() has started to manipulate it because the changes
made by that function would be undone by a runtime-suspend of the
device.
Next, once runtime PM has been disabled, the runtime PM status of the
device cannot change, so pm_runtime_status_suspended() can be used
instead of pm_runtime_suspended() in intel_suspend().
Finally, make intel_resume() call pm_runtime_set_active() at the end to
set the runtime PM status of the device to "active" because it has just
been activated and re-enable runtime PM for it after that.
Additionally, drop the setting of DPM_FLAG_SMART_SUSPEND from the
driver because it has no effect on devices handled by it.
Fixes: bca84a7b93fd ("PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally")
Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Tested-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://patch.msgid.link/12680420.O9o76ZdvQC@rjwysocki.net
|
|
syzbot complains about the cached sq head read, and it's totally right.
But we don't need to care, it's just reading fdinfo, and reading the
CQ or SQ tail/head entries are known racy in that they are just a view
into that very instant and may of course be outdated by the time they
are reported.
Annotate both the SQ head and CQ tail read with data_race() to avoid
this syzbot complaint.
Link: https://lore.kernel.org/io-uring/6811f6dc.050a0220.39e3a1.0d0e.GAE@google.com/
Reported-by: syzbot+3e77fd302e99f5af9394@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
After calling nvme_auth_derive_tls_psk() we need to free the resulting
psk data, as either TLS is disable (and we don't need the data anyway)
or the psk data is copied into the resulting key (and can be free, too).
Fixes: fa2e0f8bbc68 ("nvmet-tcp: support secure channel concatenation")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Suggested-by: Maurizio Lombardi <mlombard@bsdbackstore.eu>
Signed-off-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
queue->state_change is set as part of nvmet_tcp_set_queue_sock(), but if
the TCP connection isn't established when nvmet_tcp_set_queue_sock() is
called then queue->state_change isn't set and sock->sk->sk_state_change
isn't replaced.
As such we don't need to restore sock->sk->sk_state_change if
queue->state_change is NULL.
This avoids NULL pointer dereferences such as this:
[ 286.462026][ C0] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 286.462814][ C0] #PF: supervisor instruction fetch in kernel mode
[ 286.463796][ C0] #PF: error_code(0x0010) - not-present page
[ 286.464392][ C0] PGD 8000000140620067 P4D 8000000140620067 PUD 114201067 PMD 0
[ 286.465086][ C0] Oops: Oops: 0010 [#1] SMP KASAN PTI
[ 286.465559][ C0] CPU: 0 UID: 0 PID: 1628 Comm: nvme Not tainted 6.15.0-rc2+ #11 PREEMPT(voluntary)
[ 286.466393][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 286.467147][ C0] RIP: 0010:0x0
[ 286.467420][ C0] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 286.467977][ C0] RSP: 0018:ffff8883ae008580 EFLAGS: 00010246
[ 286.468425][ C0] RAX: 0000000000000000 RBX: ffff88813fd34100 RCX: ffffffffa386cc43
[ 286.469019][ C0] RDX: 1ffff11027fa68b6 RSI: 0000000000000008 RDI: ffff88813fd34100
[ 286.469545][ C0] RBP: ffff88813fd34160 R08: 0000000000000000 R09: ffffed1027fa682c
[ 286.470072][ C0] R10: ffff88813fd34167 R11: 0000000000000000 R12: ffff88813fd344c3
[ 286.470585][ C0] R13: ffff88813fd34112 R14: ffff88813fd34aec R15: ffff888132cdd268
[ 286.471070][ C0] FS: 00007fe3c04c7d80(0000) GS:ffff88840743f000(0000) knlGS:0000000000000000
[ 286.471644][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 286.472543][ C0] CR2: ffffffffffffffd6 CR3: 000000012daca000 CR4: 00000000000006f0
[ 286.473500][ C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 286.474467][ C0] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 286.475453][ C0] Call Trace:
[ 286.476102][ C0] <IRQ>
[ 286.476719][ C0] tcp_fin+0x2bb/0x440
[ 286.477429][ C0] tcp_data_queue+0x190f/0x4e60
[ 286.478174][ C0] ? __build_skb_around+0x234/0x330
[ 286.478940][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.479659][ C0] ? __pfx_tcp_data_queue+0x10/0x10
[ 286.480431][ C0] ? tcp_try_undo_loss+0x640/0x6c0
[ 286.481196][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.482046][ C0] ? kvm_clock_get_cycles+0x14/0x30
[ 286.482769][ C0] ? ktime_get+0x66/0x150
[ 286.483433][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.484146][ C0] tcp_rcv_established+0x6e4/0x2050
[ 286.484857][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.485523][ C0] ? ipv4_dst_check+0x160/0x2b0
[ 286.486203][ C0] ? __pfx_tcp_rcv_established+0x10/0x10
[ 286.486917][ C0] ? lock_release+0x217/0x2c0
[ 286.487595][ C0] tcp_v4_do_rcv+0x4d6/0x9b0
[ 286.488279][ C0] tcp_v4_rcv+0x2af8/0x3e30
[ 286.488904][ C0] ? raw_local_deliver+0x51b/0xad0
[ 286.489551][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.490198][ C0] ? __pfx_tcp_v4_rcv+0x10/0x10
[ 286.490813][ C0] ? __pfx_raw_local_deliver+0x10/0x10
[ 286.491487][ C0] ? __pfx_nf_confirm+0x10/0x10 [nf_conntrack]
[ 286.492275][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.492900][ C0] ip_protocol_deliver_rcu+0x8f/0x370
[ 286.493579][ C0] ip_local_deliver_finish+0x297/0x420
[ 286.494268][ C0] ip_local_deliver+0x168/0x430
[ 286.494867][ C0] ? __pfx_ip_local_deliver+0x10/0x10
[ 286.495498][ C0] ? __pfx_ip_local_deliver_finish+0x10/0x10
[ 286.496204][ C0] ? ip_rcv_finish_core+0x19a/0x1f20
[ 286.496806][ C0] ? lock_release+0x217/0x2c0
[ 286.497414][ C0] ip_rcv+0x455/0x6e0
[ 286.497945][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.498550][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.499137][ C0] ? __pfx_ip_rcv_finish+0x10/0x10
[ 286.499763][ C0] ? lock_release+0x217/0x2c0
[ 286.500327][ C0] ? dl_scaled_delta_exec+0xd1/0x2c0
[ 286.500922][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.501480][ C0] __netif_receive_skb_one_core+0x166/0x1b0
[ 286.502173][ C0] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 286.502903][ C0] ? lock_acquire+0x2b2/0x310
[ 286.503487][ C0] ? process_backlog+0x372/0x1350
[ 286.504087][ C0] ? lock_release+0x217/0x2c0
[ 286.504642][ C0] process_backlog+0x3b9/0x1350
[ 286.505214][ C0] ? process_backlog+0x372/0x1350
[ 286.505779][ C0] __napi_poll.constprop.0+0xa6/0x490
[ 286.506363][ C0] net_rx_action+0x92e/0xe10
[ 286.506889][ C0] ? __pfx_net_rx_action+0x10/0x10
[ 286.507437][ C0] ? timerqueue_add+0x1f0/0x320
[ 286.507977][ C0] ? sched_clock_cpu+0x68/0x540
[ 286.508492][ C0] ? lock_acquire+0x2b2/0x310
[ 286.509043][ C0] ? kvm_sched_clock_read+0xd/0x20
[ 286.509607][ C0] ? handle_softirqs+0x1aa/0x7d0
[ 286.510187][ C0] handle_softirqs+0x1f2/0x7d0
[ 286.510754][ C0] ? __pfx_handle_softirqs+0x10/0x10
[ 286.511348][ C0] ? irqtime_account_irq+0x181/0x290
[ 286.511937][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.512510][ C0] do_softirq.part.0+0x89/0xc0
[ 286.513100][ C0] </IRQ>
[ 286.513548][ C0] <TASK>
[ 286.513953][ C0] __local_bh_enable_ip+0x112/0x140
[ 286.514522][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.515072][ C0] __dev_queue_xmit+0x872/0x3450
[ 286.515619][ C0] ? nft_do_chain+0xe16/0x15b0 [nf_tables]
[ 286.516252][ C0] ? __pfx___dev_queue_xmit+0x10/0x10
[ 286.516817][ C0] ? selinux_ip_postroute+0x43c/0xc50
[ 286.517433][ C0] ? __pfx_selinux_ip_postroute+0x10/0x10
[ 286.518061][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.518606][ C0] ? ip_output+0x164/0x4a0
[ 286.519149][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.519671][ C0] ? ip_finish_output2+0x17d5/0x1fb0
[ 286.520258][ C0] ip_finish_output2+0xb4b/0x1fb0
[ 286.520787][ C0] ? __pfx_ip_finish_output2+0x10/0x10
[ 286.521355][ C0] ? __ip_finish_output+0x15d/0x750
[ 286.521890][ C0] ip_output+0x164/0x4a0
[ 286.522372][ C0] ? __pfx_ip_output+0x10/0x10
[ 286.522872][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.523402][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.524031][ C0] ? __pfx_ip_finish_output+0x10/0x10
[ 286.524605][ C0] ? __ip_queue_xmit+0x999/0x2260
[ 286.525200][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.525744][ C0] ? ipv4_dst_check+0x16a/0x2b0
[ 286.526279][ C0] ? lock_release+0x217/0x2c0
[ 286.526793][ C0] __ip_queue_xmit+0x1883/0x2260
[ 286.527324][ C0] ? __skb_clone+0x54c/0x730
[ 286.527827][ C0] __tcp_transmit_skb+0x209b/0x37a0
[ 286.528374][ C0] ? __pfx___tcp_transmit_skb+0x10/0x10
[ 286.528952][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.529472][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.530152][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.530691][ C0] tcp_write_xmit+0xb81/0x88b0
[ 286.531224][ C0] ? mod_memcg_state+0x4d/0x60
[ 286.531736][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.532253][ C0] __tcp_push_pending_frames+0x90/0x320
[ 286.532826][ C0] tcp_send_fin+0x141/0xb50
[ 286.533352][ C0] ? __pfx_tcp_send_fin+0x10/0x10
[ 286.533908][ C0] ? __local_bh_enable_ip+0xab/0x140
[ 286.534495][ C0] inet_shutdown+0x243/0x320
[ 286.535077][ C0] nvme_tcp_alloc_queue+0xb3b/0x2590 [nvme_tcp]
[ 286.535709][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.536314][ C0] ? __pfx_nvme_tcp_alloc_queue+0x10/0x10 [nvme_tcp]
[ 286.536996][ C0] ? do_raw_spin_unlock+0x54/0x1e0
[ 286.537550][ C0] ? _raw_spin_unlock+0x29/0x50
[ 286.538127][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.538664][ C0] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 286.539249][ C0] ? nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.539892][ C0] ? __wake_up+0x40/0x60
[ 286.540392][ C0] nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.541047][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.541589][ C0] nvme_tcp_setup_ctrl+0x8b/0x7a0 [nvme_tcp]
[ 286.542254][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.542887][ C0] ? __pfx_nvme_tcp_setup_ctrl+0x10/0x10 [nvme_tcp]
[ 286.543568][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.544166][ C0] ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 286.544792][ C0] ? nvme_change_ctrl_state+0x196/0x2e0 [nvme_core]
[ 286.545477][ C0] nvme_tcp_create_ctrl+0x839/0xb90 [nvme_tcp]
[ 286.546126][ C0] nvmf_dev_write+0x3db/0x7e0 [nvme_fabrics]
[ 286.546775][ C0] ? rw_verify_area+0x69/0x520
[ 286.547334][ C0] vfs_write+0x218/0xe90
[ 286.547854][ C0] ? do_syscall_64+0x9f/0x190
[ 286.548408][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.549037][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.549659][ C0] ? __pfx_vfs_write+0x10/0x10
[ 286.550259][ C0] ? do_syscall_64+0x9f/0x190
[ 286.550840][ C0] ? syscall_exit_to_user_mode+0x8e/0x280
[ 286.551516][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.552180][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.552834][ C0] ? ksys_read+0xf5/0x1c0
[ 286.553386][ C0] ? __pfx_ksys_read+0x10/0x10
[ 286.553964][ C0] ksys_write+0xf5/0x1c0
[ 286.554499][ C0] ? __pfx_ksys_write+0x10/0x10
[ 286.555072][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.555698][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.556319][ C0] ? do_syscall_64+0x54/0x190
[ 286.556866][ C0] do_syscall_64+0x93/0x190
[ 286.557420][ C0] ? rcu_read_unlock+0x17/0x60
[ 286.557986][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.558526][ C0] ? lock_release+0x217/0x2c0
[ 286.559087][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.559659][ C0] ? count_memcg_events.constprop.0+0x4a/0x60
[ 286.560476][ C0] ? exc_page_fault+0x7a/0x110
[ 286.561064][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.561647][ C0] ? lock_release+0x217/0x2c0
[ 286.562257][ C0] ? do_user_addr_fault+0x171/0xa00
[ 286.562839][ C0] ? do_user_addr_fault+0x4a2/0xa00
[ 286.563453][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.564112][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.564677][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.565317][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.565922][ C0] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 286.566542][ C0] RIP: 0033:0x7fe3c05e6504
[ 286.567102][ C0] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 8b 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[ 286.568931][ C0] RSP: 002b:00007fff76444f58 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 286.569807][ C0] RAX: ffffffffffffffda RBX: 000000003b40d930 RCX: 00007fe3c05e6504
[ 286.570621][ C0] RDX: 00000000000000cf RSI: 000000003b40d930 RDI: 0000000000000003
[ 286.571443][ C0] RBP: 0000000000000003 R08: 00000000000000cf R09: 000000003b40d930
[ 286.572246][ C0] R10: 0000000000000000 R11: 0000000000000202 R12: 000000003b40cd60
[ 286.573069][ C0] R13: 00000000000000cf R14: 00007fe3c07417f8 R15: 00007fe3c073502e
[ 286.573886][ C0] </TASK>
Closes: https://lore.kernel.org/linux-nvme/5hdonndzoqa265oq3bj6iarwtfk5dewxxjtbjvn5uqnwclpwt6@a2n6w3taxxex/
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Ensure that TLS support is enabled in the kernel when
CONFIG_NVME_TARGET_TCP_TLS is enabled. Without this the code compiles,
but does not actually work unless something else enables CONFIG_TLS.
Fixes: 675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Ensure that TLS support is enabled in the kernel when
CONFIG_NVME_TCP_TLS is enabled. Without this the code compiles, but does
not actually work unless something else enables CONFIG_TLS.
Fixes: be8e82caa68 ("nvme-tcp: enable TLS handshake upcall")
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
This patch addresses a data corruption issue observed in nvme-tcp during
testing.
In an NVMe native multipath setup, when an I/O timeout occurs, all
inflight I/Os are canceled almost immediately after the kernel socket is
shut down. These canceled I/Os are reported as host path errors,
triggering a failover that succeeds on a different path.
However, at this point, the original I/O may still be outstanding in the
host's network transmission path (e.g., the NIC’s TX queue). From the
user-space app's perspective, the buffer associated with the I/O is
considered completed since they're acked on the different path and may
be reused for new I/O requests.
Because nvme-tcp enables zero-copy by default in the transmission path,
this can lead to corrupted data being sent to the original target,
ultimately causing data corruption.
We can reproduce this data corruption by injecting delay on one path and
triggering i/o timeout.
To prevent this issue, this change ensures that all inflight
transmissions are fully completed from host's perspective before
returning from queue stop. To handle concurrent I/O timeout from multiple
namespaces under the same controller, always wait in queue stop
regardless of queue's state.
This aligns with the behavior of queue stopping in other NVMe fabric
transports.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Michael Liang <mliang@purestorage.com>
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Michael Chan says:
====================
bnxt_en: Misc. bug fixes
This series fixes a bug in the driver initialization path, MSIX
setup sequencing issue in the FW error and AER paths, a missing
skb_mark_for_recycle() in the VLAN error path, some ethtool coredump
fixes, an ethtool selftest fix, and an ethtool register dump byte order
fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For version 1 register dump that includes the PCIe stats, the existing
code incorrectly assumes that all PCIe stats are 64-bit values. Fix it
by using an array containing the starting and ending index of the 32-bit
values. The loop in bnxt_get_regs() will use the array to do proper
endian swap for the 32-bit values.
Fixes: b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'")
Reviewed-by: Shruti Parab <shruti.parab@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When retrieving the FW coredump using ethtool, it can sometimes cause
memory corruption:
BUG: KFENCE: memory corruption in __bnxt_get_coredump+0x3ef/0x670 [bnxt_en]
Corrupted memory at 0x000000008f0f30e8 [ ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ] (in kfence-#45):
__bnxt_get_coredump+0x3ef/0x670 [bnxt_en]
ethtool_get_dump_data+0xdc/0x1a0
__dev_ethtool+0xa1e/0x1af0
dev_ethtool+0xa8/0x170
dev_ioctl+0x1b5/0x580
sock_do_ioctl+0xab/0xf0
sock_ioctl+0x1ce/0x2e0
__x64_sys_ioctl+0x87/0xc0
do_syscall_64+0x5c/0xf0
entry_SYSCALL_64_after_hwframe+0x78/0x80
...
This happens when copying the coredump segment list in
bnxt_hwrm_dbg_dma_data() with the HWRM_DBG_COREDUMP_LIST FW command.
The info->dest_buf buffer is allocated based on the number of coredump
segments returned by the FW. The segment list is then DMA'ed by
the FW and the length of the DMA is returned by FW. The driver then
copies this DMA'ed segment list to info->dest_buf.
In some cases, this DMA length may exceed the info->dest_buf length
and cause the above BUG condition. Fix it by capping the copy
length to not exceed the length of info->dest_buf. The extra
DMA data contains no useful information.
This code path is shared for the HWRM_DBG_COREDUMP_LIST and the
HWRM_DBG_COREDUMP_RETRIEVE FW commands. The buffering is different
for these 2 FW commands. To simplify the logic, we need to move
the line to adjust the buffer length for HWRM_DBG_COREDUMP_RETRIEVE
up, so that the new check to cap the copy length will work for both
commands.
Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When handling HWRM_DBG_COREDUMP_LIST FW command in
bnxt_hwrm_dbg_dma_data(), the allocated buffer info->dest_buf is
not freed in the error path. In the normal path, info->dest_buf
is assigned to coredump->data and it will eventually be freed after
the coredump is collected.
Free info->dest_buf immediately inside bnxt_hwrm_dbg_dma_data() in
the error path.
Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length")
Reported-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is similar to the last patch to delay the
pci_alloc_irq_vectors() call in the AER path until after calling
bnxt_reserve_rings(). bnxt_reserve_rings() needs to properly map
the MSIX table first before we call pci_alloc_irq_vectors() which
may immediately write to the MSIX table in some architectures.
Move the bnxt_init_int_mode() call from bnxt_io_slot_reset() to
bnxt_io_resume() after calling bnxt_reserve_rings().
With this change, the AER path may call bnxt_open() ->
bnxt_hwrm_if_change() with bp->irq_tbl set to NULL. bp->irq_tbl is
cleared when we call bnxt_clear_int_mode() in bnxt_io_slot_reset().
So we cannot use !bp->irq_tbl to detect aborted FW reset. Add a
new BNXT_FW_RESET_STATE_ABORT to detect aborted FW reset in
bnxt_hwrm_if_change().
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On some architectures (e.g. ARM), calling pci_alloc_irq_vectors()
will immediately cause the MSIX table to be written. This will not
work if we haven't called bnxt_reserve_rings() to properly map
the MSIX table to the MSIX vectors reserved by FW.
Fix the FW error recovery path to delay the bnxt_init_int_mode() ->
pci_alloc_irq_vectors() call by removing it from bnxt_hwrm_if_change().
bnxt_request_irq() later in the code path will call it and by then the
MSIX table is properly mapped.
Fixes: 4343838ca5eb ("bnxt_en: Replace deprecated PCI MSIX APIs")
Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If bnxt_rx_vlan() fails because the VLAN protocol ID is invalid,
the SKB is freed but we're missing the call to recycle it. This
may cause the warning:
"page_pool_release_retry() stalled pool shutdown"
Add the missing skb_mark_for_recycle() in bnxt_rx_vlan().
Fixes: 86b05508f775 ("bnxt_en: Use the unified RX page pool buffers for XDP and non-XDP")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When RDMA driver is loaded, running offline self test is not
supported and driver returns failure early. But it is not clearing
the input buffer and hence the application prints some junk
characters for individual test results.
Fix it by clearing the buffer before returning.
Fixes: 895621f1c816 ("bnxt_en: Don't support offline self test when RoCE driver is loaded")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
WARN_ON() is triggered in __flush_work() if bnxt_init_chip() fails
because we call cancel_work_sync() on dim work that has not been
initialized.
WARNING: CPU: 37 PID: 5223 at kernel/workqueue.c:4201 __flush_work.isra.0+0x212/0x230
The driver relies on the BNXT_STATE_NAPI_DISABLED bit to check if dim
work has already been cancelled. But in the bnxt_open() path,
BNXT_STATE_NAPI_DISABLED is not set and this causes the error
path to think that it needs to cancel the uninitalized dim work.
Fix it by setting BNXT_STATE_NAPI_DISABLED during initialization.
The bit will be cleared when we enable NAPI and initialize dim work.
Fixes: 40452969a506 ("bnxt_en: Fix DIM shutdown")
Suggested-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Shravya KN <shravya.k-n@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When generating the MSR_IA32_PEBS_ENABLE value that will be loaded on
VM-Entry to a KVM guest, mask the value with the vCPU's desired PEBS_ENABLE
value. Consulting only the host kernel's host vs. guest masks results in
running the guest with PEBS enabled even when the guest doesn't want to use
PEBS. Because KVM uses perf events to proxy the guest virtual PMU, simply
looking at exclude_host can't differentiate between events created by host
userspace, and events created by KVM on behalf of the guest.
Running the guest with PEBS unexpectedly enabled typically manifests as
crashes due to a near-infinite stream of #PFs. E.g. if the guest hasn't
written MSR_IA32_DS_AREA, the CPU will hit page faults on address '0' when
trying to record PEBS events.
The issue is most easily reproduced by running `perf kvm top` from before
commit 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") (after
which, `perf kvm top` effectively stopped using PEBS). The userspace side
of perf creates a guest-only PEBS event, which intel_guest_get_msrs()
misconstrues a guest-*owned* PEBS event.
Arguably, this is a userspace bug, as enabling PEBS on guest-only events
simply cannot work, and userspace can kill VMs in many other ways (there
is no danger to the host). However, even if this is considered to be bad
userspace behavior, there's zero downside to perf/KVM restricting PEBS to
guest-owned events.
Note, commit 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily
in two rare situations") fixed the case where host userspace is profiling
KVM *and* userspace, but missed the case where userspace is profiling only
KVM.
Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
Closes: https://lore.kernel.org/all/Z_VUswFkWiTYI0eD@do-x1carbon
Reported-by: Seth Forshee <sforshee@kernel.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: "Seth Forshee (DigitalOcean)" <sforshee@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250426001355.1026530-1-seanjc@google.com
|
|
Shannon Nelson says:
====================
pds_core: small code updates
These are a few little code touch ups for a kdoc complaint,
quicker error detection, and a cleaner initialization.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initialize the .enabled field of the FWCTL viftype default in
the declaration rather than as a bit of code as it is always
to be enabled and needs no logic around it.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Shorten the adminq poll starting interval in order to notice
any transaction errors more quickly.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the kernel-doc complaint
include/linux/pds/pds_adminq.h:481: warning: Excess struct member 'name' description in 'pds_core_lif_getattr_comp'
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A few ASUS models use the ALC256_FIXUP_ASUS_HEADSET_MODE although they
have no built-in mic pin on NID 0x13, as found in the commit
c1732ede5e80 ("ALSA: hda/realtek - Fix headset and mic on several Asus
laptops with ALC256"). This was relatively harmless in the past as
NID 0x13 was assigned as the secondary mic. But since the fix for the
pin sort order, this pin became the primary one, hence user started
noticing the broken input, and we've fixed already for a few ASUS
models to switch to ALC256_FIXUP_ASUS_MIC_NO_PRESENCE.
This patch corrects the other ASUS models to use the right quirk entry
for fixing the built-in mic regression. Here we cover X541SA
(1043:12e0), X541UV (1043:12f0), Z550SA (1043:13bf0) and X555UB
(1043:1ccd).
Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220058
Link: https://patch.msgid.link/20250430053210.31776-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Opt_err is not used in EROFS, we can remove it.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <xiang@kernel.org>
Link: https://lore.kernel.org/r/20250429075056.689570-1-lihongbo22@huawei.com
Signed-off-by: Gao Xiang <xiang@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a regression in scompress"
* tag 'v6.15-p6' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: scompress - increment scomp_scratch_users when already allocated
|
|
The precomputed value for the NAND_READ_LOCATION_2 register should be
stored in 'snandc->regs->read_location2'.
Fix the qcom_spi_set_read_loc_first() function accordingly.
Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Reviewed-by: Md Sadre Alam <quic_mdalam@quicinc.com>
Link: https://patch.msgid.link/20250428-qpic-snand-readloc2-fix-v1-1-50ce0877ff72@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Depending on the architecture __ffs() returns either an 'unsigned long'
or 'unsigned int' result. Compile-testing this driver on targets that
use the latter produces a warning:
sound/soc/intel/catpt/dsp.c: In function 'catpt_dsp_set_srampge':
sound/soc/intel/catpt/dsp.c:181:44: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Werror=format=]
181 | dev_dbg(cdev->dev, "sanitize block %ld: off 0x%08x\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change the type of the local variable to match the format string and
avoid the warning on any architecture.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20250429073545.3558494-1-arnd@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following batch contains Netfilter updates for net-next:
1) Replace msecs_to_jiffies() by secs_to_jiffies(), from Easwar Hariharan.
2) Allow to compile xt_cgroup with cgroupsv2 support only,
from Michal Koutny.
3) Prepare for sock_cgroup_classid() removal by wrapping it around
ifdef, also from Michal Koutny.
4) Remove redundant pointer fetch on conntrack template, from Xuanqiang Luo.
5) Re-format one block in the tproxy documentation for consistency,
from Chen Linxuan.
6) Expose set element count and type via netlink attributes,
from Florian Westphal.
* tag 'nf-next-25-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: export set count and backend name to userspace
docs: tproxy: fix formatting for nft code block
netfilter: conntrack: Remove redundant NFCT_ALIGN call
net: cgroup: Guard users of sock_cgroup_classid()
netfilter: xt_cgroup: Make it independent from net_cls
netfilter: xt_IDLETIMER: convert timeouts to secs_to_jiffies()
====================
Link: https://patch.msgid.link/20250428221254.3853-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove code that isn't reached. There is no need to check for
adapter->req_vec_chunks, because if it isn't set idpf_set_mb_vec_id()
won't be called.
Only one path when idpf_set_mb_vec_id() is called:
idpf_intr_req()
-> idpf_send_alloc_vectors_msg() -> adapter->req_vec_chunk is allocated
here, otherwise an error is returned and idpf_intr_req() exits with an
error.
The idpf_set_mb_vec_id() becomes one-liner and it is called only once.
Remove it and set mailbox vector index directly.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
replaces formmated with formatted
also corrects grammar by replacing a with an, and capitalises RST
Signed-off-by: Ruben Wauters <rubenru09@aol.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250428215541.6029-1-rubenru09@aol.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Assign the ptype extracted from qword to the ptype field of struct
libeth_rqe_info.
Remove the now excess ptype param of idpf_rx_singleq_extract_fields(),
idpf_rx_singleq_extract_base_fields() and
idpf_rx_singleq_extract_flex_fields().
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Provide support for the following devlink cmds:
-DEVLINK_CMD_REGION_GET
-DEVLINK_CMD_REGION_NEW
-DEVLINK_CMD_REGION_DEL
-DEVLINK_CMD_REGION_READ
ixgbe devlink region implementation, similarly to the ice one,
lets user to create snapshots of content of Non Volatile Memory,
content of Shadow RAM, and capabilities of the device.
For both NVM and SRAM regions provide .read() handler to let user
read their contents without the need to create full snapshots.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Co-developed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Legacy implementation of .set_phys_id() ethtool callback is not
applicable for E610 device.
Add new implementation which uses 0x06E9 command by calling
ixgbe_aci_set_port_id_led().
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
E610 device doesn't support disabling FC autonegotiation.
Create dedicated E610 .set_pauseparam() implementation and assign
it to ixgbe_ethtool_ops_e610.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently only APM (Advanced Power Management) is supported by
the ixgbe driver. It works for magic packets only, as for different
sources of wake-up E610 adapter utilizes different feature.
Add E610 specific implementation of ixgbe_set_wol() callback. When
any of broadcast/multicast/unicast wake-up is set, disable APM and
configure ACPI (Advanced Configuration and Power Interface).
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
E610's implementation of various ethtool ops is different than
the ones corresponding to ixgbe legacy products. Therefore create
separate E610 ethtool_ops struct which will be filled out in the
forthcoming patches.
Add adequate ops struct basing on MAC type. This step requires
changing a bit the flow of probing by placing ixgbe_set_ethtool_ops
after hw.mac.type is assigned. So move the whole netdev assignment
block after hw.mac.type is known. This step doesn't have any additional
impact on probing sequence.
Suggested-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The current MQPRIO offload implementation uses the legacy TSN Tx mode. In
this mode the hardware uses four packet buffers and considers queue
priorities.
In order to harmonize the TAPRIO implementation with MQPRIO, switch to the
regular TSN Tx mode. This mode also uses four packet buffers and considers
queue priorities. In addition to the legacy mode, transmission is always
coupled to Qbv. The driver already has mechanisms to use a dummy schedule
of 1 second with all gates open for ETF. Simply use this for MQPRIO too.
This reduces code and makes it easier to add support for frame preemption
later.
Tested on i225 with real time application using high priority queue, iperf3
using low priority queue and network TAP device.
Acked-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Limit netdev_tc calls to MQPRIO. Currently these calls are made in
igc_tsn_enable_offload() and igc_tsn_disable_offload() which are used by
TAPRIO and ETF as well. However, these are only required for MQPRIO.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When running the igc with XDP/ZC in busy polling mode with deferral of hard
interrupts, interrupts still happen from time to time. That is caused by
the igb task watchdog which triggers Rx interrupts periodically.
That mechanism has been introduced to overcome skb/memory allocation
failures [1]. So the Rx clean functions stop processing the Rx ring in case
of such failure. The task watchdog triggers Rx interrupts periodically in
the hope that memory became available in the mean time.
The current behavior is undesirable for real time applications, because the
driver induced Rx interrupts trigger also the softirq processing. However,
all real time packets should be processed by the application which uses the
busy polling method.
Therefore, only trigger the Rx interrupts in case of real allocation
failures. Introduce a new flag for signaling that condition.
Follow the same logic as in commit 8dcf2c212078 ("igc: Get rid of spurious
interrupts").
[1] - https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=3be507547e6177e5c808544bd6a2efa2c7f1d436
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sweta Kumari <sweta.kumari@intel.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Use netif_napi_add_config() to assign persistent per-NAPI config.
This is useful for preserving NAPI settings when changing queue counts or
for user space programs using SO_INCOMING_NAPI_ID.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Link queues to NAPI instances via netdev-genl API. This is required to use
XDP/ZC busy polling. See commit 5ef44b3cb43b ("xsk: Bring back busy polling
support") for details.
This also allows users to query the info with netlink:
|$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
| --dump queue-get --json='{"ifindex": 2}'
|[{'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'rx'},
| {'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'rx'},
| {'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'rx'},
| {'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'rx'},
| {'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'tx'},
| {'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'tx'},
| {'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'tx'},
| {'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'tx'}]
Add rtnl locking to PCI error handlers, because netif_queue_set_napi()
requires the lock held.
While at __igb_open() use RCT coding style.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Tested-by: Sweta Kumari <sweta.kumari@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Link IRQs to NAPI instances via netdev-genl API. This allows users to query
that information via netlink:
|$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
| --dump napi-get --json='{"ifindex": 2}'
|[{'defer-hard-irqs': 0,
| 'gro-flush-timeout': 0,
| 'id': 8204,
| 'ifindex': 2,
| 'irq': 127,
| 'irq-suspend-timeout': 0},
| {'defer-hard-irqs': 0,
| 'gro-flush-timeout': 0,
| 'id': 8203,
| 'ifindex': 2,
| 'irq': 126,
| 'irq-suspend-timeout': 0},
| {'defer-hard-irqs': 0,
| 'gro-flush-timeout': 0,
| 'id': 8202,
| 'ifindex': 2,
| 'irq': 125,
| 'irq-suspend-timeout': 0},
| {'defer-hard-irqs': 0,
| 'gro-flush-timeout': 0,
| 'id': 8201,
| 'ifindex': 2,
| 'irq': 124,
| 'irq-suspend-timeout': 0}]
|$ cat /proc/interrupts | grep enp2s0
|123: 0 1 IR-PCI-MSIX-0000:02:00.0 0-edge enp2s0
|124: 0 7 IR-PCI-MSIX-0000:02:00.0 1-edge enp2s0-TxRx-0
|125: 0 0 IR-PCI-MSIX-0000:02:00.0 2-edge enp2s0-TxRx-1
|126: 0 5 IR-PCI-MSIX-0000:02:00.0 3-edge enp2s0-TxRx-2
|127: 0 0 IR-PCI-MSIX-0000:02:00.0 4-edge enp2s0-TxRx-3
Reviewed-by: Joe Damato <jdamato@fastly.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Comment was erroneously added with /**, amend this to use /* as it is
not a kernel-doc.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504262247.1UBrDBVN-lkp@intel.com/
Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250428214920.813038-1-aryan.srivastava@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If any address or port is changed, update it in all packets and recalculate
checksum.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250426153210.14044-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vladimir Oltean says:
====================
Fix Felix DSA taprio gates after clock jump
Richie Pearn presented a reproducible situation where traffic would get
blocked on the NXP LS1028A switch if a certain taprio schedule was
applied, and stepping the PTP clock would take place. The latter event
is an expected initial occurrence, but also at runtime, for example when
transitioning from one grandmaster to another.
The issue is completely described in patch 1/4, which also contains
the fix, but it has left me with some doubts regarding the need for
vsc9959_tas_clock_adjust() in general.
In order to prove to myself that vsc9959_tas_clock_adjust() is needed in
general, I have written a selftest for the tc-taprio data path in patch
4/4. On the LS1028A, we can clearly see the following failures without
that function:
INFO: Forcing a backward clock jump
TEST: ping [FAIL]
INFO: Setting up taprio after PTP
TEST: In band with gate [FAIL]
Reception of 100 packets failed
TEST: Out of band with gate [FAIL]
Reception of 100 packets failed
As for testing my fix from patch 1/4, that was quite a bit more complex
to do automatically. In fact, I couldn't find any other schedule that
would fail to be updated by vsc9959_tas_clock_adjust() as cleanly as
the schedule from Richie, so I've added that specific schedule as the
test_clock_jump_backward() test.
The test ordering is also (unfortunately) very strategic. Running the
selftest to the end dirties the GCL RAM, and when running
test_clock_jump_backward() once again, the GCL entries won't be all
zeroes as they were the first time around. They will contain bits and
pieces of old schedules, making it very challenging to make it fail.
Thus, test_clock_jump_backward() is the first in the test suite, and
without patch 1/4, it is only supposed to fail the _first_ time when
running after a clean boot.
====================
Link: https://patch.msgid.link/20250426144859.3128352-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|