Age | Commit message (Collapse) | Author |
|
We hit a bug with a recovering relocation on mount for one of our file
systems in production. I reproduced this locally by injecting errors
into snapshot delete with balance running at the same time. This
presented as an error while looking up an extent item
WARNING: CPU: 5 PID: 1501 at fs/btrfs/extent-tree.c:866 lookup_inline_extent_backref+0x647/0x680
CPU: 5 PID: 1501 Comm: btrfs-balance Not tainted 5.16.0-rc8+ #8
RIP: 0010:lookup_inline_extent_backref+0x647/0x680
RSP: 0018:ffffae0a023ab960 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000
RBP: ffff943fd2a39b60 R08: 0000000000000000 R09: 0000000000000001
R10: 0001434088152de0 R11: 0000000000000000 R12: 0000000001d05000
R13: ffff943fd2a39b60 R14: ffff943fdb96f2a0 R15: ffff9442fc923000
FS: 0000000000000000(0000) GS:ffff944e9eb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1157b1fca8 CR3: 000000010f092000 CR4: 0000000000350ee0
Call Trace:
<TASK>
insert_inline_extent_backref+0x46/0xd0
__btrfs_inc_extent_ref.isra.0+0x5f/0x200
? btrfs_merge_delayed_refs+0x164/0x190
__btrfs_run_delayed_refs+0x561/0xfa0
? btrfs_search_slot+0x7b4/0xb30
? btrfs_update_root+0x1a9/0x2c0
btrfs_run_delayed_refs+0x73/0x1f0
? btrfs_update_root+0x1a9/0x2c0
btrfs_commit_transaction+0x50/0xa50
? btrfs_update_reloc_root+0x122/0x220
prepare_to_merge+0x29f/0x320
relocate_block_group+0x2b8/0x550
btrfs_relocate_block_group+0x1a6/0x350
btrfs_relocate_chunk+0x27/0xe0
btrfs_balance+0x777/0xe60
balance_kthread+0x35/0x50
? btrfs_balance+0xe60/0xe60
kthread+0x16b/0x190
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30
</TASK>
Normally snapshot deletion and relocation are excluded from running at
the same time by the fs_info->cleaner_mutex. However if we had a
pending balance waiting to get the ->cleaner_mutex, and a snapshot
deletion was running, and then the box crashed, we would come up in a
state where we have a half deleted snapshot.
Again, in the normal case the snapshot deletion needs to complete before
relocation can start, but in this case relocation could very well start
before the snapshot deletion completes, as we simply add the root to the
dead roots list and wait for the next time the cleaner runs to clean up
the snapshot.
Fix this by setting a bit on the fs_info if we have any DEAD_ROOT's that
had a pending drop_progress key. If they do then we know we were in the
middle of the drop operation and set a flag on the fs_info. Then
balance can wait until this flag is cleared to start up again.
If there are DEAD_ROOT's that don't have a drop_progress set then we're
safe to start balance right away as we'll be properly protected by the
cleaner_mutex.
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
User reported there is an array-index-out-of-bounds access while
mounting the crafted image:
[350.411942 ] loop0: detected capacity change from 0 to 262144
[350.427058 ] BTRFS: device fsid a62e00e8-e94e-4200-8217-12444de93c2e devid 1 transid 8 /dev/loop0 scanned by systemd-udevd (1044)
[350.428564 ] BTRFS info (device loop0): disk space caching is enabled
[350.428568 ] BTRFS info (device loop0): has skinny extents
[350.429589 ]
[350.429619 ] UBSAN: array-index-out-of-bounds in fs/btrfs/struct-funcs.c:161:1
[350.429636 ] index 1048096 is out of range for type 'page *[16]'
[350.429650 ] CPU: 0 PID: 9 Comm: kworker/u8:1 Not tainted 5.16.0-rc4
[350.429652 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[350.429653 ] Workqueue: btrfs-endio-meta btrfs_work_helper [btrfs]
[350.429772 ] Call Trace:
[350.429774 ] <TASK>
[350.429776 ] dump_stack_lvl+0x47/0x5c
[350.429780 ] ubsan_epilogue+0x5/0x50
[350.429786 ] __ubsan_handle_out_of_bounds+0x66/0x70
[350.429791 ] btrfs_get_16+0xfd/0x120 [btrfs]
[350.429832 ] check_leaf+0x754/0x1a40 [btrfs]
[350.429874 ] ? filemap_read+0x34a/0x390
[350.429878 ] ? load_balance+0x175/0xfc0
[350.429881 ] validate_extent_buffer+0x244/0x310 [btrfs]
[350.429911 ] btrfs_validate_metadata_buffer+0xf8/0x100 [btrfs]
[350.429935 ] end_bio_extent_readpage+0x3af/0x850 [btrfs]
[350.429969 ] ? newidle_balance+0x259/0x480
[350.429972 ] end_workqueue_fn+0x29/0x40 [btrfs]
[350.429995 ] btrfs_work_helper+0x71/0x330 [btrfs]
[350.430030 ] ? __schedule+0x2fb/0xa40
[350.430033 ] process_one_work+0x1f6/0x400
[350.430035 ] ? process_one_work+0x400/0x400
[350.430036 ] worker_thread+0x2d/0x3d0
[350.430037 ] ? process_one_work+0x400/0x400
[350.430038 ] kthread+0x165/0x190
[350.430041 ] ? set_kthread_struct+0x40/0x40
[350.430043 ] ret_from_fork+0x1f/0x30
[350.430047 ] </TASK>
[350.430047 ]
[350.430077 ] BTRFS warning (device loop0): bad eb member start: ptr 0xffe20f4e start 20975616 member offset 4293005178 size 2
btrfs check reports:
corrupt leaf: root=3 block=20975616 physical=20975616 slot=1, unexpected
item end, have 4294971193 expect 3897
The first slot item offset is 4293005033 and the size is 1966160.
In check_leaf, we use btrfs_item_end() to check item boundary versus
extent_buffer data size. However, return type of btrfs_item_end() is u32.
(u32)(4293005033 + 1966160) == 3897, overflow happens and the result 3897
equals to leaf data size reasonably.
Fix it by use u64 variable to store item data end in check_leaf() to
avoid u32 overflow.
This commit does solve the invalid memory access showed by the stack
trace. However, its metadata profile is DUP and another copy of the
leaf is fine. So the image can be mounted successfully. But when umount
is called, the ASSERT btrfs_mark_buffer_dirty() will be triggered
because the only node in extent tree has 0 item and invalid owner. It's
solved by another commit
"btrfs: check extent buffer owner against the owner rootid".
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215299
Reported-by: Wenqing Liu <wenqingliu0120@gmail.com>
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Whenever we do any extent buffer operations we call
assert_eb_page_uptodate() to complain loudly if we're operating on an
non-uptodate page. Our overnight tests caught this warning earlier this
week
WARNING: CPU: 1 PID: 553508 at fs/btrfs/extent_io.c:6849 assert_eb_page_uptodate+0x3f/0x50
CPU: 1 PID: 553508 Comm: kworker/u4:13 Tainted: G W 5.17.0-rc3+ #564
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Workqueue: btrfs-cache btrfs_work_helper
RIP: 0010:assert_eb_page_uptodate+0x3f/0x50
RSP: 0018:ffffa961440a7c68 EFLAGS: 00010246
RAX: 0017ffffc0002112 RBX: ffffe6e74453f9c0 RCX: 0000000000001000
RDX: ffffe6e74467c887 RSI: ffffe6e74453f9c0 RDI: ffff8d4c5efc2fc0
RBP: 0000000000000d56 R08: ffff8d4d4a224000 R09: 0000000000000000
R10: 00015817fa9d1ef0 R11: 000000000000000c R12: 00000000000007b1
R13: ffff8d4c5efc2fc0 R14: 0000000001500000 R15: 0000000001cb1000
FS: 0000000000000000(0000) GS:ffff8d4dbbd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff31d3448d8 CR3: 0000000118be8004 CR4: 0000000000370ee0
Call Trace:
extent_buffer_test_bit+0x3f/0x70
free_space_test_bit+0xa6/0xc0
load_free_space_tree+0x1f6/0x470
caching_thread+0x454/0x630
? rcu_read_lock_sched_held+0x12/0x60
? rcu_read_lock_sched_held+0x12/0x60
? rcu_read_lock_sched_held+0x12/0x60
? lock_release+0x1f0/0x2d0
btrfs_work_helper+0xf2/0x3e0
? lock_release+0x1f0/0x2d0
? finish_task_switch.isra.0+0xf9/0x3a0
process_one_work+0x26d/0x580
? process_one_work+0x580/0x580
worker_thread+0x55/0x3b0
? process_one_work+0x580/0x580
kthread+0xf0/0x120
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
This was partially fixed by c2e39305299f01 ("btrfs: clear extent buffer
uptodate when we fail to write it"), however all that fix did was keep
us from finding extent buffers after a failed writeout. It didn't keep
us from continuing to use a buffer that we already had found.
In this case we're searching the commit root to cache the block group,
so we can start committing the transaction and switch the commit root
and then start writing. After the switch we can look up an extent
buffer that hasn't been written yet and start processing that block
group. Then we fail to write that block out and clear Uptodate on the
page, and then we start spewing these errors.
Normally we're protected by the tree lock to a certain degree here. If
we read a block we have that block read locked, and we block the writer
from locking the block before we submit it for the write. However this
isn't necessarily fool proof because the read could happen before we do
the submit_bio and after we locked and unlocked the extent buffer.
Also in this particular case we have path->skip_locking set, so that
won't save us here. We'll simply get a block that was valid when we
read it, but became invalid while we were using it.
What we really want is to catch the case where we've "read" a block but
it's not marked Uptodate. On read we ClearPageError(), so if we're
!Uptodate and !Error we know we didn't do the right thing for reading
the page.
Fix this by checking !Uptodate && !Error, this way we will not complain
if our buffer gets invalidated while we're using it, and we'll maintain
the spirit of the check which is to make sure we have a fully in-cache
block while we're messing with it.
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When doing a full fsync, if we have prealloc extents beyond (or at) eof,
and the leaves that contain them were not modified in the current
transaction, we end up not logging them. This results in losing those
extents when we replay the log after a power failure, since the inode is
truncated to the current value of the logged i_size.
Just like for the fast fsync path, we need to always log all prealloc
extents starting at or beyond i_size. The fast fsync case was fixed in
commit 471d557afed155 ("Btrfs: fix loss of prealloc extents past i_size
after fsync log replay") but it missed the full fsync path. The problem
exists since the very early days, when the log tree was added by
commit e02119d5a7b439 ("Btrfs: Add a write ahead tree log to optimize
synchronous operations").
Example reproducer:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
# Create our test file with many file extent items, so that they span
# several leaves of metadata, even if the node/page size is 64K. Use
# direct IO and not fsync/O_SYNC because it's both faster and it avoids
# clearing the full sync flag from the inode - we want the fsync below
# to trigger the slow full sync code path.
$ xfs_io -f -d -c "pwrite -b 4K 0 16M" /mnt/foo
# Now add two preallocated extents to our file without extending the
# file's size. One right at i_size, and another further beyond, leaving
# a gap between the two prealloc extents.
$ xfs_io -c "falloc -k 16M 1M" /mnt/foo
$ xfs_io -c "falloc -k 20M 1M" /mnt/foo
# Make sure everything is durably persisted and the transaction is
# committed. This makes all created extents to have a generation lower
# than the generation of the transaction used by the next write and
# fsync.
sync
# Now overwrite only the first extent, which will result in modifying
# only the first leaf of metadata for our inode. Then fsync it. This
# fsync will use the slow code path (inode full sync bit is set) because
# it's the first fsync since the inode was created/loaded.
$ xfs_io -c "pwrite 0 4K" -c "fsync" /mnt/foo
# Extent list before power failure.
$ xfs_io -c "fiemap -v" /mnt/foo
/mnt/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..7]: 2178048..2178055 8 0x0
1: [8..16383]: 26632..43007 16376 0x0
2: [16384..32767]: 2156544..2172927 16384 0x0
3: [32768..34815]: 2172928..2174975 2048 0x800
4: [34816..40959]: hole 6144
5: [40960..43007]: 2174976..2177023 2048 0x801
<power fail>
# Mount fs again, trigger log replay.
$ mount /dev/sdc /mnt
# Extent list after power failure and log replay.
$ xfs_io -c "fiemap -v" /mnt/foo
/mnt/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..7]: 2178048..2178055 8 0x0
1: [8..16383]: 26632..43007 16376 0x0
2: [16384..32767]: 2156544..2172927 16384 0x1
# The prealloc extents at file offsets 16M and 20M are missing.
So fix this by calling btrfs_log_prealloc_extents() when we are doing a
full fsync, so that we always log all prealloc extents beyond eof.
A test case for fstests will follow soon.
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
When looping btrfs/074 with 64K page size and 4K sectorsize, there is a
low chance (1/50~1/100) to crash with the following ASSERT() triggered
in btrfs_subpage_start_writer():
ret = atomic_add_return(nbits, &subpage->writers);
ASSERT(ret == nbits); <<< This one <<<
[CAUSE]
With more debugging output on the parameters of
btrfs_subpage_start_writer(), it shows a very concerning error:
ret=29 nbits=13 start=393216 len=53248
For @nbits it's correct, but @ret which is the returned value from
atomic_add_return(), it's not only larger than nbits, but also larger
than max sectors per page value (for 64K page size and 4K sector size,
it's 16).
This indicates that some call sites are not properly decreasing the value.
And that's exactly the case, in btrfs_page_unlock_writer(), due to the
fact that we can have page locked either by lock_page() or
process_one_page(), we have to check if the subpage has any writer.
If no writers, it's locked by lock_page() and we only need to unlock it.
But unfortunately the check for the writers are completely opposite:
if (atomic_read(&subpage->writers))
/* No writers, locked by plain lock_page() */
return unlock_page(page);
We directly unlock the page if it has writers, which is the completely
opposite what we want.
Thankfully the affected call site is only limited to
extent_write_locked_range(), so it's mostly affecting compressed write.
[FIX]
Just fix the wrong check condition to fix the bug.
Fixes: e55a0de18572 ("btrfs: rework page locking in __extent_writepage()")
CC: stable@vger.kernel.org # 5.16
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
rtw_mlme.h has a comment which briefly describes the locking rules for
the rtl8723bs driver, improve this to also mention the locking order
of xmit_priv.lock vs the lock(s) embedded in the various queues.
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220302101637.26542-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 54659ca026e5 ("staging: rtl8723bs: remove possible deadlock when
disconnect (v2)") split the locking of pxmitpriv->lock vs sleep_q/lock
into 2 locks in attempt to fix a lockdep reported issue with the locking
order of the sta_hash_lock vs pxmitpriv->lock.
But in the end this turned out to not fully solve the sta_hash_lock issue
so commit a7ac783c338b ("staging: rtl8723bs: remove a second possible
deadlock") was added to fix this in another way.
The original fix was kept as it was still seen as a good thing to have,
but now it turns out that it creates a deadlock in access-point mode:
[Feb20 23:47] ======================================================
[ +0.074085] WARNING: possible circular locking dependency detected
[ +0.074077] 5.16.0-1-amd64 #1 Tainted: G C E
[ +0.064710] ------------------------------------------------------
[ +0.074075] ksoftirqd/3/29 is trying to acquire lock:
[ +0.060542] ffffb8b30062ab00 (&pxmitpriv->lock){+.-.}-{2:2}, at: rtw_xmit_classifier+0x8a/0x140 [r8723bs]
[ +0.114921]
but task is already holding lock:
[ +0.069908] ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs]
[ +0.116976]
which lock already depends on the new lock.
[ +0.098037]
the existing dependency chain (in reverse order) is:
[ +0.089704]
-> #1 (&psta->sleep_q.lock){+.-.}-{2:2}:
[ +0.077232] _raw_spin_lock_bh+0x34/0x40
[ +0.053261] xmitframe_enqueue_for_sleeping_sta+0xc1/0x2f0 [r8723bs]
[ +0.082572] rtw_xmit+0x58b/0x940 [r8723bs]
[ +0.056528] _rtw_xmit_entry+0xba/0x350 [r8723bs]
[ +0.062755] dev_hard_start_xmit+0xf1/0x320
[ +0.056381] sch_direct_xmit+0x9e/0x360
[ +0.052212] __dev_queue_xmit+0xce4/0x1080
[ +0.055334] ip6_finish_output2+0x18f/0x6e0
[ +0.056378] ndisc_send_skb+0x2c8/0x870
[ +0.052209] ndisc_send_ns+0xd3/0x210
[ +0.050130] addrconf_dad_work+0x3df/0x5a0
[ +0.055338] process_one_work+0x274/0x5a0
[ +0.054296] worker_thread+0x52/0x3b0
[ +0.050124] kthread+0x16c/0x1a0
[ +0.044925] ret_from_fork+0x1f/0x30
[ +0.049092]
-> #0 (&pxmitpriv->lock){+.-.}-{2:2}:
[ +0.074101] __lock_acquire+0x10f5/0x1d80
[ +0.054298] lock_acquire+0xd7/0x300
[ +0.049088] _raw_spin_lock_bh+0x34/0x40
[ +0.053248] rtw_xmit_classifier+0x8a/0x140 [r8723bs]
[ +0.066949] rtw_xmitframe_enqueue+0xa/0x20 [r8723bs]
[ +0.066946] rtl8723bs_hal_xmitframe_enqueue+0x14/0x50 [r8723bs]
[ +0.078386] wakeup_sta_to_xmit+0xa6/0x300 [r8723bs]
[ +0.065903] rtw_recv_entry+0xe36/0x1160 [r8723bs]
[ +0.063809] rtl8723bs_recv_tasklet+0x349/0x6c0 [r8723bs]
[ +0.071093] tasklet_action_common.constprop.0+0xe5/0x110
[ +0.070966] __do_softirq+0x16f/0x50a
[ +0.050134] __irq_exit_rcu+0xeb/0x140
[ +0.051172] irq_exit_rcu+0xa/0x20
[ +0.047006] common_interrupt+0xb8/0xd0
[ +0.052214] asm_common_interrupt+0x1e/0x40
[ +0.056381] finish_task_switch.isra.0+0x100/0x3a0
[ +0.063670] __schedule+0x3ad/0xd20
[ +0.048047] schedule+0x4e/0xc0
[ +0.043880] smpboot_thread_fn+0xc4/0x220
[ +0.054298] kthread+0x16c/0x1a0
[ +0.044922] ret_from_fork+0x1f/0x30
[ +0.049088]
other info that might help us debug this:
[ +0.095950] Possible unsafe locking scenario:
[ +0.070952] CPU0 CPU1
[ +0.054282] ---- ----
[ +0.054285] lock(&psta->sleep_q.lock);
[ +0.047004] lock(&pxmitpriv->lock);
[ +0.074082] lock(&psta->sleep_q.lock);
[ +0.077209] lock(&pxmitpriv->lock);
[ +0.043873]
*** DEADLOCK ***
[ +0.070950] 1 lock held by ksoftirqd/3/29:
[ +0.049082] #0: ffffb8b3007ab704 (&psta->sleep_q.lock){+.-.}-{2:2}, at: wakeup_sta_to_xmit+0x3b/0x300 [r8723bs]
Analysis shows that in hindsight the splitting of the lock was not
a good idea, so revert this to fix the access-point mode deadlock.
Note this is a straight-forward revert done with git revert, the commented
out "/* spin_lock_bh(&psta_bmc->sleep_q.lock); */" lines were part of the
code before the reverted changes.
Fixes: 54659ca026e5 ("staging: rtl8723bs: remove possible deadlock when disconnect (v2)")
Cc: stable <stable@vger.kernel.org>
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215542
Link: https://lore.kernel.org/r/20220302101637.26542-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
z_idataoff here is an absolute physical offset, so it should use
erofs_off_t (64 bits at least). Otherwise, it'll get trimmed and
cause the decompresion failure.
Link: https://lore.kernel.org/r/20220222033118.20540-1-hsiangkao@linux.alibaba.com
Fixes: ab92184ff8f1 ("erofs: add on-disk compressed tail-packing inline support")
Reviewed-by: Yue Hu <huyue2@yulong.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
PCM buffers might be allocated dynamically when the buffer
preallocation failed or a larger buffer is requested, and it's not
guaranteed that substream->dma_buffer points to the actually used
buffer. The driver needs to refer to substream->runtime->dma_addr
instead for the buffer address.
Signed-off-by: Zhen Ni <nizhen@uniontech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220302074241.30469-1-nizhen@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
The ifindex doesn't have to be unique for multiple network namespaces on
the same machine.
$ ip netns add test1
$ ip -net test1 link add dummy1 type dummy
$ ip netns add test2
$ ip -net test2 link add dummy2 type dummy
$ ip -net test1 link show dev dummy1
6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff
$ ip -net test2 link show dev dummy2
6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff
But the batman-adv code to walk through the various layers of virtual
interfaces uses this assumption because dev_get_iflink handles it
internally and doesn't return the actual netns of the iflink. And
dev_get_iflink only documents the situation where ifindex == iflink for
physical devices.
But only checking for dev->netdev_ops->ndo_get_iflink is also not an option
because ipoib_get_iflink implements it even when it sometimes returns an
iflink != ifindex and sometimes iflink == ifindex. The caller must
therefore make sure itself to check both netns and iflink + ifindex for
equality. Only when they are equal, a "physical" interface was detected
which should stop the traversal. On the other hand, vxcan_get_iflink can
also return 0 in case there was currently no valid peer. In this case, it
is still necessary to stop.
Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_get_real_netdevice. And since some of the
ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.
Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_is_on_batman_iface. And since some of the
.ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.
Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
Before these changes elan_suspend() would only disable the regulator
when device_may_wakeup() returns false; whereas elan_resume() would
unconditionally enable it, leading to an enable count imbalance when
device_may_wakeup() returns true.
This triggers the "WARN_ON(regulator->enable_count)" in regulator_put()
when the elan_i2c driver gets unbound, this happens e.g. with the
hot-plugable dock with Elan I2C touchpad for the Asus TF103C 2-in-1.
Fix this by making the regulator_enable() call also be conditional
on device_may_wakeup() returning false.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220131135436.29638-2-hdegoede@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
elan_disable_power() is called conditionally on suspend, where as
elan_enable_power() is always called on resume. This leads to
an imbalance in the regulator's enable count.
Move the regulator_[en|dis]able() calls out of elan_[en|dis]able_power()
in preparation of fixing this.
No functional changes intended.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220131135436.29638-1-hdegoede@redhat.com
[dtor: consolidate elan_[en|dis]able() into elan_set_power()]
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
When trying to add a histogram against an event with the "cpu" field, it
was impossible due to "cpu" being a keyword to key off of the running CPU.
So to fix this, it was changed to "common_cpu" to match the other generic
fields (like "common_pid"). But since some scripts used "cpu" for keying
off of the CPU (for events that did not have "cpu" as a field, which is
most of them), a backward compatibility trick was added such that if "cpu"
was used as a key, and the event did not have "cpu" as a field name, then
it would fallback and switch over to "common_cpu".
This fix has a couple of subtle bugs. One was that when switching over to
"common_cpu", it did not change the field name, it just set a flag. But
the code still found a "cpu" field. The "cpu" field is used for filtering
and is returned when the event does not have a "cpu" field.
This was found by:
# cd /sys/kernel/tracing
# echo hist:key=cpu,pid:sort=cpu > events/sched/sched_wakeup/trigger
# cat events/sched/sched_wakeup/hist
Which showed the histogram unsorted:
{ cpu: 19, pid: 1175 } hitcount: 1
{ cpu: 6, pid: 239 } hitcount: 2
{ cpu: 23, pid: 1186 } hitcount: 14
{ cpu: 12, pid: 249 } hitcount: 2
{ cpu: 3, pid: 994 } hitcount: 5
Instead of hard coding the "cpu" checks, take advantage of the fact that
trace_event_field_field() returns a special field for "cpu" and "CPU" if
the event does not have "cpu" as a field. This special field has the
"filter_type" of "FILTER_CPU". Check that to test if the returned field is
of the CPU type instead of doing the string compare.
Also, fix the sorting bug by testing for the hist_field flag of
HIST_FIELD_FL_CPU when setting up the sort routine. Otherwise it will use
the special CPU field to know what compare routine to use, and since that
special field does not have a size, it returns tracing_map_cmp_none.
Cc: stable@vger.kernel.org
Fixes: 1e3bac71c505 ("tracing/histogram: Rename "cpu" to "common_cpu"")
Reported-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process
which initiated the protocol change exits the kernel processing while
still holding the rtnl_mutex. So any other process attempting to lock
the rtnl_mutex would deadlock after such event.
The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed
by the blamed commit, introducing this regression. We must still call
rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old
protocol. The latter is due to the limiting design of notifier chains
for cross-chip operations, which don't have a built-in error recovery
mechanism - we should look into using notifier_call_chain_robust for that.
Fixes: dc452a471dba ("net: dsa: introduce tagger-owned storage for private and shared data")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220228141715.146485-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix regression with scanning not working in some systems.
* tag 'for-net-2022-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: Fix not checking MGMT cmd pending queue
====================
Link: https://lore.kernel.org/r/20220302004330.125536-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A number of places in the MGMT handlers we examine the command queue for
other commands (in progress but not yet complete) that will interact
with the process being performed. However, not all commands go into the
queue if one of:
1. There is no negative side effect of consecutive or redundent commands
2. The command is entirely perform "inline".
This change examines each "pending command" check, and if it is not
needed, deletes the check. Of the remaining pending command checks, we
make sure that the command is in the pending queue by using the
mgmt_pending_add/mgmt_pending_remove pair rather than the
mgmt_pending_new/mgmt_pending_free pair.
Link: https://lore.kernel.org/linux-bluetooth/f648f2e11bb3c2974c32e605a85ac3a9fac944f1.camel@redhat.com/T/
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Brian Gix <brian.gix@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not
intentional. From Eric Dumazet.
2) Use-after-free in netfilter hook core, from Eric Dumazet.
3) Missing rcu read lock side for netfilter egress hook,
from Florian Westphal.
4) nf_queue assume state->sk is full socket while it might not be.
Invoke sock_gen_put(), from Florian Westphal.
5) Add selftest to exercise the reported KASAN splat in 4)
6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0.
Also from Florian.
7) Use input interface index only for hardware offload, not for
the software plane. This breaks tc ct action. Patch from Paul Blakey.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
net/sched: act_ct: Fix flow table lookup failure with no originating ifindex
netfilter: nf_queue: handle socket prefetch
netfilter: nf_queue: fix possible use-after-free
selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test
netfilter: nf_queue: don't assume sk is full socket
netfilter: egress: silence egress hook lockdep splats
netfilter: fix use-after-free in __nf_register_net_hook()
netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant
====================
Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The netif_rx_ni() function frees the skb so we can't dereference it to
save the skb->len.
Fixes: 61e121047645 ("staging: gdm7240: adding LTE USB driver")
Cc: stable <stable@vger.kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220228074331.GA13685@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
U-Boot uses ethernet* aliases for setting MAC addresses. Therefore define
also alias for ethernet0.
Fixes: 7109d817db2e ("arm64: dts: marvell: add DTS for Turris Mox")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
After cited commit optimizted hw insertion, flow table entries are
populated with ifindex information which was intended to only be used
for HW offload. This tuple ifindex is hashed in the flow table key, so
it must be filled for lookup to be successful. But tuple ifindex is only
relevant for the netfilter flowtables (nft), so it's not filled in
act_ct flow table lookup, resulting in lookup failure, and no SW
offload and no offload teardown for TCP connection FIN/RST packets.
To fix this, add new tc ifindex field to tuple, which will
only be used for offloading, not for lookup, as it will not be
part of the tuple hash.
Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"The bigger part of the change is a revert for x86 hosts. Here the
second patch was supposed to fix the first, but in reality it was just
as broken, so both have to go.
x86 host:
- Revert incorrect assumption that cr3 changes come with preempt
notifier callbacks (they don't when static branches are changed,
for example)
ARM host:
- Correctly synchronise PMR and co on PSCI CPU_SUSPEND
- Skip tests that depend on GICv3 when the HW isn't available"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: aarch64: Skip tests if we can't create a vgic-v3
Revert "KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()"
Revert "KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()"
KVM: arm64: Don't miss pending interrupts for suspended vCPU
|
|
s390 has a swap_ex_entry_fixup function, however it is not being used
since common code expects a swap_ex_entry_fixup define. If it is not
defined the default implementation will be used. So fix this by adding
a proper define.
However also the implementation of the function must be fixed, since a
NULL value for handler has a special meaning and must not be adjusted.
Luckily all of this doesn't fix a real bug currently: the main extable
is correctly sorted during build time, and for runtime sorting there
is currently no case where the handler field is not NULL.
Fixes: 05a68e892e89 ("s390/kernel: expand exception table logic to allow new handling options")
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
arch_ftrace_get_regs is supposed to return a struct pt_regs pointer
only if the pt_regs structure contains all register contents, which
means it must have been populated when created via ftrace_regs_caller.
If it was populated via ftrace_caller the contents are not complete
(the psw mask part is missing), and therefore a NULL pointer needs be
returned.
The current code incorrectly always returns a struct pt_regs pointer.
Fix this by adding another pt_regs flag which indicates if the
contents are complete, and fix arch_ftrace_get_regs accordingly.
Fixes: 894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
ftrace_caller was used for both ftrace_caller and ftrace_regs_caller,
which means that the target address of the hotpatch trampoline was
never updated.
With commit 894979689d3a ("s390/ftrace: provide separate
ftrace_caller/ftrace_regs_caller implementations") a separate
ftrace_regs_caller entry point was implemeted, however it was
forgotten to implement the necessary changes for ftrace_modify_call
and ftrace_make_call, where the branch target has to be modified
accordingly.
Therefore add the missing code now.
Fixes: 894979689d3a ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations")
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
We need to preserve the values at OLDMEM_BASE and OLDMEM_SIZE which are
used by zgetdump in case when kdump crashes. In that case zgetdump will
attempt to read OLDMEM_BASE and OLDMEM_SIZE in order to find out where
the memory range [0 - OLDMEM_SIZE] belonging to the production kernel is.
Fixes: f1a546947431 ("s390/setup: don't reserve memory that occupied decompressor's head")
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
VRR capable property is not attached by default to the connector
It is attached only if VRR is supported.
So if the driver tries to call drm core set prop function without
it being attached that causes NULL dereference.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.navare@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull binfmt_elf fix from Kees Cook:
"This addresses a regression[1] under ia64 where some ET_EXEC binaries
were not loading"
Link: https://linux-regtracking.leemhuis.info/regzbot/regression/a3edd529-c42d-3b09-135c-7e98a15b150f@leemhuis.info/ [1]
- Fix ia64 ET_EXEC loading
* tag 'binfmt_elf-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
binfmt_elf: Avoid total_mapping_size for ET_EXEC
|
|
Partially revert commit 5f501d555653 ("binfmt_elf: reintroduce using
MAP_FIXED_NOREPLACE"), which applied the ET_DYN "total_mapping_size"
logic also to ET_EXEC.
At least ia64 has ET_EXEC PT_LOAD segments that are not virtual-address
contiguous (but _are_ file-offset contiguous). This would result in a
giant mapping attempting to cover the entire span, including the virtual
address range hole, and well beyond the size of the ELF file itself,
causing the kernel to refuse to load it. For example:
$ readelf -lW /usr/bin/gcc
...
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz ...
...
LOAD 0x000000 0x4000000000000000 0x4000000000000000 0x00b5a0 0x00b5a0 ...
LOAD 0x00b5a0 0x600000000000b5a0 0x600000000000b5a0 0x0005ac 0x000710 ...
...
^^^^^^^^ ^^^^^^^^^^^^^^^^^^ ^^^^^^^^ ^^^^^^^^
File offset range : 0x000000-0x00bb4c
0x00bb4c bytes
Virtual address range : 0x4000000000000000-0x600000000000bcb0
0x200000000000bcb0 bytes
Remove the total_mapping_size logic for ET_EXEC, which reduces the
ET_EXEC MAP_FIXED_NOREPLACE coverage to only the first PT_LOAD (better
than nothing), and retains it for ET_DYN.
Ironically, this is the reverse of the problem that originally caused
problems with MAP_FIXED_NOREPLACE: overlapping PT_LOAD segments. Future
work could restore full coverage if load_elf_binary() were to perform
mappings in a separate phase from the loading (where it could resolve
both overlaps and holes).
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-mm@kvack.org
Reported-by: matoro <matoro_bugzilla_kernel@matoro.tk>
Fixes: 5f501d555653 ("binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE")
Link: https://lore.kernel.org/r/a3edd529-c42d-3b09-135c-7e98a15b150f@leemhuis.info
Tested-by: matoro <matoro_mailinglist_kernel@matoro.tk>
Link: https://lore.kernel.org/lkml/ce8af9c13bcea9230c7689f3c1e0e2cd@matoro.tk
Tested-By: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/lkml/49182d0d-708b-4029-da5f-bc18603440a6@physik.fu-berlin.de
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Do not call get_trip_hyst() from thermal_genl_cmd_tz_get_trip() if
the thermal zone does not define one.
Fixes: 1ce50e7d408e ("thermal: core: genetlink support for events/cmd/sampling")
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The function alloc_workqueue() in nintendo_hid_probe() can fail, but
there is no check of its return value. To fix this bug, its return value
should be checked with new error handling code.
Fixes: c4eae84feff3e ("HID: nintendo: add rumble support")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Silvan Jegen <s.jegen@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
johannes Berg says:
====================
Some last-minute fixes:
* rfkill
- add missing rfill_soft_blocked() when disabled
* cfg80211
- handle a nla_memdup() failure correctly
- fix CONFIG_CFG80211_EXTRA_REGDB_KEYDIR typo in
Makefile
* mac80211
- fix EAPOL handling in 802.3 RX path
- reject setting up aggregation sessions before
connection is authorized to avoid timeouts or
similar
- handle some SAE authentication steps correctly
- fix AC selection in mesh forwarding
* iwlwifi
- remove TWT support as it causes firmware crashes
when the AP isn't behaving correctly
- check debugfs pointer before dereferncing it
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver creates the top row map sysfs attribute in input_configured()
method; unfortunately we do not have a callback that is executed when HID
interface is unbound, thus we are leaking these sysfs attributes, for
example when device is disconnected.
To fix it let's switch to managed version of adding sysfs attributes which
will ensure that they are destroyed when the driver is unbound.
Fixes: 14c9c014babe ("HID: add vivaldi HID driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
The kbuild change here accidentally removed not only the
unquoting, but also the last character of the variable
name. Fix that.
Fixes: 129ab0d2d9f3 ("kbuild: do not quote string values in include/config/auto.conf")
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20220221155512.1d25895f7c5f.I50fa3d4189fcab90a2896fe8cae215035dae9508@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The mstar SoCs have an arch timer but HAVE_ARM_ARCH_TIMER wasn't
selected. If MSC313E_TIMER isn't selected then the kernel gets
stuck at boot because there are no timers available.
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Link: https://lore.kernel.org/r/20220301104349.3040422-1-daniel@0x0f.com'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
in tunnel mode, if outer interface(ipv4) is less, it is easily to let
inner IPV6 mtu be less than 1280. If so, a Packet Too Big ICMPV6 message
is received. When send again, packets are fragmentized with 1280, they
are still rejected with ICMPV6(Packet Too Big) by xfrmi_xmit2().
According to RFC4213 Section3.2.2:
if (IPv4 path MTU - 20) is less than 1280
if packet is larger than 1280 bytes
Send ICMPv6 "packet too big" with MTU=1280
Drop packet
else
Encapsulate but do not set the Don't Fragment
flag in the IPv4 header. The resulting IPv4
packet might be fragmented by the IPv4 layer
on the encapsulator or by some router along
the IPv4 path.
endif
else
if packet is larger than (IPv4 path MTU - 20)
Send ICMPv6 "packet too big" with
MTU = (IPv4 path MTU - 20).
Drop packet.
else
Encapsulate and set the Don't Fragment flag
in the IPv4 header.
endif
endif
Packets should be fragmentized with ipv4 outer interface, so change it.
After it is fragemtized with ipv4, there will be double fragmenation.
No.48 & No.51 are ipv6 fragment packets, No.48 is double fragmentized,
then tunneled with IPv4(No.49& No.50), which obey spec. And received peer
cannot decrypt it rightly.
48 2002::10 2002::11 1296(length) IPv6 fragment (off=0 more=y ident=0xa20da5bc nxt=50)
49 0x0000 (0) 2002::10 2002::11 1304 IPv6 fragment (off=0 more=y ident=0x7448042c nxt=44)
50 0x0000 (0) 2002::10 2002::11 200 ESP (SPI=0x00035000)
51 2002::10 2002::11 180 Echo (ping) request
52 0x56dc 2002::10 2002::11 248 IPv6 fragment (off=1232 more=n ident=0xa20da5bc nxt=50)
xfrm6_noneed_fragment has fixed above issues. Finally, it acted like below:
1 0x6206 192.168.1.138 192.168.1.1 1316 Fragmented IP protocol (proto=Encap Security Payload 50, off=0, ID=6206) [Reassembled in #2]
2 0x6206 2002::10 2002::11 88 IPv6 fragment (off=0 more=y ident=0x1f440778 nxt=50)
3 0x0000 2002::10 2002::11 248 ICMPv6 Echo (ping) request
Signed-off-by: Lina Wang <lina.wang@mediatek.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
In case someone combines bpf socket assign and nf_queue, then we will
queue an skb who references a struct sock that did not have its
reference count incremented.
As we leave rcu protection, there is no guarantee that skb->sk is still
valid.
For refcount-less skb->sk case, try to increment the reference count
and then override the destructor.
In case of failure we have two choices: orphan the skb and 'delete'
preselect or let nf_queue() drop the packet.
Do the latter, it should not happen during normal operation.
Fixes: cf7fbe660f2d ("bpf: Add socket assign support")
Acked-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
Eric Dumazet says:
The sock_hold() side seems suspect, because there is no guarantee
that sk_refcnt is not already 0.
On failure, we cannot queue the packet and need to indicate an
error. The packet will be dropped by the caller.
v2: split skb prefetch hunk into separate change
Fixes: 271b72c7fa82c ("udp: RCU handling for Unicast packets.")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
causes:
BUG: KASAN: slab-out-of-bounds in sk_free+0x25/0x80
Write of size 4 at addr ffff888106df0284 by task nf-queue/1459
sk_free+0x25/0x80
nf_queue_entry_release_refs+0x143/0x1a0
nf_reinject+0x233/0x770
... without 'netfilter: nf_queue: don't assume sk is full socket'.
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
There is no guarantee that state->sk refers to a full socket.
If refcount transitions to 0, sock_put calls sk_free which then ends up
with garbage fields.
I'd like to thank Oleksandr Natalenko and Jiri Benc for considerable
debug work and pointing out state->sk oddities.
Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
When we get anti-clogging token required (added by the commit
mentioned below), or the other status codes added by the later
commit 4e56cde15f7d ("mac80211: Handle special status codes in
SAE commit") we currently just pretend (towards the internal
state machine of authentication) that we didn't receive anything.
This has the undesirable consequence of retransmitting the prior
frame, which is not expected, because the timer is still armed.
If we just disarm the timer at that point, it would result in
the undesirable side effect of being in this state indefinitely
if userspace crashes, or so.
So to fix this, reset the timer and set a new auth_data->waiting
in order to have no more retransmissions, but to have the data
destroyed when the timer actually fires, which will only happen
if userspace didn't continue (i.e. crashed or abandoned it.)
Fixes: a4055e74a2ff ("mac80211: Don't destroy auth data in case of anti-clogging")
Reported-by: Jouni Malinen <j@w1.fi>
Link: https://lore.kernel.org/r/20220224103932.75964e1d7932.Ia487f91556f29daae734bf61f8181404642e1eec@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
As there's potential for failure of the nla_memdup(),
check the return value.
Fixes: a442b761b24b ("cfg80211: add add_nan_func / del_nan_func")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/20220301100020.3801187-1-jiasheng@iscas.ac.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When "debugfs=off" is used on the kernel command line, iwiwifi's
mvm module uses an invalid/unchecked debugfs_dir pointer and causes
a BUG:
BUG: kernel NULL pointer dereference, address: 000000000000004f
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 503 Comm: modprobe Tainted: G W 5.17.0-rc5 #7
Hardware name: Dell Inc. Inspiron 15 5510/076F7Y, BIOS 2.4.1 11/05/2021
RIP: 0010:iwl_mvm_dbgfs_register+0x692/0x700 [iwlmvm]
Code: 69 a0 be 80 01 00 00 48 c7 c7 50 73 6a a0 e8 95 cf ee e0 48 8b 83 b0 1e 00 00 48 c7 c2 54 73 6a a0 be 64 00 00 00 48 8d 7d 8c <48> 8b 48 50 e8 15 22 07 e1 48 8b 43 28 48 8d 55 8c 48 c7 c7 5f 73
RSP: 0018:ffffc90000a0ba68 EFLAGS: 00010246
RAX: ffffffffffffffff RBX: ffff88817d6e3328 RCX: ffff88817d6e3328
RDX: ffffffffa06a7354 RSI: 0000000000000064 RDI: ffffc90000a0ba6c
RBP: ffffc90000a0bae0 R08: ffffffff824e4880 R09: ffffffffa069d620
R10: ffffc90000a0ba00 R11: ffffffffffffffff R12: 0000000000000000
R13: ffffc90000a0bb28 R14: ffff88817d6e3328 R15: ffff88817d6e3320
FS: 00007f64dd92d740(0000) GS:ffff88847f640000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000004f CR3: 000000016fc79001 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? iwl_mvm_mac_setup_register+0xbdc/0xda0 [iwlmvm]
iwl_mvm_start_post_nvm+0x71/0x100 [iwlmvm]
iwl_op_mode_mvm_start+0xab8/0xb30 [iwlmvm]
_iwl_op_mode_start+0x6f/0xd0 [iwlwifi]
iwl_opmode_register+0x6a/0xe0 [iwlwifi]
? 0xffffffffa0231000
iwl_mvm_init+0x35/0x1000 [iwlmvm]
? 0xffffffffa0231000
do_one_initcall+0x5a/0x1b0
? kmem_cache_alloc+0x1e5/0x2f0
? do_init_module+0x1e/0x220
do_init_module+0x48/0x220
load_module+0x2602/0x2bc0
? __kernel_read+0x145/0x2e0
? kernel_read_file+0x229/0x290
__do_sys_finit_module+0xc5/0x130
? __do_sys_finit_module+0xc5/0x130
__x64_sys_finit_module+0x13/0x20
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f64dda564dd
Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1b 29 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007ffdba393f88 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f64dda564dd
RDX: 0000000000000000 RSI: 00005575399e2ab2 RDI: 0000000000000001
RBP: 000055753a91c5e0 R08: 0000000000000000 R09: 0000000000000002
R10: 0000000000000001 R11: 0000000000000246 R12: 00005575399e2ab2
R13: 000055753a91ceb0 R14: 0000000000000000 R15: 000055753a923018
</TASK>
Modules linked in: btintel(+) btmtk bluetooth vfat snd_hda_codec_hdmi fat snd_hda_codec_realtek snd_hda_codec_generic iwlmvm(+) snd_sof_pci_intel_tgl mac80211 snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence soundwire_bus snd_sof_intel_hda snd_sof_pci snd_sof snd_sof_xtensa_dsp snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core btrfs snd_compress snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec raid6_pq iwlwifi snd_hda_core snd_pcm snd_timer snd soundcore cfg80211 intel_ish_ipc(+) thunderbolt rfkill intel_ishtp ucsi_acpi wmi i2c_hid_acpi i2c_hid evdev
CR2: 000000000000004f
---[ end trace 0000000000000000 ]---
Check the debugfs_dir pointer for an error before using it.
Fixes: 8c082a99edb9 ("iwlwifi: mvm: simplify iwl_mvm_dbgfs_register")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220223030630.23241-1-rdunlap@infradead.org
[change to make both conditional]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Some APs misbehave when TWT is used and cause our firmware to crash.
We don't know a reasonable way to detect and work around this problem
in the FW yet. To prevent these crashes, disable TWT in the driver by
stopping to advertise TWT support.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215523
Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com>
[reworded the commit message]
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20220301072926.153969-1-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If CONFIG_RFKILL is not set, the Intel WiFi driver will not build
the iw_mvm driver part due to the missing rfill_soft_blocked()
call. Adding a inline declaration of rfill_soft_blocked() if
CONFIG_RFKILL=n fixes the following error:
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h: In function 'iwl_mvm_mei_set_sw_rfkill_state':
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h:2215:38: error: implicit declaration of function 'rfkill_soft_blocked'; did you mean 'rfkill_blocked'? [-Werror=implicit-function-declaration]
2215 | mvm->hw_registered ? rfkill_soft_blocked(mvm->hw->wiphy->rfkill) : false;
| ^~~~~~~~~~~~~~~~~~~
| rfkill_blocked
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reported-by: Neill Whillans <neill.whillans@codethink.co.uk>
Fixes: 5bc9a9dd7535 ("rfkill: allow to get the software rfkill state")
Link: https://lore.kernel.org/r/20220218093858.1245677-1-ben.dooks@codethink.co.uk
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/fixes
- Set display pipeline to DSI on mt8183 kukui jacuzzi
- Fix display for mt8192 based boards by fixing the routing table
* tag 'v5.17-fixes-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux:
soc: mediatek: mt8192-mmsys: Fix dither to dsi0 path's input sel
arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint
Link: https://lore.kernel.org/r/8eb8510d-c597-4fee-e4b3-924b6d4bb3be@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm DeviceTree fixes for v5.17
The SDX65 platform and MTP device was added twice to the DT binding,
this drops one of the occurances.
* tag 'qcom-dts-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
Revert "dt-bindings: arm: qcom: Document SDX65 platform and boards"
Link: https://lore.kernel.org/r/20220301033838.1801689-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm ARM64 DeviceTree fixes for 5.17
This starts off by fixing an issue introduced in a bug fix in the
global clock controller, where the symbol clocks for UFS would
end up picking the wrong parent clock which breaks UFS.
It then makes sure that the reference clock for the USB blocks are
enabled, even with booting without clk_ignore_unused.
It corrects the apps SMMU interrupts defintion by adding a missing
interrupt in the list.
Lastly it disables the Qualcomm crypto hardware (for now) on the Lenovo
Yoga C630, to prevent the cryptomanager tests during boot from crashing
the device.
* tag 'qcom-arm64-fixes-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: c630: disable crypto due to serror
arm64: dts: qcom: sm8450: fix apps_smmu interrupts
arm64: dts: qcom: sm8450: enable GCC_USB3_0_CLKREF_EN for usb
arm64: dts: qcom: sm8350: Correct UFS symbol clocks
Link: https://lore.kernel.org/r/20220301033526.1801295-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.17, please pull the following:
- Maxime fixes the HVS (display) register range for the BCM2711
(Raspberry Pi 4) SoC
* tag 'arm-soc/for-5.17/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: boot: dts: bcm2711: Fix HVS register range
Link: https://lore.kernel.org/r/20220228165537.1950863-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|