Age | Commit message (Collapse) | Author |
|
In function scrub_handle_errored_block(), we use @sblocks_for_recheck
pointer to hold one scrub_block for each mirror, and uses kcalloc() to
allocate an array.
But this one pointer for an array is not readable due to the member
offsets done by addition and not [].
Change this pointer to struct scrub_block *[BTRFS_MAX_MIRRORS], this
will slightly increase the stack memory usage.
Since function scrub_handle_errored_block() won't get iterative calls,
this extra cost would completely be acceptable.
And since we're here, also set sblock->refs and use scrub_block_put() to
clean them up, as later we will add extra members in scrub_block, which
needs scrub_block_put() to clean them up.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Preserve the fs-verity status of a btrfs file across send/recv.
There is no facility for installing the Merkle tree contents directly on
the receiving filesystem, so we package up the parameters used to enable
verity found in the verity descriptor. This gives the receive side
enough information to properly enable verity again. Note that this means
that receive will have to re-compute the whole Merkle tree, similar to
how compression worked before encoded_write.
Since the file becomes read-only after verity is enabled, it is
important that verity is added to the send stream after any file writes.
Therefore, when we process a verity item, merely note that it happened,
then actually create the command in the send stream during
'finish_inode_if_needed'.
This also creates V3 of the send stream format, without any format
changes besides adding the new commands and attributes.
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Use `atomic_try_cmpxchg(ptr, &old, new)` instead of
`atomic_cmpxchg(ptr, old, new) == old` in free_extent_buffer. This
has two benefits:
- The x86 cmpxchg instruction returns success in the ZF flag, so this
change saves a compare after cmpxchg, as well as a related move
instruction in the front of cmpxchg.
- atomic_try_cmpxchg implicitly assigns the *ptr value to &old when
cmpxchg fails, enabling further code simplifications.
This patch has no functional change.
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There are several sanity checks which are no longer possible to trigger
inside btrfs_scrub_dev().
Since we have mount time check against super block nodesize/sectorsize,
and our fixed macro is hardcoded to handle even the worst combination.
Thus those sanity checks are no longer needed, can be easily removed.
But this patch still uses some ASSERT()s as a safe net just in case we
change some features in the future to trigger those impossible
combinations.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We used to use this in a few spots, but now we only use it directly
inside of block-group.c, so remove the helper and just open code where
we were using it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Before when this was modifying the bit field we had to protect it with
the bg->lock, however now we're using bit helpers so we can stop
using the bg->lock.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This is used mostly to determine if we need to look at the caching ctl
list and clean up any references to this block group. However we never
clear this flag, specifically because we need to know if we have to
remove a caching ctl we have for this block group still. This is in the
remove block group path which isn't a fast path, so the optimization
doesn't really matter, simplify this logic and remove the flag.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We're breaking out and re-searching for the next block group while
evicting any of the block group cache inodes. This is not needed, the
block groups aren't disappearing here, we can simply loop through the
block groups like normal and iput any inode that we find.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We use this during device replace for zoned devices, we were simply
taking the lock because it was in a bit field and we needed the lock to
be safe with other modifications in the bitfield. With the bit helpers
we no longer require that locking.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We use a bit field in the btrfs_block_group for different flags, however
this is awkward because we have to hold the block_group->lock for any
modification of any of these fields, and makes the code clunky for a few
of these flags. Convert these to a properly flags setup so we can
utilize the bit helpers.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We previously had the pattern of
btrfs_update_space_info(all, the, bg, fields, &space_info);
link_block_group(bg);
bg->space_info = space_info;
Now that we're passing the bg into btrfs_add_bg_to_space_info we can do
the linking in that function, transforming this to simply
btrfs_add_bg_to_space_info(fs_info, bg);
and put the link_block_group() and bg->space_info assignment directly in
btrfs_add_bg_to_space_info.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This function has grown a bunch of new arguments, and it just boils down
to passing in all the block group fields as arguments. Simplify this by
passing in the block group itself and updating the space_info fields
based on the block group fields directly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
For both unused bg deletion and async balance work we'll happily run if
the fs is closing. However I want to move these to their own worker
thread, and they can be long running jobs, so add a check to see if
we're closing and simply bail.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_insert_file_extent() is only ever used to insert holes, so rename
it and remove the redundant parameters.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We have own string matching helper that duplicates what sysfs_streq
does, with a slight difference that it skips initial whitespace. So far
this is used for the drive allocation policy. The initial whitespace
of written sysfs values should be rather discouraged and we should use a
standard helper.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
The following script shows that, although scrub can detect super block
errors, it never tries to fix it:
mkfs.btrfs -f -d raid1 -m raid1 $dev1 $dev2
xfs_io -c "pwrite 67108864 4k" $dev2
mount $dev1 $mnt
btrfs scrub start -B $dev2
btrfs scrub start -Br $dev2
umount $mnt
The first scrub reports the super error correctly:
scrub done for f3289218-abd3-41ac-a630-202f766c0859
Scrub started: Tue Aug 2 14:44:11 2022
Status: finished
Duration: 0:00:00
Total to scrub: 1.26GiB
Rate: 0.00B/s
Error summary: super=1
Corrected: 0
Uncorrectable: 0
Unverified: 0
But the second read-only scrub still reports the same super error:
Scrub started: Tue Aug 2 14:44:11 2022
Status: finished
Duration: 0:00:00
Total to scrub: 1.26GiB
Rate: 0.00B/s
Error summary: super=1
Corrected: 0
Uncorrectable: 0
Unverified: 0
[CAUSE]
The comments already shows that super block can be easily fixed by
committing a transaction:
/*
* If we find an error in a super block, we just report it.
* They will get written with the next transaction commit
* anyway
*/
But the truth is, such assumption is not always true, and since scrub
should try to repair every error it found (except for read-only scrub),
we should really actively commit a transaction to fix this.
[FIX]
Just commit a transaction if we found any super block errors, after
everything else is done.
We cannot do this just after scrub_supers(), as
btrfs_commit_transaction() will try to pause and wait for the running
scrub, thus we can not call it with scrub_lock hold.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[PROBLEM]
Unlike data/metadata corruption, if scrub detected some error in the
super block, the only error message is from the updated device status:
BTRFS info (device dm-1): scrub: started on devid 2
BTRFS error (device dm-1): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
BTRFS info (device dm-1): scrub: finished on devid 2 with status: 0
This is not helpful at all.
[CAUSE]
Unlike data/metadata error reporting, there is no visible report in
kernel dmesg to report supper block errors.
In fact, return value of scrub_checksum_super() is intentionally
skipped, thus scrub_handle_errored_block() will never be called for
super blocks.
[FIX]
Make super block errors to output an error message, now the full
dmesg would looks like this:
BTRFS info (device dm-1): scrub: started on devid 2
BTRFS warning (device dm-1): super block error on device /dev/mapper/test-scratch2, physical 67108864
BTRFS error (device dm-1): bdev /dev/mapper/test-scratch2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
BTRFS info (device dm-1): scrub: finished on devid 2 with status: 0
BTRFS info (device dm-1): scrub: started on devid 2
This fix involves:
- Move the super_errors reporting to scrub_handle_errored_block()
This allows the device status message to show after the super block
error message.
But now we no longer distinguish super block corruption and generation
mismatch, now all counted as corruption.
- Properly check the return value from scrub_checksum_super()
- Add extra super block error reporting for scrub_print_warning().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
With CONFIG_READ_ONLY_THP_FOR_FS, the Linux kernel supports using THPs for
read-only mmapped files, such as shared libraries. However, the kernel
makes no attempt to actually align those mappings on 2MB boundaries,
which makes it impossible to use those THPs most of the time. This issue
applies to general file mapping THP as well as existing setups using
CONFIG_READ_ONLY_THP_FOR_FS. This is easily fixed by using
thp_get_unmapped_area for the unmapped_area function in btrfs, which
is what ext2, ext4, fuse, and xfs all use.
Initially btrfs had been left out in commit 8c07fc452ac0 ("btrfs: fix
alignment of VMA for memory mapped files on THP") as btrfs does not support
DAX. However, commit 1854bc6e2420 ("mm/readahead: Align file mappings
for non-DAX") removed the DAX requirement. We should now be able to call
thp_get_unmapped_area() for btrfs.
The problem can be seen in /proc/PID/smaps where THPeligible is set to 0
on mappings to eligible shared object files as shown below.
Before this patch:
7fc6a7e18000-7fc6a80cc000 r-xp 00000000 00:1e 199856
/usr/lib64/libcrypto.so.1.1.1k
Size: 2768 kB
THPeligible: 0
VmFlags: rd ex mr mw me
With this patch the library is mapped at a 2MB aligned address:
fbdfe200000-7fbdfe4b4000 r-xp 00000000 00:1e 199856
/usr/lib64/libcrypto.so.1.1.1k
Size: 2768 kB
THPeligible: 1
VmFlags: rd ex mr mw me
This fixes the alignment of VMAs for any mmap of a file that has the
rd and ex permissions and size >= 2MB. The VMA alignment and
THPeligible field for anonymous memory is handled separately and
is thus not effected by this change.
CC: stable@vger.kernel.org # 5.18+
Signed-off-by: Alexander Zhu <alexlzhu@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This wait event is very similar to the pending ordered wait event in the
sense that it occurs in a different context than the condition signaling
for the event. The signaling occurs in btrfs_remove_ordered_extent()
while the wait event is implemented in btrfs_start_ordered_extent() in
fs/btrfs/ordered-data.c
However, in this case a thread must not acquire the lockdep map for the
ordered extents wait event when the ordered extent is related to a free
space inode. That is because lockdep creates dependencies between locks
acquired both in execution paths related to normal inodes and paths
related to free space inodes, thus leading to false positives.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Reinitialize the class of the lockdep map for struct inode's
mapping->invalidate_lock in load_free_space_cache() function in
fs/btrfs/free-space-cache.c. This will prevent lockdep from producing
false positives related to execution paths that make use of free space
inodes and paths that make use of normal inodes.
Specifically, with this change lockdep will create separate lock
dependencies that include the invalidate_lock, in the case that free
space inodes are used and in the case that normal inodes are used.
The lockdep class for this lock was first initialized in
inode_init_always() in fs/inode.c.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In contrast to the num_writers and num_extwriters wait events, the
condition for the pending ordered wait event is signaled in a different
context from the wait event itself. The condition signaling occurs in
btrfs_remove_ordered_extent() in fs/btrfs/ordered-data.c while the wait
event is implemented in btrfs_commit_transaction() in
fs/btrfs/transaction.c
Thus the thread signaling the condition has to acquire the lockdep map
as a reader at the start of btrfs_remove_ordered_extent() and release it
after it has signaled the condition. In this case some dependencies
might be left out due to the placement of the annotation, but it is
better than no annotation at all.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Add lockdep annotations for the transaction states that have wait
events;
1) TRANS_STATE_COMMIT_START
2) TRANS_STATE_UNBLOCKED
3) TRANS_STATE_SUPER_COMMITTED
4) TRANS_STATE_COMPLETED
The new macros introduced here to annotate the transaction states wait
events have the same effect as the generic lockdep annotation macros.
With the exception of the lockdep annotation for TRANS_STATE_COMMIT_START
the transaction thread has to acquire the lockdep maps for the
transaction states as reader after the lockdep map for num_writers is
released so that lockdep does not complain.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Similarly to the num_writers wait event in fs/btrfs/transaction.c add a
lockdep annotation for the num_extwriters wait event.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Annotate the num_writers wait event in fs/btrfs/transaction.c with
lockdep in order to catch deadlocks involving this wait event.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Introduce four macros that are used to annotate wait events in btrfs code
with lockdep;
1) the btrfs_lockdep_init_map
2) the btrfs_lockdep_acquire,
3) the btrfs_lockdep_release
4) the btrfs_might_wait_for_event macros.
The btrfs_lockdep_init_map macro is used to initialize a lockdep map.
The btrfs_lockdep_<acquire,release> macros are used by threads to take
the lockdep map as readers (shared lock) and release it, respectively.
The btrfs_might_wait_for_event macro is used by threads to take the
lockdep map as writers (exclusive lock) and release it.
In general, the lockdep annotation for wait events work as follows:
The condition for a wait event can be modified and signaled at the same
time by multiple threads. These threads hold the lockdep map as readers
when they enter a context in which blocking would prevent signaling the
condition. Frequently, this occurs when a thread violates a condition
(lockdep map acquire), before restoring it and signaling it at a later
point (lockdep map release).
The threads that block on the wait event take the lockdep map as writers
(exclusive lock). These threads have to block until all the threads that
hold the lockdep map as readers signal the condition for the wait event
and release the lockdep map.
The lockdep annotation is used to warn about potential deadlock scenarios
that involve the threads that modify and signal the wait event condition
and threads that block on the wait event. A simple example is illustrated
below:
Without lockdep:
TA TB
cond = false
lock(A)
wait_event(w, cond)
unlock(A)
lock(A)
cond = true
signal(w)
unlock(A)
With lockdep:
TA TB
rwsem_acquire_read(lockdep_map)
cond = false
lock(A)
rwsem_acquire(lockdep_map)
rwsem_release(lockdep_map)
wait_event(w, cond)
unlock(A)
lock(A)
cond = true
signal(w)
unlock(A)
rwsem_release(lockdep_map)
In the second case, with the lockdep annotation, lockdep would warn about
an ABBA deadlock, while the first case would just deadlock at some point.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Ioannis Angelakopoulos <iangelak@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There is an internal report on hitting the following ASSERT() in
recalculate_thresholds():
ASSERT(ctl->total_bitmaps <= max_bitmaps);
Above @max_bitmaps is calculated using the following variables:
- bytes_per_bg
8 * 4096 * 4096 (128M) for x86_64/x86.
- block_group->length
The length of the block group.
@max_bitmaps is the rounded up value of block_group->length / 128M.
Normally one free space cache should not have more bitmaps than above
value, but when it happens the ASSERT() can be triggered if
CONFIG_BTRFS_ASSERT is also enabled.
But the ASSERT() itself won't provide enough info to know which is going
wrong.
Is the bg too small thus it only allows one bitmap?
Or is there something else wrong?
So although I haven't found extra reports or crash dump to do further
investigation, add the extra info to make it more helpful to debug.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Regression and bug fixes:
- Performance regression fix from 5.18 on a Rasberry Pi
- Fix extent parsing bug which triggers a BUG_ON when a (corrupted)
extent tree has has a non-root node when zero entries.
- Fix a livelock where in the right (wrong) circumstances a large
number of nfsd threads can try to write to a nearly full file
system, and retry for hours(!)"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: limit the number of retries after discarding preallocations blocks
ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0
ext4: use buckets for cr 1 block scan instead of rbtree
ext4: use locality group preallocation for small closed files
ext4: make directory inode spreading reflect flexbg size
ext4: avoid unnecessary spreading of allocations among groups
ext4: make mballoc try target group first even with mb_optimize_scan
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull NVDIMM and DAX fixes from Dan Williams:
"A recently discovered one-line fix for devdax that further addresses a
v5.5 regression, and (a bit embarrassing) a small batch of fixes that
have been sitting in my fixes tree for weeks.
The older fixes have soaked in linux-next during that time and address
an fsdax infinite loop and some other minor fixups.
- Fix a infinite loop bug in fsdax
- Fix memory-type detection for devdax (EINJ regression)
- Small cleanups"
* tag 'dax-and-nvdimm-fixes-v6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
devdax: Fix soft-reservation memory description
fsdax: Fix infinite loop in dax_iomap_rw()
nvdimm/namespace: drop nested variable in create_namespace_pmem()
ndtest: Cleanup all of blk namespace specific code
pmem: fix a name collision
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C driver bugfixes for mlxbf and imx, a few documentation fixes after
the rework this cycle, and one hardening for the i2c-mux core"
* tag 'i2c-for-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mux: harden i2c_mux_alloc() against integer overflows
i2c: mlxbf: Fix frequency calculation
i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
i2c: mlxbf: incorrect base address passed during io write
Documentation: i2c: fix references to other documents
MAINTAINERS: remove Nehal Shah from AMD MP2 I2C DRIVER
i2c: imx: If pm_runtime_get_sync() returned 1 device access is possible
|
|
Pick up another "Soft Reservation" fix for v6.0-final on top of some
straggling nvdimm fixes that missed v5.19.
|
|
The "hmem" platform-devices that are created to represent the
platform-advertised "Soft Reserved" memory ranges end up inserting a
resource that causes the iomem_resource tree to look like this:
340000000-43fffffff : hmem.0
340000000-43fffffff : Soft Reserved
340000000-43fffffff : dax0.0
This is because insert_resource() reparents ranges when they completely
intersect an existing range.
This matters because code that uses region_intersects() to scan for a
given IORES_DESC will only check that top-level 'hmem.0' resource and
not the 'Soft Reserved' descendant.
So, to support EINJ (via einj_error_inject()) to inject errors into
memory hosted by a dax-device, be sure to describe the memory as
IORES_DESC_SOFT_RESERVED. This is a follow-on to:
commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP")
...that fixed EINJ support for "Soft Reserved" ranges in the first
instance.
Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration")
Reported-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com>
Tested-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Omar Avelar <omar.avelar@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Gross <markgross@kernel.org>
Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix build error for the combination of SYSTEM_TRUSTED_KEYRING=y and
X509_CERTIFICATE_PARSER=m
- Fix DEBUG_INFO_SPLIT to generate debug info for GCC 11+ and Clang 12+
- Revive debug info for assembly files
- Remove unused code
* tag 'kbuild-fixes-v6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Makefile.debug: re-enable debug info for .S files
Makefile.debug: set -g unconditional on CONFIG_DEBUG_INFO_SPLIT
certs: make system keyring depend on built-in x509 parser
Kconfig: remove unused function 'menu_get_root_menu'
scripts/clang-tools: remove unused module
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fix from Vasily Gorbik:
- Fix potential hangs in VFIO AP driver
* tag 's390-6.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/vfio-ap: bypass unnecessary processing of AP resources
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix an uninitialized variable usage in the operating performance
points code and add missing DT bindings for it.
Specifics:
- Fix uninitialized variable usage in dev_pm_opp_config_clks_simple()
(Christophe JAILLET)
- Add missing OPP DT properties (Rob Herring)"
* tag 'pm-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
dt-bindings: opp: Add missing (unevaluated|additional)Properties on child nodes
OPP: Fix an un-initialized variable usage
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are three tiny driver fixes for 6.0-rc7. They include:
- phy driver reset bugfix
- fpga memleak bugfix
- counter irq config bugfix
The first two have been in linux-next for a while, the last one has
only been added to my tree in the past few days, but was in linux-next
under a different commit id. I couldn't pull directly from the counter
tree due to some gpg key propagation issue, so I took the commit
directly from email instead"
* tag 'char-misc-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
counter: 104-quad-8: Fix skipped IRQ lines during events configuration
fpga: m10bmc-sec: Fix possible memory leak of flash_buf
phy: marvell: phy-mvebu-a3700-comphy: Remove broken reset support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small, and late, serial driver fixes for 6.0-rc7 to
resolve some reported problems.
Included in here are:
- tegra icount accounting fixes, including a framework function that
other drivers will be converted over to using in 6.1-rc1.
- fsl_lpuart reset bugfix
- 8250 omap 485 bugfix
- sifive serial clock bugfix
The last three patches have not shown up in linux-next due to them
being added to my tree only 2 days ago, but they are tiny and
self-contained and the developers say they resolve issues that they
have with 6.0-rc. The other three have been in linux-next for a while
with no reported issues"
* tag 'tty-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: sifive: enable clocks for UART when probed
serial: 8250: omap: Use serial8250_em485_supported
serial: fsl_lpuart: Reset prior to registration
serial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx accounting
serial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting
serial: Create uart_xmit_advance()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Add Waiman Long as a cpuset maintainer
- cgroup_get_from_id() could be fed a kernfs ID which doesn't point to
a cgroup directory but a knob file and then crash. Error out if the
lookup kernfs_node isn't a directory.
* tag 'cgroup-for-6.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: cgroup_get_from_id() must check the looked-up kn is a directory
cpuset: Add Waiman Long as a cpuset maintainer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"Just one patch to improve flush lockdep coverage"
* tag 'wq-for-6.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: don't skip lockdep work dependency in cancel_work_sync()
|
|
Pull io_uring fix from Jens Axboe:
"Just a single fix for an issue with un-reaped IOPOLL requests on ring
exit"
* tag 'io_uring-6.0-2022-09-23' of git://git.kernel.dk/linux:
io_uring: ensure that cached task references are always put on exit
|
|
Pull block fixes from Jens Axboe:
"Fix a regression that's been plaguing us by reverting the offending
commit, as attempts to both reproduce the issue and fix it in a saner
fashion have failed.
Fix for a potential oops condition in the s390 dasd block driver"
* tag 'block-6.0-2022-09-22' of git://git.kernel.dk/linux:
Revert "block: freeze the queue earlier in del_gendisk"
s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup
|
|
Alexey reported that the fraction of unknown filename instances in
kallsyms grew from ~0.3% to ~10% recently; Bill and Greg tracked it down
to assembler defined symbols, which regressed as a result of:
commit b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1")
In that commit, I allude to restoring debug info for assembler defined
symbols in a follow up patch, but it seems I forgot to do so in
commit a66049e2cf0e ("Kbuild: make DWARF version a choice")
Link: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=31bf18645d98b4d3d7357353be840e320649a67d
Fixes: b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1")
Reported-by: Alexey Alexandrov <aalexand@google.com>
Reported-by: Bill Wendling <morbo@google.com>
Reported-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Dmitrii, Fangrui, and Mashahiro note:
Before GCC 11 and Clang 12 -gsplit-dwarf implicitly uses -g2.
Fix CONFIG_DEBUG_INFO_SPLIT for gcc-11+ & clang-12+ which now need -g
specified in order for -gsplit-dwarf to work at all.
-gsplit-dwarf has been mutually exclusive with -g since support for
CONFIG_DEBUG_INFO_SPLIT was introduced in
commit 866ced950bcd ("kbuild: Support split debug info v4")
I don't think it ever needed to be.
Link: https://lore.kernel.org/lkml/20220815013317.26121-1-dmitrii.bundin.a@gmail.com/
Link: https://lore.kernel.org/lkml/CAK7LNARPAmsJD5XKAw7m_X2g7Fi-CAAsWDQiP7+ANBjkg7R7ng@mail.gmail.com/
Link: https://reviews.llvm.org/D80391
Cc: Andi Kleen <ak@linux.intel.com>
Reported-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com>
Reported-by: Fangrui Song <maskray@google.com>
Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
io_uring caches task references to avoid doing atomics for each of them
per request. If a request is put from the same task that allocated it,
then we can maintain a per-ctx cache of them. This obviously relies
on io_uring always pruning caches in a reliable way, and there's
currently a case off io_uring fd release where we can miss that.
One example is a ring setup with IOPOLL, which relies on the task
polling for completions, which will free them. However, if such a task
submits a request and then exits or closes the ring without reaping
the completion, then ring release will reap and put. If release happens
from that very same task, the completed request task refs will get
put back into the cache pool. This is problematic, as we're now beyond
the point of pruning caches.
Manually drop these caches after doing an IOPOLL reap. This releases
references from the current task, which is enough. If another task
happens to be doing the release, then the caching will not be
triggered and there's no issue.
Cc: stable@vger.kernel.org
Fixes: e98e49b2bbf7 ("io_uring: extend task put optimisations")
Reported-by: Homin Rhee <hominlab@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"These are all very simple and self-contained, although the CFI
jump-table fix touches the generic linker script as that's where the
problematic macro lives.
- Fix false positive "sleeping while atomic" warning resulting from
the kPTI rework taking a mutex too early.
- Fix possible overflow in AMU frequency calculation
- Fix incorrect shift in CMN PMU driver which causes problems with
newer versions of the IP
- Reduce alignment of the CFI jump table to avoid huge kernel images
and link errors with !4KiB page size configurations"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
vmlinux.lds.h: CFI: Reduce alignment of jump-table to function alignment
perf/arm-cmn: Add more bits to child node address offset field
arm64: topology: fix possible overflow in amu_fie_setup()
arm64: mm: don't acquire mutex when rewriting swapper
|
|
Commit e90886291c7c ("certs: make system keyring depend on x509 parser")
is not the right fix because x509_load_certificate_list() can be modular.
The combination of CONFIG_SYSTEM_TRUSTED_KEYRING=y and
CONFIG_X509_CERTIFICATE_PARSER=m still results in the following error:
LD .tmp_vmlinux.kallsyms1
ld: certs/system_keyring.o: in function `load_system_certificate_list':
system_keyring.c:(.init.text+0x8c): undefined reference to `x509_load_certificate_list'
make: *** [Makefile:1169: vmlinux] Error 1
Fixes: e90886291c7c ("certs: make system keyring depend on x509 parser")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Adam Borowski <kilobyte@angband.pl>
|
|
There is nowhere calling `menu_get_root_menu` function,
so remove it.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Remove unused imported 'os' module.
Signed-off-by: yangxingwu <xingwu.yang@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
cgroup has to be one kernfs dir, otherwise kernel panic is caused,
especially cgroup id is provide from userspace.
Reported-by: Marco Patalano <mpatalan@redhat.com>
Fixes: 6b658c4863c1 ("scsi: cgroup: Add cgroup_get_from_id()")
Cc: Muneendra <muneendra.kumar@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two tiny driver core fixes for 6.0-rc7 that resolve some
oft-reported problems.
The first is a revert of the "fw_devlink.strict=1" default option that
we keep trying to enable, but we keep finding platforms that this just
breaks everything on. So again, we need it reverted and hopefully it
can be worked on in future releases.
The second is a sysfs file-size bugfix that resolves an issue that
many people are starting to hit as the fix it is fixing also was
backported to stable kernels. The util-linux developers are starting
to get bugreports about sysfs files that contain no data because of
this problem, and this fix which has been in linux-next in the
bitfield tree for a long time, resolves it. I'm submitting it here as
it needs to be merged for 6.0-final, not for 6.1-rc1.
Both of these have been in linux-next with no reported issues, only
reports were that these fixed problems"
* tag 'driver-core-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES
Revert "driver core: Set fw_devlink.strict=1 by default"
|