Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull mmap_prepare updates from Christian Brauner:
"Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b83 ("mm:
introduce new .mmap_prepare() file callback").
This is preferred to the existing f_op->mmap() hook as it does require
a VMA to be established yet, thus allowing the mmap logic to invoke
this hook far, far earlier, prior to inserting a VMA into the virtual
address space, or performing any other heavy handed operations.
This allows for much simpler unwinding on error, and for there to be a
single attempt at merging a VMA rather than having to possibly
reattempt a merge based on potentially altered VMA state.
Far more importantly, it prevents inappropriate manipulation of
incompletely initialised VMA state, which is something that has been
the cause of bugs and complexity in the past.
The intent is to gradually deprecate f_op->mmap, and in that vein this
series coverts the majority of file systems to using f_op->mmap_prepare.
Prerequisite steps are taken - firstly ensuring all checks for mmap
capabilities use the file_has_valid_mmap_hooks() helper rather than
directly checking for f_op->mmap (which is now not a valid check) and
secondly updating daxdev_mapping_supported() to not require a VMA
parameter to allow ext4 and xfs to be converted.
Commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for
nested file systems") handles the nasty edge-case of nested file
systems like overlayfs, which introduces a compatibility shim to allow
f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.
This allows for nested filesystems to continue to function correctly
with all file systems regardless of which callback is used. Once we
finally convert all file systems, this shim can be removed.
As a result, ecryptfs, fuse, and overlayfs remain unaltered so they
can nest all other file systems.
We additionally do not update resctl - as this requires an update to
remap_pfn_range() (or an alternative to it) which we defer to a later
series, equally we do not update cramfs which needs a mixed mapping
insertion with the same issue, nor do we update procfs, hugetlbfs,
syfs or kernfs all of which require VMAs for internal state and hooks.
We shall return to all of these later"
* tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
doc: update porting, vfs documentation to describe mmap_prepare()
fs: replace mmap hook with .mmap_prepare for simple mappings
fs: convert most other generic_file_*mmap() users to .mmap_prepare()
fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
mm/filemap: introduce generic_file_*_mmap_prepare() helpers
fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
fs/dax: make it possible to check dev dax support without a VMA
fs: consistently use can_mmap_file() helper
mm/nommu: use file_has_valid_mmap_hooks() helper
mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc VFS updates from Christian Brauner:
"This contains the usual selections of misc updates for this cycle.
Features:
- Add ext4 IOCB_DONTCACHE support
This refactors the address_space_operations write_begin() and
write_end() callbacks to take const struct kiocb * as their first
argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate
to the filesystem's buffered I/O path.
Ext4 is updated to implement handling of the IOCB_DONTCACHE flag
and advertises support via the FOP_DONTCACHE file operation flag.
Additionally, the i915 driver's shmem write paths are updated to
bypass the legacy write_begin/write_end interface in favor of
directly calling write_iter() with a constructed synchronous kiocb.
Another i915 change replaces a manual write loop with
kernel_write() during GEM shmem object creation.
Cleanups:
- don't duplicate vfs_open() in kernel_file_open()
- proc_fd_getattr(): don't bother with S_ISDIR() check
- fs/ecryptfs: replace snprintf with sysfs_emit in show function
- vfs: Remove unnecessary list_for_each_entry_safe() from
evict_inodes()
- filelock: add new locks_wake_up_waiter() helper
- fs: Remove three arguments from block_write_end()
- VFS: change old_dir and new_dir in struct renamedata to dentrys
- netfs: Remove unused declaration netfs_queue_write_request()
Fixes:
- eventpoll: Fix semi-unbounded recursion
- eventpoll: fix sphinx documentation build warning
- fs/read_write: Fix spelling typo
- fs: annotate data race between poll_schedule_timeout() and
pollwake()
- fs/pipe: set FMODE_NOWAIT in create_pipe_files()
- docs/vfs: update references to i_mutex to i_rwsem
- fs/buffer: remove comment about hard sectorsize
- fs/buffer: remove the min and max limit checks in __getblk_slow()
- fs/libfs: don't assume blocksize <= PAGE_SIZE in
generic_check_addressable
- fs_context: fix parameter name in infofc() macro
- fs: Prevent file descriptor table allocations exceeding INT_MAX"
* tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits)
netfs: Remove unused declaration netfs_queue_write_request()
eventpoll: fix sphinx documentation build warning
ext4: support uncached buffered I/O
mm/pagemap: add write_begin_get_folio() helper function
fs: change write_begin/write_end interface to take struct kiocb *
drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter
drm/i915: Use kernel_write() in shmem object create
eventpoll: Fix semi-unbounded recursion
vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()
fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable
fs/buffer: remove the min and max limit checks in __getblk_slow()
fs: Prevent file descriptor table allocations exceeding INT_MAX
fs: Remove three arguments from block_write_end()
fs/ecryptfs: replace snprintf with sysfs_emit in show function
fs: annotate suspected data race between poll_schedule_timeout() and pollwake()
docs/vfs: update references to i_mutex to i_rwsem
fs/buffer: remove comment about hard sectorsize
fs_context: fix parameter name in infofc() macro
VFS: change old_dir and new_dir in struct renamedata to dentrys
proc_fd_getattr(): don't bother with S_ISDIR() check
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dentry d_flags updates from Al Viro:
"The current exclusion rules for dentry->d_flags stores are rather
unpleasant. The basic rules are simple:
- stores to dentry->d_flags are OK under dentry->d_lock
- stores to dentry->d_flags are OK in the dentry constructor, before
becomes potentially visible to other threads
Unfortunately, there's a couple of exceptions to that, and that's
where the headache comes from.
The main PITA comes from d_set_d_op(); that primitive sets ->d_op of
dentry and adjusts the flags that correspond to presence of individual
methods. It's very easy to misuse; existing uses _are_ safe, but proof
of correctness is brittle.
Use in __d_alloc() is safe (we are within a constructor), but we might
as well precalculate the initial value of 'd_flags' when we set the
default ->d_op for given superblock and set 'd_flags' directly instead
of messing with that helper.
The reasons why other uses are safe are bloody convoluted; I'm not
going to reproduce it here. See [1] for gory details, if you care. The
critical part is using d_set_d_op() only just prior to
d_splice_alias(), which makes a combination of d_splice_alias() with
setting ->d_op, etc a natural replacement primitive.
Better yet, if we go that way, it's easy to take setting ->d_op and
modifying 'd_flags' under ->d_lock, which eliminates the headache as
far as 'd_flags' exclusion rules are concerned. Other exceptions are
minor and easy to deal with.
What this series does:
- d_set_d_op() is no longer available; instead a new primitive
(d_splice_alias_ops()) is provided, equivalent to combination of
d_set_d_op() and d_splice_alias().
- new field of struct super_block - 's_d_flags'. This sets the
default value of 'd_flags' to be used when allocating dentries on
this filesystem.
- new primitive for setting 's_d_op': set_default_d_op(). This
replaces stores to 's_d_op' at mount time.
All in-tree filesystems converted; out-of-tree ones will get caught
by the compiler ('s_d_op' is renamed, so stores to it will be
caught). 's_d_flags' is set by the same primitive to match the
's_d_op'.
- a lot of filesystems had sb->s_d_op->d_delete equal to
always_delete_dentry; that is equivalent to setting
DCACHE_DONTCACHE in 'd_flags', so such filesystems can bloody well
set that bit in 's_d_flags' and drop 'd_delete()' from
dentry_operations.
In quite a few cases that results in empty dentry_operations, which
means that we can get rid of those.
- kill simple_dentry_operations - not needed anymore
- massage d_alloc_parallel() to get rid of the other exception wrt
'd_flags' stores - we can set DCACHE_PAR_LOOKUP as soon as we
allocate the new dentry; no need to delay that until we commit to
using the sucker.
As the result, 'd_flags' stores are all either under ->d_lock or done
before the dentry becomes visible in any shared data structures"
Link: https://lore.kernel.org/all/20250224010624.GT1977892@ZenIV/ [1]
* tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
configfs: use DCACHE_DONTCACHE
debugfs: use DCACHE_DONTCACHE
efivarfs: use DCACHE_DONTCACHE instead of always_delete_dentry()
9p: don't bother with always_delete_dentry
ramfs, hugetlbfs, mqueue: set DCACHE_DONTCACHE
kill simple_dentry_operations
devpts, sunrpc, hostfs: don't bother with ->d_op
shmem: no dentry retention past the refcount reaching zero
d_alloc_parallel(): set DCACHE_PAR_LOOKUP earlier
make d_set_d_op() static
simple_lookup(): just set DCACHE_DONTCACHE
tracefs: Add d_delete to remove negative dentries
set_default_d_op(): calculate the matching value for ->d_flags
correct the set of flags forbidden at d_set_d_op() time
split d_flags calculation out of d_set_d_op()
new helper: set_default_d_op()
fuse: no need for special dentry_operations for root dentry
switch procfs from d_set_d_op() to d_splice_alias_ops()
new helper: d_splice_alias_ops()
procfs: kill ->proc_dops
...
|
|
Change the address_space_operations callbacks write_begin() and
write_end() to take struct kiocb * as the first argument instead of
struct file *.
Update all affected function prototypes, implementations, call sites,
and related documentation across VFS, filesystems, and block layer.
Part of a series refactoring address_space_operations write_begin and
write_end callbacks to use struct kiocb for passing write context and
flags.
Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
Link: https://lore.kernel.org/20250716093559.217344-4-chentaotao@didiglobal.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This reverts commit 69505fe98f198ee813898cbcaf6770949636430b.
Initially, conditional lock acquisition was removed to fix an xfstest bug
that was observed during internal testing. The deadlock reported by syzbot
is resolved by reintroducing conditional acquisition. The xfstest bug no
longer occurs on kernel version 6.16-rc1 during internal testing. I
assume that changes in other modules may have contributed to this.
Fixes: 69505fe98f19 ("fs/ntfs3: Replace inode_trylock with inode_lock")
Reported-by: syzbot+a91fcdbd2698f99db8f4@syzkaller.appspotmail.com
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Use ntfs_inode field 'ni_bad' to mark inode as bad (if something went wrong)
and to avoid any operations
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The reproducer uses a file0 on a ntfs3 file system with a corrupted i_link.
When renaming, the file0's inode is marked as a bad inode because the file
name cannot be deleted.
The underlying bug is that make_bad_inode() is called on a live inode.
In some cases it's "icache lookup finds a normal inode, d_splice_alias()
is called to attach it to dentry, while another thread decides to call
make_bad_inode() on it - that would evict it from icache, but we'd already
found it there earlier".
In some it's outright "we have an inode attached to dentry - that's how we
got it in the first place; let's call make_bad_inode() on it just for shits
and giggles".
Fixes: 78ab59fee07f ("fs/ntfs3: Rework file operations")
Reported-by: syzbot+1aa90f0eb1fc3e77d969@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1aa90f0eb1fc3e77d969
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The length of the file name should be smaller than the directory entry size.
Reported-by: syzbot+598057afa0f49e62bd23@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=598057afa0f49e62bd23
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
After applying this patch, could correctly create symlink:
ln -s "relative/path/to/file" symlink
Signed-off-by: Rong Zhang <ulin0208@gmail.com>
[almaz.alexandrovich@paragon-software.com: added cpu_to_le32 macro to
rs->Flags assignment]
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The symlinks created in windows will be broken in linux by ntfs3,
the patch fixes it.
Signed-off-by: Rong Zhang <ulin0208@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Update nearly all generic_file_mmap() and generic_file_readonly_mmap()
callers to use generic_file_mmap_prepare() and
generic_file_readonly_mmap_prepare() respectively.
We update blkdev, 9p, afs, erofs, ext2, nfs, ntfs3, smb, ubifs and vboxsf
file systems this way.
Remaining users we cannot yet update are ecryptfs, fuse and cramfs. The
former two are nested file systems that must support any underlying file
ssytem, and cramfs inserts a mixed mapping which currently requires a VMA.
Once all file systems have been converted to mmap_prepare(), we can then
update nested file systems.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://lore.kernel.org/08db85970d89b17a995d2cffae96fb4cc462377f.1750099179.git.lorenzo.stoakes@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
... to be used instead of manually assigning to ->s_d_op.
All in-tree filesystem converted (and field itself is renamed,
so any out-of-tree ones in need of conversion will be caught
by compiler).
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "zram: support algorithm-specific parameters" from Sergey Senozhatsky
adds infrastructure for passing algorithm-specific parameters into
zram. A single parameter `winbits' is implemented at this time.
- "memcg: nmi-safe kmem charging" from Shakeel Butt makes memcg
charging nmi-safe, which is required by BFP, which can operate in NMI
context.
- "Some random fixes and cleanup to shmem" from Kemeng Shi implements
small fixes and cleanups in the shmem code.
- "Skip mm selftests instead when kernel features are not present" from
Zi Yan fixes some issues in the MM selftest code.
- "mm/damon: build-enable essential DAMON components by default" from
SeongJae Park reworks DAMON Kconfig to make it easier to enable
CONFIG_DAMON.
- "sched/numa: add statistics of numa balance task migration" from Libo
Chen adds more info into sysfs and procfs files to improve visibility
into the NUMA balancer's task migration activity.
- "selftests/mm: cow and gup_longterm cleanups" from Mark Brown
provides various updates to some of the MM selftests to make them
play better with the overall containing framework.
* tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (43 commits)
mm/khugepaged: clean up refcount check using folio_expected_ref_count()
selftests/mm: fix test result reporting in gup_longterm
selftests/mm: report unique test names for each cow test
selftests/mm: add helper for logging test start and results
selftests/mm: use standard ksft_finished() in cow and gup_longterm
selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
sched/numa: add statistics of numa balance task
sched/numa: fix task swap by skipping kernel threads
tools/testing: check correct variable in open_procmap()
tools/testing/vma: add missing function stub
mm/gup: update comment explaining why gup_fast() disables IRQs
selftests/mm: two fixes for the pfnmap test
mm/khugepaged: fix race with folio split/free using temporary reference
mm: add CONFIG_PAGE_BLOCK_ORDER to select page block order
mmu_notifiers: remove leftover stub macros
selftests/mm: deduplicate test names in madv_populate
kcov: rust: add flags for KCOV with Rust
mm: rust: make CONFIG_MMU ifdefs more narrow
mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
mm/damon/Kconfig: enable CONFIG_DAMON by default
...
|
|
Remove the local 'page' variable and do everything in terms of folios.
Removes the last user of copy_page_from_iter_atomic() and a hidden call to
compound_head() in ClearPageDirty().
Link: https://lkml.kernel.org/r/20250514170607.3000994-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove all the code related to changing compression on the fly because
it's not safe and not maintainable.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Make the logic of handling the InitializeFileRecordSegment operation
similar to that in windows.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The ntfs3 can use the page cache directly, so its address_space_operations
need direct_IO. Exit ntfs_direct_IO() if it is a compressed file.
Fixes: b432163ebd15 ("fs/ntfs3: Update inode->i_mapping->a_ops on compression state")
Reported-by: syzbot+e36cc3297bd3afd25e19@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e36cc3297bd3afd25e19
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The hdr_first_de() function returns a pointer to a struct NTFS_DE. This
pointer may be NULL. To handle the NULL error effectively, it is important
to implement an error handler. This will help manage potential errors
consistently.
Additionally, error handling for the return value already exists at other
points where this function is called.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Static analysis shows that pointer "mi" cannot be NULL, since it is
pre-initialized above. A potential failure when mi equals NULL is
processed.
Remove the extra NULL check. It is meaningless and harms the readability
of the code, since before that the pointer is unconditionally
dereferenced.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
- Fix integer overflows on 32-bit systems and in hdr_first_de()
- Fix 'proc_info_root' leak on NTFS initialization failure
- Remove unused functions ni_load_attr, ntfs_sb_read, ntfs_flush_inodes
- update inode->i_mapping->a_ops on compression state
- ensure atomicity of write operations
- refactor ntfs_{create/remove}_{procdir,proc_root}()
* tag 'ntfs3_for_6.15' of https://github.com/Paragon-Software-Group/linux-ntfs3:
fs/ntfs3: Remove unused ntfs_flush_inodes
fs/ntfs3: Remove unused ntfs_sb_read
fs/ntfs3: Remove unused ni_load_attr
fs/ntfs3: Prevent integer overflow in hdr_first_de()
fs/ntfs3: Fix a couple integer overflows on 32bit systems
fs/ntfs3: Update inode->i_mapping->a_ops on compression state
fs/ntfs3: Fix WARNING in ntfs_extend_initialized_size
fs/ntfs3: Fix 'proc_info_root' leak when init ntfs failed
fs/ntfs3: Factor out ntfs_{create/remove}_proc_root()
fs/ntfs3: Factor out ntfs_{create/remove}_procdir()
fs/ntfs3: Keep write operations atomic
|
|
ntfs_flush_inodes() was added in 2021 by
commit 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
but has remained unused.
Remove it, and it's helper function.
Note: There is a commented out call to ntfs_flush_inodes in
ntfs_truncate() - I've left that in place.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
ntfs_sb_read() was added in 2021 by
commit 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
but hasn't been used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
ni_load_attr() was added in 2021 by
commit 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
but hasn't been used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The "de_off" and "used" variables come from the disk so they both need to
check. The problem is that on 32bit systems if they're both greater than
UINT_MAX - 16 then the check does work as intended because of an integer
overflow.
Fixes: 60ce8dfde035 ("fs/ntfs3: Fix wrong if in hdr_first_de")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
On 32bit systems the "off + sizeof(struct NTFS_DE)" addition can
have an integer wrapping issue. Fix it by using size_add().
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Some filesystems, such as NFS, cifs, ceph, and fuse, do not have
complete control of sequencing on the actual filesystem (e.g. on a
different server) and may find that the inode created for a mkdir
request already exists in the icache and dcache by the time the mkdir
request returns. For example, if the filesystem is mounted twice the
directory could be visible on the other mount before it is on the
original mount, and a pair of name_to_handle_at(), open_by_handle_at()
calls could instantiate the directory inode with an IS_ROOT() dentry
before the first mkdir returns.
This means that the dentry passed to ->mkdir() may not be the one that
is associated with the inode after the ->mkdir() completes. Some
callers need to interact with the inode after the ->mkdir completes and
they currently need to perform a lookup in the (rare) case that the
dentry is no longer hashed.
This lookup-after-mkdir requires that the directory remains locked to
avoid races. Planned future patches to lock the dentry rather than the
directory will mean that this lookup cannot be performed atomically with
the mkdir.
To remove this barrier, this patch changes ->mkdir to return the
resulting dentry if it is different from the one passed in.
Possible returns are:
NULL - the directory was created and no other dentry was used
ERR_PTR() - an error occurred
non-NULL - this other dentry was spliced in
This patch only changes file-systems to return "ERR_PTR(err)" instead of
"err" or equivalent transformations. Subsequent patches will make
further changes to some file-systems to return a correct dentry.
Not all filesystems reliably result in a positive hashed dentry:
- NFS, cifs, hostfs will sometimes need to perform a lookup of
the name to get inode information. Races could result in this
returning something different. Note that this lookup is
non-atomic which is what we are trying to avoid. Placing the
lookup in filesystem code means it only happens when the filesystem
has no other option.
- kernfs and tracefs leave the dentry negative and the ->revalidate
operation ensures that lookup will be called to correctly populate
the dentry. This could be fixed but I don't think it is important
to any of the users of vfs_mkdir() which look at the dentry.
The recommendation to use
d_drop();d_splice_alias()
is ugly but fits with current practice. A planned future patch will
change this.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Update inode->i_mapping->a_ops when the compression state changes to
ensure correct address space operations.
Clear ATTR_FLAG_SPARSED/FILE_ATTRIBUTE_SPARSE_FILE when enabling
compression to prevent flag conflicts.
v2:
Additionally, ensure that all dirty pages are flushed and concurrent access
to the page cache is blocked.
Fixes: 6b39bfaeec44 ("fs/ntfs3: Add support for the compression attribute")
Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Syzbot reported a WARNING in ntfs_extend_initialized_size.
The data type of in->i_valid and to is u64 in ntfs_file_mmap().
If their values are greater than LLONG_MAX, overflow will occur because
the data types of the parameters valid and new_valid corresponding to
the function ntfs_extend_initialized_size() are loff_t.
Before calling ntfs_extend_initialized_size() in the ntfs_file_mmap(),
the "ni->i_valid < to" has been determined, so the same WARN_ON determination
is not required in ntfs_extend_initialized_size().
Just execute the ntfs_extend_initialized_size() in ntfs_extend() to make
a WARN_ON check.
Reported-and-tested-by: syzbot+e37dd1dfc814b10caa55@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e37dd1dfc814b10caa55
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
There's a issue as follows:
proc_dir_entry 'fs/ntfs3' already registered
WARNING: CPU: 3 PID: 9788 at fs/proc/generic.c:375 proc_register+0x418/0x590
Modules linked in: ntfs3(E+)
Call Trace:
<TASK>
_proc_mkdir+0x165/0x200
init_ntfs_fs+0x36/0xf90 [ntfs3]
do_one_initcall+0x115/0x6c0
do_init_module+0x253/0x760
load_module+0x55f2/0x6c80
init_module_from_file+0xd2/0x130
__x64_sys_finit_module+0xbf/0x130
do_syscall_64+0x72/0x1c0
Above issue happens as missing destroy 'proc_info_root' when error
happens after create 'proc_info_root' in init_ntfs_fs().
So destroy 'proc_info_root' in error path in init_ntfs_fs().
Fixes: 7832e123490a ("fs/ntfs3: Add support /proc/fs/ntfs3/<dev>/volinfo and /proc/fs/ntfs3/<dev>/label")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Introduce ntfs_create_proc_root()/ntfs_remove_proc_root() for
create/remove "/proc/fs/ntfs3".
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Introduce ntfs_create_procdir() and ntfs_remove_procdir() to
create/remove "/proc/fs/ntfs3/.."
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
syzbot reported a NULL pointer dereference in __generic_file_write_iter. [1]
Before the write operation is completed, the user executes ioctl[2] to clear
the compress flag of the file, which causes the is_compressed() judgment to
return 0, further causing the program to enter the wrong process and call the
wrong ops ntfs_aops_cmpr, which triggers the null pointer dereference of
write_begin.
Use inode lock to synchronize ioctl and write to avoid this case.
[1]
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x0000000086000006
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
user pgtable: 4k pages, 48-bit VAs, pgdp=000000011896d000
[0000000000000000] pgd=0800000118b44403, p4d=0800000118b44403, pud=0800000117517403, pmd=0000000000000000
Internal error: Oops: 0000000086000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 UID: 0 PID: 6427 Comm: syz-executor347 Not tainted 6.13.0-rc3-syzkaller-g573067a5a685 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : 0x0
lr : generic_perform_write+0x29c/0x868 mm/filemap.c:4055
sp : ffff80009d4978a0
x29: ffff80009d4979c0 x28: dfff800000000000 x27: ffff80009d497bc8
x26: 0000000000000000 x25: ffff80009d497960 x24: ffff80008ba71c68
x23: 0000000000000000 x22: ffff0000c655dac0 x21: 0000000000001000
x20: 000000000000000c x19: 1ffff00013a92f2c x18: ffff0000e183aa1c
x17: 0004060000000014 x16: ffff800083275834 x15: 0000000000000001
x14: 0000000000000000 x13: 0000000000000001 x12: ffff0000c655dac0
x11: 0000000000ff0100 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff80009d497980 x4 : ffff80009d497960 x3 : 0000000000001000
x2 : 0000000000000000 x1 : ffff0000e183a928 x0 : ffff0000d60b0fc0
Call trace:
0x0 (P)
__generic_file_write_iter+0xfc/0x204 mm/filemap.c:4156
ntfs_file_write_iter+0x54c/0x630 fs/ntfs3/file.c:1267
new_sync_write fs/read_write.c:586 [inline]
vfs_write+0x920/0xcf4 fs/read_write.c:679
ksys_write+0x15c/0x26c fs/read_write.c:731
__do_sys_write fs/read_write.c:742 [inline]
__se_sys_write fs/read_write.c:739 [inline]
__arm64_sys_write+0x7c/0x90 fs/read_write.c:739
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
[2]
ioctl$FS_IOC_SETFLAGS(r0, 0x40086602, &(0x7f00000000c0)=0x20)
Reported-by: syzbot+5d0bdc98770e6c55a0fd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5d0bdc98770e6c55a0fd
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Also reworked error handling in a couple of places.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Extended the `mi_enum_attr()` function interface with an additional
parameter, `struct ntfs_inode *ni`, to allow marking the inode
as bad as soon as an error is detected.
Reported-by: syzbot+73d8fc29ec7cba8286fa@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Convert the first page passed to ni_write_frame() to a folio and use
folio_pos() on that instead of open-coding the access to folio->index,
cast & shift.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Changes made to improve readability and debuggability.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
As part of the process of switching from page to folio.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Check arguments again after lock.
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Add offset check before access to attr->non_res field as mentioned in [1].
[1] https://lore.kernel.org/ntfs3/20241010110005.42792-1-llfamsec@gmail.com/
Suggested-by: lei lu <llfamsec@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
If using the proposed function folio_zero_range(), should one switch
from 'start + end' to 'start + length,' or use folio_zero_segment()
Fixes: 1da86618bdce ("fs: Convert aops->write_begin to take a folio")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Reported-by: syzbot+7f3761b790fa41d0f3d5@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Use local runs_tree instead of cached. This way excludes rw_semaphore lock.
Reported-by: syzbot+1c25748a40fe79b8a119@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
"New:
- implement fallocate for compressed files
- add support for the compression attribute
- optimize large writes to sparse files
Fixes:
- fix several potential deadlock scenarios
- fix various internal bugs detected by syzbot
- add checks before accessing NTFS structures during parsing
- correct the format of output messages
Refactoring:
- replace fsparam_flag_no with fsparam_flag in options parser
- remove unused functions and macros"
* tag 'ntfs3_for_6.12' of https://github.com/Paragon-Software-Group/linux-ntfs3: (25 commits)
fs/ntfs3: Format output messages like others fs in kernel
fs/ntfs3: Additional check in ntfs_file_release
fs/ntfs3: Fix general protection fault in run_is_mapped_full
fs/ntfs3: Sequential field availability check in mi_enum_attr()
fs/ntfs3: Additional check in ni_clear()
fs/ntfs3: Fix possible deadlock in mi_read
ntfs3: Change to non-blocking allocation in ntfs_d_hash
fs/ntfs3: Remove unused al_delete_le
fs/ntfs3: Rename ntfs3_setattr into ntfs_setattr
fs/ntfs3: Replace fsparam_flag_no -> fsparam_flag
fs/ntfs3: Add support for the compression attribute
fs/ntfs3: Implement fallocate for compressed files
fs/ntfs3: Make checks in run_unpack more clear
fs/ntfs3: Add rough attr alloc_size check
fs/ntfs3: Stale inode instead of bad
fs/ntfs3: Refactor enum_rstbl to suppress static checker
fs/ntfs3: Fix sparse warning in ni_fiemap
fs/ntfs3: Fix warning possible deadlock in ntfs_set_state
fs/ntfs3: Fix sparse warning for bigendian
fs/ntfs3: Separete common code for file_read/write iter/splice
...
|
|
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
|
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Reported-by: syzbot+8c652f14a0fde76ff11d@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Fixed deleating of a non-resident attribute in ntfs_create_inode()
rollback.
Reported-by: syzbot+9af29acd8f27fbce94bc@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
The code is slightly reformatted to consistently check field availability
without duplication.
Fixes: 556bdf27c2dd ("ntfs3: Add bounds checking to mi_enum_attr()")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Checking of NTFS_FLAGS_LOG_REPLAYING added to prevent access to
uninitialized bitmap during replay process.
Reported-by: syzbot+3bfd2cc059ab93efcdb4@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|
|
Mutex lock with another subclass used in ni_lock_dir().
Reported-by: syzbot+bc7ca0ae4591cb2550f9@syzkaller.appspotmail.com
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
|