summaryrefslogtreecommitdiff
path: root/fs/smb/client/file.c
AgeCommit message (Collapse)Author
2024-05-01cifs: Use more fields from netfs_io_subrequestDavid Howells
Use more fields from netfs_io_subrequest instead of those incorporated into cifs_io_subrequest from cifs_readdata and cifs_writedata. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequestDavid Howells
Replace the cifs_writedata struct with the same wrapper around netfs_io_subrequest that was used to replace cifs_readdata. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequestDavid Howells
Netfslib has a facility whereby the allocation for netfs_io_subrequest can be increased to so that filesystem-specific data can be tagged on the end. Prepare to use this by making a struct, cifs_io_subrequest, that wraps netfs_io_subrequest, and absorb struct cifs_readdata into it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-05-01cifs: Use alternative invalidation to using launder_folioDavid Howells
Use writepages-based flushing invalidation instead of invalidate_inode_pages2() and ->launder_folio(). This will allow ->launder_folio() to be removed eventually. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org
2024-04-29mm: Remove the PG_fscache alias for PG_private_2David Howells
Remove the PG_fscache alias for PG_private_2 and use the latter directly. Use of this flag for marking pages undergoing writing to the cache should be considered deprecated and the folios should be marked dirty instead and the write done in ->writepages(). Note that PG_private_2 itself should be considered deprecated and up for future removal by the MM folks too. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Ilya Dryomov <idryomov@gmail.com> cc: Xiubo Li <xiubli@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: Bharath SM <bharathsm@microsoft.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: netfs@lists.linux.dev cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-04-03smb3: retrying on failed server closeRitvik Budhiraja
In the current implementation, CIFS close sends a close to the server and does not check for the success of the server close. This patch adds functionality to check for server close return status and retries in case of an EBUSY or EAGAIN error. This can help avoid handle leaks Cc: stable@vger.kernel.org Signed-off-by: Ritvik Budhiraja <rbudhiraja@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-02cifs: Fix caching to try to do open O_WRONLY as rdwr on serverDavid Howells
When we're engaged in local caching of a cifs filesystem, we cannot perform caching of a partially written cache granule unless we can read the rest of the granule. This can result in unexpected access errors being reported to the user. Fix this by the following: if a file is opened O_WRONLY locally, but the mount was given the "-o fsc" flag, try first opening the remote file with GENERIC_READ|GENERIC_WRITE and if that returns -EACCES, try dropping the GENERIC_READ and doing the open again. If that last succeeds, invalidate the cache for that file as for O_DIRECT. Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-14cifs: fixes for get_inode_infoMeetakshi Setiya
Fix potential memory leaks, add error checking, remove unnecessary initialisation of status_file_deleted and do not use cifs_iget() to get inode in reparse_info_to_fattr since fattrs may not be fully set. Fixes: ffceb7640cbf ("smb: client: do not defer close open handles to deleted files") Reported-by: Paulo Alcantara <pc@manguebit.com> Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-14cifs: defer close file handles having RH leaseBharath SM
Previously we only deferred closing file handles with RHW lease. To enhance performance benefits from deferred closes, we now include handles with RH leases as well. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-13Merge tag '6.9-rc-smb3-client-fixes-part1' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - fix for folios/netfs data corruption in cifs_extend_writeback - additional tracepoint added - updates for special files and symlinks: improvements to allow selecting use of either WSL or NFS reparse point format on creating special files - allocation size improvement for cached files - minor cleanup patches - fix to allow changing the password on remount when password for the session is expired. - lease key related fixes: caching hardlinked files, deletes of deferred close files, and an important fix to better reuse lease keys for compound operations, which also can avoid lease break timeouts when low on credits - fix potential data corruption with write/readdir races - compression cleanups and a fix for compression headers * tag '6.9-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (24 commits) cifs: update internal module version number for cifs.ko smb: common: simplify compression headers smb: common: fix fields sizes in compression_pattern_payload_v1 smb: client: negotiate compression algorithms smb3: add dynamic trace point for ioctls cifs: Fix writeback data corruption smb: client: return reparse type in /proc/mounts smb: client: set correct d_type for reparse DFS/DFSR and mount point smb: client: parse uid, gid, mode and dev from WSL reparse points smb: client: introduce SMB2_OP_QUERY_WSL_EA smb: client: Fix a NULL vs IS_ERR() check in wsl_set_xattrs() smb: client: add support for WSL reparse points smb: client: reduce number of parameters in smb2_compound_op() smb: client: fix potential broken compound request smb: client: move most of reparse point handling code to common file smb: client: introduce reparse mount option smb: client: retry compound request without reusing lease smb: client: do not defer close open handles to deleted files smb: client: reuse file lease key in compound operations smb3: update allocation size more accurately on write completion ...
2024-03-11Merge tag 'vfs-6.9.file' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull file locking updates from Christian Brauner: "A few years ago struct file_lock_context was added to allow for separate lists to track different types of file locks instead of using a singly-linked list for all of them. Now leases no longer need to be tracked using struct file_lock. However, a lot of the infrastructure is identical for leases and locks so separating them isn't trivial. This splits a group of fields used by both file locks and leases into a new struct file_lock_core. The new core struct is embedded in struct file_lock. Coccinelle was used to convert a lot of the callers to deal with the move, with the remaining 25% or so converted by hand. Afterwards several internal functions in fs/locks.c are made to work with struct file_lock_core. Ultimately this allows to split struct file_lock into struct file_lock and struct file_lease. The file lease APIs are then converted to take struct file_lease" * tag 'vfs-6.9.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (51 commits) filelock: fix deadlock detection in POSIX locking filelock: always define for_each_file_lock() smb: remove redundant check filelock: don't do security checks on nfsd setlease calls filelock: split leases out of struct file_lock filelock: remove temporary compatibility macros smb/server: adapt to breakup of struct file_lock smb/client: adapt to breakup of struct file_lock ocfs2: adapt to breakup of struct file_lock nfsd: adapt to breakup of struct file_lock nfs: adapt to breakup of struct file_lock lockd: adapt to breakup of struct file_lock fuse: adapt to breakup of struct file_lock gfs2: adapt to breakup of struct file_lock dlm: adapt to breakup of struct file_lock ceph: adapt to breakup of struct file_lock afs: adapt to breakup of struct file_lock 9p: adapt to breakup of struct file_lock filelock: convert seqfile handling to use file_lock_core filelock: convert locks_translate_pid to take file_lock_core ...
2024-03-10cifs: Fix writeback data corruptionDavid Howells
cifs writeback doesn't correctly handle the case where cifs_extend_writeback() hits a point where it is considering an additional folio, but this would overrun the wsize - at which point it drops out of the xarray scanning loop and calls xas_pause(). The problem is that xas_pause() advances the loop counter - thereby skipping that page. What needs to happen is for xas_reset() to be called any time we decide we don't want to process the page we're looking at, but rather send the request we are building and start a new one. Fix this by copying and adapting the netfslib writepages code as a temporary measure, with cifs writeback intending to be offloaded to netfslib in the near future. This also fixes the issue with the use of filemap_get_folios_tag() causing retry of a bunch of pages which the extender already dealt with. This can be tested by creating, say, a 64K file somewhere not on cifs (otherwise copy-offload may get underfoot), mounting a cifs share with a wsize of 64000, copying the file to it and then comparing the original file and the copy: dd if=/dev/urandom of=/tmp/64K bs=64k count=1 mount //192.168.6.1/test /mnt -o user=...,pass=...,wsize=64000 cp /tmp/64K /mnt/64K cmp /tmp/64K /mnt/64K Without the fix, the cmp fails at position 64000 (or shortly thereafter). Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb: client: do not defer close open handles to deleted filesMeetakshi Setiya
When a file/dentry has been deleted before closing all its open handles, currently, closing them can add them to the deferred close list. This can lead to problems in creating file with the same name when the file is re-created before the deferred close completes. This issue was seen while reusing a client's already existing lease on a file for compound operations and xfstest 591 failed because of the deferred close handle that remained valid even after the file was deleted and was being reused to create a file with the same name. The server in this case returns an error on open with STATUS_DELETE_PENDING. Recreating the file would fail till the deferred handles are closed (duration specified in closetimeo). This patch fixes the issue by flagging all open handles for the deleted file (file path to be precise) by setting status_file_deleted to true in the cifsFileInfo structure. As per the information classes specified in MS-FSCC, SMB2 query info response from the server has a DeletePending field, set to true to indicate that deletion has been requested on that file. If this is the case, flag the open handles for this file too. When doing close in cifs_close for each of these handles, check the value of this boolean field and do not defer close these handles if the corresponding filepath has been deleted. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10smb3: update allocation size more accurately on write completionSteve French
Changes to allocation size are approximated for extending writes of cached files until the server returns the actual value (on SMB3 close or query info for example), but it was setting the estimated value for number of blocks to larger than the file size even if the file is likely sparse which breaks various xfstests (e.g. generic/129, 130, 221, 228). When i_size and i_blocks are updated in write completion do not increase allocation size more than what was written (rounded up to 512 bytes). Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-10cifs: prevent updating file size from server if we have a read/write leaseBharath SM
In cases of large directories, the readdir operation may span multiple round trips to retrieve contents. This introduces a potential race condition in case of concurrent write and readdir operations. If the readdir operation initiates before a write has been processed by the server, it may update the file size attribute to an older value. Address this issue by avoiding file size updates from readdir when we have read/write lease. Scenario: 1) process1: open dir xyz 2) process1: readdir instance 1 on xyz 3) process2: create file.txt for write 4) process2: write x bytes to file.txt 5) process2: close file.txt 6) process2: open file.txt for read 7) process1: readdir 2 - overwrites file.txt inode size to 0 8) process2: read contents of file.txt - bug, short read with 0 bytes Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-02-09cifs: change tcon status when need_reconnect is set on itShyam Prasad N
When a tcon is marked for need_reconnect, the intention is to have it reconnected. This change adjusts tcon->status in cifs_tree_connect when need_reconnect is set. Also, this change has a minor correction in resetting need_reconnect on success. It makes sure that it is done with tc_lock held. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-02-05smb/client: adapt to breakup of struct file_lockJeff Layton
Most of the existing APIs have remained the same, but subsystems that access file_lock fields directly need to reach into struct file_lock_core now. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-44-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-05filelock: split common fields into struct file_lock_coreJeff Layton
In a future patch, we're going to split file leases into their own structure. Since a lot of the underlying machinery uses the same fields move those into a new file_lock_core, and embed that inside struct file_lock. For now, add some macros to ensure that we can continue to build while the conversion is in progress. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-17-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-05smb/client: convert to using new filelock helpersJeff Layton
Convert to using the new file locking helper functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-14-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-27Merge tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Nine cifs/smb client fixes - Four network error fixes (three relating to replays of requests that need to be retried, and one fixing some places where we were returning the wrong rc up the stack on network errors) - Two multichannel fixes including locking fix and case where subset of channels need reconnect - netfs integration fixup: share remote i_size with netfslib - Two small cleanups (one for addressing a clang warning)" * tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix stray unlock in cifs_chan_skip_or_disable cifs: set replay flag for retries of write command cifs: commands that are retried should have replay flag set cifs: helper function to check replayable error codes cifs: translate network errors on send to -ECONNABORTED cifs: cifs_pick_channel should try selecting active channels cifs: Share server EOF pos with netfslib smb: Work around Clang __bdos() type confusion smb: client: delete "true", "false" defines
2024-01-23cifs: set replay flag for retries of write commandShyam Prasad N
Similar to the rest of the commands, this is a change to add replay flags on retry. This one does not add a back-off, considering that we may want to flush a write ASAP to the server. Considering that this will be a flush of cached pages, the retrans value is also not honoured. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-23cifs: Share server EOF pos with netfslibDavid Howells
Use cifsi->netfs_ctx.remote_i_size instead of cifsi->server_eof so that netfslib can refer to it to. Signed-off-by: David Howells <dhowells@redhat.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-22cifs: Don't use certain unnecessary folio_*() functionsDavid Howells
Filesystems should use folio->index and folio->mapping, instead of folio_index(folio), folio_mapping() and folio_file_mapping() since they know that it's in the pagecache. Change this automagically with: perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1->mapping/g' fs/smb/client/*.c perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1->mapping/g' fs/smb/client/*.c perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1->index/g' fs/smb/client/*.c Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <lsahlber@redhat.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org
2024-01-19Merge tag 'vfs-6.8.netfs' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use to replace their own implementations. Both afs and 9p are ported. cifs is ready as well but the patches are way bigger and will be routed separately once this is merged. That will remove lots of code as well. The overal goal is to get high-level I/O and knowledge of the page cache and ouf of the filesystem drivers. This includes knowledge about the existence of pages and folios The pull request converts afs and 9p. This removes about 800 lines of code from afs and 300 from 9p. For 9p it is now possible to do writes in larger than a page chunks. Additionally, multipage folio support can be turned on for 9p. Separate patches exist for cifs removing another 2000+ lines. I've included detailed information in the individual pulls I took. Summary: - Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. - Support for direct and unbuffered I/O. - Support for write-through caching in the page cache. - O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the page cache and then flushing afterwards. - Support for write-streaming. - Support for write grouping. - Skip reads for which the server could only return zeros or EOF. - The fscache module is now part of the netfs library and the corresponding maintainer entry is updated. - Some helpers from the fscache subsystem are renamed to mark them as belonging to the netfs library. - Follow-up fixes for the netfs library. - Follow-up fixes for the 9p conversion" * tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits) netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion ...
2024-01-10Merge tag 'v6.8-rc-part1-smb-client' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: "Various smb client fixes, most related to better handling special file types: - Improve handling of special file types: - performance improvement (better compounding and better caching of readdir entries that are reparse points) - extend support for creating special files (sockets, fifos, block/char devices) - fix renaming and hardlinking of reparse points - extend support for creating symlinks with IO_REPARSE_TAG_SYMLINK - Multichannel logging improvement - Exception handling fix - Minor cleanups" * tag 'v6.8-rc-part1-smb-client' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number for cifs.ko cifs: remove unneeded return statement cifs: make cifs_chan_update_iface() a void function cifs: delete unnecessary NULL checks in cifs_chan_update_iface() cifs: get rid of dup length check in parse_reparse_point() smb: client: stop revalidating reparse points unnecessarily cifs: Pass unbyteswapped eof value into SMB2_set_eof() smb3: Improve exception handling in allocate_mr_list() cifs: fix in logging in cifs_chan_update_iface smb: client: handle special files and symlinks in SMB3 POSIX smb: client: cleanup smb2_query_reparse_point() smb: client: allow creating symlinks via reparse points smb: client: fix hardlinking of reparse points smb: client: fix renaming of reparse points smb: client: optimise reparse point querying smb: client: allow creating special files via reparse points smb: client: extend smb2_compound_op() to accept more commands smb: client: Fix minor whitespace errors and warnings
2024-01-09Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
2024-01-07smb: client: allow creating special files via reparse pointsPaulo Alcantara
Add support for creating special files (e.g. char/block devices, sockets, fifos) via NFS reparse points on SMB2+, which are fully supported by most SMB servers and documented in MS-FSCC. smb2_get_reparse_inode() creates the file with a corresponding reparse point buffer set in @iov through a single roundtrip to the server. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311260746.HOJ039BV-lkp@intel.com/ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-24netfs: Move pinning-for-writeback from fscache to netfsDavid Howells
Move the resource pinning-for-writeback from fscache code to netfslib code. This is used to keep a cache backing object pinned whilst we have dirty pages on the netfs inode in the pagecache such that VM writeback will be able to reach it. Whilst we're at it, switch the parameters of netfs_unpin_writeback() to match ->write_inode() so that it can be used for that directly. Note that this mechanism could be more generically useful than that for network filesystems. Quite often they have to keep around other resources (e.g. authentication tokens or network connections) until the writeback is complete. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2023-12-19fs: cifs: Fix atime update checkZizhi Wo
Commit 9b9c5bea0b96 ("cifs: do not return atime less than mtime") indicates that in cifs, if atime is less than mtime, some apps will break. Therefore, it introduce a function to compare this two variables in two places where atime is updated. If atime is less than mtime, update it to mtime. However, the patch was handled incorrectly, resulting in atime and mtime being exactly equal. A previous commit 69738cfdfa70 ("fs: cifs: Fix atime update check vs mtime") fixed one place and forgot to fix another. Fix it. Fixes: 9b9c5bea0b96 ("cifs: do not return atime less than mtime") Cc: stable@vger.kernel.org Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-10smb: do not test the return value of folio_start_writeback()Matthew Wilcox (Oracle)
In preparation for removing the return value entirely, stop testing it in smb. Link: https://lkml.kernel.org/r/20231108204605.745109-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18client: convert to new timestamp accessorsJeff Layton
Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-66-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-28Merge tag 'v6.6-vfs.ctime' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This adds VFS support for multi-grain timestamps and converts tmpfs, xfs, ext4, and btrfs to use them. This carries acks from all relevant filesystems. The VFS always uses coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot of metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g., backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. This introduces fine-grained timestamps that are used when they are actively queried. This uses the 31st bit of the ctime tv_nsec field to indicate that something has queried the inode for the mtime or ctime. When this flag is set, on the next mtime or ctime update, the kernel will fetch a fine-grained timestamp instead of the usual coarse-grained one. As POSIX generally mandates that when the mtime changes, the ctime must also change the kernel always stores normalized ctime values, so only the first 30 bits of the tv_nsec field are ever used. Filesytems can opt into this behavior by setting the FS_MGTIME flag in the fstype. Filesystems that don't set this flag will continue to use coarse-grained timestamps. Various preparatory changes, fixes and cleanups are included: - Fixup all relevant places where POSIX requires updating ctime together with mtime. This is a wide-range of places and all maintainers provided necessary Acks. - Add new accessors for inode->i_ctime directly and change all callers to rely on them. Plain accesses to inode->i_ctime are now gone and it is accordingly rename to inode->__i_ctime and commented as requiring accessors. - Extend generic_fillattr() to pass in a request mask mirroring in a sense the statx() uapi. This allows callers to pass in a request mask to only get a subset of attributes filled in. - Rework timestamp updates so it's possible to drop the @now parameter the update_time() inode operation and associated helpers. - Add inode_update_timestamps() and convert all filesystems to it removing a bunch of open-coding" * tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits) btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps tmpfs: add support for multigrain timestamps fs: add infrastructure for multigrain timestamps fs: drop the timespec64 argument from update_time xfs: have xfs_vn_update_time gets its own timestamp fat: make fat_update_time get its own timestamp fat: remove i_version handling from fat_update_time ubifs: have ubifs_update_time use inode_update_timestamps btrfs: have it use inode_update_timestamps fs: drop the timespec64 arg from generic_update_time fs: pass the request_mask to generic_fillattr fs: remove silly warning from current_time gfs2: fix timestamp handling on quota inodes fs: rename i_ctime field to __i_ctime selinux: convert to ctime accessor functions security: convert to ctime accessor functions apparmor: convert to ctime accessor functions sunrpc: convert to ctime accessor functions ...
2023-08-14cifs: Release folio lock on fscache read hit.Russell Harmon via samba-technical
Under the current code, when cifs_readpage_worker is called, the call contract is that the callee should unlock the page. This is documented in the read_folio section of Documentation/filesystems/vfs.rst as: > The filesystem should unlock the folio once the read has completed, > whether it was successful or not. Without this change, when fscache is in use and cache hit occurs during a read, the page lock is leaked, producing the following stack on subsequent reads (via mmap) to the page: $ cat /proc/3890/task/12864/stack [<0>] folio_wait_bit_common+0x124/0x350 [<0>] filemap_read_folio+0xad/0xf0 [<0>] filemap_fault+0x8b1/0xab0 [<0>] __do_fault+0x39/0x150 [<0>] do_fault+0x25c/0x3e0 [<0>] __handle_mm_fault+0x6ca/0xc70 [<0>] handle_mm_fault+0xe9/0x350 [<0>] do_user_addr_fault+0x225/0x6c0 [<0>] exc_page_fault+0x84/0x1b0 [<0>] asm_exc_page_fault+0x27/0x30 This requires a reboot to resolve; it is a deadlock. Note however that the call to cifs_readpage_from_fscache does mark the page clean, but does not free the folio lock. This happens in __cifs_readpage_from_fscache on success. Releasing the lock at that point however is not appropriate as cifs_readahead also calls cifs_readpage_from_fscache and *does* unconditionally release the lock after its return. This change therefore effectively makes cifs_readpage_worker work like cifs_readahead. Signed-off-by: Russell Harmon <russ@har.mn> Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: David Howells <dhowells@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-10cifs: fix potential oops in cifs_oplock_breakSteve French
With deferred close we can have closes that race with lease breaks, and so with the current checks for whether to send the lease response, oplock_response(), this can mean that an unmount (kill_sb) can occur just before we were checking if the tcon->ses is valid. See below: [Fri Aug 4 04:12:50 2023] RIP: 0010:cifs_oplock_break+0x1f7/0x5b0 [cifs] [Fri Aug 4 04:12:50 2023] Code: 7d a8 48 8b 7d c0 c0 e9 02 48 89 45 b8 41 89 cf e8 3e f5 ff ff 4c 89 f7 41 83 e7 01 e8 82 b3 03 f2 49 8b 45 50 48 85 c0 74 5e <48> 83 78 60 00 74 57 45 84 ff 75 52 48 8b 43 98 48 83 eb 68 48 39 [Fri Aug 4 04:12:50 2023] RSP: 0018:ffffb30607ddbdf8 EFLAGS: 00010206 [Fri Aug 4 04:12:50 2023] RAX: 632d223d32612022 RBX: ffff97136944b1e0 RCX: 0000000080100009 [Fri Aug 4 04:12:50 2023] RDX: 0000000000000001 RSI: 0000000080100009 RDI: ffff97136944b188 [Fri Aug 4 04:12:50 2023] RBP: ffffb30607ddbe58 R08: 0000000000000001 R09: ffffffffc08e0900 [Fri Aug 4 04:12:50 2023] R10: 0000000000000001 R11: 000000000000000f R12: ffff97136944b138 [Fri Aug 4 04:12:50 2023] R13: ffff97149147c000 R14: ffff97136944b188 R15: 0000000000000000 [Fri Aug 4 04:12:50 2023] FS: 0000000000000000(0000) GS:ffff9714f7c00000(0000) knlGS:0000000000000000 [Fri Aug 4 04:12:50 2023] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Fri Aug 4 04:12:50 2023] CR2: 00007fd8de9c7590 CR3: 000000011228e000 CR4: 0000000000350ef0 [Fri Aug 4 04:12:50 2023] Call Trace: [Fri Aug 4 04:12:50 2023] <TASK> [Fri Aug 4 04:12:50 2023] process_one_work+0x225/0x3d0 [Fri Aug 4 04:12:50 2023] worker_thread+0x4d/0x3e0 [Fri Aug 4 04:12:50 2023] ? process_one_work+0x3d0/0x3d0 [Fri Aug 4 04:12:50 2023] kthread+0x12a/0x150 [Fri Aug 4 04:12:50 2023] ? set_kthread_struct+0x50/0x50 [Fri Aug 4 04:12:50 2023] ret_from_fork+0x22/0x30 [Fri Aug 4 04:12:50 2023] </TASK> To fix this change the ordering of the checks before sending the oplock_response to first check if the openFileList is empty. Fixes: da787d5b7498 ("SMB3: Do not send lease break acknowledgment if all file handles have been closed") Suggested-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-24smb: convert to ctime accessor functionsJeff Layton
In later patches, we're going to change how the inode's ctime field is used. Switch to using accessor functions instead of raw accesses of inode->i_ctime. Acked-by: Tom Talpey <tom@talpey.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: Steve French <stfrench@microsoft.com> Message-Id: <20230705190309.579783-72-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-10cifs: if deferred close is disabled then close files immediatelyBharath SM
If defer close timeout value is set to 0, then there is no need to include files in the deferred close list and utilize the delayed worker for closing. Instead, we can close them immediately. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-10cifs: update the ctime on a partial page writeJeff Layton
POSIX says: "Upon successful completion, where nbyte is greater than 0, write() shall mark for update the last data modification and last file status change timestamps of the file..." Add the missing ctime update. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Steve French <stfrench@microsoft.com> Message-Id: <20230705190309.579783-6-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-30Merge tag '6.5-rc-smb3-client-fixes-part1' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - Deferred close fix - Debugging improvements: display missing mount option, dump rc on invalidate inode failures, print client_guid in DebugData, log session id when matching session not found in reconnect, new dynamic tracepoint for session not found - Mount fixes including: potential null dereference, and possible memory leak and path name parsing when double slashes - Fix potential use after free in compounding - Two crediting (flow control) fixes: fix for crediting leak (stress scenario with excess lease credits) and better locking around updating credits - Three cleanups from issues pointed out by the kernel test robot - Session state check improvements (including for potential use after free) - DFS fixes: Fix for getattr on link when DFS disabled, fix for DFS mounts to same share with different prefix paths, DFS mount error checking improvement * tag '6.5-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: new dynamic tracepoint to track ses not found errors cifs: log session id when a matching ses is not found smb: client: improve DFS mount check smb: client: fix shared DFS root mounts with different prefixes smb: client: fix parsing of source mount option smb: client: fix broken file attrs with nodfs mounts cifs: print client_guid in DebugData cifs: fix session state check in smb2_find_smb_ses cifs: fix session state check in reconnect to avoid use-after-free issue cifs: do all necessary checks for credits within or before locking cifs: prevent use-after-free by freeing the cfile later smb: client: fix warning in generic_ip_connect() smb: client: fix warning in CIFSFindNext() smb: client: fix warning in CIFSFindFirst() smb3: do not reserve too many oplock credits cifs: print more detail when invalidate_inode_mapping fails smb: client: fix warning in cifs_smb3_do_mount() smb: client: fix warning in cifs_match_super() cifs: print nosharesock value while dumping mount options SMB3: Do not send lease break acknowledgment if all file handles have been closed
2023-06-26Merge tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linuxLinus Torvalds
Pull splice updates from Jens Axboe: "This kills off ITER_PIPE to avoid a race between truncate, iov_iter_revert() on the pipe and an as-yet incomplete DMA to a bio with unpinned/unref'ed pages from an O_DIRECT splice read. This causes memory corruption. Instead, we either use (a) filemap_splice_read(), which invokes the buffered file reading code and splices from the pagecache into the pipe; (b) copy_splice_read(), which bulk-allocates a buffer, reads into it and then pushes the filled pages into the pipe; or (c) handle it in filesystem-specific code. Summary: - Rename direct_splice_read() to copy_splice_read() - Simplify the calculations for the number of pages to be reclaimed in copy_splice_read() - Turn do_splice_to() into a helper, vfs_splice_read(), so that it can be used by overlayfs and coda to perform the checks on the lower fs - Make vfs_splice_read() jump to copy_splice_read() to handle direct-I/O and DAX - Provide shmem with its own splice_read to handle non-existent pages in the pagecache. We don't want a ->read_folio() as we don't want to populate holes, but filemap_get_pages() requires it - Provide overlayfs with its own splice_read to call down to a lower layer as overlayfs doesn't provide ->read_folio() - Provide coda with its own splice_read to call down to a lower layer as coda doesn't provide ->read_folio() - Direct ->splice_read to copy_splice_read() in tty, procfs, kernfs and random files as they just copy to the output buffer and don't splice pages - Provide wrappers for afs, ceph, ecryptfs, ext4, f2fs, nfs, ntfs3, ocfs2, orangefs, xfs and zonefs to do locking and/or revalidation - Make cifs use filemap_splice_read() - Replace pointers to generic_file_splice_read() with pointers to filemap_splice_read() as DIO and DAX are handled in the caller; filesystems can still provide their own alternate ->splice_read() op - Remove generic_file_splice_read() - Remove ITER_PIPE and its paraphernalia as generic_file_splice_read was the only user" * tag 'for-6.5/splice-2023-06-23' of git://git.kernel.dk/linux: (31 commits) splice: kdoc for filemap_splice_read() and copy_splice_read() iov_iter: Kill ITER_PIPE splice: Remove generic_file_splice_read() splice: Use filemap_splice_read() instead of generic_file_splice_read() cifs: Use filemap_splice_read() trace: Convert trace/seq to use copy_splice_read() zonefs: Provide a splice-read wrapper xfs: Provide a splice-read wrapper orangefs: Provide a splice-read wrapper ocfs2: Provide a splice-read wrapper ntfs3: Provide a splice-read wrapper nfs: Provide a splice-read wrapper f2fs: Provide a splice-read wrapper ext4: Provide a splice-read wrapper ecryptfs: Provide a splice-read wrapper ceph: Provide a splice-read wrapper afs: Provide a splice-read wrapper 9p: Add splice_read wrapper net: Make sock_splice_read() use copy_splice_read() by default tty, proc, kernfs, random: Use copy_splice_read() ...
2023-06-19SMB3: Do not send lease break acknowledgment if all file handles have been ↵Bharath SM
closed In case if all existing file handles are deferred handles and if all of them gets closed due to handle lease break then we dont need to send lease break acknowledgment to server, because last handle close will be considered as lease break ack. After closing deferred handels, we check for openfile list of inode, if its empty then we skip sending lease break ack. Fixes: 59a556aebc43 ("SMB3: drop reference to cfile before sending oplock break") Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-14cifs: fix lease break oops in xfstest generic/098Steve French
umount can race with lease break so need to check if tcon->ses->server is still valid to send the lease break response. Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Fixes: 59a556aebc43 ("SMB3: drop reference to cfile before sending oplock break") Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-24smb: move client and server files to common directory fs/smbSteve French
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko and helper modules) to new fs/smb subdirectory: fs/cifs --> fs/smb/client fs/ksmbd --> fs/smb/server fs/smbfs_common --> fs/smb/common Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>