summaryrefslogtreecommitdiff
path: root/fs/nfs/nfs4proc.c
AgeCommit message (Collapse)Author
2015-12-28pNFS: Ensure nfs4_layoutget_prepare returns the correct errorTrond Myklebust
If we're unable to perform the layoutget due to an invalid open stateid or a bulk recall, ensure that we return the error so that the caller can decide on an appropriate action. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28NFS41: map NFS4ERR_LAYOUTUNAVAILABLE to ENODATAPeng Tao
Instead of mapping it to EIO that is a fatal error and fails application. We'll go inband after getting NFS4ERR_LAYOUTUNAVAILABLE. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28NFSv4.1/pnfs: Fixup an lo->plh_block_lgets imbalance in layoutreturnTrond Myklebust
Since commit 2d8ae84fbc32, nothing is bumping lo->plh_block_lgets in the layoutreturn path, so it should not be touched in nfs4_layoutreturn_release either. Fixes: 2d8ae84fbc32 ("NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets...") Cc: stable@vger.kernel.org # 4.3+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28nfs: machine credential support for additional operationsAndrew Elble
Allow LAYOUTRETURN and DELEGRETURN to use machine credentials if the server supports it. Add request for OPEN_DOWNGRADE as the close path also uses that. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28NFSv4: Fix unused variable warnings in nfs4_init_*_client_string()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28Adding tracepoint to cached openOlga Kornievskaia
Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28Adding stateid information to tracepointsOlga Kornievskaia
Operations to which stateid information is added: close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid, write, lock, locku, lockt Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=", "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock tracepoints. New function is added to internal.h, nfs_stateid_hash(), to compute the hash trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr() to get access to stateid. trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT to new event type, INODE_STATEID_EVENT which is same as INODE_EVENT but adds stateid information for locking tracepoints, moved trace_nfs4_set_lock() into _nfs4_do_setlk() to get access to stateid information, and removed trace_nfs4_lock_reclaim(), trace_nfs4_lock_expired() as they call into _nfs4_do_setlk() and both were previously same LOCK_EVENT type. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28NFS: Allow the combination pNFS and labeled NFSTrond Myklebust
Fix the nfs4_pnfs_open_bitmap so that it also allows for labeled NFS. Signed-off-by: Trond Myklebust <trond,myklebust@primarydata.com>
2015-12-28nfs: Fix race in __update_open_stateid()Andrew Elble
We've seen this in a packet capture - I've intermixed what I think was going on. The fix here is to grab the so_lock sooner. 1964379 -> #1 open (for write) reply seqid=1 1964393 -> #2 open (for read) reply seqid=2 __nfs4_close(), state->n_wronly-- nfs4_state_set_mode_locked(), changes state->state = [R] state->flags is [RW] state->state is [R], state->n_wronly == 0, state->n_rdonly == 1 1964398 -> #3 open (for write) call -> because close is already running 1964399 -> downgrade (to read) call seqid=2 (close of #1) 1964402 -> #3 open (for write) reply seqid=3 __update_open_stateid() nfs_set_open_stateid_locked(), changes state->flags state->flags is [RW] state->state is [R], state->n_wronly == 0, state->n_rdonly == 1 new sequence number is exposed now via nfs4_stateid_copy() next step would be update_open_stateflags(), pending so_lock 1964403 -> downgrade reply seqid=2, fails with OLD_STATEID (close of #1) nfs4_close_prepare() gets so_lock and recalcs flags -> send close 1964405 -> downgrade (to read) call seqid=3 (close of #1 retry) __update_open_stateid() gets so_lock * update_open_stateflags() updates state->n_wronly. nfs4_state_set_mode_locked() updates state->state state->flags is [RW] state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1 * should have suppressed the preceding nfs4_close_prepare() from sending open_downgrade 1964406 -> write call 1964408 -> downgrade (to read) reply seqid=4 (close of #1 retry) nfs_clear_open_stateid_locked() state->flags is [R] state->state is [RW], state->n_wronly == 1, state->n_rdonly == 1 1964409 -> write reply (fails, openmode) Signed-off-by: Andrew Elble <aweits@rit.edu> Cc: stable@vger,kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-13xattr handlers: Simplify list operationAndreas Gruenbacher
Change the list operation to only return whether or not an attribute should be listed. Copying the attribute names into the buffer is moved to the callers. Since the result only depends on the dentry and not on the attribute name, we do not pass the attribute name to list operations. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-13nfs: Move call to security_inode_listsecurity into nfs_listxattrAndreas Gruenbacher
Add a nfs_listxattr operation. Move the call to security_inode_listsecurity from list operation of the "security.*" xattr handler to nfs_listxattr. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06vfs: Distinguish between full xattr names and proper prefixesAndreas Gruenbacher
Add an additional "name" field to struct xattr_handler. When the name is set, the handler matches attributes with exactly that name. When the prefix is set instead, the handler matches attributes with the given prefix and with a non-empty suffix. This patch should avoid bugs like the one fixed in commit c361016a in the future. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-23nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAYJeff Layton
When LAYOUTGET gets NFS4ERR_DELAY, we currently will wait 15s before retrying the call. That is a _very_ long time, so add a timeout value to struct nfs4_layoutget and pass nfs4_async_handle_error a pointer to it. This allows the RPC engine to use a sliding delay window, instead of a 15s delay. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-13Merge branch 'for-linus-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr cleanups from Al Viro. * 'for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: f2fs: xattr simplifications squashfs: xattr simplifications 9p: xattr simplifications xattr handlers: Pass handler to operations instead of flags jffs2: Add missing capability check for listing trusted xattrs hfsplus: Remove unused xattr handler list operations ubifs: Remove unused security xattr handler vfs: Fix the posix_acl_xattr_list return value vfs: Check attribute names in posix acl xattr handers
2015-11-13xattr handlers: Pass handler to operations instead of flagsAndreas Gruenbacher
The xattr_handler operations are currently all passed a file system specific flags value which the operations can use to disambiguate between different handlers; some file systems use that to distinguish the xattr namespace, for example. In some oprations, it would be useful to also have access to the handler prefix. To allow that, pass a pointer to the handler to operations instead of the flags value alone. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-09Merge tag 'nfs-for-4.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: New features: - RDMA client backchannel from Chuck - Support for NFSv4.2 file CLONE using the btrfs ioctl Bugfixes + cleanups: - Move socket data receive out of the bottom halves and into a workqueue - Refactor NFSv4 error handling so synchronous and asynchronous RPC handles errors identically. - Fix a panic when blocks or object layouts reads return a bad data length - Fix nfsroot so it can handle a 1024 byte long path. - Fix bad usage of page offset in bl_read_pagelist - Various NFSv4 callback cleanups+fixes - Fix GETATTR bitmap verification - Support hexadecimal number for sunrpc debug sysctl files" * tag 'nfs-for-4.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (53 commits) Sunrpc: Supports hexadecimal number for sysctl files of sunrpc debug nfs: Fix GETATTR bitmap verification nfs: Remove unused xdr page offsets in getacl/setacl arguments fs/nfs: remove unnecessary new_valid_dev check SUNRPC: fix variable type NFS: Enable client side NFSv4.1 backchannel to use other transports pNFS/flexfiles: Add support for FF_FLAGS_NO_IO_THRU_MDS pNFS/flexfiles: When mirrored, retry failed reads by switching mirrors SUNRPC: Remove the TCP-only restriction in bc_svc_process() svcrdma: Add backward direction service for RPC/RDMA transport xprtrdma: Handle incoming backward direction RPC calls xprtrdma: Add support for sending backward direction RPC replies xprtrdma: Pre-allocate Work Requests for backchannel xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffers SUNRPC: Abstract backchannel operations xprtrdma: Saving IRQs no longer needed for rb_lock xprtrdma: Remove reply tasklet xprtrdma: Use workqueue to process RPC/RDMA replies xprtrdma: Replace send and receive arrays xprtrdma: Refactor reply handler error handling ...
2015-11-05Merge tag 'locks-v4.4-1' of git://git.samba.org/jlayton/linuxLinus Torvalds
Pull file locking updates from Jeff Layton: "The largest series of changes is from Ben who offered up a set to add a new helper function for setting locks based on the type set in fl_flags. Dmitry also send in a fix for a potential race that he found with KTSAN" * tag 'locks-v4.4-1' of git://git.samba.org/jlayton/linux: locks: cleanup posix_lock_inode_wait and flock_lock_inode_wait Move locks API users to locks_lock_inode_wait() locks: introduce locks_lock_inode_wait() locks: Use more file_inode and fix a comment fs: fix data races on inode->i_flctx locks: change tracepoint for generic_add_lease
2015-11-03nfs: Remove unused xdr page offsets in getacl/setacl argumentsAndreas Gruenbacher
The arguments passed around for getacl and setacl xdr encoding, struct nfs_setaclargs and struct nfs_getaclargs, both contain an array of pages, an offset into the first page, and the length of the page data. The offset is unused as it is always zero; remove it. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-22Move locks API users to locks_lock_inode_wait()Benjamin Coddington
Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
2015-10-21Merge branch 'nfsclone'Trond Myklebust
* nfsclone: nfs: add missing linux/types.h NFS: Fix an 'unused variable' complaint when #ifndef CONFIG_NFS_V4_2 nfs42: add NFS_IOC_CLONE_RANGE ioctl nfs42: respect clone_blksize nfs: get clone_blksize when probing fsinfo nfs42: add NFS_IOC_CLONE ioctl nfs42: add CLONE proc functions nfs42: add CLONE xdr functions
2015-10-15nfs: get clone_blksize when probing fsinfoPeng Tao
NFSv42 CLONE operation is supposed to respect it. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-15nfs42: add CLONE proc functionsPeng Tao
Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Unify synchronous and asynchronous error handlingTrond Myklebust
They now only differ in the way we handle waiting, so let's unify. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Don't use synchronous delegation recall in exception handlingTrond Myklebust
The code needs to be able to work from inside an asynchronous context. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: nfs4_async_handle_error should take a non-const nfs_serverTrond Myklebust
For symmetry with the synchronous handler, and so that we can potentially handle errors such as NFS4ERR_BADNAME. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Update the delay statistics counter for synchronous delaysTrond Myklebust
Currently, we only do so for asynchronous delays. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-08NFSv4: Refactor NFSv4 error handlingTrond Myklebust
Prepare for unification of the synchronous and asynchronous error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02nfs4: reset states to use open_stateid when returning delegation voluntarilyJeff Layton
When the client goes to return a delegation, it should always update any nfs4_state currently set up to use that delegation stateid to instead use the open stateid. It already does do this in some cases, particularly in the state recovery code, but not currently when the delegation is voluntarily returned (e.g. in advance of a RENAME). This causes the client to try to continue using the delegation stateid after the DELEGRETURN, e.g. in LAYOUTGET. Set the nfs4_state back to using the open stateid in nfs4_open_delegation_recall, just before clearing the NFS_DELEGATED_STATE bit. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFSv4: Fix a nograce recovery hangBenjamin Coddington
Since commit 5cae02f42793130e1387f4ec09c4d07056ce9fa5 an OPEN_CONFIRM should have a privileged sequence in the recovery case to allow nograce recovery to proceed for NFSv4.0. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-10-02NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FHTrond Myklebust
We need to warn against broken NFSv4.1 servers that try to hand out delegations in response to NFS4_OPEN_CLAIM_DELEG_CUR_FH. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-23NFS41: make close wait for layoutreturnPeng Tao
If we send a layoutreturn asynchronously before close, the close might reach server first and layoutreturn would fail with BADSTATEID because there is nothing keeping the layout stateid alive. Also do not pretend sending layoutreturn if we are not. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-20NFSv4.x/pnfs: Don't try to recover stateids twice in layoutgetTrond Myklebust
If the current open or layout stateid doesn't match the stateid used in the layoutget RPC call, then don't try to recover it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-20NFSv4: Recovery of recalled read delegations is brokenTrond Myklebust
When a read delegation is being recalled, and we're reclaiming the cached opens, we need to make sure that we only reclaim read-only modes. A previous attempt to do this, relied on retrieving the delegation type from the nfs4_opendata structure. Unfortunately, as Kinglong pointed out, this field can only be set when performing reboot recovery. Furthermore, if we call nfs4_open_recover(), then we end up clobbering the state->flags for all modes that we're not recovering... The fix is to have the delegation recall code pass this information to the recovery call, and then refactor the recovery code so that nfs4_open_delegation_recall() does not need to call nfs4_open_recover(). Reported-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: 39f897fdbd46 ("NFSv4: When returning a delegation, don't...") Tested-by: Kinglong Mee <kinglongmee@gmail.com> Cc: NeilBrown <neilb@suse.com> Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-20NFS: Fix an infinite loop when layoutget fail with BAD_STATEIDKinglong Mee
If layouget fail with BAD_STATEID, restart should not using the old stateid. But, nfs client choose the layout stateid at first, and then the open stateid. To avoid the infinite loop of using bad stateid for layoutget, this patch sets the layout flag'ss NFS_LAYOUT_INVALID_STID bit to skip choosing the bad layout stateid. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-01nfs: Remove unneeded checking of the return value from scnprintfKinglong Mee
The return value from scnprintf always less than the buffer length. So, result >= len always false. This patch removes those checking. int vscnprintf(char *buf, size_t size, const char *fmt, va_list args) { int i; i = vsnprintf(buf, size, fmt, args); if (likely(i < size)) return i; if (size != 0) return size - 1; return 0; } Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-01nfs: Fix truncated client owner id without proto typeKinglong Mee
The length of "Linux NFSv4.0 " is 14, not 10. Without this patch, I get a truncated client owner id as, "Linux NFSv4.0 ::1/::1" With this patch, "Linux NFSv4.0 ::1/::1 tcp" Fixes: a319268891 ("nfs: make nfs4_init_nonuniform_client_string use a dynamically allocated buffer") Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-31NFSv4.1/pnfs: Handle LAYOUTGET return values correctlyTrond Myklebust
According to RFC5661 section 18.43.3, if the server cannot satisfy the loga_minlength argument to LAYOUTGET, there are 2 cases: 1) If loga_minlength == 0, it returns NFS4ERR_LAYOUTTRYLATER 2) If loga_minlength != 0, it returns NFS4ERR_BADLAYOUT Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-30NFSv4.1: Fix a protocol issue with CLOSE stateidsTrond Myklebust
According to RFC5661 Section 18.2.4, CLOSE is supposed to return the zero stateid. This means that nfs_clear_open_stateid_locked() cannot assume that the result stateid will always match the 'other' field of the existing open stateid when trying to determine a race with a parallel OPEN. Instead, we look at the argument, and check for matches. Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1Kinglong Mee
Client sends a SETATTR request after OPEN for updating attributes. For create file with S_ISGID is set, the S_ISGID in SETATTR will be ignored at nfs server as chmod of no PERMISSION. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27NFS: Get suppattr_exclcreat when getting server capabilitiesKinglong Mee
Create file with attributs as NFS4_CREATE_EXCLUSIVE4_1 mode depends on suppattr_exclcreat attribut. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27NFS: Make opened as optional argument in _nfs4_do_openKinglong Mee
Check opened, only update it when non-NULL. It's not needs define an unused value for the opened when calling _nfs4_do_open. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-19NFSv4: Enable delegated opens even when reboot recovery is pendingTrond Myklebust
Unlike the previous attempt, this takes into account the fact that we may be calling it from the recovery thread itself. Detect this by looking at what kind of open we're doing, and checking the state of the NFS_DELEGATION_NEED_RECLAIM if it turns out we're doing a reboot reclaim-type open. Cc: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-19Revert "NFSv4: Remove incorrect check in can_open_delegated()"Trond Myklebust
This reverts commit 4e379d36c050b0117b5d10048be63a44f5036115. This commit opens up a race between the recovery code and the open code. Reported-by: Olga Kornievskaia <aglo@umich.edu> Cc: stable@vger.kernel # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-18NFSv4.1/pnfs: Play safe w.r.t. close() races when return-on-close is setTrond Myklebust
If we have an OPEN_DOWNGRADE and CLOSE race with one another, we want to ensure that the layout is forgotten by the client, so that we start afresh with a new layoutget. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-18NFSv4.1/pnfs: Fix a close/delegreturn hang when return-on-close is setTrond Myklebust
The helper pnfs_roc() has already verified that we have no delegations, and no further open files, hence no outstanding I/O and it has marked all the return-on-close lsegs as being invalid. Furthermore, it sets the NFS_LAYOUT_RETURN bit, thus serialising the close/delegreturn with all future layoutget calls on this inode. The checks in pnfs_roc_drain() for valid layout segments are therefore redundant: those cannot exist until another layoutget completes. The other check for whether or not NFS_LAYOUT_RETURN is set, actually causes a hang, since we already know that we hold that flag. To fix, we therefore strip out all the functionality in pnfs_roc_drain() except the retrieval of the barrier state, and then rename the function accordingly. Reported-by: Christoph Hellwig <hch@infradead.org> Fixes: 5c4a79fb2b1c ("Don't prevent layoutgets when doing return-on-close") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-17NFS: Remove nfs41_server_notify_{target|highest}_slotid_update()Anna Schumaker
All these functions do is call nfs41_ping_server() without adding anything. Let's remove them and give nfs41_ping_server() a better name instead. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-17NFS: Fix a NULL pointer dereference of migration recovery ops for v4.2 clientKinglong Mee
---Steps to Reproduce-- <nfs-server> # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) <nfs-client> # mount -t nfs nfs-server:/nfs/ /mnt/ # ll /mnt/*/ <nfs-server> # cat /etc/exports /nfs/referal *(rw,insecure,no_subtree_check,no_root_squash,crossmnt,refer=/nfs/old/@nfs-server) /nfs/old *(ro,insecure,subtree_check,root_squash,crossmnt) # service nfs restart <nfs-client> # ll /mnt/*/ --->>>>> oops here [ 5123.102925] BUG: unable to handle kernel NULL pointer dereference at (null) [ 5123.103363] IP: [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.103752] PGD 587b9067 PUD 3cbf5067 PMD 0 [ 5123.104131] Oops: 0000 [#1] [ 5123.104529] Modules linked in: nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev vmw_balloon parport_pc parport i2c_piix4 shpchp auth_rpcgss nfs_acl vmw_vmci lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi serio_raw scsi_transport_spi e1000 mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd] [ 5123.105887] CPU: 0 PID: 15853 Comm: ::1-manager Tainted: G OE 4.2.0-rc6+ #214 [ 5123.106358] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [ 5123.106860] task: ffff88007620f300 ti: ffff88005877c000 task.ti: ffff88005877c000 [ 5123.107363] RIP: 0010:[<ffffffffa03ed38b>] [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.107909] RSP: 0018:ffff88005877fdb8 EFLAGS: 00010246 [ 5123.108435] RAX: ffff880053f3bc00 RBX: ffff88006ce6c908 RCX: ffff880053a0d240 [ 5123.108968] RDX: ffffea0000e6d940 RSI: ffff8800399a0000 RDI: ffff88006ce6c908 [ 5123.109503] RBP: ffff88005877fe28 R08: ffffffff81c708a0 R09: 0000000000000000 [ 5123.110045] R10: 00000000000001a2 R11: ffff88003ba7f5c8 R12: ffff880054c55800 [ 5123.110618] R13: 0000000000000000 R14: ffff880053a0d240 R15: ffff880053a0d240 [ 5123.111169] FS: 0000000000000000(0000) GS:ffffffff81c27000(0000) knlGS:0000000000000000 [ 5123.111726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5123.112286] CR2: 0000000000000000 CR3: 0000000054cac000 CR4: 00000000001406f0 [ 5123.112888] Stack: [ 5123.113458] ffffea0000e6d940 ffff8800399a0000 00000000000167d0 0000000000000000 [ 5123.114049] 0000000000000000 0000000000000000 0000000000000000 00000000a7ec82c6 [ 5123.114662] ffff88005877fe18 ffffea0000e6d940 ffff8800399a0000 ffff880054c55800 [ 5123.115264] Call Trace: [ 5123.115868] [<ffffffffa03fb44b>] nfs4_try_migration+0xbb/0x220 [nfsv4] [ 5123.116487] [<ffffffffa03fcb3b>] nfs4_run_state_manager+0x4ab/0x7b0 [nfsv4] [ 5123.117104] [<ffffffffa03fc690>] ? nfs4_do_reclaim+0x510/0x510 [nfsv4] [ 5123.117813] [<ffffffff810a4527>] kthread+0xd7/0xf0 [ 5123.118456] [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160 [ 5123.119108] [<ffffffff816d9cdf>] ret_from_fork+0x3f/0x70 [ 5123.119723] [<ffffffff810a4450>] ? kthread_worker_fn+0x160/0x160 [ 5123.120329] Code: 4c 8b 6a 58 74 17 eb 52 48 8d 55 a8 89 c6 4c 89 e7 e8 4a b5 ff ff 8b 45 b0 85 c0 74 1c 4c 89 f9 48 8b 55 90 48 8b 75 98 48 89 df <41> ff 55 00 3d e8 d8 ff ff 41 89 c6 74 cf 48 8b 4d c8 65 48 33 [ 5123.121643] RIP [<ffffffffa03ed38b>] nfs4_proc_get_locations+0x9b/0x120 [nfsv4] [ 5123.122308] RSP <ffff88005877fdb8> [ 5123.122942] CR2: 0000000000000000 Fixes: ec011fe847 ("NFS: Introduce a vector of migration recovery ops") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-12NFSv4: don't set SETATTR for O_RDONLY|O_EXCLNeilBrown
It is unusual to combine the open flags O_RDONLY and O_EXCL, but it appears that libre-office does just that. [pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0 [pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...> NFSv4 takes O_EXCL as a sign that a setattr command should be sent, probably to reset the timestamps. When it was an O_RDONLY open, the SETATTR command does not identify any actual attributes to change. If no delegation was provided to the open, the SETATTR uses the all-zeros stateid and the request is accepted (at least by the Linux NFS server - no harm, no foul). If a read-delegation was provided, this is used in the SETATTR request, and a Netapp filer will justifiably claim NFS4ERR_BAD_STATEID, which the Linux client takes as a sign to retry - indefinitely. So only treat O_EXCL specially if O_CREAT was also given. Signed-off-by: NeilBrown <neilb@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-28Merge tag 'nfs-for-4.2-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable patches: - Fix a situation where the client uses the wrong (zero) stateid. - Fix a memory leak in nfs_do_recoalesce Bugfixes: - Plug a memory leak when ->prepare_layoutcommit fails - Fix an Oops in the NFSv4 open code - Fix a backchannel deadlock - Fix a livelock in sunrpc when sendmsg fails due to low memory availability - Don't revalidate the mapping if both size and change attr are up to date - Ensure we don't miss a file extension when doing pNFS - Several fixes to handle NFSv4.1 sequence operation status bits correctly - Several pNFS layout return bugfixes" * tag 'nfs-for-4.2-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits) nfs: Fix an oops caused by using other thread's stack space in ASYNC mode nfs: plug memory leak when ->prepare_layoutcommit fails SUNRPC: Report TCP errors to the caller sunrpc: translate -EAGAIN to -ENOBUFS when socket is writable. NFSv4.2: handle NFS-specific llseek errors NFS: Don't clear desc->pg_moreio in nfs_do_recoalesce() NFS: Fix a memory leak in nfs_do_recoalesce NFS: nfs_mark_for_revalidate should always set NFS_INO_REVAL_PAGECACHE NFS: Remove the "NFS_CAP_CHANGE_ATTR" capability NFS: Set NFS_INO_REVAL_PAGECACHE if the change attribute is uninitialised NFS: Don't revalidate the mapping if both size and change attr are up to date NFSv4/pnfs: Ensure we don't miss a file extension NFSv4: We must set NFS_OPEN_STATE flag in nfs_resync_open_stateid_locked SUNRPC: xprt_complete_bc_request must also decrement the free slot count SUNRPC: Fix a backchannel deadlock pNFS: Don't throw out valid layout segments pNFS: pnfs_roc_drain() fix a race with open pNFS: Fix races between return-on-close and layoutreturn. pNFS: pnfs_roc_drain should return 'true' when sleeping pNFS: Layoutreturn must invalidate all existing layout segments. ...
2015-07-28nfs: Fix an oops caused by using other thread's stack space in ASYNC modeKinglong Mee
An oops caused by using other thread's stack space in sunrpc ASYNC sending thread. [ 9839.007187] ------------[ cut here ]------------ [ 9839.007923] kernel BUG at fs/nfs/nfs4xdr.c:910! [ 9839.008069] invalid opcode: 0000 [#1] SMP [ 9839.008069] Modules linked in: blocklayoutdriver rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm joydev iosf_mbi crct10dif_pclmul snd_timer crc32_pclmul crc32c_intel ghash_clmulni_intel snd soundcore ppdev pvpanic parport_pc i2c_piix4 serio_raw virtio_balloon parport acpi_cpufreq nfsd nfs_acl lockd grace auth_rpcgss sunrpc qxl drm_kms_helper virtio_net virtio_console virtio_blk ttm drm virtio_pci virtio_ring virtio ata_generic pata_acpi [ 9839.008069] CPU: 0 PID: 308 Comm: kworker/0:1H Not tainted 4.0.0-0.rc4.git1.3.fc23.x86_64 #1 [ 9839.008069] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 9839.008069] Workqueue: rpciod rpc_async_schedule [sunrpc] [ 9839.008069] task: ffff8800d8b4d8e0 ti: ffff880036678000 task.ti: ffff880036678000 [ 9839.008069] RIP: 0010:[<ffffffffa0339cc9>] [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4] [ 9839.008069] RSP: 0018:ffff88003667ba58 EFLAGS: 00010246 [ 9839.008069] RAX: 0000000000000000 RBX: 000000001fc15e18 RCX: ffff8800c0193800 [ 9839.008069] RDX: ffff8800e4ae3f24 RSI: 000000001fc15e2c RDI: ffff88003667bcd0 [ 9839.008069] RBP: ffff88003667ba58 R08: ffff8800d9173008 R09: 0000000000000003 [ 9839.008069] R10: ffff88003667bcd0 R11: 000000000000000c R12: 0000000000010000 [ 9839.008069] R13: ffff8800d9173350 R14: 0000000000000000 R15: ffff8800c0067b98 [ 9839.008069] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 [ 9839.008069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9839.008069] CR2: 00007f988c9c8bb0 CR3: 00000000d99b6000 CR4: 00000000000407f0 [ 9839.008069] Stack: [ 9839.008069] ffff88003667bbc8 ffffffffa03412c5 00000000c6c55680 ffff880000000003 [ 9839.008069] 0000000000000088 00000010c6c55680 0001000000000002 ffffffff816e87e9 [ 9839.008069] 0000000000000000 00000000477290e2 ffff88003667bab8 ffffffff81327ba3 [ 9839.008069] Call Trace: [ 9839.008069] [<ffffffffa03412c5>] encode_attrs+0x435/0x530 [nfsv4] [ 9839.008069] [<ffffffff816e87e9>] ? inet_sendmsg+0x69/0xb0 [ 9839.008069] [<ffffffff81327ba3>] ? selinux_socket_sendmsg+0x23/0x30 [ 9839.008069] [<ffffffff8164c1df>] ? do_sock_sendmsg+0x9f/0xc0 [ 9839.008069] [<ffffffff8164c278>] ? kernel_sendmsg+0x58/0x70 [ 9839.008069] [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc] [ 9839.008069] [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc] [ 9839.008069] [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4] [ 9839.008069] [<ffffffffa03419a5>] encode_open+0x2d5/0x340 [nfsv4] [ 9839.008069] [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4] [ 9839.008069] [<ffffffffa011ab89>] ? xdr_encode_opaque+0x19/0x20 [sunrpc] [ 9839.008069] [<ffffffffa0339cfb>] ? encode_string+0x2b/0x40 [nfsv4] [ 9839.008069] [<ffffffffa0341bf3>] nfs4_xdr_enc_open+0xb3/0x140 [nfsv4] [ 9839.008069] [<ffffffffa0110a4c>] rpcauth_wrap_req+0xac/0xf0 [sunrpc] [ 9839.008069] [<ffffffffa01017db>] call_transmit+0x18b/0x2d0 [sunrpc] [ 9839.008069] [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc] [ 9839.008069] [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc] [ 9839.008069] [<ffffffffa010caa0>] __rpc_execute+0x90/0x460 [sunrpc] [ 9839.008069] [<ffffffffa010ce85>] rpc_async_schedule+0x15/0x20 [sunrpc] [ 9839.008069] [<ffffffff810b452b>] process_one_work+0x1bb/0x410 [ 9839.008069] [<ffffffff810b47d3>] worker_thread+0x53/0x470 [ 9839.008069] [<ffffffff810b4780>] ? process_one_work+0x410/0x410 [ 9839.008069] [<ffffffff810b4780>] ? process_one_work+0x410/0x410 [ 9839.008069] [<ffffffff810ba7b8>] kthread+0xd8/0xf0 [ 9839.008069] [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180 [ 9839.008069] [<ffffffff81786418>] ret_from_fork+0x58/0x90 [ 9839.008069] [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180 [ 9839.008069] Code: 00 00 48 c7 c7 21 fa 37 a0 e8 94 1c d6 e0 c6 05 d2 17 05 00 01 8b 03 eb d7 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 89 f3 [ 9839.008069] RIP [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4] [ 9839.008069] RSP <ffff88003667ba58> [ 9839.071114] ---[ end trace cc14c03adb522e94 ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>