Age | Commit message (Collapse) | Author |
|
commit d49f042aeec99c5f87160bb52dd52088b1051311 upstream.
Currently, if the call to nfs_refresh_inode fails, then we end up leaking
a reference count, due to the call to nfs4_get_open_state.
While we're at it, replace nfs4_get_open_state with a simple call to
atomic_inc(); there is no need to do a full lookup of the struct nfs_state
since it is passed as an argument in the struct nfs4_opendata, and
is already assigned to the variable 'state'.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d2bfda2e7aa036f90ccea610a657064b1e267913 upstream.
Cached opens have already been handled by _nfs4_opendata_reclaim_to_nfs4_state
and can safely skip being reprocessed, but must still call update_open_stateid
to make sure that all active fmodes are recovered.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a43ec98b72aae3e330f0673438f58316c3769b84 upstream.
This is an unneeded check that could cause the client to fail to recover
opens.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f494a6071d31e3294a3b51ad7a3684f983953f9f upstream.
_nfs4_opendata_reclaim_to_nfs4_state doesn't expect to see a cached
open CLAIM_PREVIOUS, but this can happen. An example is when there are
RDWR openers and RDONLY openers on a delegation stateid. The recovery
path will first try an open CLAIM_PREVIOUS for the RDWR openers, this
marks the delegation as not needing RECLAIM anymore, so the open
CLAIM_PREVIOUS for the RDONLY openers will not actually send an rpc.
The NULL dereference is due to _nfs4_opendata_reclaim_to_nfs4_state
returning PTR_ERR(rpc_status) when !rpc_done. When the open is
cached, rpc_done == 0 and rpc_status == 0, thus
_nfs4_opendata_reclaim_to_nfs4_state returns NULL - this is unexpected
by callers of nfs4_opendata_to_nfs4_state().
This can be reproduced easily by opening the same file two times on an
NFSv4.0 mount with delegations enabled, once as RDWR and once as RDONLY then
sleeping for a long time. While the files are held open, kick off state
recovery and this NULL dereference will be hit every time.
An example OOPS:
[ 65.003602] BUG: unable to handle kernel NULL pointer dereference at 00000000
00000030
[ 65.005312] IP: [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[ 65.006820] PGD 7b0ea067 PUD 791ff067 PMD 0
[ 65.008075] Oops: 0000 [#1] SMP
[ 65.008802] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache
snd_ens1371 gameport nfsd snd_rawmidi snd_ac97_codec ac97_bus btusb snd_seq snd
_seq_device snd_pcm ppdev bluetooth auth_rpcgss coretemp snd_page_alloc crc32_pc
lmul crc32c_intel ghash_clmulni_intel microcode rfkill nfs_acl vmw_balloon serio
_raw snd_timer lockd parport_pc e1000 snd soundcore parport i2c_piix4 shpchp vmw
_vmci sunrpc ata_generic mperf pata_acpi mptspi vmwgfx ttm scsi_transport_spi dr
m mptscsih mptbase i2c_core
[ 65.018684] CPU: 0 PID: 473 Comm: 192.168.10.85-m Not tainted 3.11.2-201.fc19
.x86_64 #1
[ 65.020113] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 07/31/2013
[ 65.022012] task: ffff88003707e320 ti: ffff88007b906000 task.ti: ffff88007b906000
[ 65.023414] RIP: 0010:[<ffffffffa037d6ee>] [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[ 65.025079] RSP: 0018:ffff88007b907d10 EFLAGS: 00010246
[ 65.026042] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 65.027321] RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000000
[ 65.028691] RBP: ffff88007b907d38 R08: 0000000000016f60 R09: 0000000000000000
[ 65.029990] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[ 65.031295] R13: 0000000000000050 R14: 0000000000000000 R15: 0000000000000001
[ 65.032527] FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
[ 65.033981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 65.035177] CR2: 0000000000000030 CR3: 000000007b27f000 CR4: 00000000000407f0
[ 65.036568] Stack:
[ 65.037011] 0000000000000000 0000000000000001 ffff88007b907d90 ffff88007a880220
[ 65.038472] ffff88007b768de8 ffff88007b907d48 ffffffffa037e4a5 ffff88007b907d80
[ 65.039935] ffffffffa036a6c8 ffff880037020e40 ffff88007a880000 ffff880037020e40
[ 65.041468] Call Trace:
[ 65.042050] [<ffffffffa037e4a5>] nfs4_close_state+0x15/0x20 [nfsv4]
[ 65.043209] [<ffffffffa036a6c8>] nfs4_open_recover_helper+0x148/0x1f0 [nfsv4]
[ 65.044529] [<ffffffffa036a886>] nfs4_open_recover+0x116/0x150 [nfsv4]
[ 65.045730] [<ffffffffa036d98d>] nfs4_open_reclaim+0xad/0x150 [nfsv4]
[ 65.046905] [<ffffffffa037d979>] nfs4_do_reclaim+0x149/0x5f0 [nfsv4]
[ 65.048071] [<ffffffffa037e1dc>] nfs4_run_state_manager+0x3bc/0x670 [nfsv4]
[ 65.049436] [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4]
[ 65.050686] [<ffffffffa037de20>] ? nfs4_do_reclaim+0x5f0/0x5f0 [nfsv4]
[ 65.051943] [<ffffffff81088640>] kthread+0xc0/0xd0
[ 65.052831] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40
[ 65.054697] [<ffffffff8165686c>] ret_from_fork+0x7c/0xb0
[ 65.056396] [<ffffffff81088580>] ? insert_kthread_work+0x40/0x40
[ 65.058208] Code: 5c 41 5d 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 89 d5 41 54 53 48 89 fb <4c> 8b 67 30 f0 41 ff 44 24 44 49 8d 7c 24 40 e8 0e 0a 2d e1 44
[ 65.065225] RIP [<ffffffffa037d6ee>] __nfs4_close+0x1e/0x160 [nfsv4]
[ 65.067175] RSP <ffff88007b907d10>
[ 65.068570] CR2: 0000000000000030
[ 65.070098] ---[ end trace 0d1fe4f5c7dd6f8b ]---
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a6f951ddbdfb7bd87d31a44f61abe202ed6ce57f upstream.
In nfs4_proc_getlk(), when some error causes a retry of the call to
_nfs4_proc_getlk(), we can end up with Oopses of the form
BUG: unable to handle kernel NULL pointer dereference at 0000000000000134
IP: [<ffffffff8165270e>] _raw_spin_lock+0xe/0x30
<snip>
Call Trace:
[<ffffffff812f287d>] _atomic_dec_and_lock+0x4d/0x70
[<ffffffffa053c4f2>] nfs4_put_lock_state+0x32/0xb0 [nfsv4]
[<ffffffffa053c585>] nfs4_fl_release_lock+0x15/0x20 [nfsv4]
[<ffffffffa0522c06>] _nfs4_proc_getlk.isra.40+0x146/0x170 [nfsv4]
[<ffffffffa052ad99>] nfs4_proc_lock+0x399/0x5a0 [nfsv4]
The problem is that we don't clear the request->fl_ops after the first
try and so when we retry, nfs4_set_lock_state() exits early without
setting the lock stateid.
Regression introduced by commit 70cc6487a4e08b8698c0e2ec935fb48d10490162
(locks: make ->lock release private data before returning in GETLK case)
Reported-by: Weston Andros Adamson <dros@netapp.com>
Reported-by: Jorge Mora <mora@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need to pass the full open mode flags to nfs_may_open() when doing
a delegated open.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
|
|
On a CB_RECALL the callback service thread flushes the inode using
filemap_flush prior to scheduling the state manager thread to return the
delegation. When pNFS is used and I/O has not yet gone to the data server
servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
filemap_flush call, the LAYOUTGET must proceed to completion.
If the state manager starts to recover data while the inode flush is sending
the LAYOUTGET, a deadlock occurs as the callback service thread holds the
single callback session slot until the flushing is done which blocks the state
manager thread, and the state manager thread has set the session draining bit
which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
table waitq.
Separate the draining of the back channel from the draining of the fore channel
by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
and back slot tables. Drain the back channel first allowing the LAYOUTGET
call to proceed (and fail) so the callback service thread frees the callback
slot. Then proceed with draining the forechannel.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
This ensures that the server doesn't need to keep huge numbers of
lock stateids waiting around for the final CLOSE.
See section 8.2.4 in RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The main reason for doing this is will be to allow for an asynchronous
RPC mode that we can use for freeing lock stateids as per section
8.2.4 of RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If a NFS client receives a delegation for a file after it has taken
a lock on that file, we can currently end up in a situation where
we mistakenly skip unlocking that file.
The following patch swaps an erroneous check in nfs4_proc_unlck for
whether or not the file has a delegation to one which checks whether
or not we hold a lock stateid for that file.
Reported-by: Chuck Lever <Chuck.Lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.7]
Tested-by: Chuck Lever <Chuck.Lever@oracle.com>
|
|
Debugging aid to help identify servers that incorrectly apply open mode
checks to setattr requests that are not changing the file size.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The NFSv4 and NFSv4.1 specs are both clear that the server should only check
stateid open mode if a SETATTR specifies the size attribute. If the
open mode is not one that allows writing, then it returns NFS4ERR_OPENMODE.
In the case where the SETATTR is not changing the size, the client will
still pass it the delegation stateid to ensure that the server does not
recall that delegation. In that case, the server should _ignore_ the
delegation open mode, and simply apply standard permission checks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Fix up a conflict between the linux-next branch and mainline.
Conflicts:
fs/nfs/nfs4proc.c
|
|
* rpcsec_gss-from_cel: (21 commits)
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
NFSv4: Don't clear the machine cred when client establish returns EACCES
NFSv4: Fix issues in nfs4_discover_server_trunking
NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
NFS: Use server-recommended security flavor by default (NFSv3)
SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
NFS: Use "krb5i" to establish NFSv4 state whenever possible
NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
NFS: Use static list of security flavors during root FH lookup recovery
NFS: Avoid PUTROOTFH when managing leases
NFS: Clean up nfs4_proc_get_rootfh
NFS: Handle missing rpc.gssd when looking up root FH
SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
SUNRPC: Make gss_mech_get() static
SUNRPC: Refactor nfsd4_do_encode_secinfo()
SUNRPC: Consider qop when looking up pseudoflavors
SUNRPC: Load GSS kernel module by OID
SUNRPC: Introduce rpcauth_get_pseudoflavor()
SUNRPC: Define rpcsec_gss_info structure
NFS: Remove unneeded forward declaration
...
|
|
If we already checked the user access permissions on the original open,
then don't bother checking again on recovery. Doing so can cause a
deadlock with NFSv4.1, since the may_open() operation is not privileged.
Furthermore, we can't report an access permission failure here anyway.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If we're in a delegation recall situation, we can't do a delegated open.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
When we're doing open-by-filehandle in NFSv4.1, we shouldn't need to
do the cache consistency revalidation on the directory. It is
therefore more efficient to just use open_noattr, which returns the
file attributes, but not the directory attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Defensive patch to ensure that we copy the state->open_stateid, which
can never be set to the delegation stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Fix nfs4_select_rw_stateid() so that it chooses the open stateid
(or an all-zero stateid) if the delegation does not match the selected
read/write mode.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If we're doing NFSv4.1 against a server that has persistent sessions,
then we should not need to call SETATTR in order to reset the file
attributes immediately after doing an exclusive create.
Note that since the create mode depends on the type of session that
has been negotiated with the server, we should not choose the
mode until after we've got a session slot.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Currently, _nfs4_do_setattr() will use the delegation stateid if no
writeable open file stateid is available.
If the server revokes that delegation stateid, then the call to
nfs4_handle_exception() will fail to handle the error due to the
lack of a struct nfs4_state, and will just convert the error into
an EIO.
This patch just removes the requirement that we must have a
struct nfs4_state in order to invalidate the delegation and
retry.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Otherwise we deadlock if state recovery is initiated while we
sleep.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Don't hold the NFSv4 sequence id while we check for open permission.
The call to ACCESS may block due to reboot recovery.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
When we send a RENEW or SEQUENCE operation in order to probe if the
lease is still valid, we want it to be able to time out since the
lease we are probing is likely to time out too. Currently, because
we use soft mount semantics for these RPC calls, the return value
is EIO, which causes the state manager to exit with an "unhandled
error" message.
This patch changes the call semantics, so that the RPC layer returns
ETIMEDOUT instead of EIO. We then have the state manager default to
a simple retry instead of exiting.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Unify the error handling in nfs4_open_delegation_recall and
nfs4_lock_delegation_recall.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Make it symmetric with nfs4_lock_delegation_recall
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
All error cases are handled by the switch() statement, meaning that the
call to nfs4_handle_exception() is unreachable.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the open in this
instance
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
|
|
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the lock in this
instance.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
|
|
Most NFSv4 servers implement AUTH_UNIX, and administrators will
prefer this over AUTH_NULL. It is harmless for our client to try
this flavor in addition to the flavors mandated by RFC 3530/5661.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If the Linux NFS client receives an NFS4ERR_WRONGSEC error while
trying to look up an NFS server's root file handle, it retries the
lookup operation with various security flavors to see what flavor
the NFS server will accept for pseudo-fs access.
The list of flavors the client uses during retry consists only of
flavors that are currently registered in the kernel RPC client.
This list may not include any GSS pseudoflavors if auth_rpcgss.ko
has not yet been loaded.
Let's instead use a static list of security flavors that the NFS
standard requires the server to implement (RFC 3530bis, section
3.2.1). The RPC client should now be able to load support for
these dynamically; if not, they are skipped.
Recovery behavior here is prescribed by RFC 3530bis, section
15.33.5:
> For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to
> use the SECINFO operation since SECINFO requires a current
> filehandle and none exist for these two [sic] operations. Therefore,
> the client must iterate through the security triples available at
> the client and reattempt the PUTROOTFH or PUTPUBFH operation. In
> the unfortunate event none of the MANDATORY security triples are
> supported by the client and server, the client SHOULD try using
> others that support integrity. Failing that, the client can try
> using AUTH_NONE, but because such forms lack integrity checks,
> this puts the client at risk.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Currently, the compound operation the Linux NFS client sends to the
server to confirm a client ID looks like this:
{ SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) }
Once the lease is confirmed, it makes sense to know how long before
the client will have to renew it. And, performing these operations
in the same compound saves a round trip.
Unfortunately, this arrangement assumes that the security flavor
used for establishing a client ID can also be used to access the
server's pseudo-fs.
If the server requires a different security flavor to access its
pseudo-fs than it allowed for the client's SETCLIENTID operation,
the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC. Even
though the SETCLIENTID_CONFIRM succeeded, our client's trunking
detection logic interprets the failure of the compound as a failure
by the server to confirm the client ID.
As part of server trunking detection, the client then begins another
SETCLIENTID pass with the same nfs4_client_id. This fails with
NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM
already succeeded in confirming that client ID -- it was the
PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to
fail.
To address this issue, separate the "establish client ID" step from
the "accessing the server's pseudo-fs root" step. The first access
of the server's pseudo-fs may require retrying the PUTROOTFH
operation with different security flavors. This access is done in
nfs4_proc_get_rootfh().
That leaves the matter of how to retrieve the server's lease time.
nfs4_proc_fsinfo() already retrieves the lease time value, though
none of its callers do anything with the retrieved value (nor do
they mark the lease as "renewed").
Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time()
using the lease management security flavor. This may cause some
heartburn if that security flavor isn't the same as the security
flavor the server requires for accessing the pseudo-fs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
The long lines with no vertical white space make this function
difficult for humans to read. Add a proper documenting comment
while we're here.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
When rpc.gssd is not running, any NFS operation that needs to use a
GSS security flavor of course does not work.
If looking up a server's root file handle results in an
NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of
security flavors until one works or all reasonable flavors have
been tried. When rpc.gssd isn't running, this loop seems to fail
immediately after rpcauth_create() craps out on the first GSS
flavor.
When the rpcauth_create() call in nfs4_lookup_root_sec() fails
because rpc.gssd is not available, nfs4_lookup_root_sec()
unconditionally returns -EIO. This prevents nfs4_find_root_sec()
from retrying any other flavors; it drops out of its loop and fails
immediately.
Having nfs4_lookup_root_sec() return -EACCES instead allows
nfs4_find_root_sec() to try all flavors in its list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If the open_context for the file is not yet fully initialised,
then open recovery cannot succeed, and since nfs4_state_find_open_context
returns an ENOENT, we end up treating the file as being irrecoverable.
What we really want to do, is just defer the recovery until later.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
With unlink is an asynchronous operation in the sillyrename case, it
expects nfs4_async_handle_error() to map the error correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Now that we do CLAIM_FH opens, we may run into situations where we
get a delegation but don't have perfect knowledge of the file path.
When returning the delegation, we might therefore not be able to
us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN
stateids and locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Sometimes, we actually _want_ to do open-by-filehandle, for instance
when recovering opens after a network partition, or when called
from nfs4_file_open.
Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1,
and which is only enabled for NFSv4.1 servers that support it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Follow the practice described in section 8.2.2 of RFC5661: When sending a
read/write or setattr stateid, set the seqid field to zero in order to
signal that the NFS server should apply the most recent locking state.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Clean up the setting of the nfs_server->caps, by shoving it all
into nfs4_server_common_setup().
Then add an 'initial capabilities' field into struct nfs4_minor_version_ops.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Adds logic to ensure that if the server returns a BAD_STATEID,
or other state related error, then we check if the stateid has
already changed. If it has, then rather than start state recovery,
we should just resend the failed RPC call with the new stateid.
Allow nfs4_select_rw_stateid to notify that the stateid is unstable by
having it return -EWOULDBLOCK if an RPC is underway that might change the
stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If we replay a READ or WRITE call, we should not be changing the
stateid. Currently, we may end up doing so, because the stateid
is only selected at xdr encode time.
This patch ensures that we select the stateid after we get an NFSv4.1
session slot, and that we keep that same stateid across retries.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If state recovery fails with an ESTALE or a ENOENT, then we shouldn't
keep retrying. Instead, mark the stateid as being invalid and
fail the I/O with an EIO error.
For other operations such as POSIX and BSD file locking, truncate
etc, fail with an EBADF to indicate that this file descriptor is no
longer valid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
In order to be able to safely return the layout in nfs4_proc_setattr,
we need to block new uses of the layout, wait for all outstanding
users of the layout to complete, commit the layout and then return it.
This patch adds a helper in order to do all this safely.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
|
|
We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
where the two are out of sync.
The first half of the problem is to ensure that pnfs_layoutcommit_inode
clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
We still need to keep the reference to those segments until the RPC call
is finished, so in order to make it clear _where_ those references come
from, we add a helper pnfs_list_write_lseg_done() that cleans up after
pnfs_list_write_lseg.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
|
|
The client will currently try LAYOUTGETs forever if a server is returning
NFS4ERR_LAYOUTTRYLATER or NFS4ERR_RECALLCONFLICT - even if the client no
longer needs the layout (ie process killed, unmounted).
This patch uses the DS timeout value (module parameter 'dataserver_timeo'
via rpc layer) to set an upper limit of how long the client tries LATOUTGETs
in this situation. Once the timeout is reached, IO is redirected to the MDS.
This also changes how the client checks if a layout is on the clp list
to avoid a double list_add.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
If we don't release the open seqid before we wait for state recovery,
then we may end up deadlocking the state recovery thread.
This patch addresses a new deadlock that was introduced by
commit c21443c2c792cd9b463646d982b0fe48aa6feb0f (NFSv4: Fix a reboot
recovery race when opening a file)
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
This fixes an oops where a LAYOUTGET is in still in the rpciod queue,
but the requesting processes has been killed. Without this, killing
the process does the final pnfs_put_layout_hdr() and sets NFS_I(inode)->layout
to NULL while the LAYOUTGET rpc task still references it.
Example oops:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
IP: [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4]
PGD 7365b067 PUD 7365d067 PMD 0
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: nfs_layout_nfsv41_files nfsv4 auth_rpcgss nfs lockd sunrpc ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ip6table_filter ip6_tables ppdev e1000 i2c_piix4 i2c_core shpchp parport_pc parport crc32c_intel aesni_intel xts aes_x86_64 lrw gf128mul ablk_helper cryptd mptspi scsi_transport_spi mptscsih mptbase floppy autofs4
CPU 0
Pid: 27, comm: kworker/0:1 Not tainted 3.8.0-dros_cthon2013+ #4 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa01bd586>] [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4]
RSP: 0018:ffff88007b0c1c88 EFLAGS: 00010246
RAX: ffff88006ed36678 RBX: 0000000000000000 RCX: 0000000ea877e3bc
RDX: ffff88007a729da8 RSI: 0000000000000000 RDI: ffff88007a72b958
RBP: ffff88007b0c1ca8 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a72b958
R13: ffff88007a729da8 R14: 0000000000000000 R15: ffffffffa011077e
FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000080 CR3: 00000000735f8000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 27, threadinfo ffff88007b0c0000, task ffff88007c2fa0c0)
Stack:
ffff88006fc05388 ffff88007a72b908 ffff88007b240900 ffff88006fc05388
ffff88007b0c1cd8 ffffffffa01a2170 ffff88007b240900 ffff88007b240900
ffff88007b240970 ffffffffa011077e ffff88007b0c1ce8 ffffffffa0110791
Call Trace:
[<ffffffffa01a2170>] nfs4_layoutget_prepare+0x7b/0x92 [nfsv4]
[<ffffffffa011077e>] ? __rpc_atrun+0x15/0x15 [sunrpc]
[<ffffffffa0110791>] rpc_prepare_task+0x13/0x15 [sunrpc]
Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|