nfsd: hold a lighter-weight client reference over CB_RECALL_ANY
author Jeff Layton <jlayton@kernel.org>
Fri, 5 Apr 2024 17:56:18 +0000 (13:56 -0400)
committer Chuck Lever <chuck.lever@oracle.com>
Fri, 5 Apr 2024 18:05:35 +0000 (14:05 -0400)
Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the
client. While a callback job is technically an RPC, that counter is
really meant for client-driven RPCs, and holding it prevents the
client from being unhashed until the callback completes.

If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we
can end up in a situation where the callback can't complete on the (now
dead) callback channel, but the new client can't connect because the old
client can't be unhashed. This usually manifests as an NFS4ERR_DELAY
return on the CREATE_SESSION operation.

The job only holds a reference to the client so that it can clear a flag
after the RPC completes. Fix this by having CB_RECALL_ANY instead take a
reference on cl_nfsdfs.cl_ref. Typically we only take that sort of
reference when dealing with the nfsdfs info files, but it should work
appropriately here to ensure that the nfs4_client doesn't disappear.
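
The difference between the two reference types can be sketched with a
small user-space model. This is only an illustration under stated
assumptions: struct model_client and every model_* function are
hypothetical names invented for this sketch, not kernel code, and the
rule "unhashing is refused while the RPC-user count is nonzero" is a
simplification of the behavior described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's struct nfs4_client. */
struct model_client {
	int rpc_users;           /* stands in for cl_rpc_users */
	int lifetime_ref;        /* stands in for cl_nfsdfs.cl_ref */
	bool hashed;             /* client is still findable by new sessions */
	bool recall_any_pending; /* stands in for NFSD4_CLIENT_CB_RECALL_ANY */
	bool freed;              /* set once the last lifetime ref is dropped */
};

static void model_client_init(struct model_client *c)
{
	c->rpc_users = 0;
	c->lifetime_ref = 1;     /* initial reference held by the owner */
	c->hashed = true;
	c->recall_any_pending = false;
	c->freed = false;
}

/* Assumed rule: unhashing is refused while any RPC user is counted. */
static bool model_try_unhash(struct model_client *c)
{
	if (c->rpc_users > 0)
		return false;    /* a new client would be told to retry */
	c->hashed = false;
	return true;
}

/* Drop one lifetime reference; the object goes away on the last put. */
static void model_put_lifetime(struct model_client *c)
{
	assert(c->lifetime_ref > 0);
	if (--c->lifetime_ref == 0)
		c->freed = true;
}

/* Old scheme: the callback pins the client with an RPC-user count. */
static void model_queue_recall_any_old(struct model_client *c)
{
	c->rpc_users++;
	c->recall_any_pending = true;
}

static void model_release_old(struct model_client *c)
{
	c->recall_any_pending = false;
	c->rpc_users--;
}

/* New scheme: the callback only keeps the object alive. */
static void model_queue_recall_any_new(struct model_client *c)
{
	c->lifetime_ref++;
	c->recall_any_pending = true;
}

static void model_release_new(struct model_client *c)
{
	c->recall_any_pending = false;
	model_put_lifetime(c);
}
```

In this model, a callback queued the old way blocks model_try_unhash()
until model_release_old() runs, which never happens if the callback
channel is dead. Queued the new way, the client can be unhashed
immediately, and the structure is only freed once model_release_new()
drops the last reference.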

Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Vladimir Benes <vbenes@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
fs/nfsd/nfs4state.c

index 5fcd93f7cb8c7d1b8560beaa2ca1245a2abc60fb..3cef81e196c6862f0577090b18499e4f379204fd 100644
@@ -3042,12 +3042,9 @@ static void
 nfsd4_cb_recall_any_release(struct nfsd4_callback *cb)
 {
        struct nfs4_client *clp = cb->cb_clp;
-       struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
 
-       spin_lock(&nn->client_lock);
        clear_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
-       put_client_renew_locked(clp);
-       spin_unlock(&nn->client_lock);
+       drop_client(clp);
 }
 
 static int
@@ -6616,7 +6613,7 @@ deleg_reaper(struct nfsd_net *nn)
                list_add(&clp->cl_ra_cblist, &cblist);
 
                /* release in nfsd4_cb_recall_any_release */
-               atomic_inc(&clp->cl_rpc_users);
+               kref_get(&clp->cl_nfsdfs.cl_ref);
                set_bit(NFSD4_CLIENT_CB_RECALL_ANY, &clp->cl_flags);
                clp->cl_ra_time = ktime_get_boottime_seconds();
        }