#StackBounty: #linux #ubuntu #nfs #nfs4 NFS v4 server is causing stale file handle, but only when bind mount is a subdirectory

Bounty: 50

This problem is driving me insane at this point. I have an Ubuntu 16.04 NFS server that was working fine with this configuration:

/etc/fstab:
UUID=b6bd34a3-f5af-4463-a515-be0b0b583f98  /data2  xfs  rw,relatime  0  0
/data2  /srv/nfs/cryodata    none    defaults,bind    0  0
/usr/local       /srv/nfs/local    none    defaults,bind    0  0

and

/etc/exports
/srv/nfs  192.168.159.31(rw,sync,fsid=0,crossmnt,no_subtree_check)
/srv/nfs/cryodata  192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/local      192.168.159.31(rw,sync,no_subtree_check)

This has worked fine for months with the one NFS client set up so far, which uses these client-side /etc/fstab entries:

kraken.bio.univ.edu:/local  /usr/local  nfs4  _netdev,auto  0  0
kraken.bio.univ.edu:/cryodata  /cryodata  nfs4  _netdev,auto  0  0

However, since this is a very large storage server, it was decided that it needs to accommodate several labs. So I moved everything that had been scattered across the /data2 partition into a /data2/cryodata subdirectory, and updated /etc/fstab and /etc/exports on the server as follows:

/etc/fstab:
...
/data2/cryodata  /srv/nfs/cryodata    none    defaults,bind    0  0
/data2/xray      /srv/nfs/xray    none    defaults,bind    0  0
/data2/EM        /srv/nfs/EM    none    defaults,bind    0  0
/usr/local       /srv/nfs/local    none    defaults,bind    0  0

and

/etc/exports
/srv/nfs  192.168.159.31(rw,sync,fsid=0,crossmnt,no_subtree_check)
/srv/nfs/cryodata  192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/EM  192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/xray  192.168.159.31(rw,sync,no_subtree_check)
/srv/nfs/local  192.168.159.31(rw,sync,no_subtree_check)

This simply does not work! When I try to mount the new export on the client using the same client /etc/fstab entries, it fails with a stale file handle:

{nfs client} /etc/fstab:
...
kraken.bio.univ.edu:/local  /usr/local  nfs4  _netdev,auto  0  0
kraken.bio.univ.edu:/cryodata  /cryodata  nfs4  _netdev,auto  0  0

# mount -v /cryodata
mount.nfs4: timeout set for Sat Feb 24 09:24:38 2018
mount.nfs4: trying text-based options 'addr=192.168.41.171,clientaddr=192.168.159.31'
mount.nfs4: mount(2): Stale file handle
mount.nfs4: trying text-based options 'addr=192.168.41.171,clientaddr=192.168.159.31'
mount.nfs4: mount(2): Stale file handle
mount.nfs4: trying text-based options 'addr=128.83.41.171,clientaddr=129.116.159.31'
...

/usr/local continues to mount without problems. The first time I tried this I forgot to unexport and re-export the filesystems with exportfs -var before making changes, but since then I’ve switched back and forth, being careful to unexport and umount everything, with several server reboots in between. The original setup, a bind mount of the entire partition, always works, and the bind mount of a subdirectory fails with the stale file handle message every time. I’ve also tried enabling other NFS clients that have never mounted these partitions, and they get exactly the same error message, so this is definitely a server-side problem. I’ve checked /var/lib/nfs/etab to make sure it’s cleared out between mount attempts, etc.
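
The unexport, umount, and re-export cycle described above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names `run` and `reset_exports` and the `DRY_RUN` switch are hypothetical, and the `mount -a` step assumes every bind mount is listed in the server's /etc/fstab. The real commands need root on the NFS server.

```shell
#!/bin/sh
# Sketch of the reset cycle. With DRY_RUN=1 the commands are printed
# instead of executed, so the order can be reviewed safely.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Usage: reset_exports BIND_MOUNT_POINT...
reset_exports() {
    run exportfs -ua || return 1            # unexport everything
    run cat /var/lib/nfs/etab || return 1   # expected to be empty at this point
    for dir in "$@"; do
        run umount "$dir" || return 1       # drop each bind mount
    done
    run mount -a || return 1                # assumption: remount from /etc/fstab
    run exportfs -ra || return 1            # re-export everything in /etc/exports
    run exportfs -v                         # list active exports and their options
}
```

For example, setting `DRY_RUN=1` and calling `reset_exports /srv/nfs/cryodata /srv/nfs/xray /srv/nfs/EM` prints the commands without touching the server, which is a cheap way to confirm the order before running it for real.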

I thought the technique of bind mounting into an NFS server root directory resolved all these kinds of issues, but apparently not. The odd thing is that /usr/local is also a subdirectory of another partition, and it always mounts fine. It is on an ext3 md RAID 1, although I can’t imagine this matters.
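
Since /usr/local (which works) and /data2/cryodata (which fails) are both subdirectories, one difference worth ruling out is that the three new bind sources all live on the same XFS filesystem. Below is a minimal sketch for checking that, assuming GNU coreutils `stat`. The function names `same_filesystem` and `compare_sources` are hypothetical, and whether a shared backing filesystem has anything to do with the stale handle is an unverified guess.

```shell
#!/bin/sh
# Compare device numbers to see whether two paths share a backing filesystem.
# Assumes GNU coreutils stat (-c '%d'). Returns 0 = same, 1 = different,
# 2 = one of the paths could not be examined.
same_filesystem() {
    dev_a=$(stat -c '%d' "$1") || return 2
    dev_b=$(stat -c '%d' "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Usage: compare_sources REFERENCE_PATH BIND_SOURCE...
# Prints one verdict line per bind source.
compare_sources() {
    ref=$1
    shift
    for src in "$@"; do
        rc=0
        same_filesystem "$ref" "$src" || rc=$?
        case $rc in
            0) echo "$src: same filesystem as $ref" ;;
            1) echo "$src: different filesystem from $ref" ;;
            *) echo "$src: could not be examined" ;;
        esac
    done
}
```

Running `compare_sources /data2 /data2/cryodata /data2/xray /data2/EM /usr/local` on the server described here should report the first three as the same filesystem and /usr/local as a different one. exports(5) documents an `fsid=` option for giving an export an explicit identifier; whether setting a distinct value per export changes the behavior here is untested.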

I’ve spent hours on this and have almost broken Google looking for a solution, to no avail.

