- 06-27-2011 #1
- Join Date
- Oct 2008
Whole cluster freezes if one of the NFS servers freezes
We have a cluster with a couple of disk servers, compute nodes, and a head node. The disk servers are NFS-mounted on all the compute nodes and on the head node.
The problem is that if any one of the disk servers hangs, the rest of the cluster freezes too, which is dangerous. Ideally, the cluster should ignore the hung server and keep running normally, except that files from that server would not be visible.
Is there any way to keep one hung server from freezing the whole cluster?
- 06-28-2011 #2
- Join Date
- Apr 2009
- I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
Here is perhaps the relevant mount option information from the nfs man page:
soft / hard
Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.
NB: A so-called "soft" timeout can cause silent data corruption in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity. Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option.
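As a rough sketch only, this is how those options might look in /etc/fstab on a compute node. The server name, export path, mount point, and the timeo and retrans values are placeholders made up for illustration, not taken from the cluster described above, so adjust them to your own setup:

```
# /etc/fstab entry (hypothetical host, paths, and values)
# soft       fail the request with an error instead of retrying forever
# proto=tcp  use NFS over TCP, as the man page suggests
# timeo=100  wait 10 seconds before a retry (timeo is in tenths of a second)
# retrans=3  number of retries before the soft mount gives up
diskserver1:/export/data  /mnt/data  nfs  soft,proto=tcp,timeo=100,retrans=3  0  0
```

With an entry like this, a process that touches /mnt/data while diskserver1 is hung should eventually get an I/O error instead of blocking indefinitely. Keep the man page's warning in mind: an application that ignores that error can corrupt data silently, so soft mounts fit best where the data is read-mostly or the jobs handle I/O failures properly.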