Problem with MPICH2 installation (clusters)
I am using MPICH2 to set up a cluster.
I have been able to start the daemons on both machines of a two-node cluster, but I cannot run a program on both machines simultaneously. Please help.
The mpdtrace command shows that both nodes are up.
$ mpiexec -n 2 /home/mpich2-install/sample_mpi
# Here 2 is the number of processes: the first runs on the root node
# and the second on the second node.
It gives the following errors:
1) On the root node:
[unset]: connect failed with connection refused
[unset]: Unable to connect to 10.136.125.3 on 33767
[unset]: aborting job:
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(247): Initialization failed
MPID_Init(71)........: channel initialization failed
MPID_Init(274).......: PMI_Init returned -1
2) On the other node:
hydhtc48139_39060 (mpd_sockpair 226): connect -2 Name or service not known
hydhtc48139_39060 (mpd_sockpair 233): connect error with -2 Name or service not known
hydhtc48139_mpdman_1 (mpd_sockpair 226): connect -2 Name or service not known
hydhtc48139_mpdman_1 (mpd_sockpair 233): connect error with -2 Name or service not known
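
The "-2 Name or service not known" text in the second node's output matches the resolver error EAI_NONAME on glibc systems, which suggests that a hostname lookup is failing on that machine. As a possible first diagnostic (a minimal sketch, not a confirmed fix), the Python script below checks whether a given hostname resolves on the node where it is run. The function name `resolves` and the script name `check_hosts.py` are made up for illustration. The only hostname taken from the logs above is hydhtc48139.

```python
import socket
import sys


def resolves(hostname):
    """Return True if the local resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the lookup fails, for example with EAI_NONAME
        # (-2, "Name or service not known" on glibc) if the name is
        # found neither in /etc/hosts nor in DNS.
        return False


if __name__ == "__main__":
    # Hostnames are passed on the command line, e.g. the names that
    # mpdtrace prints. Nothing is checked if no arguments are given.
    for name in sys.argv[1:]:
        status = "resolves" if resolves(name) else "does NOT resolve"
        print("%s: %s" % (name, status))
```

It would be run on each node with the hostnames of both machines as arguments, for example `python check_hosts.py hydhtc48139` plus the other node's hostname. If a name does not resolve on one of the nodes, adding it to that node's /etc/hosts (or fixing DNS) is the usual next thing to try.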