  1. #1
    Just Joined!
    Join Date
    Jan 2007
    Posts
    4

    Problem using mpich2


    Please help me with MPICH2. I have installed it on two systems and am using one of them as the root node.
    All the server and client checks are working, but when I start mpd I get the following errors.

    1)
    On the other node:
    $ mpd -n -h hydhtc27314 -p 45265 &

    # Here hydhtc27314 is the hostname of the root node and 45265 is the port number.

    # The errors are:

    hydhtc48139_43341: conn error in connect_rhs: Connection refused
    hydhtc48139_43341 (connect_rhs 942): failed to connect to rhs at 127.0.0.1 45265
    hydhtc48139_43341 (enter_ring 849): rhs connect failed
    hydhtc48139_43341 (run 24: failed to enter ring

    2)
    Meanwhile, the root node shows the following at the same time:

    hydhtc27314_45265 (handle_rhs_input 1087): lost rhs; re-entering ring
    hydhtc27314_45265 (reenter_ring 806): reenter_ring rc=0 after numTries=1
    hydhtc27314_45265 (handle_rhs_input 1092): back in ring

    Please explain in detail what I should do. Thank you.
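One detail in the first log may be worth checking: mpd was started with -h hydhtc27314, yet the error reports an attempt to reach 127.0.0.1 on port 45265. A possible reading is that the root node's hostname resolves to a loopback address on the other node (for example through an /etc/hosts entry), although the log alone does not prove this. The Python sketch below, using only the standard library, reports what a hostname resolves to and flags loopback results. The function names and the script name are hypothetical and are not part of MPICH2.

```python
import ipaddress
import socket
import sys


def resolve_ipv4(hostname):
    """Return the sorted list of IPv4 addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_loopback(address):
    """True if address lies in the loopback range (127.0.0.0/8 for IPv4)."""
    return ipaddress.ip_address(address).is_loopback


def check_host(hostname):
    """Describe how hostname resolves on this machine."""
    try:
        addresses = resolve_ipv4(hostname)
    except socket.gaierror as exc:
        # This is the same class of failure as "Name or service not known".
        return f"{hostname}: does not resolve ({exc})"
    loopback = [a for a in addresses if is_loopback(a)]
    if loopback:
        return f"{hostname}: resolves to loopback {loopback}"
    return f"{hostname}: resolves to {addresses}"


if __name__ == "__main__":
    # Pass the hostnames to check as command-line arguments.
    for name in sys.argv[1:]:
        print(check_host(name))
```

Saved under a name such as check_resolve.py (hypothetical), it could be run on each node as `python check_resolve.py hydhtc27314 hydhtc48139`. If the root node's name comes back as a loopback address on the other node, name resolution there would be the first thing to look at.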

  2. #2
    Just Joined!
    Join Date
    Jan 2007
    Posts
    4


    I have been able to start the daemons on both machines, but I still cannot run a program across both machines at the same time. Please help.

    Example:
    $ mpiexec -n 2 /home/mpich2-install/sample_mpi

    It gives the following errors:
    1) On the root node:

    [unset]: connect failed with connection refused
    [unset]: Unable to connect to 10.136.125.3 on 33767
    [unset]: aborting job:
    Fatal error in MPI_Init: Other MPI error, error stack:
    MPIR_Init_thread(247): Initialization failed
    MPID_Init(71)........: channel initialization failed
    MPID_Init(274).......: PMI_Init returned -1

    2) On the other node:

    hydhtc48139_39060 (mpd_sockpair 226): connect -2 Name or service not known
    hydhtc48139_39060 (mpd_sockpair 233): connect error with -2 Name or service not known
    hydhtc48139_mpdman_1 (mpd_sockpair 226): connect -2 Name or service not known
    hydhtc48139_mpdman_1 (mpd_sockpair 233): connect error with -2 Name or service not known
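The two logs above show two different kinds of failure: "connection refused" when reaching 10.136.125.3 on port 33767, and "Name or service not known", which is a name-resolution error rather than a connection error. The Python sketch below is one way to tell the two apart from either node. It is only a diagnostic aid under the assumption that plain TCP reachability is the thing to test. The function name can_connect is hypothetical, and the port mpd uses changes between runs, so the values from the log are examples only.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a TCP connection to host:port and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.gaierror as exc:
        # The hostname could not be resolved at all
        # (compare "Name or service not known" in the log).
        return False, f"name resolution failed: {exc}"
    except OSError as exc:
        # The name resolved but the connection failed: refused, timed out,
        # or unreachable (compare "connect failed with connection refused").
        return False, f"connect failed: {exc}"


if __name__ == "__main__":
    # Example values copied from the log above; replace them with the
    # host and port that the running mpd actually reports.
    print(can_connect("10.136.125.3", 33767))
```

A "name resolution failed" result points at hostname lookup on that node, while "connect failed" points at the network path or at nothing listening on that port, possibly a firewall between the machines.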
