  1. #1
    Just Joined!
    Join Date
    Jun 2005
    Posts
    47

    rsh problems


    I have a fileserver running FC4 that stores CAD files (about 40,000 files) for an engineering group. This server is simply a mirror of our main file server in another state, so we have to sync the files every few hours for things to work correctly. To do this, we use a shell script that uses rsh and rcp to collect a list of files and copy them over to the local fileserver. This setup was working flawlessly with FC3, but ever since I installed FC4 the script hangs (the rsh process becomes defunct) at the first rsh command every time it is launched by crond. The weird thing is that I can manually launch the script and it works without a problem.

    Does anyone have any ideas what could be causing this? I need to get it fixed because I'm having to run the updates manually right now and it's starting to get old.

  2. #2
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    It sounds like the rsh process is terminating for some reason and its parent isn't waiting for it (hence the defunct status). Are you getting any diagnostics in the logs, or emails from cron, that might give a clue as to what's happening? There may be something in your environment that enables you to run the script successfully that cron doesn't have. If there is, find it, and you're home and dry!
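
One way to act on this suggestion is to capture the environment that cron actually provides and compare it with an interactive shell's. The following is only a minimal sketch: the `dump_env` helper and the /tmp file names are hypothetical and not part of the original script.

```shell
#!/bin/sh
# dump_env FILE
# Hypothetical helper: write the current environment, sorted, to FILE
# so that two runs (interactive vs. cron) can be compared with diff.
dump_env() {
    env | sort > "$1"
}

# Assumed usage (file names are illustrative only):
#   from an interactive shell:        dump_env /tmp/env.interactive
#   at the top of the cron-run job:   dump_env /tmp/env.cron
#   afterwards:                       diff /tmp/env.interactive /tmp/env.cron
```

Variables such as PATH, HOME, or SHELL that differ between the two files are the first candidates to check.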

  3. #3
    anomie
    Linux Guru
    Join Date
    Mar 2005
    Location
    Texas
    Posts
    1,692
    Sorry to be curt, but don't use rsh / rcp in the first place. There are obviously secure alternatives: ssh / scp.

    Who knows, maybe fedora project has deprecated rsh / rcp. (Just speculating.)

  4. #4
    scm
    Linux Engineer
    Join Date
    Feb 2005
    Posts
    1,044
    Quote Originally Posted by anomie
    Sorry to be curt, but don't use rsh / rcp in the first place. There are obviously secure alternatives: ssh / scp.

    Who knows, maybe fedora project has deprecated rsh / rcp. (Just speculating.)
    rsh and rcp are fine if you're running in a secure environment. It'd be very arrogant of the Fedora guys (or anyone) to banish useful commands (and UNIX standard) just because they don't think you're capable of using them safely. (Yes, I know they'd like to banish Windoze .... )

  5. #5
    Just Joined!
    Join Date
    Jun 2005
    Posts
    47
    Quote Originally Posted by anomie
    Sorry to be curt, but don't use rsh / rcp in the first place. There are obviously secure alternatives: ssh / scp.

    Who knows, maybe fedora project has deprecated rsh / rcp. (Just speculating.)
    I'm fully aware that ssh and scp are more secure, and I would be using them if I had my way. The problem is that this is a big corporation that still uses old Unix servers and the fileserver replication is only supported by the IT group via rsh/rcp.

    The good news is that this is on a secure network and I have firewall rules to limit access to rsh/rlogin/rcp to a specific IP address.

    Quote Originally Posted by scm
    It sounds like the rsh process is terminating for some reason and its parent isn't waiting for it (hence the defunct status). Are you getting any diagnostics in the logs, or emails from cron, that might give a clue as to what's happening? There may be something in your environment that enables you to run the script successfully that cron doesn't have. If there is, find it, and you're home and dry!
    Nothing is showing up in the logs, and I'm not getting e-mail from cron because the cron job never finishes. You can let it try to finish for days and it will still be hung and will never send an e-mail. BTW, the scripts and cron job have to be executed by a specific user with a specific UID for the whole thing to work. I have checked that cron is launching the script as that user, and it is, so I don't know what could be wrong.
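
Because a hung job never reaches the point where cron sends mail, one option is to bound the remote call so that it fails visibly instead of waiting forever. The sketch below rests on assumptions: it uses the GNU coreutils `timeout` command (which may not exist on older releases such as FC4), and it detaches stdin because cron jobs run without a terminal (rsh's `-n` option is intended for the same purpose). The `run_bounded` name and the 600-second limit are hypothetical.

```shell
#!/bin/sh
# run_bounded SECONDS COMMAND [ARGS...]
# Runs COMMAND with stdin read from /dev/null and stops it after SECONDS.
# Returns the command's exit status, or 124 when the limit is hit
# (124 is the status GNU timeout reports on expiry).
run_bounded() {
    limit=$1
    shift
    timeout "$limit" "$@" < /dev/null
}

# Assumed usage inside the replication script:
#   run_bounded 600 /usr/bin/rsh XXX.XXX.XXX.XXX /mailbox/to-nhe/script-$SERVER \
#       || echo "rsh failed or timed out (status $?)" >> "$ERRORLOG"
```

With a bound in place the job ends, so cron can mail whatever was written to stdout or stderr.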

  6. #6
    Just Joined!
    Join Date
    Jun 2005
    Posts
    47
    Is there any way this could be a cron problem rather than an rsh problem? I've read several websites that discuss how cron can be really screwy at times.

  7. #7
    Just Joined!
    Join Date
    Jun 2005
    Posts
    47
    Someone please help me!

  8. #8
    anomie
    Linux Guru
    Join Date
    Mar 2005
    Location
    Texas
    Posts
    1,692
    Would it be possible to post the script? (You could obfuscate the IPs.)

    Maybe a new set of eyes can spot the problem. It's a little suspicious that it worked ok under FC3, but who knows.

  9. #9
    Just Joined!
    Join Date
    Jun 2005
    Posts
    47
    I cannot post the whole script, but I can post the relevant snippet that is causing problems.

    Code:
    SERVER=$1
    #
    #  Abort if no server name was given
    #
    if [ "$SERVER" = "" ] ; then
       echo "SERVER name missing - Aborting"
       exit 1
    fi
    #
    SCRIPTLOC=/home/egnhxfr/scripts
    SETUP=$SCRIPTLOC/bu_data.setup
    CMNUSG=`grep ",$SERVER," $SETUP | cut -d',' -f3`
    DATA=`grep ",$SERVER," $SETUP | cut -d',' -f4`
    DWGS=`grep ",$SERVER," $SETUP | cut -d',' -f5`
    MAILTO=`grep ",$SERVER," $SETUP | cut -d',' -f6`
    TODAY=$(date +%y%m%d)
    #
    COPYLST=$SCRIPTLOC/bu_requests
    TMPdir=$SCRIPTLOC/bu
    TMPScript=$TMPdir/script-$SERVER
    ERRORLOG="$TMPdir/replicate.bu_data.errorlog.$TODAY"
    #
    #  Clean up the "deleted" files
    #
    rm -f $CMNUSG/links_del/* 2>/dev/null
    rm -rf $TMPdir 2>/dev/null
    mkdir $TMPdir
    #
    NOW=$(date)
    echo "Starting cron.replicate.bu_data: $NOW" >> $ERRORLOG
    echo "" >> $ERRORLOG
    #
    #  Make temp script to run on primary server
    #
    echo 'cd /mailbox/to-nhe' > $TMPScript
    awk '{print "ls "$0".*  >> PROE_PARTS.bur"}' $COPYLST >> $TMPScript
    echo 'grep bu_crawler /bu/users2/tdsdb/lists/PROE_PARTS.bu | egrep "(\.prt\.|\.asm\.|\.lay\.|\.drw\.)" >> PROE_PARTS.bur' >> $TMPScript
    echo 'grep bu_common_parts /bu/users2/tdsdb/lists/PROE_PARTS.bu | egrep "(\.prt\.|\.asm\.|\.lay\.|\.drw\.)" >> PROE_PARTS.bur' >> $TMPScript
    echo 'grep "racine_drive_trains_hin/crawler/hydrostatic" /bu/users2/tdsdb/lists/PROE_PARTS.dt | egrep "(\.prt\.|\.asm\.|\.lay\.|\.drw\.)" >> PROE_PARTS.bur' >> $TMPScript
    echo 'egrep -v "(_history|_pdbase|_trans_sbm|submission_forms)" PROE_PARTS.bur > PROE_PARTS.bu1' >> $TMPScript
    echo 'sort -t "." -k1,1 -k2,2 -k3,3nr PROE_PARTS.bu1 | sort -t "." -mu -k1,2 > PROE_PARTS.bur' >> $TMPScript
    chmod 777 $TMPScript
    /usr/bin/rcp $TMPScript XXX.XXX.XXX.XXX:/mailbox/to-nhe
    #
    #  Run the script on primary server then retrieve the file
    #
    
    /usr/bin/rsh XXX.XXX.XXX.XXX /mailbox/to-nhe/script-$SERVER # This is where everything goes wrong

    The last line is where the script hangs. It will not get past that line no matter what I try.

    BTW, I'm launching this as user "egnhxfr" with cron. My crontab entry looks like this:

    Code:
    30 * * * /home/egnhxfr/scripts/cron.replicate.bu_data ca

  10. #10
    anomie
    Linux Guru
    Join Date
    Mar 2005
    Location
    Texas
    Posts
    1,692
    Just a few questions:

    1. For the rsh command, why are you using script-$SERVER instead of $TMPScript? edit: Never mind - I see now.

    2. Could you add some error checking immediately after the rcp command at the bottom? Like:
    Code:
    RC1=$?
    
    if [ "$RC1" -ne 0 ]
    then 
      echo "rcp returned a code of $RC1. Aborting now..." >&2
      exit 1
    fi
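
The same check can be wrapped in a small function so that both the rcp and the rsh step report their exit codes. This is a sketch only, and `check_rc` is a hypothetical name. Note that rsh generally does not pass back the remote command's exit status, so a zero code mainly shows that rsh itself completed.

```shell
#!/bin/sh
# check_rc LABEL STATUS
# Prints a message to stderr and returns 1 when STATUS is non-zero;
# returns 0 otherwise so the caller can decide whether to abort.
check_rc() {
    if [ "$2" -ne 0 ]; then
        echo "$1 returned a code of $2. Aborting now..." >&2
        return 1
    fi
    return 0
}

# Assumed usage, placed directly after each remote step:
#   (run rcp or rsh here)
#   check_rc rcp $? || exit 1
```

Sending the message to stderr means cron includes it in its mail once the job exits.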
