  1. #1
    Just Joined!
    Join Date
    Sep 2008
    Posts
    5

    Shell script to overwrite a local file if it differs from a remote server's file?

    I'm new to shell scripting, so I'm looking for pointers or an example to go off of.

    I'm looking to create a shell script that other people can run on their servers to download a few of my HTTP server's small static-content files if there's a difference in content.

    So this script would need to:
    • Get a certain file from a remote server via HTTP
    • Get the local file of the same name (if there is one)
    • Compare the remote and local files
    • If there's a difference (or the local file doesn't exist), download the remote file, overwriting the old local file


    What's the best way to go about this? Have you seen anything like it before? Thanks for your help!

  2. #2
    Linux Guru
    Join Date
    Nov 2007
    Location
    Córdoba (Spain)
    Posts
    1,513
    Is it mandatory that you build it yourself?

    I ask because there are lots of applications that can do that. This is called synchronization, and one example of such a tool is rsync. There's no need to reinvent the wheel, unless it's for academic purposes.

  3. #3
    Just Joined!
    Join Date
    Sep 2008
    Posts
    5
    No, nothing's mandatory, I just thought it'd be a good idea to offer people an easy script to deploy since people are hotlinking my static-content. I don't mind the hotlinking really, but it's getting costly so I'd like to offer an easy alternative. Thing is, I'm terrible in the shell... I just know a little bit to get around somewhat + search engines.

    So rsync, you say. It looks promising, but I'm not sure if it'll work for this. My web server would be accessed over HTTP, but I don't see an rsync example that uses HTTP. (I should have specified that, I'll edit the first post.)

    Thoughts?

  4. #4
    Linux Guru
    Join Date
    Nov 2007
    Location
    Córdoba (Spain)
    Posts
    1,513
    Quote Originally Posted by domcat View Post
    No, nothing's mandatory, I just thought it'd be a good idea to offer people an easy script to deploy since people are hotlinking my static-content. I don't mind the hotlinking really, but it's getting costly so I'd like to offer an easy alternative. Thing is, I'm terrible in the shell... I just know a little bit to get around somewhat + search engines.

    So rsync, you say. It looks promising, but I'm not sure if it'll work for this. My web server would be accessed over HTTP, but I don't see an rsync example that uses HTTP. (I should have specified that, I'll edit the first post.)

    Thoughts?
    You would need to fire up an rsync server as well. It shouldn't be too difficult, though.

    The problem here will always be the same: you need a proper way to check whether a file needs to be downloaded or not, and rsync and similar tools can do that.

    Your original idea might seem lighter to you, but in fact it's much worse in terms of network traffic.

    To accomplish what you asked in the first place, you need to download ALL the files entirely, then compare them locally and choose which one to keep. As you might have already guessed, at the point where you compare the two files you already have all the files downloaded on the client box, so you'd better overwrite them all directly and save yourself the checks.

    If you don't mind people downloading the whole stuff each time, then they can use wget -r <url> or any equivalent tool.

    The good point about rsync and similar tools is that you don't need to download everything, saving a lot of bandwidth if the changes to your tree are minimal (or not too big).

  5. #5
    Just Joined!
    Join Date
    Sep 2008
    Posts
    5
    at the point where you compare the two files, you already have all the files downloaded in the client box, so, you'd better overwrite them all directly, and save yourself the checks.
    True, but the goal of the original idea is to avoid getting an unintentionally zero-length file in transmission. (Maybe wget or similar can check for that?)

    So I still need to automate this in a shell script so it can be run at least weekly through a cron job (again, I'll update the first post, I didn't mention cron).

    Unfortunately, an rsync server doesn't look like it'd happen with my current development cycle and platform -- but thanks, it will come in handy later.

    So it looks like I'll ultimately be needing this:
    • wget or similar command to download a file over HTTP
    • a command to overwrite the local file (if present) with the downloaded one (if the downloaded file is not zero length)
    • one last thing: how would I put the contents of a remote file into a variable (the file contains only a number)?

  6. #6
    Linux Guru
    Join Date
    Nov 2007
    Location
    Córdoba (Spain)
    Posts
    1,513
    Quote Originally Posted by domcat View Post
    So it looks like I'll ultimately be needing this:
    • wget or similar command to download a file over HTTP
    • a command to overwrite the local file (if present) with the downloaded one (if the downloaded file is not zero length)
    Then I'd suggest the following:

    1. Download everything using wget -r to a temporary directory (make sure you rm -rf it at the beginning of each execution).
    2. Use find with -size 0 -exec rm -f '{}' \; to delete empty files from that directory.
    3. Copy or move the files in the temp dir to the base one, overwriting all the remaining files (zero-length files were already removed in the previous step).
    4. Delete the temp dir.


    That's your basic script. Ask if you need help with concrete steps.
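
    The four steps above could be sketched roughly like this. It is only a sketch under assumptions: the function names, the URL and both directory paths are placeholders, and the extra wget options (-np, -nH, -P) are my own choice to keep the downloaded tree tidy, not something required by the steps.

```shell
#!/bin/sh
# Sketch of the four steps. All names and paths here are hypothetical.

# fetch_tree URL TMP_DIR
# Step 1: start from a clean temporary directory and download into it.
fetch_tree() {
    rm -rf "$2"
    mkdir -p "$2"
    # -r: recursive, -np: don't ascend to the parent directory,
    # -nH: don't create a directory named after the host,
    # -P: directory to save into, -q: no progress output.
    wget -q -r -np -nH -P "$2" "$1"
}

# install_nonempty TMP_DIR DEST_DIR
# Steps 2 to 4: drop empty files, copy the rest, clean up.
install_nonempty() {
    tmp_dir=$1
    dest_dir=$2
    # Step 2: delete zero-length files so they never replace a good copy.
    find "$tmp_dir" -type f -size 0 -exec rm -f '{}' \;
    # Step 3: copy whatever is left over the local files.
    mkdir -p "$dest_dir"
    cp -R "$tmp_dir"/. "$dest_dir"/
    # Step 4: remove the temporary directory.
    rm -rf "$tmp_dir"
}

# Example call (placeholders, adjust before use):
# fetch_tree "http://whatever.domain.com/static/" /tmp/static.new &&
#     install_nonempty /tmp/static.new /var/www/static
```

    Splitting the download from the install step means a failed wget run (note the &&) leaves the existing local files alone.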

    • one last thing: how would I put the contents of a remote file into a variable (the file contains only a number)?
    Code:
    wget -q -O - http://whatever.domain.com/my_file.txt
    This writes the file to standard output. To put it into a variable, you can do:

    Code:
    var=$(wget -q -O - http://whatever.domain.com/my_file.txt)
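
    Since the worry earlier in the thread was an empty download, it may be worth checking the value before using it. A small sketch, assuming the file really holds nothing but digits; fetch_number and is_number are names I made up for illustration:

```shell
#!/bin/sh
# fetch_number URL: print the body of URL on standard output.
fetch_number() {
    wget -q -O - "$1"
}

# is_number STRING: succeed only if STRING is non-empty and all digits.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example (placeholder URL from the post above):
# var=$(fetch_number http://whatever.domain.com/my_file.txt)
# if is_number "$var"; then
#     echo "remote value: $var"
# else
#     echo "unexpected or empty reply" >&2
# fi
```

    The $(...) substitution strips trailing newlines, so a file containing a number followed by a newline still passes the check.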

  7. #7
    Just Joined!
    Join Date
    Sep 2008
    Posts
    5
    So I've discovered the -i (or --input-file) option for wget, which downloads a list of files in one shot. It saves me from having to use a bunch of wget commands or having to loop.
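
    For anyone following along, this is roughly how I'm using it. The list file name, base URL, file names and the write_url_list helper are just placeholders for illustration:

```shell
#!/bin/sh
# write_url_list LIST_FILE BASE_URL NAME...
# Write one URL per line, which is the format wget -i reads.
write_url_list() {
    list_file=$1
    base_url=$2
    shift 2
    : > "$list_file"
    for name in "$@"; do
        printf '%s/%s\n' "$base_url" "$name" >> "$list_file"
    done
}

# Example (placeholders, not run here):
# write_url_list files.txt http://whatever.domain.com/static logo.png style.css
# wget -q -i files.txt -P /tmp/static.new
```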

    Thanks to your help I'm well on my way, but I've hit one little scripting snag...

    How can I get the current path (to put into a variable)?

    Edit, to answer my own question:
    Code:
    DIR=`pwd`

  8. #8
    Linux Guru
    Join Date
    Nov 2007
    Location
    Córdoba (Spain)
    Posts
    1,513
    Quote Originally Posted by domcat View Post
    Thanks to your help I'm well on my way, but I've hit one little scripting snag...

    How can I get the current path (to put into a variable)?

    The "pwd" command does just that, so you could put its output into a var like this:

    var=$(pwd)
    However, the shell itself maintains a $PWD variable which always holds the current path, so you could just use that variable in your script and save yourself some typing.
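
    A quick way to see that both give the same result. One caveat if the script runs from cron: the job usually does not start in the directory you expect, so it is safer to cd somewhere explicit first. The target_dir value below is only a stand-in so the sketch runs anywhere:

```shell
#!/bin/sh
# Hypothetical working directory; replace with the real content directory.
target_dir=/tmp

# Change directory explicitly instead of trusting where the script was started.
cd "$target_dir" || exit 1

dir_from_command=$(pwd)   # output of the pwd command
dir_from_variable=$PWD    # variable kept up to date by the shell

echo "pwd command:   $dir_from_command"
echo "\$PWD variable: $dir_from_variable"
```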

  9. #9
    Just Joined!
    Join Date
    Sep 2008
    Posts
    5
    Any performance (or other) reason(s) to use -q on wget in a shell script?

  10. #10
    Linux Guru
    Join Date
    Nov 2007
    Location
    Córdoba (Spain)
    Posts
    1,513
    Quote Originally Posted by domcat View Post
    Any performance (or other) reason(s) to use -q on wget in a shell script?
    It depends on your needs. You can disable wget's output if you plan to program a frontend or something like that, or if you just consider it useless, unneeded info.

    There shouldn't be any -noticeable- performance hit.
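
    For a cron job specifically, one possible pattern is to keep wget quiet and print a message only when the download fails, since cron normally mails any output the job produces. A sketch, with download_quietly being a made-up helper name:

```shell
#!/bin/sh
# download_quietly URL FILE
# Silent on success; one line on standard error if wget reports a failure.
download_quietly() {
    if ! wget -q -O "$2" "$1"; then
        echo "download failed: $1" >&2
        return 1
    fi
}

# Example (placeholder URL and path, not run here):
# download_quietly http://whatever.domain.com/my_file.txt /tmp/my_file.txt.new
```

    Note that -O creates or truncates the target file right away, so it is better pointed at a temporary name than at the live file.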
