  1. #1
    Just Joined!
    Join Date
    Jun 2008
    Posts
    33

    Transfer 1.1GB file from one server to another by splitting then joining

    I'm trying to transfer a large .tgz file from a CentOS dedicated server to a Linux webhost (unknown distribution).

    The problem is that the webhost will not allow a 1.1 GB file to be uploaded; however, it will allow the upload in 149 MB chunks.

    I used the split command to cut my tgz into 8 segments, each under 150 MB. I then uploaded all the segments via FTP, which worked. Then I joined the segments to recreate the original tgz. The join appears to work with no issues.

    However, when I try to extract the tgz there is a problem: most, but not all, of the files are extracted, and I get this error message:

    Code:
    gzip: stdin: Input/output error
    tar: Unexpected EOF in archive
    tar: Unexpected EOF in archive
    tar: Error is not recoverable: exiting now
    It appears the join did not work and the tgz is slightly corrupt. What am I doing wrong? Here are the commands I'm using:

    1. Create the original tgz on the dedicated server
    Code:
    tar -czf mysite.tgz ./myfolder
    2. Split the tgz into segments
    Code:
    split -b 149m -d mysite.tgz seg
    
    # using the -d switch so the segment files use a numerical suffix
    
    # I now have these files:
    
    seg00
    seg01
    seg02
    seg03
    seg04
    seg05
    seg06
    seg07
    3. Transfer segments to the other webhost using FTP

    Code:
    # hand typing (not a script)
    
    ftp ftp.mysite.com
    myusername
    mypassword
    binary
    cd somefolder
    
    put seg00
    put seg01
    put seg02
    # through to seg07
    
    bye
    4. Join up the segments on the new webhost
    Code:
    # this is in a .sh script file
    cd /full/path/to/somefolder
    cat seg* > mysite.tgz
    5. Extract the new tgz
    Code:
    # this is in a .sh script file
    cd /full/path/to/somefolder
    tar -xzf mysite.tgz
    
    # the above error is now thrown
    That's it. What am I doing wrong that's causing the above error?

    Many thanks
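One way to narrow down where the damage happens is to compare checksums before and after the transfer. The following is only a sketch, not something from the thread: it assumes `cksum` (POSIX; `md5sum` would serve equally well), builds a small stand-in archive instead of the real 1.1 GB one, uses a hypothetical `dest` directory in place of the remote host, and `parts.cksum` is an invented file name.

```shell
#!/bin/sh
# Sketch: verify each segment and the joined archive with checksums.
set -eu
work=$(mktemp -d)
cd "$work"

# --- source server: build a small stand-in for mysite.tgz ---
mkdir myfolder
head -c 1126400 /dev/urandom > myfolder/data.bin
tar -czf mysite.tgz ./myfolder

orig_sum=$(cksum < mysite.tgz)          # checksum of the whole archive
split -b 149k -d mysite.tgz seg         # 149k here instead of 149m
cksum seg[0-9][0-9] > parts.cksum       # one checksum line per segment

# --- "transfer": a local copy stands in for the FTP upload ---
mkdir dest
cp seg[0-9][0-9] parts.cksum dest/
cd dest

# --- destination: check every segment, then join and check the whole ---
if cksum seg[0-9][0-9] | cmp -s - parts.cksum; then
    segments_ok=yes
else
    segments_ok=no                      # some segment changed in transit
fi

cat seg[0-9][0-9] > mysite.tgz
joined_sum=$(cksum < mysite.tgz)

# gzip -t reads the whole compressed stream without extracting anything
if gzip -t mysite.tgz; then gzip_ok=yes; else gzip_ok=no; fi

echo "segments intact: $segments_ok"
echo "archive checksum matches: $([ "$orig_sum" = "$joined_sum" ] && echo yes || echo no)"
echo "gzip stream valid: $gzip_ok"
```

If the per-segment checksums match but the joined file does not, the problem lies in the join step or in the destination file system rather than in the FTP transfer.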

  2. #2
    tpl
    Linux User
    Join Date
    Jan 2007
    Location
    cleveland
    Posts
    452
    "cat seg* > mysite.tgz"

    this will leave mysite.tgz holding only the last seg*: try with ">>" instead
    the sun is new every day (heraclitus)

  3. #3
    Just Joined!
    Join Date
    Jun 2008
    Posts
    33
    Hi tpl, many thanks for your help, it appears to have solved the problem.

    Someone suggested using this for loop to join the segments, because it will ensure that they are joined in the correct order:

    Code:
    # bash-specific C-style loop; appends seg00 through seg07 in numeric order
    for ((i=0;i<8;i++))
    do
        echo "Processing seg0${i}"
        cat "seg0${i}" >> mysite.tgz
    done
    Having tried that, I get the same result as before (not all files extracted), but there is no error.

    Anyway, I'll use the cat seg* >> mysite.tgz method in future. Thanks.
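One thing to watch with the `>>` variants: if a mysite.tgz from an earlier attempt is still in the folder, appending to it yields a file that no longer matches the original. The sketch below illustrates that under stated assumptions: the tiny `seg00` to `seg07` files are placeholders, and `join_segments` and `appended.tgz` are hypothetical names. It also relies on the shell expanding the glob in sorted order, which gives the right sequence for the zero-padded suffixes that `split -d` produces.

```shell
#!/bin/sh
# Sketch: '>' is safe to rerun, '>>' accumulates across runs.
set -eu
work=$(mktemp -d)
cd "$work"

# Placeholder segments standing in for seg00..seg07 (6 bytes each)
for i in 0 1 2 3 4 5 6 7; do
    printf 'part%s\n' "$i" > "seg0$i"
done

join_segments() {
    # The shell truncates mysite.tgz once, then cat writes every
    # segment into it in sorted glob order.
    cat seg[0-9][0-9] > mysite.tgz
}

join_segments
first_size=$(($(wc -c < mysite.tgz)))
join_segments                           # simulate running the script again
second_size=$(($(wc -c < mysite.tgz)))

# Contrast: two runs that append to the same output file
cat seg[0-9][0-9] >> appended.tgz
cat seg[0-9][0-9] >> appended.tgz
appended_size=$(($(wc -c < appended.tgz)))

echo "with >  : $first_size bytes, then $second_size bytes after a rerun"
echo "with >> : $appended_size bytes after two runs"
```

Deleting any stale output first (`rm -f mysite.tgz`) makes the `>>` loop equally safe to rerun.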

  4. #4
    Rubberman
    Linux Guru
    Join Date
    Apr 2009
    Location
    I can be found either 40 miles west of Chicago, or in a galaxy far, far away.
    Posts
    8,970
    Actually, tpl is incorrect. "cat seg* >mysite.tgz" will do the same as "cat seg* >>mysite.tgz" as long as mysite.tgz did not exist previously. I am not 100% positive, but I think that split is a text/line-oriented tool and it is removing the new-lines from your files. I actually wrote a binary splitter tool a looonnng time ago (about 20 years) that I use and have built on many different operating systems without modification (Windows, QNX, Unix, Linux) called 'bsplit'. The source code is attached. Enjoy.

    Only one caveat: since I wrote it so long ago, the options and their arguments are not separated by spaces, as they would be if I were to redo it. That is, use "-imysite.tgz" instead of the current practice, which would be "-i mysite.tgz".
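Both points in this post can be checked locally. With `-b`, `split` cuts at byte offsets and does not interpret line endings, so a split and rejoin should round-trip binary data unchanged. The sketch below tests that on data packed with newlines and NUL bytes, and compares `>` with `>>` when the output file does not yet exist. It assumes a `split` that supports `-d` (as used earlier in the thread); `original.bin` and the `joined_*` names are made up for the example.

```shell
#!/bin/sh
# Sketch: does split -b preserve newlines, and do '>' and '>>' differ
# when the target file does not exist yet?
set -eu
work=$(mktemp -d)
cd "$work"

# Test data with runs of newlines, a NUL byte, and random binary content
{
    printf 'a\n\n\nb\000\n'
    head -c 5000 /dev/urandom
    printf '\n\n'
} > original.bin

split -b 1k -d original.bin seg

cat seg[0-9][0-9] >  joined_truncate.bin   # create/truncate, then write
cat seg[0-9][0-9] >> joined_append.bin     # create if absent, then append

# cmp exits 0 only if the two files are byte-for-byte identical
cmp original.bin joined_truncate.bin && echo "'>' output is identical to the original"
cmp original.bin joined_append.bin   && echo "'>>' output is identical to the original"
```

If both comparisons succeed on the machine in question, neither the redirection operator nor `split` itself is the cause of the corruption.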
    Sometimes, real fast is almost as good as real time.
    Just remember, Semper Gumbi - always be flexible!
