  1. #1
    Just Joined!
    Join Date
    Nov 2006
    Posts
    9

    tar compression issue

    I'm running a nightly backup script of my home directory using tar. The directory size is 1.4G, but regardless of the compression method (z, j, Z), I'm only compressing down to about 1.2G.

    What can I do to get further compression if possible? I thought I could compress down to a third or half the original size.

    Thanks.

  2. #2
    Linux Guru
    Join Date
    Nov 2007
    Posts
    1,695
    Data Compression

    Compression is only as effective as the data being compressed. Things like text files compress well because they have a lot of repetitive data in them. Files that have *already* been compressed or have mostly random data don't compress much.

    JPEGs and MPEGs are examples of file types that have already been compressed. Binaries also tend to compress less well than plain text.
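
    As a rough illustration of the point above, here is a minimal sketch that gzips 1 MiB of repetitive text and 1 MiB of random bytes and prints both compressed sizes. The temporary directory and file names are made up for the demo, and it assumes a system with /dev/urandom and a head that accepts -c.

```shell
#!/bin/sh
# Compare how well gzip shrinks repetitive text versus random bytes.
# "text" and "random" are throwaway files in a temporary directory.
tmp=$(mktemp -d)

# 1 MiB of highly repetitive text
yes "the quick brown fox" | head -c 1048576 > "$tmp/text"
# 1 MiB of random data, which stands in for already-compressed files
head -c 1048576 /dev/urandom > "$tmp/random"

# Compressed sizes in bytes (tr strips padding some wc versions add)
text_gz=$(gzip -c "$tmp/text" | wc -c | tr -d ' ')
random_gz=$(gzip -c "$tmp/random" | wc -c | tr -d ' ')

echo "repetitive text: 1048576 -> $text_gz bytes"
echo "random data:     1048576 -> $random_gz bytes"
rm -rf "$tmp"
```

    If most of a home directory is photos, video, or existing archives, it will behave much more like the second case than the first.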

  3. #3
    Linux Guru Jonathan183
    Join Date
    Oct 2007
    Posts
    2,941
    You might have more luck using something like rsync and archiving only the differences. Although you have 1.4GB, it is unlikely that all of it changes each day. Something like backintime may also be worth a look.

  4. #4
    Just Joined!
    Join Date
    Nov 2006
    Posts
    9
    The script is one of those scripts that takes an incremental backup daily and a full backup weekly. I chose to use tar rather than rsync specifically because I needed to compress for disk space. I would think that the data could be compressed more, but I guess everything is working fine. I was just curious if I was missing something.

    Thanks for your time.

  5. #5
    Linux Guru Jonathan183
    Join Date
    Oct 2007
    Posts
    2,941
    I was thinking you could use something like rsync to determine the differences and then tar only those. backintime was just an example of a tool that uses rsync, though it may have done what you needed.

  6. #6
    Linux User
    Join Date
    Jun 2007
    Posts
    318
    You may be able to improve compression by splitting the compression into its own process and using (in the case of gzip) the --best or -9 option.

    Code:
    tar -cf - /path/to/home | gzip --best > backup.tgz
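
    To see how much a higher level actually buys, a sketch like the following compares gzip's default level with --best on the same tar stream. The sample data (a file of numbers from seq) and the paths are placeholders for the demo; point the tar commands at the real directory to get a meaningful number.

```shell
#!/bin/sh
# Compare gzip's default level with --best on one tar stream.
# "data" is a throwaway sample directory, not the real backup source.
work=$(mktemp -d)
mkdir "$work/data"
seq 1 200000 > "$work/data/numbers.txt"

# Size of the uncompressed tar stream, then of each compressed variant
raw=$(tar -cf - -C "$work" data | wc -c | tr -d ' ')
default=$(tar -cf - -C "$work" data | gzip | wc -c | tr -d ' ')
best=$(tar -cf - -C "$work" data | gzip --best | wc -c | tr -d ' ')

echo "uncompressed tar: $raw bytes"
echo "gzip (default):   $default bytes"
echo "gzip --best:      $best bytes"
rm -rf "$work"
```

    On data that is already compressed, expect the two gzip figures to be nearly identical, with --best mostly costing extra CPU time.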

  7. #7
    Just Joined!
    Join Date
    Nov 2006
    Posts
    9
    Jonathan183 - I believe the script already looks for changes and only runs a backup of those. As a matter of fact, I'm sure of it. I've posted the script below:

    Code:
    #!/bin/sh
    # this program creates a "daily" tar incremental backup of the data specified
    # in SRC it puts the result in DEST
    # NOTE: Here "daily" doesn't mean anything - it just corresponds to
    # each time the program is run (this batch script is written to be run
    # each night, but that's not necessary)
    # everything must be locally mounted
    SRC="/home/stirling/work/"
    # don't forget the trailing "/" in DEST
    DEST="/disks/sdd1/backup/work/"
    
    # This is the number of days to wait after the last full backup
    # before making a new full backup
    DAYSINWEEK=7
    
    # This is the file which stores the days since the last full backup
    DAYSSINCELASTWEEKFILE=${DEST}days_since_last_week_backup
    
    DEST_TEMP=${DEST}temp.tgz
    DEST_DAY_PART=${DEST}day
    DEST_WEEK=${DEST}week.tgz
    DEST_FILELIST=${DEST}filelist
    DEST_FILELIST_TEMP=${DEST}filelist.temp
    
    # check to see if the source even exists
    if [ -d $SRC ]
    then
    
      # check to see if the destination exists... if not, then create it
      # (-p also creates any missing parent directories)
      if [ ! -d "$DEST" ]
      then
        mkdir -p "$DEST"
      fi
    
      # still check to see if the destination directory exists, and if so, make
      #it writeable
      if [ -d $DEST ]
      then
        chmod u+w $DEST
        cd $DEST
        chmod u+w *
      else
        echo "Cannot create destination directory" 
        exit 1
      fi
    
      # We are going to do the weekly full backup, if necessary
      # first see if it is necessary
      if [ -f $DAYSSINCELASTWEEKFILE ]
      then
        DAYSSINCELASTWEEK=`cat $DAYSSINCELASTWEEKFILE`
      else
        DAYSSINCELASTWEEK=$DAYSINWEEK
      fi
      DAYSSINCELASTWEEK=`expr $DAYSSINCELASTWEEK + 1`  
      if [ $DAYSSINCELASTWEEK -gt $DAYSINWEEK ]
      then
      
        # we are going to rotate backups... reset the count variable    
        DAYSSINCELASTWEEK=0
        
        # we need to make sure to delete any partial old backups
        if [ -f $DEST_TEMP ]
        then
          rm $DEST_TEMP
        fi
        if [ -f $DEST_FILELIST_TEMP ]
        then
          rm $DEST_FILELIST_TEMP
        fi
    
        # now make the temporary full backup (complete with a filelist)
        tar zcvpf $DEST_TEMP -g $DEST_FILELIST_TEMP $SRC >> /dev/null  
        if [ ! $? -eq 0 ]
        then
          echo "backup failed"
          exit 1
        fi
    
        # we have a successful full backup - we need to remove the old backups
        if [ -f $DEST_WEEK ]
        then
          rm $DEST_WEEK
        fi
        if [ -f $DEST_FILELIST ]
        then
          rm $DEST_FILELIST
        fi
        mv $DEST_TEMP $DEST_WEEK
        mv $DEST_FILELIST_TEMP $DEST_FILELIST
        if [ -f "${DEST_DAY_PART}1.tgz" ]
        then
          rm ${DEST_DAY_PART}*
        fi
        
        
        # put a timestamp into the log file
        # (printf is used because "echo -n" is not portable across /bin/sh implementations)
        printf "Weekly backup: " >> ${DEST}log
        date >> ${DEST}log
        
      fi
      # write the day count out
      echo $DAYSSINCELASTWEEK > $DAYSSINCELASTWEEKFILE
      
      
      # We are going to do the daily backups, if necessary
      # first see if it is necessary
      if [ $DAYSSINCELASTWEEK -gt 0 ]
      then
    
        DEST_DAY="${DEST_DAY_PART}${DAYSSINCELASTWEEK}.tgz"
    
        # we need to make sure to delete any partial old backups
        if [ -f $DEST_TEMP ]
        then
          rm $DEST_TEMP
        fi
        if [ -f $DEST_FILELIST_TEMP ]
        then
          rm $DEST_FILELIST_TEMP
        fi
    
        # now make the daily incremental backup (using the filelist)
        cp $DEST_FILELIST $DEST_FILELIST_TEMP
        tar zcvpf $DEST_TEMP -g $DEST_FILELIST_TEMP $SRC >> /dev/null  
        if [ ! $? -eq 0 ]
        then
          echo "backup failed"
          exit 1
        fi
        if [ -f $DEST_FILELIST ]
        then
          rm $DEST_FILELIST
        fi
        mv $DEST_TEMP $DEST_DAY
        mv $DEST_FILELIST_TEMP $DEST_FILELIST
        touch $DEST_DAY
        touch $DEST_FILELIST
    
        # put a timestamp on the backup
        date >> ${DEST}log
    
      fi
      
      # change the backup directory back to nonwriteable
      cd $DEST
      chmod a-w *
      chmod a-w $DEST
    
    else
      echo "error: directory $SRC does not exist"
      exit 1
    fi
    
    echo "Success!!!"
    exit 0

    vsemaska - I appreciate the response and will try that out myself to see if it works better.

    Thanks all.
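
    For anyone reading along, the incremental part of the script above relies on tar's -g option. Here is a small self-contained sketch of how that snapshot file behaves, including a restore. It assumes GNU tar (other tar implementations may not support -g), and every path in it is a throwaway name under a temporary directory rather than anything from the script.

```shell
#!/bin/sh
# Demo of GNU tar's -g (--listed-incremental) snapshot file.
# All paths are throwaway names created under a temporary directory.
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
echo "original" > "$work/data/old.txt"

# Level 0: the snapshot file does not exist yet, so everything is archived.
tar -czf "$work/full.tgz" -g "$work/snapshot" -C "$work" data

sleep 1   # make sure the next file is clearly newer than the snapshot
echo "added later" > "$work/data/new.txt"

# Level 1: same snapshot file, so only changes since the last run are stored.
tar -czf "$work/day1.tgz" -g "$work/snapshot" -C "$work" data
day1_list=$(tar -tzf "$work/day1.tgz")
echo "$day1_list"

# Restore: extract the full archive first, then each incremental in order.
tar -xzf "$work/full.tgz" -g /dev/null -C "$work/restore"
tar -xzf "$work/day1.tgz" -g /dev/null -C "$work/restore"
restored=$(ls "$work/restore/data")
echo "$restored"
rm -rf "$work"
```

    Because the second archive holds only what changed, the daily files are usually small. The 1.2G figure most likely comes from the weekly full backup, which is where the compressibility of the data matters.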
