  1. #1
    Just Joined! sjc06007's Avatar
    Join Date
    Jul 2011
    Location
    USA
    Posts
    9

    bash backup script advice


    Hi All,
    I wanted to learn some bash scripting and get more familiar with syntax, style, conventions, etc. At a high level, the goal of this script is to sync in one direction. The user must supply two directories as input: the reference directory, and the target directory that is made to match it. On the terminal the syntax is the following:
    Code:
    ./backmeup ~/reference/ ~/target/
    I have tested the script and have noticed the following things:
    --When one supplies two directories where the reference contains only files (no subfolders) and the target directory is empty, the script does nothing. I believe this comes from the line:
    Code:
    TARGET_DIRECTORY_LIST=$(find . -type d | sed -e 's/.\(.*\)/\1/; s:^\/::' | sort -d)
    Where find does not include the pwd ... tips?
    --Any other sanity checks?
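    On the first point, a likely cause is that the first sed expression turns the `.` entry printed by find into an empty line, and because IFS is set to a newline, word splitting then discards that empty entry, so the top-level directory never reaches the loops. Below is a minimal sketch of one way around it, assuming it is acceptable to keep the top level as `.`; `list_dirs` and `mirror_dirs` are hypothetical helper names, not part of the script.

```shell
#!/bin/bash
# list_dirs is a hypothetical helper, not part of the original script.
# It prints every directory under ROOT relative to ROOT, one per line,
# and keeps ROOT itself as "." rather than turning it into an empty line.
list_dirs() {
    local root=$1
    # 's:^\./::' strips only a leading "./", so a bare "." passes through.
    # LC_ALL=C makes the sort order independent of the user's locale.
    (cd "$root" && find . -type d | sed -e 's:^\./::' | LC_ALL=C sort)
}

# mirror_dirs (also hypothetical) recreates the reference tree in the target.
mirror_dirs() {
    local reference=$1 target=$2 dir
    # Reading line by line keeps names with spaces intact without changing IFS.
    list_dirs "$reference" | while IFS= read -r dir; do
        mkdir -p "$target/$dir"
    done
}
```

    With this shape, `$target/.` is simply the target itself, so a reference that holds only files still produces one entry to process. Directory names containing newlines would still break the line-based loop.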
    Any suggestions or optimizations are always appreciated. I will try and post output of the `tree` command to further help. Here is the script:

    Code:
    #!/bin/bash
    # This function mirrors the reference directory onto the target directory via checking and 
    # copying each file into the target directory. If the file is not present it will
    # be copied to the target. If the file exists and is newer in the reference directory, then it 
    # will be copied again to the target directory, thus updating it. If the file exists and is older
    # in the reference directory, then the file will be skipped. The target directory should be 
    # identical to the supplied reference, including all subdirectories, files and attributes.
    
    IFS=$'\n'
    
    if [ $# -ne 2 ] ; then 
    	echo "NEED_TWO_ARGS"
    	echo "Usage: backmeup [REFERENCE_DIR] [TARGET_DIR]" 
    	echo "Example: ./backmeup.sh ~/reference/ ~/target/"
    	exit
    elif [[ ! -d "$1" || ! -d "$2" ]] ; then
    	echo "BAD_DIR_NAMES"
    	echo "Usage: backmeup [REFERENCE_DIR] [TARGET_DIR]"
    	echo "Example: ./backmeup.sh ~/reference/ ~/target/"
    	exit
    fi
    
    REFERENCE=$1
    TARGET=$2
    
    # Once the two directories are set this function will sync in direction: ref-->target
    function backup_single_directory { 
    
    	REF_FILES=$(ls $CURRENT_REFERENCE_DIR | sort -d)	# does not list .hidden files
    	TAR_FILES=$(ls $CURRENT_TARGET_DIR    | sort -d)
    	cd $CURRENT_TARGET_DIR
    	for ref_file in $REF_FILES  ; do
    		if [ ! -e $ref_file ] ||
    		   [[  -e $ref_file   &&  $CURRENT_REFERENCE_DIR$ref_file -nt $CURRENT_TARGET_DIR$ref_file ]] ; then
    			cp -r -v $CURRENT_REFERENCE_DIR/$ref_file $CURRENT_TARGET_DIR/$ref_file
    		fi
    	done
    	echo "updated target directory: $CURRENT_TARGET_DIR"
    	echo "diff $CURRENT_REFERENCE_DIR $CURRENT_TARGET_DIR"
    	diff $CURRENT_REFERENCE_DIR $CURRENT_TARGET_DIR | \	# sed parses the output of diff into rm
    		sed -e 's:^Only in\ :/:; s/:\ /\//; s:\/\/:\/:g;		 
    			s:\ :\\ :g; s:\,:\\,:g; s:(:\\(:; s:):\\):' | \		
    		xargs -0 -I{} rm -rfv {}
    }
    
    
    cd $TARGET
    TARGET_DIRECTORY_LIST=$(find . -type d | sed -e 's/.\(.*\)/\1/; s:^\/::' | sort -d) # first sed command removes first char
    #echo $TARGET_DIRECTORY_LIST
    
    
    cd $REFERENCE
    REFERENCE_DIRECTORY_LIST=$(find . -type d | sed -e 's/.\(.*\)/\1/; s:^\/::' | sort -d)
    #echo $REFERENCE_DIRECTORY_LIST
    
    # check to make sure the directory structure is the same as the reference
    for dir in $REFERENCE_DIRECTORY_LIST ; do
    	if [ ! -d $TARGET$dir ] ; then
    		mkdir -pv $TARGET$dir
    	fi
    done
    for dir in $TARGET_DIRECTORY_LIST ; do
    	if [[ -d $TARGET$dir && ! -d $REFERENCE$dir ]] ; then
    		rm -rfv $TARGET$dir
    	fi
    done
    
    
    # check and copy files over
    for dir in $REFERENCE_DIRECTORY_LIST ; do
    	cd $REFERENCE$dir
    	wegotzfilez=$(ls -lA | egrep '^-' | wc --bytes)		# directory listings begin with 'd'
    	if [ $wegotzfilez -gt 0 ] ; then
    
    		for file in `ls` ; do
    			CURRENT_REFERENCE_DIR=$REFERENCE$dir	# must be set
    			CURRENT_TARGET_DIR=$TARGET$dir		# before call
    			backup_single_directory
    			echo "$CURRENT_REFERENCE_DIR "
    			echo "$CURRENT_TARGET_DIR "
    		done
    	fi
    done
    
    echo
    echo
    echo
    echo
    echo "Finished job..."
    I found sed too much fun and couldn't resist using it as often as possible!

    Thanks all!
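
    The copy rule described in the script's header comment (copy a file when it is missing from the target, or when the reference copy is newer) can be sketched per directory without parsing `ls`. This is only an illustration under stated assumptions: `sync_files` is a hypothetical name, and it handles regular files only.

```shell
#!/bin/bash
# sync_files is a hypothetical helper, not taken from the original script.
# It copies each regular file from REF_DIR to TGT_DIR when the target copy
# is missing or older than the reference copy.
sync_files() {
    local ref_dir=$1 tgt_dir=$2 ref_file name
    # The second pattern picks up .hidden files, which a plain `ls` skips.
    # (Names starting with ".." are not matched by it.)
    for ref_file in "$ref_dir"/* "$ref_dir"/.[!.]*; do
        [ -f "$ref_file" ] || continue      # skips directories and unmatched patterns
        name=${ref_file##*/}
        if [ ! -e "$tgt_dir/$name" ] || [ "$ref_file" -nt "$tgt_dir/$name" ]; then
            cp -p "$ref_file" "$tgt_dir/$name"   # -p preserves mtime and mode
        fi
    done
}
```

    Quoting every expansion is what lets file names with spaces and parentheses, like the LaTeX books in the test data, pass through safely. Because `cp -p` keeps the modification time, a second run sees the target as up to date and copies nothing.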

  2. #2
    Just Joined! sjc06007's Avatar
    Join Date
    Jul 2011
    Location
    USA
    Posts
    9

    Cool

    Also, please consider this as some testing data:

    Code:
    $ tree -d reference/ target/
    reference/
    ├── directory
    │   └── dir
    │       └── dir
    │           └── dir
    │               └── dir
    ├── directory2
    ├── directory3
    └── directory_____TEST
    target/
    ├── directory
    │   └── dir
    │       └── dir
    │           └── dir
    │               └── dir
    ├── directory2
    ├── directory3
    └── directory_____TEST
    And with files:

    Code:
    $ tree reference/ target/
    reference/
    ├── directory
    │   └── dir
    │       ├── dir
    │       │   └── dir
    │       │       └── dir
    │       │           └── deep_file
    │       └── testfile2
    ├── directory2
    │   ├── test_dir2
    │   └── test_dir22
    ├── directory3
    │   └── test_dir3
    ├── directory_____TEST
    ├── Mittelbach - The LaTeX Companion 2e (AW, 2004).djvu
    ├── Oetiker - The Not so Short Introduction to LaTeX2e (2003).pdf
    ├── Syropoulos - Digital Typography Using LaTeX (Springer, 2003).djvu
    └── testfile1
    target/
    ├── directory
    │   ├── dir
    │   │   ├── dir
    │   │   │   └── dir
    │   │   │       └── dir
    │   │   │           └── deep_file
    │   │   └── testfile2
    │   └── testfile2
    ├── directory2
    │   ├── test_dir2
    │   └── test_dir22
    ├── directory3
    │   └── test_dir3
    ├── directory_____TEST
    ├── Mittelbach - The LaTeX Companion 2e (AW, 2004).djvu
    ├── Oetiker - The Not so Short Introduction to LaTeX2e (2003).pdf
    └── Syropoulos - Digital Typography Using LaTeX (Springer, 2003).djvu
    
    16 directories, 18 files
    That looks better in the terminal...
    Last edited by sjc06007; 07-13-2011 at 02:50 PM.

  3. #3
    Just Joined! sjc06007's Avatar
    Join Date
    Jul 2011
    Location
    USA
    Posts
    9
    Ideally, I'd like to traverse the directories in a depth-first-search fashion. I found a script that does it but I am unsure how to redesign/integrate it....
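
    For what it's worth, `find` already walks the tree depth-first: by default it reports a directory before its contents, and `-depth` reverses that. If a hand-rolled walk is wanted anyway, a recursive function is one way to get it. The sketch below is not the script mentioned above; `walk` is a hypothetical name, and it skips hidden directories and symlinks.

```shell
#!/bin/bash
# walk is a hypothetical helper: a recursive depth-first walk that prints
# each directory before descending into it (pre-order).
walk() {
    local dir=$1 entry
    printf '%s\n' "$dir"
    # The trailing slash in the pattern makes the glob match directories only.
    for entry in "$dir"/*/; do
        [ -d "$entry" ] || continue          # an unmatched pattern stays literal; skip it
        [ -L "${entry%/}" ] && continue      # do not follow symlinked directories
        walk "${entry%/}"
    done
}
```

    Replacing the `printf` with a call to a per-directory function would give the visit order needed for a sync; moving it after the loop would turn the walk into post-order.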

    BUMP!

  4. #4
    Just Joined! rm-rf's Avatar
    Join Date
    Mar 2011
    Posts
    83
    If I were you, I would use a program called rsync. Odds are it is already installed on your distro. rsync is well suited to backups, but it still needs a little scripting around it. I have written my own backup script, and the best way (at least in my opinion) is to create a directory named after the date and copy everything into it.

    Then, the next time you back everything up, you create another directory named after the new date, and any files that have not been modified since the last backup are hard-linked to the previous copy instead of being copied again. So for example:
    Code:
    total 8
    drwxr-xr-x 3 noah noah 4096 2011-07-14 14:59 07142011
    drwxr-xr-x 3 noah noah 4096 2011-07-14 14:58 a
    
    ./07142011:
    total 4
    drwxr-xr-x 2 noah noah 4096 2011-07-14 14:59 b
    -rw-r--r-- 1 noah noah    0 2011-07-14 14:59 d
    
    ./07142011/b:
    total 0
    -rw-r--r-- 1 noah noah 0 2011-07-14 14:59 c
    
    ./a:
    total 4
    drwxr-xr-x 2 noah noah 4096 2011-07-14 14:58 b
    -rw-r--r-- 1 noah noah    0 2011-07-14 14:58 d
    
    ./a/b:
    total 0
    -rw-r--r-- 1 noah noah 0 2011-07-14 14:58 c
    is what I get when I back up a to the current directory, and this is what I get on the next backup:
    Code:
    total 12
    drwxr-xr-x 3 noah noah 4096 2011-07-14 14:59 07142011
    drwxr-xr-x 3 noah noah 4096 2011-07-14 15:02 07152011
    drwxr-xr-x 3 noah noah 4096 2011-07-14 15:01 a
    
    ./07142011:
    total 4
    drwxr-xr-x 2 noah noah 4096 2011-07-14 14:59 b
    -rw-r--r-- 2 noah noah    0 2011-07-14 14:59 d
    
    ./07142011/b:
    total 0
    -rw-r--r-- 2 noah noah 0 2011-07-14 14:59 c
    
    ./07152011:
    total 4
    drwxr-xr-x 2 noah noah 4096 2011-07-14 15:03 b
    -rw-r--r-- 2 noah noah    0 2011-07-14 14:59 d
    
    ./07152011/b:
    total 0
    -rw-r--r-- 2 noah noah 0 2011-07-14 14:59 c
    
    ./a:
    total 4
    drwxr-xr-x 2 noah noah 4096 2011-07-14 14:58 b
    -rw-r--r-- 1 noah noah    0 2011-07-14 14:58 d
    -rw-r--r-- 1 noah noah    0 2011-07-14 15:01 e
    
    ./a/b:
    total 0
    -rw-r--r-- 1 noah noah 0 2011-07-14 14:58 c
    Basically, every file that has not been modified since the last backup becomes a hard link to the copy in the previous backup (note the link count of 2 in the listing above). This way the backup runs very quickly and takes up very little space.
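
    The linking step on its own can be sketched with rsync's `--link-dest` option. This is a stripped-down illustration, not the full script: `snapshot` and `same_file` are hypothetical helper names, and the snapshot directory names are passed in by the caller rather than derived from the date.

```shell
#!/bin/bash
# snapshot is a hypothetical helper: copy SRC into DEST/NAME, hard-linking
# files that are unchanged since DEST/PREVIOUS when PREVIOUS is given.
snapshot() {
    local src=$1 dest=$2 name=$3 previous=${4:-}
    mkdir -p "$dest/$name" || return
    if [ -n "$previous" ]; then
        # A relative --link-dest path is resolved against the destination
        # directory, so "../$previous" points at the sibling snapshot.
        rsync -a --link-dest="../$previous" "$src/" "$dest/$name/"
    else
        rsync -a "$src/" "$dest/$name/"
    fi
}

# same_file is a hypothetical helper: succeeds when both paths refer to the
# same inode, that is, when they are hard links to one file.
same_file() {
    [ "$1" -ef "$2" ]
}
```

    After two runs, `same_file backups/day1/d backups/day2/d` should succeed for a file that did not change in between, while a file added after the first run exists only in the second snapshot.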
    Here is the script I wrote, if you want a reference:
    Code:
    #! /bin/sh
    
    if [ "$1" = "-y" ];
    then
    	OVERWRITE="yes";
    	shift;
    fi
    
    SRC=$1;
    DEST=$2;
    
    TODAY=$(date '+%y%m%d'); #get the date as a number in the form of ymd
    TODAY_READABLE=$(date '+%m,%d,%y'); #get the date as the folder name will be m,d,y
    
    LAST_BACKUP=$( ls $DEST | sed 's/^\(.*\),\(.*\),\(.*\)$/\3\1\2/g' | sort -nr | head -1); #get the last backup,
    #as a number in the form of ymd
    
    LAST_BACKUP_READABLE=$(echo $LAST_BACKUP | sed 's/^\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)$/\2,\3,\1/g'); #get the last
    #backup as it is in the file name m,d,y
    
    if [ "$(ls $DEST)" = "" ];
    then 
    
    	mkdir $DEST/$TODAY_READABLE
    	OUT=$(rsync -r  --progress -H $SRC $DEST/$TODAY_READABLE)
    	STATUS=$?	# save rsync's exit status before the test below overwrites $?
    
    	if [ "$STATUS" != "0" ];
    	then
    		exit $STATUS
    	fi
    
           echo $OUT | sed 's/[[:space:]+][0-9]*[[:space:]+]\([0-9]\{1,3\}\)%.*/\1/g' | grep '^[[:space:]]*[0-9]\{1,3\}$' | sed 's/^[[:space:]]*\([0-9]\{1,3\}\)$/\1/g';
    else
    	if [ "$LAST_BACKUP" = $TODAY ];
    	then
    
    		if [  -z "$OVERWRITE" ];
    		then
    			echo "backup already completed for today, overwrite? [yes/no]"
    			read OVERWRITE;
    		fi
    		
    		if [ "$OVERWRITE" = "yes" ];
    		then
    			rm -rf $DEST/$LAST_BACKUP_READABLE;
    			mkdir $DEST/$TODAY_READABLE;
    			OUT=$(rsync -r --progress $SRC $DEST/$TODAY_READABLE)
    			STATUS=$?	# save rsync's exit status before the test below overwrites $?
    
    			if [ "$STATUS" != "0" ];
    			then
    				exit $STATUS
    			fi
    
    			echo $OUT | sed 's/[[:space:]+][0-9]*[[:space:]+]\([0-9]\{1,3\}\)%.*/\1/g' | grep '^[[:space:]]*[0-9]\{1,3\}$' | sed 's/^[[:space:]]*\([0-9]\{1,3\}\)$/\1/g';
    
    		else #doesn't want to overwrite
    			exit 1;
    		fi
    
    		exit 0;
    
    	else #last backup was yesterday
    
    		mkdir  $DEST/$TODAY_READABLE
    
    		OUT=$(rsync -avz --progress --link-dest=$(pwd)/$DEST/$LAST_BACKUP_READABLE $SRC/ $DEST/$TODAY_READABLE )
    		STATUS=$?	# save rsync's exit status before the test below overwrites $?
    		
    		if [ "$STATUS" != "0" ];
    		then
    			exit $STATUS
    		fi
    
    		echo $OUT | sed 's/[[:space:]+][0-9]*[[:space:]+]\([0-9]\{1,3\}\)%.*/\1/g' | grep '^[[:space:]]*[0-9]\{1,3\}$' | sed 's/^[[:space:]]*\([0-9]\{1,3\}\)$/\1/g' ;
    	fi
    fi

  5. #5
    Just Joined! sjc06007's Avatar
    Join Date
    Jul 2011
    Location
    USA
    Posts
    9
    Thanks for the tip! I'm actually a little upset: I rewrote the script in fewer than 20 lines of code and was done testing in another 15.
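
    The rewritten script was not posted, so the following is only a guess at its general shape: a one-way mirror built on rsync, with the same two-directory check as the original. `mirror` is a hypothetical function name.

```shell
#!/bin/bash
# mirror is a hypothetical helper, not the poster's actual rewrite.
# It makes TARGET_DIR match REFERENCE_DIR, syncing in one direction only.
mirror() {
    local reference=$1 target=$2
    if [ ! -d "$reference" ] || [ ! -d "$target" ]; then
        echo "Usage: mirror REFERENCE_DIR TARGET_DIR" >&2
        return 1
    fi
    # -a recurses and preserves attributes; --delete removes entries from the
    # target that no longer exist in the reference. The trailing slashes make
    # rsync copy the directory contents rather than the directory itself.
    rsync -a --delete "${reference%/}/" "${target%/}/"
}
```

    Since `--delete` removes files from the target, running it once with `-n` (dry run) added is a cheap way to see what would change first.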
