Hi, I have thousands of photos scattered across my hard drives and computers, and over time many duplicates have accumulated.

I have long wanted to write a script that goes through the folders the pictures are in, finds duplicates, and later deletes them while keeping the original. I want the script to be fully automatic, but I don't dare delete pictures without seeing them first.

So I wrote a script that:
1st Creates a log file with the paths to all duplicates.
2nd Splits that file into small log files, each containing the paths to the duplicates of one image.
3rd Launches an image viewer that displays the images, taking each log file as an argument, and lets the user inspect the duplicates and delete them.
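A minimal sketch of steps 1 and 2, using fake fdupes-style output (the paths here are invented for illustration; in the real script this data comes from fdupes):

```shell
# Simulated fdupes output: groups of duplicate paths separated by blank
# lines (these paths are made up for illustration).
printf '%s\n' /pics/a.jpg /backup/a.jpg '' /pics/b.jpg /old/b.jpg > dupelog.txt

# Split into one file per duplicate group (dupes00, dupes01, ...).
csplit -s -f dupes dupelog.txt '/^$/' '{*}'

# Drop the blank separator line left at the top of each piece.
sed -i '/^$/d' dupes*
```

Afterwards each dupes* file lists the paths of one duplicated image, ready to be handed to a viewer.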

I have no previous experience with bash scripts, and I know the script is not optimal, so I was hoping we could get it into better shape.

Here is the script:

#!/bin/bash
#  Find duplicates of images with fdupes
#+ and fire up an image program (gthumb) to show the duplicates
#+ all at once to manually select and remove!

DUPLOG=dupelog.txt	# collects all duplicates found by fdupes

#   Create a single log file of all the
#+ duplicate files, separated by blank lines.
/usr/bin/fdupes --recurse "$1" > "$DUPLOG"

#   Split the single log file into multiple files,
#+ each containing the paths to one set of duplicated files.
csplit -f dupes "$DUPLOG" '/^$/' '{*}'

#   Remove blank lines from all duplicate files.
#+ (The dupes* glob leaves the log file itself alone.)
/bin/sed -i '/^$/d' dupes*

#   Enclose every line in the files in quotes, so that xargs
#+ keeps paths containing spaces together as one argument.
#+ NB: if the script is run a second time on the same files,
#+ all lines get quoted again. Remove the old dupes* files first!
/bin/sed -i 's/.*/"&"/' dupes*

#   Open the duplicate images in each file with gthumb to verify
#+ that they really are duplicates, then remove the extras manually.
for dupfiles in dupes*
do
	/usr/bin/xargs /usr/bin/gthumb < "$dupfiles"
done
exit 0
PS: The reason for choosing gThumb is its ability to take several pictures as arguments and show them all in the same window.
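As a side note on the quoting step in the script: by default xargs splits its input on any whitespace, so a path containing a space would be broken into two arguments; wrapping each line in double quotes (which is what the sed 's/.*/"&"/' line does) keeps it whole. A small demonstration with an invented path:

```shell
# Without quoting, xargs splits "/pics/my photo.jpg" into two arguments:
printf '%s\n' '/pics/my photo.jpg' | xargs -n1 echo
# → /pics/my
# → photo.jpg

# With the line wrapped in quotes, xargs keeps the path together:
printf '%s\n' '/pics/my photo.jpg' | sed 's/.*/"&"/' | xargs -n1 echo
# → /pics/my photo.jpg
```

On GNU systems, `xargs -d '\n'` (split on newlines only) is an alternative that avoids the quoting step entirely.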