Hi,
I am an undergrad student working for a computational biologist, and I was asked to get some sort of backup scheme working for his fileserver and webserver. A lot of the data that needs to be backed up is DNA sequence data and the like, often entire genomes, in addition to things like users' home directories and various biology programs/scripts. The problem is that the sequence data is large (~50 GB), and there is also the output of the analysis programs (~150 GB currently).
Since the prof I'm working under runs analyses quite frequently, this data changes often. We have access to a computer with almost 1 TB of space, but it's not all ours to use, and we'd like to be polite about it. Ideally, we would use an incremental solution with compression, with full backups occurring every couple of weeks to keep things (hopefully) compact and orderly. The problem I have run into so far is that once the data has been compressed into an archive, it is no longer easy to compare timestamps and the like to work out which files have changed since the last backup.
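To make the idea concrete, here is a rough Python sketch of the kind of scheme I have in mind. It is only an illustration under my own assumptions, not something we are running: the function names, the JSON manifest file, and all paths are hypothetical. The idea is to record each file's modification time in a small uncompressed manifest kept next to the compressed archive, so the next incremental run can compare timestamps without unpacking anything.

```python
import json
import os
import tarfile


def load_manifest(path):
    """Return {relative_path: mtime} from the previous run, or {} if none exists."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def scan(root):
    """Map every file under root to its modification time (hypothetical helper)."""
    current = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # lstat so a dangling symlink does not abort the scan
            current[os.path.relpath(full, root)] = os.lstat(full).st_mtime
    return current


def changed_files(previous, current):
    """Files that are new, or whose mtime differs from the last manifest."""
    return sorted(p for p, m in current.items() if previous.get(p) != m)


def backup(root, archive_path, manifest_path, full=False):
    """Write a gzip-compressed tar of changed files and refresh the manifest.

    With full=True the old manifest is ignored, so every file is archived.
    Returns the list of relative paths that were archived.
    """
    previous = {} if full else load_manifest(manifest_path)
    current = scan(root)
    to_save = changed_files(previous, current)
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in to_save:
            tar.add(os.path.join(root, rel), arcname=rel)
    # The manifest stays uncompressed, so timestamps remain easy to compare.
    with open(manifest_path, "w") as fh:
        json.dump(current, fh)
    return to_save
```

Something like `backup("/data", "/backups/full.tar.gz", "/backups/manifest.json", full=True)` every couple of weeks, with `full=False` runs in between, is what I picture (those paths are made up). Note that this sketch does not record deleted files and trusts modification times alone, so it is a starting point for discussion rather than a finished tool.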
Any suggestions for backup schemes that fit these criteria? Thanks in advance.