Greetings! This article will go into the method we use at a crime scene to acquire data in a forensically sound way using the free Linux command: dcfldd
Part of my job involves acquiring suspect or witness data from a variety of digital media such as hard drives, thumb drives, Xbox drives, floppy disks (what's a floppy?) and countless other storage devices. This type of data is generally referred to as dead box data, since the device is usually off and the physical media is removed from its host prior to data capture. By contrast, live box data refers to the analysis and acquisition of data in an active environment (network sniffer captures, live remote server data acquisition, active memory analysis and so on).
The key goal for all of us involved in criminal investigations is obtaining the data in a forensically sound manner. What does that mean? As I mentioned in my opening article, a lot of what we do must be technically as well as legally sound. We need to be able to respond confidently, intelligently and honestly to harsh cross-examination on a witness stand about the method by which evidence was obtained. Nothing could be more damaging to the government's case than to have the smoking gun evidence (e.g. the recovered deleted data describing the bad guy's plans in full detail) dismissed from trial because we were unable to explain and defend the method used to obtain it in the first place. This is where approved, tested (both technically and legally), and generally accepted software comes to the rescue.
First things first: what is an image? For our discussion, an image is a bit-for-bit copy of some piece of digital media. Quite simply, if you take an image of a hard drive, what you have is the exact data that physically resides on that drive, regardless of the operating system or file system (if any) installed. With an image you are copying all the "1's and 0's" right off the original media. This means that when we analyze the image at the lab we are looking at everything that is and was on that drive at the moment we imaged it. If there are ten partitions on the server then we get all ten; the imaging command doesn't care about the logical makeup of the data, only the physical data. This also means we get access to deleted data, cleverly hidden data and any other "loose bits" floating around on the original media. For a criminal investigation, you can see how this is extremely helpful in building a case.
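To make "bit for bit" concrete, here is a small sketch that runs entirely on scratch files rather than real evidence. It uses plain dd as a stand-in, and the file names are made up for the demo. The only point is that the copy comes out byte-identical to the source, whatever those bytes happen to mean.

```shell
# Demo on scratch files only; nothing here touches a real drive.
workdir=$(mktemp -d)
src="$workdir/fake_media.bin"   # hypothetical stand-in for the original media
img="$workdir/fake_media.dd"    # the "image" we take of it

# Fill the stand-in media with 1 MiB of random bytes.
dd if=/dev/urandom of="$src" bs=1024 count=1024 2>/dev/null

# Copy every byte, in order, with no knowledge of what the bytes represent.
dd if="$src" of="$img" bs=512 2>/dev/null

# cmp exits 0 only if the two files are byte-for-byte identical.
if cmp -s "$src" "$img"; then
    echo "image matches source"
else
    echo "image differs from source"
fi
```

A file-level copy of a mounted drive would only carry over what the file system reports; reading the raw device this way carries over everything in between as well.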
The majority of the time, data can be acquired from media using overpriced imaging hardware ($3500+ USD) that we carry in those black rolling cases you might have seen on TV shows. I'm not going to go into detail on that hardware since it's usually as simple as hooking up a few connectors, answering a few prompts on the LCD and hitting GO. Two hours later the media is duplicated in a perfect, forensically sound fashion (yawn). What I want to discuss today is what happens when we don't have the hardware, or when we do but Murphy's law kicks in, any number of issues arise, and we are unable to use the expensive toys.
In this scenario I'll show you how we grab data from a standard internal SATA hard drive, the most common situation we run into on scene. So, how do we do it without the big-gun hardware imagers? I personally acquire data using my old friend Ubuntu Linux on a standard laptop, a free command line tool called dcfldd and about 20 bucks in hardware.
Now a quick note about the command line tool dcfldd: it is a spinoff of the legendary dd command. More background is available on its Wikipedia page and on SourceForge. In summary, this tool allows for on-the-fly hashing of the data, a progress indicator, approved methods for wiping disks, verification of a bit-for-bit copy, output to multiple devices, split output files, and log and data piping capabilities. To install it on Ubuntu:
sudo apt-get install dcfldd
Continuing with our scenario: say we execute the search warrant and secure the area. Each team of agents then gets to work on its assigned jobs. In my case, I'm generally the computer guy, so I'm off to locate the target computer(s). Once they are identified I take off my raid gear and put on the propeller cap. Here is a little insider secret: tucked into my tactical body armor, just behind my handcuffs, I carry my SATA to USB adapter, which cost about 20 bucks (OK, it's really packed up in a case somewhere, but that sounded more fun and I might try it next time). Next I'll grab the computer(s) that need to be imaged, document and photograph everything, open each one up, remove the drive and hook it up to my Linux laptop using the adapter.
What that really amounts to is a lot of documenting, then connecting the adapter's SATA and power cables to the drive, plugging the power cable into an outlet, connecting the USB side to a USB port on the laptop (we ain't smashing atoms here) and starting the process listed below.
Before I go any further, a few housekeeping issues:
Since I use Ubuntu Linux on a laptop that I installed and configured myself, my tools and settings are already good to go. Here are a few things I already configured:
RULE #1 - VERY IMPORTANT: Before you connect the suspect drive to your Linux capture laptop you need to make sure that the drive does not automount (or, if it does, that it automounts READ ONLY). If the defense can show the drive was altered in any way by your imaging method, you risk having all the evidence tossed! To accomplish this in Ubuntu 11 you can use the graphical dconf Editor (dconf is a low-level key/value database designed for storing desktop environment settings). Launch dconf Editor, go to org.gnome.desktop.media-handling, look for the automount checkbox on the right-hand side and uncheck it.
Or simply run the command:
gsettings set org.gnome.desktop.media-handling automount false
To enable it again in the future:
gsettings set org.gnome.desktop.media-handling automount true
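Before plugging anything in, it does not hurt to read the setting back and confirm it really is off. Here is a minimal sketch of that check. The function name automount_state is made up for illustration, and it assumes a GNOME desktop where gsettings is available (it reports "unknown" otherwise).

```shell
# Hypothetical helper: report the GNOME automount setting as off, on, or unknown.
automount_state() {
    # Without gsettings (e.g. a non-GNOME system) we cannot tell either way.
    if ! command -v gsettings >/dev/null 2>&1; then
        echo unknown
        return 0
    fi
    case "$(gsettings get org.gnome.desktop.media-handling automount 2>/dev/null)" in
        false) echo off ;;
        true)  echo on ;;
        *)     echo unknown ;;
    esac
}

state=$(automount_state)
echo "automount is: $state"
if [ "$state" != "off" ]; then
    echo "do NOT plug in the suspect drive yet" >&2
fi
```

Treating "unknown" the same as "on" is deliberate: when in doubt, assume the drive would be mounted.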
When it is safe, plug in the suspect's drive, wait a few seconds, then run the command
You will see the last few lines show the new device name (e.g. sdb). Your suspect drive (or, as you will soon see it called, the INPUT FILE) will now be identified as /dev/sdb. Make a note of this, as it is how we will reference the suspect drive when using the dcfldd command.
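The description above matches what the kernel log prints when a disk is attached. As an assumption on my part (the text here does not name the command), something like dmesg | tail would show it. The sketch below pulls the last sdX name out of that kind of output. The function name and the sample log lines are made up for illustration.

```shell
# Hypothetical helper: print the last sdX device name found in kernel-log text on stdin.
last_sd_device() {
    grep -oE '\[sd[a-z]+\]' | tail -n 1 | sed 's/^\[//; s/\]$//'
}

# Sample lines in the style of kernel messages for a USB-attached disk
# (illustrative only, not captured from a real system).
sample_log='[ 1021.4] sd 6:0:0:0: [sdb] 976773168 512-byte logical blocks
[ 1021.5] sd 6:0:0:0: [sdb] Write Protect is off
[ 1021.9] sd 6:0:0:0: [sdb] Attached SCSI disk'

dev=$(printf '%s\n' "$sample_log" | last_sd_device)
echo "suspect drive appears as /dev/$dev"

# On a live system the same helper could be fed real output (may need sudo):
#   dmesg | tail -n 20 | last_sd_device
```

Whatever tool you use, confirm the size and model against the label on the physical drive before going further.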
Now that we have the unmounted drive identified and the dcfldd tool installed, data capture is as simple as one command. Although there are (as with most Linux commands) umpteen options, you will find that just a few are generally all that is required. Before you go any further, make sure that you know the location of the target drive if you are going to image to another external device.
What are the key options you should know for dcfldd?
if = Input File (the device you are reading from, e.g. the suspect's drive)
of = Output File (the device or file you are copying to, i.e. your evidence collection drive)
hash = md5, sha1, sha256, sha384 or sha512 (hash type; in criminal cases we generally use md5 and sha1)
hashwindow = how often (after x bytes) to perform a hash calculation (on large drives, 500GB+, I usually set it to 10G)
<hash>log = specifies where to log the hash calculations for each hash type (e.g. md5log=md5.log)
hashconv = whether to hash before or after the conv= conversions are applied; setting it to after is generally fine
bs = (no jokes here) the block size, in bytes, to read and write at once; 512 is generally used
conv = noerror (continue after read errors) and sync (pad short blocks with zeros up to the block size) are the 2 most common options here
split = breaks the image file into multiple files - VERY USEFUL FOR FUTURE ANALYSIS - I generally set this to 4GB
splitformat = I normally use 'nnn', which results in each file name ending in .000, .001, .002 and so on (this is most common)
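To picture what hashwindow plus the final hash give you, this sketch hashes a scratch file in fixed-size windows and then hashes the whole thing. It approximates the idea with dd and md5sum; it is not dcfldd's own log format. The file names and the 1 MiB window are arbitrary choices for the demo.

```shell
workdir=$(mktemp -d)
src="$workdir/fake_media.bin"      # hypothetical stand-in for /dev/sdb
log="$workdir/windows.log"         # hypothetical log file for this demo

# 3 MiB of test data, so we expect three 1 MiB windows.
dd if=/dev/urandom of="$src" bs=1048576 count=3 2>/dev/null

window=1048576                     # window size in bytes
size=$(wc -c < "$src")
windows=$(( (size + window - 1) / window ))

: > "$log"
i=0
while [ "$i" -lt "$windows" ]; do
    # Read exactly one window, starting i windows into the file, and hash it.
    sum=$(dd if="$src" bs="$window" skip="$i" count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    echo "window $i: $sum" >> "$log"
    i=$((i + 1))
done

# Hash of the entire input, comparable to the final total in a hash log.
total=$(md5sum < "$src" | cut -d' ' -f1)
echo "total: $total" >> "$log"
cat "$log"
```

The per-window hashes are what let you narrow a later mismatch down to one region of the drive instead of only knowing that something, somewhere, changed.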
CAUTION: CAREER LIMITING MISTAKE! Don't mix up the input (if=) and output (of=) names!! Although nothing should happen in this scenario, a mix-up could easily end up copying 150 gigs of ZEROS from the empty capture drive onto the suspect's drive, destroying the drive and any evidence on it, and subjecting you to legal action, mockery and probably months of report writing and mandatory retraining seminars! Just to be on the safe side, on scene we generally use a small inline hardware write-blocking device, which prevents any data being written to the subject drive (just in case!).
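A hardware write blocker is the real safety net, but a cheap software sanity check before hitting Enter costs nothing. What follows is only a sketch of such a check. The function name check_imaging_args is made up, and it assumes (as in this article) that the output is always a file on a mounted evidence drive, never a raw device.

```shell
# Hypothetical pre-flight check: refuse obviously dangerous if=/of= combinations.
check_imaging_args() {
    in_path=$1
    out_path=$2
    if [ "$in_path" = "$out_path" ]; then
        echo "refusing: input and output are the same path" >&2
        return 1
    fi
    # In this workflow the output is a regular file, so a block device here
    # almost certainly means if= and of= were swapped.
    if [ -b "$out_path" ]; then
        echo "refusing: output $out_path is a block device" >&2
        return 1
    fi
    if [ ! -e "$in_path" ]; then
        echo "refusing: input $in_path does not exist" >&2
        return 1
    fi
    return 0
}

# Demo with a scratch file standing in for the suspect drive.
fake_drive=$(mktemp)
if check_imaging_args "$fake_drive" "$fake_drive.dd"; then
    echo "arguments look sane"
fi
```

This will not catch every mistake (two different names for the same device, for example), so it complements the write blocker rather than replacing it.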
Here is an example of how I would image the suspect's hard drive, located at /dev/sdb, to an external drive already mounted at /mnt/external/, with 4GB output files named evidence.dd.000, evidence.dd.001, evidence.dd.002 and so on, along with two plain text log files called sha1.log and md5.log, each containing its respective hash calculations at 10GB intervals as well as a final hash calculation.
sudo dcfldd if=/dev/sdb hash=md5,sha1 md5log=md5.log sha1log=sha1.log hashwindow=10G split=4G splitformat=nnn hashconv=after bs=512 conv=noerror,sync of=/mnt/external/evidence.dd
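After a run like this, it is worth confirming that the pieces on the evidence drive still add up to the same hash as the source. The sketch below shows the idea on scratch files, using dd, GNU split and md5sum as stand-ins so it runs without dcfldd or a real drive. All file names are made up for the demo.

```shell
workdir=$(mktemp -d)
src="$workdir/fake_media.bin"      # hypothetical stand-in for the suspect drive

# About 2.4 MiB of test data, which splits into three pieces at 1 MiB each.
dd if=/dev/urandom of="$src" bs=1024 count=2500 2>/dev/null

# Stand-in for the split image: 3-digit numeric suffixes starting at 000.
# (-d and -a are GNU coreutils split options.)
split -b 1048576 -d -a 3 "$src" "$workdir/evidence.dd."

# Hash of the source versus hash of the pieces concatenated in order.
src_sum=$(md5sum < "$src" | cut -d' ' -f1)
img_sum=$(cat "$workdir"/evidence.dd.[0-9][0-9][0-9] | md5sum | cut -d' ' -f1)

if [ "$src_sum" = "$img_sum" ]; then
    echo "pieces verify against source"
else
    echo "MISMATCH between pieces and source"
fi
```

On a real case you would compare the hash of the concatenated pieces against the total recorded in md5.log and sha1.log rather than re-reading the suspect drive.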
And there you have it: in a few hours you will have created a forensically sound image of a suspect's computer hard drive using nothing but Linux and a few dollars in hardware. Of course, you can also generate images of USB thumb drives, Xbox hard drives, external hard drives or any other media that can be plugged into a Linux laptop.
OK, so now what do we do with the images after collection? Stay tuned for an upcoming article!
Thank you all for reading. Please message me if you have ANY questions, comments, concerns, complaints, hot stock tips (kidding) or future article ideas.
You can also follow me at: twitter.com/luvz2fly