Cyber Operations / Digital Forensics Lab

Learning Outcomes

After completing these activities you should be able to:

This lab emphasizes the aspects of digital forensics that are most commonly encountered. It is divided into short activities, each with a specific focus.

Your Digital Forensics worksheet is due to your instructor at the beginning of class on the day listed on the course calendar, typically a week from the date assigned.

A. Preparation

  1. Create a directory named forensics on your desktop. All of the files created in this lab will be stored there.   ← Worksheet A.1.  

B. Trace an Email

Email is one of the most common forms of communication today. While older forms of communication required the presence and participation of two parties, email is more asymmetric in nature. Email is delivered using SMTP (Simple Mail Transfer Protocol); because it follows a prescribed set of rules, it is predictable and simple to understand. Today we're going to look at a few emails and determine whether each one is legitimate, spam, or something else. The analysis will be based primarily on the email header, which contains information from every Mail Transfer Agent the message comes in contact with on its way to the destination. This is very helpful for piecing together which information systems the message passed through.

First examine this email by simply clicking on this link (don't download): Email 1. If we start from the bottom of the email, you'll see several large chunks of data that look like some sort of encoding. These chunks are images being sent through email. Binary attachments such as these are encoded in base64 for transfer. Before each chunk you can see the file name that will be used when the attachment is reconstituted on the receiving end. If you scroll up past the images, you'll eventually arrive at the actual text of the message. The text is interesting, but it rarely helps with attribution (non-repudiation).
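To see what those encoded chunks amount to, the short Python sketch below base64-encodes a few bytes and decodes them again. The sample bytes and variable names are made up for illustration; they are not taken from Email 1.

```python
import base64

# Hypothetical attachment content: the first eight bytes of a PNG file.
attachment_bytes = bytes.fromhex("89504e470d0a1a0a")

# Sending side: turn raw bytes into printable ASCII text.
encoded = base64.b64encode(attachment_bytes).decode("ascii")

# Receiving side: turn the text back into the original bytes.
decoded = base64.b64decode(encoded)

print(encoded)                       # iVBORw0KGgo=
print(decoded == attachment_bytes)   # True
```

Base64 is an encoding, not encryption: anyone who has the text can recover the original bytes.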

As you move up the email, you can see the following fields and their meaning:

Field         Meaning
Message-ID    A unique identifier for the message as it passes through SMTP servers. It is used to avoid duplication of messages.
From          The address the sender filled in. This could be made up!
To            The destination address.
X-Mailer      The mail client (program) from which the email was sent.
Subject       The subject of the email.
Reply-To      The address the "reply" button usually uses, and so where the reply email will be sent. The sender can fill this in with anything they want! The receiver may not even realize there is a different address here.
Received      There are several of these lines, each recording the location and time at which a mail server received the email. The top Received line is closest to the receiver, while the lowest is closest to the sender.
Return-Path   The address to which delivery error (bounce) messages should be sent.

All emails will have some or all of these lines (called the email header) and possibly other lines as well. If a mail server suspects that an email is spam, there is usually a line or two in the header that flags it as spam. In Gmail, you can safely view the header and contents of any of your emails by clicking the ▾ button next to the reply button in the header information of the message, then clicking Show original in the resulting drop-down menu.

Typically, we want to determine which mail servers the email message passed through, and in what order. The mail server data in the email header represents the real path that the email took, and it is hard to forge all of this data. Examine the fields described above for Email 1.   Worksheet B.1. – B.7.  
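Reading the Received lines from the bottom up can also be sketched in code. The example below uses Python's standard email module on a made-up message; every host name and address in it is a placeholder assumption, not data from Email 1.

```python
from email import message_from_string

# Hypothetical raw message; hosts and addresses are placeholders.
raw_message = """\
Return-Path: <sender@example.org>
Received: from mx.example.net by inbox.example.com; Tue, 04 Nov 2014 10:00:03 -0500
Received: from relay.example.org by mx.example.net; Tue, 04 Nov 2014 10:00:02 -0500
Received: from laptop.example.org by relay.example.org; Tue, 04 Nov 2014 10:00:01 -0500
From: Sender <sender@example.org>
Reply-To: someone-else@example.com
To: receiver@example.com
Subject: Hello
Message-ID: <1234@example.org>

Body text.
"""

message = message_from_string(raw_message)

# get_all returns the Received headers in the order they appear (top to bottom).
received = message.get_all("Received", [])

# Reverse them so the hop closest to the sender comes first.
hops = list(reversed(received))
for number, hop in enumerate(hops, start=1):
    print(number, hop)

# A Reply-To that differs from From is worth a second look.
print("From:    ", message["From"])
print("Reply-To:", message["Reply-To"])
```

Each server adds its Received line above the existing ones, which is why the list is reversed to read the path in the order the message traveled.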

Now, investigate Email 2, which is a spoofed email, and see if you can determine where it originated from.   Worksheet B.8. – B.10.  

Finally, investigate Email 3. This time, examine not only the header, but also the actual email message. Often, email servers can be compromised and used to send legitimate looking emails. Domain names in legitimate emails will be consistent throughout, especially if an email is providing a link to the website or a file to download. Other indications of spam and phishing emails are misspellings or poor grammar.   Worksheet B.11. – B.13.  
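The domain consistency check can be roughed out in a few lines of Python. This is only a heuristic sketch: the function names and the sample message are hypothetical, and a legitimate email may link to other domains, so a mismatch is a reason to look closer rather than proof of phishing.

```python
import re
from email.utils import parseaddr
from urllib.parse import urlparse

def sender_domain(from_header):
    """Return the domain part of the address in a From header."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()

def link_domains(body):
    """Return the set of host names used by http(s) links in the body."""
    urls = re.findall(r"https?://[^\s<>]+", body)
    hosts = (urlparse(url).hostname for url in urls)
    return {host for host in hosts if host}

def mismatched_links(from_header, body):
    """Return link hosts that are neither the sender's domain nor a subdomain of it."""
    domain = sender_domain(from_header)
    return {
        host for host in link_domains(body)
        if host != domain and not host.endswith("." + domain)
    }

# Hypothetical message: the second link only looks like it belongs to example.com.
from_header = "Example Bank <alerts@example.com>"
body = ("Log in at https://www.example.com/ or verify at "
        "http://example.com.login.example.net/verify")

suspicious = mismatched_links(from_header, body)
print(suspicious)   # {'example.com.login.example.net'}
```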

C. File Carving

This section of the Forensics Lab introduces you to file carving. File carving is an incredibly useful skill to have in the world of digital forensics. It basically means recovering files (data) from a physical storage device after the files have been deleted, the device has been erased, or the device has been damaged. At this point, the data on the device just looks like a sequence of "raw bytes" — meaning a sequence of bytes without any information as to where any file(s) begins or ends in the sequence of bytes.

With computers, "deleting" a file doesn't necessarily mean the data stored in the file (the bytes that comprise the file) is gone. It means that the file system's record of the file's name, and the file's connection to that area of the hard drive, are gone. Those bytes become "unallocated space", but still hold data that can be interpreted.

To carve a file from a block of bytes, you'll need to look for the header of the file and, depending on the file type, the footer of the file. For example, the header (in hex) for a PNG file is 89 50 4e 47 and the footer is 49 45 4e 44 ae 42 60 82. Below we have an example of a chunk of unallocated space from a drive. Looking carefully, we spot a PNG header (starting at offset 10) and, following it, a PNG footer (ending at offset 42). We therefore deduce that a PNG file occupies offsets 10 through 42.

Block of unallocated space from a drive (offsets in decimal)
Offsets   Contents     Bytes (hex)
00-09     other data   7e 93 20 20 51 e9 05 6d ff 67
10-13     PNG header   89 50 4e 47
14-34     body         0d 0a 1a 0a 00 00 00 0d 49 48 44 52 54 78 9c 62 60 01 00 00 00
35-42     PNG footer   49 45 4e 44 ae 42 60 82
43-58     other data   3d 69 c4 82 81 f0 6f 61 e4 40 4b b4 34 2f 2e bb

  Worksheet C.1.  
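The same search can be done programmatically. The Python sketch below loads the block of unallocated space shown above, looks for the PNG header and footer, and copies out everything between them. The variable names are illustrative, and the sketch assumes both signatures are present (bytes.find returns -1 when they are not).

```python
# The block of unallocated space from the example above.
block = bytes.fromhex(
    "7e 93 20 20 51 e9 05 6d ff 67 89 50 4e 47 0d 0a 1a 0a 00 00"
    " 00 0d 49 48 44 52 54 78 9c 62 60 01 00 00 00 49 45 4e 44 ae"
    " 42 60 82 3d 69 c4 82 81 f0 6f 61 e4 40 4b b4 34 2f 2e bb"
)

PNG_HEADER = bytes.fromhex("89504e47")
PNG_FOOTER = bytes.fromhex("49454e44ae426082")

# Offset of the first byte of the header.
start = block.find(PNG_HEADER)

# Search for the footer only after the header, then compute the
# offset of the footer's last byte.
footer_at = block.find(PNG_FOOTER, start)
end = footer_at + len(PNG_FOOTER) - 1

# Slicing excludes the upper bound, so add 1 to include the last footer byte.
carved = block[start:end + 1]

print(f"header starts at {start} (0x{start:x})")   # header starts at 10 (0xa)
print(f"footer ends at {end} (0x{end:x})")         # footer ends at 42 (0x2a)
print(len(carved), "bytes carved")                 # 33 bytes carved
```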

File Carving Activity
Suppose you recover a hard drive from a bad guy's computer. Your job is to find incriminating data, or data that will help in an investigation. Your goal is to recover files from raw bytes from a recovered disk image:

  1. To start with, let's review a basic file format. The Joint Photographic Experts Group (JPEG) format gives us files with a .jpg extension. This file type has a very distinctive header and footer.
    • Header in hex: ff d8 ff e0
    • Footer in hex: ff d9
  2. Save the following file into your forensics directory: oneFile.   ← Worksheet C.  
  3. Using frhed, open the saved file. Can you see the JPG header in the file anywhere? Not easily!
  4. Press Ctrl-f and search for the header in hex by typing <bh:ff><bh:d8><bh:ff><bh:e0> into the search prompt. The bh prefix specifies that the value that follows is a byte written in hex. When the bytes are located, they will all be highlighted in frhed. Do not use uppercase hex characters.
  5. The byte offset is shown at the bottom left of the frhed window in both decimal and hex (e.g. offset 245=0xf5). In this example 245 is the offset in decimal, and the 0x in front of f5 indicates that the same offset is written in hex; in other words, 245 in decimal = f5 in hex. On your worksheet, write down the decimal and hex offsets (the location relative to the start of the file) of the first byte of the header and the last byte of the header.   ← Worksheet C.2.  
  6. You can follow the same process for the footer.
    You can specify the search to look up or down from the current cursor location in frhed.
  7. Once you have located the header and footer (that is, you know their offsets), it is time to carve (copy) the entire file from the start of the header to the end of the footer. To do this:
    • Select Edit ⇒ Copy, and enter the start and ending offsets for the entire file. If you enter decimal values, just the value is entered. If you want to enter the hex value, you must include the x before the hex value to tell frhed the entered offset is hex.
    • Select File ⇒ New to create a new document.
    • Select Edit ⇒ Paste, choosing the option to Insert (NOT OVERWRITE). Press OK.
    • Choose File ⇒ Save and save as a .jpg file in your forensics directory.
    • Open the file from the file browser to see the image.   ← Worksheet C.3.  
  8. Your next task is to carve two files from a chunk of data called twoFiles, which you first need to download, saving into your forensics directory, and open with frhed like you did with oneFile. You will use the same file carving technique with your hex editor. In this case, one file is a pdf file and the other file is an audio file of the wav format. The information below should help you on your task.   ← Worksheet C.  

    Remember, if you need to search for the hex values, use this format: <bh:89><bh:50><bh:4e><bh:47> (which searches for a PNG file - you need to change the hex values for the header or footer you are specifically searching for)

    File Format   Header in hex                    To search in frhed                                  Footer in hex            To search in frhed
    jpg           ff d8 ff e0                      <bh:ff><bh:d8><bh:ff><bh:e0>                        ff d9                    <bh:ff><bh:d9>
    pdf           25 50 44 46 2d 31 2e (%PDF-1.)   <bh:25><bh:50><bh:44><bh:46><bh:2d><bh:31><bh:2e>   25 25 45 4f 46 (%%EOF)   <bh:25><bh:25><bh:45><bh:4f><bh:46>
    wav           52 49 46 46 (RIFF)               <bh:52><bh:49><bh:46><bh:46>                        NO FOOTER!               -

    Some file formats do not have footers. This can be problematic, but humans are often better than computers at solving this problem. In the case of the audio file, you will see a change in the character of the data where the file ends. You can also try copying differing amounts of data into the new file and seeing whether it works. Experiment and answer the questions on the lab worksheet. One approach is to start from the last byte in the file and work backwards (there is more than one way to solve a problem).   ← Worksheet C.4. – C.7.  

  9. You are now skilled at file carving by hand! But what if the data in question is megabytes in size and the number of files is either very large or unknown? Because we know how to do this task on a small scale, it can be automated for larger sets of data. We'll be using a file carving program called scalpel, which is free software; it should have been installed to your C:\SI110Programs directory during the first homework assignment. scalpel automates file carving. A user sets up a configuration file that lists the types of files to search for, based on headers and footers. We've prepared a scalpel configuration file for you to use.
    Recall the md5 command syntax.
    C:>md5 -h  ← print help/usage statement
    C:>md5 fileToHash
    C:>md5 -d"inputText"

    Follow these steps:

    1. A forensic analyst works with copies of raw data, rather than originals. The analyst must be able to prove, without a doubt, that the original data was not modified - that the integrity of the data is intact. Therefore, the analyst must hash the original data, create a copy, hash the copy; the hashes must match. As the analyst works on the copy, she will periodically hash the data and ensure no changes have been made. If the hash value ever changes, then the analyst must make a new copy from the original, hashing again. So, you will be expected to work on the extracted content and not modify it in any way.
    2. Right-click and download unknownChunk.raw and scalpel.conf and save them in the forensics directory.   ← Worksheet C.  
      unknownChunk.raw - Raw bytes recovered off the hard drive of a confiscated information system.
      scalpel.conf - Describes file headers and footers in a form scalpel can use.
    3. Open a shell and navigate to your forensics directory (cd).
    4. Compute a hash of the unknownChunk.raw file: md5 unknownChunk.raw and record the hash value on your worksheet.   ← Worksheet C.8.  
    5. Execute this command: scalpel.exe -c scalpel.conf unknownChunk.raw
    6. A summary of how many files, and of what types, scalpel found is printed in the shell (if everything was done correctly). Read through the output and determine how many and what types of files scalpel found; in other words, what data were recovered from the hard drive? A directory called scalpel-output will be created in the forensics directory.   ← Worksheet C.9.  
    7. Compute the hash of the unknownChunk.raw file again and record the results on your worksheet. If the hash is different from the start of the activity, you need to download unknownChunk.raw again and repeat the scalpel activity after computing the hash.   ← Worksheet C.10.  
    8. Using the file browser, explore the scalpel-output directory and confirm what was reported in the shell about what the file carver (scalpel) was able to retrieve. Then answer the questions on your worksheet. If you got an error instead of scalpel results in the shell (scalpel never ends – you get stuck in the shell), then you will need to start over. There will still be a scalpel-output directory created.
      scalpel will not run if it detects that a directory named scalpel-output already exists. If you need to run scalpel again, first delete the scalpel-output directory.
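To connect the manual steps with the automated ones, here is a minimal Python sketch of signature-based carving with an integrity check before and after. It is not how scalpel is implemented; it only illustrates the idea using the jpg and pdf signatures from the table above. The sample chunk is fabricated in memory, and stopping at the first footer after each header is a simplification (real files can contain the footer bytes inside their data).

```python
import hashlib

# Signatures from the table in this lab: name -> (header, footer).
SIGNATURES = {
    "jpg": (bytes.fromhex("ffd8ffe0"), bytes.fromhex("ffd9")),
    "pdf": (bytes.fromhex("255044462d312e"), bytes.fromhex("2525454f46")),
}

def md5_hex(data):
    """Return the MD5 hash of data as a hex string."""
    return hashlib.md5(data).hexdigest()

def carve(data, header, footer):
    """Return (start, end, bytes) for each header..footer span found in data.

    end is exclusive. The first footer after each header is used.
    """
    found = []
    position = 0
    while True:
        start = data.find(header, position)
        if start == -1:
            break
        footer_at = data.find(footer, start + len(header))
        if footer_at == -1:
            break
        end = footer_at + len(footer)
        found.append((start, end, data[start:end]))
        position = end
    return found

# Hypothetical stand-in for a raw chunk: junk, a fake JPEG, junk, a fake PDF, junk.
fake_jpg = bytes.fromhex("ffd8ffe0") + b"image data" + bytes.fromhex("ffd9")
fake_pdf = b"%PDF-1.4 document data %%EOF"
chunk = b"\x00\x11\x22" + fake_jpg + b"\x33\x44" + fake_pdf + b"\x55"

hash_before = md5_hex(chunk)
results = {
    name: carve(chunk, header, footer)
    for name, (header, footer) in SIGNATURES.items()
}
hash_after = md5_hex(chunk)

for name, spans in results.items():
    for start, end, _ in spans:
        print(f"{name}: offsets {start} to {end - 1}")

# Carving only reads the data, so the hashes must match.
print("evidence unchanged:", hash_before == hash_after)
```

To try this on real data, the in-memory chunk could be replaced by the bytes of a file opened in binary read mode, which leaves the file on disk untouched.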

D. File Decryption

Recall the aes command syntax.
C:>aes  ← print help/usage statement
C:>aes -d symEncKey -i inFile -o outFile
If at first you don't succeed, then think and ... Some different file types actually have the same file headers. For example, the file formats for a suite of related programs, like an office product suite, sometimes use a common file format structure and therefore have the same file header; e.g. Microsoft Excel, Microsoft PowerPoint, and Microsoft Word. So how do you know what file type to use? Sometimes you have to look deeper into the file structure to tell them apart, or you can use a little trial and error, cycling through the possible file types until the data are interpreted correctly. Trial and error is not an elegant solution, but it can be an effective problem-solving approach in small batches.
  1. An intelligence operative caught on to a user's nefarious activities and captured one of his emails. The attachment was base64 encoded, just as the attachments were in Email 1. The attachment was easily decoded to an encrypted text file that you can download here. The unknownChunk.raw from Part C is a section of the HDD that was additionally recovered from the user's computer. Looking at the files you carved, are there any clues that can help you decrypt and view the email attachment?   ← Worksheet D.1.  
  2. Once decrypted, can you determine exactly what type of file this is by its header? Hint: what tool allows you to look at an unknown file to examine the header? Use the file header table to decide!   ← Worksheet D.2.  
  3. Rename the file that you carved with the appropriate extension and verify you can open it.   ← Worksheet D.3.  
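Checking a file's first bytes against a header table can also be sketched in Python. The table below contains only the headers given in Part C of this lab; the function name and sample inputs are illustrative assumptions. The function returns a list because, as noted above, several file types can share one header, and in that case you still have to look deeper or use trial and error.

```python
# Headers from Part C of this lab (jpg, pdf, wav from the table; png from the example).
KNOWN_HEADERS = {
    "jpg": bytes.fromhex("ffd8ffe0"),
    "pdf": bytes.fromhex("255044462d312e"),
    "wav": bytes.fromhex("52494646"),
    "png": bytes.fromhex("89504e47"),
}

def guess_types(data):
    """Return every file type whose header matches the start of data."""
    return [name for name, header in KNOWN_HEADERS.items()
            if data.startswith(header)]

# Hypothetical inputs standing in for the first bytes of a decrypted file.
print(guess_types(b"%PDF-1.4 rest of file"))   # ['pdf']
print(guess_types(b"RIFF rest of file"))       # ['wav']
print(guess_types(b"no known header"))         # []
```

An empty result means the header is not in the table, not that the file is meaningless; a fuller table would be needed to go further.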

E. Personal OPSEC

Loose Lips Still Sink Ships

Just as you understand that your actions from your laptop leave evidence on your laptop, your and others' past actions have also left evidence (Locard's Exchange Principle), evidence that still exists on the Internet.

Personal evidence in the cyber domain can negatively impact Fleet operations. Adversaries are continually searching for information about targets, looking for anything that can be used to gain access to systems in technical and non-technical ways. The phrase Loose Lips Sink Ships is even more relevant in today's Information Age.

Data Aggregators

If you are not listed in Spokeo or Peoplefinder, that is good (but you must remain vigilant). Print out a copy of the no results found page with your name as the search criteria to receive credit for this part of the assignment.

My Epic Hacking

Mat Honan, of Wired.com, had his personal identity and online accounts hacked due to readily available public information and common commercial security practices.

There are many more data aggregator sites and services online; Spokeo is just one. Many of the data aggregator sites charge for full reports and require registration to remove data from their databases. Just because you took steps to remove data from Spokeo does not mean you are completely secure; recall that we cannot nullify (zero) risk, but we can and should take steps to appropriately mitigate risks.   Worksheet E.3.  

Browser Privacy Settings

We may not be able to remove all personal data from the Internet, but we can take steps to minimize the data kept about us in the future. Most modern browsers have Do Not Track settings. In this portion of the lab you will turn on the Do Not Track settings in commonly used browsers. One thing to note about the Do Not Track settings is that they are voluntary: there is no mandate (legal or technical policy) that compels a web site to adhere to them, or even a browser vendor to offer them.

Photo Forensics

References

  1. Spokeo. Spokeo, About. http://www.spokeo.com/about. Retrieved: 04 Nov 2014.
  2. PeopleFinder. About PeopleFinder.com. http://www.peoplefinder.com/about-us/. Retrieved: 04 Nov 2014.