After completing these activities you should be able to:
Explain Locard's Exchange Principle in the context of digital forensics.
Describe the use of hashing in digital forensics.
Describe the process of file carving and explain the basic computer architecture principles behind it.
Perform basic digital forensics activities.
Analyze basic forensic artifacts to deduce events that occurred in information systems.
Identify the steps in the digital forensics process.
Digital Forensics and the War on Terror
When terrorists or insurgents are captured, or when a hideout is discovered, one of the first orders of business is to search for computers, memory sticks, cell phones, and other kinds of electronic devices.
Then operatives carry out forensic analysis on the electronic devices in hopes of finding information about things like attack plans, or identities of other terrorists.
For example, when Osama bin Laden's compound was raided, a wealth of digital data were captured: five computers, dozens of hard drives, and more than 100 other storage devices.
Forensics is the scientific analysis of physical evidence, as from a crime scene or other related incident. When we use the term 'digital forensics' we are referring to the analysis of evidence left by events on an information system. From this evidence, we can reconstruct certain incidents and gather information about the user, the system, and the data affected.
When might we want to reconstruct a sequence of digital events? If a server is hacked, for example, we need to know how it was done and when, and we want to check the current state of our system.
The sexiest of the digital forensic scenarios is more CSI style: you recover a computer and you want to know what kind of shenanigans were done with it. You might be looking
for criminal evidence or traces of certain events. We'll focus on that kind of scenario.
Locard's Exchange Principle
In traditional, CSI-style forensics, one of the guiding concepts is Locard's Exchange Principle, which states that every time you make contact with another person, place, or thing, it results in an exchange of physical materials.
Thus, in the commission of a crime, the perpetrator leaves something at AND takes something from the crime scene.
These "somethings" are evidence.
More colorfully:
Wherever he steps, wherever he touches, whatever he leaves, even without consciousness, will serve as a silent witness against him. Not only his fingerprints or his footprints, but his hair, the fibers from his clothes, the glass he breaks, the tool mark he leaves, the paint he scratches, the blood he deposits or collects.
All of these and more, bear mute witness against him.
This is evidence that does not forget.
It is not confused by the excitement of the moment.
It is not absent because human witnesses are.
It is factual evidence.
Physical evidence cannot be wrong, it cannot perjure itself, it cannot be wholly absent.
Only human failure to find it, study and understand it, can diminish its value. — Paul L. Kirk. 1953.
Locard's principle holds in the digital world as well and, in fact, it holds whether you are perpetrating a crime or not.
We have seen several examples of this already.
Visiting a website: Suppose you visit amazon.com and login there.
What evidence of this "visit" do you leave at the amazon.com web server?
An entry in the web server log, of course!
What evidence do you take with you?
First, a cookie from the amazon.com server.
Second, your browser caches a copy of the web pages you visit — i.e. it stores a copy on your machine of each web page.
This is so that when you look at a page a second time, you can just use the cached copy, provided the page hasn't changed, and not have to wait for the page to be resent from the server.
Third, your browser keeps a history of all the pages you've visited — which it uses to offer you a list of completions of the URL you're currently typing.
Additionally, DNS results for your system asking to resolve (look up) the IP address of amazon.com are cached on your local system.
Message board injection attack: Recall the injection attacks that you guys used to bring down the class message board.
You posted some "message" that included bad JavaScript which ultimately crashed the message board.
When you do this, what do you leave at the scene?
The web server log lists the URL you requested, which includes the message you sent because of the GET method.
Thus both the web server log and the HTML source of the message board then show the JavaScript you injected.
What did you take with you?
You might have noticed that your browser "remembers" values you've entered into form elements in the past, which can often save you some typing.
That means that your browser has stored your injected JavaScript, along with the fact that it was entered into the "message" element of the message board.
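To make the GET-method point concrete, here is a minimal Python sketch of how a submitted form field ends up inside the requested URL, and so inside the web server log. Only the field name "message" comes from the text above; the host name, path, and payload are hypothetical placeholders, not the real class message board.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical form submission: the field name "message" comes from the
# text above; the host, path, and payload are made up for illustration.
form_fields = {"message": "<script>alert(1)</script>"}

# With the GET method, the browser appends the form fields to the URL.
requested_url = "http://board.example/post?" + urlencode(form_fields)

# A web server typically records the requested URL, so an investigator
# reading the log can decode the submitted message from it.
logged_query = urlsplit(requested_url).query
recovered = parse_qs(logged_query)["message"][0]

print(requested_url)
print(recovered)
```

The special characters are percent-encoded in the URL, but decoding the logged query string gives back exactly what was typed into the form.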
A few more examples of "things you leave" on remote hosts
In addition to visiting websites, one of the ways we've seen that we "go somewhere" in the cyber world is by using SSH to get a terminal on a remote host.
It's interesting to see what you leave behind when you do this:
Login attempts: Every attempt you make to login to a system, successful or not, is logged!
On ssh.cyber.usna.edu, for example, there is a file /var/log/auth.log that the sysadmin (System Administrator) has access to, that contains a log entry for every successful and unsuccessful attempt to login.
Here's an example of a few entries:
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=131.122.6.104 user=mxxxxxx
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: Accepted password for mxxxxxx from 131.122.6.104 port 49961 ssh2
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: pam_unix(sshd:session): session opened for user mxxxxxx by (uid=0)
This tells us that at 8:38am on 1 November, someone at host 131.122.6.104 tried to login as user mxxxxxx, gave the wrong password, then tried to login again and was successful.
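The reading of those entries can be automated. The sketch below is one possible approach, not the sysadmin's actual tooling: it scans the three sample lines quoted above for failed and accepted logins. The two regular expressions are assumptions fitted to these sample lines; log formats vary between systems, so they would likely need adjusting elsewhere.

```python
import re

# The three sample entries quoted above, verbatim.
LOG = """\
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=131.122.6.104 user=mxxxxxx
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: Accepted password for mxxxxxx from 131.122.6.104 port 49961 ssh2
Nov 1 08:38:05 ssh.cyber.usna.edu sshd[3962]: pam_unix(sshd:session): session opened for user mxxxxxx by (uid=0)
"""

# Patterns fitted to the sample lines only (an assumption, not a standard).
FAILURE = re.compile(r"authentication failure;.*rhost=(?P<host>\S+)\s+user=(?P<user>\S+)")
SUCCESS = re.compile(r"Accepted password for (?P<user>\S+) from (?P<host>\S+)")

def summarize_logins(log_text):
    """Return a list of (outcome, user, host) tuples in log order."""
    events = []
    for line in log_text.splitlines():
        failed = FAILURE.search(line)
        accepted = SUCCESS.search(line)
        if failed:
            events.append(("failed", failed["user"], failed["host"]))
        elif accepted:
            events.append(("accepted", accepted["user"], accepted["host"]))
    return events

events = summarize_logins(LOG)
for outcome, user, host in events:
    print(outcome, user, host)
```

A failed attempt followed by an accepted one from the same host, as here, is the pattern described above; many failures from one host with no success would suggest password guessing.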
Think about how this could be used to track someone who was doing or trying to do bad things!
Commands executed: Every command you execute is logged!
On ssh.cyber.usna.edu, for example, the sysadmin has a tool called lastcomm that lists every command executed by any user.
Here's an example of a few lines output by the command:
md5sum mxxxxxx ?? 0.00 secs Thu Nov 3 07:36
bash F mxxxxxx ?? 0.00 secs Thu Nov 3 07:36
ssh mxxxxxx ?? 0.00 secs Thu Nov 3 07:36
bash F mxxxxxx ?? 0.00 secs Thu Nov 3 07:36
What do we learn from this?
We learn that at 7:36am on 3 November user mxxxxxx computed an MD5 hash and then ssh'd to some host.
Think about how that might be used as evidence.
In fact, there's a command called history that will bring up the last N commands you've given, along with arguments like filenames, etc.
If you login to your ssh.cyber.usna.edu account and give the history command, you'll probably see every command that you've ever given on ssh.cyber.usna.edu!
A few more examples of things that stay with you on your machine
Altering the registry on a Windows system can lead to system instability or system crashes.
Do not modify registry data unless you know what you are doing.
In SY110 we will not have you modify registry values; you will only read them.
Follow the below steps to launch an alternate version of regedit:
Direct from Search Dialog, Run Dialog, or an Administrator shell:
Enter command: regedt32
If prompted, click Yes in the User Account Control dialog.
The digital forensics lab will explore the kind of information that stays behind — perhaps unexpectedly — on your Windows computer.
So we mention only a few examples here (these are the typical locations, although exact locations may vary due to operating system version, implementations, and local IT policies):
Browser cache: C:\Users\m9999\AppData\Local\Google\Chrome\User Data\Default\Cache
- Browser history: Ctrl + H
- Browser history in the file system: C:\Users\m9999\AppData\Local\Google\Chrome\User Data\Default\History
Recently accessed files: If you launch regedit and look under: HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs
you will see a list that shows files that you opened recently, sorted by file extension.
If you right-click and choose "Modify", you'll see the file names.
Networks you've been on: If you launch regedit and look under: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\NetworkList\Signatures
and then look under both Managed and Unmanaged, you'll see, among other things, the MAC addresses of the gateway routers for networks you've been on.
- View previously connected wireless networks: netsh wlan show profiles
"Meta-data" in documents: Programs like Microsoft Word store "meta data" in the documents they create.
For example, if you right-click on the icon for a Word file, choose "Properties," and look under the "Details" tab, you often find information like the name of the document's author, an e-mail address, the username of the author, and so forth.
When such documents are published to the world, they may leak information that could be used for evil purposes.
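The same metadata can be read without opening Word. A .docx file is a ZIP archive, and its document properties are stored in the member docProps/core.xml. The sketch below reads the author field from that member. The function name docx_author is hypothetical, and the demo runs on a tiny stand-in archive built in memory (not a complete Word document) that uses the placeholder username m9999.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used for the author field in a .docx file's core properties.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

def docx_author(docx_file):
    """Return the author stored in a .docx file, or None if absent.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return None
        root = ET.fromstring(archive.read("docProps/core.xml"))
    creator = root.find(DC_NS + "creator")
    return creator.text if creator is not None else None

# Demo on a stand-in archive holding only the properties member.
core_xml = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:creator>m9999</dc:creator>'
    '</cp:coreProperties>'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", core_xml)

print(docx_author(buffer))
```

On a real document you would pass the file's path instead of the in-memory buffer.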
File Carving
File carving is an incredibly useful skill to have in the world of digital forensics.
It is the process used in computer forensics to extract data from a storage device after the files have been deleted, the device has been erased, or the device has been damaged.
But, at this point, the data on the device just looks like a sequence of "raw bytes" — meaning a sequence of bytes without any information as to where any file begins or ends in the sequence of bytes.
In order to extract valuable data from these raw bytes, file carving is necessary.
How a storage device ("drive") is formatted
Just like communications protocols, data storage devices have a format.
A storage device (hard drive, thumb drive, etc) is nothing more than a huge sequence of bytes.
We can refer to a specific byte by its offset, i.e. its distance from the initial byte.
So the initial byte has offset 0, the next offset 1, and so forth.
Additionally, a formatted drive has:
a file system — a record for each file that includes its name, the byte offset at which it begins, and the byte offset at which it ends. This system also indicates what directories there are and which directory each file belongs in.
Some of the bytes on the drive are used to represent the file system.
files — a chunk of consecutive bytes on the drive. The file system only contains information about the files.
free or unallocated space — bytes on the drive that are not currently being used to store information, either as part of the file system or as part of a file. When new files are created, bytes from the unallocated space are repurposed (relabeled) to store the new file.
What does "delete" a file really mean?
The Windows Recycle Bin
When a user deletes a file in Windows, the file is actually moved to a new directory, called the Recycle Bin.
When the user empties the Recycle Bin, the files in the Recycle Bin are deleted as described below.
When you tell the operating system to delete a file, all it really means is that:
the file system structure's record of that file (its name, the byte offset it starts at, and the byte offset it ends at) is removed from the system, and
the bytes that constitute the file itself are simply reclassified as unallocated space; i.e., the bytes that make up the data of the file are not actually deleted.
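The two steps above can be modeled with a toy drive in Python. This is a deliberately simplified sketch: the drive is a 64-byte array, the "file system" is just a dictionary, and the names drive, file_system, write_file, delete_file, and notes.txt are all hypothetical. Real file systems are far more complex, but the key behavior is the same.

```python
# Toy model of a drive: raw bytes plus a file system table mapping each
# file name to its (beginning offset, ending offset), both inclusive.
drive = bytearray(64)
file_system = {}

def write_file(name, start, data):
    """Store data on the drive and record where it lives."""
    drive[start:start + len(data)] = data
    file_system[name] = (start, start + len(data) - 1)

def delete_file(name):
    """Remove only the file system record; the bytes stay on the drive."""
    del file_system[name]

contents = b"meet at noon"
write_file("notes.txt", 0, contents)
delete_file("notes.txt")

print("notes.txt" in file_system)        # is the record still there?
print(bytes(drive[0:len(contents)]))     # what is still on the drive?
```

After the delete, the name and offsets are gone from the table, yet the data bytes remain on the drive until something else is written over them.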
The House Analogy
File Name                                          Beginning Byte Offset   Ending Byte Offset
helloWorld.txt                                     0x00                    0xE
historypaper_updated_edited_final_version5.docx    0x10                    0xEA
Lecture2.pptx                                      0x9A2                   0xA01
In the above example file system, helloWorld.txt begins at byte 0x00 and ends at byte 0xE. As you can see, the files are not necessarily stored one right after another on the drive, but knowing the beginning byte and ending byte allows the file to be located wherever it is stored. As long as the information regarding helloWorld.txt is present in the file system, the space on the drive where the file is stored (0x00 to 0xE) cannot be overwritten by another file. The file system itself is also stored on the drive but does not contain the files themselves, just the name, location, and other information that tells you where to find the files.
Think of a file system as an address book. Imagine the information in the file system as the information written in your address book and the data that makes up the files as the actual houses. The address book will show you where to find the houses, and as long as that information is in your address book, you cannot build another house there. You can make changes and additions to the house though.
When you go into My Documents or some other location in the file system and delete helloWorld.txt you are not deleting the 1s and 0s that make up helloWorld.txt. You are, however, deleting the file name, beginning byte offset, ending byte offset, and other information stored in the file system.
For our analogy, it's as if you are erasing the information written in your address book, but not knocking down the house. HOWEVER, if someone else wanted to build a new house in that location, they could knock down the house at the location you erased from your address book, build a new one, and write the new information in the address book, or file system. This house might be called goodbyeWorld.txt instead of helloWorld.txt and be a completely different shape and size.
In the address book analogy, when the information in your address book is deleted, the house is still there. If you were to walk down the street where the house is, you could still look at it, take a picture, and get whatever information you needed about the house. Even though the 1s and 0s in the file system have been deleted, the 1s and 0s that make up the file are still on the drive. If you search through the drive, you can find those 1s and 0s just like you can find the house on the street. That is, until that space is overwritten with other 1s and 0s.
How to truly delete a file
So what if you want to delete a file so that it truly cannot be recovered?
To do that you have to not only "delete" the file in the sense of removing its record from the file system, you must also overwrite the bytes of that file with zeros or with random values.
There are utilities that will do that for you.
It is possible that a sophisticated forensics analysis could analyze the magnetic patterns on a drive and determine not only the current bit pattern of a byte, but also previously stored bit patterns.
Fear of this has led many people to consider a file to only truly be deleted if its bytes have been overwritten many times.
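The overwrite-then-delete idea can be sketched in a few lines of Python. This is an illustration only, not a vetted wiping utility: the function name overwrite_and_delete and its passes parameter are hypothetical, and on SSDs, journaling or copy-on-write file systems, and synced folders the old bytes may survive somewhere other than where the new ones are written.

```python
import os
import tempfile

def overwrite_and_delete(path, passes=1):
    """Overwrite a file's bytes in place, then remove its record.

    Caveat: the operating system and the storage hardware may keep old
    copies elsewhere, so this is not a guaranteed wipe.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # random values; zeros would also do
            f.flush()
            os.fsync(f.fileno())        # ask the OS to push the write to the device
    os.remove(path)                     # now remove the file system record

# Demo on a throwaway temporary file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"secret plans")
os.close(fd)
overwrite_and_delete(demo_path, passes=3)
print(os.path.exists(demo_path))
```

The passes parameter mirrors the "overwritten many times" practice mentioned above. The whole file is generated in memory at once, so a real tool would write large files in chunks.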
Recovering a file that has been "Deleted"
Notice that after a file is "deleted", all of its bytes are still sitting there on the drive ... they are simply categorized as "unallocated", which means they are available for use to represent other files.
So, a file that has been deleted is recoverable up until the time that its bytes are overwritten for other purposes.
However, the file's name and the offsets at which it begins and ends are no longer available.
So the trick is finding where the file begins and ends, and that is what "file carving" is all about.
Recall from our earlier discussions of file systems (during Cyber Battlefield – Operating Systems) that most files have headers and footers that indicate what kind of data the file holds.
File carving is essentially looking for file headers and footers in recovered blocks of bytes.
With computers, "deleting" a file doesn't necessarily mean the data stored in the file (the bytes that comprise the file) are gone.
It means that the file system's record of the file's name, and the file's connection to that area of the hard drive, are gone.
Those bytes become "unallocated space", but still hold data that can be interpreted.
To carve a file from a block of bytes, you'll need to look for the header of the file, and depending on the file type the footer of the file.
For example, the header (in hex) for a PNG file is 89 50 4e 47 and the footer is 49 45 4e 44 ae 42 60 82.
Below we have an example of a chunk of unallocated space from a drive.
Looking carefully, we spot a PNG header (starting at offset 10) and, following it, a PNG footer (ending at offset 42), so we deduce that a PNG file occupies offsets 10 through 42.
Block of unallocated space from a drive (offsets in decimal)
Offsets  Bytes                                                            Region
00-09    7e 93 20 20 51 e9 05 6d ff 67
10-13    89 50 4e 47                                                      PNG header
14-34    0d 0a 1a 0a 00 00 00 0d 49 48 44 52 54 78 9c 62 60 01 00 00 00   body
35-42    49 45 4e 44 ae 42 60 82                                          PNG footer
43-58    3d 69 c4 82 81 f0 6f 61 e4 40 4b b4 34 2f 2e bb
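The search we just did by eye can be written as a short Python sketch. It uses the header and footer values given above and the same block of unallocated bytes. The function name carve is hypothetical, and this is a bare-bones illustration: real carving tools also validate what they find, since a footer pattern could appear by chance and files can be fragmented.

```python
PNG_HEADER = bytes.fromhex("89 50 4e 47")
PNG_FOOTER = bytes.fromhex("49 45 4e 44 ae 42 60 82")

# The block of unallocated space from the example above.
block = bytes.fromhex(
    "7e 93 20 20 51 e9 05 6d ff 67 "
    "89 50 4e 47 "
    "0d 0a 1a 0a 00 00 00 0d 49 48 44 52 54 78 9c 62 60 01 00 00 00 "
    "49 45 4e 44 ae 42 60 82 "
    "3d 69 c4 82 81 f0 6f 61 e4 40 4b b4 34 2f 2e bb"
)

def carve(data, header, footer):
    """Return (start, end, file_bytes) for each header...footer span found."""
    results = []
    start = data.find(header)
    while start != -1:
        footer_at = data.find(footer, start + len(header))
        if footer_at == -1:
            break                           # header with no footer after it
        end = footer_at + len(footer) - 1   # offset of the last footer byte
        results.append((start, end, data[start:end + 1]))
        start = data.find(header, end + 1)  # keep looking past this file
    return results

for start, end, carved in carve(block, PNG_HEADER, PNG_FOOTER):
    print(start, end, len(carved))   # prints: 10 42 33
```

The carved bytes could then be written to a new file ending in .png, although this tiny example block is not a complete, viewable image.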
Digital Forensics in the real world
The proliferation of cell phones, computers, and PDAs has added the element of digital forensics to investigations. Even on fictional crime shows, you often see individuals using digital forensics to help solve a case. An investigation is conducted in three stages: Acquisition, Analysis, and Reporting. No one stage is more important than another; each is equally critical to an investigation. In a criminal court case, the defendant's lawyers try to convince the jurors that there is reasonable doubt in order to force a verdict of Not Guilty. For this reason, digital forensics investigators must conduct their investigation in a way that an independent investigator could repeat.
Acquisition Stage: During the acquisition stage, the evidence on which the investigation will be conducted is collected. The investigation is conducted on a copy of the original digital evidence, using both a logical and a physical image of the evidence. The investigator will also hash the evidence so that it can later be shown that the digital evidence has in no way been modified. A chain of custody log must be generated that records all actions taken during the acquisition stage.
Analysis Stage: During the analysis stage, the investigator examines the evidence and extracts information from the data. The goal is to construct a time-line of events. Throughout the analysis, the evidence is hashed to prove that it has not been modified in any way, since changing even one bit of the digital data would change the entire hash digest. If the hash digest differed from the hash digest of the original evidence, this would be a gold mine for the defendant's lawyer, because it could introduce reasonable doubt to the trial. The analysis of the evidence may lead to the need to collect further evidence. Also, during this stage the investigator should keep a log of all their actions during the analysis of the evidence.
Reporting Stage: During the reporting stage, the unbiased findings from the first two stages are communicated to an adjudication venue. The adjudication venue (e.g., criminal court, internal investigation, etc.) is the organization that will make some recommendation or decision based on the evidence. While the format of the reported results may vary depending on the organization performing the analysis and the adjudication venue, the results normally include the following components:
An executive summary (an easy to read version of the findings)
A time-line of events
A hash of all digital evidence
The unbiased detailed findings
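The hashing done during acquisition and analysis can be sketched in Python. The text does not name a specific hash algorithm for evidence, so SHA-256 here is an assumption, and the byte strings are hypothetical stand-ins for a real drive image.

```python
import hashlib

def sha256_digest(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Stand-in for an acquired evidence image (hypothetical contents).
original = b"contents of the acquired drive image"
working_copy = bytes(original)

# Flip the lowest bit of the first byte to simulate a one-bit modification.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

# Digest recorded at acquisition time, then re-checked during analysis.
acquisition_hash = sha256_digest(original)
print(sha256_digest(working_copy) == acquisition_hash)  # unmodified copy
print(sha256_digest(tampered) == acquisition_hash)      # one bit changed
```

A matching digest supports the claim that the working copy is identical to what was acquired; a mismatch, even from a single flipped bit, shows that the data changed somewhere along the way.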
Supplemental Media:
Overview of Digital Forensics
Review Questions:
What is Locard's Exchange Principle?
What are five examples of traces that you leave behind in the digital world?
What are five examples of traces that you take with you in the digital world?
What are the 3 stages of a forensics investigation?
Why is it important that all forensics work be repeatable and unbiased?
What role does hashing play in collecting evidence? Analyzing? Reporting?
How is it possible to recover data even after it has been deleted?
Why is it that you are more likely to be able to recover a recently deleted file than a file that was deleted a long time ago?