SI110: Phases of a Cyber-attack / Cyber-recon

In this lecture we will discuss the fundamental phases of a cyber attack. The first phase, reconnaissance, will be covered in detail here, while the remaining phases will be covered in the Network Attack lecture.

What is a "Cyber Attack"?

As we've discussed, "security" for an information system means its ongoing ability to provide its services while maintaining the five properties described as the "pillars of IA". An attack, then, is an action that violates one of the pillars. What kind of nefarious goals might we have? What kind of evil action might someone want to undertake that would require violating an IA pillar? Here are some examples:

Evil goal: I want to ...	pillar violated
steal a file	confidentiality
deface a webpage	integrity
bring down a DNS server	availability
send a bad e-mail from someone else's account	non-repudiation
steal login credentials	authentication

So, suppose you are a wanna-be attacker, meaning that you have a goal — either a particular goal that requires violating a pillar, or the general goal of violating as many pillars as possible — and a target system. The target could be a single host or a network, or a large information system infrastructure. What's stopping you? Well, we just finished looking at tools we can use to protect the IA pillars: firewalls, encryption, hashing, password authentication, and certificate-based authentication. These are the things that are stopping you from achieving your evil goal.

Let's look at some very simple examples in which we just want to steal some data, i.e. to violate confidentiality:

You know that George from Accounting keeps the secret recipe for his award-winning chili on his office computer, and your goal is to steal the recipe. What's stopping you? Password authentication! You need his username and password to log in to his computer and access the recipe.
You want to view a webpage with secret planning information housed on your competitor's internal webserver. What's stopping you? A firewall! Your competitor's network sits behind a firewall that doesn't allow port 80 bound traffic in.
There's a guy on your WiFi network whom you want to discredit. You want to snoop in on his browser traffic to see what pages he's looking at. What's stopping you? Encryption! He's accessing sites via HTTPS and all the traffic is AES encrypted!

The moral here is this: If you want to attack a system, you need to violate a pillar. In order to violate a pillar, you need to defeat the tools that are being used to protect the pillars.

An example to guide our discussion

We will focus on attack scenarios in which the attacker is on a different network than the system he wants to attack. To focus further, let's suppose we are the "bad guys" and we want to steal a file named secret.txt, which is in the account of a user named timvic on a host within the 5.6.7.0 network. Conceptually, our situation is illustrated by the diagram to the right, with the file secret.txt residing within the innermost circle. Each circle represents some barrier built by the tools we've discussed.

What creates the outermost barrier, the one between the internet at large and the target host's network? A firewall!
What creates the middle barrier, the one between the target host and its network? The combination of the host's firewall and the limitations the machine's administrator has put on which processes are actually listening on ports.
What creates the innermost barrier, the one between processes on the host and, in this case, the file secret.txt? Recall that both files and processes have an owner, and the operating system doesn't allow a process to access a file unless its owner matches the file's owner, or the process owner is Administrator. Thus, the operating system's authentication is what provides a barrier between regular user processes and the users Administrator and timvic, which have the privileges necessary to access the file secret.txt.
Note that if user timvic had AES-encrypted secret.txt before storing it, there would be yet another barrier (created by encryption) and yet another circle in the very center of our conceptual diagram.

Getting in through the firewall (the outermost ring) means using ports that it doesn't drop. Getting in to the host means using services it actually has running, i.e. processes listening on ports, like a webserver listening on port 80. Getting past the final barrier in order to access secret.txt means getting control of a process running as a user with elevated privileges relative to that data, i.e. Administrator or the user timvic. Ports allowed through by the firewall and services running on hosts are like gaps in the barriers. Potentially, such a gap can be exploited to provide access we shouldn't have.

A more accurate picture would reflect the fact that there are (generally) more hosts on the target host's network than just the target host itself. Moreover, different hosts on the network might have different services running, thus allowing different potential paths in from the outside. Consider the diagram to the right, which shows that there are two hosts: the target and a host running a webserver. As outsiders, we have no way to access our target host directly: the only service it runs is SSH on port 22, but the firewall does not let port 22 traffic in. However, we could imagine an attack proceeding by sending port 80 traffic into the network (which the firewall allows) to the webserver with some carefully crafted content that exploits a bug in the webserver, ultimately allowing us to execute commands on it. We use this to make an SSH connection from the webserver to the target host, which is allowed since both parties are inside the firewall. Now we have some degree of access to our target host. Subsequent steps in the attack would have to take advantage of that to pursue the ultimate goal of stealing a copy of the file secret.txt.
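The pivot in this scenario can be sketched as a small reachability search. This is a toy model under stated assumptions, not a tool: the host names, the FIREWALL_ALLOWED set, and the LISTENING table are hypothetical and simply mirror the example (the firewall admits port 80, the webserver listens on 80, the target listens on 22). It also assumes that reaching a service on a host is enough to take that host over, which in a real attack is the hard part.

```python
from collections import deque

# Hypothetical model of the example network (all names are illustrative).
FIREWALL_ALLOWED = {80}            # ports the perimeter firewall lets in
LISTENING = {                      # services each inside host is running
    "webserver": {80},
    "target": {22},
}

def reachable(src, dst, port):
    """Can src open a connection to dst on port, in this toy model?"""
    if port not in LISTENING.get(dst, set()):
        return False               # nothing is listening on that port
    if src == "outside":
        return port in FIREWALL_ALLOWED   # traffic must cross the firewall
    return True                    # hosts inside the firewall talk freely

def attack_path(goal):
    """Breadth-first search for a chain of hops from outside to goal."""
    queue = deque([["outside"]])
    seen = {"outside"}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for host, ports in LISTENING.items():
            if host not in seen and any(reachable(path[-1], host, p) for p in ports):
                seen.add(host)
                queue.append(path + [host])
    return None                    # no chain of open services leads to goal

print(attack_path("target"))       # ['outside', 'webserver', 'target']
```

Removing port 80 from FIREWALL_ALLOWED makes attack_path return None, which is the model's way of saying that every gap in the outer barrier has been closed.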

What's important to note is that there are three basic phases of an attack like this:

Reconnaissance: In which we find out the information we need to actually get in: what traffic the firewall lets through, what hosts are in the network, what services they actually have running, etc. In the previous example, reconnaissance would have discovered that the firewall lets in port 80 traffic, that there's this other host running a webserver, that our target has an SSH server, etc.
Infiltration: In which we gain the access we need to achieve our goal (stealing secret.txt in this example). This might involve multiple steps, as it did in this example in which we first gained access to the "other host", and then used that access to get into the target host.
Conclusion: In which we do the bad stuff that motivated our getting this illicit access (e.g. stealing secret.txt) and take whatever steps are necessary to cover our tracks.

The following reprises these three phases, with a lot of emphasis on reconnaissance.

Phase I: Reconnaissance

"To remain in ignorance of the enemy's condition simply because one grudges the outlay of a hundred ounces of silver in honours and emoluments, is the height of inhumanity." ~ Sun Tzu

The goal of the reconnaissance phase is to identify weak points of the target. A successful military strategist would dedicate ample resources to reconnaissance to find weaknesses in the enemy's defenses or to assess the enemy's capabilities. In either case, any information gathered about the target may be the crucial piece needed to reveal a critical weakness in defense or an unknown offensive capability of the enemy.

A cyber attack is not all that different from a military attack. A cyber attacker will dedicate a significant amount of time to observing and probing the target computer network to find weaknesses in its defense. Any weakness found may lead to infiltration of the target network. Here is a list of some critical information that should be obtained during the reconnaissance phase:

Network Information	IP addresses, subnet mask, network topology, domain names
Host Information	user names, group names, architecture type (e.g. x86 vs SPARC), operating system family and version, TCP and UDP services running (with versions)
Security Policies	password complexity requirements, password change frequency, expired/disabled account retention, physical security (e.g. locks, ID badges, etc.), firewalls, intrusion detection systems
Human Information	home address, home telephone number, frequent hangouts, computer knowledge, dark secrets

Available resources include, but are not limited to:

Criminal records
Target's website
Public DNS Servers
Internet registry
Phonebook
News articles
Personal Blogs
Social Media
Satellite Images
Discarded Trash

This information is obtained by scouring all the resources available to the attacker using two distinct methods: passive and active.

Passive Reconnaissance

Gathering information without alerting the subject of the surveillance is passive reconnaissance. This is the natural start of any reconnaissance because, once alerted, a target will likely react by drastically increasing security in anticipation of an attack. This is like casing a place prior to robbing it.

Passive reconnaissance is commonly referred to as footprinting and, in the context of a cyber attack, means minimizing any interaction with the target network that might raise flags in the computer logs. For example, visiting the target's website may leave behind a trace that your IP address established a TCP connection to the target's web server, but it will be one of millions of connections that day and is unlikely to stand out to the administrator in the periodic review of server logs. On the other hand, visiting the target's website so frequently that the server becomes overloaded is certain to alert an administrator.

There is a lot of information freely available via the Internet. It can all be yours with a web browser and a lot of patience. The best starting place is the target's public website. View the HTML source of its pages to look for clues. Users may provide personal information on their company website or a social media site, which could give hints as to what their user account passwords are. Names can be entered in a white pages search to reveal home addresses and telephone numbers, which can expand footprinting to an employee's home.

Network information can also be obtained freely via public records online. Every IP address and Domain Name must be registered in a public database. As a result, a few queries to the right places will provide anyone with the target domain's IP address range, DNS servers, and a contact address and telephone number.
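One such public-record query can be sketched with the WHOIS protocol (RFC 3912): the client opens a TCP connection to port 43, sends the query followed by a carriage return and line feed, and reads until the server closes the connection. The function names below are hypothetical, and the default server, whois.iana.org, is only an assumed starting point; this is a minimal sketch rather than a full WHOIS client.

```python
import socket

def build_whois_request(query):
    """A WHOIS request is just the query text terminated by CRLF."""
    return query.encode("ascii") + b"\r\n"

def whois_query(query, server="whois.iana.org", port=43, timeout=10.0):
    """Send one WHOIS query and return the server's full text reply.

    The default server is an assumption for illustration; a reply often
    names a more specific registry server that should be asked next.
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_whois_request(query))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:           # server closed the connection: reply done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Nothing here touches the target's own network: a call such as whois_query("verizon.net") contacts only the registry's server, which is why this kind of lookup counts as footprinting.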

Active Reconnaissance

If footprinting is like casing a place, then active reconnaissance would be actually trying to open doors and windows to see which ones are unlocked. We already know about a number of tools that can be used for active network recon: ping, traceroute, and netcat (nc). If we know the range of potential IP addresses for our target network, we can use ping to determine which IPs are actually in use by hosts on the network. We can use traceroute to figure out the topology of the network, i.e. where the routers are with respect to the hosts. Finally, we can use netcat (nc) to determine which ports are open with servers listening on them.

Active reconnaissance is commonly referred to as scanning. A simple scan would be to ping every IP address owned by the target network to see which ones belong to real hosts. More sophisticated scans attempt a TCP connection with every port number of a specific IP address to determine which ports are open and, therefore, which services are running on the host at that IP address. Scanning is more intrusive than footprinting, but provides more specific information. There is also more risk that the target may be alerted to a potential attack, since scanning results in more abnormal connections to target hosts, so it must be done carefully to avoid detection.
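Both kinds of scan can be sketched in a few lines of Python. This is a minimal illustration, assuming the example 5.6.7.0 network from earlier is a /24 (the lecture does not give its mask); the function names are hypothetical, and the port check is a plain full-connection attempt rather than anything as capable as a real scanner.

```python
import ipaddress
import socket

def candidate_hosts(cidr):
    """List every usable host address in a network such as '5.6.7.0/24'."""
    return [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]

def port_is_open(host, port, timeout=1.0):
    """Attempt a full TCP connection; True means something accepted it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0

def scan_ports(host, ports, timeout=1.0):
    """Return the subset of ports on host that accepted a connection."""
    return [p for p in ports if port_is_open(host, p, timeout)]

# The /24 mask is an assumption; it yields addresses .1 through .254.
hosts = candidate_hosts("5.6.7.0/24")
print(len(hosts), hosts[0], hosts[-1])   # 254 5.6.7.1 5.6.7.254
```

A call such as scan_ports("127.0.0.1", range(1, 1025)) tries the low-numbered ports on your own machine. Each attempt is a complete TCP handshake, which is exactly the kind of abnormal connection pattern that shows up in a target's logs.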

One of the best tools for network scanning is a program called nmap. Nmap is a very powerful network scanner that we will be using to discover hosts on a target network during the Network Reconnaissance lab. We will discuss nmap in more detail during that lab.

Some other tools used for network scanning should already be familiar to you. Tools such as traceroute, ping, and netcat are commonly used to discover information about networks and their hosts. Let's analyze the traceroute information to Verizon.net.

traceroute to verizon.net (206.46.232.39), 30 hops max, 60 byte packets
 1  131.122.88.250 (131.122.88.250)  11.327 ms  11.378 ms  11.456 ms
 2  usna-c2-v726.net.usna.edu (10.0.2.21)  11.515 ms  11.550 ms  11.568 ms
 3  border-d1-v722.net.usna.edu (10.0.2.6)  17.075 ms  16.930 ms  16.797 ms
 4  border-f1-gi1_0.net.usna.edu (131.122.6.249)  11.016 ms  16.153 ms  16.112 ms
 5  border-r1-po1.net.usna.edu (192.190.228.1)  16.058 ms  6.819 ms  3.974 ms
 6  dren-sdp.net.usna.edu (138.18.45.5)  3.913 ms  1.319 ms  1.209 ms
 7  so48-2-1-0.ray.dren.net (138.18.1.59)  3.969 ms  3.894 ms  4.435 ms
 8  pos1-1-1.gw8.dca6.alter.net (152.179.75.129)  4.363 ms  4.298 ms  4.234 ms
 9  0.xe-3-0-3.xt1.dca6.alter.net (152.63.40.78)  4.171 ms  4.114 ms  4.046 ms
10  0.so-1-2-0.xl3.dfw7.alter.net (152.63.98.77)  268.630 ms  268.637 ms  267.884 ms
11  pos6-0.gw2.dfw13.alter.net (152.63.103.225)  265.903 ms  266.073 ms  267.466 ms
12  verizon-gw.customer.alter.net (63.65.122.26)  267.530 ms  264.703 ms  264.601 ms
13  po121.ctn-core1.vzlink.com (206.46.225.18)  280.754 ms  280.736 ms 280.663 ms <== this is verizon's router
14  * * *

Line 13 begins Verizon's network, marked by vzlink.com. Its IP address is given along with its domain name, which can be probed later for open TCP ports. At line 14, we see three asterisks. That means that there was no response from the next router within Verizon's network; this is a common security practice to make reconnaissance more difficult. In this case, we do not get much information about the internal network topology of Verizon, but we do learn that their network filters packets.
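Reading such output by eye works for one trace, but the same analysis can be sketched in code. The parser below is a hypothetical helper that assumes the line layout shown above (hop number, name, IP in parentheses, or asterisks for a silent hop); the sample is copied from the last three lines of that trace with the annotation removed.

```python
import re

# hop number, host name, then a dotted-quad IP address in parentheses
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)")
# hop number followed by an asterisk: the router did not answer
SILENT_RE = re.compile(r"^\s*(\d+)\s+\*")

def parse_hops(output):
    """Return (hop, name, ip) tuples; name and ip are None for silent hops."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), match.group(3)))
            continue
        silent = SILENT_RE.match(line)
        if silent:
            hops.append((int(silent.group(1)), None, None))
    return hops

SAMPLE = """\
12  verizon-gw.customer.alter.net (63.65.122.26)  267.530 ms  264.703 ms  264.601 ms
13  po121.ctn-core1.vzlink.com (206.46.225.18)  280.754 ms  280.736 ms 280.663 ms
14  * * *
"""

hops = parse_hops(SAMPLE)
last_visible = [h for h in hops if h[2] is not None][-1]
print(last_visible)   # (13, 'po121.ctn-core1.vzlink.com', '206.46.225.18')
```

The last hop that still answers is the deepest router we can name, and so a natural candidate for later probing.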

Netcat can be used to probe version information from open ports. The following example demonstrates this technique on port 80 of Verizon.net's web server:

$ nc verizon.net 80
GET / HTTP/1.1

HTTP/1.1 302 Found
Date: Wed, 27 Jul 2011 20:21:06 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: http://www.verizon.net/central
Set-Cookie: ASP.NET_SessionId=rurvikyswhz0xijy4p1hzt55; path=/; HttpOnly
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 147
Age: 25
Via: 1.1 localhost.localdomain

. . .

The Server header gives the name and version of the HTTP service running on Verizon's web server: Microsoft-IIS/6.0. This also hints at the web server's operating system. It is almost certainly running Microsoft Windows, since IIS is Microsoft's web server and runs only on Windows.
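Picking the version information out of a response like this is easy to automate. The sketch below is a hypothetical helper, not part of netcat: it splits a raw HTTP response into its status line and headers, and the sample text is a shortened copy of the response shown above.

```python
def parse_headers(raw_response):
    """Split a raw HTTP response into (status_line, headers dict).

    Header names are lowercased; if a header repeats, the last value wins.
    """
    # Headers end at the first blank line; normalize CRLF line endings first.
    head = raw_response.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:                    # skip anything that is not "Name: value"
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers

SAMPLE = (
    "HTTP/1.1 302 Found\r\n"
    "Server: Microsoft-IIS/6.0\r\n"
    "X-Powered-By: ASP.NET\r\n"
    "X-AspNet-Version: 2.0.50727\r\n"
    "\r\n"
)

status, headers = parse_headers(SAMPLE)
print(status)                      # HTTP/1.1 302 Found
print(headers["server"])           # Microsoft-IIS/6.0
```

Keep in mind that a server's administrator can change or remove these headers, so a banner is a strong hint rather than proof.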

Phase II: Infiltration

The goal of this phase is to gain control of a host on the target's network. This is typically done by gaining remote access to a shell or terminal as the administrator on that host.

Knowing a weakness is not enough to infiltrate the target; an attacker must discover a way to take advantage of that weakness. This does not necessarily require advanced knowledge and skill in computer programming, but having it can significantly improve the probability of success. Anyone can guess weak passwords to gain access, but developing a custom-made program to exploit poorly written code in software requires advanced programming knowledge and skill. But not everyone needs to develop exploits in order to use them. Do you have the knowledge and skill to build a car? Probably not, but that knowledge is not required to obtain a driver's license. The same goes for cyber exploits. As long as someone possesses the knowledge and skill to create exploitation programs, others can use them with little or no understanding of how they work.

There are many automated tools for exploitation of known computer weaknesses freely available on the Internet. The most popular exploitation program available is called Metasploit, which you will use later in the course.

Flame Covered Its Tracks
The "Flame" malware came to light in the summer of 2012. It's a very sophisticated piece of malware, probably produced by some nation-state, not by random hackers, terrorists or criminals. One of the many interesting aspects of the Flame malware was that it was designed to "cover its tracks", i.e. to erase traces of its existence on computers that it had infected but was finished with. The article "Flame authors order infected computers to remove all traces of the malware", from CIO Magazine, describes in some detail how Flame did this.

Phase III: Conclusion

The goal of the final phase is to achieve the intended objective and back out leaving no trace of the trespass. In practice, this is the most difficult phase because computers keep records of every logon, logoff, startup, shutdown, network connection, program execution, and error received. With so many records left on every computer accessed, including routers, it is nearly impossible to eliminate all traces of an intrusion. Often, an attacker will use techniques to deceive authorities as to the actual origin of the attack. This is also the phase in which the attacker will carry out the objective motivating the attack.

An attacker driven by monetary gain may sell credit card information, obtained with privileged access, on the black market. Or an attacker motivated by political gain may expose private and embarrassing information about a political opponent to damage the opponent's public standing. The examples go on and on.

Finally, the attacker may either terminate the connection, if no further access is required, or create a backdoor for future access to the target.