Let's step back a bit and discuss the bigger picture of how the client and the server communicate. You may have touched on this subject in a previous networking class, possibly in considerably more detail; we will focus on the basics. What we want to consider is that there is a back-and-forth conversation between the client and the server, and there are specifically agreed-upon terms to this conversation. Here is a minimal HTTP request:
GET / HTTP/1.1
Host: courses.cs.usna.edu
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.75 Safari/537.36
Accept: */*
Note: there is an empty line after the Accept: */* line, which ends the HTTP header and the request. The server then sends back its own header, an empty line, and the content:
HTTP/1.1 200 OK
Date: Fri, 30 Aug 2019 00:02:03 GMT
Server: Apache/2.4.29 (Ubuntu)
Vary: Accept-Encoding
Content-Length: 2967
Content-Type: text/html; charset=UTF-8
<!DOCTYPE html>
...
It is important to notice the extra newlines sent by both the client and the server: they inform the receiver that the sender has finished transmitting header/metadata and will now send content.
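That blank-line convention is easy to see in code. Below is a short Python sketch (the sample response is an abbreviated version of the one shown above) that splits a raw response into header and body at the first empty line:

```python
# Sketch: splitting a raw HTTP response into headers and body.
# The blank line (CRLF CRLF) is what separates the two.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 19\r\n"
    b"\r\n"
    b"<!DOCTYPE html>\n..."
)

# Everything before the first empty line is header/metadata.
header_bytes, body = raw.split(b"\r\n\r\n", 1)
headers = header_bytes.decode().split("\r\n")

print(headers[0])   # status line: HTTP/1.1 200 OK
print(len(body))    # 19 -- matches the Content-Length header
```

The same split works on any response you capture with netcat or curl, which is why the empty line matters: without it the receiver cannot tell where the metadata ends.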
You can test this by bringing up a terminal and running netcat. The line below starts a netcat session connected to port 80, the default port for HTTP requests; you can then type the HTTP request (as seen above) and hit Enter twice to send the blank line that completes the request.
nc courses.cs.usna.edu 80
Sometimes it is better to pipe the HTTP request to the netcat connection as below:
printf "\
GET / HTTP/1.1\r\n\
Host: courses.cs.usna.edu\r\n\
\r\n\
" | nc courses.cs.usna.edu 80
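The same exchange can also be scripted with Python's socket module. The sketch below builds the raw request bytes (note the CRLF line endings and the trailing blank line) and, when run directly, sends them to the host used in the examples above:

```python
import socket

def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 request: each header line ends in CRLF,
    and a blank line marks the end of the request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Connection: close\r\n"
        "\r\n"
    ).encode()

if __name__ == "__main__":
    host = "courses.cs.usna.edu"  # host from the examples above
    try:
        with socket.create_connection((host, 80), timeout=5) as sock:
            sock.sendall(build_request(host))
            response = b""
            while chunk := sock.recv(4096):  # read until the server closes
                response += chunk
        print(response.decode(errors="replace"))
    except OSError as exc:
        print(f"could not reach {host}: {exc}")
```

This is equivalent to the printf | nc pipeline: the bytes on the wire are identical, which you can confirm by comparing the printed response to what netcat shows.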
It will actually be easier to test using the curl command, as it will add the appropriate headers for you (instead of typing them in manually like we did with nc). More importantly, curl highlights the back-and-forth communications between the client and the server as they negotiate the terms of the request.
curl -v courses.cs.usna.edu 2> courses.headers 1> courses.html
The odd redirection in the command line above sends stdout (the HTML) to the file courses.html and stderr (the headers) to a separate file. curl deliberately writes its verbose output to stderr so that a plain redirection of stdout captures only the content; splitting the streams also helps with testing by keeping the content separate from the metadata and headers. Below is an example of the headers; as you can see, the flow of information is annotated by the arrows > and <.
* Rebuilt URL to: courses.cs.usna.edu/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Trying 10.1.83.55...
* TCP_NODELAY set
* Connected to courses.cs.usna.edu (10.1.83.55) port 80 (#0)
> GET / HTTP/1.1
> Host: courses.cs.usna.edu
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 30 Aug 2019 00:02:03 GMT
< Server: Apache/2.4.29 (Ubuntu)
< Vary: Accept-Encoding
< Content-Length: 2967
< Content-Type: text/html; charset=UTF-8
<
{ [1015 bytes data]
100 2967 100 2967 0 0 21345 0 --:--:-- --:--:-- --:--:-- 21345
* Connection #0 to host courses.cs.usna.edu left intact
We are using courses.cs.usna.edu as the example because the site is not encrypted by default. Many sites will quickly redirect the user to their encrypted site, and for the purposes of our discussion the unencrypted communication is easier to inspect.
Below are a few different variations on the request method. Test them and see what the differences are.
HEAD / HTTP/1.0
GET /cgi-bin/query.pl?str=dogs&lang=en HTTP/1.0
POST /cgi-bin/query.pl HTTP/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 16
str=dogs&lang=en
GET /img1.jpg HTTP/1.1
Host: www.host1.com
GET /img6.jpg HTTP/1.1
Host: www.host1.com
Connection: close
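For the POST variation, note that the Content-Length header must equal the number of bytes in the body. A quick Python sketch of how that body and length are produced (the field names mirror the example above):

```python
from urllib.parse import urlencode

# URL-encode the form fields, as a browser would for a body with
# Content-Type: application/x-www-form-urlencoded.
body = urlencode({"str": "dogs", "lang": "en"})
print(body)                 # str=dogs&lang=en
print(len(body.encode()))   # 16 -- the Content-Length value in the example

# Assemble the full POST request from the example above.
request = (
    "POST /cgi-bin/query.pl HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body.encode())}\r\n"
    "\r\n"
    f"{body}"
)
```

If the Content-Length does not match the body, the server may truncate the data or hang waiting for more bytes, which is worth keeping in mind when you hand-craft requests in netcat.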
Bring up the Chrome developers console (ctrl-shift-j), click the network tab, and then reload courses.cs.usna.edu. You can now see all of the individual items that were loaded and you can see the contents of the HTTP requests and responses.
For another example, go to the course website and take a look at what was sent and received. You can click on each file and see the headers of both the requests and responses and the body of the responses received as well.
There are quite a few websites whose content needs to be encrypted to protect the user (think banking, health, privacy, etc.), so we need a method to encode/decode the data. HTTPS uses both asymmetric and symmetric encryption. This process runs in the following sequence:
1. The client contacts the server, and the server responds with its certificate, which contains the server's public key.
2. The client verifies the certificate, generates a symmetric session key, and sends it to the server encrypted with the server's public key (asymmetric encryption).
3. Both sides then use the shared session key (symmetric encryption) for the rest of the conversation, since symmetric encryption is much faster.
One of the most important things you need to take away from this conversation is that HTTP is a stateless protocol: each request is treated independently, and the server keeps no memory of previous requests. To maintain state across requests, the server can set a cookie, which the browser then sends back with every subsequent request. Consider the URLs below:
http://midn.cs.usna.edu/~mXXXXXX/IT350.html
http://csfaculty.academy.usna.edu/~adina/it350/demo/welcome.php?username=ac
welcome.php:
<?php
// Read the username from the query string and store it in a cookie.
// setcookie() must be called before any output is sent, because the
// cookie travels in the HTTP response headers.
$username = $_GET["username"];
setcookie("username", $username);
echo '<!DOCTYPE html>
<html lang="en"><head>
<meta charset="utf-8">
<title>Test</title></head>
<body>';
echo "<h1>Welcome $username</h1>";
echo '</body></html>';
?>
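The cookie round trip that makes this work can be sketched with Python's standard http.cookies module (the username value "ac" matches the query string in the URL above): the server's setcookie() call becomes a Set-Cookie response header, and the browser echoes the stored value back in a Cookie request header on every later visit.

```python
from http.cookies import SimpleCookie

# Server side: setcookie("username", $username) in the PHP above
# becomes a Set-Cookie header in the HTTP response.
response_cookie = SimpleCookie()
response_cookie["username"] = "ac"
print(response_cookie.output())   # Set-Cookie: username=ac

# Browser side: the cookie is stored and sent back with every
# subsequent request as a Cookie header -- this is how the server
# recognizes the user even though HTTP itself is stateless.
stored = SimpleCookie()
stored.load("username=ac")
print(f"Cookie: {stored['username'].key}={stored['username'].value}")
```

You can watch this exact exchange in the Chrome network tab: the first response to welcome.php carries the Set-Cookie header, and every request after it carries the matching Cookie header.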