A common set of tools that help provide confidentiality, integrity, authentication, and non-repudiation falls under the umbrella of cryptography. Over this and the next several lessons we will learn about several categories of techniques (symmetric encryption, asymmetric encryption, cryptographic hashing, and steganography), work through simple examples of those techniques, and use real-world tools that rely on them. We will also look at attacks on these techniques.

What is Cryptography

Caesar Shift: a Simple Encryption Method

The Caesar Shift Cipher assumes your message is all capital letters, and replaces each letter in the plaintext with a new letter to produce the ciphertext. The replacement scheme is based on a secret key that Alice and Bob have agreed upon ahead of time: a number in the range 0-25 called the shift value. The replacement scheme is simple: if the shift value is s, the kth letter in the alphabet is replaced by letter k+s in the alphabet, circling back around to the front of the alphabet if necessary. So with a shift value of 3, the letter B (the 2nd letter in the alphabet) is replaced with the letter E (letter number 2+3 = 5 in the alphabet). You can use the little applet below to help encrypt a message once you've chosen a shift value.

Decrypting means subtracting rather than adding the shift value, although you might notice that a shift value of 26-s actually reverses a shift of s. Let's follow this process through from start to finish:
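In code, the whole process might look like the following minimal Python sketch. The function names (caesar_shift, caesar_encrypt, caesar_decrypt) are made up for this illustration, and the choice to pass spaces through unchanged is an assumption rather than part of the cipher's definition.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_shift(text, shift):
    """Move every capital letter `shift` places along the alphabet, wrapping around."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            position = ALPHABET.index(ch)
            result.append(ALPHABET[(position + shift) % 26])
        else:
            result.append(ch)  # assumption: spaces and punctuation are left as-is
    return "".join(result)


def caesar_encrypt(plaintext, s):
    # Encrypting adds the shift value s.
    return caesar_shift(plaintext, s)


def caesar_decrypt(ciphertext, s):
    # Decrypting subtracts the shift value s.
    return caesar_shift(ciphertext, -s)


ciphertext = caesar_encrypt("MEET ME AT NOON", 11)
print(ciphertext)                           # XPPE XP LE YZZY
print(caesar_decrypt(ciphertext, 11))       # MEET ME AT NOON
print(caesar_encrypt(ciphertext, 26 - 11))  # MEET ME AT NOON
```

The last line illustrates the remark above: encrypting again with a shift of 26-s undoes a shift of s.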

Although very simple and, as we'll see, not very secure, the Caesar Cipher is a good example. It has the basic properties of any cryptosystem: two communicating parties Alice/Bob, nefarious eavesdropper Eve, plaintext/ciphertext, encryption/decryption. Moreover, it's representative of one of the two basic classes of cryptosystem, symmetric encryption (also called secret-key), where there is a secret key, shared by both Alice and Bob, that is used to encrypt and decrypt the message.

Encryption Key Management

Much military communication is encrypted today, for obvious reasons. What is not so obvious is how the Navy and Marine Corps manage all of the encryption keys used for encrypted communications. The system used throughout the military is called the Electronic Key Management System (EKMS) and is centrally controlled by the National Security Agency (NSA). EKMS is in place to provide communications security (COMSEC) material (i.e. encryption keys) and support tools for tracking and managing encryption key material, generation, distribution, and accounting.

Sound like an important job? It is, and you might be the one doing it at your command as a junior officer. Every Naval or Marine unit that uses secure communications has at least two EKMS managers, and it is common practice to have a junior officer act as one of them.

Frequency Analysis — breaking the Caesar cipher

Let's suppose you are Eve, and you've intercepted the message (ciphertext) XPPE XP LE YZZY. There are more P's than anything else, so you might guess (correctly in this case) that a P in the ciphertext came from an E in the plaintext. This would lead you to guess the key/shift-value s = 11.

It's not always going to be that easy, of course. The ciphertext RNCP KU QHH has more H's than anything else. If we assume that H's in the ciphertext came from E's in the plaintext, we would deduce a key/shift-value of 3. Decrypting assuming s = 3 gives OKZM HR NEE ... which is probably not the secret message. In fact, the plaintext that produced this message was PLAN IS OFF, encrypted with a shift value of 2.
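The "most common letter must be E" guess from the two examples above can be sketched in a few lines of Python. The function names here are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def guess_shift_from_top_letter(ciphertext):
    """Assume the most common ciphertext letter stands for E; return the implied shift."""
    letters = [ch for ch in ciphertext if "A" <= ch <= "Z"]
    top_letter, _count = Counter(letters).most_common(1)[0]
    return (ord(top_letter) - ord("E")) % 26


def caesar_decrypt(ciphertext, s):
    """Subtract the shift s from every capital letter, leaving other characters alone."""
    return "".join(
        chr((ord(ch) - ord("A") - s) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )


for ciphertext in ("XPPE XP LE YZZY", "RNCP KU QHH"):
    s = guess_shift_from_top_letter(ciphertext)
    print(ciphertext, "-> guessed shift", s, "->", caesar_decrypt(ciphertext, s))
# XPPE XP LE YZZY -> guessed shift 11 -> MEET ME AT NOON
# RNCP KU QHH -> guessed shift 3 -> OKZM HR NEE
```

The guess works on the first message and fails on the second, exactly as described above.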

The problem with this approach is that we only considered one letter: the most common one appearing in the ciphertext. Assuming H's came from E's gave us E's in our "cracked" message, but it also gave us a Z and a K, which are pretty uncommon. To do frequency analysis properly, we should consider all the letters in the message. This is tedious, of course, but when something is tedious, it just means that we ought to write a program and let the computer do it for us. Try out this page, which features a JavaScript program for cracking Caesar shift encryption via frequency analysis. It works by calculating, for each shift value, the likelihood of that shift value being correct, based on the frequencies of the letters that result from decrypting the given ciphertext with that shift value. It's very interesting to see how few characters of ciphertext are required to recover the key with a high degree of certainty.
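To give a feel for how such a program might work, here is a minimal Python sketch. It is not the code behind the page mentioned above. It assumes a log-likelihood score and an approximate table of English letter frequencies (in percent), and the names ENGLISH_PERCENT, log_likelihood, and rank_shifts are invented for this example.

```python
import math

# Approximate English letter frequencies, in percent (an assumption of this sketch).
ENGLISH_PERCENT = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23, "G": 2.02,
    "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03, "M": 2.41, "N": 6.75,
    "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99, "S": 6.33, "T": 9.06, "U": 2.76,
    "V": 0.98, "W": 2.36, "X": 0.15, "Y": 1.97, "Z": 0.07,
}


def caesar_decrypt(ciphertext, s):
    """Subtract the shift s from every capital letter, leaving other characters alone."""
    return "".join(
        chr((ord(ch) - ord("A") - s) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )


def log_likelihood(text):
    """Sum of log-probabilities of the letters in text; higher means more English-like."""
    return sum(math.log(ENGLISH_PERCENT[ch] / 100) for ch in text if ch in ENGLISH_PERCENT)


def rank_shifts(ciphertext):
    """Return all 26 shift values, ordered from most to least plausible."""
    scored = [(log_likelihood(caesar_decrypt(ciphertext, s)), s) for s in range(26)]
    return [s for _score, s in sorted(scored, reverse=True)]


best = rank_shifts("RNCP KU QHH")[0]
print(best, caesar_decrypt("RNCP KU QHH", best))  # 2 PLAN IS OFF
```

Because every letter contributes to the score, the Z and K produced by a shift of 3 count against that guess, and the shift of 2 comes out on top even for this short message.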

So we see that the Caesar Shift Cipher is not very secure. In particular, it's quite vulnerable to attack via frequency analysis. Its problems are a) there are only 26 key values, so trying them all is a viable option, and b) since a given character in the plaintext is always replaced with the same character in the ciphertext, letter frequencies carry over from plaintext to ciphertext.

More Sophisticated Symmetric Encryption: The Vigenere Cipher

The key is a string of letters like JOE. To encrypt, you take your plaintext (we'll reuse MEET ME AT NOON) and write it down. Then you write down the key string over the plaintext, with letters matching up. If the plaintext is longer than the key, you simply repeat the key. Like this:

Think about how the Vigenere Cipher addresses the flaws in the Caesar Shift. The key is a string of characters, and since there are roughly 6 trillion strings of length less than 10, for instance, the problem of too few keys has been addressed. The same letter at different positions in the plaintext generally does not get mapped to the same character in the ciphertext, since the key-character written above plays a role in the encryption. So letter frequencies in the plaintext do not get carried over to the ciphertext.
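Here is a minimal Python sketch of Vigenere encryption under the usual convention, which is an assumption here: each key letter acts as a Caesar shift for the plaintext letter beneath it, with A meaning a shift of 0, B a shift of 1, and so on. The sketch also assumes the key only advances on letters, so spaces do not use up key characters. The name vigenere_encrypt is made up for this example.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase


def vigenere_encrypt(plaintext, key):
    """Shift each plaintext letter by the key letter written above it (assumed A=0, B=1, ...)."""
    key_shifts = cycle([ALPHABET.index(k) for k in key])  # repeat the key as needed
    result = []
    for ch in plaintext:
        if ch in ALPHABET:
            shift = next(key_shifts)
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # assumption: spaces pass through and do not consume key letters
    return "".join(result)


print(vigenere_encrypt("MEET ME AT NOON", "JOE"))  # VSIC AI JH RXCR

# Number of possible keys of length 1 through 9 over a 26-letter alphabet.
print(sum(26 ** n for n in range(1, 10)))  # 5646683826134, roughly 6 trillion
```

Under these assumptions the three E's in the plaintext become S, I, and I, so the same plaintext letter no longer always maps to the same ciphertext letter.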

Frequency Analysis Attacks on the Vigenere Cipher & the One Time Pad

The Venona Project: Poor Practice Defeats Perfect Security
One-time pads provide provably perfect security ... but at a price. Managing keys is really difficult! After all, you have to have as many bytes of key as you have bytes of plaintext to communicate. During WWII, the British and the US intercepted a large amount of Soviet Russian communication that was encrypted with one-time pad encryption. However, cryptanalysis revealed that some of the one-time pad key had been reused ... which is the big no-no with one-time pad encryption. This misuse of the system allowed small parts of the communication to be decrypted. NSA's effort to exploit this misuse of one-time pad keys to decrypt as much of the traffic as possible was code-named VENONA. Over the years (Venona lasted until 1980), this code-breaking effort revealed Soviet espionage campaigns and spies at places like Los Alamos National Labs, the State Department and the White House. It identified the Rosenbergs and Alger Hiss as spies.

With any cryptographic protocol, even a small deviation from the protocol can compromise security. This fact should be a major take-away from the story of the Venona project!

Cyber Pigeons?
Before there was the Internet, there were pigeons. In late 2012, a British man found a dead carrier pigeon in his chimney. Turns out it was from WWII and it carried an encrypted message tied to its leg. CNN has a nice story about it, "UK spies unable to crack coded message from WWII carrier pigeon," and about the fact that nobody's been able to decrypt the message. Turns out, the sender used a one-time pad.

Finding the key length can be a problem, but one easy way, given what we already know, is this: for each possible key length n, form the string consisting of every nth character starting from the first, give that as a ciphertext input to our Caesar Shift Frequency Analysis page, and make a note of the probability of the shift index it gave us for that n. Whichever n value gave us the highest score is probably the actual length of the key. In class, we will actually have performed this exercise.
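The procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the score is an average log-likelihood per letter against an approximate English frequency table, standing in for whatever probability the analysis page reports, and the names every_nth, best_caesar_fit, and rank_key_lengths are hypothetical. The min_letters cutoff is also an assumption, added because very short slices can score well by chance.

```python
import math

# Approximate English letter frequencies, in percent (an assumption of this sketch).
ENGLISH_PERCENT = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23, "G": 2.02,
    "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03, "M": 2.41, "N": 6.75,
    "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99, "S": 6.33, "T": 9.06, "U": 2.76,
    "V": 0.98, "W": 2.36, "X": 0.15, "Y": 1.97, "Z": 0.07,
}


def every_nth(letters, n):
    """Every nth character, starting from the first."""
    return letters[::n]


def best_caesar_fit(letters):
    """Return (shift, score) for the Caesar shift whose decryption looks most like English."""
    best = None
    for shift in range(26):
        total = sum(
            math.log(ENGLISH_PERCENT[chr((ord(ch) - 65 - shift) % 26 + 65)] / 100)
            for ch in letters
        )
        score = total / len(letters)  # average per letter, so slices of different sizes compare
        if best is None or score > best[1]:
            best = (shift, score)
    return best


def rank_key_lengths(ciphertext, max_len, min_letters=20):
    """Score each candidate key length n; return (n, shift, score) tuples, best first."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    results = []
    for n in range(1, max_len + 1):
        column = every_nth(letters, n)
        if len(column) < min_letters:
            continue  # too few letters for the frequency analysis to mean much
        shift, score = best_caesar_fit(column)
        results.append((n, shift, score))
    return sorted(results, key=lambda item: item[2], reverse=True)
```

A call such as rank_key_lengths(ciphertext, 10) lists the candidate lengths from most to least plausible. Multiples of the true key length tend to score well too, so the top few candidates are worth checking by hand.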

This kind of attack requires enough text that our Caesar Shift frequency analysis of every nth character finds the proper shift index with high probability. If the message length is L, and we assume we need about 20 characters to be assured of having a high probability with our Caesar Shift frequency analysis, we'd like to have L/n > 20. If L is short or n is long, our attack will fail. So, in general, a longer key gives you more security from frequency analysis. If you have a key that is a completely random sequence of letters, and which is as long as or longer than the message, the Vigenere Cipher is unbreakable, provided you never use the key again. In this situation, the system becomes what is called a one-time pad. The problem with such a system is that arranging to have this one huge key is difficult.
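Why "never use the key again" matters can be seen in a small Python sketch. It assumes the same letter arithmetic as the Vigenere Cipher with A=0, and the helper names combine and random_pad are invented for this example. If two messages are encrypted with the same pad, subtracting one ciphertext from the other cancels the key completely, leaving only the difference of the two plaintexts for Eve to work on.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def combine(a, b, sign):
    """Letter-by-letter (a + sign*b) mod 26 on equal-length strings of capital letters."""
    return "".join(
        ALPHABET[(ALPHABET.index(x) + sign * ALPHABET.index(y)) % 26]
        for x, y in zip(a, b)
    )


def random_pad(length):
    """A fresh random key as long as the message."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


first = "MEETMEATNOON"
second = "PLANISOFFNOW"
pad = random_pad(len(first))

c1 = combine(first, pad, +1)   # encrypt: plaintext + pad
c2 = combine(second, pad, +1)  # the mistake: the same pad is used a second time

print(combine(c1, pad, -1))    # MEETMEATNOON (decrypt: ciphertext - pad)
print(combine(c1, c2, -1) == combine(first, second, -1))  # True, whatever the pad was
```

The last line prints True for every possible pad, which is the sense in which key reuse leaks information about the plaintexts.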

Chosen Plaintext Attack & the Vigenere Cipher

The US carried out a kind of chosen plaintext attack during WWII. We knew a Japanese attack was imminent because we had cracked a code, but we didn't know whether the symbol designating the target referred to Hawaii or Midway. So we leaked a story about a water shortage on Midway, and then found that same symbol in a message that was, we were sure, relaying the leaked information.

The program that cracked WEP-encryption in your wireless lab is actually also based on a chosen plaintext attack.