Honor

The course policy and the instructions it references, most pertinently COMPSCIDEPTINST 1531.1D, spell out what kinds of assistance are permissible for programming projects.

What you can do

  1. You may get help from your instructor.
  2. You may use general purpose resources online; such use must be documented.

What you cannot do

  1. You may not discuss the project with any other student.
  2. You may not look at any other student's project code.
  3. You may not show your own code to others.
  4. You may not copy code and submit it as your own work.
  5. You may not look at online resources that are directly related to the solutions to the project.

If you have questions about any of these rules, email your instructor and ask about them.

Overview of the Project

In this project, we will study different aspects of three modes of operation: ECB, CBC, and CTR.

Deadline, Penalty and Reward (Read Carefully)

[20pts] Part 1: Implementing the ECB Mode

Cryptodome package

We will use the Cryptodome package, which can be installed as follows (the lab machines already have the package installed):
pip3 install pycryptodomex --user

Your Task

Write project.py, which contains the following functions implementing the ECB mode operations. For your convenience, I am giving you a template.

# project.py
# name: ...
from Cryptodome.Cipher import AES
from Cryptodome.Util.Padding import pad, unpad

# read in_plain_file, encrypt the data, and store the ciphertext in out_cipher_file
def encrypt_ecb(in_plain_file, out_cipher_file, key):
  pass  # do something

# read in_cipher_file, decrypt the ciphertext, and store the plaintext in out_plain_file
def decrypt_ecb(in_cipher_file, out_plain_file, key):
  pass  # do something

# read normal_bmp_file and in_cipher_file, fix the header in the ciphertext and
# store the results in out_cipher_bmp_file
def fix_bmp_header(normal_bmp_file, in_cipher_file, out_cipher_bmp_file):
  pass  # do something
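
For fix_bmp_header, it helps to know where the BMP header ends. The following is a minimal sketch, assuming a standard BMP file whose 4-byte little-endian field at offset 10 stores the start of the pixel data. The name bmp_pixel_offset is a hypothetical helper, not part of the template, and the header used here is synthetic.

```python
import struct

def bmp_pixel_offset(data: bytes) -> int:
    """Return the offset at which pixel data starts in a BMP file image."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # Bytes 10..13 of the file header hold the pixel-data offset (little-endian).
    (offset,) = struct.unpack_from("<I", data, 10)
    return offset

# Synthetic 54-byte header, used only for illustration.
header = b"BM" + bytes(8) + struct.pack("<I", 54) + bytes(40)
print(bmp_pixel_offset(header))  # 54
```

Everything before that offset belongs to the header. Whether your solution reads this field or relies on a fixed header length depends on the project requirements.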

Requirement

Test code


#!/usr/bin/python3
# test_part1.py

from project import *

key = bytes.fromhex("00112233445566778899aabbccddeeff")
encrypt_ecb("pic_original.bmp", "prj_ecb.bin", key)
decrypt_ecb("prj_ecb.bin", "prj_dec.bmp", key)
fix_bmp_header("pic_original.bmp", "prj_ecb.bin", "prj_ecb.bmp")

Sample Run

$ ./test_part1.py
$ md5sum pic_original.bmp
b62c61f912f1cb7037762d5fddcf782b  pic_original.bmp
$ md5sum prj_ecb.bin
36b89a0993197a0447a08e2c8722c675  prj_ecb.bin
$ md5sum prj_dec.bmp
b62c61f912f1cb7037762d5fddcf782b  prj_dec.bmp
$ md5sum prj_ecb.bmp
3a8b8dec89ef06b882d06565773ab3e3  prj_ecb.bmp
$ eog prj_ecb.bmp &

[20pts] Part 2: CBC and CTR mode

Implement the functions appropriately so that the following test code works correctly. Download pic_original.bmp from the lecture on Modes of Operation.

#!/usr/bin/python3
# test_part2.py

import project

key = bytes.fromhex("00112233445566778899aabbccddeeff")
iv = bytes.fromhex("000102030405060708090a0b0c0d0e0f")

project.encrypt_cbc("pic_original.bmp", "prj_cbc.bin", key, iv)
project.decrypt_cbc("prj_cbc.bin", "prj_cbc_dec.bmp", key, iv)

ctr = iv
project.encrypt_ctr("pic_original.bmp", "prj_ctr.bin", key, ctr)
project.decrypt_ctr("prj_ctr.bin", "prj_ctr_dec.bmp", key, ctr)

Tips

Sample runs

After running the code correctly, you should have the following results:
$ ./test_part2.py
$ md5sum prj_cbc.bin
ef302657d9d8c602ce87d8b979ef72d1  prj_cbc.bin
$ md5sum prj_cbc_dec.bmp
b62c61f912f1cb7037762d5fddcf782b  prj_cbc_dec.bmp

$ md5sum prj_ctr.bin
7901dcdfc7c7b8ef37a8359d8206d477  prj_ctr.bin
$ md5sum prj_ctr_dec.bmp
b62c61f912f1cb7037762d5fddcf782b  prj_ctr_dec.bmp

[20pts] Part 3: Padding

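For block ciphers, when the size of a plaintext is not a multiple of the block size, padding may be required. The PKCS#5 padding scheme is widely used (for 16-byte blocks the same scheme is formally called PKCS#7, which is the default style of the pad function in Cryptodome). Let's see how the padding algorithm works: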

>>> from Cryptodome.Util.Padding import pad

>>> blk_size = 16

>>> pad(b"0123456789", blk_size)
b'0123456789\x06\x06\x06\x06\x06\x06'

>>> pad(b"0123456789a", blk_size)
b'0123456789a\x05\x05\x05\x05\x05'

>>> pad(b"0123456789ab", blk_size)
b'0123456789ab\x04\x04\x04\x04'

>>> pad(b"0123456789abc", blk_size)
b'0123456789abc\x03\x03\x03'

>>> pad(b"0123456789abcd", blk_size)
b'0123456789abcd\x02\x02'

>>> pad(b"0123456789abcde", blk_size)
b'0123456789abcde\x01'

>>> pad(b"0123456789abcdef", blk_size)
b'0123456789abcdef\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'

>>> pad(b"0123456789abcdefxxxx", blk_size)
b'0123456789abcdefxxxx\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c'
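
As the sample runs above show, the padding algorithm works as follows (read carefully):

  • Let n be the data length in bytes and B the block size. The algorithm appends k = B - (n mod B) bytes, each with the value k.
  • If n is already a multiple of B, then k = B: a full block of padding is appended, so padded data always ends with at least one padding byte.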
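
A minimal sketch of the padding-length computation behind these sample runs. The name pad_len is a hypothetical helper, not a Cryptodome function, and the block size of 16 is taken from the examples above.

```python
def pad_len(n: int, blk_size: int = 16) -> int:
    """Number of padding bytes (also the value of each one) for n bytes of data."""
    return blk_size - (n % blk_size)

# Lengths taken from the sample runs above, plus the 37-byte case used later.
for n in (10, 15, 16, 20, 37):
    print(n, pad_len(n))
```

For these lengths the loop prints 6, 1, 16, 12, and 11 as the padding lengths, matching the outputs of pad shown above.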

Your Task

Write your own padding/unpadding functions (i.e., functions pad2 and unpad2) that implement PKCS#5. Of course, you should work with bytes or bytearray directly instead of calling the pad/unpad functions in Cryptodome.

Carefully look at the following sample runs. In particular, your unpad2 function should output "padding error!" if the padded data has an incorrect format.
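
If you have not worked with bytes and bytearray much, the operations below are the kind of building blocks these functions need. This is only an illustrative sketch with made-up values, not a solution outline.

```python
data = b"0123456789"

last = data[-1]          # indexing a bytes object yields an int
tail = data[-3:]         # slicing yields a new bytes object
filler = bytes([6]) * 6  # six bytes, each with the value 6

buf = bytearray(data)    # bytearray is the mutable counterpart of bytes
buf[0] = 0x41            # overwrite the first byte with ASCII 'A'

print(last, tail, filler, bytes(buf))
```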


>>> import project
>>> padded = project.pad2(b"0123456789")
>>> padded
b'0123456789\x06\x06\x06\x06\x06\x06'

>>> padded = project.pad2(b"0123456789abcdef")
>>> padded
b'0123456789abcdef\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10'
>>> project.unpad2(padded)
b'0123456789abcdef'

>>> data = bytes([i for i in range(37)])
>>> data
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$'

>>> unpadded = project.unpad2(data)               # the length of data is not a multiple of 16
padding error!
>>> print(unpadded)
None

>>> padded = project.pad2(data)
>>> padded
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b'
>>> len(padded)
48
>>> unpadded = project.unpad2(padded)
>>> print(unpadded)
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$'

>>> padded2 = bytearray(padded)
>>> padded2[46] = 1                 # corrupting the padding
>>> padded2 = bytes(padded2)        # padded2 now contains an incorrect padding
>>> padded2
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x01\x0b'
>>> unpadded = project.unpad2(padded2)
padding error!
>>> print(unpadded)
None

[10pts] Part 4: Known Plaintext Attack on the CTR mode -- When CTR is reused

Encryption modes usually require an initialization vector (IV) or counter (CTR). If we are not careful in selecting them, the encrypted data may not be secure at all, even though we are using a secure encryption algorithm and mode.

One may argue that if the plaintext does not contain repeated blocks (unlike pic_original.bmp), it may be secure to reuse the same IV (or CTR). Unfortunately, that is not the case. Let us look at the CTR mode. Suppose the attacker gets hold of a plaintext (P) and its ciphertext (C). Can he/she decrypt other encrypted messages if the CTR is always the same?

Your Task

You are given the following information:
P: 546869732069732061206b6e6f776e206d65737361676521
C: ed6b6ee550650f5b9bb699678e244260c3bc87ec8a3172d7
target: d26d68e11e2c0c179bff9c7d842b5860cfad9fbfa80245fc

To do: What is the underlying message for the ciphertext target (i.e., the contents of secret.txt)? It's an English phrase. You must find out what this is.

Tips

  • Read the code in part4.py carefully.
  • Note that both C and target are encrypted using the same key and ctr.
  • Read the lecture notes again on the CTR mode. In particular, look at the scheme diagram carefully. What will happen if the key and ctr are the same? Specifically, in the diagram, identify the arrows that must carry the same data.
  • Note that key and ctr are chosen at random, meaning that every time you run this script you will see different values for them. Therefore, the actual random values used in the encryption and displayed in the instructions are not known to you.
  • The script part4.py is given just to show how the data was generated. You probably have to write a separate script for your attack.
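
The point about arrows carrying the same data can be explored with a toy experiment. The sketch below does not use AES at all: it assumes that CTR encryption amounts to XORing the plaintext with a keystream, and it uses random bytes as a stand-in for that keystream. The messages and the xor helper are made up for illustration.

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for the CTR keystream E_K(ctr), E_K(ctr+1), ...; it is the same
# for every message when the key and ctr are reused.
keystream = os.urandom(32)

p1 = b"first toy message"
p2 = b"other toy message"
c1 = xor(p1, keystream)
c2 = xor(p2, keystream)

# The keystream cancels out: c1 XOR c2 equals p1 XOR p2.
print(xor(c1, c2) == xor(p1, p2))  # True
```

Think about what this observation means when one of the two plaintexts is known.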

[10pts] Part 5: Chosen Plaintext Attack on the CBC mode --- When IV Is Predictable

(Warning: This part is probably the most challenging in the entire project. If you're stuck, you may want to move on to Parts 6 and 7 first, and then come back to this part.)

From the previous task, we now know that IVs must not repeat. Another important requirement is that, for many schemes, IVs need to be unpredictable, i.e., they need to be randomly generated.

In this task, we will see what is going to happen if IVs are predictable.

Assume that Bob just sent out an encrypted message, and Eve knows that its content is either Yes or No; Eve can see the ciphertext and the IV used to encrypt the message, but since the encryption algorithm AES is quite strong, Eve has no idea what the actual content is.

A good cipher should tolerate not only the known-plaintext attack described previously but also the chosen-plaintext attack. Unfortunately, the IVs Bob generates are not random and are always predictable, so Eve knows exactly what IV Bob is going to use next.

To emulate the above scenario, you are given part5_py.txt.

Note: Every time you run part5.py, newly chosen plaintext, key, and iv will be used.

  • Change the file name to part5.py and run it.
  • Read the program carefully.
  • The message is randomly chosen as either Yes or No.
    Warning: It's actually the padded Yes or No. Remember that the CBC mode uses padding.
  • Before the while loop, it shows the iv and ct.
  • In each iteration, the program allows the user to choose a message (hence, the chosen-plaintext attack). As shown in the sample run below, you need to use a hex string as the input.
  • If the user enters "open", then the program shows what the message was.

Sample run

$ ./part5.py
iv: db84aee3ccc1205528e9f831b840c195
ct: a3cc08dcf9b18f8e994f491875afdbf1

iv: d8afdd03531eeec72c00b6b3b0c3ec5c
pt (hex): 0d96b89f15a1f7b5fe140779e556cacd
ct: 521e66a59681f1ea9facf8c8c84bf3b5df9f73b0660d667e13ab503ec727f1dd

iv: 5e4fd49d68d4c580a60e6fddaf74b194
pt (hex): 121234
ct: cdfbe11fdb520406d241a7f0f16d3faf

iv: aca862411947b4da22560ae8026c6f05
pt (hex): open
The secret was:  b'No'

Your Task

Run part5.py and guess the secret message (i.e., either Yes or No) before the program opens it at the end.

Tips

  • Warning: Don't change part5.py; just run it.
  • You probably need to write some separate python scripts to assist your attack.
  • Every time you run part5.py, newly chosen plaintext, key, and iv will be used. So, you need to run your customized attack script in a separate terminal while running part5.py at the same time.
  • Don't forget that plaintext messages are padded. In the sample run, some ciphertext is quite long due to the padding.
  • You need to smartly choose the input messages to feed during the program loop.
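
An assisting script mostly needs to convert between hex strings and bytes and to do block arithmetic. The sketch below assumes the 16-byte block size and PKCS#5 padding from Part 3. The xor helper is a made-up name, and the IV is simply copied from the first line of the sample run.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

# The two candidate messages after padding to one 16-byte block.
padded_yes = b"Yes" + bytes([13]) * 13
padded_no = b"No" + bytes([14]) * 14

print(padded_yes.hex())
print(padded_no.hex())

# Hex strings printed by the program convert back with bytes.fromhex.
iv = bytes.fromhex("db84aee3ccc1205528e9f831b840c195")  # iv from the sample run
print(xor(padded_yes, iv).hex())
```

Note that in the sample run a 16-byte input produces a 32-byte ciphertext, because a full block of padding is appended to your input as well.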

Deliverables

In the report, describe your attack.

[10pts] Part 6: Error Propagation

In this part, we would like to understand the error propagation property of various encryption modes. In particular, we would like to do the following exercise:
  1. Encrypt pic_original.bmp into pic_ecb.bin, pic_cbc.bin, and pic_ctr.bin using the ECB, CBC, and CTR modes, respectively.
  2. Unfortunately, the 70th byte in the encrypted file got corrupted. Simulate this by writing a simple Python script: read each pic_... file, set the 70th byte to 0, and store the modified data back into the file.
  3. Decrypt the corrupted ciphertext file using the correct key (and IV).
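
Step 2 can be sketched as follows. The sketch assumes that "the 70th byte" is counted from 1, i.e., it is the byte at offset 69. The name corrupt_byte is a hypothetical helper, and the demonstration runs on a throwaway file rather than on the real ciphertext files.

```python
import os
import tempfile

def corrupt_byte(path: str, position: int = 70) -> None:
    """Set the byte at the 1-based `position` of the file to 0, in place."""
    with open(path, "rb") as f:
        data = bytearray(f.read())
    data[position - 1] = 0  # assumption: 1-based position, so offset 69
    with open(path, "wb") as f:
        f.write(data)

# Demonstration on a temporary 100-byte file filled with 0xFF.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.bin")
    with open(demo, "wb") as f:
        f.write(bytes([0xFF]) * 100)
    corrupt_byte(demo)
    with open(demo, "rb") as f:
        result = f.read()

print(result[69], result[68], len(result))  # 0 255 100
```

For the exercise, you would call corrupt_byte on pic_ecb.bin, pic_cbc.bin, and pic_ctr.bin instead of the demo file.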

Your Task

Answer the following questions:
Q1: How much information can you recover by decrypting the corrupted file, if the encryption mode is ECB, CBC, or CTR, respectively?

Tip

You may want to check the decryption diagrams on Wikipedia. For example, CBC mode decryption computes each plaintext block as P_i = D_K(C_i) XOR C_{i-1}, where C_0 is the IV.

Follow the diagram by supposing that the first ciphertext block is corrupted.
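
To repeat the same reasoning for the actual exercise, it helps to know which ciphertext block holds the corrupted byte. A small sketch, assuming the 16-byte AES block size and a 1-based reading of "the 70th byte":

```python
BLOCK_SIZE = 16   # AES block size in bytes
offset = 70 - 1   # assumption: the "70th byte" is 1-based, so offset 69

# Zero-based index of the block and position of the byte inside that block.
block_index, byte_in_block = divmod(offset, BLOCK_SIZE)
print(block_index, byte_in_block)  # 4 5
```

So the corrupted byte sits in the fifth ciphertext block (zero-based index 4).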

[10pts] Part 7: Chosen Ciphertext Attack on the CBC Mode

You are given part7_py.txt.

Sample run

$ ./part7.py
The target ciphertext is
iv: 81109c7dff291f6c57f0d6479a334ac8
ct: c892dfdcdaead56d88fbfebac7142442c0fdc499ff0694983d4cd251afd32fb0da6595f66d3e5fb091b7a0f0f18ad9b5


=============
Menu: 0 (encrypt), 1 (decrypt), 2 (open): 0
msg to encrypt in a hexstring format: 11223344aabb
iv: a681a0c60ae6fab05f9f1762d3de9ed5
ct: 36be519e0bf9a9fe84e3d51323925195

=============
Menu: 0 (encrypt), 1 (decrypt), 2 (open): 1
iv (in hexstring): a681a0c60ae6fab05f9f1762d3de9ed5
ct (in hexstring): 36be519e0bf9a9fe84e3d51323925195
decryption is:
    b'\x11"3D\xaa\xbb'

=============
Menu: 0 (encrypt), 1 (decrypt), 2 (open): 1
iv (in hexstring):  81109c7dff291f6c57f0d6479a334ac8
ct (in hexstring):  c892dfdcdaead56d88fbfebac7142442c0fdc499ff0694983d4cd251afd32fb0da6595f66d3e5fb091b7a0f0f18ad9b5
Cannot decrypt the target ciphertext!

=============
Menu: 0 (encrypt), 1 (decrypt), 2 (open): 1
iv (in hexstring):  a7bdcc39485abc97f1e20e372e08bfa1
ct (in hexstring):  1216c53c503147f45eb6e970a90ba959
decryption is:
    b'Hello World'

=============
Menu: 0 (encrypt), 1 (decrypt), 2 (open): 2
The message was:  b"Don't use ECB. It's not IND-CPA secure"

Your Task

As with Part 5, guess the secret message before the program opens it at the end.

Tips

Deliverables

In the report, describe your attack.

Submit

Write a project report.
~/bin/submit -c=IT430 -p=project project_report.doc project.py