Project 2

This is the archived website of SI 335 from the Spring 2015 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

**Part 1 Due Date**: 23:59 on Friday, February 27**Final Due Date**: 23:59 on Friday, March 6

This project contains **two deadlines**, an early
deadline for Part 1 only, and a later deadline for everything
else. For all parts, submission will be electronic only. There
is no written part and no cover page. You will need to submit
everything according to the instructions on
this page
for the submit program.

- Review the course policy on academic integrity for programming projects. Be sure to cite any discussions with classmates, or any online resources that you used. Citations should appear in comments in your code, and in README.txt files that you submit.
- Part of your grade will be based on "coding style". This is mostly about whether your code is easy to follow, consistently formatted, and sensibly designed. You should use meaningful variable/function/class names, clearly explain what is going on in your code, and "clean up" any extra debugging or non-functional parts before submitting.
- Your program will automatically compiled and tested by
a variety of exciting bash scripts. This means that it is
imperative that your code
**compiles without errors**in the CS linux lab environment and that it behaves exactly as specified. You should make use of the`335sanity`

check program to help make sure everything works.

In this project you will implement key generation, encryption, and decryption for the RSA algorithm. This is a big undertaking, but don't worry; it will be step-by-step, and a lot of the code for the multiple-precision integer arithmetic has been provided for you.

At the end, you will have a program called `rsa`

that runs like
the following. (Note: the text in black is what you type from the terminal,
the text in red is what the program prints out.)

$ make g++ -O3 -Wall -Wno-sign-compare -Wno-unused-function -c rsa.cpp g++ -O3 -Wall -Wno-sign-compare -Wno-unused-function -c posint.cpp g++ -O3 -Wall -Wno-sign-compare -Wno-unused-function -o rsa rsa.o posint.o $ ./rsa -h Generate a public-private key pair: ./rsa -k PUBLIC_KEY_FILE PRIVATE_KEY_FILE Encrypt a message with public key: ./rsa -e PUBLIC_KEY_FILE Decrypt a message with private key: ./rsa -d PRIVATE_KEY_FILE Note: PUBLIC_KEY_FILE and PRIVATE_KEY_FILE are any filenames you choose. Encryption and decryption read and write to/from standard in/out. You have to end the input with EOF (Ctrl-D if on command line). You can use normal bash redirection with < and > to read or write the encryption results to a file instead of standard in/out. $ ./rsa -k pubkey.txt privkey.txt Public/private key pair written to pubkey.txt and privkey.txt $ cat pubkey.txt 1905779491181402652518787 2514027753158811099261577 $ cat privkey.txt 398123441171873623399723 2514027753158811099261577 $ ./rsa -e pubkey.txt > encrypted.txt Type your message, followed by EOF (Ctrl-D) You have all been placed on DOUBLE SECRET PROBATION! Message successfully encrypted. $ cat encrypted.txt 548838731096047626262831 769310333176024031419596 1258442563525119256473951 1857872293629117578162100 476196509856241406020426 1289945524445473376652452 1462373558694356355643861 $ ./rsa -d privkey.txt < encrypted.txt You have all been placed on DOUBLE SECRET PROBATION!

By 23:59 on Friday, February 27, you need to submit (electronically) the
assignment called `proj 2key`

that contains a single file,
`pubkey.txt`

, with the public key \((e, n)\) that you have
generated, according to Part I below.

By the final due date, you need to submit as `proj 2`

your rsa
program code, as well as a pair of files `encrypt.txt`

and
`decrypt.txt`

containing, respectively, a message encrypted for
me, and a decryption of the message I sent to you.

Your rsa program should consist of the following files:

`rsa.cpp`

: This is your top-level program (with a`main`

method) and will contain all the code that you contribute.`README.txt`

: This should contain any sources or references that you used, anything extra your instructor needs to know, maybe some humorous ASCII art, etc.`Makefile`

: This needs to have a target`rsa`

that compiles your rsa program. You probably won't need to change this file.`posint.hpp`

: This is part of the`mp`

package and you must not change this file.`posint.cpp`

: This is part of the`mp`

package and you must not change this file either.

If you want to include some code in other C++ files, you should mention
that in the README. Otherwise everything should be in `rsa.cpp`

.

You can get all these files at once by downloading the file
proj2.tar.gz
and running `tar xzvf proj2.tar.gz`

Since we're going to be implementing RSA, that means there we will have
to deal with some very big integers, ones that can't be stored within a
single `int`

.

There are many good, free packages available for doing this that have
been developed by researchers around the world. The most popular is GNU's
GMP package

However, we're going to use a much smaller, more primitive version that I
have developed just for you called `mp`

. If you navigate to the
directory `/courses/roche/335/mp`

in the Linux environment here,
you will see the whole thing, along with an executable program called
`calc`

that you can use as a bignum calculator.

Since the RSA algorithm only deals with positive numbers, the only part
of the `mp`

library that you will be using is the
`PosInt`

class. Unsurprisingly, this is a class to represent
(arbitrarily large) positive integers.

You can have a look at the header file `posint.hpp`

for the
`PosInt`

class to see what the interface is like. What you should
notice is that you will use these guys very differently than how you would
use a regular `int`

.

For example, the "dumb" way of doing exponentiation might be written in pseudocode as:

1 2 3 4 5 def pow(n, k): res = 1 for 1 in range(0, k): res = res * n return res

Writing this in C++ using the `PosInt`

class would look like:

1 2 3 4 5 6 7 8 9 void pow (PosInt& res, const PosInt& n, const PosInt& k) { PosInt i(1); PosInt one(1); res.set(1); while (i.compare(k) <= 0) { res.mul(n); i.add(one); } }

Here are some things to notice:

- You can't use the usual C++ operators for arithmetic. So for example to add, you have to use the
`add`

method in the`PosInt`

class, not the`+`

operator. This seems annoying, but makes it easier to keep track of exactly what is happening. - There are no automatic conversions from regular
`int`

s. This means we can't do something like "i = i + 1" directly, but instead must first create a`PosInt`

object (called`one`

here) to hold the value 1, and add that to`i`

. - Most
`PosInt`

class*member functions*act like the "in-place" arithmetic operators such as`*=`

and`+=`

. Like in the example above, doing`res.mul(n);`

means to set`res`

to the result of multiplying`res`

times`n`

. - Most
*non-member functions*that deal with`PosInt`

objects do not actually return anything. Instead, they take the return value as the first parameter to the function, which is set by the function. The reason for this is to avoid too much unnecessary copying of these (potentially large) integers. So for example, the`pow`

function above doesn't actually return the result, but instead it takes a reference to the result as the first argument. -
The method
`set`

is used to set the value based on that of another`PosInt`

or a regular`int`

, and the`convert`

method is used to convert a (small)`PosInt`

back into a regular`int`

.

Of course you can look through any of the files in the `mp`

library, located at `/courses/roche/335/mp/`

on the CS Linux
environment, to see many more examples of how these routines work.

Finally, while the normal I/O operators `<<`

and
`>>`

will work just fine for reading and writing
`PosInt`

s, for debugging you might find the `print_array`

method useful. This prints the digits of the actual number in an array just like
we did in class, so you can see what the actual representation looks like.

When your `rsa`

program is run with the `-k`

flag, that means to generate a public/private key pair, randomly. You should follow the key generation algorithm from Section 5.4 of Unit 3. In particular, you will have to

- Choose two random PRIME numbers
*p*and*q* - Multiply them together to get
*n* - Compute \(\varphi(n)\)
- Choose a random public exponent
*e*that is relatively prime to \(\varphi(n)\) - Use the XGCD algorithm to compute the private exponent
*d*that satisfies \(ed \bmod \varphi(n) = 1\).

To do this, you will have to use some of the following functions from the
`PosInt`

class: `mul`

, `sub`

, `pow`

,
`set`

, `xgcd`

, `mod`

, `rand`

, and
`MillerRabin`

. Two special tips:

- The
`MillerRabin`

method only runs the probabilistic primality test*one time*, so you should run it more than once to make sure it is correct before assuming some number is prime. - The XGCD algorithm like we saw in class will always return one
negative number. But
`PosInt`

objects can't be negative, so the`xgcd`

method multiplies the second coefficient \(t\) by \(-1\) before returning, so that all the return values are positive numbers. Look at the documentation in the`posint.hpp`

file for more details. If you don't understand it, write a small program to do some experiments and try to figure it out!

The output should be two pairs of numbers, corresponding to the public key \((e, n)\) and the private key \((d, n)\), respectively. Each pair is written to a different file, as indicated on the command-line.

Now as long as you generate valid public/private key pairs, that are randomly
generated (should be different every time you run the program), I don't care too
much how you might choose to do it. The only requirement is that **your
modulus n must be at least \(\textbf{2}^\textbf{80}\)**.
This is because we will be
encrypting ten bytes at a time, so we need at least a 80-bit modulus to make
everything work. So you will have to choose your *p* and *q*
accordingly to make *n* this large. Of course you can make it much
larger, but not any smaller.

- Modify the
`rsa.cpp`

program so that it spits out your randomly-chosen public and private keys when run like`./rsa -k pubkey.txt privkey.txt`

. **By 23:59 on Friday, February 27**submit a file called`pubkey.txt`

as`335 proj 2key`

. This plain text file should contain your public key (only!), which will consist of two integers, \(e\) and \(n\), in that order. I will post everyone's public key on the website, along with secret messages to each of you.**IMPORTANT**: You may of course keep modifying your`rsa.cpp`

program between this deadline and the final submission deadline. But the public key you submit initially must be "typical" of an output from the final program you submit. For example, if the key you submit is 2000 bits, but your RSA program usually spits out 80-bit keys, then we will have a problem. If you notice a bug or need to correct something for any reason, just email your instructor.

Now that you have some keys, it's time to encrypt and decrypt some messages! There are really four tasks here:

The heart of the RSA cryptosystem is encrypting and decrypting messages with the public and the private key (respectively). If the public key is the pair of integers \((e, n)\), and the message is a single number \(M < n\), then we encrypt it by computing

\[ E = M^e \bmod n \]

Sometime later, the recipient can decrypt this message by knowing the private key \((d,n)\) and computing

\[ M = E^d \bmod n \]

Now of course ultimately the message to be encrypted will consist of a whole series of numbers \((M_1, M_2, \ldots)\) and the encrypted messages will be another list \((E_1, E_2, \ldots)\).

But the basic operation of raising one big integer to another big integer exponent, modulo a third big integer, is really the central operation of the while RSA scheme! So the first thing you need to do is implement the function

```
void powmod (PosInt& result, const PosInt& a, const PosInt&
b, const PosInt& n)
```

which will compute \(a^b \bmod n\) and store the result in (you guessed it)
`result`

. The prototype for this function is already there in the
starter code; you just have to fill it in.

Now there's a dumb way to do this, and a smart way... Of course I want you to
do it the smart way. This means using the square-and-multiply algorithm as
discussed in class, and taking the intermediate results mod *n* at every
step along the way. If you don't do it like this, it'll never finish in time
once you start plugging in the big numbers for RSA!

When your program is run like `rsa -e pubkey.txt`

, it
should use the public key \((e,n)\) from the given file
to encrypt a message. The message will come
in via standard in, one ASCII character at a time. Now as each character is read
in with `cin.get()`

, it is converted to an 8-bit number (between 0
and 255). Your program must take the characters, ten at a time, and convert
each group of 10 ASCII characters (8-bit numbers) into a single 80-bit number,
which will be the message \(M\) that gets used.

Actually, this process of reading in the characters and converting
to an 80-bit number has already pretty much been written for you. But you
should read and understand it, because *you* have to do the process in
reverse for decryption (below).

After converting the ASCII characters to big numbers like above, the last
thing your `rsa -e`

program will do is encrypt each of these numbers,
one at a time, using the formula \(E = M^e \bmod n\) and the `powmod`

function you already wrote. The output should be a list of the big numbers for
each \(E\), each printed on a separate line.

See the example at the top for how this should look when you are done.

Your decryption program will be run like `rsa -d privkey.txt`

,
and will use the private key \((d,n)\) from the file to encrypt a message
(list of big numbers) that is passed in via standard in. The *first*
thing the decrypting part of your program will do is use the formula
\(M =E^d \bmod n\) to convert each number \(E\)
in the input to another number \(M\).

Each of these numbers \(M\) contains (at most) ten ASCII characters. You have to extract them, one at a time, and print them out. This will involve dividing and modding by powers of 256 repeatedly.

Some of this has been written for you already, but not as much as for encryption. The good news is that it should be easy to generate examples to test your decryption algorithm if you already have the other parts working! Here's an example that might help:

`Decrypted message: M = 394012469885622488624741 (in hex) M = 536f75736170686f6e65 STEP 1 M = 394012469885622488624741 256^9 = 4722366482869645213696 M / 4722366482869645213696 = DEC: 394012469885622488624741 ASCII: S o u s a p h o n e HEX: 536f75736170686f6e65 \/ \ division \ by 256^9 \ \ \ \ \ \ \ /\ HEX: 00000000000000000053 ASCII: S DEC: 83 M % 4722366482869645213696 = DEC: 394012469885622488624741 ASCII: S o u s a p h o n e HEX: 536f75736170686f6e65 \/\/\/\/\/\/\/\/\/ modulo |||||||||||||||||| 256^9 |||||||||||||||||| /\/\/\/\/\/\/\/\/\ HEX: 006f75736170686f6e65 ASCII: o u s a p h o n e DEC: 2056051807441935887973 STEP 2 M' = 2056051807441935887973 256^8 = 18446744073709551616 M' / 18446744073709551616 = DEC: 2056051807441935887973 ASCII: o u s a p h o n e HEX: 006f75736170686f6e65 \/ \ division \ by 256^8 \ \ \ \ \ \ /\ HEX: 0000000000000000006f ASCII: o DEC: 111 M' % 18446744073709551616 = DEC: 2056051807441935887973 ASCII: o u s a p h o n e HEX: 006f75736170686f6e65 \/\/\/\/\/\/\/\/ modulo |||||||||||||||| 256^8 |||||||||||||||| /\/\/\/\/\/\/\/\ HEX: 000075736170686f6e65 ASCII: u s a p h o n e DEC: 8463215260175658597 (... and then 8 more steps)`

This part is the bulk of the project. You should be very happy once you get this far! Now here's what you need to do:

- Modify the
`rsa.cpp`

program so that`rsa -e pubkey.txt`

encrypts ASCII characters using the public key and prints out encrypted numbers, and`rsa -d privkey.txt`

decrypts numbers using the private key and prints out the ASCII characters that make up the original message. By the final deadline, go to the website and look up the message I have encrypted for you, as well as my own public key. Then you have to create two plain text files to be submitted along with your program:

`decrypt.txt`

contains the decryption of the message I sent you, and`encrypt.txt`

contains an encrypted message to me. It should be relatively short, and should have your own name in it somewhere.

Of course you should use your program to do both of these things!

Once everything is working, we want it to be as fast as possible of course! Right now the multiplication algorithm is just the standard algorithm that costs \(\Theta(n^2)\) to multiply to *n*-digit numbers.

To speed this up, I want you to implement Karatsuba's algorithm for integer multiplication. There are two extra functions that I have defined in the `PosInt`

class for doing "faster" multiplication, and some skeleton of their implementation is near the bottom of the starter code for `rsa.cpp`

.

The first function is

`PosInt& PosInt::fasterMul (const PosInt& x, const PosInt& y)`

which is the top-level function to do the multiplication of *x* times *y* and store the result in the `PosInt`

object that it was called on. You should use it just like you would use the usual `mul`

function. Mostly what this function has to do is set things up, do some memory allocation and zero-padding, and make the first recursive call to the recursive `fastMulArray`

function below. It has pretty much been written for you already...

Which means you will really spend your time implementing the second function:

`void PosInt::fastMulArray (int* dest, const int* x, const int* y, int len)`

This is a *static* function in the `PosInt`

class, meaning that it doesn't get called on any particular object. Really, it's just working on arrays of `int`

s instead of `PosInt`

objects. This (should) make it easier to set up and call your recursive calls, since they will just be on sub-arrays of the original ones. The basic skeleton of this function is there, but you will definitely have to look back at the karatsubaMul algorithm from Unit 4 and implement it very carefully, step by step.

Once you get this working, you should incorporate it into everything you did in parts II and III for the RSA algorithm. Like I said, this should really just involve replacing some calls to `something.mul(...)`

with `something.fasterMul(...)`

.

Finally, **add a new option flag to the rsa program**. If I run something like

`rsa -m <a> <b>`

, it should use your new code to multiply For example:

proj2$ rsa -m 4278 87124 372716472 proj2$ rsa -m 41274981789257 90876523976523 3750926872201963955201613411

And feel free to play around with it! The ultimate goal is *fast code*, which means you might want to tweak with the algorithm one way or another to really make it as fast as possible. You have permission to do pretty much anything you like, as long as it actually makes an improvement and you carefully document what you are doing.

- Modify the
`rsa.cpp`

program to complete the implementation of the`fasterMul`

and`fastMulArray`

functions. Then add a new`-m`

option to the program that reads two numbers from the command line and prints out their product.

The extra credit is about having the fastest, secure RSA implementation. Note that there is one big prize but also **multiple smaller prizes** and also **opportunities for sabotage**.

The basic idea is to get your encryption and decryption to be really fast, without sacrificing security. It's based on the following general principles:

**The bad guys have more computing power than you.**I will run your encryption and decryption algorithms with a time limit of just a few seconds, but there is no time or memory limit on what your classmates can do to try and crack your key.**Security is more important than speed**. I am going to encrypt a secret message to you using your public key and post it on the website. If anyone else sends me that secret message, you are not eligible to receive any extra credit points.**But a slow system is useless.**Like I said, I will test your code for encryption and decryption using a timeout of just a few seconds. If you choose a key size so large that encryption takes a long time, or don't write good code, then no one will use your cryptosystem even if it's really secure.

So the extra credit competition is to get the fastest encryption and decryption of random messages that I choose, **using random keys generated by your own program**. But in case you are tempted to just make the keys you generate really short, remember that if anyone cracks the message (for example by factoring your public modulus *n*), then you are not eligible for any extra credit.

- Nothing extra to get the extra credit, just submit your solutions to the other parts as usual.
- If you crack someone's secret message, send me a (regular) email. I will update the website as the messages get cracked. Also since no points are directly associated with cracking other people's keys,
**you can work together to crack other people's messages**. However, this doesn't mean you can violate the honor policy for the rest of the project, for example by looking at any classmate's code that will be submitted.