\( \def\ZZ{\mathbb{Z}} \def\GG{\mathbb{G}} \def\HH{\mathbb{H}} \)

Safe Prime

During the last lecture, we learned that modular exponentiations cycle through the numbers in \(\ZZ_n^*\). However, the length of the cycle is sometimes shorter and sometimes longer.

In this lecture, we will learn a good way to make the length of the cycle always the same by using a safe prime number.

What is a safe prime number?

A number \(p\) is safe prime if

Examples

Quick check

Give the next two safe prime numbers:
11 and 23

Modular Exponentiations of Squared Numbers

It turns out that when \(p = 2q + 1\) is a safe prime, modular exponentitions of all squared numbers all have the same cycle length \(q\).

For example, let's consider 11. We think that the cycle length of all squared numbers should be 5. Let's verify this:


from math import gcd

p = 11

QR = set()            # set of all squared numbers 
for a in range(1, p):
  if gcd(a, p) == 1:  # check if a is in ZZ_p*
    QR.add(a*a % p)   # we are interested in squared numbers s

for s in QR:
  print("s=", s, ":", end = " ")
  for i in range(1, p):
    print(pow(s, i, p), end=" ")  # mod exponentiations of s
  print()
   The output is as follows:
s= 1 : 1 1 1 1 1 1 1 1 1 1
s= 3 : 3 9 5 4 1 3 9 5 4 1
s= 4 : 4 5 9 3 1 4 5 9 3 1
s= 5 : 5 3 4 9 1 5 3 4 9 1
s= 9 : 9 4 3 5 1 9 4 3 5 1
As you see, all squared numbers (except 1) have the cycle 5.

Quadratic Residues

In a slightly more fancy terms, the squared numbers (of remainders) are called the quadratic residues.
Note: If \(p = 2q + 1\) is a safe prime number, then all numbers (except 1) in \(QR(\ZZ_p^*)\) have the cycle length \(q\).

Generator

Moreover, any number (except 1) in \(QR(\ZZ_p^*)\) list all numbers in \(QR(\ZZ_p^*)\). In this sense, these numbers generate \(QR(\ZZ_p^*)\) through modular exponentiations.

Note: There is a number that generates the entire \(\ZZ_p^*\). For example, the number 2 generates the entire group \(\ZZ_{11}^*\). If you can easily obtain this kind of the generator, that's better. But, sometimes, it takes time to find this.

Coming Up With a Hard Problem

In cryptography, coming up with a hard problem is a critical step toward building a useful cryptographic scheme. Why?

Looking into the modular exponentiation sequence again

Let's pick a somewhat large safe prime number 491. Since any squared number generates \(QR(\ZZ_{491}^*)\), I will pick \(g = 4\) as our generator.

p = 491
q = (p-1)//2
g = 4

for i in range(1, q+1):
  print(pow(g, i, p), end=" ")
The output is as follows:
4 16 64 256 42 168 181 233 441 291 182 237 457 355 438 279 134 45 180 229 425 227 417 195 289 174 205 329 334 354 434 263 70 280 138 61 244 485 467 395 107 428 239 465 387 75 300 218 381 51 204 325 318 290 178 221 393 99 396 111 444 303 230 429 243 481 451 331 342 386 71 284 154 125 9 36 144 85 340 378 39 156 133 41 164 165 169 185 249 14 56 224 405 147 97 388 79 316 282 146 93 372 15 60 240 469 403 139 65 260 58 232 437 275 118 472 415 187 257 46 184 245 489 483 459 363 470 407 155 129 25 100 400 127 17 68 272 106 424 223 401 131 33 132 37 148 101 404 143 81 324 314 274 114 456 351 422 215 369 3 12 48 192 277 126 13 52 208 341 382 55 220 389 83 332 346 402 135 49 196 293 190 269 94 376 31 124 5 20 80 320 298 210 349 414 183 241 473 419 203 321 302 226 413 179 225 409 163 161 153 121 484 463 379 43 172 197 297 206 333 350 418 199 305 238 461 371 11 44 176 213 361 462 375 27 108 432 255 38 152 117 468 399 123 1

What do you think? Any idea of a hard problem?

You will notice that the sequence is quite random. See the chart below.

  • Given \(x\), it is easy to compute \(y = g^x \bmod p\). In Python, it's just pow(g, x, p)
  • What about the other direction? That is, given \(y\), can you compute \(x\) easily?

Discrete Logarithm

Take 351 in the above. What is \(x\) such that pow(g, x, p) = 351? One naive solution below through a brute-force search:

p = 491
q = (p-1)//2
g = 4
y = 351
for i in range(1, q+1):
  if pow(g, i, p) == y:
    print("x is", i)
    break
The answer is x = 156. Quite easy at this moment.

How about really large \(p\)?

You can improve the above algorithm a little bit, but it turns out you cannot do significantly better than the brute-force search to get the solution.

Then, this is good news for us!

For example, consider a safe prime \(p \approx 2^{2048}\). The brute-force search cannot simply work.

Discrete Logarithm Problem

We call this problem "Discrete Logarithm Problem". In a sense, \(x\) is the (discrete version of ) logarithm of \(y\): \[ g^x = y \pmod p ~~\Longleftrightarrow x = \log_g y \pmod p \]

This problem is believed to be hard.

Hardness of Discrete Logarithm (Common Belief)

Creating a problem

The problem: What is \(\log_g y \pmod{p}\)?

Answer

The answer to the problem is \(x\) such that \(g^x = y \pmod{p}\).

Hardness

No polynomial time algorithm can solve \(\log_g y \pmod{p}\) except with negligible probability.

Key Exchange

Remember that a private key encryption scheme assumed that the encryption key has been shared by Alice and Bob over a secure channel. However, there may be no secure channel available for key sharing between Alice and Bob, for example, if they meet each other only over the Internet (HTTPS is a typical example). Achieving a private key communication without ever communicating over a private channel needs a radical idea. Whitfield Diffie and Martin Hellman came up with a beautiful scheme in their utterly fascinating paper "New Directions in Cryptography" in 1976. The first paragraph is as follows:
We stand today on the brink of a revolution in cryptography. The development of cheap digital hardware has freed it from the design limitations of mechanical computing and brought the cost of high grade cryptographic devices down to where they can be used in such commercial applications as remote cash dispensers and computer terminals. In turn, such applications create a need for new types of cryptographic systems which minimize the necessity of secure key distribution channels and supply the equivalent of a written signature. At the same time, theoretical developments in information theory and computer science show promise of providing provably secure cryptosystems, changing this ancient art into a science.

Diffie-Hellman Key Exchange

Recall the Diffie-Hellman key exchange protocol. Then the protocol proceeds as follows:

  1. Alice chooses a random exponent \(a\) from \(\{1, ..., p\}\), and computes \(A = g^a \bmod p\). Alice sends \(A\) to Bob.
  2. Bob also chooses a random exponent \(b\) from \(\{1, ..., p\}\), and computes \(B = g^b \bmod p\). Bob sends \(B\) to Alice.
The common secret key is computed as follows: Note that \(B^a = (g^b)^a = g^{ab}\), and \(A^b = (g^a)^b = g^{ab}\). Therefore, Alice and Bob will share the same key.

Security against the adversary

Alice can compute the key correctly

Bob can compute the key correctly

Example of DH

The following shows an execution of the Diffie-Hellman key exchange protocol. Fill in the blanks.
prime number p = 23
generator g = 4



Alice:                                  A = ____                  Bob:
exponent a = 4                      --------------->              exponent b = 6             


                                        B = ____
                                   <---------------

Ka = (___)^(___) mod ___ = ____                                   Kb = (___)^(___) mod ___ = ____

------
* Ka must be equal to Kb.
Check the answer.

Man-in-the-middle attack

DH key exchange protocol is vulnerable to a man-in-the-middle attack.

In the picture above, the attacker sits in the middle, and runs two separate key-exchange protocols, one with Alice and the other with Bob. Then, it obtains key, say \( K_1 \), shared with Alice and another key, say \( K_2 \), shared with Bob.

Fix: So, The DH key exchange protocol needs an authentication mechanism that confirms that a message really comes from Alice or Bob instead of some other parties. One popular mechanism is PKI + digital signature.