CS243, Prof. Alvarez
1 PRIVATE-KEY AND PUBLIC-KEY ENCRYPTION
Prof. Sergio A. Alvarez
Maloney Hall, room 569
Computer Science Department
Boston College
Chestnut Hill, MA 02467 USA
http://www.cs.bc.edu/~alvarez/
[email protected]
voice: (617) 552-4333
fax: (617) 552-6790
CS243, Logic and Computation
Rivest-Shamir-Adleman (RSA) Encryption
Information security is a crucial concern in data communications. We discuss an application of modular arithmetic algorithms to secure encoding of information. Familiarity with
the content of the notes on basic number theory, including modular exponentiation and the
extended Euclid gcd procedure, is required in order to understand RSA.
1 Private-key and public-key encryption
Encryption is the process of obscuring information via systematic means for the purpose of
secure communication or storage. The inverse process, decryption, is carried out in order to
recover the original information from its obscured form. A pair of matching encryption and
decryption techniques is sometimes called a cryptosystem.
1.1 Private key techniques
Until the last quarter of the 20th century CE, cryptosystems were typically based on a secret
key that should be known only to the sender and the intended recipient of the information.
Since they use the same key for encryption and decryption, such systems are also known as
symmetric key cryptosystems.
Example 1.1. In a simple substitution scheme, individual characters in a text message
can be replaced by others according to a “dictionary”. Knowledge of the dictionary would
enable anyone to decode a message that has been encrypted by using that dictionary. Hence,
the dictionary must remain private if the information is to remain unintelligible except to
the intended recipient. This creates the new problem of keeping the dictionary secret. A
particularly simple form of the substitution scheme requires keeping only a single small
integer k secret: each character can be replaced by the character that occurs k characters
after it in some natural ordering (alphabetical, for example):
>>> def simpleShift(s, key):
...     # shift every character of s by key positions in character-code order
...     if len(s) == 0: return ''
...     return simpleShift(s[:-1], key) + chr(ord(s[-1]) + key)
...
>>> message = 'Four score and seven years ago...'
>>> encrypted = simpleShift(message, 5)
>>> encrypted
'Ktzw%xhtwj%fsi%xj{js%~jfwx%flt333'
>>> decrypted = simpleShift(encrypted, -5)
>>> decrypted
'Four score and seven years ago...'
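The more general "dictionary" scheme described above can be sketched in the same spirit. The following is only an illustration under stated assumptions: the function names (make_dictionary, substitute, invert) are hypothetical, the dictionary covers lowercase letters only, and the seed stands in for the shared secret. Python's random module is not suitable for real cryptographic use.

```python
import random
import string

def make_dictionary(seed):
    """Build a substitution dictionary: a random permutation of a-z.

    The seed plays the role of the shared secret in this sketch.
    """
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))

def substitute(text, dictionary):
    """Replace each character found in the dictionary; keep the rest unchanged."""
    return ''.join(dictionary.get(ch, ch) for ch in text)

def invert(dictionary):
    """Swap keys and values to obtain the decoding dictionary."""
    return {v: k for k, v in dictionary.items()}

secret = make_dictionary(seed=2024)   # hypothetical shared secret
message = 'four score and seven years ago...'
encrypted = substitute(message, secret)
decrypted = substitute(encrypted, invert(secret))
print(encrypted)
print(decrypted)
```

Anyone who holds the dictionary can build its inverse, which is why the dictionary itself must stay private.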
Simple substitution schemes are easy to break by examining the frequencies with which
individual characters or sequences of characters occur in the encrypted text, and comparing
them with the corresponding frequencies in text samples in the presumed language.
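As a rough illustration of such a frequency attack, the sketch below guesses the key of the shift scheme by assuming that the most frequent character in the encrypted text stands for a space. The helper names shift_text and guess_shift are hypothetical, and the space assumption is a heuristic that fits ordinary English prose; short or unusual messages may defeat it.

```python
from collections import Counter

def shift_text(text, key):
    """Shift every character by key positions in character-code order."""
    return ''.join(chr(ord(ch) + key) for ch in text)

def guess_shift(ciphertext, likely_char=' '):
    """Guess the key by assuming the most frequent ciphertext character
    is the encryption of likely_char (a space, for English prose)."""
    most_common_char, _count = Counter(ciphertext).most_common(1)[0]
    return ord(most_common_char) - ord(likely_char)

encrypted = shift_text('Four score and seven years ago...', 5)
key = guess_shift(encrypted)
print(key)                          # 5
print(shift_text(encrypted, -key))  # Four score and seven years ago...
```

No knowledge of the key was needed, only a guess about which character is most common in the presumed language.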
More elaborate private key schemes exist, of course, but they are still vulnerable to being
broken if the key is intercepted or guessed. A famous example is the Enigma scheme used
by the Germans during the Second World War. Breaking that system took perseverance,
patience, and brilliant minds, including that of Alan Turing.
1.2 Public key techniques
The thought that the key used for encryption may be intentionally made public seems to
run counter to the very purpose of encryption. Would knowledge of the encryption details
not allow others to easily gain access to the encrypted information?
1.2.1 Characteristics of public key systems
The keys (no pun intended) to so-called public key cryptosystems are the following:
1. There are actually two interrelated keys: a public part, and a private part. Only the
public part is intentionally disclosed, together with the algorithm used for encryption.
2. Knowledge of the public key allows encryption, and knowledge of the private key allows
decryption.
3. While knowledge of the public key would in principle also allow decryption, the
computation is believed to be so demanding that completing it is not feasible within
human lifespans.
Since public key systems use different keys for encryption and decryption, they are also
known as asymmetric key cryptosystems. A well-known public key cryptosystem, RSA, will
be discussed below.
1.2.2 Historical comments
Around 1970, British scientist James Ellis at the Government Communications Headquarters in Cheltenham, England was investigating improved techniques for distribution of keys,
and conceived of the notion that keys could be made public without compromising security.
Working on Ellis’ proposal, Clifford Cocks, a number theorist also at GCHQ, later (1973)
proposed a specific technique for public key cryptography based on the computational difficulty of factoring large numbers. This work antedates the famous RSA algorithm by several
years, but was intentionally kept secret until 1997.
2 The RSA algorithm
RSA (Rivest, Shamir, Adleman) is a public (asymmetric) key cryptosystem based on number
theory.
2.1 Generation of RSA keys
The public and private keys are generated together as follows:
1. Two large, distinct pseudorandom primes p and q are generated. Currently, a modulus
N = pq with a length of 2048 bits is common, which corresponds to about 617 decimal
digits (each prime then has about 1024 bits).
2. The public key is defined as the pair (N, e), where N = pq and e is an integer that is
relatively prime to (p − 1)(q − 1).
3. The private key is the multiplicative inverse, d, of e mod (p − 1)(q − 1).
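The three steps above can be sketched as follows. This is a minimal illustration, not a secure key generator: the primes 61 and 53 and the exponent 17 are tiny values assumed only for readability, and the names extended_gcd and generate_keys are hypothetical. The inverse d is obtained with the extended Euclid procedure mentioned in the introduction.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t

def generate_keys(p, q, e):
    """Return ((N, e), d) for primes p and q and a candidate exponent e."""
    N = p * q
    phi = (p - 1) * (q - 1)
    g, s, _t = extended_gcd(e, phi)
    if g != 1:
        raise ValueError('e must be relatively prime to (p - 1)(q - 1)')
    d = s % phi  # normalize the inverse into the range 0 < d < phi
    return (N, e), d

public_key, private_key = generate_keys(61, 53, 17)
print(public_key, private_key)  # (3233, 17) 2753
```

Here (p − 1)(q − 1) = 3120 and 17 · 2753 = 46801 = 15 · 3120 + 1, so d = 2753 is indeed the inverse of e = 17.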
2.2 The RSA encryption and decryption mappings
To send a private message to someone with public key (N, e), the message is first represented
as a number m using a predetermined convention (for example, characters in a word can
be thought of as digits modulo 256, using their ASCII codes, and m can be taken as the
magnitude of the number that has the given sequence of codes as its base 256 positional
representation). This number is then encrypted (see below) using the public key of the
intended recipient. The intended recipient can then decrypt the encrypted message by using
his/her private key. The result is the number m. The inverse of the initial numerical
representation procedure must then be applied to m in order to recover the original message.
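As a worked instance of the base 256 convention, consider the three-character message "Hey", whose ASCII codes are 72, 101 and 121. The choice of message here simply matches the one used in the example of section 2.3.

```latex
\begin{align*}
\text{``Hey''} &\mapsto (72,\ 101,\ 121) && \text{ASCII codes of H, e, y} \\
m &= 72 \cdot 256^2 + 101 \cdot 256 + 121 \\
  &= 4718592 + 25856 + 121 \\
  &= 4744569
\end{align*}
```

Recovering the text from m amounts to reading off the base 256 digits of m again, from most significant to least significant.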
2.2.1 Encryption
The encryption of a number $m \in \mathbb{Z}_N^*$ is
$$\mathrm{encrypt}(m) = m^e \bmod N, \qquad (1)$$
where (N, e) is the public key of the intended recipient, generated as described above in
section 2.1.
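Eq. 1 is a single modular exponentiation. The sketch below shows one possible way to compute it by repeated squaring; the name modexp matches the function used in the example of section 2.3, but this particular implementation is an assumption rather than the course's own code. The toy public key (N, e) = (3233, 17), built from the illustrative primes 61 and 53, is also an assumption.

```python
def modexp(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by repeated squaring."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current low bit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1            # move on to the next bit
    return result

def encrypt(m, public_key):
    """Apply Eq. 1: m^e mod N, where public_key is the pair (N, e)."""
    N, e = public_key
    return modexp(m, e, N)

ciphertext = encrypt(65, (3233, 17))
print(ciphertext)  # 2790
```

The loop runs once per bit of the exponent, so the cost grows with the length of e rather than with its magnitude. Python's built-in pow(m, e, N) computes the same value.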
2.2.2 Decryption
The decryption of a number $n \in \mathbb{Z}_N^*$ is
$$\mathrm{decrypt}(n) = n^d \bmod N, \qquad (2)$$
where d is the private key of the intended recipient and N is the modulus in the public key,
generated as described above in section 2.1.
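A short round trip illustrates Eq. 2. The toy values N = 3233, e = 17 and d = 2753 are assumptions chosen for readability (they come from the primes 61 and 53), and the helper name decrypt is hypothetical. Python's built-in three-argument pow is used for the modular exponentiation.

```python
def decrypt(n, private_key, N):
    """Apply Eq. 2: n^d mod N, using Python's built-in three-argument pow."""
    return pow(n, private_key, N)

N, e, d = 3233, 17, 2753      # toy key; d*e = 46801 = 15*3120 + 1
m = 65
n = pow(m, e, N)              # encryption, as in Eq. 1
recovered = decrypt(n, d, N)
print(n, recovered)           # 2790 65
```

Only the holder of d can perform the last step directly; everyone else sees just N, e and the encrypted number n.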
2.2.3 Why RSA works
RSA encryption and decryption are based on the following result.
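Theorem 2.1. Suppose that $p$ and $q$ are distinct primes. Define $N = pq$ and $\phi(N) = (p-1)(q-1)$.
Let $e$ be an element of $\mathbb{Z}_{\phi(N)}^*$, that is, an integer relatively prime to $\phi(N)$, and let $d$ be the multiplicative
inverse of $e$ mod $\phi(N)$. The encryption and decryption functions from Eq. 1 and Eq. 2 are
inverses of each other on $\mathbb{Z}_N^*$:
$$\forall x \in \mathbb{Z}_N^* \quad \mathrm{decrypt}(\mathrm{encrypt}(x)) \equiv x \pmod{N} \qquad (3)$$
Proof. By Eq. 1 and Eq. 2, showing that Eq. 3 holds is equivalent to showing:
$$\forall x \in \mathbb{Z}_N^* \quad x^{de} \equiv x \pmod{N}$$
Since $d$ and $e$ are multiplicative inverses mod $(p-1)(q-1)$, we know that:
$$de \equiv 1 \pmod{(p-1)(q-1)},$$
so that, for some integer $k$,
$$de = k(p-1)(q-1) + 1$$
The composition of the encryption and decryption functions can now be written as:
$$x^{de} = x^{k(p-1)(q-1)+1} = x \cdot \left(x^{p-1}\right)^{k(q-1)}$$
Since $x \in \mathbb{Z}_N^*$ is not divisible by $p$, we have $x^{p-1} \equiv 1 \pmod{p}$ by Fermat's little theorem, and therefore
$$x^{k(p-1)(q-1)+1} \equiv x \pmod{p}$$
Similarly,
$$x^{k(p-1)(q-1)+1} \equiv x \pmod{q}$$
Hence both $p$ and $q$ divide $x^{de} - x$. Since $p$ and $q$ are distinct primes, their product $N = pq$ also divides $x^{de} - x$, so that
$$x^{de} = x^{k(p-1)(q-1)+1} \equiv x \pmod{N}$$
This completes the proof.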
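The statement of the theorem can be checked numerically for a small modulus. The sketch below is only a sanity check under assumed toy values (p = 61, q = 53, e = 17), not a substitute for the proof; it tests every element of Z_N^* and also recovers the integer k that appears in the proof. The call pow(e, -1, phi) requires Python 3.8 or later.

```python
from math import gcd

p, q = 61, 53                 # toy primes, chosen only for illustration
N = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)           # modular inverse; needs Python 3.8 or later
k = (d * e - 1) // phi        # the integer k with d*e = k*phi + 1

units = [x for x in range(1, N) if gcd(x, N) == 1]   # the elements of Z_N^*
assert len(units) == phi
assert d * e == k * phi + 1
assert all(pow(x, d * e, N) == x for x in units)
print(k, len(units))          # 15 3120
```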
2.3 Example
We present an example of RSA using relatively small integers in order to facilitate display.
First, two pseudorandom primes p and q are generated and their product is taken as the
modulus N :
>>> p
1478919731
>>> q
8926888669L
>>> N
13202151789024428039L
Next, the encryption exponent e is selected so that it is relatively prime to (p − 1)(q − 1):
>>> e = 7
>>> euclidGCD(e, (p-1)*(q-1))
1L
The decryption exponent is calculated as the multiplicative inverse of e mod (p − 1)(q − 1)
(using the extension of Euclid’s gcd procedure with weights, not shown):
>>> d
7544086730639211223L
>>> 0 < d and d < (p-1)*(q-1)
True
>>> (d*e)%((p-1)*(q-1))
1L
A message is first represented as a number, then encrypted using the recipient’s public key:
>>> m = codeText('Hey')
>>> m
4744569
>>> encrypted = modexp(m, e, N)
>>> encrypted
61535359180538885L
Decryption uses the private key, followed by reversing the representation of text as numbers:
>>> decrypted = modexp(encrypted, d, N)
>>> decrypted
4744569L
>>> decode(decrypted)
'Hey'
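The session above appears to come from Python 2 (note the L suffix on long integers) and relies on helper functions that are not shown. As a hedged sketch, the same numbers can be replayed in Python 3.8 or later with the built-in pow standing in for both modexp and the extended Euclid step; that substitution is an assumption, not the course's code. The values of p, q, e and m are copied from the example.

```python
p = 1478919731
q = 8926888669
e = 7
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)           # Python 3.8+; stands in for the extended Euclid step

m = 4744569                   # the number the notes give for 'Hey'
encrypted = pow(m, e, N)      # plays the role of modexp(m, e, N)
decrypted = pow(encrypted, d, N)
print(N)
print(d)
print(decrypted == m)
```

The printed values of N and d can be compared directly with those shown in the session.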
3 Exercises
1. Implement Python functions codeText and decode as described in the RSA example
above. The first of these should accept a text string as input and should return a
number that represents that string, by interpreting the string as a number in base 256,
using ASCII codes as the digit values. The second function should be the left inverse
of the first, so that decode(codeText(s)) = s for all strings s.
2. Consider a brute-force attack on RSA, which attempts to break the encryption by
directly factoring the public modulus, N , into a product of two primes, p and q.
(a) Describe in detail how knowledge of a specific factoring N = pq of the public
modulus N could be used to decrypt a message that has been encrypted using
the RSA public key (N, e).
(b) Write a Python function that accepts an integer N as input and that returns a
pair of integers (p, q) that have the input parameter as their product. The return
value should not be the trivial factoring (N, 1) on input N unless N is prime.
(c) RSA moduli spanning about 300 decimal digits are not unusual (they’re actually
considered relatively “easy”, or low security). Estimate the time that it would
take your function from the preceding part to factor a 300 digit number in the
worst case.
3. (a) Generalize Theorem 2.1 so that $k$ primes $p_1, p_2, \ldots, p_k$ are used instead of just two,
and describe the corresponding k-prime version of RSA.
(b) Discuss any pros and cons of using k > 2 primes for RSA instead of two.
References
S. Dasgupta, C. Papadimitriou, and U. Vazirani. Algorithms. McGraw-Hill, 2008
(sections 1.2-1.4).