Homework 1

Introduction to Cryptography
University of Michigan, Fall 2016
Homework 1
Instructor: Chris Peikert
Student: SOLUTIONS
This homework is due by 10pm on September 13 via the course Canvas page. Start early!
Instructions. Solutions must be typed, preferably in LaTeX (a template for this homework is available on the
course web page). Your work will be graded on correctness, clarity, and concision. You should only submit
work that you believe to be correct; if you cannot solve a problem completely, you will get significantly more
partial credit if you clearly identify the gap(s) in your solution. It is good practice to start any long solution
with an informal (but accurate) “proof summary” that describes the main idea.
You may collaborate with others on this problem set and consult external sources, subject to the course
policies. However, you must write your own solutions and list your collaborators/sources for each problem.
1. In the ancient Caesar cipher, the key is a uniformly random “shuffle,” or permutation, of the alphabet
(including spacing and punctuation). For example, a random key might be: A becomes L, B becomes Z,
C becomes A, space becomes J, etc. To encrypt a message, the sender simply applies the permutation to
the message; to decrypt, the receiver reverses the shuffle.
Suppose we use the Caesar cipher to encrypt just one message that is shorter than the alphabet size. Is
this perfectly secret? Give a convincing argument (or formal proof) why or why not.
Solution: It is not. Let the alphabet be all capital letters, and the message length be just two.
Because each occurrence of a message letter always maps to the same ciphertext letter, the ciphertext
reveals whether the two characters in the message are the same or different. This is information that
the eavesdropper may not have in advance.
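We formalize this intuition with the following counterexample to perfect secrecy. Define the messages
m0 = AA and m1 = AB, and the ciphertext c̄ = ZZ. Then we have
\[ \Pr_{k \gets \mathsf{Gen}}[\mathsf{Enc}_k(m_0) = \bar{c}] = 1/26 > 0 \]
\[ \Pr_{k \gets \mathsf{Gen}}[\mathsf{Enc}_k(m_1) = \bar{c}] = 0, \]
which violates the definition, because perfect secrecy requires these two probabilities to be equal.
The first is 1/26 because c̄ arises exactly when the key maps A to Z; the second is zero because a
permutation cannot map both A and B to Z.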
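The counting behind this counterexample can be checked by brute force. The following is a minimal sketch, assuming a toy 4-letter alphabet so that all keys can be enumerated (26! keys would be far too many); the names `ALPHABET`, `encrypt`, and `prob_of_ciphertext` are illustrative and not part of the homework. Here DD plays the role of ZZ.

```python
from fractions import Fraction
from itertools import permutations

ALPHABET = "ABCD"  # toy alphabet (assumption); the solution above uses 26 letters


def encrypt(key, message):
    """Apply the substitution `key` (a dict letter -> letter) to every character."""
    return "".join(key[ch] for ch in message)


def prob_of_ciphertext(message, ciphertext):
    """Exact probability, over a uniformly random permutation key,
    that the message encrypts to the given ciphertext."""
    keys = [dict(zip(ALPHABET, perm)) for perm in permutations(ALPHABET)]
    hits = sum(encrypt(key, message) == ciphertext for key in keys)
    return Fraction(hits, len(keys))


p0 = prob_of_ciphertext("AA", "DD")  # analogue of m0 = AA, ciphertext ZZ
p1 = prob_of_ciphertext("AB", "DD")  # analogue of m1 = AB, ciphertext ZZ
print(p0, p1)  # 1/4 0
```

With the full 26-letter alphabet the same count gives 25!/26! = 1/26 for the first probability and 0 for the second.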
2. Either give a convincing argument (or formal proof) that the following statement is true, or show that it
is false by giving a counterexample. (A counterexample should describe an encryption scheme that is
perfectly secret, but for which the statement is false.)
In any perfectly secret encryption scheme, the encryption of any message must be uniformly random
over the ciphertext space. Formally: for any fixed m̄ ∈ M and any c̄ ∈ C,
\[ \Pr_{k \gets \mathsf{Gen}}[\mathsf{Enc}_k(\bar{m}) = \bar{c}] = 1/|C|. \]
Solution: This statement is not true. As a counterexample, take the following modification of the
one-time pad: the message and key spaces are M = K = {0, 1}^n, but the ciphertext space is
{0, 1}^{n+1}. Encryption works as before, but always appends a 1 onto the end of the ciphertext. This
clearly does not affect Shannon/perfect secrecy, because we are just adding “junk” to a ciphertext
that reveals nothing about the message. However, ciphertexts that end in 0 occur with probability
zero, hence ciphertexts are not uniform over C: for any message m̄ ∈ M and any ciphertext c̄ ∈ C
that ends in 1, we have
\[ \Pr_{k \gets \mathsf{Gen}}[\mathsf{Enc}_k(\bar{m}) = \bar{c}] = 1/2^n = 2/|C| \neq 1/|C|. \]
The above example is actually not so artificial: many cryptosystems include extra protocol or
version information in their ciphertexts, to ensure that the receiver is able to decrypt correctly. This
information is not uniformly random, but also does not reveal anything about the message itself.
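Both claims about the padded one-time pad can be verified exhaustively for a small n. This is a sketch under the assumption n = 3; the helper names are illustrative. It uses the fact that a scheme is perfectly secret exactly when every message induces the same ciphertext distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N = 3  # toy message length in bits (assumption)


def bitstrings(length):
    """All bit tuples of the given length."""
    return list(product((0, 1), repeat=length))


def encrypt(key, message):
    """One-time pad, followed by a constant trailing 1 bit."""
    return tuple(k ^ m for k, m in zip(key, message)) + (1,)


def ciphertext_distribution(message):
    """Exact distribution of Enc_k(message) over a uniformly random key k."""
    keys = bitstrings(N)
    counts = Counter(encrypt(key, message) for key in keys)
    return {c: Fraction(n, len(keys)) for c, n in counts.items()}


distributions = [ciphertext_distribution(m) for m in bitstrings(N)]

# Perfect secrecy: the ciphertext distribution does not depend on the message.
perfectly_secret = all(d == distributions[0] for d in distributions)

# Uniformity over C = {0,1}^(N+1): every ciphertext would need probability 1/|C|.
size_of_C = 2 ** (N + 1)
uniform_over_C = all(
    d.get(c, Fraction(0)) == Fraction(1, size_of_C)
    for d in distributions
    for c in bitstrings(N + 1)
)
print(perfectly_secret, uniform_over_C)  # True False
```

Each ciphertext ending in 1 has probability 1/8 = 2/|C| here, and each ciphertext ending in 0 has probability 0.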
3. Eve has been eavesdropping on Alice’s and Bob’s communications with each other for some time.
They appear to be using one-time-pad encryption to keep their messages secret. Eve suspects that the
plaintexts are just English sentences encoded in the standard ASCII character set, and the ciphertexts are
generated using bitwise exclusive-or (XOR) with the pad. For example, in ASCII the character ‘a’ has
hexadecimal value 61 (or 01100001 in binary), which when XOR’ed with the hexadecimal pad value
83 (10000011 in binary) yields the hexadecimal ciphertext e2 (11100010 in binary).
Knowing that the one-time pad is hard to use properly, Eve has been storing every ciphertext sent between
Alice and Bob, and XORing pairs of them to look for any anomalies. One day she notices that a pair of
ciphertexts XOR to a value, shown below in hexadecimal, that appears “strange.” She suspects that Alice
and Bob may have reused part of their pad, and asks you to recover the plaintexts.
(a) What is “strange” about the value below that Eve found?
Solution: One very strange thing is that the most significant bit of every byte (two-digit hex block)
is zero, i.e., the first hex digit is always in the range 0–7. More generally, “small” values like 00
appear quite frequently.
This is strange because if Alice and Bob had properly used a fresh pad for every message, these
properties would be extremely unlikely: the XOR of any two ciphertexts would be
uniformly random, since c1 ⊕ c2 = (m1 ⊕ k1) ⊕ (m2 ⊕ k2) is uniform for any messages m1, m2
when k1, k2 are uniform and independent.
By contrast, if Alice and Bob had reused a pad, then c1 ⊕ c2 = m1 ⊕ m2 . Because the ASCII
hex codes for characters used in English all have zero as their first bit, the same is true for the
XOR of any two such characters.
All this indicates that a pad was probably reused for the two ciphertexts that Eve XORed.
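The cancellation of a reused pad is easy to see in code. This is a minimal sketch with two hypothetical 14-byte plaintexts (not the homework's messages) and a random pad drawn with Python's `secrets` module.

```python
import secrets


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical ASCII plaintexts of equal length (assumption, for illustration only).
m1 = b"attack at dawn"
m2 = b"retreat at ten"

pad = secrets.token_bytes(len(m1))
c1 = xor_bytes(m1, pad)
c2 = xor_bytes(m2, pad)  # the same pad is used a second time

leak = xor_bytes(c1, c2)
pad_cancels = leak == xor_bytes(m1, m2)               # the pad drops out entirely
high_bits_clear = all(byte < 0x80 for byte in leak)   # top bit of every byte is 0

# With independent fresh pads the XOR would be uniform, so every top bit
# would be clear only with probability 2^-(number of bytes).
chance_with_fresh_pads = 2.0 ** -len(leak)
print(pad_cancels, high_bits_clear, chance_with_fresh_pads)
```

For a 115-byte value such as the one Eve found, the corresponding chance under fresh pads would be 2^-115.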
(b) Describe a good approach for helping Eve recover the plaintexts. The messages may be time-sensitive, so your attack should work as quickly and be as automated as possible.
Solution: We hypothesize that C = c1 ⊕ c2 is actually m1 ⊕ m2 , the XOR of two English
messages. There are many ways we can recover most or all of m1 and m2 from this.
First, we can use a technique called “crib-dragging:” use a list of common words that are likely
to be in the message, e.g., to, Alice, where, etc. Exclusive-or each word at various locations
of C. If the result “looks like” it could be part of a valid English message (e.g., it contains all
letters or punctuation), it’s likely that the chosen word actually appears in that position in one of
the messages, and the result of the XOR appears in the other message. We can then use this to
inform our next guess: e.g., if the other message contains the fragment prob we might guess
that this is part of the word probably.
There are many ways to accelerate this process. One idea is to include spaces or other punctuation on both sides of the crib words. Another observation is that the ASCII codes for space and
punctuation are very far from those of a-z and A-Z. So locations with spaces or punctuation
stand out as larger hex values in C, e.g., they start with 4 or 5 instead of 0 or 1.
For example, try the word Bob. If we XOR this with the first three characters of C, we get Ali.
It seems likely that this is part of Alice, probably with a space or comma after it (because the
corresponding hex value in C is 5b). By going back and forth between the messages, and using
a computer program to help us with the grunge work, we can pretty quickly recover everything.
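The crib-dragging step described above can be sketched as follows. The code assumes the first 22 bytes of C in message order, and a hypothetical plausibility filter (letters, space, and a few punctuation marks); the names `PLAUSIBLE` and `crib_drag` are illustrative.

```python
import string

# First 22 bytes of C, in message order (assumption: the hex value is read
# left to right, top to bottom).
C = bytes.fromhex(
    "03 03 0b 4f 45 5b 48 09 0b 54 54 1b 4f 1d 0d 12 45 57 0c 54 48 00"
)

# Byte values we accept as "looks like English" (a deliberately crude filter).
PLAUSIBLE = set((string.ascii_letters + " .,?!'-").encode("ascii"))


def crib_drag(xor_stream, crib):
    """Slide `crib` along `xor_stream` and return (offset, fragment) pairs
    for which crib XOR stream consists only of plausible characters."""
    hits = []
    for offset in range(len(xor_stream) - len(crib) + 1):
        window = xor_stream[offset:offset + len(crib)]
        fragment = bytes(c ^ w for c, w in zip(crib, window))
        if all(b in PLAUSIBLE for b in fragment):
            hits.append((offset, fragment.decode("ascii")))
    return hits


print(crib_drag(C, b"Bob"))      # the hit at offset 0 is "Ali"
print(crib_drag(C, b"Alice, "))  # the hit at offset 0 is "Bob, wh"
```

A crude filter like this also reports false positives at other offsets, so in practice the hits are ranked or checked by eye, and each confirmed fragment is extended into a longer crib for the other message.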
(c) Give as much of the plaintexts as you can find.
Solution: The plaintexts are:
Bob, when should we have Eve’s surprise party? Make sure
to use our one-time pad so she doesn’t find out about it.
Alice, let’s have it at noon. I might have reused our pad,
but it’s probably no big deal -- Eve won’t ever notice.
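As a consistency check, XORing Eve's value with one recovered plaintext should reproduce the other. The sketch below assumes straight ASCII apostrophes, a single space at each line break, and two spaces after the sentence-ending "?" and "." inside the messages; the last point is an assumption about the original typing that makes each message exactly 115 bytes, matching the length of Eve's value.

```python
# Eve's 115-byte value, in message order (read left to right, top to bottom).
C = bytes.fromhex(
    "03 03 0b 4f 45 5b 48 09 0b 54 54 1b 4f 1d 0d 12 45 57 0c 54 48 00"
    "02 45 4e 2a 19 0b 09 53 00 3a 55 1f 19 15 01 07 45 48 11 17 17 54"
    "0b 5a 55 53 28 05 4b 0a 55 01 55 02 04 44 58 4f 42 00 07 45 49 1b"
    "52 01 00 1f 1c 0a 4f 15 0b 01 1c 00 1e 0e 44 42 1a 08 00 17 0d 04"
    "4c 44 42 48 53 2b 51 11 00 11 06 00 43 54 4f 10 02 45 13 42 01 1a"
    "00 49 0a 11 00"
)

# First plaintext, with two spaces after "?" (assumption, see above).
m1 = (
    b"Bob, when should we have Eve's surprise party?  Make sure "
    b"to use our one-time pad so she doesn't find out about it."
)

# Since C = m1 XOR m2, XORing C with m1 yields m2.
recovered = bytes(x ^ y for x, y in zip(C, m1))
print(recovered.decode("ascii"))
```

The recovered text is the second message, with two spaces after "noon.".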
The value Eve found, in hexadecimal (read left to right, top to bottom):
03 03 0b 4f 45 5b 48 09 0b 54 54 1b 4f 1d 0d 12 45 57 0c 54 48 00
02 45 4e 2a 19 0b 09 53 00 3a 55 1f 19 15 01 07 45 48 11 17 17 54
0b 5a 55 53 28 05 4b 0a 55 01 55 02 04 44 58 4f 42 00 07 45 49 1b
52 01 00 1f 1c 0a 4f 15 0b 01 1c 00 1e 0e 44 42 1a 08 00 17 0d 04
4c 44 42 48 53 2b 51 11 00 11 06 00 43 54 4f 10 02 45 13 42 01 1a
00 49 0a 11 00