Problem 1 Solution

CSCI 5440: Cryptography
The Chinese University of Hong Kong
Homework 2 Solutions
Problem 1
In this question you will explore the difference between secure and strongly secure MACs for fixed
message length. Recall that the two differ in the type of forgery: A forgery is a pair $(M, T)$ such
that $\mathrm{Ver}(K, M, T) = 1$ and $M$ was never queried from the tagging oracle, while a weak forgery is a
pair $(M, T)$ such that $\mathrm{Ver}(K, M, T) = 1$ and the query-answer pair $(M, T)$ was never observed in the
interaction between the adversary and the tagging oracle. (So in a weak forgery, $M$ could have been queried
before, but if the oracle did not return $T$ as an answer when $M$ was queried, $(M, T)$ is considered a weak
forgery.) A MAC that prevents forgeries is called secure (against chosen message attack); one that
prevents weak forgeries we will call strongly secure.
(a) Assuming that secure MACs for message length $m$ exist, show that there exists a MAC for
message length $m$ that is secure but not strongly secure.
(b) Show that if $\mathrm{Tag}$ is deterministic and $(\mathrm{Tag}, \mathrm{Ver})$ is secure, then there exists a verifier $\mathrm{Ver}'$
such that $(\mathrm{Tag}, \mathrm{Ver}')$ is strongly secure.
Solution
(a) Let $(\mathrm{Tag}, \mathrm{Ver})$ be an $(s, \varepsilon)$ secure MAC for message length $m$. Consider the following MAC:
$$\mathrm{Tag}'(K, M) = (\mathrm{Tag}(K, M), 0)$$
$$\mathrm{Ver}'(K, M, (T, b)) = \mathrm{Ver}(K, M, T).$$
Then $(\mathrm{Tag}', \mathrm{Ver}')$ is also $(s, \varepsilon)$ secure: For suppose $A'^{?}$ is an oracle circuit of size $s$ such
that $A'^{\mathrm{Tag}'(K,\cdot)}$ produces a forgery for $(\mathrm{Tag}', \mathrm{Ver}')$ with probability $\varepsilon$. Consider the circuit
$A^{?}$ that simulates $A'^{?}$ and makes the same oracle calls as $A'$, handing each answer $T_i$ of its own
oracle back to $A'$ as $(T_i, 0)$. The output of $A^{?}$ is the same as the output of $A'^{?}$ with the last bit
of the tag stripped. If $A'^{\mathrm{Tag}'(K,\cdot)}$ produces a forgery $(M, (T, b))$ for $(\mathrm{Tag}', \mathrm{Ver}')$ after seeing the
query-answer pattern $(M_1, (T_1, b_1)), \dots, (M_q, (T_q, b_q))$, then $A^{\mathrm{Tag}(K,\cdot)}$ sees $(M_1, T_1), \dots, (M_q, T_q)$
and outputs $(M, T)$, which is a forgery for $(\mathrm{Tag}, \mathrm{Ver})$: the message $M$ is not among the queries, and
$\mathrm{Ver}(K, M, T) = \mathrm{Ver}'(K, M, (T, b)) = 1$.
On the other hand, $(\mathrm{Tag}', \mathrm{Ver}')$ is not even $(O(m), 0.99)$ strongly secure: Consider the oracle
circuit $A^{?}$ that queries the message $0$ and upon receiving answer $a$, flips the last bit of $a$ to
get $a'$ and outputs $(0, a')$. Since $\mathrm{Ver}'$ ignores the last bit and the oracle never returned $a'$ for the
query $0$, this is a weak forgery with probability one.
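The attack above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: HMAC-SHA256 stands in for the underlying Tag, the appended bit is stored as a whole byte, and the function names are hypothetical.

```python
import hashlib
import hmac


def tag(key: bytes, msg: bytes) -> bytes:
    # Stand-in for the underlying Tag (assumption: HMAC-SHA256).
    return hmac.new(key, msg, hashlib.sha256).digest()


def ver(key: bytes, msg: bytes, t: bytes) -> bool:
    return hmac.compare_digest(tag(key, msg), t)


def tag_prime(key: bytes, msg: bytes) -> bytes:
    # Tag'(K, M) = (Tag(K, M), 0); the extra bit is stored as one byte here.
    return tag(key, msg) + b"\x00"


def ver_prime(key: bytes, msg: bytes, t: bytes) -> bool:
    # Ver' ignores the appended component and checks only the original tag.
    return ver(key, msg, t[:-1])


def weak_forgery(tag_oracle):
    # Query the message 0, then flip the last bit of the answer.
    msg = b"\x00"
    a = tag_oracle(msg)
    a_flipped = a[:-1] + bytes([a[-1] ^ 1])
    return msg, a_flipped
```

The pair returned by `weak_forgery` verifies under `ver_prime` although the oracle never returned that tag, which is exactly what makes it a weak forgery and not a forgery.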
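(b) Let $\mathrm{Ver}'(K, M, T) = 1$ if $T = \mathrm{Tag}(K, M)$ and $0$ otherwise. Since $\mathrm{Tag}$ is deterministic,
$\mathrm{Ver}'(K, M, \mathrm{Tag}(K, M)) = 1$ so $(\mathrm{Tag}, \mathrm{Ver}')$ is a valid MAC. Moreover, every weak forgery $(M, T)$ for
$(\mathrm{Tag}, \mathrm{Ver}')$ is also a forgery for $(\mathrm{Tag}, \mathrm{Ver})$: $\mathrm{Ver}'$ accepts only $T = \mathrm{Tag}(K, M)$, so if $M$ had been
queried, the oracle would have returned exactly $T$ and the pair $(M, T)$ would have been observed. Hence $M$
was never queried, and $\mathrm{Ver}(K, M, T) = 1$ because $(\mathrm{Tag}, \mathrm{Ver})$ is a valid MAC. It follows that
$(\mathrm{Tag}, \mathrm{Ver}')$ is strongly secure with the same parameters for which $(\mathrm{Tag}, \mathrm{Ver})$ is secure.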
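A minimal Python sketch of this canonical verifier follows, again assuming HMAC-SHA256 as a stand-in for a deterministic Tag (the names are illustrative, not part of the original solution).

```python
import hashlib
import hmac


def tag(key: bytes, msg: bytes) -> bytes:
    # Deterministic stand-in for Tag (assumption: HMAC-SHA256).
    return hmac.new(key, msg, hashlib.sha256).digest()


def ver_canonical(key: bytes, msg: bytes, t: bytes) -> bool:
    # Ver'(K, M, T) = 1 iff T = Tag(K, M): recompute the tag and compare.
    return hmac.compare_digest(tag(key, msg), t)
```

Because exactly one tag is accepted per message, any modified tag (for instance one with an extra or flipped bit) is rejected.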
Problem 2
Consider the following encryption scheme, where $\{F_K\}$ is a pseudorandom function family:
$$\mathrm{Enc}((K_1, K_2), M) = (S,\ F_{K_1}(S) + M,\ F_{K_2}(S + M)) \quad \text{where $S$ is a random string}$$
$$\mathrm{Dec}((K_1, K_2), (S, C, T)) = \begin{cases} C + F_{K_1}(S), & \text{if } F_{K_2}(S + C + F_{K_1}(S)) = T \\ \text{error}, & \text{otherwise.} \end{cases}$$
(a) Show that if (Enc, Dec) is used with the same key K1 = K2 , the scheme is not message
indistinguishable, even for one encryption.
(b) Now assume that $K_1$ and $K_2$ are independent. Consider the ideal scheme (REnc, RDec) which
is a variant of (Enc, Dec) where $F_{K_1}$ and $F_{K_2}$ are replaced with truly random independent
functions $R_1$ and $R_2$, respectively. Show that if $\{F_K\}$ is pseudorandom and (REnc, RDec) is
CPA-secure, then (Enc, Dec) is CPA-secure (for an appropriate choice of the parameters).
(c) Show that (REnc, RDec) is CPA-secure (for an appropriate choice of the parameters).
(d) (Optional) Can you show that (Enc, Dec) is CCA-secure?
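For concreteness, the scheme can be sketched in Python as follows. The sketch makes assumptions the problem does not state: $+$ is read as bitwise XOR, truncated HMAC-SHA256 stands in for $F_K$, blocks are 16 bytes long, and `None` plays the role of the error output.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block length in bytes (illustrative choice)


def prf(key: bytes, x: bytes) -> bytes:
    # Assumption: truncated HMAC-SHA256 stands in for F_K.
    return hmac.new(key, x, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def enc(k1: bytes, k2: bytes, m: bytes):
    # Enc((K1, K2), M) = (S, F_K1(S) + M, F_K2(S + M)) with S random.
    s = os.urandom(BLOCK)
    return s, xor(prf(k1, s), m), prf(k2, xor(s, m))


def dec(k1: bytes, k2: bytes, ct):
    # Recover M = C + F_K1(S) and accept only if the third component matches.
    s, c, t = ct
    m = xor(c, prf(k1, s))
    if hmac.compare_digest(prf(k2, xor(s, m)), t):
        return m
    return None  # stands for "error"
```

Decrypting an honestly produced ciphertext returns the message, while modifying the middle component changes the recovered message and, except with negligible probability, fails the check on the third component.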
Solution
(a) In this case, the encryption of $0$ looks like $(S, t, t)$ for some $t$, while we have no reason
to believe that the encryption of any other message will look like that. This suggests the
following distinguisher $D$: On input $(a, b, c)$, accept if $b = c$, and reject otherwise. Clearly
$D(\mathrm{Enc}(K, 0))$ always outputs 1. We now argue that if $M$ is different from $0$, $D(\mathrm{Enc}(K, M))$
is unlikely to output 1. If instead of $F_K$ we were using a truly random function $R$, then the
probability that $R(S) + M$ equals $R(S + M)$ would be at most $2^{-k}$, because $S \ne S + M$ and so for
every $S$ the values $R(S)$ and $R(S + M)$ are independent of one another. Replacing the random function
by an $(O(n), \varepsilon)$ pseudorandom one can affect the distinguishing probability by at most $\varepsilon$, because
otherwise the oracle circuit $A^{O} = D(S, O(S) + M, O(S + M))$ would distinguish $F_K$ from $R$. It follows that
$D$ can distinguish $0$ from $M$ with advantage $1 - 2^{-k} - \varepsilon$.
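The distinguisher is easy to try out. The sketch below assumes, as an illustration only, that $+$ is bitwise XOR and that truncated HMAC-SHA256 plays the role of $F_K$; the helper names are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block length in bytes (illustrative choice)


def prf(key: bytes, x: bytes) -> bytes:
    # Assumption: truncated HMAC-SHA256 stands in for F_K.
    return hmac.new(key, x, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def enc_same_key(k: bytes, m: bytes):
    # The scheme instantiated with K1 = K2 = k.
    s = os.urandom(BLOCK)
    return s, xor(prf(k, s), m), prf(k, xor(s, m))


def distinguisher(ct) -> bool:
    # D accepts exactly when the second and third components agree.
    _, b, c = ct
    return b == c
```

On encryptions of the all-zero block `distinguisher` always accepts; on any other message it accepts only if two PRF outputs happen to line up, which for a 16-byte block is not expected to be observed in practice.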
(b) Assume $\{F_K\}$ is $(s, \varepsilon)$ pseudorandom and (REnc, RDec) is $(.., \varepsilon)$ CPA-secure. Suppose
(Enc, Dec) is not $(s, 5\varepsilon)$ CPA-secure. Then for some $A$ of size $\Omega(s/t)$ (where $t$ is the circuit size of $F_K$) and messages $M_0, M_1$,
$$|\Pr[A^{\mathrm{Enc}(K,\cdot)}(\mathrm{Enc}(K, M_0)) = 1] - \Pr[A^{\mathrm{Enc}(K,\cdot)}(\mathrm{Enc}(K, M_1)) = 1]| > 5\varepsilon.$$
Then we claim that
$$|\Pr[A^{\mathrm{REnc}(K,\cdot)}(\mathrm{REnc}(K, M_0)) = 1] - \Pr[A^{\mathrm{REnc}(K,\cdot)}(\mathrm{REnc}(K, M_1)) = 1]| > \varepsilon.$$
Otherwise, for at least one of the messages $M_b$ we must have
$$|\Pr[A^{\mathrm{Enc}(K,\cdot)}(\mathrm{Enc}(K, M_b)) = 1] - \Pr[A^{\mathrm{REnc}(K,\cdot)}(\mathrm{REnc}(K, M_b)) = 1]| > 2\varepsilon.$$
Consider the circuit $B^{O_1, O_2}$ that runs $A$, computing both the answers to the oracle queries of $A$ and
the input of $A$ (for the message $M_b$) in the form $(S, O_1(S) + M, O_2(S + M))$ with a fresh random $S$.
Then at least one of the expressions $\Pr[B^{R_1,R_2} = 1] - \Pr[B^{F_{K_1},R_2} = 1]$ or
$\Pr[B^{F_{K_1},R_2} = 1] - \Pr[B^{F_{K_1},F_{K_2}} = 1]$ must exceed $\varepsilon$ in absolute value. In the first case, we can distinguish
$R_1$ from $F_{K_1}$ by a circuit of size $s$ that simulates $B^{?,R_2}$ by replacing calls to $R_2$ with random
answers (and making sure the same queries are answered identically). In the second case, we can
distinguish $F_{K_2}$ from $R_2$ by a circuit of size $s$ that simulates $B^{F_{K_1},?}$.
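Both "at least one of" steps in this argument are instances of the triangle inequality. As a sketch, write $p_b$ and $r_b$ (shorthand introduced here, not in the original) for the acceptance probabilities of $A$ on $M_b$ in the real and in the ideal scheme:

```latex
\begin{align*}
p_b &= \Pr[A^{\mathrm{Enc}(K,\cdot)}(\mathrm{Enc}(K, M_b)) = 1], \qquad
r_b = \Pr[A^{\mathrm{REnc}(K,\cdot)}(\mathrm{REnc}(K, M_b)) = 1], \\
5\varepsilon &< |p_0 - p_1| \le |p_0 - r_0| + |r_0 - r_1| + |r_1 - p_1|, \\
|r_0 - r_1| \le \varepsilon &\implies |p_0 - r_0| + |r_1 - p_1| > 4\varepsilon
  \implies \max_b |p_b - r_b| > 2\varepsilon, \\
2\varepsilon &< \bigl|\Pr[B^{R_1,R_2} = 1] - \Pr[B^{F_{K_1},F_{K_2}} = 1]\bigr| \\
 &\le \bigl|\Pr[B^{R_1,R_2} = 1] - \Pr[B^{F_{K_1},R_2} = 1]\bigr|
   + \bigl|\Pr[B^{F_{K_1},R_2} = 1] - \Pr[B^{F_{K_1},F_{K_2}} = 1]\bigr|.
\end{align*}
```

The last line forces one of the two hybrid gaps to exceed $\varepsilon$, which is the case distinction used above.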
(c) We show (REnc, RDec) is $(s, s2^{-k})$ CPA-secure for every $s$. Look at
$A^{\mathrm{REnc}(K,\cdot)}(S, R_1(S) + M_0, R_2(S + M_0))$. Let $E$ be the event that the oracle never returns an answer of the
form $(S, ?, ?)$. Conditioned on $E$, $A^{\mathrm{REnc}(K,\cdot)}(S, R_1(S) + M_0, R_2(S + M_0))$ is identically distributed to
$A^{\mathrm{REnc}(K,\cdot)}(S, T, U)$, where $S, T, U$ are uniformly distributed random variables independent of all the oracle answers. Therefore
$$|\Pr[A^{\mathrm{REnc}(K,\cdot)}(\mathrm{REnc}(K, M_0)) = 1] - \Pr[A^{\mathrm{REnc}(K,\cdot)}(S, T, U) = 1]| \le \Pr[\overline{E}].$$
We obtain an analogous inequality for $M_1$. Therefore the distinguishing probability of $A$ is
at most $2\Pr[\overline{E}] \le s2^{-k}$.
Problem 3
Assuming cryptographic hash families exist, show that there is a function family $\{h_K : \{0,1\}^{2k} \to \{0,1\}^k\}$
which is a cryptographic hash family but is not a pseudorandom function family.
Solution
There are many possible solutions here. Assume $\{h_K : \{0,1\}^{2k-1} \to \{0,1\}^{k-1}\}$ is an $(s, \varepsilon)$ cryptographic hash family and let
$$h'_K(x, b) = (h_K(x), b)$$
where $x \in \{0,1\}^{2k-1}$ and $b \in \{0,1\}$. We claim $\{h'_K\}$ is an $(s, \varepsilon)$ cryptographic hash family. For
if not, then there exists a circuit $A'$ of size $s$ such that $A'(K)$ produces a collision $((x, b), (x', b'))$
for $h'_K$ with probability at least $\varepsilon$. Let $A(K)$ simulate $A'(K)$ and output $(x, x')$. Then $A$ has size
at most $s$. We claim that whenever $((x, b), (x', b'))$ is a collision for $h'_K$, then $(x, x')$ is a collision
for $h_K$. For we must have $(x, b) \ne (x', b')$ and $(h_K(x), b) = (h_K(x'), b')$ and this is only possible if
$b = b'$, $x \ne x'$ and $h_K(x) = h_K(x')$.
On the other hand, $\{h'_K\}$ is not even an $(O(k), 1/2)$ pseudorandom function family. Consider the
oracle circuit $A^F$ that outputs 1 if the last bit of $F(0)$ is 0 and 0 otherwise. Then $A^{h'_K}$ always
outputs 1 while $A^R$ outputs 1 with probability only $1/2$ when $R$ is a random function.
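The construction and the distinguisher can be sketched in Python. As assumptions for illustration only, SHA-256 with the key as a prefix stands in for $h_K$, the input $0$ is modelled as an all-zero block together with the bit 0, and the lazily sampled `random_function` is a hypothetical helper that models $R$.

```python
import hashlib
import secrets


def h(key: bytes, x: bytes) -> bytes:
    # Assumption: SHA-256 with a key prefix stands in for h_K.
    return hashlib.sha256(key + x).digest()


def h_prime(key: bytes, x: bytes, b: int):
    # h'_K(x, b) = (h_K(x), b): the last input bit is copied to the output.
    return h(key, x), b


def random_function():
    # Lazily sampled random function with the same input/output shape.
    table = {}

    def f(x: bytes, b: int):
        if (x, b) not in table:
            table[(x, b)] = (secrets.token_bytes(32), secrets.randbits(1))
        return table[(x, b)]

    return f


def distinguisher(f) -> int:
    # Output 1 iff the last bit of F(0) is 0.
    _, last_bit = f(bytes(32), 0)
    return 1 if last_bit == 0 else 0
```

With `h_prime` as the oracle the distinguisher outputs 1 for every key, while for freshly sampled random functions it outputs 1 only about half of the time.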
Problem 4
Suppose Alice has some database that Bob wants to query interactively. Bob wants to be sure that
Alice’s answers are consistent: they are independent of his queries and the order in which he asks
them. One solution is for Alice to first commit to her database, and for Bob to verify that Alice’s
answers are consistent with her commitment.
We will think of a database as a function $F : \{0,1\}^q \to \{0,1\}^k$ that takes a query in $\{0,1\}^q$ and
outputs an answer in $\{0,1\}^k$. The size of a database is the size of the truth table of $F$, that is $2^q k$.
You should think of the database as small enough so that Alice can perform computations on the
whole database, but large enough so that it is infeasible for Alice to send Bob the full database.
Assume Alice and Bob share a random key $K \in \{0,1\}^k$. A database commitment scheme consists
of three algorithms (Com, Ans, Ver) where
• $\mathrm{Com}(K, F)$ is a commitment algorithm that takes a key $K$ and a database $F$ and outputs a
string in $\{0,1\}^k$, which is a commitment to $F$.
• $\mathrm{Ans}(K, F, x)$ is an answering algorithm that takes a key $K$, a database $F$, and a query
$x \in \{0,1\}^q$ to $F$ and outputs $F(x)$.
• $\mathrm{Ver}(K, C, x, a)$ is a verification algorithm that takes a key $K$, a commitment $C$, a query $x$
and an answer $a$, accepts if the query-answer pair is consistent with the commitment, and
rejects otherwise.
We will say that the database commitment scheme is secure if it is infeasible for a computationally
bounded adversary to commit to a database $F$, but answer queries according to some other database
$F'$ without getting caught.
(a) Give a formal definition of an (s, ε)-secure database commitment scheme. You should give
separate definitions for functionality and for security.
(b) Let $\{h_S : \{0,1\}^{2k} \to \{0,1\}^k\}$ be a cryptographic hash family for input length $2k$. Let
$h_S^{(q)} : \{0,1\}^{2^q k} \to \{0,1\}^k$ be the function recursively defined by the formula
$$h_S^{(q)}(M_1 M_2) = h_S(h_S^{(q-1)}(M_1), h_S^{(q-1)}(M_2)), \qquad M_1, M_2 \in \{0,1\}^{2^{q-1} k}$$
with $h_S^{(1)} = h_S$. Show that $\{h_S^{(q)}\}$ is a cryptographic hash family for input length $2^q k$ (for
appropriate parameters).
(c) Use part (b) to construct a secure database commitment scheme and prove your scheme is
secure assuming {hS } is a secure cryptographic hash family.
Solution
This question turned out more difficult than I intended, especially part (a), so I tried to be generous
with the credit.
(a) We say (Com, Ans, Ver) is a database commitment scheme if for every $K$, $F$, and $x$:
$$\mathrm{Ver}(K, \mathrm{Com}(K, F), x, \mathrm{Ans}(K, F, x)) = F(x).$$
To define security, we reason as follows. Suppose Alice has committed to a database. We
want to say that after she commits, possibly using a cheating commitment $C$, she cannot
change her mind: For every query $x$, if she provides two different answers $A$ and $A'$, it better
be that they verify to the same value, or else the verification algorithm detects an error.
Let's say a pair of values $(a, a')$ is a violation for $x$ if $a \ne a'$ and $a, a' \ne \text{error}$. We will say
(Com, Ans, Ver) is $(s, \varepsilon)$ secure if for every triple of circuits $C, A, A'$ of size $s$ and every $x$
$$\Pr[(\mathrm{Ver}(K, C(K), x, A(K, x)), \mathrm{Ver}(K, C(K), x, A'(K, x))) \text{ is a violation for } x] \le \varepsilon$$
where the probability is taken over the choice of $K$ (see footnote 1).
(b) Let $s'$ be the circuit size of $h_S$ and assume $h_S$ is an $(s, \varepsilon)$ cryptographic hash family. We will
show that $h_S^{(t)}$ is an $(s - O(s' 2^t), \varepsilon)$ cryptographic hash family, where $t$ denotes the depth of the
recursion (the parameter called $q$ in the problem statement).
It will help to represent the hash $h_S^{(t)}$ by a full binary tree of depth $t$. Let us identify the
nodes at level $i$ of this tree by strings in $\{0,1\}^i$.
Suppose, for contradiction, that $A$ is a circuit of size $s - O(s' 2^t)$ so that
$$\Pr[A(S) = (M, M'),\ M \ne M',\ h_S^{(t)}(M) = h_S^{(t)}(M')] > \varepsilon.$$
Consider the following circuit $B$ for finding a collision in $h_S$: On input $S$, run $A(S)$ to find
$(M, M')$. If $h_S^{(t)}(M) \ne h_S^{(t)}(M')$ or $M = M'$, fail. Otherwise, divide $M$ and $M'$ into $2^t$ blocks
of size $k$, indexed by $z \in \{0,1\}^t$, and find the first block $z$ where $M_z \ne M'_z$. Find an ancestor $u$ of $z$ in
the tree where the hashes of $M$ and $M'$ are the same at node $u$ but are not the same
at both children $u0, u1$ of $u$. Output the two inputs to $h_S$ at $u$, namely the concatenated values at $u0$
and $u1$ for $M$ and for $M'$.
Whenever $h_S^{(t)}(M) = h_S^{(t)}(M')$ and $M \ne M'$, a node $u$ with the required properties must exist
on the path between $z$ and the root, because the values at the leaf $z$ differ while the values at the root
agree. So $B$ outputs a collision with probability greater than
$\varepsilon$. To calculate the size of $B$, notice that $B$ needs to run $A$, compute $O(2^t)$ hashes $h_S$,
and for some of these hashes do a bit of extra work to figure out if their values are the same
or different. So $B$ has size at most $s$, contradicting the assumption that $h_S$ is an $(s, \varepsilon)$ cryptographic
hash family.
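The reduction can be exercised on a toy example. The sketch below is a variant of the circuit $B$: instead of walking up from the first differing block, it walks down from the root, which finds a node with the same property. Everything concrete here is an assumption for illustration: blocks are one byte long, the compression function is SHA-256 truncated to one byte (deliberately weak so that a collision in the tree hash can be found by brute force), and the function names are hypothetical.

```python
import hashlib

K_BYTES = 1  # toy block size, chosen so that collisions are easy to find


def h(key: bytes, x: bytes) -> bytes:
    # Toy 2k -> k compression function (deliberately weak; an assumption).
    assert len(x) == 2 * K_BYTES
    return hashlib.sha256(key + x).digest()[:K_BYTES]


def tree_hash(key: bytes, m: bytes) -> bytes:
    # h^(t): hash both halves recursively, then compress the pair.
    if len(m) == 2 * K_BYTES:
        return h(key, m)
    half = len(m) // 2
    return h(key, tree_hash(key, m[:half]) + tree_hash(key, m[half:]))


def extract_collision(key: bytes, m1: bytes, m2: bytes):
    # Given m1 != m2 with equal tree hashes, return a collision for h.
    assert m1 != m2 and tree_hash(key, m1) == tree_hash(key, m2)
    while len(m1) > 2 * K_BYTES:
        half = len(m1) // 2
        a = tree_hash(key, m1[:half]) + tree_hash(key, m1[half:])
        b = tree_hash(key, m2[:half]) + tree_hash(key, m2[half:])
        if a != b:
            return a, b  # children differ although the parent values agree
        # Children agree: descend into a half where the messages differ.
        if m1[:half] != m2[:half]:
            m1, m2 = m1[:half], m2[:half]
        else:
            m1, m2 = m1[half:], m2[half:]
    return m1, m2


def find_tree_collision(key: bytes, t: int):
    # Brute force; feasible only because the toy output has 8 bits.
    seen = {}
    n = (2 ** t) * K_BYTES
    for i in range(257):
        m = i.to_bytes(n, "big")
        d = tree_hash(key, m)
        if d in seen:
            return seen[d], m
        seen[d] = m
    raise AssertionError("unreachable by the pigeonhole principle")
```

For example, `extract_collision(key, *find_tree_collision(key, 2))` returns two distinct two-block strings that `h` maps to the same value.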
(c) Consider the following database commitment scheme. $\mathrm{Com}(K, F) = h_K^{(q)}(F)$ where we view
$F$ as a truth-table of size $2^q k$. The algorithm $\mathrm{Ans}(K, F, x)$ outputs the values
$$(g_K(x_1 \dots x_q),\ g_K(x_1 \dots x_{q-1} \overline{x_q}),\ g_K(x_1 \dots x_{q-2} \overline{x_{q-1}}),\ \dots,\ g_K(\overline{x_1}))$$
where $\overline{b}$ is the boolean complement of $b$ and $g_K(a_1 \dots a_t)$ is obtained by applying $h_K^{(q-t)}$ to the
restriction of $F$ on those inputs $x$ such that $x_1 = a_1, \dots, x_t = a_t$. (When $t = q$, $g_K(a) = F(a)$.)
The algorithm $\mathrm{Ver}(K, C, x, (z_0, z_1, \dots, z_q))$ outputs error if $h_K(\dots h_K(h_K(z_0 z_1) z_2) \dots z_q) \ne C$.
Otherwise it outputs $z_0$.
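A minimal Python sketch of such a scheme is given below. It rests on assumptions that go beyond the text: SHA-256 with the key as a prefix stands in for $h_K$, the database is a list of $2^q$ byte strings, the query $x$ is an integer index, `None` plays the role of error, and each pair is hashed in left/right order according to the corresponding bit of $x$, a detail the formula above leaves implicit.

```python
import hashlib


def h(key: bytes, x: bytes) -> bytes:
    # Assumption: SHA-256 with a key prefix stands in for h_K.
    return hashlib.sha256(key + x).digest()


def levels(key: bytes, table):
    # levels[0] is the truth table; each further level hashes adjacent pairs.
    out = [list(table)]
    while len(out[-1]) > 1:
        prev = out[-1]
        out.append([h(key, prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return out


def com(key: bytes, table) -> bytes:
    # Com(K, F): the value at the root of the hash tree over F.
    return levels(key, table)[-1][0]


def ans(key: bytes, table, x: int):
    # Ans(K, F, x): F(x) followed by the sibling values on the path to the root.
    path = [table[x]]
    for level in levels(key, table)[:-1]:
        path.append(level[x ^ 1])
        x >>= 1
    return path


def ver(key: bytes, c: bytes, x: int, path):
    # Ver: rehash along the path; return z_0 if the root matches C, else None.
    acc = path[0]
    for sib in path[1:]:
        acc = h(key, sib + acc) if x & 1 else h(key, acc + sib)
        x >>= 1
    return path[0] if acc == c else None
```

Bob stores only `com(key, table)`; each answer then carries $q + 1$ values, so the communication per query is logarithmic in the number of database entries.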
To show security, suppose (Com, Ans, Ver) is not $(s, \varepsilon)$ secure and let $C, A, A'$ be the circuits
and $x$ the input that produce the required violation with probability at least $\varepsilon$. Consider the
circuit $D$ that on input $K$, computes $A(K, x) = (z_0, \dots, z_q)$ and $A'(K, x) = (z'_0, \dots, z'_q)$, together with
the intermediate hashes $w_0 = z_0$, $w_{i+1} = h_K(w_i z_{i+1})$ (and $w'_i$ defined likewise from the $z'_i$), and
outputs one of the pairs $(w_i z_{i+1}, w'_i z'_{i+1})$ if it is a collision for $h_K$. This circuit has size $O(qt + s)$ where
$t$ is the circuit size of $h_K$.
Now suppose $(\mathrm{Ver}(K, C(K), x, A(K, x)), \mathrm{Ver}(K, C(K), x, A'(K, x)))$ is a violation. Since neither instantiation of Ver returns error, we must have
$$h_K(\dots h_K(h_K(z_0 z_1) z_2) \dots z_q) = C(K) = h_K(\dots h_K(h_K(z'_0 z'_1) z'_2) \dots z'_q),$$
that is, $w_q = w'_q$. Since the two values output by Ver are different, we must have $z_0 \ne z'_0$, that is, $w_0 \ne w'_0$. So there must be an
index $i$ such that $w_i z_{i+1} \ne w'_i z'_{i+1}$ but $h_K(w_i z_{i+1}) = h_K(w'_i z'_{i+1})$. This is the desired collision.

Footnote 1: There is one aspect which is not modeled by this definition: the output of $C$ can be used as an input to $A$ and
$A'$, but the proof extends even in that case.