
Secret agents leave big footprints: how to
plant a trapdoor in a cryptographic function
and why you might not get away with it.
GECCO 2003
John A Clark, Jeremy L Jacob and Susan Stepney
Dept of Computer Science
University of York,
York YO10 5DD, England
16 July 2003
A Research Exercise in Pure Evil

Do you feel frustrated and annoyed by people’s ability
to use modern day crypto-algorithms when they have
no intention whatsoever of supplying you with a secret
key so you can listen in?

Don’t worry – there is a solution.
– Get them to use an algorithm that looks secure but which only
you know how to break.

Technical: it’s in the cost function! Different cost
functions give different results.

Moral: Optimisation may be used and abused.
Conspiracy theory as motivation:
Data Encryption Standard (DES)

The Data Encryption Standard is the most controversial
cipher in history.
– Developed on behalf of the US Government.
– Based on previous IBM work.
– Issued in 1976 as FIPS 46.
– 56-bit key (64 bits in fact, but 8 are check bits) is
controversial:
   – key length was originally 128 bits;
   – suspicion over NSA motives.

Criteria for the design were not revealed.
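[Figure: DES structure. The 64-bit input passes through the
initial permutation (IP) and is split into 32-bit halves L0 and R0.
In each of the sixteen cycles the 32-bit right half is expanded to
48 bits (expansion permutation), combined with a 48-bit subkey
obtained from the 56-bit key by shift and compression permutation,
passed through S-box substitution (S1 to S8) and then through the
P-box permutation, giving 32 bits. After the last cycle the halves
L16 and R16 are swapped and the inverse IP gives the output.]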

Matters became amusing in 1994.
– A theoretically promising method, differential cryptanalysis,
emerged in the late 1980s and early 1990s.
– DES was surprisingly resilient to differential cryptanalysis.
– Don Coppersmith wrote a paper (1994) that revealed some of
the design criteria and stated that DES was resistant to
differential cryptanalysis because it had been specifically
designed to be.
– IBM (presumably from the NSA) knew about the method of
attack 16 or more years before it was discovered and published
by leading cryptography academics.

DES is more vulnerable to a later method (linear cryptanalysis).
Specialised FPGA hardware can now break DES in a few
hours.



Does DES have a trapdoor in it – a special property that can be
exploited by people in the know?
We do not know.
It seems actually to be a rather good algorithm.
– But the idea of having a secret trapdoor – now I like that.



How can we design a cryptosystem that looks good but which I may
know how to break?
How can we prevent the wrangling about honesty in design?
Let’s try heuristic search. We will illustrate the principle on the
simplest component: a single-valued Boolean function used in stream
ciphers.
Classical Stream Cipher Model
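[Figure: classical stream cipher model. The outputs L1j, L2j, ...,
Lnj of n LFSRs (say 32-bit registers) feed a combining Boolean
function f, which produces the keystream Zj. The plaintext stream
Pj is combined with the keystream Zj to give the cipherstream Cj.
The receiver can generate the same keystream and recover the
plaintext.]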
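The model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the register lengths, tap positions, the truth table F_TABLE and the helper names (LFSR, make_registers, xor_stream) are all invented for the example, the registers are far shorter than the 32 bits mentioned on the slide, and the keystream is assumed to be combined with the plaintext by XOR, as is usual for this kind of cipher.

```python
from typing import List, Sequence, Tuple


class LFSR:
    """Toy Fibonacci LFSR; taps are the bit positions XORed into the feedback."""

    def __init__(self, length: int, taps: Sequence[int], state: int) -> None:
        if not 0 < state < (1 << length):
            raise ValueError("state must be non-zero and fit in the register")
        self.length = length
        self.taps = tuple(taps)
        self.state = state

    def step(self) -> int:
        out = self.state & 1                      # output bit Lij
        feedback = 0
        for t in self.taps:
            feedback ^= (self.state >> t) & 1
        self.state = (self.state >> 1) | (feedback << (self.length - 1))
        return out


# Assumption: an arbitrary balanced 3-input truth table used as f.
F_TABLE = [1, 0, 0, 0, 1, 0, 1, 1]

# Assumption: illustrative register specs (length, taps), not vetted polynomials.
SPECS: List[Tuple[int, Tuple[int, ...]]] = [(5, (0, 2)), (7, (0, 1)), (9, (0, 4))]


def combine(bits: Sequence[int]) -> int:
    """Combining function f: look up the register outputs in the truth table."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return F_TABLE[index]


def make_registers(key: Sequence[int]) -> List[LFSR]:
    """The key is the tuple of initial register states."""
    return [LFSR(length, taps, s) for (length, taps), s in zip(SPECS, key)]


def keystream(registers: Sequence[LFSR], nbits: int) -> List[int]:
    return [combine([r.step() for r in registers]) for _ in range(nbits)]


def xor_stream(data: Sequence[int], key: Sequence[int]) -> List[int]:
    """Cj = Pj XOR Zj; running it again with the same key recovers Pj."""
    zs = keystream(make_registers(key), len(data))
    return [d ^ z for d, z in zip(data, zs)]
```

Because sender and receiver start their registers from the same key, applying xor_stream twice with that key returns the original bits.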
Boolean Function Design

A Boolean function f: {0,1}^n -> {0,1}

x   bits of x   f(x)   polar f(x)
0   0 0 0       1      -1
1   0 0 1       0       1
2   0 1 0       0       1
3   0 1 1       0       1
4   1 0 0       1      -1
5   1 0 1       0       1
6   1 1 0       1      -1
7   1 1 1       1      -1

Polar representation: f(x) = 0 maps to 1 and f(x) = 1 maps to -1.
Will talk only about balanced functions, where there are equal
numbers of 1s and -1s.
A move simply swaps a 1 and a -1.
Functions are essentially represented as binary vectors.
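A small Python sketch of the representation and the move described on this slide. The function names (to_polar, is_balanced, swap_move) are made up for this example; the example truth table is the 3-input function tabulated on the slide.

```python
import random
from typing import List, Sequence


def to_polar(truth_table: Sequence[int]) -> List[int]:
    """Map each output bit b to (-1)**b, so 0 -> +1 and 1 -> -1."""
    return [(-1) ** b for b in truth_table]


def is_balanced(polar: Sequence[int]) -> bool:
    """Balanced means equal numbers of +1 and -1, i.e. the entries sum to zero."""
    return sum(polar) == 0


def swap_move(polar: Sequence[int], rng: random.Random) -> List[int]:
    """Return a neighbour: one randomly chosen +1 and one -1 exchange values.

    The move preserves balance, so a search that starts from a balanced
    function only ever visits balanced functions.
    """
    plus = [i for i, v in enumerate(polar) if v == 1]
    minus = [i for i, v in enumerate(polar) if v == -1]
    neighbour = list(polar)
    neighbour[rng.choice(plus)] = -1
    neighbour[rng.choice(minus)] = 1
    return neighbour


if __name__ == "__main__":
    f = [1, 0, 0, 0, 1, 0, 1, 1]
    p = to_polar(f)
    print(p, is_balanced(p))
    print(swap_move(p, random.Random(0)))
```

Each move changes exactly two positions of the vector, so neighbouring functions are at Hamming distance 2.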
Planting Trapdoors
[Figure: the design space, showing the set of functions with
public goodness property P (e.g. high non-linearity, low
autocorrelation) and the set with trapdoor property T.]
Optimisation

Suppose you have an effective optimisation-based
approach to getting functions with public property P.
Let the cost function used be
– Cost = honest(f)

Suppose you have an effective optimisation-based
approach to getting functions with trapdoor property
T. Let the cost function used be
– Cost = trapdoor(f)

We can combine the two
– sneakyCost(f) = (1 - λ) honest(f) + λ trapdoor(f)
   – λ is the malice factor: λ = 0 is truly honest; λ = 1 is wicked
– Will you get caught out?
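The weighted combination can be written as a small higher-order function. This is a sketch only: the name sneaky_cost, the parameter name malice and the two stand-in cost functions in the usage lines are assumptions for illustration, not the cost functions used in the experiments.

```python
from typing import Callable, Sequence

Cost = Callable[[Sequence[int]], float]


def sneaky_cost(honest: Cost, trapdoor: Cost, malice: float) -> Cost:
    """Blend two cost functions using the malice factor.

    malice = 0 gives the honest cost alone; malice = 1 gives the
    trapdoor cost alone; values in between trade one off against the other.
    """
    if not 0.0 <= malice <= 1.0:
        raise ValueError("malice factor must lie in [0, 1]")

    def cost(f: Sequence[int]) -> float:
        return (1.0 - malice) * honest(f) + malice * trapdoor(f)

    return cost


if __name__ == "__main__":
    # Hypothetical stand-ins: any two functions from vectors to numbers will do.
    target = [1, -1, 1, -1]
    honest = lambda f: float(abs(sum(f)))
    trapdoor = lambda f: float(sum(1 for a, b in zip(f, target) if a != b))
    blended = sneaky_cost(honest, trapdoor, 0.25)
    print(blended([1, 1, -1, -1]))
```

Keeping the malice factor as the only difference between runs is what makes it possible to compare the honest and dishonest results later.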
Example Trapdoor Function




We want to be able to tell whether an unknown
trapdoor has been inserted.
Experiments have used a randomly generated
vector as trapdoor. Closeness to this vector
(measured by Hamming distance) represents a
good trapdoor bias.
Want to investigate what happens when different
malice factors are used.
We shall consider high non-linearity and low
autocorrelation as public goodness measures.
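The quantities named on this slide can be computed as follows for a function in polar form. This is a sketch using the standard textbook definitions: non-linearity as (2^n - max |Walsh-Hadamard value|) / 2 and autocorrelation as the largest absolute correlation between the function and a shifted copy of itself. The function names are invented here, and taking the trapdoor cost to be the plain Hamming distance to the secret vector is an assumption; the exact cost functions used in the experiments are not given on the slides.

```python
import random
from typing import List, Sequence


def walsh_hadamard(polar: Sequence[int]) -> List[int]:
    """Fast Walsh-Hadamard transform of a polar vector of length 2**n."""
    w = list(polar)
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b
        h *= 2
    return w


def nonlinearity(polar: Sequence[int]) -> int:
    """Hamming distance to the nearest affine function (higher is better)."""
    return (len(polar) - max(abs(v) for v in walsh_hadamard(polar))) // 2


def autocorrelation(polar: Sequence[int]) -> int:
    """Largest |sum over x of f(x) f(x XOR s)| over non-zero s (lower is better)."""
    size = len(polar)
    return max(
        abs(sum(polar[x] * polar[x ^ s] for x in range(size)))
        for s in range(1, size)
    )


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    return sum(1 for x, y in zip(a, b) if x != y)


def random_trapdoor(size: int, rng: random.Random) -> List[int]:
    """A randomly generated secret vector of +1/-1 entries."""
    return [rng.choice((1, -1)) for _ in range(size)]


def trapdoor_cost(polar: Sequence[int], secret: Sequence[int]) -> int:
    """Assumed trapdoor cost: distance to the secret vector (lower is closer)."""
    return hamming_distance(polar, secret)
```

For n = 8 the vectors have 256 entries, and the non-linearity values of 110 to 116 reported later in the talk are on the scale returned by nonlinearity.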
You say you did, I say you didn’t
[Figure: the set of publicly good solutions (e.g. Boolean functions
with the same very high non-linearity), containing two subsets:
– publicly good solutions found by annealing and the honest cost
function;
– publicly good solutions with high trapdoor bias, found by
annealing and the combined honest and trapdoor cost functions.]
There appears nothing to distinguish the sets of solutions obtained –
unless you know what form the trapdoor takes!
Or is there…
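The annealing referred to on this slide can be sketched generically as below. This is a minimal simulated annealing loop over balanced polar vectors, not the authors' implementation: the cooling schedule, step count, starting temperature and the names anneal and swap_move are all assumptions chosen for the example. Any cost function, honest or sneaky, can be passed in.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def swap_move(polar: Sequence[int], rng: random.Random) -> List[int]:
    """Neighbour obtained by exchanging one +1 with one -1 (keeps balance)."""
    plus = [i for i, v in enumerate(polar) if v == 1]
    minus = [i for i, v in enumerate(polar) if v == -1]
    neighbour = list(polar)
    neighbour[rng.choice(plus)] = -1
    neighbour[rng.choice(minus)] = 1
    return neighbour


def anneal(
    start: Sequence[int],
    cost: Callable[[Sequence[int]], float],
    rng: random.Random,
    steps: int = 2000,
    temperature: float = 1.0,
    cooling: float = 0.995,
) -> Tuple[List[int], float]:
    """Minimise cost from a balanced start; returns the best vector seen."""
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for _ in range(steps):
        candidate = swap_move(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worsening moves with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= cooling
    return best, best_cost
```

Running the same loop with an honest cost and with a combined cost gives the two families of solutions the slide compares.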
n=8: Examples with non-linearity vs autocorrelation
[Figure: six panels, one for each malice factor λ = 0.0, 0.2, 0.4,
0.6, 0.8 and 1.0. Each panel is a grid of solution counts with
non-linearity (110 to 116) on one axis and autocorrelation (24 to
64; 24 to 80 in one panel) on the other, and each reports a mean
trapdoor value. MeanTrap values in the order the panels appear:
12.8 (the λ = 0.0 panel), 222.1, 198.9, 213.1, 242.7, 232.3.]
Vector Representations
[Figure: a function drawn as a polar vector, e.g.
(+1, -1, +1, +1, -1, +1, -1, -1).]
Different cost functions may give similar
goodness results but may do so in radically
different ways.
Results using honest and
dishonest cost functions
cluster in different parts of
the design space
Basically distinguish using
discriminant analysis.
If you don’t have an
alternative hypothesis then
you can generate a family of
honest results and ask how
probable the offered one is.
Vector Representations
For two groups G1 and G2:
Calculate the mean vectors m1 and m2. Project
m2 onto m1 and obtain the residual r, the part of m2
that is orthogonal to m1.
Now project each vector in each group onto the
residual and take the absolute value.
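The projection steps above can be sketched in plain Python. The function names and the toy groups in the usage lines are made up for illustration; dividing by the length of r, so that the score is a projection length, is an assumption about scaling that the slide does not spell out.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_vector(group: Sequence[Vector]) -> List[float]:
    """Component-wise mean of the vectors in a group."""
    return [sum(column) / len(group) for column in zip(*group)]


def residual(m1: Vector, m2: Vector) -> List[float]:
    """r = m2 minus its projection onto m1, so r is orthogonal to m1."""
    denom = dot(m1, m1)
    if denom == 0:
        return list(m2)
    scale = dot(m2, m1) / denom
    return [b - scale * a for a, b in zip(m1, m2)]


def scores(group: Sequence[Vector], r: Vector) -> List[float]:
    """Absolute length of each vector's projection onto the residual."""
    norm = math.sqrt(dot(r, r))
    if norm == 0:
        raise ValueError("the mean vectors are parallel: no residual direction")
    return [abs(dot(v, r)) / norm for v in group]


if __name__ == "__main__":
    g1 = [[1, 1, -1, -1], [1, -1, 1, -1]]    # toy stand-in for honest designs
    g2 = [[-1, 1, 1, -1], [-1, 1, -1, 1]]    # toy stand-in for sneaky designs
    r = residual(mean_vector(g1), mean_vector(g2))
    print(scores(g1, r), scores(g2, r))
```

If the two groups cluster in different parts of the design space, their scores separate; with real data each group would hold many annealed functions of 256 entries rather than two toy vectors.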
Games People Play




It seems possible to tell that something has been going on.
And we don’t need to know precisely what has been going
on.
Since any design has a binary vector representation the
technique is general.
Meta games:
– Some variations on a theme can be attempted. If you know the means
of detection you may be able to add a cost function component
concerned with detectability:

sneakierCost(f) = (1 - λ) honest(f) + λ malice(f) + α detectability(f)
Conclusions





An optimisation-based design process may be open
and reproducible.
Optimisation can be abused.
Optimisation allows a family of representative
designs to be obtained.
Designs developed against different criteria just
look different.
The games just do not stop.
Coda

Search-based approaches are not just for toy
problems.
– For several major criteria of interest, search-based
approaches have equalled or bettered the combined
best efforts of theoreticians for n<=8.
– Have recently produced hitherto unattained results for
n=9.
– Disproved cryptological conjectures in the literature.

CEC special strand on computer security.
– Web page at www.cs.york.ac.uk/security
(part of virtual library)
Bonus Track

You can even tell which technique people
have used.
– Simulated annealing and GAs may also give different
types of solution.

Experiment:
– Evolved a pseudo-random number generator as
an FPGA netlist
– Randomness criteria as cost function components
– Cost function components can act as classifiers too!
View evolved programs as bit strings and feed them
through the cost function components used to evolve
them.