Alberto Aguilar

Theory of Computation II
Topic presented by:
Alberto Aguilar Gonzalez
Problem

You are designing a banking application that
will be accessed by thousands of users.

Security of passwords is a key factor.


Protect from people outside and inside the
organization
How do you store passwords in the
database?
One Approach


Encrypt passwords using a key.
When the information is needed, decrypt it
using the same key!
IDEA: “hi” = decrypt(encrypt(“hi”))

Example (very simple):

Given a character, encrypt it by replacing it with
another one. What is the idea?
Character   ASCII code   Encrypted
A           01000001     10110010
B           01000010     10100101
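
A minimal sketch of the "replace each character with another" idea, assuming a byte substitution table derived from the key. The functions make_tables, encrypt and decrypt and the KEY value are hypothetical names for illustration, and the table produced here does not reproduce the encrypted values shown above. It only demonstrates that decrypt(encrypt("hi")) gives back "hi" when the same key is used.

```python
import random

def make_tables(key):
    """Build a byte substitution table and its inverse from a key."""
    rng = random.Random(key)          # the key seeds the shuffle (illustration only)
    enc = list(range(256))
    rng.shuffle(enc)                  # enc[b] is the replacement for byte b
    dec = [0] * 256
    for plain, cipher in enumerate(enc):
        dec[cipher] = plain           # inverse mapping used for decryption
    return bytes(enc), bytes(dec)

def encrypt(text, key):
    enc, _ = make_tables(key)
    return text.encode("utf-8").translate(enc)

def decrypt(data, key):
    _, dec = make_tables(key)
    return data.translate(dec).decode("utf-8")

KEY = 2024  # hypothetical key
print(decrypt(encrypt("hi", KEY), KEY))  # hi
```

Anyone who holds KEY can rebuild the inverse table, which is exactly the weakness of storing reversible passwords.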
What is the problem with this
approach?
User        Pwd (encrypted using a key)
aagui003    bbhrt
aaoni001    jhlkhj

If someone accesses this
database and knows the
key (even people from IT
or support), all
passwords would be
revealed!
A better approach
One-way hash functions
(The talk is about this)
One way function



A function y = f(x) is one way if it is
easy to compute y from x but “hard” to
compute x from y
However, nobody has proved that such
functions exist!
A possible definition is:


f(x) can be computed in polynomial time
Computing f^{-1}(y) is NP-hard
An example of one-way functions



Unique Factorization Theorem: Every
integer greater than 1 has a unique factorization as
a product of primes.
Factoring
Given two large prime numbers u, v,
consider y = f(u, v) = u * v. It is polynomial
time computable.
However, given y, can we calculate u and v
easily?
NO
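
A small sketch of the asymmetry, assuming the most naive inversion method (trial division). The function names are hypothetical, and the primes are tiny so the example finishes instantly. With primes of hundreds of digits the multiplication stays fast, while this search becomes infeasible.

```python
def f(u, v):
    """Forward direction: one multiplication."""
    return u * v

def factor_by_trial_division(y):
    """Inverse direction: try every candidate divisor up to sqrt(y)."""
    p = 2
    while p * p <= y:
        if y % p == 0:
            return p, y // p
        p += 1
    return None  # no divisor found: y is prime (or smaller than 4)

y = f(101, 103)
print(y)                            # 10403
print(factor_by_trial_division(y))  # (101, 103)
```

Trial division is only one possible attack; faster factoring algorithms exist, but no polynomial-time classical algorithm is known.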
Hash function

Maps a message of variable length m to a
fingerprint of fixed length n bits, where m >= n

Fundamental properties:



Compression
Easy to compute
Can be used to detect changes, since a
modification (even of a single bit) would almost certainly change the
hash value.
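
These properties can be observed with Python's standard hashlib module. SHA-1 is used here only as a convenient example hash, and fingerprint is a hypothetical helper name.

```python
import hashlib

def fingerprint(message: bytes) -> str:
    """Return the SHA-1 digest of the message as 40 hex characters (160 bits)."""
    return hashlib.sha1(message).hexdigest()

# Compression: inputs of very different lengths give outputs of the same length.
short = fingerprint(b"hi")
long_ = fingerprint(b"hi" * 100000)
print(len(short), len(long_))  # 40 40

# Change detection: "A" (01000001) and "C" (01000011) differ in a single bit.
a = fingerprint(b"A")
c = fingerprint(b"C")
print(a == c)  # False
```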
One-way hash functions

y = h(x) where



Given x, calculating h(x) is
easy
Given y, calculating any x
such that y=h(x) is hard,
AND
y is fixed length
independent of the size of
x (a compression function is
needed for large inputs)
Two questions

Is it easy to come up with new one-way hash functions?

What do we need to build such
functions?




Easy to compute (in general, it is a public
algorithm)
Hard to invert (2^n different outputs!)
Compression function
Collision resistant
Collision

Given x1, x2, and a hash function h, a
collision exists if
h(x1) = h(x2)

Is this possible?



YES, why?
It is a many-to-one function! The input domain is
larger than the output domain.
Therefore, good one-way hash functions
should be collision resistant!
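
A sketch of why collisions must exist, assuming a deliberately tiny toy hash (the first byte of SHA-1, so only 256 outputs are possible). tiny_hash and find_collision are hypothetical names. With 257 distinct inputs, at least two must share an output.

```python
import hashlib

def tiny_hash(message: bytes) -> int:
    """Toy hash: first byte of SHA-1, so only 2**8 = 256 outputs exist."""
    return hashlib.sha1(message).digest()[0]

def find_collision():
    seen = {}                          # output value -> input that produced it
    for i in range(257):               # 257 inputs, 256 possible outputs
        x = str(i).encode()
        y = tiny_hash(x)
        if y in seen:
            return seen[y], x          # two different inputs, same output
        seen[y] = x
    raise AssertionError("unreachable: 257 inputs cannot map to 256 distinct outputs")

x1, x2 = find_collision()
print(x1, x2, tiny_hash(x1) == tiny_hash(x2))
```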
Collision resistant?
The Birthday paradox

Consider the probability Q1(n, d) that no two people out
of a group of n will have matching birthdays out of d
equally possible birthdays.
$$ Q_1(n, d) = \frac{d!}{(d - n)!\, d^n} $$

Probability that two do have the
same birthday:
$$ P_2(n, d) = 1 - Q_1(n, d) $$
In general, let Q_i(n, d) denote the probability that a
birthday is shared by exactly i people out of a group of n people.
Then the probability that a birthday is shared by k or more
people is
$$ P_k(n, d) = 1 - \sum_{i=1}^{k-1} Q_i(n, d) $$
http://mathworld.wolfram.com/BirthdayProblem.html
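
The probability of at least one shared birthday can be computed directly. This sketch evaluates d!/((d - n)! d^n) as a running product to avoid huge factorials; q1 and p2 are hypothetical function names.

```python
def q1(n, d):
    """Probability that n people all have different birthdays among d days."""
    prob = 1.0
    for i in range(n):
        prob *= (d - i) / d   # person i must avoid the i birthdays already taken
    return prob

def p2(n, d):
    """Probability that at least two of the n people share a birthday."""
    return 1.0 - q1(n, d)

print(round(p2(23, 365), 4))  # 0.5073
```

With 22 people the value is still below 0.5, so 23 is the smallest group with better than even odds.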
…birthday paradox

An approximation for the minimum number of
people needed to get 50-50 chance that two
have a match within k days out of d possible is
given by:
$$ n(d, k) \approx 1.2 \sqrt{\frac{d}{2k + 1}} $$
How many people do we need in this
classroom for a 50-50 chance?
$$ n(365, 0) \approx 1.2 \sqrt{\frac{365}{2(0) + 1}} \approx 22.93 $$
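
The approximation n(d, k) ≈ 1.2 sqrt(d / (2k + 1)) is easy to evaluate; n_approx is a hypothetical helper name.

```python
import math

def n_approx(d, k=0):
    """Approximate group size for a 50-50 chance of a match within k days out of d."""
    return 1.2 * math.sqrt(d / (2 * k + 1))

print(round(n_approx(365, 0), 2))  # 22.93
```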
What about
OWHFs?
(Sevast'yanov 1972, Diaconis and Mosteller 1989).
Birthday attacks for
OWHFs



Given y = h(x), where y has a fixed length of n
bits, 2^n outputs can be obtained.
Since x is of variable length, and |x| > |y| in
some cases, h(x) is a many-to-one function!
How many attempts are necessary so that
h(x1)=h(x2) (probability of success >= 0.5)?


Use the formula we just explained!
Let d = 2^n, and k = 0
$$ n(d) \approx 1.2 \sqrt{d} = 1.2 \, (2^n)^{1/2} = 1.2 \cdot 2^{n/2} $$

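A sketch of a birthday attack on a deliberately weak toy hash, assuming SHA-1 truncated to BITS = 24 bits so the search finishes quickly. The names h, birthday_attack and BITS are hypothetical. The number of attempts varies with the inputs tried, but it should be of the same order as 1.2 · 2^(n/2).

```python
import hashlib
import math

BITS = 24  # toy output length n; real hash functions use 128 bits or more

def h(message: bytes) -> int:
    """SHA-1 truncated to its top BITS bits (a deliberately weak toy hash)."""
    full = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return full >> (160 - BITS)

def birthday_attack():
    """Hash distinct inputs until two of them produce the same output."""
    seen = {}                      # output value -> input that produced it
    attempts = 0
    while True:
        x = str(attempts).encode()
        attempts += 1
        y = h(x)
        if y in seen:
            return seen[y], x, attempts
        seen[y] = x

x1, x2, attempts = birthday_attack()
estimate = round(1.2 * math.sqrt(2 ** BITS))
print(attempts, "attempts; the formula suggests about", estimate)
```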
To be collision resistant, how
big should n be?
Output length   n(d)
64 bits         ≈ 2^32
128 bits        ≈ 2^64
160 bits        ≈ 2^80

64 bits is now regarded as too small;
128 to 512 bits have been proposed
General structure of OWHF’s
[Figure: arbitrary-length input -> iterated compression function -> optional output transformation -> fixed-length output]
Details
[Figure: iterated hash construction]
original input x
preprocessing: append padding bits, append length block
formatted input x1, x2, ..., xt
iterated processing: H0 = IV, Hi = f(Hi-1, xi), with compression function f
output h(x) = g(Ht)
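
The structure in the figure can be sketched as a toy iterated hash. Everything here is an assumption made for illustration: the block and state sizes, the padding bytes, and the compression function f (built from truncated SHA-1 just to have something that mixes its inputs). It shows the data flow only and is not a secure design.

```python
import hashlib
import struct

BLOCK = 8          # bytes per input block xi (toy size)
STATE = 4          # bytes in each chaining value Hi (toy size)
IV = bytes(STATE)  # H0 = IV, here all zero bytes

def pad(x: bytes) -> bytes:
    """Preprocessing: append padding bits, then a length block."""
    padded = x + b"\x80"                         # a 1 bit followed by zero bits
    padded += b"\x00" * (-len(padded) % BLOCK)   # fill up to a block boundary
    padded += struct.pack(">Q", len(x) * 8)      # length block: bit length of x
    return padded

def f(h_prev: bytes, block: bytes) -> bytes:
    """Toy compression function: STATE + BLOCK bytes in, STATE bytes out."""
    return hashlib.sha1(h_prev + block).digest()[:STATE]

def g(h_t: bytes) -> bytes:
    """Optional output transformation; the identity in this sketch."""
    return h_t

def toy_hash(x: bytes) -> bytes:
    data = pad(x)
    h = IV
    for i in range(0, len(data), BLOCK):   # Hi = f(Hi-1, xi)
        h = f(h, data[i:i + BLOCK])
    return g(h)                            # h(x) = g(Ht)

print(toy_hash(b"hi").hex(), toy_hash(b"hi" * 1000).hex())
```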
Two known OWHF’s

MD5




From Ronald Rivest (the R from RSA) [1992]
Produces a 128-bit hash value
MD5 is widely used; however, collisions were
found (Wang, 2004).
SHA1


Designed by the NSA and published by the National Institute of Standards
and Technology (NIST), as an “upgrade” from
MD5
Produces 160-bit hash values
Going back to our problem

Save a pair <user, hash_of_passw>



<user01, 9dd4e461268c8034f5c8564e155c67a6>
Now, if somebody (inside or outside) accesses the
password table, each entry has to be
attacked individually!
An authentication algorithm would look as
follows:
if MD5(passw_typed) == hash_of_passw
    CorrectPassword = true
else
    CorrectPassword = false
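
A runnable sketch of the same check, assuming Python's standard hashlib and hmac modules. The password_table contents, the sample password and the helper names are hypothetical. MD5 is kept only to mirror the pseudocode; given the collisions noted earlier, this is not a recommendation for a production system.

```python
import hashlib
import hmac

def md5_hex(password: str) -> str:
    """Return the MD5 digest of the password as 32 hex characters."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# Hypothetical password table: user -> hash_of_passw (never the password itself)
password_table = {"user01": md5_hex("s3cret")}

def check_password(user: str, passw_typed: str) -> bool:
    hash_of_passw = password_table.get(user)
    if hash_of_passw is None:
        return False                     # unknown user
    # compare_digest compares the two hex strings in constant time
    return hmac.compare_digest(md5_hex(passw_typed), hash_of_passw)

print(check_password("user01", "s3cret"))  # True
print(check_password("user01", "guess"))   # False
```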
Other uses

Digital signatures



Antivirus
Software validation
Used to store passwords in some
Linux implementations
Thank you
Questions?