digest

Overview

Cryptographic hash functions are functions that:
o Map an arbitrary-length (but finite) input to a fixed-size
output
o Are one-way (hard to invert)
o Are collision-resistant (difficult to find two values that
produce the same output)

Examples:
o Message digest functions - protect the integrity of data
by creating a fingerprint of a digital document
o Message Authentication Codes (MAC) - protect both the
integrity and authenticity of data by creating a
fingerprint based on both the digital document and a
secret key
Chapter 4  Hash Functions
Checksums vs. Message Digests

Checksums:
o Used to produce a compact representation of a message
o If the message changes the checksum will probably not match
o Good: accidental changes to a message can be detected
o Bad: easy to purposely alter a message without changing the
checksum

Message digests:
o Used to produce a compact representation (called the
fingerprint or digest) of a message
o If the message changes the digest will probably not match
o Good: accidental changes to a message can be detected
o Good: difficult to alter a message without changing the digest
Hash Functions

Message digest functions are hash functions
o A hash function, H(M)=h, takes an arbitrary-length input,
M, and produces a fixed-length output, h

Example hash function:
o H = sum all the letters of an input word modulo 26
(letters are numbered A = 1 through Z = 26, ignoring case)
o Input = a word
o Output = a number between 0 and 25, inclusive
o Example:
 H(“Elvis”) = ((‘E’ + ‘L’ + ‘V’ + ‘I’ + ‘S’) mod 26)
 H(“Elvis”) = ((5+12+22+9+19) mod 26)
 H(“Elvis”) = (67 mod 26)
 H(“Elvis”) = 15
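The toy hash above can be sketched in a few lines of Python. This is only an illustration; the function name toy_hash and the choice to skip non-letter characters are assumptions, not part of the slides.

```python
def toy_hash(word: str) -> int:
    """Sum the alphabet positions of the letters (A=1 ... Z=26), modulo 26.

    Assumption: case is ignored and non-letter characters are skipped.
    """
    total = 0
    for ch in word.upper():
        if "A" <= ch <= "Z":
            total += ord(ch) - ord("A") + 1
    return total % 26


print(toy_hash("Elvis"))  # 15, matching the worked example
```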
Collisions

For the hash function:
o H = sum all the letters of an input word modulo
26



There are more inputs (words) than possible
outputs (numbers 0-25)
Some different inputs must produce the same
output
A collision occurs when two different inputs
produce the same output:
o The values x and y are not the same, but H(x)
and H(y) are the same
Collisions - Example


H(“Jumpsuit”) = 25
o (‘J’ + ‘U’ + ‘M’ + ‘P’ + ‘S’ + ‘U’ + ‘I’ + ‘T’) mod 26
o (10+21+13+16+19+21+9+20) mod 26
o 129 mod 26
o 25
H(“TCB”) = 25
o (‘T’ + ‘C’ + ‘B’) mod 26
o (20+3+2) mod 26
o 25 mod 26
o 25
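A short sketch of how such collisions can be found mechanically: hash every word and group the words by output. The helper names (toy_hash, find_collisions) and the word list are made up for illustration.

```python
from collections import defaultdict


def toy_hash(word: str) -> int:
    """Sum of letter positions (A=1 ... Z=26) modulo 26, ignoring case."""
    return sum(ord(ch) - ord("A") + 1 for ch in word.upper() if "A" <= ch <= "Z") % 26


def find_collisions(words):
    """Group words by hash value and keep only groups with more than one word."""
    buckets = defaultdict(list)
    for word in words:
        buckets[toy_hash(word)].append(word)
    return {h: group for h, group in buckets.items() if len(group) > 1}


# Hypothetical word list; "Jumpsuit" and "TCB" are the pair from the slide.
collisions = find_collisions(["Jumpsuit", "TCB", "Elvis", "Graceland"])
print(collisions)  # {25: ['Jumpsuit', 'TCB']}
```

With only 26 possible outputs, any list of 27 or more distinct words is guaranteed to contain at least one collision.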
Collision-Resistant Hash Functions


Hash functions for which it is difficult to
find collisions are called collision-resistant
A collision-resistant hash function, H(M)=h:
o For any message, M1
o It is difficult to find another message, M2 such
that:
 M1 and M2 are not the same
 H(M1) and H(M2) are the same
One-Way Hash Functions

A function, H(M)=h, is one-way if:
o Forward direction: given M it is easy to
compute h
o Backward direction: given h it is difficult to
compute M

A one-way hash function:
o Easy to compute the hash for a given message
o Hard to determine what message produced a
given hash value
Message Digest Functions

Message digest functions are collision-resistant, one-way hash functions:
o Given a message it is easy to compute its
digest
o Hard to find any message that produces
a given digest (one-way)
o Hard to find any two messages that have
the same digest (collision-resistant)
Using Message Digest Functions

Message digest functions can be used to protect
data integrity:
o A company makes some software available for download
over the World Wide Web
o Users want to be sure that they receive a copy that has
not been tampered with
o Solution:
 The company creates a message digest for its software
 The digest is transmitted (securely) to users
 Users compute their own digest for the software they
receive
 If the digests match the software probably has not been
altered
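The download check described above might look like the following sketch. It uses SHA-256 from Python's hashlib as the digest function, which is a choice made here for illustration (the slides do not name one for this scenario); file_digest and is_unaltered are hypothetical helper names.

```python
import hashlib
import hmac


def file_digest(data: bytes) -> str:
    """Hex digest of the data (SHA-256 chosen here for illustration)."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, published_digest: str) -> bool:
    """Recompute the digest locally and compare it with the published one."""
    return hmac.compare_digest(file_digest(data), published_digest)


# The company computes the digest once and publishes it over a secure channel.
software = b"example installer bytes"
published = file_digest(software)

# Each user recomputes the digest over what was actually received.
print(is_unaltered(software, published))            # True
print(is_unaltered(software + b"\x00", published))  # False
```

The check is only as trustworthy as the channel that delivered the published digest, which is why the slide says the digest is transmitted securely.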
Attacks on Message Digests

Brute-force search for a message with a given digest:
o Goal:
 Find a message that produces a given digest, d
o Assume:
 The message digest function is “strong”
 The message digest function creates n-bit digests
o Approach:
 Generate random messages and compute digests for
them until one is found with digest d
 Approximately 2^n random messages must be tried to
find one that hashes to d
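To make the 2^n cost concrete, the sketch below runs the search against a deliberately tiny digest: the top n = 12 bits of SHA-256. The truncation, the helper names, and the use of numbered rather than random candidate messages (so the run is repeatable) are all assumptions made for the demonstration.

```python
import hashlib


def truncated_digest(message: bytes, n_bits: int) -> int:
    """Top n_bits of SHA-256, standing in for a 'strong' n-bit digest."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)


def brute_force_search(target: int, n_bits: int, max_tries: int):
    """Try candidate messages until one hashes to target; return (message, tries)."""
    for i in range(max_tries):
        candidate = b"guess-%d" % i
        if truncated_digest(candidate, n_bits) == target:
            return candidate, i + 1
    return None, max_tries


n = 12
d = truncated_digest(b"original message", n)
found, tries = brute_force_search(d, n, max_tries=1 << 20)
print(found, tries, "expected on the order of", 2 ** n)
```

Each extra digest bit doubles the expected work, so the same loop is hopeless against a 160-bit digest.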
Attacks on MDs (cont)

Birthday attack (based on the birthday
paradox):
o Goal:
 Find any two messages that produce the same digest
o Assume:
 The message digest function is “strong”
 The message digest function creates n-bit digests
o Approach:
 Generate random messages and compute digests for
them until two are found that produce the same
digest
 Approximately 2^(n/2) random messages must be tried to
find two that hash to the same digest
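The birthday attack can be demonstrated the same way on a truncated digest. The sketch below keeps every digest seen so far in a dictionary and stops at the first repeat. The 24-bit truncation of SHA-256, the helper names, and the numbered candidate messages are assumptions for the demonstration.

```python
import hashlib


def truncated_digest(message: bytes, n_bits: int) -> int:
    """Top n_bits of SHA-256, standing in for a 'strong' n-bit digest."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - n_bits)


def birthday_collision(n_bits: int, max_tries: int):
    """Return (m1, m2, tries) for the first two messages sharing a digest."""
    seen = {}  # digest -> first message that produced it
    for i in range(max_tries):
        message = b"msg-%d" % i
        digest = truncated_digest(message, n_bits)
        if digest in seen:
            return seen[digest], message, i + 1
        seen[digest] = message
    return None, None, max_tries


n = 24
m1, m2, tries = birthday_collision(n, max_tries=1 << 20)
print(m1, m2, tries, "compare with 2^(n/2) =", 2 ** (n // 2))
```

The number of tries should land within a small factor of 2^(n/2) = 4096, far below the 2^24 needed to hit one specific digest.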
The Secure Hash Algorithm

The Secure Hash Algorithm:
o A Federal Information Processing Standard
(FIPS 180-1) adopted by the U.S. government in
1995
o Based on a message digest function called MD4
created by Ron Rivest
o Developed by NIST and the NSA
o Input: a message of b bits
o Output: a 160-bit message digest
SHA - Padding

Input: a message of b bits
o Padding makes the message length a multiple of 512 bits
o The input is always padded (even if its length is already a
multiple of 512)

Padding is accomplished by appending to the input:
o A single bit, 1
o Enough additional bits, all 0, to make the final 512-bit
block exactly 448 bits long
o A 64-bit integer representing the length of the original
message in bits
SHA – Padding Example

Consider the following message:
o M = 01100010 11001010 1001 (20 bits)

To pad we append:
o 1 (1 bit)
o 427 0s (427 bits)
o 64-bit binary representation of the number 20 (64 bits)

Result:
o Pad(M) = 01100010 11001010 10011000 00000000 . . .
00000000 00010100 (512 bits)
o 464 0s have been omitted above (denoted by the ellipsis)
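The padding rule can be checked with a small sketch that works on strings of '0' and '1' characters, so that messages whose length is not a whole number of bytes (like the 20-bit example) can be handled directly. The function name sha_pad_bits is made up for illustration.

```python
def sha_pad_bits(message_bits: str) -> str:
    """Pad a bit string as described above: a 1 bit, 0 bits, 64-bit length."""
    length = len(message_bits)
    padded = message_bits + "1"
    # Enough 0s so the final block holds 448 bits before the length field.
    padded += "0" * ((448 - len(padded)) % 512)
    # 64-bit big-endian binary representation of the original length.
    padded += format(length, "064b")
    return padded


m = "01100010110010101001"  # the 20-bit message from the example
p = sha_pad_bits(m)
print(len(p))                       # 512
print(p[:32])                       # 01100010110010101001100000000000
print(p.count("0", 21, 448))        # 427 zero bits of padding
print(int(p[448:], 2))              # 20
```

A message that is already 512 bits long still gets a full extra block of padding: sha_pad_bits("0" * 512) is 1024 bits long.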
SHA – Constant Init.

After padding, constants are initialized to the
following hexadecimal values:
o Five 32-bit words:
 H0 = 67452301
 H1 = EFCDAB89
 H2 = 98BADCFE
 H3 = 10325476
 H4 = C3D2E1F0
o Eighty 32-bit words:
 K0 – K19 = 5A827999
 K20 – K39 = 6ED9EBA1
 K40 – K59 = 8F1BBCDC
 K60 – K79 = CA62C1D6
SHA – Step 1
The padded message contains a whole
number of 512-bit blocks, denoted B1, B2,
B3, . . ., Bn
 Each 512-bit block, Bi, of the padded
message is processed in turn:

o Bi is divided into 16 32-bit words, W0, W1, . . .,
W15
 W0 is composed of the leftmost 32 bits in Bi
 W1 is composed of the second 32 bits in Bi
…
 W15 is composed of the rightmost 32 bits in Bi
SHA – Step 2


W0, W1, . . ., W15 are used to compute 64 new 32-bit words (W16, W17, . . ., W79)
Wj (16 ≤ j ≤ 79) is computed by:
o XORing words Wj-3, Wj-8, Wj-14, and Wj-16 together
o Circularly left shifting the result one bit
for j = 16 to 79
do
Wj = Circular_Left_Shift_1(Wj-3 XOR Wj-8 XOR Wj-14 XOR Wj-16)
done
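Steps 1 and 2 together can be sketched as follows: split one 64-byte block into sixteen big-endian 32-bit words, then extend the list to eighty words. The names rotl32 and expand_schedule are chosen for this sketch.

```python
def rotl32(x: int, n: int) -> int:
    """Circular left shift of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def expand_schedule(block: bytes) -> list:
    """Return W0..W79 for one 512-bit (64-byte) block."""
    if len(block) != 64:
        raise ValueError("block must be exactly 64 bytes")
    # Step 1: W0 is the leftmost 32 bits, W15 the rightmost.
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    # Step 2: each new word is the XOR of four earlier words, rotated left by 1.
    for j in range(16, 80):
        w.append(rotl32(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
    return w


w = expand_schedule(bytes(range(64)))
print(len(w), hex(w[0]), hex(w[15]))  # 80 0x10203 0x3c3d3e3f
```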
SHA – Step 3

The values of H0, H1, H2, H3, and H4 are
copied into five words called A, B, C, D, and
E:
o A = H0
o B = H1
o C = H2
o D = H3
o E = H4
SHA – Step 4

Four functions are defined as follows:
o For (0 ≤ j ≤ 19):
 fj(B,C,D) = (B AND C) OR ((NOT B) AND D)
o For (20 ≤ j ≤ 39):
 fj(B,C,D) = (B XOR C XOR D)
o For (40 ≤ j ≤ 59):
 fj(B,C,D) = ((B AND C) OR (B AND D) OR (C AND D))
o For (60 ≤ j ≤ 79):
 fj(B,C,D) = (B XOR C XOR D)
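The four round functions map directly onto bitwise operators. In this sketch the single function f (a name chosen here) selects the formula from the round index j; the mask is needed because Python integers are unbounded and NOT would otherwise produce a negative number.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits


def f(j: int, b: int, c: int, d: int) -> int:
    """Round function f_j applied to three 32-bit words."""
    if 0 <= j <= 19:
        return ((b & c) | (~b & d)) & MASK      # choose: b selects c or d bitwise
    if 20 <= j <= 39 or 60 <= j <= 79:
        return (b ^ c ^ d) & MASK               # parity
    if 40 <= j <= 59:
        return ((b & c) | (b & d) | (c & d)) & MASK  # majority
    raise ValueError("j must be in 0..79")


print(hex(f(0, MASK, 0x12345678, 0x9ABCDEF0)))   # 0x12345678 (b all ones picks c)
print(hex(f(45, 0xF0F0F0F0, 0xF0F0F0F0, 0x0)))   # 0xf0f0f0f0 (two of three agree)
```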
SHA – Step 4 (cont)


For each of the 80 words, W0, W1, . . ., W79, a 32-bit word called TEMP is computed
The values of the words A, B, C, D, and E are
updated as shown below (additions are modulo 2^32):
for j = 0 to 79
do
TEMP = Circular_Left_Shift_5(A) + fj(B,C,D) + E + Wj + Kj
E = D; D = C; C = Circular_Left_Shift_30(B); B = A; A = TEMP
done
SHA – Step 5

The values of H0, H1, H2, H3, and H4, are
updated (additions are modulo 2^32):
o H0 = H0 + A
o H1 = H1 + B
o H2 = H2 + C
o H3 = H3 + D
o H4 = H4 + E
SHA - Overview



Pad the message
Initialize constants
For each 512-bit block (B1, B2, B3, . . ., Bn):
o Divide Bi into 16 32-bit words (W0 – W15)
o Compute 64 new 32-bit words (W16, W17, . . ., W79)
o Copy H0 - H4 into A, B, C, D, and E
o For each Wj (W0 – W79) compute TEMP and update A-E
o Update H0 - H4
The 160-bit message digest is: H0 H1 H2 H3 H4
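Putting the steps together gives the compact sketch below. It is written for readability rather than speed, handles byte-aligned messages only, and is meant for study; real code should call a vetted library such as Python's hashlib. The function names are chosen for this sketch.

```python
import struct

MASK = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Circular left shift of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def sha1_hex(message: bytes) -> str:
    """160-bit digest of a byte string, following the steps listed above."""
    # Pad: a 1 bit (0x80), 0 bits up to 448 mod 512, then the 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)

    # Initialize constants.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    k = [0x5A827999] * 20 + [0x6ED9EBA1] * 20 + [0x8F1BBCDC] * 20 + [0xCA62C1D6] * 20

    for start in range(0, len(padded), 64):
        # Steps 1 and 2: sixteen words from the block, expanded to eighty.
        w = list(struct.unpack(">16I", padded[start:start + 64]))
        for j in range(16, 80):
            w.append(rotl32(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
        # Step 3: copy the chaining values.
        a, b, c, d, e = h
        # Step 4: eighty rounds.
        for j in range(80):
            if j < 20:
                fj = (b & c) | (~b & d)
            elif j < 40 or j >= 60:
                fj = b ^ c ^ d
            else:
                fj = (b & c) | (b & d) | (c & d)
            temp = (rotl32(a, 5) + (fj & MASK) + e + w[j] + k[j]) & MASK
            e, d, c, b, a = d, c, rotl32(b, 30), a, temp
        # Step 5: add the working values back in, modulo 2^32.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]

    return "".join("%08x" % word for word in h)


print(sha1_hex(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

The output for b"abc" matches the standard SHA-1 test vector, and the result can be compared against hashlib.sha1 for other inputs.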
Motivation for Message Authentication Codes

Want to use a message digest function to protect files on
our computer from viruses:
o Calculate digests for important files and store them in a table
o Recompute and check from time to time to verify that the files
have not been modified


Good: if a virus modifies a file the change will be
detected since the digest of that file will be different
Bad: the virus could just compute new digests for modified
files and install them in the table
Message Authentication Codes

A message authentication code (MAC) is a
key-dependent message digest function
o MAC_K(M) = h



The output, h, is a function of both the hash
function and a key, K
The MAC can only be created or verified by
someone who knows K
Can turn a one-way hash function into a MAC
by encrypting the hash value with a
symmetric-key cryptosystem
Using MAC

MAC can be used to protect data integrity
and authenticity:
o Want to use a MAC to protect files on our
computer from viruses:
 Calculate MAC values for important files and store
them in a table
 Recompute and check from time to time to verify that
the files haven’t been modified
o Good: if a virus modifies a file the MAC of that
file will be different
o Good: virus doesn’t know the proper key so it
can’t install new MACs in the table to cover its
tracks
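The file-protection scheme can be sketched with Python's standard hmac module. Note that HMAC is a different construction from the "encrypt the hash value" approach mentioned earlier; it is used here only because it is readily available. The helper names, file names, and key are made up for the example.

```python
import hashlib
import hmac


def compute_mac(key: bytes, data: bytes) -> str:
    """Key-dependent digest of the data (HMAC-SHA-256, chosen for illustration)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def modified_files(key: bytes, files: dict, mac_table: dict) -> list:
    """Names of files whose recomputed MAC no longer matches the stored one."""
    return [
        name
        for name, data in files.items()
        if not hmac.compare_digest(compute_mac(key, data), mac_table[name])
    ]


key = b"secret known only to the owner"          # hypothetical key
files = {"login.exe": b"original code", "notes.txt": b"hello"}
mac_table = {name: compute_mac(key, data) for name, data in files.items()}

# A virus alters a file and tries to cover its tracks without knowing the key.
files["login.exe"] = b"infected code"
mac_table["login.exe"] = compute_mac(b"virus guess", files["login.exe"])

print(modified_files(key, files, mac_table))  # ['login.exe']
```

The scheme depends on the key being kept somewhere the virus cannot read.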
Implementing a MAC

Can use a block cipher algorithm:
o Pad the message (if necessary) so that its length is a
multiple of the cipher’s block size
o Divide the message into n blocks equal in length to the
cipher’s block size:
 m1, m2, . . ., mn
o Choose a key, k
o Encrypt m1 with k
o XOR the result with m2
o Encrypt the result with k
o XOR the result with m3
o …
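The chaining pattern in the steps above can be sketched as follows. Python's standard library has no block cipher, so toy_encrypt below is a hypothetical, insecure stand-in (a small Feistel network built from SHA-256) used only to show the structure. The zero-byte padding and the choice to output the last encrypted block as the MAC are also assumptions, since the slide leaves both unspecified.

```python
import hashlib

BLOCK = 8  # block size in bytes for the toy cipher


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical 4-round Feistel stand-in for a real block cipher. NOT secure."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        round_out = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor_bytes(left, round_out)
    return left + right


def chained_mac(key: bytes, message: bytes) -> bytes:
    """Encrypt m1, XOR with m2, encrypt, XOR with m3, ... as in the steps above."""
    # Assumption: pad with zero bytes up to a multiple of the block size.
    if not message or len(message) % BLOCK:
        message += b"\x00" * (BLOCK - len(message) % BLOCK)
    blocks = [message[i:i + BLOCK] for i in range(0, len(message), BLOCK)]
    state = toy_encrypt(key, blocks[0])
    for m in blocks[1:]:
        state = toy_encrypt(key, xor_bytes(state, m))
    return state  # assumption: the final encrypted block serves as the MAC


tag = chained_mac(b"k", b"transfer 100 dollars to Alice")
print(tag.hex())
```

Because each block is folded into the running state before the next encryption, changing any block changes the final value.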
Summary

Message digests
o Message digest functions are collision-resistant, one-way hash
functions
 Collision-resistant: hard to find two values that produce the same
output
 One-way: hard to determine what input produced a given output
o Protects the integrity of a digital document

MAC
o A message authentication code is a key-dependent message
digest function
 The output is a function of both the hash function and a secret key
 The MAC can only be created or verified by someone who knows
the key
o Protects the integrity and authenticity of a digital document