The Hill Cipher

Ciphertext-only attack on $d \times d$ Hill in $O(d\,13^d)$
Shahram Khazaei
Siavash Ahmadi
Fall 2015
Outline
Introduction
The Hill Cipher
Cryptanalysis of Hill
Preliminaries
COA on Hill using monograms
Brute-force attack on Hill
Divide-and-conquer attack on Hill
CRT based divide-and-conquer attack on Hill
Experimental Results
Main References
End
Introduction
Classical Cipher
– Substitution
– Transposition
Examples:
– Caesar
– Vigenère square
– Great
– Morse Code
– Pigpen
– Columnar
– Chinese cipher
Introduction
Most of them are broken not only by
Known Plaintext Attacks (KPA)
but also by
Ciphertext Only Attacks (COA)
(provided the messages contain some redundancy).
Hill is one of the classical ciphers that has not been
broken by a COA.
The Hill Cipher
Invented by Lester S. Hill in 1929.
Plaintext: $P = (p_1, p_2, \ldots, p_{md})$
– Let $P_i = (p_{(i-1)d+1}, p_{(i-1)d+2}, \ldots, p_{(i-1)d+d})$
– Encryption: $C_i = P_i K$, where $K$ is an invertible matrix over $\mathbb{Z}_{26}$:
\[
\begin{pmatrix} c_{(i-1)d+1} \\ c_{(i-1)d+2} \\ \vdots \\ c_{(i-1)d+d} \end{pmatrix}^{T}
=
\begin{pmatrix} p_{(i-1)d+1} \\ p_{(i-1)d+2} \\ \vdots \\ p_{(i-1)d+d} \end{pmatrix}^{T}
\begin{pmatrix}
K_{11} & K_{12} & \cdots & K_{1d} \\
K_{21} & K_{22} & \cdots & K_{2d} \\
\vdots & \vdots & \ddots & \vdots \\
K_{d1} & K_{d2} & \cdots & K_{dd}
\end{pmatrix}
\]
Final Ciphertext: $C = (C_1, C_2, \ldots, C_m)$
It completely hides letter frequencies.
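The block encryption $C_i = P_i K$ can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the helper names (`text_to_nums`, `nums_to_text`, `hill_apply`) and the 2 × 2 example key are assumptions chosen for the example, and the row-vector convention follows the slide.

```python
def text_to_nums(text):
    """Map letters A..Z to 0..25, ignoring every other character."""
    return [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]


def nums_to_text(nums):
    """Map 0..25 back to letters A..Z."""
    return "".join(chr(n + ord("A")) for n in nums)


def hill_apply(nums, M, mod=26):
    """Multiply each length-d block (as a row vector) by the d x d matrix M."""
    d = len(M)
    if len(nums) % d:
        raise ValueError("message length must be a multiple of d")
    out = []
    for start in range(0, len(nums), d):
        block = nums[start:start + d]
        for col in range(d):
            out.append(sum(block[row] * M[row][col] for row in range(d)) % mod)
    return out


# Hypothetical example key (d = 2) and its inverse modulo 26.
K = [[3, 3], [2, 5]]
K_INV = [[15, 17], [20, 9]]

cipher = hill_apply(text_to_nums("HELP"), K)      # encryption: C_i = P_i K
plain = hill_apply(cipher, K_INV)                 # decryption: P_i = C_i K^{-1}
```

Decryption reuses the same routine with $K^{-1}$, which is why the key matrix has to be invertible over $\mathbb{Z}_{26}$.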
Cryptanalysis of Hill
KPA on Hill: very easy!
Take $d$ linearly independent blocks of plaintext $P_{i_1}, P_{i_2}, \ldots, P_{i_d}$ and the corresponding Hill ciphertext blocks $(C_{i_1}, C_{i_2}, \ldots, C_{i_d})$.
If $U = (P_{i_1}^T, P_{i_2}^T, \ldots, P_{i_d}^T)^T$ and $W = (C_{i_1}^T, C_{i_2}^T, \ldots, C_{i_d}^T)^T$,
then $K = U^{-1} W$.
COA on Hill:
It is generally accepted that COA on Hill does
not work well.
– Exhaustive search: $26^{d^2}$ matrix multiplications
– Key space size: $\kappa = 26^{d^2} \prod_{i=1}^{d} (1 - 2^{-i})(1 - 13^{-i}) > 0.26 \times 26^{d^2}$
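The known-plaintext recovery $K = U^{-1} W$ can be checked on a toy case. The sketch below is an illustration under stated assumptions: the helpers `det`, `inv_mod`, `mat_mul` and `recover_key` are hypothetical names, the inverse is computed with the adjugate formula (adequate for small $d$ only), and the sample blocks come from a made-up 2 × 2 key.

```python
def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Integer determinant by Laplace expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))


def inv_mod(M, m=26):
    """Inverse of M modulo m via the adjugate; ValueError if M is singular mod m."""
    d = len(M)
    det_inv = pow(det(M) % m, -1, m)   # modular inverse (Python 3.8+)
    return [[(det_inv * (-1) ** (i + j) * det(minor(M, j, i))) % m
             for j in range(d)] for i in range(d)]


def mat_mul(A, B, m=26):
    """Matrix product A * B modulo m."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % m
             for j in range(len(B[0]))] for i in range(len(A))]


def recover_key(plain_blocks, cipher_blocks, m=26):
    """Rows of U are plaintext blocks, rows of W the matching ciphertext blocks."""
    return mat_mul(inv_mod(plain_blocks, m), cipher_blocks, m)


# Toy data: plaintext "HELP" and ciphertext "DPLE" under the key [[3, 3], [2, 5]].
U = [[7, 4], [11, 15]]
W = [[3, 15], [11, 4]]
recovered = recover_key(U, W)
```

The call to `pow(..., -1, m)` fails exactly when the chosen plaintext blocks are not linearly independent modulo 26, which is the condition stated on the slide.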
Preliminaries
English Language Properties.
$H_n$: Entropy of $n$-grams
$H_1 = -\sum_i f_i \log_2 f_i \approx 4.1718$: Entropy of monograms
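The monogram entropy $H_1 = -\sum_i f_i \log_2 f_i$ can be evaluated directly from a letter-frequency table. The table below is an assumption (a commonly quoted set of approximate English letter percentages, not the one used in the talk), so the result only approximates the value quoted on the slide.

```python
import math

# Approximate English letter frequencies in percent, A..Z (assumed table).
ENGLISH_PERCENT = [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]


def monogram_entropy(weights):
    """Shannon entropy in bits of the distribution proportional to `weights`."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


h1_english = monogram_entropy(ENGLISH_PERCENT)   # roughly 4.2 bits
h1_uniform = monogram_entropy([1] * 26)          # log2(26), about 4.70 bits
```

The gap between the English value and the uniform value $\log_2 26$ is the redundancy that a ciphertext-only attack exploits.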
COA on Hill using monograms
– Brute-force attack on Hill using monograms only
– Improvement: a divide-and-conquer attack on Hill
– Improvement: a CRT based divide-and-conquer attack
Each attack finds the key matrix up to an
unknown permutation of its columns.
The correct order of the columns can then
be determined using digram frequencies.
Brute-force attack on Hill
Since the key is found only up to a permutation of its columns, the candidates fall into classes of $d!$ matrices.
Exhaustively search all the $O(26^{d^2} / d!)$ matrices.
Unicity distance: [formula missing in the source slides]
Hence, the computational complexity of the attack
is:
$O(d^3 \, 26^{d^2} / d!) \approx O(26^{d^2})$
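To get a feel for the size of the search space behind the brute-force bound, the number of invertible $d \times d$ matrices over $\mathbb{Z}_{26}$ can be computed exactly. The sketch uses the standard count $|GL_d(\mathbb{Z}_{26})| = \prod_{i=0}^{d-1} (2^d - 2^i)(13^d - 13^i)$, which equals $26^{d^2} \prod_{i=1}^{d} (1 - 2^{-i})(1 - 13^{-i})$; the function names are hypothetical.

```python
def gl_size(d):
    """Number of invertible d x d matrices over Z_26 (product over Z_2 and Z_13)."""
    size = 1
    for i in range(d):
        size *= (2 ** d - 2 ** i) * (13 ** d - 13 ** i)
    return size


def invertible_fraction(d):
    """Fraction of all 26^(d^2) matrices that are valid Hill keys."""
    return gl_size(d) / 26 ** (d * d)


for d in range(1, 7):
    print(d, gl_size(d), round(invertible_fraction(d), 4))
```

The fraction decreases with $d$ but stays above 0.26, so restricting the search to invertible matrices saves only a constant factor; the growth in $26^{d^2}$ is what makes the brute-force attack impractical.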
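Divide-and-conquer attack on Hill
The Key Observation: decryption is $P_i = C_i K^{-1}$, that is,
\[
\begin{pmatrix} p_{(i-1)d+1} \\ p_{(i-1)d+2} \\ \vdots \\ p_{(i-1)d+d} \end{pmatrix}^{T}
=
\begin{pmatrix} c_{(i-1)d+1} \\ c_{(i-1)d+2} \\ \vdots \\ c_{(i-1)d+d} \end{pmatrix}^{T}
\begin{pmatrix}
K^{-1}_{11} & K^{-1}_{12} & \cdots & K^{-1}_{1d} \\
K^{-1}_{21} & K^{-1}_{22} & \cdots & K^{-1}_{2d} \\
\vdots & \vdots & \ddots & \vdots \\
K^{-1}_{d1} & K^{-1}_{d2} & \cdots & K^{-1}_{dd}
\end{pmatrix}
\]
– The monogram frequencies are still observed: the letter $p_{(i-1)d+j}$ depends only on $C_i$ and the $j$-th column of $K^{-1}$.
– Searching over the candidates for a single column of $K^{-1}$ therefore reveals all
the correct columns.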
Divide-and-conquer attack on Hill
The best $d$ candidates for the probable columns (ranked by IC or IML) are taken as the
columns of a representative key matrix.
Using Theorem 1, the number of decrypted letters that suffices
to determine each column of the
decryption matrix almost uniquely can be calculated as: [formula missing in the source slides]
Therefore, the ciphertext length that suffices to obtain the
above number of decrypted letters is: [formula missing in the source slides]
The computational complexity of the attack is $O(d^2 26^d)$.
It can be improved to $O(d\,26^d)$ by using pre-computations.
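The column-by-column search can be sketched for a toy case with $d = 2$. Everything here is an assumption made for illustration: the frequency table, the sample plaintext, the example key, and the scoring function, which is a plain average log-likelihood used as a stand-in because the exact IC and IML definitions are not shown on the slides. Candidates that vanish modulo 2 or modulo 13 are skipped, since they cannot be columns of an invertible matrix.

```python
import math
from itertools import product

# Approximate English letter frequencies in percent, A..Z (assumed table).
FREQ = [8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
        0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
        6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074]
LOG_F = [math.log(f / 100) for f in FREQ]

SAMPLE_TEXT = (
    "the hill cipher was invented by lester hill in nineteen twenty nine and "
    "it encrypts blocks of letters by multiplying them with an invertible "
    "matrix over the integers modulo twenty six because every ciphertext "
    "letter depends on all the letters of its block the cipher hides the "
    "frequencies of single letters and for a long time people believed that "
    "an attack which only sees the ciphertext could not do much better than "
    "trying all the possible keys one after another the attack looks at one "
    "column of the decryption matrix at a time and checks whether the letters "
    "it produces look like ordinary english text when the guess is right the "
    "output has the usual letter statistics and when the guess is wrong it "
    "looks much closer to random noise"
)


def plain_numbers(text, d):
    """Letters as 0..25, truncated to a multiple of the block size d."""
    nums = [ord(c) - ord("a") for c in text.lower() if "a" <= c <= "z"]
    return nums[: len(nums) - len(nums) % d]


def encrypt(nums, K, m=26):
    """Hill encryption C_i = P_i K on consecutive blocks."""
    d = len(K)
    return [sum(nums[i + r] * K[r][c] for r in range(d)) % m
            for i in range(0, len(nums), d) for c in range(d)]


def column_stream(cipher, v, m=26):
    """One letter per block: the product of C_i with the candidate column v."""
    d = len(v)
    return [sum(cipher[i + r] * v[r] for r in range(d)) % m
            for i in range(0, len(cipher), d)]


def score(stream):
    """Average log-likelihood under the monogram model (IML-like stand-in)."""
    return sum(LOG_F[s] for s in stream) / len(stream)


def best_columns(cipher, d, m=26):
    """Rank all usable candidate columns and return the d best."""
    candidates = [v for v in product(range(m), repeat=d)
                  if any(x % 2 for x in v) and any(x % 13 for x in v)]
    candidates.sort(key=lambda v: score(column_stream(cipher, v)), reverse=True)
    return candidates[:d]


if __name__ == "__main__":
    K = [[3, 3], [2, 5]]                      # hypothetical key, inverse [[15, 17], [20, 9]]
    cipher = encrypt(plain_numbers(SAMPLE_TEXT, 2), K)
    print(best_columns(cipher, 2))            # columns of K^{-1}, in some order
```

The search visits on the order of $26^d$ columns instead of $26^{d^2}$ matrices, and it returns the columns in an arbitrary order, which matches the remark that the key is found only up to a column permutation.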
CRT based divide-and-conquer attack on Hill
The same procedure can be used to find the
columns of the decryption key matrix modulo 2 and modulo 13.
Unicity distances: [formula missing in the source slides]
Now, to find a representative key modulo 26, the attack
can be devised in two different ways using the CRT.
CRT based divide-and-conquer attack on Hill
First strategy:
– Find representative key matrices modulo 2 and modulo 13, $K^{(2)}$ and
$K^{(13)}$, respectively.
– Combine each of the $d$ columns of $K^{(13)}$ with all the $d$ columns
of $K^{(2)}$ to obtain $d^2$ new columns modulo 26 using the CRT.
– The $d$ columns with the largest index (IC or IML) can be
taken as the representative key over $\mathbb{Z}_{26}$.
– The computational complexity is $O(d\,13^d + d\,2^d + \mathrm{poly}) = O(d\,13^d)$.
– The ciphertext length that suffices is $\max(n_0^{(2)}, n_0^{(13)}) = 74 d^2$.
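The CRT combination step of the first strategy can be sketched as follows. The function names are hypothetical, and the sample columns are the residues of a made-up 2 × 2 decryption matrix; the constants 13 and 14 are the CRT idempotents for the moduli 2 and 13 (13 is 1 mod 2 and 0 mod 13, 14 is 0 mod 2 and 1 mod 13).

```python
def crt_2_13(a2, a13):
    """The unique x in Z_26 with x = a2 (mod 2) and x = a13 (mod 13)."""
    return (13 * a2 + 14 * a13) % 26


def combine_columns(col2, col13):
    """Entry-wise CRT of a column over Z_2 with a column over Z_13."""
    return tuple(crt_2_13(a, b) for a, b in zip(col2, col13))


def first_strategy_candidates(cols_mod2, cols_mod13):
    """All d * d pairings of a column mod 13 with a column mod 2."""
    return [combine_columns(c2, c13) for c13 in cols_mod13 for c2 in cols_mod2]


# Residues of the hypothetical decryption columns (15, 20) and (17, 9).
cols_mod2 = [(1, 0), (1, 1)]
cols_mod13 = [(2, 7), (4, 9)]
candidates = first_strategy_candidates(cols_mod2, cols_mod13)
```

Only $d$ of the $d^2$ combined columns are genuine columns of the decryption matrix; ranking them by IC or IML is what separates them from the mismatched pairings.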
CRT based divide-and-conquer attack on Hill
Second strategy:
– Find a representative key matrix modulo 13, $K^{(13)}$, only.
– For each column of $K^{(13)}$ do the following:
  • Consider the $2^d - 1$ nonzero columns over $\mathbb{Z}_2$ and compute the
    corresponding columns over $\mathbb{Z}_{26}$ using the CRT.
  • Calculate the IML or IC for each one and choose the column with
    the largest index as a column of the representative key matrix
    over $\mathbb{Z}_{26}$.
– The computational complexity is $O(d\,13^d + \mathrm{poly} \times 2^d) = O(d\,13^d)$.
– The ciphertext length that suffices is $12.5 d^2$.
Experimental Results
$\lambda$ is a coefficient that states how many times longer than the unicity distance
the ciphertext used in the simulation is.
Experimental Results
The Second Strategy
Unicity distance: [formula missing in the source slides]
However, the success probability approaches 1 only for a
ciphertext length of $\lambda$ times the unicity distance with $\lambda \ge 4$
(using the IML criterion), which corresponds to a
ciphertext length of 1800.
The reason is that here the
decrypted string for a wrong
key is not random enough.
Experimental Results
Blue scenario:
– Meaningful text
Red scenario:
– Real text
Main References
1. C. Christensen. Polygraphic Substitution Ciphers: The Hill Cipher, II. http://www.nku.edu/~christensen/1402%20Hill%20cipher%20part%20II.pdf, Accessed Summer 2015.
2. O. Grosek and P. Zajac. Automated cryptanalysis of classical ciphers. In Encyclopedia of Artificial Intelligence (3 Volumes), pages 186–191. 2009.
3. L. S. Hill. Cryptography in an algebraic alphabet. In American Mathematical Monthly, pages 306–312. 1929.
4. B. Hu. Introduction to Cryptology: Hill Cipher Remarks. http://www.cs.rochester.edu/~bh/csc290/hill.html, Accessed Summer 2015.
5. J. Overbey, W. Traves, and J. Wojdylo. On the keyspace of the Hill cipher. Cryptologia, 29(1):59–72, 2005.
6. D. E. Robling Denning. Cryptography and Data Security. Addison-Wesley Longman Publishing Co., Inc., 1982.
End
Siavash Ahmadi