Pseudo-Code

Knight Paper Algorithms
• Given ciphertext f of length m, a plaintext vocabulary
of v tokens, and a source model b
• m=340
• v=26
• f = cipher
• Notation changes between papers – be careful
• c refers to the count table in “Decoding complexity”
and to the cipher in “Unsupervised analysis”
Figure 2 – Decoding Complexity…
1. Set the s(f|e) table initially to be uniform.
This is the ‘channel model’ – our solution.
f = coded message (symbols)
e = English letters
In Unsupervised Analysis… Knight mentions
setting all values initially to 1/26
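A minimal sketch of step 1 in Python. The language, the function name init_channel_table and the toy symbol list are assumptions for illustration, not taken from either paper. It uses 1/(number of distinct cipher symbols) so that each letter's row sums to 1, which equals the 1/26 mentioned above when the cipher has 26 symbol types.

```python
import string

def init_channel_table(cipher_symbols, letters=string.ascii_uppercase):
    """Return s[e][f]: for every English letter e, a uniform
    distribution over the distinct cipher symbols f."""
    uniform = 1.0 / len(cipher_symbols)
    return {e: {f: uniform for f in cipher_symbols} for e in letters}

# Toy cipher alphabet (hypothetical); the real cipher has more symbol types.
symbols = sorted(set("HER<HRE<"))
s = init_channel_table(symbols)
```

Storing the table as s[e][f] keeps all probabilities for one letter in one row, which is the direction the later normalization step works in.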
2. For several iterations – with O(mv²) running
time, we can do it many times. Knight gets
good results with 20 iterations.
a. Set up a count table c (2d array) 340 x 26
We may need special cases for the first and
last symbols, but there are no word
boundaries for us
• s and c tables (columns: cipher symbols; rows: English letters):
H E R < ∆ Ф …
A
B
C
D
…
Z
2b. b(e|e) is another table of probabilities:
‘given A, A follows’, ‘given A, B follows’, etc.
26 x 26
Rada: “The 20 Zodiac letters is not enough
source material”
Linguistics will build this for us.
• b(e|e)
A B C D E F … Z
A
B
C
…
Z
Need to add ‘start’ and ‘end’ I think
‘Boundary’ for us refers to the first and last
symbol only, not word boundaries (we don’t
have any in our cipher).
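One way the b(e|e) table with ‘start’ and ‘end’ entries could be built from source text, as a hedged sketch. The marker strings "<s>" and "</s>", the function name train_bigram_model, the sample sentence and the add-alpha smoothing are all assumptions. Spaces and punctuation are dropped, so the boundary markers touch only the first and last letter, as described above.

```python
from collections import defaultdict
import string

START, END = "<s>", "</s>"  # hypothetical boundary markers

def train_bigram_model(text, letters=string.ascii_uppercase, alpha=1.0):
    """Estimate b[prev][nxt] = P(nxt | prev) from the letters in `text`.
    `alpha` is add-alpha smoothing so no entry is exactly zero (an assumption)."""
    seq = [START] + [ch for ch in text.upper() if ch in letters] + [END]
    counts = defaultdict(lambda: defaultdict(float))
    for prev, nxt in zip(seq, seq[1:]):
        counts[prev][nxt] += 1.0

    outcomes = list(letters) + [END]      # START can never follow anything
    b = {}
    for prev in [START] + list(letters):  # END never has a successor
        total = sum(counts[prev].values()) + alpha * len(outcomes)
        b[prev] = {nxt: (counts[prev][nxt] + alpha) / total for nxt in outcomes}
    return b

b = train_bigram_model("the quick brown fox jumps over the lazy dog")
```

With the two markers the table grows from 26 x 26 to 27 x 27: one extra row for ‘start’ and one extra column for ‘end’.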
2c – 2f are straightforward
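For reference, a sketch of one standard way to collect the expected counts in O(mv²) time: a forward-backward pass over a bigram source model. Whether it matches steps 2c – 2f line for line has not been checked against the figure. The function names, the "<s>"/"</s>" markers and the two-letter toy tables are assumptions. Each column is rescaled because raw products over 340 symbols would underflow a double.

```python
import math

def normalize(col):
    """Scale a column to sum to 1; also return the original total."""
    total = sum(col.values())
    return {k: v / total for k, v in col.items()}, total

def expected_counts(cipher, letters, s, b, start="<s>", end="</s>"):
    """Return (c, log P(cipher)), where c[e][f] is the expected number of
    times letter e produced cipher symbol f under the current s and b."""
    m = len(cipher)
    fwd, log_prob = [], 0.0
    for i, f in enumerate(cipher):            # forward pass
        col = {}
        for e in letters:
            if i == 0:
                prior = b[start][e]
            else:
                prior = sum(fwd[i - 1][p] * b[p][e] for p in letters)
            col[e] = prior * s[e][f]
        col, scale = normalize(col)
        fwd.append(col)
        log_prob += math.log(scale)
    log_prob += math.log(sum(fwd[m - 1][e] * b[e][end] for e in letters))

    bwd = [None] * m                          # backward pass
    bwd[m - 1], _ = normalize({e: b[e][end] for e in letters})
    for i in range(m - 2, -1, -1):
        nxt = cipher[i + 1]
        col = {e: sum(b[e][n] * s[n][nxt] * bwd[i + 1][n] for n in letters)
               for e in letters}
        bwd[i], _ = normalize(col)

    counts = {e: {f: 0.0 for f in set(cipher)} for e in letters}
    for i, f in enumerate(cipher):            # posterior per position
        post, _ = normalize({e: fwd[i][e] * bwd[i][e] for e in letters})
        for e in letters:
            counts[e][f] += post[e]
    return counts, log_prob

# Toy tables (hypothetical): two plaintext letters, two cipher symbols.
letters = "AB"
b = {"<s>": {"A": 0.5, "B": 0.5},
     "A": {"A": 0.4, "B": 0.4, "</s>": 0.2},
     "B": {"A": 0.4, "B": 0.4, "</s>": 0.2}}
s = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.5, "y": 0.5}}
counts, log_prob = expected_counts("xyx", letters, s, b)
```

The expected counts always total m, one unit of count per cipher position, which is a cheap sanity check after every iteration. The sketch assumes no column sums to zero, so the tables need smoothing or a uniform start.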
2g. Normalize c(f|e) to create a revised s(f|e)
STL Vector normalization?
Normalize with respect to e, so that in our table,
all of the probabilities for ‘A’ sum to 1, those
for ‘B’ sum to 1…
Merge the result into s(f|e).
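The normalization in 2g, sketched in Python under the assumption that the tables are stored as c[e][f] and s[e][f]. The function name normalize_counts, the uniform fallback for a letter with zero total count, and the toy counts are illustrative choices. A C++ version would follow the same pattern, with std::accumulate giving each row total.

```python
def normalize_counts(counts):
    """Turn expected counts c[e][f] into probabilities s[e][f],
    one row per English letter e, each row summing to 1."""
    s = {}
    for e, row in counts.items():
        total = sum(row.values())
        if total > 0:
            s[e] = {f: c / total for f, c in row.items()}
        else:
            # Letter received no count at all: fall back to uniform (assumption).
            s[e] = {f: 1.0 / len(row) for f in row}
    return s

# Toy counts (hypothetical): 'A' mostly produced H, 'B' was never used.
c = {"A": {"H": 3.0, "E": 1.0}, "B": {"H": 0.0, "E": 0.0}}
s = normalize_counts(c)
```

Here the row for ‘A’ becomes 0.75 for H and 0.25 for E, and the unused ‘B’ row falls back to 0.5 each.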
Need to…
Think about trigrams – Knight got much better
performance from trigrams
Apply improvements from second page of
“Unsupervised analysis”