Knight Paper Algorithms

• Given cipher text f of length m, a plaintext vocabulary of v tokens, and source model b
• m = 340
• v = 26
• f = cipher
• Notation changes between papers – be careful
• c refers to the count table in “Decoding complexity” and to the cipher in “Unsupervised analysis”

Figure 2 – Decoding Complexity…

1. Set the s(f|e) table initially to be uniform.
   This is the ‘channel model’: our solution.
   f = coded message (symbols)
   e = English letters
   In Unsupervised Analysis… Knight mentions setting all values initially to 1/26

2. For several iterations. With O(mv²) running time, we can do it many times. Knight gets good results with 20.
   a. Set up a count table c (2D array), 340 x 26
      We may need to make special cases for the start and end boundaries, but there are no word boundaries for us
      • s and c tables: indexed by cipher symbol (HER<∆Ф…) and English letter (A B C D … Z)
   b. b(e|e) is another table of probabilities: ‘given A, A follows’, ‘given A, B follows’, etc. 26 x 26
      Rada: “The 20 Zodiac letters is not enough source material”
      Linguistics will build this for us.
      • b(e|e) table: English letters (A B C … Z) on both axes
      Need to add ‘start’ and ‘end’
      I think ‘Boundary’ for us refers to the first and last symbol only, not word boundaries (we don’t have any in our cipher).
   2c – 2f are straightforward
   g. Normalize c(f|e) to create a revised s(f|e)
      STL Vector normalization?
      Normalize with respect to e, so that in our table, all of our probabilities for ‘A’ sum to 1, ‘B’ sum to 1…
      Merging that into s(f|e).

Need to…
• Think about trigrams: Knight got much better performance from trigrams
• Apply improvements from second page of “Unsupervised analysis”
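The steps above (uniform s(f|e), a zeroed count table, a bigram table b(e|e) with ‘start’ and ‘end’, and normalizing c(f|e) with respect to e) can be sketched roughly as follows. This is only a minimal illustration under several assumptions, not Knight's exact procedure: the function names (train_bigram, em_decipher, normalize) and the boundary tokens "<s>" and "</s>" are made up for this sketch; b(e|e) is estimated with add-one smoothing; steps 2c – 2f are filled in with a standard forward-backward pass that is rescaled at every position to avoid underflow on a 340-symbol cipher; and the tables are sized by the number of distinct cipher symbols rather than 340 x 26. It uses bigrams only, so the trigram idea is not covered.

```python
from collections import Counter

def normalize(vec):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(vec)
    return [x / total for x in vec]

def train_bigram(text, alphabet, start="<s>", end="</s>"):
    """Build b[prev][next] from sample text, with add-one smoothing (an assumption).
    'start' and 'end' mark only the first and last symbol, not word boundaries."""
    counts = Counter(zip([start] + list(text), list(text) + [end]))
    nexts = list(alphabet) + [end]
    b = {}
    for prev in [start] + list(alphabet):
        total = sum(counts[(prev, n)] + 1 for n in nexts)
        b[prev] = {n: (counts[(prev, n)] + 1) / total for n in nexts}
    return b

def em_decipher(cipher, alphabet, b, iterations=20, start="<s>", end="</s>"):
    """Return s[e][f], an estimate of P(cipher symbol f | plaintext letter e)."""
    symbols = sorted(set(cipher))
    m = len(cipher)
    # Step 1: uniform channel table.
    s = {e: {f: 1.0 / len(symbols) for f in symbols} for e in alphabet}
    for _ in range(iterations):
        # Step 2a: count table with zero entries.
        c = {e: {f: 0.0 for f in symbols} for e in alphabet}
        # Forward pass, O(m * v^2), rescaled at each position.
        alpha = [normalize([b[start][e] * s[e][cipher[0]] for e in alphabet])]
        for i in range(1, m):
            prev = alpha[-1]
            alpha.append(normalize([
                s[e][cipher[i]] * sum(prev[j] * b[ep][e] for j, ep in enumerate(alphabet))
                for e in alphabet]))
        # Backward pass, also rescaled; the last position uses the 'end' boundary.
        beta = [None] * m
        beta[m - 1] = normalize([b[e][end] for e in alphabet])
        for i in range(m - 2, -1, -1):
            nxt = beta[i + 1]
            beta[i] = normalize([
                sum(b[e][en] * s[en][cipher[i + 1]] * nxt[j] for j, en in enumerate(alphabet))
                for e in alphabet])
        # Fractional counts: the posterior over e at each position is proportional
        # to alpha * beta, so the per-position scale factors cancel on normalizing.
        for i, f in enumerate(cipher):
            gamma = normalize([a * bt for a, bt in zip(alpha[i], beta[i])])
            for e, g in zip(alphabet, gamma):
                c[e][f] += g
        # Step 2g: normalize with respect to e, so each letter's row sums to 1.
        for e in alphabet:
            total = sum(c[e].values())
            if total > 0:
                s[e] = {f: c[e][f] / total for f in symbols}
    return s

# Toy usage with a two-letter alphabet; the real case would use 26 letters
# and the 340-symbol cipher.
b = train_bigram("abababab", "ab")
s = em_decipher("xyxyxy", "ab", b, iterations=5)
```

Each iteration visits every position once and sums over all letter pairs, which is where the O(mv²) cost per iteration comes from. Plain dictionaries keep the sketch dependency-free; a real implementation would more likely use dense arrays.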