Specification and Analysis of
CRYPTON V1.0
Chae Hoon Lim
Future Systems, Inc.
Aug. 27, 1998
KCDSA Task Force Team
Contents
Design history
Basic building blocks
Encryption/decryption
Key Scheduling
Security/efficiency analysis
Conclusion
Design Objectives
An efficient and secure block cipher
Security:
– security bounds high enough to defeat various existing
attacks such as differential and linear cryptanalysis.
– A large safety margin for the future
Efficiency:
– high performance in software on large microprocessors
– efficient implementation on low-cost 8-bit microprocessors
– very high speed in hardware; low hardware complexity
Simplicity
Design Choices
Feistel vs Substitution-Permutation Network (SPN)
– Feistel: more cryptanalytic experience, fewer constraints in
round function design; poor parallelism
– SPN: more parallelism, more hardware-efficient; more
constraints in round function design
Choice from two alternative designs
– design based on Feistel, much like Twofish: SALTIS (unpublished)
– design based on SPN: used the global structure of Square
– final decision: SPN-type cipher CRYPTON
Main Features
secure against existing attacks
a simple, fine-grained design: easy to implement/analyze
symmetry in encryption and decryption
high performance on most CPU architectures
fast key scheduling: much faster than one-block encryption
efficient hardware implementation; low complexity
high degree of parallelism: very high speed in hardware;
can achieve several Gbits/sec using about 30000 gates
CRYPTON v1.0: Motivations / Changes
Original AES proposal (CRYPTON v0.5):
– at almost final stage of design, but not complete
Motivations for revision:
– key scheduling was already under examination for modification.
– the S-boxes were somewhat weak; took this opportunity to replace
them with stronger ones.
Tried to keep changes minimal: no substantial redesign
Changes:
– Key scheduling strengthened (overall structure unchanged).
– New 8 x 8 S-boxes (2 S-boxes --> 4 S-boxes).
High-level Structure of CRYPTON
[Figure: data flow on the 4 x 4 byte array.
Input --> input whitening (bit-wise key addition)
--> round transformation, 12 rounds (byte-wise substitution, column-wise bit permutation, column-to-row transposition, bit-wise key addition)
--> output transformation (row-wise bit permutation)
--> Output]
Notation
Data representation in 4 x 4 byte array
$A = (A[3], A[2], A[1], A[0])^t$, with row $A[i] = (a_{i3}, a_{i2}, a_{i1}, a_{i0})$:
A[0]:  a03 a02 a01 a00
A[1]:  a13 a12 a11 a10
A[2]:  a23 a22 a21 a20
A[3]:  a33 a32 a31 a30
Basic Building Blocks
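Components of Round Transformation:
– $\gamma$: Byte-wise Substitution
– $\pi$: Column-wise Bit Permutation
– $\tau$: Column-to-Row Transposition
– $\sigma$: Key Xoring
Round Transformation
– Even rounds: $\rho_{eK} = \sigma_K \circ \tau \circ \pi_e \circ \gamma_e$
– Odd rounds: $\rho_{oK} = \sigma_K \circ \tau \circ \pi_o \circ \gamma_o$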
Encryption/Decryption
Round keys
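– i-th round encryption: $K_e^i = \{K_e[4i+j]\}$ $(0 \le j \le 3)$
– i-th round decryption: $K_d^i = \{K_d[4i+j]\}$ $(0 \le j \le 3)$
– $\phi_e = \tau \circ \pi_e \circ \tau$, $\phi_o = \tau \circ \pi_o \circ \tau$
– $K_d^i = \phi_e(K_e^i)$ for even i, $\phi_o(K_e^i)$ for odd i.
Encryption $E_K$:
$E_K = \phi_e \circ \rho_{eK_e^{12}} \circ \rho_{oK_e^{11}} \circ \cdots \circ \rho_{eK_e^{2}} \circ \rho_{oK_e^{1}} \circ \sigma_{K_e^{0}}$
Decryption $D_K$:
– same as encryption except for using $K_d$ instead of $K_e$.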
Byte-wise Substitution
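Odd rounds: $B = \gamma_o(A) \iff b_{ij} = S_{(i+j) \bmod 4}(a_{ij})$
Even rounds: $B = \gamma_e(A) \iff b_{ij} = S_{(i+j+2) \bmod 4}(a_{ij})$
S-box applied at each byte position (rows i = 0..3 top to bottom, columns j = 3..0 left to right):
Odd rounds      Even rounds
S3 S2 S1 S0     S1 S0 S3 S2
S0 S3 S2 S1     S2 S1 S0 S3
S1 S0 S3 S2     S3 S2 S1 S0
S2 S1 S0 S3     S0 S3 S2 S1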
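The index rule for byte-wise substitution can be checked mechanically. The following Python sketch reproduces which S-box is applied at each byte position; the function names `sbox_index` and `layout` are illustrative and not part of the specification, and it assumes rows i = 0..3 with columns written j = 3..0, as in the data array notation.

```python
def sbox_index(i, j, odd_round):
    """Index k of the S-box S_k applied to byte a_ij (rule from the slide)."""
    offset = 0 if odd_round else 2
    return (i + j + offset) % 4


def layout(odd_round):
    """4 x 4 table of S-box indices; rows i = 0..3, columns j = 3..0."""
    return [[sbox_index(i, j, odd_round) for j in (3, 2, 1, 0)]
            for i in range(4)]


if __name__ == "__main__":
    for name, odd in (("Odd rounds", True), ("Even rounds", False)):
        print(name)
        for row in layout(odd):
            print(" ".join(f"S{k}" for k in row))
```

At every position the even-round index differs from the odd-round index by 2 (mod 4).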
Column-wise Bit Permutation (1)
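Odd rounds: $\pi_o(A) = (\pi_3(A_3), \pi_2(A_2), \pi_1(A_1), \pi_0(A_0))$
Even rounds: $\pi_e(A) = (\pi_1(A_3), \pi_0(A_2), \pi_3(A_1), \pi_2(A_0))$
Index k of the permutation $\pi_k$ applied to columns 3, 2, 1, 0:
Odd rounds:   3 2 1 0
Even rounds:  1 0 3 2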
Column-wise Bit Permutation (2)
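$m_0$ = 0xfc, $m_1$ = 0xf3, $m_2$ = 0xcf, $m_3$ = 0x3f
For 4-byte column vectors a and b, $b = \pi_0(a)$ is defined by
$b_0 = (m_3 \wedge a_3) \oplus (m_2 \wedge a_2) \oplus (m_1 \wedge a_1) \oplus (m_0 \wedge a_0)$
$b_1 = (m_0 \wedge a_3) \oplus (m_3 \wedge a_2) \oplus (m_2 \wedge a_1) \oplus (m_1 \wedge a_0)$
$b_2 = (m_1 \wedge a_3) \oplus (m_0 \wedge a_2) \oplus (m_3 \wedge a_1) \oplus (m_2 \wedge a_0)$
$b_3 = (m_2 \wedge a_3) \oplus (m_1 \wedge a_2) \oplus (m_0 \wedge a_1) \oplus (m_3 \wedge a_0)$
With $(b_0, b_1, b_2, b_3)^t = \pi_0((a_0, a_1, a_2, a_3)^t)$, the other permutations are
$\pi_1(a) = (b_1, b_2, b_3, b_0)^t$, $\pi_2(a) = (b_2, b_3, b_0, b_1)^t$, $\pi_3(a) = (b_3, b_0, b_1, b_2)^t$.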
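A minimal Python sketch of the column permutations may help here. It assumes the cyclic mask pattern for $\pi_0$ (output byte $b_i$ is the XOR over j of $a_j$ AND $m_{(i+j) \bmod 4}$) and takes $\pi_k$ to be $\pi_0$ with its output bytes rotated by k positions. The names `MASKS`, `pi0` and `pi` are illustrative only.

```python
MASKS = (0xFC, 0xF3, 0xCF, 0x3F)  # m0, m1, m2, m3 as given on the slide


def pi0(a):
    """Apply pi_0 to a column a = [a0, a1, a2, a3]; returns [b0, b1, b2, b3].

    Assumed rule: b_i is the XOR over j of (a_j AND m_{(i+j) mod 4}).
    """
    b = []
    for i in range(4):
        byte = 0
        for j in range(4):
            byte ^= a[j] & MASKS[(i + j) % 4]
        b.append(byte)
    return b


def pi(k, a):
    """Apply pi_k, taken here as pi_0 with its output bytes rotated by k."""
    b = pi0(a)
    return [b[(i + k) % 4] for i in range(4)]


if __name__ == "__main__":
    col = [0x01, 0x00, 0x00, 0x00]
    print([hex(v) for v in pi0(col)])
    # Under the assumed rule every pi_k undoes itself when applied twice.
    print(all(pi(k, pi(k, col)) == col for k in range(4)))
```

Under these assumptions a single nonzero byte 0x01 in position 0 spreads to the three other positions, and each $\pi_k$ is its own inverse.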
Column-to-Row Transposition / Key Add
Transposition: $B = \tau(A) \iff b_{ij} = a_{ji}$
A:                  B = τ(A):
a03 a02 a01 a00     a30 a20 a10 a00
a13 a12 a11 a10     a31 a21 a11 a01
a23 a22 a21 a20     a32 a22 a12 a02
a33 a32 a31 a30     a33 a23 a13 a03
Key addition:
– $B = \sigma_K(A) \iff B[i] = A[i] \oplus K[i]$ for i = 0,1,2,3.
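Both operations are short enough to sketch directly. The Python below assumes the data array is stored as a nested list indexed `A[i][j]` for byte $a_{ij}$ and that the round key K has the same 4 x 4 shape; the names `tau` and `sigma` are illustrative.

```python
def tau(A):
    """Column-to-row transposition: b_ij = a_ji."""
    return [[A[j][i] for j in range(4)] for i in range(4)]


def sigma(K, A):
    """Key addition: XOR each row A[i] with the round-key row K[i]."""
    return [[a ^ k for a, k in zip(A[i], K[i])] for i in range(4)]


if __name__ == "__main__":
    A = [[4 * i + j for j in range(4)] for i in range(4)]
    K = [[0xA0 + 4 * i + j for j in range(4)] for i in range(4)]
    print(tau(tau(A)) == A)            # transposing twice restores A
    print(sigma(K, sigma(K, A)) == A)  # adding the same key twice restores A
```

Both maps are involutions, which is part of what makes decryption reuse the encryption structure.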
Key Scheduling (1)
Overall structure: two-step generation
– facilitates low-level implementations
[Figure: user key (0~32 bytes) --> expanded keys (32 bytes) --> encryption round keys --> decryption transform --> decryption round keys]
Key Scheduling (2)
Already planned at the beginning
Known weakness: $2^{32}$ weak keys for 256-bit key
– found by J. Borst and S. Vaudenay independently.
– due to regular patterns preserved in both round key
generation and round transformation
Changes:
– major changes made in round key generation
– used distinct round constants
– used 2/6-bit byte rotation and word-wise rotation
Consequence: believed secure against most known
key schedule weaknesses
Diffusion Property of $\pi_i$ (1)
Achieve diffusion order 4
at least 4 active bytes on average per round
Minimum diffusion set $\Omega = \Omega_x \cup \Omega_y$, where
$\Omega_x$ = {0x01, 0x02, 0x03, 0x04, 0x08, 0x0c, 0x10, 0x20, 0x30, 0x40, 0x80, 0xc0}
$\Omega_y$ = {0x11, 0x12, 0x13, 0x21, 0x22, 0x23, 0x31, 0x32, 0x33, 0x44, 0x48, 0x4c,
0x84, 0x88, 0x8c, 0xc4, 0xc8, 0xcc}
Number and ratio of input vectors by diffusion order:
order   4            5            6            7            8
No.     204          13464        1793364      130589784    4162570479
ratio   4.75x10^-8   3.13x10^-6   4.18x10^-4   3.04x10^-2   96.92x10^-2
Diffusion Property of $\pi_i$ (2)
$I_j$ = a set of input vectors of diffusion order 4 under $\pi_i$ with
j nonzero bytes
$I_1 = \{(0,0,0,x)^t, (0,0,x,0)^t, (0,x,0,0)^t, (x,0,0,0)^t \mid x \in \Omega_x\}$,
$I_2 = \{(0,0,x,x)^t, (0,x,x,0)^t, (x,x,0,0)^t, (x,0,0,x)^t \mid x \in \Omega_x\}$,
$I_2' = \{(0,y,0,y)^t, (y,0,y,0)^t \mid y \in \Omega_x \cup \Omega_y\}$,
$I_3 = \{(0,x,x,x)^t, (x,0,x,x)^t, (x,x,0,x)^t, (x,x,x,0)^t \mid x \in \Omega_x\}$.
No. of minimum diffusion vectors = 48+48+60+48 = 204
$a \in I_j \Rightarrow \pi_i(a) \in I_{4-j}$ for j = 1,2,3,
$a \in I_2' \Rightarrow \pi_i(a) \in I_2'$
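The count of 204 can be reproduced with a short Python sketch. It enumerates the four families of vectors (single byte, two adjacent bytes, two alternating bytes, three bytes) and checks that each reaches diffusion order 4, i.e. nonzero input bytes plus nonzero output bytes equals 4. The permutation `pi0` uses the cyclic mask pattern as an assumption, and all function names are illustrative. The sketch only confirms that the listed vectors have order 4; it does not search the full 2^32 input space for others.

```python
MASKS = (0xFC, 0xF3, 0xCF, 0x3F)  # m0..m3 from the slide
OMEGA_X = [0x01, 0x02, 0x03, 0x04, 0x08, 0x0C,
           0x10, 0x20, 0x30, 0x40, 0x80, 0xC0]
OMEGA_Y = [0x11, 0x12, 0x13, 0x21, 0x22, 0x23, 0x31, 0x32, 0x33,
           0x44, 0x48, 0x4C, 0x84, 0x88, 0x8C, 0xC4, 0xC8, 0xCC]


def pi0(a):
    """Assumed rule: b_i = XOR over j of (a_j AND m_{(i+j) mod 4})."""
    return [a[0] & MASKS[i % 4] ^ a[1] & MASKS[(i + 1) % 4]
            ^ a[2] & MASKS[(i + 2) % 4] ^ a[3] & MASKS[(i + 3) % 4]
            for i in range(4)]


def weight(v):
    """Number of nonzero bytes in a 4-byte vector."""
    return sum(1 for byte in v if byte)


def place(value, positions):
    """4-byte vector holding `value` at the given positions, 0 elsewhere."""
    return [value if p in positions else 0 for p in range(4)]


def minimum_diffusion_vectors():
    vectors = []
    for x in OMEGA_X:
        for p in range(4):
            vectors.append(place(x, {p}))                    # one byte
            vectors.append(place(x, {p, (p + 1) % 4}))       # two adjacent
            vectors.append(place(x, set(range(4)) - {p}))    # three bytes
    for y in OMEGA_X + OMEGA_Y:
        vectors.append(place(y, {0, 2}))                     # two alternating
        vectors.append(place(y, {1, 3}))
    return vectors


if __name__ == "__main__":
    vs = minimum_diffusion_vectors()
    print(len(vs), all(weight(v) + weight(pi0(v)) == 4 for v in vs))
```

Because the other permutations only rotate the output bytes of `pi0` under this assumption, the byte counts are the same for each of them.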
Minimum Diffusion Patterns by o
[Figure: minimum diffusion patterns of Type-1, Type-2, Type-3 and Type-4, traced over Round 1 to Round 4]
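Differential/Linear Prob. for n x n S-box S
S-box differential prob.:
– $\Delta x$ / $\Delta y$: input/output differences, resp.
$\Pr(\Delta x \to \Delta y) = \dfrac{|\{x \in X \mid S(x) \oplus S(x \oplus \Delta x) = \Delta y\}|}{2^n}$
S-box linear prob.:
– $\Gamma x$ / $\Gamma y$: input/output selection vectors, resp.
$\Pr(\Gamma x \to \Gamma y) = \left(\dfrac{|\{x \in X \mid \Gamma x \cdot x = \Gamma y \cdot S(x)\}| - 2^{n-1}}{2^{n-1}}\right)^2$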
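Both probabilities can be tabulated by direct counting. The Python sketch below builds the difference distribution table and the linear approximation table of an S-box given as a list. `TOY_SBOX` is an arbitrary 4-bit permutation chosen only for illustration (it is not a CRYPTON S-box), and all function names are assumptions of this sketch.

```python
TOY_SBOX = [3, 14, 1, 10, 4, 9, 5, 6, 8, 11, 15, 2, 13, 12, 0, 7]


def ddt(sbox):
    """table[dx][dy] = number of x with S(x) ^ S(x ^ dx) == dy."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table


def parity(v):
    """Parity (0 or 1) of the set bits of v, i.e. a GF(2) dot product."""
    return bin(v).count("1") & 1


def lat(sbox):
    """table[gx][gy] = #{x : gx.x == gy.S(x)} - 2^(n-1)."""
    size = len(sbox)
    half = size // 2
    return [[sum(parity(gx & x) == parity(gy & sbox[x])
                 for x in range(size)) - half
             for gy in range(size)]
            for gx in range(size)]


def diff_prob(sbox, dx, dy):
    """Differential probability: count / 2^n."""
    return ddt(sbox)[dx][dy] / len(sbox)


def lin_prob(sbox, gx, gy):
    """Linear probability: (bias count / 2^(n-1)) squared."""
    return (lat(sbox)[gx][gy] / (len(sbox) // 2)) ** 2


if __name__ == "__main__":
    d, l = ddt(TOY_SBOX), lat(TOY_SBOX)
    print(max(max(row) for row in d[1:]))                   # largest DDT entry
    print(max(abs(v) for row in l[1:] for v in row[1:]))    # largest LAT entry
```

For an 8 x 8 S-box the same functions apply unchanged, with table entries divided by 256 and 128 respectively.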
S-box Construction (1)
One 8x8 involution S-box S --> 4 S-boxes $S_i$
[Figure: S0, S1, S2 and S3 are each obtained from S combined with a byte rotation (ROL1, ROL3, ROL5, ROL7)]
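One way to derive four S-boxes from a single involution can be sketched in Python. The slide only shows that each $S_i$ pairs S with one of the rotations ROL1, ROL3, ROL5, ROL7; the placement used below (rotation after S for S0 and S1, rotation before S for S2 and S3) is an assumption, chosen so that S2 inverts S0 and S3 inverts S1. `PLACEHOLDER_S` is a trivial involution (bitwise complement) standing in for the real S-box, which is not reproduced here.

```python
def rol8(v, n):
    """Rotate the byte v left by n bits."""
    n %= 8
    return ((v << n) | (v >> (8 - n))) & 0xFF


# Placeholder involution, NOT the CRYPTON S-box: S(S(x)) == x for all x.
PLACEHOLDER_S = [x ^ 0xFF for x in range(256)]


def derive_sboxes(S):
    """Build four S-boxes from one involution S (assumed arrangement)."""
    s0 = [rol8(S[x], 1) for x in range(256)]   # rotate output by 1
    s1 = [rol8(S[x], 3) for x in range(256)]   # rotate output by 3
    s2 = [S[rol8(x, 7)] for x in range(256)]   # rotate input by 7
    s3 = [S[rol8(x, 5)] for x in range(256)]   # rotate input by 5
    return s0, s1, s2, s3


if __name__ == "__main__":
    s0, s1, s2, s3 = derive_sboxes(PLACEHOLDER_S)
    print(all(s2[s0[x]] == x for x in range(256)))
    print(all(s3[s1[x]] == x for x in range(256)))
```

With this arrangement the rotations cancel in pairs (1 + 7 and 3 + 5 bits), so the inverse relations hold for any involution S.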
S-box Construction (2)
Design criteria for S-boxes:
– should be efficiently implementable in hardware logic and
on low-cost smart cards.
– The prob. of differential and linear characteristics should be
as small as possible.
– High prob. I/O differences/selection vectors in S should
have as high Hamming weights as possible.
– The number of such pairs in all $S_i$'s should be as small as
possible when restricted to $\Omega$.
The S-box S Search Model
[Figure: search model built from the bit permutations P0 and P1, the inverse bit permutations P0^-1 and P1^-1, and ROLn (left rotate by n bits)]
The Selected S-box S
[Figure: structure of the selected S-box S. Input x (x7..x0) --> 4-bit P-boxes P1, P0 --> linear involution on the intermediate bits (z, w) --> inverse P-boxes P1^-1, P0^-1 --> output y (y7..y0)]
Differential/Linear Char. of S-boxes (1)
Previous S-boxes: too many high prob. I/O pairs
The new S-boxes:
– $\Pr(\mathrm{DC}) \le 10/256 = 2^{-4.68}$, for only 7 pairs
– $\Pr(\mathrm{LC}) \le (32/128)^2 = 2^{-4}$, for only 6 pairs
– High prob. char.: sum of Hamming weights is at least
4, on average 8.
Difference distribution
value   0      2      4     6    8   10
No.     39584  20158  4976  749  62  7
Linear approx. distribution
value   0      4      8      12    16    20    24   28  32
No.     13927  22058  15948  8460  3731  1094  276  36  6
Differential/Linear Char. of S-boxes (2)
Observation:
– min. 4 active bytes/round only for byte values in $\Omega$
– for such values, max. entry in distr. tables: 6 / 24
– $\Pr(\mathrm{DC}) \le 6/256 = 2^{-5.42}$
– $\Pr(\mathrm{LC}) \le (24/128)^2 = 2^{-4.83}$
I/O pairs for S0, S1, S2, S3 (columns DC(6) and LC(24) for each S-box), in the order listed on the slide:
(11,c0) (88,11) (11,03) (88,44) (c0,11) (11,88) (03,11) (44,88)
(22,8c) (32,cc) (88,11) (22,32) (32,33) (88,44)
(11,88) (8c,22) (cc,22) (32,22) (33,32) (44,88)
Differential/Linear Cryptanalysis - Bounds
Observations:
– Min. No. of active S-boxes up to 8 rounds = 32
– Suppose that all such active S-boxes have
$\Pr(\mathrm{DC}) = 2^{-5.42}$ and $\Pr(\mathrm{LC}) = 2^{-4.83}$.
Overall char. prob. of DC/LC up to 8 rounds:
– $p_{C8} \le (2^{-5.42})^{32} = 2^{-173.3}$
– $p_{L8} \le (2^{-4.83})^{32} = 2^{-154.6}$
Differential, linear hull/multiple linear approx.:
– may increase the probabilities by a constant factor.
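The 8-round bounds follow from simple arithmetic on the per-S-box maxima. The Python sketch below recomputes them from the table entries 6/256 and (24/128)^2 with 32 active S-boxes; the variable and function names are illustrative.

```python
import math


def neg_log2(p):
    """Return -log2(p), the exponent e such that p == 2**(-e)."""
    return -math.log2(p)


ACTIVE_SBOXES = 32                 # minimum over 8 rounds, from the slide
P_DC = 6 / 256                     # max differential prob. restricted to Omega
P_LC = (24 / 128) ** 2             # max linear prob. restricted to Omega

dc_exponent = ACTIVE_SBOXES * neg_log2(P_DC)
lc_exponent = ACTIVE_SBOXES * neg_log2(P_LC)

if __name__ == "__main__":
    print(f"per S-box: 2^-{neg_log2(P_DC):.2f} (DC), 2^-{neg_log2(P_LC):.2f} (LC)")
    print(f"8 rounds:  2^-{dc_exponent:.1f} (DC), 2^-{lc_exponent:.1f} (LC)")
```

The exponent 173.3 comes from the unrounded per-S-box value (about 5.415); multiplying the rounded 5.42 by 32 would give 173.4 instead.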
Differential/Linear Cryptanalysis - Simulation
Partial exhaustive search over the minimum diffusion set
theoretically breakable up to 7 rounds
Figures are -log2(prob.):
No. of   Char. Prob.      Diff. Prob.      Diffusion
rounds   DC      LC       DC      LC       Type
5        110.3   105.0    109.5   105.0    3/4
6        127.1   122.8    124.3   120.7    3/3
7        156.9   145.1    155.4   144.2    3/4
8        185.7   169.3    181.5   169.1    4/4
Variants/Extensions of DC/LC
Variants of DC:
– truncated/higher-order differentials,
– impossible differentials: a number of impossible
differentials up to 4 rounds; none for more than 5 rounds
Variants of LC:
– nonlinear approximations, generalized LC, partitioning
cryptanalysis
Other Possible Attacks
interpolation attacks: no simple algebraic description
dedicated SQUARE attacks:
– the best known attack up to 6 rounds
– cannot be extended to versions with more rounds
Side-channel cryptanalysis:
– timing attacks
– differential fault analysis
– differential power analysis
Key schedule cryptanalysis
– weak keys, semi-weak keys, equivalent keys
– simple relations, related keys
Software Efficiency
32-bit μPs: same as the previous version
– Pentium Pro 200 MHz, Windows 95, MSVC 5.0
– UltraSparc 167 MHz, Solaris 2.5, GNU C
Language \ Clocks     Key setup (enc/dec)   Enc/Dec
In-line Asm (PC)      N/A                   381/381 (64 Mbps)
MSVC 5.0 (PC)         327/397               452/452 (54 Mbps)
GNU C (UltraSparc)    496/564               575/575 (42 Mbps)
8-bit μPs: 256 byte ROM, 52 byte RAM; a little bit slower
than the previous version
Hardware Efficiency
Gate array implementation of 2-round iterative version
– VHDL description & logic synthesis using Synopsys +
HYUNDAI’s 0.35 micron gate array library
Simulation results:
Opt.   Clock Period   Enc/Dec    Key     Key Switch   Speed         Cell Area        Total Area
in     (nsec)         (cycles)   setup   (cycles)     (Mbits/sec)   (no. of gates)   (no. of gates)
Area   18.98          7          0       1            919           18322            51527
Time   10.23          7          0       1            1705          28179            74021
Conclusion
Advantages:
– strong security against various known attacks (with at least
3-round safety margin)
– symmetry in encryption and decryption
– uniformly fast on various architectures in software
– efficiently implementable in hardware
– high degree of parallelism: very high speed in hardware
Remarks:
– can be freely used: royalty-free
– welcome any comments/analysis reports