White Paper

Low-Speed Rijndael Encryption/Decryption Processors (Advanced Encryption Standard (AES))

WP-LWSPDRIJNDL-1.0, June 2002, Altera Corporation

Introduction

The Altera® low-speed Rijndael encryption/decryption processors implement the Rijndael encryption or decryption algorithms, and are optimized for Altera FLEX™ 10KE and APEX® 20K devices. The cores are parameterized for key and block sizes. All combinations of key lengths (128, 192, and 256 bits) and block lengths (128, 192, and 256 bits) are supported. A test case generator is included with the core; it converts a file containing key and block data into a simulation file for the core.

The Rijndael Algorithm

The Rijndael algorithm processes an encryption or decryption operation in a number of rounds: 10, 12, or 14, depending on the key and block sizes. The key is expanded into a much larger keyspace. The keyspace holds one block-sized round key for each round, plus one for the initial key addition, so its size is the number of block bits multiplied by (number of rounds + 1). The block and key are stored as matrices of four rows, with Nb columns for the block, and Nk columns for the key.

An encryption round consists of the following operations (decryption reverses the order of operations for each round):

1. Byte-by-byte finite field substitution
2. Matrix index shuffling
3. Matrix multiply over a finite field
4. Key addition

Ports and Parameters

Table 1 shows the parameters, which set the key and block sizes.

Table 1 Parameters

Parameter  Description
Nb         As defined in the Rijndael specification: the number of columns in the block. Nb = 4 corresponds to a 128-bit block, Nb = 6 to 192 bits, and Nb = 8 to 256 bits. Any combination of Nb and Nk can be chosen.
Nk         As defined in the Rijndael specification: the number of columns in the key. Nk = 4 corresponds to a 128-bit key, Nk = 6 to 192 bits, and Nk = 8 to 256 bits. Any combination of Nb and Nk can be chosen.

Table 2 shows the input signals.

Table 2 Input Signals

Signal Name     Description
SYSCLK          SYSCLK is the main system clock.
RESET           The core is asynchronously reset when the RESET signal is asserted high. This prepares the core to receive another plaintext (in the case of an encryptor) or ciphertext (in the case of a decryptor) input. The RESET signal does not clear the previously calculated key expansion, which can be re-used. If a different key is needed from the one used in the previous cycle, a new key must be loaded and its keyspace calculated first.
DATALOAD        When this signal is asserted, data on the DATAIN[] bus is latched into the core.
GO              When all data for an encryption or decryption operation has been loaded into the core, GO is asserted to start data processing. GO must be held high until processing is complete (indicated by ENCDONE or DECDONE, depending on the core).
DATAIN[8..1]    Input bytes are written into the core using this bus when the DATALOAD input is set high. A total of 4 × Nb bytes are written into the core during an input cycle.
ENCADD[5..1],   Data bytes written into the core must be accompanied by an address, on ENCADD[] (for an encryption core) or DECADD[] (for a decryption core).
DECADD[5..1]
KEYLOAD         When this signal is asserted, data on the KEYIN[] bus is latched into the core.
KEYIN[8..1]     Key bytes are written into the core using this bus when the KEYLOAD input is asserted. A total of 4 × Nk bytes are written into the core during an input cycle. After a new key is loaded, the keyspace must be calculated. Once the keyspace is calculated, it does not have to be calculated again for any subsequent encryption or decryption operation, until the key needs to be changed.
KEYCALC         Once a new key is loaded, the KEYCALC signal is asserted and held until the key calculation is completed (indicated by KEYDONE).

Table 3 shows the output signals.

Table 3 Output Signals

Signal Name     Description
KEYDONE         The core asserts KEYDONE when the keyspace has been calculated. At this point, data can be loaded into the core and processed.
ENCDONE,        The core asserts ENCDONE (encryption cores) or DECDONE (decryption cores) when the encryption or decryption operation is complete.
DECDONE
ENCOUT[8..1],   Encrypted or decrypted bytes can be read from this bus once processing is complete. ENCADD[] or DECADD[] is used to read out the bytes in the desired sequence.
DECOUT[8..1]

Core Interface

The following figures illustrate the relationship between the control and data signals needed to load a key, calculate a keyspace, load plaintext, perform an encryption operation, and read the ciphertext out. The example is for an encryption processor with Nb = 4 and Nk = 4. The signal relationships are the same for any parameter combination, but the processing times are specific to this one.

Figure 1 shows the loading of a key into the core. Once the key is loaded, the KEYCALC control can be asserted.

Figure 1: Loading a Key (waveform; signals shown: SYSCLK, RESET, ENCADD, KEYLOAD, KEYIN[], KEYCALC)

The keyspace calculation requires 190 clock cycles.

Figure 2 shows the loading of plaintext into the core. Once the data loading is complete, GO is used to start the encryption processing.

Figure 2: Loading Plaintext (waveform; signals shown: SYSCLK, RESET, DATALD, GO, DATAIN[], ENCADD)

The encryption operation requires 260 clock cycles. The ciphertext can then be read out (see Figure 3). There is a two-cycle latency between the address input and the ciphertext output.

Figure 3: Ciphertext Output (waveform; signals shown: SYSCLK, RESET, DATALD, GO, DATAIN[], ENCADD)

Compiling the Core

The core is optimized for Altera FLEX 10KE and APEX 20K devices. It must be compiled into a device that supports dual-port RAM. For best results, the core should be compiled with a synthesis style that supports carry chains. Tables 4 and 5 show resource requirements and performance for the encryption and decryption cores, respectively, with all parameter combinations.
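The Rounds and Key Space columns in Tables 4 and 5 can be cross-checked with a short calculation. The sketch below is written in Python and is not part of the core package. It assumes the round count given by the Rijndael specification, max(Nb, Nk) + 6, and an expanded key that holds one block-sized round key per round plus one for the initial key addition. The function names are illustrative only.

```python
VALID_COLUMNS = (4, 6, 8)  # 128-, 192-, and 256-bit blocks or keys


def rijndael_rounds(nb, nk):
    """Number of rounds for a block of nb columns and a key of nk columns."""
    if nb not in VALID_COLUMNS or nk not in VALID_COLUMNS:
        raise ValueError("Nb and Nk must each be 4, 6, or 8")
    return max(nb, nk) + 6


def keyspace_bytes(nb, nk):
    """Size of the expanded key in bytes: 4 * Nb bytes per round key,
    with (rounds + 1) round keys in total."""
    return 4 * nb * (rijndael_rounds(nb, nk) + 1)


if __name__ == "__main__":
    # Print one line per parameter combination, in the order used by the tables.
    for nb in VALID_COLUMNS:
        for nk in VALID_COLUMNS:
            print(nb, nk, rijndael_rounds(nb, nk), keyspace_bytes(nb, nk))
```

For Nb = 4 and Nk = 4 this gives 10 rounds and a keyspace of 176 bytes, which agrees with the first row of each table.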
The number of clocks required for key and code generation includes loading the key or text information into the core.

Table 4 Area and Performance of RJENCAA

Nb  Nk  LEs  EAB/ESB  Rounds  Key Gen. Rounds  Key Space (bytes)  Key Gen Clocks  Code Gen Clocks
4   4   516  5        10      10               176                196             276
4   6   516  5        12      8                208                232             328
4   8   526  5        14      7                240                270             380
6   4   546  5        12      19               312                358             432
6   6   546  5        12      12               312                366             432
6   8   559  5        14      11               360                406             498
8   4   549  5        14      29               480                538             620
8   6   547  5        14      20               480                544             620
8   8   561  5        14      14               480                508             620

Table 5 Area and Performance of RJDECAA

Nb  Nk  LEs  EAB/ESB  Rounds  Key Gen. Rounds  Key Space (bytes)  Key Gen Clocks  Code Gen Clocks
4   4   593  5        10      10               176                196             276
4   6   593  5        12      8                208                232             328
4   8   601  5        14      7                240                270             380
6   4   621  5        12      19               312                358             432
6   6   621  5        12      12               312                366             432
6   8   635  5        14      11               360                406             498
8   4   628  5        14      29               480                538             620
8   6   627  5        14      20               480                544             620
8   8   644  5        14      14               480                508             620

In a 10KE-1 device, the cores typically achieve an fMAX of 78 MHz. In a 20KE-1 device, the cores typically achieve an fMAX of 95 MHz.

Core Performance

The core performance (in Mbit/s) can be calculated directly. Because the keyspace is calculated only once per key, it does not affect the throughput, except for the first operation with a new key.

Performance (Mbit/s) = fMAX (MHz) × (bits per block)/(code clocks per block)

In the case of Nb = 4 and Nk = 4, in a 20KE-1 device:

Performance = 95 × 128/276
Performance = 44 Mbit/s

Testing the Core

A utility is included with the core package that encrypts a key and plaintext sequence stored in a file, and decrypts the ciphertext back into plaintext. It uses the plaintext and ciphertext to generate test cases for the Altera tools, for the encryption and decryption processors respectively.

Example Test – Standard Sequence

The data format in the test source file must be byte-wide hexadecimal, from '00' to 'FF'.
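As an illustration of this format, the following Python sketch splits a sequence of byte-wide hexadecimal values into a key followed by plaintext. It is not the utility supplied with the core. The function name parse_test_bytes and its acceptance of single-digit tokens such as '1' are assumptions, chosen to match the key-then-plaintext ordering and the sample values used in this example.

```python
def parse_test_bytes(text, key_bits=128, block_bits=128):
    """Split whitespace-separated hex byte tokens into (key, plaintext).

    Hypothetical helper: the real utility's parsing rules are not documented
    here, so this only mirrors the ordering described in the text.
    """
    values = []
    for token in text.split():
        value = int(token, 16)  # accepts '1' as well as '01'
        if not 0x00 <= value <= 0xFF:
            raise ValueError("not a byte-wide hex value: %r" % token)
        values.append(value)

    key_len = key_bits // 8
    block_len = block_bits // 8
    if len(values) < key_len + block_len:
        raise ValueError("not enough bytes for one key and one block")

    key = bytes(values[:key_len])
    plaintext = bytes(values[key_len:key_len + block_len])
    return key, plaintext


# Sample sequence from this example: 16 key bytes, then 16 plaintext bytes.
SAMPLE = (
    "12 13 23 34 45 56 67 78 89 ff ee dd ea a1 b1 b2 "
    "0a 0b 0c 0d 44 55 66 77 1a 1b 1c 1d 1 2 3 4"
)

if __name__ == "__main__":
    key, plaintext = parse_test_bytes(SAMPLE)
    print("Key:", key.hex())
    print("Plaintext:", plaintext.hex())
```

Tokens outside the range '00' to 'FF' are rejected, which makes a malformed source file fail early instead of producing a misleading test case.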
First create a testing file foo.txt, containing the following data:

    12 13 23 34 45 56 67 78 89 ff ee dd ea a1 b1 b2 0a 0b 0c 0d 44 55 66 77 1a 1b 1c 1d 1 2 3 4

The utility parses the data into a key, followed by plaintext.

Now, run the utility:

    FIVE 128 128

(Running the utility without parameters returns the parameter list: number of key bits, number of block bits.)

The utility returns:

    Key: 121323344556677889ffeeddeaa1b1b2
    Plaintext: 0a0b0c0d445566771a1b1c1d01020304
    Ciphertext: 69bf969db6c11b1ad3c66461f743a09f
    Plaintext: 0a0b0c0d445566771a1b1c1d01020304

The utility simultaneously generates test cases for the RJENCAA.TDF (encryption) and RJDECAA.TDF (decryption) cores.

Now, run the test cases. From the simulator window, use the Inputs/Outputs option (File menu) to select the vector file RJENCAA., and start the simulation. After the simulation is complete, the ciphertext is read out (bytes left to right) on the ENCOUT[] bus. The decryption core is simulated in the same way.

Appendix A: Top Level Wrapper

Unencrypted top level wrappers for both cores (TOP_LEVEL_RJENCAA.TDF and TOP_LEVEL_RJDECAA.TDF, respectively) are provided, to make it easier to instantiate the cores. The source code for the two wrappers follows.
TOP_LEVEL_RJENCAA.TDF

    -- Function prototype for the rjencaa core
    FUNCTION rjencaa (sysclk, reset, datain[8..1], keyin[8..1], encadd[5..1],
                      dataload, keyload, keycalc, go)
        RETURNS (keydone, encdone, encout[8..1]);

    -- Defaults: 128-bit block (nb = 4) and 128-bit key (nk = 4)
    PARAMETERS
    (
        nb = 4,
        nk = 4
    );

    subdesign top_level_rjencaa
    (
        sysclk, reset                            : INPUT;
        datain[8..1], keyin[8..1], encadd[5..1]  : INPUT;
        dataload, keyload                        : INPUT;
        keycalc, go                              : INPUT;
        keydone, encdone                         : OUTPUT;
        encout[8..1]                             : OUTPUT;
    )
    BEGIN
        -- Pass the wrapper's ports and parameters straight through to the core
        (keydone, encdone, encout[8..1]) =
            rjencaa (sysclk, reset, datain[8..1], keyin[8..1], encadd[5..1],
                     dataload, keyload, keycalc, go)
            WITH (nb=nb,nk=nk);
    END;

TOP_LEVEL_RJDECAA.TDF

    -- Function prototype for the rjdecaa core
    FUNCTION rjdecaa (sysclk, reset, datain[8..1], keyin[8..1], decadd[5..1],
                      dataload, keyload, keycalc, go)
        RETURNS (keydone, decdone, decout[8..1]);

    -- Defaults: 128-bit block (nb = 4) and 128-bit key (nk = 4)
    PARAMETERS
    (
        nb = 4,
        nk = 4
    );

    subdesign top_level_rjdecaa
    (
        sysclk, reset                            : INPUT;
        datain[8..1], keyin[8..1], decadd[5..1]  : INPUT;
        dataload, keyload                        : INPUT;
        keycalc, go                              : INPUT;
        keydone, decdone                         : OUTPUT;
        decout[8..1]                             : OUTPUT;
    )
    BEGIN
        -- Pass the wrapper's ports and parameters straight through to the core
        (keydone, decdone, decout[8..1]) =
            rjdecaa (sysclk, reset, datain[8..1], keyin[8..1], decadd[5..1],
                     dataload, keyload, keycalc, go)
            WITH (nb=nb,nk=nk);
    END;

101 Innovation Drive
San Jose, CA 95134
(408) 544-7000
http://www.altera.com

Copyright © 2002 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S. and foreign patents and pending applications, mask work rights, and copyrights.
Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.