ARCHI 2017, Nancy, France — March 6–10, 2017 Hardware Implementation of Cryptography Jérémie Detrey CARAMBA team, LORIA INRIA Nancy – Grand Est, France [email protected] /* /* /* CARAMBA d[5],Q[999 (;i--;e=scanf("%" ++i<C ;++Q[ c:i); for(;i --;n +=!u*Q +l*lc*s* l=i,s=u,r=4;r;E= %C,l=E%C+r */ */ */ ]={0};main(n "d",d+i));for(C i*i% C],c= --;) for(u [l%C ],e+= s%C) %C]) i*l+c*u*s,s=(u*l --[d]);printf /* cc caramba.c; echo f3 f2 f1 f0 p | ./a.out */ E,C, c,r, u,l, e,s, i=5, ){for =*d; i[Q]? =C;u Q[(C for( +i*s) ("%d" "\n", (e+n* n)/2 -C);} ARCHI 2017, Nancy, France — March 6–10, 2017 Hardware Implementation of (Elliptic Curve) Cryptography Jérémie Detrey CARAMBA team, LORIA INRIA Nancy – Grand Est, France [email protected] /* /* /* CARAMBA d[5],Q[999 (;i--;e=scanf("%" ++i<C ;++Q[ c:i); for(;i --;n +=!u*Q +l*lc*s* l=i,s=u,r=4;r;E= %C,l=E%C+r */ */ */ ]={0};main(n "d",d+i));for(C i*i% C],c= --;) for(u [l%C ],e+= s%C) %C]) i*l+c*u*s,s=(u*l --[d]);printf /* cc caramba.c; echo f3 f2 f1 f0 p | ./a.out */ E,C, c,r, u,l, e,s, i=5, ){for =*d; i[Q]? =C;u Q[(C for( +i*s) ("%d" "\n", (e+n* n)/2 -C);} Cryptography Alice Bob I Alice and Bob want to communicate using a public channel (e.g., Internet) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptography Alice Bob Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptography Alice Mallory Bob Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptography Alice Bob ? Mallory Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) I Cryptography: how to prevent such attacks, and ensure Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptography Alice Bob ? Mallory Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) I Cryptography: how to prevent such attacks, and ensure • confidentiality (Who can read the message?) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography → encryption 1 / 43 Cryptography Alice Bob ? Mallory Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) I Cryptography: how to prevent such attacks, and ensure • confidentiality (Who can read the message?) • integrity (Was the message modified?) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography → encryption → cryptographic hash functions 1 / 43 Cryptography Alice Bob ? Mallory Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) I Cryptography: how to prevent such attacks, and ensure • confidentiality (Who can read the message?) → encryption • integrity (Was the message modified?) → cryptographic hash functions • authenticity (Who sent the message?) → message auth. code (MAC), signature Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptography Alice Bob ? Mallory Eve I Alice and Bob want to communicate using a public channel (e.g., Internet) • ... but Eve is listening (passive attack: eavesdropping) • ... and Mallory is interfering (active attack: tampering, forgery, replay, etc.) I Cryptography: how to prevent such attacks, and ensure • • • • confidentiality (Who can read the message?) → encryption integrity (Was the message modified?) → cryptographic hash functions authenticity (Who sent the message?) → message auth. code (MAC), signature ... and many others: non-repudiation, zero-knowledge proof, secret sharing, etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 1 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • protocol (OpenPGP, TLS, SSH, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • protocol (OpenPGP, TLS, SSH, etc.) • cryptographic mechanisms (encryption, hashing, signature, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • protocol (OpenPGP, TLS, SSH, etc.) • cryptographic mechanisms (encryption, hashing, signature, etc.) • cryptographic primitives (AES, RSA, ECDH, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) logic gates (NOT, NAND, etc.) and wires Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) logic gates (NOT, NAND, etc.) and wires transistors Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) logic gates (NOT, NAND, etc.) and wires transistors I When designing a cryptoprocessor, the hardware/software partitioning can be tailored to the application’s requirements Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) logic gates (NOT, NAND, etc.) and wires transistors I When designing a cryptoprocessor, the hardware/software partitioning can be tailored to the application’s requirements I All top layers (esp. the blue and green ones) might lead to critical vulnerabilities if poorly implemented! ⇒ a cryptosystem is no more secure than its weakest link Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Cryptographic layers I A complete cryptosystem implementation relies on many layers: • • • • • • • protocol (OpenPGP, TLS, SSH, etc.) cryptographic mechanisms (encryption, hashing, signature, etc.) cryptographic primitives (AES, RSA, ECDH, etc.) arithmetic and logic operations (CPU / ASIP instruction set) logic circuits (registers, multiplexers, adders, etc.) logic gates (NOT, NAND, etc.) and wires transistors I When designing a cryptoprocessor, the hardware/software partitioning can be tailored to the application’s requirements I All top layers (esp. the blue and green ones) might lead to critical vulnerabilities if poorly implemented! ⇒ a cryptosystem is no more secure than its weakest link I In this lecture, we will mostly focus on the green layers Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 2 / 43 Which target platforms? I Cryptography should be available everywhere: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on desktop PCs and laptops → 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on → • on → desktop PCs and laptops 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) smartphones low-power 32- or 64-bit ARM CPUs, maybe with SIMD (NEON) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on → • on → • on → desktop PCs and laptops 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) smartphones low-power 32- or 64-bit ARM CPUs, maybe with SIMD (NEON) wireless sensors tiny 8-bit microcontroller (such as Atmel AVRs) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on desktop PCs and laptops → 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) • on smartphones → low-power 32- or 64-bit ARM CPUs, maybe with SIMD (NEON) • on wireless sensors → tiny 8-bit microcontroller (such as Atmel AVRs) • on smart cards and RFID chips → custom cryptoprocessor (ASIC or ASIP) with dedicated hardware for cryptographic operations Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on desktop PCs and laptops → 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) • on smartphones → low-power 32- or 64-bit ARM CPUs, maybe with SIMD (NEON) • on wireless sensors → tiny 8-bit microcontroller (such as Atmel AVRs) • on smart cards and RFID chips → custom cryptoprocessor (ASIC or ASIP) with dedicated hardware for cryptographic operations I Other possible target platforms, mostly for cryptanalytic computations: • clusters of CPUs • GPUs (graphics processors) • FPGAs (reconfigurable circuits) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Which target platforms? I Cryptography should be available everywhere: • on desktop PCs and laptops → 64-bit Intel or AMD CPUs with SIMD instructions (SSE / AVX) • on smartphones → low-power 32- or 64-bit ARM CPUs, maybe with SIMD (NEON) • on wireless sensors → tiny 8-bit microcontroller (such as Atmel AVRs) • on smart cards and RFID chips → custom cryptoprocessor (ASIC or ASIP) with dedicated hardware for cryptographic operations I Other possible target platforms, mostly for cryptanalytic computations: • clusters of CPUs • GPUs (graphics processors) • FPGAs (reconfigurable circuits) ⇒ In such cases, implementation security is usually less critical Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 3 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • protocol attacks? (POODLE, FREAK, LogJam, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • protocol attacks? (POODLE, FREAK, LogJam, etc.) • cryptanalysis? (weak cipher, small keys, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • protocol attacks? (POODLE, FREAK, LogJam, etc.) • cryptanalysis? (weak cipher, small keys, etc.) • timing attacks? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • • • • protocol attacks? (POODLE, FREAK, LogJam, etc.) cryptanalysis? (weak cipher, small keys, etc.) timing attacks? power or electromagnetic analysis? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • • • • • protocol attacks? (POODLE, FREAK, LogJam, etc.) cryptanalysis? (weak cipher, small keys, etc.) timing attacks? power or electromagnetic analysis? fault attacks? [See A. Tisserand’s talk] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • • • • • • protocol attacks? (POODLE, FREAK, LogJam, etc.) cryptanalysis? (weak cipher, small keys, etc.) timing attacks? power or electromagnetic analysis? fault attacks? [See A. Tisserand’s talk] cache attacks? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • • • • • • • protocol attacks? (POODLE, FREAK, LogJam, etc.) cryptanalysis? (weak cipher, small keys, etc.) timing attacks? power or electromagnetic analysis? fault attacks? [See A. Tisserand’s talk] cache attacks? branch-prediction attacks? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Efficient and secure implementation? I Many possible meanings for efficiency: • fast? → low latency or high throughput? • small? → low memory / code / silicon usage? • low power?... or low energy? ⇒ Identify constraints according to application and target platform I Secure against which attacks? • • • • • • • • protocol attacks? (POODLE, FREAK, LogJam, etc.) cryptanalysis? (weak cipher, small keys, etc.) timing attacks? power or electromagnetic analysis? fault attacks? [See A. Tisserand’s talk] cache attacks? branch-prediction attacks? etc. ⇒ Possible attack scenarios depend on the application Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 4 / 43 Some references Alfred Menezes, Paul van Oorschot, and Scott Vanstone, Handbook of Applied Cryptography. Chapman & Hall / CRC, 1996. http://www.cacr.math.uwaterloo.ca/hac/ Francisco Rodrı́guez-Henrı́quez, Arturo Dı́az Pérez, Nazar Abbas Saqib, and Çetin Kaya Koç, Cryptographic Algorithms on Reconfigurable Hardware. Springer, 2006. Proceedings of the CHES workshop and of other crypto conferences. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 5 / 43 Some references Elliptic Curves in Cryptography, Ian F. Blake, Gadiel Seroussi, and Nigel P. Smart. London Mathematical Society 265, Cambridge University Press, 1999. Guide to Elliptic Curve Cryptography, Darrel Hankerson, Alfred Menezes, and Scott Vanstone. Springer, 2004. Handbook of Elliptic and Hyperelliptic Curve Cryptography, Henri Cohen and Gerhard Frey (editors). Chapman & Hall / CRC, 2005. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 6 / 43 Outline I Some encryption mechanisms I Elliptic curve cryptography I Scalar multiplication I Elliptic curve arithmetic I Finite field arithmetic Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 7 / 43 Symmetric encryption Alice Bob M I Alice wants to send a confidential message M to Bob Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 Symmetric encryption Alice ? Bob M I Alice wants to send a confidential message M to Bob Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 Symmetric encryption Alice K ? M Bob K I Alice wants to send a confidential message M to Bob • they decide upon a shared secret key K (might be tricky!) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 Symmetric encryption Alice K ? M C = EncK (M ) Bob K I Alice wants to send a confidential message M to Bob • they decide upon a shared secret key K (might be tricky!) • encrypt message using shared key: C = EncK (M) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 Symmetric encryption Alice K ? M C = EncK (M ) Bob K I Alice wants to send a confidential message M to Bob • they decide upon a shared secret key K (might be tricky!) • encrypt message using shared key: C = EncK (M) • decrypt ciphertext using shared key: M = DecK (C ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 Symmetric encryption Alice K ? M C = EncK (M ) Bob K I Alice wants to send a confidential message M to Bob • they decide upon a shared secret key K (might be tricky!) • encrypt message using shared key: C = EncK (M) • decrypt ciphertext using shared key: M = DecK (C ) I Block cipher: • split message M into n-bit blocks (e.g., n = 128 bits) • encryption/decryption primitive : iterated keyed permutation {0, 1}n → {0, 1}n • requires a mode of operation to combine the blocks Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 8 / 43 AES [Daemen & Rijmen, 2001] M K k0 I Advanced Encryption Standard I Key sizes: 128, 192 or 256 bits I Block size: 128 bits S S SubBytes S S I Substitution–permutation network ShiftRows • SubBytes: nonlinear subst. on bytes • ShiftRows & MixColumns: mainly wires, plus a few XORs MixColumns k1 Key schedule .. . I 10, 12, or 14 rounds (depending on key size) k9 S S SubBytes S S ShiftRows k10 C Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 9 / 43 AES [Daemen & Rijmen, 2001] M K k0 I Advanced Encryption Standard I Key sizes: 128, 192 or 256 bits I Block size: 128 bits S S SubBytes S S I Substitution–permutation network ShiftRows • SubBytes: nonlinear subst. on bytes • ShiftRows & MixColumns: mainly wires, plus a few XORs MixColumns k1 Key schedule .. . k9 S S SubBytes S S ShiftRows k10 C I 10, 12, or 14 rounds (depending on key size) I Low-area version (1 S-box): 20 cycles / round, 2.5 to 5 kGE I Parallel version (20 S-boxes): 1 cycle / round, 20 to 35 kGE I Fully unrolled version (200 S-boxes): 1 cycle / block, at least 200 kGE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 9 / 43 Symmetric encryption Alice K ? M C = EncK (M ) Bob K I Alice wants to send a confidential message M to Bob • decide upon a shared secret key K • encrypt message using shared key: C = EncK (M) • decrypt ciphertext using shared key: M = DecK (C ) I Block cipher: • split message M into n-bit blocks (e.g., n = 128 bits) • encryption/decryption primitive : keyed permutation {0, 1}n → {0, 1}n • requires a mode of operation to combine the blocks Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 / 43 Symmetric encryption Alice K ? M C = EncK (M ) Bob K I Alice wants to send a confidential message M to Bob • decide upon a shared secret key K • encrypt message using shared key: C = EncK (M) • decrypt ciphertext using shared key: M = DecK (C ) I Block cipher: • split message M into n-bit blocks (e.g., n = 128 bits) • encryption/decryption primitive : keyed permutation {0, 1}n → {0, 1}n • requires a mode of operation to combine the blocks I Stream cipher: • generate a pseudorandom keystream Z using a PRNG initialized by the key K and a random initialization vector (IV) • use Z to mask the message: C = M ⊕ Z and M = C ⊕ Z (⊕ is XOR) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 / 43 Trivium [De Cannière & Preneel, 2005] 4 1 286 287 288 I Part of the eSTREAM portfolio (low-area hardware ciphers) 26 I Key size: 80 bits I IV size: 80 bits 24 66 3 69 zi I 288-bit circular shift register, plus a few XOR and AND gates 91 92 93 178 17 1 17 75 6 94 162 7 171 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 11 / 43 Trivium [De Cannière & Preneel, 2005] 4 1 286 287 288 I Part of the eSTREAM portfolio (low-area hardware ciphers) 26 I Key size: 80 bits I IV size: 80 bits 24 66 3 69 zi 91 92 93 I 288-bit circular shift register, plus a few XOR and AND gates I Serial version: • 1 keystream bit / clock cycle • 2.6 kGE 178 17 1 17 75 6 94 162 7 171 I Parallel version: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography • up to 64 bits / clock cycle • 4.9 kGE 11 / 43 Public-key encryption Alice Bob K K C = EncK (M ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K K C = EncK (M ) I Agreeing on a shared secret key K over a public channel is difficult Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems • integer factorization (RSA) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems • integer factorization (RSA) • discrete logarithm problem in finite fields (ElGamal, DSA, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems • integer factorization (RSA) • discrete logarithm problem in finite fields (ElGamal, DSA, etc.) • discrete logarithm problem in elliptic curves (elliptic curve cryptography) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems • integer factorization (RSA) • discrete logarithm problem in finite fields (ElGamal, DSA, etc.) • discrete logarithm problem in elliptic curves (elliptic curve cryptography) → harder problem, and thus requires smaller keys (256 vs. 3072 bits) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Public-key encryption Alice Bob ? ? K C = EncK (M ) C = EncPK B (M ) KB SK PK B I Agreeing on a shared secret key K over a public channel is difficult I Use public-key cryptography: • Bob generates a public/secret key-pair (PK B , SK B ) • Alice retrieves Bob’s public key PK B • encryption only uses the public key: C = EncPK B (M) • but decryption requires the secret key: M = DecSK B (C ) I Security: computing SK B from PK B should be difficult ⇒ rely on (supposedly) “hard” number-theoretic problems • integer factorization (RSA) • discrete logarithm problem in finite fields (ElGamal, DSA, etc.) • discrete logarithm problem in elliptic curves (elliptic curve cryptography) → harder problem, and thus requires smaller keys (256 vs. 3072 bits) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 12 / 43 Outline I Some encryption mechanisms I Elliptic curve cryptography I Scalar multiplication I Elliptic curve arithmetic I Finite field arithmetic Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 13 / 43 A primer on elliptic curves I Let us consider a field K (e.g., Q, R, Fp , etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 14 / 43 A primer on elliptic curves I Let us consider a field K (e.g., Q, R, Fp , etc.) I An elliptic curve E defined over K is given by an equation of the form E : y 2 = x 3 + Ax + B, with parameters A, B ∈ K Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 14 / 43 A primer on elliptic curves I Let us consider a field K (e.g., Q, R, Fp , etc.) I An elliptic curve E defined over K is given by an equation of the form E : y 2 = x 3 + Ax + B, with parameters A, B ∈ K I The set of K -rational points of E is defined as E (K ) = {(x, y ) ∈ K × K | (x, y ) satisfy E } Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 14 / 43 A primer on elliptic curves I Let us consider a field K (e.g., Q, R, Fp , etc.) I An elliptic curve E defined over K is given by an equation of the form E : y 2 = x 3 + Ax + B, with parameters A, B ∈ K I The set of K -rational points of E is defined as E (K ) = {(x, y ) ∈ K × K | (x, y ) satisfy E } ∪ {O} O is called the “point at infinity” Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 14 / 43 A primer on elliptic curves I Let us consider a field K (e.g., Q, R, Fp , etc.) I An elliptic curve E defined over K is given by an equation of the form E : y 2 = x 3 + Ax + B, with parameters A, B ∈ K I The set of K -rational points of E is defined as E (K ) = {(x, y ) ∈ K × K | (x, y ) satisfy E } ∪ {O} O is called the “point at infinity” I Additive group law: E (K ) is an abelian group • addition via the “chord and tangent” method • O is the neutral element Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 14 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 4 2 −4 −3 −2 −1 0 1 2 3 4 −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 −4 −3 −2 −1 0 1 2 3 4 −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 −4 −3 −2 −1 0 1 2 3 4 −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 ∞O 50 20 10 5 2 −4 −3 −2 −1 0 1 2 3 4 −2 −3 −4 −5 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 −4 −3 −2 −1 0 1 2 3 4 −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 Q −4 −3 −2 −1 0 1 2 3 4 P −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 Q −4 −3 −2 −1 0 1 2 3 4 P −2 −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /R : y 2 = x 3 − 3x + 1 6 O 4 2 Q −4 −3 −2 −1 0 P −2 1 2 3 4 P +Q −4 −6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 15 / 43 Elliptic curves and group law E /F17 : y 2 = x 3 + x + 7 16 O 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 0 1 2 3 4 5 6 7 8 9 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 11 12 13 14 15 16 15 / 43 Elliptic curves and group law E /F17 : y 2 = x 3 + x + 7 16 O 15 14 13 12 11 10 9 8 7 6 5 4 3 2 Q 1 P 0 0 1 2 3 4 5 6 7 8 9 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 11 12 13 14 15 16 15 / 43 Elliptic curves and group law E /F17 : y 2 = x 3 + x + 7 16 O 15 14 13 12 11 10 9 8 7 6 5 4 3 2 Q 1 P 0 0 1 2 3 4 5 6 7 8 9 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 11 12 13 14 15 16 15 / 43 Elliptic curves and group law E /F17 : y 2 = x 3 + x + 7 16 O 15 14 13 12 11 10 9 8 P +Q 7 6 5 4 3 2 Q 1 P 0 0 1 2 3 4 5 6 7 8 9 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 10 11 12 13 14 15 16 15 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field Fq , then E (K ) is a finite abelian group Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field Fq , then E (K ) is a finite abelian group • let P ∈ E (Fq ), with P 6= O Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field Fq , then E (K ) is a finite abelian group • let P ∈ E (Fq ), with P 6= O • consider 2P = P + P, then 3P = P + P + P, etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field Fq , then E (K ) is a finite abelian group • let P ∈ E (Fq ), with P 6= O • consider 2P = P + P, then 3P = P + P + P, etc. • since E (Fq ) is finite, take the smallest ` > 0 such that `P = O Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field • • • • Fq , then E (K ) is a finite abelian group let P ∈ E (Fq ), with P 6= O consider 2P = P + P, then 3P = P + P + P, etc. since E (Fq ) is finite, take the smallest ` > 0 such that `P = O define G as G = {O, P, 2P, 3P, . . . , (` − 1)P} Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field • • • • • Fq , then E (K ) is a finite abelian group let P ∈ E (Fq ), with P 6= O consider 2P = P + P, then 3P = P + P + P, etc. since E (Fq ) is finite, take the smallest ` > 0 such that `P = O define G as G = {O, P, 2P, 3P, . . . , (` − 1)P} G is a cyclic subgroup of E (Fq ), of order `, and P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography is a generator of G 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field • • • • • Fq , then E (K ) is a finite abelian group let P ∈ E (Fq ), with P 6= O consider 2P = P + P, then 3P = P + P + P, etc. since E (Fq ) is finite, take the smallest ` > 0 such that `P = O define G as G = {O, P, 2P, 3P, . . . , (` − 1)P} G is a cyclic subgroup of E (Fq ), of order `, and P is a generator of I The scalar multiplication in base P gives an isomorphism between expP : Z/`Z k G Z/`Z and G: −→ G 7−→ kP = P | +P + {z. . . + P} k times Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 16 / 43 Scalar multiplication and discrete logarithm E /K : y 2 = x 3 + Ax + B I If K is a finite field • • • • • Fq , then E (K ) is a finite abelian group let P ∈ E (Fq ), with P 6= O consider 2P = P + P, then 3P = P + P + P, etc. since E (Fq ) is finite, take the smallest ` > 0 such that `P = O define G as G = {O, P, 2P, 3P, . . . , (` − 1)P} G is a cyclic subgroup of E (Fq ), of order `, and P is a generator of I The scalar multiplication in base P gives an isomorphism between expP : Z/`Z k G Z/`Z and G: −→ G 7−→ kP = P | +P + {z. . . + P} k times I The inverse map is the so-called discrete logarithm (in base P): dlogP = exp−1 : P G −→ Q 7−→ Z/`Z Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography k such that Q = kP 16 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: P k Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k I Under a few conditions, the discrete logarithm can only be computed in exponential time (as far as we know): Q = kP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k I Under a few conditions, the discrete logarithm can only be computed in exponential time (as far as we know): Q = kP k Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k I Under a few conditions, the discrete logarithm can only be computed in exponential time (as far as we know): Q = kP k I That’s a one-way function Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k I Under a few conditions, the discrete logarithm can only be computed in exponential time (as far as we know): Q = kP k I That’s a one-way function ⇒ public-key cryptography! Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Towards elliptic curve cryptography I Scalar multiplication can be computed in polynomial time: kPP k I Under a few conditions, the discrete logarithm can only be computed in exponential time (as far as we know): Q = kP k I That’s a one-way function ⇒ public-key cryptography! • secret key: an integer k in Z/`Z • public key: the point kP in G ⊆ E (Fq ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 17 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel Alice Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice ? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob ? 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice ? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob ? 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice ? P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob ? P 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice a ? P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob b ? P 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice a ? aPP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bob b ? bPP 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice ? a aP aPP Bob b ? bPP bP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 18 / 43 Example protocol: EC Diffie–Hellman key exchange I Alice and Bob want to establish a secure communication channel I How can they decide upon a shared secret key over a public channel? Alice ? a aP aabPPP Bob b ? abP bPP bP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 18 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. I Other important operations might be required, such as pairings Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. I Other important operations might be required, such as pairings I Several algorithmic and arithmetic layers: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. I Other important operations might be required, such as pairings I Several algorithmic and arithmetic layers: • scalar multiplication Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. I Other important operations might be required, such as pairings I Several algorithmic and arithmetic layers: • scalar multiplication • elliptic curve arithmetic (point addition, point doubling, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Central operation: the scalar multiplication I Elliptic curve Diffie–Hellman (ECDH): • Alice: QA ← aP and K ← aQB (2 scalar mults) • Bob: QB ← bP and K ← bQA (2 scalar mults) I Elliptic curve Digital Signature Algorithm (ECDSA): • Alice (KeyGen): QA ← aP (1 scalar mult) • Alice (Sign): R ← kP (1 scalar mult) 0 • Bob (Verify): R ← uP + vQA (2 scalar mults) I etc. I Other important operations might be required, such as pairings I Several algorithmic and arithmetic layers: • scalar multiplication • elliptic curve arithmetic (point addition, point doubling, etc.) • finite field arithmetic (addition, multiplication, inversion, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 19 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. • at the curve arithmetic level: PARI, Sage (not for crypto!) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. • at the curve arithmetic level: PARI, Sage (not for crypto!) • at the field arithmetic level: MPFQ, GF2X, NTL, GMP, etc. Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. • at the curve arithmetic level: PARI, Sage (not for crypto!) • at the field arithmetic level: MPFQ, GF2X, NTL, GMP, etc. I Available open-source hardware implementations of ECC: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. • at the curve arithmetic level: PARI, Sage (not for crypto!) • at the field arithmetic level: MPFQ, GF2X, NTL, GMP, etc. I Available open-source hardware implementations of ECC: • implementation of NaCl’s crypto box (Ed25519 + Salsa20 + Poly1305) in 29.3 to 32.6 kGE [Hutter et al., 2015] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Available implementations I There already exist several free-software, open-source implementations of ECC (or of useful layers thereof): • at the protocol level: GnuPG, OpenSSL, GnuTLS, OpenSSH, cryptlib, etc. • at the cryptographic primitive level: RELIC, NaCl (Ed25519), crypto++, etc. • at the curve arithmetic level: PARI, Sage (not for crypto!) • at the field arithmetic level: MPFQ, GF2X, NTL, GMP, etc. I Available open-source hardware implementations of ECC: • implementation of NaCl’s crypto box (Ed25519 + Salsa20 + Poly1305) in 29.3 to 32.6 kGE [Hutter et al., 2015] • PAVOIS project: ECC cryptoprocessor designed to evaluate algorithmic and arithmetic protections against side-channel attacks [See A. Tisserand’s talk] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 20 / 43 Outline I Some encryption mechanisms I Elliptic curve cryptography I Scalar multiplication I Elliptic curve arithmetic I Finite field arithmetic Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 21 / 43 Scalar multiplication I Given k in Z/`Z and P in G ⊆ E (Fq ), we want to compute kP = P {z. . . + P} | +P + k times Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 22 / 43 Scalar multiplication I Given k in Z/`Z and P in G ⊆ E (Fq ), we want to compute kP = P {z. . . + P} | +P + k times I Size of ` (and k) for crypto applications: from 250 to 500 bits Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 22 / 43 Scalar multiplication I Given k in Z/`Z and P in G ⊆ E (Fq ), we want to compute kP = P {z. . . + P} | +P + k times I Size of ` (and k) for crypto applications: from 250 to 500 bits I Repeated addition, in O(k) complexity, is out of the question! Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 22 / 43 Double-and-add algorithm I Available operations on E (Fq ): • point addition: (Q, R) 7→ Q + R • point doubling: Q 7→ 2Q = Q + Q Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 23 / 43 Double-and-add algorithm I Available operations on E (Fq ): • point addition: (Q, R) 7→ Q + R • point doubling: Q 7→ 2Q = Q + Q I Idea: iterative algorithm based on the binary expansion of k Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 23 / 43 Double-and-add algorithm I Available operations on E (Fq ): • point addition: (Q, R) 7→ Q + R • point doubling: Q 7→ 2Q = Q + Q I Idea: iterative algorithm based on the binary expansion of k • start from the most significant bit of k • double current result at each step • add P if the corresponding bit of k is 1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 23 / 43 Double-and-add algorithm I Available operations on E (Fq ): • point addition: (Q, R) 7→ Q + R • point doubling: Q 7→ 2Q = Q + Q I Idea: iterative algorithm based on the binary expansion of k • • • • start from the most significant bit of k double current result at each step add P if the corresponding bit of k is 1 same principle as binary exponentiation Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 23 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = O 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = P ·2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 2P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = P ·2+P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 3P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (P · 2 + P) · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 6P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (P · 2 + P) · 22 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 12P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (P · 2 + P) · 22 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 13P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = ((P · 2 + P) · 22 + P) · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 26P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = ((P · 2 + P) · 22 + P) · 22 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 52P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = ((P · 2 + P) · 22 + P) · 22 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 53P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((P · 2 + P) · 22 + P) · 22 + P) · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 106P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((P · 2 + P) · 22 + P) · 22 + P) · 2 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 107P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = ((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 214P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = ((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 215P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 + P) · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 430P 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 + P) · 2 + P = 431P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 + P) · 2 + P = 431P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Double-and-add algorithm I Denoting by (kn−1 . . . k1 k0 )2 , with n = dlog2 `e, the binary expansion of k: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I Example: k = 431 = (110101111)2 T = (((((P · 2 + P) · 22 + P) · 22 + P) · 2 + P) · 2 + P) · 2 + P = 431P I Complexity in O(n) = O(log2 `) operations over E (Fq ): • n − 1 doublings, and • n/2 additions on average Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 24 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = O 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = 6P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 6P 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = 6P · 23 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 48P 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = 6P · 23 + 5P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 53P 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 424P 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 + 7P = 431P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 + 7P = 431P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 + 7P = 431P I Complexity: • n − w doublings, and • (1 − 2−w )n/w additions on average Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 + 7P = 431P I Complexity: • n − w doublings, and • (1 − 2−w )n/w additions on average I Select w carefully so that precomputation cost does not become predominant Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Windowed method I Consider 2w -ary expansion of k: i.e., split k into w -bit chunks I Precompute 2P, 3P, . . . , (2w − 1)P: • 2w −1 − 1 doublings, and • 2w −1 − 1 additions I Example with w = 3: k = 431 = (110 101 111)2 = (657)23 T = (6P · 23 + 5P) · 23 + 7P = 431P I Complexity: • n − w doublings, and • (1 − 2−w )n/w additions on average I Select w carefully so that precomputation cost does not become predominant I Sliding window variant: half as many precomputations Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 25 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I At step i, point addition T ← T + P is computed if and only if ki = 1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T Power I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k • simple power analysis (SPA) will leak bits of k Time Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P return T Power I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k • simple power analysis (SPA) will leak bits of k 1 0 0 1 1 0 1 0 0 1 Time Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P else: Z ←T +P return T Power I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k • simple power analysis (SPA) will leak bits of k 1 0 0 1 1 0 1 0 0 1 Time I Use double-and-add-always algorithm? Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P else: Z ←T +P return T Power I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k • simple power analysis (SPA) will leak bits of k 1 0 0 1 1 0 1 0 0 1 Time I Use double-and-add-always algorithm? • the result of the point addition is used if and only if ki = 1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 Security issues I Back to the double-and-add algorithm: function scalar-mult(k, P): T ←O for i ← n − 1 downto 0: T ← 2T if ki = 1: T ←T +P else: Z ←T +P return T Power I At step i, point addition T ← T + P is computed if and only if ki = 1 • careful timing analysis will reveal Hamming weight of secret k • simple power analysis (SPA) will leak bits of k 1 0 0 1 1 0 1 0 0 1 Time I Use double-and-add-always algorithm? • the result of the point addition is used if and only if ki = 1 ⇒ vulnerable to fault attacks [See A. Tisserand’s lecture] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 26 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = T1 = P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = = O P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = T1 = P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = = O P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P T1 = P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = = P P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P T1 = P · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = P = 2P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P T1 = P · 2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = P = 2P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P T1 = P · 2 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = P = 3P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 2 T1 = P · 2 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 2P = 3P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 2 T1 = P · 2 + P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 2P = 3P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 2 T1 = P · 2 + P + 2P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 2P = 5P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T 0 = P · 22 T1 = P · 2 + P + 2P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 4P = 5P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T 0 = P · 22 T1 = P · 2 + P + 2P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 4P = 5P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P T1 = P · 2 + P + 2P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography = 9P = 5P 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P = 9P T1 = (P · 2 + P + 2P) · 2 = 10P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P = 9P T1 = (P · 2 + P + 2P) · 2 = 10P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P + 10P = 19P T1 = (P · 2 + P + 2P) · 2 = 10P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P + 10P = 19P T1 = (P · 2 + P + 2P) · 22 = 20P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 The Montgomery ladder I Algorithm proposed by Montgomery in 1987: function scalar-mult(k, P): T0 ← O T1 ← P for i ← n − 1 downto 0: if ki = 1: T0 ← T0 + T1 T1 ← 2T1 else: T1 ← T0 + T1 T0 ← 2T0 return T0 I Properties: • perform one addition and one doubling at each step • ensure that both results are used in the next step • loop invariant: T1 = T0 + P I Example: k = 19 = (10011)2 T0 = P · 22 + 5P + 10P = 19P T1 = (P · 2 + P + 2P) · 22 = 20P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 27 / 43 Outline I Some encryption mechanisms I Elliptic curve cryptography I Scalar multiplication I Elliptic curve arithmetic I Finite field arithmetic Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 28 / 43 Addition and doubling O Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography E 29 / 43 Addition and doubling O E Q P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O E LP,Q Q P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O E LP,Q R0 Q P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O LR 0 ,O E LP,Q R0 Q P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O LR 0 ,O E LP,Q R0 Q P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography R =P +Q 29 / 43 Addition and doubling O E P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O E LP,P P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O E LP,P R0 P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O LR 0 ,O E LP,P R0 P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling O LR 0 ,O E LP,P R0 P R = 2P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 29 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) I If P 6= −Q, then P + Q = R = (xR , yR ) with xR = λ2 − xP − xQ where yQ − yP x −x Q P λ= 2 3x + A P 2yP and yR = λ(xP − xR ) − yP if P 6= Q (addition), or if P = Q (doubling) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) I If P 6= −Q, then P + Q = R = (xR , yR ) with xR = λ2 − xP − xQ where yQ − yP x −x Q P λ= 2 3x + A P 2yP and yR = λ(xP − xR ) − yP if P 6= Q (addition), or if P = Q (doubling) I Cost (number of multiplications, squarings, and inversions in Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Fq ): 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) I If P 6= −Q, then P + Q = R = (xR , yR ) with xR = λ2 − xP − xQ where yQ − yP x −x Q P λ= 2 3x + A P 2yP and yR = λ(xP − xR ) − yP if P 6= Q (addition), or if P = Q (doubling) I Cost (number of multiplications, squarings, and inversions in Fq ): • addition: 2M + 1S + 1I Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) I If P 6= −Q, then P + Q = R = (xR , yR ) with xR = λ2 − xP − xQ where yQ − yP x −x Q P λ= 2 3x + A P 2yP and yR = λ(xP − xR ) − yP if P 6= Q (addition), or if P = Q (doubling) I Cost (number of multiplications, squarings, and inversions in Fq ): • addition: 2M + 1S + 1I • doubling: 2M + 2S + 1I Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Addition and doubling formulae E /Fq : y 2 = x 3 + Ax + B I Let P = (xP , yP ) and Q = (xQ , yQ ) ∈ E (Fq )\{O} (affine coordinates) I The opposite of P is −P = (xP , −yP ) I If P 6= −Q, then P + Q = R = (xR , yR ) with xR = λ2 − xP − xQ where yQ − yP x −x Q P λ= 2 3x + A P 2yP and yR = λ(xP − xR ) − yP if P 6= Q (addition), or if P = Q (doubling) I Cost (number of multiplications, squarings, and inversions in Fq ): • addition: 2M + 1S + 1I • doubling: 2M + 2S + 1I ⇒ field inversion is expensive! Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 30 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over Fq by using Z as the denominator Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over • addition: 12M + 2S • doubling: 7M + 5S Fq by using Z as the denominator Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over • addition: 12M + 2S • doubling: 7M + 5S Fq by using Z as the denominator I Jacobian coordinates: points (X : Y : Z ) with (x, y ) = (X /Z 2 , Y /Z 3 ) E /Fq : Y 2 = X 3 + AXZ 4 + BZ 6 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over • addition: 12M + 2S • doubling: 7M + 5S Fq by using Z as the denominator I Jacobian coordinates: points (X : Y : Z ) with (x, y ) = (X /Z 2 , Y /Z 3 ) E /Fq : Y 2 = X 3 + AXZ 4 + BZ 6 • addition: 12M + 4S • doubling: 4M + 6S Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over • addition: 12M + 2S • doubling: 7M + 5S Fq by using Z as the denominator I Jacobian coordinates: points (X : Y : Z ) with (x, y ) = (X /Z 2 , Y /Z 3 ) E /Fq : Y 2 = X 3 + AXZ 4 + BZ 6 • addition: 12M + 4S • doubling: 4M + 6S I And many others: modified jacobian coordinates, López–Dahab (over Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography F2 ), etc. n 31 / 43 Other coordinate systems E /Fq : y 2 = x 3 + Ax + B I One can use other coordinate systems which provide more efficient formulae I Projective coordinates: points (X : Y : Z ) with (x, y ) = (X /Z , Y /Z ) E /Fq : Y 2 Z = X 3 + AXZ 2 + BZ 3 • idea: get rid of the inversion over • addition: 12M + 2S • doubling: 7M + 5S Fq by using Z as the denominator I Jacobian coordinates: points (X : Y : Z ) with (x, y ) = (X /Z 2 , Y /Z 3 ) E /Fq : Y 2 = X 3 + AXZ 4 + BZ 6 • addition: 12M + 4S • doubling: 4M + 6S I And many others: modified jacobian coordinates, López–Dahab (over F2 ), etc. n I Explicit-Formula Database (by Bernstein and Lange): http://hyperelliptic.org/EFD/ Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 31 / 43 Outline I Some encryption mechanisms I Elliptic curve cryptography I Scalar multiplication I Elliptic curve arithmetic I Finite field arithmetic Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 32 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq I Typical finite fields Fq : • prime field Fp , with p an n-bit prime and n between 250 and 500 bits • binary field F2n , with n prime and between 250 and 500 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq I Typical finite fields Fq : • prime field Fp , with p an n-bit prime and n between 250 and 500 bits • binary field F2n , with n prime and between 250 and 500 (... still secure?) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq I Typical finite fields Fq : • prime field Fp , with p an n-bit prime and n between 250 and 500 bits • binary field F2n , with n prime and between 250 and 500 (... still secure?) I What we have at our disposal: • basic integer arithmetic (addition, multiplication) • left and right shifts • bitwise logic operations (bitwise NOT, AND, etc.) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq I Typical finite fields Fq : • prime field Fp , with p an n-bit prime and n between 250 and 500 bits • binary field F2n , with n prime and between 250 and 500 (... still secure?) I What we have at our disposal: • basic integer arithmetic (addition, multiplication) • left and right shifts • bitwise logic operations (bitwise NOT, AND, etc.) I ... on w -bit words: • w = 32 or 64 on CPUs • w = 8 or 16 bits on microcontrollers • a bit more flexibility in hardware (but integer arithmetic with w > 64 bits is hard!) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Implementing finite field arithmetic I Group law over E (Fq ) requires: • additions / subtractions over Fq • multiplications / squarings over Fq • a few inversions over Fq I Typical finite fields Fq : • prime field Fp , with p an n-bit prime and n between 250 and 500 bits • binary field F2n , with n prime and between 250 and 500 (... still secure?) I What we have at our disposal: • basic integer arithmetic (addition, multiplication) • left and right shifts • bitwise logic operations (bitwise NOT, AND, etc.) I ... on w -bit words: • w = 32 or 64 on CPUs • w = 8 or 16 bits on microcontrollers • a bit more flexibility in hardware (but integer arithmetic with w > 64 bits is hard!) ⇒ elements of Fq represented using several words Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 33 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P n A Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 w a3 w w w a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 w w w w a3 a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : + a3 a2 a1 a0 b3 b2 b1 b0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition + a3 a2 a1 a0 b3 b2 b1 b0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry c + a3 a2 a1 a0 b3 b2 b1 b0 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry c + a3 a2 a1 a0 b3 b2 b1 b0 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry c + a3 a2 a1 a0 b3 b2 b1 b0 r1 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry c + a3 a2 a1 a0 b3 b2 b1 b0 r2 r1 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry + c a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry + c a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry • might need reduction modulo P: compare then subtract (in constant time!) + c ? ≥ a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry • might need reduction modulo P: compare then subtract (in constant time!) + c ? ≥ a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 p3 p2 p1 p0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • right-to-left word-wise addition • need to propagate carry • might need reduction modulo P: compare then subtract (in constant time!) + c − a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 p3 p2 p1 p0 r30 r20 r10 r00 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 Multiprecision representation I Consider A ∈ FP , with P an n-bit prime • represent A as an integer modulo P • split A into k = dn/w e w -bit words (or limbs), ak−1 , ..., a1 , a0 : A = ak−1 2(k−1)w + · · · + a1 2w + a0 I Addition of A and B ∈ FP : • • • • right-to-left word-wise addition need to propagate carry might need reduction modulo P: compare then subtract (in constant time!) lazy reduction: if kw > n, do not reduce after each addition + c − a3 a2 a1 a0 b3 b2 b1 b0 r3 r2 r1 r0 p3 p2 p1 p0 r30 r20 r10 r00 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 34 / 43 MP multiplication I Multiplication of A and B ∈ FP : × Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography a3 a2 a1 a0 b3 b2 b1 b0 35 / 43 MP multiplication I Multiplication of A and B ∈ FP : • schoolbook method: k 2 w -by-w -bit products × a3 a2 a1 a0 b3 b2 b1 b0 a2 b0 a3 b3 a0 b0 a3 b0 a1 b0 a2 b1 a0 b1 a3 b1 a1 b1 a2 b2 a0 b2 a3 b2 a1 b2 a2 b3 a0 b3 a1 b3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 35 / 43 MP multiplication I Multiplication of A and B ∈ FP : • schoolbook method: k 2 w -by-w -bit products × a3 a2 a1 a0 b3 b2 b1 b0 a2 b0 a0 b0 + a3 b0 a1 b0 + a2 b1 a0 b1 + a3 b1 a1 b1 + a2 b2 a0 b2 + a3 b2 a1 b2 + a2 b3 a0 b3 + a3 b3 r7 a1 b3 r6 r5 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography r3 r2 r1 r0 35 / 43 MP multiplication I Multiplication of A and B ∈ FP : • schoolbook method: k 2 w -by-w -bit products • subquadratic algorithms (e.g., Karatsuba) when k is large × a3 a2 a1 a0 b3 b2 b1 b0 a2 b0 a0 b0 + a3 b0 a1 b0 + a2 b1 a0 b1 + a3 b1 a1 b1 + a2 b2 a0 b2 + a3 b2 a1 b2 + a2 b3 a0 b3 + a3 b3 r7 a1 b3 r6 r5 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography r3 r2 r1 r0 35 / 43 MP multiplication I Multiplication of A and B ∈ FP : • schoolbook method: k 2 w -by-w -bit products • subquadratic algorithms (e.g., Karatsuba) when k is large • final product fits into 2k words → requires reduction modulo P (see later) × a3 a2 a1 a0 b3 b2 b1 b0 a2 b0 a0 b0 + a3 b0 a1 b0 + a2 b1 a0 b1 + a3 b1 a1 b1 + a2 b2 a0 b2 + a3 b2 a1 b2 + a2 b3 a0 b3 + a3 b3 r7 a1 b3 r6 r5 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography r3 r2 r1 r0 35 / 43 MP multiplication I Multiplication of A and B ∈ FP : • schoolbook method: k 2 w -by-w -bit products • subquadratic algorithms (e.g., Karatsuba) when k is large • final product fits into 2k words → requires reduction modulo P (see later) • should run in constant time (for fixed P)! × a3 a2 a1 a0 b3 b2 b1 b0 a2 b0 a0 b0 + a3 b0 a1 b0 + a2 b1 a0 b1 + a3 b1 a1 b1 + a2 b2 a0 b2 + a3 b2 a1 b2 + a2 b3 a0 b3 + a3 b3 r7 a1 b3 r6 r5 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography r3 r2 r1 r0 35 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P 2n A P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) 2n A P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) 2n A P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL n n AH AL Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) AL AH Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) AL c · AH Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) • rinse & repeat (one 1 × 1-word multiplication) AL c · AH + A0H A0L ≤w Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) • rinse & repeat (one 1 × 1-word multiplication) AL + c · AH A0L c · A0H Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) • rinse & repeat (one 1 × 1-word multiplication) AL c · AH + A0L c · A0H + A00 ≤1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) • rinse & repeat (one 1 × 1-word multiplication) • final subtraction might be necessary AL c · AH + A0L c · A0H + A00 − P A mod P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction I Given an integer A < P 2 (on 2k words), compute R = A mod P I Easy case: P is a pseudo-Mersenne prime P = 2n − c with c “small” (e.g., < 2w ) • then 2n ≡ c (mod P) • split A wrt. 2n : A = AH 2n + AL • compute A0 ← c · AH + AL (one 1 × k-word multiplication) • rinse & repeat (one 1 × 1-word multiplication) • final subtraction might be necessary I Examples: P = 2255 − 19 (Curve25519) or P = 2448 − 2224 − 1 (Ed448-Goldilocks) AL c · AH + A0L c · A0H + A00 − P A mod P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 36 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: p3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p2 p1 p0 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) 0. 0 0 0 p40 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p30 p20 p10 p00 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) p40 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p30 p20 p10 p00 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c a7 a6 a5 p40 p30 p20 p10 p00 a4 a3 a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c a7 a6 a5 p40 p30 p20 p10 p00 a4 a3 a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) × q̃8 q̃7 q̃6 q̃5 p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃4 q̃3 q̃2 q̃1 q̃0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) × q̃8 q̃7 q̃6 q̃5 p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃4 q̃3 q̃2 q̃1 q̃0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) • compute à ← Q̃ · P (one k × k-word multiplication) × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 p3 p2 p1 p0 ã3 ã2 ã1 ã0 × ã7 ã6 ã5 ã4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) • compute à ← Q̃ · P (one k × k-word multiplication) × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 p3 p2 p1 p0 × ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) • compute à ← Q̃ · P (one k × k-word multiplication) • compute remainder R ← A − Ã × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 p3 p2 p1 p0 × − ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 + a7 a6 a5 a4 a3 a2 a1 a0 r3 r2 r1 r0 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) • compute à ← (Q̃ · P) mod 2(k+1)w (one k × k-word short multiplication) • compute remainder R ← (A − Ã) mod 2(k+1)w × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 p3 p2 p1 p0 × − ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 + a7 a6 a5 a4 a3 a2 a1 a0 r3 r2 r1 r0 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Idea: find quotient Q = bA/Pc, then take remainder as A − QP • Euclidean division is way too expensive! • since P is fixed, precompute 1/P with enough precision I Barrett reduction: • precompute P 0 = b22kw /Pc (k + 1 words) • given A < P 2 , get the k + 1 most significant words AH ← bA/2(k−1)w c • compute Q̃ ← bAH · P 0 /2(k+1)w c (one (k + 1) × (k + 1)-word multiplication) • compute à ← (Q̃ · P) mod 2(k+1)w (one k × k-word short multiplication) • compute remainder R ← (A − Ã) mod 2(k+1)w • since Q − 2 ≤ Q̃ ≤ Q, at most two final subtractions × p40 p30 p20 p10 p00 a7 a6 a5 a4 a3 q̃9 q̃8 q̃6 q̃5 p3 p2 p1 p0 × − ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 + a7 a6 a5 a4 a3 a2 a1 a0 r3 r2 r1 r0 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 37 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P p3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p2 p1 p0 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) p30 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p20 p10 p00 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) a7 a6 a5 a4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p30 p20 p10 p00 a3 a2 a1 a0 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) a7 a6 a5 a4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography p30 p20 p10 p00 a3 a2 a1 a0 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) × p30 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) × p30 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 ã3 ã2 ã1 ã0 × ã7 ã6 ã5 ã4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← A + Ã × p30 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← A + Ã × r8 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × + p30 ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 r7 r6 r5 r4 r3 r2 r1 r0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← A + Ã × r8 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × + p30 ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 r7 r6 r5 r4 0 0 0 0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← A + Ã × r8 p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × + p30 ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 r7 r6 r5 r4 0 0 0 0 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw × p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × + p30 ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 r7 r6 r5 r4 r8 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction × p20 p10 p00 a7 a6 a5 a4 a3 a2 a1 a0 k7 k6 k5 k4 k3 k2 k1 k0 p3 p2 p1 p0 × + p30 ã7 ã6 ã5 ã4 ã3 ã2 ã1 ã0 a7 a6 a5 a4 a3 a2 a1 a0 r7 r6 r5 r4 r8 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P • if Z = (X · Y ) mod P, then REDC(X̂ · Ŷ ) = (X · Y · 2kw ) mod P = Ẑ → that’s the so-called Montgomery multiplication Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P • if Z = (X · Y ) mod P, then REDC(X̂ · Ŷ ) = (X · Y · 2kw ) mod P = Ẑ → that’s the so-called Montgomery multiplication • conversions: X̂ = REDC(X , 22kw mod P) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography and X = REDC(X̂ , 1) 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P • if Z = (X · Y ) mod P, then REDC(X̂ · Ŷ ) = (X · Y · 2kw ) mod P = Ẑ → that’s the so-called Montgomery multiplication • conversions: X̂ = REDC(X , 22kw mod P) and X = REDC(X̂ , 1) • Montgomery representation is compatible with addition / subtraction in Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography FP 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P • if Z = (X · Y ) mod P, then REDC(X̂ · Ŷ ) = (X · Y · 2kw ) mod P = Ẑ → that’s the so-called Montgomery multiplication • conversions: X̂ = REDC(X , 22kw mod P) and X = REDC(X̂ , 1) • Montgomery representation is compatible with addition / subtraction in FP ⇒ do all computations in Montgomery repr. instead of converting back and forth Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP modular reduction: general case I Montgomery reduction (REDC): like Barrett, but on the least significant words • requires P odd (on k words) and A < 2kw P • precompute P 0 ← (−P −1 ) mod 2kw (on k words) • given A, compute K ← (A · P 0 ) mod 2kw (one k × k-word short multiplication) • compute à ← K · P (one k × k-word multiplication) • compute remainder R ← (A + Ã)/2kw • at most one extra subtraction I REDC(A) returns R = (A · 2−kw ) mod P, not A mod P! • represent X ∈ FP in Montgomery representation: X̂ = (X · 2kw ) mod P • if Z = (X · Y ) mod P, then REDC(X̂ · Ŷ ) = (X · Y · 2kw ) mod P = Ẑ → that’s the so-called Montgomery multiplication • conversions: X̂ = REDC(X , 22kw mod P) and X = REDC(X̂ , 1) • Montgomery representation is compatible with addition / subtraction in FP ⇒ do all computations in Montgomery repr. instead of converting back and forth I REDC can be computed iteratively (one word at a time) and interleaved with the computation of X̂ · Ŷ Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 38 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S A2 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S A2 S A4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S 2 A S2 A8 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S 2 A S2 A9 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S 2 A S2 A9 A11 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S 2 A S2 A9 A11 S Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 5 A2 −1 39 / 43 MP field inversion I Given A ∈ F∗P , compute A−1 mod P I Extended Euclidean algorithm: • compute Bézout’s coefficients: U and V such that UA + VP = gcd(A, P) = 1 • then UA ≡ 1 (mod P) and A−1 ≡ U (mod P) • fast, but running time depends on A ⇒ requires randomization of A to protect against timing attacks I Fermat’s little theorem: • we know that AP−1 ≡ 1 (mod P), whence AP−2 ≡ A−1 (mod P) • precompute short sequence of squarings and multiplications for fast exponentiation of A • example: P = 2255 − 19 in 11M and 254S [Bernstein, 2006] A S 2 A S2 A 9 A 11 S 25 −1 A S5 210 −1 A S10 A2 20 −1 S20 A 2255 −21 S5 A 2250 −1 S50 A 2100 −1 S100 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography A 2100 −1 S50 250 −1 A S10 A2 40 −1 39 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 40 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci • write M = k Y mi and, for all i, Mi = M/mi i=1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 40 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci • write M = k Y mi and, for all i, Mi = M/mi i=1 I Let A < M be an integer Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 40 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci • write M = k Y mi and, for all i, Mi = M/mi i=1 I Let A < M be an integer → − • represent A as the tuple A = (a1 , . . . , ak ) with ai = A mod mi = |A|mi , for all i → that is the RNS representation of A in base B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 40 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci • write M = k Y mi and, for all i, Mi = M/mi i=1 I Let A < M be an integer → − • represent A as the tuple A = (a1 , . . . , ak ) with ai = A mod mi = |A|mi , for all i → that is the RNS representation of A in base B → − • given A = (a1 , . . . , ak ), retrieve the unique corresponding integer A ∈ Z/M Z using the Chinese remaindering theorem (CRT): k X −1 A= |ai · Mi |mi · Mi i=1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography M 40 / 43 The Residue Number System (RNS) I Let B = (m1 , . . . , mk ) a tuple of k pairwise coprime integers • typically, the mi ’s are chosen to fit in a machine word (w bits) • pseudo-Mersenne primes allow for easy reduction modulo mi : mi = 2w − ci , with small ci • write M = k Y mi and, for all i, Mi = M/mi i=1 I Let A < M be an integer → − • represent A as the tuple A = (a1 , . . . , ak ) with ai = A mod mi = |A|mi , for all i → that is the RNS representation of A in base B → − • given A = (a1 , . . . , ak ), retrieve the unique corresponding integer A ∈ Z/M Z using the Chinese remaindering theorem (CRT): k X −1 A= |ai · Mi |mi · Mi i=1 I If M > P, we can represent elements of M FP Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography in RNS 40 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) → − A → − B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) → − A → − B Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) a1 a2 a3 a4 b1 b2 b3 b4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) a1 × b1 a2 × b2 a3 × b3 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography a4 × b4 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) a1 × b1 a2 × b2 a3 × b3 a4 × b4 r1 r2 r3 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) • native parallelism: suited to SIMD instructions and hardware implementation a1 × b1 a2 × b2 a3 × b3 a4 × b4 r1 r2 r3 r4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) • native parallelism: suited to SIMD instructions and hardware implementation a1 × b1 a2 × b2 a3 × b3 a4 × b4 r1 r2 r3 r4 I Limitations: Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) • native parallelism: suited to SIMD instructions and hardware implementation a1 × b1 a2 × b2 a3 × b3 a4 × b4 r1 r2 r3 r4 I Limitations: • operations are computed in Z/M Z: beware of overflows! (we need M > P 2 ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) • native parallelism: suited to SIMD instructions and hardware implementation a1 × b1 a2 × b2 a3 × b3 a4 × b4 a5 × b5 a6 × b6 a7 × b7 a8 × b8 r1 r2 r3 r4 r5 r6 r7 r8 I Limitations: • operations are computed in Z/M Z: beware of overflows! (we need M > P 2 ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS arithmetic → − → − I Let A = (a1 , . . . , ak ) and B = (b1 , . . . , bk ) • add., sub. and mult. can be performed in parallel on all “channels”: → − → − A ± B = (|a1 ± b1 |m1 , . . . , |ak ± bk |mk ) → − → − A × B = (|a1 × b1 |m1 , . . . , |ak × bk |mk ) • native parallelism: suited to SIMD instructions and hardware implementation a1 × b1 a2 × b2 a3 × b3 a4 × b4 a5 × b5 a6 × b6 a7 × b7 a8 × b8 r1 r2 r3 r4 r5 r6 r7 r8 I Limitations: • operations are computed in Z/M Z: beware of overflows! (we need M > P 2 ) • RNS modular reduction has quadratic complexity O(k 2 ) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 41 / 43 RNS Montgomery reduction I Requires two RNS bases Bα = (mα,1 , . . . , mα,k ) and Bβ = (mβ,1 , . . . , mβ,k ) such that Mα > P, Mβ > P, and gcd(Mα , Mβ ) = 1 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 42 / 43 RNS Montgomery reduction I Requires two RNS bases Bα = (mα,1 , . . . , mα,k ) and Bβ = (mβ,1 , . . . , mβ,k ) such that Mα > P, Mβ > P, and gcd(Mα , Mβ ) = 1 I RNS base extension algorithm (BE) [Kawamura et al., 2000] − → − → − → • given Xα in base Bα , BE(Xα , Bα , Bβ ) computes Xβ , the repr. of X in base Bβ − → − → • similarly, BE(Xβ , Bβ , Bα ) computes Xα in base Bα Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 42 / 43 RNS Montgomery reduction I Requires two RNS bases Bα = (mα,1 , . . . , mα,k ) and Bβ = (mβ,1 , . . . , mβ,k ) such that Mα > P, Mβ > P, and gcd(Mα , Mβ ) = 1 I RNS base extension algorithm (BE) [Kawamura et al., 2000] − → − → − → • given Xα in base Bα , BE(Xα , Bα , Bβ ) computes Xβ , the repr. of X in base Bβ − → − → • similarly, BE(Xβ , Bβ , Bα ) computes Xα in base Bα • similar to RNS modular reduction → O(k 2 ) complexity Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 42 / 43 RNS Montgomery reduction Bα − → Aα Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography Bβ − → Aβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 − → Aβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α 0 pα,1 0 pα,2 0 pα,3 0 pα,4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 − → Aβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 − → Aβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 − → Aβ kβ,1 kβ,2 kβ,3 kβ,4 − → Kβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 kα,1 × pα,1 kα,2 × pα,2 kα,3 × pα,3 kα,4 × pα,4 − → Kα − → Pα BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 − → Aβ − → Kβ − → Pβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Pα kα,1 × pα,1 kα,2 × pα,2 kα,3 × pα,3 kα,4 × pα,4 − → Aα + aα,1 + aα,2 + aα,3 + aα,4 − → Kα BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 − → Aβ − → Kβ − → Pβ − → Aβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Pα kα,1 × pα,1 kα,2 × pα,2 kα,3 × pα,3 kα,4 × pα,4 − → Aα + aα,1 + aα,2 + aα,3 0 0 0 − → Kα − → Tα ≡ 0 (mod Mα ) − → Aβ aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aα,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 − → Aβ 0 tβ,1 tβ,2 tβ,3 tβ,4 − → Tβ BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography − → Kβ − → Pβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography − → Aβ aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 − → Aβ tβ,1 tβ,2 tβ,3 tβ,4 − → Tβ − → Kβ − → Pβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 tβ,1 × 0 mβ,1 tβ,2 × 0 mβ,2 tβ,3 × 0 mβ,3 tβ,4 × 0 mβ,4 − → Aβ − → Kβ − → Pβ − → Aβ − → Tβ −−− −→ (Mα−1 )β 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 tβ,1 × 0 mβ,1 tβ,2 × 0 mβ,2 tβ,3 × 0 mβ,3 tβ,4 × 0 mβ,4 rβ,1 rβ,2 rβ,3 rβ,4 − → Aβ − → Kβ − → Pβ − → Aβ − → Tβ −−− −→ (Mα−1 )β − → Rβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 − → Rα rα,4 rα,3 rα,2 rα,1 BE BE Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 tβ,1 × 0 mβ,1 tβ,2 × 0 mβ,2 tβ,3 × 0 mβ,3 tβ,4 × 0 mβ,4 rβ,1 rβ,2 rβ,3 rβ,4 − → Aβ − → Kβ − → Pβ − → Aβ − → Tβ −−− −→ (Mα−1 )β − → Rβ 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 − → Rα rα,4 rα,3 rα,2 rα,1 BE BE aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 tβ,1 × 0 mβ,1 tβ,2 × 0 mβ,2 tβ,3 × 0 mβ,3 tβ,4 × 0 mβ,4 rβ,1 rβ,2 rβ,3 rβ,4 − → Aβ − → Kβ − → Pβ − → Aβ − → Tβ −−− −→ (Mα−1 )β − → Rβ − → − → I Result is (Rα , Rβ ) ≡ (A · Mα−1 ) (mod P) Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 43 / 43 RNS Montgomery reduction Bβ Bα − → Aα aα,1 aα,2 aα,3 aα,4 −−−− −→ (−P −1 )α × 0 pα,1 × 0 pα,2 × 0 pα,3 × 0 pα,4 − → Kα kα,1 kα,2 kα,3 kα,4 − → Rα rα,4 rα,3 rα,2 rα,1 BE BE aβ,1 aβ,2 aβ,3 aβ,4 kβ,1 × pβ,1 kβ,2 × pβ,2 kβ,3 × pβ,3 kβ,4 × pβ,4 + aβ,1 + aβ,2 + aβ,3 + aβ,4 tβ,1 × 0 mβ,1 tβ,2 × 0 mβ,2 tβ,3 × 0 mβ,3 tβ,4 × 0 mβ,4 rβ,1 rβ,2 rβ,3 rβ,4 − → Aβ − → Kβ − → Pβ − → Aβ − → Tβ −−− −→ (Mα−1 )β − → Rβ − → − → I Result is (Rα , Rβ ) ≡ (A · Mα−1 ) (mod P) I See also the hybrid position–residues number system [Bigou & Tisserand, 2016] Jérémie Detrey — Hardware Implementation of (Elliptic Curve) Cryptography 43 / 43 Un peu de publicité éhontée... Journées Codage & Cryptographie 2017 du 23 au 28 avril à La Bresse (Vosges) Soumission de résumés: jusqu’au 8 mars Inscriptions: jusqu’au 3 avril https://jc2-2017.inria.fr/ À très bientôt dans les Vosges !
© Copyright 2026 Paperzz