Topics in Cryptography Lecture 7 Topic: Side Channels Lecturer: Moni Naor Recap: chosen ciphertext security • Why chosen ciphertext/malleability matters • Taxonomy of Attacks and Security • Ideas for achieving CCA – Redundancy + Verification • The NIZK approach • Simple scheme achieving CCA1 – Based on DDH – Modification achieving CCA2 • Chosen-Ciphertext Security via Correlated Products • CCA and IBE • Deniable Authentication Adversarial Models STANDARD MODEL: Abstract models of computation Interactive Turing machines Private memory, randomness ... Well-defined adversarial access Can model powerful attacks REAL LIFE: Physical implementations leak information Adversarial access not always captured by abstract models Ek(m) 3 Adversarial Models Attacks in the standard model: Attacks outside the standard model: Chosen-plaintext attacks Chosen-ciphertext attacks Composition Self-referential encryption Circular encryption .... Timing attacks [Kocher 96] Fault detection [BDL 97, BS 97] Power analysis [KJJ 99] Cache attacks [OST 05] Memory attacks [HSHCPCFAF 08] ... Ek(m) 4 Adversarial Models Attacks in the standard model: Chosen-plaintext attacks Chosen-ciphertext attacks Composition Self-referential encryption Circular encryption .... Attacks outside the standard model: Timing attacks [Kocher 96] Fault detection [BDL 97, BS 97] Power analysis [KJJ 99] Cache attacks [OST 05] Memory attacks [HSHCPCFAF 08] Electromagnetic radiation analysis ... Side channel: Any information not captured by the abstract “standard” model 5 Countermeasures GOALS Preserve functionality Security HARDWARE E.g., minimizing electromagnetic leakage, “tamper-proof” devices,... Ad-hoc solutions Typically expensive or inefficient Efficiency Generic methods SOFTWARE E.g., fixed timing (indep. of input), oblivious RAM,... Many heuristics Require precise modeling 6 Thesis of this course and not only at implementation time Must incorporate side-channel attacks in the design of systems Many tools developed in the foundations of cryptography are helpful for protecting against side-channel attacks Proof by examples... 7 Side Channels • Timing Attacks • Cache Attacks • Memory Attacks slide 9 Timing Attacks • Kocher, Timing Attacks on Implementations of DiffieHellman, RSA, DSS, and Other Systems, (CRYPTO 1996) • Brumley and Boneh, Remote Timing Attacks Are Practical, (USENIX Security 2003) Slides based on Vitaly Shmatikov slide 10 Timing Attack • Basic idea: learn the system’s secret by observing how long it takes to perform various computations • Typical goal: extract private key • Extremely powerful because isolation doesn’t help – Victim could be remote – Victim could be inside its own virtual machine – Keys could be in tamper-proof storage or smartcard • Attacker wins simply by measuring response times slide 11 RSA Cryptosystem • Key generation: – Generate large primes P, Q – Compute N=PQ and (N)=(P-1)(Q-1) – Choose small e, relatively prime to (N) Why? • Typically, e=3 or e=216+1=65537 – Compute unique d such that ed = 1 mod (N) – Public key = (e,N); private key = d • Encryption of m (simplified!): c = me mod N • Decryption of c: cd mod N = (me)d mod N = m RSA Decryption • RSA decryption: compute yx mod N – A modular exponentiation operation • Naive algorithm: square and multiply: x 1 0 1 w Basic Timing Whether iteration takes a long time depends on the kth bit of secret exponent This takes a while to compute This is instantaneous Old observation: timing depends on number of 1’s If all multiplication take the same time: all you get Not all multiplications were created equal • Different timing given operands • Assumption/Heuristic: timings of subsequent multiplications are independent Exact – Given that we know the first k-1 bits of x timing – Given a guess for the kth bit of x Exact guess – Time of remaining bits independent Given measurement of total time can see whether there is correlation between events: kth step is long Total time is long Outline of Kocher’s Attack • Idea: guess some bits of the exponent; – Predict how long decryption will take • If guess is correct, will observe correlation; if incorrect, then prediction will look random – The more bits you already know, the stronger the signal, thus easier to detect (error-correction property) • Start by guessing a few top bits, look at correlations for each guess, pick the most promising candidate and continue Works against systems under direct control RSA in OpenSSL • OpenSSL: popular open-source toolkit – – – – mod_SSL (in Apache = 28% of HTTPS market) stunnel (secure TCP/IP servers) sNFS (secure NFS) Many more applications • Kocher’s attack doesn’t work against OpenSSL – Instead of square-and-multiply, OpenSSL uses CRT, sliding windows and two different multiplication algorithms for modular exponentiation • CRT = Chinese Remainder Theorem • Secret exponent is processed in chunks, not bit-by-bit Chinese Remainder Theorem • n = n1n2…nk where gcd(ni,nj)=1 when i j • The system of congruences x = x1 mod n1 = … = xk mod nk – Has a simultaneous solution x to all congruences – There exists exactly one solution x between 0 and n-1 • For RSA modulus N=PQ, to compute x mod N enough to know x mod P and x mod Q RSA Decryption With CRT • To decrypt c, need to compute m=cd mod N • Use Chinese Remainder Theorem – – – – – d1 = d mod (P-1) d2 = d mod (P-1) these are precomputed qinv = Q-1 mod P Compute m1 = cd1 mod P; m2 = cd2 mod Q Compute m = m2+(qinv*(m1-m2) mod P)*Q Attack this computation in order to learn Q Operations Involved in Decryption What is needed to compute cd mod Q and xy mod Q? • Exponentiation – Sliding windows • Multiplication routines – “Normal” - when operands have unequal length – Karatsuba - faster when operands have equal length • Modular reduction n log23 – Montgomery reduction Time of these operations is input sensitive Montgomery Reduction • Decryption requires computing m2 = cd2 mod Q • Done by repeated multiplication – Simple: square and multiply (process d2 one bit at a time) – More clever: sliding windows (process d2 in 5-bit blocks) • In either case, many multiplications modulo Q • Multiplications use Montgomery reduction – Pick some R = 2k Avoid long divisions – To compute x ¢ y mod Q: convert x and y into their Montgomery form xR mod q and yR mod q – Compute (xR * yR) * R-1 = zR mod q R a power • Multiplication by R-1 can be done very efficiently of 2 Schindler’s Observation • At the end of Montgomery reduction: if zR > Q, then need to subtract Q – Probability of this extra step is proportional to c mod Q • If c is close to Q, many subtractions will be done • If c mod Q = 0, very few subtractions – Decryption will take longer as c gets closer to Q, then become fast as c passes a multiple of Q • By playing with different values of c and observing how long decryption takes, attacker can guess Q! If all other operations are fixed! Reduction Timing Dependency Decryption time Q 2Q P Value of ciphertext c Integer Multiplication Routines • 30-40% of OpenSSL running time is spent on integer multiplication • If integers have the same number of words n, OpenSSL uses Karatsuba multiplication – Takes O(nlog23) • If integers have unequal number of words n and m, OpenSSL uses normal multiplication – Takes O(nm) Summary of Time Dependencies g<q g>q Montgomery effect Longer Shorter Multiplication effect Shorter Longer g is the decryption value (same as c) Different effects… but one will always dominate! slide 25 Attack Is Binary Search Decryption time #Reductions Mult routine 0-1 Gap Q Value of ciphertext slide 26 Attack Overview • Initial guess g for Q between 2511 and 2512 • Try all possible guesses for the top few bits • Suppose we know i-1 top bits of Q. Goal: ith bit – Set g =…known i-1 bits of Q …000000 – Set ghi=…known i-1 bits of Q …100000 (note: g<ghi) • If g<Q<ghi then the ith bit of Q is 0 • If g<ghi<Q then the ith bit of Q is 1 • Goal: decide whether g<Q<ghi or g<ghi<Q slide 27 Two Possibilities for ghi Decryption time g #Reductions Mult routine ghi? Difference in decryption times between g and ghi will be small ghi? Q Value of ciphertext Difference in decryption times between g and ghi will be large slide 28 Timing Attack Details • What is “large” and “small”? – Know from attacking previous bits • Decrypting just g does not work because of sliding windows g, g+1, …, g+ – Decrypt a neighborhood of values near g – Will increase difference between large and small values, resulting in larger 0-1 gap • Attack required only 2 hours, about 1.4 million queries to recover the private key – Only need to recover most significant half bits of q slide 29 The 0-1 Gap Zero-one gap slide 30 Extracting RSA Private Key Montgomery reduction dominates zero-one gap Multiplication routine dominates slide 31 Normal SSL Handshake Regular client 1. ClientHello 2. ServerHello SSL server (send public key) 3. ClientKeyExchange (encrypted under public key) Exchange data encrypted with new shared key slide 32 Attacking SSL Handshake 1. ClientHello Attacker 2. ServerHello SSL server (send public key) 3. Record time t1 Send guess g or ghi 4. Alert 5. Record time t2 Compute t2–t1 slide 33 Works On The Network Similar timing on WAN vs. LAN slide 34 Defenses • Require statically that all decryptions take the same time – For example, always do the extra “dummy” reduction – … but what if compiler optimizes it away? • Dynamically make all decryptions the same or multiples of the same time “quantum” – Now all decryptions have to be as slow as the slowest decryption • Use RSA blinding slide 35 RSA Blinding • Instead of decrypting ciphertext c, decrypt a random ciphertext related to c Choose random r 2 ZN* Compute x’ = c ¢ re mod Can prepare ahead – – N – Decrypt x’ to obtain m’ =x’d – Calculate original plaintext m = m’/r mod N • Since r is random, decryption time is independent of ciphertext • 2-10% performance penalty slide 36 Blinding Works slide 37 Cache Attacks • Cryptanalysis through Cache Address Leakage: Dag Arne Osvik, Adi Shamir , Eran Tromer Slides based on Eran Tromer Cache attacks • Pure software • No special privileges • No interaction with the cryptographic code • Very efficient – full AES key extraction from Linux encrypted partition in 65 milliseconds) • Compromise otherwise well-secured systems • “Commoditize” side-channel attacks: – Easily deployed software breaks many common systems Why cache? Annual speed increase: Typical latency: CPU core cache Main memory 60% 7-9% (until recently) 0.3ns 50-150ns → timing gap Address leakage The cache is a shared resource: • cache state affects, and is affected by, all processes, leading to crosstalk between processes. • The cached data is subject to memory protection… – Not attacked • But the “metadata” leaks information about memory access patterns: Which addresses are being accessed. Associative memory cache DRAM memory block (64 bytes) cache cache line (64 bytes) cache DRAM S-box tables in memory cache DRAM Detecting access to AES tables Measurement technique Two approaches to exploit Inter-process crosstalk: • Measuring the effect of the cache on the encryption – Need precise timing • Measuring the effect of the encryption on the cache Measuring effect of cache on encryption 1. Make sure the tables are cached DRAM 2. Evict one cache set 3. Time an cache encryption and see if it’s slow Measurement technique Two approaches to exploit Inter-process crosstalk: • Measuring the effect of the cache on the encryption – Need precise timing • Measuring the effect of the encryption on the cache Measuring effect of encryption on cache 1. Completely cache DRAM evict tables from cache Measuring effect of encryption on cache 1. Completely cache DRAM evict tables from cache 2. Trigger a single encryption Measuring effect of encryption on cache 1. Completely cache DRAM evict tables from cache 2. Trigger a single encryption 3. Access attacker memory again. See which cache sets are slow Advantages of second method • Yields more information (64) from a single encryption • Insensitive to timing variance in encryption code path • No real need to trigger the encryption – can wait until it happens by itself A typical software implementation of AES char p[16], k[16]; // plaintext and key int32 T0[256],T1[256],T2[256],T3[256]; // lookup tables int32 Col[4]; // intermediate state ... /* Round 1 */ Col[0] T0[p[ 0]©k[ 0]] T1[p[ 5]©k[ 5]] T2[p[10]©k[10]] T3[p[15]©k[15]]; Col[1] T0[p[ 4]©k[ 4]] T1[p[ 9]©k[ 9]] T2[p[14]©k[14]] T3[p[ 3]©k[ 3]]; Col[2] T0[p[ 8]©k[ 8]] T1[p[13]©k[13]] T2[p[ 2]©k[ 2]] T3[p[ 7]©k[ 7]]; Col[3] T0[p[12]©k[12]] T1[p[ 1]©k[ 1]] T2[p[ 6]©k[ 6]] T3[p[11]©k[11]]; lookup index = plaintext key Synchronous attack • A software service performs AES encryption using a secret key. • An attacker process runs on the same CPU. • The attacker process can somehow invoke the service on known plaintext. • Examples: – Encrypted disk partition + filesystem – IP/Sec, VPN Synchronous attack on AES: Overview • Measure (possibly noisy) cache usage of many encryptions of known plaintexts. • Guess the first key byte. For each hypothesis: – For each sampled plaintext, predict which cache line is accessed by “T0[p[ 0]©k[ 0]]” • Identify the hypothesis which yields maximal correlation between predictions and measurements. • Proceed for the rest of the key bytes. • Practically, a few hundred samples suffice. Got 64 bits of the key (high nibble of each byte)! • Use these partial results to mount attack further AES rounds, exploiting S-box nonlinearity. A few thousand samples for complete key recovery. Protection: The Oblivious RAM Model Oblivious Turing Machine: • At any point in time know where the heads are – The access pattern is independent of the • Important: to convert to circuits • Get good results for the Cook-Levin Theorem Oblivious RAM • The access pattern is independent of the – Probability distribution! Suggested by Goldreich 1987 Model qi CPU Small private memory Main memory M[qi] Oblivious RAM Requirements Any sequence of locations i1, i2, … induces a distribution on sequences of requests q1, q2… • Functionality: should be able to figure out the original content • Security: for any two sequence of locations i1, i2, … and i’1, i’2, … induced distributions of requests should be indistinguishable Oblivious Ram Constructions • Trivial: O(n) slowdown – O(log n) bits private memory • Known: polylog slowdown [Goldreich-Ostrovsky 96] – O(log n) bits private memory Memory Attacks [HSHCPCFAF 08] Concern: Not only computation leaks information Memory retains its content after power is lost 5 seconds 30 seconds 60 seconds 5 minutes http://citp.princeton.edu/memory 59 Memory Attacks [HSHCPCFAF 08] Not only computation leaks information Memory retains its content after power is lost Memory content can even last for several minutes Recover “noisy” keys Can use redundancy in round keys Cold boot attacks Completely compromise popular disk encryption systems Reconstruct DES, AES, and RSA keys http://citp.princeton.edu/memory 60 Public-Key Encryption Semantic security [GM82] under CPA: For any m0 and m1 infeasible to distinguish Epk(m0) and Epk(m1) pk m0, m1 Output b’ Epk(mb) (sk, pk) b à {0,1} 61 Key-Leakage Attacks Semantic security with key leakage [AGV 09]: For any* leakage f(sk) and for any m0 and m1 infeasible to distinguish Epk(m0) and Epk(m1) pk Akavia, Goldwasser and Vaikuntanathan f f(sk) m0, m1 Output b’ Epk(mb) (sk, pk) b à {0,1} Clearly, cannot allow f(sk) that easily reveals sk For now f : SK ! {0,1}¸ for ¸ < |sk| 62 Is this the right model? Noisy leakage Leakage of intermediate values as opposed to low-bandwidth leakage Are intermediate values always erased? Key generation process Decryption process Keys generated using a “weak” random source Not a perfect model, but still a good starting point Discuss extensions later on 63 What We Know A generic method for protecting against key-leakage attacks Chosen-ciphertext key-leakage attacks Main building block: Hash Proof Systems [CS 02] Efficient instantiations Based on decisional Diffie-Hellman, few exponentiations A generic CPA-to-CCA transformation Efficient schemes Extensions Noisy leakage Leakage of intermediate values Weak random sources 64 Outline of the Talk Some tools The generic construction by examples A simple scheme: ¸ ¼ |sk|/2 Improved schemes: ¸ ¼ |sk| Extensions of the model Conclusions, further work, and some rest... 65 Min-Entropy Probability distribution X over {0,1}n H1(X) = - log maxx Pr[X = x] Represents the probability of the most likely value of X X is a k-source if H1(X) ¸ k (i.e., Pr[X = x] · 2-k for all x) Statistical distance: ¢(X,Y) = a |Pr[X=a] – Pr[Y=a]| 66 Extractors Universal procedure for “purifying” an imperfect source Definition: Ext: {0,1}n £ {0,1}d ! {0,1}ℓ is a (k,)-extractor if for any ksource X ¢(Ext(X, Ud), Uℓ) · k-source of length n x “seed” d random bits s EXT ℓ almost-uniform bits 67 Strong Extractors Output looks random even after seeing the seed Definition: Ext: {0,1}n £ {0,1}d ! {0,1}ℓ is a (k,)-strong extractor if Ext’(x, s) = s ◦ Ext(x,s) is a (k, )-extractor Leftover hash lemma [ILL 89]: Pairwise independent hash functions are strong extractors Example: Ext(x, (a,b)) = first ℓ bits of ax+b over GF[2n] Output length ℓ = k – 2log(1/) Seed length d = 2n, almost pairwise independence d = O(log n + k) 68 Decisional Diffie-Hellman Alice gx gy Bob Both parties compute K = gxy DDH assumption: x,, g yr,1,gg z ) r 2) (g1(g, , g2g, xg, 1gr,y,gg2rxy) ) (g(g, , g g g 1 2 1 2 for random x, g1,y,g2z22GZand q r, r1, r2 2 Zq 69 Outline of the Lecture Some tools The generic construction by examples A simple scheme: ¸ ¼ |sk|/2 Improved schemes: ¸ ¼ |sk| Extensions of the model Conclusions, further work, and some rest... 70 A Simple Scheme G - group of order q Ext : G £ {0,1}d ! {0,1} - strong extractor Key generation Choose g1, g2 2 G and x1, x2 2 Zq Let h = g1x1 g2x2 Output sk = (x1, x2) and pk = (g1, g2, h) MAIN IDEA: Redundancy: any pk corresponds to many possible sk’s x x h=g1 1 g2 2 reveals only log(q) bits of information on sk=(x1,x2) Leakage of ¸ bits ) sk still has min-entropy log(q) - ¸ 71 A Simple Scheme G - group of order q Ext : G £ {0,1}d ! {0,1} - strong extractor Key generation Encpk(m) Decsk(u1, u2, s, e) Choose g1, g2 2 G and x1, x2 2 Zq Let h = g1x1 g2x2 Output sk = (x1, x2) and pk = (g1, g2, h) d Choose r 2 Zq and a seed s 2 {0,1} Output (g1r, g2r, s, Ext(hr, s) © m) Output e © Ext(u1x1 u2x2, s) u1x1 u2x2 = g1rx1 g2rx2 = (g1x1 g2x2)r = hr 72 A Simple Scheme Theorem: The scheme is resilient to any leakage of ¸ ¼ log(q) bits half the size of sk Proof by reduction: Adversary for the encryption scheme Distinguisher for decisional Diffie-Hellman 73
© Copyright 2025 Paperzz