Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli

Colin D. Walter
formerly: www.co.umist.ac.uk (Manchester, UK)
future: www.comodo.net (Bradford, UK)

Motivation
• Modular multiplication is the foundation of most arithmetic-based cryptography: efficiency and security are important.
• Montgomery modular multiplication is one highly favoured method.
• To avoid full-length comparisons or timing attacks, conditional modular reductions are skipped, but the price is a higher bound, often $2M$ for modulus $M$, and perhaps extra iterations.
• For typical, standard key and word lengths, $2M$ will overflow into the next word by just 1 bit. So an extra word may have to be processed: inefficient.
• Perhaps the overflow bit can be detected and allow a power analysis attack.

RSA 2002 C.D. Walter, UMIST

History
• P. L. Montgomery, Modular multiplication without trial division, Mathematics of Computation 44 (1985), 519–521
• C. D. Walter, Montgomery Exponentiation Needs No Final Subtractions, Electronics Letters 35 (1999), 1831–1832
• G. Hachez & J.-J. Quisquater, Montgomery Exponentiation with No Final Subtractions: Improved Results, CHES 2000, LNCS 1965, 293–301

Montgomery Modular Multiplication
{ Pre-condition: $0 \le A < r^n$ }
P ← 0 ;
For i ← 0 to n−1 do
Begin
    q ← (p_0 + a_i b_0)(−m_0^(−1)) mod r ;
    P ← (P + a_i B + qM) div r ;
    { Invariant: $0 \le P < M+B$ }
End ;
{ Post-conditions: $P r^n \equiv A \times B \bmod M$, $ABr^{-n} \le P < M + ABr^{-n}$ }

Loop Invariants I
Suppose $P < M+B$ at the start of the loop. At the end of the loop, the new value of $P$ is
$(P + a_i B + qM) \operatorname{div} r < ((M+B) + (r-1)B + (r-1)M)/r = M+B$.
So the invariant holds.
If $B$ were bounded by $2M$, the output would be bounded by $3M$. Either we perform a conditional subtraction or we perform another iteration to keep the input less than $2M$. The former is banned to avoid timing attacks.
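The loop and its invariant can be tried out with a minimal Python sketch. The function name `mont_mul`, the toy parameters (`M = 239`, `r = 16`, `n = 2`), and the use of Python big integers in place of word arrays are illustrative assumptions, not part of the slides. The sketch asserts the invariant $0 \le P < M+B$ on every iteration and then checks both post-conditions.

```python
def mont_mul(A, B, M, r, n):
    """Digit-serial Montgomery multiplication with no final subtraction.

    Returns P with P * r**n ≡ A * B (mod M).
    Assumes 0 <= A < r**n and gcd(M, r) == 1.
    """
    neg_m0_inv = (-pow(M, -1, r)) % r       # -m_0^{-1} mod r (needs Python 3.8+)
    P = 0
    for i in range(n):
        a_i = (A // r**i) % r               # i-th base-r digit of A
        q = ((P % r + a_i * (B % r)) * neg_m0_inv) % r
        P, rem = divmod(P + a_i * B + q * M, r)
        assert rem == 0                     # q makes the division by r exact
        assert 0 <= P < M + B               # loop invariant from the slide
    return P


# Hypothetical toy parameters: odd modulus with two base-16 digits.
M, r, n = 239, 16, 2
A, B = 200, 150
P = mont_mul(A, B, M, r, n)
R = r**n
print((P * R - A * B) % M == 0)             # congruence post-condition
print(A * B <= P * R < M * R + A * B)       # interval post-condition, scaled by r^n
```

The interval check is written as `A*B <= P*R < M*R + A*B` so that it stays in exact integer arithmetic rather than dividing by $r^n$.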
If the last $a_i$ is small enough, the bound becomes $M + B/2 < 2M$ and another iteration would be unnecessary. To achieve that we require $a_i < r/2$ for the top digit: unlikely if $A \approx M$ and $M$ uses all bits of the top word.

Loop Invariants II
More accuracy is possible. Define:
$\alpha_i = \sum_{j=0}^{i-1} a_j r^{j-i}$
Then $\alpha_{i+1} = (\alpha_i + a_i)/r < 1$ by induction.
Suppose $P_i$ is the value of $P$ at the start of iteration $i$. Then it is easy to establish:
$\alpha_{i+1} B \le P_{i+1} < M + \alpha_{i+1} B$
because
$\alpha_{i+1} B = (\alpha_i B + a_i B)/r \le (P_i + a_i B + q_i M)/r = (P_i + a_i B + q_i M) \operatorname{div} r = P_{i+1}$
and similarly for the upper bound, using $q_i \le r-1$:
$P_{i+1} < (M + \alpha_i B + a_i B + (r-1)M)/r = M + \alpha_{i+1} B$.

Post-Condition
At the end of the last iteration:
$\alpha_n = \sum_{j=0}^{n-1} a_j r^{j-n} = A r^{-n}$
So the loop invariant gives:
$ABr^{-n} \le P < M + ABr^{-n}$
• This is the tightest interval possible since its width is only $M$.
• It improves on the previous upper bound $M+B$ since $Ar^{-n} < 1$.
• It is much better if $A$ is known to be smaller, e.g. less than $M$.

Stability
Under what conditions will a bound on $A$ and $B$ be preserved? Then output from one MMM can be re-used as input without adjustment.
Suppose $A$ and $B$ are bounded by $(1+\varepsilon)M$. We require $M + ABr^{-n} \le (1+\varepsilon)M$ always for such stability, i.e.
$M + (1+\varepsilon)^2 M^2 r^{-n} \le (1+\varepsilon)M$
This means $(1+\varepsilon)^2 M r^{-n} \le \varepsilon$, which we can solve for suitable $\varepsilon$. It is a quadratic in $\varepsilon$ with discriminant $1 - 4Mr^{-n}$, so it has real solutions exactly when:
$4M \le r^n$

First Results
• The condition $4M \le r^n$ for the I/O to remain bounded improves on those given by the papers cited earlier.
• When the condition is satisfied we can choose $\varepsilon$ so that $A$ and $B$ are bounded by $2M$ or by $\tfrac{1}{2}r^n$ as appropriate.
• Intermediate values of $P$ are bounded above by $\tfrac{3}{4}r^n$.
• For such $M$ with $n$ digits, there is no extra processing required to compensate for removing the final subtraction.
• For standard key lengths, we need to take $n$ to be 1 more than the number of digits in $M$ in order to satisfy the bound.

Standard Key Lengths
• We have seen the need for increasing $n$ for standard key lengths.
This means one more iteration than the number of digits in $M$. It is the cost of deleting the final subtraction.
• How many bits of the corresponding extra digit are required?
• We know the bound $2M$ means at most one bit is needed. Is it necessary? Its occasional existence may provide a handle for a timing or power analysis attack.
• The frequency of the top bit being non-zero is different for squares and multiplies. This was reported at RSA 2001. (This bit is what prompts the final conditional subtraction.)

The Extra Bit
• The frequency of the top bit becoming set is around 25% – 30% when $n$ has not been increased.
• Increasing $n$ decreases the upper bound $M + ABr^{-n}$, making it less likely to set the topmost bit, i.e. the next bit after the top bit of $M$.
• We need to discover its frequency of being 1 to determine if a difference for squares and multiplies is measurable. We will see when it is always zero.
• Since $n$ is being increased by 1, we have $\tfrac{1}{4}r^{n-1} < M < r^{n-1}$ and want I/O to be less than $r^{n-1}$.

Conditions for no overflow bit
• The condition of interest is $M + ABr^{-n} < r^{n-1}$ when $A, B < r^{n-1}$.
• So we need $M$ such that $M + (r^{n-1})^2 r^{-n} < r^{n-1}$, i.e. $M < r^{n-1}(1 - r^{-1})$.
• Thus the arguments and output of MMM will have the same number of words as $M$ unless the top word of $M$ is all 1s.
• Hence, when the final conditional subtraction is omitted from MMM, there is no "overflow" bit against which a power analysis attack can be mounted unless the top word of $M$ is all 1s.

The Unlikely Event
• The potentially dangerous case is therefore when the top word of $M$ is $r-1$, which is reassuringly uncommon, and the worst case is $M = r^{n-1}$.
• By solving our previous quadratic in $\varepsilon$, the best bound on the inputs to achieve stability in that worst case is
$(1+\varepsilon)M = \tfrac{1}{2}r^n\left(1 - (1 - 4r^{-1})^{1/2}\right) = r^{n-1} + r^{n-2} + 2r^{n-3} + 5r^{n-4} + \dots$
• With the reasonable assumption that residues mod $M$ are uniformly distributed, at most about $r^{-1}$ of outputs will exceed $r^{n-1}$.
• So, for a 16-bit architecture, and limited smartcard life, the overflow bit is too rare to be of use in power analysis.
• One could safely re-introduce a conditional subtraction here to avoid the need for extra hardware.

Exponentiation
• We end by noting that no final subtraction is needed in the case of MMM exponentiation:
• To compute $T^e \bmod M$, pre-processing generates $Tr^n \bmod M$ so that subsequent products are all larger than those from standard modular multiplication by a factor of $r^n \bmod M$. The output is therefore $A \equiv T^e r^n \bmod M$.
• Post-processing removes the extra factor $r^n$ by an MMM multiplication by 1. The output is bounded above by $M + Ar^{-n}$ where $A < 2M < \tfrac{1}{2}r^n$. So the output is $\le M$. Of course, equality with $M$ is impossible, since that could only arise from $T = 0$, which would result in output 0.
• So no final modular reduction is needed for exponentiation.

Conclusion
• Precise output bounds have been obtained for Montgomery Modular Multiplication.
• This gives I/O bounds for MMM in the context of exponentiation when the final conditional subtraction is omitted.
• All numbers have the same word size as the modulus $M$ when $4M \le r^n$ and $M$ has $n$ words.
• Otherwise, MMM must perform another iteration, but overflow bits are then too rare to be in danger from power analysis attacks.
• No final modular subtraction is required for exponentiation.
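The exponentiation argument can be exercised with a small Python sketch. The names `mont_mul` and `mont_exp`, the left-to-right square-and-multiply schedule, and the toy modulus $221 = 13 \times 17$ with $r = 16$, $n = 3$ are all illustrative assumptions rather than anything taken from the slides. The sketch asserts that every intermediate value stays below $2M$ when $4M < r^n$ and that the final multiplication by 1 yields a value no larger than $M$, with no conditional subtraction anywhere.

```python
def mont_mul(A, B, M, r, n):
    """Montgomery product P with P * r**n ≡ A * B (mod M), no final subtraction."""
    neg_m0_inv = (-pow(M, -1, r)) % r       # -m_0^{-1} mod r (needs Python 3.8+)
    P = 0
    for i in range(n):
        a_i = (A // r**i) % r               # i-th base-r digit of A
        q = ((P + a_i * B) * neg_m0_inv) % r
        P = (P + a_i * B + q * M) // r      # exact division by choice of q
    return P


def mont_exp(T, e, M, r, n):
    """Square-and-multiply over mont_mul; returns T**e mod M for e >= 1."""
    R = r**n
    assert 4 * M < R                        # stability condition from the slides
    T_bar = (T * R) % M                     # pre-processing: T * r^n mod M
    A = T_bar
    for bit in bin(e)[3:]:                  # exponent bits after the leading 1
        A = mont_mul(A, A, M, r, n)
        assert A < 2 * M                    # the 2M bound is preserved
        if bit == "1":
            A = mont_mul(A, T_bar, M, r, n)
            assert A < 2 * M
    out = mont_mul(A, 1, M, r, n)           # post-processing: remove the factor r^n
    assert out <= M                         # bound M + A*r^(-n) with A*r^(-n) < 1/2
    return out


# Hypothetical toy modulus: two base-16 digits, so n is one more than that.
M, r, n = 221, 16, 3
print(mont_exp(5, 65537, M, r, n) == pow(5, 65537, M))
```

Here $n$ is one more than the number of base-16 digits of the modulus, matching the remark about standard key lengths.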