GPU and CPU Parallelization of Honest-but-Curious Secure Two-Party Computation
Nathaniel Husted, Steve Myers, abhi shelat, Paul Grubbs

Secure Two-party Computation
Alice and Bob want to compute a public function of their private inputs.
[Figure: Alice, Bob, and a disease database.]

Secure Two-party Computation
[Figure: Alice supplies X, Bob supplies Y, and F(X,Y) goes to both Alice and Bob.]
• Alice provides X. Bob provides Y.
• F(X,Y) is correctly calculated without Bob learning X or Alice learning Y.

Yao’s Garbled Circuits [Yao1986]
F(X,Y) = X + Y
[Figure: circuit for F(X,Y) with inputs X and Y, gates XOR 0, XOR 1, AND 2, AND 3, and OR 4, and the outputs.]
I’m going to discuss the current fastest solution for processing Yao’s Garbled Circuits.

Wires in Yao’s Garbled Circuits [Yao1986]
• Alice must use random labels ($\lambda$) for wire values instead of 0’s and 1’s.
• Notation: $\lambda_i^b$ is the label for value $b$ on wire $W_i$.
Wire 0 ($W_0$): Label 0 ($\lambda_0^0$), Label 1 ($\lambda_0^1$)

Yao’s Garbled Circuits [Yao1986]
F(X,Y) = X + Y
[Figure: the same circuit with a pair of random labels on each wire, for example Label 0 = 0xF1F1 and Label 1 = 0xABAB. Other label values shown: 0x1212, 0x1234, 0xCC1C, 0x1112, 0x4321, 0x9932, 0x6753, 0x9B3F, 0x93FA, 0x8843.]

Encrypting Gates in Yao’s Garbled Circuits [Yao1986]
Gate 2 ($G_2$), AND: input wires $W_0$ and $W_1$, output wire $W_2$.

AND GATE
$W_0$ | $W_1$ | Output
$\lambda_0^0$ | $\lambda_1^0$ | $\lambda_2^0$
$\lambda_0^0$ | $\lambda_1^1$ | $\lambda_2^0$
$\lambda_0^1$ | $\lambda_1^0$ | $\lambda_2^0$
$\lambda_0^1$ | $\lambda_1^1$ | $\lambda_2^1$

Encrypting Gates in Yao’s Garbled Circuits [Yao1986]
• Notation shortcut: $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0) = Enc_{\lambda_0^0}(Enc_{\lambda_1^0}(\lambda_2^0))$

AND GATE
$W_0$ | $W_1$ | Output
$\lambda_0^0$ | $\lambda_1^0$ | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0)$
$\lambda_0^0$ | $\lambda_1^1$ | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^0)$
$\lambda_0^1$ | $\lambda_1^0$ | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^0)$
$\lambda_0^1$ | $\lambda_1^1$ | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1)$

Garbling Gates in Yao’s Garbled Circuits [Yao1986]
Gate 2 ($G_2$), AND, with the input labels replaced by row indices:

AND GATE
Row | Encrypted Entry
0,0 | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0)$
0,1 | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^0)$
1,0 | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^0)$
1,1 | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1)$

Garbling Gates in Yao’s Garbled Circuits [Yao1986]
Gate 2 ($G_2$), AND, with rows ordered by the permutation bits of the input labels and the output p-bit encrypted alongside the output label:

AND GATE
$\pi_0, \pi_1$ | Encrypted Entry
0,0 | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^0, \pi_2^0)$
0,1 | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1, \pi_2^1)$
1,0 | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0, \pi_2^0)$
1,1 | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^0, \pi_2^0)$

Yao’s Garbled Circuits [Yao1986]
F(X,Y) = X + Y
[Figure: the labeled circuit with the encrypted truth table for the OR gate (gate 4) shown beside it.]

OR GATE
$W_0$ | $W_1$ | Output
$\lambda_0^0$ | $\lambda_1^0$ | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0)$
$\lambda_0^0$ | $\lambda_1^1$ | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^0$ | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^1$ | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1)$

Alice sends the generated circuit to Bob.
• Alice sends ALL garbled truth tables to Bob.
[Figure: overlapping garbled truth tables for the XOR, AND, and OR gates, sent over the network from Alice to Bob. Each table is indexed by $\pi_0, \pi_1$; the legible entries have the form $H(\lambda_0 \parallel \lambda_1 \parallel 0) \oplus (\lambda_2 \parallel \pi_2)$ or $Enc_{\lambda_0,\lambda_1}(\lambda_2, \pi_2)$, with the label superscripts varying by row.]

Bob evaluates the circuit.
• Evaluation is the reverse of generation.
Gate 3 ($G_3$), AND
$W_0$: $\lambda_0$ = 0xCC1C, $\pi_0$ = 0x1
$W_1$: $\lambda_1$ = 0x1234, $\pi_1$ = 0x0
$W_2$: $\lambda_2$ = ??, $\pi_2$ = ??

AND GATE
$\pi_0, \pi_1$ | Encrypted Entry
0,0 | $Dec_{\lambda_0,\lambda_1}(Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^0, \pi_2^0))$
0,1 | $Dec_{\lambda_0,\lambda_1}(Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1, \pi_2^1))$
1,0 | $Dec_{\lambda_0,\lambda_1}(Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0, \pi_2^0))$ (ENTRY TO DECODE)
1,1 | $Dec_{\lambda_0,\lambda_1}(Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^0, \pi_2^0))$

Bob evaluates the circuit.
• Evaluation is the reverse of generation.
• Bob’s p-bits ($\pi_0$ = 0x1, $\pi_1$ = 0x0) select row 1,0, the only entry his labels can decode.

AND GATE
$\pi_0, \pi_1$ | Encrypted Entry
0,0 | $\perp$
0,1 | $\perp$
1,0 | $\lambda_2^0, \pi_2^0$ (ENTRY TO DECODE)
1,1 | $\perp$

Other security models for Yao’s Garbled Circuits
• Malicious-Leaks-A-Bit [Huang2013]
  • Benefits:
    • Attacker can analyze results and lie in the protocol.
    • Only requires one extra Generation and Evaluation.
  • Drawbacks:
    • Leaks 1 bit of output.
• Fully Malicious [Lindell2013]
  • Benefits:
    • Leaks no information to the attacker.
  • Drawbacks:
    • Requires Alice to generate between 60 and 130 circuits. Bob must evaluate about half and verify the rest.
• NOTE: Our methods can work with either of these models!

Brief survey of garbled circuit systems
System | CPU Based? | GPU Based? | OT Extension? | Parallel? | Bottleneck | Security Model
Our Work | Yes* | Yes | Yes* | Yes | Communication | Honest-but-curious, malicious leaks a bit, (Fully Malicious)
Huang et al. | Yes | No | Yes | No | Processing | Honest-but-curious, malicious leaks a bit
Kreuter et al. [Kreuter2013] | Yes | No | Yes | Super computers | Communication | Fully Malicious
Frederiksen et al. [Frederiksen201] | No | Yes | Yes* | Single GPUs | Communication | Fully Malicious

Contributions to Garbled Circuit Optimization
1. A method for accurately comparing garbled circuit systems with very different circuit formats.
2. A method for generating all gates in a circuit at once.
3. A method for reducing the number of calculations for each gate garbling.
4. A scalable generation method that can be combined with other best-in-class implementations.

Fast Garbled Circuit Processing With GPUs
• GPUs are highly parallel Single Instruction Multiple Data (SIMD) processors.
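• We can use every “core” on the GPU to process a gate.
• But the SIMD parallelism requires protocol modifications.

Generating all gates at once allows high throughput but requires protocol modification.
• The Free XOR Technique [Kolesnikov2008]
$R$: randomly generated constant
Wire 0: Label 0 ($\lambda_0^0$), Label 1 ($\lambda_0^1$) $= \lambda_0^0 \oplus R$
Wire 1: $\lambda_1^0$, $\lambda_1^1 = \lambda_1^0 \oplus R$
Gate 0 ($G_0$), XOR, output wire 2:
$\lambda_2^0 = \lambda_0^0 \oplus \lambda_1^0$
$\lambda_2^1 = \lambda_0^0 \oplus \lambda_1^0 \oplus R$
[Figure: the output of Gate 0 feeds Gate 2 ($G_2$), AND.]

Generating all gates at once allows high throughput but requires protocol modification.
• Our modified Free XOR technique
$R$: randomly generated constant
Wire 0: Label 0 ($\lambda_0^0$), Label 1 ($\lambda_0^1$) $= \lambda_0^0 \oplus R$
Wire 1: $\lambda_1^0$, $\lambda_1^1 = \lambda_1^0 \oplus R$
Gate 0 ($G_0$), XOR, output wire 2:
$\lambda_2^0$, $\lambda_2^1 = \lambda_2^0 \oplus R$
XOR Offset $= \lambda_0^0 \oplus \lambda_1^0 \oplus \lambda_2^0$
[Figure: the output of Gate 0 feeds Gate 2 ($G_2$), AND.]

Benefits of increased Throughput
[Figure: throughput chart; no data survives in the extracted text.]

Benchmarking Machines
Name | CPU | GPU
Tie (DARPA) | - | Tesla K20, 0.71 GHz
EC2 (Amazon) | - | Tesla S2050, 1.15 GHz
Kreuter et al. | Xeon E5506, 2.13 GHz | -

Garbling Truth Tables in practice
Gate 2 ($G_2$), AND: input wires $W_0$ and $W_1$, output wire $W_2$.

AND GATE: GARBLED TRUTH TABLE
$\pi_0, \pi_1$ | Encrypted Entry
0,0 | $SHA1(\lambda_0^1 \parallel \lambda_1^0 \parallel 2) \oplus (\lambda_2^0 \parallel \pi_2^0)$
0,1 | $SHA1(\lambda_0^1 \parallel \lambda_1^1 \parallel 2) \oplus (\lambda_2^1 \parallel \pi_2^1)$
1,0 | $SHA1(\lambda_0^0 \parallel \lambda_1^0 \parallel 2) \oplus (\lambda_2^0 \parallel \pi_2^0)$
1,1 | $SHA1(\lambda_0^0 \parallel \lambda_1^1 \parallel 2) \oplus (\lambda_2^0 \parallel \pi_2^0)$

Reducing calculations required per-gate provided benefits over other GPU systems.
SHA1 Counts
Item | SHA1 calls
Random Wire Label (per wire) | 1
Garbled Truth Table | 4
XOR Offset | 0
• But recall there are three wires for every gate in the circuit…

Inputs and Outputs of SHA1
Buckets holding inputs: 1 2 3 4 … 15 16
Buckets holding algorithm state: A B C D E

Pre-computing SHA1 intermediate values
Inputs for random wire values: Seed | Seed | Seed | Seed | … | 0x0 | … | Wire ID | Wire ID

Pre-computing SHA1 intermediate values
Buckets holding inputs: Seed | Seed | Seed | Seed | … | 0x0 | … | Wire ID | Wire ID
• The highlighted buckets (the seed and 0x0 buckets) are common for all wires.
• They are the only buckets used during the first 14 rounds.

Benefits of SHA1 pre-computation
[Figure: pre-computation benchmark chart; no data survives in the extracted text.]

Benchmarking Machines
Name | GPU | GPU Cores
Tie (DARPA) | Tesla K20, 0.71 GHz | 2496
EC2 (Amazon) | Tesla S2050, 1.15 GHz | 448

Current and On-Going Work
• We now implement the PCF2 circuit format developed by Kreuter et al.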
• Working on additional circuit optimizations on top of those provided by the PCF2 compiler.
• Provide a full scale solution from honest-but-curious to fully malicious processing.
  • Multiple GPUs
  • Super computers
• Experiments and source code are available upon request.

Questions?

Extra Slide Matter

Using GPUs we show the fastest single machine garbled circuit generator
• XOR Gates: ~60.2 Million Gates Per Second
• TT Gates: ~34.1 Million Gates Per Second

1. Alice will generate the Yao’s circuit.
• Alice must construct the circuit using a series of Boolean gates with two input wires and one output wire.
• Each gate has a serial number and garbled truth table.
[Figure: Gate 0 ($G_0$), AND, with input wires $W_0$ and $W_1$ and output wire $W_2$.]

Wires in Yao’s Garbled Circuits [Yao1986]
• Alice must use random labels ($\lambda$) for wire values instead of 0’s and 1’s.
• Alice must use permutation bits (p-bits; $\pi$) to signify the label choice.
Wire 0 ($W_0$):
Label 0 ($\lambda_0^0$) = 0xA1B2, P-bit 0 ($\pi_0^0$) = 0x1
Label 1 ($\lambda_0^1$) = 0x192F, P-bit 1 ($\pi_0^1$) $= \pi_0^0 \oplus 1$ = 0x0

Encrypting Gates in Yao’s Garbled Circuits [Yao1986]
• How Alice creates garbled truth tables in two steps
• Step 1: Create Encrypted Truth Table
Gate 2 ($G_2$), AND, serial #: 2
$W_0$: $\lambda_0^0$ = 0xA1B2, $\pi_0^0$ = 0x1; $\lambda_0^1$ = 0x192F, $\pi_0^1$ = 0x0
$W_1$: $\lambda_1^0$ = 0x428F, $\pi_1^0$ = 0x0; $\lambda_1^1$ = 0xADC1, $\pi_1^1$ = 0x1
$W_2$: $\lambda_2^0$ = 0xA1B2, $\pi_2^0$ = 0x0; $\lambda_2^1$ = 0x192F, $\pi_2^1$ = 0x1

STEP 1 OUTPUT (AND GATE)
$W_0$ | $W_1$ | Encrypted Entry
$\lambda_0^0$ | $\lambda_1^0$ | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0, \pi_2^0)$
$\lambda_0^1$ | $\lambda_1^0$ | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^0, \pi_2^0)$
$\lambda_0^0$ | $\lambda_1^1$ | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^0, \pi_2^0)$
$\lambda_0^1$ | $\lambda_1^1$ | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1, \pi_2^1)$

Step 1 for Gate 1 ($G_1$), XOR:

XOR GATE
$W_0$ | $W_1$ | Output
$\lambda_0^0$ | $\lambda_1^0$ | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0)$
$\lambda_0^0$ | $\lambda_1^1$ | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^0$ | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^1$ | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^0)$

Step 1 for Gate 4 ($G_4$), OR:

OR GATE
$W_0$ | $W_1$ | Output
$\lambda_0^0$ | $\lambda_1^0$ | $Enc_{\lambda_0^0,\lambda_1^0}(\lambda_2^0)$
$\lambda_0^0$ | $\lambda_1^1$ | $Enc_{\lambda_0^0,\lambda_1^1}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^0$ | $Enc_{\lambda_0^1,\lambda_1^0}(\lambda_2^1)$
$\lambda_0^1$ | $\lambda_1^1$ | $Enc_{\lambda_0^1,\lambda_1^1}(\lambda_2^1)$

A basic overview of the Yao’s protocol
• Assumptions:
  • Security Model: Honest but Curious
• Process:
  1. Alice will generate the Yao’s circuit.
  2. Alice sends the generated circuit to Bob.
  3. Bob will use Oblivious Transfer to obtain from Alice the wire labels for his inputs.
  4. Bob will evaluate the circuit.
  5. Bob sends the output to Alice.

Yao’s Garbled Circuits under an Honest-but-Curious Security Model
1. Alice generates wire labels and garbled truth tables for all wires and gates in a circuit.
2. Alice sends the garbled truth tables to Bob.
3. Bob obtains the wire labels for his inputs from Alice using Oblivious Transfer.
4. Bob evaluates the circuit.
5. Bob sends the output to Alice.
Both parties can analyze data at all steps of this protocol but must perform all steps.

Bob performs Oblivious Transfer to obtain his input labels from Alice
[Figure: Alice holds $S_0$ and $S_1$; through Oblivious Transfer, Bob receives $S_a$.]

So how fast can we process garbled circuits?
• XOR Gates: ~60.2 Million Gates Per Second
• TT Gates: ~34.1 Million Gates Per Second
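
The garbled truth table from the “Garbling Truth Tables in practice” slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ GPU implementation. The names `new_wire`, `pad`, `garble_gate`, and `evaluate_gate` are hypothetical, and the 16-byte label length, the 4-byte gate ID encoding, and the truncation of the SHA1 digest to 17 bytes are all assumptions made for the example.

```python
import hashlib
import secrets

LABEL_BYTES = 16  # assumed label length; the slides do not state one


def new_wire():
    """Return {bit: (label, p_bit)}; the two p-bits are complements."""
    p0 = secrets.randbits(1)
    return {
        0: (secrets.token_bytes(LABEL_BYTES), p0),
        1: (secrets.token_bytes(LABEL_BYTES), p0 ^ 1),
    }


def pad(label0, label1, gate_id):
    """SHA1(label0 || label1 || gate_id), cut to one label plus one p-bit byte."""
    data = label0 + label1 + gate_id.to_bytes(4, "big")  # assumed encoding
    return hashlib.sha1(data).digest()[: LABEL_BYTES + 1]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def garble_gate(w0, w1, w2, gate_id, op):
    """Build the 2x2 table, placing each row at the index given by the input p-bits."""
    table = [[None, None], [None, None]]
    for b0 in (0, 1):
        for b1 in (0, 1):
            label0, p0 = w0[b0]
            label1, p1 = w1[b1]
            out_label, out_p = w2[op(b0, b1)]
            plaintext = out_label + bytes([out_p])
            table[p0][p1] = xor_bytes(pad(label0, label1, gate_id), plaintext)
    return table


def evaluate_gate(table, in0, in1, gate_id):
    """Decode the single row selected by the p-bits of the two held labels."""
    (label0, p0), (label1, p1) = in0, in1
    plain = xor_bytes(pad(label0, label1, gate_id), table[p0][p1])
    return plain[:LABEL_BYTES], plain[LABEL_BYTES]


# Usage: garble AND gate 2, then evaluate it with the labels for inputs 1 and 1.
w0, w1, w2 = new_wire(), new_wire(), new_wire()
and_table = garble_gate(w0, w1, w2, 2, lambda a, b: a & b)
out_label, out_p = evaluate_gate(and_table, w0[1], w1[1], 2)
```

The evaluator never tries the other three rows: the p-bits attached to its two labels point straight at the one entry it can decode, which is why each gate costs the evaluator a single SHA1 call in this sketch.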
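
The modified Free XOR technique from the GPU slides can also be sketched briefly. The reading assumed here is that every wire’s 0-label is drawn independently, so no gate has to wait for its input wires, and each XOR gate publishes the offset $\lambda_0^0 \oplus \lambda_1^0 \oplus \lambda_2^0$ so the evaluator can still process it without hashing. That reading, the helper names `new_wire`, `xor_offset`, and `evaluate_xor`, and the 16-byte label length are assumptions for illustration.

```python
import secrets

LABEL_BYTES = 16  # assumed label length


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def new_wire(R):
    """Draw an independent 0-label; the 1-label is the 0-label XOR R."""
    zero = secrets.token_bytes(LABEL_BYTES)
    return (zero, xor_bytes(zero, R))


def xor_offset(w0, w1, w2):
    """Generator side: offset = label0(w0) XOR label0(w1) XOR label0(w2)."""
    return xor_bytes(xor_bytes(w0[0], w1[0]), w2[0])


def evaluate_xor(in0, in1, offset):
    """Evaluator side: output label = in0 XOR in1 XOR offset, with no hashing."""
    return xor_bytes(xor_bytes(in0, in1), offset)


# Usage: all three wires are generated independently of one another.
R = secrets.token_bytes(LABEL_BYTES)
w0, w1, w2 = new_wire(R), new_wire(R), new_wire(R)
offset = xor_offset(w0, w1, w2)
out = evaluate_xor(w0[1], w1[0], offset)  # equals w2[1], since 1 XOR 0 = 1
```

With inputs $a$ and $b$, the evaluator holds $\lambda_0^0 \oplus aR$ and $\lambda_1^0 \oplus bR$, so XORing in the offset leaves $\lambda_2^0 \oplus (a \oplus b)R$, which is the output label for $a \oplus b$.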