The Superdiversifier: Peephole Individualization for Software Protection
Matthias Jacob
Nokia
Mariusz H. Jakubowski
Prasad Naldurg
Chit Wei (Nick) Saw
Ramarathnam Venkatesan
Microsoft Research
International Workshop on Security: IWSEC ’08
Kagawa, Japan
November 25-27, 2008
Introduction
• Software individualization
– “Different-looking” but functionally equivalent code
– Diversity as a defense against attacks
– Important role in both biological and man-made systems
• Superoptimization
– Brute-force search for shortest code sequences that implement a
given function
– Compiler optimization introduced by Massalin ‘87
• Goals of our work:
– Leverage and extend superoptimization to individualize
instruction sequences
– Study superdiversification in the context of more comprehensive
protection frameworks
11/26/08
What Does This Do?
unsigned __int64 nInput = _atoi64(argv[1]);
__int64 n;
n = nInput - ((nInput >> 1) & 033333333333333333333LL);
n = n - ((nInput >> 2) & 011111111111111111111LL);
n = n + (n >> 3);
n = n & 07070707070707070707LL;
n = n % 077;
printf("%d\n", (int)n);
Overview
Instruction-level diversity via guided search
• Introduction
• Background
– Individualization
– Superoptimization
• Superdiversification
• Experimental results
• Applications
• Conclusion
Software Individualization
• Element of software security
– Defends against BORE attacks (Break Once/Run
Everywhere)
– Forces duplication of effort to break systems
– Alleviates “software monoculture” problem
• Many practical uses:
– ASLR (Address Space Layout Randomization)
– Secure DRM clients
– Self-mutating malware
– …
Individualization Schemes
• Static: Individualization of program code
– Algorithmic
• Bubble sort ↔ quicksort
• Red-black trees ↔ splay trees
– Syntactic
• MOV EAX,0 ↔ XOR EAX,EAX
• MOV EAX,5; MOV EBX,1 ↔ MOV EBX,1; MOV EAX,5
• Dynamic: Individualization of runtime behavior
– Varying paths at runtime
– Variable data encoding
– Self-modifying code
– Byte-codes with variable semantics
– …
Superoptimization
• Brute-force search for shortest equivalent
instruction sequence
• [Massalin ‘87]:
– “Startling programs have been generated, many
of them engaging in convoluted bit fiddling
bearing little resemblance to the source programs
which defined the functions.”
– “… like a typical superoptimized program, the
logic is really convoluted.”
Superoptimization
• Input: Instruction sequence implementing a function
• Algorithm outline:
– Enumerate all possible sequences up to a given length
(e.g., 10 instructions).
– Check for equivalence to input sequence:
• Quick test: Test candidate sequence on several random inputs.
• Slow test: Check Boolean equivalence of sequences (if quick
test passes).
– Skip sequences longer than current shortest sequence.
• Quick test takes most of the computation time.
• Slow test guarantees equivalence to input sequence.
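The outline above can be sketched in C on a deliberately tiny machine. Everything in this sketch is an illustrative assumption, not the authors' tool: the five-opcode, single-register, 8-bit instruction set, the names run, quick_test, slow_test, next_seq and superoptimize, and the use of an exhaustive 256-input comparison as a stand-in for the Boolean (slow) equivalence check. Lengths are tried in increasing order, so the first hit is a shortest sequence and longer ones are never visited.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy ISA: one 8-bit accumulator, five single-operand ops. */
enum { OP_NOT, OP_NEG, OP_INC, OP_DEC, OP_SHL, NUM_OPS };
#define MAX_LEN 8

/* Interpret a sequence of opcodes starting from accumulator value a. */
static uint8_t run(const int *seq, int len, uint8_t a)
{
    for (int i = 0; i < len; i++) {
        switch (seq[i]) {
        case OP_NOT: a = (uint8_t)~a;       break;
        case OP_NEG: a = (uint8_t)(0u - a); break;
        case OP_INC: a = (uint8_t)(a + 1u); break;
        case OP_DEC: a = (uint8_t)(a - 1u); break;
        case OP_SHL: a = (uint8_t)(a << 1); break;
        }
    }
    return a;
}

/* Quick test: a few random inputs reject most wrong candidates cheaply. */
static int quick_test(const int *ref, int rlen, const int *cand, int clen)
{
    for (int t = 0; t < 4; t++) {
        uint8_t x = (uint8_t)(rand() & 0xFF);
        if (run(ref, rlen, x) != run(cand, clen, x))
            return 0;
    }
    return 1;
}

/* Slow test stand-in: exhaustive comparison over all 256 inputs. */
static int slow_test(const int *ref, int rlen, const int *cand, int clen)
{
    for (int x = 0; x < 256; x++)
        if (run(ref, rlen, (uint8_t)x) != run(cand, clen, (uint8_t)x))
            return 0;
    return 1;
}

/* Step seq to the next sequence in odometer order; 0 when exhausted. */
static int next_seq(int *seq, int len)
{
    for (int i = len - 1; i >= 0; i--) {
        if (++seq[i] < NUM_OPS)
            return 1;
        seq[i] = 0;
    }
    return 0;
}

/*
 * Returns the length of the shortest sequence equivalent to ref (stored in
 * out), or 0 if none exists with at most max_len instructions.
 */
int superoptimize(const int *ref, int rlen, int max_len, int *out)
{
    assert(max_len >= 1 && max_len <= MAX_LEN);
    for (int len = 1; len <= max_len; len++) {
        int seq[MAX_LEN] = {0};
        do {
            if (quick_test(ref, rlen, seq, len) &&
                slow_test(ref, rlen, seq, len)) {
                memcpy(out, seq, (size_t)len * sizeof seq[0]);
                return len;
            }
        } while (next_seq(seq, len));
    }
    return 0;
}
```

Given the reference sequence NOT; INC (two's-complement negation), superoptimize returns length 1 with the single instruction NEG.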
The Superdiversifier
• Adapt and extend superoptimization to diversify code:
– Restrict set of instructions and operands allowed in
search.
– Guide search based on instruction frequencies occurring in
real-life programs.
– Use pruning techniques to cut down search time.
– Accept a secret key to control the above operations.
• Output any equivalent sequences, not necessarily only
the shortest.
– Secret key determines order of search.
– Different keys may yield dramatically different equivalent
sequences.
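A minimal sketch of how a key might steer such a search, under stated assumptions: the four-opcode 8-bit toy machine, the xorshift32 generator seeded from the key, the per-position opcode shuffle, the allowed bitmask, and the names equivalent and superdiversify are all hypothetical choices made for illustration. Frequency-guided ordering is not modeled, and the equivalence check is an exhaustive comparison rather than a SAT query.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy ISA: one 8-bit accumulator, four single-operand ops. */
enum { OP_NOT, OP_NEG, OP_INC, OP_DEC, NUM_OPS };
#define MAX_LEN 6

static uint8_t run(const int *seq, int len, uint8_t a)
{
    for (int i = 0; i < len; i++) {
        switch (seq[i]) {
        case OP_NOT: a = (uint8_t)~a;       break;
        case OP_NEG: a = (uint8_t)(0u - a); break;
        case OP_INC: a = (uint8_t)(a + 1u); break;
        case OP_DEC: a = (uint8_t)(a - 1u); break;
        }
    }
    return a;
}

/* Exhaustive equivalence over all 256 inputs (stand-in for the real tests). */
int equivalent(const int *p, int plen, const int *q, int qlen)
{
    for (int x = 0; x < 256; x++)
        if (run(p, plen, (uint8_t)x) != run(q, qlen, (uint8_t)x))
            return 0;
    return 1;
}

/* xorshift32: small key-driven PRNG (not cryptographically strong). */
static uint32_t next_rand(uint32_t *state)
{
    *state ^= *state << 13;
    *state ^= *state >> 17;
    *state ^= *state << 5;
    return *state;
}

/*
 * Search for a sequence of exactly len allowed opcodes equivalent to ref.
 * The key fixes, per position, the order in which opcodes are tried, so
 * different keys walk the same search space in different orders.
 * allowed is a bitmask over opcodes. Returns 1 with the result in out,
 * or 0 if no such sequence exists (out then holds scratch data).
 */
int superdiversify(const int *ref, int rlen, int len,
                   uint32_t allowed, uint32_t key, int *out)
{
    int perm[MAX_LEN][NUM_OPS];
    int idx[MAX_LEN] = {0};
    uint32_t state = key ? key : 1u;   /* xorshift state must be nonzero */

    assert(len >= 1 && len <= MAX_LEN);
    for (int i = 0; i < len; i++) {    /* keyed Fisher-Yates shuffle */
        for (int j = 0; j < NUM_OPS; j++)
            perm[i][j] = j;
        for (int j = NUM_OPS - 1; j > 0; j--) {
            int k = (int)(next_rand(&state) % (uint32_t)(j + 1));
            int tmp = perm[i][j];
            perm[i][j] = perm[i][k];
            perm[i][k] = tmp;
        }
    }
    for (;;) {
        int ok = 1;
        for (int i = 0; i < len; i++) {
            out[i] = perm[i][idx[i]];
            if (!((allowed >> out[i]) & 1u))
                ok = 0;                /* pruned: uses a disallowed opcode */
        }
        if (ok && equivalent(ref, rlen, out, len))
            return 1;
        int i = len - 1;               /* advance the odometer */
        while (i >= 0 && ++idx[i] == NUM_OPS) {
            idx[i] = 0;
            i--;
        }
        if (i < 0)
            return 0;
    }
}
```

Calling superdiversify with the same reference and different keys returns whichever equivalent sequence each key's ordering reaches first; in a space this small, two keys may still land on the same sequence. Clearing the OP_NEG bit in allowed forces a replacement for NEG built from the remaining opcodes, such as NOT; INC.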
Equivalence Test Using a SAT Solver
• Input: Two Boolean functions, F(x) and G(x).
• Goal: Determine whether F(x) ≡ G(x).
F(x) ≡ G(x) iff ∀x, F(x) = G(x).
F(x) ≡ G(x) iff ∄x │ F(x) ≠ G(x).
• Thus, simply run a SAT solver on F(x) ≠ G(x)
represented as a Boolean (CNF) formula.
• F(x) ≡ G(x) iff F(x) ≠ G(x) is unsatisfiable.
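The reduction above asks whether the formula F(x) ≠ G(x) has any satisfying input. The sketch below illustrates that question without a SAT solver: it enumerates all 16-bit inputs, which is only feasible because the domain is tiny. The functions f_not_inc, g_neg and h_not and the name miter_satisfiable are hypothetical, chosen for illustration; a real implementation would instead encode both instruction sequences as CNF and hand the inequality to a solver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*fn16)(uint16_t);

/* Two implementations of negation, plus one non-equivalent function. */
uint16_t f_not_inc(uint16_t x) { return (uint16_t)(~x + 1u); }
uint16_t g_neg(uint16_t x)     { return (uint16_t)(0u - x); }
uint16_t h_not(uint16_t x)     { return (uint16_t)~x; }

/*
 * Returns 1 if some x satisfies f(x) != g(x), storing the first such x in
 * *witness when witness is non-NULL. Returns 0 if the inequality is
 * unsatisfiable, i.e. f and g are equivalent.
 */
int miter_satisfiable(fn16 f, fn16 g, uint16_t *witness)
{
    assert(f != NULL && g != NULL);
    for (uint32_t x = 0; x <= 0xFFFFu; x++) {
        if (f((uint16_t)x) != g((uint16_t)x)) {
            if (witness != NULL)
                *witness = (uint16_t)x;
            return 1;
        }
    }
    return 0;
}
```

Here miter_satisfiable(f_not_inc, g_neg, NULL) returns 0, so the two are equivalent, while miter_satisfiable(h_not, g_neg, &w) returns 1 with the counterexample w = 0.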
Experimental Results
Function: Swap registers
Input code
Sample equivalent versions
Experimental Results
Function: Swap registers
Input code
Only arithmetic and logical instructions
allowed in search.
Sample equivalent versions
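As an illustration of what a swap restricted to arithmetic and logical operations can look like, two classic variants are the XOR swap and the add/subtract swap. This sketch expresses them in C on 32-bit values rather than as register-level instructions, and the function names are hypothetical; whether the search produced these exact sequences is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: swap through a temporary (plain data moves). */
void swap_tmp(uint32_t *a, uint32_t *b)
{
    uint32_t t = *a;
    *a = *b;
    *b = t;
}

/* Logical only: three XORs. Requires a and b to be distinct objects. */
void swap_xor(uint32_t *a, uint32_t *b)
{
    assert(a != b);
    *a ^= *b;   /* a = a0 ^ b0 */
    *b ^= *a;   /* b = a0 */
    *a ^= *b;   /* a = b0 */
}

/* Arithmetic only: add and subtract modulo 2^32. */
void swap_addsub(uint32_t *a, uint32_t *b)
{
    assert(a != b);
    *a += *b;       /* a = a0 + b0 */
    *b = *a - *b;   /* b = a0 */
    *a -= *b;       /* a = b0 */
}
```

All three functions leave the two values exchanged; the unsigned arithmetic in swap_addsub wraps modulo 2^32, so overflow in the intermediate sum does not affect the result.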
Experimental Results
Function: Fragment of compiler-generated code
Input code
Sample equivalent versions
Small set of constants allowed in search
(may be harvested from real-life programs).
Empirical Taxonomy
Some Applications
An element of comprehensive individualization systems
• Defense against signature-based attacks
• Patch obfuscation
– Patches reveal location of vulnerabilities.
• “Patch Tuesdays” often followed by exploits.
• Diffing tools locate vulnerable code quickly.
– Superdiversification helps to hide patches.
• Maximize size of diff between unpatched and patched
applications.
• For best results, diversify large sections of the patched
binary, not just the patch code.
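To make the signature-based case concrete, the sketch below scans two code fragments for a fixed byte pattern. The fragments use the standard x86 encodings of MOV EAX,0 and XOR EAX,EAX, each followed by RET; the scanner find_signature and the array names are hypothetical, and real signature engines are considerably more elaborate. The two fragments agree on the value left in EAX, although XOR also updates the flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the first occurrence of sig in buf, or -1 if absent. */
long find_signature(const uint8_t *buf, size_t n,
                    const uint8_t *sig, size_t m)
{
    assert(m > 0);
    for (size_t i = 0; i + m <= n; i++)
        if (memcmp(buf + i, sig, m) == 0)
            return (long)i;
    return -1;
}

/* MOV EAX,0 ; RET */
const uint8_t variant_mov[] = { 0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3 };
/* XOR EAX,EAX ; RET */
const uint8_t variant_xor[] = { 0x31, 0xC0, 0xC3 };
/* Byte signature derived from the MOV variant. */
const uint8_t signature[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 };
```

Scanning variant_mov finds the signature at offset 0, while scanning variant_xor returns -1: a signature derived from one individualized copy does not match the other.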
Conclusion
• Main contribution: Guided search for instruction
sequences to individualize binaries.
• Future work
– Extend range of superdiversified code.
• Other types of instructions
• Control-flow constructs
– Optimize for better speed.
– Adapt to custom byte-codes.
• Modern instruction sets are geared towards generality and
performance.
• Custom byte-codes may be designed for individualization
and obfuscation.
• Instructions may perform arbitrary operations, not just serve
as elementary building blocks.