
Milestones and measures of success
Collaborative learning for security and repair in application communities
MIT & Determina
AC PI meeting
July 10, 2007
Milestones – Oct 2006
• Learning for Windows executables
• Port Daikon to work on executables rather than source code
• New infrastructure
• Requires debug information in the executable
• Utilizes debug information to determine variables and their names
Milestones – Jan 2007
• Distributed learning
• Partition learning throughout the community
• Communicate learning results to central location
• Much smaller than raw trace data
• Does not require storing trace data
• Centrally combine learning results
• Identical results to learning over all data in one big run (see the sketch below)
• Each client still learns over whole program (or same parts of it)
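A minimal Python sketch of why the central merge can match one big run: for invariants of the form "property P held on every observed sample", P holds over the union of the community's data exactly when every client reports it, so combining reduces to set intersection. (Names and data are illustrative, not the project's actual code; statistical invariants would instead merge sample counts.)

    from functools import reduce

    def combine(client_results):
        # Each element is the set of invariants one client learned over
        # its slice of the community's executions; an invariant survives
        # only if it held on every client's data.
        return reduce(set.intersection, client_results)

    clients = [
        {"x >= 0", "p != NULL", "len <= size"},
        {"x >= 0", "p != NULL"},
        {"x >= 0", "p != NULL", "len == size"},
    ]
    print(combine(clients))  # prints {'x >= 0', 'p != NULL'} (order may vary)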
Milestones – Apr 2007
• Learning on stripped Windows executables
• No need for debug information
• New notion of values to learn over: “binary variables” (sketched after this list)
• Expressions that are used in the binary (the executable)
• Solves many problems with lack of debug information
• Since they are used, they are guaranteed to be meaningful
• No need to track initialization
• No need to track array lengths
• No need for other tricks to prevent core dumps or behavior changes
• Yields variables at every point of program execution
• Includes within loops, etc.
• Not just procedure entry and exit
• Reduces false positives
• Compare only the variables that are used
• Slightly less expressive
• Fewer comparisons
• Mitigate by using values from previous basic blocks
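As a rough sketch (the decoded-instruction form below is an assumption for illustration, not our actual infrastructure), binary variables can be harvested directly from the expressions each instruction reads:

    from collections import namedtuple

    # Hypothetical decoded instruction: address, mnemonic, and the
    # operand expressions the instruction reads.
    Insn = namedtuple("Insn", "addr mnemonic reads")

    def binary_variables(insns):
        # Because the program itself uses these expressions, they are
        # guaranteed meaningful: no initialization or array-length
        # tracking is needed, and every program point yields variables.
        return {insn.addr: set(insn.reads) for insn in insns}

    trace = [
        Insn(0x401000, "mov", ("eax", "[ebp-4]")),
        Insn(0x401003, "cmp", ("eax", "[esi+ecx*4]")),
    ]
    print(binary_variables(trace))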
Milestones – Apr 2007
• Manual implementation of the repair algorithm
• Testing on a real program to illustrate capabilities
• All phases exist, but not all automated
Milestones – Jul 2007
• Automation of learning and protection process
• Learning (infers invariants)
• Same status as in previous milestones
• Monitoring (detects attacks)
• Integration of first detector: Memory Firewall Detector
• Deployment of monitoring and of repairs
• Uses Determina Management Console
• Localization (determines which invariants are violated; sketched below)
• Logs results of evaluating previously learned invariants
• Results collected by central management console
• Protection (repairs violated properties)
• Distribute and evaluate multiple repairs
• Not yet implemented for all learned invariants
• Currently limited to a selected part of the program
• Improves performance
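A minimal sketch of the localization step, assuming learned invariants ship to clients as evaluable predicates (the class and field names are hypothetical):

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Invariant:
        name: str
        point: int                      # program point (instruction address)
        check: Callable[[dict], bool]   # predicate over the observed state

    def localize(invariants, state, log):
        # Evaluate previously learned invariants and log the violated
        # ones for collection by the central management console.
        violated = [inv for inv in invariants if not inv.check(state)]
        for inv in violated:
            log.append({"invariant": inv.name, "point": hex(inv.point)})
        return violated  # candidates for repair

    inv = Invariant("len <= size", 0x401010, lambda s: s["len"] <= s["size"])
    log = []
    localize([inv], {"len": 9, "size": 4}, log)
    print(log)  # [{'invariant': 'len <= size', 'point': '0x401010'}]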
Milestones – Oct 2007
• Learning over entire program
• Scale up infrastructure
• Different community members learn about different parts of the program (assignment sketched below)
• Enrich variables over which learning occurs
• Support invariants over variables on the call stack
• Additional repairs
• Support new types of invariants
• Add new repairs for existing invariants
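One plausible partitioning scheme (an assumption for illustration; the deployed assignment may differ) hashes program points across community members:

    import zlib

    def my_points(all_points, client_id, community_size):
        # Deterministic assignment: each program point goes to exactly
        # one community member, so together the clients cover the whole
        # program while each one instruments only a slice.
        return [p for p in all_points
                if zlib.crc32(p.encode()) % community_size == client_id]

    points = ["foo:entry", "foo:exit", "bar:loop", "baz:entry"]
    print(my_points(points, client_id=0, community_size=3))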
Milestones – Nov 2007
• Red Team exercise
Milestones – Apr 2008
• Add new detector: Crash detector
• (Previous detector: Determina Memory Firewall)
• Problem: many applications are written expecting to crash
• Application catches the crash and restarts part of itself
• We must distinguish between “expected” and “unexpected” crashes
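A simplified sketch of the distinction the crash detector must draw (heuristic only; the real detector would inspect, e.g., Windows exception-handler chains and observed restart behavior):

    def classify_crash(fault_addr, handler_ranges, app_restarted_itself):
        # "Expected" crash: the application's own handler covers the
        # faulting code, or the app recovers by restarting part of
        # itself. Only unexpected crashes count as a potential attack.
        handled = any(lo <= fault_addr < hi for lo, hi in handler_ranges)
        return "expected" if handled or app_restarted_itself else "unexpected"

    print(classify_crash(0x40200C, [(0x402000, 0x402100)], False))  # expected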
Milestones – Jul 2008
• Add new detector: data structure consistency checker
• (Previous detectors: Memory Firewall, crashes)
• Flags violation of learned properties as a potential attack (sketched after this list)
• Problems:
• Higher false positive rate
– Due to incomplete learning: not enough executions to learn over
• Difficult to know whether repairs make things better
– Due to incomplete learning: unlearned properties
• Experimentally evaluate whether this works
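A sketch of flagging violations of learned data-structure properties, with a confidence gate to soften the higher false-positive rate that incomplete learning causes (all names illustrative, not the deployed checker):

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class LearnedProperty:
        name: str
        confidence: float               # grows with executions observed
        holds: Callable[[dict], bool]   # consistency predicate on the heap

    def check_consistency(properties, heap, min_confidence=0.99):
        alarms = []
        for prop in properties:
            if prop.confidence < min_confidence:
                continue  # too few executions behind it; likely spurious
            if not prop.holds(heap):
                alarms.append(prop.name)  # potential attack
        return alarms

    prop = LearnedProperty("node.next.prev == node", 0.999,
                           lambda heap: heap.get("doubly_linked_ok", True))
    print(check_consistency([prop], {"doubly_linked_ok": False}))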
Milestones – Oct 2008
• Support Microsoft executables
• Highly optimized
• Code for one function scattered around program
• Uses unexpected instructions and code conventions
• Parts are coded in assembly to optimize speed/space
• Without source code, debugging our system is difficult
Milestones – Dec 2008
• Productize
• Determina offers additional product to DoD and other customers
• Seamlessly integrated with suite of security and reliability offerings
• Improves availability of large installations
• Detects and mitigates attacks before a human has had time to do so
• Buys time for human intervention (or eliminates it)
Phase 1 measures of success
• Automated learning and repair
• Effective against attacks
• Low overhead
Automated learning and repair
• Instrument community
• Learn over different parts of the program
• Combine results
• Detect attacks
• Distribute fixes
• Evaluate fixes
• Distribute best fixes (the full loop is sketched at the end of this slide)
• Works on stripped Windows executables
• We suggest an open source program for evaluation
• Eases task of the Red Team
• Eases our development effort
• No source code is used during the evaluation
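Putting the steps together, a high-level sketch of one community cycle; every class and method here is a hypothetical placeholder for a distributed component, not a real API:

    # Hypothetical stand-ins for the distributed pieces.
    class Client:
        def __init__(self, part): self.part = part
        def instrument(self): pass
        def learn(self): return {f"inv@{self.part}", "x >= 0"}
        def detect(self, invariants): return None   # no attack in this demo
        def evaluate(self, repair): return 1.0

    class Server:
        def __init__(self): self.results = []
        def collect(self, invs): self.results.append(invs)
        def combine(self):                  # union across program parts
            return set().union(*self.results)
        def propose_fixes(self, attack): return []
        def distribute_best(self, repairs, scores): pass

    def community_cycle(clients, server):
        for c in clients:
            c.instrument()                  # 1. instrument community
            server.collect(c.learn())       # 2. learn over different parts
        invariants = server.combine()       # 3. combine results
        for c in clients:
            attack = c.detect(invariants)   # 4. detect attacks
            if attack:
                repairs = server.propose_fixes(attack)      # 5. distribute fixes
                scores = [c.evaluate(r) for r in repairs]   # 6. evaluate fixes
                server.distribute_best(repairs, scores)     # 7. distribute best fixes

    community_cycle([Client("partA"), Client("partB")], Server())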
Effective detection and repair
• Detect 95% of code injection attacks
• These inject external code into the application and execute it
• Not: downloading bad executable, Word file, etc.
• Not: application errors (e.g., function returns the wrong value)
• Recover from 60% of these attacks
• After learning from a number of attacks
• Number of attacks (and amount of learning) depends on
confidence required
Low overhead
• Protection mode overhead of less than 200%
• Not noticeable for I/O-bound programs
• Use extra CPUs on chip
• Better performance in Phase 2
• Determina infrastructure adds about 5% overhead
• Measure overhead in steady state
• Startup instrumentation costs are higher
• Learning must take place over several months
• We will develop an automated system for simulating user actions to facilitate learning (sketched below)
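A sketch of the planned driver for simulating user actions (the application path and timing are illustrative assumptions): replay batch-oriented inputs through the instrumented application so learning accumulates executions without months of live use.

    import pathlib
    import subprocess
    import time

    APP = r"C:\Program Files\Mozilla Firefox\firefox.exe"  # illustrative path

    def replay_corpus(corpus_dir, seconds_per_page=30):
        # Batch inputs only (HTML, images): no forms, no user interaction,
        # so the whole learning run can proceed unattended.
        for page in sorted(pathlib.Path(corpus_dir).glob("*.html")):
            proc = subprocess.Popen([APP, str(page)])
            time.sleep(seconds_per_page)   # let the page load; invariants update
            proc.terminate()               # move on to the next input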
Phase 2 measures of success
• Support additional attack detectors
• Crash detector
• Data structure consistency checker
• Detect 50% of attacks and errors that damage the information representation of the program
• Recover from 30% of the detected attacks
• Overhead during monitoring and learning of less than 150% (steady-state)
• System works on a reasonable subset of the standard Microsoft programs (including Office and Windows)
Measures of success
• Code injection attacks only (insert code into application & execute it)
• Downloading bad executable, Word file, etc. is not code injection
• One program (we suggest Firefox) only, one version (we will choose version 1.0 or something similar), no plugins
• 20 occurrences of each unique attack (identical), to achieve repair
• Learning phase takes place in advance of the Red Team exercise
• Only input that is batch-oriented (HTML, JPEG, GIF, etc.), no forms, no user interactions of any sort (so we can automate learning). HTML files can include JavaScript code.
• Doesn't cover downloading bad executables, Word files, etc. (these aren't really code injection attacks)