Delta Debugging
Koushik Sen
EECS, UC Berkeley
How do we go from this
<td align=left valign=top>
<SELECT NAME="op sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION
VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows
2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac
System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System
8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac
System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION
VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION
VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HPUX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION
VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION
VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION
VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="bug severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION
VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION
VALUE="enhancement">enhancement</SELECT>
</tr>
</table>
File
Print
Segmentation Fault
into this
<SELECT>
File
Print
Segmentation Fault
Simplification
• Once one has reproduced a problem, one must
find out what’s relevant:
– Does the problem really depend on 10,000 lines of
input?
– Does the failure really require this exact schedule?
– Do we need this sequence of calls?
Broken computer
• How do you identify the problem?
Why Simplify?
• Ease of communication. A simplified test case
is easier to communicate.
• Easier debugging. Smaller test cases result in
smaller states and shorter executions.
• Identify duplicates. Simplified test cases
subsume several duplicates.
Motivation
• The Mozilla open-source web browser project receives
several dozen bug reports a day.
• Each bug report has to be simplified
– Eliminate all details irrelevant to producing the failure
• To facilitate debugging
• To make sure it does not duplicate a similar bug report
• In July 1999, Bugzilla listed more than 370 open bug
reports for Mozilla.
– These were not even simplified
– Mozilla engineers were overwhelmed with work
– They created the Mozilla BugAThon: a call for volunteers to
process bug reports
Your Solution
• How do you solve these problems?
• Binary search
– Cut the test case in half
– Iterate
• Brilliant idea: Why not automate this?
Binary Search
• Proceed by binary search. Throw away half
the input and see if the output is still wrong.
• If not, go back to the previous state and
discard the other half of the input.
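The halving loop described here can be sketched in Python as follows. This is a minimal illustration, not the lecture's implementation: `fails` is a hypothetical predicate, supplied by the caller, that reports whether the program still misbehaves on the given input lines. The loop stops as soon as neither half reproduces the failure on its own.

```python
def simplify_by_halving(lines, fails):
    """Repeatedly keep whichever half of the input still fails.

    `fails` is a hypothetical predicate supplied by the caller: it runs the
    program on the given lines and returns True if the output is still wrong.
    """
    current = list(lines)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):        # still wrong: throw away the second half
            current = first
        elif fails(second):     # go back and discard the first half instead
            current = second
        else:                   # neither half fails on its own: stuck
            break
    return current


# Toy stand-in for "load the page and print it": the failure is triggered
# by any input that contains a <SELECT> line.
page = ["<td>", '<SELECT NAME="priority">', '<OPTION VALUE="P1">P1', "</td>"]
print(simplify_by_halving(page, lambda ls: any("<SELECT" in l for l in ls)))
```

When both halves pass, this plain binary search has no way to continue; the later slides address that case by changing the granularity.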
[Animation: the same slide repeated for each halving step, with outcomes in order ✘ ✘ ✘ ✔ ✘ ✔ ✔, ending at the simplified input]
Complex Input
<td align=left valign=top>
<SELECT NAME="op sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION
VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows
2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac
System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System
8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac
System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION
VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION
VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HPUX<OPTION VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION
VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION
VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION
VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT>
</td>
<td align=left valign=top>
<SELECT NAME="bug severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION
VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION
VALUE="enhancement">enhancement</SELECT>
</tr>
</table>
File
Print
Segmentation Fault
Simplified Input
• <SELECT NAME="priority" MULTIPLE SIZE=7>
• Simplified from 896 lines to one single line
• Required 12 tests only
Binary Search
• <SELECT NAME="priority" MULTIPLE SIZE=7> ✘
• <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
• <SELECT NAME="priority" MULTIPLE SIZE=7> ✔
What do we do if both halves pass?
Change Granularity
                            Increase granularity of test case   Decrease granularity of test case
Failure of the test case    More chances                        Fewer chances
Progress of the search      Slower, reduced by amount < ½       Faster, reduced by amount > ½
General DD Algorithm
Basic idea:
1. Start with few & large changes first
[Figure: the change set split into two subsets, ∆1 and ∆2]
2. If all alternatives pass or are unresolved,
apply more & smaller changes.
[Figure: the change set split into eight subsets, ∆1 through ∆8]
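The move from few, large subsets to more, smaller ones can be sketched as a partitioning helper. The name `split` and the choice of contiguous, nearly equal chunks are assumptions made for illustration; the slides only require that the subsets be disjoint and cover the whole change set.

```python
def split(changes, n):
    """Split `changes` into n contiguous, pairwise disjoint chunks of nearly equal size."""
    chunks, start = [], 0
    for i in range(n):
        # Spread the remaining elements evenly over the remaining chunks.
        end = start + (len(changes) - start) // (n - i)
        chunks.append(changes[start:end])
        start = end
    return chunks


changes = list(range(1, 9))   # eight elementary changes, numbered 1..8
print(split(changes, 2))      # few & large subsets
print(split(changes, 4))      # more & smaller subsets
```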
Binary Search
[Figure: seven successive test inputs derived from <SELECT NAME="priority" MULTIPLE SIZE=7>, with outcomes ✘ ✔ ✔ ✔ ✘ ✘ ✔]
Inputs and Failures
• Let $E$ be the set of possible inputs
• $r_P \in E$ corresponds to an input that passes
• $r_F \in E$ corresponds to an input that fails
Binary Search
[Figure: seven test inputs derived from <SELECT NAME="priority" MULTIPLE SIZE=7>, with outcomes ✘ ✔ ✔ ✔ ✘ ✘ ✔]
Example of $r_F$ and $r_P$?
$r_P$ = <SELECT NALE SIZE=7>
$r_F$ = <SELECT NAME="priori
Changes
• We can go from one input to another by changes
• A change is a mapping $\delta : E \to E$ which takes one
input and changes it to another input
$r_1$ = <SELECT NAty" MULTIPLE SIZE=7>
$\delta'$ = insert ME="priori at appropriate position
What is $\delta'(r_1)$?
<SELECT NAME="priority" MULTIPLE SIZE=7>
Decomposing Changes
• A change $\delta$ can be decomposed into a number of
elementary changes $\delta_1, \delta_2, \dots, \delta_n$ where $\delta = \delta_1 \circ \delta_2 \circ \dots \circ \delta_n$
– where $(\delta_i \circ \delta_j)(r) = \delta_i(\delta_j(r))$
• For example, deleting a part of the input file can be
decomposed into deleting characters one by one from
the input file
– another way to say it: by composing deletions of single
characters we can get a change that deletes part of the input
file
• $\delta'$ = insert ME="priori
• $\delta' = \delta_1 \circ \delta_2 \circ \dots \circ \delta_{10}$
– $\delta_1$ = insert M
– $\delta_2$ = insert E
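A small Python sketch of this decomposition, under one modeling assumption: every elementary change inserts its character at the same fixed position (right after "NA"). With the composition rule above, the rightmost change is applied first, so the change that inserts M runs last and M ends up in front. The helper names `insert_char` and `compose` are hypothetical.

```python
from functools import reduce


def insert_char(pos, ch):
    """Return an elementary change: a function E -> E inserting `ch` at index `pos`."""
    return lambda r: r[:pos] + ch + r[pos:]


def compose(*changes):
    """(d1 o d2 o ... o dn)(r) = d1(d2(...dn(r))): the rightmost change is applied first."""
    return lambda r: reduce(lambda acc, d: d(acc), reversed(changes), r)


r1 = '<SELECT NAty" MULTIPLE SIZE=7>'
pos = r1.index("NA") + 2                    # insertion point right after "NA"

# deltas[0] inserts M, deltas[1] inserts E, ..., deltas[9] inserts the final i.
deltas = [insert_char(pos, ch) for ch in 'ME="priori']

delta_prime = compose(*deltas)              # delta_1 o delta_2 o ... o delta_10
print(delta_prime(r1))
```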
Summary
• We have an input with failure: $r_F$
• We have an input without failure: $r_P$
• We have a set of changes $c_F = \{\delta_1, \delta_2, \dots, \delta_n\}$
such that
$r_F = (\delta_1 \circ \delta_2 \circ \dots \circ \delta_n)(r_P)$, where $r_F$ is a run with failure
• Each subset $c$ of $c_F$ is a test case
Testing Test Cases
• Given a test case $c$, we would like to know if the input
generated by applying the changes in $c$ to $r_P$ is an
input that causes a failure
• We define a function
$\mathit{test} : \mathrm{Powerset}(c_F) \to \{P, F, ?\}$
such that, given $c = \{\delta_1, \delta_2, \dots, \delta_m\} \subseteq c_F$,
$\mathit{test}(c) = F$ iff $(\delta_1 \circ \delta_2 \circ \dots \circ \delta_m)(r_P)$ is a failing input
Minimizing Test Cases
• Now the question is: Can we find the minimal test case
$c$ such that $\mathit{test}(c) = F$?
• A test case $c \subseteq c_F$ is called the global minimum of $c_F$ if
for all $c' \subseteq c_F$, $|c'| < |c| \Rightarrow \mathit{test}(c') \neq F$
• The global minimum is the smallest set of changes which
will make the program fail
• Finding the global minimum may require us to perform
an exponential number of tests
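To see where the exponential cost comes from, here is a brute-force sketch that tries subsets in order of increasing size. It is only an illustration of the definition, with a toy `test` function as a stated assumption; with $n$ changes it may examine all $2^n$ subsets before finding a failing one.

```python
from itertools import combinations


def global_minimum(c_fail, test):
    """Return (smallest failing subset, number of tests run), or (None, tests)."""
    changes = sorted(c_fail)
    tests_run = 0
    for size in range(len(changes) + 1):
        for subset in combinations(changes, size):
            tests_run += 1
            if test(set(subset)) == "F":
                # Every smaller subset was already tested and did not fail.
                return set(subset), tests_run
    return None, tests_run


# Toy test: the failure needs changes 3 and 7 together.
toy_test = lambda c: "F" if {3, 7} <= c else "P"
print(global_minimum(range(1, 11), toy_test))
```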
Search for 1-minimal Input
• Different problem formulation
Find a set of changes that causes the failure, such that
removing any single change makes the failure go away
• This is 1-minimality
Minimizing Test Cases
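• A test case $c \subseteq c_F$ is called a local minimum of $c_F$ if
for all $c' \subset c$, $\mathit{test}(c') \neq F$
• A test case $c \subseteq c_F$ is $n$-minimal if
for all $c' \subset c$, $|c| - |c'| \le n \Rightarrow \mathit{test}(c') \neq F$
• A test case is 1-minimal if
for all $\delta_i \in c$, $\mathit{test}(c - \{\delta_i\}) \neq F$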
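The 1-minimality condition translates almost directly into a check. The extra requirement that `c` itself fails is an assumption added here so the check is meaningful on its own. The toy test below is deliberately non-monotone to show that a 1-minimal set need not be the global minimum.

```python
def is_one_minimal(c, test):
    """True if c fails and removing any single change makes the failure go away."""
    c = set(c)
    return test(c) == "F" and all(test(c - {d}) != "F" for d in c)


# Toy test (hypothetical): exactly the sets {1, 2, 3} and {4} fail.
toy_test = lambda c: "F" if c in ({1, 2, 3}, {4}) else "P"

print(is_one_minimal({1, 2, 3}, toy_test))   # 1-minimal, although {4} is smaller
print(is_one_minimal({4}, toy_test))
print(is_one_minimal({1, 2, 3, 4}, toy_test))
```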
Naïve Algorithm
• To find a 1-minimal subset of $C$, simply
• Remove one element $\delta$ from $C$
• If $\mathit{test}(C - \{\delta\}) = F$, recurse with the smaller set
• If $\mathit{test}(C - \{\delta\}) \neq F$ for every $\delta \in C$, $C$ is 1-minimal
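A direct Python rendering of the naïve procedure, written as a loop rather than a recursion. The toy test function is an assumption for illustration; the function also counts tests so the cost is visible.

```python
def naive_minimize(c, test):
    """Remove one change at a time until no single removal preserves the failure."""
    c = set(c)
    tests_run = 0
    progress = True
    while progress:
        progress = False
        for d in sorted(c):
            tests_run += 1
            if test(c - {d}) == "F":
                c = c - {d}       # still fails: continue with the smaller set
                progress = True
                break
    return c, tests_run           # no single removal fails: c is 1-minimal


# Toy test: the failure needs changes 3 and 7 together.
toy_test = lambda c: "F" if {3, 7} <= c else "P"
print(naive_minimize(range(1, 9), toy_test))
```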
Analysis
• In the worst case,
– We remove one element from the set per iteration
– After trying every other element
• Work is potentially
$N + (N-1) + (N-2) + \dots$
• This is $O(N^2)$
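Spelled out, under the worst-case assumption that an iteration on a set of size $k$ needs $k$ tests before one element can be removed:

```latex
\[
N + (N-1) + (N-2) + \dots + 1 \;=\; \sum_{k=1}^{N} k \;=\; \frac{N(N+1)}{2} \;=\; O(N^2)
\]
```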
Work Smarter, Not Harder
• We can often do better
• Silly to start out removing 1 element at a time
– Try dividing change set in 2 initially
– Increase # of subsets if we can’t make progress
– If we get lucky, search will converge quickly
Minimization Algorithm
• The delta debugging algorithm finds a 1-minimal test case
• It partitions the set $c_F$ into $\Delta_1, \Delta_2, \dots, \Delta_n$
– $\Delta_1, \Delta_2, \dots, \Delta_n$ are pairwise disjoint
– $c_F = \Delta_1 \cup \Delta_2 \cup \dots \cup \Delta_n$
• Define the complement of $\Delta_i$ as $\nabla_i = c_F - \Delta_i$
• Start with $n = 2$
• Test each test case defined by the partition and their
complements
• Reduce the test case if a smaller failure-inducing set is found
– otherwise refine the partition, i.e. replace $n$ by $2n$
Each Step of the Minimization Algorithm
Let $n = 2$
(A) Start with $\Delta$ as the test set
(B) Test each $\Delta_1, \Delta_2, \dots, \Delta_n$ and each $\nabla_1, \nabla_2, \dots, \nabla_n$
(C) There are four possible outcomes
1. Some $\Delta_i$ causes failure
– Go to step (A) with $\Delta = \Delta_i$ and $n = 2$
2. Some $\nabla_i$ causes failure
– Go to step (A) with $\Delta = \nabla_i$ and $n = n - 1$
3. No test causes failure
– Increase granularity: Go to (A) with $\Delta = \Delta$ and $n = 2n$
4. The granularity can no longer be increased
– Done, found the 1-minimal subset
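Steps (A) to (C) can be sketched in Python as below. This is an illustrative reading of the slide, not the author's code. Two details are assumptions beyond the slide: $n$ is capped at the size of the current set when doubling, and the complements are skipped when $n = 2$ because they coincide with the two subsets. The helper `split` and the toy test functions are hypothetical.

```python
def split(c, n):
    """Partition list c into n contiguous, non-empty chunks (requires n <= len(c))."""
    chunks, start = [], 0
    for i in range(n):
        end = start + (len(c) - start) // (n - i)
        chunks.append(c[start:end])
        start = end
    return chunks


def ddmin(c_fail, test):
    """Return a 1-minimal failing subset of c_fail, assuming test(c_fail) == "F"."""
    c, n = sorted(c_fail), 2
    while len(c) >= 2:
        subsets = split(c, n)
        reduced = False
        for s in subsets:                          # outcome 1: some subset fails
            if test(set(s)) == "F":
                c, n, reduced = s, 2, True
                break
        if not reduced and n > 2:
            for s in subsets:                      # outcome 2: some complement fails
                complement = [d for d in c if d not in s]
                if test(set(complement)) == "F":
                    c, n, reduced = complement, n - 1, True
                    break
        if not reduced:
            if n >= len(c):                        # outcome 4: cannot refine further
                break
            n = min(2 * n, len(c))                 # outcome 3: increase granularity
    return set(c)


# Toy tests: one failure needs changes 3 and 7 together, the other only change 5.
print(ddmin(range(1, 9), lambda c: "F" if {3, 7} <= c else "P"))
print(ddmin(range(1, 9), lambda c: "F" if 5 in c else "P"))
```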
Delta Debugging
Analysis
• Worst case is still quadratic
• Subdivide until each set is of size 1
– Reduced to the naïve algorithm
• Good news
– For a single failure-inducing change, converges in O(log N) tests
– Binary search again
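A short justification of the logarithmic bound, under the assumption that exactly one change induces the failure and every test case containing it fails. Each round then tests at most the two halves and keeps the failing one:

```latex
\[
N \;\to\; \tfrac{N}{2} \;\to\; \tfrac{N}{4} \;\to\; \dots \;\to\; 1
\quad\Longrightarrow\quad
\text{rounds} \le \lceil \log_2 N \rceil,
\qquad
\text{tests} \le 2\,\lceil \log_2 N \rceil = O(\log N)
\]
```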
The GNU C Compiler
• This program (bug.c) causes GCC 2.95.2 to crash when optimization is enabled
• We would like to minimize this program in order to file a bug report
• In the case of GCC, a passing program run is the empty input
• For the sake of simplicity, we model a change as the insertion of a single character
– $r_P$ is running GCC with an empty input
– $r_F$ means running GCC with bug.c
– each change $\delta_i$ inserts the $i$th character of bug.c

#define SIZE 20

double mult(double z[], int n)
{
    int i, j;
    i = 0;
    for (j = 0; j < n; j++) {
        i = i + j + 1;
        z[i] = z[i] * (z[0] + 1.0);
    }
    return z[n];
}

void copy(double to[], double from[], int count)
{
    int n = (count + 7) / 8;
    switch (count % 8) do {
        case 0: *to++ = *from++;
        case 7: *to++ = *from++;
        case 6: *to++ = *from++;
        case 5: *to++ = *from++;
        case 4: *to++ = *from++;
        case 3: *to++ = *from++;
        case 2: *to++ = *from++;
        case 1: *to++ = *from++;
    } while (--n > 0);
    return mult(to, 2);
}

int main(int argc, char *argv[])
{
    double x[SIZE], y[SIZE];
    double *px = x;
    while (px < x + SIZE)
        *px++ = (px - x) * (SIZE + 1.0);
    return copy(y, x, SIZE);
}
The GNU C Compiler
• The test procedure would
– create the appropriate subset of bug.c
– feed it to GCC
– return F iff GCC had crashed, and P otherwise
[Figure: size of the test case in characters during minimization; labeled values 755, 377, 188, 77]
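A hedged sketch of such a test procedure in Python. The compiler command line (`gcc -O -c`), the file names, and the way a crash is recognized (a negative return code, or an "internal compiler error" message on stderr) are all assumptions for illustration; they are not taken from the slides and may need adjusting for a given GCC version. The `run` parameter exists only so that the compiler invocation can be replaced by a stub.

```python
import os
import subprocess
import tempfile


def make_gcc_test(bug_source, compile_cmd=("gcc", "-O", "-c"), run=subprocess.run):
    """Build test(c), where each change i in c inserts the i-th character of bug.c."""
    def test(c):
        # Create the appropriate subset of bug.c.
        candidate = "".join(bug_source[i] for i in sorted(c))
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "candidate.c")
            obj = os.path.join(tmp, "candidate.o")
            with open(src, "w") as f:
                f.write(candidate)
            # Feed it to GCC.
            result = run([*compile_cmd, "-o", obj, src],
                         capture_output=True, text=True)
        # Assumed crash signature: killed by a signal (negative return code
        # on POSIX) or an internal-compiler-error message on stderr.
        crashed = (result.returncode < 0
                   or "internal compiler error" in result.stderr.lower())
        # Ordinary compile errors count as passing, as on the slide.
        return "F" if crashed else "P"
    return test


# Usage (not executed here, since it needs a compiler and bug.c on disk):
#   source = open("bug.c").read()
#   test = make_gcc_test(source)
#   test(set(range(len(source))))   # run GCC on the complete bug.c
```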
The GNU C Compiler
• The minimized code is
t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
• The test case is 1-minimal
– No single character can be removed without removing the failure
– Every superfluous whitespace character has been removed
– The function name has shrunk from mult to a single t
– This program actually has a semantic error (infinite loop), but GCC
still isn't supposed to crash
• So where could the bug be?
– We already know it is related to optimization
– If we remove the -O option to turn off optimization, the failure
disappears
The GNU C Compiler
• The GCC documentation lists 31 options to control
optimization on Linux
-ffloat-store
-fforce-mem
-fno-inline
-fkeep-static-consts
-fstrength-reduce
-fcse-skip-blocks
-fgcse
-fschedule-insns2
-fcaller-saves
-fmove-all-movables
-fstrict-aliasing
-fno-default-inline
-fforce-addr
-finline-functions
-fno-function-cse
-fthread-jumps
-frerun-cse-after-loop
-fexpensive-optimizations
-ffunction-sections
-funroll-loops
-freduce-all-givs
-fno-defer-pop
-fomit-frame-pointer
-fkeep-inline-functions
-ffast-math
-fcse-follow-jumps
-frerun-loop-opt
-fschedule-insns
-fdata-sections
-funroll-all-loops
-fno-peephole
• It turns out that applying all of these options causes
the failure to disappear
– Some option(s) prevent the failure
The GNU C Compiler
• We can use test case minimization in order to find the
preventing option(s)
– Each $\delta_i$ stands for removing a GCC option
– Having all $\delta_i$ applied means to run GCC with no option (failing)
– Having no $\delta_i$ applied means to run GCC with all options (passing)
• After seven tests, the single option -ffast-math is found which
prevents the failure
– Unfortunately, it is a bad candidate for a workaround because it may
alter the semantics of the program
– Thus, we remove -ffast-math from the list of options and make
another run
– Again after seven tests, it turns out that -fforce-addr also prevents
the failure
– Further examination shows that no other option prevents the failure
The GNU C Compiler
• So, this is what we can send to the GCC
maintainers
– The minimal test case
– “The failure only occurs with optimization”
– “-ffast-math and -fforce-addr prevent the failure”
Case Studies
• Reducing a Mozilla bug to
– Print the HTML file containing <SELECT>
• Minimizing fuzz input
– Fuzzing: Take a program, send it randomly
generated input and see if it crashes
– Used delta debugging to minimize fuzz input
Another Application
• Yesterday, my program worked. Today, it does not.
Why?
– The new release 4.17 of GDB changed 178,000 lines
– It no longer integrated properly with DDD (a graphical front end)
– How do we isolate the change that caused the failure?
Summary
• Delta Debugging is a technique, not a tool
• Bad News:
– Probably must be reimplemented for each
significant system
– To exploit knowledge of changes
• Good News:
– Relatively simple algorithm, significant
payoff
– It’s worth reimplementing