Inference and Checking Framework for Pluggable Types

Type-based Taint Analysis for Java Web Applications
Wei Huang, Yao Dong and Ana Milanova
Rensselaer Polytechnic Institute
Taint Analysis for Java Web Applications
• Tracks flows from untrusted sources to sensitive sinks
◦ Such flows can cause SQL injection, cross-site scripting, and other attacks
[Figure: Untrusted input → (unsanitized) → Sensitive sinks]
SOURCES: ServletRequest.getParameter(), etc.
SINKS: Statement.execute(), etc.
SQL Injection
HttpServletRequest req = ...;
Statement stat = ...;
String user = req.getParameter("user"); // tainted input, e.g. "John OR 1=1"
String query = "SELECT * FROM Users WHERE name = " + user;
stat.execute(query);
Resulting query:
SELECT * FROM Users WHERE name = John OR 1 = 1
Work on Taint Analysis
• Finding Security Vulnerabilities with Static Analysis [Livshits and Lam, Usenix Security’05]
• TAJ [Tripp et al. PLDI’09]
• F4F [Sridharan et al. OOPSLA’11]
• Andromeda [Tripp et al. FASE’13]
• TAJ, F4F and Andromeda are included in a commercial tool from IBM, called AppScan
Issues with Existing Work
• Dataflow and points-to based approaches
• Reflection
• Libraries
• Frameworks
Our Type-based Taint Analysis


• SFlow: a type system
• SFlowInfer: inference tool for SFlow
• Easily and effectively handles reflection, libraries and frameworks
◦ Takes a Java program where sources are typed tainted and sinks are typed safe
◦ Infers SFlow types for the rest of the variables
◦ If inference succeeds, there are no flows from sources to sinks
◦ If it fails with type errors, there are potential flows
Inference and Checking Framework
Parameters
• Immutability (ReIm)
• Universe Types (UT)
• Ownership Types (OT)
• SFlow
• AJ
• EnerJ
• More?
[Figure: framework pipeline. Parameters and Unified Typing Rules → Instantiated Rules; Program Source, Annotated Libraries and Instantiated Rules → Set-Based Solver → Set-based Solution → Extract Typing → Concrete Typing → Type Checking]
SFlowInfer
• The instantiated inference tool
• Detects (or verifies the absence of) information flow violations
[Figure: Sources and Sinks, Annotated Libraries, Java source → SFlowInfer → Result]
SQL Injection
HttpServletRequest req = ...;
Statement stat = ...;
tainted String user = req.getParameter("user"); // Source: the return value is tainted
tainted String query = "SELECT * FROM Users WHERE name = " + user;
stat.execute(query); // Sink: the parameter is safe. Type error!
Subtyping: safe <: tainted
Contributions
• SFlow: A context-sensitive type system for secure information flow
• SFlowInfer: An inference algorithm for SFlow
◦ SFlowInfer is an effective taint analysis tool
• Implementation and evaluation
Outline
• SFlow type system
• Inference algorithm for SFlow
• Handling of reflection, libraries and frameworks
• Implementation and evaluation
SFlow Type Qualifiers
• tainted: A variable x is tainted if there is flow from an untrusted source to x
• safe: A variable x is safe if there is flow from x to a safe sink
• poly: The polymorphic qualifier, can be instantiated to tainted or safe
safe <: poly <: tainted
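The subtyping chain safe <: poly <: tainted can be modelled directly. The following is a minimal sketch, not SFlowInfer's implementation; the class `QualifierLattice`, the enum `Qualifier` and the method `isSubtypeOf` are hypothetical names chosen for illustration.

```java
/** Hypothetical model of the three SFlow qualifiers (illustrative names only). */
final class QualifierLattice {
    /** Declared bottom to top, so ordinal order encodes safe <: poly <: tainted. */
    enum Qualifier { SAFE, POLY, TAINTED }

    /** Returns true if a value qualified 'from' may flow into a variable qualified 'to'. */
    static boolean isSubtypeOf(Qualifier from, Qualifier to) {
        return from.ordinal() <= to.ordinal();
    }

    public static void main(String[] args) {
        // A safe value may be stored in a tainted variable.
        System.out.println(isSubtypeOf(Qualifier.SAFE, Qualifier.TAINTED));
        // A tainted value may not reach a safe sink.
        System.out.println(isSubtypeOf(Qualifier.TAINTED, Qualifier.SAFE));
    }
}
```

Encoding the order in the enum declaration keeps the check to a single comparison, which is enough for a three-element chain.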
Instantiated Typing Rules for SFlow
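Viewpoint adaptation (written ▷) accounts for context sensitivity.
$q_y$ is the context of adaptation.

\[
\textsc{(TCall)}\quad
\frac{q_y <: q_y \rhd q_{this} \qquad q_z <: q_y \rhd q_p \qquad q_y \rhd q_{ret} <: q_x}
     {\Gamma \vdash x = y.m(z)}
\]

Additional constraints: $\mathcal{B}(x = y.m(z)) = \{\ldots \Rightarrow q_x <: q_y \rhd q_{ret}, \ldots\}$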
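The three (TCALL) premises can be checked mechanically once the adaptation operator is fixed. The sketch below assumes a definition that this slide does not spell out: tainted and safe are unaffected by the context, and poly takes the qualifier of the context. The class and method names are hypothetical, and the additional constraints are left out.

```java
/** Sketch of viewpoint adaptation and the (TCALL) premises; names are illustrative. */
final class ViewpointAdaptation {
    /** Ordinal order encodes safe <: poly <: tainted. */
    enum Q { SAFE, POLY, TAINTED }

    static boolean isSubtypeOf(Q from, Q to) {
        return from.ordinal() <= to.ordinal();
    }

    /** ASSUMPTION: context ▷ poly = context; context ▷ q = q for q in {safe, tainted}. */
    static Q adapt(Q context, Q declared) {
        return declared == Q.POLY ? context : declared;
    }

    /** Checks the premises of (TCALL) for x = y.m(z), where m is typed qThis, qParam -> qRet. */
    static boolean callTypeChecks(Q qx, Q qy, Q qz, Q qThis, Q qParam, Q qRet) {
        return isSubtypeOf(qy, adapt(qy, qThis))      // q_y <: q_y ▷ q_this
            && isSubtypeOf(qz, adapt(qy, qParam))     // q_z <: q_y ▷ q_p
            && isSubtypeOf(adapt(qy, qRet), qx);      // q_y ▷ q_ret <: q_x
    }
}
```

Under this assumption a method typed poly, poly → poly behaves differently per call site: with a safe receiver the result can be assigned to a safe variable, with a tainted receiver it cannot.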
Set-based Solver
• Set Mapping S:
◦ variable → {tainted, poly, safe}
• Iterates over statements s
◦ Removes infeasible qualifiers for each variable in s according to the typing rule
• Until it reaches a fixpoint, and outputs
◦ Type errors if one or more variables get assigned the empty set, or
◦ A set-based solution
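The iteration described above can be sketched as follows. This is a simplified illustration, not the SFlowInfer solver: it handles only hand-written constraints of the form lhs <: rhs between two variables, whereas the real typing rules also involve viewpoint adaptation. All class, method and variable names are hypothetical.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SetBasedSolverSketch {
    /** Ordinal order encodes safe <: poly <: tainted. */
    enum Q { SAFE, POLY, TAINTED }

    /** Hypothetical constraint form: lhs <: rhs between two variables. */
    static final class Subtype {
        final String lhs, rhs;
        Subtype(String lhs, String rhs) { this.lhs = lhs; this.rhs = rhs; }
    }

    /** Removes infeasible qualifiers until no set changes (a fixpoint). */
    static Map<String, EnumSet<Q>> solve(Map<String, EnumSet<Q>> initial, List<Subtype> constraints) {
        Map<String, EnumSet<Q>> s = new LinkedHashMap<>();
        initial.forEach((v, qs) -> s.put(v, EnumSet.copyOf(qs)));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Subtype c : constraints) {
                EnumSet<Q> left = s.get(c.lhs);
                EnumSet<Q> right = s.get(c.rhs);
                // Keep q on the left only if some r on the right satisfies q <: r.
                changed |= left.removeIf(q -> right.stream().noneMatch(r -> q.ordinal() <= r.ordinal()));
                // Keep r on the right only if some q on the left satisfies q <: r.
                changed |= right.removeIf(r -> left.stream().noneMatch(q -> q.ordinal() <= r.ordinal()));
            }
        }
        return s;
    }

    /** A type error is reported when some variable is left with the empty set. */
    static boolean hasTypeError(Map<String, EnumSet<Q>> solution) {
        return solution.values().stream().anyMatch(EnumSet::isEmpty);
    }

    /** Hand-written chain name <: b <: str; the source fixes name, the sink may fix str. */
    static Map<String, EnumSet<Q>> example(boolean sinkIsSafe) {
        Map<String, EnumSet<Q>> initial = new LinkedHashMap<>();
        initial.put("name", EnumSet.of(Q.TAINTED));
        initial.put("b", EnumSet.allOf(Q.class));
        initial.put("str", sinkIsSafe ? EnumSet.of(Q.SAFE) : EnumSet.allOf(Q.class));
        List<Subtype> constraints = new ArrayList<>();
        constraints.add(new Subtype("name", "b"));
        constraints.add(new Subtype("b", "str"));
        return solve(initial, constraints);
    }
}
```

Calling `example(true)` models a tainted source reaching a safe sink and leaves at least one variable with the empty set; `example(false)` drops the sink requirement and yields a set-based solution instead.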
From Stanford Securibench-micro
StringBuffer buf;
…
foo(buf, buf, resp, req);
void foo(StringBuffer b,
StringBuffer b2,
ServletResponse resp, ServletRequest req) {
String name;
name = req.getParameter(NAME);//source
b.append(name);
PrintWriter writer = resp.getWriter();
String str = b2.toString();
writer.println(str); //sink
}
Set-based Solver
{tainted,poly,safe} StringBuffer buf;
…
foo(buf, buf, resp, req);
void foo({tainted,poly,safe} StringBuffer b,
{tainted,poly,safe} StringBuffer b2,
ServletResponse resp, ServletRequest req) {
{tainted,poly,safe} String name;
name = req.getParameter(NAME);//source
b.append(name);
PrintWriter writer = resp.getWriter();
{tainted,poly,safe} String str = b2.toString();
writer.println(str); //sink, BAD: flow from source!
}
Set-based Solver
{tainted,poly,safe} StringBuffer buf;
…
foo(buf, buf, resp, req);
void foo({tainted,poly,safe} StringBuffer b,
{tainted,poly,safe} StringBuffer b2,
ServletResponse resp, ServletRequest req) {
{tainted,poly,safe} String name;
name = req.getParameter(NAME);//source
b.append(name);
PrintWriter writer = resp.getWriter();
{tainted,poly,safe} String str = b2.toString();
writer.println(str); //sink
}
Type error! A tainted or poly str cannot be assigned to the safe sink parameter.
Set-based Solver (Cont’d)
What if the set-based solver terminates without a type error?
• Extract the maximal typing from the set-based solution according to the preference ranking tainted > poly > safe
◦ If S(x) = {poly, safe}, the maximal typing types x as poly
• Unfortunately, the maximal typing for SFlow does not always type-check
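Extracting the maximal typing amounts to picking, for each variable, the highest-ranked qualifier left in its set. A minimal sketch under that reading, with hypothetical class and method names:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Map;
import java.util.TreeMap;

final class MaximalTypingSketch {
    /** Declared in increasing preference, matching the ranking tainted > poly > safe. */
    enum Q { SAFE, POLY, TAINTED }

    /** Picks the most preferred remaining qualifier for every variable. */
    static Map<String, Q> extract(Map<String, EnumSet<Q>> solution) {
        Map<String, Q> typing = new TreeMap<>();
        for (Map.Entry<String, EnumSet<Q>> e : solution.entrySet()) {
            if (e.getValue().isEmpty()) {
                // An empty set is a type error, so there is no typing to extract.
                throw new IllegalStateException("no qualifier left for " + e.getKey());
            }
            // Enum natural order is declaration order, so max() follows the ranking.
            typing.put(e.getKey(), Collections.max(e.getValue()));
        }
        return typing;
    }
}
```

For S(x) = {poly, safe} this returns poly for x. The result is only a candidate typing and still has to be type-checked.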
Maximal Typing
• Unfortunately, the maximal typing for SFlow does not always type-check!
class A {
  String f;
  String get(A this) {
    return this.f;
  }
}
A y = ...;
String x = y.get();
writer.println(x); // sink
Maximal Typing (Cont’d)
class A {
  {poly} String f;
  {poly,safe} String get({poly,safe} this) {
    return this.f;
  }
}
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
Maximal Typing (Cont’d)
class A {
  {poly} String f;
  {poly,safe} String get({poly,safe} this) {
    return this.f;
  }
}
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
With the maximal typing, y ▷ ret <: x becomes
tainted ▷ poly <: safe ✗
Method Summary Constraints
• Reflect the relations between parameters and return values
• Further remove infeasible qualifiers
String id(String p) {
  String x = p;
  return x;
}
p <: x
x <: ret
⇒ p <: ret
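For a method like id, the summary constraint p <: ret follows from the local constraints by transitivity. The sketch below computes that closure and keeps only the constraints between interface variables (parameters and return). It is a simplification with hypothetical names: constraints are plain pairs of variable names, and viewpoint adaptation is ignored.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MethodSummarySketch {
    /** Each constraint a <: b is represented as the two-element list [a, b]. */
    static Set<List<String>> closure(Set<List<String>> constraints) {
        Set<List<String>> closed = new HashSet<>(constraints);
        boolean changed = true;
        while (changed) {
            changed = false;
            // Iterate over a snapshot so that new pairs can be added safely.
            List<List<String>> snapshot = new ArrayList<>(closed);
            for (List<String> ab : snapshot) {
                for (List<String> bc : snapshot) {
                    // a <: b and b <: c imply a <: c.
                    if (ab.get(1).equals(bc.get(0)) && closed.add(List.of(ab.get(0), bc.get(1)))) {
                        changed = true;
                    }
                }
            }
        }
        return closed;
    }

    /** Keeps only the implied constraints that relate two distinct interface variables. */
    static Set<List<String>> summary(Set<List<String>> constraints, Set<String> interfaceVars) {
        Set<List<String>> result = new HashSet<>();
        for (List<String> c : closure(constraints)) {
            boolean bothVisible = interfaceVars.contains(c.get(0)) && interfaceVars.contains(c.get(1));
            if (bothVisible && !c.get(0).equals(c.get(1))) {
                result.add(c);
            }
        }
        return result;
    }
}
```

For id, `summary(Set.of(List.of("p", "x"), List.of("x", "ret")), Set.of("p", "ret"))` yields the single constraint p <: ret, which callers can use without looking at the local variable x.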
Method Summary Constraints (Cont’d)
class A {
  {poly} String f;
  {poly,safe} String get({poly,safe} this) {
    return this.f;
  }
}
this ▷ f <: ret
this ▷ poly <: ret
this <: ret
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
y <: y ▷ this
y ▷ ret <: x
y ▷ this <: y ▷ ret
y <: x
Method Summary Constraints (Cont’d)
class A {
  {poly} String f;
  {poly,safe} String get({poly,safe} this) {
    return this.f;
  }
}
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
With the summary constraint y <: x, y ▷ ret <: x becomes
safe ▷ poly <: safe ✔
Reflection, Libraries and Frameworks
• Reflective object creation is easy!
• There is no need to abstract heap objects!
• Flow from x to y is reflected through subtyping x <: y
X x = (X)Class.forName("str").newInstance();
x.f = a; // a is a source
y = x;
b = y.f; // b is a sink
x <: y
a <: b
Reflection, Libraries and Frameworks (Cont’d)
• Libraries (JDK, third-party, frameworks)
• Unknown library methods are typed poly, poly → poly
safe l = r.m(r1,r2)
l = r.m(tainted r1,r2)
Reflection, Libraries and Frameworks (Cont’d)
• Frameworks (e.g., Struts, Spring)
◦ Framework classes/interfaces are subclassed/implemented in web application code
• Superclass-subclass relation is handled using function subtyping constraints
UserAction.execute(ActionForm userForm) <: Action.execute(tainted ActionForm form)
entails
form <: userForm //userForm is tainted
Implementation
• Built in the inference and checking framework for pluggable types [Huang et al. ECOOP’12]
◦ Instantiated framework with SFlow
◦ Built on top of the Checker Framework [Papi et al. ISSTA’08, Dietl et al. ICSE’11]
• Publicly available at
◦ http://code.google.com/p/type-inference/
Evaluation
• DroidBench
◦ A suite of 39 Android apps by [Arzt et al. PLDI’14] for evaluating taint analysis for Android
• Java web applications
◦ Stanford Securibench: a suite by Ben Livshits designed for evaluating taint analysis
◦ Other web applications from previous work
◦ 13 web applications comprising 473kLOC
DroidBench [Arzt et al. PLDI’14]
• SFlowInfer outperforms AppScan and Fortify SCA
• FlowDroid [Arzt et al. PLDI’14] is flow-sensitive
◦ DroidBench is designed for flow sensitivity

Tool Name              AppScan   Fortify SCA   FlowDroid   SFlowInfer
Correct warning ✔      14        17            26          28
False warning ✖        5         4             4           9
Missed flow            14        11            2           0
Precision ✔/(✔+✖)      74%       81%           86%         76%
Recall ✔/(✔+missed)    50%       61%           93%         100%
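The precision and recall rows follow from the warning counts by the formulas in the row labels. A small sketch that recomputes them; the class and method names are hypothetical, and the counts are the ones reported for DroidBench above.

```java
final class WarningMetrics {
    /** Precision = correct / (correct + false warnings). */
    static double precision(int correct, int falseWarnings) {
        return (double) correct / (correct + falseWarnings);
    }

    /** Recall = correct / (correct + missed flows). */
    static double recall(int correct, int missed) {
        return (double) correct / (correct + missed);
    }

    public static void main(String[] args) {
        String[] tools = {"AppScan", "Fortify SCA", "FlowDroid", "SFlowInfer"};
        int[] correct = {14, 17, 26, 28};
        int[] falseWarnings = {5, 4, 4, 9};
        int[] missed = {14, 11, 2, 0};
        for (int i = 0; i < tools.length; i++) {
            System.out.printf("%-12s precision=%.1f%% recall=%.1f%%%n",
                    tools[i],
                    100 * precision(correct[i], falseWarnings[i]),
                    100 * recall(correct[i], missed[i]));
        }
    }
}
```

FlowDroid's precision works out to 26/30, about 86.7%, which the table reports as 86%; the other entries match the table when rounded to the nearest percent.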
Java Web Applications
• We manually examined all type errors
• Parameter Manipulation / SQL Injection
◦ 7 benchmarks have no type errors
◦ 66 type errors correspond to true flows
◦ Average false positive rate: 15%
• Parameter Manipulation / XSS
◦ 8 benchmarks have no type errors
◦ 143 type errors correspond to true flows
◦ Average false positive rate: 4%
Runtime Performance
• SFlowInfer takes less than 3 minutes on all but 2 benchmarks
• Largest benchmark, photov (126kLOC), takes 640 seconds
◦ Can be optimized
• Maximal heap size is set to 2GB!
Conclusion
• A type system for secure information flow
• An efficient type inference algorithm
◦ Effective taint analysis tool
• Evaluation on 473kLOC
• Publicly available at
◦ http://code.google.com/p/type-inference/