Type-based Taint Analysis for Java Web Applications
Wei Huang, Yao Dong and Ana Milanova
Rensselaer Polytechnic Institute
Taint Analysis for Java Web Applications
Tracks flows from untrusted sources to sensitive sinks
◦ Such flows can cause SQL injection, cross-site scripting, and other attacks
[Diagram: untrusted input reaches sensitive sinks unsanitized]
SOURCES: ServletRequest.getParameter(), etc.
SINKS: Statement.execute(), etc.
SQL Injection
HttpServletRequest req = ...;
Statement stat = ...;
String user = req.getParameter("user"); // tainted input, e.g. "John OR 1=1"
String query = "SELECT * FROM Users WHERE name = " + user;
stat.execute(query);
Resulting query: SELECT * FROM Users WHERE name = John OR 1=1
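A minimal sketch of why the concatenation is dangerous. The class `QueryDemo` and the method `buildQuery` are hypothetical names introduced for illustration; only the query string comes from the slide.

```java
// Hypothetical helper that mirrors the string concatenation on the slide.
final class QueryDemo {
    // Builds the query the same way the slide does: raw input is appended as-is.
    static String buildQuery(String user) {
        return "SELECT * FROM Users WHERE name = " + user;
    }

    public static void main(String[] args) {
        // Benign input: the WHERE clause compares against a single name.
        System.out.println(buildQuery("John"));
        // Attacker-controlled input: the OR clause becomes part of the SQL itself.
        System.out.println(buildQuery("John OR 1=1"));
    }
}
```

The second call yields a condition that holds for every row, which is the flow from source to sink that the analysis is meant to report.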
Work on Taint Analysis
Finding Security Vulnerabilities with Static Analysis [Livshits and Lam, Usenix Security’05]
TAJ [Tripp et al. PLDI’09]
F4F [Sridharan et al. OOPSLA’11]
Andromeda [Tripp et al. FASE’13]
TAJ, F4F and Andromeda are included in AppScan, a commercial tool from IBM
Issues with Existing Work
Dataflow and points-to based approaches have difficulty with:
◦ Reflection
◦ Libraries
◦ Frameworks
Our Type-based Taint Analysis
SFlow: a type system
SFlowInfer: inference tool for SFlow
Easily and effectively handles reflection, libraries and frameworks
◦ Takes a Java program where sources are typed tainted and sinks are typed safe
◦ Infers SFlow types for the rest of the variables
◦ If inference succeeds, there are no flows from sources to sinks
◦ If it fails with type errors, there are potential flows from sources to sinks
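Inference and Checking Framework
[Diagram: the framework takes a type system as its Parameters: Immutability (ReIm), Universe Types (UT), Ownership Types (OT), SFlow, AJ, EnerJ, More? The parameters turn the Unified Typing Rules into Instantiated Rules. The Set-Based Solver works on the Program Source, the Annotated Libraries and the Instantiated Rules and produces a Set-based Solution. Extract Typing turns the solution into a Concrete Typing, which goes to Type Checking.]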
SFlowInfer
The instantiated inference tool
Detects (or verifies the absence of) information flow violations
[Diagram: Sources and Sinks, Annotated Libraries and Java source are the inputs to SFlowInfer, which produces the Result]
SQL Injection
HttpServletRequest req = ...;
Statement stat = ...;
tainted String user = req.getParameter("user"); // Source: the return value is tainted
tainted String query = "SELECT * FROM Users WHERE name = " + user;
stat.execute(query); // Sink: the parameter is safe. Type error!
Subtyping: safe <: tainted
Contributions
SFlow: A context-sensitive type system for secure information flow
SFlowInfer: An inference algorithm for SFlow
◦ SFlowInfer is an effective taint analysis tool
Implementation and evaluation
Outline
SFlow type system
Inference algorithm for SFlow
Handling of reflection, libraries and frameworks
Implementation and evaluation
SFlow Type Qualifiers
tainted: A variable x is tainted if there is flow from an untrusted source to x
safe: A variable x is safe if there is flow from x to a safe sink
poly: The polymorphic qualifier, which can be instantiated to tainted or safe
Subtyping: safe <: poly <: tainted
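The subtyping chain can be sketched as an ordered enum. This is only an illustration: `SubtypingSketch`, `Qualifier` and `isSubtype` are hypothetical names, not part of SFlowInfer.

```java
// Hypothetical sketch of the chain safe <: poly <: tainted.
final class SubtypingSketch {
    // Declared in subtyping order, so ordinal() encodes the position in the chain.
    enum Qualifier { SAFE, POLY, TAINTED }

    // Returns true when sub <: sup, i.e. sub sits at or below sup in the chain.
    static boolean isSubtype(Qualifier sub, Qualifier sup) {
        return sub.ordinal() <= sup.ordinal();
    }

    public static void main(String[] args) {
        // A safe value may flow into a tainted variable.
        System.out.println(isSubtype(Qualifier.SAFE, Qualifier.TAINTED)); // prints true
        // A tainted value may not flow into a safe variable.
        System.out.println(isSubtype(Qualifier.TAINTED, Qualifier.SAFE)); // prints false
    }
}
```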
Instantiated Typing Rules for SFlow
Viewpoint adaptation accounts for context sensitivity. q_y is the context of adaptation.
(TCALL)
\[
\frac{q_y <: q_y \rhd q_{this} \qquad q_z <: q_y \rhd q_p \qquad q_y \rhd q_{ret} <: q_x}{\Gamma \vdash x = y.m(z)}
\]
Additional constraints…
\[
B(x = y.m(z)) = \{\ldots \Rightarrow q_x <: q_y \rhd q_{ret}, \ldots\}
\]
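A minimal sketch of how the three (TCALL) premises could be checked for one call site. The slide does not define viewpoint adaptation; the `adapt` method below assumes the usual reading in which poly takes the qualifier of the context while tainted and safe stay unchanged. All names are hypothetical.

```java
// Hypothetical sketch; adapt() encodes an assumed definition of viewpoint adaptation.
final class ViewpointSketch {
    // Declared in subtyping order: safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier sub, Qualifier sup) {
        return sub.ordinal() <= sup.ordinal();
    }

    // Assumption: context |> poly = context; context |> q = q for tainted and safe.
    static Qualifier adapt(Qualifier context, Qualifier declared) {
        return declared == Qualifier.POLY ? context : declared;
    }

    // Checks the premises of (TCALL) for x = y.m(z), using qy as the context.
    static boolean callTypeChecks(Qualifier qx, Qualifier qy, Qualifier qz,
                                  Qualifier qthis, Qualifier qp, Qualifier qret) {
        return isSubtype(qy, adapt(qy, qthis))   // qy <: qy |> qthis
            && isSubtype(qz, adapt(qy, qp))      // qz <: qy |> qp
            && isSubtype(adapt(qy, qret), qx);   // qy |> qret <: qx
    }

    public static void main(String[] args) {
        Qualifier p = Qualifier.POLY;
        // A fully poly method called in a safe context with a safe result.
        System.out.println(callTypeChecks(Qualifier.SAFE, Qualifier.SAFE, Qualifier.SAFE, p, p, p));
        // The same method called in a tainted context cannot produce a safe result.
        System.out.println(callTypeChecks(Qualifier.SAFE, Qualifier.TAINTED, Qualifier.TAINTED, p, p, p));
    }
}
```

Under this assumption the same poly signature is accepted at a safe call site and rejected when a tainted receiver feeds a safe left-hand side, which is what makes the type system context-sensitive.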
Set-based Solver
Set Mapping S:
◦ variable → {tainted, poly, safe}
Iterates over statements s
◦ Removes infeasible qualifiers for each variable in s according to the typing rule
Until it reaches a fixpoint, and outputs
◦ Type errors if one or more variables get assigned the empty set, or
◦ A set-based solution
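The iteration described above can be sketched as follows. This is a simplified illustration, not SFlowInfer's implementation: it only handles plain constraints of the form "sub <: sup" between variables and leaves out viewpoint adaptation. The class, the `Constraint` type and the `sqlExample` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a set-based solver over simple subtyping constraints.
final class SetBasedSolverSketch {
    // Declared in subtyping order: safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    // A simplified constraint "sub <: sup" between two variables.
    static final class Constraint {
        final String sub;
        final String sup;
        Constraint(String sub, String sup) { this.sub = sub; this.sup = sup; }
    }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal();
    }

    // Removes infeasible qualifiers until no set changes, i.e. a fixpoint is reached.
    static void solve(Map<String, EnumSet<Qualifier>> s, List<Constraint> constraints) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Constraint c : constraints) {
                EnumSet<Qualifier> sub = s.get(c.sub);
                EnumSet<Qualifier> sup = s.get(c.sup);
                // q stays in S(sub) only if some r in S(sup) satisfies q <: r.
                changed |= sub.removeIf(q -> sup.stream().noneMatch(r -> isSubtype(q, r)));
                // r stays in S(sup) only if some q in S(sub) satisfies q <: r.
                changed |= sup.removeIf(r -> sub.stream().noneMatch(q -> isSubtype(q, r)));
            }
        }
    }

    // A type error is reported when some variable has no feasible qualifier left.
    static boolean hasTypeError(Map<String, EnumSet<Qualifier>> s) {
        return s.values().stream().anyMatch(EnumSet::isEmpty);
    }

    // Models user -> query -> sink parameter from the SQL injection slide.
    static Map<String, EnumSet<Qualifier>> sqlExample(boolean userIsSource) {
        Map<String, EnumSet<Qualifier>> s = new LinkedHashMap<>();
        s.put("user", userIsSource ? EnumSet.of(Qualifier.TAINTED)
                                   : EnumSet.allOf(Qualifier.class));
        s.put("query", EnumSet.allOf(Qualifier.class));
        s.put("param", EnumSet.of(Qualifier.SAFE)); // the sink parameter is typed safe
        List<Constraint> constraints = new ArrayList<>();
        constraints.add(new Constraint("user", "query"));
        constraints.add(new Constraint("query", "param"));
        solve(s, constraints);
        return s;
    }

    public static void main(String[] args) {
        System.out.println(hasTypeError(sqlExample(true)));  // prints true
        System.out.println(hasTypeError(sqlExample(false))); // prints false
    }
}
```

The sets only shrink, so the loop terminates. In this sketch an empty set also empties its neighbours on later passes; a real tool would more likely report the first offending statement.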
From Stanford Securibench-micro
StringBuffer buf;
…
foo(buf, buf, resp, req); // b and b2 both refer to buf
void foo(StringBuffer b,
         StringBuffer b2,
         ServletResponse resp, ServletRequest req) {
    String name;
    name = req.getParameter(NAME); // source
    b.append(name);
    PrintWriter writer = resp.getWriter();
    String str = b2.toString();
    writer.println(str); // sink, BAD!
}
Set-based Solver
{tainted,poly,safe} StringBuffer buf;
…
foo(buf, buf, resp, req);
void foo({tainted,poly,safe} StringBuffer b,
         {tainted,poly,safe} StringBuffer b2,
         ServletResponse resp, ServletRequest req) {
    {tainted,poly,safe} String name;
    name = req.getParameter(NAME); // source
    b.append(name);
    PrintWriter writer = resp.getWriter();
    {tainted,poly,safe} String str = b2.toString();
    writer.println(str); // sink, BAD: flow from source!
}
Type error! At the sink writer.println(str), a tainted or poly str cannot be assigned to the safe parameter.
Set-based Solver (Cont’d)
What if the set-based solver terminates without a type error?
Extract the maximal typing from the set-based solution according to the preference ranking tainted > poly > safe
◦ If S(x) = {poly, safe}, the maximal typing types x as poly
Unfortunately, the maximal typing for SFlow does not always type-check
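Extracting the maximal typing for one variable can be sketched as picking the most preferred qualifier left in its set. The names `MaximalTypingSketch` and `maximal` are hypothetical.

```java
import java.util.EnumSet;

// Hypothetical sketch of maximal-typing extraction for a single variable.
final class MaximalTypingSketch {
    // Declared in preference order: tainted > poly > safe.
    enum Qualifier { TAINTED, POLY, SAFE }

    // Returns the most preferred qualifier remaining in S(x).
    static Qualifier maximal(EnumSet<Qualifier> feasible) {
        if (feasible.isEmpty()) {
            throw new IllegalArgumentException("empty set: the solver reported a type error");
        }
        // EnumSet iterates in declaration order, so the first element is the most preferred.
        return feasible.iterator().next();
    }

    public static void main(String[] args) {
        // S(x) = {poly, safe}: the maximal typing types x as poly.
        System.out.println(maximal(EnumSet.of(Qualifier.POLY, Qualifier.SAFE))); // prints POLY
    }
}
```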
Maximal Typing
Unfortunately, the maximal typing for SFlow does not always type-check!
class A {
    String f;
    String get(A this) {
        return this.f;
    }
}
A y = ...;
String x = y.get();
writer.println(x); // sink
Maximal Typing (Cont’d)
class A {
    {poly} String f;
    {poly,safe} String get({poly,safe} this) {
        return this.f;
    }
}
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
The maximal typing picks tainted for y, poly for ret and safe for x, so the call constraint fails:
\[
q_y \rhd q_{ret} <: q_x \quad\text{becomes}\quad \mathit{tainted} \rhd \mathit{poly} <: \mathit{safe}
\]
✗
Method Summary Constraints
Reflect the relations between parameters and return values
Further remove infeasible qualifiers
String id(String p) {
    String x = p;
    return x;
}
p <: x and x <: ret, therefore p <: ret
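For the `id` example, the summary constraint follows from transitivity. The sketch below computes a plain transitive closure over "sub <: sup" pairs; it is an assumption-laden simplification, since it leaves out viewpoint adaptation, and `SummarySketch` and `transitiveClosure` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: derive summary constraints by transitive closure.
final class SummarySketch {
    // Each constraint is a two-element list [sub, sup], read as "sub <: sup".
    static Set<List<String>> transitiveClosure(Set<List<String>> constraints) {
        Set<List<String>> closed = new HashSet<>(constraints);
        boolean changed = true;
        while (changed) {
            changed = false;
            // Iterate over a snapshot so that new constraints can be added safely.
            List<List<String>> snapshot = new ArrayList<>(closed);
            for (List<String> ab : snapshot) {
                for (List<String> bc : snapshot) {
                    // a <: b and b <: c imply a <: c.
                    if (ab.get(1).equals(bc.get(0))) {
                        changed |= closed.add(List.of(ab.get(0), bc.get(1)));
                    }
                }
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        Set<List<String>> body = new HashSet<>();
        body.add(List.of("p", "x"));   // String x = p;
        body.add(List.of("x", "ret")); // return x;
        // The derived summary relates the parameter directly to the return value.
        System.out.println(transitiveClosure(body).contains(List.of("p", "ret"))); // prints true
    }
}
```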
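Method Summary Constraints (Cont’d)
class A {
    {poly} String f;
    {poly,safe} String get({poly,safe} this) {
        return this.f;
    }
}
Inside get:
\[
this \rhd f <: ret \quad\Rightarrow\quad this \rhd poly <: ret \quad\Rightarrow\quad this <: ret
\]
{tainted,poly,safe} A y = ...;
{safe} String x = y.get();
writer.println(x);
At the call:
\[
y <: y \rhd this, \qquad y \rhd this <: y \rhd ret, \qquad y \rhd ret <: x \quad\Rightarrow\quad y <: x
\]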
Method Summary Constraints (Cont’d)
With the summary constraint y <: x, y is typed safe, ret poly and x safe, and the call constraint holds:
\[
q_y \rhd q_{ret} <: q_x \quad\text{becomes}\quad \mathit{safe} \rhd \mathit{poly} <: \mathit{safe}
\]
✔
Reflection, Libraries and Frameworks
Reflective object creation is easy!
There is no need to abstract heap objects!
Flow from x to y is reflected through subtyping x <: y
X x = (X)Class.forName("str").newInstance();
x.f = a; // a is a source
y = x;
b = y.f; // b is a sink
x <: y
a <: b
Reflection, Libraries and Frameworks (Cont’d)
Libraries (JDK, third-party, frameworks)
Unknown library methods are typed poly, poly → poly
safe l = r.m(r1,r2)
l = r.m(tainted r1,r2)
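One way to read the two examples, assuming the poly signature ties the receiver, the arguments and the result to the same instantiation: the result can be no lower in the chain safe <: poly <: tainted than any input. The sketch below computes that lower bound; `LibraryCallSketch` and `resultLowerBound` are hypothetical names, and the interpretation itself is an assumption.

```java
// Hypothetical sketch of an unknown library call typed poly, poly -> poly.
final class LibraryCallSketch {
    // Declared in subtyping order: safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    // Least upper bound of the receiver and the arguments: the lowest
    // qualifier that the result l of l = r.m(r1, r2) can take.
    static Qualifier resultLowerBound(Qualifier receiver, Qualifier... args) {
        Qualifier result = receiver;
        for (Qualifier q : args) {
            if (q.ordinal() > result.ordinal()) {
                result = q;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // l = r.m(tainted r1, r2): a tainted argument forces a tainted result.
        System.out.println(resultLowerBound(Qualifier.SAFE, Qualifier.TAINTED, Qualifier.SAFE)); // prints TAINTED
        // safe l = r.m(r1, r2) is only possible when every input is safe.
        System.out.println(resultLowerBound(Qualifier.SAFE, Qualifier.SAFE, Qualifier.SAFE)); // prints SAFE
    }
}
```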
Reflection, Libraries and Frameworks (Cont’d)
Frameworks (e.g., Struts, Spring)
◦ Framework classes/interfaces are subclassed/implemented in web application code
Superclass-subclass relation is handled using function subtyping constraints
UserAction.execute(ActionForm userForm) <: Action.execute(tainted ActionForm form)
entails
form <: userForm // userForm is tainted
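The function subtyping check can be sketched as below. The slide only shows the parameter constraint; the covariant return check is the standard second half of function subtyping and is included here as an assumption. All names are hypothetical.

```java
// Hypothetical sketch of function subtyping between an overriding and an overridden method.
final class OverrideSketch {
    // Declared in subtyping order: safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier sub, Qualifier sup) {
        return sub.ordinal() <= sup.ordinal();
    }

    // Parameters are contravariant (superParam <: subParam, as in form <: userForm).
    // Assumption: return values are covariant (subRet <: superRet).
    static boolean overrideTypeChecks(Qualifier superParam, Qualifier subParam,
                                      Qualifier superRet, Qualifier subRet) {
        return isSubtype(superParam, subParam) && isSubtype(subRet, superRet);
    }

    public static void main(String[] args) {
        // Action.execute takes a tainted form, so userForm must be tainted as well.
        System.out.println(overrideTypeChecks(Qualifier.TAINTED, Qualifier.TAINTED,
                                              Qualifier.SAFE, Qualifier.SAFE)); // prints true
        // Typing userForm as safe violates form <: userForm.
        System.out.println(overrideTypeChecks(Qualifier.TAINTED, Qualifier.SAFE,
                                              Qualifier.SAFE, Qualifier.SAFE)); // prints false
    }
}
```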
Implementation
Built in the inference and checking framework for pluggable types [Huang et al. ECOOP’12]
◦ Instantiated framework with SFlow
◦ Built on top of the Checker Framework [Papi et al. ISSTA’08, Dietl et al. ICSE’11]
Publicly available at
◦ http://code.google.com/p/type-inference/
Evaluation
DroidBench
◦ A suite of 39 Android apps by [Arzt et al. PLDI’14] for evaluating taint analysis for Android
Java web applications
◦ Stanford Securibench: a suite by Ben Livshits designed for evaluating taint analysis
◦ Other web applications from previous work
◦ 13 web applications comprising 473kLOC
DroidBench [Arzt et al. PLDI’14]
SFlowInfer outperforms AppScan and Fortify SCA
FlowDroid [Arzt et al. PLDI’14] is flow-sensitive
◦ DroidBench is designed for flow sensitivity
Tool Name             AppScan   Fortify SCA   FlowDroid   SFlowInfer
Correct warning ✔     14        17            26          28
False warning ✖       5         4             4           9
Missed flow           14        11            2           0
Precision ✔/(✔+✖)     74%       81%           86%         76%
Recall ✔/(✔+missed)   50%       61%           93%         100%
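The two ratios in the table can be recomputed from the warning counts. A minimal sketch with hypothetical names (`MetricsSketch`, `precision`, `recall`):

```java
// Hypothetical sketch of the precision and recall formulas used in the table.
final class MetricsSketch {
    // Precision: correct warnings over all reported warnings.
    static double precision(int correct, int falseWarnings) {
        return (double) correct / (correct + falseWarnings);
    }

    // Recall: correct warnings over all true flows (reported or missed).
    static double recall(int correct, int missed) {
        return (double) correct / (correct + missed);
    }

    public static void main(String[] args) {
        // SFlowInfer row: 28 correct warnings, 9 false warnings, 0 missed flows.
        System.out.println(Math.round(100 * precision(28, 9))); // prints 76
        System.out.println(Math.round(100 * recall(28, 0)));    // prints 100
    }
}
```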
Java Web Applications
We manually examined all type errors
Parameter Manipulation / SQL Injection
◦ 7 benchmarks have no type errors
◦ 66 type errors correspond to true flows
◦ Average false positive rate: 15%
Parameter Manipulation / XSS
◦ 8 benchmarks have no type errors
◦ 143 type errors correspond to true flows
◦ Average false positive rate: 4%
Runtime Performance
SFlowInfer takes less than 3 minutes on all but 2 benchmarks
Largest benchmark, photov (126kLOC), takes 640 seconds
◦ Can be optimized
Maximal heap size is set to 2GB!
Conclusion
A type system for secure information flow
An efficient type inference algorithm
◦ Effective taint analysis tool
Evaluation on 473kLOC
Publicly available at
◦ http://code.google.com/p/type-inference/