Himabindu Lakkaraju, Stanford, USA

Hima Lakkaraju
joint work with Jure Leskovec, Sendhil Mullainathan,
Jon Kleinberg, Jens Ludwig
Medicine
What treatment do we recommend to a patient?
Education
Is a student at risk of dropping out?
Criminal justice
Will a person commit a crime if let out on bail?
Today humans make decisions
Can algorithms help?
U.S. police make >12M arrests/year
Where do people wait for trial?
Release vs. Detain is high stakes
 Pre-trial detention spells average 2-3 months (can be up to 9-12 months)
 Nearly 750,000 people are in U.S. jails
 Consequential for jobs and families, as well as for crime
Judge must decide whether to release the defendant or not (bail)
A defendant out on bail can behave badly:
 Fail to appear at trial
 Commit a crime
The judge is making a prediction!
Can we use machine learning algorithms to make these predictions?
We have results on four data sets
 Will present from one of them
 Results on all are qualitatively similar
Input: Defendant history
Output (i.e., prediction):
 Failure to appear at trial
 Any other crime
 Any other violent crime
 No crime
We want to build a decision rule which decides whether to lock or release the defendant based on the defendant's history:

Defendant characteristics → Decision rule → Lock or release
We want to build a predictor which predicts the defendant's future bad behavior based on his/her history:

Defendant characteristics → ML algorithm → Prediction of future bad behavior
Only administrative variables
 Common across regions
 No “gaming the system” problem
Only variables that are legal to use
 No race, gender
Note: Judge predictions may not
obey either of these
 Age at first arrest
 Times sentenced to residential correction
 Level of charge
 Number of active warrants
 Number of misdemeanor cases
 Number of past revocations
 Current charge is domestic violence
 Is first arrest
 Prior jail sentence
 Prior prison sentence
 Employed at first arrest
 Currently on supervision
 Had previous revocation
 Arrest for new offense while on supervision or bond
 Has active warrant
 Has active misdemeanor warrant
 Has other pending charge
 Had previous adult conviction
 Had previous adult misdemeanor conviction
 Had previous adult felony conviction
 Had previous failure to appear
 Prior supervision within 10 years
(only legal variables and easy to get; an encoding sketch follows below)
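As a minimal sketch (not from the talk; the field names are hypothetical stand-ins for the variables listed above), administrative history could be encoded into a model-ready feature vector like this:

    # Sketch: encode a defendant's administrative history as a feature vector.
    # Field names are illustrative, mirroring the variable list above.
    def encode_defendant(record):
        return [
            record["age_at_first_arrest"],       # numeric
            record["num_active_warrants"],       # count
            record["num_misdemeanor_cases"],     # count
            record["num_past_revocations"],      # count
            int(record["is_first_arrest"]),      # booleans encoded as 0/1
            int(record["prior_jail_sentence"]),
            int(record["currently_on_supervision"]),
            int(record["had_previous_fta"]),
        ]

    x = encode_defendant({
        "age_at_first_arrest": 19, "num_active_warrants": 0,
        "num_misdemeanor_cases": 2, "num_past_revocations": 0,
        "is_first_arrest": False, "prior_jail_sentence": True,
        "currently_on_supervision": False, "had_previous_fta": True,
    })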
Jurisdiction              Datapoints   Fraction released   Crime    FTA at trial   Nonviolent crime   Violent crime
Kentucky                   1,116,946   73.02%              17.45%   10.45%         4.18%              2.82%
Federal Pretrial System      362,713   78.24%              19.35%   12.02%         5.38%              1.95%
(crime columns give the fraction of people with each outcome)

Data: 2008-10. Other states as well, but not today.
[Figure: matrix of defendant IDs (1-12, ...) with their characteristics and crime/no-crime labels, split into a training set (model building) and an evaluation set.]
Training:
 Train an ML model to predict whether a defendant would commit a crime
Testing:
 Assume the real judge releases k defendants
 For every defendant, predict P(crime)
 Sort defendants by increasing P(crime) and “release” the top k
 Compare the crime rate of the judge vs. the model (see the sketch below)
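A minimal sketch of this testing protocol (the array names p_crime, crime, and judge_release are assumptions, not from the talk; crime is only meaningful for defendants who were actually released):

    import numpy as np

    def crime_rate_at_k(p_crime, crime, k):
        # "Release" the k defendants the model scores as safest,
        # i.e., the k lowest values of P(crime).
        released = np.argsort(p_crime)[:k]
        return crime[released].mean()

    # If judge_release is a boolean mask of the judge's actual decisions,
    # compare crime[judge_release].mean()                      (the judge)
    # with crime_rate_at_k(p_crime, crime, judge_release.sum()) (the model).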
Top four levels of the decision tree
Typical tree depth: 20-25
We build ensembles of such trees using techniques
called random forests and gradient boosting.
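A minimal sketch of that ensemble step, using the scikit-learn implementations of the two named techniques on synthetic stand-in data (the hyperparameters are illustrative, not the paper's):

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

    # Synthetic stand-ins: X holds administrative features, y marks whether
    # a released defendant went on to commit a crime.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 8))
    y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

    # Ensembles of deep decision trees, as described above.
    forest = RandomForestClassifier(n_estimators=500, max_depth=25).fit(X, y)
    boosted = GradientBoostingClassifier(n_estimators=500).fit(X, y)
    p_crime = boosted.predict_proba(X)[:, 1]   # P(crime) for each defendant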
[Figure: top levels of the tree. The root splits on arrest for a new offense; further splits use active misdemeanor, prior supervision, and active warrant; leaves give the probability of NOT committing a crime (values between 0.31 and 0.61).]
 Parameterized by % released
 Note: Different measures than AUC
What is the tradeoff between % released and crime rate?
[Figure: overall crime rate (includes FTA, NCA, NVCA) vs. release rate, from 73% to 98%. Judges: 17.45% crime rate; all features + ML algorithm: 11.39%. Holding the release rate constant, the crime rate is reduced by 0.102 (a 58.45% decrease). Standard errors are too small to display. Note the contrast with AUC-ROC.]
[Figure: FTA rate vs. release rate. At the judges' 73% release rate: judges' decisions 10.45%, ML algorithm 4.12%. Holding the release rate constant, the FTA rate is reduced by 0.0633 (a 60.57% decrease). Standard errors are too small to display.]
[Figure: NCA (new criminal activity) rate vs. release rate. At the judges' 73% release rate: judges' decisions 4.18%, ML algorithm 1.74%. Holding the release rate constant, the NCA rate is reduced by 0.0244 (a 58.37% decrease). Standard errors are too small to display.]
18
Holding release rate
constant, crime rate is
reduced by 0.0143
(50.71% decrease)
NVCA Rate
Judges decisions
2.82%
1.39%
Release Rate
73%
Notes: Standard errors too small to display on graph
Hima Lakkaraju (Stanford University)
19
We only see the eventual crime for those who were released:

Judge → Release → Crime? (Yes/No, observed)
Judge → Lock → Crime? (never observed)

Solution: For locked defendants we have to impute the crime rate
Released defendants are labeled (we observe crime/no crime) and are split into:
 Train: train the model making predictions
 Impute: train the model for imputing missing labels (sketched below)
 Test: used for model evaluation
Locked defendants are unlabeled (we don't see crime). Is the evaluation BIASED?
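A minimal sketch of that imputation step (the variable names X, crime, and released are assumptions): fit an imputer on the labeled, released population and fill in labels for the locked one.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    def impute_locked_labels(X, crime, released):
        # Fit only on released defendants, whose outcomes we observe.
        # (released is a boolean mask; crime may be NaN for locked defendants.)
        imputer = GradientBoostingClassifier().fit(X[released], crime[released])
        crime_full = crime.copy()
        # Fill in predicted labels for locked defendants. Caution: the
        # released population was selected by judges, so this can be BIASED.
        crime_full[~released] = imputer.predict(X[~released])
        return crime_full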
1) The problem is one-sided:
 Releasing a defendant the judge jailed is a problem (we never observe the outcome)
 Jailing a defendant the judge released is not (we do observe the outcome)
2) In the Federal system, cases are randomly assigned to judges
Key components of our solution
 Judges have similar cases
(random assignment of cases)
 Selective labels are one-sided
Solution: Use the most lenient judges!
 Take the released population of a lenient judge
 Ask which of those defendants we would jail to become less lenient
 Compare the resulting crime rate to that of a stricter human judge (sketched below)
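A minimal sketch of this "contraction" evaluation under the stated assumptions (random case assignment; the arrays p_crime and crime cover only the most lenient judge's released defendants, and the names are assumptions):

    import numpy as np

    def contraction_crime_rate(p_crime, crime, target_release_rate):
        # Start from everyone the lenient judge released, then let the
        # algorithm jail the riskiest until only target_release_rate remain.
        keep = int(target_release_rate * len(p_crime))
        safest = np.argsort(p_crime)[:keep]
        # Compare this to the crime rate of a stricter human judge who
        # releases the same fraction of defendants.
        return crime[safest].mean()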
[Figure: crime rate vs. release rate, showing judges mapped into deciles by ranked order of lenience (e.g., 0.9 on the x-axis = top 10% most lenient judges), the machine learning curve, and judges today.]
[Figure: the same crime rate vs. release rate plot with the contraction estimate added. ML vs. contraction at the same release rate: 11.39% vs. 12.14%.]
The odds are against ML: the judge sees many things the ML algorithm doesn't.
 Offender history: observable by both
 Demeanor: unobservable by the ML algorithm
Maybe that's the problem.
Two possible reasons why judges are making mistakes:
 Misuse of observable variables
 These are the administrative variables available to the ML algorithm (e.g., previous crime history)
 Misuse of unobservable variables
 Variables not seen by the ML algorithm (“I looked in his eyes and saw his soul”)
Predict the judge's decision
 So the training data does NOT include whether the defendant actually commits a crime
This gives a release rule: release if the predicted judge would release (see the sketch below).
 This rule weights variables the way the judge does
What does it do differently, then?
 It does not use the unobservables!
Dawes, Faust and Meehl 1989
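A minimal sketch, on synthetic stand-in data, of training the model of the judge: the label is the judge's release decision, not the defendant's eventual crime.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 8))    # administrative features (synthetic)
    judge_released = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)

    # Note the target: the judge's decision; no crime labels are involved.
    judge_model = GradientBoostingClassifier().fit(X, judge_released)
    p_release = judge_model.predict_proba(X)[:, 1]
    release = p_release >= 0.5        # release iff the predicted judge would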
 Build a model to predict the judge's behavior
 We predict what the judge will do now (not the crime of the defendant)
 Rank defendants by mixing judges and predicted judges (sketched below):
γ * judge + (1 − γ) * predicted judge
 The model will do better if the judge misweights unobservables
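A minimal sketch of that mixture (the score names are assumptions): blend each defendant's judge score with the predicted-judge score and rank by the result.

    import numpy as np

    def blended_ranking(judge_score, predicted_judge_score, gamma):
        # score = γ * judge + (1 − γ) * predicted judge
        score = gamma * judge_score + (1.0 - gamma) * predicted_judge_score
        return np.argsort(-score)   # most release-worthy defendants first

    # Sweeping gamma from 1 (pure judge) to 0 (pure model of the judge)
    # traces out the curve summarized in the next figure.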
[Figure: performance as γ sweeps from the pure judge (γ = 1) to the pure ML model of the judge (γ = 0). The model of the judge beats the judge by 35%.]
Algorithms are very effective in making
predictions in the bail setting
It is highly likely that judges are misweighting
unobservable variables
Can we identify patterns of mistakes?
The setting: Human Evaluations
Evaluator j decides on item i: here, judge j decides on defendant i.
Each item has a true label t_i (e.g., no crime) and an assigned label a_{j,i} (e.g., crime).
The setting: Human Evaluations
Each evaluator j is characterized by a confusion matrix over true and assigned labels:

                Assigned class 1   Assigned class 2
True class 1    p_{1,1}            p_{1,2}
True class 2    p_{2,1}            p_{2,2}

p_{1,2} is the probability that j assigns an item that “truly” belongs to class 1 to class 2; for example, p_{1,2} = 0.3 means j mislabels 30% of truly class-1 items as class 2.
Input:
 Defendants/judges described by features
 Defendants’ true as well as assigned labels
Output:
 Discover groups of evaluators and items that
share common confusion matrices
 For each such group, infer the corresponding
confusion matrix
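As a simplified stand-in for this framework (the paper infers clusters and confusion matrices jointly with a Bayesian model; this sketch fixes the clusters with k-means first, and all names are assumptions):

    import numpy as np
    from sklearn.cluster import KMeans

    def group_confusion(judge_attrs, item_attrs, judge_of, true_lbl, assigned_lbl,
                        n_judge_clusters=3, n_item_clusters=3):
        # Cluster judges and items by their attribute vectors.
        jc = KMeans(n_clusters=n_judge_clusters, n_init=10).fit_predict(judge_attrs)
        ic = KMeans(n_clusters=n_item_clusters, n_init=10).fit_predict(item_attrs)
        # One 2x2 confusion matrix per (judge cluster, item cluster) pair.
        counts = np.zeros((n_judge_clusters, n_item_clusters, 2, 2))
        for i in range(len(true_lbl)):            # one decision per item i
            counts[jc[judge_of[i]], ic[i], true_lbl[i], assigned_lbl[i]] += 1
        totals = counts.sum(axis=3, keepdims=True)
        # Normalize rows to probabilities, guarding empty groups.
        return np.divide(counts, totals, out=np.zeros_like(counts),
                         where=totals > 0)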
[Figure: graphical model. Judge attributes a1-a5 determine judge j's cluster c_j; item attributes b1-b4 determine defendant i's cluster d_i; each pair of clusters shares a confusion matrix that maps the item's true label z_i to the label judge j assigns.]
Similar judges share confusion matrices when deciding on similar items
Less experienced judges make more mistakes
 Left: Judges with a low number of cases
 Right: Judges with a high number of cases
Judges with many felony cases are stricter
 Left: Judges with few felony cases
 Right: Judges with many felony cases
Defendants who have kids are confusing to judges.
Defendants who are single, committed felonies, and moved a lot are judged accurately.
Algorithms are very effective in making
predictions in the bail setting
We can also identify fine grained mistake
patterns of human decision makers
Can we build interpretable summaries of
datasets which can also be used for accurate
prediction?
New tool to understand human
decision making
The framework can be applied to
various other questions:
 Making decisions about patient
treatments
 Assigning students to intervention
programs
 A Bayesian Framework for Modeling Human Evaluations by H. Lakkaraju, J. Leskovec, J. Kleinberg, S. Mullainathan. SIAM International Conference on Data Mining (SDM), 2015.
 A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes by H. Lakkaraju, E. Aguiar, C. Shan, D. Miller, N. Bhanpuri, R. Ghani. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2015.
 Interpretable Decision Sets: A Joint Framework for Description and Prediction by H. Lakkaraju, S. Bach, J. Leskovec. Under review at ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
 Human Decisions vs. Machine Predictions by J. Kleinberg, H. Lakkaraju, J. Leskovec, J. Ludwig, S. Mullainathan. Working Paper.