Crime Forecasting Using Data Mining Techniques

Presented by: Chung-Hsien Yu
Chung-Hsien Yu, Max W. Ward, Melissa Morabito, and Wei Ding
Department of Computer Science,
Department of Sociology
University of Massachusetts Boston
The 4th Workshop on Data Mining Case Studies and Practice Prize,
Vancouver, Canada, December 2011
CONTRIBUTIONS
• Architected a data structure which contains aggregated counts
of crime-related events from original crime records.
• Harvested additional spatial and temporal features from the
data.
• Employed an ensemble classification to perform the crime
forecasting.
• Proposed the best forecasting approach to achieve the most
stable outcomes.
• Built a model that takes advantage of implicit and explicit
spatial and temporal data to make reliable crime predictions.
Crime Forecasting Using Data Mining Techniques: Chung-Hsien Yu, Max W. Ward, Melissa Morabito, and Wei Ding
Original Data
• Residential Burglary
• 911 Calls
• Arrest
• Foreclosure
• Street Robbery
Aggregated Data
[Figure: grid cells labeled with aggregated event counts]
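The aggregation of original records into per-cell counts could be sketched roughly as follows. The record layout (event type, month index, x/y position in miles), the function name aggregate_counts, and the half-mile default cell size are illustrative assumptions, not the authors' actual data structure.

```python
from collections import Counter

def aggregate_counts(records, cell_size=0.5):
    """Count events per (event_type, month, row, col) grid cell.

    records: iterable of (event_type, month, x, y) tuples, where x and y
    are hypothetical planar coordinates in miles from the grid origin.
    """
    counts = Counter()
    for event_type, month, x, y in records:
        row = int(y // cell_size)  # cell index along the y axis
        col = int(x // cell_size)  # cell index along the x axis
        counts[(event_type, month, row, col)] += 1
    return counts

# Made-up sample records, for illustration only.
sample = [
    ("residential_burglary", 0, 0.10, 0.20),
    ("residential_burglary", 0, 0.40, 0.30),
    ("911_call", 0, 0.70, 0.20),
    ("arrest", 1, 1.20, 0.90),
]
counts = aggregate_counts(sample)
```

Keying the counter by event type and month keeps every data source in one structure, so the counts for any cell can be read out as a feature vector.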
Predicting Crime
• Hotspot: residential burglary count > 0
• Heating-up: residential burglary count increased
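The two prediction targets can be written as simple labeling functions. This is a minimal sketch: the slide does not state what the increase is measured against, so comparing with the previous month is an assumption, and the function names are hypothetical.

```python
def hotspot_label(count):
    """A cell is a hotspot when its residential burglary count is above zero."""
    return count > 0

def heating_up_label(previous_count, current_count):
    """A cell is heating up when its residential burglary count increased.

    Assumption: the increase is measured against the previous month.
    """
    return current_count > previous_count

# Made-up monthly counts for a single grid cell.
monthly_counts = [0, 2, 2, 5, 1]
hotspots = [hotspot_label(c) for c in monthly_counts]
heating = [
    heating_up_label(prev, cur)
    for prev, cur in zip(monthly_counts, monthly_counts[1:])
]
```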
Broken Windows Theory
• Focus on Offenders
– Street Robbery
– Motor Vehicle Larceny
– Commercial Burglary
• Focus on Places
– Foreclosure
– Arrest
– Residential Burglary
Feature Selection
Variables                  Time series                Total records  Average (number/year)  Total attributes
Commercial Robbery         6 years, 2004-2009         400            67                     26
Street Robbery             6 years, 2004-2009         18,321         3,054                  26
Residential Burglary       6 years, 2004-2009         12,020         2,003                  28
Commercial Burglary        5 years, 2005-2009         4,438          740                    75
Motor Vehicle Larceny      4 years, 2006-2009         29,685         7,421                  24
Arrest                     2004-Nov. 2010             254,309        42,982                 59
911 Calls                  6 years, 2004-2009         2,527,162      421,194                36
Mayor's Hotline            15 months, Oct. 2008-2009  12,239         9,791                  19
Construction Permit        6 years, 2004-2009         30,773         5,129                  32
Foreclosure                6 years, 2004-2009         11,671         1,945                  34
Commercial Robbery Person  6 years, 2004-2009         1,005          168                    8
Street Robbery Person      6 years, 2004-2009         32,064         5,344                  8
t-Month-Based Approach
[Figure: t months of data are used to predict the (t+1) month data]
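One way to read the t-month-based approach is as a sliding window over each cell's monthly values: t consecutive months form the input, and the month that follows is the prediction target. The sketch below assumes that reading; make_windows and the flat per-cell list are hypothetical, and the counts are made up.

```python
def make_windows(series, t):
    """Pair each run of t consecutive months with the month that follows it.

    series: per-month values for one grid cell (hypothetical layout).
    Returns a list of (features, target) tuples.
    """
    windows = []
    for start in range(len(series) - t):
        features = series[start:start + t]  # the t months of input data
        target = series[start + t]          # the (t+1) month to predict
        windows.append((features, target))
    return windows

# Made-up monthly counts for a single grid cell.
cell_counts = [1, 0, 2, 3, 1, 4]
pairs = make_windows(cell_counts, t=3)
```

Sliding the window one month at a time yields one training example per month after the first t, so a longer t gives each example more history at the cost of fewer examples.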
EXPERIMENTS
• Experimented with different data mining
techniques: 1NN, SVM, Decision Tree (J48),
Neural Network, and Naïve Bayes.
• Different grid sizes:
24 x 20 (one-half-mile-square cells),
41 x 40 (one-quarter-mile-square cells).
• Hotspot vs. Heating-up.
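To make the two grid resolutions concrete, the sketch below counts the cells in each grid and shows how one location falls into a different cell at each cell size. The grid shapes and cell sizes come from the slide; the dictionary layout, the cell_index helper, and the sample point are illustrative assumptions.

```python
# Grid shapes and cell sizes (in miles) as listed on the slide.
GRIDS = {
    "half_mile": {"shape": (24, 20), "cell_size": 0.5},
    "quarter_mile": {"shape": (41, 40), "cell_size": 0.25},
}

def cell_index(x_miles, y_miles, cell_size):
    """Return the (row, col) of the cell containing a point (hypothetical helper)."""
    return int(y_miles // cell_size), int(x_miles // cell_size)

# Total number of cells to classify at each resolution.
cell_totals = {
    name: grid["shape"][0] * grid["shape"][1] for name, grid in GRIDS.items()
}

# The same made-up location, indexed at both resolutions.
point = (3.3, 1.6)
half_cell = cell_index(point[0], point[1], GRIDS["half_mile"]["cell_size"])
quarter_cell = cell_index(point[0], point[1], GRIDS["quarter_mile"]["cell_size"])
```

The quarter-mile grid has more than three times as many cells, so each cell aggregates fewer events.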
1NN with/without Location Constraint
[Figure: Precision, Recall, F1, and Accuracy (0% to 100%) for Not-Constrained and Location Constrained 1NN]
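For reference, the four measures in this comparison can be computed from binary predictions as in the sketch below. These are the standard definitions; the function name and the sample labels are made up for illustration and are not results from the study.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Made-up labels: 1 = hotspot, 0 = not a hotspot.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Accuracy alone can look high when most cells have no crime, which is one reason to report precision, recall, and F1 alongside it.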
Compare Classification Methods
[Figure: results (0% to 90%) on Set 1, Set 2, and Set 3 for SVM, J48, Neural Network, 1NN, and Naïve Bayes]
Voting Effect
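The slides do not spell out the voting scheme used by the ensemble. A minimal sketch, assuming a simple unweighted majority vote across classifiers, might look like this; the classifier outputs are made up.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by unweighted majority vote.

    predictions: list of equal-length label lists, one per classifier.
    Assumption: an odd number of classifiers, so binary votes cannot tie.
    """
    voted = []
    for labels in zip(*predictions):  # labels for one cell, one per classifier
        winner, _ = Counter(labels).most_common(1)[0]
        voted.append(winner)
    return voted

# Made-up outputs from three hypothetical classifiers over four grid cells.
svm_pred = [1, 0, 1, 0]
tree_pred = [1, 1, 0, 0]
bayes_pred = [0, 1, 1, 0]
ensemble = majority_vote([svm_pred, tree_pred, bayes_pred])
```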
Grid Sizes
SUMMARY
1. Crime is strongly related to location:
• 1NN with the location constraint performs better.
• Naïve Bayes: what has happened in a particular
place in the past is likely to recur.
• Grid size matters because a larger grid cell
captures broader spatial knowledge.
2. Predicting a crime increase is harder but will be
more useful.
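The location constraint on 1NN is not defined in these slides. One plausible reading, assumed here, is that the nearest neighbor is searched only among past examples from the same grid cell. The sketch below contrasts that with an unconstrained search; the data layout, names, and values are all hypothetical.

```python
def nearest_neighbor(query, training, constrain_location=False):
    """Return the label of the closest training example (1NN).

    query: (cell_id, features); training: list of (cell_id, features, label).
    With constrain_location=True, only examples from the query's own cell
    are considered (an assumed reading of "location constrained").
    """
    query_cell, query_features = query
    candidates = [
        example for example in training
        if not constrain_location or example[0] == query_cell
    ]
    if not candidates:
        return None  # no history available for this cell

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(candidates, key=lambda ex: squared_distance(ex[1], query_features))
    return best[2]

# Made-up examples: (cell, feature counts, hotspot label).
training = [
    ("cell_A", [2, 0, 1], 1),
    ("cell_A", [0, 0, 0], 0),
    ("cell_B", [1, 1, 1], 0),
]
query = ("cell_A", [1, 1, 2])
unconstrained = nearest_neighbor(query, training)
constrained = nearest_neighbor(query, training, constrain_location=True)
```

In this toy data the unconstrained search picks a neighbor from a different cell and the constrained search picks one from the same cell, so the two predictions differ.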
DEPLOYMENT
• How can we help law
enforcement?
• What are the obstacles?
Data Warehouse
Visualized Reports
Crime Forecasting
ACKNOWLEDGMENTS
• Funded by the National Institute of Justice,
2009-DE-BX-K219.
• Funded by the University of Massachusetts
President's 2010 Science & Technology
(S&T) Initiatives, 2011-2012.
Q&A
THANK YOU!!