Predicting Next Responses

I Know What You Did Next: Predicting Respondent’s Next
Activity Using Machine Learning
Hariharan Arunachalam, Gregory Atkin, Douglas Wettlaufer, Adam
Eck, Dr. Leen-Kiat Soh & Dr. Robert Belli
University of Nebraska-Lincoln
Department of Computer Science and Engineering
May 15, 2015
70th AAPOR Annual Conference
Acknowledgements




This material is based upon work supported by the
National Science Foundation under Grant No. SES 1132015.
UNL Survey Research and Methodology (SRAM)
UNL Gallup Research Center
NCRN CSE Team: Gregory Scott Atkins, Douglas
Wettlaufer, Adam Eck
Any opinions, findings, and conclusions or recommendations expressed in this material
are those of the author(s) and do not necessarily reflect the views of the National
Science Foundation.
All presented experimental results are preliminary and subject to revision.
Introduction

Computer Assisted Telephone Interview (CATI)
- Interviewers use a software instrument while conducting interviews with respondents over the telephone.
How much assistance does the computer software provide?
- Used as a data recording instrument
  - Manual recoding & cleaning
  - Used for further data processing, storage and distribution
- Primarily used as an instrument running a set of rules
Can it be enhanced to help interviewers improve data quality?
- If yes, how can it be enhanced?
Introduction (2)

Survey type targeted: time use diary
- Specifically, the American Time Use Survey (ATUS)
Improve interview data quality
- Improve successful response turnover
- Decrease item non-response
- Increase recording and collection efficiency
  - Reduce data entry time
  - Reduce overall interview time and post-processing time
  - Reduce interviewer errors
  - Reduce learning effort
American Time Use Survey

American Time Use Survey (ATUS)
- Used by researchers in various fields
- Conducted annually by the U.S. Census Bureau for the U.S. Bureau of Labor Statistics
- Method: CATI
- Respondents are asked to recollect the activities they did for 24 hours starting at 0400 the previous day
- Recorded as a time diary

Activity      Start Time   End Time   Where        Who
Eating        11:30 am     12:30 pm   Restaurant   Co-workers
Watching TV   6:30 pm      11:30 pm   Home         Friends
Overview

- Introduction
- American Time Use Survey
- Data
  - Introduction
  - ATUS
  - Classification
  - Transformation
- Architecture Overview
- Experiments
  - Introduction
  - Description & Results
    - Markov Chains
    - Artificial Neural Network
- Conclusions
- Instrument Prototype
- Future Work
Data: Introduction

Time diary
- Sequence of activities in chronological order for a respondent
Activity sequence
- An activity followed by another
  - Independent of the activities before and after it
  - Thus activity sequences of different lengths can occur across all respondents of a survey
Prediction
- Given an activity, what comes next?
- Some activities are intuitive: sleeping is usually followed by personal care activities such as brushing and bathing
- What are the attributes that affect the prediction?
- Predict possible next activities
  - Take the top 5 predictions and let the interviewer pick based on the respondent's response.
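
A minimal sketch of treating a time diary as a chain of "an activity followed by another". It is an illustration only, not the authors' code; the function name `adjacent_pairs` and the sample diary are hypothetical.

```python
from typing import List, Tuple


def adjacent_pairs(diary: List[str]) -> List[Tuple[str, str]]:
    """Return a (first, next) pair for every two consecutive activities."""
    # zip stops at the shorter input, so a diary of length n yields n - 1 pairs.
    return list(zip(diary, diary[1:]))


# Hypothetical diary for one respondent, in chronological order.
diary = ["Sleeping", "Personal care", "Eating", "Working"]
pairs = adjacent_pairs(diary)
print(pairs)
```

Because each pair is handled on its own, diaries of different lengths can be pooled into one list of pairs.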
Data: ATUS
[Diagram: ATUS data flow. Elements shown: Respondent pool, Demographic data, Interview process, Raw respondent data, Logging, Paradata, CPS data, Recoding process, Activity data]
Data: ATUS


U.S. Bureau of Labor Statistics (BLS) publicly releases the cleaned data from ATUS interviews yearly
We consider the data from 2010, 2011, 2012 and 2013
- Activity data (respondent activities) and
- Demographics data (CPS data) (69 attributes)
Activities are coded using dictionaries that classify and translate verbal (verbatim) respondent responses. Activities are coded in three tiers based on time use
- Tier 1, e.g. Household activities
- Tier 2, e.g. Housework; Food & drink preparation and cleanup
- Tier 3, e.g. Interior cleaning; Laundry; Sewing, repairing & maintaining textiles
We are interested in using the third tier (T3) activity names (most detailed) and then working from there.
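
A small sketch of how a three-tier activity code could be split into its tiers. The slides do not describe the code format; the six-digit layout (two digits per tier) and the example code below are assumptions based on the public ATUS activity lexicon, and `split_activity_code` is a hypothetical helper.

```python
from typing import NamedTuple


class TierCodes(NamedTuple):
    tier1: str  # first two digits: broadest category
    tier2: str  # first four digits: second-tier category
    tier3: str  # all six digits: most detailed activity


def split_activity_code(code: str) -> TierCodes:
    """Split an assumed six-digit activity code into its three tier prefixes."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected a six-digit code, got {code!r}")
    return TierCodes(code[:2], code[:4], code)


# "020101" is assumed here to stand for Household activities > Housework > Interior cleaning.
print(split_activity_code("020101"))
```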
Data: Classification & Transformation

Classification
- The tier 3 activities provided with ATUS are too specific to use in a conversation and cannot be presented to the interviewer
  - E.g., HH management & paperwork assistance for non-HH adults
- Data too sparse
  - Some activity sequences hardly occur, while some (sleeping followed by personal care) occur frequently
- Classify data in tiers similar to ATUS but based on different criteria
- Activity names that can be used for predictions and presented to the interviewer

Transformation
- Given the new classification, how do predictions differ across the classification levels?
Data: Classification
[Diagram: L-CONCEPT classification hierarchy, from most abstract to most detailed]
- L-CONCEPT level: Household Activities; Entertainment
- MID level: Watching TV & Movies; Watching sports, games & activities; General household activities; Interior cleaning & decoration
- Tier 3 (T3) level: TV & Movies (religious); TV & Movies (nonreligious); Watching wrestling; Watching aerobics; Watching baseball; HH & personal organization & planning; Home security; HH management & paperwork assistance for non-HH adults; Interior cleaning; Heating & cooling
Data: Classification
[Diagram: D-CONCEPT classification hierarchy, built directly on T3]
- D-CONCEPT level: Indoor Entertainment; Outdoor Entertainment; Household Activities; Maintenance & Repair
- Tier 3 (T3) level: TV & Movies (religious); TV & Movies (nonreligious); Watching wrestling; Watching aerobics; Watching baseball; Home security; HH & personal organization & planning; HH management & paperwork assistance for non-HH adults; Interior cleaning; Heating & cooling
Data: Transformation

Grouped T3 activities into
- MID: a middle level of activities, more abstract than T3
- CONCEPT: the highest, most abstract level of grouping
  - L-CONCEPT: when the concepts are built using MID
  - D-CONCEPT: when the concepts are built using T3
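
The grouping can be pictured as lookup tables from a T3 name to its more abstract labels. This is a sketch under assumptions: the specific assignments below are guesses read off the example names on the classification slides, not the authors' full scheme, and `abstract` and `transform_pair` are hypothetical helpers.

```python
# Illustrative groupings only (assumed, incomplete).
T3_TO_MID = {
    "TV & Movies (religious)": "Watching TV & Movies",
    "Watching baseball": "Watching sports, games & activities",
    "Interior cleaning": "Interior cleaning & decoration",
}
MID_TO_L_CONCEPT = {
    "Watching TV & Movies": "Entertainment",
    "Watching sports, games & activities": "Entertainment",
    "Interior cleaning & decoration": "Household Activities",
}
T3_TO_D_CONCEPT = {
    "TV & Movies (religious)": "Indoor Entertainment",
    "Watching baseball": "Outdoor Entertainment",
    "Interior cleaning": "Household Activities",
}


def abstract(t3_name: str, level: str) -> str:
    """Return the label of a T3 activity at the requested abstraction level."""
    if level == "T3":
        return t3_name
    if level == "MID":
        return T3_TO_MID[t3_name]
    if level == "L-CONCEPT":  # concepts built on top of MID
        return MID_TO_L_CONCEPT[T3_TO_MID[t3_name]]
    if level == "D-CONCEPT":  # concepts built directly on T3
        return T3_TO_D_CONCEPT[t3_name]
    raise ValueError(f"unknown level: {level}")


def transform_pair(first_t3: str, next_t3: str, first_level: str, next_level: str):
    """Apply one abstraction level to the first activity and another to the next."""
    return abstract(first_t3, first_level), abstract(next_t3, next_level)


print(transform_pair("Watching baseball", "Interior cleaning", "L-CONCEPT", "MID"))
```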

Create 60 configurations, where a configuration has
- The dataset from the year on which training occurs
- The dataset from the year on which testing is done
- A transformation for the activities that defines the abstraction levels to use for the first and next activities in the sequence
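
One way to arrive at 60 configurations is to pair every training year with every different testing year (12 ordered pairs from the four years) and combine each pair with the five first/next transformations used in the Markov Chain Model experiments. That breakdown is an inference from the slides rather than something they state; the dictionary keys are hypothetical.

```python
from itertools import permutations

YEARS = [2010, 2011, 2012, 2013]

# (level for the first activity, level for the next activity)
TRANSFORMS = [
    ("D-CONCEPT", "MID"),
    ("D-CONCEPT", "T3"),
    ("L-CONCEPT", "MID"),
    ("MID", "MID"),
    ("T3", "T3"),
]

configurations = [
    {"train_year": train, "test_year": test, "first_level": first, "next_level": nxt}
    for train, test in permutations(YEARS, 2)  # ordered pairs with train != test
    for first, nxt in TRANSFORMS
]

print(len(configurations))  # 12 year pairs x 5 transformations = 60
```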
Architecture Overview
Experiments: Introduction

Machine learning techniques

Markov Chain Models (MCM)
- Train and build a classifier to predict a list of possible next activities that come after a given activity using transition probabilities
- Used when data is temporal and the next item in a sequence is being predicted

Artificial Neural Networks (ANN)
- Train and build a network to predict the activity that comes next after a given activity
- Requires the least domain knowledge
- Able to model unknown & hidden relationships between attributes for predictions
- This makes ANN a likely candidate for finding hidden relations
Experiments: Markov Chain Model

Markov Chain Model (MCM)
- Statistical model
- Modeled as a Markov process with state-space transitions; useful for temporal pattern recognition

[Diagram: a sequence of four activities, Activity A, Activity B, Activity C, Activity D, at Index 0, Index 1, Index 2, Index 3]

Consider pair-wise adjacent activities: learn to predict the top 5 possible activities that could come next using
- The first activity (non-demographic model)
  - Uses the entire dataset population
- The first activity & demographic attributes (demographic models)
  - Each demographic model uses the subset of data that matches the value for this demographic model
  - Intuition: daily activities/routines may have similarities across demographics
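
A minimal sketch of a first-order Markov chain over adjacent activity pairs, with transition probabilities estimated from counts. It is not the authors' implementation; `fit_transitions`, `predict_next` and the sample diaries are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def fit_transitions(diaries: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each activity is directly followed by each other activity."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for diary in diaries:
        for first, nxt in zip(diary, diary[1:]):
            counts[first][nxt] += 1
    return counts


def predict_next(counts: Dict[str, Counter], first: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return up to k most likely next activities with their estimated probabilities."""
    following = counts.get(first)
    if not following:
        return []  # activity never seen as a "first" activity in training
    total = sum(following.values())
    return [(activity, n / total) for activity, n in following.most_common(k)]


# Hypothetical training diaries.
diaries = [
    ["Sleeping", "Personal care", "Eating", "Working"],
    ["Sleeping", "Personal care", "Working"],
    ["Sleeping", "Eating", "Working"],
]
model = fit_transitions(diaries)
print(predict_next(model, "Sleeping"))
```

Under the same sketch, a demographic model would be fit with `fit_transitions` on only those diaries whose respondents share a given demographic value.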
Experiments: Markov Chain Model


Predicts the top 5 activities that could come after an activity, ranked by probability
Due to issues of sparsity and maintaining readability, we selected 5 transformations that are applied to the first and next activity selected:

First        Next
D-CONCEPT    MID
D-CONCEPT    T3
L-CONCEPT    MID
MID          MID
T3           T3

The selection uses the combinations that predict the detailed versions of the activity by looking at the first activity using different classifications
Experiments: MCM Results
[Figure: MCM results; trained on 2012, tested on 2013]
Maximum difference between accuracy of a demographic model and the non-demographic model: 100
Minimum difference between accuracy of a demographic model and the non-demographic model: 0
Experiments: Markov Chain Model Results
Percent of times a demographic model performed as well as or better than the non-demographic model, for different transforms (first level, next level) and year data sets

Trained Year  Tested Year  D-CONCEPT MID  D-CONCEPT T3  L-CONCEPT MID  MID MID  T3 T3
2010          2011         53.35          50.81         59.23          45.13    39.15
2010          2012         50.10          51.32         60.14          45.23    39.45
2010          2013         54.67          51.22         59.53          47.87    40.57
2011          2010         49.80          47.26         57.81          43.71    37.83
2011          2012         49.90          46.75         59.84          42.80    38.84
2011          2013         54.87          48.38         59.74          46.65    38.44
2012          2010         51.32          46.55         51.93          45.84    38.13
2012          2011         52.43          45.44         53.45          47.57    38.74
2012          2013         54.97          45.13         54.16          47.87    38.74
2013          2010         56.19          47.46         56.19          43.31    37.93
2013          2011         58.62          47.46         57.61          44.93    38.54
2013          2012         58.82          45.64         59.03          43.10    38.54
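
The reported measure, the percent of times a demographic model did as well as or better than the non-demographic model, could be computed along these lines. The slides do not say exactly how the comparison was aggregated, so this is only a sketch; the function name and the accuracy values are hypothetical.

```python
from typing import Sequence


def percent_as_good_or_better(demographic_accuracies: Sequence[float],
                              baseline_accuracy: float) -> float:
    """Percent of demographic models whose accuracy is >= the baseline accuracy."""
    if not demographic_accuracies:
        raise ValueError("need at least one demographic model")
    wins = sum(1 for acc in demographic_accuracies if acc >= baseline_accuracy)
    return 100.0 * wins / len(demographic_accuracies)


# Hypothetical accuracies for four demographic models and one non-demographic baseline.
print(percent_as_good_or_better([0.62, 0.48, 0.55, 0.55], baseline_accuracy=0.55))  # 75.0
```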
Experiments: Markov Chain Model Results


Intuition of using transformations holds true: higher and more unique models when using abstract groupings

Transformation     Average percent better
L-CONCEPT & MID    57.4
D-CONCEPT & MID    53.8
D-CONCEPT & T3     47.8
MID & MID          45.3
T3 & T3            38.8

Contrary to the intuition of using demographics, though, the demographic-based models did not consistently perform better than the non-demographic model
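
The averages can be recomputed as column means of the twelve train/test rows in the results table. The sketch below assumes that is how they were obtained; the recomputed values agree with the reported ones to within about 0.1.

```python
from statistics import mean

# Column values copied from the results table, in row order
# (2010->2011, 2010->2012, ..., 2013->2012).
RESULTS = {
    "L-CONCEPT & MID": [59.23, 60.14, 59.53, 57.81, 59.84, 59.74,
                        51.93, 53.45, 54.16, 56.19, 57.61, 59.03],
    "D-CONCEPT & MID": [53.35, 50.10, 54.67, 49.80, 49.90, 54.87,
                        51.32, 52.43, 54.97, 56.19, 58.62, 58.82],
    "D-CONCEPT & T3": [50.81, 51.32, 51.22, 47.26, 46.75, 48.38,
                       46.55, 45.44, 45.13, 47.46, 47.46, 45.64],
    "MID & MID": [45.13, 45.23, 47.87, 43.71, 42.80, 46.65,
                  45.84, 47.57, 47.87, 43.31, 44.93, 43.10],
    "T3 & T3": [39.15, 39.45, 40.57, 37.83, 38.84, 38.44,
                38.13, 38.74, 38.74, 37.93, 38.54, 38.54],
}

averages = {name: mean(values) for name, values in RESULTS.items()}

# Print from highest to lowest average.
for name, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {avg:.2f}")
```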
Experiments: Artificial Neural Networks






- Inspired by biological neural networks found in the brain
- Builds a model of neuron connections between three layers (input, hidden, output) in a network.
- Stores a variable ‘relationship’ as a connection weight between two neurons of adjacent layers.
- The ANN tries to learn the ‘relationships’ that exist between the attributes by adjusting the weights of these relationships during training; it predicts by evaluating the output layer's neurons and using a function to pick the best weighted output.
- Data transformed as per ANN requirements (one-hot format).
- Tested for multiple training times to detect over-fitting, where the network learns the training set but cannot generalize; repeat more to promote generalization.
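
A sketch of the one-hot input format and a forward pass through a three-layer network (input, hidden, output). The weights are random and untrained, and the tanh and softmax choices, layer sizes, and activity vocabulary are assumptions for illustration; the slides do not specify them.

```python
import math
import random
from typing import List

ACTIVITIES = ["Sleeping", "Personal care", "Eating", "Working"]  # hypothetical vocabulary


def one_hot(activity: str, vocabulary: List[str]) -> List[float]:
    """Encode an activity as a vector with a single 1.0 at its vocabulary index."""
    return [1.0 if name == activity else 0.0 for name in vocabulary]


def dense(inputs: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """weights[j][i] is the connection weight from input i to neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w_hidden, b_hidden, w_out, b_out) -> List[float]:
    hidden = [math.tanh(h) for h in dense(x, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


rng = random.Random(0)
n_in, n_hidden, n_out = len(ACTIVITIES), 3, len(ACTIVITIES)
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [0.0] * n_out

probs = forward(one_hot("Sleeping", ACTIVITIES), w_hidden, b_hidden, w_out, b_out)
predicted = ACTIVITIES[probs.index(max(probs))]  # pick the highest-scoring output neuron
print(predicted, probs)
```

Training would adjust `w_hidden` and `w_out`; with untrained weights the prediction here is arbitrary.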
Experiments: Artificial Neural Network Results
Conclusions
The machine learning methods used have been able to model the respondents’ activity sequences accurately
- Many demographics, each with multiple possible values: relationships are sparse and not all value combinations are in the data
The general distribution of the sampled population could be balancing out the accuracy by predicting the common activity sequences correctly and the unique activity sequences incorrectly.
- The unique activity sequences are relatively harder for the algorithms to learn
- The common activity sequences do not occur in enough numbers to compensate
Conclusions
For ANN, the accuracies obtained for prediction are impressive and prompt further investigation
- Currently predicts only one possible next activity for an activity
- Low accuracy for a single classifier, but context implies improvement possibilities if more predictions are allowed
- Ensemble ANN with top 5 predictions: multiple ANNs that focus on learning to predict for each possible next activity in a hierarchical structure
- This could help the network distribute learning for each possible next activity across separate nodes, thus allowing better generalization
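
To illustrate why allowing more predictions can raise measured accuracy, the sketch below counts a prediction as correct when the true next activity appears among the k highest-scoring outputs. The function name and all scores are hypothetical.

```python
from typing import Dict, List


def top_k_accuracy(score_rows: List[Dict[str, float]], true_labels: List[str], k: int) -> float:
    """Fraction of cases whose true label is among the k highest-scoring activities."""
    if not true_labels:
        raise ValueError("need at least one labelled case")
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(scores, key=scores.get, reverse=True)[:k]
        if truth in ranked:
            hits += 1
    return hits / len(true_labels)


# Hypothetical output scores for three cases, and the activity that actually came next.
score_rows = [
    {"Eating": 0.5, "Working": 0.3, "Sleeping": 0.2},
    {"Eating": 0.4, "Working": 0.35, "Sleeping": 0.25},
    {"Eating": 0.1, "Working": 0.2, "Sleeping": 0.7},
]
true_labels = ["Eating", "Working", "Eating"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(score_rows, true_labels, k))
```

In this toy data only one of three cases is right with a single prediction, two of three with the top 2, and all three with the top 3.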
Instrument Prototype
Current & Future Work
The next steps are to investigate using
- Ensemble techniques & optimizations
- Multiple machine learning modeling methods in tandem
- Problem solving techniques such as case-based reasoning for the predictions
- Identify and apply techniques that can generalize where needed, but use specific unique cases where generalization fails
Thank You!
Experiments: Principal Component Analysis




- PCA: use an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
- Identifies attributes that bring about variability in the data: attributes that machine learning algorithms can use instead of all attributes, since using all attributes is more computationally intensive.
- Data same as used for MCM
- Starting attributes: first activity name, index of first activity in day, hour and minute of the end time of the first activity, and the demographics.
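
A minimal sketch of PCA's core step: center the data, form the covariance matrix, and find the direction of greatest variance (here by power iteration). The data is made up, and real attributes such as activity names would first need a numeric encoding. How the authors mapped components back to selected attributes is not stated on the slides.

```python
import math
from typing import List


def center(rows: List[List[float]]) -> List[List[float]]:
    """Subtract each column's mean so every attribute has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered: List[List[float]]) -> List[List[float]]:
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)] for i in range(d)]


def first_component(cov: List[List[float]], iterations: int = 200) -> List[float]:
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return v  # no variance in the data
        v = [x / norm for x in w]
    return v


# Hypothetical observations of two strongly correlated numeric attributes.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
cov = covariance(center(rows))
pc1 = first_component(cov)

# Variance along pc1 (Rayleigh quotient) as a share of total variance (trace).
variance_pc1 = sum(pc1[i] * sum(cov[i][j] * pc1[j] for j in range(2)) for i in range(2))
explained = variance_pc1 / (cov[0][0] + cov[1][1])
print(pc1, explained)
```

The component's entries (loadings) indicate how strongly each original attribute contributes to the main direction of variability.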
Experiments: PCA Results
Index  Selected Attribute  Description
1      FirstActivity       The first activity name
2      EndTimeHour         The hour of the first activity’s end time
3      PESEX               The sex of the respondent
4      PRTAGE              The age of the respondent
5      GESTCEN             Census state code of the respondent’s home
6      GEREG               The region of the US where the respondent lives
7      PRNMCHLD            Number of own children under the age of 18
8      HETENURE            The tenure of the respondent’s living quarters
9      HRHTYPE             The type of the respondent’s household
10     PEEDUCA             The respondent’s highest level of school/degree
11     PEMJNUM             The number of jobs the respondent has at a time
12     PRDTIND1            The detailed industry recode of the respondent’s main job
13     PRMJOCC2            The major occupation recode of the respondent’s second job
14     PRMJOCGR            The major occupation recode of the respondent’s main job