First IEEE International Workshop on Behavioral Implications of Contextual Analytics (PerCom Workshops) 2017
Your Data in Your Hands:
Privacy-preserving User Behavior Models for
Context Computation
Rahul Murmuria, Angelos Stavrou
Kryptowire, Fairfax, Virginia 22030, USA
{rahul, angelos}@kryptowire.com
Daniel Barbara
George Mason University, Fairfax, Virginia 22030, USA
[email protected]
Vincent Sritapan
Department of Homeland Security, Washington, D.C., USA
[email protected]
Abstract—Modern smartphone applications rely on contextual
information while providing the users with relevant and timely
content and services. One way of generating such contextual
information is by employing learning systems to model user
behavior. Motion-based sensors, such as the accelerometer or
gyroscope, have been previously employed for recognizing predefined high-level physical activities such as climbing stairs,
jogging, or driving. In practice, human activities are highly
diverse and unsupervised methods must be used to expose complex behavioral characteristics that are user-centric. This paper
proposes a novel machine learning model for user authentication
and trust that continuously assesses the user's activities in an
effort to expose deviations from known training data. The goal
is to export this trust score as a contextual input to mobile apps
for detection of unauthorized access, fraudulent transactions, the
progress of a disease, or other behavioral changes such as stage
fright, intoxicated behavior, or mood changes. All collected data
and generated models of the user remain on the smartphone, and
only the score needs to be revealed to the apps. As a result, the
user controls the data without the need to share with any remote
entity. The paper presents preliminary performance results of
this technique.
I. INTRODUCTION
Mobile applications are revolutionizing the way users perform everyday activities by providing them with content
predicted to be relevant to the user at any given time. In order
to compute this contextual content, many mobile applications
are collecting personally identifiable information (PII) and
transmitting it to external processing centers for further
analysis. These processing centers can take many forms
depending on the application's requirements, ranging from
application developers and mobile network operators to
advertisers and enterprise device management systems. Of course, the
unfettered collection and sharing of PII has given rise to user
concerns about privacy implications. A Pew Research Center
study [1], which surveyed 461 people and conducted focus
groups with 80 people, concluded that users share personal
information in exchange for tangible benefits, but are unhappy
about what happens to that information once third parties have
it in their possession.
Contextual information is valuable: mobile applications
have leveraged physical and biometric device sensors to offer
situational awareness and trigger context-aware content.
For example, banking applications often track geolocation to
assess transactions and decide if they are fraudulent. Indeed,
a Wall Street Journal article recently reported that Visa Inc.
and Mastercard Inc. offer services to banking applications
that use smartphone location-tracking as one of the inputs
to their predictive fraud analytics [2]. Similarly, in medicine,
the progress of a disease can be tracked using a smartphone
and relevant services can be provided to the user in real time. Mehta et al. [3] reported the development of a tool that
acquires the high-bandwidth signal from motion sensors to
detect the progress of voice disorders in patients. Geolocation
tracking and speech recordings are examples of highly personalized
information that is shared with remote entities
in order to compute context, even though in most applications
of this type the goal is only to recognize whether the user is
behaving uncharacteristically.
Thus far, the accelerometer and gyroscope sensors are the
most commonly used device sensors for activity recognition.
Successful implementations have so far focused on modeling and recognizing simple human activities such as sitting,
standing, jogging, and climbing stairs [4]. The devices are
usually placed firmly in a fixed position in the pockets or
on the body of the users who are asked to perform the same
tasks repeatedly. Once training data is obtained, clearly labeled
with the activities they correspond to, the models are generated
for each of the activities and used for activity recognition. In
practice, the issue with this setup is that if the environment
changes, new behavior models will need to be generated.
These environmental changes can be as simple as changing the
way the smartphones are mounted. Primo et al. [5] presented
a context-aware authentication scheme where the position of
the smartphone is taken into consideration. However, real-world environments are far more diverse and people perform a
wide range of complex activities while seamlessly integrating
the smartphone. Controlling either the environment or the
activities performed by the users can affect their behavior in
non-trivial ways. As a result, these methods are not scalable.
In contrast, Murmuria et al. [6] suggested an unsupervised
model for solving continuous authentication, using an algorithm
called Strangeness-based Outlier Detection (StrOUD).
We have extended this work by using Local Outlier Factor
(LOF) [7], a local-density-based outlier detection method, to
measure the strangeness of an activity.
In comparison to StrOUD, LOF enabled us to produce more
stable results. Models prepared using unsupervised methods do
not rely on pre-labeled data. There is no previous knowledge
about how many different activities can be discovered in the
user’s dataset. Therefore, by using unsupervised learning, our
approach can discover additional and more complex activities
without limiting the user’s behavior in any way. Moreover,
these activity models are general enough to persist across many
environmental changes. In order to discover these activities,
only data from the modeled user is required. As a result, this
is an outlier detection problem and not a classification problem
(see Section IV-A).
Our analysis presents techniques for data collection, preprocessing, feature extraction, and outlier detection, all directed
towards modeling user behavior. These techniques are privacy-preserving because all of the computations are performed
locally on the device without the need to rely on external
processing centers. As a consequence, the data do not need
to be shared with any third parties. The users remain in
control of their own data. Our continuous authentication model
is implemented on the device and computes a contextual
trust score that represents the probability that the users are
performing everyday activities the same way as they did at
the time of training the models. This trust score can then be
shared with third party applications to drive decisions, such
as blocking transactions when the score is low or reporting
unusual changes in a medical disorder to the doctor concerned.
The rest of this paper is organized as follows: Section II is
a description of the privacy-preserving implementation. Section III and Section IV detail the performance evaluation and
feasibility of the modeling technique, respectively. Section V
is a brief literature review. Section VI and Section VII suggest
further research directions and conclude the paper.
II. IMPLEMENTATION AND MODELING
In this section, we describe our mathematical model and
software implementation. This description includes the design
of our on-device context generation tool which collects the
motion sensor data, models user behavior, and generates trust
scores in real-time. Figure 1 gives an overview of the structure
of the application. A proof-of-concept was implemented as
an Android application (named KAuth) and has been tested
on Android KitKat, Lollipop, and Marshmallow. Privacy is
preserved as a consequence of the choices made at each step
of the implementation, details of which have been discussed
where applicable in the subsections below.
A. Data Collection
The gyroscope and accelerometer sensor readings were
recorded using the Sensor Event API, which is part of the
standard Android SDK.
Fig. 1. Structure of the Context Generation Application (KAuth)
During a single sensor event, the accelerometer and gyroscope return, for the three coordinate
axes of the device, acceleration force data in m/s² and rate
of rotation data in rad/s, respectively. The acceleration measurements include all forces applied to the device, including
the force of gravity. As a result, the orientation of the device is
inherently part of the accelerometer measurements obtained
from the three coordinate axes.
The readings were collected with the fastest available sampling period of around 5 milliseconds on Nexus 5 (Model:
LG-D820) and around 3 to 10 milliseconds on Samsung S6
(Model: SM-G920I). These readings were saved as a CSV
file on the smartphone for later analysis. In addition, the
segmentation, feature extraction, and online model generation
algorithms were running in parallel, such that the collected
data were transformed and stored in memory in their cleaned,
reduced, and processed form.
B. Segmentation
In addition to the sensor readings, the KAuth application
records the timestamp and package name of the top application being used by the user. This information is retrieved
from the Activity Manager API, and the system permission
GET_REAL_TASKS is needed to perform this task.
Users perform different activities on different applications.
When a user is playing a game, the digital footprint that the
user leaves behind is significantly different from when the
user is sending text messages. Murmuria et al. [6] showed that
mixing data from different applications can adversely impact
the overall performance of a behavior modeling system.
The activity logger in the KAuth application inserted place
markers in the data whenever the user switched from one
application to another. As part of pre-processing the data, only
those events that were generated while using the application
for which the user profiles are being created were extracted. As
a result, multiple datasets were collected on the smartphone,
one set for every application used.
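The per-application segmentation can be sketched as follows; the tuple layouts for sensor events and app-switch markers are hypothetical, since the paper does not specify its internal record format:

```python
from collections import defaultdict

def split_by_application(events, app_switches):
    """Partition a stream of sensor events into one dataset per
    foreground application, using timestamped app-switch markers.

    `events` is a list of (timestamp, reading) tuples and
    `app_switches` a time-sorted list of (timestamp, package_name)
    markers; both formats are illustrative.
    """
    datasets = defaultdict(list)
    idx, current_app = 0, None
    for ts, reading in events:
        # Advance to the most recent app switch at or before this event.
        while idx < len(app_switches) and app_switches[idx][0] <= ts:
            current_app = app_switches[idx][1]
            idx += 1
        if current_app is not None:
            datasets[current_app].append((ts, reading))
    return dict(datasets)
```

Each resulting dataset then feeds a separate per-application behavior model.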
C. Feature Extraction
The accelerometer and gyroscope produce readings in the
form of a multi-dimensional time-series. Let X, Y, and Z
represent the readings from a sensor in the x, y, and z axes, respectively.
Then R = √(X² + Y² + Z²) represents the resultant magnitude
of the acceleration or the angular speed.
The recorded events were divided into small windows of 1.6
seconds each, where we can measure properties related to
the group of events. The window size was chosen because
FFT computations require the number of events in the input
to be a power of 2 (see Section II-C2). For the purposes of
this analysis, the data associated within each window frame
can be referred to as one movement gesture. Each of these
movement gestures loosely represents the smallest constituent
unit of any complex activity performed by the user. Statistical
time-domain and frequency-domain features were extracted
to represent each gesture as a multi-dimensional dataset. The
features discussed below were selected from a larger set by
performing an offline analysis on a previously recorded dataset
of 110 users using the smartphone for routine tasks spanning
a week (see Section III).
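The windowing and resultant computation above can be sketched as follows, assuming the readings have already been resampled into an (N, 3) array (the array layout and function name are illustrative):

```python
import numpy as np

def gestures(samples, window=32):
    """Split an (N, 3) array of x/y/z sensor readings, resampled to
    20 Hz, into non-overlapping 1.6-second movement gestures of 32
    readings each, appending the resultant magnitude
    R = sqrt(x^2 + y^2 + z^2) as a fourth series.
    """
    n = (len(samples) // window) * window           # drop the ragged tail
    xyz = np.asarray(samples[:n], dtype=float)
    r = np.linalg.norm(xyz, axis=1, keepdims=True)  # resultant magnitude R
    series = np.hstack([xyz, r])                    # shape (n, 4)
    return series.reshape(-1, window, 4)            # one gesture per row
```

A power-of-2 window of 32 readings keeps the later FFT step straightforward.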
1) Time-domain Features: The time-domain features include the mean, standard deviation, skewness, and kurtosis of
each small window of time. The mean represents the average
magnitude of the user's movements; the standard deviation
shows the scatter of the data, or in other words, the intensity
of the user's activities. Skewness measures the asymmetry,
and kurtosis explains whether the source of the variance is
infrequent extreme movements or frequent modestly sized
movements. The three axes and the resultant together form
4 independent time-series and these statistics are computed
for each series separately, thereby resulting in 16 time-series
features for each hardware sensor.
During feature selection, which was performed offline (see
Section III), it was discovered that the mean values of all
the axes were poor features for the accelerometer sensor, whereas
they were strong features for the gyroscope sensor. Finally,
12 features were selected for the accelerometer, including the
standard deviation, skewness, and kurtosis, whereas 8 features
were selected for the gyroscope, limited to the mean and standard
deviation of each of the dimensions.
2) Frequency-domain Features: Kavanagh et al. [8] suggested
that the Nyquist frequency of physical accelerometry
signals is typically around 10 Hz. For this study,
the multi-dimensional data segments from both sensors were
aggregated into 32 readings at a 20 Hz sampling frequency (twice
the Nyquist frequency). This is achieved by low-pass filtering the
time-series and resampling it with a period of 50 milliseconds.
Readings at 50-millisecond intervals in windows
of 1.6 seconds lead to 32 readings for every movement
gesture. Consequently, applying Fast Fourier Transform (FFT)
on this data produces 32 coefficients. Since the resulting power
spectrum is two-sided and one half mirrors the other, we take
the first 16 coefficients as features after discarding the direct
component (the first, or direct, component of the FFT is the
same as the statistical mean, which has already been recorded
among the time-domain features). After feature selection,
the first 4 coefficients in the power spectrum were selected as
the best set of features for both accelerometer and gyroscope.
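The frequency-domain extraction can be sketched with numpy's real FFT; using the squared magnitude as the power spectrum is an assumption, since the paper does not give its exact normalization:

```python
import numpy as np

def frequency_domain_features(series, n_coeffs=4):
    """Power-spectrum features for one 32-reading series sampled at
    20 Hz. The one-sided FFT of 32 real samples yields 16 usable
    coefficients beyond the direct (DC) component, which duplicates
    the statistical mean; the first `n_coeffs` of them are kept."""
    fft = np.fft.rfft(np.asarray(series, dtype=float))  # one-sided spectrum
    power = np.abs(fft) ** 2
    return power[1:1 + n_coeffs]   # skip the DC term, keep 4 coefficients
```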
D. Outlier Detection
The task of user behavior validation is equivalent to determining
whether the stream of movement gestures from either
the accelerometer or gyroscope follows the same distribution
as those previously obtained from the same user of the device.
The KAuth application works in two phases, a training phase
to generate the behavior models, and an authentication phase
where new movement gestures are evaluated and the overall
trust score is revised.
The movement gestures dataset is one with an unbounded
number of classes. There are no pre-defined and labeled
movement gestures such as jogging or driving. In this type
of dataset, each complex activity performed by the user is
loosely represented by a cluster. These clusters vary in
density, which stems from the nature of the gestures.
Breunig et al. [7] first presented the notion
that being an outlier is not a binary property and described a
density-based outlier detection technique called Local Outlier
Factor (LOF). We leveraged this algorithm to detect outliers
in our study. The outlier factor returned by this algorithm
captures the degree to which a given movement gesture can
be called an outlier. It is the average ratio of the local
densities of the gesture's k nearest neighbors in Euclidean
space to the local density of the gesture itself. The outlier factor
is higher when a movement gesture's neighborhood of k
neighbors is more densely packed than the gesture itself.
Consequently, when the outlier factor is close to 1, the gesture
is not an outlier, and when the factor is much higher than a
pre-selected threshold, then the gesture can be deemed as an
outlier. For all the gestures in the middle, which have outlier
factors close to the threshold, a decision need not be made
as to whether they are outliers; rather, a cost or penalty can
be assigned that depends on the degree of outlier-ness. This
threshold has been selected experimentally from among a set
of handpicked candidates in the same way as was suggested
by Breunig et al. [7].
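For reference, the local reachability density and the outlier factor of Breunig et al. [7] can be written as follows, where N_k(p) is the set of k nearest neighbors of a gesture p:

```latex
\mathrm{lrd}_k(p) = \left( \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \mathrm{reach\text{-}dist}_k(p, o) \right)^{-1},
\qquad
\mathrm{LOF}_k(p) = \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)},
```

where reach-dist_k(p, o) = max{k-distance(o), d(p, o)}. A gesture deep inside a cluster has LOF close to 1; a gesture whose neighbors are much denser than its own neighborhood has LOF well above 1.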
In the KAuth implementation, during training, once a preset
number of movement gestures has been recorded for a given
application (750 gestures in our proof-of-concept implementation,
which take about 20 minutes to record), every gesture's local
density is computed. A portion of these gestures will have an
outlier factor higher than the pre-selected threshold, and the
size of this portion can be configured depending on the value
of the threshold selected. For generating the model, we need to
retain the density computations of every gesture in the training
set along with the training dataset itself. The LOF algorithm does
not employ any rule learned by generalizing from
the training dataset. It looks for local outliers, and therefore
fits well with our motion sensor data, where there is no well-defined
set of classes representing user activities.
During the authentication phase, new gestures are evaluated
against the training dataset and their k nearest neighbors are
calculated. The LOF algorithm assigns an outlier factor to the
new gesture which depends on the relative densities in the
neighborhood. The null hypothesis here is that the new gesture
fits into the distribution of gestures in the training set. If the
outlier factor is significantly above the threshold, the alternative
hypothesis, that the gesture is an outlier, is accepted. In the KAuth
implementation, all outlier factors above the threshold but
lower than twice the threshold are assigned a proportional
real-valued penalty less than the penalty assigned for a full
outlier.
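The two-phase scheme can be sketched with a simplified local-density score; this approximation omits the reachability-distance smoothing of the full LOF algorithm from [7], and all names and parameters are illustrative:

```python
import numpy as np

def lof_scores(train, queries, k=5):
    """Simplified local-outlier-factor scoring: a query gesture's score
    is the mean ratio of its k nearest training gestures' local
    densities to its own local density, where local density is the
    inverse of the mean distance to the k nearest training points.
    Scores near 1 mean the gesture fits the training distribution;
    much larger scores mark outliers."""
    train = np.asarray(train, dtype=float)

    def density(point, exclude_self=False):
        d = np.sort(np.linalg.norm(train - point, axis=1))
        d = d[1:k + 1] if exclude_self else d[:k]  # skip the zero self-distance
        return 1.0 / (d.mean() + 1e-12)

    train_density = np.array([density(p, exclude_self=True) for p in train])
    scores = []
    for q in np.asarray(queries, dtype=float):
        nn = np.argsort(np.linalg.norm(train - q, axis=1))[:k]
        scores.append(train_density[nn].mean() / density(q))
    return np.array(scores)
```

During training the densities of all 750 gestures would be computed once and retained; during authentication each new gesture is scored against them.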
E. Continuous Scoring
Penalties and rewards are assigned to each gesture depending upon its outlier factor. If the factor is below the
threshold, a reward is assigned and if the factor is above
the threshold, a penalty is assigned (see Section II-D). These
rewards and penalties are then aggregated to compute a ‘trust
score’ out of 100, which is revised by adding or subtracting the
reward or penalty for every new movement gesture performed
by the user. This trust score is an assessment of the user’s
deviations from known behavior. In KAuth, the parameters
were limited to a maximum reward of 3 points and a maximum
penalty of 5 points. Murmuria et al. [9] published more details
about this technique of computing the trust score from the
stream of outlier factors.
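A sketch of the scoring rule follows; the proportional-penalty shape and the clamping to [0, 100] are assumptions for illustration, with the published scheme detailed in [9]:

```python
def update_trust(score, outlier_factor, threshold,
                 max_reward=3.0, max_penalty=5.0):
    """Revise the running trust score (0-100) for one new gesture.
    Factors at or below the threshold earn the full reward; factors
    between the threshold and twice the threshold earn a penalty
    proportional to how far past the threshold they fall; anything
    higher earns the full penalty."""
    if outlier_factor <= threshold:
        delta = max_reward
    elif outlier_factor < 2 * threshold:
        delta = -max_penalty * (outlier_factor - threshold) / threshold
    else:
        delta = -max_penalty
    return min(100.0, max(0.0, score + delta))
```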
III. PERFORMANCE EVALUATION
The user behavior model discussed in this paper was evaluated
offline by collecting data from 110 users and performing a
series of one-versus-all tests. The goal was to find features and
parameters that improve the clustering of the baseline user’s
data and create greater separation between the baseline user
and all other test users.
A. Volunteer Data Collection
Motion sensor data was collected from 110 volunteers that
were compensated to participate in our study for a period
of one week using the provided phone, Google Nexus 5
Model: LG-D820, as their primary device. The users' SIM
cards were switched from their own devices to the one we
provided on the day of sign-up in order to ensure that
this device was used as their primary mode of communication
for the entire week. The users were instructed to install all their
favorite applications and use the device as they would use their
primary device. Each user was assigned a pseudonym with the
convention Sxxx, where xxx is a number between 001 and
110. The real names of the users were not retained. We also did
not record any user-generated content outside of the sensory
data. All volunteer participants were University students, and
we did not discriminate who volunteered, beyond requiring
them to have an active GSM-based mobile operator whose
SIM card could be easily switched into our device.
We used the same device model for all users in order to
achieve uniformity in the measurements and avoid introducing
any device-specific markers into the collected dataset.
Fig. 2. Application Usage Per User
We
recorded data from all the sensors concerned into files. These
files were stored in the external storage directory of each
smartphone. Upon completion of a user's session, we extracted
that data out from the smartphone into our data store where we
performed offline analysis of the data. Our research required
behavioral data of human subjects and necessary approvals
were acquired from the Institutional Review Board (IRB).
Figure 2 confirms that mobile device usage is not
uniform across all users and all mobile applications. In the
figure, it is observed that 106 users generated actual data, while
4 users did not use the allotted device during the week.
Further, all 106 users have the Launcher (Google Search Box
and Homescreen) application in common, as that is the first
application visible when the smartphone is unlocked. Dialer,
Chrome, Facebook, and YouTube were used by 106, 105, 99,
and 89 users, respectively. In terms of the time that the users spent in any
application, WhatsApp topped the list among applications used
by at least 15 people, with 57 users spending an average of 290
minutes over the week, followed by Viber, with 19 users
using the application for an average of 143 minutes. Facebook
was used for an average of 121 minutes.
B. Analysis
Section II-B described the preprocessing steps, Section II-C
described the features that were formulated, and Section II-D
described the outlier detection algorithm utilized in this research.
For mobile applications in which the users spent
over 40 minutes in total, baseline models were created using
20 minutes of training data. As per the feature construction
technique described in Section II-C, 20 minutes corresponds
to 750 movement gestures. The baselines were generated
using the LOF algorithm and tested using all users who had
20 minutes of data for the corresponding application. All
baselines contain data only from the modeled user, which is the positive class.
Fig. 3. Time-domain Features on Gyroscope Sensor Data
Fig. 4. Frequency-domain Features on Gyroscope Sensor Data
The tested movement gestures were assigned
reward or penalty according to the algorithm discussed in
Section II-E. The output was a series of trust scores between
0 and 100 for every pair of baseline and test user.
Since the output is not binary, standard metrics such as
Equal Error Rate (EER) and/or the Receiver Operating Characteristic (ROC) fail to capture the practical implications of the
continuous series of scores. Figures 3 and 4 show the distribution
of the scores resulting from the one-versus-all tests performed
with gyroscope sensor data for the WhatsApp application.
Baseline users, when tested against their own baselines are
called ‘genuine users’, and test users who were tested against
baselines of other users are considered ‘imposters’ in the plots.
In the presented results, it is possible to determine from
the plots which feature sets performed better than the others.
However, in order to try all available subsets of features and
evaluate a range of other input parameters, thousands of such
plots would need to be generated, making the results difficult to
compare. In order to compare the results programmatically using a
single metric, the weighted accept score (WAS) was employed,
which was presented by Murmuria et al. in [9]. Using this
score, we repeated the analysis with various different subsets
of the feature set, and the best feature set and input parameters
were selected (see Section II-C).
Figure 3 and Figure 4 show results obtained when generating user models from gyroscope sensor data while extracting
time-domain and frequency-domain features, respectively. The
x-axis represents the trust scores binned for visual depiction
and the y-axis represents the percentage of total events. The
results follow expectation that genuine users spend most of
their time in the [80, 100] bin, and the imposters spend
most of their time in the [0, 1] and [1, 50] bins. While not
presented here, Facebook and Youtube applications showed
similar results for both gyroscope and accelerometer sensors.
IV. DISCUSSION
A. Rationale for Using Outlier Detection
As discussed in Section I, the number of activities that can
be found in a user's dataset is unbounded. As a result, activity
recognition and continuous authentication are an outlier detection
problem, and only one user's data can be used in order
to prepare behavioral models of that user.
Many publications in activity recognition discuss classification
models that depend on creating a 2-class verifier, where
in addition to data from the activity performers, sensory data
from other users is required (see Section V). Most researchers
fail to discuss that, in truth, the number of classes representing
the set of all activities is unbounded. Modeling all activities
from all imposters as a single negative class leads to overfitting
and a lack of generalization, which results in the eventual poor
performance of the deployed system when new users are
introduced. Researchers who modeled only a pre-selected set
of activities and collected data only for those activities have
regularly missed observing this phenomenon due to the lack of
diversity in their datasets.
There will always be a larger set of users with partially
unique activities who were not available at the time of training
the models for the users in the system. Therefore, this problem
should instead be modeled as in this paper: as a semi-supervised
outlier detection problem, where only the data from
the positive class is available and some measure is used to
determine whether a new stream of measurements belongs in the
distribution of previously recorded readings.
B. Feasibility of Generating Trust Score Locally
This paper discussed a technique which enables smartphones to generate user behavior models entirely on-device.
The outlier detection algorithms discussed in this paper are
based on finding fitness of a newly recorded activity into a
model represented only by a sample distribution of activities
of the modeled user and the fitness is tested via hypothesis
testing. As a result, most of the processing time during model
generation and testing goes to the execution of the nearest
neighbors discovery step. This operation was optimized by
reducing the number of distance calculations, employing a
commonly used tree-based data structure called a KD-tree, which
recursively partitions the dataset along each of the feature
dimensions, thereby reducing the average complexity of a
nearest-neighbor query from O(n) to O(log n). Further, the data
collected from the sensors are preprocessed, and after the
feature extraction step, the
20 minutes of training data occupy only 30 megabytes. On
modern smartphones, it is easy to retain data of such volumes
both in memory and storage. Therefore, the technique in this
paper is feasible.
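The KD-tree optimization can be sketched with SciPy; the 750-by-12 training matrix here is random placeholder data standing in for real accelerometer feature vectors:

```python
import numpy as np
from scipy.spatial import cKDTree

# Index the 750 training gestures (12 accelerometer features each)
# once at training time; each query during scoring then avoids a
# brute-force scan over the whole training set.
rng = np.random.default_rng(0)
training_gestures = rng.normal(size=(750, 12))
tree = cKDTree(training_gestures)                  # built once, during training

new_gesture = rng.normal(size=12)
distances, indices = tree.query(new_gesture, k=5)  # 5 nearest neighbors
```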
V. RELATED WORK
There is a large body of user behavior based research on
activity recognition and continuous authentication systems for
mobile devices.
Shi et al. [10] discussed a technique to fuse data from multiple sensors to create an authentication score. The researchers
collected a wide range of behavioral information such as
location, communication, and usage of applications, in order
to create a user profile. Their approach is built on the concept
that most users are habitual in nature; to build this model,
the authors export all of this highly intrusive data to a remote
server in order to map activities to the time of day and characterize
the user. Similarly, Riva et al. [11] presented an architecture that
utilized face and voice recognition, location familiarity, and
possession, determined by sensing nearby electronic objects,
as signals to establish the legitimate user's level of authenticity.
Their model is constructed remotely on cloud services and is
computationally too expensive to fit on any mobile device.
Kwapisz et al. [12] published a system to identify and
authenticate users based on accelerometer data. They used a
dataset of 36 users, labeled according to activities such as
walking, jogging, and climbing stairs. These labels were used
as context, and authentication was solved as a 2-class problem.
While they concluded based on their results that it is not
critical to know what activity the user is performing, their
dataset was generated by users repeating a limited set of predefined activities. In contrast, we present in this paper an
unsupervised method that scales to all possible activities that
users can perform.
For traditional computing devices, Killourhy et al. [13] and
Shen et al. [14] published comparisons of various anomaly-detection
algorithms for keystroke dynamics and mouse dynamics,
respectively, limiting the discussion to 1-class verification
due to the lack of availability of imposter data in the
real world. In contrast, we present a privacy-preserving implementation
of such a 1-class verification system on mobile
devices with a trust score model for context generation.
VI. FUTURE WORK
The dataset used in this research is very noisy; as future
work, we will investigate better data cleaning and
feature evaluation strategies in order to make the resulting
behavioral models more robust and accurate. Further, we
evaluated time-domain and frequency-domain features of accelerometer and gyroscope separately, but did not attempt any
ensemble methods on these models. In addition to ensemble
techniques, we would like to investigate using all the features
in a single model and compare the performance.
VII. CONCLUSION
We have presented a context-generation technique in the
form of a continuously revised trust score that represents the
probability that users are performing activities in the same
way as they normally do. In this model, no personal data
needs to be shared with any remote server or third party
application, and the score is generated entirely on the device
in real-time. Further, we presented performance results of this
system by analyzing data collected from 110 participants.
Results show that this system is feasible and third party
apps can benefit from the trust score without the burden of
modeling the user behavior. This score can enable apps to
block fraudulent financial transactions, monitor progress of
a disease, or detect unauthorized access of a device without
collecting any personally identifiable data from the users.
ACKNOWLEDGMENTS
This research was funded by the Department of Homeland
Security under contract D15PC00154.
REFERENCES
[1] L. Rainie and M. Duggan, "Privacy and information sharing," Pew
Research Center, Jan. 2016.
[2] R. Sidel, "Why Your Bank Wants to Track Your Phone," Wall Street
Journal, Mar. 4, 2016.
[3] D. D. Mehta, M. Zañartu, S. W. Feng, H. A. Cheyne II, and R. E. Hillman, “Mobile voice health monitoring using a wearable accelerometer
sensor and a smartphone platform,” IEEE Transactions on Biomedical
Engineering, vol. 59, no. 11, pp. 3090–3096, 2012.
[4] P. Siirtola and J. Röning, "Recognizing human activities user-independently
on smartphones based on accelerometer data," IJIMAI,
vol. 1, no. 5, pp. 38–45, 2012.
[5] A. Primo, V. V. Phoha, R. Kumar, and A. Serwadda, “Context-Aware Active Authentication Using Smartphone Accelerometer Measurements,” in
Computer Vision and Pattern Recognition Workshops (CVPRW), 2014
IEEE Conference on. IEEE, 2014, pp. 98–105.
[6] R. Murmuria, A. Stavrou, D. Barbará, and D. Fleck, “Continuous
Authentication on Mobile Devices Using Power Consumption, Touch
Gestures and Physical Movement of Users,” in Research in Attacks,
Intrusions, and Defenses. Springer, 2015, pp. 405–424.
[7] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, “LOF: Identifying
density-based local outliers,” in ACM Sigmod Record, vol. 29. ACM,
2000, pp. 93–104.
[8] J. J. Kavanagh and H. B. Menz, “Accelerometry: A technique for
quantifying movement patterns during walking,” Gait & posture, vol. 28,
no. 1, pp. 1–15, 2008.
[9] R. Murmuria and A. Stavrou, “Authentication Feature and Model
Selection using Penalty Algorithms,” in Symposium on Usable Privacy
and Security (SOUPS), 2016.
[10] E. Shi, Y. Niu, M. Jakobsson, and R. Chow, “Implicit authentication
through learning user behavior,” in Information Security, ser. Lecture
Notes in Computer Science. Springer, 2011, no. 6531, pp. 99–113.
[11] O. Riva, C. Qin, K. Strauss, and D. Lymberopoulos, “Progressive
authentication: Deciding when to authenticate on mobile phones,” in
Proceedings of the 21st USENIX Security Symposium, 2012.
[12] J. R. Kwapisz, G. M. Weiss, and S. A. Moore, “Cell phone-based biometric identification,” in Biometrics: Theory Applications and Systems
(BTAS), 2010 Fourth IEEE International Conference on. IEEE, 2010,
pp. 1–7.
[13] K. S. Killourhy and R. A. Maxion, “Comparing anomaly-detection
algorithms for keystroke dynamics,” in Dependable Systems & Networks,
2009. DSN’09. IEEE/IFIP International Conference on. IEEE, 2009,
pp. 125–134.
[14] C. Shen, Z. Cai, R. Maxion, G. Xiang, and X. Guan, "Comparing
classification algorithms for mouse dynamics based user identification,"
in 2012 IEEE Fifth International Conference on Biometrics: Theory,
Applications and Systems (BTAS), Sep. 2012, pp. 61–66.