A PERFORMANCE COMPARISON OF
MODELING PHYSICAL HUMAN ACTIVITIES
THIEN KAE JACK
FACULTY OF COMPUTING AND
INFORMATICS
UNIVERSITY MALAYSIA SABAH
2015
ABSTRACT
This paper presents the background of human activity recognition (HAR) using
wireless sensor network (WSN) data. Performing HAR on WSN data is an important
and challenging task that may contribute to applications in many domains. A time
series classification (TSC) based approach is proposed in this paper to achieve
this goal. The dataset used in this research is publicly available on the
internet and was collected for a past study. The volunteers performed six
activities: walking, walking upstairs, walking downstairs, sitting, standing,
and laying. The TSC approach employs instance-based k-NN with different
similarity measures, including Dynamic Time Warping, to classify human
activities. Furthermore, other classification approaches, namely the J48
decision tree and the Support Vector Machine, were applied to compare
performance. Besides classifying the originally acquired dataset, discretization
and feature selection were applied to the dataset before classification.
Overall, k-NN with Dynamic Time Warping produced performance comparable to that
of the other classifiers.
ABSTRAK
This paper presents the background of human activity recognition (HAR) using
wireless sensor network (WSN) data. Performing HAR using WSN data is an
important and challenging task because the effort can contribute to applications
in many fields. A time series classification (TSC) based approach is proposed in
this paper to achieve this goal. The dataset used in this study can be obtained
from the internet and was collected for another study. The dataset covers six
activities performed by volunteers: walking, walking upstairs, walking
downstairs, sitting, standing, and laying. TSC uses k-NN with Dynamic Time
Warping as the similarity measure to carry out the HAR classification process.
In addition, other classification methods are also tried in order to compare
their results. These classifiers are the J48 decision tree and the Support
Vector Machine. Besides using the acquired dataset, discretization and feature
selection are also carried out before classification. Overall, k-NN with Dynamic
Time Warping produced results comparable to those of the other classification
methods.
CHAPTER 1
INTRODUCTION
1.1 Introduction
This chapter presents the most important elements that initiated this research
project. Section 1.2 presents the background of the problem. Sections 1.3 and 1.4
describe the research question and objectives. Section 1.5 presents the scope of
the project, and Section 1.6 explains the organisation of the report.
1.2 Problem Background
Human activity recognition (HAR) from wireless sensor network (WSN) data is
an important research area for society. Real-time data on human movement is
collected from wireless sensors attached to the human body or installed in
the environment. These sensors can be accelerometers, thermometers, or
gyroscopes, all of which are widely available. Using a suitable HAR method,
the collected data, which corresponds to the movements performed, can be
analysed to recognise the activities. The method includes data
pre-processing, feature extraction, classification, and validation. Such a
method can help a computer identify human movements or activities, which may
be applied in many areas, such as personal safety, healthcare, sports, and
personal fitness. However, many past studies used multiple wireless sensor
nodes [1, 2, 3], which may inconvenience the subject; it is not feasible in
reality for a subject to wear many devices in daily life. The smartphone may
overcome this problem thanks to its powerful embedded sensors, such as the
accelerometer, gyroscope, and thermometer. Some HAR studies [4, 5, 6, 7, 8]
have used a smartphone as the only node and sensor to collect data.
The potential of HAR relies on how intelligently computers or gadgets can
perform while interacting with humans. Human activities are continuous and
vary over time. For instance, within a period of time a person may walk,
stand, run, or jump, and may switch between these activities without pause.
Hence, choosing a suitable approach to analyse the data is an important step
in human activity recognition. The collected data can, theoretically, be
represented as a time series or point series, where the y-axis represents
the magnitude of the signal received from the sensors and the x-axis
represents the timestamp. This in turn allows Time Series Classification
(TSC) based techniques to be applied to HAR. TSC has typically been treated
as a classic discrimination problem [9], and it has been widely used in many
domains such as climate, business, and of course HAR [1, 4, 10]. However,
the term "time series" can be misleading, because some studies have used TSC
to investigate non-temporal data, for example in shape recognition [11, 12,
13]. Given this versatility, the motivation of the research presented in
this report is to produce an approach to HAR using TSC techniques.
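To make the TSC idea concrete, the following is a minimal sketch, not the implementation used in this report, of a k-NN classifier that compares univariate series with the Dynamic Time Warping distance mentioned in the abstract. The function names (dtw_distance, knn_dtw_predict), the absolute-difference local cost, and the toy signals are all illustrative assumptions; real sensor data would be multi-axis and far longer.

```python
from collections import Counter
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two univariate series.

    Uses the textbook O(len(a) * len(b)) dynamic programme with the
    absolute difference as the local cost (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (stretch b)
                cost[i][j - 1],      # b[j-1] matched again (stretch a)
                cost[i - 1][j - 1],  # advance in both series
            )
    return cost[n][m]


def knn_dtw_predict(train_series, train_labels, query, k=1):
    """Label a query series by majority vote among its k DTW-nearest neighbours."""
    ranked = sorted(
        zip(train_series, train_labels),
        key=lambda pair: dtw_distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up signals: an oscillating one and a flat one.
train_series = [[0, 1, 0, 1, 0, 1], [0, 0, 0, 0, 0, 0]]
train_labels = ["walking", "sitting"]

# The query is the oscillating signal delayed by one sample.
query = [0, 0, 1, 0, 1, 0, 1]
predicted = knn_dtw_predict(train_series, train_labels, query, k=1)
```

Because DTW may match one sample of a series to several samples of the other, the delayed query still aligns with the oscillating training series, which is why a warping-based measure can suit activities performed at slightly different speeds or offsets.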
In addition, the data collected from the sensors for HAR can be very large.
For instance, the accelerometer and gyroscope, which are common sensors
embedded in smartphones nowadays, can produce hundreds of data features per
sampling cycle once they are activated. Hence, feature transformation or
feature selection may play an important role in the performance of the
recognition process.
1.3 Research Question
The research motivation described in the foregoing section gives rise to the
research question of how TSC techniques can best be applied to recognise the
following human activities: walking, walking downstairs, walking upstairs,
sitting, standing, and laying. Two further sub-questions are derived from the
main question:
1) What is the best approach to transform the human activity data into
time series data?
2) What is the best TSC technique that can be applied to the time series
data generated in (1)?
1.4 Objectives
Three research objectives have been identified to answer the research questions
stated in Section 1.3. They are:
1) To formulate a feature transformation technique that can be used to
transform HAR-related features into the form of point series.
2) To investigate and identify feature selection methods that can be
applied to learn from point series data.
3) To investigate the performance of TSC when the selected features are
taken as input.
1.5 Research Scope
1) The dataset was acquired from the internet and was collected by
Jorge L. Reyes-Ortiz, Davide Anguita, Alessandro Ghio, and Luca Oneto
for their research on the Human Activity Recognition Using Smartphones
Dataset, which can be downloaded from www.smartlab.ws. Thirty volunteers
contributed to the dataset by performing walking (WA), walking
downstairs (WD), walking upstairs (WU), sitting (ST), standing (SD),
and laying (LY). The data was collected by an accelerometer and a
gyroscope embedded in a smartphone attached to each volunteer's waist.
The dataset contains 561 features, which include x-, y-, and z-axis
components.
2) To recognise human activities from the data described above, TSC
techniques will be employed, and their performance will be compared with
that of the SVM, the classification method that the researchers who
collected the dataset applied to the same data.
1.6 Organisation of the Report
The remainder of this report is organised as follows.
Chapter two presents the literature review of related work. It discusses HAR,
TSC, and classification techniques from the literature, including their
background, advantages, and the techniques used.
Chapter three presents the methodology of this research, that is, the methods
and approaches used to fulfil the objectives of this project.
Chapter four presents the experimental setup for classification without
feature selection. It discusses the preparation of the data, the feature
transformation, and the classification results.
Chapter five presents the experimental setup for classification with feature
selection applied to the data. It also reports the classification results on
the feature-selected data.
Chapter six presents the classification of HAR with discretization of the
dataset. The experimental setup is described and the results obtained are
discussed.
Chapter seven presents an analysis of the performance obtained with the several
classifiers and distance metrics used in the experiments.
Chapter eight concludes this report by summarising the approaches used for
human activity recognition from wireless sensor network data. Future work is
also addressed in this chapter.