Predicting parking lot occupancy
using Prediction Instrument
Development for Complex Domains

MASTER THESIS

Public version

Author
Joanne Lijbers
Study programme: Business Information Technology
Email: [email protected]

Graduation committee
S.J. van der Spoel, MSc.
Industrial Engineering and Business Information Systems, University of Twente
Dr. C. Amrit
Industrial Engineering and Business Information Systems, University of Twente
Dr. ir. M. van Keulen
EEMCS - Database Group, University of Twente
C. ten Hoope, MSc.
Analytics and Information Management, Deloitte Nederland

Faculty of Electrical Engineering, Mathematics and Computer Science
University of Twente
July, 2016
“Not everything that can be counted counts, and not everything that counts can
be counted”
William Bruce Cameron
Abstract
In predictive analytics, complex domains are domains in which behavioural, cultural, political, and other soft factors affect prediction outcomes. A soft-inclusive domain analysis can be performed to capture the effects of these (domain-specific) soft factors.

This research assesses the use of a soft-inclusive domain analysis to develop a prediction instrument in a complex domain, versus the use of an analysis in which no soft factors are taken into account: a soft-exclusive analysis.

A case study of predicting parking lot occupancy is used to test the methods. A regression approach is taken, trying to predict the exact number of cars in the parking lot, one day ahead.

Results show no significant difference in predictive performance when comparing the developed prediction instruments. Possible explanations for this result are the high predictive performance of the soft-exclusively developed predictive model, and the fact that not all soft factors identified using the soft-inclusive analysis could be used in training the predictive model.
Acknowledgements
In your hands, or on your screen, you find the result of my final project at the University of Twente, to graduate from the study Business Information Technology. As my specialization track was ‘Business Analytics’, I wanted to focus on this topic in my final project as well. With the topic of this thesis being predictive analytics, I have gained sufficient knowledge of the field of my studies, and am eager to continue to work in, and learn from, this field after graduating.

I want to thank my university supervisors, Sjoerd, Chintan and Maurice, for their guidance during the research process, their feedback, and for answering my questions (from in-depth to design questions). Special thanks to Claudia, my supervisor at Deloitte, for the weekly meetings. Thanks for helping me out with the technical aspects of this research and for your guidance during the process. Last, I want to thank friends and family for the support throughout the past months, and the good times during all the years of study.
Joanne Lijbers
Amsterdam, July 2016
Contents

Abstract
Acknowledgements
List of Figures
List of Tables
List of Abbreviations

1 Introduction
  1.1 Introduction
  1.2 Research Background
    1.2.1 Case Description
  1.3 Research Objectives
    1.3.1 Validation Goal
    1.3.2 Prediction Goal

2 Methodology
  2.1 Research Design
    2.1.1 Research Methodology
    2.1.2 Research Questions
  2.2 Thesis structure

3 Theoretical Background
  3.1 Predictive analytics
  3.2 Domain-driven development methods
  3.3 Prediction Instrument Development for Complex Domains
    3.3.1 Preparation Stage
    3.3.2 Stage I: Qualitative assumptions
    3.3.3 Stage II: Predictive modelling
    3.3.4 Stage III: Model convergence
    3.3.5 PID-SEDA

4 Soft-exclusive development
  4.1 Stage I: Assumptions
    4.1.1 Goal definition
    4.1.2 Literature review
    4.1.3 Data constraint
  4.2 Stage II: Predictive modelling
    4.2.1 Data cleaning
    4.2.2 Data selection strategies
    4.2.3 Exploratory data analysis
    4.2.4 Technique selection
    4.2.5 Evaluation, validation & model selection
  4.3 Stage III: Model convergence
  4.4 Conclusion

5 Soft-inclusive development
  5.1 Preparation
    5.1.1 Problem identification
    5.1.2 Expert selection
  5.2 Stage I: Assumptions
    5.2.1 Hypothesis divergence
    5.2.2 Hypothesis convergence
    5.2.3 Constraint definition
  5.3 Stage II: Predictive modelling
    5.3.1 Data selection & cleaning strategies
    5.3.2 Reduction by data & domain constraints
    5.3.3 Exploratory data analysis
    5.3.4 Technique & parameter selection
    5.3.5 Model training
    5.3.6 Reduction by interestingness, deployment & domain constraints
  5.4 Stage III: Model Convergence
  5.5 Conclusion

6 Discussion
  6.1 Comparing models
  6.2 Validity
    6.2.1 Conclusion Validity
    6.2.2 Internal Validity
    6.2.3 External Validity

7 Conclusion
  7.1 Answering Research Questions
  7.2 Recommendations

A Correlation soft-exclusive factors
B Performance measures - Soft-exclusive
C Performance measures - Soft-inclusive
References

List of Figures

1.1 Influence of soft-factors on domain
1.2 PID-CD
2.1 Research Methodology
3.1 Developing & evaluating a prediction model
4.1 Refinement literature search
4.2 Exploratory data analysis
5.1 Hypotheses constructs
5.2 Hypotheses after specialization
5.3 Relation between traffic & occupancy

List of Tables

4.1 Performance Measures
4.2 Search Terms
4.3 Soft-exclusive prediction factors
4.4 Significant correlations
4.5 Soft-exclusive regression results
5.1 Domain experts
5.2 Collected hypotheses
5.3 Variable correlation
5.4 Random Forest - Performance Measures
5.5 Comparison of results
6.1 Performance measures selected strategies
7.1 Correlation occupancy & appointments
A.1 Correlation (a)
A.2 Correlation (b)
B.1 Performance measures - Strategy 1
B.2 Performance measures - Strategy 2
B.3 Performance measures - Strategy 3
B.4 Performance measures - Strategy 4
B.5 Performance measures - Strategy 5
B.6 Performance measures - Strategy 6
B.7 Performance measures - Strategy 7
C.1 Performance measures - Strategy 2
C.2 Performance measures - Strategy 3
C.3 Performance measures - Strategy 4
C.4 Performance measures - Strategy 5
C.5 Performance measures - Strategy 8
C.6 Performance measures - Strategy 9

List of Abbreviations

BI — Business Intelligence
CART — Classification And Regression Trees
DT — Decision Tree
IMS — Intelligence Meta-Synthesis
MAE — Mean Absolute Error
MAPE — Mean Absolute Percentage Error
MLR — Multiple Linear Regression
MSE — Mean Square Error
PID-CD — Prediction Instrument Development for Complex Domains
PID-SEDA — Prediction Instrument Development with Soft-Exclusive Domain Analysis
RF — Random Forest
RMSE — Root Mean Square Error
SVR — Support Vector Regression
Chapter 1
Introduction

1.1 Introduction
Sensors, and the data they collect, are used in a wide variety of domains, like disaster management and intelligence analysis, but also in the ‘manufacturing, energy and resources’ industry and the ecology sector [1]. Sometimes the (sensor) data collected in these domains is straightforward to analyse, for example when a sensor is used inside a machine to monitor tool conditions. The number of factors influencing the tool condition is limited: for example, only the number of usages influences tool quality. Prediction of failure is easy once the process repeats itself. Analysis of such a simple domain can be done without taking soft-factors into account. Soft-factors are factors like behaviour, politics and strategies, which can influence a domain and the data retrieved from it [2]. Since no soft-factors need to be taken into account, such an analysis is referred to as soft-exclusive. Van der Spoel, Amrit and van Hillegersberg [2] describe a soft-exclusive analysis as ‘domain analysis that only takes easily quantifiable factors into account’.

Other domains are more complex: they are only partially observable; probabilistic; evolve over time; and are subject to behavioural influences [3]. Many factors might interact with these complex domains, and not all of them might be known. When human behaviour is involved, domains can almost always be referred to as complex domains [3].

Complex domains need a different approach compared to the simpler ones. When analysing such a domain, the influence of soft-factors, like those mentioned above, does need to be taken into account. This is because factors like behaviour and politics influence how a domain is represented by data (see Figure 1.1). Including soft-factors in an analysis is referred to as a soft-inclusive approach. Van der Spoel et al. [2] developed a method that uses this soft-inclusive approach in developing predictive models. This research will add to this topic, amongst others, by validating the method of Van der Spoel et al. [2]. Previous and related research will now be discussed, as well as current gaps in knowledge, to motivate the choice of research. Thereafter the objectives of the research are given. The objectives are used as a basis in designing the research.

FIGURE 1.1: The influence of soft-factors on a domain, retrieved from [2, p. 11]. A domain gets represented by data going through a filter, which can be affected by soft-factors from the domain.
1.2 Research Background
As mentioned in the introduction, this research focuses on the use of data gathered from complex domains. If valuable insights are to be derived from these data, it is important to understand the domain: to gather intelligence about it. Domain analysis is a way to do so: to learn to understand the context in which insights are created [2], [4].

Van der Spoel et al. [2] developed a method for developing prediction instruments that uses domain analysis, see Figure 1.2. The method provides steps in which prediction models are created, using hypotheses obtained from analysing the domain to which the prediction models apply. Using ‘intelligence’ from people involved in this domain (the experts), the domain can be analysed more thoroughly than by using the knowledge of the researcher(s) alone. By performing field studies, or brainstorming with these experts, hypotheses on what influences the to-be-predicted system are gathered and, together with constraints, are used to create prediction models. The steps, as displayed in Figure 1.2, are explained in more detail in Section 3.3.

Prediction Instrument Development for Complex Domains (PID-CD) has recently been developed (see [2]). The method needs to be tested, to see how it performs in a new environment, in order to increase its validity.
1.2.1 Case Description
To validate PID-CD, a case study of predicting parking lot occupancy is used. The research uses sensor data from the parking lot of ‘The Edge’, one of the offices of Deloitte Nederland B.V. The building is said to be the most sustainable office in the world [5]. Rainwater gets re-used to flush toilets, and solar panels collect all the power the building uses. Besides this, one of the special features of the building is the many sensors it contains. Because of these special features, ‘The Edge’ is referred to as a smart building. Smart buildings pursue goals that relate to energy consumption, security and the needs of users [6]. At The Edge this is, among other things, realized by movement sensors, which control the lights based on occupancy, and by temperature sensors, which control the climate at the different departments. The data collected for these uses also gets saved for the purpose of analysis and optimization. Data analysis could reveal patterns that enable more efficient use.

FIGURE 1.2: Prediction Instrument Development for Complex Domains as designed by Van der Spoel et al. [2]. The steps are explained in more detail in Section 3.3.
One of the things which can be used more efficiently is the parking lot of the building. Approximately 2500 employees are based at The Edge, but only 240 permanent parking spots are reserved for Deloitte employees. To solve the issue of the limited number of parking spots, employees are able to park at a garage near the office. Only employees to whom the following rules apply get rights to park in the parking lot of The Edge:

1. The Edge as main office;
2. Joining the lease-program;
3. Function-level senior manager or higher;
4. Ambulant function in Audit or Consulting.

Unfortunately this still leaves more people with parking rights than there are spots available. This results in inefficient use of time, when employees have to search for a parking spot elsewhere, as well as dissatisfied employees because of that.

Since it affects the efficiency and satisfaction of employees, it is important to maximize the use of available parking space, and so to increase its efficiency. Some employees get dissatisfied because they arrive at a full parking lot, but other employees get dissatisfied because on quiet days, when parking spots are available, they are still not allowed to park. Predicting the occupancy of the parking lot might help in resolving these problems.
1.3 Research Objectives
The goal of the research is twofold. Firstly, the research is aimed at validating PID-CD, developed by Van der Spoel et al. [2]. Secondly, the goal
of the research is to predict the parking lot occupancy of the office ‘The
Edge’ in Amsterdam, used by Deloitte Netherlands. These goals will now
be discussed separately.
1.3.1 Validation Goal
According to Wieringa and Moralı [7], validation research needs to be done
to answer questions on the effectiveness and utility of a designed artefact.
Wieringa and Moralı define an artefact in Information System Research to
be anything from software to hardware, or more conceptual entities like
methods, techniques or business processes [7]. Validation can be used for
trade-off analysis: ‘Do answers change when the artefact changes?’, as well
as for sensitivity analysis: ‘Do answers change when the context (in which
the artefact is implemented) changes?’ [7].
In this research, the artefact is the method developed by Van der Spoel et al.
[2]. The research is aimed at answering both validation questions. The goal
is to see how this method performs in a different context, compared to the
context Van der Spoel et al. describe in their research, which is to predict
turnaround time for trucks at a container terminal [2] (sensitivity analysis).
The second validation goal is to see the change in answers when the artefact changes (trade-off analysis). This trade-off question will be answered by developing two prediction instruments. Besides developing a prediction instrument using PID-CD, a prediction instrument will be developed using a soft-exclusive approach: prediction instrument development with soft-exclusive domain analysis (PID-SEDA) [2]. This method differs from PID-CD in the phase of collecting hypotheses and constraints, as will be explained in Section 3.3. By comparing the results of this change, a trade-off can be made between, for example, quality of results on one hand and effort to develop the artefact on the other.
1.3.2 Prediction Goal
The second goal of this research is to accurately predict the occupancy of the parking lot of ‘The Edge’ on a given day. If the occupancy can be predicted, arrangements can be made in advance: if it is predicted to be very busy, employees can be warned, or if it is predicted to be quiet, other employees might get access for that day.

The development of applications that enable these uses is beyond the scope of this research. This research is about developing an actionable prediction model for the occupancy of the parking lot, being a model that ‘satisfies both technical concerns and business expectations’ [8].
Chapter 2
Methodology

2.1 Research Design
This chapter explains how the research is conducted. The methodology, as well as the research questions, will be explained. At the end of this chapter, the structure of the remainder of this thesis is presented.
2.1.1 Research Methodology
This research is classified as Technical Action Research (TAR). TAR is ‘the attempt to scale up a treatment to conditions of practice by actually using it in a particular problem’, as defined by Wieringa and Moralı [7]. With Technical Action Research, a developed artefact is implemented in practice to validate its design and, by doing so, increase its relevance. By implementing in practice, an artefact moves from being in idealized conditions to being an actionable artefact in the real world [9]. TAR is intended to solve improvement problems (designing and evaluating artefacts) as well as to answer knowledge questions (resolving a lack of knowledge about some aspect of the real world). This research is classified as TAR because it answers knowledge questions like ‘What would be the effect of applying Intelligence Meta-Synthesis (IMS) in developing a predictive instrument in a complex system?’ and addresses the improvement problem of predicting the occupancy of the parking lot of The Edge.

The structure of TAR is shown in the top half of Figure 2.1 [7, p. 231]. The improvement problem on the left, which shows the steps in developing an artefact, has already been conducted by Van der Spoel et al. [2]. This research will contain the steps in the dotted frame. In the bottom half of Figure 2.1 these steps are applied to this research, showing the chapters in which the different steps will be discussed. The different stages of developing a domain-driven prediction instrument, as defined in [2], are also mapped onto the structure of TAR.
FIGURE 2.1: The structure of Technical Action Research, taken from [7]. The dotted frame in the bottom shows the steps of TAR applied to this research and the steps of PID-CD.
The Research Execution phase is performed twice: first, a prediction instrument is developed following a soft-exclusive development approach, using only literature. Second, a prediction instrument is developed following the PID-CD method of Van der Spoel et al. [2].
2.1.2 Research Questions
The goals of validating the domain-driven prediction instrument development method and predicting parking lot occupancy translate into the following research question and subquestions:

‘How does a prediction instrument developed using a soft-inclusive method compare to a prediction instrument developed using a soft-exclusive method?’
1. What instrument for predicting parking lot occupancy results from using
’prediction instrument development with soft-exclusive domain analysis’?
2. What instrument for predicting parking lot occupancy results from using
’prediction instrument development for complex domains’?
2.2 Thesis structure
The remainder of this thesis is structured as follows (as can be seen in Figure 2.1):

Chapter 3 provides a theoretical background on the topics of ‘predictive analytics’ and ‘intelligence meta-synthesis’. Common terms and practices will be introduced, to ease the understanding of the other chapters. The stages and steps of PID-SEDA and PID-CD are explained as well.

Chapter 4 addresses the improvement problem, using a soft-exclusive development method, answering the first subquestion. As can be seen in Figure 2.1, this includes problem investigation, treatment design, design validation, implementation, and evaluation of the design.

Chapter 5 shows the results of performing the same steps, but using the PID-CD method, answering subquestion two.

In Chapter 6 the results of the two development methods are compared. Internal and external validity will be checked, and contributions and limitations are described.

Concluding this thesis, the research questions are answered, and recommendations for future work are given in Chapter 7.
Chapter 3
Theoretical Background
This chapter provides a theoretical background into the topics of ’predictive analytics’ and ’domain-driven development methods’. Next, the stages
of Prediction Instrument Development for Complex Domains (PID-CD) [2]
are explained.
3.1 Predictive analytics
According to Waller and Fawcett [10], data science is ‘the application of quantitative and qualitative methods to solve relevant problems and predict outcomes’. Besides, for example, database management and visualization, predictive analytics forms a subset of data science.

Predictive analytics is the process of developing prediction models, as well as evaluating the predictive power of such models [11]. A prediction model can be viewed as a function:

y = f(X)

The output of the model is represented by y, the variable to be predicted. X represents the (set of) input variable(s) [12]. By calculating this function, the relationship between X and y can be modelled and used for predicting new values of y.

Training this function can be done using a training set of data (e.g. 70 percent of a dataset), for which all values of X and y are known. Because the values are known, the relationship between the input and output variable(s) can be determined. After this, a test set (the remaining 30 percent of the data), using only values of X, is used to test whether the trained function accurately predicts y.
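To make the train/test procedure concrete, the sketch below fits a one-variable linear function on a 70 percent training split and checks its predictions on the held-out 30 percent. The data and the linear form of f are illustrative assumptions, not the models or data developed in this thesis.

```python
# Fit y = f(X) on a 70% training split and evaluate on the remaining 30%.
# The data follows a known relationship (y = 3x + 2) purely for illustration.

def fit_linear(xs, ys):
    """Least-squares fit of y = a*x + b for a single input variable."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

data = [(x, 3 * x + 2) for x in range(10)]   # (X, y) pairs, y fully known
split = int(0.7 * len(data))                 # 70/30 train/test split
train, test = data[:split], data[split:]

# Train: determine the relationship between X and y on the training set.
a, b = fit_linear([x for x, _ in train], [y for _, y in train])

# Test: predict y from X only, then compare against the known y values.
predictions = [a * x + b for x, _ in test]
errors = [abs(p - y) for p, (_, y) in zip(predictions, test)]
```

Because the toy data is exactly linear, the fitted coefficients recover a = 3 and b = 2, and the test errors are (numerically) zero; with real occupancy data the errors would be the quantity to evaluate.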
The process of predictive analytics is displayed in Figure 3.1. When a function is used to predict a numerical outcome, the predictive model is referred to as a regression model. When the outcome is categorical, this is referred to as classification. Another prediction goal is ranking, which is used to "rank observations to their probability of belonging to a certain class" [11, p. 23].
Linear and Multiple Regression are the most important and most widely used prediction techniques [13]. Besides these, other techniques can be used, like Support Vector Regression, which can recognize subtle patterns in complex data sets [14], but also techniques like Decision Tree or Random Forest, which can be used for both classification and regression. Decision Trees (DT) consist of multiple nodes at which an attribute gets compared to a certain constant (often a greater-than or smaller-than comparison) [15]. Each branch represents an outcome of the comparison, and tree leaves represent prediction values (or classes in the case of classification) [12]. The learning of a DT is simple and fast, and the representation of results is intuitive and easy to understand [12]. Random Forest (RF) is a technique that uses multiple Decision Trees to create a prediction model. According to Breiman [16], using RF results in high prediction accuracy. The technique often achieves the same or better prediction performance compared to a single DT [17]. The process described before, and displayed in Figure 3.1, remains the same for these techniques, with the function being a decision tree or a forest.
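To illustrate the structure (not the training) of these techniques: a regression tree can be written as nested threshold comparisons ending in leaf values, and a forest averages several such trees. The attributes, thresholds and leaf values below are hypothetical; a real Random Forest additionally trains each tree on random samples of the data and features.

```python
# Toy regression tree for parking occupancy: each internal node compares
# an attribute to a constant, each leaf holds a predicted value.
def tree_a(day_of_week, temperature):
    if day_of_week >= 6:       # weekend: lot nearly empty
        return 20
    if temperature < 5.0:      # cold weekday: more people drive
        return 230
    return 190                 # mild weekday

# A second, shallower tree with different thresholds and leaves.
def tree_b(day_of_week, temperature):
    if day_of_week >= 6:
        return 30
    return 210

# A (toy) forest: the regression prediction is the average over its trees.
def forest(day_of_week, temperature):
    trees = [tree_a, tree_b]
    return sum(t(day_of_week, temperature) for t in trees) / len(trees)
```

For a cold Tuesday (day 2, 0 °C) the forest averages 230 and 210 into 220; averaging many diverse trees is what typically makes RF match or beat a single DT.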
As mentioned, evaluating the predictive power of a model is the second part of predictive analytics. The accuracy of a (numerical) prediction model is evaluated by calculating the difference (the error) between the known values of y and the predicted values y' [12].
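The error measures listed under the abbreviations (MAE, MSE, RMSE, MAPE) can be computed directly from the known and predicted values. A minimal sketch, with made-up occupancy numbers:

```python
import math

# Standard error measures for numerical predictions.

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute difference."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Square Error: average squared difference."""
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    return 100 * sum(abs((a - p) / a)
                     for a, p in zip(y_true, y_pred)) / len(y_true)

y_true = [200, 180, 240]   # observed occupancy (invented numbers)
y_pred = [190, 185, 230]   # predicted occupancy
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the observed value, which eases comparison across parking lots of different sizes.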
A prediction instrument, as developed in this research, is a combination of
a predictive model (the trained function), the technique used to create it,
its parameters, a data selection & refinement strategy and (business) constraints (to determine whether or not the model is useful in practice) [2].
3.2 Domain-driven development methods
As mentioned in Section 1.1, developing prediction instruments gets more
difficult when dealing with complex domains. Predictive models often cannot be copied from existing ones, since every complex domain has its own
unique characteristics. One way to develop actionable prediction models
in such domains is to use domain-driven development methods.
According to Cao and Zhang [18], domain-driven data mining aims to develop specific methodologies and techniques to deal with complex (business) domains. When using a domain-driven approach, both objective and
subjective factors can be included in a (predictive) model. Waller and Fawcett state that analysis and domain knowledge cannot be separated [10]. According to the authors, data scientists need both a broad set of analytical skills and deep domain knowledge.

FIGURE 3.1: The process of developing and evaluating a prediction model. A function of y gets trained, using a subset of data. This function is used to predict the goal variable of the test set. The last step is evaluating the results by calculating the prediction error.
This domain knowledge, however, does not necessarily have to come from data scientists themselves. One way of developing instruments with a domain-driven view is to use an Intelligence Meta-Synthesis (IMS) approach. IMS is a method for capturing soft factors, in the form of different kinds of intelligence, like human intelligence, data intelligence and domain intelligence [19]. According to Gu and Tang [20], ‘meta-synthesis emphasizes the synthesis of collected information and knowledge of various kinds of experts’. It is a methodology in which quantitative methods are combined with qualitative (domain) knowledge, obtained by consulting domain experts.

Van der Spoel et al. use IMS as a basis for their soft-inclusive domain analysis [2]. The domain analysis is soft-inclusive because, besides including hard factors, it also takes soft, domain-specific factors like behaviour and culture into account. Soft-exclusive domain analysis, on the other hand, only takes factors into account that are directly quantifiable (hard factors).
3.3 Prediction Instrument Development for Complex Domains
The development method designed by Van der Spoel et al. [2] is described below, as it is used in Chapter 5 of this thesis. The steps of the method are displayed in Figure 1.2. In the preparation stage the prediction goal is defined and experts are selected. In Stage I, hypotheses and constraints regarding the domain are collected. In Stage II, the collected hypotheses are translated into datasets and the constraints are used to select the final datasets. These datasets are used to train predictive models, which need to comply with the given constraints. In the third stage, the final predictive models are chosen. These steps are discussed more elaborately below.
3.3.1 Preparation Stage
As displayed in Figure 1.2, before hypotheses are collected, preparations need to be made. ‘What needs to be predicted’ (the prediction goal) and ‘what are the characteristics of the problem domain’ are questions that are answered during this preparation stage [2]. Another part of the preparation stage is to determine the experts who will be consulted later in the process.
3.3.2 Stage I: Qualitative assumptions
In the first core stage of the development method, hypotheses are collected and constraints are defined. Hypotheses are collected through brainstorming, individual interviews, field studies and/or literature review [2]. Brainstorming might have to be done anonymously, to ensure conflicting interests do not affect its results.

After the hypotheses are collected, their number is reduced, to avoid having to test similar hypotheses. Selections are made by looking at the level of agreement. Only those hypotheses that are sufficiently different and interesting are taken into account in the development stage. Merging the hypotheses into one set T for testing is done in the following steps:

1. Translate the collected hypotheses into diagrams, showing the factors (constructs) and their relations.
2. Standardize and specialize the constructs (synonymous factors are replaced by one synonym; constructs are possibly replaced by their sub-constructs).
3. Determine the causal influence of the constructs and group hypotheses with the same causal influence. One hypothesis per group gets added to the set of hypotheses to be tested, T.
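The merging steps can be read as a grouping operation: after synonymous constructs are standardized, hypotheses with the same construct and causal direction collapse into one representative. A sketch with invented example hypotheses (the real hypotheses of this case study appear in Chapter 5):

```python
# Each hypothesis: (construct, causal direction, statement). Examples invented.
hypotheses = [
    ("weather", "+", "bad weather increases occupancy"),
    ("rain",    "+", "rain makes more people take the car"),   # synonym of weather
    ("holiday", "-", "school holidays lower occupancy"),
]

# Step 2: standardize synonymous constructs to one synonym.
synonyms = {"rain": "weather"}
standardized = [(synonyms.get(c, c), d, s) for c, d, s in hypotheses]

# Step 3: group by (construct, causal direction); keep one hypothesis
# per group as the representative added to the test set T.
T = {}
for construct, direction, statement in standardized:
    T.setdefault((construct, direction), statement)
```

After grouping, the two weather-related hypotheses are represented once, so T contains two hypotheses to test rather than three.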
The last part of the first stage is to define constraints. By consulting experts, constraints are collected regarding the domain, data, deployment and interestingness. Domain constraints originate from the domain the prediction instrument is being developed for, for example having to comply with privacy standards. Data constraints are constraints on the structure, quantity and quality of data. Whether or not a prediction instrument can actually be used in existing technological infrastructures relates to deployment constraints. Finally, the interestingness constraint relates to the performance of the instrument.
3.3.3 Stage II: Predictive modelling
Once the hypotheses set is completed, prediction models are created. The hypotheses are translated into available variables for learning the models. After this selection of data, the data might need to be cleaned before usage (for example, deleting outlier rows). Through exploratory data analysis and consulting the experts, it is checked which selection and cleaning strategies need to be applied. Next, the different selection and cleaning strategies are reduced by checking compliance with the data and domain constraints collected in stage I. After that, prediction methods and parameters (like the size of the training/test set) are selected. For every strategy and every prediction method selected, predictive models are trained and evaluated using calculated performance measures. Based on the interestingness, deployment and domain constraints, models that do not meet the constraints are discarded.
3.3.4 Stage III: Model convergence
The final stage of the method is to select one or more predictive models. Domain experts are consulted to make this selection. Selection is based primarily on predictive performance, but factors like training time or personal preferences can also be taken into consideration. The selected model, together with the data selection & cleaning strategy, prediction method, parameters and constraints, forms the developed prediction instrument.
3.3.5 PID-SEDA
As explained, besides using PID-CD, Prediction Instrument Development
with Soft-Exclusive Domain Analysis (PID-SEDA) will be used to serve as
a benchmark method. Using this method represents using a soft-exclusive
approach in analysing a complex domain. Almost no knowledge of the domain is used for selecting factors or algorithms. By comparing its results to
the results of using PID-CD, the effect of including soft factors in analysing a complex domain is researched.
PID-SEDA is the soft-exclusive development method used in Chapter 4. The method differs from PID-CD by collecting hypotheses only through a literature review. The predictive modelling stage is similar to the one in PID-CD, except that no domain, deployment and interestingness constraints need to be met. At the end of the (iterative) process, the best predictive model is chosen based on predictive power [2].
Chapter 4
Soft-exclusive development
This chapter presents the results of using the soft-exclusive development
method (PID-SEDA), to develop a prediction instrument for predicting parking lot occupancy in The Edge. The different stages of the method, as well
as the final prediction instrument developed, are presented below.
4.1 Stage I: Assumptions
The first stage of the soft-exclusive development method focuses on gathering assumptions on how to predict parking lot occupancy. The prediction goal is determined and a structured literature review is performed to see which factors are mentioned in existing literature. The stage concludes with a description of the available data.
4.1.1 Goal definition
The prediction goal is to predict the occupancy of the parking lot of The Edge. Occupancy is the number of (Deloitte) cars that are currently in the parking lot. Different time windows are tested: predicting occupancy half an hour in advance, two hours in advance, and the evening before the predicted moments. The output (occupancy) falls within a wide range of possible, continuous values (approximately 0-250 cars). Therefore, a regression approach is taken, trying to predict the exact number of cars in the parking lot (referred to as a prediction goal [11, p. 23]).
The performance of the model(s) is determined by calculating the mean squared error (MSE), the root mean squared error (RMSE) and the mean absolute error (MAE); see Table 4.1 for a description of these measures. The MSE, RMSE and MAE are scale-dependent measures, useful when comparing different methods applied to the same dataset [21]. These measures will be used to select the best soft-exclusive developed prediction model: the lower the error, the better the model. MAE will be treated as the most relevant measure, since it is the most natural and unambiguous measure of average
error [22].

TABLE 4.1: Performance Measures

    Mean squared error:              MSE  = (1/n) Σ_{t=1}^{n} e_t²
    Root mean squared error:         RMSE = sqrt( (1/n) Σ_{t=1}^{n} e_t² )
    Mean absolute error:             MAE  = (1/n) Σ_{t=1}^{n} |e_t|
    Mean absolute percentage error:  MAPE = (100%/n) Σ_{t=1}^{n} |e_t / y_t|

    n = the number of prediction values. e_t = prediction error: the difference between the t-th prediction and the t-th actual value. y_t = the t-th actual value.

Another often-used performance measure is the mean absolute percentage error (MAPE) (see Table 4.1), which can be used to compare prediction performance across different data sets [21]. This measure cannot be used in this case, since the actual data frequently contains zeros (for example, occupancy at night), resulting in an infinite MAPE.
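For concreteness, the four measures in Table 4.1 can be computed as follows; this is a small illustrative sketch (not code from the thesis), and the guard in the MAPE function mirrors the zero-occupancy problem described above:

```python
import math

def mse(actual, predicted):
    """Mean squared error over paired actual/predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return sum(e * e for e in errors) / len(errors)

def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error. Undefined when any actual value is
    zero (e.g. occupancy at night), which is why it is not used here."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE undefined: actual values contain zeros")
    return 100.0 / len(actual) * sum(abs((p - a) / a)
                                     for a, p in zip(actual, predicted))
```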
4.1.2 Literature review
To select factors from existing literature, a structured review as described by Webster and Watson [23] is conducted. Criteria for inclusion and/or exclusion are defined; fields of research are determined; appropriate sources are selected; and specific search terms are defined. After the search, the results are refined by language, title, abstract and full text. Forward and backward citations are used to search for new relevant articles, until no new articles come up [23]. These steps are described in detail below.
Inclusion/exclusion criteria
To ensure the relevance of the selected articles, inclusion and exclusion criteria are determined. Articles should mention the topic of parking or a synonymous term. Real-time, near-real-time and non-real-time predictions are all included, to collect as many hypotheses as possible. Articles that do not use empirical data are excluded, since we try to find articles that test factors influencing parking lot occupancy.
Fields of research
No limits on fields of research are set, since non-related articles will be filtered out in the refinement steps.
TABLE 4.2: Search Terms

    Prediction terms    Synonyms         Prediction goals
    Predicting          Parking lot      Occupancy
    Prediction          Parking space    Availability
                        Parking area
                        Parking spot
                        Lay-by
                        Garage
                        Parking
Sources
The sources used for the search are Google Scholar and Scopus. According to Moed, Bar-Ilan and Halevi [24], both of these databases cover a set of core sources in the field of study. Although Scopus is a good source for finding published articles, Google Scholar can add to a search by also showing ‘intermediary stages of the publication process’ [24]. Using both databases can therefore provide a well-rounded search.
Search
Table 4.2 displays the specific search terms that are used for the literature search. Besides ‘occupancy’, ‘availability’ is also used as a prediction goal, since predicting occupancy can also be done by predicting the available spots left. In the middle column, synonyms for ‘parking lot’ are given. Using different synonyms in the literature search limits the impact of language use.
All possible combinations of these terms, synonyms and goals are used
in the search, resulting in a total of 461 articles (354 from Google Scholar,
107 from Scopus).
Refine sample
The results are refined using the following steps:
1. Filter out duplicates
2. Filter by language
3. Refine by title
4. Refine by abstract
5. Refine by full text
FIGURE 4.1: Filter & refinement steps of the structured literature review. n is the number of articles remaining after the preceding refinement step.
Figure 4.1 displays the refinement of the search results. The search results contained 79 duplicates, because two different sources were used. Seven articles were removed because they were not written in English, as were articles with a title that did not mention ‘parking’ (or related terms). The abstracts of the remaining 85 articles were read, and articles that did not seem to contribute to the purpose of predicting parking lot occupancy were removed. The remaining 35 articles were read in full, leaving nine relevant articles.
Forward & backward citations
The nine selected articles were cited by, and referred to, 147 articles in total. The same refinement steps as above were applied, resulting in two new articles. These two new articles contained two new citations, which were removed from the list after reading the abstracts.

This structured review resulted in eleven articles useful for selecting variables in the soft-exclusive development method.
Analysis
The final factors which will be used in the prediction model, derived from
the eleven articles, can be found in a concept matrix, as recommended by
Webster and Watson [23], see Table 4.3.
The factor time of day is mentioned in most articles. The occupancy of a parking garage might, for example, be high during business hours and low during the night, or vice versa for a residential garage.
The second factor derived from literature is day of week. Whether it is a
working day or a non-working day (like in the weekend), might influence
occupancy.
Weather is the third factor, displayed in Table 4.4, mentioned in three
different articles. Weather conditions might influence people’s choice to go
by car or not.
Holidays is also a straightforward factor derived from literature. Whether or not it is a holiday, such as Christmas, likely influences the occupancy of a parking garage.
A factor mentioned only by David, Overkamp and Scheuerer [28] is the
effect of a day being a day between holiday and weekend. A lot of people take,
or have to take, a day off on such days, possibly resulting in an effect on the
parking lot occupancy.
Where Chen et al. [29] and Reinstadler et al. [30] only mention normal holidays as an influential factor, David et al. also mention the influence of school holidays [28].
TABLE 4.3: Soft-exclusive prediction factors. A concept matrix marking, per article, which concepts are mentioned.

    Articles: Chen (2014); Chen et al. (2013); David et al. (2000); Fabusuyi et al. (2014); Kunjithapatham et al. (n.d.); McGuiness and McNeil (1991); Richter et al. (2014); Reinstadler et al. (2013); Soler (2015); Vlahogianni et al. (2014); Zheng et al. (2015).

    Concepts: Time of day; Day of week; Events; Holidays; Historic occupancy; Weather; Day between holiday and weekend; School holidays; Parking lot accessibility; Parking price.
The seventh factor derived from literature is historic occupancy. This factor, mentioned by Zheng, Rajasegarar and Leckie [31] and Vlahogianni et
al. [32], means using the occupancy of some points in time before the time
of prediction as input to the model. The researchers use the occupancy of n
steps before time of prediction, to predict occupancy k steps ahead. Zheng
et al. for example, use the previous observation when working with a regression tree, and the previous 5 observations when working with neural
networks, to predict 1 to 20 steps ahead [31]. Vlahogianni et al. use a look-back window of 5 steps to predict 1 to 60 steps ahead [32]. The goal of
including this factor is "to represent the temporal characteristics of transportation time series (e.g. parking occupancy) in a manner resembling the
common statistical prediction techniques" [32].
The influence of events is described in four different articles. For example, when a parking lot is located near a theatre, as described by Fabusuyi et al. [33], data on theatre performances can be a great predictor of parking lot occupancy.

Parking lot accessibility indicates whether or not a parking lot is easily accessible. Road constructions or detours nearby might influence the occupancy of the garage.

The final factor, parking price, is mentioned only by Chen [35]: increasing or decreasing the parking price might influence parking occupancy.
4.1.3 Data constraint
After collecting these hypotheses, the data is checked for availability, cleaned, and transformed into a dataset ready for training a prediction model (stage II). A data constraint, as described below, states the quality and quantity of the data [2].

The original dataset used for this research was collected from the parking lot of The Edge between November 17, 2014 and April 15, 2016. One row of data is given per car entering the parking lot, showing row ID, entry time, exit time, subscription type, license plate and country code.

Data on occupancy is not available and has to be calculated from the number of cars going in and out. Only 49,897 of the 176,997 rows of the original dataset contain the date and time a car left the parking lot. One reason for these empty fields is the gate of the parking lot not closing after each car, so that not all cars leaving the parking lot are scanned.
4.2 Stage II: Predictive modelling
As mentioned above, stage II of the development method takes the factors derived from literature (see Table 4.3) to create predictive models. In this stage the data cleaning step is described, data selection strategies are defined, an exploratory analysis is performed and the predictive models are evaluated.
4.2.1 Data cleaning
It is chosen to impute the 127,100 missing values mentioned in Section 4.1.3, instead of using only the ± 50,000 complete rows, because these complete rows were obtained only in the first half year after the opening of the parking lot. During the first few weeks after opening, employees were still moving to the building and getting used to parking at The Edge. Including these weeks in only half a year of data might negatively influence the prediction model. Besides this, using only half a year of data excludes the possible influence of holidays, school holidays, et cetera. Using the whole dataset, with imputed missing values, should result in a more robust prediction model.

Imputation is done using one of the most frequently used imputation methods: mean imputation [36]. Using the complete rows, an average parking stay is calculated for every hour of every day of the week. A missing value is then replaced based on the entry time of the car, adding the average parking-stay time for that time of arrival, yielding a new calculated exit time. With these entry and exit times completed, the occupancy of the parking lot is calculated for every half hour of the day (from November 17, 2014 to April 15, 2016 this makes 24,765 rows of data).
Because the average stay is used in calculating the occupancy, the calculated number of cars inside the parking lot sometimes (for 2.4% of values) exceeds the maximum number of parking spots. In the parking lot, 240 spots are reserved for Deloitte; the created dataset contains 594 rows with an occupancy exceeding this number. To reduce noise, these rows are removed from the dataset.
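The imputation and occupancy calculation described above could be sketched as follows; the column names and the pandas-based approach are assumptions, since the thesis performed these steps in QlikView and R:

```python
import pandas as pd

def impute_exit_times(df):
    """Fill missing exit_time with entry_time + average stay for that
    weekday/hour of arrival, learned from the complete rows."""
    df = df.copy()
    df["stay"] = df["exit_time"] - df["entry_time"]
    by = [df["entry_time"].dt.dayofweek.rename("dow"),
          df["entry_time"].dt.hour.rename("hour")]
    avg_stay = df.groupby(by)["stay"].transform("mean")  # NaT rows are skipped
    df["exit_time"] = df["exit_time"].fillna(df["entry_time"] + avg_stay)
    return df.drop(columns="stay")

def occupancy_per_half_hour(df):
    """Count the cars present at every half-hour timestamp."""
    grid = pd.date_range(df["entry_time"].min().floor("30min"),
                         df["exit_time"].max().ceil("30min"), freq="30min")
    counts = [(t, int(((df["entry_time"] <= t) & (df["exit_time"] > t)).sum()))
              for t in grid]
    return pd.DataFrame(counts, columns=["time", "occupancy"])
```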
The factors time of day, day of week and historic occupancy are also retrieved from this transformed dataset.
Weather data is retrieved from the Royal Netherlands Meteorological Institute [37]. Data on (school)holidays and days between holiday and weekend are
retrieved from Schoolvakanties-Nederland [38].
No (open) datasets are available on the factors events and parking lot accessibility: no historic information on road constructions or detours can be found, and possible events in The Edge are not centrally registered. Therefore these two factors cannot be used in the final dataset (and predictive model).
The factor parking price could technically be integrated into the final dataset, but since the parking price is zero at all times (resulting in no predictive power), the factor is ignored.

The original dataset contained one row for every car entering and/or leaving the parking lot. After the cleaning and transformation steps mentioned above, the final dataset contains one row per half hour (November 17, 2014 - April 15, 2016), with the associated variables (occupancy, weather, holidays, etc.).
4.2.2 Data selection strategies
Different data selection strategies are defined, based on the use of the factor
historic occupancy.
Vlahogianni et al. and Zheng et al. include previous observations in their prediction models [31], [32]. Although these studies predict occupancy in a more real-time manner (1 to 60 minutes ahead), the effects of including this factor are researched here. The cited articles use 1 and 5 observations, depending on the modelling technique. Both options, including 1 or 5 observations, are tested to see which time window works best with which technique. For the prediction goal of predicting the evening before, predictions are made as one-day-ahead predictions (48 steps ahead). This way, predictions can still be checked the evening before, but data on more similar time frames is taken into account. For this prediction goal both strategies of using the previous 1 and 5 observations are tested, as well as including observations of the previous 5 days. This is because the goal is to add memory structures to the system that retain the effect of past information and use it during learning [32]. While the short-term predictions incorporate the memory structures of that particular day, including information on the past 5 days might be a better memory structure for the non-near-real-time predictions.
The different strategies therefore are:
1. 1-step-ahead occupancy, including 1 time frame
2. 1-step-ahead occupancy, including 5 time frames
3. 4-steps-ahead occupancy, including 1 time frame
4. 4-steps-ahead occupancy, including 5 time frames
5. 48-steps-ahead occupancy, including 1 time frame
6. 48-steps-ahead occupancy, including 5 time frames
7. 48-steps-ahead occupancy, including occupancy at prediction time up to 5 days before
All strategies also include the other factors which resulted from the literature review.
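As an illustration, the lagged-occupancy columns behind these strategies could be constructed as follows; the column names are assumptions, and 48 half-hour steps correspond to one day:

```python
import pandas as pd

def add_lags(df, horizon, n_obs=0, n_days=0):
    """Add lagged occupancy columns to a half-hourly dataset.
    horizon: steps ahead to predict (1, 4 or 48).
    n_obs:   number of consecutive past observations to include.
    n_days:  number of same-time-of-day observations on previous days."""
    out = df.copy()
    for i in range(n_obs):
        # observation i+1 steps before the prediction origin
        out[f"obs_{i + 1}"] = out["occupancy"].shift(horizon + i)
    for d in range(1, n_days + 1):
        # occupancy at prediction time, d days earlier (48 half-hours/day)
        out[f"day_{d}"] = out["occupancy"].shift(48 * d)
    return out

# e.g. strategy 2 (1 step ahead, previous 5 observations):
#   add_lags(df, horizon=1, n_obs=5)
# strategy 7 (48 steps ahead, same time on the previous 5 days):
#   add_lags(df, horizon=48, n_days=5)
```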
4.2.3 Exploratory data analysis
Figure 4.2 displays the average number of cars in the parking lot for every
hour, per day of the week. Looking at the graph, time of day and day of week
seem to be important independent variables. For example: office hours
result in high occupancy; weekends result in few cars parked in the parking
lot.
Besides this visual exploration, the correlation between all of the variables is checked. Tables A.1 and A.2 in Appendix A display all correlations. Bold values indicate a significant correlation at α < 0.05. Table 4.4 shows the correlations between the goal variable and the independent variables of all strategies. All independent variables correlate significantly with the dependent variable occupancy.
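A minimal sketch of such a significance check (Pearson correlation with a two-sided test; the normal approximation for the p-value is reasonable given the roughly 24,000 half-hour rows used in this chapter):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def is_significant(r, n, alpha=0.05):
    """Two-sided test of H0: rho = 0 via the t-statistic, using a normal
    approximation for the p-value (adequate for large n)."""
    t = r * math.sqrt((n - 2) / (1 - r * r))
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
    return p < alpha
```

For example, a weak correlation such as 0.13 is highly significant over 24,765 rows, while the same value over 100 rows would not be.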
TABLE 4.4: Significant correlations between independent variables & occupancy

    Ind. factor     Corr.  | Ind. factor      Corr.  | Ind. factor       Corr.
    WeekDay         -0.28  | 1-step/obs. 2     0.93  | 48-steps/obs. 2    0.67
    Holiday         -0.38  | 1-step/obs. 3     0.86  | 48-steps/obs. 3    0.63
    Wind             0.13  | 1-step/obs. 4     0.79  | 48-steps/obs. 4    0.58
    Sunshine         0.32  | 1-step/obs. 5     0.71  | 48-steps/obs. 5    0.52
    Rain             0.02  | 4-step/obs. 3     0.62  | 48-steps/day2      0.45
    Temperature      0.02  | 4-step/obs. 4     0.53  | 48-steps/day3      0.48
    Hour             0.16  | 4-step/obs. 5     0.44  | 48-steps/day4      0.47
    1-step/obs. 1    0.98  | 48-steps/obs. 1   0.68  | 48-steps/day5      0.43
FIGURE 4.2: The average number of cars in the parking lot, displayed per day of the week and hour of the day.
4.2.4 Technique selection
With the factors retrieved from the structured review, the prediction models can be tested. The BI software QlikView [39] is used to combine all the variables from different sources into one dataset. This set is used to create and analyse the prediction model(s), using the R suite for statistical analysis [40].

Four learning methods are tested: Multiple Linear Regression (MLR), Random Forest (RF), Classification and Regression Trees (CART) and Support Vector Regression (SVR). MLR is used because it is a very straightforward method, often used in the field of predictive analytics (as mentioned in Section 3.1). RF is an ensemble method, which combines the results of other prediction methods to make (better) predictions [11]. RF often results in high accuracy and is robust to outliers and noise [16]. CART and SVR are used to represent other categories of regression techniques. Doing this reduces the risk of poorly performing models due to technique-specific issues [2].

When testing these different machine learning methods, the best performing method is chosen based on the performance measures explained in Section 4.1.1.
4.2.5 Evaluation, validation & model selection
To evaluate the results, 10-fold cross-validation is used. With cross-validation, a model is trained and tested using the hold-out method explained in Section 3.1. The training and testing procedure, however, is repeated multiple times (in this case ten times), each time testing on a different set of observations [11]. The performance measures are calculated using the total prediction errors (squared and absolute) from the 10 iterations, divided by the total number of rows in the original dataset [12]. Although Random Forest is robust against overfitting [16], cross-validation is used to make sure the performance measures are not due to chance. The performance measures for the different techniques and data selection strategies are displayed in Tables B.1 up to and including B.7 of Appendix B. Using Random Forest results in the lowest prediction errors for all different strategies. The performance measures resulting from using RF are displayed in Table 4.5.

The table also displays the measures for a naive model. This naive model predicts the occupancy based on averages for day of week and time of day: variables that can be derived from the original dataset alone.

The bottom row of Table 4.5 shows the accuracy of the different prediction models: the percentage of occupancy predictions that are within a range of 5 cars from the actual occupancy.
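The evaluation set-up could be sketched as follows, assuming a scikit-learn Random Forest in place of the R implementation used in the thesis; the feature handling and names are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def evaluate(X, y, weekday, hour):
    """10-fold cross-validation of a Random Forest against a naive model
    that predicts the per-(weekday, hour) average occupancy of the
    training fold. Returns (MAE of RF, MAE of naive, % within 5 cars)."""
    abs_rf, abs_naive, within5 = [], [], []
    for train, test in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
        rf = RandomForestRegressor(n_estimators=100, random_state=0)
        rf.fit(X[train], y[train])
        pred = rf.predict(X[test])
        abs_rf.extend(np.abs(pred - y[test]))
        within5.extend(np.abs(pred - y[test]) <= 5)
        # naive model: average occupancy per (weekday, hour) in the training fold
        means = {}
        for w, h, v in zip(weekday[train], hour[train], y[train]):
            means.setdefault((w, h), []).append(v)
        naive = np.array([np.mean(means.get((w, h), y[train]))
                          for w, h in zip(weekday[test], hour[test])])
        abs_naive.extend(np.abs(naive - y[test]))
    return np.mean(abs_rf), np.mean(abs_naive), 100 * np.mean(within5)
```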
TABLE 4.5: RF performance measures for all strategies

            Naive      1      2      3      4      5      6      7
    MSE     956.2   43.2   20.6  182.7  153.9  150.2  184.3  104.8
    RMSE     30.9    6.6    4.5   13.5   12.4   12.3   13.6   10.2
    MAE      18.0    2.9    2.3    6.4    6.0    5.9    6.5    4.9
    %≤5      24.1   84.0   86.6   70.4   70.6   69.7   67.9   72.3

4.3 Stage III: Model convergence
Using the performance and accuracy measures displayed in the tables above, the final predictive models are selected. All tested strategies result in a lower prediction error and higher accuracy compared to using averages of day and time to predict occupancy (the naive model). Including a larger time window improves the performance of a model, except for including the 5 last observations in the 48-steps-ahead prediction. Including the observations of the last 5 days, however, does result in a lower prediction error.
Strategies 2, 4 and 7 are therefore selected as the prediction models for the goals of predicting half an hour, two hours and one day ahead, respectively. When predicting half an hour in advance, predictions are on average 2.3 cars off. Using strategies 4 and 7 results in an average (absolute) error of 6.0 and 4.9 cars, respectively.
4.4 Conclusion
Based on the results, the final prediction instrument developed using PID-SEDA uses the weather variables temperature, sunshine, wind and rainfall; (school) holiday data; day of week; time of day and (historic) occupancy numbers of the previous 5 steps; applies the data cleaning strategy described in Section 4.1.3; and uses a Random Forest technique. Using this instrument results in an average error of 2.3 cars when predicting half an hour in advance, an average error of 6.0 cars when predicting 2 hours in advance, and an average error of 4.9 cars when predicting one day ahead.
Chapter 5
Soft-inclusive development
This chapter describes the process of developing a prediction instrument following the soft-inclusive approach of Van der Spoel et al. [2]. For comparability, the prediction problem is the same as in the soft-exclusive development method. This chapter starts with the process of selecting experts, collecting hypotheses on what influences the occupancy of the garage, and collecting possible constraints. In the second stage these hypotheses and constraints are used to develop prediction models, after which a final model is selected in stage III.
5.1 Preparation
Before conducting the first stage of the development method (developing hypotheses), the prediction problem is identified. As mentioned, this prediction problem is the same as the one in Section 4.1.1. This way, the results of the different development methods can be compared using a trade-off analysis.
5.1.1 Problem identification
The problem is predicting the occupancy of the parking garage of The Edge, in such a way that employees can decide whether or not to park in the garage on a given day and time. The goal variable in this prediction is occupancy: the number of Deloitte cars in the garage.
5.1.2 Expert selection
A stakeholder analysis is conducted to identify the actors involved in the domain of the prediction problem. The results of this analysis are summarized in Table 5.1.

Three different groups of domain experts are identified. The first is the group of employees who are always allowed to park their cars in the garage. These employees are mainly partners, directors and senior managers, since these functions require being at the office a great deal of the time.
TABLE 5.1: Domain experts

    Group                               Stakeholder
    1. Employee - Parking rights        Partner; Director; Senior Manager
    2. Employee - Parking rights        Manager; Senior Consultant;
       (only after 04:00 PM)            Consultant; Business Analyst
    3. Support                          CBRE Global Workplace Solutions;
                                        IP Parking; Reception
The second group is the group of employees with only ‘overtime’ parking rights. These rights allow the employees to park in the garage after 04:00 PM on working days, as well as the whole day during weekends. On working days before 04:00 PM, no access is granted to the parking garage.

The third group is a collection of support stakeholders, rather than users like the first two groups. ‘CBRE GWS’ manages the housing of The Edge and is also in charge of the parking garage. ‘IP Parking’ is the company that installed and manages the parking management system and the corresponding sensors. The receptionists also connect to the prediction problem as domain experts, because they see the people who parked their car entering the building, and they have experience reserving parking spots for visitors.
5.2 Stage I: Assumptions
In the first stage of the soft-inclusive development method, hypotheses are
collected on what influences the occupancy of the parking garage. These
hypotheses are collected by consulting the experts from Table 5.1.
5.2.1 Hypothesis divergence
Hypotheses are collected through brainstorming sessions with groups of experts. As mentioned in Section 3.1, brainstorming sometimes has to be done anonymously to ensure no information is withheld. In this research anonymous brainstorming is not necessary, since the behaviour of experts does not affect the (business) processes of the company. However, to increase internal validity, individual expert interviews were conducted as well. Figure 5.1 shows the hypotheses mentioned by experts diagrammatically, displaying the constructs and their hypothetical relationships. Table 5.2 summarizes the hypotheses and their constructs.
Hypothesis 1
Mentioned by a manager of Risk Advisory, as well as during a brainstorm session with experts in group 2, is the influence of the day of the week. This is because one of the main activities of Deloitte is advisory work. On Fridays many employees come to the office to meet with internal teams; on the other days these employees are working at a client’s office.
Hypothesis 2
Mentioned by one of the experts in group 1: when arriving ‘on time’, occupancy of the garage is low. This suggests an influence of the variable time of day, as used in the soft-exclusive method.
Hypothesis 3 & 4
One of the experts mentioned working at home if the weather is really bad, and thus not parking a car in the garage on those days. Other employees mentioned they usually go by bike to The Edge, but might go by car on a rainy day.
Hypothesis 5
A hypothesis also mentioned by employees who usually go by bike is that they go by car if they have to be somewhere else later that day. This happens quite often because of the advisory work: in the morning employees might have team meetings at The Edge, after which they go to a client’s office to discuss progress, present results, et cetera.
Hypothesis 6
Flexible working is encouraged by Deloitte. Employees can work from home, or from other locations, connected through the Deloitte network. Sometimes, however, physical presence is still needed, to have meetings and/or discussions on important topics. Experts mentioned they park their car in the garage if they have appointments at The Edge that require their physical presence.
Hypothesis 7
Since The Edge is in Amsterdam (the capital of The Netherlands), the highways surrounding the office are often congested during office hours. During a brainstorm session with experts in group 2, it was mentioned that some employees choose to go by train if a lot of traffic is predicted.
Hypothesis 8
Employees who usually go to the office by public transportation might go by car if trains do not run, or if detours have to be taken, as mentioned during an expert brainstorm.

TABLE 5.2: Collected hypotheses and their constructs

    #   Hypothesis                                          Constructs
    1   Fridays are more busy than other days               Friday (F) is-a day-of-week (DoW); other day (OD) is-a DoW
    2   Occupancy depends on what time you arrive           Time-of-day (ToD)
    3   Extremely bad weather: I will not go (by car)       Weather (W)
    4   Bad weather: I will go by car, instead of by bike   Weather (W)
    5   I go by car if I have to be somewhere else later    External appointments (Ex) is-an appointment (A)
    6   I go to the office if I have appointments           Internal appointments (In) is-an A
    7   When there is a lot of traffic I go by train        Traffic (T)
    8   When trains do not run I go by car                  Rail problems (RP)
    9   Because of road constructions I might go by train   Road constructions (RC)
    10  The number of external assignments                  Ex is-a A
    11  Results from literature review                      DoW; ToD; W; Holiday (H); Historic Occupancy (HO)
Hypothesis 9
One of the experts (stakeholder group 2) mentioned that road constructions can hinder access to the garage. Employees can travel by public transportation instead, resulting in lower parking occupancy.
Hypothesis 10
Another hypothesis resulting from the expert brainstorm is the influence of external assignments. The percentage of posted employees might influence the number of cars in the garage, since it determines the number of employees working at a client’s office.
Hypothesis 11
The last hypothesis is not derived from expert opinions, but from the structured literature review. This review was already conducted during the development of the soft-exclusive prediction model in Section 4.1.2. Usable variables resulting from this review are time of day, day of week, weather, holiday and occupancy k steps ahead (historic occupancy).
5.2.2 Hypothesis convergence

To make sure no hypothesis will be tested more than once, the set of hypotheses is converged. The first step in this process is specialization, in which constructs are replaced by their subconstructs (if any exist). This rule applies to hypotheses 1, 5, 6 and 10, resulting in the diagrams shown in Figure 5.2.

After specialization the set of hypotheses is standardized, which means synonymous constructs are replaced by one synonym [2], [41]. In this case no synonymous relations are present.
FIGURE 5.1: Hypotheses diagrams (panels A-K, one per hypothesis 1-11) show the possible relationships between constructs, as mentioned by experts. For example, weather (W) decreases occupancy (O), as suggested in hypothesis 3.
FIGURE 5.2: Hypotheses 1, 5, 6 and 10 (panels A-D) after specialization: replacing a construct by its subconstruct.
All hypotheses that contain a construct not mentioned by any other hypothesis become part of the final set of hypotheses T. This applies to hypotheses 1, 6, 7, 8, 10 and 11. Hypotheses 2 and 11 share the construct time of day and suggest the same total causal influence on occupancy. In that case the hypothesis with the most constructs is added to T, being hypothesis 11, which is already part of the set. Hypotheses 3, 4 and 11 also share a construct, weather. Their causal influence, however, differs for all three hypotheses, resulting in hypotheses 3 and 4 being added to T. The same goes for hypotheses 5 and 10, resulting in the final set of hypotheses T containing all (specialized) hypotheses except number 2.
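The selection rules above can be sketched in code. The following is a minimal illustration under a simplified, assumed representation (each hypothesis as a set of constructs plus a causal influence sign); it is not the exact procedure from [2]:

```python
# Sketch of hypothesis convergence (simplified): keep every hypothesis
# that contributes a construct no other hypothesis mentions; for
# hypotheses sharing constructs, keep both if their causal influence
# differs, otherwise keep only the one with the most constructs.

def converge(hypotheses):
    """hypotheses: dict name -> (frozenset of constructs, influence sign)."""
    final = set()
    # Rule 1: hypotheses with a construct unique to them enter T.
    for name, (constructs, _) in hypotheses.items():
        others = set().union(*(c for n, (c, _) in hypotheses.items() if n != name))
        if constructs - others:
            final.add(name)
    # Rule 2: pairwise comparison of hypotheses with shared constructs.
    names = list(hypotheses)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (ca, sa), (cb, sb) = hypotheses[a], hypotheses[b]
            if ca & cb:
                if sa == sb:  # same causal influence: largest hypothesis wins
                    final.add(a if len(ca) >= len(cb) else b)
                else:         # different influence: keep both
                    final.update({a, b})
    return final

# Hypotheses 2, 3, 4 and 11, with constructs abbreviated as in Table 5.2.
hyps = {
    "H2": (frozenset({"ToD"}), "+"),
    "H3": (frozenset({"W"}), "-"),
    "H4": (frozenset({"W"}), "+"),
    "H11": (frozenset({"DoW", "ToD", "W", "H", "HO"}), "+"),
}
# Hypothesis 2 is subsumed by 11; 3 and 4 stay because their influence differs.
assert converge(hyps) == {"H3", "H4", "H11"}
```

This reproduces the outcome described above: hypothesis 2 is the only one dropped from the set.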
5.2.3 Constraint definition
Once the set of hypotheses is defined, constraints are collected. To make
sure the final prediction instrument is actionable (can be used in practice),
experts have determined some constraints (as explained in Section 3.3).
Data constraint
The prediction instrument should use data collected by the sensors of the parking lot of The Edge. A dataset is available with data collected between November 17, 2014 and April 15, 2016, showing, among other things, the date and time of cars entering and leaving the parking lot. Apart from this, external (free) data sources must be used and integrated with the model.
As can be seen in Table 5.2, the variables day of week, time of day and weather are mentioned by experts and also resulted from the literature review in Section 4.2. Because of ex ante availability¹, the same data constraint applies here: only variables that are available one day ahead (domain constraint) can be taken into account. Therefore the same weather variables will be used: temperature, rainfall, sunshine and wind.
Deployment constraint
No (IT) systems are in place in which the prediction instrument should be implemented. Developing a prediction instrument is (one of) the first phase(s) of exploring the possibilities of using data collected by the (smart) building. Therefore, no deployment constraints are in place.
Domain constraint
The prediction instrument is intended to let employees see whether or not they can go to The Edge by car, based on occupancy predictions. Since most employees decide on a day's mode of transport the evening beforehand, predictions for every half hour of the day should be available the evening before.

Another domain constraint is that data used in the prediction model should not be traceable to individuals. When privacy-sensitive data could enhance the model, a privacy officer should be consulted to discuss whether the data can be used, for example in an aggregated way.
Interestingness constraint
As Van der Spoel et al. [2] mention, studies with a similar problem can be used as a benchmark for comparing predictive performance. As one of the goals of this research is to compare a soft-exclusive development method with this soft-inclusive development method, the soft-exclusive developed prediction instrument with the goal of predicting one day ahead will be used as the benchmark (see Section 4.4). No constraints on prediction accuracy and/or prediction error are given by the experts.
5.3 Stage II: Predictive modelling

5.3.1 Data selection & cleaning strategies

To translate the hypotheses into datasets that are used to train the predictive models, data cleaning and selection steps are performed. The predictive power of a developed model can hereby be improved, since possible noise in the data is reduced [2].
¹ Only include predictors (input variables) that are available at time of prediction [11].
Data cleaning strategies
Section 4.2.1 describes the data cleaning strategy, which is used in this section as well. Using the dataset of The Edge, occupancy is calculated from the number of cars entering and leaving the parking lot (using mean imputation for missing values). The factors time of day, day of week and historic occupancy are also retrieved from this transformed dataset of the parking lot of The Edge.
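As an illustration of this transformation, the occupancy calculation can be sketched with pandas. The event log below and its column names are invented for the example; the thesis does not show the actual sensor format:

```python
import pandas as pd

# Invented event log: one row per car entering (+1) or leaving (-1).
events = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2015-03-02 07:10", "2015-03-02 07:40", "2015-03-02 08:05",
        "2015-03-02 16:55", "2015-03-02 17:20",
    ]),
    "direction": [1, 1, 1, -1, -1],
})

occupancy = (
    events.set_index("timestamp")["direction"]
    .resample("30min").sum(min_count=1)  # NaN for half hours without sensor data
    .cumsum()                            # running total of entries minus exits
)
occupancy = occupancy.fillna(occupancy.mean())  # mean imputation for gaps
```

The half-hour grid matches the prediction goal of one value per half hour; gaps in the sensor data surface as NaN and are filled by mean imputation, as in the thesis.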
Data selection strategies
The constructs (factors) derived from the domain analysis are now tested in different strategies of using the factors. By doing so, the strategy with the best (predictive) performance can be selected.
The selection strategies are as follows:

1. Naive model: Only using day and time, as a naive benchmark.

2. Soft-exclusive: Only using the factors derived from the literature review (hypothesis 11), to be able to compare the different development methods. This strategy reflects hypotheses 1, 3 and 4 as well.

3. Traffic & Rail problems: Using factors from literature, expanding the dataset with (near) real-time traffic and rail data, as suggested in hypotheses 7 & 8.

4. Traffic data day before: Adding traffic data of one day before (because of ex ante availability) to the dataset of factors from literature, as suggested in hypothesis 7.

5. Rail data day before: Adding rail data of one day before (because of ex ante availability) to the dataset of variables from literature, as suggested in hypothesis 8.

6. Appointments: Adding data on appointments of employees to the dataset of variables from literature, as suggested in hypotheses 5, 6 and 10.

7. Constructions: Adding data on road constructions to the dataset of variables from literature, as suggested in hypothesis 9.

8. Combining strategies 4, 5, 6 & 7: Combining all mentioned factors, from literature as well as from the experts.

9. Combining strategies 4, 5, 6 & 7, using categorical weather factors: Because predictions are made one evening in advance, predicted weather values have to be used as input factors. To reduce the error in these predictions, weather variables are converted to categorical factors. For example, temperature is assigned to categories like 'Below zero', '0-5', '5-10', ..., 'Above 25'.
5.3.2 Reduction by data & domain constraints
Having defined the different strategies, the data and domain constraints are checked to see which strategies can be used.

All strategies, apart from the naive benchmark, use the dataset of The Edge and external data sources to make predictions, meeting the first domain constraint.

Strategies 3, 6 & 7 are filtered out because of the data constraints. Data on rail problems is retrieved from 'Rijden de Treinen' [42]; traffic data is retrieved from the National Database on Traffic data [43]. These two variables, however, are not available at time of prediction, because of the constraint to predict the evening before. If predictions were made, e.g., half an hour ahead, this strategy could have been used.² No (open) dataset is available on the factor road constructions: no historic information on road constructions or detours can be found. The same goes for data on internal and external appointments. Therefore strategies 6 & 7 cannot be used in training a predictive model either. Strategies 8 and 9 now reduce to combining only strategies 4 & 5.

Weather data is retrieved from the Royal Netherlands Meteorological Institute [37], and is available at time of prediction (see Section 5.2.3). The same goes for data on (school) holidays and days between holiday and weekend, which are retrieved from Schoolvakanties-Nederland [38].

Strategies 1 and 2 are already used in the soft-exclusive development method, so the results from Section 4.2 will be reused here.

As a result, the strategies that will be used in selecting the final predictive model(s) are 4, 5, 8 and 9.
5.3.3 Exploratory data analysis
To see how the collected hypotheses relate to the data, an exploratory data analysis is performed. Figure 5.3 displays the average course of occupancy, traffic intensity and traffic speed during working days. The figure provides insight into the relationship between the occupancy of the parking lot and the variable traffic, which will be used in strategies 4, 8 and 9. As can be seen, there seems to be a (cor)relation between the independent variable (traffic) and the goal variable (occupancy). The variable traffic intensity follows a roughly opposite course compared to occupancy during daytime. Looking at traffic speed, the moments of traffic jams can be identified, which relate to the moments occupancy shows a steep increase or decrease.

FIGURE 5.3: The (cor)relation between traffic and occupancy flows. Averages for working days are used.

Besides this visual exploration, correlation between variables is checked. Correlation between most of the variables was already checked in Section 4.2.1. Correlation between the new variables traffic and rail problems (of one day ahead) and the goal variable is displayed in Table 5.3. As can be seen in the table, all three variables significantly correlate (α < 0.05) with occupancy, although correlation is low.

TABLE 5.3: Variable correlation, significant at α < 0.05

                                Occupancy
 Rail problems day before       0.07
 Traffic intensity day before   0.15
 Traffic speed day before       0.06

² To see if this strategy would improve predictive performance, it will be tested, although it will not be used in selecting the final predictive model.
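A significance-tested correlation such as those in Table 5.3 can be computed, for instance, with scipy; the data below is synthetic stand-in data, not the actual dataset:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
# Synthetic stand-ins for occupancy and a weakly correlated predictor
# (in the spirit of traffic intensity of the day before, r around 0.15).
occupancy = rng.normal(size=500)
traffic_intensity = 0.15 * occupancy + rng.normal(size=500)

r, p = pearsonr(occupancy, traffic_intensity)
significant = p < 0.05  # the significance level used in Table 5.3
```

With a large sample, even a low correlation like 0.15 can be statistically significant, which is exactly the situation reported in the table.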
5.3.4 Technique & parameter selection

To test the different strategies, the same techniques are used as with PID-SEDA, being Multiple Linear Regression (MLR), Random Forest (RF), Classification and Regression Trees (CART) and Support Vector Regression (SVR) for the regression approach (see Section 4.2.3).

BI software 'Qlikview' [39] is used to combine all the factors from the different sources into a dataset per strategy. The statistical computing program 'R' [40] is used to train, test and evaluate the different prediction models.
TABLE 5.4: Random Forest - Performance Measures

         Naive   PID-SEDA   3       4       5       8       9
 MSE     956.2   104.8      113.8   109.3   106.0   111.1   136.5
 RMSE    30.9    10.2       10.7    10.5    10.3    10.5    11.7
 MAE     18.0    4.9        5.2     5.1     5.0     5.2     5.9
 %≤5     24.1    72.3       71.5    71.7    72.2    71.4    67.8

5.3.5 Model training
To test the performance of the models, 10-fold cross-validation is used, as suggested by Han and Kamber [12, p. 365]. The average results of the performance measures for all strategies are displayed in Appendix C. Because using RF gives the best results for all strategies, the performance measures of this technique are displayed here in Table 5.4. Besides the prediction error measures, the percentage of predictions with an error of less than or equal to 5 cars is displayed in the bottom row of Table 5.4. As can be seen, all strategies perform better than the naive benchmark.
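The evaluation protocol (Random Forest with 10-fold cross-validation, reporting MAE, RMSE and the share of predictions within 5 cars) can be sketched with scikit-learn; the features and data below are synthetic placeholders, not the thesis dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(42)
n = 600
# Synthetic placeholders for the real factors (half-hour slot, weekday,
# historic occupancy); the actual dataset is not reproduced here.
slot = rng.integers(0, 48, n)
weekday = rng.integers(0, 5, n)
lagged_occupancy = rng.normal(200, 40, n)
X = np.column_stack([slot, weekday, lagged_occupancy])
y = 0.8 * lagged_occupancy + 2 * weekday + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
predictions = cross_val_predict(
    model, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0)
)

abs_error = np.abs(predictions - y)
mae = abs_error.mean()
rmse = np.sqrt(((predictions - y) ** 2).mean())
pct_within_5_cars = (abs_error <= 5).mean() * 100  # the "% ≤ 5" measure
```

`cross_val_predict` returns one out-of-fold prediction per observation, so the error measures summarize predictions that were never seen during training of the corresponding fold.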
5.3.6 Reduction by interestingness, deployment & domain constraints
The tested strategies are now checked against the constraints defined in Section 5.2.3.

No privacy-sensitive data is used in any of the strategies, conforming to one of the domain constraints. Another domain constraint was using external databases, which eliminates the naive strategy. This strategy will only be used as a benchmark, and cannot be chosen for the final prediction instrument. As mentioned before, strategy 3 does not comply with the domain constraints either, which means it will not be used in the prediction instrument.

Since no deployment constraints are defined, the final reduction step is to look at the interestingness constraint. No constraints are given on prediction error and accuracy; the interestingness constraint, however, is to use the soft-exclusive developed model as a benchmark. The chosen strategy therefore has to perform the same as, or better than, the model developed using PID-SEDA.
A t-test is performed to test whether the measures of the strategies differ significantly [44], see Table 5.5. Strategy 5 performs significantly better (t-test, p < 0.01) than strategies 4, 8 & 9, and is the only strategy for which the interestingness constraint holds. There is no significant difference in the performance of the model of PID-SEDA and strategy 5.

This leaves strategy 5 as the only remaining strategy after reduction by the domain and interestingness constraints.

TABLE 5.5: Comparison of results (based on absolute error). An x indicates the strategy in the row performs significantly better compared to the one in the column (p < 0.01).

              PID-SEDA   Strategy 4   Strategy 5   Strategy 8   Strategy 9
 PID-SEDA     -          x                         x            x
 Strategy 4              -                                      x
 Strategy 5              x            -            x            x
 Strategy 8                                        -            x
 Strategy 9                                                     -
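The pairwise comparison behind Table 5.5 tests differences in absolute error with a two-sample t-test; a sketch with scipy on synthetic error samples (the real comparison uses the cross-validated errors of each strategy):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
# Synthetic absolute-error samples for two strategies, illustrative only;
# the spreads and means loosely echo the MAE values in Table 5.4.
errors_strategy_5 = np.abs(rng.normal(5.0, 3.0, 1000))
errors_strategy_9 = np.abs(rng.normal(5.9, 3.0, 1000))

t_stat, p_value = ttest_ind(errors_strategy_5, errors_strategy_9)
# "x" in the table: significantly better at p < 0.01 with a lower mean error.
better = (p_value < 0.01) and (errors_strategy_5.mean() < errors_strategy_9.mean())
```

Each cell of the table corresponds to one such test between the error samples of a row strategy and a column strategy.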
5.4 Stage III: Model Convergence

Only one model results from the steps described above. Using this model, the occupancy of the parking lot of The Edge is predicted one day ahead with an average error of 5 cars; 72.2% of predictions have an error of less than or equal to 5 cars, compared to the actual occupancy.
5.5 Conclusion

Based on the results, a prediction instrument using the weather variables temperature, sunshine, wind and rainfall; using (school) holiday data; day of week; time of day; (historic) occupancy numbers of up to 5 days ahead; including data on rail problems of one day ahead; and using a Random Forest technique, is selected as the final prediction instrument developed using PID-CD. Using this instrument, the average prediction error is 5.0 cars, and 72.2% of predictions are within an error of 5 cars.

In the next chapter this prediction instrument is compared to the prediction instrument developed using PID-SEDA (Section 4.4), to be able to answer the research question.
Chapter 6
Discussion
This chapter describes the last part of Technical Action Research: analysis of
results [7]. Analysis of results consists of presenting observations, providing possible explanations (Section 6.1), and answering research questions
(Chapter 7). Generalizations, limitations, and recommendations for future
work are discussed by assessing research validity (Section 6.2).
6.1 Comparing models
To answer the research question, the prediction instruments resulting from using PID-SEDA and PID-CD are compared. Table 6.1 displays the mean squared and absolute errors, as well as the accuracy, of the predictive models. Three predictive models resulted from using PID-SEDA: one for predicting half an hour ahead, the second for predicting two hours ahead, and the last for predicting one day ahead. One predictive model resulted from using PID-CD, which is aimed at predicting occupancy one day ahead as well. As can be seen in the table, the first model of PID-SEDA predicts with the smallest errors and highest accuracy. Prediction of occupancy half an hour ahead differs, on average, by 2.3 cars from the actual occupancy value. However, based on the domain constraints defined in Section 5.2.3, only models with a prediction goal of one day ahead can be used in practice. The first two prediction instruments of PID-SEDA are therefore not actionable.
TABLE 6.1: Performance measures of different development methods & strategies

         PID-SEDA (S2)   PID-SEDA (S4)   PID-SEDA (S7)   PID-CD (S5)
 MSE     20.6            153.9           104.8           106.0
 RMSE    4.5             12.4            10.2            10.3
 MAE     2.3             6.0             4.9             5.0
 %≤5     86.6            70.6            72.3            72.2
As tested in Section 5.3.6, there is no significant difference in the predictive performance of the remaining PID-SEDA model and the model developed using PID-CD. Including information on rail problems one day ahead does not improve performance of the predictive model. As the selected PID-CD model performs significantly better than the other models tested in Section 5.3.5, including information on traffic one day ahead (strategy 4), including both rail problems and traffic data of one day ahead (strategies 8 & 9), or including real-time information on rail problems and traffic (strategy 3), does not improve predictive performance either. The same data cleaning strategy is applied, and the same modelling technique (RF) and validation method (10-fold cross-validation) are used. Based on the predictive models alone, the prediction instruments do not (significantly) differ.
A possible reason why PID-CD did not result in a better predictive model than PID-SEDA is that not all hypotheses mentioned by experts could be used. Specific intelligence about the domain was gathered that eventually could not be used in developing the prediction model, since no data was available on internal and external appointments of employees. We elaborate on this in Section 7.2.

Besides this, the predictive model resulting from using PID-SEDA already has high predictive power. This predictive power is unlikely to be enhanced by adding a factor that only slightly correlates with the goal variable (significant correlation of 0.07, see Table 5.3). Occupancy has proven to follow a similar pattern during working days. Using the variables time of day and day of week, and adding a memory structure by using historic occupancy, is already a solid basis for a predictive model. Adding factors with high or moderate correlation to the goal variable, such as holiday and weather in this case, refines the predictive accuracy of the model. Low-correlation factors (like rail problems), however, have too little predictive power to enhance the model.
6.2 Validity

According to Wieringa [45], reasons that support the conclusion of a Technical Action Research must be spelled out, as well as reasons why the conclusion could be false. Three well-known kinds of validity are used to do so: (statistical) conclusion validity, internal validity and external validity [45], originally proposed by Cook and Campbell [46].

Conclusion validity refers to the ability to draw conclusions on the basis of statistical evidence [47, p. 1253]. Internal validity is the support for a conclusion to be caused by the treatment only [45], and not be influenced by factors outside the treatment (in this case the treatment is a predictive model). Internal validity assesses the quality of the predictive model(s), for example by describing limitations. The quality of a predictive model indicates its potential usefulness in practice [48]. External validity refers to generalizing across times, settings [47], and in this case, domains.
6.2.1 Conclusion Validity

Conclusions in this research are drawn on the basis of statistical evidence. Conclusions are based on a two-sample t test, testing the difference in (average) prediction error, see Section 5.3.6.

A two-sample t test is used because these tests are more robust than one-sample methods [44]. The samples were independent, which is a condition for using the test, and sample sizes were large (n = 74266). When dealing with equal-sized samples and distributions with similar shape, probability values are very accurate [44], which was tested and holds for the t test performed.
6.2.2 Internal Validity

Internal validity is assessed by looking at the prediction models, as well as at the execution of the different development methods.

Cross-validation

Internal validity of the prediction models is enhanced by using 10-fold cross-validation. Stability of the models is hereby improved, compared to using a holdout method, because 100% of the data is used for validation. The effect of outliers is reduced, and the chance that the good results are caused by chance is reduced. Stability can be further improved, however, by performing a repeated cross-validation [48].

Another validation method which can be used is bootstrapping, which can be thought of as a smoothed version of cross-validation [49]. Efron and Tibshirani [49] show that the bootstrap method outperforms cross-validation in multiple prediction experiments. We suggest researching the effect of using a bootstrap method in future work.
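A simple out-of-bag bootstrap, as an alternative to cross-validation, could look as follows; this is a sketch on synthetic data, not the .632 estimator of Efron and Tibshirani:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
# Synthetic regression data standing in for the parking dataset.
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 200)

# Train on a bootstrap resample, evaluate on the out-of-bag rows, repeat,
# and average the error over the repetitions.
oob_errors = []
for _ in range(50):
    idx = rng.integers(0, len(X), len(X))       # sample rows with replacement
    oob = np.setdiff1d(np.arange(len(X)), idx)  # rows not in the resample
    model = LinearRegression().fit(X[idx], y[idx])
    oob_errors.append(np.abs(model.predict(X[oob]) - y[oob]).mean())
bootstrap_mae = float(np.mean(oob_errors))
```

Each resample leaves roughly a third of the rows out-of-bag, so every repetition yields a held-out error estimate, and averaging smooths the estimate in the way the cited comparison with cross-validation describes.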
Modelling technique

Different modelling techniques are used, to reduce the risk of poorly performing models because of technique-specific issues. In future work, it would be interesting to see the effect of using another technique, for example Neural Networks, which have proven to be an accurate prediction technique in other parking lot occupancy prediction studies [29], [32].
Data quality

More than a year of data is collected and used for training the models. This ensures that every possible value of the input variables is used for training. For example, all possible values of the factor holiday are included: school holidays in the summer, but also winter holidays and other national days.

The amount of missing data, and the use of data imputation to solve this, however, might have influenced the quality of the dataset. Most of the occupancy values used for training and testing are calculated using average parking stay (see Section 4.2.1). Because of this, occupancy values most likely do not represent the actual occupancy values, but an approximation of occupancy. Although this does not influence the performance measures (errors are calculated from the same dataset), the dataset is likely to (slightly) differ from the truth. The trained models therefore might not be able to predict occupancy accurately in the real world, which is a limitation of this study. Making sure sensors work most of the time, and closing the gate after each car leaving the parking lot (to be able to scan all cars), are possible solutions to reduce the amount of missing data, thereby increasing data quality.
Execution of methods

Regarding internal validity of using PID-CD, collecting domain intelligence via interviews or brainstorms is sensitive to validity threats. Experts might answer questions in a socially desirable way [45]. As discussed in Section 5.2.1, this threat especially applies to brainstorming, sometimes resulting in having to brainstorm anonymously [2]. In this research brainstorming was not conducted anonymously, because the behaviour of experts in the domain does not affect the processes of the company. However, to increase internal validity, individual expert interviews were conducted as well. Conducting individual interviews reduces the possible need to answer questions in a socially desirable way.

Another threat regarding expert consultation is the interpretation of experts' input. The researcher's interpretation of what experts say affects the designed model [45]. The hypotheses used in this research might have been interpreted wrongly, for example as suggesting another relation than originally intended, or between other variables. To decrease the influence of this threat, experts should be consulted throughout the whole process, using an iterative approach to be able to adapt steps if necessary.
A threat to the validity of using PID-SEDA is the (unintended) use of domain-specific intelligence. PID-SEDA is used in this research to serve as a benchmark method, in which no domain-specific knowledge is used. However, not using any domain intelligence turned out to be difficult. For example, the domain-specific prediction goal (predicting one day ahead) was already known at the moment of performing the soft-exclusive method. The goal was used in the soft-exclusive method to be able to easily compare the results with the results from PID-CD. It can be questioned, however, whether this goal would have been used if this specific domain knowledge had not been known. If not, no actionable prediction model would have resulted from using PID-SEDA, since the domain constraints would not have been met.
6.2.3 External Validity

External validity is assessed to see if generalizations across times and domains can be made.

Other people using the artefact

According to Wieringa, a major threat to the validity of conclusions is that the researcher who developed the artefact is "able to use it in a way that no one else can" [45]. This would imply a lack of generalizability to any case where other people are using the artefact. For the method of Van der Spoel et al. [2] this threat is mitigated by performing this research: letting someone else use the artefact (PID-CD) in a new client cycle (see Figure 2.1).

The artefact (prediction instrument) developed in this research can easily be used by other researchers with some experience in data mining, since well-known techniques and validation methods are used, and the steps of developing the instruments are described thoroughly.
Generalizing across times

Because this research focuses on one specific complex domain, generalizing across times and other domains is discouraged. To give an example of why not to generalize over time: one of the many possible characteristics of a complex domain is the influence of human behaviour [3]. Human behaviour continuously changes. Parking behaviour in the years 2014 to 2016 might differ from parking behaviour in the future. This means predictive models should be continuously adapted to changes in the domain. Taking newly generated data as input for training new models is therefore recommended.
Generalizing across domains

The prediction instruments developed in this research are specifically targeted at predicting occupancy of the parking lot of The Edge. Having used both experts' input and input from existing literature improves generalizability, but generalizing across domains remains hard. As Van der Spoel et al. [2] mention, a soft factor is unique to one domain. The same factor could be at play in two domains, but its effect will be different in each domain [2, p. 16]. Factors used in the prediction instruments can be used as input for other parking lot occupancy predictions, but a soft-inclusive domain analysis should be conducted to discover domain-specific aspects.
Chapter 7
Conclusion
Concluding this thesis, the research questions are answered and recommendations for future work are given.

7.1 Answering Research Questions

The research question and its subquestions, as formulated in Section 2.1.2, are answered below. First, the subquestions are answered.
1. What instrument for predicting parking lot occupancy results from using 'prediction instrument development with soft-exclusive domain analysis'?

The prediction instrument resulting from using PID-SEDA includes the weather variables temperature, sunshine, wind and rainfall; uses (school) holiday data; day of week; time of day and (historic) occupancy numbers of 5 steps in advance. The instrument uses a Random Forest technique, with 10-fold cross-validation to validate performance measures. Using this instrument results in an average error of 2.3 cars when predicting half an hour in advance, and an average error of 6.0 cars when predicting 2 hours in advance. These two prediction models, however, cannot be used in the domain researched, because of the domain constraint to predict one day ahead. Predicting one day ahead, this prediction instrument results in an average error of 4.9 cars; 72.3% of predictions are within an error of 5 cars.
2. What instrument for predicting parking lot occupancy results from using 'prediction instrument development for complex domains'?

The prediction instrument resulting from using PID-CD includes the weather variables temperature, sunshine, wind and rainfall; uses (school) holiday data; day of week; time of day; (historic) occupancy numbers of up to 5 days ahead, and includes data on rail problems of one day ahead. It predicts one day ahead and uses a Random Forest technique with 10-fold cross-validation to validate performance measures. No privacy-sensitive data is used and external data sources are integrated into the predictive model. Using this instrument, the average prediction error is 5 cars, and 72.2% of predictions are within an error of 5 cars.
Next, the research question is answered:

'How does a prediction instrument developed using a soft-inclusive method compare to a prediction instrument developed using a soft-exclusive method?'

Answering the research question: the prediction instrument developed using PID-CD does not outperform the prediction instrument developed using PID-SEDA, unlike in the original research [2]. The high predictive performance of the prediction instrument developed using PID-SEDA, and not being able to use all hypotheses suggested by experts when using PID-CD, are possible explanations for this result.

Continuing the validation of PID-CD, the method should be tested in more (complex) domains, to be able to find the range of domains in which using the method adds significant value.
7.2 Recommendations

One of the recommendations for future work, mentioned in Section 6.2, is now discussed more elaborately.

Because of the data constraint of ex ante availability, data selection strategy 6 of the soft-inclusive method could not be used in training a predictive model. The experts' input to include data on internal and external appointments (hypotheses 5, 6 & 10) is therefore not incorporated into the final prediction instrument. It would be interesting to see how incorporating this information would affect the performance of a (similar) predictive model. Based on the hypotheses, it is assumed that the number of internal appointments positively correlates with the number of cars in the parking lot, and that the number of external appointments negatively influences the occupancy.

Sample data is collected and an exploratory data analysis is performed, to see if these assumptions are reflected in the data. A survey was completed by 250 experts of group 1 (experts with parking rights all day). The survey asked whether the particular expert had parked his/her car in the parking lot that day, whether he/she had appointments at The Edge which required physical presence, and whether he/she had appointments elsewhere on the given day. Data on all five working days is collected.
Table 7.1 shows the correlations between the number of employees who parked their car in the parking lot, and the number of people with internal or external appointments. Bold values indicate a significant correlation at p < 0.01.

TABLE 7.1: Correlation occupancy & appointments

                      Occupancy   Int. Appointments   Ext. Appointments
 Occupancy            -
 Int. Appointments    0.83        -
 Ext. Appointments    -0.10       -0.20               -

Correlation between occupancy and internal appointments is 0.83. This indicates the factor is likely to add predictive power to a prediction model. Future work could therefore be to create an architecture in which information on appointments, for example from employees' agendas, can be incorporated into developing the prediction model.
Appendix A

Correlation soft-exclusive factors
TABLE A.1: Correlation variables - bold values indicate significant value at α < 0.05 (a)
TABLE A.2: Correlation variables - bold values indicate significant value at α < 0.05 (b)
[Pairwise correlations between the soft-exclusive variables: Occupancy, WeekDay, Holiday, Wind, Sunshine, Rain, Temperature, Hour, the lagged occupancy observations 1-step/obs. 1-5 and 4-step/obs. 1-5 (with 1-step/obs. 4 = 4-step/obs. 1 and 1-step/obs. 5 = 4-step/obs. 2), 48-steps/obs. 1-5, and 48-steps/day 2-5.]
Appendix B
Performance measures - Soft-exclusive
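The four measures reported in the tables below can be computed from actual and predicted occupancy as follows; a minimal sketch with hypothetical values, not the case-study data. %≤5 is the share of predictions that are off by at most five cars.

```python
from math import sqrt

def performance(actual, predicted, tol=5):
    """MSE, RMSE, MAE and the percentage of errors within `tol` cars."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    pct = 100.0 * sum(abs(e) <= tol for e in errors) / n
    return {"MSE": mse, "RMSE": sqrt(mse), "MAE": mae, "%<=5": pct}

# Hypothetical occupancy counts and one-day-ahead predictions
actual    = [150, 180, 200, 160, 140]
predicted = [148, 186, 190, 163, 141]
print(performance(actual, predicted))  # MSE 30.0, RMSE ~5.48, MAE 4.4, %<=5 60.0
```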
TABLE B.1: Performance measures - Strategy 1

Measure   MLR     RF     DT      SVR
MSE       201.0   43.2   344.4   209.0
RMSE      14.2    6.6    18.6    14.4
MAE       7.6     2.9    12.3    8.0
%≤5       62.6    84.0   23.9    47.7
TABLE B.2: Performance measures - Strategy 2

Measure   MLR     RF     DT      SVR
MSE       74.4    20.6   369.4   81.9
RMSE      8.6     4.5    19.2    9.1
MAE       4.6     2.3    12.8    5.6
%≤5       75.4    86.6   22.9    59.3
TABLE B.3: Performance measures - Strategy 3

Measure   MLR      RF      DT      SVR
MSE       1803.0   182.7   505.4   2105.0
RMSE      42.5     13.5    22.5    45.9
MAE       28.1     6.4     13.4    24.0
%≤5       16.9     70.4    46.2    35.2
TABLE B.4: Performance measures - Strategy 4

Measure   MLR    RF      DT     SVR
MSE       1248   153.9   497    1462
RMSE      35.3   12.4    22.3   38.2
MAE       22.8   6.0     13.4   19.2
%≤5       21.4   70.6    44.2   39.0
TABLE B.5: Performance measures - Strategy 5

Measure   MLR    RF      DT     SVR
MSE       2271   150.2   757    2888
RMSE      47.7   12.3    27.5   53.7
MAE       33.7   5.9     17.0   25.8
%≤5       8.7    69.7    27.0   27.6
TABLE B.6: Performance measures - Strategy 6

Measure   MLR    RF      DT     SVR
MSE       2246   184.3   714    2879
RMSE      47.4   13.6    26.7   53.7
MAE       33.4   6.5     16.5   25.6
%≤5       9.6    67.9    22.5   29.1
TABLE B.7: Performance measures - Strategy 7

Measure   MLR    RF      DT     SVR
MSE       1892   104.8   682    2420
RMSE      43.5   10.2    26.1   49.2
MAE       31.8   4.9     15.8   24.8
%≤5       8.1    72.3    24.1   24.1
Appendix C
Performance measures - Soft-inclusive
TABLE C.1: Performance measures - Strategy 2

Measure   MLR    RF     DT     SVR
MSE       1892   129    682    2420
RMSE      43.5   11.4   26.1   49.2
MAE       31.8   5.2    15.8   24.8
%≤5       8.1    70.9   24.1   24.2
TABLE C.2: Performance measures - Strategy 3

Measure   MLR      RF      DT      SVR
MSE       1892.5   116.4   696.7   2382.1
RMSE      43.5     10.8    26.4    43.8
MAE       31.8     5.2     16.1    24.6
%≤5       8.4      72.6    20.2    25.2
TABLE C.3: Performance measures - Strategy 4

Measure   MLR    RF     DT     SVR
MSE       1882   114    689    2416.8
RMSE      43.4   10.7   26.2   49.2
MAE       31.6   5.2    16.0   24.7
%≤5       8.3    71.6   20.3   24.3
TABLE C.4: Performance measures - Strategy 5

Measure   MLR    RF      DT     SVR
MSE       1892   116.3   667    2419.5
RMSE      43.5   10.8    25.8   49.2
MAE       31.8   5.1     15.9   24.7
%≤5       8.4    71.8    21.1   24.1
TABLE C.5: Performance measures - Strategy 8

Measure   MLR    RF      DT     SVR
MSE       1882   102.7   682    2418.4
RMSE      43.4   10.1    26.1   49.2
MAE       31.6   5.0     16.0   24.7
%≤5       8.3    71.8    20.9   24.5
TABLE C.6: Performance measures - Strategy 9

Measure   MLR    RF      DT     SVR
MSE       1870   114.2   683    2411.5
RMSE      43.2   10.7    26.1   49.1
MAE       31.6   5.2     16.0   24.7
%≤5       8.4    71.9    20.5   24.2