entropy
Article
An Entropy-Based Approach for Evaluating Travel
Time Predictability Based on Vehicle Trajectory Data
Tao Xu 1,2, Xianrui Xu 1,2, Yujie Hu 3 and Xiang Li 1,2,*
1 Key Laboratory of Geographic Information Science (Ministry of Education), East China Normal University,
Shanghai 200241, China; [email protected] (T.X.); [email protected] (X.X.)
2 School of Geographic Sciences, East China Normal University, Shanghai 200241, China
3 Kinder Institute for Urban Research, Rice University, Houston, TX 77005, USA; [email protected]
* Correspondence: [email protected]; Tel.: +86-21-5434-1218
Academic Editor: Kevin H. Knuth
Received: 29 January 2017; Accepted: 7 April 2017; Published: 11 April 2017
Abstract: With the rapid development of intelligent transportation systems (ITS), travel time
prediction has attracted the interest of many researchers, and a large number of prediction
methods have been developed. However, the predictability of travel time series, which is the basic
premise for travel time prediction, has received less attention than prediction methodology.
Based on an analysis of the complexity of travel time series, this paper defines travel time
predictability as the probability of correct travel time prediction and proposes an entropy-based
method to measure its upper bound. Multiscale entropy is employed to quantify the complexity of
the travel time series, and the relationship between entropy and the upper bound of travel time
predictability is presented. Empirical studies are conducted with vehicle trajectory data from an
express road section to characterize travel time predictability. The effects of time scale,
tolerance, and series length on entropy and travel time predictability are analyzed, and some
suggestions about the accuracy of travel time predictability are discussed. Finally, travel time
predictability is compared with the actual prediction results of two prediction models, ARIMA and
BPNN. Experimental results demonstrate the validity and reliability of the proposed travel time
predictability.
Keywords: travel time predictability; multiscale entropy; travel time series; vehicle trajectory data
1. Introduction
In transportation, travel time refers to the time taken by vehicles to travel through a road network.
Accordingly, travel time forecasting predicts the time taken by a vehicle to travel between any
two points in a road network, which may benefit transportation planning, design, operations, and
evaluation [1]. With ever-increasing traffic congestion in metropolitan areas, travel time for every
traveler becomes complicated and irregular. How to accurately predict travel time, therefore, is of
great importance to researchers. In the past few decades, a variety of travel time prediction approaches
have been developed, such as linear regression, ARIMA, Bayesian nets, neural networks, decision
trees, support vector regression and Kalman filtering. Based on periodic fluctuations of
historical travel time series, these approaches use their own derivation rules to recognize
traffic patterns and predict future travel time on specific routes as precisely as possible. However, the
inevitable nonstationarity of travel time series, caused by highly self-adaptive and heterogeneous drivers
or by unpredictable and unusual circumstances, makes it difficult to accurately predict future travel
time. In addition to the performance of prediction models, the quality of the input data (historical travel
time series) affects the precision of travel time prediction.
Entropy 2017, 19, 165; doi:10.3390/e19040165
www.mdpi.com/journal/entropy
Nowadays, various data are employed for travel time prediction [2–4], e.g., vehicle trajectory
data, mobile phone data, smart card data, loop detector data, video monitoring data, and artificial
statistical data. Vehicle trajectory data are frequently collected from a large number of vehicles and
consist of a huge number of GPS sample points including geographic coordinates and sample time as
well as the identification of vehicle. Historical travel time series in specific trips can be extracted from
large amounts of vehicle trajectory data, and travel time prediction can be achieved. Based on vehicle
trajectory data, many existing research works have been performed for travel time prediction. Some of
them evaluate the performance of prediction models [3,5–7], and some of them focus on the reliability
or uncertainty of historical travel time series [8–13], but few of them examine the quality of data from
the perspective of prediction.
Travel time reliability is the consistency or dependability in travel times as measured day-to-day
or at different times of a day [14], which represents the temporal uncertainty experienced by travelers
in their trip [15] or the travel time distributions under various external conditions [16]. Travel time
reliability only presents the certainty of historical travel rules but not the accuracy of future travel times.
The aim of this paper is to evaluate the effectiveness of historical travel time series extracted
from vehicle trajectory data in travel time prediction. Specifically, we use the term “predictability” to
denote the results of evaluation. In traffic studies, some research efforts about predictability have been
presented. Yue et al. [17] used the cross-correlation coefficient between traffic flows collected at two
detector stations to explain short-term traffic predictability in the form of probability. Foell et al. [18]
analyzed the temporal distribution of ridership demand on various date conditions and used the
F-score, an effective metric of information retrieval, to measure the predictability of bus line usage.
Siddle [19] introduced the travel time predictability of two specific prediction models—auto-regressive
moving average and non-linear time series analysis—in the Auckland strategic motorway network;
and travel time predictability was used to explain the performance of specific prediction models.
In addition, the predictability of road section congestion (speed) [20] and human mobility [21,22]
has been measured by information entropy. To date, however, little literature addresses data-driven
measurement of travel time predictability. Therefore, we explore travel time predictability as a
means of evaluating how well a travel time series supports prediction.
In this paper, travel time predictability describes the possibility of correct time prediction based
on historical travel time series. It indicates the influence of the complexity of historical travel time
series on prediction results. For example, travel time predictability of a travel time series being 0.9
indicates that the travel time prediction accuracy cannot exceed 90% for the given travel time series no
matter how good a predictive model is. Regular commute patterns give us confidence about future
travel times, but random traffic flow often disrupts regular traffic patterns and brings uncertain
changes to travel time prediction.
Song et al. [22] explored the limits of predictability of human mobility and developed a method to
measure the upper bound of predictability based on information entropy [23] and Fano’s inequality [24].
In their research, mobile phone data were employed to quantify human mobility as discrete location
series, and the entropy of location series was measured by Lempel-Ziv data compression [25]; Fano’s
inequality was used to deduce the relationships between entropy and the upper bound of predictability.
A Lempel-Ziv data compression algorithm is a method to measure the complexity of the nonlinear
symbolic coarse-grained time series. However, travel time series usually has a continuous range of
values determined by the tradeoff between accuracy and grain size. Since the complexity of time series
is highly sensitive to grain size [26], the symbolization of travel time series is never a straightforward
task. Furthermore, the complexity of the Lempel-Ziv algorithm can only be used for qualitative analysis
and is not suitable for quantitative description [27]. Therefore, the Lempel-Ziv algorithm is not suitable
for measuring the complexity of travel time series, and the method proposed by Song et al. [22] is not
applicable to travel time predictability.
Inspired by Song et al. [22], this paper attempts to measure the complexity of travel time series
and assess travel time predictability. First, a travel time series is defined as a continuous variable,
and multiscale entropy (MSE) [28] in different scales is measured to present the true entropy of the
given travel time series. Then, the upper bound of predictability is calculated based on the method
of Song et al. [22]. Usually, MSE is used to assess the complexity of multi-value time series from
the perspective of multi-time scales and has been successful in many fields. However, MSE often
produces some inaccurate estimations or undefined entropy which could largely bias the evaluation
of the complexity of travel time series. Wu et al. [29] proposed the refined composite multiscale
entropy (RCMSE) algorithm based on MSE, and found that RCMSE increases the accuracy of entropy
estimation and reduces the generation of undefined entropy.
To this end, our contributions are to integrate the two methods proposed by Wu et al. [29]
and Song et al. [22], and apply them to evaluate the features of entropy and predictability of travel
time series.
This paper applies the above techniques to an express road section with heavy traffic flow in
Shanghai, China. Taxi cab trajectory data recorded in April 2015 are used to present the average taxi
mobility trends and evaluate travel time predictability. Mobility research commonly focuses
on the mobility patterns of independent individuals (persons or vehicles) based on data such as mobile
phone data [22,30,31] and GPS data [22,32,33]. In contrast, travel time predictability emphasizes the
expected success rate of travel time prediction based on a given travel time series built from individual
or statistical values of travel time, and the statistical trends of travel times from multiple vehicles may
present traffic patterns more effectively than individual mobility, which may be affected by unpredictable
driving behaviors. In this paper, massive trip data in one route are acquired from a large amount of
taxi trajectory data, and the 5 min travel time series is averaged to present the statistical trends of travel
time. We then assess the entropy and predictability based on the travel time series. Next, we discuss
the influences of time scales, tolerance, and series length on entropy and travel time predictability.
Finally, we employ two prediction models, ARIMA and BPNN, to predict the future travel time of the
selected route for model validation.
The rest of this paper is organized as follows. The next section defines the methodology of travel
time predictability. The study area, data sources and results of a case study are presented in Section 3.
Section 4 concludes the paper.
2. Materials and Methods
Historical travel time series are extracted from vehicle trajectory data, which include a large number
of sample points. Each sample point consists of a vehicle ID, a time stamp, longitude, latitude,
speed, etc. To obtain the travel time of a specific trip, the vehicle trajectory data are matched to
specific routes using the method proposed by Li et al. [34]. The travel time of a trip is the difference
between the time stamps of its first sample point (at the origin) and its last sample point (at the
destination), and the set of travel times for all trips is established in this way. Then, based on a
predefined departure time interval, the travel times of all trips whose departure falls into the same
interval are averaged to generate the travel time series.
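The averaging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `travel_time_series`, the representation of a trip as a `(departure, arrival)` pair of time stamps in seconds, and the default 300 s interval are all hypothetical choices made for the example.

```python
from collections import defaultdict
from statistics import mean


def travel_time_series(trips, interval_s=300):
    """Average trip travel times per departure interval.

    trips: iterable of (departure_ts, arrival_ts) pairs in seconds (assumed format).
    Returns a dict mapping interval index -> mean travel time in seconds.
    Intervals without any trip are simply absent from the result.
    """
    buckets = defaultdict(list)
    for depart, arrive in trips:
        # A trip belongs to the interval that contains its departure time.
        buckets[int(depart // interval_s)].append(arrive - depart)
    return {k: mean(v) for k, v in sorted(buckets.items())}


# Four hypothetical trips: two depart in the first 5 min, two in the second.
trips = [(0, 400), (120, 560), (310, 800), (599, 1000)]
series = travel_time_series(trips)
```

How empty intervals should be filled (for example by interpolation) is left open here, since the text does not specify it.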
For any travel time series with the same routes, we employ the RCMSE algorithm [29] to calculate
their multiscale entropy values and evaluate the complexity. Then, travel time predictability is defined,
and the relationship between the upper bound of travel time predictability and the entropy of the
historical travel time series is presented.
2.1. Entropy of Travel Time Series
Multiscale entropy of the travel time series is measured by the refined composite multiscale
entropy (RCMSE) algorithm.
Let X = { X1 , X2 , . . . , X N } denote a travel time series.
Step 1. Construct m-dimensional vectors $X_i^m$ by using Equation (1).
$$X_i^m = \{X_i, X_{i+1}, \ldots, X_{i+m-1}\}, \quad 1 \le i \le N - m. \tag{1}$$
Step 2. Calculate the distance $d_{ij}^m$ between any two vectors $X_i^m$ and $X_j^m$ by using
Equation (2), where $\|\cdot\|_\infty$ is the infinity norm (the largest absolute difference between
corresponding elements).
$$d_{ij}^m = \| X_i^m - X_j^m \|_\infty, \quad 1 \le i, j \le N - m, \; i \ne j. \tag{2}$$
Step 3. Let r be the tolerance level. If $d_{ij}^m \le r$, $X_i^m$ and $X_j^m$ are called an
m-dimensional matched vector pair. $n^m$ represents the total number of m-dimensional matched vector
pairs. Similarly, $n^{m+1}$ is the total number of (m + 1)-dimensional matched vector pairs.
Step 4. The sample entropy (SampEn) is defined by Equation (3).
$$\mathrm{SampEn}(X, m, r) = -\ln \frac{n^{m+1}}{n^{m}}. \tag{3}$$
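As an illustration of Steps 1 to 4, the following is a minimal Python sketch of sample entropy; it is not the authors' implementation. The function names are hypothetical, r is assumed to be an absolute tolerance (for example 0.1σ computed by the caller), and unordered pairs (i < j) are counted, which leaves the ratio in Equation (3) unchanged.

```python
import math


def count_matches(x, m, r, n_templates):
    """Count template pairs (i < j) of length m whose max-norm distance is <= r."""
    count = 0
    for i in range(n_templates):
        for j in range(i + 1, n_templates):
            if max(abs(x[i + k] - x[j + k]) for k in range(m)) <= r:
                count += 1
    return count


def sample_entropy(x, m=2, r=0.1):
    """SampEn(x, m, r) = -ln(n^{m+1} / n^m); NaN when the ratio is undefined."""
    n_templates = len(x) - m  # N - m templates for both lengths, as in Equation (1)
    n_m = count_matches(x, m, r, n_templates)
    n_m1 = count_matches(x, m + 1, r, n_templates)
    if n_m == 0 or n_m1 == 0:
        return float("nan")  # the "undefined entropy" case mentioned in the text
    return -math.log(n_m1 / n_m)
```

A strictly periodic series gives zero entropy because every matched pair of length m is still matched at length m + 1, whereas a series with no matched pairs at all yields an undefined value.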
Step 5. Let $y_k^\tau = \{y_{k,1}^\tau, y_{k,2}^\tau, \ldots, y_{k,p}^\tau\}$ be the k-th coarse-grained
time series of X defined in Equation (4), where p is the length of the coarse-grained time series, and
τ is a scale factor. To obtain $y_k^\tau$, the original time series X, starting from its k-th point, is
divided into consecutive non-overlapping segments of length τ, which yields about N/τ segments. The j-th
element $y_{k,j}^\tau$ of the k-th coarse-grained time series is the mean value of the j-th segment.
$$y_{k,j}^\tau = \frac{1}{\tau} \sum_{i=(j-1)\tau+k}^{j\tau+k-1} X_i, \quad 1 \le j \le \frac{N}{\tau}, \; 1 \le k \le \tau. \tag{4}$$
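The coarse-graining of Equation (4) can be sketched in a few lines of Python. The function name `coarse_grain` is hypothetical, and the sketch assumes that only complete segments are kept, so for k > 1 the coarse-grained series may be one element shorter than N/τ.

```python
def coarse_grain(x, tau, k):
    """Return the k-th (1-based) coarse-grained series of x at scale factor tau.

    Element j is the mean of x[(j-1)*tau + k .. j*tau + k - 1] in the 1-based
    indexing of Equation (4). Incomplete trailing segments are dropped (assumption).
    """
    start = k - 1                      # convert the 1-based offset k to 0-based
    p = (len(x) - start) // tau        # number of complete segments
    return [sum(x[start + j * tau:start + (j + 1) * tau]) / tau for j in range(p)]
```

For example, with x = [1, 2, 3, 4, 5, 6] and τ = 2, the first coarse-grained series (k = 1) averages the pairs (1, 2), (3, 4), (5, 6), and the second (k = 2) averages (2, 3), (4, 5).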
Step 6. Classical multiscale entropy, MSE(X, τ, m, r), is defined by Equation (5).
$$\mathrm{MSE}(X, \tau, m, r) = \mathrm{SampEn}(y_1^\tau, m, r). \tag{5}$$
Step 7. RCMSE is defined in Equation (6), where $n_{k,\tau}^m$ is the total number of m-dimensional
matched vector pairs in the k-th coarse-grained time series at scale factor τ.
$$\mathrm{RCMSE}(X, \tau, m, r) = -\ln \frac{\sum_{k=1}^{\tau} n_{k,\tau}^{m+1}}{\sum_{k=1}^{\tau} n_{k,\tau}^{m}}. \tag{6}$$
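Steps 5 to 7 can be combined into a short, self-contained Python sketch of RCMSE. It is an illustrative reading of Equations (4) and (6) rather than the authors' or Wu et al.'s code: all function names are hypothetical, r is assumed to be an absolute tolerance supplied by the caller, incomplete trailing segments are dropped during coarse-graining, and unordered pairs are counted.

```python
import math


def coarse_grain(x, tau, k):
    """k-th (1-based) coarse-grained series of x at scale factor tau."""
    start = k - 1
    p = (len(x) - start) // tau  # complete segments only (assumption)
    return [sum(x[start + j * tau:start + (j + 1) * tau]) / tau for j in range(p)]


def matched_pairs(y, m, r):
    """Return (n_m, n_m_plus_1): matched template pairs of length m and m + 1."""
    n = len(y) - m  # same number of templates for both lengths
    n_m = n_m1 = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(y[i + t] - y[j + t]) for t in range(m)) <= r:
                n_m += 1
                # A pair matched at length m + 1 must also match in its extra element.
                if abs(y[i + m] - y[j + m]) <= r:
                    n_m1 += 1
    return n_m, n_m1


def rcmse(x, tau, m=2, r=0.1):
    """Refined composite multiscale entropy, following Equation (6)."""
    total_m = total_m1 = 0
    for k in range(1, tau + 1):
        n_m, n_m1 = matched_pairs(coarse_grain(x, tau, k), m, r)
        total_m += n_m
        total_m1 += n_m1
    if total_m == 0 or total_m1 == 0:
        return float("nan")  # undefined entropy
    return -math.log(total_m1 / total_m)
```

Because the matched-pair counts are summed over all τ coarse-grained series before the logarithm is taken, a single coarse-grained series without matches does not by itself make the result undefined, which is the property the text attributes to RCMSE.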
Compared with SampEn and MSE which are more likely to induce undefined entropy, the RCMSE
algorithm can estimate entropy more accurately. In Equation (7), the true entropy S( X ) of the travel
time series X is denoted by RCMSE( X, τ, m, r ). S( X ) is roughly equal to, with time scale τ, the
negative logarithm of the mean of the conditional probability of new patterns (i.e., the distance
between vectors is greater than r) when the dimension of the pattern changes (i.e., m to m + 1). S( X )
describes the degree of irregularity of the travel time series at different time scales and is proportional
to the complexity of the travel time series. Based on Equation (7), the true entropy of the travel time
series with different time scales can be achieved.
$$S(X) = \mathrm{RCMSE}(X, \tau, m, r). \tag{7}$$
2.2. Travel Time Predictability
Based on the historical travel time series, the predictability of travel time is defined as the
probability Π that an algorithm can correctly predict future travel time. Again, $X = \{X_1, X_2, \ldots, X_N\}$
represents a historical travel time series, $\varphi$ is the actual travel time of the (N + 1)th interval,
$\hat{\varphi}$ is the expected value, and $\varphi_\alpha$ is the estimated value based on model α. Let π be
the probability of $\varphi = \hat{\varphi}$ with a given historical travel time series X. Equation (8) shows
that π is the probability of the most likely value of the subsequent travel time, and it is an upper bound
on the probability that a predicted value is correct.
In other words, any prediction based on historical series X cannot do better than the one having the
true travel time being equal to the expected value, ϕ = ϕ̂.
$$\pi = P(\varphi = \hat{\varphi} \mid X) = \sup_x \{ P(\varphi_\alpha = x \mid X) \} \ge P(\varphi = \varphi_\alpha \mid X). \tag{8}$$
The definition of predictability Π for a travel time series with a length of N is given by Equation (9),
where P(X) is the probability of observing a particular historical travel time series X. ∑ πP(X) presents
the best success rate to predict the (N + 1)th travel time based on X. Π can be viewed as the averaged
predictability (Song et al. [22]) of a historical travel time series.
$$\Pi = \lim_{N \to \infty} \frac{1}{N} \sum_{i}^{N} \sum \pi P(X). \tag{9}$$
Next, we relate entropy S(X) to predictability Π to explore the upper bound of predictability.
Based on Fano’s inequality [24], the relationship between entropy and predictability is shown in
Equation (10), which indicates that the complexity of X is less than or equal to the sum of the complexity
of successfully predicting, S(Π), and the complexity of failing to predict, (1 − Π)log2(n − 1), where n is
the number of values of X. Travel time is measured in seconds, and n denotes the total time (seconds) of X.
The equality in Equation (10) holds when Π reaches its maximum value $\Pi^{\max}$. S(Π) is presented in
Equation (11), and its relationship with the upper bound of travel time predictability is presented in
Equation (12). Based on the known S(X) from Equation (7), we can traverse Π from 0 to 1 with a given
accuracy target to obtain the solution $\Pi^{\max}$ of Equation (12).
$$S(X) \le S(\Pi) + (1 - \Pi)\log_2(n - 1) \tag{10}$$
$$S(\Pi) = -\Pi \log_2 \Pi - (1 - \Pi)\log_2(1 - \Pi) \tag{11}$$
$$S(X) = -\Pi^{\max}\log_2 \Pi^{\max} - (1 - \Pi^{\max})\log_2(1 - \Pi^{\max}) + (1 - \Pi^{\max})\log_2(n - 1). \tag{12}$$
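The traversal used to solve Equation (12) can be sketched as a simple grid search. This is a minimal illustration, not the authors' code: the function names are hypothetical, the entropy s is assumed to be supplied in the units used in Equation (12), and the solution is taken as the largest grid value of Π for which the right-hand side of Equation (10) is still at least s.

```python
import math


def fano_rhs(p, n):
    """Right-hand side of Equation (10): S(p) + (1 - p) * log2(n - 1)."""
    binary_entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return binary_entropy + (1 - p) * math.log2(n - 1)


def max_predictability(s, n, step=0.001):
    """Largest grid value p in (0, 1) with fano_rhs(p, n) >= s, or None."""
    best = None
    n_steps = int(round(1 / step))
    for i in range(1, n_steps):        # p = step, 2*step, ..., 1 - step
        p = i * step
        if fano_rhs(p, n) >= s:
            best = p                   # keep the largest p that satisfies the bound
    return best
```

With a step of 0.001 the search covers 0.001 to 0.999, so a series with very low entropy returns the grid maximum 0.999, and an entropy above log2(n) has no solution and returns None in this sketch.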
3. Experiments
3.1. Study Area and Data
An express road section (about 6.74 km) in Shanghai, China is selected as the study area. As shown
in Figure 1, the selected route is a traffic corridor with heavy traffic and is a part of the express road
system of Shanghai, which is represented by gray lines. Since the selected section consists of closed road
segments, traffic flow is not interrupted by intersection delays, and the fluctuations of travel
time coincide with traffic patterns.
Taxi cab trajectory data associated with the selected route in April 2015 are extracted as the real
travel time series. Note that we only use the trips occupied by passengers and discard the patrolling
ones (a state of seeking clients in the street). Each trip consists of an origin, a destination, and a route.
The total number of passenger trips is 20,430, and averages about 29 per hour, which is sufficient
to represent the dynamics of travel time. In addition to the route length, the travel time of a taxi
cab trajectory may also be affected by heterogeneous driving behavior; therefore, the distribution
of travel time derived from individual trips could be rather complex [6,30,32]. Instead, the travel
time estimated from multiple vehicles can well reflect the average trend of travel time and hence
more effectively characterize traffic patterns than individual mobility. We average the travel times
of the trips within each 5 min interval to obtain a 5 min travel time series from the taxi cab trajectory
data in April 2015. Figure 2 shows the 5 min travel time series with 8640 sample points. Let
X = {Xi | 0 < i < N, N = 8640} depict the 5 min travel time series, where Xi is the i-th sample point,
and N is the number of points of X.
Figure 1. The selected express road section.
Given that the key of this paper is to analyze travel time predictability from the aspect of
prediction, the average taxi mobility trends are presented by the 5 min travel time series and have
nothing to do with individual taxi behavior. The analysis and evaluation of entropy and predictability
are given below.
Figure 2. The 5 min travel time series from taxi trajectory data in April 2015.
3.2. Entropy and Predictability
To evaluate the complexity of travel time series, the daily entropy, named $S^d$, and the weekly
entropy, named $S^w$, are calculated with 24-hour subseries, i.e., 288 consecutive points, and 168 h
(7 days) subseries, i.e., 2016 consecutive points, respectively, from the 5 min travel time series.
We set the offset between adjacent subseries to 1 h (12 points) to obtain many subseries. Then, we let
$S^d = \{ S_j^d(X') \mid X' = \{X_{j \times 12}, X_{j \times 12 + 1}, \ldots, X_{j \times 12 + 288 - 1}\},\ 0 < j < 696 \}$,
where $X'$ is a subset of X with 288 consecutive points, and we let
$S^w = \{ S_j^w(X'') \mid X'' = \{X_{j \times 12}, X_{j \times 12 + 1}, \ldots, X_{j \times 12 + 2016 - 1}\},\ 0 < j < 552 \}$,
where $X''$ is a subset of X with 2016 consecutive points.
Let scale factor τ = 1, tolerance r = 0.1σ, and dimension m = 2, where σ is the standard
deviation of the 5 min travel time series. The statistical results of the values of $S^d$ and $S^w$ are
shown in Figure 3. It can be seen that the values of $S^d$ are scattered in the range of 0.6–3.4 and the
values of $S^w$ are compact in the range of 1.6–2.3. The remarkable difference between $S^d$ and $S^w$
means that the complexity of daily travel time series tends to change frequently; by contrast, the
complexity of weekly travel time series is stable. $S^w$ peaks at about 1.7, indicating that, on average,
the probability of new 2-dimension (m = 2) patterns in a weekly travel time series is
$e^{-1.7} \approx 0.183$. $S^d$ peaks at about 1.2, and the probability of new patterns in a daily travel
time series is $e^{-1.2} \approx 0.301$, indicating that weekly travel time series with greater complexity
have lower probability of new patterns than relatively simple daily travel time series.
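The daily and weekly subseries described above are sliding windows over the 5 min series. The following minimal sketch shows one way to generate them; the function name `sliding_windows` is hypothetical, and the sketch keeps every complete window, so the number of windows it returns may differ by one or two from the index ranges stated in the text, depending on the indexing convention.

```python
def sliding_windows(x, width, step):
    """Return all complete windows of `width` points, offset by `step` points."""
    return [x[s:s + width] for s in range(0, len(x) - width + 1, step)]


# Stand-in for the 8640-point 5 min series (30 days x 288 points per day).
series = list(range(8640))
daily = sliding_windows(series, width=288, step=12)     # 24 h windows, 1 h offset
weekly = sliding_windows(series, width=2016, step=12)   # 168 h windows, 1 h offset
```

An entropy estimate would then be computed for each window with τ = 1, m = 2, and r = 0.1σ, where σ is the standard deviation of the full series as stated in the text.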
Figure 3. The entropy in the weekly travel time series and the daily travel time series.
Travel time predictability is the probability of an accurate prediction, which is determined by
the complexity (entropy) and the range of the travel time series. In our experiments, we set the
accuracy to 0.001 to calculate the upper bound of travel time predictability $\Pi^{\max}$ by Equation (12).
Therefore, the optimal (maximum) $\Pi^{\max}$ can be achieved by traversing in the range of 0.001–0.999.
The statistical results of the upper bound of travel time predictability in weekly travel time
series, $\Pi^w$, and daily travel time series, $\Pi^d$, are shown in Figure 4. Since travel time
predictability is influenced by entropy and the range of series, weekly travel time series with higher
entropy have lower predictability, peaking at 0.95, and daily travel time series with lower entropy have
higher predictability, peaking at 0.99. This demonstrates that the more complex the travel time series is,
the less predictable it is; in other words, the less likely it is to be predicted correctly.
Figure 4. The upper bound of travel time predictability in the weekly travel time series and the daily
travel time series.
3.3. Analysis and Discussion
As formulated in Equations (7) and (12), the features of travel time predictability are influenced
by three key factors, i.e., scale factor τ, tolerance r, and series length. In this subsection, we analyze
the effects of these factors on the predictability of travel time series, and discuss the features and
trends of $\Pi^{\max}$. The validity of the proposed travel time predictability is verified by comparing
$\Pi^{\max}$ and the prediction results of future travel time from two typical prediction models, ARIMA
and BPNN.
3.3.1. Time Scales
The scale factor τ is the key parameter of RCMSE. It can be used to analyze the complexity and
predictabilityofoftravel
5 min
travel
series with
time scales
1–20. The
entropyand
is calculated
by
predictability
time
seriestime
in multiple
time scales.
Figureof
the entropy
predictability
predictability of 5 min
travel
time
series with
time scales
of5 shows
1–20. The
entropy is calculated
by
Equation
(7)
with
=
0.1
,
and
=
2.
is
calculated
by
Equation
(12).
As
increases,
entropy
of
5
min
travel
time
series
with
time
scales
of
1–20.
The
entropy
is
calculated
by
Equation
(7)
with
Equation (7) with = 0.1 , and
= 2.
is calculated by Equation (12). As increases, entropy
andand
predictability
There arebymore
“new(12).
patterns”
in the travel
time
series
longer time
rrises
= 0.1σ,
m = 2. Πmaxfalls.
is calculated
Equation
As τ increases,
entropy
rises
andof
rises
and predictability
falls.
There are more
“new patterns”
in the travel
time
series
ofpredictability
longer time
scales.
The are
complexity
of the
travel time
series
oftime
longer
time
scales
istime
greater
than
those
of shorter
falls.
There
more
“new
patterns”
in
the
travel
series
of
longer
scales.
The
complexity
of
scales. The complexity of the travel time series of longer time scales is greater than those of shorter
time
scales,
and
the
travel
time
series
of
longer
time
scales
are
more
difficult
to
correctly
predict.
the
time
of longer
scales
is greater
of shorter
time
andpredict.
the travel
timetravel
scales,
andseries
the travel
timetime
series
of longer
time than
scalesthose
are more
difficult
to scales,
correctly
time series of longer time scales are more difficult to correctly predict.
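To illustrate how the scale factor τ, the tolerance r = 0.1σ, and the embedding dimension m = 2 enter the entropy calculation, the following is a minimal Python sketch. It assumes the standard refined composite multiscale entropy definition of Wu et al. [29]; Equation (7) is not reproduced here. The function names (coarse_grain, count_matches, rcmse), the use of the population standard deviation of the original series for σ, the "distance ≤ r" matching rule, and the synthetic series are illustrative assumptions, not the authors' implementation.

```python
import math
import random
import statistics


def coarse_grain(x, tau, k):
    """Average consecutive, non-overlapping windows of length tau, starting at offset k."""
    n = (len(x) - k) // tau
    return [sum(x[k + j * tau:k + (j + 1) * tau]) / tau for j in range(n)]


def count_matches(y, length, n_templates, r):
    """Count template pairs (i < j) of the given length whose Chebyshev distance is <= r."""
    count = 0
    for i in range(n_templates):
        for j in range(i + 1, n_templates):
            if all(abs(y[i + d] - y[j + d]) <= r for d in range(length)):
                count += 1
    return count


def rcmse(x, tau, m=2, r_factor=0.1):
    """Refined composite multiscale entropy of x at scale tau (assumed definition)."""
    # Assumption: sigma is the population standard deviation of the original series.
    r = r_factor * statistics.pstdev(x)
    n_m = 0   # matched pairs of length m, summed over all tau offsets
    n_m1 = 0  # matched pairs of length m + 1, summed over all tau offsets
    for k in range(tau):
        y = coarse_grain(x, tau, k)
        n_templates = len(y) - m  # same template count for both lengths
        n_m += count_matches(y, m, n_templates, r)
        n_m1 += count_matches(y, m + 1, n_templates, r)
    if n_m == 0 or n_m1 == 0:
        return math.nan  # entropy is undefined when no matches are found
    return -math.log(n_m1 / n_m)


if __name__ == "__main__":
    # Hypothetical one-day series of 288 five-minute travel times (seconds).
    rng = random.Random(0)
    series = [rng.gauss(300.0, 30.0) for _ in range(288)]
    for tau in (1, 2, 3):
        print(tau, rcmse(series, tau))
```

The pairwise comparison is quadratic in the series length, which is acceptable for a sketch but would need a faster neighbor search for multi-week series.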
Figure 5. Entropy and predictability of 5 min travel time series with scale factor of 1–20.
3.3.2. Tolerance
Tolerance r is a key factor in evaluating the complexity of travel time series and constrains the
contributions of travel time fluctuations to the complexity. We evaluate the effect of r in six time
scales, i.e., τ = 2, 4, 6, 8, 10, and 12. Figures 6 and 7 show the changing trends of entropy and
travel time predictability, respectively, of the 5 min travel time series with r from 0.01σ to 0.16σ.
Since the travel time predictability of all six time scales reaches the maximum value of 0.999 when r
is equal to 0.16σ, the test range of r is 0.01σ to 0.16σ. In Figure 6, as r increases, the entropy
gradually becomes lower because differences between travel times that are smaller than r are ignored.
Among the six time scales, except for τ = 2, the entropy values are hard to distinguish at lower r,
whereas clearly ordered entropy values can be found at higher r values (about 0.06σ to 0.16σ).
Figure 6. The effectiveness of r to entropy.
Figure 7 shows Π^max of the 5 min travel time series with six time scales. There is a negative
correlation between Π^max and S(X). Π^max is higher and S(X) is lower in shorter time scales, e.g.,
τ = 2, whereas Π^max is lower and S(X) is higher in longer time scales. At the same tolerance level,
travel time series with a lower τ value are easier to predict than those with a higher τ. As r
expands, Π^max gradually increases to 0.999. Π^max is limited by r. The higher r is, the more
predictive error is tolerated, the greater Π^max is, and the more often a prediction counts as
accurate.
Figure 7. The effectiveness of r to travel time predictability.
To examine when perfect prediction is theoretically possible, Figure 8 shows the tolerance required
for perfect prediction in multiple time scales. τ ranges from 1 to 20. The black line traces the
tolerance r at which Π^max reaches 0.999 for each τ. For example, the next travel time at τ = 6 can be
accurately predicted with r = 0.14σ by an appropriate prediction model. The growth trend of r
indicates that series with a higher τ are more difficult to predict and that their perfect prediction
needs greater tolerance ranges.
Figure 8. The effectiveness of r to perfect prediction.
3.3.3. Series Length
Next, we analyze the influence of series length on entropy and predictability. Figure 9 shows the
entropy of travel time series in six time scales with different series lengths. The series length of
a one-day 5 min travel time series is 288, and so on. Meanwhile, r = 0.1σ and m = 2. It can be seen
that the highest entropy values occur in the 2 day or 3 day travel time series, and the more stable
trends are in the >14 day (i.e., two weeks) travel time series. We can consider the entropy of a
>14 day travel time series to be roughly independent of series length. Table 1 shows the statistics
of entropy that support these findings, where S̄(X) is the average value of entropy of all travel time
series, and sdS is the standard deviation of all entropy values. The sdS of >14 days is much lower
than that of <14 days, and the most stable series are the >14 day travel time series with τ = 2 and
τ = 4. Therefore, we can demonstrate that the complexity of the >14 day travel time series is stable
and is independent of series length.
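The series-length bookkeeping and the stability statistics described above can be sketched as follows. This is only an illustration: the helper names, the 14-day split argument, the use of the sample standard deviation, and the entropy values in the usage example are hypothetical and are not taken from the paper's data.

```python
import statistics

# A 5 min sampling interval gives 24 * 60 / 5 = 288 observations per day.
SAMPLES_PER_DAY = 24 * 60 // 5


def series_length(days):
    """Number of 5 min observations in a series covering the given number of days."""
    return days * SAMPLES_PER_DAY


def stability_summary(entropy_by_days, split_days=14):
    """Mean entropy over all series and standard deviation below / at or above split_days.

    Assumption: the sample standard deviation is used; the paper does not state which
    estimator underlies sdS.
    """
    values = list(entropy_by_days.values())
    short = [s for d, s in entropy_by_days.items() if d < split_days]
    long_ = [s for d, s in entropy_by_days.items() if d >= split_days]
    return {
        "mean": statistics.mean(values),
        "sd_short": statistics.stdev(short),
        "sd_long": statistics.stdev(long_),
    }


if __name__ == "__main__":
    # Hypothetical entropy values keyed by series length in days.
    example = {1: 1.0, 2: 1.4, 3: 1.2, 14: 1.2, 15: 1.2, 16: 1.2}
    print(series_length(1), series_length(14))
    print(stability_summary(example))
```

A much smaller standard deviation for the longer series than for the shorter ones is the pattern the text reads as stability.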
Figure 9. Entropy of travel time series with different series lengths.
Similarly, Figure 10 shows that predictability is lowest for the 2 day or 3 day travel time series
and that the predictability of the >14 day travel time series is stable and independent of series
length. In Table 1, Π̄^max denotes the average value of travel time predictability, and sdΠ^max denotes
the standard deviation of travel time predictability. The large difference between the sdΠ^max of the
>14 day travel time series and that of the <14 day travel time series reflects the stable
predictability of the >14 day travel time series. In addition, the most stable predictability is
found in the >14 day travel time series with τ = 2 and τ = 4.
Figure 10. Travel time predictability of travel time series with different series lengths.
Table 1. The statistics of entropy and travel time predictability.
τ     S̄(X)     sdS (< 14 Days)   sdS (≥ 14 Days)   Π̄^max    sdΠ^max (< 14 Days)   sdΠ^max (≥ 14 Days)
2     1.1752   0.0896            0.0126            0.9896   0.0045                0.0008
4     1.1966   0.0755            0.0125            0.9885   0.0040                0.0007
6     1.2334   0.0623            0.0148            0.9863   0.0040                0.0011
8     1.2415   0.1108            0.0167            0.9856   0.0058                0.0011
10    1.3261   0.1311            0.0193            0.9805   0.0089                0.0013
12    1.3571   0.0953            0.0242            0.9786   0.0066                0.0016
3.3.4. The Validity of Travel Time Predictability
To validate the travel time predictability, two prediction models, i.e., AutoRegressive Integrated
Moving Average (ARIMA) [35] and Back Propagation Neural Network (BPNN) [36], are employed to predict
future travel time.
The ARIMA model is a method for time series analysis and prediction. Since travel time series have
obvious fluctuation differences between weekdays and weekends, we use the seasonal ARIMA (SARIMA)
model, denoted ARIMA(p, d, q)(P, D, Q)_m, to predict future travel time, where p is the order of the
autoregressive (AR) part, q is the order of the moving average (MA) part, d is the degree of
difference for reducing the non-stationarity of time series, m is the number of periods per season,
and P, D, Q refer to the AR, differencing, and MA terms for the seasonal part of the ARIMA model.
Because the travel time series in our experiments are stationary and change with a weekly period, we
set d = 0, D = 0, and m = 7. By testing the autocorrelation function (ACF) and the partial
autocorrelation function (PACF) of the complete and seasonal parts of the travel time series, we set
p = 3, q = 1, P = 1, and Q = 2. Then, we use ARIMA(3, 0, 1)(1, 0, 2)_7 to predict future travel time
in the selected route.
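For reference, the selected model can be written out in backshift-operator form. The block below is a sketch under the usual SARIMA conventions rather than the authors' own notation: B is the backshift operator (B y_t = y_{t-1}), y_t is the travel time series, ε_t is white noise, the coefficient symbols are placeholders, the plus sign on the moving average terms is one of two common conventions, and any constant or mean term is omitted. Because d = D = 0, no differencing factors appear.

```latex
% ARIMA(3,0,1)(1,0,2)_7 under assumed sign conventions; coefficients are placeholders.
\left(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3\right)\left(1 - \Phi_1 B^{7}\right) y_t
  = \left(1 + \theta_1 B\right)\left(1 + \Theta_1 B^{7} + \Theta_2 B^{14}\right)\varepsilon_t
```

The left-hand side combines the non-seasonal AR part of order p = 3 with the seasonal AR part of order P = 1, and the right-hand side combines the MA part of order q = 1 with the seasonal MA part of order Q = 2, both at period m = 7.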
As a neural network method, the BPNN model includes an input layer, a hidden layer, and an output
layer. It can learn and store large amounts of input–output mapping by model training to represent
and predict dynamic and non-linear processes. In our experiments, the BPNN model has three inputs,
i.e., the date, the time of day, and the day of week, and one output, i.e., the travel time. The
number of nodes in the hidden layer is 7, the learning rate (η) is 0.9, and the momentum factor (α)
is 0.7.
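To make the roles of the learning rate η and the momentum factor α concrete, the sketch below shows one weight update of gradient descent with momentum, the rule commonly used with back-propagation. The function name and the flat-list representation of weights are hypothetical, and the paper does not state the exact update form used, so this is an assumed textbook variant rather than the authors' training code.

```python
def momentum_step(weights, grads, velocity, eta=0.9, alpha=0.7):
    """Apply one gradient-descent update with momentum.

    new_velocity = alpha * velocity - eta * grad
    new_weight   = weight + new_velocity
    Defaults mirror the learning rate (0.9) and momentum factor (0.7) quoted in the text.
    """
    new_velocity = [alpha * v - eta * g for v, g in zip(velocity, grads)]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity


if __name__ == "__main__":
    # Hypothetical single weight with a constant gradient of 0.5.
    w, v = [1.0], [0.0]
    for _ in range(2):
        w, v = momentum_step(w, [0.5], v)
        print(w, v)
```

With a constant gradient the step size grows from one update to the next, which is the accelerating effect the momentum factor is meant to provide.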
Figure 11 shows the errors of travel time prediction with ARIMA and BPNN models in the 5 min travel
time series. We set r = 0.1σ (about 22 s), m = 2, and τ = 2. We predict 50 times with ARIMA and BPNN,
respectively, and let e be the absolute value of the difference between the predicted value and the
actual value. The dashed line indicates the tolerance r = 0.1σ. It can be seen that most dots are
below it. The statistical results of travel time prediction of the 5 min travel time series are shown
in Table 2. Π^max denotes the average predictability of 50 predictions in the 5 min travel time
series. If e is less than r (below the dashed line of Figure 11), it is a successful prediction. The
number of successful predictions is 40 with ARIMA, and 41 with BPNN. Compared with the Π^max of
0.952, the success rates of prediction are lower, while their average errors, 13.46 s and 12.42 s,
are lower than the tolerance r of about 22 s.
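The success criterion above (a prediction succeeds when e is less than r) can be sketched as a small helper. The function name and the sample values are hypothetical and serve only to show how the success count, success rate, and average error relate to one another.

```python
def evaluate_predictions(predicted, actual, r):
    """Summarize predictions against a tolerance r (same unit as the travel times)."""
    # e is the absolute difference between predicted and actual travel time.
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    # A prediction counts as successful when its error is strictly below r.
    successes = sum(1 for e in errors if e < r)
    return {
        "successes": successes,
        "success_rate": successes / len(errors),
        "mean_error": sum(errors) / len(errors),
    }


if __name__ == "__main__":
    # Hypothetical travel times in seconds, with the tolerance of about 22 s from the text.
    predicted = [300.0, 320.0, 250.0, 410.0]
    actual = [310.0, 300.0, 280.0, 400.0]
    print(evaluate_predictions(predicted, actual, r=22.0))
```

The resulting success rate is the quantity compared with Π^max, which acts as its theoretical upper bound.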
Figure 11. The errors of travel time prediction in the 5 min travel time series.
Table 2. The statistics of prediction with the 5 min travel time series.
Prediction Model   Number of Prediction   Number of Success   Success Rate   Average Error (s)   Π^max
ARIMA              50                     38                  90%            13.46               0.952
BPNN               50                     40                  91%            12.42               0.952
To comprehensively evaluate the relationships between travel time predictability and prediction
results, two groups of comparisons were conducted. Figure 12 shows the comparison results between
travel time predictability and prediction results in the 5 min travel time series with different
series lengths of 1–30 days. The average prediction results of 100 experiments for each travel time
series are presented, with r = 0.1σ, m = 2, and τ = 2. Let SR_ARIMA be the success rate of travel
time prediction by ARIMA, and let SR_BPNN be the success rate by BPNN. There is a significant
statistical difference between SR_ARIMA (or SR_BPNN) and Π^max. It indicates that the actual
performance (SR_ARIMA and SR_BPNN) of ARIMA and BPNN lags far behind the theoretical optimal value of
the success rate of travel time prediction, and there is still great room for improvement in travel
time prediction. Note that the change trends of SR_ARIMA and SR_BPNN are basically consistent with
the standard deviation (sd) of the travel time predictability, as shown in Table 3: the predicted
results of ARIMA and BPNN for series shorter than 14 days have clearly higher standard deviations
than those for series of 14 days or longer. This indicates that the accuracy of prediction is
affected by the complexity of the travel time series and demonstrates the validity of travel time
predictability.
Figure 12. The comparisons between travel time predictability and prediction results in the 5 min
travel time series with different series lengths.
Table 3. The standard deviation (sd) of travel time predictability and the predicted results with
different series lengths.
Series Length (Days)   < 14 days   ≥ 14 days
Π^max                  0.0052      0.0009
SR_ARIMA               0.0339      0.0142
SR_BPNN                0.0378      0.0137
The same situation occurs in Figure 13. We compare Π^max and the prediction results in different
time scales of 1–20 with r = 0.1σ and m = 2. With the increase of time scales, travel time
predictability and the success rates of the two prediction models decline synchronously.
The proposed travel time predictability is a valid measurement of travel time series for correct
prediction, which provides an achievable target for the development of travel time prediction methods
and contributes to a differentiated scheme of travel time prediction.
Figure 13. Relationships between travel time predictability and prediction results in travel time
series with different time scales.
4. Discussion and Conclusions
This paper defines travel time predictability as the probability of correctly predicting future
travel times based upon historical travel time series and develops an entropy-based approach to
measure the upper bound of travel time predictability. Multiscale entropy of travel time series is
calculated to evaluate its complexity. The upper bound of travel time predictability is found to be
related to entropy. Travel time predictability expresses the characteristics of the travel time
series itself and is an expected value of data-based prediction performance.
A case study in an express road section in Shanghai, China is designed. The data source is a large
amount of taxi cab trajectory data collected in April 2015. By analyzing the effects of time scales
and tolerance on entropy and travel time predictability, we demonstrate that the time scale is
positively related to entropy and negatively related to travel time predictability, whereas the
tolerance has the opposite effect: entropy falls and predictability rises as the tolerance grows. In
addition, we reveal the higher entropy and lower predictability of the 2 day or 3 day travel time
series and the more stable values of the >14 day travel time series. Finally, two prediction models,
ARIMA and BPNN, are employed to predict travel time based on historical travel time series and to
examine the validity and reliability of travel time predictability. Though travel time predictability
is independent of the prediction method, it can aid the development of travel time prediction methods
and contribute to a differentiated scheme for travel time prediction in diverse traffic environments.
Future efforts may be pursued in two directions. First, the comprehensive investigation and
verification of travel time predictability should begin in a larger network with multiple data
sources to show the capability of capturing the entropy, which contributes to deeper traffic
knowledge discovery and differentiated traffic policy formulation. Second, the scope of
predictability should be extended, and the possibility of applying predictability to other types of
time series may be surveyed.
Acknowledgments: This research was sponsored by the National Natural Science Foundation of China
(Grant No. 41271441). The authors also would like to show their gratitude to 3 anonymous reviewers
for their constructive comments that greatly improved the manuscript.
Author Contributions: Tao Xu developed the methodology and composed the manuscript; Xianrui Xu
conducted the experiments and analyzed the results; Yujie Hu implemented the prediction methods;
Xiang Li proposed the initial idea and completed the literature review.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Wu, C.H.; Ho, J.M.; Lee, D.T. Travel-time prediction with support vector regression. IEEE Trans. Intell. Transp. Syst. 2005, 5, 276–281.
2. Mori, U.; Mendiburu, A.; Álvarez, M.; Lozano, J.A. A review of travel time estimation and forecasting for Advanced Traveller Information Systems. Transportmetr. A Transp. Sci. 2015, 11, 1–39.
3. Oh, S.; Byon, Y.J.; Jang, K.; Yeo, H. Short-Term Travel-Time Prediction on Highway: A Review of the Data-Driven Approach. Transp. Rev. 2015, 35, 4–32.
4. Vlahogianni, E.I.; Karlaftis, M.G.; Golias, J.C. Short-term traffic forecasting: Where we are and where we're going. Transp. Res. Part C Emerg. Technol. 2014, 43, 3–19.
5. Lin, H.E.; Zito, R.; Taylor, M. A review of travel-time prediction in transport and logistics. Proc. East. Asia Soc. Transp. Stud. 2005, 5, 1433–1448.
6. Van Lint, J.W.C.; van Hinsbergen, C.P.I. Short-Term Traffic and Travel Time Prediction Models. Artif. Intell. Appl. Crit. Trans. Issues 2012, 22, 22–41.
7. Vlahogianni, E.I.; Golias, J.C.; Karlaftis, M.G. Short-term traffic forecasting: Overview of objectives and methods. Transp. Rev. 2004, 24, 533–557.
8. Abrantes, P.A.L.; Wardman, M.R. Meta-analysis of UK values of travel time: An update. Transp. Res. Part A Policy Pract. 2011, 45, 1–17.
9. Jara-Diaz, S.R. Transport Economic Theory; Emerald Group Publishing Limited: Bingley, UK, 2007; pp. 11–49.
10. Li, Z.; Hensher, D.A.; Rose, J.M. Willingness to pay for travel time reliability in passenger transport: A review and some new empirical evidence. Transp. Res. Part E Log. Transp. Rev. 2010, 46, 384–403.
11. Van Lint, J.W.C.; Zuylen, H.J.V.; Tu, H. Travel time unreliability on freeways: Why measures based on variance tell only half the story. Transp. Res. Part A Policy Pract. 2008, 42, 258–277.
12. Shires, J.D.; de Jong, G.C. An international meta-analysis of values of travel time savings. Eval. Program Plan. 2009, 32, 315–325.
13. Wardman, M.; Batley, R. Travel time reliability: A review of late time valuations, elasticities and demand impacts in the passenger rail market in Great Britain. Transportation 2014, 41, 1041–1069.
14. Noland, R.B.; Polak, J.W. Travel time variability: A review of theoretical and empirical issues. Transp. Rev. 2002, 22, 39–54.
15. Carrion, C.; Levinson, D. Value of travel time reliability: A review of current evidence. Transp. Res. Part A Policy Pract. 2012, 46, 720–741.
16. Rietveld, P.; Bruinsma, F.R.; Vuuren, D.V. Coping with unreliability in public transport chains: A case study for Netherlands. Transp. Res. Part A Policy Pract. 2001, 35, 539–559.
17. Yue, Y.; Yeh, A.G.O.; Zhuang, Y. Prediction time horizon and effectiveness of real-time data on short-term traffic predictability. Proc. Intell. Transp. Syst. Conf. 2007, 962–967.
18. Foell, S.; Phithakkitnukoon, S.; Kortuem, G.; Veloso, M. Predictability of Public Transport Usage: A Study of Bus Rides in Lisbon, Portugal. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2955–2960.
19. Siddle, D. Travel Time Predictability. Available online: https://trid.trb.org/view.aspx?id=1323043 (accessed on 11 April 2017).
20. Wang, J.; Mao, Y.; Li, J.; Xiong, Z.; Wang, W.X. Predictability of road traffic and congestion in urban areas. PLoS ONE 2015, 10, e0121825.
21. Du, Y.; Chai, Y.W.; Yang, J.W.; Liang, J.H.; Lan, J.H. Predictability of Resident activity in Beijing Based on GPS Data. Geogr. Geo Inf. Sci. 2015, 31, 47–51.
22. Song, C.; Qu, Z.; Blumm, N.; Barabási, A.L. Limits of predictability in human mobility. Science 2010, 327, 1018–1021.
23. Shannon, C.E. A Mathematical Theory of Communication: The Bell System Technical Journal. Bell Syst. Tech. J. 1948, 27, 3–55.
24. Fano, R.M.; Hawkins, D. Transmission of Information: A Statistical Theory of Communications. Am. J. Phys. 1961, 29, 793–794.
25. Kontoyiannis, I.; Algoet, P.H.; Suhov, Y.M.; Wyner, A.J. Nonparametric entropy estimation for stationary processes and random fields, with applications to English text. IEEE Trans. Inf. Theory 2007, 44, 1319–1327.
26. Wyner, A.D.; Ziv, J. The sliding-window lempel-ziv algorithm is asymptotically optimal. Proc. IEEE 1994, 82, 872–877.
27. Xie, X.X.; Li, S.; Zhang, C.L.; Li, J.K. Study on the application of Lempel-Ziv complexity in the nonlinear detecting. Complex Syst. Complex. Sci. 2005, 2, 61–66.
28. Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. 2002, 92, 705–708.
29. Wu, S.D.; Wu, C.W.; Lin, S.G.; Lee, K.Y.; Peng, C.K. Analysis of complex time series using refined composite multiscale entropy. Phys. Lett. A 2014, 378, 1369–1374.
30. González, M.C.; Hidalgo, C.A. Understanding individual human mobility patterns. Nature 2008, 453, 779.
31. Pappalardo, L.; Simini, F.; Rinzivillo, S.; Pedreschi, D.; Giannotti, F.; Barabási, A.L. Returners and explorers dichotomy in human mobility. Nat. Commun. 2015, 6.
32. Pappalardo, L.; Rinzivillo, S.; Qu, Z.; Pedreschi, D.; Giannotti, F. Understanding the patterns of car travel. Eur. Phys. J. Spec. Top. 2013, 215, 61–73.
33. Hu, Y.; Miller, H.J.; Li, X. Detecting and analyzing mobility hotspots using surface networks. Trans. GIS 2014, 18, 911–935.
34. Li, X.J.; Li, X.; Tang, D.; Xu, X. Deriving features of traffic flow around an intersection from trajectories of vehicles. In Proceedings of the 2010 International Conference on Geoinformatics: Giscience in Change, Geoinformatics, Beijing, China, 18–20 June 2010; pp. 1–5.
35. Box, G.E.P.; Jenkins, G.M. Time series analysis: Forecasting and control. J. Oper. Res. Soc. 1971, 22, 199–201.
36. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Cogn. Model. 1988, 5, 1.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).