2016 3rd International Conference on Information and Communication Technology for Education (ICTE 2016)
ISBN: 978-1-60595-372-4
Attitudinal Market Segmentation for Transit Riders Using
Factor Analysis
Sida Luo, Wenyang Gu
Transportation Center, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, United States
ABSTRACT: Increasing the ridership of urban transit systems is crucial for the sustainable development of transportation. Previous studies have successfully used attitudinal market segmentation to understand travel demand. However, those addressing the transit market mostly depend on complex statistical methods, such as structural equation modeling (SEM), to extract latent attitudinal variables, which makes them difficult to apply in practice. This study uses the simpler principal component analysis (PCA) method to help extract the attitudinal variables. Results show that PCA is able to provide a significant market segmentation. Policy implications are tailored to the four resulting submarkets, which is valuable for transit system development.
1 INTRODUCTION
Developing urban transit systems is an effective way to alleviate traffic congestion in cities with large populations and dense land use. To increase transit ridership, the characteristics of travel demand need to be understood accurately. An attitudinal market segmentation approach helps capture travel demand by giving insight into the heterogeneity of transit riders. Effective strategies and policies tailored to different submarkets of bus users can then be proposed to promote the development of the transit system (Elmore 1998).
Attitudinal market segmentation has received much attention in the literature as a way to understand travel behavior (Anable 2005, Heinen et al. 2011, Li et al. 2013, Outwater et al. 2003a, b, Shiftan et al. 2008). Its advantages in exploring latent contributing factors have been recognized and validated. For example, Shiftan et al. (2008) used SEM to identify potential transit markets and found that sensitivity to time, need for a fixed schedule and willingness to use transit were the best attitudinal factors for defining market segments. Li et al. (2013) conducted similar research and identified 6 bicycle commuting submarkets. In addition to deriving policy implications, they found that most socioeconomic attributes did not differ substantially among these submarkets. Meanwhile, segmentation based on other variables such as socioeconomics, level of service and trip attributes has also been used in travel demand modeling (Badoe & Miller 1998, Clark 1998, Xing et al. 2010).
Those studies mostly used SEM to extract latent attitudinal variables (Shiftan et al. 2008, Outwater et al. 2003a, b, Li et al. 2013). PCA, a factor analysis approach, is rarely used in transit market analysis. Compared with SEM, PCA is more straightforward and easier to use, which makes it more promising for traffic practitioners. This study therefore applies two factor analysis approaches, together with K-means clustering, to conduct transit attitudinal market segmentation. Policy implications are put forward that are relevant to the development of sustainable transportation systems.
Table 1. Descriptive statistics of the attitudinal questions.
Question  Mean  Std.  Content
1         4.03  0.81  I will reserve some time when I go to work or school to arrive on time.
2         4.15  0.71  I prefer buses with less number of transfers.
3         3.66  1.02  If the bus is not crowded, I prefer using it.
4         3.66  0.90  If the bus saves travel time, I can pay more to take the bus.
5         3.50  0.77  I always fully use my bus travel time or waiting time.
6         3.41  0.98  If the transfer walking distance is short, I will travel by bus.
7         3.37  0.98  If the bus is very crowded, I will prefer getting off earlier.
8         3.57  0.92  If I am more comfortable in travelling, I can pay more for the bus.
9         3.62  0.83  I am usually in a hurry when going to work or school.
10        4.18  0.71  If there is a bus route without transfer to my destination, I will use that route.
11        3.27  1.03  At the bus stop, I usually do not board until the coming bus has available seats.
12        3.47  0.88  Even if the bus fare goes up, I will still use the bus.
13        3.86  0.79  If I am using the bus, I need to arrive at my destination in possibly shortest time.
14        3.32  0.93  Even if the transfer stop is crowded, I consider the transfer route as an alternative.
15        4.06  0.89  I cannot stand waiting a bus for a long time.
Figure 1. Pie charts of each socioeconomic attribute.
Figure 2. Scree plot for EFA.
2 METHODOLOGY
2.1 Exploratory factor analysis (EFA)
EFA is a statistical method that identifies the underlying relationships among a large set of variables and reduces the dimension of the data so that only a few core factors remain. The retained factors are known as latent variables. The number of factors to retain can be judged from the scree plot: factors that come before the point at which the slope changes markedly are treated as latent variables (Cattell, 1966). EFA involves three main steps: extraction, rotation and interpretation.
2.2 Principal component analysis (PCA)
PCA is essentially an orthogonal linear transformation. It is a standard procedure for converting a set of possibly correlated variables into a set of linearly uncorrelated variables, which are called principal components.
2.3 K-means clustering (KMC)
KMC is one of the most widely used unsupervised learning algorithms. It partitions observations into a given number of clusters by minimizing the within-cluster sum of squared distances to the cluster centers; for a fixed dataset this is equivalent to maximizing the separation between clusters. In this study, the objective of KMC is to group transit riders with similar travel attitudes into submarkets.
3 DATA SOURCE
A survey was conducted in Nanjing, China in April 2014. It included questions on socioeconomic attributes and attitudes towards bus travel. The attitudinal questions (or statements) were carefully designed to reflect travelers’ attitudes that may affect their choice of transit. The effective sample of 600 responses is used in this study. The responses are coded as: 5 – strongly agree; 4 – agree; 3 – somewhat agree; 2 – disagree; 1 – strongly disagree. Basic descriptive statistics are shown in Table 1 and Figure 1. They show that respondents largely agree on the effect of transfers: statements 2 and 10 have the highest means (4.15 and 4.18) and the smallest standard deviation (0.71), so reducing the number of transfers is likely to make transit more appealing. In addition, respondents do not express a strong need for seats, and their answers vary considerably (statement 11 has the lowest mean, 3.27, and the largest standard deviation, 1.03). Having a seat therefore does not appear to be a key factor affecting transit ridership. As for socioeconomic attributes, some bias can be seen in the sample. For instance, there are more men than women, more people with higher education, more unmarried people with no children, more people without automobiles and more people with bicycles. However, the survey succeeded in covering transit users with different attributes, and the bias is not strong enough to affect the validity of our results.
4 RESULTS
4.1 Identification of latent attitudinal variables
The attitudinal variables are identified by EFA. The analysis is carried out in SPSS Statistics 19 and takes 6 iterations to converge for our dataset. The scree plot is shown in Figure 2. As a result, 4 latent attitudinal variables, A1, A2, A3 and A4, are extracted from the 15 original statement variables, V1, V2, …, V15, and together explain 64.55% of the total variance. A1, associated with V1, V5, V9, V13 and V15, reflects time sensitivity; A2, associated with V3, V7 and V11, reflects comfort level; A3, associated with V4, V8 and V12, reflects cost sensitivity; A4, associated with V2, V6, V10 and V14, reflects transfer sensitivity. Bartlett’s test of sphericity gives χ² = 2327.6, which is significant at the 0.05 level.
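The Bartlett's test of sphericity reported for the EFA checks whether the correlation matrix of the statement variables differs from an identity matrix. A minimal sketch of the standard statistic, χ² = −[(n − 1) − (2p + 5)/6]·ln|R| with p(p − 1)/2 degrees of freedom, is given below. The function names and the 3 × 3 correlation matrix are illustrative assumptions, not the survey data, and the original analysis was run in SPSS rather than with this code. For the 15 statement variables used here, the degrees of freedom would be 15·14/2 = 105.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for a correlation matrix."""
    p = len(corr)
    chi2 = -((n_obs - 1) - (2 * p + 5) / 6) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi2, dof


# Hypothetical correlation matrix of three variables (NOT the survey data).
corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
chi2, dof = bartlett_sphericity(corr, n_obs=100)
print(f"chi2 = {chi2:.2f}, dof = {dof}")
```

The statistic is then compared with a chi-square distribution with `dof` degrees of freedom; a large value indicates that the variables are correlated enough for factor analysis to be meaningful.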
4.2 Score of latent attitudinal variables
The scores of the latent attitudinal variables are used for the cluster analysis. PCA identifies 5 principal components, M1, …, M5, from the 15 statement variables. A standardized component matrix is then obtained, in which the statement variables are ordered and grouped according to the 4 attitudinal variables. The matrix is shown in Table 2. We then collapse the statement variables into attitudinal variables and calculate sij, the score of attitudinal variable i under principal component j, from the information in Table 2. For example, the score of A1 under M1 is given by Eq. (1).
Table 2. Standardized component matrix for PCA.
Variable      M1      M2      M3      M4      M5
V1         0.248  -0.272   0.130  -0.156  -0.042
V5         0.242  -0.196  -0.043  -0.148   0.390
V9         0.255  -0.130   0.029  -0.152   0.353
V13        0.315  -0.224   0.091  -0.191   0.173
V15        0.272  -0.233   0.078  -0.295   0.055
V3         0.221   0.459   0.270  -0.116  -0.189
V7         0.231   0.486   0.214  -0.077   0.030
V11        0.262   0.432   0.210  -0.093   0.218
V4         0.292   0.121  -0.461   0.054  -0.230
V8         0.296   0.128  -0.471   0.026  -0.156
V12        0.286   0.053  -0.474   0.112   0.089
V2         0.215  -0.224   0.262   0.019  -0.520
V6         0.197  -0.046   0.162   0.635   0.096
V10        0.288  -0.198   0.136   0.018  -0.448
V14        0.219  -0.086   0.176   0.599   0.199
Table 3. Matrix for score calculation.
Component  A1   A2   A3   A4   wi
M1         s11  s21  s31  s41  0.380
M2         s12  s22  s32  s42  0.204
M3         s13  s23  s33  s43  0.153
M4         s14  s24  s34  s44  0.143
M5         s15  s25  s35  s45  0.119
$$s_{11} = 0.248V_1 + 0.242V_5 + 0.255V_9 + 0.315V_{13} + 0.272V_{15} \tag{1}$$
A matrix [sij] can be obtained in this manner. The weights wi of the principal components are determined by their eigenvalues, which are λ1 = 3.677, λ2 = 1.977, λ3 = 1.486, λ4 = 1.387 and λ5 = 1.156 in the output. Note that the eigenvalues have already been used to obtain the standardized component matrix in Table 2, where column j is the eigenvector ηj associated with λj. From the eigenvalues, we obtain the weights
$$w_i = \frac{\lambda_i}{\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \lambda_5} \tag{2}$$
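As a quick arithmetic check of Eq. (2), the short sketch below normalizes the five reported eigenvalues so that the weights sum to one. The function name `component_weights` is an illustrative choice; only the eigenvalues come from the text.

```python
# Eigenvalues of the five retained principal components, as reported in the text.
eigenvalues = [3.677, 1.977, 1.486, 1.387, 1.156]


def component_weights(lams):
    """Eq. (2): each weight is the eigenvalue's share of the retained total."""
    total = sum(lams)
    return [lam / total for lam in lams]


weights = component_weights(eigenvalues)
rounded = [round(w, 3) for w in weights]
print(rounded)
```

Rounded to three decimals, the weights agree with the wi column of Table 3.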
The matrix [sij] and the resulting weights are shown in Table 3. The scores of the 4 latent variables, s1, …, s4, are then calculated by
$$s_i = w_1 s_{i1} + w_2 s_{i2} + w_3 s_{i3} + w_4 s_{i4} + w_5 s_{i5} \tag{3}$$
The factor analysis approaches have thus extracted the latent attitudinal variables and their scores, which are now ready for the market segmentation.
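The scoring procedure of Eqs. (1) and (3) can be sketched as follows, using the loadings of Table 2 and the weights of Table 3. This is an illustration under stated assumptions rather than the SPSS workflow used in the study: the function names are hypothetical, the respondent is made up, and the raw 1 to 5 responses are plugged in directly as in Eq. (1).

```python
# Table 2 loadings: one row per statement variable, columns M1..M5.
LOADINGS = {
    "V1":  [0.248, -0.272,  0.130, -0.156, -0.042],
    "V5":  [0.242, -0.196, -0.043, -0.148,  0.390],
    "V9":  [0.255, -0.130,  0.029, -0.152,  0.353],
    "V13": [0.315, -0.224,  0.091, -0.191,  0.173],
    "V15": [0.272, -0.233,  0.078, -0.295,  0.055],
    "V3":  [0.221,  0.459,  0.270, -0.116, -0.189],
    "V7":  [0.231,  0.486,  0.214, -0.077,  0.030],
    "V11": [0.262,  0.432,  0.210, -0.093,  0.218],
    "V4":  [0.292,  0.121, -0.461,  0.054, -0.230],
    "V8":  [0.296,  0.128, -0.471,  0.026, -0.156],
    "V12": [0.286,  0.053, -0.474,  0.112,  0.089],
    "V2":  [0.215, -0.224,  0.262,  0.019, -0.520],
    "V6":  [0.197, -0.046,  0.162,  0.635,  0.096],
    "V10": [0.288, -0.198,  0.136,  0.018, -0.448],
    "V14": [0.219, -0.086,  0.176,  0.599,  0.199],
}

# Statement variables associated with each attitudinal variable (Section 4.1).
GROUPS = {
    "A1": ["V1", "V5", "V9", "V13", "V15"],  # time sensitivity
    "A2": ["V3", "V7", "V11"],               # comfort level
    "A3": ["V4", "V8", "V12"],               # cost sensitivity
    "A4": ["V2", "V6", "V10", "V14"],        # transfer sensitivity
}

# Component weights w1..w5 from Table 3.
WEIGHTS = [0.380, 0.204, 0.153, 0.143, 0.119]


def component_scores(responses, members):
    """s_i1..s_i5 for one attitudinal variable: Eq. (1) applied to each component."""
    return [
        sum(LOADINGS[v][j] * responses[v] for v in members)
        for j in range(len(WEIGHTS))
    ]


def latent_score(responses, members):
    """s_i from Eq. (3): weighted sum of the component scores."""
    return sum(w * s for w, s in zip(WEIGHTS, component_scores(responses, members)))


# Hypothetical respondent who answers "agree" (4) to every statement.
respondent = {v: 4 for v in LOADINGS}
scores = {name: latent_score(respondent, members) for name, members in GROUPS.items()}
for name, value in scores.items():
    print(name, round(value, 3))
```

Applying `latent_score` to every respondent gives the four-dimensional score vectors (s1, …, s4) on which the clustering operates.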
4.3 Transit market segmentation by KMC
4.3.1 Four transit attitudinal submarkets
Time Sensitivity (A1), Comfort Level (A2), Cost Sensitivity (A3) and Transfer Sensitivity (A4) are the 4 variables used for KMC, which is also run in SPSS. Experiments show that with 4 clusters the results are significant and the clusters are independent of each other. Convergence is achieved in the 9th iteration. Logistic regression is used to test the correlation between different segments. None of the regression parameters is significant at the 0.01 level, and little correlation is found between the segments and socioeconomic features. This shows that the clustering result is statistically significant. The 4 clusters correspond to 4 submarkets with 76, 121, 140 and 263 travelers, respectively, for a total of 600.
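The clustering step itself was run in SPSS. For readers who want to see the mechanics, the following is a minimal pure-Python sketch of the basic K-means iteration (Lloyd's algorithm) on four-dimensional score vectors. The data points, the seeded random initialization and the function names are all assumptions made for illustration; they are not the survey scores or the SPSS settings.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Basic K-means: alternate assignment and center update until labels stop changing."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


# Hypothetical (s1, s2, s3, s4) score vectors for six riders, NOT survey data.
points = [
    (1.0, 1.0, 1.0, 1.0), (1.1, 0.9, 1.0, 1.2), (0.9, 1.1, 1.0, 0.8),
    (5.0, 5.0, 5.0, 5.0), (5.1, 4.9, 5.0, 5.2), (4.9, 5.1, 5.0, 4.8),
]
labels, centers = kmeans(points, k=2)
print(labels)
```

In the study the same idea is applied with 4 clusters to the 600 score vectors; in practice the number of clusters and the initialization are usually checked over several runs.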
4.3.2 Characteristics of submarkets
The four transit submarkets have distinct attitudinal characteristics, which are visualized in Figure 3. Submarket 1 is a group of transit users with high time sensitivity, low demand for comfort, medium cost sensitivity and high transfer sensitivity; submarket 2 is a group with low time sensitivity, medium demand for comfort, low cost sensitivity and medium transfer sensitivity; submarket 3 is a group with high time sensitivity, high demand for comfort, medium cost sensitivity and low transfer sensitivity; submarket 4 is a group with high time sensitivity, high demand for comfort, high cost sensitivity and high transfer sensitivity.
4.3.3 Policy implications for submarkets
This approach groups travelers with similar attitudes towards transit. The characteristics of the segments can help traffic planners develop targeted policies that best serve the needs of each submarket and thereby promote transit ridership.
Travelers in submarkets 1 and 4 are very sensitive to time and to bus transfers. To encourage them to use transit, rapid transit service connecting their major work and residential districts can be designed. This makes travel time shorter and more predictable, and the number of transfers is expected to decrease. Another direction is to improve transfer facilities so as to reduce transfer distance and time and to improve transfer safety and environment. Bus information can also be made available at these facilities so that travelers feel more certain about their waiting time when a transfer is needed.
Travelers in submarket 2 have both low time sensitivity and low cost sensitivity. They tend to be regular transit riders who are rather insensitive to the level of service. Captive riders, whose demand for bus travel is essentially inelastic, probably belong to submarket 2.
Travelers in submarket 3 have a strong desire for comfort. Providing buses with a better environment, such as more seating space and less noise, can make transit more competitive with private cars. A good strategy is to upgrade transit vehicles at moderate cost, which is likely to attract these travelers.
Travelers in submarket 4 have high cost sensitivity. Where a large proportion of riders belongs to this submarket, raising fares would be a poor strategy for increasing bus ridership. Providing conventional transit service at the normal, relatively low price will suit this submarket.
Figure 3. Radar chart of various clusters.
5 CONCLUSIONS
This paper uses an attitudinal market segmentation approach to study the transit market. With the help of factor analysis, latent attitudinal variables are extracted, and K-means clustering is then applied to them to achieve the segmentation. Four attitudinal segments are found, and policy implications are provided according to the characteristics of each submarket. The approach has limitations for transport planning: the SP survey on attitudes is relatively expensive, and it cannot reveal the attitudes of transit riders in future years. It is therefore of great value to predict the attitudinal segmentation directly from socioeconomic attributes, which can be obtained from data already used for urban planning. Machine learning can play this role, and finding suitable machine learning methods is the direction for follow-up studies.
6 ACKNOWLEDGMENT
The authors would like to thank the colleagues in the
Transportation Center of Northwestern University
for their valuable time and advice.
REFERENCES
[1] Anable, J. 2005. ‘Complacent car addicts’ or ‘aspiring
environmentalists’? Identifying travel behavior segments
using attitude theory. Transport Policy 12 (2005): 65-78.
[2] Badoe, D.A. & Miller, E.J. 1998. An automatic
segmentation procedure for studying variations in mode
choice behavior. Journal of Advanced Transportation 32
(2): 190-215.
[3] Clark, D.E. 1998. Estimating future bicycle and pedestrian
trips from a travel demand forecasting model.
Transportation Research Board, Washington, DC.
[4] Elmore, Y.R. 1998. A Handbook: Using Market
Segmentation to Increase Transit Ridership. TCRP Report
36. Transportation Research Board, Washington, DC.
[5] Heinen, E., Maat, K. & Wee, B.V. 2011. The role of
attitudes toward characteristics of bicycle commuting on
the choice to cycle to work over various distances.
Transportation Research Part D 16 (2011): 102-109.
[6] Li, Z.B., Wang, W., Yang, C. & Ragland, D.R. 2013.
Bicycle commuting market analysis using attitudinal
market segmentation approach. Transportation Research
Part A 47 (2013): 56-68.
[7] Outwater, M.L., Castleberry, S., Shiftan, Y., Ben-Akiva,
M.E., Zhou, Y.S. & Kuppam, A. 2003. Use of structural
equation modeling for an attitudinal market segmentation
approach to mode choice and ridership forecasting. 10th
International Conference on Travel Behavior Research, 10-15 August 2003. Lucerne, Switzerland.
[8] Outwater, M.L., Castleberry, S., Shiftan, Y., Ben-Akiva,
M.E., Zhou, Y.S. & Kuppam, A. 2003. Attitudinal market
segmentation approach to mode choice and ridership
forecasting: Structural equation modeling. Transportation
Research Record 1854: 32-42.
[9] Shiftan, Y., Outwater, M.L. & Zhou, Y.S. 2008. Transit
market research using structural equation modeling and
attitudinal market segmentation. Transport Policy 15
(2008): 186-195.
[10] Xing, Y., Handy, S.L. & Mokhtarian, P.L. 2010. Factors
associated with proportions and miles of bicycling for
transportation and recreation in six small US cities.
Transportation Research Part D 15 (2010): 73-81.