Identifying Optimal High-Risk Driver Segments for Safety Messaging: A Geodemographic Modeling Approach

George Eguakun, M.B.A., Ph.D. Candidate
Manager
Traffic Safety Program Evaluation
Saskatchewan Government Insurance (SGI) Regina Operations Center
5104 Donnelly Crescent, Regina, Saskatchewan, Canada, S4X 4C9
Tel: 1 306 775 6274; Fax: 1 306 352 3154
Email: [email protected]

Peter Y. Park, Ph.D., P.Eng. (Corresponding Author)
Associate Professor
Department of Civil and Geological Engineering
University of Saskatchewan
57 Campus Drive, Saskatoon, SK, Canada, S7N 5A9
Tel: 1 306 966 1314; Fax: 1 306 966 5427
Email: [email protected]

Kwei Quaye, Ph.D., MA (Econ), P.Eng.
Assistant Vice President
Traffic Safety Services and Driver Development
Saskatchewan Government Insurance (SGI) Head Office
2260 11th Avenue, Regina, Saskatchewan, Canada, S4P 0J9
Tel: 1 306 775 6182; Fax: 1 306 352 3154
Email: [email protected]

Word Count: Abstract (230) + Main Body (4,767) + Figures (2,000) = 6,997 words

Presented at the 94th Annual Meeting of Transportation Research Board, January 11-15, 2015

Eguakun, Park, Quaye

Abstract

Given the public safety risk posed by high-risk drivers, most traffic safety agencies consider this group a key target for strategic planning purposes. The aim of this research is to develop a framework that can be used to efficiently and effectively target high-risk drivers. The specific objectives are to establish whether high-risk drivers are homogenous, and if not, to determine the optimal set of primary and secondary clusters for efficient and effective targeting with minimal resources. The study area is Saskatchewan, Canada. Multiple databases (including traffic collisions, insurance claims and conviction data) formed the basis for the research.
In this study, high-risk drivers (HRDs) are defined as all drivers who are both enrolled in Saskatchewan Government Insurance’s (SGI) Driver Improvement Program and in the negative or penalty zone of SGI’s Safety Driver Rating scale as a result of accumulated demerit points. Geodemographic modeling, using the neighbourhood as the unit of analysis, a large number of variables, and a set of probabilistic clustering techniques, was used in the analysis. The results indicate that the high-risk driver group is heterogeneous, falling into sub-clusters with varying collision and traffic behaviour profiles. The study found that, in Saskatchewan, high-risk drivers are mainly in the major cities (56%), rural municipalities (18%) and towns (15%). The optimal primary high-risk segments for efficient targeting are those major cities and towns where both the risk of collision involvement and the concentration of high-risk drivers are higher than in the general driver population. Drivers in the primary target area for messaging show higher levels of distracted, impaired and aggressive driving behaviours, driver inexperience, extreme fatigue, falling asleep behind the wheel, and inattention.

1. Introduction

1.1 Problem statement

This paper uses a geodemographic modeling approach to identify the most appropriate high-risk driver segments for safety messages aimed at high-risk drivers (HRDs). Identifying the most appropriate segments of HRDs, and then directing appropriate messages at these groups, should maximize changes in driving behaviour, reductions in the number of collisions, and the efficient use of resources. While HRDs constitute a relatively small percentage of the driving population, they are believed to account for a significant proportion of deaths and injuries in collisions.
Research conducted in Canada (1) indicates that only three to four percent of the driver population are HRDs, but about 12 percent of fatalities and eight percent of injuries involve HRDs. In New Zealand, it has been estimated that about 33 percent of at-fault collisions are due to HRDs (2).

The percentage of HRD involvement in collisions may vary across jurisdictions depending on the way HRDs are defined in each jurisdiction. For example, McSaveney and Jones (2) defined HRDs as drivers who have a history of dangerous and reckless driving. Dangerous and reckless driving included disqualified driving, unlicensed driving, involvement in illegal street car racing, repeat drink/drug driving, high Blood Alcohol Content (BAC) offenders, repeat speeding offenders, and high-level speed offenders. The Traffic Injury Research Foundation (1) defined HRDs as suspended/prohibited drivers and repeat traffic offenders with a pattern of illegal driving behaviours (e.g., drivers with recurring incidents of alcohol/drug impaired violations, traffic violations, and collision involvement).

However HRDs are defined, it is fair to say that there is a consensus amongst traffic safety agencies and jurisdictions around the world that HRDs pose serious public safety issues. It is therefore not surprising that most traffic safety agencies select HRDs as a target safety area in plans designed to reduce collision deaths and injuries (3, 4). For example, Canada’s national Road Safety Strategy (RSS) 2015 (for the five-year period 2011 to 2015) identified HRDs as a key target area of safety concern. The presumed rationale is that high-risk driving behaviour often leads to increased risk of collision involvement (5). RSS 2015 commits each of the Canadian provinces and territories to developing appropriate methods that can effectively and efficiently identify and deliver appropriate traffic safety messages to each group of HRDs.
This approach entails drawing on every possible source of information to answer pertinent questions about who HRDs are, where they are found, and what risk profiles they have.

One of the easiest and most common ways of identifying HRDs and categorizing them into various segments is to develop multiple HRD groups based primarily on a few selected driver characteristics. For instance, young drivers, impaired drivers, repeat traffic offenders and collision-involved drivers can each be viewed as a unique HRD group (6, 7). A problem with this simple approach is that it treats each HRD segment (e.g., young drivers, impaired drivers, repeat speeding violators, etc.) as a unique group although there may well be overlaps between the groups (e.g., a group of young and impaired drivers). It is certainly reasonable to assume that a young driver who is a repeat traffic offender, but who has no impaired driving history, would have a different risk profile from a young and impaired driver who does not engage in repeat traffic violations.

The way in which groups are defined can be important in the efficient and effective use of financial resources invested in safety messages designed to reduce the number of collisions. The lack of a clearly distinct boundary between groups invariably leads to difficulties in targeting and formatting safety messages. The aim of this research is to develop a framework that traffic safety professionals can use to efficiently and effectively segment and target high-risk drivers.

1.2 Literature review

Effective segmentation (i.e., identification) of consumers is a key consideration in, for example, the social marketing industry. To overcome the overlapping syndrome and other problems when targeting products, social marketers use an advanced segmentation technique known as geodemographic modeling (8, 9). The technique has been described as the most cost-effective technique available for determining and segmenting target groups for disseminating public policy statements (10, 11). The modeling effort requires: 1) collecting massive amounts of data on consumer characteristics and purchasing behaviour, 2) constructing statistical models of consumer identity, and 3) mapping and analyzing distributions of consumers’ market-related activities (12, 13). The marketing industry typically includes demographic data (age, gender, and income), geographic data (rural, urban) and psychographic data (lifestyle) in its models. These data are integrated and analyzed using advanced data management technologies, statistical models, and geographical information systems (14) to understand consumers’ purchasing activities and related characteristics, for example, where consumers live and what they choose to buy. The approach assumes that like-minded consumers tend to “cluster together” in spatially proximate neighbourhoods.

Heitgerd and Lee (15) applied a geodemographic model to conduct public health risk assessments. They wanted to identify risky clusters in order to conduct efficient targeting of health education activities. They used the neighbourhood as a geospatial unit of analysis to establish national priority sites.

Although geodemographic modeling is widely used in areas such as social marketing, it has been little used in traffic safety (16). Cambois and Fontaine’s (17) study can be regarded as an early attempt to classify HRD groups by defining unique, mutually exclusive segments. The study used traffic collision data and other driver information, such as aggressive driving, speeding, red-light running, and driving impaired, as the basis for segmentation. Blatt and Furman (18) used a geodemographic modeling approach to investigate whether collisions in rural areas involved urban dwellers or rural dwellers. They concluded that most rural collisions involved residents living in rural areas and small towns. Shankar and Warkell (19) applied a geodemographic modeling approach to analyze fatal motorcycle collisions. They then identified which safety message targets and specific media channels were most appropriate to each segment of road users involved in the fatal motorcycle collisions. Anderson (20) applied geodemographic modeling to determine drivers’ injury risk in London, UK. Anderson found distinct spatial and statistical patterns in certain groups of drivers who were more likely to be at risk of being involved in a collision.

An important area which is largely unexplored in the few studies identified above is the homogeneity or lack of homogeneity of the HRD group using the neighbourhood as a unit of analysis.

3. Study objectives and scope

The goal of this research is to develop a framework that traffic safety professionals can use for the efficient and effective segmentation and targeting of high-risk drivers. The specific objectives are 1) to establish whether HRDs are homogenous, and if not, to identify neighbourhood clusters in which they dominate; and 2) to determine the optimal set of primary clusters that could be reached with minimal resources. In this study, HRDs are defined as all drivers who are both enrolled in Saskatchewan Government Insurance’s (SGI) Driver Improvement Program and in the negative or penalty zone of SGI’s Safety Driver Rating scale as a result of accumulated demerit points. The Driver Improvement Program is a progressive sanctions program for problem drivers who exceed nine demerit points. The Safety Driver Rating Scale defines a driver’s safety position on a scale from -10 to 19 as part of a Safety Recognition Program.
The Safety Recognition Program uses the safety rating scale to determine driver safety points. It consists of four zones: the negative or penalty zone (-10 to <0), the neutral zone (0), the vehicle insurance discount zone (>0 to 12), and platinum customers (>12 to 19). A rating of 0 or greater is an indication of safe driving. A driver in this zone could be in a discount position, while a driver in the negative zone attracts a financial penalty for each point lost in that zone. For example, suppose a driver is at 2 in the Safety Zone. She is convicted of running a stop sign and loses 4 points. This moves her to -2 points in the Penalty zone. She is immediately assessed a one-time financial penalty of $50 for the incident. Such a driver will be included in our high-risk driver group.

The study considers high-risk driver groups on the basis of driving experience. It is expected that analyzing HRDs separately by taking into account their driving experience, driver behaviour, collision risk and geodemographic profiles will allow us to identify true high-risk drivers. It should also enable us to exclude drivers whose high-risk driving behaviour, e.g., excessive speeding, is due to a lack of driving experience rather than a disposition towards high-risk driving.

4. Data Sources and Preparation

The efficient segmentation of the HRD group required bringing together a number of datasets from SGI: 1) the traffic collision database (2006-2011), 2) the vehicle characteristics database, and 3) the problem driver database from the Driver Improvement and Safe Driver Recognition programs. Other databases include the most recent census data from Statistics Canada and the convictions database from the Saskatchewan Department of Justice. The traffic collision database included collision characteristics and driver and vehicle occupant information. The census data included updated postal codes, and the convictions database included summary offence tickets and criminal code convictions. A total of 30,453 high-risk drivers who were in the penalty zone of the safety rating scale were extracted as the subjects in this study. The datasets were prepared and classified into geodemographic, risk profile and traffic behaviour categories using the following steps:

1. To create the geodemographic category, high-risk drivers were initially assigned random codes and aggregated into groups using postal codes. For privacy purposes, the only personal data collected were gender and age. The HRD group dataset was then linked to the most recent census data using postal codes as unique identifiers. This allowed the subsequent extraction of geodemographic variables for each low-level neighbourhood.

2. Risk profiles were developed using age and gender variables, which have been found to significantly influence collision risk (21-28). In this study, we estimated the probability of involvement in a collision when the driver was male or female. For age, we estimated the probability of involvement in a collision when the driver was less than 25 years (young), over 65 years (elderly), and between 25 and 65 years (other). The probabilities, which were used as inputs into the subsequent cluster analyses, were estimated using logistic regression models similar to those described in Guo and Fang (29). The data elements used in the estimation process were derived by merging the aggregated codes with the traffic collision database.

Table 1 shows the two main categories (geodemographic and collision risk) and the variables for each category.
TABLE 1 Main Categories and Aggregated Variables used in the Cluster Analysis

Main Category            Aggregated Variable    Variable Description
Geodemographic           %_gender               Percent male drivers
                         %_age                  Percent young drivers
                         Postal_Code            Neighbourhood unique identifier
                         Dwellings              Housing units in neighbourhood
                         Persons_per_family     Count of family members in dwelling
                         Aboriginal_popn        Number of Aboriginals
                         Employment Status      Employed versus unemployed
                         Education_level        Level of education (5 levels)
                         Average_Income         Average neighbourhood income
Collision Risk Profile   Gender_Prob            Prob(Male|Collision)
                         Young_driverProb       Prob(Driver<25 years|Collision)
                         Elderly_driverProb     Prob(Driver>65 years|Collision)
                         Other (Reference)      Prob(25<=Driver<=65 years|Collision)

3. As many variables in the collision and summary offence datasets might describe traffic behaviour, it was necessary to aggregate variables that measure common underlying constructs. For example, human condition, as a major contributing factor associated with collisions, includes driver inattention, inexperience, distraction, driving while impaired, having been drinking, falling asleep, being fatigued, losing consciousness, and many other variables. Using factor analysis techniques, a human condition index was created that comprised four underlying constructs. Six other indices were similarly developed covering constructs describing issues such as following too closely, impaired driving, and use of seat belts. The seven indices are presented in Table 2. The factor analysis procedure used to develop the underlying constructs in this study was similar to that described by Saccamonno and Lai (30), with the constructs being prime candidates for modeling purposes as they are deemed to be non-collinear.
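Step 1 of the preparation above (grouping anonymised drivers by postal code and linking the groups to census data) can be illustrated with a minimal sketch. This is not the study's code: the record layout, the sample values, the census fields shown, and the function name `aggregate_by_postal_code` are assumptions made for this example. The cut-off of 25 years for a young driver follows step 2.

```python
from collections import defaultdict

# Hypothetical anonymised HRD records: (random code, postal code, gender, age).
hrd_records = [
    ("a1", "S4X", "M", 22),
    ("b2", "S4X", "F", 41),
    ("c3", "S4X", "M", 19),
    ("d4", "S7N", "M", 67),
]

# Hypothetical census extract keyed by the same postal codes.
census = {
    "S4X": {"Dwellings": 1200, "Average_Income": 58000},
    "S7N": {"Dwellings": 800, "Average_Income": 47000},
}


def aggregate_by_postal_code(records, census_rows, young_age=25):
    """Group drivers by postal code and attach census variables to each group."""
    groups = defaultdict(list)
    for _code, postal, gender, age in records:
        groups[postal].append((gender, age))

    neighbourhoods = {}
    for postal, drivers in groups.items():
        n = len(drivers)
        row = {
            "Postal_Code": postal,
            "n_hrd": n,
            # Percent male drivers and percent young drivers (Table 1 names).
            "%_gender": 100.0 * sum(g == "M" for g, _ in drivers) / n,
            "%_age": 100.0 * sum(a < young_age for _, a in drivers) / n,
        }
        # Link to census data, using the postal code as the join key.
        row.update(census_rows.get(postal, {}))
        neighbourhoods[postal] = row
    return neighbourhoods


profiles = aggregate_by_postal_code(hrd_records, census)
```

Each value in `profiles` is one neighbourhood-level row of the kind that would feed the later cluster analysis.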
TABLE 2 Traffic Behaviour Indices derived from Factor Analysis Procedure

Behaviour Index               Orthogonal Factor (Construct)   Description of Variable Components
Human Condition Index (HC)    HC1                             Driver inexperience or confused state
                              HC2                             Medical disability, defective vision or hearing
                              HC3                             Extreme fatigue, falling asleep
                              HC4                             Losing consciousness or sickness
Human Action Index (HA)       HA1                             Following too closely
                              HA2                             Disregard for traffic control device
                              HA3                             Backing unsafely
                              HA4                             Taking evasive action such as hard braking
Impaired Driver Index (ID)    ID1                             Ability impaired by alcohol or drugs
Aggressive Driver Index       AD                              Speeding, running red lights, tailgating, weaving in and out of traffic, failing to yield the right of way
Distracted Driving Index      DD                              Inattention, distracted in or out of vehicle
Medical Index                 MI                              Medically at risk
Seat Belt Use Index           SBI                             Seat belt use rate

5. Modeling Procedure

The modeling procedure used in this study was based on two analyses: cluster analysis and the creation of a collision risk index and a high-risk driver penetration index.

5.1 Cluster Analysis

The first part of the cluster analysis was designed to investigate whether we should treat the high-risk group as a homogenous group or as a number of sub-units. We tested the hypothesis that the high-risk driver group is homogenous and therefore cannot be clustered. To test the hypothesis, the main categories and the traffic behaviour indices presented in Tables 1 and 2 were used as primary inputs for a cluster analysis, and the number of clusters was pre-specified. The clustering procedure used for this part of the analysis has been widely used in traffic safety research (29, 30, 31).
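As an illustration of clustering with a pre-specified number of clusters, the following is a minimal sketch of the standard k-means (Lloyd) iteration in plain Python. It is not the study's implementation: the toy two-variable neighbourhood profiles, the starting centroids, and the function names `squared_distance` and `k_means` are assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iter=100):
    """Lloyd's iteration; the number of clusters is len(centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(old))
                ))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Toy profiles: (percent male drivers, behaviour score). Values are made up.
toy_points = [
    (60.0, 8.0), (62.0, 9.0), (58.0, 7.0),
    (80.0, 19.0), (82.0, 20.0), (78.0, 21.0),
]
labels, centres = k_means(toy_points, centroids=[(60.0, 8.0), (80.0, 19.0)])
```

With real inputs measured on different scales, the variables would normally be standardized before distances are computed.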
The choice of clustering procedure was guided by two needs: to develop a set of high-risk driver clusters with minimal overlap of attributes, and to prevent outliers in the dataset from creating small clusters consisting of only a few observations. The cluster analysis procedure selected allows both principles to be addressed adequately. The procedure uses the k-means model to develop cluster centroids that are far apart from each other. It assigns each observation to the cluster whose centroid is closest, minimizing the within-cluster sum of squares, and the initial cluster centres are then replaced by the cluster means or centroids. This iteration is repeated until all high-risk drivers are assigned. The k-means clustering procedure used in this study is considered suitable for large datasets, as is the case in this study with 30,453 observations.

5.2 Collision Risk Index and High-Risk Drivers Penetration Index

Two indices were created to refine the main clusters and the sub-clusters: the collision risk index and the high-risk driver penetration index. The collision risk index (CI) was estimated as the cluster's share of all collisions divided by the cluster's share of all drivers. The number of drivers was defined as the number of licensed drivers. The CI for each cluster was estimated as follows:

$$
\text{Cluster Collision Index (CI)} = \frac{\text{Cluster Collisions} / \text{Total Collisions}}{\text{Cluster Driving Population} / \text{Total Driving Population}} \tag{1}
$$

Equation 1 indicates whether a cluster is overrepresented (CI > 1) or underrepresented (CI < 1) in the number of collisions. The equation treats the cluster as an entity and provides an indication of the collision risk relative to the Saskatchewan driving population.

The high-risk drivers penetration index (PI) was estimated as the cluster's share of all high-risk drivers divided by the cluster's share of the total population. The PI for each cluster was estimated as follows:

$$
\text{Cluster Penetration Index (PI)} = \frac{\text{High-Risk Drivers in Cluster} / \text{Total High-Risk Drivers}}{\text{Cluster Population} / \text{Total Population}} \tag{2}
$$

The Penetration Index provides an indication of whether high-risk drivers are overrepresented (PI > 1) or underrepresented (PI < 1) in any given cluster when compared with the general driver population profile. Unlike the collision index, which relies on collision incidents, the penetration index is a marketing term that signifies the degree to which traffic safety messages can reach a target group, equivalent to the market penetration rate. A PI, as computed in the study, is a normalized ratio that can be skewed in either direction. Thus, the threshold could be determined using the centroid for all cluster PIs, determined as:

$$
\frac{\sum_{i=1}^{n} \mathrm{PI}_i}{n}
$$

where n is the total number of final clusters to be identified. A high penetration cluster means that messages targeted at that cluster would reach more high-risk drivers than messages targeted at a low penetration cluster. The two indices are discussed further in Section 6.3.

6. RESULTS

6.1 Clusters and Sub-Clusters

Table 3 presents the results of the cluster analysis for HRDs in Saskatchewan. The cluster analysis suggests that HRDs can be segmented into a number of main and sub-clusters using postal codes as the unit of analysis. The main clusters (high-level postal codes aggregated using census subdivisions) are cities (accounting for 56% of all HRDs by population), followed by rural municipalities (18%) and towns (15%).
Within each main cluster, the cluster analysis found three to five sub-clusters. Sub-cluster 13 is the dominant sub-cluster for cities and includes Regina, Saskatoon, Moose Jaw, Humboldt and Swift Current. Sub-cluster 33 dominates the villages (and includes Holdfast, Avonlea and Medstead), and sub-cluster 25 dominates the towns (and includes Shaunavon, Milestone and Southey). In the rural municipalities, most HRDs are in sub-clusters 41, 43 and 45 (which include Garry, Tisdale and Wilton); on the Indian Reserves, most HRDs are in sub-cluster 55 (which includes Buffalo River Dene, Wapachewunak and Red Pheasant).

TABLE 3 Derived Neighborhood High Risk Clusters by Location

Main Cluster         Prefix   Sub-cluster Identity   Population   % of Total
City                 1        11                     24,310       56%
                              13                     345,025
                              14                     99,355
                              15                     93,320
Town                 2        22                     34,895       15%
                              23                     28,265
                              24                     9,725
                              25                     66,170
Village              3        31                     1,425        4%
                              32                     700
                              33                     36,155
                              34                     2,765
                              35                     1,925
Rural Municipality   4        41                     50,495       18%
                              43                     41,690
                              45                     83,625
Indian Reserves      5        51                     2,165        5%
                              52                     6,765
                              53                     11,670
                              55                     27,150
Others               6        62                     1,775        2%
                              63                     10,575
                              65                     4,940

Table 4 shows the number of drivers in each of the five main sub-clusters, the percentage who are male, and the dominant age category (relative to the total driving population). The table includes a Traffic Safety Behaviour Index composed of seven constructs. These constructs are described in Table 2. Each traffic safety behaviour construct is assigned points according to the correlation coefficient between the construct and the sub-cluster. If the correlation coefficient is significant and high, indicating a high level of undesirable traffic behaviour, the behaviour construct is assigned 3 points. A moderate level of undesirable behaviour is assigned 2 points, and a low level is assigned 1 point. Each sub-cluster is therefore assigned a total of 7 to 21 points.
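The scoring scheme can be sketched as follows. This is an illustrative sketch only: the construct labels are shorthand for the seven indices in Table 2, the low, moderate or high rating of each construct is taken as given (the correlation cut-offs behind those ratings are not stated here), and the function names are hypothetical. The bands applied to the total are 16.5 to 21 (high), 11.8 to 16.4 (moderate) and 7 to 11.7 (low).

```python
# Points per level of undesirable behaviour: low = 1, moderate = 2, high = 3.
POINTS = {"low": 1, "moderate": 2, "high": 3}

# Shorthand labels for the seven behaviour indices in Table 2.
CONSTRUCTS = ("HC", "HA", "ID", "AD", "DD", "MI", "SBI")


def behaviour_score(levels):
    """Sum the 1 to 3 point ratings over the seven behaviour indices."""
    missing = [c for c in CONSTRUCTS if c not in levels]
    if missing:
        raise ValueError(f"missing constructs: {missing}")
    return sum(POINTS[levels[c]] for c in CONSTRUCTS)


def score_band(total):
    """Map a total of 7 to 21 points onto the low, moderate and high bands."""
    if total >= 16.5:
        return "high"
    if total >= 11.8:
        return "moderate"
    return "low"


# Hypothetical ratings for one sub-cluster.
example_levels = {
    "HC": "high", "HA": "high", "ID": "moderate", "AD": "moderate",
    "DD": "high", "MI": "low", "SBI": "moderate",
}
example_total = behaviour_score(example_levels)
example_band = score_band(example_total)
```

Because totals are whole numbers, the band edges are equivalent to 17 or more points for a high score and 12 to 16 points for a moderate score.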
A high total score is defined as 16.5 to 21 points; a moderate total score is defined as 11.8 to 16.4 points; and a low total score is defined as 7 to 11.7 points.

Table 4 indicates that the high-risk sub-clusters differ in the percentage of males, the dominant age category, and the traffic behaviours. The Indian Reserve sub-cluster 55 has the worst score for traffic behaviours (21), followed by the main sub-cluster for towns (19) and the main sub-cluster for rural municipalities (15). Both the Indian Reserve and towns sub-clusters are dominated by young HRDs (aged 15 to 34). The relationship between the traffic behaviour results and the percentage of HRDs who are male is less clear cut.

TABLE 4 Main Sub-Clusters showing Characteristics and Traffic Safety Behaviour Index

6.2 Spatial Representation of Clusters and Sub-Clusters

Figures 1 and 2 use ArcGIS software to show the location of the sub-clusters. Figure 1 shows the two major cities (Regina and Saskatoon). Figure 2 shows the non-major cities and also the towns and villages, which are combined. The maps were created by layering the final clusters over shape files for Saskatchewan. The figures show that a sub-cluster of HRDs is not entirely dependent on location. Sub-clusters 11 and 14, for example, bring together Saskatoon and Regina HRDs who share similar characteristics, and sub-clusters 13 and 15 bring together Saskatoon and Regina HRDs who share similar characteristics with non-major city HRDs.

6.3 Target Clusters Behaviour

Identification of the main clusters, sub-clusters and their locations does not determine which sub-clusters are prime candidates for targeting. The collision risk index (CI) and high-risk drivers penetration index (PI) introduced in Section 5.2 were used to create the perceptual map shown in Figure 3.
With respect to the cluster penetration index (PI), the centroid stabilizes at a PI of 1.2. Thus, a PI of less than 1 is generally considered low, a PI between 1 and 1.2 is considered high, a PI between 1.2 and 1.5 is considered higher, and a PI greater than 1.5 is considered extremely high. The threshold at 1.2 ensures that those clusters with higher to extreme penetration are considered as primary targets to enhance penetration effectiveness.

The quadrant with a high CI and a high PI should contain the most important sub-clusters. In Figure 3, the Primary Target quadrant identifies these sub-clusters. Secondary clusters were defined as those lying in the quadrant with a high CI but a low PI. In Figure 3, the Secondary Target quadrant identifies these sub-clusters.

Figure 3 shows that six sub-clusters (11, 14, 15, 22, 23 and 24) are prime candidates for targeting HRDs with safety messages. These sub-clusters represent about 29% of the high-risk driver population and are associated with the major cities and the towns: Regina and Saskatoon (Cluster 11); Martensville (Cluster 14); Moose Jaw and Prince Albert (Cluster 15); Vonda and Liberty (Cluster 22); Invermay, Weldon, and Norquay (Cluster 23); and Canola, Kyle, and Herbert (Cluster 24).
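The index calculations of Section 5.2 and the quadrant logic described above can be sketched together as follows. The counts are invented for illustration, and the function names are hypothetical. Two further assumptions are made: a CI above 1 is treated as "high" (following the over-representation reading of Equation 1), and a PI strictly above the 1.2 threshold is treated as "high" for targeting purposes. The label "other" for the low-CI quadrants is also an assumption, since the text names only the primary and secondary quadrants.

```python
def representation_index(part, part_total, base, base_total):
    """Share of an outcome divided by share of the exposure base (Equations 1 and 2)."""
    return (part / part_total) / (base / base_total)


def classify_target(ci, pi, ci_cutoff=1.0, pi_cutoff=1.2):
    """Place a sub-cluster in a quadrant of the perceptual map."""
    if ci > ci_cutoff and pi > pi_cutoff:
        return "primary"
    if ci > ci_cutoff:
        return "secondary"
    return "other"  # low-CI quadrants; label assumed for this sketch


# Invented counts for three hypothetical sub-clusters.
clusters = {
    "A": {"collisions": 375, "drivers": 250, "hrds": 100, "population": 250},
    "B": {"collisions": 500, "drivers": 250, "hrds": 25, "population": 250},
    "C": {"collisions": 125, "drivers": 500, "hrds": 75, "population": 500},
}
totals = {
    key: sum(c[key] for c in clusters.values())
    for key in ("collisions", "drivers", "hrds", "population")
}

results = {}
for name, c in clusters.items():
    ci = representation_index(c["collisions"], totals["collisions"],
                              c["drivers"], totals["drivers"])
    pi = representation_index(c["hrds"], totals["hrds"],
                              c["population"], totals["population"])
    results[name] = (ci, pi, classify_target(ci, pi))
```

In this toy example, sub-cluster A is over-represented on both indices and falls in the primary quadrant, B has a high CI but a low PI and is a secondary target, and C is under-represented on both.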
FIGURE 1 Geospatial Presentation of High-Risk Sub-Clusters in Saskatoon and Regina

FIGURE 2 Geospatial Presentation of High-Risk Sub-Clusters in Towns, Villages and Non-Major Cities in Saskatchewan

[Figure 3 content. Axes: Collision Index and High-Risk Driver Penetration. Secondary Target quadrant: sub-clusters 13, 25, 32, 33. Primary Target quadrant: sub-clusters 11, 14, 15, 22, 23, 24.]

FIGURE 3 Perceptual Map showing Collision Index and Driver Penetration Index Values and Primary and Secondary Quadrants for Sub-Clusters of High-Risk Drivers

Table 5 shows the geodemographic variables and traffic safety behaviour index for the six primary target clusters. The table shows that the dominant age category and behaviour score of the primary clusters vary. Clusters 11 (found primarily in Regina and Saskatoon) and 14 (mainly in Saskatoon and Martensville) have behaviour scores of 21 and 17 respectively, an indication that they should be given priority attention when selecting target clusters. City-based sub-clusters had particularly high levels of distracted and aggressive driving behaviours, especially when compared with sub-clusters for towns. City-based sub-clusters were also more likely to be associated with driver inexperience, extreme fatigue, falling asleep behind the wheel, inattention, driving too fast for road conditions, exceeding the speed limit, and following too closely. Sub-clusters associated with towns were more likely to have medically at-risk drivers due to the higher proportion of seniors in those clusters. Although sub-clusters 11 and 14 both scored high on the behaviour index, Figure 3 shows that the risk of collision involvement was significantly higher for sub-cluster 11, probably due to a greater proportion of younger drivers in sub-cluster 11 than in sub-cluster 14.
HRDs in sub-cluster 15 differed from those in sub-clusters 11 and 14 as they were less likely to engage in impaired driving, be medically at risk, or fail to use seat belts. Town sub-clusters 22, 23 and 24 were appropriately ranked on the basis of their traffic safety behavioural score.

TABLE 5 Demographic Characteristics and Traffic Safety Behaviour Index Scores for Primary Target Sub-Clusters

7. DISCUSSION/CONCLUSION

Our research links geodemographic attributes of drivers, driver experience, risk profiles and collision data to gain insight into segmenting high-risk road users. The study provides a framework for identifying optimal high-risk driver segments through geodemographic modeling. Like consumer market segments, traffic safety segments can be defined by certain attributes. The study uses a number of geodemographic and traffic safety behaviour variables to define clusters of drivers whose high-risk behaviours should be targeted in safety messaging programs. The geodemographic attributes include gender, age, income, postal code, etc., and the traffic safety behaviour attributes include following too closely, impaired driving, use of seat belts, etc. We combine the geodemographic and traffic safety behaviour attributes to develop main cluster categories and sub-clusters of high-risk drivers. The sub-clusters are further divided into primary and secondary targets for traffic safety messaging.

The perceptual map shows that there is a linear relationship between the cluster collision index and the penetration index, an indication of the extent to which the developed clusters truly reflect the high-risk behaviour of the primary and secondary targets. For example, the results indicate that the dominant high-risk clusters are not necessarily the primary targets.
Cluster 13 from Table 4 does not appear as a primary target on the perceptual map, although it is identified as a dominant cluster for high-risk drivers. This could be attributed to the fact that this cluster ranks low on overall traffic safety behaviour, collisions and violations, probably due to the dominant age being 65+. On the other hand, Cluster 11, which is associated with the city category, ranks high because it is dominated by younger high-risk drivers aged 25-54. It is apparent that the clusters and primary targets portray the influence of age on collision involvement and traffic behaviour.

The approach used in this study has the potential to minimize the overlapping syndrome of segmenting high-risk drivers. For example, when sub-cluster 11 high-risk drivers are targeted, the message can target a number of traffic safety behaviours including aggressive driving and driver distraction while at the same time focusing on a particular location. This is consistent with the view that like-minded people tend to cluster together in similar neighbourhoods.

The limitations of the study must also be considered. The approach presented in this study overcomes the limitation of not being able to reach high-risk drivers who have not come to the attention of law enforcement officers. The way we estimate risk does not allow the inclusion of the expected loss of an outcome: we estimate risk loosely by examining the number of traffic-collision involvements in a given cluster relative to the corresponding exposure value (for licensed drivers or population), but risk is a probability concept which must be applied to determine the expected loss of an outcome.
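The loose risk measure described above can be sketched as follows. This is a minimal illustration, assuming the index is the cluster's collision involvement rate divided by the rate for all drivers and scaled to 100; the scaling, the function names and the counts in the example are hypothetical and are not taken from the study's data.

```python
def collision_rate(involvements: float, exposure: float) -> float:
    """Collision involvements per unit of exposure (e.g. per licensed driver)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return involvements / exposure


def collision_index(cluster_involvements: float, cluster_exposure: float,
                    total_involvements: float, total_exposure: float) -> float:
    """Cluster rate relative to the overall rate, scaled so that 100 = average."""
    cluster_rate = collision_rate(cluster_involvements, cluster_exposure)
    overall_rate = collision_rate(total_involvements, total_exposure)
    if overall_rate == 0:
        raise ValueError("overall collision rate is zero; index is undefined")
    return 100.0 * cluster_rate / overall_rate


# Hypothetical counts: 30 involvements among 1,000 drivers in the cluster,
# against 200 involvements among 10,000 drivers overall.
example_index = collision_index(30, 1000, 200, 10000)
```

A measure of this kind describes relative involvement only; it says nothing about the severity or cost of the collisions, which is the expected-loss component noted as missing.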
The lack of consideration for determining a high-risk driver cluster's probability of future involvement in a collision is also a weakness. We are also aware that geographic information is not perfect and therefore cannot be regarded as an absolutely accurate reflection of reality (33).

The limitations notwithstanding, this study shows that high-risk drivers cannot be considered homogeneous, as the group includes clear sub-clusters with varying collision risks, geodemographic attributes and traffic behaviour profiles. Different sub-clusters should be targeted differently, and it is important to identify primary and secondary targets, especially when resources are limited. In the case of Saskatchewan, the geodemographic modeling techniques used in this study have identified that high-risk drivers are found mainly in the major cities (56%), rural municipalities (18%) and towns (15%).

8. FUTURE WORK AND PRACTICAL APPLICATIONS

This study provides traffic safety stakeholders responsible for highways and infrastructure, transportation, city roads, and insurance with an innovative intelligence-based resource for targeting the most relevant HRD groups, thereby maximizing resources. For the approach to be successful, transportation professionals need quality data from a variety of sources. Defining the high-risk driver clearly at the outset will narrow the focus for successful identification of optimal primary and secondary sub-clusters for targeting purposes.

Traditionally, mass media target the whole of the HRD group regardless of the within-cluster risk variance, making the targeting inefficient. This study enables transportation professionals to focus on the main clusters of high risk, i.e., those drivers who generate considerable negative incidents from undesirable traffic behaviours. For example, Clusters 11 and 14 rank very high on the traffic safety behaviour index used in this study, and are prime targets for messaging.
The findings presented in this paper suggest that we should target city 20-54 year olds with messages on aggressive driving, impaired driving and distracted driving. In the towns, scores on the traffic safety behaviour index are mostly lower than in the cities. If resources are available for messaging in towns, we should target 55-64 year olds in Cluster 24. This should be the case in spite of the fact that Cluster 25, with a higher proportion of females aged 35-44, dominates the neighbourhood.

It would be interesting for future work to investigate which distinct geodemographic high-risk clusters are likely to be involved in future collisions, and at what loss outcome, as opposed to focusing on where in the road network future collisions are likely to occur. Such work would consider and predict likely variations in high-risk behaviour across different neighbourhoods and would further refine our ability to undertake focused traffic safety targeting.

An important area that was largely unexplored in our study is the extent to which individual high-risk drivers contribute to the collective group at different hierarchical levels. It is our view that such a study could be accomplished through the use of robust statistical techniques that employ model building and hypothesis testing within a theoretical framework. It would also be interesting to explore various clustering methods to see if different methods produce visibly different results. Future work could also employ psychographic and social values data to further define the target cluster profiles. An economic appraisal of the scientific risk potential of the high-risk drivers would provide estimates of the costs of high-risk drivers in dollar terms and would be useful in an analysis of the return on investment from campaigns that target high-risk drivers with traffic safety messages.
REFERENCES

1. Traffic Injury Research Foundation. Study of the Profile of High-Risk Drivers. TIRF, Ottawa, ON, 1995.
2. Mcsaveney, J., and Jones, W. High-Risk Drivers: An Exercise in Collision Data Analysis with What Was at Hand. Australasian Transport Research Forum 2011 Proceedings, 28-30 September 2011, Adelaide, Australia. http://www.patrec.org/atrf.aspx. Accessed Dec. 20, 2012.
3. NZ Ministry of Transportation. Safer Journeys: New Zealand's Road Safety Strategy 2010-2020. New Zealand Ministry of Transport. www.transport.govt.nz/saferjourneys/documents/saferjourneystrategy.pdf. Accessed Oct. 5, 2012.
4. US DOT Speed Management Strategic Initiative. FHWA, FMCSA, NHTSA, DOT HS 809 925, 2005.
5. Canadian Council of Motor Transport Administrators (CCMTA). Road Safety Strategy 2011-2015. http://www.ccmta.ca/crss-2015/_files/road_safety_strategy_2015.pdf. Accessed Jan. 25, 2013.
6. Alberta Ministry of Transportation. Alberta Safety Plan. www.transportation.alberta.ca/content/doctype48/production/traffic safetyplan.pdf. Accessed Nov. 1, 2012.
7. Canadian Council of Motor Transport Administrators. Strategy to Deal with the High-Risk Driver. High-Risk Driver (HRD) Task Force, Ottawa, ON, June 2001. http://www.ccmta.ca/english/committees/rsrp/highrisk/pdf/hr_hrd_strategy.pdf. Accessed Nov. 6, 2012.
8. Walsh, D., Chapman, R.E., Rudd, B.A. Social Marketing for Public Health. Health Affairs, Summer 1993, pp. 104-19.
9. Ott, C.H., and Haertlein, C. Social Norms Marketing: A Prevention Strategy to Decrease High-Risk Drinking Among College Students. Nursing Clinics of North America, 37(2), 2002, pp. 351-364.
10. Ashby, D.I., Longley, P.A. Geocomputation, Geodemographics and Resource Allocation for Local Policing. Transactions in GIS, 9(1), 2005, pp. 53-72.
11. Farr, M., Wardlow, J., Jones, C. Tackling Health Inequalities Using Geodemographics: A Social Marketing Approach. International Journal of Marketing Research, 50(4), 2008, pp. 449-467.
12. Goss, J. Geodemographics. International Encyclopedia of the Social & Behavioural, 2009, pp. 6166-6169.
13. Birkin, M., Clarke, G.P. Geodemographics. International Encyclopedia of Human Geography, 2009, pp. 382-389.
14. Petersen, J., Gibing, M., Longley, P., Mateos, P., Atkinson, P., Ashby, D. Geodemographics as a Tool for Targeting Neighbourhoods in Public Health Campaigns. Journal of Geographical Systems, 13(2), 2011, pp. 173 (20).
15. Heitgerd, J.L., Lee, C.V. A New Look at Neighbourhoods near National Priorities List Sites. Social Science & Medicine, 57(6), 2003, pp. 1117-1126.
16. Harman, B., and Murphy, M. The Application of Social Marketing in Reducing Road Traffic Collisions among Young Male Drivers: An Investigation Using Physical Fear Threat Appeals. International Journal of Business Management, July 2008. www.ccsenet.org/journal/index.php/ijbm/article/download/1412/1370. Accessed Dec. 20, 2012.
17. Cambois, M.A., Fontaine, H. Surveys Measuring Risk Exposure and the Combining of Results with Other Data. Accident Analysis & Prevention, 14(5), 1982, pp. 387-396.
18. Blatt, J., Furman, S.M. Residence Location of Drivers Involved in Fatal Collisions. Accident Analysis & Prevention, 30(6), 1998, pp. 705-11.
19. Shankar, U., and Warkell, K. Geo-Demographic Analysis of Fatal Motorcycle Collisions. Washington, D.C.: U.S. Dept. of Transportation, National Highway Traffic Safety Administration, DOT HS 809 197, 2007.
20. Anderson, T.K. Using Geodemographic to Measure and Explain Social and Environmental Differences in Road Traffic Collision Risk. Environment and Planning A, 42(9), 2010, pp. 2186-2200.
21. Murat Karacasu, Arzu Er. An Analysis on Distribution of Traffic Faults in Accidents, Based on Driver's Age and Gender: Eskisehir Case. Procedia - Social and Behavioural Sciences, Volume 20, 2011, pp. 776-785.
22. Elena Santamariña-Rubio, Katherine Pérez, Marta Olabarria, Ana M. Novoa. Gender differences in road traffic injury rate using time travelled as a measure of exposure. Accident Analysis & Prevention, Volume 65, April 2014, pp. 1-7.
23. Dana Yagil. Gender and age-related differences in attitudes toward traffic laws and traffic violations. Transportation Research Part F: Traffic Psychology and Behaviour, Volume 1, Issue 2, December 1998, pp. 123-135.
24. Dawn L. Massie, Kenneth L. Campbell, Allan F. Williams. Traffic Accident involvement rates by driver age and gender. Accident Analysis & Prevention, Volume 27, Issue 1, February 1995, pp. 73-87.
25. Lu Ma, Xuedong Yan. Examining the nonparametric effect of drivers' age in rear-end accidents through an additive logistic regression model. Accident Analysis & Prevention, Volume 67, June 2014, pp. 129-136.
26. Mohamed A. Abdel-Aty, A. Essam Radwan. Modeling traffic accident occurrence and involvement. Accident Analysis & Prevention, Volume 32, Issue 5, September 2000, pp. 633-642.
27. Guangnan Zhang, Kelvin K.W. Yau, Guanghan Chen. Risk factors associated with traffic violations and accident severity in China. Accident Analysis & Prevention, Volume 59, October 2013, pp. 18-25.
28. Beatriz González-Iglesias, José Antonio Gómez-Fraguela, Mª Ángeles Luengo-Martín. Driving anger and traffic violations: Gender differences. Transportation Research Part F: Traffic Psychology and Behaviour, Volume 15, Issue 4, July 2012, pp. 404-412.
29. Guo, F., and Fang, Youjia. Individual Driver Risk Assessment Using Naturalistic Driving Data. Accident Analysis & Prevention, Volume 61, 2013, pp. 3-9.
30. Saccamonno, F.F., and Lai, X. A Model for Evaluating Countermeasures at Highway-Railway Grade Crossings. Transportation Research Record: Journal of the Transportation Research Board, No. 1918, Transportation Research Board of the National Academies, Washington, D.C., 2005, pp. 18-25.
31. Donmez, B., Boyle, L.N., Lee, J.D. Differences in Off-Glances: Effects on Young Drivers' Performance. Journal of Transportation Engineering-ASCE, 136(5), pp. 403-409.
32. Schneider, J., Kasper, B. Lifestyles, Choice of Housing and Daily Mobility: The Lifestyle Approach in the Context of Spatial Mobility and Planning. International Social Science Journal, 55, 2003, pp. 319-332.
33. Duckham, M., Mason, K., Stell, J., Worbovs, M. A Formal Approach to Imperfection in Geographic Information. Computers, Environment and Urban Systems, 25(1), 2001, pp. 89-103.