Crash the Crash Test Dummy

Ressler 0
An Analysis of Individual Factors in Fatal Car Crashes
ESE 302
May 7, 2004
Alexandra Ressler
Table of Contents
Introduction
  Crash’s Friends
  Data Selection
  Logistic Regression
  Assumptions
  A Brief Summary of Findings
Logistic Regression of All Data
Logistic Regression of Driver Data
Conclusions
  Crash’s Friends Revisited
  Questions Raised for Further Study
Appendix A: Data Selection Criteria
Appendix B: Histograms for All Data
Appendix C: Problems with Airbags
Introduction
Crash the crash test dummy is vitally interested in which individual factors have the
greatest effect on his chance of survival in a fatal car crash. Car crashes are the leading cause of
accidental death in the United States, and in 2002, there was a car crash fatality every 12 minutes
and a disabling injury every 14 seconds. In that year, “motor vehicle crashes were the leading
cause of death for people ages 1 to 33”1.
Leading Causes of Unintentional Injury Deaths, United States, 20021

Cause                                                            Deaths
Motor Vehicle                                                    44,000
Poisoning                                                        15,700
Falls                                                            14,500
Suffocation by Inhalation or Ingestion of Food or Other Object    4,200
Drowning                                                          3,000
Crash simulates a lot of fatal car crashes, but he doesn’t get to pick the circumstances of
each crash, such as the causes, environmental conditions, and vehicle characteristics. He does
know that due to a limited testing budget, he’ll only be tested in the most common types of
private automobiles, including cars, utility vehicles, vans, and pickup trucks. He also knows that
due to limited testing equipment, he only has to worry about impacts from the front, left side,
right side, and rear. Given this information, Crash would like to know whether he can improve
his predicted chance of survival by emulating any of his friends, each of whom personifies a
particular individual trait.
1 National Safety Council, <http://www.nsc.org/library/report_injury_usa.htm>
Crash’s Friends
Seatbelt Sid2: Sid always uses a restraining system, whether it’s a lap belt, a shoulder belt, or a child safety seat (in the case of Sid Jr.). He’s always telling Crash to buckle up. Is he being sanctimonious, or does he have a point?

Crash Jr.3: Okay, he can’t drive, but just because he’s not behind the wheel doesn’t mean Crash Jr.’s out of harm’s way. Or does it? And will he be any safer when he’s as old as Crash and has his own license? In the meantime, he’d love to ride shotgun (as it’s the cool thing to do), but whether Crash should let him depends on Backseat Bob.

Crashella4: Does Crashella give new meaning to the term “femme fatale”? Or is she better off than her bulkier brother? And who’s a safer driver, anyway?

Driver Dan5: He’s cool, he’s hot, and he’s driving, thank you. Driver Dan loves the freedom of the road and the wheel at his fingertips, but would he be safer if he let his girlfriend Crashella drive?

Backseat Bob6: Forget backseat driver, Bob’s a backseat passenger! He won’t touch the passenger seat, and he doesn’t think anyone else should either, especially Crash Jr. Should Crash listen to Bob and put Crash Jr.’s reputation on the critical list, or would that be overprotective parenting at its worst?

Airbag Al7: Al’s a bit of an airhead, but he claims there’s less to damage that way. Should Crash listen to his bubbleheaded philosophy, or is Al just a windbag?

2 http://www.bertonex19.de/Technik_/Crashtest/TC1999dummy.jpg
3 http://webs.lanset.com/aeolusaero/images/Crash%20test%20dummy%20baby.jpg
4 http://www.level7.zeves.ru/art/crash_dummies.jpg
Data Selection
Crash took the data for his analysis from the Fatality Analysis Reporting System’s
(FARS) 2002 Case Listings8. His data set included information for 56,833 individuals involved
(but not necessarily killed or injured) in fatal car crashes in 2002. These individuals represented
35,783 vehicles and 25,765 fatal crashes9. In general, Crash took the following data from each
individual, converted it to binary (0/1) data (with the exception of Age), and sorted it as follows:

Variable                            Coding
Impact Point-Principal              1 Front; 2 Left; 3 Right; 4 Rear
Age                                 Not sorted; left as a continuous numerical variable.
Air Bag                             1 Airbag Deployed; 0 No Airbag Deployed
Injury Severity                     0 FATAL; 1 NOT
Person Type                         0 Driver; 1 Passenger
Restraint System-Use                0 No Restraint System-Use; 1 Restraint System Used
Seating Position                    0 Front Seat; 1 Second Seat
Gender                              0 Male; 1 Female
Body Type (the vehicle body type)   1 Automobiles; 2 Utility Vehicles; 3 Vans; 4 Pickup Trucks
All independent variables are continuous, while the dependent variable Injury Severity is
nominal. Impact Point-Principal and Body Type are not actually used in the regression analysis,
but serve as important selection criteria. The remaining variables used in the regression analysis
are Bernoulli variables, to simplify the analysis10.
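The recoding described above can be sketched in a few lines of Python. This is only an illustration: the raw field names and labels (`air_bag_deployed`, `seat_row`, and so on) are hypothetical stand-ins, not the actual FARS export format, and only the 0/1 codes themselves come from the list above.

```python
# Hypothetical raw record layout: the keys and raw labels are illustrative
# assumptions, not the real FARS field names. Only the 0/1 codes on the
# right-hand side follow the coding scheme listed in the report.
def recode(record):
    """Map one raw person record to the report's coded variables."""
    return {
        "age": record["age"],                                   # kept continuous
        "air_bag": 1 if record["air_bag_deployed"] else 0,      # 1 = deployed
        "injury_severity": 0 if record["fatal"] else 1,         # 0 = FATAL, 1 = NOT
        "person_type": 0 if record["person_type"] == "driver" else 1,
        "restraint": 1 if record["restraint_used"] else 0,      # 1 = restraint used
        "seating_position": 0 if record["seat_row"] == "front" else 1,
        "gender": 0 if record["gender"] == "male" else 1,
    }


example = {
    "age": 34,
    "air_bag_deployed": False,
    "fatal": True,
    "person_type": "driver",
    "restraint_used": True,
    "seat_row": "front",
    "gender": "female",
}
coded = recode(example)
print(coded)
```

Note that FATAL is coded 0 rather than 1, which is why the later statistics reports are labeled "For log odds of FATAL/NOT".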
5 http://www.aidanbell.com/pics/thumbs/Crash%20Test%20Dummy.jpg
6 http://www.7er.com/modelle/e32/images/e32_crashtest_dummy.jpg
7 http://www.n-tv.de/images/200207/3053071_VW_CrashtestDummy.jpg
8 Fatality Analysis Reporting System’s Web-Based Encyclopedia, <http://www-fars.nhtsa.dot.gov/queryReport.cfm?stateid=0&year=2002>
9 For the exact criteria used to select individuals, please see Appendix A.
10 And because crash test dummies like dummy variables. Were these variables not Bernoulli and treated as
nominal, the logistic regressions would calculate an estimate for every possible outcome of the nominal variable; for
Logistic Regression11
As the dependent variable Injury Severity is nominal, Crash cannot use multiple regression,
and consequently cannot determine prediction intervals, r-squared values, or variance inflation
factors. Instead, Crash uses logistic regression, which is designed for Bernoulli dependent
variables and predicts the probability of an outcome rather than the outcome itself. Under
logistic regression, the parameter estimate for each independent variable is called the
maximum-likelihood estimate (as opposed to the least-squares estimate for multiple regression). As a set,
the maximum-likelihood estimates are the values under which the given observed values are most
likely to occur.
As an example of how the estimates work, let’s say the estimate for Age is -0.02. Each estimate
acts on the log odds of being a fatality, so for each additional year of age the individual’s
predicted log odds decrease by 0.02. Equivalently, the odds are multiplied by e^-0.02, or about
0.98, which is a decrease of about two percent in the odds (not in the probability itself). In the
case of a binary variable, if Restraint System-Use has an estimate of -0.78, then the use of a
restraint system multiplies the individual’s predicted odds of being a fatality by e^-0.78, or
about 0.46, which is a decrease of about 54% in the odds.
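To make the reading of an estimate concrete, here is a minimal Python sketch that converts a logistic-regression estimate into an odds ratio and into a shifted probability. The two estimates are the illustrative values from the paragraph above; the 50% baseline probability is an assumption chosen only for the demonstration.

```python
import math


def odds_ratio(estimate):
    """Factor by which the odds change for a one-unit increase in the variable."""
    return math.exp(estimate)


def shifted_probability(baseline_p, estimate):
    """Probability after applying the estimate once to a baseline probability."""
    odds = baseline_p / (1.0 - baseline_p)
    new_odds = odds * math.exp(estimate)
    return new_odds / (1.0 + new_odds)


# Illustrative estimates from the text; 0.5 is an assumed baseline probability.
for name, est in [("Age (per year)", -0.02), ("Restraint System-Use", -0.78)]:
    print(name, round(odds_ratio(est), 3), round(shifted_probability(0.5, est), 3))
```

The change in probability depends on the baseline, which is why an estimate is best read as a change in odds rather than as a fixed percentage change in probability.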
The χ2 value for a maximum-likelihood estimate is its equivalent of a least-squares estimate’s
F value, and equals the square of its standardized value (the estimate divided by its standard
error) under the null hypothesis. The greater the χ2 value, the more significant the variable.
The p-value for each estimate is reported as Prob>ChiSq, the probability of obtaining a χ2 value
at least this large if the null hypothesis (estimate=0) were true; the lower the p-value, the more
significant the variable. Other tests for variable significance are the Wald Tests for Effects,
which assess each variable’s contribution to the fitted regression. Crash uses χ2 values to compare
the relative significance of independent variables, as the Wald Tests for Effects produce the same
relative significances among independent variables in his regressions.
10 (continued) example, each seating position would be treated as a separate variable. N.B. The independent variables must be
continuous.
11 Information from “Notes on Logistic Regression” by Tony E. Smith and the JMPIN 4 online manual.
It is important to note that as n grows, the standard errors of the parameter estimates shrink,
so the χ2 values grow roughly in proportion to n and become less relevant in the absolute sense.
In other words, “the scope of Chi-square statistics is limited when n becomes very large, the
smallest departure from the target becoming statistically significant”12. In regressions with very
large sample sizes, χ2 values are appropriate to compare the relative significance among variables
and regressions, but the values themselves do not adhere to the normal absolute standards. For
example, a χ2 value of 4 may reflect a “reasonably good” effect in a sample with n=100, but only
a negligible effect in a sample with n=10,000.
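The effect of sample size can be illustrated with a small Python sketch. It uses a two-proportion comparison with equal group sizes rather than a logistic regression, which is a simplifying assumption, and the proportions 0.52 and 0.50 are made up for the demonstration. The underlying difference stays fixed while the χ2 statistic grows in proportion to n.

```python
def two_proportion_chi2(p1, p2, n_per_group):
    """Squared z statistic for the difference of two proportions (equal group sizes)."""
    pooled = (p1 + p2) / 2.0                      # pooled proportion for equal groups
    variance = pooled * (1.0 - pooled) * (2.0 / n_per_group)
    return (p1 - p2) ** 2 / variance


# Same two-point difference in proportions, increasingly large samples.
for n in (100, 10_000, 1_000_000):
    print(n, round(two_proportion_chi2(0.52, 0.50, n), 3))
```

With 100 observations per group the statistic is far below 3.84 (the usual 5% cutoff for one degree of freedom), while with 10,000 per group the same small difference clears it.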
To examine goodness of fit, Crash uses two metrics: the ChiSquare from the Whole Model
Test, which compares the regression model to a model with all parameters but the intercepts
removed, and the success rate. The success rate is calculated by rounding each individual’s
predicted chance of being a fatality to predict whether or not he or she was a fatality, then
comparing said prediction to the individual’s actual injury severity. The success rate is the
percentage of accurate predictions. Since the goal of all regressions is to find the mix of
independent variables that allows the most accurate predictions, success rate is obviously a better
metric.
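The success-rate calculation described above can be sketched as follows. The probabilities and outcomes are made-up illustrative values, not figures from the FARS data, and treating a probability of exactly 0.5 as a predicted fatality is an assumption about how ties are rounded.

```python
def success_rate(predicted_p_fatal, was_fatal):
    """Share of individuals whose rounded prediction matches the actual outcome."""
    correct = 0
    for p, actual in zip(predicted_p_fatal, was_fatal):
        predicted_fatal = p >= 0.5        # assumed tie rule: 0.5 rounds to "fatal"
        if predicted_fatal == actual:
            correct += 1
    return correct / len(was_fatal)


# Made-up predicted probabilities and actual outcomes for five individuals.
probs = [0.81, 0.35, 0.62, 0.12, 0.55]
fatal = [True, False, False, False, True]
rate = success_rate(probs, fatal)
print(rate)
```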
12 <http://www.stat.auckland.ac.nz/~iase/publications/3/3269.pdf>
Assumptions
• Crash assumes that his independent variables are the most significant individual factors
of those included in the FARS case listings. It is possible that he’s neglecting more
significant but less intuitive factors (e.g., perhaps he should consider his friend Drake the
Drunk or Donny the Designated Driver, though making alcohol level a selection criterion
would severely limit his sample size).
• Crash assumes that he hasn’t inadvertently excluded any variable-specific categories that
are significant within his chosen variables13.
• Crash assumes that his individuals are independent within his chosen variables. This is a
faulty assumption for many reasons, e.g. when two or more people come from the same
car, they cannot have independent seating positions and at most one can be the driver.
• Crash assumes that his variables are independent within each individual (i.e., no
multicollinearities). This is a very faulty assumption, as Person Type dictates Seating Position
(the driver must be in the front) and Airbag availability is also limited to the front.
Furthermore, Person Type influences Age (almost no drivers are younger than 16). This
flaw will be addressed within the regressions.
• The Gauss-Markov assumptions of linearity, independence, and homoscedasticity are not
used for logistic regression. Though logistic regression “does not have the requirements
of the independent variables to be normally distributed, linearly related, nor equal
variance within each group (Tabachnick and Fidell, 1996, p575)”, it requires large
sample groups14.
13 For example, Person Type 3. For more information, please see Appendix A.
14 http://www.kmentor.com/socio-tech-info/archives/000480.html
A Brief Summary of Findings
Using logistic regressions, Crash finds that he can accurately predict whether a person
was a fatality about 70% of the time. As long as he includes Restraint System as an independent
variable, this is true whether he looks at all individuals, just drivers, or just passengers. This
figure, while moderately disappointing compared to other logistic regressions (where the mid-80s is
considered “reasonably good” and 90% is considered “quite respectable”15), is nonetheless
impressive when taken in context. The factors that affect a person’s survival in a fatal car crash
are not limited to individual variables, but also extend to the causes of the crash, environmental
conditions, and vehicle characteristics. If one of the individuals in the sample drove their car off
a hundred-foot cliff, the data would register whether the driver used a restraint system, but not
that regardless of whether or not the driver used a restraint system he or she had almost no
chance of survival. The logistic regression models based on individual variables are limited in
their prediction accuracy because they ignore significant external factors that affect chance of
survival. When viewed in this light, Crash’s 70% success rate is respectable.
15 Based on the “Analysis of Changing Religious Perspectives” report and “Notes on Logistic Regression” by Tony
E. Smith on the class website.
Logistic Regression of All Data
Crash first examines the entire data set. His first regression includes all variables.
Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        4885.923     6    9771.846       0.0000
Full             33264.685
Reduced          38150.608
RSquare (U)                   0.1281
Observations (or Sum Wgts)    56833
Converged by Gradient

Lack Of Fit
Source          DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit   1739        1399.472    2798.944       <.0001
Saturated     1745       31865.213
Fitted           6       33264.685

Parameter Estimates
Term                            Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept                       -0.0798436    0.0249661       10.23       0.0014
Restraint System-Use SORTED     -1.5615478    0.020127       6019.4       0.0000
Age                              0.02082281   0.0004871      1827.1       0.0000
Gender SORTED                    0.10701923   0.0200556       28.47       <.0001
Person Type SORTED              -0.4761275    0.0233042      417.42       <.0001
Seating Position SORTED 1v2     -0.4809617    0.0340468      199.56       <.0001
Airbag SORTED                    0.14609065   0.021654        45.52       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                          Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED         1    1       6019.39665       0.0000
Age                                 1    1       1827.09479       0.0000
Gender SORTED                       1    1        28.474345       0.0000
Person Type SORTED                  1    1       417.424313       0.0000
Seating Position SORTED 1v2         1    1       199.558036       0.0000
Airbag SORTED                       1    1       45.5165284       0.0000

The statistics report has several notable features. First of all, the Whole Model Test has an
astronomical ChiSquare value of 9,771.846. Rather than indicating a ludicrously good fit, the
order of magnitude suggests that the unusually monstrous sample size has inflated the χ2
statistics. Consequently, all of Crash’s logistic regressions will have enormous ChiSquare
values, which therefore cannot be used to determine
absolute significance, but are nonetheless helpful in determining relative significance.
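The ChiSquare column of the parameter estimates can be checked by hand, since each value is the squared ratio of an estimate to its standard error. The following Python sketch recomputes it from the estimates and standard errors in the report above and ranks the variables by relative significance; the dictionary layout and function name are illustrative choices, not part of the JMP output.

```python
# (estimate, standard error, ChiSquare as printed) from the all-data report.
estimates = {
    "Restraint System-Use": (-1.5615478, 0.020127, 6019.4),
    "Age": (0.02082281, 0.0004871, 1827.1),
    "Gender": (0.10701923, 0.0200556, 28.47),
    "Person Type": (-0.4761275, 0.0233042, 417.42),
    "Seating Position": (-0.4809617, 0.0340468, 199.56),
    "Airbag": (0.14609065, 0.021654, 45.52),
}


def wald_chi2(estimate, std_error):
    """Wald chi-square: the squared standardized estimate."""
    return (estimate / std_error) ** 2


recomputed = {name: wald_chi2(est, se) for name, (est, se, _) in estimates.items()}

# Rank variables from most to least significant by recomputed chi-square.
ranked = sorted(recomputed, key=recomputed.get, reverse=True)
for name in ranked:
    print(f"{name:22s} {recomputed[name]:10.2f}  (reported {estimates[name][2]})")
```

Small differences from the printed values come from the rounding of the displayed standard errors.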
Having navigated around this first pothole, Crash hits another bump in the road when he
notes that Airbag has a positive estimate, meaning it decreases one’s chances of survival. Since
this seems very counterintuitive, ceteris paribus, this leads Crash to suspect the multicollinearity
among Person Type, Airbag, and Seating Position mentioned earlier. By fitting Person Type and
Airbag by Seating Position, Crash realizes that all drivers sit in the front and that no one in the
second row of seats has an airbag, meaning these three variables are inescapably collinear.
Contingency Analysis of Person Type SORTED By Seating Position SORTED 1v2
[Mosaic plot: Person Type SORTED by Seating Position SORTED 1v2]

Contingency Table (rows: Seating Position SORTED 1v2; columns: Person Type SORTED)
Seating Position              Person Type 0   Person Type 1     Total
0             Count                 34270           13426       47696
              Total %               60.30           23.62       83.92
              Col %                100.00           59.50
              Row %                 71.85           28.15
1             Count                     0            9137        9137
              Total %                0.00           16.08       16.08
              Col %                  0.00           40.50
              Row %                  0.00          100.00
Total         Count                 34270           22563       56833
              Total %               60.30           39.70

Tests
Source       DF      -LogLike    RSquare (U)
Model            1    9830.790        0.2575
Error        56831   28348.409
C. Total     56832   38179.199
N            56833

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio    19661.58       0.0000
Pearson             16536.34       0.0000

Fisher's Exact Test     Prob
Left                  1.0000
Right                 0.0000
2-Tail                0.0000

Kappa      Std Err
0.45077    0.003462
Kappa measures the degree of agreement.

Contingency Analysis of Airbag SORTED By Seating Position SORTED 1v2
[Mosaic plot: Airbag SORTED by Seating Position SORTED 1v2]

Contingency Table (rows: Seating Position SORTED 1v2; columns: Airbag SORTED)
Seating Position              Airbag 0        Airbag 1          Total
0             Count                 32754           14942       47696
              Total %               57.63           26.29       83.92
              Col %                 78.19          100.00
              Row %                 68.67           31.33
1             Count                  9137               0        9137
              Total %               16.08            0.00       16.08
              Col %                 21.81            0.00
              Row %                100.00            0.00
Total         Count                 41891           14942       56833
              Total %               73.71           26.29

Tests
Source       DF      -LogLike    RSquare (U)
Model            1    3087.878        0.0943
Error        56831   29652.442
C. Total     56832   32740.320
N            56833

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio    6175.756       0.0000
Pearson             3883.383       0.0000

Fisher's Exact Test     Prob
Left                  0.0000
Right                 1.0000
2-Tail                0.0000

Kappa       Std Err
-0.24926    0.001821
Kappa measures the degree of agreement.
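The structural collinearity is visible directly in the cell counts. The following Python sketch takes the Person Type by Seating Position counts from the contingency table above and flags the empty cell; the dictionary layout, labels, and helper name are illustrative choices rather than anything produced by JMP.

```python
# Counts from the Person Type by Seating Position contingency table.
counts = {
    ("front", "driver"): 34270,
    ("front", "passenger"): 13426,
    ("second", "driver"): 0,
    ("second", "passenger"): 9137,
}

# A cell with zero observations means one variable fully determines the other
# for that category: here, nobody in the second row of seats is a driver.
empty_cells = [cell for cell, n in counts.items() if n == 0]


def row_share(table, seat, person):
    """Share of a seating row made up by the given person type (the Row %)."""
    row_total = sum(n for (s, _), n in table.items() if s == seat)
    return table[(seat, person)] / row_total


print("Empty cells:", empty_cells)
print("Drivers among front-seat occupants:", round(row_share(counts, "front", "driver"), 4))
```

The same check applied to the Airbag table would flag the second-seat, airbag-deployed cell, which is the other half of the collinearity Crash describes.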
Additionally, Crash realizes that drivers are almost all 16 years of age or older, producing
another collinearity and further degrading the integrity of his regression.
To compensate for these developments, Crash reruns his regression multiple times with
different sets of variables16. His final results are summarized in the following table:
16 A dummy’s version of stepwise regression, which JMPIN 4 does not allow for logistic regressions. Crash can
approximate stepwise regression by not sorting Injury Level and treating it as a continuous variable, but he does not
need to as the number of variables is small enough that he can run the necessary regressions on his own.
Variables included                                  χ2 for WMT   Success Rate
Restraint, Age, Gender, Person Type,
  Seat Position, Air Bag                              9771.846       0.706421
(unclear in source)                                   9726.375       NA
(unclear in source)                                   9568.723       NA
Restraint, Age, Gender, Seat Position, Air Bag        9346.831       NA
Restraint, Age, Gender, Person Type                   9486.005       0.706649
(unclear in source)                                   9274.496       NA
(unclear in source)                                   8637.149       NA
(unclear in source)                                   8414.179       NA
Restraint, Age, Person Type                           9453.883       0.705945
Restraint, Age                                        8412.845       0.691236
Restraint                                             5237.82        0.669963
One variable other than Restraint
  (unclear in source)                                 2187.603       0.626995

Crash notes that the regression that includes all the variables has the highest ChiSquare and
nearly the best success rate, but this is still not the best regression because of its
multicollinearities. Given that Person Type alone produces a higher ChiSquare than Seat Position
and Air Bag combined (9486.005>9346.831), Crash decides to examine Air Bag in a separate driver
regression, and to examine Seat Position in a separate passenger regression. Furthermore, since
Gender contributes a negligible 0.0704 percentage points to the success rate (0.706649 versus
0.705945), Crash discards this variable and selects Restraint System, Age, and Person Type as the
most significant factors in determining the probability that an individual in a fatal car crash is
a fatality17. This regression represents the most balanced option between the contradictory goals
of increasing success rate and eliminating multicollinearity.

Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        4726.941     3    9453.883       0.0000
Full             33423.666
Reduced          38150.608
RSquare (U)                   0.1239
Observations (or Sum Wgts)    56833
Converged by Gradient

Lack Of Fit
Source          DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit    365         650.884    1301.768       <.0001
Saturated      368       32772.782
Fitted           3       33423.666

Parameter Estimates
Term                            Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept                       -0.0829646    0.0236603       12.30       0.0005
Restraint System-Use SORTED     -1.518155     0.0197314      5919.9       0.0000
Age                              0.0222467    0.0004792      2155.3       0.0000
Person Type SORTED              -0.6404254    0.020073       1017.9       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                          Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED         1    1       5919.94559       0.0000
Age                                 1    1       2155.31517       0.0000
Person Type SORTED                  1    1       1017.91277       0.0000

17 Crash tolerates the collinearity between Age and Person Type because the marginal benefit of the latter to success
rate is almost 1.5% and because its ChiSquare is almost half as large as Age’s, making it undesirable to discard.
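As an illustration of how the three-variable regression above turns into a prediction, the following Python sketch plugs its parameter estimates into the logistic function. It assumes, as the report's "For log odds of FATAL/NOT" label indicates, that the linear predictor is the log odds of being a fatality; the function name and the example individuals are hypothetical.

```python
import math

# Parameter estimates from the Restraint System, Age, Person Type regression.
COEF = {
    "intercept": -0.0829646,
    "restraint": -1.518155,    # 1 = restraint system used
    "age": 0.0222467,          # per year of age
    "person_type": -0.6404254, # 1 = passenger, 0 = driver
}


def p_fatal(restraint, age, person_type):
    """Predicted probability of being a fatality for one individual."""
    log_odds = (
        COEF["intercept"]
        + COEF["restraint"] * restraint
        + COEF["age"] * age
        + COEF["person_type"] * person_type
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical example: a 40-year-old driver, without and with a restraint.
unbelted = p_fatal(restraint=0, age=40, person_type=0)
belted = p_fatal(restraint=1, age=40, person_type=0)
print(round(unbelted, 3), round(belted, 3))
```

Rounding each probability to 0 or 1 gives the prediction that is compared against the actual outcome when computing the success rate.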
Logistic Regression of Driver Data
By focusing on the drivers in his data set, Crash can get a better idea of the significance
of Air Bag when isolated from its multicollinearities with other individual factors. He can
investigate a possible link between safe driving and gender, and he can determine how the
collinearity between Person Type and Age affects the model from the driver perspective. His
initial regression produces this statistics report:
Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        3086.346     4    6172.692       0.0000
Full             20564.004
Reduced          23650.350
RSquare (U)                   0.1305
Observations (or Sum Wgts)    34270
Converged by Objective

Lack Of Fit
Source          DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit    652         531.338    1062.676       <.0001
Saturated      656       20032.666
Fitted           4       20564.004

Parameter Estimates
Term                            Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept                        0.03350614   0.0313377        1.14       0.2850
Restraint System-Use SORTED     -1.8182414    0.0260371      4876.6       0.0000
Age                              0.02168777   0.0006449      1131.1       <.0001
Gender SORTED                    0.15882526   0.0261276       36.95       <.0001
Air Bag SORTED                   0.15105195   0.0252463       35.80       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                          Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED         1    1       4876.60705       0.0000
Age                                 1    1       1131.11759       0.0000
Gender SORTED                       1    1       36.9521082       0.0000
Air Bag SORTED                      1    1       35.7978096       0.0000

Even within the scope of the driver, Air Bag is still the least significant variable and has a
positive estimate. Suspecting further collinearities, Crash runs a contingency analysis on Air Bag
and Restraint System to see if perhaps people who use restraint systems are more safety-sensitive
in general, and therefore more likely to have a functional airbag. Crash also runs a contingency
analysis on Injury Severity and Gender to investigate whether males or females have a better
chance of survival in the driver’s seat and therefore might be considered “safer” drivers.

Contingency Analysis of Air Bag SORTED By Restraint System-Use SORTED
[Mosaic plot: Air Bag SORTED by Restraint System-Use SORTED]

Contingency Table (rows: Restraint System-Use SORTED; columns: Air Bag SORTED)
Restraint System-Use          Air Bag 0       Air Bag 1         Total
0             Count                  8554            3776       12330
              Total %               24.96           11.02       35.98
              Col %                 37.61           32.77
              Row %                 69.38           30.62
1             Count                 14192            7748       21940
              Total %               41.41           22.61       64.02
              Col %                 62.39           67.23
              Row %                 64.69           35.31
Total         Count                 22746           11524       34270
              Total %               66.37           33.63

Tests
Source       DF      -LogLike    RSquare (U)
Model            1      39.178        0.0018
Error        34268   21843.275
C. Total     34269   21882.453
N            34270

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio      78.356       <.0001
Pearson               77.795       <.0001

Fisher's Exact Test     Prob
Left                  1.0000
Right                 <.0001
2-Tail                <.0001

Kappa       Std Err
0.039578    0.004443
Kappa measures the degree of agreement.

Contingency Analysis of Injury Severity SORTED By Gender SORTED
[Mosaic plot: Injury Severity SORTED by Gender SORTED]

Contingency Table (rows: Gender SORTED; columns: Injury Severity SORTED)
Gender                        Injury Severity 0   Injury Severity 1     Total
0             Count                 12722               11082           23804
              Total %               37.12               32.34           69.46
              Col %                 68.89               70.13
              Row %                 53.44               46.56
1             Count                  5746                4720           10466
              Total %               16.77               13.77           30.54
              Col %                 31.11               29.87
              Row %                 54.90               45.10
Total         Count                 18468               15802           34270
              Total %               53.89               46.11

Tests
Source       DF      -LogLike    RSquare (U)
Model            1       3.106        0.0001
Error        34268   23647.243
C. Total     34269   23650.350
N            34270

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio       6.213       0.0127
Pearson                6.209       0.0127

Fisher's Exact Test     Prob
Left                  0.0066
Right                 0.9939
2-Tail                0.0131

Kappa       Std Err
-0.01275    0.005111
Kappa measures the degree of agreement.

In both cases, though the data showed a slight disparity (suggesting those who use restraints are
more likely to have functional airbags and that female drivers are slightly more likely to be
fatalities, 54.90% versus 53.44% for male drivers), the disparity was not large enough to support
a definite conclusion. If Crash analyzed several years’ worth of data and found similar slight
disparities each time, then that might allow him to reasonably identify relationships between the
above variables, but since his data is only from 2002, the slight inequalities in weight could very
well be normal variation.
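The Pearson ChiSquare values in the two contingency analyses above can be reproduced from the four cell counts alone. The following Python sketch uses the standard shortcut formula for a 2x2 table; the function name is an illustrative choice.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Cell counts taken from the driver contingency tables in the report.
airbag_by_restraint = pearson_chi2_2x2(8554, 3776, 14192, 7748)
injury_by_gender = pearson_chi2_2x2(12722, 11082, 5746, 4720)
print(round(airbag_by_restraint, 3), round(injury_by_gender, 3))
```

Both statistics are tiny next to the regression ChiSquare values in the thousands, which is the sense in which Crash treats these disparities as slight.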
Having failed to identify relationships involving Air Bag and Restraint System or Gender and
Injury Severity, Crash moves on to a known culprit: Age. The histogram and normal quantile plot
for Age are shown below. It is plainly evident that the number of drivers drops to almost zero
below 16 (the minimum is 7, but the 0.5% quantile is already 16) and that the distribution’s tails
are not normal. In particular, the plot’s residuals are positive for younger drivers (age less than
21) and older drivers (age greater than 50). Crash posits that Age’s significance will be
dramatically reduced by its lower cutoff point.

Distributions: Age
[Histogram and Normal Quantile Plot of driver Age]

Quantiles
100.0%   maximum    97.000
 99.5%              89.000
 97.5%              83.000
 90.0%              70.000
 75.0%   quartile   52.000
 50.0%   median     37.000
 25.0%   quartile   24.000
 10.0%              19.000
  2.5%              16.000
  0.5%              16.000
  0.0%   minimum     7.000

Moments
Mean              40.03274
Std Dev           19.00742
Std Err Mean       0.10268
upper 95% Mean    40.23399
lower 95% Mean    39.83149
N                    34270
To test this hypothesis, Crash runs a series of regressions whose results are summarized
in the table below:

Restraint   Age   Gender   Air Bag   χ2 for WMT   Success Rate
    X        X      X         X        6172.692       0.701401
    X        X      X                  6136.883       0.701109
    X        X                X        6135.714       0.700263
    X        X                         6094.431       0.700671
    X                                  4906.507       0.69005
             X                         633.8483       0.573154

Both Gender and Air Bag have a negligible impact on success rate, but whereas Gender
always increases it, the inclusion of Air Bag increases it if Gender is included, but decreases it if
Gender isn’t. The fact that Air Bag can actually decrease Success Rate if included and its
general failure to be significant suggest that it is not properly viewed within a regression of
individual variables. In other words, Air Bag is likely closely related to external variables such
as Impact Points and Most Harmful Events in the sense that the worse the fatal accident is, the
more likely an air bag will be deployed. Consequently, Air Bag is a poor predictor within a
context of individual variables18, and Crash eliminates it and Gender from his final regression.
As predicted, Age suffers considerably from its lower cutoff point, to the extent that Restraint
System alone is almost eight times as significant as Age alone, and has a superior success rate by
almost 12 percentage points. Interestingly, Crash notes that the WMT χ2 for the combined
regression is greater than the sum of its parts (6094.431 – (4906.507+633.848) = 554.076),
suggesting that Age and Restraint System complement each other strongly19.

Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        3047.215     2    6094.431       0.0000
Full             20603.134
Reduced          23650.350
RSquare (U)                   0.1288
Observations (or Sum Wgts)    34270
Converged by Gradient

Lack Of Fit
Source          DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit    170         283.947    567.8934       <.0001
Saturated      172       20319.188
Fitted           2       20603.134

Parameter Estimates
Term                            Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept                        0.11267265   0.0299576       14.15       0.0002
Restraint System-Use SORTED     -1.7874446    0.0256615      4851.8       0.0000
Age                              0.02171348   0.0006441      1136.4       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                          Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED         1    1       4851.77182       0.0000
Age                                 1    1       1136.38046       0.0000

18 For more on why Air Bag is a poor predictor, please see Appendix C.
19 Or at least, more so than is noticeable in the other “dummy-stepwise” regressions. The ChiSquare for Age almost
doubles from its stand-alone value, while the ChiSquare for Restraint System decreases slightly.
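The Whole Model Test figures in the final driver regression follow directly from the two negative log-likelihoods. As a minimal sketch, using the Reduced and Full values from the report above (the function name is an illustrative choice):

```python
def whole_model_test(neg_loglik_reduced, neg_loglik_full):
    """Return (difference, chi-square, RSquare (U)) from two -LogLikelihood values."""
    difference = neg_loglik_reduced - neg_loglik_full
    chi_square = 2.0 * difference                 # likelihood-ratio chi-square
    rsquare_u = difference / neg_loglik_reduced   # share of -LogLikelihood explained
    return difference, chi_square, rsquare_u


# Reduced (intercept-only) and Full -LogLikelihood from the final driver regression.
difference, chi_square, rsquare_u = whole_model_test(23650.350, 20603.134)
print(round(difference, 3), round(chi_square, 3), round(rsquare_u, 4))
```

The last digit of the recomputed values can differ slightly from the report because the printed log-likelihoods are rounded. An RSquare (U) of roughly 0.13 is a reminder that a very large ChiSquare does not by itself mean a close fit.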
Logistic Regression of Passenger Data
A closer perusal of the passenger data allows Crash to examine Seating Position in a
meaningful context and to determine how the collinearity between Person Type and Age affects
the model from the passenger perspective. Crash’s initial regression is as follows:

Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        1147.983     4    2295.966       0.0000
Full             12570.029
Reduced          13718.012
RSquare (U)                   0.0837
Observations (or Sum Wgts)    22563
Converged by Gradient

Lack Of Fit
Source          DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit    743         516.702    1033.404       <.0001
Saturated      747       12053.327
Fitted           4       12570.029

Parameter Estimates
Term                            Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept                       -0.6951106    0.0360995      370.77       <.0001
Restraint System-Use SORTED     -1.1355824    0.0319055      1266.8       <.0001
Age                              0.0202084    0.0007468      732.20       <.0001
Gender SORTED                    0.0261724    0.031671         0.68       0.4086
Seating Position SORTED 1v2     -0.4471993    0.0336364      176.76       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                          Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED         1    1        1266.7959       0.0000
Age                                 1    1       732.196063       0.0000
Gender SORTED                       1    1       0.68290895       0.4086
Seating Position SORTED 1v2         1    1       176.759869       0.0000

With the exception of the abysmal significance level of Gender, Seating Position has the lowest
ChiSquare. Crash suspects that Seating Position, like Air Bag, is heavily dependent on external
factors, which could explain its relatively poor showing. He knows for a fact that the effect of
Seating Position on predicted chance of being a fatality depends on both Impact Point-Principal
and Air Bag. This is plain to see when one notes that 63% of all passengers in the data set were in
crashes for which the Impact Point-Principal was the front, suggesting greater danger for those
seated in the front (who are the only ones with air bags), as attested to by Seating Position’s
-0.45 parameter estimate. Though it might not be as significant as Restraint System, Seating
Position has the second largest parameter estimate in magnitude.

Distributions: Impact Point-Principal SORTED
[Histogram of Impact Point-Principal SORTED, levels 1 to 4]

Frequencies
Level    Count      Prob
1        14288   0.63325
2         3114   0.13801
3         3346   0.14830
4         1815   0.08044
Total    22563   1.00000
4 Levels
It’s already evident to Crash that Age plays a relatively more significant role with
passengers than it does with drivers, as its ChiSquare in the initial regression was more than half
that of Restraint System’s as opposed to less than one fourth for the initial driver regression. Its
normal quantile plot shows a distinctly non-normal distribution with especially skewed tails. The
fact that the median is at 21 suggests that an incredible proportion of passengers involved in
fatal car crashes were under 30. Given that automobile crashes are the leading cause of death
for people under 30, this histogram suggests that it’s in part due to the fact that of the passengers
in fatal automobile crashes, the majority are under 30. Given this fascinating and unexpected
piece of information, Crash predicted that Age would play a relatively much more significant
role compared to Restraint System for passengers than it did for drivers.

Distributions: Age
[Histogram and Normal Quantile Plot of passenger Age]

Quantiles
100.0%   maximum    97.000
 99.5%              89.000
 97.5%              81.000
 90.0%              64.000
 75.0%   quartile   40.000
 50.0%   median     21.000
 25.0%   quartile   15.000
 10.0%               6.000
  2.5%               1.000
  0.5%               0.000
  0.0%   minimum     0.000

Moments
Mean              28.65842
Std Dev           21.46074
Std Err Mean       0.14287
upper 95% Mean    28.93847
lower 95% Mean    28.37838
N                    22563
Restraint   Age   Gender   Seating Position   χ2 for WMT   Success Rate
X           X     X        X                  2295.966     0.717591
X           X              X                  2295.284     0.716882
X           X                                 2116.238     0.716616
X                                             1055.551     0.703364
            X                                 854.1598     0.714621

As was immediately obvious from the initial regression, Gender plays almost no
role in the determination of success rate, and can be safely and happily eliminated.
Seating Position plays a disappointingly small role, but Crash anticipated this based on its
dependency on external factors and Air Bag. This leaves Restraint System and Age as
the most significant factors, but what is truly exciting is that Age actually has a higher
stand-alone success rate than Restraint System! In fact, adding Restraint System to Age raises
the success rate by only 0.2 percentage points (from 0.714621 to 0.716616), suggesting that
despite its lower ChiSquare value, Age is a more powerful predictor than Restraint System
for passenger data.
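
The success rates in the comparison can be read as classification accuracy: the share of individuals whose outcome the fitted model predicts correctly. The Python sketch below shows one way such a rate could be computed, and works out the gain from adding Restraint System to an Age-only model. The 0.5 cutoff and the name success_rate are assumptions made for illustration; the report does not state which cutoff was applied.

```python
def success_rate(predicted_probs, outcomes, cutoff=0.5):
    """Fraction of individuals whose fatal/not-fatal outcome is predicted correctly.

    predicted_probs: model probabilities of a fatality, one per individual
    outcomes: observed outcomes, 1 = fatal and 0 = not fatal
    cutoff: probability at or above which a fatality is predicted (assumed 0.5)
    """
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    if len(outcomes) == 0:
        raise ValueError("inputs must not be empty")
    correct = sum(
        (p >= cutoff) == (y == 1)
        for p, y in zip(predicted_probs, outcomes)
    )
    return correct / len(outcomes)


# Success rates reported for the passenger models.
age_only = 0.714621
restraint_and_age = 0.716616

# Gain from adding Restraint System to the Age-only model, in percentage points.
gain_points = (restraint_and_age - age_only) * 100
```

On the reported figures the gain is about 0.2 percentage points, which is the basis for calling Age the stronger stand-alone predictor for passengers.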
Nominal Logistic Fit for Injury Severity SORTED

Whole Model Test
Model         -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference          1058.119    2    2116.238       0.0000
Full               12659.893
Reduced            13718.012

RSquare (U)                    0.0771
Observations (or Sum Wgts)     22563
Converged by Gradient

Lack Of Fit
Source         DF   -LogLikelihood   ChiSquare   Prob>ChiSq
Lack Of Fit   193          206.299    412.5974       <.0001
Saturated     195        12453.595
Fitted          2        12659.893

Parameter Estimates
Term                            Estimate   Std Error   ChiSquare   Prob>ChiSq
Intercept                     -0.9506663   0.0292207      1058.5       <.0001
Restraint System-Use SORTED   -1.0905452   0.0312436      1218.3       <.0001
Age                           0.02277803   0.0007084      1033.9       <.0001
For log odds of FATAL/NOT

Effect Wald Tests
Source                        Nparm   DF   Wald ChiSquare   Prob>ChiSq
Restraint System-Use SORTED       1    1       1218.33062       0.0000
Age                               1    1       1033.88546       0.0000
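
To make the fitted passenger model concrete, the following Python sketch turns the parameter estimates into a predicted probability of fatality and recomputes a few of the reported statistics from their ingredients. It assumes that Restraint System-Use SORTED entered the model as a 0/1 value (1 = restraint used) and that the estimates apply to the log odds of FATAL; if JMP coded the variable differently, the probabilities would change. The function names and the example passenger are hypothetical.

```python
import math

# Coefficients copied from the Parameter Estimates table (passenger model).
INTERCEPT = -0.9506663
B_RESTRAINT = -1.0905452   # assumed to multiply a 0/1 indicator (1 = restraint used)
B_AGE = 0.02277803         # per year of age


def fatality_probability(restraint_used, age):
    """Predicted probability of a fatality from the log odds of FATAL/NOT."""
    log_odds = INTERCEPT + B_RESTRAINT * (1 if restraint_used else 0) + B_AGE * age
    return 1.0 / (1.0 + math.exp(-log_odds))


def wald_chi_square(estimate, std_error):
    """Wald ChiSquare for a single parameter: (estimate / std error) squared."""
    return (estimate / std_error) ** 2


# Whole Model Test quantities recomputed from the -LogLikelihood column.
reduced, full = 13718.012, 12659.893
whole_model_chi_square = 2 * (reduced - full)   # reported as 2116.238
r_square_u = (reduced - full) / reduced         # reported as 0.0771

# Hypothetical passenger at the median age of 21, belted versus unbelted.
p_belted = fatality_probability(True, 21)
p_unbelted = fatality_probability(False, 21)
```

Under these assumptions the belted passenger has the lower predicted probability of fatality, and each additional year of age raises the log odds by the Age estimate.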
Conclusions
Through a series of logistic regressions performed on all 56,833 individuals, on the
34,270 drivers, and on the 22,563 passengers, Crash the crash test dummy succeeded in
identifying the individual factors that had the greatest effect on an
individual’s chance of survival in a fatal car crash. For both the main data set and the two
subsets, Crash found logistic regressions that allowed him to predict with greater than 70%
accuracy whether an individual from the data sets was a fatality or not.
Of the variables that were discarded, Gender appeared to have negligible significance,
while Air Bag and Seating Position are most likely heavily dependent on external variables.
Crash still believes that Air Bag and Seating Position are significant individual variables that
have a great effect on an individual’s chance of survival in a fatal car crash, but to prove this one
must first find a more appropriate context than among other individual variables and also identify
which external variables have the greatest effect.
The most significant variable in terms of parameter estimate χ2 for both drivers and
passengers was Restraint System, which also had the highest standalone success rate for drivers,
making it the most significant individual factor for drivers. On the other hand, Age had the
highest standalone success rate for passengers, defying its inferior ChiSquare value and making
it the most significant individual factor for passengers. In the data set as a whole, Restraint
System was the most significant variable, beating out Age due to the weakness of Age as a
predictor variable for drivers and due to the much greater proportion of drivers.
Crash’s Friends Revisited
Seatbelt Sid has every reason to tell Crash to buckle up: under the all-data model, it could improve his predicted
chances of survival by 76%!
Much as it pains him, Crash Jr. is better off young. In
every data set, danger increased with age, in spite of the
fact that the majority of passengers were under 22. On
the other hand, drivers comprised the majority of the
complete data set, and there were practically no drivers
younger than 16. In short, the abnormalities in age
demographics may have adversely affected results, but the
regressions all suggest that younger is safer.
The data was not significant enough to support any
conclusion on whether gender affects predicted survival
rate, whether of driver or passenger, hence Crash cannot
conclude whether Crashella is safer in general, a safer
driver, or safer as a passenger.
Based on the complete data set’s best regression,
Driver Dan could improve his predicted chance of
survival by 32% if he lets someone else drive.
Backseat Bob is right to be leery of the front seat when
63% of the passengers in the data set were in a fatal
collision whose principal impact was from the front.
Nevertheless, due to the omission of powerful external
variables, Crash and Backseat Bob can’t conclude with a
high significance level that the front seat disimproves
one’s chances of survival. Until they identify and include
such variables, Crash Jr. will have to suffer in the second
seat.
Again, due to the omission of significant, related
variables, Airbag Al can’t say with significance whether
airbags improve survival rate; according to Crash’s
model, in some cases they disimprove a person’s survival
rate. For more information, please see Appendix C.
Questions Raised for Further Study
• Why do young people form such a large proportion of passengers involved in fatal car
  crashes? Could it be that, ceteris paribus, young people are more likely to be a fatality?
• Is there a possible connection between Seat Position and Age, based on parental decisions
  like Crash’s?
• Could one make a better estimate for Seating Position by looking at only cars which had
  one passenger who would not depend on the Seating Position of others and could choose
  front or back freely?
• Is there a way to account for the fact that some individuals represented the same car and
  the same accident and are hence not independent?
• What effect would sorting Age into nominal groups have? Would this improve
  prediction success rates?
• Which external factors have the greatest effect on the significance of Air Bag and the
  significance of Seating Position?
• How is the survival rate of those in the front seat affected by whether those in the back
  seat used Restraining Systems? (E.g., if the person sitting behind the driver is not using a
  restraining system and there is a head-on collision, could the driver be killed by the
  impact of the person hitting from behind?)
Appendix A: Data Selection Criteria
Below are tables from the FARS website which list the codes used to quantify each
variable used in the analysis (in the order they appear in the FARS JMPIN data table). The bold
codes are those used to select individuals for the analysis; in other words, each individual used in
the analysis has one of the bold codes in every category. Individuals who do not have a bold
code for every category are discarded. In general, codes that correspond to unknown data are
discarded, unless they fit in with the sorting properties (e.g., an airbag deployed from an
unknown direction still counts as a deployed airbag). Immediately below is a summary of how
each category was sorted for the analysis.
Variable                         Original code(s): sorted value
Impact Point-Principal           11-1: 1 (Front)
                                 8-10: 2 (Left)
                                 2-4: 3 (Right)
                                 5-7: 4 (Rear)
Age                              Not sorted
Air Bag Availability/Function    < 10: 1 (Airbag Deployed)
                                 ≥ 10: 0 (No Airbag Deployed)
Injury Severity                  < 4: 0 (Not Fatal)
                                 4: 1 (Fatal)
Person Type                      1: 0 (Driver)
                                 2: 1 (Passenger)
Restraint System-Use             0: 0 (No Restraint System-Use)
                                 1-4, 8-13: 1 (Restraint System Used)
Seating Position                 11-19: 0 (Front Seat)
                                 21-29: 1 (Second Seat)
Sex (Referred to as Gender)      1: 0 (Male)
                                 2: 1 (Female)
Body Type                        1-9: 1 (Automobiles)
                                 14-16, 19: 2 (Utility Vehicles)
                                 20-29: 3 (Vans)
                                 30-39: 4 (Pickup Trucks)
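
The sorting summary amounts to a lookup from raw FARS codes to sorted values, which the Python sketch below illustrates. It is not the procedure actually used in JMP. The ranges are applied exactly as printed in the summary, so the set of codes kept for the analysis (the bold codes) may be narrower than what this accepts. The range 11-1 is read as clock points 11, 12, and 1. Air Bag is left out because the summary alone does not settle which codes of 10 or more were kept, and Age is left out because it is not sorted. All names (SORT_RULES, sort_code, sort_record) are hypothetical.

```python
def in_ranges(code, ranges):
    """True if code falls in any inclusive (low, high) range."""
    return any(low <= code <= high for low, high in ranges)


# Sorting rules transcribed from the summary: variable -> list of
# (inclusive code ranges, sorted value). Age and Air Bag are omitted (see above).
SORT_RULES = {
    "Impact Point-Principal": [
        ([(11, 12), (1, 1)], 1),  # Front: clock points 11, 12 and 1
        ([(8, 10)], 2),           # Left
        ([(2, 4)], 3),            # Right
        ([(5, 7)], 4),            # Rear
    ],
    "Injury Severity": [([(0, 3)], 0), ([(4, 4)], 1)],
    "Person Type": [([(1, 1)], 0), ([(2, 2)], 1)],
    "Restraint System-Use": [([(0, 0)], 0), ([(1, 4), (8, 13)], 1)],
    "Seating Position": [([(11, 19)], 0), ([(21, 29)], 1)],
    "Sex": [([(1, 1)], 0), ([(2, 2)], 1)],
    "Body Type": [
        ([(1, 9)], 1),
        ([(14, 16), (19, 19)], 2),
        ([(20, 29)], 3),
        ([(30, 39)], 4),
    ],
}


def sort_code(variable, code):
    """Return the sorted value for a raw code, or None if no rule covers it."""
    for ranges, value in SORT_RULES[variable]:
        if in_ranges(code, ranges):
            return value
    return None


def sort_record(record):
    """Sort every rule-covered variable of one individual (a dict of raw codes).

    Returns None when any variable has no sorted value, mirroring the rule
    that such individuals are discarded.
    """
    sorted_values = {}
    for variable in SORT_RULES:
        value = sort_code(variable, record[variable])
        if value is None:
            return None
        sorted_values[variable] = value
    return sorted_values
```

For instance, a belted female passenger in the right front seat of a four-door sedan, fatally injured in a 12 o'clock impact, would sort to Front impact, Fatal, Passenger, Restraint Used, Front Seat, Female, Automobile.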
Impact Point-Principal
Code   Definition
_      Blank
0      Non-Collision
1      1 Clock Point
2      2 Clock Point
3      3 Clock Point
4      4 Clock Point
5      5 Clock Point
6      6 Clock Point
7      7 Clock Point
8      8 Clock Point
9      9 Clock Point
10     10 Clock Point
11     11 Clock Point
12     12 Clock Point
13     Top
14     Undercarriage
99     Unknown
* Only the Clock Point codes are included as it is hypothesized that they have collinear
properties with Seating Position.
Age
Code   Definition
_      Blank
0      Up To One Year
1      1 Year
2-96   2 Years through 96 Years (the code is the age in years)
97     97 Years or Older
99     Unknown
Air Bag Availability/Function
Code   Definition
0      Non-Motorist
1      From the FRONT
2      From the SIDE
7      From OTHER Direction
8      From MULTIPLE Directions
9      From UNKNOWN Direction
20     Airbag Available-NO DEPLOYMENT
28     Airbag Available-SWITCHED OFF
29     Airbag Available-UNKNOWN IF DEPLOYED
30     Not Available (This Seat)
31     Previously Deployed/Not Replaced
32     Disabled/Removed
99     Unknown if Airbag Available (For this Seat)
Injury Severity
Code   Definition
_      Blank
0      No Injury (0)
1      Possible Injury (C)
2      Nonincapacitating Evident Injury (B)
3      Incapacitating Injury (A)
4      Fatal Injury (K)
5      Injured, Severity Unknown
6      Died Prior to Accident*
9      Unknown
Person Type
Code   Definition
_      Blank
1      Driver of a Motor Vehicle in Transport
2      Passenger of a Motor Vehicle in Transport
3*     Occupant of a Motor Vehicle Not in Transport
4      Occupant of a Non-Motor Vehicle Transport Device
5      Pedestrian
6      Bicyclist
7      Other Cyclist
8      Other Pedestrians
9      Unknown Occupant Type in a Motor Vehicle in Transport
19     Unknown Type of Non-Motorist
99     Unknown Person Type
* Code 3 is eliminated as there is no available definition for “in Transport”, which might refer to
a motor vehicle that is in motion (not stopped at a traffic light) or to a motor vehicle that is en
route (rather than parked).
Restraint System-Use
Code   Definition
_      Blank
0      None Used - Vehicle Occupant; Not Applicable
1      Shoulder Belt
2      Lap Belt
3      Lap and Shoulder Belt
4      Child Safety Seat
5      Motorcycle Helmet
6      Bicycle Helmet
8      Restraint Used - Type Unknown
13     Safety Belt Used Improperly
14     Child Safety Seat Used Improperly
15     Helmets Used Improperly
99     Unknown
Seating Position
Code   Definition
_      Blank
0      Non-Motorist
11     Front Seat - Left Side (Driver's Side)
12     Front Seat - Middle
13     Front Seat - Right Side
18     Front Seat - Other
19     Front Seat - Unknown
21     Second Seat - Left Side
22     Second Seat - Middle
23     Second Seat - Right Side
28     Second Seat - Other
29     Second Seat - Unknown
31     Third Seat - Left Side *
32     Third Seat - Middle *
33     Third Seat - Right Side *
38     Third Seat - Other *
39     Third Seat - Unknown
41     Fourth Seat - Left Side
42     Fourth Seat - Middle
43     Fourth Seat - Right Side
48     Fourth Seat - Other
49     Fourth Seat - Unknown
50     Sleeper Section of Cab (Truck)
51     Other Passenger in enclosed passenger or cargo area
52     Other Passenger in unenclosed passenger or cargo area
53     Other Passenger in passenger or cargo area, unknown whether or not enclosed
54     Trailing Unit
55     Riding on Vehicle Exterior
99     Unknown
Sex
Code   Definition
_      Blank
1      Male
2      Female
9      Unknown
Body Type
Code   Definition
_      Blank
1      Convertible (excludes sun-roof, t-bar)
2      2-door sedan, hardtop, coupe
3      3-door/2-door hatchback
4      4-door sedan, hardtop
5      5-door/4-door hatchback
6      Station Wagon (excluding van and truck based)
7      Hatchback, number of doors unknown
8      Sedan/Hardtop, number of doors unknown
9      Other or Unknown automobile type
10     Auto-based pickup (includes El Camino, Caballero, Ranchero, Subaru Brat, Rabbit Pickup)
11     Auto-based panel (cargo station wagon, auto-based ambulance or hearse)
12     Large Limousine-more than four side doors or stretched chassis
13     Three-wheel automobile or automobile derivative
14     Compact utility (Jeep CJ-2-CJ-7, Scrambler, Golden Eagle, Renegade, Laredo, Wrangler, .....)
15     Large utility (includes Jeep Cherokee [83 and before], Ramcharger, Trailduster, Bronco-fullsize ..)
16     Utility station wagon (includes suburban limousines, Suburban, Travellall, Grand Wagoneer)
19     Utility, Unknown body type
20     Minivan (Chrysler Town and Country, Caravan, Grand Caravan, Voyager, Grand Voyager, Mini-Ram, ...)
21     Large Van (B150-B350, Sportsman, Royal Maxiwagon, Ram, Tradesman, Voyager [83 and before], .....)
22     Step van or walk-in van
23     Van based motorhome
24     Van-based school bus
25     Van-based transit bus
28     Other van type (Hi-Cube Van, Kary)
29     Unknown van type
30     Compact pickup (GVWR <4,500 lbs.) (D50, Colt P/U, Ram 50, Dakota, Arrow Pickup [foreign], Ranger, ..)
31     Standard pickup (GVWR 4,500 to 10,000 lbs.) (Jeep Pickup, Comanche, Ram Pickup, D100-D350, ......)
32     Pickup with slide-in camper
33     Convertible pickup
39     Unknown (pickup style) light conventional truck type
40     Cab chassis based (includes light stake, light dump, light tow, rescue vehicles)
41     Truck based panel
42     Light truck based motorhome (chassis mounted)
45     Other light conventional truck type (includes stretched suburban limousine)
48     Unknown light truck type (not a pickup)
49     Unknown light vehicle type (automobile, van, or light truck)
50     School Bus
51     Cross Country/Intercity Bus (i.e., Greyhound)
52     Transit Bus (City Bus)
58     Other Bus Type
59     Unknown Bus Type
60     Step van
61     Single unit straight truck (10,000 lbs < GVWR < or = 19,500 lbs)
62     Single unit straight truck (19,500 lbs < GVWR < or = 26,000 lbs.)
63     Single unit straight truck (GVWR > 26,000 lbs.)
64     Single unit straight truck (GVWR unknown)
65     Medium/heavy truck based motorhome
66     Truck-tractor (Cab only, or with any number of trailing unit; any weight)
67     Medium/Heavy Pickup
71     Unknown if single unit or combination unit Medium Truck (10,000 < GVWR < 26,000)
72     Unknown if single unit or combination unit Heavy Truck (GVWR > 26,000)
73     Camper or motorhome, unknown truck type
78     Unknown medium/heavy truck type
79     Unknown truck type (light/medium/heavy)
80     Motorcycle
81     Moped (motorized bicycle)
82     Three-wheel Motorcycle or Moped - not All-Terrain Vehicle
83     Off-road Motorcycle (2-wheel)
88     Other motored cycle type (minibikes, Motorscooters)
89     Unknown motored cycle type
90     ATV (All-Terrain Vehicle; includes dune/swamp buggy - 3 or 4 wheels)
91     Snowmobile
92     Farm equipment other than trucks
93     Construction equipment other than trucks (includes graders)
97     Other vehicle type (includes go-cart, fork-lift, city street sweeper)
99     Unknown body type
Appendix B: Histograms for All Data
Distributions: Injury Severity SORTED
Level    Count    Prob
0        34338    0.60419
1        22495    0.39581
Total    56833    1.00000

Distributions: Restraint System-Use SORTED
Level    Count    Prob
0        21270    0.37425
1        35563    0.62575
Total    56833    1.00000

Distributions: Person Type SORTED
Level    Count    Prob
0        34270    0.60299
1        22563    0.39701
Total    56833    1.00000

Distributions: Gender SORTED
Level    Count    Prob
0        35516    0.62492
1        21317    0.37508
Total    56833    1.00000
Distributions: Airbag SORTED
Level    Count    Prob
0        47696    0.83923
1         9137    0.16077
Total    56833    1.00000

Distributions: Seating Position SORTED 1v2
Level    Count    Prob
0        41891    0.73709
1        14942    0.26291
Total    56833    1.00000

Distributions: Impact Point-Principal SORTED
Level    Count    Prob
1        37954    0.66782
2         7773    0.13677
3         7218    0.12700
4         3888    0.06841
Total    56833    1.00000

Distributions: Body Type SORTED
Level    Count    Prob
1        32845    0.57792
2         7471    0.13146
3         5036    0.08861
4        11481    0.20201
Total    56833    1.00000
Distributions: Age

[Figure: distribution of Age for all data, with normal quantile plot]

Quantiles
100.0%   maximum    97.000
99.5%               89.000
97.5%               82.000
90.0%               68.000
75.0%    quartile   48.000
50.0%    median     31.000
25.0%    quartile   19.000
10.0%               15.000
2.5%                 3.000
0.5%                 0.000
0.0%     minimum     0.000

Moments
Mean              35.51708
Std Dev           20.77647
Std Err Mean       0.08715
upper 95% Mean    35.68790
lower 95% Mean    35.34626
N                    56833
Appendix C: Problems with Airbags
There are a number of reasons why airbags may increase one’s chances of being a fatality
within this dataset [20].
• Airbags can kill people who don’t wear seatbelts, adult or child, particularly in frontal
  crashes where pre-crash braking throws the victim forward so their heads are close to the
  airbag when it deploys.
• Airbags increase risk for right-front passengers less than 13 years old.
• Airbags by themselves protect only in frontal crashes.
• Airbags are not designed to deploy in side, rear, or rollover crashes.
• Airbags have a negligible effect in non-frontal crashes.
• Absolute benefits are larger for unbelted drivers, but they still have a lower chance of
  survival.
• Airbags are less effective for older drivers.
Despite these problems, Crash did succeed in finding a set of subsets whose results are
consistent with the problems listed above. Crash took only those accidents for which the
principal impact was at 12 o’clock, that is, the accidents for which airbags were the most critical.
He then ran a regression for each of the subsets below, using Injury Severity as the nominal
dependent variable and Air Bag as the continuous independent variable. It makes sense that
belted drivers’ risk would increase if their air bag deployed, as deployment indicates a more severe
accident, and that unbelted drivers are better off with airbags, since the absolute benefits are
greater for them. The fact that airbags increased risk for passengers under the age of 13
corresponds with the known facts, while the fact that airbags decreased risk for older passengers
may be the expected result. Note, however, the low ChiSquares: only the estimate for belted
drivers is significant, so airbags are still problematic.
Subset                   Estimate      Std. Error   ChiSquare   Prob>ChiSquare
12 Drivers Belted        0.36908419    0.0404951    83.07       <0.0001
12 Drivers Not Belted    -0.0592098    0.0516979    1.31        0.2521
12 Passengers Child      0.17742011    0.2424533    0.54        0.4643
12 Passengers Adult      -0.0623108    0.0597072    1.09        0.2967

[20] For more information, please see http://www.nsc.org/partners/status3.htm or
http://www.hwysafety.org/safety_facts/airbags/stats.htm
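
The ChiSquare and Prob>ChiSquare columns reported for the four 12 o’clock subsets can be recomputed from the estimates and standard errors. The Python sketch below does so with a one-degree-of-freedom Wald test and also reports exp(estimate), the multiplicative change in the modeled odds when Air Bag goes from 0 to 1. The names SUBSETS and wald_test are hypothetical, and the Wald form is an assumption about how the reported values arise, although it reproduces them to the printed precision.

```python
import math

# (estimate, standard error) per subset, copied from the Appendix C results.
SUBSETS = {
    "12 Drivers Belted": (0.36908419, 0.0404951),
    "12 Drivers Not Belted": (-0.0592098, 0.0516979),
    "12 Passengers Child": (0.17742011, 0.2424533),
    "12 Passengers Adult": (-0.0623108, 0.0597072),
}


def wald_test(estimate, std_error):
    """Return (ChiSquare, Prob>ChiSquare) for one parameter on 1 degree of freedom."""
    chi_square = (estimate / std_error) ** 2
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


results = {}
for name, (estimate, std_error) in SUBSETS.items():
    chi_square, p_value = wald_test(estimate, std_error)
    results[name] = {
        "chi_square": chi_square,
        "p_value": p_value,
        # Multiplicative change in the modeled odds when Air Bag goes from 0 to 1.
        "odds_ratio": math.exp(estimate),
        "significant_at_5_percent": p_value < 0.05,
    }
```

At the 5% level only the belted-driver subset comes out significant, which matches the caution about low ChiSquares.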