Case Study – Logistic Regression NFL Field Goal Attempts – 2003 Dependent Variable: Outcome of Field Goal Attempt (1=Success, 0=Failure) Independent Variable: Distance (Yards) Data: YARDS * SUCCESS Crosstabulation Count SUCCESS 0 YARDS Total 18.0 19.0 20.0 21.0 22.0 23.0 24.0 25.0 26.0 27.0 28.0 29.0 30.0 31.0 32.0 33.0 34.0 35.0 36.0 37.0 38.0 39.0 40.0 41.0 42.0 43.0 44.0 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0 55.0 56.0 57.0 58.0 60.0 62.0 1 0 0 1 1 0 0 2 0 1 0 1 4 3 3 2 3 6 7 4 6 7 7 5 10 2 7 11 8 9 7 15 13 11 6 9 7 5 5 0 1 0 2 1 192 Total 1 5 18 17 36 33 28 17 26 32 31 33 19 23 16 30 19 20 21 26 26 26 24 22 26 21 20 13 25 20 21 16 14 12 5 7 3 1 1 1 1 0 0 756 1 5 19 18 36 33 30 17 27 32 32 37 22 26 18 33 25 27 25 32 33 33 29 32 28 28 31 21 34 27 36 29 25 18 14 14 8 6 1 2 1 2 1 948 Random Component: Binomial Distribution Systematic Component: g() = + X Link Function: logit g() = log 1 e x Regression Equation: ( x) 1 e x Estimated Regression Equation: Variables in the Equation Step a 1 YARDS Constant B -.110 5.698 S.E. .011 .451 Wald 107.856 159.545 df 1 1 Sig. .000 .000 Exp(B) .896 298.235 95.0% C.I.for EXP(B) Lower Upper .878 .915 a. Variable(s) entered on s tep 1: YARDS. e 5.6980.110x ( x) 1 e 5.6980.110x ^ Yardage 20 25 30 35 40 45 50 55 Probability .9706 .9502 .9167 .8639 .7855 .6787 .5493 .4129 95% CI for : -0.110 1.96(0.011) -0.110 0.022 (-0.132 , -0.088) Odds Ratio: OR ( x 1) 1 ( x 1) odds( x 1) e odds( x) ( x) 1 ( x) ^ Estimated odds ratio: e e 0.110 0.896 Odds of successful fieldgoal change by about (0.896-1)100% = -10.4% for each extra yard of the attempt. 95% CI for odds ratio: (e-0.132 , e-0.088) (0.876 , 0.916) Hosmer-Lemeshow Goodness-of-Fit Test: 1. Lump attempts into 10 bins of approximately equal number of attempts 2. Obtain the observed and predicted (expected) numbers of successes and failures for each bin (20 total cells) 3. For each bin, compute the contribution to the chi-square statistic: (observed - expected) 2 X expected 2 4. Sum the results from step 3 over all 20 cells Hosme r and Leme show Test St ep 1 Chi-square 5.974 df 8 Sig. .650 Contingency Table for Hosmer and Lemeshow Test Step 1 1 2 3 4 5 6 7 8 9 10 SUCCESS = 0 Observed Expected 47 47.043 35 36.501 28 27.657 19 22.232 19 18.588 17 12.632 14 10.995 8 6.779 3 5.638 2 3.936 SUCCESS = 1 Observed Expected 45 44.957 57 55.499 58 58.343 69 65.768 76 76.412 67 71.368 88 91.005 83 84.221 103 100.362 110 108.064 Total 92 92 86 88 95 84 102 91 106 112 Don’t reject the null hypothesis that the model fit is appropriate. Computational Approach to Obtaining Logistic Regression Analysis Data: ni observations at the ith of m distinct levels of the independent variable(s), with yi successes. x1' ' x X 2 ' x m xi' 1 x1i x pi 0 1 p Note: p=1 in this case With Likelihood and log-Likelihood Functions: exp( xi' ) ni ! L( | (n1 , y1 ),..., (nm , y m )) ' i 1 y i !( ni y i )! 1 exp( x i ) m yi 1 ' 1 exp( x ) i n n n i 1 i 1 i 1 ni yi log( L) log( ni !) log( y i !) log(( ni y i )!) y i xi' ni log 1 exp( xi' ) The derivative of the log-likelihood wrt : log( L) 1 y i x i ni xi exp( xi' ) ( y i ni i ) xi g ( ) ' 1 exp( xi ) The Hessian matrix: 1 exp( xi' ) xi' exp( xi' ) exp( xi' ) xi' exp( xi' ) 2 ni xi 2 ' 1 exp( x' ) i ni xi xi' exp( xi' ) Wii ni i (1 i ) 1 1 exp( x ) ' i 2 X 'WX G ( ) Wij 0 i j Newton-Raphson-Algorithm: ^ NEW ^ OLD 1 ^ OLD ^ OLD g G Results from SAS PROC IML: BETA_NEW SEBETA 5.6978802 0.4510991 -0.10991 0.0105832 VBETA LOGLIKE 0.2034904 -0.004683 -408.6017 -0.004683 0.000112 Note: There log-likelihood function evaluation does not match SPSS’ (it does not evaluate the first term, which does not involve ). Any tests concerning will not be effected. 1 .8 y .6 .4 .2 20 30 40 50 60 70 x For more detail, see Agresti (2002), Categorical Data Analysis. Chapter 4. Sources: www.jt-sw.com, www.espn.com
© Copyright 2026 Paperzz