Linear Models III
Thursday May 31, 10:15-12:00

Deborah Rosenberg, PhD
Research Associate Professor
Division of Epidemiology and Biostatistics
University of IL School of Public Health

Training Course in MCH Epidemiology

[Slide background graphics, repeated on every slide: density of Student's t with 10 d.f.; chi-square densities with 1, 2, 3, 5, and 8 d.f.; a generic 2x2 table of exposure by outcome with cells a, b, c, d and margins n1, n2, m1, m2, N]

Ordinal and Nominal Outcomes

Outcomes with More than 2 Categories

Examples of outcomes that might be suited for ordinal or nominal regression:
  Ordinal or nominal: BMI categories
  Nominal: cause of death categories
  Ordinal or nominal: severity of illness categories
  Ordinal or nominal: categories of program participation

The Cumulative Logit Model

The primary motivation for using a logistic model with an ordinal outcome is to accommodate a truly ordinal variable: one that has "ceiling" and "floor" effects and in which the intervals between response categories can be somewhat arbitrary. That is, it is not a continuous variable.

Modeling an ordinal outcome as a continuous variable can yield biased results, in part because the model can produce predicted values outside the range of the ordinal variable.

The Cumulative Logit Model

An ordered outcome may reflect an underlying continuous variable for which we have no data or for which we don't know the "real" threshold values.

For example, a Likert scale for satisfaction (very dissatisfied to very satisfied) or for agreement (strongly disagree to strongly agree) has response categories reflecting a continuous scale for which there is no data.

Modeling Ordinal Outcomes

Some other ordinal variables may reflect an underlying continuous construct that cannot be measured as such. The ordered values are intended to reflect distinct threshold values.

Examples of ordinal variables of this type:
  access to care index
  reports of experience of life stress
  assessment of overall health status
  satisfaction with care

The Cumulative Logit Model

To appropriately model an outcome as ordinal, the proportional odds assumption must hold.

The proportional odds assumption: if an independent variable increases (or decreases) the odds of being in category 1 v. the remaining categories, then it similarly increases (or decreases) the odds of being in categories 1 and 2 combined v. the remaining categories, in categories 1, 2, and 3 combined v. the remaining categories, etc.

The Cumulative Logit Model

The null hypothesis for the proportional odds assumption is that the odds ratios for the association between a risk factor and an ordinal outcome are constant regardless of how the category boundaries are drawn.

If the proportional odds assumption holds, then the association between an independent variable and the outcome can be expressed as a single summary estimate, a common odds ratio, across all category boundaries.

The Cumulative Logit Model

The proportional odds assumption can be tested with a chi-square statistic, a score test. A nonsignificant result means that the null hypothesis is not rejected and that the cumulative logit model is reasonable; a significant result means that the proportional odds assumption may not hold.

The Cumulative Logit Model: For an ordered outcome with k categories

\[
\ln(\mathrm{Odds}_{1}) = \ln\!\left(\frac{p_{1}}{1 - p_{1}}\right)
\]
\[
\ln(\mathrm{Odds}_{1+2}) = \ln\!\left(\frac{p_{1+2}}{1 - p_{1+2}}\right)
\]
\[
\vdots
\]
\[
\ln(\mathrm{Odds}_{1+2+\cdots+(k-1)}) = \ln\!\left(\frac{p_{1+2+\cdots+(k-1)}}{1 - p_{1+2+\cdots+(k-1)}}\right)
\]

Here p_{1+2} denotes p_1 + p_2, and so on. Both the numerator and denominator change.

http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html

Ordinal and Nominal Outcomes

                 Ordinal Outcome Variable
Risk Factor      1      2      3      4      Total
Yes              a      b      c      d
No               e      f      g      h

The same table yields three cumulative odds among the exposed:
  Category 1 v. 2, 3, 4:      Odds among the exposed = a / (b+c+d)
  Categories 1, 2 v. 3, 4:    Odds among the exposed = (a+b) / (c+d)
  Categories 1, 2, 3 v. 4:    Odds among the exposed = (a+b+c) / d

The Cumulative Logit Model

Given k categories of an ordered outcome variable, a cumulative logit model yields k-1 intercept terms. Each intercept corresponds to a category combined with all adjacent lower-ordered categories.

Since proportional odds are assumed, and therefore a common odds ratio, the effect of each covariate is reflected in a single beta coefficient.

The Cumulative Logit Model

Suppose an outcome variable has 4 categories and we are modeling one independent variable x. The cumulative logit model has three intercepts and one shared slope:

\[
\ln(\mathrm{Odds}_{1}) = b_{0,1} + b_{1}x, \qquad
\ln(\mathrm{Odds}_{12}) = b_{0,12} + b_{1}x, \qquad
\ln(\mathrm{Odds}_{123}) = b_{0,123} + b_{1}x
\]

The odds ratio is the same regardless of category:

\[
\frac{e^{b_{0,1} + b_{1}(1)}}{e^{b_{0,1} + b_{1}(0)}} = e^{b_{1}}, \qquad
\frac{e^{b_{0,12} + b_{1}(1)}}{e^{b_{0,12} + b_{1}(0)}} = e^{b_{1}}, \qquad
\frac{e^{b_{0,123} + b_{1}(1)}}{e^{b_{0,123} + b_{1}(0)}} = e^{b_{1}}
\]

A stratified approach to mimic a cumulative logit model for a 4 category variable would mean creating new dichotomous variables, something like the following:

  /* 1 = category 1; 0 = categories 2-4; missing stays missing */
  if ordvar = 1 then ordvar1 = 1;
  else if ordvar ^= . then ordvar1 = 0;

  /* 1 = categories 1-2; 0 = categories 3-4 */
  if 1<=ordvar<=2 then ordvar2 = 1;
  else if ordvar ^= . then ordvar2 = 0;

  /* 1 = categories 1-3; 0 = category 4 */
  if 1<=ordvar<=3 then ordvar3 = 1;
  else if ordvar ^= . then ordvar3 = 0;

Mimicking Cumulative Logit with Binary Logistic Models

  /* event='1' models the probability that the indicator equals 1;  */
  /* by default PROC LOGISTIC models the lower ordered value (0).   */
  proc logistic; model ordvar1(event='1') = factors; run;
  proc logistic; model ordvar2(event='1') = factors; run;
  proc logistic; model ordvar3(event='1') = factors; run;

The OR from each model will be approximately the same if the proportional odds assumption holds. Note that all observations are used in each model.
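
A short sketch of how the three cumulative logits for the 4-category example translate back into category probabilities. The symbols Y (the outcome) and x (the single independent variable) are our notation, not the slides'; the intercept and slope names follow the slide.

```latex
\[
P(Y \le j \mid x) = \frac{e^{\,b_{0,j} + b_{1}x}}{1 + e^{\,b_{0,j} + b_{1}x}},
\qquad j \in \{1,\ 12,\ 123\}
\]
\[
\begin{aligned}
P(Y = 1 \mid x) &= P(Y \le 1 \mid x) \\
P(Y = 2 \mid x) &= P(Y \le 2 \mid x) - P(Y \le 1 \mid x) \\
P(Y = 3 \mid x) &= P(Y \le 3 \mid x) - P(Y \le 2 \mid x) \\
P(Y = 4 \mid x) &= 1 - P(Y \le 3 \mid x)
\end{aligned}
\]
```

Because all three cumulative logits share the slope b_1, a change in x shifts every cumulative probability in the same direction on the logit scale; only the intercepts differ across the category boundaries.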
The Cumulative Logit Model

If the proportional odds assumption does not hold, it might be because the outcome variable is nominal rather than ordinal, or it might be that we have mis-specified the categories, failing to pinpoint important thresholds on the underlying continuum.

The score test is quite sensitive. It is up to the analyst to examine the pattern of ORs for different dichotomous cutpoints and decide whether it is reasonable to use a cumulative logit model.

The Generalized Logit Model

In contrast to the cumulative logit model, in a generalized logit model the outcome categories are like dummy variables: mutually exclusive categories compared to a common reference group.

The Generalized Logit Model: For a nominal outcome with k categories

\[
\ln(\mathrm{Odds}_{1}) = \ln\!\left(\frac{p_{1}}{1 - p_{1+2+\cdots+(k-1)}}\right)
\]
\[
\ln(\mathrm{Odds}_{2}) = \ln\!\left(\frac{p_{2}}{1 - p_{1+2+\cdots+(k-1)}}\right)
\]
\[
\vdots
\]
\[
\ln(\mathrm{Odds}_{k-1}) = \ln\!\left(\frac{p_{k-1}}{1 - p_{1+2+\cdots+(k-1)}}\right)
\]

Fixed denominator (reference category).

http://www.indiana.edu/%7Estatmath/stat/all/cat/2b1.html

Ordinal and Nominal Outcomes

                 Nominal Outcome Variable
Risk Factor      4      3      2      1      Total
Yes              a      b      c      d
No               e      f      g      h

With category 1 as the reference, the same table yields three odds among the exposed:
  Category 4 v. 1:    Odds among the exposed = a / d
  Category 3 v. 1:    Odds among the exposed = b / d
  Category 2 v. 1:    Odds among the exposed = c / d

The Generalized Logit Model

Given k categories of an outcome variable, a generalized logit model yields k-1 intercept terms. Each intercept corresponds to a single category.

Since proportional odds are not assumed, odds ratios can vary across categories, and therefore the effect of each covariate is reflected in k-1 slope parameters.

The Generalized Logit Model

Suppose an outcome variable has 4 categories and we are modeling one independent variable.
The generalized logit model is as follows, with one intercept and one slope for each non-reference category:

\[
\ln(\mathrm{Odds}_{1}) = b_{0,1} + b_{1,1}x, \qquad
\ln(\mathrm{Odds}_{2}) = b_{0,2} + b_{1,2}x, \qquad
\ln(\mathrm{Odds}_{3}) = b_{0,3} + b_{1,3}x
\]

The odds ratios are distinct for each category:

\[
\text{1.}\quad
\frac{e^{\,b_{0,1} + b_{1,1}(1) + b_{1,2}(0) + b_{1,3}(0)}}{e^{\,b_{0,1} + b_{1,1}(0) + b_{1,2}(0) + b_{1,3}(0)}} = e^{b_{1,1}}
\]
\[
\text{2.}\quad
\frac{e^{\,b_{0,2} + b_{1,1}(0) + b_{1,2}(1) + b_{1,3}(0)}}{e^{\,b_{0,2} + b_{1,1}(0) + b_{1,2}(0) + b_{1,3}(0)}} = e^{b_{1,2}}
\]
\[
\text{3.}\quad
\frac{e^{\,b_{0,3} + b_{1,1}(0) + b_{1,2}(0) + b_{1,3}(1)}}{e^{\,b_{0,3} + b_{1,1}(0) + b_{1,2}(0) + b_{1,3}(0)}} = e^{b_{1,3}}
\]

The Generalized Logit Model

Each slope parameter compares the odds of being in one outcome category, rather than the reference category, between levels of the factor:

  Compared to those without Factor A, individuals with Factor A have ___ times the odds of having outcome category 1;
  Compared to those without Factor A, individuals with Factor A have ___ times the odds of having outcome category 2;
  Compared to those without Factor A, individuals with Factor A have ___ times the odds of having outcome category 3.

A stratified approach to mimic a generalized logit model for a 4 category variable would not require creation of new variables, but would mean running models like the following:

Mimicking Generalized Logit with Binary Logistic Models

  /* each model keeps only one category plus the reference category (4) */
  proc logistic; where ordvar in(1,4); model ordvar = factors; run;
  proc logistic; where ordvar in(2,4); model ordvar = factors; run;
  proc logistic; where ordvar in(3,4); model ordvar = factors; run;

The ORs from the models will differ. Note that different subsets of observations are used in each model.

Example 1.

The Association of Smoking and Fetal/Infant Death in Preterm Deliveries

Smoking and Mortality, Dichotomous Outcome

Frequency|
Percent  |
Row Pct  |fetal or |survivor |
Col Pct  |neonatal |>=28 days|  Total
         |death    |         |
---------+---------+---------+
yes      |      79 |    1135 |   1214
         |    0.87 |   12.50 |  13.37
         |    6.51 |   93.49 |
         |   14.08 |   13.32 |
---------+---------+---------+
no       |     482 |    7385 |   7867
         |    5.31 |   81.32 |  86.63
         |    6.13 |   93.87 |
         |   85.92 |   86.68 |
---------+---------+---------+
Total          561      8520     9081
              6.18     93.82   100.00

Crude OR = 1.07

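
As a check on the crude odds ratio for Example 1, the cross-product of the 2x2 smoking-by-mortality table can be worked by hand. This is only arithmetic on the cell counts shown on the slide; the rounding is ours.

```latex
\[
\mathrm{OR}_{\text{crude}}
= \frac{79 \times 7385}{1135 \times 482}
= \frac{583415}{547070}
\approx 1.066
\]
```

This agrees with the crude OR of 1.07 reported for the dichotomous outcome.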
Example 1.

The Association of Smoking and Fetal/Infant Death in Preterm Deliveries

Crude Logistic Model with Dichotomous Outcome

                          Standard        Wald
Parameter     DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept      1  -2.7293   0.0470   3370.3800      <.0001
smoking yes    1   0.0643   0.1255      0.2627      0.6083

Odds Ratio Estimates
                       Point      95% Wald
Effect              Estimate  Confidence Limits
smoking yes vs no      1.066    0.834    1.364

Example 1.

Cumulative Logit: Odds of type of death among smokers and the OR for smoker v. nonsmoker

Smoking and Mortality, 3 Categories

Frequency|
Percent  |
Row Pct  |fetal    |neonatal |survivor |
Col Pct  |death    |death    |>=28 days|  Total
         |>=20 wks |0-28 days|         |
---------+---------+---------+---------+
yes      |      46 |      33 |    1135 |   1214
         |    0.51 |    0.36 |   12.50 |  13.37
         |    3.79 |    2.72 |   93.49 |
         |   13.86 |   14.41 |   13.32 |
---------+---------+---------+---------+
no       |     286 |     196 |    7385 |   7867
         |    3.15 |    2.16 |   81.32 |  86.63
         |    3.64 |    2.49 |   93.87 |
         |   86.14 |   85.59 |   86.68 |
---------+---------+---------+---------+
Total          332       229      8520     9081
              3.66      2.52     93.82   100.00

The same table yields two cumulative odds among smokers:
  Fetal death v. neonatal death or survivor:    Odds = 46 / (33+1135) = 0.04    OR = 1.04
  Fetal or neonatal death v. survivor:          Odds = (46+33) / 1135 = 0.07    OR = 1.07

Example 1.

Cumulative Logit Model with 3 Categories

Ordered
Value    outcome5                     Frequency
1        fetal death >=20 wks               332
2        neonatal death 0-28 days           229
3        survivor >=28 days                8520

Probabilities modeled are cumulated over the lower Ordered Values.

Score Test for the Proportional Odds Assumption
Chi-Square    DF    Pr > ChiSq
    0.0400     1        0.8414

The test is nonsignificant, so the proportional odds assumption holds.

Example 1.

Cumulative Logit: Each intercept corresponds to a category plus all categories with lower ordered values v. the remaining categories.

Analysis of Maximum Likelihood Estimates
                                                    Standard        Wald
Parameter                               DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept   fetal death >=20 wks         1  -3.2803   0.0586   3130.7559      <.0001
Intercept   neonatal death 0-28 days     1  -2.7292   0.0470   3370.8916      <.0001
smoking yes                              1   0.0635   0.1255      0.2561      0.6128

Odds Ratio Estimates
                       Point      95% Wald
Effect              Estimate  Confidence Limits
smoking yes vs no      1.066    0.833    1.363

The fitted cumulative odds among smokers reproduce the observed odds:
  46 / (33+1135)   = e^(-3.2803+0.0635) = 0.04
  (46+33) / 1135   = e^(-2.7292+0.0635) = 0.07

The odds ratio is an 'average' of the odds ratios for the cumulative logits.

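
The two cumulative odds ratios quoted for Example 1 can be reproduced from the 3-category smoking table. This is a worked check using only the cell counts and coefficients shown on the slides; intermediate values are rounded by us.

```latex
\[
\mathrm{OR}_{\text{fetal v. rest}}
= \frac{46/(33+1135)}{286/(196+7385)}
= \frac{46/1168}{286/7581}
\approx \frac{0.0394}{0.0377}
\approx 1.04
\]
\[
\mathrm{OR}_{\text{any death v. survivor}}
= \frac{(46+33)/1135}{(286+196)/7385}
= \frac{79/1135}{482/7385}
\approx \frac{0.0696}{0.0653}
\approx 1.07
\]
\[
e^{-3.2803 + 0.0635} = e^{-3.2168} \approx 0.040,
\qquad
e^{-2.7292 + 0.0635} = e^{-2.6657} \approx 0.070
\]
```

The two observed odds ratios are close to each other and to the common estimate of 1.066, which is what a nonsignificant score test would lead one to expect.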
Example 1.

Generalized Logit Model with 3 Categories

In a generalized logit model, each intercept and slope correspond to a single category.

                                                    Standard        Wald
Parameter    outcome5                   DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept    fetal death >=20 wks        1  -3.2512   0.0603   2910.4207      <.0001
Intercept    neonatal death 0-28 days    1  -3.6291   0.0724   2514.6406      <.0001
smoking yes  fetal death >=20 wks        1   0.0455   0.1620      0.0787      0.7790
smoking yes  neonatal death 0-28 days    1   0.0912   0.1908      0.2284      0.6327

Odds Ratio Estimates
                                                 Point      95% Wald
Effect             outcome5                   Estimate  Confidence Limits
smoking yes vs no  fetal death >=20 wks          1.047    0.762    1.438
smoking yes vs no  neonatal death 0-28 days      1.096    0.754    1.592

Is 1.07 a reasonable summary of 1.047 and 1.096?

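
With a single binary predictor, the category-specific odds ratios from the generalized logit model in Example 1 can be checked against the 3-category smoking table, using survivors as the fixed reference column. The arithmetic below uses only the counts and coefficients shown on the slides; the rounding is ours.

```latex
\[
\mathrm{OR}_{\text{fetal v. survivor}}
= \frac{46 \times 7385}{1135 \times 286}
= \frac{339710}{324610}
\approx 1.047
= e^{0.0455}
\]
\[
\mathrm{OR}_{\text{neonatal v. survivor}}
= \frac{33 \times 7385}{1135 \times 196}
= \frac{243705}{222460}
\approx 1.096
= e^{0.0912}
\]
```

Note that the denominator column (survivors) stays fixed in both ratios, in contrast to the cumulative logit, where both the numerator and denominator groups change with each category boundary.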
Example 2.

The Association of Maternal Risk and Fetal/Infant Death in Preterm Deliveries

Maternal Risk and Mortality, Dichotomous Outcome

Frequency|
Percent  |
Row Pct  |fetal or |survivor |
Col Pct  |neonatal |>=28 days|  Total
         |death    |         |
---------+---------+---------+
yes      |     282 |    3836 |   4118
         |    2.76 |   37.50 |  40.26
         |    6.85 |   93.15 |
         |   41.53 |   40.17 |
---------+---------+---------+
no       |     397 |    5713 |   6110
         |    3.88 |   55.86 |  59.74
         |    6.50 |   93.50 |
         |   58.47 |   59.83 |
---------+---------+---------+
Total          679      9549    10228
              6.64     93.36   100.00

Maternal Risk and Mortality, 3 Categories

Frequency|
Percent  |
Row Pct  |fetal    |neonatal |survivor |
Col Pct  |death    |death    |>=28 days|  Total
         |>=20 wks |0-28 days|         |
---------+---------+---------+---------+
yes      |     153 |     129 |    3836 |   4118
         |    1.50 |    1.26 |   37.50 |  40.26
         |    3.72 |    3.13 |   93.15 |
         |   36.60 |   49.43 |   40.17 |
---------+---------+---------+---------+
no       |     265 |     132 |    5713 |   6110
         |    2.59 |    1.29 |   55.86 |  59.74
         |    4.34 |    2.16 |   93.50 |
         |   63.40 |   50.57 |   59.83 |
---------+---------+---------+---------+
Total          418       261      9549    10228
              4.09      2.55     93.36   100.00

Example 2.

The Association of Maternal Risk and Fetal/Infant Death in Preterm Deliveries

Crude Logistic Model with Dichotomous Outcome

                          Standard        Wald
Parameter     DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept      1  -2.6666   0.0519   2639.4735      <.0001
matrisk yes    1   0.0563   0.0806      0.4873      0.4851

Odds Ratio Estimates
                       Point      95% Wald
Effect              Estimate  Confidence Limits
matrisk yes vs no      1.058    0.903    1.239

Example 2.

Cumulative Logit Model with 3 Categories

Ordered
Value    outcome5                     Frequency
1        fetal death >=20 wks               418
2        neonatal death 0-28 days           261
3        survivor >=28 days                9549

Probabilities modeled are cumulated over the lower Ordered Values.

Score Test for the Proportional Odds Assumption
Chi-Square    DF    Pr > ChiSq
   10.7077     1        0.0011

The proportional odds assumption does not hold.

Example 2.

Cumulative Logit Model with 3 Categories

                                                    Standard        Wald
Parameter                               DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept   fetal death >=20 wks         1  -3.1750   0.0600   2798.1261      <.0001
Intercept   neonatal death 0-28 days     1  -2.6629   0.0518   2641.7916      <.0001
matrisk yes                              1   0.0473   0.0806      0.3435      0.5578

Odds Ratio Estimates
                       Point      95% Wald
Effect              Estimate  Confidence Limits
matrisk yes vs no      1.048    0.895    1.228

Fitted cumulative odds among those with maternal risk:
  e^(-3.1750+0.0473) = 0.04
  e^(-2.6629+0.0473) = 0.07

The odds ratio is an 'average' of the odds ratios for the cumulative logits.

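
To see why the score test rejects proportional odds in Example 2, the two cumulative odds ratios can be computed directly from the 3-category maternal risk table. This is a worked check on the slide's cell counts; the rounding is ours.

```latex
\[
\mathrm{OR}_{\text{fetal v. rest}}
= \frac{153/(129+3836)}{265/(132+5713)}
= \frac{153 \times 5845}{3965 \times 265}
= \frac{894285}{1050725}
\approx 0.85
\]
\[
\mathrm{OR}_{\text{any death v. survivor}}
= \frac{(153+129)/3836}{(265+132)/5713}
= \frac{282 \times 5713}{3836 \times 397}
= \frac{1611066}{1522892}
\approx 1.06
\]
```

The two odds ratios fall on opposite sides of 1, so the common estimate of 1.048 summarizes neither boundary well. The second value matches the crude OR of 1.058 from the dichotomous model.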
Example 2.

Generalized Logit Model with 3 Categories

                                                    Standard        Wald
Parameter    outcome5                   DF Estimate    Error  Chi-Square  Pr > ChiSq
Intercept    fetal death >=20 wks        1  -3.0708   0.0628   2388.0754      <.0001
Intercept    neonatal death 0-28 days    1  -3.7676   0.0880   1831.5579      <.0001
matrisk yes  fetal death >=20 wks        1  -0.1510   0.1037      2.1212      0.1453
matrisk yes  neonatal death 0-28 days    1   0.3755   0.1255      8.9450      0.0028

Odds Ratio Estimates
                                                 Point      95% Wald
Effect             outcome5                   Estimate  Confidence Limits
matrisk yes vs no  fetal death >=20 wks          0.860    0.702    1.054
matrisk yes vs no  neonatal death 0-28 days      1.456    1.138    1.862

Is 1.048 a reasonable summary of 0.86 and 1.5?

Example 3. LBW

Modeling a 3 category birthweight variable:

  /* cumulative logit */
  proc logistic order=formatted;
    model bwcat = smoking late_no_pnc;
  run;

Ordered                    Total
Value    bwcat         Frequency
1        vlbw                897
2        mlbw               4087
3        normal bw         75824

Probabilities modeled are cumulated over the lower Ordered Values.

Example 3. LBW

Score Test for the Proportional Odds Assumption
Chi-Square    DF    Pr > ChiSq
   17.0152     2        0.0002

                                Standard         Wald
Parameter           DF Estimate    Error   Chi-Square  Pr > ChiSq
Intercept   vlbw     1  -4.6326   0.0351   17461.8326      <.0001
Intercept   mlbw     1  -2.8614   0.0176   26396.0103      <.0001
smoking              1   0.6012   0.0383     246.8141      <.0001
late_no_pnc          1   0.2720   0.0362      56.5520      <.0001

                  Point      95% Wald
Effect         Estimate  Confidence Limits
smoking           1.824    1.692    1.966
late_no_pnc       1.313    1.223    1.409

Example 3. LBW

  /* mimicking cumulative logit with binary models */
  proc logistic order=formatted;
    model vlbw = smoking late_no_pnc;
  run;

vlbw v. mlbw and normal
                  Point      95% Wald
Effect         Estimate  Confidence Limits
smoking           1.346    1.118    1.621
late_no_pnc       1.138    0.961    1.347

  proc logistic order=formatted;
    model lbw = smoking late_no_pnc;
  run;

vlbw and mlbw v. normal
                  Point      95% Wald
Effect         Estimate  Confidence Limits
smoking           1.834    1.701    1.977
late_no_pnc       1.315    1.225    1.412

Both models include all observations in the sample.

Example 3. LBW

  /* generalized logit */
  proc logistic order=formatted;
    model bwcat(ref='normal bw') = smoking late_no_pnc / link=glogit;
  run;

Ordered                    Total
Value    bwcat         Frequency
1        vlbw                897
2        mlbw               4087
3        normal bw         75824

Logits modeled use bwcat='normal bw' as the reference category.
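
Written out, the generalized logit model requested by link=glogit for the 3-category birthweight outcome consists of two equations, each with its own intercept and slopes. The subscripted b symbols and the p notation are ours, chosen to follow the earlier slides; they are not SAS output labels.

```latex
\[
\ln\!\left(\frac{p_{\text{vlbw}}}{p_{\text{normal}}}\right)
= b_{0,\text{vlbw}}
+ b_{1,\text{vlbw}}\,\text{smoking}
+ b_{2,\text{vlbw}}\,\text{late\_no\_pnc}
\]
\[
\ln\!\left(\frac{p_{\text{mlbw}}}{p_{\text{normal}}}\right)
= b_{0,\text{mlbw}}
+ b_{1,\text{mlbw}}\,\text{smoking}
+ b_{2,\text{mlbw}}\,\text{late\_no\_pnc}
\]
```

With 3 outcome categories and 2 covariates this gives 2 intercepts and 4 slopes, compared with 2 intercepts and 2 slopes in the cumulative logit model for the same data.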
Example 3. LBW

vlbw v. normal and mlbw v. normal

                                  Standard         Wald
Parameter    bwcat    DF Estimate    Error   Chi-Square  Pr > ChiSq
Intercept    vlbw      1  -4.5070   0.0394   13075.7241      <.0001
Intercept    mlbw      1  -3.0764   0.0195   24943.4219      <.0001
smoking      vlbw      1   0.3409   0.0947      12.9546      0.0003
smoking      mlbw      1   0.6587   0.0412     255.2248      <.0001
late_no_pnc  vlbw      1   0.1470   0.0861       2.9166      0.0877
late_no_pnc  mlbw      1   0.3002   0.0393      58.2169      <.0001

                          Point      95% Wald
Effect       bwcat     Estimate  Confidence Limits
smoking      vlbw         1.406    1.168    1.693
smoking      mlbw         1.932    1.782    2.095
late_no_pnc  vlbw         1.158    0.979    1.371
late_no_pnc  mlbw         1.350    1.250    1.458

Example 3. LBW

  /* mimicking generalized logit with binary models */
  proc logistic order=formatted;
    where bwcat = 2 or bwcat = 0;
    model bwcat(ref='normal bw') = smoking late_no_pnc / link=glogit;
  run;

  proc logistic order=formatted;
    where bwcat = 1 or bwcat = 0;
    model bwcat(ref='normal bw') = smoking late_no_pnc / link=glogit;
  run;

Example 3. LBW

Generalized logit approach using binary models with only a subset of observations in each model

vlbw v. normal
                          Point      95% Wald
Effect       bwcat     Estimate  Confidence Limits
smoking      vlbw         1.406    1.168    1.693
late_no_pnc  vlbw         1.159    0.979    1.371

mlbw v. normal
                          Point      95% Wald
Effect       bwcat     Estimate  Confidence Limits
smoking      mlbw         1.933    1.783    2.095
late_no_pnc  mlbw         1.351    1.251    1.459

Example 3. LBW

Generalized logit models can get complicated, but custom estimates can still be obtained in the usual way.

  proc logistic order=formatted;
    where 2<=momage<=3;
    class parityrisk(ref='no hx preterm') / param=ref;
    model bwcat = smoking late_no_pnc matrisk momage parityrisk
                  smoking*parityrisk / link=glogit;
    contrast 'sm-risk, hxpreterm'
             smoking 1 matrisk 1 smoking*parityrisk 1 0 / estimate=exp;
    contrast 'sm-risk, primips'
             smoking 1 matrisk 1 smoking*parityrisk 0 1 / estimate=exp;
    contrast 'sm-risk, lorisk multips'
             smoking 1 matrisk 1 smoking*parityrisk 0 0 / estimate=exp;
  run;

Example 3. LBW

The tests for the constructs in the model are all statistically significant:

Type 3 Analysis of Effects
                              Wald
Effect               DF  Chi-Square  Pr > ChiSq
smoking               2    199.1393      <.0001
late_no_pnc           2     46.9823      <.0001
matrisk               2    615.7383      <.0001
momage                2      7.7596      0.0207
parityrisk            4    382.2127      <.0001
smoking*parityrisk    4     22.1081      0.0002

Parameter            Level        bwcat   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept                         vlbw     1    -5.3253           0.0733         5284.8432       <.0001
Intercept                         mlbw     1    -3.6337           0.0332        11977.8462       <.0001
smoking                           vlbw     1     0.3873           0.1372            7.9668       0.0048
smoking                           mlbw     1     0.7851           0.0564          193.5812       <.0001
late_no_pnc                       vlbw     1     0.1095           0.0933            1.3762       0.2407
late_no_pnc                       mlbw     1     0.2866           0.0422           46.1522       <.0001
matrisk                           vlbw     1     1.0549           0.0712          219.3322       <.0001
matrisk                           mlbw     1     0.6885           0.0338          414.5412       <.0001
momage               >=35         vlbw     1     0.1607           0.1002            2.5727       0.1087
momage               >=35         mlbw     1     0.1150           0.0493            5.4443       0.0196
parityrisk           hx preterm   vlbw     1     1.6210           0.1965           68.0158       <.0001
parityrisk           hx preterm   mlbw     1     1.4185           0.1089          169.6569       <.0001
parityrisk           primip       vlbw     1     0.6110           0.0789           60.0412       <.0001
parityrisk           primip       mlbw     1     0.5060           0.0383          174.2524       <.0001
smoking*parityrisk   hx preterm   vlbw     1     0.4809           0.3477            1.9131       0.1666
smoking*parityrisk   hx preterm   mlbw     1     0.0921           0.1950            0.2231       0.6367
smoking*parityrisk   primip       vlbw     1    -0.3266           0.2141            2.3270       0.1271
smoking*parityrisk   primip       mlbw     1    -0.3663           0.0914           16.0623       <.0001

Not all beta coefficients are statistically significant.

Parity-specific contrasts of the joint effect of smoking and having some antepartum medical risk, adjusting for entry into prenatal care and maternal age.
Contrast                  Type   Row   Estimate   Standard Error   Confidence Limits
sm-risk, hxpreterm        EXP      1     6.8423           2.2409    3.6011   13.0010
sm-risk, hxpreterm        EXP      2     4.7860           0.9081    3.2997    6.9419
sm-risk, primips          EXP      1     3.0515           0.5430    2.1530    4.3248
sm-risk, primips          EXP      2     3.0260           0.2388    2.5924    3.5322
sm-risk, lorisk multips   EXP      1     4.2299           0.6439    3.1387    5.7005
sm-risk, lorisk multips   EXP      2     4.3649           0.2819    3.8459    4.9539

Should we leave the smoking*parityrisk term in the model?

Example 4. Prenatal Care

Should we consider the categories ordinal or nominal?

Table of prevlbw by indexsum (two factor summary index)
Frequency (Row Pct)

prevlbw               No Pnc          Inadeq           Inter            Adeq             Adeq+           Total
prev lbw              736.34 (3.71)   3097.6 (15.62)   2363.3 (11.91)   5274.7 (26.59)   8364 (42.17)    19836
no hx lbw or primip   3315.8 (1.18)   19576 (6.98)     33170 (11.83)    138719 (49.46)   85667 (30.55)   280448

Overlapping dichotomous contrasts, each formed by collapsing the table above at one cut point:
• No Pnc v. Any PNC, OR = 3.2
• Inad/No v. Adeq+/Adeq/Inter, OR = 2.7
• Inter/Inad/No v. Adeq+/Adeq, OR = 1.8
• All others v. Adeq+, OR = 0.60
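The four overlapping contrasts can be reproduced directly from the weighted frequencies in the prevlbw by indexsum table. The following is a minimal Python sketch, assuming the category order No Pnc, Inadeq, Inter, Adeq, Adeq+; the function and variable names are illustrative only.

```python
# Weighted frequencies from the table of prevlbw by indexsum, ordered
# No Pnc, Inadeq, Inter, Adeq, Adeq+.
PREV_LBW = [736.34, 3097.6, 2363.3, 5274.7, 8364]
NO_HX_LBW = [3315.8, 19576, 33170, 138719, 85667]


def cumulative_odds_ratios(exposed, unexposed):
    """Odds ratio for 'first j categories v. the rest' at each of the k-1 cut points."""
    ratios = []
    for cut in range(1, len(exposed)):
        a, b = sum(exposed[:cut]), sum(exposed[cut:])      # exposed: below / above cut
        c, d = sum(unexposed[:cut]), sum(unexposed[cut:])  # unexposed: below / above cut
        ratios.append((a * d) / (b * c))
    return ratios


overlapping = cumulative_odds_ratios(PREV_LBW, NO_HX_LBW)
print([round(r, 2) for r in overlapping])
```

Under proportional odds these k-1 odds ratios would be roughly equal, which is why this series of collapsed comparisons is a useful informal check before fitting a cumulative logit model.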
Non-overlapping dichotomous contrasts, each category v. Adeq
Frequency (Row Pct)

                   prev lbw                            no hx lbw or primip
Contrast           Category         Adeq               Category         Adeq
No Pnc v. Adeq     736.34 (3.71)    5274.7 (26.59)     3315.8 (1.18)    138719 (49.46)
Inadeq v. Adeq     3097.6 (15.62)   5274.7 (26.59)     19576 (6.98)     138719 (49.46)
Inter v. Adeq      2363.3 (11.91)   5274.7 (26.59)     33170 (11.83)    138719 (49.46)
Adeq+ v. Adeq      8364 (42.17)     5274.7 (26.59)     85667 (30.55)    138719 (49.46)

Cumulative Logit: The null hypothesis of proportional odds is rejected.

Score Test for the Proportional Odds Assumption
Chi-Square   DF   Pr > ChiSq
 7014.0733    3       <.0001

Parameter                  DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   No PNC          1    -4.2917           0.1749          601.9645       <.0001
Intercept   Inadequate      1    -2.3257           0.0701         1101.3880       <.0001
Intercept   Intermediate    1    -1.3409           0.0495          732.9840       <.0001
Intercept   Adequate        1     0.7857           0.0423          345.3622       <.0001
prevlbw                     1   -0.00326           0.1698            0.0004       0.9847

Odds Ratio Estimates
Effect    Point Estimate   95% Wald Confidence Limits
prevlbw            0.997        0.715    1.390

Any association is obscured by averaging across levels of APNCU.
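For the non-overlapping contrasts, each category is compared with Adeq alone, so the crude odds ratio uses only two columns of the table at a time. The following is a minimal Python sketch under the same assumptions as the table (weighted frequencies, Adeq as the reference); the names are illustrative. With a single binary independent variable, these crude values can be compared with the point estimates from a generalized logit model that uses Adeq as the reference category.

```python
CATEGORIES = ["No Pnc", "Inadeq", "Inter", "Adeq", "Adeq+"]
PREV_LBW = [736.34, 3097.6, 2363.3, 5274.7, 8364]
NO_HX_LBW = [3315.8, 19576, 33170, 138719, 85667]


def odds_ratios_vs_reference(exposed, unexposed, labels, reference):
    """Crude odds ratio of each category v. the reference category only."""
    ref = labels.index(reference)
    return {
        label: (exposed[i] * unexposed[ref]) / (exposed[ref] * unexposed[i])
        for i, label in enumerate(labels)
        if i != ref
    }


vs_adeq = odds_ratios_vs_reference(PREV_LBW, NO_HX_LBW, CATEGORIES, "Adeq")
for label, ratio in vs_adeq.items():
    print(f"{label} v. Adeq: {ratio:.3f}")
```

Unlike the single cumulative logit odds ratio of 0.997, the category-specific values are free to differ in size and direction, which is the pattern a pooled estimate can hide.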
Generalized Logit

Parameter   indexsum       DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
Intercept   No PNC          1    -3.7338           0.2019          342.1621       <.0001
Intercept   Inadequate      1    -1.9581           0.0842          541.3514       <.0001
Intercept   Intermediate    1    -1.4308           0.0670          455.9302       <.0001
Intercept   adequate+       1    -0.4820           0.0459          110.4236       <.0001
prevlbw     No PNC          1     1.7648           0.4114           18.4034       <.0001
prevlbw     Inadequate      1     1.4258           0.2606           29.9399       <.0001
prevlbw     Intermediate    1     0.6280           0.2691            5.4441       0.0196
prevlbw     adequate+       1     0.9430           0.1861           25.6809       <.0001

Odds Ratio Estimates
Effect    indexsum       Point Estimate   95% Wald Confidence Limits
prevlbw   No PNC                  5.840        2.608   13.080
prevlbw   Inadequate              4.161        2.497    6.935
prevlbw   Intermediate            1.874        1.106    3.175
prevlbw   adequate+               2.568        1.783    3.698

Women with a prior lbw delivery had more than 4 times the odds of receiving no or inadequate prenatal care rather than adequate care compared to women with no history of lbw delivery.

Compared to women without a history of lbw delivery, however, these high-risk women also had more than twice the odds of appropriately receiving care beyond what is considered adequate for most women.

Example 5.
Cumulative Logit Model for the Associations Between Key Features Across Domains and Higher Levels of MCH Epidemiology Functioning

Outcome is a 3 level rating of MCH epidemiology functioning:
• above average
• average
• below average

                                                                            Odds Ratio   95% CI     P
Organizational Position*                                                           2.0   0.8-4.8    0.14
Agenda-Setting by Consensus                                                        6.1   1.1-34.3   0.04
Agenda-Setting by Consensus Including External Partners                            6.6   1.3-33.2   0.02
Total Key Staff with Doctoral Training                                             2.5   1.3-5.0    0.01
Additional Staff: Assignees, Fellows, or Interns                                   6.4   1.3-32.1   0.03
Routine Data Sharing (internal and external) & Data Integration Occurring          4.0   0.9-18.3   0.07

* Organizational position is a three level ordinal variable: named MCH epidemiology unit; no named unit, but recognized presence; and no or diffuse effort.

Summary: Ordinal and Nominal Outcomes

Cumulative (ordinal)
• Proportional odds assumption: assess the series of binary comparisons from collapsing categories
• k-1 intercepts
• 1 slope / 1 odds ratio

Generalized (nominal)
• No assumption of the shape of the association
• Categories compared to a reference group
• k-1 intercepts
• k-1 slopes / k-1 odds ratios
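The counts of intercepts and slopes in the summary follow from the form of each model. As a sketch in notation assumed here rather than taken from the slides, let Y be an outcome with k categories, x a single independent variable, and r the reference category:

```latex
\begin{align*}
\text{Cumulative logit:}\quad
  & \log\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \alpha_j + \beta x,
  & j &= 1, \dots, k-1 \\[4pt]
\text{Generalized logit:}\quad
  & \log\frac{P(Y = j \mid x)}{P(Y = r \mid x)} = \alpha_j + \beta_j x,
  & j &\ne r
\end{align*}
```

Both models have k-1 intercepts. The cumulative logit shares one slope, and so one odds ratio, across all k-1 cut points, while the generalized logit estimates a separate slope for each of the k-1 comparisons with the reference category.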
Issues for categorizing an outcome variable are similar to those for defining categories for independent variables:
• Conceptual meaning of the categories
• Statistical tests v. judgment about differences between categories
• Sample size and power

Model Building

Just as one might examine dummy variables for an independent variable before deciding whether to use it in ordinal form, it is sometimes useful to run a generalized logit model first, since it requires no assumption about the ordering of the categories, and then assess empirically whether the variation in the category-specific odds ratios is important or negligible.

And even if the proportional odds assumption holds, reporting separate odds ratios for each category, using generalized logit, may be important in order to emphasize the similarity of the strength of the association across categories.

In addition, the cumulative logit model will not only force the strength of association to be uniform; the predicted values will also be forced to be linear.
Using generalized logit, the predicted odds and odds ratios will both more closely reflect the observed values.

Why Not Just Always Run Stratified Models for Generalized Logit?

• For nominal outcomes, using a single model may be more efficient than using separate binary models.
• With separate binary models, one must decide whether each model should include the same independent variables, or whether different final, category-specific models make sense, each including only those variables that are risk or protective factors for a particular binary comparison.
• Using a single multinomial model permits a unified profile of risk and protective factors across the categories, both significant and insignificant.

For a variable that is actually continuous, are there reasons to use a cumulative logit model instead of a continuous outcome model?
For example, when would modeling ordinal categories of birthweight be preferable either to modeling birthweight continuously in grams or categorized into nominal groups?
• Using a variable as ordinal (with fewer categories) as opposed to continuous will yield odds ratios instead of mean differences
• No assumption of normality required

For a variable that meets the proportional odds assumption, is it still appropriate to choose to use a generalized logit approach?
• Using ordinal as opposed to nominal categories will be more efficient if there is truly an ordinal effect
• Why "waste" degrees of freedom on multiple odds ratios, if the effect is constant across categories?

Which Modeling Approach?

Choosing the form of the outcome variable: Stressful Life Events
• Any stressful life event (y/n) = independent vars (dichotomous)
• Fin. Emot. Traum. Partner = independent vars (Nominal - No stressful life events as the reference)
• Sum of stressful life events = independent vars (continuous)
• Scale of stressful life events = independent vars (ordinal)
Choosing the form of the outcome variable: Maternal Depression
• Any depression (y/n) = independent vars
• Pre&Post Pre_Only PP_Only = independent vars (Nominal - No depression as the reference)
• Severe Moderate Mild = independent vars (Ordinal or Nominal)
• Depression Severity Scale = independent vars (ordinal)

Choosing the form of the outcome variable: Breastfeeding
• Ever Breastfed (yes v. no) = independent vars
• Exclusive BF>=2 mos. (yes v. no) = independent vars
• Exclusive >=2 mo. Exclusive BF<=2 mo. = independent vars (Never Breastfed as reference)
• BF<2 mo. BF 2-6 mo. BF > 6 mo. = independent vars (Never Breastfed as reference)
• Breastfeeding duration in weeks = independent vars