Supplemental Digital Content 10

Test results of DIF of 29 PTSD/Trauma items: Race groups

Race was collapsed into two DIF groups: white = 1, non-white = 0. lordif was run with the following options: detection criterion = Chisqr; threshold alpha = 0.01; minimum count in a cell = 5; all other settings were left at their defaults. Figure 1 shows the density distributions of PTSD/Trauma trait estimates for the DIF groups.

[Figure 1 plot: "Trait Distributions". x-axis: theta (-4 to 4); y-axis: Density (0.0 to 0.4); curves: Non-white, White.]

Figure 1: The graph shows smoothed histograms of the PTSD/Trauma trait levels of white (dashed line) and non-white (solid line) study participants as measured by the PTSD/Trauma scale (theta). There is substantial overlap in the distributions except at the high end of the scale, where the non-white group appears to show higher trait levels. However, the two groups have similar mean scores.

All five response categories were retained by the DIF test procedure for all but two items (#2 and #25), which were collapsed into four categories because of sparse cells. Three of the 29 items were flagged for DIF: item #7 (“I broke into a sweat when I thought about what happened”); item #9 (“I was afraid in crowds”); and item #29 (“I was afraid to go to sleep”). Figures 2 to 4 show the diagnostic plots for the three individual DIF items, while Figures 5 and 6 show test-level (scale-level) evaluations.

Item-level plots:

[Figure 2 plot, Item 7. Panels: "Item True Score Functions - Item 7" (x-axis: theta; y-axis: Item Score), "Differences in Item True Score Functions", "Item Response Functions" (y-axis: Probability), "Impact (Weighted by Density)" (y-axis: Size). Printed test statistics: Pr(χ²12, 1) = 0.0027, R²12 = 0.0059, Δ(β1) = 6e-04; Pr(χ²13, 2) = 0.0101, R²13 = 0.006; Pr(χ²23, 1) = 0.6559, R²23 = 1e-04. Parameter estimates printed in the item response functions panel: 2.56, -1.16, -0.37, 0.53, 1.5 and 2.35, -1.04, -0.14, 0.83, 1.69.]

Figure 2: The top left plot shows item characteristic curves (ICCs) separately for the white and non-white groups.
These true score functions are based on group-specific item parameter estimates. The slope of the function is similar for both groups; however, the white group shows a slightly higher threshold across almost the entire range of the scale. This indicates uniform DIF and is consistent with the LR test results: significant for uniform DIF (Model 1 versus Model 2) (p=0.003), but not significant for non-uniform DIF (Model 2 versus Model 3) (p=0.656). The top right plot shows the absolute difference between the ICCs for the two groups, which is spread almost symmetrically across the levels of the trait (theta). The lower left plot shows the item category response functions for the two groups based on the group-specific parameter estimates. Again the threshold for the white group is slightly higher, and the plot adds evidence that all five response categories of this item contribute uniformly to the DIF. In the lower right plot, the absolute difference between the ICCs (the upper right plot) is weighted by the score distribution for the focal group, i.e., white individuals. The resulting magnitude is close to zero; therefore the impact of the DIF is minimal.

[Figure 3 plot, Item 9. Panels: "Item True Score Functions - Item 9" (x-axis: theta; y-axis: Item Score), "Differences in Item True Score Functions", "Item Response Functions" (y-axis: Probability), "Impact (Weighted by Density)" (y-axis: Size). Printed test statistics: Pr(χ²12, 1) = 0.0026, R²12 = 0.0061, Δ(β1) = 0.0218; Pr(χ²13, 2) = 0.009, R²13 = 0.0063; Pr(χ²23, 1) = 0.5419, R²23 = 3e-04. Parameter estimates printed in the item response functions panel: 2.73, -1.51, -0.73, 0.06, 1.08 and 2.26, -1.51, -1.01, -0.11, 0.73.]

Figure 3: The patterns are similar to those in Figure 2, except that the white and non-white groups switch places.
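As a rough illustration of how the item true score curves and their difference in Figure 2 arise, the Python sketch below evaluates a graded response model at the two parameter sets printed in the Item 7 panel. Several things here are assumptions rather than facts from the analysis: the first printed value is read as the slope and the remaining four as thresholds, responses are scored 0 to 4, the assignment of parameter sets to groups is inferred from the caption (the white group has the higher thresholds), and a standard normal density stands in for the focal group's trait distribution, which is not available here. It is not lordif's own code.

```python
import math

def expected_item_score(theta, slope, thresholds):
    """Expected score (0..K) of one graded response model item at theta."""
    # Each term is P(response >= k); summing over k = 1..K gives the
    # expected item score when categories are scored 0..K.
    return sum(1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds)

def normal_pdf(x):
    """Standard normal density, used only as a stand-in weight (assumption)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Values as printed in the Item 7 panel, read as (slope, thresholds).
# Group labels are inferred from the Figure 2 caption: an assumption.
ITEM7_NONWHITE = (2.56, [-1.16, -0.37, 0.53, 1.50])
ITEM7_WHITE = (2.35, [-1.04, -0.14, 0.83, 1.69])

def abs_difference(theta):
    """Absolute gap between the two group-specific expected scores."""
    return abs(expected_item_score(theta, *ITEM7_NONWHITE)
               - expected_item_score(theta, *ITEM7_WHITE))

# Trait grid from -4 to 4 in steps of 0.1.
THETA_GRID = [-4.0 + 0.1 * i for i in range(81)]
differences = [abs_difference(t) for t in THETA_GRID]
# Density-weighted difference: small wherever few respondents are located.
weighted = [d * normal_pdf(t) for d, t in zip(differences, THETA_GRID)]

for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"theta={t:+.1f}  |diff|={abs_difference(t):.3f}  "
          f"weighted={abs_difference(t) * normal_pdf(t):.3f}")
```

Weighting shrinks the difference at every trait level because the density never exceeds about 0.4, which is one reason the impact panel sits closer to zero than the raw difference panel.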
[Figure 4 plot, Item 29. Panels: "Item True Score Functions - Item 29" (x-axis: theta; y-axis: Item Score), "Differences in Item True Score Functions", "Item Response Functions" (y-axis: Probability), "Impact (Weighted by Density)" (y-axis: Size). Printed test statistics: Pr(χ²12, 1) = 0.0654, R²12 = 0.0022, Δ(β1) = 0.0049; Pr(χ²13, 2) = 0.0022, R²13 = 0.008; Pr(χ²23, 1) = 0.0029, R²23 = 0.0058. Parameter estimates printed in the item response functions panel: 1.73, -1.16, -0.39, 0.67, 1.83 and 2.31, -0.75, -0.07, 0.77, 1.53.]

Figure 4: This figure differs from Figures 2 and 3 in that, in the top left plot, the slope of the function for the white group is substantially higher than that for the non-white group, indicating non-uniform DIF. As a result, the white group shows a slightly higher threshold in the lower region of the trait (theta < 1), but this relationship is reversed in the upper region of the scale. The graphical pattern is confirmed by the LR test results: non-significant uniform DIF (Model 1 versus Model 2) (p=0.065) but significant non-uniform DIF (Model 2 versus Model 3) (p=0.003). Here again the DIF impact is minimal (lower right plot).

Scale-level plots:

[Figure 5 plot. Panels: "All Items" (y-axis: TCC, 0 to 100) and "DIF Items" (y-axis: TCC, 0 to 12); x-axis: theta (-4 to 4); curves: Non-white, White.]

Figure 5: Impact of DIF items on test characteristic curves (TCCs). Estimation of TCCs for the white and non-white DIF groups was based on group-specific item parameter estimates. The plots show the expected total scores for groups of items at each level of the PTSD/Trauma trait (theta). The left plot shows these curves for all of the items (both with and without DIF), while the right plot shows these curves for only the three items found to have DIF. These curves suggest that, at the overall scale level, there is minimal difference between white and non-white individuals in the total expected score at any PTSD/Trauma trait level.
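The "DIF Items" panel of Figure 5 can be approximated, under stated assumptions, by summing group-specific expected item scores over the three flagged items. The Python sketch below does this with the parameter values printed in the Item 7, Item 9, and Item 29 panels. As assumptions: each printed set is read as a slope followed by four thresholds, items are scored 0 to 4 (so the three-item total runs from 0 to 12, matching the panel's axis), and the assignment of sets to groups is inferred from the figure captions. lordif's own computation may differ in detail.

```python
import math

def expected_item_score(theta, slope, thresholds):
    """Expected score (0..K) of one graded response model item at theta."""
    # Sum of cumulative probabilities P(response >= k), k = 1..K.
    return sum(1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds)

# (slope, thresholds) per item as printed in the item-level panels.
# Group labels are inferred from the captions: an assumption.
NONWHITE = {
    7: (2.56, [-1.16, -0.37, 0.53, 1.50]),
    9: (2.73, [-1.51, -0.73, 0.06, 1.08]),
    29: (1.73, [-1.16, -0.39, 0.67, 1.83]),
}
WHITE = {
    7: (2.35, [-1.04, -0.14, 0.83, 1.69]),
    9: (2.26, [-1.51, -1.01, -0.11, 0.73]),
    29: (2.31, [-0.75, -0.07, 0.77, 1.53]),
}

def tcc(theta, items):
    """Test characteristic curve: expected total score over the given items."""
    return sum(expected_item_score(theta, a, bs) for a, bs in items.values())

# Trait grid from -4 to 4 in steps of 0.1.
THETA_GRID = [-4.0 + 0.1 * i for i in range(81)]
# Largest gap between the two group-specific curves on the grid.
max_gap = max(abs(tcc(t, NONWHITE) - tcc(t, WHITE)) for t in THETA_GRID)

for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"theta={t:+.1f}  non-white={tcc(t, NONWHITE):.2f}  "
          f"white={tcc(t, WHITE):.2f}")
print(f"largest gap on the grid: {max_gap:.2f} (score range 0 to 12)")
```

Because the item-level differences run in opposite directions for Item 9 compared with Items 7 and 29, they partly offset one another when summed, which is consistent with the small scale-level difference described in the caption.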
[Figure 6 plot. Left panel: box plot of "initial - purified" (y-axis: -0.06 to 0.04). Right panel: "initial - purified" plotted against initial theta (x-axis: -3 to 3); points: White, Non-white.]

Figure 6: Individual-level DIF impact. The plots show the difference in score when DIF is ignored versus when DIF is accounted for. The box plot of these differences (left) suggests a fairly normal distribution with a mean and median of approximately zero. On the right, the same difference scores are plotted against the initial scores ignoring DIF (initial theta), with the data points color coded for the white and non-white groups. Horizontal guidelines are placed at 0.0 (solid line), i.e., no difference, and at the mean of the differences (dotted line). The positive values to the left of the plot indicate that, in almost all cases, accounting for DIF led to slightly lower scores (i.e., naïve score ignoring DIF minus score accounting for DIF > 0) for those with lower trait levels, and this appears to be consistent across both DIF groups. The negative values to the right indicate that, for those with higher trait levels, accounting for DIF led to slightly higher scores; this again was consistent across the two groups. The takeaway, however, is that these difference scores deviate only minimally from zero.

Reference:

Choi SW, Gibbons LE, Crane PK. lordif: An R Package for Detecting Differential Item Functioning Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations. J Stat Softw. 2011;39(8):1–30.