Supplemental Digital Content 10
Test results of DIF of 29 PTSD/Trauma items: Race groups
Race was collapsed into two DIF groups: white = 1, non-white = 0.
lordif was run with the following options: detection criterion = Chisqr; threshold alpha = 0.01; minimum
count in a cell = 5; plus other default settings. Figure 1 shows the density distributions of PTSD/Trauma
trait estimates for the DIF groups.
[Figure 1 plot, "Trait Distributions": density curves for the Non-white and White groups; x-axis: theta
(-4 to 4); y-axis: Density (0.0 to 0.4).]
Figure 1: Smoothed histograms of the PTSD/Trauma trait levels (theta) of white (dashed line) and
non-white (solid line) study participants, as measured by the PTSD/Trauma scale. The distributions
overlap substantially, except at the high end of the scale, where the non-white group appears to show
somewhat higher trait levels. Nevertheless, the two groups have similar mean scores.
All five response categories were retained by the DIF test procedure except for two items (#2 and #25),
which were collapsed into four categories due to sparseness. Three of the 29 items were flagged for DIF: item
#7 (“I broke into a sweat when I thought about what happened”); item #9 (“I was afraid in crowds”); and
item #29 (“I was afraid to go to sleep”). Figures 2 to 4 show the diagnostic plots for the three individual
DIF items, while Figures 5 and 6 show test- or scale-level evaluations.
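The flagging decision for the three items can be sketched in a few lines of Python. This is an illustrative sketch, not the lordif implementation: it assumes that, under the Chisqr criterion, an item is flagged when any of its three likelihood-ratio tests has a p-value below the threshold alpha. The p-values are those printed on the diagnostic plots for items 7, 9, and 29; the names P_VALUES, is_flagged, and dif_type are hypothetical.

```python
ALPHA = 0.01  # threshold alpha reported in the text

# p-values printed on the diagnostic plots for the three flagged items:
#   p12 = Model 1 versus Model 2 (uniform DIF)
#   p13 = Model 1 versus Model 3 (overall DIF)
#   p23 = Model 2 versus Model 3 (non-uniform DIF)
P_VALUES = {
    7: {"p12": 0.0027, "p13": 0.0101, "p23": 0.6559},
    9: {"p12": 0.0026, "p13": 0.0090, "p23": 0.5419},
    29: {"p12": 0.0654, "p13": 0.0022, "p23": 0.0029},
}


def is_flagged(pvals, alpha=ALPHA):
    """Assumed rule: flag the item if any of the three tests is significant."""
    return any(p < alpha for p in pvals.values())


def dif_type(pvals, alpha=ALPHA):
    """Label the DIF pattern from the Model 2 vs 3 and Model 1 vs 2 tests."""
    if pvals["p23"] < alpha:
        return "non-uniform"
    if pvals["p12"] < alpha:
        return "uniform"
    return "unclassified"


if __name__ == "__main__":
    for item, pvals in P_VALUES.items():
        print(item, is_flagged(pvals), dif_type(pvals))
```

Under these assumptions, items 7 and 9 are labeled uniform and item 29 non-uniform, which matches the interpretation given in the captions of Figures 2 to 4.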
Item-level plots:
[Figure 2 plots, Item 7. Panels: "Item True Score Functions - Item 7" (top left; x-axis: theta; y-axis: Item
Score; legend: Non-white, White); "Differences in Item True Score Functions" (top right; x-axis: theta; y-axis:
Item Score); "Item Response Functions" (lower left; x-axis: theta; y-axis: Probability); "Impact (Weighted by
Density)" (lower right; x-axis: theta; y-axis: Size).
Statistics printed on the plot: Pr(χ²12, 1) = 0.0027, R²12 = 0.0059, Δ(β1) = 6e-04; Pr(χ²13, 2) = 0.0101,
R²13 = 0.006; Pr(χ²23, 1) = 0.6559, R²23 = 1e-04.
Item parameter estimates printed on the plot (one set per group): 2.56, -1.16, -0.37, 0.53, 1.5 and
2.35, -1.04, -0.14, 0.83, 1.69.]
Figure 2: The top left plot shows item characteristic curves (ICCs) separately for the white and non-white groups. These true
score functions are based on group-specific item parameter estimates. The slope of the function is similar for both
groups; however, the white group shows a slightly higher threshold across almost the entire range of the scale. This
indicates uniform DIF and is consistent with the LR test results: significant for uniform DIF (Model 1 versus
Model 2; p=0.003), but not significant for non-uniform DIF (Model 2 versus Model 3; p=0.656). The top right plot
shows the absolute difference between the ICCs for the two groups, which is spread almost symmetrically across the
levels of the trait (theta). The lower left plot shows the item category response functions for the two groups based on the
group-specific parameter estimates. Again, the threshold for the white group is slightly higher, and the plot adds
evidence that all five response categories of this item contribute uniformly to the DIF. In the lower right plot, the
absolute difference between the ICCs (the upper right plot) is weighted by the score distribution for the focal group,
i.e., white individuals; the resulting magnitude is close to zero, so the impact of the DIF is minimal.
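The quantities shown in the four panels can be approximated with a short Python sketch. It rests on several assumptions that the text does not state: the items follow a graded response model, categories are scored 0 to 4, each printed parameter set is a slope followed by four thresholds, and the first printed set belongs to the non-white group (inferred from the caption, which says the white group has the higher thresholds). A standard normal density stands in for the focal-group trait distribution, which is not available here. All function and variable names are hypothetical.

```python
import math


def category_probs(theta, slope, thresholds):
    """Graded response model: probability of each response category 0..K-1."""
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    cum.append(0.0)
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def expected_score(theta, slope, thresholds):
    """Item true score function: expected item score at a given theta."""
    probs = category_probs(theta, slope, thresholds)
    return sum(k * p for k, p in enumerate(probs))


def standard_normal_pdf(theta):
    """Stand-in density (assumption); the plot uses the focal-group distribution."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


# Item 7 estimates printed on the plot; the group labels are an inference.
NON_WHITE_ITEM7 = (2.56, [-1.16, -0.37, 0.53, 1.50])
WHITE_ITEM7 = (2.35, [-1.04, -0.14, 0.83, 1.69])


def item_difference(theta):
    """Absolute difference between the two groups' true score functions."""
    return abs(expected_score(theta, *WHITE_ITEM7)
               - expected_score(theta, *NON_WHITE_ITEM7))


def weighted_impact(theta, density=standard_normal_pdf):
    """Difference weighted by the (assumed) trait density at theta."""
    return item_difference(theta) * density(theta)


if __name__ == "__main__":
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, round(item_difference(theta), 3),
              round(weighted_impact(theta), 3))
```

Because the density is small wherever the trait is rare, the weighted curve stays well below the unweighted difference, which is the reasoning behind reading the lower right panel as minimal impact.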
[Figure 3 plots, Item 9. Panels: "Item True Score Functions - Item 9" (top left; x-axis: theta; y-axis: Item
Score; legend: Non-white, White); "Differences in Item True Score Functions" (top right; x-axis: theta; y-axis:
Item Score); "Item Response Functions" (lower left; x-axis: theta; y-axis: Probability); "Impact (Weighted by
Density)" (lower right; x-axis: theta; y-axis: Size).
Statistics printed on the plot: Pr(χ²12, 1) = 0.0026, R²12 = 0.0061, Δ(β1) = 0.0218; Pr(χ²13, 2) = 0.009,
R²13 = 0.0063; Pr(χ²23, 1) = 0.5419, R²23 = 3e-04.
Item parameter estimates printed on the plot (one set per group): 2.73, -1.51, -0.73, 0.06, 1.08 and
2.26, -1.51, -1.01, -0.11, 0.73.]
Figure 3: The patterns are similar to those in Figure 2, except that the white and non-white groups switch places.
[Figure 4 plots, Item 29. Panels: "Item True Score Functions - Item 29" (top left; x-axis: theta; y-axis: Item
Score; legend: Non-white, White); "Differences in Item True Score Functions" (top right; x-axis: theta; y-axis:
Item Score); "Item Response Functions" (lower left; x-axis: theta; y-axis: Probability); "Impact (Weighted by
Density)" (lower right; x-axis: theta; y-axis: Size).
Statistics printed on the plot: Pr(χ²12, 1) = 0.0654, R²12 = 0.0022, Δ(β1) = 0.0049; Pr(χ²13, 2) = 0.0022,
R²13 = 0.008; Pr(χ²23, 1) = 0.0029, R²23 = 0.0058.
Item parameter estimates printed on the plot (one set per group): 1.73, -1.16, -0.39, 0.67, 1.83 and
2.31, -0.75, -0.07, 0.77, 1.53.]
Figure 4: Unlike in Figures 2 and 3, the slope of the function in the top left plot is substantially higher for
the white group than for the non-white group, indicating non-uniform DIF: the white group shows a slightly
higher threshold in the lower region of the trait (theta < 1), but this relationship is reversed in the upper
region of the scale. The graphical pattern is confirmed by the LR test results: non-significant uniform DIF
(Model 1 versus Model 2; p=0.065) but significant non-uniform DIF (Model 2 versus Model 3; p=0.003). Here
again, the DIF impact is minimal (lower right plot).
Scale-level plots:
[Figure 5 plots: test characteristic curves for the Non-white and White groups; x-axis: theta (-4 to 4);
y-axis: TCC. Left panel: "All Items" (TCC axis 0 to 100). Right panel: "DIF Items" (TCC axis 0 to 12).]
Figure 5: Impact of DIF items on test characteristic curves (TCCs). Estimation of TCCs for white and non-white
DIF groups was based on group-specific item parameter estimates. The plots show the expected total scores for
groups of items at each level of the PTSD/Trauma trait (theta). The left plot shows these curves for all of the items
(both items with and without DIF), while the right plot shows these curves for only the three items found to have
DIF. These curves suggest that at the overall scale level there is minimal difference in the total expected score at any
PTSD/Trauma trait level for white or non-white individuals.
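As a rough check on the right-hand panel, the test characteristic curve for the three DIF items can be sketched as the sum of their expected item scores. The sketch assumes a graded response model with categories scored 0 to 4, reads each printed parameter set as a slope followed by four thresholds, and assigns the first printed set of each item to the non-white group (an inference from the captions of Figures 2 to 4). The names DIF_ITEMS, expected_item_score, and tcc are hypothetical.

```python
import math

# Estimates printed on the item-level plots: (slope, thresholds) per group.
# The group assignment is inferred from the figure captions (assumption).
DIF_ITEMS = {
    7: {"non_white": (2.56, [-1.16, -0.37, 0.53, 1.50]),
        "white": (2.35, [-1.04, -0.14, 0.83, 1.69])},
    9: {"non_white": (2.73, [-1.51, -0.73, 0.06, 1.08]),
        "white": (2.26, [-1.51, -1.01, -0.11, 0.73])},
    29: {"non_white": (1.73, [-1.16, -0.39, 0.67, 1.83]),
         "white": (2.31, [-0.75, -0.07, 0.77, 1.53])},
}


def expected_item_score(theta, slope, thresholds):
    """Expected score for categories 0..K-1: the sum of P(X >= k) over k >= 1."""
    return sum(1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds)


def tcc(theta, group):
    """Expected total score over the three DIF items for one group."""
    return sum(expected_item_score(theta, *params[group])
               for params in DIF_ITEMS.values())


if __name__ == "__main__":
    for step in range(-4, 5):
        theta = float(step)
        print(theta, round(tcc(theta, "non_white"), 2),
              round(tcc(theta, "white"), 2))
```

The item-level differences partly offset one another in the sum because items 7 and 9 favor opposite groups, which is one way to see why the group curves can lie close together at the scale level.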
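[Figure 6 plots: box plot of the difference scores (left; axis: initial - purified, -0.06 to 0.04) and scatter
plot of initial - purified against initial theta (right; x-axis: initial theta, -3 to 3), with points marked for
the White and Non-white groups.]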
Figure 6: Individual-level DIF impact. The plots show the difference in score when DIF is ignored versus
when DIF is accounted for. The box plot of these differences (left) suggests a fairly normal distribution with a
mean and median of approximately zero. On the right, the same difference scores are plotted against the initial scores
ignoring DIF (initial theta), with the data points color coded for the white and non-white groups. Horizontal guidelines
are placed at 0.0 (solid line), i.e., no difference, and at the mean of the differences (dotted line). The positive values
on the left of the plot indicate that, in almost all cases, accounting for DIF led to slightly lower scores (i.e., naïve
score ignoring DIF minus score accounting for DIF > 0) for those with lower trait levels, and this appears to be
consistent across both DIF groups. The negative values on the right indicate that, for those with higher trait levels,
accounting for DIF led to slightly higher scores; this again was consistent across the two groups. The takeaway,
however, is that these difference scores deviate only minimally from zero.
Reference:
Choi SW, Gibbons LE, Crane PK. lordif: An R Package for Detecting Differential Item Functioning
Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations.
J Stat Softw. 2011;39(8):1–30.