Implementing Worst Rank Imputation Using SAS

Paper SP12
Implementing Worst Rank Imputation Using SAS®
Qian Wang, Merck Sharp & Dohme (Europe), Inc., Brussels, Belgium
Eric Qi, Merck & Company, Inc., Upper Gwynedd, PA
ABSTRACT
Classic designs of randomized clinical trials quite often require repeated measurements over the treatment period. However,
some patients may experience terminal events which prevent their continuation in the treatment. Although no measurements
can be collected after those patients leave the study, the drop-out reasons (e.g. drug-related adverse experience or lack of
efficacy) and time (e.g. early or late in the study) might be an indication of the effect of the treatment.
One approach to incorporating this information into the analysis is to impute, for the missing observation, a ‘worst-rank score’ which is worse than any value actually collected at that time point. Transforming the observed data into
rankings for analysis also has the advantage of being less sensitive to distributional assumptions and less affected by
outliers. Nevertheless, how to implement and program the worst-rank imputation remains a challenge.
This paper provides one possible definition of the worst-rank imputation and describes in detail how to implement it using
SAS.
INTRODUCTION
In a clinical trial which requires repeated response measurements over a period of time, patients may discontinue
the treatment for treatment-related reasons (e.g. lack of efficacy or adverse experience) which prevent their physical
evaluation before the end of the study. The observations after discontinuation, however, might not be missing at random and
might, on the contrary, provide insight into the treatment effect. For example, discontinuation due to lack of efficacy is a
strong negative message, and the missing observation is an indication of treatment failure. In this case, the missing observation is informative.
One approach to incorporating this informative “missingness” into the analysis is to impute a ‘worst-rank score’ for the missing data. This score is worse than any value actually observed. Transforming the observed data into rankings for
analysis also has the advantage of being less sensitive to distributional assumptions and less affected by outliers. However, the exact definition of worst-rank imputation and its implementation remain a challenge.
This paper provides one possible definition of the worst-rank imputation approach and describes in detail its SAS program
architecture as well as its step-by-step implementation. Please note that the paper focuses on the application and implementation of worst-rank imputation rather than on its statistical soundness.
WORST RANK IMPUTATION
Example Study Design
A simulated trial (see Table 1) is used in this paper to illustrate the worst-rank imputation method. The study consists of 14 patients with unique subject IDs (USUBJID) 101 to 114. Ideally, 5 assessments of the endpoint of interest should
be collected for each patient: one at the baseline visit (Visit 0) and one at each of 4 post-baseline visits (Visit 1 to 4). Patients 101, 102 and 103
complete the treatment with a measurement collected at every time point. Patients 104 and 107 are also completers, but fail to
provide response information for all time points. These 5 patients are labeled as COMPLETER. The rest of the patients
drop out before the end of the study for various reasons (e.g. WITHDRAWAL OF CONSENT, LOST TO FOLLOW-UP, ADVERSE EVENT, LACK OF EFFICACY, etc.). The reasons and time of discontinuation are recorded. As an example, patient 106 stops treatment after Visit 2 due to lack of efficacy. Therefore, this patient only has assessments before
drop-out, at Visits 0, 1 and 2. No information is available for Visits 3 and 4. It also happens that some patients skip assessments before they leave the study. One example given here is patient 110, who is missing the Visit 2 measurement although
he only stops treatment after Visit 3.
Table 1 Original Data Collected
             Endpoint Value at Visit          Discontinuation
USUBJID      0      1      2      3      4  Visit  Reason
101      108.0  114.0   90.0   84.0   84.0  NA     COMPLETER
102       90.0   90.0   90.0   90.0   90.0  NA     COMPLETER
103      114.0   67.5   60.0   60.0   60.0  NA     COMPLETER
104       75.0   40.0   60.0   27.0      .  NA     COMPLETER
105       54.0      .      .      .      .  0      WITHDRAWAL OF CONSENT
106      112.5   90.0   66.0      .      .  2      LACK OF EFFICACY
107      120.0  117.0      .      .   73.8  NA     COMPLETER
108      222.0  162.8   81.6      .      .  2      LOST TO FOLLOW-UP
109       21.0   21.0   24.0      .      .  2      ADVERSE EVENT
110       48.0    6.0      .   24.0      .  3      WITHDRAWAL OF CONSENT
111        2.0   18.4   20.0  113.3      .  3      LACK OF EFFICACY
112       84.0      .      .      .      .  0      ADVERSE EVENT
113      114.0   96.0   48.0   96.0      .  3      ADVERSE EVENT
114       90.0   78.0      .      .      .  1      LACK OF EFFICACY
Without loss of generality, a change-from-baseline analysis is adopted for this study. Let’s assume that the ranking and the
collected assessments have the following relationship:
Rank              Assessment Value
1 (Best)          Smallest
Maximum (Worst)   Largest
And the ranking and the change from baseline values have the following relationship:
Rank              Change From Baseline Value (Visit 1 to 4)
1 (Best)          Smallest (Highly Negative)
Maximum (Worst)   Largest (Less Negative or Positive)
The ranking of change from baseline during the follow-up visits (computed as the Visit i value minus the Visit 0 value) is constructed in
such a way that the rank value corresponds to the treatment effect (the smallest rank value, 1, indicating the best effect and the largest
value the worst). In this example, the smallest value of change from baseline (the most negative value, which indicates the
largest decrease from baseline) corresponds to the best treatment effect, hence rank number 1.
Baseline Ranking
Although the worst-rank imputation approach only applies to the follow-up time points, the baseline ranking is also calculated for this example since it’s sometimes needed to account for baseline levels by means of a covariate in the repeated
measurement analysis. The ranking is calculated solely based on the baseline endpoint values. Table 2 gives the ranking of
all patients at Visit 0. The best rank (for patient 111) corresponds to the smallest endpoint value (2.0), and the worst rank
(for patient 108) corresponds to the largest assessment (222.0).
Table 2 Visit 0 (Baseline) Ranking
                              Discontinuation
USUBJID  Rank  Visit 0 Value  Visit  Reason
111         1            2.0  3      LACK OF EFFICACY
109         2           21.0  2      ADVERSE EVENT
110         3           48.0  3      WITHDRAWAL OF CONSENT
105         4           54.0  0      WITHDRAWAL OF CONSENT
104         5           75.0  NA     COMPLETER
112         6           84.0  0      ADVERSE EVENT
102         8           90.0  NA     COMPLETER
114         8           90.0  1      LACK OF EFFICACY
101         9          108.0  NA     COMPLETER
106        10          112.5  2      LACK OF EFFICACY
103        12          114.0  NA     COMPLETER
113        12          114.0  3      ADVERSE EVENT
107        13          120.0  NA     COMPLETER
108        14          222.0  2      LOST TO FOLLOW-UP
Follow-up Visit Ranking
If there were no missing values, patients at each follow-up visit (Visit 1 to Visit 4) could be ranked on their
change from baseline. However, when it is impossible to collect information at all time points, the worst-rank imputation
method provides a possible solution.
The guiding principle of worst-rank imputation is to rank information in a way which corresponds to the treatment effect.
Intuitively, patients who are still in the study at a specific time point show a better response than patients who have
left the study for drug-related reasons. Continuing patients are therefore assigned better ranks than patients who discontinued
for treatment-related causes. For a patient who is known to remain in the study but has a missing assessment at the time point, the
last observation during the treatment is carried forward. So in the case of patient 110, who only leaves the study after Visit 3, the
missing Visit 2 assessment can be imputed using the Visit 1 observation. Non-dropouts (with or without missing information)
are ranked together based on the change from baseline (calculated from observed or imputed values). Table 3 lists
all the patient assessments after Last Observation Carried Forward (LOCF). Table 4 shows the computed change from baseline from Visit 1 to Visit 4 with LOCF. All carried-forward values are highlighted in bold.
Table 3 Data After LOCF Imputation
(* = value carried forward by LOCF, shown in bold in the original)
              Endpoint Value at Visit              Discontinuation
USUBJID      0       1       2       3       4   Visit  Reason
101      108.0   114.0    90.0    84.0    84.0   NA     COMPLETER
102       90.0    90.0    90.0    90.0    90.0   NA     COMPLETER
103      114.0    67.5    60.0    60.0    60.0   NA     COMPLETER
104       75.0    40.0    60.0    27.0    27.0*  NA     COMPLETER
105       54.0       .       .       .       .   0      WITHDRAWAL OF CONSENT
106      112.5    90.0    66.0       .       .   2      LACK OF EFFICACY
107      120.0   117.0   117.0*  117.0*   73.8   NA     COMPLETER
108      222.0   162.8    81.6       .       .   2      LOST TO FOLLOW-UP
109       21.0    21.0    24.0       .       .   2      ADVERSE EVENT
110       48.0     6.0     6.0*   24.0       .   3      WITHDRAWAL OF CONSENT
111        2.0    18.4    20.0   113.3       .   3      LACK OF EFFICACY
112       84.0       .       .       .       .   0      ADVERSE EVENT
113      114.0    96.0    48.0    96.0       .   3      ADVERSE EVENT
114       90.0    78.0       .       .       .   1      LACK OF EFFICACY
Table 4 Calculated Change From Baseline After LOCF Imputation
(* = computed from a value carried forward by LOCF, shown in bold in the original)
         Endpoint Change From Baseline
         Value at Visit                    Discontinuation
USUBJID      1       2       3       4   Visit  Reason
101        6.0   -18.0   -24.0   -24.0   NA     COMPLETER
102        0.0     0.0     0.0     0.0   NA     COMPLETER
103      -46.5   -54.0   -54.0   -54.0   NA     COMPLETER
104      -35.0   -15.0   -48.0   -48.0*  NA     COMPLETER
105          .       .       .       .   0      WITHDRAWAL OF CONSENT
106      -22.5   -46.5       .       .   2      LACK OF EFFICACY
107       -3.0    -3.0*   -3.0*  -46.2   NA     COMPLETER
108      -59.2  -140.4       .       .   2      LOST TO FOLLOW-UP
109        0.0     3.0       .       .   2      ADVERSE EVENT
110      -42.0   -42.0*  -24.0       .   3      WITHDRAWAL OF CONSENT
111       16.4    18.0   111.3       .   3      LACK OF EFFICACY
112          .       .       .       .   0      ADVERSE EVENT
113      -18.0   -66.0   -18.0       .   3      ADVERSE EVENT
114      -12.0       .       .       .   1      LACK OF EFFICACY
Discontinued patients are ranked after all patients still under treatment. The drop-outs are grouped based on the reason of
discontinuation. The reasons can be ordered so that drop-outs due to less severe drug-related reasons are ranked before those
due to more serious causes. In the example study, one possible ordering might be to consider patients with reasons WITHDRAWAL OF CONSENT or LOST TO FOLLOW-UP (group 1) as showing a better treatment effect than those with
ADVERSE EVENT (group 2). Discontinuation due to LACK OF EFFICACY (group 3) is regarded as the indication of the
worst response among all reasons. For patients who drop out for the same reason, the discontinuation time (recorded as the discontinuation visit number) is compared. Patients who drop out later in the study are ranked better (smaller
rank values) than those who leave the study earlier. Tables 5, 6, 7 and 8 give the worst-rank imputation results at Visits 1
to 4 respectively.
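The ordering rule for drop-outs described above can be sketched in a few lines of SAS. This is only an illustration under stated assumptions: the input dataset name dropouts and the variable REASONGRP are hypothetical, while VISITNUM, DSDECOD and DSDY follow the variable names used in the SAS implementation section of this paper. The reason-to-group mapping is the example ordering proposed above, not a general recommendation.

```sas
/* Hypothetical input: one row per discontinued patient per visit,     */
/* with VISITNUM, DSDECOD (reason) and DSDY (discontinuation visit).   */
data dropouts_grouped;
   set dropouts;
   /* Map each reason to the example severity groups 1 (mildest) to 3 */
   select (DSDECOD);
      when ('WITHDRAWAL OF CONSENT', 'LOST TO FOLLOW-UP') REASONGRP = 1;
      when ('ADVERSE EVENT')                               REASONGRP = 2;
      when ('LACK OF EFFICACY')                            REASONGRP = 3;
      otherwise                                            REASONGRP = .;
   end;
run;

/* Within a visit: milder reason groups first; within a group,        */
/* later drop-outs (larger DSDY) before earlier ones.                 */
proc sort data=dropouts_grouped;
   by VISITNUM REASONGRP descending DSDY;
run;
```

After the sort, the row order within each visit mirrors the order in which drop-outs appear at the bottom of Tables 5 to 8.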
At Visit 1, all patients except 105 and 112 are still in the study, and they are ranked based on the calculated change from
baseline value. Patients 105 and 112 discontinued from the study after the baseline visit and are ranked after all other patients. Patient 105
drops out for WITHDRAWAL OF CONSENT (discontinuation group 1) and is considered as showing a better treatment response than patient 112, who terminates the study due to an ADVERSE EVENT (discontinuation group 2).
Table 5 Visit 1 Ranking
               Visit 1 Change   Discontinuation
USUBJID  Rank  From Baseline    Visit  Reason Group/Description
108         1           -59.2   2      1 / LOST TO FOLLOW-UP
103         2           -46.5   NA     COMPLETER
110         3           -42.0   3      1 / WITHDRAWAL OF CONSENT
104         4           -35.0   NA     COMPLETER
106         5           -22.5   2      3 / LACK OF EFFICACY
113         6           -18.0   3      2 / ADVERSE EVENT
114         7           -12.0   1      3 / LACK OF EFFICACY
107         8            -3.0   NA     COMPLETER
102        10             0.0   NA     COMPLETER
109        10             0.0   2      2 / ADVERSE EVENT
101        11             6.0   NA     COMPLETER
111        12            16.4   3      3 / LACK OF EFFICACY
105        13               .   0      1 / WITHDRAWAL OF CONSENT
112        14               .   0      2 / ADVERSE EVENT
At Visit 2, patients 107 and 110 are ranked using their calculated change from baseline with LOCF imputation. Patients 105,
112 and 114 are no longer in the study and are considered as having a worse treatment effect than all other patients. They are
ordered based on the implication of the reason of discontinuation.
Table 6 Visit 2 Ranking
               Visit 2 Change From   Discontinuation
USUBJID  Rank  Baseline (LOCF)       Visit  Reason Group/Description
108         1               -140.4   2      1 / LOST TO FOLLOW-UP
113         2                -66.0   3      2 / ADVERSE EVENT
103         3                -54.0   NA     COMPLETER
106         4                -46.5   2      3 / LACK OF EFFICACY
110         5                -42.0   3      1 / WITHDRAWAL OF CONSENT
101         6                -18.0   NA     COMPLETER
104         7                -15.0   NA     COMPLETER
107         8                 -3.0   NA     COMPLETER
102         9                  0.0   NA     COMPLETER
109        10                  3.0   2      2 / ADVERSE EVENT
111        11                 18.0   3      3 / LACK OF EFFICACY
105        12                    .   0      1 / WITHDRAWAL OF CONSENT
112        13                    .   0      2 / ADVERSE EVENT
114        14                    .   1      3 / LACK OF EFFICACY
At Visit 3, patients 108, 105, 109, 112, 106 and 114 have all dropped out from the study. They are grouped based on discontinuation reasons. Within each group, patients are ranked based on the time when they stop the treatment. For example, both
106 and 114 leave the study due to LACK OF EFFICACY. Patient 106 is assigned a better rank because he discontinues later (after
Visit 2) than 114 (after Visit 1).
Table 7 Visit 3 Ranking
               Visit 3 Change From   Discontinuation
USUBJID  Rank  Baseline (LOCF)       Visit  Reason Group/Description
103         1                -54.0   NA     COMPLETER
104         2                -48.0   NA     COMPLETER
101         4                -24.0   NA     COMPLETER
110         4                -24.0   3      1 / WITHDRAWAL OF CONSENT
113         5                -18.0   3      2 / ADVERSE EVENT
107         6                 -3.0   NA     COMPLETER
102         7                  0.0   NA     COMPLETER
111         8                111.3   3      3 / LACK OF EFFICACY
108         9                    .   2      1 / LOST TO FOLLOW-UP
105        10                    .   0      1 / WITHDRAWAL OF CONSENT
109        11                    .   2      2 / ADVERSE EVENT
112        12                    .   0      2 / ADVERSE EVENT
106        13                    .   2      3 / LACK OF EFFICACY
114        14                    .   1      3 / LACK OF EFFICACY
At Visit 4, only patients 103, 104, 107, 101 and 102 remain in the study; all the others have stopped the treatment for various
reasons.
Table 8 Visit 4 Ranking
               Visit 4 Change From   Discontinuation
USUBJID  Rank  Baseline (LOCF)       Visit  Reason Group/Description
103         1                -54.0   NA     COMPLETER
104         2                -48.0   NA     COMPLETER
107         3                -46.2   NA     COMPLETER
101         4                -24.0   NA     COMPLETER
102         5                  0.0   NA     COMPLETER
110         6                    .   3      1 / WITHDRAWAL OF CONSENT
108         7                    .   2      1 / LOST TO FOLLOW-UP
105         8                    .   0      1 / WITHDRAWAL OF CONSENT
113         9                    .   3      2 / ADVERSE EVENT
109        10                    .   2      2 / ADVERSE EVENT
112        11                    .   0      2 / ADVERSE EVENT
111        12                    .   3      3 / LACK OF EFFICACY
106        13                    .   2      3 / LACK OF EFFICACY
114        14                    .   1      3 / LACK OF EFFICACY
SAS IMPLEMENTATION
Figure 1 shows the process flow of the SAS implementation of the worst-rank imputation method described in the previous
section. First, all collected patient data are classified according to whether the patient is still continuing in the study at each
specific time point. If the patient is a drop-out, the data are grouped by the reason of discontinuation. Next, if the patient is continuing at a specific visit and the observation is missing, the last observation prior to the current visit is carried
forward. In the third step, all continuing patients are ranked together according to the assessment values (collected or imputed). In the fourth step, the dropout patients with the same reason are ranked together based on their time of discontinuation, and the
whole dropout group is appended after the continuing patients from step 3. The drop-out groups are ordered so that a reason
group which reflects a worse treatment effect is always positioned after those with a better response.
Figure 1 Process Flow
All observations (missing or not)
   Step 1: Classify observations based on patient status
      a. Continuing patients with non-missing observations
      b. Continuing patients with missing observations
      c. Dropouts reason 1
      d. Dropouts reason 2
      e. Dropouts reason 3
      ...
   Step 2: LOCF (group b)
   Step 3: Rank based on observation (groups a and b) -> ranked data for a & b
   Step 4: Rank based on time and append (group c) -> ranked data for a, b & c
   Step 4: Rank based on time and append (group d) -> ranked data for a, b, c & d
   Step 4: Rank based on time and append (group e, ...) -> ranked data for all patients
STEP 1: CLASSIFY OBSERVATIONS BASED ON PATIENT STATUS
At each visit time point, a patient is still continuing in the study (groups a and b in the flow chart) if either of 2 criteria
holds:
1. The patient is labeled as a COMPLETER; or
2. The patient is not a COMPLETER, but the discontinuation visit (the last visit attended before dropping out) is not earlier than the current visit.
The rest of the patients are categorized according to the drop-out reasons (e.g. group c for WITHDRAWAL OF CONSENT
or LOST TO FOLLOW-UP; group d for ADVERSE EVENT; and group e for LACK OF EFFICACY).
Table 9 below lists the SAS datasets after categorizing the observations in the given example. Group b holds all continuing records of the patients who have at least one missing observation while still in the study; group a holds the continuing records of all other patients. Each dataset contains the variables USUBJID, VISITNUM, RESULTN, DSDY and DSDECOD, corresponding respectively to the subject identifier, visit
number, measurement value during the visit, discontinuation visit number and patient discontinuation reason (or patient
status in the case of a COMPLETER).
Table 9 SAS datasets after step 1
Group a: Continuing patients with non-missing observations
USUBJID  VISITNUM  RESULTN  DSDY  DSDECOD
101      0           108.0  .     COMPLETER
101      1           114.0  .     COMPLETER
101      2            90.0  .     COMPLETER
101      3            84.0  .     COMPLETER
101      4            84.0  .     COMPLETER
102      0            90.0  .     COMPLETER
102      1            90.0  .     COMPLETER
102      2            90.0  .     COMPLETER
102      3            90.0  .     COMPLETER
102      4            90.0  .     COMPLETER
103      0           114.0  .     COMPLETER
103      1            67.5  .     COMPLETER
103      2            60.0  .     COMPLETER
103      3            60.0  .     COMPLETER
103      4            60.0  .     COMPLETER
105      0            54.0  0     WITHDRAWAL OF CONSENT
106      0           112.5  2     LACK OF EFFICACY
106      1            90.0  2     LACK OF EFFICACY
106      2            66.0  2     LACK OF EFFICACY
108      0           222.0  2     LOST TO FOLLOW-UP
108      1           162.8  2     LOST TO FOLLOW-UP
108      2            81.6  2     LOST TO FOLLOW-UP
109      0            21.0  2     ADVERSE EVENT
109      1            21.0  2     ADVERSE EVENT
109      2            24.0  2     ADVERSE EVENT
111      0             2.0  3     LACK OF EFFICACY
111      1            18.4  3     LACK OF EFFICACY
111      2            20.0  3     LACK OF EFFICACY
111      3           113.3  3     LACK OF EFFICACY
112      0            84.0  0     ADVERSE EVENT
113      0           114.0  3     ADVERSE EVENT
113      1            96.0  3     ADVERSE EVENT
113      2            48.0  3     ADVERSE EVENT
113      3            96.0  3     ADVERSE EVENT
114      0            90.0  1     LACK OF EFFICACY
114      1            78.0  1     LACK OF EFFICACY
Group b: Continuing patients with missing observations
USUBJID  VISITNUM  RESULTN  DSDY  DSDECOD
104      0            75.0  .     COMPLETER
104      1            40.0  .     COMPLETER
104      2            60.0  .     COMPLETER
104      3            27.0  .     COMPLETER
104      4               .  .     COMPLETER
107      0           120.0  .     COMPLETER
107      1           117.0  .     COMPLETER
107      2               .  .     COMPLETER
107      3               .  .     COMPLETER
107      4            73.8  .     COMPLETER
110      0            48.0  3     WITHDRAWAL OF CONSENT
110      1             6.0  3     WITHDRAWAL OF CONSENT
110      2               .  3     WITHDRAWAL OF CONSENT
110      3            24.0  3     WITHDRAWAL OF CONSENT
Group c: Missing information due to reason 1
USUBJID  VISITNUM  RESULTN  DSDY  DSDECOD
105      1               .  0     WITHDRAWAL OF CONSENT
105      2               .  0     WITHDRAWAL OF CONSENT
105      3               .  0     WITHDRAWAL OF CONSENT
105      4               .  0     WITHDRAWAL OF CONSENT
108      3               .  2     LOST TO FOLLOW-UP
108      4               .  2     LOST TO FOLLOW-UP
110      4               .  3     WITHDRAWAL OF CONSENT
Group d: Missing information due to reason 2
USUBJID  VISITNUM  RESULTN  DSDY  DSDECOD
109      3               .  2     ADVERSE EVENT
109      4               .  2     ADVERSE EVENT
112      1               .  0     ADVERSE EVENT
112      2               .  0     ADVERSE EVENT
112      3               .  0     ADVERSE EVENT
112      4               .  0     ADVERSE EVENT
113      4               .  3     ADVERSE EVENT
Group e: Missing information due to reason 3
USUBJID  VISITNUM  RESULTN  DSDY  DSDECOD
106      3               .  2     LACK OF EFFICACY
106      4               .  2     LACK OF EFFICACY
111      4               .  3     LACK OF EFFICACY
114      2               .  1     LACK OF EFFICACY
114      3               .  1     LACK OF EFFICACY
114      4               .  1     LACK OF EFFICACY
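The paper does not list the code for step 1, so the following is only a minimal sketch of one way the five datasets in Table 9 might be produced. It assumes a hypothetical input dataset alldata with one record per patient per scheduled visit (Visit 0 to 4, including records with missing RESULTN), and it assumes that DSDY is missing for completers, as shown in Table 9. The names alldata, _flag, CONTINUING and ANYMISS are illustrative only.

```sas
/* Flag each record as continuing or not, and flag each patient who  */
/* has at least one missing value while still continuing.            */
/* PROC SQL remerges the per-patient MAX back onto every record.     */
proc sql;
   create table _flag as
   select *, max(CONTINUING and missing(RESULTN)) as ANYMISS
   from (select *,
                (DSDECOD = 'COMPLETER' or DSDY >= VISITNUM) as CONTINUING
         from alldata)
   group by USUBJID;
quit;

/* Route every record to exactly one of the five groups */
data group_a group_b group_c group_d group_e;
   set _flag;
   if CONTINUING then do;
      if ANYMISS then output group_b;   /* continuing, patient has gaps */
      else output group_a;              /* continuing, no gaps          */
   end;
   else if DSDECOD in ('WITHDRAWAL OF CONSENT', 'LOST TO FOLLOW-UP')
      then output group_c;
   else if DSDECOD = 'ADVERSE EVENT'    then output group_d;
   else if DSDECOD = 'LACK OF EFFICACY' then output group_e;
   drop CONTINUING ANYMISS;
run;
```

The comparison DSDY >= VISITNUM is false when DSDY is missing, which is why the COMPLETER check is written first. The split between groups a and b is made per patient rather than per record, so that group b keeps the non-missing records needed for LOCF in step 2.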
STEP 2: LOCF IMPUTATION FOR CONTINUING PATIENTS WITH MISSING VALUES
For the continuing patients with missing observations (group b dataset from step 1), a Last Observation Carried Forward (LOCF) approach is applied to impute the missing data. Below is the SAS implementation of LOCF.
%macro LOCF(inds=        /* Input dataset name */
           ,outds=       /* Output dataset name */
           ,SubjidVar=   /* Patient identifier */
           ,TimeVar=     /* Time point variable */
           ,Var=         /* Name of the variable to be carried forward */
           ,CrfwdVar=    /* Name of the variable for the carried-forward values */
           );
   /* Order the records by patient and time point */
   proc sort data=&inds out=&outds;
      by &SubjidVar &TimeVar;
   run;
   data &outds;
      set &outds;
      by &SubjidVar &TimeVar;
      retain &CrfwdVar;
      /* Reset at the first record of each patient */
      if first.&SubjidVar then &CrfwdVar=.;
      /* Update the retained value whenever the current value is not missing */
      if &Var >.z then &CrfwdVar=&Var;
   run;
%mend LOCF;
In this example, the macro can be invoked in the following way.
%locf(
inds=group_b
,outds=group_bLOCF
,SubjidVar=USUBJID
,TimeVar=VISITNUM
,Var=RESULTN
,CrfwdVar=RESULTN_LOCF
);
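As a quick check of the macro, the group b records of patient 107 from Table 9 can be run through it. This is only an illustrative sketch: the dataset names demo_b and demo_bLOCF are hypothetical, and the LOCF macro above must already have been compiled in the session.

```sas
/* Group b records for patient 107, copied from Table 9 */
data demo_b;
   input USUBJID VISITNUM RESULTN;
   datalines;
107 0 120.0
107 1 117.0
107 2 .
107 3 .
107 4 73.8
;
run;

%locf(inds=demo_b
     ,outds=demo_bLOCF
     ,SubjidVar=USUBJID
     ,TimeVar=VISITNUM
     ,Var=RESULTN
     ,CrfwdVar=RESULTN_LOCF
     );

/* Compare the observed and carried-forward columns side by side */
proc print data=demo_bLOCF noobs;
   var USUBJID VISITNUM RESULTN RESULTN_LOCF;
run;
```

RESULTN_LOCF should read 120.0, 117.0, 117.0, 117.0 and 73.8 for Visits 0 to 4, matching the row for patient 107 in Table 3. RESULTN itself is left untouched, so observed and imputed values remain distinguishable.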
STEP 3: RANK BASED ON OBSERVATIONS FOR CONTINUING PATIENTS
Patients from group a and group b (with LOCF) are ranked based on the values of assessment (observed or imputed) using
the SAS procedure RANK.
The RANK procedure computes ranks for one or more numeric variables across all observations within each BY group
and outputs the ranks to a new SAS data set. The following program shows how to rank the values of RANKVAR at
each visit (as indicated by the ‘by VISITNUM’ statement) using a simple PROC RANK step. As with any BY-group processing, the input data set must already be sorted by VISITNUM. In this example, RANKVAR
refers to the baseline assessment at Visit 0, and to the change from baseline at the follow-up visits, Visit 1 to Visit 4. The ranking results within each visit are stored in the variable WR.
proc RANK data=group_a_bLOCF out=rk_group_a_bLOCF;
by VISITNUM;
var RANKVAR;
ranks WR;
run;
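The PROC RANK step above takes a dataset group_a_bLOCF that already contains RANKVAR. The paper does not show how it is built, so the following is only a sketch of one possibility, assuming the group_a dataset from step 1 and the group_bLOCF dataset from step 2. The names _ab and _BASE are hypothetical helper names.

```sas
/* Stack groups a and b; for group a the observed value is used as is */
data _ab;
   set group_a(in=ina) group_bLOCF;
   if ina then RESULTN_LOCF = RESULTN;
run;

proc sort data=_ab;
   by USUBJID VISITNUM;
run;

/* RANKVAR = baseline value at Visit 0, change from baseline afterwards */
data group_a_bLOCF;
   set _ab;
   by USUBJID VISITNUM;
   retain _BASE;
   if first.USUBJID then _BASE = .;
   if VISITNUM = 0 then do;
      _BASE   = RESULTN_LOCF;
      RANKVAR = RESULTN_LOCF;
   end;
   else RANKVAR = RESULTN_LOCF - _BASE;
   drop _BASE;
run;

/* PROC RANK with a BY statement needs the data sorted by VISITNUM */
proc sort data=group_a_bLOCF;
   by VISITNUM;
run;
```

This sketch relies on every patient having a Visit 0 record, which holds in the example data; a patient without a baseline value would get a missing RANKVAR at the follow-up visits.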
By default, PROC RANK assigns rank number 1 to the smallest variable value. To reverse the order, you may specify the
option DESCENDING so that the largest value corresponds to rank number 1.
The procedure also allows flexibility in ranking tied values. The options include ‘TIES=HIGH | LOW | MEAN’ where
‘TIES=HIGH’ assigns the largest of the corresponding ranks to all the tied records;
‘TIES=LOW’ assigns the smallest of the corresponding ranks to all the tied records;
and ‘TIES=MEAN’ is the default and assigns the mean of the corresponding ranks to all the tied records.
In an example dataset with only one variable and 4 observations of values 4, 5, 5 and 6, the default PROC RANK assigns
ranks 1, 2.5, 2.5 and 4 to these 4 observations respectively. If option TIES=HIGH is specified, the procedure calculates the ranks
as 1, 3, 3 and 4. When TIES=LOW, the observed values are ranked as 1, 2, 2 and 4.
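The four-observation example above can be reproduced with the short program below. The dataset and variable names (ties_demo, X, R_MEAN, R_HIGH, R_LOW) are illustrative only.

```sas
/* One variable, four observations: 4, 5, 5, 6 */
data ties_demo;
   input X @@;
   datalines;
4 5 5 6
;
run;

/* Default tie handling (TIES=MEAN) */
proc rank data=ties_demo out=ties_mean;
   var X;
   ranks R_MEAN;
run;

/* Tied values receive the largest of their ranks */
proc rank data=ties_demo out=ties_high ties=high;
   var X;
   ranks R_HIGH;
run;

/* Tied values receive the smallest of their ranks */
proc rank data=ties_demo out=ties_low ties=low;
   var X;
   ranks R_LOW;
run;
```

The tied ranks shown in Tables 2, 5 and 7 (for example, rank 8 for both patients with a baseline value of 90.0) appear consistent with TIES=HIGH, so that option would need to be added to the PROC RANK step of step 3 to reproduce those tables exactly.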
STEP 4: RANK BASED ON REASON AND TIME FOR DROP-OUT PATIENTS
At each visit, the patients who have stopped treatment (groups c, d and e from step 1) are ranked after those still in the study
(groups a and b). Based on the treatment effect (as implied by the discontinuation reason), patients with a more serious dropout reason are ordered towards the bottom. Therefore, in this example, group e patients are ranked worse than group d,
and group d worse than group c. Within each group, patients are ranked based on their drop-out time: the earlier they
drop out, the worse they are ranked.
The following APPENDRANK macro is defined to rank patients in a discontinuation group (c, d or e) and append their ranks
to the previous group.
%macro appendrank(
    processds=     /* Dataset to be ranked */
   ,previousds=    /* Dataset to append to */
   ,byvar=         /* By variable */
   ,var=           /* Name of the variable to be ranked */
   ,rankvar=       /* Name of the variable for ranking results */
   ,outds=         /* Output dataset */
   );
   proc sort data=&processds out=_ds1;
      by &byvar;
   run;
   %**** Rank based on drop-out time (latest drop-out gets rank 1) **;
   proc rank data=_ds1 out=_ds2 DESCENDING;
      by &byvar;
      var &var;
      ranks &rankvar;
   run;
   %**** get the largest rank in previous data;
   proc SQL;
      create table _rkmax as
      select &byvar, MAX(&rankvar) as _MAX
      from &previousds
      group by &byvar;
      create table _ds3 as
      select a.*, _MAX
      from _ds2 as a, _rkmax as b
      where a.&byvar=b.&byvar ;
   quit;
   %**** Shift the new ranks so that they follow the previous group;
   data _ds4;
      set _ds3;
      &rankvar=&rankvar+_MAX;
      drop _MAX;
   run;
   data &outds;
      set &previousds _ds4;
   run;
   proc sort data=&outds;
      by &byvar &rankvar;
   run;
   %**** Delete temporary datasets created within macro;
   proc datasets library=work memtype=data nolist ;
      delete _ds1 _ds2 _ds3 _ds4 _rkmax;
   quit ;
%mend appendrank;
To rank group c patients and consolidate the ranking with the previous group a and b patients (specified by previousds=rk_group_a_bLOCF), simply call the macro with the following parameters:
%appendrank(
    processds  = group_c
   ,previousds = rk_group_a_bLOCF
   ,byvar      = visitnum
   ,var        = dsdy
   ,rankvar    = WR
   ,outds      = rk_group_a_bLOCF_c
   );
The output dataset rk_group_a_bLOCF_c contains the ranking for all group a, b and c patients using worst-rank imputation.
It can then be passed into the appendrank macro again as previousds in order to rank and append group d. Similarly, the
final ranking is generated by a third call to the same macro for group e.
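The two remaining calls might look as follows. This is a sketch only: the output dataset names rk_group_a_bLOCF_c_d and rk_final are hypothetical, and the group_d and group_e datasets are assumed to be those produced in step 1.

```sas
/* Second call: rank group d and append it after groups a, b and c */
%appendrank(
    processds  = group_d
   ,previousds = rk_group_a_bLOCF_c
   ,byvar      = visitnum
   ,var        = dsdy
   ,rankvar    = WR
   ,outds      = rk_group_a_bLOCF_c_d
   );

/* Third call: rank group e and append it to obtain the final ranking */
%appendrank(
    processds  = group_e
   ,previousds = rk_group_a_bLOCF_c_d
   ,byvar      = visitnum
   ,var        = dsdy
   ,rankvar    = WR
   ,outds      = rk_final
   );
```

Because each call offsets the new ranks by the largest rank already present at that visit, the order of the calls (c, then d, then e) is what enforces the ordering of the reason groups.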
CONCLUSIONS
Clinical trials quite often require repeated measurements for the analysis; however, patients may drop out in the middle of the trial for various reasons. This paper described one possible solution, a worst-rank imputation approach, which
transforms the collected observations into ranks to “smooth” the extreme outliers, and accounts for missing information
based on the reason and time of dropout. A step-by-step explanation of its SAS coding was also provided.
REFERENCES:
Lachin, John M. Worst-Rank Score Analysis with Informatively Missing Observations in Clinical Trials. Controlled Clinical Trials, Volume 20, Issue 5, October 1999, Pages 408-422.
SAS Institute Inc. SAS Language Reference, Version 6, First Edition. Cary, NC, USA: SAS Institute Inc., 1990.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the review and candid feedback from Margaret Coughlin, Frederic Coppin, Kristel Vandormael and Cindy White.
TRADEMARKS
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc.
in the USA and other countries. ® indicates USA registration.
CONTACT INFORMATION
Qian Wang
Scientific Programming
Biostatistics and Research Decision Sciences
Merck Sharp & Dohme (Europe), Inc.
Clos du Lynx 5
B-1200 Brussels, Belgium
Eric Qi
Scientific Programming
Biostatistics and Research Decision Sciences
Merck & Co. (UG1D-88)
Upper Gwynedd, PA 19454-2505