Estimation of HIV Incidence Using Multiple Biomarkers

American Journal of Epidemiology
© The Author 2013. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of
Public Health. All rights reserved. For permissions, please e-mail: [email protected].
Vol. 177, No. 3
DOI: 10.1093/aje/kws436
Advance Access publication:
January 9, 2013
Practice of Epidemiology
Ron Brookmeyer*, Jacob Konikoff, Oliver Laeyendecker, and Susan H. Eshleman
* Correspondence to Dr. Ron Brookmeyer, Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles,
Los Angeles, CA 90095-1772 (e-mail: [email protected]).
Initially submitted July 25, 2012; accepted for publication October 31, 2012.
The incidence of human immunodeficiency virus (HIV) is the rate at which new HIV infections occur in populations. The development of accurate, practical, and cost-effective approaches to estimation of HIV incidence is a
priority among researchers in HIV surveillance because of limitations with existing methods. In this paper, we
develop methods for estimating HIV incidence rates using multiple biomarkers in biological samples collected
from a cross-sectional survey. An advantage of the method is that it does not require longitudinal follow-up of
individuals. We use assays for BED, avidity, viral load, and CD4 cell count data from clade B samples collected
in several US epidemiologic cohorts between 1987 and 2010. Considering issues of accuracy, cost, and implementation, we identify optimal multiassay algorithms for estimating incidence. We find that the multiple-biomarker
approach to cross-sectional HIV incidence estimation corrects the significant deficiencies of currently available
approaches and is a potentially powerful and practical tool for HIV surveillance.
acquired immunodeficiency syndrome; algorithms; cross-sectional studies; HIV; incidence; models, statistical
Abbreviations: AIDS, acquired immunodeficiency syndrome; ALIVE, AIDS Link to Intravenous Experience; CI, confidence interval; HIV, human immunodeficiency virus; HIVNET, HIV Network for Prevention Trials; MACS, Multicenter AIDS Cohort Study;
NU, not used.
The incidence of human immunodeficiency virus (HIV)
is the rate at which new HIV infections occur in populations. While HIV prevalence measures overall disease
burden, incidence tracks the leading edge of the epidemic.
Estimates of incidence are important for monitoring the
growth of new infections, targeting prevention efforts, and
designing prevention trials. Accurate, practical, and cost-effective approaches for estimating incidence are a priority
among researchers in HIV surveillance. Unfortunately,
there is currently no single, widely accepted
method for estimating HIV incidence in populations.
The main reason estimation of HIV incidence is challenging is that the traditional approach for estimating
incidence, namely longitudinal follow-up of cohorts, presents issues for HIV surveillance (1). These issues include:
the difficulty of obtaining high follow-up rates in large representative samples of uninfected populations; the cost of
such studies; differences in HIV risk behaviors among
persons who do and do not participate in cohort studies;
and the fact that the counseling offered to persons in these
cohorts to reduce high-risk behaviors changes their HIV incidence rates—the quantity we are trying to measure.
An alternative approach to estimation of incidence, one
based on changes in HIV prevalence, has the advantage of
not requiring follow-up of cohorts, but it does depend on
critical assumptions about mortality and migration (2).
Such an approach has been shown to be sensitive to those
assumptions, and even if the assumptions are correct, prohibitively large sample sizes are required to obtain statistically stable estimates in many situations of interest (3).
A third approach for estimating HIV incidence rates—
the subject of this paper—utilizes biomarkers from biological samples collected in a single cross-sectional survey to
identify infections that occurred recently. The biomarker
approach requires only 1 cross-sectional survey and does
not require follow-up of cohorts. The approach was originally suggested using assays for HIV p24 antigen and HIV
antibody (4). Persons who are HIV antibody-negative and
Am J Epidemiol. 2013;177(3):264–272
antigen-positive are classified into a window period indicating recent (acute) HIV infection. Disadvantages of the biomarker approach based on detection of acute HIV infection
are that 1) it requires prohibitively large sample sizes to
obtain statistically reliable estimates because the duration of
the window period for acute (antibody-negative) HIV infection is short and 2) testing costs are high because all HIV
antibody-negative persons must be tested for antigen. Subsequently, a dual-antibody testing system was developed in
which participants are first tested with a standard HIV antibody assay; those who are positive are tested with a less
sensitive assay (5). Persons who are negative on the less
sensitive assay are said to be in the “window period.” The
less sensitive assay currently in widespread use for this
purpose is the BED capture enzyme immunoassay (6).
While the BED assay has been used throughout the
world (7, 8), investigators in the Joint United Nations Programme on HIV/AIDS have concluded that the BED assay
should not be used for estimating HIV incidence because of
concerns that the assay incorrectly labels long-standing infections as new incident infections (9). These concerns have
led to procedures to adjust incidence estimates obtained
with the BED assay (10, 11). However, those adjustment
procedures have been controversial, and several reports
have questioned their correctness (12, 13). Other procedures
include one that accounts for infected persons who permanently remain in the window period (14) and one that uses
an adjustment factor based on the misclassification rate
among persons infected for at least a specified duration of
time (15, 16). A caveat regarding adjustment factors, discussed by Hallett et al. (17), is that they may depend not
only on the operating characteristics of the assays but also
on the epidemic curve, specifically the durations of time
that persons in the population have been infected. This
means that adjustment factors derived from the same population at different calendar times may not be valid (17).
Research has been undertaken to combine data from cross-sectional and follow-up studies to obtain improved incidence estimates (18, 19). The science of cross-sectional
HIV incidence assays is reviewed elsewhere (20, 21).
While the biomarker approach has significant advantages,
its utility has been hampered by limitations of the main biomarker currently in use, namely the BED assay. To address
this, both the National Institutes of Health and the Bill and
Melinda Gates Foundation have launched initiatives to
improve biomarker approaches for HIV incidence estimation
by developing new biomarkers and refining existing ones (21).
Our objective here was to develop an approach for estimating
HIV incidence rates using multiple biomarkers from samples
collected cross-sectionally. The question was whether a
multiple-biomarker approach for HIV incidence estimation
could correct problems with the BED assay, be accurate and
cost-effective, not require additional external adjustment
factors, and still be easily implemented in many settings.
EPIDEMIOLOGIC MODEL UNDERLYING THE
BIOMARKER APPROACH
Suppose a cross-sectional survey is performed consisting
of N persons who have been tested for the presence of HIV
infection; suppose that m individuals are infected and n are
uninfected (m + n = N). Biological samples from each of the
m infected individuals are assayed for biomarkers associated with duration of infection. Suppose an algorithm (or
rule) uses biomarkers to assign a binary indicator Yi to each
of the m infected individuals. The indicator Yi is set equal
to 1 if the algorithm classifies the ith individual as “recently”
infected, in which case the person is said to be in the
“window period,” and set to 0 otherwise. For example, the
BED approach classifies HIV-infected persons as within
the window period (Y = 1) if the result of their BED assay
is less than a specified cutoff value (typically <0.8 normalized optical density).
The biomarker estimate of HIV incidence is derived
from the fundamental epidemiologic relationship that the
prevalence of a condition is equal to incidence multiplied
by the mean duration of the condition (22), where, here, the
condition refers to being in the window period—indicating
a recently occurring infection. If that epidemiologic relationship holds, the estimate of HIV incidence is
$$ I = \frac{W}{n\mu}, \qquad (1) $$
where W is the number of persons in the window period, that is, $W = \sum_{i=1}^{m} Y_i$, and μ is the mean duration of time
that an individual spends in the window period. Confidence
intervals for incidence that account for uncertainty in μ
have been developed using analytic (23) and Monte Carlo
(24) approaches.
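As a minimal sketch of equation 1, the following Python function computes the point estimate from the survey counts. The function name, the example counts, and the conversion to an annual rate (dividing μ in days by 365) are illustrative assumptions and are not taken from the original analysis.

```python
def incidence_estimate(window_count, uninfected_count, mean_window_days):
    """Equation 1: I = W / (n * mu), returned per person-year.

    window_count      -- W, infected persons classified as recent (Y = 1)
    uninfected_count  -- n, HIV-uninfected persons in the survey
    mean_window_days  -- mu, mean total time in the window period (days)
    """
    if uninfected_count <= 0 or mean_window_days <= 0:
        raise ValueError("n and mu must be positive")
    mu_years = mean_window_days / 365.0  # assumed conversion of days to years
    return window_count / (uninfected_count * mu_years)


# Hypothetical survey: 20 recent infections among 4,000 uninfected persons,
# using a 159-day mean window duration.
example_rate = incidence_estimate(20, 4000, 159)
```

The sketch returns only a point estimate; it does not propagate uncertainty in μ, which the analytic and Monte Carlo confidence interval methods cited above are designed to handle.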
The epidemiologic relationship, that prevalence is equal
to incidence multiplied by mean duration, depends on 2
key assumptions. The first assumption is that the incidence
rate is constant for a period stretching back in time as long
as the largest observable window periods. However, the assumption is not valid for the BED assay, because some
persons can remain in the window period (below the BED
cutoff) for many years (25). The second assumption is that
once people exit the window period, they never reenter the
window period again. That assumption also does not
appear valid for the BED assay, because BED levels may
decrease in late-stage HIV infection when the immune
system collapses or after patients begin antiretroviral
therapy and become virally suppressed (25).
If assumptions underlying the epidemiologic relationship
(prevalence equals incidence times mean duration) are violated, there are important caveats with equation 1. If the
first assumption of constant incidence is violated, equation
1 is not estimating the current incidence at the time of
sample collection but rather is estimating the incidence
occurring in the population sometime in the past. How far
back in the past is a question answered by the concept of
the shadow (26). The term “shadow” is used because the
cross-sectional survey is casting a shadow back in time. We
use ψ to denote the shadow. Equation 1 is approximately
estimating the HIV incidence rate ψ days before collection
of the samples (26, 27). The bigger the shadow, the less
current is the incidence estimate. If participants make return
visits to the window period, then μ in equation 1 should be
the average total amount of time people spend in the
window from all visits (26, 27).
The shadow can be thought of in the following way.
Among persons who are in the window period at the time
of the survey (i.e., prevalent window period cases), the
shadow is the average duration of time such persons have
already spent in the window prior to the survey. The
shadow ψ is not the same as the mean μ because ψ is the
average duration prevalent window period cases spent in
the window prior to the survey, while μ is the average total
duration a cohort of incident infections will spend in the
window period; prevalent window period cases are not
similar to cohorts of incident infections because of the
selection bias known as length-biased sampling.
Here, we give the mathematical definition of the shadow
ψ and the mean total window duration μ (26, 27). Let φ(t)
represent the probability that an individual who has been
infected for exactly t days is in the window period, that is,
φ(t) = P(Y = 1|infected for t days). The expected total duration of time an individual is in the window period is
μ = ∫φ(t)dt. The shadow is ψ = ∫t(φ(t)/μ)dt. In brief, the derivation of the shadow follows from the probability density
for the duration of time a prevalent window period case has
already spent in the window period (26, 27). That probability density is given by φ(t)/μ and is referred to as the backward recurrence time density in the stochastic processes
literature (28). The backward recurrence time density has
important applications in epidemiology and the theory of
disease screening (22, 29, 30).
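The two integrals defined above can be approximated numerically once φ(t) is available. The sketch below uses a simple trapezoidal rule; the function names, the grid size, and the exponential form of φ(t) are assumptions chosen for illustration only and do not represent a fitted curve.

```python
import math


def trapezoid(ys, xs):
    """Trapezoidal rule for samples ys at increasing grid points xs."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def window_mean_and_shadow(phi, t_max, steps=100_000):
    """Return (mu, psi) with mu = ∫ phi(t) dt and psi = ∫ t * phi(t) / mu dt.

    Both integrals run over [0, t_max]; phi is treated as 0 beyond t_max.
    """
    ts = [t_max * i / steps for i in range(steps + 1)]
    phis = [phi(t) for t in ts]
    mu = trapezoid(phis, ts)
    psi = trapezoid([t * p for t, p in zip(ts, phis)], ts) / mu
    return mu, psi


# Hypothetical phi(t): exponential decay with a 150-day scale (not fitted to data).
mu, psi = window_mean_and_shadow(lambda t: math.exp(-t / 150.0), t_max=8 * 365)
```

For this exponential form, both μ and ψ come out close to 150 days. A φ(t) with a long right tail, such as one in which some persons remain below the cutoff for years, raises ψ relative to μ.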
MATERIALS AND METHODS
Data sources
We used data from 1,782 HIV-positive biological
samples collected from persons enrolled in one of 3 major
epidemiologic cohort studies in the United States (31): the
HIV Network for Prevention Trials (HIVNET) 001 Study
(32), an HIV vaccine preparedness cohort study conducted
primarily among men who have sex with men (808
samples from 103 individuals; data collected during 1995–
1999); the AIDS Link to Intravenous Experience (ALIVE)
Study (33), a cohort study of injection drug users (410
samples from 241 individuals; data collected during 1990–
2009); and the Multicenter AIDS Cohort Study (MACS)
(34), a cohort study of men who have sex with men (564
samples from 365 individuals; data collected during 1987–
2009). Each individual who contributed a sample to the
data set had a documented negative HIV antibody test
within 18 months of a subsequent positive test. The median
numbers of days between the last negative and first positive
HIV tests were 180, 182, and 187 for the HIVNET 001,
MACS, and ALIVE samples, respectively. We also had a
second independent data set of samples from patients with
advanced HIV disease who had been participating in the
Johns Hopkins HIV Clinical Practice Cohort (35) for at
least 8 years (500 samples from 379 individuals; data collected during 2002–2010); these patients did not have a
documented prior negative HIV antibody test but had been
followed for 8–24 years (median, 12.6 years). We used this
second data set to confirm some findings from the analysis
of our primary data set of 1,782 samples. In both primary
and confirmatory data sets (31, 36), all samples were
assayed for the 4 biomarkers described below.
Biomarkers
The BED capture enzyme immunoassay measures the
proportion of immunoglobulin G that is specific to HIV
antigen (6). BED levels generally increase with duration of
infection, although there are exceptions: BED levels may
decrease in later stages of infection when the immune
system collapses or after patients begin antiretroviral
therapy; and levels among elite virus controllers may stay
low for long periods of time (25). BED levels are measured
in units of normalized optical density. Some investigators
suggest using an assay cutoff of less than 0.8 normalized
optical density to identify persons with recent HIV infection (37).
The avidity assay measures the strength with which antibodies bind to target antigens (38). Antibodies produced
shortly after HIV infection bind more weakly to antigen
than those produced later. An avidity index is calculated as
the percentage of antigen-binding of chaotropic-treated antibody relative to the antigen-binding of nontreated antibody. Avidity levels generally increase with duration of
infection.
The CD4 T lymphocyte (CD4 cell) is the target cell of
HIV; CD4 cell counts generally fall with increasing duration of infection. Low CD4 cell counts usually indicate
late-stage HIV infection (39).
HIV viral load measures the amount of virus in plasma.
By the time an HIV-infected person tests positive for HIV
antibodies, viral load levels have usually increased and
remain high throughout the disease, unless the patient is on
antiretroviral therapy or is an elite controller (40). For our
study, the limit of detection of the viral load assays was
400 copies/mL or lower.
Statistical methods
We developed algorithms using multiple biomarkers to
classify samples as either in the window period indicating
recent infection (Y = 1) or not (Y = 0). We call these multiassay algorithms. We considered algorithms that assign
Y = 1 to a sample if the sample satisfies each of the following criteria: BED < CB; avidity < CA; viral load > CVL; and
CD4 > CCD4, where CB, CA, CVL, and CCD4 are the assay
cutoffs. We generated 11,340 algorithms (14 × 9 × 9 × 10) by considering all combinations of the following cutoffs: 14 BED
(CB) cutoffs—0.6, 0.7, …, 1.8 normalized optical density
and NU, where NU means the assay was not used; 9
avidity (CA) cutoffs—30%, 40%, 60%, 80%, 85%, 90%,
95%, 100%, and NU; 9 CD4 cell count (CCD4) cutoffs—
50, 100, 200, 250, 300, 350, 400, and 500 cells/mm3 and
NU; and 10 viral load (CVL) cutoffs—400, 600, 800,
1,000, 1,500, 2,000, 3,000, 5,000, and 10,000 copies/mL
and NU.
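The search grid described above can be enumerated directly. The following sketch builds every combination of cutoffs and applies one algorithm to a sample; the dictionary keys (`bed`, `avidity`, `cd4`, `vl`), the use of `None` for NU, and the function name are hypothetical choices made for this illustration.

```python
from itertools import product

NU = None  # "not used": the assay plays no part in the algorithm

BED_CUTOFFS = [round(0.6 + 0.1 * k, 1) for k in range(13)] + [NU]      # 0.6 ... 1.8
AVIDITY_CUTOFFS = [30, 40, 60, 80, 85, 90, 95, 100, NU]                # percent
CD4_CUTOFFS = [50, 100, 200, 250, 300, 350, 400, 500, NU]              # cells/mm3
VL_CUTOFFS = [400, 600, 800, 1000, 1500, 2000, 3000, 5000, 10000, NU]  # copies/mL

ALGORITHMS = list(product(BED_CUTOFFS, AVIDITY_CUTOFFS, CD4_CUTOFFS, VL_CUTOFFS))


def classify(sample, algorithm):
    """Return 1 if the sample meets every criterion that is in use, else 0."""
    c_bed, c_avidity, c_cd4, c_vl = algorithm
    checks = [
        c_bed is NU or sample["bed"] < c_bed,
        c_avidity is NU or sample["avidity"] < c_avidity,
        c_cd4 is NU or sample["cd4"] > c_cd4,
        c_vl is NU or sample["vl"] > c_vl,
    ]
    return int(all(checks))
```

The grid contains 14 × 9 × 9 × 10 combinations, including the degenerate one in which all four assays are NU, which would classify every sample as Y = 1.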
We determined Y for each of the 1,782 biological
samples from the 3 cohort studies using each algorithm.
We used cubic splines with a knot at 2 years to model φ(t),
$$ \log\left(\frac{\varphi(t)}{1 - \varphi(t)}\right) = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 t^3 + \beta_4 (t - 2)_+^3 . $$
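To make the model form concrete, the sketch below evaluates φ(t) for given coefficients, treating the logit of φ(t) as a cubic polynomial in t plus a truncated cubic term that is zero up to the knot. The function name and the example coefficients are hypothetical; they are not the fitted values from this study, and no fitting step is shown.

```python
import math


def phi_spline(t_years, betas, knot=2.0):
    """Evaluate phi(t) from the logistic cubic-spline model.

    betas = (b0, b1, b2, b3, b4); the last coefficient multiplies the
    truncated cubic term (t - knot)^3, which is 0 for t <= knot.
    """
    b0, b1, b2, b3, b4 = betas
    truncated = max(t_years - knot, 0.0) ** 3
    logit = b0 + b1 * t_years + b2 * t_years**2 + b3 * t_years**3 + b4 * truncated
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical coefficients chosen only to illustrate the shape, not fitted values.
example_betas = (1.0, -4.0, 0.5, 0.0, 0.0)
phi_at_one_year = phi_spline(1.0, example_betas)
```

Because the logit is unconstrained, the resulting curve need not start at 1, need not be monotone, and need not tend to 0.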
We chose spline models because they are flexible and do
not impose strong parametric assumptions. The spline
models allow φ(t) to increase or decrease and thus can
account for persons who may reenter the window period
after exiting. In contrast, survival models, such as the
Weibull model, are decreasing functions and cannot
account for persons who reenter the window period.
Splines, unlike survival models, do not impose the constraint that φ(t) is
1 when t = 0. In addition, splines are not forced to converge
to 0 with increasing duration. We varied placement of the
knot between 1 and 3 years and found that our results were
not sensitive to knot placement. Logistic regression was
used to estimate model parameters with a working independence assumption, and, as described below, bootstrapped
confidence intervals accounted for multiple samples contributed by the same individuals. If the sample from a person's last negative HIV antibody
test was found to be HIV RNA-positive, the
HIV seroconversion date was estimated as 2 weeks after that test (41).
Otherwise, seroconversion dates were estimated by sampling seroconversion times from uniform distributions in
the interval between the last negative and first positive antibody tests. This approach assumes that testing frequency is
unrelated to risks of infection for persons in the cohorts.
Related approaches for estimating the distribution of
window periods are discussed by Sweeting et al. (42). We
estimated φ(t) using imputed seroconversion dates, repeated
that imputation 10 times, and averaged the estimates using
multiple imputation methods. We retained only algorithms
that satisfied the condition φ(t) < 0.001 for t = 8 years. We
then confirmed that φ(t) converged to 0 by using our confirmatory data set from the Johns Hopkins HIV Clinical Practice Cohort; we retained only algorithms that classified each
of the 500 samples from the Hopkins cohort (infected >8
years) as Y = 0. For those retained algorithms, we calculated
µ and ψ by numerical integration assuming φ(t) = 0.0 for
t > 8 years. We obtained bootstrap confidence intervals for µ and ψ
by resampling individuals rather than samples (blocking on individual), so that all biological samples
from the same individual were included together in each bootstrapped
sample, with resampling stratified by cohort study. Confidence intervals
were obtained by means of the percentile method using 500
bootstraps.
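The resampling scheme can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, argument names, and data layout are hypothetical, and `statistic` stands in for whatever is recomputed on each bootstrapped data set (for example, refitting φ(t) and integrating to obtain µ or ψ).

```python
import random


def block_bootstrap_ci(samples_by_person, cohort_of, statistic,
                       n_boot=500, level=0.95, seed=0):
    """Percentile CI, resampling individuals within each cohort.

    samples_by_person -- dict: person id -> list of that person's samples
    cohort_of         -- dict: person id -> cohort label (the stratum)
    statistic         -- function mapping a list of samples to a number
    """
    rng = random.Random(seed)
    strata = {}
    for person, cohort in cohort_of.items():
        strata.setdefault(cohort, []).append(person)

    estimates = []
    for _ in range(n_boot):
        resampled = []
        for people in strata.values():
            # Draw individuals with replacement; keep all of their samples together.
            for person in rng.choices(people, k=len(people)):
                resampled.extend(samples_by_person[person])
        estimates.append(statistic(resampled))

    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * (n_boot - 1))]
    upper = estimates[int(round((1.0 - alpha) * (n_boot - 1)))]
    return lower, upper
```

Resampling whole individuals keeps within-person correlation intact in every replicate, which is why a working independence assumption in the point estimate does not invalidate the interval.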
We considered issues of accuracy, cost, and implementation to identify optimal algorithms. We required algorithms
to have shadows less than a maximal acceptable value. The
National Institutes of Health has a goal of estimating HIV
incidence within the year preceding sample collection. We
required that acceptable algorithms have upper 95% confidence limits for the shadow of <1 year and estimates of
<250 days. Among acceptable algorithms, we identified the
one with the largest mean window period, µ, because that
one would have the smallest variance for the incidence
estimate.
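The selection rule amounts to a constrained maximization, sketched below. The candidate values are the estimates reported in Table 1 for the three multiassay algorithms listed there; the function name and dictionary layout are hypothetical, and the real search ran over the full set of retained algorithms rather than these three.

```python
# (mean window mu, shadow psi, upper 95% limit of psi), all in days, from Table 1.
CANDIDATES = {
    "algorithm 1": (159, 184, 225),
    "algorithm 2": (101, 194, 289),
    "algorithm 4": (123, 146, 190),
}


def select_algorithm(candidates, max_shadow=250, max_shadow_upper=365):
    """Among algorithms meeting both shadow limits, return the one with largest mu."""
    acceptable = {
        name: values for name, values in candidates.items()
        if values[1] < max_shadow and values[2] < max_shadow_upper
    }
    if not acceptable:
        return None
    return max(acceptable, key=lambda name: acceptable[name][0])


best = select_algorithm(CANDIDATES)
```

All three candidates satisfy the shadow limits here, so the choice is decided by the mean window duration alone.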
Figure 1. Mean window duration (µ) versus shadow (ψ) for
algorithms with a shadow of less than 1 year.
We identified the order of performing assays that would
minimize cost. We assumed that the costs of the BED,
avidity, CD4, and viral load assays were r, 2r, 5r, and 10r,
respectively, where r represents the unit cost of a BED
assay. We identified the order of performing the assays that
gave the lowest cost by permuting the assays. We compared
that cost with the cost of testing all samples with all 4
assays. Assays for CD4 cell count can only be conducted
on whole blood, not on stored serum. Because of the cost
and effort of cryopreserving whole blood for CD4 cell
counting, CD4 testing often must be performed close to the
time at which samples are collected, before other biomarkers have been assayed. Accordingly, we considered all sequential orders of performing the assays with the constraint
that the first assay performed was CD4 cell count. In some
settings, assaying for CD4 count may present such enormous logistical difficulties that only algorithms that do not
include CD4 count should be considered. Accordingly, we
also performed the entire analysis after excluding CD4
count as a candidate biomarker.
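The cost comparison can be sketched as a search over assay orders in which testing of a sample stops at the first criterion it fails. The relative costs are those stated above; the sample values, dictionary keys, and function names are hypothetical and serve only to show the mechanics.

```python
from itertools import permutations

# Relative assay costs from the text, in units of r (the cost of one BED assay).
COST = {"bed": 1, "avidity": 2, "cd4": 5, "vl": 10}


def sequential_cost(samples, criteria, order):
    """Total cost when each sample is tested in `order` until a criterion fails."""
    total = 0
    for sample in samples:
        for assay in order:
            total += COST[assay]
            if not criteria[assay](sample[assay]):
                break  # sample is classified Y = 0; remaining assays are skipped
    return total


def cheapest_order(samples, criteria, first=None):
    """Search all assay orders, optionally forcing one assay to be run first."""
    orders = [o for o in permutations(criteria) if first is None or o[0] == first]
    return min(orders, key=lambda o: sequential_cost(samples, criteria, o))


# Hypothetical samples and a 3-biomarker rule (BED <1.5, avidity <40, VL >400).
criteria = {
    "bed": lambda x: x < 1.5,
    "avidity": lambda x: x < 40,
    "vl": lambda x: x > 400,
}
samples = [
    {"bed": 2.0, "avidity": 90, "vl": 50000},
    {"bed": 1.0, "avidity": 30, "vl": 5000},
    {"bed": 1.2, "avidity": 80, "vl": 100},
]
cost_sequential = sequential_cost(samples, criteria, ("bed", "avidity", "vl"))
cost_all_assays = len(samples) * sum(COST[a] for a in criteria)
```

Passing `first="cd4"` to `cheapest_order` (with a CD4 criterion included) would correspond to the constraint that CD4 testing be performed first. Savings arise because inexpensive assays placed early exclude most samples before the costly assays are run.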
RESULTS
Figure 1 shows the mean window duration versus the
shadow for algorithms with shadows of less than 1 year.
The figure shows that algorithms with large shadows tend
to have large mean window periods and thereby illustrates
the classic statistical tradeoff between bias and variance:
We desire small bias (small shadows) but also small variance (large mean window durations). We identified the
algorithm with the largest mean window subject to the constraint that the upper 95% confidence limit of the shadow
be less than 1 year. That algorithm is illustrated schematically in part A of Figure 2. The algorithm had cutoffs for
BED, avidity, viral load, and CD4 of 1.6, 85, 400, and 50,
respectively, and estimates of the mean window duration
(μ) and shadow (ψ) of 159 days and 184 days, respectively.
The assay order that minimized costs, subject to the
Figure 2. Top-ranked algorithms for estimating human immunodeficiency virus (HIV) incidence using multiple biomarkers. In part A, the
algorithm is based on 4 biomarkers (CD4 cell count, BED, avidity, and viral load); in part B, it is based on 3 biomarkers (BED, avidity, and viral
load).
constraint that the CD4 assay was performed first, was
CD4, BED, avidity, and viral load. The cost of that algorithm was 44% of the cost of testing all samples with all 4
biomarkers.
We also searched through algorithms that used only 3
biomarkers (BED, avidity, and viral load). Such algorithms
are of practical interest because of the logistical difficulties
involved in obtaining CD4 cell measurements in many settings. The optimal 3-biomarker algorithm is illustrated
schematically in part B of Figure 2. The algorithm had
cutoffs for BED, avidity, and viral load of 1.5, 40, and 400,
respectively. The algorithm had a mean window duration
(μ) of 101 days and a shadow (ψ) of 194 days. The assay
order that minimized costs was BED, avidity, and viral
load. The assay costs of that algorithm were only 0.13
times the cost of testing all samples with all 4 assays. An
advantage of the part A algorithm is that its mean window
duration is 58 days longer than that of the part B algorithm, implying that the part B algorithm requires a cross-sectional
survey sample size approximately 57% larger (i.e., [(159 –
101)/101] × 100 = 57%) than the part A algorithm to have
the same incidence standard error. However, advantages of
the part B algorithm are that its assay costs are only about
one-third (0.13/0.44 = 0.30) those of the part A algorithm
and it is logistically easier to implement because CD4 cell
count is not a component of the algorithm. Algorithms with
higher viral load cutoffs (<800 copies/mL) performed similarly to those shown in parts A and B (results not shown).
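The sample size comparison follows from how the precision of equation 1 depends on μ. The sketch below makes the reasoning explicit under a simplifying assumption that is ours and not stated in the text: the number of persons in the window period is approximately Poisson, and uncertainty in μ is ignored. The function name is hypothetical.

```python
def relative_sample_size(mu_reference_days, mu_other_days):
    """Sample-size multiplier for the algorithm with window mu_other_days.

    Assumes the number classified as recent, W, is approximately Poisson with
    mean I * n * mu, so the relative standard error of equation 1 is about
    1 / sqrt(I * n * mu) and the required n scales as 1 / mu.
    """
    return mu_reference_days / mu_other_days


ratio = relative_sample_size(159, 101)     # part A versus part B
percent_larger = round((ratio - 1) * 100)  # -> 57
```

With mean window durations of 159 and 101 days, the ratio is about 1.57, which matches the roughly 57% larger survey needed for the 3-biomarker algorithm.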
Figure 3. φ(t) for the algorithms shown in part A of Figure 2 (curve 1)
and part B of Figure 2 (curve 2) and an algorithm using only the BED
assay with a cutoff of 0.8 normalized optical density (curve 3) versus
time since infection (i.e., seroconversion).
Our search space included algorithms based on a single
biomarker, but these did not satisfy the criterion that the
shadow’s upper 95% confidence limit be less than 1 year,
leading us to conclude that the multiple-biomarker approach produces more accurate incidence estimates than an
approach based on a single biomarker. Figure 3 shows φ(t)
for the part A (curve 1) and part B (curve 2) algorithms, as
well as the algorithm based solely on BED with the cutoff
of 0.8 (curve 3). The estimates of φ(t) at 1 year were: for
curve 1, φ(t) = 0.07 (95% confidence interval (CI): 0.03,
0.09); for curve 2, φ(t) = 0.02 (95% CI: 0.002, 0.04); and
for curve 3, φ(t) = 0.31 (95% CI: 0.22, 0.40). We found that
φ(t) converged to 0 well before 8 years for the multiassay
algorithms (curves 1 and 2) but remained high for curve
3 (φ(t) = 0.19) even at 8 years.
Table 1 shows how algorithms 1, 2, and 3 of Figure 3
classify samples by duration of infection. The last column
of Table 1 (algorithm 4) also shows an alternative multiassay algorithm (31, 43), where we have calculated the mean
window and shadow using the methods described in this
paper; that algorithm has a mean window duration 36 days
shorter than algorithm 1, with a shadow that is 38 days
shorter than that of algorithm 1. Of samples of persons
infected for less than 6 months, algorithm 1 classified more
samples into the window period than the other algorithms.
Algorithm 3 (BED alone) classified a large number of
samples of persons infected for more than 8 years into the
window period, while the multiassay algorithms did not
classify any sample from anyone infected for more than 5
years into the window period.
The algorithms we considered are sequential in that the decision to assay for an additional biomarker depends on the
results of previous assays. We also expanded our search of
algorithms by allowing samples to be assigned Y = 1 if they
satisfied criteria involving cutoffs that were combined using
Boolean operators (i.e., and; or). The algorithms in parts A
and B of Figure 2 did not change with this expanded search.
DISCUSSION
Our objective was to develop methods for estimating
HIV incidence using multiple biomarkers. An advantage of
this approach is that it only requires biological samples
from a single cross-sectional survey and does not require
follow-up of cohorts. We considered 3 issues in selecting
optimal algorithms: accuracy, cost, and implementation.
Table 1. Numbers of HIV-Positive Biological Samples Classified Into the Window Period as Recent Infections (Y = 1) by Each of 4 Algorithms, According to Duration of Infection^a

| Duration of Infection, years^c | No. of Samples | Algorithm^b 1 (CD4 >50, BED <1.6, Avidity <85, and VL >400) | Algorithm 2 (BED <1.5, Avidity <40, and VL >400) | Algorithm 3 (BED <0.8) | Algorithm 4 (CD4 >200, BED <1.0, Avidity <80, and VL >400) |
|---|---|---|---|---|---|
| 0.0–<0.5 | 142 | 83 | 55 | 80 | 68 |
| 0.5–<1.0 | 166 | 26 | 2 | 61 | 15 |
| 1.0–<2.0 | 263 | 4 | 3 | 65 | 2 |
| 2.0–<3.0 | 301 | 3 | 1 | 62 | 2 |
| 3.0–<4.0 | 440 | 5 | 3 | 64 | 2 |
| 4.0–<5.0 | 125 | 0 | 1 | 15 | 0 |
| 5.0–<8.0 | 333 | 0 | 0 | 44 | 0 |
| ≥8.0^d | 512 | 0 | 0 | 95 | 0 |
| Mean µ, days | | 159 | 101 | —^e | 123 |
| 95% CI | | 134, 186 | 79, 119 | | 99, 142 |
| Shadow ψ, days | | 184 | 194 | —^e | 146 |
| 95% CI | | 148, 225 | 109, 289 | | 117, 190 |

Abbreviations: ALIVE, AIDS Link to Intravenous Experience; CI, confidence interval; HIVNET, HIV Network for Prevention Trials; VL, viral load.
^a Samples were collected from persons enrolled in one of 3 major US epidemiologic cohort studies: the HIVNET 001 Study (32) (1995–1999), the ALIVE Study (33) (1990–2009), and the Multicenter AIDS Cohort Study (34) (1987–2009). Samples satisfying the listed criteria were assigned Y = 1.
^b Units: CD4 cell count, cells/mm3; BED, normalized optical density; avidity, %; VL, copies/mL.
^c Intervals include the left endpoint but not the right endpoint. Classification into time intervals was made by midpoint imputation.
^d Five hundred (out of 512) samples from persons in the Johns Hopkins HIV Clinical Practice Cohort (2002–2010) who were infected for more than 8 years.
^e The mean and shadow could not be calculated for the BED algorithm because the integrals did not converge. The shadow was greater than 3 years.
We find that a multiple-biomarker approach produces more
accurate incidence estimates than one based on any currently available single biomarker. We achieve this increase in
accuracy without significant increases in cost because most
samples are not tested for all biomarkers. Our approach
does not require external adjustment factors to estimate
incidence. We identified 4- and 3-assay algorithms with
mean window periods of 159 days and 101 days, respectively. All individuals eventually exit the window period with
these algorithms, thereby correcting a significant problem
with the BED assay. Incorporation of the CD4 count, viral
load, and avidity assays helps screen out persons who otherwise would be classified in the window period based solely
on the BED assay because of natural or antiretroviral drug-induced viral suppression, advanced HIV disease, or other
factors.
A limitation of our findings is that our data came from
persons in the United States who were probably infected
with HIV subtype B, which may not be generalizable to
other populations. Assay performance can be different in
persons who are infected with other HIV subtypes, are
from other countries, or have other characteristics such as
different rates of antiretroviral therapy. In our data, the percentages of samples from persons who reported that they
were on antiretroviral therapy were 42% in HIVNET 001,
50% in MACS, and 41% in ALIVE. In these cohorts, for
samples of persons infected for more than 3 years and
more than 5 years, 47% and 65% were on antiretroviral
therapy, respectively. Recent studies have suggested that
there is no significant association between antiretroviral
therapy and the performance characteristics of BED or
avidity assays after adjustment for viral load (44). These
results are consistent with a conceptual model that the anti-HIV antibody response is down-regulated when the level
of replicating virus is low, regardless of the cause of viral
suppression (45). These considerations suggest that multiassay algorithms that include viral load may not be affected
by antiretroviral therapy. In our samples from the Johns
Hopkins HIV Clinical Practice Cohort, 41% had viral
loads greater than 400 copies/mL. Nevertheless, caution
should be exercised in extrapolating our results to other
populations.
The biomarker approach requires collection of biological
samples from persons who are representative of the population, and effort should be taken to assess potential biases.
For example, samples from voluntary testing and counseling centers may lead to biases because persons recently infected may be more likely to come to such centers for HIV
testing (46). Nationally representative probability-based
HIV prevalence surveys, such as the Demographic and
Health Surveys, have been conducted in over 30 countries.
Incorporating multiassay algorithms into those surveys
could provide accurate and practical approaches for estimating national HIV incidence with a marginal increase in the
cost of these surveys; incorporating them into 2 serial HIV
prevalence surveys could allow direct estimation of
changes in national HIV incidence. The multiple-biomarker
approach offers an accurate, cost-effective, and practical
tool for HIV incidence estimation and global epidemic
surveillance.
ACKNOWLEDGMENTS
Author affiliations: Department of Biostatistics, Fielding
School of Public Health, University of California, Los
Angeles, Los Angeles, California (Ron Brookmeyer, Jacob
Konikoff); National Institute of Allergy and Infectious Diseases, Bethesda, Maryland (Oliver Laeyendecker); Department of Medicine, School of Medicine, Johns Hopkins
University, Baltimore, Maryland (Oliver Laeyendecker);
and Department of Pathology, School of Medicine, Johns
Hopkins University, Baltimore, Maryland (Susan H.
Eshleman).
This work was supported by National Institutes of
Health grant R01-AI095068 (R.B., J.K., S.E.); the Division
of Intramural Research, National Institute of Allergy and
Infectious Diseases (NIAID) (O.L.); and grant 1UM1AI068613 from the NIAID, the National Institute on Drug
Abuse (NIDA), the National Institute of Mental Health,
and the Office of AIDS Research, National Institutes of
Health (S.E.).
The HIV Network for Prevention Trials (HIVNET) 001
Study was funded by the HIVNET and sponsored by the
NIAID; the AIDS Link to Intravenous Experience (ALIVE)
Study was funded by the NIDA; and the Multicenter AIDS
Cohort Study (MACS) was funded by the NIAID, with additional supplemental funding from the National Cancer Institute and the National Heart, Lung, and Blood Institute.
The Johns Hopkins HIV Clinical Practice Cohort was
funded by the NIDA, the National Institute on Alcohol
Abuse and Alcoholism, and the NIAID.
The authors thank the following investigators for providing or generating the data analyzed in this study: Caroline
E. Mullis, Matthew M. Cousins, and Drs. Thomas C.
Quinn, Deborah Donnell, Connie Celum, Susan P.
Buchbinder, George R. Seage III, Lisa P. Jacobson, Joseph
B. Margolick, Joelle Brown, Gregory D. Kirk, Shruti H.
Mehta, Richard D. Moore, and Jeanne C. Keruly.
Conflict of interest: none declared.
REFERENCES
1. Brookmeyer R. Measuring the HIV/AIDS epidemic:
approaches and challenges. Epidemiol Rev. 2010;32(1):
26–37.
2. Hallett TB, Zaba B, Todd J, et al. Estimating incidence from
prevalence in generalised HIV epidemics: methods and
validation. PLoS Med. 2008;5(4):e80. (doi:10.1371/journal.pmed.0050080).
3. Brookmeyer R, Konikoff J. Statistical considerations in
determining HIV incidence from changes in HIV prevalence.
Stat Commun Infect Dis. 2011;3(1). (doi:10.2202/1948-4690.1044).
4. Brookmeyer R, Quinn TC. Estimation of current human
immunodeficiency virus incidence rates from a cross-sectional
survey using early diagnostic tests. Am J Epidemiol.
1995;141(2):166–172.
5. Janssen RS, Satten GA, Stramer S, et al. New testing strategy
to detect early HIV-1 infection for use in incidence estimates
and for clinical and prevention purposes. JAMA. 1998;
280(1):42–48.
Am J Epidemiol. 2013;177(3):264–272
6. Parekh BS, Kennedy MS, Dobbs T, et al. Quantitative
detection of increasing HIV type 1 antibodies after
seroconversion: a simple assay for detecting recent HIV
infection and estimating incidence. AIDS Res Hum
Retroviruses. 2002;18(4):295–307.
7. Mermin J, Musinguzi J, Opio A, et al. Risk factors for
recent HIV infection in Uganda. JAMA. 2008;300(5):
540–549.
8. Prejean J, Song R, Hernandez A, et al. Estimated HIV
incidence in the United States, 2006–2009. PLoS One.
2011;6:e17502. (doi:10.1371/journal.pone.0017502).
9. Epidemiology Reference Group Secretariat, Joint United
Nations Programme on HIV/AIDS. UNAIDS Reference
Group on Estimates, Modelling and Projections’ Statement
on the Use of the Bed-Assay for the Estimation of HIV-1
Incidence for Surveillance or Epidemic Monitoring. London,
United Kingdom: UNAIDS Epidemiology Secretariat,
Imperial College London; 2005. (www.epidem.org/Publications/BED%20statement.pdf.) (Accessed October
23, 2012).
10. McDougal JS, Parekh BS, Peterson ML, et al. Comparison
of HIV-1 incidence observed during longitudinal follow-up
with incidence estimated by cross-sectional analysis using the
BED capture enzyme immunoassay. AIDS Res Hum
Retroviruses. 2006;22(10):945–952.
11. Hargrove JW, Humphrey JH, Mutasa K, et al. Improved
HIV-1 incidence estimates using BED capture enzyme
immunoassay. AIDS. 2008;22(4):511–518.
12. Brookmeyer R. Should biomarker estimates of HIV incidence
be adjusted? AIDS. 2009;23(4):485–491.
13. Wang R, Lagakos SW. On the use of adjusted cross-sectional
estimators of HIV incidence. J Acquir Immune Defic Syndr.
2009;52(5):538–547.
14. Welte A, McWalter TA, Bärnighausen T. A simplified
formula for inferring HIV incidence from cross-sectional
surveys using tests for recent infection. AIDS Res Hum
Retroviruses. 2009;25(1):125–126.
15. Hargrove J, van Schalkwyk C, Eastwood H. BED estimates
of HIV incidence: resolving the differences, making things
simpler. PLoS One. 2012;7(1):e29736. (doi:10.1371/journal.pone.0029736).
16. Kassanjee R, McWalter TA, Bärnighausen T, et al. A new
general biomarker-based incidence estimator. Epidemiology.
2012;23(5):721–728.
17. Hallett T, Ghys P, Bärnighausen T, et al. Errors in “BED”-derived estimates of HIV incidence will vary by place, time
and age. PLoS One. 2009;4(5):e5720. (doi:10.1371/journal.pone.0005720).
18. Wang R, Lagakos SW. Augmented cross-sectional prevalence
testing for estimating HIV incidence. Biometrics. 2010;66(3):
864–874.
19. Claggett B, Lagakos SW, Wang R. Augmented cross-sectional
studies with abbreviated follow-up for estimating HIV
incidence. Biometrics. 2012;68(1):62–74.
20. Busch M, Pilcher C, Mastro T, et al. Beyond detuning: 10
years of progress and new challenges in the development and
application of assays for HIV incidence estimation. AIDS.
2010;24(18):2763–2771.
21. Incidence Assay Critical Path Working Group. More and
better information to tackle HIV epidemics: toward
improved HIV incidence assays. PLoS Med. 2011;8(6):
e1001045. (doi:10.1371/journal.pmed.1001045).
22. Freeman J, Hutchinson GB. Prevalence, incidence
and duration. Am J Epidemiol. 1980;112(5):707–723.
23. Brookmeyer R. Accounting for follow-up bias in estimation
of human immunodeficiency virus incidence rates. J R Stat
Soc Ser A. 1997;160(1):127–140.
24. Cole SR, Chu H, Brookmeyer R. Confidence intervals for
biomarker-based human immunodeficiency virus incidence
estimates and differences using prevalent data. Am J
Epidemiol. 2007;165(1):94–100.
25. Laeyendecker O, Brookmeyer R, Oliver AE, et al. Factors
associated with incorrect identification of recent HIV
infection using the BED capture immunoassay. AIDS Res
Hum Retroviruses. 2012;28(8):816–822.
26. Kaplan E, Brookmeyer R. Snapshot estimators of recent HIV
incidence rates. Oper Res. 1999;47(1):29–37.
27. Brookmeyer R. On the statistical accuracy of biomarker
assays for HIV incidence. J Acquir Immune Defic Syndr.
2010;54(4):406–414.
28. Cox DR. Renewal Theory. London, United Kingdom:
Methuen and Company; 1962.
29. Zelen M. Forward and backward recurrence times and length
biased sampling: age specific models. Lifetime Data Anal.
2004;10(4):325–334.
30. Zelen M, Feinleib M. On the theory of screening for chronic
diseases. Biometrika. 1969;56(3):601–614.
31. Laeyendecker O, Brookmeyer R, Cousins MM, et al. HIV
incidence determination in the United States: a multi-assay
approach [published online ahead of print November 5,
2012]. J Infect Dis. (doi:10.1093/infdis/jis659).
32. Celum CL, Buchbinder SP, Donnell D, et al. Early human
immunodeficiency virus (HIV) infection in the HIV Network
for Prevention Trials vaccine preparedness cohort: risk
behaviors, symptoms, and early plasma and genital tract virus
load. J Infect Dis. 2001;183(1):23–35.
33. Vlahov D, Anthony JC, Munoz A, et al. The ALIVE study,
a longitudinal study of HIV-1 infection in intravenous drug
users: description of methods and characteristics of
participants. NIDA Res Monogr. 1991;109:75–100.
34. Kaslow RA, Ostrow DG, Detels R, et al. The Multicenter
AIDS Cohort Study: rationale, organization, and selected
characteristics of the participants. Am J Epidemiol. 1987;
126(2):310–318.
35. Moore RD. Understanding the clinical and economic outcomes
of HIV therapy: the Johns Hopkins HIV Clinical Practice Cohort.
J Acquir Immune Defic Syndr. 1998;17(suppl 1):S38–S41.
36. Keating SM, Hanson D, Lebedeva M, et al. Less sensitive
and avidity modifications of the VITROS anti-HIV-1+2 assay
for detection of recent HIV infections and incidence
estimation [published online ahead of print October 3, 2012].
J Clin Microbiol. (doi:10.1128/JCM.01454-12).
37. Parekh BS, Hanson D, Hargrove J, et al. Determination of
mean recency for estimation of HIV type 1 incidence with the
BED-capture EIA in persons infected with diverse subtypes.
AIDS Res Hum Retroviruses. 2011;27(3):265–273.
38. Masciotra S, Dobbs T, Candal D, et al. Antibody avidity-based assay for identifying recent HIV-1 infections based on
Genetic Systems™ 1/2 plus O EIA [abstract]. Presented at the
17th Conference on Retroviruses and Opportunistic
Infections, San Francisco, California, February 16–19, 2010.
39. Mellors JW, Munoz A, Giorgi J, et al. Plasma viral load and
CD4+ lymphocytes as prognostic markers of HIV-1 infection.
Ann Intern Med. 1997;126(12):946–954.
40. Chu H, Gange SJ, Li X, et al. The effect of HAART on
HIV RNA trajectory among treatment-naïve men and women:
a segmental Bernoulli/lognormal random effects model with
left censoring. Epidemiology. 2010;21(suppl 4):S25–S34.
41. Fiebig EW, Wright DJ, Rawal BD, et al. Dynamics of HIV
viremia and antibody seroconversion in plasma donors:
implications for diagnosis and staging of primary HIV
infection. AIDS. 2003;17(13):1871–1879.
42. Sweeting M, De Angelis D, Parry J, et al. Estimating the
distribution of the window period for recent HIV infections:
a comparison of statistical methods. Stat Med. 2010;29(30):
3194–3202.
43. Eshleman SH, Hughes JP, Laeyendecker O, et al. Use of a
multi-faceted approach to assess HIV incidence in a cohort
study of women in the United States: HIV Prevention Trials
Network 064 Study [published online ahead of print
November 5, 2012]. J Infect Dis. (doi:10.1093/infdis/jis658).
44. Laeyendecker O, Brookmeyer R, Mullis C, et al. Specificity
of four laboratory approaches for cross-sectional HIV
incidence determination: analysis of samples from adults
with known non-recent HIV infection from five African
countries. AIDS Res Hum Retroviruses. 2012;28(10):
1177–1183.
45. Trkola A, Kuster H, Leeman C, et al. Humoral immunity to
HIV-1: kinetics of antibody responses in chronic infection
reflects capacity of immune system to improve viral set point.
Blood. 2004;104(6):1784–1792.
46. Remis RS, Palmer RWH. Testing bias in calculating HIV
incidence from serological testing algorithm for recent HIV
seroconversion. AIDS. 2009;23(4):493–503.