Mixture distribution analysis of phenotypic markers

From www.bloodjournal.org by guest on June 16, 2017. For personal use only.
Mixture distribution analysis of phenotypic markers reflecting
HFE gene mutations
Christine E. McLaren, Kuo-Tung Li, Chad P. Garner, Ernest Beutler, and Victor R. Gordeuk
The goal of this study was to determine
whether statistical modeling of population data for a phenotypic marker could
reflect a major locus gene defect. Identifying mutations in the HFE gene makes it
possible to assess the association between transferrin saturation (TS) subpopulations and HFE mutations. Data
were analyzed from 27 895 white patients
who attended a health appraisal clinic
and who had TS and common mutations
of HFE determined. Mixture distribution
modeling of TS was performed, and the
proportion of HFE mutations in TS subpopulations was assessed on a probability basis. Three subpopulations of TS
were identified, consistent with Hardy-Weinberg conditions for major locus effects. For men, 72% of the subpopulation
with the highest mean TS had HFE gene
mutations; they were primarily homozygotes or compound heterozygotes. Seventy-three percent of the subpopulation
with moderate mean TS also had HFE
gene mutations; they were predominantly
simple heterozygotes. Sixty-seven percent
of the subpopulation with the lowest
mean TS were wild-type homozygotes.
Similar results were observed for women.
These results suggest that statistical modeling of population clinical laboratory
test data can reveal the influence of a
major locus gene defect and perhaps can
be applied to aspects of body metabolism other than iron. (Blood. 2003;102:4563-4566)
© 2003 by The American Society of Hematology
Homozygous hemochromatosis, a common inherited susceptibility
to iron overload, occurs in 0.3% to 0.5% of white persons of
western European descent.1,2 Most cases of hemochromatosis in
white persons are attributable to common missense mutations of
the HFE gene. The 2 common missense mutations are C282Y
(exon 4, nt 845G>A) and H63D (exon 2, nt 187C>G), and 80% to
90% of patients of northern European ancestry with typical
hemochromatosis phenotype are C282Y homozygotes.3,4 Transferrin saturation (TS) generally is regarded as the best single
screening test for hemochromatosis.5
We previously used mixture modeling of TS measured in the
second and third National Health and Nutrition Examination
Surveys in the United States (NHANES II, NHANES III) and in
population data from Australia to study possible genetic influences
on iron metabolism in white populations.6-8 In each study, we found
TS subpopulations consistent with Hardy-Weinberg predictions for
a major hemochromatosis locus that leads to altered TS values in
affected homozygotes and in heterozygotes. In one study, serum
ferritin concentrations were measured simultaneously, permitting
the determination of 3 subpopulations of TS with progressively
increasing mean age-adjusted serum ferritin concentration values,
consistent with increasing body iron stores.8
For our previous analyses of population data, HFE gene
mutation status was unavailable. In this new study, we analyzed a
large data set from patients attending a health appraisal clinic.
Transferrin saturation, serum ferritin, and HFE mutations were
measured for each patient. The goal was to verify that statistical
distribution methodology could be used to analyze laboratory test
values to predict the presence of major locus mutations that affect
body metabolism.
Patients, materials, and methods
Sources of data
The primary source of data was a study conducted at the Kaiser Permanente
San Diego Health Appraisal Clinic from 1998 to 2001.9,10 All patients older
than 20 years of age who were registered with the clinic were apprised of a
research project in which DNA analysis for HFE mutations and measurements of serum ferritin concentration would be added to the tests usually
performed. We analyzed data from the 30 966 non-Hispanic white men and
women, at least 20 years of age, who consented to participate in the Kaiser
Permanente study and for whom HFE genotype data and health and
demographic data were obtained. The term white is used in this paper to
designate non-Hispanic patients who selected the category white to indicate
their sole ancestry. For the purposes of the present study, we did not analyze
data from Asian, black, or Hispanic clinic patients because HFE mutations
are less common in these populations than among white persons. Details of
the laboratory analysis of samples for HFE C282Y and H63D mutations,
hematologic measurements, serum iron concentration, total iron-binding
capacity, transferrin saturation, and serum ferritin concentration have been
described previously.9
From the Epidemiology Division, University of California, Irvine; the Chao
Family Comprehensive Cancer Center, Orange; the Scripps Research
Institute, La Jolla, CA; and the Division of Hematology/Oncology, Department
of Medicine, Howard University, Washington, DC.
Submitted April 24, 2003; accepted July 28, 2003. Prepublished online as
Blood First Edition Paper, August 7, 2003; DOI 10.1182/blood-2003-04-1278.
Supported in part by National Institutes of Health grants and contracts
HL-508203, N01-HC-05180 (C.E.M.), UH1 HL03679-03, and N01-HC-05186;
Howard University General Clinical Research Center grants M01-RR10284
(V.R.G.), RR00833, and DK53505-4 (E.B.); Centers for Disease Control grant
DK535-02; and the Stein Endowment Fund (E.B.).
Reprints: Christine E. McLaren, Department of Medicine, Epidemiology
Division, University of California, 224 Irvine Hall, Irvine, CA 92697-7550; e-mail:
[email protected]
The publication costs of this article were defrayed in part by page charge
payment. Therefore, and solely to indicate this fact, this article is hereby
marked "advertisement" in accordance with 18 U.S.C. section 1734.
Table 1. Exclusions from the total sample of white participants at least 20 years of age
                                                            Men      Women
Total sample                                             15 415     15 294
Samples excluded on the basis of low Hb or MCV < 80 fL      ...        ...
Transferrin saturation not measured                         ...        ...
Sample for transferrin saturation modeling               13 805     14 090
Serum ferritin not measured                                 533        585
Sample for serum ferritin analysis                       13 272     13 505
Hemoglobin (Hb) exclusions: men, Hb < 13.5 g/dL; women, Hb < 12.0 g/dL.
One further exclusion count, 1 258, appears in the table without a recoverable row or column.
Statistical modeling of transferrin saturations
For analysis, we selected transferrin saturation and serum ferritin values
from 15 294 nonpregnant white women and 15 415 white men at least 20
years of age. Patients were excluded if they had abnormally low values for
hemoglobin or mean corpuscular volume (MCV) because associated
conditions might have altered transferrin saturation and serum ferritin
values.11-17 We did not exclude patients with MCV values above the
reference range because hemochromatosis probands have mean MCV
values significantly higher than wild-type patients.18 Data sets for transferrin saturation modeling consisted of 13 805 men and 14 090 women
(Table 1).
The research hypothesis was that HFE gene mutations influence the
distribution of transferrin saturation, resulting in genetically based subpopulations. Details of mixture-modeling techniques as applied to transferrin
saturation distributions, including parameter estimation, statistical tests,
and confidence intervals for proportions, have been described previously.
Comparison of mean serum ferritin concentrations within
transferrin saturation subpopulations
In a previous investigation of population data, we found evidence of
subpopulations of patients, determined on the basis of transferrin saturation,
with significantly different iron stores based on mean serum ferritin
concentrations. Confirmatory analyses were performed in this new study.
Considering the transferrin saturation data sets used for mixture modeling,
533 men and 585 women did not have measurements of serum ferritin
concentration. After removing values for these patients, the data sets to
analyze serum ferritin concentration consisted of 13 272 men and 13 505
women. Because the distributions of serum ferritin concentration were
markedly skewed, the square root transformation was applied to each data
set. In addition, because serum ferritin concentration tends to increase with
age,24,25 we applied methods for age standardization to age 60 years. The
probability that an individual serum ferritin concentration value belonged to
each of the 3 transferrin saturation subpopulations was computed.8 We
used the Parametric Trend Test26 to compare the mean square root of serum
ferritin concentration within each of 3 transferrin saturation subgroups,
taking into account the unequal sample sizes for subgroups of patients.
Genetic analysis of the genotype data
With rare exceptions,27,28 the C282Y and H63D mutations occur in trans;
therefore, we observed 3 gametes, defined by the presence or absence of the
mutations, and 6 genotypes in the sample. The 6 genotypes are denoted by
the mutation that is present or as wild type (wt) if neither mutation exists on
the chromosome, giving wt/wt, wt/H63D, wt/C282Y, H63D/C282Y, H63D/H63D,
and C282Y/C282Y. The absence of chromosomes carrying both
mutations confirms that the mutations are in virtually complete linkage
disequilibrium.
Genotype data were analyzed to determine whether the observed
genotype frequencies fit the expectations under the Hardy-Weinberg
equilibrium model. The Hardy-Weinberg model assumes that the 2 alleles
(or mutations) in a genotype are independent and that the genotype
frequencies are the product of their constituent allele or mutation frequencies. Deviations from the Hardy-Weinberg model can result from an
underlying structure in the population (eg, inbreeding or selection bias) or,
more commonly, from errors in the data. The C282Y and H63D mutation
frequencies in the population were estimated separately from the data for
13 805 men and 14 090 women, and expected genotype frequencies were
then calculated from the mutation frequencies, assuming Hardy-Weinberg
equilibrium. The Kolmogorov-Smirnov goodness-of-fit test was applied to
compare observed and expected genotype frequencies.
Association of modeled TS subpopulations and HFE genotypes
For each sex, the frequency and proportion of genotypes occurring within
transferrin saturation subpopulations were estimated. Given the total sample
size, the number of expected observations within each transferrin saturation
interval was calculated. As described previously,8 within a given transferrin
saturation interval, each observation was then assigned a probability, and
these probabilities were used to assign transferrin saturation values to a
given subgroup according to the expected proportions within transferrin
saturation subpopulations.
Results
Statistical modeling of transferrin saturations
The primary analysis was performed on transferrin saturations for
27 895 patients. Results of statistical mixture modeling indicated
that the fit of the data to a mixture of 3 normal populations with
unequal variances was significantly better than the fit to either a
mixture of 2 normal populations or a single normal population for
white men (likelihood ratio statistics, 157 [P < .01] and 1704
[P < .01], respectively) and women (likelihood ratio statistics, 120
[P < .01] and 649 [P < .01], respectively). Figure 1 shows the
distribution of transferrin saturation for men and women. The fitted
subpopulations are superimposed over the histograms of the
observed data. Table 2 gives mixture-model parameter estimates
for transferrin saturation subpopulations and shows that, for each
sex, the identified subpopulations had increasing mean transferrin
saturation (trend test, P < .0001 for
each). We also compared the age-standardized mean serum ferritin
concentrations for the TS subpopulations (Table 2). Trend tests
demonstrated an increase in mean serum ferritin concentration,
standardized to age 60 years, with increasing mean transferrin
saturation for men (P < .0001) and women (P < .0001).
Figure 1. Observed and modeled distributions of transferrin saturation values for each sex. Distribution of transferrin saturation values: (A) 13 805 white men and (B) 14 090 white women. A histogram of the observed data is shown. Dashed lines represent the fitted normal distributions for each subpopulation. The solid line represents the overall fitted mixture distribution.
Table 2. Parameter estimates for transferrin saturation and age-standardized serum ferritin concentration
Mixture distribution models: subpopulation estimates
Sex     TS subpopulation    Transferrin saturation, %    n         Serum ferritin concentration, 95% CI
Men     Lowest mean TS      ...                          11 419    132.3, 136.9
Men     Moderate mean TS    ...                          1 735     166.4, 176.9
Men     Highest mean TS     ...                          ...       285.6, 445.2
Women   Lowest mean TS      ...                          11 637    72.3, 75.7
Women   Moderate mean TS    ...                          1 754     90.3, 98.0
Women   Highest mean TS     ...                          ...       123.2, 193.2
For patients younger than 60 years, serum ferritin concentration values were
standardized to those expected at age 60.
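The model comparison reported in this section can be illustrated with a minimal sketch that fits 1-, 2-, and 3-component normal mixtures with unequal variances by the EM algorithm and forms likelihood ratio statistics as twice the difference in maximized log-likelihood. This is not the software used in the study: the function names (`fit_normal_mixture`, `likelihood_ratio`), the evenly spaced starting values, and the fixed iteration count are hypothetical choices, and the sketch works on ungrouped values. Reference distributions for likelihood ratio statistics between mixtures with different numbers of components are nonstandard, so the sketch stops at the statistic and does not compute P values.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(x, weights, means, sds):
    """Density of the mixture at x, floored to avoid log(0)."""
    dens = sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))
    return max(dens, 1e-300)


def fit_normal_mixture(data, k, n_iter=150):
    """Fit a k-component normal mixture with unequal variances by EM.

    Returns (weights, means, sds, log_likelihood). Starting values are
    evenly spaced means with a common SD (an assumption of this sketch).
    """
    n = len(data)
    lo, hi = min(data), max(data)
    span = (hi - lo) or 1.0
    weights = [1.0 / k] * k
    means = [lo + span * (j + 0.5) / k for j in range(k)]
    sds = [span / (2.0 * k)] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in data:
            parts = [w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sds)]
            total = max(sum(parts), 1e-300)
            resp.append([p / total for p in parts])
        # M-step: weighted updates of proportions, means, and SDs.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), 1e-3)
            weights[j] = nj / n
    loglik = sum(math.log(mixture_density(x, weights, means, sds))
                 for x in data)
    return weights, means, sds, loglik


def likelihood_ratio(data, k_small, k_large):
    """Twice the gain in log-likelihood of k_large over k_small components."""
    return 2.0 * (fit_normal_mixture(data, k_large)[3]
                  - fit_normal_mixture(data, k_small)[3])
```

For values simulated from three well-separated normal populations, `likelihood_ratio(data, 2, 3)` and `likelihood_ratio(data, 1, 3)` are both positive and the second is the larger, which mirrors the ordering of the statistics reported for men and women.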
Genetic analysis of the genotype data
The observed frequencies of the H63D and C282Y mutations were
0.152 and 0.062, respectively, in men and 0.149 and 0.064,
respectively, in women. The data show strong agreement with the
Hardy-Weinberg equilibrium model, which would be expected for
a large, randomly selected sample from a large, randomly mating
population.10,29 There were no significant differences between
observed and expected genotype frequencies (Kolmogorov-Smirnov statistic = 0.167; P = 1.0 for men and women).
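The expected genotype frequencies used in this comparison follow directly from the mutation frequencies. The sketch below computes them for the 6 genotypes under Hardy-Weinberg equilibrium, using the frequencies reported for men; the function name `hardy_weinberg_genotypes` is hypothetical, and the wild-type frequency is taken as the remainder, which assumes that no chromosome carries both mutations.

```python
from itertools import combinations_with_replacement


def hardy_weinberg_genotypes(allele_freqs):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    allele_freqs maps allele name -> frequency; the frequencies should
    sum to 1. Homozygotes get p**2 and heterozygotes get 2*p*q.
    """
    expected = {}
    for a, b in combinations_with_replacement(allele_freqs, 2):
        p, q = allele_freqs[a], allele_freqs[b]
        expected[f"{a}/{b}"] = p * p if a == b else 2.0 * p * q
    return expected


# Mutation frequencies reported for men; wt is the remainder (assumption:
# the two mutations never occur on the same chromosome).
men = {"wt": 1.0 - 0.152 - 0.062, "H63D": 0.152, "C282Y": 0.062}
expected_men = hardy_weinberg_genotypes(men)

for genotype, freq in expected_men.items():
    # Expected counts are scaled to the 13 805 men in the modeling data set.
    print(f"{genotype:12s} {freq:.6f} {freq * 13805:10.1f}")
```

With these inputs the expected frequency of C282Y/C282Y is 0.062 squared, or about 0.38%, which lies within the 0.3% to 0.5% prevalence of homozygous hemochromatosis cited in the introduction.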
Association of modeled TS subpopulations and HFE genotypes
The frequency of genotypes within each TS subpopulation is given
in Table 3. For men, 72% of patients in the subpopulation with the
highest mean TS had HFE gene mutations; 60% of them were
homozygotes or compound heterozygotes, and 40% were simple
heterozygotes (Table 3). Seventy-three percent of patients in the
subpopulation with moderate mean TS had HFE gene mutations;
71% of them were simple heterozygotes for HFE mutations, and
29% were homozygotes or compound heterozygotes. In the
subpopulation with the lowest mean TS, only 33% of patients had
HFE mutations; 94% of them were simple heterozygotes, and 6%
were homozygotes or compound heterozygotes for HFE mutations.
Similar results were observed in analyses of data from women.
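The probability-based tabulation of genotypes within TS subpopulations can be illustrated with a soft-assignment sketch: each TS value receives a posterior probability of belonging to each fitted subpopulation, and genotype counts are accumulated with those probabilities as weights. This is a simplification and not necessarily the procedure used in the study. The mixture parameters and the records below are hypothetical illustration values, not estimates from Table 2, and the function names are assumptions.

```python
import math
from collections import defaultdict


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_memberships(ts, weights, means, sds):
    """Posterior probability that a TS value belongs to each subpopulation."""
    parts = [w * normal_pdf(ts, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(parts)
    return [p / total for p in parts]


def genotype_shares(records, weights, means, sds):
    """Probability-weighted genotype proportions within each subpopulation.

    records is an iterable of (ts_value, genotype) pairs. Returns one
    dict per subpopulation mapping genotype -> weighted proportion.
    """
    counts = [defaultdict(float) for _ in weights]
    for ts, genotype in records:
        post = posterior_memberships(ts, weights, means, sds)
        for j, p in enumerate(post):
            counts[j][genotype] += p
    shares = []
    for c in counts:
        total = sum(c.values())
        shares.append({g: v / total for g, v in c.items()})
    return shares


# Hypothetical parameters (lowest, moderate, highest mean TS) and records.
WEIGHTS, MEANS, SDS = (0.85, 0.13, 0.02), (25.0, 40.0, 65.0), (7.0, 8.0, 12.0)
RECORDS = [(22.0, "wt/wt"), (27.0, "wt/H63D"), (38.0, "wt/C282Y"),
           (45.0, "H63D/C282Y"), (72.0, "C282Y/C282Y"), (19.0, "wt/wt")]
SHARES = genotype_shares(RECORDS, WEIGHTS, MEANS, SDS)
```

A soft assignment avoids forcing a TS value near the overlap of two fitted distributions into a single subpopulation, which is the reason the proportions are described as being assessed on a probability basis.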
Discussion
As we gain understanding of the genetic influences on the body’s
metabolic processes, it is likely that we will find that routine
laboratory measurements performed in clinical medicine are influenced by genetic mutations and polymorphisms in the population.
It is also possible that studying the distributions of the results of
laboratory measurements in populations can provide evidence for
such as yet unidentified major mutations that influence metabolic
pathways of the body. An example of this approach is a community-based prevalence study of hypertension, in which values for
angiotensinogen (AGT) collected from Nigerian families were
analyzed. Although the AGT genetic mutation status of family
members was unknown, a mixture of 2 or 3 distributions fit the
AGT values significantly better than a single distribution, and a
mixture of 3 distributions provided the best fit to the data,
suggesting a major genetic effect.30 We have examined mixture
modeling of a phenotypic marker to reveal a major locus effect in
the context of iron overload in the white population. Before the
HFE gene was identified, we conducted a series of studies based on
the postulate that the hemochromatosis mutation in white patients
would influence the distribution of transferrin saturation in the
population and would permit the identification of distinct subpopulations based on hemochromatosis genotype. We amassed considerable evidence consistent with the hypothesis6-8 but had not had the
opportunity to verify our hypothesis by studying a population in
which transferrin saturations and HFE genotypes were determined
simultaneously in all participants. The present study represents our
first opportunity to attempt to verify our hypothesis in a large
population in whom phenotyping and genotyping were performed.10
Under the hypothesis that the distribution of transferrin saturation reflects several populations based on individual genotype for
hemochromatosis, in the present study we applied a statistical
distribution methodology to transferrin saturations in a large data
set and examined the relationship between the model results and
the actual genotyping of HFE mutations. Using this approach, 3
transferrin saturation subpopulations were identified in white men
and women. The proportions of these subpopulations are consistent
with Hardy-Weinberg criteria for a major, common genetic influence on iron metabolism. For men and women, the subpopulation
with higher mean TS values consisted predominantly of those with
HFE gene mutations. In the subpopulation with the highest TS, the
HFE mutations were predominantly homozygote or compound
heterozygote. In the subpopulations with moderate TS, the HFE
mutations were predominantly simple heterozygote. In contrast,
most of the patients in the subpopulation with the lowest TS were
HFE wild-type homozygotes.
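The consistency check between subpopulation proportions and Hardy-Weinberg expectations can be sketched as follows, under the simplifying assumption of a single biallelic major locus with variant allele frequency q. The function names are hypothetical, and the value q = 0.062 (the C282Y frequency reported for men) is used only as an illustration; it is not a claim that the fitted mixture proportions equal these values.

```python
def major_locus_proportions(q):
    """Hardy-Weinberg proportions for a biallelic locus.

    Returns (unaffected homozygotes, heterozygotes, affected homozygotes)
    for variant allele frequency q.
    """
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


def implied_allele_frequency(proportions):
    """Variant allele frequency implied by 3 subpopulation proportions."""
    _, het, hom = proportions
    return hom + het / 2.0


def hwe_discrepancy(proportions):
    """Largest absolute gap between the proportions and the Hardy-Weinberg
    proportions at the allele frequency they imply."""
    q = implied_allele_frequency(proportions)
    expected = major_locus_proportions(q)
    return max(abs(o - e) for o, e in zip(proportions, expected))


# Illustration with the C282Y frequency reported for men.
example = major_locus_proportions(0.062)
print([round(x, 6) for x in example])
```

A discrepancy near 0 indicates proportions compatible with a single major locus in equilibrium, whereas a set such as (0.8, 0.1, 0.1) gives a large discrepancy and would not be compatible.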
Our study has several limitations. We were unable to exclude
patients with elevations in erythrocyte protoporphyrin or liver
function tests or those with positive serum hepatitis B surface
antigen or serum hepatitis C antibody because these laboratory
tests were not performed for all participants. Patients with abnormalities in these test results might have had elevations in TS not
attributable to a genetic defect and might have been part of the
proportion with wild-type HFE found in the subpopulation with the
highest TS and moderate TS. The first 10 000 patients were tested
for the S65C HFE mutation.9 Because not all patients were tested,
we did not exclude those with other HFE mutations, such as S65C,
who might also have been classified as having wild-type HFE in
the upper TS subpopulation.
Table 3. Frequency of genotypes within transferrin saturation subpopulations
Men: 11 895; total, 13 805
Women: 12 154; total, 14 090
Even with these limitations, however, modeling demonstrated
subpopulations of TS corresponding to distributions of HFE
genotypes in this population. Our findings thus provide confirmation that mutations in a major locus influence the distribution of
transferrin saturation in the US white population. Our findings also
raise the possibility that mixture-modeling procedures might be
used to predict the presence of major loci influencing many other
laboratory tests, including those measuring metals, electrolytes,
enzyme activities, metabolic breakdown products, and components
of blood and plasma.
This is manuscript no. 15507-MEM from The Scripps Research Institute.
References
1. Adams PC, Gregor JC, Kertesz AE, et al. Screening blood donors for hereditary hemochromatosis: decision and analysis model based on a 30-year database. Gastroenterology. 1995;109:177-188.
2. Witte DL, Crosby WH, Edwards CQ, et al. Practice Guideline Development Task Force of the
College of American Pathologists: hereditary
hemochromatosis. Clin Chim Acta. 1996;245:
3. Feder JN, Gnirke A, Thomas W, et al. A novel
MHC class I-like gene is mutated in patients with
hereditary haemochromatosis. Nat Genet. 1996;
4. Burke W, Thomson E, Khoury MJ, et al. Hereditary hemochromatosis: gene discovery and its
implications for population-based screening.
JAMA. 1998;280:172-178.
5. Powell LW, Jazwinska E, Halliday JW. Primary
iron overload. In: Brock JH, Halliday JW, Pippard
MJ, et al, eds. Iron Metabolism in Health and Disease. Philadelphia, PA: WB Saunders; 1994:247-254.
6. McLaren CE, Gordeuk VR, Looker AC, et al.
Prevalence of heterozygotes for hemochromatosis in the white population of the United States.
Blood. 1995;86:2021-2027.
7. McLaren CE, McLachlan GJ, Halliday JW, et al.
Distribution of transferrin saturation in an Australian population: relevance to the early diagnosis
of hemochromatosis. Gastroenterology. 1998;
8. McLaren CE, Li KT, Gordeuk VR, et al. Relationship between transferrin saturation and iron
stores in the African American and US caucasian
populations: analysis of data from the third National Health and Nutrition Examination Survey.
Blood. 2001;98:2345-2351.
9. Beutler E, Felitti V, Gelbart T, Ho N. The effect of
HFE genotypes on measurements of iron overload in patients attending a health appraisal
clinic. Ann Intern Med. 2000;133:329-337.
10. Beutler E, Felitti VJ, Koziol JA, Ho NJ, Gelbart T.
Penetrance of 845G>A (C282Y) HFE hereditary
haemochromatosis mutation in the USA. Lancet.
11. Finch CA, Deubelbeiss K, Cook JD, et al. Ferrokinetics in man. Medicine. 1970;49:17-53.
12. Hershko C, Graham G, Bates GW, Rachmilewitz
EA. Non-specific serum iron in thalassaemia: an
abnormal serum iron fraction of potential toxicity.
Br J Haematol. 1978;40:255-263.
13. Cartwright GE, Wintrobe MM. Chemical, clinical,
and immunological studies on the products of human plasma fractionation, XXXIX: the anemia of
infection: studies on the iron-binding capacity of
serum. J Clin Invest. 1949;28:86-98.
14. Jandl JH. Blood. Boston, MA: Little, Brown & Co;
15. Bainton DF, Finch CA. The diagnosis of iron deficiency anemia. Am J Med. 1964;37:62-70.
16. Edwards CQ, Griffen LM, Kaplan J, Kushner JP.
Twenty-four–hour variation of transferrin saturation in treated and untreated haemochromatosis
homozygotes. J Intern Med. 1989;226:173-179.
17. Gordeuk VR, Brittenham GM, McLaren GD,
Spagnuolo PJ. Hyperferremia in immunosuppressed patients with acute nonlymphocytic leukemia and the risk of infection. J Lab Clin Med.
18. Barton JC, Bertoli LF, Rothenberg BE. Peripheral
blood erythrocyte parameters in hemochromatosis: evidence for increased erythrocyte hemoglobin content. J Lab Clin Med. 2000;135:96-104.
19. Gordeuk VR, McLaren CE, Looker AC, Hasselblad V, Brittenham GM. Distribution of transferrin
saturations in the African-American population.
Blood. 1998;91:2175-2179.
20. Hasselblad V. Manual for the grouped DISFIT
software, version 1.1. Durham, NC: Duke University; 1992.
21. McLachlan GJ, Krishnan T. The EM Algorithm
and Extensions. New York, NY: John Wiley &
Sons; 1997.
22. McLachlan GJ, Basford KE. Mixture Models: Inference and Applications to Clustering. New York,
NY: Marcel Dekker; 1988.
23. Crump KS, Howe R. A preview of methods for
calculating confidence limits in low dose extrapolation. In: Krewski D, ed. Toxicological Risk Assessment. Boca Raton, FL: CRC Press; 1985:
24. Expert Scientific Working Group. Summary of a
report on assessment of the iron nutritional status
of the United States population. Am J Clin Nutr.
25. Zacharski LR, Ornstein DL, Woloshin S,
Schwartz LM. Association of age, sex, and race
with body iron stores in adults: analysis of
NHANES III data. Am Heart J. 2000;140:98-104.
26. Cuzick J. Trend tests. In: Kotz S, Johnson N, eds.
Encyclopedia of Statistical Sciences. Vol 9. New
York, NY: John Wiley & Sons; 1988:336-342.
27. Spriggs EL, Harris PE, Best LG. Hemochromatosis mutations C282Y and H63D in “cis” phase.
Clin Genet. 2001;60:68-72.
28. Lucotte G, Champenois T, Semonin O. A rare
case of a patient heterozygous for the hemochromatosis mutation C282Y and homozygous for
H63D. Blood Cells Mol Dis. 2001;27:892-893.
29. Waalen J, Felitti V, Gelbart T, Ho NJ, Beutler E.
Prevalence of hemochromatosis-related symptoms in homozygotes for the C282Y mutation of
the HFE gene. Mayo Clin Proc. 2002;77:522-530.
30. Guo X, Rotimi C, Cooper R, et al. Evidence of a
major gene effect for angiotensinogen among
Nigerians. Ann Hum Genet. 1999;63:293-300.