Relationship Between Protein Complotypes
and DNA Variant Haplotypes:
Complotype-RFLP Constellations (CRC)
Susan Simon, Lennart Truedsson,
Deborah Marcus-Bagley, Zuheir Awdeh,
George S. Eisenbarth, Stuart J. Brink,
Edmond J. Yunis, and Chester A. Alper
ABSTRACT: From the study of 52 families and 15 homozygous typing cells, 234 MHC complement haplotypes were characterized for features in the DNA of the complotype region: C2/Sst I (2.75, 2.70, 2.65, and 2.40 kb), BF/Taq I (6.6 and 4.5 kb), C4 5′/Bgl II (15 and 4.5 kb), C4 5′/Taq I (7.0, 6.4, 6.0 and 5.4 kb) and C4 3′/Xba I/BamH I (11 and 4+7 kb) restriction fragment length polymorphisms (RFLP’s), by the presence or absence of C4A, C4B, CYP21A and CYP21B genes and by duplications. Nineteen (of over 1000 theoretically possible) complotype-RFLP constellations (CRC’s) were found. The 9 CRC’s with two C4 and CYP21 genes were designated A through I. CRC’s Bdup and Ddup were like B and D but had duplicated C4B-CYP21B genes. The remaining CRC’s had deletions of C4 and/or CYP21 genes and were designated Bdel, Cdel and the like. Individual complement alleles and complotypes were not randomly distributed among the CRC’s. Some complotypes, such as SC01, SC02 and F1C30, were restricted to only 1 CRC; others, such as SC31, FC31, and SC30, were found in several CRC’s. Some of the CRC’s contained a single complotype, others contained several. Remarkably, there are about 30 CRC-specified complotypes with frequencies of .01 or higher and 14 of .02 or higher. A number of evolutionary origins of complement alleles and complotypes are suggested by the relationships among CRC’s. Approximate normal frequencies of the undeleted CRC’s were A = .27, B = .19, Bdup = .02, C = .17, D = .07, Ddup = .02, E = .06, F = .05, and G = .02. Thus, CRC’s without deletions accounted for 88% of normal complotypes. Since the frequency of Bdel, with a deletion of C4A, was .12, 10 CRC’s accounted for all observed normal caucasian MHC haplotypes. Human Immunology 57, 27–36 (1997).
© American Society for Histocompatibility and Immunogenetics, 1997. Published by Elsevier Science Inc.
From The Center for Blood Research (S.S., L.T., D.M.-B., Z.A., E.J.Y., C.A.A.) and the Departments of Pediatrics (S.J.B., C.A.A.), Medicine (G.S.E.), and Pathology (Z.A., E.J.Y.), Harvard Medical School, Boston, MA 02115, USA.
Address reprint requests to: Chester A. Alper and Edmond J. Yunis, The Center for Blood Research, 800 Huntington Avenue, Boston, MA 02115.
Received August 19, 1997; accepted August 19, 1997.
ABBREVIATIONS
RFLP: restriction fragment length polymorphism
CRC: complotype-RFLP constellation
EDTA: ethylene diamine tetraacetate
INTRODUCTION
There is extensive genetic polymorphism in the genes encoding the complement proteins of the major histocompatibility complex in humans. It is most extensive in the genes encoding the two isotypes of C4, C4A and C4B [1], but also involves factor B of the alternative pathway [2–4] and, to a lesser extent, C2 [5–7]. These genes form fixed haplotypes (complotypes) defined by their BF, C2, C4A, and C4B protein alleles [8] that are population-characteristic. Among mixed European caucasians, there are about a dozen protein complotypes that occur at frequencies of 1% or more.
By the application of molecular biologic techniques, it
was shown that the 4 complement genes (C2, BF, C4A,
C4B, in that order from telomere to centromere) as well
as 2 genes for the adrenal steroid cytochrome P450
21-hydroxylase enzyme (CYP21A, a pseudogene, and
CYP21B, the expressed gene) are found within 100 to
120 kb of genomic DNA [9 –13]. A number of restriction fragment length polymorphisms (RFLP’s) have been
found in the DNA of the complement and CYP21 genes.
In addition, deletions of C4A, C4B and CYP21A, gene
conversion-like C4B to C4A changes and duplications of
C4B and CYP21B are common.
The present study was designed to determine the
relationship over the whole complotype region between
the polymorphisms in the major histocompatibility complex-encoded complement proteins and RFLP’s, largely
in non-coding regions, detected in the corresponding
genomic DNA. In this way, we hoped to define complotype-RFLP constellations or CRC’s. An analysis of these,
in turn, should suggest evolutionary relationships among
the genes for individual protein-defined C2, C4 and BF
alleles and complotypes in caucasians.
MATERIALS AND METHODS
Subjects
Lymphoblastoid lines were established by Epstein–Barr
virus transformation of peripheral blood B lymphocytes
from members of 52 families in which at least 1 member
had type 1 diabetes mellitus. All subjects were studied
for C2, C4 and BF protein types and complotypes were
assigned from studies in immediate relatives. In addition, lymphoblastoid lines were established from peripheral blood B lymphocytes of MHC homozygous individuals either by us or by the 10th International
Histocompatibility Workshop [14]. From the families, a
total of 84 independent normal and 135 diabetic protein-defined complotypes were identified, with diabetic haplotypes defined as those found in any patient and normal
haplotypes as those not found in any patient [15]. An
additional 15 independent haplotypes were provided by
the homozygous cells. Normal frequencies were calculated only from the 84 normal haplotypes, whereas all
other analyses were based on the 234 total haplotypes.
Complotype Determinations
Plasma from whole blood collected into EDTA was used
to test for genetic polymorphisms in C4 (C4A and C4B),
factor B (BF), and C2. For C4 typing [1], the plasma was
treated with neuraminidase and, in some instances, carboxypeptidase [16] and then subjected to agarose gel
electrophoresis and immunofixation with goat anti-human C4 (Atlantic Antibodies, Stillwater, MN). C2 types
were determined by isoelectric focusing in thin layer
polyacrylamide gel and a C2-sensitive agarose gel overlay
incorporating antibody-sensitized sheep erythrocytes [5].
BF typing [2] was by agarose gel electrophoresis and
immunofixation with anti-human factor B (Atlantic Antibodies). Subtypes of BF F were determined by isoelectric focusing in thin-layer polyacrylamide gel and immunofixation with anti-factor B [3, 4]. Nomenclature for C4 is that described previously [1]. Variants and alleles at each C4 locus are designated by integers according to electrophoretic mobility from cathode to anode of the desialated protein at pH 8.8. Individual alleles are italicized and designated by locus name in capital letters, an asterisk, and a number or ‘‘Q0’’ if null (e.g., C4A*4, C4B*2, or C4B*Q0). Phenotypes, variants or proteins are given with Roman capital letters, a space, and the same number or symbol as the corresponding allele (C4A 4, C4B 2, or C4B Q0).
Complotypes are designated by their BF, C2, C4A, and C4B alleles, in that arbitrary order [8]. Null or Q0 alleles are simply 0. Thus, FC30 indicates the complotype with alleles BF*F, C2*C, C4A*3, C4B*Q0.
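The naming convention just described can be illustrated with a short sketch that expands a complotype designation into its four allele names. The function name `parse_complotype`, the `Complotype` tuple, and the restricted allele spellings (BF: S, F, S1, F1; C2: C, B, 0; C4 alleles as digits or a parenthesized list) are assumptions made for this illustration and cover only designations of the kind used in this paper; the sketch is not a general nomenclature parser.

```python
import re
from typing import NamedTuple

# Assumed grammar, limited to spellings seen in this paper:
#   BF allele : S, F, S1 or F1
#   C2 allele : C, B or 0 (null)
#   C4A, C4B  : a single digit (0 = Q0) or a parenthesized list such as (1,17)
_C4 = r"(\(\d+(?:,\d+)*\)|\d)"
_PATTERN = re.compile(rf"^(S1|F1|S|F)([CB0]){_C4}{_C4}$")


class Complotype(NamedTuple):
    bf: str
    c2: str
    c4a: str
    c4b: str


def _allele(locus: str, token: str) -> str:
    """Turn a raw token ('3', '0', '(1,17)') into 'LOCUS*allele' text."""
    parts = ["Q0" if t == "0" else t for t in token.strip("()").split(",")]
    return ", ".join(f"{locus}*{p}" for p in parts)


def parse_complotype(name: str) -> Complotype:
    """Expand a designation such as 'FC30' in the order BF, C2, C4A, C4B."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized complotype designation: {name!r}")
    bf, c2, c4a, c4b = match.groups()
    return Complotype(
        _allele("BF", bf),
        _allele("C2", c2),
        _allele("C4A", c4a),
        _allele("C4B", c4b),
    )


print(parse_complotype("FC30"))
print(parse_complotype("S1C2(1,17)"))
```

A parenthesized entry is reported as several alleles in the same position, which is a simplification chosen for the sketch rather than a statement about gene content.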
Restriction Fragment Length
Polymorphism (RFLP)
DNA was extracted from the B-lymphoblastoid lines by the method of Gross-Bellard and colleagues [17]. Three to 10 µg of DNA were digested to completion with each restriction enzyme, using conditions recommended by the manufacturer (Bethesda Research Laboratories, Gaithersburg, MD). Fragments after digestion were separated by electrophoresis in 0.75% agarose gel and transferred [18] to Nytran (Schleicher and Schuell, Keene, NH) or Sure-blot (Oncor, Gaithersburg, MD) membranes. Electrophoresis of Sst I-digested DNA was carried out for 96 h at 30 V at 4°C using 1X TBE (89 mM Tris borate/89 mM boric acid/2 mM EDTA at pH 8.0). All probes were labeled with alpha [32P] dCTP by the random primer method [19]. Prehybridization and hybridization were carried out at 45°C, the latter overnight. Membranes were washed twice at room temperature with 0.1X standard saline citrate (SSC), 0.1% SDS and a third time with the same solution at 52°C for 20–60 min. The membranes were then exposed to X-ray film with an intensifying screen at −70°C for 2 to 10 days for autoradiography.
The specific enzymes used in this study were Taq I, Bgl II, Sst I, and a mixture of Xba I and BamH I. The probes used were as follows. For the 5′ region of the C4 genes, a 500 bp fragment derived from the 5′ end of the full-length C4 cDNA clone pAT-A after BamH I/Kpn I double digestion [20] was used. A 570 bp fragment derived from the pC4AL1 cDNA clone by Pst I digestion was used for the 3′ portion of the C4 genes [21]. For the CYP21 genes, a 900 bp fragment generated from the cosmid clone 1E3 [10] was used. For polymorphism in C2 genes, a 300 bp probe derived from the pG850 genomic clone [22] was used. The probe for BF was derived from the clone pFB3b by double digestion with Cla I and BamH I [23]. The probes were purified by
TABLE 1 Restriction fragment length polymorphisms in BF, C2, C4A, and C4B
Enzyme | Probe | Variants (kb) | Comments | References
Sst I | 5′ C2 | 2.75, 2.70, 2.65, 2.55, 2.40 | 2.4 kb associated with BF F, 2.7 with BF F and BF S, 2.75 with C2 Q0 and C2 B | [22, 24–26]
Taq I | BF | 6.6, 4.5 | 4.5 kb correlates with BF F(B), 6.6 with BF F(A) | [27, 28]
Bgl II | 5′ C4 | 15, 4.5 | 15 kb correlates with BF F(B), 4.5 with BF F(A) | [29]
Taq I | 5′ C4 | C4A: 7.0, del. C4B: 6.0 (long), 5.4 (short), 6.4 (short with deleted C4A), del. | Some C4B genes make C4A protein, some C4A and C4B genes make no protein | [10, 30, 31, 32]
Xba I/BamH I | 3′ C4 | 11, 7+4 | Splits SC31, SC30, FC31, FC30, SC21 complotypes; Xba I site associated with long C4B genes | [21, 33, 34]
FIGURE 1 Restriction map of the complotype region in relation to the MHC on chromosome 6p showing the restriction sites used in this study.
TABLE 2 Complotype restriction fragment length polymorphism constellations (CRC’s) on all
caucasian haplotypesa
C2
CRC
A
B
Bdup
C
D
Ddup
E
F
G
H
I
J
Bdel
BCdel
Edel1
Edel2
DGdel
Adel1
Adel2
a
b
2.4
2.65
1
1
1
1
BF
2.75b
2.7
1
1
1
1
1
1
1
1
1
1
1
1
1
4.5
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
59 C4A
6.6
15
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
4.5
1
1
—
1
1
39 C4A
11
CYP-21A
59 C4B
417
S
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
—
—
—
1
—
1
—
1
1
—
1
1
1
1
—
1
1
L
39 C4B
417
11
1
1
11
1
—
—
—
—
—
n
1
1
11
1
1
1
11
1
1
1
1
1
1
1
1
1
1
— —
— —
— —
— —
— —
1
1
11
1
1
CYP-21B
1
1
11
1
1
11
1
1
1
1
1
1
1
1
1
—
1
1
1
1
—
—
—
—
—
1
57
25
6
29
17
7
9
17
4
1
1
1
40
14
1
2
1
1
1
Deletions are shown as dashed lines, duplications as 11 under the appropriate genomic region.
Numbers below each region are restriction fragment lengths in kb, S is short C4B gene and L is long C4B gene.
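TABLE 3 Distribution of protein complotypes in relation to CRC’s on random normal caucasian haplotypes
Complotype | A | B | Bdup | C | D | Ddup | E | F | G | Bdel | Sum | Fraction
SC31 | 19 | 4 | | 11 | | | | | 1 | | 35 | 0.42
SC01 | | | | | | | | | | 10 | 10 | 0.12
SC30 | 3 | 1 | | | 2 | | | | 1 | | 7 | 0.08
SC33 | 1 | | | | | | | | | | 1 | 0.01
SC(3,2)0 | | | | 2 | | | | | | | 2 | 0.02
FC(3,2)0 | | | | 1 | | | | | | | 1 | 0.01
SC21 | | 1 | | | | | | | | | 1 | 0.01
SC22 | | 1 | | | | | | | | | 1 | 0.01
SC2(1,2) | | | 2 | | | | | | | | 2 | 0.02
SC42 | | 2 | | | 1 | | | | | | 3 | 0.04
SC02 | | | | | 3 | | | | | | 3 | 0.04
SB41 | | | | | | | | 1 | | | 1 | 0.01
SB42 | | | | | | | | 1 | | | 1 | 0.01
S042 | | | | | | | | 2 | | | 2 | 0.02
SC51 | | 1 | | | | | | | | | 1 | 0.01
SC61 | | 4 | | | | | | | | | 4 | 0.05
FC31 | | 1 | | | | | 1 | | | | 2 | 0.02
FC30 | | 1 | | | | | 3 | | | | 4 | 0.05
FC01 | | | | | | | 1 | | | | 1 | 0.01
S1C2(1,17) | | | | | | 2 | | | | | 2 | 0.02
Total | 23 | 16 | 2 | 14 | 6 | 2 | 5 | 4 | 2 | 10 | 84 |
Fraction | 0.27 | 0.19 | 0.02 | 0.17 | 0.07 | 0.02 | 0.06 | 0.05 | 0.02 | 0.12 | |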
FIGURE 2 Interrelationships among the common CRC’s. Each CRC is shown by its letter (Table 2) and abbreviated designation: C2/Sst I, BF/Taq I, C4A/Bgl II (I = 15 kb, II = 4.5 kb), C4A present (+) or deleted (- -), CYP21A present (+) or deleted (- -), long (L), short (S), or deleted C4B (- -), CYP21B present (+) or deleted (- -), and 3′ C4/Xba I/BamH I. The frequency of each common CRC among normal caucasian MHC haplotypes is given in parentheses. Complotypes from the population of 234 total haplotypes within each CRC are given below or to the right of each CRC description. Solid lines indicate unambiguous connection, interrupted lines indicate that more than one connection is possible. Single arrows at the end of lines indicate likely direction. If direction is not clear, arrows are at both ends of the lines. Postulated genetic change indicated by del for deletion, dup for duplication, mut for point mutation or nonhomologous crossingover, hom cro for homologous crossingover or gene conversion-like change.
excision of bands after electrophoresis in low melting
point agarose gel.
Table 1 summarizes the restriction endonucleases [10,
21, 22, 24 –29, 30 –34] and probes used to detect these
polymorphisms and the correlations that have been made
with protein variants. Figure 1 provides a map of the
complotype region within the MHC and the approximate positions of the restriction sites used in the present
study.
RESULTS
The complotype-RFLP constellations (CRC’s) are defined by their C2/Sst I (2.75, 2.70, 2.65, and 2.40 kb), BF/Taq I (6.6 and 4.5 kb), C4A/Bgl II (15 kb, designated I, and 4.5 kb, designated II), 5′ C4 (7.0, 6.4, 6.0 and 5.4 kb) and 3′ C4/Xba I/BamH I (11 and 4+7 kb) RFLP variants, by the presence or absence of C4A, C4B, CYP21A and CYP21B genes and by the presence or absence of duplications (Figure 1 and Table 2). Of the more than 1000 theoretically possible combinations of these variants, only 19 CRC’s were actually found among the 234 independent haplotypes studied (Table 2). Twelve of these are ‘‘full’’ with no deletions (patterns A through J), of which 2 have duplicated C4B and CYP21B (Bdup and Ddup), whereas the remaining 7 have deletions of 1 or more C4 and/or CYP21 genes. CRC’s with deleted C4 and/or CYP21 genes were designated by their presumed full CRC of origin and the designation ‘‘del.’’ If there were 2 possible full CRC’s of origin (see below), both were used in the designation. If there were two CRC’s with different deletions from the same CRC, these were designated ‘‘del1’’ and ‘‘del2’’. The BF/Taq I 4.5 kb variant was associated with the 15 kb (I) C4A 5′/Bgl II variant on 221 haplotypes, whereas the reciprocal 6.6 kb/4.5 kb (6.6 kb, II) combination was found on 12 haplotypes. There was a single 4.5 kb/4.5 kb (4.5 kb, II) haplotype.
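The text does not state how the figure of more than 1000 theoretically possible combinations was obtained. The sketch below shows one naive way to arrive at a number of that order by multiplying the states of each feature. The feature list, the state lists, and the treatment of duplication as a single yes/no feature are assumptions made for illustration; the count also includes combinations that would be biologically incoherent (for example, a duplication alongside a deleted C4B), so it is only a rough upper bound.

```python
from itertools import product

# Assumed feature states, taken loosely from the variants named in the text.
# This is an illustrative counting scheme, not the authors' stated method.
features = {
    "C2/Sst I (kb)": ["2.75", "2.70", "2.65", "2.40"],
    "BF/Taq I (kb)": ["6.6", "4.5"],
    "C4A 5'/Bgl II (kb)": ["15", "4.5"],
    "C4 3'/Xba I/BamH I (kb)": ["11", "4+7"],
    "C4A gene": ["present", "deleted"],
    "CYP21A gene": ["present", "deleted"],
    "C4B gene": ["long", "short", "deleted"],
    "CYP21B gene": ["present", "deleted"],
    "C4B-CYP21B duplication": ["no", "yes"],
}

# Every combination of one state per feature.
all_combinations = list(product(*features.values()))
n_possible = len(all_combinations)
n_observed = 19  # CRC's actually found among the 234 haplotypes

print(f"naive number of combinations: {n_possible}")
print(f"observed fraction: {n_observed / n_possible:.3f}")
```

Under these assumptions the observed constellations make up only a small fraction of the naive total, which is the point the paragraph above makes.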
Table 3 lists the protein complotypes in relation to CRC’s in the 84 normal haplotypes. Considering only individual protein-associated alleles in all the 234 haplotypes, it was observed that BF*S (the most common BF allele in caucasians) was found in all CRC’s except Ddup, E, J, BCdel, Edel1, and Edel2. BF*F (as FA in all instances tested) was found in E, J, Edel1, and Edel2 with no other BF alleles in these CRC’s. BF*F (as FB in all cases tested) was found primarily in CRC’s B and C, along with BF*S. The less common BF*F1 allele was found exclusively in CRC BCdel (with no other BF allele) and BF*S1 was found exclusively in Ddup (with no other BF allele). The less common C2 alleles, C2*B and C2*Q0 (the common or type 1 null allele), were found primarily in CRC F, with a single example in CRC H (both CRC F and CRC H carry the 2.75 kb Sst I C2 variant and contained no C2*C, the common allele). The most common C4A allele, C4A*3, was found in the full A, B, C, D, E, F, G, and H CRC’s as well as CRC’s BCdel, Edel1, and DGdel, with deletions. C4A*2 was found with C4A*3 in the A, B, and C CRC’s, alone in the Bdup and Ddup CRC’s and as a single example in the Adel2 CRC. C4A*4 was found in CRC’s B, D, and F. C4A*6 was only in CRC B along with the sole example of C4A*5. C4B*1 occurred in all but the D, I, BCdel, Edel1, Edel2, DGdel, and Adel1 CRC’s. C4B*2, in contrast, was found only in the B, Bdup, D, and F CRC’s and C4B*3 was only found in CRC A.
FIGURE 3 As for Fig. 2.
In terms of complotypes, SC31 was found in CRC’s A, B, C and G. SC30 was even more widely dispersed in CRC’s A, B, D, G, I, DGdel, and Adel1. The complotype SC21 was found only in CRC’s A, B, and Adel2. BF*FB in FC31 and FC30 was found in the B, C, and J CRC’s and BF*FA in FC31, FC01, and FC30 was in the E and Edel1 CRC’s and as FC91,0 in the Edel2 CRC. The complotypes SC01, F1C30, S1C2(1,17), and SC2(1,2) uniquely and exclusively composed the CRC’s Bdel, BCdel, Ddup, and Bdup, respectively. Other complotypes such as SC61, SC33, and SC02 were found exclusively in one CRC (B, A, and D, respectively), but along with other protein complotypes. SC42 occurred in CRC’s B and D and SC02 was found only in D. Complotypes with C4A and C4B genes both producing C4A proteins were restricted to CRC A (SC[3,3]0), possibly CRC I with possible SC[3,3]0, CRC C (FC[3,2]0 and SC[3,2]0), and CRC F (SB[4,3]0).
FIGURE 4 As for Fig. 2.
To relate these observations to each other and to possible evolutionary relationships, the schemata shown in Figs. 2 through 4 were developed. For this, CRC’s were considered most related if they shared the maximum number of RFLP’s. The direction of the relatedness was assigned on the basis of certain DNA or protein features being considered ‘‘terminal.’’ For example, deletions or duplications were viewed as deriving from full haplotypes. C2*B and C2*Q0 were deemed terminal derivations from C2*C. C4B*2 was considered to have arisen from C4B*1. Figure 2 relates the 7 common non-deleted, non-duplicated CRC’s (A, B, C, D, E, F, and G) to one another. These CRC’s make up over 70% of random normal complotypes in caucasians. Because they differ by several features such as the presence of the Xba I site, C4B long or short, and C2 2.65 or 2.70, it is not in general possible to know which gave rise to which. Therefore, the connections are shown as going in either direction (two-headed arrows), except for the clear derivation of CRC F from CRC D. Because CRC E has a short C4B gene but differs from other CRC’s in three respects: 2.40 kb C2, the BF/Taq I 6.6 kb variant and the 4.5 kb (II) C4A 5′/Bgl II variant, it could have arisen from either CRC B or CRC D. The possible connections
are shown as two interrupted unidirectional lines. In Fig.
3, the probable derivations of CRC’s represented only
once or twice among the 234 independent haplotypes
from common CRC’s are shown. CRC’s A, I, Adel1 and Adel2 (Fig. 3a) were connected because they shared the 3′ C4 Xba I 4+7 kb allele and all had the C4B long gene (or deleted C4B) and I, Adel1, and Adel2 could be clearly derived from A by single events. CRC DGdel could have arisen from either CRC G or CRC D, since these differ only by long and short C4B genes and C4B is deleted in CRC DGdel (Fig. 3b). CRC H, represented by the very unusual complotype SB31, could either have arisen from CRC F by a gene-conversion-like substitution of a short C4B gene by a long one, or by ancient crossing over between BF and C4A involving CRC’s F and G, as shown in Fig. 3c. The latter kind of explanation could also apply to the origin of CRC J from CRC B and CRC E (Fig. 3d). The origin of CRC’s Edel1 and Edel2 by deletion of CYP21A and C4B or C4B and CYP21B from CRC E is clear (Fig. 3d).
TABLE 4 Common CRC-complotype frequencies on random normal caucasian haplotypes (a)
CRC-complotype | Frequency
SC31A | .23
SC31C | .13
SC01 | .12
SC31B | .05
SC61 | .05
SC30A | .04
SC02 | .04
FC30E | .04
SC30D | .02
SC2(1,2) | .02
S042 | .02
SC42B | .02
S1C2(1,17) | .02
SC(3,2)0 | .02
(a) CRC designations are given only for those complotypes that occur in more than one CRC.
TABLE 5 CRC-complotype combinations found only once in normal caucasian haplotypes (a)
SC31G, SC30B, SC30G, SC21B, SC22, SC33, SC42D, SB41, SB42, SC51, FC(3,2)0, FC01, FC31B, FC31E, FC30B
(a) CRC designations are as in Table 4.
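The relatedness criterion used for Figs. 2 through 4 (CRC’s sharing the maximum number of RFLP’s are considered most related) can be sketched as a simple pairwise count of shared features. The three constellations below, named X, Y and Z, and their feature values are hypothetical placeholders chosen for illustration; they are not rows of Table 2, and the function names are likewise assumptions.

```python
from itertools import combinations


def shared_features(a: dict, b: dict) -> int:
    """Count the features on which two constellations carry the same variant."""
    return sum(1 for key in a if key in b and a[key] == b[key])


def most_related_pairs(crcs: dict):
    """Return the pair(s) sharing the most features, plus all pairwise scores."""
    scores = {
        (x, y): shared_features(crcs[x], crcs[y])
        for x, y in combinations(sorted(crcs), 2)
    }
    best = max(scores.values())
    return sorted(pair for pair, s in scores.items() if s == best), scores


# Hypothetical constellations (placeholders, not data from this study).
example = {
    "X": {"C2": "2.70", "BF": "4.5", "BglII": "15", "C4B": "long", "Xba": "4+7"},
    "Y": {"C2": "2.70", "BF": "4.5", "BglII": "15", "C4B": "short", "Xba": "11"},
    "Z": {"C2": "2.40", "BF": "6.6", "BglII": "4.5", "C4B": "short", "Xba": "11"},
}

best_pairs, scores = most_related_pairs(example)
print(best_pairs)
print(scores)
```

A count of this kind only ranks similarity. Assigning a direction to a connection still depends on the ‘‘terminal’’ features discussed in the text, which this sketch does not model.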
Figure 4a shows the origin of the common deleted CRC Bdel from CRC B and the origin of the BCdel CRC by deletion of CYP21A-C4B from either CRC B or CRC C. The CRC’s with duplicated C4B-CYP21B genes, Bdup and Ddup, are shown (Fig. 4b) arising from CRC’s B and D.
From the 84 random normal complotypes (Table 3), approximate normal frequencies of the undeleted patterns were A = .27, B = .19, Bdup = .02, C = .17, D = .07, Ddup = .02, E = .06, F = .05, and G = .02. Thus, 9 patterns without deletions accounted for 88% of normal complotypes. Since the frequency of Bdel, with a deletion of C4A, was .12, 10 CRC’s accounted for all observed normal caucasian MHC haplotypes.
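The frequencies quoted above follow directly from the column totals of Table 3. As a minimal check, assuming those totals are read correctly from the table (the variable names are illustrative):

```python
# Column totals from Table 3: number of normal haplotypes carrying each CRC.
normal_counts = {
    "A": 23, "B": 16, "Bdup": 2, "C": 14, "D": 6,
    "Ddup": 2, "E": 5, "F": 4, "G": 2, "Bdel": 10,
}

total = sum(normal_counts.values())

# Frequency of each CRC, rounded to two decimals as in the text.
frequencies = {crc: round(n / total, 2) for crc, n in normal_counts.items()}

# Combined share of the CRC's without deletions (every pattern except Bdel).
undeleted_share = sum(
    n for crc, n in normal_counts.items() if not crc.endswith("del")
) / total

print(total)
print(frequencies)
print(f"undeleted share: {undeleted_share:.2f}")
```

The 88% figure is computed from the raw counts (74 of 84). Adding up the rounded frequencies instead gives .87.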
Remarkably, there are about 30 CRC-specified complotypes that occur at frequencies of .01 or higher and 14 of .02 or higher among normal caucasians. These are shown in Tables 4 and 5. Since only 84 normal complotypes were studied, single instances of complotype-CRC’s gave frequencies of .01. It is likely that those single instances had true frequencies somewhat higher or lower than .01 and that there were some other common combinations not detected.
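The uncertainty attached to a single observation among 84 haplotypes can be made concrete with a binomial confidence interval. The sketch below uses the Wilson score interval, which is a choice made here for illustration and is not an analysis reported in the paper; the function name `wilson_interval` is likewise an assumption.

```python
import math


def wilson_interval(k: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (about 95% for z = 1.96)."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# One instance of a complotype-CRC combination among 84 normal haplotypes.
low, high = wilson_interval(1, 84)
print(f"observed {1 / 84:.3f}, interval {low:.4f} to {high:.4f}")
```

With these inputs the interval runs from roughly 0.002 to 0.06, which agrees with the caution that a single instance fixes the true frequency only loosely.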
DISCUSSION
It is remarkable that only 19 of the more than 1000
theoretically possible CRC’s comprise the major part of
caucasian complotypes. Since most of the RFLP’s that
characterize the CRC’s represent macrostructural features
such as the presence or absence of an intron derived from
an endogenous retrovirus as in the long and short C4B
genes [32, 35–37], the presence or absence of C4 or
CYP21 genes [10, 31], and the size of an intron containing retroviral sequences in the C2 gene [22, 38], this
speaks for a striking conservation of the macrostructure
of the complotype region in the caucasian MHC. This
conservation also contrasts with the polymorphism of
complotypes in terms of protein allotypes [8]. We have
now recognized at least 30 different common protein
complotypes in caucasians.
The distribution of complotypes on CRC’s provides information about the evolutionary origin of complotypes and individual C2, BF, and C4 alleles. Perhaps clearest is the probable origin of BF*F, BF*S1 and BF*F1 from BF*S. This is consistent with earlier observations that support the same conclusion [2, 39]. The charge difference between the less common S1 and F1 variants of factor B compared with BF S or BF F is in the Bb fragment, whereas BF S1 and F1 have Ba fragments identical to BF S; on the other hand, BF F and BF S differ in Ba [2] and the Ba fragment carries the difference between BF FA and BF FB [40]. This has been confirmed and further elucidated by the demonstration that BF F differs from BF S by a single replacement of Arg at position 7 in the Ba portion of BF S by Gln in BF FA or Trp in BF FB [39, 41]. The postulated evolutionary scheme based on CRC’s for these changes is shown in Figure 4.
The finding that SC31 and SC30 occur on a number of closely related CRC’s suggests their close common ancestry. At least two kinds of C4A*3 and C4B*1 are distinguished by nucleotide sequence analysis [42]. We might expect that there are two or three different SC31 (and perhaps more of SC30) complotypes, corresponding to the CRC’s that carry them. The fact that the rare SC51 is in the same CRC B as the relatively common SC61 suggests that the latter gave rise to the former.
From the presence of SC42 in CRC D, it may be that SC02 in the same CRC arose by mutation in SC42 leading to non-expression of the C4A*4 gene. Furthermore, it appears likely that S042 (the type 1 C2 deficiency complotype) and SB42 (both in CRC F) arose from the SC42 in CRC D. We have commented earlier on the probable origin of C2*B and C2*Q0 from SC42 based on the Sst I RFLP in C2 [26]. From the CRC analysis, it is clear that SC42 on the one hand and S042 and SB42 on the other are on ‘‘adjacent’’ CRC, further supporting their evolutionary common origin.
The mechanisms are unclear by which the structural
features arose that differentiate the CRC’s. One can
imagine ancient homologous crossingover at meiosis
playing an important part, for example, in the generation
of CRC J. Such an event might have taken place between
the C2-BF region of a complotype in CRC C and the
C4A region of a complotype in CRC E. Unfortunately,
no sample was available to determine if the FC31 of this
haplotype was BF FA or BF FB. Certainly, for many of
the CRC’s, gene-conversion-like events may have been
involved in their evolution. Unfortunately, the fossil record is incomplete and there is at least one missing link between CRC’s E, Edel1, and Edel2 and all the others, since they differ by at least three different features.
For the localization of putative susceptibility genes near the complotype region, as in gluten-sensitive enteropathy [43], all genetic markers that allow us to ‘‘split’’ common complotypes, such as SC31, SC30 or FC31, are critical. The CRC’s provide such additional markers.
ACKNOWLEDGMENTS
We thank Barbara Moore, Susan Mrose and Carroll Goldsmith
for expert technical assistance and Dr. Devendra Dubey for
thoughtful comments. This work was supported by National
Institutes of Health grants AI 14157, HL 29583, HL 48675,
HD 17461, and DK 26844, and by the Swedish Medical
Research Council and by the Alfred Österlund Fund. Louise
Viehmann provided outstanding secretarial help. Drs. Michael
Carroll, A. Steven Whitehead, and R. Duncan Campbell
kindly provided DNA probes for these studies.
REFERENCES
1. Awdeh ZL, Alper CA: Inherited structural polymorphism of the fourth component of human complement. Proc Natl Acad Sci USA 77:3576, 1980.
2. Alper CA, Boenisch T, Watson L: Genetic polymorphism in glycine-rich beta-glycoprotein. J Exp Med 135:68, 1972.
3. Teng YS, Tan SG: Subtyping of properdin factor B (Bf) by isoelectrofocusing. Hum Hered 32:362, 1982.
4. Geserick G, Patzelt D, Schröder H, Nagai T: Isoelectrofocusing in the study of the Bf system. Existence of two common subtypes of the Bf*F allele. Vox Sang 44:178, 1983.
5. Alper CA: Inherited structural polymorphism in human C2: Evidence for genetic linkage between C2 and Bf. J Exp Med 144:1111, 1976.
6. Hobart MJ, Lachmann PJ: Allotypes of complement components in man. Transplant Rev 32:26, 1976.
7. Meo T, Atkinson J, Bernoco M, Bernoco D, Ceppellini R: Mapping of the HLA locus controlling C2 structural variants and linkage disequilibrium between C22 and Bw15. Eur J Immunol 7:916, 1976.
8. Alper CA, Raum D, Karp S, Awdeh ZL, Yunis EJ: Serum protein ‘supergenes’ of the major histocompatibility complex in man (complotypes). Vox Sang 45:62, 1983.
9. Carroll MC, Campbell RD, Bentley DR, Porter RR: A molecular map of the human major histocompatibility complex class III region linking complement genes C4, C2 and factor B. Nature 307:237, 1984.
10. Carroll MC, Campbell RD, Porter RR: Mapping of steroid 21-hydroxylase genes adjacent to complement component C4 genes in HLA, the major histocompatibility complex in man. Proc Natl Acad Sci USA 82:521, 1985.
11. Carroll MC, Katzman P, Alicot EM, Koller BH, Geraghty DE, Orr HT, Strominger JL, Spies T: Linkage map of the human major histocompatibility complex including the tumor necrosis factor genes. Proc Natl Acad Sci USA 84:8535, 1987.
12. White PC, Grossberger D, Onufer BJ, New MI, Dupont B, Strominger JL: Two genes encoding steroid 21-hydroxylase are located near the genes encoding the fourth component of complement in man. Proc Natl Acad Sci USA 82:1089, 1985.
13. Dunham L, Sargent CA, Trowsdale J, Campbell RD: Molecular mapping of the human major histocompatibility complex by pulsed-field gel electrophoresis. Proc Natl Acad Sci USA 84:7237, 1987.
14. Yang SY, Milford EL, Hämmerling U, Dupont B: Description of the reference panel of B-lymphoblastoid cell lines for factors of the HLA system: the B-cell line panel designed for the tenth international histocompatibility workshop. In Dupont B (ed.): Immunobiology of HLA. Histocompatibility Testing 1987 I: New York, Springer-Verlag, 1989.
15. Raum D, Awdeh Z, Yunis EJ, Alper CA, Gabbay KH: Extended major histocompatibility complex haplotypes in type I diabetes mellitus. J Clin Invest 74:449, 1984.
16. Sim E, Cross SJ: Phenotyping of human complement component C4, a class-III HLA-antigen. Biochem J 239:763, 1986.
17. Gross-Bellard M, Oudet P, Chambon P: Isolation of high molecular weight DNA from mammalian cells. Eur J Biochem 36:32, 1973.
18. Southern E: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503, 1975.
19. Feinberg AP, Vogelstein B: A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal Biochem 137:266, 1983.
20. Belt KT, Carroll MC, Porter RR: The structural basis of the multiple forms of human complement C4. Cell 36:907, 1984.
21. Whitehead AS, Woods DE, Fleischnick E, Chin JE, Yunis EJ, Katz AJ, Gerald PS, Alper CA, Colten HR: DNA polymorphism of the C4 genes: a new marker for analysis of the major histocompatibility complex. N Engl J Med 310:88, 1984.
22. Bentley DR, Campbell RD, Cross SJ: DNA polymorphism of the C2 locus. Immunogenetics 22:377, 1985.
23. Morley BJ, Campbell RD: Internal homologies of the Ba fragment from human complement component factor B, a class III HLA antigen. EMBO J 3:153, 1984.
24. Ratanachaiyavong S, Campbell RD, McGregor AM: Enhanced resolution of the SstI polymorphic variants of the C2 locus: Description of a new size class. Hum Immunol 26:310, 1989.
25. Zhu ZB, Volanakis JE: Allelic associations of multiple RFLPs of the gene encoding complement protein C2. Am J Hum Genet 46:956, 1990.
26. Simon S, Awdeh Z, Campbell RD, Ronco P II, Brink SJ, Eisenbarth GS, Yunis EJ, Alper CA: A restriction fragment of the C2 gene is a unique marker for C2 deficiency and the uncommon C2 allele C2*B (a marker for type 1 diabetes). J Clin Invest 88:2142, 1991.
27. Cross SJ, Edwards JH, Bentley DR, Campbell RD: Polymorphism of the C2 and factor B genes. Immunogenetics 21:39, 1985.
28. Fathallah D, Abbal M, Thomsen M, Cambon-Thomsen A, Campbell RD: A DNA restriction fragment length polymorphism in the complement region of the human MHC shows an absolute correlation with polymorphism of complement factor B (BF) defined by isoelectric focusing. J Immunogenet 12:321, 1985.
29. Schneider PM, Rittner C: Bgl II restriction fragment length polymorphism of human complement C4A gene coincides with BF*F allele of factor B. Immunogenetics 27:225, 1988.
30. Carroll MC, Palsdottir A, Belt KT, Porter RR: Deletion of complement C4 and steroid 21-hydroxylase genes in the HLA class III region. EMBO J 4:2547, 1985.
31. Schneider PM, Carroll MC, Alper CA, Rittner C, Whitehead AS, Yunis EJ, Colten HR: Polymorphism of the human complement C4 and steroid 21-hydroxylase genes. Restriction fragment length polymorphisms revealing structural deletions, homoduplications, and size variants. J Clin Invest 78:650, 1986.
32. Prentice HL, Schneider PM, Strominger JL: C4B polymorphism detected in human cosmid clone. Immunogenetics 23:274, 1986.
33. Whitehead AS, Truedsson L, Schneider PM, Awdeh Z, Fleischnick E, Blumenthal M, Costello W, Gerald PS, Yunis EJ, Alper CA: The distribution of human C4 DNA variants in relation to major histocompatibility complex alleles and extended haplotypes. Hum Immunol 21:23, 1988.
34. Witzel K, Chu X, Rittner C, Schneider PM: Polymerase chain reaction analysis of the XbaI polymorphism of the human complement C4 genes provides evidence for strong haplotype conservation. Hum Immunol 43:165, 1995.
35. Palsdottir A, Fossdal R, Arnason A, Edwards JH, Jensson O: Heterogeneity of human C4 gene size. A large intron (6.5 kb) is present in all C4A genes and some C4B genes. Immunogenetics 25:299, 1987.
36. Dangel AW, Mendoza AR, Baker BJ, Daniel CM, Carroll MC, Wu L-C, Yu CY: The dichotomous size variation of human complement C4 genes is mediated by a novel family of endogenous retroviruses, which also establishes species-specific genomic patterns among Old World primates. Immunogenetics 40:425, 1994.
37. Chu X, Rittner C, Schneider PM: Length polymorphism of the human complement component C4 gene is due to an ancient retroviral integration. Exp Clin Immunogenet 12:74, 1995.
38. Zhu ZB, Hsieh S-L, Bentley DR, Campbell RD, Volanakis JE: A variable number of tandem repeats locus within the human complement C2 gene is associated with a retroposon derived from a human endogenous retrovirus. J Exp Med 175:1783, 1992.
39. Bentley DR, Campbell RD: C2 and factor B: structure and genetics. Biochem Soc Symp 51:7, 1986.
40. Siemens I, Bender K, Geserick G, Mauff G, Pulverer G: The BF F subtypes are detectable in the Ba fragment of factor B. Forens Sci Internat 42:279, 1989.
41. Davrinche C, Abbal M, Clerc M: Molecular characterization of human complement factor B subtypes. Immunogenetics 32:309, 1990.
42. Yu CY, Belt KT, Giles CM, Campbell RD, Porter RR: Structural basis of the polymorphism of human complement components C4A and C4B: gene size, reactivity and antigenicity. EMBO J 5:2873, 1986.
43. Ahmed AR, Yunis JJ, Marcus-Bagley D, Yunis EJ, Salazar M, Katz AJ, Awdeh Z, Alper CA: Major histocompatibility complex susceptibility genes for dermatitis herpetiformis compared with those for gluten-sensitive enteropathy. J Exp Med 178:2067, 1993.