The Involucrin Gene of the Gibbon: The Middle Region Shared by the Hominoids¹
Philippe Djian and Howard Green
Department of Cellular and Molecular Physiology, Harvard Medical School
The evolution of the anthropoid involucrin gene has resulted largely from a process
of vectorial addition of short tandem repeats. The coding region of the involucrin
gene of the gibbon (Hylobates lar), including the segment of repeats, has been
cloned and sequenced, and its repeat structure can now be compared with that of
the other hominoids. In the gibbon, as in the others, repeat additions in the past
can be assigned to early, middle, and late regions of the present-day segment of
repeats. All 10 repeats of the gibbon early region were completed in a common
anthropoid ancestor. All 17 repeats of the gibbon middle region were completed
in a common hominoid ancestor. After divergence of the gibbon lineage, eight
repeats were added to the middle region of the great ape-human lineages. Seven
of these are shared by two to four species, according to the order of their divergences
from each other. After its divergence, the gibbon lineage added a short species-specific
late region. The gibbon also possesses an incomplete repeat just 3’ of the
early region, the only addition in this region in any hominoid. Comparison of the
number of repeats added with the number of nucleotides substituted shows an
inconstant relation between the two.
Introduction
Late in the differentiation of epidermal cells and other keratinocytes, involucrin,
a protein substrate of transglutaminase, becomes incorporated into a cross-linked envelope located beneath the plasma membrane (Rice and Green 1979). This envelope
then contributes to the resistance of the skin.
The nucleotide sequence of the involucrin gene is known for five anthropoid
primates: the human (Eckert and Green 1986), the chimpanzee (Djian and Green
1989b), the gorilla (Teumer and Green 1989), the orangutan (Djian and Green 1989a),
and the owl monkey (Tseng and Green 1989); and for two prosimians: the lemur
(Tseng and Green 1988) and the galago (Phillips et al., accepted). In all these primates,
the coding region contains a segment of short tandem repeats. In prosimians, as in
nonprimate mammals (H. Tseng and H. Green, unpublished data), the segment of
repeats is located in the 5’ half of the coding region, whereas in the anthropoid primates
a different segment of repeats is located in the 3’ half. It was postulated that in a
common anthropoid ancestor the earlier segment of repeats was excised and that a
modern segment of repeats was generated by duplications of a sequence located in
the 3’ half of the coding region (Tseng and Green 1988).
In the anthropoid primates, successive repeats were added vectorially in a 3’-to-5’
direction by a process continuing through numerous lineage branchings. The segment
of repeats has been divided into three regions. The early region consists of the 10
3’-most repeats: as this region is present in both the (new-world) owl monkey (Tseng
and Green 1989) and the hominoids, it must have been generated in a common
anthropoid ancestor. The middle region of the hominoids, consisting of 17-24 repeats,
is shared in part by lower catarrhine species (authors’ unpublished data) but does not
correspond to any region of the owl monkey segment of repeats and must therefore
have been generated after divergence of the catarrhines from the platyrrhines. As the
late region of the hominoids is species specific, it must have been generated in each
lineage after that lineage’s divergence from all others.
We now report the nucleotide sequence of the involucrin gene of the gibbon
(Hylobates lar), the remaining hominoid for which no involucrin gene sequence has
been known.
1. Key words: involucrin, gibbon, hominoid evolution, Hylobates lar, 10-codon repeats.
Address for correspondence and reprints: Howard Green and Philippe Djian, Harvard Medical School,
Department of Cellular and Molecular Physiology, 25 Shattuck Street, Boston, Massachusetts 02115.
Mol. Biol. Evol. 7(3):220-227. 1990.
© 1990 by The University of Chicago. All rights reserved.
0737-4038/90/0703-0002$02.00
Material and Methods
Gibbon keratinocytes were isolated by Dr. R. H. Rice (Harvard School of Public
Health) from a vaginal biopsy performed at the Yerkes Primate Center (Atlanta) and
were serially cultivated (Rheinwald and Green 1977; Simon and Green 1985). The
involucrin gene was cloned by the procedure described earlier, except that the probe
used for screening was a 5.5-kb XbaI-EcoRI fragment containing the whole orangutan
involucrin coding region (Djian and Green 1989b). Two independent clones were
isolated. To obtain clone 1, gibbon genomic DNA was cut with XbaI and HindIII.
The resulting 4-4.3-kb fragments were cloned in pUC18. For clone 2, genomic DNA
was cut with BamHI, and the resulting 2.3-2.5-kb fragments were similarly cloned.
After being subcloned into M13, the entire sequence of the coding region was determined, first on one strand of clone 1 and then on clone 2, almost entirely on the
opposite strand (fig. 1); there was complete agreement between the two sequences.
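The double digest and size selection described above can be illustrated with a minimal sketch. The recognition sequences and cut offsets are the standard ones for these enzymes; the toy sequence, the function names, and the size window are hypothetical and are not taken from the gibbon gene. The sketch treats the DNA as a linear top strand only.

```python
import re

# Standard recognition sites and top-strand cut offsets for the enzymes named in the text.
ENZYMES = {
    "XbaI": ("TCTAGA", 1),     # T^CTAGA
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "PstI": ("CTGCAG", 5),     # CTGCA^G
}

def cut_positions(seq, enzyme):
    """Return the top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead lets overlapping occurrences of the site be found as well.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def digest(seq, enzymes):
    """Cut a linear sequence with one or more enzymes and return the fragments in order."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length falls in the chosen window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

# Hypothetical toy sequence: one XbaI site and one HindIII site separated by filler.
toy = "GG" + "TCTAGA" + "C" * 20 + "AAGCTT" + "GG"
fragments = digest(toy, ["XbaI", "HindIII"])
selected = size_select(fragments, 20, 30)
print([len(f) for f in fragments], [len(f) for f in selected])
```

In the toy case only the fragment bounded by the XbaI and HindIII cuts survives the size window, which mirrors the selection of the 4-4.3-kb fragments for clone 1.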
Results
General Features of the Gene
The restriction map of the gibbon involucrin gene and its flanking DNA is very
similar to that of other hominoids, except for the presence of an EcoRI site 3’ of the
poly(A) addition site and a PstI site in repeats 24-25 (fig. 1). The sequence of the two
parts of the coding region flanking the segment of repeats is shown in figure 2. As in
other hominoids, the two parts have a total of 198 codons and are separated at the
same point by the segment of repeats (fig. 2).
[Figure 1: restriction map showing XbaI, BamHI, PstI, HindIII, and EcoRI sites.]
FIG. 1.-Restriction map and sequencing of the gibbon involucrin gene. An EcoRI and a PstI site are
dotted because they are not found in other hominoid involucrin genes. All other sites are present in other
hominoids. The box represents the coding region of the involucrin gene with its segment of repeats (stippled).
Arrows indicate sequenced parts of overlapping DNA fragments.
[Figure 2: nucleotide sequence of the two parts of the coding region, with the position of the segment of repeats marked.]
FIG. 2.-Coding region flanking the segment of repeats of the gibbon involucrin gene
The Segment of Repeats
In the gibbon involucrin gene, this segment consists of 33 repeats (fig. 3). These
are classified as A or B, according to their first three codons: AAG, CAC, CTG in A
repeats and GAG, CTC, CCA in B repeats (Teumer and Green 1989; Tseng and Green
1989). Repeats can be further distinguished by the presence of nonconsensus marker
nucleotides. The gibbon segment of repeats has been aligned with the previously published segment of repeats of the orangutan (Djian and Green 1989a).
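The A/B classification by the first three codons can be sketched as follows. The two prefixes are those given above; the nearest-prefix rule, the "A/B" label for ties, the use of "---" for a missing codon, and the example repeats are illustrative assumptions rather than the authors' procedure.

```python
# First three codons that define each repeat type (from the text).
A_PREFIX = ("AAG", "CAC", "CTG")
B_PREFIX = ("GAG", "CTC", "CCA")

def prefix_mismatches(codons, prefix):
    """Count nucleotide differences between three codons and a type prefix."""
    return sum(x != y for c, p in zip(codons, prefix) for x, y in zip(c, p))

def classify_repeat(repeat):
    """Classify a space-separated repeat as 'A', 'B', 'A/B' (tie), or 'unclassified'.

    Assumption: a repeat whose first three codons are missing (written '---')
    cannot be classified; otherwise the closer prefix wins.
    """
    first3 = repeat.split()[:3]
    valid = len(first3) == 3 and all(
        len(c) == 3 and set(c) <= set("ACGT") for c in first3
    )
    if not valid:
        return "unclassified"
    d_a = prefix_mismatches(first3, A_PREFIX)
    d_b = prefix_mismatches(first3, B_PREFIX)
    if d_a < d_b:
        return "A"
    if d_b < d_a:
        return "B"
    return "A/B"

# Illustrative inputs, not a transcription of figure 3.
examples = [
    "AAG CAC CTG GAG CAG CAG GAG GGG CAG CTG",
    "GAG CTC CCA GAG CAG CAG GAG GGG CAG CTG",
    "--- --- --- GAG CAG CAG GAG GGG CAG CTG",
    "GAG CAC CCG GAG CAG CAG GAG GGG CAG CTG",
]
for r in examples:
    print(classify_repeat(r), r)
```

A tie on the prefix alone is a simplification: divergence from the A and B consensus is judged in the paper over the whole repeat, which this sketch does not attempt.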
The Early Region
This region, located at the 3’ end of the modern segment, is present in all higher
primates examined earlier and consists of repeats 1-10. Unique to the gibbon is one
incomplete repeat immediately 3’ of repeat 1. This repeat, denoted as -1, is not
present in any other hominoid or in the owl monkey; it therefore does not belong to
the early region and must have been generated in the gibbon lineage after the latter’s
divergence from the lineage leading to the great apes and man. As the first three codons
are missing from the repeat, it cannot be classified as A or B, and as the rest of the
repeat conforms to consensus, there are no marker nucleotides that could clarify its
origin.
The Middle Region
The repeats of this region present in different hominoids have been matched,
and each has been designated by a Greek letter (fig. 4). Each of the 17 repeats of the
gibbon middle region is present in other hominoids, except that repeat ζ has been
deleted from the orangutan and repeats φ and χ have been deleted from the human
(fig. 4). All 17 repeats of the gibbon middle region must therefore have been completed
in a common ancestor of the hominoids.
[Figure 3: aligned nucleotide sequences of the gibbon and orangutan segments of repeats, with repeat type and repeat number.]
FIG. 3.-Segment of repeats of the gibbon involucrin gene. The segment of repeats of the gibbon is
aligned with the previously described segment of the orangutan. Repeats are numbered 3’ to 5’ and are
classified as A or B. Repeat 13 and repeat -1 of the gibbon lack the first three codons and therefore cannot
be identified as A or B. Gibbon repeat 31 is equally divergent from the A and B consensus. Marker nucleotides
coincident in the gibbon and orangutan are underlined in the gibbon sequence. There are 29 coincident
marker nucleotides in the early region and 28 in the middle region. The vertical bar indicates that the late
regions of the orangutan and gibbon do not correspond.
[Figure 4: phylogenetic tree (left) and alignment of middle-region repeats (right); columns give the common designation and the repeat number in Hu, Ch, Go, Or, and Gi.]
FIG. 4.-Shared repeats in the middle region of the hominoids. On the right, the repeats of the middle
region of the five species are aligned. Corresponding repeats, designated by a Greek letter, are shared by two
to five species, except for ρ (no. 23 of the orangutan), which is not shared. On the left is a phylogenetic tree
on which the repeat additions and species-specific deletions (Δ) are indicated. Repeats α-δ and υ, φ, and χ
were added in a common ancestor of both hominoids and cercopithecoids, because they are also present in
one or more old-world monkeys (authors’ unpublished data). On the other side of the diagonal are the
numbers of marker nucleotides shared exclusively by all species diverging at the next branch point. Except
for repeat ρ, all species-specific repeats are located not in the middle region but in the late region (see fig. 3
and Djian and Green 1989b). The ceboids whose involucrin gene sequence is known are the owl monkey
(Tseng and Green 1989), the cebus, and the cotton-top tamarin (M. Phillips, R. Rice, P. Djian, and H.
Green, unpublished data); these all possess an early region homologous to that of the hominoids, but their
middle and late regions are different.
Some of the other repeats of the middle region are not shared by all the hominoids,
because they were generated in a hominoid sublineage (fig. 4). One repeat (μ) is
shared by all the great apes and human but not by the gibbon; two repeats (λ and ξ)
are shared by the African apes and human but not by the orangutan and the gibbon;
and four repeats (ε and ψ-αα) are shared only by the African apes. The addition of
repeats to the middle region in sublineages of the hominoids occurred mainly at its
5’ end or not far from it, but there are exceptions, such as repeat ε of the chimpanzee
and gorilla. The only repeat of the middle region confined to a single species is repeat
23 of the orangutan. It is an exact A repeat, except that the fourth codon encodes
aspartic acid. This repeat (AD) has played an important role in the generation of the
late region (Djian and Green 1989a).
Just as sharing of recently generated repeats in the middle region of the hominoids
is restricted to certain lineages, sharing of marker nucleotides in the early and middle
regions is also restricted (table 1). When the five hominoids are compared in groups
of two, three, or four, the more closely related species are clearly revealed by their
greater sharing of marker nucleotides.
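The idea of marker nucleotides shared only within a group of species can be expressed as a set operation: take the markers common to every member of the group and subtract any marker also found outside it. The sketch below is a hypothetical illustration; the marker tuples (repeat number, codon position, base) are invented toy data, and the exclusions for deleted repeats and conflicting markers used in table 1 are not modeled.

```python
def shared_only_within(markers, group):
    """Markers present in every species of `group` and in no species outside it."""
    group = set(group)
    inside = set.intersection(*(markers[s] for s in group))
    outside = set().union(*(markers[s] for s in markers if s not in group))
    return inside - outside

# Invented toy data: each marker is (repeat number, codon position, base).
toy_markers = {
    "chimpanzee": {(12, 4, "T"), (15, 8, "A"), (3, 2, "C")},
    "gorilla":    {(12, 4, "T"), (15, 8, "A"), (3, 2, "C")},
    "human":      {(15, 8, "A"), (3, 2, "C")},
    "orangutan":  {(3, 2, "C")},
    "gibbon":     {(3, 2, "C"), (20, 1, "G")},
}

print(shared_only_within(toy_markers, ["chimpanzee", "gorilla"]))
print(shared_only_within(toy_markers, ["human", "chimpanzee", "gorilla"]))
print(shared_only_within(toy_markers, ["human", "gibbon"]))
```

With these toy sets, the chimpanzee-gorilla pair and the human-chimpanzee-gorilla trio each have one exclusive marker, whereas human and gibbon have none, which is the kind of contrast the comparison in groups of two, three, or four is designed to reveal.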
Table 1
Shared Marker Nucleotides in the Early and Middle Regions of the Hominoids
Species Compared
Chimpanzee and gorilla
Human and gibbon
All other combinations of two species
Human, chimpanzee, and gorilla
Chimpanzee, gorilla, and gibbon
All other combinations of three species
Human, chimpanzee, gorilla, and orangutan
Human, chimpanzee, gorilla, and gibbon
Chimpanzee, gorilla, orangutan, and gibbon
All other combinations of four species
Human, chimpanzee, gorilla, orangutan, and gibbon
No. of Repeats Compared
Marker Nucleotides Shared Only within the Group
34
25
26-28
28
27
25-27
25
25
26
24
0
14
6b
1
NOTE.-Corresponding repeats of the early and middle regions were examined for marker nucleotides shared by only
two, three, or four species. When a repeat is postulated to have been deleted (nos. 29 and 30 in human and no. 15 in
orangutan), the corresponding repeats in the other species of the same group were excluded. In three cases in which two
different marker nucleotides were present at the same position in different species, these nucleotides were also excluded.
a Of the eight marker nucleotides shared by chimpanzee and gorilla but not by human or orangutan (Djian and Green
1989b), one is also found in the gibbon.
b Not shared by Macaca fascicularis.
The Late Region
The gibbon gene appears to contain a small, species-specific late region. One
repeat (repeat 28) must have resulted from a duplication of repeat 22 after completion
of the middle region; these two repeats share five marker nucleotides (at codon positions
4, 8, and 10).
As in the orangutan, the extreme 5’ end of the segment of repeats in many catarrhines consists of a small group of shared repeats that do not belong to the late
region (Djian and Green 1989a). Although their pattern is similar to repeats 61-64
of the orangutan, repeats 29-32 of the gibbon gene do not clearly match this group
of shared repeats. Repeat 31 cannot be assigned to either type A or type B, and in
none of the four repeats are there definite marker nucleotides shared with other hominoids. These four repeats have tentatively been placed in the late region.
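The assignment of gibbon repeats to regions can be summarized in a small lookup. The early region (repeats 1-10), the partial repeat -1, repeat 28, and the tentative placement of repeats 29-32 come from the text; the range 11-27 for the middle region is inferred here from its stated size of 17 repeats and is an assumption of this sketch, as are the names used.

```python
# Gibbon repeat numbers by region (numbered 3' to 5').
GIBBON_REGIONS = {
    "partial": [-1],                 # incomplete repeat 3' of repeat 1
    "early": list(range(1, 11)),     # repeats 1-10
    "middle": list(range(11, 28)),   # 17 repeats; range inferred from the stated count
    "late": list(range(28, 33)),     # repeat 28, plus 29-32 placed tentatively
}

def region_of(repeat_number):
    """Return the region name for a gibbon repeat number."""
    for region, numbers in GIBBON_REGIONS.items():
        if repeat_number in numbers:
            return region
    raise ValueError(f"no gibbon repeat numbered {repeat_number}")

total = sum(len(numbers) for numbers in GIBBON_REGIONS.values())
print(total, region_of(22), region_of(28))
```

The total agrees with the 33 repeats reported for the gibbon segment, and repeats 22 and 28, related by duplication, fall in the middle and late regions respectively.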
Discussion
The modern segment of repeats began to be added in the anthropoid lineage after
the latter’s divergence from the prosimians. Since that time, it has been periodically
expanded by the addition of more repeats. This process can be examined with some
precision because of two of its features: First, nearly every repeat can be distinguished
by its type and its marker nucleotides. Second, repeat addition proceeded with a definite
order. After the platyrrhine-catarrhine divergence, the catarrhines continued to add
repeats immediately 5’ of repeat 10, to what became the middle region. By the time
of the divergence of the cercopithecoids from the hominoids, seven repeats had already
been added to the middle region (authors’ unpublished data). By the time of the
divergence of the gibbon from the other hominoids, 10 more repeats had been added
to the common hominoid lineage. The great ape-human lineage then added repeat
μ. After divergence of the orangutan, the African ape-human lineage added the two
repeats λ and ξ. After the divergence of the human, the African ape lineage added
repeats ε, ψ, ω, and αα. This completed the middle region of all the hominoids.
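The bookkeeping of additions along the tree can be checked with a short calculation. The numbers of repeats added on each branch are those stated in this paragraph; the species-specific adjustments (one added repeat and one deletion in the orangutan, two deletions in the human) are taken from the Results. The branch names and data layout are assumptions of this sketch.

```python
# Repeats added to the middle region on each ancestral branch (from the text).
ADDED_ON_BRANCH = {
    "catarrhine stem": 7,       # added before the cercopithecoid-hominoid divergence
    "hominoid stem": 10,        # added before the gibbon diverged
    "great ape-human": 1,
    "African ape-human": 2,
    "African ape": 4,
}

# Branches ancestral to each species, in order from the root.
PATH = {
    "gibbon": ["catarrhine stem", "hominoid stem"],
    "orangutan": ["catarrhine stem", "hominoid stem", "great ape-human"],
    "human": ["catarrhine stem", "hominoid stem", "great ape-human", "African ape-human"],
    "chimpanzee": ["catarrhine stem", "hominoid stem", "great ape-human",
                   "African ape-human", "African ape"],
    "gorilla": ["catarrhine stem", "hominoid stem", "great ape-human",
                "African ape-human", "African ape"],
}

# Species-specific changes within the middle region: (additions, deletions).
SPECIES_SPECIFIC = {
    "orangutan": (1, 1),  # repeat 23 added; one shared repeat deleted
    "human": (0, 2),      # two shared repeats deleted
}

def middle_region_size(species):
    """Sum the inherited additions and apply species-specific changes."""
    inherited = sum(ADDED_ON_BRANCH[b] for b in PATH[species])
    added, deleted = SPECIES_SPECIFIC.get(species, (0, 0))
    return inherited + added - deleted

sizes = {s: middle_region_size(s) for s in PATH}
print(sizes)
```

Under these assumptions the computed sizes span 17 (gibbon) to 24 (chimpanzee and gorilla), matching the range of 17-24 repeats given for the hominoid middle region.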
After their divergence from each other, all the hominoids except the chimpanzee
continued to add repeats immediately 5’ of their middle region, thus generating their
species-specific late regions. The only species-specific additions outside of the late
region are repeat 23 of the orangutan, partial repeat -1 in the gibbon, and a six-codon
insertion in repeat 12 of the chimpanzee (Djian and Green 1989b). The process of
consecutive repeat addition has therefore been largely vectorial in the middle region,
as in the early and late regions.
Shared repeats and shared marker nucleotides in the segment of repeats have
been used to establish that the chimpanzee and the gorilla, not the chimpanzee and
human, are sister species (Djian and Green 1989b; for a recent opposing view, see
Williams and Goodman 1989). For the other hominoid groups, both the extent of
repeat sharing in the middle region (fig. 4) and the number of common marker nucleotides in both early and middle regions (table 1) are consistent with earlier conclusions about the relatedness of those species (Goodman 1976; Sarich and Cronin
1976; Goodman et al. 1989). It is clear from the involucrin gene that the most
closely related group of two species consists of the chimpanzee and gorilla, that the most
closely related group of three includes the human, and that the most closely related
group of four includes the orangutan.
It appears that expansion of the segment of repeats during anthropoid evolution
has not been a continuous process. Instead, there have been periods of repeat addition
followed by pauses. Examples can be cited for each of the three regions of the segment
of repeats.
1. The only part of the segment of repeats shared by the hominoids and the new-world
monkeys is the early region, of which repeats 3-6 and repeats 7-10 are duplicate
blocks (Tseng and Green 1989). In these eight repeats, a total of eight marker nucleotides
shared by the new-world monkeys and the hominoids are present in only
one of the duplicate blocks. This indicates that those nucleotides were substituted in
a common ancestor of the platyrrhines and catarrhines after the four-repeat duplication
which completed the early region. It can be concluded that during the time between
the completion of the early region and the divergence of the platyrrhines from the
catarrhines, there were eight nucleotide substitutions in repeats 3-10 but no additions
of repeats.
2. Between the platyrrhine-catarrhine divergence and the gibbon-great ape divergence,
seven marker nucleotides appeared in repeats 3-10 of the common hominoid
lineage, for those marker nucleotides are found in all of the hominoids but not in the
owl monkey; during this period, 17 repeats were added to make up most of the present-day
middle region of the hominoids. It is therefore clear that in the generation of the
middle region there was a marked increase in the rate of repeat generation relative to
the rate of nucleotide substitution.
3. Since the divergence of the chimpanzee from the gorilla, the latter added 7-13
repeats to form the late regions of three known alleles, while the chimpanzee added
none.
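The contrast drawn in examples 1 and 2 can be made explicit by dividing repeats added by marker substitutions for each period. The counts are those given above, with substitutions counted in repeats 3-10 only; the period labels and the ratio itself are a simple illustration, not a rate estimate from the paper.

```python
# (period, repeats added, marker substitutions in repeats 3-10), from examples 1 and 2.
PERIODS = [
    ("completion of early region to platyrrhine-catarrhine divergence", 0, 8),
    ("platyrrhine-catarrhine divergence to gibbon-great ape divergence", 17, 7),
]

def repeats_per_substitution(repeats_added, substitutions):
    """Repeats added per marker substitution observed in the same period."""
    return repeats_added / substitutions

ratios = {name: repeats_per_substitution(r, s) for name, r, s in PERIODS}
for name, value in ratios.items():
    print(f"{name}: {value:.2f} repeats per substitution")
```

The ratio rises from zero in the first period to roughly 2.4 repeats per substitution in the second, which is the inconstant relation between repeat addition and nucleotide substitution noted in the text.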
As a single duplication may add numerous repeats, the rate of repeat addition
has been determined by the size of the units duplicated, as well as by the frequency
of the duplications. The clearest example of this is the orangutan gene; it has generated
27 of the 31 repeats of its late region by a total of four duplications (Djian and
Green 1989a).
Acknowledgments
The authors acknowledge with gratitude the valuable suggestions of Dr. Walter
M. Fitch. This investigation was aided by a grant from the National Cancer Institute.
LITERATURE CITED
DJIAN, P., and H. GREEN. 1989a. The involucrin gene of the orangutan: generation of the late
region as an evolutionary trend in the hominoids. Mol. Biol. Evol. 6:469-477.
DJIAN, P., and H. GREEN. 1989b. Vectorial expansion of the involucrin gene and the relatedness of the hominoids.
Proc. Natl. Acad. Sci. USA 86:8447-8451.
ECKERT, R. L., and H. GREEN. 1986. Structure and evolution of the human involucrin gene.
Cell 46:583-589.
GOODMAN, M. 1976. Toward a genealogical description of the primates. Pp. 321-353 in M.
GOODMAN and R. E. TASHIAN, eds. Molecular anthropology. Plenum, New York.
GOODMAN, M., B. F. KOOP, J. CZELUSNIAK, D. H. A. FITCH, D. A. TAGLE, and J. L. SLIGHTOM.
1989. Molecular phylogeny of the family of apes and humans. Genome 31:316-335.
PHILLIPS, M., P. DJIAN, and H. GREEN. The involucrin gene of the galago: existence of a
correction process acting on its segment of repeats. J. Biol. Chem. (accepted).
RHEINWALD, J. G., and H. GREEN. 1977. Epidermal growth factor and the multiplication of
cultured human epidermal keratinocytes. Nature 265:421-424.
RICE, R. H., and H. GREEN. 1979. Presence in human epidermal cells of a soluble protein
precursor of the cross-linked envelope: activation of the cross-linking by calcium ions. Cell
18:681-694.
SARICH, V. M., and J. E. CRONIN. 1976. Molecular systematics of the primates. Pp. 141-170
in M. GOODMAN and R. E. TASHIAN, eds. Molecular anthropology. Plenum, New York.
SIMON, M., and H. GREEN. 1985. Enzymatic cross-linking of involucrin and other proteins by
keratinocyte particulates in vitro. Cell 40:677-683.
TEUMER, J., and H. GREEN. 1989. Divergent evolution of part of the involucrin gene in the
hominoids: unique intragenic duplications in the gorilla and human. Proc. Natl. Acad. Sci.
USA 86:1283-1286.
TSENG, H., and H. GREEN. 1988. Remodeling of the involucrin gene during primate evolution.
Cell 54:491-496.
TSENG, H., and H. GREEN. 1989. The involucrin gene of the owl monkey: origin of the early region. Mol. Biol.
Evol. 6:460-468.
WILLIAMS, S. A., and M. GOODMAN. 1989. A statistical test that supports a human/chimpanzee
clade based on noncoding DNA sequence data. Mol. Biol. Evol. 6:325-330.
WALTER M. FITCH, reviewing editor
Received November 27, 1989; revision received January 3, 1990
Accepted February 5, 1990