The Nucleotide Targets of Somatic Mutation
and the Role of Selection in Immunoglobulin
Heavy Chains of a Teleost Fish
J Immunol 2006; 176:1655-1667;
doi: 10.4049/jimmunol.176.3.1655
http://www.jimmunol.org/content/176/3/1655
This article cites 66 articles, 30 of which you can access for free at:
http://www.jimmunol.org/content/176/3/1655.full#ref-list-1
The Journal of Immunology is published twice each month by
The American Association of Immunologists, Inc.,
1451 Rockville Pike, Suite 650, Rockville, MD 20852
Copyright © 2006 by The American Association of
Immunologists All rights reserved.
Print ISSN: 0022-1767 Online ISSN: 1550-6606.
Downloaded from http://www.jimmunol.org/ by guest on June 18, 2017
Feixue Yang, Geoffrey C. Waldbieser and Craig J. Lobb
The Journal of Immunology
The Nucleotide Targets of Somatic Mutation and the Role of
Selection in Immunoglobulin Heavy Chains of a Teleost Fish1
Feixue Yang,* Geoffrey C. Waldbieser,† and Craig J. Lobb2*
*Department of Microbiology, University of Mississippi Medical Center, 2500 North State Street, Jackson, MS 39216; and †United States Department of Agriculture, Catfish Genetics Research Unit, Thad Cochran National Warmwater Aquaculture Center, Stoneville, MS 38776

Received for publication October 17, 2005. Accepted for publication November 10, 2005.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 This work was supported by a grant from the National Institutes of Health (AI23052).

2 Address correspondence and reprint requests to Dr. Craig J. Lobb, University of Mississippi Medical Center, Department of Microbiology, 2500 North State Street, Jackson, MS 39216-4505. E-mail address: [email protected]

3 Abbreviations used in this paper: AID, activation-induced cytidine deaminase; CDRT, the total nucleotides or codons encoded within CDR1 and CDR2; DH, heavy chain diversity region gene segment; FR, framework region; FRT, the total nucleotides or codons encoded within FR1, FR2, and FR3; JH, heavy chain joining region gene segment; R, replacement (nonsynonymous) substitution; R:S, the ratio of the number of replacement to silent substitutions; S, silent (synonymous) substitution; VH, heavy chain variable region gene segment.

Copyright © 2006 by The American Association of Immunologists, Inc. 0022-1767/06/$02.00

Sequence analysis of H chain cDNA derived from the spleen of an individual catfish has shown that somatic mutation occurs within both the VH- and JH-encoded regions. Somatic mutation preferentially targets G and C nucleotides with approximately balanced frequencies, resulting in the predominant accumulation of G-to-A and C-to-T substitutions that parallel the activation-induced cytidine deaminase nucleotide exchanges known in mammals. The overall mutation rate of A nucleotides is not significantly different from that expected by sequence-insensitive mutations, and a significant bias exists against mutations occurring in T. Targeting of mutations is dependent upon the sequence of neighboring nucleotides, allowing statistically significant hotspot motifs to be identified. Dinucleotide, trinucleotide, and RGYW analyses showed that mutational targets in catfish are restricted when compared with the spectrum of targets known in mammals. The preferential targets for G and C mutation are the central GC positions in both AGCT and AGCA. The WA motif, recognized as a mammalian hotspot for A mutations, was not a significant target for catfish mutations. The only significant target for A mutations was the terminal position in AGCA. Lastly, comparisons of mutations located in framework region and CDR codons coupled with multinomial distribution studies found no substantial evidence in either independent or clonally related VDJ rearrangements to indicate that somatic mutation coevolved with mechanisms that select B cells based upon nonsynonymous mutations within CDR-encoded regions. These results suggest that the principal role of somatic mutation early in phylogeny was to diversify the repertoire by targeting hotspot motifs preferentially located within CDR-encoded regions. The Journal of Immunology, 2006, 176: 1655–1667.

Mutations generally lead to nonbeneficial consequences. Yet in the humoral immune system, somatic mutation serves as a cornerstone of Ig and Ab diversity by modifying the H and L chain structures encoded by rearranged V(D)J segments. The postrearrangement modifications to Ig genes that include somatic mutation, as well as class-switch recombination and gene conversion, are dependent upon the enzyme activation-induced cytidine deaminase (AID)3 that converts deoxycytidine to deoxyuridine (1–3). Studies to determine the specific nucleotide targets of somatic mutation in mammals have identified the RGYW/WRCY motif (where R = A or G, Y = C or T, and W = T or A) as a principal hotspot for AID-induced G:U lesions (4–6). In addition, the dinucleotide target WA has been identified as a principal site for A/T mutations (7–12), which in turn has led to studies to define the roles of the error-prone polymerases in mismatch repair and their potential involvement as secondary mutators (13–19).

Somatic mutation in mammals is intimately involved in the correlated processes of B cell selection by Ag, which occurs in germinal centers, and affinity maturation. In this developmental pathway, mutated B cells with higher affinity receptors outcompete other B cells for limited amounts of Ag and clonally proliferate, whereas B cells with lower affinity receptors presumably undergo apoptosis. This process results in the production of Ab populations with higher affinity sites for Ag, which progressively increase in time (20, 21). There appears to be at least one other pathway where B cell maturation likely exists. This pathway is located in the splenic marginal zone wherein somatic mutation results in highly mutated IgM B cells that are involved in the T-independent response to Ags. Although it is not yet known whether selection by Ag results in affinity maturation within this IgM B cell population, it is known that affinity maturation can occur in the absence of germinal center formation (22–27).

In contrast to the extensive studies of somatic mutation done in mammals, few studies have addressed the early evolutionary processes of somatic mutation and the nucleotides that are targeted for mutation. Earlier studies in Xenopus and shark H chains indicated that somatic mutation occurred in these two classes of vertebrates and that there was a strong mutational bias toward G and C (28, 29). Subsequent studies on shark L chains and shark NAR have shown that mutations in A and T can account for 40–50% of the mutations. These latter studies have also observed that tandem mutations, ranging in length from 2 to 4 nt, can represent from 25 to 50% of the total mutations, suggesting that alternative mutational and/or repair mechanisms may exist (30, 31).

At present, there have been no definitive studies that prove whether or not somatic mutation occurs in the Igs of bony fish (class Osteichthyes). Earlier studies in the channel catfish have defined 13 different VH families that are used in the H chain cDNA repertoire, and germline segments representing each family have been identified (32–34). The DH locus, identified through approaches that examined the excision products of DH-JH recombination events, comprises at least three DH segments and is located ~9 kb upstream of the nine segments that compose the JH locus (35–37). During repertoire analyses, we observed that point mutations occurred in the JH-encoded region of H chain cDNA, and these initial observations have led to this report, which shows that somatic mutation occurs within catfish H chain V regions. We have subsequently analyzed the mono-, di-, and trinucleotide mutational targets as well as the occurrence of mutations within RGYW/WRCY motifs. These studies, coupled with analyses to determine whether there are selection mechanisms, provide new insight into the early evolutionary patterns and role of somatic mutation in Ig diversification.
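The RGYW/WRCY hotspot motif is written with degenerate nucleotide codes (R = A or G, Y = C or T, W = T or A). As a minimal sketch of how such a motif might be located in a sequence, the following Python example expands the codes into a regular expression and reports overlapping matches. The function names and the example sequence are hypothetical and are not part of the study's methods.

```python
import re

# Degenerate codes used in the motif definitions: R = A/G, Y = C/T, W = A/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}


def motif_regex(motif):
    """Compile a degenerate motif; the lookahead lets matches overlap."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif):
    """Return (0-based start, matched text) for every occurrence of the motif."""
    pattern = motif_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence, for illustration only.
example = "TTAGCTGGAGCAT"
rgyw_hits = find_motif(example, "RGYW")
wrcy_hits = find_motif(example, "WRCY")
print(rgyw_hits)
print(wrcy_hits)
```

In this example AGCT satisfies both RGYW and WRCY because it is its own reverse complement, whereas AGCA satisfies only RGYW.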
Materials and Methods
Construction of library and sequence analysis
Determination of Taq polymerase fidelity
A rearranged H chain cDNA clone that used a member of the VH6 family
(VH6VDJ) was amplified using the same PCR conditions that were used
for cDNA library construction with the exception that the clone was subjected to 30 × 3 cycles (90 rounds) of amplification. The product was
cloned into the pCR2.1 vector. Clones were subsequently sequenced, and
21 mismatches were identified in the 7,714 bases. The resultant Taq polymerase error rate was 0.30 × 10⁻⁴ mutations/bp per cycle. Therefore, Taq
polymerase misincorporation errors within the VH-encoded region should
represent no more than 0.32 to 0.36 mutations per sequence.
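The arithmetic behind these figures can be retraced with a short Python sketch. The mismatch count, base count, and cycle numbers come from the text; the two VH region lengths (360 and 400 nt) are assumptions chosen for illustration because the section does not state the lengths used.

```python
def taq_error_rate(mismatches, bases_sequenced, cycles):
    """Errors per base pair per PCR cycle."""
    return mismatches / (bases_sequenced * cycles)


def expected_errors(rate_per_bp_per_cycle, region_length, cycles):
    """Expected number of polymerase errors in one amplified region."""
    return rate_per_bp_per_cycle * region_length * cycles


# 21 mismatches in 7,714 sequenced bases after 90 rounds of amplification.
rate = taq_error_rate(21, 7714, 90)

# Assumed VH-encoded region lengths (not stated in this section).
ASSUMED_VH_LENGTHS = (360, 400)
LIBRARY_CYCLES = 30  # rounds of PCR used in library construction

# The reported, rounded rate of 0.30e-4 is used for the per-sequence estimate.
bounds = [expected_errors(0.30e-4, n, LIBRARY_CYCLES) for n in ASSUMED_VH_LENGTHS]
print(f"rate = {rate:.2e} errors/bp per cycle")
print("expected errors per sequence:", [round(b, 2) for b in bounds])
```

Under these assumed lengths the estimate falls in the reported range of 0.32 to 0.36 errors per sequence.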
Calculation of mutability indexes
The mono-, di-, and trinucleotide compositions of the utilized VH and JH
regions of the germline or consensus sequences were determined using
Pustell software (IBI) and adjusted manually. The number of mutations in
each sequence was recorded in an Excel spreadsheet. Mutability indexes
were calculated as reported by Shapiro et al. (11) and are defined as the
observed number of times a given mono-, di-, or trinucleotide target was
mutated divided by the expected number of mutations. The frequency of a
Statistical analysis
χ² analysis was used to compare the mutational events in mono-, di-, trinucleotide, and codon analyses by contrasting the observed mutational frequencies to their expected mutational frequencies. p values <0.01 were considered statistically significant, and Bonferroni corrections were applied when the distribution of mutations resulting from a single mutation could be assigned to different di- or trinucleotide targets. χ² tests were also performed in the analyses of mutational frequencies and distributions of various motifs within FR or CDR as reported within the Results. Fisher's exact test was used to confirm the significance of the χ² test when the expected counts of 25% of the cells had values <5 in the comparisons of Taq polymerase error rates. Wilson confidence intervals for binomial parameters were calculated to confirm the significance interval of the χ² tests in the analyses of specific mutated positions within codons and in RGYW/WRCY targets. Statistical analyses of Ag selection pressure on Ig genes used the multinomial distribution model of Lossos et al. (39), which the authors have made available online at <http://www-stat.stanford.edu/immunoglobulin/>. The excess of CDR replacements or the scarcity of FR replacements were judged significant at p < 0.05.
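As an illustration of the Wilson confidence interval for a binomial parameter mentioned above, the following Python sketch implements the standard Wilson score formula. The 95% level (z = 1.96) and the example counts are assumptions for illustration; the section does not state the confidence level the authors used.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials
                               + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Illustrative input: 134 substitutions of one category among 388 mutations.
low, high = wilson_interval(134, 388)
print(f"{low:.3f} to {high:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and behaves reasonably for small counts, which is presumably why it suits analyses of individual mutated positions.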
Results
Somatic hypermutation occurs within both JH- and VH-encoded
regions of catfish H chains
An Ig H chain-specific cDNA library constructed from the spleen
of an adult channel catfish was screened with different VH family
specific probes. From this library, 187 nonidentical clones representing various expressed members of the 13 different catfish VH
families were identified. The JH-encoded regions in these clones
were aligned with the genomic sequences of the previously defined
JH segments (designated JH1–JH9; Ref. 36), and the germline JH
segment used in the rearrangement was determined. These alignments showed two important features. The first was that nucleotide
mismatches were observed in the expressed JH segment when
compared with the sequence of the used germline JH. These mismatches could not be explained by potential allelic variation because the library was constructed from the cDNA from only one
animal. Secondly, clones could be assigned to clonal sets that were
defined as sequences that used the same VDJ rearrangement.
Within a clonal set, identical mismatches from the germline JH
sequence were observed in some but not all of the clones within
a set. Because it was possible that somatic mutation occurred
within the JH-encoded region, and since it was also possible that
mutations could be maintained during B cell clonal expansion,
a mutation in the same position in members of clonal sets was
deemed to represent a single event and therefore only counted
as a single mismatch. By these criteria, there were a total of 79
nucleotide mismatches in the cDNA clones when compared
with the sequences of the germline JH segments (Fig. 1). These
results indicated that these differences were due to either somatic mutation or to Taq polymerase errors that arose during
PCR amplification.
To determine whether these mismatches could be explained by
Taq polymerase misincorporation errors, a single H chain cDNA
clone (VH6VDJ) was arbitrarily selected and extensively amplified by PCR. The resulting PCR products were cloned, and representative products were sequenced. These results identified 21 nt
An Ig H chain cDNA library was constructed using total RNA from the spleen of an individual catfish (Ictalurus punctatus) as reported earlier (34). Briefly, first-strand synthesis was initiated with a primer corresponding to the Cμ2 domain of the catfish H chain, the product was tailed, and 30 rounds of PCR amplification using Taq polymerase (Invitrogen Life Technologies) were conducted using a primer for the Cμ1 domain and the adapter primer provided in the 5′-RACE kit. The amplicons were ligated and cloned into the T/A cloning vector pCR2.1, and individual colonies were transferred to master plates for subsequent analyses. From this library, 187 cDNA clones (gene accession nos. DQ230539–DQ230706, and sequences previously reported; Ref. 34) representing rearranged members of the VH1 to VH13 gene families were defined by hybridization and sequenced with vector primers using ABI PRISM BigDye Terminators chemistry on an ABI PRISM 3700 DNA Analyzer (Applied Biosystems) in the U.S. Department of Agriculture, Agricultural Research Service, MidSouth Area Genomic Laboratory. FR and CDR regions were assigned using the nomenclature of the international ImMunoGeneTics information system (38).

In the 104 sequences that used a known germline VH gene or one of the defined VH consensus sequences (see Results), clones were identified that represented the same VDJ rearrangement and thus were deemed members of the same clonal set. The number of clones within a clonal set ranged from 2 to 9, and identical as well as different nucleotide mutations were observed. Because a mutation could be carried during clonal expansion, a mutation observed in the same position in a clonal set was deemed to represent a single event and therefore was only counted once in mutational analyses. In the VH analyses of these 104 clones, there were 388 mismatches identified in 39,083 total nucleotides. Similarly, the JH-encoded regions within all 187 clones were aligned with the nine germline sequences designated JH1–JH9 defined in earlier studies (36), and mismatches with the assigned germline segment were recorded. There were 79 mismatches identified in 9,129 total nucleotides with identical mismatches within members of clonal sets counted as a single event. Each of the mutations identified in this study was manually verified by inspection of the sequence chromatograph, and the CS designation following the name of the clone was added to sequences assigned to clonal sets.
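The counting rule for clonal sets (a mutation seen at the same position in several members of a set counts as one event) can be sketched in Python as follows. The record layout, clone names, and positions are hypothetical. The sketch assumes that position alone identifies an event within a clonal set, which is one reading of the rule stated above.

```python
from collections import defaultdict


def count_independent_mutations(records):
    """Count mutation events, recording each position once per clonal set.

    records: iterable of (clone_id, clonal_set_or_None, mutated_positions).
    Clones outside any clonal set contribute all of their positions.
    """
    seen_in_set = defaultdict(set)
    total = 0
    for _clone_id, clonal_set, positions in records:
        if clonal_set is None:
            total += len(set(positions))
            continue
        new_positions = set(positions) - seen_in_set[clonal_set]
        seen_in_set[clonal_set] |= new_positions
        total += len(new_positions)
    return total


# Hypothetical clones: position 42 is shared by two members of set CS1.
records = [
    ("cloneA", "CS1", [10, 42]),
    ("cloneB", "CS1", [42, 77]),
    ("cloneC", None, [42]),
]
total_events = count_independent_mutations(records)
print(total_events)
```

Here the shared position 42 in CS1 is counted once, while the same position in the independent clone counts separately.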
target in the database was initially determined as the total number of times
a specific target existed in the database divided by the total number of all
potential targets within the database. This frequency value was then
multiplied by the total number of mutational events to yield the expected
number of mutations.
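The mutability index calculation described here can be retraced for mononucleotides with a short Python sketch. The nucleotide composition of the VH database and the per-nucleotide mutation counts are those reported with Table III; the function name is hypothetical, and the sketch assumes every target occurs at least once in the database.

```python
def mutability_indexes(target_counts, observed_mutations):
    """Observed mutations divided by expected mutations for each target.

    expected = (target count / all targets) * total number of mutations
    Assumes every target has a nonzero count in the database.
    """
    total_targets = sum(target_counts.values())
    total_mutations = sum(observed_mutations.values())
    indexes = {}
    for target, count in target_counts.items():
        expected = count / total_targets * total_mutations
        indexes[target] = observed_mutations.get(target, 0) / expected
    return indexes


# VH database composition (39,083 nt) and mutations per nucleotide (388 total).
composition = {"A": 11407, "C": 8477, "G": 8870, "T": 10329}
observed = {"A": 105, "C": 108, "G": 118, "T": 57}
indexes = mutability_indexes(composition, observed)
print({base: round(value, 2) for base, value in indexes.items()})
```

An index of 1.0 corresponds to the sequence-insensitive expectation; values above 1.0 indicate preferential targeting.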
Mutability indexes for each position in di- and trinucleotide targets
were calculated separately. Thus, for dinucleotides, each mutation was
counted twice (position 1 and position 2). In the trinucleotide database,
every mutation was counted three times. In contrast, in the analysis of
codons (extending from FR1 through the end of FR3) the mutations were
only counted in a single position.
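The position-specific bookkeeping described above, in which one mutation is assigned to two dinucleotide targets or three trinucleotide targets, can be sketched in Python. The function name and the example sequence are hypothetical. Positions are numbered from 1 within each target, matching the first, second, and third positions used in the text.

```python
def kmer_targets(sequence, index, k):
    """List (k-mer, position) targets that contain the mutated base.

    index is the 0-based location of the mutated base in the germline
    sequence; position is the 1-based place of that base within the k-mer.
    """
    if not 0 <= index < len(sequence):
        raise IndexError("mutated index lies outside the sequence")
    targets = []
    for position in range(1, k + 1):
        start = index - (position - 1)
        if start >= 0 and start + k <= len(sequence):
            targets.append((sequence[start:start + k], position))
    return targets


# Hypothetical germline fragment with a mutation at the C (0-based index 2).
di = kmer_targets("AGCT", 2, 2)
tri = kmer_targets("AGCT", 2, 3)
print(di)
print(tri)
```

Bases near the ends of a sequence belong to fewer targets. This is consistent with the database containing 38,979 dinucleotides for 39,083 nucleotides in 104 sequences, one fewer dinucleotide than nucleotides per sequence.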
FIGURE 1. The germline coding region sequences of channel catfish JH segments and the locations where somatic mutations occurred within the
JH-encoded regions of splenic H chain cDNA clones. The nucleotides introduced into the JH-encoded regions by mutation are shown directly underneath
each JH germline sequence (designated JH1–JH9; Ref. 36). The sequence of JH5 is not shown because no mutations were identified in the cDNA clones
that used this segment. The coding regions of the JH3 and JH4 germline segments are identical except at their 5′ ends, and deletion of these characteristic
nucleotides during rearrangement results in cDNA clones that could have used either segment; such clones are designated as having used JH3/4. The
demarcation of the FR4-encoded region is shown.
exhibited coding region similarities likely restricted to a specific
VH germline gene (32–34). Nucleotide differences between
aligned clones within a group were principally single position differences, and these infrequent differences were located at various
positions within the VH-encoded region. Six of these 11 groups
also contained multiple cDNA clones that shared the same VDJ
rearrangement and were therefore members of clonal sets (Table
II). These results allowed us to construct a VH consensus sequence
for each group that was derived from clones that represented three
or more independent VDJ rearrangements. The VH consensus sequence generally shared >98% nucleotide identity to the VH-encoded region of each cDNA clone assigned to that group
(Table II).
When the consensus sequences were aligned with the respective
sequences in these 11 groups, there were a total of 388 mismatches
in the 39,083 VH-encoded nucleotides (nucleotide differences observed in the same position in members of clonal sets were recorded only once). The distribution of these VH substitutions was
significantly different from the distribution of misincorporation errors induced by Taq polymerase when compared with the literature
data or to our internal control (p < 0.0001; Table I). There was,
however, no significant difference between the distribution of the
substitutions within the VH- and JH-encoded regions (p = 0.23;
Table I). We therefore conclude that somatic mutation occurred
within catfish H chain V regions.
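The comparison of the VH and JH substitution distributions can be retraced with a Pearson χ² test on the six pooled substitution categories of Table I. The Python sketch below assumes this is the form of test that was applied, since the exact procedure is not spelled out in the text. The survival function uses a closed-form series for integer degrees of freedom and is a helper written for this illustration, not a library routine.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi_square_sf(x, dof):
    """Upper-tail probability of the chi-square distribution (integer dof >= 1)."""
    half = x / 2.0
    if dof % 2 == 0:
        term = total = 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    total = 0.0
    term = math.sqrt(x)
    for j in range(1, (dof - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 / math.pi) * math.exp(-half) * total)


# Category order: G>A/C>T, G>T/C>A, G>C/C>G, A>G/T>C, A>T/T>A, A>C/T>G.
jh_counts = [22, 6, 7, 27, 10, 7]        # 79 JH substitutions
vh_counts = [134, 25, 67, 100, 37, 25]   # 388 VH substitutions
stat, dof = chi_square_statistic([jh_counts, vh_counts])
p_value = chi_square_sf(stat, dof)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.2f}")
```

Under these assumptions the test gives a p value close to the reported p = 0.23, which supports reading the reported comparison as a test of homogeneity across the six categories.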
Table I. Distribution of nucleotide substitutions resulting from Taq polymerase errors compared to the distribution of substitutions identified in JH- and VH-encoded regions of channel catfish H chain cDNA clones

| Database | G→A, C→T | G→T, C→A | G→C, C→G | A→G, T→C | A→T, T→A | A→C, T→G | Total substitutions |
|---|---|---|---|---|---|---|---|
| B + Dᵃ | 19 (15.7ᵇ) | 1 (0.8) | 0 (0.0) | 79 (65.3) | 17 (14.0) | 5 (4.1) | 121 |
| VH6VDJᶜ | 3 (14.3) | 0 (0.0) | 0 (0.0) | 14 (66.7) | 2 (9.5) | 2 (9.5) | 21 |
| JHᵈ,ᵉ | 22 (27.8) | 6 (7.6) | 7 (8.9) | 27 (34.2) | 10 (12.7) | 7 (8.9) | 79 |
| VHᵉ,ᶠ | 134 (34.5) | 25 (6.4) | 67 (17.3) | 100 (25.8) | 37 (9.5) | 25 (6.4) | 388 |

ᵃ The data of Bracho et al. (Ref. 40; percentage of G + C = 43, 102 errors) and Dunning et al. (Ref. 41; percentage of G + C = 40, 19 errors) were combined (designated B + D), and used as the reference for Taq polymerase misincorporation errors. The percentage of G + C for JH and VH equaled 46 and 44%, respectively.
ᵇ The number within parentheses refers to the percentage of substitutions within the indicated category.
ᶜ A single cDNA clone (VH6VDJ) was subjected to 90 rounds of PCR amplification using Taq polymerase under amplification conditions identical to that used in library construction. The distribution of the 21 substitutions in the 7714 sequenced bases attributable to Taq polymerase misincorporation errors is indicated. The distribution of substitutions was not significantly different when compared to the data of B + D (Fisher's exact test, p = 0.74).
ᵈ The JH-encoded region from 187 H chain cDNA clones was compared to the respective germline JH sequence. The distribution of the 79 substitutions in the 9129 sequenced bases is indicated.
ᵉ The distribution of substitutions was significantly different from the data of B + D and VH6VDJ (χ² or Fisher's exact test; p < 0.0001).
ᶠ The sequences of the VH-encoded region from 104 H chain cDNA clones (Table II) were compared to germline or consensus sequences. The distribution of the 388 substitutions in the 39,083 sequenced bases is indicated. The distribution of substitutions was not significantly different from the JH data (χ², p = 0.23).
substitutions within the sequenced 7,714 bases that were attributable to Taq misincorporation errors. Because these differences
could not be assigned to a specific strand, the 12 possible substitutions were grouped into their respective six transition/transversion categories, and the distribution of the errors was compared with
that reported in the literature (Table I). Our control Taq polymerase error rate was 0.30 × 10⁻⁴ errors/bp per cycle, which was
consistent with the Taq error rate found by others (reviewed in Ref.
41). The distribution of these misincorporations
was also not statistically different from reported literature values
(Fisher's exact test, p = 0.74; Table I). Substitutions for A or T
were more abundant than substitutions for G or C, and transitions
accounted for 81% of the Taq-attributed differences. In contrast,
the distribution of the JH substitutions was statistically different
from our internal control (VH6VDJ) as well as the literature-reported Taq polymerase error rate (Fisher's exact test, p < 0.0001;
Table I). G or C transversions, infrequent in control misincorporations, accounted for 16.5% of the substitutions in JH-encoded
regions.
With these studies indicating that somatic mutation occurred in
JH-encoded regions, the VH-encoded regions within the 187 clones
were assigned to their respective VH family, and the sequences
were aligned. Of the 187 sequences, 104 of these could be readily
assigned to 11 different groups. The VH-encoded regions within
each group shared between 95 and 100% nucleotide identity and
Table II. Summary of the mutations in cDNA sequences assigned to 11 VH consensus sequences

| VH Consensus Sequence | Number of Independent VDJ Rearrangements in the cDNA Clones Expressing the VH Consensus Sequence | Number of cDNA Clones in Clonal Sets that Expressed the Consensus VH Sequence | Total Number of cDNA Clones Expressing the VH Consensus Sequence | Mean ± SD, and Range of the Percentage of nt Identities in cDNA Clones When Aligned with the Consensus VH Sequence | Total Number of Bases Analyzed for VH Mutational Analysis | Total Number of Mutations in the cDNA Clones Compared to the VH Consensus Sequence |
|---|---|---|---|---|---|---|
| VH1A | 4 | 2 (CS4)ᵃ | 5 | 98.1 ± 0.7 (97.5–99.3) | 2000 | 29 |
| VH1B | 4 | 0 | 4 | 98.9 ± 0.3 (98.4–99.2) | 1528 | 18 |
| VH2A | 4 | 4 (CS1)ᵃ | 7 | 98.7 ± 0.5 (97.8–99.2) | 2556 | 31 |
| VH5A | 6 | 4 (CS1)ᵃ | 9 | 98.9 ± 0.6 (97.6–99.5) | 3339 | 36 |
| VH6G | 4 | 0 | 4 | 98.7 ± 0.2 (98.4–98.9) | 1460 | 20 |
| VH6H | 4 | 0 | 4 | 97.9 ± 0.7 (97.3–98.9) | 1439 | 29 |
| VH7A | 10 | 0 | 10 | 99.2 ± 0.4 (98.5–99.7) | 3901 | 33 |
| VH7B | 9 | 2 (CS1)ᵃ,ᵇ, 7 (CS2), 3 (CS3) | 18 | 98.1 ± 1.3 (95.5–100.0)ᶜ | 7117 | 90 |
| VH9A | 17 | 8 (CS1)ᵃ,ᵇ, 3 (CS2), 2 (CS3) | 27 | 99.3 ± 0.6 (97.5–100.0)ᵈ | 9720 | 64 |
| VH9B | 3 | 0 | 3 | 99.5 ± 0.2 (99.5–99.7)ᵉ | 908 | 5 |
| VH10A | 5 | 9 (CS1)ᵃ | 13 | 98.7 ± 0.3 (98.3–99.5) | 5115 | 33 |
| TOTAL | N.A.ᶠ | N.A.ᶠ | 104 | N.A.ᶠ | 39083 | 388 |
Specific nucleotides are differentially targeted by somatic
mutation
With these results, it was important to determine whether specific
bases were preferentially targeted in VH coding regions by somatic
mutation and to compare the pattern of these mutations with those
defined in mammals. Mutability indexes, defined as the observed
frequency of the targeted nucleotide compared with its expected
unbiased mutation frequency, were determined (7, 11). Mutability
indexes were normalized to take into consideration the fact that a
specific nucleotide may not occur at the same relative frequency as
the other three (7). These analyses showed that G and C were
preferentially mutated (p < 0.001 and p < 0.005, respectively),
and that the mutability indexes for these two nucleotides were
similar (Table III). G and C mutations were significantly higher
than the mutations that occurred in either A or T (p < 0.001), and
mutations in T were significantly lower than mutations in A (p <
0.005).
A total of 60.3% of the substitutions were transitions, with R
transitions 19% higher than Y transitions. R transversions (62.3%)
were also more common than Y transversions. The overall transversion to transition ratio was 0.66:1. This ratio is significantly
different from the theoretical 2:1 transversion to transition ratio
(p < 0.001). The transversion:transition ratios for A, C, G, and T
were 0.75, 0.61, 0.76, and 0.43, respectively, and each was also
statistically different from the theoretical ratio (p < 0.001). Thus,
it appears that neither the targeting of mutations nor the patterns of
substitutions that subsequently occur in those positions can be explained by assuming that somatic mutation events occur randomly.
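The transition and transversion figures quoted above follow directly from the substitution counts in Table III, as the Python sketch below retraces. The dictionary layout and helper names are illustrative. The theoretical 2:1 ratio reflects that each nucleotide has two possible transversions and one possible transition.

```python
# Substitution counts (from, to) in the VH-encoded regions, as in Table III.
SUBSTITUTIONS = {
    ("A", "C"): 22, ("A", "G"): 60, ("A", "T"): 23,
    ("C", "A"): 9,  ("C", "G"): 32, ("C", "T"): 67,
    ("G", "A"): 67, ("G", "C"): 35, ("G", "T"): 16,
    ("T", "A"): 14, ("T", "C"): 40, ("T", "G"): 3,
}
PURINES = {"A", "G"}


def is_transition(src, dst):
    """Purine-to-purine or pyrimidine-to-pyrimidine exchange."""
    return (src in PURINES) == (dst in PURINES)


def tv_ts_ratio(base):
    """Transversion-to-transition ratio for mutations away from one base."""
    ts = sum(n for (s, d), n in SUBSTITUTIONS.items()
             if s == base and is_transition(s, d))
    tv = sum(n for (s, d), n in SUBSTITUTIONS.items()
             if s == base and not is_transition(s, d))
    return tv / ts


total = sum(SUBSTITUTIONS.values())
transitions = sum(n for (s, d), n in SUBSTITUTIONS.items() if is_transition(s, d))
transversions = total - transitions
transition_fraction = transitions / total
overall_ratio = transversions / transitions
print(f"{transition_fraction:.1%} transitions, tv:ts = {overall_ratio:.2f}:1")
print({base: round(tv_ts_ratio(base), 2) for base in "ACGT"})
```

The totals reproduce the 60.3% transition share and the 0.66:1 overall ratio reported in the text.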
Sequence-specific patterns of somatic mutation
Mutation hotspots have been identified in mammals that represent
sequence-specific patterns where the mutation rate of a target is
influenced by the presence of specific neighboring bases. To determine whether specific dinucleotides had higher mutation frequencies in the VH database, mutability indexes were calculated
for mutations in the first position, the second position, or the combined positions (position independent) for each of the possible 16
dinucleotides (Table IV). The significantly mutable dinucleotides
Table III. Substitutions and mononucleotide mutability indexes in the VH-encoded regions of channel catfish IgH chains

| From | To A | To C | To G | To T | Total | Mutability Indexᵃ |
|---|---|---|---|---|---|---|
| A | | 22 (21.0)ᵇ | 60 (57.1) | 23 (21.9) | 105 (100) | 0.93 |
| C | 9 (8.3) | | 32 (29.6) | 67 (62.0) | 108 (100) | 1.28ᶜ |
| G | 67 (56.8) | 35 (29.7) | | 16 (13.6) | 118 (100) | 1.34ᵈ |
| T | 14 (24.6) | 40 (70.2) | 3 (5.3) | | 57 (100) | 0.56ᵈ |
| Total | 90 | 97 | 95 | 106 | 388 | |

ᵃ The mutability index is the observed number of mutations in a specific nucleotide divided by the expected number of mutations in that nucleotide. The expected number of mutations was derived by determining the frequency of the nucleotide within the sequenced VH database multiplied by the total number of observed mutations within the database. A mutability index value of 1.0 would be assumed to represent the effects of sequence-insensitive (random) mutations. The observed and the expected number of mutations were compared by χ² analyses, and significant differences are indicated in the footnotes to this table; significantly mutable mononucleotides are in bold type. The total number of nucleotides in the VH database was 39,083 (A = 11,407; T = 10,329; G = 8,870; C = 8,477) with 388 mutations.
ᵇ The numbers in parentheses are the substitution percentage for the indicated nucleotide.
ᶜ Statistically significant by χ² test (p < 0.01).
ᵈ Statistically significant by χ² test (p < 0.005).
Footnotes to Table II.
ᵃ Clonal sets (designated as CS followed by the set name) used the identical VDJ rearrangement. The number of clones that were analyzed in these sets precedes the CS designation. Mutations observed in the same position in members of clonal sets were recorded only once.
ᵇ Three clonal sets were defined that used the designated VH consensus sequence but were rearranged with different DH and JH segments.
ᶜ The VH-encoded regions in clones 3B06AVH7 and 6F01AVH7 were identical to each other as well as to the VH7B consensus sequence.
ᵈ The VH-encoded regions in clones 2D04AVH9 and 2B08AVH9-CS1 were identical to each other as well as to the VH9A consensus sequence.
ᵉ The VH9B consensus sequence was identical to the previously determined VH9.1 germline sequence (AY238378).
ᶠ Not applicable.
Table IV. Comparison of dinucleotide mutability indexes in channel catfish and human VH regions

| Dinucleotide | Catfish VHᵃ: No. of mutations (combined positions) | Catfish VH: Mutability index (combined positions) | Catfish VH: Mutability index (first position) | Catfish VH: Mutability index (second position) | Human VHᵇ: No. of mutations (combined positions) | Human VH: Mutability index (combined positions) | Human VH: Mutability index (first position) | Human VH: Mutability index (second position) |
|---|---|---|---|---|---|---|---|---|
| AA | 55 | 0.95 | 1.21 | 0.69 | 86 | 1.43ᵈ | 0.93 | 1.92ᵈ |
| AC | 64 | 1.10 | 1.03 | 1.17 | 108 | 1.20 | 1.31 | 1.09 |
| AG | 89 | 1.47ᶜ | 0.79 | 2.14ᶜ | 181 | 1.50ᵈ | NA | NA |
| AT | 22 | 0.44ᶜ | 0.56 | 0.32 | 88 | 1.36ᵈ | 1.70ᵈ | 1.02 |
| CA | 70 | 1.00 | 0.80 | 1.20 | 87 | 0.77 | NA | NA |
| CC | 21 | 0.88 | 1.01 | 0.76 | 96 | 0.76ᵈ | NA | NA |
| CG | 5 | 0.46 | 0.37 | 0.56 | 28 | 0.55ᵈ | 0.47ᵈ | 0.63 |
| CT | 73 | 1.15 | 2.02ᶜ | 0.28ᶜ | 105 | 0.89 | NA | NA |
| GA | 33 | 0.63 | 0.53 | 0.73 | 63 | 0.60ᵈ | 0.46ᵈ | 0.74ᵈ |
| GC | 115 | 2.95ᶜ | 3.18ᶜ | 2.72ᶜ | 160 | 1.72ᵈ | 1.95ᵈ | 1.48ᵈ |
| GG | 36 | 0.92 | 0.86 | 0.97 | 87 | 0.66ᵈ | 0.48ᵈ | 0.83 |
| GT | 43 | 0.94 | 1.13 | 0.74 | 100 | 1.13 | NA | NA |
| TA | 42 | 0.93 | 0.89 | 0.98 | 124 | 2.03ᵈ | 1.67ᵈ | 2.40ᵈ |
| TC | 18 | 0.38ᶜ | 0.34 | 0.43 | 60 | 0.61ᵈ | 0.51ᵈ | 0.71 |
| TG | 49 | 0.74 | 0.52 | 0.97 | 63 | 0.56ᵈ | NA | NA |
| TT | 41 | 0.88 | 0.64 | 1.11 | 40 | 1.01 | 0.91 | 1.11 |

ᵃ Mutability indexes were calculated, and significance levels were evaluated by χ² analyses as described in the legend to Table III. Combined positions refer to a mutation that occurred within the indicated dinucleotide without regard for the specific position of the mutation. First and second position refers to a mutation that occurred in the first or the second position, respectively, in the indicated dinucleotide. The total number of dinucleotides in the catfish VH database was 38,979 with 388 mutations. NA, these values were not reported by these authors, but none of these mutability indexes were reported as statistically significant.
ᵇ The number of mutations and the mutability indexes for human VH regions shown in this table were reported by Shapiro et al. (Ref. 11; copyright 1999 by the American Association of Immunologists, Inc.), who derived mutability indexes from the sequences reported by Dorner et al. (10) and Dunn-Walters and Spencer (42).
ᶜ Statistically significant by χ² analyses at p = 0.01; highly mutable dinucleotide positions are in bold type.
ᵈ Statistically significant by χ² analyses at p = 0.01 as reported by Shapiro et al. (11); highly mutable dinucleotide positions are in bold type.
by position were CT, its reverse complement AG, and GC (the significantly mutated positions being the C of CT, the G of AG, and both positions of GC). The only dinucleotides that were significantly mutable in the combined positions were AG and GC. Dinucleotide mutability indexes for the mutations that occurred within the JH-encoded regions were also determined. Although the number of these mutations did not permit statistical evaluation by dinucleotide position, χ² analyses of the combined positions showed that only AG and GC were significantly mutable (p < 0.01).
Shapiro et al. (11) had determined dinucleotide mutability indexes for human VH somatic mutation events. The comparison of these data with the present analyses indicates that AG and GC were significant targets of mutation in both human and catfish VH regions (Table IV). Both of these targets, as well as CT, are located within the mutation hotspot RGYW/WRCY. In the human VH analyses, the additional mutation targets of AT, AA, and TA were also identified. The latter two targets compose the WA motif (where W = T or A), which has been reported as an additional target for mammalian somatic mutation events (7–13). The WA motif is also represented in WRCY motifs (although the targeted nucleotide is not in the C position). None of these WA dinucleotide motifs, however, were significant targets of mutation in catfish VH regions (Table IV).
To determine whether the mutations in catfish VH regions might
reflect alternatives to the hotspot motifs characterized in mammals,
trinucleotide mutability indexes were calculated for each VH mutation in each of the three possible positions. In addition, the position-independent trinucleotide mutability indexes (shown as
“combined” in Table V) were determined. These indexes were compared with the trinucleotide mutability indexes derived by Shapiro
et al. (11, 43) for human VH mutations (Table V). These comparisons indicate several important points. First, the number of VH
trinucleotide mutation targets in catfish is restricted when compared with those that occur in man. In the analysis of the combined
positions there were five significantly mutable trinucleotide targets
in catfish VH, whereas 13 such targets were identified in the human
VH studies. In the analysis of the mutations by position, nine were
identified in catfish, and 29 targets were identified in the human VH
studies. Second, 11 of the 14 total mutation targets identified in
catfish were also present in the human VH studies. Among these are
AGC and GCT; both of these are major targets of somatic mutation
in both species, and both are contained within the RGYW/WRCY
motif. In addition, it is apparent that there is a distinction between
the WAN motifs targeted in man compared with the catfish. TAN,
and to a lesser degree AAN, are both significantly mutable in man,
but none of these motifs was significantly mutable in the catfish
database. Therefore, WAN is not a preferred target for somatic
mutation events in the catfish.
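The position-independent ("combined") index used in these comparisons can be illustrated with a minimal Python sketch. It assumes that a mutability index is the mutation rate observed for a motif divided by the rate averaged over all motifs of the same length; the exact calculation is described in the legend to Table III and may differ in detail. The function name `kmer_mutability`, its input layout, and the toy sequence are hypothetical and are not taken from the catfish database.

```python
from collections import Counter


def kmer_mutability(sequences, mutated_positions, k):
    """Position-independent mutability index for every k-mer.

    sequences: list of germline (consensus) strings.
    mutated_positions: one set of 0-based mutated positions per sequence.
    Assumed definition: (mutations per occurrence of a k-mer) divided by
    (mutations per occurrence averaged over all k-mers).
    """
    occurrences = Counter()
    hits = Counter()
    for seq, positions in zip(sequences, mutated_positions):
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            occurrences[kmer] += 1
            # A mutation counts once for every window that covers it.
            hits[kmer] += sum(1 for p in positions if start <= p < start + k)
    total_hits = sum(hits.values())
    if total_hits == 0:
        return {kmer: 0.0 for kmer in occurrences}
    background = total_hits / sum(occurrences.values())
    return {kmer: (hits[kmer] / occurrences[kmer]) / background
            for kmer in occurrences}


# Toy example (hypothetical data): one sequence with a mutated G at index 1.
indexes = kmer_mutability(["AGCAAA"], [{1}], k=2)
```

In this toy case only the AG and GC windows cover the mutated base, so both receive an index above 1 and the remaining dinucleotides receive 0.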
In addition to the trinucleotides AGC and GCT, significantly
mutable trinucleotides in the first position included CTA, CTC,
and GCA. These motifs, however, may not be additional targets for
mutation because the first two nucleotides of CTA and CTC are in
the WRCY motif, and GCA is found in the RGYW motif. The
other two significantly mutable trinucleotides were AAG and
TAG. These may also not represent novel target motifs because
AG is a significantly mutable dinucleotide and represents the first
two nucleotides contained in RGYW. The latter trinucleotides may
simply represent major contributors to a NAG motif considering
the relatively high (although not significant) mutability indexes of
CAG and GAG.
Somatic mutations are restricted to specific sequences within RGYW/WRCY motifs
The above di- and trinucleotide analyses did not identify any significant motifs with G in the R position of RGYW (i.e., GG, NGG,
GGC, GGT) or any significant motifs with C in the Y position of
WRCY (i.e., CC, GCC, ACC). These results indicated that somatic
mutation events in catfish may have restricted targets, and we proceeded to determine the occurrence and patterns of mutations in
these motifs. RGYW/WRCY motifs represented 28.7% of the
AA
AC
AG
AT
CA
CC
CG
CT
GA
GC
GG
GT
TA
TC
TG
TT
Mutability
index
Human VHb
Table V. Comparison of trinucleotide mutability indexes in channel catfish and human VH regions

               Catfish VHᵃ                                   Human VHᵇ
Trinucleotide  Combined  Position 1  Position 2  Position 3  Combined  Position 1  Position 2  Position 3
AAA            0.75      0.24        1.07        0.95        0.80      0.24        1.20        0.96
AAC            1.27      2.04        0.68        1.09        1.52      0.43        2.14ᵈ       2.00
AAG            1.10      0.93        0.13        2.25ᶜ       1.29      1.11        1.90ᵈ       0.85
AAT            0.99      1.92        0.87        0.17        1.83ᵈ     1.83        2.36        1.31
ACA            1.07      1.02        0.86        1.33        0.74      0.82        0.52        0.89
ACC            1.34      1.34        1.07        1.61        1.18      1.33        1.40        0.82
ACG            0.68      1.36        0.00        0.68        1.46      1.83        0.73        1.83
ACT            0.92      0.89        1.68        0.18        1.24      1.62        1.53        0.57
AGA            0.44      0.60        0.40        0.30        0.59      0.92        0.42        0.42
AGC            3.81ᵈ     1.28        5.37ᵈ       4.79ᵈ       2.50ᵈ     1.47        3.44ᵈ       2.58ᵈ
AGG            0.86      1.07        0.86        0.64        0.68      1.18        0.47        0.39
AGT            0.88      0.28        1.52        0.83        1.20      1.12        1.80ᵈ       0.68
ATA            0.43      0.74        0.18        0.37        1.67ᵈ     0.47        2.19ᵈ       2.35ᵈ
ATC            0.37      0.32        0.16        0.64        1.69ᵈ     2.19ᵈ       1.05        1.84ᵈ
ATG            0.65      0.88        0.00        1.06        0.84      1.26        0.28        0.98
ATT            1.03      0.39        0.77        1.93        1.54ᵈ     2.37ᵈ       0.66        1.58
CAA            1.29      1.25        1.94        0.69        1.12      0.67        0.92        1.76
CAC            1.58      1.85        1.45        1.45        0.79      0.83        0.97        0.56
CAG            0.94      0.07ᶜ       1.01        1.73        1.21      0.75        0.62        2.26ᵈ
CAT            0.41      0.62        0.46        0.15        0.82      0.62        0.74        1.11
CCAᵉ           0.49      0.76        0.76        0.67        0.88      1.16        0.61        0.88
CCCᵉ           1.09      0.76        0.76        0.67        0.37ᵈ     0.67        0.37        0.07ᵈ
CCGᵉ           1.51      0.76        0.76        0.67        0.62      1.08        0.46        0.31
CCTᵉ           0.64      0.76        0.76        0.67        0.71      0.94        0.62        0.57
CGAᶠ           0.90      0.93        0.56        1.30        0.57      0.29        0.00        1.44
CGCᶠ           0.00      0.93        0.56        1.30        0.57      0.46        0.69        0.57
CGGᶠ           0.80      0.93        0.56        1.30        0.57      0.24        0.37        1.10
CGTᶠ           1.34      0.93        0.56        1.30        0.99      0.99        1.39        0.60
CTA            1.96ᶜ     4.76ᵈ       0.23        0.91        1.62ᵈ     2.26ᵈ       1.24        1.36
CTC            1.08      2.51ᵈ       0.30        0.44        0.37ᵈ     0.52        0.33        0.26ᵈ
CTG            0.74      1.17        0.23        0.82        0.73ᵈ     1.25        0.37ᵈ       0.58
CTT            1.07      1.76        0.29        1.17        0.85      1.03        1.20        0.34
GAA            0.81      1.07        0.61        0.76        1.14      0.49        0.33        2.62ᵈ
GAC            0.51      0.46        0.46        0.62        0.24ᵈ     0.16ᵈ       0.40        0.16ᵈ
GAG            1.30      0.45        1.20        2.26        0.98      0.35ᵈ       0.75        1.85ᵈ
GAT            0.46      0.46        0.61        0.31        1.13      0.94        1.61        0.85
GCA            2.33ᵈ     3.42ᵈ       1.51        2.05        1.26      2.09ᵈ       1.39        0.30
GCC            1.25      0.94        1.87        0.94        0.71      0.52        1.33        0.29ᵈ
GCG            0.90      1.35        1.35        0.00        0.43      0.51        0.26        0.51
GCT            2.68ᵈ     3.79ᵈ       4.13ᵈ       0.11        2.18ᵈ     3.83ᵈ       2.01ᵈ       0.71
GGA            0.56      0.45        0.67        0.56        0.76      0.58        0.62        1.07
GGC            1.42      1.13        1.13        1.98        0.74      0.28        1.20        0.74
GGG            1.02      0.38        0.38        2.29        0.64ᵈ     0.25ᵈ       0.59        1.09
GGT            1.22      1.29        1.72        0.65        0.91      0.91        1.36        0.45
GTA            0.97      0.65        0.97        1.29        2.62ᵈ     2.89ᵈ       1.27        3.70ᵈ
GTC            0.59      1.26        0.25        0.25        0.46ᵈ     0.69        0.28ᵈ       0.41
GTG            1.54      1.94        1.05        1.64        0.89      1.17        0.56        0.93
GTT            0.79      0.73        0.73        0.91        1.81ᵈ     3.41ᵈ       1.00        1.00
TAA            0.89      0.59        1.18        0.89        2.22      0.00        5.90ᵈ       0.74
TAC            1.12      1.42        1.03        0.90        2.33ᵈ     2.51ᵈ       2.24ᵈ       2.24ᵈ
TAG            2.54ᶜ     1.02        2.03        4.58ᵈ       2.54ᵈ     1.95        3.32ᵈ       2.34ᵈ
TAT            0.50      0.50        0.33        0.67        1.53ᵈ     1.30        2.30ᵈ       1.00
TCA            0.44      0.30        0.51        0.51        0.77      0.66        0.73        0.93
TCC            0.15      0.22        0.00        0.22        0.65ᵈ     0.41        0.65        0.89
TCG            0.18      0.00        0.00        0.54        0.32      0.65        0.32        0.00
TCT            0.60      0.55        0.69        0.55        0.57ᵈ     0.43        0.85        0.43
TGA            0.96      1.01        0.51        1.35        0.28ᵈ     0.31        0.31        0.23ᵈ
TGC            1.06      0.72        1.73        0.72        0.52      0.29        0.57        0.71
TGG            0.71      0.37        1.12        0.65        0.56ᵈ     0.48        0.44ᵈ       0.76
TGT            0.50      0.21        0.53        0.75        1.02      0.51        1.70        0.85
TTA            1.23      0.48        1.92        1.28        1.69ᵈ     0.74        2.23ᵈ       2.08ᵈ
TTC            0.62      0.93        0.62        0.31        0.73      1.16        0.51        0.51
TTG            1.17      1.46        1.75        0.29        0.50      1.00        0.50        0.00
TTT            0.32      0.14        0.54        0.27        0.95      0.00        0.71        2.14

ᵃ Mutability indexes were calculated, and significance levels were evaluated by χ² analyses as described in the legend to Table III. Combined positions refer to a mutation that occurred within the indicated trinucleotide without regard for the specific position of the mutation. Position 1, 2, and 3 refers to a mutation that occurred in the first, second, or third position, respectively, in the indicated trinucleotide. The total number of trinucleotides in the catfish VH database was 38,875 with 388 mutations.
ᵇ Mutability indexes for human VH trinucleotides are as reported by Shapiro et al. (Refs. 11 and 43; copyright 1999 and 2002 by the American Association of Immunologists, Inc.), who analyzed 738 mutations in 14,811 nucleotides from sequences reported by Dorner et al. (10) and Dunn-Walters and Spencer (42).
ᶜ Statistically significant by χ² tests at p < 0.05; highly mutable trinucleotide positions are in bold type.
ᵈ Statistically significant by χ² tests at p < 0.01; highly mutable trinucleotide positions are in bold type. The human trinucleotides indicated were statistically significant by χ² tests at p = 0.01 as reported by Shapiro et al. (11, 43).
ᵉ For catfish χ² analyses in specific positions, these trinucleotides were evaluated as CCN, where N is any nucleotide.
ᶠ For catfish χ² analyses in specific positions, these trinucleotides were evaluated as CGN.
39,083 nucleotides in the VH database, and these motifs were significantly overrepresented when compared with their expected distribution (p < 0.0001). These general motifs accounted for 183 of
the total mutations in the VH database (47.2%) and were statistically significant targets of mutation events (p < 0.001). The number of the mutations in each of these motifs was then determined
to address which of these motifs were targets for somatic mutation
(Table VI). These results showed that of these 15 different motifs
only AGCT and AGCA had significant position-independent mutability indexes (p < 0.01).
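The position-independent indexes reported for AGCT and AGCA can be approximated with a short Python sketch. It rests on two assumptions that are not stated in the text: that the index is the mutation rate per motif occurrence divided by the database-wide rate per tetranucleotide window, and that the database contains 38,771 tetranucleotides, a figure extrapolated from the reported totals of 38,979 dinucleotides and 38,875 trinucleotides. The function name is hypothetical.

```python
def position_independent_index(motif_mutations, motif_occurrences,
                               total_mutations, total_windows, k):
    """Observed mutations per motif occurrence over the database-wide rate.

    Each mutation is assumed to fall inside k overlapping k-mer windows,
    so the background rate is k * total_mutations / total_windows
    (sequence-end effects are ignored).
    """
    observed_rate = motif_mutations / motif_occurrences
    background_rate = k * total_mutations / total_windows
    return observed_rate / background_rate


TOTAL_MUTATIONS = 388  # reported for the VH database
# Assumption: extrapolated from the reported 38,979 dinucleotides and
# 38,875 trinucleotides (104 fewer per added base); not stated in the text.
TOTAL_TETRANUCLEOTIDES = 38_771

# Motif counts taken from Table VI: (total mutations, total occurrences).
agct = position_independent_index(57, 344, TOTAL_MUTATIONS,
                                  TOTAL_TETRANUCLEOTIDES, 4)
agca = position_independent_index(42, 300, TOTAL_MUTATIONS,
                                  TOTAL_TETRANUCLEOTIDES, 4)
print(round(agct, 2), round(agca, 2))
```

Under these assumptions the two values fall within rounding distance of the 4.14 and 3.50 listed in Table VI, which suggests, but does not prove, that the published indexes were derived in a similar way.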
The specific mutations that occurred within the RGYW/WRCY
motifs were then determined. Fifty-three of the 118 total VH mutations that occurred in G were located in RGYW motifs (44.9%),
and 43 of the 108 total mutations that occurred in C were located
in WRCY motifs (39.8%). Mutations in these positions were significantly higher than the overall mutation rate of G and C in the
VH database (p < 0.001). However, when the mutability indexes
of G and C were determined in each of these motifs, only AGCT
and AGCA were significant targets for G mutations in the RGYW
motifs, and only AGCT was a significant target for C mutations in
the WRCY motifs (Table VI). The other RGYW motifs accounted
for nine G mutations, and the other WRCY motifs accounted for
16 C mutations; none of these motifs were significant hotspots for
G and C mutations (p > 0.01).
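Locating the targeted G in RGYW motifs and the targeted C in WRCY motifs is a simple pattern search, sketched below in Python. The function names and the test sequence are hypothetical; the sketch only shows how such positions could be tallied and does not reproduce the procedure used for the catfish database.

```python
import re

# IUPAC codes used in the text: R = A/G, Y = C/T, W = A/T.
RGYW = re.compile(r"(?=([AG]G[CT][AT]))")  # targeted G is the 2nd base
WRCY = re.compile(r"(?=([AT][AG]C[CT]))")  # targeted C is the 3rd base


def hotspot_positions(seq):
    """Return 0-based indexes of G in RGYW and of C in WRCY motifs.

    Lookahead patterns are used so that overlapping motifs are all found.
    """
    seq = seq.upper()
    g_sites = sorted({m.start() + 1 for m in RGYW.finditer(seq)})
    c_sites = sorted({m.start() + 2 for m in WRCY.finditer(seq)})
    return g_sites, c_sites


def fraction_in_hotspots(mutated_positions, hotspot_sites):
    """Share of mutated positions that coincide with hotspot sites."""
    if not mutated_positions:
        return 0.0
    sites = set(hotspot_sites)
    hits = sum(1 for p in mutated_positions if p in sites)
    return hits / len(mutated_positions)


# Hypothetical sequence containing one AGCT and one AGCA motif.
g_sites, c_sites = hotspot_positions("TTAGCTAAGCATT")
```

AGCT satisfies both patterns because it is its own reverse complement, whereas AGCA matches only RGYW; its C falls in a WRCY motif only when read on the opposite strand (TGCT).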
AGCA had the second highest position-independent mutation
frequency in these analyses, but only 19 of the 42 mutations occurred in the G position. When the specific mutations in the AGCA
motif were examined, 10 mutations occurred in the C position, and
this motif was determined to be a significant hotspot for C mutations (p < 0.01). Therefore, the motif that best describes G and C
hotspots in the catfish VH database is AGCW. This motif explains
37.3% of the mutations that occurred in G and 33.3% of the mutations that occurred in C. We tested this conclusion by removing
the AGCW motifs from the VH database, and then we reanalyzed
the remaining mutations for significant trinucleotide mutability indexes. χ² analyses could only be conducted with confidence on the
mutations in the combined positions of a trinucleotide (position
independent), and these results showed that none of the resulting
Table VI. The occurrence and mutability indexes of the RGYW and WRCY motifs in catfish VH-encoded regions

RGYW motifs (targeted position: G)
                                              AGCT    AGTT    AGCA    AGTA    GGTT    GGTA    GGCT    GGCA
Number of mutations in targeted positionᵃ     25      0       19      1       2       2       2       2
Number of mutations not in targeted position  32ᵇ     0       23      7       2       6       9       6
Total number of mutations in motif            57      0       42      8       4       8       11      8
Total occurrence of motifᶜ                    344     138     300     179     190     130     133     143
Mutability index (position independentᵈ)      4.14ᵉ   NDᶠ     3.50ᵉ   1.11    0.53    1.54    2.07    1.40
Mutability index (position dependentᵈ)        5.46ᵉ   NDᶠ     4.76ᵉ   0.42    0.79    1.16    1.13    1.05

WRCY motifs (targeted position: C)
                                              AGCT    AACT    TGCT    TACT    AACC    TACC    AGCC    TGCC
Number of mutations in targeted positionᵃ     27      2       5       2       3       0       4       0
Number of mutations not in targeted position  30ᵇ     4       6       8       5       3       3       0
Total number of mutations in motif            57      6       11      10      8       3       7       0
Total occurrence of motifᶜ                    344     222     350     387     190     49      171     0
Mutability index (position independentᵈ)      4.14ᵉ   0.68    0.79    0.65    1.05    1.53    1.02    NDᶠ
Mutability index (position dependentᵈ)        6.16ᵉ   0.71    1.12    0.41    1.24    NDᶠ     1.84    NDᶠ

ᵃ Refers to the total number of mutations that occurred at the targeted (underlined) nucleotide position within the RGYW/WRCY motifs.
ᵇ AGCT had 4 mutations in the A position, 25 mutations in the G position, 27 mutations in the C position, and 1 mutation in the T position.
ᶜ Total occurrence refers to the number of times the indicated motif appeared within the VH database. The same occurrence value is shown for AGCT as an RGYW motif and for AGCT as a WRCY motif because these sequences are identical.
ᵈ The mutability index was calculated, and significance levels for position-independent and position-dependent mutations were evaluated by χ² analyses as described in the legend to Table III. χ² analyses for position-dependent mutability indexes were directly evaluated for AGCT (G position), AGCA (G position), and AGCT (C position). The low number of expected mutations in the remaining motifs required that position-dependent χ² tests be performed as “other RGYW motifs” and “other WRCY motifs”; neither of these was significant (p > 0.10).
ᵉ Statistically significant by χ² test (p < 0.01).
ᶠ ND because no mutations occurred in AGTT or TACC; the TGCC motif was not present in the VH database.
trinucleotides were significantly mutable (p > 0.01). Lastly, we
examined the flanking A nucleotides in AGCA to determine
whether either of these positions had significant incidence of mutations. Statistical analyses showed that AGCA, which had nine
mutations, was a significant hotspot for A mutations (p < 0.01).
Therefore, we conclude that AGCW is a significant target for G
and C mutations, and AGCA is a significant target for A mutations.
Patterns of somatic mutations in codons
Three hundred thirty-one of the 388 mutations in the VH database
were located within the region spanned by FR1 through the end of
FR3; 251 mutations were located within the FR regions, and 80
mutations were within the CDR regions. The overall mutability
index of the codons within the three combined FR regions (FR1,
FR2, and FR3; designated as FRT) was 0.90, whereas the overall
mutability index of codons within the combined CDR regions
(CDR1 and CDR2, designated CDRT) was 1.50. χ² analyses
showed that codons in CDRT were significantly more mutable
than those found in FRT (p < 0.001). Mutability indexes for the
individual codons were then derived and their statistical significance evaluated. Position-independent mutational analyses
showed that only three codons were significantly mutable targets:
AGC, GCA, and GCT (p < 0.01).
Mutations in AGC codons accounted for 34 of the 80 mutations
found within CDRT (42.5%), but only 16 of the 251 mutations
found in the FRT (6.4%). These values indicated that AGC
codons were significantly more mutable in CDR regions and/or
that the distribution of AGC codons between the FR and CDR
regions was different. To test this hypothesis, we analyzed the
distribution of the serine codons AGC, AGT, and TCN within the
FRT and CDRT of the VH consensus sequences. Serine codons
represented 10.8 and 23.1% of the FRT and CDRT codons, respectively, and the ratios of AGC:AGT:TCN were 1:0.75:4.18 in
FR regions and 1:0.42:0.63 in CDR regions. χ² analyses showed
that AGC was significantly more represented in CDR regions than
in FR regions (p < 0.001). We then compared the mutation frequency of the AGC codon in the FRT and CDRT. In the VH database AGC represented 301 codons, with 135 located in FRT and
166 located in CDRT; the number of mutated AGC codons in these
regions was 16 and 34, respectively. χ² analyses indicated that the
AGC codon in CDRT was not significantly more mutable than that
found in FRT (p = 0.045). Comparisons with the other highly
mutable codons GCT and GCA showed that neither of these
codons was more frequently represented in CDRT than FRT, and
only GCA, which accounted for seven of the mutations in CDRT,
was more highly mutated in CDRT than in FRT (p < 0.001).
Thus, these results indicate that the nonrandom distribution of the
highly mutable AGC codon within CDRT appears to primarily
explain the higher overall CDR mutation rate.
The impact of selection on somatic mutations
In mammalian productive rearrangements, selection influences the
patterns of mutation because certain amino acid positions do not
appear to tolerate replacements, whereas mutations in other positions appear to occur as the result of selection by Ag. To initially
address the question of selection in catfish productive rearrangements, we analyzed the positional distribution of the 331 mutations
within the codons in the FR1 through FR3 VH-encoded region.
There were a total of 104 mutations in codon position 1, 110 mutations in codon position 2, and 117 mutations in codon position 3.
There was no significant difference in the distribution of mutations
in these positions (p = 0.68), and this was also true when the
positions of mutations in codons located only within FRT were
examined (FRT, p = 0.82). The positions of the mutations in
codons within the CDRT were also not significantly different if the
AGC codon was removed (p = 0.34).
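The comparison of mutation counts across the three codon positions can be sketched in Python as a χ² goodness-of-fit test. The sketch assumes that the expected distribution is an equal share of mutations in each codon position, which is not stated explicitly in the text, and the function name is hypothetical. With three categories (2 degrees of freedom) the upper-tail probability has the closed form exp(-χ²/2), so only the standard library is needed.

```python
import math


def chi_square_uniform(counts):
    """Goodness-of-fit test against equal expected counts.

    Returns (statistic, p_value). The closed form exp(-x / 2) for the
    p-value holds only for 2 degrees of freedom, i.e. three categories.
    """
    if len(counts) != 3:
        raise ValueError("this sketch handles exactly three categories")
    expected = sum(counts) / len(counts)
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, math.exp(-statistic / 2)


# Mutations in codon positions 1, 2, and 3 (FR1 through FR3).
statistic, p_value = chi_square_uniform([104, 110, 117])
```

Under this assumption the p value comes out close to the reported 0.68.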
Of the 331 total mutations within the FR1 to FR3 regions, 251
mutations were in FRT resulting in 173 replacement (R) substitutions and 78 silent (S) substitutions, whereas 80 mutations were
located in CDRT resulting in 51 replacement substitutions and 29
silent substitutions. χ² analyses showed that there was no significant difference in the R:S ratios when the FRT and CDRT regions
were compared (p = 0.39). We also analyzed the substitutions that
occurred only in the AGC codon. In FRT, there were 16 mutations
resulting in 11 replacement and 5 silent substitutions; in CDRT,
there were 34 mutations resulting in 22 replacement and 12 silent substitutions. χ² analyses also showed that there was no significant
difference between these R:S ratios (p = 0.78). Although these
combined results suggest the lack of selection (10, 44), the work of
others has shown that analyses must also focus on the individual
sequences rather than the collective data. Lossos et al. (39), building on the earlier work of Chang and Casali (45), developed a
multinomial distribution model to estimate Ag selection pressure
on expressed Ig genes. In this model, selection is addressed by
determining the excess of replacements in CDR and/or the scarcity
of replacements in FR. p values are derived by determining the
number of replacement and silent substitutions, and values with
significance p < 0.05 are assumed to have resulted by selection
rather than by chance. We compared each of the 93 catfish productive rearrangements that had mutations within the FR1 through
FR3 region with their respective VH consensus sequence using the
Lossos et al. (39) distribution model. In 82 of these sequences
(88%) there was no statistical evidence to suggest either selection
alternative (i.e., excess of CDR replacements or scarcity of FR
replacements). In 10 of these rearrangements, there was counterselection of mutations in FR because these exhibited significant
scarcity of FR replacements. Four of these rearrangements are in
clonal set VH7B-CS2 (see below). Only one clone exhibited significant excess of CDR replacements, and none of the rearrangements
exhibited both significant scarcity of FR replacements and excess
CDR replacements.
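The R:S comparisons between FRT and CDRT described above can be sketched in Python as a 2 × 2 χ² test. The sketch assumes an uncorrected Pearson test (no continuity correction), which is not specified in the text, and the function name is hypothetical. For 1 degree of freedom the upper-tail probability equals erfc(√(χ²/2)), so only the standard library is needed. The multinomial model of Lossos et al. (39) is not reproduced here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom; for 1 df the
    upper-tail probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = n * (a * d - b * c) ** 2 / denominator
    return statistic, math.erfc(math.sqrt(statistic / 2))


# Rows: FRT, CDRT; columns: replacement, silent substitutions.
_, p_all = chi_square_2x2(173, 78, 51, 29)
# Same comparison restricted to mutations in AGC codons.
_, p_agc = chi_square_2x2(11, 5, 22, 12)
```

Under this assumption the two p values come out close to the reported 0.39 and 0.78.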
To further determine whether selection mechanisms may be
present, we analyzed the patterns of somatic mutations in three
clonal sets representing different VDJ rearrangements. These sets
were chosen because they had at least seven different clonal representatives to identify the unique and sequential mutations that
likely occurred during clonal expansion. For each of these sets, the
progenitor VDJ sequence was defined using the VH member consensus sequence for 5′-untranslated region through FR3, the clonal
set consensus sequence for CDR3 with the germline sequence of
the utilized region of the DH segment, and the sequence of the
utilized germline JH segment for FR4. Different basic patterns of
clonal genealogies were evident in each of these three sets (Fig. 2).
The first, depicted by clonal lineage set VH10A-CS1, showed that
clone 2G03 had four mutations when compared with the progenitor consensus sequence. Each of the other clones within the set
likely descended from 2G03, and these clones had accumulated
one to three different mutations that were not present in 2G03 or in
each other. In clonal set VH9A-CS1 a different pattern was observed in that five of the clones (2H10, 2B08, 2C03, 2F03, and
2E04) represented different lineages that descended from the progenitor VDJ consensus. These radiating clonal descendants had
accumulated from one to three mutations, and the mutations in one
sublineage were different from those found in other sublineages.
Two other clones (2C06 and 2F11) had descended from a hypothetical intermediate, designated H1 (Fig. 2B), which had accumulated four mutations since it had descended from the parental VDJ.
Clone 2C06 had accumulated four additional mutations, whereas
2F11 had a single additional mutation in comparison to H1.
The third clonal set, VH7B-CS2, was different from each of
these other clonal sets. Various major lineages appear to have descended from the progenitor VH7B VDJ sequence, and the mutations that accumulated in the clonal descendants were extensive.
Lineage analysis required that two hypothetical intermediates, designated H1 and H2, be constructed to serve as clonal intermediates
in clonal expansion. H1 was postulated because clones 6E07 and
6D07 had mutations in common, and therefore a common intermediate was likely. These two descendant clones had extensively
diverged from each other with a total of 27 different mutations
between them. All of the other clones within this clonal set were
derived from a different sublineage. This sublineage had accumulated eight common mutations when compared with the progenitor,
and these identical mutations were presumed to have been present
in an intermediate designated as H2. Two subsequent pathways
FIGURE 2. Genealogical relationships between clonally related splenic cDNA sequences. A–C, Three different clonal sets that used members of three different VH families (VH10, VH9, and VH7, respectively) are shown. Within each clonal set, the dashed circle indicates the common progenitor, the dotted circle(s) represents the hypothetical (H) intermediate(s), and the solid circle represents the sequenced cDNA clone designated by its name. The number beside each arrow indicates the number of different mutations introduced during clonal expansion. For example, in A, clone VH10A2G03CS1 had accumulated a total of four different mutations since it had descended from its progenitor, and each of these four mutations was found in all other clonal descendants. Each of the other descendant clones had in turn accumulated additional mutations as indicated by the number adjacent to the arrow. D, The multiple sequence alignment of the cDNA clones within clonal set VH7B-CS2. The sequence of the progenitor consensus (7B-Cons) is shown on the top line and is demarcated into the 5′-untranslated region, FR and CDR regions, and partial Cμ region. The utilized regions of the germline sequence of the DH3 segment (overlined; Ref. 37) and the germline JH2 segment (underlined; Ref. 36) are shown. Dots indicate sequence identity with the 7B-Cons sequence, and dashes indicate nucleotides that were absent from the 5′ ends of clones 1E10 and 6H01; the nucleotides introduced by mutation are indicated.
had occurred within this sublineage. The first, represented by clone
6A02, had accumulated an additional 13 mutations from H2. The
second, represented by the other remaining clones, descended from
a common intermediate (1E10), which itself had accumulated four
additional mutations in comparison with H2. Thus, the mutations
present within this clonal set are extensive, with the most distant
descendant in this set (6A02) having diverged from its progenitor
by 21 mutations.
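The divergence of a clone from its progenitor is the sum of the mutation counts along its path in the genealogy, which can be sketched in Python as follows. Only the edges of clonal set VH7B-CS2 whose counts are stated in the text are included; the dictionary layout and function name are hypothetical.

```python
# Each node maps to (parent, mutations acquired since the parent).
# Only edges whose counts are stated in the text are included.
VH7B_CS2 = {
    "H2": ("progenitor", 8),
    "6A02": ("H2", 13),
    "1E10": ("H2", 4),
}


def mutations_from_progenitor(tree, node, root="progenitor"):
    """Sum the mutation counts along the path from the root to a node."""
    total = 0
    while node != root:
        parent, count = tree[node]
        total += count
        node = parent
    return total


divergence_6a02 = mutations_from_progenitor(VH7B_CS2, "6A02")
divergence_1e10 = mutations_from_progenitor(VH7B_CS2, "1E10")
```

For clone 6A02 the path through H2 gives 8 + 13 = 21 mutations, in agreement with the text.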
To determine whether selection by Ag had occurred within any
of these three clonal sets, the multinomial distribution analyses of
replacement and silent substitutions extending through the end of
FR4 were calculated. These results showed that when the members
in these clonal sets were compared with either their progenitor or
their immediate clonal precursor (as shown in Fig. 2), only one of
these clones had significant p values for either scarcity of FR replacements or excess CDR replacements. Clone 3B08 in VH7B-CS2, which had 17 total mutations compared with its progenitor,
was marginally significant for scarcity of FR replacements (p =
0.044). Thus, within these clonal sets selective forces to either
conserve FR or to accumulate R mutations in CDR do not appear
to exceed that expected to occur by chance.
Discussion
These studies have sought to provide insight into the nucleotide
targets and potential mechanisms of somatic hypermutation that
are operational at the phylogenetic level of bony fish. The analyses
of mononucleotide mutations indicated that A, C, G, and T mutations accounted for a relative mutation rate in catfish VH regions of
27, 28, 30, and 15%, respectively, with transitions more frequent
than transversions. When corrected for base composition, mutations in A, C, G, and T nucleotides occurred with mutational frequencies of 23, 31, 33, and 14%, respectively. Mutability indexes
were determined, and statistical analyses showed that the number
of mutations in A was not significantly different from that expected from sequence-insensitive (random) mutations. C and G
were mutated at frequencies higher than expected from random
mutations, and mutations in T were significantly lower than expected. Thus, the mutational frequencies of G and C (unlike A and
T) are approximately equal. G and C are the preferred targets for
somatic mutation, and there is a bias against mutations occurring
in T.
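The distinction between transitions and transversions mentioned above can be made explicit with a brief Python sketch. A transition exchanges one purine for the other (A and G) or one pyrimidine for the other (C and T); every other substitution is a transversion. The function names and the example substitutions are hypothetical and are not drawn from the catfish data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(germline, mutated):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if germline == mutated:
        raise ValueError("not a substitution")
    pair = {germline, mutated}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(substitutions):
    """Share of (germline, mutated) pairs that are transitions."""
    if not substitutions:
        return 0.0
    hits = sum(is_transition(g, m) for g, m in substitutions)
    return hits / len(substitutions)


# Hypothetical substitutions: two transitions and two transversions.
example = transition_fraction([("C", "T"), ("G", "A"), ("A", "T"), ("C", "G")])
```

With four possible bases, each nucleotide has one transition and two transversion outcomes, so a transition share above one-third already exceeds what unbiased substitution would give.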
These results can be compared with studies done in other vertebrates. For example, Milstein et al. (8) reported average mutational frequencies of A, C, G, and T of 33, 23, 24, and 20%,
respectively, for human and mouse H and L chains and artificial
substrates inserted into murine transgenes. Smith et al. (7) reported
mutation frequencies of 33, 26, 24, and 16%, respectively, in a
collective study on mutations in murine V genes. There was no
significant difference when these frequencies were compared with
the mononucleotide mutation frequencies defined in catfish VH or
JH databases (p > 0.1). The generally observed imbalance of A
compared with T mutations in mammalian systems has been a
basis for proposing a strand-biased mechanism that differentially
targets A:T but not G:C pairs (discussed below). Earlier studies on
shark (29) and Xenopus (28) H chains had analyzed a limited number of mutations and concluded that these lower vertebrates exhibited a strong mutational bias toward G and C. However, extensive mutational analysis on shark L chains (30) and shark new Ag
receptor (31) have subsequently shown that mutations in A and T
generally represent >40 and >50% of the mutations, respectively.
These latter studies have also observed that tandem mutations,
ranging in length from two to four nucleotides, are characteristic of
the mutational pattern and may represent >50% of the total mutations. The percentage of transitions also varied when substitutions in point mutations were compared with those observed in
tandem mutations. These results have suggested that alternative
mutational and/or repair mechanisms may be operational. Tandem
mutations, however, are not characteristic of the mutational pattern
in catfish H chains as these represent <5% of the total mutations.
The analyses to determine the nucleotide targets of somatic mutation in the catfish have shown that specific motifs target specific
nucleotides for mutation. Dinucleotide analyses of the VH and JH
databases showed that CT, its reverse complement AG, and GC are
significant targets for G and C mutations. No significant dinucleotides were identified that were targets for either A or T mutations
in either database. The lack of A and T dinucleotide targets suggested that trinucleotide analyses might define targets for A and T
mutations if these mutations occurred in highly specific targets. In
addition, these analyses would determine whether G and C mutations were more restricted than the dinucleotide analyses had indicated. Trinucleotide analyses were conducted on the mutations in
the VH database, because the JH database had insufficient inherent
structural diversity. Although none of the trinucleotides were identified as significant targets for A or T mutations, nine significantly
mutable positions in seven different trinucleotides were identified
as significant targets for G or C mutations. These seven motifs
were AAG, AGC, CTA, CTC, GCA, GCT, and TAG. The two
palindrome motifs AGC and GCT, respectively, accounted for 39
or 29% of the total mutations in G and accounted for 38 or 34% of
the total mutations in C. The other highly mutable C trinucleotide
targets (CTA and CTC) accounted for a combined total of 35% of
the C mutations. Of the remaining three trinucleotides (GCA,
AAG, and TAG) GCA was the most mutated and accounted for
21% of the G mutations; the other two motifs accounted for 14 and
8% of the G mutations, respectively. It must be noted that these
separate percentages should not be considered as additive. For example, a single mutation that occurred in the G position of the
tetranucleotide AGCT would be counted in both AGC and GCT.
The vertical alignment of these targeted trinucleotides indicated
that many of these motifs might be included in RGYW/WRCY
motifs, and we proceeded to test this hypothesis. RGYW/WRCY
motifs were found to be significantly overrepresented in the VH
database; nonetheless, these motifs were significant targets of mutations accounting for 47% of the total VH position-independent
mutations. Position-dependent analyses, however, showed that
only two of these general motifs were significant targets for G and
C mutations, AGCT and AGCA. These two combined motifs explained 37% of the total mutations in G and 34% of the total
mutations in C. In addition, these analyses showed that AGCA was
a significantly mutable target and accounted for 9% of the total
mutations that occurred in A. These results were tested by removing the AGCW motifs from the VH database, and the mutability
indexes of the resulting trinucleotides were recalculated. These
results showed that none of these resulting trinucleotides were now
significantly mutable (p > 0.01). Thus, we conclude that AGCT and
AGCA (as targets for G and C mutations) and AGCA (as a target for
A mutations) are the principal motifs
for specifically targeted somatic mutations in catfish H chains.
These results allow comparisons to be made with the somatic
mutation events characterized in mammals. Two general features
of somatic mutation events in mammals have become evident. The
first is that approximately equal numbers of mutations appear to
occur in G and C nucleotides. The second is that the number of
mutations in A generally exceeds the number of mutations in T (7,
8, 46, 47). These results have been one of the foundations for
proposing two underlying mechanisms or stages for somatic mutation. The first targets G and C nucleotides and is strand independent. It is now known that mutations in G and C are principally
due to AID, which catalyzes deamination of C residues to U residues and preferentially targets RGYW/WRCY motifs (1, 2, 5).
The DNA deamination model predicts that when the lesion in G:U
pairs is repaired, faithful replication would convert the U to a T
and result in the observation that transitions predominate at the
mutated sites (48). These features of mutational analyses in mammals also appear to be characteristic of the mutational patterns
observed in catfish H chains (Table III). With regard to the specific
targeting of mutations to RGYW motifs, it is clear that the spectrum of RGYW targets used in mammals is restricted in catfish.
Dorner et al. (49) calculated the position-independent number of
mutated RGYW motifs in productively rearranged human H
chains (mutations in WRCY were not reported). In these analyses,
28.2% of the total mutations were located in RGYW motifs with
13.2% of the total mutations located in AGCW. In the catfish database, 35.6% of the total mutations were located in RGYW motifs
with 25.5% of the total mutations located in AGCW. Therefore,
the targeting of mutations to RGYW motifs as well as the number
significant difference. There was also no difference between FRT
and CDRT when the R:S ratios of the mutations found only in the
AGC codon were compared. The higher ratio of R:S substitutions
in CDR as compared with FR has been used as an indication for
Ag selection (45); thus, by these criteria, mechanisms targeted toward selection of replacement mutations in CDR do not appear to
be present.
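As an illustration of how an R:S ratio of the kind compared above can be tallied, the following Python sketch classifies each codon change as a replacement (R) or silent (S) substitution using the standard genetic code. The function names and the example codon pairs are hypothetical; the sketch does not reproduce the authors' database or their statistical comparison of FR and CDR.

```python
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify(germline_codon, mutated_codon):
    """Return 'S' if the encoded amino acid is unchanged, otherwise 'R'."""
    same = CODON_TABLE[germline_codon] == CODON_TABLE[mutated_codon]
    return "S" if same else "R"

def rs_ratio(codon_pairs):
    """Return (R count, S count, R:S ratio) for (germline, mutated) pairs."""
    counts = Counter(classify(g, m) for g, m in codon_pairs if g != m)
    r, s = counts["R"], counts["S"]
    return r, s, (r / s if s else float("inf"))

# Hypothetical mutations in the serine codon AGC.
example = [("AGC", "AGT"), ("AGC", "AAC"), ("AGC", "ACC")]
r_count, s_count, ratio = rs_ratio(example)
```

Computing the ratio separately for codons assigned to FR and to CDR gives the two values whose comparison is discussed in the text.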
The multinomial distribution method was also used to evaluate
the question of B cell selection by Ag, and this model addresses
two important features. The first is whether selection mechanisms
serve to conserve the basic framework structure of the H chain by
selecting for synonymous substitutions. The second is whether selection mechanisms serve to select for nonsynonymous substitutions within the CDR regions that may alter Ag binding. In our
analyses, only 10 of the 93 productive rearrangements that had
mutations showed evidence for significant scarcity of FR replacements. Of the 51 rearrangements that had four or more mutations
in the FR1 to FR3 regions, only seven of these had significant
scarcity of FR replacements (13.7%), and four of these were within
the VH-encoded region of clonal set VH7B-CS2. In comparison,
66% of the H chains in B cell lymphomas (57), 36–84% of the H
chains in tumor-infiltrating B cells (58), and 72–82% of the H
chains in synovial B cells (59) were significant for scarcity of FR
replacements and/or excess of CDR replacements.
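The idea of testing a single rearrangement for scarcity of FR replacements or excess of CDR replacements can be sketched in Python as a one-sided binomial tail probability. This is a simplified stand-in and not necessarily the exact test of the multinomial distribution method (39); it relies only on the fact that, under a multinomial model, the count in any one category is binomially distributed. The per-mutation probability used below and the example counts are hypothetical.

```python
from math import comb

def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): probability of k or fewer events."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): probability of k or more events."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical clone: 10 mutations in total, 2 of them FR replacements,
# with an assumed chance of 0.55 that a random mutation is an FR replacement.
p_scarcity = binomial_lower_tail(2, 10, 0.55)

# Hypothetical clone: 4 mutations in total, 2 of them CDR replacements,
# with an assumed chance of 0.15 that a random mutation is a CDR replacement.
p_excess = binomial_upper_tail(2, 4, 0.15)
```

A small lower-tail value would be read as scarcity of FR replacements and a small upper-tail value as excess of CDR replacements; in practice the per-category probabilities would have to be derived from the germline sequence rather than assumed.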
The analyses of the VH7B-CS2 clonal set proved especially
informative. This set, composed of seven clones, exhibited extensive mutations when these clones were compared with the VH
clonal set progenitor sequence (Fig. 2C). In this set, clone 1E10
was the immediate precursor to three other clones (3B08, 6F08,
and 6H01). 1E10 was significant for scarcity of FR replacements
in the VH-encoded region as were two of its descendants (6F08 and
3B08). The third descendant 6H01, which had acquired two additional mutations, was no longer significant for scarcity of VH FR
replacements. Similarly, two descendants (6D07 and 6E07) that
arose from branches different from those leading to 1E10 had accumulated a total of 17 or 10 mutations, respectively, from the progenitor sequence; but neither of these exhibited significant scarcity
of FR replacements. Lastly, when the multinomial distribution
studies were expanded to include the CDR3 and FR4 regions, only
1 clone in this clonal set (3B08) showed evidence for significant
scarcity of FR replacements. Thus, if a significant positive selective force for synonymous FR substitutions exists, it is not uniformly evident even within members of the same clonal set.
Only 1 of the 93 productive rearrangements exhibited significant
excess of nonsynonymous CDR substitutions. This single clone
(2D12AVH10) had four total VH mutations with two of the three
mutations within the CDR resulting in replacements. This minimal
result strongly indicates that positive selection mechanisms that
serve to enrich B cells based upon nonsynonymous substitutions in
CDR do not appear to be functional in bony fish. This conclusion
is in agreement with other studies in bony fish that have shown that
the affinity of the serum Ab population varies but does not significantly increase with time postimmunization when compared with
the 3- to 4-log increase in affinity typically observed in mammals
(60–64). In channel catfish, the affinity of the serum anti-DNP Ab
population was measured by equilibrium dialysis in samples from
individual animals over a 2-year period. These results showed that
during this extended time period, there was less than a 1-log increase in the affinity of the Ab population (62). These studies also
detected low-affinity sites in the Ab population at each time point,
and Sips analysis of Ab heterogeneity did not show a significant
decrease during the immunization period. These results appear to
be consistent with the present analyses on the divergence patterns
of members in clonal sets. Mutations appear to accumulate during
of mutations targeted to AGCW is significantly higher in catfish H
chains (χ², p < 0.01; also see mutability indexes of RGYW/
WRCY motifs; Table VI).
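The position-independent tally of mutations in RGYW, WRCY, or AGCW motifs can be sketched in Python by expanding the IUPAC ambiguity codes (R = A or G, Y = C or T, W = A or T) into a regular expression. The sketch assumes the usual convention that the mutated nucleotide is the G of RGYW (offset 1) and the C of WRCY (offset 2); the function names, the toy sequence, and the mutated positions are hypothetical and do not reproduce the percentages reported above.

```python
import re

# IUPAC ambiguity codes needed for the motifs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}

def motif_starts(seq, motif):
    """Return 0-based start positions of all (overlapping) motif matches."""
    pattern = "".join(IUPAC[ch] for ch in motif)
    # A lookahead finds overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]

def fraction_in_motif(seq, mutated_positions, motif, target_offset):
    """Fraction of mutations that fall on the targeted base of the motif."""
    if not mutated_positions:
        return float("nan")
    targets = {start + target_offset for start in motif_starts(seq, motif)}
    hits = sum(1 for pos in mutated_positions if pos in targets)
    return hits / len(mutated_positions)

# Toy sequence containing AGCT and AGCA (both are RGYW and AGCW motifs).
toy_seq = "AAGCTAAGCAT"
rgyw = motif_starts(toy_seq, "RGYW")
wrcy = motif_starts(toy_seq, "WRCY")
share = fraction_in_motif(toy_seq, [2, 7, 10], "RGYW", target_offset=1)
```

AGCT matches both RGYW and WRCY because it is its own reverse complement, whereas AGCA matches only RGYW on the strand shown.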
In contrast to G and C mutations, the imbalance of mutations in
A and T has indicated that there is a strand-biased mechanism for
mutations in A/T that preferentially occurs in the WA motif (9, 50,
51). Extensive studies to explain these observations have implicated numerous repair enzymes involved in resolving U:G mismatches (recently reviewed in Refs. 14–19). Although discussion
of these enzymes and mechanisms is beyond the scope of this
paper, no evidence was obtained in these studies to indicate that
the WA motif is a targeted site for mutation in catfish V regions.
However, the conclusion that AGCA is a hotspot for A mutations
supports the hypothesis that resolution of U:G mismatches in this
RGYW motif involves mutations in adjacent sites. Following the
recent conclusions of Neuberger et al. (14), this may be the first
report to provide direct linkage of significant mutations in C:G
pairs and adjacent A:T pairs. This result was detected in these
studies because of the high concentration of mutations targeted
toward limited RGYW motifs. If it is assumed that G:C mutations
in catfish are targeted by AID, then a two-step or second stage of
mutation, which uses an independent mutational mechanism targeted toward A:T pairs, does not necessarily need to be postulated.
AID-related structures have been identified in different species of
bony fish, including the catfish, based upon their sequence identity
to mammalian AID (52, 53). Recent studies have also found that
AID from zebrafish and fugu are able to catalyze class switch
recombination in mouse B cells. In addition, mutator activity was
demonstrated by reversion of an inactive kanamycin allele in Escherichia coli and inactivation of ura3 in Saccharomyces cerevisiae
(54). Thus, AID-related structures in bony fish appear to have
functional activity.
A central question of somatic hypermutation is whether somatic
mutation serves to alter the ability of expressed Ab H and L chains
to bind Ag. The related second question is whether a mechanism
exists that can preferentially select the B cells with the mutated
higher affinity binding sites such that these populations predominate in the immune response. In mammals, Ag-stimulated, class-switched B cells proliferate and undergo somatic mutation in germinal centers wherein mutated lineages with higher affinity
receptors compete for limited amounts of Ag and are selectively
expanded while cells with less effective binding sites undergo apoptosis (Refs. 20 and 21, see also Ref. 55). The channel catfish, as
well as other bony fish, does not undergo class switching, and
neither lymph nodes nor germinal centers are present. Because
affinity maturation may occur in mammals in the absence of germinal centers (22–27), it was important to determine whether selection mechanisms could be detected by analyzing the patterns of
mutation that occurred in catfish H chains. These studies, in contrast to those in mammals, found no supporting evidence to suggest
that CDR-targeted replacement mutations result in selection. First,
these studies showed that there was no significant difference in the
positional distribution of the mutations in VH-encoded codons.
Second, the general result that CDRT was significantly more
mutable than FRT was predominantly attributed to the skewed
distribution of the highly mutable codon AGC. Mutations in AGC
accounted for >40% of the mutations in CDRT, and distribution
analyses of the serine codons AGC, AGT, and TCN showed that
AGC was preferentially located in CDR regions. In this regard,
these analyses phylogenetically underscore the studies of Wagner
et al. (56) who also concluded that mutations are inherently targeted toward CDR regions because of the nonrandom distribution
of the AGC codon. Furthermore, when the R:S ratios of mutations
occurring in FRT were compared with those in CDRT there was no
Acknowledgments
We extend appreciation to Drs. Edward F. Meydrech and Jacob Olivier for
statistical consultation, Mary Duke for high throughput cDNA sequencing
support, and Miles Lange for helpful discussions.
Disclosures
The authors have no financial conflict of interest.
References
1. Muramatsu, M., K. Kinoshita, S. Fagarasan, S. Yamada, Y. Shinkai, and
T. Honjo. 2000. Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell
102: 553–563.
2. Revy, P., T. Muto, Y. Levy, F. Geissmann, A. Plebani, O. Sanal, N. Catalan,
M. Forveille, R. Dufourcq-Labelouse, A. Gennery, et al. 2000. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form
of the Hyper-IgM syndrome (HIGM2). Cell 102: 565–575.
3. Arakawa, H., J. Hauschild, and J. M. Buerstedde. 2002. Requirement of the
activation-induced deaminase (AID) gene for immunoglobulin gene conversion.
Science 295: 1301–1306.
4. Rogozin, I. B., and N. A. Kolchanov. 1992. Somatic hypermutagenesis in immunoglobulin genes. II. Influence of neighbouring base sequences on mutagenesis. Biochim. Biophys. Acta 1171: 11–18.
5. Yoshikawa, K., I. M. Okazaki, T. Eto, K. Kinoshita, M. Muramatsu, H. Nagaoka,
and T. Honjo. 2002. AID enzyme-induced hypermutation in an actively transcribed gene in fibroblasts. Science 296: 2033–2036.
6. Martin, A., P. D. Bardwell, C. J. Woo, M. Fan, M. J. Shulman, and M. D. Scharff.
2002. Activation-induced cytidine deaminase turns on somatic hypermutation in
hybridomas. Nature 415: 802–806.
7. Smith, D. S., G. Creadon, P. K. Jena, J. P. Portanova, B. L. Kotzin, and
L. J. Wysocki. 1996. Di- and trinucleotide target preferences of somatic mutagenesis in normal and autoreactive B cells. J. Immunol. 156: 2642–2652.
8. Milstein, C., M. S. Neuberger, and R. Staden. 1998. Both DNA strands of antibody genes are hypermutation targets. Proc. Natl. Acad. Sci. USA 95:
8791–8794.
9. Spencer, J., M. Dunn, and D. K. Dunn-Walters. 1999. Characteristics of sequences around individual nucleotide substitutions in IgVH genes suggest different GC and AT mutators. J. Immunol. 162: 6591–6601.
10. Dorner, T., H.-P. Brezinschek, R. I. Brezinschek, S. J. Foster, R. Damiati-Saad,
and P. E. Lipsky. 1997. Analysis of the frequency and pattern of somatic mutations within nonproductively rearranged human variable heavy chain genes.
J. Immunol. 158: 2779–2789.
11. Shapiro, G. S., K. Aviszus, D. Ikle, and L. J. Wysocki. 1999. Predicting regional
mutability in antibody V genes based solely on di- and trinucleotide sequence
composition. J. Immunol. 163: 259–268.
12. Oprea, M., L. G. Cowell, and T. B. Kepler. 2001. The targeting of somatic
hypermutation closely resembles that of meiotic mutation. J. Immunol. 166:
892–899.
13. Pavlov, Y. I., I. B. Rogozin, A. P. Galkin, A. Y. Aksenova, F. Hanaoka, C. Rada,
and T. A. Kunkel. 2002. Correlation of somatic hypermutation specificity and
A-T base pair substitution errors by DNA polymerase η during copying of a
mouse immunoglobulin κ light chain transgene. Proc. Natl. Acad. Sci. USA 99:
9954–9959.
14. Neuberger, M. S., J. M. Di Noia, R. C. L. Beale, G. T. Williams, Z. Yang, and
C. Rada. 2005. Somatic hypermutation at A·T pairs: polymerase error versus
dUTP incorporation. Nat. Rev. Immunol. 5: 171–178.
15. Delbos, F., A. De Smet, A. Faili, S. Aoufouchi, J. C. Weill, and C. A. Reynaud.
2005. Contribution of DNA polymerase η to immunoglobulin gene hypermutation in the mouse. J. Exp. Med. 201: 1191–1196.
16. Honjo, T., H. Nagaoka, R. Shinkura, and M. Muramatsu. 2005. AID to overcome
the limitations of genomic information. Nat. Immunol. 6: 655–661.
17. Xu, Z., Z. Fulop, Y. Zhong, A. J. Evinger, H. Zan, and P. Casali. 2005. DNA
lesions and repair in immunoglobulin class switch recombination and somatic
hypermutation. Ann. NY Acad. Sci. 1050: 146–162.
18. Barreto, V. M., A. R. Ramiro, and M. C. Nussenzweig. 2005. Activation-induced
deaminase: controversies and open questions. Trends Immunol. 26: 90–96.
19. Mayorov, V. I., I. B. Rogozin, L. R. Adkison, and P. J. Gearhart. 2005. DNA
polymerase η contributes to strand bias of mutations of A versus T in immunoglobulin genes. J. Immunol. 174: 7781–7786.
20. Rajewsky, K. 1996. Clonal selection and learning in the antibody system. Nature
381: 751–758.
21. Kelsoe, G. 1996. The germinal center: a crucible for lymphocyte selection. Semin. Immunol. 8: 179–184.
22. Matsumoto, M., S. F. Lo, C. J. Carruthers, J. Min, S. Mariathasan, G. Huang,
D. R. Plas, S. M. Martin, R. S. Geha, M. N. Nahm, and D. D. Chaplin. 1996.
Affinity maturation without germinal centres in lymphotoxin-α-deficient mice.
Nature 382: 462–466.
23. Kato, J., N. Motoyama, I. Taniuchi, H. Takeshita, M. Toyoda, K. Masuda, and
T. Watanabe. 1998. Affinity maturation in lyn kinase-deficient mice with defective germinal center formation. J. Immunol. 160: 4788–4795.
24. Takahashi, Y., P. R. Dutta, D. M. Cerasoli, and G. Kelsoe. 1998. In situ studies
of the primary immune response to (4-hydroxy-3-nitrophenyl)acetyl. V. Affinity
maturation develops in two stages of clonal selection. J. Exp. Med. 187: 885–895.
25. William, J., C. Euler, S. Christensen, and M. J. Shlomchik. 2002. Evolution of
autoantibody responses via somatic hypermutation outside of germinal centers.
Science 297: 2066–2070.
26. Weller, S., M. C. Braun, B. K. Tan, A. Rosenwald, C. Cordier, M. E. Conley,
A. Plebani, D. S. Kumararatne, D. Bonnet, O. Tournilhac, et al. 2004. Human
blood IgM “memory” B cells are circulating splenic marginal zone B cells harboring a prediversified immunoglobulin repertoire. Blood 104: 3647–3654.
27. Manser, T. 2004. Textbook germinal centers? J. Immunol. 172: 3369–3375.
28. Wilson, M., E. Hsu, A. Marcuz, M. Courtet, L. Du Pasquier, and C. Steinberg.
1992. What limits affinity maturation of antibodies in Xenopus-the rate of somatic
mutation or the ability to select mutants? EMBO J. 11: 4337–4347.
29. Hinds-Frey, K. R., H. Nishikata, R. T. Litman, and G. W. Litman. 1993. Somatic
variation precedes extensive diversification of germline sequences and combinatorial joining in the evolution of immunoglobulin heavy chain diversity. J. Exp.
Med. 178: 815–824.
30. Lee, S. S., D. Tranchina, Y. Ohta, M. F. Flajnik, and E. Hsu. 2002. Hypermutation in shark immunoglobulin light chain genes results in contiguous substitutions. Immunity 16: 571–582.
31. Diaz, M., A. S. Greenberg, and M. F. Flajnik. 1998. Somatic hypermutation of the
new antigen receptor gene (NAR) in the nurse shark does not generate the repertoire: possible role in antigen-driven reactions in the absence of germinal centers. Proc. Natl. Acad. Sci. USA 95: 14343–14348.
32. Ghaffari, S. H., and C. J. Lobb. 1991. Heavy chain variable region gene families
evolved early in phylogeny. J. Immunol. 146: 1037–1046.
33. Ventura-Holman, T., J. C. Jones, S. H. Ghaffari, and C. J. Lobb. 1994. Structure
and genomic organization of VH gene segments in the channel catfish: members
of different VH gene families are interspersed and closely linked. Mol. Immunol.
31: 823–832.
34. Yang, F., T. Ventura-Holman, G. C. Waldbieser, and C. J. Lobb. 2003. Structure,
genomic organization, and phylogenetic implications of six new VH families in
the channel catfish. Mol. Immunol. 40: 247–260.
35. Ghaffari, S. H., and C. J. Lobb. 1992. Organization of immunoglobulin heavy
chain constant and joining region genes in the channel catfish. Mol. Immunol. 29:
151–159.
36. Hayman, J. R., S. H. Ghaffari, and C. J. Lobb. 1993. Heavy chain joining region
segments of the channel catfish: genomic organization and phylogenetic implications. J. Immunol. 151: 3587–3596.
37. Hayman, J. R., and C. J. Lobb. 2000. Heavy chain diversity region segments of the
channel catfish: structure, organization, expression and phylogenetic implications. J. Immunol. 164: 1916–1924.
38. Lefranc, M. P., V. Giudicelli, C. Ginestoux, J. Bodmer, W. Muller, R. Bontrop,
M. Lemaitre, A. Malik, V. Barbie, and D. Chaume. 1999. IMGT, the international
ImMunoGeneTics database. Nucleic Acids Res. 27: 209–212.
39. Lossos, I. S., R. Tibshirani, B. Narasimhan, and R. Levy. 2000. The inference of
antigen selection on Ig genes. J. Immunol. 165: 5122–5126.
40. Bracho, M. A., A. Moya, and E. Barrio. 1998. Contribution of Taq polymerase-induced errors to the estimation of RNA virus diversity. J. Gen. Virol. 79:
2921–2928.
41. Dunning, A. M., P. Talmud, and S. E. Humphries. 1988. Errors in the polymerase
chain reaction. Nucleic Acids Res. 16: 10393.
clonal expansion, and sublineages that presumably diverged early
in B cell clonal expansion and do not exhibit significant levels of
CDR-targeted replacements remain present in the B cell population. Thus, these results indicate that somatic mutation may have
evolved as a mechanism to principally increase repertoire diversity. This basic mechanism continues to be phylogenetically operational as shown, for example, in the repertoire studies with
sheep (65) and the apparent lack of selection in the studies with
Xenopus H chains (28).
In conclusion, somatic mutation occurs within catfish H chain V
regions. The analysis of these hotspot motifs has shown that although these targets share common motifs, their number is restricted when compared with the spectrum of mutational targets
known in mammals. It will be of interest to determine whether
these differences are due to variant enzymatic activities of the
AID-related molecules in bony fish or whether these differences
are attributable to AID-associated factors that may chaperone AID
to hotspot motifs, such as have been suggested in the studies with
replication protein A (66). Lastly, these studies found no substantial evidence to indicate that somatic mutation coevolved with
mechanisms that select B cells based upon nonsynonymous mutations within CDR-encoded regions. These results suggest that the
principal role of somatic mutation early in phylogeny was to diversify the Ig and Ab repertoire by targeting hotspot motifs preferentially located within CDR-encoded regions.
42. Dunn-Walters, D. K., and J. Spencer. 1998. Strong intrinsic biases towards mutation and conservation of bases in human Ig VH genes during somatic hypermutation prevent statistical analysis of antigen selection. Immunology 95:
339–345.
43. Shapiro, G. S., K. Aviszus, J. Murphy, and L. J. Wysocki. 2002. Evolution of Ig
DNA sequence to target specific base mutations within codons for somatic hypermutation. J. Immunol. 168: 2302–2306.
44. Kuppers, R., M. Zhao, M. L. Hansmann, and K. Rajewsky. 1993. Tracing B cell
development in human germinal centres by molecular analysis of single cells
picked from histological sections. EMBO J. 12: 4955–4967.
45. Chang, B., and P. Casali. 1994. The CDR1 sequences of a major proportion of
human germline Ig VH genes are inherently susceptible to amino acid replacement. Immunol. Today 15: 367–373.
46. Lebecque, S. G., and P. J. Gearhart. 1990. Boundaries of somatic mutation in
rearranged immunoglobulin genes: 5′ boundary is near the promoter, and 3′
boundary is approximately 1 kb from V(D)J gene. J. Exp. Med. 172: 1717–1727.
47. Foster, S. J., T. Dorner, and P. E. Lipsky. 1999. Targeting and subsequent selection of somatic hypermutations in the human Vκ repertoire. Eur. J. Immunol.
29: 3122–3132.
48. Petersen-Mahrt, S. K., R. S. Harris, and M. S. Neuberger. 2002. AID mutates E.
coli suggesting a DNA deamination mechanism for antibody diversification. Nature 418: 99–103.
49. Dorner, T., S. J. Foster, N. L. Farner, and P. E. Lipsky. 1998. Somatic hypermutation of human immunoglobulin heavy chain genes: targeting of RGYW motifs on both DNA strands. Eur. J. Immunol. 28: 3384–3396.
50. Rada, C., M. R. Ehrenstein, M. S. Neuberger, and C. Milstein. 1998. Hot spot
focusing of somatic hypermutation in MSH2-deficient mice suggests two stages
of mutational targeting. Immunity 9: 135–141.
51. Rogozin, I. B., Y. I. Pavlov, K. Bebenek, T. Matsuda, and T. A. Kunkel. 2001.
Somatic mutation hotspots correlate with DNA polymerase η error spectrum.
Nat. Immunol. 2: 530–536.
52. Saunders, H. L., and B. G. Magor. 2004. Cloning and expression of the AID gene
in the channel catfish. Dev. Comp. Immunol. 28: 657–663.
53. Zhao, Y., Q. Pan-Hammarstrom, Z. Zhao, and L. Hammarstrom. 2005. Identification of the activation-induced cytidine deaminase gene from zebrafish: an evolutionary analysis. Dev. Comp. Immunol. 29: 61–71.
54. Barreto, V. M., Q. Pan-Hammarstrom, Y. Zhao, L. Hammarstrom, Z. Misulovin,
and M. C. Nussenzweig. 2005. AID from bony fish catalyzes class switch recombination. J. Exp. Med. 202: 733–738.
55. Jackson, S. M., and J. D. Capra. 2005. IgH V-region sequence does not predict
the survival fate of human germinal center B cells. J. Immunol. 174: 2805–2813.
56. Wagner, S. D., C. Milstein, and M. S. Neuberger. 1995. Codon bias targets
mutation. Nature 376: 732.
57. Lossos, I. S., C. Y. Okada, R. Tibshirani, R. Warnke, J. M. Vose, T. C. Greiner,
and R. Levy. 2000. Molecular analysis of immunoglobulin genes in diffuse large
B-cell lymphomas. Blood 95: 1797–1803.
58. Coronella, J. A., C. Spier, M. Welch, K. T. Trevor, A. T. Stopeck, H. Villar, and
E. M. Hersch. 2002. Antigen-driven oligoclonal expansion of tumor-infiltrating B
cells in infiltrating ductal carcinoma of the breast. J. Immunol. 169: 1829–1836.
59. Ghosh, S., A. C. Steere, B. D. Stollar, and B. T. Huber. 2005. In situ diversification of the antibody repertoire in chronic lyme arthritis synovium. J. Immunol.
174: 2860–2869.
60. Clem, L. W., and P. A. Small. 1970. Phylogeny of immunoglobulin structure and
function. V. Valences and association constants of teleost antibodies to a haptenic
determinant. J. Exp. Med. 132: 385–400.
61. Voss, E. W., W. J. Groberg, and J. L. Fryer. 1978. Binding affinity of tetrameric
coho salmon Ig anti-hapten antibodies. Immunochemistry 15: 459–464.
62. Lobb, C. J. 1985. Covalent structure and affinity of channel catfish anti-dinitrophenyl antibodies. Mol. Immunol. 22: 993–999.
63. Cain, K. D., D. R. Jones, and R. L. Raison. 2002. Antibody-antigen kinetics following
immunization of rainbow trout (Oncorhynchus mykiss) with a T-cell dependent
antigen. Dev. Comp. Immunol. 26: 181–190.
64. Kaattari, S. L., H. L. Zhang, I. W. Khor, I. M. Kaattari, and D. A. Shapiro. 2002.
Affinity maturation in trout: clonal dominance of high affinity antibodies late in
the immune response. Dev. Comp. Immunol. 26: 191–200.
65. Reynaud, C. A., C. Garcia, W. R. Hein, and J. C. Weill. 1995. Hypermutation
generating the sheep immunoglobulin repertoire is an antigen-independent process. Cell 80: 115–125.
66. Chaudhuri, J., C. Khuong, and F. W. Alt. 2004. Replication protein A interacts
with AID to promote deamination of somatic hypermutation targets. Nature 430:
992–998.