Rapid Evolution of the MH Class I Locus Results in Different Allelic
Compositions in Recently Diverged Populations of Atlantic Salmon
S. Consuegra,*1,2 H.-J. Megens, 2 H. Schaschl,*3 K. Leon, R. J. M. Stet, 4 and W. C. Jordan*
*Institute of Zoology, Zoological Society of London, Regent’s Park, London, United Kingdom; and Cell Biology and
Immunology Group, Department of Animal Sciences, Wageningen University, Wageningen, The Netherlands
We compared major histocompatibility class I allelic diversity in two currently reproductively isolated Atlantic salmon
(Salmo salar) populations (Irish and Norwegian) with a common postglacial origin in order to test for among-population
differences in allelic composition and patterns of recombination and point mutation. We also examined the evidence for
adaptive molecular divergence at this locus by analyzing the rate of amino acid replacement in relation to a neutral expectation. Contrary to our prediction, and in contrast to the situation for other genetic markers, the two populations have almost
nonoverlapping sets of major histocompatibility class I alleles. Although there is a strong signal of point mutation that
predates population divergence, recent recombination, acting in similar, but not identical, ways in both populations appears
to be a significant force in creating new alleles. Moreover, selection acting on peptide-binding residues seems to favor new
recombinant alleles and is likely to be responsible for the rapid divergence between populations.
Introduction
The genes of the major histocompatibility complex
(MHC) are some of the most variable protein-coding loci
detected in vertebrates and among the most suitable candidate genes currently available to study processes of adaptive
evolution due to their well-established role within the
immune system (Hedrick 1994). MHC genes encode cell-surface molecules that bind small self-peptides or peptides
from non–self-proteins derived from infectious pathogens
within the cell and present them on the cell surface to T cells.
Recognition of the MHC–non-self-peptide complex by a
specific T cell receptor initiates either a cellular or humoral
immune response depending on the classes of MHC
molecule and T cell involved. Pathogen-driven balancing
selection, either through overdominance or negative
frequency–dependent selection or both, is thought to be
the main evolutionary force that promotes and maintains
the extensive variability in these genes (Edwards and
Hedrick 1998; Jeffery and Bangham 2000; Hedrick 2002;
Penn, Damjanovich, and Potts 2002). However, sexual selection (Jordan and Bruford 1998; Landry et al. 2001) and neutral forces such as genetic drift and gene flow may also have a
role (Landry and Bernatchez 2001).
Several molecular mechanisms are thought to be
involved in producing high levels of polymorphism at
MHC loci. New alleles may arise through point mutations,
and selective pressures then act on the substitution rate of
such point mutations causing higher frequencies of nonsynonymous than synonymous replacements at those
amino acid residues responsible for peptide binding (the
peptide-binding residues [PBRs]) (Garrigan and Hedrick
2001). However, the rate of point mutation in MHC genes
does not appear to be high enough to account for the high
variability levels often observed, and other mechanisms
such as recombination and gene conversion may play an
important role (Martinsohn et al. 1999). In particular,
recombination may create new MHC alleles in isolated populations even over short time periods, as in the case of the
HLA-B alleles of the South American tribal Amerindians
founded between 10,000 and 40,000 years B.P. (Belich et al.
1992; Watkins et al. 1992).
1 Present address: Fish Muscle Research Group, Gatty Marine Laboratory, University of St Andrews, St Andrews, Fife, Scotland, United Kingdom.
2 S.C. and H.-J.M. contributed equally to this study.
3 Present address: Max-Planck-Institute of Limnology, Department of Evolutionary Ecology, Ploen, Germany.
4 Present address: Scottish Fish Immunology Research Centre, University of Aberdeen, Zoology Building, Aberdeen, Scotland, United Kingdom.
Key words: MHC, recombination, positive selection, Atlantic salmon, Salmo salar.
E-mail: [email protected].
Mol. Biol. Evol. 22(4):1095–1106. 2005
doi:10.1093/molbev/msi096
Advance Access publication February 2, 2005
In contrast to other vertebrates in which they are
tightly linked, the class I and class II major histocompatibility (MH) genes in teleosts are located in different linkage
groups, an apparently derived condition that allows independent evolution of these loci (Ohta et al. 2000). As the
class I and class II genes do not form a complex, they are
known as MH genes in teleosts (Stet et al. 2002). Salmonid
MH class I and class II genes are highly polymorphic, with
many alleles differing in the composite patterns of amino
acid substitutions (Shum et al. 2001; Stet et al. 2002).
Grimholt et al. (1993) isolated the first class I sequences from
salmon cDNA, and a single MH class I locus (Sasa-UBA)
was found to be expressed (Grimholt et al. 2002). Class I
alleles are thought to represent ancestral lineages that predate
the separation of the Oncorhynchus and Salmo genera and have
accumulated high levels of variability (Miller and Withler
1998; Shum et al. 2001; Grimholt et al. 2002). Evidence
for recombination has been found in intron 2, which separates the exons encoding the α1 and α2 domains that contain
the PBRs in class I genes, resulting in exon-domain shuffling
(Shum et al. 2001). This mechanism of creating new alleles
appears more common in salmonids than in primates, probably due to the greater length of the salmonid intron 2 (Shum
et al. 2001; Grimholt et al. 2002).
To date, most studies of MH gene functionality and
variation in Atlantic salmon (Salmo salar) have focused
on farmed fish (Grimholt et al. 1993, 2002; Shum et al.
2001; Stet et al. 2002) because they often have advantages
such as known pedigree and the ability to manipulate families for segregation studies. However, comparative studies
of natural populations can provide further insight into the
manner in which environmental and demographic factors
Molecular Biology and Evolution vol. 22 no. 4 © Society for Molecular Biology and Evolution 2005; all rights reserved.
can contribute to the origin and maintenance of MHC variation (Garrigan and Hedrick 2001). In particular, the study
of reproductively isolated populations with a common postglacial origin can help elucidate the mechanisms creating
and maintaining MH diversity. Under an assumption of
selective neutrality, populations founded from a common
source should share a common pool of alleles with frequency distributions for each population shaped by forces
such as genetic drift and gene flow. On the other hand, population-specific selective forces acting on common alleles
could either increase divergence between populations or
homogenize them through stabilizing selection, counterbalancing the effects of genetic drift (Koskinen, Haugen, and
Primmer 2002).
Atlantic salmon are widely distributed along the
Atlantic coasts of Europe and eastern North America. In
Europe, the present-day distribution of the species (from
the Iberian Peninsula to Russia) was established upon retreat
of the ice sheets after the last glacial maximum (Bernatchez
and Wilson 1998). Routes of recolonization from glacial refugia can be traced using neutral molecular markers as
founder effects and genetic drift have probably been the
dominant forces in shaping the patterns of genetic variability
among present-day populations (Bernatchez and Wilson
1998). Evidence suggests that northern European rivers
remained glaciated until ~15,000 years B.P. (Hewitt 1999)
and that Atlantic salmon recolonized them from southern
refugia (Verspoor et al. 1999; Nilsson et al. 2001; Consuegra
et al. 2002). Natal homing, an important characteristic of the
Atlantic salmon life cycle, makes populations prone to
reproductive isolation and may promote local adaptation
(Taylor 1991). Northern European Atlantic salmon populations therefore provide unique material to study the effect of
natural selection over a short period of time (~15,000–
20,000 years) using nonneutral genetic markers.
Here, we compare MH class I (Sasa-UBA) allelic
diversity of two isolated Atlantic salmon populations with
a common postglacial origin to test the null hypotheses that
(1) there is no difference in allelic composition between them
(as expected in populations derived from the same glacial
refugium) and (2) they share a common pattern of recombination and point mutation (under a null assumption that the
populations are subject to similar selective pressures).
Materials and Methods
Samples
Juveniles were sampled from four west coast Irish rivers with natural populations of Atlantic salmon: Owenmore,
Owenduff, Burrishole, and Carrowinskey. Similar samples
from Norwegian populations were obtained from four rivers:
Os, Aargardselva, Gaula, and Ims. Samples of white muscle
were stored in 95% ethanol, while anterior kidney tissue was
stored in RNAlater buffer (Qiagen, Ltd., West Sussex, UK)
for subsequent extraction of RNA.
cDNA Isolation, Amplification, and Sequencing
Total RNA was extracted from anterior kidney tissue
of 19 Irish and 34 Norwegian individuals using the Purescript RNA Isolation kit from GENTRA (Gentra Systems,
Minneapolis, Minn.) or with the SV total RNA isolation kit
from PROMEGA (Promega, Madison, Wis.). A total of
11 µl of the final purified RNA from the Irish samples
was digested with DNAse I and was used to synthesize
first-strand cDNA using the First-Strand cDNA Synthesis
kit (Amersham Pharmacia Biotech UK Limited, Bucks,
UK). First-strand cDNAs were used as templates for polymerase chain reaction (PCR) amplification of the Atlantic
salmon β-actin locus to check for possible genomic DNA
contamination (the presence of an intron between the primers
results in products of different size in genomic and cDNA)
with the primers Act_fwd 5′-ATGGAAGATGAAATCGCCGC-3′
and Act_rev 5′-TGCCAGATCTTCTCCATGTCG-3′, and the resulting products were run in a 1.5% agarose gel. Samples that gave a band of the correct size (~200
bp), with no evidence of genomic DNA contamination (a
product of ~450 bp), were then used for amplification
of the MH class I locus. Reverse transcription (RT)–
PCR was performed on the Norwegian samples from the
total purified RNA with the Superscript one-step RT-PCR
kit (Invitrogen, Paisley, UK).
A region of ~550 bp of the cDNA was amplified with
a 50:50 mix of the following primers (Grimholt et al. 2002)
for forward priming: Lead2S: 5′-CTGGGAATAGGCCTTCTACAT-3′ and Lead4S: 5′-AGCCCTACATTCTTCATCTGC-3′ and the reverse primers UBA3_rev:
5′-CTGTCGCGTGGCAGGTCACTG-3′ and UBAex3R:
5′-TGTCCTIATCAGAGTGCTCTTCC-3′. This region
spans from exon 1 to exon 4 of the class I locus (Grimholt
et al. 2002), including the entire α1 and α2 domains (exons
2 and 3). The PCR and RT-PCR products were purified and
cloned into the pCR2.1 plasmid vector (TA-cloning kit,
Invitrogen). Both strands of at least five clones per individual were sequenced using M13 forward and reverse or SPS6
and T7 primers with the ABI Prism BigDye Terminator
Cycle Sequencing Kit diluted with Better Buffer (Microzone Corporation, Ontario, Canada) following the manufacturer’s protocol, and sequences were resolved on an
ABI Prism 377 automated sequencer.
Sequence Analysis
Only sequences represented by at least two clones
from independent PCRs were considered in subsequent
analyses. Sequences were aligned with Sequencher (Gene
Codes Corp., Ann Arbor, Mich.) software and BioEdit
v. 5.0.9 (using the ClustalW program included in the
package). DNAML from the PHYLIP 3.6 package (Felsenstein
1989) was used to estimate the maximum likelihood trees used
in both spatial phylogenetic variation (SPV) and positive
selection analyses. In the following analyses, sequences
from each population were analyzed independently in order
to compare the patterns of recombination and selection
between populations. As in previous studies (Grimholt
et al. 2002), alleles were defined on the basis of deduced
amino acid sequences, not on nucleotide sequence.
Recombination Analysis
The likelihood-based method of Grassly and Holmes
(1997) for detection of phylogenetically anomalous regions
or SPV was implemented using the program PLATO (Partial
Likelihoods Assessed Through Optimization) (Grassly and
Holmes 1997). SPV may arise either as a result of selection
or of conversion-recombination. In this method a likelihood
for each site is calculated on the basis of the overall (or
global) maximum likelihood phylogeny, and a ‘‘sliding
window’’ technique is used to identify the region with
the lowest likelihood score value for each window size.
The window size is varied from a minimum of 5 bp up
to half the sequence length. The standardized normal deviate (Z) is used to test the statistical significance of each of
the lowest likelihood regions against a null normal distribution generated by simulation (100 replicates). A value
of Z ≥ 3 is taken to be equivalent to P ≤ 0.05 adjusted
for multiple tests (Grassly and Holmes 1997). The analysis
was based on the overall maximum likelihood phylogeny
under the Jukes-Cantor substitution model (Jukes and
Cantor 1969) as that model best fitted the data according
to the analysis performed with MODELTEST v1.06
(Posada and Crandall 1998). DnaSP software (J. Rozas
and R. Rozas 1999) was used to estimate the minimum
number of recombination events (Rm) in the history of
the sample using the four-gamete test (Hudson and Kaplan
1985). Estimates of the 95% confidence interval for Rm and
the probability of obtaining a lower value than the observed
value were obtained by coalescent simulations (1,000 replicates). The regions potentially involved in recombination
identified by DnaSP were compared with regions of SPV
detected with PLATO.
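As a rough illustration of the four-gamete test, the following Python sketch flags pairs of biallelic sites at which all four two-site haplotypes occur and then counts non-overlapping incompatible intervals as a lower bound on the number of recombination events. It is not the DnaSP implementation; the function names and the toy haplotypes are hypothetical and serve only to show the logic.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return site pairs (i, j) at which all four two-site gametes are observed."""
    n_sites = len(haplotypes[0])
    # The test is defined for sites carrying exactly two alleles.
    biallelic = [s for s in range(n_sites)
                 if len({h[s] for h in haplotypes}) == 2]
    incompatible = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            incompatible.append((i, j))
    return incompatible


def min_recombination_events(pairs):
    """Lower bound on recombination events: the largest set of
    non-overlapping incompatible intervals (greedy by right end point)."""
    count, last_end = 0, -1
    for i, j in sorted(pairs, key=lambda p: p[1]):
        if i >= last_end:  # intervals sharing only an end point do not overlap
            count += 1
            last_end = j
    return count


# Hypothetical toy data: four haplotypes, three sites.
toy = ["AAT", "AGT", "GAC", "GGC"]
pairs = four_gamete_pairs(toy)
rm = min_recombination_events(pairs)
```

In this toy example sites 0 and 1, and sites 1 and 2, each show all four gametes, so the sketch infers at least two events.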
The program LDhat (McVean, Awadalla, and
Fearnhead 2002) was used to assess the importance of
recombination relative to point mutation in the patterns
of genetic variation of both Irish and Norwegian samples.
This program uses a composite likelihood method to estimate the population recombination rate (4Ner) and implements a likelihood permutation test for recombination
based on the loss of the interchangeable character of the
sites when recombination occurs (McVean, Awadalla,
and Fearnhead 2002). It also compares the result with three
other permutation tests based on the decay of linkage disequilibrium with distance and on the sum of distances
between pairs of sites. Based on the estimates of the recombination rate (ρ) and mutation rate per site (θ), we calculated
the ratio ρ/θ as an indicator of the relative likelihood of a
nucleotide being involved in recombination relative to
mutation (McVean, Awadalla, and Fearnhead 2002). We
repeated the analysis after removing sites that were identified
as positively selected.
Detecting Positive Selection
Potential PBRs were identified in the aligned sequences
according to Grimholt et al. (1993). Distances based on
synonymous and nonsynonymous substitutions for the
PBR and non-PBR sites were calculated for each population
using the distance of Nei and Gojobori (1986) with the Jukes
and Cantor (1969) correction for multiple substitutions.
Standard errors were calculated by 1,000 bootstrapping
replicates. The Z-test (Nei and Kumar 2000) implemented
in MEGA 2.1 (Kumar et al. 2001) was used to test neutrality by estimating the statistical significance of the differences
between the synonymous and nonsynonymous distances
in each of the two domains (α1 and α2).
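The Jukes and Cantor correction applied to the Nei and Gojobori proportions can be sketched as follows. This is a minimal illustration, not the MEGA code: the function names are hypothetical, and the counts passed to them (numbers of differences and of sites) are assumed to have been obtained beforehand by the Nei and Gojobori counting procedure.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def corrected_distance(n_differences, n_sites):
    """Distance from a count of (non)synonymous differences and sites.

    The two counts are inputs here; how they are tallied per codon is
    outside the scope of this sketch.
    """
    return jukes_cantor(n_differences / n_sites)


# Hypothetical example: 30 differences over 100 sites.
d_example = corrected_distance(30, 100)
```

The correction always returns a distance at least as large as the raw proportion, because it accounts for multiple substitutions at the same site.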
To detect positive selection at single amino acid sites,
we used the maximum likelihood method (Yang 2000)
implemented in CODEML of the PAML 3.14 package (Yang
1997) and the parsimony method (Suzuki and Gojobori
1999) with the modification of Su (2000) implemented in
the SGI software (Su 2000).
For the maximum likelihood method we used different codon-based models that allow for variable selection
among sites as recommended by Yang et al. (2000). We
compared the scenario where nonsynonymous mutations
are neutral or deleterious (models M1 and M7, respectively) with models that allow for positive selection
by including an additional category for advantageous substitutions (M2, M3, M8). Six different models that allow for
different intensity of selection among sites were tested.
M0 assumes a constant substitution rate ratio (ω) for all sites.
Three of the models assume a discrete distribution of ω
among sites: M1 (neutral) assumes two categories of sites,
conserved (ω = 0) and neutral (ω = 1); M2 (selection)
includes an additional category of sites with ω estimated
from the data; M3 (discrete) assumes a discrete distribution of K different ω ratios. Two additional models assume
a continuous distribution for heterogeneous ω ratios
among sites: M7 (beta), which assumes a beta distribution
and does not allow for positively selected sites, and M8
(beta and ω), which accounts for positively selected sites
(ω > 1). Nested models can be compared in pairs using
the likelihood ratio test (LRT): twice the log-likelihood
difference is compared with a χ² distribution with degrees
of freedom equal to the difference in the number of parameters between both models. In this way, the more general
models M2 and M3 can be tested against M1 and M8
against M7. Nonnested models (M3 against M8) were
compared using the Akaike information criterion (AIC;
Akaike 1974): AIC = −2(estimated log likelihood
of the model) + 2(number of free parameters of the
model). The model that minimizes the value of AIC was considered the most appropriate model. A Bayesian approach
implemented in CODEML was used to identify residues
under positive selection in each of the domains (α1 and
α2) separately. Following Yang et al. (2000) we repeated
the analysis allowing for different initial values of ω (0.4
and 3 and 4), and only the results with the highest likelihood values were taken into account.
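The likelihood ratio test described above can be sketched in a few lines of Python. The example uses the M7 and M8 log-likelihoods reported for the Irish α1 domain in table 4; treating those values as negative log-likelihood magnitudes and using 2 degrees of freedom for the M8-M7 comparison are assumptions made here for illustration. The χ² tail probability is computed with the closed form available for even degrees of freedom so that no external library is needed.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate; valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for a pair of nested models."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf_even_df(statistic, df)


# Irish alpha-1 values from table 4 (sign assumed negative; df = 2 assumed).
lnl_m7, lnl_m8 = -1399.83, -1393.52
stat_m8_m7, p_m8_m7 = likelihood_ratio_test(lnl_m7, lnl_m8, df=2)
```

Under these assumptions the statistic is 12.62, which agrees with the M8-M7 entry for the Irish α1 domain in table 3.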
FIG. 1.—Alignment of MHC class I amino acid sequences. Dots indicate identity, and gaps introduced to maximize alignment are indicated by
dashes. Sequences in bold correspond to Irish populations, and sequences in italics are shared between Norwegian and Irish populations. Squares indicate
sites involved in recombination identified by DnaSP, and regions of SPV are underlined. Nucleotide sequences were deposited in the GenBank under
accession numbers AY62572–AY62598.
We also used Suzuki and Gojobori’s (1999) parsimony-based method for identifying positively selected sites with a
modification that allows the input of different tree topologies (Su 2000). For this analysis we used the same maximum likelihood tree as for the maximum likelihood
analysis of selection. The SGI software reconstructs the
ancestral sequences using a maximum parsimony approach
and estimates the average numbers of synonymous and
nonsynonymous sites for each codon through the phylogenetic tree in order to compute the number of synonymous
and nonsynonymous changes and test for neutrality at each
codon site. The numbers of synonymous and nonsynonymous changes are used to calculate the binomial probability
of obtaining the observed numbers of changes for
each codon site, and the significance level is set at 5%.
Positive selection is considered to occur if the number of
nonsynonymous changes is significantly larger than that
of synonymous changes.
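The per-codon binomial test can be illustrated with the sketch below. It is an assumption-laden simplification rather than the SGI code: the expected probability that a change is nonsynonymous is taken to be the fraction of nonsynonymous sites at the codon, and the function name and the example counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(c_n, c_s, s_n, s_s):
    """P(at least c_n nonsynonymous changes out of c_n + c_s) under neutrality.

    s_n and s_s are the average numbers of nonsynonymous and synonymous
    sites at the codon; under neutrality each change is assumed to be
    nonsynonymous with probability s_n / (s_n + s_s).
    """
    n = c_n + c_s
    p = s_n / (s_n + s_s)
    return sum(comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(c_n, n + 1))


# Hypothetical codon with 2.5 nonsynonymous and 0.5 synonymous sites.
p_ten = binomial_upper_tail(c_n=10, c_s=0, s_n=2.5, s_s=0.5)
p_twenty = binomial_upper_tail(c_n=20, c_s=0, s_n=2.5, s_s=0.5)
```

With these hypothetical numbers, ten nonsynonymous changes and no synonymous ones are not significant at the 5% level, whereas twenty are, which shows how many changes a single codon needs before a test of this kind can reject neutrality.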
We used a variability metric (V) (Reche and Reinherz
2003) similar to the Shannon entropy index (Shannon 1948)
to identify variable amino acid residues that may be involved
in immune recognition (Stewart et al. 1997). The result of
this analysis can help independently identify polymorphic
residues (V > 1) potentially involved in peptide contact
(potential PBRs). This method, along with the Bayesian
method implemented in CODEML and the maximum parsimony method in SGI, allows the detection of potential
PBRs without making any a priori assumption about the
position of such sites.
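A variability score of this kind can be sketched per alignment column as follows. The sketch assumes that V is the Shannon entropy of the amino acid frequencies measured in bits (base-2 logarithm); the function names, the threshold argument, and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def shannon_variability(column):
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def variable_sites(alignment, threshold=1.0):
    """Indices of columns whose entropy exceeds the threshold."""
    return [i for i, column in enumerate(zip(*alignment))
            if shannon_variability(column) > threshold]


# Hypothetical toy alignment of four sequences and three residues.
toy_alignment = ["ACD", "ACE", "AGF", "AGH"]
sites_above_one = variable_sites(toy_alignment, threshold=1.0)
```

In the toy alignment the first column is invariant (V = 0), the second has two equally frequent residues (V = 1), and only the third, with four different residues (V = 2), exceeds the threshold.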
Results
DNA Sequence Analyses
From analysis of cDNA it was evident that all individuals examined expressed only one dominant class I locus
(Sasa-UBA) as no more than two sequences were cloned
from any single fish. In total, 21 alleles were present in
the 19 Irish individuals analyzed with 13 alleles in 34 Norwegian individuals (fig. 1). Only two Irish alleles were
identical to the alleles described in Norwegian Atlantic salmon populations (Sasa-UBA*0601; Grimholt et al. 2002
and Sasa-UBA*0602; this study).
Nucleotide diversity was distributed evenly over the
length of the sequences and was similar within (Ireland
π = 0.204; Norway π = 0.218) and between groups (K =
0.211). The proportion of polymorphic sites was 55.9%
among Irish sequences and 56.3% among Norwegian
sequences. The average number of nucleotide differences
between groups (108.06) was similar to the average number
of nucleotide differences within groups (Irish = 104.56,
Norwegian = 111.96).
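For readers unfamiliar with these summary statistics, the sketch below shows one common way to obtain the average number of pairwise differences and the per-site nucleotide diversity from a set of aligned sequences. It is an uncorrected, illustrative calculation; whether the reported values were computed in exactly this way is not stated here, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(sequences):
    """Average number of differing positions over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(sequences):
    """Mean pairwise differences divided by the alignment length."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Hypothetical toy alignment of three sequences of length four.
toy_sequences = ["AAAA", "AAAT", "AATT"]
k_toy = mean_pairwise_differences(toy_sequences)
pi_toy = nucleotide_diversity(toy_sequences)
```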
Although only two of the sequences from Irish and
Norwegian populations were identical, several Irish and
Norwegian alleles were identical in either the α1 or α2
domain but differed in the rest of the sequence (fig. 1).
In total, the 40 sequences described were formed by 27
α1 and 27 α2 unique domains. The Irish and Norwegian
class I sequences are composed of 13 and 10 unique α1
sequences and 16 and 15 unique α2 sequences, respectively. This distribution suggests an important role of
recombination between the α1 and α2 domains in the formation of novel alleles in both populations.
Patterns of Recombination
There was significant evidence of recombination in both
Irish and Norwegian groups of alleles using a combination of
a likelihood permutation test and three other permutation
tests (table 1). The estimated population recombination
rate (ρ = 4Ner) in the Norwegian alleles was ρ = 25 and
in the Irish alleles was ρ = 10. The estimates of the mutation
rate per site were similar in both Irish (0.219) and Norwegian
(0.207) groups of alleles (table 1). Watterson estimates of the
amount of mutation per population (θ = 4Neµ) exceeded the
recombination rates per population (table 1), suggesting a
larger accumulation of mutations than of new recombinants.
The exclusion of Sasa-UBA*1001 and Sasa-UBA*0901
from the Norwegian sequences did not change the estimated
mutation rate, although it lowered the recombination rate
(ρ = 14), suggesting an origin by recombination for these
two sequences. Excluding Sasa-UBA*2001 from the Irish
sequences, however, did not change the estimated recombination rate, although it lowered the estimated mutation rate
from θ = 85 to θ = 23, suggesting an origin through point
mutation for this sequence.
Table 1
Results of the Analysis of Recombination and Point Mutation in Irish and Norwegian MH Class I Sequences
Group   n    θ       ρ    Lk           P Lkmax   P G4    Pcorr(r²,d)   Pcorr(D′,d)
IRL     21   84.69   10   361,052.94   0.000     0.000   0.000         0.000
NRW     21   80.88   25   359,843.16   0.000     0.000   0.000         0.000
IRLa    20   22.96   11   42,532.11    0.000     0.000   0.000         0.000
NRWb    19   79.83   14   354,462.25   0.000     0.000   0.000         0.000
NOTE.—Estimates of the Watterson mutation rate (θ) and population recombination
rate (ρ) are given, as are the probabilities resulting from testing the null hypothesis of
no recombination with several methods: Lk and P Lkmax from a likelihood permutation test (McVean, Awadalla, and Fearnhead 2002), P G4, Pcorr(r²,d) and
Pcorr(D′,d) from correlations of linkage disequilibrium with distance (Meunier
and Eyre-Walker 2001). IRL = Irish sequences; NRW = Norwegian sequences.
a Excluding Sasa-UBA*2001.
b Excluding Sasa-UBA*1001 and Sasa-UBA*0901.
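The relationship between the two population parameters can be made concrete with a short sketch. The first function gives the standard Watterson estimator (number of segregating sites divided by a harmonic sum); the second part forms the ratio of the recombination to the mutation estimate from the full-data rows of table 1. The function and variable names are hypothetical, and the ratios are simple arithmetic on the tabulated estimates rather than values reported in the text.

```python
def harmonic_number(n):
    """Sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))


def watterson_theta(n_sequences, n_segregating_sites):
    """Watterson estimator: S divided by the harmonic number of n - 1."""
    if n_sequences < 2:
        raise ValueError("at least two sequences are required")
    return n_segregating_sites / harmonic_number(n_sequences - 1)


# Estimates copied from table 1 (all sequences included).
table1 = {
    "IRL": {"rho": 10.0, "theta": 84.69},
    "NRW": {"rho": 25.0, "theta": 80.88},
}
ratios = {group: est["rho"] / est["theta"] for group, est in table1.items()}
```

Both ratios are below one (about 0.12 for the Irish and 0.31 for the Norwegian sequences), in line with the statement that the mutation estimates exceed the recombination estimates.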
The recombination tests performed with the DnaSP
software estimated a minimum of 48 and 33 recombination
events (Rm) in the Norwegian and Irish sequences, respectively. The majority of sites involved in recombination are
common to both groups of sequences (fig. 1). The exclusion
of Sasa-UBA*1001 and Sasa-UBA*0901 from the analysis
resulted in a decrease of the estimated number of recombination events for the Norwegian populations to 39. Excluding Sasa-UBA*2001 from the Irish sequences in the analysis
only lowered the number of estimated recombination events
by one (to 32). The 95% confidence intervals estimated by
coalescent simulations were overlapping between populations with estimated population mean Rms that were not significantly different (Ireland, mean Rm = 13.8 [95%
confidence intervals 8, 23]; Norway, mean Rm = 15.8
[95% confidence intervals 7, 25]). Areas of SPV were similar
in both groups of sequences (fig. 1), with larger regions
identified in the α1 domain (10–35 bp) compared with the
α2 domain (5–13 bp).
Table 2
Average Nonsynonymous (dN) and Synonymous (dS) Distances Between MHC Class I Alleles Found in the Irish
(IRL) and Norwegian (NRW) Populations of Atlantic Salmon Analyzed and Probability (P) of dN = dS
Population   Region   Sites     Number of Codons   dN              dS              P
IRL          α1       PBR       30                 0.276 (0.042)   0.216 (0.047)   0.028*
IRL          α1       Non-PBR   74                 0.232 (0.027)   0.263 (0.040)   0.458
IRL          α2       PBR       29                 0.208 (0.039)   0.184 (0.041)   0.421
IRL          α2       Non-PBR   83                 0.094 (0.017)   0.092 (0.019)   0.262
NRW          α1       PBR       30                 0.318 (0.041)   0.251 (0.047)   0.041*
NRW          α1       Non-PBR   74                 0.250 (0.026)   0.280 (0.045)   0.421
NRW          α2       PBR       29                 0.184 (0.041)   0.246 (0.062)   0.178
NRW          α2       Non-PBR   83                 0.097 (0.020)   0.085 (0.018)   0.699
* P < 0.05
Table 3
Likelihood Ratio Tests (LRT) Comparing Models to Test for Evidence of Positive Selection in the Irish (IRL) and
Norwegian (NRW) MHC Class I Alleles
Domain   Model   2ΔL (IRL)   2ΔL (NRW)   P (IRL)   P (NRW)
α1       M3-M0   108.04      87.48       <0.001    <0.001
α1       M2-M1   20.4        7.70        <0.01     <0.05
α1       M3-M1   74.04       69.32       <0.001    <0.001
α1       M8-M7   12.62       11.42       <0.01     <0.01
α2       M3-M0   207.50      377.64      <0.001    <0.001
α2       M2-M1   97.32       160.60      <0.001    <0.001
α2       M3-M1   100.48      160.66      <0.001    <0.001
α2       M8-M7   79.16       1635.68     <0.001    <0.001
NOTE.—Twice the difference in likelihood between models (2ΔL) was compared to a χ² distribution with df = g − 1 to assess the significance of tests (where
g is the number of parameters of the model).
We also observed significant evidence of recombination after repeating the maximum likelihood recombination
test without sites identified as positively selected (see
below) (P < 0.001). After removing selected sites the estimated recombination rate in the populations (ρ = 23 and 10
for Norway and Ireland, respectively) remained lower than
the mutation rates (θ = 64.7 and 68.7). The number of
recombination events detected by DnaSP after excluding
selected sites was 34 in the Norwegian populations and
23 in the Irish populations.
Patterns of Positive Selection
Evidence of positive selection in the potential PBRs
identified according to Grimholt et al. (1993) was found
in the α1 domain in both Irish and Norwegian groups of
sequences (table 2). Tests for positive selection gave nonsignificant results for the non-PBR sites in both domains
and also for the potential PBR sites of the α2 domain in
both populations.
Maximum likelihood models that allowed positive
selection fitted the data significantly better than those that
assumed only neutral or deleterious mutations (table 3). The
results were similar using maximum likelihood trees based
on the α1 and α2 domains separately or on the whole
sequence, so here we present only the results of both
domains combined. The ω estimates of the model M0
(an average over all sites in the protein) indicated that purifying selection dominated the evolution of both MH
domains (α1: ω = 0.290/0.340; α2: ω = 0.880/0.974; table
4). An LRT indicated that model M2, which allows for
positive selection, fitted the data better than model M1, which
only considers conserved and neutral sites (table 3). Estimates using the M2 model suggested that 4% of the sites
in the α1 domain were under positive selection in the Irish
sequences (ω = 12.4) and 12% in the Norwegian sequences
(ω = 3.44), while 15% and 17% of the sites of the α2
domain were under selection in the Irish and Norwegian
sequences, respectively (Irish ω = 9.39; Norwegian ω =
11.34). The model M3, which assumes three site classes, fitted
the data significantly better than all the previous models
(table 3). The results of model M3 suggest that 4% of
the sites in the α1 domain are under strong positive selection in the Irish sequences (ω = 6.0) and 2.7% of the sites
in the Norwegian sequences are under moderate selection
(ω = 1.12). According to this model ~13% of the sites in
the α2 domain are under strong positive selection in both
populations (ω > 9).
Table 4
Log-likelihood Values (L), Model Parameter Estimates and Positively Selected Sites Identified by a Bayesian Method
Implemented in CODEML
Model   Population   L          Estimates of Parameters
α1 Domain
M0      IRL          1,446.25   ω = 0.290
M0      NRW          2,005.73   ω = 0.340
M1      IRL          1,429.25   p0 = 0.267 (ω0 = 0), p1 = 0.733 (ω1 = 1)
M1      NRW          1,996.65   p0 = 0.165 (ω0 = 0), p1 = 0.835 (ω1 = 1)
M2      IRL          1,419.05   p0 = 0.264, p1 = 0.696, p2 = 0.040 (ω2 = 12.40)
M2      NRW          1,992.80   p0 = 0.161, p1 = 0.722, p2 = 0.116 (ω2 = 3.44)
M3      IRL          1,392.23   p0 = 0.468 (ω0 = 0.05), p1 = 0.490 (ω1 = 0.56), p2 = 0.042 (ω2 = 6.00)
M3      NRW          1,961.99   p0 = 0.172 (ω0 = 0.02), p1 = 0.553 (ω1 = 0.27), p2 = 0.275 (ω2 = 1.02)
M7      IRL          1,399.83   p = 0.405, q = 0.728
M7      NRW          1,966.44   p = 0.656, q = 1.011
M8      IRL          1,393.52   p0 = 0.968, p1 = 0.032, p = 0.449, q = 0.883 (ω = 7.39)
M8      NRW          1,960.73   p0 = 0.978, p1 = 0.022, p = 0.673, q = 1.094 (ω = 21.44)
α2 Domain
M0      IRL          1,479.90   ω = 0.880
M0      NRW          1,692.04   ω = 0.974
M1      IRL          1,426.39   p0 = 0.360 (ω0 = 0), p1 = 0.639 (ω1 = 1)
M1      NRW          1,583.55   p0 = 0.476 (ω0 = 0), p1 = 0.524 (ω1 = 1)
M2      IRL          1,377.73   p0 = 0.335, p1 = 0.516, p2 = 0.149 (ω2 = 9.39)
M2      NRW          1,503.25   p0 = 0.455, p1 = 0.372, p2 = 0.172 (ω2 = 11.34)
M3      IRL          1,376.15   p0 = 0.536 (ω0 = 0.11), p1 = 0.337 (ω1 = 1.68), p2 = 0.126 (ω2 = 10.53)
M3      NRW          1,503.22   p0 = 0.456 (ω0 = 0.00), p1 = 0.372 (ω1 = 1.06), p2 = 0.172 (ω2 = 11.74)
M7      IRL          1,416.23   p = 0.203, q = 0.255
M7      NRW          1,570.89   p = 0.104, q = 0.179
M8      IRL          1,376.65   p0 = 0.852, p1 = 0.147, p = 0.153, q = 0.156 (ω = 8.04)
M8      NRW          1,503.05   p0 = 0.831, p1 = 0.169, p = 0.021, q = 0.026 (ω = 10.39)
NOTE.—M0 assumes a constant ω (dN/dS) value, while M3 allows for three different ωi in pi proportions. M7 and M8 assume a continuous distribution (with shape
parameters p and q) of ω values with p1 being the proportion of codon sites assigned to the positive selection ω class.
An LRT comparing two models that both assume a
beta distribution of ω over sites (M7 and M8) indicated that
M8 (which allows for selection) fitted the data better than M7
(which does not allow for selection). The estimates from M8
indicated that 3% of the sites in the α1 domain were under
strong positive selection in the Irish sequences (ω = 7.4)
and 2% in the Norwegian sequences (ω = 21.44), while
in the α2 domain 15% of the sites were under strong diversifying selection in both populations, consistent with the
results from the M3 model.
The results from the M8 model, which had the smaller
AIC (2,795.04 and 3,929.46 for the M8 model against
2,796.46 and 3,933.48 for the M3 model, for Irish and Norwegian sequences, respectively), were used to compare the
sites identified as positively selected from their
posterior probabilities in both populations (fig. 2).
Only two sites were identified with a strong signal of positive selection (at the 99% confidence level) in the α1
domain of the Irish sequences. The same sites (plus one
more with a lower signal of selection) were also detected
as positively selected in the Norwegian sequences (fig.
2a). The α2 domain showed a larger number of sites with
a strong selective signal in both populations. There were
no differences between populations in the distribution of
the sites under diversifying selection in the α2 domain
under the M8 model, although 16 additional sites were
identified in the Norwegian population under M3. All
the sites identified with M8 were included in those identified with M3.
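The model choice by AIC can be sketched as follows. The AIC values are those reported in the text; the helper names are ours, and the `aic` function simply states the standard definition (Akaike 1974), AIC = 2k - 2 lnL, for a model with k free parameters.

```python
def aic(lnl, k):
    """Akaike information criterion for log-likelihood lnl and k parameters."""
    return 2 * k - 2 * lnl

def best_model(aic_by_model):
    """Return the name of the model with the smallest AIC."""
    return min(aic_by_model, key=aic_by_model.get)

# AIC values as reported in the text for each population.
reported = {
    "Irish": {"M8": 2795.04, "M3": 2796.46},
    "Norwegian": {"M8": 3929.46, "M3": 3933.48},
}

for population, aics in reported.items():
    delta = round(aics["M3"] - aics["M8"], 2)
    print(f"{population}: preferred model {best_model(aics)}, delta AIC = {delta}")
```

The AIC differences between the two models are small (1.42 and 4.02), so the preference for M8 is a matter of degree rather than a decisive rejection of M3.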
The maximum parsimony method implemented in
SGI predicted only one positively selected site in the α2
domain of the Norwegian sequences (42S, also predicted
by the likelihood method), although some of the sites with
the highest probabilities (0.9) coincided with sites identified by CODEML as positively selected (65G in α1 in the
Irish sequences; 71L and 74T in α2 in the Norwegian sequences). Several sites were identified as subject to purifying
selection in the α1 domain (7 and 13 in Irish and Norwegian
sequences, respectively) and also in the α2 domain (8 and 6).
The analysis of sequence variability (V) based on the
Shannon entropy index identified 17 and 28 highly variable
sites (V > 1.5, Stewart et al. 1997) in the α1 domain of the
Irish and Norwegian sequences, respectively, of which 4
and 9, respectively, had values V > 2.0 (fig. 2). Fewer
sites were identified in the α2 domain (9 for the Irish
sequences and 10 for the Norwegian sequences, of which
2 in each case had V > 2.0). Most variable sites in the α2
domain (7 in the Irish sequences and 9 in the Norwegian sequences)
coincided with positively selected sites, and in both sets of
sequences the selected sites in the α1 domain had V > 1.5.
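The per-site variability measure can be sketched as below. This is a minimal illustration that assumes V is the Shannon entropy of the amino acid frequencies at an alignment column, computed with base-2 logarithms; the toy alignment and function names are hypothetical, and gaps or ambiguous residues are not treated specially.

```python
import math
from collections import Counter

def shannon_entropy(column):
    """Shannon entropy (bits) of the residues at one alignment column."""
    counts = Counter(column)
    n = len(column)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

def variable_sites(alignment, threshold=1.5):
    """Return 1-based positions whose entropy exceeds the threshold."""
    return [
        pos
        for pos, column in enumerate(zip(*alignment), start=1)
        if shannon_entropy(column) > threshold
    ]

# Hypothetical toy alignment of four equal-length amino acid sequences.
toy_alignment = ["GAKL", "GSKL", "GTKM", "GVKM"]

for pos, column in enumerate(zip(*toy_alignment), start=1):
    print(pos, "".join(column), round(shannon_entropy(column), 2))
print("Highly variable sites (V > 1.5):", variable_sites(toy_alignment))
```

With 20 possible amino acids the entropy ranges from 0 (invariant site) to log2(20), about 4.32, so thresholds of 1.5 and 2.0 correspond to sites where several residues occur at appreciable frequency.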
In summary, the results of the different analyses indicated that there is variable selective pressure across sites of
the MH class I sequences analyzed. The patterns of distribution of positively selected sites were similar between both
populations (fig. 2) and were generally consistent between
methods of analysis, although a larger number of sites under
selection were inferred under maximum likelihood models.
All the regions of SPV identified by PLATO contained
potential PBRs: (1) as defined by Grimholt et al. 1993, (2)
deduced as positively selected by the Bayesian method in
CODEML, and (3) highly variable with entropy values
V > 1.5. Regions of SPV also overlapped with areas identified as involved in recombination events by DnaSP (fig. 1).
Discussion
Population Divergence at the MH Class I Locus
Evidence from neutral molecular markers suggests that
Irish and Norwegian Atlantic salmon populations originate
from a common Pleistocene glacial refugium, with the onset
of divergence between these populations due to postglacial
dispersal around 12,000–15,000 years ago (Verspoor et al.
1999; Nilsson et al. 2001; Consuegra et al. 2002). However,
contrary to expectation on the basis of this relatively recent
divergence, these populations had different allelic compositions at the MH class I locus, with only 2 out of 41 sequences
in common. In contrast, Irish and Norwegian populations
share the majority of mitochondrial DNA haplotypes
(Verspoor et al. 1999; Nilsson et al. 2001) and alleles at allozyme (Bourke et al. 1997) and microsatellite (unpublished
data) loci.
Collectively, the sequences displayed high levels of
nucleotide diversity but did not show fixed substitutions
between populations, suggesting that much of the observed
polymorphism predates population divergence. Such a pattern of shared polymorphism is indicative of relatively
ancient allelic lineages (Shum et al. 2001) possibly maintained by balancing selection (Garrigan and Hedrick 2003).
The presence of the Sasa-UBA*0601 and Sasa-UBA*0602 alleles in both populations may be a relict of their
common origin with selective pressure(s) common to both
populations maintaining the presence of these alleles. Alternatively, escape of fish of Norwegian origin from Atlantic
salmon farms on the west coast of Ireland (Clifford et al.
1998) may have recently introduced these alleles into the
Irish population, particularly through the contribution of
mature parr to natural reproduction (Garant et al. 2003).
A third possibility is that our sample could have included
juveniles escaped from farms that were mistaken for wild
fish. Regardless of the explanation for the presence of these
two sequences in common, from our sampling it appears that
the MH class I allelic compositions of Irish and Norwegian
Atlantic salmon are almost entirely nonoverlapping.
Levels of Recombination
Although recombination/mutation ratios suggested that
mutation has been the dominant force in shaping the present
allelic diversity in both populations, recombination appears
to play an important role in creating new alleles in these isolated populations. Congruent results from several forms of
analyses provided strong evidence for recombination in
the sequences from both Irish and Norwegian populations,
with similar regions of sequence potentially involved in
the recombination process in each population. As there were
no fixed single nucleotide polymorphisms between populations, but little sharing of complete alleles, it may be that the
Ldhat analysis is picking up a strong signal of relatively old
mutation, which is masking the signature of recombination
that has occurred after population divergence and is responsible for the nonoverlapping allele compositions.
Patterns of Selection
Our analyses revealed a complex pattern of selection
across sites, with the best-fitting models of substitution
allowing residues to evolve in neutral, conservative, and/or
positively selected manners. Such a result is consistent
with what is known of the interactions between residues
within the MHC molecule, interactions between the
MHC and peptide molecules, and interactions between
the MHC-peptide complex and a suite of other cell surface
receptors (a suite that is probably not yet completely
defined) (Van den Berg, Yoder, and Litman 2004). While
little is currently known about structure-function relationships in fish MH molecules, comparison with the relatively
well-understood human and mouse systems allows some
inferences to be drawn. Most sites appear to be conserved,
regardless of the model of substitution used, possibly in
order to maintain structural integrity of the α helices and
β sheets formed by both the α1 and α2 domains (Bjorkman
et al. 1987a). However, conserved residues also appear to
be necessary for nonspecific binding with T cell receptors
(Bjorkman et al. 1987b; Garboczi et al. 1996; Garcia et al.
1996) and other antigen receptors (Van den Berg, Yoder,
and Litman 2004).
Conventionally, PBRs have been accepted as generally under positive selection (Hedrick et al. 1991), and a
high level of variability has been used as a criterion for
identifying PBRs on occasion (Reche and Reinherz
2003). However, in this case, when PBRs were deduced from
the alignment with human HLA-A2 sequences (Grimholt
et al. 1993) and tests of neutrality were performed, evidence
for positive selection was found only in the α1 domain. This
is in contrast with the results of the maximum likelihood
test, where evidence for positive selection was found in both
domains. The maximum parsimony method, on the other
hand, only presented evidence for positive selection in
the α2 domain.

FIG. 2.—Posterior probabilities of site classes for sites along the MHC class I (a) α1 and (b) α2 domain under the random-sites model M8 (beta and ω).
Ten equal-probability categories were used to approximate the beta distribution (Yang et al. 2000), so that the model has 11 categories. The location of the
positively selected sites is indicated by the dotted line (95% level of significance). Polymorphic sites determined by their entropy values: gray arrows
indicate sites with V > 1.0, and black arrows indicate sites with V > 2.0. The amino acid positions refer to Sasa-UBA*0101 for the Norwegian sequences
and Sasa-UBA*1601 for Irish sequences.

The maximum parsimony method has been argued to be
more robust to violations of the assumptions of the models
and less prone to false positives than the maximum likelihood method (Suzuki and Nei 2001, 2004). The main difference between methods is in the way that positively selected
sites are identified. The maximum likelihood method
groups the codon sites into categories with different
nonsynonymous-synonymous (ω) rate ratios and tests
whether the ω of the group is >1, while the maximum parsimony method examines each codon with a standard statistical approach (Suzuki and Nei 2004). Although the maximum
parsimony method may detect fewer false positives, it has
lower statistical power for identifying positive selection
and is less sensitive in detecting truly selected sites than
the maximum likelihood method (Wong et al. 2004) unless
a large number of sequences are analyzed (Suzuki and Nei
2001, 2004). In general, the maximum likelihood method
seems to be most powerful in detecting positive selection
and predicting positively selected sites when the analyses
are repeated with several initial values and when the interpretation of the data is cautious (e.g., when the ω ratios are close
to 1, the prediction is less accurate) (Wong et al. 2004). This
method is particularly appropriate for detecting selection in
MHC genes, where positive selection could be acting simultaneously in groups of codons (Suzuki and Nei 2004). In this
case, only 1 of the 13 sites predicted by the maximum likelihood method was also predicted by maximum parsimony,
although another 3 sites had a high (although nonsignificant)
probability (0.9) and 4 more sites had a probability 0.8.
The results of the maximum likelihood analysis are
also supported by the coincidence of the majority of the
positively selected sites predicted by CODEML with the
potential PBRs deduced by Grimholt et al. (1993) based
on the alignment of a salmon UBA class I sequence with
a human HLA-A2 sequence. Ten of fifteen predicted positively selected sites were deduced as possible PBR sites
(65G and 79A in the α1 domain and 5N, 7W, 24E,
26W, 63Q, 66H, 67D, and 74T in the α2 domain).
Moreover, there was high congruence between sites of
high entropy and predicted positively selected sites: of 15
positively selected sites, 12 had high entropy. Five sites
were predicted as positively selected but not as potential PBRs
(17A, 34I, 41K, 42S, and 71L). In particular, the site 42S in
the α2 domain was predicted as positively selected by three
different methods (maximum likelihood, maximum parsimony, and high entropy), although it was not amongst
the potential PBRs from Grimholt et al. (1993). Crystallographic models of MH molecules in fish are not available
(Grimholt et al. 2002), and the structure and variability of
the class I molecule (with insertions and deletions) make it
difficult to align the Atlantic salmon sequences with the
HLA sequences; this may be the origin of some of the discrepancies observed between predicted positively selected
sites and deduced PBR sites.
Interplay Between Recombination and Selection
As methods for identifying recombination and positive
selection-variability often identified the same sites, our
results are consistent with a model of selection on recombination events (Ohta 1995) to produce divergence between
populations in the set of MHC class I alleles each possesses.
However, recombination and positive selection may produce similar signals in the pattern of polymorphism. For
example, the analysis of SPV does not seek to distinguish
between the two processes (Grassly and Holmes 1997). In
particular, congruence of sites involved in recombination
and under positive selection could be due to the effect of
recombination on the maximum likelihood methods for
identifying positive selection. These models of codon substitution are phylogeny based and do not account for the
effects of recombination (Anisimova, Nielsen, and Yang
2003). However, when comparing the results of models
M7 and M8, which appear to be relatively unaffected by
recombination, with those of other models (M2, M3) that
appear to be more sensitive to recombination, results were
consistent. In general, Bayesian methods for identifying positively selected sites seem to be robust to recombination
effects (Anisimova, Bielawski, and Yang 2002). Moreover,
the results of the recombination analyses excluding positively selected sites indicate that recombination is playing
an important role in generating allelic diversity as the signals
of recombination persist after the exclusion of potential
PBRs.
Conclusion
In summary, in spite of their postglacial common origin, Irish and Norwegian salmon populations currently
appear to have largely different pools of MH class I alleles.
However, the evidence suggests a probable common origin
for all the alleles. Although point mutation appears to be the
main force creating new alleles, there is evidence for extensive recombination. Recombination events are most often
observed to involve sites under positive selection, with similar but not identical patterns in both populations. Selection
favoring new recombinant alleles is therefore likely to be
responsible for the rapid divergence of the MH class I
variation of these populations.
Acknowledgments
We are grateful to Elvira De Eyto, Phil McGinnity,
and Kjetil Hindar for providing the samples on which this
study is based. We also thank Pekka Pamilo and two anonymous reviewers for useful comments on an earlier version
of the manuscript. Funding for this study was supplied by a
contract from the European Commission (Salimpact:
QLRT-2000-01185).
Literature Cited
Akaike, H. 1974. A new look at the statistical model identification.
IEEE Trans. Autom. Control AC-19:716–723.
Anisimova, M., J. P. Bielawski, and Z. Yang. 2002. Accuracy of
Bayes prediction of amino acid sites under positive selection.
Mol. Biol. Evol. 19:950–958.
Anisimova, M., R. Nielsen, and Z. Yang. 2003. Effect of recombination on the accuracy of the likelihood method for detecting
positive selection at amino acid sites. Genetics 164:
1229–1236.
Belich, M. P., J. A. Madrigal, W. H. Hildebrand, J. Zemmour,
R. C. Williams, R. Luz, M. L. Petzl-Erler, and P. Parham.
1992. Unusual HLA-B alleles in two tribes of Brazilian Indians.
Nature 357:326–329.
Bernatchez, L., and C. C. Wilson. 1998. Comparative phylogeography of Nearctic and Palearctic fishes. Mol. Ecol. 7:431–452.
Bjorkman, P. J., M. A. Saper, B. Samraoui, W. S. Bennett, J. L.
Strominger, and D. C. Wiley. 1987a. Structure of the human
class I histocompatibility antigen, HLA-A2. Nature
329:506–512.
———. 1987b. The foreign antigen binding site and T cell
recognition regions of class I histocompatibility antigens.
Nature 329:512–518.
Bourke, E. A., J. Coughlan, H. Jansson, P. Galvin, and T. F. Cross.
1997. Allozyme variation in populations of Atlantic salmon
located throughout Europe: diversity that could be compromised by introductions of reared fish. ICES J. Mar. Sci.
54:974–985.
Clifford, S. L., P. McGinnity, and A. Ferguson. 1998. Genetic
changes in Atlantic salmon (Salmo salar) populations of
Northwest Irish rivers resulting from escapes of adult farm salmon. Can. J. Fish. Aquat. Sci. 55:358–363.
Consuegra, S., C. Garcı́a de Leániz, A. Serdio, M. González
Morales, L. G. Straus, D. Knox, and E. Verspoor. 2002. Mitochondrial DNA variation in Pleistocene and modern Atlantic
salmon from the Iberian glacial refugium. Mol. Ecol.
11:2037–2048.
Edwards, S. V., and P. W. Hedrick. 1998. Evolution and ecology
of MHC molecules: from genomics to sexual selection. TREE
13:305–311.
Felsenstein, J. 1989. PHYLIP—phylogeny inference package
(Version 3.2). Cladistics 5:164–166.
Garant, D., I. A. Fleming, S. Einum, and L. Bernatchez. 2003.
Alternative male life-history tactics as potential vehicles for
speeding introgression of farm salmon traits into wild populations. Ecol. Lett. 6:541–549.
Garboczi, D. N., P. Ghosh, U. Utz, Q. R. Fan, W. E. Biddison, and
D. C. Wiley. 1996. Structure of the complex between human
T-cell receptor, viral peptide and HLA-A2. Nature 384:
134–141.
Garcia, K. C., M. Degano, R. L. Stanfield, A. Brunmark, M. R.
Jackson, P. A. Peterson, L. Teyton, and I. A. Wilson.
1996. An αβ T cell receptor structure at 2.5 Å and
its orientation in the TCR-MHC complex. Science 274:
209–219.
Garrigan, D., and P. W. Hedrick. 2001. Class I MHC polymorphism and evolution in endangered California Chinook and
other Pacific salmon. Immunogenetics 53:483–489.
———. 2003. Detecting adaptive molecular polymorphism: lessons from the MHC. Evolution 57:1707–1722.
Grassly, N. C., and E. C. Holmes. 1997. A likelihood method for
the detection of selection and recombination using sequence
data. Mol. Biol. Evol. 14:239–247.
Grimholt, U., F. Drabløs, S. M. Jørgensen, B. Høyheim, and
R. J. M. Stet. 2002. The major histocompatibility class I
locus in Atlantic salmon (Salmo salar L.): polymorphism, linkage analysis and protein modelling. Immunogenetics 54:
570–581.
Grimholt, U., I. Hordvik, V. M. Fosse, I. Olsaker, C. Endresen,
and O. Lie. 1993. Molecular cloning of major histocompatibility complex class I cDNAs from Atlantic salmon (Salmo
salar). Immunogenetics 37:469–473.
Hedrick, P. W., W. Klitz, W. P. Robinson, M. K. Kuhner, and
G. Thomson. 1991. Evolutionary genetics of HLA. Pp. 248–271
in R. Selander, A. Clark and T. Whittam, eds. Molecular evolution. Sinauer, Sunderland, MA.
Hedrick, P. W. 1994. Evolutionary genetics of the major histocompatibility complex. Am. Nat. 143:945–964.
———. 2002. Pathogen resistance and genetic variation at MHC
loci. Evolution 56:1902–1908.
Hewitt, G. M. 1999. Post-glacial re-colonization of European biota.
Biol. J. Linn. Soc. 68:87–112.
Hudson, R. R., and N. L. Kaplan. 1985. Statistical properties of the
number of recombination events in the history of a sample of
DNA sequences. Genetics 111:147–164.
Jeffery, K. J. M., and C. R. M. Bangham. 2000. Do infectious diseases drive MHC diversity? Microbes Infect. 2:1335–1341.
Jordan, W. C., and M. W. Bruford. 1998. New perspectives on
mate choice and the MHC. Heredity 81:127–133.
Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein molecules. Pp. 21–132 in H. M. Munro, ed. Mammalian protein
metabolism. Academic Press, New York.
Koskinen, M. T., T. O. Haugen, and C. R. Primmer. 2002. Contemporary fisherian life-history evolution in small salmonid
populations. Nature 419:926–930.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2:
molecular evolutionary genetics analysis software. Bioinformatics 17:1244.
Landry, C., and L. Bernatchez. 2001. Comparative analysis of
population structure across environments and geographical
scales at major histocompatibility complex and microsatellite
loci in Atlantic salmon (Salmo salar). Mol. Ecol. 10:
2525–2539.
Landry, C., D. Garant, P. Duchesne, and L. Bernatchez. 2001.
‘Good genes as heterozygosity’: the major histocompatibility
complex and mate choice in Atlantic salmon (Salmo salar).
Proc. R. Soc. Lond. B Biol. Sci. 268:1279–1285.
Martinsohn, J. T., A. B. Sousa, L. A. Guethlein, and J. C. Howard.
1999. The gene conversion hypothesis of MHC evolution:
a review. Immunogenetics 50:168–200.
McVean, G., P. Awadalla, and P. Fearnhead. 2002. A coalescent-based method for detecting and estimating recombination from
gene sequences. Genetics 160:1231–1241.
Meunier, J., and A. Eyre-Walker. 2001. The correlation between
linkage disequilibrium and distance. Implications for recombination in Hominid mitochondria. Mol. Biol. Evol. 18:
2132–2135.
Miller, K. M., and R. E. Withler. 1998. The salmonid class I MHC:
limited diversity in a primitive teleost. Immunol. Rev.
166:279–293.
Nei, M., and T. Gojobori. 1986. Simple methods for estimating
the numbers of synonymous nucleotide substitutions. Mol.
Biol. Evol. 3:418–426.
Nei, M., and T. Gojobori. 2000. Molecular Evolution and Phylogenetics. Oxford University Press, New York, NY.
Nilsson, J., R. Gross, T. Asplund et al. (16 co-authors). 2001.
Matrilinear phylogeography of Atlantic salmon (Salmo salar L.)
in Europe and postglacial colonization of the Baltic Sea area.
Mol. Ecol. 10:89–102.
Ohta, T. 1995. Gene conversion vs point mutation in generating
variability at the antigen recognition site of major histocompatibility complex loci. J. Mol. Evol. 41:115–119.
Ohta, Y., K. Okamura, E. C. McKinney, S. Bartl, K. Hashimoto,
and M. F. Flajnik. 2000. Primitive synteny of vertebrate major
histocompatibility complex class I and class II genes. Proc.
Natl. Acad. Sci. USA 97:4712–4717.
Penn, D. J., K. Damjanovich, and W. K. Potts. 2002. MHC
heterozygosity confers a selective advantage against
multiple-strain infections. Proc. Natl. Acad. Sci. USA 99:
11260–11264.
Posada, D., and K. A. Crandall. 1998. MODELTEST: testing the
model of DNA substitution. Bioinformatics 14:817–818.
Reche, P. A., and E. L. Reinherz. 2003. Sequence variability analysis of human class I and class II MHC molecules: functional
and structural correlates of amino acid polymorphisms. J. Mol.
Biol. 331:623–641.
Rozas, J., and R. Rozas. 1999. DnaSP Version 3: an integrated
program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174–175.
Shannon, C. E. 1948. A mathematical theory of communication.
Bell Syst. Tech. J. 27:379–423, 623–656.
Shum, B. P., L. Guethlein, L. R. Flodin, M. D. Adkison, R. P.
Hedrick, R. B. Nehring, R. J. M. Stet, C. Secombes, and
P. Parham. 2001. Modes of salmonid MHC Class I and II evolution differ from the primate paradigm. J. Immunol. 166:
3297–3308.
Stet, R. J. M., B. deVries, K. Mudde, T. Hermsen, J. van Heerwaarden, B. P. Shum, and U. Grimholt. 2002. Unique haplotypes of co-segregating major histocompatibility class II A and
class II B alleles in Atlantic salmon (Salmo salar) give rise to
diverse class II genotypes. Immunogenetics 54:320–331.
Stewart, J. J., C. Y. Lee, S. Ibrahim, P. Watts, M. Shlomick,
S. Weigert, and S. Litwin. 1997. A Shannon entropy analysis
of immunoglobulin and T cell receptors. Mol. Immunol. 34:
1067–1082.
Su, C. 2000. SGI: modified Suzuki and Gojobori’s method for
detecting positive and negative selection at individual
codon sites. The Pennsylvania State University, University
Park, Pa.
Suzuki, Y., and T. Gojobori. 1999. A method for detecting positive selection at single amino acid sites. Mol. Biol. Evol.
16:1315–1328.
Suzuki, Y., and M. Nei. 2001. Reliabilities of parsimony-based
and likelihood-based methods for detecting positive
selection at single amino acid sites. Mol. Biol. Evol. 18:
2179–2185.
———. 2004. False-positive selection identified by ML-based
methods: examples from the Sig1 gene of the diatom Thalassiosira weissflogii and the tax gene of a human T-cell lymphotropic virus. Mol. Biol. Evol. 21:914–921.
Taylor, E. B. 1991. A review of local adaptations in Salmonidae,
with particular reference to Pacific and Atlantic salmon. Aquaculture 98:185–207.
Van den Berg, T. K., J. A. Yoder, and G. W. Litman. 2004. On the
origins of adaptive immunity: innate immune receptors join the
tale. Trends Immunol. 25:11–16.
Verspoor, E., E. M. McCarthy, D. Knox, E. Bourke, and T. F.
Cross. 1999. The phylogeography of European Atlantic
salmon (Salmo salar L.) based on RFLP analysis of the
ND1/16sRNA region of mtDNA. Biol. J. Linn. Soc. 68:
129–146.
Watkins, D. I., S. N. McAdam, X. Liu et al. (13 co-authors).
1992. New recombinant HLA-B alleles in a tribe of South
American Amerindians indicate rapid evolution of MHC class
I loci. Nature 357:329–333.
Wong, W. S. W., Z. Yang, N. Goldman, and R. Nielsen. 2004.
Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying
positively selected sites. Genetics 168:1041–1051.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13:555–556.
Yang, Z., R. Nielsen, N. Goldman, and A.-M. K. Pedersen. 2000.
Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.
Pekka Pamilo, Associate Editor
Accepted January 25, 2005