articles
Sequence and analysis of chromosome 2
of the plant Arabidopsis thaliana
Xiaoying Lin*, Samir Kaul*, Steve Rounsley², Terrance P. Shea², Maria-Ines Benito, Christopher D. Town, Claire Y. Fujii, Tanya Mason,
Cheryl L. Bowman, Mary Barnstead, Tamara V. Feldblyum, C. Robin Buell, Karen A. Ketchum, John Lee, Catherine M. Ronning,
Hean L. Koo, Kelly S. Moffat, Lisa A. Cronin, Mian Shen, Grace Pai, Susan Van Aken, Lowell Umayam, Luke J. Tallon, John E. Gill,
Mark D. Adams², Ana J. Carrera, Todd H. Creasy, Howard M. Goodman², Chris R. Somerville², Greg P. Copenhaver², Daphne Preuss²,
William C. Nierman, Owen White, Jonathan A. Eisen, Steven L. Salzberg, Claire M. Fraser & J. Craig Venter²
The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
* These authors contributed equally to this work
............................................................................................................................................................................................................................................................................
Arabidopsis thaliana (Arabidopsis) is unique among plant model organisms in having a small genome (130±140 Mb), excellent
physical and genetic maps, and little repetitive DNA. Here we report the sequence of chromosome 2 from the Columbia ecotype in
two gap-free assemblies (contigs) of 3.6 and 16 megabases (Mb). The latter represents the longest published stretch of
uninterrupted DNA sequence assembled from any organism to date. Chromosome 2 represents 15% of the genome and encodes
4,037 genes, 49% of which have no predicted function. Roughly 250 tandem gene duplications were found in addition to large-scale
duplications of about 0.5 and 4.5 Mb between chromosomes 2 and 1 and between chromosomes 2 and 4, respectively. Sequencing
of nearly 2 Mb within the genetically de®ned centromere revealed a low density of recognizable genes, and a high density and
diverse range of vestigial and presumably inactive mobile elements. More unexpected is what appears to be a recent insertion of a
continuous stretch of 75% of the mitochondrial genome into chromosome 2.
Arabidopsis thaliana (thale cress), a member of the mustard family,
is a dicotyledenous ¯owering plant with a diploid number of 10 that
has become a widely used model for the study of plant biology
because of its small size, short generation time, facile genetics and
ease of transformation. Whole-genome analysis began ten years ago
with the production of cosmid and yeast arti®cial chromosome
(YAC)-based physical maps1, while early large-scale sequencing
efforts focused on ordered cosmids2. On August 20±21 1996,
representatives of six research groups from Japan, Europe and the
USA met in Washington DC to discuss strategies for facilitating
international cooperation in completing the sequencing of the
Arabidopsis genome (the Arabidopsis Genome Initiative (AGI)) by
the year 2004. The group agreed on a workplan to ensure that the
sequencing of all ®ve chromosomes was completed except for the
dif®cult to sequence repetitive regions: the nucleolar organizer
regions (NORs) and centromeres (http://genome-www.stanford.
edu/Arabidopsis/AGI/AGI_memo.html)3.
The main molecular and cytological features of the A. thaliana
chromosomes had been de®ned by the time high-throughput
sequencing began (Fig. 1). The NORs in A. thaliana are found on
the upper ends of chromosomes 2 and 4. Each region, 3.5±4 Mb in
length, consists of an array of tandemly repeated ribosomal RNA
genes4. Using a combination of restriction analysis, polymerase
chain reaction (PCR) and sequencing, it has been shown that the
rDNA of the NOR on chromosome 4 immediately abuts the
telomeric repeats with less than 500 base pairs (bp) of intervening
sequence5. Restriction analysis is consistent with a similar arrangement on chromosome 2, with the rDNA terminating at a different
point within the repeat unit. By contrast, the genomic organization
of A. thaliana centromeres is not well characterized. These regions
are heterochromatic6 and contain large tandem arrays of a family of
180-bp elements7±9 that are reminiscent of the 170-bp alphoid
² Present addresses: Cereon Genomics, 270 Albany Street, Cambridge, Massachusetts 02139, USA (S.R.,
T.P.S.); Celera Genomics Corporation, 45 West Gude Drive, Rockville, Maryland 20850, USA (M.D.A.,
J.C.V.); Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachussets 02114,
USA (H.M.G.); Department of Plant Biology, Carnegie Institution of Washington, 260 Panama St.,
Stanford, California 94305, USA (C.R.S.); Department of Molecular Genetics and Cell Biology, University
of Chicago, Chicago, Illinois 60637, USA (G.P.C., D.P.).
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
repeats found in primate centromeres10. Large blocks of DNA
consisting primarily of 180-bp repeats have been characterized by
2D gel electrophoresis and assigned to speci®c centromeres on the
basis of restriction-fragment-length polymorphisms11. The 180-bp
repeat block on chromosome 2 is 820±830 kb in size11, which should
be considered as a minimum estimate for the size of the
unsequenced centromeric region on this chromosome.
Features of chromosome 2
Chromosome 2 is acrocentric and was originally estimated to be
13.5 Mb in length from the YAC-based physical map12. Sequencing
was initiated using bacterial arti®cial chromosomes (BACs) that
had been placed onto the physical map of chromosome 2 by
hybridization to YACs and genetic markers. Subsequently, BAC
end sequences and BAC ®ngerprint data13 allowed extension from
these initial seed points and completion of the entire chromosome.
A total of 257 BAC and P1 clones (including 5 BACs completed by
other groups) were sequenced to produce over 24 Mb of ®nished
sequence, which has been assembled as two contigs terminating in
blocks of 180-bp repeats. These repeats represent the inner boundaries of our ®nished sequence.
The upper (short) arm, measured from the lower end of the NOR
to the centromeric 180 bp repeats, is 3.6 Mb. The northern-most
BAC (F23H14) in this contig contains a single rDNA repeat unit
that is oriented in the same direction as the telomere-proximal
repeat, suggesting that all the rDNA units are arranged in a head-totail fashion running 59 to 39 from the telomere towards the
centromere14. Immediately adjoining the rDNA is about 60 kb of
highly repetitive sequence containing many transposons. The lower
(long) arm from centromere to telomere is 16 Mb. The sequence of
the southern-most BAC in this contig (F11L15) is joined by a small
PCR fragment to the A. thaliana sequence in pAtT51, a telomerecontaining clone previously mapped to TEL2S15 (http://genomewww.stanford.edu/Arabidopsis/ww/Nov98RImaps/index.html).
Within this region, a putative transcriptional co-activator gene was
identi®ed. Because pAtT51 was derived from the Landsberg ecotype, the last 2.5 kb of this contig may not be identical to the
sequence in the Columbia ecotype.
© 1999 Macmillan Magazines Ltd
761
articles
Gene content
Using a combination of gene prediction programs and database
searches, the chromosome was annotated (Fig. 3 (facing page 768);
and Table 1). Statistics of the chromosomal composition are listed
in Table 2, along with corresponding data from other eukaryotic
genome projects. A graphical distribution of various features is
shown in Fig. 4. Of the 4,037 genes identi®ed, 2,078 (51.5%) were
assigned either a de®nite or putative function, 865 (21.4%) were
labelled as unknown genes and 1,094 genes (27.1%) were designated
as hypothetical. In addition, 400 pseudogenes, most of which are
related to proteins found in retrotransposons and are located near
the centromere, and 79 genes encoding structural RNAs (73 tRNA, 4
tRNA and 2 small nucleolar (snRNA)) were found.
On average, a gene occurs every 4.4 kb, is about 2 kb in length
(from start to stop codon) and contains 4.6 exons. More than 50%
of the chromosome is therefore intergenic sequence. Twenty-three
per cent of the genes consist of a single exon, whereas the largest
Telomere
Marker
cM
NOR
< kk1
3.5
< mi421
19
gene (T20F21.17) contains 52 predicted exons and encodes a
protein that is 50% identical to the human ch-TOG protein16. At
least 1,352 (33.5%) genes are expressed, as re¯ected by the presence
of a matching complementary DNA or expressed sequence tag
(EST) entry in GenBank. Genes that are most highly represented
by ESTs include a glycine-rich RNA-binding protein, aquaporins,
ribulose bisphosphate carboxylase activase, a chlorophyll a/b
binding protein and lipid transfer proteins. The distribution of
genes and ESTs along the chromosome is shown in Fig. 4b, c.
Classi®cation by biological role or biochemical function of the
known and putative genes shows that most major cellular processes
appear to be represented, with genes involved in regulatory functions and signal transduction (DNA-binding proteins/transcription
factors and protein kinases) comprising the largest functional
groups (Table 1). These homology-based assignments are consistent
with results from searching the hidden Markov models (HMMs) of
protein domains in the pFam database17. Of the 4,037 genes
predicted to encode a protein, 1,436 match at least one of the 434
distinct pFam HMMs (22% of the total HMMs). The most
frequently seen matches are to the LRR (leucine-rich repeat, 239
matches), the pkinase (eukaryotic protein kinase, 161 matches) and
the zf-C3HC4 (zinc ®nger, C3HC4 type, 46 matches) HMMs. The
cellular destinations of the proteins were also predicted. Over 17%
(698) contained signal peptides, whereas over 10% (416) were
predicted to be organelle-targeted (135 for the chloroplast and
281 for the mitochondrion). In addition, more than 16% of the
proteins are probably located in cellular membranes, with 665
sequences predicted to have 4 or more transmembrane segments
(TMSs). Furthermore, about 25% of the proteins with 1, 2 or 3
predicted TMSs (1,438) are predicted to be localized to the
membrane.
Gene and chromosomal duplications
Chromosome 2 contains many duplicated genes. More than 60% of
20,000
Physical distance (kb)
At 19.6 Mb, the overall length of chromosome 2 (excluding the
NOR and centromere) is 45% larger than the original estimate. This
difference is due in part to gaps in the YAC-based physical map and
to a deletion in one of the YACs (see Fig. 1). Similar increases in the
actual physical lengths of the other A. thaliana chromosomes
indicate that the genome size is actually 130±140 Mb, rather than
the earlier estimates of 70±100 Mb.
Using sequence information for 37 markers from the Arabidopsis
Recombinant Inbred genetic map (http://genome-www.stanford.
edu/Arabidopsis/ww/Nov98RImaps/index.html), the relationship
between physical and genetic distances along the chromosome
was examined. As the centromere represents a discontinuity,
regression analysis for the two arms was performed separately.
The ratios of physical to genetic distance on the short arm and
long arm were not signi®cantly different, with values of 244 kb cM-1
and 223 kb cM-1, respectively (Fig. 2a). Although these two values
indicate a similar overall rate of recombination along the two arms
of the chromosome, local distortions are evident which may re¯ect
chromosomal rearrangements between the two ecotypes used to
construct the mapping population (Fig. 2b).
a
16,000
12,000
8,000
4,000
{
< er
< ve018
Physical/genetic distance (kb cM–1)
0
Centromere
49
69
∆
< mi79a
87
Telomere
Figure 1 Features of Arabidopsis thaliana chromosome 2. The chromosome is roughly
23 Mb long (including the NOR), spans a genetic map distance of 90 cM and contains
15% of the unique sequence of the A. thaliana genome. The locations of a number of
genetic markers are shown. The original YAC contigs are represented by the orange line,
and the location of the deletion noted in the text is shown (D).
762
2,500
2,000
1,500
1,000
500
20
40
60
80
100
40
60
80
Genetic map distance (cM)
100
b
400
300
200
100
0
20
Figure 2 Representation of physical/genetic distance along chromosome 2. a, Physical
distance (in kb) is plotted against position of each marker on the Recombinant Inbred (RI)
map. Data above and below the centromere are treated as separate sets for regression
analysis. Note that genetic distance is measured from the telomere (TEL2N), whereas
physical distance is measured from the bottom of the NOR. b, The ratio of physical to
genetic distance between successive pairs of markers is plotted against map position
for each marker pair on the RI map.
© 1999 Macmillan Magazines Ltd
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
articles
the predicted proteins (2,542 out of 4,037) have a signi®cant match
(FASTA, P , 10 2 10 ) to another A. thaliana protein, with 2,138
matching at least one other protein encoded on chromosome 2. Of
these, 593 are found in 239 tandem duplications that range in size
from 2 to 9 genes (average 2.5). A particularly striking example is
found in BAC F16P2, which contains tandem repeats of three gene
families (Fig. 5). Of the 12 intact copies of a putative short-chain
dehydrogenase/reductase gene in this region, 9 are found in tandem.
This gene family is most similar to tropinone reductase, a branch
point in the biosynthesis of tropane alkaloids including such
medicinally important compounds as atropine, scopolamine and
cocaine18,19. Although these alkaloids are not known to occur in the
Brassicaceae, at least one of these genes is expressed, suggesting a
role in secondary metabolism which has yet to be elucidated. This
same BAC also contains a tandem array of seven genes encoding
glutathione S-transferase (GST), which has been implicated in
numerous plant processes including stress and pathogen responses,
xenobiotic tolerance and vacuolar sequestration20. In total, 13 GST
genes have been found on chromosome 2, with 4 of the other genes
occurring as tandem pairs. The GSTs within the long tandem repeat
are more closely related to one another than to any other of the 25
GSTs known in Arabidopsis, indicating that this expansion is of
relatively recent origin. There is also a smaller duplication consisting
Table 1 Classi®cation of chromosome-2 proteins according to role or
function
Percentage of
chromosome-2
proteins
.............................................................................................................................................................................
Regulatory functions
Signal transduction
Cellular structure, organization and biogenesis
Transport and binding proteins
Protein fate
Energy metabolism
Growth and development
Secondary metabolism
Cellular processes
Pathogen responses
Protein synthesis
Fatty acid and phospholipid metabolism
General transcription
Environmental response
DNA metabolism
Central intermediary metabolism
Amino-acid biosynthesis
Purine and pyrimidine base, nucleoside and nucleotide metabolism
Biosynthesis of cofactors, prosthetic groups and carriers
Other categories
Unclassi®ed
Unknown protein
Hypothetical protein
7.8
4.0
3.6
3.6
3.0
2.7
2.6
2.5
2.4
2.0
1.9
1.5
1.4
1.0
0.7
0.7
0.6
0.6
0.4
6.9
1.4
21.4
27.1
of three pumilio-like genes, two of which are in tandem. The
proximity of these gene duplications suggests either that this is a
repeat-prone region of the chromosome or that they arose as part of
the same expansion event.
Some of the duplicated genes are found within large chromosomal duplications (Fig. 6). A segment of the bottom arm of
chromosome 2 (13.4±14.1 Mb, containing 170 genes) from clone
F20M17 to F4P9 was found to be duplicated on chromosome 1
(from clone F19P19 to F3F20) with an inversion in the middle
(13.8±13.9 Mb on chromosome 2). Within this region, 57 gene pairs
(33%) are duplicated between the two chromosomes. Several
instances of tandem duplications on just one of the two chromosomes were also observed. For example, F4P9.8 encodes an auxininducible protein that matched two adjacent copies of a similar
protein-coding sequence (F19P19.31 and F19P19.32) on chromosome 1; and a single copy of a gene encoding an ion-channel protein
on chromosome 1 (YUP8H12.9) matched two adjacent copies on
chromosome 2 (T32F6.9 and T32F6.8). Sequence comparison of the
proteins in these asymmetric duplications suggests that the tandem
duplication events occurred after the large-scale duplication. An
additional smaller section near the top of chromosome 2 (200±
500 kb) shares 16 genes with a region on chromosome 1 (T10B6).
An even larger duplication (4.6 Mb) between chromosome 2 (6.7±
11.3 Mb, from clone F9O13 to F18A8) and chromosome 4 (from
clone F20O9 to the bottom end of the chromosome) was found. In
this region, 39% of the genes (430 out of 1,100) are duplicated
between the two chromosomes, including some singleton-tandem
duplication pairs, as described above. In addition, although this
region contains megabase-scale rearrangements (translocations or
inversions), within each rearranged segment gene order is still
preserved between the two chromosomes. A small part of this
duplication (45 kb) has been described recently21.
Comparative and evolutionary genomics
.............................................................................................................................................................................
4,037 genes are included in this analysis.
Comparison of the predicted proteins on chromosome 2 to other
A. thaliana proteins, to proteins from other plants, and to all
available complete genome sequences provides insight into the
evolution of plant genomes. More speci®cally, the timing of gene
duplications can be better understood. For example, as most of the
2,542 chromosome-2 proteins (83%) that have paralogues within
the A. thaliana genome are more similar to their paralogue than to
any protein in the available completed genomes, it is likely that these
gene duplications occurred after the separation of plants from the
animal and fungal lineages. This analysis also suggests that all plants
have a common set of genes for many functions. Even though
sequencing in other plants has focused primarily on particular types
of genes, more than 40% (1,656) of the proteins encoded on
chromosome 2 have a signi®cant match to a protein previously
Table 2 Compositional analysis of Arabidopsis chromosome 2 and comparison with other eukaryotic chromosomes and genomes
Feature
A. thaliana*
P. falciparum*
S. cerevisiae²
C. elegans²
Total: 19.6 Mb
Short arm: 3.6 Mb
Long arm: 16.0 Mb
Overall: 35.8%
Coding: 43.6%
Non-coding: 32.1%
0.95 Mb
13.4 Mb
97 Mb
19.7%
24.3%
13.3%
38%
40%
35%
36%
§
§
4,037
4.4 kb per gene
209
4.5 kb per gene
6,339
2.1 kb per gene
19,099
5.0 kb per gene
Exon statistics³
20,352 exons
4.6 exons per gene
295 bp per exon
355 exons
1.7 exons per gene
728 bp per exon
6,561 exons
1.04 exons per gene
§
§
5 exons per gene
§
Intron statistics
15,926 introns
3.6 introns per gene
178 bp per intron
144 introns
1.6 introns per gene
207 bp per intron
§
§
§
§
§
§
...................................................................................................................................................................................................................................................................................................................................................................
Overall length
Base composition (%GC)
Number of genes
Gene density
...................................................................................................................................................................................................................................................................................................................................................................
* Chromosome 2 only.
² Complete genome.
³ Includes intronless genes.
§ Not published.
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
© 1999 Macmillan Magazines Ltd
763
articles
identi®ed in other plant species. Of these, 48% (803) do not have a
match in any complete genome sequence. In addition, comparison
with other species indicates that many of the genes on chromosome
2 probably originated within the plant lineage. More than 65%
(2,745) of the predicted proteins do not show signi®cant similarity
to any other completed genome, suggesting that they are unique to
plants. Of the remaining proteins (1,293), the majority are most
similar to proteins from Saccharomyces cerevisiae (368 or 28%) or
Caenorhabditis elegans (633 or 49%). The number of proteins whose
best match is to one of these two proteomes is proportional to the
total number of proteins encoded in each genome, suggesting that A.
thaliana is roughly equally evolutionarily distant from animals and
fungi. The remaining 292 proteins (23%) are most similar to bacterial
and archaeal proteins, which may be due in part to transfers of genes
from organellar genomes to the nucleus (see below).
Lateral transfers of genes from organelles to the nucleus have been
well established22. In most cases, such genes were identi®ed because
their products were known to function in the organelle. The
a
b
80
70
60
50
40
30
20
30
sequence of chromosome 2 allows a reverse type of analysis,
identi®cation of nuclear genes that originated in the chloroplast
by their evolutionary ancestry. Genes most similar to cyanobacterial
genes relative to genes from other complete genomes were considered to have been acquired from the chloroplast genome. Using
this methodology, 135 putative chloroplast-derived genes were
identi®ed. Of these, 71 (52.5%) are predicted to have chloroplasttargeting peptides. The putative functions of these genes include
DNA repair, protein synthesis, regulatory functions, cell division,
photosynthesis, transport and oxidation±reduction. As these genes
are not found in any of the completed plant chloroplast genomes,
most of these gene transfers probably occurred soon after the
symbiosis with the chloroplast evolved. Many other genes are of
unknown function and will be worth pursuing as potential novel
chloroplast-targeted functions.
A mitochondrial genome in the nucleus
Completion of the chromosome has revealed a large and
Trinucleotide composition
variation
GC content (%)
20
Gene density
10
0
c
d
e
80
60
40
20
0
8
6
4
2
0
EST density
Tandem gene duplications
6
4
tRNA gene density
2
f
0
15
10
Transposon density
5
g
0
Intermediate repeats
0
5
10
15
20
Chromosome position (Mb)
Figure 4 Distribution of various features along chromosome 2. The two arms of the
chromosome are represented as a continuous sequence of DNA, beginning immediately
after the NOR (position 1) and continuing to the centromere (position 3,606,929), where it
is joined to the long arm (which terminates with the most telomeric base (position
19,647,005)) by a 120-bp spacer. The various features are plotted relative to the
nucleotide position on the chromosome. a, Chi-squared statistics of trinucleotide
frequency50 and GC composition in a 10-kb sliding window. b, Gene density per 100 kb;
764
c, EST density per 100 kb. d, Tandem repeats, showing repeat number as well as
location. e, tRNAs per 100 kb. f, Density of transposon-related sequences per 100 kb.
g, Position of pericentromeric repeat classes along chromosome 2. From top to bottom
they are 180 bp, Athila, 106B, 11B7RE, 163A, 164A, 278A, mi167, pAtT27, new repeats
1±4. The GC peak (45%) and the gaps in gene and transposon density from 3.2 to 3.5 Mb
reveal the site of the mitochondrial DNA insertion.
© 1999 Macmillan Magazines Ltd
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
articles
unexpected organellar-to-nuclear gene-transfer event. Within the
genetically de®ned centromere, there is a stretch of 270 kb of
sequence that is nearly identical to that of the Arabidopsis mitochondrial genome. The authenticity of this insertion in the Columbia ecotype was con®rmed by PCR ampli®cation across the
junctions of mitochondrial and unique nuclear DNA, followed by
the sequencing of the corresponding fragments. This insertion is
G
G
G
G
G
much larger than any of the previously reported organellar-nuclear
transfers23, and is 99% identical to the mitochondrial genome,
suggesting that the transfer event was very recent. As the published
sequence is derived from the C24 ecotype24, it is not possible
determine whether the observed sequence polymorphisms are due
to divergence of the insertion from the mitochondrial genome or to
differences among ecotypes. The organization of the mitochondrial
G
G
26349
30113
33877
37641
41405
45169
48933
52697
56461
60225
63989
67753
71517
75281
T
T
79045
T
T
82809
86573
P
105393
T
T
90337
T
112921
T
94101
P
109157
T
116685
T
T
97865
101829
T
120449
124213
127977
P
b
Chromosome 4 relative position
(Mb)
a
Chromosome 1 relative position
(Mb)
Figure 5 Organization of genes on BAC F16P2 showing the three tandem gene
duplications. The display from TIGR Annotator shows the exon±intron structure of the
annotated genes. The glutathione S-transferase and tropinone reductase genes are
labelled G and T, respectively. A smaller duplication of pumilio-like protein (P) is also
present.
1.4
1.2
1.0
0.8
0.6
13.4
13.6
13.8
14.0
Chromosome 2 coordinate (Mb)
17.0
16.0
15.0
14.0
13.0
12.0
7.0
8.0
9.0
10.0
11.0
Chromosome 2 coordinate (Mb)
Figure 6 Large chromosomal duplications. Regions of similarity were initially detected
using the MUMmer program after which sequences were aligned using the dds program
with criteria of at least 75% identity over 100 bp. Points in the dot plot represent the
coordinates of each such match. Values show positions (in bp) within each contig
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
assembled from all available sequence. a, Duplication between chromosomes 1 and 2.
b, Duplication between chromosomes 4 and 2. Arrows indicate the small, previously
described duplication21.
© 1999 Macmillan Magazines Ltd
765
articles
DNA in the nucleus differs from that of the published mitochondrial genome and its predicted alternate forms25; it corresponds to
another possible isoform with an internal deletion (Fig. 7). This
deletion may have occurred during or after transfer or may
represent an alternate form of the Columbia mitochondrial
genome.
The pericentromeric region
Until the completion of chromosome 2, it was unclear what features
would be discovered in the regions around the centromere (the
pericentromeric region). In addition to the detection of the mitochondrial genome insertion, an analysis of this region has revealed
the presence, distribution and diversity of speci®c repetitive elements on a single chromosome. Transposons were identi®ed by
searching for predicted amino-acid sequences that are similar to
characterized transposases, reverse-transcriptases, polyproteins or
other transposon-related coding sequences. Of the 563 transposon
proteins annotated, 50% are pseudogenes found either as fragments
of previously intact elements or as genes interrupted by stop codons,
frameshifts or other mutations. A breakdown of the classes of
transposons identi®ed is shown in Table 3. Representative members
of most of the large transposon families seen in other plants
(Mutator, En-Spm and Ac-Ds (haT/mariner)) are present. Retroelements including members of the LINE-like Ta11 elements, long
a
b
3'
1
B
A
D
A. thaliana
mitochondrial
genome
366,923 bp
3
1
3
Possible
insertion
point
B
3
1
A'
C 1 A' 3
Chr2 (kb)
d
B
D'
c
3
C
1
1
C
D'
3430
terminal repeat (LTR) elements in both the Ty3-gypsy and the Ty1copia family, and the Arabidopsis Athila family represent the largest
class of transposons. This accumulation of retroelements and other
transposons may be due to either preferential insertion into the
regions ¯anking the centromere or their elimination from the rest of
the genome at a higher rate26. In addition, representatives of
previously described intermediate repeats as well as some new
repeats were identi®ed (Fig. 4f), a distribution consistent with
previous hybridization data27±30.
As shown in Fig. 4, the number of recognizable genes decreases
close to the centromere, whereas the number of transposable
elements and various classes of intermediate repeats increases.
Nevertheless, within this region there are some identi®able genes:
a protein kinase (T25N22.14); a plasma membrane proton ATPase
(F9A16.7); an ABC transporter (T5E7.1); a DNA-replication licensing factor (T12J2.1), an RNA helicase (T12J2.7) and a 40S
ribosomal protein S16 (F7B19.13). The RNA helicase, protein
kinase and 40S ribosomal protein appear to be actively expressed
genes, as each is represented in the EST collection.
Conclusions
Arabidopsis has been considered to be not only a tractable species for
plant biology studies, but also a suitable model from which a
fundamental understanding of basic plant processes could be
applied to agriculturally relevant species. The analysis of the
complete sequence of chromosome 2 validates this concept because
many genes that have previously been characterized in other plants
have been found. As observed with other eukaryotic genome
projects, however, roughly half of the predicted genes have no
known function. This represents a large, untapped resource of
information that will require a combination of knockout
approaches, antisense technologies and global gene-expression
analyses to understand their functions.
Both gene duplication and large-scale chromosomal duplications
are observed, events that apparently occurred after the divergence of
plants from other lineages. These duplications are larger than any
reported in the C. elegans genome31, but less extensive than
the complete genome duplications proposed for maize32 and
S. cerevisiae33. As gene duplication is often accompanied by functional divergence34, the identi®cation of such duplications in the
plant lineage may help identify new plant functions. Many of the
duplications occurred in tandem, generating closely linked gene
families which may provide some of the allelic diversity that
breeders and agricultural researchers are seeking. Some new
families, such as the family of genes that may encode one step in
alkaloid biosynthesis, could provide a reservoir of molecular tools to
modify plant metabolism and productivity in a bene®cial manner.
Table 3 Representaton of transposable elements on chromosome 2
3330
Transposon class
3230
100
200
300
Mito (kb)
Figure 7 Proposed model for insertion of mitochondrial DNA into chromosome 2.
a, Diagram of the major form of the published A. thaliana C24 mitochondrial genome25.
The different colours correspond to the major segments of the genome: A (nucleotide
297,579±44,698); B (48,895±112,146); C (118,737±178,862); D (183,060±
290,989); and two major repeats, 1 (44,698±48,894, 178,863±183,059) and 3
(112,147±118,736, 297,579±290,990). A small repeat (repeat number 2) is not shown.
The orientation of one of the three repeats is inverted relative to the other. The arrows are
used to indicate orientation for comparison with the alternative form. b, Hypothetical
alternative form, generated by in silico rearrangements around some of the repeats.
Arrows and primes indicate inversions. c, Linearized form of genome shown in b if
inserted at the point between repeat 3 and segment D9. d, Dot plot of the linearized
alternate form against the region of chromosome 2 containing the mitochondrial DNA. The
gap in the diagonal indicates that there was a deletion of a large segment (see text).
766
Subclass
Family
Number of open
reading frames
.............................................................................................................................................................................
DNA elements
Retroelements
Non-LTR LINE-like
retrotransposons
LTR retrotransposons
Retroviral-like
Mutator
En-Spm/Tam1/Psl
Ac-Ds
mariner
hAT
73
71
12
2
1
Ta11-like
TSCL
152
1
Athila
Other
Ty1-copia-like (Ta1)
Ty3-gypsy-like (Tat1)
Replication protein
Helicase
75
141
Total
563
28
7
.............................................................................................................................................................................
.............................................................................................................................................................................
© 1999 Macmillan Magazines Ltd
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
articles
The completion of chromosomes 2 and 4 (ref. 35), which
accounts for 30% of the entire Arabidopsis genome, represents a
landmark achievement in eukaryotic genomics. The 16-Mb contig
on the lower arm of chromosome 2 is the largest sequence assembly
published to date. Sequencing within the genetically de®ned centromere has shown not only an abundance of degenerate transposable elements, but also a number of recognizable intact genes, some
of which appear to be expressed. As researchers move on to
sequence larger and more complex genomes, it remains to be seen
whether the level of genome closure achieved in this model plant
can be duplicated.
Note added in proof : Since this paper was accepted for publication, a
23Mb contig from human chromosome 22 has been published
(See Nature 402, 489±495 (1999).
M
Methods
Sequencing
Three libraries made from the Columbia ecotype of A. thaliana were used in sequencing
chromosome 2: the TAMU BAC library (clones pre®xed with T36); the IGF BAC library
(clones pre®xed with F37); and the Mitsui P1 library (clones pre®xed with M38). Sheared
BAC DNA (1.6±2 kb) was ligated to a modi®ed pUC19 vector and transformed into
Escherichia coli. Sequencing reactions were performed using either BigDye primers,
BigDye Terminators or Dichlorohodamine Terminators (P.E. Biosystems), and were run
on ABI 377 and ABI 3700 sequencers (P.E. Biosystems). Shotgun clones were sequenced to
generate 7±8-fold coverage of each BAC. In total, 590,165 reactions were carried out to
generate the sequence of the chromosome. In collaboration with Celera Genomics, 99,072
reactions were run on ABI 3700 sequencers, which accounted for nearly 20% of the overall
sequence for the chromosome. The discrepancy rate on these machines, as determined by
comparison of overlapping BACs, was comparable to that of ABI 377 machines (1
unresolved discrepancy per 32,000 bases). A base difference is possibly due to residual
heterozygosity or to different DNA sources used to make the various libraries. Individual
BACs were assembled from the shotgun sequences using the TIGR Assembler program39.
Following assembly into contigs, the Grouper program (TIGR, unpublished software) was
used to link individual assemblies within the BACs. The BACs were then closed using a
combination of BAC walking, directed PCR and resequencing of individual shotgun
clones. To con®rm the assembly of the entire BAC sequence, predicted and actual
restriction digest patterns were compared. The collinearity of the BACs with the
Arabidopsis chromosome 2 sequence was con®rmed by alignment of the sequence of
chromosome 2 with markers obtained from the Arabidopsis Recombinant Inbred map
(http://genome-www.stanford.edu/Arabidopsis/ ww/Nov98RImaps/index.html) and
with YAC end sequences. Several markers that clearly mapped to single genetic loci were
excluded from this analysis because they contained repetitive sequence and could not be
assigned a unique location on the sequence map.
Two BACs (T5M2 and T5E7) within the genetically de®ned centromere26 were found to
contain long stretches of sequence with very high similarity to the Arabidopsis C24
mitochondrial genome (69 kb on T5M2, and 74 kb on T5E7) joined directly to nuclear
sequence. Two overlapping BACs provided an additional 122 kb of purely mitochondrial
sequence to close the gap, suggesting that the total size of the organellar DNA insertion is
about 270 kb. Veri®cation of the mitochondrial insertion involved PCR ampli®cation of
the two ¯anking junction sequences from Arabidopsis Columbia genomic DNA. PCR
products with the expected size were ampli®ed that had sequence identical to the junction
BACs. Similar sized junction PCR products were obtained using additional BACs from
both the TAMU and IGF libraries. Determination of the exact size and sequence of the
insertion will be facilitated once the sequence of the Columbia mitochondrial genome is
known, and it becomes possible to distinguish between nuclear- and organellar-derived
mitochondrial sequences.
The sequence of chromosome 2 identi®ed by the BAC clone F11L15 did not contain the
telomeric repeat consensus sequence 59-TTTAGGG-39 (ref. 15). However, the sequence of
the lower telomere of chromosome 2 had been previously identi®ed in the clone pAtT51
(ref. 15) (http://genome-www.stanford.edu/Arabidopsis/ww/Nov98RImaps/index.html).
pAtT51 was sequenced to identify primer sites that could be used to walk inwards from the
telomere. Southern hybridization using probes derived from F11L15 and pAtT51
suggested that the gap between the BAC contig and pAtT51 was less than 1 kb. A 702-bp
PCR product that spanned this physical gap was ampli®ed from Arabidopsis Columbia
genomic DNA and sequenced.
Analysis
Annotation of chromosome 2 involved both DNA and protein database searches and gene
prediction programs. Gene predictions were made by Genscan40, Gene®nder (P. Green and
L. Hillier, unpublished data) and GRAIL41. Splice sites were identi®ed by these programs,
by NetPlantGene42 and by alignment with EST and protein sequences using the dds and
dps programs43. Predicted protein sequences were searched against a non-redundant
amino-acid database and against HMMs of the protein domains from pFAM3 (1,407) and
TIGR built HMMs (502) with the HMMer2 program17. tRNAs were identi®ed using
tRNAscan-SE44. Repeats were identi®ed using RepeatMasker (A. F. A. Smit and P. Green,
http://ftp.genome.washington.edu/RM/RepeatMasker.html) and a modi®ed version of
the suf®x tree algorithm used in the MUMmer system45. Output from the gene ®nding and
signal detection programs was displayed using the Annotator genome viewer (L. Zhou,
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
unpublished data) and manually edited. The accuracy of computational gene ®nders for
Arabidopsis is quite variable and all gene predictions that are not supported by independent evidence, such as protein or EST homology, should be regarded as tentative, pending
further research.
Putative membrane-spanning proteins were identi®ed using TopPred46. SignalP47 was
used to detect signal peptide sequences, while Predotar (N. Peeters et al., unpublished)
searched for organellar targets. In addition, ChloroP48 and Mitoprot49 were used to ®nd
proteins targeted for the chloroplast and mitochondria, respectively. The results presented
are the concordance of the predictions.
We analysed the proteome of chromosome 2 by comparing each predicted protein with
a database of all proteins from chromosome 2, all available completed genomes and all
proteins from A. thaliana and C. elegans using described methods50. To identify genes on
other chromosomes that were related to chromosome-2 genes, nucleotide searches were
also used as much of the nucleotide sequence in GenBank is unannotated. Each predicted
transcript was searched against all available genomic sequence using a cut-off of 75%
identity over 50% of the transcript length in order to take intron sequences into account.
To search for large-scale duplications within the genome, we assembled pseudomolecules representing all available sequence of chromosomes 1 and 4. The sequence of
chromosome 2 was then compared with chromosomes 1 and 4 using the MUMmer
program45 with a word size of 20. Long-range duplications were found by plotting the
results, and a local alignment program dds43 was used to identify matches between the
exact repeats within the duplicated regions (stringency .75%). Because of insuf®cient
sequence or overlap information, it is dif®cult at the present time to construct
chromosome-size pseudomolecules of chromosomes 3 and 5 and assess the possibility of
other large duplications.
Received 5 October; accepted 29 October 1999.
1. Schmidt, R. & Dean, C. Towards construction of an overlapping YAC library of the Arabidopsis
thaliana genome. Bioessays 15, 63±69 (1993).
2. Bevan, M. et al. Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis
thaliana. Nature 391, 485±488 (1998).
3. Bevan, M. et al. Objective: the complete sequence of a plant genome. Plant Cell 9; 476±478 (1997).
4. Copenhaver, G. P. & Pikaard, C. S. Two-dimensional RFLP analyses reveal megabase-sized clusters of
rRNA gene variants in Arabidopsis thaliana, suggesting local spreading of variants as the mode for gene
homogenization during concerted evolution. Plant J. 9, 273±282 (1996).
5. Copenhaver, G. P. & Picaard, C. S. RFLP and physical mapping with an rDNA-speci®c endonuclease
reveals that nucleolus organizer regions of Arabidopsis thaliana adjoin the telomeres on chromosomes
2 and 4. Plant J. 9, 259±272 (1996).
6. Schweizer, D., Loidl, J. & Hamilton, B. Heterochromatin and the phenomenon of chromosome
banding. Results Probl. Cell Differ. 14, 235±254 (1987).
7. Martinez-Zapater, J. M., Estelle, M. A. & Somerville, C. R. A highly repeated DNA sequence in
Arabidopsis thaliana. Mol. Gen. Genet. 204, 417±423 (1986).
8. Maluszynska, J. & Heslop-Harrison, J. S. Localization of tandemly repeated DNA sequences in
Arabidopsis thaliana. Plant J. 1, 159±166 (1991).
9. Simoens, C. R., Gielen, J., Van Montagu, M. & Inze, D. Characterization of highly repetitive sequences
of Arabidopsis thaliana. Nucleic Acids Res. 16, 6753±6766 (1988).
10. Pluta, A. F., Mackay, A. M., Ainsztein, A. M., Goldberg, I. G. & Earnshaw, W. C. The centromere: hub
of chromosomal activities. Science 270, 1591±1594 (1995).
11. Round, E. K., Flowers, S. K. & Richards, E. J. Arabidopsis thaliana centromere regions: genetic map
positions and repetitive DNA structure. Genome Res. 7, 1045±1053 (1997).
12. Zachgo, E. A. et al. A physical map of chromosome 2 of Arabidopsis thaliana. Genome Res. 6, 19±25
(1996).
13. Marra, M. A. et al. High throughput ®ngerprint analysis of large-insert clones. Genome Res. 7, 1072±
1084 (1997).
14. Copenhaver, G. P., Doelling, J. H., Gens, J. S. & Pikaard, C. S. Use of RFLPs larger than 100 kbp to map
the position and internal organization of the nucleolus organizer region on chromosome 2 in
Arabidopsis thaliana. Plant J. 7, 273±286 (1995).
15. Richards, E. J., Chao, S., Vongs, A. & Yang, J. Characterization of Arabidopsis thaliana telomeres
isolated in yeast. Nucleic Acids Res. 20, 4039±4046 (1992).
16. Charrasse, S. et al. Characterization of the cDNA and pattern of expression of a new gene overexpressed in human hepatomas and colonic tumors. Eur. J. Biochem. 234, 406±413 (1995).
17. Sonnhammer, E. L., Eddy, S. R., Birney, E., Bateman, A. & Durbin, R. Pfam: multiple sequence
alignments and HMM-pro®les of protein domains. Nucleic Acids Res. 26, 320±322 (1998).
18. Leete, E. Recent developments in the biosynthesis of the tropane alkaloids. Planta Med. 56, 339±352
(1990).
19. Yamad, Y. et al. in Secondary Products from Plant Tissue Culture (eds Charlwood, B. V. & Rhodes, M. J.
C.) 227±242 (Clarendon, Oxford, 1990).
20. Marrs, K. A. The functions and regulation of glutathione S-transferases in plants. Annu. Rev. Plant
Physiol. 47, 127±158 (1996).
21. Terryn, N. et al. Evidence for an ancient chromosomal duplication in Arabidopsis thaliana by
sequencing and analyzing a 400-kb contig at the APETALA2 locus on chromosome 4. FEBS Lett. 445,
237±245 (1999).
22. Martin, W. & Herrmann, R. G. Gene transfer from organelles to the nucleus: how much, what
happens, and why? Plant Physiol. 118, 9±17 (1998).
23. Blanchard, J. L. & Schmidt, G. W. Pervasive migration of organellar DNA to the nucleus in plants. J.
Mol. Evol. 41, 397±406 (1995).
24. Unseld, M., Marienfeld, J. R., Brandt, P. & Brennicke, A. The mitochondrial genome of Arabidopsis
thaliana contains 57 genes in 366,924 nucleotides. Nature Genet. 15, 57±61 (1997).
25. Klein, M. et al. Physical mapping of the mitochondrial genome of Arabidopsis thaliana by cosmid and
YAC clones. Plant J. 6, 447±455 (1994).
26. Copenhaver, G. P. et al. Genetic de®nition and sequence analysis of Arabidopsis centromeres. Science
(in the press).
27. Thompson, H. L., Schmidt, R. & Dean, C. Identi®cation and distribution of seven classes of middlerepetitive DNA in the Arabidopsis thaliana genome. Nucleic Acids Res. 24, 3017±3022 (1996).
© 1999 Macmillan Magazines Ltd
767
articles
28. Thompson, H., Schmidt, R., Brandes, A., Heslop-Harrison, J. S. & Dean, C. A novel repetitive
sequence associated with the centromeric regions of Arabidopsis thaliana chromosomes. Mol. Gen.
Genet. 253, 247±252 (1996).
29. Thompson, H. L., Schmidt, R. & Dean, C. Analysis of the occurrence and nature of repeated DNA in
an 850 kb region of Arabidopsis thaliana chromosome 4. Plant Mol. Biol. 32, 553±557 (1996).
30. Brandes, A., Thompson, H., Dean, C. & Heslop-Harrison, J. S. Multiple repetitive DNA sequences in
the paracentromeric regions of Arabidopsis thaliana L. Chromosome Res. 5, 238±246 (1997).
31. The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for
investigating biology. Science 282, 2012±2018 (1998).
32. Gale, M. D. & Devos, K. M. Comparative genetics in the grasses. Proc. Natl Acad. Sci. USA 95, 1971±
1974 (1997).
33. Wolfe, K. H. & Shields, D. C. Molecular evidence for an ancient duplication of the entire yeast genome.
Nature 387, 708±713 (1997).
34. Hughes, A. L. The evolution of functionally novel proteins after gene duplication. Proc. R. Soc. Lond. B
Biol. Sci. 256, 119±124 (1994).
35. The European Union Arabidopsis Genome Sequencing Consortium & The Cold Spring Harbor,
Washington University in St Louis and PE Biosystems Arabidopsis Sequencing Consortium. Sequence
and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769±777 (1999).
36. Choi, S., Creelman, R. A. Mullet, J. E. & Wing, R. A. Construction and characterization of bacterial
arti®cial chromosome library of Arabidopsis thaliana. Plant Mol. Biol. Rep. 13, 124±128 (1995).
37. Mozo, T., Fischer, S., Meier-Ewert, S., Lehrach, H. & Altmann, T. Use of the IGF BAC library for
physical mapping of the Arabidopsis thaliana genome. Plant J. 16, 377±384 (1998).
38. Liu, Y.-G., Mitsukawa, N., Vasquez-Tell, A. & Whittier, R. F. Generation of a high-quality P1 library of
Arabidopsis suitable for chromosome walking. Plant J. 7, 351±358 (1995).
39. Sutton, G. G., White, O., Adams, M. D. & Kerlavage, A. R. TIGR Assembler: a new tool for assembling
large shotgun sequencing projects. Genome 1, 9±19 (1995).
40. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol.
268, 78±94 (1997).
41. Uberbacher, E. C. & Mural, R. J. Locating protein-coding regions in human DNA sequences by a
multiple sensor-neural network approach. Proc. Natl Acad. Sci. USA 88, 11261±11265 (1991).
42. Hebsgaard, S. M. et al. Splice site prediction in Arabidopsis thaliana DNA by combining local and
global sequence information. Nucleic Acids Res. 24, 3439±3452 (1996).
43. Huang, X., Adams, M. D., Zhou, H. & Kerlavage, A. R. A tool for analyzing and annotating genomic
sequences. Genomics 46, 37±45 (1997).
44. Lowe, T. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in
genomic sequence. Nucleic Acids Res. 25, 955±964 (1997).
45. Delcher, A. L. et al. Alignment of whole genomes. Nucleic Acids Res. 27; 2369±2376 (1999).
46. Claros, M. G. & von Heijne, G. TopPred II: an improved software for membrane protein structure
predictions. Comput. Appl. Biosci. 10, 685±686 (1994).
47. Nielsen, H., Brunak, S. & von Heijne, G. Machine learning approaches for the prediction of signal
peptides and other protein sorting signals. Protein Eng. 12, 3±9 (1999).
48. Emanuelsson, O., Nielsen, H. & von Heijne, G. ChloroP, a neural network-based method for
predicting chloroplast transit peptides and their cleavage sites. Protein. Sci. 8, 978±984 (1999).
49. Claros, M. G. & Vincens, P. Computational method to predict mitochondrially imported proteins and
their targeting sequences. Eur. J. Biochem. 241, 779±786 (1996).
50. Nelson, K. E. et al. Evidence for lateral gene transfer between Archaea and bacteria from genome
sequence of Thermotoga maritima. Nature 399, 323±329 (1999).
Acknowledgements
We thank S. Jackson, J. Jiang, K. Meyer, E. Richards and S. Tabata for helpful contributions
in completing the chromosome. In addition, we would like to thank the TIGR faculty,
system administrators, sequencing facility, bioinformatics department and administrative
staff. This work was supported by the National Science Foundation, the Department of
Energy and the US Department of Agriculture.
Correspondence and requests for materials should be addressed to J. C. Venter
(e-mail: [email protected]).
Figure 3 Gene map of Arabidopsis thaliana chromosome 2. Predicted gene models are Q
shown with arrowheads indicating the direction of transcription. Genes are colour coded
according to broad role categories as shown in the key. The region highlighted in pink
indicates the genetically de®ned centromere within which the cross-hatched gene-free
region represents the mitochondrial insertion. The gap indicates the highly repetitive
centromeric region. Note that the NOR is not drawn to scale. Gene identi®cation numbers
follow the recently adopted nomenclature (At2gNNNNN), where At is Arabidopsis thaliana,
2 is chromosome number, g indicates that the feature is a gene and numbering is from top
to bottom of each chromosome in increments of 10.
768
© 1999 Macmillan Magazines Ltd
NATURE | VOL 402 | 16 DECEMBER 1999 | www.nature.com
22700
47690
46850
46090
41140
40350
40360
42620
45240
47720
46130
44470
47710
46860
47700
38930
41160
43490
46110
41150
47730
46140
42630
41170
40370
38940
46160
44490
41860
35050
46190
47790
45280
46900
47780
46920
45300
46210
46220
44520
42700
40430
16070
19610
24190
31170
29670
28850
38150
47810
45310
42710
41220
46930
46230
45320
44530
39760
39000
38160
37380
34330
32740
24880
35080
46950
44550
47820
43610
42730
40440
38170
36710
35940
39770
34340
33500
32750
29680
27630
26920
26280
30480
46240
46940
45330
44540
43600
42720
36700
35070
34320
33490
31950
31180
22780
23450
28860
27620
16830
10900
47830
46960
46250
47840
46260
47850
44580
47860
47880
46280
45400
44590
43660 43670
42790
38200
39060
46290
38220
39070
47890
46990
46300
44600
42800
39080
38240
43690
45410
47000
46310
47900
40500
38250
42820
47010
26360
25660
33580
44650
47940
47020
39130
46350
44670
39840
47960
25680
29840
26410
42850
47050
45440
42860
47980
36820
44720
44730
45460
47990
47070
38290
37500
48000
47080
46380
43790
42030
41370
40560
33640
35230
47090
46390
48010
36830
35240
34510
39180
45480
39190
45500
44750
48040
45510
46420
48050
35280
47160
44780
48070
46440
44790
43860
42940
38330
45540
36160
35330
34590
37590
47170
46450
44800
43870
40670
48090
47180
42960
42140
41430
39930
32940
46470
45550
43880
47190 47200
46480
45560
43890
15000
21450
18370
25740
25100
45570
48120
45580
43910
44820
45610
44830
45630
44840
47240
46520
43940
44860
47250
45640
44850
43950
36270
38460
44870
46530
47260
45650
39290
42270
Sequence gap
38470
45660
44890
38480
46550
47270
43080
47280
33040
47290
45680
46560
47300
37730
44040
43110
42300
44920
47310
39340
31540
30820
32330
34690
47330
46580
44930
43130
32340
30830
47340
45710
40810
38550
47350
46600
45720
46590
47360
46620
47370
47380
42360
45740
44090
37800
41580
38580
42350
34710
35600
30880
47390
45750
44950
44100
42370
41590
40830
31590
28580
40100
40110
45760
47400
46630
46640
47410
46650
45770
43170
43180
46660
45780
44980
44130
39410
45790
44140
7630
7020
17850
45800
45810
43220
45820
26700
44170
45830
43230
45010
42440
36400
47440
45840
44180
40910
25960
25300
47450
45850
44190
17180
18610
41670
45860
40160
45870
47460
30170
32450
35640
47470
45060
46700
45890
38720
47490
45900
45080
44240
38740
46710
45910
45100
44260
43310
43320
33260
45110
47510
45940
45130
47540
46740
45950
42530
41730
39570
47560
45140
43360
25410
45150
44340
43370
46760
47570
42560
32540
31780
31060
28700
24710
24020
23310
26110
27390
27400
25420
34890
38800
39590
45980
47580
40270
46770
45990 46000
47590
45180
30330
40290
45190
47610
46780
46020
40300
44420
47620
46790
46030
43420
44430
47630
25470
46800
45200
47650
46810
46050
Daniela
45220
34210
46830
46070
38920
40340
47680
46840
46080
43470
41820
44460
35820
47690
46850
46090
41830
41140
40350
37280
36610 36620
35810
38100
37270
42610
34220
30390
29570
28260
26850
34980 34990
32640
22690
21940
21110
18820
17370
15980
15970
15300
23360
30380
31870
29560
34970
43460
38910
26840
26190
25490
24760
24110
21930
7220
12030
24120
18060
16710
18810
21100
17360
13170
2280
3050
5070
14280
12630
27480 27490
35800
37260
38090
40330
46820
43450
41130
39700
36590
46060
32630
33380
34960
33370
31860
38900
27470
29550
28250
47660 47670
45210
43440
44450
4111041120
40320
39690
38890
37250
34940
35790
36580
37240
41810
39680
33360
32620
30370
24100
26830
25480
23350
22680
2270
5060
10280
10800
15960
21090
19500
1670
2260
20290
18050
18800
21920
21080
12620
17350
16700
26180
29540
27460
26170
29530
34200
31850
24090
24750
28780
28240
19490
21910
21070
15940
18040
17340
Puja
11410
5640
14270
13790
12020
15290
14790
18790
20280
18030
13160
12610
14260
13780
15930
15280
26820
38880
41100
47640
44440
46040
43430
42600
41800
37230
35780
34930
34190
32610
31840
30360
28760
28230
27450
26160
24080
23340
22670
38870
41090
39670
40310
38860
20260
21060
20270
17330
12600
14780
10790
10270
6760
2250
3910
4300
5050
6140
3040
2240
3900
1660
7780
7210
2230
12010
10780
19480
15270
11400
15920
18780
18020
16690
36570
38080
37220
13150
7200
3030
10260
6130
7190
7770
13770
14250
29510
33350
35770
32600
31830
11390
12590
26150
28220
30350
6750
4290
3890
10770
27440
26810
25460
21900
21050
20250
19470
18770
17320
41080
42590
41790
13760
16680
29500
11380
7760
12000
14770
18010
26140
39660
38850
41070
6120
15910
35760
34920
34180
32590
36560
42580
44410
43410
41780
41060
30340
31820
11990
2220
1650
5040
5630
6740
10760
24070
28750
28210
26800
25450
21890
19460
1640
3020
12580
29480 29490
35750
38070
39650
38840
36550
34170
33340
32580
44390 44400
47600
46010
45170
44380
43400
41770
40280
39640
38830
37210
34910
34160
36540
35740
41050
41750 41760
44360 44370
43390
39630
38060
31810
30320
17310
18760
27430
28740
7180
15260
13750
24740
26130
29470
25440
29460
28720
27420
26790
32560 32570
31080
34900
33330
30310
38820
37200
36530
35730
38050
41030 41040
39620
38810
45160
44350
43380
42570
41020
40260
38040
34150
32550
31800
31070
33320
37180 37190
36500
35720
31790
30300
27410
29450
6730
3880
5030
5620
4280
11360
11370
24060
23330
22660
20240
19450
15900
18000
3010
1630
3870
6110
14230 14240
13740
12570
11980
21040
24050
24730
23320
28200
25430
24720
20230
18750
10750
7750
11350
17990
11970
7170
2210
6720
1620
4270
5610
6100
16670
22650
21880
21030
24040
26120
28710
17980
14220
20220
18740
21020
11960
10740
6710
3000
3850 3860
13720 13730
13140
22640
20210
21870
21010
24030
22630
21860
21000
19440
17300
15250
15890
5600
12560
13710
1610
2990
3840
10250
7160
6700
6090
14210
18730
17970
16660
15880
20200
11950
14760
13130
1072010730
7740
6690
6080
2200
3830
5020
4260
1600
29410 29420 29430
29440
26780
26100
24010
28190
41010
40250
39580
41740
42550
21850
23300
38790
37170
34880
34140
45970
38780
41000
42540
45960
47550
46750
35710
38020
40240
37160
36490
34870
34130
19430
20990
6680
17290
16650
15240
5590
4250
2980
2190
6070
7730
11340
12550
10240
18720
33310
30290
29400
27380
17280
33300
31770
31050
32530
33290
25390
20980
28180
26080
29390
44310 44330
43340 43350
44300
42520
34860
24000
30280
31040
31760
19420
20190
24700
25380
20970
15230
18710
17960
11940
6060
1590
2970
3820
1580
5010
14750
13120
10710
7150
16640
6670
3810
2960
2180
5000
4240
15870
17270
18700
29380
30270
38010
40990
38770
44290
41720
39560
47520 47530
46730
45930
44280
45120
42510
40980
40230
39550
34850
33280
32520
31030
30260
35700
36480
37150
28690
29370
27370
25370
21840
28170
27360
26070
23990
22620
19410
16630
17950
15220
20960
15860
17260
11930
14200
13700
14740
12540
11330
6050
4990
11920
10700
7720
13690
12530
11910
10690
14730
34120
33270
38000
34840
37140
43330
41710
39540
38760
37990
35690
34110
28160
30250
7130
7140
3800
2170
6660
5580
10230
26770
25360
29360
28680
26060
31020
26760
21830
23290
23270 23280
23980
29350
30240
45920
46720
44270
42500
39530
27350
28670
28150
20180
1570
2950
16620
15210
12520
20950
17940
21810
21820
22610
24690
34100
36470
37130
40970
39520
38750
36460
35680
33250
15850
13680
16610
20170
7710
11890 11900
20930 20940
19400
17250
23970
32510
31750
34090
30230
15200
16600
6040
7120
11320
7110
6030
5570
2160
4980
3780
2940
14720
12510
11880
10680
23260
21800
25350
31010
27340
29340
20160
26050
23250
22600
20150
20920
15840
18690
3770
2930
1560
6650
14190
14710
17240
13110
40220
40960
47500
45090
44250
42490
40210
38730
37980
34830
36450
35670
32500
31000
29330
28660
28140
27330
26750
24680
6640
3760
2150
2920
4970
10220
3750
2910
4230
11310
23960
23240
21790
20910
37110 37120
36440
6630
5560
4960
13670
17230
15190
26040
34080
32490
31740
39510
40200
4328043290
41700
40950
43270
45070
44230
42480
37100
37970
27320
26740
25340
29320
33240
23230
22590
20140
19390
24670
26030
30220
36430
34820
39490 39500
37960
29310
28650
28130
26730
35660
34070
37090
36420
35650
40190
40940
25330
24660
32470 32480
31730
30990
30210
29300
20900
21780
18670
18680
15830
14700
6620
6020
7100
6610
17930
13660
11300
20130
18660
16590
22570
15820
26020
23950
26010
28120
34810
38710
44220
47480
45050
42470
41690
40180
38700
37950
34060
31720
43260
34050
37080
44210
30200
32460
34800
39480
46690
45880
40930
40170
37070
33230
28640
26720
25320
20890
11870
13100
14180
15180
17220
20120
22560
21770
20880
11290
12500
11860
10670
10210
1550
4220
2890
2140
4950
3740
1540
2880
7090
6600
2130
1530
7700
5550
13650
15170
18650
29270 2928029290
27310
30980
29260
19380
23220
23940
24650
16580
17210
3730
2870
2120
4940
13090
11850
11280
7080
6010
4210
1520
13640
20110
15810
26000
23930
23210
13630
14690
18640
22550
31700 31710
34040
36410
43250
45040
44200
42460
41680
39470
38690
37940
17200
17920
10200
11270
7070
6590
5540
11260
20870
26710
27300
25990
23920
23200
22540
30970
33220
30160
30960
37060
40150
39460
38680
34790
34030
33210
32440
31690
30950
30150
19370
20100
2110
4930
4200
21730 21740 21750
29250
28110
27290
25980
25310
15800
18630
16570
20860
19360
23910
11250
15160
7060
3720
7670
1510
13080
14680
17910
17190
15790
14170
20090
23190
24640
25970
28630
29240
28100
27280
23900
23180
22530
20850
11240
13620
18620
11840
5530
3710
2860
10660
7660
6000
4190
2100
10190
13070
12490
19340 19350
21720
45030
42450
40920
40140
39450
37930
35630
34780
34020
32430
17170
17900
31670 31680
30140
28090
27270
43240
37050
38670
37910 37920
34010
33200
30940
30130
19330
15150
16560
5520
6580
7050
2850
4920
14160
7690
14670
13610
15780
20840
1500
2090
13060
7650
18600
17160
24630
23890
5990
6570
14150
22520
21710
18590
17890
4180
4900 4910
14660
16540 16550
27260
25950
29230
11830
13600
15770
12480
21690
22510
41660
40130
39440
38660
7640
1931019320
20830
33190
31660
36390
34770
46670 46680
45000
44160
42430
41650
40900
37900
34760
39430
38650
37890
34000
33180
15140
17150
23880
28620
13050
14140
21680
25290
7680
10180
1490
2840
3680 3690
4890
10650
2080
4880
4170
3670
1480
19300
17880
19290
27250
25940
23170
32420
29220
35620
43210
47430
44990
39420
41640
25280
30930
34750
37880
44150
43200
42410
40890
37870
33990
36380
34740
33980
31650
30120
23870
24620
28610
28080
22500
13590
14650
18580
15760
7040
10640
13040
21660
20820
20080
19280
23160
20070
17870
17140
18570
5980
5510
4870
11230
15130
12470
16530
15750
17860
17130
25930
25270
27240
29210
32410
47420
20060
20810
21650
10630
5500
7030
6560
1470
14130
16520
19270
18560
13030
11220
16510
10170
17120
37040
38640
43190
41630
42400
37860
33170
31640
30920
28070
24610
19260
15120
40870 40880
40120
38630
40860
39400
41620
42390
34730
37850
33160
32400
37030
38620
37840
40850
39390
44970
44120
43160
40840
44960
44110
43150
37830
36370
34720
32390
23150
25260
31630
30910
30110
15740
17840
13580
14640
13020
14120
23850
23860
25920
23840
22490
21640
20800
17830
15110
11820
3650
5970
2830
1460
5490
4860
2070
10620
5480
13570
5960
7010
25250
28600
31600 31610 31620
33150
38610
39380
42380
41600 41610
25240
24600
20050
27230
30900
29200
28590
28060
25910
21630
19250
14630
16500
20790
24590
30100
35610
37820
19240
13560
14110
15100
5470
7620
6550
13010
11200
7000
2820
3640
4850
11210
3630
4160
2810
5460
17110
15730
18550
20040
26690
23140
21620
30890
37020
32380
38600
20030
20780
25230
22480
33130
37810
39370
37010
36360
33120
32370
38590
33880
33110
31580
30870
30090
29190
24580
26680
25220
29180
26670
25900
28570
23830
19230
20770
21610
13550
7610
6990
5950
3620
1450
4150
10160
3610
4140
5450
3600
2800
2060
4840
11190
16490
15720
12460
20020
17820
23130
21600
20760
18540
17100
15710
15090
23120
23820
37000
36350
37790
35590
40090
45730
44940
46610
29170
30860
39360
44080
25210
23110
20750
19220
21590
26660
30080
41570
38570
40820
42340
38560
37780
36990
33870
16480
18530
14620
13000
11810
11180
1440
3590
4830
6540
5440
4820
3580
10150
12450
12990
14610
2050
17800
13540
15700
31570
33100
32360
36340
37770
41560
43140
42330
40080
36330
35580
34700
33090
28560
29160
31560
30840
32350
31550
30070
11800
10610
12980
28050
27220
25890
6530
7600
20010
22470
24570
25200
28040
29150
25880
25190
2802028030
29140
23810
23100
21580
4130
2790
4810
10140
5940
16470
20730 20740
1430
2040
4120
6520
19210
17790
17090
16460
15080
26650
25870
22460
37760
44070
39350
38540
41550
40070
42320
19200
13530
12440
11790
14100
18520
20000
33080
36980
37750
15690
17080
5430
4800
10130
6980
6510
3570
10120
7590
5420
11170
12970
27210
3555035560 35570
36970
45700
47320
46570
44060
43120
41540
25860
24560
23800
28010
26640
33070
36320
38530
37740
35540
32320
31530
30810
40060
38520
40800
42310
44050
41530
40050
38510
36960
33860
33060
30800
30060
18510
22450
21560
23090
25850
17070
16450
15680
15070
6970
13520
14600
12430
14090
13510
2780
2030
4110
7580
6500
1420
11780
10110
19990
22440
24550
23080
26630
28550
27200
20720
21550
28000
24540
30050
36310
35530
22430
29130
32310
45690
40780 40790
40040
43100
44030
43090
41520
35520
34680
33050
19190
19980
20710
26620
15670
14590
16440
18500
23790
25180
23070
31520
25840
32300
30040
23780
24530
27990
27190
28540
33850
39330
37720
44910
26610
27180
30030
30790
44020
45670
44010
42290
40770
40030
39320
38500
35510
23770
20700
19180
17780
12960
13500
12420
14080
15060
14580
15660
17060
20690
21540
20680
12950
11160
5410
7570
3560
2020
4790
10600
2770
4780
4100
3550
10100
5930
5400
13490
12410
19970
17770
14570
19960
16430
18490
10090
10590
14070
13480
15650
11770
11150
10080
2760
3540
6490
5390
3530
2010
1380 1390 1400 1410
2750
7560
4770
5920
6480
25830
28530
25170
31510
33030
30780
33840
41510
44000
26600
24520
21520
2000
3520
2740
21530
15050
13470
19950
16420
15640
6960
10070
12940
11760
11140
10580
27980
36950
5380
10060
6470
1360 1370
3510
4760
2730
37710
39310
38490
44900
43990
33020
32290
31500
30020
25820
19170
35500
36300
34670
32280
43070
42280
40760
40020
39300
46540
43980
43060
40010
29120
35490
37700
36930 36940
36290
35470
33830
25810
27170
26590
28520
27970
33010
35480
30770
31480
22420
1350
1990
15630
17760
19940
23060
20670
31490
23050
21510
14060
14560
16410
17050
19160
15620
27960
25160
23760
22410
36920
37690
34660
43970
44880
41500
39280
43960
43050
42260
41490
38450
36910
37680
36260
33820
33000
31470
30010
29110
27160
30760
35440 35450 35460
33810
32990
32270
31460
30000
23040
19930
18480
17750
15040
6460
7550
10570
1340
2720
4090
10050
11750
11130
6950
14050
12930
11120
20660
21500
25800
Telomere
45620
43040
42250
40750
40000
39270
26580
27150
Mitochondrial region
48160
43030
43930
39990
38440
40740
41480
42240
35430
33800
31450
23030
24510
22400
21490
Nucleolar organizer region
46510
45600
43020
42230
41470
37670
36250
3541035420
32980
32260
31440
19920
17040
17740
18470
19150
15600
15610
12400
15030
20650
21480
25150
11110
11730
10040
10560
4750
3500
5370
1980
1330
2710
5910
4740
4080
5360
5900
11740
6450
2700
1320
10030
6940
5890
4730
4070
17730
22390
29100
25790
24500
23750
20640
27950
26570
30750
28510
18460
19140
23010
23740
25140
27140
27940
16400
15590
17030
22370
21470
24490
23730
22360
20630
19910
35390 35400
36240
16390
18450
17720
12920
14550
14040
13460
11720
7540
6440
4720
Genetic centromere
48150
47230
46500
45590
43920
38430
40730
42220
15580
30740
31430
37660
39260
36230
33790
32250
37650
36220
35380
32970
29990
11710
12910
12390
5350
3480
5880
2690
4710
1970
1300 1310
1960
11090 11100
5870
1290
13450
14540
11080
24480
29090
25130
27130
5860
6930
17020
22350
24470
4700
5340
14030
18440
25780
26560
30730
31410 31420
38420
19130
23000
25770
27930
36900
42210
39250
19900
24450 24460
26550
35370
34650
33780
30720
29080
17010
18420
17710
13440
11700
12900
14020
10550
7530
1280
2680
10020
3470
6430
4690
4060
3460
5850
22340
20620
24440
23720
21460
25120
35360
39980
40720
42200
43010
48130 48140
47220
tRNA
snRNA
5.8S RNA
18S RNA
26S RNA
43900
37640
32960
31400
29070
38410
41460
36890
36210
32240
30710
28500
27920
27120
25760
26540
22990
22330
19120
19890
18410
15570
15020
12880 12890
11690
11070
13430
17000
3450
2670
1950
10540
10010
4680
1270
14010
18390 18400
15560
17700
16380
23710
39240
39970
38400
31390
42190
39230
39960
43000
42180
47210
46490
44810
34640
35350
36200
37630
41450
40710
38390
37620
36880
34630
32950
31380
30700
32230
31370
29980
29060
27910
25750
25110
24430
16990
15010
11680
14530
12870
12380
13420
6420
5330
3440
2660
4050
11060
10000
5320
11670
7520
5840
3430
1260
2650
4040
14000
19880
26530
27900
24420
22980
19110
18380
17690
16370
15550
14520
6920
5310
4670
3410 3420
1940
1250
2640
6410
11050
12370
15540
26520
42990
42170
39950
40700
38380
48110
42980
42160
39220
37610
36190
33770
32210
32220
34620
33760
29050
28490
27890
27110
29970
30690
32200
35340
40690
41440
42150
40680
48100
46460
37600
39940
38370
36870
23700
5300
12860
13410
22320
17680
20610
5830
3390 3400
9970
99809990
11660
24400 24410
28480
22970
34610
32190
30680
36170 36180
34600
33750
32180
31360
29040
27880
27100
24390
13990
16980
18360
22310
19870
21440
26510
20600
23690
14990
16350 16360
15530
13400
12850
11650
7510
3380
1240
1930
11040
4660
9960
6400
5290
3370
1920
1230
12360
17670
22300
17660
21430
25090
33740
32170
31350
30670
29960
16340
19860
29020
12350
14510
19100
22960
28470
26500
3834038350 38360
41420
40660
39210
42130
48080
42950
42120
40650
39920
36130
35320
34580
37580
25080
17650
24370
21420
11030
12840
7500
5280
2630
1220
4650
3360
16970
15520
13390
11640
18340 18350
19850
27090
29950
32930
33730
36120
35310
28460
29010
7490
9950
13980
17640
26490
22950
27870
27080
15510
5270
4640
5820
6910
5260
1210
3350
4630
12340
11020
9940
20590
Transport and binding proteins
Conserved hypothetical
Other categories
Unclassified
Unknown
48060
46430
44770
42930
33720
34570
40640
41410
42100 42110
43850
42920
41400
40630
36110
37570
32160
31340
32920
35290 35300
34560
31330
27860
26480
23680
22290
19090
24360
25070
25730
28450
29940
29000
27840
25060
24350
23670
22270
21410
1200
1910
2620
16320 16330
17630
18330
16310
17620
19840
19080
15500
16960
12830
14500
10530
6390
5810
11630
9930
14480
14490
13970
14980
22940
22260
39910
36850
37560
39900
43840
42910
45520 45530
43830
42090
41390
39890
38320
39200
35270
37550
33710
16300
15490
13380
28440
32910
32150
30660
28430
27830
26470
23660
22930
11010
10520
1190
3340
6380
5250
7480
6370
12820
7470
20580
19070
36090 36100
28420
34550
31320
32900
40620
42900
42080
44760
43820
42890
41380
47130 47140 47150
46410
43810
42070
37540
36840
33700
25050
27070
29930
32130 32140
30650
29920
22250
21400
19830
23640
28990
14970
13960
16290
15480
17610
19060
27820
26460
28410
25720
11620
14470
12810
13950
24340
20570
18320
16950
11610
6360
9920
2610
3330
4610 4620
5240
4600
7460
Protein fate/protein synthesis
Nucleosides, and nucleotides
Regulatory functions/signal transduction
Secondary metabolism
Transcription
48030
45490
47120
38310
32890
33690
34540
37530
31310
32120
29910
36080
35260
40610
39880
42060
40600
43800
42050
48020
47110
46400
35250
36070
37520
33680
32880
30640
29900
27810
25710
23630
21390
22920
17600
15470
19050
19820
27060
26450
13940
1180
1900
4590
33103320
6350
2590
12800
9910
14460
18300
22240
28980
25040
22910
27800
34520 34530
32100
31300
30630
28400
27790
26440
25700
25030
24330
22230
19810
17590
16280
15460
19040
12790
12330
10510
6340
5800
3300
1170
2580
14960
14450
9900
6900
5230
7450
5790
12780
18290
20560
23620
22210
19030
27050
33670
32870
38300
42040
45470
44740
42880
30620
40580 40590
39870
37510
36060
14950
15450
1150
4580
3290
2570
13370
13930
17580
29890
28970
27780
32090
31290
26430
24310 24320
20550
19020
23610
22900
21380
20540
16270
6890
6330
1890
3280
18270 18280
17570
16940
16260
15440
14440
13920
11600
19010
32080
29880
28390
40570
32860
32070
34500
32060
31280
30610
29870
26420
25690
25020
23600
22890
24300
10500
11000
22200
20530
5220
5780
7440
9890
1140
4570
3270
DNA metabolism
Energy metabolism
Environmental response/pathogen response
Fatty acid and phospholipid metabolism
Growth and development
46370
44710
47060
45450
43780
42870
40550
2287022880
28960
374703748037490
36050
38280
42010
41360
43760 43770
41350
40540
39850
33630
32050
30600
29860
27040
19000
20520
17560
18260
2560
6320
5210
12770
16930
16250
1880
4560
10490
6310
1130
18250
13360
24290
21370
22860
10990
11590
18990
13910
20510
35210
34490
33620
32850
32040
29850
39150 39160 39170
38270
30590
21350
18240
24280
28380
35200
34480
20500
23590
14940
16230
16920
17550
22190
32840
32030
35190
44700
43750
47970
28370
27770
27030
28950
34470
36810
39140
44690
47040
25010
26400
23580
22850
18980
13350
11580
10980
14430
12760
7430
3260
1860
1870
4550
2550
5770
1120
10480
9880
1850
12320
7420
5200
15430
21340
18230
23570
32020
36040
41340
42000
46360
45430
44680
43740
41990
41330
40530
42840
38260
47030
47950
46340
43720
43730
41320
40520
41980
34450 34460
33610
35170 35180
37460
36030
34440
32830
31270
29830
25000
25670
26390
27020
19800
10470
9870
6880
13900
16910
20480 20490
1110
2540
3250
4030
11560 11570
10970
13340
12310
6300
5760
4540
1100
Amino acid biosynthesis
Cofactors, prosthetic groups, and carriers
Cellular processes
Cellular structure, organization and biogenesis
Central intermediary metabolism
44660
43710
35160
37450
36800
36020
41310
42830
41970
39830
39110
33590
22840
23560
29820
30580
33600
32010
29810
28940
28360
27010
26380
24990
32820
28930
24270
22180
21330
18970
18220
17540
16220
14420
13330
12750
2530
1090
11550
7410
5190
10460
23550
22170
20470
19790
17530
21320
27760
26370
34430
31260
29800
34420
39120
36010
36790
37440
27000
24980
23540
22830
22160
18960
21310
1080
4530
16210
12300
11540
6870
6290
13320
15420
14930
16200
16900
18200 18210
2774027750
32810
29790
27730
26990
24970
34410
45420
46330
44640
43700
40510
39820
39100
46320
44630
35150
36780
36000
34400
33570
23530
11530
5180
2520
1070
13880 13890
13310
12740
12290
19780
14410
1840
3240
4520
4020
1060
2510
9860
3230
7400
12280
6280
5170
22150
22820
24260
31250
30570
32000
31240
30560
29780
32800
41960
27720
16190
19770
17520
15410
13870
2500
3220
4510
11520
10960
10450
12270
26350
24960
22810
20460
47910 4792047930
44620
41300
39810
36770
35990
35140
34390
30550
29770
6270
6860
12730
14920
28920
27710
26980
28910
28350
33560
42810
41950
40490
39090
22140
21300
4010
4500
3210
13300
16890
19760
18190
26340
25650
27700
28900
29760
36760
23520
12260
14400
1050
2490
4000
7390
3200
16180
10950
24250
24950
34380
44610
41290
40480
43680
31990
38230
36750
35980
34370
33550
41270 41280
41940
41260
35970
35130
34360
31980
30540
29750
30530
25640
26330
26970
27690
23510
5750
19750
20450
19740
21290
19730
24240
22130
15400
16880
1616016170
17510
18180
20440
21280
19720
24940
25630
36740
38210
37430
35960
35120
36730
35950
39050
47870
46980
46270
45380
42780
43650
45360 45370
42770
40470
38190
37420
33540
31230
29740
28890
21270
20430
18950
19710
23500
26960
32790
30520
31970
31220
30510
29730
24930
26320
28340
27680
25620
26310
22800
23490
20420
21260
18170
1040
3990
10440
6850
6260
3190
1830
7380
12250
13290
12240
13860
12720
14390
2480
3180
11510
3980
3170
10940
16150
15390
13280
16870
12710
17500
16140
14910
14380
5160
10430
12230
10420
9850
4490
3160
1820
5740
5150
12700
12220
11500
10410
3150
19700
18940
16860
21250
23480
41930
41250
19690
20410
24230
27670
29720
39800
39040
41920
40460
35110
34350
24920
25610
26950
28880
15380
16130
19680
21240
11490
9840
7370
14900
12210
10930
4480
5140
3970
2470
5730
10400
16850
10920
4470
24220
28330
32780
33530
41240
4363043640
42760
20400
23470
29710
39030
17490
14890
12200
11480
19670
30500
37410
41910
46970
45350
44570
43620
42750
39790
39020
37400
38180
33520
32770
31210
36720
41230
37390
35100
33510
32760
31960
30490
31200
29700
6840
27660
26300
24910
28320
26940
25600
24900
23460
27650
29690
28870
24210
1892018930
19660
22790
22120
24890
27640
45340
44560
42740
41900
40450
39780
39010
35090
31190
26930
26290
25590
22100
21230
18160
15370
10910
16120
16840
7360
6250
5130
2460
10380 10390
5720
12690
20380 20390
19650
18910
15360
11470
12190
14370
1810
2450
4460
3960
10360 10370
7340
7350
6240
3140
13850
14880
20370
18150
17480
5710
5120
4450
1800
7330
12680
13270
12180
19630 19640
18900
22090
25580
24860 24870
24200
22080
25570
26270
28310
27610
26910
22770
22070
20360
14870
16100
16110
16820
21220
16090
15350
14860
16810
12170
13840
10890
10350
7320
4440
1790
6830
3130
6230
4430
9830
6220
18140
19620
1780
3120
2440
12160
13260
14850
16800
16080
17470
21210
20350
21200
24850
11460
12150
18890
16790
35930
32730
30470
43590
38140
37370
39750
43580
33480
36690
38990
38130
25560
27600
24840
24180
22060
21190
31940
34310
29660
41890
44510
40420
47800
43570
46200
44500
46910
45290
38980
35920
39740
42680 42690
41210
43540
4355043560
42670
38970
38120
35060
26260
28840
19600
20340
17460
13830
12670
16060
13250
16780
18130
16050
16770
23440
31930
32720
12660
14350 14360
13240
7310
5700
4420
3110
2430
1770
10870 10880
10340
9820
6210
4410
3950
2410 2420
1760
12120 12130 12140
7300
4400
2400
5110
10330
7290
3940
1750
2390
10860
27590
31160
33470
41880
34300
37360
40410
35910
41200
39730
31920
32710
3345033460
34290
31910
30440
30460
27570 27580
26900
25550
24830
23430
22760
18880
22050
19590
18120
17450
24820
24170
22040
29650
27550
24810
23420
31150
29640
41870
40400
43530
46180
47770
45270
46890
47760
46170
46880
42660
34280
22030
21180
15340
16040
14340
12110
7280
11450
10850
16760
14330
13820
13210
13220 13230
20330
19580
22750
24800
35900
36680
41190
40390
39720
31140
16750
18110
17440
26250
12100
5690
6200
6820
4390
3100
2380
1740
4380
9810
7270
12650
16020 16030
15330
14840
37330 37340 37350
43510 43520
38960
37320
32700
34270
30430
29630
28830
28300
13200
14320
10320
6190
5680
5100
4370
1730
2370
10840
6810
21170
19570
33440
25540
24160
27540
35890
34260
35040
42650
45260
43500
41180
26890
26240
24790
28820
36670
37310
40380
35880
38950
41850
45250
46870
46150
47750
32690
34250
26230
31900
31130
29620
33430
37300
39710
25530
28810
27530
24150
23410
22020
21160
20320
18870
18100
13810
13190
10310
7260
6180
4360
2360
9800
17430
16740
19560
21150
24780
23400
22740
22010
19550
15320
18090
5090
6800
14310
12090
11440
10830
7250
6170
1720
2350
3090
4350
2340
1710
5670
16010
17420
18860
24770
36660
42640
44480
41840
36650
35860 35870
34240
35030
32680
31120
30420
33420
30410
37290
35020
32670
31890
36640
35850
33410
34230
35010
30400
3109031100 31110
29610
27520
26220
25520
28290
26880
24140
23390
22730
24130
22720
6790
12080
14830
16730
19540
20310
22000
27510
28800
17410
18850
21140
19530
17400
14820
1700
4340
2330
13180
12070
5660
3080
14300
12060
11430
6780
4330
2198021990
18840
26870
26210
23380
21970
25510
31880
32660
46100
45230
28280
21960
19520
20300
1690
2320
10300
7240
5650
4320
2310
9790
18080
14810
14290
12050
11420
17390
18830
21130
35840
41830
43480
16000
16720
29590 29600
38110
37280
36630
33400
3582035830
35000
34220
33390
32650
31870
30390
29580
12640
14800
28790
26200
28270
27500
25500
22710
21950
23370
26860
28260
36620
18820
19510
21120
24120
23360
29570
17380
18070
21940
20290
17370
6160
5080
4310
10820
1030
1680
3070
2300
10290
3930
1020
12040
13800
15310
10810
15980 15990
15300
12030
6770
2290
1010
6150
7230
3060
3920
10280
7220
2280
© 1999 Macmillan Magazines Ltd
© Copyright 2026 Paperzz