Alu Monomer Revisited: Recent Generation of Alu Monomers

Kenji K. Kojima*,1,2

1 Department of Biological Sciences, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-Ku, Yokohama, Kanagawa, Japan
2 Genetic Information Research Institute, Mountain View, California
*Corresponding author: E-mail: [email protected].
Associate editor: Norihiro Okada

Abstract

Alu is a predominant short interspersed element (SINE) family in the human genome and consists of two monomer units connected by an A-rich linker. At present, dimeric Alu elements are active in humans, but Alu monomers are present as fossilized sequences. A comparative genome analysis of human and chimpanzee genomes revealed eight recent insertions of Alu monomers. One of them was a retroposed product of another Alu monomer with 3′ transduction. Further analysis of 1,404 loci of the Alu monomer in the human genome revealed that some Alu monomers were recently generated by recombination between the internal and 3′ A-rich tracts inside of dimeric Alu elements. The data show that Alu monomers were generated by 1) retroposition of other Alu monomers and 2) recombination between two A-rich tracts.

Key words: Alu, monomer, dimer, retroposition, recombination, evolution.

© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Mol. Biol. Evol. 28(1):13–15. 2011 doi:10.1093/molbev/msq218 Advance Access publication August 16, 2010

Letter

Alu is the most abundant repetitive sequence, accounting for ~11% of the human genome (Lander et al. 2001). A typical Alu sequence is a dimer consisting of two 7SL RNA-derived monomer units connected by an A-rich linker (Ullu and Tschudi 1984). Dimeric Alus are primarily classified into three lineages (J, S, and Y) (Britten et al. 1988; Jurka and Smith 1988; Batzer et al. 1996). AluJ elements are inactive in humans, whereas some AluS and AluY elements actively retropose in the human genome (Bennett et al. 2008). Dimeric Alu elements are considered to have been originally generated by a fusion between a free left Alu monomer (FLAM)-C and a free right Alu monomer (Quentin 1992a). Two other types of Alu monomers, fossil Alu monomer (FAM) and FLAM-A (or free left Alu), are likely the precursors of these monomeric Alu elements (Jurka and Zuckerkandl 1991; Quentin 1992b). FLAM-A is nearly identical to the rodent short interspersed element (SINE) family PB1 (Quentin 1994), and SINEs derived from FLAM-A/PB1 have been identified in the genomes of rodents, tree shrews, and primates (Nishihara et al. 2002; Kriegs et al. 2007). This means that FLAM-A/PB1 originated in the common ancestor of Supraprimates, which includes primates, tree shrews, flying lemurs, rodents, and lagomorphs.

No Alu monomers have been reported to be active in the human genome with the exception of BC200 (Kuryshev et al. 2001). BC200 is a small non–protein coding RNA gene derived from a FLAM-like monomer insertion shared by New World monkeys, Old World monkeys, and apes. More than 200 retroposed copies of BC200 are found in the human genome.

On analyzing human-specific and chimpanzee-specific insertions (Kojima and Okada 2009; Kojima 2010), six human-specific and seven chimpanzee-specific Alu monomer insertions were found (supplementary material S1, Supplementary Material online). Among them, five chimpanzee-specific insertions were generated by the retroposition of BC200 RNA (Kuryshev et al. 2001). Of the remaining eight Alu monomer insertions, five were similar to the left monomer of the AluY lineage; two to the left monomer of the AluS lineage; and one to the left monomer of the AluJ lineage.

h14_44393 was an AluJ-like monomer on chromosome 13, and the sequence most similar to h14_44393 was also a monomer (PM_h14_44393) located on chromosome 19 (fig. 1). The two human Alu monomers differed from each other at only two nucleotides. Alu monomers orthologous to PM_h14_44393 were found in the genomes of the chimpanzee, orangutan, gibbon, and baboon. There is no insertion at the orthologous locus in the mouse lemur genome. The comparison of uninserted and inserted loci revealed that PM_h14_44393 was inserted with 12-bp target site duplications (TSDs) as a monomer after the split of Strepsirrhini (lemurs, lorises, and bush babies) and before the split between apes and Old World monkeys. Notably, h14_44393 contained the 3′ flanking TSD of PM_h14_44393, clearly showing that PM_h14_44393 is the parental element of h14_44393. Distinct TSDs and flanking sequences of h14_44393 and PM_h14_44393 show that h14_44393 was not generated by segmental duplication of PM_h14_44393. This indicates that h14_44393 was generated by retroposition of PM_h14_44393 with 3′ flanking sequence transduction. It is noteworthy that retrocopies of BC200 RNA also showed 3′ transduction downstream of their polyA tracts (supplementary material S1, Supplementary Material online).

FIG. 1 Alignment of h14_44393, PM_h14_44393, and Alu consensus sequences. TSDs are in lowercase. Substitutions shared by PM_h14_44393 and h14_44393 are shaded except CpG sites.

Next, all Alu monomers were surveyed in the human genome. There were 1,404 Alu monomers, and 304 of them were more similar to dimeric Alu than to ancient Alu monomers (supplementary materials S2 and S3, Supplementary Material online). Orthologous loci from eight primate species (chimpanzee, orangutan, gibbon, macaque, marmoset, tarsier, mouse lemur, and bush baby) were searched for all Alu monomers. No Alu monomers shared by the human genome and the genomes of nonprimates were found (supplementary material S2, Supplementary Material online).
One FAM insertion was not present at the orthologous locus of the tree shrew genome (supplementary material S2, Supplementary Material online). This shows that FAM was active after the branching of primates from tree shrews. Three uninserted empty loci for human FLAM-A insertions were identified from the tarsier genome (supplementary material S2, Supplementary Material online); thus, FLAM-A was still active after the branching of tarsiers. Only one FLAM-C insertion was traced back to the common ancestor of all primates. Recently generated FLAM-C copies arose by BC200 retroposition and by recombination between polyA tracts of existing AluJ elements, as described below.

Several orthologous loci for human Alu monomers in some other primates were occupied by dimeric Alu elements. If at least two successive sister groups have dimeric Alu insertions instead of the Alu monomer, the dimeric Alu is likely to have been replaced by the Alu monomer on the lineage leading to humans. Four loci with such a distribution were found (table 1). In all four cases, TSDs were conserved. In the case of human_chr16_2851, the human and chimpanzee genomes had AluY-like monomers (fig. 2). The genomes of the orangutan, gibbon, macaque, and baboon had AluY-like dimers instead. High similarity between the Alu monomers and the left monomers of dimeric Alu indicates that the Alu monomer was derived from the deletion of the right Alu monomer through recombination between the internal and 3′ A-rich tracts. The other three loci also showed high similarity between the Alu monomer and the left monomer of dimeric Alu (supplementary material S4, Supplementary Material online). The FLAM-C monomer, human_chr1_4450, was likely derived from an AluJ insertion. The internal A-rich tract was extended in all dimeric Alus at orthologous loci of Alu monomers. It is likely that this extended polyA tract recombined with the polyA tail to generate the Alu monomer.
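The sister-group criterion stated above can be written down as a small check. The following is a minimal sketch, not the procedure used in the study (loci were examined manually): the function name, the ordering of lineages from closest to most distant relative of human, and the choice to skip loci marked ND are all assumptions made for illustration. The states are those listed in table 1.

```python
# Non-human lineages ordered from the closest to the most distant relative
# of human (abbreviations as in table 1). This ordering is an assumption.
OUTGROUPS = ["C", "O", "G", "Mac", "Mar"]

# Rows of table 1: M, monomer; D, dimer; ND, not determined; U, uninserted.
TABLE1 = {
    "human_chr1_4450":  {"H": "M", "C": "M", "O": "M", "G": "M",  "Mac": "D", "Mar": "D"},
    "human_chr6_4284":  {"H": "M", "C": "M", "O": "D", "G": "ND", "Mac": "D", "Mar": "D"},
    "human_chr14_1978": {"H": "M", "C": "M", "O": "M", "G": "M",  "Mac": "D", "Mar": "D"},
    "human_chr16_2851": {"H": "M", "C": "M", "O": "D", "G": "D",  "Mac": "D", "Mar": "U"},
}


def dimer_replaced_by_monomer(states):
    """Return True if the two nearest informative sister groups outside the
    monomer-bearing lineages both carry a dimer at the orthologous locus."""
    calls = [states.get(species, "ND") for species in OUTGROUPS]
    # Assumption: undetermined loci carry no information and are skipped.
    calls = [call for call in calls if call != "ND"]
    # Skip the lineages that share the monomer with human.
    i = 0
    while i < len(calls) and calls[i] == "M":
        i += 1
    return calls[i:i + 2] == ["D", "D"]


for locus, states in TABLE1.items():
    print(locus, dimer_replaced_by_monomer(states))
```

Under these assumptions all four loci of table 1 satisfy the criterion, whereas a monomer whose nearest sister group is uninserted would be read as a fresh monomer insertion.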
This study indicated that at least two mechanisms contribute to the generation of recent Alu monomers (fig. 3). The first mechanism is represented by h14_44393 (fig. 1). This monomer was generated by retroposition of another monomer, that is, PM_h14_44393. The second mechanism is represented by human_chr16_2851 (fig. 2). Dimeric Alu insertion and subsequent recombinational deletion of the right Alu monomer resulted in the human_chr16_2851 monomer. Recombination is likely enhanced by the extension of A-rich tracts.

Methods

Genome sequences were downloaded from the University of California Santa Cruz Genome Browser (http://hgdownload.cse.ucsc.edu/downloads.html), the Washington University Genome Sequencing Center (http://genome.wustl.edu/), the Broad Institute Mammalian Genome Project Web site (http://www.broad.mit.edu/mammals/), and the Trace Archive database at the National Center for Biotechnology Information (NCBI) Web site (http://www.ncbi.nlm.nih.gov/). Human-specific and chimpanzee-specific Alu insertions were collected as described in previous reports (Kojima and Okada 2009; Kojima 2010). 3′-truncated Alu insertions were manually analyzed. All Alu insertions in the human genome were characterized with the aid of RepeatMasker (http://www.repeatmasker.org/) and the RepBase library (http://www.girinst.org/repbase/). Alu insertions that were longer than 100 bp, flanked by 13–21-bp TSDs, and ended between positions 111 and 149 were selected. Alu subfamily classification was based on RepeatMasker results. To characterize the distribution of each human Alu monomer, NCBI Blast (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/) was used, and the genomes of eight primate species (chimpanzee, orangutan, gibbon, macaque, marmoset, tarsier, mouse lemur, and bush baby) were examined to determine manually whether Alu is present or absent at the corresponding loci.

Table 1. Distribution of Alu Monomers and Dimers at Orthologous Loci.
                              Insertion at Orthologous Loci
Alu Monomer        Family     H    C    O    G    Mac   Mar
human_chr1_4450    FLAM-C     M    M    M    M    D     D
human_chr6_4284    AluS       M    M    D    ND   D     D
human_chr14_1978   AluS       M    M    M    M    D     D
human_chr16_2851   AluY       M    M    D    D    D     U

NOTE.—H, human; C, chimpanzee; O, orangutan; G, gibbon; Mac, macaque; Mar, marmoset; M, monomer; D, dimer; ND, not determined; U, uninserted.

FIG. 2 Alignment of orthologous loci of human_chr16_2851 locus in primates. TSDs are shaded.

Supplementary Material

Supplementary materials S1–S4 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The author thanks Dr. Norihiro Okada for useful suggestions. This work was supported by grants from the Ministry of Education, Culture, Sports, Science, and Technology. K.K.K. was the recipient of a Grant-in-Aid from the Japan Society for the Promotion of Science for Young Scientists.

References

Batzer MA, Deininger PL, Hellmann-Blumberg U, Jurka J, Labuda D, Rubin CM, Schmid CW, Zietkiewicz E, Zuckerkandl E. 1996. Standardized nomenclature for Alu repeats. J Mol Evol. 42:3–6.
Bennett EA, Keller H, Mills RE, Schmidt S, Moran JV, Weichenrieder O, Devine SE. 2008. Active Alu retrotransposons in the human genome. Genome Res. 18:1875–1883.
Britten RJ, Baron WF, Stout DB, Davidson EH. 1988. Sources and evolution of human Alu repeated sequences. Proc Natl Acad Sci U S A. 85:4770–4774.
Jurka J, Smith T. 1988. A fundamental division in the Alu family of repeated sequences. Proc Natl Acad Sci U S A. 85:4775–4778.
Jurka J, Zuckerkandl E. 1991. Free left arms as precursor molecules in the evolution of Alu sequences. J Mol Evol. 33:49–56.
Kojima KK. 2010. Different integration site structures between L1 protein-mediated retrotransposition in cis and retrotransposition in trans. Mobile DNA. 1:17.
Kojima KK, Okada N. 2009. mRNA retrotransposition coupled with 5′ inversion as a possible source of new genes. Mol Biol Evol.
26:1405–1420.
Kriegs JO, Churakov G, Jurka J, Brosius J, Schmitz J. 2007. Evolutionary history of 7SL RNA-derived SINEs in Supraprimates. Trends Genet. 23:158–161.
Kuryshev VY, Skryabin BV, Kremerskothen J, Jurka J, Brosius J. 2001. Birth of a gene: locus of neuronal BC200 snmRNA in three prosimians and human BC200 pseudogenes as archives of change in the Anthropoidea lineage. J Mol Biol. 309:1049–1066.
Lander ES, Linton LM, Birren B, et al. (100 co-authors). 2001. Initial sequencing and analysis of the human genome. Nature 409:860–921.
Nishihara H, Terai Y, Okada N. 2002. Characterization of novel Alu- and tRNA-related SINEs from the tree shrew and evolutionary implications of their origins. Mol Biol Evol. 19:1964–1972.
Quentin Y. 1992a. Fusion of a free left Alu monomer and a free right Alu monomer at the origin of the Alu family in the primate genomes. Nucleic Acids Res. 20:487–493.
Quentin Y. 1992b. Origin of the Alu family: a family of Alu-like monomers gave birth to the left and the right arms of the Alu elements. Nucleic Acids Res. 20:3397–3401.
Quentin Y. 1994. A master sequence related to a free left Alu monomer (FLAM) at the origin of the B1 family in rodent genomes. Nucleic Acids Res. 22:2222–2227.
Ullu E, Tschudi C. 1984. Alu sequences are processed 7SL RNA genes. Nature 312:171–172.

FIG. 3 Mechanisms for recent generation of Alu monomers. (A) Alu monomer is transcribed with its 3′ flanking sequence, and the Alu RNA is retrotransposed into another locus. (B) The polyA tract at the middle of dimeric Alu and the polyA tail are extended, and the latter half of dimeric Alu is deleted through the recombination between these two polyA tracts.
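As a purely illustrative toy model of the outcome described in panel B, the sketch below removes everything between the internal A-rich tract and the end of the polyA tail of a dimer. It is not a method from this study: the function name, the minimum run length of six A residues, and the short placeholder sequences are hypothetical, and a real recombination event could leave an A-rich tract of a different length.

```python
import re


def collapse_between_a_tracts(dimer, min_run=6):
    """Keep the sequence up to the end of the first A run and the sequence
    after the last A run; the right monomer and the tail are dropped."""
    runs = list(re.finditer("A{%d,}" % min_run, dimer))
    if len(runs) < 2:
        # Fewer than two A-rich tracts: nothing to recombine in this model.
        return dimer
    internal, tail = runs[0], runs[-1]
    return dimer[:internal.end()] + dimer[tail.end():]


# Placeholder sequences (hypothetical, not real Alu sequence).
left, right, flank = "GCTGCCGT", "TCGGCTCGTT", "gtca"
dimer = left + "A" * 8 + right + "A" * 12 + flank
monomer = collapse_between_a_tracts(dimer)
print(monomer)
```

In this model the product keeps the left monomer, one A-rich tract, and the original 3′ flank, which matches the observation that TSDs were conserved at the four loci in table 1.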