Chromosomes of parasitic protist Trichomonas vaginalis Zuzana Zubáčová Ph.D. thesis Thesis supervisor: Prof. RNDr. Jan Tachezy, Ph.D. Prague 2011 Department of Parasitology Faculty of Science, Charles University 1 Výsledky prezentované v doktorské práci Mgr. Zuzany Zubáčové jsou společným dílem pracovníků Laboratoře biochemické a molekulární parazitologie a spolupracujících tímů. Prohlašuji, že autorka se na jejich získání zasloužila významným dílem. Data presented in this thesis resulted from team collaboration at the Laboratory of biochemical and molecular parasitology and during projects with our partners. I declare that the author contributed substantially to obtaining of these results. 2 THERE´S A LOT GOIN´ ON… Poďakovanie: Ďakujem Honzovi Tachezymu za trpezlivé vedenie mojej práce a kontrolu textu dizertácie. Za krásne roky spolupráce a pomoc ďakujem kolegom a kamarátom z laborky. Ďakujem rodičom a Oliverovi za podporu v kritických chvíľach. Ďakujem Karlovi za poskytnuté zázemie v závere práce. 3 Chromosomes of Trichomonas vaginalis Contents Abstract 6 1. Overview 7 1. 1. Introduction 7 1. 2. Nuclear genomes and chromosomes of parasitic protists 8 1. 2. 1. Genomes of Apicomplexa (CHROMALVEOLATA) 9 Plasmodium falciparum 9 Toxoplasma gondii 11 Cryptosporidium 11 1. 2. 2. Genomes of Microsporidia (OPISTHOKONTA) 12 1. 2. 3. Genome of Entamoeba histolytica (AMOEBOZOA) 13 1. 2. 4. Genomes of Kinetoplastida (EXCAVATA) 13 Trypanosoma brucei 13 Leishmania major 14 1. 2. 5. Genome of Giardia intestinalis (Metamonada, EXCAVATA) 15 1. 2. 6. Genomes of Trichomonas vaginalis and related species (Parabasala, EXCAVATA) 16 2. Mitosis in parasitic protists 18 2. 1. Mitosis in Apicomplexa 20 2. 2. Mitosis in Kinetoplastida 22 2. 3. Mitosis in Microsporidia 24 2. 4. Mitosis in Entamoeba histolytica 24 2. 5. Mitosis in Giardia intestinalis 26 2. 6. Mitosis in Trichomonas vaginalis 27 References 29 3. Aims 39 4. Publications 40 Draftgenome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Comparative analysis of trichomonad genome sizes and karyotypes. Microsatelite polymorphism in the sexually transmitted human pathogen Trichomonas vaginalis indicates a genetically diverse parasite. 4 Fluorescence in situ hybridization (FISH) mapping of single copy genes on Trichomonas vaginalis chromosomes. 5. Unpublished data 41 Histone H3 variants in Trichomonas vaginalis 6. Thesis summary 41 42 5 Abstract Genome sequencing and associated proteome projects has revolutionized research in the field of molecular parasitology. Progress in sequencing of parasite as well as free-living species enables comparative and phylogenetic studies and provides important data sets for understanding of basic biology and identification of new drug targets. The genome sequencing of Trichomonas vaginalis revealed surprisingly large genome size of this organism (~160 Mb) that resulted from expansion of various repetitive elements, specific gene families and large scale genome duplication. I studied genome sizes of other nine selected species from Trichomonadea group to find whether other trichomonads have undergone similar genome expansion. The measurement of nuclear DNA content by flow cytometry revealed relatively large DNA content in all tested species although within a broad range (86-177 Mb). The largest genomes were observed in the Trichomonas and Tritrichomonas genera while Tetratrichomonas contains the smallest genome. The genome sizes correlated with the cell volume however no relationship between genome size and pathogenicity, the site of the infection or trichomonad phagocytic ability was observed. The published information about chromosomes of trichomonads was insufficient or contradictory. To elucidate the number and appearance of chromosomes I investigated karytypes of T. vaginalis and related trichomonads using conventional cytogenetic techniques and fluorescence in situ hybridization (FISH). Karyotype analysis of selected trichomonad species showed that all trichomonads have a single set of chromosomes that condense during mitotic metaphase. The number of chromosomes in trichomonads ranges from four to six. FISH analysis using probes against SSU and LSU rRNA genes showed that all 250 rDNA units are localized to a single chromosome in one secondary constriction. I also adapted FISH for T. vaginalis that is sensitive enough to detect single copy genes on metaphase chromosomes. Sensitivity of conventional FISH was increased using tyramide signal amplification. Our FISH protocol provides amenable tool for physical mapping of the T. vaginalis genome and other essential applications such as development of genetic markers for T. vaginalis genotyping with application in evolutionary studies, comparative genomics and clinical praxis. Similar to the other eukaryotes, successful mitosis is essential for reproduction of T. vaginalis, however many aspects of this process are not fully understood. In my thesis, I summarized current understanding of mitosis of T. vaginalis including my preliminary results of study of trichomonad centromeres. I studied chromatin localization of T. vaginalis histone H3 variants by transfecting trichomonads with their tagged versions. Among 23 of T. vaginalis histones H3, I identified the homolog of cenH3 variant, which is the universal marker of eukaryotic centromeres. 6 1. Overview 1.1. Introduction With the beginning of the 21st century started the genomic era by deciphering of human genome (Venter, 2001). It opened thousands new doors for basic as well as clinical research. Knowledge of whole genomes fundamentally improved our understanding of biology of pathogenic organisms including parasitic protist Trichomonas vaginalis. T. vaginalis is a parasitic member of Parabasala group that belongs to eukaryotic supergroup Excavata. It is an anaerobic organism, which invades the urogenital tract and causes the most frequent non-viral sexual transmitted disease in humans called trichomoniasis. Despite its medical importance, T. vaginalis was for a long time rather understudied organism (Carlton et al. 2007). Its popularity as a subject of basic research started to grow after the release of data from genome sequencing project in 2007. The summary of the results is published in Carlton et al. (Carlton et al. 2007). Analyses of sequence data and subsequent proteomic studies revealed new biochemical pathways in trichomonads, provided information for development of new genetic markers for population studies that have potential to explain the mechanism of drug resistance, allowed identification of new virulence markers and many more. The most surprising result of Trichomonas genome project was the genome size. Although the predicted genome size was about 25 Mb (Wang and Wang 1985), the genome sequence analysis revealed the size of about 160Mbp coding for about 60 000 genes. Thus, Trichomonas vaginalis appeared to be a parasitic protist with the largest genome and higher gene coding capacity than human. Observed large genome size resulted from large scale genome duplication events, accumulation of various repetitive elements and expansion of specific gene families (Carlton et al. 2007). T. vaginalis is a haploid, clonal protist that reproduces by mitosis. Ultrastructural studies revealed that during mitosis of T. vaginalisthe nuclear envelope remains intact. The mitotic spindle is formed extranuclearly with kinetochoral microtubules that interact with kinetochores of chromosomes via nuclear envelope barrier (Brugerolle, 1975). This mitosis variant is named cryptopleuromitosois (Raikov, 1994) and it was observed exclusivelly in parabasalids. Although meiotic division has not been described in Trichomonas, figures of chromosomes typical for meiotic division were rarely observed on chromosomal preparations by Drmota and Kral (Drmota and Kral, 1997). In addition, eight of nine meiosis specific genes were found in its genome suggesting that T. vaginalis is eighter currently sexual or just recently asexual organism (Malik et al., 2008). During mitosis, T. vaginalis genome and also genomes of other trichomonads condense into tiny well-structured haploid chromosomes. The 7 chromosome number is stable within trichomonad species and vary from four to six between tested species (Zubacova et al., 2008). According to Drmota and Kral (Drmota and Kral, 1997), who described constrictions on T. vaginalis chromosomes as centromeres, T. vaginalis has monocentric rather than holocentric chromosomes. However, mechanism of chromosome segregation into next cell generation has not been studied in T. vaginalis. The structure of T. vaginalis kinetochores, the mode of interaction between kinetochores and mitotic microtubules, and the role of nuclear envelope in segregation of chromosomes is still unclear. 1. 2. Nuclear genomes and chromosomes of parasitic protists A term genome means total hereditary information of an organism. Nuclear genomes are compacted into cell structures called chromosomes. Chromosomes consist of DNA associated with proteins, first of all histones, that enable folding and packing of long DNA molecule into nucleus. Complex of DNA and associated proteins is called chromatin (the Greek chroma means color and reflects staining properties of chromatin). In chromatin, DNA is wound around complex of core histones consisting of histones H2A, H2B, H3 and H4 and its specialized variants. Histone H1 binds linker DNA between nucleosomes. Histones were most likely inherited from archaebacterial ancestors at early stage of eukaryotic evolution (Talbert and Henikoff, 2010). Although histones belong among the most conserved proteins, histones of parasitic protists show remarkable divergence from homologs of higher eukaryotes what rise the question how these differences influence chromatin structure and condensation. Significant sequence deviations were observed for example in Trypanosoma brucei (Bender et al, 1992), Trypanosoma cruzi (Bontempi et al., 1994), Entamoeba histolytica (Födinger et al., 1993), Entamoeba invadens (Sanchez et al., 1994), Leishmania infantum (Soto et al., 1994), T. vaginalis (Marinets et al, 1996) and Giardia intestinalis (Wu et al. 2000). During the cell cycle chromosomes usually pass through cycles of condensation (strongly packed and displaying distinct morphology) and decondensation (chromatin is relaxed in the nucleus with no obvious morphology). A species or an individual specific set of chromosomes with characteristic number and appearance is called karyotype. In parasitic protists number of chromosomes varies from five in Tritrichomonas foetus to hundred of minichromosomes in Trypanosoma brucei (Alberts et al., 2002; Drmota and Kral, 1997; Ersfeld, 2011). However, in some parasites such as trypanosomatids (Gull et al., 1998; Ravel et al., 1998; Dubessay et al., 2004) and Entamoeba histolytica (Willhoeft and Tannich, 1999), chromosomes do not condense in any state of their life cycle. Chromosomes can not be visualised using standard cytogenetic staining also in Plasmodium (Coppel and Black, 2005). 8 Number of genome copies that is usually related to number of chromosomes in an organism determines its ploidy. Parasitic protists are often haploid not only regarding genome copies but also chromosome segments. For example, majority of Trypanosoma brucei genome is diploid but regions of the genome containing VSG genes are haploid. These regions lacks homologous chromosomal segments (Melville et al., 2000; Nuismer and Otto, 2004 ). The evolutionary advantage of haploidy in parasites has been proposed in regard to successful host infection as alleles connected with successful invasion of a host tend to be recessive. Haploid group of parasites includes important pathogens of plants such as Plasmodiophora brassicae and Spongospora subterranean and human protist pathogens such as Plasmodium falciparum, Toxoplasma gondii and Trichomonas vaginalis (Nuismer and Otto, 2004). Data from parasite´s genome projects present enormous amount of information in regard to basic research of their biology and uncovering of new strategies for development of antiparasital drugs (Coppel and Black, 2005). Access to genomic datasets of individual genome projects of pathogenic protists is currently enabled via public databases including National Center for Biotechnology Information (NCBI).Genomes of human pathogens are integrated into database EuPathDB (ttp://eupathdb.org/eupathdb/, (Aurrecoechea et al., 2005)) with elaborated system of bioinformatic tools. This database serves as an entry point for study of individual genes and genomes of Plasmodium (PlasmoDB), Toxoplasma (ToxoDB), Cryptosporidium (CryptoDB), Trypanosoma and Leishmania (TriTrypDB), Entamoeba (AmoebaDB), Encephalitozoon and Enterocytozoon (MicrosporidiaDB), Giardia (GiardiaDB) and Trichomonas (TrichDB). 1.2.1. Genomes of Apicomplexa (CHROMALVEOLATA) Plasmodium falciparum P. falciparum causes deadly human disease malaria that in 2009 took 781 000 lives. In Africa, one in every five children deaths is due to malaria (http://www.who.int/topics/malaria/en/). The genome project had the major ambition to accelerate the research in direction of new strategies for elimination of malaria using new chemotherapeutic or vaccination. The size of nuclear genome of malaria parasite is ~23 Mb (Gardner et al., 2002). It consists of haploid set of 14 chromosomes that do not condense. Conventional cytogenetic spreading techniques for chromosomes imaging do not lead to visualization of Plasmodium 9 chromosomes (Gerald et al, 2011; Kelly et al., 2006; Prensier and Slomianny, 1986). The chromosomes have diffuse appearance when stained for fluorescence microscopy and it is difficult to observe these chromosomes using transmission electron microscopy. However, chromosomes of P. falciparum can be resolved by pulse field gel electrophoresis (PFGE). This technique is widely used in parasites for determining the number and size of chromosomes (electrophoretic karyotype) (Coppel and Black, 2005; Kelly et al. 2006; Prensier and Slomianny 1986). P. falciparum chromosomes considerably vary in their length due to variations in subtelomeric regions. Natural variation in chromosome length between Plasmodium isolates results from meiotic recombination of parasites in the mosquito vector. Chromosome size variation can be observed also in laboratory cultures. However, it this case the variation is caused by chromosome breakage and reparation events that are independent on meiotic recombination (Coppel and Black, 2005;Gardner et al. 2002). Genome of P. falciparum is the most AT – rich genome sequenced to date. The overall A/T content is 80, 6%. The most AT-rich regions are present in introns, intergenic regions and at the site of putative centromeres. In the genome sequence, 5286 protein coding genes were predicted, from which 54% contain intrones. No transposones or retrotransposones were detected (Gardner et al., 2002). Plasmodium genome is characteristic with unusual organization of rDNA units. Ribosomal RNA coding genes and their arrangement are generally one of the most extensively studied part of the genome (Prokopowich et al. 2003). In most eukaryotes rDNA units are organized as a tandemly repeated arrays of 18S – 5,8S – 28S rRNA genes that are clustered in nucleolar organizer region (NOR) on one or more chromosomes. The numbert of rDNA copies usually positively correlates with the genome sizes of eukaryotic genome (Prokopowich et al. 2003). However, units of ribosomal RNA genes in Plasmodium are not clustered but dispersed throughout different chromosomes and their copy number is unusually low: 4-8 per genome (Waters, 1994). Furthermore, Plasmodium rDNA units are not identical in their sequences and their expression is regulated developmentally when different set of rRNA is expressed at different stages of the parasite life cycle. It has been proposed that Plasmodium is able to alter the rate of the gene expression by changing the properties of ribosomes resulting in change of cell development (Gardner et al. 2002). Genes responsible for virulence, adhesion molecules and antigenic variation are present in multiple copies and are typically localized in subtelomeric regions of chromosomes. Deletions in subtelomeric regions alter expression of adhesive molecules 10 leading to the lost of cytoadherence of infected erythrocytes to endhotelial cells (Day et al. 1993; Gardner et al. 2002). Toxoplasma gondii T. gondii is an important human and veterinary pathogen that causes toxoplasmosis. It is estimated that about one-third of the human population is chronically infected that make T. gondii one of the most prevalent parasites in humans (Laliberté and Carruthers, 2008). Most infections of T. gondii are asymptomatic. However, it may propagate in immunocompromised patiens such as individuals with Acquired Immune Deficiency Syndrome that may lead to damage of various organs and even death. In pregnant women, T. gondii could be transplacentally transmitted to the foetus that results in development of serious defects (Dubey et al., 1998). Toxoplasma can be easily propagated in laboratory in human foreskin fibroblasts (Roos et al., 1994) and methods for reverse genetics are available (Brooks et al. 2011). Thus Toxoplasma has became a model apicomplexan organism. The B7 strain that is commonly associated with AIDS patiens was used for analysis of Toxoplasma genome. The genome size of T. gondii is 80 Mb and contains 8000 protein coding genes (Coppel and Black, 2005; Kissinger et al., 2003; Xia et al., 2008). Initial studies of karyotype using PFGE revealed 11 chromosomes in Toxoplasma (Kissinger et al. 2003; Sibley and Boothroyd, 1992), however, more recent observation showed that T. gondii genome consist of 14 chromosomes (Coppel and Black, 2005). In comparison to genes of Plasmodium or Cryptosporidium, T. gondii genes are richer in introns. Genes coding secretion proteins of apical complex organelles are located typically in subtelomeric regions of chromosomes (Coppel and Black, 2005). rRNA genes that comprise of 110 units per haploid genome are organized conventionally as tandem repeats (Guay et al. 1992). Cryptosporidium C. parvum and C. hominis are two cryptosporidial species that cause infection of humans. They differ in the host range, genotype and pathogenicity. C. hominis is a human specific parasite, while C. parvum infects also other mammals (Xu et al., 2004). Cryptosporidiosis belongs to a group of opportunistic infections pathogens. In immunocompromised patients it may cause serious chronic diarrhea resulting even in death (O´Connor et al., 2011). No long-term in vitro cultures of Crysptosporidium are available. 11 However, large amount of Cryptosporidium cysts for experiments can be obtained from the feces of infected calves and then be in vitro excysted (LaGier et al., 2003). The genomes of both human species C. parvum and C. hominis were sequenced however only subtle differences were found. The 9 Mb genome of Cryptosporidium consists of eight chromosomes that were examined by molecular karyotyping using PFGE (Abrahamsen et al., 2004; Blunt et al., 1997; Xu et al., 2004). Similarly to microsporidia, genomes of Cryptosporidium are highly compact , however forces that affected their genome sizes were different (Keeling, 2004). There are two basic mechanisms leading to genome reduction: (i) shortening of intergenic regions, and (ii) elimination of genes from the genome (Keeling, 2004). Genome of Cryptosporidium was affected predominantly by the first mechanism. It is characteristic by reduction of intergenic regions and introns as well as genes are reduced in their length. In the genomes of C. parvum and C. hominis, 3807 and 3994 genes were predicted, respectively, that is less than in Plasmodium or Toxoplasma. In contrast to other apicomplexan, subtelomeric gene clusters encoding variant surface proteins were not detected in Cryptosporidium (Coppel and Black, 2005). Genes coding rRNA are organized according to the plasmodial paradigm. They are present in five copies and they are not clustered in the genome. The rDNA units are dispersed throughout the genome at least on three chromosomes. Like in Plasmodium, rDNA units accumulated sequence heterogeneities propably because of dispersion in the genome (Le Blancq et al., 1997). 1.2.2. Genomes of Microsporidia (OPISTHOKONTA) Genomes of microsporidia range from 19,5 Mb in Glugea atherinae to 2.9 Mb in Encephalitozoon cuniculi and 2,3 Mb in Encephalitozoon intestinalis that is the smallest known eukaryotic genome (Keeling, 2004). Intestinal pathogen E. cuniculi is an obligate intracellular parasite that infects broad spectrum of mammals including humans. It is an important opportunistic pathogen in HIV infected patients. E. cuniculi genome possess only 1997 protein coding genes and according to molecular karyotype constructed by PFGE it is organized as a diploid set of 11 chromosomes (Biderre et al., 1995; Katinka et al., 2001). In the process of genome reduction, microsporidia underwent equally gene elimination, and compaction that included the shortening of intergenic regions and genes themselves including rDNA units. Encephalitozoon displays the highest gene density among eukaryotes and 80% genes is shorter than their homologs in yeast (Biderre et al., 1995; Corradi et al., 2007; 12 Katinka et al., 2001; Keeling, 2004). With the exception of subtelomeric repeats at the ends of chromosomes, genome of E. cuniculi is poor in repetitive elements that comprise large size of genomes in most of eukaryotes and no transposones were detected in E. cuniculi genome. Protein coding genes are localized in the GC rich core region of chromosomes that are flanked by subtelomeric regions containing one rDNA unit (22 rDNA units in total) and repeats of mini- and microsatellites (Biderre et al., 1995; Katinka et al., 2001). 1.2.3. Genome of Entamoeba histolytica (AMOEBOZOA) Pathogenic amoeba E. histolytica causes amoebiasis that is significant cause of morbidity and mortality in the Third World countries (Stanley, 2003). Genome of E. histolytica (strain HM-1: IMSS) is 23 Mb large and 9938 genes coding proteins were predicted. The genome is rich in introns that are present in 25% of genes (Loftus et al., 2005) Chromosomes of E. histolytica do not condense and they are extremely fragile. Together with the existence of various circular DNA molecules in the nucleus such as plasmids bearing rRNA genes (200 copies per cell) (Willhoeft and Tannich, 1999) the estimation of chromosome number is complicated and organism ploidy is still uncertain. Different number of chromosomes was observed using different experimental methods. Acridine orange staining revealed only six chromosomes (Gomez-Conde et al., 1998). Separation by PFGE identified 31-35 chromosomes and suggested that E. histolytica is tetraploid (Willhoeft and Tannich, 1999). The most recent study of E. histolytica chromosomes by transmission electron microscopy suggests presence of linear 28-35 long chromosomes that corresponds to PFGE analysis (Chavez-Munguia et al., 2006). Sizes of corresponding chromosomes are variable in different isolated that is probably the result of differences in the length of subtelomeric repeats, particularly of t-RNA arrays (Loftus et al., 2005). 1.2.4. Genomes of Kinetoplastida (EXCAVATA) Trypanosoma brucei Subspecies of T. brucei are causative agents of African sleeping sickness in humans. The majority of cases are caused by T. b. gambiense leading to chronic infection in Western 13 and Central Africa. T. b. rhodesiense is causative agent of acute sleeping sickness in Eastern and Southern Africa (Malvy, 2011). In the 26 Mb genome of T. brucei, 9068 genes were predicted. The genome is characteristic by expansion of subtelomeric gene families (20% of the genome) involved in antigenic variation of the parasite (Berriman et al., 2005). There are three classes of chromosomes classified according to their size as minichromosomes (~100, 30 – 150 kb, probably haploid), intermediate (1-5 chromosomes, 100 – 900 kb, probably haploid) and megabase chromosomes (11 diploid chromosomes, 1 – 6 Mb). In kinetoplastids, intermediate - and minichromosomes were found only in T. brucei where they comprise of about 20% of total genome size. Megabase chromosomes contain housekeeping genes and expression sites for metacyclic and bloodstream variants of surface glycoprotein (VSG) genes. Variability in size observed among homologous chromosomes is in T. brucei caused by length variation of VSG expression sites, telomeres and repetitive regions rich in transposable elements. Intermediate chromosomes contain VSG expression sites and are thought to originate by breakage of megabase chromosomes. Minichromosomes constitute the reservoir of VSG genes. They also vary in size due to dynamics of their telomeres (Ersfeld, 2011; Ersfeld et al, 1999). Another specific feature of T. brucei genome is the presence of subtelomeric genes named retrotransposone hot spots that are clusters of intact genes and pseudogenes with yet unknown function (Berriman et al. 2005; Coppel and Black, 2005). Leishmania major L. major causes cutaneous leishmaniasis. L. major Friedlin strain selected for genome project is a user-friendly laboratory strain and model for study of immunological aspects of leishmaniasis (Ravel et al., 1998). L. major genome size is ~34Mb, consists of 36 chromosomes analysed by PFGE and contains 8000 genes organized in polycistronic clusters as it is typical for Kinetoplastida (Ivens et al., 2005; Ravel et al. 1998). The genome is diploid or, more precisely, partially aneuploid for one or few chromosomes (Sunkin et al., 2000). Typically for trypanosomatids, Leishmania chromosomes do not condense in any stage of the cell division. Chromosomes of Leishmania can not be identified by their size because of high size polymorphism between isolates and species due to different length of subtelomeric repetitive sequences. Size polymorphism of 14 Leishmania homologous chromosomes is however not the result of recombination events like in Plasmodium (Ravel et al, 1998; Sunkin et al., 2000). 1.2.5. Genome of Giardia intestinalis (Metamonada, EXCAVATA) An intestinal parasite G. intestinalis causes diarrheal disease, giardiasis. Two major genotypes assemblage A and B infect humans worldwide (Lalle, 2010). The genome of Giardia is structurally compact, but to the less extend then in Cryptosporidium or Encephalitozoon (Morrison et al., 2007). Genome size determined for Giardia assemblages A and B is about 11,7Mb and contain 4787 and 4691 protein - coding genes, respectively, with minimum of introns and transposable elements (Franzén et al., 2009; Morrison et al., 2007). Giardia has two nuclei that differ in DNA content and also chromosome number (Tumova et al., 2007). Separation of giardial chromosomes by PFGE revealed five chromosomes per haploid genome (Le Blancq and Adam, 1998). Chromosomes of G. intestinalis are dynamic in the mean of their frequent rearrangment resulting in size variation between homologs. This process involves particularly subtelomeric rRNA gene units that are located on multiple chromosomes (Adam, 1992; Le Blancq and Adam, 1998; Upcroft et al., 2005). Chromosomes of Giardia condense during mitosis and can be studied by cytogenetic spreading technique. Giardia metaphase chromosomes are tiny (0,8 – 2,4 µm), rod – shaped or ovoid structures without apparent constrictions and they can not be distinguished based on their morphology. Chromosomes of Giardia consist of two sister chromatids that are not linked together, as it is typical for eukaryotic chromosomes, probably due to modification in cohesin protein complex (Tumova et al., 2007). Despite the absence of constrictions, chromosomes of Giardia were characterized as monocentric with point centromeres. It was deduced from the ultrastructural study of dividing cell showing the low number of spindle microtubules attaching to the chromosomes in metaphase of mitosis and characteristic immunostaining pattern of cenH3 histone variant, a marker protein of centromeres (Dawson et al., 2007; Tumova et al., 2007). Recent study by Tumova et al. (Tumova et al., 2007) revealed that Giardia is not tetraploid with 10 chromosomes in each nucleus, as originally proposed (Le Blancq and Adam, 1998), but the two nuclei possess different chromosome number and thus Giardia is aneuploid for at least some chromosomes. Two dominant karyotypes were observed in Giardia: i.) 9 + 11 chromosome pattern in isolates from genetic assemblage A and ii.) 10 + 11 chromosomes in isolates from assemblage B. This assymetry was found out to be stable. 15 1.2.6. Genomes of Trichomonas vaginalis and related species (Parabasala, EXCAVATA) T. vaginalis is the causative agent of human trichomoniasis, the most widespread nonviral sexually transmitted disease worldwide (Conrad et al., 2011). Analysis of T. vaginalis G3 genome revealed surprising large genome size: about 160 Mbp (Carlton et al., 2007). It is the largest genome among sequenced parasitic prosists and with 60 000 protein coding genes the genome has one of the highest coding capacities in eukaryotes. Several factors were suggested to participace on the Trichomonas genome expansion.: (i) the genome underwent several large scale genome duplications, (ii) specific gene families were amplified, (iii) the genome has highly repetitive nature that was predicted to be 53,3% already in former study of Wang and Wang (Wang and Wang, 1985). Various repetitive sequences such as transposones or retrotransposones comprise about two-thirds (65%) of the genome (Carlton et al., 2007). Several hundred copies of transposone mariner were identified here as the first representative of the mariner family found in protists (Silva et al., 200ř). T. vaginalis genome contains 250 rDNA units that are composed of genes coding 16S, 5,8S and 23S rRNAs. The units are arranged in classical head to tail tandem arrays. They were mapped in a single nucleolar organizer region (NOR) by fluorescence in situ hybridization (FISH) in the secondary constriction of chromosome IV (Carlton et al., 2007; Zubacova et al., 2011). Previous studies of NORs in trichomonads based on silver staining method failed to detect NOR in T. vaginalis, however, a single NOR was found in karyotypes of related trichomonads Trichomitus batrachorum and Tritrichomonas augusta (Drmota and Kral, 1997; Zubacova et al. 2011). Recently, FISH technique revealed single NOR also in Tritrichomonas foetus and Trichomonas tenax, a sister taxon of T. vaginalis (TorresMachorro et al., 2009). Karyotype of T. vaginalis as well as karyotypes of other trichomonads consists of stable haploid set of chromosomes that condense during mitosis and can be studied using spreading technique and other cytogenetic techniques such as C- and fluorescence-banding or FISH. The culture of trichomonads can be enriched in cells arrested in metaphase by using mitotic drug colchicine (Drmota and Kral, 1997; Yuh et al., 1998). In contrast to other parasitic protists, numerous attempts for construction of molecular karyotype were unsuccessful. Chromosomes of T. vaginalis can not be separated by PFGE probably due to their large size. 16 Karyotype of T. vaginalis consists of six chromosomes that could be distiguish by their size and morphology. They are tiny (1 – 2.4 µm) and consist of two sister chromatids arranged in paralell, each with its own centromere. Genes encoding proteins of cohesin multi – protein complex involved in sister chromatid cohesion, such as Rad21 are annotated in T. vaginalis genome (Tumova et al., 2007; Zubacova and Tachezy, unpublished data). However, sister chromatid cohesion as a thin connections between chromatids can be observed very rarely on T. vaginalis metaphase microscopic preparations (Drmota and Kral,1997; Zubacova unpublished observation). Together with the examination of T. vaginalis, karyotypes, species including genera Tritrichomonas, Tetratrichomonas, related trichomonad Pentatrichomonas, Monocercomonas, Trichomitus and Hypotrichomonas were recently revised as available analyses conducted between 1909 and 1931 were rather inconclusive (Drmota and Kral, 1997). Using spreading technique we described four to six chromosomes in their karyotypes. The chromosome number was haploid and species-stable in all examined organisms (Zubacova et al., 2008). Cytogenetic techniques of chromosome examination including banding or FISH have not been previously employed in study of trichomonad genome. Similarly to Giardia, our attempts to visualize heterochromatin regions such as centromeres and AT- or GC- rich DNA on T. vaginalis chromosomes using C – banding and fluorescence banding failed (Tumova et al., 2007; Zubacova, unpublished). On the other hand, FISH technique was successfully established for mapping of both multiple and single copy genes (Zubacova et al. 2011). This technique was employed to map distribution of transposone mariner on T. vaginalis chromosomes and in the study of genetic polymorhpism of different T. vaginalis isolates using single – copy microsatellite markers (Carlton et al., 2007; Conrad et al., 2011). The expansion of T. vaginalis genome might be associated with increased cell size. It was hypothesized that large size of T. vaginalis genome and cell are related with pathogenicity of this parasite, augmentation of its phagocytosis ability and reduction of its own ingestion by host macrophages. Thus, genome sizes of eight selected trichomonad species of different pathogenicity and phagocytic capacity have been studied. The estimation of nuclear DNA content by flow cytometry revealed relatively large genomes (86 – 177 Mb) in all tested species independently on their parasitic life style (Zubacova et al., 2008). Genome sequencing of another trichomonad species is highly desired to elucidate parabasalid genome evolution 17 2. Mitosis in parasitic protists Mitosis is an essential process for cell life. Mitosis is a part of eukaryotic cell cycle that includes chromosome segregation into daughter nuclei that proceed cell division. It is divided into stages of profase, metaphase, anaphase and telophase that include changes in chromosome movement and nuclear envelope dynamics. In a textbook mitosis, at prophase, chromatin starts to condense to chromosomes that consist of two sister chromatids. The chromatids are linked together via cohesin protein complex at the centromere. Microtubule organizing center (MTOC) duplicates to form mitotic spindle and nuclear envelope disintegrates. At metaphase, chromosomes are aligned at the center of the spindle to form metaphase plate. They are anchored to spindle microtubules via multi-protein complexes kinetochores that are built at centromeres one at each chromatid. Chromosome segregation occurs at anaphase by the action of mitotic spindle and associated motor proteins. Sister chromatids separate and begin to move towards opposite poles of the cell. At telophase, nuclear envelope assemble around the daughter genomes, chromosomes decondense and cell starts to divide into daughter cells (cytokinesis) (Gerald, 2011; Santaguida and Musacchio, 2009). Every eukaryotic chromosome requires a centromere for proper segregation into daughter cells. Centromeres evolved from telomeres during evolution of eukaryotic chromosomes by breakage of ancestral circular chromosome and subsequent recovery of the exposed DNA ends by retrotransposones followed by amplification of subtelomeric repeats that gave rise to proto – centromeres. Phylogenetic analyses support the idea that all centromeres, no matter how complex they are, share common ancestor based on repetitive DNA derived from subtelomeric repeats (Villasante et al., 2007). In majority of eukaryotes the centromere is not determined by specific DNA sequence motifs but epigenetically by centromere specific proteins. Divergent variant of histone H3 generally called cenH3 (CENP – A in mammals, Cid in Drosophila, Cse4 in yeast, CNA1 in Tetrahymena) is one and the only marker of centromeres found in all studied eukaryotes with the only exception of kinetoplastid protists (Lowell and Cross, 2004). CenH3s are essential for kinetochore assembly on centromeres (Henikoff and Ahmad, 2005; Talbert and Henikoff, 2010). Two types of chromosomes were described according to the structure of centromeres and kinetochores. First, mononectric (or monokinetic) chromosomes have centromeres and associated kinetochores localized in distinct chromosomal regions. These centromeres appear as primary chromosomal constrictions. Monocentric chromosomes are present in majority of 18 model eukaryotes and are considered to be ancestral to holocentric chromosomes (Guerra et al., 2010; Sagolla et al., 2006). In contrast, holocentric chromosomes do not have any constrictions and their centromeres and so – called diffuse kinetochores are spread along entire length of chromatids. This type of chromosomes was not studied so extensively in comparison to „typical“ monocentric type despite they are relatively widespread. Holocentric chromosomes occur commonly in invertebrates (for example in nematode worm Caenorhabditis elegans), among protists in ciliates, and also in several green algae and plant groups. Holocentric chromosomes were not observed in vertebrates (Dernburg, 2001; Guerra et al., 2010; Mola and Papeschi, 2006; Tyler-Smith and Floridia, 2000). Knowledge in the organization of crucial chromosome structures such as centromeres and associated kinetochores is very limited in parasitic protists. Monocentric chromosomes were described in Giardia intestinalis (Tumova et al., 2007) and Trichomonas vaginalis (Drmota and Kral, 1997; Zubacova and Tachezy, unpublished data). Chromosomes of Plasmodium falciparum, trypanosomatids and Entamoeba histolytica do not condense what hampers to study their ultrastructure. From studies of centromeric DNA and kinetochores it is unclear whether their chromosomes are monocentric or holocentric (Kelly et al., 2006; LopezRobles et al., 2000; Obado et al., 2007; Prensier and Slomianny, 1986). Centromeres were not detected on Cryptosporidium chromosomes (Bankier et al., 2003). Based on behavior of the nuclear envelope, two basic variants of mitosis can be distinguished: open and closed. According to the symmetry of the spindle, mitosis is classified as orthomitosis when the spindle has axial symmetry or pleuromitosis when excentric spindle is present (Figure 1). Generally, higher eukaryotes undergo open mitosis. Mitotic spindle is cytoplasmic and nuclear envelope disintegrates to allow association of spindle microtubules with chromosome kinetochore (De Souza and Osmani, 2007). In eukaryotes with closed mitosis, the nuclear envelope remains intact during whole cell cycle and mitotic spindle is intranuclear. The organisms use the modification of nuclear pore complex or modify nuclear transport to allow tubulin units enter the nucleus and assemble intranuclear spindle. Closed mitosis occurs in unicellular eukaryotes such as fungi (e. g. yeasts) and trypanosomes (De Souza and Osmani, 2007; Ersfeld and Gull, 1997; Raikov, 1994). In addition to open and closed mitosis, variants of this process have evolved in various lineages of eukaryotes. Extreme mitotic diversity occurs in protists however, details of this process are often poorly understood (Brooks et al., 2011; Gerald, 2011; Raikov, 1994). However, along with gradual release of genomic data, growing interest in investigation of molecular aspects of cell division 19 can be observed especially in parasitic protists with respect to developing antiparasitic drugs that will be able to target cell division machinery and inhibit parasite´s reproduction. Figure 1: Classification of mitosis based on nuclear envelope behavior and mitotic spindle localization (Raikov, 1994). 2.1. Mitosis in Apicomplexa In Apicomplexa there are three types of cell division: i.) Endodyogeny is budding of two daughter cells after mitosis within the mother cell. This type of reproduction is typical for Toxoplasma tachyzoites (Dzierszinski et al., 2004). ii.) In schizogony repeated mitosis without cytokinesis leads to multinucleated cells and subsequent generation of numerous haploid progeny by cytokinesis. Schizogony occurs in Plasmodium life cycle and in slow proliferating bradyzoites of Toxoplasma. iii.) Endopolygeny is characteristic for tissue parasite Sarcocystis, where several rounds of genome duplication occur without nuclear division leading to the large temporarily polyploid nucleus followed by budding into multiple daughters (Figure 2). Multiple intranuclear spindles persist throughout the cell cycle providing organization of chromosomes within polyploid nucleus (Brooks et al., 2011; Dzierszinski et al., 2004; Striepen et al., 2007; Vaishnava et al., 2005). 20 Nuclear envelope persists during mitosis of Apicomplexa but there are pores within the envelope for penetration of spindle microtubules. Two MTOCs (called also kinetic centers and centriolar plaques in Plasmodium or centrocons in Toxoplasma and Sarcocystis ) are extranuclear but are embedded within the outer side of the envelope giving rise to two bundles of intranuclear spindle microtubules (Brooks et al., 2011; Gerald, 2011). This type of mitosis is called semi open pleuromitosis. Chromosomes do not condense in Apicomplexa. Throughout the cell cycle they remain permanently attached to the mitotic spindles via kinetochores to maintain genome integrity of polyploid stages. Chromosomes of Plasmodium do not align into central metaphase plate. They have characteristic trilaminar kinetochores although majority of conserved centromere – associated and kinetochoral components were not detected in the genome that suggest the evolution of novelties in the protein composition of Plasmodium kinetochore (Kelly et al., 2006; Prensier and Slomianny, 2006). Centromeres in Plasmodium were defined on the base of DNA sequence motifs using etoposide – mediated topoisomerase II cleavage assay. This assay is based on inhibition of activity of topoisomerase II by anticancer grug etoposide. In metaphase topoisomerase II concentrates at the centromeres playing role in sister chromatid decatenation. Etoposide blocks the activity of topoisomerase II leaving doublestrand breaks in regions of centromeres and subsequent breakage of DNA at the sites of centromeres (Brooks et al., 2011; Kelly et al., 2006). Plasmodium has „regional“ centromeres that are extremely AT rich (97%), they contain repetitive arrays and are relatively small comprised of only 2 - 3 kb of DNA (the length of centromeric region composed of repetitive satellite DNA in plants and metazoans can be up to Mb). Ortholog of centromere marker protein cenH3 was identified in Plasmodium genome however its chromosomal localization has not been studied yet (Kelly et al., 2006; Obado et al., 2007). In contrast to Plasmodium, centromeres of Toxoplasma chromosomes do not share characteristic DNA sequence motif and are the sites of accumulation of di – and trimethylated lysine 9 of core histone H3 that is a typical modification of heterochromatin flanking the centromeres. Centromeres of Toxoplasma were identified by immunolocalization of cenH3 marker histone and etoposide – mediated topoisomerase II cleavage assay. Centromeres immunostaining also showed that during mitosis, Toxoplasma chromosomes align to classical metaphase plate located very close to envelope between two MTOCs, the centrocone. The centrocone together with persisting spindle microtubules and probably proteins of nuclear envelope regulate organization of chromosomes throughout the parasite development to ensure genome stability during rapid reproduction (Brooks et al., 2011). 21 Figure 2: Mitosis in Apicomplexa inside the host cell: During endodyogony Toxoplasma completes whole cell cycle after each round of DNA replication. Plasmodium (schizogony) and Sarcocystis (endopolygeny) omit cytokinesis and/or nuclear division after several round of DNA replication resulting in multinucleated or polyploid cells (nuclei in grey, centrosomes in red, parasite membrane in purple) (Striepen et al., 2007). 2.2. Mitosis in Kinetoplastida Kinetoplastid parasites proliferate extensively in the digestive tract of their insect vectors, in host´s bloodstream (Trypanosoma brucei), tissue cells (Trypanosoma cruzi) or macrophages (Leishmania) (Berriman et al., 2005; Ivens et al., 2005; Tyler and Engman, 2001). Mitosis in kinetoplastids is closed with intact nuclear envelope and intranuclear mitotic spindle (Ogbadoyi et al., 2000). Distinct MTOCs have not been observed despite presence of genes encoding proteins such as γ – tubulin and centrins that function in MTOC (Berriman et al., 2011; Ersfeld and Gull, 1997). There are two types of chromosomes in trypanosomes, megabase chromosomes and minichromosomes that do not visibly condense during the mitosis probably due to altered structure and properties of their core histones (Ersfeld et al., 1999). Segregation of Trypanosoma megabase chromosomes and minichromosomes differs markedly (Ersfeld and Gull, 1997). Despite their number and size, it was found that minichromosomes are divided into daughter cells with high fidelity (Alsford et al., 2001). Minichromosomes are segregated by central pole - to - pole spindle microtubules. It is not known how they interact with the spindle, however, lateral gliding model along spindle microtubules has been proposed for their segregation (Ersfeld, 2011; Gull et al., 1998). Megabase chromosomes that have trilaminar kinetochores are segregated by peripheral pole – to – kinetochore microtubules 22 (Figure 3). Interesting feature of Trypanosoma mitosis is that the number of large chromosomes exceeds the number of kinetochores (11 pairs of megabase chromosomes versus 10 trilaminar kinetochore – like structures) (Ersfeld and Gull, 1997; Ersfeld et al., 1999; Gull et al., 1998). This could be explained by existence of kinetochore independent mechanism of chromosomal segregation. Alternatively, some of the large chromosomes may possess small and undetectable kinetochores like kinetochores of Sachcaromyces cerevisiae that are defined functionally as well as biochemically but their ultrastructure is not visible by electron microscopy. The numer of kinetochores do not corresponds to the chromosome number not only in T. brucei but also in T. cruzi and Leishmania (Gull et al., 1998; Ogbadoyi et al., 2000). The exact nature of kinetochores and centromeres of kinetoplastids is poorly understood. Centromeric DNA has not been identified, although there are several sequence candidates. Repetitive palindromes of 177 bp in T. brucei minichromosomes or 5,5 kb repetitive region within chromosome I are suggested to have centromeric role (Lowell and Cross, 2004). In Trypanosoma cruzi, centromeric DNA was mapped by chromosome fragmentation. A 16 kb region required for chromosome mitotic stability is GC rich and composed predominantly of degenerated retrotransposones, but lacks repetitive satellite DNA typical for centromeres of higher eukaryotes. Clusters of subtelomeric repeats may have role in maintenance of mitotic stability in Leishmania species (Obado et al., 2007). Trypanosomes have poorly conserved and reduced components of centromeric and core kinetochore protein complexes. For example, they lack the centromeric variant of histone H3, cenH3 that was found in all eukaryotes studied to date ranging from those with small point centromeres (yeast Saccharomyces cerevisiae) to those with holocentric chromosomes (nematode C. elegans). Instead of cenH3, trypanosomes have telomere – specific H3V histone variant that is conserved among kinetoplastids (Berriman et al., 2005; Lowell and Cross, 2004). H3V immunostaining in combination with FISH revealed that H3V is localized to telomeres of all classes of Trypanosoma chromosomes. Its function in telomeres is not understood (Lowell and Cross, 2004). Protein sequence of Trypanosoma histone H3V shares several common features with cenH3, such as divergent N – terminus and absence of glutamine residue in α – 1 helix of the protein. Histone H3V, similarly as cenH3, associates with the spindle during mitosis. Furthermore, centromeres are thought to evolved from telomeres what is consistent with the observations that telomeres and subtelomeric regions in several organisms have centromeric function (Lowell and Cross, 2004; Villasante et al. 2007). For example, chromosome fragmentation experiments showed that satellite DNA of subtelomeric regions of Leishmania major chromosomes have centromeric properties in 23 connection with maintaining chromosome stability (Dubessay et al., 2004). On the other hand, unlike cenH3, histone H3V lacks N – terminal extension and insertion in loop 1 of the protein structure. Moreover, H3V is not essential for chromosome segregation and cell viability, while gene knock out of cenH3 lead to defects in mitosis so H3V is probably not functional homolog of cenH3 (Regnier et al., 2005; Sanyal and Carbon, 2002). Figure 3: Model of segregation of Trypanosoma large and minichromosomes during mitosis. Large chromosomes (yellow) are segregated by kinetochore (green) dependent manner. Minichromosomes (red) segregate by gliding along pole-to-pole spindle microtubules (blue) (Gull et al., 1998). 2.3. Mitosis in Microsporidia Microsporidia divide by mitosis during their merogony and in the sporont stage inside host´s cells. They divide by closed mitosis with intranuclear spindle that is a typical trait of fungi. Chromosomes do not condense and chromatin is diffused inside the nucleus (Bigliardi et al., 1998; De Souza and Osmani, 2007). 2.4. Mitosis in Entamoeba histolytica E. histolytica trofozoites proliferate asexually in large intestine of vertebrate host by unusual type of closed mitosis. Unlike in majority of eukaryotes, there is no evidence for cell cycle checkpoint control in E. histolytica. Consequently, genome is repeatedly duplicated (endoreduplication) without nuclear and cellular division that leads to polyploidy (Lohia, 2003). Mitosis of majority of eukaryotes with functional checkpoints is normally blocked when it switches to endoreduplications. On the other hand, endoreduplication is also developmental mechanism responsible for cell types differentiation and tissue growth. It is widespread in plants where polyploidy can be detected in majority of plant tissues, while in 24 animals is limited to specific cell types (Kondorosi et al., 2000; Lee et al., 2009). Whether endoreduplications have some biological significance in E. histolytica is not clear. Polyploid amoebas occur in laboratory predominantly in media with depletion of nutrients (Lohia, 2003). It has been hypothesized that endoreduplication in E. histolytica represents an energy saving mode of proliferation that does not require cytoskelet rearrangements and production of new cell membranes or is used for effective nutrients uptake by increasing cell size (Lee et al., 2009). Molecular basis of Entamoeba chromosome segregation and nuclear division is still not fully elucidated. E. histolytica chromosomes and their number are poorly defined despite fluorescence and electron microscopy examinations. Chromosomes do not condense and nuclear DNA is concentrated as a dense body called karyosome in the center of the nucleus. Karyosome was considered to play a role in organizing of MTOC in early phases of mitosis. Microtubules appear in Entamoeba cells only during mitosis, while they are not detectable neither in nucleus nor in cytoplasm in interphase. Intranuclear mitotic spindle is unusually arranged to rosettes of 7 – 12 microtubules. In dividing nucleus, 28 – 35 microtubular rosettes were observed suggesting that the number of Entamoeba chromosomes is at least 35 considering that each rosette is associated with a single chromosome (Chavez-Munguia et al., 2006). No typical trilaminar kinetochores were observed on E. histolytica chromosomes. In the only study of centromeres, CREST serum was tested in E.histolytica to detect centromere/kinetochore regions in chromatin. CREST serum originates from human patients suffering from autoimmune disease systemic scleroderma when antibodies against centromeric and kinetochore proteins are produced (Fritzler et al., 2011; Lopez-Robles et al., 2000). Among lower eukaryotes, CREST serum was successfully tested in Leishmania mexicana chromosomes study (Lopez-Robles et al., 2000). In E. histolytica fixed trophozoites CREST serum recognized 2 – 3 clusters of intranuclear round bodies considered by López – Robles et al. (2000) to be centromeres. However, recognition sites of CREST antibodies in Entamoeba chromatin were not studied in detail. Furthermore, the number of fluorescence signals on microscopic preparations does not correspond with the number of chromosomes (35). The nature of Entamoeba centromeres and kinetochores thus still remains a subject of discussion and needs further study. 25 2.5. Mitosis in Giardia intestinalis Binucleate trofozoites of G. intestinalis reproduce in the upper part of small intestine. During the cell division, Giardia must duplicate not only nuclei but also multiple cytoskeletal structures such as eight flagella, the ventral disc, the funis and the median body that occur in and characterize trophozoites. Giardia employ semi – open mitosis (also described in related diplomonad Hexamita inflata) with two extranuclear mitotic spindles. The giardia mitosis includes all the phases that characterize mitosis of higher eukaryotes (Sagolla et al., 2006; Tumova et al., 2007). The mode of mitosis as well as presence of mitotic spindles were identified in Giardia just recently by transmission electron microscopy (TEM) and 3D fluorescence deconvolution microscopy. These studies revealed usual spindle microtubules based segregation of chromosomes and disproved previous proposals of unconventional mode of genome segregation based on spindle microtubule-independent mechanism (Solari et al., 2003). During semi-open mitosis, nuclear membrane is retained, however, polar openings are formed in the membrane and enable kinetochoral microtubules to enter the nucleus and associate with chromosomes. Segregation of chromosomes proceeds in left – right axis of bilaterally symmetrical trophozoite. Cytokinesis takes place along longitudinal axis of the cell. Despite observed alignment of chromosomes into spindle midzone, no typical metaphase plate was identified probably because of the small chromosome size or transient nature of this step (Sagolla et al., 2006; Tumova et al., 2007). Evidence of functionally conserved spindle in Giardia was based on behavior of histone cenH3 in mitotic anaphase when it clusters at the leading edge of segregating DNA towards spindle microtubule. Despite absence of direct evidence for attachment of microtubules to kinetochores, the number of microtubules entering the nucleus seen on electron microscopy preparations suggest that more than one microtubule is attached to one kinetochore (Dawson et al., 2007; Sagolla et al., 2006). Conserved kinetochoral proteins were not identified in G. intestinalis genome, however the presence of cenH3 histone variant, as an essential basement for eukaryotic kinetochore assembly, indicates the existence of kinetochoral structure (Dawson et al., 2007). Giardia nuclei remain physically separated from each other during cell division because of persistence of nuclear membrane that might lead to accumulation of allelic heterozygosity. However, genome sequencing revealed remarkably low level of heterozygosity between genome copies 0,01%. The mechanism of maintaining of 26 homozygosity of giardial nuclei is thus unknown. Nuclear fusions and meiotic recombinations leading to reduction of heterozygosity were suggested, however, cytological evidence of such events is missing (Morrison et al., 2007; Sagolla et al., 2006; Tumova et al., 2007). 2.6. Mitosis in Trichomonas vaginalis The ultrastructure of T. vaginalis mitosis was studied by Brugerolle and colleagues (Bricheux et al., 2007; Brugerolle, 1975). The phases of mitosis were described by Drmota and Kral (Drmota and Kral, 1997) who found all typical mitotic phases. T. vaginalis proliferates in urogenital tract of humans by a specific type of close mitosis named cryptopleuromitosis (Brugerolle, 1975). Similarly to Giardia, trichomonads form extranuclear mitotic spindle. However, polar opening were not observed and microtubules interacts somehow with chromosomes at nuclear envelope (Raikov, 1994). As in other lower eukaryotes with closed mitosis and extranuclear spindle, partial dissasembly and opening of nuclear pore complexes of intact nuclear envelope can be considered to enable the direct attachment of trichomonad spindle microtubules to kinetochores (De Souza and Osmani, 2007). The mitotic spindle is assembled from two MTOCs called atractophores. Three types of microtubules organize T. vaginalis extranuclear spindle: i.) pole to pole called paradesmosis, ii.) pole to cytoplasm and iii.) pole to nucleus called kinetochoral microtubules that attach to the outer side of nuclear membrane via a sort of dense material on the envelope named by Brugerolle the kinetochore (Bricheux et al., 2007). The structural basis of chromosomes segregation is poorly understood. Together with mitotic spindle, the nuclear envelope is considered to play a role in T. vaginalis chromosome segregation. It was suggested that the fluidity of nuclear membrane participates on movement of chromosomes at anaphase (Raikov, 1994). Whether trichomonad chromosomes are arranged along metaphase plate is not clear. Proteins involved in T. vaginalis mitosis such as components of centromere and kinetochore have not been studied experimentally. Genes coding for typical kinetochoral and centromere - associated proteins are present in T. vaginalis genome (Zubacova and Tachezy, unpublished data), however, their spectrum seems to be reduced in comparison to yeast and higher eukaryotes. Inability to detect homologous components in T. vaginalis might also reflect high diversity of Trichomonas sequences or presence of unique components of 27 kinetochore complex that remain to be described (Santaguida and Musacchio, 2009; Zubacova and Tachezy, unpublished data). Study of T. vaginalis centromeres is in its infancy. In initial study, we focused on identification of centromeric variant of histone H3, cenH3. We identified three histone H3 variants and 20 core H3 histones. To identify putative cenH3, we examined nuclear localization of all three H3 variants and a single core H3 as control using expression of hemagglutinin-tagged proteins in T. vaginalis. One of the H3 variant appears to be a putative cenH3 based on characteristic fluorescence pattern in nuclei. Immunostaining of TvcenH3 in G1 phase nuclei revealed six dots that correspond to kinetochores on six chromosomes. Twelve TvcenH3 fluorescence dots in doublets were observed in G2 phase of the cell cycle after DNA replication. This pattern of cenH3 is characteristic for monocentric centromeres appearance in interphase nuclei (Dernburg, 2001; Zubacova and Tachezy, unpublished data). Our attempts to localize centromeres on Trichomonas vaginalis chromosomes using CREST antibodies recognizing mammalian centromeric and kinetochore proteins were not successful probably because of differences in protein sequences of major CREST antigens such as CENP – A (mammalian cenH3 homolog) and CENP - B between humans and trichomonads. Experiments with expression of heterologous cenH3 from G. intestinalis or human CENP – A with the aim to identify the chromosomal position centromeres also failed in Trichomonas. None of these proteins could functionally replace trichomonad cenH3 in transfected trichomonads (Zubacova and Tachezy, unpublished data). Unique localization of the second T. vaginalis H3 variant corresponded to histone H3.3, a marker of transcriptionally active chromatin characterized by H3K4 methylation (Dernburg, 2001; Talbert and Henikoff, 2010). Despite completely different protein sequences, chromatin localization of the third T. vaginalis H3 variant was identical with localization of core H3. Function of this H3 variant is unknown (Zubacova and Tachezy, unpublished data). 28 References Abrahamsen, M. S., et al. (2004). Complete genome sequence of the apicomplexan Cryptosporidium parvum. Science 304: 441-45. Adam, R. D. (1992). Chromosome-size variation in Giardia lamblia: the role of rDNA repeats. Nucleic Acids Research 20: 3057-61. Alberts B., et al. (2002). Molecular Biology Biology of the Cell. Garland Science: 198-99. Alsford, S., et al. (2001). Diversity and dynamics of the minichromosomal karyotype in Trypanosoma brucei. Molecular and Biochemical Parasitology 113: 79-88. Aurrecoechea, C., et al. (2009). GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis. Nucleic Acids Research 37: 526-530. Bankier, A. T., et al. (2003). Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum. Genome Research 13: 1787-99. Bender, K., et al. (1992). Sequence differences between histones of procyclic Trypanosoma brucei brucei and higher eukaryotes. Parasitology 105: 97-104. Berriman, M., et al. (2005). The genome of the African trypanosome Trypanosoma brucei. Science 309: 416-22. Biderre, C., et al. (1995). Evidence for the smallest nuclear genome (2.9 Mb) in the microsporidium Encephalitozoon cuniculi. Molecular and Biochemical Parasitology 74: 229-31. Bigliardi, E., et al. (1998). Mechanisms of microsporidial cell division: Ultrastructural study on Encephalitozoon hellem. Journal of Eukaryotic Microbiology 45: 347-51. Blunt, D. S., et al. (1997). Molecular karyotype analysis of Cryptosporidium parvum: Evidence for eight chromosomes and a low-molecular-size molecule. Clinical and Diagnostic Laboratory Immunology 4: 11-13. 29 Bontempi, E. J., et al. (1994). Genes for histones H3 in Trypanosoma cruzi. Molecular and Biochemical Parasitology 66: 147-51. Bricheux, G., G. Coffe and G. Brugerolle (2007). Identification of a new protein in the centrosome-like "atractophore" of Trichomonas vaginalis. Molecular and Biochemical Parasitology 153: 133-40. Brooks, C. F., et al. (2011). Toxoplasma gondii sequesters centromeres to a specific nuclear region throughout the cell cycle. Proceedings of the National Academy of Sciences of the United States of America 108: 3767-72. Brugerolle, G. (1975). Aspects of cryptopleuromitosis in Trichomonas vaginalis and in other genera of Trichomonas. Journal of Protozoology 22: A76-A77. Carlton, J. M., et al. (2007). Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315: 207-12. Chavez-Munguia, B., V. Tsutsumi and A. Martinez-Palomo (2006). Entamoeba histolytica: Ultrastructure of the chromosomes and the mitotic spindle. Experimental Parasitology 114: 235-39. Conrad, M., et al. (2011). Microsatellite polymorphism in the sexually transmitted human pathogen Trichomonas vaginalis indicates a genetically diverse parasite. Molecular and Biochemical Parasitology 175: 30-38. Coppel, R. L. and C. G. Black (2005). Parasite genomes. International Journal for Parasitology 35: 465-79. Corradi, N., et al. (2007). Patterns of genome evolution among the microsporidian parasites Encephalitozoon cuniculi, Antonospora locustae and Enterocytozoon bieneusi. Plos One 2. Dawson, S. C., M. S. Sagolla, and W. Z. Cande (2007). The cenH3 histone variant defines centromeres in Giardia intestinalis. Chromosoma 116: 175-84. Day, K. P., et al. (1993). Genes necessary for expression of virulence determinant and for transmission of Plasmodium falciparum are located on a 0,3-megabase region of 30 chromosome 9. Proceedings of the National Academy of Sciences of the United States of America 90: 8292-96. De Souza, C. P. C. and S. A. Osmani (2007). Mitosis, not just open or closed. Eukaryotic Cell 6: 1521-27. Dernburg, A. F. (2001). Here, there, and everywhere: Kinetochore function on holocentric chromosomes. Journal of Cell Biology 153: F33-F38. Drmota, T. and J. Kral (1997). Karyotype of Trichomonas vaginalis. European Journal of Protistology 33: 131-35. Dubessay, P., et al. (2004). Chromosome fragmentation in Leishmania. Methods in Molecular Biology 270: 353-78. Dubey, J. P., Lindsay, D. S. and C. A. Speer (1998). Structures of Toxoplasma gondii tachyzoites, bradyzoites, and sporozoites and biology and development of tissue cysts. Clinical Microbiology Reviews 11: 267-99. Dzierszinski, F., et al. (2004). Dynamics of Toxoplasma gondii differentiation. Infection and Immunity 3: 992-1003. Ersfeld K. (2011). Nuclear architecture, genome and chromatin organisation in Trypanosoma brucei. Research in Microbiology 162: 626-636. Ersfeld, K. and K. Gull (1997). Partitioning of large and minichromosomes in Trypanosoma brucei. Science 276: 611-14. Ersfeld, K., S. E. Melville, and K. Gull (1999). Nuclear and genome organization of Trypanosoma brucei. Parasitology Today 15: 58-63. Födinger, M., et al. (1993). Pathogenic Entamoeba histolytica: cDNA cloning of a histone H3 with a divergent primary sturcture. Molecular and Biochemical Parasitology 59: 315-22. 31 Franzén, O., et al. (2009). Draft genome sequencing of Giardia intestinalis assemblage B isolate GS: is human giardiasis caused by two different species? PLoS Pathogens 5: e1000560. Fritzler MJ., et al. (2011). Historical perspectives on the discovery and elucidation of autoantibodies to centromere proteins (CENP) and the emerging importance of antibodies to CENP-F. Autoimmunity Reviews 10: 194-200. Gardner, M. J., et al. (2002). Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419: 498-511. Gerald, N., Mahajan, B. and S. Kumar (2011). Mitosis in the human malaria parasite Plasmodium falciparum. Eukaryotic Cell 10: 474-482. Gomez-Conde, E., et al. (1998). Chromatin organization during the nuclear division stages of live Entamoeba histolytica trophozoites. Experimental Parasitology 89: 122-24. Guay, J. M., et al. (1992). Physical and genetic mapping of cloned ribosomal DNA from Toxoplasma gondii primary and secondary structure of the 5S gene. Gene 114: 16571. Guerra, M., et al. (2010). Neocentrics and holokinetics (holocentrics): chromosomes out of the centromeric rules. Cytogenetic and Genome Research 129: 82-96. Gull, K., S. Alsford, and K. Ersfeld (1998). Segregation of minichromosomes in trypanosomes: implications for mitotic mechanisms. Trends in Microbiology 6: 31923. Henikoff, S. and K. Ahmad (2005). Assembly of variant histones into chromatin. Annual Review of Cell and Developmental Biology 21: 133-53. Ivens, A. C., et al. (2005). The genome of the kinetoplastid parasite Leishmania major. Science 309: 436-42. Katinka, M. D., et al. (2001). Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414: 450-53. 32 Keeling, P. J. (2004). Reduction and compaction in the genome of the apicomplexan parasite Cryptosporidium parvum. Developmental Cell 6: 614-16. Kelly, J. M., L. McRobert, and D. A. Baker (2006). Evidence on the chromosomal location of centromeric DNA in Plasmodium falciparum from etoposide-mediated topoisomerase-II cleavage. Proceedings of the National Academy of Sciences of the United States of America 103: 6706-11. Kissinger, J. C., et al. (2003). ToxoDB: accessing the Toxoplasma gondii genome. Nucleic Acids Research 31: 234-36. Kondorosi, E., F. Roudier and E. Gendreau (2000). Plant cell-size control: growing by ploidy? Current Opinion in Plant Biology 3: 488-92. LaGier, M. J., et al. (2003). Mitochondrial-type iron-sulfur cluster biosynthesis genes (IscS and IscU) in the apicomplexan Cryptosporidium parvum. Microbiology 149: 3519-30. Laliberté, J. and V. B. Carruthers (2008). Host cell manipulation by the human pathogen Toxoplasma gondii. Cellular and Molecular Life Sciences 65: 1900-15. Lalle M. (2010). Giardiasis in the post genomic era: treatment, drug resistance and novel therapeutic perspectives. Infectious Disorders-Drug Targets 10: 283-94. Le Blancq, S. M. and R. D. Adam (1998). Structural basis of karyotype heterogeneity in Giardia lamblia. Molecular and Biochemical Parasitology 97: 199-208. Le Blancq, S. M., et al. (1997). Ribosomal RNA gene organization in Cryptosporidium parvum. Molecular and Biochemical Parasitology 90: 463-78. Lee, H. O., J. M. Davidson, and R. J. Duronio (2009). Endoreplication: polyploidy with purpose. Genes & Development 23: 2461-77. Loftus, B., et al. (2005). The genome of the protist parasite Entamoeba histolytica. Nature 433: 865-68. Lohia, A. (2003). The cell cycle of Entamoeba histolytica. Molecular and Cellular Biochemistry 253: 217-22. 33 Lopez-Robles, M. C., et al. (2000). Centromeric structure identification in Entamoeba histolytica by anticentromeric/kinetochore antibodies obtained from patients with the CREST syndrome. Archives of Medical Research 31: S207-S209. Lowell, J. E. and G. A. M. Cross (2004). A variant histone H3 is enriched at telomeres in Trypanosoma brucei. Journal of Cell Science 117: 5937-47. Malik, S. B., et al. (2008). An expanded inventory of conserved meiotic genes provides evidence for sex in Trichomonas vaginalis. Plos One 3. Malvy, D. and F. Chappuis (2011). Sleeping sickness. Clinical Microbiology and Infection 17: 986-95. Marinets, A., et al. (1996). The sequence and organization of the core histone H3 and H4 genes in the early branching amitochondriate protist Trichomonas vaginalis. Journal of Molecular Evolution 43: 563-71. Melville, S. E., et al. (2000). The molecular karyotype of the megabase chromosomes of Trypanosoma brucei stock 427. Molecular and Biochemical Parasitology 111: 26173. Mola, L. M. and A. G. Papeschi (2006). Holokinetic chromosomes at a glance. Journal of Basic and Applied Genetics 17: 17-33. Morrison, H. G., et al. (2007). Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317: 1921-26. Nuismer, S. L. and S. P. Otto (2004). Host-parasite interactions and the evolution of ploidy. Proceedings of the National Academy of Sciences of the United States of America 101: 11036-39. Obado, S. O., et al. (2007). Repetitive DNA is associated with centromeric domains in Trypanosoma brucei but not Trypanosoma cruzi. Genome Biology 8: R37 Ogbadoyi, E., et al. (2000). Architecture of the Trypanosoma brucei nucleus during interphase and mitosis. Chromosoma 108: 501-13. 34 O´Connor, R. M., et al. (2011). Cryptosporidiosis in patients with HIV/AIDS. AIDS 13: 549-60. Prensier, G. and C. Slomianny (1986). The karyotype of Plasmodium falciparum determined by ultrastructural serial sectioning and 3D reconstruction. Journal of Parasitology 72: 731-36. Prokopowich, C. D., T. R. Gregory, and T. J. Crease (2003). The correlation between rDNA copy number and genome size in eukaryotes. Genome 46: 48-50. Raikov, I. B. (1994). The diversity of forms of mitosis in protozoa - a comparative review. European Journal of Protistology 30: 253-69. Ravel, C., P. Dubessay, and P. Bastien (1998). The complete chromosomal organization of the reference strain of the Leishmania genome project, L. major 'Friedlin'. Parasitology Today 14: 301-03. Regnier, V., et al. (2005). CENP-A is required for accurate chromosome segregation and sustained kinetochore association of BubR1. Molecular and Cellular Biology 25: 3967-81. Roos, D. S., et al. (1994). Molecular tools for genetic dissection of the protozoan parasite Toxoplasma gondii. Methods in Cell Biology 45: 27-63. Sagolla, M. S., et al. (2006). Three-dimensional analysis of mitosis and cytokinesis in the binucleate parasite Giardia intestinalis. Journal of Cell Science 119: 4889-900. Sanchez, L. B., Enea, V. and D. Eichinger (1994). Increased levels of polyadenylated histone H2B mRNA accumulate during Entamoeba invadens cyst formation. Molecular and Biochemical Parasitology 67: 137-46. Santaguida, S. and A. Musacchio (2009). The life and miracles of kinetochores. Embo Journal 28: 2511-31. Sanyal, K. and J. Carbon (2002). The CENP-A homolog CaCse4p in the pathogenic yeast Candida albicans is a centromere protein essential for chromosome transmission. 35 Proceedings of the National Academy of Sciences of the United States of America 99: 12969-74. Sibley, L. D. and J. C. Boothroyd (1992). Construction of a molecular karyotype for Toxoplasma gondii. Molecular and Biochemical Parasitology 51: 291-300. Silva, J. C., et al. (2005). A potentially functional mariner transposable element in the protist Trichomonas vaginalis. Molecular Biology and Evolution 22: 126-34. Solari, A. J., et al. (2003). A unique mechanism of nuclear division in Giardia lamblia involves compoments of the ventral disk and the nuclear envelope. Biocell 27: 329-46. Soto, M., et al. (1994). The Leishmania infantum histone H3 possesses an extremely divergent N-terminal domain. Biochimica et Biophysica Acta 1219: 533-35. Stanley, S. L. (2003). Amoebiasis. Lancet 361: 1025-34. Striepen, B., et al. (2007). Building the perfect parasite: cell division in apicomplexa. PLoS Pathogens 3: e78. Sunkin, S. M., et al. (2000). The size difference between Leishmania major Friedlin chromosome one homologues is localized to sub-telomeric repeats at one chromosomal end. Molecular and Biochemical Parasitology 109: 1-15. Talbert, P. B. and S. Henikoff (2010). Histone variants: ancient wrap artists of the epigenome. Nature reviews: Molecular Cell Biology 11: 264-75. Torres-Machorro, A. L., et al. (2009). Comparative analyses among the Trichomonas vaginalis, Trichomonas tenax, and Tritrichomonas foetus 5S ribosomal RNA genes. Current Genetics 55: 199-210. Tumova, P., et al. (2007). Cytogenetic evidence for diversity of two nuclei within a single diplomonad cell of Giardia. Chromosoma 116: 65-78. Tyler, K. M. and D. M. Engman (2001). The life cycle of Trypanosoma cruzi revisited. International Journal for Parasitology 31: 472-81. 36 Tyler-Smith, C. and G. Floridia (2000). Many paths to the top of the mountain: Diverse evolutionary solutions to centromere structure. Cell 102: 5-8. Upcroft, J. A., M. Abedinia, and P. Upcroft (2005). Rearranged subtelomeric rRNA genes in Giardia duodenalis. Eukaryotic Cell 4: 484-86. Vaishnava, S., et al. (2005). Plastid segregation and cell division in the apicomplexan parasite Sarcocystis neurona. Journal of Cell Science 118: 3397-407. Venter, J. C., et al. (2001). The sequence of the human genome. Science 291: 1304-51. Villasante, A., J. P. Abad, and M. Mendez-Lago (2007). Centromeres were derived from telomeres during the evolution of the eukaryotic chromosome. Proceedings of the National Academy of Sciences of the United States of America 104: 10542-47. Wang, A. L. and C. C. Wang (1985). Isolation and characterization of DNA from Tritrichomonas foetus and Trichomonas vaginalis. Molecular and Biochemical Parasitology 14: 323-35. Waters, A. P. (1994). The ribosomal RNA genes of Plasmodium. Advances in Parasitology 34: 33-79. Willhoeft, U. and E. Tannich (1999). The electrophoretic karyotype of Entamoeba histolytica. Molecular and Biochemical Parasitology 99: 41-53. Wu, G., et al. (2000). Core histones of the amitochondriate protist Giardia lamblia. Molecular Biology and Evolution 17: 271-73. Xia, D., et al. (2008). The proteome of Toxoplasma gondii: integration with the genome provides novel insights into gene expression and annotation. Genome Biology 9: R116. Xu, P., et al. (2004). The genome of Cryptosporidium hominis. Nature 431: 1107-12. Yuh, Y. S., J. Y. Liu, and M. F. Shaio (1998). Chromosome number of Trichomonas vaginalis. Journal of Parasitology 84: 28. 37 Zubacova, Z., Krylov, V. and J. Tachezy (2011). Fluorescence in situ hybridization (FISH) mapping of single copy genes on Trichomonas vaginalis chromosomes. Molecular and Biochemical Parasitology 176: 135-37. Zubacova, Z., Z. Cimburek, and J. Tachezy (2008). Comparative analysis of trichomonad genome sizes and karyotypes. Molecular and Biochemical Parasitology 161: 49-54. 38 3. Aims To compare nuclear DNA content and karyotypes in selected species of Trichomonadea group. To adapt fluorescence in situ hybridization (FISH) for mapping of repetitive sequences and single copy genes on T. vaginalis chromosomes. To map genes for ribosomal RNA, position and number of nucleolar organizer regions on T. vaginalis chromosomes and genome distribution of transposone mariner. To investigate histone H3 variants in T. vaginalis with focus on identification of centromeric histone H3 (cenH3) as a marker protein of centromeres. 39 4. Publications Carlton, J. M., Hirt R. P., Silva J. C., Delcher A. L., Schatz M., Zhao Q., Wortman J. R., Bidwell S. L., Alsmark U. C., Besteiro S., Sicheritz-Ponten T., Noel C. J., Dacks J. B., Foster P. G., Simillion C., Van de Peer Y., Miranda-Saavedra D., Barton G. J., Westrop G. D., Müller S., Dessi D., Fiori P. L., Ren Q., Paulsen I., Zhang H., Bastida-Corcuera F. D., Simoes-Barbosa A., Brown M. T., Hayes R. D., Mukherjee M., Okumura C. Y., Schneider R., Smith A. J., Vanacova S., Villalvazo M., Haas B. J., Pertea M., Feldblyum T. V., Utterback T. R., Shu C. L., Osoegawa K., de Jong P. J., Hrdy I., Horvathova L., Zubacova Z., Dolezal P., Malik S. B., Logsdon J. M. Jr, Henze K., Gupta A., Wang C. C., Dunne R. L., Upcroft J. A., Upcroft P., White O., Salzberg S. L., Tang P., Chiu C. H., Lee Y. S., Embley T. M., Coombs G. H., Mottram J. C., Tachezy J., Fraser-Liggett C. M. and P. J. Johnson (2007). Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315: 207-12. Zubacova, Z., Z. Cimburek and J. Tachezy (2008). Comparative analysis of trichomonad genome sizes and karyotypes. Molecular and Biochemical Parasitology 161: 49-54. Conrad, M., Zubacova, Z., Dunn, L. A., Upcroft, J., Sullivan, S. A., Tachezy, J. and J. M. Carlton (2011). Microsatellite polymorphism in the sexually transmitted human pathogen Trichomonas vaginalis indicates a genetically diverse parasite. Molecular and Biochemical Parasitology 175: 30-38. Zubacova, Z., V. Krylov, and J. Tachezy (2011). Fluorescence in situ hybridization (FISH) mapping of single copy genes on Trichomonas vaginalis chromosomes. Molecular and Biochemical Parasitology 176: 135-37. 40 RESEARCH ARTICLE Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis Jane M. Carlton,1*† Robert P. Hirt,2 Joana C. Silva,1 Arthur L. Delcher,3 Michael Schatz,3 Qi Zhao,1 Jennifer R. Wortman,1 Shelby L. Bidwell,1 U. Cecilia M. Alsmark,2 Sébastien Besteiro,4 Thomas Sicheritz-Ponten,5 Christophe J. Noel,2 Joel B. Dacks,6 Peter G. Foster,7 Cedric Simillion,8 Yves Van de Peer,8 Diego Miranda-Saavedra,9 Geoffrey J. Barton,9 Gareth D. Westrop,4 Sylke Müller,4 Daniele Dessi,10 Pier Luigi Fiori,10 Qinghu Ren,1 Ian Paulsen,1 Hanbang Zhang,1 Felix D. Bastida-Corcuera,11 Augusto Simoes-Barbosa,11 Mark T. Brown,11 Richard D. Hayes,11 Mandira Mukherjee,11 Cheryl Y. Okumura,11 Rachel Schneider,11 Alias J. Smith,11 Stepanka Vanacova,11 Maria Villalvazo,11 Brian J. Haas,1 Mihaela Pertea,3 Tamara V. Feldblyum,1 Terry R. Utterback,12 Chung-Li Shu,13 Kazutoyo Osoegawa,13 Pieter J. de Jong,13 Ivan Hrdy,14 Lenka Horvathova,14 Zuzana Zubacova,14 Pavel Dolezal,14 Shehre-Banoo Malik,15 John M. Logsdon Jr.,15 Katrin Henze,16 Arti Gupta,17 Ching C. Wang,17 Rebecca L. Dunne,18 Jacqueline A. Upcroft,19 Peter Upcroft,19 Owen White,1 Steven L. Salzberg,3 Petrus Tang,20 Cheng-Hsun Chiu,21 Ying-Shiung Lee,22 T. Martin Embley,2 Graham H. Coombs,23 Jeremy C. Mottram,4 Jan Tachezy,14 Claire M. Fraser-Liggett,1 Patricia J. Johnson11 We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated in pathogenesis and phagocytosis of host proteins may exemplify adaptations of the parasite during its transition to a urogenital environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria. richomonas vaginalis is a flagellated protist that causes trichomoniasis, a common but overlooked sexually transmitted human infection, with ~170 million cases occurring annually worldwide (1). The extracellular parasite resides in the urogenital tract of both sexes and can cause vaginitis in women and urethritis and prostatitis in men. Acute infections are associated with pelvic inflammatory disease, increased risk of human immunodeficiency virus 1 (HIV-1) infection, and adverse pregnancy outcomes. T. vaginalis is a member of the parabasilid lineage of microaerophilic eukaryotes that lack mitochondria and peroxisomes but contain unusual organelles called hydrogenosomes. Although previously considered to be one of the earliest branching eukaryotic lineages, recent analyses leave the evolutionary relationship of parabasalids to other major eukaryotic groups unresolved (2, 3). In this article, we report the draft sequence of T. vaginalis, the first parabasalid genome to be described. Genome structure, RNA processing, and lateral gene transfer. The T. vaginalis genome sequence, generated using whole-genome shotgun methodology, contains 1.4 million shotgun reads assembled into 17,290 scaffolds at ~7.2× coverage (4). At least 65% of the T. vaginalis genome is repetitive (table S1). Despite several procedures developed to improve the assembly T (4), the superabundance of repeats resulted in a highly fragmented sequence, preventing investigation of T. vaginalis genome architecture. The repeat sequences also hampered measurement of genome size, but we estimate it to be ~160 Mb (4). A core set of ~60,000 protein-coding genes was identified (Table 1), endowing T. vaginalis with one of the highest coding capacities among eukaryotes (table S2). Introns were identified in 65 genes, including the ~20 previously documented (5). Transfer RNAs (tRNAs) for all 20 amino acids were found, and ~250 ribosomal DNA (rDNA) units were identified on small contigs and localized to one of the six T. vaginalis chromosomes (Fig. 1). The Inr promoter element was found in ~75% of 5′ untranslated region (UTR) sequences (4), supporting its central role in gene expression (6). Intriguingly, the eukaryotic transcription machinery of T. vaginalis appears more metazoan than protistan (table S3 to S5). The presence of a T. vaginalis Dicer-like gene, two Argonaute genes, and 41 transcriptionally active DEADDEAH-box helicase genes suggests the existence of an RNA interference (RNAi) pathway (fig. S1). Identification of these components raises the possibility of using RNAi technology to manipulate T. vaginalis gene expression. During genome annotation, we identified 152 cases of possible prokaryote-to-eukaryote lateral www.sciencemag.org SCIENCE VOL 315 gene transfer (LGT) [tables S6 and S7 and Supporting Online Material (SOM) text], augmenting previous reports of conflicting phylogenetic relationships among several enzymes (7). The putative functions of these genes are diverse, affecting various metabolic pathways (fig. S2) and strongly influencing the evolution of the T. vaginalis metabolome. A majority (65%) of the 152 LGT genes encode metabolic enzymes, more than a third of which are involved in carbohydrate or amino acid metabolism (Fig. 2). Several LGT genes may have been acquired from Bacteroidetes-related bacteria, which are abundant among vertebrate intestinal flora (fig. S3). Repeats, transposable elements, and genome expansion. The most common 59 repeat families identified in the assembly (4) constitute ~39 Mb of the genome and can be classified as (i) virus-like; (ii) transposon-like, including ~1000 copies of the first mariner element identified outside animals (8); (iii) retrotransposonlike; and (iv) unclassified (Table 2). Most of the 1 Institute for Genomic Research, 9712 Medical Research Drive, Rockville, MD 20850, USA. 2Division of Biology, Devonshire Building, Newcastle University, Newcastle upon Tyne NE1 7RU, UK. 3Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA. 4 Wellcome Centre for Molecular Parasitology and Division of Infection and Immunity, Glasgow Biomedical Research Centre, University of Glasgow, Glasgow G12 8TA, UK. 5Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, DK-2800 Lyngby, Denmark. 6Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Alberta T2N 1N4, Canada. 7 Department of Zoology, Natural History Museum, Cromwell Road, London SW7 5BD, UK. 8Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University, Technologiepark 927, B-9052 Ghent, Belgium. 9School of Life Sciences Research, University of Dundee, Dow Street, Dundee DD1 5EH, UK. 10 Department of Biomedical Sciences, Division of Experimental and Clinical Microbiology, University of Sassari, Viale San Pietro 43/b, 07100 Sassari, Italy. 11Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA 90095–1489, USA. 12 J. Craig Venter Joint Technology Center, 5 Research Place, Rockville, MD 20850, USA. 13Children’s Hospital Oakland Research Institute, 747 52nd Street, Oakland, CA 94609, USA. 14 Department of Parasitology, Faculty of Science, Charles University, 128 44 Prague, Czech Republic. 15Department of Biological Sciences, Roy J. Carver Center for Comparative Genomics, University of Iowa, Iowa City, IA 52242–1324, USA. 16 Institute of Botany, Heinrich Heine University, 40225 Düsseldorf, Germany. 17Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143–2280, USA. 18Department of Microbiology and Parasitology, University of Queensland, Brisbane, Queensland 4072, Australia. 19 Queensland Institute of Medical Research, Brisbane, Queensland 4029, Australia. 20Bioinformatics Center/Molecular Medicine Research Center, Chang Gung University, Taoyuan 333, Taiwan. 21Molecular Infectious Diseases Research Center, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan. 22Medical Genomic Center, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan. 23Strathclyde Institute of Pharmacy and Biomedical Sciences, John Arbuthnott Building, University of Strathclyde, Glasgow G4 0NR, UK. *To whom correspondence should be addressed. E-mail: [email protected] †Present address: Department of Medical Parasitology, New York University School of Medicine, New York, NY 10011, USA. 12 JANUARY 2007 207 RESEARCH ARTICLE 59 repeats are present in hundreds of copies (average copy number ~660) located on small (1- to 5-kb) contigs, and each repeat family is extraordinarily homogenous, with an average polymorphism of ~2.5%. The lack of a strong correlation between copy number and average pairwise difference between copies (fig. S4) suggests that a sudden expansion of the repeat families had occurred. To estimate the time of expansion, we compared the degree of polymorphism among T. vaginalis repeats to Table 1. Summary of the T. vaginalis genome sequence data. Assembly size (bp, base pairs) includes all contigs and differs from estimated genome size of ~160 Mb (4). The scaffold size is the minimum scaffold length, such that more than half the genome is contained in scaffolds of at least that length. The number of predicted genes may include low-complexity repeats or novel transposable elements rather than true T. vaginalis genes, but in the absence of decisive evidence these remain in the gene set. The number of evidence-supported genes includes those with either similarity to a known protein (E < 1 × 10–10, >25% length of protein) or similarity to an expressed sequence tag (>95% identity over >90% length of the gene). A total of 763 rDNA fragments (258 copies of 28S, 254 copies of 18S, and 251 copies of 5.8S) were identified. Feature Value Genome Size of assembly (bp) 176,441,227 G+C content (%) 32.7 No. of scaffolds 17,290 N50 scaffold size (bp) 68,338 Protein-coding genes No. of predicted genes 59,681 No. of evidence-supported genes 25,949 No. of genes with introns 65 Mean gene length (bp) 928.6 Gene G+C content (%) 35.5 Gene density (bp) 2956 Mean length of intergenic 1165.4 regions (bp) Intergenic G+C content (%) 28.8 Non–protein-coding genes Predicted tRNA genes 479 Predicted 5.8S, 18S, and ~250 28S rDNA units the divergence between T. vaginalis and its sister taxon T. tenax, a trichomonad of the oral cavity (9), for several protein-coding loci (4). Our results indicate that repeat family amplification occurred after the two species split (table S8). Several families have also undergone multiple expansions, as implied by bi- or trimodal distribution of pairwise distances between copies (fig. S5). T. vaginalis repeat families appear to be absent in T. tenax but are present in geographically diverse T. vaginalis (4), consistent with the expansion having occurred after speciation but before diversification of T. vaginalis. The large genome size, high repeat copy number, low repeat polymorphism, and evidence of repeat expansion after T. vaginalis and T. tenax diverged suggest that T. vaginalis has undergone a very recent and substantial increase in genome size. To determine whether the genome underwent any large-scale duplication event(s), we analyzed age distributions of gene families with five or fewer members (4). A peak in the age distribution histogram of pairs of gene families was observed (fig. S6), indicating that the genome underwent a period of increased duplication, and possibly one or more large-scale genome duplication events. Metabolism, oxidative stress, and transport. T. vaginalis uses carbohydrate as a main energy source via fermentative metabolism under aerobic and anaerobic conditions. We found the parasite to use a variety of amino acids as energy substrates (Fig. 2) (10), with arginine dihydrolase metabolism a major pathway for energy production (fig. S7) (11). We confirmed a central role for aminotransferases (Fig. 2 and table S9) and glutamate dehydrogenase as indicated previously (12, 13); these pathways are likely catabolic but may be reversible to allow the parasite to synthesize glutamate, aspartate, alanine, glutamine, and glycine. Genes required for synthesis of proline from arginine (fig. S7) and for threonine metabolism (fig. S8) were identified. We also identified a de novo biosynthesis pathway for cysteine via cysteine synthase, an LGT candidate (fig. S8) (14), and genes encoding enzymes involved in methionine metabolism, including its possible regeneration (fig. S9). Earlier studies indicated that de novo lipid biosynthesis in T. vaginalis is confined to the major phospholipid phosphatidylethanolamine (PE) (15), whereas other lipids, including cholesterol, are likely acquired from exogenous sources. We found an absence of several essential enzyme-encoding genes in the synthesis and degradation pathways of nearly all lipids (4), in contrast to the PE synthetic pathway, which appears complete; however, experimental verification of these results is required. T. vaginalis is microaerophilic with a primarily anaerobic life style and thus requires redox and antioxidant systems to counter the detrimental effects of oxygen. Genes encoding a range of defense molecules, such as superoxide dismutases, thioredoxin reductases, peroxiredoxins, and rubrerythrins, were identified (table S10). T. vaginalis demonstrates a broad range of transport capabilities, facilitated by expansion of particular transporter families, such as those for sugar and amino acids (table S11). The parasite also possesses more members of the cationchloride cotransporter (CCC) family than any other sequenced eukaryote, likely reflecting osmotic changes faced by the parasite in a mucosal environment. None of the proteins required for glycosylphosphatidylinositol (GPI)-anchor synthesis were identified in the genome sequence, making T. vaginalis the first eukaryote known to lack an apparent GPI-anchor biosynthetic pathway. Whether T. vaginalis has evolved an unusual biosynthetic pathway for synthesis of its nonprotein lipid anchors, such as the inositolphosphoceramide of surface lipophosphoglycans (16), remains to be determined. Massively expanded gene families. Many gene families in the T. vaginalis genome have undergone expansion on a scale unprecedented in unicellular eukaryotes (Table 3). Such “conservative” gene family expansions are likely to improve an organism’s adaptation to its environment (17). Notably, the selective expansion of subsets of the membrane trafficking machinery, critical for secretion of pathogenic proteins, endocytosis of host proteins, and phagocytosis of bacteria and host cells (table S12), correlates well with the parasite’s active endocytic and phagocytic life-style. Massively amplified gene families also occur in the parasite’s kinome, which comprises ~880 genes (SOM text) encoding distinct eukaryotic protein kinases (ePKs) and ~40 atypical protein Fig. 1. Karyotype and fluorescent in situ hybridization (FISH) analysis of T. vaginalis chromosomes. (A) Metaphase chromosome squashes of T. vaginalis reveal six chromosomes (I to VI). (B) FISH analysis using an 18S rDNA probe shows that all ~250 rDNA units localize to a single chromosome. (C) In contrast, the Tvmar1 transposable element (8) is dispersed throughout the genome. 208 12 JANUARY 2007 VOL 315 SCIENCE www.sciencemag.org RESEARCH ARTICLE kinases, making it one of the largest eukaryotic kinomes known. The parasite has heterotrimeric guanine nucleotide–binding proteins and components of the mitogen-activated protein kinase (MAPK) pathway, suggesting yeast-like signal transduction mechanisms. Unusually, the T. vaginalis kinome contains 124 cytosolic tyrosine kinase–like (TKL) genes, yet completely lacks receptor serine/threonine ePKs of the TKL family. Inactive kinases were found to make up 17% of the T. vaginalis kinome (table S13); these may act as substrates and scaffolds for assembly of signaling complexes (18). ePK accessory domains are important for regulating signaling pathways, but just nine accessory domain types were identified in 8% (72/883) of the T. vaginalis ePKs (table S14), whereas ~50% of human ePKs contain at least 1 of 83 accessory domain types. This suggests that regulation of protein kinase function and cell signaling in T. vaginalis is less complex than that in higher eukaryotes, a possible explanation for the abundance of T. vaginalis ePKs. T. vaginalis possesses several unusual cytoskeletal structures: the axostyle, the pelta, and the costa (19). Most actin- and tubulin-related components of the cytoskeleton are present (table S15), with the exception of homologs of the actin motor myosin. In contrast, homologs of the microtubular motors kinesin and dynein are unusually abundant (Table 3). Thus, T. vaginalis intracellular transport mechanisms are mediated primarily by kinesin and cytoplasmic dynein, as described for Dictyostelium and filamentous fungi, raising the possibility that the loss of myosin-driven cytoplasmic transport is not uncommon in unicellular eukaryotes (20). Whether the structural remodeling of amoeboid T. vaginalis during host colonization (see below) is actin-based, as described for other eukaryotes, or driven by novel cytoskeletal rearrangements remains an open question. We identified homologs of proteins involved in DNA damage response and repair, chromatin restructuring, and meiosis, the latter a process not thought to occur in the parasite (table S16). Of the 29 core meiotic genes found, several are general repair proteins required for meiotic progression in other organisms (21), and eight are meiosisspecific proteins. Thus, T. vaginalis contains either recent evolutionary relics of meiotic machinery or genes functional in meiotic recombination in an as-yet undescribed sexual cycle. Molecular mechanisms of pathogenesis. T. vaginalis must adhere to host cells to establish and maintain an infection. A dense glycocalyx composed of lipophosphoglycan (LPG) (Fig. 3) and surface proteins has been implicated in adherence (22), but little is known about this critical pathogenic process. We identified genes encoding enzymes predicted to be required for LPG synthesis (table S17). Of particular interest are the genes required for synthesis of an unusual nucleotide sugar found in T. vaginalis LPG, the monosaccharide rhamnose, which is absent in the human host, making it a potential drug target. Genes (some of which are LGT candidates) were identified that are involved in sialic acid biosynthesis, consistent with the reported presence of this sugar on the parasite surface (23). We identified eight families containing ~800 proteins (4) that represent candidate surface molecules (Fig. 3 and table S18), including ~650 GLYCOLYSIS arginine 16 NH 3 15 18 hydroxypyruvate 14 17 glycerate 24 3-phosphoglycerate 3-phospho- 25 hydroxypyruvate phosphoserine 27 19 carbamoyl phosphate 13 ornithine 5 11 NH3 26 CO 2 12 citrulline urea NH3 putrescine 2-phosphoglycerate serine 20 22 glycine CO 2 1-pyrroline5-carboxylate 21 threonine proline phosphohomoserine NH 3 α-ketobutyrate methanethiol, NH3 31 methionine pyruvate 3-mercaptopyruvate alanine 9 3, 40 3 5 glutamate 2 NH3 , H2 S tyrosine, phenylalanine tryptophan indole, 2 NH 3 38 3 aspartate 34 S-adenosylmethionine 33 35 4 asparagine lysine 6 31 homocysteine 32 methylthio adenosine 29 phosphoenol NH3 , pyruvate H 2S NH 3 H 2S cysteine 9 30 22 36 cystine NH3 23 10 PYRUVATE METABOLISM 28 8 7 1 NH3 39 leucine, isoleucine, NH3 valine NH 3 CO 2 NH 3 S-adenosylhomocysteine glutamine 4-aminobutanoate 8-amino-7-oxononanoate 37 1-aminocyclopropane1-carboxylate 7,8-diaminononanoate α-ketoglutarate S-adenosyl-4-methylthio2-oxobutanoate Fig. 2. Schematic of T. vaginalis amino acid metabolism. A complete description of enzymatic reactions (represented as numbers) is given in the SOM text. Broken lines represent enzymes for which no gene was identified in the genome sequence, although the activity would appear to be required. Green boxes indicate enzymes encoded by candidate LGT genes. www.sciencemag.org SCIENCE VOL 315 highly diverse BspA-like proteins characterized by the Treponema pallidum leucine-rich repeat, TpLLR. BspA-like proteins are expressed on the surface of certain pathogenic bacteria and mediate cell adherence and aggregation (24). The only other eukaryote known to encode BspA-like proteins, the mucosal pathogen Entamoeba histolytica (25), contains 91 such proteins, one of which was recently localized to the parasite surface (26). There are >75 T. vaginalis GP63-like proteins, homologs of the most abundant surface proteins of Leishmania major, the leishmanolysins, which contribute to virulence and pathogenicity through diverse functions in both the insect vector and the mammalian host (27). Most T. vaginalis GP63like genes possess the domains predicted to be required for a catalytically active metallopeptidase, including a short HEXXH motif (28) (Fig. 3). Unlike trypanosomatid GP63 proteins, which are predicted to be GPI-anchored, most T. vaginalis GP63-like proteins have a predicted C-terminal transmembrane domain as a putative cell surface anchor, consistent with the apparent absence of GPI-anchor biosynthetic enzymes. Other T. vaginalis protein families share domains with Chlamydia polymorphic membrane proteins, Giardia lamblia variant surface proteins, and E. histolytica immunodominant variable surface antigens (Fig. 3 and table S18). After cytoadherence, the parasite becomes amoeboid, increasing cell-to-cell surface contacts and forming cytoplasmic projections that interdigitate with target cells (19). We have identified genes encoding cytolytic effectors, which may be released upon host-parasite contact. T. vaginalis lyses host red blood cells, presumably as a means of acquiring lipids and iron and possibly explaining the exacerbation of symptoms observed during menstruation (29). This hemolysis is dependent on contact, temperature, pH, and Ca2+, suggesting the involvement of pore-forming proteins (30) that insert into the lipid bilayer of target cells, mediating osmotic lysis. Consistent with this, we have identified 12 genes (TvSaplip1 to TvSaplip12) containing saposin-like (SAPLIP) pore-forming domains (fig. S10). These domains show a predicted six-cysteine pattern and abundant hydrophobic residues in conserved positions while displaying high sequence variability (fig. S11). The TvSaplips are similar to amoebapore proteins secreted by E. histolytica and are candidate trichopores that mediate a cytolytic effect. The degradome. Peptidases perform many critical biological processes and are potential virulence factors, vaccine candidates, and drug targets (31). T. vaginalis contains an expanded degradome of more than 400 peptidases (SOM text), making it one of the most complex degradomes described (table S19). Of the three families of aspartic peptidases (table S20), T. vaginalis contains a single member of the HIV-1 retropepsin family that might serve as a putative candidate for anti-HIV peptidase inhibitors. Many studies have implicated papain 12 JANUARY 2007 209 RESEARCH ARTICLE Table 2. Summary of highly repetitive sequences in the genome of T. vaginalis. The 31 unclassified repeat families have been collapsed into one group. Repeat name R128b R169a R1794 R299a R3a R9a R947a Average (total) R8 R107 R119 R11b R128a R130a R165 R178 R204 R210 R2375 R242b R26b R289a R309b R414a R41b R473 Integ.a95313 Average (total) R1407.RT copia.a38393 Average (total) Average (total) Total all Average all Putative identity Copy no. Viral 203 2348 903 752 2243 1037 867 873 954 2637 831 663 518 669 931 1283 Transposon Mariner transposase 982 1304 Integrase 384 2246 Mutator-like profile 282 2954 Integrase 1842 981 HNH endonuclease 200 800 Mutator-like profile 173 1129 Mutator-like profile 365 2410 Integrase 580 2433 Integrase 51 1323 Mutator-like profile 75 2127 Endonuclease 566 765 Integrase 49 1151 Integrase 1148 1141 Integrase 54 1250 Integrase 56 1601 Integrase 37 909 Integrase 927 1184 Integrase 19 1124 Integrase 68 1301 414 1481 Retrotransposon Reverse transcriptase 6 2349 Copia-like profile 7 11001 6.5 6675 31 unclassified families 793 1131 Poxvirus D5 protein Phage tail fiber prot Hypothetical KilA-N terminal domain Poxvirus D5 protein KilA-N terminal domain KilA-N terminal domain 660 family cysteine peptidases as virulence factors in trichomonads; we identified >40 of them, highlighting the diversity of this family. Cysteine peptidases that contribute to the 20S proteosome (ubiquitin C-terminal hydrolases) are abundant (117 members, ~25% of the degradome), emphasizing the importance of cytosolic protein degradation in the parasite. T. vaginalis has nine NlpC/P60-like members (table S20; several of which are LGT candidates), which play a role in bacterial cell wall degradation and the destruction of healthy vaginal microflora, making the vaginal mucosa more sensitive to other infections. We also identified many subtilisin-like and several rhomboid-like serine peptidases, candidates for processing T. vaginalis surface proteins. In addition to the first asparaginase-type of threonine peptidase found in a protist, 13 families of metallopeptidases were also identi- 210 Length (bp) Cumulative length (bp) Average pairwise difference (%) 370,658 607,929 1,879,762 655,984 1,721,187 503,959 298,778 (6,038,257) 1.2 3.9 3.0 6.6 4.6 3.6 3.1 3.7 1,158,473 518,936 303,256 1,634,764 131,491 137,526 368,152 636,901 51,185 118,581 329,717 45,564 1,175,921 53,674 73,737 27,708 807,711 15,254 22,023 (7,610,574) 0.5 0.9 1.1 2.1 1.5 0.7 1.1 1.5 1.6 0.9 5.4 2.5 2.6 2.2 1.4 1.4 2.5 1.3 2.3 1.8 6,833 16,504 (23,337) 1.0 7.5 4.3 (24,617,704) 38,289,872 2.6 1450 2.5 fied, as well as three cystatin-like proteins, natural peptidase inhibitors, which may regulate the activity of the abundant papain-like cysteine peptidases (table S20). The hydrogenosome. Several microaerophilic protists and fungi, including trichomonads and ciliates, lack typical mitochondria and possess double-membrane hydrogenosomes, which produce adenosine triphophate (ATP) and molecular hydrogen through fermentation of metabolic intermediates produced in the cytosol. Although the origin of these organelles has been controversial, most evidence now supports a common origin with mitochondria (3). Few genes encoding homologs of mitochondrial transporters, translocons, and soluble proteins were identified in the T. vaginalis genome (fig. S12), suggesting that its hydrogenosome has undergone reductive evolution comparable to other 12 JANUARY 2007 VOL 315 SCIENCE Table 3. Large gene families in T. vaginalis. Only gene families, or aggregates of gene families mediating a given process, for which there are more than 30 members and that have been assigned a putative function are listed. (See SOM text for subfamily organization.) The small guanosine triphosphatases (GTPases) are the sum of Rab and ARF small GTPases only; all other GTPases are shown in table S12. ABC, ATP-binding cassette; MFS, major facilitator superfamily; MOP, multidrug/oligosaccharidyl-lipid/polysaccharide; AAAP, amino acid/auxin permease. Gene family (functional unit) Members Protein kinases 927 BspA-like gene family 658 Membrane trafficking: small GTPases 328 rDNA gene cluster ~250 Cysteine peptidase (clan CA, family C19) 117 Membrane trafficking: vesicle formation 113 ABC transporter superfamily 88 GP63-like (leishmanolysin) 77 MFS transporter family 57 MOP flippase transporter family 47 Cysteine peptidase (clan CA, family C1) 48 AAAP transporter family 40 Dynein heavy chain 35 P-ATPase transporter family 33 Serine peptidase (clan SB, family S8) 33 Membrane trafficking: vesicle fusion 31 protists whose mitochondrial proteomes are reduced (e.g., Plasmodium). Because nuclearencoded hydrogenosomal matrix proteins are targeted to the organelle by N-terminal presequences that are proteolytically cleaved upon import (32) similar to mitochondrial precursor proteins, we screened the genome for consensus 5to 20-residue presequences containing ML(S/T/A) X(1...15)R (N/F/E/XF), MSLX(1...15)R(N/F/XF), or MLR(S/N)F (28) motifs. A total of 138 genes containing putative presequences were identified, 67% of which are similar to known proteins, primarily ones involved in energy metabolism and electron-transport pathways (fig. S13 and table S21). The production of molecular hydrogen, the hallmark of the hydrogenosome, is catalyzed by an unusually diverse group of iron-only [Fe]hydrogenases that possess, in addition to a conserved H cluster, four different sets of functional domains (fig. S14), indicating that hydrogen production may be more complex than originally proposed. The pathway that generates electrons for hydrogen (fig. S12) is composed of many proteins encoded by multiple genes (tables S22 to S24). Our analyses extend the evidence that T. vaginalis hydrogenosomes contain the complete machinery required for mitochondria-like intraorganellar FeS cluster formation (33) and also reveal the presence of two putative cytosolic auxiliary proteins, indicating that hydrogenosomes may be involved in biogenesis of cytosolic FeS proteins. Some components of www.sciencemag.org RESEARCH ARTICLE Fig. 3. Structural organization of putative T. vaginalis surface molecules involved in host cell adherence and cytotoxicity (28). Candidate surface protein families (a to f) are depicted as part of the glycocalyx (gray shading), known to be composed of LPG (g). The number of proteins with an inferred transmembrane domain (red box) is indicated at right, with the size of the entire family shown in parentheses. Substantial length variation exists between and within families (proteins are not drawn to scale). Additional information can be found in table S18. cell volume reported for protists (41). T. vaginalis is also a highly predatory parasite that phagocytoses bacteria, vaginal epithelial cells, and host erythrocytes (42) and is itself ingested by macrophages. Given these interactions, it is tempting to speculate that an increase in cell size could have been selected for in order to augment the parasite’s phagocytosis of bacteria, to reduce its own phagocytosis by host cells, and to increase the surface area for colonization of vaginal mucosa. References and Notes FeS cluster assembly machinery have also been found in mitosomes (mitochondrial remnants) of G. lamblia, supporting a common evolutionary origin of mitochondria, hydrogenosomes, and mitosomes (3). A new predicted function of hydrogenosomes revealed by the genome sequence is amino acid metabolism. We identified two components of the glycine-cleavage complex (GCV), L protein and H protein. Another component of this pathway is serine hydroxymethyl transferase (SHMT), which in eukaryotes exists as both cytosolic and mitochondrial isoforms. A single gene coding for SHMT of the mitochondrial type with a putative N-terminal hydrogenosomal presequence was identified. Because both GCV and SHMT require folate (fig. S12), which T. vaginalis apparently lacks, the functionality of these proteins remains unclear. The 5-nitroimidazole drugs metronidazole (Mz) and tinidazole are the only approved drugs for treatment of trichomoniasis. These prodrugs are converted within the hydrogenosome to toxic nitroradicals via reduction by ferredoxin (Fdx) (fig. S12). Clinical resistance to Mz (Mz R) is estimated at 2.5 to 5% of reported cases and rising (34) and is associated with decreases in or loss of Fdx (35). We identified seven Fdx genes with hydrogenosomal targeting signals (tables S21 and S25), the redundancy of which provides explanations for the low frequency of MzR and for why knockout of a single Fdx gene does not lead to MzR (35). Our analyses also provide clues to the potential mechanisms that clinically resistant parasites may use, such as the presence of nitroreductase (NimA-like), reduced nicotinamide adenine dinucleotide phosphate (NADPH)– nitroreductase, and NADH-flavin oxidoreductase genes (tables S23 and S26), which have been implicated in MzR in bacteria (36, 37). Summary and concluding remarks. Our investigation of the T. vaginalis genome sequence provides a new perspective for studying the biology of an organism that continues to be ignored as a public health issue despite the high number of trichomoniasis cases worldwide. The discovery of previously unknown metabolic pathways, the elucidation of pathogenic mechanisms, and the identification of candidate surface proteins likely involved in facilitating invasion of human mucosal surfaces provide potential leads for the development of new therapies and novel methods for diagnosis. The analysis presented here of one of the most repetitive genomes known has undoubtedly been hampered by the sheer number of highly similar repeats and transposable elements. Why did this genome expand so dramatically in size? We hypothesize that the most recent common ancestor of T. vaginalis underwent a population bottleneck during its transition from an enteric environment (the habitat of most trichomonads) to the urogenital tract. During this time, the decreased effectiveness of selection resulted in repeat accumulation and differential gene family expansion. Genome size and cell volume are positively correlated (38, 39); hence, the increased genome size of T. vaginalis achieved through rapid fixation of repeat copies could have ultimately resulted in a larger cell size. T. vaginalis cell volume is greater than that of T. tenax and related intestinal species Pentratrichomonas hominis (40) and T. gallinae (41), and it generally conforms to the relationship of genome size to www.sciencemag.org SCIENCE VOL 315 1. Global Prevalence and Incidence of Selected Curable Sexually Transmitted Infections (World Health Organization, Geneva, 2001), www.who.int/docstore/hiv/ GRSTI/006.htm. 2. S. M. Adl et al., J. Eukaryot. Microbiol. 52, 399 (2005). 3. T. M. Embley, W. Martin, Nature 440, 623 (2006). 4. Materials and methods are available as supporting material on Science Online. 5. S. Vanacova, W. Yan, J. M. Carlton, P. J. Johnson, Proc. Natl. Acad. Sci. U.S.A. 102, 4430 (2005). 6. M. A. Schumacher, A. O. Lau, P. J. Johnson, Cell 115, 413 (2003). 7. M. Muller, in Evolutionary Relationship Among Protozoa, G. H. Coombs, V. Vickerman, M. A. Sleigh, A. Warren, Eds. (Chapman & Hall, London, 1998), pp. 109–132. 8. J. C. Silva, F. Bastida, S. L. Bidwell, P. J. Johnson, J. M. Carlton, Mol. Biol. Evol. 22, 126 (2005). 9. K. Kutisova et al., Parasitology 131, 309 (2005). 10. X. Zuo, B. C. Lockwood, G. H. Coombs, Microbiology 141, 2637 (1995). 11. Y. Kleydman, N. Yarlett, T. E. Gorrell, Microbiology 150, 1139 (2004). 12. P. N. Lowe, A. F. Rowe, Mol. Biochem. Parasitol. 21, 65 (1986). 13. A. C. Turner, W. B. Lushbaugh, Exp. Parasitol. 67, 47 (1988). 14. G. D. Westrop, G. Goodall, J. C. Mottram, G. H. Coombs, J. Biol. Chem. 281, 25062 (2006). 15. D. H. Beach, G. G. Holz Jr., B. N. Singh, D. G. Lindmark, Mol. Biochem. Parasitol. 44, 97 (1991). 16. B. N. Singh, D. H. Beach, D. G. Lindmark, C. E. Costello, Arch. Biochem. Biophys. 309, 273 (1994). 17. C. Vogel, C. Chothia, PLoS Comput. Biol. 2, e48 (2006). 18. M. Parsons, E. A. Worthey, P. N. Ward, J. C. Mottram, BMC Genomics 6, 127 (2005). 19. M. Benchimol, Microsc. Microanal. 10, 528 (2004). 20. T. A. Richards, T. Cavalier-Smith, Nature 436, 1113 (2005). 21. M. A. Ramesh, S. B. Malik, J. M. Logsdon Jr., Curr. Biol. 15, 185 (2005). 22. F. D. Bastida-Corcuera, C. Y. Okumura, A. Colocoussi, P. J. Johnson, Eukaryot. Cell 4, 1951 (2005). 23. B. P. Dias Filho, A. F. Andrade, W. de Souza, M. J. Esteves, J. Angluster, Microbios 71, 55 (1992). 24. A. Ikegami, K. Honma, A. Sharma, H. K. Kuramitsu, Infect. Immun. 72, 4619 (2004). 25. B. Loftus et al., Nature 433, 865 (2005). 26. P. H. Davis et al., Mol. Biochem. Parasitol. 145, 111 (2006). 27. C. Yao, J. E. Donelson, M. E. Wilson, Mol. Biochem. Parasitol. 132, 1 (2003). 28. Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; X, any amino acid; and Y, Tyr. 29. M. W. Lehker, T. H. Chang, D. C. Dailey, J. F. Alderete J. Exp. Med. 171, 2165 (1990). 30. P. L. Fiori, P. Rappelli, A. M. Rocchigiani, P. Cappuccinelli, FEMS Microbiol. Lett. 109, 13 (1993). 31. M. Klemba, D. E. Goldberg, Annu. Rev. Biochem. 71, 275 (2002). 32. P. J. Bradley, C. J. Lahti, E. Plumper, P. J. Johnson, EMBO J. 16, 3484 (1997). 33. R. Sutak et al., Proc. Natl. Acad. Sci. U.S.A. 101, 10368 (2004). 12 JANUARY 2007 211 34. G. Schmid et al., J. Reprod. Med. 46, 545 (2001). 35. K. M. Land et al., Mol. Microbiol. 51, 115 (2004). 36. D. H. Kwon et al., Antimicrob. Agents Chemother. 45, 2609 (2001). 37. H. K. Leiros et al., J. Biol. Chem. 279, 55840 (2004). 38. T. Cavalier-Smith, Ann. Bot. (London) 95, 147 (2005). 39. T. R. Gregory, Biol. Rev. Camb. Philos. Soc. 76, 65 (2001). 40. B. M. Honigberg, G. Brugerolle, in Trichomonads Parasitic in Humans, B. M. Honigberg, Ed. (SpringerVerlag, New York, 1990), pp. 5–35. 41. B. J. Shuter, Am. Nat. 122, 26 (1983). 42. J. G. Rendon-Maldonado, M. Espinosa-Cantellano, A. Gonzalez-Robles, A. Martinez-Palomo, Exp. Parasitol. 89, 241 (1998). 43. We are grateful to the Trichomonas research community and M. Gottlieb for support and guidance during the duration of this project. We thank M. Delgadillo-Correa, J. Shetty, S. van Aken, and S. Smith for technical support; T. Creasy, S. Angiuoli, H. Koo, J. Miller, J. Orvis, P. Amedeo, and E. Lee for engineering support; W. Majoros and G. Pertea for data manipulation; and S. Sullivan and E. F. Merino for editing. Funding for this project was provided by the National Institute of Allergy and Infectious Diseases (grants UO1 AI50913-01 and NIH-N01-AI-30071). The Burroughs Wellcome Fund and the Ellison Medical Foundation provided funds for Trichomonas community meetings during the genome project. The Chang Gung T. vaginalis Systems Biology Project was supported by grants (SMRPD33002 and SMRPG33115) from Chang Gung Memorial Hospital and Taiwan Biotech Company Limited. This whole genome shotgun project has been deposited at DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank under the project accession AAHC00000000. The version described here is the first version, AAHC01000000. Supporting Online Material www.sciencemag.org/cgi/content/full/315/5809/207/DC1 Materials and Methods SOM Text Figs. S1 to S14 Tables S1 to S26 References Data 24 July 2006; accepted 14 November 2006 10.1126/science.1132894 REPORTS Spectropolarimetric Diagnostics of Thermonuclear Supernova Explosions Lifan Wang,1,2* Dietrich Baade,3 Ferdinando Patat3 Even at extragalactic distances, the shape of supernova ejecta can be effectively diagnosed by spectropolarimetry. We present results for 17 type Ia supernovae that allow a statistical study of the correlation among the geometric structures and other observable parameters of type Ia supernovae. These observations suggest that type Ia supernova ejecta typically consist of a smooth, central, iron-rich core and an outer layer with chemical asymmetries. The degree of this peripheral asphericity is correlated with the light-curve decline rate of type Ia supernovae. These results lend strong support to delayed-detonation models of type Ia supernovae. ifferent supernova (SN) explosion mechanisms may lead to differently structured ejecta. Type Ia supernovae (SNe) have been used as premier tools for precision cosmology. They occur when a carbon and oxygen white dwarf reaches the Chandrasekhar stability limit, probably because of mass accretion in a binary system, and is disrupted in a thermonuclear explosion (1). Decade-long debates have discussed how the explosive nuclear burning is triggered and how it propagates through the progenitor star (2–6). Successful models generally start with a phase of subsonic nuclear burning, or deflagration, but theorists disagree on whether the burning front becomes supersonic after the earlier phase of deflagration. An explosion that does turn into supersonic burning is called delayed detonation (4). The resulting chemical structures are dramatically different for deflagration (5) and delayed-detonation models (7). For a pure deflagration model, chemical clumps are expected to be present at all velocity layers that burning has reached (5). For delayed detonation, the detonation front propagates D 1 Physics Department, Texas A&M University, College Station, TX 77843–4242, USA. 2Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA. 3European Southern Observatory, Karl-Schwarzschild-Strasse 2, D-85748 Garching, Germany. *To whom correspondence should be addressed. E-mail: [email protected] 212 through the ashes left behind by deflagration, burns partially burned or unburned elements further into heavier elements, and erases the chemically clumpy structures generated by deflagration. The polarized emission from a supernova is caused by electron scattering in its ejecta. It is sensitive to the geometric structure of the ejecta (8). Electron scattering in an asymmetric ejecta would produce nonzero degrees of polarization (8). In continuum light, normal SNe Ia are only polarized up to about 0.3%, but polarization as high as 2% is found across some spectral lines (9–14). The polarization decreases with time and vanishes around 2 weeks past optical maximum. The low level of continuum polarization implies that the SN Ia photospheres are, in general, almost spherical. The decrease of polarization at later times suggests that the outermost zones of the ejecta are more aspherical than the inner zones. Similarly, the large polarization across certain spectral lines implies that the layers above the photosphere are highly aspheric and most likely chemically clumpy (12–15). In this study, we report polarimetry of 17 SNe Ia. These are all the SNe Ia for which we have premaximum polarimetry. The observations (Table 1) were collected with the 2.1-m Otto Struve Telescope of the McDonald Observatory of the University of Texas and with one of the 8.2-m Table 1. Identification, observing date, phase (in days relative to optical maximum light), telescope (McD indicates McDonald Observatory), intrinsic degree of polarization across the Si II 635.5-nm line, Dm15, and references for Dm15 values. Numbers in parentheses indicate 1-s error multiplied by 100. SN Date of observation Phase Telescope 1996X 1997bp 1997bq 1997br 1999by 2001V 2001el 2002bo 2002el 2002fk 2003W 2004dt 2004ef 2004eo 2005cf 2005de 2005df 14 April 1996 7 April 1997 25 April 1997 20 April 1997 9 May 1999 25 February 2001 26 September 2001 18 March 2002 14 August 2002 6 October 2002 31 January 2003 13 August 2004 11 September 2004 20 September 2004 1 June 2005 6 August 2005 8 August 2005 –4.2 –5.0 –3.0 –2.0 –2.5 –7.3 –4.2 –5.0 –6.4 –5.5 –4.5 –7.3 –4.1 –5.9 –9.9 –4.4 –4.3 McD 2.1 McD 2.7 McD 2.1 McD 2.1 McD 2.1 VLT VLT VLT VLT VLT VLT VLT VLT VLT VLT VLT VLT 12 JANUARY 2007 VOL 315 SCIENCE m m m m m www.sciencemag.org PSiII Dm15 References 0.50(20) 0.90(10) 0.40(20) 0.20(20) 0.40(10) 0.00(07) 0.45(02) 0.90(05) 0.72(09) 0.67(10) 0.64(10) 1.60(10) 1.10(30) 0.71(08) 0.44(05) 0.67(14) 0.73(05) 1.30(02) 1.29(08) 1.17(02) 1.09(02) 1.79(01) 0.73(03) 1.16(01) 1.34(03) 1.38(05) 1.19(05) 1.30(05) 1.21(05) 1.54(07) 1.46(08) 1.12(06) 1.41(06) 1.29(09) (19, 21) (19, 22) (19, 22) (19, 23) (24) (19, 25) (19, 26) (19, 27) (19) see text see text (28) see text (28) (28) see text see text Molecular & Biochemical Parasitology 161 (2008) 49–54 Contents lists available at ScienceDirect Molecular & Biochemical Parasitology Comparative analysis of trichomonad genome sizes and karyotypes Zuzana Zubáčová a , Zdeněk Cimbůrek b , Jan Tachezy a,∗ a b Department of Parasitology, Faculty of Science, Charles University in Prague, Viničná 7, Prague 128 44, Czech Republic Center of Flow Cytometry, Institute of Microbiology, Academy of Science of the Czech Republic, Prague 14200, Czech Republic a r t i c l e i n f o Article history: Received 22 December 2007 Received in revised form 2 June 2008 Accepted 7 June 2008 Available online 19 June 2008 Keywords: Trichomonas Genome size Cell size C-value Karyotype a b s t r a c t In parasitic protists, the genome sizes range from 2.9 Mb in Encephalitozoon cuniculi to about 160 Mb in Trichomonas vaginalis. The suprisingly large genome size of the former human parasite resulted from the expansion of various repetitive elements, specific gene families, and possibly from large-scale genome duplication. The reason for this phenomenon, as well as whether other trichomonad species have undergone a similar genome expansion, is not known. In this work we studied the genomes of nine selected species of the Trichomonadea group. We found that each species has a characteristic karyotype with a stable and haploid number of chromosomes. Relatively large genome sizes were found in all the tested species, although over a rather broad range (86–177 Mb). The largest genomes were typically observed in the Trichomonas and Tritrichomonas genera (133–177 Mb), while Tetratrichomonas gallinarum contains the smallest genome (86 Mb). The genome size correlated with the cell volume, however, no relationship between genome size and the site of infection or trichomonad phagocytic ability was observed. The data presented here provide primary information towards selecting a trichomonad species for future large-scale sequencing to elucidate the evolution of unusual parabasalid genomes. © 2008 Elsevier B.V. All rights reserved. 1. Introduction Trichomonads are anaerobic flagellated protists that belong to the Parabasala group. Most trichomonad species are the commensals of vertebrates and insects inhabiting the host intestinal tract [1–3]. However, they also include important pathogens such as Trichomonas vaginalis and Tritrichomonas foetus that cause sexually transmitted diseases in humans and cattle, respectively [4,5]. T. vaginalis is the first parabasalid for which the complete genome sequence was recently described [6]. This genome sequence analysis revealed that the size of the T. vaginalis genome is unusually large, at about 160 Mb [6], which is approximately sixfold larger than was originally estimated [7] and represents one of the largest genomes among parasitic protists. The gene prediction revealed about 60,000 putative genes, which are organized into 6 chromosomes [6,8,9]. The reason for such a genome expansion remains unclear, however, it has been hypothesized that it is a rather recent event associated with the transition of the T. vaginalis ancestor from an enteric to urogenital tract environment [6]. This transition included an increase in the cell volume, which may positively correlate with the DNA content [10]. It was further speculated that this larger cell volume might increase its phagocytic potential as well as ∗ Corresponding author. Tel.: +420 22195 1811; fax: +420 224 919 704. E-mail address: [email protected] (J. Tachezy). 0166-6851/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.molbiopara.2008.06.004 surface area for the parasite’s interaction with the host cell tissue [6], factors which are highly relevant to trichomonad pathogenicity. To test these hypotheses and contribute to our knowledge of trichomonad genome organization, we determined and compared genome sizes, cell and nuclei sizes as well as karyotypes among selected trichomonad species including T. vaginalis, Trichomonas tenax, Tetratrichomonas gallinarum, Pentatrichomonas hominis, Tritrichomonas foetus, Tritrichomonas augusta, Monocercomonas colubrorum, Trichomitus batrachorum and Hypotrichomonas acosta, which represent parasites or commensals of the oral cavity, intestine and urogenital tract in various vertebrates, including humans (Table 1). 2. Materials and methods 2.1. Organisms and cultivation Axenic cultures of the following organisms were used in the study: T. vaginalis strains G3 [American Type Culture Collection (ATCC) PRA-98] and T1 (provided by J.-H. Tai, Institute of Biomedical Sciences, Taipei, Taiwan), T. tenax strain Hs-4:NIH (ATCC 30207), T. foetus strains KV-1 (ATCC 30924) and LUB-1 MIP [11], T. augusta strain T-37 (ATCC 30077), T. gallinarum strain M3 [12], P. hominis strain GII (isolated from human by Gibora 1981, Slovak Republic), M. colubrorum strain R183 [1], T. batrachorum strain BUB (isolated from Bufo bufo by Kulda 1991, Czech Republic), H. acosta strain L3 50 Z. Zubáčová et al. / Molecular & Biochemical Parasitology 161 (2008) 49–54 Table 1 Characterization of selected trichomonad species Species Host Habitat Phagocytosis Pathogenicity Trichomonas vaginalis Trichomonas tenax Tritrichomonas foetus Ttritrichomonas augusta Tetratrichomonas gallinarum Pentatrichomonas hominis Monocercomonas colubrorum Trichomitus batrachorum Hypotrichomonas acosta Human Human Cattle Amphibian, reptile Galliform and anseriform birds Mammals including human Reptile Amphibian Reptile, chelonian Urogenital tract Oral cavity, respiratory tract Urogenital tract Intestine Intestine Intestine Intestine Intestine Intestine Yes [6] Yes [32] Limited [19,33] Not observed [34] Yes [35] Yes [36] Not observed [37]c Yes [37] Yes [38] Yes Noa Yes No No Nob No No No a b c Bronchopulmonary infections by T. tenax have been reported in patients with cancer or other lung diseases [16–18,39]. An unusual case of pulmonary infection caused by P. hominis has been reported [40]. Čepička, unpublished observation. (ATCC 30069). All organisms were grown in Diamond’s TryptoneYeast extract-Maltose medium (TYM) at pH 6.2 for T. vaginalis and pH 7.2 for the others, supplemented with 10% (v/v) heat-inactivated horse serum (Gibco-BRL) and ferrous ammonium citrate [13]. Trichomonads isolated from humans, cattle and birds were cultivated at 37 ◦ C, the other species at 25 ◦ C. 2.2. Chromosome preparation Metaphase chromosomes from colchicine-treated cells were obtained by a modified spreading technique as described in [8,14]. Briefly, trichomonads were grown to the mid-log phase until the cell density reached 0.5 to 1.5 × 106 cells per ml, depending on the trichomonad species. Colchicine was then added to 5 ml of culture to reach a final concentration of 0.5–1 mM and the cells were incubated for 8 h. After incubation, the cells were harvested by centrifugation at 1000 × g for 10 min. The pellets were washed once with phosphate-buffered saline (PBS) pH 7.3 and hypothonized in 75 mM potassium chloride (KCl) for 15–20 min at room temperature. The suspensions were centrifuged at 1000 × g 10 min and the cells in pellets were fixed twice with freshly prepared Carnoy fixation (ethanol–chloroform–acetic acid in the rate of 6: 3: 1). First, the cells were fixed for 2 h at 4 ◦ C in 10 ml of the fixative. Then the cells were spin down at 1300 × g for 10 min and fixed in 0.5–1 ml of the fixative overnight at 4 ◦ C. Aliquots of the final suspension in the fixative solution were dropped onto microscope slides from a height of about 100 cm. The spreads were dried at 45 ◦ C for 1 h. Chromosomes were stained with 4 ,6-diamidino-2-phenylindole (DAPI) (0.5 (g/ml) in Vectashield mounting medium (VectorLabs). Preparations were observed using an IX81 microscope, photographed with an IX2-UCB camera and processed using CellR̂ software (Olympus). The metaphase figures were inspected from at least five independent experiments. At least 50 metaphase figures per isolate were evaluated. 2.3. Estimation of cell and nucleus volume and genome size correlation relationships between genome sizes and cell volumes, and nuclei volumes. 2.4. Flow cytometry Cells in the logarithmic growth phase were stained with Hoechst 33342 (Molecular Probes) (final concentration 5 (g/ml) for 30 min at 37 ◦ C (T. vaginalis, T. tenax, T. foetus, T. gallinarum, P. hominis) or at 25 ◦ C (M. colubrorum, T. batrachorum, T. augusta, H. acosta). The relative fluorescence intensities from Hoechst-stained nuclei were measured and analyzed in an LSR II cytometer (Becton Dickinson, San Jose, USA). Before the measurement, propidium iodide (Sigma) was added to the sample (final concentration 5 (g/ml) to stain the non-living cells. Hoechst 33342 was excited at 406 nm, emission was detected using a 450/50 filter; 561 nm was used for propidium iodide excitation and emission was detected using a 610/20 filter. The flow rate was about 500 fluorescent events per second in individual samples. Data analysis was performed using FlowJo 8.3.3. software (TreeStar, Oregon, USA). At least 10,000 living cells were monitored for each histogram. The cytometer was calibrated with reference to the genome size of the T. vaginalis G3 strain [6]. 3. Results 3.1. Karyotype analysis The chromatin of all studied trichomonads that were arrested during metaphase of mitosis by colchicine condensed into well-structured chromosomes (Fig. 1). Constant number of chromosomes was observed in more than 95% figures in all species using DAPI-stained metaphase preparations (Table 2). Observed deviations (smaller number of chromosomes) in less than 5% of figures were considered to be artifacts of spreading technique [14,15]. All trichomonads displayed a single set of chromosomes, Table 2 Chromosome numbers determined in nine trichomonad species Species The dimensions of cells and nuclei were measured from digital microphotographs of living cells using an Olympus BX51 microscope equipped with an IX2-UCB camera. The cells were attached to microscope slides that were covered with poly-l-lysine (Sigma), and observed using a combination of differential interference contrast illumination and fluorescence of Hoechst 33342-stained nuclei. Fifty non-dividing cells of each strain were measured using the program QuickPHOTO MICRO 2.2 (Promicra). Approximate cell and nucleus volumes (V) were calculated according to the formula V = 3/4 ab2 , where “a” is the cell length, and “b” is the cell width. Least-squares linear regression analyses were performed to test for Trichomonas vaginalis Trichomonas tenax Tetratrichomonas gallinarum Pentatrichomonas hominis Tritrichomonas foetus Tritrichomonas augusta Monocercomonas colubrorum Trichomitus batrachorum Hypotrichomonas acosta Chromosome number This studya Published data 6 6 5 6 5 5 4 6 5 6 [8,9] 3–6 [41] Not available 5–6 [42] 5 [14] 5 [43], 4–8 [44] Not available 6 [42,45], 4–5 [46], 4–8 [44] Not available a The metaphase chromosome spreads from colchicine-treated cells were stained with DAPI and counted using fluorescence microscopy. Z. Zubáčová et al. / Molecular & Biochemical Parasitology 161 (2008) 49–54 51 Fig. 1. Karyotypes of representative species from Trichomonadidae [(A) Trichomonas vaginalis, (B) Trichomonas tenax, (C) Pentatrichomonas hominis, (D) Tetratrichomonas gallinarum], Tritrichomonadidae [(E) Tritrichomonas augusta, (F) Tritrichomonas foetus], Monocercomonadidae [(G) Monocercomonas colubrorum, (H) Hypotrichomonas acosta], and Trichomitidae [(I) Trichomitus batrachorum] groups. In metaphase of mitosis, only haploid chromosome sets were observed in all species. Each chromosome consists of two sister chromatides. Arrows indicate chromatide constrictions. Scale bar is 10 m. each with two chromatids (Fig. 1). The chromatids were organized in parallel with one or two constrictions observed in some pairs of chromatids in each of the species examined. Although the chromosomes were generally minute (0.9–2.4 m in T. vaginalis), they differed in morphology and size, producing a specific characteristic pattern for each species (Fig. 1 and Table 2). Homologous chromosome pairs, the tetrads typically found in the first meiotic division of other eukaryotes, were not observed in our specimens. These results indicated that all selected species possess a stable, haploid chromosome set. 3.2. Genome sizes Trichomonad cells in the G1-phase of the cell cycle were used to estimate DNA content by means of flow cytometry (Fig. 2). The nuclei were stained with the DNA-specific fluorescent dye Hoechst 33342. At this stage, the DNA was organized as fine granules, which were distributed homogenously within the nuclei except the sites of nucleoli (Fig. 2). The plasma membrane of all trichomonad species used in the study was highly permeable to this dye, no autofluorescence was observed (data not shown). T. vaginalis, the genome of which is completely sequenced [6], was used as a standard for calibrating the cytometer. The other protists with known genome sizes (Saccharomyces cerevisiae, Eimeria tenella, Giardia intestinalis and Trypanosoma brucei) appeared to be unsuitable for the calibration due to smaller sizes of their genomes and/or differences in nucleotide preferences (data not shown). Fluorescence intensities corresponding to DNA content revealed relatively large genomes in all nine trichomonad species compared to the genomes of other parasitic protists (Table 3). The largest genome sizes were determined in the species of the Trichomonas and Tritrichomonas genera, ranging from 133 to 177 Mb. The genomes of species from the other trichomonad genera displayed considerably smaller genomes, ranging from 86 to 114 Mb. 52 Z. Zubáčová et al. / Molecular & Biochemical Parasitology 161 (2008) 49–54 Fig. 2. Measurement of nuclear DNA amount (genome size) in trichomonads by flow cytometry. (A) Histogram of fluorescence intensity corresponding to Hoechst 33342stained DNA in living cells of Tritrichomonas foetus. The G1-phase population was used for genome size estimation and (B) Hoechst 33342-stained nuclei of G1-phase Tritrichomonas foetus. The smallest genome was found in the bird commensal T. gallinarum, which is almost 100 Mb smaller than the largest genome, belonging to the cattle parasite T. foetus. 3.3. Correlation between genome sizes and sizes of cells and nuclei Published data concerning trichomonad cell sizes were determined by various methods, which made direct comparison between trichomonad species difficult. Thus, we first determined the sizes of trichomonad cells and their nuclei under standard conditions and calculated their approximate volumes (Table 4). The largest cells as well as the largest nuclei were exhibited by the Table 3 Genome sizes (C-values) determined in studied trichomonad species by means of flow cytometry Species, strain Genome size (Mb) mean ± S.D. (n)b Trichomonas vaginalis, G3 Trichomonas vaginalis, T1 Trichomonas tenax, Hs-4 Tritrichomonas foetus, KV-1 Tritrichomonas foetus, LUB Tritrichomonas augusta, T-37 Tetratrichomonas gallinarum, M3 Pentatrichomonas hominis, GII Monocercomonas colubrorum, R183 Trichomitus batrachorum, BUB Hypotrichomonas acosta, L3 158–166a 154 ± 9 (8) 133 ± 4 (3) 177 ± 11 (4) 177 ± 12 (4) 165 ± 6 (3) 86 ± 10 (4) 94 ± 8 (3) 114 ± 17 (4) 125 ± 11 (3) 114 ± 16 (4) Species Genome size (Mb) Toxoplasma gondii Trypanosoma brucei Plasmodium falciparum Giardia intestinalis Encephalitozoon cuniculi 80 [47] 35 [48] 23 [49] 12 [15] 2.9 [50] Known genome sizes of selected parasitic protists are included for comparison. a Trichomonas vaginalis G3 strain with genome size value of 160 Mb (the precise genome size was reported as a range from 158–166 Mb) [6] was used as an external standard for calibration of the flow cytometer. b S.D.: standard deviation; n: number of independent experiments. Fig. 3. Correlation between trichomonad genome sizes (Mbp) and cell and nucleus volumes (m3 ). Genome sizes were positively correlated with cell volumes (R2 = 0.5) as well as nucleus volumes (R2 = 0.8) in all tested species. The numbers in the plot correspond to genome sizes. species of the Trichomonas and Tritrichomonas genera, which correspond to the presence of the largest genomes (Table 3). Similarly, P. hominis with the smallest cell and nucleus size possesses one of the smallest genomes. Genome sizes were positively correlated with cell volumes (R2 = 0.5) as well as nuclei volumes (R2 = 0.8) (Fig. 3). 4. Discussion The data presented in this study revealed considerably large genomes in all the selected trichomonad species in comparison to other parasitic protists (Table 3). The genomes were organized into four to six chromosomes forming a stable characteristic karyotype for each species. The difference in published chromosome numbers for some species (Table 2) likely attributed to the much lower limit of resolution of light microscopy available for distinguishing the small trichomonad chromosomes in these studies, conducted between 1909 and 1931. The sizes of the genomes ranging from 86 to 177 Mb were neither directly related to the trichomonad’s pathogenicity, nor their phagocytic potential, nor the site of the infection. Although the largest genomes were found in the pathogenic species T. vaginalis and T. foetus, which inhabit the urogenital tract of the host (160–177 Mb), a comparable genome size was obtained for T. augusta (165 Mb), which is a commensal of the reptile intestine. In addition, a relatively large genome was also found in T. tenax (133 Mb), which is a commensal of the human oral cavity, although its role as an etiologic agent in extraoral infections cannot be excluded [16–18]. As far as phagocytic potential is concerned, tritrichomonads are not able to ingest host cells and only limited phagocytosis of cell debris was recently observed [19], although their cell volume as well as genome size is comparable with Trichomonas species, for which phagocytosis is a characteristic mechanism for importing external material including whole bacteria, erythrocytes or leukocytes [20]. In other words, the largest genomes were found Z. Zubáčová et al. / Molecular & Biochemical Parasitology 161 (2008) 49–54 53 Table 4 Cell dimensions and approximate volumes of selected trichomonad species Species Cell Nucleus Length (m) mean ± S.D. Trichomonas vaginalis Trichomonas tenax Tritrichomonas foetus Tritrichomonas augusta Tetratrichomonas gallinarum Pentatrichomonas hominis Hypotrichomonas acosta Monocercomonas colubrorum Trichomitus batrachorum 14.8 14.2 13.28 16.71 12.01 6.92 11.2 8.82 12.38 ± ± ± ± ± ± ± ± ± 1.6 1.92 2 2.39 0.92 0.92 1.54 2.08 1.58 Width (m) mean ± S.D. 9.8 10 9.15 11.8 8.58 4.53 7.99 5.92 8.69 ± ± ± ± ± ± ± ± ± 1.0 1.79 1.43 1.94 0.93 0.72 1.09 1.17 0.98 a 3 Volume (m ) Length (m) mean ± S.D. 3.3 3.3 2.7 5.4 2 0.3 1.7 0.7 2.2 5.63 4.32 5.19 5.25 3.13 2.41 3.05 2.88 3.58 ± ± ± ± ± ± ± ± ± 0.59 0.51 0.68 0.87 0.46 0.42 0.40 0.55 0.42 Width (m) mean ± S.D. 3.00 2.56 3.47 3.58 2.47 1.82 1.97 1.89 2.78 ± ± ± ± ± ± ± ± ± 0.50 0.44 0.41 0.60 0.32 0.38 0.30 0.42 1.50 Volumea (m3 ) 0.12 0.06 0.15 0.16 0.04 0.02 0.03 0.02 0.07 Average of 50 specimens ± standard deviation. a The cell volume was calculated according to the formula V = 3/4 ab2 ; “a” is the cell length, “b” is the cell width. in all the tested species of Trichomonas and Tritrichomonas genera, without an apparent relationship to their parasitic style of life. Next we considered the sizes of the genomes and phylogenetic relationships between the studied trichomonads. As is apparent from Fig. 4, both Trichomonadinae and Tritrichomonadinae groups include genera from both ends of the genome size range. For example, P. hominis and T. gallinarum displayed considerably smaller genomes, although they belong to the Trichomonadinae group together with Trichomonas species. Similarly, M. colubrorum displayed a considerably smaller genome than Tritrichomonas species, although M. colubrorum was shown to be closely related to T. foetus in various phylogenetic reconstructions [1,21]. Therefore, this genome amplification (or reduction) may have happen independently within Trichomonadinae and Tritrichomonadinae groups. The only significant correlation observed was between trichomonad genome sizes and the overall volumes of the cells as well as their nuclei. Such a correlation was reported in all groups of organisms from protists to vertebrates, although the exact causes of this phenomenon are not clear [10]. It has been shown, however, that an increase in a haploid genome (C-value) of large cells correlates with an increase in the amount of non-coding DNA, including introns, pseudogenes and transposable elements, while a number of genes remain comparable over a broad taxonomic scale. Consequently, an increase in non-coding DNA leads to a discrepancy between the genome size and gene number, a phenomenon which is also known as the C-value paradox [22,23]. For example in the human genome, the protein coding genes comprise only 1.5% of the genome, while the contribution of introns is over 25% [24]. Interestingly, introns are not involved in the expansion of the T. vaginalis genome, in which only 65 introns were identified [6]. In this organism, the genome expansion mainly involved the amplification of repeats and transposable elements, massive expansion of certain gene families, and possibly large-scale genome duplications [6]. The last two events may led to a multiplication of the majority of genes, which is reflected by the unusual high gene number, reaching 60,000 in the T. vaginalis genome. The unusual level of the gene multiplicity in the trichomonad genome and large genome sizes might be related to the unusual evolutionary history of the Parabasala group. Previously, monocercomonads were suggested to be an ancient parabasalid lineage based on their morphological simplicity, from which trichomonads with a more complex karyomastigont (the organellar system, which includes a nucleus attached to a mastigont consisting of flagella, central cytoskeletal structures, and a parabasal body) have evolved [3,25]. This simple-to-complex scenario, however, was not supported by molecular phylogeny and the absence of various cytoskeletal structures in monocercomonads was most likely the result of multiple loss in several trichomonad clades [1]. Suprisingly, several recent phylogenetic studies placed trichonymphids, a group of organisms with very complex cytoskeletal structures, at the root position of parabasalids [26–28] (Fig. 4). These organisms, living in the digestive tract of termites and wood-feeding cockroaches, are characterized by the large cell size (about 100 times larger than species of the Trichomonadida group [3,29]), the Fig. 4. Draft of phylogenetic relationships within Parabasala group [21,26–28]. The taxa included in this study are shown in bold. 54 Z. Zubáčová et al. / Molecular & Biochemical Parasitology 161 (2008) 49–54 multiple mastigonts, and the large nucleus with large number of chromosomes [3,29]. For example, the nucleus of Trichonympha major is about 17.6 m in transversal diameter with 24 haploid chromosomes [30]. As the karymastigont reproduces as a unit structure [31], it is tempting to speculate that the trichonymphid ancestor originally possessed complete multiple karyomastigonts including nuclei as observed for example in Calonymphidae [3,29], and later the multiple nuclei were transformed or somehow contributed to the formation of a single large hypermastigid nucleus with presumably multiplicated genome. If the ancestral position of trichonymphids is correct, we can further speculate that the genome sizes observed in the studied trichomonad species might result from a regressive evolution, which includes DNA content decrease, simplification of the mastigont and decrease in cell size. To elucidate the evolution of the parabasalid genome, more work on large-scale genomics is required. In particular, it would be interesting to compare the genomes or selected genome markers in organisms with total DNA contents at opposite ends of the spectrum. Based on our results, P. hominis seems to be an interesting trichomonad species for such studies: (i) this organism displayed one of the smallest genome in the range of selected trichomonad species (94 Mb), in which less repeates and transposable elements might be expected, (ii) it is living in the host intestine, an original environment from which an ancestor of T. vaginalis most likely evolved, thus comparison with T. vaginalis genome may provide new insight concerning events that were responsible for genome size changes associated with transition of trichomonads from an enteric environment to the urogenital tract [6] and (iii) as P. hominis is mostly harmless organism, the comparison with T. vaginalis genome may help to pinpoint the gene families whose amplification is related to T. vaginalis pathogenicity. Acknowledgements This work was supported by Czech Ministry of Education, Youth and Sport, project MSM0021620858 and LC07032 (J.T.), and the Grant Agency of Charles University 180/2006/B-BIO/PrF (Z.Z.). References [1] Hampl V, Cepicka I, Flegr J, et al. Morphological and molecular diversity of the monocercomonadid genera Monocercomonas, Hexamastix, and Honigbergiella gen. nov. Protist 2007;158:365–83. [2] Gerbod D, Sanders E, Moriya S, et al. Molecular phylogenies of Parabasalia inferred from four protein genes and comparison with rRNA trees. Mol Phylogenet Evol 2004;31:572–80. [3] Brugerolle G, Lee JJ. Phylum Parabasalia. In: An illustrated guide to the protozoa. 2nd ed. Lawrence, Kansas: Allen Press Inc.; 2001. p. 1196–1211. [4] Schwebke JR, Burgess D, Trichomoniasis. Clin Microbiol Rev 2004;17:794–803. [5] Rae DO, Crews JE. Tritrichomonas foetus. Vet Clin North Am Food Anim Pract 2006;22:595–611. [6] Carlton JM, Hirt RP, Silva JC, et al. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 2007;315:207–12. [7] Wang AL, Wang CC. Isolation and characterization of DNA from Tritrichomonas foetus and Trichomonas vaginalis. Mol Biochem Parasitol 1985;14:323–35. [8] Drmota T, Kral J. Karyotype of Trichomonas vaginalis. Eur J Protistol 1997;33:131–5. [9] Yuh YS, Liu JY, Shaio MF. Chromosome number of Trichomonas vaginalis. J Parasitol 1997;83:551–3. [10] Gregory TR. Coincidence, coevolution, or causation? DNA content, cell size, and the C-value enigma. Biol Rev Camb Philos Soc 2001;76:65–101. [11] Tachezy J, Kulda J, Bahnikova I, et al. Tritrichomonas foetus: iron acquisition from lactoferrin and transferrin. Exp Parasitol 1996;83:216–28. [12] Kutisova K, Kulda J, Cepicka I, et al. Tetratrichomonads from the oral cavity and respiratory tract of humans. Parasitology 2005;131:309–19. [13] Diamond L. The establishment of various trichomonads of animals and man in axenic cultures. J Parasitol 1957;43:488–90. [14] Xu WD, Lun ZR, Gajadhar A. Chromosome numbers of Tritrichomonas foetus and Tritrichomonas suis. Vet Parasitol 1998;78:247–51. [15] Tumova P, Hofstetrova K, Nohynkova E, et al. Cytogenetic evidence for diversity of two nuclei within a single diplomonad cell of Giardia. Chromosoma 2007;116:65–78. [16] Shiota T, Arizono N, Morimoto T, et al. Trichomonas tenax empyema in an immunocompromised patient with advanced cancer. Parasite 1998;5: 375–7. [17] Lewis KL, Doherty DE, Ribes J, et al. Empyema caused by Trichomonas. Chest 2003;123:291–2. [18] Duboucher C, Barbier C, Beltramini A, et al. Pulmonary superinfection by trichomonads in the course of acute respiratory distress syndrome. Lung 2007;185:295–301. [19] Benchimol M, Rosa ID, Fontes RD, et al. Trichomonas adhere and phagocytose sperm cells: adhesion seems to be a prominent stage during interaction. Parasitol Res 2008;102:597–604. [20] Rendon-Maldonado JG, Espinosa-Cantellano M, Gonzalez-Robles A, et al. Trichomonas vaginalis: in vitro phagocytosis of lactobacilli, vaginal epithelial cells, leukocytes, and erythrocytes. Exp Parasitol 1998;89:241–50. [21] Ohkuma M, Saita K, Inoue T, et al. Comparison of four protein phylogeny of parabasalian symbionts in termite guts. Mol Phylogenet Evol 2007;42: 847–53. [22] Taft RJ, Pheasant M, Mattick JS. The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays 2007;29:288–99. [23] Pagel M, Johnstone RA. Variation across species in the size of the nuclear genome supports the junk-DNA explanation for the C-value paradox. Proc Biol Sci 1992;249:119–24. [24] Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature 2001;409:860–921. [25] Honigberg BM. Evolutionary and systematic relationships in the flagellate order Trichomonadida Kirby. J Protozool 1963;10:20–63. [26] Viscogliosi E, Philippe H, Baroin A, et al. Phylogeny of trichomonads based on partial sequences of large subunit ribosomal-RNA and on cladistic analysis of morphological data. J Eukaryot Microbiol 1993;40:411–21. [27] Dacks JB, Redfield RJ. Phylogenetic placement of Trichonympha. J Eukaryot Microbiol 1998;45:445–7. [28] Ohkuma M, Ohtoko K, Grunau C, et al. Phylogenetic identification of the symbiotic hypermastigote Trichonympha agilis in the hindgut of the termite Reticulitermes speratus based on small-subunit rRNA sequence. J Eukaryot Microbiol 1998;45:439–44. [29] Carpenter KJ, Keeling PJ. Morphology and phylogenetic position of Eucomonympha imla (Parabasalia: Hypermastigida). J Eukaryot Microbiol 2007;54:325–32. [30] Bobyleva NN. Morphology and evolution of intestinal parasitic flagellates of the far-eastern roach Cryptocercus relictus. Acta Protozool 1975;14:109–60. [31] Margulis L, Dolan MF, Guerrero R. The chimeric eukaryote: origin of the nucleus from the karyomastigont in amitochondriate protests. Proc Natl Acad Sci USA 2000;97:6954–9. [32] Ribaux CL, Magloire H, Joffre A. Additional data on the ultrastructural study of Trichomonas tenax—intracellular localization of acid phosphatase. J Biol Buccale 1980;8:213–28. [33] Hibler CP, Hammond DM, Caskey FH, et al. The morphology and incidence of the trichomonads of swine, Tritrichomonas suis (Gruby and Delafond), Tritrichomonas rotunda, nsp and Trichomonas buttreyi, nsp. J Protozool 1960;7:159–71. [34] Kulda J. Tritrichomonas augusta (Alexeieff 1911) as a parasite of some lizards in Czechoslovakia. Acta Soc Zool Bohem 1957;21:209–24. [35] Mcdowell SA. Morphological and taxonomic study of the caecal protozoa of the common fowl, Gallus gallus L. J Morphol 1953;92:337–99. [36] Kirby H. The structure of the common intestinal trichomonad of man. J Parasitol 1945;31:163–75. [37] Moskowitz N. Observations on some intestinal flagellates from reptilian host (Squamata). J Morphol 1951;89:257–321. [38] Mattern CFT, Daniel WA, Honigber BM. Structure of Hypotrichomonas acosta (Moskowitz) (Monocercomonadidae, Trichomonadida) as revealed by electron microscopy. J Protozool 1969;16:668–85. [39] Mallat H, Podglajen I, Lavarde W, et al. Molecular characterization of Trichomonas tenax causing pulmonary infection. J Clin Microbiol 2004;42:3886–7. [40] Jongwutiwes S, Silachamroon U, Putaporntip C. Pentatrichomonas hominis in empyema thoracis. Trans R Soc Trop Med Hyg 2000;94:185–6. [41] Hinshaw HC. On the morphology and mitosis of Trichomonas buccalis (Goodey). Univ Cal Publ Zool 1926;29:159–74. [42] Bishop A. The morphology and method of division of Trichomonas. Parasitology 1931;23:129–56. [43] Kofoid CA. Mitosis and multiple fission in trichomonad flagellates. Proc Am Acad Arts Sci 1915;51:289–378. [44] Kuczynski MH. Untersuchungen an Trichomonaden. Arch Protistenkd 1914;33:119–204. [45] Dobell CC. Researches on the intestinal protozoa of frogs and toads. Quart J Micro Sci 1909;53:201–66. [46] Grassé PP. Contribution a lı̌étude des flagellés parasites. Arch Zool Exp Gen 1926;65:345–602. [47] Kissinger JC, Gajria B, Li L, et al. ToxoDB: accessing the Toxoplasma gondii genome. Nucl Acids Res 2003;31:234–6. [48] Berriman M, Ghedin E, Hertz-Fowler C, et al. The genome of the African trypanosome Trypanosoma brucei. Science 2005;309:416–22. [49] Abrahamsen MS, Templeton TJ, Enomoto S, et al. Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 2004;304:441–5. [50] Katinka MD, Duprat S, Cornillot E, et al. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 2001;414:450–3. Molecular & Biochemical Parasitology 175 (2011) 30–38 Contents lists available at ScienceDirect Molecular & Biochemical Parasitology Microsatellite polymorphism in the sexually transmitted human pathogen Trichomonas vaginalis indicates a genetically diverse parasite夽 Melissa Conrad a , Zuzana Zubacova b , Linda A. Dunn c , Jacqui Upcroft c , Steven A. Sullivan a , Jan Tachezy b , Jane M. Carlton a,∗ a b c Department of Medical Parasitology, New York University Langone Medical Center, New York, NY 10010, USA Department of Parasitology, Faculty of Science, Charles University in Prague, Vinicná 7, Prague 128 44, Czech Republic Queensland Institute of Medical Research, Brisbane, QLD 4006, Australia a r t i c l e i n f o Article history: Received 3 July 2010 Received in revised form 19 August 2010 Accepted 23 August 2010 Available online 9 September 2010 Keywords: Population genetics Trichomonas vaginalis Sexually transmitted infection Microsatellite FISH Genetic markers a b s t r a c t Given the growing appreciation of serious health sequelae from widespread Trichomonas vaginalis infection, new tools are needed to study the parasite’s genetic diversity. To this end we have identified and characterized a panel of 21 microsatellites and six single-copy genes from the T. vaginalis genome, using seven laboratory strains of diverse origin. We have (1) adapted our microsatellite typing method to incorporate affordable fluorescent labeling, (2) determined that the microsatellite loci remain stable in parasites continuously cultured for up to 17 months, and (3) evaluated microsatellite marker coverage of the six chromosomes that comprise the T. vaginalis genome, using fluorescent in situ hybridization (FISH). We have used the markers to show that T. vaginalis is a genetically diverse parasite in a population of commonly used laboratory strains. In addition, we have used phylogenetic methods to infer evolutionary relationships from our markers in order to validate their utility in future population analyses. Our panel is the first series of robust polymorphic genetic markers for T. vaginalis that can be used to classify and monitor lab strains, as well as provide a means to measure the genetic diversity and population structure of extant and future T. vaginalis isolates. © 2010 Elsevier B.V. All rights reserved. 1. Introduction Trichomoniasis, caused by the parasitic protist Trichomonas vaginalis, is the most prevalent non-viral sexually transmitted infection (STI) worldwide, with an estimated 174 million new cases occurring each year. Approximately five million of these infections occur within the United States [1], and 154 million occur in resource-limited settings [2]. As many as one-third of female infections and the majority of male infections are asymptomatic, often causing trichomoniasis to be considered a self-clearing female Abbreviations: HE , expected heterozygosity; PCR, polymerase chain reaction; FISH, fluorescent in situ hybridization; MS, microsatellite; SNP, single nucleotide polymorphism; SCG, single-copy gene. 夽 Note: Nucleotide sequence data reported in this paper are available in the EMBL, GenBank and DDBJ databases under the accession numbers: HM365085–HM365208, DQ321767, DQ321764, XM 001301638, XM 001304496, XM 001310009, XM 001315268, XM 001317855, XM 001318124, XM 001320341, XM 001321129, XM 001321414, XM 001321947, XM 001322737, XM 001324024, XM 001324948, XM 001329311, XM 001329418, XM 001329432, XM 001580544, XM 001581132, XM 001581222, XM 001582251, and XM 001583217. ∗ Corresponding author at: Department of Medical Parasitology, 341 East 25th Street, New York, NY 20010, USA. Tel.: +1 212 263 4377, fax: +1 212 263 8116. E-mail address: [email protected] (J.M. Carlton). 0166-6851/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.molbiopara.2010.08.006 ‘nuisance’ disease [3]. However, trichomoniasis has been associated with a number of significant reproductive health sequelae, including pelvic inflammatory disease [4] and adverse pregnancy outcomes, such as premature rupture of membranes, preterm delivery, and low birth-weight [5,6]. Most importantly, it has been implicated in increasing sexual transmission of HIV up to twofold [7,8]. In light of these findings, understanding the transmission dynamics, strain virulence, and pathology of T. vaginalis infection is critically important for protecting patients – particularly female patients – from serious disease and threats to reproductive health. However, very little is known about T. vaginalis genetic diversity or population structure. Antigenic characterization, isoenzyme analysis, repetitive sequence hybridization, restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), pulsed-field gel electrophoresis, and sequence polymorphism in ribosomal RNA genes and intergenic regions have been used to type T. vaginalis isolates in an attempt to find correlations between genotypes and biologically relevant phenotypes or geographical distribution [9–16]. Instead of providing clarification, these studies have yielded discordant results for a number of different phenotypes including metronidazole resistance and geographical distribution [9,10,13,14], the presence of a linear, double-stranded RNA virus known as TVV [10,13], clin- M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 ical manifestation of infection in patients [12,13], and virulence [9,10,13]. Multilocus genotyping of bacterial and eukaryotic pathogens has been used successfully to describe population diversity, delimit species, identify genetic components of important clinical phenotypes, and track the spread of epidemics [17–19]. While RAPD and RFLP techniques from the pre-genomics era have produced significant amounts of low-cost multilocus data with relative ease, microsatellite (MS) loci, also known as short tandem repeats, have been found to be superior tools for most applications [20,21]. These genetic markers are tandemly repeated 2–6 bp DNA sequences that display length polymorphism due to changes in the number of repeat units, and are frequently multi-allelic at each locus. They are expressed co-dominantly and are abundant throughout eukaryotic genomes. Crucially, they are generally considered to be neutral alleles that have not evolved under selective pressure, making them an ideal tool for determining population history. Finally, because of their high diversity, sensitivity, and multilocus nature, they also allow for improved accuracy in detection of mixed genotype infections [22]. MS markers have been an important means for studying population genetics of a number of eukaryotic parasites [17,23–26], having been used to characterize population structure in Leishmania tropica [27], Trypanosoma cruzi [28], T. brucei gambiense [29], Plasmodium falciparum [23], P. vivax [30], and Toxoplasma gondii [26]; to construct a genetic map for Plasmodium falciparum locating regions involved in such phenotypes as drug resistance [31]; and to determine the origins and global dispersal of drug-resistant parasite strains [32]. Single nucleotide polymorphisms (SNPs) are another type of genetic marker useful in molecular population genetics. Like MS markers, they can be highly prevalent and widely distributed within genomes; in contrast to MS, these markers are usually bi-allelic due to low mutation rates, which reduce the risk of homoplasy and simplify mutation models. These characteristics make them especially suited for certain analyses, such as the inference of phylogenies [33], particularly when they are localized within single-copy genes (SCG), although careful selection of SCG SNP markers is important to prevent gene-specific evolutionary history being mistaken for population evolutionary history. This danger persists despite the development of sophisticated statistical methods that are robust to a number of confounding influences on phylogenetic analyses [34]. Therefore, MS and SCG markers must be validated before they can be adopted as molecular tools for understanding the diversity and population structure of the parasite under study. A genome sequence is an ideal resource for anyone intending to mine genetic loci for use as markers in population genetic studies, especially of an organism refractory to classical genetics. The draft T. vaginalis genome, published in 2007 [35], is a ∼160 MB assembly with a core set of ∼60,000 protein-coding genes. Even at 7.2× coverage, the assembly remains highly fragmented, with ∼74,000 contigs representing the parasite’s six chromosomes. The poor quality of the assembly is due to such complicating factors as an apparently rapid, massive expansion of gene families, and the highly repetitive nature (>65%) of the genome, which rendered in silico genome assembly challenging. However, this superabundance of repeats suggests a genomic plenitude of MSs that can be exploited, while also suggesting a comparative lack of singlecopy genes. Here we report the identification and characterization of 86 microsatellite markers and 21 single-copy genes from the genome of T. vaginalis, from which a panel of 27 polymorphic genetic markers (21 MS markers and 6 SCG markers) were developed using seven common laboratory strains of diverse geographical origin, and six sub-strains. We have adapted our MS typing method to incorporate a simple and affordable fluorescent labeling protocol, 31 thereby drastically reducing the cost of genotyping and making the technique readily available to members of the research community. To validate the usefulness of the markers, we have measured the stability and distribution of the MS, and used phylogenetic and minimum-spanning network analyses to compare the evolutionary relationships inferred by our panel of markers. Ours is the first robust series of polymorphic genetic markers for the T. vaginalis parasite that can be used to classify and monitor lab strains, as well as provide a means to measure the genetic diversity and population structure of T. vaginalis field isolates. 2. Materials and methods 2.1. Parasite strains and DNA extraction A list of the seven geographically widespread T. vaginalis laboratory strains used in this study is shown in Table 1. Two strains, BRIS/92/HEPU/B7268 and BRIS/92/HEPU/F1623 (abbreviated as B7268 and F1623, respectively) were cultured under drug selection for a minimum of 255 and 530 days, respectively, and parasites were sampled at various time points during their continuous culture. DNA extraction for all strains was performed using modification of a published protocol [36]. Approximately 1.5 × 108 cells from late log phase cultures were harvested at 3000 rpm for 10 min at room temperature. The cell pellet was resuspended in 0.1× UNSET lysis buffer (8 M urea, 2% sarkosyl, 0.15 M NaCl, 0.001 M EDTA, 0.1 M Tris–HCl pH 7.5), and the lysate immediately extracted twice with an equal volume of phenol:chlorophorm:isoamyl alcohol (25:24:1) and once with an equal volume of chloroform:isoamyl alcohol (24:1). Nucleic acids were precipitated with 0.6 volumes of ice-cold isopropanol for 30 min at room temperature or overnight at 4 ◦ C. Following centrifugation at 12000 rpm for 40 min, the pellet was rinsed three times with 1 mL ice-cold 70% ethanol, dried at room temperature and resuspended in 50 L low TE buffer, pH 8 (10 mM Tris–HCl, pH 8, 0.1 mM EDTA, pH 8.0) containing 5 g/mL RNase A. 2.2. Microsatellite marker identification, genotyping and stability The T. vaginalis genome sequence was mined using version 1.1 of the genome database resource TrichDB (http://trichdb.org; see Ref. [37]). The genome was screened for microsatellites (MS) with di-, tri-, tetra-, penta-, and hexa-nucleotide repeats, using the default settings of SciRoKo [38] as follows: a fixed penalty of five; allowing no more than three simultaneous mismatches; requiring a minimum score of 15 and a minimum of three repeats. The MS were filtered to include only those smaller than 400 bp to ensure that products would fall within the size standard range during genotyping. Sequences from the regions 200 bp upstream and downstream of each MS were downloaded and used to design MS-flanking primers with Primer3Plus [39], using default parameters modified for 100–450 bp products. Primer sequences were searched against the complete T. vaginalis genome with Megablast [40] to ensure uniqueness. A 19 nt tag sequence 5 -GTCGTTTTACAACGTCGTG-3 was added to the 5 end of all forward primers, and the final primer set was commercially synthesized (Table S1). All MS markers were amplified as follows: 25 L PCR reactions contained 50 nM forward primer, 1 M reverse primer, 1 M of the primer 5 -GTCGTTTTACAACGTCGTG-3 labeled with phoshoramidite conjugate HEX or 6-FAM at the 5 end, 1× Mg-free PCR buffer (Promega, Madison, WI), 0.25 mM dNTPs (New England BioLabs Ipswich, MA), 1.5 mM MgCl2 , and 0.02 U/L GoTaqFlexi (Promega, Madison, WI). Thermocycler programs were optimized for each primer combination (Table S1). We tested two different cycling programs (step and standard) at temperatures ranging from 45 to 60 ◦ C and for each locus selected the program that worked 32 M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 Table 1 Seven T. vaginalis laboratory strains and six sub-strains used in this study. Several of the commonly used strains are available through the American Type Culture Collection (ATCC). Six sub-strains isolated during the course of long periods of continuous culture are also shown, with the day of isolation indicated (Day X). Identifier Origin B7RC2 C1:NIH T1 G3 CI6 BRIS/92/HEPU/F1623 (Day 0) BRIS/92/HEPU/F1623-Met (Day 350) BRIS/92/HEPU/F1623-Met (Day 440) BRIS/92/HEPU/F1623-Met (Day 530) BRIS/92/HEPU/B7268 (Day 0) BRIS/92/HEPU/B7268-Met (Day >55) BRIS/92/HEPU/B7268-Met (Day >55) BRIS/92/HEPU/B7268-toy (Day >200) North Carolina, USA Maryland, USA Taipei, Taiwan Kent, UK Puerto Rico Brisbane, Australia Brisbane, Australia Brisbane, Australia Brisbane, Australia Brisbane, Australia Brisbane, Australia Brisbane, Australia Brisbane, Australia most consistently based on ABI 3130xl sequencer results described below. Controls included (1) amplification of all primer sets with commercially available human DNA, (2) DNA prepared from a T. vaginalis negative vaginal swab to verify primer specificity for the parasite, and (3) DNA-free PCR reactions. To determine the sensitivity of the genotyping to mixed infections, artificial mixtures of two strains with genotypes known to differ was evaluated at three loci (MS70, MS77, and MS135) at 1:1, 2:1 and 3:1 concentrations. These concentrations allowed for the detection of multiple peaks in all cases (data not shown). PCR amplicons were run on an ABI 3130xl sequencer with GeneScan-500 LIZ size standard (ABI, Foster City, CA) for size determination. 2.3. Identification and sequencing of single-copy genes (SCG) Single-copy genes (SCGs) in the highly repetitive T. vaginalis genome were identified by an all-against-all BLASTP [41] comparison of protein sequences, excluding those annotated as hypothetical. Sequences with >60% sequence similarity were removed from further consideration. The remaining genes were filtered by size and those ≤3.5 kb retained. Genes were selected manually from this list by criteria indicating likely polymorphism, for example, if they were implicated in biologically relevant mechanisms such as virulence. In addition, genes coding for several proteins identified as being located on the parasite cell surface [35] were screened for suitability as polymorphic markers. Primers were designed using Primer3Plus [39] using default parameters and the final primer set commercially synthesized (Table S2). All genes were amplified as follows: 50 L PCR reactions contained 1 M forward primer, 1 M reverse primer, 1× Mg-free PCR buffer (Promega, Madison, WI), 0.25 mM dNTPs (New England BioLabs, Ipswich, MA), 1.5–2 mM MgCl2 , and 0.02 U/L GoTaqFlexi (Promega, Madison, WI). Thermocycler programs (called 2 and 3.5 kb) were optimized for each primer combination using gradient PCRs ranging from 45 to 65 ◦ C (Table S2). Controls included DNA-free PCR reactions. PCR products were visualized on 1% agarose gels and those containing single amplicons were purified using Agencourt AMPure (Beckman Coulter Genomics, Danvers, MA). The band of interest in PCRs with multiple products were purified via gel isolation using Wizard SV Gel and PCR Cleanup System (Promega Madison, WI). Sequencing reactions were performed using BigDye Terminator v 3.1 (ABI, Foster City, CA) using forward primers, reverse primers, and internal sequencing primers when necessary. All genes were sequenced at ≥2× coverage. Sequencing reactions were cleaned using Agencourt CleanSEQ (Beckman Coulter Genomics, Danvers, MA) and analyzed on an ABI 3130xl sequencer. The new sequences from T. vaginalis lab strains were deposited in GenBank as indicated in Table S3. Year 1986 1956 1993 1973 1980s 1992 1992 1992 1992 1992 1992 1992 1992 Availability References ATCC #50167 ATCC #30001 Available from J. Carlton ATCC #PRA-98 BioMed Diagnostics, Inc. Available from J. Carlton Available from J. Carlton Available from J. Carlton Available from J. Carlton Available from J. Carlton Available from J. Carlton Available from J. Carlton Available from J. Carlton [58] [59,60] [61] [62–64] Unpublished [65,66] [65,66] [65,66] [65,66] [66–68] [66–68] [66–68] [66–68] 2.4. Chromosome spreads and fluorescent in situ hybridization (FISH) FISH was used to map T. vaginalis MS markers to their chromosomal positions. T. vaginalis cells were prepared using modifications to a published protocol [42]. Briefly, G3 parasites were grown to mid-log phase in Diamonds media with 10% horse serum, 1 mM Colchicine (Sigma, St. Louis, MO) added to 30–50 mL of culture (∼1.5 × 106 cells/mL), incubated at 37 ◦ C for 6 h, and the cells harvested by centrifugation (all centrifugations were performed at 1000 × g for 5 min at 4 ◦ C). Cells were resuspended in 10 mL of 75 mM potassium chloride, allowed to swell for 5 min at 37 ◦ C, centrifuged, gently resuspended in 5 mL of fresh Carnoy fixative (ethanol:chloroform:glacial acetic acid, 6:3:1) at 4 ◦ C for 20 min, centrifuged again and finally left resuspended in ∼1–2 mL fresh Carnoy fixative 4 ◦ C over night. Approximately 20 L of the cell suspension were dropped from a height of ∼100 cm onto grease-free microscopic slides and left to air-dry. Primers to amplify 1.5–2.5 kb regions spanning each of the MS loci were designed with Primer3Plus [39] using default parameters and the final primer set commercially synthesized (Table S3). All genes were amplified as described for SCGs using either the step or 2 kb thermocycler program. Amplicons were gel purified using the Wizard SV Gel and PCR Clean-up System (Promega Madison, WI), labeled with DIG-11-dUTP using the DIG DNA Labeling Kit (Roche Applied Science, Indianapolis, IN) and cleaned by ethanol precipitation. Freshly prepared T. vaginalis chromosome spreads were briefly placed in 50% acetic acid solution, dried at 37 ◦ C, incubated in 50 g/mL pepsin (Sigma, St. Louis, MO) in 3 mM acetate buffer with 0.01 M HCl, and washed with PBS pH 7.4 at room temperature for 5 min. The chromosomes were post-fixed with fresh 2% formaldehyde in PBS for 30 min, and incubated in 1% H2 O2 in PBS for 30 min (all at room temperature). Following dehydration in 3 min washes of increasing concentrations of methanol (70%, 90%, and 100%) and subsequent drying, the chromosomes were denatured in 70% formamide in 2× SSC for 5 min at 75 ◦ C under a cover slip and then immediately dehydrated in a methanol series as described above. A total of 2 L of each probe (∼20 ng) was denatured in 50 L of 50% deionized formamide (Sigma, St. Louis, MO) in 2× SSC at 90 ◦ C for 5 min and submerged in ice for 3 min. The probe mixture was applied to the chromosome slides, a coverslip overlaid and sealed with rubber cement, and hybridization was left to proceed overnight in a humid chamber at 37 ◦ C. High stringency washes were performed at 42 ◦ C in three changes of 50% formamide (Fluka Buchs, SUI) in 2× SSC for 5 min each, 3× 5 min washes in 2× SSC, and a 5 min wash in TNT wash buffer (100 mM Tris–HCl, 150 mM NaCl, 0.05% Tween20, pH 7.5), both at room temperature. At room tempera- M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 ture, samples were blocked for 30 min with TNB blocking buffer (PerkinElmer, Waltham, MA), and then incubated for 60 min in a humid chamber with anti-digoxigenin antibody conjugated with horseradish peroxidase (Roche Applied Science, Indianapolis, IN) diluted 1:1000 in TNB. Slides were washed 3 times for 5 min in TNT wash buffer at room temperature. The TSA-Plus TMR System (PerkinElmer, Waltham, MA) was used for tyramide signal amplification according to the manufacturers instructions. Slides were counterstained with DAPI in Vectashield mounting medium (VectorLabs, Burlingame, CA). Slides were visualized using a Leica TCS SP2 AOBS confocal microscope at 63×, and images were merged using LCS Lite v. 2.5 (Leica Microsystems Heidelberg GmbH, Mannheim, Germany). Chromosomes were identified by their morphological characteristics [42] (see Fig. 1 legend), and chromosome assignment was performed twice, first with a locus-identifying label, and then a second time in randomized order with labels removed (i.e., blinded). 2.5. Population genetic and evolutionary analyses GeneMapper 4.0 (ABI, Foster City, CA) was used to score MS allele sizes. All calls were manually edited to discard data from poorly amplified reactions and to ensure that proper allele calls were assigned. Strains were classified as mixtures of two or more genotypes if multiple peaks appeared at two or more loci in duplicate reactions. HE (expected heterozygosity) was calculated using the 33 n 2 p where p is the frequency of formula HE = [n/(n − 1)] 1 − i=1 the ith allele and n is the number of alleles sampled and confirmed with Arlequin3.11 [43]. A phylogeny using MS markers was inferred using POPTREE2 [44], a program that constructs phylogenetic trees from allele frequency data. A neighbor-joining (NJ) tree was constructed using DA [45] distance as a measure of genetic difference between individuals. Support values for the tree were obtained by bootstrapping 1000 replicates. A minimum-spanning network was inferred using Network 4.516 (Fluxus Technology Ltd. 2009) [46], software developed to reconstruct all possible least complex phylogenetic trees using a range of data types. For SNPs, SCG sequence data was aligned to the reference sequence published in TrichDB and the alignments manually edited using Sequencher 4.8 (Gene Codes Corporation, Ann Arbor, MI). Called SNPs were manually verified and included any single nucleotide change that occurred in any single strain. We used ModelGenerator v. 0.85 [47] to identify appropriate nucleotide substitution models for inference of trees for the six most polymorphic gene sequences: a set of three surface protein-coding genes TVAG 400860 (GP63a), TVAG 216430 (GP63b), and TVAG 303420 (LLF4), and a set of three housekeeping genes TVAG 005070 (PMS1), TVAG 302400 (Mlh1a), TVAG 021420 (coronin, CRN), with the number of gamma categories set at ten. PhyML [48] as part of SeaView v. 4.2.4 [49] was used to infer maximum likelihood (ML) phylogenies reconstructed by applying simultaneous NNI (nearest neighbor interchange) and SPR (subtree pruning and regrafting) moves on Fig. 1. Representative images of metaphase-arrested chromosome spreads hybridized with probes specific to microsatellite loci: (top left panel) representative image showing the T. vaginalis karyotype and chromosome numbering according to a distinct set of karyometrical data [35]; (top right panel) MS20 localized to chromosome II; (bottom left panel) MS77 localized to chromosome II; (bottom right panel) MS03 localized to chromosome II. The six T. vaginalis chromosomes are morphologically distinct as follows: chromosome I is the longest and is a subtelocentric/submetacentric type; chromosome II is acrocentric; chromosome III is metacentric; chromosome IV is metacentric/submetacentric; chromosome V is subtelocentric/acrocentric, although the short arm is not always visible; chromosome VI is the shortest and acrocentric. 34 M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 five independent random starting trees. Substitution rate categories were set at ten and transition/transversion (Ts/Tv) ratios, invariable sites and across-site rate variation were selected as indicated by ModelGenerator. Support values for the tree were obtained by bootstrapping 1000 replicates. Each gene was initially run independently and two manually concatenated DNA sequence data sets were generated based on their topology. The surface protein sequences were analyzed using Tamura and Nei’s 1993 nucleotide substitution model [50] with a Ts/Tv ratio of 2.4, 12 rate categories, gamma distribution parameter alpha = 0.04, and the proportion of invariable sites being set at 0.83, while the housekeeping gene sequences were analyzed with Hasegawa et al.’s 1985 nucleotide substitution model [51] with a Ts/Tv ratio of 1.91, four rate categories, and the proportion of invariable sites set at 0.97. 3. Results 3.1. Microsatellite identification and development of a simple genotyping method We screened the T. vaginalis genome sequence for microsatellites (MS) as an initial step towards developing a panel of markers for genetic diversity studies of the parasite. In total, 4,630 MSs were identified with an average length of 30.50 nt ± 1.07 SE, with trinucleotide and pentanucleotide repeats being the most abundant (37.7% and 25.6%, respectively), followed by tetranuclotide (14.9%), hexanuclotide (12.3%) and dinucleotide (9.5%) repeats. After further filtering to exclude MSs present on the same scaffold, we designed unique primer pairs to 86 MSs for testing (Table S1). Of these 86 MS loci, a total of 21 were found to amplify reliably under our optimizing conditions and are described below in detail. We also adapted previously published methods to develop a simple, reliable and affordable high-throughput MS genotyping protocol for T. vaginalis population genetic studies. First, we adapted a ‘tagged primer’ technique [52,53] by adding a 19 nt tag to the 5 end of each forward primer, designed to be recognized by a fluorescently tagged universal primer. Primers were added in a 1:40:40 ratio (locus-specific forward primer:locus-specific reverse primer:tagged universal primer) to perform semi-nested PCRs in which priming was done first with locus-specific reactions, which were then used as a template for reactions with the labeled primer. Using one fluorescently tagged universal primer to label all 21 loci greatly reduces genotyping costs. We tested two amplification programs [54] (Table S2) at a range of annealing temperatures, and validated amplicons by electrophoretic sizing to determine the most reliable program for each locus. For the latter step HEXlabeled and 6-FAM labeled reactions were multiplexed on the ABI 3130xl, further reducing genotyping costs and increasing the efficiency of the semi-automated, high-throughput methods described here. 3.2. T. vaginalis microsatellites are stable over many generations To determine if the microsatellite loci characterized in this study are stable over long periods (e.g., during continuous culture of the parasite), we genotyped two T. vaginalis strains that had been maintained in culture for many months. T. vaginalis strain F1623 was maintained under metronidazole selection for a minimum of 530 days, and strain B7268 was maintained under metronidazole selection for a minimum of 55 days and toyocamycin selection for a minimum of 200 days (Table 1). We genotyped each strain in duplicate with all 21 MS markers at four separate time points: Day 0, two Day 55 time points from independent cultures, and Day 255 for strain B7268; at Day 0, Day 350, Day 440, and Day 530 for strain F1623. No changes were observed at any of the four time points evaluated (data not shown), indicating that the MS loci are stable for at least 17 months under highly selective conditions. For all further analyses the data for these different time points were collapsed and considered as a single isolate. 3.3. Assignment of MS markers to T. vaginalis chromosomes In order to ensure that our polymorphic MS loci were distributed throughout the T. vaginalis genome, we used fluorescent in situ chromosome hybridization (FISH) to map MS loci to its six chromosomes. T. vaginalis chromosome spreads were hybridized with DIG-labeled locus-specific probes designed to be 1.5–2.5 kb in length and to span the entire MS locus, and hybridized probes were visualized using a tyramide signal amplification system. The chromosomes were identified by their morphological characteristics after two independent analyses of confocal images (Fig. 1). Although successful signal detection from hybridizations of the probes was infrequent, most likely due to the single-copy nature of the loci, 13 of the 21 MS were successfully mapped to single chromosomes, and the location of a 14th marker (MS06) could be narrowed down to one of two chromosomes (Table 2). The 14 mapped MS markers are distributed among four of the six chromosomes, with an average of approximately three markers per chromosome, ranging from two markers on chromosomes I and IV, to five markers on chromosome II (Table 2). Chromosomes V and VI, the smallest of the six chromosomes, were not represented by the mapped markers, although any of the remaining eight MS markers may be located on them. 3.4. T. vaginalis MS markers from common laboratory strains are highly polymorphic We used the 21 MS markers identified above to genotype seven lab strains of T. vaginalis. Several of the strains are maintained at the American Type Culture Collection (ATCC) and are commonly used by T. vaginalis labs around the world. MS loci ranged from 2 to 6 nucleotides in length, and each locus had from 1 to 6 alleles (Table 1), the average being 3.33 alleles per locus. We used the n formula HE = [n/(n − 1)] 1 − p2 , where p is the frequency i=1 of the ith allele and n is the number of alleles sampled, to determine heterozygosity of each locus, and we verified this value with Arlequin 3.11 [43]. This value varied by locus, ranging from 0 to 0.95, with the average HE determined to be 0.66 ± 0.06. One locus was found to be monomorphic in the lab isolates, but genotyping of additional clinical isolates (data not shown) revealed it to be polymorphic. None of the lab strains appeared to be mixtures of genotypes by our criteria, or in comparison to a simulated mixture of two lab strains that we used as a control (see Section 2; data not shown). These results show that MSs in T. vaginalis are highly polymorphic and can be used as robust markers for genetic studies. Phylogenetic analyses of the lab isolates using MS data inferred trees with low to moderate support (Fig. 2a). The basic topology of the phylogeny correlates well with the topology of the minimumspanning network inferred by Network (Fig. 2b), supporting the ability of these markers to represent the evolutionary history of the strains. Topologies from both methods suggest a branch composed of B7RC2 and F1623 that is consistently delineated from the remaining five isolates. This separation also occurs in the phylogeny inferred from concatenated sequences of housekeeping genes (see below). 3.5. Single-copy gene identification and polymorphism Although MS markers are useful in characterizing genetic diversity due to their highly polymorphic nature, the high mutation rate associated with them can also be disadvantageous, leading to M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 35 Table 2 Characteristics of 21 T. vaginalis MS loci characterized in this study. Locus MS01 MS03 MS04 MS06 MS07 MS08b MS09b MS10 MS17 MS20b MS38b MS44 MS70 MS77 MS94 MS100 MS129 MS135 MS153 MS168 MS184b Average Motif GAATAA AAAATA CAA TTC ATTAAT TTC AGAA AAAT TTTTA TGT CTT AT TA GA AT TG ATT TA AT TG TTG Chromosome ** I II* ND III or IV* II* III* IV** IV* III*** III** III* II** ND II*** ND I* ND ND ND ND II** Scaffold DS113177 DS113181 DS113183 DS113274 DS113206 DS113227 DS113295 DS113349 DS113178 DS113200 DS113502 DS113642 DS114241 DS114317 DS113199 DS113212 DS113270 DS113807 DS113342 DS113392 DS113453 G3 lengtha Alleles HE 584,929 437,572 431,961 170,984 244,377 207,319 155,794 128,477 503,923 279,084 91,738 69,902 24,950 22,118 279,741 227,203 172,859 53,461 132,270 115,942 99,852 170 253 250 406 401 270 423 421 206 432 266 247 252 227 209 194 193 248 242 183 254 3 1 3 6 3 3 2 4 4 2 2 4 4 3 2 2 5 4 4 4 5 0.52 0 0.67 0.95 0.76 0.67 0.29 0.87 0.9 0.29 0.29 0.81 0.86 0.67 0.57 0.57 0.86 0.81 0.81 0.81 0.86 211,165 268.7 3.33 0.66 Scaffold size (bp) ND, not determined. a The actual size of MS amplified from reference strain G3 are all within 1–4 bp of the expected size as predicted by in silico analysis of the G3 genome assembly, with the exception of with the 19 bp tag sequence added to all loci and two markers, MS04 and MS09, which have a 6 and 13 bp difference, respectively. b Indicates microsatellite is located within an annotated gene. * Localization inferred from one replicate. ** Localization inferred from two replicates. *** Localization inferred from ≥3 replicates. Fig. 2. (A) Neighbor joining tree inferred from microsatellite data using DA distance. Numbers in the tree correspond to non-parametric bootstrap supports (1000 replicates). (B) Minimum-spanning network inferred from microsatellite data using reduced median setting. (C) Maximum likelihood phylogeny inferred using concatenated surface protein gene sequences. The log-likelihood of the corresponding phylogenetic model is −5946.1 (D) Maximum likelihood phylogeny inferred using concatenated housekeeping gene sequences. The log-likelihood of the corresponding phylogenetic model is −7963.0. For both ML phylogenies, the numbers in the trees correspond to non-parametric bootstrap supports (1000 replicates). 36 M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 Table 3 SNPs identified in 21 single-copy genes sequenced in seven T. vaginalis strains. Genes whose IDs are underlined were used to generate phylogenetic trees shown in Fig. 2. Gene ID Gene name No. SNPs Sequence length (bp) Accession Nos. TVAG 400860 Clan MA, family M8, leishmanolysin-like metallopeptidase (GP63a) Mismatch repair MutL homolog (PMS1) Mismatch repair MutL homolog (Mlh1A) Coronin (CRN) Antigenic protein P1, putative (VSA) Clan MA, family M8, leishmanolysin-like metallopeptidase (GP63b) Vesicular mannose-binding lectin, putative (LLF4) Vesicular mannose-binding lectin, putative, PS (LLF1) Vesicular mannose-binding lectin, putative HIV-1 rev binding protein, putative Clan CA, family C1, cathepsin L-like cysteine peptidase Clan CA, family C1, cathepsin L-like cysteine peptidase Multidrug resistance pump, putative Tubulin alpha chain, putative Aspartic peptidase Tropomyosin isoforms 1/2, putative Arp2/3, putative Actin depolymerizing factor, putative Histone deacetylase complex subunit SAP18, putative Clan CE, family C48, cysteine peptidase Conserved hypothetical protein (with PF03388 Domain) 19 1425 HM365121–HM365126, XM 001324948 13 13 7 7a 6 1617 2337 1659 1055 1638 HM365174–HM365178, XM HM365169–HM365173, XM HM365115–HM365120, XM HM365203–HM365208, XM HM365127–HM365132, XM 5 4 4 4 4 3 3 2 2 2 2 1 1 0 0 1119 1032 1095 750 1233 729 885 834 565 879 801 378 294 640 1119 HM365157–HM365162, XM 001329311 HM365139–HM365144, XM 001329418 HM365151–HM365156, XM 001310009 HM365133–HM365138, XM 001315268 HM365109–HM365114 XM 001321129 HM365103–HM365108, XM 001580544 HM365163–HM365168, XM 001329432 HM365197–HM365202, XM 001322737 HM365179–HM365184, XM 001324024 HM365191–HM365196, XM 001321947 HM365091–HM365096, XM 001318124 HM365085–HM365090, XM 001581222 HM365185–HM365190, XM 001582251 HM365097–HM365102, XM 001583217 HM365145–HM365150, XM 001321414 TVAG TVAG TVAG TVAG TVAG 005070 302400 021420 364940 216430 TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG TVAG 303420 291830 086190 171780 485880 228710 291970 184510 459080 414100 087140 192620 166900 094560 309150 a 001301638, DQ321767 00132034, 1DQ321764 001581132 001304496 001317855 All seven SNPs were found within a single isolate that also has a frame-shifting deletion. homoplasies that may confound true evolutionary histories. Single nucleotide polymorphisms (SNPs), on the other hand, arise less frequently, are less subject to homoplasy, and with the advent of high-throughput sequencing methods, are easier to genotype. Since SNPs located within single-copy genes (SCGs) are most useful for inferring phylogenies, we mined the T. vaginalis genome for SCGs using a basic all-versus-all BLASTP of all non-hypothetical proteins, and identified 4,582 genes as suitable candidates. The list was further reduced to 845 genes by removing all those with a shared annotated name, and reduced further still on the basis of size and function. A final experimental set of 21 loci included 18 genes identified through this screening process and three genes (GP63a, GP63b, and LLF4) included because their putative function as surface proteins suggested that they might be polymorphic. Amplification and sequencing of all 21 genes from seven of the laboratory strains identified a wealth of SNPs (Table 3). Polymorphism varied by locus, with some genes highly conserved and exhibiting no variation (TVAG 094560 and TVAG 309150) and others having up to 19 SNPs in ∼1.4 kb of sequence (TVAG 400860). We selected six of the most variable genes (GP63a, PMS1, Mlh1a, CRN, GP63b, and LLF4) representing housekeeping genes, potential virulence factors and drug targets, for phylogenetic analysis. (Although TVAG 364940 [VSA] was found to be more polymorphic than TVAG 216430 [GP63b] and TVAG 303420 [LLF4], the polymorphisms were restricted to a single isolate with a frame-shifting deletion, and so we excluded this gene.) After determining appropriate nucleotide model parameters, we used PhyML [48] to infer the phylogeny of each gene separately. The three surface protein genes (GP63a, GP63b, and LLF4) had similar topologies, as did the three housekeeping genes (PMS1, Mlh1a, and CRN), although these topologies differed between the two groups. Using the similarity between the topologies, we arranged the sequence data into two concatenated data sets, one composed of the three putative surface protein-coding genes (GP63a, GP63b, and LLF4), and one composed of the three housekeeping genes (PMS1, Mlh1a, and CRN). The surface protein data set was composed of 4182 unambiguously aligned positions with 30 polymorphic sites, translating to 24 variable sites in the protein sequence, one of which has three amino acid variants, while the housekeeping data set was composed of 5619 unambiguously aligned positions, including 34 polymorphic positions, translating to 23 variable sites in the protein sequence. The two data sets were analyzed using Tamura and Nei’s [50] and Hasegawa et al. [51] nucleotide substitution models, respectively. The two trees share little topology (Fig. 2c and d), probably as a result of the different selective pressures and evolutionary forces the two sets of genes experience. The phylogeny inferred from the concatenated housekeeping sequence has better bootstrap support than that of the concatenated surface protein sequence; in addition, it more closely resembles the phylogenetic and minimum-spanning network topologies inferred from microsatellite data (Fig. 2a, b, and d), suggesting that the separation of B7RC2 and F1623 from the remaining strains is the best representation of their ancestral history under assumed neutral selection pressure. Including additional lab strains or isolates would likely clarify relationships between individual strains that are currently ambiguous. 4. Discussion A draft of the first T. vaginalis genome sequence was published in 2007 [35], opening the way for development of new genetic tools to understand the biology of this neglected pathogen. Although Sanger sequencing coverage of the genome is high at ∼7.2×, the presence of hundreds of gene families and thousands of repetitive elements that show extreme similarity to each other (average pairwise difference ∼2.5%) hinder the generation of a more complete assembly of the genome. As a result, identifying the single-copy genes and unique MS markers reported in this study has been a challenging and non-trivial exercise. Here we have described a novel suite of 27 robust, reproducible, and inexpensive genetic markers for investigating the population genetics of T. vaginalis, including the first reported MS markers and a panel of diverse single-copy genes. In addition, we have mapped the MS loci to individual chromosomes, verified their stability, and performed minimum-spanning network and phylogenetic analysis to validate their usefulness for inferring the evolutionary relationship of strains. We have similarly identified polymorphic single-copy genes and also demonstrated their use in phylogenetic studies. Our results show that the 21 MS loci that we identified and successfully optimized for genotyping in seven laboratory strains M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 and six sub-strains exhibit a high degree of diversity, and are distributed among at least four of the six chromosomes comprising the T. vaginalis genome. We have shown these markers to be stable in continuous culture over extended periods of time while under drug selection, suggesting that the mutation rate at these loci is slow enough to prevent extensive homoplasy, and supporting their usefulness as markers. By identifying single-copy genes coding for protein sequences lacking strong similarity to other protein sequences encoded by the genome, we have been able to eliminate complications arising from the use of paralogs (genes related by duplication within a genome), ensuring that phylogenies represent locus-specific evolutionary history and therefore better reflect the ancestral relationship of T. vaginalis isolates. We have also selected genes with diverse functions, including housekeeping genes and putative surface proteins, thereby representing the history of loci evolving under a range of selective pressures. Combining sequence data from these genes allows us to accurately represent the evolutionary history of the genome as whole, rather than of select loci. In addition, we have identified genes with sufficient polymorphism to distinguish between closely related strains, making these markers useful for intra-specific, population-level analyses. The six genes that we have selected for phylogenetic analysis demonstrate different evolutionary relationships among the strains. The housekeeping gene phylogenies strongly support strains B7RC2 and F1623 being more closely related to each other than to the other isolates, a relationship that is maintained in phylogenies of the individual genes that make up the concatenated data set. This association is also found in ancestral relationships inferred from microsatellite data using two different methods of analysis. In contrast, the surface genes indicate a closer relationship between B7268 and F1623. This relationship is only apparent in the GP63a gene, as the other individual genes have low support values at all nodes (data not shown). This may indicate that the relationship of these genes is more influenced by selective pressure than common ancestry. Pressure for diversification of surface proteins to adapt to host immune responses and/or to be able to colonize a host may overshadow other aspects of their evolutionary history, making their ancestral relationship ambiguous in this small data set. Evaluating additional clinical samples may better explain the lack of overlap in tree topologies and provide a clearer picture of the forces driving the relationships between the genes of different isolates. During our MS marker development, we employed a costsaving method that utilized one universal tagged primer to label PCR amplicons of the MS loci [52,53,55], making the technique accessible to more researchers, particularly those in resourcelimited settings. The availability of these reproducible markers will allow comparison and merging of data sets from different research projects, enabling a more comprehensive investigation of the population diversity of T. vaginalis through better representation of widespread geographical locations, population demographics, and overall parasite diversity. As a result of this study and future parallel population-level diversity studies, those investigating T. vaginalis will be able to examine the extant population structure and diversity of the parasite to better select strains when studying specific biological functions. Similarly, this genotyping method will be invaluable for resolving conflicting experimental results by helping to standardize the influence of genetic background on variation in quantitative traits. Another powerful use of these markers will be to elucidate the role of meiosis in the T. vaginalis lifecycle [54] by providing a means to detect recombinants [56] and to utilize population genetic models to detect other evidence for or against recombination, such as linkage disequilibrium [23] or the detection of haplotypes in significantly excessive frequency [57]. 37 T. vaginalis remains a ‘neglected’ pathogen in part due to an incomplete understanding of the role it has played in pathology and in facilitating co-infection and transmission of diseases that are of great public health concern (e.g., HIV/AIDS). As a result, there is a paucity of tools for elucidating its mechanisms of colonization, pathogenesis, and drug resistance. The development of this panel of genetic markers is a step towards rectifying this neglect and raising awareness of the complexity of the parasite’s genetics, leading to better explanation of the wide range of severity of symptoms and manifestations of pathology. Acknowledgements We thank Drs. Shehre-Banoo Malik for assistance in phylogenetic analyses, Patrick Sutton for valuable input in the writing of the manuscript, and Ute Frevert for assistance with the microscopy. We would also like to recognize the Confocal Microscopy Facility of the Department of Medical Parasitology for the use of the Leica TCS SP2 AAOBS confocal microscope. M.C. was partially funded under a National Institutes of Health training grant T32AI007180-27. This study was funded by a National Institutes of Health award to J.M.C. under the Advancing Novel Science in Women’s Health Research program announcement, grant number R21 AI083954-01. J.T. and Z.Z. were supported by the Czech Ministry of Education (LC07032, MSM 0021620858). Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.molbiopara.2010.08.006. References [1] World Health Organization. Global prevalence and incidence of selected curable sexually transmitted infections: overviews and estimates. WHO/HIV AIDS/2001.02. Geneva: World Health Organization; 2001. [2] Johnston VJ, Mabey DC. Global epidemiology and control of Trichomonas vaginalis. Curr Opin Infect Dis 2008;21(1):56–64. [3] Van Der Pol B. Editorial commentary: Trichomonas vaginalis infection: the most prevalent nonviral sexually transmitted infection receives the least public health attention. Clin Infect Dis 2007;44(1):23–5. [4] Cherpes TL, Wiesenfeld HC, Melan MA. The associations between pelvic inflammatory disease. Trichomonas vaginalis infection, and positive herpes simplex virus type 2 serology. Sex Transm Dis 2006;33(12):747–52. [5] Moodley P, Wilkinson D, Connolly C, Moodley J, Sturm AW. Trichomonas vaginalis is associated with pelvic inflammatory disease in women infected with human immunodeficiency virus. Clin Infect Dis 2002;34(4):519–22. [6] Cotch MF, Pastorek 2nd JG, Nugent RP. Trichomonas vaginalis associated with low birth weight and preterm delivery. The Vaginal Infections and Prematurity Study Group. Sex Transm Dis 1997;24(6):353–60. [7] McClelland RS, Sangare L, Hassan WM. Infection with Trichomonas vaginalis increases the risk of HIV-1 acquisition. J Infect Dis 2007;195(5):698–702. [8] Guenthner PC, Secor WE, Dezzutti CS. Trichomonas vaginalis-induced epithelial monolayer disruption and human immunodeficiency virus type 1 (HIV-1) replication: implications for the sexual transmission of HIV-1. Infect Immun 2005;73(7):4155–60. [9] Vanácová S, Tachezy J, Kulda J, Flegr J. Characterization of trichomonad species and strains by PCR fingerprinting. J Eukaryot Microbiol 1997;44(6):545–52. [10] Hampl V, Vanácová S, Kulda J, Flegr J. Concordance between genetic relatedness and phenotypic similarities of Trichomonas vaginalis strains. BMC Evol Biol 2001;1:11. [11] Mead JR, Fernadez M, Romagnoli PA, Secor WE. Use of Trichomonas vaginalis clinical isolates to evaluate correlation of gene expression and metronidazole resistance. J Parasitol 2006;92(1):196–9. [12] Rojas L, Fraga J, Sariego I. Genetic variability between Trichomonas vaginalis isolates and correlation with clinical presentation. Infect Genet Evol 2004;4(1):53–8. [13] Snipes LJ, Gamard PM, Narcisi EM, Beard CB, Lehmann T, Secor WE. Molecular epidemiology of metronidazole resistance in a population of Trichomonas vaginalis clinical isolates. J Clin Microbiol 2000;38(8):3004–9. [14] Stiles JK, Shah PH, Xue L, Meade JC, et al. Molecular typing of Trichomonas vaginalis isolates by HSP70 restriction fragment length polymorphism. Am J Trop Med Hyg 2000;62(4):441–5. [15] Upcroft JA, Delgadillo-Correa MG, Dunne RL, Sturm AW, Johnson PJ, Upcroft P. Genotyping Trichomonas vaginalis. Int J Parasitol 2006;36(7):821–8. 38 M. Conrad et al. / Molecular & Biochemical Parasitology 175 (2011) 30–38 [16] Crucitti T, Abdellati S, Van Dyck E, Buvé A. Molecular typing of the actin gene of Trichomonas vaginalis isolates by PCR-restriction fragment length polymorphism. Clin Microbiol Infect 2008;14(9):844–52. [17] Barker GC. Microsatellite DNA: a tool for population genetic analysis. Trans R Soc Trop Med Hyg 2002;96(Suppl. 1):S21–4. [18] Tibayrenc M. Multilocus enzyme electrophoresis for parasites and other pathogens. Methods Mol Biol 2009;551:13–25. [19] Maiden MC. Multilocus sequence typing of bacteria. Annu Rev Microbiol 2006;60(1):561–88. [20] Powell W, Morgante M, Andre C, et al. The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for germplasm analysis. Mol Breed 1996;2:225–38. [21] Su X, Wellems TE. Toward a high-resolution Plasmodium falciparum linkage map: polymorphic markers from hundreds of simple sequence repeats. Genomics 1996;33(3):430–44. [22] Havryliuk T, Ferreira MU. A closer look at multiple-clone Plasmodium vivax infections: detection methods, prevalence and consequences. Mem Inst Oswaldo Cruz 2009;104(1):67–73. [23] Anderson TJ, Haubold B, Williams JT, et al. Microsatellite markers reveal a spectrum of population structures in the malaria parasite Plasmodium falciparum. Mol Biol Evol 2000;17(10):1467–82. [24] Imwong M, Sudimack D, Pukrittayakamee S, et al. Microsatellite variation, repeat array length, and population history of Plasmodium vivax. Mol Biol Evol 2006;23(5):1016–8. [25] Cooper A, Tait A, Sweeney L, et al. Genetic analysis of the human infective trypanosome Trypanosoma brucei gambiense: chromosomal segregation, crossing over, and the construction of a genetic map. Genome Biol 2008;9(6):R103. [26] Ajzenberg D, Bañuls AL, Tibayrenc M, Dardé ML. Microsatellite analysis of Toxoplasma gondii shows considerable polymorphism structured into two main clonal groups. Int J Parasitol 2002;32(1):27–38. [27] Schwenkenbecher JM, Wirth T, Schnur LF, et al. Microsatellite analysis reveals genetic structure of Leishmania tropica. Int J Parasitol 2006;36(2):237–46. [28] Oliveira RP, Broude NE, Macedo AM, Cantor CR, Smith CL, Pena SD. Probing the genetic population structure of Trypanosoma cruzi with polymorphic microsatellites. Proc Natl Acad Sci U S A 1998;95(7):3776–80. [29] Simo G, Njiokou F, Tume C, et al. Population genetic structure of Central African Trypanosoma brucei gambiense isolates using microsatellite DNA markers. Infect Genet Evol 2010; 10(1): 68–76. [30] Imwong M, Nair S, Pukrittayakamee S, et al. Contrasting genetic structure in Plasmodium vivax populations from Asia and South America. Int J Parasitol 2007;37(8–9):1013–22. [31] Ferdig MT, Su XZ. Microsatellite markers and genetic mapping in Plasmodium falciparum. Parasitol Today 2000;16(7):307–12. [32] Pearce RJ, Pota H, Evehe MS, et al. Multiple origins and regional dispersal of resistant dhps in African Plasmodium falciparum malaria. PLoS Med 2009;6(4):e1000055. [33] Morin PA, Luikart G, Wayne RK. SNPs in ecology, evolution and conservation. Trends Ecol Evol 2004;19(4):208–16. [34] Edwards SV. Natural selection and phylogenetic analysis. Proc Natl Acad Sci U S A 2009;106(22):8799–800. [35] Carlton JM, Hirt RP, Silva JC, et al. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 2007;315(5809):207–12. [36] Horner DS, Hirt RP, Kilvington S, Lloyd D, Embley TM. Molecular data suggest an early acquisition of the mitochondrion endosymbiont. Proc Biol Sci 1996;263(1373):1053–9. [37] Aurrecoechea C, Brestelli J, Brunk BP, et al. GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis. Nucleic Acids Res 2009;37(Suppl. 1):D526–30. [38] Kofler RSC, Lelley T. SciRoKo: a new tool for whole genome microsatellite search and investigation. Bioinformatics 2007;23(13):1683–5. [39] Untergasser A, Nijveen H, Rao X, Bisseling T, Geurts R, Leunissen JA. Primer3Plus, an enhanced web interface to Primer3. Nucleic Acids Res 2007;35 (Web Server Issue). [40] Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol 2000;7(1–2):203–14. [41] Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990;215(3):403–10. [42] Drmota T, Kral J. Karyotype of Trichomonas vaginalis. Eur J Protistol 1997;33:131–5. [43] Excoffier L, Laval G, Schneider S. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 2005;1:47–50. [44] Takezaki N, Nei M, Tamura K. POPTREE2: software for constructing population trees from allele frequency data and computing other population statistics with Windows interface. Mol Biol Evol 2010; 27(4): 747–52. [45] Nei M, Tajima F, Tateno Y. Accuracy of estimated phylogenetic trees from molecular data. II. Gene frequency data. J Mol Evol 1983;19(2):153– 70. [46] Bandelt HJ, Forster P, Sykes BC, Richards MB. Mitochondrial portraits of human populations using median networks. Genetics 1995;141(2):743–53. [47] Keane TM, Creevey CJ, Pentony MM, Naughton TJ, Mclnerney JO. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol 2006;6(1):29. [48] Guindon S, Gascuel O, Simple A. Fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003;52(5):696–704. [49] Gouy M, Guindon S, Gascuel O. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol Biol Evol 2010; 27(2): 221–4. [50] Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 1993;10(3):512–26. [51] Hasegawa M, Kishino H, Yano T. Dating of the human–ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 1985;22(2):160–74. [52] Protas ME, Hersey C, Kochanek D, et al. Genetic analysis of cavefish reveals molecular convergence in the evolution of albinism. Nat Genet 2006;38(1):107–11. [53] Oetting WS, Armstrong CM, Ronan SM, Young TL, Sellers TA, King RA, et al. Multiplexed short tandem repeat polymorphisms of the Weber 8A set of markers using tailed primers and infrared fluorescence detection. Electrophoresis 1998;19(18):3079–83. [54] Malik SB, Pightling AW, Stefaniak LM, Schurko AM, Logsdon Jr JM. An expanded inventory of conserved meiotic genes provides evidence for sex in Trichomonas vaginalis. PLoS One 2008;3(8):e2879. [55] Li J, Zhang Y, Sullivan M, Hong L, et al. Typing Plasmodium yoelii microsatellites using a simple and affordable fluorescent labeling method. Mol Biochem Parasitol 2007;155(2):94–102. [56] Feng X, Rich SM, Tzipori S, Widmer G, et al. Experimental evidence for genetic recombination in the opportunistic pathogen Cryptosporidium parvum. Mol Biochem Parasitol 2002;119(1):55–62. [57] Asmundsson IM, Dubey JP, Rosenthal BM. A genetically diverse but distinct North American population of Sarcocystis neurona includes an overrepresented clone described by 12 microsatellite alleles. Infect Genet Evol 2006;6(5):352–60. [58] Hollander DH, Tysor JS. Isolation of a stable clone of the ameboid-adherent (AA) variant of Trichomonas vaginalis. J Parasitol 1987;73:1074–5. [59] Burch TA, Rees CW, Kayhoe DE. Laboratory and clinical studies on vaginal trichomoniasis. Am J Obstet Gynecol 1958;76:658–65. [60] Reardon LV, Ashburn LL, Leon J, Davis VE. Differences in Strains of Trichomonas vaginalis as Revealed by Intraperitoneal Injections into Mice. The Journal of Parasitology 1961;47:527–32. [61] Tai JH, Su HM, Tsai J, Shaio MF, Wang CC. The divergence of Trichomonas vaginalis virus RNAs among various isolates of Trichomonas vaginalis. Exp Parasitol 1993;76:278–86. [62] Bremner AF, Coombs GH, North MJ. Antitrichomonal activity of alphadifluoromethylornithine. J Antimicrob Chemother 1987;20:405–11. [63] Clackson TE, Coombs GH. The antagonistic effects of acetate and lactate upon the trichomonacidal activity of metronidazole. J Antimicrob Chemother 1983;11:401–6. [64] Coombs GH, Clackson TE. Antitrichomonal activity of compounds that affect DNA and its repair. J Antimicrob Chemother 1983;11:191–4. [65] Brown DM, Upcroft JA, Dodd HN, Chen N, Upcroft P. Alternative 2-keto acid oxidoreductase activities in Trichomonas vaginalis. Mol Biochem Parasitol 1999;98:203–14. [66] Wright JM, Webb RI, O’Donoghue P, Upcroft P, Upcroft JA. Hydrogenosomes of laboratory-induced metronidazole-resistant Trichomonas vaginalis lines are downsized while those from clinically metronidazole-resistant isolates are not. J Eukaryot Microbiol 2010;57:171–6. [67] Upcroft JA, Upcroft P. Drug susceptibility testing of anaerobic protozoa. Antimicrob Agents Chemother 2001;45:1810–4. [68] Voolmann T, Boreham P. Metronidazole resistant Trichomonas vaginalis in Brisbane. Med J Aust 1993;159:490. Molecular & Biochemical Parasitology 176 (2011) 135–137 Contents lists available at ScienceDirect Molecular & Biochemical Parasitology Short technical report Fluorescence in situ hybridization (FISH) mapping of single copy genes on Trichomonas vaginalis chromosomes Zuzana Zubáčová a , Vladimír Krylov b , Jan Tachezy a,∗ a b Charles University in Prague, Faculty of Science, Department of Parasitology, Vinicna 7, 128 44 Prague, Czech Republic Charles University in Prague, Faculty of Science, Department of Cell Biology, Vinicna 7, 128 44 Prague, Czech Republic a r t i c l e i n f o Article history: Received 19 October 2010 Received in revised form 20 December 2010 Accepted 21 December 2010 Available online 30 December 2010 Keywords: Trichomonas vaginalis Chromosome FISH a b s t r a c t The highly repetitive nature of the Trichomonas vaginalis genome and massive expansion of various gene families has caused difficulties in genome assembly and has hampered genome mapping. Here, we adapted fluorescence in situ hybridization (FISH) for T. vaginalis, which is sensitive enough to detect single copy genes on metaphase chromosomes. Sensitivity of conventional FISH, which did not allow single copy gene detection in T. vaginalis, was increased by means of tyramide signal amplification. Two selected single copy genes, coding for serine palmitoyltransferase and tryptophanase, were mapped to chromosome I and II, respectively, and thus could be used as chromosome markers. This established protocol provides an amenable tool for the physical mapping of the T. vaginalis genome and other essential applications, such as development of genetic markers for T. vaginalis genotyping. © 2011 Elsevier B.V. All rights reserved. Trichomonas vaginalis is a flagellated protist belonging to the Parabasala group. It is an obligate human parasite that causes the most prevalent non-viral, sexually transmitted infection of the urogenital tract [1]. The parasite reproduces asexually by binary fission, although the presence of sexual processes in T. vaginalis has also been suggested [2,3]. The mitosis of T. vaginalis is closed, as the nuclear envelope remains intact with the formation of an extranuclear spindle [4]. During mitosis, nuclear chromatin condenses into six chromosomes that can be distinguished according to their size and morphology [2]. Metaphase chromosomes consist of a pair of parallel sister chromatids; each chromatid bears its own constriction that is interpreted as a centromere [2]. This chromatid arrangement is considerably different from the typical monocentric chromosomes of higher eukaryotes. Sequencing of the T. vaginalis genome allows the estimation of its size at about 160 Mb, with a core set of about 60,000 genes, representing the highest coding capacities in eukaryotes [5]. However, the highly repetitive character of the genome, including 59 repeat families that constitute almost 39 Mb of the genome, and massive expansion of various gene families caused difficulties in genome assembly and hampered gene mapping. Therefore, applications of genetic techniques such as fluorescence in situ hybridization (FISH) are essential to improve correct joining of contigs and allow their Abbreviations: FISH, fluorescence in situ hybridization; TSA, tyramide signal amplification. ∗ Corresponding author. Tel.: +420 221951811. E-mail address: [email protected] (J. Tachezy). 0166-6851/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.molbiopara.2010.12.011 mapping on T. vaginalis chromosomes. Moreover, the physical location of single copy genes provides a tool for T. vaginalis genotyping with applications in evolutionary studies, comparative genomics and clinical praxis. Previously, FISH has been employed in T. vaginalis for the mapping of multicopy genes coding for rRNA that are present in its genome in hundreds of copies [5,6]. These genes are clustered to a single chromosome IV, which facilitates their easy detection [5]. However, our attempts to detect single copy genes on T. vaginalis chromosomes using conventional FISH were unsuccessful. In this report, we describe a protocol for highly sensitive FISH, adapted for the T. vaginalis genome, which allows for the visualization of single-copy genes on metaphase chromosomes of this parasite. The sensitivity of the system was increased by coupling FISH with horseradish peroxidase-catalyzed tyramide signal amplification (FISH-TSA). To establish a FISH-TSA technique for trichomonad chromosomes, we selected four nucleotide sequences with similar GC-content (42–49%) as probes, coding for asparaginase-like threonine peptidase (TVAG 258340), acetylornithine aminotransferase, putative (TVAG 258770), serine palmitoyltransferase (TVAG 388650), and tryptophanase (TVAG 054490). The presence of a single gene copy for each selected sequence in T. vaginalis genome was verified by BLAST searches at Tricho DB database (http://trichdb.org/trichdb/). As a positive control, we used the 28S rRNA gene (AF202181), which is present in 250 copies. Unlabeled probes were obtained by PCR amplification using trichomonad genomic DNA as a template. The DNA was isolated from the T1 strain (kindly provided by J.-H. Tai, Institute of Biomedical Sci- 136 Z. Zubáčová et al. / Molecular & Biochemical Parasitology 176 (2011) 135–137 Fig. 1. Mapping of single-copy genes in the Trichomonas vaginalis genome by FISH-TSA. (A) Metaphase spread with red signals corresponding to a 1419 bp probe hybridized to serine palmitoyltransferase gene on chromosome I. (B) Visualization of the serine palmitoyltransferase gene in an interphase nucleus. (C) The result of hybridizing a 1377 bp probe to the tryptophanase gene on metaphase chromosome II. (D) Mapping of a multicopy gene for the 28S rRNA was conducted as a positive control. The probe hybridized to chromosome IV. ences, Taipei, Taiwan) with the High Pure PCR Template Preparation kit (Roche). The primers for gene amplification were designed to obtain complete sequences of 921–1419 bp (single copy genes) and 1574 bp (28S rRNA gene). The PCR products were purified from the gel using QIAquick Gel Extraction Kit (Qiagen), labeled by digoxigenin-11-dUTP (Roche) using DecaLabel DNA Labeling Kit (Fermentas), and purified using QIAquick Gel Extraction Kit (Qiagen) in a final volume of 50 l. The metaphase chromosomes from colchicine-treated T. vaginalis cells were obtained using modifications to the described spreading technique [7]. T. vaginalis T1 cells were grown in Diamonds TYM media supplemented with 10% horse serum. The trichomonads in logarithmic growth phase (about 3 × 107 cells in 20 ml) were then incubated in the same media with 1 mM colchicine (Sigma) for 6 h at 37 ◦ C and harvested by centrifugation. Cells were hypotonised in 10 ml of 75 mM potassium chloride for 5 min at 37 ◦ C, spun down, resuspended in 5 ml of Carnoy fixative (ethanol–chloroform–acetic acid, 6:3:1) at 4 ◦ C for 20 min, spun down, resuspended in 1–2 ml Carnoy fixative and left at 4 ◦ C overnight. The cell suspension (20 l) was then dropped from a height of about 20 cm onto microscope slides (Star Frost, Waldemar Knittel Glasbearbeifungs GmbH, Braunschweig, Germany) and left to dry. The chromosome spreads were briefly submerged in 50% acetic acid to remove the cytoplasmic residues, dried, incubated for 5 min at 37 ◦ C in 50 g/ml pepsin (Sigma) in 3 mM potassium acetate and 0.01 M HCl to excise proteins, and then washed for 5 min with phosphate-buffered saline pH 7.4 (PBS) at room temperature. The chromosome spreads were post-fixed with fresh 2% formaldehyde in PBS for 30 min at room temperature (after this step, the spreads could be stored at −20 ◦ C for one month for later use). To remove any endogenous peroxidase activity, the spreads were incubated in 1% H2 O2 in PBS for 30 min at room temperature, dehydrated in series of 70%, 90% and 100% methanol, and air-dried. Subsequently, the chromosomes were denatured in 50 l of 70% deionized formamide (Sigma) in 2× SSC for 5 min at 75 ◦ C under a coverslip and dehydrated again in methanol, as described above. Each probe (20 ng in 2 l) was denatured in 50 l of 50% formamide in 2× SSC at 90 ◦ C for 5 min and immediately cooled down in ice. Denatured probe was applied on the chromosome spread, with a coverslip, and sealed with rubber cement. Hybridization proceeded overnight in a humid chamber at 37 ◦ C. The chromosome spreads were washed at high stringency 3 times for 5 min at 42 ◦ C using 50% formamide (Fluka) in 2× SSC, following 3 × 5 min washes in 2× SSC at room temperature, and 5 min in TNT buffer (100 mM Tris–HCl, 150 mM NaCl, 0.05% Tween-20, pH Z. Zubáčová et al. / Molecular & Biochemical Parasitology 176 (2011) 135–137 7.5). The spreads were blocked for 30 min in TNB blocking buffer (PerkinElmer) and incubated for 60 min with anti-digoxigenin antibody conjugated with horseradish peroxidase (Roche) diluted 1:1000 in TNB buffer at room temperature. Slides were washed 3 × 5 min in TNT buffer at room temperature. The tyramide signal amplification was performed using the TSA – Plus TMR System (PerkinElmer), according to the manufacturer’s manual. Slides were counterstained with DAPI in Vectashield mounting medium (VectorLabs). Chromosomes were observed using an IX81 microscope (Olympus) equipped with an IX2-UCB camera. Images were processed using Cell® software (Olympus). The high sensitivity of FISH-TSA allows the successful visualization of single-copy genes in a Trichomonas vaginalis karyotype with minimal background. The gene coding for serine palmitoyltransferase was mapped to the longest chromosome, chromosome I (subtelocentric/submetacentric type). The specific signal was detected on the longer chromatid arm, close to the constriction of each chromatid (Fig. 1A). The signal was frequently observed only on one of the two sister chromatids (about 20% of total spreads). When paired doublets were observed (about 5% spreads), a signal was usually asymmetric, with lower intensity on one of the chromatids. The strong signal for serine palmitoyltransferase was also detected on interphase nuclei (Fig. 1B). The single-copy gene tryptophanase was identified on acrocentric chromosome II (Fig. 1C). The signal was observed close to the center of sister chromatids of this chromosome, which do not possess apparent constriction. As in the case of serine palmitoyltransferase, the signal was often observed only on one of the sister chromatids. Unlike single-copy genes, the probe against multiple 25S rRNA genes frequently labeled both sister chromatids (85% of total spreads). The signal corresponding to 25S rRNA genes was observed as a single spot on chromosome IV, and it was associated with chromatid constriction (Fig. 1D), as observed previously for 18S rRNA genes [5]. The adaptation of FISH-TSA provided substantial improvement in sensitivity over conventional FISH for the detection of genes in the T. vaginalis genome. However, the signal was limited by the length of the probe, as has been observed for other cell lines [8,9]. Probes for both successfully mapped genes were over 1300 bp in length (1377 bp, tryptophanase; 1419 bp, serine palmitoyltransferase gene), whereas two shorter probes against asparaginase-like threonine peptidase (921 bp) and acetylornithine aminotransferase (1203 bp) repeatedly did not detect the corresponding genes (data not shown). When compared to plant and human genomes, for which probes smaller than 1 kb have been successfully used [8], observed sensitivity of FISH-TSA in T. vaginalis is somehow lower. Beside a GC content, which was comparable for T. vaginalis (42–49%) and human (41–49%) probes [8], the most important factor that influences FISH sensitivity is chromatin compactness [9]. Thus, to estimate the level of chromatin condensation in T. vaginalis metaphase chromosomes, we calculated the ratio between T. vaginalis genome size (160 Mbp) and total length of metaphase chromosomes (9.4 m), which is about 17 Mbp m−1 . This value is comparable or lower than the ratio values known for human (26.6 Mbp m−1 ) and plants (40.6–249.6 Mbp m−1 ) [9]. Therefore, the observed limitation of probe length in T. vaginalis 137 cannot simply be explained by genome compactness; other factors, such as the specific character of cellular and nuclear membranes and cytoplasm that cover the chromosomes on spreads and the specific structure of chromatin related to strongly divergent histones [10], might affect the availability of the targeted genes for the probe hybridization. The chromosomes of a number of parasitic protists, including related intestinal parasite Giardia intestinalis, could be routinely separated by pulse field gel electrophoresis (PFGE) [11]. Notably, although separation of six T. vaginalis chromosomes have been reported [12], estimated sizes of six separated DNA bands (in total 16.2 Mbp) did not correspond to the trichomonad genome size. Our attempts, and the attempts of other laboratories [13], did not lead to separation of genomic DNAs corresponding to chromosomes. Considering the large size of T. vaginalis genome (160 Mb), it is likely that chromosomes of T. vaginalis are too large to be separated by PFGE that is limited by about 10 Mbp. In conclusion, we adapted a highly sensitive FISH-TSA method for the detection of single-copy genes on metaphase T. vaginalis chromosomes. This method provides an amenable tool for the joining of available genomic contigs, the physical mapping of single-copy genes and the definition of physical linkage groups, the study of genome rearrangements, and the genotyping of T. vaginalis strains. Acknowledgement This study was supported by the Czech Ministry of Education (LC07032, MSM0021620858) to J.T. References [1] Schwebke JR, Burgess D. Trichomoniasis. Clin Microbiol Rev 2004;17:794–803. [2] Drmota T, Kral J. Karyotype of Trichomonas vaginalis. Eur J Protistol 1997;33:131–5. [3] Malik SB, Pightling AW, Stefaniak LM, et al. An expanded inventory of conserved meiotic genes provides evidence for sex in Trichomonas vaginalis. PLoS One 2008;3:e2879. [4] Brugerolle G. Etude de la cryptopleuromitose et de la morphogenese de division chez Trichomonas vaginalis et chez plusieurs de trichomonadines primirives. Protistologica 1975;11:457–68. [5] Carlton JM, Hirt RP, Silva JC, et al. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 2007;315:207–12. [6] Torres-Machorro AL, Hernández R, Alderete JF, López-Villaseñor I. Comparative analyses among the Trichomonas vaginalis, Trichomonas tenax, and Tritrichomonas foetus 5S ribosomal RNA genes. Curr Genet 2009;55:199–210. [7] Zubacova Z, Cimburek Z, Tachezy J. Comparative analysis of trichomonad genome sizes and karyotypes. Mol Biochem Parasitol 2008;161:49–54. [8] Schriml LM, Padilla-Nash HM, Coleman A, et al. Tyramide signal amplification (TSA)-FISH applied to mapping PCR-labeled probes less than 1 kb in size. Biotechniques 1999;27:608–13. [9] Khrustaleva LI, Kik C. Localization of single-copy T-DNA insertion in transgenic shallots (Allium cepa) by using ultra-sensitive FISH with tyramide signal amplification. Plant J 2001;25:699–707. [10] Marinets A, Müller M, Johnson PJ, et al. The sequence and organization of the core histone H3 and H4 genes in the early branching amitochondriate protist Trichomonas vaginalis. J Mol Evol 1996;43:563–71. [11] Upcroft JA, Krauer KG, Upcroft P. Chromosome sequence maps of the Giardia lamblia assemblage A isolate WB. Trends Parasitol 2010;26:484–91. [12] Lehker MW, Alderete JF. Resolution of six chromosomes of Trichomonas vaginalis and conservation of size and number among isolates. J Parasitol 1999;85:976–9. [13] Upcroft JA, Delgadillo-Correa MG, Dunne RL, et al. Genotyping Trichomonas vaginalis. Int J Parasitol 2006;36:821–8. 5. Unpublished data Zubacova, Z. and J. Tachezy (2011). Histone H3 variants in Trichomonas vaginalis. 41 Histone H3 variants in Trichomonas vaginalis Histones are DNA - binding proteins that evolved from archaebacterial ancestors and function in eukaryotic genome packaging. Octameres composed of core histones H2A, H2B, H3 and H4 form basic units of chromatin called nucleosomes (Dalal et al. 2007;Henikoff and Ahmad 2005). Although histones are remarkably conserved proteins, several variants of H2A and H3 have arised. Histone variants differ from core histones in their protein sequences, distinct localization in chromatin and specialized functions. Histone H2A can be replaced by H2AZ destabilizing nucleosome structure or H2AX playing role in DNA repair and T cells differentiation (Redon et al. 2002). Histone H3 can be replaced by H3.3 or cenH3 that are deposited in active genes and centromeres, respectively. Histone H3.3 has similar protein sequence as core H3. It is characterized by lysine methylation and substitues core H3 in nucleosomes in transcriptionally active chromatin. Recent experiments showed that H3.3 containing nucleosomes occur also in pericentric heterochromatin and telomeres but its role here is unclear (Szenker et al. 2011; Talbert and Henikoff 2010). On the other hand, cenH3 is highly diverged from core H3 especially in N-terminal part of the protein and shares 50-60% sequence identity only within C-terminal histone fold domain. CenH3 plays crucial role in chromosome segregation. It is an epigenetic marker of centromeres, a structure that is defining feature of eukaryotic chromosomes (Ahmad and Henikoff 2002). CenH3 nucleosomes form basement for kinetochore complex that assembles on centromeres and mediates capture of chromosomes by spindle microtubules and their migration to poles during cell division. CenH3s were identified in all eukaryotes studied to date with the exeption of parasite Trypanosoma brucei (Lowell and Cross 2004). Further specific variants of histone H3 have evolved in parasitic protists Trypanosoma brucei and Giardia intestinalis. T. brucei histone H3V with unknown function is enriched at telomeres (Lowell and Cross 2004). G. intestinalis histone H3B probably marks noncentromeric heterochromatin ((Dawson et al. 2007). Trichomonas vaginalis is a human parasite from Parabasala group of protists. Its repetitive 160 Mb genome is divided into six chromosomes (Carlton et al. 2007). Genes for all four core histones H2A, H2B, H3, H4 together with three H3 variants can be found in T. vaginalis genome database by product name and BLAST search (http://trichdb.org/trichdb/). Per haploid genome, there are 17 copies of histone H2A, 14 copies of H2B, 21 copies of H4 and 23 intronless genes coding for putative histone H3. Twenty trichomonad histones H3 are identical in their protein sequences and three variants are different (Figure 1a.). Selected H3 variants were aligned with homologs from other organisms using ClustalX ((Thompson et al. 1997) (Figure 1b.). We examined protein sequences and nuclear localization of three T. vaginalis H3 variants to find centromeric marker cenH3. T. vaginalis strain Tv T1 (kindly provided by J.H. Tai, Institute of Biomedical Sciences, Taipei, Taiwan) was used in the study. Cells were grown in TYM medium with 10% heat – inactivated horse serum (pH 6, 2) at 37°C (Diamond 1957). We studied experimentally chromatin localization of the following T. vaginalis histones H3: TVAG_270080 as core histone H3 according to the alignment (139 amino acids in lenght) which was used as a control. TVAG_087830 as H3 variant that differs from core H3 in six amino acid residues (139 amino acids in lenght). TVAG_224460 as divergent H3 variant (155 amino acids in lenght). It has N-terminal extension when aligned to core H3 and nonconserved N-terminal tail. TVAG_185390 as second the divergent variant that can not be aligned to core histone H3 at the N-terminus (135 amino acids in lenght). It does not have N-terminal extension (Figure 1a.). Depositions of H3 variants in T. vaginalis chromatin were elucidated using their tagged versions. Genes were PCR amplified, cloned and expressed in trichomonads. The template genomic DNA was isolated from trichomonads using High Pure PCR Template Preparation kit (Roche). PCR products were purified from the gel using QIAquick Gel Extraction Kit (Qiagen). Amplified genes were introduced into TagVag vector with a hemagglutinin (HA) tag at the 3´ end. Trichomonads were electroporated with histone H3_HA constructs and maintained as described in Sutak et al. (Sutak et al. 2004). After confirmation of expression of HA – tagged histones of expected size of 20 kDa in transfected trichomonads by Western blotting (data not shown), the localization of H3 variants was examined by indirect immunofluorescence using monoclonal antibody against HA - tag. Culture of trichomonads enriched in mitotic cells was obtained according to the protocol described in Torres-Machorro et al. (Torres-Machorro et al. 2009). Metaphase chromosomes were obtained by cultivation of trichomonads with 1mM colchicine for 6 hours followed by 5 minutes of hypothonic swelling in 75mM KCl (Zubacova et al. 2008). Microscopic preparations were prepared according to Sutak et al. (Sutak et al. 2004). Cells were placed on glass slides coated with 3aminopropyltriethoxysilane (Sigma) and fixed with methanol (5 min) and acetone (5 min) at – 20°C. Cells were blocked for 1 hour in PBS/0.25% BSA (Sigma)/0.25% cold water skin gelatin (Sigma)/0.05% Tween 20 (Sigma) and then incubated for 1 hour or alternatively overnight with mouse anti-HA monoclonal antibody diluted in blocking buffer to stain overexpressed histones H3. In double-labeling experiments, rabbit anti-monomethylated H3K4 antibody (Millipore) was used. After PBS washing, secondary donkey anti-mouse Alexa Fluor-488 (green) and donkey anti-rabbit Alexa Fluor-546 (red) antibodies (Molecular Probes) were added for 1 hour. After washing in PBS, the cells were mounted in VectaShield mounting medium with DAPI (VectorLabs). Fluorescence microscopy was performed using IX81 microscope, photographed with an IX2-UCB camera and processed using Cell^R software (Olympus). Cell section images were analysed using ImageJ (NIH, Bethesda, MD, USA) program to determine the distribution of histone H3 variants. Despite histones belong to the most conserved proteins, former study revealed remarkably greater sequence diversity of core H3 and H4 from protists when aligned to their counterparts from multicellular organisms (Marinets et al. 1996). The highest sequence identidy of trichomonad histone H3 TVAG_270080 with histones from higher eukaryotes was 82% thus we conclude that this histone is core H3 homolog. On microscopy preparations, it was distributed homogenously throughout the whole interphase nuclei as well as whole metaphase chromosomes (Figures 2. and 3.). The protein sequence of H3 variant TVAG_087830 is almost identical with that of core H3 except six amino acids substitutions that is typical for histone variant H3.3, a marker of transcriptionally active chromatin (Figure 1a.) (Ahmad and Henikoff 2002; Henikoff and Ahmad 2005). This H3 variant was localized in trichomonad chromatin as many small foci (Figure 2.). Activation of transcription is indicated by histone lysine methylation (Martin and Zhang 2005). To confirm experimentally that histone TVAG_087830 defines actively transcribed regions we used anti-monomethyl H3K4 antibody immunostaining to label active chromatin in nuclei of T. vaginalis. Anti-monomethyl H3K4 stained discrete foci that colocalized with HA-tagged TVAG_087830 H3 variant (Figure 4.). Centromeric H3 variant, cenH3, is the only stable epigenetic marker of centromeres, structures that characterize eukaryotic chromosomes and have essential role in genome segregation during cell division (Ahmad and Henikoff 2002). With the exception of Trypanosoma brucei, all eukaryotes studied to date have cenH3. In addition, the role of telomeric T. brucei histone H3V as a cenH3 was not completely exluded (Lowell and Cross 2004). Although, centromeric H3 variants lack conserved sequence motif, based on several protein sequence criteria that chacterize majority of cenH3s (Ahmad and Henikoff 2002; Henikoff and Ahmad 2005) we considered the posibility that one of the two divergent histone H3 variants TVAG_224460 or TVAG_185390 might be marker of trichomonad centromeres. i.) All studied eukaryotes have only one gene copy in their genomes coding for cenH3 (Cervantes et al. 2006). Both trichomonad cenH3 candidates are single copy genes. ii.) CenH3s have divergent N-terminus with typical extension when aligned with canonical histone H3 (Dawson et al. 2007). Both trichomonad putative cenH3s have divergent Nterminal part of the protein but only TVAG_224460 variant has N-terminal extension of ten aminoacids. Variant TVAG_185390 is shorter of four amino acids residues than core H3 (Figure 1a.). iii.) CenH3s typically share only 50-60% identity within C-terminal histone fold domain of core H3 (Cervantes et al. 2006; Lowell and Cross 2004). T. vaginalis putative cenH3s share 60% and 61% sequence identity with core H3 within HFD. iv.) All experimentally validated centromeric H3 variants have at least one amino acid residue insertion in loop 1 of secondary protein structure. One amino acid residue insertion is present in loop 1 of variant TVAG_185390. Variant TVAG_224460 has no amino acid insertion in loop 1. v.) Conserved glutamine residue in α-1 helix of core histone H3 is often absent in cenH3s (Lowell and Cross 2004). This glutamine is absent in TVAG_185390 but not in TVAG_224460 variant (Figure 1b). Although both proteins display some characteristics of cenH3, their immunofluorescence pattern was completely different. Surprisingly, immunostaining of H3 variant TVAG_185390 showed the localization along the entire chromosomal arms and in whole interphase nuclei like core H3. This localization does not correspond to cenH3 (Figures 2. and 3.). Second cenH3 candidate TVAG_224460 was localized to six peripheral dots what corresponds to six chromosomes in G1 phase of the mitosis. Trichomonad cenH3 was localized as peripheral double dots in G2 phase (with duplicated chromatin). The same fluorescence pattern of cenH3 was observed previously during Tetrahymena termophila mitosis (Cervantes et al. 2006). We concluded that this H3 variant marks sites of the centromeres in T. vaginalis chromatin (Figure 2.). Our results give support to former observation that trichomonad chromosomes are monocentric rather than holocentric with diffuse kinetochores (Drmota and Kral 1997). Immunofluorescence pattern of cenH3 staining differs remarkably between monocentrics and holocentrics. CenH3 staining of monocentric chromosomes looks like distinct foci corresponding to the number of chromosomes, whereas cenH3 staining in holocentric chromosomes is localized as a band on the edge of each sister chromatid facing towards spindle microtubules (Dernburg 2001). Such pattern we have not seen in T. vaginalis. Furthermore, in monocentric chromosomes centromeres are visible as a primary constrictions that are absent in holocentrics (Ahmad and Henikoff 2002; Guerra et al. 2010; Maddox et al. 2004). Secondary constrictions are seen on chromosomes in addition to primary constrictions. In mitotic metaphase T. vaginalis chromatin condenses into six chromosomes and four of them bear clearly visible constriction (Drmota and Kral 1997). Previous studies of T. vaginalis karyotype by FISH showed that constriction on chromosome IV is secondary where rRNA genes are localized (Torres-Machorro et al. 2009; Zubacova et al. 2011). Thus subtelomeric constrictions on chromosomes I, III and V are most likely primary and correspond to centromeres. The remaining chromosomes II, VI and also chromosome IV without primary constrictions are classified as acrocentric with terminal position of centromeres (Drmota and Kral 1997; Zubacova et al. 2011). We attempted to confirm the position of centromeres on T. vaginalis chromosomes by cenH3 immunostaining. Overexpressed cenH3 variant was detectable on condensed chromosomes only after slight denaturation of chromosomes with 0,1% SDS. The fluorescence pattern was different in comparison to the other variants but the morphology of metaphase chromosomes was not preserved to identify the precise position of centromeres (Figure 3.). Overexpression of neither core H3 nor its variants did not affect growth and viability of Trichomonas cells. As cenH3 was found to be essential for cell survival (Cervantes et al. 2006; Lowell and Cross 2004), study of phenotype of T. vaginalis cenH3 gene knock-outs remains attractive proposition. In T. vaginalis genome are annotated also several conserved proteins of kinetochore complex, such as CENP – B or CENP – E (Table 1.), co - localization experiments could support the role of histone TVAG_224460 as centromeric marker. In conclusion, we found that T. vaginalis has H3.3 histone variant that is targeted to transcriptionally active regions of chromatin. We showed that one of the two distinct H3 variants has function as epigenetic marker of centromeres. The function of the third trichomonad H3 variant is unclear. TVAG_129730 TVAG_224650 TVAG_100590 TVAG_399370 TVAG_341360 TVAG_414480 TVAG_211260 TVAG_085670 TVAG_499020 TVAG_137940 TVAG_497330 TVAG_445270 TVAG_452730 TVAG_315320 TVAG_470690 TVAG_270890 TVAG_014910 TVAG_270800 TVAG_239330 TVAG_270080 TVAG_087830 TVAG_185390 TVAG_224460 MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKSTG---GKTPRKSLGAKAARKSTPTID-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MARTKQTARKTTG---GKTPRKSLGAKAARKAIPTVD-SQGAKKQHRFRPGTVALREIRKYQKSTDLLIR MEEEPRPIHR---GKKRRIPLSSGAARPPDNSSDKSESQKKQRRKR--NSWLREIHFYQKTTNLLIR MASTRIYADDWSFFDDPDNRSSRLEFSIPSQWSQLDVIKPAKKQLKKKKLPNPDQKPKKKHNHVLKEIRTYQNSVDLLIP 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 66 62 80 TVAG_129730 TVAG_224650 TVAG_100590 TVAG_399370 TVAG_341360 TVAG_414480 TVAG_211260 TVAG_085670 TVAG_499020 TVAG_137940 TVAG_497330 TVAG_445270 TVAG_452730 TVAG_315320 TVAG_470690 TVAG_270890 TVAG_014910 TVAG_270800 TVAG_239330 TVAG_270080 TVAG_087830 TVAG_185390 TVAG_224460 KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGERN-KLPFQRLVREIASGFR-GDLRFQSSAIAALQEAAEAYLVGLFEDTNLCAIHANRVTIMERDVQLAMRIRGERN-KLPFCRLVKEITQSVSIGEFRYTTGAMEALQEASEAFLIKLLEDGQVCAIHARRITLMNRDLQLAQRLRGDR--RLSFQRLVREIAHQNN-PTIKFQETAIQALQEASEAFLVGMMEDGNLCTIHAQRVTIMKKDMKLAERIRGDSITE 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 72 74 Figure 1a. Sequence alignment of all histones H3 identified in T. vaginalis genome database TrichDB (http://trichdb.org/trichdb/). Histones highlighted in red were used in experiments: TVAG_270080 as core H3; TVAG_087830 as H3.3 variant; TVAG_185390 as H3 variant with unknown function and TVAG_224460 as cenH3. Histone fold domain TVAG270080 TVAG224460 TVAG185390 SpoCnp1 HsCenp-A GiCenH3 TbH3V -MA-RTKQTARKSTGGKTPRKS-----LGAKAARKSTPTIDS-QGAKKQHRFRPGTVALREIRKYQKSTDL -MASTRIYADDWSFFDDPDNRSSRLEFSIPSQWSQLDVIKPAKKQLKKKKL-PNPDQKPKKKHNH-VLKEIRTYQNSVDL -MEEEPRPIHRGKKRRIPLSS-----GAARPPDNSSDK----SESQKKQRRKRNS-WLREIHFYQKTTNL -MA-KKSLMAE--PGDPIPR------------PRK--------------KRYRPGTTALREIRKYQRSTDL -MGPRRRSRKPEAPRRRSPSPT-----PTPGPSRRG-PSLG--ASSHQHSRRRQG--WLKEIRKLQKSTHL -MSGGSRSQVARNTGHRRREISGRNMIPGVVVNARQSRSKLSSDPFSSVPRRPARVSHMEREIYHYQHNVDT -MAQMKKITPR--PVRPKSVASR---P-IQAVARAPVKKVENTPPQKRHHRWRPGTVALREIRRLQSSTDF 63 77 59 41 60 71 64 Loop1 TVAG270080 TVAG224460 TVAG185390 SpoCnp1 HsCenp-A GiCenH3 TbH3V LIRKLPFQRLVREIASGFRG------DLRFQSSAIAALQEASEAYLVGLFEDTNLCAIHANRVTIMERDVQLAQRIRGER LIPRLSFQRLVREIAHQNNP------TIKFQETAIQALQEASEAFLVGMMEDGNLCTIHAQRVTIMKKDMKLAERIRGDS LIRKLPFCRLVKEITQSVSIG-----EFRYTTGAMEALQEASEAFLIKLLEDGQVCAIHARRITLMNRDLQLAQRLRGDR LIQRLPFSRIVREISSEFVANFSTDVGLRWQSTALQCLQEAAEAFLVHLFEDTNLCAIHAKRVTIMQRDMQLARRIRGALIRKLPFSRLAREICVKFTRG----VDFNWQAQALLALQEAAEAFLVHLFEDAYLLTLHAGRVTLFPKDVQLARRIRGLE LIQKLPFARLVQELVEQIAQRDGSKGPYRFQGMAMEALQSATEEYIVELFSTALLATYHANRVTLMSKDILLVLRIQ-QR LIQRAPFRRFLREVVSNLKDS------YRMSAACVDAIQEATETYITSVFMDANLCTLHANRVTLFPKDIQLALKLRGER TVAG270080 TVAG224460 TVAG185390 SpoCnp1 HsCenp-A GiCenH3 TbH3V N--------ITE------------------------EGLG-----NLNSLR---N--------- 74 74 75 79 76 79 74 1 3 0 0 4 6 1 Figure 1b. Sequence alignment of T. vaginalis histone H3 variants (TVAG_270080 as core H3; TVAG_224460 as cenH3, TVAG_185390 as H3 variant with unknown function). Centromeric H3 variants cenH3s have typically divergent N – termini that can not be aligned between species (yellow). In contrast to cenH3s of other organisms, T. vaginalis cenH3 does not have substituted glutamine residue in the α-1 helix of the protein (pink) and lacks the insertion in the loop 1 region (grey). Abbreviations: Spo – Schizosaccharomyces pombe, Hs – Homo sapiens, Gi – Giardia intestinalis, Tb – Trypanosoma brucei. DAPI á-HA DAPI + á-HA Canonical histone H3 TVAG_270080 Histone H3 variant TVAG_185390 H3.3 variant TVAG_087830 CenH3 variant TVAG_224460, G1 phase CenH3 variant TVAG_224460, G2 phase Figure 2. Localization of trichomonad core H3 and H3 variants during interphase. DAPI á-HA DAPI+á-HA Core histone H3 TVAG_270080 Histone H3 variant TVAG_185390 H3.3 variant TVAG_087830 CenH3 variant TVAG_224460 Figure 3. Localization of trichomonad core H3 and H3 variants during metaphase. á-HA mH3K4 merge H3.3 variant TVAG_087830 Figure 4. Active regions of transcription visualised by monomethyl H3K4 immunostaining colocalize with T. vaginalis H3.3 variant TVAG_087830. Annotation Gene ID Function in mitosis cenp-A TVAG_224460 DNA binding, centromere assembly cenp-B TVAG_067240 DNA binding, centromere assembly inner centromere protein TVAG_475090 chromosomal passenger protein, cohesion inner centromere protein TVAG_123230 chromosomal passenger protein, cohesion inner centromere protein TVAG_215610 chromosomal passenger protein, cohesion cbf5, putative TVAG_407130 chromosomal passenger protein centromeric protein E TVAG_237710 outer kinetochore, movements centromeric protein E TVAG_010490 outer kinetochore, movements centromeric protein E TVAG_494200 outer kinetochore, movements centromeric protein E TVAG_026780 outer kinetochore, movements centromeric protein E TVAG_410100 outer kinetochore, movements centromeric protein E TVAG_099890 outer kinetochore, movements centromeric protein E TVAG_244940 outer kinetochore, movements centromeric protein E TVAG_192230 outer kinetochore, movements centromeric protein E TVAG_473490 outer kinetochore, movements centromeric protein E TVAG_193720 outer kinetochore, movements centromeric protein E TVAG_030630 outer kinetochore, movements centromeric protein E TVAG_480170 outer kinetochore, movements centromeric protein E TVAG_458090 outer kinetochore, movements centromeric protein E TVAG_158440 outer kinetochore, movements centromeric protein E TVAG_007500 outer kinetochore, movements centromeric protein E TVAG_307050 outer kinetochore, movements high mobility group protein TVAG_402230 centromere assembly high mobility group protein B2 TVAG_110110 centromere assembly high mobility group protein B1 TVAG_485890 centromere assembly high mobility group protein TVAG_302620 centromere assembly high mobility group protein TVAG_453060 centromere assembly high mobility group protein TVAG_161660 centromere assembly high mobility group protein TVAG_325010 centromere assembly high mobility group protein TVAG_125540 centromere assembly high mobility group protein TVAG_194720 centromere assembly high mobility group protein TVAG_346400 centromere assembly DNA topoisomerase II TVAG_038880 chromosomal passenger protein, release DNA topoisomerase II TVAG_198840 chromosomal passenger protein, release cytoplasmic dynein heavy chain TVAG_365640 movements cytoplasmic dynein heavy chain TVAG_432890 movements cohesin subunit rad21 TVAG_297540 cohesion spindle assembly checkpoint component MAD1 TVAG_186470 spindle checkpoint proteins spindle assembly checkpoint component MAD1 TVAG_077120 spindle checkpoint proteins spindle assembly checkpoint component MAD1 TVAG_483050 spindle checkpoint proteins spindle assembly checkpoint component MAD1 TVAG_054210 spindle checkpoint proteins spindle assembly checkpoint component MAD1 TVAG_181280 spindle checkpoint proteins spindle assembly checkpoint component MAD1 TVAG_023960 spindle checkpoint proteins TVAG_340790 peri-centromeric heterochromatin protein Table 1. T. vaginalis centromere – associated proteins, kinetochore and spindle – checkpoint components identifyied in genome database (http://trichdb.org/trichdb/, (Santaguida and Musacchio 2009). Reference List Ahmad K, Henikoff S (2002) Histone H3 variants specify modes of chromatin assembly. Proceedings of the National Academy of Sciences of the United States of America 99:16477-16484 Carlton JM, Hirt RP, Silva JC, Delcher AL, Schatz M, Zhao Q, Wortman JR, Bidwell SL, Alsmark UCM, Besteiro S, Sicheritz-Ponten T, Noel CJ, Dacks JB, Foster PG, Simillion C, Van de Peer Y, Miranda-Saavedra D, Barton GJ, Westrop GD, Muller S, Dessi D, Fiori PL, Ren QH, Paulsen I, Zhang HB, Bastida-Corcuera FD, Simoes-Barbosa A, Brown MT, Hayes RD, Mukherjee M, Okumura CY, Schneider R, Smith AJ, Vanacova S, Villalvazo M, Haas BJ, Pertea M, Feldblyum TV, Utterback TR, Shu CL, Osoegawa K, de Jong PJ, Hrdy I, Horvathova L, Zubacova Z, Dolezal P, Malik SB, Logsdon JM, Henze K, Gupta A, Wang CC, Dunne RL, Upcroft JA, Upcroft P, White O, Salzberg SL, Tang P, Chiu CH, Lee YS, Embley TM, Coombs GH, Mottram JC, Tachezy J, Fraser-Liggett CM, Johnson PJ (2007) Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315:207-212 Cervantes MD, Xi XH, Vermaak D, Yao MC, Malik HS (2006) The CNA1 histone of the ciliate Tetrahymena thermophila is essential for chromosome segregation in the germline micronucleus. Molecular Biology of the Cell 17:485-497 Dalal Y, Wang H, Lindsay S, Henikoff S (2007) Tetrameric structure of centromeric nucleosomes in interphase Drosophila cells. Plos Biology 5:1798-1809 Dawson SC, Sagolla MS, Cande WZ (2007) The cenH3 histone variant defines centromeres in Giardia intestinalis. Chromosoma 116:175-184 Dernburg AF (2001) Here, there, and everywhere: Kinetochore function on holocentric chromosomes. Journal of Cell Biology 153:F33-F38 Diamond LS (1957) The Establishment of Various Trichomonads of Animals and Man in Axenic Cultures. Journal of Parasitology 43:488-490 Drmota T, Kral J (1997) Karyotype of Trichomonas vaginalis. European Journal of Protistology 33:131-135 Guerra M, Cabral G, Cuacos M, Gonzalez-Garcia M, Gonzalez-Sanchez M, Vega J, Puertas MJ (2010) Neocentrics and Holokinetics (Holocentrics): Chromosomes out of the Centromeric Rules. Cytogenetic and Genome Research 129:82-96 Henikoff S, Ahmad K (2005) Assembly of variant histones into chromatin. Annual Review of Cell and Developmental Biology 21:133-153 Lowell JE, Cross GAM (2004) A variant histone H3 is enriched at telomeres in Trypanosoma brucei. Journal of Cell Science 117:5937-5947 Maddox PS, Oegema K, Desai A, Cheeseman IM (2004) ''Holo''er than thou: Chromosome segregation and kinetochore function in C-elegans. Chromosome Research 12:641-653 Marinets A, Muller M, Johnson PJ, Kulda J, Scheiner O, Wiedermann G, Duchene M (1996) The sequence and organization of the core histone H3 and H4 genes in the early branching amitochondriate protist Trichomonas vaginalis. Journal of Molecular Evolution 43:563-571 Martin C, Zhang Y (2005) The diverse functions of histone lysine methylation. Nature Reviews Molecular Cell Biology 6:838-849 Redon C, Pilch D, Rogakou E, Sedelnikova O, Newrock K, Bonner W (2002) Histone H2A variants H2AX and H2AZ. Current Opinion in Genetics & Development 12:162-169 Santaguida S, Musacchio A (2009) The life and miracles of kinetochores. Embo Journal 28:2511-2531 Sutak R, Dolezal P, Fiumera HL, Hrdy I, Dancis A, gadillo-Correa M, Johnson PJ, Muller M, Tachezy J (2004) Mitochondrial-type assembly of FeS centers in the hydrogenosomes of the amitochondriate eukaryote Trichomonas vaginalis. Proceedings of the National Academy of Sciences of the United States of America 101:10368-10373 Szenker E, Ray-Gallet D, Almouzni G (2011) The double face of the histone variant H3.3. Cell Research 21:421-434 Talbert PB, Henikoff S (2010) Histone variants - ancient wrap artists of the epigenome. Nature Reviews Molecular Cell Biology 11:264-275 Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 25:4876-4882 Torres-Machorro AL, Hernandez R, Alderete JF, Lopez-Villasenor I (2009) Comparative analyses among the Trichomonas vaginalis, Trichomonas tenax, and Tritrichomonas foetus 5S ribosomal RNA genes. Current Genetics 55:199-210 Zubacova Z, Cimburek Z, Tachezy J (2008) Comparative analysis of trichomonad genome sizes and karyotypes. Molecular and Biochemical Parasitology 161:49-54 Zubacova Z, Krylov V, Tachezy J (2011) Fluorescence in situ hybridization (FISH) mapping of single copy genes on Trichomonas vaginalis chromosomes. Molecular and Biochemical Parasitology 176:135-137 5. Thesis summary Karyotype of Trichomonas vaginalis Trichomonads reproduce asexually by mitosis. Cultures of trichomonads can be enriched in cells with condensed and well – structured chromosomes by arresting the cells in mitotic metaphase using colchicine. In T. vaginalis, mitotic index (a ratio between cells in mitosis and total cell number) after colchicine treatment was in all experiments high. About 80% cells displayed visible chromosomes. We studied chromosomes of T. vaginalis by conventional cytogenetic technique using chromosome spreading after short time hypotonic treatment followed by fluorescence staining. T. vaginalis has haploid number of six tiny chromosomes (the longest chromosome I is 2, 4 µm long) that differ in size and morphology. Each trichomonad chromosome consist of two chromatids that are arranged in parallels, however, connection between sister chromatids typical for monocentric as well as holocentric chromosomes of higher eukaryotes was either not visible or observed very rarely. Constrictions were observed on chromosomes I, III, IV. and V. suggesting that the chromosomes of T. vaginalis are monocentric with point centromeres. Chromosome number in T. vaginalis karyotype is stable in different isolates (we tested strains T1, G3 and Cp1) and does not vary after long term cultivation. We did not observed pairing of homologous chromosomes typical for meiotic division. Our failure to observe meiosis in T. vaginalis, however, does not exclude a possibility that this process may rarely occur. Karyotypes of related trichomonads In addition to Trichomonas vaginalis, we investigated and revised karyotypes in other eight species including Trichomonas tenax (6 chromosomes), Tetratrichomonas gallinarum (5 chromosomes), Pentatrichomonas hominis (6 chromosomes), Tritrichomonas foetus (5 chromosomes), Tritrichomonas augusta (5 chromosomes), Monocercomonas colubrorum (4 chromosomes), Trichomitus batrachorum (6 chromosomes) and Hypotrichomonas acosta (5 chromosomes). Karyotypes of this species have not been studied previously or data about their chromosomes are from the beginning of twenty century when the limits of light microscopy could affect the results. All species can be studied by spreading technique using colchicine treatment, however observed mitotic index was lower when compared to T. vaginalis. All trichomonads have single chromosome set with constant number of chromosomes. Sister chromatid organization without visible cohesion was the same as in T. vaginalis. 42 Nuclear DNA content and genome sizes in related trichomonads Genome sequence project of T. vaginalis revealed that this organism possesses the largest genome among parasitic protists. It was hypothesized that the large size of the genome is related with pathogenic potential of T. vaginalis. Thus, we analyzed nuclear DNA content of other eight species from Trichomonadea group. T. vaginalis cells were used as standard with known genome size of 160 Mb. Flow cytometry revealed relatively large genomes in all tested species ranging from 90 Mb in human commensal Pentatrichomonas hominis to 177 Mb in tritrichomonads. Therefore, previously suggested relationship between large genome size of trichomonads and parasitic life style, extraintestinal localization in host and phagocytosis ability was not confirmed in our analysis. The only positive correlation was found between genome size versus cell and nuclei volume in all tested trichomonads. Based on inferred earliest emergence of hypermastigids among the Parabasala group, the giant protists living in termite intestine, we suggested that unusually large genome of T. vaginalis may be not the result of recent expansion but oppositely it resulted from regressive evolution of large genome of trichomonad common ancestor. Fluorescence in situ hybridization as a new tool in research of genome organization of Trichomonas vaginalis We tested classic cytogenetic techniques including C – banding and fluorescence banding to find distribution of constitutive heterochromatin on T. vaginalis chromosomes but no bands were observed using any technique. We have, however, adapted fluorescence in situ hybridization (FISH) for T. vaginalis metaphase chromosomes that is sensitive enough to map not only repetitive sequences but also single copy genes. We found markers for chromosome I. and II. by mapping the localization of single copy genes encoding serine palmitoyltransferase and tryptophanase. Previous attempts to detect nucleolar organizer regions (NORs) on T. vaginalis by AgNO3 staining failed. FISH mapping of ribosomal RNA genes revealed single NOR in secondary constriction of chromosome IV. FISH was successfully employed in the study of genetic diversity of T. vaginalis strains of diverse origin using panel of microsatellites as genetic markers. Analysis revealed the existence of genetic variation among T. vaginalis strains and indicates participation of genetic recombination on genetic diversity of trichomonads. Our FISH protocol provides a tool for physical mapping of T. vaginalis genome as well as for development of additional genetic markers and study of genome rearrangements. 43 Histone H3 variants in Trichomonas vaginalis and identification of centromeric marker cenH3 In T. vaginalis genome database TrichDB there are 23 genes for histone H3. Twenty of them are identical and according to the sequence alignments they are core histones H3. The remaining three sequences are different. We investigated their protein sequences and nuclear chromatin localization by transfection of trichomonads with hemagglutinin – tagged histone constructs. We wanted to determine if they belong to known group of eukaryotic H3 variants such as marker of transcriptionally active genes H3.3 and divergent centromere marker cenH3 responsible for proper chromosome segregation and to distinguish whether eitherer of these histone H3 variants define trichomonad centromeres. Generally, core histones represent one of the most conserved proteins with the exception of histones from protists that show significant divergence. Overall diversity of core H3 sequences of metazoans is typically less than 15%, however, differences between protist and other organisms can be up to 44%. Comparison of T. vaginalis core H3 protein sequence with homologs from other organisms revealed only 82% identity with sequences from multicellular organisms (Dernburg, 2001; Marinets et al., 1996; Zubacova and Tachezy, unpublished data). Trichomonad putative H3.3 variant (TVAG_087830) is almost identical with protein sequence of core H3. They differ only in six amino acid residues. This histone H3 variant co localized with active transcriptional regions in chromatin as defined by monomethyl H3K4 immunostaining that strongly suggest that this histone is H3.3 ortholog. Two other divergent H3 variants in T. vaginalis TVAG_185390 and TVAG_224460 have different N – terminal parts of the protein and can be aligned with core H3 only in their histone fold domain where sharing 60% identity. H3 variant TVAG_185390 is unique for T. vaginalis. Although it shows some sequence features of cenH3 including amino acid insertion in loop 1 region and absence of conserved glutamine residue in alpha – 1 helix in protein structure, its deposition in chromatin was congruent with core H3 histone. The role of this variant in trichomonad chromatin 44 is not known. The second variant TVAG_224460 named TvcenH3 is most likely cenH3 ortholog according to specific immunostaining pattern. It localized to six and twelve discrete foci, corresponding to chromosome number of T. vaginalis in G1 and G2 phase, respectively. The sequence alignment of trichomonad core histones H3 and H3 variants revealed short N – terminal extension of two amino acids in TvcenH3 that is a feature of all studied cenH3s. Protein sequence of TvcenH3, however, seems to be unique as it lacks insertion in loop 1 and has conserved glutamine in alpha – 1 helix. As T. vaginalis cenH3 is encoded by single copy gene, it would be interesting to study phenotype of cenH3 knock – outs to test the effect on cell viability because cenH3 is considered to be essential protein for cell division. Together with cenH3 we have identified also several genes coding conserved kinetochoral proteins in T. vaginalis genome such as CENP-B, CENP-E or INCENP suggesting that segregation of T. vaginalis chromosomes is mediated by conventional attachment of spindle microtubules to kinetochores. Localization of trichomonad putative kinetochoral components as well as visualization of chromosomes capture by extranuclear spindle microtubules in T. vaginalis closed mitosis remains a subject of future studies. 45
© Copyright 2025 Paperzz