FEMS Immunology and Medical Microbiology 15 (1996) 73-79

Use of polymorphic short and clustered coding-region microsatellites to distinguish strains of Candida albicans

Dawn Field a, Lori Eggert a, Dave Metzgar a, Randall Rose b, Christopher Wills c,*

a Department of Biology, University of California at San Diego, La Jolla, CA 92093-0116, USA
b Department of Linguistics, University of California at San Diego, La Jolla, CA 92093-0116, USA
c Center for Molecular Genetics, University of California at San Diego, La Jolla, CA 92093-0116, USA

Received 8 February 1996; revised 1 April 1996; accepted 18 April 1996

Abstract

We describe the identification of polymorphic microsatellite loci in the pathogenic yeast Candida albicans. A search was conducted for all coding-region microsatellites with more than four repeats that can be found in Candida sequences in GenBank. Nine such microsatellite sequences consisting of trinucleotide motifs were found. Three of these were perfect microsatellites, while the remaining six sequences were found in one imperfect microsatellite and two compound microsatellites. Because of the close proximity of some of these repeats, all could be assayed with six PCR primer pairs. All of these microsatellite sequences were found in five nuclear genes: ZNF1, CCN1, CPH1, EFG1, and MNT2. Except for a single (CTT)n serine tract, all coded for polyglutamine tracts. Another locus with seven alleles, a region of the ERK1 protein kinase gene, was also examined, and may be a representative of a new class of highly polymorphic 'clustered' microsatellites. Such loci, in which several non-contiguous but closely linked microsatellites are clustered together, may be a useful source of DNA polymorphisms in microorganisms in which long microsatellite sequences are unavailable. All seven regions amplified were polymorphic, having between two and seven variable-length alleles in the 11 strains of Candida albicans examined.
The results of this and similar searches will facilitate epidemiological and evolutionary studies of Candida and other microorganisms.

Keywords: Candida albicans; Polymorphic microsatellite locus; Coding-region microsatellites; Clustered polymorphisms

* Corresponding author. Tel: +1 (619) 534-4113; Fax: +1 (619) 534-7108; E-mail: [email protected]
0928-8244/96/$15.00 Copyright © 1996 Federation of European Microbiological Societies. Published by Elsevier Science B.V. PII S0928-8244(96)00034-X

1. Introduction

The ability to distinguish morphologically identical strains of pathogenic microorganisms at the genetic level, and to use these markers to trace the strains' sensitivity to drugs, levels of virulence, and transmission and colonization histories, is critical to the proper diagnosis, treatment, and epidemiological study of infectious agents.

Candida albicans is a pathogenic, apparently asexual [1] yeast that can cause massive oropharyngeal, esophageal and vaginal infections and, less commonly, a wide variety of life-threatening conditions triggered by bloodstream infection [2]. The individuals most at risk are AIDS victims and other patients with a disease-compromised immune system, and those who are on immunosuppressive drugs. As the prevalence of AIDS increases, and the use of chemotherapy and organ transplantation grows, the need to understand the evolution of drug resistance, transmission, and virulence of accompanying Candida infections will grow as well. The ability to examine the level of diversity and the stability of its clonal types is essential in understanding the epidemiology of this pathogen. Another longstanding question about Candida is whether it is truly a collection of asexual pathogenic lineages or whether it retains a cryptic sexual phase.
In recent years, spurred by the connection between Candida infections and AIDS, many methods have been proposed to differentiate Candida strains. Several electrophoretic methods, morphotyping, serotyping, antibiogram, resistogram typing, biotyping, sensitivity to yeast killer toxins, and typing based on protein variability have all been used [3,4]. In general, molecular markers such as RFLPs, RAPDs, and DNA and PCR fingerprinting provide the most discriminatory power [5,6]. Such techniques assay a wide range of nuclear polymorphisms using relatively simple procedures [7-9]. Since RFLPs, RAPDs, DNA fingerprinting and PCR fingerprinting all assay anonymous loci, they can be applied to organisms with poorly known genomes, but unfortunately these techniques rarely assay codominantly inherited polymorphisms.

Ideal molecular markers should be numerous and highly polymorphic, provide reproducible results, and be simple to assay. Such markers should preferably be codominantly inherited (providing the ability to distinguish heterozygotes at a cloned locus) so that the Mendelian inheritance and evolution of the target locus can be followed through time.

Microsatellite markers fulfill these requirements and have revolutionized the genetic analysis of higher eukaryotes. These abundant hypervariable DNA sequences, made up of tandemly repeated motifs of 1-6 bases [10,11], are subject to high rates of polymerase slippage during replication [12]. The accumulation of length mutations renders microsatellites among the most variable classes of repetitive DNAs. Microsatellite sequences are dispersed throughout the length of eukaryotic chromosomes, making them ideal for genome mapping and linkage studies [13].

Before microsatellites can be assayed, informative loci must be found in the genome of the target organism. This entails cloning of the microsatellite and the flanking sequence so that PCR primers can be designed.
Microsatellites are usually cloned by screening species-specific genomic libraries. An equally successful strategy is to screen existing sequences deposited in databases such as GenBank. Here we describe an application of this latter approach to a microorganism, which has enabled us to identify seven polymorphic microsatellite loci in the pathogenic yeast Candida.

2. Materials and methods

2.1. Candida isolates

Three strains of Candida albicans from the American Type Culture Collection were typed: ATCC 60193, 11006, and 36232. Nine additional clinical isolates from a variety of sources were kindly supplied by Sharon Reed, one of which was C. krusei. C. albicans was distinguished by its ability to form pseudohyphae under anaerobic conditions and by appropriate assimilation tests.

2.2. Isolation of genomic DNA for PCR

This DNA extraction method is based on a protocol used to obtain PCR-grade DNA from forensic samples [14]. Our modified version allows DNA for use in PCR to be extracted from single yeast colonies in less than 20 min in a single tube. A sterile toothpick was used to transfer approximately one cubic millimeter of cells from single colonies grown on YEPD plates to 300 µl of 5% Chelex (w/v) suspended in ddH2O in a 0.6 ml Eppendorf tube. Each tube was boiled for 10 min, followed by 10 s of vortexing and centrifugation for 3 min at 14 000 rpm to pellet all cellular debris at the bottom of the tube. All DNA prepared in this manner was stored at -20°C and was thawed, vortexed, and centrifuged for 3 min at 14 000 rpm before each use as PCR template [14].

2.3. Identification of microsatellites and selection of PCR primers

All Candida albicans sequences deposited in GenBank (Version 90.0) were downloaded onto a Sun microcomputer using Entrez from the CD version of GenBank, and searched for possible microsatellites using a program under development by Randall Rose. A total of 89 non-duplicated protein-coding sequences, comprising a total of 160 109 nucleotides, were searched. PCR primers were designed to amplify products of 100-400 nucleotides in length. All PCR primers were purchased from Research Genetics (Huntsville, AL).

Table 1
GenBank survey of Candida microsatellites

                               Coding    Non-coding
No. of entries searched            89           138
Total bp searched             160 109       276 320
Max di found                        4            13
Max tri found                      11             7
Max tetra found                     3             8
Max penta found                     2             5
Max hexa found                      4             3
No. di > 4 repeats                  0            37
No. tri > 4 repeats                 9            18
No. tetra > 4 repeats               0             3
No. penta > 4 repeats               0             1
No. hexa > 4 repeats                0             0

The results of searching all Candida albicans sequences, both coding and non-coding, available in GenBank for microsatellites with motifs of 2-6 bases are summarized. For each repeat type, the longest repeat found as well as the total number of microsatellites longer than four repeats is given. In GenBank (version 90.0) there are 138 entries from Candida, totaling 276 320 nucleotides. Of the 138 total entries, there are 103 coding regions representing 160 109 nucleotides. Of the 103, 14 were found to be allelic redundancies, reducing the total number of unique published Candida ORFs to 89.

2.4. PCR

PCR products were visualized by end-labeling each forward primer with [gamma-32P]ATP using T4 polynucleotide kinase (Gibco BRL; according to manufacturer's specifications). The PCR reaction mix included 2 mM each dNTP, 1 unit of cloned Pfu DNA polymerase (Stratagene), 0.5 µM forward and reverse unlabeled primers, 0.05 µM end-labeled forward primer, and 1 µl DNA prepared in Chelex, in a total volume of 10 µl 1X PCR buffer. PCR was performed on a Perkin Elmer 2400 thermocycler, employing 40 cycles with 1 min of denaturation at 94°C, 1 min of annealing at 50°C and 1 min of extension at 72°C. Four µl of stop solution was added to each 10 µl reaction, and the samples were loaded on 8% polyacrylamide gels. The gels were dried for 1 h and exposed to film for 6-24 h.

Table 2
PCR primers for seven polymorphic microsatellites in Candida albicans

Primer name  GenBank locus  Gene                                              Size of GenBank clone
'ZNF1'       YSAZNF1        ZNF1 gene: zinc finger protein                    240 bp
             F 5' CCATTACAGCTGAACCAGCGAGGG 3'
             R 5' CGCTAGGTAACCTACAGA'ZTGTGGC 3'
'CCN2'       YSACLN1        G1 cyclin (CCN1; CLN1)                            221 bp
             F 5' CC'ITCCCATCCTCATACC 3'
             R 5' CCAATGA-ITCAAGTA'I-TGGATGG 3'
'CPH1'       CAU15152       CPH1 gene: Ste12-like transcription factor        216 bp
             F 5' GCCATGGGATATCAAAGC 3'
             R 5' C'I-IGGTAATGCCACCGCC 3'
'MNT2'       CAMNT2GEN      MNT2 gene: mannosyltransferase                    329 bp
             F 5' GCCAATACTGGAAACTGTGCC 3'
             R 5' CGGGCTAAAGTGACAAATGTGGC 3'
'EFG1'       CAEFGTF        EFG1 gene: putative transcription factor          214 bp
             F 5' GGTCAACAGACTGGACAGACAGC 3'
             R 5' GGTATGGGGGCACCACTAGGAGC 3'
'EFG2'       CAEFGTF        EFG1 gene: (see above)                            338 bp
             F 5' CACCTGCATCAGAACCAGG 3'
             R 5' GATGTTG'ITGGGGTGAAGGG 3'
'ERK1'       YSAERK1        ERK1 gene: protein kinase                         170 bp
             F 5' CGACCACGTCATCAATACAAATCG 3'
             R 5' CG'ITGAATGAAACITGACGAGGGG 3'

For each primer pair the GenBank locus and a brief description of the gene for which primers were designed are given. The size of the GenBank clone refers to the length of the DNA region in GenBank flanked by the forward and reverse primers.

2.5. Cloning and sequencing of ZNF1 and ERK1 alleles

Genomic DNA was amplified (substituting 1 U of Taq polymerase (Perkin Elmer) for Pfu in the above PCR protocol) and cloned into E. coli using a TA cloning kit (Invitrogen). Sequencing of individual alleles was done using the Sequenase Version 2.0 DNA sequencing kit (USB) and -40 primers according to manufacturer's specifications.

3. Results

Results of the GenBank search for all microsatellite sequences with di- to hexanucleotide motifs are summarized in Table 1 for both coding and non-coding regions of the genome. From these microsatellite sequences, the nine longest coding-region trinucleotide repeats were selected for amplification by PCR. The genes involved and the PCR primers designed are listed in Table 2. Overall length of the microsatellite sequence was the only criterion used in the initial GenBank search. These nine repeats were found to be grouped into three perfect, one interrupted, and two compound microsatellite sequences [15]. In addition to these nine repeats, which were selected for length, a region of the ERK1 gene was analyzed, because the 5' region of this protein kinase gene shows a localized clustering of several very short repeats in one 100 bp region of DNA.

Eleven strains of C. albicans were typed using these seven pairs of PCR primers. The seven coding regions tested were all polymorphic, displaying between two and seven variable-length alleles among the strains tested (Table 3). Mean heterozygosities across strains varied from 25 to 75% among these loci, and ten of the eleven strains could be uniquely genotyped. The two strains which remained indistinguishable are clinical isolates obtained at two different hospitals within the San Diego area. All strains but one were heterozygous for at least one locus, suggesting diploidy. The exception, strain 9201, was apparently homozygous at all loci, suggesting that it may be a haploid.

Table 3
The longest coding trinucleotide repeats found in Candida albicans sequences deposited in GenBank (version 90.0), with the addition of a clustered set of short polymorphic repeats found in the gene ERK1

Gene   No. of unique genotypes    Total no.    Category       Average
       found among 11 strains     of alleles   of repeat      heterozygosity
ZNF1   4                          4            perfect        55%
CCN2   5                          6            perfect        75%
MNT2   5                          4            perfect        25%
CPH1   3                          2            interrupted    33%
EFG1   4                          2            compound       41%
EFG2   3                          4            compound       58%
ERK1   8                          7            clustered      66%

[Columns 'Repeated sequences' and 'Polyaminoacid tract *': the repeat-count subscripts did not survive extraction. The legible motifs are (CAA) and (CAG) runs encoding (Q) tracts at the trinucleotide loci, and (GCTCAA), (CAA), (GCAGCC) and (CTT) repeats encoding (QA), (Q), (A) and (S) tracts at ERK1.]

For each gene, the type of repeat, the repeat, and the amino acid tract it encodes are given. Superscripts preceding individual repeats refer to the ranks of the nine longest repeats found in GenBank.
* The eleven strains tested were C. albicans. All loci failed to amplify specific products from a sample of C. krusei. The one italicized (CAA)-repeat for ZNF1 was independently sequenced from both ATCC strains 14053 and 36232, and the three italicized sequences at the ERK1 locus came from ATCC strains 14053 (196 bp), 36232 (175 bp) and 60193 (187 bp).

Fig. 1. Genotypes obtained for eight C. albicans strains and one C. krusei strain at the ERK1 locus. Strain 9201, an apparent homozygote at this locus, was also apparently homozygous at the other loci tested, suggesting that it may be a haploid. [Lane labels as printed: 8244, ATCC 14053, ATCC 36232, ATCC 60193, E, C. krusei, 9692, 7253, H, 9201, 9648.]

An autoradiograph displaying amplification products from the ERK1 locus for eight strains of C. albicans and one strain of C. krusei is shown in Fig. 1. DNA samples from C. krusei consistently yielded no specific amplification products under the PCR conditions detailed above for any of the seven loci tested, indicating considerable evolutionary divergence from C. albicans.

Sequencing of four alleles at the ERK1 locus confirmed that four of the short microsatellites were contributing to length polymorphisms (Table 3). At this locus, very short repeats can be polymorphic among strains.
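The per-locus heterozygosity and the count of uniquely genotyped strains reported in the Results can be tabulated along the following lines. This is a minimal sketch, not the procedure used in the study: the strain names, allele sizes, and the function names `observed_heterozygosity` and `multilocus_profiles` are all hypothetical, and each strain is assumed to be scored as a pair of allele sizes (in bp) per locus.

```python
# Hypothetical genotype table: strain -> locus -> pair of allele sizes in bp.
# These values are invented for illustration and are not the study's data.
genotypes = {
    "strainA": {"ERK1": (175, 187), "ZNF1": (240, 240)},
    "strainB": {"ERK1": (175, 187), "ZNF1": (240, 246)},
    "strainC": {"ERK1": (196, 196), "ZNF1": (240, 240)},
    "strainD": {"ERK1": (175, 187), "ZNF1": (240, 240)},
}


def observed_heterozygosity(genotypes, locus):
    """Fraction of strains showing two different allele sizes at a locus."""
    calls = [loci[locus] for loci in genotypes.values()]
    return sum(1 for a, b in calls if a != b) / len(calls)


def multilocus_profiles(genotypes):
    """Group strains by their combined genotype across all loci."""
    groups = {}
    for strain, loci in genotypes.items():
        # Sort loci and alleles so the key does not depend on input order.
        key = tuple((locus, tuple(sorted(loci[locus]))) for locus in sorted(loci))
        groups.setdefault(key, []).append(strain)
    return groups


het_erk1 = observed_heterozygosity(genotypes, "ERK1")  # 3 of 4 strains -> 0.75
het_znf1 = observed_heterozygosity(genotypes, "ZNF1")  # 1 of 4 strains -> 0.25
profiles = multilocus_profiles(genotypes)
# Strains whose multilocus profile is shared with no other strain.
unique_strains = [members[0] for members in profiles.values() if len(members) == 1]
```

In this toy table, strainA and strainD share a profile and so cannot be told apart, which mirrors the situation of the two indistinguishable clinical isolates; adding a further polymorphic locus is the obvious way to try to separate such pairs.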
Differences in the length of the (CAA) repeat between the GenBank sequence and ATCC strains 14053 and 36232 accounted for two of the band sizes seen in ZNF1 (Table 3). To control for PCR-induced replication errors, all PCR product-containing plasmids were 'typed' to verify that the allele cloned into these plasmids matched exactly the length of the allele when amplified from target genomic DNA.

4. Discussion

Microsatellites fall into three categories as defined by Weber [15]: perfect, compound, and interrupted. Weber found that perfect CA-repeats were more polymorphic than imperfect repeats of the other two categories [15]. Further, for all categories of repeats it appeared that length is the best predictor of the degree of polymorphism [15], an observation that agreed with earlier findings (e.g. [16,17]). We examined three perfect, two compound, and one interrupted microsatellite in Candida and found results consistent with Weber's observations.

However, the seven Candida polymorphic regions typed here differ from the microsatellite-containing regions commonly employed in higher eukaryotes [13,15]. The microsatellite sequences are all extremely short, are found in translated regions, and consist of trinucleotides or hexanucleotides, all except one of which code for polyglutamine tracts.

A region of the ERK1 gene which may belong to a fourth category of repeat, a 'clustered' microsatellite, was also selected for examination. 'Clustered' loci contain numerous non-contiguous but closely linked microsatellites. The clustered microsatellite found in ERK1 resembles a string of very short perfect microsatellites as traditionally defined. It has yet to be determined whether the rapid accumulation of polymorphism in this region signifies an unusually elevated rate of polymerase slippage.
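A database scan of the kind summarized in Table 1 can be approximated with a short regular-expression search. The sketch below is only an illustration under stated assumptions: it is not the program by Randall Rose used in this study, the function name `find_microsatellites` and its parameters are hypothetical, and it reports perfect tandem repeats only (compound and interrupted repeats would need extra logic). The default `min_repeats=5` corresponds to "more than four repeats".

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # One k-base motif followed by at least (min_repeats - 1) identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "CACA"), which are reported at the shorter length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical test sequences, invented for illustration.
long_repeat = "GGT" + "CAA" * 7 + "TTGACC"
clustered = "CAACAACAA" + "GTTGA" + "CAGCAGCAG"

print(find_microsatellites(long_repeat))               # [(3, 'CAA', 7)]
print(find_microsatellites(clustered))                 # [] with the default threshold
print(find_microsatellites(clustered, min_repeats=3))  # [(0, 'CAA', 3), (14, 'CAG', 3)]
```

The second and third calls illustrate why a length-only criterion would miss a clustered locus such as the one in ERK1: each repeat in the cluster is too short to pass the threshold on its own, so finding such regions requires lowering `min_repeats` and then looking for several hits close together.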
The relatively low heterozygosities found at these loci may be explained by the fact that they are translated into proteins, so that the degree of purifying selection acting upon them might be significant. Microsatellites appear to be relatively abundant in Candida, so that a small number of alleles at each polymorphic locus can be compensated for by surveying a larger number of loci. This can be facilitated by the multiplexing of PCR primers. We have successfully multiplexed three to four pairs of PCR primers in a single PCR reaction using the same conditions as those for a single primer pair.

Although only coding microsatellites were assayed in this study, non-coding microsatellites should be at least as polymorphic. They may, however, prove difficult to study in microorganisms because of high rates of evolutionary divergence in primer target sequences.

Coding regions were amplified in this study because we were interested in examining the variability associated with polyglutamine tracts. In Candida, as well as in a variety of other microorganisms [18], polyglutamine tracts, predominantly coded for by the CAA codon, appear abundant. In addition, glutamine repeats are found in a growing number of human triplet-repeat diseases [19], as well as in large numbers in evolutionarily diverse transcription factors [20]. They have been shown to be capable of modulating levels of transcription in vivo [20]. The fact that these glutamine repeats retain high levels of variability in natural populations suggests that such loci will be useful in epidemiological studies, in cloning of microsatellites de novo from microorganisms using (CAA)n probes, and perhaps even in gaining an understanding of the role of glutamine tract variability using a simple eukaryotic model.

The approach taken here should prove useful in determining the detailed epidemiology of Candida infections.
One example is the study of drug resistance. Since Candida is an asexual organism, the evolution of drug resistance can result from replacement of one clone with another or from de novo mutation within a clonal lineage. We have begun a study to determine which alternative is most likely during the course of infection in AIDS patients undergoing fluconazole prophylaxis. Large differences in Candida genotypes at different points during the course of an infection seem to be correlated with large changes in fluconazole resistance, suggesting clonal replacement (Field et al., manuscript in preparation). Microsatellite typing should be especially useful in a clinical setting, since PCR/Chelex-based methods should allow the rapid extraction of Candida DNA from sources such as blood and urine [21].

References

[1] Sarachek, A., Rhoads, D.D. and Schwarzhoff, R.H. (1981) Hybridization of Candida albicans through fusion of protoplasts. Arch. Microbiol. 129, 1-8.
[2] DuPont, P. (1995) Candida albicans, the opportunist. A cellular and molecular perspective. J. Am. Podiatric Med. Assoc. 85, 104-115.
[3] Merz, W. (1990) Candida albicans strain delineation. Clin. Microbiol. Rev. 3, 321-334.
[4] Hunter, P. (1991) A critical review of typing methods for Candida albicans and their applications. Crit. Rev. Microbiol. 17, 417-434.
[5] Ernst, J. (1990) Molecular genetics of pathogenic fungi: some recent developments and perspectives. Mycoses 33, 225-229.
[6] Pfaller, M. (1992) The use of molecular techniques for epidemiologic typing of Candida species. Curr. Topics Med. Mycol. 4, 43-63.
[7] Lieckfeldt, E., Meyer, W. and Börner, T. (1993) Rapid identification and differentiation of yeasts by DNA and PCR fingerprinting. J. Basic Microbiol. 33, 413-425.
[8] Meyer, W., Lieckfeldt, E., Kuhls, K., Freedman, E., Börner, T. and Mitchell, T. (1993) DNA- and PCR-fingerprinting in fungi. EXS 67, 311-320.
[9] Sullivan, D., Bennett, D., Henman, M., Harwood, P., Flint, S., Mulcahy, F., Shanley, D. and Coleman, D. (1993) Oligonucleotide fingerprinting of isolates of Candida species other than C. albicans and of atypical Candida species from human immunodeficiency virus-positive and AIDS patients. J. Clin. Microbiol. 31, 2124-2133.
[10] Tautz, D. (1989) Hypervariability of simple sequences as a general source for polymorphic DNA markers. Nucleic Acids Res. 17, 6463-6471.
[11] Weber, J. and May, P. (1989) Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Human Genetics 44, 388-396.
[12] Strand, M., Prolla, T., Liskay, R. and Petes, T. (1994) Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365, 274-276.
[13] Murray, J., Buetow, K., Weber, J., Ludwigsen, S., Scherpbier-Heddema, T., Manion, F., Quillen, J., Sheffield, V., Sunden, S., Duyk, G., et al. (1994) A comprehensive human linkage map with centimorgan density. Cooperative Human Linkage Center (CHLC). Science 265, 2049-2054.
[14] Walsh, P., Metzger, D. and Higuchi, R. (1991) Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material. BioTechniques 10, 506-513.
[15] Weber, J.L. (1990) Informativeness of human (dC-dA)n.(dG-dT)n polymorphisms. Genomics 7, 524-530.
[16] Levinson, G. and Gutman, G. (1987) High frequencies of short frameshifts in poly-CA/TG tandem repeats borne by bacteriophage M13 in Escherichia coli K-12. Nucleic Acids Res. 15, 5323-5338.
[17] Chung, M., Ranum, L., Duvick, L., Servadio, A., Zoghbi, H. and Orr, H. (1992) Evidence for a mechanism predisposing to intergenerational CAG repeat instability in spinocerebellar ataxia type 1. Nature Genetics 5, 254-258.
[18] Field, D. and Wills, C. (1996) Long, polymorphic microsatellites in simple organisms. Proc. R. Soc. Lond. B 263, 209-215.
[19] Sutherland, G. and Richards, R. (1995) Simple tandem DNA repeats and human genetic disease. Proc. Natl. Acad. Sci. USA 92, 3636-3641.
[20] Gerber, H., Seipel, K., Georgiev, O., Hofferer, M., Hug, M., Rusconi, S. and Schaffner, W. (1994) Transcriptional activation modulated by homopolymeric glutamine and proline stretches. Science 263, 808-811.
[21] Buchman, T., Rossier, M., Merz, W. and Charache, P. (1990) Detection of surgical pathogens by in vitro DNA amplification. Part I. Rapid identification of Candida albicans by in vitro amplification of a fungus-specific gene. Surgery 108, 338-346.