Molecular Marker Guide

A molecular marker is a stretch of DNA (this can range from one base to millions of bases) that enables us to deduce information about the identity, demography, history and evolution of an individual, population or species. There are many different marker types, and some are best suited to particular applications. This guide is not exhaustive, as new markers and technologies are emerging all the time. Instead it aims to provide an overview of the main marker types, their uses and their limitations. To find people who can provide you with information about the right marker type(s) for your study, search for labs/researchers in the Community Database.

Links for more information

Provides details on marker types and their applicability to different ecological questions:
http://www.nature.com/scitable/knowledge/library/molecular-genetic-techniques-and-markers-forecological-15785936

Overview of different molecular markers used in conservation:
http://labs.russell.wisc.edu/peery/files/2011/12/Molecular-Markers.pdf

Information and links to resources about plant molecular markers (though note some of the information is applicable to animals too):
http://www.cgn.wur.nl/UK/CGN+Plant+Genetic+Resources/Research/Molecular+markers/

A practical guide to using microsatellites for ecologists:
http://web.ecologia.unam.mx/laboratorios/evolucionmolecular/images/a_homes/pdfs/SelkoeToonen_2006_MicrosForEcologists.pdf

Review of AFLPs and their applications:
http://www.softgenetics.com/TrendsPlantScience2007.pdf

A guide to using expressed sequence tags for ecologists:
http://www.mol-palaeolit.de/pdf/bouck/2007/5517_Bouck+Vision2007.pdf

Reviews of new genomic applications to conservation:
http://iuss.unife.it/dipartimento/biologiaevoluzione/ricerca/evoluzione-e-genetica/chioggia2011/allendorf-et-al-2010-nature-reviews-genetics.pdf
http://129.125.2.51/fmns-research/theobio/_pdf/ou_eatig10.pdf

A guide to next generation sequencing technologies:
ftp://84.237.21.152/pub_archive/lin/yu/NGS/2010_Field_guide_to_next_generation_DNAsequencers.pdf

Key words:

Polymerase chain reaction (PCR): A method that allows you to exponentially amplify a specific segment of DNA, even when the starting whole cell DNA concentration is very low.

Locus: A locus is a segment of DNA at a particular position on a chromosome (the plural is ‘loci’).

Allele: An allele is a variant at a given locus. For species that have two sets of chromosomes (diploids, like us and most mammals), one individual in a population may have alleles ‘A’ and ‘a’ at locus 1, while another has alleles ‘A’ and ‘A’.

Homozygous: A homozygous individual has a ‘genotype’ composed of two identical genetic variants (alleles) at the same locus on a pair of chromosomes.

Heterozygous: A heterozygous individual has a genotype composed of two different alleles at the same locus on a pair of chromosomes.

Polymorphic: A locus is polymorphic when different alleles occur among individuals in a population. In other words, when there is genetic variation.

Dominant: A dominant marker can only be scored as present or absent. It is not possible to distinguish between heterozygotes and homozygotes.

Codominant: It is possible to distinguish between heterozygotes and homozygotes using codominant markers.

Genomic regions

Nuclear autosomal DNA markers:
Genomic location: The autosomes, i.e. the chromosomes that are not sex chromosomes. Organelle (mitochondrial and chloroplast) DNA is also excluded.
Inheritance: Inherited from both parents in sexually reproducing organisms.
Applications: Individual identification, Inbreeding, Pedigrees, Parentage, Demography, Population structure, Selection, Speciation, Phylogenetic relationships, Phylogeography.
Limitations: Because of recombination, it can be difficult to reconstruct the inheritance of genetic lineages and to determine which allele was inherited from which parent.
Marker types: Allozymes, anonymous markers (e.g.
AFLPs, RAD tags), microsatellites, SNPs, ESTs (see below), DNA sequence data, whole genomes.

Sex-chromosome markers:
Location: Markers located on the chromosomes that determine the sex of an individual. For example, in mammals (X and Y chromosomes) females are XX and males are XY; in birds (Z and W chromosomes) females are ZW and males are ZZ.
Inheritance: Marker specific. Sex-chromosome markers are inherited asymmetrically by the sexes, but the mode varies. For example, in mammals Y chromosomes are inherited from father to son, while X chromosomes are inherited from the mother in sons but from both parents in daughters.
Applications: Useful for investigating sex-biased processes: Paternity, Maternity, Sex-typing, Sex-biased dispersal/population structure, Selection.
Limitations: Markers from sex chromosomes can be difficult to develop due to their DNA complexities. Many sex-chromosome markers have low variability, meaning they have low resolution. They cannot be used to investigate processes that affect both sexes equally.
Marker types: Microsatellites, SNPs, DNA sequence data.

Mitochondrial DNA markers:
Location: Mitochondrial organelles.
Inheritance: From mother to offspring in vertebrates (i.e. down matrilines).
Applications: Useful for looking at female-biased dispersal due to maternal inheritance, Demography, Phylogeography, Selection. Mitochondrial genomes are typically highly variable in vertebrates (which can provide high resolution), though they can be highly conserved in plants.
Limitations: It can be difficult to disentangle whether patterns are due to selection or demography. The high stochasticity associated with mtDNA markers means patterns may not be congruent with true population history. Mitochondrial data should usually be combined with autosomal data when looking at phylogeographic processes.
Marker types: SNPs, DNA sequence data, whole mitochondrial genome.

Chloroplast DNA markers:
Location: Plastid organelles.
Inheritance: Depends on the species; it can be uniparental or biparental. The size of these genomes also varies among species.
Applications: Demography, Phylogeography, Selection.
Limitations: It can be difficult to disentangle whether patterns are due to selection or demography. Chloroplast DNA data should usually be combined with autosomal data when looking at phylogeographic processes. Some plastid genomes are highly conserved (low variation and so low resolution).
Marker types: SNPs, DNA sequence data, whole plastid genome.

Marker Types

Allozymes: Genes code for proteins, some of which form enzymes. Genetic variation at the locus (or loci) coding for an enzyme may change the charge properties of the enzyme, allowing the variants to be discriminated as ‘allozymes’. These markers are codominant and may be useful in non-model organisms as they do not require prior sequence information. However, not all variation at the locus will be detected by this method, so the resolution is relatively low. In addition, the enzyme has a specific function and so could be under directional or purifying selection, which could confound assumptions made about the neutrality of genetic markers when populations are compared.

Anonymous markers: Some of these techniques use restriction enzymes (enzymes that cut DNA) and PCR to amplify thousands or millions of fragments of DNA randomly from an organism’s genome. They have the advantage that they do not require prior information about the DNA sequence of the species, and so are good for non-model organisms. Markers employing this method include AFLPs (amplified fragment length polymorphisms) and RAD tags (restriction site associated DNA markers). AFLP markers are dominant, meaning that interpretation is restricted to assessing the presence or absence of variable sites without knowing what locus they are associated with (so no information on genotype).
RAD tags provide information on heterozygosity and homozygosity by providing redundant sequence for the same segments of DNA, and therefore information on genotype. Note that some other approaches to anonymous DNA amplification (e.g. RAPD and ISSR) provide data that are difficult to interpret accurately, and are therefore not recommended.

DNA sequence data: Determining the sequence of nucleotide bases (A, T, G & C) in a segment of DNA provides detailed information on genetic variation, and is fast, easy and inexpensive when combined with PCR and based on mtDNA. This approach has been very popular for studies of vertebrate populations where mtDNA is highly variable (and so provides good resolution) and both haploid and matrilineal (avoiding complications presented by the PCR amplification and sequencing of nuclear loci, which require further steps to determine genotype prior to sequencing). New DNA sequencing methodologies, and especially next generation sequencing, will provide much more sequence data quickly, and can sequence both chromosomes of each pair in diploids through extensive re-sequencing of each position. Eventually this may provide whole genome sequences for population-level analyses, but in 2012 this would still be quite expensive. RAD tag approaches (see above) provide a subset of the full genome sequence and are more affordable, but are only necessary when high resolution is required or when the objective is to look for loci that may be under selection. Sequence data for short segments of DNA can also provide important information on selection, when the locus sequence is well described and known to be under selection. The most common example of this is the sequencing of immune system genes (e.g. ‘MHC’ genes) for comparison at the population level. Another potential resource for investigating genes under selection is ‘Expressed Sequence Tags’ (EST).
These are sequences associated with the transcripts of functional genes, and are especially useful when combined with array technologies (see below) that can screen many such loci at once.

Microsatellite DNA: These are relatively short (~20-200 bases) simple sequence repeats of DNA, such as ACACACAC or GTTGTTGTT. Microsatellites are typically highly variable in length among individuals (evolving by DNA copying errors that alter the length of the repeat array) and so provide high resolution for comparing populations or individuals (e.g. to assess levels of kinship). These loci are amplified by PCR and the genotypes resolved by standard methods that allow the process to be fast and inexpensive. Multiple loci are combined (typically 10-20, but more will provide better accuracy) in a single screening process. Care needs to be taken to ensure that the genotypes are read accurately, and to control for variation among labs when multiple labs are involved.

Single nucleotide polymorphisms (SNPs): DNA sequences vary among individuals at specific nucleotide positions, and these can be identified as ‘single nucleotide polymorphisms’. Methods can then be devised to look for variation only at these sites that are known to be variable, rather than collecting sequence data for the full stretch of DNA in which they occur. There are various ways to screen for these polymorphisms, but one common way is to use a ‘micro-array’. This is a means to set up a template (known as a ‘chip’) that includes the various SNP loci that have been identified, and then to screen DNA samples for a match to sequences found on the template. Once the preliminary work has been done, this allows a large number of loci to be assessed quickly for a large number of samples. However, the preliminary work can be expensive and time-consuming, and it is important to control for potential biases by basing the initial setup on a diverse range of populations.
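
The idea of first identifying which nucleotide positions vary among individuals can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to the same length; the sample names, the sequences and the function name find_snps are hypothetical and chosen only for illustration. It is not a description of how a micro-array or any particular SNP-discovery pipeline works, only of the underlying comparison of positions.

```python
from collections import Counter


def find_snps(aligned_seqs):
    """Return {position: Counter of bases} for every variable column.

    aligned_seqs maps a sample name to a DNA string; all strings are
    assumed to be pre-aligned and therefore of equal length.
    """
    lengths = {len(seq) for seq in aligned_seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")

    snps = {}
    # zip(*...) walks the alignment column by column (0-based positions).
    for pos, column in enumerate(zip(*aligned_seqs.values())):
        # Ignore anything that is not an unambiguous base (e.g. N or gaps).
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:  # more than one base observed: a variable site
            snps[pos] = counts
    return snps


# Hypothetical toy data: three individuals, one short aligned segment.
samples = {
    "ind1": "ACGTTAGC",
    "ind2": "ACGTCAGC",
    "ind3": "ACATTAGC",
}

snps = find_snps(samples)
for pos, counts in sorted(snps.items()):
    print(pos, dict(counts))
```

Once variable positions such as these are known, later screening can target only those sites rather than re-sequencing the full segment, which is the motivation given in the paragraph above.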
Sex chromosome markers: Once a locus that is unique to the ‘heterogametic’ chromosome (e.g. the Y chromosome in mammals) has been identified, PCR can simultaneously amplify a locus from both sex chromosomes, and thereby identify the sex of the sample. This method is fast, easy and inexpensive.

Marker Type Properties and Uses

Marker             Genome                                 Cost         Development time  Resolution (power)
Allozymes          Nuclear                                Low          Low               Low
AFLPs              Nuclear, sex chromosomes               Low          Low               Medium
RAD tags           Nuclear, sex chromosomes               Medium       Medium            Medium-high
Microsatellites    Nuclear, sex chromosomes, plastids     Low          Medium            Medium
SNPs               All                                    Medium-high  Medium-high       Medium-high
DNA sequence data  All (short segments to whole genomes)  Medium-high  Low-medium        Medium-high

Marker             Neutral/Coding                               Prior sequence data required  Applications
Allozymes          Coding                                       No                            Genetic diversity, Population structure, Adaptation
AFLPs              Both                                         No                            Demography, Population structure
RAD tags           Both                                         No                            Demography, Population structure
Microsatellites    Neutral (unless linked to functional locus)  Yes                           Inbreeding, Individual ID, Parentage/Kinship, Demography, Population structure
SNPs               Both                                         Yes                           Inbreeding, Individual ID, Parentage/Kinship, Demography, Population structure, Adaptation
DNA sequence data  Both                                         Yes                           Species ID, Phylogenetics, Phylogeography, Demography, Population structure, Adaptation