Introduction to Plant Genome and Gene Structure
Dr. Jonaliza L. Siangliw
Rice Gene Discovery Unit, BIOTEC
RGDU at "Community of Practices" CARDI, June 18-21, 2007 (GCP_BIOTEC)

Defining genome
Coined by Hans Winkler (1920) as "genom(e)" by joining "gene" and "chromosome".
Lederberg and McCray (2001) define the genome as the complete gene complement, or the total amount of DNA per haploid chromosome set.

Brief history of genome size study in plants (Estimates of DNA amounts)
Early studies were based on analysis of isolated nuclei or cell suspensions.
These studies led to the use of the term "C-value" (Swift 1950b).
They dealt only with relative DNA contents and did not provide estimates of absolute DNA mass.
The first estimate of the absolute amount of DNA in the nuclear genome of a plant was made for Lilium species.
Genome size: the mass of DNA (in picograms, pg) per haploid nucleus.

(A) Zingeria biebersteiniana, a monocot species with a chromosome count of 2n = 4
(B) Voanioala gerardii, a rainforest palm from Madagascar with a chromosome count of 2n = 600
Source: Michael D. Bennett, Proc. Natl. Acad. Sci. USA, Vol. 95, pp. 2011-2016, March 1998

Examples of DNA amounts and chromosome sizes:
(A) Brachyscome dichromosomatica, 2n = 4, 1C = 1.1 pg
(B) Myriophyllum spicatum, 2n = 14, 1C = 0.3 pg
(C) Fritillaria sp., 2n = 14, 1C = 65 pg
(D) Selaginella kraussiana, 2n = 40, 1C = 0.36 pg
(E) Equisetum variegatum, 2n = 216, 1C = 30.4 pg
Source: T. Ryan Gregory, The Evolution of the Genome (2005)

Brief history of genome size study in plants (Main areas of focus in early genome size studies)
Developing methods for estimating plant genome size, and testing and proving their accuracy.
Exploring the range of genome sizes in different groups and at various taxonomic levels.
Investigating genome size variation, namely (a) the mechanisms responsible, (b) the rates of change, and (c) its evolutionary significance, to resolve the so-called C-value paradox.

Constancy and the origin of the "C-value"
DNA constancy hypothesis
Swift (1950a) referred to classes of DNA:
  Class I: the common diploid value
  Class II: given as the Class 1C value, representing the haploid DNA content
C-value paradox: the amount of DNA per cell does not correspond to the total gene content of the organism.
Source: T. Ryan Gregory, Paleobiology, 30(2), 2004, pp. 179-202

Brief history of genome size study in plants (Impact of molecular revolution on genome size research)
Molecular work on DNA sequences gave insight into the structure and content of individual genomes, but it also had inhibitory effects on genome size research:
  The strong emphasis on DNA C-values per se began to fade, and in the 1980s it was almost impossible to obtain grant funding to estimate genome size.
  The discovery of repetitive DNA sequences, believed to allow changes in copy number, led to reports of substantial intraspecific variation (violating the rule of DNA constancy), such as variation related to developmental, environmental, and geographical factors.
  This made a second wave of careful measurements necessary. Proving that intraspecific variation is due to technical artifacts was one of the challenges posed to genome size researchers (www.rbgkew.org.uk/cval/workshopreport.html).

Brief history of genome size study in plants (Genome size studies in the post-genomic era)
Large-scale genome sequencing programs
Study of the molecular basis of genome evolution in plants
Allow investigation and comparison of different taxa (e.g.
subspecies indica and japonica)
Allow investigation and comparison of different species within families (Oryza, Sorghum, Zea)
Allow investigation and comparison of differences between families (Poaceae and Brassicaceae)
Reveal the key molecular mechanisms involved in the gain and/or loss of DNA resulting in changes in genome size

Patterns in plant genome size evolution (The extent of variation in plant taxa)

How do plant genome sizes evolve? (Sequences responsible for the range of genome sizes encountered in plants)
Repetitive DNA in plants is composed of transposable elements (TEs).
Class I: RNA-mediated mode of transposition
  Retrotransposons: characterized by long terminal repeats (LTRs)
  Retroposons: lack terminal repeats (non-LTR retroelements) and use reverse transcriptase to transpose through an RNA intermediate
Class II: DNA-mediated mode of transposition
  Helitrons
  Mutator-like elements
  Miniature inverted-repeat transposable elements (MITEs)

How do plant genome sizes evolve? (What triggers the spread of transposable elements?)
Transcriptional activation can be induced experimentally by various biotic and abiotic stresses, such as:
  Wounding, tissue culture, and disease attack
  Adaptation to water availability in Hordeum (Vicient et al. 1999a), rice (Jiang et al. 2003)
Polyploidization and interspecific hybridization may trigger TE amplification, as in Nicotiana (Comai, 2000).

How do plant genome sizes evolve? (Satellite DNA)
Satellite DNA: tandemly arranged repeats of identical or similar sequences
Variable in size, but the most common monomeric units are 150-180 bp and 320-380 bp
Two smaller classes of satellite DNA:
  Minisatellites (10-40 bp repeats)
  Microsatellites (2-6 bp repeats)

How do plant genome sizes evolve? (Genome size increase by polyploidy)
Polyploidy: results from combining three or more basic chromosome sets or genomes in one nucleus
A prominent mode of speciation
C-value and basic genome size are not equivalent in polyploids, so the basic genome size must be indicated as the 1Cx-value (Greilhuber et al. 2005)

How do plant genome sizes evolve? (Mechanisms of genome size decrease)
Unequal intrastrand homologous recombination
  Occurs between the long terminal repeats of LTR-retrotransposons and leads to deletion of the internal DNA segment
Illegitimate recombination
  Recombination that does not require the participation of a RecA protein or large (>50 bp) stretches of sequence homology
Loss of DNA during the repair of double-strand breaks
  Repair is often accompanied by DNA deletions

Intraspecific variation in genome size (Intraspecific variation and speciation)
Speciation may occur without any change in C-value; likewise, variation in DNA amount can precede reproductive isolation and morphological diversification.
Speciation was thought to depend mainly on changes in informational genes. Comparative genomics has revealed constancy in this part of the genome, whereas non-coding sequences determine diversity and are suggested to play a major role in plant speciation.
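
The C-values quoted in these slides are masses in picograms, whereas sequence-based estimates are usually given in base pairs. A minimal sketch of the conversion follows, assuming the commonly used factor of 1 pg of DNA ≈ 978 Mbp. That factor and the function names are not from the slides and are introduced here only for illustration. The example values are the 1C amounts listed earlier for Myriophyllum spicatum, Equisetum variegatum, and Fritillaria sp.

```python
# Assumption (not stated in the slides): 1 pg of double-stranded DNA ~ 978 Mbp.
MBP_PER_PG = 978.0


def pg_to_mbp(c_value_pg: float) -> float:
    """Convert a DNA mass in picograms to megabase pairs."""
    if c_value_pg < 0:
        raise ValueError("DNA amount cannot be negative")
    return c_value_pg * MBP_PER_PG


def mbp_to_pg(size_mbp: float) -> float:
    """Convert a genome size in megabase pairs to picograms."""
    if size_mbp < 0:
        raise ValueError("genome size cannot be negative")
    return size_mbp / MBP_PER_PG


# 1C values (pg) as quoted in the slides.
examples_pg = {
    "Myriophyllum spicatum": 0.3,
    "Equisetum variegatum": 30.4,
    "Fritillaria sp.": 65.0,
}

for species, c_value in examples_pg.items():
    print(f"{species}: 1C = {c_value} pg, about {pg_to_mbp(c_value):,.0f} Mbp")
```

Keeping the factor in a single named constant makes it easy to swap in a different value if another convention is preferred.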

Methodology for estimating genome size in plants (Complete genome sequencing)
Arabidopsis thaliana: a dicot plant that was sequenced through the Arabidopsis Genome Initiative in 1997 and whose complete sequence was made public in 2000.
The genome size was estimated as 125 megabases (Mb), based on the size of the sequenced regions (115.4 Mb) plus roughly 10 Mb for the unsequenced centromeric regions.

From DNA sequence to gene discovery

DNA
DNA is a molecule that encodes genetic information. It is a long, coiled, double-stranded chain of interlocking base pairs called a double helix. There are four types of bases in DNA: A (adenine), T (thymine), G (guanine), and C (cytosine). The order of the bases in a DNA strand, called the sequence, creates a code for information: the DNA code 'ATC' has a different meaning than the code 'TCA', and so on.

DNA Structure

Central Dogma of Molecular Biology

Gene
A gene is a section of the DNA strand that carries the instructions for a specific function. For example, the 'globin' genes contain instructions for making the hemoglobin protein, which allows our blood to carry oxygen throughout the body. Humans have about 20,000 protein-coding genes, which work together in complex ways to control much of what our bodies do. While we all have the same genes, there are different versions of many genes, called alleles. For example, while most people have genes that give them pigmented (coloured) eyes, there are multiple alleles for specific eye colors. Each person has a particular combination of alleles for eye color, hair color, and so on, which makes him or her genetically unique.

Gene structure

Eukaryotic Gene Expression

Eukaryotic Gene Complexity
Enhancers: DNA elements that strongly stimulate transcription of a gene or genes.
Usually found upstream from the genes they influence.
Promoters: DNA sequences to which RNA polymerase binds prior to initiation of transcription. Usually found just upstream of the transcription start site of a gene.
5' UTR: a region of a gene that is transcribed into mRNA, becoming the 5' end of the message, but that does not contain protein-coding sequence. The 5' untranslated region is the portion of the DNA starting from the cap site and extending to the base just before the ATG translation initiation codon. While not itself translated, this region may contain sequences that alter the translation efficiency or the stability of the mRNA.
Introns: intervening sequences in the gene that are removed in the formation of the functional mRNA. Usually consist of non-coding sequence, but with alternative processing a sequence can be both an intron and an exon.
Exons: sequences in the gene that are found in the functional mRNA. Include coding sequence but may also include non-coding sequence.
3' UTR: a region of the DNA that is transcribed into mRNA and becomes the 3' end of the message, but that does not contain protein-coding sequence. Everything between the stop codon and the poly(A) tail is considered 3' untranslated. The 3' untranslated region may affect the translation efficiency or the stability of the mRNA. It also contains sequences required for the addition of the poly(A) tail to the message (including one known as the "hexanucleotide", AAUAAA).

Mutation
A mutation is a permanent change in the DNA sequence of a gene. Mutations in a gene's DNA sequence can alter the amino acid sequence of the protein encoded by the gene.

Nature of mutations
Substitution mutations: convert one type of base pair into another.
G-C to A-T and A-T to G-C changes are referred to as transitions (replacement of a purine-pyrimidine base pair by the other purine-pyrimidine base pair in the same orientation). G-C to C-G, G-C to T-A, A-T to T-A, and A-T to C-G changes are called transversions (replacement of a purine-pyrimidine base pair by a pyrimidine-purine base pair). Although transitions are more common than transversions, both kinds of mutation occur as a consequence of replication errors, both can result from chemical damage to DNA, and both have been implicated as causative factors in inherited genetic disease and cancer.
Single-nucleotide changes can change a codon to that of another amino acid, thus altering the protein. Such changes can also create a stop codon.
Small insertions/deletions comprise a second, relatively common class of mutation. Genetic changes of this sort involve the insertion or loss of a small number of contiguous base pairs (from one to several hundred).
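
The classification of substitutions described above can be sketched in a few lines of Python. The sketch labels a single-base change as a transition or a transversion and reports whether it leaves the amino acid unchanged, changes it, or creates a stop codon. The function names are hypothetical, the codon table is the standard genetic code written for the DNA coding strand, and the example codons are illustrative rather than taken from the slides.

```python
from itertools import product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def codon_effect(codon: str, position: int, alt: str) -> tuple:
    """Apply a substitution at a 0-based codon position and describe it."""
    mutated = codon[:position] + alt + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        effect = "synonymous"   # amino acid unchanged
    elif after == "*":
        effect = "nonsense"     # a stop codon is created
    elif before == "*":
        effect = "stop-lost"    # an existing stop codon is removed
    else:
        effect = "missense"     # a different amino acid is encoded
    return mutated, classify_substitution(codon[position], alt), effect


print(codon_effect("TGG", 2, "A"))
print(codon_effect("GAG", 1, "T"))
print(codon_effect("CTT", 2, "C"))
```

Small insertions and deletions are not handled here; the sketch covers only the single-base substitutions discussed in the text.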