Identification of the Chromosomal Origins of Replication (oriCRSI and oriCRSII) in R. sphaeroides 2.4.1

Tim Johnson, Randi Harbour, Kristina Hernandez, Lin Lin, and Madhusudan Choudhary
Department of Biological Sciences, Sam Houston State University, Huntsville, Texas 77341

INTRODUCTION

Rhodobacter sphaeroides belongs to the α-3 subdivision of the Proteobacteria. This organism is metabolically versatile and grows under aerobic, semi-aerobic, and photosynthetic conditions. R. sphaeroides possesses a complex genome comprising two chromosomes (CI and CII) and five endogenous plasmids (1). CI and CII are ~3.0 Mbp and ~0.9 Mbp in size, respectively (2). Analysis of the R. sphaeroides genome reveals that genes for a wide variety of essential functions are dispersed between the two chromosomes. It has also recently been demonstrated that CI and CII have been both essential and ancient partners within the R. sphaeroides genome since its separation from its ancestral lineage (3).

Unlike eukaryotic cells, prokaryotic cells lack mitosis or a mitosis-like apparatus. The existence of multiple chromosomes in bacteria may therefore require well-coordinated chromosomal replication and segregation to distribute the chromosomes equally between the two daughter cells. To understand the process of DNA replication, the origin of chromosomal replication must first be identified. The origin of replication (referred to as oriC in E. coli) is the specific region of the chromosome where the DNA double helix begins to denature, allowing replication of the chromosome to initiate (4). This region varies from 40 to 80 base pairs in length among bacterial species and is usually very AT rich (70 to 80%), because the bonds between adenine and thymine are more easily denatured than the bonds between guanine and cytosine.
There are cis-elements located within and around this region that are recognized by a set of proteins, including DnaA, RepABC, and other proteins associated with chromosomal replication. These proteins bind to specific DNA sequences in the region and facilitate the initiation of chromosomal replication.

RESULTS AND DISCUSSION

The advantage of the program used in this study is that it offers a progressive search and can process an entire genome at once, whereas many currently available web-based programs accept only a small number of sequences, which is time consuming and provides limited information. An alternative approach using Artemis further validated the result, as shown in Figure 2. Furthermore, the efficacy and accuracy of this program were tested using the entire genomic sequences of Caulobacter crescentus and Sinorhizobium meliloti, and the program was able to identify all putative origins, including the one that is biologically functional.

The output of the two search programs comprised 3013 and 336 nucleotide sequence files for Chromosome I and Chromosome II, respectively. After the overlapping regions were merged, a total of 125 CI- and 16 CII-specific sequences remained. Following the protein database search, 37 CI- and 9 CII-specific sequences were chosen for further analysis. These regions of putative origins were then analyzed for the presence of known cis-elements to which DnaA and other replication proteins bind, as shown in Figure 3. Each of the 13 DNA regions (shown in Table 1), along with 300 nucleotides of upstream and downstream sequence, was then analyzed for 21 different conserved boxes for oriC, DnaA, RepABC1, and RepABC2 (5). Many of these sequences contain 2 to 5 of these conserved binding boxes, as shown in Figure 3. Based on the %AT content and the number of binding boxes, 13 possible origins of replication were identified in the R. sphaeroides chromosomes.

Comparison of the genome sequences of Caulobacter crescentus and Rickettsia prowazekii revealed that both species share a conserved cluster of genes in the hemE-hemH region that overlaps the established origin of replication in C. crescentus and the putative origin of replication in R. prowazekii (6). The origin of replication of the S. meliloti chromosome has also been predicted, and experimentally confirmed, to be approximately 400 kb from dnaA and adjacent to hemE (5). A putative origin of replication of CI in R. sphaeroides is located ~40 kb from hemE, but it remains uncertain until it is confirmed experimentally. Like R. sphaeroides, Vibrio cholerae possesses two chromosomes, and the origins of replication of its two chromosomes (oriCIvc and oriCIIvc) have been studied experimentally (7). Thus, the identification of chromosomal origins in R. sphaeroides may further our understanding of the mechanism of chromosomal replication in bacterial species that possess multiple chromosomes.

To identify the putative origins on CI and CII in R. sphaeroides, an in silico approach was employed to search CI- and CII-specific genomic sequences with variable sequence length and %GC composition. Two different computer programs, which search either overlapping or discrete segments of DNA sequence, were used to search the entire chromosome-specific sequences. All sequences of 50 to 100 nucleotides in length with >65% AT content were selected for further analysis. These sequences were then analyzed for the presence of cis-elements using the conserved consensus sequences found in Sinorhizobium meliloti (5), a species closely related to R. sphaeroides that also belongs to the α-3 subgroup of Proteobacteria.
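The selection rule described above (segments of 50 to 100 nucleotides with more than 65% AT) can be illustrated with a short Python sketch. It is not the program used in this study, whose source is not given here. The function name `at_rich_windows`, its parameters, the 0-based half-open coordinates, and the prefix-sum counting are all assumptions made for illustration.

```python
def at_rich_windows(seq, min_len=50, max_len=100, min_at=0.65, step=1):
    """Yield (start, end, at_fraction) for every window whose AT fraction
    exceeds min_at. Coordinates are 0-based and half-open (an assumption;
    the poster does not state its coordinate convention)."""
    seq = seq.upper()
    n = len(seq)

    # prefix[i] holds the number of A/T bases in seq[:i], so the AT count
    # of any window is a single subtraction instead of a rescan.
    prefix = [0]
    for base in seq:
        prefix.append(prefix[-1] + (base in "AT"))

    # Try every window length in the range at every offset, which mirrors
    # the "overlapping and progressive" search described in the text.
    for length in range(min_len, max_len + 1):
        for start in range(0, n - length + 1, step):
            at = (prefix[start + length] - prefix[start]) / length
            if at > min_at:
                yield start, start + length, at
```

Because every window length is tried at every offset, neighbouring hits overlap heavily, so the raw output has to be merged before the candidate regions are examined further.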
METHODS

In silico approach for the identification of the origin of replication: To identify the chromosomal origin of replication in Rhodobacter sphaeroides 2.4.1, a computer program was designed to search for A-T rich regions within the CI and CII sequences. The sequences were then analyzed for the presence of the consensus cis-elements that the initiator proteins require to start replication. The algorithm searches both variable nucleotide lengths (50-100 nucleotide range) and varying %AT composition (65% to 80%) in an overlapping and progressive manner, as shown in Figure 1. The program was applied to each chromosomal sequence of R. sphaeroides in FASTA format, obtained directly from the NCBI server. For efficient use of memory and input-output loading, each sequence is analyzed sequentially in a buffer. The %AT is calculated for each candidate sequence and compared against a chosen threshold value; a sequence above the threshold is written to the output data files. In addition, Artemis was used to calculate the %GC composition within each discrete 120-nucleotide segment along each of the two chromosomal sequences, as shown in Figure 2.

Identification of the conserved DNA sequence boxes in the origin region: Because the program searches in an overlapping manner, its output sequences overlapped one another and had to be combined to avoid analyzing the same region twice. The assembled sequences were searched against the R. sphaeroides protein database to determine whether any of them encode protein. Finally, the remaining sequences were analyzed using DNADynamo to determine whether they contain the consensus boxes previously identified in the chromosomal origin of S. meliloti (5).
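The two post-processing steps described above, combining overlapping hits and looking for consensus boxes, can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of DNADynamo or of the program used in this study. `merge_windows`, `reverse_complement`, and `find_boxes` are hypothetical helpers, and any motif passed to `find_boxes` (for example TTATCCACA, the DnaA box consensus commonly cited for E. coli) is a placeholder rather than one of the 21 S. meliloti boxes from reference (5).

```python
import re

def merge_windows(windows):
    """Merge overlapping or touching (start, end) half-open intervals."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]

# Complement table that also covers the IUPAC ambiguity codes handled below.
_COMPLEMENT = str.maketrans("ACGTRYKMWSN", "TGCAYRMKWSN")

# Regular-expression fragment for each supported IUPAC code.
_IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "W": "[AT]", "S": "[CG]", "N": "[ACGT]",
}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_boxes(seq, consensus):
    """Return sorted (start, strand) hits of a consensus box.

    The forward strand is scanned for the motif ("+") and for its reverse
    complement ("-"); start is always a 0-based forward-strand position.
    """
    seq = seq.upper()
    hits = set()
    for strand, motif in (("+", consensus.upper()),
                          ("-", reverse_complement(consensus))):
        pattern = "".join(_IUPAC[base] for base in motif)
        # A lookahead keeps overlapping occurrences from being skipped.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.add((match.start(), strand))
    return sorted(hits)
```

Scanning for the reverse complement of the motif on the forward strand is equivalent to scanning the opposite strand for the motif itself, which avoids a second copy of a chromosome-sized sequence.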
DNADynamo was downloaded from its publicly available website. The program searches both the forward and reverse-complement directions of the target sequence.

Figure 1. a) Program window; b) Input data; c) Output data.

Figure 2. The G+C content and possible sites for the origin of replication in CI and CII of R. sphaeroides 2.4.1 (purple, below average; yellow, above average). a) G+C content and two possible sites for the origin of replication in Chromosome I (~69% GC). b) G+C content and 9 possible sites for the origin of replication in Chromosome II (~69% GC).

Figure 3. DnaA and RepABC box binding sites for the origin of replication. a) A G+C content graph of a ~6 kb region encompassing the possible region for the origin of replication in R. sphaeroides. b) The sequence of possible regions for the origin of replication. c) DnaA and RepABC binding sites that match the DnaA and RepABC box consensus sequences. d) The sequences of the putative DnaA and RepABC boxes. (* Binding sites for multiple box consensus sequences.)

FUTURE WORK

All thirteen putative chromosomal origins of R. sphaeroides 2.4.1 will be cloned into a suicide vector (pLO1 or pSUP202). The resulting recombinant plasmids will be tested biologically to determine whether any of these origins allows the suicide plasmid to replicate autonomously in R. sphaeroides. This work is currently in progress.

Table 1.
Possible regions for the origin of replication in Chromosome I and Chromosome II.

Coordinates   Location   A+T content of A-T rich region   Sequence (with A-T rich region marked in red)
2380028-2380181   CI   69.32%   TCGCATCGCCCCTCCCGCTTCGTTGAACATTTTGGCCGATTAAATTCATTTTTTTGCCGACCATCAACGTTTATTTTCTTTTTGATGAAGATTTCCAGATTTACTTTCAGTTTTTCCATGCTTATGCCTTGGAAACTGGCAGTTTCCCGTTGGC
1700865-1701165   CI   72.58%   GGAGTGACTGAATGAAAGGCAACGATGTATCAATCATGAGATCGGAACATGAGTCTGCTCTCGAATAGAGTGAGATCAGGATTTAAGACAAAGTAAACATTTTTGGTATTCTTAAGTGATTGATTTTATTGAATAAATCAAGGGTGTCATATGGATTTGTTTTTCTTAAGAAATCGTTTAATGATTGATTTATTGATTTATTAAGAAATGGATGAATCGAGATTTGATGTTCATGGTTCTTGAATGGGTATTCCATCAATGAACATGAACATGAGTGCATTTTGGCGTAAGTGAGCGAAGC
1701171-1701360   CI   63.06%   GAACGCCACCTTTAATCCACATAGAGGTTTTGAGATCAGGAAAGGAGTCTTCTTTCAGATAAAGGTTTGAGATCAGGAAAGGAGTCTTCTTTCACATAGAGGTTTTGAGATCGGATAAACCTTTAATCCACATAGAGGTTTTGAGATCGGATAAACTGCATCGAATAAGGGTCACCATAAGCAATCTGGC
1701367-1701598   CI   68.88%   CCGCGCGAAGCGCCAATGGAATCGTTTATCCAATAGAGATTTGGACTCATACAGATCGGATAAATGATCTATGCTCAGATAGAGATTTTGAGATATCAAATTTCATCAGATAAAGGTATTTTGGATCTTCAAACTTCCTTTCTCTAACTCAGATCTCATCTGGACCTTATAGTTAAGATTCTGATTATAGCTCTATTTCTATAGGGGGACGAAACCCCCATTTTCGTGGTGA
199834-199941   CII   62.65%   TCTTCCCCAGCTTATTGAAAGACAAACTGAAGAAAAAACGAGAAATTCTGACGGTTATAGAAAGTCAGACTTACAGAAGATCCGAGGGGGTGCTTTGAAACGCACATC
205692-205820   CII   66.67%   GTTCGGCGAGGCTCCACCTGTTCCCATTGACAGGCTAATCGAAAGCTAATCTAATAAAAACAAATAAAAGCTGACATGTGATGTAAGAAAATCTGACGAAAGAGAGGGGCGGATGTCGATCCGGATGCT
365609-365736   CII   63.75%   AGTATCAACTAAAGGTTGTAACCCGTCTATACTTTAGCGATAGAGTTTCATTAAGATACAATCAAGCGGGATTGTTCCTTCGAGACTGGAACACCGTCAAAAGTGTGGGATATGGTCATTTTGACACA
478071-478177   CII   61.82%   GGAGTCAAGCATTTTGTAAACTTGTTATATACCAATCGGTTTCACTTGCTGAGCGAGGCCCCGGATAATCTGTTTTCGCATTGTTTTGGAATGATAATCACTCTG
583697-583830   CII   60.57%   GTTACATTTTGTGCAAGACCATCACGATCTGTCAATCTCATTTTGCCAGATTTTCATGCTGCACCGCAGATAAACTCGGTGATTGACTTGTTCATATGTTTATTTGACAACTAATATGATCGTAGCCCAAGCGC
634469-634658   CII   62.50%   AAGAAAGTCAGCATAGAAATTGAGAATTAAGCACTCGTCTGGCAGAAAGGCCTTCCCGAAATTACATCGGGCAATTCAAAAGAACCACCGTATTTAAGTTGACTGACGAAATACACATGTAGTTAAAATGCAGCCAATCGGAGGGCAATATGGACGGTCAGAGAGTATCACAAGAAGAGTTTGAGGAACT
738147-738270   CII   63.64%   GCGAGTGGGATGTTCAGTAAGTTGATGAGTTTATCTGCTCGATAGTGCATGTATGCACCAATATTGGTTAAGTAAACGCTACCACTTTCGATTGAATCAAAAGCCGGACAAATCACCCATGGAT
876323-876437   CII   64.39%   AAGGACGAAAACACGTCATGACTCGCTTCATACTCAGCGACCTTTGCATCTGTTGTTATATTGGGGAAATAGTAGTGGTCTTCAAATGCCATTATTTTCTTCCAATCTTTGTCGG
921529-921661   CII   64.84%   AATGGCTGATCCTTGGGTAATTTGTCCGGCTTTTGATTCAATCGAAAGTGGTAGCGTTTACTTAACCAATATTGGTGCATACATGCACTATCGAGCAAATAAACTCATCAACTTACTGAACATCCCACTCGCC

REFERENCES

1. Suwanto, A., and S. Kaplan. (1989b) Physical and genetic mapping of the Rhodobacter sphaeroides 2.4.1 genome: presence of two unique circular chromosomes. Journal of Bacteriology 171: 5850-5859.
2. Mackenzie, C., et al. (2001) The home stretch, a first analysis of the nearly completed genome of Rhodobacter sphaeroides 2.4.1. Photosynthesis Research 70: 19-41.
3. Choudhary, M., Yun-Xin Fu, C. Mackenzie, and S. Kaplan. (2004) DNA sequence duplication in Rhodobacter sphaeroides genome: evidence of an ancient partnership between chromosomes I and II. Journal of Bacteriology 187: 2019-2027.
4. Fuller, R. S., J. M. Kaguni, and A. Kornberg. (1981) Enzymatic replication of the origin of the Escherichia coli chromosome. Proc Natl Acad Sci USA 78: 7370-7374.
5. Sibley, C. D., S. R. MacLellan, and T. Finan. (2006) The Sinorhizobium meliloti chromosomal origin of replication. Microbiology 152: 443-455.
6. Brassinga, A. K. C., R. Siam, and G. T. Marczynski. (2000) Conserved gene cluster at replication origins of the α-Proteobacteria Caulobacter crescentus and Rickettsia prowazekii. Journal of Bacteriology 183(5): 1824-1829.
7. Egan, E. S., and M. K. Waldor. (2003) Distinct replication requirements for the two Vibrio cholerae chromosomes. Cell 114: 521-530.