Broad Institute – M. Henn Dengue Virus White Paper Comparative Genomics of Dengue Virus: genome population structure, transmission, and understanding differential inflammatory disease responses A white paper for Microbial Genome Sequencing submitted by: Matthew Henn1, Irene Bosch2, and Eva Harris3 in collaboration with the Genomic Resources in Dengue Consortium (GRID) 1 Broad Institute Microbial Genome Sequencing & Analysis, 320 Charles Street, Cambridge, MA 02141 USA tel: 617.324.2341 mailto: [email protected] 2 University of Massachusetts Medical School, Center for Infectious Disease and Vaccine Research, Worcester, MA 01655 USA 3 University of California-Berkeley, School of Public Health, Berkeley, CA 94720 USA Tuesday, June 21, 2005 3/15/2010 Page 1 of 21 Broad Institute – M. Henn Dengue Virus White Paper Executive Summary Dengue virus (DEN), a category-A pathogen, is a significant threat to public health world wide. The virus is transmitted to humans by the mosquitoes Aedes aegypti and Ae. albopictus. The incidence of dengue fever (DF), dengue hemorrhagic fever (DHF), and dengue shock syndrome (DSS) are rapidly increasing, and more than 2.5 billion people live in regions endemic for the disease. Presently approximately 50-100 million cases of DF occur yearly with more than 500,000 resulting in severe and potentially fatal forms of the disease (DHF & DSS). Several factors contribute to the threat posed by dengue. Most significant are lack of cross-reactive immunity for the four DEN serotypes (DEN-1, DEN2, DEN-3, and DEN-4), hyperendemic circulation of the four different serotypes in the same geographical area, frequent worldwide travel, high population density, and lack of effective mosquito control programs. Other socioeconomic factors only amplify the challenge of dengue control. Despite the threat of this virus to public health and its importance to research on viral hemorrhagic fevers our understanding of the virulence, pathogenesis, and mechanisms of re-emergence of the dengue virus is limited. As a consequence, presently no vaccine or anti-viral therapy exists for this flavivirus, and the genetic underpinnings of disease outcomes are unknown. Studies of nucleotide divergence among he different serotypes has largely been limited to a single gene. This lack of basic information of viral diversity severely limits vaccine and anti-viral therapy development efforts. In addition, the presence of immune pathology in secondary DEN infections indicates that previous non-sterile immune response to one serotype can exacerbate the immune response to the secondary infections caused by another serotype. This host-viral interaction also represents a considerable challenge for vaccine development. We propose to sequence 3500 dengue genomes of distinct geographic origin and disease to build the genomic infrastructure needed to study and combat this virus. We will define the population structure of the virus at multiple scales (i.e. within host, local, regional, and continental) which will enable determination of the impact of introduced strains versus indigenous evolution on disease outcomes. Additionally, as this project develops, the association of genome sequences to clinical data as well as human genetic information will provide the first map of genomic distributions with reference to DF, DHF, and DSS and host immune pressure. Such a genomic resource will be able to identify important genetic correlates of disease emergence, virulence, and attenuation in addition to the stability of polymorphisms across the entire genome. Further, these data will provide the information necessary to develop cost-effective strain diagnostics for disease tracking and outbreak response. 3/15/2010 Page 2 of 21 Broad Institute – M. Henn Dengue Virus White Paper I) Dengue Virus a Burgeoning Public Health Threat Worldwide and in the USA The recent increase in activity of dengue is cause for serious attention to this global vector-borne viral disease of humans. NIAID has recognized the importance of this RNA virus (family Flaviviridae) and considers dengue a Category-A organism. More than 2.5 billion people live in areas endemic for the disease. 50-100 million new dengue infections are estimated to occur yearly worldwide [1, 2] with the severe forms of the disease, dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS), accruing in at least 500,000 of the cases [1, 3]. This is an increase from approximately 100,000 cases per annum in the 1970s (Figure 1) [3]. Morbidity from DF is also of great public health concern. In severe epidemics, approximately 5% of DHF cases lead to fatalities [4] and rates as high as 40% have been reported in endemic regions with poor medical facilities [1]. The prevalence of dengue disease worldwide is surging, particularly in the Americas, and 2001 epidemiological and media reports indicated the highest levels of the virus ever recorded [5]. This re-emergence is evident in the United States Commonwealth of Puerto Rico where all serotypes of the disease are co-circulating [6] and in Hawaii, which in 2001-2002 experienced its first autochthonous dengue epidemic since 1944 [7]. In addition, from 2001-2004 377 suspected cases of travel-associated dengue infections were reported in the United States with CDC confirmed cases occurring in 22 states [8]. Dengue consists of four known serotypes, DEN-1, DEN-2, DEN-3, and DEN-4, that are no longer geographically isolated (Figure 2). Increased travel between dengue-endemic regions, globalized markets, increased population densities, and unplanned urbanization particularly in tropical regions have significantly impacted the epidemiology of dengue infection. Both the virus and the mosquito vector have spread across the mid-latitudes (Figure 2). Due to the above societal shifts, all of the endemic regions are now hyperendemic, with multiple DEN viruses co-circulating. This in turn has had a tremendous impact on the resurgence of DF, DHF, and DSS. The potential of evolutionary mechanisms leading to new genotypes is likely enhanced by an increased genetic pool. Molecular studies using the DEN E-protein indicate a dramatic increase in genetic diversity within serotypes at the onset of widespread human transmission approximately 100 years ago (Figure 3) [9]. Also apparent is the recent appearance of DEN-1 and DEN-3 (Figure 3) [9]. How such diversity impacts virulence and greater inflammatory responses remains unanswered. The presence of immune pathology in secondary dengue infections indicates that previous non-sterile immune response to one serotype could exacerbate the innate immune response to the secondary infections caused by a different serotype than the first (i.e. Immune Enhancement Hypothesis). This intricate interaction between the virus and the host impacts disease outcomes and poses a significant challenge for vaccine development. Tetravalent vaccines with sterile immunity for all the four types of the virus are required. Introduction of dengue into the United States mainland is of particular concern. The mosquito Aedes aegypti is the dominant vector of dengue in the Americas, while Ae. albopictus is a major vector in Asia. Significantly, the recent epidemic in Hawaii implicated Aedes albopictus as the vector [7], and this mosquito is now found in at least 24 states on the US mainland [10, 11]. Dengue virus is unique from related flaviviruses in that it has adapted to humans and can be maintained in an Aedes-human-Aedes cycle without input from an enzootic cycle [1]. This urbanization of virus transmission increases its ability to spread beyond its endemic regions as it is not dependent on the forest enzootic life-stage. In 2001, travel between the US mainland and dengue-endemic regions was estimated to be 14 million passengers [7]. Additionally, the deliberate dispersal of mosquitoes infected with different DEN serotypes into urban centers could pose a significant threat to public health in multiple regions of the US where Ae. aegypti and Ae. albopictus are found. 3/15/2010 Page 3 of 21 Broad Institute – M. Henn Dengue Virus White Paper Figure 1. Reported Cases of Dengue Hemorrhagic Fever from 19551998. Source: World Health Organization Figure 3. Maximum likelihood phylogeny of four dengue serotypes showing diversity within serotypes and date (A.D.) of most recent common ancestor (MRCA). Figure is from Twiddy et al. [9]. Figure 2. Worldwide distribution of Aedes aegypti, dengue virus serotypes, and dengue epidemic activity. Figure is modified from Gubler [3] Figure 2. 3/15/2010 Page 4 of 21 Broad Institute – M. Henn Dengue Virus White Paper II) Research Objectives and Goals Remarkably, for a virus that is considered a Category-A pathogen and that infects ~100 million people each year, there are only 96 whole genome sequences (10.5 kb) deposited, with the distribution of these sequences being highly skewed among the serotypes (DEN-1=19, DEN-2=47, DEN-3=26, DEN-4=4). There is an urgent need for an expansive and collaborative research program that relates viral genomic sequences with well-defined clinical and epidemiological data as well as genetic information about the human host. Under this proposal, we will generate the genomic sequence and databases needed to overcome many present limitations to the dengue research community. Additionally, as a component of this proposal the Broad Institute, in close collaboration with the greater infectious disease research community, will develop the necessary framework and infrastructure capable of enabling high-throughput viral-host interaction studies. a) Comparative Genomics to Understand Evolution and Transmission of the Virus During an Epidemic Using comparative genomics methods we will determine the contributions of mutation, recombination, genetic drift, gene flow and the natural selection of advantageous mutations to the genetic structure of dengue virus populations in at least four epidemiologically well-defined settings (Puerto Rico, Venezuela, Nicaragua, and Vietnam). The immediate availability and density of samples at these sites will enable us to determine how genetic diversity is changing through time. This will allow us to resolve whether viral evolution occurs indigenously, or whether new strains are imported from elsewhere, perhaps triggering major epidemics. Ultimately data from these sites will provide a means to more accurately model how the virus could respond to the introduction of a vaccine program. The use of viral samples collected throughout particular epidemics will allow us to test the hypothesis that viruses are selected during the course of epidemics and are correlated with increased severity [12-14]. This connection will be investigated by association with the clinical outcome of specific cases as well as with the ratio of severe to total dengue cases in the population over time (particularly when the epidemic is caused by predominantly one serotype). Furthermore, by examining the phylogenetic relationships between the viral sequences sampled (i.e. using coalescent methods) we will estimate the relative population growth rate of dengue viruses in each geographic region, as well as whether this rate is changing through time and whether it differs by serotype. It is important to understand the pattern of genetic change that has occurred in endemic regions. Previous studies have shown that changes in the DEN population genetic structure are associated in part with strain introductions from both geographically and socio-economically related populations [15-20]. For example, throughout the Americas, DEN-2 Asian–American subtype IIIb, linked to DHF/DSS cases, appears to have displaced the American subtype V, which has only been associated with DF [18-23]. Thus, the first epidemic of DHF/DSS in the Americas in 1981 was attributed to a DEN-2strain imported from Southeast Asia, where the severe form of the disease had been endemic [19]. Although the onset of DHF/DSS in the Americas in 1981 was associated with the introduction of a novel strain of DEN-2, changes in dengue severity have also been associated with hyperendemic transmission patterns (the cooccurrence of multiple dengue serotypes in the same locality) [24]. There is strong evidence linking disease severity with secondary heterologous infection [25]. In endogenous regions, secondary infections have become increasingly more frequent over periods of time in which the virus has mutated. Complete genome sequences are central to advancing the understanding of DEN population structure and transmission, yet limited studies have taken into consideration the entire viral genome in such studies (Appendix A). The DEN population genomic map generated via this proposal will allow the dengue research community to better understand the population dynamics of the dengue genome during disease outbreaks and the genetic diversity underlying severe forms of the disease. Given the relative lack of 3/15/2010 Page 5 of 21 Broad Institute – M. Henn Dengue Virus White Paper knowledge of the genetic mechanisms of pathogenicity and virulence, full genome sequences will provide the means to identify important genetic motifs beyond the envelope protein and how the evolution of these elements is associated with disease re-emergence. b) Comparative Genomics to Develop Strain Diagnostics and Resolve the Contribution of Viral Determinants to Disease Severity Some data suggest that hyperendemicity may not be the only factor contributing to the increased disease severity and that strain virulence may play an important role [26-30]. Some genotypes, or indeed serotypes, of dengue virus are likely unequivocally more virulent than others. Understanding if and how viral factors contribute to disease pathogenesis could be very useful in the rational design and targeted deployment of new anti-virals or prophylactic dengue vaccines. It would also provide valuable information as to whether the spread of particular strains should be monitored with more attention. Sequence variation in dengue viruses is probably related to virulence [31]. As mentioned above, epidemiological studies have suggested that the DEN-2 Asian-American genotype, but not the American genotype, are capable of causing DHF in secondary infections [19, 32]. These observations led to the hypothesis that DEN-2 Asian-American genotype viruses were more virulent than DEN-2 American genotype viruses. Similar observations have been reported with respect to genetically distinct variants of DEN-3 in Sri Lanka, where strains isolated before 1989 were never associated with severe dengue while those circulating after 1989 were correlated with the emergence of DHF in Sri Lanka [33]. DEN-4 generally leads to less severe disease than other serotypes, particularly DEN-2 and DEN-3 [34]. Dengue virus enters the body via a mosquito bite and is thought to replicate within a number of immunological cells including: macrophages, monocytes, B cells, mast, dendritic, and endothelial cells [35]. A dominant means of viral entry into host cells is thought to occur through the enchanced binding of viral proteins to Fc! host-cell receptors. Besides the Fc! receptor that is implicated in Antibody Dependent Enhancement (ADE), knowledge of other DEN receptors is limited. Recently genetic determinants in the E gene and 5’ and 3’ untranslated regions of the DEN genome were associated with dengue virulence [25], and it is known that dengue virus reactive T-cells vary in their ability to recognize different serotypes [25]. Additionally, entry and replication of dengue in dendritic cells can be blocked by antibodies to the carbohydrate recognition domain of DC-SIGN, a receptor that is utilized with different efficiencies by different DEN serotypes [36]. All of these findings indicate a multiplicity of genetic determinants of host infection in addition to immunity responses, and suggests specific host-viral associations that are dependent of viral genome changes. What is urgently needed to understand the viral determinants of disease severity is a genome-wide strategy that identifies polymorphic loci that are associated with distinct infection outcomes (e.g., DF versus DHF). Presently there is a lack of sufficient viral genomic data with wich to link disease outcome with viral polymorphisms. Genomic sequence generated under this proposal will directly address this need. For example, recent studies have shown that the nonstructural regions of dengue virus are involved in counteracting the antiviral response of the host cell [37, 38]. An appropriately structured genome database could resolve polymorphisms at these loci and associate this information with the virus’ ability to block the antiviral response of the host. The ensuing genome sequence reference database we generate will be invaluable for disease diagnostics and dengue monitoring. A long-term goal of this proposal and of GRID is the development of high-throughput viral resequencing arrays. As an example, the ability to quickly type strains using population-specific molecular markers could be applied to track and ultimately 3/15/2010 Page 6 of 21 Broad Institute – M. Henn Dengue Virus White Paper aid the determination of both the potential threat and the appropriate response to outbreaks or the malicious release of this Category-A biological agent. c) Comparative Genomics to Understand In Vivo Diversity and Host–Viral Interaction RNA viruses exhibit a high degree of sequence variation not just between isolates as addressed above, but also among viruses within an individual. This in part is due to the viral RNA polymerase and its inability to proofread. As a result, viral infections within an individual exist as a population of closely related sequences that are called quasispecies [39-44]. The role of quasispecies in disease pathogenesis, transmission, and viral evolution is suspected to possibly be important for RNA viruses. Recent data on poliovirus indicate that intra-host genetic diversity is important for pathogenesis and tropism in mammalian hosts (Raul Andino, personal communication). A study using the dengue envelope (E) gene showed that genome-defective viruses exist in vivo [43]. Sequence generated under this proposal will include a pilot initiative to characterize the extent of viral diveristy within a host. Of particular interest are the comparisons between individuals with different disease outcomes and multiple time points within an individual infection (e.g., first acute sample and sample at defervescence). These data will help to define within-host DEN diversity and allow an assessment of the capacity of present isolation techniques from serum to capture actual viral diversity. Generation of this data will assist in project design and sequencing strategy for this project and future viral initiatives. Dengue is an example of a complex disease; host genetic, viral and environmental factors likely contribute to infection outcome. Understanding the contribution of host genetics to dengue disease susceptibility/resistance is made more difficult by the absence of a robust animal model in which to identify candidate susceptibility/resistance genes. Notwithstanding these limitations, host genetic variability at the HLA class I loci, Fc! receptor, the vitamin D receptor, and the promoter region of CD209 have all been associated with distinct symptomatic phenotypes [45-49]. There is an emerging theme in viral research that host immune pressures play a significant role in disease outcomes and in viral evolution both within individuals (i.e. quasispecies) and at the geographic scale. For example, research on HIV has revealed that mutations within HIV populations are associated with with expression of specific HLA class I alleles [50]. In the case of dengue, secondary exposures to the virus induce a memory response in immunologically primed individuals that can either clear the infection or contribute to the virus’ pathology (Immune Enhancement Hypothesis, IEH); the genetic underpinnings of this are unknown, but specific HLA class I alleles have been associated with the different outcomes [49, 51]. Combined host-viral genomic studies are beginning to reveal that in the setting of highly polymorphic pathogens (e.g. DEN), immune pressures eventually drive some escape mutations to fixation. This results in some regions of the virus no longer being immunogenic to individuals expressing particular HLA molecules [50]. For HIV, data are beginning to reveal population-specific polymorphisms in HIV, suggesting the loss of some immunogenic regions of the virus in one population expressing high levels of a particular HLA allele, while this region remains intact in another population expressing only low levels of the allele (Todd Allen 2005, personal communication). Generation of an association database that can identify a protective link between host HLA-type or other candidate genes and dengue viral genomes will facilitate vaccine development for this highly variable pathogen (Figure 3), and aid identifying high risk groups. Such a resource could enable a more focused selection of specific antigens for inclusion in vaccines aimed at engendering a particular immune response and avoidance of an immune enhancement response. While our current efforts are focused on understanding the interaction between human host and dengue, future efforts are anticipated with the mosquito vector as well to provide a fully integrated genomic view of the virus’ life-history. 3/15/2010 Page 7 of 21 Broad Institute – M. Henn Dengue Virus White Paper d) Comparative Genomics Enhanced Vaccine Development and Antiviral Therapy The development of successful vaccines and antiviral therapies against dengue is dependent on a comprehensive understanding of the genetic variability across all serotypes and genotypes of the entire viral genome. The antigenic variation of pathogens has implications in the immune response of the host through mechanisms of immune evasion, immune attenuation and host mimicry. Presently only sequence information for the envelope glycoprotein (E) gene exist in substantial number for use in vaccine attenuation studies. Attenuated strains of all four DEN serotypes exist but a lack of animal models and in vitro markers of attenuation slow further development of these vaccine candidates [52]. This limitation could be overcome by that ability to search whole genome data for loci polymorphisms and regions susceptible to attenuation by single base pair or amino acid changes. Additionally, the association of genomic diversity with virulence and disease outcome will permit the use of structural genomics to understand conformation changes to disease specific proteins. III) Strain and Site Selection Three geographic areas were selected for this proposal -the Caribbean, the Americas, and Southeast Asiafrom which four focus collections/areas were selected: Puerto Rico, Venezuela, Nicaragua, and Vietnam. These sites were selected as they are existing collections with on going sample collection that can meet the criteria of our scientific goals. All four serotypes of the virus are circulating at each of the geographic locations. These sites provide a continuum of sample resolutions that can assess viral population structure at highly refined scales (i.e., at the resolution of counties in Puerto Rico, Figure 4), country scales (i.e., Nicaragua and Venezuela), and global scales (i.e., all sites). The Puerto Rico surveillance system represents a unique model to conduct phylogenetic analysis of dengue virus over an extended period of time with documented epidemiological and clinical data in a region with very few reported reintroductions (Appendix C). In addition, the CDC collections in Puerto Rico and the Americas are the only known collections with a municipality sampling resolution (Figure 4). In contrast, to Puerto Rico the other sites represent scenarios in which re-introductions have occurred, but also like Puerto Rico have well documented epidemioligcal and clinical data across multiple epidemics and from within individual outbreaks of the disease. The Venezuela, Nicaragua, and Vietnam sites also represent case studies with collections of both RNA and serum samples for quasispecies analysis, as well as existing initiatives in host-viral studies. A significant effort was made by the Broad and GRID to identify sample collections and ongoing dengue sampling efforts that could adequately address our defined scientific goals and generate required cDNAs in a timely manner. A list was generated of existing sample collections (Appendix B) and each collection was assessed for resolution of sampling, collection size, collection content, existence of appropriate clinical data, and compatibility with sampling efforts currently ongoing. From this list, our sampling sites were selected. One or more members of GRID has access to and have agreed to supply required viral cDNAs and/or host DNAs for each region. Our selections ensure that all serotypes and known genotypes are included in the sequencing effort. The selection of these sites will result in a highly structured, comprehensive DEN genome database 3/15/2010 Page 8 of 21 Figure 4. Distribution of DEN-2 and DEN-3 in Broad Institute – M. Henn Dengue Virus White Paper While the initial focus of this project is on the collection sites noted, GRID is aware of a growing interest in this project from other key geographic regions and dengue researchers. We anticipate the potential inclusion of samples from other geographic regions and countries within selected focus areas. Such a decision will be made based on how significant genomic viral diversity is within our proposed sampling sites and the scientific value of broader geographic coverage versus deeper within site strain coverage based on these findings. Approval of this project is anticipated to facilitate removal of current limitations on sample access providing the necessary seed to justify additional funding to other granting agencies. IV) DNA Sequencing, Assembly, and Annotation • Viral Sequencing Strategy (dengue genome size = 10.7kb) We anticipate generating 3500 genome sequences derived from a combination of low-passage viral isolates and serum sample amplifications. The viral sequencing in this proposal will support two distinct sequencing objectives. First, we will determine the entire sequence of individual viral genomes using RNA from low-passage isolates as a means to capture viral diversity at the geographic scale and to relate strain genome structure to disease outcomes. Second, we will determine the consensus genome sequence of viral populations within an individual host (i.e. quasispecies). These “mixed” genome sequences represent a sample of the diversity within a host and are required for host-pathogen association studies. Viral Isolates. Low-passage isolates will yield sequence for individual full-length genomes suitable for geographic diversity and disease associated studies. These samples are limited by their potential misrepresentation of the frequency of individual sequence polymorphisms within the host’s quasispecies complex. Therefore these samples are not suitable for our host-viral interaction studies. Serum Samples. Serum sample amplifications represent bulk viral RNA from mixed populations of genotypes within a host (i.e. quasispecies). To describe this diversity we will employ two different methods that can both ultimately generate a consensus genome sequence that represents the mixed population. For these samples we will generate a genome consensus sequence using either a PCR-based or clone-based approach. These two methods have different sensitivities. The clone-based method will provide enhanced sensitivity for quasispecies analysis as sequence from strains at low dominance within the host will be captured. The consensus sequences generated are suitable to accurately identify hostassociated and disease related polymorphisms. Employing the two different sequencing approaches (i.e. PCR versus clone) to a subset of samples will allow us to determine how well PCR-based consensus sequence generation represents actual viral diversity within the host. Strategy. As identification of nucleotide changes are critical to the analysis, high quality sequence is required to identify real differences rather than sequencing errors. This will be achieved through the sequencing of overlapping PCR fragments. PCR-based sequencing: Our sequencing approach is currently based on the utilization of a set of primer pairs overlapping the entire DEN genome. cDNA will be produced from RNA extracted from low passage viral isolates or serum samples. RT-PCR from viral isolate samples will be performed using a standardized high fidelity PCR protocol at each of the four main focus regions and result in approximately 5-8 overlapping amplicons. These PCR products will be provided to the Broad Institute MSC for high throughput sequencing. RT-PCR amplification and sequencing in the forward and reverse 3/15/2010 Page 9 of 21 Broad Institute – M. Henn Dengue Virus White Paper direction of 500bp amplicons generated from the original 5-8 overlapping amplicons will result in an effective coverage of 8X. Clone-based sequencing: This approach will also utilize a set of overlapping primer pairs for the entire DEN genome. Approximately 30 clones will be picked for each PCR fragment (i.e. 30X coverage) and then sequenced. Genome data will be assembled at the Broad Institute. Genome assembly is an active area of research at each MSC and improvements in the algorithms are implemented on an ongoing basis. Annotation teams at the Broad will use all available evidence and follow standard protocols at the Broad MSC for genome annotation. The Broad will work in collaboration with members of GRID and the greater dengue research community on genome analysis and the identification of sequence polymorphisms. • Host Genetic Typing While the primary focus of this proposal is on the viral genome sequencing, the importance of understanding the interaction between host and virus is evident. Sequencing during the initial stages of this project will be directed at the viral genome. As this project develops, we anticipate the typing of 1500 patients from which we have associated viral genome sequences and clinical data. To achieve this goal and maximum impact of the genetic data on finding tangible disease solutions, the Broad is working in close collaboration with members of GRID and current collaborators from other related projects (i.e. HCV and HIV) to develop the appropriate technology and framework to successfully implement such a program. Presently no high throughput method exists for host genetic typing of HLA and disease loci. The Broad recognizes the significance of this need and is presently developing a strategy capable of efficiently and effectively sequencing the numerous variants of the human HLA loci as well as other relevant disease alleles. As needed, existing collaborations can provide the ability to type host loci. • Future sequencing goals The Broad in collaboration with members of GRID is presently exploring the potential of viral resequencing arrays to provide the dengue research community a means for high throughput viral strain diagnostics. Resequencing RNA genomes poses an interesting additional technological challenge and opportunity. First, can RNA be directly hybridized to the DNA arrays or is cDNA required? The standard gene expression protocol for Affymetrix microarrays calls for a cRNA-DNA hybridization, so directly resequencing RNA is conceivable, but unproven. Second, the size of typical RNA viral genomes compared to the high probe density of current microarrays raises the question whether complex variations besides single nucleotide polymorphisms can be detected with additional probes. For example, all possible single- and double- base deletions and insertions for the Dengue genome could be represented on current microarrays. This novel application of Variation Detection Arrays could break new ground in microarray-based resequencing, but demands new algorithm development. V) Genome Resources in Dengue Consortium (GRID) The process for producing this white paper was carried out in a highly collaborative fashion. Over the past several months, many scientists from the dengue research community with a vital interest in these data have worked in partnership with the Broad Institute to establish common scientific goals and define the appropriate genomic infrastructure needed to support them. GRID is the result of this process, and consists of: 3/15/2010 Page 10 of 21 Broad Institute – M. Henn Dengue Virus White Paper 1)Angel Balmaseda, Department of Virology, National Diagnostics and Reference Center, Ministry of Health, Managua, Nicaragua 2) Bruce Birren, Broad Institute, Cambridge, MA 3) Irene Bosch, University of. Massachusetts Medical School, Center for Infectious Disease and Vaccine Research, Worcester, MA 4) Eva Harris, Division of Infectious Diseases, School of Public Health, University of California Berkeley, , Berkeley, CA 5) Matthew Henn, Broad Institute, Cambridge, MA 6) Rebeca Rico-Hesse, Dept. Virology & Immunology, Southwest Foundation for Biomedical Research, San Antonio, TX 7) Jeremy Farrar, Oxford University Clinical Research Unit, Hospital for Tropical Disease, HCMC, Vietnam 8) Jorge Munoz-Jordán, Centers for Disease Control and Prevention, Division of Vector-Borne Infectious Disease, San Juan, PR 9) David Kulp, University of Massachusetts, Dept. of Computer Science, Amherst, MA 10) Cameron Simmons, Oxford University Clinical Research Unit, Hospital for Tropical Disease, HCMC, Vietnam 11) Steve Whitehead, Laboratory of Infectious Disease, NIAID Vaccine Branch, Bethesda, MD 12) Priscilla Yang, Harvard Medical School, Dept. of Microbiology, Boston, MA GRID sought input and attained consensus concerning sequencing targets through many meetings at the Broad with various participants, several conference calls with most members of GRID, and numerous email and telephone exchanges. The Broad has made a significant investment over the past six months in developing and coordinating this project. A critical component of this effort has been the sharing of information and data pertaining to goal development and sequencing target selection. In addition, great weight was given to practical matters in the structure of both the sequencing and analysis components of the proposed work. The group has worked together to standardize protocols for viral isolation and cDNA production between the various project sites. Importantly, many consortium members have active collaborations in our designated focus areas as well as have secure access to the viral collections and sampling clinics necessary for the acquisition of genomic cDNA for all anticipated sequencing targets. VI) Management of the Dengue Genome Project by the Research Community 1) Broad Institute (Matthew Henn) 20% 2) Irene Bosch 5% 3) Eva Harris 5% Both Dr. Bosch and Dr. Harris have been key contacts for the Broad Institute during the development of this white paper. They both maintain extensive dengue research programs at US universities, and have well-established collaborations with dengue clinics in the Americas and Asia. VII) Data Release In accordance with the NIAID’s principles regarding data release, we will publicly release all data generated under this contract as rapidly as possible. As required by our contract, NIAID will be provided with a 21-45 calendar-day period to review and comment upon all data prior to its public release. 3/15/2010 Page 11 of 21 Broad Institute – M. Henn Dengue Virus White Paper Chromatogram Files: Unless otherwise directed by NIAID, we will submit all sequences and trace files (chromatograms) generated under this proposal to the Trace Archive at NCBI on a no less than weekly basis. These data will also include information on templates, vectors, and quality values for each sequence. Genome Assemblies: Genome assemblies will be made available via GenBank and the MSCs’ websites, after internal and community validation. Assuming no significant errors are detected during the validation process, assemblies will be released within 45 calendar days of being generated. Genome Annotation: Automated annotation data will be made available via GenBank and our web sites after internal and community validation. Assuming no significant errors are detected during the validation process, annotation data will be released within 45 calendar days of being generated. 3/15/2010 Page 12 of 21 Broad Institute – M. Henn Dengue Virus White Paper VIII) References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. Gubler, D.J., The changing epidemiology of yellow fever and dengue, 1900 to 2003: full circle? Comp Immunol Microbiol Infect Dis, 2004. 27(5): p. 319-30. World Helath Organization, Stengthening implementation of the global strategy for DF/DHF prevention and control. 1999. p. http://www.who.int/vaccinesdisease/research/virus2.shtml. Gubler, D.J., Epidemic dengue/dengue hemorrhagic fever as a public health, social and economic problem in the 21st century. Trends Microbiol, 2002. 10(2): p. 100-3. World Helath Organization, Vaccines, immunization, and biologicals: dengue and Japanese encephalitis vaccines. 2001. p. http://www.who.int/vaccinesdiseases/research/virus2.htm. Halstead, S.B., Dengue. Current Opinion In Infectious Diseases, 2002. 15(5): p. 471-476. Rigau-Perez, J.G., et al., The reappearance of dengue-3 and a subsequent dengue-4 and dengue-1 epidemic in Puerto Rico in 1998. Am J Trop Med Hyg, 2002. 67(4): p. 355-62. Effler, P., et al., Dengue Fever, Hawaii, 2001-2002. Emerging Infectious Disease, 2005. 11(5): p. 742-749. Beatty, M.E., et al., Travel-Associated Dengue Infections - United States, 2001–2004. Morbidity and Mortality Weekly Report, 2005. 54(22): p. 555-558. Twiddy, S.S., E.C. Holmes, and A. Rambaut, Inferring the rate and time-scale of dengue virus evolution. Molecular Biology And Evolution, 2003. 20(1): p. 122-129. Moore, C.G., Aedes albopictus in the United States: Current status and prospects for further spread. Journal Of The American Mosquito Control Association, 1999. 15(2): p. 221-227. Moore, C.G. and C.J. Mitchell, Aedes albopictus in the United States: Ten-year presence and public health implications. Emerging Infectious Diseases, 1997. 3(3): p. 329-334. Kouri, G.P., M.G. Guzman, and J.R. Bravo, Why dengue haemorrhagic fever in Cuba? 2. An integral analysis. Trans R Soc Trop Med Hyg, 1987. 81(5): p. 821-3. Rodriguez-Roche, R., et al., Virus evolution during a severe dengue epidemic in Cuba, 1997. Virology, 2005. 334(2): p. 154-9. Guzman, M.G., G. Kouri, and S.B. Halstead, Do escape mutants explain rapid increases in dengue case-fatality rates within epidemics? Lancet, 2000. 355(9218): p. 1902-3. Fong, M.Y., C.L. Koh, and S.K. Lam, Molecular epidemiology of Malaysian dengue 2 viruses isolated over twenty-five years (1968-1993). Res Virol, 1998. 149(6): p. 457-64. Foster, J.E., et al., Molecular evolution and phylogeny of dengue type 4 virus in the Caribbean. Virology, 2003. 306(1): p. 126-34. Lewis, J.A., et al., Phylogenetic relationships of dengue-2 viruses. Virology, 1993. 197(1): p. 216-24. Nogueira, R.M., M.P. Miagostovich, and H.G. Schatzmayr, Molecular epidemiology of dengue viruses in Brazil. Cad Saude Publica, 2000. 16(1): p. 205-11. Rico-Hesse, R., et al., Origins of dengue type 2 viruses associated with increased pathogenicity in the Americas. Virology, 1997. 230(2): p. 244-51. Uzcategui, N.Y., et al., Molecular epidemiology of dengue type 2 virus in Venezuela: evidence for in situ virus evolution and recombination. J Gen Virol, 2001. 82(Pt 12): p. 2945-53. Halstead, S.B., et al., Haiti: absence of dengue hemorrhagic fever despite hyperendemic dengue virus transmission. Am J Trop Med Hyg, 2001. 65(3): p. 180-3. 3/15/2010 Page 13 of 21 Broad Institute – M. Henn 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. Dengue Virus White Paper Ruiz, B.H., et al., Phylogenetic comparison of the DEN-2 Mexican isolate with other flaviviruses. Intervirology, 2000. 43(1): p. 48-54. Tolou, H., et al., Complete genomic sequence of a dengue type 2 virus from the French West Indies. Biochem Biophys Res Commun, 2000. 277(1): p. 89-92. Gubler, D.J., The global pandemic of dengue/dengue haemorrhagic fever: current status and prospects for the future. Ann Acad Med Singapore, 1998. 27(2): p. 227-34. Rothman, A.L., Immunology and immunopathogenesis of dengue disease. Adv Virus Res, 2003. 60: p. 397-419. Cologna, R. and R. Rico-Hesse, American genotype structures decrease dengue virus output from human monocytes and dendritic cells. J Virol, 2003. 77(7): p. 3929-38. Endy, T.P., et al., Spatial and temporal circulation of dengue virus serotypes: a prospective study of primary school children in Kamphaeng Phet, Thailand. Am J Epidemiol, 2002. 156(1): p. 52-9. Rico-Hesse, R., Molecular evolution and distribution of dengue viruses type 1 and 2 in nature. Virology, 1990. 174(2): p. 479-93. Rico-Hesse, R., Microevolution and virulence of dengue viruses. Adv Virus Res, 2003. 59: p. 315-41. Vaughn, D.W., Invited commentary: Dengue lessons from Cuba. Am J Epidemiol, 2000. 152(9): p. 800-3. Leitmeyer, K.C., et al., Dengue virus structural differences that correlate with pathogenesis. J Virol, 1999. 73(6): p. 4738-47. Watts, D.M., et al., Failure of secondary infection with American genotype dengue 2 to cause dengue haemorrhagic fever. Lancet, 1999. 354(9188): p. 1431-4. Messer, W.B., et al., Emergence and global spread of a dengue serotype 3, subtype III virus. Emerg Infect Dis, 2003. 9(7): p. 800-9. Nisalak, A., et al., Serotype-specific dengue virus circulation and dengue disease in Bangkok, Thailand from 1973 to 1999. Am J Trop Med Hyg, 2003. 68(2): p. 191-202. Malavige, G.N., et al., Dengue viral infections. Postgraduate Medical Journal, 2004. 80(948): p. 588-601. Altmeyer, R., Virus attachment and entry offer numerous targets for antiviral therapy. Curr Pharm Des, 2004. 10(30): p. 3701-12. Liu, W.J., et al., Inhibition of interferon signaling by the New York 99 strain and Kunjin subtype of West Nile virus involves blockage of STAT1 and STAT2 activation by nonstructural proteins. J Virol, 2005. 79(3): p. 1934-42. Munoz-Jordan, J.L., et al., Inhibition of interferon signaling by dengue virus. Proc Natl Acad Sci U S A, 2003. 100(24): p. 14333-8. Holland, J.J., J.C. De La Torre, and D.A. Steinhauer, RNA virus populations as quasispecies. Curr Top Microbiol Immunol, 1992. 176: p. 1-20. Lin, S.R., et al., Study of sequence variation of dengue type 3 virus in naturally infected mosquitoes and human hosts: implications for transmission and evolution. J Virol, 2004. 78(22): p. 12717-21. Lu, M., et al., Analysis of hepatitis C virus quasispecies populations by temperature gradient gel electrophoresis. J Gen Virol, 1995. 76 (Pt 4): p. 881-7. Martell, M., et al., Hepatitis C virus (HCV) circulates as a population of different but closely related genomes: quasispecies nature of HCV genome distribution. J Virol, 1992. 66(5): p. 3225-9. 3/15/2010 Page 14 of 21 Broad Institute – M. Henn 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53. 54. 55. 56. 57. 58. 59. 60. 61. 62. Dengue Virus White Paper Wang, W.K., et al., Dengue type 3 virus in plasma is a population of closely related genomes: quasispecies. J Virol, 2002. 76(9): p. 4662-5. Zhu, T., et al., Genotypic and phenotypic characterization of HIV-1 patients with primary infection. Science, 1993. 261(5125): p. 1179-81. Loke, H., et al., Strong HLA class I--restricted T cell responses in dengue hemorrhagic fever: a double-edged sword? J Infect Dis, 2001. 184(11): p. 1369-73. Loke, H., et al., Susceptibility to dengue hemorrhagic fever in vietnam: evidence of an association with variation in the vitamin d receptor and Fc gamma receptor IIa genes. Am J Trop Med Hyg, 2002. 67(1): p. 102-6. Stephens, H.A., et al., HLA-A and -B allele associations with secondary dengue virus infections correlate with disease severity and the infecting viral serotype in ethnic Thais. Tissue Antigens, 2002. 60(4): p. 309-18. Sakuntabhai, A., et al., A variant in the CD209 promoter is associated with severity of dengue disease. Nat Genet, 2005. 37(5): p. 507-13. Waganaar, J.F.P., A.T.A. Mairuhu, and E.C. van Gorp, Genetic influences on dengue virus infections. WHO Dengue Bulletin, 2004. 28: p. 126-134. Moore, C.B., et al., Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science, 2002. 296(5572): p. 1439-43. Stephens, H.A., et al., HLA-A and -B allele associations with secondary dengue virus infections correlate with disease severity and the infecting viral serotype in ethnic Thais. Tissue Antigens, 2002. 60(4): p. 309-18. Rothman, A.L., Dengue: defining protective versus pathologic immunity. J Clin Invest, 2004. 113(7): p. 946-51. Klungthong, C., et al., The molecular epidemiology of dengue virus serotype 4 in Bangkok, Thailand. Virology, 2004. 329(1): p. 168-79. Bennett, S.N., et al., Selection-driven evolution of emergent dengue virus. Mol Biol Evol, 2003. 20(10): p. 1650-8. Moncayo, A.C., et al., Dengue emergence and adaptation to peridomestic mosquitoes. Emerg Infect Dis, 2004. 10(10): p. 1790-6. Wang, E., et al., Evolutionary relationships of endemic/epidemic and sylvatic dengue viruses. J Virol, 2000. 74(7): p. 3227-34. Uzcategui, N.Y., et al., Molecular epidemiology of dengue virus type 3 in Venezuela. J Gen Virol, 2003. 84(Pt 6): p. 1569-75. Cologna, R., P.M. Armstrong, and R. Rico-Hesse, Selection for virulent dengue viruses occurs in humans and mosquitoes. J Virol, 2005. 79(2): p. 853-9. Laille, M. and C. Roche, Comparison of dengue-1 virus envelope glycoprotein gene sequences from French Polynesia. Am J Trop Med Hyg, 2004. 71(4): p. 478-84. Shurtleff, A.C., et al., Genetic variation in the 3' non-coding region of dengue viruses. Virology, 2001. 281(1): p. 75-87. Aviles, G., et al., Complete coding sequences of dengue-1 viruses from Paraguay and Argentina. Virus Res, 2003. 98(1): p. 75-82. Baleotti, F.G., M.L. Moreli, and L.T. Figueiredo, Brazilian Flavivirus phylogeny based on NS5. Mem Inst Oswaldo Cruz, 2003. 98(3): p. 379-82. 3/15/2010 Page 15 of 21 Broad Institute – M. Henn Dengue Virus White Paper Appendix A List of published evolutionary studies of dengue virus showing only two studies that used the complete viral genome Serotype Location Taiwan Americas Thailand Puerto Rico USA Dengue serotype/No. of clones sequenced DEN-3. 88 from human and 50 from mosquitoes Target sequence C and E DEN-2. 62 Isolates total with 55 DEN-2. E/NS1 junction DEN-4 Three genotypesSEA (I), SEA-America (II), Sylvatic (III). 53 Isolates from 27-year period DEN-4. 82 isolates E 6 Thai strains compared with 58 complete coding regions 40% of the genome (4,000 bases of C, PrM, E, NS1, NS2A, NS4B West Africa, SEA, Oceania DEN-2 Hypothetic aa replacements in E by Wang et al. Malaysia and Thailand Venezuela DEN-1, 2, and 4 E DEN-3. 15 Isolates E SEA, West Africa, America DEN-2. 24 Isolates E Cuba DEN-2. 20 isolates and 6 for complete genome E, complete genome Venezuela DEN-2. 34 isolates E, PrM, NS1 Hypothesis Higher nucleotide substitutions in human derived virus. Host-derived positive selection No differences between DF and DHF. No association with Pathogenesis No geographical segregation of the three genotypes. Immune driven positive selection Temporal clustering. Antigenic variation of NS2A. Synonymous changes outnumbering nonshynonymous changes Peridomestic mosquitoes have higher dissemination rates of endemic viruses. Host-derived positive selection Sylvatic genetic divergence Introduction of Genotype III no evidence of reemergence of type V from 1960. Importation of dengue strains from Asia. Out-competing Asia vs American DEN2 strains in mosquito. Selection pressure evolution No a.a. difference in E. Evolution during epidemic in the NS genes. In situ evolution of dengue 2 within the Reference [40] [19] [53] [54] [55] [56] [57] [58] [13] [20] 3/15/2010 Page 16 of 21 Broad Institute – M. Henn French Polynesia SEA, West Africa, Americas, sylvatic DEN-1. Genotype V and IV DEN-1, 2, 3, 4 Argentina, Paraguay DEN-1, Five isolates Brazil DEN and encephalitic and hemorrhagic in 15 strains of flaviviruses DEN-2, 11 isolates Thailand & Venezuela Dengue Virus White Paper E 3’ NTR Whole genome NS5 Whole genome same geographical region Introduction of DEN1 SEA DEN viruses arose from sylvatic progenitors and evolved into human epidemic strains. Variation in the 3'NCR does not correlate with DEN virus pathogenesis 3- a.a. in NS4A. Co-existance of two genotypes of the five known. Selection pressure evolution Correlated structural differences to pathogenesis [59] [60] [61] [62] [31] 3/15/2010 Page 17 of 21
© Copyright 2026 Paperzz