The origin and evolution of hepatitis viruses in humans

Journal
of General Virology (2001), 82, 693–712. Printed in Great Britain
...................................................................................................................................................................................................................................................................................
The origin and evolution of hepatitis viruses in humans
2000 Fleming Lecture
Delivered at the 146th Meeting of the Society for General Microbiology, 13 April 2000
Peter Simmonds
Laboratory for Clinical and Molecular Virology, University of Edinburgh, Summerhall, Edinburgh EH9 1QH, UK
The spread and origins of hepatitis C virus (HCV) in human populations have been the subject of
extensive investigations, not least because of the importance this information would provide in
predicting clinical outcomes and controlling spread of HCV in the future. However, in the absence
of historical and archaeological records of infection, the evolution of HCV and other human
hepatitis viruses can only be inferred indirectly from their epidemiology and by genetic analysis of
contemporary virus populations. Some information on the history of the latter may be obtained by
dating the time of divergence of various genotypes of HCV, hepatitis B virus (HBV) and the nonpathogenic hepatitis G virus (HGV)/GB virus-C (GBV-C). However, the relatively recent times
predicted for the origin of these viruses fit poorly with their epidemiological distributions and the
recent evidence for species-associated variants of HBV and HGV/GBV-C in a wide range of nonhuman primates. The apparent conservatism of viruses over long periods implied by these
latter observations may be the result of constraints on sequence change peculiar to viruses with
single-stranded genomes, or with overlapping reading frames. Large population sizes and intense
selection pressures that optimize fitness may be the factors that set virus evolution apart from that
of their hosts.
Introduction
Smallpox was always present, filling the churchyard with
corpses, tormenting with constant fear all whom it had not yet
stricken, leaving on those whose lives it spared the hideous traces of
its power, turning the babe into a changeling at which the mother
shuddered, and making the eyes and cheeks of the betrothed maiden
objects of horror to the lover.
T. B. Macaulay, The History of England from the Accession of
James II, vol. IV (1855)
O’er ladies’ lips, who straight on kisses dream
Which oft the angry Mab with blisters plague
William Shakespeare, Romeo and Juliet, act I, scene IV ;
discovered by Peter Wildy (Wildy et al., 1982)
These descriptions of smallpox and herpes simplex virus
provide a glimpse into the past history of virus infections in
Author for correspondence : Peter Simmonds.
Fax j44 131 650 7965. e-mail Peter.Simmonds!ed.ac.uk
humans. Alas, our knowledge of virus history is poor, none
more so than for hepatitis viruses for which historical
descriptions such as the above do not exist. In this Lecture, I
will summarize what is currently known about the origins and
evolution of a number of hepatitis viruses, and try to illustrate
how our understanding of the transmission of hepatitis C virus
(HCV) and its likely future clinical impact would be enhanced
if we were able to discover some basic facts about its past.
Chronic infection with HCV has become established as the
major infectious cause of chronic liver disease in Western
countries. Its spread in these populations is poorly understood,
although it is known to be transmitted by blood contact, and
has particularly targeted risk groups such as injecting drug
users (IDUs), and in the past, recipients of blood transfusion
and blood products. HCV infection is frequently persistent,
and sets in train an inexorable course of slowly progressive
liver disease. Part of the difficulty in understanding the
epidemiology of HCV is the lack of symptoms associated with
both initial infection and for prolonged periods of chronic
infection (Seeff et al., 2000 ; Kenny-Walsh, 1999 ; Wiese et al.,
GJD
0001-7612 # 2001 SGM
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
P. Simmonds
2000). Most HCV-infected individuals are unaware of their
clinical status and there is thus considerable under-diagnosis
and under-reporting of infection. Indeed, in many individuals,
HCV infection may first become apparent only on the
development of liver failure or liver cancer several decades
after initial infection. We urgently need to assess the likely
scale of HCV infection, the factors leading to its current
epidemic spread and the interventions that could be made to
avert future problems associated with disease progression.
Our basic knowledge of the epidemiology of HCV would
be helped if there was some understanding of its origins, and
in particular information about the nature of HCV infection in
populations from whom HCV spread during the current
epidemic. Investigation in this area is complex for a number of
reasons, although there are now accounts of the recent spread
of other viruses, such as human immunodeficiency virus (HIV),
on which the spread of HCV could be modelled. In this Lecture
I will review the progress that has been made in reconstructing
the evolution and spread of a number of hepatitis viruses, and
the analogies there might be with the spread of HCV.
HCV infection
HCV has a positive-sense RNA genome and is classified as
a flavivirus. Its inclusion in the family Flaviviridae is based upon
a number of similarities in its genome organization, structure
and replication to a large group of vector-borne viruses
causing diseases such as dengue fever and yellow fever.
Important virological similarities include the existence in both
of a single, large gene that encodes a polyprotein which is
cleaved after synthesis into functional proteins. HCV and
flaviviruses are enveloped ; amongst the structural proteins
encoded by HCV, two large glycoproteins are incorporated
into its viral envelope, and are likely to be homologous to E
and NS1 of flaviviruses. There are other similarities in the
position and amino acid sequences of evolutionarily conserved
elements involved in virus replication, such as the RNA
polymerase (NS5) and helicase (NS3). However, there are also
differences, particularly in the mechanism of translation of the
HCV and flavivirus polyproteins. Flaviviruses have a relatively
short leader sequence upstream of the coding region, and the
host-cell ribosomes are thought to scan this region to identify
the first methionine codon to initiate translation. In contrast,
the corresponding 5h untranslated region (5hUTR) of HCV
folds into a complex RNA secondary structure which acts as an
internal ribosomal entry site. This directs ribosomal binding
and initiation of translation to an internal methionine residue.
Unfortunately, extensive functional analysis of HCV has
proven difficult because of the difficulties in culturing HCV in
vitro. Its host range is confined to humans and close primate
relatives, and this has also hindered the development of animal
models for HCV infection. At present, we still lack much basic
information on its entry into cells, replication and assembly ;
GJE
a summary of what is currently understood about the functions
of the different HCV-encoded proteins is shown in Fig. 1.
The long asymptomatic stage of HCV infection and its
slow disease progression make estimation of the future clinical
impact of HCV difficult to assess, particularly in the UK and
other Western countries where there is epidemiological
evidence for its very recent spread in new risk groups for
parenteral infection, such as IDUs. Currently, it is estimated
that approximately 0n5 % of the UK population is infected with
HCV, and the likely relatively short duration of many of these
infections means that we have yet to witness the full extent of
liver disease that is likely to arise. Understanding the future
outcome of HCV is vital for health planning ; for example, if all
of those infected with HCV were to progress to chronic liver
failure or liver cancer, this would become an intolerable burden
on health resources of the UK and other countries with similar
or higher prevalences of infection. Understanding the natural
history of HCV infection is also of major importance in
management of currently asymptomatic individuals. For
example, combination treatment with interferon and ribavirin
leads to complete and permanent clearance of viraemia in
approximately 50 % of individuals, while much lower frequencies of response are observed in individuals with more
advanced disease, such as those with cirrhosis. Effective
treatment of HCV at an early stage of infection may therefore
be able to avert much of the end-stage liver disease associated
with untreated hepatitis C.
In trying to plan and prioritize HCV management and
treatment, we need accurate information on the course and
influences on HCV disease progression. Available evidence
indicates that long-term asymptomatic carriage of HCV may
occur in a large proportion of persistently infected individuals
(Seeff et al., 2000 ; Kenny-Walsh, 1999 ; Wiese et al., 2000). For
example, a relatively benign course of HCV infection has been
observed after 22 years in recipients of anti-D immunoglobulin
in Ireland (Kenny-Walsh, 1999), and amongst blood transfusion
recipients after 17 years (Seeff et al., 1992). However, these
findings should be tempered by the observation in Japan and
Italy of a high and rapidly increasing incidence of severe liver
disease and hepatocellular carcinoma associated with HCV.
The implications from these and other experiences is that
disease complications of infection may take an extremely long
time to develop (Koretz et al., 1993 ; Alter et al., 1997 ; Seeff,
1997). It is therefore vital to understand the long-term clinical
consequences in the large number of clinically silent individuals
infected relatively recently through drug misuse.
HCV genetic variability
Much of the evidence for the previous spread of HCV
derives indirectly from descriptions of current genotype
frequencies of HCV in different risk groups and populations.
HCV can be classified into a number of distinct genotypes,
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
Fig. 1. (A) Organization of the HCV genome, showing the 5h and 3h untranslated regions, the single open reading frame and its
cleavage sites. (B) The relative sizes of the resulting proteins and what is currently understood of their functions. (C) Genome
organization of HGV/GBV-C, with genes homologous to those of HCV indicated ; note the lack of a protein corresponding to the
capsid protein of HCV.
whose distribution varies both geographically and between
risk groups (reviewed in Simmonds, 1998). In the currently
widely used classification for HCV, known variants of HCV
collected from different parts of the world can be divided into
six main ‘ genotypes ’, many of which contain more closely
related variants (Fig. 2). For nomenclature, it has been proposed
that HCV is classified into genotypes, corresponding to the
main branches in the phylogenetic tree, and subtypes corresponding to the more closely related sequences within some
of the major groups (Simmonds et al., 1994 ; Enomoto et al.,
1990). The types have been numbered 1 to 6, and the subtypes
a, b and c, in both cases in order of discovery. Therefore, the
sequence cloned by Chiron is assigned type 1a, HCV-J and -BK
are 1b, HC-J6 is type 2a and HC-J8 is 2b. Each of the six main
genotypes of HCV is equally divergent from the others,
differing at 31 to 34 % of nucleotide positions on pairwise
comparison of complete genomic sequences, and leading to
approximately 30 % amino acid sequence divergence between
the encoded polyproteins. Although there is no neutralization
assay available for HCV, HCV genotypes most likely cor-
respond to the serotypes of other viruses, such as dengue virus
and poliovirus.
Different regions of the genome show various levels of
sequence diversity. The ends of the genome that contain
elements involved in virus replication and in guiding protein translation are highly conserved between genotypes
(Kolykhalov et al., 1996 ; Bukh et al., 1992), as is the initial
coding region (the core gene). Other regions, such as the E1
and E2 genes coding for the envelope glycoproteins, are
highly variable, typically differing at over 50 % of sites between
genotypes.
In Europe, types 1b and type 2 are widely distributed,
particularly in older age groups, while those infected through
drug use are more likely to be infected with genotypes 3a and
1a (Simmonds et al., 1996 a ; Tisminetzky et al., 1994 ; Pawlotsky
et al., 1995 ; Goeser et al., 1995 ; McOmish et al., 1993). The
observation of genotypes associated with drug use in Europe
that are distinct from those found in individuals infected
through other routes suggests that infection of IDUs originated
through a geographically large transmission network largely
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
GJF
P. Simmonds
Fig. 2. Sequence relationships of
currently available complete genomic
sequences of HCV (listed in Chamberlain
et al., 1997), and their classification into
six genotypes (first tier) and subtypes a,
b etc. (second tier). The nomenclature of
the HCV genotypes follows the
consensus proposal for classification of
HCV (Simmonds et al., 1994), i.e. the
prototype sequence cloned by Chiron is
assigned type 1a, HCV-J and -BK are 1b,
HC-J6 is type 2a and HC-J8 is 2b,
although note the anomalous genotype
assignations of types 7b, 8b, 9a and 11a
which cluster with types 6a and 6b, and
type 10a, which clusters within genotype
3 sequences. Proposals to reclassify
these as subtypes in the genotype 6 and
3 clades have been made (Robertson et
al., 1998 ; Mizokami et al., 1996 ; de
Lamballerie et al., 1997 ; Simmonds et al.,
1996 b). The unrooted tree was
generated using the MEGA package
(Kumar et al., 1993).
distinct from other HCV-infected individuals, an observation
which has also been documented for HIV (Brown et al., 1997 ;
Holmes et al., 1995 b). In the remainder of this Lecture, I will
describe the approaches that have been taken to discover the
origins of human hepatitis viruses, and the light that this
information sheds on their current distribution and epidemiology. Understanding the origins of HCV in humans will
help put the current epidemic spread in certain risk groups into
a broader context that may be of value in predicting its future
impact on human health.
Virus archaeology and evolution
Investigation of the origin of viruses is by definition a
speculative venture. The exercise is limited by the ephemeral
existence of many viruses, and the lack of any historical depth
in epidemiological studies. Furthermore, clinical specimen
archives, suitable for recovery of viruses by isolation or by
PCR, that are older than 30 years are rare and restricted (Davis
et al., 2000 ; Seeff et al., 2000), so it is difficult to obtain any
direct evidence for the type of viruses that may have existed in
the past. In contrast to the rich fossil record of animals and
plants, and more recently, the ability to sequence DNA
recovered from some archaeological and fossil remains, almost
all viruses are essentially invisible in the geological and indeed
the historical record.
Probably the earliest solid archaeological evidence for virus
infection is contained in skeletons from the Neolithic and
Bronze age periods that are deformed similarly to those with
GJG
poliomyelitis from the present day (Wells, 1964), although it
remains to be formally demonstrated that the virus we now
refer to as poliovirus was the cause. As discussed later in this
Lecture, our current uncertainty about the lifespan of recognizable virus species means that we cannot at this stage
discount the possibility that other, now vanished, viruses may
have caused poliomyelitis then, only to be subsequently
replaced. The written record from the ancient world is similarly
difficult to interpret. There are few virus infections that cause
disease unambiguously different from other causes. Probably
the best example of a virus whose existence in ancient
civilizations can be confidently determined is smallpox, which
seems to have afflicted civilization throughout the period for
which records exist. Smallpox is believed by many to have
originated after the development of farming in the Middle East
around 10 000 BC. Smallpox appears to have spread widely
over the following centuries. For example, it has been
traditionally thought that skin lesions of smallpox are preserved in mummified remains from the 18th and 20th Egyptian
Dynasties (1570–1085 BC), including Ramses V who died in
1157 BC, although this has been disputed more recently
(Hopkins, 1980). Written descriptions of smallpox survive in
Ancient Greek writings, such as those of Thucydides, who
describes a particularly severe epidemic in Athens in 430 BC.
Overall, however, this situation is unusual ; among the
hundreds of viruses known to currently infect humans, the
historical, archaeological and palaeontological record is blank
for all but a few.
Reconstruction of virus histories by analysis of their current
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
distributions and genetic relatedness is also fraught with
problems. Virus histories are inextricably linked to that of their
hosts, which itself may be uncertain in detail or impossible to
meaningfully reconstruct for viruses that can frequently cross
species barriers. Viruses recombine with one another, with
other viruses and with the genomes of the cells they infect and
they may undergo major genome rearrangements and changes
in replication strategy.
Rates of sequence change in viruses, particularly those with
RNA genomes, are invariably much greater than their hosts,
and this presents a number of problems in evolutionary
reconstruction. To persist within an infected individual or a
host population, most RNA viruses must replicate continuously. Each infection cycle in a cell typically takes 1–3 days,
over which period several copyings of the virus genome occur.
For a virus such as HCV, there may therefore be 100 000
genome replications over a period corresponding to a human
generation ; over this time, only 15–20 cellular divisions
separate a human egg from its gametes. Compounding this
difference in replication frequency, RNA viruses generally
encode their own nucleic acid replicating enzymes, which
typically lack proof-reading activity and therefore produce far
greater numbers of mutations per replication cycle than that of
their hosts. Combined, these two factors likely underlie the
greater than million times rate of sequence change compared
with other organisms. Even over relatively short evolutionary
periods, this rapid rate of virus sequence change potentially
obliterates any trace of genetic relatedness between descendants.
Viruses may be subjected to intense selection pressures to
evade the host’s immune response, antiviral treatment and to
adapt to new hosts on crossing species barriers. Rapid, adaptive
changes are favoured in viruses by the frequently large
population sizes in an infected organism and the high mutant
frequency generated during replication. For example, HIV-1infected individuals have been estimated to harbour 10(–10)
infected cells (Haase et al., 1996). As each genome replication
generates 0n5–5 nucleotide changes (Mansky & Temin, 1995),
the population in principle contains every possible substitution
and combination of paired substitutions. It is therefore not
surprising that resistance to antivirals, such as zidovudine, that
depends on single or double amino acid changes in the pol
gene, appears so quickly in treated individuals. The mutants
were already present and have the potential to replace the
original population over a period of weeks under this new
selection pressure.
Finally, as will be presented in this Lecture, it appears that
viruses just do not evolve in a manner comparable to that of
animals and plants. Constraints on the evolution of certain
viruses, imposed perhaps by possessing single-stranded genomes, extensively overlapping reading frames or regulatory
elements embedded in coding sequences, may lead to constraints on sequence change considerably different from those
operating on animal and plant genomes. As a result, many of
Fig. 3. Relationship between rates of amino acid sequence change with
sequence divergence between α-globin genes of mammals, placentals and
birds (plotted using the ‘ 4’ symbol). Rates were calculated by combining
the degree of sequence divergence with times of speciation known from
palaeontological records (Kumar & Hedges, 1998). Over a wide range of
divergence times, rates of sequence change were remarkably constant
(range 0n5–1n7i10−9 nucleotide substitutions per site per year). By
contrast, rates of sequence change of HGV/GBV-C ($) vary considerably
for different degrees of divergence : data points from left to right represent
the following divergence times : 8n4 years, time-course in HGV/GBV-Cinfected individual (Nakao et al., 1997) ; 100 000 years, divergence of
modern humans ; 1n6 Myr, divergence of troglodytes and verus subspecies
of chimpanzees (Pan troglodytes ; Morin et al., 1994) ; 7 Myr, divergence
of humans and chimpanzees (Jones et al., 1992 ; Morin et al., 1994) ;
35 Myr, divergence of Old (human and chimpanzees) and New World
primates (Sanguinis mystax and S. labiatus, Aotus trivirgatus ; Jones et al.,
1992). Sequences compared were from the NS5 region of the genome
[amino acid positions 2498–2561 in sequence PNF2161 (U44402)],
with divergences and rates based on Jukes–Cantor (J-C) distances.
the models and assumptions that underlie phylogenetic
reconstruction of animal and plant evolution may arguably not
apply to at least some viruses. One potential casualty is the
‘ molecular clock ’ (Kumar & Hedges, 1998), which is based on
the repeatable observation that the degree of both nucleotide
and encoded amino acid sequence divergence (calculated using
relatively simple corrections for multiple substitutions) between homologous genes in different species remain proportional to their time of divergence. For example, sequence
change in the α-chain of haemoglobin is relatively constant
over extremely long periods of vertebrate evolution, and
between sequences that have become extremely divergent
(Fig. 3). Over sequences differing from each other by up to
40 %, the rate of sequence change (0n5–1n7i10−* per site per
year) was little different from that observed over the period of
ape speciation (0n8–1n5i10−* per site per year). The implication is that whatever functional constraints there may be
on the encoded protein (in this case, in enzymatic activity),
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
GJH
P. Simmonds
(A)
(B)
GJI
Fig. 4. For legend see facing page. Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
Fig. 5. Phylogenetic analysis of
sequences from the NS5 region of
HGV/GBV-C, HGV/GBV-CCPZ and GBV-A
sequences recovered from different New
World primate species (Adams et al.,
1998). The branching order (but not
scale) of GB virus sequences from
different primates is congruent with the
genetic relatedness of their host species.
Note the greater sequence divergence of
HGV/GBV-CCPZ recovered from
troglodytes and verus subspecies of
chimpanzees than the sequence diversity
found between human HGV/GBV-C
genotypes. Numbers on branches
indicates number of bootstrap
re-samplings from 100, supporting
the observed phylogeny, restricted to
values of 75 % or greater ; p distances
are indicated on the scale bar.
these are insignificant compared to the flexibility with which
amino acid substitutions can be introduced into the protein
sequence, without evident change in fitness of the organism.
The importance of the molecular clock in evolutionary
reconstruction lies in its ability to predict from the rate of
sequence change of a gene (even over a short observation
period), the time of divergence of other, more distantly related
species. Given the lack of other evidence to construct virus
histories, the molecular clock has been enthusiastically adopted
as the method to calculate times of divergence of genetic
variants within contemporary virus populations (Suzuki &
Gojobori, 1997 ; Zanotto et al., 1996 ; Bollyky & Holmes,
1999 ; Zhang et al., 1999 ; Korber et al., 2000). However, as we
shall see in the next section, there is now evidence that the
evolution of some viruses violates the behaviour predicted by
the molecular clock, through constraints over and above the
coding potential of a gene, and\or what appear to be greater
restrictions on the amino acid sequence itself. This appears to
have led in some cases to extreme conservatism of virus
sequences, defying attempts to reconstruct their origins based
on sequence comparisons alone. Viruses, for all their mutability
and extreme population dynamics may be far more conservative, and older, than has so far been recognized.
In the following section, I will describe the genetic
variability and virus–host relationships shown by hepatitis G
virus (HGV) or GB virus-C (GBV-C) in their primate hosts, as
an example of an extremely successful, co-adapted virus whose
evolution appears to be constrained in ways untypical of that
observed in their hosts. How broadly this apparent ‘ slowdown ’ in sequence change applies to other viruses and how
this influences our perception of their evolutionary histories
will be the topic of the remainder of this Lecture.
Co-evolution of HGV/GBV-C in primates
The name HGV\GBV-C remains as an ugly acronym for
the virus independently but simultaneously discovered in 1995
(Linnen et al., 1996 ; Leary et al., 1996). The description of the
virus as hepatitis G virus is doubly unfortunate as there is no
evidence that it causes hepatitis in its natural host (humans)
either during initial infection or after long-term carriage.
HGV\GBV-C is distantly related to HCV and other flaviviruses in the Hepacivirus genus (Fig. 1 C), although sequence
similarities are generally limited to specific enzymatic motifs in
the NS3 and NS5B genes. There is little or no similarity in the
number and arrangement of genes encoding the structural
capsid and envelope genes. Strangely, and still unexplained, is
the lack of any obvious homologue of the nucleocapsid gene
(Fig. 1) ; the first protein translated has the characteristics of an
envelope glycoprotein, possibly homologous to the E1 gene of
HCV.
HGV\GBV-C infection is found widely in human popu-
Fig. 4. (A) Phylogenetic analysis of complete genomic sequences of HGV/GBV-C (Smith et al., 2000), and their provisional
classification into five genotypes. In marked contrast to HCV (Fig. 2), note the shallowness of the branching between sequence
clusters representing the HGV/GBV-C genotypes. The unrooted tree was generated using the programs DNADIST and
NEIGHBOR in the PHYLIP package (Felsenstein, 1993). (B) Approximate geographical distribution of HGV/GBV-C genotypes in
indigenous populations.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
GJJ
P. Simmonds
lations, with frequencies of active or past infection ranging
from 5 to 15 %. This distribution extends even to highly
isolated populations, such as indigenous tribes people in Papua
New Guinea, sub-Saharan Africa and Central\South America
(Smith et al., 2000 ; Tanaka et al., 1998 a, b ; Mison et al., 2000).
Infection is frequently persistent and associated with high
levels of circulating viraemia, although no evidence links
HGV\GBV-C to any identifiable hepatic or non-hepatic
disease. Indeed, exactly what cells are infected with HGV\
GBV-C still remains unclear, although the suspicion must fall
on cells of the haemopoietic or lymphoid lineage such as CD4
lymphocytes, recently shown to be susceptible to infection in
vitro (Xiang et al., 2000).
Variants of HGV\GBV-C show quite limited sequence
variability, with nucleotide sequences differing from each other
by a maximum of 13 %. HGV\GBV-C has been tentatively
classified into four or five genotypes based on these sequence
relationships (Smith et al., 2000 ; Mison et al., 2000 ; Sathar et al.,
1999 ; Muerhoff et al., 1996 ; Mukaide et al., 1997), although the
variants lack the clear phylogenetic groupings that underpin
the genotype classification of other viruses such as HCV (Fig.
2). There is no great sequence variability of the genes encoding
putative envelope glycoproteins (unlike other viruses such as
HCV and HIV-1 where such variability has been linked to
persistence) ; indeed there is extreme conservation of the
encoded amino acid sequence throughout the genome. Expressed numerically, sequence divergence at non-synonymous
sites (dN ; i.e. sites where nucleotide substitutions alter the
encoded amino acid) is at least 50 times less than the variability
found at synonymous (i.e. silent) sites (dS) (Simmonds & Smith,
1999 b). Most coding sequences show biases towards synonymous variability (i.e. show a ratio of dN\dS significantly
less than 1), but there are few known coding sequences with
ratios approaching 0n02 (or less) as are found in HGV\GBV-C.
What the constraints are on sequence change in the coding
region of HGV\GBV-C remains quite mysterious, and is one
of many aspects of the unusual sequence variability of the
virus.
The broad distribution in human populations, and its
apparent non-pathogenicity, is consistent with the long-term
presence and close host association of HGV\GBV-C with
humans. Further evidence for this hypothesis is provided by
the geographical distribution of HGV\GBV-C genotypes
amongst indigenous populations in different parts of the world
(Fig. 4 B). In all cases, these are congruent with the distributions
expected if HGV\GBV-C was already present in modern
human populations as they migrated out of Africa 100 000–
150 000 years ago (Tucker et al., 1999 ; Konomi et al., 1999 ; Liu
et al., 2000 ; Tanaka et al., 1998 a, b ; Mison et al., 2000 ;
Katayama et al., 1997 ; Gonzalez-Perez et al., 1997). For
example, sequences from the populations in the Far-East are
almost invariably genotype 3, and this genotype is otherwise
only found in native inhabitants of North and South America
(Konomi et al., 1999 ; Tanaka et al., 1998 b ; Gonzalez-Perez et
HAA
al., 1997). In contrast, Caucasian and other populations from
India westwards including Northern Africa are infected with
genotype 2. Genotype 1 is confined to sub-Saharan Africa, and
shows the greatest overall sequence diversity (Liu et al.,
2000 ; Smith et al., 1997 a ; Muerhoff et al., 1997) ; particularly
divergent variants have been recovered from Pygmy and other
African populations (Sathar et al., 1999 ; Tanaka et al., 1998 a).
The long association of HGV\GBV-C in humans is possibly
also mirrored in other primates (Charrel et al., 1999 ; Adams et
al., 1998 ; Leary et al., 1997 ; Bukh & Apgar, 1997 ; Birkenmeyer
et al., 1998) (Fig. 5). HGV\GBV-C variants more divergent
than those found in humans have been found in wild-caught
chimpanzees from Central and West Africa (Birkenmeyer et al.,
1998 ; Adams et al., 1998). Furthermore, distinct variants of
HGV\GBV-CCPZ were recovered from the two different
subspecies of chimpanzees, troglodytes and verus. These showed
greater divergence from each other than found between human
genotypes, an observation consistent with the likely much
greater population diversity of surviving chimpanzee populations (Morin et al., 1994). Finally, even more divergent
homologues of HGV\GBV-C, described as GBV-A, have been
recovered from several species of New World primates (Bukh
& Apgar, 1997 ; Leary et al., 1997). Again mirroring host
relationships, genetic variants of GBV-A differing from each
other by around 25 % are closely associated with different New
World primate species (Bukh & Apgar, 1997 ; Erker et al.,
1998 ; Leary et al., 1997).
The molecular clock in HGV/GBV-C evolution
These repeated examples of phylogenetic congruency
between GB viruses and their primate host species are clearly
consistent with the hypothesis of virus and host co-evolution.
If this principal is accepted, however, then the concept of the
molecular clock will have to be abandoned as a principle
underlying the evolution of HGV\GBV-C. Indeed, the concept
of an RNA virus evolving over the time-scales involved in
primate evolution sits very oddly with the current paradigm of
RNA viruses, in which properties for extremely rapid sequence
change, ability to adapt and ephemeral nature have been
emphasized (Holland et al., 1982).
Evidence that HGV\GBV-C does not change at a constant
rate is provided by the discrepancy between the rate of
sequence change over short observation intervals, and that
implied by the divergence shown between variants infecting
human populations and different primate species. Sequence
comparison of HGV\GBV-C in samples collected 8 years apart
from an individual acutely infected with HGV\GBV-C indicated a rate of 3n9i10−% nucleotide substitutions per site per
year over the whole genome (Nakao et al., 1997) (Fig. 3). While
this is comparable to that of other RNA viruses [e.g. 4i10−%
for HCV in NS5 (Smith et al., 1997 b) ; 1n4i10−% for HIV-1
(Zhu et al., 1998)], and caused no great surprise amongst
virologists when it was first published, it is quite incompatible
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
(A)
(B)
Fig. 6. Frequency histogram of variability at synonymous sites in (A) 17 HGV/GBV-C sequences of genotypes 1, 2 and 3 ;
(B) expected distribution of synonymous variability arising by chance (Simmonds & Smith, 1999 b). HGV/GBV-C sequences
show a greater frequency of invariant codons (23 %) compared with control sequences (10 %), with evidence for a second
distribution of variability values (mean 0n75) greater than the that of controls (0n5).
with the lack of sequence divergence found between different
human populations, if the observed genotype differences
originated through migration longer than 100 000 years ago
(Fig. 3). The rate of sequence change over the interval in which
Old and New World primate species evolved is even more
discrepant from this short-term rate (approximately 10 000fold lower).
If one wished to defend the molecular clock, and extrapolate
the rate of sequence change measured over 8 years to longer
intervals, then the current divergence found in human
HGV\GBV-C would have originated from a common ancestor
as recently as 300 years ago, while those infecting chimpanzees
and New World primates would have diverged from human
genotypes 600–1000 years ago. The occurrence of evolutionarily related viruses in chimpanzees in the wild in Africa
and in marmosets and other primates in South America
requires a transmission chain of infection operative over the
last millennium in which humans would have been the
intermediates. Apart from its evident absurdity, this hypothesis
ignores the species specificity of the GB viruses ; GB viruses
obtained from New World primates are non-infectious in
chimpanzees, nor can human HGV\GBV-C infect New World
primates (J. Bukh and others, personal communication). A
transmission chain linking these divergent primate species in
recent evolutionary history is therefore unlikely indeed.
Constraints of virus sequence change
A more radical explanation of the data is that there are
major differences between GB viruses and higher organisms in
the constraints operating on sequence change. Constraints
unanticipated by conventional methods of sequence comparison would lead to the underestimation of the frequency of
multiple substitutions that occurred on comparison of more
divergent sequences. This would reproduce the apparent
‘ slowing ’ of sequence change inferred from the sequence
relationships between HGV\GBV-C genotypes, and between
GB viruses infecting different primates.
It is well established that different constraints on sequence
change operate at synonymous and non-synonymous sites in
coding sequences of almost all gene sequences. Indeed, there
are several methods to compare sequences at synonymous and
non-synonymous sites separately to allow for independent
correction for multiple substitutions (Nei & Gojobori, 1986 ; Li
et al., 1985). Similarly, transitions occur more frequently than
transversions, and again can be corrected for separately. Many
complex methods have been developed to calculate evolutionary distances that allow for differences in rate of different
types of sequence change at different sites, and have been
successfully applied to evolutionary reconstructions of mammals and other eukaryotes. However, such methods fail
dismally to reconcile the short-term rate of sequence change in
HGV\GBV-C with rates implied from human population and
primate species distributions (Simmonds & Smith, 1999 a).
To investigate the existence of more esoteric constraints on
sequence change in HGV\GBV-C, we compared the distribution of sequence variability in coding sequences with
those of mammalian and other virus coding sequences
(Simmonds & Smith, 1999 b). Quite apart from the extreme
conservation of the encoded amino acid sequence of the
HGV\GBV-C polyprotein (remarked upon above), we also
obtained evidence that a large proportion of synonymous sites
in the coding part of the HGV\GBV-C genome is also
unexpectedly invariant (Fig. 6) ; comparison of complete coding
sequences (8600 bases) of different genotypes of HGV\GBVC showed an excess of invariant synonymous sites (at 23 % of
all codons) compared with the frequency expected by chance
(10 %). This carries the necessary implication that there are
fitness constraints on sequence change of HGV\GBV-C over
and above the coding function of the genome. As described in
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HAB
P. Simmonds
detail by Simmonds & Smith (1999 b) and in the paper by
Cuceanu et al., (2001) that follows, it now appears likely that
the RNA genome of HGV\GBV-C forms a complex and
extensive secondary structure through internal base pairing.
The high free energy on folding, the existence of multiple
covariant sites and the conservation of specific stem–loops
between quite divergent GB virus sequences (such as HGV\
GBV-C, HGV\GBV-CCPZ and GBV-A) all point to (an)
evolutionarily conserved function(s) for the predicted secondary structure. What this actually is, and the extent to which
it may be found in other viruses with single-stranded genomes,
remains to be determined.
Nonetheless, the restrictions imposed by secondary structure provide an important clue towards understanding how the
co-evolution hypothesis for HCV\GBV-C can be reconciled
with the observation of its rapid rate of sequence change over
short observation periods. For example, there may be a class
of synonymous sites which are in non-base paired parts of the
genome, and where sequence change may be relatively
unconstrained. These substitutions may therefore be fixed at
the frequency predicted from the measured short-term rate of
sequence change (Nakao et al., 1997). A different class of
nucleotide sites which participate in internal base-pairing could
be under greater constraint if the resulting secondary structure
influenced the fitness of the virus. Substitutions may therefore
only occur if simultaneous compensatory changes occur to
retain base-pairing in the stem–loop. Indeed, in our analysis of
HGV\GBV-C sequences, covariant sites in predicted stem–
loops were found throughout the coding part of the genome
(Simmonds & Smith, 1999 ; Cuceanu et al., 2001) and rivalled
the 5hUTRs of HCV and HGV\GBV-C in frequency and
complexity. As the third base positions (normally synonymous) are usually opposite each other in the predicted
stem–loops of HGV\GBV-C, the frequency at which covariant
substitutions may occur is the substitution frequency squared
(i.e. approximately 10−(–10−) substitutions per site per year).
This is indeed quite similar to the rate at which sequence
divergence accumulates over the longer periods of human
dispersal and primate speciation (Fig. 3).
It seems therefore as if after the rapid accumulation of
substitutions at unpaired sites, further diversification can only
occur at the much slower rate required by paired changes that
retain secondary structure. Indeed, the extreme conservation
of the amino acid sequence may reflect the even greater
difficulty of simultaneous sequence change at opposite, nonsynonymous base-paired sites ; both amino acid changes would
have to be neutral or beneficial to HGV\GBV-C for the
covariant change to be fixed in the virus population. The
extreme dN\dS ratio mentioned above may therefore result
more from the peculiar constraints imposed by the requirement
to maintain RNA secondary structure, rather than unusual
functional or structural conservatism of the encoded proteins.
A more exact prediction of the expected short- and longterm rates of sequence change of HGV\GBV-C requires a
HAC
more complete mapping and functional investigation of its
RNA secondary structure, so that the sites that are likely to be
constrained and unconstrained can be identified. We also
require further information on the distribution of HGV\GBVC homologues in other primate species ; currently, sequences
have only been obtained from humans, chimpanzees and some
New World primate species. A more conclusive demonstration
of congruency would be obtained if sequences from other Old
World primates could be obtained, as GB viruses recovered
from them should occupy a phylogenetic position intermediate
between the human\chimpanzee common ancestor and the
branch leading to the New World primate species.
Further analysis of free energy on folding, frequencies of
covariant sites and other manifestations of internal basepairing should be also done for other viruses with singlestranded genomes, to investigate whether the constraints
imposed by secondary structure formation represent a more
general principle of virus evolution.
Evolution and origins of hepatitis B virus
(HBV)
HBV is another virus where it is difficult to reconcile its
likely rapid rate of sequence change with its distribution and
close host associations with humans and other primates.
Although many of the issues to be discussed remain unresolved, and highly controversial, it seems to provide another
example of virus evolution operating in a different way from
that assumed for higher organisms.
HBV chronically infects approximately 5 % of the human
population. The toll of approximately 1 million deaths from
chronic liver disease and hepatocellular carcinoma per year
(Thomas & Jacyna, 1993) demonstrates the scale of the global
health problem it poses. HBV is transmitted by sexual contact
and by parenteral exposure, although it is thought that
mother-to-child perinatal transmission and the establishment
of a life-long highly infectious carrier state are responsible for
the observed high rates of endemicity in high prevalence
regions such as South and East Asia, sub-Saharan Africa and
amongst indigenous peoples in Central and South America.
HBV is classified in the Hepadnaviridae, and contains a
partly double-stranded DNA genome of approximately 3200
bases. HBV replicates via an RNA intermediate anti-genome
sequence, encoding a potentially error-prone polymerase
enzyme with both reverse transcriptase and DNA polymerase
activities. An unusual feature of the HBV genome is the
presence of multiple overlapping reading frames for the genes
encoding the core, polymerase and surface antigen genes ; 67 %
of the genome is multiply coding, and therefore lacks what
would be conventionally regarded as synonymous and nonsynonymous sites. It is additionally probable that regions of
the genome that are non-coding may be involved in a variety
of secondary structure interactions necessary for circularization
and transcription.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
Fig. 7. Phylogenetic analysis of HBsAg gene sequences of representative sequences of human genotypes A–F and HBV
sequences recovered from primates (chimpanzees, gibbons, orang-utans, woolly monkey). Note the intermixing and
approximately equal sequence divergence between human genotypes A–E with sequences recovered from different primate
species. Bootstrap values of 70 % are indicated on the branches.
HBV infecting humans in different geographical regions are
currently classified into six genotypes differing from each
other by nucleotide sequence distances of approximately
10–13 % (Fig. 7). Genotypes A and D have global distributions,
genotypes B and C are found predominantly in East and South
East Asia, genotype E is predominant in West Africa, and the
most divergent genotype F is found exclusively amongst
indigenous peoples in Central and South America (Arauz-Ruiz
et al., 1997 ; Norder et al., 1994).
The rate of sequence change of HBV is uncertain. HBV
contains a polymerase enzyme without proof-reading activity,
and error frequencies on RNA or DNA copying are likely to be
of the order measured for the related retroviruses, and for other
RNA viruses. Measurement of the rate of sequence change of
HBV is complicated by the existence of overlapping reading
frames and the lack of synonymous sites in most of the coding
sequence. Compounding this difficulty is the evidence that
many amino acid changes, particularly in the pre-core region,
have a positive selective value, and may occur as an immune
evasion strategy. In a recent study (Hannoun et al., 2000),
individuals with hepatitis and who had cleared HBeAg
(interpreted as evidence of a vigorous immune response to
HBV) showed a mean 12-fold greater nucleotide substitution
rate than individuals who were apparently immunotolerized
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HAD
P. Simmonds
(mild hepatitis, HBeAg positive). Taking the latter group as
representing the evolution of HBV in the absence of immune
pressure, HBV shows a substitution rate of 2n1 (range
0–13)i10−& substitutions per site per year over a mean
observation period of 22 years (range 20–35 years). As it is
generally HBeAg-positive carriers who transmit HBV infection
between human generations, this rate is likely to be the most
appropriate for extrapolating substitution rates over longer
periods. On this basis, the human genotypes of HBV would
have originated from a common ancestor approximately
2300–3100 years ago. How this predicted time of divergence
fits with the various theories for the origin of HBV is explored
in the remainder of this section.
A bold account for the geographical distribution of HBV
genotypes proposed that HBV originated from the Americas,
and spread into the Old World over the last 400 years after
contact from Europeans during colonization (Bollyky et al.,
1997). The diversity of HBV sequences of 11–13 % between
HBV genotypes implies a substitution rate of around
1n5i10−% per site per year, which is more rapid than the rate
observed in HBeAg-positive carriers (2n1i10−& substitutions
per site per year ; see above). However, the main problem for
this hypothesis is the observation of the widespread distribution of HBV in Old World primate species. At the time
this hypothesis was proposed, evidence for infection of other
primates, such as chimpanzees and gibbons, by HBV was
controversial, and could be dismissed as the result of accidental
transmission from humans to captive animals (Lanford et al.,
1998 ; Norder et al., 1996 ; Vaudin et al., 1988 ; Zuckerman et al.,
1978). Since then, it has become more firmly established that
primates are infected with HBV in the wild. In a remarkable
example of multiple publication, three groups recently published evidence for the existence of a shared genotype of HBV
infecting West African chimpanzees (Fig. 7) (Hu et al., 2000 ;
Takahashi et al., 2000 ; MacDonald et al., 2000). This chimpanzee-specific variant of HBV showed approximately 11 %
divergence from the human genotypes A–E. Similar findings
were reported from gibbons and orang-utans in South East
Asia, both of whom harbour species-specific genotypes of
HBV equidistant from each other, and from chimpanzee and
human genotypes (Grethe et al., 2000 ; Warren et al., 1999).
There are a few exceptions to these species associations ; one of
the sequences in the chimpanzee clade originated from a gorilla
(AJ131567) ; a chimpanzee sequence (HBV131567) groups
with gibbons ; another chimpanzee sequence (AB032431)
groups in human genotype E. Finally, sequence AF213008
from a gibbon groups separately from all other sequences.
While it is tempting to explain away these exceptions as either
laboratory error or contamination, or inadvertent transmission
of HBV between species in captivity, more data are clearly
needed to substantiate the claimed species\genotype associations.
Since the evidence for primate infections in the wild became
widely known, a number of alternative, often futile and invarHAE
iably highly speculative attempts have been made to reconcile
these observations into a coherent account of HBV evolution.
The main difficulty arises from the observation of equivalent
sequence relationships between human genotypes A–E and G
to each other and to the primate-species-associated genotypes
of HBV. It is also difficult to rationally fit into any scheme the
outlier human HBV genotype F and the even more divergent
HBV variant obtained from a captive woolly monkey, a New
World primate (Lanford et al., 1998).
Amongst the main competing theories, the hypothesis for
a New World origin for HBV (Bollyky et al., 1997), discussed
above, appears incompatible with the now firmly established
observations for the widespread distribution of HBV in a wide
range of Old World primate species in the wild. The proposal
that contact with humans established HBV infection in several
primate species in the wild over the last 300 years is
epidemiologically highly improbable.
An alternative theory proposes that, as with HGV\GBV-C
(see above), HBV co-evolved with anatomically modern
humans as they migrated from Africa approximately 100 000
years ago (Norder et al., 1994 ; Magnius & Norder, 1995). This
would imply a sustained rate of sequence change of HBV over
the past 100 000 years of approximately 5i10−( nucleotide
changes per site per year, remarkably similar to that hypothesized for HGV\GBV-C (Fig. 3). However, unlike HGV\GBVC, the phylogeny of HBV genotypes in no way corresponds to
genetic relationships between human (or primate) population
groups. For example, the presence of genotype F in Native
American populations is inconsistent with the presence of
genotypes B and C in Mongoloid North East Asians, who are
genetically their nearest relatives. Secondly, there is little or no
genetic evidence that South American and Polynesian populations are significantly intermixed, and there is therefore no
explanation for the presence of genotype F in Pacific Island
populations. Indeed, there is little relationship between HBV
genotype distributions with any of the other human population
groups (South East Asians, Caucasians and African populations). As indicated above, the other incongruence is the
existence of specifies-specific genotypes of HBV in chimpanzees, gibbons and orang-utans, and the way in which they are
intermixed with human genotypes. This is quite unexpected, as
the primate viruses should be much more divergent from
human variants and from each other given the much longer
period of co-speciation of primate species (10–15 million
years).
A third, highly speculative, hypothesis for HBV origins,
that we recently discussed (MacDonald et al., 2000), proposes
that variants found in chimpanzees, gibbons, orang-utans and
in the New World primate woolly monkey species co-speciated
over 10–35 million years. In this case, the outlying position of
the woolly monkey HBV sequence and the equal divergence of
HBV variants from Old World primate species is (approximately) consistent with host phylogeny and fossil-based
estimates for their relative times of divergence. If co-speciation
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
occurred, then the long-term rate of sequence change of HBV
would range from 3 to 5i10−* nucleotide changes per site per
year, lower than recorded for any other virus.
Interestingly, a range of much more genetically divergent
hepadnaviruses infects rodents in North and South America,
such as the woodchuck (Marmota monax), ground squirrel
(Spermophilus beecheyi) and arctic ground squirrel (S. parryii). These viruses may be a manifestation of an equivalent
process of co-evolution over even longer periods. Remarkably,
the genetic distance between primate and rodent HBV variants
after their divergence approximately 110 million years ago
indicates a rate of sequence change of 6n8i10−* changes per
site per year, bizarrely similar to the rate operating over
primate co-evolution.
If HBV co-evolved in primates then the existence of
numerous equally distinct human genotypes of HBV would
require a different explanation to fit in with this theory. It is
possible that human HBV infection arose many times through
contact with different primates infected with species-specific
genotypes (equivalent to those found in chimpanzees, gibbons
and orang-utans). In some ways, this scenario corresponds to
that believed to underlie the origins of HIV infection in
humans. Infection with HIV-1 is likely to have originated
through at least two separate cross-species transmissions from
chimpanzees (Gao et al., 1999), while human infection with
HIV-2 in West Africa arose independently several times
through contact with sooty mangabeys (Feng et al., 1992). A
primate origin for human HBV infection is indeed supported
by the observation that the areas of high HBV prevalence in
humans are those in which contact and cross-species transmission from primates are most likely (South America, subSaharan Africa and South East Asia). Indeed, certain HBV
genotypes are specific to these three areas (F, E and B\C
respectively). As another parallel with HIV-1, the mixture of
HBV genotypes found outside these areas, such as in Europe
and North America, may result from much more recent
epidemic spread, in newly exposed susceptible groups such as
IDUs and male homosexuals. The problem with the theory for
a primate origin is that, to date, no HBV genotypes are shared
between primates and humans, apart from the finding of a
genotype E variant in a chimpanzee (Takahashi et al., 2000) ; if
HBV genotypes A–F originated in primates, then the actual
species involved in transmission to humans remain unidentified.
At this stage, the problems associated with each of the three
hypotheses for the origin of HBV prevents any undisputed
conclusions being drawn. However, the fact that such different
hypotheses are being argued about highlights the current
lack of understanding surrounding the origins of HBV. If the
slow rates of sequence change, implied by both the human
migration and the primate co-evolution hypotheses, can be
verified, HBV would therefore represent another example of a
virus evolving in a markedly different way from higher
organisms, in this case perhaps the result of the complex
constraints imposed by the existence of overlapping reading
frames (Mizokami et al., 1997).
Origins of HCV
The final part of this Lecture returns to the hypotheses
surrounding the origins of HCV, and discusses the application
of the principles advanced for the origins of HGV\GBV-C and
HBV to the reconstruction of its evolutionary history. The
discussion includes the evidence we have for the rate of
sequence change of HCV, an analysis of the geographical
distribution of HCV genotypes and its genetic diversity, and
finally the evidence for the existence of primate homologues of
HCV, or of primates as immediate sources of infection in
humans.
We believe we have relatively accurate values for the rate
of sequence change of HCV, and these provide the starting
point for investigating the dynamics of at least its recent
evolution. A fortuitous opportunity to establish the rate for
HCV arose from the availability of samples from individuals
infected with HCV after exposure to a homogeneous, common
source outbreak in the 1970s. The culprit was a batch of antirhesus D immunoglobulin (anti-D) used in 1977 in Ireland,
which contained a highly viraemic component plasma donation
from a recently infected individual (Power et al., 1994, 1995).
Sequence divergence in the NS5 and E1 regions between
sequences recovered from the recipients 17–20 years later
indicated rates of sequence change of 4n1 and 7n1i10−% per
site per year respectively (Smith et al., 1997 b), with no
evidence for variation in rate between individuals.
Assuming this rate of sequence change is maintained over
longer periods, the diversity of variants within each of the
genotypes associated with the risk groups for HCV infection
was assessed ; these included types 1a, 1b and 3a in Western
countries. For type 1b, 40 NS5 sequences from epidemiologically unrelated individuals in Europe, USA, Asia and Japan
showed a distribution of pairwise distances approximately four
times greater than those between anti-D recipients, indicating
a time of divergence approximately 60–70 years ago (Smith
et al., 1997 b). The absence of any country or region-specific
phylogenetic groupings further implies that the initial spread
of type 1b occurred relatively rapidly, and that it became
disseminated throughout many of the world’s populations
over a short period. These results are consistent with the
prediction of recent, epidemic spread of HCV derived from
mid-depth analysis (Holmes et al., 1995 a).
The diversity of sequences amongst type 3a variants was
more restricted than for type 1b, suggesting a more recent
dissemination (40 years based upon distances in NS5). In
contrast, types 2a, 2b and particularly 2c were more diverse,
with a predicted time of origin of 90–150 years ago. The
genetic evidence of relatively recent spread of genotypes such
as 3a into IDUs is consistent with the epidemiological evidence
for the widespread increase in needle-sharing drug abuse since
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HAF
P. Simmonds
(A)
(B)
HAG
Fig. 8. For legend see facing page.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
the 1960s, while the greater diversity of types 1b and subtypes
of type 2 implies earlier, different modes of transmission,
consistent with a range of other risk factors identifiable in older
HCV-infected individuals (Simmonds et al., 1996 a ; Lau et al.,
1996 ; Tisminetzky et al., 1994 ; Pawlotsky et al., 1995 ; Goeser
et al., 1995 ; McOmish et al., 1993). Genotype 4a infection is
found at high frequency in the Middle East, particularly in
Egypt, where there is evidence for the inadvertent, large-scale
spread of HCV infection by unsterilized needles used for
bilharzia treatment in the 1950s and 1960s (Frank et al., 2000).
Origin of HCV genotypes
Although the above-cited and other ongoing molecular
epidemiology studies of HCV sequence diversity appear
successful in documenting the relatively recent spread of HCV,
the problems and controversy associated with the analysis of
HGV\GBV-C and HBV sequences indicate the need for
caution with reconstruction of the earlier history of HCV. In
particular, for all the reasons discussed above, it would be rash
to assume that the rates of sequence change measured over 20
years can be extrapolated to calculate the time of origin of
much more divergent HCV genotypes, in view of the problems
that this approach has caused for other viruses. The problem
with HCV is that, apart from studies of genotype distributions,
there is little other information currently available that would
help towards identifying the source of the current epidemic
HCV infection in the West.
HCV genotype distributions in non-Western countries are
poorly documented, particularly in sub-Saharan Africa. What
information is available, however, indicates a quite different
pattern of sequence variability of the virus. For example, in
Western African countries, such as Gambia, Ghana, BurkinaFaso, Benin and Guinea (Ruggieri et al., 1996 ; WansbroughJones et al., 1998 ; Mellor et al., 1995 ; Jeannel et al., 1998), small
scale surveys have indicated a predominance of infection with
genotype 2. In contrast to Western countries, these type 2
infections are characterized by considerable sequence diversity,
with different individuals each being infected with different
subtypes, each in turn distinct from the 2a, 2b and 2c subtypes
found in the West. Similarly, genotype 4 infection in Central
Africa (Congo, Gabon, Central African Republic) is also
characterized by extreme subtype diversity (Xu et al., 1994 ;
Menendez et al., 1999 ; Stuyver et al., 1993 ; Fretz et al.,
1995 ; Bukh et al., 1993), quite different from the epidemic
pattern of type 4a infection in Egypt and elsewhere in the
Middle East (see above). Combining a large number geographical surveys, HCV variants from five different regions in
Africa and South East Asia contain areas of great subtype
diversity (Fig. 8 A).
For another virus, HIV-1, the great sequence diversity in
Central Africa provides genetic evidence for that area being
the original source of the subsequent worldwide epidemic. By
analogy, it could be imagined that the areas identified in Fig.
8 (B) represent areas of origin for each of the genotypes that
subsequently spread into Western countries. As with HIV-1,
the sheer range of HCV sequences in sub-Saharan Africa and
South East Asia argues for the long-term presence of HCV in
human populations from these areas. Extending the analogy,
the demographic and epidemiological factors that led to the
global spread of HIV-1 in the last 30–40 years could be the
same ones underlying the epidemic spread of HCV over the
same period, although it is clear that spread of certain
genotypes of HCV, such as types 1b and subtypes of genotype
2, occurred earlier. This hypothesis of HCV origins differs in
being multifocal ; Southern Asia appears to harbour the greatest
diversity of genotypes 3 and 6, while types 1, 2, 4 and
probably 5 would be of African origin. To substantiate this
speculative hypothesis, we clearly need much more information on the epidemiology of HCV infection in these areas.
In particular, it would be important to identify the transmission
routes between individuals that are able to maintain HCV
infection in a human population for the long periods implied
by its genetic variability. There are few, if any, other examples
of viruses infecting populations where the principal route of
transmission is parenteral. Amongst possibilities for alternative
routes, tribal scarification practices, sexual transmission between individuals with untreated sexually transmitted diseases,
and mosquito or tick vectors possibly contribute to HCV
transmission in these areas, but there is no evidence so far
(McCarthy et al., 1994 ; Bellini et al., 1997 ; Silverman et al.,
1996).
Intriguingly, the areas of greatest genetic diversity of HCV
(Fig. 8 B) are also those of greatest HBV prevalence. Perhaps if
we understood more about the origins of HBV we might be
able to explain this coincidental distribution of HCV diversity
and the existence of shared factors that may maintain infections
with both viruses in these communities. Pursuing the analogy
with the origins of HIV-1 (and possibly HBV), there would also
be a place for more intensive investigation of infection with
HCV or related viruses in Old World primates. Indeed, a virus
referred to as GBV-B, similar to HCV in genome organization
and secondary structure of the 5hUTR, although highly
divergent in sequences, was recently found in a captive
tamarind, a New World primate (Simons et al., 1995). It may
therefore show the same evolutionary relationship to HCV as
GBV-A does with HGV\GBV-C. Old World primate species
could conceivably carry a range of HCV-like viruses that
recapitulate host inter-relationships in the same way as
hypothesized for HGV\GBV-C, primate lentiviruses and
perhaps HBV.
Fig. 8. (A) Phylogenetic analysis of nucleotide sequences from part of the HCV NS5B region amplified from HCV-infected
individuals, including those from sub-Saharan Africa and South East Asia. Note the much greater diversity and number of
subtypes in genotypes 1, 2, 3, 4 and 6 compared with those found in Western countries (Fig. 2). (B) Approximate
geographical distribution of regions where increased subtype diversity is found.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HAH
P. Simmonds
Concluding remarks
Even without definitive conclusions, this Lecture has at
least documented the difficulties in reconstructing virus origins
from contemporary, indirect genetic and epidemiological
evidence. Furthermore, the incompatibility between predictions of relatively recent times of divergence of HBV and
HGV\GBV-C with their global distributions in humans and
primates in the wild further suggests that the molecular clock
cannot be applied in the simple way that it has been to
reconstruct evolutionary histories of other organisms.
Despite its shortcomings, much of the indirect evidence for
virus origins that derives from studies of genotype distributions points to very long-term virus–host interrelationships.
As discussed above, current evidence suggests that relatively
closely related variants of HGV\GBV-C have evolved with
their hosts over the period of human dispersal and primate
speciation. Our own investigations of this discrepancy have
concentrated on the constraints on sequence change, such as
RNA secondary structure formation in HGV\GBV-C, that
may lead to significant underestimation of evolutionary
distances between more distantly related variants. Other
authors have documented the unusual complications imposed
by the use of multiple reading frames by HBV, and the lack of
conventional synonymous sites in most of the genome
(Mizokami et al., 1997).
Aside from these specific issues, a more general principle
that shapes the evolution of viruses is their large population
size. To paraphrase work in other fields, the numbers of
multicellular, large organisms such as mammals are much more
restricted, and this limits the population pool in which fitness
selection occurs. Evidence for a lack of meaningful selection at
the genome level includes their unnecessarily large genome
sizes, for the most part packed with repetitive, mobile elements
and other junk DNA, introns and often nonsensical redundancy
in gene function. Taking this argument further, the lack
of selection would extend to coding sequences, where
fixation of mutations that have relatively minor effects on
organism fitness could occur at similar frequency to genuinely
neutral substitutions. Gene sequences may therefore diverge
more or less randomly during speciation, and reproduce the
linear relationship between time and degree of sequence
divergence that underlies the molecular clock.
Bacteria and viruses are unlikely to evolve under such
relaxed constraints. Indeed, their small genome sizes, an almost
universal lack of introns or gene reduplications, and in the
extreme case, such extreme economy in coding sequences that
most of the genome contains multiple reading frames suggest
a degree of fitness optimization absent from larger organisms.
This process would clearly be facilitated by the large
population sizes of both viruses and bacteria in their natural
environments.
Population size and high mutation rates have generally
been seen as factors enhancing the adaptive ability of viruses
HAI
to cope with new pressures (such as antiviral treatment,
immune recognition). However, the same factors can have
precisely the opposite effect in stable environments. Large
population sizes and the ability of more fit mutants to rapidly
replace entire virus populations in the infected individual
inevitably produce populations highly optimized for the
environment in which they replicate. Mutants with sequence
changes that had even a marginal harmful effect on virus
fitness, such as a conservative amino acid substitution (or in the
case of HGV\GBV-C, a synonymous substitution that disrupted secondary structure) would be quickly driven out by the
10*–10"! other members of the population pool competing for
cells to infect. In contrast, comparable substitutions, for
example in the haemoglobin gene of an elephant living in a
small breeding group, would have no significant impact on its
reproductive fitness, and would be as likely to become fixed in
the elephant population as neutral or even beneficial mutations.
The selection conditions operating during persistent virus
infections can therefore drive out much of the phenotypic
variability that characterizes the evolution of larger organisms,
and may produce the extraordinary conservatism and evolutionary stasis observed in hepatitis viruses. Effectively, some
viruses may have found their fitness peak for a particular host.
Neither transmission bottlenecks nor any of the other processes
associated with population drift are going to drive them from
that peak, particularly where the high mutation rate of RNA
viruses provides the means for rapid re-establishment of the
original, fitness-optimized population. Large population sizes,
the intense selection pressures that operate within them, and
high mutation rates that promotes convergence to fitness
peaks, may be the factors that set virus evolution apart from
that of their hosts.
References
Adams, N. J., Prescott, L. E., Jarvis, L. M., Lewis, J. C. M., McClure,
M. O., Smith, D. B. & Simmonds, P. (1998). Detection in chimpanzees
of a novel flavivirus related to GB virus-C\hepatitis G virus. Journal of
General Virology 79, 1871–1877.
Alter, H. J., Conrycantilena, C., Melpolder, J., Tan, D., Vanraden, M.,
Herion, D., Lau, D. & Hoofnagle, J. H. (1997). Hepatitis C in
asymptomatic blood donors. Hepatology 26, 29–33.
Arauz-Ruiz, P., Norder, H., Visona, K. A. & Magnius, L. O. (1997).
Genotype F prevails in HBV infected patients of Hispanic origin in
Central America and may carry the precore stop mutant. Journal of
Medical Virology 51, 305–312.
Bellini, R., Casali, B., Carrieri, M., Zambonelli, C., Rivasi, P. & Rivasi, F.
(1997). Aedes albopictus (Diptera : Culicidae) is incompetent as a vector of
hepatitis C virus. APMIS 105, 299–302.
Birkenmeyer, L. G., Desai, S. M., Muerhoff, A. S., Leary, T. P., Simons,
J. N., Montes, C. C. & Mushahwar, I. K. (1998). Isolation of a GB virus-
related genome from a chimpanzee. Journal of Medical Virology 56, 44–51.
Bollyky, P. L. & Holmes, E. C. (1999). Reconstructing the complex
evolutionary history of hepatitis B virus. Journal of Molecular Evolution
49, 130–141.
Bollyky, P. L., Rambaut, A., Grassly, N., Carman, W. F. & Holmes, E. C.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
(1997). Hepatitis B virus has a New World evolutionary origin.
Hepatology 26, 765.
Brown, A. J. L., Lobidel, D., Wade, C. M., Rebus, S., Phillips, A. N.,
Brettle, R. P., France, A. J., Leen, C. S., McMenamin, J., McMillan, A.,
Maw, R. D., Mulcahy, F., Robertson, J. R., Sankar, K. N., Scott, G., Wyld,
R. & Peutherer, J. F. (1997). The molecular epidemiology of human
immunodeficiency virus type 1 in six cities in Britain and Ireland. Virology
235, 166–177.
Bukh, J. & Apgar, C. L. (1997). Five new or recently discovered (GBVA) virus species are indigenous to New World monkeys and may
constitute a separate genus of the Flaviviridae. Virology 229, 429–436.
Bukh, J., Purcell, R. H. & Miller, R. H. (1992). Sequence analysis of the
5h noncoding region of hepatitis C virus. Proceedings of the National
Academy of Sciences, USA 89, 4942–4946.
Bukh, J., Purcell, R. H. & Miller, R. H. (1993). At least 12 genotypes of
hepatitis C virus predicted by sequence analysis of the putative E1 gene
of isolates collected worldwide. Proceedings of the National Academy of
Sciences, USA 90, 8234–8238.
Chamberlain, R. W., Adams, N. J., Taylor, L. A., Simmonds, P. & Elliott,
R. M. (1997). The complete coding sequence of hepatitis C virus
genotype 5a, the predominant genotype in South Africa. Biochemical and
Biophysical Research Communications 236, 44–49.
Charrel, R. N., de Micco, P. & de Lamballerie, X. (1999). Phylogenetic
analysis of GB viruses A and C : evidence for cospeciation between virus
isolates and their primate hosts. Journal of General Virology 80,
2329–2335.
Cuceanu, N. M., Tuplin, A. & Simmonds, P. (2001). Evolutionarily
conserved RNA secondary structures in coding and non-coding
sequences at the 3h end of the hepatitis G virus\GB-virus C genome.
Journal of General Virology 82, 713–722.
Davis, J. L., Heginbottom, J. A., Annan, A. P., Daniels, R. S., Berdal,
B. P., Bergan, T., Duncan, K. E., Lewin, P. K., Oxford, J. S., Roberts, N.,
Skehel, J. J. & Smith, C. R. (2000). Ground penetrating radar surveys to
locate 1918 Spanish Flu victims In permafrost. Journal of Forensic Science
45, 68–76.
de Lamballerie, X., Charrel, R. N., Attoui, H. & de Micco, P. (1997).
Classification of hepatitis C virus variants in six major types based on
analysis of the envelope 1 and nonstructural 5B genome regions and
complete polyprotein sequences. Journal of General Virology 78, 45–51.
Enomoto, N., Takada, A., Nakao, T. & Date, T. (1990). There are two
major types of hepatitis C virus in Japan. Biochemical and Biophysical
Research Communications 170, 1021–1025.
Erker, J. C., Desai, S. M., Leary, T. P., Chalmers, M. L., Montes, C. C. &
Mushahwar, I. K. (1998). Genomic analysis of two GB virus A variants
isolated from captive monkeys. Journal of General Virology 79, 41–45.
Felsenstein, J. (1993). PHYLIP Inference Package version 3.5. Department of Genetics, University of Washington, Seattle, WA, USA.
Feng, G., Yue, L., White, A. T., Pappas, P. G., Barchue, J., Greene,
B. M., Sharp, P. M., Shaw, G. M. & Hahn, B. H. (1992). Human infection
by genetically diverse SIVsm-related HIV-2 in West Africa. Nature 358,
495–499.
Frank, C., Mohamed, M. K., Strickland, G. T., Lavanchy, D., Arthur,
R. R., Magder, L. S., El Khoby, T., Abdel-Wahab, Y., Aly Ohn, E. S.,
Anwar, W. & Sallam, I. (2000). The role of parenteral antischistosomal
therapy in the spread of hepatitis C virus In Egypt. Lancet 355, 887–891.
Fretz, C., Jeannel, D., Stuyver, L., Herve, V., Lunel, F., Boudifa, A.,
Mathiot, C., Dethe, G. & Fournel, J. J. (1995). HCV infection in a rural
population of the Central African Republic (CAR) : evidence for three
additional subtypes of genotype 4. Journal of Medical Virology 47,
435–437.
Gao, F., Bailes, E., Robertson, D. L., Chen, Y., Rodenburg, C. M.,
Michael, S. F., Cummins, L. B., Arthur, L. O., Peeters, M., Shaw, G. M.,
Sharp, P. M. & Hahn, B. H. (1999). Origins of HIV-1 in the chimpanzee
Pan troglodytes troglodytes. Nature 397, 436–441.
Goeser, T., Tox, U., Muller, H. M., Arnold, J. C. & Theilmann, L. (1995).
Genotypes in chronic hepatitis C virus (HCV) infection and in liver
cirrhosis caused by HCV in Germany. Deutsche Medizinische Wochenschrift
120, 1070–1073.
Gonzalez-Perez, M. A., Norder, H., Bergstrom, A., Lopez, E., Visona,
K. A. & Magnius, L. O. (1997). High prevalence of GB virus C strains
genetically related to strains with Asian origin in Nicaraguan hemophiliacs. Journal of Medical Virology 52, 149–155.
Grethe, S., Heckel, J. O., Rietschel, W. & Hufert, F. T. (2000). Molecular
epidemiology of hepatitis B virus variants in nonhuman primates. Journal
of Virology 74, 5377–5381.
Haase, A. T., Henry, K., Zupancic, M., Sedgewick, G., Faust, R. A.,
Melroe, H., Cavert, W., Gebhard, K., Staskus, K., Zhang, Z. Q., Dailey,
P. J., Balfour, H. H., Erice, A. & Perelson, A. S. (1996). Quantitative
image analysis of HIV-1 infection in lymphoid tissue. Science 274,
985–989.
Hannoun, C., Horal, P. & Lindh, M. (2000). Long-term mutation rates
in the hepatitis B virus genome. Journal of General Virology 81, 75–83.
Holland, J., Spindler, K., Horodyski, F., Grabau, E., Nichol, S. &
Vandepol, S. (1982). Rapid evolution of RNA genomes. Science 215,
1577–1585.
Holmes, E. C., Nee, S., Rambaut, A., Garnett, G. P. & Harvey, P. H.
(1995 a). Revealing the history of infectious disease epidemics using
phylogenetic trees. Philosophical Transactions of the Royal Society of London
B Biological Sciences 349, 33–40.
Holmes, E. C., Zhang, L. Q., Robertson, P., Cleland, A., Harvey, E.,
Simmonds, P. & Brown, A. J. L. (1995 b). The molecular epidemiology
of human immunodeficiency virus type 1 in Edinburgh. Journal of
Infectious Diseases 171, 45–53.
Hopkins, D. (1980). News from the field. Paleopathology Association
Newsletter 31, 6.
Hu, X., Margolis, H. S., Purcell, R. H., Ebert, J. & Robertson, B. H.
(2000). Identification of hepatitis B virus indigenous to chimpanzees.
Proceedings of the National Academy of Sciences, USA 97, 1661–1664.
Jeannel, D., Fretz, C., Traore, Y., Kohdjo, N., Bigot, A., Gamy, E. P.,
Jourdan, G., Kourouma, K., Maertens, G., Fumoux, F., Fournel, J. J. &
Stuyver, L. (1998). Evidence for high genetic diversity and long-term
endemicity of hepatitis C virus genotypes 1 and 2 in West Africa. Journal
of Medical Virology 55, 92–97.
Jones, S., Martin, R. & Pilbeam, D. (1992). Human Evolution.
Cambridge : Cambridge University Press.
Katayama, Y., Apichartpiyakul, C., Handajani, R., Ishido, S. & Hotta, H.
(1997). GB virus C hepatitis G virus (GBV-C\HGV) infection in Chiang
Mai, Thailand, and identification of variants on the basis of 5huntranslated region sequences. Archives of Virology 142, 2433–2445.
Kenny-Walsh, E. (1999). Clinical outcomes after hepatitis C infection
from contaminated anti- D immune globulin. New England Journal of
Medicine 340, 1228–1233.
Kolykhalov, A. A., Feinstone, S. M. & Rice, C. M. (1996). Identification
of a highly conserved sequence element at the 3h terminus of hepatitis C
virus genome RNA. Journal of Virology 70, 3363–3371.
Konomi, N., Miyoshi, C., La Fuente Zerain, C., Li, T. C., Arakawa, Y. &
Abe, K. (1999). Epidemiology of hepatitis B, C, E, and G virus infections
and molecular analysis of hepatitis G virus isolates in Bolivia. Journal of
Clinical Microbiology 37, 3291–3295.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HAJ
P. Simmonds
Korber, B., Muldoon, M., Theiler, J., Gao, F., Gupta, R., Lapedes, A.,
Hahn, B. H., Wolinsky, S. & Bhattacharya, T. (2000). Timing the
Mellor, J., Holmes, E. C., Jarvis, L. M., Yap, P. L., Simmonds, P. &
International Collaborators (1995). Investigation of the pattern of
ancestor of the HIV-1 pandemic strains [See Comments]. Science 288,
1789–1796.
Koretz, R. L., Abbey, H., Coleman, E. & Gitnick, G. (1993). Non-A, nonB posttransfusion hepatitis : looking back into the second decade. Annals
of Internal Medicine 119, 110–115.
Kumar, S. & Hedges, S. B. (1998). A molecular timescale for vertebrate
evolution. Nature 392, 917–920.
Kumar, S., Tamura, K. & Nei, M. (1993). MEGA : Molecular Evolutionary Genetics Analysis, version 1.0. Pennsylvania State University,
University Park, PA, USA.
hepatitis C virus sequence diversity in different geographical regions :
implications for virus classification. Journal of General Virology 76,
2493–2507.
Lanford, R. E., Chavez, D., Brasky, K. M., Burns, R. B., III & Rico-Hesse,
R. (1998). Isolation of a hepadnavirus from the woolly monkey, a New
World primate. Proceedings of the National Academy of Sciences, USA 95,
5757–5761.
Leary, T. P., Muerhoff, A. S., Simons, J. N., Pilot-Matias, T. J., Erker,
J. C., Chalmers, M. L., Schlauder, G. G., Dawson, G. J., Desai, S. M. &
Mushahwar, I. K. (1996). Sequence and genomic organization of GBV-
C : a novel member of the Flaviviridae associated with human non-A–E
hepatitis. Journal of Medical Virology 48, 60–67.
Leary, T. P., Desai, S. M., Erker, J. C. & Mushahwar, I. K. (1997). The
sequence and genomic organization of a GB virus A variant isolated from
captive tamarins. Journal of General Virology 78, 2307–2313.
Li, W.-H., Wu, C.-I. & Luo, C.-C. (1985). A new method for estimating
synonymous and non-synonymous rates of nucleotide substitution
considering the relative likelihood of nucleotide and codon changes.
Molecular Biology and Evolution 2, 150–174.
Linnen, J., Wages, J., Zhangkeck, Z. Y., Fry, K. E., Krawczynski, K. Z.,
Alter, H., Koonin, E., Gallagher, M., Alter, M., Hadziyannis, S.,
Karayiannis, P., Fung, K., Nakatsuji, Y., Shih, J. W. K., Young, L.,
Piatak, M., Hoover, C., Fernandez, J., Chen, S., Zou, J. C., Morris, T.,
Hyams, K. C., Ismay, S., Lifson, J. D., Hess, G., Foung, S. K. H.,
Thomas, H., Bradley, D., Margolis, H. & Kim, J. P. (1996). Molecular
cloning and disease association of hepatitis G virus : a transfusiontransmissible agent. Science 271, 505–508.
Liu, H. F., Muyembe-Tamfum, J. J., Dahan, K., Desmyter, J. & Goubau,
P. (2000). High prevalence of GB virus C\hepatitis G virus in Kinshasa,
Democratic Republic of Congo : a phylogenetic analysis. Journal of
Medical Virology 60, 159–165.
McCarthy, M. C., El-Tigani, A., Khalid, I. O. & Hyams, K. C. (1994).
Hepatitis B and C in Juba, southern Sudan : results of a serosurvey.
Transactions of the Royal Society of Tropical Medicine and Hygiene 88,
534–536.
MacDonald, D. M., Holmes, E. C., Lewis, J. C. & Simmonds, P. (2000).
Detection of hepatitis B virus infection in wild-born chimpanzees (Pan
troglodytes verus) : phylogenetic relationships with human and other
primate genotypes. Journal of Virology 74, 4253–4257.
McOmish, F., Chan, S.-W., Dow, B. C., Gillon, J., Frame, W. D.,
Crawford, R. J., Yap, P. L., Follett, E. A. C. & Simmonds, P. (1993).
Detection of three types of hepatitis C virus in blood donors :
investigation of type-specific differences in serological reactivity and rate
of alanine aminotransferase abnormalities. Transfusion 33, 7–13.
Magnius, L. O. & Norder, H. (1995). Subtypes, genotypes and molecular
epidemiology of the hepatitis B virus as reflected by sequence variability
of the S-gene. Intervirology 38, 24–34.
Mansky, L. M. & Temin, H. M. (1995). Lower in vivo mutation rate of
human immunodeficiency virus type 1 than that predicted from the
fidelity of purified reverse transcriptase. Journal of Virology 69, 5087–
5094.
HBA
Menendez, C., Sanchez-Tapias, J. M., Alonso, P. L., Gimenez-Barcons,
M., Kahigwa, E., Aponte, J. J., Mshinda, H., Navia, M. M., Deanta,
M. T. J., Rodes, J. & Saiz, J. C. (1999). Molecular evidence of mother-to-
infant transmission of hepatitis G virus among women without known
risk factors for parenteral infections. Journal of Clinical Microbiology 37,
2333–2336.
Mison, L., Hyland, C., Poidinger, M., Borthwick, I., Faoagali, J., Aeno,
U. & Gowans, E. (2000). Hepatitis G virus genotypes in Australia, Papua
New Guinea and the Solomon Islands : a possible New Pacific type
identified. Journal of Gastroenterology and Hepatology 15, 952–956.
Mizokami, M., Gojobori, T., Ohba, K. I., Ikeo, K., Ge, X. M., Ohno, T.,
Orito, E. & Lau, J. Y. N. (1996). Hepatitis C virus types 7, 8 and 9 should
be classified as type 6 subtypes. Journal of Hepatology 24, 622–624.
Mizokami, M., Orito, E., Ohba, K., Ikeo, K., Lau, J. Y. & Gojobori, T.
(1997). Constrained evolution with respect to gene overlap of hepatitis
B virus. Journal of Molecular Evolution 44, 83–90.
Morin, P. A., Moore, J. J., Chakraborty, R., Jin, L., Goodall, J. &
Woodruff, D. S. (1994). Kin selection, social structure, gene flow, and
the evolution of chimpanzees. Science 265, 1193–1201.
Muerhoff, A. S., Simons, J. N., Leary, T. P., Erker, J. C., Chalmers,
M. L., Pilot-Matias, T. J., Dawson, G. J., Desai, S. M. & Mushahwar, I. K.
(1996). Sequence heterogeneity within the 5h-terminal region of the
hepatitis GB virus C genome and evidence for genotypes. Journal of
Hepatology 25, 379–384.
Muerhoff, A. S., Smith, D. B., Leary, T. P., Erker, J. C., Desai, S. M. &
Mushahwar, I. K. (1997). Identification of GB virus C variants by
phylogenetic analysis of 5h-untranslated and coding region sequences.
Journal of Virology 71, 6501–6508.
Mukaide, M., Mizokami, M., Orito, E., Ohba, K., Nakano, T., Ueda, R.,
Hikiji, K., Iino, S., Shapiro, S., Lahat, N., Park, Y. M., Kim, B. S.,
Oyunsuren, T., Rezieg, M., Alahdal, M. N. & Lau, J. Y. N. (1997). Three
different GB virus C\hepatitis G virus genotypes – phylogenetic analysis
and A genotyping assay based on restriction fragment length polymorphism. FEBS Letters 407, 51–58.
Nakao, H., Okamoto, H., Fukuda, M., Tsuda, F., Mitsui, T., Masuko, K.,
Lizuka, H., Miyakawa, Y. & Mayumi, M. (1997). Mutation rate of GB
virus C hepatitis G virus over the entire genome and in subgenomic
regions. Virology 233, 43–50.
Nei, M. & Gojobori, T. (1986). Simple methods for estimating the
numbers of synonymous and non-synonymous substitutions. Molecular
Biology and Evolution 3, 418–426.
Norder, H., Courouce, A. M. & Magnius, L. O. (1994). Complete
genomes, phylogenetic relatedness, and structural proteins of six strains
of the hepatitis B virus, four of which represent two new genotypes.
Virology 198, 489–503.
Norder, H., Ebert, J. W., Fields, H. A., Mushahwar, I. K. & Magnius,
L. O. (1996). Complete sequencing of a gibbon hepatitis B virus genome
reveals a unique genotype distantly related to the chimpanzee hepatitis
B virus. Virology 218, 214–223.
Pawlotsky, J. M., Tsakiris, L., Roudot-Thoraval, F., Pellet, C., Stuyver,
L., Duval, J. & Dhumeaux, D. (1995). Relationship between hepatitis C
virus genotypes and sources of infection in patients with chronic hepatitis
C. Journal of Infectious Diseases 171, 1607–1610.
Power, J. P., Lawlor, E., Davidson, F., Yap, P. L., Kenny-Walsh, E.,
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
Fleming Lecture
Whelton, M. J. & Walsh, T. J. (1994). Hepatitis C viraemia in recipients
of Irish intravenous anti-D immunoglobulin. Lancet 344, 1166–1167.
Power, J. P., Lawlor, E., Davidson, F., Holmes, E. C., Yap, P. L. &
Simmonds, P. (1995). Molecular epidemiology of an outbreak of
infection with hepatitis C virus in recipients of anti-D immunoglobulin.
Lancet 345, 1211–1213.
Robertson, B., Myers, G., Howard, C., Brettin, T., Bukh, J., Gaschen, B.,
Gojobori, T., Maertens, G., Mizokami, M., Nainan, O., Netesov, S.,
Nishioka, K., Shini, T., Simmonds, P., Smith, D., Stuyver, L. & Weiner,
A. (1998). Classification, nomenclature, and database development for
Simmonds, P., Mellor, J., Sakuldamrongpanich, T., Nuchaprayoon, C.,
Tanprasert, S., Holmes, E. C. & Smith, D. B. (1996 b). Evolutionary
analysis of variants of hepatitis C virus found in South-East Asia :
comparison with classifications based upon sequence similarity. Journal of
General Virology 77, 3013–3024.
Simons, J. N., Pilot-Matias, T. J., Leary, T. P., Dawson, G. J.,
Desai, S. M., Schlauder, G. G., Muerhoff, A. S., Erker, J. C., Buijk,
S. L., Chalmers, M. L., Vansant, C. L. & Mushahwar, I. K. (1995).
Identification of two flavivirus-like genomes in the GB hepatitis
agent. Proceedings of the National Academy of Sciences, USA 92, 3401–3405.
hepatitis C virus (HCV) and related viruses : proposals for standardization.
Archives of Virology 143, 2493–2503.
Smith, D. B., Cuceanu, N., Davidson, F., Jarvis, L. M., Mokili, J. L. K.,
Hamid, S., Ludlam, C. A. & Simmonds, P. (1997 a). Discrimination of
Ruggieri, A., Argentini, C., Kouruma, F., Chionne, P., Dugo, E., Spada,
E., Dettori, S., Sabbatani, S. & Rapicetta, M. (1996). Heterogeneity of
hepatitis G virus\GBV-C geographical variants by analysis of the 5h noncoding region. Journal of General Virology 78, 1533–1542.
hepatitis C virus genotype 2 variants in West Central Africa (Guinea
Conakry). Journal of General Virology 77, 2073–2076.
Smith, D. B., Pathirana, S., Davidson, F., Lawlor, E., Power, J., Yap,
P. L. & Simmonds, P. (1997 b). The origin of hepatitis C virus
Sathar, M. A., Soni, P. N., Pegoraro, R., Simmonds, P., Smith, D. B.,
Dhillon, A. P. & Dusheiko, G. M. (1999). A new variant of GB virus
C\hepatitis G virus (GBV-C\HGV) from South Africa. Virus Research 64,
151–160.
Seeff, L. B. (1997). Natural history of hepatitis C. Hepatology 26, 21–28.
Seeff, L. B., Buskell-Bales, Z., Wright, E. C., Durako, S. J., Alter, H. J.,
Iber, F. L., Hollinger, F. B., Gitnick, G., Knodell, R. G., Perrillo, R. P.,
Stevens, C. E. & Hollingsworth, C. G. (1992). Long-term mortality after
transfusion-associated non A hepatitis, non B hepatitis. New England
Journal of Medicine 327, 1906–1911.
Seeff, L. B., Miller, R. N., Rabkin, C. S., Buskell-Bales, Z., StraleyEason, K. D., Smoak, B. L., Johnson, L. D., Lee, S. R. & Kaplan, E. L.
(2000). 45-year follow-up of hepatitis C virus infection in healthy young
adults. Annals of Internal Medicine 132, 105–111.
Silverman, A. L., McCray, D. G., Gordon, S. C., Morgan, W. T. & Walker,
E. D. (1996). Experimental evidence against replication or dissemination
of hepatitis C virus in mosquitoes (Diptera : Culicidae) using detection
by reverse transcriptase polymerase chain reaction. Journal of Medical
Entomology 33, 398–401.
Simmonds, P. (1998). Variability of the hepatitis C virus genome. In
Hepatitis C Virus, 2nd edn, pp. 38–63. Edited by H. W. Reesink. Basel :
Karger.
Simmonds, P. & Smith, D. B. (1999 a). Hepatitis C and G viruses – old
or new? In HIV and the New Viruses, pp. 459–480. Edited by A. G.
Dalgleish & R. A. Weiss. San Diego : Academic Press.
Simmonds, P. & Smith, D. B. (1999 b). Structural constraints on RNA
virus evolution. Journal of Virology 73, 5787–5794.
Simmonds, P., Alberti, A., Alter, H. J., Bonino, F., Bradley, D. W.,
Brechot, C., Brouwer, J. T., Chan, S. W., Chayama, K., Chen, D. S.,
Choo, Q. L., Colombo, M., Cuypers, H. T. M., Date, T., Dusheiko, G. M.,
Esteban, J. I., Fay, O., Hadziyannis, S. J., Han, J., Hatzakis, A., Holmes,
E. C., Hotta, H., Houghton, M., Irvine, B., Kohara, M., Kolberg, J. A.,
Kuo, G., Lau, J. Y. N., Lelie, P. N., Maertens, G., McOmish, F.,
Miyamura, T., Mizokami, M., Nomoto, A., Prince, A. M., Reesink, H. W.,
Rice, C., Roggendorf, M., Schalm, S. W., Shikata, T., Shimotohno, K.,
Stuyver, L., Trepo, C., Weiner, A., Yap, P. L. & Urdea, M. S. (1994). A
proposed system for the nomenclature of hepatitis C viral genotypes.
Hepatology 19, 1321–1324.
Simmonds, P., Mellor, J., Craxi, A., Sanchez-Tapias, J. M., Alberti, A.,
Prieto, J., Colombo, M., Rumi, M. G., Loiacano, O., Ampurdanesmingall, S., Fornsbernhardt, X., Chemello, L., Civeira, M. P., Frost, C.
& Dusheiko, G. (1996 a). Epidemiological, clinical and therapeutic
associations of hepatitis C types in western European patients. Journal of
Hepatology 24, 517–524.
genotypes. Journal of General Virology 78, 321–328.
Smith, D. B., Basaras, M., Frost, S., Haydon, D., Cuceanu, N., Prescott,
L., Kamenka, C., Millband, D., Sathar, M. A. & Simmonds, P. (2000).
Phylogenetic analysis of GBV-C\hepatitis G virus. Journal of General
Virology 81, 769–780.
Stuyver, L., Rossau, R., Wyseur, A., Duhamel, M., Vanderborght, B.,
Van Heuverswyn, H. & Maertens, G. (1993). Typing of hepatitis C virus
isolates and characterization of new subtypes using a line probe assay.
Journal of General Virology 74, 1093–1102.
Suzuki, Y. & Gojobori, T. (1997). The origin and evolution of Ebola and
Marburg viruses. Molecular Biology and Evolution 14, 800–806.
Takahashi, K., Brotman, B., Usuda, S., Mishiro, S. & Prince, A. M.
(2000). Full-genome sequence analyses of hepatitis B virus (HBV) strains
recovered from chimpanzees infected in the wild : implications for an
origin of HBV. Virology 267, 58–64.
Tanaka, Y., Mizokami, M., Orito, E., Ohba, K., Kato, T., Kondo, Y.,
Mboudjeka, I., Zekeng, L., Kaptue, L., Bikandou, B., Mpele, P.,
Takehisa, J., Hayami, M., Suzuki, Y. & Gojobori, T. (1998 a). African
origin of GB virus C hepatitis G virus. FEBS Letters 423, 143–148.
Tanaka, Y., Mizokami, M., Orito, E., Ohba, K. I., Nakano, T., Kato, T.,
Kondo, Y., Ding, X., Ueda, R., Sonoda, S., Tajima, K., Miura, T. &
Hayami, M. (1998 b). GB virus C\hepatitis G virus infection among
Colombian native Indians. American Journal of Tropical Medicine and
Hygiene 59, 462–467.
Thomas, H. C. & Jacyna, M. R. (1993). Hepatitis B virus : pathogenesis
and treatment of chronic infection. In Viral Hepatitis, pp. 185–207. Edited
by A. J. Zuckerman & H. C. Thomas. Edinburgh : Churchill Livingstone.
Tisminetzky, S. G., Gerotto, M., Pontisso, P., Chemello, L., Ruvoletto,
M. G., Baralle, F. & Alberti, A. (1994). Genotypes of hepatitis C virus in
Italian patients with chronic hepatitis C. International Hepatological
Communications 2, 105–112.
Tucker, T. J., Smuts, H., Eickhaus, P., Robson, S. C. & Kirsch, R. E.
(1999). Molecular characterization of the 5h non-coding region of South
African GBV-C\HGV isolates : major deletion and evidence for a fourth
genotype. Journal of Medical Virology 59, 52–59.
Vaudin, M., Wolstenholme, A. J., Tsiquaye, K. N., Zuckerman, A. J. &
Harrison, T. J. (1988). The complete nucleotide sequence of the genome
of a hepatitis B virus isolated from a naturally infected chimpanzee.
Journal of General Virology 69, 1383–1389.
Wansbrough-Jones, M. H., Frimpong, E., Cant, B., Harris, K., Evans,
M. R. W. & Teo, C. G. (1998). Prevalence and genotype of hepatitis C
virus infection in pregnant women and blood donors in Ghana.
Transactions of the Royal Society of Tropical Medicine and Hygiene 92,
496–499.
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33
HBB
P. Simmonds
Warren, K. S., Heeney, J. L., Swan, R. A., Heriyanto & Verschoor, E. J.
(1999). A new group of hepadnaviruses naturally infecting orangutans
(Pongo pygmaeus). Journal of Virology 73, 7860–7865.
Wells, C. (1964). Bones, Bodies and Disease. London : Thames and Hudson.
Wiese, M., Berr, F., Lafrenz, M., Porst, H. & Oesen, U. (2000). Low
frequency of cirrhosis in a hepatitis C (genotype 1B) single-source
outbreak in Germany : a 20-year multicenter study. Hepatology 32, 91–96.
Wildy, P., Field, H. J. & Nash, A. A. (1982). Classical herpes latency
revisited. In Virus Persistence, pp. 133–167. Edited by B. W. J. Mahy, A.
C. Minson & G. K. Darby. Cambridge : Cambridge University Press.
Xiang, J., Wunschmann, S., Schmidt, W., Shao, J. & Stapleton, J. T.
(2000). Full-length GB virus C (hepatitis G virus) RNA transcripts are
(1994). Hepatitis C virus genotype 4 is highly prevalent in Central
Africa (Gabon). Journal of General Virology 75, 2393–2398.
Zanotto, P. M. D., Gould, E. A., Gao, G. F., Harvey, P. H. & Holmes,
E. C. (1996). Population dynamics of flaviviruses revealed by molecular
phylogenies. Proceedings of the National Academy of Sciences, USA 93,
548–553.
Zhang, G., Haydon, D. T., Knowles, N. J. & McCauley, J. W. (1999).
Molecular evolution of swine vesicular disease virus. Journal of General
Virology 80, 639–651.
Zhu, T. F., Korber, B. T., Nahmias, A. J., Hooper, E., Sharp, P. M. & Ho,
D. D. (1998). An African HIV-1 sequence from 1959 and implications
for the origin of the epidemic. Nature 391, 594–597.
infectious in primary CD4-positive T Cells. Journal of Virology 74,
9125–9133.
Zuckerman, A. J., Thornton, A., Howard, C. R., Tsiquaye, K. N., Jones,
D. M. & Brambell, M. R. (1978). Hepatitis B outbreak among chim-
Xu, L. Z., Larzul, D., Delaporte, E., Brechot, C. & Kremsdorf, D.
panzees at the London Zoo. Lancet ii, 652–654.
HBC
Downloaded from www.microbiologyresearch.org by
IP: 88.99.165.207
On: Sun, 18 Jun 2017 03:28:33