
Review
Diversity of Polycomb group
complexes in plants: same rules,
different players?
Lars Hennig and Maria Derkacheva
Department of Biology and Zurich-Basel Plant Science Center, ETH Zurich, CH-8092 Zurich, Switzerland
Polycomb Group (PcG) proteins form an epigenetic
memory system that is conserved in plants and animals
and controls gene expression during development. Loss
of plant PcG proteins leads to loss of organ identity and
to cell overproliferation. Our understanding of plant PcG
protein function has recently been advanced by the
identification of additional proteins required for transcriptional repression by PcG and by the purification
of an Arabidopsis PcG protein complex. These data
indicate that Polycomb Repressive Complex 2 (PRC2)-like complexes in animals and plants have to associate
with Plant Homeo Domain (PHD)-finger proteins for
efficient deposition of histone H3 trimethylated at lysine
27 (H3K27me3) and transcriptional repression. Subsequently, H3K27me3 at target genes helps to recruit
additional PcG protein complexes – PRC1 in animals and
potentially LIKE HETEROCHROMATIN PROTEIN-1 (LHP1)
and the RING finger gene product AtRING1 in plants. A
picture is emerging in which the general mechanisms of
PcG protein function are well conserved between
animals and plants, but in which individual players have
been exchanged during evolution.
PcG proteins comprise an evolutionarily conserved
transcriptional memory system
Polycomb group (PcG) proteins were first identified in
Drosophila melanogaster as regulators of homeotic genes
during development, but now it is clear that PcG proteins
are well conserved among metazoans and plants (for
review see Refs. [1,2]). PcG proteins dictate the transcriptional status of target genes and therefore control the
choice between alternative development programs. PcG
proteins function by forming large protein complexes. In
Drosophila, for example, three different PcG protein complexes have been characterized in detail: the Polycomb
Repressive Complex 1 (PRC1), the Polycomb Repressive
Complex 2 (PRC2), and Pcl-PRC2 (for review see Ref. [3]).
PRC2 comprises four proteins, including the histone
methyltransferase Enhancer of Zeste (E(z)). It is likely that
PRC2 functions in vivo primarily as a mono- and dimethyltransferase
acting on lysine 27 of histone H3 (H3K27),
whereas Pcl-PRC2, which contains additional subunits
such as Polycomb-like (Pcl), functions as an H3K27 trimethyltransferase. The PRC1 complex also modifies
histones – its RING subunit catalyzes histone H2A lysine
119 ubiquitylation (H2AK119u). Therefore, the chromatin
of PcG target genes in flies usually carries both H3K27me3
and H2AK119u. Although several plant PRC2-like complexes that control development have been known for
many years, plant Pcl-PRC2- and PRC1-like complexes
were identified only recently.
Plant PcG proteins are essential for the maintenance of
cell fates and normal development. Here we will review the
functions of plant PcG proteins in development with a
strong focus on the regulation of flowering time and seed
development, two traits that are not only of academic but
also of great economic interest. We will discuss the current
knowledge about PcG protein complexes in plants, and how
the identification of a PHD-PRC2 complex, and of candidates for PRC1-like complexes, will advance our understanding of the mechanisms of PcG protein function in
plants.
And then there were many – genes for PcG proteins in
plants
PcG proteins were first identified in plants in 1997, and it
now seems that all extant multicellular plants have a
functional PcG system (Figure 1) [4,5]. A particularly
impressive example of functional conservation of the
PcG system over more than 450 million years of evolution
is illustrated by the finding that the Arabidopsis and moss
Glossary
Apomixis: non-sexual seed formation. Apomixis is a very interesting
agronomic trait because it offers a way to stabilize high levels of heterozygosity, a prerequisite for hybrid vigor. Whereas Mendelian inheritance of a
heterozygous locus (Aa) leads to offspring with all possible allele combinations
(AA, Aa and aa), apomictic inheritance leads to offspring with the mother’s
genotype (Aa) only.
Gametophyte: the haploid stage of a plant’s life cycle that produces gametes
by mitosis. In mosses, the gametophytic phase dominates, whereas it is greatly
reduced (usually to 3 to 7 cells) in seed plants.
Maternal effect: the phenotype of the progeny is determined by the genotype
of the mother. A gametophytic maternal effect is an effect where the genotype
of the female (maternal) gametophyte determines the phenotype of the
progeny. Examples of molecular mechanisms that lead to gametophytic
maternal effects are (i) the gene is active only after fertilization, but only the
maternal allele is expressed (genomic imprinting); (ii) the gene is active before
fertilization in the female gametophyte, but the gene product functions only
after fertilization (maternal carry over); (iii) the gene is active and the gene
product functions before fertilization in the female gametophyte, but the effect
on phenotype becomes visible only after fertilization (epigenetic gametophytic
effect).
Sporophyte: the diploid stage of a plant’s life cycle that produces spores by
meiosis. In mosses, the sporophytic phase is greatly reduced, whereas it
dominates in seed plants.
Corresponding author: Hennig, L. ([email protected]).
0168-9525/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2009.07.002
Review
Trends in Genetics
Vol.25 No.9
Figure 1. PcG proteins in plants. Arabidopsis PcG proteins have homologs in distant plant species, demonstrating that PcG proteins are highly conserved in evolution.
Numbers on the symbolic phylogenetic tree on the left note divergence time (in millions of years) of major plant lineages. The number of each type of protein in various
plant species is shown. The top and bottom rows for each class list representative members from Arabidopsis and Drosophila, respectively. LHP1 and VEL1 are considered
to be functional analogs but not homologs of Drosophila Pc and Pcl, respectively. The contribution to PcG complexes is indicated at the lower part of the figure. Data are
from Phytozome 4.0 (Joint Genome Institute, http://www.phytozome.net) and from Refs. [4,5,18,87,90].
FERTILIZATION INDEPENDENT ENDOSPERM (FIE)
PcG proteins can partially rescue each other’s mutant
phenotypes [4]. It is possible that genes for PcG proteins
were present in the last common ancestor of plants and
animals and were subsequently lost in unicellular
lineages. Such gene loss could explain the presence of only
a FIE but no CURLY LEAF (CLF) or EMBRYONIC
FLOWER 2 (EMF2) homologs in the green alga Chlamydomonas reinhardtii. In addition to the proteins that are
conserved between animals and plants, the plant-specific
VEL proteins and LHP1 (Box 1) are also required for the
function of the plant PcG system. There is a tendency for
the complexity of genes encoding PcG proteins to increase
during evolution. For example, the moss and fern genomes
mostly have single copies of the genes encoding PcG
proteins, whereas seed plants usually have small gene
families with up to five members.
The model plant Arabidopsis thaliana has 12 homologs
of Drosophila PRC2 subunits: the three E(z) homologs
CLF, MEDEA (MEA) and SWINGER (SWN) [6–10]; the
three Suppressor of zeste (Su(z)12) homologs EMF2, FERTILISATION INDEPENDENT SEED2 (FIS2) and VERNALIZATION2 (VRN2) [10–12]; the single Extra sex
combs (Esc) homolog FERTILIZATION INDEPENDENT
ENDOSPERM (FIE) [13]; and the five p55 homologs MSI1–5
[14–18] (Table 1). Genetic and molecular evidence
suggests that these proteins form at least three PRC2-like
complexes: the EMBRYONIC FLOWER (EMF), VERNALIZATION (VRN) and FERTILISATION INDEPENDENT
SEED (FIS) complexes (Figure 2) [7,19–26]. The existence
of the FIS and VRN complexes has been confirmed biochemically [19,22,23]. Although these complexes have
clearly distinct functions, they do share some target genes.
For instance, the FIS complex, which contains MEA/SWN,
FIS2, FIE and MSI1, silences target genes during gametogenesis and early seed development, whereas the EMF
complex, which probably contains CLF/SWN, EMF2, FIE
and MSI1, silences some of the same target genes during
subsequent sporophytic development [27]. Similarly,
FLOWERING LOCUS C (FLC) is silenced by both the
VRN and the EMF complexes [11,28]. It will be important
to determine how much the target gene sets of different
Box 1. Biochemical functions of PcG proteins in plants
CLF, SWN and MEA are homologs of Drosophila E(z) and all contain
a conserved SET domain (Table 1). Many SET domain proteins have
histone lysine methyltransferase activity, and E(z) as well as its
mammalian orthologs catalyze trimethylation of H3K27 (for review
see Ref. [3]). Although lysine methyltransferase activity has not yet
directly been measured for plant E(z) homologs, genetic evidence
strongly suggests that CLF, MEA and probably SWN catalyze
trimethylation of H3K27. First, plant PcG protein target genes carry
H3K27me3 that requires the presence of PRC2-like complexes
[33,37,55,56,91]. Second, CLF as well as MEA both require an intact
SET domain to function [27,91].
LHP1 belongs to the large family of chromodomain (CD) proteins,
which also includes HP1 and Pc [79]. In vitro, LHP1 and the LHP1 CD
bind with similar affinity to H3K27me3, H3K9me2/3 and
H3K9me3K27me3 peptides [47,48]. Because lhp1 mutant plants
with a mutation in the CD that abolishes binding to methylated
histones lose silencing of PcG target genes, and have the same
phenotype as lhp1 null alleles, the binding of LHP1 to methylated H3
tails appears to be essential for its function [87]. Whereas the LHP1
chromodomain mediates binding to methylated histone tails, the
carboxy-terminal LHP1 chromo-shadow domain is required for
binding to the MADS domain transcription factor SVP that appears
to repress target genes such as FT, at least in part, by recruiting
LHP1 [51]. Similarly, LHP1 interacts with the transcription factor
SCARECROW (SCR) and is recruited to SCR target genes [50]. These
data indicate that LHP1 can be recruited to target genes independently of H3K27me3 by direct interaction with transcription factors.
Table 1. Arabidopsis PcG and associated proteins
Arabidopsis protein   Arabidopsis PcG-complex   Protein domains      Drosophila homologs
CLF                   EMF, VRN                  SET                  E(z)
SWN                   EMF, VRN, FIS             SET                  E(z)
MEA                   FIS                       SET                  E(z)
EMF2                  EMF                       Zn finger            Su(z)12
VRN2                  VRN                       Zn finger            Su(z)12
FIS2                  FIS                       Zn finger            Su(z)12
MSI1                  EMF, VRN, FIS             WD40                 Nurf55
FIE                   EMF, VRN, FIS             WD40                 Esc
VRN5 (VIL1)           VRN                       PHD finger, FN III   -
VIN3                  VRN                       PHD finger, FN III   -
VEL1                  VRN                       PHD finger, FN III   -
AtRING1a              LHP1-RING                 RING                 SCE (RING)
AtRING1b              LHP1-RING                 RING                 SCE (RING)
LHP1                  LHP1-RING                 Chromodomain         -
EMF1                  ?                         -                    -
VRN1                  ?                         B3                   -
PRC2-like complexes overlap. Do these complexes generally regulate common genes, or is this the exception?
Genome-wide binding studies will give a first answer;
however, a thorough understanding will require the elucidation of the mechanisms of PcG protein targeting in
plants.
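As an illustration only, the subunit compositions listed above for the EMF, VRN and FIS complexes can be compared with a few set operations. The sketch below is not part of the original review: the function names are hypothetical, the alternative E(z) homologs (CLF/SWN and MEA/SWN) are entered as separate members, and the EMF composition is the one the text describes as probable rather than biochemically confirmed.

```python
# Subunit compositions as described in the text. "CLF/SWN" and "MEA/SWN"
# denote alternative E(z) homologs, so both alternatives are listed
# (an assumption made for this sketch).
COMPLEXES = {
    "EMF": {"CLF", "SWN", "EMF2", "FIE", "MSI1"},  # probable composition
    "VRN": {"CLF", "SWN", "VRN2", "FIE", "MSI1"},
    "FIS": {"MEA", "SWN", "FIS2", "FIE", "MSI1"},
}


def shared_subunits(complexes):
    """Return the subunits that occur in every complex."""
    return set.intersection(*complexes.values())


def specific_subunits(complexes):
    """Map each complex to the subunits that occur in no other complex."""
    result = {}
    for name, members in complexes.items():
        others = set().union(
            *(m for other, m in complexes.items() if other != name)
        )
        result[name] = members - others
    return result


if __name__ == "__main__":
    print("shared:", sorted(shared_subunits(COMPLEXES)))
    for name, subunits in specific_subunits(COMPLEXES).items():
        print(name, "specific:", sorted(subunits))
```

Under these assumptions, FIE, MSI1 and SWN come out as the shared members, whereas the Su(z)12 homolog (EMF2, VRN2 or FIS2) is what distinguishes each complex. The same comparison could be applied to target-gene sets once genome-wide binding data are available.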
In Drosophila, PRC1 and Pcl-PRC2 are recruited to
target genes by cis-acting Polycomb response elements
(PRE) (for review see Refs. [1,29]). Binding of PRC1 to
H3K27me3 via its Polycomb (Pc) subunit may help to
recruit PRC1 to targets, but H3K27me3 is not absolutely
required for PRC1 binding to target genes (for review see
Ref. [3]). In contrast to Pcl-PRC2, which localizes mainly to
PREs, the core PRC2 might bind to chromosomes with
much less specificity (for review see Ref. [3]). To date,
functional PREs have been identified and studied only
in Drosophila, and the identification of PREs in plants
will be a major contribution. It seems likely that plant
PREs will soon become available because the promoter
and/or intragenic regions of the three plant PcG target
genes FLC, AGAMOUS (AG) and PHERES1 (PHE1) were
able to direct appropriately regulated reporter gene
expression [30–33].
Genomic data have revealed that the PcG system is
ancient and that plants contain many PcG proteins which
can form diverse complexes. Even moss has functional PcG
proteins, and in seed plants a considerable diversification
of genes encoding PcG proteins has occurred. The question
then arises: what are the functions of the PcG complexes in
plants?
A matter of identity
One central function of PcG proteins in plants and animals
is to maintain cell identity, and mutations in these genes
often cause homeotic transformations of organs. In
plants, for example, clf mutants misexpress homeotic genes,
leading to alterations in flower organ identity; sepals
can adopt a carpel-like fate and even bear
ovules [6]. However, plant PcG proteins not only repress
the genetic programs establishing specific cellular identity
but also maintain cells in a differentiated state of specific
identity. In FIE knock-down plants and clf;swn double
mutants, de-differentiated cells accumulate in disorganized, callus-like structures [7,34]. Similarly, loss of moss
FIE causes meristems to overproliferate and prevents
differentiation of leafy gametophytes [4].
Table 1 (continued)
Arabidopsis protein   Mouse homologs     Refs.
CLF                   EZH1, EZH2         [6]
SWN                   EZH1, EZH2         [7]
MEA                   EZH1, EZH2         [8,9]
EMF2                  SUZ12              [12]
VRN2                  SUZ12              [11]
FIS2                  SUZ12              [10,89]
MSI1                  RbAp48, RbAp46     [14]
FIE                   EED                [13]
VRN5 (VIL1)           -                  [32,38]
VIN3                  -                  [56]
VEL1                  -                  [32,38]
AtRING1a              RING1A, RING1B     [49,85]
AtRING1b              RING1A, RING1B     [49,85]
LHP1                  -                  [47,48,79]
EMF1                  -                  [83]
VRN1                  -                  [82]
Altogether, it seems that plant PcG proteins are
required to maintain organ identity and to keep cells in
an appropriately differentiated state. This is in contrast to
mammals, where over-expression and not lack of PcG
proteins leads to loss of cell differentiation and to overproliferation, as has been seen in cancer (for review see
Refs. [2,35]). This might imply that mechanisms controlling cell-differentiation evolved independently, making
different use of the PcG system in animals and plants.
How do PcG proteins influence flowering?
The switch to reproductive development (i.e. flowering)
largely determines the reproductive success of plants
and is controlled by multiple signaling pathways that
integrate environmental and developmental signals (for
review see Ref. [36]). It has been known for several years
that PcG proteins participate in some signaling pathways
to control flowering time, but it is only now becoming clear
that PcG proteins control most major regulators of flowering time. In Arabidopsis, at least two PRC2-like complexes
are part of this signaling network. First, the EMF complex
suppresses precocious flowering and promotes vegetative
Figure 2. PRC2-like complexes act at different stages of the Arabidopsis life cycle. The EMF complex (i.e. CLF/SWN, EMF2, FIE and MSI1) promotes vegetative development
of the plant, and delays reproduction, but also maintains cells in a differentiated state [7,12,28,34]. The VRN complex (i.e. CLF/SWN, VRN2, FIE and MSI1) establishes
epigenetic silencing of FLC after vernalization and enables flowering [22,23]. The FIS complex (i.e. MEA, SWN, FIS2, FIE and MSI1) prevents seed development in the
absence of fertilization and is required for normal seed development [8,9,13,19,24,25,89].
development by repressing the transcription of flowering
activators such as FLOWERING LOCUS T (FT), the main
flowering time integrator, and AGAMOUS-LIKE 19
(AGL19) [12,28,37].
A second PRC2-like complex controlling flowering time
is the VRN complex that enables Arabidopsis to flower
after vernalization, which in Arabidopsis causes the epigenetic repression of the flowering repressor FLOWERING
LOCUS C (FLC) (Box 2). The VRN complex is composed of
VRN2, CLF/SWN, FIE and MSI1 [22,23] (Figure 2). VRN2
was found to be required for FLC repression after vernalization [11], and it seemed likely that cold exposure would
Box 2. Vernalization – reproduction enabled by cold
Vernalization is the promotion of the competence to flower by long
periods of low temperatures such as those typically experienced
during winters (for review see Ref. [92]). In temperate climate zones,
spring is usually the most suitable season for the onset of
reproduction, and many plants from these zones match flowering to
the spring season by requiring vernalization and long photoperiods.
The photoperiod and vernalization pathways converge to regulate
flowering-time integrator genes. Arabidopsis has both winter-annual accessions, which require vernalization to flower, and summer-annual
accessions, which do not require vernalization, and there is much
natural variation in the extent of vernalization requirement [93]. The
major determinants of vernalization requirement in Arabidopsis are
the MADS box gene FLOWERING LOCUS C (FLC) and the FLC
activator FRIGIDA (FRI). FLC delays flowering, in part by repressing
transcription of the potent flowering activator FLOWERING LOCUS T (FT)
that is induced by the photoperiod pathway and is part of the mobile
flowering signal termed Florigen. Ecotypes with weak or inactive
alleles of FRI or FLC flower without vernalization, but ecotypes with
active FRI and FLC flower only once vernalization has repressed FLC.
Cold-induced FLC repression is maintained even after plants are
returned to ambient temperatures, allowing activation of FT and other
floral activator genes. It is not known how cold is sensed for
vernalization, but gradual induction of VIN3 expression is the first
effect of prolonged cold known so far. VIN3, as part of the PHD-VRN
complex, allows H3K27me3 to accumulate at FLC. The stability of
epigenetic silencing of FLC varies between accessions and explains
part of the variation in vernalization found in Arabidopsis natural
accessions adapted to different climates [94].
Although FLC is the major gene for vernalization in Arabidopsis,
other genes act in parallel to FLC in the vernalization response [92].
VIN3 is required not only for FLC-dependent but also for FLC-independent branches of the vernalization pathway [56], suggesting
that PcG-mediated gene silencing might be involved in all branches of
the Arabidopsis vernalization response.
Figure 3. VRN2 is constitutively associated with all investigated regions of the FLC locus independently of vernalization [23]. Prolonged exposure to cold, usually below
8 °C, causes VRN5 to associate with a specific region near the 5′ end of intron 1. At this stage VRN5, together with VIN3, VEL1 and the VRN PRC2-like complex, forms the Pcl-PRC2-like PHD-VRN complex that locally marks FLC chromatin with H3K27me3 (symbolized by the stop signs). After growth at ambient temperature for 7 days, the PHD-VRN complex and H3K27me3 spread over the entire FLC locus [23,39].
target the VRN complex to FLC. Surprisingly, however, it
was found that association of the core VRN complex with
the FLC locus, and increased expression of FLC in vrn2
mutants, were both independent of cold exposure [11,23].
In addition, the EMF complex represses FLC under warm
conditions [28].
A major breakthrough in the field was the discovery that
the VRN complex associates with VERNALIZATION
INSENSITIVE 3 (VIN3), VRN5, and VEL1 to form a
PHD-PRC2 complex during prolonged cold [22,23].
VIN3, VRN5 and VEL1, together with VEL2 and VEL3,
form a small Arabidopsis protein family that is characterized by a PHD finger and a fibronectin III domain [32,38].
The PHD-VRN complex increases H3K27me3 levels in
FLC chromatin, leading to stable silencing. Another key
finding was the observation that VRN5 and H3K27me3 are
initially restricted to a small region from the transcriptional start to the beginning of the first intron, and only
spread across the entire FLC locus after return to warm
conditions [23,39] (Figure 3). However, neither VRN5 nor
H3K27me3 enters the bodies of the neighbouring genes.
Interestingly, this spreading (as well as continuous stable
silencing of FLC) takes place only in mitotically active
cells, and not in non-dividing cells [39]. Mammalian
PRC2 binds to H3K27me3, and it has been suggested that
H3K27me3 recruits PRC2 to maintain the mark at sites of
DNA replication [40]. It remains to be established whether
spreading of H3K27me3 on FLC during cell proliferation
relies on the same mechanisms. It has long been known
that vernalization requires mitotic cell division [41], and
the new results suggest that establishing stability in PcG-mediated epigenetic silencing of target genes such as FLC
is the underlying molecular mechanism for this physiological process.
The increase in H3K27me3 at the FLC locus brought
about by the PHD-VRN complex is reminiscent of the
situation in animals, where Pcl-PRC2-like complexes, which
also contain PHD-finger proteins, introduce high levels of
H3K27me3 into target gene chromatin (for review see Ref.
[3]). In vitro studies have shown that the mammalian PHD
finger protein PHF1 specifically promotes the ability of
Polycomb Group protein EZH2 to catalyze tri-methylation
(and not mono- or di-methylation) of H3K27. It will be
important to see whether VEL proteins also associate with
the FIS and EMF complexes. An association of VEL
proteins with the EMF complex is possible because vrn5
mutants have FLC-independent phenotypes that resemble
weak alleles of EMF complex mutants. These phenotypes
include mild leaf curling, an increase in mean petal number, and early flowering in short-day conditions [32].
Together, the new results of de Lucia and colleagues [23]
suggest that, in both animals and plants, association of
PRC2-like complexes with PHD-finger proteins creates the
active protein machinery for efficient H3K27me3 deposition and transcriptional repression. Notably, no sequence
homology exists outside the PHD domain between the
plant and animal PHD-finger proteins that associate with
PRC2-like complexes, suggesting that stimulation of PRC2
HMTase activity by PHD-finger proteins independently
evolved at least twice.
Silence – just for now or forever?
Initially, it was thought that PcG protein-mediated
silencing, once established, was generally irreversible.
However, FT and AGL19 are both temporarily repressed by
PcG proteins in young Arabidopsis plants, but escape from
silencing and lose H3K27me3 in response to day-length
and vernalization signals, respectively [28,37]. Therefore,
we now know that PcG protein-mediated silencing is often
only transient and can be reversed. At least three mechanisms could lead to H3K27me3 clearance from target
gene chromatin. Modified H3 could be passively diluted
during mitosis, or it could be actively exchanged for unmodified H3. There are several replacement histone genes in
Arabidopsis [42], and replication-independent removal of
H3 has been observed in Arabidopsis [43]. Alternatively,
modified H3 could be actively demethylated. Active
demethylation of H3K27me3 is found in animals (for
review see Ref. [44]), and plant genomes encode homologs
of the metazoan JmjC-domain histone demethylases [45].
However, there are no obvious Arabidopsis homologs of the
KDM6/JMJD3 H3K27me3 demethylases, and it will be
important to identify plant H3K27me3 demethylases.
The extent to which the potential mechanisms contribute
to reactivation of PcG protein target genes in plants is
currently unclear. Although the mechanisms of H3K27me3
clearance in plants remain elusive, recent results from
many laboratories have demonstrated that PcG-mediated
silencing is not a one-way road but instead affords a
dynamic means of modulating gene expression during development [12,28,37,46–52].
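The first of these mechanisms, passive dilution, lends itself to a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each round of DNA replication halves the fraction of H3K27me3-marked H3 at a locus and that no new mark is deposited. Neither assumption is taken from the experimental work reviewed here, and the function names are hypothetical.

```python
def fraction_remaining(divisions: int) -> float:
    """Fraction of marked H3 left after a number of cell divisions.

    Assumes (for illustration only) that every replication halves the
    marked fraction and that the mark is never re-deposited.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return 0.5 ** divisions


def divisions_to_reach(threshold: float) -> int:
    """Smallest number of divisions after which the remaining fraction
    is at or below the given threshold (0 < threshold <= 1)."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    n = 0
    while fraction_remaining(n) > threshold:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(5):
        print(n, fraction_remaining(n))
    print("divisions to fall to 10% or less:", divisions_to_reach(0.1))
```

Under this simple model, three divisions leave one eighth of the original mark, and four divisions are needed to fall to 10% or less. Active histone exchange or demethylation would remove the mark faster than this, whereas re-deposition by PRC2-like complexes would slow or prevent its loss.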
In contrast to most other studied plant PcG protein
target genes, which are reactivated at certain stages of
the plant life cycle, FLC remains inactive once it is silenced
by vernalization. Stable resetting of FLC expression occurs
only in heart to torpedo stage embryos [53,54]. Interestingly, in addition to the typical PcG protein chromatin
mark H3K27me3, H3K9me2 was also found at FLC after
vernalization; both marks required VRN2 and the PHD-VRN complex [55,56]. In Arabidopsis, H3K9me2 is mostly
associated with heterochromatic regions [57], and PcG
protein target genes generally do not carry this mark
[47,58,59]. It is therefore possible that FLC acquired stable
heterochromatin-like long-term silencing as a consequence
of potentially reversible PcG protein-mediated repression.
However, loss of the H3K9 dimethyltransferase KRYPTONITE does not cause increased expression of FLC or late
flowering [60]. Clearly, more work is required to clarify the
relationship between heterochromatic H3K9me2 gene
silencing and PcG protein-mediated repression, but the
results on FLC demonstrate that PcG protein-mediated
silencing and heterochromatic H3K9me2 can sometimes
function together.
PcG proteins and sex
The third PRC2-like complex in plants – the FIS complex –
prevents the initiation of endosperm and seed development
in the absence of fertilization (for review see Ref. [61])
(Figure 2). The FIS complex consists of four main subunits:
MEA/SWN, FIE, FIS2 and MSI1 [19,24,25,62]. The FIS
complex functions in the female gametophyte before fertilization, and because this function is needed to control
gametophyte and seed development after fertilization, fis
mutants are true epigenetic female gametophytic maternal-effect mutants [63]. Initiation of fertilization-independent
seed development, such as that observed in fis mutants, is a
hallmark of apomictic development. However, PcG proteins
might not generally be involved in apomixis, because knockdown of FIE in sexually reproducing Hieracium piloselloides
does not lead to autonomous seed development; on the
contrary, knockdown of FIE in apomictic Hieracium leads
to inhibition of autonomous embryo and endosperm formation [64]. Similarly, loss of genes for three PRC2 subunits
did not lead to autonomous seed development in rice [65].
Thus, PRC2-like complexes are required for seed development in diverse species and do not simply act to prevent
apomixis as initially thought.
The function of the FIS complex after fertilization in the
endosperm is less well understood. One direct FIS complex
target gene in the endosperm is PHERES1 (PHE1) that
encodes a MADS domain transcription factor [33]. PHE1 is
imprinted (Box 3): the paternal allele is expressed, whereas the maternal allele is
repressed by the FIS complex [27,66]. Interestingly,
FIS2 and MEA are themselves imprinted with only the
maternal alleles being expressed in the endosperm [20,67–
69]. MEA imprinting is controlled by an autoregulation
process through H3K27 trimethylation at the promoter
and at the 3′ coding region of the gene [70–72]. By contrast,
FIS2 imprinting is independent of PcG proteins but
depends on differential DNA methylation [69]. The Arabidopsis FIS-complex subunit genes FIE and MSI1 are not
imprinted, but one of the three genes encoding E(z) homologs in maize, and one of the two FIE homologs in maize
and rice, are expressed only from the maternal alleles in
the endosperm [65,73,74]. It is therefore possible that
imprinting of genes for PRC2 subunits is widespread
among seed plants.
Box 3. Genomic imprinting
Genomic imprinting is an epigenetic phenomenon leading to allelespecific gene expression depending on the parent of origin, and has
evolved independently in mammals and flowering plants (for review
see Ref. [95]). Parent-of-origin specific expression is generally
established through epigenetic modifications, usually involving
DNA and/or histone methylation. Several theories have attempted
to rationalize the emergence of imprinting. The prevailing
theory proposes that the evolution of imprinting was driven by the
conflict of interest between two parents about how many resources
mothers should invest into the offspring [96]. According to this
theory, genes with expression from the paternal allele promote
embryo growth while genes expressed from the maternal allele
restrict embryo growth. Consistent with this theory, many imprinted
genes act in specialized tissue for the transfer of nutrients from the
mother to the embryo, the mammalian placenta and the plant
endosperm. Because not all known imprinted genes control offspring growth, it is likely that other evolutionary forces have
contributed to the origin of genomic imprinting.
Figure 4. A model of PcG protein-mediated gene silencing in Arabidopsis. The figure depicts the active (On) and the silenced state (Off) of the target gene. Inactive PRC2-like
complexes (light violet) could be present already at the active gene. During silencing Pcl-PRC2- and/or activated PRC2-like complexes (violet) introduce H3K27me3 into
target chromatin. Subsequently, LHP1 as a part of a PRC1-like complex, which contains also AtRING1a and AtRING1b, is considered to bind to this mark, establishing stable
gene silencing. Note that AtRING1a and AtRING1b can also directly interact with PRC2 via CLF.
The hunt for PRC1-like functions in plants
In contrast to PRC2-like complexes, which have been known
in plants for several years, the existence of PRC1-like
complexes in plants has remained enigmatic, primarily
because plant genomes do not encode bona fide Pc homologs. Recent evidence suggests that the plant chromodomain protein LHP1, also known as TERMINAL FLOWER
2, is a major candidate for a Pc-like function in Arabidopsis.
It was initially thought that LHP1 functioned in plants, as
does HP1 in animals, to silence heterochromatic loci. However, LHP1 (unlike HP1) was usually found in euchromatin
and was found to be needed for the silencing of euchromatic
genes, including many PcG protein targets, but not for the
silencing of genes in heterochromatin [75,76]. LHP1 has a
chromodomain that binds H3K27me3 in vitro, and its
genome-wide localization overlaps strongly with regions
rich in H3K27me3 [47,48]. Thus, LHP1 might fulfil in
plants the role that Pc fulfils in animals, namely to bind
to the PRC2-dependent H3K27me3 mark. However, LHP1
seems not to be required for establishment of chromosome-wide H3K27me3 profiles in Arabidopsis [47], in contrast to animal Pc which may be involved in spreading of
H3K27me3 across target loci [1].
The lhp1 mutant has a pleiotropic phenotype, including
changes in flowering time, plant architecture, leaf
morphology, inflorescence determinacy and hormone
levels [46,77–80]. Important LHP1 target genes that are responsible for the mutant phenotype, and that are also targets of
PRC2-like complexes, include FT, FLC and AGAMOUS
(AG) [60,80,81]. Nevertheless, lhp1 null mutants have a
much milder phenotype than strong alleles of the genes
encoding PRC2 subunits, indicating that other proteins act
redundantly with LHP1 in plants to fulfil the role of Pc.
Two proteins are known to act downstream of PRC2-like
complexes in plants, VRN1 and EMF1, and these have
been proposed to be involved in PRC1-like functions
[52,60]. VRN1 is a plant-specific DNA-binding protein that
contains two B3 domains [82] and is needed for stable
repression of FLC; however, VRN1 is not needed for
H3K27me3 accumulation at FLC chromatin, suggesting
that it acts downstream of the VRN PRC2-like complex
[60]. The plant-specific DNA-binding protein EMF1, which
is needed for stable repression of AG, binds to silent AG,
and this binding depends on the EMF PRC2-like complex
[52,83]. In contrast to VRN1, EMF1 is required for normal
H3K27me3 levels at target genes and might thus affect
PRC2 function. Although EMF1 is needed for silencing of
AG, EMF1 is not needed for silencing of other EMF complex targets including FT [52,84]. The available data
suggest that VRN1 and EMF1 are required to lock specific
PcG target genes in a silent state; however, neither protein
seems essential for the silencing of all PcG target genes.
Thus, VRN1 and EMF1 are plant-specific proteins that
contribute to the stabilization of PcG protein-mediated
silencing.
RING1 is another core component of animal PRC1.
RING1 functions as an E3 ubiquitin ligase that monoubiquitylates lysine 119 of histone H2A via its RING
domain. Although the precise role of this ubiquitylation
is unknown, it is required for normal PcG-dependent silencing [3]. Two Arabidopsis RING1 homologs were recently
identified [49,85]. AtRING1a and AtRING1b contain both
the RING domain and the Ubiquitin-like RAWUL domain,
the key characteristics of RING1 in animal PRC1 [85].
These two RING1 homologs, as well as the EMF PRC2-like
complex, are required to silence the class I KNOTTED1-like homeobox (KNOX) genes that control meristem identity [21,49]. Because AtRING1a and AtRING1b can bind to
LHP1 [49], it seems likely that these three proteins form a
PRC1-like complex that binds to KNOX genes once a
PRC2-like complex has established the H3K27me3 mark
(Figure 4). It is not yet clear whether this process is
involved in the repression of other plant PcG protein
targets.
Although these results suggest the existence of a PRC1-like complex in plants, this complex might function differently from animal PRC1 because H2AK119 ubiquitylation
has not so far been detected in Arabidopsis [86]. In mammals and insects, H3K27 trimethylation by Pcl-PRC2, and
H2AK119 mono-ubiquitylation by PRC1, are both essential for stable repression. PRC1 PcG proteins can induce
nucleosome compaction in vitro, but in vivo transcription
initiation and/or elongation are blocked rather than access
and binding of the basal transcriptional machinery (for
review see Ref. [3]). It is not known whether LHP1 or
AtRING1a/b can repress transcription in plants by a
similar mechanism. For EMF1, however, interference with
transcription has been observed in vitro [52].
The view is now emerging that, in addition to PRC2-like
complexes, several other proteins contribute to PcG-mediated silencing in plants. Recent work on LHP1,
AtRING1a and AtRING1b indicates that these proteins
might form a complex similar in domain composition and
function to animal PRC1 [47–49,60,75,76,81,87]. One next
step in the field will be to establish how LHP1, AtRING1a
and AtRING1b repress transcription and whether they
function together with the plant-specific EMF1 and
VRN1 proteins.
To spread or not to spread?
Genome-wide profiling techniques have revealed that 3–
4% of mammalian genes (i.e. 500–700 genes), and 1.3% of
the fly genes (i.e. 200 genes), are directly regulated by PcG
proteins (for review see Ref. [29]). In flies, PcG binding was
usually found in small peaks of 1 kb that often overlapped with known or presumptive PREs, whereas
H3K27me3 covered chromosomal domains of tens or even
Trends in Genetics
Vol.25 No.9
hundreds of kilobases with depletion of H3K27me3 at the
PREs. It was initially suggested that H3K27me3 in mammals does not form the extended domains found in flies, but
instead forms small focal peaks at silent gene promoters
(for review see Ref. [29]). More recently, however, it was
reported that in mouse embryonic fibroblasts H3K27me3
covered about 11% of the analyzed chromosome 17, forming large domains of average size 43 kb but with many
domains larger than 100 kb [88]. Differences in the
domains detected can be caused by differences in profiling
technology and algorithms used for data analysis (for
review see Ref. [29]).
Several differences are known in the genome-wide
distributions of H3K27me3 in Arabidopsis [47,58], mammals and flies. First, H3K27me3 decorated 4400 genes
(15% of all genes) in Arabidopsis, considerably more than
in animals. It should be noted that the animal experiments
were usually performed with single cell types, while the
plant experiments were performed with tissues composed
of diverse cell types. Because most Arabidopsis genes
marked by H3K27me3 have tissue-specific patterns of
expression [58], H3K27me3 profiles are often cell-type
specific. Thus, the higher number of target genes found
in Arabidopsis might not necessarily reflect a more widespread role of PcG protein-based regulation of gene expression in plants. Second, plant H3K27me3 domains were
largely restricted to the transcribed regions of single genes
and did not extend over the large genomic regions found in
flies or recently in mammals. Third, plant H3K27me3
domains were not significantly associated with low-nucleosome density regions. The establishment, spreading, and
functional roles of H3K27me3 may thus differ between
plants and animals. This idea is supported by the existence
of plant-specific members of PcG protein complexes such as
LHP1 and VEL proteins. Despite these differences, profiling studies in plants, insects and mammals have revealed
several similarities. First, H3K27me3 was usually
enriched in gene-dense genome regions. Second,
H3K27me3 usually did not overlap significantly with heterochromatic H3K9me2 marks in Arabidopsis or
H3K9me3 marks in mouse or fly. Third, there was usually
a strong negative correlation between H3K27me3 and
transcription. These findings are consistent with the idea
that PcG protein-dependent H3K27me3 constitutes a
mark for transcriptional silencing predominantly of
euchromatic gene-dense regions distinct from the heterochromatic gene-poor regions that are rich in H3K9me2 or
H3K9me3.
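The target-gene counts and percentages quoted in this section can be cross-checked with simple arithmetic. The following is only an illustrative sketch: the function name `implied_total` is hypothetical, and pairing the lower mammalian count (500) with 3% and the upper count (700) with 4% is an assumption made here, not something stated in the text.

```python
# Counts and fractions as quoted in this section (not new data).
# The pairing of the mammalian range endpoints is an assumption.
REPORTED = {
    "Arabidopsis (H3K27me3-marked genes)": (4400, 0.15),
    "fly (PcG targets)": (200, 0.013),
    "mammal, lower bound (PcG targets)": (500, 0.03),
    "mammal, upper bound (PcG targets)": (700, 0.04),
}


def implied_total(count: int, fraction: float) -> float:
    """Return the total number of genes implied by a count and its fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return count / fraction


for label, (count, fraction) in REPORTED.items():
    total = implied_total(count, fraction)
    print(f"{label}: {count} genes at {fraction:.1%} implies ~{total:,.0f} genes in total")
```

The implied totals differ between organisms, which is one reason the text compares percentages rather than raw gene counts.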
Concluding remarks
The PcG system is an ancient and conserved molecular
machinery that represses expression of developmental
regulators in plants and animals. Seed plants contain
many genes encoding PcG proteins and have diverse
PcG complexes. Plant PcG proteins, like their animal
counterparts, maintain organ identity. By contrast, cell
overproliferation is caused by loss of PcG proteins in
plants, whereas it is caused by over-expression of PcG
proteins in mammals. New data suggest that PRC2-like
complexes must associate with PHD-finger proteins for
efficient H3K27me3 deposition and transcriptional repression. In animals, H3K27me3 at target genes helps to
recruit PRC1, whereas in plants the same mark might
recruit a PRC1-like LHP1-AtRING1 complex. Thus, the
general mechanisms of PcG protein function are well conserved, but individual players have been substituted
during evolution. We can expect to see plant PcG protein
complex composition firmly established in the near future.
A challenging task will be to identify the mechanism(s) of
PcG protein recruitment to target loci. Finally, we need to
uncover the mechanism(s) of transcriptional repression by
PcG proteins in order to fully appreciate why evolution has
placed this protein machinery at such prominent control
positions in both plant and animal development. Undoubtedly there are exciting times to come!
Acknowledgments
We thank Claudia Köhler and Cristina Madeira Alexandre for helpful
comments on the manuscript. Research in the authors’ laboratory is
supported by SNF grant 3100AO-116060 and by ETH project ETH-08 08-2.
References
1 Schwartz, Y.B. and Pirrotta, V. (2007) Polycomb silencing mechanisms
and the management of genomic programmes. Nat. Rev. Genet. 8, 9–22
2 Köhler, C. and Villar, C.B. (2008) Programming of gene expression by
Polycomb group proteins. Trends Cell Biol. 18, 236–243
3 Müller, J. and Verrijzer, P. (2009) Biochemical mechanisms of gene
regulation by Polycomb group protein complexes. Curr. Opin. Genet.
Dev. 19, 150–158
4 Mosquna, A. et al. (2009) Regulation of stem cell maintenance by the
Polycomb protein FIE has been conserved during land plant evolution.
Development 136, 2433–2444
5 Chen, L-J. et al. (2009) Molecular evolution of VEF-domain-containing
PcG genes in plants. Mol. Plant; early access, DOI: 10.1093/mp/ssp03
6 Goodrich, J. et al. (1997) A Polycomb-group gene regulates homeotic
gene expression in Arabidopsis. Nature 386, 44–51
7 Chanvivattana, Y. et al. (2004) Interaction of Polycomb-group proteins
controlling flowering in Arabidopsis. Development 131, 5263–5276
8 Grossniklaus, U. et al. (1998) Maternal control of embryogenesis by
MEDEA, a Polycomb group gene in Arabidopsis. Science 280, 446–450
9 Kiyosue, T. et al. (1999) Control of fertilization-independent endosperm
development by the MEDEA Polycomb gene in Arabidopsis. Proc. Natl.
Acad. Sci. U. S. A. 96, 4186–4191
10 Luo, M. et al. (1999) Genes controlling fertilization-independent seed
development in Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A. 96,
296–301
11 Gendall, A.R. et al. (2001) The VERNALIZATION 2 gene mediates the
epigenetic regulation of vernalization in Arabidopsis. Cell 107, 525–535
12 Yoshida, N. et al. (2001) EMBRYONIC FLOWER2, a novel Polycomb
group protein homolog, mediates shoot development and flowering in
Arabidopsis. Plant Cell 13, 2471–2481
13 Ohad, N. et al. (1999) Mutations in FIE, a WD Polycomb group gene,
allow endosperm development without fertilization. Plant Cell 11, 407–
416
14 Ach, R.A. et al. (1997) A conserved family of WD-40 proteins binds to
the Retinoblastoma protein in both plants and animals. Plant Cell 9,
1595–1606
15 Hennig, L. et al. (2003) Arabidopsis MSI1 is required for epigenetic
maintenance of reproductive development. Development 130, 2555–
2565
16 Ausin, I. et al. (2004) Regulation of flowering time by FVE, a
Retinoblastoma-associated protein. Nat. Genet. 36, 162–166
17 Kim, H.J. et al. (2004) A genetic link between cold responses and
flowering time through FVE in Arabidopsis thaliana. Nat. Genet. 36,
167–171
18 Hennig, L. et al. (2005) MSI1-like proteins: an escort service for
chromatin assembly and remodeling complexes. Trends Cell Biol.
15, 295–302
19 Köhler, C. et al. (2003) Arabidopsis MSI1 is a component of the MEA/
FIE Polycomb group complex and required for seed development.
EMBO J. 22, 4804–4814
20 Luo, M. et al. (2000) Expression and parent-of-origin effects for FIS2,
MEA, and FIE in the endosperm and embryo of developing Arabidopsis
seeds. Proc. Natl. Acad. Sci. U. S. A. 97, 10637–10642
21 Katz, A. et al. (2004) FIE and CURLY LEAF Polycomb proteins interact
in the regulation of homeobox gene expression during sporophyte
development. Plant J. 37, 707–719
22 Wood, C.C. et al. (2006) The Arabidopsis thaliana vernalization
response requires a Polycomb-like protein complex that also
includes VERNALIZATION INSENSITIVE 3. Proc. Natl. Acad. Sci.
U. S. A. 103, 14631–14636
23 De Lucia, F. et al. (2008) A PHD-Polycomb Repressive Complex 2
triggers the epigenetic silencing of FLC during vernalization. Proc.
Natl. Acad. Sci. U. S. A. 105, 16831–16836
24 Spillane, C. et al. (2000) Interaction of the Arabidopsis Polycomb group
proteins FIE and MEA mediates their common phenotypes. Curr. Biol.
10, 1535–1538
25 Yadegari, R. et al. (2000) Mutations in the FIE and MEA genes that
encode interacting Polycomb proteins cause parent-of-origin effects on
seed development by distinct mechanisms. Plant Cell 12, 2367–2382
26 Wang, D. et al. (2006) Partially redundant functions of two SET-domain polycomb-group proteins in controlling initiation of seed
development in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 103,
13244–13249
27 Makarevich, G. et al. (2006) Different Polycomb group complexes
regulate common target genes in Arabidopsis. EMBO Rep. 7, 947–952
28 Jiang, D. et al. (2008) Repression of FLOWERING LOCUS C and
FLOWERING LOCUS T by the Arabidopsis Polycomb Repressive
Complex 2 components. PLoS ONE 3, e3404
29 Ringrose, L. (2007) Polycomb comes of age: genome-wide profiling of
target sites. Curr. Opin. Cell Biol. 19, 290–297
30 Sieburth, L.E. and Meyerowitz, E.M. (1997) Molecular dissection of the
AGAMOUS control region shows that cis elements for spatial
regulation are located intragenically. Plant Cell 9, 355–365
31 Finnegan, E.J. et al. (2004) A cluster of Arabidopsis genes with a
coordinate response to an environmental stimulus. Curr. Biol. 14,
911–916
32 Greb, T. et al. (2007) The PHD finger protein VRN5 functions in the
epigenetic silencing of Arabidopsis FLC. Curr. Biol. 17, 73–78
33 Köhler, C. et al. (2003) The Polycomb-group protein MEDEA regulates
seed development by controlling expression of the MADS-box gene
PHERES1. Genes Dev. 17, 1540–1553
34 Kinoshita, T. et al. (2001) Polycomb repression of flowering during early
plant development. Proc. Natl. Acad. Sci. U. S. A. 98, 14156–14161
35 Ballestar, E. and Esteller, M. (2008) Epigenetic gene regulation in
cancer. Adv Genet 61, 247–267
36 Michaels, S.D. (2008) Flowering time regulation produces much fruit.
Curr. Opin. Plant Biol. 12, 75–80
37 Schönrock, N. et al. (2006) Polycomb-group proteins repress the floral
activator AGL19 in the FLC-independent vernalization pathway.
Genes Dev. 20, 1667–1678
38 Sung, S. et al. (2006) A PHD finger protein involved in both the
vernalization and photoperiod pathways in Arabidopsis. Genes Dev.
20, 3244–3248
39 Finnegan, E.J. and Dennis, E.S. (2007) Vernalization-induced
trimethylation of histone H3 Lysine 27 at FLC is not maintained in
mitotically quiescent cells. Curr. Biol. 17, 1978–1983
40 Hansen, K.H. et al. (2008) A model for transmission of the H3K27me3
epigenetic mark. Nat. Cell. Biol. 10, 1291–1300
41 Wellensiek, S.J. (1962) Dividing cells as the locus for vernalization.
Nature 195, 307–308
42 Okada, T. et al. (2005) Analysis of the histone H3 gene family in
Arabidopsis and identification of the male-gamete-specific variant
AtMGH3. Plant J. 44, 557–568
43 Ingouff, M. et al. (2007) Distinct dynamics of HISTONE3 variants
between the two fertilization products in plants. Curr. Biol. 17, 1032–
1037
44 Swigut, T. and Wysocka, J. (2007) H3K27 Demethylases, at long last.
Cell 131, 29–32
45 Lu, F. et al. (2008) Comparative analysis of JmjC domain-containing
proteins reveals the potential histone demethylases in Arabidopsis and
rice. J. Integr. Plant Biol. 50, 886–896
46 Takada, S. and Goto, K. (2003) TERMINAL FLOWER 2, an
Arabidopsis homolog of HETEROCHROMATIN PROTEIN 1,
counteracts the activation of FLOWERING LOCUS T by CONSTANS
in the vascular tissues of leaves to regulate flowering time. Plant Cell
15, 2856–2865
47 Turck, F. et al. (2007) Arabidopsis TFL2/LHP1 specifically associates
with genes marked by trimethylation of Histone H3 Lysine 27. PLoS
Genet. 3, 855–866
48 Zhang, X. et al. (2007) The Arabidopsis LHP1 protein colocalizes with
histone H3 Lys27 trimethylation. Nat. Struct. Mol. Biol. 14, 869–871
49 Xu, L. and Shen, W.H. (2008) Polycomb silencing of KNOX genes
confines shoot stem cell niches in Arabidopsis. Curr. Biol. 18, 1966–
1971
50 Cui, H. and Benfey, P.N. (2009) Interplay between SCARECROW, GA
and LIKE HETEROCHROMATIN PROTEIN 1 in ground tissue
patterning in the Arabidopsis root. Plant J. 58, 1016–1027
51 Liu, C. et al. (2009) Regulation of floral patterning by flowering time
genes. Dev. Cell 16, 711–722
52 Calonje, M. et al. (2008) EMBRYONIC FLOWER1 participates in
Polycomb Group-mediated AG gene silencing in Arabidopsis. Plant
Cell 20, 277–291
53 Sheldon, C.C. et al. (2008) Resetting of FLOWERING LOCUS C
expression after epigenetic repression by vernalization. Proc. Natl.
Acad. Sci. U. S. A. 105, 2214–2219
54 Choi, J. et al. (2009) Resetting and regulation of FLOWERING LOCUS
C expression during Arabidopsis reproductive development. Plant J.
57, 918–931
55 Bastow, R. et al. (2004) Vernalization requires epigenetic silencing of
FLC by histone methylation. Nature 427, 164–167
56 Sung, S. and Amasino, R.M. (2004) Vernalization in Arabidopsis
thaliana is mediated by the PHD finger protein VIN3. Nature 427,
159–164
57 Fuchs, J. et al. (2006) Chromosomal histone modification patterns –
from conservation to diversity. Trends Plant Sci. 11, 199–208
58 Zhang, X. et al. (2007) Whole-genome analysis of Histone H3 Lysine 27
trimethylation in Arabidopsis. PLoS Biol. 5, 1026–1035
59 Bernatavichute, Y.V. et al. (2008) Genome-wide association of histone
H3 Lysine nine methylation with CHG DNA methylation in
Arabidopsis thaliana. PLoS ONE 3 (9), e3156
60 Mylne, J.S. et al. (2006) LHP1, the Arabidopsis homolog of
HETEROCHROMATIN PROTEIN1, is required for epigenetic
silencing of FLC. Proc. Natl. Acad. Sci. U. S. A. 103, 5012–5017
61 Köhler, C. and Makarevich, G. (2006) Epigenetic mechanisms
governing seed development in plants. EMBO Rep. 7, 1223–1227
62 Wang, D. et al. (2006) Partially redundant functions of two SET-domain Polycomb-group proteins in controlling initiation of seed
development in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 103,
13244–13249
63 Leroy, O. et al. (2007) Polycomb group proteins function in the female
gametophyte to determine seed development in plants. Development
134, 3639–3648
64 Rodrigues, J.C. et al. (2008) Sexual and apomictic seed formation in
Hieracium requires the plant Polycomb-group gene FERTILIZATION
INDEPENDENT ENDOSPERM. Plant Cell 20, 2372–2386
65 Luo, M. et al. (2009) Expression, imprinting, and evolution of rice
homologs of the Polycomb Group genes. Mol. Plant 2, 711–723
66 Köhler, C. et al. (2005) The Arabidopsis thaliana MEDEA Polycomb
group protein controls expression of PHERES1 by parental imprinting.
Nat. Genet. 37, 28–30
67 Kinoshita, T. et al. (1999) Imprinting of the MEDEA Polycomb gene in
the Arabidopsis endosperm. Plant Cell 11, 1945–1952
68 Vielle-Calzada, J.P. et al. (1999) Maintenance of genomic imprinting at
the Arabidopsis MEDEA locus requires zygotic DDM1 activity. Genes
Dev. 13, 2971–2982
69 Jullien, P.E. et al. (2006) Maintenance of DNA methylation during the
Arabidopsis life cycle is essential for parental imprinting. Plant Cell 18,
1360–1372
70 Baroux, C. et al. (2006) Dynamic regulatory interactions of Polycomb
group genes: MEDEA autoregulation is required for imprinted gene
expression in Arabidopsis. Genes. Dev. 20, 1081–1086
71 Gehring, M. et al. (2006) DEMETER DNA glycosylase establishes
MEDEA Polycomb gene self-imprinting by allele-specific
demethylation. Cell 124, 495–506
72 Jullien, P.E. et al. (2006) Polycomb group complexes self-regulate
imprinting of the Polycomb group gene MEDEA in Arabidopsis.
Curr. Biol. 16, 486–492
73 Hermon, P. et al. (2007) Activation of the imprinted Polycomb Group
FIE1 gene in maize endosperm requires demethylation of the maternal
allele. Plant Mol. Biol. 64, 387–395
74 Haun, W.J. et al. (2007) Genomic imprinting, methylation and
molecular evolution of maize Enhancer of zeste (Mez) homologs.
Plant J. 49, 325–337
75 Libault, M. et al. (2005) The Arabidopsis LHP1 protein is a component
of euchromatin. Planta 222, 910–925
76 Nakahigashi, K. et al. (2005) The Arabidopsis HETROCHROMATIN
PROTEIN 1 homolog (TERMINAL FLOWER2) silences genes within
the euchromatic region but not genes positioned in heterochromatin.
Plant Cell Physiol. 46, 1747–1756
77 Larsson, A.S. et al. (1998) The TERMINAL FLOWER 2 (TFL2) gene
controls the reproductive transition and meristem identity in
Arabidopsis thaliana. Genetics 149, 597–605
78 Ludwig-Muller, J. et al. (2000) A glucosinolate mutant of Arabidopsis is
thermosensitive and defective in cytosolic Hsp90 expression after heat
stress. Plant Physiol. 123, 949–958
79 Gaudin, V. et al. (2001) Mutations in LIKE HETEROCHROMATIN
PROTEIN 1 affect flowering time and plant architecture in
Arabidopsis. Development 128, 4847–4858
80 Kotake, T. et al. (2003) Arabidopsis TERMINAL FLOWER 2 gene
encodes a LIKE HETEROCHROMATIN PROTEIN 1 homolog and
represses both FLOWERING LOCUS T to regulate flowering time
and several floral homeotic genes. Plant Cell Physiol. 44, 555–564
81 Sung, S. et al. (2006) Epigenetic maintenance of the vernalized state in
Arabidopsis thaliana requires LIKE HETEROCHROMATIN
PROTEIN 1. Nat. Genet. 38, 706–710
82 Levy, Y.Y. et al. (2002) Multiple roles of Arabidopsis VRN1 in
vernalization and flowering time control. Science 297, 243–246
83 Aubert, D. et al. (2001) EMF1, a novel protein involved in the control of
shoot architecture and flowering in Arabidopsis. Plant Cell 13, 1865–
1875
84 Moon, Y.H. et al. (2003) EMF genes maintain vegetative development
by repressing the flower program in Arabidopsis. Plant Cell 15, 681–
693
85 Sanchez-Pulido, L. et al. (2008) RAWUL: A new Ubiquitin-like domain
in PRC1 Ring finger proteins that unveils putative plant and worm
PRC1 orthologs. BMC Genomics 9, 308
86 de Napoles, M. et al. (2004) Polycomb group proteins Ring1A/B link
ubiquitylation of histone H2A to heritable gene silencing and X
inactivation. Dev. Cell 7, 663–676
87 Exner, V. et al. (2009) The chromodomain of LIKE
HETEROCHROMATIN PROTEIN 1 is essential for H3K27me3
binding and function during Arabidopsis development. PLoS ONE 4
88 Pauler, F.M. et al. (2009) H3K27me3 forms BLOCs over silent genes
and intergenic regions and specifies a histone banding pattern on a
mouse autosomal chromosome. Genome Res. 19, 221–233
89 Chaudhury, A.M. et al. (1997) Fertilization-independent seed
development in Arabidopsis thaliana. Proc. Natl. Acad. Sci. U. S. A.
94, 4223–4228
90 Romanel, E.A. et al. (2009) Evolution of the B3 DNA binding
superfamily: new insights into REM family gene diversification.
PLoS One 4 (6), e5791
91 Schubert, D. et al. (2006) Silencing by plant Polycomb-group genes
requires dispersed trimethylation of histone H3 at lysine 27. EMBO J.
25, 4638–4649
92 Alexandre, C.M. and Hennig, L. (2008) FLC or not FLC: the other side
of vernalization. J. Exp. Bot. 59, 1127–1135
93 Lempe, J. et al. (2005) Diversity of flowering responses in wild
Arabidopsis thaliana strains. PLoS Genet. 1, 109–118
94 Shindo, C. et al. (2006) Variation in the epigenetic silencing of FLC
contributes to natural variation in Arabidopsis vernalization response.
Genes Dev. 20, 3079–3083
95 Feil, R. and Berger, F. (2007) Convergent evolution of genomic
imprinting in plants and mammals. Trends Genet. 23, 192–199
96 Haig, D. and Westoby, M. (1989) Parent specific gene expression and
the triploid endosperm. Am. Naturalist 134, 147–155
Epigenetic Reprogramming in Plant and Animal Development
Suhua Feng et al.
Science 330, 622 (2010);
DOI: 10.1126/science.1190614
REVIEW
Epigenetic Reprogramming in Plant
and Animal Development
Suhua Feng,1 Steven E. Jacobsen,1* Wolf Reik2*
Epigenetic modifications of the genome are generally stable in somatic cells of multicellular
organisms. In germ cells and early embryos, however, epigenetic reprogramming occurs on a
genome-wide scale, which includes demethylation of DNA and remodeling of histones and
their modifications. The mechanisms of genome-wide erasure of DNA methylation, which
involve modifications to 5-methylcytosine and DNA repair, are being unraveled. Epigenetic
reprogramming has important roles in imprinting, the natural as well as experimental
acquisition of totipotency and pluripotency, control of transposons, and epigenetic inheritance
across generations. Small RNAs and the inheritance of histone marks may also contribute to
epigenetic inheritance and reprogramming. Reprogramming occurs in flowering plants and in
mammals, and the similarities and differences illuminate developmental and reproductive
strategies.
1Howard Hughes Medical Institute and Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA 90095, USA. 2Laboratory of Developmental Genetics and Imprinting, Babraham Institute, Cambridge CB22 3AT, UK, and Centre for Trophoblast Research, University of Cambridge, Cambridge CB2 3EG, UK.
*To whom correspondence should be addressed. E-mail: [email protected] (S.E.J.); [email protected] (W.R.)
Epigenetic marks are enzyme-mediated chemical modifications of DNA and of its associated chromatin proteins. Although
they do not alter the primary sequence of DNA,
they also contain heritable information and play
key roles in regulating genome function. Such
modifications, including cytosine methylation,
posttranslational modifications of histone tails and
the histone core, and the positioning of nucleosomes
(histone octamers wrapped with DNA), influence the transcriptional state and other functional
aspects of chromatin. For example, methylation of
DNA and certain residues on the histone H3 N-terminal tail [e.g., H3 lysine 9 (H3K9)] are important for transcriptional gene silencing and the
formation of heterochromatin. Such marks are essential for the silencing of nongenic sequences,
including transposons, pseudogenes, repetitive
sequences, and integrated viruses, that could become deleterious to cells if expressed and hence
activated. Epigenetic gene silencing is also important in developmental phenomena such as
imprinting in both plants and mammals, as well as
in cell differentiation and reprogramming.
DNA methylation occurs in three different nucleotide sequence contexts: CG, CHG, and CHH
(where H = C, T, or A). In both mammals and
plants, CG methylation is maintained by the maintenance DNA methyltransferase, termed DNMT1
[DNA (cytosine-5)-methyltransferase 1] in mammals and MET1 (DNA METHYLTRANSFERASE 1) in Arabidopsis, and by a cofactor that
recognizes hemimethylated DNA at replication
foci, called UHRF1 (ubiquitin-like containing
PHD and RING finger domains 1) in mammals and VIM (VARIATION IN METHYLATION) family proteins in Arabidopsis (1). In
addition, the mammalian de novo DNA methyltransferases DNMT3A and DNMT3B are required
for the maintenance of CG methylation at some
loci (2). CHG methylation is common in Arabidopsis and other plant genomes and is maintained by a feedforward loop that is formed by
a plant-specific DNA methyltransferase, CMT3
(CHROMOMETHYLASE 3), and a histone methyltransferase, KYP (KRYPTONITE) (1, 3, 4).
CHH methylation is also abundant in plants and
is maintained by the RNA-directed DNA methylation (RdDM) pathway, which actively targets
the DNA methyltransferase DRM2 (DOMAINS
29 OCTOBER 2010
VOL 330
SCIENCE
REARRANGED METHYLTRANSFERASE 2;
a homolog of DNMT3) to DNA by means of 24-nucleotide (nt) small interfering RNAs (siRNAs)
bound by AGO4 (ARGONAUTE 4) (1) (Fig. 1).
CHG and CHH methylation are also present at
detectable levels in mammals, especially in stem
cells, and this methylation is likely introduced by
DNMT3A and DNMT3B (5, 6). De novo methylation of DNA in all of these sequence contexts
is generally established by the DNMT3 (mammals)
and DRM2 (Arabidopsis) methyltransferases.
Mammals do not have an Arabidopsis-like RNA-directed DNA methylation pathway, but in germ
cells, PIWI-associated RNAs (piRNAs) are thought
to guide DNMT3 activity (7). Mammals have a
noncatalytic paralog of de novo methyltransferase,
DNMT3L, which interacts with DNMT3A and
unmethylated H3K4 (as do DNMT3A and
DNMT3B) (8–10); these findings imply a targeting mechanism of these methyltransferases to
chromatin. Unmethylated CpG islands are specifically bound by CFP1 (CXXC finger protein–1),
which in turn recruits the histone H3K4 methyltransferase SETD1 (SET domain containing–1)
(11); this suggests that H3K4 methylation, and
therefore exclusion of DNMT3 from CpG islands, could help to explain how promoters remain unmethylated. Consistent with this idea,
demethylation of H3K4 has been shown to be
important for the acquisition of DNA methylation
in imprinted genes in oocytes (12). Additionally,
transcription can also help to establish de novo
DNA methylation at imprinted regions (13). Earlier this year, it was shown that the nucleosome
landscape also influences the methylation patterning in both plant and animal genomes (14).
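The three sequence contexts defined above (CG, CHG, and CHH, where H = C, T, or A) can be assigned mechanically from the two bases that follow a cytosine. The sketch below is only an illustration of that definition: the function name `cytosine_context` is hypothetical, it reads a single 5'-to-3' strand (cytosines on the opposite strand would need the reverse complement), and it returns "unknown" when the downstream bases are missing or ambiguous.

```python
def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at 0-based index i of a 5'->3' sequence.

    Returns "CG", "CHG", "CHH", or "unknown" (too close to the end of the
    sequence, or a non-ACGT base downstream). H stands for A, C, or T.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError(f"position {i} is {seq[i]!r}, not a cytosine")
    downstream = seq[i + 1 : i + 3]  # up to two bases 3' of the cytosine
    if downstream[:1] == "G":
        return "CG"  # the second downstream base is irrelevant here
    if len(downstream) < 2 or any(b not in "ACGT" for b in downstream):
        return "unknown"
    # The first downstream base is now known to be H (A, C, or T).
    return "CHG" if downstream[1] == "G" else "CHH"


# Example: report the context of every cytosine in a short made-up sequence.
example = "ACGTCAGCTTCCG"
for pos, base in enumerate(example):
    if base == "C":
        print(pos, cytosine_context(example, pos))
```

Checking for CG first matters, because a cytosine followed by G belongs to the CG context regardless of the next base.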
Some histone modifications are also thought
to be actively maintained during DNA replication, in part facilitated by the association of
the histone modification enzymes with the DNA
replication machinery. For example, the mammalian histone H3K9 methyltransferases G9A
and SETDB1 (SET domain bifurcated–1), the
mammalian H4K20 methyltransferase SETD8
(SET domain containing–8), and the plant histone H3K27 monomethyltransferases ATXR5
(ARABIDOPSIS TRITHORAX-RELATED
PROTEIN 5) and ATXR6 interact with the replication protein PCNA (proliferating cell nuclear
antigen) (15, 16). However, histone methylation can also be very dynamic and is controlled
DNA Methylation Throughout
the Arabidopsis Life Cycle
To maintain genome integrity from generation
to generation, transposons and repetitive DNA
elements must be kept under tight regulation in
reproductive cells. One of the ways that plants
achieve this is through the stable inheritance of
DNA methylation. Plants frequently show meiotic
inheritance of gene silencing (1). Furthermore,
plants are not known to undergo genome-wide
waves of demethylation in germ cells, as occurs
in animals. However, large-scale reprogramming
occurs in non–germ line reproductive cells, and
this reprogramming may function to reinforce
silencing of transposable elements in germ cells
(see below).
One way to actively reprogram the epigenome is to remove methylated cytosines. The
Arabidopsis genome encodes four bifunctional helix-hairpin-helix DNA glycosylases and AP
lyases—ROS1 (REPRESSOR OF SILENCING
1), DME (DEMETER), DML2 (DEMETER-LIKE 2), and DML3 (DEMETER-LIKE 3)—
which recognize and remove methylated cytosines, resulting in a 1-nt gap in the DNA double
strand. Subsequently, as yet unidentified DNA
repair polymerase and DNA ligase enzymes are
thought to fill in the gap with an unmethylated
cytosine (1, 17). ROS1, DML2, and DML3 mainly function in vegetative tissues, and genomic
studies suggest that they demethylate hundreds
of specific loci across the genome with a bias
toward genes (1, 18). Knocking out all three
genes does not markedly affect the overall levels
or patterns of methylation in the Arabidopsis
genome (18, 19). Instead, these enzymes appear
to be acting as a counterbalance to the RNA-directed DNA methylation system to quantitatively fine-tune methylation levels at particular
genomic locations.
By contrast, DME functions to cause global
hypomethylation in the endosperm (the extraembryonic tissue of flowering plants) of Arabidopsis (20, 21), and thus contributes to large-scale
epigenetic reprogramming (Fig. 1). In Arabidopsis,
female gametogenesis begins when a somatically derived megaspore mother cell undergoes meiosis to give rise to a haploid megaspore, which
subsequently develops into a mature female gametophyte (embryo sac) that contains one egg cell,
one central cell (two nuclei), and several other
accessory cells. During double fertilization (which
is common in plants), the egg cell fuses with a
sperm cell from the male gametophyte (pollen
grain) to form an embryo, and the central cell
fuses with the other sperm cell from pollen to
form the triploid endosperm, which nourishes
the embryo, and thus bears a function similar to
that of the placenta in mammals. DME is expressed primarily in the central cell before fertilization, and thus only the maternal genome is
demethylated by DME. This leads to maternal
allele–specific gene expression (imprinting) in
the endosperm (22). Until recently, only six imprinted Arabidopsis genes were known, but recent genomic studies of endosperm have revealed
genome-wide differences in DNA methylation,
including a substantial reduction of CG methylation; hence, many additional genes are likely to
be imprinted in Arabidopsis, some of which have
been verified by single-gene studies (20, 21) (Fig.
1). Demethylation by DME may also reactivate
transposon expression, which shunts transposon
transcripts into the RNAi pathway, producing
additional siRNAs that can guide DNA methylation to non-CG sites whose methylation is high
in wild-type endosperm but decreased in dme
mutant endosperm (Fig. 1). Curiously, there are
even higher levels of non-CG methylation in the
wild-type embryo that could be explained by
movement of siRNAs produced in the central
cell into the egg cell; this attractive idea awaits
experimental support (20). Because the endosperm genome does not contribute to the next
generation, mild reactivation of transposons in
endosperm may not be deleterious and may reinforce the silencing of transposons in the egg
cell and later embryo, contributing to the genome integrity of offspring. Indeed, there is a
class of RNA polymerase IV (Pol IV)–dependent
siRNAs that only accumulates in flowers and
young siliques, likely originating from the endosperm (23). Notably, these siRNAs are derived
from maternal alleles only, which suggests that
they may be produced in part during female
gametogenesis and then retained after karyogamy.
However, these siRNAs are expressed more
highly after fertilization, and therefore imprinted
maternal expression of siRNA loci may also
occur as the endosperm develops (23). It is
tempting to speculate that the maternal Pol IV–dependent
siRNAs are the “messenger” that mediates communication between endosperm and
embryo (Fig. 1); however, these siRNAs were
detected only in the endosperm, not in the embryo (23). Nonetheless, the possibility that they
exist in low abundance in the embryo, or are
ephemeral, cannot be ruled out.
The idea that siRNAs move from the endosperm to the embryo is consistent with the model
put forth in an earlier study on paternal genome
reprogramming in Arabidopsis (24). The male
gametophyte of Arabidopsis (a pollen grain) contains two sperm cells, which fertilize the egg
cell and central cell, respectively, and a vegetative
nucleus (Fig. 1). Transposon expression is generally up-regulated in pollen, and certain transposons even become mobile in pollen, unlike
the situation in most other tissues (24). Reduction
of transposon methylation followed by transposon reactivation appears to occur in the vegetative
nucleus; this is supported by the finding that transposon reactivation and movement are not inherited by the next generation (24). It has been shown
that several key RdDM pathway proteins (RDR2
and DCL3) and CHG methylation maintenance
pathway proteins (CMT3 and KYP) have reduced
expression levels in pollen; in addition, DDM1
(DECREASE IN DNA METHYLATION 1), an
important chromatin remodeler required for DNA
and histone methylation and transposon silencing, is exclusively localized in sperm cells but not
in the vegetative nucleus (24). These results suggest a model in which hypomethylation of the
vegetative cell may reactivate transposons that
could serve to reinforce transposon silencing in
the adjacent sperm cells (Fig. 1) (24).
Small RNAs may be involved in this communication between the vegetative cell and the sperm
cells. A class of siRNAs that is 21 nt in length and
corresponds to Athila retrotransposons, the largest
transposon family, is detected in sperm cells.
Because Athila retrotransposons remain silenced
in sperm cells but are activated in the vegetative
nucleus, it is possible that the 21-nt siRNAs are
produced in the vegetative nucleus and then travel
to their site of action—sperm cells—where they
mediate the silencing of transposons through an
unknown mechanism (Fig. 1) (24). A common
theme is that both male and female gametophytes
contain nurse cells in which massive epigenetic
reprogramming may serve to reinforce transposon
silencing in the germ line (Fig. 1).
Another example of small RNAs silencing
transposons at a distance occurs when the megaspore mother cell differentiates from somatic
tissues (25). Mutations in AGO9 (ARGONAUTE
9), a member of the Arabidopsis Argonaute family of proteins, result in the reactivation of transposons in the ovule (including the egg cell) (Fig. 1).
Remarkably, AGO9 is not expressed in the reproductive cells themselves (megaspore mother
cell, megaspore, or developing female gametophyte), but is expressed in the companion cells
surrounding the female gametophyte. The transposon targets of AGO9 are similar to those reactivated in pollen, and evidence suggests that
AGO9-mediated transposon silencing uses components of known silencing pathways, including
the 24-nt RNA-directed DNA methylation pathway (25). Whether the AGO9-associated 24-nt
siRNAs are the mobile signal remains to be tested.
by the combined action of both histone methyltransferases and histone demethylases, as well as
by the proteins that read these histone marks (16).
Through the developmental regulation of these
epigenetic mechanisms, both plants and animals undergo epigenetic reprogramming in various cell types and developmental stages, which
serves either to transmit epigenetic information
between cells or between sexual generations, or
to reset epigenetic marks to reduce the risk of
perpetuating dangerous epigenetic alleles.
Resetting of Histone Modifications
in Arabidopsis
In addition to DNA methylation, plants also reprogram histones and their associated marks;
[Fig. 1 diagram. Panels: somatic cell (KYP, CMT3, MET1, VIMs, DRM2, AGO4, ATXR5, ATXR6; marks H3K9me2, H3K27me1, CG, CHG, CHH; maintenance and de novo methylation, re-replication of TEs); female gametophyte (central cell, egg cell, companion cell; DME, RdDM, AGO9, 24-nt siRNAs); male gametophyte (vegetative nucleus, sperm cells; DDM1, 21-nt siRNAs); seed after double fertilization (endosperm, embryo; maternal and paternal TEs, Pol IV-dependent 24-nt siRNAs). Proposed siRNA movements are marked "Transport?".]
Fig. 1. Model of epigenetic silencing dynamics during the Arabidopsis life
cycle. In somatic cells, three different mechanisms are responsible for repressing transcription from transposable elements (TEs): DNA methylation (in all
three sequence contexts), histone H3K9 dimethylation (H3K9me2), and histone H3K27 monomethylation (H3K27me1). Methyltransferases and proteins
regulating these epigenetic marks are shown. See text for details. In the
female gametophyte, the central cell is demethylated by DME, which leads to
TE activation and up-regulation of RdDM. The siRNAs produced from TEs not
only direct non-CG methylation in the central cell, but also might travel to the
egg cell and enhance the silencing of TEs there. In addition, AGO9-associated
siRNAs produced in somatic companion cells also contribute to the silencing
of TEs in the egg cell. In the male gametophyte, the vegetative nucleus does
not express DDM1 and has reduced RdDM, which leads to TE activation and
mobilization. A new class of 21-nt siRNAs, produced from TEs in the vegetative nucleus, travels to sperm cells to reinforce TE silencing. After double
fertilization, maternal TEs in the endosperm stay activated and produce Pol
IV–dependent siRNAs, which could function to silence the paternal TEs in the
endosperm. The methylation levels in the embryo are elevated, possibly as a
result of the siRNA signals transmitted from the endosperm. Different shadings
indicate the level of DNA methylation (high, black; medium, gray; low, white).
evolved to suppress excessive replication and to
ensure genome stability (Fig. 1). If true, this
would suggest that histone marks not only get
reprogrammed but also reprogram the genome,
in the case of H3K27 monomethylation, by
keeping the replication of transposons in check.
This presumably is important for actively cycling
plant cells, for reproductive cells undergoing
meiosis, and perhaps for early stages of embryo
development.
Mechanisms of Epigenetic Reprogramming
in Mammalian Development
Genome-wide epigenetic reprogramming occurs in mammalian development at two distinct stages: in primordial germ cells (PGCs)
primarily once they have reached the embryonic
gonads (embryonic day E10.5 to E13.5), and in
the early embryo beginning in the zygote immediately after fertilization and extending to the
morula stage of preimplantation development
(Fig. 2) (28–30). This reprogramming entails
erasure of DNA methylation and loss of histone
modifications (as well as loss of histones and histone variants); here we focus on demethylation
of DNA. The loss of DNA methylation by E13.5
(the developmental endpoint of reprogramming)
is truly global; in mouse female PGCs, only 7%
of CpGs remain methylated [versus 70 to 80% in
embryonic stem (ES) cells and somatic cells],
and most promoters and genic, intergenic, and
transposon sequences are hypomethylated at this
stage (31). The only clear exception to global
erasure is intracisternal A particles (IAPs), an
active family of retrotransposons that have only
recently been acquired in the rodent lineage,
which only show partial demethylation in PGCs
(31). Promoters of germ cell–specific genes (such
as Dazl or Vasa) are methylated in early PGCs
and become demethylated and expressed during reprogramming (32). Imprinted genes have
allele-specific methylation in early PGCs and
the Xist promoter is methylated, and this methylation is all erased in PGCs by E13.5 (Fig. 2)
(33, 34). Although most of the genome-wide demethylation appears to occur in E11.5 to E13.5
PGCs, it remains possible that some loci become demethylated at slightly earlier stages (35);
hence, demethylation is not necessarily coordinated timewise throughout the genome. Nothing
is currently known about the possible occurrence
or erasure of non-CG methylation in PGCs.
DNA deaminases and the base excision repair
pathway have recently been implicated in erasure, which suggests that active demethylation
is involved at least in part (31, 36). The cytosine
deaminases AID and APOBEC1 are capable in
vitro of deaminating 5-methylcytosine (5mC) as
well as cytosine and are expressed, albeit at a low
level, in PGCs (36, 37). Notably, AID deficiency
in PGCs results in a deficit in demethylation of
20% of all CpGs if it is assumed that early PGCs
have methylation levels similar to those of ES
cells or somatic cells (31). Because of this partial effect of AID deficiency on erasure, the
potential redundancy with other DNA deaminases needs to be examined. The 5mC hydroxylases TET1 and TET2 are also expressed in
PGCs (36), suggesting the possibility that 5mC
could be modified by different mechanisms
(deamination, hydroxylation) in order to initiate active demethylation. It is also possible
that a combination of passive (resulting in hemimethylated substrates in G2 phase of the cell
cycle) and active demethylation could be involved. Finally, it is possible that the genome-wide nature of the demethylation process and
its relatively coordinate timing require different mechanisms and different modifications
of 5mC to join forces in order to achieve such
large-scale reprogramming.
Initial modification of 5mC would require
further modification or DNA repair in order to
achieve demethylation. DNA repair pathways
that might be involved in resolving mismatches
or in excising 5-hydroxymethylcytosine (5hmC)
are nucleotide excision repair, mismatch repair,
and especially base excision repair (BER), which
is also involved in demethylation during reprogramming in plants (1). BER components such as
PARP1, APE1, and XRCC1 are all up-regulated
at E11.5 in PGCs, together with enhancement of
chromatin-bound XRCC1; thus, it is possible
that BER is activated at this time point (Fig. 2)
(36). Global losses of several histone modifications (e.g., H3K27me3, H3K9ac) as well as the
linker histone H1 are observed after demethylation of DNA, indicating that widespread DNA
repair might be associated with global remodeling of nucleosomes in PGCs (38). It is also
possible that specific histone modification or
demodification enzymes (deacetylases, demethylases) are in part responsible for erasure of
histone marks in PGCs, but none have been
identified so far.
Base excision repair also appears to be involved in demethylation in the zygote immediately after fertilization (Fig. 2). The added
complication here is that it is specifically the
paternal, sperm-derived, genome that is demethylated, whereas the maternal one is not; the maternal genome may be specifically protected from
demethylation (39–42). Differentially methylated
regions in imprinted genes are also specifically
protected from demethylation, and so again are
IAPs. Nonetheless, there appear to be substantial losses of methylation in the zygote, potentially of a similar scale to those occurring in PGCs
(39, 43, 44). Notably, demethylation of the paternal genome may occur in two phases, one
before DNA replication and one associated with
the S and G2 phases (44). The first phase might
involve modification of 5mC but only partial
demethylation (44). Demethylation might then
continue at replication or afterward. BER components are also present at these stages with an
as opposed to DNA methylation (which is typically inherited), some histone modifications are
known to be reset in each generation. Because
plants do not set aside a germ line early in development (germ cells are differentiated from
adult somatic cells), some type of “reprogramming” process is likely needed to erase the effects
of epigenetic marks caused by external stimuli
(such as development or stress). For example,
PcG (Polycomb group) proteins mediate the silencing of FLC (FLOWERING LOCUS C) in
Arabidopsis, which controls flowering time (26).
In winter-annual accessions of Arabidopsis, FLC
is expressed at high levels to repress the initiation of flowering. During vernalization (prolonged exposure to cold, such as during winter),
FLC becomes modified by H3K27 trimethylation, which helps to turn off FLC expression epigenetically. When winter passes and temperatures
become warmer, trimethylation and silencing
of FLC persist, and therefore Arabidopsis can
flower in response to environmental cues such as
photoperiod. When gametes are formed through
meiosis, H3K27 trimethylation marks on FLC
are removed by an unknown mechanism and FLC
becomes reexpressed in the seeds. Thus, flowering is inhibited by FLC until the next-generation
plants encounter cold weather.
Resetting of histone marks may involve, in
part, global replacement of histones (27). The
histone variant H3.3 can be incorporated in the
absence of DNA replication, and thus is a candidate for the “replacement” histone H3 during
reprogramming. HTR10 (HISTONE THREE
RELATED 10) is exclusively expressed in male
reproductive cells, but after karyogamy of sperm
and egg cell nuclei, the paternal HTR10 signal
disappears within a matter of hours before S
phase of the first zygote division (27). This suggests that HTR10 is actively removed from the
chromatin in a replication-independent manner
specifically in the sperm cell that fertilizes the
egg. Unlike DNA methylation reprogramming,
which occurs in accessory cells, histone reprogramming takes place in the zygote and thus can
transmit information to the next generation. These
results raise a number of questions. How does
the reprogramming system differentiate between
the two sperm cells? Does similar reprogramming happen in the female genome as well?
What types of histone H3 replace the parental
histone H3 in the zygote, and where do they
come from?
Recently a new transposon silencing mark
was described in Arabidopsis that does not appear to involve the well-studied DNA methylation or histone H3K9 dimethylation marks. This
mark, H3K27 monomethylation, is needed to suppress excessive replication of heterochromatin
in which transposons reside (15). Overreplication of transposons might lead to transposon reactivation and copy number propagation, and
the H3K27 monomethylation system may have
[Fig. 2 diagram. Circular scheme of the mouse life cycle: gametes, fertilization, zygote (E0.5), blastocyst (E3.5; ICM, TE, and PE, with derived ES, TS, and XEN cells), placenta, and PGCs in the gonad (E10.5-13.5). Outer labels: DNA demethylation, DNA methylation, X inactivation, X reactivation, erasure of imprinting, initiation of imprinting. Inner-circle factors: BER, AID, ELP, TETs?, DNMTs.]
Fig. 2. The two major phases of genome-wide erasure of DNA methylation in the early embryo and in
primordial germ cells (PGCs) of the mouse. Thickness of the outer arrows indicates levels of DNA
methylation. Red, maternal genome; blue, paternal genome. After fertilization, the paternal genome is
more rapidly demethylated than the maternal one. During gametogenesis, de novo methylation in
spermatogenesis occurs earlier than in oogenesis. The inner circle shows factors or candidate factors
that are implicated in de novo methylation, the maintenance of methylation, and demethylation,
respectively. Solid arrows in the inner circle show at what developmental time these epigenetic regulators are thought to act. ES cells, TS cells, and XEN cells are stem cell lines that are derived from the
inner cell mass (ICM), trophectoderm (TE), and primitive endoderm (PE) of the blastocyst, respectively.
have been implicated in demethylation of the
paternal genome (Fig. 2) (45); Elongator is involved in diverse aspects of transcriptional
regulation and can also modify tRNAs. Could
Elongator catalyze an as yet unknown modification of 5mC that makes it a substrate for
BER? After zygotic demethylation, the embryonic genome continues to be demethylated during the following few cleavage divisions until
Hence, the current evidence for the initiation
and regulation of genome-wide erasure of DNA
methylation in PGCs and the zygote points to
initiating events that modify 5mC (such as deamination and hydroxylation), which would trigger a BER response. Of course, it is still possible
that bifunctional DNA glycosylases of the type
that excise 5mC in plants also exist in animals
(although none have been found so far); conversely, homologs of APOBEC deaminases and
TET and ALKBH-type hydroxylases have yet to
be described in plants.
Experimental Reprogramming in Mammals
Experimental reprogramming to a pluripotent
state can be achieved, albeit inefficiently, by fusion of somatic cells and pluripotent cells, by
cloning, and by direct reprogramming using core
transcription factors (48). With all three methods, there is evidence that epigenetic reprogramming is a central component of achieving the
goal of an embryonic or ES cell–like (iPS) state.
In cell fusion experiments between somatic cells
and ES cells, key pluripotency genes such as
Oct4 and Nanog need to be demethylated; AID
also seems to be important for demethylation in
this system (49). Generation of iPS cells from
somatic cells by the transduction of core transcription factors (such as OCT4, SOX2, KLF4,
and C-MYC) probably requires multiple epigenetic reprogramming steps while the cells that
undergo reprogramming divide (50). DNA
demethylation is clearly critical because incompletely reprogrammed iPS cells can become
completely reprogrammed by treatment with the
methylation inhibitor azacytidine (48). Inhibitors
of histone deacetylases and histone methyltransferases are also beneficial, showing in general
that repressive epigenetic modifications acquired
during differentiation and somatic development
need to be reversed to achieve the pluripotent
state (48). Notably, reprogramming by cloning
apparently results in better resetting of the
epigenome than can be achieved by direct
reprogramming with transcription factors, indicating perhaps that true totipotency requires
passage through germ cells or zygotes (51). Direct
applications to regenerative medicine will result
from unraveling the role of AID, hydroxymethylation, and the TETs, and of the base excision
repair pathway as well as the methyltransferases
in this process, and from knowledge of how the
reprogramming network is connected with the
pluripotency network.
Comparative Biology of
Epigenetic Reprogramming
Whether genome-scale epigenetic reprogramming has a unified purpose is not clear; some
aspects of reprogramming are clearly conserved
(or have been reinvented) in animals and plants
with their contrasting, although sometimes surprisingly similar, reproductive and biological
strategies. In mammals, zygotic reprogramming
is broadly conserved, although there may be
some differences in timing or extent; by contrast,
Xenopus does not appear to show demethylation
of the paternal genome (52). Hypomethylation
of PGCs is also seen in human and pig fetal
development but has not been studied in nonmammalian organisms. Global DNA demethylation in PGCs and paternal demethylation in the
the blastocyst stage, with DNMT1 protein being
largely excluded from the nucleus by an unknown
mechanism (Fig. 2). Nonetheless, the maintenance of methylation in differentially methylated regions of imprinted genes does depend on
DNMT1 (46), so it will be important to understand how DNMT1 might be targeted during
this reprogramming phase to key regions in the
genome, such as imprinted genes (47).
enhancement of chromatin-bound XRCC1 in
the paternal pronucleus (36). Both phases show
evidence of DNA strand breaks, indicating that
repair may be involved in both of them, and
inhibition of BER components partly interferes
with demethylation (36, 44). Whether AID or
TETs are involved in zygotic demethylation is
not yet known, but components of the Elongator
complex (Elongator complex proteins, ELPs)
RNAs potentially through both oocyte and sperm
might also contribute to epigenetic inheritance
and to reprogramming across generations in animals and in plants (24, 25, 56–59).
Finally, epigenetic reprogramming is linked
to regaining pluripotency and, following that,
lineage commitment. Early PGCs and cells in
the early embryo are pluripotent, and these cells
as well as the stem cell lines that can be isolated
from them [ES and EG (embryonic germ) cells]
have unique epigenetic signatures, which are at
least in part the outcomes of reprogramming.
For example, some of the key pluripotency
transcription factors (such as Nanog) are methylated in sperm but not in ICM (inner cell mass)
or ES cells, so their demethylation is important
for the acquisition of pluripotency (43); in the
absence of the highly expressed gene encoding
TET1 (which hydroxylates 5mC), the Nanog
gene is repressed and its promoter becomes
methylated (60).
One final epigenetic parallel between mammals and plants is worth highlighting. After demethylation in the early mammalian embryo,
selective de novo methylation occurs in ICM
cells and their descendants, which is important
for the identity and stability of embryonic lineages
(28), whereas the placenta remains hypomethylated at the genome-wide level (31). Similarly,
genome-wide demethylation in the plant endosperm but not the embryo (20, 21) indicates that
epigenetic regulation between the two primary
lineages (embryonic, extraembryonic) is fundamentally different, with this difference apparently being conserved—or reinvented—in plants
and animals.
References and Notes
1. J. A. Law, S. E. Jacobsen, Nat. Rev. Genet. 11, 204
(2010).
2. T. Chen, Y. Ueda, J. E. Dodge, Z. Wang, E. Li,
Mol. Cell. Biol. 23, 5594 (2003).
3. S. Feng et al., Proc. Natl. Acad. Sci. U.S.A. 107,
8689 (2010).
4. A. Zemach, I. E. McDaniel, P. Silva, D. Zilberman, Science
328, 916 (2010); published online 15 April 2010
(10.1126/science.1186366).
5. B. H. Ramsahoye et al., Proc. Natl. Acad. Sci. U.S.A. 97,
5237 (2000).
6. R. Lister et al., Nature 462, 315 (2009).
7. A. A. Aravin, G. J. Hannon, Cold Spring Harb. Symp.
Quant. Biol. 73, 283 (2008).
8. D. Jia, R. Z. Jurkowska, X. Zhang, A. Jeltsch, X. Cheng,
Nature 449, 248 (2007).
9. S. K. Ooi et al., Nature 448, 714 (2007).
10. Y. Zhang et al., Nucleic Acids Res. 38, 4246
(2010).
11. J. P. Thomson et al., Nature 464, 1082 (2010).
12. D. N. Ciccone et al., Nature 461, 415 (2009).
13. M. Chotalia et al., Genes Dev. 23, 105 (2009).
14. R. K. Chodavarapu et al., Nature 466, 388
(2010).
15. Y. Jacob et al., Nature 466, 987 (2010).
16. R. Bonasio, S. Tu, D. Reinberg, Science 330, 612 (2010).
17. J. K. Zhu, Annu. Rev. Genet. 43, 143 (2009).
18. J. Penterman et al., Proc. Natl. Acad. Sci. U.S.A. 104,
6752 (2007).
19. R. Lister et al., Cell 133, 523 (2008).
20. T. F. Hsieh et al., Science 324, 1451
(2009).
21. M. Gehring, K. L. Bubb, S. Henikoff, Science 324,
1447 (2009).
22. J. H. Huh, M. J. Bauer, T. F. Hsieh, R. L. Fischer, Cell 132,
735 (2008).
23. R. A. Mosher et al., Nature 460, 283 (2009).
24. R. K. Slotkin et al., Cell 136, 461 (2009).
25. V. Olmedo-Monfil et al., Nature 464, 628
(2010).
26. D. H. Kim, M. R. Doyle, S. Sung, R. M. Amasino,
Annu. Rev. Cell Dev. Biol. 25, 277 (2009).
27. M. Ingouff, Y. Hamamura, M. Gourgues, T. Higashiyama,
F. Berger, Curr. Biol. 17, 1032 (2007).
28. M. Hemberger, W. Dean, W. Reik, Nat. Rev. Mol. Cell
Biol. 10, 526 (2009).
29. M. A. Surani, K. Hayashi, P. Hajkova, Cell 128,
747 (2007).
30. H. Sasaki, Y. Matsui, Nat. Rev. Genet. 9, 129
(2008).
31. C. Popp et al., Nature 463, 1101 (2010).
32. D. M. Maatouk et al., Development 133, 3411
(2006).
33. P. Hajkova et al., Mech. Dev. 117, 15 (2002).
34. J. Lee et al., Development 129, 1807 (2002).
35. Y. Seki et al., Development 134, 2627 (2007).
36. P. Hajkova et al., Science 329, 78 (2010).
37. H. D. Morgan, W. Dean, H. A. Coker, W. Reik,
S. K. Petersen-Mahrt, J. Biol. Chem. 279, 52353
(2004).
38. P. Hajkova et al., Nature 452, 877 (2008).
39. J. Oswald et al., Curr. Biol. 10, 475 (2000).
40. W. Mayer, A. Niveleau, J. Walter, R. Fundele, T. Haaf,
Nature 403, 501 (2000).
41. F. Santos, B. Hendrich, W. Reik, W. Dean, Dev. Biol. 241,
172 (2002).
42. T. Nakamura et al., Nat. Cell Biol. 9, 64 (2007).
43. C. R. Farthing et al., PLoS Genet. 4, e1000116
(2008).
44. M. Wossidlo et al., EMBO J. 29, 1877 (2010).
45. Y. Okada, K. Yamagata, K. Hong, T. Wakayama, Y. Zhang,
Nature 463, 554 (2010).
46. R. Hirasawa et al., Genes Dev. 22, 1607
(2008).
47. X. Li et al., Dev. Cell 15, 547 (2008).
48. S. Yamanaka, H. M. Blau, Nature 465, 704
(2010).
49. N. Bhutani et al., Nature 463, 1042 (2010).
50. J. Hanna et al., Nature 462, 595 (2009).
51. K. Kim et al., Nature 467, 285 (2010).
52. I. Stancheva, O. El-Maarri, J. Walter, A. Niveleau,
R. R. Meehan, Dev. Biol. 243, 155 (2002).
53. R. Öllinger et al., PLoS Genet. 4, e1000199
(2008).
54. F. K. Teixeira et al., Science 323, 1600 (2009); published
online 29 January 2009 (10.1126/science.1165313).
55. D. J. Katz, T. M. Edwards, V. Reinke, W. G. Kelly, Cell 137,
308 (2009).
56. U. Brykczynska et al., Nat. Struct. Mol. Biol. 17,
679 (2010).
57. S. S. Hammoud et al., Nature 460, 473
(2009).
58. O. H. Tam et al., Nature 453, 534 (2008).
59. T. Watanabe et al., Nature 453, 539 (2008).
60. S. Ito et al., Nature 466, 1129 (2010).
61. We thank J. A. Law for reading and commenting on
the manuscript, and F. Santos for help with figures.
S.F. is a Special Fellow of the Leukemia & Lymphoma
Society. Supported by NIH grant GM60398 (S.E.J.) and
by grants from the UK Biotechnology and Biological
Sciences Research Council, UK Medical Research
Council, and European Union (W.R.). S.E.J. is an
investigator of the Howard Hughes Medical
Institute.
zygote may occur primarily in mammals (and
in the central cell in seed plants) that also have
imprinting whose mechanism is based on DNA
methylation. Clearly, demethylation in PGCs is
necessary for erasure of imprints so that new imprints can later be established properly, according to the sex of the germ line (Fig. 2). Plants do
not seem to erase imprints; instead, they establish them by demethylation of the maternal
genome in the endosperm after fertilization (with
the endosperm being comparable to the placenta) (Fig. 1). Perhaps there are as yet undiscovered imprinted genes that acquire parent-specific
methylation patterns by (paternal) zygotic demethylation, in analogy to plants.
A second group of genes where demethylation
in PGCs seems important are the germ line–
specific genes (e.g., Dazl, Vasa) that have specialized functions, for example, in meiosis and
germ cell differentiation. These genes are generally demethylated and expressed in germ cells,
but in early PGCs they are methylated and silenced. Genes that are demethylated in PGCs
include those with a role in transposon control;
Tex19.1, for example, silences members of the
ERVK transposon family (53). Hence, global
demethylation, which in principle would lead to
transcriptional activation and potentially to transposition of active transposon families, at the same
time activates defense mechanisms against transposons that are not needed in somatic cells where
transposons are methylated. An extreme view
of this scenario is the possibility that demethylated transposons produce small RNAs, which in
turn lead to de novo methylation and renewed
silencing of transposons (Fig. 1) (24). Although
it may sound paradoxical, reprogramming may
have an important role in resetting the permanent silencing program for transposons across
generations. Also, the fact that AID has a role in
the erasure of methylation in PGCs is interesting
in connection with roles of APOBEC deaminases in innate immunity and transposon control, establishing another potential link between
the two.
The extent of methylation reprogramming
in PGCs is substantial, and this limits the potential in mammals for epigenetic transgenerational
inheritance. By contrast, in plants where epigenetic reprogramming may not occur to such an
extent in the germ line, examples of stable inheritance of epialleles over multiple generations are more common (54). In Caenorhabditis
elegans, histone demethylation in the germ line
is needed to prevent accumulation of aberrant
epigenetic marks that interfere with normal
physiology and limit life span (55). By analogy,
epigenetic reprogramming in mammalian PGCs
or early embryos may be important to prevent
the accumulation of potentially detrimental epialleles, which could otherwise cause chronic diseases and limit life span in human populations.
The inheritance of histone marks and of small
10.1126/science.1190614
SCIENCE VOL 330 29 OCTOBER 2010