The Relationship Between Chromatin Organisation and RNA Splicing

FACULTY OF SCIENCE
CHARLES UNIVERSITY IN PRAGUE
Department of Cell Biology
THE RELATIONSHIP BETWEEN CHROMATIN
ORGANISATION AND RNA SPLICING
UNDERGRADUATE THESIS
Jaroslav Icha
supervisor: Mgr. David Staněk, Ph.D.
Prague, 2010
Acknowledgments
I would like to thank my supervisor David Staněk for his guidance and advice on scientific
writing, and Sarah Guck for correcting the language of this work.
I declare that I wrote this thesis on my own, guided by my supervisor Mgr. David Staněk,
Ph.D., and that I cited all sources of information used.
……………………………….
in Prague
Abstract
It is well known that RNA splicing and other pre-mRNA processing reactions occur
cotranscriptionally. Surprisingly, several chromatin features were recently found to be
unevenly distributed between exons and introns, which directly links chromatin
organisation to splicing. This work summarizes all the studies that detected these chromatin
patterns on exons and discusses their inconsistencies. In these studies, nucleosomes were found
to be preferentially positioned on exons, and specific histone modifications and DNA methylation
were also enriched on exons. These local patterns of chromatin organisation were
evolutionarily conserved from mammals (human and mouse) to the worm C. elegans and the fly D.
melanogaster. The findings indicate that the role of chromatin structure in pre-mRNA
splicing is to promote exon recognition. Two mechanisms have been proposed for this role of
chromatin in splicing. The first is an influence on RNA polymerase II elongation speed, and
the second is specific recruitment of the splicing machinery. In the near future we can expect
studies searching for concrete examples of these two mechanisms and assessing their
significance. Indeed, it was reported very recently that H3K36me3 regulates alternative
splicing via the second mechanism.
Keywords: chromatin, splicing, H3K36me3, nucleosome positioning, chromatin
signatures, exon recognition
Abstrakt (Czech abstract, translated)
It is known that RNA splicing and other pre-mRNA modifications occur cotranscriptionally.
Recently, it was quite unexpectedly found that some properties of chromatin differ between introns
and exons, which directly links splicing to chromatin organisation as well. This work summarizes
all the articles that describe these differences in chromatin structure and discusses the points
on which they disagree. In these studies, more nucleosomes were found on exons than on the
surrounding introns. Specific histone modifications and DNA methylation were also more frequent
on exons. These local differences in chromatin arrangement were evolutionarily conserved between
mammals (human and mouse), the nematode C. elegans and the fruit fly D. melanogaster. The results
of these studies suggest that the main function of the differences in chromatin arrangement is to
assist exon recognition. So far, two mechanisms have been proposed by which chromatin can influence
splicing. The first is the influence of chromatin on the elongation rate of RNA polymerase II, and
the second is the specific binding of splicing factors to chromatin. In the near future we can
expect studies that will look for concrete examples of both mechanisms and assess how important
each mechanism is. Indeed, it was found very recently that H3K36me3 regulates
alternative splicing by the second mechanism.
Keywords: chromatin, splicing, H3K36me3, nucleosome positioning, chromatin
modifications, exon recognition
Contents
1. Introduction
1.1 RNA splicing
1.1.1 Alternative splicing
1.1.2 Splicing is co-transcriptional
1.2 Chromatin modifications
1.2.1 Acetylations and methylations of histones
1.2.2 Histone code
1.2.3 Chromatin signatures
1.3 What determines nucleosome positioning in vivo?
1.3.1 Statistical positioning
1.3.2 Intrinsic preference of DNA sequences for nucleosomes
1.3.3 DNA methylation
1.3.4 Histone modifications
1.3.5 Chromatin remodeling complexes
1.3.6 Transcription factor binding
2. The relationship between chromatin and splicing
2.1 Nucleosomes are preferentially positioned on exons and marked with specific histone methylations
2.2 DNA methylation is enriched on exons
3. Conclusions
4. Literature
1. Introduction
1.1 RNA splicing
Splicing is a process typical of eukaryotes in which the primary sequence of pre-mRNA is
modified. Parts of the sequence called introns are removed, and the remaining RNA
sequences called exons are joined together. Splicing was discovered in 1977 when it was
observed that viral DNA and viral RNA transcribed in the cell do not anneal perfectly [1]. The
viral DNA formed loops; these extra sequences, present only in the DNA, corresponded to introns. At
the boundaries of introns and exons there are consensus sequences that are recognized
(by base pairing) by the small nuclear ribonucleoprotein particles (snRNPs) U1 and U2. U1
and U2 together with other snRNPs (U4, U5, U6) and associated proteins form a complex
called the spliceosome. It catalyzes two sequential transesterification reactions by which the
intron is spliced out and the two adjacent exons are joined. There are also self-splicing introns,
which are spliced without spliceosomal catalysis. The recognition of splice sites may vary and
is subject to regulation, which results in the generation of two or more alternatively spliced
mRNA isoforms from a single gene.
1.1.1 Alternative splicing
In the first years after the discovery of alternative splicing (AS) in an immunoglobulin gene
(membrane and secreted isoforms) [2; 3], the process was considered an exception. The
current view of AS is the opposite: it is thought that as many as 95% of pre-mRNAs with
more than two exons are alternatively spliced, at least in humans. Alternative
splicing is much more prevalent in humans than in C. elegans and D. melanogaster, while the
number of genes in humans is not strikingly higher. This is a possible explanation for the higher
complexity of humans [4]. One gene can code for a large number of different mRNAs. An
extreme example is the Dscam gene in D. melanogaster, which can give rise to 38,016
different mRNA isoforms [4]. However, not all isoforms differ in function. AS events that
are tissue-specific, conserved across several species and frequent are considered to be functional,
but this applies to only 10-30% of all AS events. The rest are probably a result of noise in the
splicing process. Still, we can think of the bulk of non-functional RNA as an evolutionary
experiment through which the cell can find new solutions at little cost [5]. When we consider
the number of RNA isoforms, which exceeds the number of human genes fivefold, it is clear
that the machinery regulating alternative splicing cannot be gene-specific. Instead, there is a system
of antagonistic regulatory RNA-binding proteins. Splicing is positively regulated by SR
proteins (named for their serine- and arginine-rich domain) and negatively regulated by heterogeneous
nuclear ribonucleoproteins (hnRNPs). SR proteins bind sequences called splicing enhancers,
and hnRNPs bind sequences called splicing silencers. According to a simple model, splicing
occurs when the positive signals from SR proteins bound to splicing enhancers outweigh the
negative signals from hnRNPs bound to splicing silencers. In reality, many more
mechanisms influence the splicing decision [4].
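The simple balance model can be written down as a comparison of two sums. The following is a minimal sketch and not a predictive tool: the function name exon_is_spliced and all numeric weights are hypothetical and chosen only for illustration.

```python
def exon_is_spliced(enhancer_signals, silencer_signals):
    """Simple balance model: splicing occurs when the summed positive signal
    (SR proteins on splicing enhancers) outweighs the summed negative signal
    (hnRNPs on splicing silencers). Weights are arbitrary illustrative units."""
    return sum(enhancer_signals) > sum(silencer_signals)


# Hypothetical exon with two bound SR proteins and one bound hnRNP.
print(exon_is_spliced([1.0, 0.5], [0.8]))   # True: enhancers outweigh silencers
# Hypothetical exon dominated by silencers.
print(exon_is_spliced([0.3], [0.4, 0.2]))   # False
```

As the text notes, real splicing decisions depend on many more mechanisms, so a single threshold comparison like this one captures only the logic of the simple model.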
1.1.2 Splicing is co-transcriptional
Researchers have long tried to find a so-called splicing code, assuming that it would be possible to
predict splicing from the primary structure of the RNA. This idea prevailed during the
early years of research on splicing, when experiments were done mostly in vitro on
exogenously added RNA. Splicing occurred normally (though markedly slower than in
vivo), and sequence was one obvious variable to study. To this day, although some
progress has been made, it seems that splicing prediction cannot be based solely on RNA
sequence. There must be many other variables influencing splicing decisions. The biggest
change in the concept of splicing came with the discovery that most splicing events
happen co-transcriptionally. This means that introns are removed from the
nascent transcript while it is still being transcribed (Fig. 1).
Fig. 1 Co-transcriptional processing of pre-mRNA. Spliceosome assembles on the nascent transcript after 5′
capping and before polyadenylation. The C-terminal domain (CTD) of RNA polymerase II (RNAPII) is
phosphorylated on Ser5 during transcription initiation (blue line) and on Ser2 during elongation (yellow line).
Reproduced from [6].
Many components of the splicing machinery, e.g., SR proteins and U1 snRNP, were found
to interact directly with the C-terminal domain (CTD) of RNA polymerase II (RNAPII) [6].
The involvement of RNAPII brought new variables into the model of splicing, above all the speed
of RNAPII elongation. A higher speed of elongation results in exon skipping, while a lower
speed results in exon inclusion [7] (in that study, the speed was altered by a mutation in RNAPII). Elongation
speed can be modulated through several mechanisms. An siRNA targeted to the vicinity of
an alternative exon triggered the formation of heterochromatin marks, slowed the rate of
elongation around that sequence and caused exon inclusion [8]. Hyperacetylation of histone
H3 lysine 9 after membrane depolarisation in neuronal cells caused a higher elongation rate and
exon skipping [9]. Hyperphosphorylation of RNAPII after DNA damage slowed down
elongation and caused alternative splicing of many genes [10].
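One common way to rationalise the effect of elongation speed is a "window of opportunity": the slower RNAPII travels, the longer a weak splice site is available before a competing downstream site is transcribed. This kinetic picture is an assumption added here for illustration and is not spelled out in the text; the distance, the elongation rates and the recognition time in the sketch below are invented numbers.

```python
def recognition_window_s(distance_bp, elongation_rate_bp_per_s):
    """Time (s) RNAPII needs to travel from the alternative exon to the
    competing downstream splice site."""
    if elongation_rate_bp_per_s <= 0:
        raise ValueError("elongation rate must be positive")
    return distance_bp / elongation_rate_bp_per_s


def exon_included(distance_bp, elongation_rate_bp_per_s, time_needed_s):
    """The exon is included if the window is long enough to recognise it."""
    window = recognition_window_s(distance_bp, elongation_rate_bp_per_s)
    return window >= time_needed_s


DISTANCE_BP = 2000     # hypothetical distance to the competing site
TIME_NEEDED_S = 60.0   # hypothetical time needed to recognise the weak site

for label, rate in (("fast", 50.0), ("slow", 20.0)):  # bp/s, illustrative
    window = recognition_window_s(DISTANCE_BP, rate)
    print(label, window, exon_included(DISTANCE_BP, rate, TIME_NEEDED_S))
```

With these made-up values the fast polymerase leaves a 40 s window (exon skipped) and the slow one a 100 s window (exon included), matching the direction of the effect described above.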
Transcription is a link through which previously unrelated areas of chromatin and splicing
are tied together. In this thesis we will point to newly discovered chromatin features that
probably contribute to exon definition genome-wide.
1.2 Chromatin modifications
The eukaryotic genome exists in the cell nucleus in the form of chromatin. The basic unit of
chromatin is the nucleosome, which consists of a histone octamer and approximately 147 base
pairs of DNA wrapped 1.65 times around it [11]. Histones are basic, so they reduce the
electrostatic repulsion of negatively charged DNA and allow it to be tightly packed within the
cell nucleus. Each histone consists of a core formed by three alpha helices and an
amino-terminal domain. The amino-terminal domain is called a tail because it is unstructured and
sticks out of the nucleosome core (Fig. 2).
Fig. 2 (a) 3D structure of a nucleosome. DNA is shown in black and red, histones in grey and basic amino
acids (lysine and arginine) within 7 Å of DNA are shown in blue to emphasize the electrostatic interactions
between DNA phosphates and histones (b) Core histones H2A, H2B, H3, H4 constituting a nucleosome and
linker histone H1, minor histone variants H3.3, H2A.Z and histone tail modifications (Ac, acetylation; Me,
methylation). Reproduced from [12].
Histone tails are positively charged due to their frequent lysine and arginine residues, and
they are often covalently modified by acetylation, methylation, phosphorylation,
ADP-ribosylation, ubiquitinylation, sumoylation or deimination. The tail is the most conserved part of
the histone, which as a whole is one of the most conserved proteins in eukaryotes. The specific
modifications of histone tails control the accessibility of DNA to transcription, replication
and DNA repair enzymes. Histone tails also regulate the formation of higher-order chromatin
structures.
1.2.1 Acetylations and methylations of histones
Of all the histone tail modifications, the most important appear to be acetylation and
methylation of lysines. They were discovered more than forty years ago [13]. Acetylation
reduces the positive charge on histone tails and promotes an open conformation of chromatin,
which enhances transcription. Methylation does not influence the charge, and different
methylations can have an enhancing or a repressing effect on transcription. Whether a specific
modification has an activating or repressing effect depends on its position in the gene. For
example, histone H3 lysine 9 methylation (H3K9me) activates transcription when located in the coding
region but represses it when located at the promoter [14]. Modifications also differ in their
range: they can occupy a long region of several kb or just a few
well-positioned nucleosomes, e.g., around the transcription start site (TSS).
Some chromatin modifications are regarded as epigenetically inherited traits because they are
thought to be handed down unchanged through many cell divisions by diverse
mechanisms. It was shown that the heterochromatin mark H3K9me3 reinforces itself by
recruiting a K9-methyltransferase [15]. However, most methylations and acetylations
are only transient. Their presence depends on the balanced actions of histone methyltransferases and
acetyltransferases as well as histone demethylases and deacetylases. Edmunds et al. showed
that at the c-Fos and c-Jun genes, modifications can change within minutes along with activation
of transcription [16]. These transient modifications could be underestimated in
chromatin immunoprecipitation (ChIP) data because at any moment it is possible to capture
only a portion of all the potential modifications in the cell. Inhibitors of deacetylases and
demethylases are used to examine the dynamics of acetylations and methylations, i.e., how
fast the nucleosomes become hyperacetylated or hypermethylated, while also revealing all the
potential sites of modification.
Fig. 3 (a) Protein domains recognizing methylations, acetylations and phosphorylations of histone tails (b)
Proteins binding specific histone H3 and H4 tail modifications. Reproduced from [17].
1.2.2 Histone code
Various combinations of covalent histone tail modifications are present in cells. The
modifications are recognized by specialized domains of various proteins with diverse
functions in genome regulation [18]. This is referred to as the histone code [19]. Acetylations
are bound by proteins containing a bromodomain [20]. Methylations are recognized by
domains from the Royal family (chromo, tudor, MBT) [21] and by the evolutionarily unrelated PHD
(plant homeodomain) domains [22]. Phosphorylations are recognized by 14-3-3 proteins (Fig. 3a)
[23]. These domains recruit several enzymatic activities to the nucleosomes (Fig. 3b), for
example lysine deacetylases and demethylases, methyltransferases, chromatin-remodeling
complexes and ubiquitin ligases [17].
Histone modifications can also prevent the binding of non-histone proteins to chromatin.
For example, H3K4 methylation disrupts binding of the histone deacetylase complex NuRD [24],
and H3T3 phosphorylation prevents binding of the INHAT (inhibitor of acetyltransferases)
complex [25]. Both of these complexes repress transcription, which agrees with the
observation that H3K4 methylation and H3T3 phosphorylation are activating
modifications.
1.2.3 Chromatin signatures
There are some well described modifications typical for either euchromatin or
heterochromatin. Heterochromatin areas such as centromeres or transposons are characterised
by a low level of acetylations and a high level of di- and trimethylations of H3K9, H3K27 and
trimethylation of H4K20 [26; 27]. Silenced genes show low levels of acetylations and
activating methylations. Actively transcribed euchromatin is characterized by high levels of
acetylations and trimethylations of H3K4, H3K36 and H3K79. H3K4me3 and H3K9ac have a
narrow peak around the TSS and are associated with initiation-competent RNAPII.
H3K36me3, another mark of active transcription, is dependent on transcriptional elongation
and is enriched in the whole gene body peaking in the 3′ part [26; 28].
The role of many histone modifications in the transcribed region is still unknown, but
some of them were found to function in the cotranscriptional processing of
pre-mRNA. Sims et al. showed that the CHD1 protein, one of the specific binders of H3K4me3,
forms a stable complex with components of the spliceosome. Knockdown of CHD1 or of H3K4
methyltransferase by siRNA lowered both the association of U2 snRNP with chromatin and
splicing efficiency in vivo [29]. The H3K36me3 methyltransferase HYPB/Setd2 forms a large
complex with nuclear export factors on the CTD of RNAPII, and HYPB/Setd2 knockdown
leads to malfunction of the export system and accumulation of mRNAs in the nucleus [30].
Some chromatin signatures can reveal other functional elements in the genome and
can be used to find them, especially elements that lack conserved sequence motifs and cannot be
predicted by comparative genomics. Guttman et al. found over one thousand transcription
units for non-coding RNAs by examining the peaks of H3K36me3 and H3K4me3 in
intergenic regions [31]. Specific H3K4me1, H3K9 and K14 diacetylation peaks helped to
identify many enhancers [32; 33]. Other functional elements with a specific chromatin signature
are insulators, which prevent heterochromatin from spreading into euchromatin regions. They
are enriched in H3K4 mono-, di- or trimethylation and H3K9me1 and simultaneously depleted
in H3K9 di- or trimethylation [28]. Surprisingly, specific histone modifications were
also discovered on exons. The first one was H3K36me3 [34], and others followed (see
below).
We have shown that the chromatin state is a key element in the regulation of many cellular
processes. So far we have focused on the role of chromatin in the regulation of gene expression; later we
will discuss the link between chromatin and splicing of pre-mRNA, which has only been briefly
mentioned thus far. Specifically, we will describe new discoveries of preferential positioning
of nucleosomes on exons compared to introns and of several specific histone modifications
present at exons.
1.3 What determines nucleosome positioning in vivo?
It was discovered last year that nucleosomes are preferentially positioned on exons [35; 36;
37; 38; 39]. An obvious question is by what mechanisms nucleosomes become enriched on
exons. The factors influencing nucleosome positioning are reviewed here. This research
topic is almost thirty years old, but recent progress in methods that allow the examination
of nucleosome positions across the whole genome has considerably improved our understanding of
nucleosome positioning. At first, several papers were published that determined nucleosome
positions genome-wide by tiling arrays, mostly in S. cerevisiae [40; 41; 42; 43]. Later,
next-generation sequencing was used instead, in more species [28; 44; 45; 46].
Despite some discrepancies, all the studies revealed certain common patterns. Most
yeast promoters show similar nucleosome occupancy (Fig. 4). They contain a region depleted of
nucleosomes, the so-called nucleosome-free region (NFR), with the -1 and +1 nucleosomes well
positioned (or phased) around it. The -1 nucleosome occupies the sequence from -300 to -150
relative to the TSS, and thus covers some regulatory elements in the promoter. The +1
nucleosome displays the strongest positioning of all nucleosomes. Both the -1 and +1
nucleosomes contain the histone variants H2A.Z and H3.3 [44; 45]. At the 3′ end of genes there is
another NFR. In compact genomes, e.g., in yeast, this 3′ NFR could serve as the 5′ NFR of
the next gene [12]. The genome-wide studies also revealed that as many as 80% of yeast
nucleosomes are well positioned. This means that they occupy the same DNA sequence in
most of the cells used in the experiment (growing, unsynchronised cells).
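The notion of a well-positioned nucleosome can be made concrete with a simple score: the fraction of cells in which the nucleosome centre lies close to its most common location. The sketch below is only one possible definition, assumed here for illustration; the function name positioning_score, the 10 bp tolerance and the example coordinates are hypothetical and are not taken from the cited studies.

```python
from collections import Counter


def positioning_score(centre_positions, tolerance_bp=10):
    """Fraction of observations whose nucleosome centre lies within
    tolerance_bp of the most frequently observed centre position."""
    if not centre_positions:
        raise ValueError("need at least one observed position")
    modal_position, _ = Counter(centre_positions).most_common(1)[0]
    near = sum(1 for p in centre_positions if abs(p - modal_position) <= tolerance_bp)
    return near / len(centre_positions)


# Invented centre coordinates (bp) of one nucleosome observed in eight cells.
well_positioned = [100, 101, 100, 99, 100, 102, 100, 98]
fuzzy = [100, 100, 130, 160, 70, 40, 190, 220]

print(positioning_score(well_positioned))  # 1.0
print(positioning_score(fuzzy))            # 0.25
```

A score near 1 corresponds to a nucleosome that occupies the same sequence in most cells, while a low score indicates a nucleosome whose position varies from cell to cell.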
Fig. 4 The consensus distribution of nucleosomes over yeast genes. Genes were aligned together at the
beginning and at the end and the graphs were fused in the gene body. The grey ovals represent nucleosomes, the
green circle represents the transcription start site (TSS) and the red circle represents the transcription termination
site. The green shading in the plot represents high levels of H2A.Z, acetylation, H3K4 methylation and well
positioned nucleosomes. Reproduced from [12] which is based on [47].
1.3.1 Statistical positioning
There is more than one explanation for the precise positioning of nucleosomes. The old
but often preferred theory of statistical positioning says that nucleosomes become
regularly ordered next to a barrier [47; 48]. The barrier can be a non-histone protein that binds DNA
in a sequence-specific manner (e.g., a transcription factor), or it can be another nucleosome,
provided it is strongly positioned. A sequence that strongly excludes nucleosomes can act as a
barrier, too. The poly (dA-dT) tract at the 3′ end of genes is a good example. Statistical
positioning can differ in strength among species and cell types because they differ in the
length of the linker DNA between nucleosomes. When the nucleosomes are close to each
other, there is not much variation possible. The strongest statistical positioning is expected in
yeast (~18 bp linker) [12]. We can presume somewhat weaker effects in species with a longer
linker DNA such as D. melanogaster and C. elegans (~28 bp linker) or humans (~38 bp
linker) [12]. If nucleosomes followed statistical positioning alone, their positions relative to the barrier
would depend only on their density (i.e., the number of nucleosomes on the sequence).
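Statistical positioning can be illustrated with a toy simulation in which nucleosomes are treated as rigid, non-overlapping blocks with no sequence preference at all. This is a simplified sketch under that assumption, not the analysis used in the cited studies; the region length, the number of nucleosomes, the sample count and the function names are all hypothetical, and the two ends of the region stand in for barriers.

```python
import random

NUCLEOSOME_BP = 147  # DNA footprint of one nucleosome, as given in the text


def sample_configuration(region_bp, n_nucleosomes, rng):
    """Draw one arrangement of non-overlapping nucleosomes uniformly at random.

    The two ends of the region act as barriers. Returns sorted start positions.
    """
    free_bp = region_bp - n_nucleosomes * NUCLEOSOME_BP  # total linker DNA
    if free_bp < 0:
        raise ValueError("region too short for this many nucleosomes")
    # Shrink every nucleosome to a single site, pick distinct sites, then
    # re-expand: each valid arrangement is reached with equal probability.
    slots = sorted(rng.sample(range(free_bp + n_nucleosomes), n_nucleosomes))
    return [slot + i * (NUCLEOSOME_BP - 1) for i, slot in enumerate(slots)]


def start_frequency(region_bp, n_nucleosomes, n_samples, seed=0):
    """Fraction of sampled arrangements in which a nucleosome starts at each bp."""
    rng = random.Random(seed)
    counts = [0] * region_bp
    for _ in range(n_samples):
        for start in sample_configuration(region_bp, n_nucleosomes, rng):
            counts[start] += 1
    return [c / n_samples for c in counts]


# Illustrative numbers: 10 nucleosomes on 1700 bp leave about 21 bp per linker.
freq = start_frequency(region_bp=1700, n_nucleosomes=10, n_samples=2000)
print("at the barrier:", freq[0], "mid-region:", freq[850])
```

Even without any sequence information, nucleosome starts pile up next to the barrier and become more spread out further away. Lowering n_nucleosomes (longer average linker) weakens this ordering, which is the density dependence described above.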
1.3.2 Intrinsic preference of DNA sequences for nucleosomes
The most studied mechanism of nucleosome positioning is the intrinsic preference of
certain DNA sequences for histones, e.g., [41; 49; 50; 51; 52; 53; 54; 55; 56; 57; 58]. The
physical basis of this preference is not a specific interaction between bases in DNA and certain
amino acid residues in histones. Rather, it is a difference among DNA sequences in bendability and in the
ability to alter the helical twist [59]. The most widely known nucleosome-excluding sequence, both in
vivo and in vitro, is the poly (dA-dT) tract. It excludes nucleosomes from
promoters, origins of replication and the 3′ ends of genes [60; 61; 62]. These tracts are
also enriched in eukaryotes compared to prokaryotes [63]. Conversely, sequences that show
high affinity for nucleosomes have a high GC content, and Tillo and Hughes claim that
nucleosome occupancy can be predicted merely from the GC percentage [52].
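To see what a GC-based predictor works with, the GC fraction can be computed in nucleosome-sized windows along a sequence. The sketch below only computes this input quantity; it is not the model of Tillo and Hughes, and the function names and the toy sequence are assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window_bp=147, step_bp=1):
    """GC fraction of every nucleosome-sized window along the sequence."""
    return [
        gc_fraction(seq[i:i + window_bp])
        for i in range(0, len(seq) - window_bp + 1, step_bp)
    ]


# Toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "AT" * 100 + "GC" * 100
profile = sliding_gc(toy)
print(profile[0], profile[-1])  # 0.0 at the AT-rich end, 1.0 at the GC-rich end
```

Under the claim cited above, windows with a higher GC fraction would be expected to show higher nucleosome occupancy, and AT-rich windows lower occupancy.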
A widely used method for studying the intrinsic DNA preferences for histones is to
compare nucleosome occupancy in vivo with occupancy in vitro on chromatin reconstituted from
purified genomic DNA and histones. Many studies claim that the in vivo and in vitro distributions
of nucleosomes are highly correlated [49; 64]. This would mean that sequence is the only
important factor that influences nucleosome positions in vivo. This hypothesis is known as the
nucleosome code [49; 51].
However, the latest findings of Zhang et al. [65] contradict this theory, and the authors attribute
the discrepancy to problematic methodology in Kaplan et al. [49]. In their in vitro reconstitution
experiment, Kaplan et al. used a histone:DNA ratio of 0.4:1, while Zhang et al. used a ratio of 1:1. Kaplan et al.
thus created artificial competition among sequences for nucleosomes, which resulted in isolated
nucleosomes (~1 nucleosome per 400 bp) occupying the DNA sequences that most favour
them. This situation is different from in vivo conditions, in which histones are
abundant and all DNA is packed into nucleosomes [65]. Zhang et al. also mention further evidence
against the nucleosome code. They argue that it does not explain the differences in nucleosome
spacing among cell types and related species, or the different nucleosome positions adopted by S.
cerevisiae DNA when it was moved into S. pombe cells [53].
Both studies of yeast genome-wide nucleosome positions [49; 65] were compared by Stein
et al. [66]. They compared the in vitro data of Kaplan et al. with in vivo microarray data from
another yeast study by Lee et al. [41] and found less correlation than was reported in the original
study. Stein et al. also suggest that the correlation coefficient in Kaplan et al. is overestimated
due to the influential point effect. This effect occurs when a large number of points is
concentrated in a small region of the scatter plot and a small number of correlated points lies far outside it.
Here, these outlying points represent the frequently observed nucleosome depletion at
promoters and 3′ ends of genes. Most importantly, Kaplan et al. used a flawed methodology
that made their study more prone to certain artifacts of next-generation sequencing. It is known
that there is a huge DNA sequence-dependent variation in read number [67]. In contrast,
the experimental design of Zhang et al. [65] was resistant to this artifact. Stein et al. conclude
that there is no nucleosome code, and that only a small fraction of nucleosomes are positioned as a
consequence of histone preferences for DNA sequence motifs.
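The influential point effect is easy to reproduce with synthetic numbers. The sketch below uses entirely artificial data, not the Kaplan et al. or Lee et al. measurements: a dense cloud of uncorrelated points stands in for most of the genome, and a few outlying points with low values in both data sets stand in for nucleosome-depleted regions. The variable names, point counts and coordinates are assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)

# Dense cloud: 1000 points with independent (uncorrelated) coordinates.
cloud_x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
cloud_y = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# A few outlying points that are low in BOTH data sets.
outlier_x = [-10.0 + rng.gauss(0.0, 0.5) for _ in range(20)]
outlier_y = [-10.0 + rng.gauss(0.0, 0.5) for _ in range(20)]

r_cloud = pearson_r(cloud_x, cloud_y)
r_all = pearson_r(cloud_x + outlier_x, cloud_y + outlier_y)
print("cloud only:", round(r_cloud, 2), "with outliers:", round(r_all, 2))
```

Although the bulk of the points carries essentially no correlation, adding about 2% of outlying points produces a substantial overall coefficient, which is the kind of overestimation described above.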
1.3.3 DNA methylation
It was shown by modelling [68] and by crystallography [69] that methylation of DNA at
CpG decreases the ability of a DNA strand to bend. Lower bendability of DNA should cause
decreased affinity of nucleosomes for that sequence. There are in vitro studies which confirm
this [70; 71] but data from in vivo studies are largely missing.
1.3.4 Histone modifications
The direct influence of covalent histone modifications, specifically acetylations, on
nucleosome positioning was examined by removing the tail domains by trypsin digestion
[72] and later by hyperacetylation of the tails [73]. Both studies observed a small
increase in the accessibility of DNA (measured by nuclease digestion). Although the direct effect of
histone tail modifications is small, their main effect is thought to be indirect, through
protein-protein interactions, e.g., by recruiting ATP-dependent chromatin remodeling complexes and
other bromodomain-containing proteins [17].
1.3.5 Chromatin remodeling complexes
Chromatin remodeling complexes are able to rewrite the state of chromatin established
by equilibrium mechanisms such as statistical positioning. For example, Isw2 in yeast has the
ability to move nucleosomes onto sequences unfavourable for them, but when it is deleted
nucleosomes can once again escape from these sequences [74]. Chromatin remodeling
complexes such as RSC can remove nucleosomes from the DNA altogether [75]. Other
complexes replace nucleosome histone subunits, such as the SWR1 remodeling complex,
which replaces H2A with H2A.Z [76; 77], or CHD1, which replaces H3 with H3.3 [78].
1.3.6 Transcription factor binding
Sequence-specific binding of non-histone proteins, such as transcription factors, to DNA
can displace nucleosomes. Transcription factor (TF) binding sites are significantly
depleted of nucleosomes in vivo [41; 42]. When a TF binding site is naturally wrapped
around a nucleosome, there is competition for that site between the TF and nearby
nucleosomes, which must alter their positions. There can be cooperativity between a TF and
naturally nucleosome-excluding poly (dA-dT) sequences, which are frequent in promoters
[60]. In the case of poly (dA-dT) sequences there is probably little competition, and TFs bind to
previously unoccupied sequences, so they do not influence the positions of neighbouring
nucleosomes.
Clearly there are many mechanisms of nucleosome positioning, but it is still unclear
how much each mechanism contributes to the final nucleosome positioning in vivo.
Even though there is probably no nucleosome code, there is a consensus that DNA
sequence is sometimes the major determinant of nucleosome positions and that GC-rich sequences have
a higher affinity for nucleosomes than AT-rich sequences. The depletion of nucleosomes at 3′
NFRs is largely encoded in DNA. At 5′ NFRs there is less correlation between in vivo and in
vitro positioning data [65]. This implies that besides the DNA sequence other factors contribute,
such as TF binding, the action of chromatin remodeling complexes and transcription initiation.
The +1 nucleosome, which is the most strongly positioned of all, shows a close relationship with
the transcription start site. The TSS in yeast is mostly located inside the +1 nucleosome,
~10-15 bp from its upstream border [44]. D. melanogaster genes also show regular spacing
between the TSS and the +1 nucleosome [45]. Strong positioning of the +1 nucleosome cannot be
explained by the influence of the DNA sequence alone. Zhang et al. propose a transcription-based
mechanism for +1 nucleosome positioning [65]. First, the transcription initiation machinery
assembles on the 5′ NFR, and then some component of this machinery interacts with a
nucleosome-remodeling complex that positions the +1 nucleosome. The +1 nucleosome then
statistically positions the downstream nucleosomes. This model would be valid for yeast and
invertebrates such as flies, but not for vertebrates, in which the majority of promoters differ
from the type described above. They are called diffused promoters and have a different
structure, for example, multiple TSSs spread over 50-100 bp [79].
We should not be surprised that we cannot find a single dominant mechanism of nucleosome
positioning. Replication, transcription and the DNA damage response are regulated by histone
modifications and nucleosome positions, so it is best for chromatin to stay in dynamic
equilibrium under the influence of multiple factors in order to react to the momentary needs of
the cell.
2. The relationship between chromatin and splicing
It is now generally accepted that RNA is spliced cotranscriptionally. Transcription and
splicing are closely linked, which creates an opportunity to influence splicing indirectly by affecting
transcription. Chromatin state is one well-known factor that affects transcription. Over the last
year, many studies appeared that demonstrated new links between chromatin and splicing
(summarized in Table 1). These studies discovered that nucleosomes are preferentially
positioned on exons, and that these nucleosomes are enriched in several histone methylations
compared to adjacent introns. This suggests that chromatin structure promotes exon
definition.
However, the first link between chromatin and splicing was proposed long ago by
Beckmann et al. [80]. They noticed a periodicity in the distribution of exons and introns in
genes that is conspicuously similar to the pattern of DNA wound around nucleosomes. At
that time, however, they could not draw the right conclusion from their data because it was not yet known
that splicing happens cotranscriptionally. Thus, linking chromatin and splicing did not make
much sense. They predicted that nucleosomes might protect splice sites from mutations, since
nucleosomal DNA mutates more slowly than linker DNA [80]. Later on, it turned out that the splicing
pattern cannot be determined solely from the RNA sequence. No algorithm based on RNA
primary structure could successfully predict splicing, implying that an extra
level of information has to be provided. Now that we know that splicing occurs simultaneously with
transcription, a good candidate for that additional information is the structure of chromatin.
The advent of genome sequencing showed that the majority of exons are
around 150 bp long, which corresponds to the length of DNA wound around a nucleosome.
2.1 Nucleosomes are preferentially positioned on exons and marked with specific
histone methylations
Kolasinska-Zwierz et al. first reported H3K36me3 enrichment on exons [34]. They obtained the same
results when analysing their own C. elegans data [34] and published human [28] and mouse [26] data.
They showed that H3K36me3 enrichment is not due to GC bias as H3K36me3 signal was
stronger in exons compared to introns across the whole range of % GC [34]. Highly expressed
exons were more enriched in H3K36me3, which is in line with the evidence that H3K36 is
methylated cotranscriptionally [81]. Interestingly, alternative exons were less marked with
H3K36me3 than constitutive exons. There was no such difference for other studied
methylations, thus tying H3K36 trimethylation to splicing.
Two studies, by Tilgner et al. and Schwartz et al., simultaneously reported higher nucleosome
occupancy on exons compared to adjacent introns [35; 36]. Several more papers followed
soon after [37; 38; 39]. All of them agreed on the essential finding, but they differed in other
conclusions; this is striking because they used largely the same raw ChIP-seq data (Table 1). They did,
however, differ in the subset of exons they finally chose to analyze, since very short and very
long exons behaved differently and brought noise into the data. Most of these studies
also examined the histone methylations enriched on exons. Again, they could not agree on which
specific methylations are enriched on exons, and some even argued that the apparent enrichment of
histone methylations is only a consequence of higher nucleosome occupancy.
authors | original data | species | nucleosome positions | GC content dependent | histone modifications | transcription correlated | alternative exons
------- | ------------- | ------- | -------------------- | -------------------- | --------------------- | ------------------------ | -----------------
Kolasinska-Zwierz et al. [34] | Barski et al. [28], Kolasinska-Zwierz et al. [34], Mikkelsen et al. [26] | C. elegans, mouse, human | not analysed | not analysed | H3K36me3 | yes | less marked
Andersson et al. [37] | Barski et al., Mikkelsen et al., Schones et al. [82], Valouev et al. [46], Wang et al. [83] | C. elegans, human, mouse | yes | no | H3K36me3, H3K79me1, H2BK5me1, H3K27me1, H3K27me2, H3K27me3 | yes | not analysed
Tilgner et al. [35] | Barski et al., Schones et al., Valouev et al. | C. elegans, human | yes | partially | no | not analysed | not analysed
Schwartz et al. [36] | Barski et al., Mikkelsen et al., Schones et al., Valouev et al., Wang et al. | C. elegans, mouse, human | yes | yes | no | yes | not analysed
Nahkuri et al. [39] | Barski et al., Schones et al., Sasaki et al. [84], Hammoud et al. [85] | human, Japanese killifish (Oryzias latipes) | yes | yes | not analysed | not analysed | not analysed
Spies et al. [38] | Barski et al., Sasaki et al., Schones et al. | human, Japanese killifish (Oryzias latipes) | yes | yes | H3K36me3, H3K27me2 | no | equally marked
Hon et al. [86] | Barski et al. | human | not analysed | not analysed | H3K36me3, H2BK5me1, H4K20me1 | yes | less marked
Table 1 Summary of the studies describing new links between chromatin and splicing. The columns from left
to right show the authors, the sources of raw data on nucleosome positioning and histone methylations, species
used, whether nucleosomes were found to be preferentially positioned on exons, whether it is GC content
dependent, histone modifications found to be enriched on exons, whether this enrichment correlates with
transcription, whether alternative exons are modified as much as constitutive exons.
Tilgner et al. used public data of nucleosome occupancy in C. elegans and human CD4+ T-cells and divided all the exons according to their splice site strength. The weakest 5 % of
exons showed higher enrichment of nucleosomes than the strongest 5 % of exons. To assess
sequence bias they also included pseudoexons in the analysis. Pseudoexons are exon-like
sequences in introns with strong splice sites that are never incorporated into the mature
mRNA. There was no enrichment of nucleosomes on pseudoexons; in fact, there was a weak
nucleosome depletion. By analyzing gene expression data from CD4+ T-cells [82], they
showed that nucleosome occupancy does not depend on transcription: non-transcribed genes
showed a pattern of nucleosomes similar to that of transcribed genes.
The study further asked to what extent higher GC content of exons is responsible for
higher nucleosome occupancy in exons, as it is known that nucleosomes prefer GC rich
sequences. The authors found that the profile of GC content in human exons is similar to the
profile of nucleosome occupancy, and even the weak exons have higher GC content than do
the strong exons. However, they also found arguments against influence of GC content.
Primarily, a set of pseudoexons and their neighbouring introns with the same GC content as
real exons had significantly lower nucleosome occupancy. Also, the correlation between GC
content and nucleosome occupancy was higher for pseudoexons (0.422) than for exons
(0.182). This suggests there are other factors acting on exons with a stronger influence on
nucleosome occupancy.
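The comparison of correlation coefficients described above can be illustrated with a minimal sketch. It assumes that a plain Pearson correlation between per-exon GC fraction and nucleosome occupancy is meant; the text does not state which coefficient Tilgner et al. used. The function name and the toy values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values (NOT data from the study): GC fraction and
# nucleosome occupancy for five sequence elements.
gc_fraction = [0.40, 0.45, 0.50, 0.55, 0.60]
occupancy = [1.0, 1.2, 1.1, 1.5, 1.4]

r = pearson(gc_fraction, occupancy)
print(f"GC vs. occupancy correlation: {r:.3f}")
```

Computing this coefficient separately for exons and for pseudoexons gives two numbers that can be compared in the way the paragraph describes.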
Tilgner et al. also reanalyzed the H3K36me3 enrichment on exons previously reported by
Kolasinska-Zwierz et al. [34]. After normalization of the methylation data to nucleosome
occupancy, they demonstrated that the H3K36me3 peak basically disappears. However, other studies
later showed that H3K36me3 enrichment is even bigger than nucleosome enrichment, and that the
H3K36me3 signature is still present after nucleosome occupancy is taken into account.
The major conclusion of Tilgner et al. is that nucleosome occupancy helps exon recognition
and inclusion into the mature transcript. Their conclusion was later supported by more data
from other studies, but the precise molecular mechanism of this influence is still unclear.
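The idea of normalizing a methylation profile to nucleosome occupancy can be sketched as follows. This is only an illustration under stated assumptions: the normalization is taken to be a position-by-position ratio, the pseudocount is an addition of this sketch, and all names and read counts are hypothetical. The actual procedure of Tilgner et al. may differ.

```python
def normalise_to_occupancy(mark_signal, nucleosome_signal, pseudocount=1.0):
    """Divide a histone-mark profile by the nucleosome profile, position by position.

    The pseudocount (an assumption of this sketch) avoids division by zero
    at positions without nucleosome reads.
    """
    if len(mark_signal) != len(nucleosome_signal):
        raise ValueError("profiles must have the same length")
    return [
        (mark + pseudocount) / (nucleosome + pseudocount)
        for mark, nucleosome in zip(mark_signal, nucleosome_signal)
    ]


# Hypothetical read counts for an intron / exon / intron window (not real data).
nucleosomes = [20, 60, 20]
mark_tracking_nucleosomes = [10, 30, 10]  # mark rises only as much as nucleosomes do
mark_extra_enriched = [10, 60, 10]        # mark rises more than nucleosomes do

# pseudocount=0.0 is safe here because no nucleosome count is zero.
flat = normalise_to_occupancy(mark_tracking_nucleosomes, nucleosomes, pseudocount=0.0)
peaked = normalise_to_occupancy(mark_extra_enriched, nucleosomes, pseudocount=0.0)
print(flat, peaked)
```

In the first toy profile the exon peak vanishes after normalization, which corresponds to the interpretation of Tilgner et al. In the second it persists, which corresponds to the later studies.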
Schwartz et al. used the same human CD4+ T-cell data for the main analysis but in some
cases reached somewhat different conclusions [36]. They observed a negative correlation
between gene expression and nucleosome occupancy: occupancy was lower in highly
expressed exons and higher in weakly expressed ones (Fig. 5b). They also analysed alternative
splicing and nucleosome occupancy. Less frequently included exons had lower nucleosome occupancy (Fig. 5f),
which supports the conclusion of Tilgner et al. Furthermore, they discovered an enrichment
of nucleosome-disfavoring sequences in introns close to the exon boundaries. In terms of
histone methylations, their results were consistent with the first study reporting the
enrichment of H3K36me3 on exons [34]. They found an increase of methylation with gene
expression and with inclusion of the exon into mRNA. But they suggested that the enrichment of
all four methylations found on exons (H3K36me3, H3K79me1, H4K20me1 and H2BK5me1)
is due to increased nucleosome occupancy around exons. To determine the influence of GC
content, they divided exons into groups according to their GC content and compared their
nucleosome occupancy. They found a strong correlation between GC content and nucleosome
occupancy of exons. In introns there was no such correlation. Finally, they mapped RNA
polymerase II presence. RNAPII is most likely the means of cross-talk between chromatin
and splicing. They found more RNAPII bound to exons than to introns, supporting the theory
that speed of RNAPII influences splicing [7]. Nucleosomes at exons could serve as ‘speed
bumps’ that slow down RNAPII and help exon inclusion.
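The grouping analysis mentioned above (exons divided into groups by GC content, or into five equally sized groups by expression, and compared by nucleosome occupancy) can be sketched roughly as below. The function name, the way ties and uneven group sizes are handled, and the toy values are assumptions of this sketch and are not taken from Schwartz et al.

```python
def mean_by_equal_groups(keys, values, n_groups):
    """Sort records by key, split them into n_groups near-equal groups,
    and return the mean value of each group (lowest keys first)."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if not 1 <= n_groups <= len(keys):
        raise ValueError("n_groups must be between 1 and the number of records")
    ordered = [v for _, v in sorted(zip(keys, values), key=lambda kv: kv[0])]
    means = []
    for g in range(n_groups):
        start = g * len(ordered) // n_groups
        end = (g + 1) * len(ordered) // n_groups
        group = ordered[start:end]
        means.append(sum(group) / len(group))
    return means


# Hypothetical toy values (not real data): per-exon GC fraction and occupancy.
gc_fraction = [0.6, 0.4, 0.5, 0.3]
occupancy = [4.0, 2.0, 3.0, 1.0]

print(mean_by_equal_groups(gc_fraction, occupancy, n_groups=2))
```

A mean occupancy that rises from the low-GC group to the high-GC group would indicate the kind of dependence reported for exons. Passing expression values as the keys with `n_groups=5` gives the quintile comparison instead.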
Fig. 5 (a) Higher GC content in constitutive exons; the window shows 2000 bp around the center of exons.
(b) Nucleosome occupancy on exons in T-cells. Exons are aligned by the 3′ (left) and 5′ (right) splice sites and
were divided into five equally sized groups according to their expression. (c) Nucleosome occupancy as in (b)
for 600 bp around the center of introns. (d) Nucleosome occupancy on exons in C. elegans. Exons are aligned by
the 3′ (left) and 5′ (right) splice sites. (e) Computational prediction of nucleosome occupancy on exons based on
software from [49]. Exons were divided into five equally sized groups according to their expression. (f)
Comparison of mean nucleosome occupancy in introns, alternatively spliced exons included in less than 50% of
transcripts, alternatively spliced exons included in more than 50% of transcripts and constitutive exons. Error
bars represent the s.e.m. Reproduced from [36].
Andersson et al., using the same data, noticed that nucleosomes on exons were not only
enriched but also well positioned, unlike nucleosomes on introns [37]. They computed the
average centers of the peaks at +94 bp from the exon start in human and +101 bp in C.
elegans. For a minority of exons of extreme sizes, this rule did not hold. The shortest exons
(less than 50 bp) were an exception because they are not enriched in nucleosomes at all. Also,
the longest exons (more than 500 bp) had a different nucleosome occupancy with peaks on
both ends. Both observations are understandable if we take into account that one nucleosome
binds 147 bp of DNA. We should consider the fact that exons shorter than 50 bp are included
into mRNA even without any signal from nucleosomes. Obviously other factors must play a
role in splicing as well. Furthermore, nucleosome occupancy at internal exons was
independent of transcription, although it changed with transcription at the beginning and end
of genes, as had been observed previously. Andersson et al. found several histone
methylations enriched on exons (H3K36me3, H3K79me1, H2BK5me1, H3K27me1,
H3K27me2, H3K27me3). Their signals changed with exon inclusion. Some correlated
positively (H3K36me3, H3K79me1, H2BK5me1) and some negatively (H3K27me2,
H3K27me3) with exon expression. H3K36me3 and H3K79me1 signals differed the most
between exons and introns, and these modifications showed the strongest correlation with exon
expression. The authors identified them as likely candidates for facilitating exon inclusion during
splicing.
Nahkuri et al. analyzed nucleosome positions in human sperm cells and blastulae of
Japanese killifish [84], in addition to the CD4+ T-cells [39]. They again saw elevated
nucleosome occupancy in exons in all datasets, and it was independent of gene expression
level and evolutionary conservation. The expression and conservation of exons did not change
with GC content, but GC content did influence nucleosome occupancy, which was higher on exons with
higher GC content. More importantly, the relative difference in nucleosome occupancy
between exons and introns was significant independently of GC content, and the shape of the
peak of nucleosome occupancy looked the same for all exons. Thus, lower GC content should
not affect exon recognition. The main contribution of Nahkuri et al. is the finding of the same
nucleosome enrichment in both somatic and germ cells, indicating that such enrichment is
preserved in the germ line [39].
Spies et al. worked mostly with data from human CD4+ T-cells, although they confirmed
the nucleosome enrichment on exons in Japanese killifish as well [38]. They explained this
enrichment by exonic nucleotide composition. They selected regions in introns and intergenic
regions that resembled exons in their nucleotide composition and found the same nucleosome
occupancy pattern as on real exons. Two of their findings nevertheless supported the hypothesis
of Tilgner et al. that nucleosomes help exon inclusion. Firstly, so-called isolated exons
(flanked by introns longer than 5 kbp) were more enriched in nucleosomes and histone
methylations than were clustered exons (flanked by 0.5 to 1 kbp long introns) (Fig. 6a and b).
Secondly, exons with weak splice sites had higher nucleosome occupancy than exons with
strong splice sites (Fig. 6c). As for histone methylations, H3K36me3 and H3K27me2 were
the only modifications significantly enriched on exons. Spies et al. calculated the correlation
of relative enrichment of some histone methylations on exons (exon:intron occupancy ratio)
between lowly and highly expressed genes. There was a significant correlation, indicating that
histone modifications are similarly enriched in all genes regardless of expression. This finding
is in conflict with previous observations. The authors also did not see significant differences
in H3K36me3 marks between constitutive and alternative exons.
Fig. 6 (a) Histone H3 methylations are more enriched on isolated exons (flanked by introns longer than 5
kbp, upper bar of each pair) than on clustered exons (flanked by introns between 0.5 and 1 kbp, lower bar).
CTCF is an insulator element and serves as a negative control. Error bars represent 95% confidence intervals. (b)
Nucleosome occupancy on isolated and clustered exons. (c) Nucleosome enrichment on exons correlates
negatively with strength of the 3′ splice site. Reproduced from [38].
In a study of chromatin signatures, Hon et al. confirmed H3K36me3 enrichment
on exons and newly reported H2BK5me1 and H4K20me1 enrichment on exons close to the 5′ end of
genes [86]. H3K36me3 enrichment, on the other hand, increased toward the 3′ end of genes,
which is consistent with previous observations. They also confirmed the positive correlation
of H3K36me3 with gene expression and with exon inclusion in alternative splicing.
2.2 DNA methylation is enriched on exons
DNA methylation is another epigenetic mark found to be enriched on exons compared to
introns. Hodges et al. were the first to discover this enrichment in a study that involved
quantitative mapping of DNA methylation in part of the human genome [87]. They also
wanted to show a relationship between DNA methylation and histone modifications. They
chose to perform ChIP-seq on two histone modifications: H3K36me3, a typical modification
of gene bodies and exons, and H3K4me2, which is associated with promoters and TSS. They
confirmed again the H3K36me3 preferential localization on exons and showed positive
correlation between DNA methylation and H3K36me3. For H3K4me2 they showed negative
correlation with DNA methylation [87]. A second study that came to similar conclusions
examined genome-wide changes of DNA methylation during cellular differentiation [88]. It
found not only markedly higher levels of DNA methylation on exons, but also a very sharp
peak of methylation at the 5′ splice site and a sharp plunge at the 3′ splice site. Its authors
hypothesize that the enrichment of DNA methylation is biased by the higher GC content of exons
compared to neighbouring introns [88]. This is very likely, but they do not analyze the matter
further.
Fig. 7 CpG methylation in 200 bp regions around the splice sites. It was higher on exons, with a sharp peak of
methylation at 5′ splice sites and a sharp plunge at 3′ splice sites, regardless of the differentiation state of the
cells (the blue curve represents human embryonic stem cells (hESC), the purple curve represents fibroblastic
derivatives of hESC and the green curve represents fibroblasts). Both DNA strands were methylated equally. Reproduced from [88].
3. Conclusions
The relationship between pre-mRNA splicing and chromatin organization is a rapidly
emerging field, and new (sometimes contradictory) data are appearing in the literature every
month. What can we say with certainty about chromatin structure on exons? Nucleosomes are
enriched and well positioned on exons across evolutionarily distant organisms. This
enrichment is independent of transcription, although some studies [36; 39] show a slight
decrease of nucleosome occupancy in highly expressed genes, probably due to the action of
RNA polymerase II. Nucleosome enrichment on exons can be partially explained by
nucleosome-favouring sequences and higher GC content in exons [36; 38; 39], and by
nucleosome-disfavouring sequences in introns [36]. But other factors are important in
nucleosome positioning, too. Tilgner et al. demonstrated this convincingly by comparing
nucleosome occupancy on exons and pseudoexons [35]. It seems likely that nucleosomes help
exon inclusion into mature mRNA. Nucleosome occupancy is higher in exons that need
splicing enhancement. Spies et al. showed that isolated exons and exons with weak splice
sites have higher nucleosome occupancy [38]. Schwartz et al. found low nucleosome
occupancy on alternative exons [36].
Unlike nucleosome positioning, histone methylations are strongly correlated with exon
expression. H3K36 trimethylation shows the strongest correlation and the biggest peak on exons.
This is the only histone methylation on which the studies broadly agree. The
H3K36me3 signal increases from the 5′ to the 3′ part of the gene [36]. Other methylations,
namely H3K79me1, H3K27me2, H2BK5me1 and H4K20me1, are also candidates for exon-specific
marks. This is difficult to prove genome-wide, perhaps because some of the
methylations are enriched only on some exons, such as H2BK5me1 and H4K20me1, which
were enriched in 5′ parts of genes [86], so that a genome-wide analysis contains too
much noise. Like nucleosome positioning, histone methylations appear to help exon inclusion into
mature mRNA. Isolated exons are more enriched in histone methylations than are
clustered exons [38]. Alternative exons are less marked with H3K36me3 than are constitutive
exons [34; 86], although the significance of this observation is still uncertain [38]. Spies et al.
hypothesize that histone modifications are good long-lasting marks that could control splicing
of alternative isoforms in a cell type-specific manner, or they could serve in immune memory
[38].
The first direct link between chromatin and splicing was discovered in 2006. It was shown
that a subunit of SWI/SNF chromatin remodeling complex influences alternative splicing and
binds components of the spliceosome [89]. Histone modifications were shown to influence
splicing as well. For example, the previously discussed work by Sims et al. revealed that
H3K4me3 binds spliceosomal components through the CHD1 protein [29]. Hyperacetylation of
H3K9 after neuronal cell depolarization was found to influence NCAM alternative
splicing [9]. Loomis et al. reported reduced binding of SR proteins to chromatin after
H3S10 phosphorylation [90]. Still, the discovery of nucleosomes and histone methylations
enriched on exons was unexpected. The authors of the original genome-wide nucleosome and
methylation data did not notice this enrichment. In fact, it took more than a year before it was
recognized. Moreover, the data we have now are only correlations, and the exact molecular
mechanism by which nucleosomes and histone methylations influence splicing is still
unknown (with one exception [91]).
There are basically two mechanisms proposed for how nucleosome positioning and histone
methylation influence splicing. First, nucleosomes function as ‘speed bumps’ and slow down
elongating RNA polymerase II. It is well documented that slowing down the elongation rate
helps exon inclusion [7]. The splicing machinery (SR proteins, U1 snRNP) associated with
RNAPII [92] has more time to assemble on exon boundaries and complete the splicing
reaction. This model is supported by higher nucleosome occupancy on exons with weak
splice sites and on isolated exons.
Fig. 8 Alternative splicing of the human FGFR2 gene. Alternative exons IIIb and IIIc are mutually exclusive
and tissue-specific. In human mesenchymal stem cells, H3K36me3 is recognized by MRG15, which recruits
PTB to intronic splicing silencers near exon IIIb, causing IIIb exclusion. Downregulation of the H3K36
methyltransferase or MRG15 promotes inclusion of the previously repressed exon IIIb. In epithelial cells, exon IIIb
inclusion is promoted by epithelial-specific RNA binding proteins (ESRPs), but overexpression of the H3K36
methyltransferase or MRG15 results in exon IIIb exclusion. Reproduced from [93] based on [91].
The H3K36 methyltransferase Setd2 binds to the phosphorylated CTD of elongating RNAPII
simultaneously with various splicing factors. Thus, elongation and splicing are undoubtedly
coupled. The second model, well suited to histone modifications, suggests that modified
nucleosomes recruit the splicing machinery and improve splicing efficiency. Very recently,
evidence for this model was published for one specific histone modification
(H3K36me3): it is recognized by the chromatin-binding protein MRG15 (MORF-related
gene on chromosome 15), and this adaptor recruits the splicing regulatory protein PTB
(polypyrimidine tract binding protein) (Fig. 8) [91]. Alternatively, Allemand et al.
propose direct binding between histones and pre-mRNA, which can be influenced by histone
modifications [94].
It was mentioned previously that the average length of human exons is 151 bp, which is
conspicuously similar to the 147 bp of DNA wound around one nucleosome, suggesting that
exon length is constrained by nucleosome binding. An older explanation called ‘exon
definition’ is equally plausible. It says that an exon length of around 150 bp is ideal for interactions
between splicing factors on the 3′ and 5′ splice sites. Tilgner et al. found evidence supporting the
theory of constraint by nucleosome binding [35]. In their dataset, exons with weak splice
sites, which were also more enriched in nucleosomes, were more constrained in length (mean
153 bp, s.d. 177 bp) than were exons with strong splice sites (mean 164 bp, s.d. 313 bp) [35].
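One simple way to compare how tightly the two exon groups are constrained in length is the coefficient of variation (standard deviation divided by mean). The short sketch below applies it to the summary values quoted above. Using the coefficient of variation is an assumption of this sketch, not necessarily the measure used by Tilgner et al.

```python
def coefficient_of_variation(mean, sd):
    """Relative spread of a distribution: standard deviation divided by mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sd / mean


# Summary statistics as quoted in the text from Tilgner et al. [35] (in bp).
weak_sites = coefficient_of_variation(mean=153, sd=177)
strong_sites = coefficient_of_variation(mean=164, sd=313)

print(f"weak splice sites:   CV = {weak_sites:.2f}")
print(f"strong splice sites: CV = {strong_sites:.2f}")
```

The relative spread of exon lengths is clearly smaller for exons with weak splice sites (about 1.16 against about 1.91), in line with the statement that their length is more constrained.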
Nucleosome occupancy provides an explanation for the previously inexplicable higher GC
content in exons. The older hypotheses interpreted higher GC content in exons as a result of
bias caused by the protein-coding function of exons. This cannot be correct because higher
GC content was observed also on exons of non-coding RNAs [35]. Moreover, not all exons
are translated, and some of them constitute the 5′ and 3′ untranslated regions. It is likely that
exonic sequences evolved to incorporate nucleosomes more effectively. Tilgner et al.
supported this by finding nucleosome enrichment on non-coding exons and lower GC content
in pseudoexons resulting in reduced nucleosome occupancy [35].
The enrichment of DNA methylation on exons is likely caused by higher GC content of
exons compared to surrounding introns. None of the studies that report higher DNA
methylation on exons discuss the counterintuitive fact that exons are enriched both in
nucleosomes and DNA methylation, even though DNA methylation is known to exclude
nucleosomes from a DNA sequence, as discussed previously. Presumably this effect is not
strong enough in vivo and is overcome by other nucleosome positioning signals. The precise
mechanism by which DNA methylation influences splicing is still unknown. It is probable,
however, that DNA methylation is coupled to splicing through RNA polymerase II
transcription, similarly to other epigenetic marks on exons.
Another role of nucleosomes on exons is protective. This is not mutually exclusive with
their role in splicing. Sasaki et al. reported that positions with higher nucleosome
occupancy had a higher rate of substitutions but a lower rate of insertions and deletions
compared to positions with lower nucleosome occupancy [84]. Insertions and deletions can cause
frameshift mutations, which are generally more severe than substitutions.
The relationship between nucleosome positioning, histone methylation and DNA
methylation is currently unknown, and it will be an interesting research topic in the future.
Soon we might expect successful splicing predictions based on chromatin structure and
perhaps the discovery of nucleosome enrichment or depletion on other functional elements in
the genome.
4. Literature
[1]S.M. Berget, C. Moore, P.A. Sharp, Spliced segments at the 5' terminus of adenovirus 2 late mRNA.
Proc Natl Acad Sci U S A 74 (1977) 3171-3175.
[2]F.W. Alt, A.L. Bothwell, M. Knapp, E. Siden, E. Mather, M. Koshland, D. Baltimore, Synthesis of
secreted and membrane-bound immunoglobulin mu heavy chains is directed by mRNAs that differ
at their 3' ends. Cell 20 (1980) 293-301.
[3]P. Early, J. Rogers, M. Davis, K. Calame, M. Bond, R. Wall, L. Hood, Two mRNAs can be produced
from a single immunoglobulin mu gene by alternative RNA processing pathways. Cell 20 (1980)
313-319.
[4]T.W. Nilsen, B.R. Graveley, Expansion of the eukaryotic proteome by alternative splicing. Nature 463
(2010) 457-463.
[5]E. Melamud, J. Moult, Stochastic noise in splicing machinery. Nucleic Acids Res 37 (2009) 4873-4886.
[6]S. Pandit, D. Wang, X.D. Fu, Functional integration of transcriptional and RNA processing machineries.
Curr Opin Cell Biol 20 (2008) 260-265.
[7]M. de la Mata, C.R. Alonso, S. Kadener, J.P. Fededa, M. Blaustein, F. Pelisch, P. Cramer, D. Bentley,
A.R. Kornblihtt, A slow RNA polymerase II affects alternative splicing in vivo. Mol Cell 12
(2003) 525-532.
[8]M. Allo, V. Buggiano, J.P. Fededa, E. Petrillo, I. Schor, M. de la Mata, E. Agirre, M. Plass, E. Eyras,
S.A. Elela, R. Klinck, B. Chabot, A.R. Kornblihtt, Control of alternative splicing through siRNA-mediated transcriptional gene silencing. Nat Struct Mol Biol 16 (2009) 717-724.
[9]I.E. Schor, N. Rascovan, F. Pelisch, M. Allo, A.R. Kornblihtt, Neuronal cell depolarization induces
intragenic chromatin modifications affecting NCAM alternative splicing. Proc Natl Acad Sci U S
A 106 (2009) 4325-4330.
[10]M.J. Munoz, M.S. Perez Santangelo, M.P. Paronetto, M. de la Mata, F. Pelisch, S. Boireau, K. Glover-Cutter, C. Ben-Dov, M. Blaustein, J.J. Lozano, G. Bird, D. Bentley, E. Bertrand, A.R. Kornblihtt,
DNA damage regulates alternative splicing through inhibition of RNA polymerase II elongation.
Cell 137 (2009) 708-720.
[11]K. Luger, A.W. Mader, R.K. Richmond, D.F. Sargent, T.J. Richmond, Crystal structure of the
nucleosome core particle at 2.8 Å resolution. Nature 389 (1997) 251-260.
[12]C. Jiang, B.F. Pugh, Nucleosome positioning and gene regulation: advances through genomics. Nat
Rev Genet 10 (2009) 161-172.
[13]V.G. Allfrey, R. Faulkner, A.E. Mirsky, Acetylation and Methylation of Histones and Their Possible
Role in the Regulation of RNA Synthesis. Proc Natl Acad Sci U S A 51 (1964) 786-794.
[14]C.R. Vakoc, S.A. Mandat, B.A. Olenchock, G.A. Blobel, Histone H3 lysine 9 methylation and
HP1gamma are associated with transcription elongation through mammalian chromatin. Mol Cell
19 (2005) 381-391.
[15]M. Lachner, D. O'Carroll, S. Rea, K. Mechtler, T. Jenuwein, Methylation of histone H3 lysine 9 creates
a binding site for HP1 proteins. Nature 410 (2001) 116-120.
[16]J.W. Edmunds, L.C. Mahadevan, A.L. Clayton, Dynamic histone H3 methylation during gene
induction: HYPB/Setd2 mediates all H3K36 trimethylation. EMBO J 27 (2008) 406-420.
[17]T. Kouzarides, Chromatin modifications and their function. Cell 128 (2007) 693-705.
[18]S.D. Taverna, H. Li, A.J. Ruthenburg, C.D. Allis, D.J. Patel, How chromatin-binding modules interpret
histone modifications: lessons from professional pocket pickers. Nat Struct Mol Biol 14 (2007)
1025-1040.
[19]B.D. Strahl, C.D. Allis, The language of covalent histone modifications. Nature 403 (2000) 41-45.
[20]D.J. Owen, P. Ornaghi, J.C. Yang, N. Lowe, P.R. Evans, P. Ballario, D. Neuhaus, P. Filetici, A.A.
Travers, The structural basis for the recognition of acetylated histone H4 by the bromodomain of
histone acetyltransferase gcn5p. EMBO J 19 (2000) 6141-6149.
[21]S. Maurer-Stroh, N.J. Dickens, L. Hughes-Davies, T. Kouzarides, F. Eisenhaber, C.P. Ponting, The
Tudor domain 'Royal Family': Tudor, plant Agenet, Chromo, PWWP and MBT domains. Trends
Biochem Sci 28 (2003) 69-74.
[22]H. Li, S. Ilin, W. Wang, E.M. Duncan, J. Wysocka, C.D. Allis, D.J. Patel, Molecular basis for site-specific read-out of histone H3K4me3 by the BPTF PHD finger of NURF. Nature 442 (2006) 91-95.
[23]M.B. Yaffe, K. Rittinger, S. Volinia, P.R. Caron, A. Aitken, H. Leffers, S.J. Gamblin, S.J. Smerdon,
L.C. Cantley, The structural basis for 14-3-3:phosphopeptide binding specificity. Cell 91 (1997)
961-971.
[24]K. Nishioka, S. Chuikov, K. Sarma, H. Erdjument-Bromage, C.D. Allis, P. Tempst, D. Reinberg, Set9,
a novel histone H3 methyltransferase that facilitates transcription by precluding histone tail
modifications required for heterochromatin formation. Genes Dev 16 (2002) 479-489.
[25]R. Schneider, A.J. Bannister, C. Weise, T. Kouzarides, Direct binding of INHAT to H3 tails disrupted
by modifications. J Biol Chem 279 (2004) 23859-23862.
[26]T.S. Mikkelsen, M. Ku, D.B. Jaffe, B. Issac, E. Lieberman, G. Giannoukos, P. Alvarez, W. Brockman,
T.K. Kim, R.P. Koche, W. Lee, E. Mendenhall, A. O'Donovan, A. Presser, C. Russ, X. Xie, A.
Meissner, M. Wernig, R. Jaenisch, C. Nusbaum, E.S. Lander, B.E. Bernstein, Genome-wide maps
of chromatin state in pluripotent and lineage-committed cells. Nature 448 (2007) 553-560.
[27]J.A. Rosenfeld, Z. Wang, D.E. Schones, K. Zhao, R. DeSalle, M.Q. Zhang, Determination of enriched
histone modifications in non-genic portions of the human genome. BMC Genomics 10 (2009) 143.
[28]A. Barski, S. Cuddapah, K. Cui, T.Y. Roh, D.E. Schones, Z. Wang, G. Wei, I. Chepelev, K. Zhao,
High-resolution profiling of histone methylations in the human genome. Cell 129 (2007) 823-837.
[29]R.J. Sims, 3rd, S. Millhouse, C.F. Chen, B.A. Lewis, H. Erdjument-Bromage, P. Tempst, J.L. Manley,
D. Reinberg, Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of
transcription postinitiation factors and pre-mRNA splicing. Mol Cell 28 (2007) 665-676.
[30]S.M. Yoh, J.S. Lucas, K.A. Jones, The Iws1:Spt6:CTD complex controls cotranscriptional mRNA
biosynthesis and HYPB/Setd2-mediated histone H3K36 methylation. Genes Dev 22 (2008) 3422-3434.
[31]M. Guttman, I. Amit, M. Garber, C. French, M.F. Lin, D. Feldser, M. Huarte, O. Zuk, B.W. Carey, J.P.
Cassady, M.N. Cabili, R. Jaenisch, T.S. Mikkelsen, T. Jacks, N. Hacohen, B.E. Bernstein, M.
Kellis, A. Regev, J.L. Rinn, E.S. Lander, Chromatin signature reveals over a thousand highly
conserved large non-coding RNAs in mammals. Nature 458 (2009) 223-227.
[32]T.Y. Roh, G. Wei, C.M. Farrell, K. Zhao, Genome-wide prediction of conserved and nonconserved
enhancers by histone acetylation patterns. Genome Res 17 (2007) 74-81.
[33]N.D. Heintzman, R.K. Stuart, G. Hon, Y. Fu, C.W. Ching, R.D. Hawkins, L.O. Barrera, S. Van Calcar,
C. Qu, K.A. Ching, W. Wang, Z. Weng, R.D. Green, G.E. Crawford, B. Ren, Distinct and
predictive chromatin signatures of transcriptional promoters and enhancers in the human genome.
Nat Genet 39 (2007) 311-318.
[34]P. Kolasinska-Zwierz, T. Down, I. Latorre, T. Liu, X.S. Liu, J. Ahringer, Differential chromatin
marking of introns and expressed exons by H3K36me3. Nat Genet 41 (2009) 376-381.
[35]H. Tilgner, C. Nikolaou, S. Althammer, M. Sammeth, M. Beato, J. Valcarcel, R. Guigo, Nucleosome
positioning as a determinant of exon recognition. Nat Struct Mol Biol 16 (2009) 996-1001.
[36]S. Schwartz, E. Meshorer, G. Ast, Chromatin organization marks exon-intron structure. Nat Struct Mol
Biol 16 (2009) 990-995.
[37]R. Andersson, S. Enroth, A. Rada-Iglesias, C. Wadelius, J. Komorowski, Nucleosomes are well
positioned in exons and carry characteristic histone modifications. Genome Res 19 (2009) 1732-1741.
[38]N. Spies, C.B. Nielsen, R.A. Padgett, C.B. Burge, Biased chromatin signatures around polyadenylation
sites and exons. Mol Cell 36 (2009) 245-254.
[39]S. Nahkuri, R.J. Taft, J.S. Mattick, Nucleosomes are preferentially positioned at exons in somatic and
sperm cells. Cell Cycle 8 (2009) 3420-3424.
[40]D.K. Pokholok, C.T. Harbison, S. Levine, M. Cole, N.M. Hannett, T.I. Lee, G.W. Bell, K. Walker,
P.A. Rolfe, E. Herbolsheimer, J. Zeitlinger, F. Lewitter, D.K. Gifford, R.A. Young, Genome-wide
map of nucleosome acetylation and methylation in yeast. Cell 122 (2005) 517-527.
[41]W. Lee, D. Tillo, N. Bray, R.H. Morse, R.W. Davis, T.R. Hughes, C. Nislow, A high-resolution atlas
of nucleosome occupancy in yeast. Nat Genet 39 (2007) 1235-1244.
[42]G.C. Yuan, Y.J. Liu, M.F. Dion, M.D. Slack, L.F. Wu, S.J. Altschuler, O.J. Rando, Genome-scale
identification of nucleosome positions in S. cerevisiae. Science 309 (2005) 626-630.
[43]F. Ozsolak, J.S. Song, X.S. Liu, D.E. Fisher, High-throughput mapping of the chromatin structure of
human promoters. Nat Biotechnol 25 (2007) 244-248.
[44]I. Albert, T.N. Mavrich, L.P. Tomsho, J. Qi, S.J. Zanton, S.C. Schuster, B.F. Pugh, Translational and
rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature
446 (2007) 572-576.
[45]T.N. Mavrich, C. Jiang, I.P. Ioshikhes, X. Li, B.J. Venters, S.J. Zanton, L.P. Tomsho, J. Qi, R.L.
Glaser, S.C. Schuster, D.S. Gilmour, I. Albert, B.F. Pugh, Nucleosome organization in the
Drosophila genome. Nature 453 (2008) 358-362.
[46]A. Valouev, J. Ichikawa, T. Tonthat, J. Stuart, S. Ranade, H. Peckham, K. Zeng, J.A. Malek, G. Costa,
K. McKernan, A. Sidow, A. Fire, S.M. Johnson, A high-resolution, nucleosome position map of C.
elegans reveals a lack of universal sequence-dictated positioning. Genome Res 18 (2008) 1051-1063.
[47]T.N. Mavrich, I.P. Ioshikhes, B.J. Venters, C. Jiang, L.P. Tomsho, J. Qi, S.C. Schuster, I. Albert, B.F.
Pugh, A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast
genome. Genome Res 18 (2008) 1073-1083.
[48]R.D. Kornberg, L. Stryer, Statistical distributions of nucleosomes: nonrandom locations by a stochastic
mechanism. Nucleic Acids Res 16 (1988) 6677-6690.
[49]N. Kaplan, I.K. Moore, Y. Fondufe-Mittendorf, A.J. Gossett, D. Tillo, Y. Field, E.M. LeProust, T.R.
Hughes, J.D. Lieb, J. Widom, E. Segal, The DNA-encoded nucleosome organization of a
eukaryotic genome. Nature 458 (2009) 362-366.
[50]S.C. Satchwell, H.R. Drew, A.A. Travers, Sequence periodicities in chicken nucleosome core DNA. J
Mol Biol 191 (1986) 659-675.
[51]E. Segal, Y. Fondufe-Mittendorf, L. Chen, A. Thastrom, Y. Field, I.K. Moore, J.P. Wang, J. Widom, A
genomic code for nucleosome positioning. Nature 442 (2006) 772-778.
[52]D. Tillo, T.R. Hughes, G+C content dominates intrinsic nucleosome occupancy. BMC Bioinformatics
10 (2009) 442.
[53]E.A. Sekinger, Z. Moqtaderi, K. Struhl, Intrinsic histone-DNA interactions and low nucleosome
density are important for preferential accessibility of promoter regions in yeast. Mol Cell 18 (2005)
735-748.
[54]D. Tillo, N. Kaplan, I.K. Moore, Y. Fondufe-Mittendorf, A.J. Gossett, Y. Field, J.D. Lieb, J. Widom,
E. Segal, T.R. Hughes, High nucleosome occupancy is encoded at human regulatory sequences.
PLoS One 5 (2010) e9129.
[55]S. Gupta, J. Dennis, R.E. Thurman, R. Kingston, J.A. Stamatoyannopoulos, W.S. Noble, Predicting
human nucleosome occupancy from primary sequence. PLoS Comput Biol 4 (2008) e1000134.
[56]G.C. Yuan, J.S. Liu, Genomic sequence is highly predictive of local nucleosome depletion. PLoS
Comput Biol 4 (2008) e13.
[57]R.T. Simpson, P. Kunzler, Chromatin and core particles formed from the inner histones and synthetic
polydeoxyribonucleotides of defined sequence. Nucleic Acids Res 6 (1979) 1387-1415.
[58]E.N. Trifonov, J.L. Sussman, The pitch of chromatin DNA is reflected in its nucleotide sequence. Proc
Natl Acad Sci U S A 77 (1980) 3816-3820.
[59]J. Widom, Role of DNA sequence in nucleosome stability and dynamics. Q Rev Biophys 34 (2001)
269-324.
[60]Y. Field, N. Kaplan, Y. Fondufe-Mittendorf, I.K. Moore, E. Sharon, Y. Lubling, J. Widom, E. Segal,
Distinct modes of regulation by chromatin encoded through nucleosome positioning signals. PLoS
Comput Biol 4 (2008) e1000216.
[61]K. Struhl, Naturally occurring poly(dA-dT) sequences are upstream promoter elements for constitutive
transcription in yeast. Proc Natl Acad Sci U S A 82 (1985) 8419-8423.
[62]V. Iyer, K. Struhl, Poly(dA:dT), a ubiquitous promoter element that stimulates transcription via its
intrinsic DNA structure. EMBO J 14 (1995) 2570-2579.
[63]K.J. Dechering, K. Cuelenaere, R.N. Konings, J.A. Leunissen, Distinct frequency-distributions of
homopolymeric DNA tracts in different genomes. Nucleic Acids Res 26 (1998) 4056-4062.
[64]M. Gencheva, S. Boa, R. Fraser, M.W. Simmen, C.B.A. Whitelaw, J. Allan, In Vitro and in Vivo nucleosome
positioning on the ovine beta-lactoglobulin gene are related. J Mol Biol 361 (2006) 216-230.
[65]Y. Zhang, Z. Moqtaderi, B.P. Rattner, G. Euskirchen, M. Snyder, J.T. Kadonaga, X.S. Liu, K. Struhl,
Intrinsic histone-DNA interactions are not the major determinant of nucleosome positions in vivo.
Nat Struct Mol Biol 16 (2009) 847-852.
[66]A. Stein, T.E. Takasuka, C.K. Collings, Are nucleosome positions in vivo primarily determined by
histone-DNA sequence preferences? Nucleic Acids Res 38 (2010) 709-719.
[67]O. Harismendy, P.C. Ng, R.L. Strausberg, X. Wang, T.B. Stockwell, K.Y. Beeson, N.J. Schork, S.S.
Murray, E.J. Topol, S. Levy, K.A. Frazer, Evaluation of next generation sequencing platforms for
population targeted sequencing studies. Genome Biol 10 (2009) R32.
[68]D. Nathan, D.M. Crothers, Bending and flexibility of methylated and unmethylated EcoRI DNA. J Mol
Biol 316 (2002) 7-17.
[69]D.B. Tippin, M. Sundaralingam, Nine polymorphic crystal structures of d(CCGGGCCCGG),
d(CCGGGCCm5CGG), d(Cm5CGGGCCm5CGG) and d(CCGGGCC(Br)5CGG) in three different
conformations: effects of spermine binding and methylation on the bending and condensation of
A-DNA. J Mol Biol 267 (1997) 1171-1185.
[70]M. Buttinelli, A. Minnock, G. Panetta, M. Waring, A. Travers, The exocyclic groups of DNA modulate
the affinity and positioning of the histone octamer. Proc Natl Acad Sci U S A 95 (1998) 8544-8549.
[71]C.S. Davey, S. Pennings, C. Reilly, R.R. Meehan, J. Allan, A determining influence for CpG
dinucleotides on nucleosome positioning in vitro. Nucleic Acids Res 32 (2004) 4322-4331.
[72]K.J. Polach, P.T. Lowary, J. Widom, Effects of core histone tail domains on the equilibrium constants
for dynamic DNA site accessibility in nucleosomes. J Mol Biol 298 (2000) 211-223.
[73]J.D. Anderson, P.T. Lowary, J. Widom, Effects of histone acetylation on the equilibrium accessibility
of nucleosomal DNA target sites. J Mol Biol 307 (2001) 977-985.
[74]I. Whitehouse, T. Tsukiyama, Antagonistic forces that position nucleosomes in vivo. Nat Struct Mol
Biol 13 (2006) 633-640.
[75]Y. Lorch, B. Maier-Davis, R.D. Kornberg, Mechanism of chromatin remodeling. Proc Natl Acad Sci U
S A 107 (2010) 3458-3462.
[76]G. Mizuguchi, X. Shen, J. Landry, W.H. Wu, S. Sen, C. Wu, ATP-driven exchange of histone H2AZ
variant catalyzed by SWR1 chromatin remodeling complex. Science 303 (2004) 343-348.
[77]M.S. Kobor, S. Venkatasubrahmanyam, M.D. Meneghini, J.W. Gin, J.L. Jennings, A.J. Link, H.D.
Madhani, J. Rine, A protein complex containing the conserved Swi2/Snf2-related ATPase Swr1p
deposits histone variant H2A.Z into euchromatin. PLoS Biol 2 (2004) E131.
[78]A.Y. Konev, M. Tribus, S.Y. Park, V. Podhraski, C.Y. Lim, A.V. Emelyanov, E. Vershilova, V.
Pirrotta, J.T. Kadonaga, A. Lusser, D.V. Fyodorov, CHD1 motor protein is required for deposition
of histone variant H3.3 into chromatin in vivo. Science 317 (2007) 1087-1090.
[79]T. Juven-Gershon, J.Y. Hsu, J.W. Theisen, J.T. Kadonaga, The RNA polymerase II core promoter - the
gateway to transcription. Curr Opin Cell Biol 20 (2008) 253-259.
[80]J.S. Beckmann, E.N. Trifonov, Splice junctions follow a 205-base ladder. Proc Natl Acad Sci U S A 88
(1991) 2380-2383.
[81]N.J. Krogan, M. Kim, A. Tong, A. Golshani, G. Cagney, V. Canadien, D.P. Richards, B.K. Beattie, A.
Emili, C. Boone, A. Shilatifard, S. Buratowski, J. Greenblatt, Methylation of histone H3 by Set2 in
Saccharomyces cerevisiae is linked to transcriptional elongation by RNA polymerase II. Mol Cell
Biol 23 (2003) 4207-4218.
[82]D.E. Schones, K. Cui, S. Cuddapah, T.Y. Roh, A. Barski, Z. Wang, G. Wei, K. Zhao, Dynamic
regulation of nucleosome positioning in the human genome. Cell 132 (2008) 887-898.
[83]Z. Wang, C. Zang, J.A. Rosenfeld, D.E. Schones, A. Barski, S. Cuddapah, K. Cui, T.Y. Roh, W. Peng,
M.Q. Zhang, K. Zhao, Combinatorial patterns of histone acetylations and methylations in the
human genome. Nat Genet 40 (2008) 897-903.
[84]S. Sasaki, C.C. Mello, A. Shimada, Y. Nakatani, S. Hashimoto, M. Ogawa, K. Matsushima, S.G. Gu,
M. Kasahara, B. Ahsan, A. Sasaki, T. Saito, Y. Suzuki, S. Sugano, Y. Kohara, H. Takeda, A. Fire,
S. Morishita, Chromatin-associated periodicity in genetic variation downstream of transcriptional
start sites. Science 323 (2009) 401-404.
[85]S.S. Hammoud, D.A. Nix, H. Zhang, J. Purwar, D.T. Carrell, B.R. Cairns, Distinctive chromatin in
human sperm packages genes for embryo development. Nature 460 (2009) 473-478.
[86]G. Hon, W. Wang, B. Ren, Discovery and annotation of functional chromatin signatures in the human
genome. PLoS Comput Biol 5 (2009) e1000566.
[87]E. Hodges, A.D. Smith, J. Kendall, Z. Xuan, K. Ravi, M. Rooks, M.Q. Zhang, K. Ye, A. Bhattacharjee,
L. Brizuela, W.R. McCombie, M. Wigler, G.J. Hannon, J.B. Hicks, High definition profiling of
mammalian DNA methylation by array capture and single molecule bisulfite sequencing. Genome
Res 19 (2009) 1593-1605.
[88]L. Laurent, E. Wong, G. Li, T. Huynh, A. Tsirigos, C.T. Ong, H.M. Low, K.W. Kin Sung, I. Rigoutsos,
J. Loring, C.L. Wei, Dynamic changes in the human methylome during differentiation. Genome
Res 20 (2010) 320-331.
[89]E. Batsche, M. Yaniv, C. Muchardt, The human SWI/SNF subunit Brm is a regulator of alternative
splicing. Nat Struct Mol Biol 13 (2006) 22-29.
[90]R.J. Loomis, Y. Naoe, J.B. Parker, V. Savic, M.R. Bozovsky, T. Macfarlan, J.L. Manley, D.
Chakravarti, Chromatin binding of SRp20 and ASF/SF2 and dissociation from mitotic
chromosomes is modulated by histone H3 serine 10 phosphorylation. Mol Cell 33 (2009) 450-461.
[91]R.F. Luco, Q. Pan, K. Tominaga, B.J. Blencowe, O.M. Pereira-Smith, T. Misteli, Regulation of
alternative splicing by histone modifications. Science 327 (2010) 996-1000.
[92]R. Das, J. Yu, Z. Zhang, M.P. Gygi, A.R. Krainer, S.P. Gygi, R. Reed, SR proteins function in coupling
RNAP II transcription to pre-mRNA splicing. Mol Cell 26 (2007) 867-881.
[93]K. Fox-Walsh, X.D. Fu, Chromatin: the final frontier in splicing regulation? Dev Cell 18 (2010) 336-338.
[94]E. Allemand, E. Batsche, C. Muchardt, Splicing, transcription, and chromatin: a menage a trois. Curr
Opin Genet Dev 18 (2008) 145-151.