Plant Cell Advance Publication. Published on April 3, 2017, doi:10.1105/tpc.17.00020 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 RESEARCH ARTICLE 38 INTRODUCTION 39 The transcription start site (TSS) of genes transcribed by RNA polymerase II in 40 eukaryotes is thought to be primarily determined by the assembly of the pre-initiation 41 complex on recognizable sequences in the promoter (Danino et al. 2015; Thomas and 42 Chiang 2006; Vernimmen and Bickmore 2015). Because the term promoter in its 43 broadest sense includes all sequences that regulate expression, we will use the term 44 proximal promoter to refer to the sequences surrounding and immediately upstream of 45 the TSS. This process starts when the TATA-binding protein subunit of the general Intron DNA Sequences Can Be More Important Than the Proximal Promoter in Determining the Site of Transcript Initiation Jenna E Gallegos1 & Alan B Rose1* 1 Department of Molecular and Cellular Biology, University of California, Davis, 1 Shields Avenue, Davis, CA, 95616 *Corresponding Author: [email protected] Short title: Introns affect transcript initiation One-sentence summary: Introns boost mRNA accumulation without a proximal promoter and influence the transcription start site, indicating that some introns play a previously unrecognized role in controlling initiation. The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Alan B. Rose ([email protected]). ABSTRACT To more precisely define the positions from which certain intronic regulatory sequences increase mRNA accumulation, the effect of a UBIQUITIN intron on gene expression was tested from 6 different positions surrounding the transcription start site (TSS) of a reporter gene fusion in Arabidopsis thaliana. The intron increased expression from all transcribed positions, but had no effect when upstream of the 5'-most TSS. While this implies that the intron must be transcribed to increase expression, the TSS changed when the intron was located in the 5' UTR, suggesting that the intron affects transcription initiation. Remarkably, deleting 303 nt of the promoter including all known TSSs and all but 18 nt of the 5’-UTR had virtually no effect on the level of gene expression as long as an intron containing stimulatory sequences was included. Instead, transcription was initiated in normally untranscribed sequences the same distance upstream of the intron as when the promoter was intact. These results suggest that certain intronic DNA sequences play unexpectedly large roles in directing transcription initiation and constitute a previously unrecognized type of downstream regulatory element for genes transcribed by RNA polymerase II. ©2017 American Society of Plant Biologists. All Rights Reserved 1 46 transcription factor TFIID binds TATA box sequences 30-40 nt upstream of the TSS. 47 Proximal promoters often contain additional recognizable sequences including the 48 initiator, which encompasses the TSS, the TFIIB recognition element, and the 49 downstream promoter element, which, along with the initiator, binds to TFIID (Butler and 50 Kadonaga 2002; Thomas and Chiang 2006). However, there are no universally 51 conserved promoter elements, and even the TATA box is found in a minority of genes in 52 plants (Morton et al. 2014), yeast (Basehoar et al. 2004), and humans (Yang et al. 53 2007). The common heterogeneity in start sites within a single gene further suggests 54 that there is considerable flexibility in the sequences that can support transcription 55 initiation (Morton et al. 2014). Many genes rely on additional promoter elements 56 (usually less than 1 Kb from the TSS) and distal enhancer elements to regulate gene 57 expression (Vernimmen and Bickmore 2015). 58 59 There are many examples in which a fully intact promoter with all its transcription factor 60 binding sites is inactive unless one or more of the gene’s endogenous introns is present 61 (Callis et al. 1987; Emami et al. 2013; Palmiter et al. 1991). In other cases, the 62 expression of genes that are weakly active without an intron can be increased tenfold or 63 more by the addition of an intron (Callis et al. 1987; Clancy et al. 1994; Kempe et al. 64 2013; Morello et al. 2011; Rose 2002). Furthermore, introns can change the spatial 65 expression patterns of tissue-specific genes (Emami et al. 2013; Giani et al. 2009; 66 Jeong et al. 2006). These examples demonstrate that some introns significantly boost 67 expression, even in the absence of prior promoter activity. 68 69 In some instances, the mechanism through which an intron increases gene expression 70 is well understood. For example, introns can contain enhancers or alternative TSSs 71 (Deyholos and Sieburth 2000; Kim et al. 2006; Morello et al. 2002). In addition, 72 interactions between the splicing machinery and other factors involved in mRNA 73 synthesis and maturation, as well as the exon-junction complex proteins deposited on 74 the mRNA during splicing, assist in mRNA production, stability, export, and translation 75 (Dahan et al. 2011; Le Hir et al. 2001; Maniatis and Reed 2002; Wiegand et al. 2003). 76 2 77 Certain introns that stimulate gene expression exhibit properties that suggest they must 78 operate by a different and poorly understood mechanism. First, the varying abilities of 79 efficiently spliced introns to increase mRNA accumulation indicate that only certain 80 introns boost mRNA levels beyond the general effects caused by the splicing machinery 81 or exon junction complexes (Rose 2002). These introns are unlike enhancers because 82 they have no effect when inserted into the 3’ UTR (Callis et al. 1987; Jeong et al. 2006; 83 Rose 2002; Snowden et al. 1996) or upstream of the promoter (Callis et al. 1987; Jeon 84 et al. 2000). The increase in mRNA accumulation caused by specific introns only when 85 located in transcribed sequences near the TSS will be referred to here as intron- 86 mediated enhancement (IME) (Gallegos and Rose 2015). 87 88 The question of whether splicing is necessary for IME is difficult to address because 89 unspliced intron sequences that remain in the mRNA have numerous secondary effects, 90 mostly due to stop codons and short open reading frames that interfere with translation 91 and can cause mRNA decay. Tests of introns engineered to maintain the reading frame 92 when retained in the mRNA came to conflicting conclusions about the need for splicing 93 in IME (Akua et al. 2010; Clancy and Hannah 2002; Rose and Beliakoff 2000). 94 However, the ability of some modified introns to increase mRNA accumulation in the 95 absence of splicing suggests that the mechanism of IME is at least partially splicing 96 independent (reviewed in Gallegos and Rose 2015). 97 98 The properties of IME have been most thoroughly examined for the first intron of the 99 Arabidopsis (Arabidopsis thaliana) UBIQUITIN 10 (UBQ10) gene. The UBQ10 intron 100 increases mRNA accumulation over 12 fold, while an efficiently spliced intron from the 101 COLD-REGULATED 15a (COR15a) gene, which is nearly identical in length and GC 102 content, has a less than twofold effect on expression (Rose 2002; Rose et al. 2008). 103 Replacing segments of the COR15a intron with like-sized segments of the UBQ10 104 intron revealed that the sequences responsible for increasing mRNA accumulation are 105 redundant and dispersed throughout the UBQ10 intron (Rose et al. 2008), unlike the 106 discrete sequences to which transcription factors bind in promoters and enhancers. 107 Hybrid introns containing COR15a intron splice sites and an internal portion of the 3 108 UBQ10 intron in either orientation stimulated expression equally well (Rose et al. 2011), 109 suggesting that the effect on gene expression is caused by intron sequences in the 110 DNA, rather than the strand-specific RNA. Experiments varying the location of the 111 UBQ10 intron and the endogenous TRYPTOPHAN BIOSYNTHESIS 1 (TRP1) first 112 intron in TRP1:GUS coding sequences revealed that their ability to stimulate mRNA 113 accumulation declines when moved from roughly 250 nt to 550 nt downstream of the 114 major TSS, and is completely gone when 1100 nt or more from the major TSS (Rose 115 2004). 116 117 The observation that these introns must be near the TSS to exhibit IME inspired the 118 development of an algorithm known as the IMEter that assigns a score based on the 119 oligomer composition of a given intron compared to promoter-proximal introns genome- 120 wide (Rose et al. 2008). The utility of the IMEter in predicting the stimulating ability of 121 introns has been confirmed in Arabidopsis (Rose et al. 2008), soybeans (Zhang et al. 122 2015), and other angiosperms (Aguilar-Hernandez and Guzman 2013). 123 124 The intron sequences responsible for IME have proven difficult to identify because their 125 redundant and dispersed nature renders deletion analysis ineffective (Clancy and 126 Hannah 2002; Donath et al. 1995; Rose 2002; Rose et al. 2008). However, the strong 127 correlation between intron IMEter scores and their ability to increase expression allowed 128 a large number of potentially stimulating introns to be computationally searched for 129 over-represented sequences (Rose et al. 2008). A candidate motif, TTNGATYTG, was 130 shown to be sufficient for IME because it can convert the previously non-stimulating 131 COR15a intron into one that strongly boosts mRNA accumulation. Introns containing 132 either 6 or 11 copies of this motif, named COR15a6L and COR15a11L, increase mRNA 133 accumulation 15-fold and 24-fold respectively (Rose et al. 2016). This cannot be the 134 only sequence that mediates IME because an exact match to this motif occurs only 135 once in the UBQ10 intron, while stimulating sequences are distributed throughout this 136 intron (Rose et al. 2008). 137 4 138 While the ability of an intron to stimulate expression is known to vary with its location 139 within a gene, the exact positional requirements for IME have not been fully determined. 140 Furthermore, it is unknown if the stimulating ability of an intron continues to increase as 141 it gets closer to the TSS, or if introns have their maximum effect approximately 200 nt 142 downstream of the major TSS where genome-wide average intron IMEter scores peak 143 (Parra et al. 2011). Pinpointing the ideal location for a stimulating intron will help to 144 maximize gene expression in industrial and research applications. Additionally, the 145 differing effects of a stimulating intron at various locations could yield insight into the 146 mechanism of IME. 147 148 In this study, we completed the first gene-scale mapping of the effect of intron location 149 on IME by comparing the expression of TRP1:GUS fusions containing the UBQ10 intron 150 at different locations around the TSS. In the process, we discovered an unexpected role 151 for introns in determining the site of transcription initiation. 152 153 5 154 RESULTS 155 The UBQ10 intron has no effect on expression from upstream of the 5’-most TSS 156 To determine the 5’-limit from which an intron can stimulate expression by IME, the first 157 intron of UBQ10 was tested at six locations upstream of the normal second exon of a 158 TRP1:GUS reporter gene (designated positions 1-6 in Figure 1). Previously generated 159 transgenic TRP1:GUS plants with the first intron of UBQ10 at three additional locations 160 (Rose 2004), designated herein as positions 0, -1, and -2, were included in Figure 1 to 161 show the 3’ limits of intron position for IME. Positions 0, -1, and -2 were previously 162 designated “259”, “551”, and “1136” respectively according to their distance from the 163 most frequently used TSS. A different numbering system was used here because a 164 description of intron location relative to the TSS is complicated by the presence of 165 multiple TSSs in this gene. The main TSS is at -41 relative to the ATG, as determined 166 by primer extension and RNAse protection (Rose et al. 1992), RNAseq, and the 5’ ends 167 of ESTs and cDNAs in the TAIR database (https://www.arabidopsis.org/). There are 168 also a few minor TSSs, of which the one at -117 is the furthest from the start codon and 169 is the annotated TSS for this gene. 170 171 Introns inserted at position 3 should be in the 5’-UTR of all TRP1:GUS transcripts, while 172 position 4 is upstream of the major TSS but downstream of the 5’-most TSS. The intron 173 at position 5 is upstream of all known TSSs, and roughly midway between 174 the TRP1 start codon and the stop codon of the upstream gene. This gene (At5g17980) 175 is a 3.15 Kb intronless gene of which only 1.8 Kb from the 3’ end is present in 176 the TRP1 promoter fragment in the TRP1:GUS fusions. The intron at position 6 is in 177 coding sequences of this gene 198 nt from the stop codon. 178 179 To insert the intron, a PstI site was added to the 5’ end of the intron, and the last six 180 nucleotides of the intron were converted into a PstI site. The intron was then cloned as 181 a PstI fragment into a PstI site created by site-directed mutagenesis at the desired 182 location (Figure 1D) as described (Rose and Beliakoff 2000). Introns inserted as PstI 183 sites leave no extraneous nucleotides in the mRNA (Rose 2004), and are efficiently 184 spliced, presumably because the 5’ splice site is unchanged, and the 3’ PstI site 6 185 (CTGCAG) conforms to the major 3’ splice site consensus (NNNYAG) in dicots. 186 7 187 For intron positions in coding sequences (positions 1 and 2), there were no places 188 where PstI sites could be made without changing an amino acid, so sites were chosen 189 such that changes were unlikely to disrupt protein function. The PstI sites introduced at 190 positions 1 and 2 convert an isoleucine and a glycine respectively into alanines, all of 191 which are in the family of nonpolar amino acids. Further, the first two exons of the TRP1 192 gene encode the chloroplast transit peptide (Rose et al. 1992). Transit peptides are not 193 stringent in their sequence requirements and can usually tolerate moderate changes 194 without losing function (Inoue and Glaser 2014). Because transit peptides are cleaved 195 off as the protein is imported into the chloroplast, conservative changes here are 196 unlikely to affect the enzymatic activity of the mature GUS reporter. 197 198 To qualitatively assess the positional effect of the UBQ10 intron on expression of the 199 TRP1:GUS reporter, pooled T2 transgenic seedlings of unknown transgene copy 200 number were histochemically stained for GUS activity (Figure 1B). The UBQ10 intron 201 clearly stimulated expression from all four transcribed locations but not from either 202 position upstream of the TSS, where it may reduce expression. 203 204 To quantitatively compare expression of reporter genes containing the intron at all 205 transcribed positions, GUS enzyme activity and mRNA accumulation were measured in 206 single-copy transgenic Arabidopsis lines (Figure 1C and Supplemental Tables 1 and 2). 207 Expression levels between independent single-copy lines of each construct were 208 similar, indicating that, for this transgene, the site of integration had little effect. This lack 209 of importance of the site of integration has also been demonstrated for other TRP1:GUS 210 fusions (Rose 2004; Rose and Beliakoff 2000), and likely stems from the location of the 211 TRP1 promoter in the middle of the T-DNA, where it is isolated from flanking plant 212 sequences by at least 2 Kb on each side. 213 214 GUS enzyme assays and RNA gel blots both indicated that TRP1:GUS mRNA levels 215 were similar when the UBQ10 intron was located at position 0, 1, 2, 3, or 4 (Figure 1 216 and Supplemental Tables 1 and 2). The activity of the intron at position 4 was surprising 217 because this location is 43 nt upstream of the major TSS, meaning that most of the 8 218 transcripts were not predicted to contain the intron. Inserting 304 nt of sequence into the 219 proximal promoter was expected to reduce expression. 220 221 Intron placement alters TSS usage 222 To determine if any transcripts initiated within the intron, cDNA from pooled seedlings 223 was amplified by 5’-RACE, cloned, and sequenced. None of the sequenced products 224 contained any intron sequences, but they did reveal an effect on the site of 225 initiation. When the intron was located at positions 1 or 2 (within coding sequences of 226 the first exon), the 5’ ends of most 5’-RACE products, and presumably the TSS, 227 mapped to within 10 nt of the major TSS (Figure 2). However, when the intron was 228 located at position 3 (within the 5’-UTR), transcription began almost exclusively within 5 229 nt of the furthest upstream TSS. When the intron was located at position 4 (upstream of 230 the major TSS), transcription began at the 5’-most TSS or at locations further upstream 231 where initiation does not normally occur, but not at any site downstream of the intron 232 including the major TSS (Figure 2A). These results suggest that introns may affect the 233 site of transcription initiation and explain how the gene could still be highly expressed 234 when the intron was located upstream of the major TSS. 235 236 A substantial promoter deletion does not diminish IME 237 The observation that inserting the intron into the 5’-UTR changed the location at which 238 transcription predominantly began suggested that transcription does not always start a 239 fixed distance downstream of conserved sequences in the proximal promoter. To 240 determine the extent of the flexibility in sequences that can support initiation, 303 nt of 241 the TRP1 promoter were deleted from the TRP1:GUS fusion (from position 3 to 5 in 242 Figure 1). The deletion encompassed all previously recognized TSSs and all but 18 nt 243 of the 5’-UTR (Figure 1D). The deletion was created in reporter gene fusions containing 244 no intron, the non-stimulating COR15a intron, or one of three stimulating 245 introns: UBQ10, COR15a6L, or COR15a11L (derivatives of the COR15a intron 246 containing 6 or 11 copies of the TTNGATYTG motif). Remarkably, deleting the proximal 247 promoter did not obviously reduce gene expression, as determined by histochemical 248 staining for GUS activity in seedlings of unknown transgene copy number (Figure 3). To 9 249 quantify potential subtle differences in expression, mRNA levels were compared 250 in single-copy transgenic Arabidopsis lines. Deleting all known TSSs did not diminish 251 mRNA levels from TRP1:GUS fusions containing any of the three stimulating introns 10 252 (Figure 3B). However, deleting these promoter sequences reduced but did not eliminate 253 expression of the constructs containing the non-stimulating COR15a intron or no intron 11 254 (Figure 3B). The size of the mature mRNA was not significantly changed by the 255 promoter deletion regardless of which intron was present. 256 257 Stimulating introns activate alternative TSSs upstream of the promoter deletion 258 To determine where transcription was initiating in the deletion-containing constructs, 259 and to verify that transcription of the TRP1:GUS fusion was similar to that of the 260 endogenous TRP1 gene, cDNA from pooled seedlings was amplified by 5’-RACE, 261 cloned, and sequenced. For TRP1:GUS constructs containing the COR15a11L intron in 262 which the promoter was intact, the apparent TSSs mapped within the range of -114 to 263 +1 relative to the ATG (Figure 2B), consistent with the start sites of the endogenous 264 TRP1 gene. 265 266 When the promoter sequences were deleted, most of the apparent start sites from 267 constructs containing either the COR15a11L or the UBQ10 intron mapped upstream of 268 the deletion in normally untranscribed sequences (Figure 2B). Even though these 269 initiation sites were far upstream of the normal TSS, transcription was initiating a similar 270 distance upstream of the intron (223-311 nt) as it did when the promoter was intact. This 271 can be seen in the average length of the 5’-UTRs in the sequenced 5’-RACE clones, 272 which did not significantly differ between the promoter-deletion constructs containing 273 either the COR15a11L or UBQ10 intron, the intact promoter with the COR15a11L 274 intron, or the previously sequenced cDNAs and ESTs from the endogenous TRP1 gene 275 (Figure 2C). 276 277 To verify that deleting all known TSSs caused transcription to start in a new location, 278 RNA gel blots were probed with a 709 nt fragment spanning the PstI sites at positions 4 279 and 6 (Figure 3C). The probe hybridized much more strongly with mRNA from lines in 280 which the proximal promoter was deleted and a stimulating intron was present than with 281 mRNA from lines in which the promoter was intact regardless of whether or not an 282 intron was present. This finding confirms that transcripts derived from promoter deletion 283 constructs with stimulating introns contain upstream sequences that are not transcribed 284 when the proximal promoter is intact. The smaller band present in all lanes is of 12 285 unknown origin. 286 287 Expression requires at least some sequences from the intergenic region 288 To further test the limits of sequences that can support transcript initiation, a larger 289 deletion (spanning from position 3 to position 6 in Figure 1) was created in TRP1:GUS 290 fusions containing no intron or the UBQ10 intron. This deletion removed the entire 291 intergenic region from 18 nt upstream of the TRP1 ATG into coding sequences near the 292 3’-end of the upstream gene. Transgenic plants containing TRP1:GUS constructs with 293 this complete promoter deletion had undetectably low levels of GUS activity even when 294 a stimulating intron was included (Figure 3D). This result suggests that even though the 295 promoter sequences that can support initiation in response to a stimulating intron are 296 surprisingly flexible, there are some features of the TRP1 promoter, either sequences or 297 chromatin structure, that are absolutely required for expression. 298 299 13 300 DISCUSSION 301 The effect of intron position on gene expression 302 Including the results presented here, the effect of the UBQ10 intron on expression has 303 been measured from a total of 14 locations within the TRP1:GUS reporter gene. The 304 intron increased mRNA accumulation from only six positions near the start of the gene. 305 Of the six active locations, the effect of the intron on mRNA accumulation was greatest 306 at -18 from the start codon and least from +510. In practical applications where introns 307 are used to maximize gene expression, for both efficacy and ease it might be best to 308 insert the intron into the 5’-UTR near the start codon. 309 310 It was previously unclear whether introns could affect gene expression if they were 311 upstream of but near the TSS. In earlier studies in which an intron was placed upstream 312 of various promoters (Callis et al. 1987; Jeon et al. 2000), the distance between the 313 intron and the TSS was usually more than 550 nt, so the introns may have been too far 314 away to affect expression. While these experiments clearly demonstrated that introns 315 are unlike enhancer elements whose influence extends several kilobases, they did not 316 establish if introns must be transcribed to have an effect. Here we showed that the 317 UBQ10 intron clearly did not stimulate expression when located 204 nt upstream of the 318 5’-most TSS. This finding is consistent with the idea that introns must be transcribed to 319 affect expression, although this conclusion is complicated by the possibility that 320 stimulating introns might cause initiation upstream of themselves, as discussed below. 321 The inability of the intron to affect expression from untranscribed sequences does not 322 necessarily demonstrate a role for splicing, because the mechanism of IME could 323 involve direct interactions between the transcription machinery and the intronic DNA. 324 325 The effect of intron position on the location of the TSS 326 The finding that moving a stimulating intron into the 5’-UTR, or deleting all known TSSs, 327 causes transcription to begin in sequences that do not normally support initiation 328 suggests that start sites are not determined by proximal promoter sequences alone. 329 Further, there are no obviously conserved sequences between the various TSS’s 330 observed in this study (Figure 4). There must be additional conditions that can be 14 331 influenced by intron sequences to allow initiation at competent sites that are normally 332 inactive. 333 15 334 The transcription stimulated by UBQ10 intron sequences did not initiate a fixed distance 335 upstream of the intron, as relocating the intron in constructs with intact promoters did 336 not appreciably change the TSS unless the intron was inserted into the 5’-UTR (Figure 337 2). In yeast, splicing occurs during transcription very soon after the nascent RNA 338 emerges from RNA polymerase II (Carrillo Oesterreich et al. 2016); moving the intron 339 into the 5’-UTR might push the predominant site of initiation further upstream if there is 340 steric interference between the bulky spliceosome and the transcription machinery as it 341 assembles on the promoter for the next round of transcription. 342 343 The effect of proximal promoter deletions on IME 344 The sequences between positions 5 and 6 are capable of supporting initiation, while 345 those upstream of position 6 are not. The region between positions 5 and 6 may have 346 distinguishing features beyond specific sequence composition, such as a propensity to 347 form open chromatin. The DNase-sensitive region that includes the TRP1 promoter 348 does not extend as far upstream as position 6 (Figure 5). Further, the sequences 349 between positions 3 and 6 are noticeably more AT-rich than sequences upstream of 350 position 6 (Figure 5B), and histone associations are favored by GC-richness (Segal and 351 Widom 2009). 352 353 These results may help to explain why some promoters are completely dependent on 354 introns for expression. For some genes, the entire promoter may include regulatory 355 sequences in the first intron. A subset of these intron-dependent genes may even lack 356 proximal promoters in the traditional sense but rather rely on a cryptic downstream 357 promoter element within the intron that, unlike previously identified regulatory elements 358 for RNA polymerase II, specifically activates transcription a few hundred base pairs 359 upstream of itself. The ability of stimulating introns to override the tissue specificity of 360 promoters further suggests that the mechanism of IME does not depend on prior 361 promoter activity, and inherently leads to constitutive expression throughout the plant. 362 Intron-caused ubiquitous expression that is independent of normal promoter activity is 363 more consistent with a DNA-based than an RNA-based mechanism of IME. 364 16 365 If certain introns can drive expression in the absence of the normal TSSs, it is puzzling 366 that proximal promoter sequences have not been found more generally dispensable in 367 the large number of publications reporting promoter deletion analyses. One possible 17 368 explanation is that introns are rarely included in the genes used to assess promoter 369 activity. One exception is a study of the unc-54 gene of Caenorhabditis elegans in which 370 the expression of different versions of the gene was measured by their ability to rescue 371 an unc-54 null mutation (Okkema et al. 1993). While unc-54 genomic DNA rescued 372 efficiently, unc-54 cDNA under the control of the unc-54 promoter did not, 373 demonstrating a role for introns in controlling the expression of this gene. An intron- 374 containing but promoterless gene was able to rescue, and transcripts derived from this 375 construct initiated in the plasmid sequences newly fused upstream of the start codon. 376 This finding suggests that the first intron of unc-54 causes initiation upstream of itself, 377 although the possibility cannot be ruled out that some bacterial sequences can 378 fortuitously act as a promoter in C. elegans. 379 380 There is suggestive evidence from the ENCODE project that some introns might cause 381 initiation upstream of themselves in humans as well. Using unique EST ends as 382 genome-wide indicators of TSSs, initiation was observed not only at the annotated TSS 383 but also 250-300 nt upstream of the first intron in genes with long first exons 384 (Bieberstein et al. 2012). Additionally, genes with a naturally occurring intron very near 385 the TSS tend to be more actively transcribed, as demonstrated by association of these 386 genes with RNA Polymerase II and TFIID (Bieberstein et al. 2012). Thus, the apparent 387 ability of intron sequences to stimulate transcript initiation upstream of themselves and 388 for promoter proximal introns to increase gene expression may be conserved across 389 kingdoms. 390 391 Possible mechanisms of IME 392 The following speculation incorporates previous data from plants and other eukaryotes, 393 focusing largely on the UBQ10 intron. The observations that some introns increase 394 mRNA accumulation in the absence of splicing (Rose and Beliakoff 2000),that UBQ10 395 intron sequences increase expression to an equal degree in either orientation (Rose et 396 al. 2011), and that prior promoter activity is unnecessary for the UBQ10 intron to 397 activate expression (Emami et al. 2013) all point to a DNA-based mechanism of IME for 398 this intron. Genome-wide IMEter scores suggest that the expression of as many as 15% 18 399 of genes in Arabidopsis and other plants may be influenced by a similar mechanism 400 (Gallegos and Rose 2015). However, there are other known ways in which different 401 introns increase expression (Akua and Shaul 2013; Bianchi et al. 2013), and a single 402 intron could have multiple effects. 403 404 The main requirement for sequences to act as the promoter of intron-regulated genes 405 such as TRP1 might be a region of open chromatin that allows access of the 406 transcription machinery to the DNA. Within that region of open chromatin there may be 407 DNA sequences that favor initiation, but others can readily substitute when the preferred 408 sites are deleted. Intron sequences may also help to create the local chromatin state 409 necessary for initiation through the interaction between DNA sequences within the 410 intron and histone-modifying or nucleosome-positioning factors. Transcription from 411 many or most promoters may be inherently bidirectional (Bagchi and Iyer 2016), and 412 introns have been shown to decrease antisense transcription in yeast (Agarwal and 413 Ansari 2016). Introns also favor re-initiation by facilitating DNA looping that brings the 3’ 414 end of the gene, and thus the transcription machinery, in proximity to the promoter 415 (Moabbi et al. 2012). In short, stimulating introns may boost mRNA production by 416 creating local chromatin conditions that favor initiation, decreasing the proportion of 417 RNA polymerase II molecules that transcribe away from the gene, and positive 418 feedback that amplifies the accumulation of mRNA through re-initiation. 419 420 The limitations to the positions from which the UBQ10 intron is capable of stimulating 421 mRNA accumulation could be caused by a combination of the relatively short distances 422 (1 Kb or less) over which the postulated intron-driven changes in chromatin structure 423 extend, and the location of potential translational start codons (Gallegos and Rose 424 2015). The intron at all locations might increase transcription initiation upstream of itself, 425 but transcripts that initiate outside of the 220 nt window in which the TRP1 start codon is 426 the first ATG are likely to be unstable (Figure 5C). This might also explain why there 427 was a slight decrease in expression when the intron was located at position 5 or 6 if 428 transcription was initiating preferentially upstream of the intron. The apparent lack of 429 change in TSS when the UBQ10 intron is moved through TRP1:GUS coding sequences 19 430 is consistent with the idea that only transcripts that start within the 220 nt window 431 upstream of the ATG accumulate. 432 433 In conclusion, the ability of some introns to influence the location of the TSS and to 434 stimulate expression of a gene lacking its proximal promoter indicates that certain intron 435 sequences constitute a previously unrecognized type of downstream regulatory element 436 for genes transcribed by RNA polymerase II. These sequences can affect expression 437 from more than 500 nt downstream of the TSS, suggesting they are unlike the 438 downstream transcription factor binding sites found in genes transcribed by RNA 439 polymerase III. The inability of the same introns to stimulate expression from upstream 440 or more than 1100 nt downstream of the TSS also differentiates these sequence 441 elements from enhancers. Further characterization of the sequences that support 442 initiation in response to an intron, and of the role of chromatin structure, should lead to 443 better understanding of the mechanism through which these sequences increase gene 444 expression. 445 446 MATERIALS AND METHODS 447 Cloning of reporter gene fusions 448 The starting template for all constructs was an intronless TRP1:GUS fusion containing 449 2.4 Kb of sequence stretching from the 3’ end of the gene (At5g17980) upstream 450 of TRP1 through the first 8 amino acids of the 3rd exon of TRP1 fused to the E. coli 451 uidA (GUS) gene (Rose and Last 1997). PstI sites were introduced at 6 locations using 452 PCR mutagenesis, and confirmed by sequencing (Rose 2004). To delete promoter 453 sequences, a promoter fragment whose 3’ end was the PstI site at position 5 (in the 454 intergenic region) was ligated to an exon fragment whose 5’ end was the PstI site at 455 position 3 in the 5’-UTR, thereby deleting the sequences between positions 5 and 3. 456 The same procedure was used to generate the larger promoter deletion, except that the 457 distal fragment had the PstI site at position 6 (in the upstream gene). All fusions were 458 then cloned into the binary vector pEND4K, transformed into Agrobacterium 459 tumefaciens by electroporation, and then introduced into Arabidopsis thaliana ecotype 20 460 Columbia (Col) by floral dip as described (Rose and Beliakoff 2000). 461 462 GUS expression assays 463 For qualitative comparisons of expression (as in Figure 1B), T2 seedlings from multiple 464 lines for each construct were histochemically stained for GUS activity. The plate of 465 seedlings in buffer (10 mM EDTA, 100 mM NaPO4 pH 7.0, 0.1% Triton X-100) 466 containing 0.5 mg/mL 5-bromo-4-chloro-3-indolyl ß-D-glucuronic acid (Calbiochem, La 467 Jolla, CA, USA) was incubated at 37° for 30 minutes to 2 hours, the plants were rinsed 468 in water, and chlorophyll was removed with ethanol. Quantitative measurements of GUS 469 enzyme activity in leaf extracts were performed as described (Rose, 2004). 470 471 Identification of single-copy transgenic lines 472 Single-copy transgenic lines were identified for quantitative measurements 473 of GUS enzyme activity and mRNA levels. For each construct, 100 seeds from each of 474 18-72 T2 lines were screened for kanamycin resistance (Rose 2004). Genomic DNA 475 was extracted from lines that exhibited a 3:1 segregation ratio (kanamycin 476 resistant:sensitive) indicative of a single locus of transgene insertion. Gel blots of the 477 DNA digested with BamHI or PstI were probed with the GUS gene. All lines for which 478 both enzymes generated a single band were propagated to the T3 or T4 generation, and 479 homozygous plants were used for quantitative comparisons. Plants were grown in 480 Professional Growing Mix (Sun Gro Horticulture) at a density of 500 seeds per 150 cm2 481 pot in continuous light (~125 μmol/m2/s) under cool white fluorescent bulbs for 21 days 482 at 20°C. RNA was extracted from leaf tissue using the RNeasy Plant Mini kit (Qiagen) 483 following the manufacturer’s protocol. Expression levels were compared to the 484 analogous intronless control pAR281 (Rose and Beliakoff 2000). Other previously 485 generated constructs used in this study include TRP1:GUS fusions containing the 486 UBQ10, COR15a, COR15a6L, or COR51a11L intron at position 0 (Rose et al. 2008, 487 Rose et al. 2016), or the UBQ10 intron at position -1 or -2 (position 551 or 1136 in Rose 488 2004). 489 490 Mapping TSSs with 5’-RACE 21 491 The 5’ ends of TRP1:GUS mRNAs were mapped by 5’-RACE as described (Scotto- 492 Lavino et al. 2006). Briefly, RNA extracted from transgenic plants was reverse 493 transcribed (Reverse Transcription System; Promega, Madison, WI, USA) using 494 the GUS-specific primer OAR37 (5’-TAACGCGCTTTCCCACCAACG-3’). The resulting 495 cDNA was polyadenylated with terminal deoxynucleotidyltransferase and used as the 496 template for PCR with the primers OAR37, QT (5’- 497 CCAGTGAGCAGAGTGACGAGGACTCGAGCTCAAGCTTTTTTTTTTTTTTTTT-3’) and 498 QO (5’-CCAGTGAGCAGAGTGACG-3’). A dilution of the first round PCR product was 499 used as the template for a second round of PCR with the nested primers QI (5’- 500 GAGGACTCGAGCTCAAGC-3’) and OAR29 (5’-GGTTGGGGTTTCTACAGGACG-3’). 501 The 2nd round PCR product was cleaned up with a PCR Purification Kit (Qiagen, 502 Hilden, Germany), digested with restriction enzymes XhoI and BamHI, and ligated into 503 the vector pBluescript KS+. The DNA from transformants was screened by digestion 504 with SacI. Clones with inserts long enough to contain the SacI site that begins 24 nt 505 downstream of the TRP1 start codon were sequenced. 506 507 Accession numbers 508 Sequence data from this article can be found in the Arabidopsis Genome Initiative or 509 EMBL/GenBank data libraries under the following accession numbers: COR15a 510 (At2g42540), TRP1 (At5g17990), and UBQ10 (At4g05320). 511 512 Supplemental Data 513 Supplemental Table 1. Detailed mRNA expression data. 514 Supplemental Table 2. Detailed enzyme expression data. 515 516 517 Acknowledgements 518 This research was supported by funding from the UC Davis PI Bridge Program. Jenna E 519 Gallegos was supported by an NSF Graduate Research Fellows Program Grant 520 1148897. 22 521 522 Author Contributions 523 524 J.E.G. and A.B.R. jointly conceived the project, performed the experiments, and wrote the manuscript. 525 526 23 Parsed Citations Agarwal N, Ansari A (2016) Enhancement of Transcription by a Splicing-Competent Intron Is Dependent on Promoter Directionality PLoS Genet. 12:e1006047 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Aguilar-Hernandez V, Guzman P (2013) Spliceosomal introns in the 5' untranslated region of plant BTL RING-H2 ubiquitin ligases are evolutionary conserved and required for gene expression BMC Plant Biol. 13:179 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Akua T, Berezin I, Shaul O (2010) The leader intron of AtMHX can elicit, in the absence of splicing, low-level intron-mediated enhancement that depends on the internal intron sequence BMC Plant Biol. 10:93 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Akua T, Shaul O (2013) The Arabidopsis thaliana MHX gene includes an intronic element that boosts translation when localized in a 5' UTR intron J. Exp. Bot. 64:4255-4270 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Bagchi DN, Iyer VR (2016) The Determinants of Directionality in Transcriptional Initiation Trends Genet. 32:322-333 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Basehoar AD, Zanton SJ, Pugh BF (2004) Identification and distinct regulation of yeast TATA box-containing genes Cell 116:699-709 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Bianchi M, Crinelli R, Giacomini E, Carloni E, Radici L, Magnani M (2013) Yin Yang 1 intronic binding sequences and splicing elicit intron-mediated enhancement of ubiquitin C gene expression PLoS One 8:e65932 Bieberstein NI, Carrillo Oesterreich F, Straube K, Neugebauer KM (2012) First exon length controls active chromatin signatures and transcription Cell Rep. 2:62-68 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Butler JE, Kadonaga JT (2002) The RNA polymerase II core promoter: a key component in the regulation of gene expression Genes Dev. 16:2583-2592 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Callis J, Fromm M, Walbot V (1987) Introns increase gene expression in cultured maize cells. Genes Dev. 1:1183-1200 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Carrillo Oesterreich F, Herzel L, Straube K, Hujer K, Howard J, Neugebauer KM (2016) Splicing of Nascent RNA Coincides with Intron Exit from RNA Polymerase II Cell 165:372-381 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Clancy M, Hannah LC (2002) Splicing of the Maize Sh1 First Intron Is Essential for Enhancement of Gene Expression, and a T-Rich Motif Increases Expression without Affecting Splicing Plant Physiol. 130:918-929. Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Clancy M, Vasil V, Hannah LC, Vasil IK (1994) Maize Shrunken-1 intron and exon regions increase gene expression in maize protoplasts Plant Sci. 98:151-161 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Dahan O, Gingold H, Pilpel Y (2011) Regulatory mechanisms and networks couple the different phases of gene expression Trends Genet. 27:316-322 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Danino YM, Even D, Ideses D, Juven-Gershon T (2015) The core promoter: At the heart of gene expression Biochim. Biophys. Acta 1849:1116-1131 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Deyholos MK, Sieburth LE (2000) Separable whorl-specific expression and negative regulation by enhancer elements within the AGAMOUS second intron Plant Cell 12:1799-1810 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Donath M, Mendel R, Cerff R, Martin W (1995) Intron-dependent transient expression of the maize GapA1 gene Plant Mol. Biol. 28:667-676 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Emami S, Arumainayagam D, Korf I, Rose AB (2013) The effects of a stimulating intron on the expression of heterologous genes in Arabidopsis thaliana Plant Biotechnol. J. 11:555-563 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Gallegos JE, Rose AB (2015) The enduring mystery of intron-mediated enhancement Plant Sci. 237:8-15 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Giani S, Altana A, Campanoni P, Morello L, Breviario D (2009) In trangenic rice, a- and ß-tubulin regulatory sequences control GUS amount and distribution through intron mediated enhancement and intron dependent spatial expression Transgenic Res. 18:151162 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Inoue K, Glaser E (2014) Processing and degradation of chloroplast extension peptides. In: Theg SM, Wollman F-A (eds) Plastid Biology. Springer Science, New York, pp 305-323 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jeon JS, Lee S, Jung KH, Jun SH, Kim C, An G (2000) Tissue-preferential expression of a rice alpha-tubulin gene, OsTubA1, mediated by the first intron Plant Physiol. 123:1005-1014 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Jeong YM, Mun JH, Lee I, Woo JC, Hong CB, Kim SG (2006) Distinct roles of the first introns on the expression of Arabidopsis profilin gene family members Plant Physiol. 140:196-209 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Kempe K, Rubtsova M, Riewe D, Gils M (2013) The production of male-sterile wheat plants through split barnase expression is promoted by the insertion of introns and flexible peptide linkers Transgenic Res. 22:1089-1105 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Kim MJ, Kim H, Shin JS, Chung CH, Ohlrogge JB, Suh MC (2006) Seed-specific expression of sesame microsomal oleic acid desaturase is controlled by combinatorial properties between negative cis-regulatory elements in the SeFAD2 promoter and enhancers in the 5'-UTR intron Mol. Genet. Genomics 276:351-368 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Le Hir H, Gatfield D, Izaurralde E, Moore MJ (2001) The exon-exon junction complex provides a binding platform for factors involved in mRNA export and nonsense-mediated mRNA decay EMBO J. 20:4987-4997. Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Maniatis T, Reed R (2002) An extensive network of coupling among gene expression machines Nature 416:499-506. Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Moabbi AM, Agarwal N, El Kaderi B, Ansari A (2012) Role for gene looping in intron-mediated enhancement of transcription Proc. Natl. Acad. Sci. USA 109:8505-8510 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Morello L, Bardini M, Sala F, Breviario D (2002) A long leader intron of the Ostub16 rice beta-tubulin gene is required for highlevel gene expression and can autonomously promote transcription both in vivo and in vitro Plant J. 29:33-44 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Morello L, Giani S, Troina F, Breviario D (2011) Testing the IMEter on rice introns and other aspects of intron-mediated enhancement of gene expression J. Exp. Bot. 62:533-544 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Morton T et al. (2014) Paired-end analysis of transcription start sites in Arabidopsis reveals plant-specific promoter signatures Plant Cell 26:2746-2760 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Okkema PG, Harrison SW, Plunger V, Aryana A, Fire A (1993) Sequence requirements for myosin gene expression and regulation in Caenorhabditis elegans Genetics 135:385-404 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Palmiter RD, Sandgren EP, Avarbock MR, Allen DD, Brinster RL (1991) Heterologous introns can enhance expression of transgenes in mice Proc. Natl. Acad. Sci. USA 88:478-482 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Parra G, Bradnam K, Rose AB, Korf I (2011) Comparative and functional analysis of intron-mediated enhancement signals reveals conserved features among plants Nucleic Acids Res. 39:5328-5337 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB (2002) Requirements for intron-mediated enhancement of gene expression in Arabidopsis RNA 8:1444-1453 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB (2004) The effect of intron location on intron-mediated enhancement of gene expression in Arabidopsis Plant J. 40:744751 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Beliakoff JA (2000) Intron-mediated enhancement of gene expression independent of unique intron sequences and splicing Plant Physiol. 122:535-542 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Carter A, Korf I, Kojima N (2016) Intron sequences that stimulate gene expression in Arabidopsis Plant Mol. Biol. 92:337346 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Casselman AL, Last RL (1992) A phosphoribosylanthranilate transferase gene is defective in blue fluorescent Arabidopsis thaliana tryptophan mutants. Plant Physiol. 100:582-592 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Elfersi T, Parra G, Korf I (2008) Promoter-proximal introns in Arabidopsis thaliana are enriched in dispersed signals that elevate gene expression Plant Cell 20:543-551 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Emami S, Bradnam K, Korf I (2011) Evidence for a DNA-based mechanism of intron-mediated enhancement Front. Plant Sci. 2:98 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Rose AB, Last RL (1997) Introns act post-transcriptionally to increase expression of the Arabidopsis thaliana tryptophan pathway gene PAT1 Plant J. 11:455-464 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Scotto-Lavino E, Du G, Frohman MA (2006) 5' end cDNA amplification using classic RACE Nat. Protoc. 1:2555-2562 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Segal E, Widom J (2009) What controls nucleosome positions? Trends Genet. 25:335-343 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Snowden KC, Buchholz WG, Hall TC (1996) Intron position affects expression from the tpi promoter in rice Plant Mol. Biol. 31:689692 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Thomas MC, Chiang CM (2006) The general transcription machinery and general cofactors Crit. Rev. Biochem. Mol. Biol. 41:105178 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Vernimmen D, Bickmore WA (2015) The Hierarchy of Transcriptional Activation: From Enhancer to Promoter Trends Genet. 31:696708 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Wiegand HL, Lu S, Cullen BR (2003) Exon junction complexes mediate the enhancing effect of splicing on mRNA expression Proc. Natl. Acad. Sci. USA 100:11327-11332 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Yang C, Bolotin E, Jiang T, Sladek FM, Martinez E (2007) Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters Gene 389:52-65 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Zhang N, McHale LK, Finer JJ (2015) Isolation and characterization of "GmScream" promoters that regulate highly expressing soybean (Glycine max Merr.) genes Plant Sci. 241:189-198 Pubmed: Author and Title CrossRef: Author and Title Google Scholar: Author Only Title Only Author and Title Intron DNA Sequences Can Be More Important Than the Proximal Promoter in Determining the Site of Transcript Initiation Jenna E Gallegos and Alan B. Rose Plant Cell; originally published online April 3, 2017; DOI 10.1105/tpc.17.00020 This information is current as of June 17, 2017 Supplemental Data /content/suppl/2017/04/05/tpc.17.00020.DC1.html /content/suppl/2017/04/28/tpc.17.00020.DC2.html Permissions https://www.copyright.com/ccc/openurl.do?sid=pd_hw1532298X&issn=1532298X&WT.mc_id=pd_hw1532298X eTOCs Sign up for eTOCs at: http://www.plantcell.org/cgi/alerts/ctmain CiteTrack Alerts Sign up for CiteTrack Alerts at: http://www.plantcell.org/cgi/alerts/ctmain Subscription Information Subscription Information for The Plant Cell and Plant Physiology is available at: http://www.aspb.org/publications/subscriptions.cfm © American Society of Plant Biologists ADVANCING THE SCIENCE OF PLANT BIOLOGY
© Copyright 2026 Paperzz