Biochemical Society Transactions (2002) Volume 30, part 2 Splicing signals and factors in plant intron removal J.W. S. Brown', C. G. Simpson, G. Thow, G. P. Clark, S. N. Jennings, N. Medina-Escobar, S. Haupt, S. C. Chapman and K. J. Oparka Scottish Crop Research Institute, Invergowrie, Dundee DD2 SDA, Scotland, U.K. Abstract recruited to the 5' splice site where U1 small nuclear RNA (snRNA) base pairs with the premRNA. Branchpoint bridging protein (BBP/SFl) and U2 auxiliary factor of 65 kDa (U2AF65)bind co-operatively to the branchpoint and polypyrimidine tract, after which U2snRNP is recruited and U2snRNA base pairs with to the branchpoint sequence. Interactions between protein factors at the 5' splice site and branchpoint/poly-pyrimidine tract/3' splice site lead to formation of the commitment complex and spliceosome [l-31. I n addition, splice-site selection is modulated by other factors interacting with intron and exon splicing enhancer and silencer elements. Thus, splice-site selection in multiple intron pre-mRNA transcripts is co-ordinated through the activity of a number of splicing regulatory factors, including serine-arginine-rich (SR) proteins, that permit co-operative interaction between splice sites along the pre-mRNA transcript and lead to highly regulated splice-site selection [l-61. In the absence of plant nuclear in vitro splicing extracts, characterization of plant intron splicing signals such as branchpoint, polypyrimidine tract and UA-rich sequences, and the determination of the roles of putative splicing factors and RNA-binding proteins, has been extremely difficult. T h e requirement for strong splicing signals in the invertase mini-exon system makes it highly sensitive to mutation and provides a system to investigate, in detail, the sequence and spatial requirements of intron signals, and the effects of overexpression of splicing factors. This paper describes the utility of the mini-exon system in such characterizations and more general approaches to identifying proteins involved in RNA processing in plants. Constitutive splicing of the potato invertase miniexon 2 (9 nt long) requires a branchpoint sequence positioned around 50 nt upstream of the 5' splice site of the adjacent intron and a U,, element found just downstream of the branchpoint in the upstream intron [Simpson, Hedley, Watters, Clark, McQuade, Machray and Brown (2000) RNA 6, 4224331. T h e sensitivity of this in vivo plant splicing system has been used to demonstrate exon scanning in plants, and to characterize plant intronic elements, such as branchpoint and polypyrimidine tract sequences. Plant introns differ from their vertebrate and yeast couterparts in being UA- or U-rich (up to 85 yo UA). One of the key differences in splicing between plants and other eukaryotes lies in early intron recognition, which is thought to be mediated by UA-binding proteins. We are adopting three approaches to studying the RNA-protein interactions in plant splicing. First, overexpression of plant splicing factors and, in particular, UA-binding proteins, in conjunction with a range of mini-exon mutants. Secondly, the sequences of around 6 5 % of vertebrate and yeast splicing factors have high-quality matches to Arabidopsis proteins, opening the door to identification and analysis of gene knockouts. Finally, to discover plant-specific proteins involved in splicing and in, for example, rRNA or small nuclear RNA processing, green fluorescent protein-cDNA fusion libraries in viral vectors are being screened. Introduction Precursor mRNA (pre-mRNA) splicing requires the early recognition of pre-mRNA splicing signals. In metazoa, the 5' and 3' splice sites, and an internal branchpoint associated with a downstream poly-pyrimidine tract, are essential intron signals. Early in spliceosome assembly, U1 small nuclear ribonucleoprotein particle (snRNP) is The potato invertase mini-exon system Key words: mini-exon, pre-mRNA splicing, splicing factor. Abbreviations used: pre-mRNA, precursor mRNA; SnRNP, small nuclear ribonucleoprotein particle; SR, serine-arginine-rich; UBP, U-rich-bindingprotein; GFP, green fluorescent protein. 'To whom correspondence should be addressed (e-mail [email protected]). T h e potato invertase mini-exon is constitutively included in invertase transcripts and is dependent on strong constitutive splicing signals [7]. T h e key elements which determine mini-exon inclusion are a branchpoint sequence, an adjacent downstream U-rich region and the distance between 0 2002 Biochemical Society I46 RNA-Protein Machines these signals and the 5’ splice site downstream of the mini-exon. T h e model for mini-exon splicing is that exon-bridging interactions, between factors at the branchpoint/polypyrimidine tract in intron 1 and the 5’ splice site of intron 2, promote splicing of intron 2. This is then followed by removal of intron 1 (Figure la) [7]. T h e increased distance of the branchpoint/poly-pyrimidinetract from the mini-exon is needed to allow the exonbridging interactions to occur. Mutation of the splice sites, branchpoint or polypyrimidine tract, or reduction of the distance between the branchpoint/polypyrimidine tract and the miniexon, leads to increased or complete skipping of the mini-exon. T h e skipping phenotype is readily identifiable and quantifiable by reverse transcriptase PCR. is summarized by the sequences CIJRAY or YURAY, where the underlined U and A are essential nucleotides, and purines and pyrimidines are preferred nucleotides at positions 3 and 5 respectively. In position 1 , pyrimidines are preferred but also C is preferred to U (Figure lb). These results represent the first detailed characterization of a branchpoint in a plant intron. Furthermore, the preferred sequences match consensus branchpoint sequences of both vertebrate and plant introns [8,11]. Plants do not, in general, contain strong polypyrimidine tracts as found in vertebrate introns, but tend to have U-rich sequences between putative branchpoints and the 3’ splice site. U- or UArichness is a general feature of plant introns, essential for efficient splicing [12], and mutations to U-rich regions in a number of plant introns alter splicing efficiency and can activate cryptic splice sites ([13-171 for reviews; [7,18]). Although the position of U-rich sequences, adjacent and downstream of branchpoints, in plant invertase genes suggests that they may be polypyrimidine tracts, it was necessary to distinguish between a polypyrimidine tract function and that of a UAor U-rich intronic element required for efficient plant intron splicing. Mutational series of the U,, sequence in potato invertase intron 1, where increasing numbers of Us were replaced by Cs or As (‘C’ and ‘A’ series respectively), were tested Branchpoint and polypyrimidine tract sequences Various lines of evidence suggest that branchpoint sequences that share similarity with those of vertebrate introns may be important in splicing and splice-site selection in plants [8,9]. However, the contribution of sequence context to the strength of plant branchpoints has not been determined. A complete mutational analysis of the branchpoint sequence in intron 1 of the mini-exon system identified the relative requirements for particular nucleotides in every position [lo]. This Figure I Structure of potato invertase mini-exon and characterization of branchpoint and polypyrimidine tract (a) Model ofthe exon-definition process that promotes splicing ofthe 9 nt invertase mini-exon [7]. Exons are indicated as boxes and introns (IVS) by lines. Splice sites are shown as GU and AG. The branchpoint (A) and the associated polypyrimidine tract (pV; a U, sequence) is over 50 nt from the downstream 5’ splice site in all dicotyledon invertase genes isolated so far. (b) Mutational analysis of the branchpoint and polypyrimidine tract generates the CURAY branchpoint sequence, where the size of the letters represents the contribution of particular nucleotides to splicing. The minimum polypyrimidine sequence for efficient splicing is two groups of Us separated by three t o six pyrimidines. , Branchpoint CURAY Poiypytlmldlnetract I47 0 2002 Biochemical Society Biochemical Society Transactions (2002) Volume 30, part 2 for splicing of the mini-exon. Inclusion of > 2As drastically reduced splicing efficiency, while Cs could compensate to a large extent for Us [lo]. These results suggested that the U-rich region functioned as a polypyrimidine tract rather than a U-rich intronic element. T o function efficiently, the poly-pyrimidine tract had to contain a minimum of two groups of three or four Us up to 6 nt apart (Figure lb). Thus at least in some plant introns, polypyrimidine tracts are required for splicing, as is generally the case in vertebrate introns. U-rich intronic elements, were used in conjunction with overexpression of UBP-1 and other proteins, following co-transfection of tobacco protoplasts. Despite their similar affinities for U-rich sequences in vitro, differential splicing efficiencies were observed for each of the proteins with the C and A series of mutants (C. G. Simpson and J. W. S. Brown, unpublished work). This suggests that, although quite similar in structure and in some cases sequence, subtle differences exist among the proteins in terms of specificity of sequence binding and their effects on splicing of the mini-exon. U-rich-binding proteins (UBPs) Splicing factor orthologues in Archidopsis Since the initial realization that UA-rich sequences were required for efficient splicing in plants [12], UA- or U-rich-binding proteins have been postulated to interact with and identify intron sequences within a pre-mRNA transcript [13-171. RNA-binding proteins with affinity for U-rich sequences have been identified in Arabidopsis [ 19-21]. All have been shown to have high affinity for U-rich sequences in vitro and therefore represent candidates for proteins involved in early intron recognition. The most studied of these is UBP-1, which can enhance splicing of poorly spliced introns when overexpressed in protoplasts, but also appears to have a role in mRNA stability [20]. We have made use of the range of mutants in the mini-exon splicing system to begin to investigate differential interactions of such proteins. The A and C series of mutants, designed to distinguish between the polypyrimidine tract and A number of splicing factors have been isolated from plants. These include snRNP, SR and heterogeneous ribonucleoprotein particle (‘ hnRNP ’) proteins (see [17] for review). The Arabidopsis genome sequence has greatly aided the identification of splicing proteins that are conserved among eukaryotes. In an analysis of such proteins, splicing factors with significant homology to counterparts in yeast and vertebrates were identified for around 6 5 % of the eukaryotic proteins (C. G. Simpson and J. W. S. Brown, unpublished work). In some cases sequence similarity was striking, while in others similarity resided in only portions of the protein, such that identification was tentative. In particular, the majority of protein factors involved in mammalian branchpoint, polypyrimidine-tract and 3’-splice-site selection have well-conserved orthologues in Figure 2 Proteins identified to interact at the branchpoindpolypyrimidinetracd3’ splice site during early spliceosomal complex assembly in mammalian introns E, complex E, the earliest ATP-independent complex: A, complex A, the ATP-dependent pre-spliceosomal complex PI. The sizes of the human proteins in amino acids are given in brackets. Human and Arabidopsis orthologues were aligned using LALIGN (http://w.ch.embnet.ot@ and the percentage identity over overlapping regions is given (the size ofthe overlap is in brackets). UZAF, U2 auxiliary factor; mBBP, mammalian branchpoint binding protein; SAP, splicing associated protein. Complex E Human mBBP (639aa) U 2 A P (475aa) U 2 A P (UOM) p14 (12Saa) SAP166 (1-a) SAP145 (872aa) SAP130 (1217aa) SAP114 (793aa) SAW2 (-a) SAPS1 (SOlaa) SAP49 (424aa) Comolex A 0 2002 Biochemical Society I48 Arabidopsis 32% ( m a ) 40% (491aa) s6% (247aa) 64% (121aa) 68% (13osaa) 42% 1-( 58% (1219aa) 31$6 (773aa) 62% (m) 45% (Sl2aa) 6896 (378aa) RNA-Protein Machines Arabidopsis (Figure 2 ) . That many yeast and vertebrate splicing factors failed to identify similar proteins in Arabidopsis suggests that these genes and proteins have diversified in plants or that their functions have been taken over by other plant-specific proteins. T o begin to address the functions of identified genes, various tools are available : overexpression of proteins with miniexon mutants, transfer-DNA or transposon-gene knockouts in Arabidopsis, or virus-induced or RNA-induced gene silencing. such as the potato mini-exon, provide new and exciting opportunities to study constitutive and alternative splicing in plants. References I 2 3 4 5 6 Identification of nuclear proteins 7 Plant viral vectors have been developed for ectopic expression of RNA and proteins in plant cells. In particular, tobacco mosaic virus-based vectors have been designed for efficient, high-throughput expression in tobacco hosts. cDNA libraries have been constructed from Nicotiana benthamiana mRNA in tobacco mosaic virus vectors, where partial cDNAs are expressed as 3'- or 5'-green fluorescent protein (GFP) fusion proteins. Infection of leaves of tobacco plants produced many hundreds of individual infection sites per leaf, each of which represents a different fusion product. Lesions are rapidly screened by fluorescence microscopy and those showing nuclear or nucleolar localization of GFP are selected. Following isolation of the lesion and passaging through another infection cycle, the cDNA is cloned and sequenced. This allows large-scale screening of cDNAs for particular cellular localization. Thus far, GFP localization has been observed throughout the whole nucleoplasm, to putative nuclear pore structures, strands within the nucleus, nuclear speckles and the nucleolus (K. J. Oparka, unpublished work). T h e lack of an in vitro splicing extract from plants has hampered progress in understanding the nuances and subtleties of plant intron splicing in comparison with other eukaryotic systems. T h e identification of splicing factors conserved in plants and of novel plant proteins, the technologies for examining gene function in plant cells and sensitive in vivo splicing assay systems, 8 9 10 II I2 13 14 15 16 17 18 19 20 21 Kr'dmer, A. ( I 996) Annu. Rev. Biochem. 65, 367409 Reed, R (2000) C u r . Opin. Cell Biol. 12,340-345 Hastings, M. L. and Krainer, A. R. (2001) Cur. Opin. Cell Biol. 13,302-309 Black D. L. (1995) RNA I, 763-771 Blencow, B. J. (2000) Trends Biochern. Sci. 25, 106- I I0 Smith, C. W. J. and Valcirel, J. (2000) Trends Biochem. Sci. 25, 38 1-388 Simpson, C. G., Hedley, P. E., Watten, J. A,, Clark G. P., McQuade, C., Machray, G. C. and Brown, J. W. S. (2000) RNA 6,422433 Simpson, C. G., Clark. G. P.. Davidson, D., Smith, P. and Brown, J. W. S. ( I 996) Plant J. 9,369-380 Liu, H.-X. and Filipowicz, W. ( I 996) Plant J. 9,38 1-389 Simpson, C. G., Thow, G., Clark G. P., Jennings, S. N., Watten, J. A. and Brown, J.W. S. (2002) RNA. 8,47-56 Tolstrup, N., Rouze, P. and Brunak S. ( I 997) Nucleic Acids Res. 25, 3 I 59-3 I63 Goodall, G. J. and Filipowicz, W. ( I 989) Cell 58, 473483 Luehen. K. R and Walbot, V. ( I 994) Plant Mol. Biol. 24, 449463 Simpson. G. G. and Filipowicz, W. ( I 996) Plant Mol. Biol. 32,141 Brown, J. W. S. and Simpson, C. G. ( I 998) Annu. Rev. Plant Physiol. Plant Mol. Biol. 49,77-95 Schuler, M.A. ( I 998) in A Look Beyond Transcription: Mechanisms Determining mRNA Stability and Translation in Plants (Bailey-Serres, J. and Gallie, D., eds), pp. 1-19, American Society of Plant Physiologists, Rockeville, MD Lorkovit, Z. J., Kirk D. A. W., Larnbetmon, M. H. L. and Filipowicz, W. (2000) Trends Plant Sci. 5, 160-1 67 Latijnhouwen, M. J., Pairoba, C. F., Brendel, V.. Walbot, V. and Carle-Urioste, J, C. ( I 999) Plant Mol. Biol. 41,637-644 Gniadkowski, M., Hemmings-Mieszcak M., Klahre, U., Liu, H.-X. and Filipowicz, W. ( I 996) Nucleic Acids Res. 24, 6 19-627 Lambermon, M. H. L, Simpson, G. G., Wieczorek-Kirk D. A,, Hemmings-Mieszczak M., Klahre, U. and Filipowicz, W. (2000) EMBO J. 19, 1638- I 649 Lorkovit, Z. J., Wieczorek-Kirk D. A,, Klahre, U., Hemmings-Mieszczak M. and Filipowicz,W. (2000) RNA 6, 1610-1624 Received 2 I November 200 I I49 0 2002 Biochemical Society
© Copyright 2026 Paperzz