Methods 43 (2007) 110–117
www.elsevier.com/locate/ymeth

Construction of small RNA cDNA libraries for deep sequencing

Cheng Lu a, Blake C. Meyers a, Pamela J. Green a,b,*

a Department of Plant and Soil Sciences, Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA
b College of Marine and Earth Sciences, University of Delaware, Newark, DE 19711, USA

Accepted 1 May 2007

* Corresponding author. Address: Department of Plant and Soil Sciences, Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711, USA. Fax: +1 302 831 3231. E-mail address: [email protected] (P.J. Green).
1046-2023/$ - see front matter © 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.ymeth.2007.05.002

Abstract

Small RNAs (21–24 nucleotides) including microRNAs (miRNAs) and small interfering RNAs (siRNAs) are potent regulators of gene expression in both plants and animals. Several hundred genes encoding miRNAs and thousands of siRNAs have been experimentally identified by cloning approaches. New sequencing technologies facilitate the identification of these molecules and provide global quantitative expression data in a given biological sample. Here, we describe the methods used in our laboratory to construct small RNA cDNA libraries for high-throughput sequencing using technologies such as MPSS, 454 or SBS.
© 2007 Elsevier Inc. All rights reserved.

Keywords: Small RNAs; miRNAs; siRNAs; High-throughput sequencing; 454

1. Introduction

Nearly all eukaryotes produce small RNAs (21–24 nucleotides) that function to silence genes by multiple mechanisms. miRNAs (generally 21–22 nt) are the most abundant type of small RNAs in most organisms. miRNAs originate from "hairpin" primary transcripts from one strand of distinct genomic loci by two rounds of endoribonuclease cleavage by RNase III-like enzymes. Another type of small RNAs, known as siRNAs (generally 22–24 nt), is similar in structure and function to miRNAs. siRNAs are processed from longer double-stranded RNA molecules and represent both strands of the RNA. In many organisms, such as plants, siRNAs are believed to originate from longer transcripts derived from transposons, repetitive sequences and transgenes [1–3].

The first and still most common approach to the discovery of small RNAs has been to clone and sequence individual small RNAs using traditional molecular methods. The majority of currently known miRNAs were identified by this approach. It was first used to identify miRNAs and siRNAs in mammals, Caenorhabditis elegans, Drosophila and Arabidopsis [4–8]. Small RNAs that are generated by RNase III have 5′ phosphate and 3′ hydroxyl termini in contrast to most RNA turnover products that have a 5′ hydroxyl terminus [9]. Different cloning protocols have been developed independently. Most of them require the presence of 5′ phosphate and free 3′ hydroxyl group on the small RNAs for adapter ligation. After reverse transcription, the cDNA is PCR-amplified using primers corresponding to the adapter sequences. The PCR products are cloned and sequenced. Based on published data, about 30–50% of the clones represent RNA turnover products of the abundant rRNAs, tRNAs, snRNAs [7,10]. The cloning frequency of an individual small RNA generally reflects its relative abundance in the sample, providing a quantitative expression measurement. Despite the early success of this approach, it is unlikely that these efforts are saturating for rare or tissue-specific small RNAs.

The identification and quantification of small RNAs using high-throughput sequencing methods was first accomplished in Arabidopsis by our lab [11]. More than 2 million small RNAs were sequenced by Massively Parallel Signature Sequencing (MPSS) [12] from Arabidopsis flowers and seedlings, yielding more than 70,000 genome-matching distinct sequences. This represented a significant advance over more traditional methods for small RNA identification. One limitation of MPSS is that it is only capable of sequencing the 5′ 17 nucleotides of small RNAs. We also pioneered use of an alternative approach for small RNA sequencing based on the "454" method of sequencing [13], a technology which produces longer sequence reads [12]. Recently, we reported the use of both MPSS and 454 to sequence small RNAs from different Arabidopsis mutant backgrounds [14,15]. Combined with genetic approaches, deep sequencing provides a powerful tool for the dissection and characterization of diverse small RNA populations and identification of low abundance miRNAs. This article describes the method used in our laboratory to make size-fractionated cDNA libraries that are used for high-throughput sequencing with parallel approaches. This method was originally developed for use with plants and MPSS.

Substantial progress has been reported for other next-generation sequencing technologies. Solexa, Inc. has developed a four-color DNA sequencing-by-synthesis (SBS) approach as a replacement for MPSS based on a novel, reversible, dye-termination chemistry (http://www.solexa.com). This approach can potentially generate >10 million 25–30 nt sequence tags with high accuracy. A different sequencing approach named Supported Oligo Ligation Detection (SOLiD) is being developed by Agencourt Personal Genomics (now a part of Applied Biosystems, Inc.). This method uses an array of microbeads each coated with a single DNA or cDNA fragment; a pool of fluorescent oligos is used to "read" the sequences by complementary binding using a repeated process of ligation, detection, and cleavage. This determines up to 50 nucleotides of sequence per bead, for >10 million beads. These novel, highly parallel methods have the potential to dramatically reduce the cost of sequencing and offer a much richer source of sequence information.
The method described here should be applicable to all of these forthcoming technologies.

2. Method

An overview of small RNA cloning and sequencing methods is schematically depicted in Fig. 1. First, low molecular weight (LMW) RNA is isolated from the tissue of interest. Next, small RNAs (20–30 nt) are purified from the LMW RNA fraction by polyacrylamide gel-based size fractionation and are ligated to a 5′ RNA adapter. To prevent self-ligation of small RNAs and self-ligation of the adapter, the 5′ terminus of the adapter has a hydroxyl group and an excess of adapter over small RNAs is used. Then, a 3′ adapter is ligated to the gel-purified product of the 5′ adapter ligation. The 3′ RNA adapter is modified to prevent circularization and self-ligation; typically, the 3′ hydroxyl is blocked by chemical synthesis of an oligonucleotide containing a 3′ non-nucleotidic group. After reverse transcription, a low number of PCR cycles is used to obtain a sufficient amount of template for sequencing. The PCR product can be cloned and sequenced with regular PCR cloning vectors. The quality of the small RNA cDNA libraries is usually assessed in a quality control ("QC") step by sequencing about 100 individual clones. Below, these steps are described in detail.

Fig. 1. An outline of how to make a small RNA cDNA library from total RNA samples. See text for details. Steps shown in the flowchart: total RNA isolation with Trizol; separation of LMW and HMW RNA; purification of small RNAs (20–30 nt); 5′ adapter ligation; purification of 1st ligation product; 3′ adapter ligation; purification of 2nd ligation product; reverse transcription; 18 cycles of PCR amplification; purification of 75 bp PCR product; cloning into pCRII-TOPO for QC analysis of ~100 colonies using traditional sequencing; 454, MPSS, or SBS sequencing.

3. Material and reagents

1. RNA isolation: Trizol reagent (Invitrogen 15596), chloroform, isopropanol, 75% ethanol, DEPC-treated water.
2. LMW and high molecular weight (HMW) RNA separation: 5 M NaCl, 50% PEG8000, 5 mg/ml glycogen (Ambion 9510).
3. RNA purification: 10× TBE, 2× formamide loading buffer (90% formamide, 1× TBE, xylene cyanol, and bromophenol blue), 10 bp DNA ladder (1 µg/µl) (Invitrogen 10821-015), 10% ammonium persulfate, TEMED, 40% acrylamide stock (Ambion 9022), 0.3 M NaCl, ethanol (EtOH), ethidium bromide, Spin-X filter (Corning 8162).
4. Adapter ligation: T4 RNA ligase 5 U/µl and 10× RNA ligase buffer (Ambion 2140), RNaseOUT 40 U/µl (Invitrogen 10777).
5. RT-PCR: Superscript II RT 200 U/µl (Invitrogen 18064), Taq DNA polymerase (Invitrogen 103420), dNTP mix (Invitrogen R725-01).
6. PCR product purification: 10 bp DNA ladder (1 µg/µl) (Invitrogen 10821-015), buffer saturated phenol (pH 7.9) (Ambion 9710), chloroform.
7. PCR cloning: TOPO TA Cloning (Invitrogen K450001), LB Broth Base (Invitrogen 12780), TOP10 One Shot Cells (Invitrogen K4500-40), X-gal (Invitrogen 15520-034), IPTG (Invitrogen 15529-019).
8. RNA oligos for RNA ligation (Dharmacon): 5′ RNA adaptor (5′ OH-GGU CUU AGU CGC AUC CUG UAG AUG GAUC-OH 3′), 3′ RNA adaptor (5′ pUAU GCA CAC UGA UGC UGA CAC CUG CidT 3′). (Note: p, phosphate; idT, inverted deoxythymidine. The exact sequences of the adapters can be changed based on specific needs. Both adaptors were purified by PAGE by Dharmacon.)
9. DNA oligo for reverse transcription: RT-primer (5′ CAA GCA GAA GAC CGC ATA CGA 3′).
10. DNA oligos for PCR amplification: 5′ PCR primer (5′ CAA GCA GAA GAC CGC ATA CGA 3′), 3′ PCR primer (5′ AAT GAT ACG GCG ACC ACC GA 3′).

4. Protocol

4.1. Low molecular weight (LMW) RNA isolation

(a) Harvest samples and immediately freeze in liquid nitrogen. Grind to a fine powder. As an example, the use of 3 g of seedling tissue ground using a mortar and pestle under liquid nitrogen will yield about 500 µg of total RNA.
(b) Isolate total RNA using Trizol reagent as indicated in the manufacturer's protocol. For the example tissue in (a), we would use 40 ml Trizol. For some recalcitrant tissues, we add an extra chloroform extraction.
(c) Dissolve total RNA in DEPC-treated water to a concentration of about 1 µg/µl. (Note: A yield of 200–400 µg of total RNA is typically what we use as starting material for the subsequent steps.)
(d) mRNA and rRNA (high molecular weight (HMW) RNAs) are precipitated by adding both 50% PEG (MW = 8000) to a final concentration of 5% and 5 M NaCl to a final concentration of 0.5 M.
(e) Mix well and put tube on ice for 30 min.
(f) Spin down at max speed in a microcentrifuge (12 K) for 10 min at 4 °C to pellet HMW RNAs. (Note: The pellet from this step can be dissolved in DEPC-treated water and used for regular Northern blots.)
(g) Transfer supernatant to a new microcentrifuge tube (this fraction contains the LMW RNAs) and add 2.5 volumes of 100% EtOH, mix well, and place at −20 °C for at least 2 h.
(h) Spin down at maximum speed for at least 30 min at 4 °C to pellet LMW RNAs.
(i) Carefully remove supernatant and wash pellet with 80% EtOH without dislodging. Allow the pellet to air dry, and dissolve the pellet in DEPC-treated water. (Note: 10 µl DEPC-treated water is typically used to resuspend LMW RNAs from 100 µg of total RNA.)

4.1.1. Comment

Most RNA isolation methods are based on either chemical extraction (e.g. Trizol) or immobilization on silica-based membrane (e.g. QIAGEN RNeasy Kit). The second method has been shown to work well for large RNAs (>200 nt). However, small RNAs cannot be recovered efficiently from silica-based purification. Based on our experience, total RNA isolated using Trizol reagent is usually free of protein and DNA contamination yet contains most of the small RNA. The small RNA enrichment step (HMW and LMW RNA separation) is recommended for library construction because a high level of rRNA and mRNA might increase background noise in libraries. These HMW RNAs can be precipitated using 5–10% PEG (MW = 8000). In the LMW fraction, RNA species ≤200 nt (including tRNAs) are highly enriched (Fig. 2a). Our preferred way to quickly analyze the abundance and integrity of the LMW RNA is to run an aliquot of the preparation on a 1.5% agarose gel (Fig. 2b).

Fig. 2. Gels illustrating different steps during small RNA cDNA library construction. (a) Total RNAs isolated with Trizol were run on a 1.2% agarose gel. (b) After separation by PEG precipitation, HMW (lane 1) and LMW (lane 2) RNAs were run on a 1.5% agarose gel. (c) The LMW fraction from 200 µg of total RNAs (lanes 1 and 2) was resolved on a 15% denaturing polyacrylamide gel and stained with ethidium bromide. Two bands can be detected in the range between 20 and 30 nt. The portion of gel within the rectangle was recovered. Ten microliters of 5′ (d) or 3′ (e) adapter ligation product (lane 1) was resolved on a 10% or 7.5% denaturing polyacrylamide gel, respectively, and stained with ethidium bromide. The portion of gel within the rectangle was recovered. (f) After PCR, 5 µl of PCR product (lane 1) was resolved on a 7.5% denaturing polyacrylamide gel and stained with ethidium bromide. A 75 bp band should be easily detected. Panel labels: (a) total RNA; (b) LMW RNA; (c) small RNAs; (d) 5′ adapter ligation; (e) 3′ adapter ligation; (f) PCR.

4.2. 17–27 Nucleotide small RNA purification from LMW RNA

(a) Prepare the glass and spacers (1.5 mm) for pouring the gel. (Mini-PROTEAN 3 System from Bio-Rad or other similar vertical electrophoresis systems with approximately 10 × 8 cm gel size can be used.)
(b) Prepare a 15% polyacrylamide/urea gel. Mix the components (9.6 g urea, 7.5 ml 40% acrylamide stock, 2 ml 10× TBE, DEPC-treated water to 20 ml) and warm the solution to 37 °C to dissolve the urea. Filter the solution through a nitrocellulose filter and cool to room temperature.
(c) Add 120 µl of a freshly prepared solution of 10% ammonium persulfate to the acrylamide solution. Mix well.
(d) Add 9.2 µl of TEMED. Mix the solution by swirling. Fill the space almost to the top. Lay the glass plates against a test-tube rack at an angle of 10°.
(e) Immediately insert the appropriate comb. Allow the acrylamide to polymerize for 30 min at room temperature.
(f) Remove the comb from the gel and rinse out the wells thoroughly with 1× TBE.
(g) Pre-run the gel for 15–30 min at 200 V, and wash the wells using 1× TBE. The gel is now ready for loading.
(h) Load as much as 10 µl LMW RNAs into each well as follows: add an equal volume of 2× loading dye to the RNA solution. Mix well by vortexing, heat the samples at 65 °C for at least 5 min to disrupt secondary structure and put on ice immediately.
(i) Load 3 µl of 10-bp ladder (+2× loading dye) into an unused lane. (Note: Denature the ladder the same way as the LMW RNAs, so the ladder will run as single stranded DNA.)
(j) Run the gel at 200 V for about 1 h, and stain the gel with 1× TBE/ethidium bromide (1 µg/ml) for 5 min.
(k) Cut out a plug of the gel corresponding to the band size of 20–30 nucleotides with a clean razor blade, put it into 2 ml tube and crush the gel thoroughly with a small pestle. (Note: As shown in Fig. 2c, two small RNA bands (21 and 24 nt) can be seen after ethidium bromide staining from most plant samples.)
(l) Add two volumes (250 µl per lane) of 0.3 M NaCl to the tube, and elute the RNA by rotating the tube gently at room temperature for at least 4 h (to overnight).
(m) Transfer the eluate and the gel debris onto the top of a Spin-X filter, and spin at full speed for 1 min in a microcentrifuge at RT.
(n) Add 2.5 volumes 100% EtOH and 3 µl glycogen to the filtrate (the eluted sample), mix well, and incubate at −80 °C for at least 2 h.
(o) Spin down at maximum speed at 4 °C for 30 min in a microcentrifuge.
(p) Carefully remove the supernatant and wash the pellet with 80% EtOH without dislodging. Allow the RNA pellet to air dry, then dissolve the RNA in 10 µl of DEPC-treated water.

4.3. 5′ Adaptor ligation and purification

(a) The 5′ adaptor ligation reaction is carried out in a 10 µl reaction containing 5 µl purified small RNAs, 2 µl 5′ RNA adaptor (20 µM stock concentration), 1 µl 10× RNA ligase buffer, 2 µl T4 RNA ligase. Incubate the reaction at room temperature for 6 h.
(b) Stop the reaction with 10 µl 2× loading dye. Heat the sample in loading buffer at 65 °C for 15 min prior to loading.
(c) Prepare a 10% denaturing polyacrylamide gel as described in Section 4.2, steps (b)–(g), except using the following gel recipe: 9.6 g urea, 5 ml 40% acrylamide stock, 2 ml 10× TBE, DEPC-treated water to 20 ml.
(d) Load the entire 5′ adapter reaction into one well.
(e) Load 2 µl of 10-bp ladder (+2× loading dye) into an unused lane.
(f) Run the gel at 200 V for about 1 h, and stain the gel with 1× TBE/ethidium bromide (1 µg/ml) for 5 min.
(g) Cut out a plug of the gel corresponding to a band size of 50–65 nucleotides with a clean razor blade, put it into a 2 ml tube and crush, elute, and precipitate as described in Section 4.2, steps (i)–(p), and dissolve the RNA in 5 µl of DEPC-treated water.

4.3.1. Comment

RNA adapters are usually supplied in a dehydrated form. We recommend dissolving them at 200 µM in DEPC-treated water. Then prepare a 20 µM working solution for ligation reactions. All the solutions should be aliquoted and stored at −80 °C.

T4 RNA ligase catalyzes the ATP-dependent intra- and intermolecular formation of phosphodiester bonds between 5′-phosphate and 3′-hydroxyl termini of oligonucleotides, single-stranded RNA and DNA. Several T4 RNA ligases from different companies have been tested in our lab. We found that Ambion's T4 RNA ligase gave the most efficient and reliable results under our reaction conditions. The cloning frequency of individual small RNAs usually reflects their expression level. However, one possible source of bias in the ligation reaction is differential ligation efficiency toward the ends of various small RNAs. As shown in Fig. 2d, the 5′ ligation product is generally invisible with ethidium bromide staining. Based on the size of the 5′ adapter, recover the region of the gel corresponding to the expected product size. Because a large molar excess of 5′ RNA adapter is used in the reaction, the unligated adapters are readily detectable in the gel (Fig. 2d).

4.4. 3′ Adaptor ligation and purification

(a) The 3′ adaptor ligation reaction is carried out in a 10 µl reaction containing 5 µl purified 5′ adapter ligation product, 2 µl 3′ RNA adaptor (20 µM), 1 µl 10× RNA ligase buffer, 2 µl T4 RNA ligase. The reaction is incubated at room temperature for 6 h.
(b) Stop the reaction with 10 µl 2× loading dye. Heat the sample in loading buffer at 65 °C for 15 min prior to loading.
(c) Prepare a 7.5% denaturing polyacrylamide gel as described in Section 4.2, steps (b)–(g), except using the following gel recipe: 9.6 g urea, 3.75 ml 40% acrylamide stock, 2 ml 10× TBE, DEPC-treated water to 20 ml.
(d) Load the entire 3′ adapter ligation reaction into one well.
(e) Load 2 µl of 10 bp ladder (+2× loading dye) into an unused lane.
(f) Run the gel at 200 V for about 1 h, and stain the gel with 1× TBE/ethidium bromide (1 µg/ml) for 5 min.
(g) Cut out a plug of the gel corresponding to the band size of 70–90 nucleotides with a clean razor blade, put it into a 2 ml tube and crush, elute, and precipitate as described in Section 4.2, steps (i)–(p), and dissolve the RNA in 5 µl of DEPC-treated water.

4.5. RT-PCR of small RNAs ligated with adapters

(a) Five microliters of purified ligation product is incubated with 3 µl of 100 µM RT-primer and 3 µl DEPC-treated water at 65 °C for 10 min. Spin down to cool.
(b) Add the following components on ice, in order: 6 µl of 5× first strand buffer, 5.5 µl of 2 mM dNTP mix, 3 µl of 100 mM DTT, 1.5 µl RNaseOUT, and 3 µl Superscript II RT (200 U/µl).
(c) Incubate at 45 °C for 1 h, followed by a final 5 min incubation at 90 °C to inactivate the enzyme.
(d) PCR is carried out in twelve 50 µl reaction tubes, each containing 5 µl 10× PCR buffer, 1.5 µl of 50 mM MgCl2, 1 µl of 10 mM dNTPs, 0.5 µl of 100 µM 5′ PCR primer, 0.5 µl of 100 µM 3′ PCR primer, 1 µl Taq polymerase (5 U/µl), 1 µl of RT reaction mixture.
(e) The reactions are incubated at 94 °C for 1 min, and then cycled 18 times at 94 °C for 45 s, 55 °C for 45 s, and 72 °C for 45 s. This is followed by a 3 min incubation at 72 °C.
(f) Check the reaction on a 7.5% denaturing polyacrylamide gel as follows: remove 5 µl from the PCR reaction, add 2× loading dye, and heat the sample well before loading alongside the 10 bp ladder. A good smear in the 75 nt size range can be seen with ethidium bromide staining (Fig. 2f).

4.6. PCR product purification

(a) Add an equal volume of Tris buffer (pH 7.9)-saturated phenol:chloroform (1:1) to the PCR reaction.
(b) Mix well by vortexing for 30 s, and spin in a microcentrifuge for 3 min at max speed.
(c) Carefully remove the aqueous layer to a new tube.
(d) To remove traces of phenol, add an equal volume of chloroform to the aqueous layer.
(e) Mix well by vortexing for 30 s, and spin in a microcentrifuge for 3 min at maximum speed.
(f) Transfer the aqueous layer to a new tube.
(g) Measure the volume of the DNA sample. Adjust the salt concentration by adding 1/10 volume of 3 M sodium acetate, pH 5.2. Mix well. Add 2.5 volumes of cold 100% ethanol (calculated after salt addition). Mix well.
(h) Place at −20 °C for at least 2 h.
(i) Spin at maximum speed in a microcentrifuge for 20 min.
(j) Carefully remove the supernatant and wash the pellet with 75% EtOH without dislodging. Allow the pellet to air dry.
(k) Resuspend the pellet in 120 µl H2O. Add 24 µl of 6× loading dye. Mix well.
(l) Load the entire sample into eight wells of a 10% TBE-polyacrylamide gel, along with 10 bp DNA ladder as marker, and run the gel at 150 V for 60 min.
(m) Cut out the product band (75 bp) with a clean razor blade, put it into a 2 ml tube and crush, and precipitate it as described in Section 4.2, steps (i)–(p), except add 2 volumes of cold 100% EtOH, and incubate at −20 °C for at least 2 h.
(n) Spin at maximum speed at 4 °C for 30 min. Wash the pellet with 0.5 ml room temperature 70% EtOH. Vacuum or air dry the pellet, and dissolve it in 12 µl of H2O. (Note: For 454 sequencing, 1 µg of purified PCR product at 100 ng/µl is required.)

4.6.1. Comments

To get enough cDNA for sequencing and to maintain quantitative information as well, 15–20 PCR cycles are usually used for amplification. For plant samples (200 µg total RNA), 18 PCR cycles can generate 1.5 µg of purified 75 bp product. Very often, a 50 bp band is visible in the gel (Fig. 2f). This band is generated from the adapter ligation product without small RNA inserts. Because most PCR purification kits have a poor recovery efficiency for small sized double-stranded DNA, gel purification should be carried out.

4.7. Cloning into pCRII-TOPO for quality control (QC)

(a) Gel-purified PCR products (0.2 µl) are incubated with 4 µl sterile water, 1 µl of 1.2 M NaCl and 1 µl pCRII-TOPO vector at room temperature for 10 min.
(b) Transfer 2 µl of each reaction into separate vials of One Shot cells.
(c) Spread 10–50 µl of each transformation mix onto LB plates containing 50 µg/ml kanamycin and X-gal/IPTG.
(d) Incubate overnight at 37 °C.
(e) An efficient TOPO Cloning reaction should produce several hundred colonies.
Transfer white or light blue colonies to a 96-well plate and culture them overnight in medium containing 50 µg/ml kanamycin.
(f) This plate is ready for regular ABI QC sequencing.

4.7.1. Comments

This step is extremely important for assessing the quality of the small RNA cDNA libraries. Highly expressed miRNAs should be easily identified from the sequencing data. Furthermore, contamination from adapter self-ligation should be lower than 5%. Therefore, any library that has failed QC analysis (i.e., contains a low level of known miRNAs and a high level of adapter contamination) should not be used for high-throughput sequencing. Fig. 3 shows the QC analysis for a small RNA cDNA library of Arabidopsis flowers. Our libraries average about 5% adapter contamination. Eighty-three of 85 clones are in the size range of 18–27 nt, with 20–24 nt representing the most common sizes. Approximately 36% of clones are known miRNAs. In addition, many small RNAs, which represent endogenous siRNAs, match transposons, rRNA and other repetitive sequences. The intergenic region (IGR)-derived small RNAs could arise from novel miRNA genes or unannotated transposons or retroelements.

Fig. 3. Quality control analysis for a small RNA cDNA library generated from Arabidopsis flowers. (a) Size distribution of the cloned small RNAs. (b) Distribution of small RNAs in various sequence classes.

For some studies that do not need very deep coverage, it is more cost-effective to mix several libraries in one run than to sequence them separately. Indexing nucleotides can be added to the individual cDNA libraries so that the origin of the sequences can be traced. Several strategies are possible to achieve this depending on specific needs. For example, different 5′ RNA adapters can be used for distinct libraries. After making libraries independently, the purified PCR products can be combined before high-throughput sequencing.
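As a rough illustration of how sequences from pooled libraries could be traced back to their origin, the sketch below sorts reads by a short leading index. It assumes, purely for illustration, that each read begins with a 2-nt index followed directly by the cloned small RNA sequence; the index codes, library names and read sequences are hypothetical and are not taken from this protocol, and the actual position of the index in a read depends on the indexing strategy and sequencing platform.

```python
from collections import defaultdict

# Hypothetical 2-nt index codes and library names, chosen only for
# illustration; the protocol does not specify index sequences.
INDEX_TO_LIBRARY = {"AC": "library_1", "GT": "library_2"}


def demultiplex(reads, index_to_library, index_len=2):
    """Group reads by their leading index nucleotides and strip the index.

    Assumes each read starts with the index, followed directly by the
    cloned small RNA sequence. Reads with an unknown index are kept
    untrimmed under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        library = index_to_library.get(read[:index_len])
        if library is None:
            bins["unassigned"].append(read)
        else:
            bins[library].append(read[index_len:])
    return dict(bins)


# Made-up reads: a 2-nt index followed by an arbitrary insert sequence.
reads = [
    "ACTGACAGAAGAGAGTGAGCAC",
    "GTTCGGACCAGGCTTCATTCCCC",
    "ACTTGACAGAAGATAGAGAGCAC",
    "TTAGGCATCGA",
]
pooled = demultiplex(reads, INDEX_TO_LIBRARY)
for library, inserts in sorted(pooled.items()):
    print(library, len(inserts))
```

Keeping unrecognized reads in a separate bin, rather than discarding them, makes it easy to see how many reads carry an index that was mis-sequenced or never assigned.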
With this adapter-based strategy, the indexing nucleotides can be placed adjacent to the cloned sequence rather than at the end of a primer that also must be sequenced. However, because RNA oligos are costly, this strategy is very expensive. If the high-throughput reads are long enough to cover the entire adapter sequences (as with 454), an alternative strategy can be applied. This second strategy has the advantage that the same RNA adapters can be used in all the libraries. The indexing nucleotides are introduced into the PCR primers (for example, two indexing nucleotides can be added at the 5′ ends of the PCR primers), which produces a distinct "tag" for each library. We have successfully sequenced mixed libraries by 454 with this second indexing strategy.

5. Troubleshooting

The integrity of RNA and DNA oligos has a significant impact on the outcome of the experiment. HPLC or PAGE-purified RNA oligos should be used. Regardless of the source of the oligos, if there is any question about their cleanliness, they should be further PAGE-purified. The oligos can be assessed for intactness by running an aliquot on a polyacrylamide gel. The following discussion assumes that only very pure, high quality RNA or DNA oligos were used in the protocol.

Positive control. An optional control experiment consists of taking miRNA-certified total RNAs (Ambion) through adaptor ligation, purification, reverse transcription and PCR. Obtaining a good small RNA library using the control RNA demonstrates that the reagents are working properly.

RNA quality and stability. The key factor for success is the quality of the starting RNA. RNA degradation during RNA isolation or purification steps is the most likely reason for failure to obtain a good library. We strongly recommend that you analyze an aliquot of the total RNA on an agarose gel before starting any purification and ligation steps. Look for a 28S ribosomal RNA band that is twice the intensity of the 18S band.
In addition, both bands should be sharp, with no smearing. Not all RNA purification methods efficiently recover small RNAs. Therefore, it is important to confirm that the method is effective for recovery of small RNAs. A few specific problems are discussed below.

Low or no product after the final PCR amplification. One probable cause is a low amount of starting RNA. At least 5 µg of LMW RNA (usually precipitated from over 100 µg of total RNA) should be used. Although we have seen successful small RNA libraries for some samples with lower amounts of starting material, 100 µg of high quality total RNA provides more consistent results.

Adaptor sequence contamination. RNA degradation or low resolution of gel-purification may cause RNA adaptor sequence contamination in small RNA libraries. In theory, in the absence of RNA degradation, no adapter–adapter ligation should result since the RNA adapter either has no 5′-phosphate or it has a blocked structure that cannot undergo ligation with T4 RNA ligase. To minimize degradation and protect RNA integrity, RNase inhibitor can be used during various enzymatic reactions.

Gel-purification. Poor quality of small RNA libraries can alternatively be caused by unsatisfactory gel-purification. For best results, it is recommended that freshly-made denaturing (7.5 M urea) polyacrylamide gels be used. Less than 20 µl of sample should be applied to a well of 1.5 mm thickness and of 5 mm width for good separation.

6. Conclusions

A major limitation of traditional sequencing for the discovery of small RNAs by cloning is that it is extremely challenging to identify small RNAs that are expressed at a low level, in restricted cell-types, or at very specific stages. In principle, this is no longer a limiting factor due to our ability to deeply sequence small RNA libraries from a broad range of samples. Using the method described here, we first analyzed the small RNA component of the transcriptome of Arabidopsis tissues [11].
This work identified many small RNA sequences that were not previously documented, and some were associated with genomic regions previously considered devoid of activity. Our data indicated that high-throughput sequencing methods are necessary to sample the full complexity of small RNAs in plants and likely other organisms as well. Application of this method to several key mutants affecting small RNA biogenesis pathways can quickly lead to the identification of candidate miRNAs, trans-acting siRNAs and other interesting classes of small RNAs [14,15]. In addition to analyzing small RNAs in plants, we have recently extended this approach to animal, fungal and viral systems ([16] and unpublished data). This method should prove to be a powerful approach that allows rapid identification and quantification of thousands or potentially millions of small RNA molecules in a single run.

Acknowledgments

We thank S. Luo and C.D. Haudenschild for technical advice and assistance; M. German and M. Accerbi for comments on the manuscript. This work was supported primarily by NSF Grants 0439186 and 0548569 (P.J.G. and B.C.M.), with additional support provided by DOE DE-FG02-04ER15541 (P.J.G.).

References

[1] X. Chen, FEBS Lett. 579 (2005) 5923–5931.
[2] B.C. Meyers, F.F. Souret, C. Lu, P.J. Green, Curr. Opin. Biotechnol. 17 (2006) 139–146.
[3] H. Vaucheret, Genes Dev. 20 (2006) 759–771.
[4] M. Lagos-Quintana, R. Rauhut, W. Lendeckel, T. Tuschl, Science 294 (2001) 853–858.
[5] N.C. Lau, L.P. Lim, E.G. Weinstein, D.P. Bartel, Science 294 (2001) 858–862.
[6] R.C. Lee, V. Ambros, Science 294 (2001) 862–864.
[7] C. Llave, K.D. Kasschau, M.A. Rector, J.C. Carrington, Plant Cell 14 (2002) 1605–1619.
[8] W. Park, J. Li, R. Song, J. Messing, X. Chen, Curr. Biol. 12 (2002) 1484–1495.
[9] P.D. Zamore, T. Tuschl, P.A. Sharp, D.P. Bartel, Cell 101 (2000) 25–33.
[10] R. Sunkar, T. Girke, P.K. Jain, J.K. Zhu, Plant Cell 17 (2005) 1397–1411.
[11] C. Lu, S.S. Tej, S. Luo, C.D. Haudenschild, B.C. Meyers, P.J. Green, Science 309 (2005) 1567–1569.
[12] S. Brenner, M. Johnson, J. Bridgham, G. Golda, D.H. Lloyd, D. Johnson, S. Luo, S. McCurdy, M. Foy, M. Ewan, R. Roth, D. George, S. Eletr, G. Albrecht, E. Vermaas, S.R. Williams, K. Moon, T. Burcham, M. Pallas, R.B. DuBridge, J. Kirchner, K. Fearon, J. Mao, K. Corcoran, Nat. Biotechnol. 18 (2000) 630–634.
[13] M. Margulies, M. Egholm, W.E. Altman, S. Attiya, J.S. Bader, L.A. Bemben, J. Berka, M.S. Braverman, Y.J. Chen, Z. Chen, S.B. Dewell, L. Du, J.M. Fierro, X.V. Gomes, B.C. Godwin, W. He, S. Helgesen, C.H. Ho, G.P. Irzyk, S.C. Jando, M.L. Alenquer, T.P. Jarvie, K.B. Jirage, J.B. Kim, J.R. Knight, J.R. Lanza, J.H. Leamon, S.M. Lefkowitz, M. Lei, J. Li, K.L. Lohman, H. Lu, V.B. Makhijani, K.E. McDade, M.P. McKenna, E.W. Myers, E. Nickerson, J.R. Nobile, R. Plant, B.P. Puc, M.T. Ronan, G.T. Roth, G.J. Sarkis, J.F. Simons, J.W. Simpson, M. Srinivasan, K.R. Tartaro, A. Tomasz, K.A. Vogt, G.A. Volkmer, S.H. Wang, Y. Wang, M.P. Weiner, P. Yu, R.F. Begley, J.M. Rothberg, Nature 437 (2005) 376–380.
[14] I.R. Henderson, X. Zhang, C. Lu, L. Johnson, B.C. Meyers, P.J. Green, S.E. Jacobsen, Nat. Genet. 38 (2006) 721–725.
[15] C. Lu, K. Kulkarni, F.F. Souret, R. MuthuValliappan, S.S. Tej, R.S. Poethig, I.R. Henderson, S.E. Jacobsen, W. Wang, P.J. Green, B.C. Meyers, Genome Res. 16 (2006) 1276–1288.
[16] J. Burnside, E. Bernberg, A. Anderson, C. Lu, B.C. Meyers, P.J. Green, N. Jain, G. Isaacs, R.W. Morgan, J. Virol. 80 (2006) 8778–8786.