S1 Supplementary Information An In Vitro Translation, Selection, and Amplification System for Peptide Nucleic Acids Yevgeny Brudno, Michael E. Birnbaum, Ralph E. Kleiner, David R. Liu* Department of Chemistry and Chemical Biology and the Howard Hughes Medical Institute Harvard University, 12 Oxford Street, Cambridge, MA 02138 Supplementary Methods General Methods. Unless otherwise noted, all commercially available reagents and solvents were purchased from Aldrich and used without further purification. All PNA monomers were purchased from ASM Research. Unless otherwise noted, all commercially available enzymes and buffers were purchased from New England Biolabs (NEB). Unless otherwise specified, all reactions were conducted at room temperature. Preparation of Individual DNA Templates. DNA templates for polymerization experiments in main text Figures 2 and 3 and Supplementary Figure 1 were synthesized with a 5’-MMT-aminodT-CE phosphoramidite (Glen Research, Cat No. 10-1932) at the 5’ terminus. Purification of DNA templates was carried out by (i) deprotection with 1:1 ammonium hydroxide: methylamine for 15 min at 65 ˚C; (ii) reverse-phase HPLC purification of the MMT-containing fraction using a [8% acetonitrile in 0.1 M TEAA, pH 7] to [40% acetonitrile in 0.1 M TEAA, pH 7] solvent gradient with a column temperature of 45˚C; (iii) MMT cleavage with 3% trifluoroacetic acid in water for 10 min at room temperature; (iv) reverse-phase HPLC purification of the MMTdeprotected product. Samples were quantitated by UV spectroscopy using a Nanodrop ND1000 spectrophotometer. DNA Templates in Supplementary Figure 1B: gaat: NGCGACGGTATACCGTCGCAATTCATTCATTCATTCATTCATTCATTCATTCATTCATTC gact: NGCGACGGTATACCGTCGCAAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTC gagt: NGCGACGGTATACCGTCGCAACTCACTCACTCACTCACTCACTCACTCACTCACTCACTC gcgt: NGCGACGGTATACCGTCGCAACGCACGCACGCACGCACGCACGCACGCACGCACGCACGC ggat: NGCGACGGTATACCGTCGCAATCCATCCATCCATCCATCCATCCATCCATCCATCCATCC Nature Chemical Biology: doi: 10.1038/nchembio.280 S2 ggct: NGCGACGGTATACCGTCGCAAGCCAGCCAGCCAGCCAGCCAGCCAGCCAGCCAGCCAGCC gggt: NGCGACGGTATACCGTCGCAACCCACCCACCCACCCACCCACCCACCCACCCACCCACCC gcct: NGCGACGGTATACCGTCGCAAGGCAGGCAGGCAGGCAGGCAGGCAGGCAGGCAGGCAGGC DNA Templates in Supplementary Figure 1C: – lane (25 °C): NGCGACGGTTCCCCGTCGCAACCTTACCTTACCTTACCTTACCTTACCTTACCTT aaggt: NGCGACGGTTCCCCGTCGCAACCTTACCTTACCTTACCTTACCTTACCTTACCTT agagt: NGCGACGGTTCCCCGTCGCAACTCTACTCTACTCTACTCTACTCTACTCTACTCT caagt: NGCGACGGTTCCCCGTCGCAACTTGACTTGACTTGACTTGACTTGACTTGACTTG cttgt: NGCGACGGTTCCCCGTCGCAACAAGACAAGACAAGACAAGACAAGACAAGACAAG agcat: NGCGACGGTTCCCCGTCGCAATGCTATGCTATGCTATGCTATGCTATGCTATGCT cacat: NGCGACGGTTCCCCGTCGCAATGTGATGTGATGTGATGTGATGTGATGTGATGTG cagtt: NGCGACGGTTCCCCGTCGCAAACTGAACTGAACTGAACTGAACTGAACTGAACTG ggatt: NGCGACGGTTCCCCGTCGCAAATCCAATCCAATCCAATCCAATCCAATCCAATCC ccatt: NGCGACGGTTCCCCGTCGCAAATGGAATGGAATGGAATGGAATGGAATGGAATGG ctact: NGCGACGGTTCCCCGTCGCAAGTAGAGTAGAGTAGAGTAGAGTAGAGTAGAGTAG DNA Templates in Figure 2: acact: NGCGACGGTTCCCCGTCGCAAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAGTGT accat: NGCGACGGTTCCCCGTCGCAATGGTATGGTATGGTATGGTATGGTATGGTATGGTATGGTATGGTATGGT acgtt: NGCGACGGTTCCCCGTCGCAAACGTAACCAAACGTAACCAAACGTAACCAAACGTAACCAAACGTAACCA caact: NGCGACGGTTCCCCGTCGCAAGTTGAGTTGAGTTGAGTTGAGTTGAGTTGAGTTGAGTTGAGTTGAGTTG cacat: NGCGACGGTTCCCCGTCGCAATGTGATGTGATGTGATGTGATGTGATGTGATGTGATGTGATGTGATGTG cagtt: NGCGACGGTTCCCCGTCGCAAACTGAACTGAACTGAACTGAACTGAACTGAACTGAACTGAACTGAACTG gtact: NGCGACGGTTCCCCGTCGCAAGTACAACCAAGTACAACCAAGTACAACCAAGTACAACCAAGTACAACCA gtcat: NGCGACGGTTCCCCGTCGCAATGACAACCAATGACAACCAATGACAACCAATGACAACCAATGACAACCA gtgtt: NGCGACGGTTCCCCGTCGCAAACACAACACAACACAACACAACACAACACAACACAACACAACACAACAC tgact: NGCGACGGTTCCCCGTCGCAAGTCAAGTCAAGTCAAGTCAAGTCAAGTCAAGTCAAGTCAAGTCAAGTCA tgcat: NGCGACGGTTCCCCGTCGCAATGCAAACCAATGCAAACCAATGCAAACCAATGCAAACCAATGCAAACCA tggtt: NGCGACGGTTCCCCGTCGCAATGCAAACCAAACCAAACCAAACCAAACCAAACCAAACCAAACCAAACCA DNA Templates in Figure 3: top-left gel: NGCGACGGTTCCCCGTCGCAAGTGTATGGTAACGTAGTTGATGTGAACTG top-right gel: NGCGACGGTTCCCCGTCGCAAGTTGAGTGTAACGTAACTGATGGTATGTG bottom-left gel: NGCGACGGTTCCCCGTCGCAAGTACATGACAACACAGTCAATGCAAACCA bottom-right gel: NGCGACGGTTCCCCGTCGCAATGCAATGACAACCAAGTACAACACAGTCA Nature Chemical Biology: doi: 10.1038/nchembio.280 S3 Synthesis of PNA Aldehyde Building Blocks. PNA pentamer aldehyde building blocks were synthesized using preloaded H-Thr-Gly-NovaSyn TG resin (Novabiochem) and standard automated Fmoc solid-phase peptide synthesis on an Applied Biosystems 433A peptide synthesizer by adapting previously described protocols1,2. PNA pentamer aldehydes were purified by reverse-phase high-pressure liquid chromatography (HPLC) using a C18 stationary phase and an acetonitrile/0.1% trifluoroacetic acid (TFA) gradient, characterized by ESI or MALDI-TOF mass spectrometry and quantitated on a Nanodrop ND1000 UV spectrophotometer. Building Block Expected Mass (Da) Observed Mass (Da) aaggt 1400.6 1400.8 agagt 1400.6 1400.8 caagt 1360.6 1361.0 cttgt 1342.6 1342.8 agcat 1360.6 1361.0 cacat 1320.6 1320.8 cagtt 1351.6 1351.8 ggatt 1391.6 1391.8 ccatt 1311.6 1311.8 ctact 1311.6 1311.8 gaat 1109.4 1109.4 gagt 1125.4 1125.4 ggat 1125.4 1125.4 gggt 1141.4 1141.4 ggct 1101.4 1101.4 gcgt 1101.4 1101.4 gact 1085.4 1085.4 gcct 1061.4 1061.4 gaat 1109.4 1109.4 Building Block Expected Mass (Da) Observed Mass (Da) acact 1320.6 1322.5 accat 1320.6 1322.9 Nature Chemical Biology: doi: 10.1038/nchembio.280 S4 acgtt 1351.6 1352.8 caact 1320.6 1322.0 cacat 1320.6 1322.9 cagtt 1351.6 1352.3 tgact 1351.6 1352.9 tgcat 1351.6 1352.9 tggtt 1382.6 1383.6 gtact 1351.6 1352.0 gtcat 1351.6 1355.8 gtgtt 1382.6 1385.3 biotin-ggatt 1618.6 1618.5 DNA Template Library Synthesis. The library of DNA templates was synthesized with intact 5’-DMT groups using the following sequence: TCCGACCGAATTCCTGGCTCGGAAA68A68A68A68A68A68A68A68AAAAAACCGAGT GGGCACGCC where 6 was an equimolar mixture of AC, GT, and TG dinucleotide phosphoramidites (Chemgenes Corporation) and 8 was an equimolar mixture of AC, GT, TG, and CA dinucleotide phosphoramidites (Chemgenes Corporation). The positive control template was similarly synthesized with the sequence: CGAATTCCTGGCTCGGAAAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAGTGTAATC CGGAAAACCGAGTGGGCACGCC. Purification of DNA templates was carried out by (i) deprotection with 1:1 ammonium hydroxide: methylamine for 15 min at 65 ˚C; (ii) reverse-phase HPLC purification of the DMT-containing fraction using a [8% acetonitrile in 0.1 M TEAA pH 7] to [40% acetonitrile in 0.1 M TEAA pH 7] solvent gradient with a column temperature of 45˚C; (iii) DMT cleavage with 3% trifluoroacetic acid/ water for 10 min; and (iv) reverse-phase HPLC purification of the DMT-deprotected product. Samples were quantitated by UV spectroscopy. PCR Amplification for Figure 5. The material to be amplified was diluted in 400 µL of Thermopol buffer, 1 μM primer 1, 1 μM primer 2, 1 mM total dNTPs, and Vent (exo-) Polymerase (16 U/ 400 μL). Primer 1: 5’ PCGAATTCCTGGCTCGGAAA where P is a 5’ phosphate (Glen Research, 10-1901); Primer 2: 5’ BGGCGTGCCCACTCGG where B is installed with the 5’ Biotin phosphoramidite (Glen Research 10-5950). The PCR protocol is as follows: 25 cycles of [30 sec at 94 °C, 30 sec at 55 °C, and 30 sec at 72 °C]. PCR reaction Nature Chemical Biology: doi: 10.1038/nchembio.280 S5 products were purified using the Qiagen PCR purification kit to remove primers, resulting in 100 µL of purified PCR product per 400 µL of PCR reaction. Strand Separation After PCR. Streptavidin-coated magnetic particles (200 µL, Roche) were washed twice with wash buffer and then incubated with 3 mg/mL yeast RNA (Ambion) in wash buffer for 30 min. To this mixture was added purified product DNA from PCR amplification in wash buffer. The bead suspension was incubated for 1 hour at room temperature with agitation. The beads were washed twice with 200 µL of wash buffer. The coding strand was eluted with 100 µL of 0.1 M NaOH and the eluant was neutralized with 25 µL 3 M sodium acetate (pH 5.2). Samples were either precipitated with ethanol (Figure 5) or concentrated using a YM-30 concentrator column (Millipore, Figure 6). Samples were dissolved in water and quantitated using UV spectroscopy. Typical yields were 10-20 pmoles of coding strand. The wash buffer was 100 mM NaCl for the experiment in Figure 5 and 25mM Tris-HCl, 130 mM NaCl, pH 7.4 for selections in Figure 6 and in the Supplementary Figure 3 below. Ligation of 3’ and 5’ Hairpins. The purified coding strand (~10 pmol) was dissolved in 20 µL T4 ligase buffer with 1 µM hairpin 1 and 1 µM hairpin 2. Hairpin 1: 5’NTCCGAGCCAGGAATTCGCCCGGGXCTTCTTCCCGGG where N is from the 5’ AminodT-CE phosphoramidite (Glen Research 10-1932) and X is from the Cy3 phosphoramidite (Glen Research 10-5913). Hairpin 2: 5’PCGAGTGGCTFTCCCACTCGGGCGTGC (for Figure 5) or 5’PCGAGTGGCTFTCCCACTCGTGGCGTGC (for Figure 6 and the Supplementary Figure 3) where P is a 5’ phosphate (Glen Research, 10-1901) and F is from the fluorescein phosphoramidite (Glen Research 10-1964). Samples were incubated for 2 hours at 16 ˚C and purified either by concentration with YM-30 columns (Millipore, Figure 5) or by Minelute Nucleotide Removal Kit (Qiagen, Figure 6). Library Sequencing. Library sequences from DNA synthesis were amplified by PCR in 100 μL Amplitaq buffer (Bio-Rad) containing Hotstart Taq polymerase, 1 μM primer 1, and 1 μM primer 2. Primer 1: 5’ GCCGACCGAATTCCTGGCTCGG; Primer 2: 5’ GGGCACACAGGAACAGCTATGACCATGGCGTGCCCACTC. An initial denaturation step at 94 °C for 3 min was followed for 25 cycles of [30 sec at 94 °C, 30 sec at 55 °C, and 30 sec at Nature Chemical Biology: doi: 10.1038/nchembio.280 S6 72 °C]. From the completed PCR reaction 0.5 µL was subjected to TOPO cloning (Stratagene) using the pCR2.1-TOPO vector (Stratagene) and transformed into Mach 1 chemically competent cells (Stratagene). Carbenicillin-resistant colonies were picked and subjected to the Templiphi reaction (GE Healthcare) with the following sequencing primer: GGGCACACAGGAACAGCTATGACCATGGCGTGCCCACTC Selection Over Six Rounds of Translation, Selection, and Amplification. To a library of DNA templates encoding 4.3 x 108 PNA 40-mers containing eight consecutive building blocks was added 10-2, 10-4, or 10-6 equivalents of a DNA template uniquely containing an AATCC codon (red) that translates into a biotinylated H2N-ggatt-CHO PNA building block. The positive control DNA template, but not the other templates, contains an MspI cleavage site. The resulting library of DNA templates was subjected to two to six iterated rounds of translation, PNA strand displacement, in vitro selection for binding to immobilized streptavidin, and amplification by PCR (see Figure 5). Enrichment was measure by MspI digestion of the PCR-amplified DNA before and after selection. A control selection using 10-2 equivalents of the positive control template but omitting the biotinylated building block during translation (-biotin lanes) was also performed. Nature Chemical Biology: doi: 10.1038/nchembio.280 S7 Supplementary Results Analysis of Potential PNA Genetic Codes GC and Pyrimidine Content. DNA codons with equal GC and pyrimidine content were hypothesized to have similar affinity for their cognate PNA building blocks and thereby similar polymerization efficiencies. Pools of candidate codons were analyzed for number of Gs and Cs in their sequence as a fraction of the total length and the number of pyrimidines as a fraction of the total length. Sequence Specificity. Based on our previous studies2, we hypothesized that in order to exhibit sequence-specific polymerization, codons and non-cognate building blocks must differ by at least one full mismatch (i.e. not a G:T wobble) or two G:T wobbles. Note that because codons that differ by one base (i.e. AACTT and AACCT) automatically have either different GC or different pyrimidine contents, codons that meet the GC and pyrimidine content criteria automatically meet this sequence-specificity criteria. Out-of-Frame Hybridization. The genetic code should minimize the possibility that building blocks polymerize out of frame to avoid mistranslation or inhibition of proper translation. From a possible pool of codons, we implemented a computational screen to elucidate the largest possible set of codons with no possible out-of-frame hybridization. A codon was chosen at random from the pool and its cognate building block was checked for the ability to hybridize out of frame to a template made containing the codon. If the chosen codon failed this test, it was removed from analysis and another was chosen. If the first codon passed, another codon was chosen at random and both cognate building blocks were tested for out-of-frame annealing to any of the possible sequences generated by the two codons together. If possible out-of-frame annealing was detected, the second codon was discarded and another chosen. If the sequence analysis suggested no out-of-frame annealing, then another codon was chosen and the process continued until no codons were left. The analysis was performed multiple times to assure that the initial order of codons did not affect the analysis. Nature Chemical Biology: doi: 10.1038/nchembio.280 S8 Ease of Template Library Synthesis. In addition to uniform polymerization efficiency and sequence specificity, an ideal PNA genetic code enables corresponding template libraries to be readily synthesized in a single column on a DNA synthesizer without the need to perform laborious split-and-pool oligonucleotide synthesis. The synthesis of a library template containing n occurrences of x codons by the split-and-pool method requires dividing synthesis resin into x equal portions, performing the solid-phase synthesis of a codon on each of the x portions, pooling the portions, and repeating this process for each of n codon positions in the library. As n and x increase to even modest numbers, this strategy becomes difficult or impractical. In contrast, single-column library synthesis traditionally uses mixtures of A, C, G, and/or T phosphoramidites to install degenerate bases at specific positions in a DNA strand. The requirements that we identified above— identical GC and pyrimidine content, at least one nonG:T mismatch or two wobble mismatches between any two codons, and no out-of-frame annealing— mathematically preclude the use of traditional degenerate bases. The only degenerate bases that do not vary in pyrimidine content are Y (T or C), or R (A or G); the use of either, however, necessarily changes GC content and introduces the possibility of two codons that differ only by a single G:T wobble mismatch. Taken together, this analysis establishes that a building block set of an acceptable size with predicted uniform and sequence-specific polymerization, no out of frame annealing, and a readily synthesized template library cannot be generated using mixtures of A, C, G, and T phosphoramidites. We hypothesized that a PNA genetic code that supports single-column template library synthesis while satisfying the requirements of constant GC content, constant pyrimidine content, and the absence of any two codons that differ only by a single wobble mismatch could be achieved by using mixtures of dinucleotide (Chemgenes Corporation, Supplementary Table 1) or trinucleotide (Glen Research, Supplementary Table 2) phosphoramidites. The use of mixtures of polynucleotide phosphoramidites allows for the single-column synthesis of template libraries unavailable through other means and can assure conformity to the above rules of genetic code design as long as each element in the mixture has the same GC and pyrimidine content. Commercially Available Dinucleotide Nature Chemical Biology: doi: 10.1038/nchembio.280 S9 Phosphoramidites DNA codon GC fraction pyrimidine fraction AA 0 0 AG 0.5 0 GA 0.5 0 GG 1 0 AT 0 0.5 TA 0 0.5 AC 0.5 0.5 CA 0.5 0.5 GT 0.5 0.5 TG 0.5 0.5 CG 1 0.5 GC 1 0.5 TT 0 1 CT 0.5 1 TC 0.5 1 CC 1 1 Supplementary Table 1. Commercially available dinucleotide phosphoramidites and their GC and pyrimidine content. Commercially Available Trinucleotide Phosphoramidites DNA sequence GC fraction pyrimidine fraction AAA 0 0 GAA 1/3 0 AAC 1/3 1/3 ATG 1/3 1/3 CAG 2/3 1/3 GAC 2/3 1/3 GGT 2/3 1/3 TGG 2/3 1/3 ACT 1/3 2/3 Nature Chemical Biology: doi: 10.1038/nchembio.280 S10 ATC 1/3 2/3 CAT 1/3 2/3 GTT 1/3 2/3 TAC 1/3 2/3 CGT 2/3 2/3 CTG 2/3 2/3 GCT 2/3 2/3 TGC 2/3 2/3 CCG 1 2/3 TCT 1/3 1 TTC 1/3 1 Supplementary Table 2. Commercially available trinucleotide phosphoramidites and their respective GC and pyrimidine content. Analysis of One-Base Codons In principle, mononucleotide building blocks offer many attractive advantages such as no possibility for out of frame hybridization, and the greatest possible library complexity for a polymer of a given length. However, previous attempts to polymerize mononucleotide PNAs through acylation chemistry revealed that they were prone to cyclize to form six-membered rings3, which may also be a problem for reductive amination. In addition, mononucleotide PNAs would violate criteria for uniform GC and pyrimidine content and would presumably differ significantly in strength of their hybridization with a DNA template. Analysis of Two-Base Codons Two-base codons can be achieved through the use of dinucleotide phosphoramidites (Table S1) all 16 of which are commercially available. One can separate the pool of 16 codons into candidate sets of codons all of which have identical GC and pyrimidine content. Four codons comprise their own sets (AA, TT, GG, and CC), and fail the out of frame annealing analysis. Additionally six sets of two codons (AG/GA, AT/TA, AC/CA, TG/GT, CG/GC, and CT/TC) reduce to a possible contribution of one building block each (for instance, the building block for AG would bind out of frame on (GA)n templates, allowing the choice of only one or the other of the codons). One candidate set contains four codons (AC, GT, TG, and CA). Nature Chemical Biology: doi: 10.1038/nchembio.280 S11 However the two codons AC and CA become mutually incompatible as do the two codons GT and TG. Therefore, the maximum possible genetic code comprising two-base codons consists of two codons (for instance AC and GT). Analysis of Three-Base Codons Three-base codons can be achieved through the use of trinucleotide phosphoramidites, 20 of which are commercially available (Table S2). These 20 can separate into candidate sets of codons all of which have identical GC and pyrimidine content. There are two candidate sets of four codons (CAG/GAC/GGT/TGG and CGT/CTG/GCT/TGC) and one candidate set of five codons (ACT/ATC/CAT/GTT/TAC) with the required sequence properties. The candidate set CGT/CTG/GCT/TGC contains three codons (CTG, GCT and TGC) that are mutually incompatible for out of frame annealing and of which two must be discarded, leaving a twomember candidate set. The candidate set CAG/GAC/GGT/TGG, contains two codons (GGT and TGG) which are mutually incompatible and of which one must be discarded, leaving a threemember candidate set. The candidate set ACT/ATC/CAT/GTT/TAC, contains two pairs of incompatible codons (ACT/TAC and ATC/CAT). As one must be discarded from each pair, this leaves a three-member candidate set. This analysis demonstrates that for three-base codons made from trinucleotide phosphoramidites, the largest possible codon set is three. Three-base codons may also be achieved by using a YX’ or X’Y code where a mixture of mononucleotide phosphoramidites (Y) is followed or preceded by a mixture of dinucleotide phosphoramidites (X’). As discussed earlier, the mononucleotide component cannot be varied since this would create unacceptable changes in the sequence composition. The dinucleotide portion must conform to the sequence criteria and therefore mixtures of dinucleotide phosphoramidites, of which the candidate set (AC, GT, TG, and CA) is the biggest. An analysis of all possible three-base codons made up of a fixed base followed by or preceded by a mixture of this four-member set gives four possible codons, one of which must be removed to account for out of frame annealing. For example, in the set AAC, AGT, ATG, ACA the building block for the ACA codon (tgt) can anneal out of frame to the sequence AACAGT. This analysis demonstrates that for three-base codons made from dinucleotide phosphoramidites, the largest possible codon set is three. Nature Chemical Biology: doi: 10.1038/nchembio.280 S12 Analysis of Four-Base Codons Four-base codons may be achieved by using a YX’ or X’Y code where a mixture of mononucleotide phosphoramidites (Y) is followed or preceded by a mixture of trinucleotide phosphoramidites (X’). As discussed earlier, the mononucleotide component cannot be varied since this would create unacceptable changes in the sequence composition, which leaves variations in the dinucleotide portion. The trinucleotide portion must conform to the sequence criteria and therefore mixtures of trinucleotide phosphoramidites, of which the candidate sets CAG/GAC/GGT/TGG, CGT/CTG/GCT/TGC, and ACT/ATC/CAT/GTT/TAC are the biggest. Analyzing the possible codons formed by combining a fixed mononucleotide with a mixture of trinucleotides one obtains several sets of five codons that do not suffer from out of frame annealing (for instance: ACTG, CATG, TACG, GTTG, and ATCG). Four-base codons can be also be achieved from two dinucleotides in sequence and envisaged as Y’X’ where Y’ is one set of mutually-compatible dinucleotides and X’ is another. Since all elements of Y’ and X’ must have identical GC and pyrimidine content, we analyzed all possible X’Y’ combinations. The largest set was found to be X’=2, and Y’=2, giving a final codon set of four building blocks (for example, ACTG, ACGT, CATG, CAGT). This analysis demonstrates that for four-base codons made from mixtures of either trinucleotide or dinucleotide phosphoramidites, the largest possible codon set is five. Note that PNA building blocks longer than pentanucleotides (used in this work) would also likely show efficient polymerization, but would decrease complexity for a library of a given nucleotide length. Nature Chemical Biology: doi: 10.1038/nchembio.280 S13 Supplementary Figure 1. Two PNA genetic codes that exhibit non-uniform DNA-templated polymerization. (A) Representative polymerization of a PNA aldehyde building block on a complementary DNA template. (B) Ten consecutive DNA-templated reductive amination couplings result in PNA 40-mers containing secondary amine linkages between every fourth nucleotide. The efficiency of this process is shown at 25 °C, 45 °C and 60 °C for eight PNA aldehydes of the form H2Ngvvt-CHO (where v = a, c, or g) using DNA templates containing ten repeats of the codon of interest. Reaction conditions: 16 µM PNA aldehyde and 0.4 µM DNA templates were heated to 95 °C and cooled to the specified temperature. NaBH3CN was added to 80 mM. After 30 min, the reactions were analyzed by denaturing PAGE and visualized by staining with ethidium bromide. (C) Seven consecutive DNAtemplated reductive amination couplings result in PNA 35-mers containing secondary amine linkages between every fifth nucleotide. The efficiency of this process is shown at 25 °C, 45 °C and 60 °C for the ten PNA pentamer aldehydes. Reaction conditions were identical to those in (B). For both (B) and (C), the building blocks that did not polymerize efficiently at 25 °C were not characterized at the higher temperatures. Nature Chemical Biology: doi: 10.1038/nchembio.280 S14 N40 Library (x’y’t)8 Library Supplementary Figure 2. Modeling of N40 and (x’y’t)8 libraries. 10,000 (x’y’t)8 library members or 10,000 N40 library members were randomly generated and their secondary structure folding energies were calculated as DNA using the Oligonucleotide Modeling Platform (OMP; DNA Software, Inc.). All simulations to determine secondary structures were performed at 25 °C in 1.0 M NaCl with 100 nM template. The top histogram shows the results for the N40 library; the bottom histogram shows the results for the (x’y’t)8 library. Nature Chemical Biology: doi: 10.1038/nchembio.280 S15 Supplementary Figure 3. Complete analysis of the model selection over six rounds of translation, selection, and amplification. See also the main text and Figure 6. To a library of DNA templates encoding 4.3 x 108 PNA 40-mers containing eight consecutive building blocks was added 10-2, 10-4, or 10-6 equivalents of a DNA template uniquely containing an AATCC codon (red) that translates into a biotinylated H2N-ggatt-CHO PNA building block. The positive control DNA template, but not the other templates, contains an MspI cleavage site. The resulting library of DNA templates was subjected to six iterated rounds of translation, PNA strand displacement, in vitro selection for binding to immobilized streptavidin, and amplification by PCR (see Figure 5). MspI digestion of the PCR-amplified DNA before and after selection reveals that the selection resulted in > 1,000,000-fold enrichment for the positive control sequence encoding the biotinylated PNA. A similar model selection using 10-2 equivalents of the positive control template but omitting the biotinylated building block during translation (-biotin lanes) results in no enrichment of the positive control sequence. Y’ = mixture of AC, GT, and TG; X’ = mixture of AC, GT, TG, and CA. The images shown are from 2.5% agarose gels stained with ethidium bromide. Nature Chemical Biology: doi: 10.1038/nchembio.280 S16 Supplementary References Cited 1. Kleiner, R.E., Brudno, Y., Birnbaum, M.E. & Liu, D.R. DNA-templated polymerization of side-chain-functionalized peptide nucleic acid aldehydes. J. Am. Chem. Soc. 130, 4646-4659 (2008). 2. Rosenbaum, D.M. & Liu, D.R. Efficient and sequence-specific DNA-templated polymerization of peptide nucleic acid aldehydes. J. Am. Chem. Soc. 125, 13924-13925 (2003). 3. Schmidt, J.G., Christensen, L., Nielsen, P.E. & Orgel, L.E. Information transfer from DNA to peptide nucleic acids by template-directed syntheses. Nucleic Acids Res. 25, 4792-6 (1997). Nature Chemical Biology: doi: 10.1038/nchembio.280
© Copyright 2026 Paperzz