JOURNAL OF MOLECULAR RECOGNITION J. Mol. Recognit. 2002; 15: 126–134 Published online in Wiley InterScience (www.interscience.wiley.com). DOI:10.1002/jmr.567 Identification of epitope-like consensus motifs using mRNA display Rick Baggio1*†, Petra Burgstaller2†, Stephen P. Hale1, Andrew R. Putney1, Meghan Lane1, Dasa Lipovsek1, Martin C. Wright1, Richard W. Roberts3, Rihe Liu4, Jack W. Szostak4 and Richard W. Wagner1 1 Phylos Inc., 128 Spring St, Lexington, MA 02421, USA Noxxon Pharma AG, Gustav-Meyer-Allee 25, 13355 Berlin, Germany 3 Division of Chemistry and Chemical Engineering, California Technology Institute, Pasadena, CA 91125, USA 4 Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA 2 The mRNA display approach to in vitro protein selection is based upon the puromycin-mediated formation of a covalent bond between an mRNA and its gene product. This technique can be used to identify peptide sequences involved in macromolecular recognition, including those identical or homologous to natural ligand epitopes. To demonstrate this approach, we determined the peptide sequences recognized by the trypsin active site, and by the anti-c-Myc antibody, 9E10. Here we describe the use of two peptide libraries of different diversities, one a constrained library based on the trypsin inhibitor EETI-II, where only the six residues in the first loop were randomized (6.4 107 possible sequences, 6.0 1011 sequences in the library), the other a linear-peptide library with 27 randomized amino acids (1.3 1035 possible sequences, 2 1013 sequences in the library). The constrained library was screened against the natural target of wild-type EETI, bovine trypsin, and the linear library was screened against the anti-c-myc antibody, 9E10. The analysis of selected sequences revealed minimal consensus sequences of PR(I,L,V)L for the first loop of EETI-II and LISE for the 9E10 epitope. The wild-type sequences, PRILMR for the first loop of EETI-II and QKLISE for the 9E10 epitope, were selected with the highest frequency, and in each case the complete wildtype epitope was selected from the library. Copyright # 2002 John Wiley & Sons, Ltd. Keywords: molecular recognition; mRNA; protein; peptide, display; selection; 9E10; EETI-II; Myc; trypsin INTRODUCTION Specific interactions between proteins, such as between an enzyme and its inhibitor or between an antibody and its antigen, are mediated through molecular recognition of subsites on both binding partners. To determine such binding subsites, one strategy is to use randomized peptide or protein libraries and a screening method to identify the minimal binding domain for an interaction with a given target. Armed with this information, it should in principle be possible to search databases of protein sequences to identify previously unknown protein:target binding pairs. A method for determining the sequences of natural protein-binding peptides would be a valuable tool in proteomic research. Such a method would allow, for example, the identification of the unknown binding partners of signaling proteins, and the identification of the targets of antibodies with defined biological properties but unknown antigens. One example of a well-characterized complex between a protein and a peptide is the complex between trypsin and its *Correspondence to: R. Baggio, Phylos Inc., 128 Spring St, Lexington, MA 02421, USA. E-mail: [email protected]. †Both authors contributed equally to this work. Abbreviations used: EETI-II, Ecballium elaterium trypsin inhibitor two. Copyright # 2002 John Wiley & Sons, Ltd. inhibitor, the Ecballium elaterium trypsin inhibitor two (EETI-II; Favel et al., 1989a,b). Trypsin is a 24 kDa pancreatic serine protease and EETI-II is a 28-residue trypsin inhibitor of the knottin family; like most knottins, it contains a triple-stranded beta sheet and three disulfide bonds (Heitz et al., 1989). Only the first loop of EETI-II (residues 3–8) is involved in the interaction with trypsin (Le-Nguyen et al., 1989, 1990). Another example of a well-characterized complex between a protein and a peptide is that of the c-Myc antigen and the anti-c-Myc antibody 9E10, which recognizes the cMyc epitope between residues 408 and 439 (Evan et al., 1985; Schiweck et al., 1997). Human c-Myc is a transcription factor of 65 kDa which, as a heterodimer with the Max protein, binds to genes whose promoters contain the E box motif (CACGTG, Amati et al., 2001). A limitation of previous methods (Deroos and Muller, 2001) used to achieve the goal of finding epitope-like consensus motifs is the overall size of the randomized library that allows for adequate interaction space to be sampled. In affinity-based peptide library selections (Mattheakis et al., 1994; Schatz et al., 1996), selection pressure is applied on a random population, resulting in the enrichment of specific binders. The association of phenotype with genotype is essential both to enrich binders at each round of selection and to identify the selected molecules. Among the technologies that take advantage of this link are IDENTIFICATION OF EPITOPE-LIKE CONSENSUS MOTIFS 127 Figure 1. Ribbon model of a mRNA±peptide fusion molecule, to scale (drawn with Biopolymer2, MSI). The double helix mRNA/DNA strand encodes for the tobacco mosaic virus untranslated region and either the 28 amino acids of the EETI-II library or the 27 amino acids of the randomized Ê extension of the mRNA linear peptide library. The single-stranded poly-A linker is a covalent 100 A strand. The three prime end of the poly-A linker terminates with puromycin, which forms the carboxy-terminus of the peptide encoded by the mRNA. in vivo technologies such as phage display (Hoogenboom et al., 1998), E. coli display (Christmann et al., 1999; Wentzel et al., 1999), yeast display (Kieke et al., 1997; Boder and Wittrup, 2000) and in vitro technologies such as ribosome display (He and Taussig, 1997), display technologies using DNA-binding proteins (Fitzgerald, 2000; Speight et al., 2001), and mRNA display (Roberts and Szostak, 1997; Nemoto et al., 1997). mRNA display uses puromycin to covalently attach an mRNA to its translated product (Fig. 1). The advantages of this method are that the libraries are generated in vitro, which allows large libraries (>1012 different sequences) to be made, and that the covalent link between the genotype and the phenotype ensures that the mRNA–peptide fusion remains intact under a range of conditions. mRNA display has been used to select highaffinity streptavidin-binding peptides (Wilson et al., 2001) and functional proteins (Keefe and Szostak, 2001) from a randomized peptide library. In this study, we applied the mRNA display technology to test its ability to identify the binding domains for two target:ligand interactions, bovine trypsin inhibitor and the anti-myc 9E10 monoclonal antibody. We selected the peptides that bound to these targets by iteratively selecting and enriching for binders from randomized peptide libraries. In addition to selecting epitope-like motifs from randomized peptide libraries, our study sought to dissect the peptide– macromolecular interactions and to determine the minimal sequence required for the recognition of a peptide by a macromolecule. EXPERIMENTAL Library construction EETI-II. Three gel-purified oligonucleotides (Oligos Etc. Inc.) were used to construct EETI-II with a randomized first loop (residues 3–8). In a final volume of 100 ml, 1 nmol of the template, 5' GAT AAA TGT TAA TGT TAC CCG ACG (NNS)6 ACG TTC GTC CTG TCG CTG ACG GAG CGG CCG ACG CAG ACG CCG GGG TTG CCG AAG ACG CCG 3', N = A, G, C, T; S = C, G), was amplified for Copyright # 2002 John Wiley & Sons, Ltd. three rounds of PCR (94 °C, 1 min; 65 °C, 1 min; 72 °C, 2 min) using 1 mM of the 5' primer, 5' TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA ATG GGC TGC 3' (Oligos Etc. Inc.), and the 3' primer, 5' CGG CGT CTT CGG CAA CCC CGG CGT 3' (Oligos Etc. Inc.) in the supplied buffer (Promega) with MgCl2 (2.5 mM), NTP (250 mM) and 5 ml of Taq polymerase (Promega, 5 units/ml). One nanomole of DNA was transcribed into RNA in a 0.5 ml reaction using the Megashortscript In vitro Transcription kit (Ambion). After incubation for 1 h at 37 °C, 20 units of DNAse I (Ambion) were added and the incubation at 37 °C continued for an additional 15 min. The reaction was twice purified with phenol/CHCl3/isoamylalcohol and excess NTPs were removed by NAP-25 gel filtration (Pharmacia). The RNA was precipitated by addition of 1/10 vol NaOAc pH 5.2 and 1 vol isopropanol and dissolved in 200 ml H2O. The puromycin-DNA oligonucleotide linker, 5'-A27CCPuromycin, was synthesized as described previously and conjugated to the 3'-end of the library RNA by templatedirected enzymatic ligation (Roberts and Szostak, 1997). In a 100 ml reaction volume, the purified RNA (1 nmol) was ligated to 5'-A27CC-Puromycin linker (2 nmol) in the presence of biotinylated DNA splint (2 nmol), 5'biotin-G CAA CGA CCA ACT TTT TTT TTN GCC GCA GAA G 3' (Oligos Etc. Inc.), with 10 ml of T4 DNA ligase (20 units/ml, Promega) for 4 h at room temperature. Ligated mRNA products that annealed to the biotinylated splint were captured on 250 ml of neutravidin-agarose beads (Pierce) suspended in 1 ml of PBS for 1 h at 35 °C. The beads were washed three times with 1 ml of PBS to remove unbound species, and mRNA-A27CC-Puromycin linker conjugates were eluted by incubating the beads for one hour at 35 °C with an excess (10 nmol) of antisplint, 5' CTT CTG CGG CAA AAA AAA AAG TTG GTC GTT GC 3' (Oligos Etc. Inc.). The mRNA-A27CC-Puromycin linker conjugates were concentrated by isopropanol precipitation. The purified ligated mRNA was then used to generate mRNApeptide fusion. Purified ligated RNA (1 nmol) was translated using 1 ml of lysate from the Rabbit Reticulocyte IVT kit J. Mol. Recognit. 2002; 15: 126–134 128 R. BAGGIO ET AL. (Ambion) in the presence of 15 mCi 35S-methionine (1000 Ci/mmol). After a 30 min incubation at 30 °C, KCl and MgCl2 were added to a final concentration of 530 mM and 150 mM, respectively. mRNA–protein fusion molecules were isolated by incubation with 250 mg oligo dT25 cellulose (Pharmacia) in 10 ml of incubation buffer (100 mM Tris–HCl pH 8.0, 10 mM EDTA, 1 M NaCl and 0.25% Triton X-100) for 1 h at 4 °C. Bound mRNA–protein fusion molecules were isolated by filtration, washed with incubation buffer and eluted with ddH2O. The concentration of mRNA–protein fusion molecules was determined by scintillation counting. To generate fusion molecules containing the corresponding cDNA strands, isolated mRNAprotein fusion molecules were reverse transcribed with 10 ml of Superscript II Reverse Transcriptase (200 units/ml, Gibco BRL) in the manufacturer’s buffer at 42 °C using a 5-fold excess of ligation splint as primer. In order to promote disulfide pairing of the EETI-II fusion library, DTT was omitted for the reverse transcription reaction. c-Myc. Three gel-purified oligonucleotides (Oligos Etc.) were used to construct the linear library. One naromole of the template [5'-AGC TTT TGG TGC TTG TGC ATC (SNN)27 CTC CTC GCC CTT GCT CAC CAT-3', N = A, G, C, T; S = C, G] was amplified by three rounds of PCR (94 °C, 1 min; 65 °C, 1 min; 72 °C, 2 min) using 1 mM of the 5' primer (5'-AGC TTT TGG TGC TTG TGC ATC-3') and the 3' primer (5'-TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA ATT ACA ATG GTG AGC AAG GGC GAG GAG-3') in the supplied buffer (Promega) with MgCl2 (2.5 mM), NTP (250 mM) and Taq polymerase (Promega, 5 units) in a final volume of 5 ml. The reaction mixture was extracted twice with phenol/CHCl3/isoamylalcohol, precipitated with ethanol, and dissolved in 100 ml TE. The final construct contains a randomized region where each codon has the sequence NNS, where ‘N’ represents equal probability of A, G, C and T, and ‘S’ represents equal probability of G and C, an upstream T7 transcription promoter and a variant of the TMV translation initiation sequence. Downstream from the randomized region is a sequence designed to accommodate primers for PCR, reverse transcription and ligation. Transcription followed the protocols described in the EETI-II section. For ligation, equal molar amounts (2 nmol) of RNA, 5'A27CC-puromycin, and splint (5'-TTT TTT TTT TNA GCT TTT GGT GCT TG 3') were mixed in a 1.5 ml reaction with T4 DNA ligase (1200 units, Promega) in buffer supplied by the manufacturer. Following a 4 h incubation at room temperature, the reaction was extracted twice with phenol/ CHCl3 isoamylalcohol and the nucleic acids were precipitated with isopropanol as described above. Ligated RNA was separated from unligated RNA on a 6% TBE/urea denaturing polyacrylamide gel, eluted from the gel with 200 ml H2O. Ligated RNA (1.25 nmol) was translated in a total volume of 7.5 ml using the Rabbit Reticulocyte IVT kit (Ambion) in the presence of 150 mCi 35 S-methionine (1000 Ci/mmol). Subsequent steps in preparing the library were similar to those described for the low diversity, constrained peptide library construction. The reverse transcription reaction, necessary to generate fusion molecules containing the corresponding cDNA strands used Superscript II Reverse Transcriptase (200 Copyright # 2002 John Wiley & Sons, Ltd. units/ml, Gibco BRL) and followed the manufacturer’s recommended protocol. Selection EETI-II. Potential column binders were removed from the reverse transcribed EETI-II fusion library (1 pmol in 700 ml) by incubating the library with 100 ml CL-sepharose beads (Sigma) for 15 min at room temperature. The suspension was spun at 1000 rpm for 30 s and the supernatant applied to 100 ml of trypsin-agarose bead slurry (Sigma) in selection buffer (PBS pH 7.4, 1% Tween-20), and, in the instances where high stringency conditions were used, 200 mM benzamidine (Sigma). The mixture was incubated for 30 s at room temperature. The beads were centrifuged at 1000 rpm and washed 4–8 times with 1 ml selection buffer. Bound material was eluted with 4 100 ml of 0.33 N NaOH. During the course of the selection, the enrichment of the selected binders was determined by measuring the amount of radio labeled protein fusion bound to a 10% fraction of the trypsin–agarose bead slurry. All radiolabeled measurements were performed on a scintillation counter (Wallac 1409). The eluates were neutralized and the DNA recovered by isopropanol precipitation. c-Myc. The starting library (33 pmol, 2 1013 molecules) was incubated with a 12-fold excess of the c-myc binding antibody (Santa Cruz). An excess of protein A–sepharose was used to precipitate the peptide fusion–antibody complexes. After washing the sepharose with 5 vols of selection buffer (1 PBS pH 7.4, 1 mg/ml BSA, 0.05% Tween 20) to remove non-specific binders, bound species were eluted by the addition of 15 mM acetic acid. The cDNA portion of the eluted fusion molecules was amplified by PCR and the resulting DNA was used to generate an enriched population of fusion products for ensuing rounds of selection. Binding assay EETI-II. Representative clones were subjected to further analysis. Selected clones of the EETI-II selection were prepared as mRNA-peptide fusions, the RNA in the fusion was digested with RNAse One (Promega), and the target binding of the remaining peptide was determined in the following way. mRNA-peptide fusions (2–7 pmol) were prepared as described in the EETI-II experimental section. Following oligo dT25 cellulose purification, 1.5 ml of RNase One2 (Promega) were added to, and incubated with, each mRNA–peptide fusion for 1 h at room temperature. Constant amounts of the rnase-treated mRNA–peptide fusions were then separately incubated in 500 ml selection buffer (PBS pH 7.4, 1% Tween-20) at room temperature for 20 min with various concentrations (20–200 nM) of biotinylated bovine trypsin (Sigma). Streptavidin magnetic beads (200 ml; Dynal) were added to each incubate. The resulting suspensions were rotated for 10 min. The beads were washed twice with 500 ml selection buffer; and the total radioactive counts for the beads (bound radioligand value), unbound material and washes (free radioligand J. Mol. Recognit. 2002; 15: 126–134 IDENTIFICATION OF EPITOPE-LIKE CONSENSUS MOTIFS value) was determined on a scintillation counter (Wallac). Dissociation constants for the different EETI-II variants were determined by plotting the data to the equation: Fractional radioligand bound to trypsin [trypsin concentration]/([trypsin concentration] KD The equation was derived from the law of mass action model for a ligand receptor interaction: Radioligand trypsin ! Radioligand trypsin c-Myc. 0.2 pmol aliquct 35S-labeled monoclonal RNApeptide fusion (R6-51, R6-52, R6-53, R6-55, R6-56, R6-58, R6-60, R6-61, R6-63, R6-66, R6-67, R6-68) or wild-type fusion construct (WT) were incubated with 100 pmol antimyc monoclonal antibody 9E10, with 100 pmol anti-myc monoclonal antibody 9E10 in the presence of 2 nmol synthetic myc peptide, or without monoclonal antibody, respectively. The peptide fusion–antibody complexes were precipitated by addition of protein A–sepharose. The values represent the average percentage of fusion molecules that bound to the antibody and could be eluted with 15 mM acetic acid determined in duplicate binding reactions. Sequence determination and analysis DNA recovered from the fifth round of selection was PCR amplified, gel-purified (Qiagen) and cloned into the pCR1 2.1-TOPO1 cloning system (Invitrogen). The resulting plasmids were purified (Qiagen). Cycle sequencing was performed on chosen samples and resolved on an ABI 310 sequencer. Sequences were analyzed with the MacVector analysis package (Oxford Molecular) and aligned according to amino acid sequence. RESULTS 129 starting library consisted of 1011 molecules (1 pmol), which exceeds the potential complexity of the EETI library. In contrast to the linear library, its starting library consisted of 1013 molecules (33 pmol), which only samples 10 23 of the potential complexity of the linear library. EETI-II selection In the first three rounds of selection the amount of radiolabeled protein fusion bound to immobilized trypsin increased from 1.8 to 22%, the latter being equivalent to the binding of wild-type EETI. Since these measurements of percent binding represent the ability of in vitro produced radiolabeled fusion to bind to trypsin, a dramatic extension of the c-terminus of the EETI scaffold by way of puromycin-poly-A linker and mRNA/DNA stills yields a construct capable of substantial binding to trypsin. In an attempt to further enrich the population, 200 mM benzamidine, a competitive inhibitor of trypsin (KD = 18.4 mM; Mares-Guia and Shaw, 1964), was introduced into the incubation and wash buffers in the fourth round. Additionally, the binding time was reduced to 30 s. This increased stringency decreased the amount of binding to 0.5% in the fourth round. By the fifth round under the same set of increased stringency conditions, the binding percentage of radiolabeled protein fusions to immobilized trypsin increased to 2%, indicating a possible enrichment for higher affinity binders. Sequence analysis. The cDNA pool from the fifth round of selection was cloned and 108 clones were sequenced. The sequence analysis revealed that wild-type EETI-II represented 20% of the clones (Table 1). The proline at the P2 position (nomenclature from Schechter and Berger, 1967) is conserved in all the analyzed sequences, and the arginine at the P1 position is conserved except for one instance where it is substituted by lysine. The hydrophobic residues isoleucine, leucine, or valine occupy the P1' and P2' positions. The P3' and P4' positions were highly variable (Table 1). Library complexity The two libraries constructed were a constrained, lowdiversity library based on the EETI-II scaffold, which consists of a randomly mutated first loop, and a highdiversity linear peptide library, which consists of 27 randomized residues within a linear peptide. EETI-II has been shown to be tolerant of mutagenesis while retaining its knottin fold (Favel et al., 1989a,b; Christmann et al., 1999; Wentzel et al., 1999). The choice of these two libraries was based upon the basic structural requirements for high affinity binders to trypsin and the 9E10 anti-myc antibody. Whereas a constrained peptide would be a more likely trypsin inhibitor candidate than a linear peptide, the 9E10 anti-myc antibody recognizes a linear epitope. Therefore, the library based upon the EETI-II scaffold was used against trypsin, and the high diversity linear library was used against the 9E10 anti-myc antibody. The theoretical complexity of the EETI-II library (a randomized 6-mer) is 206 or 6.40 107 different sequences; that of the linear library (a randomized 27-mer), 2027 or 1.34 1035 different sequences. For the EETI-II library, the Copyright # 2002 John Wiley & Sons, Ltd. Characterization. Representative peptides were produced in vitro as mRNA-protein fusions, treated with RNase and Table 1. Representative clones from the EETI-II selection. Conserved residues are bold; residues that have similar properties are italic. The motifs PRIL or PRLL are present in most of the clones wt EETI-II 20% 7.5% 3.75% 3.75% 3.75% 2.5% 2.5% 2.5% 2.5% 1.25% Consensus P P P P P P P P P P P P R R R R R R R R R R K R I I L I L V V L L L L X L L L L L L L L L V L L M M A L Q A A V H Q A X R R P P P L P L L R P X J. Mol. Recognit. 2002; 15: 126–134 130 R. BAGGIO ET AL. Figure 2. Binding ef®ciency of selected EETI-II variants to trypsin-sepharose as determined by the percentage of radiolabeled clone bound. tested for binding to trypsin (Fig. 2). Each peptide was therefore modified at the C-terminus due to the presence of the puromycin-DNA linker. Selected clones were further characterized by solution-based radioligand affinity analysis (Table 2). The binding efficiency of each clone was determined and, unexpectedly, a lower percentage of wildtype bound than any of the other screened clones; despite the fact that the wild-type clone bound the tightest, followed closely by the P1-lysine variant. The dissociation constant obtained for the wild-type clone (16 3.2 nM) is close to that obtained by Wentzel et al. (1999; 6.09 1.97 nM) and Le-Nguyen et al. (1990; 1.8 nM) for wild-type EETI-II constructs with an extended C-terminus. Under the conditions chosen for production, purification and binding, the discrepancy between percentage binding and dissociation constant for the wild-type clone may be a result of differences in the percentage of active (i.e. folded) wildtype clone relative to similar percentages for the other selected anti-trypsin clones. Table 2. Af®nities of the selected EETI-II variants measured in three different experiments. The clones used for af®nity characterization have the puromycin± DNA (30mer poly-dA) linker at their C-terminus Amino acid sequence (residues 3–8) Kd (nM) PRILMR PRVLAL PRLVQR PRLLAP PKLLAP 16 3.2 52 21.0 53 9.0 81 21.0 21 2.4 Copyright # 2002 John Wiley & Sons, Ltd. c-Myc selection Five rounds of selection were performed using the c-Myc monoclonal antibody, 9E10, as the target. After four rounds of selection, the amount of radiolabeled mRNA–peptide fusion recovered increased (Fig. 3). After the fifth round, 34% of the linear peptide library bound to the c-Myc antibody; this is comparable to the binding of wild-type cMyc in control experiments. No further increase in binding was observed after round five. Eluants from the fifth and the sixth round were used to characterize and sequence of the resulting clones. Sequence analysis. The cDNA pools from the fifth and sixth rounds were cloned and 116 colonies were sequenced. As presented in Table 3, two selected clones contained the sequence of the wild-type c-Myc epitope (EQKLISEEDL). A third clone differed from the wild-type by two point mutations in the nucleotide sequence, only one of which altered an amino acid (Ile to Val). The consensus sequence selected [X(Q,E)XLISEXX(L,M)] included the full length 10 amino acid wild-type sequence; identical or homologous sequences to the four of the core residues, LISE, were conserved in 86% of the 116 clones examined. Characterization. A series of experiments was performed to determine if the pool of selected peptides bound to the anti-c-Myc antibody 9E10. Fusion products from the sixth round of selection were evaluated under different immunopreciptation conditions. In order to confirm that selected fusion molecules interact with the antigen-binding site of the anti-c-Myc antibody 9E10, synthetic c-Myc peptide was introduced as a competitor for binding to the anti-c-Myc antibody. The percentage of fusion bound to the antibody J. Mol. Recognit. 2002; 15: 126–134 IDENTIFICATION OF EPITOPE-LIKE CONSENSUS MOTIFS 131 Figure 3. The enrichment of peptides binding to the c-Myc 9E10 antibody determined by measuring the increase in bound and eluted radiolabeled methionine as incorporated into the peptide sequence. An increase in signal can be observed emerging by round 4. decreased with increasing amounts of c-Myc peptide, which implies that the selected population binds to the c-Myc antigen binding site on the 9E10 antibody. No binding was detected in experiments without an antibody, which indicates that the selected peptides do not bind to protein A-agarose. The selected peptides were shown not to bind an irrelevant antibody target, an anti-integrin monoclonal antibody. The combined data suggests that the selected Table 3. Representative clones from the selection for binding to antibody 9E10. Conserved residues are bold; residues that are less frequently conserved but have similar properties are italic. The four-residue motif, LISE, is conserved c-Myc epitope E Q R6-51 C A R6-52 E E R6-53 R Q R6-55 L Q R6-56 I V R6-58 E E R6-60 M Q R6-61 T M R6-63 E Q R6-66 D M R6-67 F Q R6-68 Q R consensus X Q/E K S Y Y R R Y N D K M A V X L V L L L L L L L L L L L L I I V I I L L I I I I I I I Copyright # 2002 John Wiley & Sons, Ltd. S S S S S S S S P S S A S S E E E E E E E E E E E E E E E R Y Y Q Y Y H H E K E F X D L E C V M E H M F H M V M E L Y M D L E L E L W L X L/M population interacts specifically with the target molecule against which it was raised. Similar immunoprecipitation assays were performed on 12 selected sequences from the sixth round pool. These experiments confirmed the specific binding of the RNA– peptide fusions to the antigen-binding site of the anti-c-Myc monoclonal antibody (Fig. 4). The selected sequences bound the c-Myc antibody with an affinity similar to that of the wild-type Myc fusion. DISCUSSION mRNA display was applied to selections against two targets, the serine protease trypsin and the anti-c-Myc antibody 9E10, in order to determine the minimal binding sequence of the peptide to its protein partner. Following multiple rounds, each selection converged to a consensus binding sequence. Remarkably, in both randomized peptide selections, the highest affinity selected sequences were identical to the wild-type binding peptide, which suggests that the selection of these peptides is driven almost entirely by optimal binding interactions and not by other steps in the selection protocol. The EETI-II selection mapped the first loop of six amino acids, which binds to the trypsin active-site. The trypsinbinding sequences from the EETI-II selection are highly similar to the wild-type EETI-II first loop, including three highly conserved residues and a fourth position with a preference for hydrophobic amino acids. These results J. Mol. Recognit. 2002; 15: 126–134 132 R. BAGGIO ET AL. Figure 4. Competition between unlabeled myc peptide and radiolabeled fusion molecules (selected clones) for binding to the anti-Myc antibody 9E10. The Myc peptide competes with the selected binders, indicating a common binding site to the 9E10 antibody. imply that the PR(I, L, V)L is essential for trypsin binding. The observation that the P3' and P4' positions vary the most in the selected EETI-II variants is in accord with the cocrystal structure of CMTI-trypsin (Helland et al., 1999). In the crystal structure of the complex between CMTI and trypsin, the two residues at the P3' and P4' positions are more than 4 Å from the trypsin surface (Helland et al., 1999). Since these positions are not making any direct contact with the trypsin surface, we would expect that the selection pressure at the P3' and P4' positions would be relatively low. In other natural anti-trypsin knottins (Holak et al., 1989; Wilusz et al., 1983; Ling et al., 1993; Hatakeyama et al., 1991; Otlewski et al., 1987; Heitz et al., 1989; Wieczorek et al., 1985), the PRILM sequence in the first loop is the most common; this agrees with our EETI-II selection results, where clones containing the PRILMR sequence were dominant. The second most common sequence in this selection was PRLLAP. A similar sequence (PRILMP), can be found in the trypsin inhibitor knottin from Luffa cylindrical (Ling et al., 1993). Another interesting variant from the EETI-II selection was found to have as its first loop sequence PKLLAP. This Copyright # 2002 John Wiley & Sons, Ltd. variant is similar in affinity to the wild-type sequence, despite the substitution of Ile5 to Leu. NMR structural studies of the wild-type Ile5 to Leu EETI-II variant showed a first loop more disordered than the wild-type first loop, resulting in reduced trypsin binding (Nielsen et al., 1994). This decrease in binding relative to wild-type was not apparent in our PKLLAP variant, possibly because differences at other positions in the first loop compensated for the detrimental effect of the Ile5 to Leu substitution. The c-Myc antibody, Myc1-9E10, was originally raised against the 32 amino acid C-terminus of the human c-myc oncoprotein (Chan et al., 1987). The longest contiguous selected peptide sequence with complete identity to c-myc was EQKLISEEDL. Despite the low initial abundance of peptides containing this 10-mer sequence in the 27-mer linear peptide library, the selection pressure favored its amplification and emergence. The sequence obtained is long enough that a simple PIR (Barker et al., 2001) search would have identified the c-Myc oncogene if the identity of the antigen for the 9E10 antibody had not been known. In fact, performing a PIR pattern search on the degenerative consensus sequence, Q/E-X-L-I-S-E-X-X-L/M, yielded J. Mol. Recognit. 2002; 15: 126–134 IDENTIFICATION OF EPITOPE-LIKE CONSENSUS MOTIFS predominantly matches for c-Myc. Thirty two percent of the matches (i.e. eight out of 25) from the PIR database were the c-Myc oncogene. All human matches were c-Myc. The report that the capacity of a monoclonal antibody binding site may be as short as 5–9 residues (Dunn et al., 1999) suggests that an antibody binding region can be shorter than the selected 10 amino acid identity of the c-Myc antigen. Indeed, the results of the selection against the 9E10 antibody show that enriched peptides converge on the consensus sequence LISE. Nearly 61% of the clones contain this sequence, and another 25% contain single homologous substitutions within the LISE consensus. The remaining 14% had no more than two nonhomologous substitutions within the LISE consensus. Two of the more frequent nonhomologous substitutions were a proline for the serine and a tryptophan for the glutamic acid of the LISE consensus. Although flanking amino acids may contribute to the affinity, the conservation of the LISE tetrapeptide sequence within the majority of analyzed clones suggest that LISE is the most important determinant of binding. 133 In summary, we used mRNA display with two peptide libraries of different diversities: a low diversity, constrained knottin library with six randomized residues in the first loop; and a high diversity 27-mer, linear peptide. Both selections converged to consensus sequences, and some clones contained the precise wild-type sequence of their respective ligands. In both cases, the sequence requirements for molecular recognition of a peptide by a macromolecule were mapped at a high resolution. The application of mRNA display to identify native ligands and precise primary sequence requirements of peptides binding to proteins may become a useful tool both in proteomic applications and peptidomimetic design. Acknowledgements The authors wish to thank Michael McPherson and Thomas Cujec for critical reading of the manuscript. REFERENCES Amati B, Frank SR, Donjerkovic D, Taubert S. 2001. Function of the c-Myc oncoprotein in chromatin remodeling and transcription. Biochim. Biophys. Acta. 1471: M135±M145. Barker WC, Garavelli JS, Hou Z, Huang H, Ledley RS, McGarvey PB, Mewes H-W, Orcutt BC, Pfeiffer F, Tsugita A, Vinayaka CR, Xiao C, Yeh L-SL, Wu C. 2001. Protein Information Resource: a community resource for expert annotation of protein data. Nucl. Acids Res. 29: 29±32. Boder ET, Wittrup KD. 2000. Yeast surface display for directed evolution of protein expression. Meth. Enzymol. 328: 430± 444. Chan S, Gabra H, Hill F, Evan G, Sikora K. 1987. A novel tumour marker related to the c-myc oncogene product. Mol. Cell Probes 1: 73±82. Christmann A, Walter K, Wentzel A, Kratzner R, Kolmar H. 1999. The cystine knot of a squash-type protease inhibitor as a structural scaffold for Escherichia coli cell surface display of conformationally constrained peptides. Protein Engng. 12: 797±806. Deroos S, Muller CP. 2001. Antigenic and immunogenic phage displayed mimotopes as substitute antigens: applications and limitations. Comb. Chem. High Throughput Screen 4: 75± 110. Dunn C, O'Dowd A, Randall RE. 1999. Fine mapping of the binding sites of monoclonal antibodies raised against the PK tag. J. Immunol. Meth. 224: 141±150. Evan Gi, Lewis GK, Ramsay G, Bishop JM. 1985. Isolation of monoclonal antibodies speci®c for human c-Myc protooncogene product. Mol. Cell Biol. 5: 3610±3616. Favel A, Le-Nguyen D, Coletti-Previero MA, Castro B. 1989a. Active-site chemical mutagenesis of Ecballium elaterium trypsin inhibitor II: new microproteins inhibiting elastase and chymotrypsin. Biochem. Biophys. Res. Commun. 162: 79±82. Favel A, Mattras H, Coletti-Previero MA, Zwilling R, Robinson EA, Castro B. 1989b. Protease inhibitors from Ecballium elaterium seeds. Int. J. Pept. Protein Res. 33: 202±208. FitzGerald K. 2000. In vitro display technologiesÐnew tools for drug discovery. Drug Discov. Today 5: 253±258. Hatakeyama T, Hiraoka M, Funatsu G. 1991. Amino acid sequences of the two smallest trypsin inhibitors from sponge gourd seeds. Agric. Biol. Chem. 55: 2641±2642. He M, Taussig MJ. 1997. Antibody-ribosome-mRNA (ARM) complexes as ef®cient selection particles for in vitro display Copyright # 2002 John Wiley & Sons, Ltd. and evolution of antibody combining sites. Nucl. Acids Res. 25: 5132±5134. Heitz A, Chiche L, Le-Nguyen D, Castro B. 1989. 1H 2D and distance geometry study of the folding of Ecballium elaterium trypsin inhibitor, a member of the squash inhibitors family. Biochemistry 28: 2392±2398. Helland R, Berglund GI, Otlewski J, Apostoluk W, Andersen OA, Willassen NP, Smalas AO. 1999. High-Resolution structures of three new trypsin-squash-inhibitor complexes: a detailed comparison with other trypsins and their complexes. Acta Crystallogr. Sect. D D55: 139±148. Holak TA, Bode W, Huber R, Otlewski J, Wilusz T. 1989. Nuclear Magnetic resonance solution and x-ray structures of squash trypsin inhibitor exhibit the same conformation of the proteinase binding loop. J. Mol. Biol. 210: 649±654. Hoogenboom HR, de Bruine AP, Hufton SE, Hoet RM, Arends JW, Roovers RC. 1998. Antibody phage display technology and its appilcations. Immunotechnology 4: 1±20. Keefe AD, Szostak JW. 2001. Functional proteins from a randomsequence library. Nature 410: 715±718. Kieke MC, Cho BK, Boder ET, Kranz DM, Wittrup KD. 1997. Isolation of anti-T cell receptor scFv mutants by yeast surface display. Protein Engng. 10: 1303±1310. Le-Nguyen D, Mattras H, Coletti-Previero MA, Castro B. 1989. Design and chemical synthesis of a 32 residue chimeric microprotein inhibiting both trypsin and carboxypeptidase A. Biochem. Biophys. Res. Commun. 162: 1425±1430. Le-Nguyen D, Heitz A, Chiche L, Castro B, Boigegrain RA, Favel A, Coletti-Previero MA. 1990. Molecular recognition between serine proteases and new bioactive microproteins with a knotted structure. Biochimie. 72: 431±435. Ling MH, Qi HY, Chi CW. 1993. Protein, cDNA, and genomic DNA sequences of the towel gourd trypsin inhibitor. A squash family inhibitor. J. Biol. Chem. 268: 810±814. Mares-Guia M, Shaw E. 1964. Studies on the active center of trypsin. The binding of amidines and guanidines as models of the substrate side chain. J. Biol. Chem. 240: 1579±1585. Mattheakis LC, Bhatt RR, Dower WJ. 1994. An in vitro polysome display system for identifying ligands from very large peptide libraries. Proc. Natl. Acad. Sci. USA 91: 9022±9026. Nemoto N, Miyamoto-Sato E, Husimi Y, Yanagawa H. 1997. In vitro virus: bonding of mRNA bearing puroMycin at the 3'terminal end to the C-terminal end of its encoded protein on J. Mol. Recognit. 2002; 15: 126–134 134 R. BAGGIO ET AL. the ribosome in vitro. FEBS Lett. 414: 405±408. Nielsen KJ, Alewood D, Andrews J, Kent SB, Craik DJ. 1994. An 1H NMR determination of mirror-image forms of a Leu-5 variant of the trypsin inhibitor from Ecballium elaterium (EETI-II). Protein Sci. 3: 291±302. Otlewski J, Whatley H, Polanowski A, Wilusz T. 1987. Amino-acid sequences of trypsin inhibitors from watermelon (Citrullus vulgaris) and red bryony (Bryonia dioica) seeds. Biol. Chem. Hoppe-Seyler 368: 1505±1507. Roberts RW, Szostak J. 1997. RNA-peptide fusions for the in vitro selection of peptides and proteins. Proc. Natl. Acad. Sci. USA 94: 12297±12302. Schatz PJ, Cull MG, Martin EL, Gates CM. 1996. Screening of peptide libraries linked to lac repressor. Meth. Enzymol. 267: 171±191. Schechter I, Berger A. 1967. On the size of the active site in proteases. Biochem. Biophys. Res. Commun. 27: 157±162. Schiweck W, Buxbaum B, Schatzlein C, Gunter Neiss H, Skerra A. 1997. Sequence analysis and bacterial production of the antic-Myc antibody 9E10: the VH domain has an extended CDRH3 and exhibits unusual solubility. FEBS Lett. 414: 33±38. Copyright # 2002 John Wiley & Sons, Ltd. Speight RE, Hart DJ, Sutherland JD, Blackburn JM. 2001. A new plasmid display technology for the in vitro selection of functional phenotype-genotype linked proteins. Chem. Biol. 8: 951±965. Wentzel A, Christmann A, Kratzner, Kolmar H. 1999. Sequence Requirements of the GPNG b-turn of the Ecballium elaterium trypsin inhibitor II explored by combinatorial library screening. J. Biol. Chem. 274: 21037±21043. Wieczorek M, Otlewski J, Cook J, Parks K, Leluk J, WilimowskaPelc A, Polanowski A, Wilusz T, Laskowski M Jr. 1985. The squash family of serine proteinase inhibitors. Amino acid sequences and association equilibrium constants of inhibitors from squash, summer squash, zucchini, and cucumber seeds. Biochem. Biophys. Res. Commun. 126: 646±652. Wilson DS, Keefe AD, Szostak JW. 2001. The use of mRNA display to select high-af®nity protein-binding peptides. Proc. Natl. Acad. Sci. 98: 3750±3755. Wilusz T, W.ieczorek M, Polanowski A, Denton A, Cook J, Laskowski M Jr. 1983. Amino-acid sequence of two trypsin isoinhibitors, ITD I and ITD III from squash seeds (Curcurbita maxima). Hoppe-Seyler's Z. Physiol. Chem. 364: 93±95. J. Mol. Recognit. 2002; 15: 126–134
© Copyright 2026 Paperzz