Lab 10.2: Homology Modeling Lab

Boris Steipe
[email protected]
http://biochemistry.utoronto.ca/steipe
Departments of Biochemistry and Molecular and Medical Genetics
Program in Proteomics and Bioinformatics
University of Toronto

Concepts
1. Sequence alignment is the single most important step in homology modeling.
2. Reasons to model need to be defined.
3. Fully automated homology modeling services perform well.
4. SwissModel in practice.

Concept 1: Sequence alignment is the single most important step in homology modeling.

Superposition vs Alignment
• The coordinates of two proteins are "superimposed" in space.
• An alignment may be derived from correlated pairs of alpha-carbons.
• A superposition may differ from an optimized symbolic alignment...

Insert of 4 residues
• Optimal sequence alignment
  gktlit nfsqehip
  gktlisflyeqnfsqehip
• Optimal structure alignment (blue = helix)
  gktlitnfsq ehip
  gktlisflyeqnfsqehip

Off by 1, Off by 4
• A shift in alignment of 1 residue corresponds to a skew in the modeled structure of about 4 Å (3.8 Å is the inter-alpha-carbon distance).
• Nothing you can do AFTER an alignment will fix this error (not even molecular dynamics).

Alignment is the limiting step for homology model accuracy
No amount of forcefield minimization will put a misaligned residue in the right place!
HOMSTRAD @ CASP4: Williams MG et al. (2001) Proteins Suppl. 5: 92-97

Indels (inserts or deletions)
• Observations of known similarities in structures demonstrate that uniform gap penalty assumptions are NOT BIOLOGICAL.
• Indels are most often observed in loops, less often in secondary structure elements.
• When they do not occur in loops, helical or strand properties are usually maintained.

Can we do better with the gap assumption?
• Required: position-specific gap penalties
• One approach: implemented in Clustal as secondary structure masks
• Get secondary structure information, convert it to Clustal mask format. (Easy - read documentation!)

Secondary structure from PDB .... (Algorithm?)

Secondary structure from RasMol .... (DSSP!)

Concept 2: Reasons to model need to be defined.

Use of homology models
Interpreting homology models: biochemical inference from 3D similarity
• Bonds
• Angles, plain and dihedral
• Surfaces, solvent accessibility
• Amino acid functions, presence in structure patterns
• Spatial relationship of residues to active site
• Spatial relationship to other residues
• Participation in function / mechanism
• Static and dynamic disorder
• Electrostatics
• Conservation patterns (structural and functional)
• Posttranslational modification sites

Abuse of homology models
• Modelling structures that cannot / will not be verified
• Analysing geometry of model
• Interpreting loop structures

Databases of Models
Don't make models unless you check first...
• Swiss-Model repository
  • 64,000 models based on 4000 structures and Swiss-Prot proteins
• ModBase
  • Made with "Modeller" - 15,000 reliable models for substantial segments of approximately 4,000 proteins in the genomes of Saccharomyces cerevisiae, Mycoplasma genitalium, Methanococcus jannaschii, Caenorhabditis elegans, and Escherichia coli.

Concept 3: Fully automated services perform well.

Homology Modeling Software?
• Freely available packages perform as well as commercial ones at CASP (Critical Assessment of Structure Prediction)
• Swiss Model (tutorial)
• Modeller (http://guitar.rockefeller.edu)

Swiss-Model steps:
1. Search for sequence similarities
   BLASTP against EX-NRL 3D
2. Evaluate suitable templates
   Identity: > 25%; expected model: > 20 residues
3. Generate structural alignments
   Select regions of similarity and match in coordinate space (EXPDB).
4. Average backbones
   Compute weighted average coordinates for backbone atoms expected to be in model.
5. Build loops
   Pick plausible loops from library, ligate to stems; if not possible, try combinatorial search.
6. Bridge incomplete backbones
   Bridge with overlapping pieces from pentapeptide fragment library, anchor with the terminal residues and add the three central residues.
7. Rebuild sidechains
   Rebuild sidechains from rotamer library - complete sidechains first, then regenerate partial sidechains from probabilistic approach.
8. Energy minimize
   Gromos 96 energy minimization
9. Write alignment and PDB file
   E-mail results
Peitsch M & Guex N (1997) Electrophoresis 18: 2714

Swissmodel in comparison
• 3D-Crunch: 211,000 sequences -> 64,000 models
  Controls:
  > 50% ID: ~ 1 Å RMSD
  40-49% ID: 63% < 3 Å
  25-29% ID: 49% < 4 Å
  Guex et al. (1999) TIBS 24:365-367
• Manual alternatives: Modeller ...
• Automatic alternatives: SwissModel, sdsc1, 3djigsaw, pcomb_pcons, cphmodels, easypred
  #1 for RMSD and % correct aligned, #2 for coverage
  EVA: Eyrich et al. (2001) Bioinformatics 17:1242-1243 (http://cubic.bioc.columbia.edu/eva)

What structure elements change between similar sequences?
• Subtle changes in protein backbone path
• Changes in amino acid side-chain rotamer orientation
  • backbone dependent
• Loops added or truncated
• Model may be incomplete

Concept 4: SwissModel in practice.

SwissModel ... first approach mode
http://www.expasy.org/swissmod

... enter the ExPDB template ID...

... run in Normal Mode (except if defining a DeepView project)...

... successful submission. Results come by e-mail.

Optimal sequence alignment
http://cbrmain.cbr.nrc.ca/EMBOSS/index.html
[...]
# Matrix: EBLOSUM35
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 122
# Identity: 36/122 (29.5%)
# Similarity: 55/122 (45.1%)
# Gaps: 28/122 (23.0%)
# Score: 150.5
[...]
#=======================================

 23 LNNKKTIAEGRRIPISKAVENPTATEIQDVCSAVGLNVFLEKNKMYSREW  72
    |:.||:.|||||||...||.|....|:.:....:||. |..:.|.|.:.|
 11 LDSKKSRAEGRRIPRRFAVPNVKLHELVEASKELGLK-FRAEEKKYPKSW  59

 73 NRDVQYRGRVRVQLKQEDGSLCLVQFPSRKSVMLYAAEMIPKLKTRTQKT 122
       .:..|||.|:.:           .::..:|:..|..|.:::
 60 ---WEEGGRVVVEKR-----------GTKTKLMIELARKIAEIR------  89

123 GGADQSLQQGEGSKKGKGKKKK 144
       :|..:|    ||.|.||||
 90 ---EQKREQ----KKDKKKKKK 104

Optimal structural superposition
1.4 Å in 32 res.

Questions? Feedback?
[email protected]
http://biochemistry.utoronto.ca/steipe/
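
The Length, Identity and Gaps figures in an EMBOSS-style report can be recomputed from the two gapped rows of the alignment. The following is only a minimal Python sketch. The function name `alignment_stats` is hypothetical, it assumes "-" as the gap character, and it does not compute Similarity because that requires the scoring matrix (EBLOSUM35 in the report above). The example rows are the last block of the alignment shown on the "Optimal sequence alignment" slide.

```python
def alignment_stats(row_a: str, row_b: str, gap: str = "-") -> dict:
    """Count columns, identical pairs and gapped columns in a pairwise alignment.

    Hypothetical helper for illustration; not part of EMBOSS.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    length = len(row_a)
    # A column is an identity if both rows carry the same residue (not a gap).
    identical = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    # A column counts as a gap if either row carries the gap character.
    gaps = sum(1 for a, b in zip(row_a, row_b) if a == gap or b == gap)
    percent = round(100.0 * identical / length, 1) if length else 0.0
    return {"length": length, "identity": identical,
            "gaps": gaps, "percent_identity": percent}


# Last block of the alignment (residues 123-144 against 90-104).
stats = alignment_stats("GGADQSLQQGEGSKKGKGKKKK",
                        "---EQKREQ----KKDKKKKKK")
print(stats)
```

Percent identity here is taken over all alignment columns, including gapped ones, which is the convention the report's "36/122" suggests.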
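
The "Off by 1, Off by 4" point can be checked numerically: if the alignment is shifted by k residues, residue i is built where residue i+k belongs, so the error is the distance between those two alpha-carbons. The sketch below is an illustration only. It uses an idealized alpha-helix (assumed parameters: 2.3 Å radius, 100° rotation and 1.5 Å rise per residue) rather than coordinates from a real structure, and the function names `ideal_helix_ca` and `mean_shift_error` are hypothetical.

```python
import math


def ideal_helix_ca(n_residues: int, radius: float = 2.3,
                   turn_deg: float = 100.0, rise: float = 1.5) -> list:
    """Alpha-carbon coordinates (in Å) of an idealized alpha-helix.

    The radius, rotation and rise per residue are assumed textbook values.
    """
    coords = []
    for i in range(n_residues):
        angle = math.radians(turn_deg * i)
        coords.append((radius * math.cos(angle),
                       radius * math.sin(angle),
                       rise * i))
    return coords


def mean_shift_error(coords: list, shift: int) -> float:
    """Mean distance between residue i and residue i+shift.

    This is the positional error made when an alignment is off by `shift`.
    """
    if shift < 1 or shift >= len(coords):
        raise ValueError("shift must be between 1 and len(coords) - 1")
    dists = [math.dist(coords[i], coords[i + shift])
             for i in range(len(coords) - shift)]
    return sum(dists) / len(dists)


helix = ideal_helix_ca(20)
off_by_1 = mean_shift_error(helix, 1)
off_by_4 = mean_shift_error(helix, 4)
print(f"off by 1: {off_by_1:.2f} Å, off by 4: {off_by_4:.2f} Å")
```

With these assumed helix parameters the off-by-1 error comes out close to the 3.8 Å inter-alpha-carbon distance quoted on the slide. In an extended strand the off-by-4 error would be larger than in a helix, because consecutive residues do not coil back toward each other.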