Building Homology Models with Prime: A Case Study with Factor Xa

In this tutorial we will use Prime to build homology models of the human protein factor Xa using
two different templates, bovine factor Xa and human thrombin. The tutorial is divided into four
sections:
1. Creating a project in Maestro
2. Building models using the Structure Prediction Wizard
3. Assessing model quality
4. Refining models
Required Files: factorXa_human.fasta, FXa.psipass2, FXa_homologymodelingexample.prjzip
Homology Modeling with Prime l Page 1 of 11
1. Creating project files in Maestro
Before running any modeling tasks in Maestro, it is strongly recommended that you save your session as a project
(.prj file format). Once a project is created, any work you do will be saved automatically. An instance of the
project can be exported as an archive (.prjzip or .prj.zip file format) for easy sharing with colleagues via email.
Note: When archived projects are opened in Maestro, a new project must be created in order for any subsequent
work on the project to be automatically saved.
1. Open an archived project. Start Maestro. In the Project toolbar, press the Open icon. From the
working directory, select the file FXa_homologymodelingexample.prjzip; press Open. A scratch
project will be created.
2. Create a new project. Save the scratch project by pressing the Save As icon in the Project toolbar.
In the dialog box that appears, assign “FXa_homologymodeling” as the file name and press Save.
This will create a permanent project file; the header at the top of the Maestro window will change
to FXa_homologymodeling.prj accordingly.
2. Building models using the Structure Prediction Wizard
Here we will build two homology models of human Factor Xa using two different templates. The
first model will be built in a straightforward manner from a template structure of the bovine homolog
of Factor Xa (PDB 1KIG). The second model will be built from a human paralog, thrombin (PDB 2BVR).
Since the sequence identity between human Factor Xa and thrombin is low, we will use secondary
structure information to guide manual alignment of the two sequences prior to building the final model.
Model building consists of the following steps: (1) importing the target sequence, (2) identifying one or more
model templates, (3) editing the sequence alignment (if necessary), and (4) building the model. Additional
refinement steps can be performed after the initial model is built (see Section 4).
3. Create a new run in the Structure Prediction Wizard (SPW). Open the homology modeling window
(via Tasks -> Homology Modeling in the Maestro menu bar); from the two choices that appear,
select the Structure Prediction Wizard option (Figure 1). In the window that opens, create a new
run (File -> New, Figure 1); you can use the default run name that appears and press Ok. The name
of the run will be at the top of the window and the sequence areas will be blank.
Figure 1. The Homology Modeling (left side) and Structure Prediction Wizard (right side) windows.
4. Import the target sequence. Load the human FXa sequence by clicking the File button from Get
sequence from; in the dialog box that opens, select FactorXa_human.fasta. Press Ok. The amino
acid sequence will now appear in the upper window, while the name of the sequence will appear
in the lower window (Figure 2).
Figure 2. View of the sequence windows after importing the FXa sequence.
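Outside the wizard, it can be handy to check the contents of a FASTA file before importing it. The following is a minimal Python sketch, not part of Prime or Maestro; the read_fasta helper and the demo record are hypothetical and serve only to illustrate the file layout (a ">" header line followed by one or more sequence lines).

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record used only to show the call; this is NOT the FXa sequence.
example = [">demo_sequence", "ACDEFG", "HIKLMN"]
records = read_fasta(example)
```

To inspect a real file, pass an open file handle instead of the list, for example read_fasta(open(path)) where path points to the FASTA file listed under Required Files.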
5. Identify a model template. Press the Next button located at the bottom right-hand corner of the
window to progress to the Find Homologs step; here we will use BLAST to find a suitable template.
Press the BLAST Homology Search button to initiate the search; when the job has finished, results
will be displayed in the lower window, sorted in descending order by score. The upper window
displays the sequence alignment, colored by residue type, between FXa and the homolog that is
highlighted in the lower window (Figure 3).
Figure 3. Output of the homology search. Results are displayed in the lower window while the
sequence alignment between FXa and the selected homolog is displayed in the top window.
In the lower results window, press the ID header to sort by ID; scroll down to 1KIG, a structure of a
bovine FXa ortholog, and select it. Press Next to progress to the next step.
6. Edit the sequence alignment. In the upper sequence window, press the crop (scissors) icon
in the toolbar, then highlight the region of the target sequence that is not aligned to the template;
this region of sequence should now be colored light grey and will be excluded from the model
building step (Figure 4). Since the sequence identity is high, we will not edit the alignment
manually. Press Next to advance to the final model building step.
Figure 4. The edit alignment step. Target sequence colored light grey will not be modeled.
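The decision to skip manual editing rests on the sequence identity of the alignment. As a rough illustration of how such a number can be obtained from two aligned sequences, here is a small Python sketch. It is an assumption-laden example, not Prime's implementation: the percent_identity helper is hypothetical, and it uses one common convention (identical columns divided by columns where both sequences have a residue); other tools may use a different denominator.

```python
def percent_identity(target, template, gap="-"):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either sequence.
    aligned = [(a, b) for a, b in zip(target, template) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# Toy alignment (hypothetical, not FXa): 4 gap-free columns, 3 of them identical.
pid = percent_identity("ACD-EF", "AC-GEY")
```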
7. Build the model. Keep the default options and press the Build Model button. Once the model
is built, it will appear as a ribbon in the workspace (Figure 5). Residues that are identical to the
template are colored purple, while residues that differ are colored cyan. Sequence insertions, if
applicable, will be colored red. The model is automatically added to your project table.
Figure 5. Resulting model of human FXa built from 1KIG. Identical residues to the template are
colored purple while different residues are colored cyan.
8. Generate an alternate model using 2BVR as the template.
8A. Re-import the human FXa sequence and find homologs. Repeat steps 3 and 4 (i.e. create a new
run and input the human FXa sequence). Advance to the Find Homologs step of the SPW and run
a BLAST homology search; when the results appear, sort by ID and select 2BVR, a structure of the
human paralog thrombin. Press Next to advance to the Edit Alignment step.
8B. Incorporate secondary structure information into the edit alignment step. Since the sequence
identity between FXa and thrombin is low, we are going to use secondary structure
information to ensure optimal alignment of the two sequences. First, switch the Alignment Method
to Prime STA. Since this approach utilizes secondary structure information to build the alignment,
you can either (1) run the built-in secondary structure prediction (SSP) algorithm at this point or (2)
import predictions generated from a third-party or web-based SSP method. Here we will import
results generated using the PSI-PRED server [1]. Press the Import button and select FXa.psipass2.
Prediction results will now be visible in the top window, where residues annotated with red letter
Hs are predicted to form helices while residues annotated with blue letter Es are predicted to form
beta sheets (Figure 6). Turn on the secondary structure information for the template sequence by
pressing the View SSA icon (red cylinder) in the SPW toolbar; blue arrows indicate residues that
form beta sheets while orange cylinders indicate residues that form alpha helices (Figure 6).
Figure 6. Visualizing secondary structure and conservation information during the Edit Alignment
step of the SPW. Predictions for the target sequence are annotated as Hs (helices) or Es (beta
sheets), while the secondary structure of the template is annotated as blue arrows (beta sheets)
and orange cylinders (alpha helices). Strictly conserved aligned sequence positions are annotated
as a separate row in the alignment window.
8C. Identify strictly conserved regions based on homology. Press the Identify Globally Conserved
Residues button; this will use protein family information from PFAM [2] to identify strictly conserved
aligned positions. Conserved positions will be annotated in the sequence alignment (Figure 6).
8D. Manually edit the alignment. Use the alignment slider icons to make manual changes to the
alignment so that (a) globally conserved residues are aligned, (b) secondary structure
elements are aligned to the greatest extent possible, and (c) regions within secondary
structure elements are gap-free. Use the edit SSP icon as necessary to edit the annotation of
secondary structure regions (Figure 7).
Figure 7. Manually edited alignment (yours may look slightly different).
When finished, press Next. Note: if you have gaps in secondary structure regions of your template
or query then an error message will appear. You can choose to ignore these gaps or try to fix them
before moving on. If you have too many gaps in secondary structure regions then the model may
fail to be built.
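The gap check described in the note can be reasoned about with a short script. The Python sketch below is only an illustration of the idea and makes several assumptions: the gaps_inside_elements helper is hypothetical, the secondary structure is given as one letter per residue (H for helix, E for strand, anything else for coil), and a gap run is flagged when the residues on both sides of it carry the same H or E label. How Prime itself detects these gaps is not described in this tutorial.

```python
def gaps_inside_elements(aligned_seq, ss, gap="-"):
    """Return (first_col, last_col) for gap runs that interrupt a helix or strand.

    aligned_seq: one row of the alignment, including gap characters.
    ss: secondary structure string, one letter per residue of the ungapped sequence.
    Columns are 0-based positions in aligned_seq.
    """
    if len(aligned_seq.replace(gap, "")) != len(ss):
        raise ValueError("ss must have one letter per residue")
    flagged = []
    residues_seen = 0
    col = 0
    n = len(aligned_seq)
    while col < n:
        if aligned_seq[col] != gap:
            residues_seen += 1
            col += 1
            continue
        start = col
        while col < n and aligned_seq[col] == gap:
            col += 1  # advance to the end of this gap run
        # Only gaps with a residue on both sides can split an element.
        if 0 < residues_seen < len(ss):
            before, after = ss[residues_seen - 1], ss[residues_seen]
            if before == after and before in ("H", "E"):
                flagged.append((start, col - 1))
    return flagged


# Toy example (hypothetical): the first gap splits a helix, the second sits at a coil/strand boundary.
flagged = gaps_inside_elements("AC--DEFG-HI", "HHHHCCEE")
```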
8E. Build a model. Keep the default options and press the Build Model button. When the new
model is built it will appear in the workspace (Figure 8). Note differences in the color coding of this
model versus the one built using 1KIG as the template, in particular that much of the sequence is
not conserved and loop insertions (red) are visible.
Figure 8. Resulting homology model of FXa built using human thrombin (PDB 2BVR) as the
template. Sequence insertions are colored red.
9. Compare the two models to the structure of FXa. Open your project table. Include both models
and the entry named 1F0R_prepped in the workspace. Color each ribbon uniquely by clicking
and holding down the blue ribbon icon in the Representation toolbar and selecting Entry; the
three structures should now be colored distinctly (Figure 9). Open the structural alignment panel
(Tools -> Protein Structure Alignment in the Maestro menu bar); press Align. A window will pop
up with the resulting structure-based sequence alignment and RMSD values and the structures will
now be aligned in the workspace (Figure 9). Note that despite the low sequence identity between
1F0R and 2BVR, the model built on 2BVR has a relatively low RMSD to 1F0R.
Figure 9. Alignment of Factor Xa (green) to models built based on template structures
1KIG (blue) and 2BVR (peach).
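For readers unfamiliar with the RMSD values reported by the alignment panel, the quantity itself is simple: the square root of the mean squared distance between paired atoms. The Python sketch below shows that arithmetic only. It assumes the two coordinate lists are already paired and superimposed; it does not perform the structural alignment that the Maestro panel carries out, and the rmsd helper and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical): every paired atom is displaced by 1 unit along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(a, b)
```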
3. Assessing model quality
After building a model, it is always a good idea to examine its quality and fix any problems, such as steric
clashes and bond length or bond angle deviations, prior to using the model for tasks such as virtual screening.
Most common structural problems can be fixed using the Protein Preparation Wizard.
10. Generate a protein quality report for the model based on 1KIG. Clear your workspace by pressing
the Clear (broom) icon in the Workspace toolbar. Open the project table and include the model
built based on 1KIG in the workspace. Display protein atoms by pressing the Also Display icon in
the Display toolbar; choose Protein. Press the Protein Reports icon in the Proteins toolbar. A new
panel will open that allows you to display various metrics associated with protein quality in table
format (Figure 10). Double click on a row in the table to zoom into the corresponding region in the
workspace, with the bond or atom highlighted in yellow (Figure 10).
Figure 10. Protein report for the model built on 1KIG (right side), and the corresponding
workspace view of the highlighted entry in the table (left side). Use the drop down Display
menu in the report to view other quality metrics.
11. View the Ramachandran Plot. Close the protein report and press the Ramachandran icon on the
Proteins toolbar. In the Ramachandran plot that appears, you can visualize residues (represented by
points) that reside within allowed (yellow), favored (reddish-orange), or disallowed (white) regions of
phi/psi space (Figure 11). Double click on a point to zoom into that residue in the workspace. Note that
several residues are located outside of acceptable phi/psi regions.
Figure 11. Ramachandran plot of the FXa model built on 1KIG.
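Each point in a Ramachandran plot is a pair of backbone dihedral angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The Python sketch below shows how a single dihedral angle can be computed from four atom positions. It is a generic geometric calculation, not the code Maestro uses; the dihedral helper and the toy coordinates are hypothetical.

```python
import math


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, measured about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometry (hypothetical points, not protein atoms): the outer bonds are perpendicular.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

For a residue i, phi would be dihedral(C_prev, N, CA, C) and psi would be dihedral(N, CA, C, N_next), with the coordinates taken from the model.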
4. Refining models
The two most common tools for refining structures are the Protein Preparation Wizard and Prime loop refinement. Note
that loop refinement is a CPU-intensive calculation and should only be performed when deemed necessary.
12. Fix structural problems using the Protein Prep Wizard. Click on the Prep Wiz icon in the Project
toolbar. In the panel that opens, click on the Refine tab (Figure 12). Under H-bond assignment,
press Optimize. When the job finishes, press the Minimize button in the Restrained Minimization
section.
Figure 12. The refine tab of the Protein Preparation Wizard.
Update the protein report and Ramachandran plot; note that many of the problems have been
fixed. You can use the Quick Torsion and Sculpt tools located in the Proteins toolbar to manually
address any remaining problems.
13. Set up loop refinement for the model built on 2BVR. Clear the workspace. Open the project
table and include in the workspace the entry for the model built on 2BVR. Navigate to the Prime
loop refinement panel (Tasks -> Protein Refinement -> Refine Loops). In the window that opens,
press the Non-Template button; the table should now be populated with loops that correspond to
the red regions of the model (Figure 13). To include one or more loops in the refinement, click in the box in
the Run column. Press Run (Note: this calculation can take several hours to a day depending on the
number and length of the loops).
Figure 13. The loop refinement window.
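The red, non-template regions are stretches of the target sequence that have no template residue aligned to them. As a rough illustration, the Python sketch below lists such stretches from a pairwise alignment. It is an assumption, not a description of what the Non-Template button does internally (Prime may, for example, extend loops to include flanking residues); the non_template_segments helper and the toy alignment are hypothetical.

```python
def non_template_segments(target_aln, template_aln, gap="-"):
    """Return 1-based (start, end) target residue ranges aligned to template gaps."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    res_num = 0   # running target residue number
    start = None  # first residue of the currently open segment, if any
    for t, m in zip(target_aln, template_aln):
        if t == gap:
            # No target residue in this column; close any open segment.
            if start is not None:
                segments.append((start, res_num))
                start = None
            continue
        res_num += 1
        if m == gap:
            if start is None:
                start = res_num
        elif start is not None:
            segments.append((start, res_num - 1))
            start = None
    if start is not None:
        segments.append((start, res_num))
    return segments


# Toy alignment (hypothetical, not FXa/thrombin): residues 3-5 and 8 lack a template partner.
segments = non_template_segments("ACDEFGHIK", "AC---GH-K")
```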
References:
[1] http://bioinf.cs.ucl.ac.uk/psipred/
[2] http://pfam.xfam.org/