2010 Best Practices Competition Basic Research and Drug Discovery

Nominating / User Company | Project Title
AstraZeneca | AstraZeneca Patient Safety New Case Handling Operating Model
Chemistry Consultants (now defunct) | the Normalization of HIV
Genstruct / Pfizer GRD | Causal Network Model of 2-Butoxyethanol-Induced Hemangiosarcoma in Mice and Its Relevance to Humans
Lexicon Pharmaceuticals | Automated Compound Verification by NMR, LC/MS, and HPLC in the Drug Discovery Process
Millipore Corporation / Marrow Transplantation & Cellular Therapy, St. Jude Children's Research Hospital | Guava Benchtop Six-color Flow Cytometer (easyCyte 8HT)
Massachusetts Institute of Technology | Identifying Drug Effects via Pathway Alterations using an Integer Linear Programming Optimization Formulation on Phosphoproteomic Data
SAS Institute / The Gibson Lab, Center for Integrative Genomics, Georgia Tech | Geographical genomics of human gene expression variation
Sigma Life Science | Your Favorite Gene powered by Ingenuity
Tarbiat Modares University, EnzymeZist | Genetic Oral Delivery for Gut (GOD 4 Gut)
Zymeworks Inc | A modular computational modeling environment for structure guided optimization of protein therapeutics
Institute of Bioorganic Chemistry of the National Academy of Sciences of Belarus | Computer-assisted anti-AIDS drug development: cyclophilin B against the HIV-1 subtype A V3 loop

Published Resources for the Life Sciences
250 First Avenue, Suite 300, Needham, MA 02494 | phone: 781-972-5400 | fax: 781-972-5425

Bio-IT World 2010 Best Practices Awards
ENTRY FORM

1. Nominating Organization (Fill this out only if you are nominating a group other than your own.)
A. Nominating Organization
Organization name:
Address:
B. Nominating Contact Person
Name:
Title:
Tel:
Email:

2. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: AstraZeneca
Address: 1800 Concord Pike, Wilmington DE 19850
B. User Organization Contact Person
Name: Nate Blevins
Title: Global Drug Development IS Business Relationship Manager
Tel: +1 302 886 5621
Email: [email protected]

3. Project
Project Title: AstraZeneca Patient Safety New Case Handling Operating Model
Team Leader
Name: Tony Gill, NCHOM IS/IT Workstream Lead, AstraZeneca
Title: Global Drug Development IS Senior Programme Manager
Tel: +44 1509 644561
Email: [email protected]
IS/IT Team members – name(s), title(s) and company (optional):
Rob Sanchez, IS Project Manager, AstraZeneca
Steve Rayner, IS Project Manager, AstraZeneca
Paul Seymour, Application Service Manager, AstraZeneca
Darryl Draper, Application Service Manager, AstraZeneca
Prasanna Sundarrajan, ASM, Cognizant
Prangyasila Mishra, Project Manager, Cognizant
Karthick Sukumaran, Systems Analyst Lead, Cognizant
Mark Barsoum, Senior IS Business Analyst, AstraZeneca
Nate Blevins, Senior IS Business Relationship Manager, AstraZeneca
Colin Coote, Information Architect, AstraZeneca
Some of the Business Team members that the IS/IT team worked with:
Joachim Forsgren, Vice President Patient Safety and Project Sponsor, AstraZeneca
Mikael Rosén, Business Project Lead, AstraZeneca
Agneta Andréasson, Global Clinical Development, AstraZeneca
Carina Karlsson, Knowledge Transfer and Mentoring Support, AstraZeneca
Maria Broddéne, Global Clinical Development, AstraZeneca
Suzanne Guidera, AZ-TCS interface organization, AstraZeneca
Terry Grass, Procurement, AstraZeneca
Vita Petrik, AZ-TCS interface organization, AstraZeneca
And others from AstraZeneca Patient Safety, Tata Consultancy Services, Cognizant Technology Solutions and IBM.

4. Category in which entry is being submitted (1 category per entry, highlight your choice)
[ ] Basic Research & Biological Research: Disease pathway research, applied and basic research
[ ] Drug Discovery & Development: Compound-focused research, drug safety
[ ] Clinical Trials & Research: Trial design, eCTD
[ ] Translational Medicine: Feedback loops, predictive technologies
[ ] Personalized Medicine: Responders/non-responders, biomarkers
[ ] IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
[ ] Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
[ ] Health-IT: ePrescribing, RHIOs, EMR/PHR
[ ] Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio-IT World reserves the right to re-categorize submissions based on submission or in the event that a category is refined.)

5. Description of project (4 FIGURES MAXIMUM):
AstraZeneca Patient Safety New Case Handling Operating Model

A. ABSTRACT/SUMMARY of the project and results (150 words max.)
Having a proactive approach to ensuring the safety of patients using medicines on the market - or new drugs in
development - continues to be an important area of focus for pharmaceutical companies, including
AstraZeneca. At all stages of a drug’s lifecycle, it is essential that any safety signals or adverse events are
identified early and carefully evaluated in the context of knowledge about the drug and the patients taking it. If
an adverse drug reaction is confirmed, this can then be swiftly communicated to doctors, physicians, patients
and regulators. Alongside this proactive and rigorous approach to safety management, companies are also
looking, in the current economic climate, to operate as efficiently and effectively as they can without
compromising quality or standards, while still meeting regulatory requirements in a world where cost
pressures are becoming more acute.
Against this backdrop, AstraZeneca (AZ) Patient Safety recognized an opportunity to reshape and refocus its
patient safety operations and partnered with Tata Consultancy Services (TCS) to develop a fully automated
New Case Handling Operating Model (NCHOM). The model, believed to be an industry first, makes full
use of technology and business process enhancements in the identification and handling of all adverse events.
In turn, NCHOM is set to reduce the running costs of case handling operations at AstraZeneca by over 70%,
allowing the company to focus its investment on critical research and development activities and allowing
patient safety operations to direct their energies into safety science and proactive safety management.
Refocus of Patient Safety Activities
B. INTRODUCTION/background/objectives
To refocus patient safety activities, the AstraZeneca New Case Handling Operating Model (NCHOM) project
had a mandate to evaluate opportunities to further enhance case handling quality and productivity. Activities
in scope included processing of an ongoing stream of approximately 82,000 cases from clinical trials,
post-marketing, spontaneous reporting, legal and literature sources. This processing was managed by over 135
Patient Safety staff spread across 7 AstraZeneca locations worldwide. The aim of the NCHOM project was to
support a strategy to redesign this model and implement a best-in-class case handling process.
NCHOM was not only about reduction of cost and centralization; it was also about identifying potential for
productivity improvements (the number of cases that can be handled per case handler) and, where possible,
enhancing quality through automation and process redesign. A more targeted, value-added and productive
approach to data entry was developed to pinpoint and prioritize adverse events more simply and easily. The
redesign involved data entry and an adverse event classification process based on guidelines from the Council
for International Organizations of Medical Sciences (CIOMS) Working Group V (footnote 1). Further
productivity and quality enhancements were driven through automation of manual process steps, which in turn
reduced the risk of manual error.
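The productivity measure named above (cases handled per case handler) can be illustrated with the figures quoted in this section. This is a minimal sketch, not a calculation from the entry itself: it assumes the roughly 82,000 cases and the 135 staff refer to the same reporting period, which the entry does not state, and the function name `cases_per_handler` is hypothetical.

```python
def cases_per_handler(total_cases: int, handlers: int) -> float:
    """Productivity as defined in the text: cases handled per case handler."""
    if handlers <= 0:
        raise ValueError("handlers must be positive")
    return total_cases / handlers


# Approximate figures quoted in the introduction. "Over 135" staff means the
# true baseline per-handler figure is at most this value.
baseline = cases_per_handler(82_000, 135)
print(f"Baseline: about {baseline:.0f} cases per staff member")
```

Under the same assumption, any redesign that handles the same case volume with fewer handlers raises this ratio, which is the sense in which the entry uses "productivity".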
1 Council for International Organizations of Medical Sciences, http://www.cioms.ch/
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
The operational processing of adverse event reports was streamlined, automated and consolidated into one
location, using Tata Consultancy Services (TCS) as AstraZeneca’s data entry provider. TCS has a dedicated
team in Hungary and a back-up support team in India that receive cases from around the globe and now enter
the information directly into the AZ SAPPHIRE safety database system. Control of Patient Safety policies and
SOPs, and of how the data is acted upon, remains firmly with AZ. This includes pharmacovigilance activities
such as signal detection and safety evaluation, risk-benefit assessments and production of Periodic Safety
Update Reports, which will continue to be conducted at the AZ R&D sites, as they are now.
The change involved two stages, enabled by system enhancements from AstraZeneca Global Drug
Development IS, Cognizant Technology Solutions and IBM. The first step before transition to TCS was the
move to a new case prioritization process, differentiating the amount of information entered in the SAPPHIRE
database. The next step was the phased transfer of data entry activities to TCS between May and August 2009.
The new approach to data entry prioritization has helped AZ Patient Safety and others to more easily and
efficiently pinpoint the most critical safety-related information. However, the collection of adverse event
information and reporting of it did not change.
In addition, because this model became AZ’s “gold standard” for the process and contractual framework of
high-volume case processing, AZ was able to use the same approach for the MedImmune H1N1 vaccine. Looking
ahead, AstraZeneca is working on additional productivity increases through further process automation.
Standard data imports are being implemented for the Medidata RAVE system (the web-based tool used to
capture clinical data from clinical trials) and AstraZeneca marketing companies across the world. Regulatory
reporting will also be further automated. AZ is also planning to further enhance the SAPPHIRE system for
MedImmune vaccines/biologics and within the Asia Pacific region to gain more benefits from the New Case
Handling Operating Model.
D. ROI achieved or expected (200 words max.):
The NCHOM project is on track to deliver significant business benefits, potentially in excess of 70% cost
efficiencies from the year 2011 compared to previous running costs, thanks to process enhancements,
centralization and better utilization of case handling staff. Other benefits are set to follow as a result of the
further system enhancements outlined above. The new model has not impacted the quality of adverse event
case data, and has supported simpler and more efficient utilization of signal management tools for surveillance
and risk management.
E. CONCLUSIONS/implications for the field.
The process improvements and system automation in AZ’s new case handling operating model have enhanced
productivity and maintained a high quality of adverse event case data, while reducing cost. The new model
provides AZ with more flexibility and capacity to respond to future increases in case volume. The more
efficient use of staff for case handling has enabled a shift of more resources to proactive
surveillance and risk management. NCHOM has underpinned what AZ Clinical Development is focused on
delivering: world-leading design and interpretation of clinical trials and data, cutting-edge operational
excellence, superior clinical data and information exploitation capabilities and a culture of continuous
improvement.
REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)
Message from Joachim Forsgren, VP, AstraZeneca Patient Safety:
The many things accomplished – the set up of a new data entry centre, establishment of new processes and structures,
and the seamless transfer of work from our 7 AZ locations – is an incredible achievement and one which people from
across Patient Safety, Clinical Development and TCS should be rightly proud to have played a part. This project has
involved considerable extra effort invested from all concerned to bring in this project on time and within budget.
The new case handling operating model (NCHOM) supports our drive for operational excellence and ensuring we have
the right level of quality information we need for timely and proactive safety management of our products in
development and on the market. Equally, this reshaping means we can invest our patient safety skills in supporting the
better design of our drug programmes, enhancing our decision-making and delivering value to our products in
development – for example, the increased emphasis and investment in the areas of safety science; development of
predictive scenarios to inform patient benefit/risk analysis; and the generation of differentiated safety profiles for new
medicines.
We continue to embrace and embed the new ways of working that this change has brought about, to make this “business
as usual”, to cement our relationship with TCS and to seize every opportunity for continuous improvement.
Message from Mikael Rosén, NCHOM Business Project Manager, AstraZeneca Patient Safety:
The holistic project approach we took to business value and risk management through cross functional collaboration has
been rewarding with enhanced productivity but with no reduction of quality. Furthermore it ensured the team was
geared up to fine-tune and synchronize the many moving parts of the project to a successful delivery.
I am extremely proud of the team effort and the collaboration with TCS. This was all delivered to time and cost through
strong commitment and a shared agenda.
Message from Hilda Gibson, VP, AstraZeneca Global Drug Development IS:
This is a great example of what we can achieve when we all work together to deliver new capabilities that enable
business change.
Bio-IT World 2010 Best Practices Awards

Nominating Organization name:
Nominating Organization address:
Nominating Organization city: Saint Petersburg
Nominating Organization state: FL
Nominating Organization zip: 33710
Nominating Contact Person:
Nominating Contact Person Title:
Nominating Contact Person Phone:
Nominating Contact Person Email:
User Organization name: Chemistry Consultants now defunct
User Organization address: 2272 60th St N
User Organization city: Saint Petersburg
User Organization state: FL
User Organization zip: 33710
User Organization Contact Person: James Threadgill
User Organization Contact Person Title: ex-Consultant
User Organization Contact Person Phone: 727-345-4015
User Organization Contact Person Email: [email protected]
Project Title: the Normalization of HIV
Team Leaders name:
Team Leaders title:
Team Leaders Company:
Team Leaders Contact Info:
Team Members name: Stephen McCarter
Team Members title: Peer
Team Members Company: same
Entry Category: Basic Research & Biological Research

Abstract Summary:
Introduction: One of the major problems in developing an HIV-1 vaccine is that the outer viral coat of HIV-1 changes, because the HIV-1 reverse transcriptase introduces roughly one mutation every 10,000 bases and the HIV-1 genome is approximately 10,000 bases (10 kb) long. Erwin Schrodinger [2,3] surmised in the 1940s that biology at the molecular level might behave like certain mathematical concepts. With this idea in mind, and after Dr. Tony Fauci stated this year that the only suitable biotech solution for treatment of HIV is a vaccine, normalization of the HIV genome emerged as a possible solution to a seemingly impossible problem: how to obtain a consistent DNA sequence from a mutating virus.
Results: To counteract this problem, it is proposed that antisense RNAs built from various circular patterns, palindromic sequences and LTRs be constructed and injected, or added by vector, into the HIV-1 virus to inhibit the HIV reverse transcriptase [4] or to sequence nonsensical genetic information, while a non-mutating reverse transcriptase of known characterization is added to the HIV-1 virus, so that viral replication yields a viral product of consistent structure. The non-mutating reverse transcriptase does not come from HIV-1, but from another retrovirus. The viral DNA sequence is then said to be Normalized.
ROI achieved:
Conclusions:
References:
1. Zhou et al., Structural definition of a conserved neutralization epitope on HIV-1 gp120, Nature 445, 732-737 (2007).
2. Schrodinger, Erwin, What is Life? The Physical Aspect of the Living Cell, Cambridge University Press, 1944.
3. Ref. 2, cited in Watson, James, DNA: The Secret of Life, Alfred A. Knopf, Toronto, Canada, 2003, pp. 35-36.
4. Rossi, John, In vivo delivery of Dicer substrate RNAs for treatment of HIV infection, Novel Approaches for Targeted Delivery, Molecular Medicine Tri-Conference 2010, February 5, 2010, Moscone North Convention Center, San Francisco, CA, conference preprint.

1. Nominating Organization (Fill this out only if you are nominating a group other than your own.)
A. Nominating Organization
Organization name:
Address:
B. Nominating Contact Person
Name:
Title:
Tel:
Email:

2. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: Genstruct, Inc and Pfizer Global Research and Development
Address: One Alewife Center, Cambridge, MA 02140 (Genstruct)
Eastern Point Road, MS8274-1234, Groton, CT, 06340 (Pfizer)
B. User Organization Contact Person
Name: Diane H. Song, Ph.D.
Title: Project Lead/Scientist II
Tel: 617-547-5421 ext 235
Email: [email protected]

3. Project
Project Title: Causal Network Model of 2-Butoxyethanol-Induced Hemangiosarcoma in
Mice and Its Relevance to Humans
Team Leader
Name: Daphna Laifenfeld, Ph.D.
Title: Director of Research
Tel: 617-547-5421 ext 213
Email: [email protected]
Team members – name(s), title(s) and company (optional):
Diane H. Song, Sean F. Eddy, Annalyn Gilchrist, David Drubin, Milena Jorge, Brian P. Frushour, Renee Kenney, Keith O. Elliston (Genstruct)
Leslie A. Obert, Mark M. Gosink, Jon C. Cook, Kay Criswell, Christopher J. Somps, Petra Koza-Taylor, Michael P. Lawton (Pfizer Global Research and Development)

4. Category in which entry is being submitted (1 category per entry, highlight your choice)
[ ] Basic Research & Biological Research: Disease pathway research, applied and basic research
[x] Drug Discovery & Development: Compound-focused research, drug safety
[ ] Clinical Trials & Research: Trial design, eCTD
[ ] Translational Medicine: Feedback loops, predictive technologies
[ ] Personalized Medicine: Responders/non-responders, biomarkers
[ ] IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
[ ] Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
[ ] Health-IT: ePrescribing, RHIOs, EMR/PHR
[ ] Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio-IT World reserves the right to re-categorize submissions based on submission or in the event that a category is refined.)

5. Description of project (4 FIGURES MAXIMUM):
A. ABSTRACT/SUMMARY of the project and results (150 words max.)
B. INTRODUCTION/background/objectives
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
D. ROI achieved or expected (200 words max.):
E. CONCLUSIONS/implications for the field.
6. REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)

Causal Network Model of 2-Butoxyethanol-Induced Hemangiosarcoma in Mice
and Its Relevance to Humans
Abstract (150 word max)
Hemangiosarcomas can develop in rodents during long-term drug exposure and can delay or prevent the
development of new therapeutics. This study identifies molecular mechanisms underlying
hemangiosarcoma formation in mice to enable the assessment of these mechanisms during drug
development. We utilized Causal Network™ Modeling to reconstruct biological networks affected by
2-butoxyethanol, a model compound for the study of hemangiosarcomas in mice. Network modeling of liver,
spleen, and bone marrow from 2-butoxyethanol treated mice identified molecular mechanisms underlying
the initiation and formation of hemangiosarcoma. These initiating mechanisms provided mechanistic
insight into compound assessment in mice, and therefore, their relevance in humans. This is the first
study to comprehensively identify molecular mechanisms that initiate hemangiosarcoma in mice and
provides a key step in understanding safety concerns that can impact critical decisions made during drug
development for humans.
Introduction
Discovery of safety concerns in pre-clinical studies, such as development of tumors in rodent-based
research, may prevent a promising compound from being developed further or a marketed drug from being
used in broader clinical applications. Successfully translating pre-clinical safety issues to human risk has
been a major challenge for the pharmaceutical industry, one which can be tackled through a detailed
molecular understanding of the pre-clinical safety issue. Molecular understanding aids in gaining insight
into the translatability of safety concerns from pre-clinical models to humans, thereby enabling the
assessment of risk to humans.
One example of a pre-clinical drug-induced safety concern for which the risk to humans is difficult to
assess is hemangiosarcoma. Hemangiosarcoma is a poorly differentiated and aggressive endothelial cell
(EC)-derived tumor that is rare in humans but forms spontaneously and in response to different
compounds in mice [1, 2]. Hemangiosarcoma can occur in various organs including liver, spleen, and
bone marrow [3-5]. According to the US National Cancer Institute’s Surveillance, Epidemiology, and End
Results (SEER) and National Toxicology Program (NTP) databases, the incidence rate of
hemangiosarcoma in mice was 40,000 times higher than in humans. Despite this difference between mice
and humans, development of hemangiosarcoma in mice has negatively impacted the development and
approval of a number of drugs.
Typical compounds shown to induce hemangiosarcoma include 2-butoxyethanol (2-BE), p-chloroaniline,
p-nitroaniline, 2-biphenylamine, phenylhydrazine, Thorotrast, vinyl chloride, Elmiron, and troglitazone
[6-9]. For pharmaceutical compounds, an association with carcinogenicity can drastically limit their
development, approval for chronic indications, or further use in clinical applications. For example,
the genotoxic compound Thorotrast has been clinically associated with inducing liver
hemangiosarcoma in humans [10] and has been discontinued as an X-ray contrast agent.
Molecular mechanisms specifically contributing to hemangiosarcoma have been elusive [11, 12], limiting
our ability to assess risk for humans from exposure to compounds leading to hemangiosarcoma in mice,
thereby restricting further development for some drugs.
The etiology of hemangiosarcoma has been studied using 2-BE as a model compound [4]. 2-BE is an
organic solvent that has been shown to induce hemangiosarcoma mainly in livers of male mice, with
some occurrence in the spleen and bone marrow after inhalation exposure [6, 13]. 2-BE is used in many
popular home, commercial, and industrial products such as paints, surface coatings, glass cleaners, dry
cleaning solutions, inks, degreasers, and various other products [13, 14]. In the present study, Causal
Network™ Modeling (CNM) technology was used to study 2-BE-induced hemangiosarcoma in mice to
identify specific mechanisms of action (MOA) underlying the initiation and formation of these tumors.
A detailed understanding of the 2-BE-induced hemangiosarcoma disease network will lead to a better
assessment of safety concerns surrounding hemangiosarcoma-associated drugs as well as industrial and
agricultural agents.
Three primary objectives are highlighted in the entry submission:
1) Identify molecular mechanisms underlying the formation of hemangiosarcoma in bone marrow,
spleen, and liver by building a causal network model of 2-BE-induced hemangiosarcoma in these
organs
2) Identify the initiating event of hemangiosarcoma formation
3) Assess the species specificity of the mechanisms underlying 2-BE-induced hemangiosarcoma
A detailed understanding of the MOA of hemangiosarcoma will provide a basis for better
assessment of hemangiosarcoma-associated agents and can provide input to key decisions made
during the development of industrial products, agricultural agents, and pharmaceutical drugs.
Integration of Innovative Use of Technology and Results
Causal Network™ Modeling (CNM) is a technology and method invented by Genstruct that uses multiple
large-scale data modalities (e.g. transcriptomic, proteomic, phosphoproteomic, etc.) to derive mechanistic
hypotheses that can explain the data, thus discovering networks and pathways upstream of observed
‘omic alterations. Modeling of biological systems has been limited mainly to the mathematical modeling of
various systems, and to simple pathway painting approaches [15-18]. Recently, several investigators
have applied semi-quantitative methods to network modeling, but have had little success in scaling their
models to complete biological systems [19]. Furthermore, ‘pathway painting’ approaches simply ‘paint’
gene expression data onto known pathways, and define a set of interconnected elements. This type of
pathway analysis is limited by naïve assumptions such as equating gene expression changes to changes
in protein levels or protein activity, while the relationship between gene changes and protein
changes is often more complex. Thus, such approaches have been useful for investigators to visualize
pathways, but have not proven to have strong biological relevance. The key innovation of CNM is to
scale the method of causal modeling (models depicting cause and effect relationships) to
complete biological systems by using a computational inferencing method to infer activity states
(i.e. cause) of biological measurements from experimental and clinical samples (i.e. effects).
Collections of statistically significant mechanistic hypotheses that potentially explain biological
measurements are interconnected into a causal network that describes how these mechanisms work in concert
to produce the biological phenotypes of the complete system. CNM has been used effectively in over 50
collaborative projects with various corporate partners.

Overview of Causal Network™ Modeling (CNM) Technology

CNM is a systems biology platform that enables the building of detailed mechanistic models of signaling
networks for disease and drug action. CNM employs an artificial intelligence approach that computes
statistically significant upstream causes, termed “mechanistic hypotheses”, for observations in ‘omics data
sets (Fig. 1). For example, in Fig. 1, a measured decrease in the expression (exp) of A and increase in the
expressions of B, C, and D are consistent with increase in the transcriptional activity (taof) of M. This
approach embraces the complexity of the expression data and structures it into a model that scientists can
comprehend, rather than having to grapple with the comprehensive meaning of hundreds or thousands of data
points beyond the cognitive space of the human mind. Mechanistic hypotheses derived from multiple data
modalities (transcriptomic, phosphoproteomic, metabolomic, etc.) and tissues can be integrated together,
allowing for a true systemic Causal Network™ Model of biological events at the cellular and organism
level. These models have been able to routinely explain over 70% of the observations in a given data set
or data sets and have been used for biomarker discovery, mechanism of action (MOA) definition, and
drug safety assessment. The CNM platform has been used for multiple successful collaborations
between Genstruct and corporate partners including GlaxoSmithKline and Pfizer as well as personalized
medicine company Champions Biotechnology. Examples of such applications of this platform have been
made publicly available through peer reviewed publications [20-23] with several more in preparation.

[Figure 1. Hypotheses are best explanations of the observed data. Observed data (e.g., microarray),
represented by green (increase) and red (decrease), is used to infer hypotheses for increased (yellow) or
decreased (blue) protein activity. Nodes shown: ↓kaof(X), ↓exp(A), ↓M{P@Y}, ↑taof(M), ↑exp(B), ↑exp(C),
↑exp(D).]
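The inference described for Fig. 1 can be illustrated as a sign-consistency check between what a hypothesis predicts and what was measured. The sketch below is not Genstruct's CNM algorithm, whose scoring details are not given in this entry. It assumes a hypothetical knowledge base (`KNOWLEDGE`) of signed cause-and-effect relationships, a hypothetical function name (`score_hypothesis`), and a simple one-sided binomial tail in which each prediction is right by chance with probability 0.5.

```python
from math import comb

# Hypothetical knowledge base: for an INCREASE in the upstream node, the
# expected direction of each downstream measurement (+1 up, -1 down).
# The entries mirror the Fig. 1 example in the text.
KNOWLEDGE = {
    "taof(M)": {"exp(A)": -1, "exp(B)": +1, "exp(C)": +1, "exp(D)": +1},
}


def score_hypothesis(upstream, direction, observed):
    """Compare a hypothesis with observed changes.

    direction: +1 for "upstream increased", -1 for "upstream decreased".
    observed:  dict of measurement -> +1 (increased) or -1 (decreased).
    Returns (correct, contra, p_value); p_value is the one-sided binomial
    probability of at least `correct` agreements out of n at chance 0.5.
    """
    correct = contra = 0
    for node, sign in KNOWLEDGE[upstream].items():
        if node not in observed:
            continue  # unmeasured or unchanged: no evidence either way
        if observed[node] == sign * direction:
            correct += 1
        else:
            contra += 1
    n = correct + contra
    if n == 0:
        return correct, contra, 1.0
    p_value = sum(comb(n, k) for k in range(correct, n + 1)) / 2 ** n
    return correct, contra, p_value


observed = {"exp(A)": -1, "exp(B)": +1, "exp(C)": +1, "exp(D)": +1}
print(score_hypothesis("taof(M)", +1, observed))  # increased taof(M)
print(score_hypothesis("taof(M)", -1, observed))  # decreased taof(M)
```

With the Fig. 1 pattern, the "increased taof(M)" hypothesis agrees with all four measurements while the opposite hypothesis contradicts all four. With only four observations the chance-based p-value stays modest, which is one reason large 'omics data sets are needed for hypotheses to reach statistical significance.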
Objective 1: Identify molecular mechanisms underlying the formation of hemangiosarcoma in
bone marrow, spleen, and liver by building a causal network model of 2-BE-induced
hemangiosarcoma in these organs
• Outcome: CNM predicted increased inflammation, increased hypoxic response, increased
Epo signaling, and increased endothelial cell proliferation upon treatment with 2-BE
2-BE is recognized to cause hemolysis [6]. After confirming that 2-BE causes red blood cell (RBC)
hemolysis (breakdown of RBCs) in our experimental mouse model system, CNM was used to identify
statistically significant explanations or mechanistic hypotheses for the observed transcriptomic gene
expression changes induced by 4 hours (4h) and 7 days (7d) of 2-BE treatment in the liver, spleen, and
bone marrow of B6C3F1 mice. 7d treatment was used to identify molecular mechanisms that can lead to
hemangiosarcoma formation, while the 4h treatment served to identify potential initiating events for these
mechanisms. The following section highlights causal network assessment of 2-BE treatments in liver,
spleen, and bone marrow of B6C3F1 mice, one of the most commonly used experimental strains for
toxicity testing.
Analysis of mice treated with 2-BE for 4h identified increased transcriptional responses to inflammation
and oxidative stress in liver and spleen. Specific molecular events such as increase in the activity of
toll-like receptor 4 (Tlr4), Ikk family, NF-κB, as well as increase in the protein abundance of pro-inflammatory
cytokines Tnf, Il1a, Il1b, Il6, and Csf2 provided support for inflammation during the 4h acute treatment
(data not shown). At the 7d time point, CNM also predicted an increase in inflammation in the liver (Fig.
2A), which, like the results at 4h, corroborates previous studies from Klaunig and colleagues [3, 4]. An
increase in inflammatory response was supported by increased catalytic activity of Tlr4, as evidenced by
8 supporting gene expression changes. As illustrated in Fig. 2A, Tlr4 promotes inflammation by inducing
the transcriptional activity of NF-κB, which can lead to pro-inflammatory cytokine production of Il1b and
Il6 [24]. Activation of these interleukins, in turn, can activate macrophages and AP-1 signaling, which has
been shown to stimulate cell proliferation [25].
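The chain of events described in this paragraph can be written down as a small directed graph and traversed to list everything downstream of a given cause. This is a minimal sketch for illustration only: the edges are transcribed from the sentences above, the node labels and the function name `downstream` are hypothetical, and it does not reproduce the CNM platform's own network representation.

```python
from collections import deque

# Cause -> effects, transcribed from the paragraph above.
EDGES = {
    "catof(Tlr4)": ["taof(NF-kB)"],
    "taof(NF-kB)": ["Il1b", "Il6"],
    "Il1b": ["macrophage activation", "AP-1 signaling"],
    "Il6": ["macrophage activation", "AP-1 signaling"],
    "AP-1 signaling": ["cell proliferation"],
}


def downstream(start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(downstream("catof(Tlr4)"))
```

Starting from increased Tlr4 catalytic activity, the traversal ends at cell proliferation, which matches the direction of the argument in the text: inflammation-related signaling is proposed to lead to a proliferative response.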
A transcriptional response to hypoxia was supported in response to the 2-BE 4h treatment in both the
spleen and liver (Fig. 2B). The transcriptional response in the spleen that supports hypoxia includes the
activation of the hypoxia-inducible transcription factors Hif1a, Epas1, and Arnt, supported by 35, 20, and
10 gene expression changes, respectively (Fig. 2B). Similarly, in the liver, CNM identified increased
transcriptional responses to hypoxia after 4h treatment with 2-BE, which is supported by increases in the
activities of Hif1a and Epas1 (data not shown).
CNM further predicted an induction of Epo signaling in the bone marrow of mice in response to 2-BE 7d
treatment, consistent with the stimulatory effect of hemolysis on erythropoiesis. An increase in Epo levels
was supported in the bone marrow by 58 gene expression changes (Fig. 2C). Epo triggers differentiation
and proliferation of erythroid cells (red blood cells (RBC)) via Kras [26] and Gata1 [27], and endothelial
progenitor cells (EPCs) and endothelial cells (ECs) via activation of the phosphatidyl-inositol 3-kinase
(PI3K)/Akt pathway [28-30]. Since hemangiosarcomas are rapidly growing invasive tumors with blood
vessels directly growing in the tumor, increased Epo signaling and activation of the PI3K/Akt pathway
following 2-BE treatment should lead to increased EPC/EC cell proliferation in the bone marrow as well
as increased erythropoiesis. Consistent with this biology, CNM identified activation of the PI3K/Akt
pathway (Fig. 2C). Activation of this pathway was supported by decreased activities of the PI3K/Akt
pathway inhibitor Pten and of Foxo1, which is negatively regulated by Akt [31, 32]. In addition, increased
Gata1 transcriptional activity and increased iron accumulation via an increase in the transferrin receptor
(Tfrc) are supported in the bone marrow (Fig. 2C). Tfrc is involved in transporting iron into the cell for
erythropoiesis, which is a process responsible for erythroid production. Collectively, CNM has identified
increased Epo, Akt signaling, Gata1 activity, and iron accumulation that lead to increased erythroid
differentiation and proliferation in the bone marrow in response to 2-BE treatment (Fig. 2C).
[Figure 2 image: causal network diagrams for (A) liver, (B) spleen, and (C) bone marrow.]
Figure 2. 2-BE treatment in liver, spleen, and bone marrow. (A) Seven-day 2-BE treatment induces
inflammation in the liver via activation of Tlr4, pro-inflammatory cytokine production, and increase in
macrophage activation. Increased inflammation and transcriptional activation of AP-1 can stimulate cell
proliferation. (B) 4h 2-BE treatment leads to hypoxia in the spleen via increase in the transcriptional
activities of Hif1a, Epas1, and Arnt. (C) Seven-day 2-BE treatment leads to increased Epo signaling and
erythropoiesis in the bone marrow. Increased Epo signaling can lead to EPC/EC differentiation and
proliferation. Increased Epo signaling can also lead to erythropoiesis (erythroid differentiation and
proliferation) via increased transcriptional activity of Gata1 in the bone marrow.
Notation and color guide: catof (X) is catalytic activity of X; paof(X) is phosphatase activity of X; taof(X)
is transcriptional activity of X; kaof(X) is kinase activity of X; gtpof(X) is GTP-bound activity of X; exp(X) is
expression of X; X is abundance of protein X; green or red is an observed increase or decrease,
respectively, in the mRNA expression of a given gene; yellow and blue boxes indicate statistically
significant mechanistic hypotheses, predicted to be increased or decreased, respectively. [Numbers] in
yellow and blue boxes indicate number of gene expression changes consistent with the predicted
change in the direction of that hypothesis. Connecting lines with arrowheads indicate a causal
activation; lines with bars indicate causal inhibition.
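The notation guide above can be read as a small grammar: an optional activity prefix, an entity name, and an optional bracketed count of supporting gene expression changes. The following is a minimal sketch of that reading, not part of the CNM software; the function name `parse_node` and the returned dictionary layout are hypothetical and chosen only for illustration.

```python
import re

# Prefixes and their meanings, taken from the Figure 2 notation guide.
ACTIVITY_TYPES = {
    "catof": "catalytic activity",
    "paof": "phosphatase activity",
    "taof": "transcriptional activity",
    "kaof": "kinase activity",
    "gtpof": "GTP-bound activity",
    "exp": "expression",
}


def parse_node(label):
    """Split a label such as 'taof(Hif1a) [35]' into entity, measure, and support.

    Hypothetical helper: a bare label (no known prefix) is treated as the
    abundance of protein X, following the notation guide.
    """
    text = label.strip()
    support = None
    # Optional trailing "[N]": number of supporting gene expression changes.
    count = re.search(r"\[(\d+)\]\s*$", text)
    if count:
        support = int(count.group(1))
        text = text[:count.start()].strip()
    # Optional "prefix(entity)" form.
    call = re.fullmatch(r"(\w+)\((.+)\)", text)
    if call and call.group(1) in ACTIVITY_TYPES:
        return {
            "entity": call.group(2),
            "measure": ACTIVITY_TYPES[call.group(1)],
            "support": support,
        }
    return {"entity": text, "measure": "protein abundance", "support": support}


print(parse_node("taof(Hif1a) [35]"))
print(parse_node("Epo [58]"))
```

Labels that name a biological process rather than a protein (for example, a hypoxia response node) are outside this sketch and would be classified as protein abundance by it.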
The proposed mechanistic model for induction of hemangiosarcoma in a multi-organ system identified
increased oxidative stress, inflammation, and hypoxia in liver and spleen at an earlier time-point.
Furthermore, continuous Epo signaling in the bone marrow was identified as a significant event leading to
the differentiation and proliferation of stem cells to EPCs and ECs at a later time-point. The induction of
Epo signaling in the bone marrow most likely serves to replenish erythrocytes lost to RBC
hemolysis by 2-BE. The sustained inflammatory mechanisms predicted in the liver in response to 2-BE
could trigger recruitment of circulating EPCs/ECs to the local site of inflammation and lead to local EC
proliferation, consistent with the liver being the main site of hemangiosarcoma formation in mice in
response to 2-BE. Indeed, EPC/EC recruiting signals are commonly associated with sites of
hemangiosarcoma formation, as this effect has been observed for compounds such as p-nitroaniline,
Elmiron, and thiazolidinediones, which induce hemangiosarcoma in the liver [1, 7, 33].
Objective 2: Identify initiating event of hemangiosarcoma formation
• Outcome: CNM predicted and immunohistochemical evidence confirmed that increased
hypoxic response is one of the initiating events of hemangiosarcoma formation
Downstream of 2-BE induced hemolysis,
CNM identified hypoxia as a molecular
response in liver, spleen and bone marrow
of mice. Hypoxia was predicted in spleen
and liver treated with 2-BE for 4h, and in
spleen and bone marrow treated for 7d.
This response was then compared to the
transcriptomic fingerprint generated in
mice treated directly with reduced oxygen
as a positive control for hypoxia (data not
shown).
In addition, Hypoxyprobe
immunohistochemistry was used to
quantify hypoxia in the same organs
following these treatments (Fig. 3) to
confirm the prediction of CNM. Detection
of a transcriptional response in the spleen
that supports hypoxia after 2-BE 4h
treatment is consistent with the
Hypoxyprobe™ results for this organ after
2-BE 7d treatment, where a 19-fold
increase was observed in IHC staining for
hypoxia (Fig. 3B and 3E). Acute hypoxia
(6-8% O2 for 1.5-2.5h) served as a positive
control for hypoxic response (Fig. 3D and
3E).
It has been suggested that hemolysis and
subsequent hypoxia can serve as initiating
events to iron-induced oxidative stress in
response to 2-BE [4, 6]. Local tissue
hypoxia generates an acidic environment
that stimulates iron release from
hemosiderin, leading to the production of
hydroxyl radicals in vitro [34]. Furthermore,
Hif1a has been shown to produce reactive
oxygen species, which has also been
associated with 2-BE induced oxidative
stress [4]. Therefore, hypoxia may function
upstream of and contribute to iron-induced
oxidative stress in response to 2-BE. In addition to its role in oxidative stress, hypoxic response induces
processes that can increase the supply of oxygen such as transcription of genes involved in angiogenesis
and EC proliferation [35, 36].
Figure 3. Evidence of hypoxia with 2-BE treatments.
(A-D) Representative image of spleen tissue sections
from mice. Immunohistochemical (IHC) detection of
hypoxia by Hypoxyprobe confirmed that 7d treatment of 2-BE (panel B) leads to hypoxia in the spleen. Panels A and
C served as negative controls, while panel D was used as
a positive control for hypoxic response (acute hypoxia, 6-8% O2 for 1.5-2.5h). (E) Summary of Hypoxyprobe IHC
quantitation data from liver, spleen, and bone marrow.
Fold change indicates difference in IHC staining of treated
tissue sections when compared to negative controls.
Since hemangiosarcomas are EC-derived tumors, initiating events such as hypoxia, as evidenced by 1)
increase in the transcriptional activities of Hif1a and Epas1 (Fig. 2B), 2) increase in Epo, a Hif target (Fig.
2C), 3) increased angiogenesis (Fig. 2B) in our causal network model, and 4) confirmation of hypoxia via
immunohistochemistry (Fig. 3), may induce oxidative stress in response to 2-BE and increase
angiogenesis and EC proliferation, ultimately leading to the development of hemangiosarcoma.
Objective 3: Assess the species specificity of the mechanisms underlying 2-BE-induced
hemangiosarcoma
• Outcome: Species-specific differences indicate that mechanisms contributing to 2-BE-induced
hemangiosarcoma do not support an increased risk for humans
Hemangiosarcomas are commonly found in mice, but are rarely observed in humans. Of the mechanisms
identified in this study, increased hypoxia leading to increased EC proliferation exhibits species-specific
differences that may help explain different rates of hemangiosarcoma occurrence between mice and
humans.
Hypoxia as an initiating event of hemangiosarcoma downstream of hemolysis
Because 2-BE causes hemolysis only in rodents, humans should not be subject to subsequent hypoxia
[37]. Therefore, our 2-BE model indicates that sustained Epo signaling and aberrant EC proliferation
downstream of hypoxia also should not occur in humans. Mice may also be more vulnerable to oxidative
stress from hypoxic conditions since they have been shown to have lower hepatic levels of the antioxidant
vitamin E than humans, thereby predisposing them to develop spontaneous and compound-induced
hemangiosarcoma [38]. Consistent with our model pinpointing hypoxia as a key event, it has
been shown that mice exposed to high altitudes have a higher incidence of spontaneous benign
endothelial tumors in the ovary [39]. In contrast, there is a lack of evidence of people developing
hemangiosarcoma from chronic hemolytic disease or high altitude living conditions. Therefore, hypoxia
alone is likely not an operational component of hemangiosarcoma formation in humans.
Mice are more prone to EC proliferation than humans
Aberrant EC recruitment and proliferation is critical to EC-derived tumorigenesis. It has been reported that
the EC proliferation rate in normal liver is 4-5x higher in mice than in humans [40]. Therefore, it is plausible
that mice are more sensitive to EC proliferation, and that this process can be easily triggered by a minute
amount of stimulus. Different rates of EC proliferation could also explain species-specific differences in
hemangiosarcoma occurrence between mice and humans.
Since our 2-BE model identified hypoxia leading to increased EC proliferation as key events for
hemangiosarcoma, these assessments collectively support the conclusion that the 2-BE-driven signaling network
contributing to hemangiosarcoma formation is mouse-specific and may not be relevant to humans.
Summary
As shown in Figure 4, CNM on the transcriptomic data from bone marrow, spleen, and liver of mice
treated with the hemangiosarcoma-causing agent 2-BE revealed multiple interrelated processes that
comprise a mechanistic model of hemangiosarcoma formation. Specifically, analysis of the transcriptomic
data demonstrated support for hypoxia as a key initiating event downstream of 2-BE-induced RBC
hemolysis. Importantly, these studies establish hypoxia as an initiating event leading to Epo release
(primarily from kidney) as well as Epo-mediated increase in both erythrocyte and EPC/EC differentiation
and proliferation in bone marrow and spleen. Sustained inflammatory response leads to recruitment of
EPC/ECs and proliferation in the liver in our multi-organ mouse model of 2-BE-induced
hemangiosarcoma.
In humans, 2-BE is not hemolytic [41], and there is no evidence of people developing hemangiosarcoma
from low blood oxygen conditions such as chronic hemolytic disease or high altitude living conditions.
There is also a lack of support for low oxygen-induced endothelial tumors in humans. Therefore, our
analyses collectively indicate that RBC hemolysis leading to hypoxia, the triggering events for
aberrant EC proliferation in the mouse liver, is specifically observed in experimental mouse models and
may not be relevant in humans.
[Figure 4 image: schematic linking 2-BE, RBC hemolysis, hypoxia, Epo signaling, and inflammation to EPC/EC recruitment, EC proliferation, and hemangiosarcoma across liver, bone marrow, and spleen.]
Figure 4. Proposed mechanistic model of 2-BE-induced hemangiosarcoma.
The proposed model is based on CNM of transcriptomic data, EC proliferation,
and Hypoxyprobe™ data from liver, spleen, and bone marrow of mice treated for
4h and 7d with 2-BE. The main processes identified were (1) hypoxia as an
initiating event downstream of red blood cell (RBC) hemolysis, (2) Epo signaling
that can lead to EC differentiation and proliferation in bone marrow and spleen,
and (3) sustained inflammation that can lead to EC/EPC recruitment and EC
proliferation in the liver.
ROI Achieved or Expected (200 word limit)
An estimate of the ROI on the use of Genstruct’s unique CNM technology to comprehensively interpret
multiple transcriptomic datasets from a multi-organ system can be attempted by comparing the time and resources
required to generate a similar in-depth analysis using standard methods.
Genstruct’s modeling project required 64 FTE-weeks (four full-time-equivalents (FTE) for 16 weeks) to
fully:
• Analyze 80 transcriptomic datasets
• Build multi-organ models of 2-BE-induced hemangiosarcoma
• Identify key initiating molecular events
• Confirm hypoxia by building hypoxia models
• Assess the relevance of the 2-BE-induced hemangiosarcoma mouse model in humans
In comparison, a superficial analysis of the same 80 transcriptomic datasets using standard methods
(pathway painting approaches or other software programs) can be conservatively estimated at 160
FTE-weeks (two FTE-weeks per dataset). That figure covers only a subset of this project and does not account
for the level of depth/coverage, interpretation of consistency in biology across datasets, identification of
key initiating events, corroboration using literature searches, or assessment of relevance to human health
that we achieved during the 64 FTE-weeks. This comparison already indicates 2.5-fold fewer FTE-weeks
using Genstruct’s unique approach and clearly demonstrates significant ROI in using Genstruct’s
approach to contribute to safety assessment during drug development.
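The 2.5-fold figure follows directly from the numbers quoted in this section. As a short worked check (the variable names are illustrative only; all values come from the text above):

```python
# Values quoted in the ROI section.
FTES = 4                                # full-time equivalents on the project
WEEKS = 16                              # project duration in weeks
DATASETS = 80                           # transcriptomic datasets analyzed
STANDARD_FTE_WEEKS_PER_DATASET = 2      # conservative estimate for standard methods

cnm_effort = FTES * WEEKS                                     # FTE-weeks actually spent
standard_effort = DATASETS * STANDARD_FTE_WEEKS_PER_DATASET   # FTE-weeks for a superficial analysis
fold_reduction = standard_effort / cnm_effort

print(f"CNM effort: {cnm_effort} FTE-weeks")
print(f"Standard-method estimate: {standard_effort} FTE-weeks")
print(f"Fold reduction: {fold_reduction}")
```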
Conclusions and Implications for the Field
This is the first study to provide a detailed interpretation of transcriptomic data to develop a
comprehensive and mechanistic multi-organ model of hemangiosarcoma in silico. Specifically, this
approach can be used as a general framework for understanding molecular mechanisms that contribute
to compound-induced hemangiosarcoma in mice and their relevance in humans. This seminal work
provides a key step in understanding safety concerns, such as hemangiosarcoma, that emanate from
translational research and potentially impact critical decisions made during the development of
pharmaceutical drugs as well as household products and agricultural agents. Genstruct’s systems biology
approach can be applied to safety issues as well as efficacy and biomarker identification analyses,
thereby significantly shortening drug development and reducing attrition.
References
1. NTP Toxicology and Carcinogenesis Studies of p-Nitroaniline (CAS No. 100-01-6) in B6C3F1 Mice (Gavage Studies). Natl Toxicol Program Tech Rep Ser, 1993. 418:1-203. PMID: 12616293.
2. Mendenhall, W.M., et al., Cutaneous angiosarcoma. Am J Clin Oncol, 2006. 29(5):524-8. PMID: 17023791.
3. Corthals, S.M., et al., Mechanisms of 2-butoxyethanol-induced hemangiosarcomas. Toxicol Sci, 2006. 92(2):378-86. PMID: 16675516.
4. Klaunig, J.E. and Kamendulis, L.M., Mode of action of butoxyethanol-induced mouse liver hemangiosarcomas and hepatocellular carcinomas. Toxicol Lett, 2005. 156(1):107-15. PMID: 15705491.
5. Toxicology and Carcinogenesis Studies of Cumene (CAS No. 98-82-8) in F344/N Rats and B6C3F1 Mice (Inhalation Studies). Natl Toxicol Program Tech Rep Ser, 2009. 542:1-200. PMID: 19340095.
6. Nyska, A., et al., Association of liver hemangiosarcoma and secondary iron overload in B6C3F1 mice--the National Toxicology Program experience. Toxicol Pathol, 2004. 32(2):222-8. PMID: 15200160.
7. Abdo, K.M., et al., Toxicity and carcinogenicity of Elmiron in F344/N rats and B6C3F1 mice following 2 years of gavage administration. Arch Toxicol, 2003. 77(12):702-11. PMID: 14508637.
8. Abdo, K.M., et al., Carcinogenesis bioassay in rats and mice fed diets containing 2-biphenylamine hydrochloride. Fundam Appl Toxicol, 1982. 2(5):201-10. PMID: 7185618.
9. Herman, J.R., et al., Rodent carcinogenicity with the thiazolidinedione antidiabetic agent troglitazone. Toxicol Sci, 2002. 68(1):226-36. PMID: 12075125.
10. Przygodzki, R.M., et al., Sporadic and Thorotrast-induced angiosarcomas of the liver manifest frequent and multiple point mutations in K-ras-2. Lab Invest, 1997. 76(1):153-9. PMID: 9010458.
11. Duddy, S.K., et al., p53 is not inactivated in B6C3F1 mouse vascular tumors arising spontaneously or associated with long-term administration of the thiazolidinedione troglitazone. Toxicol Appl Pharmacol, 1999. 156(2):106-12. PMID: 10198275.
12. Duddy, S.K., et al., Spontaneous and thiazolidinedione-induced B6C3F1 mouse hemangiosarcomas exhibit low ras oncogene mutation frequencies. Toxicol Appl Pharmacol, 1999. 160(2):133-40. PMID: 10527912.
13. NTP Toxicology and Carcinogenesis Studies 2-Butoxyethanol (CAS NO. 111-76-2) in F344/N Rats and B6C3F1 Mice (Inhalation Studies). Natl Toxicol Program Tech Rep Ser, 2000. 484:1-290. PMID: 12571679.
14. Dean, B.S. and Krenzelok, E.P., Clinical evaluation of pediatric ethylene glycol monobutyl ether poisonings. J Toxicol Clin Toxicol, 1992. 30(4):557-63. PMID: 1359160.
15. Bodei, C., et al., On deducing causality in metabolic networks. BMC Bioinformatics, 2008. 9 Suppl 4:S8. PMID: 18460181.
16. Klamt, S. and von Kamp, A., Computing paths and cycles in biological interaction graphs. BMC Bioinformatics, 2009. 10:181. PMID: 19527491.
17. Schaub, M.A., et al., Qualitative networks: a symbolic approach to analyze biological signaling networks. BMC Syst Biol, 2007. 1:4. PMID: 17408511.
18. Alekseev, O.M., et al., Analysis of gene expression profiles in HeLa cells in response to overexpression or siRNA-mediated depletion of NASP. Reprod Biol Endocrinol, 2009. 7:45. PMID: 19439102.
19. Feret, J., et al., Internal coarse-graining of molecular systems. Proc Natl Acad Sci U S A, 2009. 106(16):6453-8. PMID: 19346467.
20. Blander, G., et al., SIRT1 promotes differentiation of normal human keratinocytes. J Invest Dermatol, 2009. 129(1):41-9. PMID: 18563176.
21. Smith, J.J., et al., Small molecule activators of SIRT1 replicate signaling pathways triggered by calorie restriction in vivo. BMC Syst Biol, 2009. 3:31. PMID: 19284563.
22. Pollard, J., Jr., et al., A computational model to define the molecular causes of type 2 diabetes mellitus. Diabetes Technol Ther, 2005. 7(2):323-36. PMID: 15857235.
23. Laifenfeld, D., et al., The role of hypoxia in 2-butoxyethanol-induced hemangiosarcoma. Toxicol Sci, 113(1):254-66. PMID: 19812364.
24. O'Neill, L.A. and Bowie, A.G., The family of five: TIR-domain-containing adaptors in Toll-like receptor signalling. Nat Rev Immunol, 2007. 7(5):353-64. PMID: 17457343.
25. Ballermann, B.J., et al., Shear stress and the endothelium. Kidney Int Suppl, 1998. 67:S100-8. PMID: 9736263.
26. Zhang, J. and Lodish, H.F., Identification of K-ras as the major regulator for cytokine-dependent Akt activation in erythroid progenitors in vivo. Proc Natl Acad Sci U S A, 2005. 102(41):14605-10. PMID: 16203968.
27. Jelkmann, W., Molecular biology of erythropoietin. Intern Med, 2004. 43(8):649-59. PMID: 15468961.
28. Bouscary, D., et al., Critical role for PI 3-kinase in the control of erythropoietin-induced erythroid progenitor proliferation. Blood, 2003. 101(9):3436-43. PMID: 12506011.
29. Bahlmann, F.H., et al., Erythropoietin regulates endothelial progenitor cells. Blood, 2004. 103(3):921-6. PMID: 14525788.
30. Urao, N., et al., Erythropoietin-mobilized endothelial progenitors enhance reendothelialization via Akt-endothelial nitric oxide synthase activation and prevent neointimal hyperplasia. Circ Res, 2006. 98(11):1405-13. PMID: 16645141.
31. Chow, L.M. and Baker, S.J., PTEN function in normal and neoplastic growth. Cancer Lett, 2006. 241(2):184-96. PMID: 16412571.
32. Burgering, B.M. and Kops, G.J., Cell cycle and death control: long live Forkheads. Trends Biochem Sci, 2002. 27(7):352-60. PMID: 12114024.
33. Sotiropoulos, K.B., et al., Adipose-specific effect of rosiglitazone on vascular permeability and protein kinase C activation: novel mechanism for PPARgamma agonist's effects on edema and weight gain. Faseb J, 2006. 20(8):1203-5. PMID: 16672634.
34. Ozaki, M., et al., Iron release from haemosiderin and production of iron-catalysed hydroxyl radicals in vitro. Biochem J, 1988. 250(2):589-95. PMID: 2833249.
35. Otrock, Z.K., et al., Hypoxia-inducible factor in cancer angiogenesis: structure, regulation and clinical perspectives. Crit Rev Oncol Hematol, 2009. 70(2):93-102. PMID: 19186072.
36. Yamakawa, M., et al., Hypoxia-inducible factor-1 mediates activation of cultured vascular endothelial cells by inducing multiple angiogenic factors. Circ Res, 2003. 93(7):664-73. PMID: 12958144.
37. Udden, M.M., In vitro sub-hemolytic effects of butoxyacetic acid on human and rat erythrocytes. Toxicol Sci, 2002. 69(1):258-64. PMID: 12215681.
38. Siesky, A.M., et al., Hepatic effects of 2-butoxyethanol in rodents. Toxicol Sci, 2002. 70(2):252-60. PMID: 12441370.
39. Mori-Chavez, P., et al., Influence of altitude on late effects of radiation in RF-Un mice: observations on survival time, blood changes, body weight, and incidence of neoplasms. Cancer Res, 1970. 30(4):913-28. PMID: 4926801.
40. Ohnishi, T., et al., Comparison of endothelial cell proliferation in normal liver and adipose tissue in B6C3F1 mice, F344 rats, and humans. Toxicol Pathol, 2007. 35(7):904-9. PMID: 18098037.
41. Gualtieri, J.F., et al., Repeated ingestion of 2-butoxyethanol: case report and literature review. J Toxicol Clin Toxicol, 2003. 41(1):57-62. PMID: 12645968.
1. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: Lexicon Pharmaceuticals
Address: 350 Carter Road, Princeton, NJ 08540
B. User Organization Contact Person
Name: Philip Keyes
Title: Director, Analytical Chemistry
Tel: (609) 466‐5596 work; (609) 847‐1684 cell
Email: [email protected]
3. Project
Project Title: Automated Compound Verification by NMR, LC/MS, and HPLC in the Drug Discovery Process
Team Leader
Name: Philip Keyes
Title: Director, Analytical Chemistry
Tel: (609) 466‐5596 work; (609) 847‐1684 cell
Email: [email protected]
Team members – name(s), title(s) and company (optional):
Gonzalo Hernandez, NMR Spectroscopist, Lexicon Pharmaceuticals
Jim Robinson, Assoc. Director ChemInformatics, Lexicon Pharmaceuticals
4. Category in which entry is being submitted (1 category per entry, highlight your choice)
☐ Basic Research & Biological Research: Disease pathway research, applied and basic research
√ Drug Discovery & Development: Compound‐focused research, drug safety
☐ Clinical Trials & Research: Trial design, eCTD
☐ Translational Medicine: Feedback loops, predictive technologies
☐ Personalized Medicine: Responders/non‐responders, biomarkers
☐ IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
☐ Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
☐ Health‐IT: ePrescribing, RHIOs, EMR/PHR
☐ Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio‐IT World reserves the right to re‐categorize submissions based on submission or in the event that a category is refined.)
Automated Compound Verification by NMR, LC/MS, and HPLC in the Drug Discovery Process
Philip Keyes, Director Analytical Chemistry, Lexicon Pharmaceuticals
Gonzalo Hernandez, Research Scientist, Lexicon Pharmaceuticals
Jim Robinson, Assoc. Director ChemInformatics, Lexicon Pharmaceuticals
ABSTRACT
A custom-designed system of tools has been constructed from off-the-shelf components to provide
automated structure confirmation during the compound submission, supervisor verification, and
registration phases of the drug discovery process. While LC/MS and HPLC data are essential
components of the verification, NMR data provides an orthogonal result, and has proven more
challenging to automate. By analyzing the 1H-13C NMR correlation experiment—the 2D-HSQC—
through a comparison to predicted data and proposed structural assignments, a result can be derived
indicating the likelihood that the proposed structure is consistent with the data. By using a
multidisciplinary analytical approach combining NMR, LC/MS, and HPLC data, the analytical
department has helped to systematically identify problems with submitted compounds, allow for
corrections, and enhance the overall quality of the compound submission process. The automated
verification process has become a valued asset that helps maintain the quality of our drug discovery
efforts.
BACKGROUND
The widespread shift of many organizations from traditional analytical support groups to open-access
laboratories has led to more routine LC/MS and NMR analysis by synthetic and medicinal chemists.
With the volume of data being acquired on open-access instrumentation, as well as the additional
burden this analysis places on the chemists, automated structure verification by software has been
investigated in this area to improve throughput and QC of registration libraries. It was not uncommon 20
years ago for chemistry departments to require a quick analytical review for each final compound.
However, due to technological advances in data acquisition and robotics, the ability to generate ever
greater volumes of analytical data, and the demand to increase the number of compounds produced,
analytical interpretation has struggled to keep pace with compound reviewing demands.
Since the introduction of NMR prediction software, medicinal chemists have contemplated submitting
their compounds to a corporate sample registration system that would ultimately display a simplified
pass/fail result for their compounds. In the small molecule drug discovery environment, key sample
information requirements are purity and identity for each structure submitted to the corporate compound
registration system. We have implemented an integrated automated verification system that renders a
simplified graphic result without requiring any additional labor on the part of our medicinal chemists, for
compound registration and sample submission. The use of the relatively fast and sensitive Gradient-HSQC
allows for a robust and reliable comparison of predicted to observed data using
commercially available NMR prediction platforms (ACD/2D NMR Predictor for our purposes). We have
added NMR automated verification on top of the automated verification already implemented for our
HPLC and LC/MS instruments to augment our results. The combined multi-disciplinary system provides
near real time feedback, on a pass/fail basis, for HSQC spectra run on final compounds.
Compounds submitted to corporate registration systems are generally assumed to be relatively pure
and conform to the submitted structure with a greater reliance on the mass spectrum confirming just the
formula weight. Regioisomers, which can arise from alternative positions of substitution in a reaction,
are, however, generally not well characterized by classical mass spectrometric analysis, and a
greater dependence on NMR is usually necessary. Subtle differences that may result from unexpected
substitutions can potentially be resolved when comparing predicted chemical shift values in the Carbon
13 spectrum. Comparison of predicted Carbon 13 chemical shifts is generally preferred for this type of
work, relative to proton prediction, since it is considered a much more mature, reliable, and accurate
method. Unfortunately, only a small percentage of medicinal chemistry compounds have Carbon 13
data collected at the time of their initial registration due to the relative insensitivity of the technique.
Proton NMR spectra are routinely collected; however, exchangeable protons, the lock solvent used, and
pH effects impart limitations on the usefulness of these predictions for the purpose of “qualifying” a
compound. By using the 2D-HSQC, an experiment that can be run on milligram quantities of sample in
a relatively short period of time (5–15 minutes), we can overcome some of the limitations of both
techniques. The HSQC spectrum provides the software with the ability to experimentally extract proton
chemical shift information that is free of exchangeable protons, and thus eliminate the inherent
inaccuracy they generally add to predictions. More importantly, the HSQC provides Carbon 13 chemical
shift information that would otherwise be unavailable without the larger quantities of compound needed
to collect a 1D-Carbon 13 spectrum generated in the later stages of drug discovery.
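To make the idea of comparing predicted to observed HSQC data concrete, the following is a minimal sketch that pairs each predicted 1H/13C cross-peak with the nearest unused observed peak inside a tolerance window and reports the fraction matched. It is not the algorithm used by the ACD software, which is not described here; the names `Peak` and `match_fraction`, the greedy pairing strategy, the tolerance values, and the example shifts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    h_ppm: float  # 1H chemical shift of the cross-peak
    c_ppm: float  # 13C chemical shift of the cross-peak


# Hypothetical tolerances (ppm), for illustration only.
H_TOL = 0.3
C_TOL = 5.0


def match_fraction(predicted, observed, h_tol=H_TOL, c_tol=C_TOL):
    """Return the fraction of predicted cross-peaks that find an observed partner.

    Greedy pairing: each predicted peak takes the closest still-unused observed
    peak that lies within both tolerances. Each observed peak is used at most once.
    """
    unused = list(observed)
    matched = 0
    for p in predicted:
        best, best_dist = None, None
        for o in unused:
            dh = abs(p.h_ppm - o.h_ppm)
            dc = abs(p.c_ppm - o.c_ppm)
            if dh <= h_tol and dc <= c_tol:
                # Scale each axis by its tolerance so both contribute comparably.
                dist = dh / h_tol + dc / c_tol
                if best_dist is None or dist < best_dist:
                    best, best_dist = o, dist
        if best is not None:
            unused.remove(best)
            matched += 1
    return matched / len(predicted) if predicted else 0.0


# Made-up shifts: two of the three predicted cross-peaks have an observed partner.
predicted = [Peak(7.30, 128.5), Peak(3.80, 55.2), Peak(2.10, 21.0)]
observed = [Peak(7.26, 129.1), Peak(3.75, 56.0)]
print(match_fraction(predicted, observed))
```

A score of this kind could then be thresholded into pass, ambiguous, or fail; where those thresholds sit is a tuning decision that this sketch does not address.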
The existing compound registration and submission system interface has been modified to show a
simple traffic light result of red, yellow, or green for each “qualifying” analysis, which allows for rapid
visualization of the overall result.
Figure 1—Workflow schematic of automated verification/structure confirmation. The black box refers to
the ACD/Automation Server software¹.
[Figure 1 image: workflow schematic showing the manual processes, "The Black Box" producing the 1H / 13C NMR result, and the automated processes.]
Over 2000 sets of compounds with 1H, HSQC, LC/MS, and HPLC data have helped improve the overall
quality of our registration database and have allowed us to examine the strengths and weaknesses of the
current implementation so that regular improvements can be made.
RESULTS
Since initial implementation in 2007, a portion of all compounds
submitted has undergone automated verification. The rate of use has steadily increased over time as
a result of the robust integration within our compound submission system, ease of use, and support of
management for the process. To date, over 2000 compound data sets have been evaluated. A key
metric for determining the accuracy of the system is the rate of false negatives, that is, correct structures reported as failures.
Each negative result is carefully inspected to determine whether it is a true negative or a false negative and, if
false, what the cause is. True negatives are corrected, often because of a structure drawing
error, and, if necessary, excluded from the inventory when they fail for other reasons. As an aggregate,
we have observed a false negative rate of 19% across MedChem and intermediate compounds
evaluated (Figure 2). To validate the system, 152 commercial chemicals were used to benchmark,
challenge and optimize the system using a series of positive and negative controls (correct and
incorrect structures). For this somewhat simpler set of compounds (Avg MW = 263) relative to
medicinal chemistry submissions (Avg MW = 341), a false negative rate of 8% was observed while
fixing evaluation criteria and thresholds to keep the false positive rate near or below 20% (Figure
3). This is a critical step in establishing the basis for organizational confidence in the results.
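The two rates discussed above can be tallied from a benchmark of positive and negative controls as sketched below. This is an illustrative assumption about the bookkeeping, not the system's actual code: the function name `benchmark_rates`, the three verdict labels, and the toy data are hypothetical. A false negative is taken to be a correct structure reported as a failure, and a false positive an incorrect structure reported as a pass; ambiguous verdicts count toward neither.

```python
from collections import Counter


def benchmark_rates(results):
    """Compute false negative and false positive rates from benchmark outcomes.

    results: iterable of (structure_is_correct, verdict) pairs, where verdict
    is one of "pass", "ambiguous", or "fail" (hypothetical labels).
    """
    results = list(results)
    correct = Counter(verdict for ok, verdict in results if ok)
    incorrect = Counter(verdict for ok, verdict in results if not ok)
    n_correct = sum(correct.values())
    n_incorrect = sum(incorrect.values())
    return {
        # Correct structures that the system failed.
        "false_negative_rate": correct["fail"] / n_correct if n_correct else None,
        # Incorrect structures that the system passed.
        "false_positive_rate": incorrect["pass"] / n_incorrect if n_incorrect else None,
    }


# Toy data, invented for illustration; not the 152-compound benchmark.
toy = [
    (True, "pass"), (True, "pass"), (True, "ambiguous"), (True, "fail"),
    (False, "fail"), (False, "fail"), (False, "fail"), (False, "ambiguous"), (False, "pass"),
]
print(benchmark_rates(toy))
```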
Figure 2—System verification results for 1402 MedChem and Intermediate compound data sets.
[Figure 2 chart: Green 57%; Yellow 24%; False Negative 19%.]
Figure 3—Benchmark of 152 commercial reference compound data sets used to challenge the auto
verification system.
[Figure 3 image: example correct structure (positive control) and incorrect structure (negative control), with result distributions for each scored as right answer, ambiguous, or wrong answer. Values shown: 77% and 8% (false negative) for correct structures; 49% and 20% (false positive) for incorrect structures.]
ROI
How do you measure the value of a process that you have not done in the past (due to impracticality or
excessive labor cost), but that you do now because it can be done with automation? The most direct
calculation of ROI would be the cost associated with performing the same work in the absence of the
system. Based on the current compound submission rate for 50 to 70 chemists at our site, this would
translate to a minimum of one additional FTE, a skilled NMR spectroscopist. The capacity of the system
as installed is greater than our current operational needs and therefore would scale even better with a
higher density of chemists. Based on traditional FTE costs alone, the ROI would essentially be a
minimum of $250,000 annually. It is, however, very difficult to calculate the actual total return on
investment for application of the automated NMR verification system to our compound submission
process due to the many added benefits beyond this basic staffing consideration. One of these
benefits is the information content. Chemists are presented with an
automatically generated report (Figure 4) that shows a tentative assignment of their compound. This in
turn helps them improve their own NMR interpretation skills and thus contributes to the professional
development of the organization as a whole. It is also a huge time saver for our supervisors: it
reduces the time they spend on data review for approval by helping them focus on
compounds potentially needing more scrutiny. Through a color-coded tentative assignment provided by
the verification protocol, the software quickly points to the problem area of the sample, which allows for
rapid visualization and decision-making by the supervisor. In addition, the presence of tentative
assignments allows our supervisors to evaluate whether the software is passing or failing compounds
for the right or wrong reasons, a luxury we did not have prior to the roll-out of this system since
chemists' assignments were not documented. Once assignments are corrected by an analytical expert,
selected high value sets of unique assignments are then exported to an ACD/Labs prediction training
database where they can improve the verification results on samples submitted in the future. This
allows us to not only retain the knowledge extracted from the analysis, but also leverage it to improve
the overall system.
Figure 4—Automatically generated assignment report from ACD/Automation Server with color coded
indicators for assignment confidence.
The key return, however, is greater accuracy of the resulting structural information. One source of return
is improved interpretability of Structure Activity Relationship (SAR) information from in vitro and in vivo
screening results during lead optimization. Furthermore, the long-term benefit of adding correct
structures/compounds to our screening library, and the corresponding avoidance of “compounding”
future errors, is a major return. If compounds tested in lead optimization have incorrect structures, this
can have a profound and misleading impact on SAR. Had the correct compound/structure information
been available, a series that could have been promising might instead have yielded a more accurate
understanding of SAR, which would have helped move the lead optimization project forward. The cost
saved by avoiding unneeded or misleading in vitro or in vivo screening to develop SAR is, in
this case, the first level of return. On the other hand, if correct structural information results in either
re-synthesis or new information, it is conceivable that tractable information could be
developed that would lead to a successful lead optimization campaign, and ultimately a good drug
candidate that otherwise might not be found.
One good example that we have seen resulted from the incorrect assignment of a
constitutional isomer (regiochemistry) for a building block. Many analogs had been synthesized from
that building block, and their structures then needed to be corrected. From a long-term point of view,
compounds are added to our screening libraries. When hits are obtained, compounds are usually
resynthesized and confirmation is attempted with a primary screen. If the structure of the initial hit is
incorrect, it can prove quite challenging to deduce the structure from what is usually very small amounts
of material that are available in the screening library. This can lead to missing what could have
otherwise been a tractable hit for a future lead optimization campaign and ultimately a successful drug
candidate.
The extreme extrapolation of this is therefore the value of a drug on the market. It is more likely that
building quality into compound libraries helps to eliminate the cascading downstream costs of testing
useless or incorrect compounds that cannot later be confirmed or provide useful information.
Elimination of wasteful testing can, in turn, free up valuable resources and therefore help to cut down
the overall time to prosecute programs and obtain drug candidates.
CONCLUSION
We have demonstrated that a practical automated NMR verification system, incorporating ACD/Labs
NMR software components, can be implemented in an industrial/pharmaceutical environment and that
the system is robust and adds value to compound collection integrity and quality. It has also
provided a means to apply analytical review, with minimal staffing, to entire collections of compounds,
which would otherwise not be possible. The system lets reviewers evaluate a smaller portion of the
compound collection by eliminating the need to look at compounds verified with a high degree of confidence
and concentrating on those more likely to have issues. Simplicity has been the key to its effective and
growing use in our organization. By implementing a system that requires negligible additional
effort from users, while providing a wealth of information useful to our chemists, including an overall
assessment of the candidate compound and tentative assignments, we have achieved user
acceptance and trust.
REFERENCES
1. Automated compound verification using 2D-NMR HSQC data in an open-access environment,
Magnetic Resonance in Chemistry, Volume 47 Issue 1, Pages 38-52, 2008
2. Advanced Chemistry Development, Inc., Toronto, Ontario, Canada. http://www.acdlabs.com
Concept and results presented at the following events (Presentations):
- ENC 2008, Session Lecture, Small Molecule Techniques
- ENC 2008, ACD/Labs NMR Software Symposium, March 9, 2008, Asilomar, CA
- SMASH 2008, ACD/Labs NMR Software Symposium, September 7, 2008, Santa Fe, NM
- EUM 2008, ACD/Labs 9th Annual European Users’ Meeting, October 28, 2008, Paris, France
- ENC 2009, ACD/Labs NMR Software Symposium, March 29, 2009, Asilomar, CA
- EAS 2009, Session Lecture, Oct 29, 2009, Somerset, NJ
Concept and results presented at the following events (Posters):
- ENC 2007, Poster Session, Daytona Beach, FL, April 24, 2007
- ENC 2008, Poster Session, Asilomar, CA, March 10, 2008
- ENC 2009, Poster Session, Asilomar, CA, March 30, 2009
Bio-IT World 2010 Best Practices Awards
1. Nominating Organization (Fill this out only if you are nominating a group other than your own.)
A. Nominating Organization
Organization name: Millipore Corporation
Address: 290 Concord Road, Billerica, MA 01821
B. Nominating Contact Person
Name: Catherine Varmazis
Title: Corporate Communications Specialist
Tel: 978-715-1336
Email: [email protected]
2. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: Bone Marrow Transplantation & Cellular Therapy, St. Jude Children's Research Hospital
Address: 262 Danny Thomas Place, Memphis, TN 38105-3678
B. User Organization Contact Person
Name: Dr. Mari Hashitate Dallas
Assistant Member, St. Jude Faculty
Bone Marrow Transplantation & Cellular Therapy
St. Jude Children's Research Hospital
262 Danny Thomas Place
Memphis, TN 38105-3678
Email: [email protected]
Phone: (901) 595-3695
3. Project
Project Title: Guava Benchtop Six-color Flow Cytometer (easyCyte 8HT)
Team Leader
Name: Rick Pittaro
Title: R&D Director
Tel: 510-576-1400
Email: [email protected]
Team members – name(s), title(s) and company (optional): Jason Whalley, Product Manager,
Millipore
4. Category in which entry is being submitted (1 category per entry, highlight your choice)
• Basic Research & Biological Research: Disease pathway research, applied and basic research
□ Drug Discovery & Development: Compound-focused research, drug safety
□ Clinical Trials & Research: Trial design, eCTD
□ Translational Medicine: Feedback loops, predictive technologies
□ Personalized Medicine: Responders/non-responders, biomarkers
□ IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
□ Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
□ Health-IT: ePrescribing, RHIOs, EMR/PHR
□ Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio-IT World reserves the right to re-categorize submissions based on submission or in the event that a
category is refined.)
5. Description of project (4 FIGURES MAXIMUM):
A. ABSTRACT/SUMMARY of the project and results (150 words max.)
Millipore’s guava easyCyte 8HT flow cytometer is a compact system with 96-well automation
that speeds research and makes it more cost-effective. The system lets researchers conduct
complex cell analysis at their benchtops in their own labs rather than having to send samples to
a core lab for analysis. In addition, it does not require a full-time technician to operate it, thus
making it more cost-effective than traditional flow cytometers.
The system’s six-color capability lets researchers conduct highly multiplexed experiments in
which six targets and eight parameters can be simultaneously analyzed in a single sample of
cells. The six-color detection range also gives greater freedom to choose fluorescent dyes with
well-separated emission spectra.
The system’s patented microcapillary technology means it can analyze smaller sample sizes
(up to 20 times smaller in some cases), thus substantially reducing reagent and
antibody costs.
B. INTRODUCTION/background/objectives
Researchers whose experiments require cell analysis using multi-laser flow cytometry have
traditionally faced several challenges: lack of access to a flow cytometer; a tedious, time-consuming
analysis even when a flow cytometer is available; and more recently, “data overload”
that makes it difficult to interpret the data.
Flow cytometry is a complex process that has traditionally been available only in core labs, often
situated far from researchers’ labs, so scientists had to reserve time in the core lab
and send their samples there for analysis.
This was not only inconvenient and time-consuming, it was also costly. For example, if
scientists’ samples were unavoidably delayed in their own lab, they could be charged for the
time they had reserved at the core lab even if they could not do their analysis as planned.
Millipore addressed these problems with two goals in mind. First, we wanted to make multi-laser
cytometry accessible to people beyond the core flow lab. We also wanted to simplify the
complex process of analyzing plate-based assays.
Our solution was to develop a small-footprint six-color flow cytometer (the guava easyCyte 8-HT
Flow Cytometry System) for benchtop use that runs on guava InCyte Flow Cytometry Software
(see separate entry).
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
In the words of Dr. Mari Hashitate Dallas, Assistant Member, St. Jude Faculty:
My lab is dedicated to improving the outcome of pediatric patients with malignant and non-malignant
hematologic disease who are undergoing bone marrow transplants.
The focus of our research is to address a central question in pediatric transplantation: the
immunological barriers to umbilical cord blood transplant (UCBT) and possible methods to overcome
them, thereby improving the overall survival of patients undergoing UCBT.
The easyCyte 8HT flow cytometer in the laboratory allows us to overcome major limitations
associated with our research.
• Time constraints: Once animals are shown to be engrafted via the Xenogen IVIS-200 in
vivo imaging system, cohorts of up to 30 animals are sacrificed in a short interval, and
bone marrow, blood, spleen, thymus and lymph nodes are harvested. This is a critical
time point, and analysis used to depend on the availability of the shared flow cytometry
facility. The easyCyte 8HT benchtop flow cytometer has removed this obstacle and
enables us to perform five-color analysis on multiple samples. The multiple sample
analysis is possible due to the built-in automated plate sampler; the easyCyte
instrument is the only instrument with fully integrated plate-based analysis, saving
valuable time.
• Limited patient samples: The micro-capillary flow system of the easyCyte 8HT allows
our lab to analyze critical patient samples more effectively. The volume of blood we are
currently able to obtain from pediatric patients is limited. We will be able to analyze a
larger panel from a smaller sample due to the reduced sample volume required when
using the easyCyte 8HT, compared to other instruments.
D. ROI achieved or expected (200 words max.):
Here at the Bone Marrow Transplantation & Cellular Therapy center of St. Jude Children’s Research
Hospital, we use the easyCyte 8HT flow cytometer primarily to analyze cell staining.
It definitely has saved us money, since we do not have to pay the core flow cytometry facility for
analysis, but do it ourselves instead. It also saves us money due to its 96-well plate reader
capability. We can now let the plate run while we do something else with our time,
instead of having to sit taking tubes on and off the machine.
Also, our monthly charges at the core facility have been greatly reduced, from $1000-$2000 per
month to $100-$200 per month.
E. CONCLUSIONS/implications for the field.
With its smaller footprint and more powerful analysis capabilities than traditional flow cytometers,
Millipore’s guava easyCyte 8-HT Flow Cytometry System accelerates the analysis of cell
samples and makes flow cytometry accessible to more researchers. Researchers get their
analysis more quickly, allowing them to complete their experiments faster.
1. REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)
Full modulation is a key technology advancement on the guava easyCyte 8HT that enables detection of eight
parameters from two lasers (six colors) on a capillary-based cytometer. The blue and red excitation lasers are
modulated so they are out of phase.
The laser beams overlap spatially but are modulated (ON/OFF) out of phase. They are modulated at a high
enough frequency to sample each cell multiple times as it passes through the overlapped laser beams, such that
no information is lost. Because the signals from the two lasers physically overlap, they do not need to be
time-delayed to be correlated, so the precise time calibration required for most flow cytometers is unnecessary.
See figures below.
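The out-of-phase modulation described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the instrument's signal processing: it assumes a single detector channel, square-wave ON/OFF modulation with one detector sample per half-period, and a Gaussian pulse for the cell's transit through the beams. The function names and all numeric values are hypothetical.

```python
import math

def simulate_detector(n_samples, blue_amp, red_amp, center, width):
    """Simulate one detector channel while a single cell crosses the beams.

    Even-numbered samples are taken while the blue laser is ON (red OFF),
    odd-numbered samples while the red laser is ON (blue OFF).
    """
    signal = []
    for t in range(n_samples):
        # Transit envelope: how strongly the cell is illuminated at time t.
        envelope = math.exp(-0.5 * ((t - center) / width) ** 2)
        blue_on = (t % 2 == 0)
        amp = blue_amp if blue_on else red_amp
        signal.append(amp * envelope)
    return signal

def demultiplex(signal):
    """Split the interleaved samples by modulation phase."""
    blue = signal[0::2]  # samples taken while blue was ON
    red = signal[1::2]   # samples taken while red was ON
    return blue, red

# Hypothetical cell: emits 3.0 units under blue excitation, 1.0 under red.
signal = simulate_detector(200, blue_amp=3.0, red_amp=1.0, center=100, width=20)
blue, red = demultiplex(signal)
blue_peak, red_peak = max(blue), max(red)
```

Because the transit pulse spans many modulation periods, both phase-separated traces sample the same cell many times, and the two peaks come from the same point in time without applying any inter-laser delay.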
Bio-IT World 2010 Best Practices Awards
1. Nominating Organization (Fill this out only if you are nominating a group other than your own.)
A. Nominating Organization
Organization name:
Address:
B. Nominating Contact Person
Name:
Title:
Tel:
Email:
2. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: Massachusetts Institute of Technology
Address: 77 Massachusetts Ave, Bldg 3-158, Cambridge, MA 02139
B. User Organization Contact Person
Name: Alexander Mitsos, Ph.D.
Title: Assistant Professor
Tel: +1-617-324-6768
Email: [email protected]
3. Project
Project Title: Identifying Drug Effects via Pathway Alterations using an Integer Linear Programming Optimization Formulation on Phosphoproteomic Data
Team Leader
Name: Leonidas G. Alexopoulos
Title: Lecturer
Tel: +30 210 7721666
Email: [email protected]
Team members – name(s), title(s) and company (optional): Alexander Mitsos (1), Ioannis N. Melas (2), Paraskeuas Siminelakis (2), Aikaterini D. Chairakaki (2), Julio Saez-Rodriguez (3,4), Leonidas G. Alexopoulos (2)
(1) Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
(2) Department of Mechanical Engineering, National Technical University of Athens, Athens, Greece
(3) Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, United States of America
(4) Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
4. Category in which entry is being submitted (1 category per entry, highlight your choice)
→ Drug Discovery & Development: Compound-focused research, drug safety
OR
→ Basic Research & Biological Research: Disease pathway research, applied and basic research
□ Clinical Trials & Research: Trial design, eCTD
□ Translational Medicine: Feedback loops, predictive technologies
□ Personalized Medicine: Responders/non-responders, biomarkers
□ IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
□ Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
□ Health-IT: ePrescribing, RHIOs, EMR/PHR
□ Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio-IT World reserves the right to re-categorize submissions based on submission or in the event that a category is refined.)
5. Description of project (4 FIGURES MAXIMUM):
A. ABSTRACT/SUMMARY of the project and results (150 words max.)
Understanding the mechanisms of cell function and drug action is a major endeavor in the
pharmaceutical industry. Finding a drug's targets is traditionally based on the drug's biochemical activity
(potency and selectivity) using either in vitro assays (e.g. kinase screening) or cellular approaches (e.g.
MS-based bait schemes). However, beyond measurements of a drug’s binding affinities, little is known about
how the signaling network of the cell is affected by a drug. Our method is an unbiased,
phosphoproteomic-based approach to identify drug effects by monitoring drug-induced topology
alterations. Using a combination of novel high-throughput phosphoproteomic experiments and
computational techniques, we were able to draw signaling pathways and monitor their changes in the presence of
drugs. The method identifies drug effects on small to
medium size pathways and is scalable to larger topologies with any type of signaling intervention
(small molecules, RNAi, etc). The method can reveal drug effects on pathways, the cornerstone for
identifying the mechanisms of a drug's efficacy.
B. INTRODUCTION/background/objectives
Cells are complex functional units. Signal transduction refers to the underlying mechanism that regulates
cell function, and it is usually depicted on signaling pathway maps. Each cell type has distinct signal
transduction mechanisms, and several diseases arise from alterations in the signaling pathways. Small-molecule
inhibitors have emerged as novel pharmaceutical interventions that aim to block certain
pathways in an effort to reverse the abnormal phenotype of the diseased cells. Although significant
effort has been invested in designing compounds to hit certain targets, little is known about how these
compounds act on an “operative” signaling network. Here, we combine novel high-throughput protein-signaling
measurements and sophisticated computational techniques to evaluate drug effects on cells via a
computational analysis of phosphoproteomic data.
Our approach comprises two steps: build pathways that simulate cell function and identify drug-induced
alterations of those pathways. We employed our approach to evaluate the effects of 4 drugs on a
cancer hepatocytic cell type. We were able to confirm the main target of the drugs but also uncover
unknown off-target effects that cannot be revealed by standard affinity assays.
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
To prove our case, we built a topology for several hepatocytic cell lines and evaluated the effects of 4
drugs: 3 selective inhibitors of the Epidermal Growth Factor Receptor (EGFR) and a non-selective drug.
We confirm effects easily predictable from the drugs' main target (i.e., the 3 EGFR drugs Lapatinib,
Erlotinib, and Gefitinib block the EGFR pathway) but we also uncover unanticipated effects due to either
drug promiscuity or the cell's specific topology. An interesting finding is that the selective EGFR
inhibitor Gefitinib inhibits signaling downstream of the Interleukin-1alpha (IL1α) pathway, an effect that
cannot be extracted from binding affinity-based approaches.
D. ROI achieved or expected (200 words max.):
This project is applied research that took place in academic institutes (MIT, Harvard Medical School,
and National Technical University of Athens) and thus ROI was not expected. However, a provisional
patent has been submitted by the Technology Licensing Office of MIT and several options are currently
being investigated in order to commercialize the approach.
E. CONCLUSIONS/implications for the field.
Our approach is a new tool for understanding a drug’s action via novel algorithms that deconvolute
phosphoproteomic data. Since phosphoproteomic measurements are currently the ultimate reporters of
a drug’s action on signaling networks, we anticipate our approach to be a viable alternative for identifying
drug effects when compared to gene expression or binding affinity-based approaches. By understanding
drug effects in normal and diseased cells, we can provide important information for the analysis of clinical
outcomes in order to improve drug efficacy and safety.
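The two-step idea described in this entry (fit a pathway topology to phosphoproteomic data, then refit under drug treatment and compare) can be illustrated with a toy example. This is a minimal sketch and not the authors' formulation: brute-force enumeration of edge subsets stands in for an integer linear programming solver, nodes are activated by simple OR logic, and the node names, candidate edges, measurements and edge penalty are all hypothetical.

```python
from itertools import combinations

# Hypothetical candidate network: stimuli S1, S2; intermediates A, B; readouts R1, R2.
CANDIDATE_EDGES = [("S1", "A"), ("A", "R1"), ("S2", "B"), ("B", "R2"), ("A", "R2")]

def simulate(edges, stimuli):
    """Propagate activation: a node is active if any retained edge feeds it from an active node."""
    active = set(stimuli)
    changed = True
    while changed:
        changed = False
        for src, dst in edges:
            if src in active and dst not in active:
                active.add(dst)
                changed = True
    return active

def fit_topology(experiments, edge_penalty=0.01):
    """Pick the edge subset that minimizes measurement mismatches plus a small size penalty.

    Exhaustive search over subsets; an ILP solver would be used for realistic sizes.
    """
    best_edges, best_cost = None, float("inf")
    for k in range(len(CANDIDATE_EDGES) + 1):
        for subset in combinations(CANDIDATE_EDGES, k):
            mismatches = 0
            for stimuli, measured in experiments:
                active = simulate(subset, stimuli)
                mismatches += sum(
                    (node in active) != bool(state) for node, state in measured.items()
                )
            cost = mismatches + edge_penalty * len(subset)
            if cost < best_cost:
                best_edges, best_cost = set(subset), cost
    return best_edges

# Made-up binary measurements: (stimuli applied, observed activation of measured nodes).
control = [(("S1",), {"A": 1, "R1": 1, "R2": 0}), (("S2",), {"A": 0, "R1": 0, "R2": 1})]
drug = [(("S1",), {"A": 1, "R1": 0, "R2": 0}), (("S2",), {"A": 0, "R1": 0, "R2": 1})]

control_edges = fit_topology(control)
drug_edges = fit_topology(drug)
altered = control_edges - drug_edges  # edges present without drug but lost with drug
```

In this toy case the only edge lost under the drug is A -> R1, which is the kind of drug-induced topology alteration the method is designed to report.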
1. REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)
Please refer to our recently published paper in PLoS Computational Biology:
Mitsos A, Melas IN, Siminelakis P, Chairakaki AD, Saez-Rodriguez J, Alexopoulos LG. 2009
Identifying Drug Effects via Pathway Alterations using an Integer Linear Programming Optimization
Formulation on Phosphoproteomic Data. PLoS Comput Biol 5(12): e1000591.
doi:10.1371/journal.pcbi.1000591
BIO IT Best Practices Award
Bio-IT World 2010 Best Practices Awards
Nominating Organization name: SAS Institute
Nominating Organization address: SAS Campus Drive
Nominating Organization city: Cary
Nominating Organization state: NC
Nominating Organization zip: 27513
Nominating Contact Person: Anne Bullard
Nominating Contact Person Title: Marketing Communications Specialist
Nominating Contact Person Phone: 919-531-6617
Nominating Contact Person Email: [email protected]
User Organization name: The Gibson Lab, Center for Integrative Genomics
User Organization address: Georgia Institute of Technology
User Organization city: Atlanta
User Organization state: GA
User Organization zip: 30332
User Organization Contact Person: Greg Gibson, Ph.D.
User Organization Contact Person Title: Director, Center for Integrative Genomics
User Organization Contact Person Phone: 404-385-2343
User Organization Contact Person Email: [email protected]
Project Title: Geographical genomics of human gene expression variation
Team Leaders name: Greg Gibson, Ph.D.
Team Leaders title: Director, Center for Integrative Genomics
Team Leaders Company: Georgia Institute of Technology
Team Leaders Contact Info: [email protected]
Team Members name: Youssef Idaghdour, Ph.D.
Team Members title: Postdoctoral Fellow
Team Members Company: Awadalla Laboratory, Sainte-Justine Research Center, University of Montreal
Entry Category: Basic Research & Biological Research
1. User Organization
The Gibson Laboratory
Center for Integrative Genomics
Georgia Institute of Technology
Atlanta, Georgia, 30332 USA.
2. Contact Person
Prof. Greg Gibson
Director, Center for Integrative Genomics
Georgia Institute of Technology
Atlanta, Georgia, USA
Tel: 404-385-2343
Email: [email protected]
3. Project
Project Title: Geographical genomics of human gene expression variation
Team Leader: Greg Gibson, Ph.D.
4. Category
Basic Research and Biological Research
5. Description of Project
A. ABSTRACT/SUMMARY of the project and results (800 characters max.)
The genetics of gene expression (GOGE) analysis approach aims to characterize genetic influences on gene activity, and can now be performed on a genome-wide basis. Our study takes the GOGE approach a step further by directly incorporating estimates of environment, ethnicity, and family relatedness into association testing to simultaneously measure the impact of nature, nurture, and culture on genome function. In collaboration with the JMP Genomics team, the Gibson Laboratory developed a GOGE workflow that allows efficient performance of billions of association tests in a robust statistical framework using standard desktop computing capabilities accessible to all. The results are published in the journal Nature Genetics [1]. The statistical procedure and protocol implemented using JMP Genomics is published in the journal Nature Protocols [2].
B. INTRODUCTION/background/objectives
The expression level of a single gene can be viewed as a quantitative trait. The variance in each gene’s expression can therefore be partitioned among various effects using classical quantitative trait locus (QTL) linkage mapping or linkage disequilibrium mapping methods. This calculation may be relatively simple to perform for one or a handful of genes, yet genome-level expression studies such as those performed using microarrays or mRNA-seq can easily produce thousands of gene-level expression values or even hundreds of thousands of probe or exon-level measurements. In the last few years, genome-wide association studies incorporating gene expression have demonstrated that regulatory polymorphisms, also known as expression single nucleotide polymorphisms (eSNPs), impact the expression of several percent of all genes. Several known eSNPs have been found to associate with a variety of phenotypes, including known human diseases or conditions. Uncovering eSNPs can therefore help researchers understand the molecular basis of complex diseases.
Many of the previous studies in this area have been performed with cultured cell lines, but the discovery of eSNPs in actual human tissues is the only way to evaluate the joint effects of such factors as population structure, relatedness, geography, environment, and gender, as well as interactions among them. A focus of the Gibson Laboratory in 2009 was to understand the contribution of these factors to variation in gene expression in peripheral blood leukocytes sampled from rural villagers and urban dwellers in Morocco. This group has previously demonstrated the striking effect that environment can have on the leukocyte transcriptome [3], using JMP Genomics as the main tool for data analysis. However, quadrupling the sample size of the studied population and combining expression data with genotypic data represented a novel analytic and computing challenge. The scope and scale of the intended work prompted enhancements to the capabilities of JMP Genomics to ensure efficient treatment of the data using existing standard computer equipment.
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
As part of the “Geographical Genomics” project, genome-wide gene expression and genotypic data were generated from 194 individuals representing both genders, a wide range of family relationships, and two dominant ethnicities from three geographic locations. The data consisted of more than 22,000 gene expression measurements and more than 500,000 SNPs from each individual, generated using Illumina BeadChips. One of the main objectives of the study was to test the association of each gene expression measurement with each SNP genotype in the entire sample while accounting for a complex correlation structure reflecting biased distributions of Arab and Berber populations in rural and urban settings. Advanced hypothesis tests also demonstrated an absence of interaction effects among the genetic and environmental factors.
Collaborations with the Gibson Laboratory revolving around the data from the Geographical Genomics project provided the JMP Genomics team at SAS invaluable insight into the software requirements and challenges associated with the analysis and integration of such data sets. The result of this collaborative effort was a streamlined, best practice workflow that could perform comprehensive genome-wide association testing with gene expression traits using a standard desktop computer. This workflow utilized existing processes in JMP Genomics and incorporated newly developed code that has since become the basis for new production applications in the software.
This analysis workflow was fully implemented using JMP Genomics to call SAS procedures behind the scenes and perform association testing in a mixed-model framework, handling random and fixed effects in a statistically robust manner. The flexible graphical options provided by JMP facilitated easy movement between exploratory and analytical phases of the analysis, leading the researchers to new insights about the environmental and genetic contributions to variation in gene expression that guided downstream modeling efforts.
In the exploratory phase, the team examined each data type separately using several pre-built, streamlined workflows (e.g., Basic Expression Workflow, Basic Genetics Workflow) that were tailored to specific data types, greatly facilitating data analysis and leaving little space for manual errors. These workflows also allowed researchers to perform robust supervised quality control of the data prior to statistical hypothesis testing. Several downstream analyses were also performed using JMP Genomics. Both principal component analyses of whole-genome gene expression data and of genome-wide genotypic data were performed within the space of a few hours.
The JMP Genomics Cross-Correlation application was used for an initial association screen to find significant correlations between paired expression and genotype data sets. Using this process, more than 11 billion association tests were performed using a desktop machine (Intel Core2 Duo CPU, 2.33 GHz, 3.23 GB RAM with Windows XP). It is important to note that analytic functions implemented in JMP Genomics that call SAS permit handling of large data files without a need to open the files in active memory, significantly decreasing data processing time and increasing the size of data sets that can be handled on the desktop. An option was added to permit saving of output results only for associations significant at user-specified thresholds, providing an alternative and simple solution to the colossal data storage required to retain all output results. This initial screen implicated candidate eSNPs, which were then examined using a series of statistical models accounting for various effects in the SNP-Trait Association process. Approximately 400 genome-wide significant associations were discovered in samples obtained from 194 individuals [1]. This study provided the first glimpse of the robustness of SNP-gene expression level associations to genetic ethnicity, relatedness and environment and paves the way for similar studies in other populations and environments.
D. ROI achieved or expected (1000 characters max.):
JMP Genomics provides an integrated and timely solution for joint analysis of genotypic and gene expression data. The application is accessible to most researchers without significant equipment investments. Further, the statistically robust mixed-model framework supported by SAS can handle data sets from simple to complex, providing flexibility to approach a problem from various analytic angles. Hidden data quality issues can derail costly genomic studies. JMP Genomics QC workflows permit exploration of sample-to-sample variation and assessment of the relative contributions of experimental and technical factors. Users navigate multiple steps in a logical and consistent manner without the need to reformat files or perform manual merges. JMP Genomics also offers extensive statistical tools that allow interactive graphical exploration and statistical analysis of data, with further options to explore functional consequences of variation in gene expression using various online pathway and network tools.
E. CONCLUSIONS/implications for the field (800 characters max.)
This is the first study of environmental and genetic contributions to in vivo human gene expression profiles to account for the complex effects of population structure, relatedness, lifestyle and geography. The innovative statistical analysis approach used here is detailed in a recent article published in Nature Genetics by the Gibson Laboratory [1], and a companion protocol paper in Nature Protocols. The successful integration of a GOGE analysis workflow into JMP Genomics permits robust statistical association testing with genome-wide genotypic and gene expression data.
JMP Genomics lets analysts use a single solution to cover many steps in a complex process, from manipulation of large data sets to downstream analysis and visualization. The joint analysis of large data sets using only desktop computing resources adds value for all researchers in the field of quantitative genetics. Collection of multiple data types on genomic samples has become routine, and workflows such as these will be critical tools in simplifying the discovery of connections in large data sets.
REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)
[1] Idaghdour et al. Geographical genomics of human leukocyte gene expression variation in southern Morocco. Nature Genetics. Advance Online Publication: December 6, 2009. doi:10.1038/ng.495
[2] Youssef Idaghdour, Wendy Czika, Kelci Miclaus, Sang H. Lee, Peter M. Visscher, Russell D. Wolfinger, Greg Gibson (2009) Accounting for population structure and relatedness in gene expression genome‐wide association testing using a mixed‐model approach. Nature Protocols. doi:10.1038/nprot.2009.216
[3] Idaghdour et al. A genome‐wide gene expression signature of environmental geography in leukocytes of Moroccan Amazighs. PLoS Genetics. 4, e1000052. (2008).
Bio‐IT World 2010 Best Practices Awards
1. Nominating Organization (Fill this out only if you are nominating a group other than your own.)
A. Nominating Organization
Organization name:
Address:
B. Nominating Contact Person
Name:
Title:
Tel:
Email:
2. User Organization (Organization at which the solution was deployed/applied)
A. User Organization
Organization name: Sigma Life Science
Address: 2909 Laclede Avenue Saint Louis, MO 63103, USA
B. User Organization Contact Person
Name: Cesar Paredes
Title: eBusiness Marketing Manager
Tel: 314‐286‐7605
Email: [email protected]
3. Project
Project Title: Your Favorite Gene powered by Ingenuity
Team Leader
Name: Kyle Brueggeman
Title: eBusiness Product Manager
Tel: 314‐289‐8496 x4519
Email: [email protected]
Team members – name(s), title(s) and company (optional):
Joseph Bedell ‐ eBusiness Marketing Manager
Cesar Paredes ‐ eBusiness Marketing Manager
Fei Zhong – Bioinformatics Manager
Jason Humphreys – Search Technology Manager
Kyle Brueggeman ‐ eBusiness Product Manager
4. Category in which entry is being submitted (1 category per entry, highlight your choice)
• Basic Research & Biological Research: Disease pathway research, applied and basic research
• Drug Discovery & Development: Compound‐focused research, drug safety
• Clinical Trials & Research: Trial design, eCTD
• Translational Medicine: Feedback loops, predictive technologies
• Personalized Medicine: Responders/non‐responders, biomarkers
• IT & Informatics: LIMS, High Performance Computing, storage, data visualization, imaging technologies
• Knowledge Management: Data mining, idea/expertise mining, text mining, collaboration, resource optimization
• Health‐IT: ePrescribing, RHIOs, EMR/PHR
• Manufacturing & Bioprocessing: Mass production, continuous manufacturing
(Bio‐IT World reserves the right to re‐categorize submissions based on submission or in the event that a category is refined.)
5. Description of project (4 FIGURES MAXIMUM):
A. ABSTRACT/SUMMARY of the project and results (150 words max.)
Your Favorite Gene powered by Ingenuity is a free web‐based biological search portal for exploring dynamic gene‐focused content drawn from the Ingenuity® Knowledge Base, a repository of detailed biological and chemical interactions and functional annotations. YFG positions Sigma’s Life Science products within a content‐rich environment of relevant biological and chemical information. The Web site also provides researchers with the capability to model and evaluate prospective experiments in the context of previously published scientific literature.
B. INTRODUCTION/background/objectives
There are a number of search engines available on the Internet where scientists can search their area of study. Most of these resources return results matched on keyword, and those results may or may not be relevant. A researcher must therefore evaluate each result to determine whether it is relevant to their research, which is time consuming and adds cost to the research process. With some 200,000 products offered by Sigma‐Aldrich, Your Favorite Gene powered by Ingenuity offers a platform on which researchers can explore the relationships between different biological, metabolic or cell signaling pathway components and find Sigma’s high quality small molecules, antibodies, enzymes, and shRNA or siRNA for gene knockdown within the context of their research. The initial launch of YFG grouped relevant products by gene ID to make it easier to locate products relevant to a gene while planning an experiment. The goal of Your Favorite Gene powered by Ingenuity was to bring together the power of the Ingenuity® Knowledge Base with the product groupings of YFG. Doing so would enable scientists to search by gene, disease, tissue, pathway, etc. to find curated information relevant to their experiment, unlock new connections they may not have thought of, and quickly find the products with which they can perform the experiment.
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
Sigma's Bioinformatics team, Sigma‐Aldrich IS, Sigma‐Aldrich's Search Team, and Ingenuity collaborated closely over nine months to bring together Sigma’s original YFG tool with Ingenuity’s technology and content, creating Your Favorite Gene powered by Ingenuity and integrating it into Sigma‐Aldrich's Web site search. The result of this collaborative effort is a dramatic departure from the static search interfaces previously prevalent among life science product providers. This new, intuitive search tool matches thousands of Sigma‐Aldrich products to biological information, presented in dynamic networks. Researchers and students can search by gene, protein, function, disease, species, tissue or pathway to access a range of previous research findings and biological information including molecular functions, cell regulation, protein domains, and metabolic and signaling pathways. From an initial search, simple navigation allows exploration of broader networks, providing insight into pathway interactions. Researchers can also model prospective experiments in Your Favorite Gene powered by Ingenuity, accessing the findings of previous studies and quickly checking the availability of Sigma‐Aldrich products relevant to their work.
Ingenuity’s biological and chemical content is specifically structured for searching and faceted navigation. The content used by Sigma was drawn from the Ingenuity® Knowledge Base, which consists of millions of interactions and uses a lexicon of over a million gene, chemical, process‐related, drug and disease terms continuously updated from scientific findings by PhD‐level scientists. The pathways available through Your Favorite Gene powered by Ingenuity are fully interactive: users can search, filter, highlight and drill down on specific molecules or interactions to access underlying information and, most importantly, view products relevant to those molecules or interactions. Your Favorite Gene powered by Ingenuity also provides fully interactive molecular networks that allow researchers to observe what is upstream or downstream from a gene and find related products necessary for their experiments.
Ingenuity's networks of biological and chemical relationships are regularly updated, providing researchers with the most up‐to‐date findings related to proteins, genes, complexes, drugs, tissues, cells, and diseases of interest. Through Your Favorite Gene powered by Ingenuity users can access select information from the Ingenuity® Knowledge Base, with quick links to the larger repository for Ingenuity subscribers. Your Favorite Gene powered by Ingenuity serves as an information hub for researchers and students exploring diseases, functions, and gene pathways, and matches our comprehensive collection of products, kits and reagents to relevant biological information.
The reach of the Your Favorite Gene powered by Ingenuity platform was further expanded in November 2009 with the launch of a unique life science‐focused Facebook application. Developed and marketed by Sigma Life Science, the ‘What’s Your Favorite Gene?’ application is one of the first of its kind on Facebook, providing a platform that enables scientists and researchers to network with each other and facilitates discussion based on their favorite genes, identifiable via gene functionality and biological pathways. The application also allows researchers to post and share gene information using gene details and associated pathways and interactors from Your Favorite Gene powered by Ingenuity.
D. ROI achieved or expected (200 words max.):
Since launching Your Favorite Gene powered by Ingenuity in early 2009, we have seen a 160% increase in unique users of the tool. In addition, we have 30% more return users compared with non‐YFG content on the Sigma‐Aldrich Web site, and time spent on the site has doubled compared with the non‐YFG content. Orders placed for products listed in YFG are 40% higher in value than those placed for non‐YFG products.
E. CONCLUSIONS/implications for the field.
Your Favorite Gene powered by Ingenuity has allowed researchers to find the products most relevant to their research. The integration of content from the Ingenuity® Knowledge Base positions Sigma’s Life Science products within a contextually rich environment of relevant biological and chemical information. Doing so not only allows researchers to quickly locate products specific to their research, but also provides them the capability to model and evaluate prospective experiments in the context of previously published scientific literature that has been manually curated by PhD scientists. YFG enables the unlocking of new connections to related genes and hypotheses. In short, this innovation streamlines the entire research process, saving both time and money at the research stage of a disease pathway project. Scientists can now spend less time finding products that fit an experiment, and more time realizing their research.
REFERENCES/testimonials/supporting internal documents (If necessary; 5 pages max.)
Input from 90 customers who filled out a survey after the launch. The feedback to the questions posed was overwhelmingly positive.
User provided comment: "YFG not only helps to design effective experiment but it is also instrumental for a better selection of experimental products such as antibodies, bio‐active molecules, bio‐assays..."
1)
User Organization (Organization at which the solution was
deployed/applied)
A. User Organization
Organization name:
Tarbiat Modares University, Tehran University, and EnzymeZist Company (at
Tehran)
Address:
#7 Khiaban Shahid Mojtabaie/Khiaban Shariati – Tehran/Iran
And
438 Club DR
Aurora, OH 44202
USA
B. User Organization Contact Person
Name:
Mehdi Naderi Manesh
Title:
- Associate Prof of Genetic/Nanobio-technology at Tarbiat Modares University
- Cofounder and President of EnzymeZist Company
Tel:
330-562-8734 (USA)
98-21-22858508 (Tehran/Iran)
Email:
[email protected]
http//vision24.blog.homepagenow.com
Project
Project Title:
Genetic Oral Delivery for Gut ("GOD 4 Guttm") via bionano-particle
to produce therapeutic proteins (i.e., insulin) from alimentary canal (cells
and tissues) to replace the function of damaged or diseased cells/tissues.
Team Leader
Name:
Mehdi Naderi Manesh
Tel:
330-562-8734 (USA)
98-21-22858508 (Tehran/Iran)
Email:
[email protected]
Team members-names, titles and company:
M. Sarblooki, Prof of Biomaterial at IBB/Tehran Univ.
A. Pylakhi, Graduate Student of Genetic at Tarbiat Modares Univ.
Category in which entry is being submitted
"GOD 4 Guttm", "BOT 4 Guttm", and "GEM 4 HEALTHtm"
are trade mark of this tech-project
Basic Research & Biological Research
5. Description of project (4 FIGURES MAXIMUM):
A. ABSTRACT/SUMMARY of the project and results (150 words max.)
This innovative technology offers the potential to cure diseases such as diabetes by relying on
the normal physiological and biochemical regulation, function, and responses of the cell
(meal regulated insulin secretion). Nanoparticles (Dendrosomes) are used as carriers and
protectants in oral gene (Gus and insulin) delivery into the alimentary canal of diabetic rats.
The cells of the intestinal tract, in particular differentiated and stem cells in the gut, serve as the
site for both the manufacture of the protein drug (insulin and reporter gene), as indicated
by galactosidase histochemistry results, and its delivery into the bloodstream, as shown by
decreased blood glucose levels in treated diabetic rats. Therefore, the need for
recombinant protein (drug) production in a pharmaceutical factory and for special means
of administration is avoided. This leading-edge technology, Genetic Oral Delivery for
Gut ("GOD 4 Guttm"), manufactures therapeutic proteins in a regulated fashion from the
alimentary canal (cells and tissues) to replace the function of damaged or diseased
tissues.
B. INTRODUCTION/background/objectives
Background
This innovative platform technology focuses on oral gene delivery through
nanotechnology to produce therapeutic proteins from alimentary canal (cells, tissues, and
flora) to replace the function of damaged or diseased tissues. The ultimate goal is to
present a novel technology for:
• in vivo introduction of a nucleic acid cassette into stem cells or differentiated
short living cells of intestinal epithelium,
• Expression and export of the gene/protein based on normal cell's physiological
and biochemical regulations, functions, and responses.
• Transformation of any organism's cell (i.e., bacteria, animal, and plant)
The vector solution can be delivered via the intestinal lumen in a variety of ways:
• Bionanoparticle Oral Therapy for Gut ("BOT 4 Guttm")
• Genetic Engineering of Microflora for Healing Extensively as an All Livings Therapy ("GEM 4 HEALTHtm") via slow release capsules (probiotics)
In this project, only Bionanoparticle Oral Therapy for Gut ("BOT 4 Guttm") will be
presented. In this nanobio-technology, the patient swallows a “two-in-one” drug—a
drug that produces a drug. The first drug acts as a carrier, protectant, and delivery vesicle,
after ingestion, for the nucleic acid fragment that contains the genetic code (DNA or
iRNA) for the second drug, a therapeutic protein. The cells of the intestinal tract, in
particular the small intestines, serve as the site for both the manufacture of the protein
drug in a regulated-fashion and its delivery into the bloodstream in a controlled manner.
The need for recombinant protein (drug) production in a pharmaceutical factory is
avoided in this novel bionano-technology.
Most diseases are ultimately attributable to effects of and on protein molecules.
Therefore, it should not be surprising that these large and extremely complex molecules
are dominating the pharmacopoeia, replacing simpler and smaller organic molecules as
the drugs of choice.
Despite substantial accomplishments and an immense investment of money and
effort, the broad promise of protein drugs has yet to be realized. The intestinal epithelium
is a particularly attractive site for gene therapy because of:
• Its great mass of cells
• Its ease of access via the intestinal lumen
Critical aspects of the morphology and kinetics of the epithelium, and the resulting advantages, are:
• Its luminal surface interfaces with the external milieu, whereas its basolateral
surface interfaces with the internal milieu. This means it is ideally located to
receive nucleic acids applied externally (via the lumen) and to direct the protein
or peptide products either:
o To the luminal surface (e.g., to correct a defect of ingestion or absorption)
o To be secreted from the luminal surface (to act on the epithelium at more
distal sites)
o To the interior of the epithelial cells (to act metabolically)
o To the basolateral surface for secretion into the circulatory system (to act
systemically)
• The epithelial villus cells are replaced continuously with cells emerging from the
crypts (pit-like structures) that surround the base of the villi. Each villus is fed by
ten or more crypts. Proliferation is confined to the lower two-thirds of the crypt
and the progenitors of both crypt and villus cells are the stem cells located at the
base of the crypt. Thus, with respect to gene therapy, there are two general
possibilities with this epithelium:
o Permanent expression of the transferred gene as a result of transduction of
the stem cells
o Transient expression as a result of transduction or transfection of other
crypt or villus cells. Whilst most applications favor permanent expression,
some applications rely on transient expression.
• Another aspect of intestinal structure is its substantial length. This means that
there is a very large mass of tissue available for gene transfer. Making use of
even a small fraction of this capacity can deliver many important therapeutic
proteins to the bloodstream. Moreover, short-acting proteins can be used with
greater therapeutic effect since present innovative technology ensures their
continuous synthesis and secretion at the needed rates.
• Moreover, because the protein is produced in situ, it may be possible to obtain a
more natural product, with appropriate species specific post-translational
modifications.
• Additionally, and of great potential importance, it is possible to regulate the rates
of synthesis and release of the drug.
Therefore, Genetic Oral Delivery for Gut ("GOD 4 Guttm") is an innovative practice
with a great future.
Oral administration is the most convenient way of delivering drugs. Recent advances
in biotechnology have produced highly potent new molecules such as peptides, proteins
and nucleic acids. Due to their sensitivity to chemical and enzymatic hydrolysis as well
as a poor cellular uptake, their oral bioavailability remains very low.
Despite sophisticated new delivery systems, the development of a satisfactory oral
formulation remains a challenge. Among the possible strategies to improve the absorption
of drugs, micro- and nanoparticles represent an exciting approach to enhance the uptake
and transport of orally administered molecules. Increasing attention has been paid to their
potential use as carriers for nucleic acids drugs for oral administration. Several varieties
of nanoparticles are available: polymeric nanoparticles, metal nanoparticles, liposomes,
micelles, quantum dots, dendrimers, and dendrosomes.
In this project, a time-release oral formulation (the "Nucleic Acids Pill") is introduced in the
form of a pill, capsule, or tablet that provides protection and delivery. It contains the desired
nucleic acid molecules (with potential for regulated expression, function, and distribution in
the blood system). The nucleic acid molecules are associated with nanoparticles that
facilitate delivery to the intestinal epithelial cell. The genetic material incorporated
into the intestinal cells can be any DNA or RNA (i.e., iRNA). For example, the nucleic acid
can be:
• Not normally found in intestinal epithelial cells
• Normally found in intestinal epithelial cells, but not expressed at physiologically
significant levels
• Normally found in intestinal epithelial cells and normally expressed at
physiologically desired levels in the stem cells or their progeny
• Any other DNA/iRNA which can be modified for expression in intestinal
epithelial cells, and
• Any combination of the above
In an enhanced and preferred embodiment of this innovative technology, intestine-specific
promoters such as the GIP responding promoters of K cells, the intestinal fatty acid
binding protein promoter, the disaccharidase promoters, the cysteine-rich intestinal protein
promoter and the apolipoprotein promoter can be used.
In the selection of the promoter, the parameters can include:
• Achieving sufficiently high levels of gene expression to get a physiological effect
• Maintaining a critical steady state of gene expression to achieve temporal
regulation of gene expression
• Achieving cell and tissue specific expression
• Achieving pharmacological, endocrine, paracrine or autocrine regulation of gene
expression
• Preventing inappropriate or undesirable levels of expression
Many of the vectors and delivery systems developed for in vivo cellular
transformation either have their own inherent drawbacks or are not entirely suitable for in
vivo intestinal cell transformation. For example, recombinant viruses, particularly
retroviruses, may be slow in gaining FDA approval due to concerns generally associated
with the administration of live viruses to humans. In addition, it has become clear that
viral vectors provoke immune responses that complicate repeated administration of the gene
construct, which may greatly limit their utility. Mechanical
means, such as the gene gun, are designed for use in transformation of skeletal muscle cells
and are not particularly useful in intestinal cell transformation because of problems of access
and the delicate nature of the organ.
Mission and Objectives
The mission (Fig. 1) is to apply Genetic Oral Delivery for Gut ("GOD 4 Guttm") to
produce therapeutic proteins from alimentary canal (cells and tissues) to replace the
function of damaged or diseased tissues.
Given the scale of the opportunity (Fig. 1, Return On Investment) and the novelty of the
technology, which draws on bioinformatic/IT tools in the fields of synthetic chemistry,
biomaterials, cell biology, molecular biology, developmental biology, genetic engineering, and
nanotechnology, the project was divided into the following objectives (Fig. 1):
• Apply “Design Thinking” and bioinformatic/IT tools to define challenges to
overcome and to design an innovative practice for a successful oral gene therapy
of gut cells
• Identify a means of regulated expression (i.e., GIP responding promoters of K cells) and
controlled release of the therapeutic protein (transit peptides)
• Find the appropriate delivery system for oral gene therapy with defined features
• Design nanoparticles as assembly, protectant, and delivery vesicles for nucleic
acids with biodegradable and inert properties and targeting potential
• Perform cytotoxicity studies
• Optimize the system in cell culture
• Optimize the system in normal animals with reporter genes
• Test the system on diabetic model animals
"GOD 4 GutTM" Opportunities
Platform Technology: Meal-Regulated Systemic Drug Delivery; Non-regulated Systemic Drug Delivery; Localized Expression
Target Market | ROI/Market Value
Diabetes: Therapeutic Drugs (Largest Healthcare Market over Next 5 years) | 13 M Treated in US; $100 B Total Drug Costs/Global
Obesity: CCK & PYY | 30% Adults in US; $60 B Direct Costs
Protein Drugs | $110 B Global Sales
IBD/Crohn's: IL-10 & anti-TNFa | $4 B Combined Global Market
Colon Cancer | $8 B Global Sales
Innovative Drug Delivery Systems: Advanced/Sustained/Targeted Release | $140 B Global
Based on 2008-09 data (WWW)
Fig. 1. Mission, Objectives, and Return On Investment:
Flow chart of the steps/objectives considered in this innovative design/platform
nanobio-technology for a successful oral gene therapy of gut as a surrogate for
damaged tissue and the expected Return On Investment (ROI).
C. RESULTS (highlight major R&D/IT tools deployed; innovative uses of technology).
In this project, a new family of synthetic vehicles with spherical dendritic
structures, called Dendrosomes, was designed, readily made, and used for direct
delivery of genes into cells and model animals, applying “Design Thinking” and IT tools
in the fields of synthetic chemistry, biomaterials, cell biology, molecular biology,
developmental biology, genetic engineering, and nanotechnology. Besides their ease of
preparation and storage, they offer some unique advantages: they are inexpensive,
easy to handle and apply, inert, and highly stable compared with other existing synthetic
vehicles for gene delivery (cationic lipids, dendrimers and liposomes). Data obtained thus
far on cell cultures and animal models have demonstrated their inertness as
well as their impressive performance in easy, quick and direct transfections, which
include:
• Direct delivery of plasmid, pEcoRI-E (origin of replication of VZV) and pNN2
(containing HSV1 polymerase gene) into human kidney (G293, Vero) and hepatocyte
(Huh7) cell cultures, by simply mixing each of these plasmids with the dendrosome
for a very short period and exposing it to the target cell in culture (data not provided).
• Direct intramuscular (IM) or intradermal (ID) injection of a mixture of a
dendrosome with CMV containing the gene for hepatitis B surface antigen into BALB/c
mice. Here, it was found that a plasmid/dendrosome ratio as high as 150 (or even
higher in second and third injections) can be easily achieved, which elicits a quick
and intense immune (antibody) response compared with common methods (e.g. 20%
sucrose) or recombinant antigen vaccines (data not provided).
• The impact of different DNA/Dendrosome ratios and pH on the stability and
efficiency of transfection under in vitro and in vivo conditions has been examined (data not
provided).
• Transformation of plant cells has been accomplished in vitro via this innovative
bionano-technology (data not provided).
Fig. 2. Nanoparticle Features:
Properties considered in the process of Dendrosome design and synthesis,
applying IT tools, as nucleic acids protectant and delivery vesicle plus its
scanning electron micrograph (SEM).
Distribution and Penetration of Therapeutic Gene Marker
In this study, the pCMV-lacZ plasmid carrying the lacZ gene encoding β-galactosidase under the control of the cytomegalovirus (CMV) promoter was used.
Different ratios and pH values were tested, and finally DNA/Dendrosome complexes were
freshly prepared at a ratio of 6:1 (v/v). 300 μl of DNA (1 μg/μl) was dissolved in 1650 μl
of PBS (pH 7). Then 50 μl of Dendrosome was added and the mixture was vortexed for 15 min
and incubated for 30 min. All of the procedures mentioned above were carried out at room
temperature.
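The volumes in this preparation can be checked with a few lines of arithmetic. The sketch below is only an illustration of that check; the function name and dictionary keys are hypothetical, and the final DNA concentration it reports is derived from the stated volumes rather than quoted from the study.

```python
def complex_recipe(dna_ul, dendrosome_ul, pbs_ul, dna_stock_ug_per_ul):
    """Summarise a DNA/Dendrosome mixture from its component volumes."""
    total_ul = dna_ul + dendrosome_ul + pbs_ul
    dna_ug = dna_ul * dna_stock_ug_per_ul
    return {
        "total_volume_ul": total_ul,
        "dna_to_dendrosome_ratio": dna_ul / dendrosome_ul,  # v/v
        "dna_mass_ug": dna_ug,
        "final_dna_ug_per_ul": dna_ug / total_ul,
    }


# Volumes as stated in the text: 300 ul DNA (1 ug/ul), 50 ul Dendrosome, 1650 ul PBS.
recipe = complex_recipe(dna_ul=300, dendrosome_ul=50, pbs_ul=1650,
                        dna_stock_ug_per_ul=1.0)
for key, value in recipe.items():
    print(f"{key}: {value}")
```

With the stated volumes, the DNA to Dendrosome ratio works out to 6:1 (v/v) in a total volume of 2000 μl.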
The experimental animals in this study were adult male Wistar rats weighing
200 to 250 g, purchased from the Razi Institute. Rats were fasted for 24 hr
before the experiments but were allowed free access to water until 3 hr before being fed the
complex. Rats receiving only plasmid DNA or only polymer served as control groups.
To evaluate gene transfer in vivo, rats were sacrificed at set time points (48, 72,
and 96 hr) and their intestines were removed and processed for individual analysis.
β-Galactosidase Wholemount Staining: Different parts of the intestine, including the
duodenum, jejunum, ileum and colon, were dissected and stained with 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal). After staining overnight, the tissues were photographed
with a digital camera. The pictures were transferred to a computer and adjusted for equal
brightness and contrast using Adobe Photoshop.
β-Galactosidase Enzyme Histochemistry: The intestinal tissues were frozen in
O.C.T. Embedding Medium immediately after the animal was sacrificed and cut into thin
sections (5 μm). Cryosections were fixed for 15 min at room temperature in lacZ fix
buffer, washed twice in PBS for 20 min, and stained for 8 hr at 37 °C in lacZ stain buffer
(containing 1.3 mM MgCl2, 5 mM K3Fe(CN)6, 5 mM K4Fe(CN)6, and a 1 mg/ml concentration
of the substrate 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal)). Cryosections
(5 μm) were also stained with hematoxylin-eosin for pathological assessment. The pictures were
scanned into a computer and edited with Adobe Photoshop to adjust the
brightness and contrast.
Fig. 3 shows the distribution and penetration of the Gal enzyme in the gut cells after gene therapy.
Blue spots indicate the presence of the Gal enzyme inside the cells. The results show that this
innovative technology delivers genes to both differentiated and stem cells in the gut.
Differentiated Cells
Stem Cells
Fig. 3. Distribution and penetration of the "Nucleic Acids Pill":
Blue spots indicate the presence of Gal enzyme activity inside the differentiated and
stem cells of rats' gut after oral gene therapy.
Diabetic Animals
The experimental animals in this study were adult male Wistar rats weighing
200 to 250 g, purchased from the Razi Institute. To prepare the diabetic
models, the rats were weighed and anesthetized after a 48-hour fast. A 2% solution of
alloxan in 0.9% saline was administered to the animals in a single dose
of 80 mg of alloxan per kg of body weight, injected into the penile vein.
Food and water were offered to the animals only 30 minutes after the drug
administration. Of the animals subjected to this procedure, 40% developed chronic
diabetes mellitus; 10% either developed the condition to a mild degree or did not
develop diabetes at all; and the remaining 50% died within the first week of follow-up.
A sample of each rat's venous blood was collected on a reagent strip 10 days after the
diabetes induction procedure, and the blood glucose level was determined using a portable
glucose analyzer. The serum glucose level considered normal ranges from 50
to 135 mg/100 ml. In this study, rats with glucose levels above 200 mg/dl were considered
to have severe diabetes.
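The dosing arithmetic and the glucose thresholds above can be sketched in a few lines of Python. This is only an illustration: it assumes the 2% alloxan solution is weight/volume (20 mg/ml), which the text does not state, and the function names are hypothetical. Readings between 135 and 200 mg/dl are not classified by the text, so the sketch leaves them unclassified.

```python
def alloxan_dose(weight_g, dose_mg_per_kg=80.0, solution_percent_wv=2.0):
    """Return (dose in mg, injection volume in ml) for one animal.

    Assumes the solution strength is weight/volume: 1% w/v = 10 mg/ml.
    """
    dose_mg = dose_mg_per_kg * weight_g / 1000.0
    mg_per_ml = solution_percent_wv * 10.0
    return dose_mg, dose_mg / mg_per_ml


def classify_glucose(mg_per_dl):
    """Apply the thresholds quoted in the text (mg/100 ml equals mg/dl)."""
    if mg_per_dl > 200:
        return "severe diabetes"
    if 50 <= mg_per_dl <= 135:
        return "normal"
    return "unclassified"  # the text gives no label for other values


# A 225 g rat, the midpoint of the 200 to 250 g range
print(alloxan_dose(225))       # (18.0, 0.9)
print(classify_glucose(250))   # severe diabetes
```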
To evaluate expression of insulin gene and its physiological response (Fig. 4) in vivo,
a sample of the rat’s venous blood was collected on a reagent strip at set time points (48,
60, 72, 84, 96, and 108 hrs) for blood glucose level determination using a portable
glucose analyzer.
[Fig. 4 plot: percentage of glucose (y-axis, 0 to 120) versus interval (x-axis, sampling times 48 to 120 hrs); series: naked DNA, TEST, A1, A2, A3, B1, B2, B3]
Fig. 4. Blood glucose level after oral gene therapy of diabetic rats:
A1, A2, A3, B1, B2, and B3 are different dendrosome types complexed with the insulin gene DNA.
Dendrosome B2 gave the best results. TEST is only dendrosome with no DNA. Naked DNA is only
insulin gene DNA with no dendrosome. Experiments were repeated at least three times.
ROI achieved or expected (200 words max.):
An effective, smart and low-cost drug delivery system that can transport and deliver a
drug precisely and safely to its site of action is becoming the ‘holy grail’ of the
pharmaceutical industry.
The global market for protein drugs plus advanced drug delivery systems is estimated
at $250.0 billion for 2009. Sadly, diabetes has reached epidemic proportions (250
million), according to WHO. Therefore, the presented innovative platform nanobiotechnology is uniquely able to bring a high Return On Investment (Fig. 1) by providing the
following opportunities:
• Platform
o This platform technology offers the potential of replacing the function of
damaged/abnormal cells/tissues based on the normal physiological and
biochemical regulation, function, and responses of the cell, and can be
expanded to provide long-term systemic delivery of proteins other than
insulin. Both permanent expression (stem cells) and transient expression
(epithelial villus cells) of therapeutic proteins (e.g., clotting factors for
hemophilia and erythropoietin for anemia) are possible through cell-specific
targeting and/or cell-specific expression, where only constant
circulatory levels of proteins are required for therapeutic effects.
• Intellectual Property
o Proprietary innovative nanobio-technology ("GOD 4 Gut™").
• Industry
o This leading-edge technology has all the ingredients to be commercially
viable and successful.
D. CONCLUSIONS/implications for the field.
General implications for the field:
Design Thinking, IT tools, Nanobio-technology, and Genetic Oral Delivery for Gut
("GOD 4 Gut™") are applied to produce therapeutic proteins (e.g., insulin) from the
alimentary canal (cells and tissues) to replace the function of damaged or diseased
cells/tissues for the following reasons:
• IT and nanotechnology are made for Science Imagination (SciIm/ImSci).
• IT and nanotechnology can strike any Pose.
• Precision (at high impacts) and details (at high levels) are inherent in the essence
of Universe and Nature’s “Designs Thinking” leading to just a vision of beauty
and excellence beyond the shorelines of Trial/Error while holding the Science’s,
Logic’s, Flow’s, Mind’s, Senses’, Design’s, Form’s, Function’s, Perspective’s,
and Choreography’s Hands.
• Bioinformatics and IT tools are used for simulating the objectives, questions, conditions,
targets, environments, types of genes, biochemical pathways, enhancers,
repressors, signal transduction, etc.
• IT tools and software bring out Keenness, Elegance, and Knowledge (KEK)
leading to Strength, Acceleration, and Speed (SAS).
• The boxes and borders of Visions, Trial/Errors, Concepts, Ideas, Approaches, and
Methods which have caused Repetitions and Shortcomings are not kept
intact.
• Not to Pave the Path based on the Past.
• Emphasis first on vision, systematic view, flow, perspective, passion, design,
software, form, efficiency, and critical thinking then on hardware and function.
• Comprehending the designs, concepts, forms, and essence of Universe and Nature
at subparticle, nano, micro, and macro levels as a systematic ocean of ideas,
practicality, elegance, and efficiency.
• Scientific and artistic understanding of the whole body of interconnected and
integrated symphony of software, form and function in Universe and Nature.
• Learning the Choreography of “Design Thinking” to bring out the software,
forms, definitions, and functions in any process of design, thinking, and solution.
• Taking off the blinding lens and factors of habit, accepting, ignoring, and norm.
• Connecting with the scientific, fluid, and creative process of sustainability.
• Jumping beyond the shorelines of boxes, rigidity, and impossibility, resulting in
efficient ideas, inventions, and innovative practices.
Oral delivery of the DNA/nanoparticle complex encoding a desired therapeutic
protein is successful in the present innovative technology. The DNA is taken up by
intestinal cells to synthesize the encoded protein and to secrete it into the bloodstream or
the gastrointestinal tract to achieve therapeutic results. This leading-edge drug delivery
platform improves the efficacy of therapeutic proteins through nano-biotechnology.
Specific Implications for the field:
• It exploits the enormous ability of cells lining the GI tract to produce and secrete
proteins. The flexibility of this technology allows for the delivery of a wide
variety of enhancing and pharmaceutical proteins systemically, into the
gastrointestinal tract, as well as locally, making it well suited for a broad spectrum
of augmenting, preventive, and therapeutic measures. The therapeutic gene
product is secreted/exported into the bloodstream of the person in a manner and
dosage that is more akin to normal production of the gene product.
• The present method also has advantages over other gene-based therapies that
administer the gene vector into the bloodstream to reach other tissues and organs,
since it delivers the DNA of interest directly to the target cells
of the subject without first distributing it broadly via the bloodstream. Thus,
delivery of the DNA using this method is more efficient and avoids the need for
additional mechanisms to target the DNA of interest to a particular tissue.
• Reaction by the immune system to the delivery vector (particularly viral vectors)
is a major obstacle for conventional gene-based therapies. Delivery of vectors by
other routes (e.g., intravenous, intramuscular injection, or pulmonary
administration) exposes them to blood and extracellular fluid. This exposure
commonly results in inflammation and an immune response. These adverse
reactions often worsen with reapplication, to the point where treatment cannot
continue and becomes completely ineffective. The present method presents the
vector directly to the intestines without having to pass first through the blood or
tissue of the subject. This shields the DNA/iRNA delivery process as much as
possible from the systemic circulation, where immunological and inflammatory
responses are initiated, and in this manner minimizes their interference with
therapy. The nucleic acid cassette is administered in the simplest possible fashion,
by oral administration of the pill ("Nucleic Acids Pill").
• Naked linear DNA/iRNA can be used, rather than viral vectors or circular
plasmids. Therefore, only linearized nucleic acids containing just the sequences
for the therapeutic purpose and DNA integration (if required) are needed, without selectable
markers (host/bacterial growth selection) or other additional sequences.
• The method avoids the cost and difficulty of manufacturing proteins by using the
body's own tissues to synthesize the desired proteins. Moreover, it avoids the
problems of rapid metabolism by providing for the continuous manufacture and
secretion of the therapeutic protein by gene expression.
• The location of K-cells makes them ideal surrogate cells for insulin replacement.
Being localized predominantly in the upper small intestine, K-cells can be
readily accessed by this innovative and documented "GOD 4 Gut™" technology.
The body produces GIP and insulin in response to glucose ingestion in a similar
fashion: when glucose is ingested, both are produced; when glucose is absent,
production of both is halted (i.e., in a “meal-dependent fashion”). Therefore, a
DNA construct (insulin) containing the GIP promoter is expressed and regulated in a
"meal-dependent fashion" and, with the right transit peptide, can be secreted into the
targeted internal or external section of the cell.
• Dendrosome (a nanoparticle that protects, carries, and delivers nucleic acids) is
inexpensive, stable at room temperature, easy to make, inert, biodegradable,
resistant to pH and degrading enzymes, and nontoxic, with the properties to be a
leading-edge cell-specific targeting gene therapy vesicle.
• Preliminary experiments indicate that Dendrosomes can be designed to deliver
nucleic acids into rats' brain cells, based on reporter gene (Gus staining) activities
(data not shown).
• Nucleic acids (DNA/iRNA) can be transferred to and expressed in a regulated
fashion in any organism's cells (e.g., animal, bacteria, and plant protoplast) via this
innovative bionano-technology.
REFERENCES/testimonials/supporting internal documents
This project was refereed by three reviewers from the National Committee of
Nanobiotechnology at the Higher Education Ministry in Tehran and by internal reviewers from
Tarbiat Modares University. It has been nominated for a biotechnology award at Tarbiat
Modares University.
Bio‐IT World 2010 Best Practices Awards
Nominating Organization name:
Nominating Organization address:
Nominating Organization city: Vancouver
Nominating Organization state: BC
Nominating Organization zip: V6H3V9
Nominating Contact Person:
Nominating Contact Person Title:
Nominating Contact Person Phone:
Nominating Contact Person Email:
User Organization name: Zymeworks Inc
User Organization address: 540-1385 West 8th Avenue
User Organization city: Vancouver
User Organization state: BC
User Organization zip: V6H 3V9
User Organization Contact Person: Surjit Dixit
User Organization Contact Person Title: Chief Technology Officer
User Organization Contact Person Phone: 604-678-1388 x 128
User Organization Contact Person Email: [email protected]
Project Title: A modular computational modeling environment for structure guided optimization of protein therapeutics.
Team Leaders name: Siddharth Srinivasan
Team Leaders title: Senior Developer, Software Engineering
Team Leaders Company: Zymeworks Inc
Team Leaders Contact Info: 604-678-1388 x 129
Team Members name: Dimitri Tcaciuc
Team Members title: Software Developer
Team Members Company: Zymeworks Inc
Entry Category: Drug Discovery & Development
Abstract Summary:
Introduction:
Background: Protein therapeutics, in particular antibody-based therapeutics, are one of the fastest growing areas of drug discovery in the pharmaceutical industry. The optimization of lead protein therapeutic candidates is traditionally carried out using maturation technologies that are marred by explosive combinatorial complexity, resulting in inefficiencies along the route to identifying the complex combination of mutations required to achieve the optimization goal. On the other hand, the nature of protein molecules, which involves a delicate balance between sequence composition and activity, often necessitates such combinations of mutations to achieve the target properties of interest.
But this same issue also enormously complicates any plan for “engineering” them in a rational manner. In recent years, a number of computational methods have proven valuable in providing unique structural, dynamic and biophysical insights that can aid scientists in steering their protein engineering effort towards a rational or semi-rational process. This calls for an integrated and comprehensive infrastructure providing tools in a number of areas within the field of computational modeling and simulation of protein molecules:
• Protein structure optimization methods to model the effects of mutations
• Data mining procedures to characterize and identify hotspots: Structural and Functional
• Conformational sampling technologies that can simulate the dynamic nature of protein molecules and data mining procedure to extract
• Identify intrinsic correlations between residues and their cooperative tendencies affecting allosteric behavior
• Free energy simulation methods that provide quantitative insight on the effect of proposed mutations and their physical or thermodynamic basis
While a host of academic groups have pushed the boundaries of research in these areas, a regular application environment that a protein engineering team can readily employ to support work drawing on this diversity of methods and algorithms is missing. Besides the complex theoretical basis, application of such technology is hindered by involved user interfaces that are difficult to build upon and integrate into complex workflows, and it leads to major data mining challenges given the vast amount of information generated during these simulations. From a development angle, collaborative use of the available programs is fraught with challenges in communicating data between the different programs due to incompatible formats and data exchange issues.
Hence a comprehensive development environment calls for a library of routines that capture the core functionalities required in the modeling and simulation exercise.
ZymePy: To facilitate rapid application development and to reduce operational complexity for both software engineers and scientists, we introduced a common software framework, based on a mix of the Python and C programming languages, that we call ZymePy. The core API of ZymePy provides facilities for selecting and manipulating the composition and geometry of structures, either as individual frames or as trajectories, manipulating solvent and ion configurations and residue side chain geometries, defining restraints, computing energy terms, and various input/output functionality. The main reasons for producing this molecular development environment were:
• Full control over the design and code quality to target company-specific applications.
• Superior development language with access to a vast number of scientific libraries.
• Quick development turn-around time in case library modifications are needed.
ZymePy is a multi-layered library that provides functionality at several levels of granularity. Its core provides a representation of basic molecular structures, facilities for selecting and manipulating composition and geometry, either as individual structures or as entire trajectories, manipulating solvent and ion configuration, defining simulation restraints, computing basic energy terms, etc. The core is intentionally designed to be extremely basic and transparent. It avoids any kind of implicit data sharing between different parts of a program, which plagues other toolkits and tends to cause serious problems down the road, especially when dealing with parallel execution of the code. One layer above, the basic functions are grouped into useful reusable blocks that we call add-ons.
These blocks are written to match one of a small number of common interfaces, which defines the context in which a block will be used. They are written so that a programmer can easily combine them to form what one might call the business logic of the entire software stack. The resulting functionality is exposed through small command line interface (CLI) programs that are given to the protein engineers, who are the effective users of the software.
ZymeFlow: Our team consists of protein engineers and computational chemists working together on the design of computational experiments that aid the structure guided computational optimization of proteins. From a software development angle, this calls for a workflow execution technology that minimizes the development required to turn a new idea (in terms of a computational modeling or simulation protocol) into code and subsequently generate the relevant data. To meet this challenge, our software development infrastructure supports an abstract workflow model consisting of three objects: a source, actors and one or more sinks.
Source: This is the input data for a particular action, and includes metadata associated with that data to uniquely identify it within the ZymeFlow framework. The purpose of the source is to transform arbitrary data into discrete units of work, depending upon the available resources. These resources may be the number of processing cores, the available memory, the network topology, the level of granularity required or other considerations related to the parallelism.
Actors: These represent the actions to be performed on a particular set of data. Actors can be of arbitrary complexity, and use data from the source and output them to the sink, or to further actors depending upon the workflow.
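The source/actor/sink model described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ZymeFlow API: every name below (`WorkUnit`, `source`, `make_actor`, `Sink`, `run_workflow`) is hypothetical, and the "available resources" are reduced to a single chunk size.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class WorkUnit:
    """One discrete unit of work plus the metadata that identifies it."""
    unit_id: str
    payload: tuple


def source(items: Iterable, chunk_size: int, run_id: str = "run") -> Iterator[WorkUnit]:
    """Split arbitrary input into work units sized to the available resources."""
    items = list(items)
    for start in range(0, len(items), chunk_size):
        chunk = tuple(items[start:start + chunk_size])
        yield WorkUnit(unit_id=f"{run_id}-{start // chunk_size}", payload=chunk)


Actor = Callable[[WorkUnit], WorkUnit]


def make_actor(fn: Callable) -> Actor:
    """Wrap a per-item function as an actor; the unit_id is carried along for lineage."""
    def actor(unit: WorkUnit) -> WorkUnit:
        return WorkUnit(unit.unit_id, tuple(fn(x) for x in unit.payload))
    return actor


@dataclass
class Sink:
    """The only point in the flow where results become visible to the user."""
    results: dict = field(default_factory=dict)

    def write(self, unit: WorkUnit) -> None:
        self.results[unit.unit_id] = unit.payload


def run_workflow(units: Iterable[WorkUnit], actors: List[Actor], sink: Sink) -> Sink:
    """Pass every work unit through the chain of actors, then into the sink."""
    for unit in units:
        for actor in actors:
            unit = actor(unit)
        sink.write(unit)
    return sink


sink = run_workflow(
    source(range(5), chunk_size=2),
    [make_actor(lambda x: x * x), make_actor(lambda x: x + 1)],
    Sink(),
)
print(sink.results)  # {'run-0': (1, 2), 'run-1': (5, 10), 'run-2': (17,)}
```

Because each work unit keeps its identifier from source to sink, intermediate results could in principle be stored under that identifier, which is the property a checkpoint-and-resume scheme would rely on.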
The flow of data between actors is meant to be non-human readable, and interactions with the user can be made only at prepositioned data sinks explicitly created in the workflow. This maintains the integrity of the data and its lineage, which can be traced at any given point. It also allows for efficient checkpointing and resuming of flows from intermediate stages, since the data is stored (or referenced) in the object database at every point and can be uniquely identified.
Results:
Use of ZymePy in the development of ZymePack: ZymePack is a proprietary algorithm for repacking protein sidechains in a high throughput manner, using specialized parallelization algorithms to take advantage of our high performance computing cluster, along with other algorithms designed to reduce the computational complexity to a tractable amount. This algorithm makes extensive use of the ZymePy molecular modeling framework developed in-house to perform mutations, conformational optimizations and energy computations and to apply other elimination criteria. The ZymePy core is flexible enough to allow us to experiment with multiple energy functions, parameter sets (force fields), and optimization algorithms. The underlying parallelization algorithm makes use of Map-Reduce techniques to split up the computations among participating processors with minimal overhead, and each application using ZymePy can make use of this framework. The ZymePy framework allows us to efficiently manipulate structures to perform mutations, compute energy differences, and optimize their conformations in a high throughput manner, and hence to apply such techniques to traditionally intractable problems such as antibody Fab/Fc/receptor regions. These capabilities have greatly accelerated our time-to-production on many projects, including Fc/FcR optimization on an antibody-receptor interface. ZymePack is also used for repacking regions of a protein that have been modeled with low resolution crystallographic information, or that rely on homology modeling to generate structures. ZymePy also allows for easy manipulation of protein structures, including backbone conformational changes, and this gives us a unique technological edge in accurately re-modeling structures after a complex mutation has been proposed for protein engineering purposes.
Use of ZymePy in the development of Knowledge Based Potentials for stability and mutagenesis scoring: A knowledge based potential is a powerful algorithm that is used to assign scores to candidate structure conformations or to mutations. The score is based on knowledge that is derived from a large database of structures such as the Protein Data Bank. The algorithm needs access to a large number of structure files to extract the necessary atom and residue information. The knowledge based potential implementation at Zymeworks uses one-line ZymePy calls to read these structures.
Once a structure has been loaded, ZymePy makes it easy to access detailed structural information such as chain IDs, sequence numbers, residue types, and atomic coordinates, all of which the algorithm needs for its calculations. Further, with the help of ZymePy, structures can be written out and visualized in third party packages. Color coding can be based on a particular property of a residue, such as its mutability, as computed by the knowledge based potential; ZymePy allows this information to be coded into the structure file. Finally, “add-ons” and other scripts can be written that make use of ZymePy functionality. Tools to perform conformational clustering, find central structures, perform principal component analysis, and compute solvent accessible surface areas have been programmed using Python and ZymePy with minimal lines of code and developer time. Because ZymePy takes care of the details behind many everyday operations, developers can focus their efforts on the scientific problem at hand.
Usage of ZymeFlow for FcR optimization in antibody: In this project, the ZymeFlow framework was applied to an antibody optimization problem in which a large subset of simultaneous mutations was proposed for optimization purposes. The ZymeFlow framework made use of the sidechain optimization capabilities of ZymePack and other ZymePy functionality, such as Knowledge Based Potentials, to score the resulting mutations, using the resource allocator built into the ZymeFlow framework. This allowed the computational chemists to focus on building efficient scoring tools rather than spending time on the parallelization of their tools, and allowed the protein engineers to run this workflow recursively with multiple simultaneous mutation sites and optimization parameters using the full power of our computational resources.
ROI achieved:
Conclusions:
References:
1. Progress in computational protein design. S. M. Lippow and B. Tidor. Curr. Opin. Biotechnol. 18: 305-311 (2007)
2. Kepler: an extensible system for design and execution of scientific workflows. I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludascher and S. Mock. Proceedings of the 16th International Conference on Scientific and Statistical Database Management, pages 423-424 (2004)
3. MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. http://labs.google.com/papers/mapreduce.html
A modular computational modeling environment for structure guided optimization of protein therapeutics.
Abstract: Computational modeling and simulation of proteins holds widespread promise as an efficient tool for the knowledge guided optimization of complex therapeutic proteins such as antibody drugs, in an industry where optimization otherwise relies on maturation technologies. The field of molecular simulation has progressed significantly in recent times, thanks to the development of improved molecular models, algorithms and high performance computing infrastructure, but the application of these methods to therapeutic protein drug discovery has lagged because the technology is not readily accessible. This technology provides a unique platform conducive to rapid development, testing and deployment cycles on the software architecture side, and a friendly and expanding application environment for the user.
Background: Protein therapeutics, in particular antibody-based therapeutics, are one of the fastest growing areas of drug discovery in the pharmaceutical industry. The optimization of lead protein therapeutic candidates is traditionally carried out using maturation technologies that are marred by explosive combinatorial complexity, resulting in inefficiencies along the route to identifying the complex combination of mutations required to achieve the optimization goal.
On the other hand, the nature of protein molecules, which involves a delicate balance between sequence composition and activity, often necessitates such combinations of mutations to achieve the target properties of interest. But this same issue also enormously complicates any plan for “engineering” them in a rational manner. In recent years, a number of computational methods have proven valuable in providing unique structural, dynamic and biophysical insights that can aid scientists in steering their protein engineering effort towards a rational or semi-rational process. This calls for an integrated and comprehensive infrastructure providing tools in a number of areas within the field of computational modeling and simulation of protein molecules:
• Protein structure optimization methods to model the effects of mutations
• Data mining procedures to characterize and identify hotspots: Structural and Functional
• Conformational sampling technologies that can simulate the dynamic nature of protein molecules and data mining procedure to extract
• Identify intrinsic correlations between residues and their cooperative tendencies affecting allosteric behavior
• Free energy simulation methods that provide quantitative insight on the effect of proposed mutations and their physical or thermodynamic basis
While a host of academic groups have pushed the boundaries of research in these areas, a regular application environment that a protein engineering team can readily employ to support work drawing on this diversity of methods and algorithms is missing. Besides the complex theoretical basis, application of such technology is hindered by involved user interfaces that are difficult to build upon and integrate into complex workflows, and it leads to major data mining challenges given the vast amount of information generated during these simulations. From a development angle, collaborative use of the available programs is fraught with challenges in communicating data between the different programs due to incompatible formats and data exchange issues. Hence a comprehensive development environment calls for a library of routines that capture the core functionalities required in the modeling and simulation exercise.
ZymePy: To facilitate rapid application development and to reduce operational complexity for both software engineers and scientists, we introduced a common software framework, based on a mix of the Python and C programming languages, that we call ZymePy. The core API of ZymePy provides facilities for selecting and manipulating the composition and geometry of structures, either as individual frames or as trajectories, manipulating solvent and ion configurations and residue side chain geometries, defining restraints, computing energy terms, and various input/output functionality. The main reasons for producing this molecular development environment were:
• Full control over the design and code quality to target company-specific applications.
• Superior development language with access to a vast number of scientific libraries.
• Quick development turn-around time in case library modifications are needed.
ZymePy is a multi-layered library that provides functionality at several levels of granularity. Its core provides a representation of basic molecular structures, facilities for selecting and manipulating composition and geometry, either as individual structures or as entire trajectories, manipulating solvent and ion configuration, defining simulation restraints, computing basic energy terms, etc. The core is intentionally designed to be extremely basic and transparent. It avoids any kind of implicit data sharing between different parts of a program, which plagues other toolkits and tends to cause serious problems down the road, especially when dealing with parallel execution of the code. One layer above, the basic functions are grouped into useful reusable blocks that we call add-ons. These blocks are written to match one of a small number of common interfaces, which defines the context in which a block will be used. They are written so that a programmer can easily combine them to form what one might call the business logic of the entire software stack. The resulting functionality is exposed through small command line interface (CLI) programs that are given to the protein engineers, who are the effective users of the software.
ZymeFlow: Our team consists of protein engineers and computational chemists working together on the design of computational experiments that aid the structure guided computational optimization of proteins. From a software development angle, this calls for a workflow execution technology that minimizes the development required to turn a new idea (in terms of a computational modeling or simulation protocol) into code and subsequently generate the relevant data. To meet this challenge, our software development infrastructure supports an abstract workflow model consisting of three objects: a source, actors and one or more sinks.
Source: This is the input data for a particular action, and includes metadata associated with that data to uniquely identify it within the ZymeFlow framework. The purpose of the source is to transform arbitrary data into discrete units of work, depending upon the available resources. These resources may be the number of processing cores, the available memory, the network topology, the level of granularity required or other considerations related to the parallelism.
Actors: These represent the actions to be performed on a particular set of data. Actors can be of arbitrary complexity, and use data from the source and output them to the sink, or to further
actors depending upon the workflow. The flow of data between actors is meant to be non-human readable, and interactions with the user can be made only at prepositioned data sinks explicitly created in the workflow. This maintains the integrity of the data and its lineage, which can be traced at any given point. It also allows for efficient checkpointing and resuming of flows from intermediate stages, since the data is stored (or referenced) in the object database at every point and can be uniquely identified.
Benefits and ROI: This technology improves inter-disciplinary communication and recursive development by creating a framework to efficiently build complex tools from smaller building blocks and to manage the relationships between the data, the planning of workflows, their execution and the associated data storage. The benefits to the functional groups are:
• Protein engineers think about the conceptual workflow rather than implementation details. This reduces complexity by abstracting computational and storage resources, parallelization and the data formats of intermediate steps. Integrated data lineage tracks IO relationships and eliminates duplicate computations.
• Computational chemists improve time-to-solution by providing a customizable toolbox of actors. The platform is future-proof in that it can use any underlying batch queuing system, scheduler, storage and backup infrastructure, or parallelization paradigm.
• Systems architects can accurately predict storage and computational demands from users for better capacity planning.
Example application for Protein Engineering
Use of ZymePy in the development of ZymePack: ZymePack is a proprietary algorithm for repacking protein sidechains in a high throughput manner, using specialized parallelization algorithms to take advantage of our high performance computing cluster, along with other algorithms designed to reduce the computational complexity to a tractable amount.
This algorithm makes extensive use of the ZymePy molecular modeling framework developed in-house to perform mutations, conformational optimizations and energy computations and to apply other elimination criteria. The ZymePy core is flexible enough to allow us to experiment with multiple energy functions, parameter sets (force fields), and optimization algorithms. The underlying parallelization algorithm makes use of Map-Reduce techniques to split up the computations among participating processors with minimal overhead, and each application using ZymePy can make use of this framework. The ZymePy framework allows us to efficiently manipulate structures to perform mutations, compute energy differences, and optimize their conformations in a high throughput manner, and hence to apply such techniques to traditionally intractable problems such as antibody Fab/Fc/receptor regions. These capabilities have greatly accelerated our time-to-production on many projects, including Fc/FcR optimization on an antibody-receptor interface. ZymePack is also used for repacking regions of a protein that have been modeled with low resolution crystallographic information, or that rely on homology modeling to generate structures. ZymePy also allows for easy manipulation of protein structures, including backbone conformational changes, and this gives us a unique technological edge in accurately re-modeling structures after a complex mutation has been proposed for protein engineering purposes.
Use of ZymePy in the development of Knowledge Based Potentials for stability and mutagenesis scoring: A knowledge based potential is a powerful algorithm that is used to assign scores to candidate structure conformations or to mutations. The score is based on knowledge that is derived from a large database of structures such as the Protein Data Bank. The algorithm needs access to a large number of structure files to extract the necessary atom and residue information.
The knowledge based potential implementation at Zymeworks uses one-line ZymePy calls to read these structures. Once a structure has been loaded, ZymePy makes it easy to access detailed structural information such as chain IDs, sequence numbers, residue types, and atomic coordinates, all of which the algorithm needs for its calculations. Further, with the help of ZymePy, structures can be written out and visualized in third party packages. Color coding can be done based on a particular property of a residue, such as its mutability, as computed by the knowledge based potential. ZymePy allows this information to be coded into the structure file. Finally, “addons” and other scripts can be written that make use of ZymePy functionality. Tools to perform conformational clustering, to find central structures, to perform principal component analysis, and to compute solvent accessible surface areas have been programmed using Python and ZymePy with minimal lines of code and developer time. Because ZymePy takes care of the details behind the many everyday operations, developers can focus their efforts on the scientific problem at hand.
Usage of ZymeFlow for FcR optimization in antibody: In this project, the ZymeFlow framework was applied to an antibody optimization problem, where a large subset of simultaneous mutations was proposed for optimization purposes. The ZymeFlow framework made use of the sidechain optimization capabilities of ZymePack and other ZymePy functionality, such as Knowledge Based Potentials, to score the resulting mutations, using the resource allocator built into the ZymeFlow framework. This allowed the computational chemists to focus on building efficient scoring tools rather than spending time on the parallelization of their tools, and allowed the protein engineers to run this workflow recursively with multiple simultaneous mutation sites and optimization parameters using the full power of our computational resources.
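The Map-Reduce pattern mentioned above, applied to scoring many simultaneous mutations, can be illustrated with a minimal Python sketch. It is not ZymePy or ZymeFlow code: the function names, the dictionary of mutation sites, and the toy scoring rule are all hypothetical stand-ins, since the actual API is not shown in this entry. The sketch only shows why independent scoring calls are easy to distribute.

```python
from functools import reduce
from itertools import product


def score_variant(variant):
    """Map step. Hypothetical stand-in for an energy or knowledge based score.

    Toy rule (an assumption for illustration only): count the sites that are
    not alanine ('A'); a lower score is treated as better.
    """
    return variant, sum(1 for _, residue in variant if residue != "A")


def keep_best(best, scored):
    """Reduce step: keep the (variant, score) pair with the lower score."""
    return scored if best is None or scored[1] < best[1] else best


def enumerate_variants(sites):
    """Yield every combination of simultaneous mutations.

    `sites` maps a residue position to the residue types allowed there,
    for example {10: "AV", 12: "AL"}.
    """
    positions = sorted(sites)
    for combo in product(*(sites[p] for p in positions)):
        yield tuple(zip(positions, combo))


def best_variant(sites, map_fn=map):
    """Score all variants independently, then fold the results into one best hit.

    `map_fn` defaults to the built-in map; because the scoring calls do not
    depend on each other, a parallel map could be passed in instead.
    """
    scored = map_fn(score_variant, list(enumerate_variants(sites)))
    return reduce(keep_best, scored, None)


if __name__ == "__main__":
    print(best_variant({10: "AV", 12: "AL"}))
```

Passing the map function as a parameter keeps the scoring code unaware of how the work is distributed, which mirrors the separation of scoring tools from parallelization described above.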
Conclusions: We have developed a modular computational platform to assist in structure and knowledge guided optimization of protein therapeutics. It is based on a molecular modeling and structure manipulation library used to develop a number of protein engineering algorithms that can be combined to create powerful analysis and modeling applications. We have also developed a high level architecture that simplifies protein engineering workflows by allowing different combinations of protein engineering tools while abstracting the complexity of the underlying parallelization algorithms, data storage, data transfer and lineage away from the user. It maintains complete coherence of data across multiple simultaneous workflows, helping to gain unique insights into protein structure-function relationships.
References:
1. S. M. Lippow and B. Tidor. Progress in computational protein design. Curr. Opin. Biotechnol. 18: 305-311 (2007).
2. I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludascher and S. Mock. Kepler: an extensible system for design and execution of scientific workflows. Proceedings of the 16th International Conference on Scientific and Statistical Database Management, pages 423-424 (2004).
3. J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. http://labs.google.com/papers/mapreduce.html
Bio‐IT World 2010 Best Practices Awards Nominating Organization name: Nominating Organization address: Nominating Organization city: Minsk Nominating Organization state: Republic of Belarus Nominating Organization zip: 220141 Nominating Contact Person: Nominating Contact Person Title: Nominating Contact Person Phone: Nominating Contact Person Email: User Organization name: Institute of Bioorganic Chemistry of the National Academy of Sciences of Belarus User Organization address: Kuprevich str.
5/2 User Organization city: Minsk User Organization state: Republic of Belarus User Organization zip: 220141 User Organization Contact Person: Alexander Andrianov User Organization Contact Person Title: Principal Investigator User Organization Contact Person Phone: (+375)‐17‐267‐82‐63 User Organization Contact Person Email: [email protected]‐net.by Project Title: Computer‐assisted anti‐AIDS drug development: cyclophilin B against the HIV‐1 subtype A V3 loop Team Leaders name: Team Leaders title: Team Leaders Company: Team Leaders Contact Info: Team Members name: Ivan Anishchenko Team Members title: software developer Team Members Company: United Institute of Informatics Problems of the National Academy of Sciences of Belarus Entry Category: Basic Research & Biological Research
Abstract Summary: Introduction: The envelope glycoprotein (Env) of HIV‐1, the etiologic agent of AIDS [1], consists of two noncovalently bound subunits derived from the gp160 precursor. One of these subunits, the gp120 protein, is localized on the surface of the viral isolates and takes a direct part in virus binding to the target cells, whereas the other, the trans‐membrane gp41 protein, triggers the process of membrane fusion resulting in the invasion of the virus genome into macrophages and T‐lymphocytes [2]. Specific interactions of HIV‐1 with the primary virus receptor CD4 as well as with its chemokine co‐receptors CCR5 and/or CXCR4 are mediated by the V1‐V5 loops of gp120, which show high variability of their amino acid sequences in diverse virus strains [3‐5]. Currently, research teams involved in anti‐AIDS drug studies place special emphasis on the HIV‐1 V3 loop (reviewed in [6]).
The heightened interest in V3 stems from numerous experimental data [7] showing that this gp120 site forms the principal target for neutralizing antibodies and accounts for the choice of co‐receptor, which determines the preference of the virus for T‐lymphocytes or primary macrophages [8]. The differential usage of co‐receptors, which is critically dependent on the sequence, charge, and/or structure of the V3 region of gp120 [9, 10], dictates the viral phenotype, which shows a typical pattern of evolution during the natural history of HIV‐1 infection. CCR5‐restricted strains (R5) are the most prevalent in vivo, as they are almost invariably responsible for the initial transmission, predominate during the long asymptomatic phase of the infection, and often persist after the progression to full‐blown AIDS; by contrast, strains that utilize CXCR4, either alone (X4) or in combination with CCR5 (R5X4), emerge only in a subset of patients, typically in conjunction with the onset of clinical signs of disease progression and immune system deterioration [11, 12]. Since the V3 loop governs cell tropism and cell fusion (see, e.g., [7]), one strategy for developing anti‐HIV‐1 drugs is to search for chemicals able to block efficiently this functionally significant stretch of gp120 [6]. Comprehensive analysis of the data given in study [13] suggests that immunophilins, which exhibit specific high‐affinity interactions with the HIV‐1 V3 loop, may serve as a starting point in the search for potential anti‐AIDS therapeutic agents.
Immunophilins, known originally as the cellular receptors of the immunosuppressive drugs cyclosporine A and FK506 or rapamycin, form an extensive group of proteins exhibiting peptidyl‐prolyl cis‐trans isomerase activity, which is inhibited specifically and efficiently on binding of the corresponding immunosuppressant [14]. Immunophilins are subdivided into three families of proteins, namely cyclophilins, FK506‐binding proteins (FKBPs), and a novel chimeric dual‐family immunophilin named FK506‐ and cyclosporine‐binding protein (FCBP); they show similar enzymatic and biological functions despite the apparent differences in their sequences and three‐dimensional structures (reviewed in [15]). Along with their function as intracellular receptors of immunosuppressants, individual immunophilins act as catalysts of protein folding and as chaperones stabilizing proteins in a defined conformation and supervising the quality of their spatial structure [16, 17]. A variety of bacterial and protozoan pathogens express FKBP‐related peptidyl‐prolyl cis‐trans isomerases termed macrophage‐infectivity potentiators (Mip). Mip proteins act in host cell infection as virulence factors, either as membrane‐bound proteins on the surface of the pathogens or as soluble secreted proteins [18, 19]. The peptidyl‐prolyl cis‐trans isomerase activity of Mip proteins is suppressed by FK506, which reduces the infectivity of the pathogens without affecting the rate of intracellular replication. Distinct immunophilins were found to be released from cells. Cyclophilin B was detected in human milk [20] and blood plasma [21], but is mainly localized in the endoplasmic reticulum of cells. The cytosolic immunophilins cyclophilin A and FKBP12 were shown to be released during apoptosis of fibroblasts [22] and to act as chemokines by an unknown mechanism [23, 24, 25]. Recent research has revealed that many immunophilins possess a chaperone function independent of peptidyl‐prolyl cis‐trans isomerase activity (reviewed in [15]). Knockout animal studies have confirmed multiple essential roles of immunophilins in physiology and development, consisting in interactions with proteins to guide their proper folding and assembly [15]. On the basis of empirical observations, there is good reason to think that immunophilins present in normal human blood plasma are directly relevant to HIV‐1 replication, assisting the virus in entering macrophages and T‐lymphocytes (see, e.g., [26]).
In particular, cyclophilin A packaged into nascent virus particles by specific binding to the capsid region of the Gag precursor protein at the time of viral assembly [27, 28,29], was found to mediate the HIV‐1 attachment to the target cells via heparans followed by the gp120‐CD4 interaction [26]. Due to the interaction of immunophilins with the HIV‐1 isolates, their role of conformases or docking mediators in the virus life cycle seems to be highly probable, since immunophilin receptors on cell membranes and immunophilin‐related virulence factors of pathogens have been identified (see, e.g., [13]). The work presented here proceeds with our previous studies [30, 31] where two virtual molecules, namely FKBP and Cyc A peptides, presenting the promising anti‐HIV‐1 pharmacological substances were designed by means of computer modeling based on the analysis of specific interactions of the FK506‐binding protein and cyclophilin A with V3. The object of the present study was to model the structural complex of one more protein from immunophilins super‐family, cyclophilin (Cyc) B, with the HIV‐1 subtype A V3 loop (SA‐V3 loop) circulating in Eastern Europe including Republic of Belarus and, therefore, offering the target of our special interest, as well as to specify the Cyc B segment forming the binding site for V3, the synthetic copy of which, on the assumption of keeping the 3D peptide structure in the free state, may be considered as a forward‐looking applicant for the role of a new antiviral drug. To this effect, molecular docking of the HIV‐1 SA‐V3 structure determined previously [32] with the X‐ray conformation of Cyc B was put into practice, and the Cyc B stretch responsible for the binding to V3 was identified followed by predicting the most probable 3D structure of this stretch in the unbound state, studying its dynamic behavior, and collating the results obtained with the X‐ray data for the corresponding site of Cyc B. 
Thereupon, the potential energy function was analyzed for the complex of the SA‐V3 loop with the Cyc B peptide, a virtual molecule that imitates the Cyc B segment making a key contribution to the interactions of the native protein with V3. As a result, the designed peptide was shown to be capable of effectively masking the functionally critical and structurally rigid V3 sites, presenting a suitable framework for protein engineering projects that use the V3 target to develop anti‐AIDS drugs able to stop the spread of HIV.
Results:
Molecular docking simulations
Molecular docking of the SA‐V3 loop [32] with Cyc B (file 1CYN of the Protein Data Bank [33, 34]) as well as with the Cyc B peptide was carried out with the Hex 4.5 program [35], an interactive molecular graphics package for calculating and displaying feasible docking modes of pairs of protein and DNA molecules that employs spherical polar Fourier correlations to accelerate the computations. Energy refinement of the generated complexes was performed in the GROMACS package [36] by minimizing their potential energy. To this end, the conjugate gradient method was used for the complex of the native protein with the V3 loop as well as for the overmolecular ensemble of V3 with the Cyc B peptide. At the final point of the computations, the structural complexes were subjected to simulated annealing carried out over a 100 ps time interval at initial and final temperatures of 500 and 0 K, respectively.
Determination of the 3D static structure for the Cyc B peptide and molecular dynamics computations
To design the 3D structure of the Cyc B peptide, the X‐ray conformation of the Cyc B site [37] responsible for its binding to V3 was used as a starting model to find its energetically most favorable structural variant in the unbound form. The search for this most preferable conformation was executed by consecutive use of the molecular mechanics and simulated annealing methods implemented in the Tinker package [38], using its program modules Minimize and Anneal. The MD simulations of the built Cyc B peptide structure were performed with the GROMACS package [36] using the GROMOS96 force field parameter set 53A6 [39]. The starting 3D structure of the Cyc B peptide generated as described above was placed in a cubic box so that the smallest distance between its walls and the peptide atoms was greater than half of the cut‐off radius of the Coulomb and Lennard‐Jones potentials, fixed at 1.4 nm. The simple point charge water model [40] was used for the explicit solvent, with periodic boundary conditions imposed in all directions. Before the MD computations, the initial Cyc B peptide model was subjected to energy minimization performed in vacuum by the steepest descent method. The MD simulations were carried out at a temperature of 310 K over a 20.5 ns time interval with a 1 fs step at fixed pressure and number of atoms, the first 0.5 ns being the stage of solvent relaxation. To integrate Newton's equations of motion, the common leap‐frog algorithm was used. To control the temperature, the weak coupling scheme to an external bath [41] was employed with a characteristic time of 0.1 ps. As with the temperature coupling, the system was linked to a "pressure bath" by exponential relaxation of pressure [41] with a time constant of 1.0 ps. Every 10 ps, the geometric parameters of the MD structures and the data on their energy characteristics were recorded into the trajectory file. Comparison of the MD conformations with one another and with the input structure was performed in terms of the root‐mean‐square deviations computed both in Cartesian and angular space. To this end, GROMACS routines [36] were used. The computations were run in parallel on the SKIF K‐1000 computer cluster on 64 CPUs [42].
Identification of secondary structures in the Cyc B peptide
To determine the different types of secondary structures in the Cyc B peptide, the φ, ψ values for all of the amino acids derived from the simulated model were analyzed in compliance with the criteria given in study [43]. The types of β‐ and γ‐turns were identified within the classification of Hutchinson and Thornton [44]. To detect the nonstandard β‐turns, additional information on the Cα(i)‐Cα(i+3) distances computed from the atomic coordinates of the simulated structures was employed.
Collation of 3D static structures
The values of root‐mean‐square deviations (RMSD) in atomic coordinates (cRMSD) were taken to evaluate the similarity of the structures in the Cartesian space (see, e.g., review [45]). To compare the structures in terms of the dihedrals, the RMSD between corresponding angles (aRMSD) was used as a measure of their conformational similarity in the angular space [45].
Figure 1 shows the structural complex of the HIV‐1 SA‐V3 loop with Cyc B generated via molecular docking of their 3D structures followed by optimization of its geometric parameters.
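The two similarity measures defined above, cRMSD and aRMSD, can be sketched in a few lines of Python. This is an illustrative sketch only, not the GROMACS routines used in the study. It assumes that the two structures are already superposed and that atoms and dihedral angles are listed in the same order, and it wraps angular differences into the range from -180° to 180°, a convention the text does not spell out. The function names are hypothetical.

```python
import math


def crmsd(coords_a, coords_b):
    """RMSD between paired atomic coordinates, in the units of the input.

    Assumption: the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def armsd(angles_a, angles_b):
    """RMSD between paired dihedral angles (degrees), using wrapped differences.

    The wrapping is an assumption: it makes 179 and -179 degrees differ by 2,
    not by 358.
    """
    if len(angles_a) != len(angles_b) or not angles_a:
        raise ValueError("angle lists must be non-empty and of equal length")
    total = sum(angle_diff(a, b) ** 2 for a, b in zip(angles_a, angles_b))
    return math.sqrt(total / len(angles_a))


if __name__ == "__main__":
    print(crmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
    print(armsd([179.0, 10.0], [-179.0, 10.0]))
```

Without the wrapping step, two nearly identical conformations whose dihedrals lie on either side of ±180° would appear very different in angular space.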
Insight into the function describing the energy surface of the built complex makes it clear that the binding of V3 to Cyc B initiates the formation of stable overmolecular structure that is characterized by the value of potential energy equal to ‐6434 kcal/mol. Analysis of the matrix of interatomic contacts coming true in the designed complex allows one to identify the amino acids of V3 and Cyc B participating in the intermolecular interactions the total energy of which comes to ‐75 kcal/mol. So, according to the data obtained, such V3 residues as Ser‐11, Val‐
12, Gly‐15, Pro‐16, Gly‐17, Gln‐18, Ala‐19, Thr‐23, and Arg‐31 take up positions nearby the surface of Cyc B giving rise to the binding site for V3 by means of Gly‐1, Pro‐2, Lys‐3, Gly‐28, Lys‐29, Thr‐30, Lys‐91, Lys‐93, and Glu‐178. One needs to note that cooperation of the V3 loop with Cyc B results in the origin of one ion pair organized by Arg‐31 of V3 and Glu‐178 of Cyc B and in the formation of six H‐bonds that appear as a result of donor‐acceptor interactions of the receptor amino acids Lys‐91, Lys‐93, and Glu‐178 on the one hand as well as of V3 residues Ser‐11, Val‐12, Gln‐18, Ala‐19, and Thr‐23 on the other hand (see information given in Table 1). Figure 1. Image of the structural complex between the HIV‐1 SA‐V3 loop (tubes) and Cyc B (balls). Table 1. Geometric parameters of intermolecular H‐bonds for the structural complex of the HIV‐1 SA‐V3 loop with Cyc B. These results signify that interaction of the V3 loop with Cyc B entails the blockade of its central region making the immunogenic crown of gp120 [46], whereas the residues of V3 N‐ and C‐terminal segments (except for Arg‐31) relating to its stem [46] do not come in direct contact with the receptor. The data above are in harmony with those of study [13], whereby high affinity to immunophilins is typical not merely for intact V3 variable loops but also for their peptides embracing the immunogenic tip of gp120. Among the segments of V3 interacting effectively with Cyc B, it is essential to mark its tripeptide Gly‐15‐Pro‐16‐Gly‐17 occurring actually in all of the deciphered amino acid sequences of the HIV‐1 principal neutralizing determinant [47]. Functional role of this invariant V3 stretch has not been completely specified. Nevertheless, as mentioned above, even a single substitution for its central residue by alanine makes an impact both on the virus immunogenicity and infectivity [48] testifying to important role of Pro‐16 in the HIV‐1 life cycle. 
According to the data obtained, the 3D structure of the V3 fragment of interest is practically identical to that of Cyc B site Gly‐1‐Pro‐2‐Lys‐3, which is spatially close to it: the value of cRMSD computed for all of the atoms of their main chains is 0.7 Å. The resemblance of the 3D main chain shapes observed for the two segments of the ligand and the receptor makes it possible to suggest that Cyc B stretch Gly‐1‐Pro‐2‐Lys‐3 gives rise to a signal structure that is interpreted by V3 as a mirror image of its own immunogenic crest, which is most likely the main reason for the specificity of V3 interactions with immunophilins. In this light, the findings above confirm the validity of the assumption made in our previous studies [30, 31], where the specific high‐affinity interactions of the HIV‐1 V3 variable loops with immunophilins observed experimentally [13] were suggested to arise from the presence in their amino acid sequences of fragments exposing similar 3D structures built from β‐turns of the polypeptide chain (for details see works [30, 31]). Thus, the molecular docking data point to energetically favorable contacts of the HIV‐1 SA‐V3 loop with Cyc B that result in the masking of some of the key V3 amino acids of its immunogenic crown. In this context, one could suggest the possible use of immunophilins and, in particular, Cyc B as an alternative to the V3‐directed antibodies commonly used to neutralize HIV‐1 activity. However, the evidence of study [49], demonstrating that an increase of immunophilin concentration in infected blood plasma does not influence the virus infectivity, conflicts with this simple conjecture. In the case of Cyc B, the probable cause of its insufficient neutralizing activity may be that, as follows from our simulations, the binding of the immunophilin to the HIV‐1 V3 loop occurs via interactions with the central region of V3 and does not affect its N‐ and C‐termini (Figure 1), where the major portion of the residues involved in cell tropism and cell fusion is localized [50‐52]. Therefore, to amplify the blockade of V3 and preserve its capacity for specific interactions with Cyc B, we have attempted to design, as a potential anti‐HIV‐1 drug, a virtual molecule named the Cyc B peptide that imitates N‐terminal segment 1‐30 of the native immunophilin. The choice of Cyc B segment 1‐30 for continuation of our studies is motivated as follows: according to the modeling data, it contains tripeptide Gly‐1‐Pro‐2‐Lys‐3 recognizable by the virus immunogenic crest and comprises a significant share of the residues making up the binding site for V3. Certainly, such a choice is justified only if the 3D structure of this Cyc B segment does not undergo considerable alterations in its free state. To check whether that is true, we computed the most preferable 3D structure of the Cyc B peptide and compared it with the one appearing in the crystal [37] in the corresponding site of the intact protein. Analysis of Figure 2, which shows the superposed peptide structures, gives grounds to conclude that the spatial folds of their backbones are closely related, and this inference arising from visual observation is supported by a cRMSD value of 2.4 Å. In comparing the structures given in Figure 2, it is essential to underscore that very close agreement between them (cRMSD is 0.5 Å) occurs in segment Gly‐1‐Pro‐2‐Lys‐3 that, as stated above, forms in the native Cyc B the conformational epitope specifically recognizable by V3. An analogous conclusion that the compared structures are alike follows from their comparison in the conformational space (φ, ψ): the value of aRMSD calculated for all of the peptide amino acids comes to 33°.
Figure 2. 3D structure of the Cyc B peptide superposed with the X‐ray conformation for segment 1‐30 of the entire protein.
Insight into the static model for the 3D structure of the Cyc B peptide (Figure 3a) shows that an essential contribution to its energy stabilization belongs to the donor‐acceptor interactions that result in an extensive network of hydrogen bonds appearing between amino acids both distant and adjacent in the polypeptide chain.
The molecule generated by computer modeling tools has an elongated "construction" in which the spatially close N‐ and C‐terminal residues give rise to a long‐range H‐bond between the oxygen of the Gly‐1 carboxyl group and the hydrogen of the hydroxyl group of the Thr‐30 side chain. As follows from the dihedral values given in Table 2, central part 10‐22 of the Cyc B peptide constitutes a β‐sheet, the "oval isthmus" of which is composed of consecutive β‐turns (Figure 3b), with their spatial folds being similar to those identified above in the immunogenic crown of the HIV‐1 SA‐V3 loop. In particular, according to our simulations, tetrapeptide Ile‐14‐Gly‐15‐Asp‐16‐Glu‐17 of the Cyc B peptide adopts the conformation of nonstandard β‐turn IV close to that of stretch Gly‐15‐Pro‐16‐Gly‐17‐Gln‐18 of V3: in this case, the value of cRMSD estimated for the backbone atoms of the compared structures is 1.1 Å. The resemblance in the structural organization of the central regions of V3 and the Cyc B peptide also holds for their longer fragments. For instance, if we compare the 3D structure of V3 stretch 15‐20, which produces the overwhelming majority of contacts with neutralizing antibodies (see, e.g., study [53]), with that of the Cyc B peptide segment 14‐19, it is also possible to observe the conformity of their spatial backbone folds (the corresponding value of cRMSD is 2.0 Å). This outcome enables one to assume that, subject to the principle of "mirror similarity" formulated in studies [30, 31], the segment of the Cyc B peptide forming the "oval isthmus" of the β‐sheet and located in the native protein inside its globule may provide an additional site for specific binding to V3, alternative to stretch Gly‐1‐Pro‐2‐Lys‐3 of the intact immunophilin.
Figure 3. Three‐dimensional (a) and secondary (b) structures of the Cyc B peptide generated based on the X‐ray conformation of site 1‐30 for the intact Cyc B.
Table 2. Dihedral angles for amino acids in the 3D structure of the Cyc B peptide.
When looking into the secondary structure of the designed molecule (Figure 3b), one notices the peptide segment 27‐30 that, like the stretches of its central part, adopts the conformation of a β‐turn, which deserves particular attention in view of the data on the 3D structure of the SA‐V3 loop [32], where the C‐terminal site forms exactly the same structural motif. This evidence combined with that above leads to the following conclusion: the secondary structures of V3 and the Cyc B peptide observed in their central and C‐terminal portions are closely related.
In addition, the main chain dihedrals (Table 2) indicate that, besides β‐turns, the analyzed structure also forms γ‐bends, the central residues of which are located in positions 15, 18, and 21 (Figure 3b). The results of molecular dynamics simulations carried out over a 20 ns time interval with the static 3D structure of the Cyc B peptide as the starting model are evidence of its relative conformational rigidity: the average cRMSD calculated between the structures of the MD trajectory and the starting point amounts to 3.1 Å, and the system of intramolecular H‐bonds serving as one of the factors stabilizing the molecule structure undergoes no drastic changes within the whole MD trajectory. Nonetheless, comprehensive study of the dynamic structures of the Cyc B peptide indicates that individual structures differ significantly from the input structure, which does not exclude the possibility of wide‐ranging structural reorganizations of this molecule stimulated by abrupt alterations of the environment that, for example, may happen upon its entry into numerous intermolecular interactions. The conformity of the 3D structure of the Cyc B peptide with that of the same immunophilin segment (Figure 2) and its relative structural inflexibility following from the molecular dynamics computations give grounds to believe that the molecule designed here may not only conserve the capacity for specific interaction with V3 characteristic of the native protein [13] but also intensify the blockade of this cryptic site of gp120. Indeed, analysis of the potential energy function describing the structural complex of the Cyc B peptide with V3 shows (Figure 4) that, as compared to the native protein, the peptide originating from its framework exhibits a much more extensive network of contacts with the V3 segments embracing the biologically significant amino acids of gp120.
Thus, the energy of intermolecular interactions in the complex of Cyc B with V3 amounts to ‐75 kcal/mol, whereas for the complex of the Cyc B peptide with V3 its value falls to ‐350 kcal/mol. At the same time, stabilization of the complex between the Cyc B peptide and V3 is reached owing to the multiple donor‐acceptor interactions (see Table 3) as well as to the salt bridge formed by Arg‐13 of V3 and Asp‐18 of the immunophilin‐derived peptide. When analyzing the system of H‐bonds given in Table 3, one should note that, from the side of V3, contribution to its formation belongs to such biologically meaningful residues of gp120 as Lys‐10, Arg‐13, Gly‐17, Gln‐18, Asp‐25, Asp‐29, Ile‐30, and Arg‐31, which become shielded as a result of forming the overmolecular ensemble. Apart from the residues of this list whose functional role has been discussed here, one should note Asp‐25, which takes an active part in binding of the virus to the cell membrane surface [54‐58] and, along with Ser‐11, accounts for the HIV‐1 phenotype [59, 60]. Formation of the complex of the Cyc B peptide with V3 also entails the masking of its functionally critical amino acids Ser‐11, Ala‐19, Ile‐23, Gly‐24, and Gln‐32, which are also utilized by the virus to set up the cell tropism determinant [54‐58]. The active center of V3 responsible for binding to the Cyc B peptide contains Asn‐6, the blockade of which may be highly effective for virus inactivation: as mentioned above, this amino acid of V3, an integral part of its structurally invariant segment 3‐7 [32], gives rise to one of the conserved sites of N‐linked glycosylation of gp120 [61]. When examining the overmolecular ensemble represented in Figure 4, one should note the following feature: in this complex, segment Gly‐1‐Pro‐2‐Lys‐3 of the Cyc B peptide interacts with structurally rigid stretch 28‐32 of the HIV‐1 V3 domain [32], whereas the corresponding site of the native protein, as stated before, contacts the immunogenic crown of gp120 (Figure 1). At the same time, this V3 region, which proves to be unused by the N‐terminal site of the Cyc B peptide, comes into close contact with hexapeptide Ile‐14‐Gly‐15‐Asp‐16‐Glu‐17‐Asp‐18 belonging to the "oval isthmus" of its β‐sheet. These observations are of special interest since the indicated segments of the receptor and ligand reside in the β‐turns of the polypeptide chain, which may serve as docking sites for protein‐protein interactions [62, 63, 64]. The presence of β‐turns in the 3D structures of V3 and the Cyc B peptide is likely to be one of the main factors that may make a determinative contribution to the specificity of their efficacious interactions.
Figure 4. Overmolecular structure of the HIV‐1 SA‐V3 loop (balls) with the Cyc B peptide (tubes).
Table 3. Geometric parameters of intermolecular H‐bonds for the structural complex of the HIV‐1 SA‐V3 loop with the Cyc B peptide.
Comparing the 3D structures of V3 and the Cyc B peptide in the free and bound states indicates that, in either case, forming the complex brings about certain structural rearrangements taking place both in the Cartesian and angular spaces. At the same time, the Cyc B peptide experiences the more profound transformation of its structure: when the V3 structures in the overmolecular ensemble and in the unbound state are compared, the values of cRMSD and aRMSD are respectively 2.0 Å and 47°, whereas in the case of the Cyc B peptide the corresponding values rise to 4.0 Å and 57°. This observation is in line with the supposition above, whereby the Cyc B peptide, in spite of its relative conformational rigidity, may exhibit higher flexibility of the polypeptide chain under drastic changes of the environment. In studies [30, 31], we implemented the computer‐aided design of two molecules referred to as Cyc A and FKBP peptides and, having analyzed their structural complexes with the HIV‐1 SA‐V3 loop [32], found that the Cyc A peptide binds effectively to its immunogenic crown, whereas the FKBP peptide prefers to interact with the N‐ and C‐terminal segments of the virus principal neutralizing determinant.
The findings obtained here indicate that, unlike the molecules constructed previously [30, 31], the Cyc B peptide is able to mask the functionally crucial amino acids of both the V3 central part and its stem stretches. Moreover, compared with these molecules, the association of the Cyc B peptide with V3 yields a more stable supramolecular structure: for instance, the energy of intermolecular interactions computed for the structural complex of V3 with the Cyc A peptide [31] is -87 kcal/mol, whereas in the present case it amounts to -350 kcal/mol (see above). As shown in study [32], despite the hypervariability of V3, its segments 3-7, 15-20, and 28-32, which contain the highly conserved amino acids of gp120, form closely related spatial backbone folds in different virus isolates and may therefore be considered promising targets for anti-AIDS drug studies. Taking these data together with the evidence that the Cyc B peptide is capable of masking the functionally critical V3 residues located in its structurally invariant segments, one may expect that a synthetic copy of this virtual molecule (or its structural analogs) could be active against various HIV-1 strains and exhibit a broadly neutralizing effect. The peptide constructed here must, of course, undergo extensive experimental testing before it can be considered a candidate for the role of a "magic bullet" providing a wide-range blockade of the HIV-1 envelope glycoprotein gp120.
In conclusion, the model of the structural complex of Cyc B with V3 proposed above supports the literature data on the high affinity of immunophilins for V3 [13], and the results of its analysis justify an optimistic prognosis for the use of immunophilin-derived peptides as starting compounds for the design of efficacious antiviral agents by protein engineering methods.
D. ROI achieved or expected (1000 characters max.):
The project was carried out by two scientists with financial support from the
SKIF-GRID scientific program (grant no. 4U-S/07-111) and the Belarusian Foundation for
Basic Research (project X08-003). Because the study was primarily aimed at the computer-aided
design of a new chemical compound that could serve as a basic structure for developing
anti-AIDS drugs, it required considerable computational resources. Most of the cost went to
supercomputer time and to the use of the national Grid environment: the computations took
approximately 3000 CPU hours in total. The overall investment in the project from the two
grants mentioned above comes to 7000 US dollars. Since the investigations carried out in the
project are theoretical in nature, a return on investment can be expected once the findings
are used in applied anti-AIDS drug projects.
E. CONCLUSIONS/implications for the field (800 characters max.)
The supramolecular structure of Cyc B with V3 was built by computer modeling tools,
and an immunophilin-derived peptide able to mask effectively the structurally invariant V3
segments containing the functionally crucial amino acids of the HIV-1 gp120 envelope
protein was constructed and analyzed. Based on a joint analysis of these results and the
literature data, the generated peptide is suggested to offer a promising basic structure
for protein engineering projects aimed at developing anti-AIDS drugs able to stop the
spread of HIV.
References:
1. Gallo, R. C. and Montagnier, L. (2003) The discovery of HIV as the cause of AIDS. N. Engl. J. Med., 349, 2283-2285.
2. Wyatt, R. and Sodroski, J. (1998) The HIV-1 envelope glycoproteins: fusogens, antigens, and immunogens. Science, 280, 1884-1888.
3. Landau, N. R., Warton, M., and Littman, D. R. (1988) The envelope glycoprotein of the human immunodeficiency virus binds to the immunoglobulin-like domain of CD4. Nature, 334, 159-162.
4. Feng, Y., Broder, C. C., Kennedy, P. E., Berger, E. A. (1996) HIV-1 entry cofactor: functional cDNA cloning of a seven-transmembrane, G protein-coupled receptor. Science, 272, 872-877.
5. Deng, H., Liu, R., Ellmeier, W., Choe, S., Unutmaz, D., Burkhart, M., Di Marzio, P., Marmon, S., Sutton, R. E., Hill, C. M., Davis, C. B., Peiper, S. C., Schall, T. J., Littman, D. R., Landau, N. R. (1996) Nature, 381, 661-666.
6. Sirois, S., Sing, T., and Chou, K. C. (2005) HIV-1 gp120 V3 loop for structure-based drug design. Curr. Protein Pept. Sci., 6, 413-422.
7. Hartley, O., Klasse, P. J., Sattentau, Q. J., Moore, J. P. (2005) V3: HIV's switch-hitter. AIDS Res. Hum. Retroviruses, 21, 171-189.
8. Hwang, S. S., Boyle, T. J., Lyerly, H. K., Cullen, B. R. (1991) Identification of the envelope V3 loop as the primary determinant of cell tropism in HIV-1. Science, 253, 71-74.
9. Choe, H., Farzan, M., Sun, Y., Sullivan, N., Rollins, B., Ponath, P. D., Wu, L., Mackay, C. R., LaRosa, G., Newman, W., Gerard, N., Gerard, C., Sodroski, J. (1996) The beta-chemokine receptors CCR3 and CCR5 facilitate infection by primary HIV-1 isolates. Cell, 85, 1135-1148.
10. Cocchi, F., DeVico, A., Garzino-Demo, A., Cara, A., Gallo, R. C., Lusso, P. (1996) The V3 domain of the HIV-1 gp120 envelope glycoprotein is critical for chemokine-mediated blockade of infection. Nat. Med., 2, 1244-1247.
11. Connor, R. I., Sheridan, K. E., Ceradini, D., Choe, S., Landau, N. R. (1997) Change in coreceptor use correlates with disease progression in HIV-1-infected individuals. J. Exp. Med., 185, 621-628.
12. Scarlatti, G., Tresoldi, E., Bjorndal, A., Fredriksson, R., Colognesi, C., Deng, H. K., Malnati, M. S., Plebani, A., Siccardi, A. G., Littman, D. R., Fenyo, E. M., Lusso, P. (1997) In vivo evolution of HIV-1 co-receptor usage and sensitivity to chemokine-mediated suppression. Nat. Med., 3, 1259-1265.
13. Endrich, M. M. and Gehring, H. (1998) The V3 loop of human immunodeficiency virus type-1 envelope protein is a high-affinity ligand for immunophilins present in human blood. Eur. J. Biochem., 252, 441-446.
14. Galat, A. and Metcalfe, S. M. (1995) Peptidylproline cis/trans isomerases. Prog. Biophys. Mol. Biol., 63, 67-118.
15. Barik, S. (2006) Immunophilins: for the love of proteins. Cell. Mol. Life Sci., 63, 2889-2900.
16. Baker, E. K., Colley, N. J., and Zuker, C. S. (1994) The cyclophilin homolog NinaA functions as a chaperone, forming a stable complex in vivo with its protein target rhodopsin. EMBO J., 13, 4886-4895.
17. Ferreira, P. A., Nakayama, T. A., and Travis, G. H. (1997) Interconversion of red opsin isoforms by the cyclophilin-related chaperone protein Ran-binding protein 2. Proc. Natl. Acad. Sci. USA, 94, 1556-1561.
18. Hacker, J. and Fischer, J. (1993) Immunophilins: structure-function relationship and possible role in microbial pathogenicity. Mol. Microbiol., 10, 445-456.
19. Moro, A., Ruiz-Cabello, F., Fernandez-Cano, A., Stock, R. P., Gonzalez, A. (1995) Secretion by Trypanosoma cruzi of a peptidyl-prolyl cis-trans isomerase involved in cell infection. EMBO J., 14, 2483-2490.
20. Spik, G., Haendler, B., Delmas, O., Mariller, C., Chamoux, M., Maes, P., Tartar, A., Montreuil, J., Stedman, K., Kocher, H. P., Keller, R., Hiestand, P. C., Movva, N. R. (1991) A novel secreted cyclophilin-like protein (SCYLP). J. Biol. Chem., 266, 10735-10738.
21. Allain, F., Boutillon, C., Mariller, C., Spik, G. (1995) Selective assay for CyPA and CyPB in human blood using highly specific anti-peptide antibodies. J. Immunol. Methods, 178, 113-120.
22. Endrich, M. M., Grossenbacher, D., Geistlich, A., Gehring, H. (1996) Apoptosis-induced concomitant release of cytosolic proteins and factors which prevent cell death. Biol. Cell, 88, 15-22.
23. Sherry, B., Yarlett, N., Strupp, A., Cerami, A. (1992) Identification of cyclophilin as a proinflammatory secretory product of lipopolysaccharide-activated macrophages. Proc. Natl. Acad. Sci. USA, 89, 3511-3515.
24. Xu, Q., Leiva, M. C., Fischkoff, S. A., Handschumacher, R. E., Lyttle, C. R. (1992) Leukocyte chemotactic activity of cyclophilin. J. Biol. Chem., 267, 11968-11971.
25. Bang, M., Muller, W., Hans, M., Brune, K., Swandulla, D. (1995) Activation of Ca2+ signaling in neutrophils by the mast cell-released immunophilin FKBP12. Proc. Natl. Acad. Sci. USA, 92, 3435-3438.
26. Saphire, A. C. S., Bobardt, M. D., and Gallay, P. A. (1999) Host cyclophilin A mediates HIV-1 attachment to target cells via heparans. EMBO J., 18, 6771-6785.
27. Franke, E. K., Yuan, H. E. H., and Luban, J. (1994) Specific incorporation of cyclophilin A into HIV-1 virions. Nature, 372, 359-362.
28. Thali, M., Bukovsky, A., Kondo, E., Rosenwirth, B., Walsh, C. T., Sodroski, J., Gottlinger, H. G. (1994) Functional association of cyclophilin A with HIV-1 virions. Nature, 372, 363-365.
29. Colgan, J., Yuan, H. E. H., Franke, E. K., Luban, J. (1996) Binding of the human immunodeficiency virus type 1 Gag polyprotein to cyclophilin A is mediated by the central region of capsid and requires Gag dimerization. J. Virol., 70, 4299-4310.
30. Andrianov, A. M. (2008) Computational anti-AIDS drug design based on the analysis of the specific interactions between immunophilins and the HIV-1 gp120 V3 loop. Application to the FK506-binding protein. J. Biomol. Struct. Dynam., 26, 49-56.
31. Andrianov, A. M. (2009) Immunophilins and HIV-1 V3 loop for structure-based anti-AIDS drug design. J. Biomol. Struct. Dynam., 26, 445-454.
32. Andrianov, A. M. and Anishchenko, I. V. (2009) Computational model of the HIV-1 subtype A V3 loop: study on the conformational mobility for structure-based anti-AIDS drug design. J. Biomol. Struct. Dynam., 27, 179-194.
33. Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., Tasumi, M. (1977) The Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol., 112, 535-542.
34. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., Bourne, P. E. (2000) The Protein Data Bank. Nucleic Acids Research, 28, 235-242.
35. Mustard, D. and Ritchie, D. W. (2005) Macromolecular docking using spherical polar Fourier correlations. Proteins: Struct. Funct. Bioinf., 60, 269-274.
36. Berendsen, H. J. C., van der Spoel, D., and van Drunen, R. (1995) GROMACS: A message-passing parallel molecular dynamics implementation. Comp. Phys. Commun., 91, 43-56.
37. Mikol, V., Kallen, J., and Walkinshaw, M. D. (1994) X-ray structure of a Cyclophilin B/Cyclosporin complex: comparison with Cyclophilin A and delineation of its calcineurin-binding domain. Proc. Natl. Acad. Sci. USA, 91, 5183-5186.
38. Ren, P. and Ponder, J. W. (2003) TINKER: Software tools for molecular design. J. Phys. Chem. B, 107, 5933-5947.
39. Oostenbrink, C., Villa, A., Mark, A. E., van Gunsteren, W. F. (2004) A biomolecular force field based on the free enthalpy of hydration and solvation: the GROMOS force-field parameter sets 53A5 and 53A6. J. Comput. Chem., 25, 1656-1676.
40. Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., Hermans, J. (1981) Interaction models for water in relation to protein hydration. In B. Pullman (Ed.), Intermolecular Forces (pp. 331-342). Dordrecht: D. Reidel Publishing Company.
41. Berendsen, H. J. C., Postma, J. P. M., DiNola, A., Haak, J. R. (1984) Molecular-dynamics with coupling to an external bath. J. Chem. Phys., 81, 3684-3690.
42. Ablameyko, S. V., Abramov, S. M., Anishchanka, U. V., Medvedev, S. V., Paramonov, N. N., Tchij, O. P. (2005) SKIF supercomputer configurations (in Russian). Minsk: United Institute of Informatics Problems.
43. Smith, L. J., Bolin, K. A., Schwalbe, H., MacArthur, M. W., Thornton, J. M., Dobson, C. M. (1996) Analysis of main chain torsion angles in proteins: Prediction of NMR coupling constants for native and random coil conformations. J. Mol. Biol., 255, 494-506.
44. Hutchinson, E. G. and Thornton, J. M. (1996) PROMOTIF - a program to identify and analyze structural motifs in proteins. Protein Sci., 5, 212-220.
45. Sherman, S. A. and Johnson, M. E. (1993) Derivation of locally accurate spatial protein structure from NMR data. Prog. Biophys. Mol. Biol., 59, 285-339.
46. Cormier, E. G. and Dragic, T. (2002) The crown and stem of the V3 loop play distinct roles in human immunodeficiency virus type 1 envelope glycoprotein interactions with CCR5 coreceptor. J. Virol., 76, 8953-8957.
47. LaRosa, G. J., Davide, J. P., Weinhold, K., Waterbury, J. A., Profy, A. T., Lewis, J. A., Langlois, A. J., Dressman, G. R., Boswell, R. N., Shadduk, P., Holley, L. H., Karplus, M., Bolognesi, D. P., Matthews, T. J., Emini, E. A., Putney, S. D. (1990) Conserved sequence and structural elements in the HIV-1 principal neutralizing determinant. Science, 249, 932-935.
48. Ivanoff, L. A., Looney, D. J., McDanal, C., Morris, J. F., Wong-Staal, F., Lang, A. J., Petteway, S. R. Jr., Matthews, T. J. (1991) Alteration of HIV-1 infectivity and neutralization by a single amino acid replacement in the V3 loop domain. AIDS Res. Hum. Retroviruses, 7, 595-603.
49. Minder, D., Boni, J., Schupbach, J., Gehring, H. (2002) Immunophilins and HIV-1 infection. Arch. Virol., 147, 1531-1542.
50. Wang, W.-K., Dudek, T., Zhao, Y.-J., Brumblay, H. G., Essex, M., Lee, T.-H. (1998) CCR5 coreceptor utilization involves a highly conserved arginine residue of HIV type 1 gp120. Proc. Natl. Acad. Sci. USA, 95, 5740-5745.
51. de Parseval, A., Bobardt, M. D., Chatterji, U., Elder, J. H., David, G., Zolla-Pazner, S., Farzan, M., Lee, T. H., Gallay, P. A. (2005) A highly conserved arginine in gp120 governs HIV-1 binding to both syndecans and CCR5 via sulfated motifs. J. Biol. Chem., 280, 39493-39504.
52. Hu, Q., Napier, K. B., Trent, J. O., Wang, Z., Taylor, S., Griffin, G. E., Peiper, S. C., Shattock, R. J. (2005) Restricted variable residues in the C-terminal segment of HIV-1 V3 loop regulate the molecular anatomy of CCR5 utilization. J. Mol. Biol., 350, 699-712.
53. Ghiara, J. B., Stura, E. A., Stanfield, R. L., Profy, A. T., Wilson, I. A. (1994) Crystal structure of the principal neutralization site of HIV-1. Science, 264, 82-85.
54. Chavda, S. C., Griffin, P., Han-Liu, Z., Keys, B., Vekony, M. A., Cann, A. J. (1994) Molecular determinants of the V3 loop of human immunodeficiency virus type 1 glycoprotein gp120 responsible for controlling cell tropism. J. Gen. Virol., 75, 3249-3253.
55. Mammano, F., Salvatori, F., Ometto, L., Panozzo, M., Chieco-Bianchi, L., De Rossi, A. (1995) Relationship between the V3 loop and the phenotypes of human immunodeficiency virus type 1 (HIV-1) isolates from children perinatally infected with HIV-1. J. Virol., 69, 82-92.
56. Milich, L., Margolin, B. H., and Swanstrom, R. (1993) V3 loop of the human immunodeficiency virus type 1 Env protein: interpreting sequence variability. J. Virol., 67, 5623-5634.
57. Shioda, T., Levy, J. A., and Cheng-Mayer, C. (1992) Small amino acid changes in the V3 hypervariable region of gp120 can affect the T-cell-line and macrophage tropism of human immunodeficiency virus type 1. Proc. Natl. Acad. Sci. USA, 89, 9434-9438.
58. Wu, L., Gerard, N. P., Wyatt, R., Choe, H., Parolin, C., Ruffin, N., Borsetti, A., Cardoso, A. A., Desjardin, E., Newman, W., Gerard, C., Sodroski, J. (1996) CD4-induced interaction of primary HIV-1 gp120 glycoproteins with the chemokine receptor CCR-5. Nature, 384, 179-183.
59. Fouchier, R. A. M., Groenink, M., Kootstra, N. A., Tersmette, M., Huisman, H. G., Miedema, F., Schuitemaker, H. (1992) Phenotype-associated sequence variation in the third variable domain of the human immunodeficiency virus type 1 gp120 molecule. J. Virol., 66, 3183-3187.
60. De Jong, J. J., De Ronde, A., Keulen, W., Tersmette, M., Goudsmit, J. (1992) Minimal requirements for the human immunodeficiency virus type 1 V3 domain to support the syncytium-inducing phenotype: analysis by single amino acid substitution. J. Virol., 66, 6777-6780.
61. Ogert, R. A., Lee, M. K., Ross, W., Buckler-White, A., Martin, M. A., Cho, M. W. (2001) N-linked glycosylation sites adjacent to and within the V1/V2 and the V3 loops of dualtropic human immunodeficiency virus type 1 isolate DH12 gp120 affect coreceptor usage and cellular tropism. J. Virol., 75, 5998-6006.
62. Smith, J. A. and Pease, L. J. (1980) Reverse turns in peptides and proteins. CRC Crit. Rev. Biochem., 8, 315-399.
63. Rose, G. D., Gierasch, L. M., and Smith, J. A. (1985) Turns in peptides and proteins. Adv. Prot. Chem., 37, 1-109.
64. Newton, A. C. (2001) Protein kinase C: structural and spatial regulation by phosphorylation, cofactors, and macromolecular interactions. Chem. Rev., 101, 2353-2364.
Table 1.
Geometric parameters of intermolecular H-bonds for the structural complex of the
HIV-1 SA-V3 loop with Cyc B.
Residue     Group     Residue      Group        Distance (Å)      Distance (Å)
(donor)     (donor)   (acceptor)   (acceptor)   Donor…Acceptor    Hydrogen…Acceptor
Lys-93²     NH        Gln-18¹      OE1          2.7               1.7
Lys-93²     NZ        Thr-23¹      OG1          2.8               1.8
Ser-11¹     OG        Glu-178²     OE2          2.7               1.7
Gln-18¹     NH        Lys-91²      CO           2.8               1.9
Gln-18¹     NE2       Glu-178²     OE2          2.8               1.9
Ala-19¹     NH        Lys-91²      CO           3.0               2.0
Footnote: Superscripts 1 and 2 denote the amino acids of V3 and Cyc B respectively.
Table 2. Dihedral angles for amino acids in the 3D structure of the Cyc B peptide.
Residue   Dihedral angles (deg.)
          φ         ψ         χ1        χ2        χ3
Gly-1     —         -109.7    —         —         —
Pro-2     -52.2     178.6     -18.1     31.0
Lys-3     -135.3    170.1     -64.6     -176.2    -69.6
Val-4     -84.6     141.8     176.6     —         —
Thr-5     -135.9    -136.8    69.1      —         —
Val-6     -60.6     123.4     -62.0     —         —
Lys-7     -93.7     80.2      -173.1    59.0      168.0
Val-8     -112.5    160.8     -153.8    —         —
Tyr-9     -78.7     170.2     78.2      -82.8     —
Phe-10    -144.3    161.7     -168.0    -101.0    —
Asp-11    -85.7     165.1     -141.6    -58.4     —
Leu-12    -147.4    101.0     -154.5    -50.8     —
Arg-13    -129.4    102.0     -45.8     -162.4    70.9
Ile-14    -80.3     88.7      -55.0     -179.9    —
Gly-15    74.1      -72.1     —         —         —
Asp-16    -147.8    5.8       -152.3    -61.9     —
Glu-17    -53.7     120.6     -64.8     -176.4    -30.7
Asp-18    -74.2     42.0      -157.7    -137.1    —
Val-19    -52.9     -31.1     -165.5    —         —
Gly-20    118.2     -175.8    —         —         —
Arg-21    -80.9     55.8      -75.7     157.4     -70.7
Val-22    -53.2     129.0     179.8     —         —
Ile-23    -76.6     60.7      -39.7     -59.3     —
Phe-24    -56.9     -63.3     -66.1     -75.9     —
Gly-25    77.9      171.0     —         —         —
Leu-26    -138.0    159.7     -66.5     95.1      —
Phe-27    -97.0     -5.3      -51.8     -83.5     —
Gly-28    61.7      -130.8    —         —         —
Lys-29    -133.8    143.2     176.9     63.7      169.4
Thr-30    -122.6    —         -62.2     —         —
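As an illustration of how the backbone angles of Table 2 might be inspected programmatically, the short Python sketch below transcribes the φ and ψ columns and lists the residues with a positive φ angle. The variable and function names are hypothetical, and the choice of "φ > 0" as a quantity of interest is an assumption made for this example, not a criterion used in the study.

```python
# Backbone dihedrals (phi, psi) in degrees, transcribed from Table 2.
# None marks the terminal angles that the table shows as a dash.
BACKBONE = [
    ("Gly-1", None, -109.7), ("Pro-2", -52.2, 178.6), ("Lys-3", -135.3, 170.1),
    ("Val-4", -84.6, 141.8), ("Thr-5", -135.9, -136.8), ("Val-6", -60.6, 123.4),
    ("Lys-7", -93.7, 80.2), ("Val-8", -112.5, 160.8), ("Tyr-9", -78.7, 170.2),
    ("Phe-10", -144.3, 161.7), ("Asp-11", -85.7, 165.1), ("Leu-12", -147.4, 101.0),
    ("Arg-13", -129.4, 102.0), ("Ile-14", -80.3, 88.7), ("Gly-15", 74.1, -72.1),
    ("Asp-16", -147.8, 5.8), ("Glu-17", -53.7, 120.6), ("Asp-18", -74.2, 42.0),
    ("Val-19", -52.9, -31.1), ("Gly-20", 118.2, -175.8), ("Arg-21", -80.9, 55.8),
    ("Val-22", -53.2, 129.0), ("Ile-23", -76.6, 60.7), ("Phe-24", -56.9, -63.3),
    ("Gly-25", 77.9, 171.0), ("Leu-26", -138.0, 159.7), ("Phe-27", -97.0, -5.3),
    ("Gly-28", 61.7, -130.8), ("Lys-29", -133.8, 143.2), ("Thr-30", -122.6, None),
]


def positive_phi(rows):
    """Return the labels of residues whose phi angle is defined and positive."""
    return [name for name, phi, _psi in rows if phi is not None and phi > 0]


print(positive_phi(BACKBONE))
```

For the values in Table 2, the only residues returned are glycines (Gly-15, Gly-20, Gly-25, and Gly-28).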
Table 3. Geometric parameters of intermolecular H-bonds for the structural complex of the
HIV-1 SA-V3 loop with the Cyc B peptide.
Residue     Group     Residue      Group        Distance (Å)      Distance (Å)
(donor)     (donor)   (acceptor)   (acceptor)   Donor…Acceptor    Hydrogen…Acceptor
Arg-13¹     NH1       Asp-18²      OD1          3.0               2.0
Arg-13¹     NH2       Asp-18²      OD1          3.2               2.3
Arg-13¹     NH2       Asp-18²      OD2          3.1               2.2
Gly-17¹     NH        Asp-11²      OD1          2.9               1.9
Gln-18¹     NE2       Tyr-9²       CO           2.8               1.8
Gln-18¹     NE2       Asp-11²      OD1          2.8               1.8
Asp-25¹     NH        Gly-28²      CO           2.8               1.8
Lys-3²      NZ        Ile-30¹      CO           2.8               1.9
Thr-5²      OG1       Arg-31¹      CO           2.7               1.7
Lys-7²      NH        Lys-10¹      CO           2.8               1.8
Lys-29²     NZ        Asp-29¹      CO           2.9               2.0
Footnote: Superscripts 1 and 2 denote the amino acids of V3 and Cyc B peptide respectively.
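The rows of Table 3 can be summarized with a few lines of Python. The sketch below is only an illustration: the tuple layout, the labels "V3" and "pep" (standing for superscripts 1 and 2), and the function name are assumptions introduced here, while the numerical values are transcribed from the table.

```python
from collections import Counter

# (donor residue, donor molecule, donor group, acceptor residue, acceptor group,
#  donor...acceptor distance in Å, hydrogen...acceptor distance in Å)
HBONDS = [
    ("Arg-13", "V3", "NH1", "Asp-18", "OD1", 3.0, 2.0),
    ("Arg-13", "V3", "NH2", "Asp-18", "OD1", 3.2, 2.3),
    ("Arg-13", "V3", "NH2", "Asp-18", "OD2", 3.1, 2.2),
    ("Gly-17", "V3", "NH", "Asp-11", "OD1", 2.9, 1.9),
    ("Gln-18", "V3", "NE2", "Tyr-9", "CO", 2.8, 1.8),
    ("Gln-18", "V3", "NE2", "Asp-11", "OD1", 2.8, 1.8),
    ("Asp-25", "V3", "NH", "Gly-28", "CO", 2.8, 1.8),
    ("Lys-3", "pep", "NZ", "Ile-30", "CO", 2.8, 1.9),
    ("Thr-5", "pep", "OG1", "Arg-31", "CO", 2.7, 1.7),
    ("Lys-7", "pep", "NH", "Lys-10", "CO", 2.8, 1.8),
    ("Lys-29", "pep", "NZ", "Asp-29", "CO", 2.9, 2.0),
]


def summarize(rows):
    """Count donors per molecule and report simple distance statistics."""
    donors = Counter(row[1] for row in rows)
    mean_da = sum(row[5] for row in rows) / len(rows)   # mean donor...acceptor
    max_ha = max(row[6] for row in rows)                # longest hydrogen...acceptor
    return donors, mean_da, max_ha


donors, mean_da, max_ha = summarize(HBONDS)
print(dict(donors), round(mean_da, 2), max_ha)
```

In this tally, seven of the eleven H-bonds have a V3 residue as the donor and four have a Cyc B peptide residue as the donor.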
Figure 1. Image of the structural complex between the HIV-1 SA-V3 loop (tubes) and Cyc B
(balls).
Figure 2. 3D structure of the Cyc B peptide superposed with the X-ray conformation for
segment 1-30 of the entire protein.
Figure 3. Three-dimensional (a) and secondary (b) structures of the Cyc B peptide generated
based on the X-ray conformation of site 1-30 for the intact Cyc B.
Figure 4. Supramolecular structure of the HIV-1 SA-V3 loop (balls) with the Cyc B peptide
(tubes).