Toxicological endpoints covered by in silico methods: myths and reality

Istituto di Ricerche Farmacologiche Mario Negri, Milano
E. BENFENATI
The Nobel Prize in Chemistry 2013
The Nobel Prize in Chemistry 2013 has gone to Michael Levitt, Martin Karplus
and Arieh Warshel, who “took the chemical experiments into cyberspace”
[Figure: an effect expressed as a function of chemical structure, effect = f(structure)]
HUMAN EXPERTS have identified LINKS between STRUCTURE and TOXICITY
ASHBY identified a list of RESIDUES for GENOTOXIC EFFECT
There are simple REGRESSIONS for series of CONGENERS
[Plot: toxicity versus molecular weight (MW)]
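As an illustration of such a simple regression, the sketch below fits a straight line of toxicity against MW by ordinary least squares. The series and all numbers are invented for illustration; the function names are hypothetical and not part of any tool mentioned here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical congeneric series: molecular weight vs. a toxicity score.
# These values are made up; they are not experimental data.
mw = [100.0, 114.0, 128.0, 142.0, 156.0]
toxicity = [1.0, 1.4, 1.8, 2.2, 2.6]

slope, intercept = fit_line(mw, toxicity)


def predict(molecular_weight):
    """Predict the toxicity score of a new congener from its MW."""
    return slope * molecular_weight + intercept
```

A one-descriptor line like this is only meaningful inside the congeneric series it was fitted on; outside that series the prediction is an extrapolation.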
DATA and RULES
A) From Data to Rules (by the human experts)
Rules codify previous knowledge
B) Direct Use of Data
Starting from tox data + structures.
Will be more and more important: ToxCast
IMPLICIT OR EXPLICIT KNOWLEDGE
QSAR flowchart
Activity data
• Garbage in, garbage out
• What is the precision?
• Quality and quantity of data
– Suitable for purposes?
– Intrinsic variability of Y data (particularly for QSAR)
– Chemical domain covered with experimental data
◦ As much as you can, especially if using complex models
The procedure to CALCULATE DESCRIPTORS
2D descriptors (no optimization required)
3D descriptors (optimization required)
[Figure: structural formula of an example molecule]
The procedure adopted to calculate the 2D DESCRIPTORS may vary with the input file format that each software package requires
The 3D DESCRIPTORS are also affected by the geometry optimization procedure
MOLECULAR DESCRIPTORS
Many DESCRIPTOR FAMILIES:
• Constitutional / information descriptors: molecular weight, number of chemical elements, number of H-bonds or double bonds, ...
• Physicochemical descriptors: lipophilicity, polarizability, ...
• Topological descriptors: atomic branching and ramification
• Electronic, geometrical and quantum-chemical descriptors
• Fragmental / structural keys defining Boolean (bitmap) arrays
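A minimal sketch of the simplest family, constitutional descriptors, computed from a molecular formula alone. The function names and the choice of descriptors are illustrative assumptions; real descriptor packages work from the full structure and cover far more elements.

```python
import re

# Rounded standard atomic weights for a few common elements only.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
               "S": 32.06, "Cl": 35.45, "Br": 79.904}


def atom_counts(formula):
    """Count atoms per element in a simple formula such as 'C3H2Cl2O'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def constitutional_descriptors(formula):
    """Return a few constitutional descriptors (hypothetical selection)."""
    counts = atom_counts(formula)
    return {
        "molecular_weight": sum(ATOMIC_MASS[s] * n for s, n in counts.items()),
        "n_atoms": sum(counts.values()),
        "n_elements": len(counts),
        "n_halogens": counts.get("Cl", 0) + counts.get("Br", 0),
    }


# C3H2Cl2O is the formula corresponding to the SMILES O=CC=C(Cl)Cl.
descriptors = constitutional_descriptors("C3H2Cl2O")
```

No geometry is involved here, which is why descriptors of this kind need no optimization step.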
ALGORITHMS
• Discriminant Analysis
• CART
• KNN
• Fuzzy logic
• Bayesian
• Self Organizing Map (SOM)
• Support Vector Machine (SVM)
[Figures: regression, f(x) plotted against x; classification in the x1-x2 plane]
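To make one of the listed classifiers concrete, here is a small k-nearest-neighbours (KNN) sketch on two descriptors x1 and x2. The training points, labels and the value of k are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `training` is a list of (descriptor_vector, label) pairs.
    """
    neighbours = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set in the (x1, x2) descriptor plane.
training = [
    ((0.10, 0.20), "inactive"),
    ((0.20, 0.10), "inactive"),
    ((0.15, 0.30), "inactive"),
    ((0.80, 0.90), "active"),
    ((0.90, 0.80), "active"),
    ((0.85, 0.70), "active"),
]

label_near_actives = knn_predict(training, (0.80, 0.80))
label_near_inactives = knn_predict(training, (0.20, 0.20))
```

KNN keeps the training compounds visible behind each prediction, which is the kind of supporting example a QSAR report can draw on.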
REACH promotes Innovation (Article 1)
All Info should be used
QSAR is mentioned
According to the REACH regulation (Annex XI), a (Q)SAR is VALID if:
• the model is recognized as scientifically valid;
• the substance is included in the applicability domain of the model;
• results are adequate for classification and labelling and for risk assessment;
• adequate documentation of the method is provided.
Seven Reasons to use QSAR
1. Innovation (also in view of thousands of new data - ToxCast)
2. Time for experiments
3. Occurrence of enough laboratories/resources
4. Reduction of costs
5. Use of animals
6. Prioritization needs
7. Pro-active approach for greener chemicals
From ANTARES and CALEIDOS to VEGA
Identification of the BEST MODELS
Characterization of the AD
Integration of DIFFERENT MODELS
Implementation into a UNIQUE PLATFORM
Integration with READ ACROSS
Specific ANSWERS to the four REACH Requirements
Virtual models for evaluating the properties of chemicals within a global architecture
SOME EXAMPLES
CAESAR
SARpy
ToxTree
SOME EXAMPLES
MUTAGENS
READ ACROSS and QSAR
Read-across is a method of filling in data gaps for a substance by using surrogate data from another substance.
• A case-by-case basis
• Reliability supported by specific explanation
• Supporting data needed (generic and/or substance-specific)
• Subjective expert assessment
(Q)SARs are computer-based models designed to predict properties from knowledge of the chemical structure.
• Pre-built model
• Reliability supported by the applicability domain
• Supporting examples from training sets
• Objective output (though it requires an evaluation by the expert)
In common: the data come from experience
Read-across: real data from other chemicals
QSAR: real data combined into a complex architecture
Chemical Categories
• A group of chemicals that have some features in common
– Structurally similar, e.g. common substructure
– Property, e.g. similar physicochemical, topological, geometrical, or surface properties
– Behaviour, e.g. (eco)toxicological response underpinned by a common MOA
– Functionality, e.g. preservatives, flavourings, detergents, fragrances
Shortcomings
Read across requires experts in chemistry, biology, environmental sciences
– Expert reasoning is rare and expensive
– Expert reasoning is subjective
– Experts may give different weights to a feature
– Experts may be aware of and use different sets of rules
– Experts may over-rely on past experience and miss new evidence
– Expert reasoning is irreproducible
ToxRead: conceptual framework
3 aims
– exploration of different conditions
– reproducible, objective
– easily tailored by the expert
• read across at the intersection between chemicals and ontologies
= ToxRead takes into account experimental evidence and (available) theory on toxicity
Similarity
• Chemical similarity: similarity of chemical compounds with respect to either structural or functional qualities, i.e. the effect that the chemical compound has.
• Similarity measure: a real-valued function that quantifies the similarity between two objects. Usually similarity measures are the inverse of distance metrics.
• No universal definition of similarity
• Similarity is reflexive, commutative, non-transitive
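These three properties can be checked on a concrete measure. The sketch below uses the Tanimoto coefficient on sets of structural features, which is one common choice and not necessarily the measure used by any tool in this talk; the feature sets and the 0.3 threshold are invented.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of features."""
    union = a | b
    if not union:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(union)


def distance(a, b):
    """A distance derived from the similarity (here 1 - similarity)."""
    return 1.0 - tanimoto(a, b)


# Hypothetical feature sets (e.g. indices of fragments present).
mol_a = {1, 2, 3, 4}
mol_b = {3, 4, 5, 6}
mol_c = {5, 6, 7, 8}

THRESHOLD = 0.3  # arbitrary cut-off for calling two molecules "similar"

a_similar_b = tanimoto(mol_a, mol_b) >= THRESHOLD   # A resembles B
b_similar_c = tanimoto(mol_b, mol_c) >= THRESHOLD   # B resembles C
a_similar_c = tanimoto(mol_a, mol_c) >= THRESHOLD   # yet A does not resemble C
```

The last three lines show why similarity cannot be chained: A is similar to B and B to C, yet A and C share no feature at all.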
ToxRead similarity computation
Similarity is calculated with an index resulting from the weighted combination of
• a fingerprint
• three structural keys based on molecular descriptors
– built with constitutional descriptors (number of atoms, number of certain bonds, etc.), descriptors focused on hetero-atoms, and descriptors for specific functional groups (such as nitro groups, sulfonic groups, etc.).
• They can account for the number of some features or functional groups and not only their presence/absence.
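A rough sketch of what a weighted combination of this kind could look like. It is not the ToxRead implementation: the min/max (Ruzicka) similarity, the weights, the vector layouts and every number below are assumptions chosen for illustration. The min/max form is used because it handles counts and reduces to the Tanimoto coefficient on 0/1 fingerprints.

```python
def count_similarity(a, b):
    """Min/max (Ruzicka) similarity between two equal-length count vectors."""
    total_max = sum(max(x, y) for x, y in zip(a, b))
    if total_max == 0:
        return 1.0  # neither molecule has any of these features
    return sum(min(x, y) for x, y in zip(a, b)) / total_max


def combined_similarity(mol_a, mol_b, weights):
    """Weighted mean of the per-component similarities."""
    total_weight = sum(weights.values())
    weighted = sum(w * count_similarity(mol_a[name], mol_b[name])
                   for name, w in weights.items())
    return weighted / total_weight


# Hypothetical weights for the fingerprint and the three structural keys.
WEIGHTS = {"fingerprint": 0.4, "constitutional": 0.2,
           "heteroatoms": 0.2, "functional_groups": 0.2}

# Hypothetical molecules: a bit fingerprint plus three count-based keys.
mol_a = {"fingerprint": [1, 1, 0, 1, 0, 0, 1, 0],
         "constitutional": [8, 2],       # e.g. atoms, double bonds
         "heteroatoms": [0, 1, 2],       # e.g. N, O, halogens
         "functional_groups": [0, 0]}    # e.g. nitro, sulfonic groups
mol_b = {"fingerprint": [1, 1, 0, 1, 0, 0, 0, 0],
         "constitutional": [7, 2],
         "heteroatoms": [0, 1, 1],
         "functional_groups": [0, 0]}

similarity_ab = combined_similarity(mol_a, mol_b, WEIGHTS)
```

Changing the weights shifts the emphasis between overall structure and specific functional groups.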
ToxRead user interface: a graph with pop-up windows
• Centered on the target molecule
• Target directly connected to the most similar molecules (in the inner circle)
• The structural alerts are in the second circle
• Paths connect molecules sharing the same structural alert
• Shape: circular nodes are molecules, triangular nodes are structural alerts
• Circle dimension: related to similarity
• Color: red or green, with different saturation indicating active or non-active at different levels
Clicking on nodes shows the structure, the explanation, the list of chemicals, etc.
Rules and libraries
• Different libraries of structural alerts (rules) are available
– for mutagenicity:
1. Benigni Bossa (expert)
2. SARpy (data mining)
3. Developed in CALEIDOS (expert and data mining)
4. Developed in PROSIL (expert and data mining)
• Some are more mechanistic, others more evidence-based
• Both positive and negative effects.
– Negative effects are only evidence-based
New rules
• Besides published rules, about 300 human-based rules for mutagenicity, extracted manually by E. Benfenati starting from the chemical classes, are present as a result of the CALEIDOS project
• About 300 further rules for mutagenicity were obtained automatically through data mining (PROSIL project)
Example: exception rule for the toxicity of aromatic amines
A closer look
Pop-up windows
• Chemical structure
• Rule name and structure
• Rule accuracy
• Rule meaning
• List of molecules (up to 100) where the rule applies
Name: SA7
Description: Epoxides and aziridines (Benigni/Bossa structural alert no. 07)
Experimental accuracy: 0.76
Fisher test p-value: < 10e-6
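The two statistics shown in the pop-up can be sketched as follows. This assumes that "experimental accuracy" is the fraction of compounds matching the rule whose experimental result agrees with it, and that the p-value comes from a one-sided Fisher exact test on the 2x2 table of rule match versus experimental activity. The counts below are invented and do not reproduce the dataset behind SA7.

```python
from math import comb


def rule_accuracy(hits_active, hits_inactive):
    """Fraction of compounds matching the rule that are experimentally active."""
    return hits_active / (hits_active + hits_inactive)


def fisher_one_sided(hits_active, hits_inactive, others_active, others_inactive):
    """One-sided Fisher exact test for enrichment of actives among the hits.

    Returns the hypergeometric probability of seeing at least `hits_active`
    actives among the compounds that match the rule.
    """
    n_hits = hits_active + hits_inactive
    n_active = hits_active + others_active
    n_total = n_hits + others_active + others_inactive
    upper = min(n_hits, n_active)
    tail = sum(comb(n_active, k) * comb(n_total - n_active, n_hits - k)
               for k in range(hits_active, upper + 1))
    return tail / comb(n_total, n_hits)


# Hypothetical counts: 100 compounds match the alert, 900 do not.
accuracy = rule_accuracy(hits_active=76, hits_inactive=24)
p_value = fisher_one_sided(hits_active=76, hits_inactive=24,
                           others_active=400, others_inactive=500)
```

Accuracy alone can mislead when a rule matches only a handful of compounds; the p-value indicates whether the enrichment is distinguishable from chance.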
Example of a clearly mutagenic compound: increase of accuracy compared to the BB rule SA10
Target compound: CAS no. 2648-51-3, O=CC=C(Cl)Cl
Rule M_78: alpha,beta-unsaturated aldehyde with chlorine or bromine in alpha or beta
Experimental accuracy: 1
Similarity to target: 0.878
Rule SA_10: alpha,beta-unsaturated carbonyls (Benigni/Bossa)
Experimental accuracy: 0.49
Indeed, the experimental value is mutagenic
Correctly predicted by VEGA QSAR
Example of a clearly mutagenic compound: increase of accuracy compared to the BB rule SA7 and to SARpy alerts SM22 and SM97
Target compound: O1CC1CCc2ccc(cc2)c3ccccc3
Similarity to target = 1
Rule M_72_b: 2,2,3-trihydro-oxirane
Experimental accuracy: 0.91
Rule SA7: Epoxides and aziridines (Benigni/Bossa)
Experimental accuracy: 0.76
Mutagenicity exercise: target and similar compounds
Rules for non-mutagenicity
Example of conflicting rules: exception to the BB rule SA28bis for mutagenic aromatic mono- and dialkylamines
Target compound: CAS no. 703-83-3, n2c1ccccc1c(NCC)s2
Similarity to target = 1
Rule EX_M_12_8: N-alkyl-2,1-benzothiazol-3-amine
Rule SA28bis
It is predicted non-mutagenic by VEGA QSAR, with a low applicability domain value
Example of conflicting rules: exception to BB rules SA10 and SA12
Target compound: CAS no. 527-61-7, O=C1C=C(C(=O)C(=C1)C)C
Similarity to target = 1
Rule NM_metil_quinone
Rule SA12: Quinones (Benigni/Bossa)
LIFE 11 ENV/IT/295
READ-ACROSS Exercise
About 200 questionnaires
40 participants
Mutagenicity
SOFTWARE SIMPLICITY
[Bar chart: share of high, medium and low ratings (%) for VEGA, ToxRead and OECD TB]
Mutagenicity
AGREEMENT AMONG PARTICIPANTS DEPENDING ON THE TOOL
ToxRead: all answers (7 molecules) are in agreement
QSAR Toolbox: answers for 7 molecules are in disagreement, and only one is in agreement
[Bar chart: agreement and disagreement among participants for ToxRead and the OECD QSAR Toolbox]
Conclusions
• Fundamental to analyze all pieces of information
• Very useful to apply more than one model (not only VEGA and ToxRead)
• Fundamental to compare results
• Very useful to refer to experimental data
ToxRead available at www.toxgate.eu
VEGA available at www.vega-qsar.eu