Grouping 34 Chemicals Based on Mode of Action

TOXICOLOGICAL SCIENCES, 151(2), 2016, 447–461
doi: 10.1093/toxsci/kfw058
Advance Access Publication Date: March 29, 2016
Research Article
Grouping 34 Chemicals Based on Mode of Action Using
Connectivity Mapping
K. Nadira De Abrew,*,1 Raghunandan M. Kainkaryam,* Yuqing K. Shan,*
Gary J. Overmann,* Raja S. Settivari,‡ Xiaohong Wang,* Jun Xu,*
Rachel L. Adams,* Jay P. Tiesman,* Edward W. Carney,‡,† Jorge M. Naciff,* and
George P. Daston*
*Mason Business Center, The Procter & Gamble Company, Cincinnati, Ohio 45040 and ‡Toxicology &
Environmental Research and Consulting, The Dow Chemical Company, Midland, Michigan 48674
†Deceased.
1To whom correspondence should be addressed. Fax: (513) 277-2311. E-mail: [email protected].
ABSTRACT
Connectivity mapping is a method used in the pharmaceutical industry to find connections between small molecules,
disease states, and genes. The concept can be applied to a predictive toxicology paradigm to find connections between
chemicals, adverse events, and genes. In order to assess the applicability of the technique for predictive toxicology
purposes, we performed gene array experiments on 34 different chemicals: bisphenol A, genistein, ethinyl-estradiol,
tamoxifen, clofibrate, dehydroepiandrosterone, troglitazone, diethylhexyl phthalate, flutamide, trenbolone, phenobarbital,
retinoic acid, thyroxine, 1α,25-dihydroxyvitamin D3, clobetasol, farnesol, chenodeoxycholic acid, progesterone, RU486,
ketoconazole, valproic acid, desferrioxamine, amoxicillin, 6-aminonicotinamide, metformin, phenformin, methotrexate,
vinblastine, ANIT (1-naphthyl isothiocyanate), griseofulvin, nicotine, imidacloprid, vorinostat, and 2,3,7,8-tetrachloro-dibenzo-p-dioxin (TCDD) at the 6-, 24-, and 48-hour time points for 3 different concentrations in 4 cell lines: MCF7, Ishikawa,
HepaRG, and HepG2 (GEO super series accession no.: GSE69851). The 34 chemicals were grouped into predefined mode of
action (MOA)–based chemical classes based on current literature. Connectivity mapping was used to find linkages between
each chemical and between chemical classes. Cell line–specific linkages were compared with each other, and to test
whether the method was platform and user independent, a similar analysis was performed against publicly available data.
The study showed that the method can group chemicals based on MOAs and the inter–chemical class comparison alluded
to connections between MOAs that were not predefined. Comparison to the publicly available data showed that the method
is user and platform independent. The results provide an example of an alternate data analysis process for high-content
data, beneficial for predictive toxicology, especially when grouping chemicals for read-across purposes.
Key words: connectivity mapping; toxicogenomics; 21st century tox.
Perturbation of a biological system via an external insult results
in a characteristic gene expression profile. This expression profile is unique to the biological system and insult under consideration. Querying a representative signature of this profile against
other such profiles using pattern-matching methods provides
an opportunity to identify both known and unknown connections between perturbed biological systems. This concept
named “connectivity mapping (CMap)” was first described by
Lamb et al. (2006) as a means to find connections between small
molecules, diseases, and drugs. Following this landmark publication, the CMap concept has repeatedly been shown to be an
effective tool to make such connections (Dudley et al., 2011;
Hieronymus et al., 2006; Jahchan et al., 2013; Li et al., 2015;
Vidovic et al., 2014; Zimmer et al., 2010). It is proposed that the
field of toxicology could leverage this idea successfully used in
the pharmaceutical industry to make connections between
small molecules, disease, and genes and reapply it to a predictive toxicology paradigm where connections between chemicals,
adverse events, and genes are sought.
© The Author 2016. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved.
For Permissions, please e-mail: [email protected]
The 2007 U.S. National Research Council report
“Toxicity Testing in the 21st Century” (TT21C) called for a fundamental change in how we conduct toxicity testing (NRC, 2007). The
report called for a paradigm shift from traditional toxicity testing
methods based on high-dose animal studies to one based on
in vitro methods typically using human cells in a high-throughput
fashion (Stephens et al., 2012). The intent was to shift toxicology
testing from one based on apical outcomes in animals to one based
on mechanistic understanding in humans (NRC, 2007). Since the
publication of TT21C, various terminologies have emerged that
build on the original concept of mode of action (MOA) (Boobis et al.,
2006, 2008; Dellarco and Wiltse, 1998). These include the TT21C
concept of toxicity pathway (NRC, 2007), the OECD-driven concept
of adverse outcome pathway (AOP) (Ankley et al., 2010; OECD, 2012;
Vinken, 2013) and the ALTEX-driven concept of pathway of toxicity
(Hartung, 2010). A common denominator among all of these definitions is the molecular initiation event (MIE), reflecting the desire to
identify an event that can be detected in vitro and used to map a
response to an AOP or MOA with the ultimate goal of predicting
toxicity. In the case of an AOP, an MIE is defined as “the first anchor
of an AOP and refers to the interaction of a chemical with a biological system at the molecular level, such as ligand–receptor interactions or binding to proteins and nucleic acids” (Vinken, 2013). It
is well known that such interactions lead to gene expression
changes (Nuwaysir et al., 1999); quantifying these gene expression
changes immediately following the MIE could therefore be diagnostic of the
MIE at play. Connectivity mapping provides a well-defined systematic process to attempt this task. When the relationship between
the CMap signature and an MIE is known, a gene signature-based
MIE could be searched against other gene expression–based MIEs
stored in a database. Because MIEs are inherently associated with
an MOA/AOP, this provides a means to group chemicals based on
MOA/AOP. The concept provides a practical option for a high-throughput, low-cost, non-animal method in predictive toxicology,
underlined by mechanistic understanding of human-relevant toxicology and in line with TT21C.
The objective of the present study was to evaluate the possibility of using CMap in predictive toxicology to identify connections
between the biological signatures of 34 chemicals tested on 4 different cell lines (34 × 4) and in the process provide specifics on
how CMap may be used as an alternate method to assess high-content data. All chemicals in the predefined MOA-based chemical
classes with at least 2 chemicals grouped together in at least 1 cell
line using this method. Interclass connections were also observed
among some chemical classes. By using 4 cell lines, we were able
to show that certain MOAs are unique to certain tissue types.
Comparison of representative data from our study to the publicly
available CMap database (Lamb et al., 2006) (http://www.broadinstitute.org/cmap/) showed that the current method is user and platform independent. Overall, we believe that CMap provides a practical, actionable, nonanimal method for finding similarities
between chemicals and potential MOAs. The concept has broad applicability, especially in supporting grouping and read across/informing MOA of a new chemical (ECHA, 2015; Wu et al., 2010).
MATERIALS AND METHODS
Chemicals and Reagents
Bisphenol A (99.0%; catalog no. 239658), genistein (99.0%; catalog
no. G6649), ethinyl-estradiol (catalog no. 46263), tamoxifen
(catalog no. T5648), clofibrate (catalog no. C6643),
dehydroepiandrosterone (95%; catalog no. 709549), troglitazone
(catalog no. T2573), diethylhexyl phthalate (catalog no. 36735),
flutamide (catalog no. F9397), trenbolone (catalog no. T3925),
phenobarbital (catalog no. P1636), retinoic acid (catalog no.
R2625), thyroxine (catalog no. T2376), 1α,25-dihydroxyvitamin
D3 (catalog no. D1530), clobetasol (catalog no. C8037), farnesol
(catalog no. F203), chenodeoxycholic acid (catalog no. C9377),
progesterone (catalog no. P0130), RU486 (catalog no. M8046),
ketoconazole (catalog no. K1003), valproic acid (catalog no.
P4543), desferrioxamine (catalog no. D9533), amoxicillin (catalog
no. A8523), 6-aminonicotinamide (catalog no. A68203),
metformin (catalog no. D150959), phenformin (catalog no.
P7045), methotrexate (catalog no. A6770), vinblastine (catalog
no. V1377), ANIT (1-naphthyl isothiocyanate) (catalog no.
N4525), griseofulvin (catalog no. G4753), nicotine (catalog no.
N3876), and imidacloprid (catalog no. 37894) were all purchased
from Sigma-Aldrich (St Louis, Missouri); vorinostat (catalog no.
10009929) was purchased from Cayman Chemicals (Ann Arbor,
Michigan); and 2,3,7,8-tetrachloro-dibenzo-p-dioxin was custom
ordered from Accustandard (New Haven, Connecticut).
Concentration and Time Point Selection
Concentration selection for each chemical treatment was based
on either values used in the study by Lamb et al. (2006) (http://www.broadinstitute.org/cmap/) or other existing literature
(Table 1). Although RNA was isolated at 6-, 24-, and 48-hour
time points, the 6-hour time point was chosen for gene array
experiments in order to obtain a signature most likely representing the direct mechanism (MIE). Other studies have shown
the importance of using early response genes in predictive toxicology (Zhang et al., 2014). This time point was also used by
Lamb et al. (2006) in their original study.
Cell Culture
MCF7 cells
MCF7 (human breast adenocarcinoma) cells were purchased
from American Type Culture Collection (ATCC, Manassas,
Virginia) and grown in phenol red free DMEM (Invitrogen,
Carlsbad, California) supplemented with 10% fetal bovine serum
(FBS) (Invitrogen), 100-U/ml penicillin and 100-μg/ml streptomycin and maintained at 37°C in an atmosphere of 5% CO2.
Ishikawa cells
Ishikawa cells (human endometrial adenocarcinoma) were
grown per method described by Naciff et al. (2010). In brief, cells
were routinely maintained in DMEM/F12 medium (Invitrogen)
supplemented with 10% fetal calf serum (Hyclone, Logan, Utah),
100-U/ml penicillin-G, 100-mg/ml streptomycin, and 0.25-mg/ml
amphotericin B (Invitrogen) and maintained at 37°C in an atmosphere of 5% CO2.
HepaRG cells
HepaRG (human hepatoma) cells were purchased from
Biopredic International (Rennes, France). Differentiated HepaRG
cells were grown following suppliers’ protocol (Biopredic) in
Williams E medium (Invitrogen), supplemented with 2 mM
Glutamax (Invitrogen), 10% ADD670 supplement (Biopredic
International), and maintained at 37°C in an atmosphere of
5% CO2.
TABLE 1. Chemicals, chemical classes, vehicle and doses used in study
No. | Chemical | Chemical Class | Vehicle | Dose (D3, D2, and D1)
1 | Bisphenol A | Estrogens, environmental estrogens | DMSO | 1, 10, 100 μM
2 | Genistein | Estrogens, environmental estrogens | DMSO | 1, 10, 100 μM
3 | Ethinyl-estradiol | Estrogens, environmental estrogens | DMSO | 10 nM, 100 nM, 1 μM
4 | Trenbolone | Androgen | DMSO | 100 nM, 1 μM, 10 μM
5 | Dehydroepiandrosterone | Androgen | DMSO | 1, 10, 100 nM
6 | ANIT (1-naphthyl isothiocyanate) | Liver cholestasis inducers | DMSO | 1, 10, 100 μM
7 | Griseofulvin | Liver cholestasis inducers | DMSO | 100 nM, 1 μM, 10 μM
8 | Nicotine | Nicotinic acetylcholine receptor agonist | DMSO | 1, 10, 100 μM
9 | Imidacloprid | Nicotinic acetylcholine receptor agonist | DMSO | 10 μM, 100 μM, 1 mM
10 | Metformin | Oxidative phosphorylation/mitochondrial inhibitors | H2O | 100 nM, 1 μM, 10 μM
11 | Phenformin | Oxidative phosphorylation/mitochondrial inhibitors | DMSO | 1, 10, 100 μM
12 | Clofibrate | PPAR agonist | DMSO | 10 μM, 100 μM, 1 mM
13 | Diethylhexyl phthalate (DHP) | PPAR agonist | DMSO | 1, 10, 100 μM
14 | Phenobarbital | CAR/PXR agonist | DMSO | 1, 10, 100 μM
15 | Troglitazone | CAR/PXR agonist | DMSO | 1, 10, 100 μM
16 | Farnesol | FXR receptor agonist | DMSO | 1, 10, 100 μM
17 | Chenodeoxycholic acid | FXR receptor agonist | DMSO | 1, 10, 100 μM
18 | Valproic acid | HDAC inhibitors | H2O | 10 μM, 100 μM, 1 mM
19 | Vorinostat | HDAC inhibitors | DMSO | 1, 10, 100 μM
20 | Retinoic acid | RAR agonist | DMSO | 10 nM, 100 nM, 1 μM
21 | Thyroxine | TR agonist | DMSO | 10 nM, 100 nM, 1 μM
22 | 2,3,7,8-Tetrachloro-dibenzo-p-dioxin | AhR agonist | DMSO | 1, 10, 100 nM
23 | 1α,25-Dihydroxyvitamin D3 | Vitamin D agonist | DMSO | 1, 10, 100 nM
24 | Clobetasol | Glucocorticoid receptor agonist | DMSO | 1, 10, 100 μM
25 | Progesterone | Progesterone receptor agonist | DMSO | 1, 10, 100 μM
26 | RU486 | Progesterone receptor antagonist | DMSO | 1, 10, 100 nM
27 | Ketoconazole | Steroid synthesis inhibitor | Methanol | 1, 10, 100 μM
28 | Desferrioxamine | Iron chelator | DMSO | 1, 10, 100 μM
29 | Amoxicillin | Idiosyncratic liver injury | NaOH | 10 μM, 100 μM, 1 mM
30 | 6-Aminonicotinamide | Glycolytic inhibitor | DMSO | 10 μM, 100 μM, 1 mM
31 | Methotrexate | Folate/1-carbon metabolism inhibitor | DMSO | 1, 10, 100 μM
32 | Vinblastine | Microtubule inhibitor | DMSO | 10 nM, 100 nM, 1 μM
33 | Tamoxifen | Antiestrogen | DMSO | 100 nM, 1 μM, 10 μM
34 | Flutamide | Antiandrogen | DMSO | 1, 10, 100 μM
References (listed in extracted order; alignment to table rows was not preserved):
Skandrani et al. (2006)
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
Kang and Lee (2005)
Olsen et al. (2004)
http://www.broadinstitute.org/cmap/
Unpublished data
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
Wang et al. (2007)
Tang et al. (2004)
Hall et al. (2010)
Kido et al. (2003)
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
Chalbos et al. (1991)
http://www.broadinstitute.org/cmap/
An et al. (1998)
Li et al. (2007)
Muniyappa et al. (2009)
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
http://www.broadinstitute.org/cmap/
Vinggaard et al. (1999); http://www.broadinstitute.org/cmap/
Recchia et al. (2006) and Vivacqua et al. (2003)
http://www.broadinstitute.org/cmap/
Rao et al. (2011)
Maggiolini et al. (1999)
Blankvoort et al. (2001)
Thome-Kromer et al. (2003)
Rathinasamy et al. (2010), http://www.broadinstitute.org/cmap/
Abbreviations: AhR, Aryl-hydrocarbon Receptor; CAR, Constitutive Androstane Receptor; DMSO, Dimethyl sulfoxide; FXR, Farnesoid X Receptor; HDAC, Histone deacetylase; PPAR, Peroxisome Proliferator-Activated Receptor; PXR, Pregnane X Receptor; RAR, Retinoic Acid Receptor; TR, Thyroid Hormone Receptor.
a Range finding studies evaluating gene expression.
HepG2 cells
HepG2 (human hepatocellular carcinoma) cells were purchased
from ATCC and grown in Eagle’s Minimum Essential Medium
containing phenol red (ATCC) supplemented with 10% FBS
(ATCC), 100-U/ml penicillin, and 100-μg/ml streptomycin (ATCC)
and maintained at 37°C in an atmosphere of 5% CO2.
Gene Expression Microarray Measurements
MCF7, Ishikawa, and HepG2 cells (5 × 10⁵ cells/ml, 1 ml/well of a 12-well cell culture plate [Corning, Corning, New York]) and HepaRG
cells (per manufacturer’s protocol [Biopredic International] in
collagen-coated 24-well plates [BD, Franklin Lakes, New Jersey])
were treated with 3 concentrations of chemical (Table 1) or vehicle (Table 1) for 6, 24, or 48 hours. Total RNA was isolated
using buffer RLT (Qiagen, Valencia, California) and Agencourt
RNAdvance Tissue XP beads (Beckman Coulter Inc, Danvers,
Massachusetts) according to manufacturer’s protocol (Beckman
Coulter Inc), and RNA integrity was validated using a NanoDrop
8000 Spectrophotometer (Wilmington, Delaware). Labeled cRNA
was synthesized from 500 ng of total RNA using the Affymetrix
(Santa Clara, California) IVT-Express labeling kit according to
manufacturer’s instructions. Seven and a half micrograms of
labeled cRNA was fragmented manually and hybridized to
Affymetrix Human Genome U219-96 arrays for 16 hours,
washed, stained, and scanned using an Affymetrix GeneTitan
(Santa Clara, California). The gene expression studies were performed in triplicate (n = 3) with the cells for each replicate
treated and harvested on separate days.
Gene Expression Microarray Statistical Analysis
The Affymetrix Human Genome U219-96 array used in this
study has more than 49,000 probesets analyzing over 36,000
transcripts and variants, which represent more than 20,000
well-substantiated human genes. The complete gene expression data have been deposited in the National Center for
Biotechnology Information Gene Expression Omnibus (GEO)
(Super series accession no.: GSE69851).
The Affymetrix U219 GeneTitan measurements were preprocessed using the standard Robust Multiarray Averaging (RMA)
method (Gautier et al., 2004; Irizarry et al., 2003). Of the 1416
samples (including controls) profiled, 44 samples were excluded
for quality-control issues using a standard unsupervised clustering of samples based on their full expression profile (Jesse
Krijthe (2015). Rtsne: T-Distributed Stochastic Neighbor
Embedding using Barnes-Hut Implementation. R package version 0.10. http://CRAN.R-project.org/package=Rtsne).
CMap Analysis
The primary goal of the current CMap analysis was to validate
and discover connections between the 34 chemicals under
study using pattern-matching of their expression profiles.
After RMA preprocessing, the log2 gene expression data were
used to calculate fold-change of each perturbation sample (or
instance, in CMap parlance) with respect to the average control
instances in the corresponding batch. The fold-change matrix
of all perturbation instances was used to produce the CMap
rank matrix using the standard method described in the original
CMap paper (Lamb et al., 2006). This rank matrix represented the
database against which all signatures are scored. To generate
signatures for each chemical (independently for each concentration or time point), a 2-sample t-test paired for instances
tested in the same batch was run using the limma software
(Smyth, 2005; Wettenhall et al., 2008). A 5% false-discovery rate
(FDR) cut-off was used to generate up (positive fold-change) and
down (negative fold-change) signatures. Because the cut-off
used was quite stringent, the following allowance was made for
chemicals that did not produce useable signatures. For chemicals that did not have as many as 250 probesets significant at
this FDR level for either direction, a corresponding standard signature of the top 250 probesets by t statistic was used instead.
Each signature was scored against the rank matrix generated
earlier (Supplementary Table 2).
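The signature-generation steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: it assumes the t statistics and p-values have already been computed elsewhere (the study used limma), it uses a Benjamini-Hochberg adjustment as the FDR procedure (an assumption, since the text only states a 5% FDR cut-off), and it applies the 250-probeset fallback separately to each direction, which is one reading of the text. All function and variable names are hypothetical.

```python
from statistics import mean

def fold_changes(treated, controls):
    """log2 fold change of one treated instance vs the mean of its batch controls.
    treated: dict probeset -> log2 expression; controls: list of such dicts."""
    return {p: treated[p] - mean(c[p] for c in controls) for p in treated}

def rank_profile(fc):
    """Rank probesets from most up-regulated (rank 1) to most down-regulated."""
    ordered = sorted(fc, key=lambda p: fc[p], reverse=True)
    return {p: i + 1 for i, p in enumerate(ordered)}

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):          # walk from largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_signature(stats, fdr=0.05, min_size=250, fallback_size=250):
    """stats: list of (probeset, t_statistic, p_value).
    Returns (up, down) probeset lists; falls back to the top probesets
    by t statistic when too few pass the FDR cut-off in a direction."""
    adj = bh_adjust([p for _, _, p in stats])
    significant = {s[0] for s, q in zip(stats, adj) if q <= fdr}
    up_all = sorted((s for s in stats if s[1] > 0), key=lambda s: -s[1])
    down_all = sorted((s for s in stats if s[1] < 0), key=lambda s: s[1])

    def pick(candidates):
        chosen = [name for name, _, _ in candidates if name in significant]
        if len(chosen) < min_size:
            chosen = [name for name, _, _ in candidates[:fallback_size]]
        return chosen

    return pick(up_all), pick(down_all)

# Toy example with made-up numbers (thresholds shrunk to 2 for illustration).
fc = fold_changes({"a": 3.0, "b": 1.0}, [{"a": 1.0, "b": 1.0}, {"a": 2.0, "b": 2.0}])
ranks = rank_profile(fc)
toy_stats = [("p1", 5.0, 0.001), ("p2", 4.0, 0.002), ("p3", 0.5, 0.6),
             ("p4", -0.4, 0.7), ("p5", -3.0, 0.5)]
up, down = select_signature(toy_stats, min_size=2, fallback_size=2)
```

In the toy example the up direction has enough significant probesets, whereas the down direction does not and therefore falls back to the most negative t statistics.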
Heatmaps were created using CMap scores with positive
connections in red and negative connections in blue. Chemicals
were grouped by chemical class, and the order of chemicals/
chemical class was positioned manually so as to preserve the
order of the chemicals/chemical class in both horizontal and
vertical directions (i.e., order of chemicals on the x-axis from
left to right and the order of chemicals on the y-axis from top to
bottom were set to be the same). Within a chemical, the 3 concentrations (D1, D2, and D3) were also positioned in the same
order for both the x- and y-axes (for the x-axis, D1–D3 was set
from left to right; for the y-axis, D1–D3 was set from top to bottom for each chemical). This resulted in a final plot where the
order of the chemical/concentration combination on the x-axis
from left to right and the chemical concentration combination
on the y-axis from top to bottom were exactly the same. Within
the plot, an area within 2 consecutive black lines represented
the CMap score for a given chemical. Within each 2 consecutive
black lines, the 3 columns (x-axis) or 3 rows (y-axis) represented
the 3 concentrations D1, D2, and D3 from left to right or top to
bottom. Chemicals are denoted by letters on x-axis and numbers on y-axis for easy reference (Supplementary Table 2; Figs.
1–4).
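The axis layout described above, with the same chemical/concentration order on both axes, can be illustrated with a short Python sketch. It only shows the indexing idea; the score matrix is filled with placeholder numbers, and the function names are hypothetical rather than part of the study's code.

```python
def axis_labels(chemicals, doses=("D1", "D2", "D3")):
    """Build one label list used for BOTH axes so that rows and columns
    share the same chemical/concentration order."""
    return [(chem, dose) for chem in chemicals for dose in doses]

def block(scores, labels, row_chem, col_chem):
    """Return the 3 x 3 block of scores (one box between black lines)
    for a pair of chemicals. scores[i][j] is the score in row i, column j."""
    rows = [i for i, (chem, _) in enumerate(labels) if chem == row_chem]
    cols = [j for j, (chem, _) in enumerate(labels) if chem == col_chem]
    return [[scores[i][j] for j in cols] for i in rows]

# Placeholder matrix: entry (i, j) is simply 10*i + j so positions are visible.
labels = axis_labels(["bisphenol A", "genistein"])
scores = [[10 * i + j for j in range(len(labels))] for i in range(len(labels))]
box = block(scores, labels, row_chem="genistein", col_chem="bisphenol A")
```

Because a single label list drives both axes, the diagonal of the matrix always corresponds to a chemical/concentration compared with itself.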
Each black box within each of the heatmaps represents a
grouping of 9 CMap scores (9 cells: magnified in Figure 1) for the
corresponding concentration for the given chemical (e.g., A1 includes the CMap score for AD1–1D1, AD2–1D1, AD3–1D1, AD1–
1D2, AD2–1D2, AD3–1D2, AD1–1D3, AD2–1D3, and AD3–1D3)
(Figs. 1–4). Each of the cells within a black box represents the
average CMap score of a chemical-/concentration-specific signature read against each of the other chemical/concentrations
(read along the x-axis and compared with each chemical/concentration on the y-axis). The color of the cell lies between red
(similar) and blue (dissimilar); the intensity of the color represents how close the value is to +2 (maximum possible [similar]
CMap score) or −2 (minimum possible [dissimilar] CMap score)
and is also represented as a color key in each figure (Figs. 1–4).
The more intensely colored red cells on the diagonal represent “self-connectivity,” or how a signature relates to its own average profile. The values of these cells should ideally be close to
2. They fall below 2 because the signature (y-axis) is picked using a t test, whereas the scores of the replicates
(x-axis) are averaged.
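To see why scores are bounded by +2 and −2, it helps to write out a Kolmogorov-Smirnov-style connectivity score of the kind described by Lamb et al. (2006). The sketch below is an assumption about the scoring form: it uses the unnormalized difference between the up- and down-signature enrichment statistics, which is consistent with the ±2 range quoted above, but the exact implementation used in this study is not given in the text. Names are hypothetical.

```python
def ks_score(tags, ranks):
    """Signed KS-like enrichment of a probeset list within one ranked profile.
    ranks: dict probeset -> rank (1 = most up-regulated)."""
    n = len(ranks)
    t = len(tags)
    positions = sorted(ranks[p] for p in tags)
    a = max((j + 1) / t - v / n for j, v in enumerate(positions))
    b = max(v / n - j / t for j, v in enumerate(positions))
    return a if a > b else -b

def connectivity_score(up, down, ranks):
    """Raw score in [-2, 2]; zero when up and down tags point the same way."""
    ks_up = ks_score(up, ranks)
    ks_down = ks_score(down, ranks)
    if (ks_up > 0) == (ks_down > 0):
        return 0.0
    return ks_up - ks_down

# Toy profile of 10 probesets ranked p1 (most up) to p10 (most down).
toy_ranks = {f"p{i}": i for i in range(1, 11)}
similar = connectivity_score(["p1", "p2"], ["p9", "p10"], toy_ranks)
opposite = connectivity_score(["p9", "p10"], ["p1", "p2"], toy_ranks)
```

Each enrichment term lies between −1 and 1, so their difference cannot exceed 2 in magnitude; the toy profile gives roughly 1.7 and −1.7 because only 10 probesets are ranked.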
Comparison of CMap Scores From Selected Chemicals to
MIT’s Public CMap Database
Next, we wanted to further validate the connections between
the 34 chemicals used in this study and also wanted to understand the robustness of the method when using publicly available data sets. Signatures of chemicals common to our study
and also used in the study of Lamb et al. (2006) were used to
query the original CMap database and the resulting scores were
analyzed for consistency.
FIG. 1. Heatmap of average connectivity mapping (CMap) scores for all concentrations in MCF7 Cells for 34 Chemicals. Signatures were created for each chemical concentration combination of the 34 chemicals using a 2-sample t test and a 5% false discovery rate. Signatures were compared with the CMap rank matrix, and average
CMap scores were obtained as explained in the Materials and Methods section. Each CMap value was color coded based on a color key (top left corner: red: positive connection, blue: negative connection) and represented as a cell in the heatmap. Color-coded cells were grouped based on chemical class (chemical class key on bottom
right) and ordered by concentration. Finally, chemical classes were graphed against each other so as to result in the same order of chemical classes for both x- and
y-axes. Diagonal intensely pigmented line of cells represents self-connectivity. Black boxes represent CMap scores for all 3 concentrations against all 3 concentrations
of another chemical (each black box contains 9 cells corresponding to the 9 different chemical/concentration combinations). The A1 block is magnified to show detail
of the 9 cells.
Only 17 of the 34 chemicals used in this study were available
on the original CMap database (http://www.broadinstitute.org/
cmap/). Signatures generated for overlapping chemicals (amoxicillin, chenodeoxycholic acid, clobetasol, clofibrate, flutamide,
genistein, griseofulvin, ketoconazole, metformin, methotrexate,
phenformin, progesterone, tamoxifen, troglitazone, valproic
acid, vinblastine, and vorinostat) were used to query the MIT
CMap database (restricted to 3095 chemical instances tested on
the MCF7 cell line). Because we used a different Affymetrix platform than the original paper of Lamb et al. (2006) (U219 vs
U133A2), to make the 2 Affymetrix platforms comparable, the
probesets from the U219 platform used in the current study
were filtered to match only those present on the older U133A2
platform used by the MIT database. Therefore, the resulting up/
down signatures differed slightly from those used in the original
analysis described above. The resulting scores and ranking of
chemicals in the MIT database were analyzed for consistency by
looking for chemicals in the MIT database that matched the
query chemical. Barview plots were created for each of the
chemicals, where the “barview” was constructed from 3095
horizontal lines representing the chemical instances tested on
the MCF7 cell line and each ordered by their connectivity score.
Instances corresponding to treatment with the chemical of
interest were denoted by black lines. All other instances were
depicted based on their connectivity score; green: positive, gray:
null and red: negative (Figure 6 and Supplementary Figure 1).
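The two bookkeeping steps in this comparison, restricting a signature to probesets shared by both platforms and building the ordered barview, can be sketched as follows. This is an illustrative outline under simplifying assumptions: the shared-probeset set is taken as given (how identifiers were matched across U219 and U133A2 is not detailed here), and all names and scores are hypothetical.

```python
def restrict_signature(signature, shared_probesets):
    """Keep only probesets that also exist on the older platform,
    preserving the original order of the signature."""
    return [p for p in signature if p in shared_probesets]

def barview(instances, query_chemical):
    """Order database instances by connectivity score (highest first) and
    assign the line color used in the plot.
    instances: list of (chemical_name, score)."""
    rows = []
    for name, score in sorted(instances, key=lambda inst: inst[1], reverse=True):
        if name == query_chemical:
            color = "black"      # instance of the chemical of interest
        elif score > 0:
            color = "green"      # positive connectivity
        elif score < 0:
            color = "red"        # negative connectivity
        else:
            color = "gray"       # null connectivity
        rows.append((name, score, color))
    return rows

# Made-up identifiers and scores, for illustration only.
filtered = restrict_signature(["ps1", "ps2", "ps3"], {"ps1", "ps3", "ps9"})
rows = barview(
    [("tamoxifen", 0.9), ("other_1", -0.7), ("tamoxifen", 0.4),
     ("other_3", 0.0), ("other_2", 0.6)],
    query_chemical="tamoxifen",
)
```

A query is consistent with the database when its black lines cluster near the top of the ordered list.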
RESULTS
Chemical Selection for CMap Analysis
We selected 34 chemicals to represent a broad range of known
toxicological MOAs. The 34 chemicals fell into 24 different chemical
classes. A chemical class was defined as a group of chemicals that
shared a MOA (as defined based on the literature). Of the 24 chemical classes, 9 contained at least 2 chemicals, whereas 15 chemical
classes were represented by a single chemical (Table 1).
FIG. 2. Heatmap of average connectivity mapping (CMap) scores for all concentrations in Ishikawa cells for 34 chemicals. Signatures were created for each chemical
concentration combination of the 34 chemicals using a 2-sample t test and a 5% false discovery rate. Signatures were compared with the CMap rank matrix, and average CMap scores were obtained as explained in the Materials and Methods section. Each CMap value was color coded based on a color key (top left corner: red: positive
connection, blue: negative connection) and represented as a cell in the heatmap. Color-coded cells were grouped based on chemical class (chemical class key on bottom right) and ordered by concentration. Finally, chemical classes were graphed against each other so as to result in the same order of chemical classes for both x- and
y-axes. Diagonal intensely pigmented line of cells represents self-connectivity. Black boxes represent CMap scores for all 3 concentrations against all 3 concentrations
of another chemical (each black box contains 9 cells corresponding to the 9 different chemical/concentration combinations).
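The class counts quoted for Table 1 can be checked with a small Python tally. The class membership below is transcribed from Table 1; the split of clofibrate and diethylhexyl phthalate (PPAR agonist) versus phenobarbital and troglitazone (CAR/PXR agonist) is inferred from the table order and the stated counts, so treat it as an assumption.

```python
chemical_classes = {
    "Estrogens, environmental estrogens": ["bisphenol A", "genistein", "ethinyl-estradiol"],
    "Androgen": ["trenbolone", "dehydroepiandrosterone"],
    "Liver cholestasis inducers": ["ANIT", "griseofulvin"],
    "Nicotinic acetylcholine receptor agonist": ["nicotine", "imidacloprid"],
    "Oxidative phosphorylation/mitochondrial inhibitors": ["metformin", "phenformin"],
    "PPAR agonist": ["clofibrate", "diethylhexyl phthalate"],        # assumed split
    "CAR/PXR agonist": ["phenobarbital", "troglitazone"],            # assumed split
    "FXR receptor agonist": ["farnesol", "chenodeoxycholic acid"],
    "HDAC inhibitors": ["valproic acid", "vorinostat"],
    "RAR agonist": ["retinoic acid"],
    "TR agonist": ["thyroxine"],
    "AhR agonist": ["TCDD"],
    "Vitamin D agonist": ["1a,25-dihydroxyvitamin D3"],
    "Glucocorticoid receptor agonist": ["clobetasol"],
    "Progesterone receptor agonist": ["progesterone"],
    "Progesterone receptor antagonist": ["RU486"],
    "Steroid synthesis inhibitor": ["ketoconazole"],
    "Iron chelator": ["desferrioxamine"],
    "Idiosyncratic liver injury": ["amoxicillin"],
    "Glycolytic inhibitor": ["6-aminonicotinamide"],
    "Folate/1-carbon metabolism inhibitor": ["methotrexate"],
    "Microtubule inhibitor": ["vinblastine"],
    "Antiestrogen": ["tamoxifen"],
    "Antiandrogen": ["flutamide"],
}

n_chemicals = sum(len(members) for members in chemical_classes.values())
multi = {name: m for name, m in chemical_classes.items() if len(m) >= 2}
single = {name: m for name, m in chemical_classes.items() if len(m) == 1}
n_in_multi = sum(len(m) for m in multi.values())

print(n_chemicals, len(chemical_classes), len(multi), len(single), n_in_multi)
```

The tally reproduces the figures in the text: 34 chemicals, 24 classes, 9 classes with at least 2 chemicals, 15 single-chemical classes, and 19 chemicals belonging to multi-chemical classes.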
Identifying Intra–Chemical Class Positive Linkages for
34 Chemicals in 4 Different Cell Lines
Of the 34 chemicals used in the study, 19 fell into chemical
classes that contained at least 2 chemicals. The chemicals and
the chemical classes they belong to are defined based on what
is known about the MOAs of the chemicals in each class (Table
1). Ideally, chemicals within the same class should behave in a
similar manner (similar biological signature), and the CMap
score should be representative of this phenomenon.
To compare the CMap scores of a given class, all cells corresponding to the overlap of x- and y-axis for a given chemical
class were compared (e.g., for estrogen, the 81 cells AD1–1D1,
AD2–1D1, AD3–1D1, etc., within the 9 boxes A1, B1, C1, A2, B2,
C2, A3, B3, and C3 were compared with each other). In general,
chemicals within the same class showed positive linkages (cells
within overlap areas were predominantly red): for example, the
estrogen/environmental estrogen class of chemicals described
by the 81 cells above were predominantly red in Figures 1–4.
However, the strength of the positive linkage was different for
each of the cells within the same overlap area for all 4 cell types
(the color-scale of cells within the overlap area indicated higher
similarity scores and showed differences from nonoverlap
areas; these differences were unique for each cell type).
Furthermore, if the same cell (e.g., AD1–1D1) was compared
among the 4 cell lines, the strength of the positive linkage (intensity of red color) was different. Both of these outcomes were
expected. The differences in the intensity of each cell in the
same overlap area can be attributed to specificity and/or
potency of ligands to the receptor/enzyme defined by the chemical class. This fact is further confirmed by the color-scale being
correlated with concentration for a given chemical in a specific
cell line (e.g., intensity of red color for AD1–1D1, AD2–1D1, and
AD3–1D1 are different). The difference in the pattern of intensity of each cell within the overlap area among the different cell
lines can be attributed to the “completeness” of the pathway of
interest (molecular pathway underlying the MOA). Certain cell
types may lack or have significantly fewer copies of certain molecular components of the pathway of interest; consequently,
the pathway will show a quantitatively lower response in such
cell types when exposed to the same concentration of chemical.
FIG. 3. Heatmap of average connectivity mapping (CMap) scores for all concentrations in HepaRG cells for 34 chemicals. Signatures were created for each chemical concentration combination of the 34 chemicals using a 2-sample t test and a 5% false discovery rate. Signatures were compared with the CMap rank matrix, and average
CMap scores were obtained as explained in the Materials and Methods section. Each CMap value was color coded based on a color key (top left corner: red: positive connection, blue: negative connection) and represented as a cell in the heatmap. Color-coded cells were grouped based on chemical class (chemical class key on bottom
right) and ordered by concentration. Finally, chemical classes were graphed against each other so as to result in the same order of chemical classes for both x- and y-axes. Diagonal intensely pigmented line of cells represents self-connectivity. Black boxes represent CMap scores for all 3 concentrations against all 3 concentrations of
another chemical (each black box contains 9 cells corresponding to the 9 different chemical/concentration combinations).
For example, it is well known that the Farnesoid X receptor
(FXR) is predominantly expressed in the liver and kidney
(Forman et al., 1995). In our study, the intensity of the self-connectivity
for the FXR agonists farnesol and
chenodeoxycholic acid (O16, P16, O17, and O18) is much more
pronounced in the 2 hepatocyte cell lines HepG2 (Figure 4) and
HepaRG (Figure 3) when compared with the other 2 non-hepatocyte cell lines (Figs. 1 and 2). This phenomenon can be further
corroborated by observing the changes seen with increasing
concentration. For example, if components representing a certain pathway are low in abundance in a certain cell line, no response might be observed at the lower concentration (D3);
however, at a higher concentration (D1), the expected response
might be observed. At the lower concentration (D3), the number
of ligand molecules may not be sufficient to interact with the
limited number of receptor/enzyme copies to instigate a statistically significant downstream gene expression change/activate
pathway. However, at the higher concentration, the number of
available ligand molecules may be sufficient to result in such an
outcome (Figs. 1 and 5).
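The intra-class comparison described in this section amounts to pulling out the overlap area for one chemical class and summarizing its cells. The Python sketch below shows the idea for a 3-chemical class, which yields 3 × 3 concentrations on each axis and therefore 81 cells. The score matrix is synthetic and the helper names are hypothetical; this is not the authors' analysis code.

```python
def class_overlap(scores, labels, members):
    """Collect every cell whose row AND column belong to the given class.
    labels: list of (chemical, dose) shared by both axes."""
    idx = [i for i, (chem, _) in enumerate(labels) if chem in members]
    return [scores[i][j] for i in idx for j in idx]

def fraction_positive(cells):
    """Share of cells with a positive (red) score."""
    return sum(1 for s in cells if s > 0) / len(cells)

doses = ("D1", "D2", "D3")
chemicals = ["bisphenol A", "genistein", "ethinyl-estradiol", "tamoxifen"]
labels = [(chem, dose) for chem in chemicals for dose in doses]
estrogens = {"bisphenol A", "genistein", "ethinyl-estradiol"}

# Synthetic scores: positive inside the estrogen class, negative elsewhere.
scores = [
    [1.0 if (r in estrogens and c in estrogens) else -0.5 for (c, _) in labels]
    for (r, _) in labels
]

cells = class_overlap(scores, labels, estrogens)
share_red = fraction_positive(cells)
```

With real data, the same summary could be computed per cell line and per concentration to compare how strongly a class groups in each tissue type.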
FIG. 4. Heatmap of average connectivity mapping (CMap) scores for all concentrations in HepG2 cells for 34 chemicals. Signatures were created for each chemical concentration combination of the 34 chemicals using a 2-sample t test and a 5% false discovery rate. Signatures were compared with the CMap rank matrix, and average
CMap scores were obtained as explained in the Materials and Methods section. Each CMap value was color coded based on a color key (top left corner: red: positive connection, blue: negative connection) and represented as a cell in the heatmap. Color-coded cells were grouped based on chemical class (chemical class key on bottom
right) and ordered by concentration. Finally, chemical classes were graphed against each other so as to result in the same order of chemical classes for both x- and yaxes. Diagonal intensely pigmented line of cells represents self-connectivity. Black boxes represent CMap scores for all 3 concentrations against all 3 concentrations of
another chemical (each black box contains 9 cells corresponding to the 9 different chemical/concentration combinations).
Identifying Inter–Chemical Class Positive/Negative
Linkages for 34 Chemicals in 4 Different Cell Lines
As described above, the known intra–chemical class positive
linkages of CMap scores were expected. Next we wanted to determine if novel inter–chemical class positive and negative linkages could be identified via our analysis. In order to make this
comparison, for each heatmap, the CMap score (cell-color)
within the same chemical class was compared with the scores
(cell colors) in the same row for the other 33 chemicals and 3
concentrations. Intensely colored red or blue cells on the same
row were an indication of strong similarity or dissimilarity of
the chemical identified by the row to the chemical identified by
the column (read across rows and compared with each column).
This is illustrated by cell type in Figures 1–4, discussed further
below.
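The row-wise reading of a heatmap described above can be sketched in a few lines of Python. This is a minimal illustration only: the score values are invented, scores are assumed to lie between -1 and 1, and the cutoff of 0.5 for an "intensely colored" cell is a hypothetical threshold, not a value used in this study.

```python
def strong_linkages(row_scores, cutoff=0.5):
    """Split one heatmap row into strong positive and negative linkages.

    row_scores maps a column label (chemical, concentration) to the CMap
    score of the row chemical against that column. The cutoff is an
    assumed, illustrative threshold, not a value from the study.
    """
    positive = sorted(k for k, s in row_scores.items() if s >= cutoff)
    negative = sorted(k for k, s in row_scores.items() if s <= -cutoff)
    return positive, negative


# Hypothetical scores for a single row (e.g., tamoxifen at D1); invented values.
row = {
    ("genistein", "D1"): -0.72,
    ("bisphenol A", "D1"): -0.55,
    ("metformin", "D1"): 0.08,
    ("tamoxifen", "D2"): 0.81,
}
pos, neg = strong_linkages(row)
print("positive linkages:", pos)
print("negative linkages:", neg)
```

Columns that fall between the two cutoffs correspond to the white or lightly colored cells and are treated as no linkage in this sketch.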
As a first step in this analysis, CMap scores for agonists and
antagonists for the same receptor were compared. The 34
chemicals included agonists and antagonists for 3 receptors: estrogen receptor (ER), androgen receptor, and progesterone receptor. In
general, the antagonists behaved in an opposing manner to
agonists (blue/white cells vs red cell colors), and the negative
linkage was most prominent at the highest concentration (D1).
For example, the antiestrogen tamoxifen exhibited negative
linkages for the ER agonists bisphenol A, genistein, and ethinyl-estradiol at the highest concentration, D1 (Figure 5). However,
the phenomenon could not be considered a complete response
(cells for antagonists were not completely blue when cells for
agonists were red and vice versa) and was highly concentration
and cell type specific (compare D1, D2, and D3 for the following
within each figure [Figs. 1–4]: antiestrogen: tamoxifen vs
estrogens: bisphenol A, genistein, and ethinyl-estradiol; antiandrogen: flutamide vs androgens: trenbolone and dehydroepiandrosterone; antiprogesterone: RU486 vs progesterone. Compare the same agonist–antagonist pairs between Figs. 1–4).
FIG. 5. Heatmap of average connectivity mapping (CMap) scores for D1 concentration in MCF7 cells for 34 chemicals. Signatures were created for the D1 concentration
of MCF7 cells and a CMap score generated by comparing this signature to a CMap rank matrix of MCF7 D1 as explained in the Materials and Methods section. Each CMap
value was color coded based on a color key (top left corner: red: positive connection, gray: null connection, blue: negative connection) and represented as a cell in the
heatmap. Color-coded cells were grouped based on chemical class (chemical class key on bottom right). Finally, chemical classes were graphed against each other so
as to result in the same order of chemical classes for both x- and y-axes. Diagonal intensely pigmented line of cells represents self-connectivity. Black boxes represent
CMap scores for a given chemical class.
These observations could be explained by the nature of the antagonists. None of the antagonists picked for the study were
complete antagonists of the receptor; as such, their responses
were not expected to be completely opposite to the agonist response. The inconsistent agonist–antagonist behavior among
the 4 different cell types was expected and can be attributed to
how well the respective pathways were represented in each of
the cell types (completeness of pathway).
The HepG2 profile (Figure 4) exhibited 5 distinct patches of
red (area 1: P16, Q16, P17, and Q17; area 2: W16, X16, Y16, Z16,
W17, X17, Y17, Z17, AA16, and AA17; area 3: R18, S18, R19, and
S19; area 4: P23, Q23, P24, Q24, P25, Q25, P26, Q26, P27, and Q27;
area 5: W23, X23, Y23, Z23, AA23, W24, X24, Y24, Z24, AA24,
W25, X25, Y25, Z25, AA25, W26, X26, Y26, Z26, AA26, W27, X27,
Y27, Z27, and AA27). When analyzed, these patches of red
pointed to inter–chemical class positive linkages in the HepG2
cell line. The chemical class FXR receptor agonist (area 1)
showed positive linkages with vitamin D agonist, glucocorticoid
receptor agonist, progesterone receptor agonist, progesterone
receptor antagonist, and steroid synthesis inhibitors (area 2).
Similarly, vitamin D agonist, glucocorticoid receptor agonist,
progesterone receptor agonist, progesterone receptor antagonist, and steroid synthesis inhibitors (area 4) showed positive
linkages with each other (area 5). The same exact patterns (i.e.,
positive linkages) but at a lesser intensity were observed for the
HepaRG (Figure 3) cell line but not for the MCF7 and Ishikawa
cell lines (Figs. 1 and 2). These comparisons provided another
example of cell type (hepatocyte cell line)–specific inter–chemical class positive linkages.
Enriching CMap Technology by Assessing CMap Profiles
in Multiple Cell Lines
Per TT21C, a toxicity pathway is defined as a normal cellular response pathway that is expected to result in an adverse health
effect when sufficiently perturbed (NRC, 2007). In order to ensure that all possible adverse outcomes of a chemical have been
accounted for, one would need to assess the effects of a chemical on every possible cellular pathway. Although it is not
known how many such cellular pathways actually exist (Lamb
et al., 2006), an approach that has been implemented by many,
including Lamb et al. (2006), is to look at gene signatures in multiple cell types. The intent is to achieve maximum coverage of
the genome by the overlap of the differential gene expression
profiles of different cell lines. In an attempt to follow a similar
process, we picked 4 different cell lines for our study and retained the same experimental and data analysis procedures
across all 4 of them.
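The idea of achieving coverage through the overlap of profiles from several cell lines can be illustrated with a small Python sketch. The gene identifiers and the responsive gene sets below are invented placeholders, and the function name `coverage` is hypothetical; the sketch only shows that the union across cell lines can cover more of a gene list than any single cell line does.

```python
def coverage(responsive_genes_by_cell_line, all_genes):
    """Fraction of all_genes that respond in at least one cell line."""
    covered = set().union(*responsive_genes_by_cell_line.values())
    return len(covered & set(all_genes)) / len(all_genes)


# Invented gene identifiers and responsive sets, for illustration only.
genome = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
responsive = {
    "MCF7": {"G1", "G2", "G3"},
    "Ishikawa": {"G2", "G4"},
    "HepaRG": {"G5", "G6"},
    "HepG2": {"G5", "G7"},
}

print("MCF7 alone:", coverage({"MCF7": responsive["MCF7"]}, genome))
print("all 4 cell lines:", coverage(responsive, genome))
```

In this toy example one gene is not covered by any of the 4 cell lines, which mirrors the point that a finite panel of cell lines may still leave some pathways unassessed.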
The overall heatmaps for each of the cell lines were unique
(Figs. 1–4). The heatmap for MCF7 (Figure 1) exhibited predominantly positive linkages (red color) illustrating that this cell line
responded to most of the 34 chemicals (ligands) quite well.
Although the MCF7 cell line showed the most sensitivity, this
resulted in reduced specificity. However, if the concentration
was teased out and the highest concentration (D1) was plotted
for the MCF7 cell line (Figure 5), a much clearer picture of the
linkages could be observed. This showed the significance of
dose in making the right connections when using this method.
In contrast, the other 3 cell lines (Figs. 2–4) showed high specificity and lesser sensitivity (visualized as patches of red and blue
vs an overall red or blue heatmap).
Within each of the heatmaps, distinct rows that showed no
linkages (white/light red or white/light blue colored rows) could
be identified. For each cell line, a different set of chemicals
produced a “no-linkage” response, which was mostly
observed at the highest exposure concentration tested (D1)
(the areas of white/light red or white/light blue colored rows
differed among Figs. 1–4; this difference was more pronounced
when the D1 rows were compared between Figs. 1–4). The uniqueness of the no-linkage chemicals for each cell line was an indication that the cellular pathways were not acting in a similar
manner among the 4 cell types. One possible explanation for
the prominence of this phenomenon at the highest concentration
(D1) is the activation of nonspecific pathways (e.g., cytotoxicity pathways) at the higher concentration.
In either case, the 4 cell lines behaved dissimilarly, demonstrating the need to do these studies in different cell lines in parallel
to capture the full depth of the biology/toxicology that is taking
place.
Comparison of CMap Scores of Representative
Chemicals From MCF7 Line Against Broad
Institute CMap Database
In their original paper, Lamb et al. (2006) showed that connections between drugs, diseases, and genes can be established independent of gene array platform, cell type, and concentration. We
were interested in exploring if this concept held true for our
study. The 17 of 34 chemicals available on the CMap database
were queried against the CMap data and presented as barview
plots (Figure 6 and Supplementary Figure 1). Although some
variability was observed in terms of the connections, in general,
more positive connections (as opposed to null and negative)
were observed between our signature and the CMap database
for all 17 chemicals. In general, the number of positive connections seemed to increase with increasing concentration (black
lines move into green area with increasing concentration). Of
the 4 cell lines used in our study, only MCF7 was used by the
CMap group; hence, the comparison used only this cell line.
DISCUSSION
Since the publication of the seminal report, TT21C by the NRC
(2007), the focus of toxicology research has shifted from one
where chemicals are defined based on the diseases and health
effects they trigger to one where chemicals are defined based
on the biology that is driving the underlying toxicity (Kavlock
et al., 2012). This is realized by defining MOAs (toxicity pathways) that lead to understanding the chemical biological interactions that are taking place upstream of the apical events that
are observed (Kavlock et al., 2012). Through programs such as
ToxCast and Tox21, government agencies such as the U.S. EPA
and the U.S. FDA have already embraced this idea (Collins et al.,
2008; Dix et al., 2007; Kavlock et al., 2012; Tice et al., 2013). These
initiatives are defining methods to identify MOAs (or responses
that may eventually be used to inform MOAs) of chemicals
using high-throughput screening (HTS) approaches. ToxCast and Tox21
incorporate a myriad of in vitro assays that range from cell-free
biochemical assays to complex cell culture systems to small
model organisms (Kavlock et al., 2012; Knudsen et al., 2013) to
measure a range of endpoints from protein–protein binding to
high-content cell imaging to multiplexed transcription factor reporter assays, resulting in excess of 650 different assay readouts for a given chemical (Judson et al., 2010; Knudsen et al.,
2011) (http://www.epa.gov/ncct/Tox21/). Following the modeling
of concentration-response data for each of these assay readouts, AC50 values (concentration causing half-maximal response) are calculated from curve fits to Hill equations (Tice
et al., 2013). Due to the direct and indirect measurements associated with the different assays used in ToxCast/Tox21, the AC50
values of any given tested compound could span 4 orders of magnitude (Wetmore et al., 2012), ultimately affecting interpretation of biological specificity (point of departure [POD]).
Methods to distinguish between AC50 values that are directly
related to a biological pathway that contributes to an ultimate
adverse response versus ones that are not are unavailable to
date.
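To make the AC50 concept concrete, the following Python sketch evaluates a commonly used form of the Hill equation and computes the spread of a set of AC50 values in log10 units. The parameterization, the function names, and all numerical values are assumptions chosen for illustration; the source does not state which Hill parameterization is fitted in ToxCast/Tox21.

```python
import math


def hill_response(conc, top, ac50, hill_slope, bottom=0.0):
    """Response predicted by a Hill model at concentration conc.

    Uses bottom + (top - bottom) / (1 + (ac50 / conc) ** hill_slope),
    one common parameterization (an assumption for this sketch).
    """
    if conc <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** hill_slope)


def log10_span(ac50_values):
    """Spread of a set of AC50 values, in orders of magnitude."""
    return math.log10(max(ac50_values) / min(ac50_values))


# At conc == ac50 the response is halfway between bottom and top.
half_max = hill_response(2.0, top=100.0, ac50=2.0, hill_slope=1.5)
print("response at AC50:", half_max)

# Invented AC50 values (arbitrary concentration units) for one compound.
print("orders of magnitude spanned:", log10_span([0.01, 0.5, 100.0]))
```

The first call shows why the AC50 is described as the concentration causing a half-maximal response; the second shows how a compound tested in many assays can yield AC50 values that differ by several orders of magnitude.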
In our current study, we attempt to assess the MOA of
chemicals using a high-content (gene expression) approach.
Although both high-throughput (in vitro assays) and high-content (gene expression) methods attempt to address the
same question of identifying an MOA for a given chemical biological interaction, we believe the CMap approach outlined in this
paper can help address a fundamental issue related to the large
range associated with the Tox21/ToxCast AC50 data sets:
namely, ruling in/out of the appropriate “assay set” for a particular chemical. Connectivity mapping could be used to either
tailor the relevant in vitro assays for a particular chemical or be
used as a tool to weed out outlier AC50 data sets during data
analysis.
We predefined MOA-related chemical classes for the 34
chemicals used in this study. For some of the classes, there is a
close connection with a known MIE (e.g., receptor binding),
whereas others are defined at a higher level and a specific MIE
is less clear due to a number of reasons, including a lack of detailed knowledge on the pathways and the nonspecificity of the
pathways involved (e.g., idiosyncratic liver injury or oxidative
phosphorylation/mitochondrial inhibitors).
We believe that chemical-mediated pathway perturbations
(toxicological MOAs) can be grouped into 2 main areas based on
the selectivity of the biological action triggered by the chemical:
(1) Pathway perturbations mediated via nonselective binding to
intracellular molecules (mediated via strong interactions [covalent binding] between chemicals [ligands] and macromolecules
[DNA, protein, lipids, etc.]); and (2) Pathway perturbations mediated via selective agonism/antagonism (mediated through
interactions of chemicals [ligands] and biological molecules [receptors, enzymes, etc.]) (Daston et al., 2015). Although many
FIG. 6. Barview of connectivity mapping (CMap) scores for selected chemicals when compared with data from CMap database. Signatures were created for all 3 concentrations of genistein, tamoxifen, metformin, phenformin, valproic acid, vorinostat, methotrexate, and troglitazone and a CMap score generated by comparing this signature to a CMap rank matrix created using the externally available CMap data (http://www.broadinstitute.org/cmap/) as explained in the Materials and Methods
section. The barview is constructed from 3095 horizontal lines representing chemical instances tested on the MCF7 cell line each ordered by their connectivity score.
Instances corresponding to treatment with the same chemical as the signature are denoted by black lines. The number of instances is listed in parentheses next to the
title. The colors applied to the remaining instances reflect the sign of the score. Green, gray, and red represent positive, null, and negative connections, respectively.
robust in vitro methods to assess the toxicity of chemicals
within the “nonselective binding” group exist to date (Ames
et al., 1973; Gerberick et al., 2004; OECD 2010, 2014), in vitro methods for assessing the toxicity of the “selective agonism/antagonism” group have not been fully realized. Weak interactions
cover a large range of specific targets, such that it has not been
possible to assemble a set of in vitro assays to quantify these
interactions reliably. The 34 chemicals used in this study were
chosen with a bias toward the latter group, due to both the imminent need to address this category of chemicals and our belief that CMap provides an ideal tool to attempt this task.
For the purposes of this manuscript, cell line–specific CMap
scores for all concentrations are depicted as a single heatmap
(Figs. 1–4). This was done to provide the best visual representation for a manuscript format. However, this visual can be somewhat confusing: the high density of the data may lead to a lack
of clarity when trying to pick the most relevant concentration to
find connections between chemicals. Although noted as a drawback when formatting for a manuscript, we do not expect this
aspect to be a significant setback in day-to-day practice of the
method. In practice, all CMap scores would be tabulated, sorted
in ascending order, and the most relevant concentration (the one corresponding to the highest positive/negative CMap score) picked,
avoiding the need to rely on a visual representation such as a
heatmap.
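The tabulate-and-sort step described above could look like the following Python sketch. The scores are invented, the function name is hypothetical, and "highest positive/negative score" is interpreted here as the score with the largest absolute value, which is an assumption about the intended selection rule.

```python
def most_relevant_concentration(scores):
    """Rank (concentration, CMap score) pairs and pick the strongest one.

    scores is a list of (concentration_label, cmap_score) tuples for one
    chemical pair. Returns the list sorted in ascending order of score and
    the entry with the largest absolute score (assumed selection rule).
    """
    ranked = sorted(scores, key=lambda item: item[1])
    strongest = max(ranked, key=lambda item: abs(item[1]))
    return ranked, strongest


# Invented scores for one hypothetical chemical pair at D1, D2, and D3.
table = [("D2", -0.15), ("D1", -0.62), ("D3", 0.20)]
ranked, strongest = most_relevant_concentration(table)
print("ranked:", ranked)
print("most relevant:", strongest)
```

A tabular workflow of this kind operates on the scores directly, so the choice of concentration does not depend on reading colors from a dense heatmap.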
In traditional animal-based toxicity testing, a chemical usually produces adverse effects in multiple hazard domains (developmental, neurological, reproductive, etc.). Among these
adverse events, the effect that occurs at the lowest dose and is
biologically and/or statistically significant is considered the critical effect. Based on this critical effect, a POD (e.g., NOAEL,
LOAEL, BMD, etc.) is identified or modeled and represents the
point at which a low-dose extrapolation to a health reference value begins (Faustman and Omenn, 2001). Various case-specific uncertainty factors are used to calculate a reference
dose (RfD) from this POD. This RfD is an estimate of daily exposure to the chemical that is assumed to be without adverse
health impact on the human population (Faustman and
Omenn, 2001). If the TT21C paradigm were applied to the above
quantitative risk assessment (QRA) process, each of the adverse
events in the different hazard domains could be viewed as
manifestations of perturbations in one or more biological pathways (MOAs). The critical effect could potentially be defined as
the MOA that is perturbed at the lowest concentration. The key
here is to clearly understand that a perturbation is linked to an
AOP given that cellular perturbations may reflect adaptive responses that would not manifest as toxicity. In order to replace
the historical animal model-based QRA methods with MOAbased methods, it is imperative that all pathways are covered so
as to not miss out on the pathway that might be primarily responsible for the critical effect. Although no one knows how
many such MOAs/pathways exist in humans (Lamb et al., 2006),
one way to assure coverage is by using multiple cell lines, each
potentially containing the relevant machinery for the activation
of a finite number of pathways, and the overlap providing near-complete coverage. Although theoretical, this was the fundamental thinking behind the selection of multiple cell lines for
our study. Two aspects factored into making the decision of
which cell lines to use for this study: (1) The relevance of the tissue from which the cell line originated as a potential target of toxicity
following systemic exposure to a chemical; (2) The presence of
target receptors, enzymes, and ion channels for the 34 chemicals in the cell line. It is clear that each cell line has its unique
sensitivity and specificity for the 34 chemicals (Figs. 1–4).
However, it is also understood that these 4 cell lines do not provide complete coverage for all MOAs of interest. Further analysis looking at known and unknown chemicals in different
chemical spaces and testing these against a broad battery of cell
lines is needed to provide a more comprehensive answer as to
how many cell lines are needed to obtain a realistic characterization of in vivo toxicity.
A comparison of the CMap signatures among the 3 concentrations for each of the 34 chemicals shows that concentration
has a significant impact on the gene expression profile and
thereby the CMap signature of the specific chemicals in each of
the cell lines (Figs. 1–4 and Supplementary Table 2). In our
study, we observed many instances where 1 concentration exhibited the opposite linkage to the remaining concentrations (e.g.,
2 concentrations showed positive linkage, whereas 1 concentration showed negative linkage or vice versa). The negative
linkage of tamoxifen, an ER antagonist, with the ER agonists
bisphenol A, genistein, and ethinyl-estradiol at the highest dose
(D1), which was not as prominent at the mid (D2) and low (D3)
concentration (Figure 1 column AGD1 compared with the rest of
AG1, AG2, and AG3) is a good example of this phenomenon. A
separate heatmap for the high concentration (D1) in MCF7 cells
further emphasizes this concentration phenomenon (Figure 5).
Modes of action/AOPs are traditionally defined as linear processes. Considering the complexity of toxicological processes,
this simplistic definition has been criticized and has been
acknowledged as a deficiency (Vinken, 2013). Concentration-dependent activation of parallel pathways (activation of new
pathways) and crossing over of pathways (recruitment of new
pathways) can lead to different adverse outcomes (Vinken,
2013). Hence, it is important to characterize the dose–response
curve and to understand at which doses nonspecific responses
driven by generalized cytotoxicity emerge. For risk assessment,
the goal would be to compare the estimated systemic exposure
to the in vitro response occurring at the lowest dose that is
linked to an adverse biological response, but for hazard characterization, chemicals would first need to be grouped considering
all relevant responses below those associated with high-dose
nonspecific toxicity. A second step prior to risk assessment
would be to understand which low-dose responses are relevant
to subsequent adverse outcomes. An in vitro-to-in vivo extrapolation method such as the one described by Wetmore et al.
(2012, 2013, 2014; Thomas et al., 2013) could be an option for figuring out the relevant in vitro concentration based on the
known in vivo human exposure scenario. For a study such as
this where a majority of the MOAs are receptor mediated, the
relevant in vitro dose would reflect the constant (steady-state) concentration
(Css); other in vitro scenarios may require a dose reflecting the
maximum concentration (Cmax), or area under the curve (AUC).
After the relevant in vitro concentration is calculated, it is imperative that multiple concentrations flanking this calculated
concentration be tested in order to capture the MOA-based phenomena that may be responsible for the critical endpoint. This
would allow for conducting the in vitro experiment within a
human-relevant dose range (i.e., facilitate testing in the correct
dose range, such as a 10–100-fold range vs a 10 000–100 000-fold range)
while still generating enough data points to account for a statistically significant result after accounting for other uncertainties
such as in vitro biokinetics (Blaauboer, 2010).
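One simple way to generate test concentrations that flank a calculated in vitro concentration is to space them evenly on a log scale around it. The Python sketch below is illustrative only: the function name, the 100-fold range, the number of points, and the center value are assumptions, not the design used in this study.

```python
import math


def flanking_concentrations(center, fold_range=100.0, n_points=5):
    """Log-spaced concentrations spanning fold_range around center.

    center is the calculated in vitro concentration (any unit); the series
    runs from center / sqrt(fold_range) to center * sqrt(fold_range), so
    the highest and lowest values differ by exactly fold_range.
    """
    if center <= 0 or fold_range <= 1 or n_points < 2:
        raise ValueError("need center > 0, fold_range > 1, n_points >= 2")
    lowest = center / math.sqrt(fold_range)
    step = fold_range ** (1.0 / (n_points - 1))
    return [lowest * step ** i for i in range(n_points)]


# Hypothetical center of 1.0 (arbitrary units) with a 100-fold window.
series = flanking_concentrations(center=1.0, fold_range=100.0, n_points=5)
print([round(c, 3) for c in series])
```

With an odd number of points the calculated concentration itself sits in the middle of the series, with equal numbers of lower and higher concentrations on either side.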
The distinct patches of positive linkages (areas of red) in the
HepG2 cell line (Figure 4) provided a clear example of the
strength of the current method. In this cell line, the chemical
class FXR receptor agonist showed positive linkages with vitamin D agonist, glucocorticoid receptor agonist, progesterone receptor agonist, progesterone receptor antagonist, and steroid
synthesis inhibitors. An examination of the current literature
shows that these receptor classes (and hence their ligands) are
distinctly related. The FXR, vitamin D (VDR), glucocorticoid (GR),
and progesterone (PR) receptors are all part of the large group of
receptors known as the nuclear receptor (NR) family. Within
this class, a number of subclasses exist that are named on
the basis of a phylogenetic tree (defined based on the evolution
of 2 well-conserved domains of NR family members) (Bridgham
et al., 2010; Nuclear Receptors Nomenclature Committee, 1999).
Among the 4 receptor classes of interest, FXR and VDR fall into
NR subfamily 1 (groups 1H and 1I, respectively), whereas GR and PR fall into group 3C
(Bridgham et al., 2010; Nuclear Receptors Nomenclature
Committee, 1999). The other class of compounds, steroid synthesis inhibitors, is also related to these receptors. Because
steroids are a main class of ligands for the NR family, one could
assume that the inhibition of steroid synthesis could indirectly
affect the functioning of the above-listed receptor families. The
other key insight from the result is the lack of positive linkages
between FXR, VDR, GR, PR agonist/antagonist, and the ER agonist/antagonist in the HepG2 cell line (Figure 4). The ER falls into
the NR subfamily 3A and is very closely related to the group 3C
members; however, no linkage is observed in the HepG2 result
(Figure 4). Liver cells do not express ER in abundance (Grandien
et al., 1997); hence, pathways related to the ER may not be as active in this cell line, resulting in no linkage. The other interesting
phenomenon observed was the preservation of the positive correlation described above between the 2 liver cell lines, HepG2
(Figure 4) and HepaRG (Figure 3), albeit at a much lower intensity in the HepaRG cell line. The HepaRG cell line is known to
have a metabolic competency comparable to that of primary hepatocytes
(Gerets et al., 2012); these differences in metabolic capacity
(Aninat et al., 2006; Gerets et al., 2012; Guillouzo et al., 2007;
Jennen et al., 2010) may have a role in the subtle differences
seen in the results of these 2 cell lines. Overall, these observations provide further evidence for the need to perform these
studies in multiple cell lines, where the machinery for certain
pathways is uniquely expressed, paying particular attention to
in vitro conditions that better mimic in vivo toxicokinetics.
We observed mostly positive correlation between CMap
scores generated from our data when compared with a rank
matrix of the externally available data for the MCF7 cell line
from the Broad Institute when the number of instances (n) for a
particular chemical was adequate (Figure 6). Although the highest concentration (D1) provided the most positive correlation for
most chemicals (Figure 6 and Supplementary Figure 1), there
were a couple of instances where the medium concentration
(D2) resulted in better positive correlation than D1, for example,
tamoxifen, clobetasol, progesterone, ketoconazole. Although it is
difficult to provide a direct explanation due to the low number of
instances of overlap for each chemical (“n” for each chemical in
Figure 6), the key point here is the need to use the relevant
in vitro concentration that can be extrapolated to the in vivo
situation.
We understand that the overall survey and statements
made regarding the data set presented in this manuscript are
general in nature. The density and richness of the data set for
34 chemicals using 3 concentrations in 4 different cell lines provides an opportunity to perform in-depth CMap analysis for
each chemical used in this study. This was not attempted and
was not the intent of the current publication; rather, the intent
was to evaluate the possibility of using CMap in predictive toxicology to identify connections between the biological signatures
of 34 chemicals and provide specifics on how CMap may be
used as an alternate method to assess high-content data. The
complete gene expression data for this study have been deposited in GEO (super series accession no.: GSE69851), and all
CMap scores for all chemicals and concentrations are provided
as Supplementary Table 2. Readers are encouraged to perform
an in-depth analysis on specific chemicals to understand the
nuances of using CMap as a tool to understand MOA-based clustering of chemicals. We believe that this level of detail could be
a second tier analysis that can be performed using the same
data set following the “screening” level analysis described in
this manuscript. Furthermore, we understand that the 34 chemicals used in this study have well-defined MOAs, and that the
method was successful in MOA-based classification of these
known chemicals. We recognize that the method may need
augmenting when using chemicals with unknown MOAs. In
conclusion, we show that CMap can be used as a robust filtering
tool to assess high-content data. The study provides evidence
that CMap can be applied for predictive toxicological purposes.
Current limitations and future needs to improve the tool are
also highlighted. These include the need to expand the number
of cell lines used for a given chemical and the need to demonstrate the validity of the method via case studies using chemicals from different chemical spaces. In addition, comparison of
data from selected chemicals from our study to the publicly
available Broad Institute database (http://www.broadinstitute.org/cmap/)
showed good correlation (Figure 6, P values) when
the number of instances (n) for a particular chemical was adequate, confirming that the method is user and platform independent. Development of a large shared database such as the
one maintained by the Broad Institute (http://www.broadinstitute.org/cmap)
for toxicologically relevant chemicals will provide further impetus for broad range adoption of methods such
as this in the future. With the increased need to comprehend
big data in toxicology, this type of approach may provide further
clarity to HTS efforts such as ToxCast and Tox21.
SUPPLEMENTARY DATA
Supplementary data are available online at http://toxsci.oxfordjournals.org/.
ACKNOWLEDGMENTS
The authors wish to thank Karen Blackburn and Catherine
Mahony for their review of the manuscript.
REFERENCES
Ames, B. N., Lee, F. D., and Durston, W. E. (1973). An improved
bacterial test system for the detection and classification of
mutagens and carcinogens. Proc. Natl. Acad. Sci. U.S.A. 70,
782–786.
An, W. G., Kanekal, M., Simon, M. C., Maltepe, E., Blagosklonny,
M. V., and Neckers, L. M. (1998). Stabilization of wild-type p53
by hypoxia-inducible factor 1alpha. Nature 392, 405–408.
Aninat, C., Piton, A., Glaise, D., Le Charpentier, T., Langouet, S.,
Morel, F., Guguen-Guillouzo, C., and Guillouzo, A. (2006).
Expression of cytochromes P450, conjugating enzymes and
nuclear receptors in human hepatoma HepaRG cells. Drug
Metab. Dispos. 34, 75–83.
Ankley, G. T., Bennett, R. S., Erickson, R. J., Hoff, D. J., Hornung, M.
W., Johnson, R. D., Mount, D. R., Nichols, J. W., Russom, C. L.,
Schmieder, P. K., et al. (2010). Adverse outcome pathways: A
conceptual framework to support ecotoxicology research
and risk assessment. Envir. Toxicol. Chem. 29, 730–741.
Blaauboer, B. J. (2010). Biokinetic modeling and in vitro-in vivo
extrapolations. J. Toxicol. Environ. Health B Crit. Rev. 13,
242–252.
Blankvoort, B. M., de Groene, E. M., van Meeteren-Kreikamp, A.
P., Witkamp, R. F., Rodenburg, R. J., and Aarts, J. M. (2001).
Development of an androgen reporter gene assay (AR-LUX)
utilizing a human cell line with an endogenously regulated
androgen receptor. Anal. Biochem. 298, 93–102.
Boobis, A. R., Cohen, S. M., Dellarco, V., McGregor, D., Meek, M. E.,
Vickers, C., Willcocks, D., and Farland, W. (2006). IPCS framework for analyzing the relevance of a cancer mode of action
for humans. Crit. Rev. Toxicol. 36, 781–792.
Boobis, A. R., Doe, J. E., Heinrich-Hirsch, B., Meek, M. E., Munn, S.,
Ruchirawat, M., Schlatter, J., Seed, J., and Vickers, C. (2008).
IPCS framework for analyzing the relevance of a noncancer
mode of action for humans. Crit. Rev. Toxicol. 38, 87–96.
Bridgham, J. T., Eick, G. N., Larroux, C., Deshpande, K., Harms, M.
J., Gauthier, M. E., Ortlund, E. A., Degnan, B. M., and
Thornton, J. W. (2010). Protein evolution by molecular tinkering: Diversification of the nuclear receptor superfamily from
a ligand-dependent ancestor. PLoS Biol. 8,
Chalbos, D., Galtier, F., Emiliani, S., and Rochefort, H. (1991). The
anti-progestin RU486 stabilizes the progestin-induced fatty
acid synthetase mRNA but does not stimulate its transcription. J. Biol. Chem. 266, 8220–8224.
Collins, F. S., Gray, G. M., and Bucher, J. R. (2008). Toxicology.
Transforming environmental health protection. Science 319,
906–907.
Daston, G., Knight, D. J., Schwarz, M., Gocht, T., Thomas, R. S.,
Mahony, C., and Whelan, M. (2015). SEURAT: Safety
Evaluation Ultimately Replacing Animal Testing–recommendations for future research in the field of predictive toxicology. Archiv. Toxicol. 89, 15–23.
Dellarco, V. L., and Wiltse, J. A. (1998). US Environmental
Protection Agency’s revised guidelines for Carcinogen Risk
Assessment: Incorporating mode of action data. Mut. Res.
405, 273–277.
Dix, D. J., Houck, K. A., Martin, M. T., Richard, A. M., Setzer, R. W.,
and Kavlock, R. J. (2007). The ToxCast program for prioritizing
toxicity testing of environmental chemicals. Toxicol. Sci. 95, 5–12.
Dudley, J. T., Sirota, M., Shenoy, M., Pai, R. K., Roedder, S., Chiang,
A. P., Morgan, A. A., Sarwal, M. M., Pasricha, P. J., and Butte, A.
J. (2011). Computational repositioning of the anticonvulsant
topiramate for inflammatory bowel disease. Sci. Transl. Med.
3, 96ra76.
ECHA. (2015). Grouping of substances and read-across. Available
at: http://echa.europa.eu/support/grouping-of-substances-and-read-across. Accessed December 10, 2015.
Faustman, E. M., and Omenn, G. S. (2001). Risk assessment. In
Casarett and Doull’s Toxicology: The Basic Science of Poisons (L. J.
Casarett, C. D. Klaassen, and J. Doull, Eds.), 6 ed., pp. 91–92.
McGraw-Hill Medical Pub. Division, New York, NY.
Forman, B. M., Goode, E., Chen, J., Oro, A. E., Bradley, D. J.,
Perlmann, T., Noonan, D. J., Burka, L. T., McMorris, T., Lamph,
W. W., et al. (1995). Identification of a nuclear receptor that is
activated by farnesol metabolites. Cell 81, 687–693.
Gautier, L., Cope, L., Bolstad, B. M., and Irizarry, R. A. (2004). affy–
analysis of Affymetrix GeneChip data at the probe level.
Bioinformatics 20, 307–315.
Gerberick, G. F., Vassallo, J. D., Bailey, R. E., Chaney, J. G., Morrall,
S. W., and Lepoittevin, J. P. (2004). Development of a peptide
reactivity assay for screening contact allergens. Toxicol. Sci.
81, 332–343.
Gerets, H. H., Tilmant, K., Gerin, B., Chanteux, H., Depelchin, B.
O., Dhalluin, S., and Atienzar, F. A. (2012). Characterization of
primary human hepatocytes, HepG2 cells, and HepaRG cells
at the mRNA level and CYP activity in response to inducers
and their predictivity for the detection of human hepatotoxins. Cell Biol. Toxicol. 28, 69–87.
Grandien, K., Berkenstam, A., and Gustafsson, J. A. (1997). The estrogen receptor gene: Promoter organization and expression.
Int. J. Biochem. Cell Biol. 29, 1343–1369.
Guillouzo, A., Corlu, A., Aninat, C., Glaise, D., Morel, F., and
Guguen-Guillouzo, C. (2007). The human hepatoma HepaRG
cells: A highly differentiated model for studies of liver metabolism and toxicity of xenobiotics. Chem. Biol. Interact. 168,
66–73.
Hall, J. M., Barhoover, M. A., Kazmin, D., McDonnell, D. P.,
Greenlee, W. F., and Thomas, R. S. (2010). Activation of the
aryl-hydrocarbon receptor inhibits invasive and metastatic
features of human breast cancer cells and promotes breast
cancer cell differentiation. Mol. Endocrinol. 24, 359–369.
Hartung, T. (2010). Evidence-based toxicology - the toolbox of
validation for the 21st century? Altex 27, 253–263.
Hieronymus, H., Lamb, J., Ross, K. N., Peng, X. P., Clement, C.,
Rodina, A., Nieto, M., Du, J., Stegmaier, K., Raj, S. M., et al.
(2006). Gene expression signature-based chemical genomic
prediction identifies a novel class of HSP90 pathway modulators. Cancer Cell 10, 321–330.
Irizarry, R. A., Bolstad, B. M., Collin, F., Cope, L. M., Hobbs, B., and
Speed, T. P. (2003). Summaries of Affymetrix GeneChip probe
level data. Nucl. Acids Res. 31, e15.
Jahchan, N. S., Dudley, J. T., Mazur, P. K., Flores, N., Yang, D.,
Palmerton, A., Zmoos, A. F., Vaka, D., Tran, K. Q., Zhou, M.,
et al. (2013). A drug repositioning approach identifies tricyclic
antidepressants as inhibitors of small cell lung cancer and
other neuroendocrine tumors. Cancer Discov. 3, 1364–1377.
Jennen, D. G., Magkoufopoulou, C., Ketelslegers, H. B., van
Herwijnen, M. H., Kleinjans, J. C., and van Delft, J. H. (2010).
Comparison of HepG2 and HepaRG by whole-genome gene
expression analysis for the purpose of chemical hazard identification. Toxicol. Sci. 115, 66–79.
Judson, R. S., Houck, K. A., Kavlock, R. J., Knudsen, T. B., Martin,
M. T., Mortensen, H. M., Reif, D. M., Rotroff, D. M., Shah, I.,
Richard, A. M., et al. (2010). In vitro screening of environmental chemicals for targeted testing prioritization: The
ToxCast project. Environ. Health Perspect. 118, 485–492.
Kang, S. C., and Lee, B. M. (2005). DNA methylation of estrogen receptor alpha gene by phthalates. J. Toxicol. Environ. Health A
68, 1995–2003.
Kavlock, R., Chandler, K., Houck, K., Hunter, S., Judson, R.,
Kleinstreuer, N., Knudsen, T., Martin, M., Padilla, S., Reif, D.,
et al. (2012). Update on EPA’s ToxCast program: Providing
high throughput decision support tools for chemical risk
management. Chem. Res. Toxicol. 25, 1287–1302.
Kido, S., Inoue, D., Hiura, K., Javier, W., Ito, Y., and Matsumoto, T.
(2003). Expression of RANK is dependent upon differentiation
into the macrophage/osteoclast lineage: Induction by
1alpha,25-dihydroxyvitamin D3 and TPA in a human myelomonocytic cell line, HL60. Bone 32, 621–629.
Knudsen, T., Martin, M., Chandler, K., Kleinstreuer, N., Judson,
R., and Sipes, N. (2013). Predictive models and computational
toxicology. Methods Mol. Biol. 947, 343–374.
Knudsen, T. B., Houck, K. A., Sipes, N. S., Singh, A. V., Judson, R.
S., Martin, M. T., Weissman, A., Kleinstreuer, N. C.,
Mortensen, H. M., Reif, D. M., et al. (2011). Activity profiles of
309 ToxCast chemicals evaluated across 292 biochemical targets. Toxicology 282, 1–15.
Lamb, J., Crawford, E. D., Peck, D., Modell, J. W., Blat, I. C., Wrobel,
M. J., Lerner, J., Brunet, J. P., Subramanian, A., Ross, K. N., et al.
(2006). The Connectivity Map: Using gene-expression signatures to connect small molecules, genes, and disease. Science
313, 1929–1935.
Li, J., Zheng, S., Chen, B., Butte, A. J., Swamidass, S. J., and Lu, Z.
(2015). A survey of current trends in computational drug repositioning. Brief. Bioinformatics.
Li, P. Y., Chang, Y. C., Tzang, B. S., Chen, C. C., and Liu, Y. C.
(2007). Antibiotic amoxicillin induces DNA lesions in mammalian cells possibly via the reactive oxygen species. Mut.
Res. 629, 133–139.
Maggiolini, M., Donze, O., Jeannin, E., Ando, S., and Picard, D.
(1999). Adrenal androgens stimulate the proliferation of
breast cancer cells as direct activators of estrogen receptor
alpha. Cancer Res. 59, 4864–4869.
Maggiolini, M. (2004). Xenoestrogens and the induction of proliferative effects in breast cancer cells via direct activation of
oestrogen receptor alpha. Food Addit. Contam. 21, 134–144.
Muniyappa, H., Song, S., Mathews, C. K., and Das, K. C. (2009).
Reactive oxygen species-independent oxidation of thioredoxin in hypoxia: Inactivation of ribonucleotide reductase
and redox-mediated checkpoint control. J. Biol. Chem. 284,
17069–17081.
Naciff, J. M., Khambatta, Z. S., Reichling, T. D., Carr, G. J., Tiesman, J.
P., Singleton, D. W., Khan, S. A., and Daston, G. P. (2010). The
genomic response of Ishikawa cells to bisphenol A exposure is
dose- and time-dependent. Toxicology 270, 137–149.
NRC. (2007). Toxicity Testing in the 21st Century: A Vision and a
Strategy. National Academies Press, Washington, DC.
Nuclear Receptors Nomenclature Committee. (1999). A unified
nomenclature system for the nuclear receptor superfamily.
Cell 97, 161–163.
Nuwaysir, E. F., Bittner, M., Trent, J., Barrett, J. C., and Afshari, C.
A. (1999). Microarrays and toxicology: The advent of toxicogenomics. Mol. Carcinog. 24, 153–159.
OECD. (2010). Test No. 487: In Vitro Mammalian Cell Micronucleus
Test. OECD Publishing, Paris.
OECD. (2012). Draft Template, and Guidance on Developing and
Assessing the Completeness of Adverse Outcome Pathways
(AOPs). OECD Publishing, Paris.
OECD. (2014). Test No. 473: In Vitro Mammalian Chromosomal
Aberration Test. OECD Publishing, Paris.
Olsen, C. M., Meussen-Elholm, E. T., Roste, L. S., and Tauboll, E.
(2004). Antiepileptic drugs inhibit cell growth in the human
breast cancer cell line MCF7. Mol. Cell. Endocrinol. 213, 173–179.
Rao, X., Di Leva, G., Li, M., Fang, F., Devlin, C., Hartman-Frey, C.,
Burow, M. E., Ivan, M., Croce, C. M., and Nephew, K. P. (2011).
MicroRNA-221/222 confers breast cancer fulvestrant resistance by
regulating multiple signaling pathways. Oncogene 30, 1082–1097.
Rathinasamy, K., Jindal, B., Asthana, J., Singh, P., Balaji, P. V., and
Panda, D. (2010). Griseofulvin stabilizes microtubule dynamics, activates p53 and inhibits the proliferation of MCF-7
cells synergistically with vinblastine. BMC Cancer 10, 213.
Recchia, A. G., Vivacqua, A., Gabriele, S., Carpino, A., Fasanella,
G., Rago, V., Bonofiglio, D., Skandrani, D., Gaubin, Y., Beau, B.,
et al. (2006). Effect of selected insecticides on growth rate and
stress protein expression in cultured human A549 and SH-SY5Y cells. Toxicol. In Vitro 20, 1378–1386.
Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., and
Smyth, G. K. (2008). limma powers differential expression
analyses for RNA-sequencing and microarray studies. Nucl.
Acids Res. 43, e47.
Smyth, G. (2005). Limma: Linear models for microarray data. In
Bioinformatics and Computational Biology Solutions Using R and
Bioconductor (R. Gentleman, V. J. Carey, W. Huber, R. A. Irizarry,
and S. Dudoit, Eds.), pp. 397–420. Springer, New York, NY.
Stephens, M. L., Barrow, C., Andersen, M. E., Boekelheide, K.,
Carmichael, P. L., Holsapple, M. P., and Lafranconi, M. (2012).
Accelerating the development of 21st-century toxicology:
Outcome of a Human Toxicology Project Consortium workshop. Toxicol. Sci. 125, 327–334.
Tang, H. Y., Lin, H. Y., Zhang, S., Davis, F. B., and Davis, P. J.
(2004). Thyroid hormone causes mitogen-activated protein
kinase-dependent phosphorylation of the nuclear estrogen
receptor. Endocrinology 145, 3265–3272.
Thomas, R. S., Philbert, M. A., Auerbach, S. S., Wetmore, B. A.,
Devito, M. J., Cote, I., Rowlands, J. C., Whelan, M. P., Hays, S. M.,
Andersen, M. E., et al. (2013). Incorporating new technologies
into toxicity testing and risk assessment: Moving from 21st century vision to a data-driven framework. Toxicol. Sci. 136, 4–18.
Thome-Kromer, B., Bonk, I., Klatt, M., Nebrich, G., Taufmann, M.,
Bryant, S., Wacker, U., and Kopke, A. (2003). Toward the identification of liver toxicity markers: A proteome study in
human cell culture and rats. Proteomics 3, 1835–1862.
Tice, R. R., Austin, C. P., Kavlock, R. J., and Bucher, J. R. (2013).
Improving the human hazard characterization of chemicals:
A Tox21 update. Environ. Health Perspect. 121, 756–765.
Vidovic, D., Koleti, A., and Schurer, S. C. (2014). Large-scale integration of small molecule-induced genome-wide transcriptional responses, Kinome-wide binding affinities and cell-growth inhibition profiles reveal global trends characterizing
systems-level drug action. Front. Genet. 5, 342.
Vinggaard, A. M., Joergensen, E. C., and Larsen, J. C. (1999). Rapid
and sensitive reporter gene assays for detection of antiandrogenic and estrogenic effects of environmental chemicals.
Toxicol. Appl. Pharmacol. 155, 150–160.
Vinken, M. (2013). The adverse outcome pathway concept: A
pragmatic tool in toxicology. Toxicology 312, 158–165.
Vivacqua, A., Recchia, A. G., Fasanella, G., Gabriele, S., Carpino,
A., Rago, V., Di Gioia, M. L., Leggio, A., Bonofiglio, D., Liguori,
A., et al. (2003). The food contaminants bisphenol A and 4-nonylphenol act as agonists for estrogen receptor alpha in
MCF7 breast cancer cells. Endocrine 22, 275–284.
Wang, X. J., Hayes, J. D., Henderson, C. J., and Wolf, C. R. (2007).
Identification of retinoic acid as an inhibitor of transcription
factor Nrf2 through activation of retinoic acid receptor alpha.
Proc. Natl. Acad. Sci. U.S.A. 104, 19589–19594.
Wetmore, B. A., Allen, B., Clewell, H. J., III, Parker, T., Wambaugh,
J. F., Almond, L. M., Sochaski, M. A., and Thomas, R. S. (2014).
Incorporating population variability and susceptible subpopulations into dosimetry for high-throughput toxicity testing. Toxicol. Sci. 142, 210–224.
Wetmore, B. A., Wambaugh, J. F., Ferguson, S. S., Li, L., Clewell, H.
J., III, Judson, R. S., Freeman, K., Bao, W., Sochaski, M. A., et al.
(2013). Relative impact of incorporating pharmacokinetics on
predicting in vivo hazard and mode of action from high-throughput in vitro toxicity assays. Toxicol. Sci. 132, 327–346.
Wetmore, B. A., Wambaugh, J. F., Ferguson, S. S., Sochaski, M. A.,
Rotroff, D. M., Freeman, K., Clewell, H. J., III, Dix, D. J.,
Andersen, M. E., et al. (2012). Integration of dosimetry, exposure, and high-throughput screening data in chemical toxicity
assessment. Toxicol. Sci. 125, 157–174.
Wu, S., Blackburn, K., Amburgey, J., Jaworska, J., and Federle, T.
(2010). A framework for using structural, reactivity, metabolic and physicochemical similarity to evaluate the suitability of analogs for SAR-based toxicological assessments.
Regul. Toxicol. Pharmacol. 56, 67–81.
Zhang, J. D., Berntenis, N., Roth, A., and Ebeling, M. (2014). Data
mining reveals a network of early-response genes as a consensus signature of drug-induced in vitro and in vivo toxicity. Pharmacogenomics J. 14, 208–216.
Zimmer, M., Lamb, J., Ebert, B. L., Lynch, M., Neil, C., Schmidt, E.,
Golub, T. R., and Iliopoulos, O. (2010). The connectivity map links
iron regulatory protein-1-mediated inhibition of hypoxia-inducible factor-2a translation to the anti-inflammatory 15-deoxy-delta12,14-prostaglandin J2. Cancer Res. 70, 3071–3079.