supplementary methods - Molecular Cancer Therapeutics

Supplemental Methods for Cooperative Targets of Combined mTOR/HDAC Inhibition
Promote MYC Degradation by Simmons, Michalowski, et al.
Detailed Methods for Tumor Tissues, Immunohistochemistry
Tumor tissues were fixed (4% PFA) immediately upon collection, and transferred after 24h to
ethanol. Unstained sections of paraffin-embedded tumors were de-paraffinized, re-hydrated,
and subjected to antigen-retrieval by autoclave in pH9.0 Dako Target Retrieval Solution (Dako,
Carpinteria, CA). Slides were incubated with MYC antibody overnight at 4°C, rinsed in 0.2%
PBST, incubated with secondary antibody at RT for 1h, and treated with Vectastain ABC kit
(Vector Laboratories) and Dako Liquid DAB+ Substrate Chromogen System. Slides were
counterstained with hematoxylin. Slides labeled with anti-MYC antibody were scanned, and
color intensity and area were measured using the Color Deconvolution Algorithm in Aperio
ImageScope (v9; Aperio Technologies).
Detailed Methods for Protein Analysis, Immunoblots, Protein Simple
After the indicated treatment, cells were pelleted by centrifugation and washed twice with
ice-cold PBS; the PBS was aspirated, and cell pellets were flash frozen and stored at -80°C.
Protein (15 µg in 20 µL lysate, prepared with 75-300 µL RIPA lysis buffer) was electrophoresed
on 4-12% Bis-Tris sodium dodecyl sulfate polyacrylamide gels and transferred as described (1).
Blots were washed
in TBST, blocked for 1 hour in 10% milk/TBST, washed in TBST and incubated in primary
antibody overnight at 4°C. Primary antibodies were diluted in 5% bovine serum albumin (BSA).
Blots were washed 3x with TBST, incubated in 1:5000 HRP-conjugated anti-rabbit secondary
antibody in 5% milk/TBST for 1h, and washed 3x. SuperSignal West Dura Extended Signal
Substrate (Thermo Scientific) was used for chemiluminescence. Blots were imaged using G:Box
and GeneSnap software (Syngene).
For more precise quantification of MYC protein to determine relative half-life changes
due to drug treatment, a size-based automated capillary immunoassay (Simple
Western, ProteinSimple) was run by the Center for Cancer Research Collaborative
Protein Technology Resource group using the manufacturer’s protocol (2). For protein half-life
experiments, translation was inhibited by a 5-minute pre-treatment of the cells with 20 µg/ml
cycloheximide. Half-life experiments were conducted at 5 minutes, 4 hours and 18 hours after
single-agent and combination treatment. The proteasome inhibitor MG-132 was used along
with cycloheximide (InSolution; Calbiochem) in experiments to enrich for phosphorylated MYC
species.
Inducible MYC experiments (P493-6 Model)
The P493-6 cells were treated with 0.1 µg/ml tetracycline for 72 hours to allow for
maximal repression of the MYC transgene. For drug treatment/cell viability experiments, all
cells used in the experiment were first treated with tetracycline for 72 hours to fully repress
exogenous MYC expression. After 72 hours, the cells were collected, centrifuged, washed
twice with PBS, and then split into two flasks, one with and one without tetracycline. The cells
were then incubated for 24 hours to allow for complete restoration of exogenous MYC expression
in the cells without tetracycline. The cells were then replated in 96-well plates (both with and
without tetracycline) and treated with MS-275, rapamycin or the combination.
Primer Sequences used in ChIP-qPCR
Primers flanking c-Myc binding site in the proximal promoter of each targeted gene
Detailed Methods for Bioinformatics procedures
Microarray data pre-processing: Affymetrix (Santa Clara, CA, USA) HG-U133 Plus 2 CEL files
were imported to the R Bioconductor affy package and processed with the RMA algorithm (3).
Probe sets with low signal across all arrays were removed. When multiple probe sets corresponded
to the same gene, only the one with the maximal median intensity was retained. Approximately
14,000 genes were available for the statistical analyses. All microarray data are deposited in the GEO
repository (GSE##).
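The probe-set collapsing rule described above can be illustrated with a minimal Python sketch. The analysis itself was done in R; the function name, probe identifiers, gene symbols, and intensity values below are hypothetical and serve only to show the "maximal median intensity" selection.

```python
from statistics import median


def collapse_probe_sets(expression, probe_to_gene):
    """For each gene, keep the probe set with the highest median intensity.

    expression: dict mapping probe-set ID -> list of intensities (one per array)
    probe_to_gene: dict mapping probe-set ID -> gene symbol
    """
    best = {}  # gene -> (median intensity, probe-set ID)
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # unannotated probe sets are skipped in this sketch
        med = median(values)
        if gene not in best or med > best[gene][0]:
            best[gene] = (med, probe)
    return {gene: probe for gene, (med, probe) in best.items()}


# Invented log2 intensities for three probe sets measured on four arrays
expression = {
    "probe_a": [7.1, 7.4, 6.9, 7.2],
    "probe_b": [9.0, 9.3, 8.8, 9.1],
    "probe_c": [5.5, 5.7, 5.4, 5.6],
}
probe_to_gene = {"probe_a": "GENE1", "probe_b": "GENE1", "probe_c": "GENE2"}
selected = collapse_probe_sets(expression, probe_to_gene)
```

Here GENE1 is represented by probe_b because its median intensity is higher than that of probe_a.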
Mining of publicly available microarray datasets: Raw data (Affymetrix HG-U133_2 CEL files)
from primary bone marrow samples of multiple myeloma patients and healthy donors were
obtained from the GEO database (GSE6477) (4,5) and processed with the RMA algorithm (3).
One-way ANOVA contrasts were used to estimate the differences in gene expression between
the healthy donors (N=15) and the different classes of multiple myeloma, i.e., newly diagnosed
(N=75), relapsed (N=28), SMM (N=23, smoldering multiple myeloma), and MGUS (N=21,
monoclonal gammopathy of uncertain significance). The ANOVA t-statistic was used as the
ranking metric in the Gene Set Enrichment Analysis (GSEA). MAS5 normalized data (Affymetrix
HG-U133 Plus2) from 414 newly diagnosed multiple myeloma patients (CD-138+-selected
plasma cells from bone marrow samples) were downloaded from GEO (GSE4581) (6) and
utilized in the survival risk prediction analysis.
Analysis of Variance (ANOVA): Univariate two-way ANOVA models were applied to examine the
combined expression effects of entinostat and sirolimus (7). Specifically, a significant
interaction term in the two-by-two factorial ANOVA was used as an indication of transcriptional
synergy for the drug combination (P<0.05). Otherwise, when the interaction was not significant,
the additive two-way ANOVA model was fitted and the main effects for each individual drug
treatment tested. When the interaction was significant, the individual simple effects for the
entinostat and sirolimus treatments were estimated with one-way ANOVA contrasts. The
simple effect for the drug combination treatment was also estimated for each gene. Using the
method of Storey and Tibshirani (8), the P-values were converted to false discovery rate
Q-values. The analyses were done using the R programming language (2011) and the gregmisc and
qvalue libraries.
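As an illustration of the interaction test in a two-by-two factorial design, the following Python sketch computes the interaction contrast and its F statistic for one gene. It is not the R code used for the analyses; the function name and expression values are invented, and the P-value (which requires the F distribution with 1 and N-4 degrees of freedom) is deliberately left out.

```python
from statistics import mean


def interaction_f(control, drug_a, drug_b, combo):
    """Interaction contrast and F statistic for a 2x2 factorial design.

    Each argument is a list of log2 expression values for one treatment group.
    Returns (contrast, F, error degrees of freedom).
    """
    groups = [control, drug_a, drug_b, combo]
    means = [mean(g) for g in groups]
    # Interaction: combination effect minus the sum of the single-agent effects
    contrast = means[3] - means[1] - means[2] + means[0]
    # Pooled within-group (error) variance
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_error = sum(len(g) for g in groups) - len(groups)
    mse = sse / df_error
    var_contrast = mse * sum(1.0 / len(g) for g in groups)
    return contrast, contrast ** 2 / var_contrast, df_error


# Invented values: each drug alone lowers expression slightly, the combination strongly
contrast, f_stat, df_error = interaction_f(
    control=[8.0, 8.2, 7.8],
    drug_a=[7.5, 7.7, 7.3],
    drug_b=[7.6, 7.8, 7.4],
    combo=[5.0, 5.2, 4.8],
)
```

A large F statistic for the contrast indicates that the combination effect departs from the sum of the two single-agent effects, which is the sense of transcriptional synergy used above.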
Weighted Gene Co-Expression Network Analysis (WGCNA): We performed network modeling
using Weighted Gene Co-expression Network Analysis as proposed by Zhang and Horvath (9) and
implemented in the R WGCNA library (10). In the network, nodes represented gene expression
profiles across the experiments and the undirected edges represented the correlation-based
strength of connection among genes. In the first step, the unsigned Pearson's correlation
coefficients were determined for all pair-wise comparisons of gene-expression profiles, which
were then transformed into the adjacency matrix using a power function: $a_{ij} = |\mathrm{cor}(x_i, x_j)|^{\beta}$. The
power adjacency function converted the co-expression similarity measure into a continuous
strength of connection (weight), while allowing retention of all co-expression relationships
among genes and scale-free network properties by emphasizing large correlations at the
expense of small ones. Furthermore, the connectivity, $k_i$, of the i-th node was defined as the
sum of its adjacencies with all other nodes in the network ($k_i = \sum_{j \neq i} a_{ij}$). Building the network we
applied the power coefficient $\beta = 8$, which resulted in the connectivity distribution satisfying the
exponentially truncated power-law. In such networks the degree of connectivity of the most
connected nodes (hubs) is smaller than expected in a pure scale-free network, due to the
scale-free properties preserved within a narrower range of the node connectivities (11).
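The power adjacency and node connectivity can be sketched in plain Python as follows. This is an illustration rather than the WGCNA implementation; the function names and toy expression profiles are assumptions, and the power (called beta here) is set to 8 as in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency_matrix(profiles, beta=8):
    """Unsigned power adjacency: |cor(x_i, x_j)| ** beta, with a zero diagonal."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(profiles[i], profiles[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


def connectivity(adj):
    """Connectivity k_i: sum of adjacencies with all other nodes."""
    return [sum(row) for row in adj]


# Invented profiles for four genes across four experiments
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with gene 0
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with gene 0
    [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with gene 0
]
adj = adjacency_matrix(profiles)
k = connectivity(adj)
```

Because the adjacency is unsigned, the anti-correlated gene receives the same weight as the positively correlated one, while the correlation of 0.8 is shrunk to roughly 0.17 by the eighth power.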
In forming network modules (sets of genes whose expression profiles were highly
correlated across experiments), the adjacency was further transformed using the topological
overlap measure (interconnectedness). The topological overlap matrix (TOMij) defined
commonality of network neighbors for each pair of nodes and its symmetrical distance matrix
(dij = 1-TOMij) was used to identify highly interconnected groups of nodes with a clustering
algorithm. The network modules were detected using the agglomerative average linkage
hierarchical clustering and automated dynamic cut tree algorithm (Langfelder et al., 2008), with
a minimum module size of 20 genes. Each module represented a group of genes with similar
expression pattern summarized by the module eigengene (MEi), computed as the first principal
component of a module's expression matrix. Module eigengenes were utilized to define a
measure of module membership (MMi) for a node as the signed correlation of a node profile
with the corresponding module eigengene.
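For orientation, the topological overlap and its dissimilarity can be sketched in Python as below. The formula is the standard unsigned topological overlap from the weighted network literature, which is assumed here because the text does not write it out; the function names and the three-node adjacency matrix are invented.

```python
def topological_overlap(adj):
    """Topological overlap for a symmetric adjacency matrix with a zero diagonal.

    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)
    """
    n = len(adj)
    k = [sum(row) for row in adj]
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The zero diagonal removes the u == i and u == j terms automatically
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


def tom_dissimilarity(tom):
    """Distance used for clustering: d_ij = 1 - TOM_ij."""
    return [[1.0 - t for t in row] for row in tom]


# Invented adjacency matrix for three nodes
toy_adj = [
    [0.0, 0.8, 0.5],
    [0.8, 0.0, 0.4],
    [0.5, 0.4, 0.0],
]
tom = topological_overlap(toy_adj)
dist = tom_dissimilarity(tom)
```

The resulting distance matrix is what would be passed to average linkage hierarchical clustering; the clustering and dynamic tree cut steps are not reproduced in this sketch.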
To assess which modules captured genes relevant to particular drug treatments, the
two-way ANOVA gene significance (GSi = -log10 P-valuei) was integrated with the network
concepts of module significance (MSi) and intramodular connectivity (kINi). The module
significance measure was calculated as the average gene significance for all nodes in a
particular module. Intramodular connectivity for the i-th node quantified its co-expression with
all the other nodes in a given module by the sum of a node's adjacencies within the module.
The relation between the intramodular connectivity and gene significance was estimated with
Pearson's correlation coefficients and Fisher's asymptotic tests implemented in the WGCNA
package. A combination of module significance equal to or greater than 2.0 (negative log10 of
0.01) with a significant correlation of gene significance and intramodular connectivity
(Bonferroni corrected P-value<0.05) was used to associate a network module with a drug
response.
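The gene significance, module significance, and intramodular connectivity measures can be illustrated with a short Python sketch. The gene names, P-values, and adjacencies are invented, and the correlation test between gene significance and intramodular connectivity is not included.

```python
import math


def gene_significance(p_values):
    """GS_i = -log10(P-value_i) for each gene."""
    return {gene: -math.log10(p) for gene, p in p_values.items()}


def module_significance(gs, module_genes):
    """Average gene significance over the genes of one module."""
    return sum(gs[g] for g in module_genes) / len(module_genes)


def intramodular_connectivity(adjacency, module_genes):
    """kIN_i: sum of a gene's adjacencies with the other genes of its module."""
    return {
        g: sum(adjacency[g][h] for h in module_genes if h != g)
        for g in module_genes
    }


# Invented two-way ANOVA P-values and pairwise adjacencies
p_values = {"g1": 0.001, "g2": 0.01, "g3": 0.0001, "g4": 0.5}
adjacency = {
    "g1": {"g2": 0.7, "g3": 0.6, "g4": 0.1},
    "g2": {"g1": 0.7, "g3": 0.5, "g4": 0.2},
    "g3": {"g1": 0.6, "g2": 0.5, "g4": 0.1},
    "g4": {"g1": 0.1, "g2": 0.2, "g3": 0.1},
}
module_genes = ["g1", "g2", "g3"]

gs = gene_significance(p_values)
ms = module_significance(gs, module_genes)
k_in = intramodular_connectivity(adjacency, module_genes)
passes_threshold = ms >= 2.0  # module significance cutoff stated in the text
```

In this toy module the gene significances are 3, 2, and 4, so the module significance is 3 and the module meets the 2.0 cutoff.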
In the final step we selected a top connectivity network. Spurious or isolated
connections with topological overlap less than 0.25 were removed. In addition, nodes were
selected based upon the measure of module membership (absolute value of MM > 0.8) and the
gene significance of the module-specific drug effects (GS > 2). Extremely highly connected
nodes (hub genes) were defined within each module, setting the cutoff threshold for scaled
intramodular connectivity (kINsc = kIN/kINmaximum) to 0.6 and pairwise adjacency to 0.66
(corresponding to the pairwise Pearson's correlation coefficient of 0.95).
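The correspondence between the adjacency cutoff and the correlation coefficient follows from the power of 8 used for the network, as the sketch below checks numerically. The hub selection helper, its inclusive comparison, and the connectivity values are assumptions made for illustration.

```python
beta = 8
correlation_cutoff = 0.95
# |r| ** beta with r = 0.95 gives the pairwise adjacency cutoff of about 0.66
adjacency_cutoff = correlation_cutoff ** beta


def scaled_connectivity(k_in):
    """kINsc = kIN / maximum kIN within the module."""
    k_max = max(k_in.values())
    return {gene: k / k_max for gene, k in k_in.items()}


def hub_genes(k_in, cutoff=0.6):
    """Genes whose scaled intramodular connectivity reaches the cutoff."""
    scaled = scaled_connectivity(k_in)
    return sorted(gene for gene, value in scaled.items() if value >= cutoff)


# Invented intramodular connectivities for one module
toy_k_in = {"g1": 12.0, "g2": 9.0, "g3": 6.0}
hubs = hub_genes(toy_k_in)
```

With these values the scaled connectivities are 1.0, 0.75, and 0.5, so only g1 and g2 qualify as hubs under the 0.6 cutoff.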
Functional over-representation: The NIH Database for Annotation, Visualization, and Integrated
Discovery (DAVID) Bioinformatics Resource was used to determine over-representation of Gene
Ontology (GO) terms (12). DAVID's GO FAT functional categories (GO subsets with broadest
terms filtered out) were tested. The significance of the functional enrichment was identified
with a modified Fisher's exact test (EASE score) followed by the Benjamini correction for
multiple comparisons and using 0.05 as a p-value cutoff. Lists of enriched GO terms were
summarized with semantically non-redundant terms using the REVIGO algorithm (13) with
SimRel and medium similarity options.
Gene Set Enrichment Analysis (GSEA): Gene Set Enrichment Analysis (GSEA) was applied as
described previously (14) to test the enrichment of the WGCNA network modules in the human
microarray data with respect to multiple myeloma patients and healthy donors (4,5). We
performed the pre-ranked GSEA version (14) with 5000 permutations of the module gene sets.
The data were ranked based on the t-statistic from one-way ANOVA planned comparisons. An
FDR q-value less than 0.1 was considered significant.
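The running-sum statistic at the core of pre-ranked GSEA can be sketched in Python as follows. This is a simplified illustration of the weighted enrichment score, not the GSEA software; the function name, gene labels, and t-statistics are invented, and the permutation-based significance and FDR estimation are omitted.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for a pre-ranked gene list.

    ranked: list of (gene, statistic) pairs sorted by decreasing statistic
    gene_set: set of gene identifiers (for example, one network module)
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    n_hits = sum(in_set)
    n_miss = len(ranked) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    hit_norm = sum(abs(s) ** weight for (_, s), hit in zip(ranked, in_set) if hit)
    running = 0.0
    best = 0.0
    for (_, stat), hit in zip(ranked, in_set):
        if hit:
            running += abs(stat) ** weight / hit_norm  # step up at set members
        else:
            running -= 1.0 / n_miss                    # step down otherwise
        if abs(running) > abs(best):
            best = running                             # maximum deviation from zero
    return best


# Invented t-statistics; the gene set sits at the top of the ranking
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es = enrichment_score(ranked, {"A", "B"})
```

A score near +1 indicates that the set members are concentrated at the top of the ranked list, and a score near -1 that they are concentrated at the bottom.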
Prognostic classifier: We tested whether the cooperative gene signature of entinostat and
sirolimus was predictive of overall survival in patients with MM disease (6). A multivariate
survival risk predictor was built using the principal components method of Bair and Tibshirani
(15) as implemented in the BRB-Array Tools developed by Dr. Richard Simon and BRB-Array
Tools Development Team (http://linus.nci.nih.gov/BRB-ArrayTools.html). The applied model is
based on 'supergenes' that were defined here with the first three principal component linear
combinations from genes whose expression was univariately correlated with survival (Cox
regression p-value < 0.05). The 'supergene' expression is related to survival time using Cox
proportional hazards modeling to derive a regression coefficient (weight) for each 'supergene',
which is then used for computing the risk score as the weighted combination of the
'supergenes'. This multivariate model was tested in two complementary validation schemes
(10-fold cross-validation and single training/test split) to assign risk-group membership for
clinical samples. Kaplan-Meier survival curves were plotted for the low- and high-risk groups (a
risk score lower or higher than the 50th percentile in the training set). To assess the significance
of prediction in the cross-validated model a permutation log-rank test was used. The survival
data were randomly permuted among the patients, and the whole risk prediction
procedure was repeated 5000 times. The p-value was calculated as the proportion of permuted test statistics
that were as large as or larger than the observed value. The survival difference between the
two risk groups in the single split validation procedure was assessed by the asymptotic log-rank
test. A p-value of 0.05 was chosen as the significance threshold for both the log-rank tests.
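The risk score, the median split, and the permutation p-value can be illustrated with the following Python sketch. It does not reproduce BRB-ArrayTools; the Cox coefficients, 'supergene' values, and permuted statistics are invented, and the Cox fitting and log-rank computation are assumed to have been done elsewhere.

```python
from statistics import median


def risk_scores(supergenes, coefficients):
    """Risk score per sample: weighted combination of 'supergene' values."""
    return [sum(w * v for w, v in zip(coefficients, sample)) for sample in supergenes]


def assign_risk_groups(scores, threshold):
    """High risk if the score exceeds the training-set threshold, else low risk."""
    return ["high" if s > threshold else "low" for s in scores]


def permutation_p_value(observed, permuted):
    """Proportion of permuted statistics as large as or larger than the observed one."""
    return sum(1 for t in permuted if t >= observed) / len(permuted)


# Invented first three principal components for four training samples
training = [
    [1.0, 0.0, 2.0],
    [-1.0, 1.0, 0.0],
    [2.0, 1.0, 1.0],
    [0.0, -1.0, -1.0],
]
coefficients = [0.5, -0.2, 0.1]  # hypothetical Cox regression weights

scores = risk_scores(training, coefficients)
threshold = median(scores)       # 50th percentile of the training-set scores
groups = assign_risk_groups(scores, threshold)

# Invented log-rank statistics from five permutations of the survival data
p_value = permutation_p_value(4.0, [1.2, 0.4, 5.1, 2.0, 0.9])
```

In the procedure described above the list of permuted statistics would contain 5000 values, each obtained by repeating the whole risk prediction on permuted survival data.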
Oncomine “Molecular Concept” Analysis: In the Oncomine™ Platform (Thermo Fisher, Ann
Arbor, MI) a concept is an aspect of biology represented by a molecular signature (16).
Thirty-three down-regulated genes in the 37-gene predictor were entered into Oncomine analyses as
a concept signature and the associated cancer vs. normal concepts were identified using the
default parameters for significant overlap with other predetermined signatures (Fisher’s exact
test odds ratio>2, p-value<1e-4). A subset of results is included for the dataset with the highest
overlap statistic for cancer types, for which at least ten significantly overlapping concepts were
found. Only two of the four genes up-regulated in the 37-gene signature were mapped in the
Oncomine database; these two were not analyzed because such a small set is insufficient for this
type of analysis.
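The overlap criterion (odds ratio and Fisher's exact test p-value) can be illustrated with a small Python sketch. Oncomine's internal computation is not documented here, so the one-sided test, the function name, and all gene counts other than the 33-gene signature size are assumptions.

```python
from math import comb


def overlap_fisher(n_signature, n_concept, n_overlap, n_universe):
    """Odds ratio and one-sided Fisher's exact p-value for a gene-set overlap."""
    a = n_overlap                    # genes in both sets
    b = n_signature - n_overlap      # signature only
    c = n_concept - n_overlap        # concept only
    d = n_universe - a - b - c       # in neither set
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    # Hypergeometric upper tail: probability of an overlap of at least n_overlap
    upper = min(n_signature, n_concept)
    tail = sum(
        comb(n_concept, x) * comb(n_universe - n_concept, n_signature - x)
        for x in range(n_overlap, upper + 1)
    )
    return odds_ratio, tail / comb(n_universe, n_signature)


# Small case that can be checked by hand: 10 genes, sets of 3 and 4, overlap 2
small_odds, small_p = overlap_fisher(3, 4, 2, 10)

# Invented larger case: 33-gene signature, 200-gene concept, 10 shared genes
odds_ratio, p_value = overlap_fisher(33, 200, 10, 10000)
significant = odds_ratio > 2 and p_value < 1e-4
```

For the small case the tail probability is (36 + 4) / 120 = 1/3 and the odds ratio is 5, which makes the calculation easy to verify before applying it to realistic set sizes.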
Cell Line Transcriptional Response: To quantify the overall transcriptional response in a MM cell
line, a score was calculated as a weighted average of log2 fold changes in the 37 genes (Nanostring
assays). A positive (+1) or negative (-1) weight was assigned to each gene, reflecting the
directionality of the combined treatment effect of mTORi and HDACi established for the L363 cell
line in the microarray experiment. Thus, the higher the weighted score, the greater the response
concordant with the response observed in L363 cells. Cell lines were classified as sensitive if the
decrease in viability was greater than 50%. The non-parametric Wilcoxon rank-sum test was used to
compare the response scores between sensitive and non-sensitive cell lines.
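The response score can be sketched in Python as follows. Dividing by the number of genes is an assumption about how the weighted average was normalized, and the gene names, weights, fold changes, and viability value are invented.

```python
def response_score(log2_fold_changes, weights):
    """Weighted average of log2 fold changes using +1 / -1 direction weights.

    Only genes present in both dictionaries contribute to the score.
    """
    genes = [g for g in weights if g in log2_fold_changes]
    return sum(weights[g] * log2_fold_changes[g] for g in genes) / len(genes)


def is_sensitive(viability_decrease_percent):
    """Sensitive if viability decreased by more than 50%."""
    return viability_decrease_percent > 50.0


# Invented direction weights (from a reference cell line) and fold changes
weights = {"GENE_A": -1, "GENE_B": -1, "GENE_C": 1}
log2_fc = {"GENE_A": -2.0, "GENE_B": -1.0, "GENE_C": 1.5}

score = response_score(log2_fc, weights)
sensitive = is_sensitive(62.0)
```

Genes that change in the same direction as in the reference cell line add to the score, while genes that change in the opposite direction subtract from it.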
Chi-Square Tests for Classifier Comparisons: Chi-square tests and mosaic plots with residual-based
shadings were used to test conditional independence between our 37-gene classifier and the 7
molecular subgroups from MM patients (6), with and without stratification by proliferation
index (PI), using the R vcd package (17).
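The quantities behind a chi-square test and residual-based shading can be illustrated with a short Python sketch. It is not the vcd implementation; the function name and the two-by-two table of counts are invented, and the p-value and the stratified (conditional) analysis are omitted.

```python
import math


def chi_square_with_residuals(table):
    """Chi-square statistic, degrees of freedom, and Pearson residuals.

    table: list of rows of observed counts (for example, risk group by subgroup)
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            residual = (observed - expected) / math.sqrt(expected)
            res_row.append(residual)
            statistic += residual ** 2  # chi-square is the sum of squared residuals
        residuals.append(res_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, residuals


# Invented counts: two classifier risk groups (rows) by two subgroups (columns)
observed = [[30, 10], [10, 30]]
chi2, dof, residuals = chi_square_with_residuals(observed)
```

Cells with large positive or negative Pearson residuals are the ones a residual-based shading would highlight in a mosaic plot.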
ChIP-Seq dataset mining: ChIP-seq datasets were downloaded from the GEO database
(GSE36354) (18). All the analyses were done essentially as we have described previously (19).
Briefly, sequencing tags were aligned to the human genome build hg19 with a tolerance of
one mismatch for each tag. The peaks were called using the MACS algorithm (Zhang et al.,
2008) with p<10e-5, fold change>5. The peaks were then assigned to all the genes in the RefSeq
database. The promoter region was defined as 2.5 kb upstream and downstream of the
transcription start site (TSS). The gene body region was defined as extending from 2.5 kb
downstream of the TSS to the end of the 3’ UTR. Distal_upstream was defined as 2.5 kb to 25 kb
upstream of the TSS, while distal_downstream was defined as the end of the 3’ UTR to 25 kb
downstream of the end of the gene.
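The region definitions can be expressed as a small Python classifier. This is a sketch under stated assumptions: the function name, the strand handling, the treatment of boundary positions, and the "unassigned" label for peaks outside all four regions are not taken from the original pipeline, and the coordinates are invented.

```python
PROMOTER_HALF_WIDTH = 2500   # 2.5 kb on either side of the TSS
DISTAL_LIMIT = 25000         # 25 kb


def classify_peak(peak_pos, tss, gene_end, strand="+"):
    """Assign a peak position to a region of one gene.

    tss: transcription start site; gene_end: end of the 3' UTR.
    Positions are genomic coordinates; strand is "+" or "-".
    """
    sign = 1 if strand == "+" else -1
    offset = (peak_pos - tss) * sign       # positive means downstream of the TSS
    gene_length = (gene_end - tss) * sign
    if -PROMOTER_HALF_WIDTH <= offset <= PROMOTER_HALF_WIDTH:
        return "promoter"
    if PROMOTER_HALF_WIDTH < offset <= gene_length:
        return "gene_body"
    if -DISTAL_LIMIT <= offset < -PROMOTER_HALF_WIDTH:
        return "distal_upstream"
    if gene_length < offset <= gene_length + DISTAL_LIMIT:
        return "distal_downstream"
    return "unassigned"


# Invented plus-strand gene spanning 100,000 to 150,000
regions = [
    classify_peak(101000, 100000, 150000),
    classify_peak(120000, 100000, 150000),
    classify_peak(90000, 100000, 150000),
    classify_peak(160000, 100000, 150000),
    classify_peak(200000, 100000, 150000),
]
# Same locus on the minus strand: higher coordinates are upstream of the TSS
minus_region = classify_peak(160000, 150000, 100000, strand="-")
```

Checking the promoter window first means that a peak within 2.5 kb of the TSS is never counted as gene body or distal upstream, which matches the non-overlapping definitions in the text.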
References
1. Simmons JK, Patel J, Michalowski A, Zhang S, Wei BR, Sullivan P, et al. TORC1 and class I
HDAC inhibitors synergize to suppress mature B cell neoplasms. Molecular oncology
2014;8(2):261-72 doi 10.1016/j.molonc.2013.11.007.
2. Chen JQ, Heldman MR, Herrmann MA, Kedei N, Woo W, Blumberg PM, et al. Absolute
quantitation of endogenous proteins with precision and accuracy using a capillary
Western system. Analytical biochemistry 2013;442(1):97-103 doi
10.1016/j.ab.2013.07.022.
3. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al.
Exploration, normalization, and summaries of high density oligonucleotide array probe
level data. Biostatistics 2003;4(2):249-64 doi 10.1093/biostatistics/4.2.249.
4. Carrasco DR, Tonon G, Huang Y, Zhang Y, Sinha R, Feng B, et al. High-resolution genomic
profiles define distinct clinico-pathogenetic subgroups of multiple myeloma patients.
Cancer Cell 2006;9(4):313-25 doi 10.1016/j.ccr.2006.03.019.
5. Chng WJ, Kumar S, Vanwier S, Ahmann G, Price-Troska T, Henderson K, et al. Molecular
dissection of hyperdiploid multiple myeloma by gene expression profiling. Cancer Res
2007;67(7):2982-9 doi 10.1158/0008-5472.CAN-06-4046.
6. Zhan F, Huang Y, Colla S, Stewart JP, Hanamura I, Gupta S, et al. The molecular
classification of multiple myeloma. Blood 2006;108(6):2020-8 doi
10.1182/blood-2005-11-013458.
7. Slinker BK. The statistics of synergism. J Mol Cell Cardiol 1998;30(4):723-31 doi
10.1006/jmcc.1998.0655.
8. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad
Sci U S A 2003;100(16):9440-5 doi 10.1073/pnas.1530509100.
9. Zhang B, Horvath S. A general framework for weighted gene co-expression network
analysis. Stat Appl Genet Mol Biol 2005;4:Article17 doi 10.2202/1544-6115.1128.
10. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network
analysis. BMC bioinformatics 2008;9:559 doi 10.1186/1471-2105-9-559.
11. Langfelder P, Zhang B, Horvath S. Defining clusters from a hierarchical cluster tree: the
Dynamic Tree Cut package for R. Bioinformatics 2008;24(5):719-20 doi
10.1093/bioinformatics/btm563.
12. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene
lists using DAVID bioinformatics resources. Nat Protoc 2009;4(1):44-57 doi
10.1038/nprot.2008.211.
13. Supek F, Bosnjak M, Skunca N, Smuc T. REVIGO summarizes and visualizes long lists of
gene ontology terms. PloS one 2011;6(7):e21800 doi 10.1371/journal.pone.0021800.
14. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene
set enrichment analysis: a knowledge-based approach for interpreting genome-wide
expression profiles. Proc Natl Acad Sci U S A 2005;102(43):15545-50 doi
10.1073/pnas.0506580102.
15. Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene
expression data. PLoS Biol 2004;2(4):E108 doi 10.1371/journal.pbio.0020108.
16. Rhodes DR, Kalyana-Sundaram S, Mahavisno V, Varambally R, Yu J, Briggs BB, et al.
Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene
expression profiles. Neoplasia 2007;9(2):166-80.
17. Zeileis A, Meyer D, Hornik K. Residual-based shadings for visualizing (conditional)
independence. J Comput Graph Stat 2007;16(3):507-25 doi
10.1198/106186007x237856.
18. Lin CY, Loven J, Rahl PB, Paranal RM, Burge CB, Bradner JE, et al. Transcriptional
amplification in tumor cells with elevated c-Myc. Cell 2012;151(1):56-67 doi
10.1016/j.cell.2012.08.026.
19. Li M, He Y, Dubois W, Wu X, Shi J, Huang J. Distinct regulatory mechanisms and
functions for p53-activated and p53-repressed DNA damage response genes in
embryonic stem cells. Molecular cell 2012;46(1):30-42 doi
10.1016/j.molcel.2012.01.020.