Molecular diagnosis of colorectal tumors by expression profiles of 50

ã
Oncogene (2002) 21, 4120 ± 4128
2002 Nature Publishing Group All rights reserved 0950 ± 9232/02 $25.00
www.nature.com/onc
Molecular diagnosis of colorectal tumors by expression pro®les of 50 genes
expressed di€erentially in adenomas and carcinomas
Yu-Min Lin1,2, Yoichi Furukawa1, Tatsuhiko Tsunoda3, Chung-Tai Yue2, Kou-Ching Yang2 and
Yusuke Nakamura*,1
1
Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
Japan; 2Department of Internal Medicine, Shin Kong Wu Ho-Su Memorial Hospital, 95 Wen Chang Road, Taipei 11160, Taiwan;
3
SNP Research Center, Riken (Institute of Physical and Chemical Research), 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639,
Japan
Most colon cancers are thought to develop through the
`adenoma-to-carcinoma sequence' model. To elucidate
the mechanisms underlying this pathway, we analysed
gene-expression pro®les of 20 colorectal tumors (nine
adenomas and 11 di€erentiated adenocarcinomas) by
means of a cDNA microarray representing 23 040 genes
coupled with laser-capture microdissection. A twodimensional hierarchical clustering analysis of expression
pro®les of the 20 tumors correctly separated the
carcinoma group from the adenoma group. Furthermore
we identi®ed 51 genes whose expression was commonly
up-regulated, 376 that were commonly down-regulated in
both types of tumors as opposed to normal colonic
epithelium and 50 whose expression levels were
signi®cantly di€erent between adenomas and carcinomas.
On the basis of expression pro®les of the 50 discriminating genes, we established a scoring system to separate
adenomas from carcinomas. Application of this scoring
system for evaluating ®ve additional colorectal tumors
correctly predicted their histological features. The
genome-wide information reported here should contribute
to a more profound understanding of colorectal tumorigenesis, particularly of adenoma-carcinoma progression,
and provide indicators for developing novel strategies to
diagnose, treat, and ultimately prevent colorectal
carcinomas.
Oncogene (2002) 21, 4120 ± 4128. doi:10.1038/sj.onc.
1205518
Keywords: colon cancer; adenoma-carcinoma sequence;
cDNA microarray; expression pro®le
Introduction
Colorectal carcinoma is a leading cause of cancer
deaths in developed countries. Recent medical advances have made great progress in diagnosis and
*Correspondence: Y Nakamura; E-mail: [email protected]
Received 22 October 2001; revised 11 March 2002; accepted 22
March 2002
treatment, but many patients with advanced colorectal
cancers still wait for more e€ective but less harmful
therapeutic drugs. Development of a sensitive, speci®c
and convenient diagnostic system for detecting very
early-stage colorectal cancers or pre-malignant lesions
could ultimately eliminate this disease. Furthermore,
identi®cation of e€ective preventive strategies would
release people from fear of this life-threatening illness.
Muto et al. (1975) ®rst described pre-malignant
lesions of the colon on the basis of careful histological
examination and proposed a model of `adenomacarcinoma sequence of the colon'. Subsequent molecular studies during the past two decades have opened
the door to a better understanding of colorectal
carcinogenesis and established a view of multi-step
carcinogenesis in the colon that agrees well with
Muto's. Investigations of hereditary colon cancers have
improved the general understanding of mechanisms
underlying sporadic colorectal carcinomas. However,
our knowledge of genes involved in colorectal tumors
is still fragmentary and awaits genome-wide information about how expression pro®les are altered during
multi-step carcinogenesis.
Analysis of expression pro®les using cDNA microarrays is a promising method for identifying genes
whose expression is altered by response to various
physiological conditions or drugs, or pathological
conditions (Afshari et al., 1999; Kallioniemi, 2001).
This technology can also provide systematic expression
pro®les that may re¯ect the physiological or pathological phenotype of a subset of a cell population. This
approach has proven to be useful for analysing genes
involved in various neoplasms and for detecting
speci®c phenotypes (Ono et al., 2000; Okabe et al.,
2001; Kitahara et al., 2001). We were interested in
examining expression pro®les of pre-malignant and
malignant lesions of the colon, because comparison of
these two groups of tumors would provide information
about genes that undergo altered expression during
progression of an adenoma to a carcinoma.
We report here a genome-wide analysis of geneexpression pro®les in nine pre-malignant lesions and 11
di€erentiated adenocarcinomas of the colon by means
of a cDNA microarray representing 23 040 genes. We
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4121
identi®ed a panel of genes that were commonly
deregulated in both colorectal adenomas and carcinomas, and also 50 genes whose expression di€ered
signi®cantly between benign and malignant tumors. We
established a scoring system on the basis of expression
of selected genes that may assist clinicians in
distinguishing adenomas from carcinomas. The data
reported here will add to a comprehensive understanding of colorectal carcinogenesis, facilitate development of novel diagnostic strategies, and provide
clues for identi®cation of molecular targets for
therapeutic drugs and preventive agents.
Results
Two-dimensional hierarchical clustering
To analyse correlation among the samples and genes,
we applied a two-dimensional hierarchical clustering
algorithm (http://www.microarrays.org/software) using
the data obtained from the 20 tumors. We excluded
genes for further analysis when average Cy3-and Cy5¯uorescence intensities were below 16105 units, and
selected a set of genes whose values were obtained in
more than 16 cases (80%). We further excluded genes
with standard deviations of observed values less than
0.5. A total of 771 genes passed this ®lter for
subsequent clustering analysis.
In the sample axis, the 20 samples were separated
into two major groups based on their expression
pro®les; all of the nine tumors belonging to one group
were adenomas and the other major group consisted of
the 11 carcinomas (Figure 1). This result is consistent
with a recent report that four colon adenomas were
separated from 18 adenocarcinomas using oligonucleotide arrays (Notterman et al., 2001). The expression
pro®les obtained on our microarray clearly demonstrated that adenomas and adenocarcinomas have
speci®c expression pro®les, and indicated that molecular classi®cation of colonic tumors is feasible. Our
data also disclosed the molecular nature of each tumor
type. In the gene axis, we later focused on two gene
clusters, cluster A (more abundant expression in
carcinomas than adenomas) and cluster B (more
abundant expression in adenomas than carcinomas).
Genes commonly up-regulated in colonic tumors
Since most colon cancers arise from adenomas, genes
involved in early stages of colorectal tumorigenesis are
likely to be deregulated in both types of tumors. To
identify such genes, we selected genes from our data set
of 2425 genes according to the following criteria: if the
Cy3/Cy5 ratio of the gene was 42 in more than 50%
of the tumors, we de®ned it as a commonly upregulated gene, and if the ratio was 50.5 in more than
50% of the tumors, we de®ned it as commonly downregulated. With these criteria we identi®ed 51 genes
that were commonly up-regulated in both tumor
phenotypes as compared with their corresponding
Figure 1 A two-dimensional hierarchical clustering of 771 genes
across 20 colorectal tumors. The color in each well represents
relative expression of each gene (vertical axis) in each paired
sample (horizontal axis); more intense colors re¯ect wider
di€erences between tumor and normal epithelium. Red, increased
in tumor; green, decreased; black, unchanged; gray, no expression
in the tumor cells. In the sample axis, carcinomas (T) and
adenomas (P) were separated to two di€erent trunks. In the gene
axis, 771 genes were clustered in di€erent branches according to
their similarity; the shorter the branches the greater the similarity.
Sub-clusters A and B were selected for further analysis (see text
and Figure 3)
normal epithelia (Figure 2). Among the 51 genes, 19
were involved in RNA/protein processing; e.g. ribosomes, translation elongation/initiation factors, and
chaperonins. Other up-regulated genes detected in our
experiments included oncogenes (HMGIY, DEK and
NPM1), genes encoding cell adhesion/cytoskeleton
molecules (TUBB, K-ALPHA, TGFBI, CDH3 and
PAP), genes involved in growth control (IMPDH2
and ODC1), signal transduction (BRF1, PLAB,
LAP18, CD81 and MACMARCKS), and cell-cycle
control (RAN and UBE2I); transcription factors
Oncogene
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4122
Figure 2 Fifty-one genes commonly up-regulated in both adenomas and carcinomas. (Cy3/Cy5)ave indicates average value of Cy3/
Cy5 in the 20 paired samples. Genes that appear repeatedly represent the same genes spotted on di€erent set of slides. Red,
increased in tumor; green, decreased; black, unchanged; gray, no expression in the tumor cells
(HMG1 and HMG2), tumor-associated molecules
(PPP2R1B, LDHB and SLC29A1), and others.
of cellular homeostasis in tumors (Lawrance et al.,
2001).
Genes commonly down-regulated in colonic tumors
Genes expressed differently between adenomas and
carcinomas
We also identi®ed 376 genes (including 127 expressed
sequence tags) that were commonly down-regulated in
both types of tumor by the criteria described above
(supplementary data). This group includes genes
associated with programmed cell death (CASP8,
CASP9, CFLAR, DFFA, PAWR, TNF, TNFRSF10C
and TNFRSF12), immunity (chemokine receptors such
as IL1RL2, IL17R and IL3RA), growth suppression
(Suppressin, DCN, MADH2 and SST), and tumor
suppression (TP53). Other down-regulated genes encode cell adhesion/cytoskeleton molecules (e.g.
ADAM8, AVIL, CDH17, CEACAM1, CTNNA2,
ICAPA, KRT9, and ARHGAP5), various metabolic
factors (e.g. BPHL, CA2, CA5A, HSD11B2 and
ECHS1), ion transporters (SLC15A2, SLC22A1,
SLC4A3 and SLC5A1), a natural antimicrobial
molecule (DEFA6), and others. Metabolic enzymes
and ion-transport mediators are key factors for
maintaining pivotal cellular functions such as detoxication (CA2, CA5A and BPHL) and acid-base balance.
Down-regulation of these genes indicates a disruption
Oncogene
When we focused on genes in clusters A and B
(Figure 3), we identi®ed several that regulate
bioenergetics. In cluster A, PGK1 and LDHA may
be induced by hypoxia (Semenza et al., 1994).
Proteasomes (PSMD7 and PSMB8) have been
reported to accumulate in cancer cells by glucose
starvation and hypoxia (Ogiso et al., 1999). VDAC3
is one of the voltage-dependent anion-selective
channel proteins that play important roles in
regulating mitochondrial homeostasis (Vander Heiden
et al., 2000). GSS is a regulator of oxidative stress
(Uhlig and Wendel, 1992). Some of the genes in
cluster B encode proteins that have been reported to
function in adaptation to low-oxygen conditions
(GPX2, PPIA, GAPD, ANXA2, ALDH1 and ADAR)
(Chu et al., 1993; Zhong and Simons, 1999; Hoeren
et al., 1998; Denko et al., 2000), in energy
consumption (ATP6A1, ATP1B1, and ATP5A1)
(Wodopia et al., 2000) or in carbohydrate metabolism
(GMDS).
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4123
Figure 3 Functional clusters in the gene axis. (A) Ten of 24 genes in cluster A and (B) 12 of 29 genes in cluster B were related to
bioenergetics homeostasis (labeled in blue). Genes that appear repeatedly represent the same genes spotted on di€erent set of slides
Verification of microarray data by quantitativeRT ± PCR
To examine the reliability of our microarray data, we
selected six genes, two of which (TGFBI and LAP18)
were up-regulated in both adenomas and carcinomas
and the others (HECH, NME1, TCEA1 and PSMA7)
were di€erently expressed between adenomas and
carcinomas, and examined their expression in 13
additional paired aRNA samples (seven adenomas
and six carcinomas) by quantitative RT ± PCR
(QRT ± PCR). The results were highly similar to
microarray data for all six genes (Figure 4). These
data veri®ed the reliability and rationality of our
strategy to identify genes that are commonly upregulated or di€erently expressed during development
and progression of colorectal cancer.
Comparison of expression analysis data in colon cancers
We additionally compared our data with two sets of
data reported previously. Information of gene expres-
sion pro®les in two colon cancer tissues and two noncancerous colonic mucosae analysed by means of Serial
Analysis of Gene Expression was provided by National
Center for Biotechnology Information (http://
www.ncbi.nlm.nih.gov/SAGE/). Among 100 tags of
genes expressed most di€erently between the cancer
and non-cancerous tissues, 50 tags corresponded to
independent unique genes in UniGene database.
Among 50 genes corresponding to these 50 tags, four
up-regulated genes and 24 down-regulated genes in
cancer were contained in our microarray. One of the
four up-regulated gene, TGFBI (transforming growth
factor, beta-induced, 68 kD) also showed elevated
expression in our data (Figure 2). Eighteen of the 24
down-regulated genes including TSPAN-1 (tetraspan1),
GPA33 (glycoprotein A33), CA1 (carbonic anhydrase
1), MT2A (metallothionein 2A), CEACAM1 (carcinoembryonic antigen-related cell adhesion molecule 1),
YF13H12 (protein expressed in thyroid), MUC13
(mucin 13), HLAB (major histocompatibility complex
class 1B), DUSP1 (dual speci®city phosphatase 1),
GSN (gelsolin), LGALS4 (galectin 4), CKB (creatinine
Oncogene
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4124
Figure 4 Validation of microarray data. (A) Log2(Cy3/Cy5) values of 20 samples (11 carcinomas and nine adenomas) in cDNA
microarray analysis. (B) Log2(Tumor/Normal) values for 13 additional samples (six carcinomas and seven adenomas) obtained by
QPCR. The expression ratio (Tumor:Normal) of TGFBI, LAP18, HECH, NME1, TCEA1 and PSMA7 determined by QPCR were
in line with the microarray data for all six genes. Data are presented here as 10th to 90th percentiles of calculated values. Statistical
signi®cance was examined by Mann ± Whitney U-tests
kinase, brain type), KRT19 (keratin 19), RNASE1
(RNase A family 1), IFI27 (interferon, alpha-inducible
protein 27), PP1201, EPS8R2 (FLJ21935), and an
EST(Hs. 107139) revealed decreased expression in more
than half the cases examined by us. We also compared
a list of genes highly expressed in colon cancer tissues
compared to the matched non-cancerous tissues, which
was reported by Notterman et al. (2001) using the
A€ymetrix Human 6500 GeneChip Set, and found that
only two genes, GTF3A (general transcription factor
IIIA) and AHCY (adenosylhomocysteine hydrolase),
were in our list of frequently up-regulated genes in
cancer (Figure 5A). However, among 10 other genes
that were highly expressed in their cancer tissues and
spotted on our array slides, eight genes, such as
KIAA0101, PYCR1 (pyrroline 5-carboxylate reductase), HSPE1 (heat shock 10 kD protein 1), CDC25B
(cell division cycle 25B), CSE1L (chromosome segregation 1-like), CKS2 (CDC28 protein kinase 2), MMP1
(metalloprotenase 1), and CLNS1A (chloride channel,
nucleotide-sensitive, 1A), also showed enhanced expression in more than half of cancer tissues.
Development of a `Molecular Diagnosis Score' (MDS)
system
Among the genes expressed di€erently between adenomas and carcinomas, we identi®ed 50 whose expression
showed statistically signi®cant di€erences between the
two types of tumors (P40.01, Mann ± Whitney U-test;
Figure 5A). We developed a `Molecular Diagnosis
Score' (MDS) system based on expression pro®les of
these 50 genes (see Materials and methods) as a way to
apply that information to clinical diagnosis. The mean
score of the 11 carcinomas was 77.4+11.6, while that
Oncogene
of the nine adenomas was 75.9+14.4 (mean+s.d.,
P50.0001, Mann ± Whitney U-test; Figure 5B). We
de®ned the cut-o€ value for discriminating adenocarcinoma from adenoma at 35, an average of the mean
values of the two groups, and analysed ®ve additional
tumors to verify the reliability of the MDS system.
Among the ®ve samples tested, three that showed
scores greater than 35 (73.5, 63.2, and 64.6) all turned
out to be carcinomas by histological examination. The
two samples with scores of 10.3 and 71.4 were both
adenomas (Figure 5B). Since the distribution of MDSs
for 14 carcinomas ranging from 57.7 to 94.1 were
completely separated from that for 11 adenomas
ranging from 733.4 to 16.7, both sensitivity and
speci®city of the MDS system were 100% on the basis
of this cut-o€ value. In addition, a hierarchical
clustering analysis of all 25 samples correctly separated
adenomas from carcinomas based on the expression
pro®les of the 50 selected genes (Figure 5A).
Discussion
In this study, we disclosed common nature of adenoma
and adenocarcinoma of the colon through the analysis
of genome-wide expression pro®les. The 51 genes
commonly up-regulated in both adenomas and carcinomas included 19 involved in RNA/protein processing;
e.g. ribosomes, translation elongation/initiation factors,
and chaperonins. Ribosomes are the molecular machines
that manufacture proteins according to blueprints of
mRNAs that encode them. Interactions of the ribosome
with mRNAs, tRNAs, and a number of non-ribosomal
protein cofactors such as translation initiation/elongation factors guarantee that polypeptide chains are
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4125
Figure 5 `Molecular diagnosis score' (MDS) based on expression of 50 genes whose levels of expression were statistically di€erent
between adenomas and carcinomas. (A) Clustering analysis. The data for each gene were ®rst median-centered, and an `Average
Linkage Clustering' was subsequently applied to the data set (red, data 4median value; green, data 5median value). In the sample
axis, 25 tumors were separated to two trunks (adenoma group and carcinoma group). Asterisks (*) indicate additional test samples.
A sample, 056P3, was diagnosed as an early adenocarcinoma by histological examination. In the gene axis, the 18 genes on the top
showed higher expression in carcinoma than in adenoma, and were labeled with `1' as a sign. The 32 genes at the bottom showed
higher expression in adenoma than in carcinoma, and were labeled with `71' as a sign (see Materials and methods). Statistical
signi®cance was examined by the Mann ± Whitney U-test. (B) MDSs were presented as 10 ± 90th percentiles of the calculated values.
Asterisks denote the ®ve additional samples for validating the MDS system. Tumor 056P3 is a well-di€erentiated adenocarcinoma
Oncogene
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4126
Table 1 Genes, sequences of primers and probes used for quantitative RT ± PCR
Symbol
Primer
Probe
MDH1
F:5'-TCCCTGTTGTAATCAAGAATAAGACCT-3'
R:5'-CAGTTCCTTTGCAGTAAGATCCATC-3'
F:5'-GCAGACTCTGCGCTTGAGATC-3'
R:5'-GGGCTAGTCGCACAGACCTC-3'
F:5'-CAAATGGCTGCCAAACTGG-3'
R:5'-GGATTCTTTGTTCTTCCGCACT-3'
F:5'-GACAGCAGTGGAGAATTGATGTTT-3'
R:5'-ACAATTTGAGGACACTTCATATTTGC-3'
F:5'-GCATACAAGTTGGCAGGAACATTA-3'
R:5'-ACCACAAGCCGATCTCCTTCT-3'
F:5'-AGATGCGGAAAAACTTGACCA-3'
R:5'-AGTCAGTCTGGGTCCCACCA-3'
F:5'-GCAGCGTTATACGCAGAGCA-3'
R:5'-ACCCACGATGAGGGCAGAG-3'
5'-Vic-TTGTTGAAGGTCTCCCTATTAATGATTTCTCACGTGTamra-3'
5'-Fam-AACAAGCATCAGCGTTTTCCAGGGCT-Tamra-3'
TGFBI
LAP18
HECH
MNE1
TCEA1
PSMA7
initiated, elongated and terminated (Maguire and
Zimmerman, 2001). After translation, polypeptides
emerge from the ribosomes and enter the endoplasmic
reticulum where chaperonins may remodel the polypeptides (Boon et al., 2001). Therefore accelerated protein
synthesis appears to be a common feature of adenomas
and carcinomas and re¯ects a heavy proliferative burden
in both tumor types. In addition, our data suggest that
activation of oncogenes, aberrant transduction of
signals, deregulation of the cell cycle, impaired growth
control, and remodeling of cytoskeletal structures are
general features of tumor cells (Hanahan and Weinberg,
2000). Targeting these genes or the molecules they
encode could be an important strategy for developing
novel approaches to prevention, diagnosis, and treatment of colorectal cancer.
The commonly down-regulated genes included a
number of genes associated with cell death, which
may imply that broad repression of programed celldeath pathways is a crucial step for colorectal
tumorigenesis. In addition, the list of commonly
down-regulated genes suggests that reduction of
growth-suppressive signals and/or tumor-suppressive
functions may confer continuous proliferative properties to neoplastic cells. Therefore, induction of genes
categorized in this cluster may be a potential approach
for development of novel cancer therapies.
We also revealed that a number of genes discriminating carcinoma from adenoma were relevant to
hypoxia. Conceivably, carcinoma cells may be more
exposed to starved and hypoxic conditions, where
carbohydrate/oxygen homeostasis is impaired, than are
adenoma cells. Therefore our data may imply that
cancer cells are changing their expression pro®les in
response to low-nutrient and hypoxic conditions. Since
hypoxia is a prognostic indicator in a number of
tumors (Denko et al., 2000), targeting the genes in this
category may help us to clarify micro-environmental
changes during malignant transformation, and to
identify novel prognostic predictors for colon cancer.
Consideration of the nature of the genes described
above leads us to conclude that activation of
oncogenes, augmentation of proliferation signals,
attenuation of anti-proliferative signals, avoidance of
Oncogene
5'-Fam-CGTTTGCGAGAGAAGGATAAGCACATTGAAG-Tamra-3'
5'-Fam-TGAGGCAGACTTGGTGCTGGCG-Tamra-3'
5'-Fam-CATGGCAGTGATTCTGTGGAGAGTGCA-Tamra-3'
5'-Fam-AAGCCATCAGAGAGCATCAGATGGCC-Tamra-3'
5'-Fam-TGGGCGCAGGCCGTTTGG-Tamra-3'
self-destruction machinery, alteration of cell structure,
and adaptation to micro-environmental changes, are of
great importance for the development and progression
of normal colonic mucosal cells to adenocarcinomas.
These six features suggest that adenoma and carcinoma
cells share several genetic characteristics but have
unique expression pro®les, and this information may
guide e€orts to identify novel targets for blocking
malignant transformation.
The results of MDS system using expression pro®les
of the 50 genes corroborated its feasibility for predicting
the histological features of colorectal tumors. Analysis
of gene-expression pro®les of very early colon cancers
will be necessary for de®ning a more precise cut-o€
value to distinguish between benign and malignant
lesions; moreover, the scoring system must be validated
using bulk tumors before it can become a tool for
clinical diagnosis. Nevertheless, since histological diagnosis of adenomas and carcinomas is sometimes very
dicult and may vary among pathologists (Schlemper
et al., 2000), the MDS system may ultimately be useful
in distinguishing benign from malignant tumors because
it is an objective quanti®cation of each tumor on the
basis of a genome-wide database.
In conclusion, the gene-expression analysis of colorectal adenomas and carcinomas reported here, obtained
through a combination of laser-capture dissection and a
genome-wide cDNA microarray, provides a large body
of information to clarify the mechanisms of colorectal
tumorigenesis and should contribute to identi®cation of
molecular targets for prevention, diagnosis and treatment of colorectal tumors. Moreover, the novel MDS
system that we developed, based on a subset of 50 genes,
may facilitate more sensitive, speci®c and precise
diagnosis of such tumors.
Materials and methods
Tissue samples and laser-capture microdissection (LCM)
Eleven di€erentiated adenocarcinomas, nine adenomas, and
their corresponding normal mucosae of the colon were
obtained with informed consent from 16 patients who
Analysis of expression profiles in colonic tumors
Y-M Lin et al
underwent colectomy. In four cases, both adenomas and
carcinomas had arisen in the same patient. All 20-paired
samples were embedded in TissueTek OCT medium (Miles,
Inc.) and frozen at 7808C. Procedures of ®xation, staining
and LCM were performed as described previously (Kitahara
et al., 2001). About 10 000 cells were selectively collected by
LCM from each tissue sample.
RNA extraction, T7-based RNA amplification and cDNA
microarray
Extraction of total RNA and T7-based RNA ampli®cation
were carried out as described previously (Kitahara et al.,
2001). Two rounds of ampli®cation yielded 15 ± 80 mg of
ampli®ed RNA (aRNA) from each sample. A 2.5 mg aliquot
of aRNA from each tumor and normal epithelium were
labeled with Cy3-dCTP and Cy5-dCTP respectively (Amersham Pharmacia Biotech). To reduce experimental ¯uctuation, we used duplicate sets of cDNA microarray slides
containing 23 040 cDNAs for each analysis. Fabrication of
the cDNA microarray slides, hybridization, washing, and
detection of signals were carried out as described elsewhere
(Ono et al., 2000). The 23 040 genes were selected from
UniGene database (National Center for Biotechnology
Information) and their cDNA fragments were ampli®ed by
RT ± PCR using gene-speci®c primers for each gene and a
variety of human poly(A)+ RNAs as template (Clontech).
Data analysis
The intensity of each signal of Cy3 and Cy5 was evaluated
photometrically using ArrayVision software (Imaging Research Inc., St. Catharines, Ont. Canada) and normalized
according to the expression of 52 housekeeping genes
(Kitahara et al., 2001). After normalization, each gene was
separated into one of four categories based on the average
Cy3/Cy5 ratio (r): up-regulated (r42), down-regulated
(r50.5), unchanged (0.55r52) and low (expression level
below cuto€ level for detection). Excel, Cluster and TreeView
software packages were used for subsequent analysis.
Validation of data
To assess the reproducibility of hierarchical clustering, we
compared clustering results in the sample axis by using
di€erent sets of genes (23 040 target sequences were spotted
on ®ve slides, and we performed clustering analysis for the 20
samples in all ®ve sets). When one sample consistently fell
into the same cluster in di€erent sets of genes, we de®ned the
data as reproducible. The reproducibility was more than 80%
when Cy3 or Cy5 ¯uorescent units were above 100 000.
Therefore, we ®rst calculated average Cy3- and Cy5¯uorescence intensities of each gene in all 20 cases, excluded
genes for further analysis when both intensities were below
16105 units, and selected a total of 2425 genes. A great
majority of their spots (more than 99% of spots) had S/N
ratios greater than 3.0. We further chose 771 genes whose
values were obtained in more than 16 cases (80%) and their
standard deviations of observed values were grater than 0.5.
4127
Calculation of `Molecular Diagnosis Score' (MDS)
The MDS of each tumor was de®ned as the sum of weighted
log ratios of expression pro®les of the 50 genes selected as
di€erentially expressed in adenomas vs carcinomas:
MDSi=SSk log2(rik), where rik is the expression ratio (Cy3/
Cy5) of gene k of patient i, and Sk is the sign for gene k
which was determined as follows: we ®rst calculated the
average log ratio log2(rik) for gene k in the 11 adenocarcinomas and the nine adenomas (avecarcinoma=Slog2(rik)/
ncarcinoma and aveadenoma=Slog2(rik)/nadenoma). Then, we
determined the sign for each gene: Sk=+1, if avecarcinoma
4 aveadenoma, and Sk=71, if avecarcinoma 5 aveadenoma
(Figure 4A).
Real-time quantitative RT ± PCR
To verify our microarray data, we selected six genes and
examined their expression levels in 13 additional samples
(seven adenomas and six carcinomas) by means of real-time
quantitative RT ± PCR (TaqMan PCR, Perkin-Elmer) (Heid
et al., 1996), using a 7700 Sequence Detector (Perkin-Elmer).
Each single-stranded cDNA was reverse-transcribed from
ampli®ed RNA and diluted for subsequent PCR ampli®cation. Malate dehydrogenase 1 (MDH1) served as a relative
quantitative control since it showed the smallest Cy3/Cy5
¯uctuations in 100 hybridizations. Each PCR was carried out
in a 25-ml volume and ampli®ed for 10 min at 958C for
activation of AmpliTaq GoldTM, followed by 40 cycles of
958C for 15 s and 608C for 1 min. The genes and sequences
of the primers and probes used for quantitative RT ± PCR are
listed in Table 1.
Statistics
Assessment of statistical di€erences of gene expression in
carcinomas vs adenomas was determined by Mann ± Whitney
U-tests. A P value 40.05 was considered statistically
signi®cant. Statistical analyses were performed using Stat
View software.
Acknowledgments
We appreciate the help of Drs Yoshinao Yamada,
Yoshiyuki Akiyama, Masataka Sakaguchi, Osamu Kitahara, and Renpei Yanagawa for helpful discussions, and
the assistance of Dr Meiko Takahashi in preparing the
manuscript. This work was supported in part by Research
for the Future Program Grant #00L01402 from the Japan
Society for the Promotion of Science.
References
Afshari CA, Nuwaysir EF and Barrett JC. (1999). Cancer
Res., 59, 4759 ± 4760.
Boon K, Caron HN, van Asperen R, Valentijn L, Hermus MC, van Sluis P, Roobeek I, Weis I, Voute PA, Schwab M
and Versteeg R. (2001). EMBO J., 20, 1383 ± 1393.
Chu FF, Doroshow JH and Esworthy RS. (1993). J. Biol.
Chem., 268, 2571 ± 2576.
Denko N, Schindler C, Koong A, Laderoute K, Green C and
Giaccia A. (2000). Clin. Cancer Res., 6, 480 ± 487.
Hanahan D and Weinberg RA. (2000). Cell, 100, 57 ± 70.
Oncogene
Analysis of expression profiles in colonic tumors
Y-M Lin et al
4128
Oncogene
Heid CA, Stevens J, Livak KJ and Williams PM. (1996).
Genome Res., 6, 986 ± 994.
Hoeren FU, Dolferus R, Wu Y, Peacock WJ and Dennis ES.
(1998). Genetics, 149, 479 ± 490.
Kallioniemi OP. (2001). Ann. Med., 33, 142 ± 147.
Kitahara O, Furukawa Y, Tanaka T, Kihara C, Ono K,
Yanagawa R, Nita M, Takagi T, Nakamura Y and
Tsunoda T. (2001). Cancer Res., 61, 3544 ± 3549.
Lawrance IC, Fiocchi C and Chakravarti S. (2001). Hum.
Mol. Genet., 10, 445 ± 456.
Maguire BA and Zimmermann RA. (2001). Cell, 104, 813 ±
816.
Muto T, Bussey HJ and Morson BC. (1975). Cancer, 36,
2251 ± 2270.
Notterman DA, Alon U, Sierk AJ and Levine AJ. (2001).
Cancer Res., 61, 3124 ± 3130.
Ogiso Y, Tomida A, Kim HD and Tsuruo T. (1999).
Biochem. Biophys. Res. Commun., 258, 448 ± 452.
Okabe H, Satoh S, Kato T, Kitahara O, Yanagawa R,
Yamaoka Y, Tsunoda T, Furukawa Y and Nakamura.
(2001). Cancer Res., 61, 2129 ± 2137.
Ono K, Tanaka T, Tsunoda T, Kitahara O, Kihara C,
Okamoto A, Ochiai K, Takagi T and Nakamura Y.
(2000). Cancer Res., 60, 5007 ± 5011.
Schlemper RJ, Riddell RH, Kato Y, Borchard F, Cooper
HS, Dawsey SM, Dixon MF, Fenoglio-Preiser CM, Flejou
JF, Geboes K, Hattori T, Hirota T, Itabashi M, Iwafuchi
M, Iwashita A, Kim YI, Kirchner T, Klimp®nger M,
Koike M, Lauwers GY, Lewin KJ, Oberhuber G, O€ner
F, Price AB, Rubio CA, Shimizu M, Shimoda T, Sipponen
P, Solcia E, Stolte M, Watanabe H and Yamabe H. (2000).
Gut, 47, 251 ± 255.
Semenza GL, Roth PH, Fang HM and Wang GL. (1994). J.
Biol. Chem., 269, 23757 ± 23763.
Uhlig S and Wendel A. (1992). Life Sci., 51, 1083 ± 1094.
Vander Heiden MG, Chandel NS, Li XX, Schumacker PT,
Colombini M and Thompson CB. (2000). Proc. Natl.
Acad. Sci. USA, 97, 4666 ± 4671.
Wodopia R, Ko HS, Billian J, Wiesner R, Bartsch P and
Mairbaurl H. (2000). Am. J. Physiol. Lung Cell Mol.
Physiol., 279, L1110 ± L1119.
Zhong H and Simons JW. (1999). Biochem. Biophys. Res.
Commun., 259, 523 ± 526.