Papers in Press. Published December 3, 2009 as doi:10.1373/clinchem.2009.127951
The latest version is at http://www.clinchem.org/cgi/doi/10.1373/clinchem.2009.127951
Opinion
Clinical Chemistry 56:2
000 – 000 (2010)
Cancer Biomarker Discovery via Low Molecular Weight
Serum Profiling—Are We Following Circular Paths?
Michael T. Davis,1* Paul L. Auger,1 and Scott D. Patterson1
The rigors of attaining reproducible protein identifications from complex biological matrices have recently
been described as the ascent of a “mountainous road”
that must surmount a series of methodological, technical, and analytical barriers to attain reliable results (1).
From the perspective of biomarker discovery, the difficulty of this trail, and the attendant requisite level of
expertise, is greatest when broad discovery work flows
are used but is lessened substantially if targeted strategies can be used. Taking broad poetic license to portray
this visual image within the context of the American
westward expansion, one can envision a process by
which teams must leave the comfort of the gentle plains
(the proof-of-concept phase) to scale the foothills and
peaks that lie on the trail to clinical utility. There will be
much debate on the choosing of the best path forward.
With this simile in mind (and to our point of view), we
still see, as we suggested a few years ago, that many
advocates of the use of mass spectrometry (MS)2 for
profiling the so-called low molecular weight fragmentome (LMWF) remain circling on the plains, retracing
the paths of evidence laid down decades before that had
revealed the prevalence of dysregulated hemostasis in
malignant disease (2).
The genesis of what has been referred to as the
“SELDI fiasco” (3) traces back to the early success and
attendant hyperbole associated with the apparent differentiation of ovarian cancer patients from their unaffected controls by a pattern of uncharacterized peaks
presented in the low-mass region of native serum
SELDI analyses (4). Although the discriminatory
power of these results was ultimately attributed to
methodological bias (5), the concept of the use of biomolecule patterns as disease-specific identifiers had
been proposed, and the need for component identification had been disputed. Qualitative data obtained by
liquid chromatography coupled with tandem MS (LC-MS/MS) (6) subsequently revealed the LMW plasma
proteome to consist largely of proteolytic fragments of
abundant blood proteins associated with coagulation
and the complement cascade, most of which are now
known to be produced ex vivo. Skeptics in the field
questioned the suitability of these approaches to produce novel insights into disease, given their sensitivity
to preanalytical influences and given the prior knowledge of the prevalence of hemostatic dysregulation in
oncology (2). In contrast, some advocates of LMWF
profiling, having observed similar findings in their own
hands, invoked the presence of tumor-specific exopeptidases to account for the apparent specificity of ex
vivo–dependent disease patterns and, in a welcome
break from the field in general, have taken the appropriate steps toward establishing a rigorous assay platform to carry this effort forward (7).
1 Molecular Sciences, Amgen Inc., Thousand Oaks, CA.
* Address correspondence to this author at: Amgen Inc., One Amgen Center Dr.,
MS 1-1-A, Thousand Oaks, CA 91320. E-mail [email protected].
Received July 31, 2009; accepted October 30, 2009.
Previously published online at DOI: 10.1373/clinchem.2009.127951
2 Nonstandard abbreviations: MS, mass spectrometry; LMWF, low molecular
weight fragmentome; LC-MS/MS, liquid chromatography–tandem MS; FIBA
5909, 5909-Da internal fragment of the fibrinogen α chain spanning residues
576–629.
Regardless of opinion, these and other data offer
compelling evidence that the serum LMWF, when
probed by direct analyses of unfractionated materials,
is at a minimum confounded by and at worst perhaps
limited to the detritus of abundant blood proteins. We
suggest the successes described to date may be due to
dysregulated hemostasis, often overlaid on an acute-phase response, but are unlikely to be due to anything
more. In light of the daunting concentration range of
serum components, the limited dynamic range of
MALDI analyses (8), and the dramatic impact of ex
vivo proteolysis, this conclusion is the simplest explanation of the observed phenomena (i.e., the rule of Occam’s razor is satisfied). With the annotated features
from a study of both plasma and serum (9), Fig. 1 presents a glimpse into the potential peptide complexity of
the blood LMWF over the LC-MS/MS–tractable mass
range. Given the usual caveats associated with sequencing by tandem MS (i.e., not all ions yield quality spectra, and not all quality spectra can be easily correlated
in a database search) and the likelihood that the identified peptide ladders represent facile components of
their “family trees,” it is reasonable to suggest there are
multiple components at every nominal mass in a
MALDI spectrum of an unfractionated sample, be it
plasma or serum. The shadow cast by these fragments
of abundant proteins, >90% of which represent the
top 75 and top 125 proteins in plasma and serum, respectively (10), obscures the likelihood of detecting
tumor-derived peptides in a MALDI spectrum. Although the various qualitative assessments of native
blood fluids performed since 2005 have generally failed
to uncover tumor-specific peptides, they have revealed
the major impacts that preanalytical and technical variables have at all stages, from the point of sample collection through the final data analysis. The important
point here is that the disease associations of dysregulated hemostasis are known and are measured as part of
regular clinical care. What can easily be measured with
current clinical assays becomes more complex to analyze at the level of the LMWF.
[Figure: Distribution of Annotated Native Blood Peptides (700–4000 Daltons). Groups shown (peptide ions): Plasma (978); Serum, Serum or Platelet Derived (992); Serum, Non-Serum/Platelet (8). Horizontal axis: mass.]
Fig. 1. Mass distribution of annotated native peptides observed in human plasma and serum [Bakun et al. (9)].
Peptides derived from proteins represented by at least 2 unique peptide ions are displayed within the mass range of 700–4000
Da. The peptides annotated in the plasma analyses are derived exclusively from blood proteins with 90% of these peptides
representing the top 75 proteins by relative abundance [Hortin et al. (10)]. More than 90% of the serum peptides are derived
from the top 125 serum proteins, whereas <1% of all peptides were attributed to nonserum/platelet-derived proteins.
Copyright (C) 2009 by The American Association for Clinical Chemistry
The cycle of rediscovery can be seen in the frequent
observations of seemingly promiscuous LMW features
across a number of studies. One of these features, a
temporally sensitive biomarker of approximately 5909
nominal mass, has repeatedly been identified as an internal fragment of the fibrinogen α chain spanning residues 576–629 (FIBA 5909). Although the fragment is
directly correlated with serum coagulation in healthy
individuals (2), its presence in samples from diseased
individuals is likely due to the same mechanism. Despite numerous reports of its identification and correlation with the coagulation process, FIBA 5909
continues to be rediscovered and reported as “uncharacterized” (11). Remaining current with the biomarker
literature has grown increasingly difficult over the
years, with the growth of the field and the advent of new
journals sharing the responsibility. Consequently, reporting of this particular observation is likely to continue
until the identification of discriminating features
is required before their publication. Additionally, the
process of sample collection initiates profound changes
in the LMWF through the initiation of proteolytic cascades and the activation of blood cells and platelets, the
complexity and regulation of which elude full understanding. The lack of perfect knowledge, however,
should not preclude the recognition that diseases affecting platelet biology, such as many malignant states,
are likely to yield discriminating features in native MS
analyses that are hallmarks of platelet activation (e.g.,
platelet factor 4, pro–platelet basic protein precursor)
(12, 13). The interpretation of results that implicate
the involvement of platelet-derived factors is incomplete without the patient’s platelet count and
morphology.
Early critics of the Ciphergen Biosystems ProteinChip Reader (PBS-II) reflected on the low resolution
and low mass accuracy of this relatively unsophisticated MALDI-TOF instrument. Anecdotally, it was often expressed that one could do better with a “real”
mass spectrometer. Although the benefits of enhanced
mass resolution, accuracy, and stability are indisputable and the value of off-chip sample processing has
been demonstrated (14), these features are fine points
compared with the magnitude of the effect induced by
the underlying biology. As has been reviewed recently
(15), 45% of the peaks observed in differential analyses
of samples from case–control studies that targeted
breast cancer on an “advanced” platform represented
rediscoveries of prior findings, with the FIBA 5909
peptide being among the most significantly correlated.
Similarly, the pairwise recapitulation of observations
obtained with the low-performance PBS-II instrument
and with high-value instrumentation (16) regarding
the prevalence of abundant protein degradants (including FIBA 5909) in sera from patients with head and
neck squamous cell carcinoma suggests that little new
biology is likely to be revealed. The reliability of these
platforms is undoubtedly superior, and the tandem sequencing capabilities will surely prove valuable. But
will the outcomes differ? Time will tell, but the key
benchmark of progress will be evident when feature
identification becomes the accepted practice, a task
that would be accelerated in many cases if the compendium of identified peptides were to be curated in a centralized database accessible through tools such as TagIdent (17).
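The practice of checking a discriminating peak against previously identified peptides, before reporting it as uncharacterized, can be sketched as a simple mass lookup. The Python sketch below is illustrative only. The `KnownPeptide` record, the `match_known_peptides` helper, and the 2-Da tolerance are hypothetical choices made for this example and are not part of TagIdent or of any cited study. The single registry entry is the FIBA 5909 fragment discussed above, and a curated database would hold many more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownPeptide:
    """One curated entry for a previously identified LMW peptide (hypothetical schema)."""
    label: str
    nominal_mass_da: float
    annotation: str


# The only entry taken from the text; a real curated registry would be far larger.
KNOWN_PEPTIDES = (
    KnownPeptide(
        label="FIBA 5909",
        nominal_mass_da=5909.0,
        annotation="internal fragment of the fibrinogen alpha chain, residues 576-629",
    ),
)


def match_known_peptides(observed_mass_da, tolerance_da=2.0, registry=KNOWN_PEPTIDES):
    """Return registry entries whose nominal mass lies within tolerance_da of the peak.

    tolerance_da is an assumed value chosen for illustration, not a recommendation.
    """
    return [
        entry
        for entry in registry
        if abs(entry.nominal_mass_da - observed_mass_da) <= tolerance_da
    ]


if __name__ == "__main__":
    # Two made-up peak masses: one near the known fragment, one far from it.
    for peak in (5908.4, 6200.0):
        hits = match_known_peptides(peak)
        if hits:
            for hit in hits:
                print(f"{peak} Da: previously identified as {hit.label} ({hit.annotation})")
        else:
            print(f"{peak} Da: no curated match within tolerance")
```

A match on mass alone is only a prompt to consult the prior literature; as noted earlier, several components may share a nominal mass, so sequence confirmation would still be needed.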
Circular paths are often trod with respect to the
analysis of small acute-phase proteins such as serum
amyloid A, as is evident in its discovery as a putative
therapeutic response marker in the treatment of non–small-cell
lung cancer (18, 19). Although inconsistently identified by the research team (known in 2007
but unknown in 2009), the profile of serum amyloid A
is consistent and unmistakable, with expression levels
inversely correlated with the response to treatment
(i.e., survival). Its prognostic value has also been consistent and unmistakable across decades of investigation (20). The recent US Food and Drug Administration approval of the OVA1 screening test (Vermillion)
speaks to the value of evaluating serum amyloid A and
similar proteins, because 2 of the 4 proteins (transthyretin and transferrin) evaluated in combination with
cancer antigen 125 are acute-phase reactants, although
they are negative responders in this example (21). In
view of this acceptance of “the pattern is the biomarker” model, it follows that any examination of the serum LMWF is incomplete without the parallel assessment of the acute-phase reactants. Combined with our
previous recommendations regarding coagulation and
platelet status, all of which are components of standard
clinical practice, systematic evaluation of these cellular
and molecular components should expand our understanding
of the underlying biology and better determine, perhaps more rapidly, the utility of LMW
profiling.
This commentary is not meant to disparage the
efforts of these research groups, and we recognize the
difficulty of attaining a comprehensive coverage of
the field in today’s information-rich environment. We
acknowledge our own limitations in this regard with
respect to the omission of references to efforts with
proximal fluids and tissues, which are more likely to
yield disease insights than serum profiling, or with respect to the emerging recognition of the potential value
of isoform variation and posttranslational modifications, which are all beyond the scope of this commentary. This criticism is intended to guide readers toward
a fact-based recognition of the inherent limitations of
MALDI/SELDI profiling of the LMWF for biomarker
discovery and to serve as an instrument to encourage
others to strike out on their own road. In an offhanded
fashion, the emerging trend toward publication of negative data, which effectively closes out years of preliminary promise, is evidence that the field can break the
cycle and pursue other paths (22, 23). At the same
time, one must wonder if the recent report of a LMW
profile for the early detection of breast cancer does not
feel like déjà vu (24).
Author Contributions: All authors confirmed they have contributed to
the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design,
acquisition of data, or analysis and interpretation of data; (b) drafting
or revising the article for intellectual content; and (c) final approval of
the published article.
Authors’ Disclosures of Potential Conflicts of Interest: Upon
manuscript submission, all authors completed the Disclosures of Potential Conflict of Interest form. Potential conflicts of interest:
Employment or Leadership: None declared.
Consultant or Advisory Role: None declared.
Stock Ownership: M.T. Davis, Amgen Inc.
Honoraria: None declared.
Research Funding: None declared.
Expert Testimony: None declared.
Role of Sponsor: The funding organizations played no role in the
design of study, choice of enrolled patients, review and interpretation
of data, or preparation or approval of manuscript.
References
1. Aebersold R. A stress test for mass spectrometry-based proteomics. Nat Methods 2009;6:411–2.
2. Davis MT, Auger P, Spahr C, Patterson SD. Cancer
biomarker discovery via low molecular weight
serum proteome profiling—Where is the tumor?
Proteomics Clin Appl 2007;1:1545–58.
3. Anderson NL. Clinical proteomics heads into real
world. Improved instrumentation and unbiased samples renew promise of biomarker pipeline. Genet Eng
Biotechnol News 2009;29(5). http://www.genengnews.com/articles/chitem.aspx?aid=2822&chid=4 (Accessed June 2009).
4. Petricoin EF, Ardekani AM, Hitt BA, Levine PJ,
Fusaro VA, Steinberg SM, et al. Use of proteomic
patterns in serum to identify ovarian cancer. Lancet 2002;359:572–7.
5. Baggerly KA, Morris JS, Coombes KR. Reproducibility of SELDI-TOF protein patterns in serum:
comparing datasets from different experiments.
Bioinformatics 2004;20:777–85.
6. Koomen JM, Li D, Xia LC, Coombes KR, Abbruzzese J, Kobayashi R. Direct tandem mass spectrometry reveals limitations in protein profiling
experiments for plasma biomarker discovery. J
Proteome Res 2005;4:972–81.
7. Villanueva J, Nazarian A, Lawlor K, Yi SS, Robbins
RJ, Tempst P. A sequence-specific exopeptidase
activity test (SSEAT) for “functional” biomarker
discovery. Mol Cell Proteomics 2008;7:509–18.
8. Hortin GL. The MALDI-TOF mass spectrometric
view of the plasma proteome and peptidome.
Clin Chem 2006;52:1223–37.
9. Bakun M, Karczmarski J, Poznanski J, Rubel T,
Rozga M, Malinowska A, et al. An integrated
LC-ESI-MS platform for quantitation of serum
peptide ladders. Application for colon carcinoma
study. Proteomics Clin Appl 2009;3:932–46.
10. Hortin GL, Sviridov D, Anderson NL. High-abundance
polypeptides of the human plasma
proteome comprising the top 4 logs of
polypeptide abundance. Clin Chem 2008;54:1608–16.
11. Han KQ, Huang G, Gao CF, Wang XL, Ma B, Sun
LQ, Wei ZJ. Identification of lung cancer patients
by serum protein profiling using surface-enhanced
laser desorption/ionization time-of-flight
mass spectrometry. Am J Clin Oncol 2008;31:133–9.
12. Shi L, Zhang J, Wu P, Feng K, Li J, Xie Z, et al.
Discovery and identification of potential biomarkers of pediatric acute lymphoblastic leukemia.
Proteome Sci 2009;7:7.
13. Fiedler GM, Leichtle AB, Kase J, Baumann S,
Ceglarek U, Felix K, et al. Serum peptidome profiling revealed platelet factor 4 as a potential
discriminating peptide associated with pancreatic
cancer. Clin Cancer Res 2009;15:3812–9.
14. Villanueva J, Philip J, Entenberg D, Chaparro CA,
Tanwar MK, Holland EC, Tempst P. Serum peptide profiling by magnetic particle-assisted, automated sample processing and MALDI-TOF mass
spectrometry. Anal Chem 2004;76:1560–70.
15. Callesen AK, Vach W, Jorgenson PE, Cold S,
Mogensen O, Kruse TA, et al. Reproducibility of
mass spectrometry based protein profiles for diagnosis of breast cancer across clinical studies: a
systematic review. J Proteome Res 2008;7:1395–402.
16. Freed GL, Cazares LH, Fichlander CE, Fuller TW,
Sawyer CA, Stack BC Jr, et al. Differential capture
of serum proteins for expression profiling and
biomarker discovery in pre- and posttreatment
head and neck cancer samples. Laryngoscope
2008;118:61–8.
17. Swiss Institute of Bioinformatics. ExPASy Proteomics Server. TagIdent tool.
http://www.expasy.ch/tools/tagident.html (Accessed June 2009).
18. Yildiz PB, Shyr Y, Rahman JS, Wardwell NR, Zimmerman LJ, Shakhtour B, et al. Diagnostic accuracy of MALDI mass spectrometric analysis of
unfractionated serum in lung cancer. J Thorac
Oncol 2007;2:893–901.
19. Salmon S, Chen H, Chen S, Herbst R, Tsao A, Tran
H, et al. Classification by mass spectrometry can
accurately and reliably predict outcome in patients with non-small cell lung cancer treated
with erlotinib-containing regimen. J Thorac Oncol
2009;4:689–96.
20. Malle E, Sodin-Semrl S, Kovacevic A. Serum
amyloid A: an acute-phase protein involved in
tumour pathogenesis. Cell Mol Life Sci 2009;66:9–26.
21. US Food and Drug Administration. FDA news
release. http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm182057.html (Accessed
September 2009).
22. McLerran D, Grizzle WE, Feng Z, Bigbee WL,
Banez LL, Cazares LH, et al. SELDI-TOF MS whole
serum proteomic profiling with IMAC surface
does not reliably detect prostate cancer. Clin
Chem 2008;54:53–60.
23. West-Norager M, Bro R, Marini F, Hogdall EV,
Hogdall CK, Nedergaard L, Heegaard NHH. Feasibility of serodiagnosis of ovarian cancer by mass
spectrometry. Anal Chem 2009;8:1907–13.
24. Belluco C, Petricoin EF, Mammano E, Facchiano
F, Ross-Rucker S, Nitti D, et al. Serum proteomic
analysis identifies a highly sensitive and specific
discriminatory pattern in stage 1 breast cancer.
Ann Surg Oncol 2007;14:2470–6.