On the beta-binomial model for analysis of

BIOINFORMATICS
Vol. 26 no. 3 2010, pages 363–369
doi:10.1093/bioinformatics/btp677
ORIGINAL PAPER
Gene expression
On the beta-binomial model for analysis of spectral count data in
label-free tandem mass spectrometry-based proteomics
Thang V. Pham∗ , Sander R. Piersma, Marc Warmoes and Connie R. Jimenez
OncoProteomics Laboratory, Department Medical Oncology, VUmc-Cancer Center Amsterdam, VU University
Medical Center, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands
Received on July 17, 2009; revised on December 1, 2009; accepted on December 7, 2009
Advance Access publication December 9, 2009
Associate Editor: Trey Ideker
1
INTRODUCTION
Mass spectrometry-based proteomics is an important technique for
large-scale identification and quantification of proteins. It has been
shown that the number of spectral counts of a protein can be
used as a measure of its abundance (Dix et al., 2008; Liu et al.,
∗ To
whom correspondence should be addressed.
15
10
Variance
ABSTRACT
Motivation: Spectral count data generated from label-free tandem
mass spectrometry-based proteomic experiments can be used to
quantify protein’s abundances reliably. Comparing spectral count
data from different sample groups such as control and disease
is an essential step in statistical analysis for the determination of
altered protein level and biomarker discovery. The Fisher’s exact
test, the G-test, the t-test and the local-pooled-error technique (LPE)
are commonly used for differential analysis of spectral count data.
However, our initial experiments in two cancer studies show that the
current methods are unable to declare at 95% confidence level a
number of protein markers that have been judged to be differential
on the basis of the biology of the disease and the spectral count
numbers. A shortcoming of these tests is that they do not take into
account within- and between-sample variations together. Hence, our
aim is to improve upon existing techniques by incorporating both the
within- and between-sample variations.
Result: We propose to use the beta-binomial distribution to test
the significance of differential protein abundances expressed in
spectral counts in label-free mass spectrometry-based proteomics.
The beta-binomial test naturally normalizes for total sample count.
Experimental results show that the beta-binomial test performs
favorably in comparison with other methods on several datasets in
terms of both true detection rate and false positive rate. In addition,
it can be applied for experiments with one or more replicates, and
for multiple condition comparisons. Finally, we have implemented
a software package for parameter estimation of two beta-binomial
models and the associated statistical tests.
Availability
and
implementation: A
software
package
implemented in R is freely available for download at
http://www.oncoproteomics.nl/.
Contact: [email protected]
Supplementary information: Supplementary data are available at
Bioinformatics online.
5
0
0
5
10
15
Mean
Fig. 1. A mean-variance plot of the spectral count of proteins acquired from a
single biological sample from 29 LC/MS/MS runs over a period of 6 months.
The variance tends to increase for higher abundance proteins.
2004; Ramani et al., 2008; Zybailov et al., 2009). In comparison
to intensity-based quantification which requires complex signal
processing, quantification using spectral counts is advantageous
because the numbers are readily available after protein identification.
However, tools for statistical analysis of this type of data are
still immature. This article addresses the problem of comparative
analysis of spectral count data generated in label-free tandem mass
spectrometry-based proteomics.
We make a distinction between within- and between-sample
variation. The former refers to variation caused by the random
sampling process of each biological sample. The latter is the
variation resulted from random biological samples in a sample group
such as control or disease.
Since the number of samples per comparison group is typically
small (currently, n varies from 1 to 6 for data generated in our
laboratory), non-parametric methods such as the Mann–Whitney
test or the Kruskal–Wallis test are usually not powerful. One often
resorts to assumptions about the generative mechanism of the data
in the form of parametric distributions. In particular, the popular
t-test assuming a normal distribution can be used with either the raw
spectral count numbers or their transformed values (e.g. logarithm or
square root transformation). The t-test, however, does not take into
account within-sample variation. Figure 1 shows a mean-variance
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]
[14:45 8/1/2010 Bioinformatics-btp677.tex]
363
Page: 363
363–369
T.V.Pham et al.
plot of proteins resulted from a single biological sample after 29
LC/MS/MS runs (see Section 3.1 for a detailed description). It can be
seen that the absolute variance tends to increase for proteins at higher
abundance. Thus, ignoring the within-sample variation may result
in artificially large test statistics for proteins at a high abundance
region.
When there is only one replicate available in each sample group,
the G-test of independence or the Fisher’s exact test can be used by
forming a 2×2 contingency table for each protein (Sokal and Rohlf,
1995). The two tests are based on the assumption of a multinomial
distribution and a hypergeometric distribution of the number of
counts, respectively. Hence, effectively the within-sample variation
is modeled. However, when more than one replicates are available,
the current non-optimal approach is to pool the samples, and as a
result ignoring the between-sample variability.
This article proposes to use the beta-binomial distribution
(Skellam, 1948) to model spectral count data. The variability is
modeled in two directions. The within-sample variation is modeled
with a binomial distribution, similar to the assumption employed
in the G-test of independence or the Fisher’s exact test. The
between-sample variation is modeled by treating the parameter of the
binomial distribution as a random variable from a beta distribution.
Finally, parameter inference is based on the likelihood ratio test as
in the case of the G-test.
The article is organized as follows. Section 2 presents the
beta-binomial test. Section 3 evaluates the performance of the betabinomial test and existing tests for spectral count data. Section 4
concludes the article.
2
THE BETA-BINOMIAL MODEL
Let x ∈ N denote the number of spectral counts of a particular
protein and n ∈ N the total number of spectral counts of all
proteins in a sample. To model the within-sample variation, assume
x is distributed according a binomial distribution with success
probability π ∈ R, 0 ≤ π ≤ 1,
p(x|π,n) =
n x
π (1−π)n−x
x
(1)
Furthermore, for the between-sample variation, π is modeled as a
random variable from a beta distribution with real parameters α > 0
and β > 0
p(π|α,β) =
πα−1 (1−π)β−1
B(α,β)
(2)
being the number of data points, is
⎡
ni −x
N x
i −1
i −1
⎣
L=
log(µ+rθ)+
log(1−µ+rθ)
i=1
r=0
r=0
−
n
i −1
⎤
log(1+rθ)⎦
(4)
r=0
where i and r are running indices, µ = α(α+β)−1 and θ = (α+β)−1
(Williams, 1975). The reparameterization is to improve the
numerical stability of the optimization algorithm. In addition, µ is
the expected value of the binomial parameter π.
The likelihood ratio test can be used for significance analysis
(Sokal and Rohlf, 1995). Let Lj be the maximal value of the log
likelihood (4) for each group, j = 1,...,M. Let L0 be the maximal
value of the log likelihood when data from all groups are used. In
the first model, the statistic S to test the homogeneity across groups
⎛
S = 2 ⎝−L0 +
M
⎞
Lj ⎠
(5)
j=1
is approximately χ2 distributed with 2(M −1) degrees of freedom.
Here, the null hypothesis is equal to µj and θj . (Hence, there are
2M free parameters in one model and two free parameters in the
other model, which leads to 2(M −1) degrees of freedom for the χ2
distribution.)
When the beta distributions are bell shaped (θj < µj , θj < 1−µj ),
the groups may differ with respect to µ but not with respect to θ
(Williams, 1975). In this case, we can use the second model where
the θj are constrained to be equal. Here, we test the hypothesis that
µj are equal assuming that θj are equal. The distribution of the
statistic S in (5) under the null hypothesis is approximated by the χ2
distribution with (M −1) degrees of freedom. (This is because there
are (M +1) free parameters in one model and two free parameters
in the other model, resulting in (M −1) degrees of freedom for the
χ2 distribution.)
There are two special cases in implementing the beta-binomial
test for spectral count data.
• µj = 0: this situation arises either from the random sampling
process or from a black and white regulation of the protein. In
this case, µj = 0 maximizes (4) and Lj = 0. We do not address
the issue with asymptotic approximations as stated in Crowder
(1978) and simply omit the contribution of group j in (5).
(3)
• θj = 0: the situation arises when there is one replicate only or
when the between-sample variation diminishes. This indicates
that the data might be pooled. In fact, in this case the unique
solution for µj is equal to the binomial parameter π+ of the
pooled data.
The maximum likelihood estimation of the parameters α and β
of the beta-binomial distribution is as follows. Ignoring constants
involving only the data, the log likelihood L(α,β|{xi ,ni }N
i=1 ), N
Note that instead of using the likelihood ratio test, the betabinomial distribution can be used for testing in other frameworks
(Baggerly et al., 2003; Ennis and Bi, 1998). Nevertheless, the main
computational task is still the optimization of (4).
where B(·,·) is the beta function. The marginal distribution of x is
then the beta-binomial distribution (Skellam, 1948)
p(x|α,β,n) =
n B(α+x,n+β −x)
B(α,β)
x
364
[14:45 8/1/2010 Bioinformatics-btp677.tex]
Page: 364
363–369
The beta binomial model for analysis of spectral count data
On the existence of within-sample variation and
extra-binomial total variation
The first dataset is spectral count data from 29 LC/MS/MS runs
of colon cancer cell line HTC116, which were used >6 months
as technical control samples to evaluate equipment performance.
The MS/MS spectra were identified by SEQUEST. The identified
peptides were organized by the Scaffold 2 software (Proteome
Software Inc., Portland, OR, USA; see Supplementary Material
I). The resulted dataset contains 656 proteins in total (min 2
peptides/protein, 99% protein probability). There is no biological
variation in the set. We used this dataset to illustrate the existence
of within-sample variation (Fig. 1).
We performed a test for the null hypothesis that the binomial
proportions over 29 samples in this dataset are the same (using
prop.test in R). The spectral count numbers are normalized
for the total sample spectral count. This is important because of
the differences in sample load (the number of total spectral counts
for each sample ranges from 936 to 2868). The result shows that
only 27 out of 656 proteins (4%) show heterogeneity of binomial
proportions at 95% confidence level. Figure 2a shows a protein with
a high degree of homogeneousity among the proportions. In this
case, the beta-binomial fit coincides with a binomial fit (θ = 0).
Next, we analyzed a dataset with two cancer groups, one with
three biological samples and the other with five biological samples.
0.12
Sample spectral counts
Sample binomial model
Binomial model
Beta−binomial model
0.1
0.08
Probability
We contrast the performance of the beta-binomial test against
existing statistical tests. Two important criteria for comparison are
the true detection rate and false positive rate.
Previous comparative studies (Bantscheff et al., 2007; Zhang et
al., 2006) indicate that for experiments with a single replicate in
each group the G-test, the Fisher’s exact test and the so-called
AC test perform equally well. The G-test is favorable because of
its computational simplicity and its applicability for comparison
with multiple conditions. Thus, we will consider the G-test as a
representative for the class of methods dealing with one replicate.
In Bantscheff et al. (2007), the t-test is recommended for
experiments with more than three replicates, while the local-poolederror technique (LPE) test is suited for ones with two or three
replicates. Hence, in this sample size setting we contrast the
performance of the beta-binomial test against the G-test, the t-test
and the LPE test. Moreover, we also attempt to use the t-test with
log-transformed data (after normalization for total sample count),
a procedure often carried out as a preprocessing step for genomic
data.
The QSpec method is a recent development in differential analysis
of spectral count data (Choi et al., 2008). It is a Bayesian modeling
technique where the spectral counts are modeled as observations
from a Poisson distribution, which is similar to the binomial
assumption. However, statistical information across proteins is also
employed in QSpec as opposed to the beta-binomial model where
each protein is treated separately. We will use an implementation of
QSpec provided by its authors in our evaluation. For protein marker
discovery, a threshold of 9.8 on the Bayes factor is used as in Choi
et al. (2008).
We also perform evaluation for three group comparisons. In
addition to the G-test, one-way ANOVA and one-way ANOVA with
log-transformation are considered.
3.1
(a)
EXPERIMENTAL RESULTS
0.06
0.04
0.02
0
0
10
20
30
40
50
60
70
Number of counts
(b)
0.12
Sample spectral counts
Sample binomial model
Binomial model
Beta−binomial model
0.1
0.08
Probability
3
0.06
0.04
0.02
0
0
10
20
30
40
50
60
70
Number of counts
Fig. 2. An illustration of the power of the beta-binomial model. The thin gray
lines represent the binomial distribution assumption, which might be used
when there is only one replicate. The thick gray lines are the binomial fit using
all available spectral counts, or equivalently pooling the data. (a) Example
of a protein with a high degree of homogeneousity among 29 samples. Here,
the beta-binomial fit coincides with a binomial fit (θ = 0). (b) Example of a
protein exhibiting extra-binomial variation. Two out of five samples fall in
the low density region of the group binomial distribution. The beta-binomial
distribution provides a better fit to the data.
In total the dataset contains 1786 proteins. In each sample, there are
approximately 20 000 spectral counts. The proportion test shows that
965 out of 1786 proteins (54%) indicate heterogeneity of binomial
proportions at 95% confidence level in either group. Therefore,
modeling data with a binomial distribution, or equivalently pooling
the data, is clearly not sufficient.
Figure 2b illustrates an example of one protein with extrabinomial variation. It can be seen that the binomial model is
inadequate here as two out of five points lie at the extreme tails
of the estimated binomial distribution. Thus, pooling the data in this
case will create a false estimate of the variance. The extra-binomial
variation can be well modeled by a beta-binomial distribution.
3.2
On true detection rate
We use two real-life datasets from two cancer proteomics studies
conducted in our laboratory to examine the ability of the betabinomial test to detect differential proteins. Both studies require
365
[14:45 8/1/2010 Bioinformatics-btp677.tex]
Page: 365
363–369
T.V.Pham et al.
a two-group comparison. The first dataset is a comparative analysis
of colon adenomas and carcinomas. Human cancer tissues were
prepared to extract, among others, the so-called chromatin-binding
fraction (Albrethsen et al., 2009). Subsequently, the chromatinbinding fractions are subjected to a processing pipeline including
gel electrophoresis, in-gel digestion, nano-LC separation, tandem
mass spectrometry and database searching (Supplementary Material
I). The second dataset was generated for the comparative analysis
of proximal fluids obtained from normal colon and colon tumor
samples of a mouse model. The samples were again subjected to
the aforementioned gel-LC-MS/MS proteomics workflow to obtain
spectral count data (Supplementary Material I).
Evaluating the true detection rate is difficult because in general
there is no ground truth available on differential proteins. One
approach is to generate a synthetic list of differential proteins.
However, it is not certain that the generative mechanism will be
close to that of real-life data. We therefore employed an approximate
method. For each dataset, we used a list of potentially differential
proteins provided by experts as an (approximate) ground truth.
Both differential protein lists had been composed prior to the
implementation of the test. Candidate proteins came from literature.
Some were established clinical and preclinical protein markers for
the cancer. Some had been found to be differential in other medium
or compartments rather than the ones under study here. Typically,
these candidate proteins are used for confirmatory analysis. For
colorectal cancer, the proteins are CEA, CEACAM-6, TIMP-1,
MMP-9, PKM2 and LF.
While it is not certain that all proteins in the lists will express
significant difference in the datasets, it should give an indication
on the relative performance of the different statistical tests.
Furthermore, within each protein list there is a sub-list of proteins
which are known markers for the cancer. Thus, in comparison to the
full list, the proteins in the sub-list are more likely to be differential.
In the first study, a list of 58 potential protein markers was
composed. This list contains six proteins which are known markers
for the cancer. Similarly, there is a list of 50 potential protein markers
in the second study, of which 11 are known markers.
Figure 3 compares the detection rate of the beta-binomial test
against the G-test, the t-test, the LPE test and QSpec. It can be
seen that the beta-binomial test detects the most differential proteins
at the 95% confidence level in three cases. In particular, only the
beta-binomial test can declare all existing markers significantly
differential. While QSpec can detect 24 out of 58 candidate proteins
for dataset 1, it detects only four out of six protein markers for the
dataset. The t-test with log-transformation and QSpec perform as
well as the beta-binomial test for the second dataset. Thus, one can
conclude that the beta-binomial test outperforms the G-test, the t-test
and the LPE test in terms of detection rate. It is comparable with the
t-test when the data are log transformed and the QSpec method.
As an aside, it is interesting to observe the difference in true
detection rate for the two datasets. Recall that the first dataset comes
from a comparison between two stages of cancer, whereas the second
dataset comes from a cancer versus control comparison. It can be
expected that the differences in protein abundance are more apparent
for the cancer versus control comparison. Here the result shows
that the beta-binomial test, the t-test with log transformation and
QSpec have 100% detection rate for the second dataset and <50%
detection rate for the first dataset. Some of the proteins not detected
in the first dataset might be false expert assignments. Some might be
(a) 60 Number of detections out of 58 proteins
22
24
21
20
10
8
0
Result of dataset 1 with a list of 58 potentially
differential proteins.
(b) 8 Number of detections out of 6 proteins
6
5
4
4
3
1
0
Result of dataset 1 with six known protein markers.
(c) 55 Number of detections out of 50 proteins
50
50
50
41
35
13
0
Result of dataset 2 with a list of 50 potentially
differential proteins.
(d) 15 Number of detections out of 11 proteins
11
11
11
9
6
6
0
Result of dataset 2 with 11 known protein markers.
Beta binomial test
-test with log-transformation
test
LPE
-test
QSpec
Fig. 3. The number of true detections of six methods on two datasets with
four (approximate) ground truth lists: the beta binomial test, the G test, the
t-test, the t-test with log-transformation, the LPE method, and QSpec. The
beta binomial test, the t-test with log-transformation, and QSpec perform
comparably, and outperform other methods.
differential in other compartments but not in the chromatin-binding
fraction. For the purpose of this article, it is important to note that
the lists of candidate proteins were independently identified from
the development of the tests.
3.3
On false positive rate
3.3.1 Two-group comparison We followed the work of Zhang
et al. (2006) for a comprehensive comparison of statistical tests
for differential analysis of spectral count data. The dataset for
evaluation comes from the work of Liu et al. (2004). The spectral
366
[14:45 8/1/2010 Bioinformatics-btp677.tex]
Page: 366
363–369
The beta binomial model for analysis of spectral count data
count data is obtained from Saccharomyces cerevisiae cell lysate
samples with six additional protein markers added at three different
concentrations (2.5, 1.25 and 0.25%). The six protein markers are
bovine carbonic anhydrase (CAH2), bovine serum albumin (ALBU),
soybean trypsin inhibitor (ITRA), chicken lysozyme (LYC), chicken
ovalbumin (OVAL) and rabbit phosphorylase b (PHS2).
In Zhang et al. (2006), the authors focused on the false positive
rate of two-group comparison. For each spiked-in protein, the false
positive rate is the fraction of the number of proteins with lower
P-value than the P-value of the designated protein. This evaluation
metric is sensible especially for biomarker discovery applications
because the investigator may vary the threshold for marker selection.
We repeat the experiment in Zhang et al. (2006) with the G-test,
the t-test and the LPE test. For the G-test, if more than one replicates
are available, the data are pooled. The result is adjusted by the
William’s correction method (see Sokal and Rohlf, 1995; Zhang
et al., 2006, for detail of the correction). For the t-test and LPE test,
the data are normalized for total sample count. As in the evaluation
for true detection rate, we also report the result of the t-test on
log-transformed data (after total sample count normalization).
The QSpec method requires proteins’ sequence length. This
information can be obtained from http://www.yeastgenome.org/, and
is available in Supplementary Material III.
Three pairwise comparisons among three different spiked-in
concentrations were performed (2.5% versus 1.25%, 1.25% versus
0.25% and 2.5% versus 0.25%), resulting in comparisons with 2-,
5- and 10-fold changes, respectively. The effect of the number of
replicates is evaluated by considering each comparison with one,
two and three replicates. For one and two replicates, the false
positives rates are averaged from all nine possible combinations
(this is because there are three replicates available in each group).
The one-replicate comparison is not applicable for the t-test and the
LPE test. The beta-binomial test can be used for all cases.
Figure 4 reports the false positive rates of different tests in
different settings of fold change and sample size. We were able to
reproduce the result of Zhang et al. (2006), except for the comparison
1.25% versus 2.5% with two replicates, where our result of LPE test
(∼10%) is lower than the result in Zhang et al. (2006) (∼18%).
This might be due to the different parameter setting of the LPE
algorithm and/or software version. It can be seen that the t-test with
log-transformation exhibits similar behavior as the t-test without
data transformation. Both versions are worse than other methods
for the 5- and 10-fold changes.
For one-replicate experiment, the beta-binomial test performs as
well as the G-test, and as QSpec in most cases. [Note that the
standard deviations (SDs) are approximately equal to the means.
Only in the 5-fold comparison, the performance of QSpec appears
considerably worse than that of the other two tests.] For two- and
three-replicate experiments, the beta-binomial test outperforms all
others. The false negative rate of the beta-binomial test is higher
than that of the t-test in one case only (for protein CAH2, 2-fold
change). The result of the beta-binomial test is equal and in many
cases significantly better than that of other tests.
The results across different fold changes and different numbers
of replicates are expected in that it is more difficult to detect 2-fold
differences than 5- and 10-fold differences. In addition, increasing
the number of replicates reduces the false positive rate. Another
observation is that the six proteins result in different false positive
rates. This is because the proteins are based on different subsets of
peptides whose chemical and physical properties are different, for
instance, the ionization property and the number of tryptic peptides.
This leads to differences in detectablity and variation.
3.3.2 Three-group comparison We contrast the beta-binomial
test against the G-test (for multiple groups) (Sokal and Rohlf, 1995),
one-way ANOVA and one-way ANOVA with log-transformation.
Note that the LPE test is not applicable in this case. For one
and two replicates, the false positives rates are averaged from all
27 possible combinations (again, this is because there are three
replicates available in each group). For experiments with more than
one replicates, data are pooled for the G-test. Only the beta-binomial
test and G-test are applicable for one-replicate experiments.
Figure 5 shows the performance of the different tests with one and
two replicates. When all three replicates are used, no false positive
is reported for all the tests. It can be seen that the beta-binomial
test performs as well as the G-test for one-replicate experiment and
outperforms all other tests for the two-replicate situation.
4
DISCUSSIONS
We proposed to use the beta-binomial distribution to model spectral
count data with both within- and between-sample variation in labelfree tandem mass spectrometry-based proteomics. The experimental
results showed that the beta-binomial test compares favorably with
other existing tests in significance analysis of spectral count data in
terms of true detection rate and false positive rate.
For datasets with a single replicate in each group, the betabinomial test and the G-test perform comparably. This is expected
since the extra-binomial variation in the beta-binomial model
diminishes, resulting in a binomial model. On the other hand, under
the constraint of equal sample load, the multinomial distribution
assumed by the G-test turns into a binomial distribution for the
two-group comparison. Nevertheless, the use of the beta-binomial
test is advantageous in a situation where a sample group has
more than one replicates, while another group has only a single
replicate. Our experiment showed that data should not be pooled at
convenience when more than one replicates are available.
The true detection rates of the beta-binomial test, the t-test with
log-transformation and QSpec are similar and considerably better
than the t-test without data transformation and the LPE method.
This is likely because the first three tests model the spectral count
data on the non-negative side, whereas the other two tests cannot
impose this restriction, which has a clear impact especially when
the counts concentrate near zero.
A recent study (Chourey et al., 2009) used the Poisson regression
model for spectral count data. Our analysis shows that in case of twogroup comparison, the model leads to pooling data in each group
in the log space. Hence, it belongs to the class of methods where
the G-test is representative. Nevertheless, the Poisson regression
model is flexible and can be taken into account in other settings
such as for paired data, time-course data and experimental designs
with multi-way interactions.
The beta-binomial distribution has been used for analysis of
differential gene expression level in SAGE libraries in Baggerly
et al. (2003), in which model fitting is based on the method of
moments. It is unclear if this approach can be generalized to handle
constraints on parameters for different likelihood ratio tests, for
example, to consider the difference between group means only.
367
[14:45 8/1/2010 Bioinformatics-btp677.tex]
Page: 367
363–369
LYC
OVAL
PHS2
ALBU
CAH2
ITRA
LYC
OVAL
PHS2
0.8
ITRA
LYC
OVAL
PHS2
ALBU
CAH2
ITRA
LYC
OVAL
PHS2
0.25% vs 2.5%
LYC
PHS2
Beta binomial test
OVAL
test
ALBU
ITRA
-test
CAH2
OVAL
PHS2
-test with log-transformation
LYC
0
LPE
ALBU
ALBU
ALBU
QSpec
CAH2
CAH2
CAH2
ITRA
ITRA
ITRA
LYC
LYC
LYC
Three replicates
OVAL
OVAL
OVAL
PHS2
PHS2
PHS2
Fig. 4. False positive rates of different statistical tests for pairwise comparison of three different spiked-in concentrations, the beta-binomial test, the G-test, the t-test, the t-test with log-transformation, the
LPE method and QSpec. Rows represent different fold changes, and columns represent the number of replicates used in testing. The six spiked-in protein markers are bovine carbonic anhydrase (CAH2),
bovine serum albumin (ALBU), soybean trypsin inhibitor (ITRA), chicken lysozyme (LYC), chicken ovalbumin (OVAL) and rabbit phosphorylase b (PHS2). For one and two replicates, the false positives
rates are averaged from all nine possible combinations. The SDs of the measurements are provided in Supplementary Material II. It is observed that the SDs are approximately equal to the means.
ITRA
0
CAH2
0
ALBU
7
12
22
0
CAH2
0.4
0
ALBU
3
0
1.4
0
ITRA
Two replicates
0
CAH2
5
0
ALBU
One replicate
False positive rate (%)
False positive rate (%)
False positive rate (%)
0.25% vs 1.25%
False positive rate (%)
False positive rate (%)
False positive rate (%)
False positive rate (%)
False positive rate (%)
[14:45 8/1/2010 Bioinformatics-btp677.tex]
1.25% vs 2.5%
False positive rate (%)
0.5
T.V.Pham et al.
368
Page: 368
363–369
The beta binomial model for analysis of spectral count data
(a) 0.7
False positive rate (%)
Finally, we have implemented a robust software package in R for
hypothesis testing with the beta-binomial model for spectral count
data. An option for correction for multiple testing is included in the
package.
ACKNOWLEDGEMENTS
We would like to thank Remond Fijneman and Jakob Albrethsen for
their works on the two cancer datasets used in the current article.
0
ALBU
CAH2
ITRA
LYC
OVAL
PHS2
Funding: VUmc-Cancer Center Amsterdam.
Conflict of Interest: none declared.
False positive rate (%)
(b) 3.5
REFERENCES
0
ALBU
CAH2
ITRA
LYC
OVAL
PHS2
Beta binomial test
test
One-way ANOVA
One-way ANOVA with log-transformation
Fig. 5. False positive rate of different statistical tests for three-group
comparison of three different spiked-in concentrations. (a) Using one
replicate. (b) Using two replicates. The result is averaged over all 27 possible
combinations from the 3×3×3 experiment. No test generates false positive
when all three replicates are used. See text for the details of the spiked-in
protein markers and the statistical tests.
Our implementation uses an optimization procedure where the result
of the method of moments is used as an initial estimate, and
extension for parameter constraints is straightforward.
An alternative mathematical model for two directions of variation
is the gamma-Poisson distribution. Here, the within-sample variation
is modeled with a Poisson distribution, and the other with a gamma
distribution. Our experiment with within-sample variation indicates
a strong similarity between the binomial model and the Poisson
model. Further evaluation is necessary to judge the performance of
the gamma-Poisson model.
In terms of computational speed, the beta-binomial test is slower
than the G-test, the t-test, the LPE test and one-way ANOVA. Its
speed is roughly equal to that of QSpec with the default setting.
Nevertheless, the computational time is marginal in comparison
with other processing steps such as sample preparation and mass
spectrometry. Furthermore, the processing of multiple proteins is
easily parallelizable. Thus, this issue is negligible.
Albrethsen,J. et al. (2009) Unravelling the nuclear matrix proteome. J. Proteomics, 72,
71–81.
Baggerly,K.A. et al. (2003) Differential expression in SAGE: accounting for normal
between-library variation. Bioinformatics, 19, 1477–1483.
Bantschef,M. et al. (2007) Quantitative mass spectrometry in proteomics: a critical
review. Anal. Bioanal. Chem., 389, 1017–1031.
Choi,H. et al. (2008) Significance analysis of spectral count data in label-free shotgun
proteomics. Mol. Cell. Proteomics, 7, 2373–2385.
Chourey,K. et al. (2009) Comparative temporal proteomics of a response regulator
(SO2426)-deficient strain and wild-type Shewanella oneidensis MR-1 during
chromate transformation. J. Proteome Res., 8, 59–71.
Crowder,M.J. (1978) Beta-binomial ANOVA for proportions. Appl. Stat., 27, 34–37.
Dix,M.M. et al. (2008) Global mapping of the topography and magnitude of proteolytic
events in apoptosis. Cell, 134, 679–691.
Ennis,D.M. and Bi,J. (1998) The beta-binomial model: accounting for inter-trial
variation in replicated difference and preference tests. J. Sens. Stud., 13, 389–412.
Jain,N. et al. (2003) Local-pooled-error test for identifying differentially expressed
genes with a small number of replicated microarrays. Bioinformatics, 19,
1945–1951.
Liu,H. et al. (2004) A model for random sampling and estimation of relative protein
abundance in shotgun proteomics. Anal. Chem., 76, 4193–4201.
Old,W.M. et al. (2005) Comparison of label-free methods for quantifying human
proteins by shotgun proteomics. Mol. Cell. Proteomics, 4, 1487–1502.
Ramani,A.K. et al. (2008) A map of human protein interactions derived from
co-expression of human mRNAs and their orthologs. Mol. Syst. Biol., 4, 180.
Skellam,J.G. (1948) A probability distribution derived from the binomial distribution
by regarding the probability of success as variable between the sets of trials. J. R.
Stat. Soc. Ser. B (Methodol.), 10, 257–261.
Sokal,R.R. and Rohlf,F.J. (1995) Analysis of frequencies. In Biometry: the Principles
and Practice of Statistics in Biological Research, 3rd edn. W. H. Freeman and Co.,
New York, pp. 685–793.
Williams,D.A. (1975) The analysis of binary responses from toxicological experiments
involving reproduction and teratogenicity. Biometrics, 31, 949–952.
Zhang,B. et al. (2006) Detecting differential and correlated protein expression in labelfree shotgun proteomics. J. Proteome Res., 5, 2909–2918.
Zybailov,B. et al. (2009) Large scale comparative proteomics of a Chloroplast Clp
protease mutant reveals folding stress, altered protein homeostasis, and feedback
regulation of metabolism. Mol. Cell. Proteomics, 8, 1789–1810.
369
[14:45 8/1/2010 Bioinformatics-btp677.tex]
Page: 369
363–369