Analytical Tools for High Throughput Sequencing Data

Marshall University School of Medicine
Department of Biochemistry and Microbiology
BMS 617
Lecture 13: One-way ANOVA
Marshall University Genomics Core Facility
GRHL2 Gene Expression (again)
• Revisit our GRHL2 expression experiment
– Compared the expression of GRHL2 in three
different types of cell line
• Basal A, Basal B, Luminal
• Previously compared just one group against
another, using a t-test
• Really want to compare all three groups
GRHL2 Expression
What’s wrong with t-tests?
• Many investigators’ instinct here is to use t-tests:
– Could use three t-tests:
• Basal A vs Basal B
• Basal A vs Luminal
• Basal B vs Luminal
• But this becomes a multiple-hypothesis testing problem
• If the null hypothesis is true (the expression is the same in all
three cell types), the chance of seeing data that produce one or more
p-values less than 0.05 is 14.3%, treating the three tests as independent.
– Problem increases dramatically as the number of groups
increases
– 40.1% for 5 groups, 76.2% for 8 groups
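The percentages above can be reproduced with a short calculation. This is a minimal sketch that assumes every pair of groups is tested once at a significance level of 0.05 and that the tests are independent (real pairwise t-tests share data, so this is only an approximation). The function name is illustrative.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one p-value below alpha across all pairwise tests,
    assuming the null hypothesis is true and the tests are independent."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests

for k in (3, 5, 8):
    print(f"{k} groups: {comb(k, 2)} tests, {familywise_error_rate(k):.1%}")
```

With 3, 5 and 8 groups there are 3, 10 and 28 pairwise tests, which is why the error rate climbs so quickly.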
One-Way ANOVA
• The correct way to analyze these data is with
one-way ANOVA
– ANOVA is an abbreviation for “Analysis of
Variance”
• This just describes how the technique works
• We are still comparing the means of the groups
– Not comparing the variance
How ANOVA works
• Can think of ANOVA as a comparison of two models
– Similarly to the way we described linear regression
• The null hypothesis model
– All groups have the same mean
• Y_i = μ + ε_i
• The alternative model
– Each group has its own mean
• Y_i,j = μ_j + ε_i,j
• μ_j is the mean for group j
• We quantify how well the data fit the model by computing the sum of
squares of the residuals (deviations from the model) for each model
– For the null hypothesis, sum of squares of deviations from overall mean
– For the alternative hypothesis, sum of squares of deviations from the group
means
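The two sums of squares described above can be computed as follows. This is a sketch on hypothetical toy data (the values are made up and are not the GRHL2 measurements). The helper names are illustrative.

```python
# Hypothetical toy data: made-up values, not the GRHL2 measurements.
groups = {
    "A": [4.1, 3.8, 4.4, 4.0],
    "B": [2.9, 3.2, 3.0],
    "C": [5.1, 4.8, 5.3, 5.0],
}

def mean(values):
    return sum(values) / len(values)

def sum_sq(values, center):
    """Sum of squared deviations of the values from a given center."""
    return sum((v - center) ** 2 for v in values)

all_values = [v for vals in groups.values() for v in vals]

# Null model: every observation is predicted by the overall mean.
ss_null = sum_sq(all_values, mean(all_values))

# Alternative model: each observation is predicted by its own group mean.
ss_alt = sum(sum_sq(vals, mean(vals)) for vals in groups.values())

print(f"null model sum of squares:        {ss_null:.3f}")
print(f"alternative model sum of squares: {ss_alt:.3f}")
```

The alternative model never has a larger sum of squares than the null model, because each group mean is the best-fitting constant for its own group.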
ANOVA for GRHL2 data as a model comparison

Hypothesis    Scatter from           Sum of Squares   Percentage of variation
Null          Grand mean             48.696           100%
Alternative   Group means            14.679           30.1%
Difference    Improvement of model   34.017           69.9%

• R² = 0.699
• Same interpretation as before
• The proportion of variation which is “accounted for” by the model
• In the context of an ANOVA, this is usually called η²
• Note that any grouping will reduce the sum of squares
• So we need to do something more sophisticated than just seeing if R²
increases to know if this is a “good” model
• Must account for the sample size and the number of parameters in the
model
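As a check on the arithmetic, η² can be recomputed from the two sums of squares reported for the GRHL2 data. The variable names in this sketch are illustrative.

```python
ss_total = 48.696   # scatter from the grand mean (null model)
ss_within = 14.679  # scatter from the group means (alternative model)

ss_between = ss_total - ss_within    # improvement from the model
eta_squared = ss_between / ss_total  # proportion of variation accounted for

print(f"improvement = {ss_between:.3f}, eta squared = {eta_squared:.3f}")
```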
Usual presentation of ANOVA
Interpreting the ANOVA table
• In this presentation, the first row is the
“improvement from the model”
– This is the “total sum of squares” minus the sum of
squares for the alternative model
• Total sum of squares is the sum of squares of differences
between points and the overall mean
• Sum of squares for the alternative model (the within-groups sum of
squares) is the sum of squares of differences between points and
their group mean
– The difference turns out to be the sum of squares of
the differences between the group means and the
overall mean, weighted by the number in each group
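The identity in the last bullet can be checked numerically. This sketch uses hypothetical toy data (made-up values, not the GRHL2 measurements) and compares the difference of the two sums of squares with the weighted sum of squared deviations of the group means.

```python
import math

# Hypothetical toy data: made-up values, not the GRHL2 measurements.
groups = [
    [4.1, 3.8, 4.4, 4.0],
    [2.9, 3.2, 3.0],
    [5.1, 4.8, 5.3, 5.0],
]

def mean(values):
    return sum(values) / len(values)

all_values = [v for g in groups for v in g]
grand_mean = mean(all_values)

# Deviations of every point from the overall mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

# Deviations of every point from its own group mean.
ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

# Deviations of each group mean from the overall mean, weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

print(math.isclose(ss_total - ss_within, ss_between))
```

The weighting by group size matters: dropping `len(g)` from `ss_between` breaks the equality whenever a group has more than one observation.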
Data plot (again)
ANOVA as partition of sum of squares
• This is the traditional view of ANOVA
– Partitioning the total sum of squares (variance) into
• The within-groups sum of squares
• The between-groups sum of squares
– If the between-groups sum of squares is “large”
compared to the within-groups sum of squares, we
conclude the groups do not have the same mean
– In order to make this precise we have to compute the
degrees of freedom
Degrees of Freedom
• Easiest way to think of degrees of freedom is via models
– Number of data points, subtract number of parameters in the model
• In our example, we have 51 data points
• Null hypothesis model (total sum of squares) has one parameter
(the mean)
– So total sum of squares has 50 degrees of freedom
• Alternative model (within groups sum of squares) has 3 parameters
(mean for each group)
– So within-groups sum of squares has 48 degrees of freedom
• Between groups is the difference in sum of squares
– Degrees of freedom is the difference in degrees of freedom in the two
models
– In our case, 50-48=2 degrees of freedom.
Mean squares and F ratio
• As with model comparison, the mean squares is the
sum of squares divided by the degrees of freedom
• The F ratio is the between-groups mean squares
divided by the within-groups mean squares
– Distribution of F is known for each pair of d.f.
– So p-value can be computed
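The F ratio for the GRHL2 data can be rebuilt from the sums of squares and degrees of freedom given earlier. This is a sketch under one stated assumption: the p-value line uses a closed form for the upper tail of the F distribution that holds only when the between-groups degrees of freedom equal 2, as they do here. For other degrees of freedom a statistics library would be needed (for example `scipy.stats.f.sf`).

```python
n_points = 51       # data points in the GRHL2 example
n_groups = 3        # Basal A, Basal B, Luminal
ss_total = 48.696   # sum of squares about the grand mean
ss_within = 14.679  # sum of squares about the group means

ss_between = ss_total - ss_within

df_total = n_points - 1            # null model has one parameter
df_within = n_points - n_groups    # alternative model has one mean per group
df_between = df_total - df_within  # difference between the two models

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Upper-tail probability of F(2, df_within); valid ONLY because df_between == 2.
assert df_between == 2
p_value = (1 + df_between * f_ratio / df_within) ** (-df_within / 2)

print(f"F({df_between}, {df_within}) = {f_ratio:.2f}, p = {p_value:.2e}")
```

The result agrees with the p-value of 3.17 × 10⁻¹³ quoted for this ANOVA.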
Interpreting the p-value for an ANOVA
• The null hypothesis for the ANOVA is that all
groups are sampled from the same population
– In particular, they all have the same population
mean
• A small p-value gives us evidence to reject the
null hypothesis
– I.e. we have evidence to conclude that “not all the
groups have the same mean”
– On its own, this is often not very useful…
Post-Hoc tests
• The ANOVA analysis showed a small p-value
(p = 3.17 × 10⁻¹³), so we have strong evidence to
reject the null hypothesis
– I.e. we are justified in believing the means for
each group are not all the same
• However, this doesn’t tell us which groups are
different to which other groups
– This is usually what you want to know
• “Post-hoc” tests can be used to determine this
Fisher’s Least Significant Difference
• Fisher’s Least Significant Difference was the first
Post-Hoc test developed for ANOVA
• Has really been superseded by more
sophisticated tests
• Fisher’s Least Significant Difference can only be
performed after an ANOVA gives a significant
result
– This is why it is called a “post-hoc” test
– The same terminology is used for more modern tests
• Though most of these actually make sense even without
performing an ANOVA
Tukey’s Honest Significant Differences
• Tukey’s HSD compares every group to every
other group
• Works by considering the distribution of the largest
t-ratio among all possible pairwise t-tests, under the
assumption of the null hypothesis
• Tukey’s HSD associates confidence intervals
with each pairwise comparison
– these are family-wise confidence intervals
Results of Tukey HSD for GRHL2 data
Interpreting the CIs
• The confidence intervals are family-wise 95% confidence intervals
– We are 95% confident that all the intervals shown contain the true
difference in means in the population
– Only makes sense to show all of these (not a subset of them)
• If the 95% confidence interval contains zero, the difference is not
statistically significant at a significance level of 0.05
• If the 95% confidence interval does not contain zero, the difference
is statistically significant at a significance level of 0.05
• Since these are family-wise measures, it only makes sense to think
of the whole family at once
– Think of this as dividing the comparisons into two groups
• Those that are statistically significantly different, and those that are not
• Cannot talk about statistical significance of one comparison in isolation
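The rule for reading the intervals can be written as a short check. The intervals in this sketch are hypothetical (made-up numbers, not the Tukey HSD output for the GRHL2 data), and the function name is illustrative.

```python
# Hypothetical family-wise 95% confidence intervals for differences in means.
# These numbers are made up; they are not the GRHL2 Tukey HSD results.
intervals = {
    "B - A": (-0.45, 0.62),
    "C - A": (0.80, 1.95),
    "C - B": (0.71, 1.88),
}

def excludes_zero(low, high):
    """True if the interval lies entirely above or entirely below zero."""
    return low > 0 or high < 0

# Split the whole family into two groups in one pass.
significant = [name for name, (low, high) in intervals.items()
               if excludes_zero(low, high)]
not_significant = [name for name in intervals if name not in significant]

print("significant at 0.05:    ", significant)
print("not significant at 0.05:", not_significant)
```

The split is reported for the whole family together, matching the point that a single comparison should not be interpreted in isolation.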
Dunnett’s Test
• Dunnett’s test is another post-hoc test for one-way ANOVA
• Instead of comparing every group to every other
group, it compares every group to a single control
group
– Must decide before the experiment which is to be the
control group (and why)
• Since there are fewer comparisons than Tukey’s
test, Dunnett’s test has more statistical power
Other post-hoc tests
• There are other approaches to testing between
groups in a one-way ANOVA
– Again, must consider multiple hypotheses when doing
these kinds of tests
– Specialized post-hoc tests account for this
• Most naïve approach:
– Compare with t-tests and use Bonferroni correction
– Flexible – can compare arbitrary groups
• But not statistically powerful
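The Bonferroni correction itself is simple to apply once the t-tests have been run. This sketch starts from hypothetical raw p-values (made up for illustration, not computed from the GRHL2 data) and multiplies each by the number of comparisons, capping the result at 1.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values from three pairwise t-tests (made-up numbers).
raw = {"A vs B": 0.030, "A vs C": 0.004, "B vs C": 0.200}

adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

for name, p in adjusted.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: adjusted p = {p:.3f} ({verdict} at 0.05)")
```

In this made-up example the first comparison has a raw p-value below 0.05 but is no longer significant after adjustment, which illustrates the loss of power.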
Scheffé’s Test
• Scheffé’s test is the most flexible post-hoc test for
ANOVA
– Can be used with an arbitrary set of comparisons
between groups
• Called “contrasts”
– More powerful than Bonferroni
– Less powerful than Tukey’s or Dunnett’s tests in cases where
those are applicable
• Decision as to which post-hoc test should be used
should be made at experimental design time
– Not based on the data
Summary
• One-way ANOVA is used to compare means
across more than two groups
• The generated p-value is the probability of seeing
differences between the groups at least as big as
those observed, assuming the groups are all
sampled from populations with the same mean
• Can think of this as a comparison of models
• Post-hoc tests are usually used to determine
which groups are different
– Most common are Tukey’s test (all comparisons) and
Dunnett’s test (each group compared to control)