Analysis of RT distributions with R

Emil Ratko-Dehnert
WS 2010/2011
Session 06 – 14.12.2010
Before we start with R...
• A concise recap of significance testing, as we
will be computing tests with R today.
Excursion
THEORY OF SIGNIFICANCE TESTING
Trial analogy (I)
• A defendant is charged with a crime
• The prosecutor and the defense lawyer must try to
convince a jury of his guilt or innocence, respectively
• Jurors are instructed to assume that the defendant
is innocent, unless proven guilty beyond the
shadow of a doubt
Trial analogy (II)
• At the end, the jury decides on guilt or innocence
based on the strength of its belief in the assumption
of his innocence, given the evidence
• But: Convictions can be erroneous(!)
• This depends on the threshold α used
P( evidence | innocent) <= α
In our case...
• H0 null hypothesis (formerly „innocent“)
• H1 alternative (or research) hypothesis
(formerly „guilty“)
• One seeks to determine whether H0 is
reasonable given the available data (formerly
„evidence“)
Test statistic and p-value
• A test statistic is computed from the experimental data
and measures how compatible the data are with H0
• p-value :=
P( test statistic takes the observed value or a more extreme one | H0)
• If the p-value is small -> the test is statistically significant and casts doubt on H0
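The definition above can be sketched in R for the simplest case. This is only an illustration: the observed value z_obs, the threshold alpha and the assumption that the test statistic is standard normal under H0 are all hypothetical choices, not taken from the slides.

```r
# Hypothetical observed test statistic, assumed to be N(0, 1) under H0
z_obs <- 2.1
alpha <- 0.05   # assumed significance threshold

# p-value: probability of a value at least as large as z_obs under H0
p_value <- pnorm(z_obs, lower.tail = FALSE)

# A p-value at or below alpha is called statistically significant
significant <- p_value <= alpha

print(p_value)
print(significant)
```

Here "more extreme" is taken to mean "larger", i.e. a one-sided test in the upper direction.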
Confusion matrix

                      Reality: H0 true       Reality: H1 true
                      (innocent)             (guilty)
Test: H0 accepted     True negative          False negative
(innocent)                                   (Type II error, β)
Test: H0 rejected     False positive         True positive
(guilty)              (Type I error, α)      (power, 1 - β)
                      -> Specificity         -> Sensitivity
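The error rates in the matrix can be made concrete with a small calculation. The sketch below assumes a hypothetical setting that is not part of this slide: a single observation that is N(0, 1) under H0 and N(1, 1) under H1, tested one-sided at an assumed level alpha = 0.05.

```r
# Assumed setting (hypothetical): one observation,
# H0: X ~ N(0, 1), H1: X ~ N(1, 1), one-sided test
alpha <- 0.05

# Critical value: reject H0 if the observation exceeds crit
crit <- qnorm(1 - alpha, mean = 0, sd = 1)

# Type II error: probability of NOT rejecting H0 although H1 is true
beta <- pnorm(crit, mean = 1, sd = 1)

# Power: probability of rejecting H0 when H1 is true
power <- 1 - beta

print(c(alpha = alpha, beta = beta, power = power))
```

With only one observation the two distributions overlap strongly, so the power is low; more data would shrink beta for the same alpha.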
Inference: necessary steps
1. Identify H0, H1
2. Specify a test statistic that discriminates between
H0, H1; collect data and compute the test statistic
3. Using H1, specify values that are extreme under
H0 in the direction of H1.
4. Calculate the p-value under H0. The smaller the
value, the stronger the evidence against H0
Ex: calibration of a machine
• A machine produces a „thingy“, with a specific
height X ( X ~ N(0, 1); null hypothesis)
• The height of a randomly chosen „thingy“ is 0.7
• Is the machine still in its tolerance band or has it
slipped (e.g. X ~ N(1, 1); alternative hypothesis)?
P(X ≤ 0.7) = F(0.7)
area = 0.7580
P(X ≥ 0.7) = 1 - F(0.7)
area = 0.2420
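These two areas can be reproduced with pnorm(), the normal distribution function in R. A minimal sketch; the variable names are chosen here for illustration only.

```r
x_obs <- 0.7   # observed height from the example

# F(0.7): area to the left of 0.7 under N(0, 1)
p_left <- pnorm(x_obs, mean = 0, sd = 1)

# 1 - F(0.7): area to the right of 0.7 under N(0, 1)
p_right <- pnorm(x_obs, mean = 0, sd = 1, lower.tail = FALSE)

print(round(c(p_left, p_right), 4))
```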
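[Figure: densities of N(0, 1) and N(1, 1) with the observed value 0.7 marked;
area under N(1, 1) to the left of 0.7 = 0.3821,
area under N(0, 1) to the right of 0.7 = 0.2420]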
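The two areas can be recomputed, and a comparable picture drawn, with a few lines of R. This is a sketch under the assumption that the figure shows the null density N(0, 1) and the alternative density N(1, 1) from the example; the plotting range and line styles are arbitrary choices.

```r
x_obs <- 0.7

# Area under the alternative N(1, 1) to the left of the observed value
area_h1_left <- pnorm(x_obs, mean = 1, sd = 1)

# Area under the null N(0, 1) to the right of the observed value
area_h0_right <- pnorm(x_obs, mean = 0, sd = 1, lower.tail = FALSE)

print(round(c(area_h1_left, area_h0_right), 4))

# Both densities with the observed value marked
curve(dnorm(x, mean = 0, sd = 1), from = -4, to = 5, ylab = "density")
curve(dnorm(x, mean = 1, sd = 1), add = TRUE, lty = 2)
abline(v = x_obs, col = "grey")
```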
s = 10
p = P(X ≥ 0.7 | X ~ N(0, 1/s)) = 1 - F(0.7)
area = 0.0134
(here 1/s is the variance, so F is the distribution function of N(0, 1/s))
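The value 0.0134 follows if 1/s is read as the variance, so that the standard deviation is 1/sqrt(s). A short check in R under that reading:

```r
s <- 10        # sample size from the slide
x_obs <- 0.7   # observed value

# pnorm() takes the standard deviation, so variance 1/s becomes sd 1/sqrt(s)
p <- pnorm(x_obs, mean = 0, sd = 1 / sqrt(s), lower.tail = FALSE)

print(round(p, 4))
```

Compared with the single-observation case, the same observed value is now far less plausible under H0, because the distribution under H0 is much narrower.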
SIGNIFICANCE TEST FOR THE MEAN
(T-TEST)
Student's t-test (I)
• Let the data X1, X2, ... Xn be an iid sequence,
Xi ~ N(μ, σ) (or n large enough for CLT)
• A test of significance for
H0: μ = μ0,
H1: μ < μ0, μ > μ0, or μ ≠ μ0
can be performed with the test statistic
T = (X̄ - μ0) / (s / √n)
where X̄ is the sample mean and s the sample standard deviation
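The test statistic can be computed by hand and compared with what t.test() reports. The data vector x and the value mu0 below are hypothetical and serve only as an illustration.

```r
# Hypothetical sample and hypothesized mean
x <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 4.7, 5.3)
mu0 <- 5

n <- length(x)

# T = (sample mean - mu0) / (sample sd / sqrt(n))
t_obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# The same statistic as computed by the built-in test
t_builtin <- unname(t.test(x, mu = mu0)$statistic)

print(c(by_hand = t_obs, builtin = t_builtin))
```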
Student's t-test (II)
• T has the t-distribution with n-1 degrees of freedom under H0
• Let t be an observed value of the test statistic,
then the p-value is computed by
p-value =
  P(T ≤ t | H0)        if H1: μ < μ0
  P(T ≥ t | H0)        if H1: μ > μ0
  P(|T| ≥ |t| | H0)    if H1: μ ≠ μ0
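The three cases can be evaluated with pt(), the distribution function of the t-distribution, and checked against t.test(). The sample x and the value mu0 are hypothetical.

```r
# Hypothetical sample and hypothesized mean
x <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 4.7, 5.3)
mu0 <- 5

n <- length(x)
df <- n - 1
t_obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# H1: mu < mu0  ->  P(T <= t | H0)
p_less <- pt(t_obs, df)

# H1: mu > mu0  ->  P(T >= t | H0)
p_greater <- pt(t_obs, df, lower.tail = FALSE)

# H1: mu != mu0 ->  P(|T| >= |t| | H0), using the symmetry of the t-distribution
p_two_sided <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)

# Cross-check against the built-in test
p_check <- c(
  less      = t.test(x, mu = mu0, alternative = "less")$p.value,
  greater   = t.test(x, mu = mu0, alternative = "greater")$p.value,
  two.sided = t.test(x, mu = mu0, alternative = "two.sided")$p.value
)

print(rbind(by_hand = c(p_less, p_greater, p_two_sided), builtin = p_check))
```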

Other types
• Test for the median
• Test of proportion
• Two sample test; Matched samples
• Rank-based test (Wilcoxon)
• ...
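Most of these tests have a direct counterpart in base R. The calls below are a rough sketch on simulated, hypothetical data; which function matches which bullet is an interpretation made here, and the one-sample Wilcoxon signed-rank test only acts as a test for the median if the distribution is symmetric.

```r
set.seed(42)
x <- rnorm(20, mean = 0.5)   # hypothetical sample 1
y <- rnorm(20, mean = 0)     # hypothetical sample 2

# Location test based on ranks (one-sample Wilcoxon signed-rank test)
res_median <- wilcox.test(x, mu = 0)

# Test of proportion: 42 successes in 100 trials against p = 0.5
res_prop <- prop.test(x = 42, n = 100, p = 0.5)

# Two-sample t-test (Welch version by default)
res_two <- t.test(x, y)

# Matched samples: paired t-test
res_paired <- t.test(x, y, paired = TRUE)

# Two-sample rank test (Wilcoxon rank-sum test)
res_rank <- wilcox.test(x, y)

print(res_two)
```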
Random comments
• Always check the assumptions of the t-test
(normality, independence, ...)
• Statistical significance doesn't mean practical
significance (look at effect size)
• Repeated t-testing („testing into compliance“)
has to be corrected to prevent α-inflation
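Each of these points has a simple counterpart in R. The sketch below uses hypothetical data and p-values; the Shapiro-Wilk test, Cohen's d and the Bonferroni and Holm corrections are common choices picked here for illustration, not methods prescribed by the slides.

```r
set.seed(1)
x <- rnorm(25, mean = 0.3)   # hypothetical sample
mu0 <- 0

# Normality check (Shapiro-Wilk test)
res_norm <- shapiro.test(x)

# Effect size: one-sample Cohen's d
d <- (mean(x) - mu0) / sd(x)

# Hypothetical p-values from three repeated tests
p_raw <- c(0.01, 0.04, 0.03)

# Corrections against alpha-inflation
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_holm <- p.adjust(p_raw, method = "holm")

print(res_norm$p.value)
print(d)
print(rbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm))
```

The Holm correction is never more conservative than Bonferroni, which is why it is often preferred when several tests are run on the same data.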
Student's t-test in R
• In R this is performed by the function t.test()
t.test(x, mu = ..., alternative = "two.sided")
• The output looks something like this
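A minimal, runnable sketch of such a call follows. The data are simulated, hypothetical reaction times in milliseconds, and the hypothesized mean of 500 ms is an arbitrary choice; the printed numbers therefore carry no meaning beyond illustrating the structure of the output.

```r
set.seed(123)
# Hypothetical reaction times in ms (simulated, not real data)
rt <- rnorm(30, mean = 520, sd = 60)

# Two-sided one-sample t-test of H0: mu = 500
res <- t.test(rt, mu = 500, alternative = "two.sided")

# Full printed report: t, df, p-value, confidence interval, sample mean
print(res)

# Individual components of the result object
res$statistic   # observed t value
res$parameter   # degrees of freedom (n - 1)
res$p.value     # p-value
res$conf.int    # 95% confidence interval for the mean
```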
Quantile-quantile Plots
• A q-q-plot plots the quantiles of one distribution
against the quantiles of another as points.
• If the distributions have similar shapes, the
points will fall roughly along a straight line.
Quantile-quantile Plots
• In R you can use the qqplot() function to compare the
distributions of two data sets against each other
• To check normality, one can use qqnorm() and add
a reference line with qqline()
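The three functions can be tried on simulated data. Both samples below are hypothetical: rt_a is deliberately right-skewed (a shifted exponential) and rt_b is normal, so the plots show what a departure from a straight line looks like.

```r
set.seed(1)
# Hypothetical samples (simulated, in ms)
rt_a <- 300 + rexp(200, rate = 1 / 100)   # right-skewed
rt_b <- rnorm(200, mean = 400, sd = 50)   # normal

# Normality check: sample quantiles against normal quantiles
qqnorm(rt_a, main = "Normal Q-Q plot of rt_a")
qqline(rt_a)   # reference line through the first and third quartiles

# Comparison of two samples: quantiles of rt_a against quantiles of rt_b
qq <- qqplot(rt_a, rt_b, xlab = "rt_a quantiles", ylab = "rt_b quantiles")
```

For the skewed sample the points in the normal q-q plot bend away from the reference line in the upper tail, while a normal sample would stay close to it.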
AND NOW TO