Analysis of RT distributions with R
Emil Ratko-Dehnert
WS 2010/2011
Session 06 – 14.12.2010

Before we start with R...
• A concise recap of significance testing, as we will be computing tests with R today.

Excursion
THEORY OF SIGNIFICANCE TESTING

Trial analogy (I)
• A defendant is charged with a crime
• The prosecutor and the defense lawyer must try to convince a jury of his guilt or innocence, respectively
• Jurors are instructed to assume that the defendant is innocent unless proven guilty beyond the shadow of a doubt

Trial analogy (II)
• At the end, the jury decides on guilt or innocence based on the strength of their belief in the assumption of his innocence, given the evidence
• But: convictions can be erroneous(!)
• This depends on the threshold α used: P(evidence | innocent) <= α

In our case...
• H0: null hypothesis (formerly "innocent")
• H1: alternative (or research) hypothesis (formerly "guilty")
• One seeks to determine whether H0 is reasonable given the available data (formerly "evidence")

Test statistic and p-value
• A test statistic is computed from the data of an experiment and measures how compatible the data are with H0
• p-value := P(test statistic takes the observed value or a more extreme one | H0)
• If the p-value is small -> the test is statistically significant and casts doubt on H0

Confusion matrix

                         Reality
Test                     H0 true (innocent)                  H1 true (guilty)
H0 accepted (innocent)   True positive                       False positive (Type II error, β)
H0 rejected (guilty)     False negative (Type I error, α)    True negative (power; 1 - β)
                         -> Sensitivity                      -> Specificity

Inference: necessary steps
1. Identify H0, H1
2. Specify a test statistic that discriminates between H0 and H1; collect data and compute the test statistic
3. Using H1, specify values that are extreme under H0 in the direction of H1
4. Calculate the p-value under H0.
The smaller the value, the stronger the evidence against H0

Ex: calibration of a machine
• A machine produces a "thingy" with a specific height X (X ~ N(0, 1); null hypothesis)
• The height of a randomly chosen "thingy" is 0.7
• Is the machine still within its tolerance band or has it slipped (e.g. X ~ N(1, 1); alternative hypothesis)?

[Figure: shaded areas to the left and right of 0.7]
$P(X \le 0.7) = F(0.7)$, area = 0.7580
$P(X \ge 0.7) = 1 - F(0.7)$, area = 0.2420

[Figure: shaded areas 0.3821 and 0.2420]

[Figure: s = 10, area = 0.0134]
$p = P(X \ge 0.7 \mid X \sim N(0, 1/s)) = 1 - F(0.7)$

SIGNIFICANCE TEST FOR THE MEAN (T-TEST)

Student's t-test (I)
• Let the data X1, X2, ..., Xn be an iid sequence with Xi ~ N(μ, σ) (or n large enough for the CLT)
• A test of significance for
$$H_0: \mu = \mu_0, \qquad H_1: \mu < \mu_0,\ \mu > \mu_0,\ \text{or}\ \mu \neq \mu_0$$
  can be performed with the test statistic
$$T = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$

Student's t-test (II)
• T has the t-distribution with n-1 degrees of freedom under H0
• Let t be an observed value of the test statistic; then the p-value is computed as
$$p\text{-value} = \begin{cases} P(T \le t \mid H_0) & H_1: \mu < \mu_0 \\ P(T \ge t \mid H_0) & H_1: \mu > \mu_0 \\ P(|T| \ge |t| \mid H_0) & H_1: \mu \neq \mu_0 \end{cases}$$

Other types
• Test for the median
• Test of proportion
• Two-sample test; matched samples
• Test over the ranks (Wilcoxon)
• ...

Random comments
• Always check the assumptions of the t-test (normality, independence, ...)
• Statistical significance doesn't mean practical significance (look at the effect size)
• Repeated t-testing ("testing into compliance") has to be corrected for to prevent α-inflation

Student's t-test in R
• In R this is performed by the function t.test():
    t.test(x, mu = ..., alt = "two.sided")
• The output looks something like this: [Figure: t.test() output]

Quantile-quantile plots
• A q-q plot plots the quantiles of one distribution against the quantiles of another as points.
• If the distributions have similar shapes, the points will fall roughly along a straight line.
• In R you can use the qqplot() function to compare the distributions of two data sets
• To check normality, one can use qqnorm() and add a reference line with qqline()

AND NOW TO R
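As a minimal sketch of how the recap maps onto R, the following reproduces the areas from the calibration example with pnorm(), computes the t statistic by hand and compares it with t.test(), and finishes with a normal q-q plot. The simulated sample x, the seed and the reference value mu0 are illustrative assumptions, not data from the session. Reading s = 10 as the number of thingies whose mean height is 0.7 is likewise an assumption, chosen because it reproduces the area 0.0134.

```r
# Calibration example: areas at the observed height 0.7
p_lower <- pnorm(0.7)                     # P(X <= 0.7) under N(0, 1), about 0.7580
p_upper <- 1 - pnorm(0.7)                 # P(X >= 0.7) under N(0, 1), about 0.2420
p_alt   <- pnorm(0.7, mean = 1, sd = 1)   # P(X <= 0.7) under N(1, 1), about 0.3821

# Assumption: s = 10 thingies with mean height 0.7, so the mean has variance 1/s
s <- 10
p_mean <- 1 - pnorm(0.7, mean = 0, sd = sqrt(1 / s))   # about 0.0134

# One-sample t-test on simulated data (hypothetical sample, not from the session)
set.seed(1)
x   <- rnorm(20, mean = 0.5, sd = 1)
mu0 <- 0

# Test statistic and two-sided p-value computed by hand
n      <- length(x)
t_obs  <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_hand <- 2 * pt(-abs(t_obs), df = n - 1)

# The same test via t.test(); statistic and p-value should match the values above
res <- t.test(x, mu = mu0, alternative = "two.sided")
print(res)

# Normality check: normal q-q plot with reference line
qqnorm(x)
qqline(x)
```

For a one-sided alternative, alternative = "less" or alternative = "greater" can be passed to t.test() instead; the hand computation then uses pt(t_obs, df = n - 1) or its upper tail, respectively.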