t-test / F-test / Z-test / Chi-square test

t-test

The t-test was described in 1908 by William Sealy Gosset for monitoring the brewing at Guinness in Dublin. Guinness considered the use of statistics a trade secret, so he published his test under the pen-name 'Student'; hence the test is now often called the 'Student's t-test'.

The t-test is a basic test that is limited to two groups. For multiple groups, you would have to compare each pair of groups; for example, with three groups there would be three tests (AB, AC, BC).

The t-test (or Student's t-test) gives an indication of the separateness of two sets of measurements, and is thus used to check whether two sets of measures are essentially different (and usually whether an experimental effect has been demonstrated). The typical way of doing this is with the null hypothesis that the means of the two sets of measures are equal.

The t-test assumes:
- a normal distribution (parametric data);
- equal underlying variances (if not, use Welch's test).

It is used when there is random assignment and only two sets of measurements to compare.

There are two main types of t-test:
- Independent-measures t-test: when samples are not matched.
- Matched-pair t-test: when samples appear in pairs (e.g. before-and-after).

Independent one-sample t-test

In testing the null hypothesis that the population mean is equal to a specified value $\mu_0$, one uses the statistic

$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation and $n$ is the sample size. The number of degrees of freedom used in this test is $n - 1$.

Independent two-sample t-test

Equal sample sizes, equal variance

This test is used only when both:
- the two sample sizes (that is, the number $n$ of participants in each group) are equal;
- it can be assumed that the two distributions have the same variance.

Violations of these assumptions are discussed below. The t statistic to test whether the means are different can be calculated as follows:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{2/n}}$, where $s_p = \sqrt{\dfrac{s_1^2 + s_2^2}{2}}$.

Here $s_p$ is the grand standard deviation (or pooled standard deviation); subscript 1 denotes group one and subscript 2 denotes group two. The denominator of $t$ is the standard error of the difference between the two means. For significance testing, the number of degrees of freedom for this test is $2n - 2$, where $n$ is the number of participants in each group.

Unequal sample sizes, equal variance

This test is used only when it can be assumed that the two distributions have the same variance. (When this assumption is violated, see below.) The t statistic to test whether the means are different can be calculated as follows:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$.

Note that these formulae are generalizations of the equal-sample-size case. $s_p$ is an estimator of the common standard deviation of the two samples: it is defined in this way so that its square is an unbiased estimator of the common variance whether or not the population means are the same. In these formulae, $n_i$ is the number of participants in group $i$ (1 = group one, 2 = group two). $n_i - 1$ is the number of degrees of freedom for group $i$, and the total sample size minus two (that is, $n_1 + n_2 - 2$) is the total number of degrees of freedom, which is used in significance testing.
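To make the formulas above concrete, here is a minimal sketch in Python using scipy.stats; the group names and sample values are invented for illustration. It runs the one-sample test and the pooled two-sample test, and recomputes the pooled statistic by hand from the formula above.

```python
import numpy as np
from scipy import stats

# Invented sample data for two independent groups (illustration only).
group1 = np.array([5.1, 4.9, 6.2, 5.8, 5.5, 5.0, 6.1, 5.4])
group2 = np.array([4.2, 4.8, 4.5, 5.0, 4.4, 4.9, 4.6, 4.3])
n1, n2 = len(group1), len(group2)

# One-sample t-test: is the mean of group1 equal to mu0 = 5.0?
t_one, p_one = stats.ttest_1samp(group1, popmean=5.0)

# Pooled two-sample t-test (assumes equal variances).
t_two, p_two = stats.ttest_ind(group1, group2, equal_var=True)

# The same pooled statistic by hand, following the formulas above:
# sp^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2)
sp2 = ((n1 - 1) * group1.var(ddof=1) + (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2)
t_hand = (group1.mean() - group2.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))

print(f"one-sample: t = {t_one:.3f}, p = {p_one:.3f} (df = {n1 - 1})")
print(f"two-sample: t = {t_two:.3f}, p = {p_two:.3f} (df = {n1 + n2 - 2})")
print(f"by hand:    t = {t_hand:.3f}")  # matches the two-sample result
```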
Unequal sample sizes, unequal variance

This test is used only when the two population variances are assumed to be different (the two sample sizes may or may not be equal) and hence must be estimated separately. The t statistic to test whether the population means are different can be calculated as follows:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

where $s_i^2$ is the unbiased estimator of the variance of sample $i$ and $n_i$ is the number of participants in group $i$ (1 = group one, 2 = group two). Note that in this case $\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}$ is not a pooled variance. For use in significance testing, the distribution of the test statistic is approximated as an ordinary Student's t distribution with the degrees of freedom calculated using

$\nu = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$

This is called the Welch–Satterthwaite equation. Note that the true distribution of the test statistic actually depends (slightly) on the two unknown variances.

Dependent t-test for paired samples

This test is used when the samples are dependent; that is, when there is only one sample that has been tested twice (repeated measures) or when there are two samples that have been matched or "paired". This is an example of a paired difference test. The test statistic is

$t = \dfrac{\bar{X}_D - \mu_0}{s_D / \sqrt{n}}$

For this equation, the differences between all pairs must be calculated. The pairs are either one person's pre-test and post-test scores or pairs of persons matched into meaningful groups (for instance, drawn from the same family or age group). The average ($\bar{X}_D$) and standard deviation ($s_D$) of those differences are used in the equation. The constant $\mu_0$ is non-zero if you want to test whether the average of the differences is significantly different from $\mu_0$. The number of degrees of freedom used is $n - 1$.

F-test

An F-test (Snedecor and Cochran, 1983) is used to test if the standard deviations of two populations are equal. This test can be a two-tailed test or a one-tailed test. The two-tailed version tests against the alternative that the standard deviations are not equal. The one-tailed version only tests in one direction: that the standard deviation of the first population is either greater than or less than (but not both) the second population standard deviation. The choice is determined by the problem. For example, if we are testing a new process, we may only be interested in knowing whether the new process is less variable than the old process.

The test statistic is simply

$F = \dfrac{s_1^2}{s_2^2}$

where $s_1^2$ and $s_2^2$ are the sample variances, with $N_1$ and $N_2$ observations respectively. The more this ratio deviates from 1, the stronger the evidence for unequal population variances. The variances are arranged so that $F > 1$; that is, $s_1^2 > s_2^2$. We use the F-test like the Student's t-test, only we are testing for significant differences in the variances.

The hypothesis that the two standard deviations are equal is rejected if

$F > F_{\alpha,\,N_1-1,\,N_2-1}$ for an upper one-tailed test,
$F < F_{1-\alpha,\,N_1-1,\,N_2-1}$ for a lower one-tailed test,
$F > F_{\alpha/2,\,N_1-1,\,N_2-1}$ or $F < F_{1-\alpha/2,\,N_1-1,\,N_2-1}$ for a two-tailed test,

where $F_{\alpha,\,\nu_1,\,\nu_2}$ is the critical value of the F distribution with significance level $\alpha$ and degrees of freedom $\nu_1 = N_1 - 1$ and $\nu_2 = N_2 - 1$.
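The tests in this part of the section map onto library calls as follows; a minimal sketch, assuming scipy is available, with invented before/after data. scipy has no single-call F-test for two variances, so the ratio and its critical value are computed directly from the formulas above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Invented data: scores for the same 10 subjects measured twice.
before = rng.normal(50, 10, size=10)
after = before + rng.normal(2, 5, size=10)

# Welch's t-test: unequal variances, df from the Welch-Satterthwaite equation.
t_w, p_w = stats.ttest_ind(before, after, equal_var=False)

# Dependent (paired) t-test on the same subjects measured twice.
t_p, p_p = stats.ttest_rel(before, after)

# F-test for equal variances (no single scipy call; computed from the
# formulas above). Arrange the variances so that F > 1.
v1, v2 = before.var(ddof=1), after.var(ddof=1)
if v1 >= v2:
    F, df1, df2 = v1 / v2, len(before) - 1, len(after) - 1
else:
    F, df1, df2 = v2 / v1, len(after) - 1, len(before) - 1
alpha = 0.05
F_crit = stats.f.ppf(1 - alpha / 2, df1, df2)  # two-tailed critical value

print(f"Welch:  t = {t_w:.3f}, p = {p_w:.3f}")
print(f"paired: t = {t_p:.3f}, p = {p_p:.3f}")
print(f"F-test: F = {F:.3f}, two-tailed critical value = {F_crit:.3f}")
```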
Chi-square test

The chi-square test may be used both as a test of goodness-of-fit (comparing frequencies of one attribute variable to theoretical expectations) and as a test of independence (comparing frequencies of one attribute variable for different values of a second attribute variable). The underlying arithmetic of the test is the same; the only difference is the way the expected values are calculated. Goodness-of-fit tests and tests of independence are used for quite different experimental designs and test different null hypotheses, so we will consider the chi-square test of goodness-of-fit and the chi-square test of independence to be two distinct statistical tests.

When to use it

The chi-square test of independence is used when you have two attribute variables, each with two or more possible values. A data set like this is often called an "R x C table," where R is the number of rows and C is the number of columns. For example, if you surveyed the frequencies of three flower phenotypes (red, pink, white) in four geographic locations, you would have a 3 x 4 table. You could also consider it a 4 x 3 table; it doesn't matter which variable is the columns and which is the rows. It is also possible to do a chi-square test of independence with more than two attribute variables, but that experimental design doesn't occur very often and is rather complicated to analyze and interpret, so we won't cover it.

Chi-square goodness-of-fit test

An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any univariate distribution for which you can calculate the cumulative distribution function. The disadvantage of the chi-square test is that it requires a sufficient sample size in order for the chi-square approximation to be valid.

Hypotheses

The chi-square test is defined for the hypotheses:
H0: The data follow the specified distribution.
H1: The data do not follow the specified distribution.

Test statistic

For the chi-square goodness-of-fit computation, the data are divided into k bins and the test statistic is defined as

$\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}$

where $O_i$ is the observed frequency for bin $i$ and $E_i$ is the expected frequency. The hypothesis (H0) that the data are from a population with the specified distribution is rejected if the calculated $\chi^2$ exceeds the table value of $\chi^2$ at $k - 1$ degrees of freedom. The number of degrees of freedom is the number of categories in the problem minus 1.

Calculating chi-square

                         Green    Yellow
Observed (o)             639      241
Expected (e)             660      220
Deviation (d = o - e)    -21      21
Deviation squared (d^2)  441      441
d^2/e                    0.668    2.005

$\chi^2 = \sum d^2/e = 2.673$
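As a quick check on the arithmetic in the table above, a minimal sketch using scipy.stats.chisquare, which computes the same sum of d^2/e directly from the observed and expected counts:

```python
from scipy import stats

# Green/yellow counts from the worked table above; expected follows a 3:1 ratio.
observed = [639, 241]
expected = [660, 220]

# chisquare computes sum((o - e)^2 / e) with k - 1 degrees of freedom.
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")  # chi-square = 2.673, df = 1
```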
Test of independence of attributes (comparing frequencies)

2 x 2 contingency table

There are several types of chi-square tests, depending on the way the data were collected and the hypothesis being tested. We'll begin with the simplest case: a 2 x 2 contingency table. For a contingency table that has r rows and c columns, the chi-square test can be thought of as a test of independence. In a test of independence the null and alternative hypotheses are:
H0: The two categorical variables are independent.
H1: The two categorical variables are related.

If we set the 2 x 2 table to the general notation shown below in Table 1, using the letters a, b, c, and d to denote the contents of the cells, then we would have the following table:

Table 1. General notation for a 2 x 2 contingency table.

                          Variable 2
Variable 1     Category 1   Category 2   Total
Data type 1    a            c            a + c
Data type 2    b            d            b + d
Total          a + b        c + d        a + b + c + d = N

For a 2 x 2 contingency table, the chi-square statistic is calculated by the formula:

$\chi^2 = \dfrac{N(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$

When a comparison is made between one sample and another, a simple rule is that the degrees of freedom equal (number of columns minus one) x (number of rows minus one), not counting the totals for rows or columns. If the calculated $\chi^2$ exceeds the table value of $\chi^2$ at (rows - 1) x (columns - 1) degrees of freedom, then we reject the null hypothesis and say that the attributes are associated.

Example: Suppose you conducted a drug trial on a group of animals and hypothesized that the animals receiving the drug would survive better than those that did not receive the drug. You conduct the study and collect the following data:

H0: The survival of the animals is independent of drug treatment.
H1: The survival of the animals is associated with drug treatment.

Table 2. Number of animals that survived a treatment.

               Dead   Alive   Total
Treated        36     14      50
Not treated    30     25      55
Total          66     39      105

Applying the formula above, we get:

$\chi^2 = \dfrac{105\,[(36)(25) - (14)(30)]^2}{(50)(55)(39)(66)} = 3.418$

with (2 - 1) x (2 - 1) = 1 degree of freedom. The table value of $\chi^2$ at the 0.05 significance level with 1 degree of freedom is 3.841. Since 3.418 < 3.841, we fail to reject the null hypothesis: the data do not show a significant association between survival and drug treatment.
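The same result can be reproduced with scipy.stats.chi2_contingency, which takes the 2 x 2 table of observed counts and computes the expected values from the row and column totals; a minimal sketch:

```python
import numpy as np
from scipy import stats

# Table 2 as a 2 x 2 array: rows = treated / not treated, columns = dead / alive.
table = np.array([[36, 14],
                  [30, 25]])

# correction=False reproduces the uncorrected formula used above; by default
# scipy applies Yates' continuity correction to 2 x 2 tables.
chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")  # 3.418, df = 1
```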