Nominal/Ordinal/Categorical data
Black Chapter 20

What if the data that we obtain are not interval or ratio level?
Working with percentages
  Do children with reading disabilities have trouble getting along with their teachers?
  What personality characteristics are associated with harsh parenting?
  Are low SES children more likely to have psychopathology than those in middle and high SES?

Categorical Data Do's and Don'ts (not in Black)

Tabular presentation of data

Reading level          Public Schools   Private Schools   Religious Schools
Behind grade level           36               24                 21
At grade level               33               28                 43
Above grade level            14               22                 14

Do you have a directional hypothesis?
  If yes: use that to guide your presentation.
  If no: think about temporal or logical sequencing.
  Present percentages accordingly.

SPSS crosstabs output: presenting percentages
Each cell shows the count, the row %, the column %, and the % of the total.

Reading level                      Public Schools   Private Schools   Religious Schools   SUM
Behind grade level   Count               36               24                 21            81
                     Row %             44.44%           29.63%             25.93%
                     Column %          43.37%           32.43%             26.92%
                     % of total        15.32%           10.21%              8.94%
At grade level       Count               33               28                 43           104
                     Row %             31.73%           26.92%             41.35%
                     Column %          39.76%           37.84%             55.13%
                     % of total        14.04%           11.91%             18.30%
Above grade level    Count               14               22                 14            50
                     Row %             28.00%           44.00%             28.00%
                     Column %          16.87%           29.73%             17.95%
                     % of total         5.96%            9.36%              5.96%
SUM                                      83               74                 78           235

Presenting percentages
Table 1. Distribution of the reading level of children in public, private, and religious schools (N=235)

Reading Level          Public Schools   Private Schools   Religious Schools
Behind grade level         43.4%            32.4%              26.9%
At grade level             39.8%            37.8%              55.1%
Above grade level          16.9%            29.7%              17.9%
N                            83               74                 78

Writing about tabular data
Among the children in public schools, 43% were reading behind their grade level, compared to 32% in private schools and 27% in religious schools. Children reading above their grade level differed as well, with only 17% reading above their grade level in public schools and 18% in religious schools, but 30% in private schools.
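The column percentages in Table 1 can be recomputed directly from the counts. The following is a minimal sketch, not the SPSS procedure itself; the names `counts` and `column_percentages` are illustrative and do not come from the slides.

```python
# Observed counts from the tabular presentation, keyed by school type.
counts = {
    "Public Schools": {"Behind grade level": 36, "At grade level": 33, "Above grade level": 14},
    "Private Schools": {"Behind grade level": 24, "At grade level": 28, "Above grade level": 22},
    "Religious Schools": {"Behind grade level": 21, "At grade level": 43, "Above grade level": 14},
}


def column_percentages(table):
    """Percent of each school's children falling in each reading level."""
    result = {}
    for school, levels in table.items():
        total = sum(levels.values())  # column N (83, 74, 78)
        result[school] = {level: 100 * n / total for level, n in levels.items()}
    return result


percentages = column_percentages(counts)
for school, levels in percentages.items():
    for level, pct in levels.items():
        print(f"{school:18s} {level:20s} {pct:5.1f}%")
```

Percentages are taken within each school type because school type is treated here as the grouping variable, so each column sums to 100%.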
Graphical presentation of categorical data
[Figure: percentage of children behind, at, and above grade level in public, private, and religious schools; vertical axis from 10% to 50%]
[Figure: the same three reading levels by school type; vertical axis from 0% to 100%; legend order: above, at, behind grade level]
[Figure: the same three reading levels by school type; vertical axis from 0.0% to 100.0%; legend order: behind, at, above grade level]

Nominal/Ordinal data
Black Chapter 20

Reflection exercise
Do you see any similarities between t-tests and F-tests on one hand, and chi-square tests on the other?

The concept of observed and expected frequencies
We may have prior information about how things are in the population.

Children who have reading problems   Observed   Population percentages
Yes                                     14              10%
No                                      40              90%

What would be the observed and expected frequencies?
Compute the expected frequencies.

Children who have reading problems   Observed   Expected frequencies   Population percentages
Yes                                     14              5.4                    10%
No                                      40             48.6                    90%
Total                                   54             54

Is this sample similar to a known population?
Chi-square test

\[ \chi^2 = \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i} \]

where m is the number of cells, O_i and E_i are the observed and expected frequencies in cell i, and df = m - 1.

The p-value tests the null hypothesis that this sample is similar to the known population. It gives the probability of obtaining a chi-square value at least this large if the null hypothesis were true; it is not the probability that the null hypothesis is true.
A p-value greater than the chosen significance level means we cannot reject the null hypothesis: the sample is consistent with the known population.
A p-value smaller than the chosen significance level indicates that this sample is coming from a population different from the known population.
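The one-sample computation for the reading-problems data (14 observed "Yes", 40 observed "No", population percentages 10% and 90%) can be sketched with the Python standard library. The function names are illustrative assumptions. The p-value helper uses the identity p = erfc(sqrt(chi2 / 2)), which holds only for df = 1, so it is not a general replacement for a chi-square distribution function.

```python
import math


def one_sample_chi_square(observed, population_props):
    """Return (expected frequencies, chi-square value, df) for a one-sample test."""
    n = sum(observed)
    expected = [n * p for p in population_props]  # E_i = N * population proportion
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # df = m - 1
    return expected, chi2, df


def p_value_df1(chi2):
    """Upper-tail chi-square probability; this closed form is valid only for df = 1."""
    return math.erfc(math.sqrt(chi2 / 2))


expected, chi2, df = one_sample_chi_square([14, 40], [0.10, 0.90])
p = p_value_df1(chi2)
print("expected:", [round(e, 1) for e in expected])  # 5.4 and 48.6
print("chi-square:", round(chi2, 3), "df:", df)      # chi-square is about 15.218, df = 1
print("p-value:", p)
```

Because the p-value is far below conventional significance levels such as 0.05, this sample would be judged different from the known population.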
Apply the chi-square test

Children who have reading problems   Observed   Expected frequencies   Population percentages   Chi square
Yes                                     14              5.4                    10%              13.696296
No                                      40             48.6                    90%               1.5218107
Total                                   54             54                                       15.218107
p                                                                                                0.000096

1-sample chi-square test: Application guide
Write the null hypothesis:
  1. This sample is coming from the known population.
  2. This sample is similar to the known population.
Write the alternative hypothesis:
  1. This sample is not coming from the known population.
  2. This sample is not similar to the known population.
Compute the "expected" frequencies
Compute the chi-square value
Obtain the p-value: p=CHIDIST(chi-value,df)
Interpret the p-value

Example 2 - Step 1: Hypotheses
In the US population, 16% of all women experience clinically meaningful depressive symptoms during a year. The numbers below are taken from a sample of Turkish women living in metropolitan areas. To what extent are these women's experiences with depression similar to those of American women?

Women who have clinically significant depressive symptoms   Turkish   US percentages
Depressive                                                     45          16%
Not depressive                                                105          84%

Step 2: What would be the expected frequencies?

                   Turkish   US percentages   Expected Turkish frequency
Depressive            45          16%                   24
Not depressive       105          84%                  126
Total                150                               150

Step 3: Compute the chi-square value

                   Turkish   Expected   Oi-Ei   (Oi-Ei)**2   Chi square values
Depressive            45        24        21       441           18.375
Not depressive       105       126       -21       441            3.5
Total                150       150                               21.875

Step 4: Compute the p-value

Chi square value (total)   21.875
p value                    2.91E-06 (displayed as 0.00000 at five decimal places)

Step 5: Write your conclusion

Comparing two samples: Nominal/ordinal variables
We may have a hypothesis about an association:
Religiosity and sexual orientation are associated

                        Religiosity
Sexual orientation      Yes     No
Heterosexual             57    105
Gay                      13     27
Bisexual                  8     17

Null Hypothesis: Religiosity and sexual orientation are not associated.
Alternative Hypothesis: Religiosity and sexual orientation are associated.

The assumption of independence
If the two variables were independent, what would have been the distribution of the cases?

                        Religiosity
Sexual orientation      Yes     No    Total
Heterosexual             57    105     162
Gay                      13     27      40
Bisexual                  8     17      25
Total                    78    149     227

The assumption of independence: Step 1
If the two variables were independent, what would have been the distribution of the cases? Start from the marginal distributions.

                        Religiosity
Sexual orientation      Yes       No       Total   Percent sexual orientation
Heterosexual             57      105        162          71.37%
Gay                      13       27         40          17.62%
Bisexual                  8       17         25          11.01%
Total                    78      149        227
Percent religious      34.36%   65.64%

What would be the probability that a person is heterosexual AND religious?
p(Heterosexual & religious) = 71.37%*34.36% = 24.52%
How many people would be heterosexual AND religious?
N(Heterosexual & religious) = 24.52%*227 = 55.7

Expected frequencies under the assumption of independence: Step 2
Calculate the percent of cases expected in each cell, under the assumption of independence (row marginal times column marginal).

                        Observed                  Marginal        Percent expected in each cell
Sexual orientation      Yes     No     Total    distribution         Yes        No
Heterosexual             57    105      162        71.37%           24.52%    46.84%
Gay                      13     27       40        17.62%            6.05%    11.57%
Bisexual                  8     17       25        11.01%            3.78%     7.23%
Total                    78    149      227
Marginal distribution  34.36%  65.64%

Expected frequencies under the assumption of independence: Step 3
Calculate the number of cases expected in each cell, under the assumption of independence (expected percent times 227).

                        Frequency expected in each cell
Sexual orientation         Yes           No
Heterosexual             55.6652      106.3348
Gay                      13.74449      26.25551
Bisexual                  8.590308     16.40969

Chi-square values for each cell: Step 4

                        Chi-square for each cell
Sexual orientation         Yes           No
Heterosexual             0.032007      0.016756
Gay                      0.040327      0.021111
Bisexual                 0.040565      0.021235
Total chi-square         0.172

Does the observed distribution conform to independence?
Two-sample chi-square test

\[ \chi^2 = \sum_{i=1}^{mn} \frac{(O_i - E_i)^2}{E_i} \]

where the sum runs over all mn cells of a table with m rows and n columns, and df=(m-1)*(n-1).

The p-value gives the probability of obtaining a chi-square value at least this large if the hypothesis of independence were true; it is not the probability that the hypothesis of independence is true.
A p-value greater than the chosen significance level means we cannot reject the hypothesis that the two variables of interest are independent.
A p-value smaller than the chosen significance level indicates that the two variables of interest are associated.

Example - chi-square: Step 1
Observed frequencies with row and column totals.

                       Public Schools   Private Schools   Religious Schools   Total
Behind grade level           36               24                 21             81
At grade level               33               28                 43            104
Above grade level            14               22                 14             50
Total                        83               74                 78            235

Step 2
Marginal proportions.

                       Public Schools   Private Schools   Religious Schools   Total   Row proportion
Behind grade level           36               24                 21             81      0.344681
At grade level               33               28                 43            104      0.442553
Above grade level            14               22                 14             50      0.212766
Total                        83               74                 78            235
Column proportion         0.353191         0.314894           0.331915

Step 3
Expected proportion in each cell (row proportion times column proportion); the nine cells sum to 1.

Expected %             Public Schools   Private Schools   Religious Schools
Behind grade level        0.121738         0.108538           0.114405
At grade level            0.156306         0.139357           0.14689
Above grade level         0.075147         0.066999           0.07062

Step 4
Expected frequency in each cell (expected proportion times 235).

Expected #             Public Schools   Private Schools   Religious Schools
Behind grade level        28.60851         25.50638           26.88511
At grade level            36.73191         32.74894           34.51915
Above grade level         17.65957         15.74468           16.59574

Step 5
Chi-square contribution of each cell, from the observed and expected frequencies.

Chi                    Public Schools   Private Schools   Religious Schools
Behind grade level        1.909715         0.088966           1.28824
At grade level            0.379158         0.688645           2.083621
Above grade level         0.75837          2.485221           0.406001

Step 6
Sum of the cell values: chi-square = 10.08794
df = 4
p = 0.038972

What if the sample size was 150?
Same percentages, different N!

                       Public Schools   Private Schools   Religious Schools
Behind grade level           23               15                 13
At grade level               21               18                 27
Above grade level             9               14                  9

Chi                    Public Schools   Private Schools   Religious Schools
Behind grade level        1.218967         0.056787           0.822281
At grade level            0.242016         0.439561           1.329971
Above grade level         0.484066         1.586312           0.25915

Chi-square = 6.439109
df = 4
p = 0.168668

What if the sample size was 1500?
Same percentages, different N!

                       Public Schools   Private Schools   Religious Schools
Behind grade level          230              153                134
At grade level              211              179                274
Above grade level            89              140                 89

Chi                    Public Schools   Private Schools   Religious Schools
Behind grade level       12.18967          0.567865           8.22281
At grade level            2.420156         4.395607          13.29971
Above grade level         4.840657        15.86312            2.591496

Chi-square = 64.39109
df = 4
p = 0.00000
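The six steps of the reading-level example, and the effect of changing N while keeping the percentages fixed, can be sketched with the Python standard library. The function names are illustrative assumptions. Two simplifications are worth stating: the p-value helper uses a closed form that is valid only for even df (here df = 4), and the tables for N = 150 and N = 1500 are obtained by scaling the original counts proportionally, which yields fractional counts rather than the rounded whole numbers shown on the slides.

```python
import math


def chi_square_independence(table):
    """Return (chi-square, df) for a table of observed counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi2_p_value_even_df(chi2, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = chi2 / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def scale_table(table, new_n):
    """Keep the cell percentages fixed while changing the total sample size."""
    n = sum(sum(row) for row in table)
    return [[count * new_n / n for count in row] for row in table]


# Rows: behind, at, above grade level. Columns: public, private, religious schools.
observed = [[36, 24, 21], [33, 28, 43], [14, 22, 14]]

for n in (235, 150, 1500):
    chi2, df = chi_square_independence(scale_table(observed, n))
    p = chi2_p_value_even_df(chi2, df)
    print(f"N={n}: chi-square={chi2:.5f}, df={df}, p={p:.6f}")
```

With identical percentages, the chi-square value grows in proportion to N, so the same pattern is not significant at N = 150, significant at the 0.05 level at N = 235, and highly significant at N = 1500.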