Introduction to Statistics for the Social Sciences
SBS200, COMM200, GEOG200, PA200, POL200, or SOC200
Lecture Section 001, Fall 2014
Room 120 Integrated Learning Center (ILC)
10:00 - 10:50 Mondays, Wednesdays & Fridays
http://www.youtube.com/watch?v=oSQJP40PcGI

Labs
Remember:
• Bring an electronic copy of your data (flash drive or email it to yourself)
• Your data should have correct formatting
• See the Lab Materials link on the class website to double-check that the formatting of your Excel file is exactly consistent

Schedule of readings
Before next exam (September 26th)
Please read chapters 1 - 4 in Ha & Ha textbook
Please read Appendix D, E & F online
(On syllabus this is referred to as online readings 1, 2 & 3)
Please read Chapters 1, 5, 6 and 13 in Plous
Chapter 1: Selective Perception
Chapter 5: Plasticity
Chapter 6: Effects of Question Wording and Framing
Chapter 13: Anchoring and Adjustment

Reminder: Use this as your study guide
By the end of lecture today 9/17/14
Characteristics of a distribution
• Central Tendency
• Dispersion
Primary types of “measures of central tendency”?
• Mean
• Median
• Mode
Measures of variability
• Range
• Standard deviation
• Variance
Memorizing the four definitional formulae

Homework due: Monday (September 22nd)
On class website: please print and complete homework worksheet # 6 & 7

Overview
Frequency distributions
Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency 2) dispersion or 3) shape
Mean, Median, Mode, Trimmed Mean
Standard deviation, Variance, Range, Mean Absolute Deviation
Skewed right, skewed left, unimodal, bimodal, symmetric
The normal curve

Another example: How many kids in your family?
Number of kids in family: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14

Measures of Central Tendency (Measures of location)
The mean, median and mode
Mean: The balance point of a distribution.
Found by adding up all observations and then dividing by the number of observations
Mean for a sample: x̄ = Σx / n
Mean for a population: µ (mu) = ΣX / N
Note: Σ = add up; x or X = scores; n or N = number of scores
Measures of “location”: where on the number line the scores tend to cluster

Example: Number of kids in family: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14
Mean for a sample: x̄ = Σx / n = 41 / 10 = 4.1

How many kids are in your family? What is the most common family size?

Median: The middle value when observations are ordered from least to most (or most to least)
As collected: 1, 3, 1, 4, 2, 4, 2, 8, 2, 14
Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14
The two middle values are 2 and 3, so the median is (2 + 3) / 2 = 2.5
If there appear to be two medians, take the mean of the two
Median always has a percentile rank of 50% regardless of shape of distribution

Mode: The value of the most frequent observation
Bimodal distribution: If there are two most frequent observations

Score: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
f:     2 3 1 2 0 0 0 1 0 0  0  0  0  1

Please note: The mode is “2” because it is the most frequently occurring score. It occurs “3” times.
“3” is not the mode, it is just the frequency for the value that is the mode

What about central tendency for qualitative data?
Mode is good for nominal or ordinal data
Median can be used with ordinal data
Mean can be used with interval or ratio data

A little more about frequency distributions
An example of a normal distribution

Measure of central tendency: describes how scores tend to cluster toward the center of the distribution
In all distributions:
mode = tallest point
median = middle score
mean = balance point
In a normal distribution: mode = mean = median
In a positively skewed distribution: mode < median < mean
In a negatively skewed distribution: mean < median < mode
Note: mean is most affected by outliers or skewed distributions

Mode: The value of the most frequent observation
Bimodal distribution: Distribution with two most frequent observations (2 peaks)
Example: Ian coaches two boys’ baseball teams. One team is made up of 10-year-olds and the other is made up of 16-year-olds. When he measured the height of all of his players he found a bimodal distribution.

Dispersion: Variability
Some distributions are more variable than others
The larger the variability the wider the curve tends to be
The smaller the variability the narrower the curve tends to be
[Figure: three height distributions A, B and C, each on an axis from 5’ to 7’]

Range: The difference between the largest and smallest observations
Range = xmax - xmin
Range for distribution A? Range for distribution B? Range for distribution C?

Wildcats Basketball team:
Tallest player = 84” (same as 7’0”) (Kaleb Tarczewski and Dusan Ristic)
Shortest player = 71” (same as 5’11”) (Parker Jackson-Cartwright)
Fun fact: Mean is 78”
Range = xmax - xmin = 84” – 71” = 13”
Range is 13”

Wildcats Baseball team:
Tallest player = 77” (same as 6’5”) (Austin Schnabel)
Shortest player = 70” (same as 5’10”) (Five players are 5’10”)
Fun fact: Mean is 72”
Range = xmax - xmin = 77” – 70” = 7”
Range is 7”
Please note: No reference is made to numbers between the min and max

Frequency distributions: the normal curve
Variability: Some distributions are more variable than others
[Figure: height distributions on an axis from 5’ to 7’]
Let’s say this is our distribution of heights of men on the U of A baseball team. Mean is 6 feet tall.
What might this be?
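The central tendency and range calculations worked through earlier in this lecture can be checked with a short Python sketch. It uses the family-size data and the two team heights from the slides; the variable and function names are illustrative choices, not part of the course materials, and it assumes Python 3.8 or later for `multimode`.

```python
from statistics import mean, median, multimode

# Number of kids in family, as listed in the lecture example
kids = [1, 3, 1, 4, 2, 4, 2, 8, 2, 14]

sample_mean = mean(kids)      # sum of scores (41) divided by n (10)
sample_median = median(kids)  # mean of the two middle values, 2 and 3
modes = multimode(kids)       # list of the most frequent value(s)


def value_range(x_max, x_min):
    """Range = largest score minus smallest score (hypothetical helper)."""
    return x_max - x_min


# Heights in inches taken from the slides
basketball_range = value_range(84, 71)
baseball_range = value_range(77, 70)

print("mean:", sample_mean)
print("median:", sample_median)
print("mode(s):", modes)
print("basketball range (in):", basketball_range)
print("baseball range (in):", baseball_range)
```

Note that `multimode` returns a list, so a bimodal data set would come back with two values rather than one.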
Variability
The larger the variability the wider the curve, and the larger the deviation scores tend to be
The smaller the variability the narrower the curve, and the smaller the deviation scores tend to be
[Figure: three height distributions of differing width, each on an axis from 5’ to 7’]

Standard deviation: The average amount by which observations deviate on either side of their mean
Generally (on average), how far away is each score from the mean?

Let’s build it up again… U of A Baseball team
Mean is 6’0”
Deviation score = height - mean
Diallo is 6’0”: 6’0” – 6’0” = 0
Preston is 6’2”: 6’2” – 6’0” = 2”
Mike is 5’8”: 5’8” – 6’0” = -4”
Hunter is 5’10”: 5’10” – 6’0” = -2”
Shea is 6’4”: 6’4” – 6’0” = 4”
David is 6’0”: 6’0” – 6’0” = 0
[Figure: players placed on a height axis marked 5’8”, 5’10”, 6’0”, 6’2”, 6’4”]

How do we find the average height?
Σx / N = average height
How do we find the average spread?
Σ(x - µ) / N = average deviation
Σ(x - µ) = ?

5’8” - 6’0” = - 4”
5’9” - 6’0” = - 3”
5’10” - 6’0” = - 2”
5’11” - 6’0” = - 1”
6’0” - 6’0” = 0
6’1” - 6’0” = + 1”
6’2” - 6’0” = + 2”
6’3” - 6’0” = + 3”
6’4” - 6’0” = + 4”

Σ(x - x̄) = 0
Σ(x - µ) = 0
Big problem: the deviations on either side of the mean cancel, so they always sum to zero

Square the deviations
Σ(x - x̄)²
Σ(x - µ)²
Σ(x - µ)² / N

Mean: The average value in the data
Mean is a measure of typical “value” (where the typical scores are positioned on the number line)
Standard deviation: The average amount scores deviate on either side of their mean
Standard deviation is typical “spread” (typical size of deviations or distance from mean); it can never be negative

These would be helpful to know by heart: please memorize these formulae
[Slide: definitional formulae for standard deviation]
What do these two formulae have in common?
n-1 is “Degrees of Freedom”
More, next lecture
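As a rough illustration of the steps above, the following Python sketch works through the six baseball players' heights: it computes the deviation scores, shows that they sum to zero, then squares them. The formulae themselves did not survive in the slide text, so the sketch assumes the standard definitions: population variance Σ(x - µ)² / N, sample variance Σ(x - x̄)² / (n - 1), and the standard deviation as the square root of each. Treating the same six players first as a population and then as a sample is done only for comparison.

```python
from math import sqrt

# Heights in inches for the six players in the example (6'0" = 72")
heights = {
    "Diallo": 72,
    "Preston": 74,
    "Mike": 68,
    "Hunter": 70,
    "Shea": 76,
    "David": 72,
}

n = len(heights)
mean_height = sum(heights.values()) / n

# Deviation score = height minus mean
deviations = {name: h - mean_height for name, h in heights.items()}

# The "big problem": raw deviations always cancel out
sum_of_deviations = sum(deviations.values())

# Squaring removes the signs so the deviations no longer cancel
sum_of_squares = sum(d ** 2 for d in deviations.values())

# Assumed standard definitional formulae (not recoverable from the slides)
population_variance = sum_of_squares / n
population_sd = sqrt(population_variance)
sample_variance = sum_of_squares / (n - 1)  # n - 1 is the degrees of freedom
sample_sd = sqrt(sample_variance)

print("mean height:", mean_height)
print("sum of deviations:", sum_of_deviations)
print("sum of squared deviations:", sum_of_squares)
print("population SD:", round(population_sd, 3))
print("sample SD:", round(sample_sd, 3))
```

Dividing by n - 1 instead of N gives a slightly larger value, which is the only difference between the sample and population versions in this sketch.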