Homework Discussion
• Read pages 446 - 461
• Page 467: 17 – 20, 25 – 27, 61, 62, 63, 67
• See if you can find an example in your life of a survey that might yield unreliable results

The critical issues are:
a. Finding a sample that is representative of the population, and
b. Determining how big the sample should be.
Choosing a good sample of a reasonable size is more important than the sampling rate.

Bush's lead gets smaller in poll
By Susan Page, USA TODAY
WASHINGTON - President Bush leads Sen. John Kerry by 8 percentage points among likely voters, the latest USA TODAY/CNN/Gallup Poll shows. That is a smaller advantage than the president held in mid-September but shows him maintaining a durable edge in a race that was essentially tied for months.
Results based on likely voters are based on the subsample of 758 survey respondents deemed most likely to vote in the November 2004 General Election. The margin of sampling error is ±4 percentage points.

George Gallup explained
• Whether you poll the United States or New York State or Baton Rouge … you need… the same number of interviews or samples. It's no mystery really – if a cook has two pots of soup on the stove, one far larger than the other, and thoroughly stirs them both, he doesn't have to take more spoonfuls from one than the other to sample the taste accurately.

Statistics is the science of dealing with data. This includes gathering data, organizing data, interpreting data, and understanding data.

Descriptive statistics (page 476) is the area of statistics that describes large amounts of data in a way that is understandable, useful, and, if need be, convincing.

EXAMPLE 1 (page 478).
Stat 101 Midterm Exam Scores (25 Points Possible): N=75

ID    1257 1297 1348 1379 1450 1506 1731 1753 1818 2030 2058 2462 2489 2542 2619
Score   12   16   11   24    9   10   14    8   12   12   11   10   11   10    1

ID    2651 2658 2794 2795 2833 2905 3269 3284 3310 3596 3906 4042 4124 4204 4224
Score   10   11    9   13   10   10   13   15   11    9   14   10   12   12   10

ID    4355 4396 4445 4787 4855 4944 5298 5434 5604 5644 5689 5736 5852 5877 5906
Score    8    7   11   11   14    6   11   13   10    9   11   10    9    9   12

ID    6336 6510 6622 6754 6798 6873 6931 7041 7196 7292 7362 7503 7616 7629 7961
Score   11   13   11    8    9    9   12   13   13   12   10   10   14   14   12

ID    8007 8041 8129 8366 8493 8522 8664 8767 9128 9380 9424 9541 9928 9953 9973
Score   13    9   11   13    8    8   10    7   10    9   10    8   15   11   10

DESCRIPTIVE STATISTICS
A data set is a collection of data values called data points.
The size of a data set is the number of data points in it. We use N to represent size.

In statistical usage, a variable is any characteristic that varies among the members of a population. (page 481)
Baseball stats

When the possible values of a numerical variable change by minimum increments, the variable is called discrete.
When the differences between the values of a numerical variable can be arbitrarily small, we call the variable continuous.
A frequency table (page 478) is a listing of the scores along with the frequency with which they occur.

TABLE 14-2 Frequency Table for Stat 101 Data Set

Score  Frequency        %
    0          0    0.00%
    1          1    1.33%
    2          0    0.00%
    3          0    0.00%
    4          0    0.00%
    5          0    0.00%
    6          1    1.33%
    7          2    2.67%
    8          6    8.00%
    9         10   13.33%
   10         16   21.33%
   11         13   17.33%
   12          9   12.00%
   13          8   10.67%
   14          5    6.67%
   15          2    2.67%
   16          1    1.33%
   17          0    0.00%
   18          0    0.00%
   19          0    0.00%
   20          0    0.00%
   21          0    0.00%
   22          0    0.00%
   23          0    0.00%
   24          1    1.33%
   25          0    0.00%
   N=         75  100.00%

A bar graph (page 479) is a graph with the possible test scores listed in increasing order on a horizontal axis and the frequency of each test score displayed by the height of the column above that test score.

[Figure: bar graph of the Stat 101 scores, N=75. Horizontal axis: Score. Vertical axis: Frequency.]

Outliers are data points that do not fit into the overall pattern of the data.

Objective 7: Creating structures and systems that model problems and information

[Figure: relative frequency bar graph of the Stat 101 scores, N=75. Axes: Score and Relative Frequency (0% to 22%).]

Instead of representing frequencies, a bar graph may represent relative frequencies, i.e., the frequencies expressed as percentages of the total population.

Fancy bar graphs that use icons instead of bars to show the frequencies are commonly referred to as pictograms.
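As a rough illustration, the frequencies and relative frequencies in Table 14-2 can be rebuilt from the raw scores of Example 1. The sketch below is only one way to do it; the function name frequency_table and the variable names are assumptions made for this illustration and do not come from the textbook.

```python
from collections import Counter

# The 75 Stat 101 midterm scores from Example 1, in the order listed.
scores = [
    12, 16, 11, 24, 9, 10, 14, 8, 12, 12, 11, 10, 11, 10, 1,
    10, 11, 9, 13, 10, 10, 13, 15, 11, 9, 14, 10, 12, 12, 10,
    8, 7, 11, 11, 14, 6, 11, 13, 10, 9, 11, 10, 9, 9, 12,
    11, 13, 11, 8, 9, 9, 12, 13, 13, 12, 10, 10, 14, 14, 12,
    13, 9, 11, 13, 8, 8, 10, 7, 10, 9, 10, 8, 15, 11, 10,
]


def frequency_table(data, low, high):
    """Return (score, frequency, relative frequency in %) rows for low..high.

    Hypothetical helper for illustration only.
    """
    counts = Counter(data)  # Counter returns 0 for scores that never occur
    n = len(data)           # N, the size of the data set
    return [(s, counts[s], 100 * counts[s] / n) for s in range(low, high + 1)]


table = frequency_table(scores, 0, 25)
for score, freq, pct in table:
    print(f"{score:>5} {freq:>10} {pct:>8.2f}%")
```

The relative frequency column is just each frequency divided by N and expressed as a percentage, which is what the relative frequency bar graph displays.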
EXAMPLE 14.3 (page 481). Yearly sales of XYZ Corporation from 1997 through 2002

Year                                1997  1998  1999  2000  2001  2002
Annual sales (millions of dollars)    52    55    61    63    70    77

[Figure: two graphs of the same data, each titled "Annual Sales (in millions)". Horizontal axis: Year (1997 to 2002). Vertical axis: Millions of Dollars, running from 50 to 80 in one graph and from 0 to 80 in the other.]

EXAMPLE (page 484). SAT Scores

Score        Frequency
400-500            200
510-600            300
610-700            500
710-800            800
810-900           1000
910-1000          1100
1010-1100         1200
1110-1200          900
1210-1300          700
1310-1400          400
1410-1500          300
1510-1600          100

[Figure: bar graph of the SAT score frequencies by class interval.]

When we have a large number of possible scores, we often break up the range of scores into class intervals.

EXAMPLE (page 486). Starting Salaries of TSU Graduates

Starting Salaries of First-Year TSU Graduates

Salary            Number of Students   Percentage
40000+ - 45000                   228           7%
45000+ - 50000                   456          14%
50000+ - 55000                  1043          32%
55000+ - 60000                   912          28%
60000+ - 65000                   391          12%
65000+ - 70000                   163           5%
70000+ - 75000                    65           2%

When a numerical variable is continuous, its possible values can vary by infinitesimally small increments. Consequently, there are no gaps between the class intervals. In this case we use a variation of a bar graph called a histogram.

[Figure: histogram of the starting salaries of TSU graduates, N=3258. Horizontal axis: salary class intervals from 40000-45000 to 70000-75000. Vertical axis: Percentages (0% to 35%).]

A variable that represents a measurable quantity is called a numerical or quantitative variable. Variables that describe characteristics that cannot be measured numerically are called categorical, or qualitative, variables. (page 482)

EXAMPLE 3.
Enrollment (by School) at Tasmania State University

TABLE 14-3 Undergraduate Enrollments at TSU

School        Enrollment   Percentage
Agriculture         2400          16%
Business            1250           8%
Education           2840          19%
Humanities          3350          22%
Science             4870          32%
Other                290           2%
Total              15000

[Figure: bar graph of enrollment by school at TSU, N=15,000. Bars labeled with the enrollments from Table 14-3.]

[Figure: relative frequency bar graph of enrollment by school at TSU, N=15,000. Bars labeled with the percentages from Table 14-3.]

NUMERICAL SUMMARIES OF DATA (page 558)
Measures of location (central tendency) are numbers that tell us something about where the values of the data fall.
Measures of spread (dispersion) tell us something about how spread out the values of the data are.

The average of a set of N numbers is obtained by adding the numbers and dividing by N.

Example. Average home runs per season: Mike Sweeney

Example 9. The Average Test Score in the Stat 101 Test
The calculation uses the scores and frequencies in TABLE 14-2, Frequency Table for Stat 101 Data Set.

[Figure: bar graph of the Stat 101 scores, N=75. Horizontal axis: Score. Vertical axis: Frequency.]

THE AVERAGE (page 559).
Here $s_1, s_2, \ldots, s_n$ are the distinct scores and $f_1, f_2, \ldots, f_n$ are their frequencies.
STEP 1. Calculate the total of the data: $\text{total} = (s_1 \cdot f_1) + (s_2 \cdot f_2) + \cdots + (s_n \cdot f_n)$
STEP 2. Calculate N: $N = f_1 + f_2 + \cdots + f_n$
STEP 3. Calculate the average: $\text{Average} = \text{total} / N$

Homework
• Read pages 476 – 489
• Page 499: 1 – 3, 5, 7 – 11, 19, 21 (for 23, 25, 29, 30, 32, find the mean)
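The three steps listed under THE AVERAGE can be sketched in Python using the nonzero frequencies from Table 14-2. This is a minimal illustration only; the function name average_from_frequencies and the dictionary layout are assumptions made for this sketch, not part of the textbook.

```python
# Nonzero rows of Table 14-2, stored as score -> frequency.
# Scores with frequency 0 add nothing to the total or to N, so they are omitted.
frequencies = {
    1: 1, 6: 1, 7: 2, 8: 6, 9: 10, 10: 16, 11: 13,
    12: 9, 13: 8, 14: 5, 15: 2, 16: 1, 24: 1,
}


def average_from_frequencies(freq):
    """Return (total, N, average) for a score -> frequency mapping.

    Hypothetical helper for illustration only.
    """
    # STEP 1: total = (s1 * f1) + (s2 * f2) + ... + (sn * fn)
    total = sum(score * count for score, count in freq.items())
    # STEP 2: N = f1 + f2 + ... + fn
    n = sum(freq.values())
    # STEP 3: average = total / N
    return total, n, total / n


total, n, average = average_from_frequencies(frequencies)
print(f"total = {total}, N = {n}, average = {average:.2f}")
```

For the Stat 101 data this gives a total of 814 over N = 75 scores, so the average is about 10.85 points out of 25.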