CHAPTER 2: DATA
• THE FIVE W’S (AND HOW)
  • WHO- THE ROWS OF A DATA TABLE, WHICH CORRESPOND TO THE INDIVIDUAL CASES ABOUT WHOM WE RECORD SOME CHARACTERISTICS.
  • WHAT- THE CHARACTERISTICS (VARIABLES) RECORDED ABOUT EACH INDIVIDUAL
  • WHY- THE REASON THE DATA WAS COLLECTED
  • WHERE- WHERE THE DATA WAS COLLECTED
  • WHEN- THE TIME THAT THE DATA WAS COLLECTED
  • HOW- HOW THE DATA WAS COLLECTED. IT IS IMPORTANT TO EXPLAIN THE TYPE OF EXPERIMENT, SURVEY, OR STUDY THAT WAS CONDUCTED
• COLLECTED DATA IS ORGANIZED INTO DATA TABLES

        X        Y       Z
    A   123      34567   56789
    B   123345   789     0987654

CHAPTER 2 CONT.
• VARIABLES
  • CATEGORICAL VARIABLE- ANSWERS QUESTIONS ABOUT HOW CASES FALL INTO CATEGORIES
  • QUANTITATIVE VARIABLE- ANSWERS QUESTIONS ABOUT THE QUANTITY OF WHAT IS MEASURED. MEASURED IN UNITS
• TYPES OF CASES
  • SUBJECTS- PEOPLE WE EXPERIMENT ON
  • EXPERIMENTAL UNITS- ANIMALS, PLANTS, WEB SITES, AND OTHER NONHUMAN CASES
  • RESPONDENTS- INDIVIDUALS WHO ANSWER A SURVEY

CHAPTER 3: DISPLAYING AND DESCRIBING DATA
• THREE RULES OF DATA ANALYSIS
  1. MAKE A PICTURE
  2. MAKE A PICTURE
  3. MAKE A PICTURE
• FREQUENCY TABLE
  • TABLE THAT ORGANIZES COUNTS FOR CATEGORICAL DATA
  • RELATIVE FREQUENCY TABLES SHOW PERCENTS INSTEAD OF COUNTS
  • EACH PERCENT IS THE CATEGORY'S PROPORTION OF THE TOTAL (COUNT DIVIDED BY TOTAL) TIMES 100
• AREA PRINCIPLE- THE AREA OCCUPIED BY A PART OF THE GRAPH SHOULD CORRESPOND TO THE MAGNITUDE OF THE VALUE IT REPRESENTS.

CHAPTER 3 CONT.
• BAR CHART- DISPLAYS THE DISTRIBUTION OF A CATEGORICAL VARIABLE, SHOWING THE COUNTS FOR EACH CATEGORY NEXT TO EACH OTHER FOR EASY COMPARISON.
• PIE CHART- SHOWS ALL THE CASES AS A CIRCLE SLICED INTO PIECES WHOSE SIZES ARE PROPORTIONAL TO THE FRACTION OF THE WHOLE IN EACH CATEGORY.
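As a rough illustration of the frequency and relative frequency tables from Chapter 3, the sketch below counts a small categorical variable and converts the counts to percents. The grade list and the function names are made up for this example and are not taken from the slides.

```python
from collections import Counter

# Hypothetical categorical data: letter grades for ten students.
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A"]


def frequency_table(values):
    """Return {category: count}, ordered by category name."""
    return dict(sorted(Counter(values).items()))


def relative_frequency_table(values):
    """Return {category: percent of all cases}."""
    counts = frequency_table(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


counts = frequency_table(grades)
percents = relative_frequency_table(grades)

# Print the two tables side by side: category, count, percent.
for category in counts:
    print(f"{category}  {counts[category]:>3}  {percents[category]:5.1f}%")
```

The percents always add up to 100, which is what makes a pie chart of the same variable possible.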
• CONTINGENCY TABLE
  • SHOWS HOW THE CASES ARE DISTRIBUTED ALONG TWO CATEGORICAL VARIABLES AT ONCE
  • MARGINAL DISTRIBUTION- THE ROW OR COLUMN TOTALS IN THE MARGINS, SHOWING THE COUNTS FOR ONE VARIABLE ALONE
  • CONDITIONAL DISTRIBUTION- THE PERCENTS FOR ONE VARIABLE AMONG ONLY THE CASES IN A SINGLE CATEGORY OF THE OTHER VARIABLE
• INDEPENDENCE- WHEN THE DISTRIBUTION OF ONE VARIABLE IS THE SAME FOR ALL CATEGORIES OF ANOTHER

[FIGURE: BAR CHART AND PIE CHART OF AP STATS GRADES (A, B, C, D, F)]

            X       Y
    A       5678    234567
    B       98765   345678
    TOTAL   99999   98765

    CONTINGENCY TABLE

CHAPTER 4: DISPLAYING AND SUMMARIZING DATA
• HISTOGRAM- GROUPS THE VALUES OF A QUANTITATIVE VARIABLE INTO BINS AND SHOWS THE COUNT IN EACH BIN AS A BAR.
• RELATIVE FREQUENCY HISTOGRAM- SAME AS A HISTOGRAM, REPLACING THE COUNTS ON THE VERTICAL AXIS WITH PERCENTAGES OF THE TOTAL NUMBER OF CASES.
• STEM-AND-LEAF PLOT- SIMILAR TO A HISTOGRAM, BUT IT SHOWS EACH INDIVIDUAL VALUE.
• DOTPLOT- A DOT IS PLACED ALONG AN AXIS FOR EACH CASE IN THE DATA.
• QUANTITATIVE DATA CONDITION- THE DATA ARE VALUES OF A QUANTITATIVE VARIABLE WHOSE UNITS ARE KNOWN. CHECK THIS BEFORE MAKING A GRAPHICAL DISPLAY.

CHAPTER 4 CONT.
• THREE THINGS TO DESCRIBE A DISTRIBUTION
  1. SHAPE- WHETHER IT IS UNIMODAL OR BIMODAL, SYMMETRIC OR SKEWED, AND WHETHER OR NOT THERE ARE OUTLIERS.
  2. CENTER- THE CENTER OF THE DATA, USUALLY DESCRIBED BY THE MEDIAN.
     • MEDIAN- THE MIDDLE VALUE, WHICH DIVIDES THE HISTOGRAM INTO TWO HALVES OF EQUAL AREA.
  3. SPREAD- THE RANGE AND INTERQUARTILE RANGE OF THE DATA.
     • RANGE- THE DIFFERENCE BETWEEN THE MAXIMUM AND THE MINIMUM OF THE DATA.
     • INTERQUARTILE RANGE (IQR)- THE DIFFERENCE BETWEEN THE UPPER QUARTILE AND THE LOWER QUARTILE.
• 5 NUMBER SUMMARY- REPORTS THE MINIMUM, LOWER QUARTILE, MEDIAN, UPPER QUARTILE, AND MAXIMUM OF A DATA SET.

CHAPTER 4 CONT.
• MEAN
  • FEELS LIKE THE CENTER BECAUSE IT IS THE POINT WHERE THE HISTOGRAM BALANCES.
  • CALCULATED BY DIVIDING THE TOTAL OF YOUR DATA BY THE NUMBER OF DATA POINTS.
  • USED WHEN THE HISTOGRAM IS SYMMETRIC AND THERE ARE NO OUTLIERS.
• MEDIAN
  • IS RESISTANT TO VALUES THAT ARE EXTRAORDINARILY LARGE OR SMALL
  • USED WHEN THE DATA IS SKEWED OR HAS OUTLIERS.
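To make the 5 number summary, range, IQR, and the mean versus median comparison concrete, here is a minimal sketch. The data set is invented for illustration, and the quartile rule used (median of each half, leaving out the middle value when the count is odd) is only one of several common conventions, so other tools may report slightly different quartiles.

```python
from statistics import mean, median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the middle value when n is odd.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half + (n % 2):]  # skip the middle value if n is odd
    return (values[0], median(lower), median(values), median(upper), values[-1])


# Hypothetical data with one unusually large value (40).
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]

minimum, q1, med, q3, maximum = five_number_summary(data)
data_range = maximum - minimum
iqr = q3 - q1
average = mean(data)

print("5 number summary:", (minimum, q1, med, q3, maximum))
print("range:", data_range, " IQR:", iqr)
print("mean:", average, " median:", med)
```

In this made-up data set the single large value pulls the mean above the median, which is the reason the notes recommend the median for skewed data or data with outliers.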
• STANDARD DEVIATION
  • ACCOUNTS FOR HOW FAR EACH VALUE IS FROM THE MEAN.
  • MOST APPROPRIATE FOR SYMMETRIC DATA WITH NO OUTLIERS.
  • CALCULATED AS THE SQUARE ROOT OF THE VARIANCE.

CHAPTER 5: UNDERSTANDING AND COMPARING DISTRIBUTIONS
• BOXPLOT- A GRAPHICAL REPRESENTATION OF A 5 NUMBER SUMMARY. ALSO SHOWS OUTLIERS OF THE DATA.
• OUTLIERS
  • ANY POINT THAT LIES FAR FROM THE REST OF THE DATA BECAUSE IT IS EXTREMELY HIGH OR EXTREMELY LOW.
  • TO DETERMINE WHETHER OR NOT A POINT IS AN OUTLIER, CALCULATE 1.5 X IQR, THEN SUBTRACT IT FROM THE LOWER QUARTILE AND ADD IT TO THE UPPER QUARTILE. ANY POINT BELOW THE LOWER RESULT OR ABOVE THE UPPER RESULT IS AN OUTLIER.
• RE-EXPRESSING OR TRANSFORMING DATA- APPLY A SIMPLE FUNCTION TO MAKE A SKEWED DISTRIBUTION MORE SYMMETRIC. EX: TAKING THE NATURAL LOG OF YOUR DATA.
• BOXPLOTS ALLOW YOU TO COMPARE THE CENTERS AND SPREADS OF MULTIPLE GROUPS OF DATA SIDE BY SIDE.

[FIGURE: COMPARING DISTRIBUTIONS]

CHAPTER 6: THE STANDARD DEVIATION AS A RULER AND THE NORMAL MODEL
• STANDARD DEVIATION
  • ANSWERS THE QUESTIONS "HOW FAR IS THIS VALUE FROM THE MEAN?" AND "HOW DIFFERENT ARE THESE TWO STATISTICS?"
  • STANDARDIZED VALUES OR Z-SCORES MEASURE THE DISTANCE OF EACH DATA VALUE FROM THE MEAN IN STANDARD DEVIATIONS: Z = (VALUE - MEAN) / STANDARD DEVIATION.
  • STANDARDIZED VALUES HAVE NO UNITS.
• SHIFTING AND RESCALING DATA
  • WHEN WE ADD OR SUBTRACT A CONSTANT TO EACH VALUE, ALL MEASURES OF POSITION (CENTER, PERCENTILES, MIN, AND MAX) WILL INCREASE OR DECREASE BY THAT SAME CONSTANT. THIS LEAVES SPREAD THE SAME.
  • WHEN WE MULTIPLY OR DIVIDE EACH VALUE BY A CONSTANT, ALL MEASURES OF POSITION AND SPREAD WILL BE MULTIPLIED OR DIVIDED BY THAT CONSTANT.

CHAPTER 6 CONT.
• NORMAL MODEL
  • THE BELL-SHAPED CURVE THAT IS APPROPRIATE FOR DISTRIBUTIONS WHOSE SHAPES ARE UNIMODAL AND SYMMETRIC.
  • NUMBERS WE USE TO SPECIFY THIS MODEL ARE CALLED PARAMETERS.
  • SUMMARIES OF THE DATA ARE CALLED STATISTICS.
  • A NORMAL MODEL WITH A MEAN OF 0 AND A STANDARD DEVIATION OF 1 IS CALLED THE STANDARD NORMAL MODEL.
  • IN ORDER TO USE THIS MODEL THE DATA MUST MEET THE NEARLY NORMAL CONDITION.
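The 1.5 x IQR outlier rule from Chapter 5 and the z-scores from Chapter 6 can be sketched as follows. The data values, the quartiles passed in (Q1 = 4, Q3 = 11), and the function names are assumptions chosen for illustration. The sketch uses the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev


def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def z_scores(data):
    """Standardize each value: (value - mean) / standard deviation."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return [(y - m) / s for y in data]


# Hypothetical data; the quartiles below are assumed for this example.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]
low, high = outlier_fences(q1=4, q3=11)
outliers = [y for y in data if y < low or y > high]
print("fences:", (low, high), "outliers:", outliers)

# Adding a constant or multiplying by a positive constant changes the
# data's position and spread, but leaves the z-scores the same.
z = z_scores(data)
shifted = z_scores([y + 100 for y in data])
rescaled = z_scores([y * 3 for y in data])
print("z-score of the largest value:", round(z[-1], 2))
```

Because z-scores have no units, they do not change when every value is shifted by a constant or rescaled by a positive constant, which is why they can be used to compare values measured on different scales.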
• THE 68-95-99.7 RULE- SAYS THAT IN A NORMAL MODEL ABOUT 68% OF THE DATA WILL FALL WITHIN 1 STANDARD DEVIATION OF THE MEAN, ABOUT 95% WILL FALL WITHIN 2 STANDARD DEVIATIONS, AND ABOUT 99.7% WILL FALL WITHIN 3 STANDARD DEVIATIONS.

CHAPTER 6 CONT.
• RULES FOR WORKING WITH THE NORMAL MODEL
  1. MAKE A PICTURE
  2. MAKE A PICTURE
  3. MAKE A PICTURE
• NORMAL PROBABILITY PLOT- TELLS YOU IF YOUR DATA IS NEARLY NORMAL BY SHOWING WHETHER OR NOT THE POINTS LIE ROUGHLY ALONG A STRAIGHT DIAGONAL LINE.

[FIGURE: NORMAL MODEL AND NORMAL PROBABILITY PLOT]
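One way to check the 68-95-99.7 rule numerically is with Python's built-in `statistics.NormalDist`. The sketch below is an illustration only. The helper name `fraction_within` and the second model (mean 100, standard deviation 15) are assumptions made for this example.

```python
from statistics import NormalDist

# The standard normal model: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(k, model=standard_normal):
    """Fraction of a normal model lying within k standard deviations of its mean."""
    lower = model.mean - k * model.stdev
    upper = model.mean + k * model.stdev
    return model.cdf(upper) - model.cdf(lower)


for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")

# The same fractions hold for any normal model, not just the standard one.
other_model = NormalDist(mu=100, sigma=15)  # hypothetical parameters
same_fraction = fraction_within(2, other_model)
```

The exact fractions are slightly different from 68%, 95%, and 99.7%, which is why the rule is best read as an approximation.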