Statistics 11

Example 8: Please provide an example of each level of measurement:
(a) Nominal
(b) Ordinal
(c) Ratio
(d) Interval

1.4 & 1.5: CRITICAL THINKING & COLLECTING SAMPLE DATA

Key Concepts: We focus on the meaning of information obtained by studying data, as well as on the methods used in collecting sample data.

Goals in this section are to:
  o Learn how to interpret information based on data.
  o Think carefully about the context of the data, the source of the data, the method used in data collection, the conclusions reached, and the practical implications.

Misuse of statistics
  o Evil intent on the part of a dishonest person
  o Unintentional errors on the part of people who do not know better

If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

"Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments, and the tendency of people to disparage statistics that do not support their positions. It is also sometimes colloquially used to doubt statistics used to prove an opponent's point.

Sampling Techniques

Voluntary Response Sample (self-selected sample) – respondents themselves decide whether to be included.
  o Literary Digest study

Simple Random Sample – every subject in the population has the same probability of being selected, and every group of subjects of a given size (pairs, triples, quads, etc.) has the same probability of being selected as any other group of that size.
  o Drawing names from a hat
  o First draw: population size N – each subject has probability 1/N
  o Second draw: population size N – 1 – each remaining subject has probability 1/(N – 1)

Random Sample – members of a population are selected in such a way that each individual member has either an equal or an unequal chance of being selected; that is, the probabilities of selecting different subjects need not be equal.
  o NBA draft

Systematic Sample – select a starting point and then select every kth element in the population.
  o Put 30 numbers in a hat (there are 30 students on my roster).
  o Choose every 6th student to survey.

Convenience Sampling – use results that are very easy to obtain.
  o It is convenient for me to take my sample at 10 a.m. at the grocery store close to the 55+ living community. By the way, I am asking people whether they believe that abortion should be legal or illegal.

Stratified Sampling – subdivide the population into at least two different subgroups (strata) so that subjects within the same subgroup share the same characteristics (such as gender or age bracket), then draw a sample from each subgroup (or stratum).
  o Suppose a group wants to learn the chances of a bond issue passing. The community is composed of two distinct income groups, low and high. The population of the community is stratified into low and high income, and a simple random sample is drawn from each of the two income groups.

Cluster Sampling – first divide the population into sections (or clusters), then randomly select some of those clusters, and then survey every member of each selected cluster.
  o Randomly select airline flights, then survey every passenger on each selected flight.

Multistage Sampling – use a different sampling method in different stages.

Example 9: Determine which sampling technique (voluntary response, random, simple random, systematic, cluster, convenience, or stratified) best describes each situation.
(a) An experimenter wants to estimate the average water consumption per family in a city. He chooses a random starting point on every block and uses the water bill of every 4th house. ____________________
(b) Six different health plans were randomly selected and all of their members were surveyed about customer satisfaction. ____________________
(c) I surveyed all of my students to obtain sample data consisting of the number of credit cards students possess.
_____________________
(d) In a study of college programs, 820 students were randomly selected from those majoring in communications, 1463 were randomly selected from those majoring in mathematics, and 760 were randomly selected from those majoring in history. _____________________
(e) Wes set up a booth outside the cafeteria and waited for students to approach him. He asked students whether professors at IVC should be allowed to assign homework. _____________________
(f) On your desk, you have five flash cards. Each flash card has either A, B, C, D, or F on it. Draw a flash card, and that is your grade for the course. _____________________

Misuse of Data

Reported Results – take the measurements yourself rather than allowing subjects to report their own data.
  o I will measure the height of each of you rather than having you report to me how tall you are.

Small Samples – conclusions should not be based on samples that are “not large enough”. The question arises: how large a sample is large enough? This brings us to a discussion of the normal distribution.
  o I will ask five students about their feelings on euthanasia.

Misused Percentages – misleading or unclear percentages. (% review at end of section)
  o Continental Airlines claimed that it had improved 100% on lost luggage in the last six months. The New York Times interpreted this (correctly) to mean that no baggage is now being lost, an accomplishment not yet enjoyed by Continental Airlines.

Loaded Questions – if survey questions are not worded carefully, the results of a study can be misleading. Survey questions can be “loaded”, or intentionally worded to elicit a desired response.

Order of Questions – results from a poll conducted in Germany:
  o Would you say that traffic contributes more or less to air pollution than industry?
  o Would you say that industry contributes more or less to air pollution than traffic?
When traffic was presented first, 45% blamed traffic and 27% blamed industry. When industry was presented first, 24% blamed traffic and 57% blamed industry.

Missing Data
  o Sometimes missing data are out of the control of the experimenter (subjects dropping out of a study for reasons unrelated to the study).
  o Low-income people are less likely to respond to questions about topics such as income.
  o U.S. Census: data are missing from homeless people.
  o People who do not respond to phone surveys (or do not have land lines).

Self-Interest Study
  o Be aware of monetary gains from results. Our survey of 250 hiring professionals demonstrated that you are more likely to get hired if you purchase Tistoni shoes for men (price tag: $38,000). (By the way, I work for Tistoni, and as part of my marketing I took every one of these executives to dinner.)

Deliberate Distortions – see the graphs in Example 10.

Small Samples – make sure your sample size is large enough!

Example 10: Do you want to work for Company A or Company B? The x-axis represents time and the y-axis represents salary. All other aspects of Company A and Company B are equivalent.

[Figure: two graphs of salary (y-axis) against time (x-axis), one for Company A and one for Company B]

Bias – every sampling technique will generate some bias. The goal of the researcher should be to minimize bias. These are two biases that can be minimized:
  o Non-response Bias – the answers of those who respond potentially differ from the answers that non-respondents would have given. At the least, report the non-response, or take another random sample from the group that did not respond.
  o Response Bias – respondents answer questions in the way that they believe the surveyor wants them to respond.

Correlation vs. Causality

Example 11: It is a known fact that as ice cream sales increase, shark attacks increase. Therefore, we can conclude that sharks love to eat people who eat ice cream!! Obviously, ‘we’ ice cream eaters taste better! Is this a valid conclusion? Why or why not?

Correlation vs. Causality and Confounding

A common error is to find a statistical association between two variables and then conclude that one of the variables causes (or directly affects) the other. Two variables may seem to be linked (smoking and pulse rate), but the increase in pulse rate may or may not be caused by smoking. The relationships between shark attacks and ice cream sales, and between smoking and pulse rate, are correlations. Even though the association may hold in a number of cases, we cannot conclude that one variable caused the other.
  o Need to consider confounding variables.

Confounding Variables – confounding occurs when we are not able to distinguish among the effects of different factors.

Moral of the story – CORRELATION DOES NOT IMPLY CAUSALITY!!!

Example 12: David’s study demonstrated that tall people read better.
(a) Does being tall CAUSE people to read better?
(b) Is it possible that there is a correlation between being tall and reading better?
(c) What are some possible confounding variables?

Studies

Observational Study – we observe and measure specific characteristics, but we do not attempt to modify the subjects being studied.
  o Cross-Sectional Study – data are observed, collected, and measured at one particular point in time. Aims to provide data on the entire population.
  o Retrospective Study (an observational study that looks back historically; also called a case-control study) – data are collected from the past by going back in time (through examination of records, interviews, and so on). Data are observed, collected, and measured from a particular subset of the population. Used frequently in epidemiology: subjects with a particular condition (“the cases”) are compared with subjects who do not have the condition (“the controls”) but who are otherwise similar.
  o Prospective (Longitudinal) Study – data are collected in the future from groups sharing common factors (called cohorts).
Experiment – we apply some treatment and then proceed to observe its effect on the subjects. (Subjects in experiments are called experimental units.)

**Note: In an observational study, no treatment is given.

Example 13: Identify whether each study is cross-sectional, retrospective, or prospective.
(a) Qualcomm funded a project that studied, over a period of 4 years, the effect of being taught by a math specialist on the ability of 4th–6th grade students to communicate mathematics.
(b) Physicians at the Mount Sinai Medical Center plan to study emergency personnel who worked at the site of the terrorist attacks in New York City on September 11, 2001. They plan to study these workers from now until several years into the future.
(c) University of Toronto researchers studied 669 traffic crashes involving drivers with cell phones. They found that cell phone use quadruples the risk of a collision.

Experimental Design

Designing an experiment is important. A faulty design can result in ‘GIGO’ (as your author suggests): garbage in, garbage out!

Famous example: The Salk Vaccine Experiment
  o 1954 – test the effectiveness of the Salk vaccine in preventing polio.
  o Treatment group: 200,745 children injected with the Salk vaccine.
  o Placebo group: 201,229 children injected with a placebo (essentially the equivalent of a sugar pill: it has no effect).
  o Result: 115 of the children injected with the placebo developed paralytic polio; 33 of the children injected with the vaccine developed paralytic polio.

Types of Experiments

Randomization – subjects are assigned to different groups through a process of random selection. Equivalent to flipping a coin!
  o The children in the Salk vaccine experiment were assigned to the treatment or control group based on random selection.
  o Use chance as a way to create similar groups.

Replication – repetition of an experiment on more than one subject.
  o Sample sizes should be large enough so that the erratic behavior characteristic of small samples does not disguise the true effects of differing treatments.
  o Larger sample sizes increase the chance of recognizing different treatment effects, but a large sample size does not necessarily indicate a good sample.

Completely Randomized Experimental Design – assign subjects to different treatment groups through a process of random selection (Salk vaccine).

Randomized Block Design – a block is a group of subjects that are similar, but blocks differ in ways that might affect the outcome of an experiment (e.g., blocking by gender when assigning treatment for heart medication).

Rigorously Controlled Design – subjects are carefully assigned to different treatment groups so that the groups are similar in ways that are important to the experiment.

Matched Pairs Design – compare exactly two treatment groups by using subjects matched in pairs that are somehow related and/or share similar characteristics. The pairs can be related subjects (twin studies) or the same subject given both treatments (Coke/Pepsi, anyone?).

Blinding & Placebo Effect

Blinding is a technique used in an experiment in which the subject doesn’t know whether he or she is receiving the treatment or the placebo.

In a double-blind experiment, neither the subject nor the investigator knows whether the subject received the treatment or the placebo.

The placebo effect occurs when an untreated subject reports an improvement in symptoms.

** Note: The Salk vaccine experiment was a double-blind experiment.

Sampling Errors

There will always be sampling error, no matter how well the experimental design is planned. If we randomly sample 1000 IVC students and ask whether they obtained a high school diploma or a GED, the result will differ slightly from that of another 1000 students asked the same question. (We often hear the term “margin of error” in reporting statistics. This will be discussed later.)

Sampling Error – the difference between a sample result and the true population result; such an error results from chance sample fluctuations.
Non-sampling Error – occurs when the sample data are incorrectly collected, recorded, or analyzed (such as selecting a biased sample, using a defective measurement instrument, or copying the data incorrectly).
  o The student who ends up with 1005% in the class had an incorrect test score entered in the grade book!

Example 14: Choose something you want to study and design an experiment.

Homework: Chapter 1

Section   Problems
1.1       NA
1.2       1-18, 23, 26, 28
1.3       1-32, 34
1.4       1, 3, 4, 5, 6, 8, 9, 10, 12, 13, 15-19, 21, 24, 25, 28, 30
1.5       1-4, 6, 9, 11, 12, 13, 15, 16, 18, 19, 21-26, 27, 29, 31

PERCENTAGE REVIEW

“of” often means multiply.
Percent means per hundred, so n% = n/100.
Percentage of: change the % to × 1/100, then multiply.
Fraction to percentage: divide the numerator by the denominator and multiply by 100.
Decimal to percentage: multiply the decimal by 100 and put in the percent symbol.
Percentage to decimal: remove the percent symbol and divide by 100.

Perform the indicated operation.
a. 12% of 1200
b. Write 5/8 as a percentage.
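
The percentage rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example and do not come from the notes, and the sample values are chosen to differ from the practice problems. Exact fractions are used for the first two rules so that no rounding creeps in.

```python
from fractions import Fraction

def percent_of(percent, amount):
    """'of' means multiply: n% of amount = (n/100) * amount."""
    return Fraction(percent) / 100 * amount

def fraction_to_percent(numerator, denominator):
    """Divide numerator by denominator, then multiply by 100."""
    return Fraction(numerator, denominator) * 100

def decimal_to_percent(decimal):
    """Multiply the decimal by 100 (the caller appends the % symbol)."""
    return decimal * 100

def percent_to_decimal(percent):
    """Drop the % symbol and divide by 100."""
    return percent / 100

# Illustrative values only (hypothetical, not taken from the exercises)
print(percent_of(15, 80))           # 15% of 80
print(fraction_to_percent(3, 4))    # 3/4 as a percentage
print(decimal_to_percent(0.25))     # 0.25 as a percentage
print(percent_to_decimal(40))       # 40% as a decimal
```

Running the script prints the four converted values; the same functions can be used to check answers to the practice problems after working them by hand.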