A foundation for analysis in the health sciences

Yongli YANG, Ph.D., Associate Professor
Department of Biostatistics & Epidemiology, College of Public Health
TEL: 67781249
E-mail: [email protected]

STATISTICS IN LIFE
• GDP in China increased by 7.7% in 2013, according to the report of the State Statistical Bureau.
• Life expectancy was 74.83 years in the 6th population census.
• Weather forecast in Zhengzhou

Theory course content
Week  Content
8     Introduction
9     Description of quantitative variables
10    Description of qualitative variables. Statistical tables and graphs
10    Exercise: statistical description
11    Normal distribution
11    Sampling error and sampling distribution
12    The principle of hypothesis testing
12    t test
13    One-way analysis of variance
13    Nonparametric tests
14    Exercise: t test and ANOVA
14    Chi-square test
15    Simple linear correlation analysis

Chapter I Introduction to biostatistics
• Introduction
• Some basic concepts
• Basic steps of statistical work
• Review questions and exercises

Be familiar with
• Basic steps of statistical work
Understand
• The definitions: statistics and biostatistics
Master
• The definitions: population, sample, probability, quantitative variable, qualitative variable

I. INTRODUCTION
We are frequently reminded of the fact that we are living in the information age. Appropriately, then, this subject is about information: how it is obtained, how it is analyzed, and how it is interpreted. The information we are concerned with is called data, and the data are available to us in the form of numbers.

Question 1
We aim to explore whether smoking is harmful to health. How should we explore this? Which outcome should we study: lung cancer, heart disease, or other diseases?

Group               Lung cancer   No lung cancer   Rate of lung cancer
Smoking group       a             b                a/(a+b)
Non-smoking group   c             d                c/(c+d)

Compare the two rates, then draw a conclusion.

Question 2
In general men are taller than women, yet some women are taller than some men.
• Therefore, if you wanted to 'prove' that men are taller, you should measure many people of each sex.
• How many people should you measure?

Question 3
A doctor used a new drug to treat 5 AIDS patients, and 4 of them were cured.
Conclusion: the cure rate of this drug is 80%.
Is this conclusion right? Why or why not?

A knowledge of statistics is like a knowledge of foreign languages or of algebra; it may prove of use at any time under any circumstances.
A. L. Bowley

II. SOME BASIC CONCEPTS
① Data
② Statistics and biostatistics
③ Population and sample
④ Variable
⑤ Parameter and statistic
⑥ Probability

① DATA
Definition: The raw material of statistics is data. For our purposes we define data as numbers.
Sources of data:
• Routinely kept records. Hospitals keep day-to-day records, which contain immense amounts of information on patients. When the need for data arises, we should look for them first among routinely kept records.
• Surveys. If the data needed to answer a question are not available from routinely kept records, the logical source may be a survey. For example, if the administration of the health department wants to learn the number of people with hypertension in Zhengzhou, it may conduct a survey.
• Experiments. Frequently the data needed to answer a question are available only as the result of an experiment. For example, a nurse wishes to know which of several strategies is best for maximizing patient compliance.
• External sources. The data needed to answer a question may already exist in the form of published reports, for example statistical yearbooks and population censuses.

② STATISTICS
A science dealing with the collection, analysis, interpretation and presentation of masses of numerical data.
(Webster's International Dictionary)

The science and art of dealing with variation in data through collection, classification, and analysis in such a way as to obtain reliable results.
(John M. Last, A Dictionary of Epidemiology)

The tools of statistics are employed in many fields: demography, national economics, psychology, medicine, and so on.
• Demographic statistics
• National economic statistics
• Psychological statistics
• Biostatistics
• ……

② BIOSTATISTICS
When the data analyzed are derived from the biological sciences and medicine, we use the term "biostatistics" to distinguish this particular application of statistical tools and concepts.

③ Population and sample
We want to learn the average income of Beijing doctors in 2010. Suppose there were 20,000 doctors in Beijing in 2010.
→ Investigate all the doctors one by one (but this is time-consuming).
→ Draw 500 doctors at random from the 20,000, then generalize the population average income from the incomes of these 500 doctors.

Questions
1. What is the study aim?
2. What is the study population?
3. What is the observational unit?
4. What is the sample?
5. What is the sample size?

Answers
1. To learn the average income of Beijing doctors in 2010
2. The incomes of the 20,000 doctors
3. The individual doctor
4. The incomes of the 500 doctors
5. 500

Population
Definition: A population is the largest collection of entities in which we have an interest at a particular time.
For example, if we are interested in the weights of all the children enrolled in a certain county elementary school system, our population consists of all these weights.

A population may be finite or infinite. If a population of values consists of a fixed number of these values, the population is said to be finite. If, on the other hand, a population consists of an endless succession of values, the population is an infinite one.

Sample
Definition: A sample is a random part of a population.
Suppose our population consists of the weights of all the elementary school children enrolled in a certain county school system.
If we collect for analysis the weights of only a fraction of these children, we have only a part of our population of weights; that is, we have a sample.

How do we get a random part of a population?
• Simple random sampling
• Systematic sampling
• Stratified sampling
• Cluster sampling
If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same chance of being selected, the sample is called a simple random sample.
[Figure: population units numbered 1 to 17, with some of them drawn as the sample]

④ VARIABLE
If, as we observe a characteristic, we find that it takes on different values in different persons, places, or things, we label the characteristic a variable.
Examples: heart rate, the heights of adult males, diastolic blood pressure, gender, blood type, treatment effect

Types of variable
• Quantitative variable
• Qualitative variable
  - Binary variable
  - Multiple categorical variable
  - Ordinal variable

Quantitative variable
• also known as a metric or numerical variable
• is one that can be measured in the usual sense
• conveys information regarding amount
• examples: the weights of preschool children, diastolic blood pressure

Qualitative variable
• also known as a categorical or nominal variable
• is one that cannot be measured in the usual sense and can only be categorized
• conveys information regarding attribute

Binary variable: gender, alive or dead, yes or no.
Multiple categorical variable: blood type (A, B, AB, O); race (white, black, yellow, brown)
Ordinal variable: there is an order among the categories. Example: your opinion on something (unsatisfactory, normal, very satisfactory)

④ Variable
ID        Age   Gender   Educational level    Occupation   Height   Weight
2025655   27    male     graduate             teacher      165      71.5
2025653   22    male     undergraduate        doctor       160      74
2025830   25    female   junior high school   worker       158      68
2022543   23    male     senior high school   student      161      69
2022466   25    female   senior high school   worker       159      62
2024535   27    female   elementary           farmer       157      68
2025834   20    male     graduate             cadre        158      66
2019464   24    male     graduate             student      158      70.5
2025783   29    female   junior high school   farmer       154      57

Data transformation
A quantitative variable can be transformed into a qualitative variable:
Numerical variable: weight (kg)
→ Ranked variable: fat or overweight, normal, thin
→ Binary variable: normal, abnormal

Example: WBC counts (per mm³) of five persons

WBC count   Category
3000        lower
6000        normal
5000        normal
8000        normal
12000       higher

Binary variable: normal, 3 persons; abnormal, 2 persons
Ordinal variable: lower, 1 person; normal, 3 persons; higher, 1 person

⑤ Parameter and statistic
Parameter
→ describes a characteristic of a population
→ is usually denoted by a Greek letter, such as μ
→ is usually unknown
Statistic
→ describes a characteristic of a sample
→ is usually denoted by a Latin letter, such as s and p

⑥ Probability
→ the possibility of occurrence of a random event
→ designated as P, with 0 ≤ P ≤ 1
P = 0: impossible event
P = 1: certain event
P ≤ 0.05: small-probability event

Random event: an event that may or may not occur in one experiment. Before the experiment, nobody can be sure whether the event will occur.
Example: throwing a die

Frequency of an event: the number of times the event occurs in a sequence of repetitions of the random phenomenon.
Probability of an event: if, in a long sequence of repetitions, the relative frequency of an event approaches a fixed number, that number is the probability of the event.

[Figure: relative frequency (vertical axis, 0.00 to 1.00) plotted against a horizontal axis running from 0 to 125]

The relationship between relative frequency and probability
→ Probability is the limit of the relative frequency: if the event occurs m times in n repetitions, then f = m/n → P as n → ∞.

III. BASIC STEPS OF STATISTICAL WORK
1 Design
2 Collection of data
3 Sorting of data
4 Analysis of data

1 Design
Professional design
• Study aim
• Study subject
• Measures
Statistical design
• Sampling method
• Allocation method
• Calculation of sample size
• Data processing

2 Collection of data
Sources of data:
• Routinely kept records
• Surveys
• Experiments
• External sources
Principle: timely, accurate, complete

3 Sorting of data
Checking: outliers, missing values
Coding: blood type A (1), B (2), AB (3), O (4); gender male (1), female (2)
Grouping: DBP and SBP → hypotension, normal, hypertension
Computing: weight and height → body mass index

4 Analysis of data
Statistical analysis is divided into two parts: descriptive statistics and inferential statistics.
Statistical description: to teach the student to organize and summarize data
• Indicators
• Tables and charts
Statistical inference: to teach the student how to reach decisions about a large body of data by examining only a small part of the data
• Parameter estimation
• Hypothesis testing

IV. REVIEW QUESTIONS AND EXERCISES
Define:
1. Quantitative variable
2. Qualitative variable
3. Population
4. Sample
5. Probability

Explain the type of each of the following variables:
1. Admitting diagnosis in a mental health clinic
2. Weights of babies born in a hospital during a year
3. Gender of babies born in a hospital during a year
4. Under-arm temperature of patients with fever
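
Supplementary sketch for II ⑥ Probability
The statement that probability is the limit of the relative frequency f = m/n can be illustrated with a small simulation. The Python sketch below is only an illustration and is not part of the lecture: the function name relative_frequencies, the chosen event (a fair die shows a six), the seed, and the number of throws are all assumptions made for the example.

```python
import random


def relative_frequencies(n_throws, target_face=6, seed=0):
    """Throw a fair six-sided die n_throws times (simulated) and return
    the running relative frequency f = m / n after each throw.

    target_face and seed are illustrative assumptions, not lecture values.
    """
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    m = 0                      # frequency: times the event has occurred so far
    running = []
    for n in range(1, n_throws + 1):
        if rng.randint(1, 6) == target_face:
            m += 1
        running.append(m / n)  # relative frequency after n throws
    return running


freqs = relative_frequencies(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: f = m/n = {freqs[n - 1]:.4f}")
print(f"theoretical P = {1 / 6:.4f}")
```

For small n the relative frequency typically fluctuates widely; as n grows it tends to settle near 1/6, which is the sense in which P is described as the limit of f.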