Alaa Elmaoued
Nancy Nguyen

Epidemiology/population health
• Study design, types and selection of studies
• Incidence vs. prevalence
• Descriptive studies
• Measures of health status
• Analytical studies: observational vs. interventional
• Survival analysis interpretation
• Systematic reviews and meta-analysis
• Composite health status indicators
• Obtaining and describing samples
• Population pyramids and impact of demographic changes
• Methods to handle noncompliance
• Disease surveillance and outbreak investigation
• Communicable disease transmission
• Points of intervention
• Qualitative analysis
• Study interpretation
• Bias, confounding, and threats to validity
• Internal vs. external validity
• Statistical vs. clinical significance
www.usmle.org

Rates: crude and adjusted
• Crude = overall (e.g., crude mortality rate)
• Adjusted = accounts for differences across categories such as age (e.g., age-adjusted mortality rates)

Mortality
• Standardized mortality ratio (SMR) = (observed # of deaths per year / expected # of deaths per year) x 100
• If the SMR = 100, the # of observed deaths is equal to the # expected
• Population attributable risk (PAR) = incidence in the total population - incidence in the nonexposed group
• Population attributable risk percent (PAR%) = [(incidence in the total population - incidence in the nonexposed group) / incidence in the total population] x 100

Reproductive rates
• Maternal mortality: death of a woman while pregnant or within 42 days of termination of pregnancy, irrespective of the duration and site of the pregnancy, from any cause related to or aggravated by the pregnancy or its management, but not from accidental or incidental causes
  • Usually reported per 100,000 registered live births
• Neonatal mortality: death of a live-born baby within the first 28 days of life (early neonatal mortality = within the first 7 days)
  • Per 1,000 live births
• Infant mortality: death of a child less than 1 year of age
  • Per 1,000 live births
• Under-5 mortality

Survival curves
• Note that the y-axis represents the proportion of survivors and the x-axis represents time moving forward
• Generally used to assess survival with death as the defining "event", but can also be used for other health outcomes such as fertility
• The data define the intervals, rather than the intervals being predetermined
• Makes full use of the data and is more accurate
• Accounts for some loss to follow-up

Years of potential life lost (YPLL)
• Measure of premature mortality or early death (i.e., people who die younger lose more future productive years than people who die at an older age)
• Based on life expectancy of the population

Quality-adjusted life years (QALY)
• Measure of the quality of remaining life years
• Used to evaluate different healthcare interventions
• Quality of life is based on a scale from 0 to 1, where 0 is death and 1 is the best possible health state

Disability-adjusted life years (DALY)
• Years of life lost to premature death AND years lived with a disability of specified severity and duration
• Measure of overall disease burden that combines mortality and morbidity

Outbreak investigation
1. Define the outbreak and validate the existence of an outbreak
2. Examine the distribution of cases by time and place
3. Look for combinations (interactions) of relevant variables
4. Develop hypotheses based on: existing knowledge (if any), analogy to diseases of known etiology, findings from investigation of the outbreak
5. Test hypotheses
6. Recommend control measures
7. Prepare a written report of the investigation and the findings
8. Communicate findings to those involved in policy development and implementation and to the public

Attack rate = # of people at risk in whom a certain illness develops / total # of people at risk
Herd immunity
Reportable diseases: What types of diseases are reportable/notifiable?
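The rate formulas in this part of the deck (SMR, PAR, PAR%, attack rate) are simple ratios. The sketch below restates them as small Python functions. The function names and all example numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def standardized_mortality_ratio(observed_deaths, expected_deaths):
    """SMR = (observed deaths / expected deaths) x 100."""
    return 100 * observed_deaths / expected_deaths


def population_attributable_risk(incidence_total, incidence_unexposed):
    """PAR = incidence in the total population - incidence in the nonexposed group."""
    return incidence_total - incidence_unexposed


def population_attributable_risk_percent(incidence_total, incidence_unexposed):
    """PAR% = PAR / incidence in the total population x 100."""
    par = population_attributable_risk(incidence_total, incidence_unexposed)
    return 100 * par / incidence_total


def attack_rate(people_ill, people_at_risk):
    """Attack rate = people at risk who became ill / total people at risk."""
    return people_ill / people_at_risk


# Hypothetical numbers, for illustration only.
print(standardized_mortality_ratio(120, 100))        # 120.0: more deaths than expected
print(population_attributable_risk(50, 20))          # 30 (cases per 1,000)
print(population_attributable_risk_percent(50, 20))  # 60.0
print(attack_rate(30, 120))                          # 0.25
```

Incidences are passed per 1,000 people here so that the subtraction stays in whole numbers; any consistent unit works.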
Levels of prevention

Level      Definition                                                                  Example
Primary    Preventing the initial development of a disease                             Immunization
Secondary  Early detection of existing disease to reduce severity and complications    Screening for cancer
Tertiary   Reducing the impact of the disease                                          Rehabilitation for stroke

The physical examination records of the entire incoming freshman class of 1935 at the University of Minnesota were examined in 1977 to see if their recorded height and weight at the time of admission to the university were related to the development of coronary heart disease by 1986. This is an example of:
A. A cross-sectional study
B. A case-control study
C. A concurrent cohort study
D. A retrospective cohort study
E. An experimental study
Answer: D. Exposure is taken from records made decades earlier, so this is a retrospective (nonconcurrent) cohort study.

Residents of three villages with three different types of water supply were asked to participate in a survey to identify cholera carriers. Because several cholera deaths had occurred recently, virtually everyone present at the time underwent examination. The proportion of residents in each village who were carriers was computed and compared. What is the proper classification for this study?
A. Cross-sectional study
B. Case-control study
C. Concurrent cohort study
D. Nonconcurrent cohort study
E. Experimental study
Answer: A. Exposure (water supply) and outcome (carrier status) are measured at the same time.

A case-control study is characterized by all of the following except:
A. It is relatively inexpensive compared with most other epidemiologic study designs
B. Patients with the disease (cases) are compared with persons without the disease (controls)
C. Incidence rates may be computed directly
D. Assessment of past exposure may be biased
E. Definition of cases may be difficult
Answer: C. Subjects are selected by disease status, so incidence cannot be computed directly.

Cross-sectional study
• AKA prevalence study
• Both exposure and disease outcome are determined simultaneously
• Cannot establish temporal relationship between the exposure and onset of disease

Case-series/Case-report
• Case report = one person
• Case series = more than one

Ecological study
• Based on aggregate or group data, not on individuals (e.g., cause of death in different countries)

Descriptive studies overall
• Evaluate subjects with known exposure with similar treatment, OR for exposure and outcome simultaneously
• No hypothesis testing
• Vulnerable to selection bias (select certain patients)
• No control/comparison group = low internal validity
• Vulnerable to Hawthorne effect

Cohort study
• Selection of subjects is based on exposure
• Groups are followed to compare incidence of disease or other health outcomes
• Prospective aka concurrent aka longitudinal cohort study
• Retrospective aka nonconcurrent aka historical cohort study
• Good for evaluating temporal/causal association
• Bad for rare diseases
• Expensive and time-consuming
• Problems with loss to follow-up

Case-control study
• Selection of subjects is based on disease or other health outcome
• Groups are evaluated to compare past exposure
• Incident cases are preferred over prevalent cases (prevalent cases reflect survival as well as development of disease)
• Matching
  • Group = frequency match
  • Individual = each case matched to a control
• Relatively inexpensive and does not require as much time
• Susceptible to recall bias
• Good for rare diseases
• Bad for rare exposures

Randomized Control Trial
• Essentially the Gold Standard
• Unethical in a lot of cases!
• Double-blind
• Placebo-controlled
• Community intervention

Systematic Review
• A research study which aims to provide an exhaustive summary of current literature relevant to a research question
• Crucial to EBM

Meta-analysis
• A statistical technique used to combine the results of all eligible studies in a systematic review into a single quantitative estimate or summary effect size
• Effect sizes measure the strength of the relationship between two variables, thereby providing information about the magnitude of the intervention effect
• Heterogeneity is a value calculated to determine if individual studies are similar enough to compare (prefer non-significant findings for heterogeneity)
• Publication bias is particularly problematic for systematic reviews because not all studies are published, depending on the significance and direction of effects detected.
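The slides do not say how a meta-analysis combines studies. One common approach, assumed here, is fixed-effect inverse-variance weighting: each study's log odds ratio is weighted by 1/SE², and Cochran's Q summarizes heterogeneity. The function name and the three studies below are hypothetical; this is a minimal sketch, not the method used by any particular review.

```python
import math


def pooled_fixed_effect(log_effects, standard_errors):
    """Fixed-effect (inverse-variance) pooling of study effects on the log scale.

    Returns (pooled log effect, its standard error, Cochran's Q).
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    return pooled, pooled_se, q


# Three hypothetical studies: odds ratios and standard errors of the log odds ratio.
odds_ratios = [0.8, 0.7, 0.9]
ses = [0.2, 0.1, 0.3]

pooled, pooled_se, q = pooled_fixed_effect([math.log(o) for o in odds_ratios], ses)
low = math.exp(pooled - 1.96 * pooled_se)
high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled OR = {math.exp(pooled):.2f}, 95% CI {low:.2f} to {high:.2f}, Q = {q:.2f}")
```

More precise studies (smaller SE) pull the summary estimate toward their own result, which is why larger studies are usually drawn with larger squares on a forest plot.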
Forest plot (figure not reproduced)
• Horizontal line = confidence interval
• Each square represents the result from an individual study
• Center line = 1.0 (no association)
• Overall result from the meta-analysis

Bias
• Any systematic error in the design, conduct, or analysis of a study that results in a mistaken estimate of an exposure's effect on the risk of disease
• Selection bias
  • Error introduced when the study population does not represent the target population
  • Can be introduced at any stage of a research study
• Information bias
  • Occurs during data collection and can lead to misclassification

Types of bias
• Sampling bias or non-random sampling bias: a selection procedure that yields a non-representative sample in which a parameter estimate differs from the true value in the target population
  • Example: telephone random sampling, which would systematically exclude households without telephones
• Ascertainment bias
• Healthcare access bias
• Survivor treatment selection bias
• Recall bias
  • Occurs if the presence of disease influences the perception of its causes or the search for exposure to the putative cause
  • Common in case-control studies where participants are aware of their disease status, but can also occur in cohort studies
• Ecologic fallacy
  • When results of an ecological (group-level) analysis are used to make inferences at the individual level
• Hawthorne effect
  • When individuals modify how they react or behave in response to their awareness of being observed

Confounding
• An extraneous variable that correlates (directly or inversely) with both the dependent variable and the independent variable
• Example: drinking coffee and pancreatic cancer
• Confounding is not an error in the study but can be considered a true phenomenon that is identified in a study and must be understood
• One approach is to stratify: if the apparent association is due entirely to the confounder, then after stratifying the data by the confounding variable the measure of association within each stratum will equal 1.0
• If you know of a possible confounder during the design phase of your study, you can match cases to controls based on the confounding variable
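To illustrate stratification, the sketch below uses made-up case-control counts for the coffee and pancreatic cancer example, assuming smoking is the confounder (the slide does not name one). The counts are hypothetical and were chosen so that the crude odds ratio is elevated while each stratum-specific odds ratio is exactly 1.0.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    OR = (a/c) / (b/d) = (a*d) / (b*c)
    """
    return (a * d) / (b * c)


# Hypothetical counts (a, b, c, d) for coffee exposure, split by smoking status.
smokers = (80, 40, 20, 10)
nonsmokers = (10, 20, 40, 80)

# The crude table ignores smoking and simply adds the two strata cell by cell.
crude = tuple(s + n for s, n in zip(smokers, nonsmokers))

print(odds_ratio(*crude))       # 2.25: coffee appears associated with disease
print(odds_ratio(*smokers))     # 1.0: no association among smokers
print(odds_ratio(*nonsmokers))  # 1.0: no association among non-smokers
```

In this constructed data set, smokers are both more likely to drink coffee and more likely to be cases, which is what produces the crude association.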
Internal validity
• The extent to which a study is able to make causal conclusions based on the design and ability to reduce systematic error
• Essentially how well you designed your study (confounding = red flag!)

External validity
• Whether the findings of a study can be generalized to the rest of the population
• Example: hospital cohorts

Alaa Elmaoued
Nancy Nguyen

• Sensitivity and Specificity
• Positive and Negative Predictive Values
• Incidence and Prevalence
• Odds Ratio
• Relative Risk
• Attributable Risk
• Relative Risk Reduction
• Absolute Risk Reduction
• Number Needed to Treat
• Number Needed to Harm
• t-Test
• ANOVA
• Chi-square
• Pearson Correlation Coefficient
• Error types

Incidence and Prevalence
• Incidence RATE = number of new cases / population at risk
  • Incidence looks at new cases in a time period
• Prevalence = number of total existing cases / population at risk
• Prevalence = incidence x duration of disease
• Chronic disease with long duration has a high prevalence
• Disease with short duration has low prevalence, which approximates the incidence of disease

Smithville has a stable population of 100,000, and 2000 individuals in this community have been diagnosed with disease X. Although 300 individuals in Smithville die each year from all causes, 100 of those die from disease X. There are 50 new cases of the disease each year. The annual incidence of this disease is represented by which of the following?

The incidence is represented by the number of new cases of the disease in a given period divided by the susceptible population. Because the 2000 people with the disease are no longer susceptible, they must be subtracted from the total population; thus the incidence is 50/98,000.

A research group is studying sickle cell disease in a geographically isolated community of 6000 people. A genetic analysis is performed on every community member. At the beginning of the year, it is determined that 10% are homozygous for hemoglobin S and therefore have sickle cell disease, and 30% of the community is heterozygous for the mutant allele. Over the course of the year, 100 infants are born, six of whom are diagnosed with sickle cell disease. Of 80 people who die during the year, three had sickle cell disease. Which of the following is the current prevalence of sickle cell disease in this population?

Prevalence is the total number of cases in a population divided by the total population at risk of the disease. Multiply the initial population (6000) by the initial prevalence (10%), yielding 600 cases. Over the course of the year, there was a net gain of 3 patients with sickle cell disease (6 born with it, 3 died), bringing the new total to 603. Likewise, the new population at risk is 6020, a net gain of 20 people (100 births, 80 deaths). Therefore, the current prevalence is 603/6020.

Sensitivity
• Be Sensitive to Positive people
• Sensitivity is how well a test identifies those who have the disease
• Sensitivity = True Positives / (True Positives + False Negatives), OR = 1 - false-negative rate
• SN-N-OUT: a highly SeNsitive test, when Negative, rules OUT the disease

β-Thalassemia major results from a homozygous genotype that leads to complete absence of both the β-globin chains. A study subjected 100,000 participants to an intrauterine screening test; 87 tested positive for β-thalassemia major, and the remaining 99,913 tested negative. In 7 of those 87 cases the results were shown to be false positive. Ultimately, 100 of those originally screened were found to actually have the disease. Which of the following is the correct sensitivity of the intrauterine screening test?

True positives = 87 - 7 = 80. Because 100 participants truly have the disease, false negatives = 100 - 80 = 20, so sensitivity = 80/(80 + 20) = 80%.

Positive predictive value (PPV)
• Proportion of positive test results that are truly positive
• If the test result is positive in this patient, what is the probability that this patient truly has the disease?
• PPV = TP / (TP + FP)
• PPV is directly related to prevalence
• High prevalence means high PPV

Investigators studying cardiovascular disease discover a new serum protein marker that is correlated with the presence of ruptured atherosclerotic plaques. It is hoped that this serum marker could be used as a screening test to identify whether a person has had a recent MI. In a phase III clinical trial of 1400 subjects, the investigators find that of the 500 subjects who had an MI, 400 tested positive for the serum marker, whereas 850 subjects who did not have an MI tested negative for the marker. If this marker were used to screen patients for recent MI, what is the probability that a person will have had an MI given a positive serum protein analysis?

The question is asking to calculate the positive predictive value of the test, i.e., the probability that a person with a positive serum marker on the screening test will indeed have had a recent MI. Here TP = 400 and FP = (1400 - 500) - 850 = 50, so PPV = 400/(400 + 50) ≈ 89%.

Specificity
• Specificity is the proportion of people without the disease who test negative
• SP-P-IN: a highly SPecific test, when Positive, rules IN the disease
• Specificity = True Negatives / (True Negatives + False Positives), OR = 1 - false-positive rate

Negative predictive value (NPV)
• Proportion of negative test results that are true negatives
• If the test is negative, what is the probability that this patient does not have the disease?
• NPV = True Negatives / all people who tested negative (TN + FN)
• NPV is inversely related to prevalence
• High prevalence = low NPV

Measures of association
• How to determine whether a certain disease is associated with a certain exposure
• To determine whether an association exists, we can use data from case-control and cohort studies

Odds Ratio
• Used in case-control studies
• Odds that the group with disease (cases) was exposed to a risk factor (a/c) divided by the odds that the group without the disease (controls) was exposed (b/d)

Researchers are investigating the relationship between cell phone use and brain cancer. Of 50 brain cancer patients, 30 admitted to using a cell phone for 10 years or more. Of 400 healthy participants in the study, 250 were found to have used a cell phone for 10 years or more. Which of the following is an appropriate conclusion to draw from this study regarding cell phone use and brain cancer?
The clinical study described is a case-control study. Case-control studies look at those with the disease (the cases) compared to those without the disease (the controls). The odds ratio is then calculated as OR = (odds in disease group) / (odds in control group) = [30/(50-30)] / [250/(400-250)] = 9/10.

Relative Risk
• Used in cohort studies
• Risk of developing disease in the exposed group divided by risk in the unexposed group

Attributable Risk
• Defined as the difference in risk between exposed and unexposed groups, or the proportion of disease occurrences that are attributable to the exposure

Number needed to treat is defined as the number of patients who need to be treated for 1 patient to benefit.
Number needed to harm is defined as the number of patients who need to be exposed to a risk factor for 1 patient to be harmed.

Statistical tests
• t-Test checks the difference between the means of 2 groups.
• ANOVA checks the difference between the means of 3 or more groups.
• Chi-square checks the difference between 2 or more percentages or proportions of categorical outcomes; used for frequency data rather than for comparison of means.

A physician is studying the effects of drug A and drug B on cognitive performance in Alzheimer patients. She administers a memory test to two groups of subjects (those taking drug A and those taking drug B) and compares their mean scores. Which of the following statistical tests would be most appropriate for this purpose?
A. ANOVA
B. Chi-square test
C. Linear regression analysis
D. t-Test
E. Multiple linear regression
Answer: D. The t-Test is used to compare two means derived from two samples.

Pearson Correlation Coefficient
• r is always between -1 and +1
• The closer the absolute value of r is to 1, the stronger the linear correlation between the 2 variables
• Positive r value means a positive correlation
• Negative r value means a negative correlation
• The coefficient of determination (r²) is what is usually reported (e.g., on graphs)

Error types
Type I (α) errors and Type II (β) errors indicate that you accepted the wrong hypothesis.
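To make α, β, and power concrete, here is a minimal sketch under stated assumptions: a one-sided, one-sample z-test with a known standard deviation, which the slides do not specify. α is the accepted type I error rate, β is the type II error rate, and power is 1 - β. The function name, effect size, and standard deviation are hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect, sd, n, alpha=0.05):
    """Approximate power (1 - beta) of a one-sided, one-sample z-test.

    effect: true difference from the null value (hypothetical)
    sd:     known standard deviation (an assumption of the z-test)
    n:      sample size
    alpha:  accepted probability of a type I error
    """
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha)  # rejection threshold under the null
    shift = effect * n ** 0.5 / sd                   # true effect in standard-error units
    beta = standard_normal.cdf(z_critical - shift)   # chance of missing a real effect
    return 1 - beta


# Power grows with sample size for the same hypothetical effect.
for n in (10, 25, 50, 100):
    print(n, round(z_test_power(effect=2, sd=5, n=n), 3))
```

When the true effect is zero, the function returns α itself: every rejection of the null is then a type I error.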
Type I (α) error
• "False-positive" error
• You accepted your hypothesis (alternative hypothesis) rather than the null hypothesis, although the null hypothesis is true
• α is the probability of making a type I error; the p-value is judged against the preset α

Type II (β) error
• "False-negative" error
• You fail to reject the null hypothesis when it is actually wrong
• β is the probability of making a type II error

Power
• Power = 1 - β
• A study with greater power has less type II error
• The power is the probability of rejecting the null hypothesis when it is in fact false (this is what we want to happen)
• Conventionally, a study should have a power of 0.8 (or a β of 0.2) to be accepted
• Important: increasing the sample size is the most practical and important way of increasing the power of a statistical test, i.e., there is power in numbers

A medical resident decides to test the hypothesis that people with Alzheimer's have elevated serum sodium levels. The Type I error of this study was 0.078. What does this analysis represent for the study?
A. Determines the power of a study to detect a significant change
B. Probability of Type I error is known as β
C. Represents the probability of incorrectly rejecting the null hypothesis
D. Most studies used a probability of error level of 0.10 to determine the significance
E. It is equal to 1 - β
Answer: C. By convention, α should be less than 0.05 to be acceptable.

References
• USMLE Step 1 Qbook, Fifth edition
• USMLERx Qbank 2015
• First Aid 2015 edition
• USMLE Step I Secrets

…You should probably start running…

In a city with a population of 1 million, 10,000 individuals have SLE. There are 1,000 new cases of SLE each year and 200 deaths caused by the disease. There are 2,500 deaths per year from all causes. Assuming no net emigration or immigration to the city, the incidence of SLE in this city is given by which of the following expressions?
A. 800/990,000
B. 800/1,000,000
C. 1,000/990,000
D. 1,000/1,000,000
E. 2,500/1,000,000
F. 10,000/1,000,000
Answer: C. Don't forget to subtract the prevalent cases of SLE! They are not part of the population at risk of becoming new cases.

Researchers are developing a screening test for awesomeness which has a sensitivity of 95% and a specificity of 90%. If the prevalence of awesomeness is 10%, which of the following is the best estimate for the probability that a person who tests negative for awesomeness is actually not awesome at all?
A. 45%
B. 50%
C. 85%
D. 90%
E. 95%
F. 99%

               Awesome                   Not Awesome       Total
Pos Awesome    95 (sn = 95%)             90                185
Neg Awesome    5                         810 (sp = 90%)    815
Total          100 (prevalence = 10%)    900               1000 (start with a nice round number)

Answer: F. Negative Predictive Value = TN/(TN + FN) = 810/815 = 99%

A study is conducted to evaluate the average number of pizza slices consumed by medical students during their first year. Results of 100 students surveyed show an average number of pizza slices of 110 with a standard deviation of 20. Which of the following is the best estimate for the 95% confidence interval for the mean in this sample?
A. 70 to 130
B. 70 to 150
C. 85 to 115
D. 90 to 130
E. 105 to 115
F. 106 to 114

CI = sample mean ± Z x (SD/√n)
Z-score for 95% CI = 1.96 ≈ 2
= 110 ± 2 (20/√100)
= 110 ± 2 (20/10)
= 110 ± 2 (2)
= 110 ± 4
= (106, 114)
Answer: F.

A screening test used to detect cervical cancer has a sensitivity of 96%, a specificity of 90%, a positive predictive value of 92%, and a negative predictive value of 95%. A recent study on the impact of Gardasil suggests that the prevalence of cervical cancer has declined. Given this information, how will this impact the results of the screening test?
A. Decrease the sensitivity
B. Decrease the specificity
C. Increase the negative predictive value
D. Increase the positive predictive value
E. Increase the sensitivity
F. Increase the specificity
Answer: C. A change in prevalence is a change in the population, not the screening exam; therefore you can eliminate answers A, B, E, and F, because sensitivity and specificity are qualities of the TEST and not of the population. If the prevalence of a disease goes down, a larger share of those tested are true negatives and a smaller share are true positives; thus the NPV increases and the PPV decreases.
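The effect of prevalence on predictive values can be checked numerically. The sketch below rebuilds the 2x2 table from sensitivity, specificity, and prevalence, as in the awesomeness example, then repeats the calculation at a lower prevalence. The function name and the 5% prevalence are hypothetical choices for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence, population=1000):
    """Return (PPV, NPV) by filling in a 2x2 table for a notional population."""
    diseased = prevalence * population
    healthy = population - diseased
    true_pos = sensitivity * diseased
    false_neg = diseased - true_pos
    true_neg = specificity * healthy
    false_pos = healthy - true_neg
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Awesomeness example: sensitivity 95%, specificity 90%, prevalence 10%.
ppv, npv = predictive_values(0.95, 0.90, 0.10)
print(round(ppv, 2), round(npv, 2))  # 0.51 0.99

# Same test, hypothetical lower prevalence of 5%: PPV falls and NPV rises.
ppv_low, npv_low = predictive_values(0.95, 0.90, 0.05)
print(round(ppv_low, 2), round(npv_low, 3))
```

Sensitivity and specificity are inputs here and never change; only the mix of diseased and healthy people changes, which is why the predictive values move.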