Methodology session III
Monday 16 November 2009

09.15-10.00: Exercise S3
10.15-11.00: Measures of association
11.15-12.00: Exercise S4
12.00-13.00: Lunch break
13.00-13.45: Experimental design incl. randomized controlled trials
14.00-14.45: Calculating sample size
15.00-15.45: Exercise S5

L.11: Calculating sample size
11.1 Sample size calculation
11.2 The inadequacy of small experiments

11.1 Sample size calculations
- How many observations (patients) do we need?
- This question is answered by evaluating the study's statistical power.

Example: Humerfelt et al. (1998) Effectiveness of postal smoking cessation advice: a randomized controlled trial in young men with reduced FEV1 and asbestos exposure. Eur. Resp. J. 11: 284-290.

To calculate the sample size, some key questions have to be answered first:

(1) What is the primary aim of the study?
The example:
- To decide if mailed advice reduces smoking in smokers with a high risk of lung disease.

(2) What is the primary response variable (end point)?
The example:
- Smoking status 1 year after the advice was given.

(3) How will the data be analysed to detect a potential treatment effect?
The example:
- Compare the proportions of quitters in an intervention group and a control group 1 year after a letter with advice about quitting was sent. A two-sided hypothesis test will be performed (the chi-square test without Yates' correction). If p ≤ 0.05 we will conclude that smoking cessation advice by mail influences smoking habits one year later in patients with a high risk of lung disease => α = 0.05.

(4) How large a response is expected in the control group?
The example:
- In the control group, 2.5 % were expected to have quit smoking 1 year after the first letter => p1 = 0.025 for quitting during the year.

(5) What is the smallest treatment effect that would be of (clinical) importance to find, and how certain would you want to be of detecting such an effect?
The example:
- If the letter makes (at least) 5 % quit during the following year, we would like to have (at least) an 80 % chance of detecting it => p2 = 0.05 for quitting and β = 0.20.

(6) Use Pocock's formula for sample size for a dichotomous or continuous response:

Dichotomous response:
$$n = \frac{p_1(1-p_1) + p_2(1-p_2)}{(p_2-p_1)^2} \, f(\alpha,\beta)$$

Continuous response:
$$n = \frac{2\sigma^2}{(\mu_2-\mu_1)^2} \, f(\alpha,\beta)$$

Here n is the number of patients in each group, and the factor f(α,β) is found in Pocock's table.

Pocock's table for f(α,β)

                  β
   α       0.05    0.10    0.20    0.50
  0.10     10.8     8.6     6.2     2.7
  0.05     13.0    10.5     7.9     3.8
  0.02     15.8    13.0    10.0     5.4
  0.01     17.8    14.9    11.7     6.6

The example: f(α,β) = f(0.05, 0.20) = 7.9 gives
$$n = \frac{0.025(1-0.025) + 0.05(1-0.05)}{(0.05-0.025)^2} \times 7.9 = 115 \times 7.9 = 908.5$$
And, adjusted for a 70 % response rate: 908.5 / 0.70 ≈ 1300 in each group.

PS: One-sided test
For a one-sided hypothesis test at significance level α with power 1-β, use the formulas with 2α (and β) instead of α (and β).

Example: For a one-sided test of the hypotheses about smoking with α = 0.05 and power 0.80 you need
$$n = \frac{p_1(1-p_1) + p_2(1-p_2)}{(p_2-p_1)^2} \, f(2\alpha,\beta) = \frac{0.025(1-0.025) + 0.05(1-0.05)}{(0.05-0.025)^2} \, f(0.10, 0.20) = 115 \times 6.2 = 713$$
patients in each group, i.e. a total of 1426 should be randomized.

Example with continuous response variable: Cockburn et al. (1980) Maternal vitamin D intake and mineral metabolism in mothers and their newborn infants. Br. Med. J., 281, 11-14.

Replies to the key questions:
(1) To decide if supplementary vitamin D given to pregnant women prevents hypocalcaemia in newborns.
(2) The child's serum calcium level 1 week after birth.
(3) Compare the mean serum calcium level between a placebo group and a treatment group using a two-sided unpaired t-test at significance level α = 0.05.
(4) Without vitamin D the children are expected to have a mean calcium level of μ1 = 9.0 mg per 100 ml with a standard deviation of σ = 1.8 mg per 100 ml.
(5) If vitamin D increases the mean calcium level to μ2 = 9.5 mg per 100 ml, we would like to have a 95 % chance of detecting it in this RCT => β = 0.05.
(6) This gives
$$n = \frac{2 \cdot 1.8^2}{0.5^2} \times 13.0 \approx 337$$
in each group, i.e. a total of about 700 patients (2 × 337 = 674).

Sample size for an equivalence trial (Pocock p. 129-130)
H0: Not equivalent vs. H1: Equivalent
1. Choose a value d such that, if the two treatments really are equally effective (H1), the upper 100(1-α) % confidence limit for the difference in the proportion of successes on the two treatments should not exceed d with probability 1-β.
2. Use the following formula for the size of each group:
$$n = \frac{2p(1-p)}{d^2} \, f(\alpha,\beta)$$

Sample size for an equivalence trial (Example)
- In an RCT one wants to specify that a new antidepressant will only be considered acceptable if it can be demonstrated with 95 % confidence that it is at worst 10 % inferior to the standard drug. Suppose one accepts a 20 % risk that, even if the drug is really equally effective, one will fail to show it as acceptable in this sense.
- Then p = 0.70, d = 0.10, α = 0.05, β = 0.20, so
$$n = \frac{2 \times 0.70 \times 0.30}{0.10^2} \times 7.9 \approx 332$$

11.2 The inadequacy of small trials:
- A large risk of failing to document treatment effects of clinical importance.
- Too many false 'negative' trials get published.
- Unnecessary experimentation on humans (or animals).
- Delayed progress in the development of new treatments.
- A waste of time, money and effort.

In other words, many published trials are wasted because they did not have the resources necessary to answer the clinical research questions that were posed.

What can be done?
- Do not be too optimistic about patient recruitment.
- Design multi-centre studies.
- Broaden the inclusion criteria.
- Don't conduct the study.
- Meta-analyses.
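The three sample size formulas of section 11.1 can be checked with a short script. The following is a minimal sketch in Python, not part of the original lecture: the function names (f, n_dichotomous, n_continuous, n_equivalence) are hypothetical, f(α,β) is simply looked up in Pocock's table as reproduced in the slides, and p in the equivalence formula is read here as the success proportion assumed common to both treatments.

```python
import math

# Pocock's table of f(alpha, beta), copied from section 11.1.
# Keys are (alpha, beta); alpha is two-sided, power is 1 - beta.
POCOCK_F = {
    (0.10, 0.05): 10.8, (0.10, 0.10): 8.6, (0.10, 0.20): 6.2, (0.10, 0.50): 2.7,
    (0.05, 0.05): 13.0, (0.05, 0.10): 10.5, (0.05, 0.20): 7.9, (0.05, 0.50): 3.8,
    (0.02, 0.05): 15.8, (0.02, 0.10): 13.0, (0.02, 0.20): 10.0, (0.02, 0.50): 5.4,
    (0.01, 0.05): 17.8, (0.01, 0.10): 14.9, (0.01, 0.20): 11.7, (0.01, 0.50): 6.6,
}


def f(alpha, beta):
    """Look up f(alpha, beta); only the tabulated values are supported."""
    return POCOCK_F[(round(alpha, 2), round(beta, 2))]


def n_dichotomous(p1, p2, alpha, beta):
    """Patients per group when comparing two proportions p1 and p2."""
    return (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2 * f(alpha, beta)


def n_continuous(mu1, mu2, sigma, alpha, beta):
    """Patients per group when comparing two means with common SD sigma."""
    return 2 * sigma ** 2 / (mu2 - mu1) ** 2 * f(alpha, beta)


def n_equivalence(p, d, alpha, beta):
    """Patients per group for an equivalence trial with margin d."""
    return 2 * p * (1 - p) / d ** 2 * f(alpha, beta)


# Smoking cessation example, two-sided test: 908.5 per group.
n_two_sided = n_dichotomous(0.025, 0.05, alpha=0.05, beta=0.20)
print(f"two-sided: {n_two_sided:.1f}")

# Inflate for an expected 70 % response rate (approx. 1300 per group).
print("adjusted for 70 % response:", math.ceil(n_two_sided / 0.70))

# One-sided test at alpha = 0.05: use 2 * alpha in the table, giving 713.
n_one_sided = n_dichotomous(0.025, 0.05, alpha=2 * 0.05, beta=0.20)
print(f"one-sided: {n_one_sided:.1f}")

# Vitamin D example: about 337 per group.
print(f"continuous: {n_continuous(9.0, 9.5, 1.8, alpha=0.05, beta=0.05):.1f}")

# Antidepressant equivalence example: about 332 per group.
print(f"equivalence: {n_equivalence(0.70, 0.10, alpha=0.05, beta=0.20):.1f}")
```

All results are per group; double them for the total number to randomize. Because the sketch only looks values up in the table, combinations of α and β that are not tabulated raise a KeyError rather than being interpolated.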
Michel de Montaigne (1533-92): "The fruit of a physician's experience is not the stories of his treatments and the memory of having cured four plague victims and three gout sufferers, unless he is also able to extract from that experience something that can develop his judgement, and can show us that he has thereby become wiser in his profession."