Design of experiments
Timo Nevalainen

Research?
• The layman's reality deals in personal experiences, believes in miracle cures, and sees no need for verification.
• The scientist's reality is cynicism about everything; it requires verification through scientific study design.

Hypothesis
• Science is mostly driven by developing and testing novel hypotheses.
• A hypothesis is an educated guess about what nature is going to do, or about why nature does what it does.
• The ultimate aim of the study is to accept or reject the hypothesis.

Requirements for a scientific hypothesis
• A scientific hypothesis must be testable.
  • It must generate predictions. To say that a hypothesis "generates predictions" means the same thing as saying the hypothesis "is testable".
• A scientific hypothesis must be falsifiable.
  • Example: "There are other inhabited planets." This can be tested by space probe, BUT if the probes find none, that does NOT prove the hypothesis incorrect (= not falsifiable).

Example: Xylitol and dogs
• Man
  • commonly used sweetener
  • positive effects on caries and on ear infections
  • excessive use may induce laxative effects
• Dogs
  • a 2-year toxicity study at 2 g/kg daily in diet resulted in minor liver changes
  • accidental consumption of xylitol: mortality, with seizures clinically

Formulating hypothesis
• Kuzuya et al. 1966: xylitol in dogs produces much stronger insulin release than glucose.
• Hypothesis: ingested xylitol causes insulin secretion, which results in hypoglycemia.
• BUT: was this tested in the 2-year toxicity study? Does hypoglycemia occur only in fasted dogs?

A good hypothesis
• Further reading: "Read more on writing hypotheses"

Manipulation hypothesis
• Useful when the effect of a procedure on animals is being studied.
• Statement form: the "IF" will "HAPPEN" when you "ALTER" something.
• Example: the animal will grow faster if diet energy content is increased.

Choice hypothesis
• Useful when investigating the preferences of animals.
• Statement form: given a choice, "THE" will "PREFER" rather than "OTHER PREFERENCE".
• Example: mice prefer certain nest material over other materials.

Observational hypothesis
• Useful when, e.g., studying wild animals in nature, where one cannot change the environment; can also be a comparative statement.
• Statement form: "ORGANISM X" is "STATEMENT ABOUT DISTRIBUTION, DENSITY, SIZE ETC".
• Example: reindeer thrive better in a subarctic climate than in hot and humid areas.
• Further reading: "What makes a good hypothesis"

Animal experiments
• In animal experiments we believe we can standardize most factors causing variation:
  • environment
  • etiological agents of diseases
  • genetics
  • diet
• Hence it is possible to operate with relatively small numbers of animals,
• vs. epidemiological and clinical studies.

What is Experimental Design All About?
• Our way to ask nature and test the hypothesis.
• Validity reflects the weakest link of the study.
• Experimental design aims at securing that the best possible know-how is in use.
• A poorly planned or executed study is always unnecessary and unethical.

What is a scientific experiment?
• Planned interference in the natural order of events, more than carefully observing what is occurring.
• Its importance stems from the quest for inference about causes or relationships, as opposed to simple description.
• Inferences -> what produced, contributed to, or caused events, without ambiguity.
  • Some form of experimental design is ordinarily required.
• The purpose of the design is to rule out alternative causes, leaving only the actual factor: the real cause.

Nonstatistical aspects
• Bias
• Choice of animals and precision
• Design of environment
• Applicability
• Practical randomization
• Case for Refinement and Reduction

Are animal experiments well designed?

Experiments well designed?
"...quality of experiments is poor, and basic principles of experimental design are ignored... advances in experimental design during the last 30 years have not had any effect on studies."
Mead (1990) The Design of Experiments, Cambridge University Press

Australian Vet J 72:322-328, 1995

Which were the problems?
• Deficiencies in design
  • failure to randomize
  • too heterogeneous material
  • inappropriate number
  • bias
• Deficiencies in statistical analysis
  • sub-optimal methods
  • errors in calculation

Kilkenny et al. 2010
Arrive_guidelines.pdf

Standardization fallacy?
• RA Fisher 1934: a highly standardized experiment supplies direct information only in respect of the narrow range of conditions... deliberately varying in each case some of the conditions of the experiment, achieve a wider inductive basis for our conclusions, without in any degree impairing their precision.

Bias?
• Systematic difference between the real and estimated effects.
• I.e. treated and control groups must have the same environment.
• Remove systematic differences.

No systematic differences
• Failure -> false positives or false negatives
• Common errors
  • groups in different environments
  • sampling different groups at different times by different people
  • favoring one group over other(s)
  • improper randomization
  • failing to block by groups

Unbiased, how to?
• Independent replication of observations
  • conditions may not stay the same
• Randomization
• Blinding
  • code experimental units
  • crucial when observations are subjective

Description of the study
• Complicating factors of a study are the ones that should be described; see Ellery et al. 1985 (gold standard).
• Publication checklist...
• ILAR task group... working
• Hooijmans et al. 2010

Choice of animals / Ellery
• Source: species (Latin name if not a common laboratory species), source, age and/or body weight, sex.
• Transportation: length of acclimatization period.
• Genotype: the breed, strain, or stock name. Inbred strains, mutants, transgenes, and clones should be named using internationally accepted nomenclature (see http://www.informatics.jax.org/mgihome/nomen/strains.shtml for mouse and rat nomenclature). Any genetic quality assurance verifying the genotype should be mentioned.
• Microbiological status: conventional, specified pathogen-free (SPF), germfree/gnotobiotic. When possible, reference should be made to agreed-upon standards such as the FELASA standards (www.felasa.org).

Specification of Environment
• Housing: type of housing, including whether conventional, barrier, isolator, or individually ventilated cages. Room temperature (with diurnal variation), humidity, ventilation, light/dark periods, light intensity. Cage type, model, material, type of floor (solid/mesh), bedding (type of material), frequency of cage cleaning, number of animals per cage, cage enrichments.
• Diet: type, composition, manufacturer, feeding regimen (ad libitum, restricted, pair fed), method of sterilization.
• Water: ad libitum, bottles or automatic, quality, sterilization.

Statistical and practical significance
• These are two different things.
• With a large number of animals it is sometimes possible to reach statistical significance even though differences are very small.
• With a low number of animals it may be impossible to show statistical significance of, e.g., a clinically valuable difference.
• It is unethical to use too few or too many animals.

A good design should
• have clear aims
• have good knowledge of the literature
• use an applicable model
• have experience
• use correct statistics

Outbred stocks vs. isogenic strains

Criteria of good design
• Unbiased
  • independent repetition
• Precise
  • even material, blocks, size
• Applicability
  • factorial design, blocks
• Simple
• Estimation of error

Precision
• Aim at a high SIGNAL / noise ratio.
• The ratio goes up by
  • increasing signal (larger dose, more sensitive model)
  • decreasing noise (less variation, larger experiment)

Why to look for variation?
• Decisive for the window of appropriate n.
• Unethical to operate outside the window.
• Comparisons of variation are seldom done.
  • They require a larger study than mean comparisons.

Change in SD -> n change
• SD + 40 % -> 1.4² -> 1.96 -> 96 % more animals
• SD - 40 % -> 0.6² -> 0.36 -> 64 % fewer animals

Examples of sources of noise
• Biological variation
  • species, sex, age, biorhythms, animal care, stress
• Preanalytical variation
  • sampling, sample treatment, sample storage
• Analytical variation
  • usually half of biological

Variation in pharmacologic response
• Factors: species, strain, genotype, supplier, batch, litter size, age, body weight, sex, physiological and pathological status.

Variation and n
(n needed to detect a 10 % difference at p = 0.05)

Parameter   Group            SD     n
Genetics    F1 hybrid        13.5   24
            F2 hybrid        18.4   42
Diseases    No Mycoplasma    18.6   42
            Mycoplasma       43.3   200
Breeding    Wild             20.3   48
            Purpose          20.4   48

Which variation?
• Scientists: own parameters
  • no way to screen all parameters
  • effects are most likely facility-, strain- and enrichment-specific
• Laboratory animal scientists: welfare indicators
  • e.g. fecal and/or urine corticosterone
  • telemetric methods
  • with the assumption that a uniform welfare status results in the smallest possible variation in other determinations

Experimental unit (EU)
• The unit of replication which can be assigned at random to a treatment.
• Diet study with animals in cages: cage = EU.
• If lymphocytes are taken from animals and the cells treated in a Petri dish: dish = EU.
• Topical test compound applied to various spots over the body: spot = EU.
• Most often the EU equals an animal.

Randomization
• A critical part of an experiment.
• Tossing coins is the simplest method.
• Pay attention to weight and age differences.
  • If you cannot, use them as covariates.
• Pay attention to families,
  • and make them main effects.

Other items to randomize
• order of sampling or recording
• environment
  • place in the rack
  • temperature, light, ventilation gradient across the room?

Cage rack

Illumination

The Helsinki-Rat-ification, 2010
• Most toxicity testing is done using stocks.
• Should a resistant stock be used, it could lead to a false negative result.
• These problems can be overcome by using a small number of animals of several inbred strains in a factorial design.
• This is more powerful and better able to detect toxicity, with the added advantage that it would highlight genetic variation in response.
• See www.isogenic.info/
• The Helsinki-Rat-ification, 2010: next slide

The Helsinki-Rat-ification, 2010 (declaration)
A basic rule of good experimental design is that all variables should be controlled except that due to the treatment. The use of genetically undefined rats and mice violates this rule. The result is excessive numbers of false negative results. The use of such animals is therefore unethical and uneconomical. We hereby pledge that we will no longer breed, maintain or allow genetically undefined mice or rats to be used in our animal houses, and when serving on an ethical committee, we will not agree to their use in research.
Signed ..........................

Applicability: representative genotypes
• How valid will your experiment be under other circumstances?
• E.g. do your data apply to different strains, the other sex, etc.?

Types of experiments
• Parallel groups
  • one procedure to each animal
• Cross over
  • each animal is exposed to all procedures, with a wash-out period in between
• Factorial
  • possible to study more than one procedure simultaneously

Types of exps and precision
• If, as is most common, variation within an animal < variation between animals,
  • cross over is more precise.
• If the opposite is true,
  • parallel groups are more precise.
• Factorial design allows assessment of both variations at the same time.
  • It can be both parallel and cross over.

Blocks increase precision
• Control some extraneous sources of variation.
• May be useful if
  • material is heterogeneous -> several mini experiments
  • material has some natural structure
  • there are bottlenecks
• Example: 12 dogs of 4 litters -> 3 groups; randomize by blocks.

Traditional design
• Imaginary study
  • Purpose: effect of drug on enzyme
• Design
  • Controls: 8 males
  • Drug: 8 males
• Stat: t-test / 14 degrees of freedom
• But what about females?

Factorial Design
• Design B
  • Controls: 4 males, 4 females
  • Drug: 4 males, 4 females
• Stat: ANOVA

Source       DF   MS
Drug          1   xxx
Sex           1   xxx
Drug x Sex    1   xxx
Error        12   xxx

• Same size, more info.

Advantages of factorial design
• More than one hypothesis at a time
• Better mathematical model
• More precision
• Reveals interactions

Factorial & ANOVA
• Main effects: strain, sex, procedure, litter...
• Covariates: age, weight, parameter before procedure...

Main effect: Stock or strain
• An outbred stock is believed to represent a species better than several defined strains (inbred and F1 hybrids).
• Some outbred stocks have rather narrow genetic bases.
• If strain (or stock) is used as a main effect and several strains or stocks are used, the design will show whether it is significant.

Principles in action?
• Not really, e.g.
  • Safety evaluation is usually done with one (outbred) stock.
  • Carcinogenicity studies are often done with a single inbred strain (Fischer344) or a single outbred stock.

Defined or outbred? Biological standardization
• A large response is important.
  • Select a strain or stock with a large response.
• If small variation is needed at the same time, take a defined strain.

Defined or outbred? Application to same species
• Which is more representative:
  • one outbred stock, or
  • one inbred or F1 strain?
• It is better to take several defined strains.

Defined or outbred? Extrapolation to another species
• In addition to the previous, take representative species and, within species, representative genotypes,
• if there is a genetic component.

Main effect: Litter
• Why do we use litter blocking
  • in large animals,
  • while we could not care less in rodents?
• Is it really a species question?
• Or a question of the number of animals in a group?

Litter
• ID necessary at weaning.
• Its use means more precision.
• Can be done in addition to even distribution to all groups.

Appropriate number of animals
• Experience helps?
• Mathematically correct:
  • effect size
  • SD
  • false positive (significance)
  • false negative (power)

Error types
• False positive (type I error, α-error)
  • statistically significant change, no practical significance
• False negative (type II error, β-error)
  • most common; causes:
    • effect = small
    • variation = large
    • # animals = small
• Statistical power = 1 - β

Size of the study / I
• A good design has high statistical power!
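The ingredients named above (significance level, power, effect size and SD) can be tied together numerically. The following is only a sketch under stated assumptions: it uses the common normal-approximation formula for comparing two group means, n per group ≈ 2 (z(1-α/2) + z(power))² (SD/δ)², which does not appear in the slides, and the function names are hypothetical. A t-based calculation or dedicated software would give slightly larger numbers for small groups.

```python
from math import ceil
from statistics import NormalDist


def raw_n_per_group(sd, delta, alpha=0.05, power=0.90):
    """Unrounded animals per group for a two-sided comparison of two means.

    Normal approximation (an assumption of this sketch, not from the slides):
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # guards against false positives
    z_beta = z.inv_cdf(power)             # guards against false negatives
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2


def n_per_group(sd, delta, alpha=0.05, power=0.90):
    """Round the raw estimate up to whole animals."""
    return ceil(raw_n_per_group(sd, delta, alpha, power))


# Effect size equal to one SD, p = 0.05, power = 0.90
baseline = n_per_group(sd=10, delta=10)

# n scales with SD squared: +40 % SD -> x1.96, -40 % SD -> x0.36
ratio_up = raw_n_per_group(sd=14, delta=10) / raw_n_per_group(sd=10, delta=10)
ratio_down = raw_n_per_group(sd=6, delta=10) / raw_n_per_group(sd=10, delta=10)

print(baseline, round(ratio_up, 2), round(ratio_down, 2))
```

Because n depends on the square of SD/δ, the two ratios reproduce the "Change in SD -> n change" figures from earlier in the deck.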
Resource equation method
• Mead (1988): quantitative endpoint
• The error estimate requires 10-20 degrees of freedom.
• DF (error) = (N - 1) - (T - 1) - (B - 1), where
  • N = # observations
  • T = # groups
  • B = # blocks and/or covariates

Example / I
• 30 mice, 3 groups and 2 blocks
• DF (error) = (30 - 1) - (3 - 1) - (2 - 1) = 26
• Conclusion: unnecessarily large

Appropriate number of animals / II
• Set p: usually 0.05
• Set statistical power: usually 0.90
• Decide effect size
• Estimate variation
• Size of experiment = n

Freeware for study size determination
• www.uib.no/isf/people/doc/ssd.htm
• Calculate study size at the beginning and statistical power at the end.

More on the topic: Reading
• http://www.adelaide.edu.au/ANZCCART/
  • Variables in animal based research: part 1. Phenotypic variability in experimental animals
  • Variables in animal based research: part 2. Variability associated with experimental conditions and techniques
  • The importance of non-statistical design in refining animal experiments
  • Doing better animal experiments; together with notes on genetic nomenclature of laboratory animals
• Festing MFW, Altman DG. 2002. Guidelines for the design and statistical analysis of experiments using laboratory animals. ILAR J 43:244-258
• http://www.lal.org.uk/handbooks.html
• CD on Experimental Design: www.sheffbp.co.uk/
• http://dels.nas.edu/ilar/

Authors
• Order of the authors
  • First is usually the one who did most of the work.
  • Last is usually the group leader.
  • Authors in between carry less weight.
• Criteria (www.icmje.org)
  1. Substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data
  2. Drafting the article or revising it critically for important intellectual content
  3. Final approval of the version to be published
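Returning to the resource equation method described earlier in this part of the deck, the arithmetic is simple enough to script. This is a minimal sketch: the function names and the wording of the verdicts are hypothetical, and treating "no blocking" as B = 1 (so that the block term vanishes) is an assumption of this example.

```python
def error_df(n_observations, n_groups, n_blocks=1):
    """Error degrees of freedom: (N - 1) - (T - 1) - (B - 1).

    n_blocks=1 is used here to mean "no blocking" (assumption),
    which makes the block term zero.
    """
    return (n_observations - 1) - (n_groups - 1) - (n_blocks - 1)


def assess(df, low=10, high=20):
    """Compare error DF with the 10-20 range quoted from Mead (1988)."""
    if df < low:
        return "too small"
    if df > high:
        return "unnecessarily large"
    return "adequate"


# Slide example: 30 mice, 3 groups, 2 blocks
df_mice = error_df(30, 3, 2)

# Factorial example from the deck: 16 animals in 4 drug-by-sex
# combinations, which matches the 12 error DF in the ANOVA table
df_factorial = error_df(16, 4)

print(df_mice, assess(df_mice))
print(df_factorial, assess(df_factorial))
```

The same function also reproduces the 14 degrees of freedom of the traditional two-group design with 16 animals, which is one quick way to check a planned layout against the 10-20 guideline.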
Objectives / Hypothesis
• Clearly describe the primary and any secondary objectives of the study, or the specific hypotheses being tested.
• Statistics tests the null hypothesis.
• The ultimate aim of the study is to accept or reject the hypothesis.

Ethics approval
• Indicate the nature of the ethical review permissions, relevant licenses in your country, and international and/or institutional guidelines for the care and use of animals.

Design features
• For each experiment, give brief details on
  • the number of experimental and control groups
  • steps taken to avoid subjective bias
    • when allocating animals to treatment (randomization)
    • and when reading results (blinding)
  • the experimental unit
• A flow chart to illustrate how complex study designs were carried out.

Procedure description
• Give details on
  • How
    • drug formulation and dose, site, route of administration
    • anaesthesia and analgesia
    • surgical procedure
    • method of euthanasia
    • details of any specialist equipment (supplier(s))
  • When (e.g., time of day)
  • Where (e.g., home cage, laboratory, water maze)
  • Why

Animal description
• species, strain, stock, sex, developmental stage (age), and weight (e.g., mean or median weight plus weight range)
• source of animals, nomenclature, genetic modification status, genotype
• health/immune status
• drug- or test-naïve
• previous procedures, etc.

Housing description
• Quarantine / acclimatization: length (days), procedures during
• Type of facility, e.g., specific pathogen free
• Type of cage or housing, e.g. open vs. IVC
• Bedding material; number of cage companions
• Cage shape and material
• Husbandry conditions (e.g., light/dark cycle, temperature, etc.)
• Type of food, access to food
• Welfare-related assessments and interventions that were carried out

Space characteristics
• type: outside yard / pen / open cages / filter top cages / IVC / isolator / other
• inside dimensions (L x W x H, cm/m)
• type & material of flooring (solid / perforated / grid / wire / other & epoxy mass / tiles / PVC / concrete / other)
  • Example of interference: grid-floor cages represent a form of mild stress associated with increased corticosterone levels, raised blood pressure, and foot lesions in long-term housing in rats.

Space characteristics (continued)
• wall material(s) (polysulfone / polyetherimide / polycarbonate / polyvinyl / steel / wood / other)
• emissions
  • Example of interference: old polycarbonate cages leach Bisphenol A, a compound with estrogenic activity.
• hopper/top (material, structure)
• washing / sterilization frequency
  • Example of interference: common cage cleaning practices may promote aggression in male groups.

Sample size
• Give the total number of animals used in each experiment and the number of animals in each experimental group.
• Explain how the number of animals was decided. Provide details of any sample size calculation used.
• Indicate the number of independent replications of each experiment, if relevant.

Environmental complexity
• items / item combinations (material, structure, dimensions)
• emissions
  • Example of interference: pieces of polycarbonate cages and water bottles leach Bisphenol A, a compound which disrupts fetal development in mice.
• renewal / washing / sterilization frequency
• carry-over items

Group allocation
• Give full details of how animals were allocated to experimental groups, including randomisation or matching, if done.
• Describe the order in which the animals in the different experimental groups were treated and assessed.

Outcomes
• Clearly define the primary and secondary experimental outcomes assessed, e.g.
  • cell death
  • molecular markers
  • behavioural changes

Bedding & nesting material
• material(s)
• emissions
  • Examples of interference: softwood bedding induces liver metabolism due to inherent pinenes still present in commonly used beddings. Paper-based materials may contain toxic substances.
• autoclaving / other treatment(s) in facility
  • Example of interference: a dirty environment impairs liver metabolism.
• volume / weight provided
• change interval

Feeding characteristics
• type and batch of diet (manufacturer / code / batch no.)
  • Example of interference: the type of dietary fat has an impact on rodent physiology and behavior.
  • Example of interference: phytoestrogens in diet can interfere with many endpoints, making them an important factor in studies.
• batch-specific analysis results, if any
• treatments within the facility
  • Example of interference: the type of sterilization has an effect on mouse breeding.

Feeding Characteristics 2
• method of feeding
  • ad libitum
    • Example of interference: ad libitum fed animals develop insulin resistance, diabetes, high blood pressure, impaired brain function, increased oxidative stress and inflammation, and are more susceptible to cancer, neurodegenerative disease and kidney disease. Unsurprisingly, they die prematurely.
  • restricted feeding (details of)
    • Example of interference: most methods of food restriction are not compatible with legally mandated group housing and derail the diurnal rhythmicity of physiological parameters.

Watering
• method of watering (nipples / bottle / bowl)
• material(s) of water provider system
• emissions
  • Example of interference: old polycarbonate water bottles leach Bisphenol A, a compound with proven estrogenic effect.
• water treatment
  • Example of interference: water acidification decreases weight gain and water consumption, and all additives in drinking water should be considered a potential source of variation in immune response.

Physical environment
• temperature range (°C)
  • Example of interference: small changes in ambient temperature alter cardiovascular parameters in both rats and mice.
• relative humidity (RH) range (%)
  • Example of interference: low ambient temperature with extremely low humidity delays puberty in mice.
• ventilation rate (air changes/h)
• ammonia & CO2
  • Example of interference: CO2 levels above 3 % have a direct effect on cardiovascular parameters and preferences in rats.
• air speed to which animals are exposed (m/s)

Physical environment 2
• light intensity (lux, day/night)
  • Example of interference: albino rats develop retinal degeneration within 3 months when exposed to illumination of 60 lux.
• light-dark rhythm
  • Example of interference: disturbed lighting in groups of male mice caused higher levels of corticosterone and shorter agonistic latency within the group.
• color spectrum of lights
• light source, e.g. fluorescence tubes (flickering)

Physical environment 3
• acoustic environment
  • Example of interference: noise can change hormonal, reproductive and cardiovascular parameters and disturb the sleep/wake cycle.
  • 24/7 dB and audiogram / ultrasounds
  • housing and care related sounds
• olfactory environment
  • emissions from the building
  • odours from other species nearby, e.g. pheromones

Social environment
• group size / cage density
• regrouping
• compatibility

Statistics
• Provide details of the statistical methods used for each analysis.
• Specify the unit of analysis for each dataset (e.g. single animal, group of animals, single neuron).
• Describe any methods used to assess whether the data met the assumptions of the statistical approach.

Baseline data
• For each experimental group, report relevant characteristics and health status of animals (e.g. weight, microbiological status, drug- or test-naïve) before treatment or testing (these can often be tabulated).
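The group allocation, blinding and experimental-unit items above are easier to report when the allocation itself is scripted and seeded. As a minimal sketch only (the function names, treatment labels, animal IDs and seeds are all hypothetical), randomisation within litter blocks for the earlier example of 12 dogs from 4 litters allocated to 3 groups, together with coded labels for blinded scoring, might look like this:

```python
import random


def randomise_in_blocks(blocks, treatments, seed=None):
    """Assign each treatment once per block, in random order.

    blocks: dict mapping block id (e.g. litter) -> list of animal ids.
    Each block must hold exactly one animal per treatment.
    """
    rng = random.Random(seed)  # seeded so the allocation can be reported
    allocation = {}
    for block_id, animals in blocks.items():
        if len(animals) != len(treatments):
            raise ValueError(f"block {block_id!r} needs {len(treatments)} animals")
        order = list(treatments)
        rng.shuffle(order)     # independent shuffle inside every block
        for animal, treatment in zip(animals, order):
            allocation[animal] = treatment
    return allocation


def blinding_codes(animals, seed=None):
    """Give every experimental unit a code that hides its group."""
    rng = random.Random(seed)
    numbers = list(range(1, len(animals) + 1))
    rng.shuffle(numbers)
    return {animal: f"EU{number:02d}" for animal, number in zip(animals, numbers)}


# 4 litters of 3 dogs each; IDs and treatment names are illustrative
litters = {f"litter{i}": [f"dog{i}{p}" for p in "abc"] for i in range(1, 5)}
treatments = ["control", "low dose", "high dose"]

allocation = randomise_in_blocks(litters, treatments, seed=1)
codes = blinding_codes(sorted(allocation), seed=2)

for animal in sorted(allocation):
    print(codes[animal], animal, allocation[animal])
```

In practice the person scoring outcomes would see only the codes, and the key linking codes to groups would be kept by someone else until the analysis.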
Numbers analyzed
• Give the number of animals in each group included in each analysis. Report absolute numbers (e.g. 10/20, not 50 %).
• If any animals or data were not included in the analysis, explain why.

Outcomes and estimates
• Report the results for each analysis carried out, with a measure of precision (e.g., standard error or confidence interval).
• For meta-analysis, numbers are needed; they are difficult to obtain and imprecise if drawn from figures.

The Three Rs alternatives
• Application of Replacement alternatives
  • basis of the method used: reference (science-based / official validation / best practice)
  • details of methods used
  • details of how used
  • verification of validity
• Application of Refinement alternatives
  • details of methods used on top of the application of HEP(s)
    • housing refinements
    • procedural refinements
  • specific assessment criteria used

The Three Rs alternatives (continued)
• Application of Reduction alternatives
  • methods related to design
  • methods based on standardization
  • methods efficient via increased longevity
  • other
  • quantification: estimate of outcome attributable to Reduction method(s) in this study
  • verification of method to assure efficacy in this study
• Lessons learned from the Three R applications in the study
  • Refinement & Reduction interplay? (effect on research quality)

Untoward effects
• Poor condition? Dead?
• Give details of all important adverse events in each experimental group.
• Describe any modifications to the experimental protocols made to reduce adverse events.
• Or why did you not change the protocol?

Humane endpoint(s) (HEP)
• alleviation of pain, distress and suffering
• estimate of ill-health impact of the study on animals
• basis of HEP method: reference (science-based / official validation / best practice)
• criteria used to establish (cardinal signs / scoring system)
• method used to show efficacy of HEP(s) used in this study
• lessons learned on HEP(s)

Interpretation
• Interpret the results in relation to current understanding and other relevant studies.
• Discuss the study limitations:
  • potential sources of bias
  • any limitations of the animal model
  • imprecision associated with the results
• Discuss implications of your experimental methods or findings for the replacement, refinement, or reduction (the 3Rs).

Applicability
• How the findings of this study are likely to translate to other species or systems, including any relevance to human biology.
• Probably the most important statement of the article.

Refereeing exercise
• Work in groups on given articles.
• Pretend that they have not been published.
• Use all the information for scrutiny.
• Be critical.
• Final decision: Accepted / Accepted with major or minor revision / Rejected
• We will pause for group work.
• After the pause, groups present their findings.

Check-list for refereeing
• Is the paper important?
• Is the work original?
• Does the work add enough to what is already in the literature?
• Is there a clear message?
• Does the paper read well and make sense?
• Is this journal the right place for this paper?

Scientific reliability
• Abstract/summary: does it reflect accurately what the paper says?
• Research question: is it clearly defined and appropriately answered?
• Overall design of study: is it adequate?
• What they studied: are they adequately described and their conditions defined?

Scientific reliability / 2
• Methods: are they adequately described?
• Results: do they answer the research question? Credible? Well presented?
• Usefulness of tables and figures? Is the quality good enough? Can some be eliminated? Are the data in the tables correct?

Scientific reliability / 3
• Interpretation and conclusions: are they warranted by and sufficiently derived from/focused on the data? Message clear?
• References: are they up to date and relevant? Any glaring omissions?

General attitude
• Be kind. Do not be tempted by reviewer anonymity to make unkind remarks.
• Be fair. Do not hesitate to identify flaws in the manuscript, but balance criticism with potential strengths, technical limitations and the nature of the journal. If you give criticism, also give a motivation, including literature references if applicable.

Attitude
• Be 'action-able'. Providing practical suggestions for textual changes or additional experiments helps convey what you think would improve the manuscript better than simple criticism.