Hypothesis - DistaGenomics

Doing good quality research
1 /54 1. The Research
Research is looking for the truth!
The truth may not always be what you think it is!
Here is a fact:
We have proved this wheat variety is drought resistant
How?
By growing it in several trials under drought conditions
How do you know it is more drought resistant?
It gave higher yields than other wheat varieties
How do you know it is because of better
drought resistance and not something else?
This is a region with lower than average rainfall
Every year?
No, two years out of three
What about the years of your trials?
Yes, it was dry two years out of three
But was drought the major factor affecting
yields?
We assumed it was as it’s a dry area
Could it have been because of greater disease
resistance, competition with weeds, better
uptake of fertiliser, or something else?
Possibly
Did you check this?
No, and anyway, weeds are always a problem
So how do you know the better yield of the
variety is not because of something else?
We assumed it was drought resistance as it’s a dry area
Did you measure soil water content to test if it
was low enough to stress the crop?
Yes, we always do
In the experimental field?
No, in a field 100 m away
Is the soil the same in that field?
Probably, though we haven’t checked in detail
Hmm. OK. What about yields in the wet years?
The variety also gave high yields
So is it drought resistant or just high yielding?
Well, yes, it seems to be high yielding every year
So have you proved it is drought resistant?
Well, probably but we’re not absolutely certain
I see!
So is it a fact?
A lot of research appears to be based on
proving facts by doing experiments. A fact is
regarded as the result of one or more
experiments.
Frequently, research programmes seem to be
built around the availability of facilities/
equipment/chemicals without having a logical
research strategy to follow.
Good quality research is largely based on
hypothesis testing. You set up a hypothesis
and then design experiments to test it.
A dictionary definition
Hypothesis [n]
1. A concept that is not yet verified but
that if true would explain certain facts
or phenomena;
2. A proposal intended to explain
certain facts or observations
Here is an example of a hypothesis based on a
proposal for funding by a UK funding agency:
Hypotheses
a) ....
b) …
c) ....
d) Decreased water availability initiates ABA synthesis and
hence modifies gene expression via an ABA-mediated
signalling pathway.
Individual (ABA, ions and pH) and common (network)
signal transduction components are involved.
Let’s look at this last hypothesis in more detail ...
The hypothesis is that ..
d) Decreased water availability initiates ABA synthesis
and hence modifies gene expression via an ABA-mediated
signalling pathway.
So, based on previous research of the proposer, or
evidence in the literature, or just an intelligent guess, a
hypothesis has been put forward for testing as part of
the research project.
The hypothesis tells you what you are going to need to
look at:
- varying the availability of water
- measuring ABA concentrations
- aspects of gene expression
The project will look at varying the availability of water as a
treatment,
[suitable amounts of water will need to be thought about]
then looking at the consequences of this in terms of ABA
production.
[decisions will need to be made on where to measure ABA and
when]
Finally, some aspects of gene expression will need to be
studied.
[decisions will need to be made on which genes and when to
measure expression]
However, this last point raises a critical additional question:
How will you show that the gene expression is due to
a change in ABA synthesis and not a direct effect of
water stress itself?
This is a key factor needed for good quality
research:
• Designing the experiments to ensure that
there can be no other explanation that
would invalidate the test of your hypothesis.
• Only then will you know whether
your hypothesis is right or wrong!
This leads to ...
The Research Cycle
1. Hypothesis: Formulate a hypothesis
2. Design: Work out how to test it
3. Carry out: Do the experiment and collect the data
4. Analyse: Process the results
5. Deduce: Interpret the results and make conclusions
6. Hypothesis: Formulate a new hypothesis (and go round the cycle again)
For a series of experiments testing a
sequence of hypotheses it is often helpful to
prepare a GANTT chart
A GANTT chart shows how the various components
of a project are related in time, so that the overall
objectives are achieved at the end of the project.
[Gantt was an American industrial engineer and efficiency
expert about a century ago.]
Here is an example ...
Here’s a GANTT chart for an EU project.
[Figure: GANTT chart: WATERWEB timetable of activities. Deliverables for work packages WP1, WP2a, WP2b, WP2c, WP3a, WP3b, WP3c, WP4 and WP5 plotted against project month (0 to 36).]
Let’s see how we can expand
part of it to develop a research cycle:
The experiments:
Tomato plants being grown in a glasshouse from April to August with
different water regimes to try to increase water-use efficiency
(especially PRD - partial alternate root drying).
The hypothesis to be tested is that PRD will increase water-use efficiency.
[Figure: timeline of the experiment from March to September, with more details given for each stage]
- Design: Plan when to sow, how to sow, how many plants.
- Start experiment: Sow all seeds. Check seedling establishment. Thin plants if needed.
- Preparation for sampling: Equipment OK, plants OK, everything prepared, tubes labelled.
- Sampling: Plan each day. Think ahead. Rescheduling if problems?
- Harvest: What data will be needed?
- Data analysis: Check for errors.
A good research programme has to take
account of four key components:
- the scale of the programme
- the cost of the programme
- the time available for the programme
- the quality of the results
Each of these factors depends on the
others, so they can be considered as a
research pyramid …..
The Research Pyramid
[Figure: pyramid linking Scale, Quality, Cost and Time]
You need to adjust Scale, Cost and Time to maximise Quality.
Note that the line joining Quality to Cost is dashed.
In fact Quality rarely depends on Cost!
Now it’s your turn to work out a research cycle:
Here’s an exercise for you in experimental
design (working in groups of two) ...
On a visit to your local supermarket you see an
advertisement for large decorative sugar crystals to
serve with your filter coffee.
This looks attractive and will impress your dinner
guests, so you decide to buy some!
But, will the large crystals take longer to dissolve
than other forms of sugar?
If so, this could be a disadvantage.
You have some other types of sugar at home, so
you decide to design some experiments to test this.
As well as the large decorative sugar crystals,
in the cupboard you have also found:
• icing (powdered) sugar
• granulated (crystal) sugar
• raw cane (brown) sugar
• sugar cubes
In your kitchen, as well as water, you have a
balance (kitchen scales), a spoon, a glass
beaker, a measuring jug, and you have an
egg timer.
Prepare two hypotheses to test using the five
sugar samples.
What are the variables in the experiments going
to be?
Do you have enough equipment to test your
hypotheses?
If not, what else will you need (from your house)?
Do you foresee any problems?
How can you reduce the errors in the
measurements?
Now design some experiments to test
your two hypotheses ...
Here are my ideas:
Points to consider:
Rate of dissolving sugar might depend on
• type of sugar,
• temperature of the water,
• ratio of sugar weight:water volume,
• speed of stirring the water,
• shape of the beaker,
• size of the spoon
The most difficult parts of the experiments
are likely to be
• maintaining a constant stirring rate,
• seeing when the sugar is completely
dissolved
So, taking into account these factors,
it is clear that several types of
comparison could be made and,
whatever the design of the
comparisons, because of the lack of a
proper stirrer the level of replication
may need to be high (up to 10?).
My two hypotheses are:
1. The different types of sugar will
differ in their rates of dissolving, with
the icing sugar being quickest and
the large crystal sugar the slowest.
2. The rate of dissolving will depend
on the temperature of the water, with
the sugar dissolving quicker in hot
than in cold water.
Extra equipment needed:
1. Some sort of weighing ‘boat’ to
transfer the sugar from the kitchen
scales to the beaker.
2. Maybe a jam thermometer to
record the temperature of the water.
3. A musical metronome would be
useful to keep stirring rates constant.
4. Some sort of watch or clock.
[The egg-timer is not likely to be accurate
enough]
Experimental design:
Five types of sugar to be tested, using a constant weight
of each.
[As the sugar cubes cannot be subdivided, the weight of
five cubes will be the standard weight.]
Two volumes of water to be tested, using the measuring
jug to measure out 80% and 40% of the volume of the
glass beaker.
[Space needs to be left in the beaker to avoid the water
being spilt during the experiments.]
Two water temperatures will be compared: water
equilibrated to the temperature of the cold tap, and the
hot tap.
[Make sure the hot water boiler is full and fully heated at
the start.]
Experimental design (cont):
The sugar will be weighed out on the kitchen scales, using a piece of
paper as a weighing ‘boat’ and added to the glass. The relevant
volume of water will be measured out and added when the watch
second hand reaches 00.00.
Stirring will start immediately with the spoon, noting the time to
complete disappearance of traces of sugar.
Each type of sugar will be tested five times (as a first guess).
[To reduce errors amongst replicates, the water will be stirred at
approximately one cycle per second.]
Problems:
1. Not being able to see when the icing sugar is dissolved.
2. Large differences between replicates because of poor stirring.
3. Spilling the water if difficult to stir with the spoon.
4. Variation in hot water temperature from one replicate to the next.
Total measurements to be collected:
5 x sugar types
2 x water temperature
2 x water volume
5 replicates of each sugar x temperature x
volume combination
= 100 measurements in total.
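The count above can be checked by listing every combination. This is only a minimal sketch: the variable names are my own, and the sugar, temperature and volume labels simply follow the design described earlier.

```python
from itertools import product

sugar_types = ["large crystals", "cubes", "raw cane", "granulated", "icing"]
temperatures = ["cold", "hot"]
volumes = ["40% full", "80% full"]
replicates = range(1, 6)  # five replicates of each combination

# Each sugar x temperature x volume x replicate combination is one measurement.
runs = list(product(sugar_types, temperatures, volumes, replicates))

print(len(runs))  # 5 * 2 * 2 * 5 = 100
```

Listing the runs like this also gives you a ready-made record sheet with one row per measurement.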
Here is my GANTT chart showing the timing of activities:
Activities, in order of time:
1. Cold water, 40% full: Prepare weights, then Measure rates (replicates 1 2 3 4 5)
2. Cold water, 80% full: Prepare weights, then Measure rates (replicates 1 2 3 4 5)
3. Hot water, 40% full: Prepare weights, then Measure rates (replicates 1 2 3 4 5)
4. Hot water, 80% full: Prepare weights, then Measure rates (replicates 1 2 3 4 5)
5. Calculate results
All weights for each sugar type for 5 replicates weighed out at the
beginning of each particular treatment (storing samples in paper ‘boats’ until needed).
Time for a refreshment break!
So, you have now collected all your data.
How are you going to present the results?
When showing results graphically, you should always put the
most important comparisons nearest to each other.
Therefore, in the case of our sugar experiment:
Key: 1 = large crystals, 2 = cubes, 3 = raw sugar, 4 = ordinary, 5 = icing,
H = hot water, C = cold water
Not this ...
[Figure: Time to dissolve for each sugar (1 to 5), with the hot (H) and cold (C) bars side by side for each sugar]
... but this
[Figure: Time to dissolve for sugars 1 to 5 side by side in hot water, then sugars 1 to 5 side by side in cold water]
- I have assumed that large crystals are slowest to dissolve.
Conclusion from the sugar experiment:
It looks as if the large decorative crystals took longer to
dissolve in water than the other forms of sugar.
So, slower to dissolve in your coffee!
Or are they? ...
What was missing from the graphs?
Of course, you all noticed that none of the figures on the
previous page had any error bars, didn’t you?!
Until you know whether the differences are significant or not,
you can’t test your hypothesis.
So, how are you going to do this?
The ‘null’ hypothesis
This is a statistical term allowing you to test
whether two or more sets of data are significantly
different. [‘null’ means having nothing/being
empty/non-existent, so in the statistical sense it
means ‘no difference’.]
Therefore, in the case of our sugar samples, the
‘null’ hypothesis would be that there is no significant
difference amongst the rates of dissolving each
type of sugar in water.
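One possible way to test a ‘null’ hypothesis for two of the sugars is an exact permutation test, sketched below. This is my own choice of test for illustration, not one prescribed in these slides, and the dissolving times are invented purely to show the mechanics. If the null hypothesis is true, the labels ‘large crystals’ and ‘granulated’ are interchangeable, so we ask how often a random relabelling gives a difference in means at least as big as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test for a difference between two means."""
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = 0
    extreme = 0
    # Try every way of relabelling len(a) of the pooled values as group A.
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical dissolving times in seconds (invented for illustration only).
large_crystals = [95, 102, 110, 98, 105]
granulated = [40, 46, 38, 44, 42]

p = permutation_p_value(large_crystals, granulated)
print(f"p = {p:.4f}")
```

A small p value means the observed difference would be rare if the null hypothesis were true, so the null hypothesis is rejected. With only five replicates per group there are just 252 possible relabellings, so the smallest p value this test can ever give is 2/252.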
This is where the problems start.
This is where good experimental design is important.
What are you going to compare with what?
What are your controls going to be?
Are they the only controls or should you have
several types of control?
How many replicates of the tests do you need to
know whether the differences are significant or not?
Will you use individual samples, pooled
samples or paired samples?
What statistical methods will you use to test
whether the data sets are different?
Are there alternative statistical methods that
would be better but possible only if you
changed the experimental design?
You need to think about all of this before you
start any experimental work as this will
determine your experimental design.
For example, if a growth cabinet is big enough
to test only 12 plants at once and you want to
test 2 varieties with 6 plants for each of 2 water
stress treatments (total of 24 plants),
- is it better to have 6 plants of both varieties for just
one treatment in the cabinet at the same time to
make sure they are all under the same growth
conditions and then use the cabinet on a different
occasion for the second treatment, or
- do you have 6 plants per treatment in the cabinet
for one variety, and then do the second variety with
6 plants per treatment on another occasion, or
- do you split the replicates into 2 and have 3 plants
per treatment for both varieties at once, and then do
the second 3 replicates on another occasion?
Maybe 6 replicate plants is more than you need?
In fact, how many replications do you need to be able to find
a significant difference between two or more treatments?
- this is a question I have often been asked!
[It’s a bit like asking the question ‘How long is a piece of string?’
Answer - that depends how long it is!]
So, the answer to the number of replicates needed will
depend upon how much variation there is within a
treatment.
The bigger the variation within a treatment, the more
replication you will need in your experiment to find
significant differences between treatments.
Let’s look at an example ….
Take two sets of five replicate values:
[Table: five replicate values and the mean for Set A and for Set B]
It doesn’t take a PhD, or even an MSc degree, to decide that
set A is more variable than set B, so set A will need more
replicates than set B to find significant differences.
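To put rough numbers on this, here is a sketch using a standard normal-approximation formula for comparing two means: n per treatment = 2 x ((z_alpha + z_power) x sd / difference)^2. The formula, the 5% significance level, the 80% power and the standard deviations used below are all my assumptions for illustration. They are not taken from sets A and B.

```python
from math import ceil
from statistics import NormalDist


def replicates_needed(sd, difference, alpha=0.05, power=0.8):
    """Approximate replicates per treatment to detect `difference`
    between two means, given the within-treatment standard deviation."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Hypothetical within-treatment standard deviations (illustration only).
for sd in (0.5, 1.0, 2.0):
    print(sd, replicates_needed(sd, difference=1.0))
```

Because the standard deviation is squared in the formula, doubling the variation within a treatment roughly quadruples the number of replicates needed.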
So, the more success you have in designing
the experiment to reduce the variation within a
treatment, the easier it will be for you to find
significant differences between treatments.
In the example above, if this was plant height,
maybe seed size varies a lot for variety A, but
is very uniform in variety B.
Therefore, selecting for uniform seeds might
help reduce the variation within the variety.
In the same way, if you don’t think carefully
enough about all the factors that might have an
influence on the response to the treatments,
then you could get the wrong answer when
you test your ‘null’ hypothesis.
[Think about the growth cabinet example earlier:
if the same cabinet is used to test all replicates of one
variety and then all replicates of the second variety, but
the cabinet temperature changes between the two trials,
… what do you think would be the consequence?
All the experimental results will be invalid (not worth
anything!!!), but you might not be aware of this if you
didn’t realise the temperature had changed because you didn’t think to measure it!]
Here’s an example from my research programme
at Newcastle University of looking for the truth.
The hypothesis:
Ozone is damaging to the yield of wheat plants, and
the more ozone you give them, the lower the yield.
[Note that wheat is supposed to be the most
sensitive crop to ozone in the UK!]
The highest ozone treatment given in open-top
chambers (75 ppb) should have been enough to cause
significant damage.
This is the result of ranking wheat genotypes for yield under ozone
(75 ppb) relative to control yields using the original data:
[Figure: Ratio of yield/plant (75 ppb/Control, axis 0 to 2.5) plotted for each genotype]
Note lots of genotypes with
yield stimulated by ozone!!
26 genotypes > 1.1
So the hypothesis is WRONG!!
Or is it?
Well, let’s clean up the data to remove values affected by mice
eating the ears, and the bags that were spilt on the floor, etc:
[Figure: Ratio of yield/plant (75 ppb/Control, axis 0 to 2.5) plotted for each genotype, after cleaning the data]
Note lots of genotypes still with
yield stimulated by ozone!!
25 genotypes > 1.1
So the hypothesis IS STILL WRONG!!
Or is it?
OK, let’s be clever and think whether it is realistic for the hypothesis
to be wrong, or could there still be mistakes in our dataset?
Yield in wheat is derived by multiplying the following components:
spikelets/ear x ears/plant x grains/spikelet x weight/grain
Components are determined at different stages of development
- essentially in the order shown above.
Ozone treatment started on 20th May.
Spikelet production was already completed by 20th May.
Therefore spikelets/ear should be the same in both treatments.
But for several genotypes there were some
low values for spikelets/ear, particularly for control
plants, as shown here:
Genotype   Spikelets   Spikelets
Number     Control     Ozone
21         17          18
21         17          17
21         14          17
21         11          17
21         16          17
21         17          17
So, can we just delete the values that look
small?
Possibly, if we can show that they are
significantly different from the rest.
It looks as though the most frequent
spikelet number for this line is 17.
So, let’s rank all the lines for mean spikelet
number, then look at only those lines with
spikelet number means from 16.0 to 18.0,
i.e. 17.0 plus or minus 1:
[Data shaded yellow are for the 75ppb ozone treatment]
DHL    Mean Sp no.
45     18.7
48     18.5
45     18.5
104    18.3
65     18.0
65     18.0
87     18.0
144    18.0
39     17.8
10     17.8
19     17.8
104    17.8
20     17.7
87     17.7
114    17.7
14     17.7
18     17.7
58     17.7
71     17.5
41     17.5
19     17.3
26     17.3
31     17.3
52     17.3
102    17.3
23     17.3
39     17.3
48     17.3
54     17.3
14     17.2
58     17.2
20     17.2
21     17.2
90     17.2
98     17.2
23     17.0
32     17.0
32     17.0
71     17.0
8      16.8
22     16.8
42     16.8
97     16.8
98     16.8
7      16.8
33     16.8
42     16.8
97     16.8
143    16.8
11     16.7
18     16.7
Then work out a frequency distribution for
spikelet number per ear for those lines:
81 lines (across both treatments), giving in
total 476 values for spikelet number per ear.
[Figure: histogram of Number per class (axis 0 to 180) against Spikelet number per ear (classes 8 to 22)]
The range from 15 to 19 (green bars) contains
450 values, which is about 95% of the total (450 out of 476).
Therefore, any values outside this range (red
bars) can be regarded as significantly different.
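As a sketch of how this screening might be applied, the snippet below checks the 450 out of 476 figure and then filters the control spikelet counts listed earlier for genotype 21. The function name and the idea of simply dropping flagged values before taking the mean are my assumptions. The slides do not spell out exactly how the correction was made.

```python
from statistics import mean

total_values = 476
values_in_range = 450
print(f"{100 * values_in_range / total_values:.1f}% of values lie in 15 to 19")

LOW, HIGH = 15, 19  # range holding about 95% of all spikelet counts


def screen(values, low=LOW, high=HIGH):
    """Split values into those inside and those outside the accepted range."""
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if not (low <= v <= high)]
    return kept, flagged


# Control spikelet numbers for genotype 21, from the table shown earlier.
control_spikelets = [17, 17, 14, 11, 16, 17]
kept, flagged = screen(control_spikelets)

print("kept:", kept, "mean:", mean(kept))
print("flagged:", flagged)
```

For this genotype the two low control values (14 and 11) fall outside the range, which is what pulled the control mean down and made ozone look beneficial.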
So, let’s see what the picture looks like once we have corrected for
variation in spikelets per ear (determined before ozone was given):
[Figure: Ratio of yield/plant (75 ppb/Control, axis 0 to 2.5) plotted for each genotype, after correcting for spikelets per ear]
Note very few genotypes now
with yield stimulated by ozone!!
Now only 11 genotypes > 1.1
So the hypothesis is probably
RIGHT after all!!
Here is a riddle [brain teaser] for you:
When is a significant difference not a significant difference?
Answer - when the variation measured is not biological.
For example, you want to test whether drought treatments will
affect the plant hormone abscisic acid (ABA).
Now your experiments will have different sources of variation
due to:
- the effect of drought on ABA which you want to test, but also
- the efficiency of the extraction process for the hormone, and
- the efficiency of the method to purify the hormone, and
- the reproducibility of the assay for ABA.
Therefore, you need to know how variable are
- the extraction process
- the purification protocol and
- the hormone assay
before you can test the effect of the drought
treatments on tissue ABA concentrations.
[I have sometimes been sent scientific papers to review where
the authors present standard errors for the extraction and assay
efficiencies instead of the effect of the treatments!]
When was the last time you checked the
calibration of your pipettors?
So, how many of you have tested your
accuracy and precision of using a pipettor?
Precision and Accuracy
Precision is a measure of how closely the analytical results can
be duplicated.
Replicate samples (prepared identically from the same sample)
are analyzed to establish the precision of a measurement (of
enzyme activity, for example).
Accuracy measures how close to a true or accepted value a
measurement lies.
The difference between accuracy and precision
is illustrated here:
[Figure: two sets of measurements compared with the accepted value]
- Precise measurements that are not highly accurate
- Accurate measurements that are not highly precise
http://en.wikipedia.org/wiki/Accuracy_and_precision
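A pipettor check can put numbers on both ideas. In this sketch accuracy is expressed as the percentage bias of the mean from the accepted value, and precision as the coefficient of variation of the replicates. Both measures are common choices but are my assumption here, and the two sets of readings are invented (say, weights in mg of water dispensed by a pipettor set to 100 µl).

```python
from statistics import mean, stdev


def accuracy_and_precision(measurements, accepted_value):
    """Return (bias %, coefficient of variation %) for replicate measurements."""
    m = mean(measurements)
    bias_percent = 100 * (m - accepted_value) / accepted_value  # accuracy
    cv_percent = 100 * stdev(measurements) / m                  # precision
    return bias_percent, cv_percent


# Hypothetical replicate readings (illustration only); accepted value = 100.
precise_not_accurate = [104.9, 105.1, 105.0, 104.8, 105.2]
accurate_not_precise = [92.0, 108.0, 97.0, 104.0, 99.0]

for name, data in [("precise, not accurate", precise_not_accurate),
                   ("accurate, not precise", accurate_not_precise)]:
    bias, cv = accuracy_and_precision(data, accepted_value=100.0)
    print(f"{name}: bias = {bias:+.1f}%, CV = {cv:.1f}%")
```

The first set is tightly grouped but sits about 5% too high. The second set averages the accepted value but is widely scattered.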
4 Feb 1999   SP5
Printout format: only FW or DW known
Dilution of antibody stock used: not recorded
Dilution of labelled (+/-)-ABA stock used: not recorded
Tissue (+)-ABA data corrected for 10% immunoreactive contamination
Samples extracted on the basis of DW with constant solvent volume.
Extraction volume = 1 ml per sample
Weights entered in milligrams.
Extract volume assayed = 50 ul
No.  dpm1  dpm2  Mn dpm  pg/tube  Wt.extr  Ratio   ng/gDW±se   Sample
---------------------------------------------------------------------
1    2512  2514  2513.0  1053.3   38.6     25.9:1  491.1±0.3   169S QTL
2    2889  2412  2650.5   988.3   46.2     21.6:1  385.0±48.5  40N QTL
3    2515  2575  2545.0  1035.5   31.1     32.2:1  599.3±9.6   57S QTL
4    1750  1740  1745.0  1660.6   37.5     26.7:1  797.0±2.7   148N QTL
5    2876  3141  3008.5   813.8   39.4     25.4:1  371.7±25.5  200N QTL
6    2486  2687  2586.5  1014.4   40.3     24.8:1  453.0±24.3  187N QTL
7    2452  2318  2385.0  1130.5   44.2     22.6:1  460.3±17.0  150N QTL
8    3410  3326  3368.0   673.4   34.3     29.2:1  353.3±7.9   181S QTL
9    2521  2382  2451.5  1090.0   37.5     26.7:1  523.2±19.8  9PN QTL
---------------------------------------------------------------------
Here’s an example of an assay for the plant hormone ABA which
was carried out with duplicate analyses of each extract.
If the duplicates differed by more than 10% from the mean, the
extract was re-assayed (as in sample 2).
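The re-assay rule can be checked against the printout. This sketch assumes the rule is applied to the final ng/gDW values: with only two analyses, the standard error equals the distance of each duplicate from their mean, so the test becomes ‘is the se more than 10% of the mean?’. That reading of the rule, and the function name, are my assumptions.

```python
# (mean ng/gDW, se) for samples 1 to 9, copied from the printout above.
results = {
    1: (491.1, 0.3),
    2: (385.0, 48.5),
    3: (599.3, 9.6),
    4: (797.0, 2.7),
    5: (371.7, 25.5),
    6: (453.0, 24.3),
    7: (460.3, 17.0),
    8: (353.3, 7.9),
    9: (523.2, 19.8),
}


def needs_reassay(mean_value, se, limit_percent=10.0):
    """True if the duplicates lie more than limit_percent from their mean."""
    return 100 * se / mean_value > limit_percent


to_reassay = [n for n, (m, se) in results.items() if needs_reassay(m, se)]
print(to_reassay)  # [2]
```

Only sample 2 fails the check (48.5 is about 12.6% of 385.0), which agrees with the note above.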
Look at the consequences for your results of variation by no
more than 10% at different stages in the analysis process:
Source of variation                 Variation   Type of variation
Experiment to experiment            10%         biological
Plant to plant within experiment    10%         biological
Extraction efficiency               95-85%      analytical
Purification efficiency             95-85%      analytical
Assay efficiency                    95-85%      analytical
If the true value is 100 at 100% analytical efficiency,
a single analysis result could give a value ranging from
100 x 1.05 x 1.05 x 0.95 x 0.95 x 0.95 = 94.5, to
100 x 0.95 x 0.95 x 0.85 x 0.85 x 0.85 = 55.4 !
Biological range = 90.3 - 110.3% of true value.
Analytical range = 61.4% - 85.7% of true value.
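The arithmetic above can be reproduced by multiplying the lowest and the highest factor for each source of variation. This is a minimal sketch with names of my own choosing. It treats the 10% biological variation as a factor of 0.95 to 1.05, as in the calculation above.

```python
from math import prod

# (lowest factor, highest factor) for each source of variation.
biological = [(0.95, 1.05),   # experiment to experiment
              (0.95, 1.05)]   # plant to plant within experiment
analytical = [(0.85, 0.95),   # extraction efficiency
              (0.85, 0.95),   # purification efficiency
              (0.85, 0.95)]   # assay efficiency


def combined_range(factors, true_value=100.0):
    """Lowest and highest result when every factor acts in the same direction."""
    low = true_value * prod(f[0] for f in factors)
    high = true_value * prod(f[1] for f in factors)
    return low, high


for label, factors in [("biological", biological),
                       ("analytical", analytical),
                       ("overall", biological + analytical)]:
    low, high = combined_range(factors)
    print(f"{label}: {low:.1f} to {high:.1f}")
```

The overall range of 55.4 to 94.5 shows that three analytical steps, each only 5 to 15% short of perfect, contribute far more to the spread than the biological variation you actually want to measure.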
How to minimise these errors - 1
Plants (for a pot experiment):
• Thoroughly mix the growing medium before use
• Weigh equal amounts of medium into each pot
• Choose uniform seeds and sow more seeds per pot than you need so
that you can thin them to give uniformity later on (say, about 1 week
after emerging).
• If you are short of seeds, sow 1 per pot, then after all seedlings are
emerged and any differences between seedling size clearly visible,
rank all plants from smallest to largest and choose one plant per
treatment of equal size, then the next size as one per treatment, and
so on so that variation in plant size is matched across all treatments.
• Give equal amounts of water at the same time to each pot
• Move plants around the phytotron/GH (carefully) every few days or use
a block design where a block has one plant of each treatment
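The size-matching step for when you are short of seeds can be sketched as follows: rank the plants by size, take them in groups of as many plants as there are treatments, and give one plant from each group to each treatment. Shuffling within each size group is my addition, to avoid always giving the smallest plant of a group to the same treatment. The plant heights and treatment names are invented.

```python
import random


def allocate_by_size(plant_sizes, treatments, seed=0):
    """Allocate size-matched plants to treatments.

    plant_sizes: dict mapping plant id to its measured size.
    Returns a dict mapping each treatment to a list of plant ids.
    Plants left over after the last complete size group are not used.
    """
    rng = random.Random(seed)
    ranked = sorted(plant_sizes, key=plant_sizes.get)  # smallest to largest
    k = len(treatments)
    allocation = {t: [] for t in treatments}
    for start in range(0, len(ranked) - len(ranked) % k, k):
        group = ranked[start:start + k]  # k plants of similar size
        rng.shuffle(group)               # random order within the size group
        for treatment, plant in zip(treatments, group):
            allocation[treatment].append(plant)
    return allocation


# Hypothetical seedling heights in cm (illustration only).
heights = {"p1": 4.1, "p2": 6.3, "p3": 5.0, "p4": 4.4, "p5": 6.0, "p6": 5.2}
print(allocate_by_size(heights, ["control", "drought"]))
```

Each treatment ends up with one small, one medium and one large plant, so variation in plant size is spread evenly across the treatments.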
How to minimise these errors - 2
Sampling:
• Sample at the same time each day (unless testing diurnal trends)
• Sample one replicate of each treatment, then the next replicate of
each treatment, etc (i.e. not all of treatment 1, then all of
treatment 2, etc.)
• Label all collection tubes/bags, etc in such a way that it is clear
exactly what is what and that the label will not come off (or
fade) during storage (note that some ‘biro’ ink can fade in
bright light and marker pen ink can easily be rubbed off glass tubes
stored in a freezer!)
• Be observant and note anything that may cause variation in your
results (a leaf affected by disease, or damaged by wounding,
etc).
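The sampling order recommended above (one replicate of every treatment, then the next replicate of every treatment) is easy to write out in advance. A small sketch, with hypothetical treatment names:

```python
def sampling_order(treatments, n_replicates):
    """Replicate 1 of every treatment, then replicate 2 of every treatment, etc."""
    return [(treatment, rep)
            for rep in range(1, n_replicates + 1)
            for treatment in treatments]


# Hypothetical treatments (illustration only).
for treatment, rep in sampling_order(["control", "mild stress", "severe stress"], 3):
    print(f"{treatment}, replicate {rep}")
```

Sampled this way, any drift during the sampling session (time of day, temperature, your own tiredness) is shared across the treatments instead of being confused with a treatment effect.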
How to minimise these errors - 3
Analytical measurements:
• Check that the pipettors are calibrated and give reliable volumes
• Process one (or more) replicate of each treatment in the same
batch (as for sampling)
• Use the same batch of reagents, buffers, substrates for analysing
across all treatments
• In a pilot study, check the accuracy and precision of
a) your sample extraction procedure (e.g. re-extract your sample)
b) your sample purification procedure (e.g. test recovery of a
known amount of substance)
c) your sample assay procedure (e.g. test quantification of a
known amount of substance)
• Preferably assay every sample twice and repeat again any giving
bad reproducibility
How to minimise these errors - 4
Identify the comparisons most important to test
(e.g. control and stress on a particular occasion, or
resistant versus susceptible varieties).
Then design your assay protocol to allow those
comparisons to be made on the same occasion.
If using densitometry scans to quantify peak areas,
remember to subtract any background below the
baseline.
Unless you have previously shown that your assay
is both accurate and precise (within 5% of the true
value every time), you should replicate your assay.
Give your controls the letter ‘a’ if using
Duncan’s multiple range test.
Some comments, based on personal experience,
on collecting and analysing good quality data:
- with a long-term experiment it is useful to have a diary to
remind you what to do each day
- if you make a mistake when writing down a measurement,
make sure you correct it in a way that is legible to you and
to others
- form a judgement beforehand on the sort of mean values
that would be realistic for the data set:
in this way, you are more likely to recognise a value/mean
that is a mistake (a number misread when entering into the
computer, or a decimal point missed out).
For example, a decimal point has clearly been
missed out from one of this series of 17 values:
[Figure: bar chart of the 17 values, axis 0 to 200]
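A quick automatic check can back up your own judgement. The sketch below flags any value that is more than five times larger or smaller than the median of the series. The factor of five is an arbitrary assumption of mine, and the 17 values are invented to mimic the example above.

```python
from statistics import median


def suspect_decimal_errors(values, factor=5.0):
    """Return (position, value) for values far from the median of the series."""
    m = median(values)
    return [(i, v) for i, v in enumerate(values, start=1)
            if v > factor * m or v < m / factor]


# Hypothetical series of 17 values; number 7 should have been 18.4.
series = [17.2, 18.9, 16.5, 19.3, 18.1, 17.7, 184.0, 16.9, 18.4,
          19.0, 17.5, 18.8, 16.2, 17.9, 18.6, 19.5, 17.1]

print(suspect_decimal_errors(series))  # [(7, 184.0)]
```

A flagged value is only a prompt to go back to the original notebook. It should not be corrected or deleted automatically.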
- don’t present mean values or errors with a level
of precision greater than that justified by the
number of replicates.
For example, if you have only three values from 5.5 to
9.6 don’t present the mean value with more than 2
decimal places, or errors to more than 1 decimal place.
- be careful about relying upon others to do your
experimental work for you!
The more people involved in the work, the more likely it
will be for mistakes to happen, especially with those who
have no personal involvement in the outcome of the
research (why should I care?).
Conclusions:
So, if you think carefully about all the factors I’ve
described,
- all the controls that you ought to include
- all the factors that might invalidate your comparisons
- all the likely costs and facilities needed
- all the degrees of replication that will be needed for a
particular level of significance to be tested
- all the statistical tests you need to do
- all the possible sources of error, and
- all the things that could go wrong (!),
then ...
you are likely to have a well-designed experiment
that will give good quality results, which will give you
- a valid test of your hypothesis
- something worth writing up for one or
more good quality publications
- a sound basis upon which to develop
ideas on what to do next:
forming your next hypothesis
- but crucially: access to the TRUTH
- and that will give you ...
- satisfaction in a job well done!