Background Information on Uncertainties and Error Analysis
These notes describe the way we will do calculations in this class. They take precedence
over any statements in the lab manual that might be different in some details.
Important note on calculations:
Every calculation should be done without any rounding of intermediate results. Rounding to the
correct number of significant figures should be done based on “exact” values. If you have to
record intermediate results, keep several extra decimal digits to ensure that your final results do
not suffer from round-off errors. Show all work, including the expression you evaluated on your
calculator, if you wish to get any partial credit for what might be only a calculator error.
1. Significant Figures
Significant figures are numbers that have actual physical meaning. The digits 1 through 9 are
always considered significant, but zeroes can be either significant or simply place holders. This
makes it necessary to have rules that allow us to specify unambiguously whether or not a zero is
significant. Further, because the physical limitations on how much we know about a particular
value should be reflected in any result we calculate from it, it is equally important to have rules
on which digits in a calculated result are significant and which are not. Please notice that the rule
used for exams and homework in the lecture course is not as detailed as what we must use in the
laboratory because we assume all numbers are exact in the lecture class and results are only
being shortened for our convenience, whereas in the lab the number of significant figures tells
us how accurate our experimental knowledge of a physical quantity really is.
A. Specifying significant figures
Trailing zeroes are not significant if the number is an integer. They are significant if the number
has a decimal point; trailing zeroes after the decimal point are always significant. Leading zeroes
are not significant if they only serve as place holders for a decimal fraction.
Some examples:
Number        Sig Figs
3.76          3
7035          4
0.000002      1
0.500002      6
5             1
5.0           2
5.00          3
6.04          3
73000         2
73000.        5
73020         4
73002         5
73000.0       6
73000.1       6
Notice how scientific notation is the only way to make it clear that one trailing zero is significant
in a number that would otherwise be a large integer; for example, 7.30 × 10⁴ has three significant
figures, a precision that cannot be shown any other way (73000 shows two and 73000. shows five).
B. Rules for Rounding
The rule I want you to use when rounding is a bit different from what you may have used in the
past and differs from what is in the lab manual. If the part you are going to drop is exactly a 5
followed by nothing but zeroes, then you round the digit before it to the nearest even number. Since 1, 2, 3, and 4
get rounded down while 6, 7, 8, and 9 always get rounded up, this has the effect of balancing out
the direction you round so there is no statistical bias (such as you get if 5 is always rounded up).
Some examples:
Round 2.501 to 3, but round 2.5 to 2 like you would round 2.485 to 2. Round 3.5 to 4 just as you
would round 3.52 to 4, but round 3.499982 to 3.
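As an aside, Python's built-in round() happens to implement this same round-half-to-even rule, so it can be used to check the examples above (a quick illustration only, not part of the lab procedure):

```python
# Python's round() uses "banker's rounding": exact halves go to the
# nearest even digit, matching the rule described above.
print(round(2.5))       # exact half: rounds to the even number 2
print(round(3.5))       # exact half: rounds to the even number 4
print(round(2.501))     # not an exact half: rounds normally to 3
print(round(3.499982))  # rounds normally to 3
```

Note that exact halves occur only for the literal value 5 with nothing after it; 2.501 is closer to 3 than to 2, so the even-rounding rule does not apply.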
C. Multiplication and Division
The number of significant figures in a product or quotient is the same as the smallest number of
significant figures in any of the factors.
Example: Consider the calculation of an area from two lengths, each known to two significant
figures, say 2.1 m × 3.3 m. The exact product is 6.93 m², but this must be rounded to 6.9 m²
because each factor has only two significant figures.
D. Addition and Subtraction
The number of significant figures in a sum or difference is found by first identifying the
rightmost decimal place of the least precise quantity involved in the calculation and rounding
to this decimal place in the result. The number of sig figs can increase or decrease.
Example: Suppose a temperature is measured to be 8 °C and we want to convert to kelvins.
We add the conversion factor 273.15 to get the exact sum of 281.15 K. The least precise
quantity (8) is only known to the ones place, so we must round this result to 281 K.
[We start with 1 sig fig and get 3 sig figs.]
E. Combinations of Calculations
When different kinds of operations (such as addition and division) appear in an expression, you
work out the number of significant figures step-by-step following the rules for precedence of
operations in arithmetic. You do the first kind of calculation (say division to evaluate a fraction)
and determine the number of sig figs for it, then do the next type of calculation (say addition of
those fractions inside parentheses) using the number of sig figs from the intermediate result to
determine the number of sig figs for that result, and so on. Remember, you do not round until
you are finished, so you must keep track of the sig figs on the side, so to speak (or do so literally,
by keeping notes in the margin of your calculation worksheet).
2. Experimental Uncertainties
First, a comment. Although the terms “experimental uncertainty” and “experimental error” are
used interchangeably by scientists, it is important to realize that these uncertainties are normally
not due to actual errors. Experimental uncertainties generally arise from unavoidable difficulties
in making a measurement, limitations of the equipment, or even ambiguities in the specification
of what is to be measured. Only rarely are the differences between theory and observation due to
actual mistakes, and when they are, one of the goals of statistical analysis of data is to spot these
mistakes so they can be avoided when the experiment is repeated.
We group experimental uncertainties into two categories, instrumental and statistical. The first is
a property of the measurement device, while the other is a name for the aggregate effect of many
different sources of variation seen in repeated measurements of the same quantity.
A. Sources of Experimental Uncertainties
One common source of experimental uncertainty is a limitation of the equipment used. Each
instrument has some limit to its precision. A meter stick has divisions marked every mm, and
any value smaller than that has to be estimated. In contrast, a micrometer has divisions that
allow direct readings to a much higher precision, but it cannot be used to measure large
distances. A related source of uncertainty is reading the measurement, particularly when
estimating values that fall between the divisions marked on the instrument. These are discussed
further in section B as well as section C because these uncertainties are the easiest to quantify.
There are also actual personal errors, where the instrument is misread or a value is recorded
incorrectly. The most common ones are when a reading is made in one unit (say grams or cm)
but has to be recorded in SI units (kg or meters) and a digit is lost in the conversion. Another is
the misreading of a scale, or even reading a ruler from the wrong end. These are eliminated by
experience and attention to detail when doing the measurement. If the error only happens once,
it can be identified as an “outlier” as described in section C, otherwise it is a systematic error.
The most problematic kind of uncertainty is what is called a systematic error. This could result
from a flaw in the instrument being used (a meter stick that is not a meter long, for example) or
how the instrument is used (failure to zero a mass balance before using it). It could also result
from some feature of the experimental setup, say an extra force that is due to how the apparatus
works, or excess friction in a poorly adjusted pivot point or pulley, or even the mass of a string
that you used but did not include in your mass measurements. These systematic uncertainties can
be hard to identify, as they often show up as a result that is very precise (all data agree to many
significant figures) but not accurate (the data disagree with the known value for the quantity
being measured). They can only be quantified by repeating an experiment in a different fashion.
B. Estimated Instrumental Uncertainties
Every time you make a measurement, you should record your estimate of how accurate it is. You
do this in two ways: by the number of sig figs you record, and by explicitly stating how unsure
you are of the last digit you recorded. This uncertainty is a combination of the limitations of the
instrument and your estimate of the uncertainty in your reading of the instrument. Since this sort
of uncertainty is usually the same for a series of measurements, it suffices to only record it once
with the first measurement of that type.
Example 1: When using a digital meter or some other device where you cannot estimate
values between the ones given by the instrument, you should assign an uncertainty of
one half of the last digit. For example, if you were doing timing with a clock that only
displayed seconds, a measurement of 70 s would be recorded as (70. ± 0.5) s.
Example 2: When you make an estimate between marked divisions, say when using a ruler
or a lab balance, you need to estimate how good your estimate is. For example, you might
decide that you can easily tell when a length is 2/3 of the way between 21.2 cm and 21.3 cm
on a metal ruler and thus record the measurement as (0.2126 ± 0.0003) m, but you might
decide to record it as (0.2125 ± 0.0005) m when using a thick wooden meter stick because
you can only tell that the length is midway between 21.2 cm and 21.3 cm.
C. Statistical Uncertainties (Standard Deviation)
If the only uncertainties in a measurement are due to the sort of random errors that result from
trying to guess a value that falls between two marked divisions or the difficulty in positioning a
ruler at exactly the same starting position each time, the uncertainties you estimated can also be
determined experimentally if you collect a large sample of data. You can do this because the
scatter of the data points about their average value (the mean) can tell you about the actual
uncertainty in the estimated mean provided the random errors follow a “bell curve” distribution.
In other cases the method we use will only give an approximate value for the true uncertainty.
The quantity we will calculate is called the sample standard deviation (usually written s or σ_N−1). It tells
you the probability that a new measurement will differ by a specific amount from the mean of
your sample. There is a 68.3% probability that a new value will be within one standard deviation
of the mean, and a 95.5% probability that it will fall within two standard deviations of the mean.
(That is, about 2/3 of the time it will be within “one sigma,” and the other 1/3 of the time it will
be within “two sigma,” with only 1 in 20 falling more than “two sigma” from the mean.) There
is a 99.73% probability that a new value will fall within three standard deviations, that is, less
than 3 in 1000 should be that far from the mean. A measurement that falls several standard
deviations from the mean is called an “outlier”. (See below.)
Calculation of the sample standard deviation:
We will use a calculator or computer fitting program to do this calculation. [The procedure
for the TI-30X IIS or TI-83 is on a handout.] These notes are included to tell you what your
calculator is doing for you when you ask it to find the value of s.
You start with a collection of N measurements, x₁, x₂, …, x_N, which should be recorded with your estimate
of the instrumental uncertainty. [The instrumental uncertainty is averaged along with the data
point values, and then used in the final evaluation of what uncertainty to report.]
You next calculate the mean (the average) by using the formula

    x̄ = (x₁ + x₂ + … + x_N) / N = (1/N) Σ xᵢ

and then calculate the sample standard deviation according to the formula

    s = sqrt[ Σ (xᵢ − x̄)² / (N − 1) ].
This calculation is most easily done by hand if you calculate the quantity in parentheses, the
“deviation from the mean”, and its square in a table as is shown in the example below.
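The two formulas above can be sketched directly in Python; the helper name sample_stats is my own, not anything from the manual or calculator:

```python
import math

def sample_stats(data):
    """Return the mean and the sample standard deviation of a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Sum of squared deviations from the mean, divided by N - 1
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    return mean, math.sqrt(variance)
```

For example, sample_stats([1, 2, 3, 4, 5]) gives a mean of 3.0 and s = sqrt(2.5) ≈ 1.58.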
Important: Any time you calculate the average (mean) of a set of data you should also calculate
the sample standard deviation and record it as the uncertainty in the mean. In general, the
standard deviation is the best estimate of the actual uncertainty in your measurement. If it is
really different from your estimated error (either way), you should reexamine your calculation
and/or estimated uncertainty and/or recorded measurements for errors. Really large error
estimates might indicate that you should repeat the experimental measurement. In a case where
your measurements are all essentially the same, giving a standard deviation much smaller than
your instrumental error (sometimes you can even get zero standard deviation), you should use the
estimated instrumental error as your final uncertainty. In other cases, use the standard deviation.
Finally, round the uncertainty and the mean to the correct number of significant figures.
Sample Calculation:
We will assume that five measurements of a time, in seconds, have been made with an estimated
instrumental uncertainty of 0.5 s. These are listed in the first column (or entered into a data list
in a calculator) and used to calculate the average. We first sum up the entries in the data column.
Time Data (s)    Deviation from Mean    Deviation Squared
70               0.4                    0.16
68               1.6                    2.56
71               1.4                    1.96
70               0.4                    0.16
69               0.6                    0.36
Sum = 348                               Sum = 5.20

The mean is found by dividing the sum by N = 5, giving 348/5 = 69.6 s. [Notice that we
keep all of the extra digits beyond the two sig figs we really know.]
We then calculate the deviation from the mean by subtracting 69.6 from each data point,
recording the magnitude in the 2nd column so we can easily check our work, and recording the
square of the deviation in the 3rd column. Next we add up the squares as shown above. Finally,
we divide the sum by 4, getting 1.30, and take its square root, which is about 1.1402, and then
look at the significant figures and instrumental uncertainty to decide what value to report.
The last step is to compare the standard deviation (about 1.14) to the estimated instrumental error
(0.5). The difference is not enough to be concerned about, so we use the standard deviation we
just calculated as the uncertainty. In this case, there are two significant figures so we round the
mean and uncertainty and report (70. ± 1) s as our experimental value.
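The arithmetic above can be checked with Python's statistics module, assuming whole-second clock readings of 70, 68, 71, 70, 69 s, which reproduce the sums quoted in the text (mean 69.6, sum of squared deviations 5.20):

```python
import statistics

times = [70, 68, 71, 70, 69]  # assumed whole-second clock readings

mean = statistics.mean(times)  # 69.6
s = statistics.stdev(times)    # sample standard deviation, sqrt(1.30) ≈ 1.1402

# Round to the reportable precision: (70 ± 1) s
print(f"({round(mean)} ± {round(s)}) s")
```

statistics.stdev() is the N − 1 ("sample") form used in these notes; statistics.pstdev() would divide by N instead and should not be used here.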
D. Standard Error
As stated earlier, you should always report the standard deviation whenever you calculate an
average from a set of data. However, in some cases you will be asked for the “standard error”,
which is simply the standard deviation divided by the square root of the number of
measurements, s/√N, instead. In such cases, be sure you give the standard
deviation in conjunction with the mean and give the standard error in the space provided for it.
Note: The standard error gives a range that has a 68.3% probability of containing the true value
of the quantity being measured. Contrast this with the sample standard deviation, which gives
a range that has a 68.3% probability of containing the result of a new measurement that uses the
same experimental set up, that is, a value you would get if you repeated the measurement.
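A minimal sketch of the standard-error calculation (the function name is my own):

```python
import math
import statistics

def standard_error(data):
    """Standard error of the mean: sample standard deviation over sqrt(N)."""
    return statistics.stdev(data) / math.sqrt(len(data))
```

For the five illustrative times used earlier, standard_error([70, 68, 71, 70, 69]) gives about 0.51 s, roughly half the sample standard deviation of 1.14 s.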
3. Propagating Experimental Uncertainties in Calculations
The specific procedure to use when performing calculations with numbers that have uncertainties
can be quite complicated, particularly when the uncertainty in one value might be correlated to
that in another (because they have some measurement in common). This is the subject of actual
courses in statistics and data analysis. The procedure to use when you know the uncertainties are
completely uncorrelated is to add the errors in quadrature as described in the lab manual, but we
will not use that method in this class. The method described below is simpler and gives a more
conservative estimate of the way uncertainties propagate.
A. Adding and Subtracting Numerical Data with Uncertainties
When you add or subtract numbers, you simply add the uncertainties.
Suppose we have measured A and B, and determined their uncertainties to be δA and δB,
respectively. If C = A + B, then δC = δA + δB; if C = A − B, then δC = δA + δB as well.
The uncertainties add in both cases.
Examples: If A = (12.5 ± 0.3) cm and B = (4.2 ± 0.2) cm, then
A + B = (16.7 ± 0.5) cm and A − B = (8.3 ± 0.5) cm.
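This rule can be sketched in a couple of lines of Python; the measurement values below are illustrative, not taken from any lab in the manual:

```python
def add_with_uncertainty(a, da, b, db):
    """Sum of two measured values; the uncertainties add."""
    return a + b, da + db

def subtract_with_uncertainty(a, da, b, db):
    """Difference of two measured values; the uncertainties still add."""
    return a - b, da + db

# Illustrative measurements: A = (12.5 ± 0.3) cm, B = (4.2 ± 0.2) cm
total, dtotal = add_with_uncertainty(12.5, 0.3, 4.2, 0.2)
diff, ddiff = subtract_with_uncertainty(12.5, 0.3, 4.2, 0.2)
```

Both results carry the same ±0.5 cm uncertainty: (16.7 ± 0.5) cm for the sum and (8.3 ± 0.5) cm for the difference.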
B. Multiplying and Dividing Numerical Data with Uncertainties
When you multiply or divide numbers, you add the relative uncertainties to get the relative
uncertainty of the result and then calculate the result’s uncertainty.
Suppose we have measured A and B, and determined their uncertainties to be δA and δB.
If C = A × B (or C = A / B), then the relative uncertainty of the result is

    δC/C = δA/A + δB/B

and the uncertainty itself is δC = C × (δA/A + δB/B).
Example: Evaluate F = m × a for m = (1.5 ± 0.1) kg and a = (9.8 ± 0.2) m/s². We first calculate
the relative uncertainties δm/m = 0.1/1.5 = 0.0667 and δa/a = 0.2/9.8 = 0.0204, which add to
give δF/F = 0.0871. We then calculate F = 1.5 × 9.8 = 14.7 N and δF = 14.7 × 0.0871 = 1.28 N.
Finally, we round the value to two sig figs as required by the rules of section 1 and
round the uncertainty to the same decimal place: (15 ± 1) N is the answer.
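A sketch of the product and quotient rules in Python (helper names and values are my own illustrations):

```python
def multiply_with_uncertainty(a, da, b, db):
    """Product of two measured values; relative uncertainties add."""
    c = a * b
    return c, c * (da / a + db / b)

def divide_with_uncertainty(a, da, b, db):
    """Quotient of two measured values; relative uncertainties also add."""
    c = a / b
    return c, c * (da / a + db / b)

# Illustrative: m = (1.5 ± 0.1) kg times a = (9.8 ± 0.2) m/s^2
force, dforce = multiply_with_uncertainty(1.5, 0.1, 9.8, 0.2)
```

This gives F = 14.7 N with δF = 1.28 N, which would be reported as (15 ± 1) N after rounding. (The sketch assumes positive values; for signed data the relative uncertainties should use absolute values.)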
C. Raising Numerical Data to a Power
When you raise a quantity to a power, you multiply its relative uncertainty by that power to get
the relative uncertainty of the result and then calculate the result’s uncertainty.
Suppose we have C = Aⁿ, where n can be a non-integer. Then

    δC/C = n × (δA/A), so δC = n × C × (δA/A).

Example: Evaluate V = L³ for L = (2.0 ± 0.1) m. We first calculate δL/L = 0.1/2.0 = 0.05
and 3 × 0.05 = 0.15. Then we calculate V = 8.000 m³ and δV = 8.000 × 0.15 = 1.2 m³,
and then round to two sig figs to get (8.0 ± 1.2) m³, which is the answer.
Again, as in the previous example, notice that the number of digits in the uncertainty is
determined by which decimal place is occupied by the least significant digit in the result.
In this case we get two digits in the uncertainty because it is so large. If this were our final
answer, we could give the above result and then say the measured value is (8.0 ± 1.2) m³.
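The power rule can be sketched the same way (the function name and the cube example are illustrative only):

```python
def power_with_uncertainty(a, da, n):
    """a raised to the power n; the relative uncertainty is multiplied by |n|."""
    c = a ** n
    return c, abs(n) * c * (da / a)

# Illustrative: the volume of a cube with side L = (2.0 ± 0.1) m
volume, dvolume = power_with_uncertainty(2.0, 0.1, 3)
```

The absolute value of n handles negative powers such as 1/T² correctly, since an inverse still becomes more uncertain as its input does.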
D. Combinations of Calculations
Just as described in section 1.E, you work out the uncertainty in the final result by working out
the uncertainties in each part of the calculation step-by-step, doing each kind of operation in the
standard precedence of operations order you would use if doing the arithmetic by hand. You
work your way through the expression from the inside out, doing calculations within parentheses
before those outside, evaluating powers before multiplication or addition, etc. We will go over
this during the lab period when it has to be done as part of the lab calculations.
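As one illustration of chaining these rules, consider the pendulum formula g = 4π²L/T² (the numbers here are hypothetical, not from any lab in the manual): the power rule doubles the relative uncertainty of T, and the quotient rule then adds the relative uncertainties of L and T².

```python
import math

# Hypothetical measurements: L = (1.000 ± 0.005) m, T = (2.01 ± 0.02) s
L, dL = 1.000, 0.005
T, dT = 2.01, 0.02

g = 4 * math.pi ** 2 * L / T ** 2
# Power rule: T**2 carries twice T's relative uncertainty.
# Quotient rule: relative uncertainties of L and T**2 add.
rel_g = dL / L + 2 * (dT / T)
dg = g * rel_g
```

This gives g ≈ 9.77 m/s² with δg ≈ 0.24 m/s², which would be reported as (9.8 ± 0.2) m/s².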
4. Least Squares Fitting
The idea of a “best fit” curve is an important one. It replaces the use of a mean and standard
deviation for cases where you have 2-variable data rather than 1-variable data if some physical
model tells you to expect a particular functional relationship between the two variables. The
resulting fit gives you the parameters of the function, an extra parameter that tells you how well
the function describes the data, and in some cases it gives you uncertainties in the fit parameters.
The name “least squares” comes from the way in which the fit is done: you are minimizing the
sum of the squares of the difference between each point and the function, so the best fit is when
those squared differences are least.
The case where one fits the line y = mx + b to data consisting of (xᵢ, yᵢ) pairs is given the
special name of “linear least squares” and is the same as doing “linear regression”.
Pages 12 to 15 of the lab manual provide a discussion of this kind of fitting but contain a very
significant typographical error in equation 11 for the best fit slope: a factor was left out.
The correct equations for the slope (m) and y-intercept (b) are

    m = [N Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)] / [N Σxᵢ² − (Σxᵢ)²]

and

    b = [(Σxᵢ²)(Σyᵢ) − (Σxᵢ)(Σxᵢyᵢ)] / [N Σxᵢ² − (Σxᵢ)²].

The “correlation coefficient” r is given by the formula

    r = [N Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)] / sqrt{ [N Σxᵢ² − (Σxᵢ)²] [N Σyᵢ² − (Σyᵢ)²] }.
The value of r is a quantitative measure of how good the fit is. Appendix I (page 559) of the lab
manual provides a table that tells you the probability of getting a particular value of “r” for a
given number of data points from a set of totally random numbers. A small probability means it
is unlikely that you could have gotten data that fall near your fitted line purely by chance.
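The standard least-squares formulas can be sketched from scratch in Python (the function name is my own; the sums are the same ones you would tabulate by hand):

```python
import math

def linear_least_squares(xs, ys):
    """Return slope m, intercept b, and correlation coefficient r
    for the best-fit line y = m*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    d = n * sxx - sx ** 2          # common denominator for m and b
    m = (n * sxy - sx * sy) / d
    b = (sxx * sy - sx * sxy) / d
    r = (n * sxy - sx * sy) / math.sqrt(d * (n * syy - sy ** 2))
    return m, b, r
```

Data that lie exactly on a line, such as y = 2x + 1, return m = 2, b = 1, and r = 1.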
There are at least three different ways you can do a least squares fit:
1. You can do it by hand, by evaluating the equations above. Doing it by hand is made easier if
you build a table giving each of the terms you need to sum up to get the contributions to each of
the equations. We will do it with a computer program.
2. You can do it with “Graphical Analysis”, by using the “Linear Fit” option. This can give
the uncertainties in the slope and intercept as well as the regression coefficient (r) if you follow
the directions on how to turn on that option. I recommend always including the uncertainties in
the fit parameters because they can be used to understand the precision of your work, whereas the
value of “r” tells you the reliability of your result.
3. You can do it with a TI-83 calculator, but that won’t give you the standard deviations of the fit parameters.
Sample Fit with Graphical Analysis:
The graph shown below is from a scan of the output from a linear regression fit to a simple data
set. Notice that Graphical Analysis gives uncertainties in the slope and intercept values. Those
uncertainties give an indication of how much we could change those fit parameters and still have
the line come “acceptably” close to the collection of data points.
Notice that the uncertainties from the fit indicate that we should report this result as a slope of
(9.4 ± 0.9) and an intercept of (4.8 ± 4.9). There are only 2 significant figures in the slope, which
is expected because the x values of the data set were only given to 2 digits.
Finally, we can also conclude from this fit that the value of the intercept is statistically
consistent with zero; that is, zero lies within the one-standard-deviation (68%) range of b.
Many people would report the intercept as (5 ± 5) because the tenths place is not significant.