USING EXCEL IN BIOSTATISTICS General description of Excel

USING EXCEL IN BIOSTATISTICS
General description of Excel
Excel is a spreadsheet that is part ofthe Microsoft Office packages. It has other appli­
cations besides statistics. Businesses, for example, can use Excel to keep inventory records,
process orders, and compare sales. Teachers could use Excel to maintain a record of student
grades. It is also an excellent way to work with data in biostatistics. Some data sets are placed
on the Internet in files that are readable with Microsoft ExceL
Upon loading the program, it sets up a blank workbook. Each workbook is composed
of sheets of information. Each workbook may contain up to 255 sheets. The sheet names are at
the bottom, on tabs. You may change the name ofthe sheet by right clicking on its tab, select­
ing rename, and then type the new name. This is useful if the data is in one sheet and each
graph is in a separate sheet. Each sheet is composed of columns (labeled A, B, C, etc) and rows
(labeled 1,2,3, etc).
Location of statistical features
1. Some functions are built-in and can be used within cells; these can be directly accessed or
obtained with the Function Wizard located on the tool bar and resembling.fx.
2. Graphs are constructed with the Chart Wizard located on the tool bar and resembling a di­
mensional histogram.
3. Data Analysis Tool Packs are part of Excel's custom installation. Click on Tools, Add-ins
and then wait approximately 2 minutes. Then click on the top 2 data analysis tool packs. After
closing the dialog box, the words Data Analysis will be included in the Tools menu.
Getting started for general work
Entering information into Excel. Basically, there are two major ways of first entering
information; either by directly typing it in or by opening an existing file.
To enter information, just type the data into cells of a worksheet. You may enter infor­
mation either vertically or horizontally.. You may precede each with a title, called a header
row. If your information is not entirely visible, you may enlarge the size of the cell by moving
the cursor to the column headings (labeled A, B, C, etc.) When the cursor changes to a vertical
line with arrows pointing to both the left and right, you can move the cursor to the left or right,
with the mouse, thereby changing the size of the celL Each cell is named with its column letter
followed by its row number. The name ofthe cell will appear on the left side ofthe screen,
right above the worksheet.
To open an existing file, click on File, Open, and then select the file to open. Usually,
this will be on a disk, in drive A. You should provide your own disk for this purpose.
Save the worksheet. Once the data is typed in, it is an excellent idea to save it on a disk.
Click on File, Save As, and then name the file with a title that will easily identify the data. Be
sure to save it on drive A.
Bl
82
Manipulating the data. It may be necessary to manipulate your data into order make
them easier to work with. Some of the common operations are described below.
A. Applying a data format
Sometimes data have special formats, such as currency. To apply a special format, highlight
the data to be formatted. Click on Format and select the appropriate one.
8. Inserting new data in a specific place
Click on the row where the new data are to go before. Click on Insert, Row to create a new
row. Likewise, a new column can be created by clicking first on the column and then Insert,
Column.
C. Sorting the data
This is particularly useful when making a frequency table. Click on any cell within the data.
Click Data, Sort and then select the fields that are to be sorted either ascending or descending.
At the bottom of the screen, there is a place for the header row to be selected or not.
D. Creating a new column, based on a calculation
Suppose a frequency table has been created and it is necessary to create a relative frequency
table. Assume that column A has the lower class limits and column 8 has the frequencies; both
columns have header labels. In CI, type Relative Frequency. In C2, type =82 I #, where #
stands for the total number of data items. Press Enter. The next box, C3, will be highlighted.
Go back up to C2, move the cursor to the lower right corner, until it turns into a + sign. Hold­
ing down the left mouse button, pull the cursor down the column until you reach the last needed
place. Excel will automatically change the formula to reflect each new location.
E. Copying cells to a new location
Highlight the group of cells to copy. One way to copy the cells is to press Control and C keys
together. At the top of the new location, press Control and V keys together.
F. Creating a pattern of numbers
This is useful if when setting up class limits (for graphs) or ranks (for the normal probability
plot). Type the first 2 or 3 numbers. Highlight the cells, move the mouse arrow to the lower
right comer, press the le~ mouse, drag the desired amount, and release the mouse.
G. Changing the width of a column
Move the pointer so that it is on the rightmost edge of the column heading ofthe column to be
extended. The pointer should change shape, resembling a vertical line with 2 arrows. Hold
down the left mouse button and drag the width to the desired amount.
B3
Graphs for raw data
The basic installation of Excel has a chart wizard. Pressing a button, which looks like
a three dimensional bar graph on the tool bar, accesses this. The graphs available include bar
graphs, line graphs, and pie graphs. A simple default graph is highlighted in black; fancier ver­
sions are also present.
To make a graph, the first thing you will need to do is to create a frequency table, using
standard techniques. If the table is not yet prepared, you will either need to enter the data into
the calculator lists or into a column in ExceL The data must then be sorted.
After the frequency table has been created, the cells must be highlighted. Press the
Chart Wizard button on the top row. When moving from one screen of the chart wizard to the
next, be sure to use the NEXT button, rather than the FINISH button.
Generally, if the frequency table contains nominal data, Excel will not have any prob­
lems with handling the axes. If the classes are not nominal, then Excel has a slight problem.
Once the chart wizard begins, you will need to click on the Series tab and remove the classes
from the series to graph. On the bottom, there is a place for the category x axis. Type =(name
of sheet)! (starting cell):(ending cell) for the data values.
While progressing through the chart wizard, you will be allowed to enter a title for the
graph and labels for both axes. The legend box could be eliminated if desired. The frequencies
can be entered A data table could be included. The chart wizard will ask whether you to
place the chart in the sheet or a chart; it is recommended to place the graph in a new chart.
Click in the top circle to accomplish this.
Once the graph is drawn, additional changes can be made. By clicking on the various
components, you can change the font of printing, the color of the background, the fill pattern on
the bars, the width of the bars, etc. For each correction, the mouse must be moved so that a lit­
tle box with words describing what is to be changed appears. Double click the mouse to enter
the dialog. To make the histogram bars connect, double click on a bar, go to the options tab,
and run the gap width down to zero.
This basic technique will work for all graphs. When preparing graphs, remember that
the graph should look professional and follow correct formats. If a cumulative frequency poly­
gon is desired, the graph must start with a frequency of O. If a frequency polygon is desired, the
graph must start and end with frequencies of O. Be sure to include the "invisble" class marks or
boundaries for these graphs. Relative frequency graphs can also be made.
There are two other types of histogram that you may wish to make. The first one is
available through a supplemental program called the Data Analysis Toolpak. This product is
available separately from Duxbury Press as part of a book called Data Analysis with Microsoft
Excel. It produces a histogram with a cumulative frequency percentage graph superimposed on
it. The number of classes (called bins) are created automatically by Excel. If it is desired to
have "nice" classes, be sure to set up the lower class limits and specify the range in the bin. Be
sure to check the bottom two options (Cumulative frequency and chart) to get the graph.
B4
This graph is placed in a sheet of Excel, rather than in a Chart. Students will need to stretch the
graph, change fonts, and generally make improvements. A sample of a modified graph is be­
low.
Average salaries of professors at 50
universities
15
(J
c
~ 10
>.
C"
e
u.
5
o +-I~-+--
......."'I----+-
120.00%
100.00%
80.00%
60.00%"
40.00%
20.00%
.00%
45 50 55 60 65 70 75 80 More
Salary
A second type of special histogram is available through the Stat-Plus add-in. It is a his­
togram with. a nonnal curve superimposed on it, using the mean and standard deviation of the
data. Again, be sure to specify the range of values of the data. This graph is again placed in a
sheet, so you will need to make changes to the design. A sample modified graph follows.
Distribution of average salaries of
professors at 50 universities
10,.......-­
"--,10.00
CD
>.
C"l
g
6
e
6.00
4
4.00
2
2.00
Q)
::::l
u.
m
8.00 ~
8
o -+--+-+-+-+-t-t-t-t--I--t--I----1r-1--t--+ 0.00
'*"'\b< ~(;:j''\tt- q.'re
OJ
r8>
~.
reO;) ro(;:j
rett-· rereOverall
~co
,\(;:jo
~~
,\b<'
~
~
z
o
3
!!!.
B5
Using Excel for descriptive statistics
Descriptive statistics within a cell ofa worksheet. When using any of the functions be­
low, you must have a data range specified by either the name of the range (like Income) or the
range of the cells (like C2:C5l). Each function is calculated by =(function name)(range
name, other information needed).
Calculation
What it Finds
Example
Average(range)
Average
Mean of the data
Trimmean
Trimmed mean of a group, the per­
Trimmean(range,percent)
cent removed is 1/2 from the top and
1/2 from the bottom of the data set
Median
Median of the data
Median(range)
Mode
Mode of the data
*Excel only reports the first mode*
Mode(range)
Max
Maximum of the data
Max(range)
Min
Minimum of the data
Min(range)
Stdev
Standard deviation (sample)
Stdev(range)
Var
Variance (sample)
Var(range)
Stdevp
Standard deviation (population)
Stdevp(range)
Varp
Variance (population)
Varp(range)
Percentile
Percentiles, including quartiles
Percentile(range,O. ##)
One advantage of using this method of obtaining descriptive statistics is that the coeffi­
cient of variation, range, scores within 1 or 2 standard deviations of the mean and scores to de­
termine outliers can be built using previous results. For example, ifthe value of the mean is in
cell B12 and the value of the standard deviation is in cell B13, then the coefficient of variation
could be located in cell B14 by typing =100*813/812. Likewise, the limits for the number of
scores within 1 standard deviation can be entered in cells B15 and B16 by typing =812+813 or
=812-813.
Descriptive Statistics using the Tools. Click on Tools, then select Data Analysis. A pop
up menu will appear. Click on Descriptive Statistics. A selection sheet will appear. If the data
had been in named columns, the name of the column can be entered. If this is the case, be sure
to put a check mark in the box for labels. Otherwise, you will need to input the range of values.
86
Near the middle of the box, record where you want the output moved to; a suggestion is
to put it into a new sheet which needs to be named. Also, be sure to check the box labeled De­
scriptive Statistics. It is possible to do confidence intervals. It is also possible to print out the
th
k smallest and largest scores, if desired.
Sample output, using a data set of overall faculty salaries in thousands of dollars at 50 universi­
ties is shown below:
Overall
58.195
0.986535
56.99
#N/A
6.975854
48.66254
-0.23208
0.501573
29.79
45.75
75.54
2909.75
50
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
The #N/A for the mode tells us that there is no mode; however, if there is a number, be sure to
check for multiple modes.
Among the values given are the standard error of the mean which has the following fonnula.
This forumla is part of the infonnation necessary in calculating a confidence intervaL
The Kurtosis value is an index which describes a distribution with respect to its flatness or
peakedness, as compared to the nonnal distribution. A negative value is characteristic of a rel­
atively flat distribution while a positive value is a relatively peaked distribution. The fonnula
for calculation of Kurtosis is given below.
n(n+1)
(n-l )(n-2)(n-3)
L(X-XJ4
s
3(n-l)2
(n-2)(n-3)
Another value isthat for Skewness. A negative skew indicates the longer tail extends in the di­
rection of low values in the distribution; the mean should be smaller than the median. A posi­
tive skew indicates the longer tail extends in the direction of high values in the distribution; the
mean should be larger than the median. The fonnula for skewness is
n
(n-1Xn-2)
_J3
x-X
L (-s­
B7
Box plots
Box plots are a feature of Stat Plus. Select box plots to open the dialog box. Enter in
the range of values for the data and check the header row label if appropriate. Send the output
to a new sheet. Moderate outliers will be shown as a filled in circle while extreme outliers will
be an open circle. Extreme outliers are beyond Q3 + 3 (Q3 - Ql) or Ql- 3(Q3 - Ql). Moder­
ate outliers are at a distance of 1.5 times the interquartile range to 3 times the interquartile
range from either Ql or Q3. A sample output is shown below:
80
70
60
50
40
Overall
30
20
10
0
Scatter plots and Regression lines
The Chart Wizard will also create a scatter plot of data. Be sure that the independent variable
(x) is in the first column and the dependent variable (y) is in the second column. Highlight the
data and make a the scatter plot following the general graphing procedures and storing the re­
sult in a chart.
The background should be cleared. In addition, depending on the data, the axes may need to be
re-scaled as Excel tends to start both at zero. To change the axes, move the mouse until you see
the name of the axes appear in a little box. Double click the mouse. On the tab labeled scale,
change the minimum to a value slightly smaller than the data's minimum value.
To add a regression line, move the mouse near a data point. When the box with the word Series
appears, right click the mouse. A drop down menu should appear. Click on Add Trendline. A
series of different trendlines will appear. For this early work, the linear model is to be selected.
Be sure to click the options tab and check the bottom 2 boxes - add equation and coefficient of
determination to the graph. Otherwise, just the line will be drawn. The equation and value of
2
r can be moved around on the graph. Other models are available such as the exponential
growth/decay modeL
B8
Binomial probabilities
Excel has the binomial probability distribution built in. To begin a distribution, create 2
columns, labeled X and P(X). The X values are from 0 to n, where n is the number of trials.
The Function Wizard will make it easier for the probabilities to be calculated. Put the cursor in
the cell for the first P(X). Select the Function Wizard (recall it looks like Ix on the toolbar).
Select Statistical, then within the menu, select on Binomdist. For number, enter the cell where
X = 0 was found (like A2). Enter the value for n. Enter a desired probability (the value of p in
the binomial formula). If not doing a cumulative, enter the word false. After finishing, drag
the cell throughout the range of cells to compute the other binomial probabilities.
Once the chart of values is obtained, you could make a histogram of the data. Additional
columns, to detennine the mean and standard deviation using the generic probability distribu­
tion formulas, could be created. Additionally, you could see the effect of changing the value
for p on the symmetry of the distribution.
Poisson probabilities
Poisson probabilities are also available through the Function Wizard and can be utilized in
much the same way. The value of A is entered in the Wizard. Both individual and cumulative
probabilities can be found.
Normal Curve Probabilities
Excel provides several functions related to the nonnal distribution. These are also accessed
most readily through the Function Wizard which will guide you through the information to be
entered. The table below describes these functions. These functions are particularly useful in
establishing confidence intervals or doing hypothesis testing.
Function
Accomplishes
Nonndist
Returns the value of the cumulative probabilities; must have found
the values for s, mean and standard deviation
Nonmnv
Returns the z score of the cumulative probabilities; must have cu­
mulative percents, mean and standard deviation
1N0rmsdist
Returns the value of the cumulative probabilities; must have foudn
the z score first
Normsinv
Returns the z score of the cumulative probabilities; must have cu­
mulative percents first
Standardize
Returns a standardized z score for a specified X, mean and stan­
dard deviation
B9
Central Limit Theorem
To demonstrate the Central Limit Theorem, first, a population must be created. Click on
Tools, Data Analysis, Random Number Generator. A dialog box will appear. The follow­
ing information must be entered.
Number of variables - to put all of the numbers in one column, use I
Number of random variables - any number that you desire - suggest 500
Distribution - the choices that would probably be appropriate are
Uniform - the lower and upper bounds must be entered
Normal- the mean and standard deviation must be entered
Output Range - Enter the location of the upper left hand cell (AI, for example)
Next, samples must be created - this is where the tedious, time consuming works come into
play. Click on Tools, Data Analysis, Sampling. Again a dialog box will appear. The follow­
ing information must be entered.
Input range - the range of values for the population (AI:A500, for example)
Sampling method - either periodic or random; probably should use random
a
ax = .j;;
Number of samples - the value for n in
Output range - the first cell where the sample value is placed;
values will be in a column
After the first sample is made, you will need to make additional samples of the same size, plac­
ing results in the next column of the chart. This is what takes the time. However, it does show
the individual samples. After all columns are created, then the sample mean of each column
must be obtained. In a cell below the first column's data, type =Average(start cell, end cell).
The formula is then copied throughout all the samples.
In the example, 500 random decimals were created. Ten samples of size 5 were created.
The output for the samples and their means are shown:
Sample 1
0.29991
0.59304
0.56865
0.70309
0.75091
Means 0.58312
Sample 2
0.28986
0.71261
0.59899
0.5754
0.95825
Sample 3
0.10269
0.11921
0.28062
0.01315
0.34419
Sample 4
0.48601
0.49907
0.2642
0.54714
0.2327
Sample 5
0.19828
0.41173
0.69906
0.5439
0.12436
Sample 6
0.1391
0.00653
0.05557
0.923
0.64986
Sample 7
0.08243
0.71407
0.85076
0.80508
0.37681
Sample 8
0.09082
0.48601
0.97366
0.84201
0.1908
Sample 9
0.19727
0.46587
0.78021
0.4687
0.69973
Sample 10
0.494705
0.639973
0.585498
0.87582
0.016938
0.62702
0.17197
0.40582
0.39546
0.35481
0.56583
0.51666
0.52235
0.522587
Next, at some location in the worksheet, you should find the mean and standard deviation of
both the population and the sample. Also, ~ should be found, using the
~
population standard deviation and the sample size.
BlO
Summary of values
Population
Mean
Standard deviation
0.489554
0.289814
Sample
Mean
Standard deviation
0.466565
0.135795
Standard error
0.129609
What is interesting to see is that, with increased values for n, there is much less diversity in the
averages. You could also graph the sample means as a histogram, if enough samples were ob­
tained.
Normal Probability Plots
One of the underlying assumptions that applies to all the inferential work in statistics is that the
data must be normally distributed. Even though it is not usually included in texts, it is a good
idea to verify this assumption, especially when working on a project where data is analyzed.
Some evidence from the descriptive statistics that support (or suggest normality) are:
Proximity of mean, median and mode
Range is approximately 6 times the standard deviation
Inter-quartile range is approximately 1.33 times the standard deviation
The percent of scores in 1 standard deviation of the mean is 68%
The percent of scores in 2 standard deviations of the mean is 95%
The shape of the box plot
The shape of the histogram (not recommended for small data sets)
A normal probability plot, which plots the data versus the theoretical z scores if the data were
normally distributed, should result in a straight line. If data are not normally distributed, then
theoretically you should attempt to normalize it by one of several transformations: logarith­
mic, square root or reciprocal. All are readily done on ExceL The transformed data, if it ap­
pears to be a linear plot, would then be utilized for all confidence intervals and hypothesis tests,
transforming final values for the interval back into the regular data values.
Directions for a normal probability plot:
1. Establish headings as follows:
ColumnA
Rank
Column B
Cumulative percent
Column C
Z score
Column D
Data
811
2. Enter into Column D, the data values. Sort the data in ascending order. Check to see if
there are any duplicate data values.
3. Enter in Column A, starting in cell A2, the values from 1 to
ber of data items.
D,
where n represents the num­
4. If you had no duplicate data values, skip this step and go directly to step 5.
For any data points that are duplicate points, you will need to average the ranks and record
that value for each of the duplicates. For example, if your worksheet has columns like below:
Rank
Cumulative proportion
Z score
Data
1
13
2
13
3
13
4
14
5
15
6
15
Change it to the following:
Rank
Cumulative proportion
Z score
Data
2
13
2
13
2
13
4
14
5.5
15
5.5
15
The 13's have rank 2 since the average of 1,2 and 3 is 2. The 15's have rank 5.5 since the av­
erage of5 and 6 is 5.5.
rank
5. To form Column B, the cumulative proportions are found by n + I . In cell B2, enter
=A2/(value for 0+1). That is, if the data had 30 values, enter =A2I31. Highlight the cell and
drag to copy the formula down to the last row.
6. To form the inverse normal z scores (which are based on cumulative proportions and repre­
sent the area to the left of the z score), type in cell C2, =NORMSINV(B2). Highlight the cell
and drag to copy the formula down to the last row.
B12
7. Use the chart wizard to create an XY scatter plot of Column C versus Column D, with the
Column C being the x values. After the wizard is finished, you may want ~o readjust the values
on each axis~ it is not necessary for the y scale to start at zero.
The plot of the salaries of professors at 50 universities suggests that the data are close to being
normally distributed.
Probability Plot
-------89:00- - - - - - - . - - - j
-I
!
•
-3
-2
-1
0
z score
3
2
Additionally, the Stat Plus Add-in also gives a probability plot (Pplot); this is essentially the
same graph with a rotation of axes and a trend line added..
Normal Probability Plot
3.00
2.00
••
1.00
~
0
u
II)
0.00
z
-1.00
-2.00
-3.00
45.75
50.75
55.75
60.75
Overall
65.75
70.75
•
B13
T distributions
Excel has, through the Function Wizard, 2 functions for T distributions. These are summarized
below:
Function
Accomplishes
Tdist
Returns the value of the probability for the tail area for a particular t
score; must have the value of t, the degrees of freedom, and the num­
ber of tails (lor 2)
Tinv
.­
Returns the t score for a given probability ofa tail area and degrees of
freedom
Confidence Intervals
In real life statistics, a z interval is used only when the population standard deviation is known.
Otherwise, a t interval is used.
To create a z interval for raw data, the values must be entered into the worksheet. The values
for the mean, standard deviation and sample size must be found and put into cells. This can be
done either by the individual formulas or by the descriptive statistics add-in. For illustration,
assume that the mean is in cell B2, the standard deviation is in cell B3, and the sample size is in
cell B4.
The steps below are the formulas to enter:
Confidence level
Area in tail
Z score
Lower limit
Upper limit
Cell B5
Cell B6
Cell B7
Cell B8
Cell B9
Enter the value of the confidence level as a decimal
=(1-B5)/2
=ABS(NORMSINV(B6»
=B2-B7*B3/SQRT(B4)
=B2+B7*B3/SQRT(B4)
Sample output is below:
Z intervals
Mean
Standard Deviation
Sample size
Confidence level
Area in tail
Z score
Lower limit
Upper limit
54.4
4.5
36
0.9
0.05
1.644853
53.16636
55.63364
To create a T interval, you have a choice. If it is desired to use formulas, then the line for Z
score is replaced by:
T score
Cell B7
=ABS(TINV(B6,B4-1»
B14
Sample output is below:
T intervals
Mean
Standard Deviation
Sample size
Confidence level
Area in tail
T score
Lower limit
Upper limit
54.4
4.5
36
0.9
0.05
2.03011
52.87742
55.92258
A second method is to enter the raw data into Excel. When choosing descriptive statistics,·se;.·
lect on the confidence level as welL Sample output is below for the speeds of 10 cars:
Speeds of cars
Mean
Standard Error
Median
Mode
Standard Deviation
Sample Variance
Kurtosis
Skewness
Range
Minimum
Maximum
Sum
Count
Confidence Level(95.0%)
69.7
1.626516387
68.5
#N/A
5.143496433
26.45555556
-0.926717627
0.283301617
16
62
78
697
10
3.679438498
To obtain the actual interval, you must take the mean and then both subtract and add the confi­
dence level figure (3.6794).
In a similar fashion, you can create intervals for proportions.
X=
N=
P estimate
Q estimate
Confidence level
Area in tail
Z score
Lower limit
Upper limit
Cell B2
Cell B3
Cell B4
Cell B5
Cell B6
Cell B7
Cell B8
Cell B9
Cell B10
Enter the value for x
Enter the value for N
=B2/B3
=1-B4
Enter the value of the confidence level as a decimal
=(1-B6)/2
=ABS(NORMSINV(B7»
=B3-68*SQRT(B4*B5/B3)
=B3+B8*SQRT(B4*B5/B3)
B15
Sample output is shown below:
Proportion intervals
x=
n=
p estimate
q estimate
confidence level
area in tail
z score
Lower limit
Upper limit
38
250
0.152
0.848
0.95
0.025
1.959961
0.107496
0.196504
Hypothesis Testing
One-sample tests are not included in Excel, but as in the case of the intervals, formulas can be
created which will accomplish the task.
Sample Mean
Hypothesized Mean
Standard Deviation
Sample size
Test Statistic
Alpha
P value one tailed
Z critical one tailed
P value two tailed
Z critical two tailed
Cell B2
Cell B3
Cell B4
Cell B5
Cell B6
Cell B7
Cell B8
Cell B9
Cell BlO
Cell B11
Enter the value here
Enter the value here
Enter the value here
Enter the value here
=(B2-B3)/(B4ISQRT(B5))
Enter the value here
=1-NORMSDIST(ABS(B6»
=ABS(NORMSINV(B7»
=2*B8
=ABS(NORMSINV(B7/2»
Sample output is shown below.
One Sample Z test.
Sample Mean
Hypothesized Mean
Standard Deviation
Sample size
Test statistic
Alpha
P value one tailed
Z critical one tailed
P value two tailed
Z critical two tailed
110
100
15
20
2.981424
0.05
0.001435
1.644853
0.002869
1.959961
In a similar fashion, tests could be set up for a T test of the mean and a Z test for proportions.
For the T test, a cell with degrees of freedom could be created. The p value formulas would re­
quire the format TDIST(location of T score, location of degrees of freedom, 1) for one
tailed tests. The critical value formulas would require the format TINV(location of alpha, lo­
cation of degrees of freedom) for one tailed tests; in two tailed tests, make sure the value of
alpha is divided by 2.
B16
Fortunately, the tests for 2 populations are part ofthe Data Analysis Tool Pack, if the raw data
are presented. Available are z tests, t tests with equal population variances, t tests with unequal
population variance and paired differences t tests. For each test, you need to have the data es­
tablished in columns. Select on Data Analysis, choose the test, and then fill in the desired in­
foonation. Note that the hypothesized mean difference is asked for; in Math 124, this is treated
as zero. Sample output for paired differences is shown below:
t-Test: Paired Two Sample for Means
Mean
···········Variance
Observations
Pearson Correlation
Hypothesized Mean Difference
Df
t Stat
P(T<=t) one-tail
t Critical one-tail
P(T<=t) two-tail
t Critical two-tail
Machine I
17.42857143
6.619047619
Machine 2
18.42857143
10.28571429
7
7
0.842594423
o
6
-1.527525232
0.088744414
1.943180905
0.177488828
2.446913641
ANOVA
Excel has one way ANaVA as well as two way ANaVA. The data is again entered into a
worksheet in either rows or columns. Enter the Data Analysis tool pack and select on one way
ANaVA. Be sure that the range specified tells the location of all of the data, including blank
cells; that is, if a box were drawn around the cells specified, all the data would be found.
Sample results are shown:
Anova: Single Factor
Groups
Average
Sum
Count
Variance
Company A
4
235
58.75
170.9167
CompanyB
3
205
68.33333
400.3333
CompanyC
5
353
70.6
164.8
CompanyD
4
211
52.75
49.58333
ANaVA
Source of
Variation
SS
df
MS
Between Groups
865.6333
3
288.5444
Within Groups
2121.367
12
176.7806
2987
15
Total
F
1.632218
P-Value
0.234033
F crit
3.4903
817
Contingency Tables
First, you must copy the tables into Excel (assuming that the raw data was not available). The
row totals and column totals can be obtained by writing formulas and then dragging them
across the table.
Observed
In favor
Against
No opinion
Grand Total
Men
93
70
12
175
Women
87
32
6
125
180
102
18
300
Grand Total
To calculate the expected values, formulas for the corresponding cells need to be established.
For example, the expected for Men In Favor is found by =BS*E3/ES, assuming that the work­
sheet was begun in cell A2.
Expected
Against
In favor
Men
Women
Grand Total
No opinion
Grand Total
105
59.5
10.5
175
75
42.5
7.5
125
180
102
18
300
Next, set up headings as follows:
P-value
Observed chi-square
Alpha
Critical chi-square
Click on the cell next to P-value. Click on Function Wizard, select Statistical, select CHITEST,
and click Next. Enter the actual range of observed cell frequencies; do not include locations of
totals. Below it enter the expected range of cell frequencies. Click finish. The p value of the
test will appear.
Click on the cell next to Observed chi-square. Click on Function Wizard, select Statistical, se­
lect CIllINV and click Next. For probability, enter the cell location of the p value. For degrees
of freedom, enter the degrees of freedom which must be calculated. Click finish. The test
value of Chi square will be displayed.
B18
Enter a value for alpha.
Click on the cell next to Critical chi-square. Click on Function Wizard, select Statistical, select
CIffiNV. For probability, enter the cell location of the p value. For degrees of freedom, enter
the number. Click finish.
What is really neat, is that as you change the value of alpha, the critical value will automatically
be recalculated.
P-value
Observed chi-square
Alpha
Critical chi-square
0.0161411
8.2527466
0.01
9.210351
P-value
Observed chi-square
Alpha
Critical chi-square
0.0161411
8.2527466
0.05
5.9914764