Statistical Analysis using SPSS

Statistical Analysis using SPSS
Dr.Shaikh Shaffi Ahamed
Asst. Professor
Dept. of Family & Community Medicine
SPSS Windows
• Data View
– Used to display data
– Columns represent variables
– Rows represent individual units or groups of units that share
common values of variables
• Variable View
–
–
–
–
–
Used to display information on variables in dataset
TYPE: Allows for various styles of displaying
LABEL: Allows for longer description of variable name
VALUES: Allows for longer description of variable levels
MEASURE: Allows choice of measurement scale
• Output View
– Displays Results of analyses/graphs
Data Entry Tips
• For large datasets, use a spreadsheet such as EXCEL
which is more flexible for data entry, and import the
file into SPSS
• Give descriptive LABEL to variable names in the
VARIABLE VIEW
• Keep in mind that Columns are Variables, you don’t
want multiple columns with the same variable
Importing data into SPSS
To import an EXCEL file, click on:
FILE  OPEN  DATA then change FILES OF TYPE
to EXCEL (.xls)
To import a TEXT or DATA file, click on:
FILE  OPEN  DATA then change FILES OF TYPE
to TEXT (.txt) or
DATA (.dat)
You will be prompted through a series of dialog boxes to
import dataset
Descriptive Statistics-Numeric Data
• After Importing your dataset, and providing names to
variables, click on:
• ANALYZE  DESCRIPTIVE STATISTICS DESCRIPTIVES
• Choose any variables to be analyzed and place them in box on right
• Options include:
n
Mean : y 
y
i 1
i
n
n
Sum :  yi
i 1
 y
n
Std. deviation : S 
S
S.E. Mean :
n
i 1
i
y
n 1

2
Variance : S 2
e
t
d
N
e
u
i
m
a
m
i
a
t
t
t
t
t
t
t
i
E
i
i
i
i
i
i
C
8
8
0
1
3
3
1
1
V
8
Descriptive Statistics-General Data
• After Importing your dataset, and providing names to
variables, click on:
• ANALYZE  DESCRIPTIVE STATISTICS
FREQUENCIES
• Choose any variables to be analyzed and place them in box on
right
• Options include (For Categorical Variables):
– Frequency Tables
– Pie Charts, Bar Charts
• Options include (For Numeric Variables)
– Frequency Tables (Useful for discrete data)
– Measures of Central Tendency, Dispersion, Percentiles
– Pie Charts, Histograms
Example 1.4 - Smoking Status
S
u
P
r
u
c
c
V
N
0
9
9
9
Q
3
3
3
2
Q
9
6
6
8
C
2
4
4
2
O
3
8
8
0
T
7
0
0
Vertical Bar Charts and Pie Charts
• After Importing your dataset, and providing names
to variables, click on:
• GRAPHS  BAR…  SIMPLE (Summaries for Groups
of Cases)  DEFINE
• Bars Represent N of Cases (or % of Cases)
• Put the variable of interest as the CATEGORY AXIS
• GRAPHS  PIE… (Summaries for Groups of Cases) 
DEFINE
• Slices Represent N of Cases (or % of Cases)
• Put the variable of interest as the DEFINE SLICES BY
Example 1.5 - Antibiotic Study
80
60
40
Count
20
5
4
0
1
OUTCOME
2
3
4
5
3
1
2
Histograms
• After Importing your dataset, and providing names
to variables, click on:
• GRAPHS  HISTOGRAM
• Select Variable to be plotted
• Click on DISPLAY NORMAL CURVE if you want a
normal curve superimposed (see Chapter 3).
Example 1.6 - Drug Approval Times
30
20
10
Std. Dev = 20.97
Mean = 32.1
N = 175.00
0
0
0.
12
0
0.
11
0
0.
10
.0
90
.0
80
.0
70
.0
60
.0
50
.0
40
.0
30
.0
20
.0
10
0
0.
MONTHS
Side-by-Side Bar Charts
• After Importing your dataset, and providing names
to variables, click on:
• GRAPHS  BAR…  Clustered (Summaries for Groups
of Cases)  DEFINE
• Bars Represent N of Cases (or % of Cases)
• CATEGORY AXIS: Variable that represents groups to be
compared (independent variable)
• DEFINE CLUSTERS BY: Variable that represents
outcomes of interest (dependent variable)
Example 1.7 - Streptomycin Study
30
20
OUTCOME
1
2
10
3
Count
4
5
0
6
1
TRT
2
Scatterplots
• After Importing your dataset, and providing names
to variables, click on:
• GRAPHS  SCATTER  SIMPLE  DEFINE
• For Y-AXIS, choose the Dependent (Response) Variable
• For X-AXIS, choose the Independent (Explanatory)
Variable
Example 1.8 - Theophylline Clearance
8
7
6
5
4
THCLRNCE
3
2
1
0
.5
DRUG
1.0
1.5
2.0
2.5
3.0
3.5
Scatterplots with 2 Independent Variables
• After Importing your dataset, and providing names
to variables, click on:
• GRAPHS  SCATTER  SIMPLE  DEFINE
• For Y-AXIS, choose the Dependent Variable
• For X-AXIS, choose the Independent Variable with the
most levels
• For SET MARKERS BY, choose the Independent Variable
with the fewest levels
Example 1.8 - Theophylline Clearance
8
7
6
5
4
3
THCLRNCE
DRUG
2
Tagamet
1
Pepcid
0
Placebo
0
2
SUBJECT
4
6
8
10
12
14
16
Contingency Tables for Conditional
Probabilities
• After Importing your dataset, and providing names to
variables, click on:
• ANALYZE  DESCRIPTIVE STATISTICS  CROSSTABS
• For ROWS, select the variable you are conditioning on
(Independent Variable)
• For COLUMNS, select the variable you are finding the conditional
probability of (Dependent Variable)
• Click on CELLS
• Click on ROW Percentages
Example 1.10 - Alcohol & Mortality
C
A
o
0
1
W
0
C
5
5
0
%
%
%
%
1
C
1
4
5
%
%
%
%
T
C
6
9
5
%
%
%
%
Independent Sample t-Test
• After Importing your dataset, and providing names
to variables, click on:
• ANALYZE  COMPARE MEANS  INDEPENDENT
SAMPLES T-TEST
• For TEST VARIABLE, Select the dependent (response)
variable(s)
• For GROUPING VARIABLE, Select the independent
variable. Then define the names of the 2 levels to be
compared (this can be used even when the full dataset has
more than 2 levels for independent variable).
Example 3.5 - Levocabastine in Renal Patients
S
t
E
e
N
e
e
G
A
N
6
3
2
2
H
6
7
9
7
S
s
u
f
a
V
o
n
a
l
e
r
.
e
E
a
e
e
2
o
F
d
p
i
t
r
r
w
g
f
p
e
e
A
E
4
1
6
0
4
7
7
0
3
a
E
6
3
6
7
7
3
6
n
Paired t-test
• After Importing your dataset, and providing names
to variables, click on:
• ANALYZE  COMPARE MEANS  PAIRED
SAMPLES T-TEST
• For PAIRED VARIABLES, Select the two dependent
(response) variables (the analysis will be based on first
variable minus second variable)
Example 3.7 - Cmax in SRC&IRC Codeine
l
E
e
e
N
e
P
S
1
I
s
e
N
i
g
P
S
m
i
f
o
n
a
l
r
e
E
p
2
e
e
e
d
w
t
p
a
a
P
S
3
9
2
8
9
6
2
0
Chi-Square Test
• After Importing your dataset, and providing names to
variables, click on:
•
•
•
•
•
ANALYZE  DESCRIPTIVE STATISTICS  CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on CHI-SQUARE
Under CELLS, Click on OBSERVED, EXPECTED, ROW
PERCENTAGES, and ADJUSTED STANDARDIZED
RESIDUALS
• NOTE: Large ADJUSTED STANDARDIZED RESIDUALS
(in absolute value) show which cells are inconsistent with null
hypothesis of independence. A common rule of thumb is seeing
which if any cells have values >3 in absolute value
Example 5.8 - Marital Status & Cancer
R
E
V
N
C
R
T
a
C
o
n
a
t
c
a
n
M
S
C i
A
o
n
7
9
7
6
E
x
.
1
9
0
%
%
%
%
A
d
3
3
M
C
o
a
2
6
8
4
E
x
.
3
7
0
%
%
%
%
A
d
7
7
W
C
o
2
7
6
3
E
x
.
6
4
0
%
%
%
%
A
d
1
1
D
C i
o
v
1
5
5
0
E
x
.
0
0
0
%
%
%
%
A
d
0
0
T
C
o
o
3
7
6
3
E
x
.
0
0
0
%
%
%
%
a
m
p
s
a
d
i
l
d
a
P
0
3
7
L
2
3
4
L
1
1
7
A
3N
a
1
m
Fisher’s Exact Test
• After Importing your dataset, and providing names to
variables, click on:
ANALYZE  DESCRIPTIVE STATISTICS  CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on CHI-SQUARE
Under CELLS, Click on OBSERVED and ROW
PERCENTAGES
• NOTE: You will want to code the data so that the outcome
present (Success) category has the lower value (e.g. 1) and the
outcome absent (Failure) category has the higher value (e.g. 2).
Similar for Exposure present category (e.g. 1) and exposure
absent (e.g. 2). Use Value Labels to keep output straight.
•
•
•
•
•
R
Example 5.5 - Antiseptic Experiment
E
T
H
e
D
o
a
t
e
T
A
C
6
4
0
%
%
%
%
C
C
6
9
5
%
%
%
%
T
C
2
3
5
%
%
%
%
a
r
m
c
c
p
t
t
s
a
s
s
d
i
i
l
i
d
d
u
d
f
b
P
5
1
4
a
C
8
1
8
L
7
1
3
F
5
4
L
2
1
4
A
N
5
a
C
b
0
1
McNemar’s Test
• After Importing your dataset, and providing names to
variables, click on:
ANALYZE  DESCRIPTIVE STATISTICS  CROSSTABS
For ROWS, Select the outcome for condition/time 1
For COLUMNS, Select the outcome for condition/time 2
Under STATISTICS, Click on MCNEMAR
Under CELLS, Click on OBSERVED and TOTAL
PERCENTAGES
• NOTE: You will want to code the data so that the outcome present
(Success) category has the lower value (e.g. 1) and the outcome
absent (Failure) category has the higher value (e.g. 2). Similar for
Exposure present category (e.g. 1) and exposure absent (e.g. 2).
Use Value Labels to keep output straight.
•
•
•
•
•
Example 5.6 - Report of Implant Leak
E
G
o
s
s
t
S
P
C
9
8
7
%
%
%
A
C
5
3
8
%
%
%
T
C
4
1
5
%
%
%
a
t
il
d
a
M
P-value
N
a
B
Relative Risks and Odds Ratios
• After Importing your dataset, and providing names to
variables, click on:
•
•
•
•
•
•
ANALYZE  DESCRIPTIVE STATISTICS  CROSSTABS
For ROWS, Select the Independent Variable
For COLUMNS, Select the Dependent Variable
Under STATISTICS, Click on RISK
Under CELLS, Click on OBSERVED and ROW PERCENTAGES
NOTE: You will want to code the data so that the outcome present
(Success) category has the lower value (e.g. 1) and the outcome
absent (Failure) category has the higher value (e.g. 2). Similar for
Exposure present category (e.g. 1) and exposure absent (e.g. 2).
Use Value Labels to keep output straight.
Example 5.1 - Pamidronate Study
R
E
V
N
o
e
o
P
P
C
7
9
6
%
%
%
%
P
C
4
7
1
%
%
%
%
T
C
1
6
7
%
%
%
%
s
o
n
e
r
p
a
w
p
lu
O
6
3
0
(
F
7
2
5
Y
F
6
3
6
N
N
7
Example 5.2 - Lip Cancer
E
C
R
N
o
e
o
t
P
Y
C
9
9
8
%
%
%
%
N
C
8
1
9
%
%
%
%
T
C
7
0
7
%
%
%
%
s
n
e
r
p
w
l
p
u
O
3
1
9
P
F
6
8
5
Y
F
8
2
4
N
7
Correlation
After Importing your dataset, and providing names to
variables, click on:
ANALYZE  CORRELATE BIVARIATE
Select the VARIABLES
Select the PEARSON CORRELATION
Select the Two tailed test of significance
Select Flag significant correlations
Linear Regression
• After Importing your dataset, and providing names
to variables, click on:
•
•
•
•
ANALYZE  REGRESSION  LINEAR
Select the DEPENDENT VARIABLE
Select the INDEPENDENT VARAIABLE(S)
Click on STATISTICS, then ESTIMATES,
CONFIDENCE INTERVALS, MODEL FIT
Examples 7.1-7.6 - Gemfibrozil Clearance
i
a
c
d
a
i
c
i
c
c
r
B
e
E
M
i
t
B
g
B
1
(
8
8
1
0
0
6
C
5
1
5
3
6
2
8
a
D
Examples 7.1-7.6 - Gemfibrozil Clearance
b
O
m
d
F
S
M
i
a
g
f
a
1
R
2
1
8
3
6
R
8
5
3
T
0
6
a
P
b
D
b
u
E
u
s
r
s
R
q
q
M
t
a
1
1
6
0
a
P
b
D