Lecture notes, January 6 and 8

KAHS 6020 Multivariate analysis
and design
• Dr. Alison Macpherson
• Website www.yorku.ca/alison3
Primary course objectives
1. To learn about multivariate statistical
techniques
2. To apply these techniques in a
situation that is meaningful to you and
your research
Secondary course objectives
1. To improve on presentation skills
2. To learn how to prepare a report on
data analysis
3. To prepare students to write the
methods for data analysis and results
sections of their theses
Course philosophy
This course will use a problem-based
learning approach to multivariate
statistical techniques. It is designed to be
applied to real life situations that you may
encounter as you conduct your research.
Course overview
Format
1. Initial lecture on current topic
2. Questions
3. One complete example
4. Case study (ongoing example of
analysis of a data set)
5. Overview and questions
Week 1
An overview
• Research questions
• Exposure and outcome
• Types of variables
• Basic statistical measures
-continuous variables
-categorical variables
• An example: body checking injuries in
children
• Case study 1: Research question
There was this statistics professor who,
when driving his car, would always
accelerate hard before coming to any
intersection, whip straight through it, then
slow down again once he'd got past it.
One day, he took a passenger, who was
understandably unnerved by his driving
style, and asked him why he went so fast
over intersections. The statistics professor
replied, "Well, statistically speaking, you
are far more likely to have an accident at
an intersection, so I just make sure that I
spend less time there."
Research Design and Data Analysis
• All analysis starts with a research question
• The question drives the analysis process
• Questions should be:
-specific (what are you planning to measure?)
-answerable using the data you have
• Should include both an exposure and an outcome
variable
Methodological Considerations:
Exposure and Outcomes
[Figure: causal pathway diagram with the labels Exposure, Outcome, No exposure, and No outcome]
What is exposure?
Some examples from Kinesiology:
- Programs to promote activity
- Exercise
- Balance training
- Use of protective equipment
- Others?
What is an outcome?
Some examples from Kinesiology:
• Usually related to improved health
- Weight loss (BMI)
- Fewer injuries
- Healthier lifestyle
- Increased participation
- Others?
Exposure and Outcomes
[Figure: causal pathway diagram with the labels Exposure, Outcome, No exposure, and No outcome, annotated "Potential for bias, other explanations"]
Types of variables
• Continuous variables
– Variables measured on a numeric scale, with a range of possible values
– e.g., age, blood pressure, weight
• Categorical variables
– Variables that fall into categories
– e.g., gender, smoking status
Basic statistical measures for
continuous variables
• Mean (the average number)
-calculated by summing all the numbers and
dividing by n
• Median (the number in the middle)
-calculated by finding the value at the 50th percentile
• Mode (the most frequent response)
-calculated by counting the number of times
each number occurs and taking the most common one
• Did you hear about the statistician who had
his head in an oven and his feet in a bucket
of ice? When asked how he felt, he replied,
"On the average I feel just fine."
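The mean, median, and mode described on this slide can be sketched in a few lines of Python. This is a minimal illustration using the standard library's statistics module; the ages below are made up for the example and are not course data.

```python
import statistics

# Hypothetical ages (years) of seven participants; invented for illustration.
ages = [10, 11, 11, 12, 13, 13, 13]

mean_age = statistics.mean(ages)      # sum of the values divided by n
median_age = statistics.median(ages)  # middle value (50th percentile)
mode_age = statistics.mode(ages)      # value that occurs most often

print(f"mean={mean_age:.2f}, median={median_age}, mode={mode_age}")
```

With a small, skewed sample like this one the three measures differ, which is one reason to report more than the mean.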
More about statistical measures for
continuous variables
• Standard deviation
• Assesses the variability in the data
• Measure is the square root of the variance
• Variance is calculated from the squared distance of each
measure from the mean
• Easiest to interpret when the data are approximately
normally distributed
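As a rough sketch of how the variance and standard deviation are obtained, the Python below works through the calculation step by step. The weights are invented for illustration, and the code assumes the sample version of the variance (dividing by n - 1), which the slide does not specify.

```python
import math
import statistics

# Hypothetical body weights (kg); invented for illustration.
weights = [60, 62, 65, 70, 73]

mean_w = statistics.mean(weights)

# Squared distance of each measurement from the mean.
squared_dev = [(w - mean_w) ** 2 for w in weights]

# Assumption: sample variance, so the sum is divided by n - 1.
sample_var = sum(squared_dev) / (len(weights) - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_var)

print(f"variance={sample_var:.2f}, sd={sample_sd:.2f}")
```

The same values should come back from statistics.variance(weights) and statistics.stdev(weights), which also use the n - 1 divisor.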
Statistical measures for categorical
variables
• Counts (how many fall within each
category)
• Proportions (what percentage fall within
each category)
• Frequency distributions (comparing
counts and percentages between
categories)
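Counts, proportions, and a simple frequency distribution for a categorical variable can be sketched as follows. The smoking-status responses are hypothetical and exist only to show the mechanics.

```python
from collections import Counter

# Hypothetical smoking-status responses; invented for illustration.
responses = ["never", "former", "never", "current",
             "never", "former", "never", "current"]

counts = Counter(responses)  # how many fall within each category
n = len(responses)

# Percentage of all responses that fall within each category.
percentages = {category: 100 * count / n for category, count in counts.items()}

# Frequency distribution: counts and percentages side by side.
for category, count in counts.most_common():
    print(f"{category:8s} {count:3d} {percentages[category]:5.1f}%")
```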
Example #1
Is a change in the policy related to
body-checking associated with a
change in injuries in youth ice
hockey?
Reporting of frequencies and
proportions
Background
• In 1998/1999 the Ontario Hockey Federation
changed its policy to allow body checking
among Atom rep players (elite hockey players
ages 10 and 11)
• Ontario allows body checking at the Pee Wee
level (players ages 12 and 13)
• Québec does not allow any body checking
until Bantam level (players ages 14 and 15)
Methods
• All children presenting to participating
hospital Emergency Departments with a
hockey-related injury
• Children in Ontario were compared to
children in Québec
• Hospitals participating in the Canadian
Hospitals Injury Reporting and
Prevention Program
Methods
• Exposure variable:
Playing hockey in Ontario compared to
Québec
• Outcome variable:
Injury due to body checking compared
to other hockey injury
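One way to report this exposure and outcome pair is the proportion of hockey injuries due to body checking within each province. The sketch below uses counts invented purely for illustration; they are NOT the study's results, and the helper name checking_proportion is hypothetical.

```python
# Hypothetical counts, invented for illustration only;
# these are NOT the study's results.
injuries = {
    "Ontario": {"checking": 30, "other": 70},
    "Québec": {"checking": 10, "other": 90},
}


def checking_proportion(cell_counts):
    """Return the percentage of injuries that were due to body checking."""
    total = cell_counts["checking"] + cell_counts["other"]
    return 100 * cell_counts["checking"] / total


# Exposure: province. Outcome: checking injury vs. other hockey injury.
proportions = {
    province: checking_proportion(cell_counts)
    for province, cell_counts in injuries.items()
}

for province, pct in proportions.items():
    print(f"{province}: {pct:.1f}% of hockey injuries due to checking")
```

Reporting the proportion rather than the raw count keeps the comparison fair when the two provinces contribute different numbers of injured children.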
Atom Level: Body Checking Injuries
[Figure: Proportion Checking (axis scale 0 to 60) by Hockey Season, 1995/1996 to 2001/2002, Ontario compared with Québec]
Pee Wee Level: Body Checking
Injuries
[Figure: Proportion Checking (axis scale 0 to 60) by Hockey Season, 1995/1996 to 2001/2002, Ontario compared with Québec]
Bantam Level: Body Checking
Injuries
[Figure: Proportion Checking (axis scale 0 to 60) by Hockey Season, 1995/1996 to 2001/2002, Ontario compared with Québec]
Implications for Prevention
• Rule change allowing Atom players to body
check was associated with an increase in
checking injuries
• Increased injuries attributable to body
checking were observed in all age groups
where checking was allowed
• Allowing body checking among younger
players was not associated with a decrease
in injuries later on
Overview
• All analysis starts with a research
question
• Examine the exposure/outcome
relationship
• Different types of variables are
measured and presented differently
For next week
• Read chapters 1, 2 and 4 in the text
• Start thinking about possible data sets