Midterm 1 Review
Dr. Joseph Brennan
Math 148, BU
Dr. Joseph Brennan (Math 148, BU)
Midterm 1 Review
1 / 17
Chapter 1
Vocabulary of Design:
(Experimental) Units and Subjects:
The individuals on which a study is performed are the (experimental)
units. When the units are human, they are called subjects.
Treatments and Treatment Groups:
A specific condition applied to the units in a study is called a treatment.
Control Treatment and Control Group:
Statistical studies have a benchmark treatment, called the control
treatment. A group that receives the control treatment is called the
control group. Subjects in the control group do one of the following:
receive no treatment at all,
receive some well-known and widely used drug, or
receive a placebo treatment.
An experiment is blind when subjects do not know which treatment they
are receiving.
In a double-blind experiment the subjects do not know whether they are
in the treatment or in the control group; neither do the researchers who
evaluate the response.
A dummy treatment given to units in a control group is called a placebo.
The positive response to a dummy treatment is called the placebo effect.
The placebo effect is a psychological phenomenon.
Response is the measured result of a study. Response is taken from each
experimental unit.
Response Variables are the variables that we are trying to study and
predict.
A variable that may influence the response but is not included in the
study is called a confounding variable (confounder).
The entire group of individuals that we want information about is called
the (target) population. Any subset of the population is a sample of the
population.
A census collects responses from every individual in a population.
The GOAL of the population-sample relationship is to extend conclusions
made from analyzing a sample to the entire population.
Chapter 2
Main Types of Observational Studies
An observational study observes individuals and measures responses but
does not attempt to influence the responses. The investigator does not
assign individuals to different treatments. Instead, he or she selects
individuals from the population who have the condition whose effects are
being studied; this is the treatment group. Individuals without the
condition comprise the control group.
Because of potential hidden confounding factors, an observational study
may only establish an ASSOCIATION between the treatment and the
response, but NOT the cause-effect relationship.
Association ≠ Causation!
CAUSATION can be established only from well-designed experimental
studies which have better control over the confounding factors.
Cross-sectional Study (or sample survey): provides information
about a population based on a sample from the population at a
specific time point.
Longitudinal Study (prospective study): selects a sample from the
population and follows the sample forward in time to observe the
development of certain conditions (for example, the occurrence of a
disease).
Retrospective Study (case-control study): An observational study
of archived data. A retrospective study looks backwards and examines
variables in relation to an outcome that is pre-established.
EXPERIMENTAL STUDY
A study based on deliberately imposing some treatment(s) on
individuals to observe their responses.
The placebo treatment is only relevant for experimental studies.
A randomized controlled experiment has the following principles:
1. Control;
2. Randomization;
3. Repetition;
4. Blinding.
Well-designed experiments establish the CAUSE-EFFECT
relationship between the treatment and response.
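The randomization principle can be illustrated with a short sketch. It assumes a simple design with one treatment group and one control group of (nearly) equal size; the function name `randomize`, the subject labels, and the seed are hypothetical and chosen only for illustration.

```python
import random

def randomize(units, seed=None):
    """Randomly split units into a treatment group and a control group."""
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(units)      # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)       # chance, not the investigator, decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 11)]
treatment, control = randomize(subjects, seed=148)
print("treatment:", treatment)
print("control:  ", control)
```

Because chance decides the assignment, confounding variables tend to be balanced across the two groups.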
Chapter 3
A qualitative variable places an individual into one of several groups or
categories.
A quantitative variable takes numerical values for which arithmetic
operations (such as adding and averaging) make sense.
There are 3 main types of histograms:
In a density histogram the height of each bar is the percent (or
proportion) of observations per unit width of the class interval.
In a frequency histogram the height of each bar is equal to the
actual count of observations in the class interval.
In a relative frequency histogram the height of each bar is equal to
the proportion or percentage of observations in the class interval.
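The three kinds of bar height can be compared on one small data set. The sketch below is only an illustration: the data, the bin edges, and the helper name `histogram_heights` are made up, and it assumes each class interval includes its left endpoint (the last interval also includes its right endpoint).

```python
def histogram_heights(data, edges):
    """Return (frequency, relative frequency, density) bar heights."""
    n = len(data)
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bin i is [left, right); the last bin also includes its right edge.
            if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                counts[i] += 1
                break
    relative = [c / n for c in counts]
    # Density = proportion per unit width of the class interval.
    density = [r / (edges[i + 1] - edges[i]) for i, r in enumerate(relative)]
    return counts, relative, density

data = [1, 2, 2, 3, 5, 6, 6, 7, 8, 9]
edges = [0, 2, 4, 10]   # class intervals of unequal width
counts, relative, density = histogram_heights(data, edges)
widths = [edges[i + 1] - edges[i] for i in range(len(counts))]
total_area = sum(d * w for d, w in zip(density, widths))
print(counts, relative, density, total_area)
```

With unequal class widths only the density histogram represents proportions by area, which is why its bar areas add up to 1.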
The number of bins k (of equal width) for a data set of size n is found
using Sturges' formula:
$$ k = 1 + 3.322 \log_{10}(n) $$
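A minimal sketch of Sturges' formula follows. The slides do not say how to turn a fractional k into a whole number of bins; rounding up is an assumption made here, and the function name `sturges_bins` is hypothetical.

```python
import math

def sturges_bins(n):
    """Number of equal-width bins for n observations via Sturges' formula."""
    k = 1 + 3.322 * math.log10(n)
    return math.ceil(k)   # assumption: round up to a whole number of bins

for n in (30, 100, 1000):
    print(n, sturges_bins(n))
```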
The total area of a density histogram equals 1 (or 100%).
The shape of a distribution:
Number of modes: Unimodal, Bimodal, Multi-modal.
Symmetry and Skew.
Chapter 4
The center of a distribution:
Mode: The number that occurs most frequently in a given data set.
Mean: The numerical center of data. The mean x̄ for a set of
observations is determined by adding all values together and dividing
by the number of observations.
$$ \bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
Median: The midpoint of a distribution. For a data set the median is
the number such that half of observations are smaller and the other
half are larger.
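The three measures of center can be computed with Python's standard `statistics` module. The data values below are made up for illustration; the second list replaces the largest value with a far larger one to show how each measure reacts.

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7, 10]
center = {"mode": mode(data), "mean": mean(data), "median": median(data)}
print(center)

# Replace the largest value with a much larger one and recompute.
stretched = [2, 3, 3, 5, 7, 100]
center_stretched = {
    "mode": mode(stretched),
    "mean": mean(stretched),
    "median": median(stretched),
}
print(center_stretched)
```

The mean moves toward the extreme value while the median and mode stay put, which is one reason the median is often preferred for skewed data.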
The spread of a distribution:
Standard Deviation: s measures the spread about the mean. Among the
measures of center, the standard deviation is tied only to the
mean.
$$ s = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} $$
Interquartile Range: Q3 − Q1 where Q3 is the third quartile (75th
percentile) and Q1 is the first quartile (25th percentile).
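Both measures of spread can be sketched in a few lines. The standard deviation below divides by n, matching the formula on this slide (many software packages divide by n − 1 instead). Quartile conventions also differ between textbooks and software; the `"inclusive"` interpolation method used here is an assumption, not necessarily the course's convention. The data and function names are made up.

```python
import math
from statistics import mean, quantiles

def sd_divide_by_n(data):
    """Standard deviation using the 1/n formula from the slide."""
    m = mean(data)
    return math.sqrt(sum((x - m) ** 2 for x in data) / len(data))

def iqr(data):
    """Interquartile range Q3 - Q1 (assumed 'inclusive' quartile method)."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

data = [2, 4, 4, 4, 5, 5, 7, 9]
s = sd_divide_by_n(data)
spread_iqr = iqr(data)
print(s, spread_iqr)
```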
Effects of a General Linear Transformation
If we act upon a data set by any linear transformation
xnew = bx + a,
the measures of center and spread change as follows:
Mean                  x̄new = b·x̄ + a
Median                x̃new = b·x̃ + a
1st Quartile          Q1,new = b·Q1 + a
3rd Quartile          Q3,new = b·Q3 + a
Standard Deviation    snew = |b| · s
Interquartile Range   IQRnew = |b| · IQR
where | · | denotes absolute value. The quartile formulas assume b > 0;
when b < 0 the order of the data reverses, so Q1,new = b·Q3 + a and
Q3,new = b·Q1 + a.
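These rules can be checked numerically. The sketch below uses a made-up list of Celsius temperatures and the conversion F = 1.8·C + 32 (so b = 1.8 and a = 32), then repeats the check for the spread with a negative b. `pstdev` is the standard-library standard deviation that divides by n.

```python
import math
from statistics import mean, median, pstdev

celsius = [12.0, 15.0, 15.0, 18.0, 21.0, 30.0]
a, b = 32.0, 1.8
fahrenheit = [b * x + a for x in celsius]

# Center measures are both scaled and shifted.
mean_ok = math.isclose(mean(fahrenheit), b * mean(celsius) + a)
median_ok = math.isclose(median(fahrenheit), b * median(celsius) + a)
# Spread measures are scaled by |b| only; the shift a has no effect.
sd_ok = math.isclose(pstdev(fahrenheit), abs(b) * pstdev(celsius))

# With a negative multiplier the spread still scales by |b|.
flipped = [-2.0 * x + 5.0 for x in celsius]
sd_negative_ok = math.isclose(pstdev(flipped), 2.0 * pstdev(celsius))

print(mean_ok, median_ok, sd_ok, sd_negative_ok)
```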
Chapter 5
Properties of the (standard) normal curve:
Symmetric about zero,
Unimodal,
The mean, median, and mode are equal,
Bell-shaped,
The mean µ = 0 and the standard deviation σ = 1,
The area under the whole normal curve is 100% (or 1, if you use decimals).
Figure: Normal curve and the percentage of observations under it. The
horizontal scale uses standard units z.
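Since the figure itself is not reproduced here, the areas under the standard normal curve can be recomputed. This sketch uses `NormalDist` from Python's standard library; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(z):
    """Area under the standard normal curve between -z and +z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1, 2, 3):
    print(f"within {z} standard unit(s): {100 * area_within(z):.2f}%")
```

The three values are roughly 68%, 95%, and 99.7%, the usual rule of thumb for the normal curve.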
z-Score:
The transformation of data into standard units, used for the normal
approximation:
$$ z = \frac{\text{observation} - \text{mean}}{\text{standard deviation}} $$
Example: Recall that IQ scores are standardized to have a mean x̄ of 100
and a standard deviation s of 15.
(a) What percentage of people have an average IQ, between 90 and 110?
$$ z_{90} = \frac{90 - 100}{15} \approx -0.67 \qquad z_{110} = \frac{110 - 100}{15} \approx 0.67 $$
(b) What percentage of people have an extreme IQ, lower than 70 or
higher than 130?
$$ z_{70} = \frac{70 - 100}{15} = -2 \qquad z_{130} = \frac{130 - 100}{15} = 2 $$
(c) MENSA requires an IQ in the 98th percentile to join their
organization. What IQ score is considered 98th percentile?
The 98th percentile corresponds to a z-score of 2.05. What IQ has a
z-score of 2.05?
$$ 2.05 = \frac{x - 100}{15} \;\Rightarrow\; (2.05)(15) = x - 100 \;\Rightarrow\; x = 130.75 $$
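The three parts of this example can be checked numerically. The sketch below assumes IQ scores follow the normal curve exactly, which is the normal approximation used above, and relies on `NormalDist` from Python's standard library in place of a printed normal table. The variable names are hypothetical.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)   # assumed exactly normal
standard_normal = NormalDist()      # mean 0, standard deviation 1

# (a) Proportion of scores between 90 and 110.
pct_average = iq.cdf(110) - iq.cdf(90)

# (b) Proportion of scores below 70 or above 130.
pct_extreme = iq.cdf(70) + (1 - iq.cdf(130))

# (c) Score at the 98th percentile: find z first, then undo the z-score.
z_98 = standard_normal.inv_cdf(0.98)
iq_98 = 100 + z_98 * 15

print(f"(a) {100 * pct_average:.1f}%")
print(f"(b) {100 * pct_extreme:.2f}%")
print(f"(c) z = {z_98:.3f}, IQ = {iq_98:.1f}")
```

The answer to (c) comes out slightly above 130.75 because the z-score is not rounded to 2.05 before converting back to an IQ score.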
Chapter 6
There are 3 MAIN SOURCES of measurement errors:
1. Chance Error: Chance error is present in every measurement and is
difficult to determine. Chance errors are random and are as likely to
undervalue a measurement as to overvalue it. If n measurements have been
taken with a mean of x̄ and a standard deviation of s, then we say:
"The next measurement will be x̄ give or take s."
2. Outliers: Observations more than 3 standard deviations from the mean
are considered to be extreme and are treated as potential outliers.
3. Bias (Systematic Error): A phenomenon affecting all measurements the
same way, pushing them in the same direction. Bias, unlike chance error,
is not detectable through multiple measurements.
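The 3-standard-deviation rule for outliers can be sketched as follows. The measurements are made up (twenty readings near 10 plus one stray reading of 15), the function name `potential_outliers` is hypothetical, and the standard deviation divides by n, as in the Chapter 4 formula.

```python
from statistics import mean, pstdev

def potential_outliers(measurements, threshold=3.0):
    """Return measurements more than `threshold` SDs away from the mean."""
    m = mean(measurements)
    s = pstdev(measurements)   # standard deviation with the 1/n formula
    return [x for x in measurements if abs(x - m) > threshold * s]

measurements = [10.0, 10.1, 9.9, 10.2, 9.8] * 4 + [15.0]
flagged = potential_outliers(measurements)
print("mean:", mean(measurements), "SD:", pstdev(measurements))
print("potential outliers:", flagged)
```

A check like this can flag chance errors that are unusually large, but it cannot reveal bias, because bias shifts every measurement in the same direction.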