3 Measures of Central Tendency, Dispersion and

LearnStat
Learning Statistics the Easy Way
Session on
MEASURES OF CENTRAL TENDENCY,
DISPERSION AND SKEWNESS
BUREAU OF LABOR AND
EMPLOYMENT STATISTICS
OBJECTIVES
At the end of the session, the participants should be able to:
1. Describe data using the common measures of
central tendency;
2. Describe data in terms of their variability
and skewness; and
3. Determine the most applicable measure of
central tendency given different types of
distribution.
2011 LearnStat Sessions
OUTLINE
1. Measures of Central Tendency
• Mean
• Median
• Mode
2. Measures of Dispersion
3. Skewness
4. Types of Distribution
Measures of Central Tendency
A. MEAN
- commonly referred to as the average or arithmetic
mean.
- most widely used measure of central location.
$$\bar{X} = \frac{\text{Sum of all values in the data set}}{\text{Total number of observations}}$$
Measures of Central Tendency
Example of mean computation
Ages of 13 Job Applicants
Applicant Number    Age
1                   30
2                   28
3                   25
4                   35
5                   25
6                   34
7                   20
8                   19
9                   26
10                  18
11                  17
12                  16
13                  25
Total               318
Mean Age
$$\bar{X} = \frac{30 + 28 + \dots + 25}{13} = \frac{318}{13} \approx 24.5$$
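The mean computation for the 13 applicants can be checked with a few lines of Python. This is only an illustrative sketch: the function name `mean` and the variable names are chosen here and are not part of the session materials.

```python
# Ages of the 13 job applicants, listed in applicant-number order.
ages = [30, 28, 25, 35, 25, 34, 20, 19, 26, 18, 17, 16, 25]

def mean(values):
    """Arithmetic mean: sum of all values divided by the number of observations."""
    return sum(values) / len(values)

total = sum(ages)        # sum of all values in the data set
mean_age = mean(ages)    # total divided by 13 observations

print(total, len(ages), f"{mean_age:.1f}")  # prints: 318 13 24.5
```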
Measures of Central Tendency
B. MEDIAN
- the value of the middle item in a set of
observations which has been arranged in an
ascending or descending order of magnitude.
- is the centermost value in a distribution.
Measures of Central Tendency
Example of finding the median (Number of observations is odd)
Ages of 13 Job Applicants
Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25
3                   25
9                   26
2                   28
1                   30
6                   34
4                   35
The median value is the middlemost value in the data set.
Median age = 25
Measures of Central Tendency
Example of finding the median (Number of observations is even)
Ages of 14 Job Applicants
Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25
3                   26
9                   26
2                   28
1                   30
6                   34
4                   35
14                  35
The median value is the sum of the two middlemost values in the data set divided by 2.
$$\text{Median age} = \frac{25 + 26}{2} = 25.5$$
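The odd and even cases can be handled by one small routine. The sketch below is illustrative only; the function name `median` and the list names are assumptions made for this example, and the data are the sorted ages from the 13-applicant and 14-applicant tables.

```python
def median(values):
    """Middle value of the sorted data; for an even count,
    the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                 # odd number of observations
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even number of observations

ages_13 = [16, 17, 18, 19, 20, 25, 25, 25, 26, 28, 30, 34, 35]
ages_14 = [16, 17, 18, 19, 20, 25, 25, 26, 26, 28, 30, 34, 35, 35]

print(median(ages_13))  # 25
print(median(ages_14))  # 25.5
```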
Measures of Central Tendency
C. MODE
- is the value in the data set that occurs most frequently.
Example of finding the mode
Ages of 13 Job Applicants
Applicant Number    Age
12                  16
11                  17
10                  18
8                   19
7                   20
13                  25
5                   25
3                   25
9                   26
2                   28
1                   30
6                   34
4                   35
Mode = 25 is the value that occurs most frequently.
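Finding the mode amounts to counting how often each value occurs. A minimal sketch follows; the helper name `modes` is an assumption of this example, and it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

ages = [16, 17, 18, 19, 20, 25, 25, 25, 26, 28, 30, 34, 35]
print(modes(ages))  # [25]
```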
Measures of Central Tendency
Advantages of the MEAN:
• takes into account all observations.
• can be used for further statistical calculations and mathematical manipulations.
Disadvantages of the MEAN:
• easily affected by extreme values.
• cannot be computed if there are missing values due to omission or non-response.
• in grouped data with open-ended class intervals, the mean cannot be computed.
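To see how an extreme value pulls the mean, the sketch below adds one hypothetical 70-year-old applicant to the 13 ages used earlier. The extra applicant is invented purely for illustration and does not appear in the session data.

```python
import statistics

ages = [30, 28, 25, 35, 25, 34, 20, 19, 26, 18, 17, 16, 25]
ages_with_extreme = ages + [70]   # hypothetical extreme value, for illustration only

mean_before = statistics.mean(ages)
mean_after = statistics.mean(ages_with_extreme)
median_before = statistics.median(ages)
median_after = statistics.median(ages_with_extreme)

# The mean shifts by more than three years; the median stays at 25.
print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```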
Measures of Central Tendency
Advantages of the MEDIAN:
• not affected by extreme values.
• can be computed even for grouped data with open-ended class intervals.
Disadvantages of the MEDIAN:
• observations from different data sets have to be merged to obtain a new median, whether grouped or ungrouped data are involved.
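The merging requirement contrasts with the mean, which can be combined from group summaries alone. In the sketch below, `group_b` is a hypothetical second data set invented for illustration: the combined mean is recovered from each group's mean and size, while the combined median needs the merged observations.

```python
import math
import statistics

group_a = [16, 17, 18, 19, 20, 25, 25, 25, 26, 28, 30, 34, 35]  # the 13 applicants
group_b = [40, 41, 45]                                          # hypothetical second group

# Combined mean from summaries only: weight each group mean by its size.
n_a, n_b = len(group_a), len(group_b)
combined_mean = (n_a * statistics.mean(group_a) + n_b * statistics.mean(group_b)) / (n_a + n_b)

# Combined median: the two group medians are not enough; merge the data first.
merged_median = statistics.median(group_a + group_b)
average_of_medians = (statistics.median(group_a) + statistics.median(group_b)) / 2

print(f"{combined_mean:.2f}")               # same as the mean of the merged data
print(merged_median, average_of_medians)    # 25.5 33.0 (averaging medians is wrong)
```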
Measures of Central Tendency
Advantage of the MODE:
• can be easily identified through ocular inspection.
Disadvantages of the MODE:
• does not possess the desired algebraic property of the mean that allows further manipulations.
• like the median, observations from different data sets have to be merged to obtain a new mode, whether grouped or ungrouped data are involved.
MEASURES OF DISPERSION
Let us take 5 sets of observations:
Set 1:   45   45   47   48   50
Set 2:   45   46   46   48   50
Set 3:   44   45   46   49   51
Set 4:   41   43   48   48   55
Set 5:   44   45   48   49   49
$\bar{x} = 47$ for each set
Questions remain unanswered even after getting the mean:
How variable are the data sets?
How do the values in each data set differ from each other?
How are the values in each data set clustered or dispersed
from each other?
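A quick check shows why the mean alone leaves these questions open. The sketch below assumes the 25 values are read column-wise (the first value of Sets 1 to 5, then the second value of each, and so on), which is the reading under which every set averages 47; the dictionary name `sets` is chosen for this example.

```python
sets = {
    "Set 1": [45, 45, 47, 48, 50],
    "Set 2": [45, 46, 46, 48, 50],
    "Set 3": [44, 45, 46, 49, 51],
    "Set 4": [41, 43, 48, 48, 55],
    "Set 5": [44, 45, 48, 49, 49],
}

means = {name: sum(values) / len(values) for name, values in sets.items()}
ranges = {name: max(values) - min(values) for name, values in sets.items()}

for name in sets:
    # Every set has the same mean, but the spread of values differs.
    print(name, "mean =", means[name], "range =", ranges[name])
```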
Measures of Dispersion
- group of analytical tools that describes the spread
or variability of a data set.
Importance of the measures of dispersion
• supplements an average or a measure of
central tendency
• compares one group of data with another
• indicates how representative the average
is.
A measure of dispersion can be expressed in several ways:
• Range and Quartile Deviation: based on the position of an observation in a distribution
• Mean Absolute Deviation and Variance/Standard Deviation: measure the dispersion around an average
• Coefficient of Variation: expressed in a relative value
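The session names these measures without giving formulas, so the sketch below uses their common textbook definitions. Several choices are assumptions of this example: the data are taken to be one of the five sets of observations (41, 43, 48, 48, 55), the population variance is used rather than the sample variance, and the quartiles follow the default method of Python's `statistics.quantiles`. Other quartile conventions give slightly different quartile deviations.

```python
import math
import statistics

data = [41, 43, 48, 48, 55]   # assumed to be Set 4 of the five sets of observations
mean = statistics.mean(data)

# Range: largest value minus smallest value.
value_range = max(data) - min(data)

# Quartile deviation: half the distance between the third and first quartiles.
# statistics.quantiles with its default method is an assumption of this sketch.
q1, _q2, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

# Mean absolute deviation: average distance of each value from the mean.
mean_abs_deviation = sum(abs(x - mean) for x in data) / len(data)

# Variance and standard deviation (population versions, assumed here).
variance = statistics.pvariance(data)
std_dev = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean, in percent.
coeff_variation = std_dev / mean * 100

print(value_range, mean_abs_deviation, round(variance, 1))
print(round(quartile_deviation, 2), round(std_dev, 2), round(coeff_variation, 1))
```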
SKEWNESS
• describes the degree to which the data deviates from
symmetry.
• when the distribution of the data is not symmetrical, it
is said to be asymmetrical or skewed.
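The session does not give a formula for skewness. One common way to put a number on it, shown here only as an assumed illustration, is the moment coefficient of skewness: the average cubed deviation from the mean divided by the standard deviation cubed. It is zero for symmetrical data, positive when the long tail is on the right, and negative when it is on the left. The function name `skewness` and the sample lists are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness (population form).
    Undefined when all values are identical (zero variance)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # hypothetical data
right_tail = [1, 1, 1, 2, 10]     # a few extremely high values
left_tail = [1, 9, 10, 10, 10]    # a few extremely low values

print(skewness(symmetric), skewness(right_tail) > 0, skewness(left_tail) < 0)
```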
Types of Distribution
(In Relation to Mean, Median and Mode)
Symmetrical/Normal Distribution
• Bell-shaped distribution
• The mean, median and mode are
all located at one point.
Mean = Median = Mode
Positively Skewed Distribution
• Observations are mostly concentrated towards the smaller values and there are some extremely high values.
• Also called skewed to the right distribution.
[Figure: positively skewed distribution of Income (horizontal axis) against No. of observations (vertical axis), with the Mode, Median and Mean marked from left to right]
Mode < Median < Mean
Negatively Skewed Distribution
• Observations are mostly concentrated towards the larger values and there are some extremely low values.
• Also called skewed to the left distribution.
[Figure: negatively skewed distribution of Age of BLES staff (horizontal axis) against No. of observations (vertical axis), with the Mean, Median and Mode marked from left to right]
Mean < Median < Mode
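The two orderings for skewed distributions can be reproduced on small data sets. Both lists below are hypothetical values invented for illustration; they are not BLES data. The orderings are typical of skewed data rather than guaranteed for every data set.

```python
import statistics

# Hypothetical incomes: mostly small values with one extremely high value.
incomes = [10, 10, 10, 12, 12, 15, 20, 30, 80]
# Hypothetical ages: mostly large values with one extremely low value.
staff_ages = [22, 40, 48, 52, 55, 55, 58, 58, 58]

def summary(values):
    """Return (mode, median, mean) for a data set with a single mode."""
    return statistics.mode(values), statistics.median(values), statistics.mean(values)

inc_mode, inc_median, inc_mean = summary(incomes)
age_mode, age_median, age_mean = summary(staff_ages)

print(inc_mode < inc_median < inc_mean)   # positively skewed: Mode < Median < Mean
print(age_mean < age_median < age_mode)   # negatively skewed: Mean < Median < Mode
```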
Considerations to be made when using the three most common measures of central tendency:
Distribution   Level of Measurement   Measure to Use   Other Considerations
Normal         Interval or Ratio      Mean             • When further statistical calculations or mathematical manipulations are needed
                                                       • When all observations are considered in the computation
Skewed         Ordinal                Median           • When distribution has open-ended intervals
Skewed         Nominal                Mode             • When interested in the most frequently occurring observation
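The considerations can be read as a simple lookup from distribution and level of measurement to a suggested measure. The sketch below encodes only the three combinations stated in the session (normal with interval or ratio data, skewed with ordinal data, skewed with nominal data); the function name `suggested_measure` is hypothetical, and any combination not listed returns `None` instead of guessing.

```python
# Only the combinations stated in the session are encoded.
GUIDE = {
    ("normal", "interval"): "mean",
    ("normal", "ratio"): "mean",
    ("skewed", "ordinal"): "median",
    ("skewed", "nominal"): "mode",
}

def suggested_measure(distribution, level):
    """Look up the suggested measure of central tendency, or None if not listed."""
    return GUIDE.get((distribution.lower(), level.lower()))

print(suggested_measure("Normal", "Ratio"))     # mean
print(suggested_measure("Skewed", "Ordinal"))   # median
print(suggested_measure("Skewed", "Nominal"))   # mode
```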
Special Topic on Rounding Off
Rules for Rounding off Numbers:
• If the first digit to be dropped is less than 5, round down.
• If the first digit to be dropped is greater than or equal to 5, round up.
Examples:
• Round off 185.5 into a whole number: 186
• Round off 185.468 into a whole number: 185
• Round off 184.51 into a whole number: 185
• Round off 2.0547 into one decimal place: 2.1
• Round off 2.073 into two decimal places: 2.07
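These rules correspond to "round half up". A caution when scripting them: Python's built-in `round` sends exact ties to the nearest even digit (for example, `round(2.5)` returns 2), so it does not always follow the rule above. The sketch below uses the standard `decimal` module instead; the helper name `round_half_up` is an assumption of this example.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places=0):
    """Round so that a first dropped digit of 5 or more rounds up.
    Pass the value as a string (or Decimal) to avoid binary float surprises."""
    quantum = Decimal(1).scaleb(-places)   # places=2 gives Decimal('0.01')
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up("185.5"))       # 186
print(round_half_up("185.468"))     # 185
print(round_half_up("184.51"))      # 185
print(round_half_up("2.0547", 1))   # 2.1
print(round_half_up("2.073", 2))    # 2.07
```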
More Examples:
1. Manual Computation
• 2010 labor productivity (at constant 2000 prices) = (GDP/Employed)
$$\frac{5{,}701{,}539\text{M}}{36.035\text{M}^{*}} = 158{,}222.26 \approx 158{,}222$$
• Region VI employment growth rate (2009-2010):
$$\text{Growth Rate} = \left(\frac{2{,}974^{*}}{2{,}883^{*}} - 1\right) \times 100 = (1.03156 - 1) \times 100 = 0.03156 \times 100 = 3.156\% \approx 3.2\%$$
*In LFS, figures are expressed in thousands.
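The two manual computations can be reproduced as follows. This is a sketch with variable names chosen for illustration; it assumes 2,974 is the 2010 employment level and 2,883 the 2009 level, both in thousands. The built-in `round` is adequate here because neither result is an exact tie.

```python
gdp_millions = 5_701_539       # 2010 GDP at constant 2000 prices, in millions
employed_millions = 36.035     # employed persons, in millions

labor_productivity = gdp_millions / employed_millions
print(round(labor_productivity))   # 158222

employment_2009 = 2_883   # Region VI employment, in thousands (assumed 2009)
employment_2010 = 2_974   # Region VI employment, in thousands (assumed 2010)

growth_rate = (employment_2010 / employment_2009 - 1) * 100
print(round(growth_rate, 1))       # 3.2
```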
2. Electronic Computation
In Microsoft Excel, you can use the following syntax:
=round(value to be rounded off, number of decimal places to be retained)
The value to be rounded off can be a single number or a formula to obtain a single number.
Example:
• Round off 275.689 into two decimal places:
=round(275.689, 2) = 275.69
• 2010 labor productivity at constant 2000 prices:
=round((5,701,539/36,035) × 1,000, 0) = 158,222
Labor Productivity Worksheet
Growth Rate Worksheet