Introduction to R An example of data analysis

Example: BMI
Reading data
Gender and age
Height and weight
Inference
Introduction to R
An example of data analysis
A. Blejec
July 29, 2011
What about the BMI?
Data analysis: BMI
To show the flavor of data analysis in R, we will analyze a small
dataset of people's heights and weights. Many people watch their
body weight. It is common knowledge that weight increases with
height. To compensate for the influence of height on weight, the
Body Mass Index (BMI) was introduced. It is calculated as

$$\mathrm{BMI} = \frac{\text{weight}}{\text{height}^2}$$

where weight is measured in kilograms and height in meters.
Our analysis will investigate the weights of different gender and
age groups, and the influence of height on weight and on the
calculated BMI.
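As a quick check of the formula, the following sketch computes the BMI for one person. The values 73.6 kg and 1.730 m are used for illustration only, and the names `weight_kg`, `height_m`, and `bmi` are assumptions that do not appear elsewhere in these slides.

```r
# Body Mass Index: weight in kilograms divided by squared height in meters.
weight_kg <- 73.6
height_m <- 1.730
bmi <- weight_kg / height_m^2
print(round(bmi, 2))  # 24.59
```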
Data file: bmiall.txt
gender age weight height
M 17 73.6 1.730
M 17 71.0 1.765
M 17 62.4 1.770
M 17 71.0 1.870
M 17 72.4 1.765
...
F 18 52.6 1.626
F 18 46.2 1.624
F 18 52.4 1.638
F 18 54.0 1.630
F 18 55.2 1.690
F 18 55.4 1.677
Reading data
> bmiData <- read.table("../data/bmiall.txt",
+     header = TRUE)
> head(bmiData)
  gender age weight height
1      M  17   73.6  1.730
2      M  17   71.0  1.765
3      M  17   62.4  1.770
4      M  17   71.0  1.870
5      M  17   72.4  1.765
6      M  17  104.0  1.825
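The transcript dates from 2011, when read.table() converted text columns to factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so a sketch that reproduces the factor column (on which later slides rely, for example when indexing colors by gender) might look like the following. The inline text stands in for the data file, and the names `txt` and `bmiSample` are assumptions.

```r
# A few records in the same layout as bmiall.txt, supplied inline
# so that the example does not depend on the file being present.
txt <- "gender age weight height
M 17 73.6 1.730
M 17 71.0 1.765
F 18 52.6 1.626
F 18 46.2 1.624"

# stringsAsFactors = TRUE asks for gender to be read as a factor,
# which was the default behaviour before R 4.0.0.
bmiSample <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
str(bmiSample)
```

With the real file, the same argument can be added to the read.table() call shown on the slide.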
Get info about the data
> str(bmiData)
'data.frame':   419 obs. of  4 variables:
 $ gender: Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2
 $ age   : int  17 17 17 17 17 17 17 17 17 17 ...
 $ weight: num  73.6 71 62.4 71 72.4 104 70.4 79.8 63.4
 $ height: num  1.73 1.76 1.77 1.87 1.76 ...
> dim(bmiData)
[1] 419   4
> n <- dim(bmiData)[1]
Data summary
> summary(bmiData)
 gender       age            weight           height
 F:205   Min.   :17.00   Min.   : 44.80   Min.   :1.502
 M:214   1st Qu.:17.00   1st Qu.: 57.20   1st Qu.:1.652
         Median :17.00   Median : 63.20   Median :1.720
         Mean   :17.49   Mean   : 64.59   Mean   :1.720
         3rd Qu.:18.00   3rd Qu.: 71.00   3rd Qu.:1.780
         Max.   :18.00   Max.   :104.00   Max.   :1.970
Gender and age tables
> attach(bmiData)
> table(gender)
gender
  F   M
205 214
> table(gender, age)
      age
gender  17  18
     F 101 104
     M 112 102
Flat contingency table
> highWeight <- weight > mean(weight)
> ftable(gender, age, highWeight)
           highWeight FALSE TRUE
gender age
F      17                80   21
       18                80   24
M      17                40   72
       18                29   73
Contingency tables ...
> (tbl <- xtabs(~highWeight + gender + age))
, , age = 17
gender
highWeight F M
FALSE 80 40
TRUE 21 72
, , age = 18
gender
highWeight F M
FALSE 80 29
TRUE 24 73
... and independence test
> summary(tbl)
Call: xtabs(formula = ~highWeight + gender + age)
Number of cases in table: 419
Number of factors: 3
Test for independence of all factors:
Chisq = 89.81, df = 4, p-value = 1.448e-18
> as.data.frame.table(tbl)
  highWeight gender age Freq
1      FALSE      F  17   80
2       TRUE      F  17   21
3      FALSE      M  17   40
4       TRUE      M  17   72
5      FALSE      F  18   80
6       TRUE      F  18   24
7      FALSE      M  18   29
8       TRUE      M  18   73
Total, row, and column proportions: prop.table()
> X <- table(gender, age)
> prop.table(X)
      age
gender        17        18
     F 0.2410501 0.2482100
     M 0.2673031 0.2434368
> prop.table(X, 1)
      age
gender        17        18
     F 0.4926829 0.5073171
     M 0.5233645 0.4766355
> colP <- prop.table(X, 2)
> round(colP * 100, 1)
      age
gender   17   18
     F 47.4 50.5
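The three calls differ only in the margin over which the counts are normalized. A minimal, self-contained sketch, rebuilding the gender by age counts by hand from the table(gender, age) output (the matrix construction and the names `total`, `byRow`, and `byCol` are assumptions made for illustration):

```r
# Counts of gender by age, copied from the table(gender, age) output.
X <- matrix(c(101, 112, 104, 102), nrow = 2,
            dimnames = list(gender = c("F", "M"), age = c("17", "18")))

total <- prop.table(X)      # each cell divided by the grand total
byRow <- prop.table(X, 1)   # each row sums to 1
byCol <- prop.table(X, 2)   # each column sums to 1

print(rowSums(byRow))
print(colSums(byCol))
```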
Marginal frequencies
Total number of cases
> margin.table(X)
[1] 419
and margins, first by rows and then by columns
> margin.table(X, 1)
gender
  F   M
205 214
> margin.table(X, 2)
age
 17  18
213 206
Function apply()
> byRow <- 1
> byColumn <- 2
> (Rsum <- apply(X, byRow, sum))
  F   M
205 214
> (Csum <- apply(X, byColumn, sum))
 17  18
213 206
> X/Rsum
      age
gender        17        18
     F 0.4926829 0.5073171
     M 0.5233645 0.4766355
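X/Rsum yields row proportions because R stores matrices column by column and recycles the shorter vector down each column, so every element of row i is divided by Rsum[i]. The same shortcut does not give column proportions: X/Csum would be recycled in the same way and silently divide by the wrong totals. One possible way to divide by columns is sweep(), sketched here on a hand-built copy of the table (the matrix construction and the names `rowP` and `colP` are assumptions for illustration):

```r
X <- matrix(c(101, 112, 104, 102), nrow = 2,
            dimnames = list(gender = c("F", "M"), age = c("17", "18")))
Rsum <- apply(X, 1, sum)
Csum <- apply(X, 2, sum)

# Row proportions: Rsum is recycled down each column,
# so row i is divided by Rsum[i].
rowP <- X / Rsum

# Column proportions: sweep() divides column j by Csum[j].
colP <- sweep(X, 2, Csum, "/")
print(round(colP * 100, 1))
```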
Plot table
> Q3 <- quantile(weight, 0.75)
> select <- (weight > Q3)
> X <- table(gender[select], age[select])
> print(X)

    17 18
  F  4  8
  M 43 49
> heading <- paste("Weight above Q3 = ", Q3)
Mosaic plot
> plot(X, main = heading)
[Figure: mosaic plot of gender (F, M) by age (17, 18), titled "Weight above Q3 = 71"]
Barplot
> barplot(X, beside = TRUE, col = c("pink",
+     "lightblue"), main = heading)
[Figure: grouped barplot titled "Weight above Q3 = 71", with bar groups for ages 17 and 18 and a count axis from 0 to 40]
Numerical variables and summary statistics
> mean(weight)
[1] 64.5883
> mean(height)
[1] 1.719964
> c(sd(weight), sd(height))
[1] 10.53051077 0.08752747
> (V <- var(cbind(weight, height)))
            weight      height
weight 110.8916572 0.601565848
height   0.6015658 0.007661059
> cor(weight, height)
[1] 0.6526635
> my.cor <- V[1, 2]/(sd(weight) * sd(height))
> cat("Correlation r =", my.cor, "\n")
Correlation r = 0.6526635
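The manual calculation uses the definition of the correlation coefficient as the covariance divided by the product of the two standard deviations. A small self-contained sketch of the same check, using only the first five records listed in the data file (the names `r_manual` and `r_builtin` are assumptions for illustration):

```r
# First five records of the data file.
weight <- c(73.6, 71.0, 62.4, 71.0, 72.4)
height <- c(1.730, 1.765, 1.770, 1.870, 1.765)

V <- var(cbind(weight, height))   # 2 x 2 covariance matrix
# Covariance divided by the product of the standard deviations.
r_manual <- V["weight", "height"] /
    sqrt(V["weight", "weight"] * V["height", "height"])
r_builtin <- cor(weight, height)
print(c(manual = r_manual, builtin = r_builtin))
```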
Are there differences in weight and height among gender and age
classes?
> aggregate(cbind(weight, height), list(age,
+     gender), mean)
  Group.1 Group.2   weight   height
1      17       F 58.51881 1.650644
2      18       F 59.42500 1.656644
3      17       M 69.12500 1.775857
4      18       M 70.88137 1.791794
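The grouping columns come out as Group.1 and Group.2 because the grouping list is unnamed. One alternative, sketched here under the assumption that the variables live in a data frame, is the formula interface of aggregate(), which keeps the original column names. The small data frame `d` below is made up for illustration and is not the slide data.

```r
# Illustrative records only; the ages are not those of the real file.
d <- data.frame(
    gender = factor(c("M", "M", "M", "F", "F", "F")),
    age    = c(17, 17, 18, 18, 18, 17),
    weight = c(73.6, 71.0, 62.4, 52.6, 46.2, 52.4),
    height = c(1.730, 1.765, 1.770, 1.626, 1.624, 1.638)
)

# Mean weight and height for every age by gender combination.
means <- aggregate(cbind(weight, height) ~ age + gender, data = d, FUN = mean)
print(means)
```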
Grand tour:
Data, histogram, boxplot, and quantile plot
> x <- height
> oldpar <- par(mfrow = c(2, 2))
> plot(x)
> hist(x, col = "lightblue")
> boxplot(x, col = "lightblue")
> qqnorm(x)
> par(oldpar)
Grand tour
[Figure: 2 x 2 panel for height: index plot of x, "Histogram of x" (y axis: Frequency), boxplot, and "Normal Q-Q Plot" (y axis: Sample Quantiles)]
Informative plots
> x <- height
> oldpar <- par(mfrow = c(2, 2))
> plot(x)
> hist(x, col = "lightblue")
> boxplot(x, col = "lightblue")
> qqnorm(x)
> par(oldpar)
Define a function gtour()
> gtour <- function(x, color = "lightblue") {
+     oldpar <- par(mfrow = c(2, 2))
+     plot(x)
+     hist(x, col = color)
+     boxplot(x, col = color)
+     qqnorm(x)
+     par(oldpar)
+ }
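If one of the plotting calls fails, the final par(oldpar) is never reached and the 2 x 2 layout stays active. A possible variant, not part of the original slides, registers the reset with on.exit() so that it runs however the function ends. The name `gtour2` and the use of a null PDF device are assumptions made so the sketch runs without opening a window.

```r
gtour2 <- function(x, color = "lightblue") {
    oldpar <- par(mfrow = c(2, 2))
    # Restore the previous layout when the function exits, even after an error.
    on.exit(par(oldpar))
    plot(x)
    hist(x, col = color)
    boxplot(x, col = color)
    qqnorm(x)
    invisible(NULL)
}

pdf(NULL)   # null graphics device, so nothing is written to disk
gtour2(seq(1.5, 1.9, length.out = 50), color = "lightgreen")
print(par("mfrow"))
dev.off()
```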
> gtour(weight, color = "lightgreen")
[Figure: 2 x 2 panel for weight produced by gtour(): index plot of x, "Histogram of x" (y axis: Frequency), boxplot, and "Normal Q-Q Plot" (y axis: Sample Quantiles)]
I am heavy because I am tall :)
> oldpar <- par(mfrow = c(2, 2))
> plot(height, weight)
> title(1)
> plot(height, weight, pch = as.character(gender))
> title(2)
> col <- c("red", "blue")[as.numeric(factor(gender))]
> plot(height, weight, col = col, cex = 1.5,
+     lwd = 2)
> title(3)
> plot(height, weight, col = col, xlim = c(1.4,
+     2), ylim = c(40, 110))
> title(4)
I am heavy because I am tall :)
[Figure: four scatterplots of weight against height, titled 1 to 4: (1) default symbols, (2) points drawn as the gender letters F and M, (3) larger points colored by gender, (4) colored points with axis limits 1.4 to 2.0 for height and 40 to 110 for weight]
I am heavy because I am tall :)
> for (g in c("F", "M")) {
+     select <- (gender == g)
+     x <- height[select]
+     y <- weight[select]
+     rfit <- lm(y ~ x)
+     col <- c("darkred", "darkblue")[(g ==
+         "M") + 1]
+     abline(rfit, col = col, lwd = 4)
+     cat("Gender:", g, "\n")
+     print(rfit)
+ }
I am heavy because I am tall :)
Gender: F

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -29.63        53.58

Gender: M

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -81.98        85.20
I am heavy because I am tall :)
[Figure: the four scatterplots of weight against height, titled 1 to 4, shown again after the per-gender regression lines were added with abline()]
Avoiding loops: *apply
> col <- c("red", "blue")[gender]
> plot(height, weight, col = col, xlim = c(1.4,
+     2), ylim = c(40, 110))
> byFit <- by(bmiData, gender, function(x) lm(weight ~
+     height, data = x))
> lapply(byFit, abline, lwd = 4)
[Figure: scatterplot of weight (40 to 110) against height (1.4 to 2.0), colored by gender, with one fitted regression line per gender]
Extract coefficients from all models
> sapply(byFit, function(x) x$coefficients)
                    F         M
(Intercept) -29.62659 -81.98327
height       53.58033  85.19731
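The pattern is: split the data frame by a factor with by(), fit one model per piece, then collect a statistic from every model with sapply(). A self-contained sketch on synthetic data, where the weights lie exactly on one line per gender so the recovered coefficients are known in advance (the data frame `toy` and the names `h`, `fits`, and `coefs` are assumptions, not part of the slides):

```r
# Synthetic data (not from bmiall.txt): exact linear relation per gender.
h <- c(1.55, 1.60, 1.65, 1.70, 1.75)
toy <- data.frame(
    gender = factor(rep(c("F", "M"), each = 5)),
    height = c(h, h + 0.10),
    weight = c(-30 + 54 * h, -82 + 85 * (h + 0.10))
)

# One model per gender, then one column of coefficients per model.
fits <- by(toy, toy$gender, function(d) lm(weight ~ height, data = d))
coefs <- sapply(fits, coef)
print(round(coefs, 2))
```

Using coef() instead of x$coefficients is a matter of taste; both return the same vector for an lm fit.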
Gender and age effects on height and weight
> summary(aov(height ~ gender + age))
             Df  Sum Sq Mean Sq  F value  Pr(>F)
gender        1 1.76308 1.76308 514.1828 < 2e-16 ***
age           1 0.01282 0.01282   3.7395 0.05382 .
Residuals   416 1.42642 0.00343
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> summary(aov(weight ~ gender * age))
             Df Sum Sq Mean Sq  F value Pr(>F)
gender        1  12631 12631.2 156.4069 <2e-16 ***
age           1    188   187.9   2.3262 0.1280
gender:age    1     19    18.9   0.2340 0.6288
Residuals   415  33515    80.8
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Gender and age effects on height and weight
> lmfit <- lm(weight ~ gender * height)
> summary(lmfit)
Call:
lm(formula = weight ~ gender * height)

Residuals:
     Min       1Q   Median       3Q      Max
-15.8396  -5.1961  -0.9167   4.2347  38.5225

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     -29.627     16.339  -1.813   0.0705 .
genderM         -52.357     22.786  -2.298   0.0221 *
height           53.580      9.875   5.426 9.82e-08 ***
genderM:height   31.617     13.294   2.378   0.0178 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
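With F as the reference level, the coefficients of the interaction model translate into one regression line per gender: the genderM term shifts the intercept and the genderM:height term shifts the slope. The sketch below does this arithmetic with the estimates copied from the output; the names `b`, `lineF`, and `lineM` are assumptions. The results agree, up to rounding, with the separate per-gender fits shown earlier (-29.63 and 53.58 for F, -81.98 and 85.20 for M).

```r
# Coefficient estimates copied from the summary(lmfit) output.
b <- c("(Intercept)" = -29.627, "genderM" = -52.357,
       "height" = 53.580, "genderM:height" = 31.617)

# Reference level F: intercept and slope are the base coefficients.
lineF <- c(intercept = b[["(Intercept)"]], slope = b[["height"]])
# Level M: add the gender shift to the intercept
# and the interaction term to the slope.
lineM <- c(intercept = b[["(Intercept)"]] + b[["genderM"]],
           slope = b[["height"]] + b[["genderM:height"]])
print(rbind(F = lineF, M = lineM))
```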
Plot of predicted values shows interaction
> col <- c("red", "blue")[gender]
> plot(height, weight, col = col)
> points(height, lmfit$fitted.values, col = col,
+     pch = 16)
[Figure: scatterplot of weight (50 to 100) against height (1.5 to 1.9), colored by gender, with the fitted values overplotted as filled points]
Student t-test
> t.test(height ~ gender)

        Welch Two Sample t-test

data:  height by gender
t = -22.6415, df = 416.359, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1410314 -0.1184996
sample estimates:
mean in group F mean in group M
       1.653688        1.783453
Extract part of a result and reuse
> (tweight <- t.test(weight ~ age))

        Welch Two Sample t-test

data:  weight by age
t = -0.9737, df = 416.815, p-value = 0.3308
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.024263  1.020666
sample estimates:
mean in group 17 mean in group 18
        64.09577         65.09757
> names(tweight)
[1] "statistic"   "parameter"   "p.value"
[4] "conf.int"    "estimate"    "null.value"
[7] "alternative" "method"      "data.name"
Extract estimated means and p value
> tweight$estimate
mean in group 17 mean in group 18
        64.09577         65.09757
> pValue <- round(tweight$p.value, 3)
> cat("My comment about p-value ( p =", pValue,
+     ")\n")
My comment about p-value ( p = 0.331 )
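The result of t.test() is a list, so any component named by names() can be picked out with `$` and reused in later calculations or in generated text. A self-contained sketch of the same idea, using the `sleep` data set that ships with R as a stand-in for the BMI data (the names `res`, `est`, `ci`, and `p` are assumptions):

```r
# sleep is a data set shipped with R; it is used here only as a stand-in.
res <- t.test(extra ~ group, data = sleep)

est <- res$estimate          # named vector of the two group means
ci <- res$conf.int           # confidence interval for the difference
p <- round(res$p.value, 3)   # rounded p-value for reporting
cat("Difference in means: p =", p, "\n")
```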
Distribution of BMI
> BMI <- weight/(height^2)
> gtour(BMI, col = "lightblue")
[Figure: 2 x 2 panel for BMI produced by gtour(): index plot of x, "Histogram of x" (y axis: Frequency), boxplot, and "Normal Q-Q Plot" (x axis: Theoretical Quantiles, y axis: Sample Quantiles)]
Make a new data frame and show correlations
> X <- data.frame(weight, height, BMI)
> col <- c("red", "blue")[gender]
> cor(X)
          weight     height        BMI
weight 1.0000000 0.65266353 0.76766658
height 0.6526635 1.00000000 0.02193029
BMI    0.7676666 0.02193029 1.00000000
Plot scattergrams
> pairs(X, col = col, pch = 16)
[Figure: scatterplot matrix of weight, height, and BMI, with filled points colored by gender]
Distributions of numerical variables
> oldpar <- par(mfrow = c(1, 3))
> cols <- c("pink", "lightblue")
> boxplot(split(height, gender), col = cols,
+     main = "height")
> boxplot(weight ~ gender, col = cols, main = "weight")
> boxplot(BMI ~ gender, col = cols, main = "BMI")
> par(oldpar)
Distributions of numerical variables
[Figure: boxplots by gender (F, M) in three side-by-side panels titled "height", "weight", and "BMI"]
Calculated sizes of symbols
> pairs(X, cex = BMI/15, col = gender)
[Figure: scatterplot matrix of weight, height, and BMI, with symbol size proportional to BMI and color set by gender]
BMI classes
> bmic <- cut(BMI, c(0, 13, 18, 25, 30, Inf))
> levels(bmic)
[1] "(0,13]"   "(13,18]"  "(18,25]"  "(25,30]"
[5] "(30,Inf]"
> levels(bmic) <- c("S", "s", "N", "h", "H")
> bmic <- factor(bmic, levels = c("S", "s",
+     "N", "h", "H"), ordered = TRUE)
> is.ordered(bmic)
[1] TRUE
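The three steps (cut, relabel, convert to an ordered factor) can also be done in a single cut() call through its `labels` and `ordered_result` arguments. A sketch on a few illustrative BMI values, which are not taken from the data; the names `bmi` and `bmiClass` are assumptions:

```r
# Illustrative BMI values, one per class; 18 lies on an interval boundary
# and falls into (13,18] because intervals are closed on the right.
bmi <- c(12, 18, 22, 27, 33)
bmiClass <- cut(bmi, breaks = c(0, 13, 18, 25, 30, Inf),
                labels = c("S", "s", "N", "h", "H"),
                ordered_result = TRUE)
print(bmiClass)
# Order comparisons are meaningful because the factor is ordered.
print(bmiClass >= "N")
```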
Color coded BMI classes
> cols <- (6 - as.numeric(bmic))
> pairs(X, cex = BMI/15, col = cols, pch = 16)
[Figure: scatterplot matrix of weight, height, and BMI, with symbol size proportional to BMI and filled points color coded by BMI class]
Flat contingency table for categories of BMI
> gabTable <- ftable(gender, age, bmic)
> gabTable
           bmic  S  s  N  h  H
gender age
F      17        0  6 86  9  0
       18        0  7 87  6  4
M      17        0  8 88 15  1
       18        0  1 89 12  0
Barplots are easy to understand ...
> barplot(table(gender, bmic), beside = TRUE,
+     legend = TRUE)
[Figure: grouped barplot of counts (0 to 150) by BMI class (S, s, N, h, H), with separate bars and a legend for F and M]