Solutions - Department of Statistical and Actuarial Sciences

The University of Western Ontario
Department of Statistical and Actuarial Sciences
Statistical Sciences 2864B
Assignment 4 Solutions
Due Date: March 27, 2013
Please work together with one or two partners on this assignment. Hand in one copy of
the assignment for your entire group. Print the names of all group members on the front of your
assignment. (Do NOT write down their student numbers.)
1. Download the R function pictogram.R from the course webpage and use the source()
function to read it into an R session. This function creates a kind of bar chart that
is popular in magazines and newspapers, using particular shapes for the different bars.
For an example of its use, try the following code
income <- c(Ottawa=93.07, Toronto=66.79, London=70.16,
Windsor=67.22, Sudbury = 75.24, ThunderBay=72.96)
pictogram(income,
label="Median Income (in $1000s) for Ontario Cities -- 2009",
units=20)
grid.xaxis(at=seq(0,120,30)/120, label=seq(0,120,length=5))
Use functions in grid to construct functions which produce the shapes from the following list:
(a) a coin (with a head)
source("pictogram.R.txt")
library(grid)
income <- c(Ottawa=93.07, Toronto=66.79, London=70.16,
Windsor=67.22, Sudbury = 75.24, ThunderBay=72.96)
coin<-function(gp){
grid.circle(x=0.5,y=0.5,r=0.5,gp=gpar(fill="darkgrey"))
grid.circle(x=0.5,y=0.5,r=0.35,gp=gpar(lty="dashed",fill="grey"))
grid.circle(x=0.45,y=0.65,r=0.05,gp=gpar(fill="blue"))
grid.circle(x=0.55,y=0.65,r=0.05,gp=gpar(fill="blue"))
pushViewport(viewport(x=0.5,y=0.2,width=0.35,height=0.35,clip="on"))
pushViewport(viewport(x=0.5,y=1.5,width=1.5,height=1.5))
grid.circle(x=0.5,y=0.5,r=0.5,gp=gpar(fill="red"))
upViewport(2)
}
pictogram(income,label="Median Income (in $1000s) for Ontario Cities -- 2009",
units=20,shape=coin)
grid.xaxis(at=seq(0,120,30)/120, label=seq(0,120,length=5))
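Before passing a shape function to pictogram(), it can help to look at the shape by itself. The following is a minimal sketch, assuming (as the coin solution suggests) that a shape function takes a single gp argument and draws in the npc coordinates of the current viewport. The names preview_shape and demo_shape are hypothetical helpers for illustration and are not part of pictogram.R.

```r
library(grid)

# Hypothetical helper (not part of pictogram.R): draw one shape inside a
# square viewport so that it can be inspected in isolation.
preview_shape <- function(shape, size = unit(2, "inches")) {
  grid.newpage()
  pushViewport(viewport(width = size, height = size))
  grid.rect(gp = gpar(lty = "dotted", fill = NA))  # outline of the drawing cell
  shape(gpar())                                     # same call signature as coin(gp)
  popViewport()
  invisible(TRUE)
}

# A deliberately simple demo shape with the same signature as coin().
demo_shape <- function(gp) {
  grid.circle(x = 0.5, y = 0.5, r = 0.4, gp = gpar(fill = "gold"))
  grid.text("$", x = 0.5, y = 0.5)
}

preview_shape(demo_shape)
```

Once coin has been defined, preview_shape(coin) should draw it the same way.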
[Figure: Pictogram of Median Income (in $1000s) for Ontario Cities -- 2009, drawn with the coin shape. Rows: Toronto, Windsor, London, ThunderBay, Sudbury, Ottawa. Axis ticks at 0, 30, 60, 90, 120. Each coin represents 20 units.]
(b) a dollar bill (with a “$”)
dollarbill<-function(gp){
rec1<-rectGrob(width=unit(0.8,"npc"),height=unit(0.4,"npc"),
gp=gpar(fill="green4"))
rec2<-rectGrob(width=unit(0.74,"npc"),height=unit(0.34,"npc"),
gp=gpar(fill="green2"))
rec3<-editGrob(rec1,vp=viewport(x=0.5,y=0.5,angle=30))
rec4<-editGrob(rec2,vp=viewport(x=0.5,y=0.5,angle=30))
dollar<-textGrob("$",gp=gpar(cex=0.75,col="green4"))
dollar1<-editGrob(dollar,vp=viewport(x=0.5,y=0.5,angle=30))
grid.draw(rec3)
grid.draw(rec4)
grid.draw(dollar1)
}
pictogram(income,label="Median Income (in $1000s) for Ontario Cities -- 2009",
units=20,shape=dollarbill)
grid.xaxis(at=seq(0,120,30)/120, label=seq(0,120,length=5))
[Figure: Pictogram of Median Income (in $1000s) for Ontario Cities -- 2009, drawn with the dollar bill shape. Rows: Toronto, Windsor, London, ThunderBay, Sudbury, Ottawa. Axis ticks at 0, 30, 60, 90, 120. Each dollar bill represents 20 units.]
Construct pictogram plots of the income data using each of your new functions.
2. Construct an R function which takes as input an integer n and an integer-valued seed s
and returns n pseudorandom numbers generated from the multiplicative congruential
generator with m = 30307 and b = 172. Does this generator have a maximal cycle
length?
unirand <- function(n, s) {
m <- 30307
b <- 172
x <- numeric(n)
for (i in 1:n) {
s <- (s*b)%%m
x[i] <- s/m
}
x
}
Yes. Since m = 30307 is prime, the maximal cycle length of a multiplicative generator is m - 1 = 30306, and this generator attains it.
Construct a second function which generates pseudorandom numbers using m = 30323
and b = 170. Does this generator have maximal cycle length?
unirand2 <- function(n, s) {
m <- 30323
b <- 170
x <- numeric(n)
for (i in 1:n) {
s <- (s*b)%%m
x[i] <- s/m
}
x
}
Yes. Since m = 30323 is prime, the maximal cycle length of a multiplicative generator is m - 1 = 30322, and this generator attains it.
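One way to check the cycle lengths of the two generators empirically is to count how many steps the recursion takes to return to its starting seed. The sketch below does this by brute force. The function name cycle_length is hypothetical, and it assumes that the seed lies between 1 and m - 1 and that b and m have no common factor, so that the sequence does return to the seed.

```r
# Hypothetical helper: number of steps until the multiplicative
# congruential recursion s <- (s * b) %% m first returns to the seed s.
# Assumes 1 <= s <= m - 1 and gcd(b, m) = 1.
cycle_length <- function(m, b, s = 1) {
  x <- (s * b) %% m
  k <- 1
  while (x != s) {
    x <- (x * b) %% m
    k <- k + 1
  }
  k
}

len1 <- cycle_length(30307, 172)  # first generator
len2 <- cycle_length(30323, 170)  # second generator

# TRUE means the generator attains the maximal cycle length m - 1.
print(c(len1 == 30307 - 1, len2 == 30323 - 1))
```

A small case is easy to verify by hand: with m = 7 and b = 3 the states starting from 1 are 3, 2, 6, 4, 5, 1, so cycle_length(7, 3) is 6, which is maximal, whereas b = 2 gives 2, 4, 1 and a cycle of length 3.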
Construct a third function which takes as input an integer n and two integer-valued seeds
s1 and s2 and returns a vector of length n whose ith component is
u1[i] + u2[i] - floor(u1[i]+u2[i])
where u1 and u2 are vectors of length n output from the first two functions. Does the
cycle length of this third generator exceed 300000? Are the resulting numbers uniformly
distributed? Use a lag plot to check whether the numbers appear to be independent.
unirand3 <- function(n, s1, s2) {
m1 <- 30307
m2 <- 30323
b1 <- 172
b2 <- 170
x <- numeric(n)
for (i in 1:n) {
s1 <- (s1*b1)%%m1
s2 <- (s2*b2)%%m2
x1 <- s1/m1
x2 <- s2/m2
x[i] <- x1+x2-floor(x1+x2)
}
x
}
unif.sample<-unirand3(1000,25,62)
hist(unif.sample,main="Sample of 1000")
Big.sample<-unirand3(400000,14,76)
length(unique(Big.sample))
[1] 400000
[Figure: histogram titled "Sample of 1000". x-axis: unif.sample, from 0.0 to 1.0. y-axis: Frequency.]
The cycle length exceeds 300 000, since all 400 000 values in a sample of size
400 000 are distinct. The histogram of the sample of 1000 values is roughly flat,
so the resulting numbers appear to be uniformly distributed. The lag plot shows no
clear pattern, and hence the numbers also appear to be independent.
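The sample of 400 000 values only gives a lower bound on the cycle length. As a rough sketch of a sharper argument, assume that the two component generators have cycle lengths 30306 and 30322. The pair of states (s1, s2) then first returns to its starting value after the least common multiple of those two lengths, and because the moduli 30307 and 30323 have no common factor, distinct state pairs should give distinct outputs. The helper names gcd and lcm below are defined here for illustration.

```r
# Greatest common divisor by the Euclidean algorithm.
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}

# Least common multiple.
lcm <- function(a, b) a / gcd(a, b) * b

p1 <- 30306  # assumed cycle length of the first generator
p2 <- 30322  # assumed cycle length of the second generator

combined_period <- lcm(p1, p2)
print(combined_period)           # cycle length of the state pair
print(combined_period > 300000)  # does it exceed 300 000?
```

Under these assumptions, the cycle length is far larger than the 400 000 values examined directly.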
lag.plot(unif.sample)
[Figure: lag plot of unif.sample against lag 1.]
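The visual checks can be complemented with numerical ones. The sketch below is not part of the original solution: it restates the combined generator so that the block runs on its own, then applies a chi-square goodness-of-fit test on 10 equal-width bins and computes the lag 1 sample correlation. The choice of 10 bins is an arbitrary assumption.

```r
# Combined generator, restated from the solution so this block is self-contained.
unirand3 <- function(n, s1, s2) {
  m1 <- 30307; b1 <- 172
  m2 <- 30323; b2 <- 170
  x <- numeric(n)
  for (i in seq_len(n)) {
    s1 <- (s1 * b1) %% m1
    s2 <- (s2 * b2) %% m2
    u <- s1 / m1 + s2 / m2
    x[i] <- u - floor(u)  # keep the fractional part
  }
  x
}

u <- unirand3(1000, 25, 62)

# Chi-square goodness-of-fit test against equal expected counts in 10 bins.
counts <- as.vector(table(cut(u, breaks = seq(0, 1, by = 0.1),
                              include.lowest = TRUE)))
uniform_test <- chisq.test(counts)

# Sample correlation between consecutive values (lag 1).
lag1_cor <- cor(u[-length(u)], u[-1])

print(uniform_test$p.value)
print(lag1_cor)
```

A large p-value and a lag 1 correlation near zero would be consistent with the conclusions drawn from the histogram and the lag plot, though neither check proves uniformity or independence.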