Midterm 2004

Midterm
2004.4.24
(Written)
1. (20%)
Let an exponential family in canonical form be
$$f(x, \eta) = \exp\{\eta T(x) - A(\eta)\}\, h(x).$$
A discrete random variable $X$ with probabilities
$$P(X = x) = \frac{a(x)\,\theta^{x}}{C(\theta)}, \quad x = 0, 1, \ldots; \quad a(x) \ge 0; \quad \theta > 0,$$
is said to have a power series distribution.
(a) Show that the power series distribution is an exponential family and find $\eta$ and $T(x)$. Also show that the moment generating function of $X$ is
$$M_X(t) = \frac{C(\theta \exp(t))}{C(\theta)}.$$
(b) Show that the binomial distribution Binomial$(n, p)$ and the Poisson distribution Poisson$(\lambda)$ are special cases of the power series distribution and determine $\theta$ and $C(\theta)$.
(c) The power series distribution with
$$a(x) = \frac{1}{x} \quad \text{and} \quad C(\theta) = -\log(1 - \theta), \quad x = 1, 2, \ldots; \quad 0 < \theta < 1,$$
is the logarithmic series distribution. Show that the moment generating function is
$$M(t) = \frac{\log(1 - \theta \exp(t))}{\log(1 - \theta)},$$
and determine $E(X)$ and $\operatorname{Var}(X)$.
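The moment generating function in (c) can be checked numerically before attempting the algebra. The following is a minimal sketch, assuming the pmf $a(x)\theta^x / C(\theta)$ with $a(x) = 1/x$ and $C(\theta) = -\log(1-\theta)$; the function names, the truncation point and the test values of $\theta$ and $t$ are illustrative choices, not part of the exam.

```python
import math

def log_series_pmf(x, theta):
    """P(X = x) = a(x) * theta**x / C(theta), with a(x) = 1/x and C(theta) = -log(1 - theta)."""
    return theta ** x / (x * -math.log(1.0 - theta))

def mgf_by_summation(t, theta, terms=500):
    """Truncated sum of exp(t*x) * P(X = x); only meaningful when theta * exp(t) < 1."""
    return sum(math.exp(t * x) * log_series_pmf(x, theta) for x in range(1, terms + 1))

def mgf_closed_form(t, theta):
    """C(theta * exp(t)) / C(theta) for the logarithmic series distribution."""
    return math.log(1.0 - theta * math.exp(t)) / math.log(1.0 - theta)

theta, t = 0.4, 0.3  # illustrative values with theta * exp(t) < 1
numeric = mgf_by_summation(t, theta)
closed = mgf_closed_form(t, theta)
print(numeric, closed)
```

The two printed values should agree to many decimal places. Agreement at a few test points is a sanity check on the formula, not a proof.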
2. (20%)
The logistic density is
$$f(x) = \frac{\exp(-x)}{\{1 + \exp(-x)\}^{2}}, \quad -\infty < x < \infty.$$
(a) Show that the density is symmetrical about 0.
(b) Find the cumulative distribution function and show that the $100p$th percentile occurs at
$$x_p = \log\left(\frac{p}{1 - p}\right).$$
(c) Show that the moment generating function of the random variable $X$ with the logistic density is
$$M_X(t) = \frac{\pi t}{\sin(\pi t)}.$$
Find the first and the second cumulants.
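The claims in (b) and (c) can be sanity-checked numerically. The sketch below assumes the density $\exp(-x)/\{1+\exp(-x)\}^2$, the distribution function $1/\{1+\exp(-x)\}$ and the percentile $\log\{p/(1-p)\}$. The trapezoid integrator, its integration limits and its step size are illustrative choices.

```python
import math

def logistic_pdf(x):
    """Density exp(-x) / (1 + exp(-x))**2."""
    e = math.exp(-x)
    return e / (1.0 + e) ** 2

def logistic_cdf(x):
    """Distribution function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_quantile(p):
    """Candidate 100p-th percentile log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def mgf_numeric(t, half_width=60.0, step=0.01):
    """Trapezoid rule for the integral of exp(t*x) * f(x) over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(t * x) * logistic_pdf(x)
    return total * step

t = 0.5  # any |t| < 1 keeps the integral finite
print(mgf_numeric(t), math.pi * t / math.sin(math.pi * t))
print(logistic_cdf(logistic_quantile(0.9)))
```

The first line should print two nearly equal numbers, and the second should recover the probability 0.9 up to rounding.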
3. (20%)
For independent binomial sampling with responses
$$Y_i \sim \text{Binomial}(m_i, \pi_i), \quad i = 1, 2, \ldots, n,$$
a linear logistic model with $\operatorname{logit}(\pi_i) = \beta_0 + \beta_1 x_i$ is used to fit the data. Derive the iteratively reweighted least squares (IRLS) estimate by both Fisher's scoring method and the Newton-Raphson method.
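For orientation, one common way of writing a single IRLS step for this model is sketched below. The design matrix $X$ (with rows $(1, x_i)$), the weight matrix $W$, the working response $z$ and the hats are notation assumed here and do not appear in the exam. Because the logit is the canonical link for the binomial, the observed and expected information coincide, so both methods lead to the same update.

```latex
\hat\eta_i = \hat\beta_0 + \hat\beta_1 x_i, \qquad
\hat\pi_i = \frac{\exp(\hat\eta_i)}{1 + \exp(\hat\eta_i)}, \qquad
w_i = m_i \hat\pi_i (1 - \hat\pi_i), \qquad
z_i = \hat\eta_i + \frac{y_i - m_i \hat\pi_i}{w_i},

\hat\beta^{\text{new}} = (X^{T} W X)^{-1} X^{T} W z, \qquad
W = \operatorname{diag}(w_1, \ldots, w_n).
```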
4. (20%)
Consider the multinomial response model
$$\eta_j(x_i) = \log \pi_j(x_i) = \alpha_j + \beta^{t} x_i s_j + \phi_i,$$
with scores $s = (1, 0, \ldots, 0)$. Show that, with these scores, the log-linear model is equivalent to the nested response model
$$\operatorname{logit} \pi_1(x_i) = \theta_1 + \beta^{t} x_i,$$
$$\operatorname{logit}\left(\frac{\pi_j(x_i)}{1 - r_{j-1}(x_i)}\right) = \theta_j, \quad j \ge 2.$$
5. (20%)
Yi ~ Pai  , i  1,2, , r ,
are independently distributed Poisson distributions.
r
Y1
(a) Prove that the conditional distribution


Binomial  n,

is


(b) Suppose that
given
Y
i 1
i
n


a1

r
ai  .

i 1

Y1
variables with means
and

Y2
and
are independent Poisson random
 .
Show how you might use the
result in (a) to test the null hypothesis
one-sided alternative
H0 :   1
against the
H1 :   1 .
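One way to turn the result in (a) into a test is sketched below. It assumes that the two means are $\mu$ and $\rho\mu$, so that, given $n = y_1 + y_2$, $Y_1$ is Binomial$(n, 1/(1+\rho))$, which reduces to Binomial$(n, 1/2)$ under $H_0$. The function name and the example counts are hypothetical.

```python
from math import comb

def conditional_p_value(y1, y2):
    """One-sided p-value for H0: rho = 1 against H1: rho > 1, conditioning on n = y1 + y2.

    Assumes Y1 ~ Poisson(mu) and Y2 ~ Poisson(rho * mu). Given the total n,
    Y1 ~ Binomial(n, 1 / (1 + rho)); under H0 the success probability is 1/2.
    rho > 1 makes small values of Y1 more likely, so the lower tail is used.
    """
    n = y1 + y2
    return sum(comb(n, k) for k in range(y1 + 1)) / 2 ** n

# Hypothetical counts: y1 = 2, y2 = 8.
print(conditional_p_value(2, 8))  # (1 + 10 + 45) / 1024 = 0.0546875
```

Conditioning on the total removes the nuisance parameter $\mu$, which is why the p-value depends only on the two counts.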
(Computer)
A. (25%)
Using a six-point scale, subjects indicated their preference for black olives.
                               Preference
Urbanization  Location    A    B    C    D    E    F
Urban         MW         20   15   12   17   16   28
              NE         18   17   18   18    6   25
              SW         12    9   23   21   19   30
Rural         MW         30   22   21   17    8   12
              NE         23   18   20   18   10   15
              SW         11    9   26   19   17   24
In these data, Preference is an ordinal response with categories (A, B, C, D, E, F); Urbanization and Location are two explanatory variables.
Please use the proportional odds model to fit the above data. What is the conclusion?
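Before fitting, it can help to look at the empirical cumulative logits. Under a proportional odds model, $\operatorname{logit} P(Y \le j \mid x)$ for two covariate settings differs by a shift that does not depend on the cut point $j$. The sketch below only computes these exploratory quantities from the table and does not fit the model; the variable and function names are illustrative.

```python
import math

CATEGORIES = ["A", "B", "C", "D", "E", "F"]
COUNTS = {
    ("Urban", "MW"): [20, 15, 12, 17, 16, 28],
    ("Urban", "NE"): [18, 17, 18, 18, 6, 25],
    ("Urban", "SW"): [12, 9, 23, 21, 19, 30],
    ("Rural", "MW"): [30, 22, 21, 17, 8, 12],
    ("Rural", "NE"): [23, 18, 20, 18, 10, 15],
    ("Rural", "SW"): [11, 9, 26, 19, 17, 24],
}

def cumulative_logits(row):
    """log{P(Y <= j) / P(Y > j)} for j = 1, ..., 5, from one row of counts."""
    total = sum(row)
    logits, running = [], 0
    for count in row[:-1]:
        running += count
        logits.append(math.log(running / (total - running)))
    return logits

for key, row in COUNTS.items():
    print(key, [round(value, 2) for value in cumulative_logits(row)])
```

If the differences between two rows stay roughly constant across the five cut points, the proportional odds assumption looks plausible for that comparison. A formal conclusion still requires fitting the model.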
B. (25%)
The following table refers to 661 children with birth weights between 650 g and 1749 g, all of whom survived for at least one year. The variables of interest are:
Cardiac: mild heart problems of the mother during pregnancy
Comps: gynaecological problems during pregnancy
Smoking: mother smoked at least one cigarette per day during the first months of pregnancy
BW: was the birth weight less than 1250 g?
Cardiac            Yes                     No
Comps         Yes        No          Yes        No
Smoking     Yes   No   Yes   No    Yes   No   Yes   No
BW  Yes      10   25    12   15     18   12    42   45
    No        7    5    22   19     10   12   202  205
Analyze the data and interpret the relationship between the children's birth weights and the mothers' habits and health conditions.
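A quick first look, before any modelling, is to compute the marginal odds ratio of low birth weight for each factor. The sketch below assumes the column order Cardiac (Yes, No), then Comps (Yes, No) within Cardiac, then Smoking (Yes, No) within Comps. The dictionary layout and function name are illustrative. Marginal odds ratios ignore the other factors and can be misleading, so they do not replace a logit or log-linear analysis.

```python
# (cardiac, comps, smoking) -> (count with BW = Yes, count with BW = No),
# in the assumed column order of the table.
CELLS = {
    ("Yes", "Yes", "Yes"): (10, 7),
    ("Yes", "Yes", "No"): (25, 5),
    ("Yes", "No", "Yes"): (12, 22),
    ("Yes", "No", "No"): (15, 19),
    ("No", "Yes", "Yes"): (18, 10),
    ("No", "Yes", "No"): (12, 12),
    ("No", "No", "Yes"): (42, 202),
    ("No", "No", "No"): (45, 205),
}
FACTORS = ("Cardiac", "Comps", "Smoking")

def marginal_odds_ratio(factor_index):
    """Odds of BW = Yes at level 'Yes' divided by the odds at level 'No', other factors ignored."""
    low = {"Yes": 0, "No": 0}
    not_low = {"Yes": 0, "No": 0}
    for levels, (bw_yes, bw_no) in CELLS.items():
        low[levels[factor_index]] += bw_yes
        not_low[levels[factor_index]] += bw_no
    return (low["Yes"] * not_low["No"]) / (not_low["Yes"] * low["No"])

for index, name in enumerate(FACTORS):
    print(name, round(marginal_odds_ratio(index), 3))
```

The cell counts sum to 661, which matches the number of children stated in the problem and gives some reassurance about the assumed layout.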
C. (25%)
The data in the Splus built-in data frame Insurance (in the library MASS) consist of the numbers of policyholders, Holders, and the numbers of car insurance claims made by those policyholders, Claims. There are three explanatory variables: District (four levels), Group (of car, four levels), and Age (four ordered levels). Please analyze the data up to the three-way interaction with offset log(Holders). Which factors determine the number of claims?
D. (25%)
Please write a program to fit the logistic regression model
$$\operatorname{logit}(\pi_i) = \beta_0 + \beta_1 x_i$$
(see problem 3).
Note: the Splus commands glm and glim may not be used.
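As an illustration of the shape such a program might take, here is a minimal sketch in Python rather than Splus. It fits $\operatorname{logit}(\pi_i) = \beta_0 + \beta_1 x_i$ to binomial counts by iteratively reweighted least squares, solving the 2x2 weighted normal equations by hand. The function name, starting values, tolerance and example data are all hypothetical, and there are no safeguards against fitted probabilities of exactly 0 or 1.

```python
import math

def fit_logistic_irls(x, y, m, tol=1e-10, max_iter=50):
    """Fit logit(pi_i) = b0 + b1 * x_i to counts y_i out of m_i trials by IRLS."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Accumulate the entries of X'WX and X'Wz for the two-parameter model.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi, mi in zip(x, y, m):
            eta = b0 + b1 * xi
            pi = 1.0 / (1.0 + math.exp(-eta))
            w = mi * pi * (1.0 - pi)        # working weight
            z = eta + (yi - mi * pi) / w    # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least squares normal equations.
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Hypothetical data: y successes out of m trials at each dose x.
x = [0, 1, 2, 3, 4]
m = [10, 10, 10, 10, 10]
y = [1, 3, 5, 8, 9]
b0_hat, b1_hat = fit_logistic_irls(x, y, m)
print(b0_hat, b1_hat)
```

A useful check on any such program is that the score equations hold at the returned estimates, that is, the residuals $y_i - m_i \hat\pi_i$ sum to zero both unweighted and weighted by $x_i$.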