Exam of Statistics 6C

Università di Venezia - Corso di Laurea Economics & Management
Exam of Statistics 6C - Prof. M. Romanazzi
January 15th, 2016
Full Name
Student ID number (Matricola)
• Total (nominal) score: 30/30 (2/30 for each question).
• Pass score: 18/30. Lowest (18) and highest (29, 30) grades must be confirmed by oral
discussion.
• Pocket calculators and portable computers are allowed; textbooks and class notes are not.
• Detailed solutions to questions must be given on the draft sheet (foglio di brutta copia);
final answers/results must be copied on the exam sheet, beside the small squares.
Exercise 1 The natural growth of population (X) in a reference year is the difference between birth rate and
death rate and is usually measured as number of people per 1000 residents. The stem-and-leaf display
in Table 1 shows the values of X in a random sample of 80 Italian municipalities (source: Istat,
2014). In the display the data are rounded to the nearest unit, and the decimal point is one
digit to the right of the vertical bar.
n = 80;  −1|2 is read −12
$\sum_{i=1}^{n} x_i = -303.7887574$,  $\sum_{i=1}^{n} x_i^2 = 3494.4979035$

−1 | 9776
−1 | 332221
−0 | 9888777777776665555555
−0 | 444444444333332222222111110
 0 | 0000011122223334
 0 | 55566
Table 1: Stem-and-leaf of natural growth rates of Italian municipalities.
Q1 How many municipalities have a negative growth rate?
Q2 Compute the median x0.5 and the mean x̄ of the data.
x0.5 =          , x̄ =
Q3 Describe the criteria used to detect outliers. Are there any outliers in the present case?
Criteria:
Outlier(s):
Q4 What distribution shape does the sample suggest: unimodal or multimodal, symmetric
or (negatively, positively) asymmetric, uniform, normal, etc.?
Shape:
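As a hedged aid for checking answers to Exercise 1, the sketch below rebuilds the rounded values from Table 1 and computes the count of negative rates, the median, the mean and a set of outlier fences. It rests on several assumptions not stated in the text: the single −0|0 leaf is counted as a negative rate, the median is taken on the rounded values rather than the original data, and outliers are flagged with the common 1.5 × IQR rule (the course may use a different criterion or quartile convention).

```python
from statistics import median, quantiles

# Rows of Table 1 as (sign, tens digit, leaves); the key says -1|2 is read -12.
ROWS = [
    (-1, 1, "9776"),
    (-1, 1, "332221"),
    (-1, 0, "9888777777776665555555"),
    (-1, 0, "444444444333332222222111110"),
    (+1, 0, "0000011122223334"),
    (+1, 0, "55566"),
]

# Rounded values rebuilt from the display (not the original unrounded data).
values = sorted(
    sign * (10 * tens + int(leaf))
    for sign, tens, leaves in ROWS
    for leaf in leaves
)

# Leaves on negative stems; assumes the -0|0 leaf is a small negative rate.
n_negative = sum(len(leaves) for sign, _, leaves in ROWS if sign < 0)

# Median of the rounded values; mean from the exact sum given in the text.
n = 80
sum_x = -303.7887574
x_median = median(values)
x_mean = sum_x / n

# Assumed outlier rule: fences at 1.5 * IQR beyond the quartiles.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(n_negative, x_median, round(x_mean, 3), outliers)
```

Because the display only shows rounded values, the quartiles and fences are approximate; the mean is exact because it uses the sum reported with the table.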
Exercise 2 The weekly mileage (km) of Mr Rossi’s car is a random variable X with expectation µ = 130
and standard deviation σ = 30. The mileages in different weeks are assumed to be stochastically
independent.
Q1 Let Y be the total mileage of Mr Rossi’s car in a year. What are the expectation µY and the
standard deviation σY of Y ?
µY =          , σY =
Q2 What is the probability that the yearly mileage Y is greater than 6500 km?
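A minimal sketch of the calculation behind Exercise 2, under two assumptions that the text does not state: a year is counted as 52 weeks, and the distribution of the yearly total Y is approximated by a normal distribution (central limit theorem). For independent weeks, expectations add and variances add, so the standard deviation grows with the square root of the number of weeks.

```python
from math import sqrt
from statistics import NormalDist

MU_WEEK = 130.0     # expected weekly mileage (km), from the text
SIGMA_WEEK = 30.0   # weekly standard deviation (km), from the text
WEEKS = 52          # assumption: one year = 52 weeks

# Sum of independent weekly mileages: means add, variances add.
mu_year = WEEKS * MU_WEEK
sigma_year = sqrt(WEEKS) * SIGMA_WEEK

# Normal approximation for P(Y > 6500).
prob_above_6500 = 1.0 - NormalDist(mu_year, sigma_year).cdf(6500.0)

print(mu_year, round(sigma_year, 2), round(prob_above_6500, 4))
```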
Exercise 3 Consider again the municipality data of Exercise 1. We want to estimate the proportion pA of
Italian municipalities with a positive natural growth rate.
Q1 Let X n,A be the relative frequency of municipalities with a positive natural growth rate in
a random sample of size n. What is the probability distribution of X n,A ? What are the
expectation and the standard error?
Probability distribution of X n,A :
Expectation:          , Standard error:
Q2 Compute the confidence interval for pA (confidence level: 0.95).
Q3 Suppose a very precise estimate is required. What sample size n∗ do we need so that the
standard error is lower than 0.01?
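The following sketch illustrates one way to work through Exercise 3. It assumes that the municipalities with a positive rate are the 16 + 5 leaves on the two non-negative stems of Table 1, that the sample proportion is approximately normal, and that the standard error is estimated by plugging the sample proportion into sqrt(p(1 − p)/n). The helper `min_sample_size` is a hypothetical name introduced here; it is evaluated both with the plug-in estimate and with the worst case p = 0.5.

```python
from fractions import Fraction
from math import floor, sqrt
from statistics import NormalDist

n = 80
n_positive = 16 + 5  # assumption: leaves on the two non-negative stems of Table 1

# Point estimate, estimated standard error and 95% normal-approximation interval.
p_hat = n_positive / n
std_error = sqrt(p_hat * (1 - p_hat) / n)
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = p_hat - z * std_error, p_hat + z * std_error

def min_sample_size(p: Fraction, max_se: Fraction) -> int:
    """Smallest n for which sqrt(p(1-p)/n) is strictly below max_se."""
    return floor(p * (1 - p) / max_se**2) + 1

# Exact rational arithmetic avoids floating-point edge cases at the boundary.
n_plug_in = min_sample_size(Fraction(n_positive, n), Fraction(1, 100))
n_worst_case = min_sample_size(Fraction(1, 2), Fraction(1, 100))

print(round(p_hat, 4), round(std_error, 4), (round(ci_low, 3), round(ci_high, 3)))
print(n_plug_in, n_worst_case)
```

The worst-case value does not depend on the sample, so it is the safer choice when no prior estimate of pA is trusted.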
Exercise 4 Consider again the municipality data of Exercise 1. Our purpose is to estimate the mean µ of the
variable X in the population of all Italian municipalities.
Q1 What is the point estimate of µ and what is its sampling error?
Point estimate of µ:          , sampling error:
Q2 Consider the test H0 : µ = 0, H1 : µ ≠ 0 at the significance level α = 5%. What is the
rejection region of the test? What is the observed value of the test statistic?
Rejection region:
Observed value of test statistic:
Q3 What is the p-value of the test? Does it suggest rejection or non-rejection of H0? Does it
agree with the result of the previous question Q2?
p-value:
It suggests rejection (non rejection) of H0 because
It agrees (does not agree) because
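A possible sketch of the computations in Exercise 4, using only the sums reported with Table 1. It assumes the sample variance with divisor n − 1 and a large-sample normal (z) approximation for the test statistic; a Student t reference distribution with n − 1 degrees of freedom would be the usual alternative, but it is not available in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

n = 80
sum_x = -303.7887574    # sum of x_i, from the text
sum_x2 = 3494.4979035   # sum of x_i squared, from the text

# Point estimate of the mean and its estimated standard error.
x_bar = sum_x / n
s2 = (sum_x2 - n * x_bar**2) / (n - 1)   # sample variance, divisor n - 1
std_error = sqrt(s2 / n)

# Two-sided test of H0: mu = 0 at level 5% (normal approximation).
z_obs = (x_bar - 0.0) / std_error
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_h0 = abs(z_obs) > z_crit

# Two-sided p-value, computed from the lower tail for numerical accuracy.
p_value = 2 * NormalDist().cdf(-abs(z_obs))

print(round(x_bar, 3), round(std_error, 3), round(z_obs, 2), reject_h0, p_value)
```

The rejection region and the p-value are two views of the same test: H0 is rejected at level α exactly when the p-value is below α.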
Exercise 5 The total growth rate Y is the sum of the natural growth rate X and the migration rate Z. It takes
into account the change of residence of people (from/to different municipalities or foreign countries).
The scatter plot in Figure 1 shows the joint distribution of natural and total growth rates in the
sample already considered in Exercise 1. The summary statistics of Y are
$\sum_{i=1}^{n} y_i = -207.599086$, $\sum_{i=1}^{n} y_i^2 = 22914.335496$. Moreover, the sample linear
correlation coefficient is rX,Y = 0.4474925.
Q1 Mark on the scatter plot the region with positive values of both natural and total growth rates
and count the number of municipalities it includes.
Q2 Estimate a linear prediction model y = a + bx for Y , using X as explanatory variable. What
are the estimates of the coefficients a and b? Describe the measure of goodness-of-fit and
compute its value.
Intercept a =          , slope b =
Measure of goodness-of-fit:
Value:
Q3 Does the migration rate have a positive or negative impact on the population of Italian
municipalities? Explain carefully. Moreover, derive the confidence interval for µY , the mean
total growth rate for all Italian municipalities (confidence level: 0.95), and compare it with the
confidence interval for µX .
Impact of migration rate:
Confidence interval for µY :
Comparison:
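The sketch below shows how the quantities in Exercise 5 could be obtained from the summary statistics alone. It assumes least-squares estimates (slope = r · sY / sX, intercept = ȳ − slope · x̄), the coefficient of determination r² as the goodness-of-fit measure, and a normal-approximation interval for µY. The helper `mean_and_sd` is a hypothetical name introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 80
sum_x, sum_x2 = -303.7887574, 3494.4979035    # sums for X, from Exercise 1
sum_y, sum_y2 = -207.599086, 22914.335496     # sums for Y, from Exercise 5
r_xy = 0.4474925                              # sample correlation, from the text

def mean_and_sd(total, total_sq, size):
    """Mean and sample standard deviation (divisor size - 1) from sums."""
    mean = total / size
    var = (total_sq - size * mean**2) / (size - 1)
    return mean, sqrt(var)

x_bar, s_x = mean_and_sd(sum_x, sum_x2, n)
y_bar, s_y = mean_and_sd(sum_y, sum_y2, n)

# Least-squares line y = a + b x and coefficient of determination.
slope = r_xy * s_y / s_x
intercept = y_bar - slope * x_bar
r_squared = r_xy**2

# Since Y = X + Z, the mean migration rate is the difference of the means.
z_bar = y_bar - x_bar

# 95% interval for the mean of Y (normal approximation).
z = NormalDist().inv_cdf(0.975)
half_width = z * s_y / sqrt(n)
ci_mu_y = (y_bar - half_width, y_bar + half_width)

print(round(intercept, 3), round(slope, 3), round(r_squared, 3))
print(round(z_bar, 3), tuple(round(v, 3) for v in ci_mu_y))
```

The sign of `z_bar` gives the direction of the average migration effect in the sample, and comparing `ci_mu_y` with the analogous interval for µX shows how much extra variability the migration component adds.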
[Figure 1 image not reproduced: scatter plot "Growth rate of Italian municipalities" (2014); horizontal axis: Natural growth rate (ticks from −20 to 5); vertical axis: Total growth rate (ticks from −60 to 60).]
Figure 1: Scatter plot of natural and total growth rates.