Exam 2

Exam 2 Spring 2017 STAT 305A (Take-Home) Due 3/28 (T In class) Name_______________________________
NOTE: All work (except code) for a given part MUST be placed DIRECTLY beneath that part. Begin each problem on a new page.
Table 1. Some useful Definitions and Facts
Def. 1. X and Y are independent if $f_{(X,Y)}(x,y) = f_X(x)\,f_Y(y)$.
Def. 2. X and Y are uncorrelated if $E(XY) = E(X)\,E(Y)$.
Def. 3. The correlation coefficient for (X,Y) is $\rho_{XY} = \sigma_{XY}/(\sigma_X \sigma_Y)$.
Def. 4. $E[g(X,Y)] = \int_{S_Y}\int_{S_X} g(x,y)\, f_{(X,Y)}(x,y)\,dx\,dy$.
Fact 1. $E(aX + bY + c) = aE(X) + bE(Y) + c$.
Fact 2. $\mathrm{Cov}(aX + bY + c,\, W) = a\sigma_{XW} + b\sigma_{YW}$.
Fact 3. $\mathrm{Var}(aX + bY + c) = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,\sigma_{XY}$.
Fact 4. $\sigma_X^2 = E(X^2) - \mu_X^2$.
PROBLEM 1 (30pts) In this problem you will address the total uncertainty in prediction of the oscillation
period of the pendulum shown at right. This uncertainty has two components: (i) that associated with L = the
act of measuring the length, $\ell$, and (ii) that associated with $T_m$ = the act of measuring the time it takes to
complete m periods of oscillation. To answer questions related to this problem, you will need to view
https://www.youtube.com/watch?v=4a0FbQdH3dY with Professor Lewin.
[Figure: the pendulum.]
(a)(4pts) Assume $L \sim N(\mu_L = 5.21\ \text{m},\ \sigma_L = 0.025\ \text{m})$. Define relative uncertainty as $r = (2\sigma/\mu) \times 100\%$. Compute this value
for L, and compare it to the value claimed by Prof. Lewin. [Give the time stamp of his claim.]
Solution:
(b)(6pts) Computing $\mu_{\sqrt{L}}$ and $\sigma_{\sqrt{L}}$ mathematically is very difficult. Use $10^6$ simulations to obtain their values.
Solution: [See code @ 1(b)]
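A minimal sketch of the kind of simulation part (b) asks for is given below. It assumes the values muL and stdL from Appendix 1 and uses only base Matlab functions; the variable names sqrtL, muSqrtL and stdSqrtL are illustrative choices, not names required by the exam.

```matlab
% Sketch for 1(b): Monte Carlo estimate of the mean and standard
% deviation of sqrt(L), with L ~ N(muL, stdL) as given in 1(a).
muL = 5.21; stdL = 0.025;       % values from Appendix 1
nsim = 10^6;                    % number of simulations
L = muL + stdL*randn(nsim,1);   % simulated length measurements (m)
sqrtL = sqrt(L);                % L stays far from zero, so sqrt is real
muSqrtL = mean(sqrtL);          % estimate of the mean of sqrt(L)
stdSqrtL = std(sqrtL);          % estimate of the std. dev. of sqrt(L)
fprintf('mu_sqrtL = %.5f, sigma_sqrtL = %.5f\n', muSqrtL, stdSqrtL);
```

Because the estimates come from random draws, repeated runs will differ slightly in the last digits.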
(c)(8pts) Professor Lewin claims that, because the percent relative uncertainty in L is ~1%, the resulting uncertainty in
calculating the pendulum period $T = (2\pi/\sqrt{g})\sqrt{L}$ is only ~0.5%. This is not obvious. Use your answers in (b) to arrive at
(i) $\mu_T$, (ii) $\sigma_T$, and (iii) to verify his claim re: the relative uncertainty for T. [Give the time stamp.] Use $g = 9.80\ \text{m/s}^2$.
Solution:
Remark 1. Assume that T (associated with L) has $\mu_T = 4.6$ sec and $\sigma_T = 0.01$ sec.
(d)(6pts) Let $Q \sim N(0,\ \sigma_Q = 0.1\ \text{s})$ denote the measurement error associated with the stop-time after any chosen number of
pendulum swings. Assume this error is independent of T. Let $W = mT + Q$ denote the act of measuring the stop-time of m
pendulum oscillations. The period estimator is then: $\hat{T} = W/m = T + Q/m$. Arrive at the expression for $\sigma_{\hat{T}}$ as a function
of only m.
Solution:
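One way to set up the calculation, offered as a sketch, is to apply Fact 3 of Table 1 to $\hat{T} = T + Q/m$ with coefficients 1 and 1/m. This assumes that the stated independence of T and Q gives $\sigma_{TQ} = 0$.

```latex
\sigma_{\hat{T}}^2
  = \operatorname{Var}\!\left(T + \tfrac{1}{m}\,Q\right)
  = \sigma_T^2 + \frac{1}{m^2}\,\sigma_Q^2 + \frac{2}{m}\,\sigma_{TQ}
  = \sigma_T^2 + \frac{\sigma_Q^2}{m^2}
```

The numerical values of $\sigma_T$ (Remark 1) and $\sigma_Q$ can then be substituted to leave m as the only variable.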

(e)(6pts) Professor Lewin claims that for $m = 10$ periods, the $2\sigma$ uncertainty in $\hat{T}$ is 0.02 sec. (i) Use your expression in
(d) to show that his claim is incorrect, and then (ii) explain how he ‘messed up’.
Solution:
PROBLEM 2 (25pts) The performance of a wind turbine is tied directly to the velocity of the wind impinging on the
blades at any given time. Let $\mathbf{V} = (V_1, \ldots, V_n)$ denote the act of measuring wind speed at n successive time intervals Δ
seconds apart from one another.
(a)(4pts) Let $\bar{V} = (1/n)\sum_{k=1}^{n} V_k$. Give, and then use, the appropriate Fact in Table 1 to show that $E(\bar{V}) = (1/n)\sum_{k=1}^{n} \mu_{V_k}$.
Solution: Fact ___ ;
(b)(3pts) Assume that Δ is sufficiently large, so that the elements of $\mathbf{V}$ can be assumed to be uncorrelated. On p.185 of the
book, give the equation that you will then use directly to show that $\sigma_{\bar{V}}^2 = \mathrm{Var}(\bar{V}) = (1/n)^2 \sum_{k=1}^{n} \sigma_{V_k}^2$.
Solution: On p.185: __________________________________
(c)(3pts) Suppose that at any index k, we have $V_k \sim \text{Weibull}(a = 31.7,\ b = 9)$. These parameter values correspond to
$\mu_V = 30$ mph and $\sigma_V = 4$ mph. Suppose further that the time $\Delta = 60$ sec between successive measurements is sufficiently
large that we can assume that the elements of $\mathbf{V}$ are uncorrelated. For a 10 minute observation time we have a total of
$n = 10$ measurements. Use the results in (a-b) to compute the numerical values of $\mu_{\bar{V}}$ and $\sigma_{\bar{V}}$.
Solution:
(d)(6pts) The pdf for any $V_k$ (below LEFT) clearly does not have the symmetry of the normal pdf. The Central Limit
Theorem (CLT) says that if the sample size, n, is ‘sufficiently large’, then no matter what the pdf might be for $V_k$, the pdf
for $\bar{V}$ will be approximately normal. Use $10^5$ simulations of $\bar{V}$ to arrive at (i) the simulation-based pdf
overlaid against a normal pdf for $\bar{V}$. Then (ii) use your result to determine if the CLT is applicable.

Figure 2(d) The given pdf for any Vk (LEFT), and your pdfs (normal and simulation-based) for V (RIGHT).
Solution: [See code @ 2(d) used to arrive at the plot at RIGHT]
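The simulation in (d) could be sketched as follows. Two choices here are assumptions rather than requirements of the exam: Weibull samples are drawn by inverse-cdf sampling (so no toolbox function is needed), and the pdf is estimated from a 60-bin histogram. The names Vbar, pdfSim and pdfNorm are illustrative.

```matlab
% Sketch for 2(d): simulate the sample mean of n Weibull wind speeds
% and compare its estimated pdf with a normal pdf.
a = 31.686; b = 8.966;          % Weibull scale and shape from Appendix 1
nsim = 10^5; n = 10;
% Inverse-cdf sampling: if U ~ Uniform(0,1), a*(-log(U))^(1/b) is Weibull(a,b).
U = rand(nsim,n);
V = a*(-log(U)).^(1/b);         % each row is one realization of (V1,...,Vn)
Vbar = mean(V,2);               % nsim realizations of the sample mean
muVbar = mean(Vbar); stdVbar = std(Vbar);
% Histogram-based pdf estimate (counts scaled so the area is 1).
[counts,centers] = hist(Vbar,60);
counts = counts(:)'; centers = centers(:)';   % force row vectors
binw = centers(2)-centers(1);
pdfSim = counts/(nsim*binw);
% Normal pdf with the same mean and standard deviation as the simulation.
pdfNorm = exp(-(centers-muVbar).^2/(2*stdVbar^2))/(stdVbar*sqrt(2*pi));
figure
bar(centers,pdfSim,1); hold on
plot(centers,pdfNorm,'r','LineWidth',2)
legend('Simulation-based pdf','Normal pdf')
xlabel('Sample mean wind speed (mph)'); ylabel('pdf'); grid
```

How closely the bars follow the red curve is the visual evidence for the determination in (ii); the number of bins only affects the smoothness of the estimate.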
Determination:
(e)(4pts) Now, rather than using a 60-second sampling interval, let Δ = 5 seconds. Then,
over a one-minute observation time we have $n = 12$ measurements. We can no longer
assume that $V_k$ and $V_{k+j}$ will be uncorrelated for small values of j. Specifically,
assume that $\mathrm{Cov}(V_k, V_{k+j}) = 0.6^{j}\,\sigma_V^2$. The matrix $\Sigma_V = \mathrm{Cov}(\mathbf{V},\mathbf{V})$ is:
$$\Sigma_V = \sigma_V^2 \begin{bmatrix} 1 & 0.6 & 0.6^2 & 0.6^3 & \cdots & 0.6^{11} \\ 0.6 & 1 & 0.6 & 0.6^2 & \cdots & 0.6^{10} \\ 0.6^2 & 0.6 & 1 & 0.6 & \cdots & 0.6^{9} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0.6^{11} & 0.6^{10} & 0.6^{9} & 0.6^{8} & \cdots & 1 \end{bmatrix}$$
On p.185 of the book, we have, as a special case of (5-26): $\mathrm{Var}\left(\sum_{k=1}^{12} V_k\right) = \sum_{j=1}^{12}\sum_{k=1}^{12} \mathrm{Cov}(V_j, V_k)$.
This double-sum is simply the sum of the elements of $\Sigma_V$. Use this information to arrive at the numerical value for $\sigma_{\bar{V}}$.
[HINT: $\Sigma_V$ is a symmetric Toeplitz matrix, which can be easily formulated in Matlab using the ‘toeplitz’ command, if you
want. To add up all its elements, simply use the command sum(sum($\Sigma_V$)).]
Solution: [See code @ 2(e).]
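The hint can be sketched in a few lines of base Matlab. The variable names SigmaV, varSum and stdVbar are illustrative; the value stdV = 4 is taken from (c).

```matlab
% Sketch for 2(e): standard deviation of the sample mean when
% Cov(Vk,Vk+j) = 0.6^j * stdV^2 (symmetric Toeplitz covariance).
stdV = 4; n = 12; r = 0.6;
SigmaV = stdV^2 * toeplitz(r.^(0:n-1));  % n-by-n covariance matrix
varSum = sum(sum(SigmaV));               % Var(V1+...+Vn) = sum of all entries
stdVbar = sqrt(varSum)/n;                % since Vbar = (V1+...+Vn)/n
fprintf('sigma_Vbar = %.4f mph\n', stdVbar);
```

Passing a single vector to toeplitz builds the symmetric matrix whose first row and first column are that vector, which matches the structure of the covariance matrix in (e).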
Remark. You should have found that the value for $\sigma_{\bar{V}}$ in (e) is ~70% larger than the value in (c). Correlation makes a
big difference. Specifically, when measurements are correlated one should not blindly use $\sigma_{\bar{V}} = \sigma_V/\sqrt{n}$.
PROBLEM 3 (25pts) Future development of solar-powered drones will necessitate addressing the influence of
Reynolds number (Re), which is a function of speed, on the drag coefficient (Cd). In this problem you will investigate
the basic problem of the influence of Re on Cd in relation to a sphere. Specifically, you will attempt to arrive at a linear
model for the first 125 points of the log of the data given in Figure 1 of the 2013 paper by Professor Faith Morrison:
www.chem.mtu.edu/~fmorriso/DataCorrelationForSphereDrag2013.pdf.
[For those with a keen interest, see also: http://www.grc.nasa.gov/WWW/k-12/airplane/dragsphere.html ]
The code for this problem is given in Appendix 2; specifically, the code related to PROBLEM 3. I have written most of
this code for you. The only code that you need to write is in relation to (i) the linear model $\hat{Y} = mX + b$, where
$X = \log_{10}(Re)$ and $Y = \log_{10}(C_d)$, and (ii) the model errors. [NOTE: You must have the data file ReCd_FM.txt in
your folder, so that the program can load it.]
(a)(8pts) Use the log-data array XY to arrive at estimates of the model parameters m and b. Then from these, obtain the
nonlinear model for Cd .
Solution: [See code at specified location.]
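A minimal sketch of a covariance-based fit of $\hat{Y} = mX + b$ is shown below. Since ReCd_FM.txt is not reproduced here, it runs on synthetic stand-in data generated from Stokes drag, Cd = 24/Re (the first term of FM's model in Appendix 2). The names ReDemo, CdDemo, XYdemo and errDemo are hypothetical, and defining the error as data minus model is an assumption.

```matlab
% Sketch for 3(a): least-squares line Yhat = m*X + b from log-data.
% Synthetic stand-in data (NOT the ReCd_FM.txt data): Stokes drag Cd = 24/Re.
ReDemo = logspace(-1,0,20)';              % hypothetical Re values
CdDemo = 24./ReDemo;                      % hypothetical Cd values
XYdemo = [log10(ReDemo), log10(CdDemo)];  % same layout as the XY array
C = cov(XYdemo);                          % 2x2 sample covariance matrix
mhat = C(1,2)/C(1,1);                     % slope = cov(X,Y)/var(X)
bhat = mean(XYdemo(:,2)) - mhat*mean(XYdemo(:,1));  % intercept
chat = 10^bhat;                           % nonlinear model: chat*Re^mhat
CdhatDemo = chat*ReDemo.^mhat;            % model prediction of Cd
errDemo = CdDemo - CdhatDemo;             % assumed error: data minus model
[mhat bhat]
```

With the exam data, the same slope and intercept formulas would be applied to the columns of XY in place of XYdemo.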
(b)(7pts) The code beneath your answer in (a) recovers the nonlinear model parameter estimates. Then overlay FM’s
model and your model on the data. Give this plot below at LEFT. Then insert your code that will compute the errors of
both models, so that the following code can give a plot of the model errors at RIGHT.
Solution: [See code at specified locations.]
Figure 3(b) Cd prediction models (LEFT), and associated model errors (RIGHT).
(c)(6pts) Based on your plot at RIGHT in (b), discuss which model has greater bias in specific Re regions.
Discussion:
(d)(4pts) Were one to desire a prediction model for this low Re regime, explain why your model would be a more
‘efficient’ one (as reflected by the number of model parameters).
Explanation:

PROBLEM 4 (20pts) In this problem you will address the quadratic model $\hat{Y} = m_1 X + m_2 X^2 + b$ for the Re region
below ~$10^5$. For $X = \log_{10}(Re)$, let $X_1 = X$ and $X_2 = X^2$. Then the log-data prediction model can be expressed as:
$\hat{Y} = \begin{bmatrix} m_1 & m_2 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} + b = M^{tr} X + b$, where $M = \Sigma_{XX}^{-1} \Sigma_{XY}$ and $b = \mu_Y - M^{tr} \mu_X$.
(a)(5pts) Prove that the model for predicting Cd from Re is: $\hat{C}_d = 10^{b}\, Re^{[m_1 + m_2 \log_{10}(Re)]}$.
Proof:
$\hat{Y} = m_1 X_1 + m_2 X_2 + b = m_1$
(b)(10pts) Use the log-data array X2Y given in the code to arrive at estimates of the model parameters M and b. From
these, give the final numerical expression for $\hat{C}_d = 10^{b}\, Re^{[m_1 + m_2 \log_{10}(Re)]}$.
Solution:
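The expressions $M = \Sigma_{XX}^{-1}\Sigma_{XY}$ and $b = \mu_Y - M^{tr}\mu_X$ can be sketched in Matlab as below. The data are synthetic stand-ins built from a known quadratic (not the ReCd_FM.txt data), so that the recovered parameters can be checked by eye. The names Xd, Yd, X2Ydemo, SigXX and SigXY are hypothetical.

```matlab
% Sketch for 4(b): M = inv(Sigma_XX)*Sigma_XY and b = mu_Y - M'*mu_X.
% Synthetic stand-in data (NOT the ReCd_FM.txt data), from a known quadratic.
Xd = linspace(-1,5,50)';            % hypothetical log10(Re) values
Yd = 1.4 - 0.9*Xd + 0.08*Xd.^2;     % hypothetical log10(Cd) values
X2Ydemo = [Xd, Xd.^2, Yd];          % same layout as the X2Y array
C = cov(X2Ydemo);                   % 3x3 sample covariance matrix
SigXX = C(1:2,1:2);                 % covariance of (X1,X2)
SigXY = C(1:2,3);                   % covariance of (X1,X2) with Y
Mhat = SigXX\SigXY;                 % solves SigXX*Mhat = SigXY
mu = mean(X2Ydemo);                 % [mean(X1) mean(X2) mean(Y)]
bhat = mu(3) - Mhat'*mu(1:2)';      % intercept
[Mhat' bhat]
```

The backslash operator solves the linear system directly rather than forming the inverse explicitly, which is the usual Matlab idiom for this step.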
(c)(5pts) Overlay FM’s model and your model on the data. Give this plot below at LEFT. Then insert your code that will
compute the errors of both models, so that the following code can give a plot of the model errors at RIGHT.
Solution: [See code at specified locations.]
Figure 4(c) Cd prediction models (LEFT), and associated model errors (RIGHT).
Appendix 1 Matlab code for Problems 1 and 2
%PROGRAM NAME: exam2.m
%PROBLEM 1(b):
muL=5.21; stdL=.025;
nsim=10^6;
%================================================
%PROBLEM 2:
%2(d):
muV=30; stdV=4;
%The code weibullparams.m gives:
a=31.686; b=8.966;
nsim =10^5; n=10;
%2(e):
Appendix 2 Matlab Code for Problems 3 and 4
%PROGRAM NAME: ReCdModels_FM.m
load ReCd_FM.txt
Re=ReCd_FM(:,1);
Cd=ReCd_FM(:,2);
%--------------------------------------------------------
%Construction of Faith Morrison's (FM's) Model for Cd:
T1=24*Re.^-1;
T2n=2.6*(Re/5); T2d=1+(Re/5).^1.52;
T3n=0.411*(Re/263000).^-7.94; T3d=1+(Re/263000).^-8;
T4=(Re.^0.8)/461000;
CdhatFM=T1+(T2n./T2d)+(T3n./T3d)+T4; %FM's Cd prediction model.
%----------------------------------------------------------
%PROBLEM 3
n1=125; %max index of data for LINEAR regression
XY=[log10(Re(1:n1)),log10(Cd(1:n1))]; %Log10 data
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
%*****Insert your code for computing mhat and bhat HERE*****
[mhat bhat] %These are the parameter estimates for the log-data
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
chat=10^bhat; %The nonlinear model is: chat*Re^mhat
Cdhat=chat*Re(1:n1).^(mhat*ones(n1,1));
figure(30)
loglog(Re(1:n1),Cd(1:n1),'*')
hold on
loglog(Re(1:n1),CdhatFM(1:n1),'k')
loglog(Re(1:n1),Cdhat,'r--','LineWidth',2)
legend('Re/Cd Data','Morrison Cd Model','My Cd Model')
xlabel('Re'); ylabel('Cd')
grid
%Model Errors:
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
%Insert your code for computing prediction errors HERE:
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
figure(31)
semilogx(Re(1:n1),errFM(1:n1),'k*')
hold on
semilogx(Re(1:n1),err,'ro')
legend('Morrison Cd Model Errors','My Cd Model Errors')
grid
xlabel('Re')
ylabel('Cd Error')
%========================================================
%========================================================
%PROBLEM 4: QUADRATIC model over BOTH regions:
n2=759;
X21=log10(Re(1:n2)); X22=X21.^2; Y2=log10(Cd(1:n2));
X2Y=[X21,X22,Y2]; % 759x3 log-data array
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
%Insert your code for (b) HERE:
[Mhat' bhat]
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Y2hat=Mhat(1)*X21 + Mhat(2)*X22 + bhat; %Matlab is case-sensitive: use bhat, as displayed above
Cdhat=10^bhat * Re(1:n2).^(Mhat(1)*ones(n2,1) + Mhat(2)*log10(Re(1:n2)));
figure(32)
loglog(Re(1:n2),Cd(1:n2),'*')
hold on
loglog(Re(1:n2),CdhatFM(1:n2),'k')
loglog(Re(1:n2),Cdhat,'r--','LineWidth',2)
legend('Re/Cd Data','Morrison Cd Model','My Cd Model')
xlabel('Re'); ylabel('Cd')
grid
%Model Errors:
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
%***Insert your code for computing prediction errors HERE***
%+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
figure(33)
semilogx(Re(1:n2),errFM(1:n2),'k*')
hold on
semilogx(Re(1:n2),err,'ro')
legend('Morrison Cd Model Errors','My Cd Model Errors','Location','SouthEast')
grid
xlabel('Re')
ylabel('Cd Error')