Stochastic Modelling and Geostatistics

Lectures 14 and 15
More than One Variable, Curve Fitting, and the Method of Least Squares
Two Variables
Often two variables are connected in some way. We observe n pairs:

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
Covariance
The covariance gives some information about the extent to which two random variables influence each other:

$$\operatorname{Cov}(x, y) = E\{(x - E\{x\})(y - E\{y\})\} = E\{xy\} - E\{x\}E\{y\}$$

It is computed from the sample as

$$\operatorname{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

If x = y,

$$\operatorname{Cov}(x, x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sigma_x^2$$
Example Covariance
[Figure: scatter plot of the five observation pairs]

$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n}$$

x_i   y_i   x_i - x̄   y_i - ȳ   (x_i - x̄)(y_i - ȳ)
0     3     -3         0          0
2     2     -1        -1          1
3     4      0         1          0
4     0      1        -3         -3
6     6      3         3          9

With x̄ = 3 and ȳ = 3, the products sum to 7, so

$$\operatorname{cov}(x, y) = \frac{7}{5} = 1.4$$

What does this number tell us?
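To check this arithmetic, here is a minimal Python sketch (using NumPy, which is an assumption on my part; the lecture itself works in Excel) that reproduces the covariance of 1.4:

```python
import numpy as np

x = np.array([0, 2, 3, 4, 6], dtype=float)
y = np.array([3, 2, 4, 0, 6], dtype=float)

# Population covariance: the mean of the products of deviations.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
print(cov_xy)  # 1.4

# np.cov divides by (n - 1) by default; bias=True divides by n,
# matching the lecture's formula.
print(np.cov(x, y, bias=True)[0, 1])  # 1.4
```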
Pearson’s R
• Covariance by itself is hard to interpret: its magnitude depends on the units and scale of x and y.
  – Solution: standardise this measure.
• Pearson's R: standardise by dividing by the standard deviations:

$$\rho_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$
Correlation Coefficient
$$\rho(x, y) = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{E\{(x - E\{x\})(y - E\{y\})\}}{\sigma_x \sigma_y}$$

It is computed from the sample as

$$\rho(x, y) = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$

$$-1 \le \rho(x, y) \le 1$$

If x = y, then ρ(x, x) = 1.
ρ(x, y) = 0: there is no linear relation between x and y.
ρ(x, y) = -1: there is a perfect inverse (negative) linear relation between x and y.
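The sample formula translates into a short Python sketch (NumPy assumed; the 1/n factors cancel between numerator and denominator, so they are dropped), cross-checked against NumPy's built-in:

```python
import numpy as np

def pearson_r(x, y):
    """Sample correlation coefficient; the 1/n factors cancel."""
    dx = x - x.mean()
    dy = y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

x = np.array([0, 2, 3, 4, 6], dtype=float)
y = np.array([3, 2, 4, 0, 6], dtype=float)
print(pearson_r(x, y))          # 0.35 for the covariance example above
print(np.corrcoef(x, y)[0, 1])  # same value from NumPy's built-in
```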
Correlation Coefficient (Cont.)
[Figure: four scatter plots of Y against X illustrating different strengths of correlation: a negative ρ(x, y), ρ(x, y) = 0, ρ(x, y) = 1, and a second negative ρ(x, y)]
Procedure of Best Fitting (Step 1)
How do we find the relation between the two variables?

1. Observe the pairs:

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
Procedure of Best Fitting (Step 2)
2. Plot the observations.

It is always difficult to decide whether a curved line fits a set of data nicely; straight lines are preferable, so we change the scale (transform the variables) to obtain straight lines.

[Figure: scatter plot of the observations, Y against X]
Method of Least Squares (Step 3)
3. Specify a straight-line relation:

$$Y = a + bX$$

We need to find the a and b that minimise the sum of the squared differences between the line and the observed data.

[Figure: scatter plot with the fitted line Y = a + bX]
Step 3 (cont.)
Find the best fit of a line through a cloud of observations: the principle of least squares,

$$\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n} \to \min$$

where, for the line $\hat{y} = a + bx$:
ŷ_i = predicted value
y_i = observed (true) value
ε_i = y_i - ŷ_i = residual error
Method of Least Squares (Step 4)
The sum of the squared deviations is

$$S(a, b) = \sum_{i=1}^{n}\left[y_i - (a + bx_i)\right]^2$$

The values of a and b for which S is minimum satisfy

$$\frac{\partial S(a, b)}{\partial a} = 0 \quad \text{and} \quad \frac{\partial S(a, b)}{\partial b} = 0$$

Taking the derivative with respect to a:

$$\frac{\partial}{\partial a}\sum_{i=1}^{n}\left[y_i - (a + bx_i)\right]^2 = \frac{\partial}{\partial a}\sum_{i=1}^{n}\left[y_i^2 - 2y_i(a + bx_i) + (a + bx_i)^2\right] = 0$$

$$\sum_{i=1}^{n}\left[-2y_i + 2(a + bx_i)\right] = 0$$

$$\sum_{i=1}^{n}\left[y_i - (a + bx_i)\right] = 0$$

$$\sum_{i=1}^{n} y_i - na - b\sum_{i=1}^{n} x_i = 0$$
Method of Least Squares (Step 5)
Taking the derivative with respect to b:

$$\frac{\partial S(a, b)}{\partial b} = 0$$

$$\frac{\partial}{\partial b}\sum_{i=1}^{n}\left[y_i - (a + bx_i)\right]^2 = \frac{\partial}{\partial b}\sum_{i=1}^{n}\left[y_i^2 - 2y_i(a + bx_i) + (a + bx_i)^2\right] = 0$$

$$\sum_{i=1}^{n}\left[-2x_i y_i + 2(a + bx_i)x_i\right] = 0$$

$$\sum_{i=1}^{n}\left[x_i y_i - (ax_i + bx_i^2)\right] = 0$$

$$\sum_{i=1}^{n} x_i y_i - a\sum_{i=1}^{n} x_i - b\sum_{i=1}^{n} x_i^2 = 0$$
Method of Least Squares (Step 6)

Solving the two normal equations for a and b:

$$a = \frac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

The fitted line passes through the point of means:

$$\bar{y} = a + b\bar{x}$$
Example
We have the following eight pairs of observations:

x_i : 1   3   4   6   8   9   11   14
y_i : 1   2   4   4   5   7    8    9
Example (Cont.)
Construct the least-squares line (n = 8):

         x_i    y_i    x_i²    x_i·y_i    y_i²
         1      1      1       1          1
         3      2      9       6          4
         4      4      16      16         16
         6      4      36      24         16
         8      5      64      40         25
         9      7      81      63         49
         11     8      121     88         64
         14     9      196     126        81
Σ        56     40     524     364        256
(1/n)Σ   7      5      65.5    45.5       32
Example (Cont.)
Using the column sums from the table on the previous slide:

$$a = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{40 \cdot 524 - 56 \cdot 364}{8 \cdot 524 - 56 \cdot 56} = \frac{6}{11} \approx 0.545$$

$$b = \frac{n\sum x_i y_i - \sum y_i \sum x_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{8 \cdot 364 - 56 \cdot 40}{8 \cdot 524 - 56 \cdot 56} = \frac{7}{11} \approx 0.636$$
Example (Cont.)
[Figure: the eight data points with the fitted line]

Equation: Y = 0.545 + 0.636·X
Number of data points used = 8
Average X = 7, Average Y = 5
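As a sanity check (NumPy assumed; np.polyfit solves the same least-squares problem as the hand derivation), the coefficients can be reproduced directly:

```python
import numpy as np

x = np.array([1, 3, 4, 6, 8, 9, 11, 14], dtype=float)
y = np.array([1, 2, 4, 4, 5, 7, 8, 9], dtype=float)

# np.polyfit returns the highest-degree coefficient first: [b, a].
b, a = np.polyfit(x, y, deg=1)
print(a, b)  # approximately 0.545 and 0.636
```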
Example (2)
i     1      2      3      4      5
x_i   2.10   6.22   7.17   10.5   13.7
y_i   2.90   3.83   5.98   5.71   7.74

$$\sum_{i=1}^{5} x_i = 39.69, \qquad \sum_{i=1}^{5} x_i^2 = 392.3201, \qquad \sum_{i=1}^{5} y_i = 26.16, \qquad \sum_{i=1}^{5} x_i y_i = 238.7416$$

$$a = \frac{\tfrac{1}{5}(26.16)(392.3) - \tfrac{1}{5}(39.69)(238.7)}{392.3 - \tfrac{1}{5}(39.69)^2} = 2.038$$

$$b = \frac{238.7 - \tfrac{1}{5}(39.69)(26.16)}{392.3 - \tfrac{1}{5}(39.69)^2} = 0.4023$$

$$y = 2.038 + 0.4023x$$

(These are the same formulas as before, with numerator and denominator both divided by n.)
Example (3): Excel Application
• See the accompanying Excel workbook.
Covariance and the Correlation Coefficient
• Use COVAR to calculate the covariance:
  Cell: =COVAR(array1, array2)
  – Average of the products of deviations for each data-point pair
  – Depends on the units of measurement
• Use CORREL to return the correlation coefficient:
  Cell: =CORREL(array1, array2)
  – Returns a value between -1 and +1
• Both are also available in the Analysis ToolPak
Analysis ToolPak
• Descriptive Statistics
• Correlation
• Linear Regression
• t-Tests
• z-Tests
• ANOVA
• Covariance
Descriptive Statistics
• Mean, Median, Mode
• Standard Error
• Standard Deviation
• Sample Variance
• Kurtosis
• Skewness
• Confidence Level for Mean
• Range
• Minimum
• Maximum
• Sum
• Count
• kth Largest
• kth Smallest
Correlation and Regression
• Correlation is a measure of the strength of linear association between two variables
  – Values between -1 and +1
  – Values close to -1 indicate a strong negative relationship
  – Values close to +1 indicate a strong positive relationship
  – Values close to 0 indicate a weak relationship
• Linear Regression is the process of finding a line of best fit through a series of data points
  – Can also use the SLOPE, INTERCEPT, CORREL and RSQ functions
Polynomial Regression
• Minimize the residual between the data points and the curve: least-squares regression.

Linear: $y_i = a_0 + a_1 x_i$
Quadratic: $y_i = a_0 + a_1 x_i + a_2 x_i^2$
Cubic: $y_i = a_0 + a_1 x_i + a_2 x_i^2 + a_3 x_i^3$
General: $y_i = a_0 + a_1 x_i + a_2 x_i^2 + a_3 x_i^3 + \cdots + a_m x_i^m$

Must find the values of $a_0, a_1, a_2, \ldots, a_m$.
Polynomial Regression
• Residual:

$$e_i = y_i - (a_0 + a_1 x_i + a_2 x_i^2 + a_3 x_i^3 + \cdots + a_m x_i^m)$$

• Sum of squared residuals:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}\left[y_i - (a_0 + a_1 x_i + a_2 x_i^2 + a_3 x_i^3 + \cdots + a_m x_i^m)\right]^2$$

• Minimize by taking the derivative of $S_r$ with respect to each coefficient and setting it to zero.
Polynomial Regression
• Normal Equations:

$$
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \cdots & \sum x_i^m \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \cdots & \sum x_i^{m+1} \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \cdots & \sum x_i^{m+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum x_i^m & \sum x_i^{m+1} & \sum x_i^{m+2} & \cdots & \sum x_i^{2m}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \\ \vdots \\ \sum x_i^m y_i \end{bmatrix}
$$

(all sums run over $i = 1, \ldots, n$)
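A minimal Python sketch of this system (NumPy assumed; polyfit_normal is my own name) builds the (m+1) x (m+1) matrix of power sums and solves it:

```python
import numpy as np

def polyfit_normal(x, y, m):
    """Fit a degree-m polynomial by solving the normal equations above.
    Returns the coefficients [a0, a1, ..., am]."""
    # Power sums sum(x**k) for k = 0 .. 2m fill the coefficient matrix;
    # entry (j, k) of the matrix is sum(x**(j + k)).
    S = [np.sum(x ** k) for k in range(2 * m + 1)]
    A = np.array([[S[j + k] for k in range(m + 1)] for j in range(m + 1)])
    rhs = np.array([np.sum((x ** j) * y) for j in range(m + 1)])
    return np.linalg.solve(A, rhs)
```

Solving the normal equations directly is fine for the low degrees used in this lecture; for high degrees the system becomes ill-conditioned, and np.polyfit, which uses a more stable solver, is preferable.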
Example
x     y        x      y
0     0.2      9.3    -0.3
1.0   0.8      11.0   -1.3
1.5   2.5      11.3   -3.0
2.3   2.5      12.1   -4.0
2.5   3.5      13.1   -4.9
4.0   4.3      14.0   -4.0
5.1   3.0      15.5   -5.2
6.0   5.0      16.0   -3.0
6.5   3.5      17.5   -3.5
7.0   2.4      17.8   -1.6
8.1   1.3      19.0   -1.4
9.0   2.0      20.0   -0.1

[Figure: scatter plot of f(x) against x for the 24 data points]
Example
Using the same 24 data points, the cubic (m = 3) normal equations are

$$
\begin{bmatrix}
n & \sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \sum x_i^5 \\
\sum x_i^3 & \sum x_i^4 & \sum x_i^5 & \sum x_i^6
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \\ \sum x_i^3 y_i \end{bmatrix}
$$

Numerically:

$$
\begin{bmatrix}
24 & 229.6 & 3060.2 & 46342.8 \\
229.6 & 3060.2 & 46342.8 & 752835.2 \\
3060.2 & 46342.8 & 752835.2 & 12780147.7 \\
46342.8 & 752835.2 & 12780147.7 & 223518116.8
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} -1.30 \\ -316.9 \\ -6037.2 \\ -99433.6 \end{bmatrix}
$$
Example
Solving this system gives

$$\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} -0.3593 \\ 2.3051 \\ -0.3532 \\ 0.0121 \end{bmatrix}$$

Regression equation:

$$y = -0.359 + 2.305x - 0.353x^2 + 0.012x^3$$

[Figure: the 24 data points with the fitted cubic curve]
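The cubic can be checked numerically (NumPy assumed) by fitting the 24 data points directly:

```python
import numpy as np

x = np.array([0, 1.0, 1.5, 2.3, 2.5, 4.0, 5.1, 6.0, 6.5, 7.0, 8.1, 9.0,
              9.3, 11.0, 11.3, 12.1, 13.1, 14.0, 15.5, 16.0, 17.5, 17.8,
              19.0, 20.0])
y = np.array([0.2, 0.8, 2.5, 2.5, 3.5, 4.3, 3.0, 5.0, 3.5, 2.4, 1.3, 2.0,
              -0.3, -1.3, -3.0, -4.0, -4.9, -4.0, -5.2, -3.0, -3.5, -1.6,
              -1.4, -0.1])

# Reverse np.polyfit's output so the coefficients read [a0, a1, a2, a3].
print(np.polyfit(x, y, deg=3)[::-1])
# approximately [-0.359, 2.305, -0.353, 0.012]
```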
Nonlinear Relationships
y  aebx
To make it linear, take logarithm of both sides
• If relationship is an exponential function
ln (y)  ln (a) + bx
Now it’s a linear relation between ln(y) and x
• If relationship is a power function
y  ax
To make linear, take logarithm of both sides
ln (y)  ln (a) + b ln (x)
Now it’s a linear relation between ln(y) and ln(x)
b
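As a sketch of the power-function case (NumPy assumed; the data here is hypothetical, not from the lecture): fit a straight line to (ln x, ln y), then back-transform the intercept.

```python
import numpy as np

# Hypothetical data roughly following y = 2 * x**1.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 5.5, 10.6, 15.8, 22.5])

# Straight-line fit in log-log space: ln(y) = ln(a) + b*ln(x)
b, ln_a = np.polyfit(np.log(x), np.log(y), deg=1)
a = np.exp(ln_a)  # back-transform the intercept
print(a, b)       # close to 2 and 1.5
```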
Examples
• Quadratic curve: $y = a_0 + a_1 x + a_2 x^2$
  – Flow rating curve: $q = a_0 + a_1 H + a_2 H^2$
    • q = measured discharge
    • H = stage (height) of water behind the outlet
• Power curve: $y = a x^b$
  – Sediment transport: $c = a q^b$
    • c = concentration of suspended sediment
    • q = river discharge
  – Carbon adsorption: $q = K c^n$
    • q = mass of pollutant sorbed per unit mass of carbon
    • c = concentration of pollutant in solution
Example – Log-Log
x     y      X = ln(x)   Y = ln(y)
1.2   2.1    0.18        0.74
2.8   11.5   1.03        2.44
4.3   28.1   1.46        3.34
5.4   41.9   1.69        3.74
6.8   72.3   1.92        4.28
7.9   91.4   2.07        4.52

[Figure: left, y against x on linear axes; right, Y = ln(y) against X = ln(x), approximately a straight line]
Example – Log-Log
Using the X's and Y's, not the original x's and y's:

$$
\begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}
$$

$$
\begin{bmatrix} 6 & 8.34 \\ 8.34 & 14.0 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix} 19.1 \\ 31.4 \end{bmatrix}
$$

where the sums run over all n = 6 points:

$$\sum_{i=1}^{6} X_i = \sum_{i=1}^{6}\ln(x_i) = 8.34$$

$$\sum_{i=1}^{6} X_i^2 = \sum_{i=1}^{6}\left[\ln(x_i)\right]^2 = 14.0$$

$$\sum_{i=1}^{6} Y_i = \sum_{i=1}^{6}\ln(y_i) = 19.1$$

$$\sum_{i=1}^{6} X_i Y_i = \sum_{i=1}^{6}\ln(x_i)\ln(y_i) = 31.4$$

Here A = ln(a) is the intercept and B = b is the slope of the linearised relation Y = A + BX.
Example – Carbon Adsorption
q  K c 
n
q = pollutant mass sorbed per carbon mass
C = concentration of pollutant in solution,
K = coefficient
n = measure of the energy of the reaction
log10 q  log10 K  n log10 c
Example – Carbon Adsorption
Linear axes: K = 74.702 and n = 0.2289.

[Figure: q against c on linear axes with the fitted curve q = Kcⁿ]
Example – Carbon Adsorption
Logarithmic axes:

$$\log_{10} K = 1.8733, \qquad K = 10^{1.8733} = 74.696, \qquad n = 0.2289$$

[Figure: Y = log(q) against X = log(c), a straight line with slope n and intercept log₁₀K]
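The back-transform itself is a one-line check in Python:

```python
print(10 ** 1.8733)  # approximately 74.696, matching K above
```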
Multiple Regression
yi  axi  b  e i
Regression model
• Y1 = x11 b1 + x12 b2 +…+ x1n bn + e1
Y2 = x21 b1 + x22 b2 +…+ x2n bn + e2
:
Ym = xm1 b1 + xm2 b2 +…+ xmn bn + em
.
 y1   x11 x12 x1n b
 y   x x x b21
 2   21 22 2nb
 ym  xm1xm2 xmn n
Multiple regression model
e1 
  
  e 2 
 e 
 m
In matrix notation
Multiple Regression (cont.)
The same system in compact matrix notation:

$$\mathbf{Y} = \mathbf{X}\,\mathbf{b} + \mathbf{e}$$

Observed data = design matrix × parameters + residuals
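In this matrix form, the least-squares estimate is the b that minimises ||Y - Xb||². A minimal sketch with NumPy (the data here is hypothetical):

```python
import numpy as np

# Hypothetical design matrix: m = 5 observations, n = 2 parameters
# (a column of ones for the intercept plus one predictor).
X = np.array([[1.0, 2.0],
              [1.0, 3.5],
              [1.0, 5.0],
              [1.0, 6.5],
              [1.0, 8.0]])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])

# Least-squares solution of Y = X b + e
b, residuals, rank, sv = np.linalg.lstsq(X, Y, rcond=None)
print(b)  # estimated parameter vector
```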