
Computational Intelligence SS15
Homework 4
Maximum Likelihood Estimation
Christian Knoll

Tutor: Johann Steiner, [email protected]
Points to achieve: 19 pts
Extra points: 5* pts
Info hour: 08.06.2015, 14:00-15:00, HS i11
Deadline: 11.06.2015, 19:00
Hand-in mode: Hand in a hard copy of your results at the SPSC hand-in boxes (Inffeldgasse 16c, ground floor) on 11.06.2015 between 9:00 and 19:00. Use the cover sheet from the website.
Hand-in instructions: Submit your Matlab files and a colored version of your report (as .pdf) as a .zip archive via TeachCenter file upload (MyFiles). (Please name your archive hw4-Familyname1Familyname2Familyname3.zip)
Course website: https://www.spsc.tugraz.at/courses/computational-intelligence
Newsgroup: tu-graz.lv.ci
General remarks
Your submission will be graded based on:
• The correctness of your results (Is your code doing what it should be doing? Are your plots
consistent with what algorithm XY should produce for the given task? Is your derivation
of formula XY correct?)
• The depth and correctness of your interpretations (Keep your interpretations as short as
possible, but as long as necessary to convey your ideas)
• The quality of your plots (Is everything important clearly visible in the print-out, are axes
labeled, . . . ?)
Figure 1: Geometry for Homework 4: four anchors (at known positions, like the satellites in GPS) are used to estimate the position of an agent, indicated as p_true.
1 Maximum Likelihood Estimation

1.1 Scenario and model
Download hw4.zip from the course website and unzip. HW4_skeleton.m contains a skeleton
of the main Matlab script that you should implement. The aim is to estimate the position
p = [x, y]^T of an agent using noisy distance measurements r_i, i = 1, ..., NA (independent and
identically distributed) to the NA = 4 anchor nodes. These are shown in Fig. 1, and are at the
positions
p_a,1 = [5, 5]^T,  p_a,2 = [−5, 5]^T,  p_a,3 = [−5, −5]^T,  p_a,4 = [5, −5]^T.    (1)
We are not able to measure the true distance of the agent at position p to these anchors,
di (p). Hence, we need a statistical model that describes the error in the measurements ri and
puts them in relation to the unknown parameter that we want to estimate (p). We consider
two cases, in which the distance measurements to the i-th anchor are distributed as
Case I (Gaussian):
    p(r_i | p) = 1 / sqrt(2π σ_r^2) · exp( −(r_i − d_i(p))^2 / (2σ_r^2) )    (2)

Case II (Exponential):
    p(r_i | p) = λ exp(−λ(r_i − d_i(p)))  if r_i ≥ d_i(p),  and 0 else    (3)
The dependence on the parameter p is given as the Euclidean distance to the i-th anchor, i.e.
d_i(p) = sqrt((x_i − x)^2 + (y_i − y)^2).    (4)
This nonlinear dependence of di on p does not allow for a closed form solution of an ML
estimator for p. A popular alternative cost function to the likelihood is the sum of squares of
the measurement errors, yielding the least-squares estimator:
p̂_ML = arg max_p p(r|p) = arg max_p ∏_{i=1}^{NA} p(r_i|p)    (5)

     ≈ p̂_LS = arg min_p ∑_{i=1}^{NA} (r_i − d_i(p))^2 = arg min_p ||r − d(p)||^2    (6)
where the vector r contains all distance measurements ri , and d(p) is a vector that contains
distances di (p) calculated with the position p using (4). Since this still contains the nonlinear
dependence of di on p, an iterative algorithm is often applied for its solution. We will use the
Gauss-Newton algorithm.
It requires the calculation of the Jacobian matrix of the measurement errors, which collects
the first-order derivatives (linearizations) of the measurement errors. In this example, this
matrix has the dimensions (NA × 2). The two columns are defined as
[J_r(p)]_{i,1} = ∂(r_i − d_i(p)) / ∂x,    [J_r(p)]_{i,2} = ∂(r_i − d_i(p)) / ∂y.    (7)
The algorithm starts with an initial guess p̂(0) and updates the parameter in the t-th iteration
as
p̂^(t+1) = p̂^(t) − ( J_r^T(p̂^(t)) J_r(p̂^(t)) )^{−1} J_r^T(p̂^(t)) ( r − d(p̂^(t)) )    (8)
The algorithm stops after a previously defined maximum number of iterations, or if ||p̂^(t) − p̂^(t−1)|| < γ, i.e. if the change in the estimated position is smaller than a chosen tolerance value γ.
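The iteration above can be outlined in a few lines. The homework asks for a Matlab function (LS_GN, Section 1.3); the following Python/NumPy version, with illustrative names, is only a sketch of the same update:

```python
# Minimal Gauss-Newton sketch for the least-squares position estimate.
# Assumptions: anchors as a (NA, 2) array, measured distances as a (NA,) array.
import numpy as np

def ls_gn(p_anchors, p0, r, max_iter, tol):
    """Iterate update (8) until convergence or max_iter is reached."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        diff = p_anchors - p                 # anchor minus agent, (NA, 2)
        d = np.linalg.norm(diff, axis=1)     # distances d_i(p), eq. (4)
        J = diff / d[:, None]                # Jacobian of r - d(p), eq. (7)
        # Gauss-Newton step: solve (J^T J) s = J^T (r - d), then p <- p - s
        step = np.linalg.solve(J.T @ J, J.T @ (r - d))
        p_new = p - step
        if np.linalg.norm(p_new - p) < tol:  # stopping rule with tolerance gamma
            return p_new
        p = p_new
    return p
```

A uniformly random starting position inside the anchor square can be drawn with, e.g., np.random.uniform(-5, 5, size=2).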
1.2
Maximum Likelihood Estimation of Models [6 Points]
We use three different scenarios for the evaluation. Each scenario considers a different assignment of the measurement models (2) and (3):
Scenario 1: Measurements of all anchors follow the Gaussian model.
Scenario 2: Measurements of one anchor follow the Exponential model, the other ones follow
the Gaussian model.
Scenario 3: Measurements of all anchors follow the Exponential model.
The according measurement realizations are contained in the files data_HW4_x.mat as (N × NA )
matrices r, where N is the number of trials, in this case 2000.
1. For scenario 2, identify the anchor whose measurements are exponentially distributed.
2. For the exponential distribution, derive the maximum likelihood estimator for λ. Hint:
You may assume that the true distances di (ptrue ) are known.
3. Estimate the parameters of the measurement models (2) and (3), i.e. σ_r^2 and λ, using the maximum likelihood method for the different scenarios and anchors.
1.3
Least-Squares Estimation of the Position [11 Points]
• Show analytically that for scenario 1 (joint likelihood for the distances is Gaussian), the
least-squares estimator of the position is equivalent to the maximum likelihood estimator,
i.e. that (5) equals (6)!
• Implement the Gauss-Newton algorithm to find the least-squares estimate for the position.
Write a Matlab function [p_e] = LS_GN(p_a, p_0, r, max_iter, tol), which takes
the (2 × NA) anchor positions, the (2 × 1) initial position, the (NA × 1) distance estimates,
the maximum number of iterations, and the chosen tolerance as input. The output is the
estimated position.
• For all three scenarios, evaluate your estimation algorithm using the provided data. For
each of the N = 2000 trials, choose the starting position p0 randomly according to a
uniform distribution within the square spanned by the anchor points. Have a look at:
– The mean and variance of the position estimation error ||p̂_LS − p_true||.
– Scatter plots of the estimated positions. Fit a two-dimensional Gaussian distribution
to the point cloud of estimated positions and draw its contour lines. You can use the
provided function plotGaussContour.m. Do the estimated positions look Gaussian?
– The cumulative distribution function (CDF) of the position estimation error, i.e. the
probability that the error is smaller than a given error. You can use the Matlab
function [Fx, x] = ecdf(pos_error) for the estimation of the CDF. For plotting,
use plot(x, Fx, 'some style'). With this, you can compare the different scenarios.
What can you say about the probability of large estimation errors?
• Compare the performance of scenario 2 with the case that you do not use the anchor with
the exponentially distributed measurements at all! What can you observe (Gaussianity of
the estimated positions, probability of large errors, . . . )?
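For reference, the empirical CDF produced by Matlab's ecdf can also be computed by hand; the following is a hypothetical Python/NumPy equivalent (not part of the provided files):

```python
# Empirical CDF: sort the error samples; F(x) jumps by 1/N at each sample.
import numpy as np

def ecdf(samples):
    """Return (Fx, x): F(x_k) = (k+1)/N over the sorted samples x."""
    x = np.sort(np.asarray(samples, dtype=float))
    Fx = np.arange(1, x.size + 1) / x.size
    return Fx, x
```

Plotting it as a step function (e.g. matplotlib's plt.step(x, Fx)) and overlaying the scenarios in one figure makes the tail behavior, i.e. the probability of large errors, directly comparable.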
1.4
Numerical Maximum-Likelihood Estimation of the Position [2+5* Points]
For non-Gaussian distributed data, the maximum-likelihood estimator is in general not equivalent to the least-squares estimator. In this example, we want to compare the least-squares
estimator with a direct maximization of the likelihood function for scenario 3 (all anchors have
exponentially distributed measurements).
1. For the first trial (i.e. the first NA distance estimates), compute the joint likelihood
function p(r|p) according to (3) over a two dimensional grid with a resolution of 5 cm.
Confine the evaluation to the square region that is enclosed by the anchors. Why might it
be hard to find the maximum of this function with a gradient ascent algorithm using an
arbitrary starting point within the evaluation region? Is the maximum at the true position
[2 Points]?
2. For all trials, compute a numerical maximum likelihood estimate based on the joint
likelihood function evaluated over the grid, i.e. just take the maximum of p(r|p) as an
estimate. Compare the performance of this estimator (using the same considerations as
above) with the least-squares algorithm for the data from scenario 3. Is the comparison
fair? Is this truly a maximum likelihood estimator [3* Points]?
3. Using a Gaussian prior pdf p(p) centered at p_true with a diagonal covariance matrix Σ = diag{σ_p^2}, σ_p = 1 m, compute a Bayesian estimator for the position

   p̂_Bayes = arg max_p p(r|p) p(p).    (9)

   Again, use the evaluation over the grid. Describe the results! Does the prior knowledge enhance the accuracy [2* Points]?