CDS 110b: Lecture 1.1 Course Overview

Probability for Estimation
(review)
In general, we want to develop an estimator for systems of the form:
๐‘ฅ = ๐‘“ ๐‘ฅ, ๐‘ข + ฮท(๐‘ก);
y = โ„Ž ๐‘ฅ + ๐œ”(๐‘ก);
๐‘”๐‘–๐‘ฃ๐‘’๐‘› ๐‘ฆ, ๐‘“๐‘–๐‘›๐‘‘ ๐‘ฅ
We will primarily focus on discrete time linear systems
๐‘ฅ๐‘˜+1 = ๐ด๐‘˜ ๐‘ฅ๐‘˜ + ๐ต๐‘˜ ๐‘ข๐‘˜ + ฮท๐‘˜ ;
๐‘ฆ๐‘˜ = ๐ป๐‘˜ ๐‘ฅ๐‘˜ + ๐œ”๐‘˜ ;
Where
โ€“ Ak, Bk, Hk are constant matrices
โ€“ xk is the state at time tk; uk is the control at time tk
โ€“ ฮทk ,wk are โ€œdisturbancesโ€ at time tk
Goal: develop procedure to model disturbances for estimation
โ€“ Kolmogorov probability, based on axiomatic set theory
โ€“ 1930โ€™s onward
Axioms of Set-Based Probability
Probability Space:
โ€“ Let W be a set of experimental outcomes (e.g., roll of dice)
ฮฉ = ๐ด1 , ๐ด2 , โ€ฆ , ๐ด๐‘
โ€ข the Ai are โ€œelementary eventsโ€ and subsets of W are termed โ€œeventsโ€
โ€ข Empty set {โˆ…} is the โ€œimpossible eventโ€
โ€ข S={ฮฉ} is the โ€œcertain eventโ€
โ€“ A probability space (ฮฉ, F,P)
โ€ข F = set of subsets of W, or โ€œeventsโ€, P assigns probabilities to events
Probability of an Eventโ€”the Key Axioms:
โ€“ Assign to each Ai a number, P(Ai), termed the โ€œprobabilityโ€ of event Ai
โ€“ P(Ai) must satisfy these axioms
1. P(Ai) โ‰ฅ 0
2. P(S) = 1
3. If events A,B ฯต W are โ€œmutually exclusive,โ€ or disjoint, elements or
events (AโˆฉB= {โˆ…}), then
P A โˆช B = ๐‘ƒ ๐ด + ๐‘ƒ(๐ต)
Axioms of Set-Based Probability
As a result of these three axioms and basic set operations (e.g.,
DeMorganโ€™s laws, such as ๐ด โˆช ๐ต=๐ด โˆฉ ๐ต)
โ€“ P({โˆ…})=0
โ€“ P(A) = 1-P(๐ด) โ‡’ P(A) + P(๐ด) = 1, where ๐ด is complement of A
โ€“ If ๐ด1 , ๐ด2 , โ€ฆ , ๐ด๐‘ mutually disjoint
P ๐ด1 โˆช ๐ด1 โˆช โ‹ฏ โˆช ๐ด๐‘ = ๐‘ƒ ๐ด1 + ๐‘ƒ ๐ด1 + โ‹ฏ + ๐‘ƒ(๐ด๐‘ )
For W an infinite, but countable, set we add the โ€œAxiom of infinite
additivityโ€
3(b). If ๐ด1 , ๐ด2 , โ€ฆ are mutually exclusive,
P ๐ด1 โˆช ๐ด1 โˆช โ‹ฏ = ๐‘ƒ ๐ด1 + ๐‘ƒ ๐ด1 + โ‹ฏ
We assume that all countable sets of events satisfy Axioms 1, 2, 3, 3(b)
But we need to model uncountable setsโ€ฆ
Continuous Random Variables (CRVs)
Let ฮฉ = โ„ (an uncountable set of events)
โ€“ Problem: it is not possible to assign probabilities to subsets of โ„ which
satisfy the above Axioms
โ€“ Solution:
โ€ข let โ€œeventsโ€ be intervals of โ„: A = {x | xl โ‰ค x โ‰ค xu}, and their countable
unions and intersections.
โ€ข Assign probabilities to these events
P ๐‘ฅ๐‘™ โ‰ค ๐‘ฅ โ‰ค ๐‘ฅ๐‘ข = ๐‘ƒ๐‘Ÿ๐‘œ๐‘๐‘Ž๐‘๐‘–๐‘™๐‘–๐‘ก๐‘ฆ ๐‘กโ„Ž๐‘Ž๐‘ก ๐‘ฅ ๐‘ก๐‘Ž๐‘˜๐‘’๐‘  ๐‘ฃ๐‘Ž๐‘™๐‘ข๐‘’๐‘  ๐‘–๐‘› [๐‘ฅ๐‘™ , ๐‘ฅ๐‘ข ]
โ€ข x is a โ€œcontinuous random variable (CRV).
Some basic properties of CRVs
โ€“ If x is a CRV in ๐ฟ, ๐‘ˆ , then P(L โ‰ค x โ‰ค L) = 1
โ€“ If y in ๐ฟ, ๐‘ˆ , then P(L โ‰ค y โ‰ค x) = 1 - P(y โ‰ค x โ‰ค U)
Probability Density Function (pdf)
๐‘ฅ๐‘ข
๐‘ ๐‘ฅ๐‘™ โ‰ค ๐‘ฅ โ‰ค ๐‘ฅ๐‘ข โ‰ก
๐‘ ๐‘ฅ ๐‘‘๐‘ฅ
๐‘ฅ๐‘™
E.g.
๐‘ ๐‘ฅ
โ€“ Uniform Probability pdf:
1
๐‘ ๐‘ฅ = ๐‘โˆ’๐‘Ž
โ€“ Gaussian (Normal) pdf:
๐‘ ๐‘ฅ =
1
๐œŽ 2๐œ‹
๐‘’
1 ๐‘ฅโˆ’๐œ‡ 2
๐œŽ
โˆ’2
µ =โ€œmeanโ€ of pdf
s = โ€œstandard deviationโ€
๐‘ ๐‘ฅ
Most of our Estimation theory will be built on the Gaussian distribution
Joint & Conditional Probability
Joint Probability:
โ€“ Countable set of events: P A โˆฉ B = P(A,B), probability A & B both occur
โ€“ CRVs: let x,y be two CRVs defined on the same probability space. Their
โ€œjoint probability density functionโ€ p(x,y) is defined as:
๐‘ฆ๐‘ข
๐‘ฅ๐‘ข
P ๐‘ฅ๐‘™ โ‰ค ๐‘ฅ โ‰ค ๐‘ฅ๐‘ข ; ๐‘ฆ๐‘™ โ‰ค ๐‘ฆ โ‰ค ๐‘ฆ๐‘ข โ‰ก
๐‘ ๐‘ฅ, ๐‘ฆ ๐‘‘๐‘ฅ ๐‘‘๐‘ฆ
๐‘ฆ๐‘™
๐‘ฅ๐‘™
โ€“ Independence
โ€ข A, B are independent if P(A,B) = P(A) P(B)
โ€ข x,y are independent if p(x,y) = p(x) p(y)
Conditional Probability:
A โˆฉB
, probability of A given that B occurred
P B
โ€ข E.G. probability that a โ€œ2โ€ is rolled on a fair die given that we know
the roll is even:
โ€“ Countable events: P(A|B)=
P
โ€“ P(B) = probability of even roll = 3/6=1/2
โ€“ P A โˆฉ B = 1/6 (since A โˆฉ B = A)
โ€“ P(2|even roll) = P A โˆฉ B /P(B) = (1/6)/(1/2) = 1/3
Conditional Probability & Expectation
Conditional Probability (continued):
โ€“ CRV: ๐‘ ๐‘ฅ ๐‘ฆ =
๐‘ ๐‘ฅ,๐‘ฆ
๐‘ ๐‘ฆ
๐‘–๐‘“ 0 < ๐‘ ๐‘ฆ < โˆž
0
๐‘œ๐‘กโ„Ž๐‘’๐‘Ÿ๐‘ค๐‘–๐‘ ๐‘’
โ€“ This follows from:
P ๐‘ฅ๐‘™ โ‰ค ๐‘ฅ โ‰ค ๐‘ฅ๐‘ข | ๐‘ฆ โ‰ก
๐‘ฅ๐‘ข
๐‘ ๐‘ฅ|๐‘ฆ ๐‘‘๐‘ฅ =
๐‘ฅ๐‘™
โ€“ and:
โˆž
๐‘ ๐‘ฅ =
๐‘ฅ๐‘ข
๐‘
๐‘ฅ๐‘™
๐‘ฅ, ๐‘ฆ ๐‘‘๐‘ฅ
๐‘(๐‘ฆ)
โˆž
๐‘ ๐‘ฅ, ๐‘ฆ ๐‘‘๐‘ฆ =
โˆ’โˆž
๐‘ ๐‘ฅ|๐‘ฆ ๐‘(๐‘ฆ)๐‘‘๐‘ฆ
โˆ’โˆž
Expectation: (key for estimation)
โ€“ Let x be a CRV with distrubution p(x). The expected value (or mean) of x
โˆž
โˆž
is
๐ธ[๐‘ฅ] =
๐‘ฅ๐‘ ๐‘ฅ ๐‘‘๐‘ฅ
๐ธ[๐‘”(๐‘ฅ)] =
๐‘”(๐‘ฅ)๐‘ ๐‘ฅ ๐‘‘๐‘ฅ
โˆ’โˆž
โˆ’โˆž
โ€“ Conditional mean (conditional expected value) of x given event M:
โˆž
๐ธ[๐‘ฅ|๐‘€] =
๐‘ฅ๐‘ ๐‘ฅ|๐‘€ ๐‘‘๐‘ฅ
โˆ’โˆž
Expectation
Mean Square:
Variance:
(continued)
โˆž
๐ธ[๐‘ฅ 2 ] =
๐œŽ 2 = ๐ธ[(๐‘ฅ โˆ’ ๐œ‡)2 ] =
๐‘ฅ 2 ๐‘ ๐‘ฅ ๐‘‘๐‘ฅ
โˆ’โˆž
โˆž
(๐‘ฅ โˆ’ ๐œ‡)2 ๐‘ ๐‘ฅ ๐‘‘๐‘ฅ
โˆ’โˆž
๐œ‡ ๐‘ฅ = ๐ธ[๐‘ฅ]
Random Processes
A stochastic system whose state is characterized a time evolving CRV,
x(t), t e [0,T].
โ€“ At each t, x(t) is a CRV
โ€“ x(t) is the โ€œstateโ€ of the random process, which can be characterized by
P[๐‘ฅ๐‘™ โ‰ค ๐‘ฅ(๐‘ก) โ‰ค ๐‘ฅ๐‘ข ] =
โˆž
๐‘
โˆ’โˆž
๐‘ฅ, ๐‘ก ๐‘‘๐‘ฅ
Random Processes can also be characterized by:
โ€“ Joint probability function
P[๐‘ฅ1๐‘™ โ‰ค ๐‘ฅ(๐‘ก1 ) โ‰ค ๐‘ฅ1๐‘ข ; ๐‘ฅ2๐‘™ โ‰ค ๐‘ฅ(๐‘ก2 ) โ‰ค ๐‘ฅ2๐‘ข ] =
โ€“ Correlation Function
๐ธ[๐‘ฅ ๐‘ก1 ๐‘ฅ(๐‘ก2 )] =
โˆž
๐‘ฅ
โˆ’โˆž 1
๐‘ฅ1๐‘ข ๐‘ฅ2๐‘ข
๐‘
๐‘ฅ1๐‘™ ๐‘ฅ2๐‘™
Joint probability
density function
๐‘ฅ1 , ๐‘ฅ2 , ๐‘ก1 , ๐‘ก2 ๐‘‘๐‘ฅ1 ๐‘‘๐‘ฅ2
Correlation function
๐‘ฅ2 ๐‘ ๐‘ฅ1 , ๐‘ฅ2 , ๐‘ก1 , ๐‘ก2 ๐‘‘๐‘ฅ1 ๐‘‘๐‘ฅ2 โ‰ก ๐œŒ(๐‘ก1 , ๐‘ก2 )
โ€“ A random process x(t) is Stationary if p(x,t+t)=p(x,t) for all t
Vector Valued Random Processes
๐‘‹1 (๐‘ก)
โ‹ฎ
๐‘‹(๐‘ก) =
where each Xi(t) is a random process
๐‘‹๐‘› (๐‘ก)
R(๐‘ก1 , ๐‘ก2 ) = โ€œCorrelation Matrixโ€ = ๐ธ[๐‘‹ ๐‘ก1 ๐‘‹ ๐‘‡ ๐‘ก2 ]
๐ธ[๐‘‹1 ๐‘ก1 ๐‘‹1 ๐‘ก2 ] โ‹ฏ ๐ธ[๐‘‹1 ๐‘ก1 ๐‘‹๐‘› ๐‘ก2 ]
โ‹ฎ
โ‹ฑ
โ‹ฎ
=
๐ธ[๐‘‹๐‘› ๐‘ก1 ๐‘‹1 ๐‘ก2 ] โ‹ฏ ๐ธ[๐‘‹๐‘› ๐‘ก1 ๐‘‹๐‘› ๐‘ก2 ]
โˆ‘(t) = โ€œCovariance Matrixโ€ = E (๐‘‹ ๐‘ก โˆ’ ๐œ‡ ๐‘ก )(๐‘‹ ๐‘ก โˆ’ ๐œ‡ ๐‘ก )๐‘‡
Where ๐œ‡ ๐‘ก = ๐ธ[๐‘‹ ๐‘ก ]