Random Processes and Time Series Analysis. Part II: Random Processes

Definition of a Random Process

In the course "Probability and Statistics" you have learnt a lot about random variables, and a little about random processes. This lecture reminds you of the key definitions of probability theory and gives the definition of a random process.

Assumptions of the course
We assume that any process we discuss is random (stochastic). This is the most general assumption about a process. It means that an output of such a process can be predicted only with some probability, and never for certain. When is it reasonable to assume the randomness of a process? When we know that it obeys probabilistic laws, or when the number of influencing factors is too large to be taken account of. An example of a random process is the Brownian motion of molecules in a gas. We do not assume the existence of a deterministic model governing the behaviour of the system considered.
To be able to assess the properties of random processes from observing their time series, one should first of all know the theory of random processes. We will therefore start from the theory of random processes.

Definitions in Probability Theory
Definition. An experiment is any process of trial and observation. Example: throwing a die.
Definition. The sample space S is the totality of points that correspond to all possible outcomes of the experiment. Example: 1, 2, 3, 4, 5, 6.
Definition. An event is an occurrence of any one of all possible outcomes of an experiment. Example: occurrence of a 5.

Definitions of probability.
Opinion 1 (of a physicist): Probability is a primary, fundamental notion that generally cannot be defined through other, simpler notions.
Opinion 2 (of a mathematician): At the moment there are four different kinds of probability that may be introduced. Namely, probability
1. as intuition
2. as a ratio of favourable to total numbers of outcomes (classical theory)
3. as a frequency of occurrence
4. as a measure based on axiomatic theory (not considered in detail here)
The opinion of the physicist reflects the non-uniqueness of understanding probability, and also the difficulty of giving its general definition in lay terms. We are only interested in probabilities 2 and 3 above, since they derive from the experimental approach.

2. Probability as a ratio of outcomes. Let N be the total number of times the experiment is performed, and N_A the total number of times an event A occurred.
Definition. The probability of the event A is P(A) = N_A / N.
3. Probability as a frequency of occurrence. In the previous definition, allow N to tend to infinity.
Definition. The probability of the event A is P(A) = \lim_{N\to\infty} N_A / N.
Definition. A random event is an event that, under the same given conditions, may occur or may fail to occur. One can predict whether a random event will occur only in a probabilistic sense.
Note that the outcome of an experiment need not be a number: e.g. it is "heads" or "tails" in a coin-tossing experiment. However, we often want to represent outcomes as numbers.
Definition. A random variable X is a function that associates a unique numerical value with every outcome of an experiment. The value of the random variable will vary from trial to trial as the experiment is repeated. More rigorously, a random variable X is a real-valued function defined on the sample space S of the experiment.

Example
An experiment consists of a single drawing from a standard deck of playing cards. 1) Find the number of sample points in the sample space. 2) Classify the sample space. 3) Define a random variable in this sample space.

Number i of sample point s_i   Outcome          Random variable x_i(s)
1                              Ace of clubs     1
2                              Two of clubs     2
…                              …                …
52                             King of spades   52

Solution
1) There are 52 cards in a standard deck: a total of 52 sample points.
2) The 52 points can be counted: the sample space is finite and hence discrete.
3) The 52 possible outcomes can be associated with the natural numbers from 1 to 52.
Note that many other random variables can be defined, e.g. x(s) = s².

Discrete and Continuous Random Variables
Definition. A discrete random variable is a random variable that may take only a countable number of distinct values such as 0, 1, 2, 3, 4, ... If a random variable can take only a finite number of distinct values, then it must be discrete. Examples: the number of children in a family, the Friday night attendance at a cinema, the number of patients in a doctor's surgery, the number of defective light bulbs in a box of ten.
Definition. A continuous random variable is a random variable which takes an infinite number of possible values. Continuous random variables are usually measurements. Examples: height, weight, the amount of sugar in an orange, the time required to run a mile.

Scalar and Vector Random Variables
There can be more than one outcome of the same experiment.
Example
From a meadow with a lot of flowers, a flower is picked randomly.
1. If only the flower length is measured, this is a scalar random variable s.
2. If the length and the weight of the flower are measured, then this experiment produces a vector random variable, or random vector s, with two coordinates: the length and the weight of the flower.

Random (Stochastic) Process
(Figure: illustration of a random process X(t,s) for a continuous sample space S, here the interval between 0 and 10.)
Definition. A random (stochastic) process X is a process that assigns, according to a certain rule, a continuous-time function f(t) to every outcome s of an experiment. A random process is a function of two kinds of variables: of time t and of the random vector s.

Features of f(t).
1. f(t) is a random variable: for fixed time t, f depends on s.
2. f(t) is a time function: for fixed s, f depends on t.

Comparison between a Random Variable and a Random Process
• For a random variable, the outcome of a random experiment is mapped into a number.
• For a random process, the outcome of a random experiment is mapped into a waveform (a function of time). Each such function of time is called a realization of the random process.
Note that, like random variables, random processes can be discrete or continuous. (End of Lecture 2.)

Notations
X(t) – (capital letter) the random process;
x(t) – (small letter) the value the process takes at time moment t.

Quasideterministic Process, Probability Density Function ("Distribution")
This lecture gives you the definition of a quasideterministic random process, and of the ensemble of realizations.
Definition: A random function f(t) is a function that at any fixed value of its argument t = t₁ is a random variable. This means that if the experiment with the same system is repeated under precisely the same conditions, the value of the random function at each time moment t₁ will be a random variable. This constitutes the crucial difference between a random function and a deterministic function: the values of a deterministic function are uniquely determined by the values of its arguments. Most generally, a random process is a random function of time.
Definition: A random process X(t) is said to be quasideterministic if it is described by a deterministic function of one or several random variables that do not depend on time.

Example
A 3-year-old boy takes a violin and tries to "play". He randomly picks a string, presses the string to the sounding board at an arbitrary point and pulls the string with an arbitrary force, so that it produces a sound. Is this a quasideterministic process or not?
Solution. We single out the following elements of the problem:
System: the violin and the boy.
Process: the sound produced by a single string.
Experiment: pulling a string, which includes 3 stages, namely
1) selection of one of the four strings;
2) position of a finger on the string.
The outcomes of stages 1) and 2) together determine the frequency ω of the sound, which is the first random variable and is continuous.
3) The force of pulling the string determines the initial volume A of the sound, which is the second random variable and is also continuous.
The process is quasideterministic, since after the values of the random variables ω and A are chosen, the oscillations of the string are described by a deterministic function of time, which in the simplest case is a damped sine,
f(t) = A e^{-\lambda t} \sin(\omega t + \varphi_0).
Here ϕ₀ is the initial phase of the oscillations and λ is a parameter of the system that describes damping.
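As an illustration, here is a minimal Python sketch (an addition to the notes, not part of the original): it generates several realizations of this quasideterministic process by drawing the random variables once per realization; the specific ranges chosen for ω and A are assumptions made only for the demonstration.

import numpy as np

rng = np.random.default_rng(seed=1)
t = np.linspace(0.0, 10.0, 1000)   # time axis
lam, phi0 = 0.3, 0.0               # damping and initial phase (fixed system parameters)

# Each realization: draw the random variables once; the whole time course
# is then a deterministic function of them.
for _ in range(5):
    omega = rng.uniform(4.0, 8.0)  # assumed range of the random frequency
    A = rng.uniform(0.5, 1.5)      # assumed range of the random initial volume
    x = A * np.exp(-lam * t) * np.sin(omega * t + phi0)
    print(f"omega={omega:.2f}, A={A:.2f}, first values: {x[:3]}")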
Example (Binary Process)
A player tosses a coin at equal time intervals. There are only two possible outcomes: heads and tails. These events are statistically independent, i.e. the probability of getting heads does not depend on whether tails occurred before, and vice versa. We single out the following elements of the problem:
System: a player and a coin.
Process: a sequence of tosses of a coin.
Random variable s: takes the two values 0 and 1; 1 is assigned to the realization when heads occurs, and 0 when tails occurs.
Question 1: Is this process discrete or continuous?
Answer: Discrete – it takes only two distinct values.
Question 2: Is this process quasideterministic?
Answer: No – it cannot be written as a deterministic function of time and a fixed set of time-independent random variables: a new random variable is generated at every toss.

Ensemble of Realizations of a Random Process
Assume that we launch the same random process X(t) many times, and each time i register a realization x_i(t). From the definition of a random process, this corresponds to choosing a value of a random variable from the sample space S and mapping it into a function of time x_i(t).
Definition. The collection of all realizations that correspond to all points of the sample space is called an ensemble of realizations.
In a typical experimental situation there are infinitely many realizations. The full ensemble of realizations characterizes the random process completely.

Example
Let an electric voltage be applied to a resistor, and consider the electric current through the resistor. Reminder: electric current is a flow of many (VERY many!) electrons. Since the number of electrons is so large, and each moves randomly along its own trajectory, the total current can fairly be considered a random process.
Experiment: we switch on the voltage and leave it at a constant value. Then we register the electric current through the circuit. The realization is the current as a function of time. As the experiment is repeated, we obtain another realization of the current. After the experiment is repeated some M times, we obtain an ensemble of M realizations of the random process. The Figure illustrates an ensemble of realizations for this experiment.

Probability Density Function (PDF) of a Random Process, Mean Value, Variance
Definition: The one-dimensional probability density function (PDF) p₁(x,t) of a continuous random process (r/p) X(t) is defined as follows:
p_1(x,t) = \lim_{\Delta x \to 0} \frac{P\{X(t) \in [x, x+\Delta x)\}}{\Delta x}.
Note: the PDF will sometimes be referred to as a "distribution" for brevity.
In words, at some selected time moment t we take the probability P for the value of the random process to fall in the interval [x, x+Δx) and divide it by the size of the interval Δx. We then take the limit as Δx → 0. The PDF may vary in time.
(Figures: illustration of the definition of the 1-dimensional PDF; a 1-dimensional PDF of a random process may vary in time.)
Note that p₁(x,t) does not characterize the process completely, since it tells nothing about the statistical interdependence of the values of the random process X(t) at different time moments t₁ and t₂.
Definition: The two-dimensional PDF p₂(x₁,t₁,x₂,t₂) of a continuous random process X(t) is defined as follows:
p_2(x_1,t_1,x_2,t_2) = \lim_{\Delta x_1 \to 0,\ \Delta x_2 \to 0} \frac{P\{X(t_1) \in [x_1, x_1+\Delta x_1) \wedge X(t_2) \in [x_2, x_2+\Delta x_2)\}}{\Delta x_1\,\Delta x_2}.
The general view of p₂ is difficult to illustrate, since it depends on 4 arguments and thus requires a 5-dimensional space for visualization. But if we fix t₁ and t₂, the values of the random process at these moments are just random variables, and p₂ becomes their joint PDF. This is a function of just two arguments and can be visualized. So, in p₂(x₁,t₁,x₂,t₂) let us fix t₁ and t₂.
Then p₂(x₁,x₂) would look as shown in the Figure. If t₁ or t₂ change, the shape of p₂ generally changes too. Note that p₂(x₁,t₁,x₂,t₂) does not characterize the process completely, either!

Definition: The N-dimensional PDF p_N(x₁,t₁,x₂,t₂,…,x_N,t_N) of a continuous r/p X(t) is
p_N(x_1,t_1,x_2,t_2,\dots,x_N,t_N) = \lim_{\substack{\Delta x_1 \to 0 \\ \Delta x_2 \to 0 \\ \dots \\ \Delta x_N \to 0}} \frac{P\{X(t_1)\in[x_1,x_1+\Delta x_1)\wedge X(t_2)\in[x_2,x_2+\Delta x_2)\wedge\dots\wedge X(t_N)\in[x_N,x_N+\Delta x_N)\}}{\Delta x_1\,\Delta x_2\cdots\Delta x_N}.
A random process is completely defined if, on an ensemble of realizations, for any N and arbitrary time moments t₁, t₂, …, t_N the N-dimensional probability density function of the process is defined.

Dirac Delta-Function: Symbolic Function (Reminder)
The delta-function is constructed as follows: 1) some bell-shaped function is taken whose integral over all real numbers is 1; 2) its width is then decreased in such a way that the area under the plot is preserved. While dealing with discrete random variables and discrete random processes, two important properties of the delta-function will be used:
\int_{-\infty}^{\infty} \delta(x - x_0)\,dx = 1 \qquad \text{and} \qquad \int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0).

One-Dimensional PDF of a Discrete Random Process
Let X(t) be a discrete random process that takes a finite number of possible values x_k, k = 1,…,m, with finite probabilities P_k(t). Then the 1-dimensional probability density function (PDF) p₁(x,t) is
p_1(x,t) = \sum_{k=1}^{m} P_k(t)\,\delta(x - x_k).
(Figure: a 1-dimensional PDF of a discrete random process at some selected time t₀; the heights of the vertical lines are shown proportional to the values of P_k(t₀).)

Example (PDF of a Discrete Random Process)
A boy repeatedly throws an unfair coin, such that the probability P_H of getting heads is 70%, while the probability P_T of getting tails is 30%. Sketch a probability density function for this process.
Solution
Define the random function that maps the outcomes (heads or tails) into numbers: heads – 0, tails – 1. Then p₁(x,t) = 0.7 δ(x) + 0.3 δ(x − 1).

Properties of a PDF, Mean, Variance

Statistically Independent Values of a Random Process
If the values of a random process at time moments t₁,t₂,…,t_N are statistically independent, then
p_N(x_1,t_1,x_2,t_2,\dots,x_N,t_N) = p_1(x_1,t_1)\,p_1(x_2,t_2)\cdots p_1(x_N,t_N),
i.e. the N-dimensional probability density function is just the product of the 1-dimensional PDFs of its arguments.
Example (repeatedly tossing a coin): any N values of the random process are statistically independent, and thus any N-dimensional PDF will be the product of 1-dimensional PDFs.

Basic Properties of the Probability Density Function
1. The PDF is a non-negative function of its arguments, because probability is non-negative and the interval lengths are positive (see the definition).
2. The 1-dimensional PDF satisfies the normalization condition
\int_{-\infty}^{\infty} p_1(x,t)\,dx = 1:
at each time moment the process will certainly take some value.
3. The N-dimensional PDF satisfies the normalization condition
\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p_N(x_1,t_1,\dots,x_N,t_N)\,dx_1\cdots dx_N = 1.
4. The N-dimensional PDF does not change if its pairs of arguments (x_i,t_i) and (x_j,t_j) are swapped, i ≠ j.
5. The PDF satisfies the co-ordination condition, i.e. for any k < n,
p_k(x_1,t_1,\dots,x_k,t_k) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} p_n(x_1,t_1,\dots,x_n,t_n)\,dx_{k+1}\cdots dx_n.
That means that any k-dimensional PDF can be obtained from a known n-dimensional PDF (n > k) by integrating over the "redundant" arguments.

Mean Value of a Random Process
The mean (expected) value E of a random variable indicates its average or central value.
Definition (reminder): The mean value E of a random variable X is defined as
E = \overline{X} = \int_{-\infty}^{\infty} x\,p_1(x)\,dx.
Note that two random variables with the same mean value can have very different distributions.
The mean (expected) value E of a random process at a given time moment t is defined by analogy.
Definition: The mean value E(t) of a random process X(t) is defined as
E(t) = \overline{X(t)} = \int_{-\infty}^{\infty} x\,p_1(x,t)\,dx.
Note that the averaging is performed over the ensemble of realizations (see dx), not over time! Notation: an upper bar means the average over the ensemble.

Variance and Standard Deviation of a Random Process
Definition (reminder): The variance σ²_X of a random variable X is defined as
\sigma_X^2 = \overline{(X - \overline{X})^2} = \int_{-\infty}^{\infty} (x - \overline{X})^2\,p_1(x)\,dx.
Meaning: the variance of a random variable is a non-negative number which gives an idea of how widely spread the values of the random variable are likely to be; the larger the variance, the more scattered the observations are on average.
Definition: The variance σ²_X(t) of a random process is defined as
\sigma_X^2(t) = \int_{-\infty}^{\infty} \left(x - \overline{X(t)}\right)^2 p_1(x,t)\,dx.
Definition: σ_X = \sqrt{\sigma_X^2} (respectively σ_X(t) = \sqrt{\sigma_X^2(t)}) is called the standard deviation of the random variable (random process).
(Figure: mean value and variance of a random process at a fixed time moment t₀.)
(Figures: a 1-dimensional PDF symmetric with respect to the mean value – here the most probable and mean values coincide; a 1-dimensional PDF asymmetric with respect to the mean value – the most probable and mean values generally do not coincide!)

Mean Value of a Discrete Random Process: Example
Example
Given: A random process X(t) consists in repeatedly throwing an unfair coin at equal time intervals Δt, so that the probability P_H of getting heads is 70%, while the probability P_T of getting tails is 30%.
Find: The mean value and variance at any discrete time moment t_i = iΔt (i is a positive integer and Δt is a time interval).
Solution
The PDF found earlier is p₁(x,t_i) = 0.7 δ(x) + 0.3 δ(x − 1). By definition, the mean value is
\overline{X(t_i)} = \int_{-\infty}^{\infty} x\,p_1(x,t_i)\,dx.
Calculating E(t_i) with account of the delta-function property \int f(x)\,\delta(x-x_0)\,dx = f(x_0),
\overline{X(t_i)} = 0\cdot 0.7 + 1\cdot 0.3 = 0.3.
According to the definition of variance,
\sigma_X^2(t_i) = \int_{-\infty}^{\infty} \left(x - \overline{X(t_i)}\right)^2 p_1(x,t_i)\,dx.
Substituting the value \overline{X(t)} = 0.3 and the PDF p₁ found above,
\sigma_X^2(t_i) = (0-0.3)^2\cdot 0.7 + (1-0.3)^2\cdot 0.3 = 0.063 + 0.147 = 0.21.
Note that \overline{X(t)} = 0.3 is the mean value but is not a possible value of X(t).

Fundamental Theorem for Random Variables

Example (PS 2, Q. 1(a): Mean value and variance of a Continuous Random Process)
Given: A quasideterministic random process X(t) is defined as a cosine with random phase,
X(t) = A cos(ωt + ϕ) = f(ϕ,t),
where A, ω are constants and ϕ is a random variable uniformly distributed in the interval [−π,π], which means
p(ϕ) = 1/(2π) for ϕ ∈ [−π,π], and p(ϕ) = 0 otherwise.
Find: The mean value and variance of the random process X(t).
For illustration fix the amplitude A = 1 and the angular frequency ω = 1, and consider 10 different values of the random variable (the random phase ϕ); the 10 respective realizations of X(t) are shown in the Figure.
How can we find the mean value and variance? A knee-jerk answer is that this is straightforward: we have the full information about the process, i.e. we know all its realizations, so we can simply apply the definitions of mean and variance. However, for this we need a PDF of X(t), which we do not have. This presents a problem.

General Difficult Problem
Consider a random variable A taking values α, whose PDF is p₁(α). Also consider a deterministic function of A, B = g(A), taking values β. Note that a deterministic function of a random variable is a random variable itself. To find the mean or mean square of B, a natural approach would be to find the PDF of the new random variable B and then calculate its mean, mean square, etc. as usual:
\overline{B} = \int_{-\infty}^{\infty} \beta\,p_1(\beta)\,d\beta, \qquad \overline{B^2} = \int_{-\infty}^{\infty} \beta^2\,p_1(\beta)\,d\beta.
However, finding p₁(β) might be a tough problem in general, if the function g is non-linear. A better solution comes in the form of the Fundamental Theorem for random variables.

Fundamental Theorem for One Random Variable
The Fundamental Theorem states that the mean and mean square values of a function g of a random variable A can be found as
\overline{B} = \overline{g(A)} = \int_{-\infty}^{\infty} g(\alpha)\,p_1(\alpha)\,d\alpha, \qquad \overline{B^2} = \overline{g^2(A)} = \int_{-\infty}^{\infty} g^2(\alpha)\,p_1(\alpha)\,d\alpha,
where p₁(α) is the PDF of the random variable A. More generally, the mean value of any function h(B) can be calculated as
\overline{h(B)} = \int_{-\infty}^{\infty} h(g(\alpha))\,p_1(\alpha)\,d\alpha.

Fundamental Theorem for Two Random Variables
Consider two random variables A₁ and A₂ taking values α₁ and α₂, respectively. Their joint PDF is p₂(α₁,α₂). Also consider a deterministic function of these two variables, B = g(A₁,A₂), taking values β. Note that finding p₁(β) in this situation is even tougher than in the case of a function of a single random variable.
The Fundamental Theorem for Two Random Variables states that the mean and mean square values of B can be found as
\overline{B} = \overline{g(A_1,A_2)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(\alpha_1,\alpha_2)\,p_2(\alpha_1,\alpha_2)\,d\alpha_1\,d\alpha_2,
\overline{B^2} = \overline{g^2(A_1,A_2)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g^2(\alpha_1,\alpha_2)\,p_2(\alpha_1,\alpha_2)\,d\alpha_1\,d\alpha_2.
More generally, the mean value of any function h(B) can be calculated as
\overline{h(B)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(g(\alpha_1,\alpha_2))\,p_2(\alpha_1,\alpha_2)\,d\alpha_1\,d\alpha_2.

Fundamental Theorem for Several Random Variables
Consider n random variables A₁, A₂, …, A_n taking values α₁, α₂, …, α_n, respectively. Their joint PDF is p_n(α₁,α₂,…,α_n). Also consider a deterministic function B = g(A₁,A₂,…,A_n) taking values β.
The Fundamental Theorem for Several Variables states that the mean and mean square values of B can be found as
\overline{B} = \overline{g(A_1,\dots,A_n)} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} g(\alpha_1,\dots,\alpha_n)\,p_n(\alpha_1,\dots,\alpha_n)\,d\alpha_1\cdots d\alpha_n,
\overline{B^2} = \overline{g^2(A_1,\dots,A_n)} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} g^2(\alpha_1,\dots,\alpha_n)\,p_n(\alpha_1,\dots,\alpha_n)\,d\alpha_1\cdots d\alpha_n.
More generally, the mean value of any function h(B) can be calculated as
\overline{h(B)} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} h(g(\alpha_1,\dots,\alpha_n))\,p_n(\alpha_1,\dots,\alpha_n)\,d\alpha_1\cdots d\alpha_n.

Example (PS 2, Q. 1(a). Mean value and variance of a Continuous Random Process)
Solution (correct)
Use the Fundamental Theorem for one random variable. The mean value is
\overline{X(t)} = \overline{A\cos(\omega t + \varphi)} = \int_{-\pi}^{\pi} A\cos(\omega t + \varphi)\,\frac{1}{2\pi}\,d\varphi = 0.
The variance is
\sigma_X^2(t) = \int_{-\infty}^{\infty} \left(f(\varphi,t) - \overline{X(t)}\right)^2 p(\varphi)\,d\varphi = \int_{-\pi}^{\pi} A^2\cos^2(\omega t + \varphi)\,\frac{1}{2\pi}\,d\varphi = \frac{A^2}{2}.
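These two results are easy to cross-check numerically. The sketch below (an addition, not part of the original notes) samples the random phase and compares ensemble estimates at an arbitrary time moment with the theoretical values 0 and A²/2; the parameter values are arbitrary choices.

import numpy as np

rng = np.random.default_rng(seed=2)
A, omega, t = 1.0, 1.0, 0.7                      # amplitude, frequency, and one time moment
phi = rng.uniform(-np.pi, np.pi, size=100_000)   # ensemble of random phases
x = A * np.cos(omega * t + phi)                  # values of the process at time t across the ensemble

print(x.mean())   # close to the theoretical mean 0
print(x.var())    # close to the theoretical variance A**2 / 2 = 0.5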
Example (PS 2, Q. 5: Use of the Fundamental Theorem for 2 r/v's)
Let X(t) = cos(ωt + ϕ), where ω and ϕ are random variables with joint PDF
p_2^{\omega\varphi} = \frac{1}{8\pi}, \qquad \omega \in [8,12],\ \varphi \in [0,2\pi].
Find the mean value of X(t).
Solution:
\overline{X(t)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} X(t)\,p_2(\omega,\varphi)\,d\omega\,d\varphi = \frac{1}{8\pi}\int_{8}^{12}\left(\int_{0}^{2\pi}\cos(\omega t + \varphi)\,d\varphi\right)d\omega = 0.

Autocorrelation and Covariance of a Random Process
The degree of average mutual influence between the values of a random process at time moments t₁ and t₂ can be characterized by the autocorrelation function, defined as
K_{XX}(t_1,t_2) = \overline{X(t_1)X(t_2)} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\,p_2(x_1,t_1,x_2,t_2)\,dx_1\,dx_2.
In words, we fix two time moments t₁ and t₂ and obtain two random variables that can be characterized by their joint 2-dimensional distribution p₂(x₁,x₂). We then consider the product of the values of the random process at these time moments and average it over the ensemble of all realizations. The index "XX" means "correlation between the process X and itself, which is X". The need for such an index will become clear when we consider cross-correlation.

Centered Process
Definition: A centered process is a process whose mean value is zero at any time moment t.
In order to make a centered process out of a non-centered one, one needs to remove the current mean value. In other words, a centered process X_c(t) consists of the values of the original process X(t) at the given time moment minus the process mean at the same time moment:
X_c(t) = X(t) - \overline{X(t)}.

Covariance function
The degree of average mutual influence between the centered values of a random process at time moments t₁ and t₂ can be characterized by the covariance function, defined as:
\Psi_{XX}(t_1,t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left(x_1 - \overline{X(t_1)}\right)\left(x_2 - \overline{X(t_2)}\right) p_2(x_1,t_1,x_2,t_2)\,dx_1\,dx_2.
We now consider the current values of the random process at two given time moments t₁ and t₂ minus its mean values at the same time moments, multiply them and average. Covariance is defined as a function of two arguments t₁ and t₂. A typical plot of Ψ_XX(t₁,t₂) is shown below.

Why do we need autocorrelation and covariance?
Complete information about the statistical dependence between the values of a random process at two time moments t₁ and t₂ is given by the two-dimensional probability density function p₂(x₁,t₁,x₂,t₂). Why should we introduce any new characteristics? Because p₂(x₁,t₁,x₂,t₂) cannot be visualized and is not helpful in practice. Unlike p₂, both autocorrelation and covariance can always be plotted in 3D and analysed visually.

What does covariance tell us?
• Covariance at time moments t₁ and t₂ gives the average value of the product of two centered random variables, i.e. those whose mean is zero.
• If these variables take positive and negative values independently of each other, their average product will be zero. These values are said to be uncorrelated.
• If one value somehow depends on the other – say, if when one is positive the other is mostly negative – their average product is not zero. Then these values are said to be correlated.
• The larger the absolute value of the average product, the stronger the correlation.

Property of autocorrelation and covariance
Autocorrelation and covariance are symmetric with respect to exchange of the time moments t₁ and t₂, because the 2-dimensional PDF has this property. Namely,
K_{XX}(t_1,t_2) = K_{XX}(t_2,t_1) \qquad \text{and} \qquad \Psi_{XX}(t_1,t_2) = \Psi_{XX}(t_2,t_1).

Covariance in two kinds of arguments
Covariance is defined as a function of two arguments t₁ and t₂; a typical plot of Ψ_XX(t₁,t₂) is shown in the Figure above. However, it is often convenient to choose different arguments:
t = t₁ (the 1st time moment, or just "time"),
τ = t₂ − t₁ (the "time interval").
The same function Ψ_XX, but in the new coordinates (t,τ), is shown in this Figure.

Autocorrelation and covariance in terms of "time" and "time interval"
When solving practical problems it is often more convenient to use the following representations of autocorrelation and covariance:
Ψ XX (t, τ ) =
Covariance and Variance 2
If we use YXX(t1, t2), then variance σ X (t) at the time moment t is the value of covariance when both its arguments are equal to this time moment, σ X2 (t) = Ψ XX (t,t) . I.e. variance is the value of covariance along the diagonal t1= t2. If we use YXX (t, t), then variance at the time moment t is the value of covariance at t=0,
σ X2 (t) = Ψ XX (t, 0) . Relationship between Autocorrelation and Covariance (PS 3, Q. 2) (
)(
)
Ψ XX (t, τ ) = X(t) − X(t) X(t + τ ) − X(t + τ )
Remarks: 1. The upper bar means the average over the ensemble of realizations, i.e. integral over x. 2. An integral of a sum is a sum of integrals. Covariance is an autocorrelation function minus the product of mean values at respective time moments. If mean values are zero, i.e. if the process is centred, covariance equals the autocorrelation. Example (Covariance of a Damped Cosine with Random Phase) Consider a random process X(t)=exp(-λt)cos(ωt+ϕ)=f(ϕ,t), where ω is a constant, ϕ is a random
variable uniformly distributed in the interval [-π,π]. Find the mean value, variance and covariance
of the random process X(t).
Comment: An example of a real system where such a process can be realized is a pendulum with
friction whose amplitude of oscillations is small. Initial push with velocity v together with the initial
18 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES angle α define the phase ϕ. The variable x is then the angle of the pendulum deviation from vertical
line.
Solution Mean value is X(t) = Variance is found as follows (
)(
) τ =0 =
σ X2 (t) = X(t) − X(t) X(t + τ ) − X(t + τ )
Covariance is found as below Ψ XX (t,τ ) = K XX (t,τ ) = X(t)X(t + τ ) =
19 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Interpretation of the results obtained. 1) Mean value does not depend on time and is zero. 2) Variance depends on time: it decreases with time. This is consistent with the fact that the ampliude (i.e. spread) of oscillations decays with time. 3) Covariance depends on both arguments: time and time interval. For a fixed time t, it oscillates in τ with the same frequency as all realizations of the random process. As time t grows, its amplitude decreases two times faster than the amplitude of realizations. For a fixed τ, it monotonously decreases with time t. 20 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Stationarity of a Random Process
Revision: Distribution as a Function of Time If time is fixed at a particular value tk, a random process becomes a random variable. If we fix two time moments, we get two random variables which can be characterized by their joint two-­‐
dimensional distribution. If we fix three… etc. Generally, each such distribution changes when different time moments are selected. Stationarity in Strict Sense Definition: The process is said to be stationary in strict sense, if its N-­‐dimensional PDF does not change with respect to any time shift τ for any N (N is unbounded), i.e. pN (x1,t1, x 2 ,t 2 ,…, x N ,t N ) = If we shift all the time moments considered by the same magnitude τ, the PDF remains the €
same. Stationarity of the 1st Order Definition: A random process is said to be 1st order stationary if its 1-­‐dimensional PDF does not change with respect to any time shift τ, i.e. p1 (x,t) = €
21 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES p1(x,t) changes in time, the process is p1(x,t) does not change in time, the nonstationary. process is 1st order stationary. Note, that in a 1st order stationary process both mean and variance do not depend on time. Example (1st Order Stationary Process, Cosine with Random Phase (PS 2, Q. 1)) Consider a random process X(t)=Acos(ωt+ϕ)=f(ϕ,t), where A,ω are constants, and ϕ is a random
variable uniformly distributed in the interval [-π,π]. To show realizations and PDF, set A=1, ω=1.
10 realizations for 10 randomly chosen values
of ϕ from [-π,π].
1-dimensional
PDF
numerically
estimated from an ensemble of 100000
realizations
Stationarity of the 2nd Order Definition: A random process is said to be 2nd order stationary if its 2-dimensional PDF does not
change with respect to any time shift τ, i.e.
p2 (x1,t1, x 2 ,t 2 ) =
€
To show this, set time shift τ= -t and rewrite
1
p2 (x1,t1 − t1, x 2 ,t 2 − t1 ) =
€
22 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES 2-dimensional PDF depends only on the
time interval τ between two considered
time moments, but not on time!
Note, that a 2nd order stationary process is automatically 1st order stationary, since a lowerdimensional PDF is determined by a higher-dimensional PDF.
Covariance of the 2nd order stationary process ∞
ΨXX (t,τ ) =
∫∫ ( x
1
−∞
)(
)
− X(t) x 2 − X(t + τ ) p2 (x1,t, x 2 ,t + τ )dx1dx 2
€
Conclusion: Covariance of a 2nd order stationary process depends only on time interval τ between
the two time moments, but not on the time moments themselves!
Wide Sense Stationarity (WSS) We introduce a relaxed stationarity condition that is better applicable in practice.
Definition: A random process is said to be wide sense stationary if it satisfies the following three
conditions:
1. Its mean value is constant in time
2. Its variance is finite
3. Its covariance depends only on time shift between the time moments.
A typical plot for covariance of a wide sense stationary (WSS) process is shown below.
Note, that the variance of a WSS process is the value of covariance at τ=0, σX2 = ΨXX (0).
Hierarchy of stationarity
•
€
The diagram below is valid for processes with finite variance (energy)!
23 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES •
Wide sense stationarity is not included in this diagram:
•
WSS process need not be 2nd or even 1st order stationary!
Stationarity in lay terms (if a non-expert asks you…) Broadly speaking, the process is stationary if its statistical characteristics do not change in time.
Otherwise, it is non-stationary.
Note, that these are NOT strict definitions, but statements meant to convey the idea. They are note
to be used in exam papers, or, at least, not alone.
Properties of covariance function of a wide sense stationary (WSS) process
A typical plot of covariance for a WSS process is shown in Figure above.
Properties of covariance of a WSS random process are:
1. Covariance takes its maximal value at τ=0,
2. Covariance is an even function of τ,
3. For many stationary processes covariance decreases with τ, and there exists a limit
This property means that as the time moments considered are further and further awayfrom each
other, the values of random process loose their statistical interdependence. I.e. the system “forgets”
its preceding states. With this, 2-dimensional PDF tends to zero as τ tends to infinity.
Example (Cosine with Random Phase (PS. 2, Q. 1) Interpreting Covariance)
Consider a random process X(t)=Acos(ωt+ϕ)=f(ϕ,t), where A,ω are constants, ϕ is a random
variable uniformly distributed in the interval [-π,π]. An example of a real system where such a
24 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES process can be realized is a pendulum without friction whose amplitude of oscillations is small.
Initial push with velocity v together with the initial angle α define the phase ϕ.
Different realizations of X(t) are shown.
The process X(t) is wide sense stationary.
Covariance of X(t) found earlier reads:
ΨXX (t,τ ) =
A2
cos(ωτ ) = ΨXX (τ )
2
€
Notes on plotting covariance
1. Since covariance of a WSS process is symmetric with respect to zero, it is usually plotted
only for positive values of time interval τ.
2. When comparing covariances of two WSS processes with different amplitudes, it is
convenient to normalize each covariance by respective variance, so that their values at τ=0
would be 1, and all other values smaller.
Covariance of a wide sense stationary process – graphical representation
ΨXX
depends only on time shift τ : the process is wide sense stationary!
25 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Correlation Time
• Covariance, among other things, tells us how quickly the system “forgets” its previous states.
• Often it is convenient to characterize a WSS process by just one number instead of plotting the
whole covariance function.
Question: What is the
which they are still
maximal time interval between two values of the process at
correlated?
Answer: This interval is equal to correlation time τcor.
€
Definition: Correlation time τcor of a wide sense stationary random process is defined as ∞
τ cor =
∫Ψ
XX
(τ ) dτ = −∞
Correlation time is characteristic of system memory.
€
Example (correlation time) Let the covariance of the process be given by ΨXX (τ ) = σ X2 exp(− λ | τ |) .
€
The larger the λ is, the faster the covariance decreases, the smaller the correlation time is.
26 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Ergodicity of a Random Process Averaging over the ensemble of realizations: Applications Example (Neuroscience)
Suppose we wish to study the response of the human brain
activity to external stimuli. As a measure of brain activity,
an electroencephalogram is measured. A human subject is
looking at the screen, where the same image (e.g. letter A)
appears and disappears at regular intervals. The same
experiment is repeated several hundred of times.
We thus obtain an ensemble of realizations (encephalograms)
of the same random process. The mean, mean square, variance
and PDD are estimated at each time moment by averaging over
the whole ensemble of realizations, and thus we obtain several
functions of time. The shapes of these functions will be different for different stimuli.
What is wrong with averaging over the ensemble of realizations? Problem 1: A huge amount of data is required, requiring large storage space.
In all illustrations of random processes used in this module so far, 1-dimensional PDDs of random
processes were estimated numerically by averaging over the ensemble of realizations. The data used
for these estimates were numerically simulated realizations of the same random process. 100 000
realizations were used for all PDDs shown.
The students might be wondering why they were not given some PDDs obtained from experimental
realizations rather than from numerical simulations of some random processes. The reason is very
simple: such data is not easily available, because it is quite resource consuming to obtain them.
Namely, one needs to repeat the same experiment several hundreds of times at least, and to record
all the realizations.
Problem 2: Sometimes it is not just laborious, but impossible to obtain the required data.
Many experiments cannot be repeated even twice, not mentioning hundreds of times.
Example 1: Switching of the polarity of magnetic field of the Earth. There were only about 20 such
switchings during the whole history of Earth existence. A human is unable to repeat such
experiments.
Example 2: Change of climate on Earth. Changes between ice ages and global warming. Again,
such experiments cannot be repeated.
Example 3: Marketing. The changes in the sales of a particular company from the instance of its
creation. To repeat the experiment one should close the company and re-open it again. But even if
one tries to do this, the conditions of experiment would have changed, because the world around
would have become different. It is impossible to reproduce the conditions of the experiment.
27 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Typical experimental data
Most typical experimental data available are represented by a single realization x(t) of the random
process.
Example
Consider a wine producing company (system). The random process is selling the wine, and the state
variable is “monthly wine sales”. We can only observe a single realization of this process, pictured
below. We cannot produce a second realization, because for that we would need to relaunch the
company from the very start, and moreover to do it in the same market – which is impossible, since
the market obviously changes with time.
Revision: how to define a random process?
A random process is completely defined if on an ensemble of realizations for any N and arbitrary
time moments t ,t ,…,t an N-dimensional probability density function of the process is defined,
1 2
N
which reads as below.
pN (x1,t1, x 2 ,t 2 ,…, x N ,t N ) =
Ρ{X(t1 ) ∈[x1, x1 + Δx1 ) ∧ X(t 2 ) ∈[x 2 , x 2 + Δx 2 ) ∧…∧ X(t N ) ∈[x N , x N + Δx N )}
lim
Δx1 → 0
Δx
Δx
…Δx
1
2
N
Δx → 0
2
…
Δx N → 0
But what if we do not have the required ensemble of realizations with the p defined on it?
€
Averaging over time and Ergodic Process Consider some function f of time t, f(t).
N
Definition: Average of f(t) over finite time interval T is defined and denoted as:
Definition: Average of f(t) over infinite time interval is defined and denoted as:
28 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Definition: A strict sense stationary process X(t) is said to be strict sense ergodic if all its statistical
characteristics can be obtained from any of its single realizations, x(t), known during an infinite
time interval.
In order to obtain these statistical characteristics, instead averaging over the ensemble of
realizations, we can average over time.
Comparison: Ergodicity and Stationarity
Reminder:
•
A process can be stationary in strict sense, but this is a mathematical abstraction that cannot
be verified in practice.
•
Relaxed conditions of stationarity are: N-th order stationarity, where N is some positive
integer, and also wide sense stationarity.
•
Strict sense stationarity implies automatically stationarity of any N-th order, and also wide
sense stationarity.
By analogy, besides ergocidity of the process in strict sense, we can introduce relaxed conditions on
ergodicity.
Ergodicity of Various Orders st
Definition 1: A 1 -order stationary random process is called 1st order ergodic if any single
realization x(t) of it carries the same information as its 1-dimensional PDD,
Note, that p1 does not depend on t because the process should be 1st order stationary!
Definition 2: A 2nd-order stationary random process is called 2nd order ergodic if any single
realization x(t) of it carries the same information as its 2-dimensional PDD
p2 (x1, t1, x2 , t2 ) = Note, that p2 depends only on time interval τ because the process should be 2nd order stationary!
Definition 3: An Nth-order stationary random process is called Nth order ergodic if any single
realization x(t) of it carries the same information as its N-dimensional PDD
pN (x1,t1, x 2 ,t 2 ,…, x N ,t N ) = pN (x1,t1 + τ, x 2 ,t 2 + τ,…, x N ,t N + τ ) .
Note, that pN is the same after shift by τ because the process should be Nth order stationary!
€
29 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Ergodicity with Respect to Mean Value Consider a 1st order stationary random process X(t), and its particular realization x(t). If the mean
value of the process can be obtained as an average over time of this single realization, i.e.
1 T /2
x(t) = lim ∫ x(t)dt = T→∞ T
−T /2
the process X(t) is said to be ergodic with respect to mean value.
Note:
• p1 and X do not depend on t because the process should be 1st order stationary.
• <x> does not depend on time because it is the result of averaging over time.
€Definition 4: A 1st-order stationary random process X(t) is called ergodic with respect to the mean
value, if its mean value can be obtained from any single realization x(t) of it as the time average of
the latter.
Ergodicity with Respect to Covariance Consider a wide sense stationary random process X(t), and its particular realization x(t). Suppose
the process is ergodic with respect to mean value. If the covariance of the process can be obtained
as an average over time of the product as follows
( x(t) −
x(t)
) ( x(t + τ ) −
(
= X(t) − X
x(t + τ )
)
=
( x(t) − x ) ( x(t + τ ) − x )
) ( X(t + τ ) − X ) = Ψ
XX
(τ )
the process X(t) is said to be ergodic with respect to covariance.
Note:
• Ψ (τ) does not depend on t because the process should be wide sense stationary.
XX
• <x> does not depend on time because it is the result of averaging over time.
Definition 5: A wide sense stationary random process X(t) is called ergodic with respect to
covariance, if its covariance can be obtained from any single realization x(t).
30 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Ergodicity of Random Process and its Relationship to Stationarity
Example (Ergodicity of Cosine with Random Phase PS. 4 Q. 1) Consider a random process X(t)=Acos(ωt+ϕ)=f(ϕ,t), where A, ω are
constants, and ϕ is a random variable uniformly distributed in the interval [-π;
π]. Determine if the random process X(t) is ergodic with respect to mean value
and to covariance.
Solution:
From Q. 1 of PS 2 we have found that X(t) = X = 0 . Also, this process is 1st order stationary, as can
be shown analytically (although we do not show this in this module!). Note, that there is only one
random variable ϕ in this process.
Assume that the pendulum €
was pushed with some velocity v from some angle α, and thus the
random variable ϕ has taken some particular value ϕ∗. We thus generated a single realization x(t)
of this process.
First, we try to find the mean value from this realization.
1 T /2
\langle x(t)\rangle = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} A\cos(\omega t + \varphi^*)\,dt = \lim_{T\to\infty}\frac{1}{T}\,\frac{A}{\omega}\Big[\sin(\omega t + \varphi^*)\Big]_{-T/2}^{T/2}
= \lim_{T\to\infty}\frac{A}{\omega T}\left(\sin\Big(\frac{\omega T}{2} + \varphi^*\Big) - \sin\Big(-\frac{\omega T}{2} + \varphi^*\Big)\right) = \lim_{T\to\infty}\frac{A}{\omega T}\left(\sin\Big(\frac{\omega T}{2} + \varphi^*\Big) + \sin\Big(\frac{\omega T}{2} - \varphi^*\Big)\right)
= \lim_{T\to\infty}\frac{2A}{\omega T}\sin\Big(\frac{\omega T}{2}\Big)\cos(\varphi^*) = 0.
limit as T goes to infinity, we divide a finite value by an infinitely large value and thus obtain zero.
€
Conclusion: The average over time of realization x(t) equals the process mean value. Therefore, the
€ is ergodic w.r.t. the mean value.
process
Second, let us attempt to find covariance from x(t). From PS 2 Q. 1 we know that the covariance of
X(t) is ΨXX (t,τ ) =
A2
cos(ωτ ).
2
It is convenient to first consider average over a finite time interval T, and only then take the limit as
€ T tends to infinity.
31 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES ( x(t) − x(t)
) ( x(t + τ ) − x(t + τ )
)T
=
1 T /2
∫ (x(t) − x(t) )(x(t + τ ) − x(t + τ ) )dt
T −T / 2
1 T /2
1 T /2 2
=
∫ (x(t) − 0)(x(t + τ ) − 0)dt = T ∫ A cos(ωt + ϕ*)cos(ω (t + τ) + ϕ*)dt
T −T / 2
−T / 2
1 2 1 T /2
= A
∫ cos(ωt + ϕ * +ωt + ωτ + ϕ *) + cos(ωt + ϕ * −ωt − ωτ − ϕ *) dt
T
2 −T / 2
(
)
A2 T / 2
=
∫ cos(2ωt + ωτ + 2ϕ *) + cos(ωτ ) dt
2T −T / 2
A2 1
A2
T /2
T /2
=
sin(2ωt + ωτ + 2ϕ *) |−T / 2 +
cos(ωτ ) t |−T / 2
2T 2ω
2T
2
A 1
=
sin(ωT + ωτ + 2ϕ *) − sin( −ωT + ωτ + 2ϕ *)
2T 2ω
& T (−T) )
A2
+ cos(ωτ )( −
+ = ( x(t) − x ) ( x(t + τ ) − x )
'2
2 *
2T
(
€
)
(
)
T
Now take the limit T →∞ to obtain
€
€
Conclusion: An estimate of covariance from a single realization x(t) coincides with covariance
obtained from averaging over the ensemble of realizations. Therefore, the process is ergodic w.r.t.
covariance.
Overall conclusions
• The process is 1st order stationary, so the first ergodicity condition is satisfied.
• Mean values obtained from an ensemble, and from a single, realizationsare the same.
Therefore, the process is ergodic with respect to mean.
• Covariance estimated from an ensemble, and from a single, realizations is the same. Therefore,
the process is ergodic with respect to covariance.
Note, that while calculating mean or covariance from a single realization x(t) we did not take
account of statistical properties of the random variable ϕ. In taking time average we assume to
know nothing about them!
Something to think about.
Compare the result for this process with the one for process in PS 2, Q. 4. The only difference
between the two processes is that statistical properties of ϕ are different.
Ergodicity versus Stationarity
q If somebody asks you what an ergodic process is in most general terms, you should be able to
answer something like this: “If the statistical characteristics of the whole process can be recovered
from any single of its realizations, then the process is ergodic.”
32 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES q Does ergodicity always mean stationarity?
Yes. There is no ergodicity without stationarity. A process which is ergodic of a certain order is
always stationary of the respective order.
q Does stationarity always mean ergodicity?
No. Stationarity is a necessary condition for ergodicity, but not a sufficient one. However,
experience tells us that most stationary processed are ergodic.
Example (a stationary but not ergodic process, PS. 2 and 4, Q. 5) Consider a random process X(t)=cos(ωt+ϕ), where ω and ϕ are random variables with joint
probability density distribution pωϕ
2 = 1 8π , ω ∈[8;12], ϕ ∈[−π ;π ].
Figure shows 10 realizations of X(t):
each has a different
frequency ω and
€
a different initial phase ϕ.
We need to determine if this process is ergodic with respect to mean value and covariance.
Solution
1) First of all, we need to determine if the process is wide-sense stationary. By solving PS. 2, Q. 5,
we already discovered that this process is WSS, because all three conditions are satisfied. Namely,
its mean value is constant, its variance is finite, and its covariance depends only on τ, i.e.
1
1
X(t) = 0, σ 2 (t) = , Ψ XX (t, τ ) = (sin(12τ ) − sin(8τ ))
2
8τ
Comments on determining stationarity of this process
•When determining stationarity, we assume the possibility to launch the same process infinitely
many times and to average all quantities over the ensemble of realizations.
•We took full account of the fact that the process was quasideterminsitic, i.e. was a deterministic
function of random variables whose statistical properties were known to us. When we averaged
over the ensemble of realizations, we in fact used the fundamental theorem for random variables
and averaged over the random variables.
•In all calculations time was never an integration variable!
2) Next, we need to test this process for ergodicity. Unlike in PS. 2, Q. 5, here we allow the random
variables ϕ, ω to take some values ϕ∗ and ω∗ within their ranges. This pair of values will produce
33 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES one realization x(t)=cos(ω∗t+ϕ∗), of the process, which we will be considering. We will need to try
to obtain all characteristics of the process by taking this single realization and averaging various
quantities over time.
Note, that we do not assume any knowledge of the statistical properties of the random variables
involved. We do not even assume to know whether there are any random variables in the
realization, or what they are! All we know is some function x(t) depending on time. Consequently,
all our estimates will NOT include our knowledge of pωϕ
2
2.1) We are looking for mean value <x(t)> as an average of the realization of the process over
time:
€
1 T /2
1 T /2
1% 1
T / 2 (*
x(t) = lim ∫ x(t)dt = lim ∫ cos(ω * t + ϕ *)dt = lim ''
sin(ω * t + ϕ *)
T→∞ T
T→∞ T
T→∞ T ω *
−T / 2 *)
&
−T /2
−T /2
(
% ω *T
((
1 % % ω *T
= lim
sin '
+ ϕ ** − sin ' −
+ ϕ ** *
'
T→∞ ω *T &
& 2
)
&
))
2
(
% ω *T
((
1 % % ω *T
+ ϕ ** + sin '
− ϕ ** *
= lim
' sin '
T→∞ ω *T
& 2
))
)
& & 2
( % ω *T
(
% ω *T
ω *T
ω *T
+ϕ *+
−ϕ ** '
+ϕ *−
+ϕ **
'
1
2
2
2sin ' 2
= lim
* cos ' 2
*
T→∞ ω *T
2
2
* '
*
'
) &
)
&
% ω *T (
2
= lim
sin '
* cos(ϕ *) = 0 = X(t)
T→∞ ω *T
& 2 )
*
#ω T &
In the above sin%
( is bounded between -1 and 1, while T grows infinitely. Therefore, the whole
$ 2 '
expression goes to zero.
Conclusion: The time average of the realization, which can only be a constant not depending on
€ appears to coincide with the mean value of the process obtained by averaging over the
time,
ensemble of realizations. Thus, the process is ergodic with respect to mean value.
2.2) Let us try to find autocorrelation by averaging over time. Since the mean value of the process
in time is zero, autocorrelation and covariance estimated from a single realization will coincide.
1 T /2
x(t)
−
x
x(t
+
τ
)
−
x
=
(
)(
) T T ∫ ( x(t) − x ) ( x(t + τ ) − x ) dt =
−T /2
=
1 T /2
∫ cos (ω * t + ϕ *) cos (ω * t + ω * τ + ϕ *) dt =
T −T /2
=
1 1 T /2
∫ (cos (ω * t + ϕ * +ω * t + ω * τ + ϕ *) + cos (ω * t + ϕ * −ω * t − ω * τ − ϕ *)) dt =
T 2 −T /2
=
1 T /2
∫ (cos (2ω * t + ω * τ + 2ϕ *) + cos (−ω * τ )) dt
2T −T /2
34 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES =
1 " 1
T /2
T /2
sin ( 2ω * t + ω * τ + 2ϕ *)
+ cos (ω * τ ) t
$
2T $# 2ω *
−T / 2
−T / 2
%
'=
'&
1 "
#sin (ω *T + ω * τ + 2ϕ *) − sin (−ω *T + ω * τ + 2ϕ *)%& +
4ω *T
( T ( T ++
1
cos (ω * τ ) * − * − -- =
+
2T
) 2 ) 2 ,,
1
1 "
=
#sin (ω *T + ω * τ + 2ϕ *) − sin (−ω *T + ω * τ + 2ϕ *)%& + cos (ω * τ )
2
4ω *T
=
Now, take the limit as T tends to infinity. In the formulae above each sin is bounded between -1 and
1, so that the difference of two sines lies between -2 and 2, while T grows infinitely. Thus,
the first
term vanishes as T is increased. The second term does not depend on T, so
Compare this with the covariance obtained by averaging over the ensemble of realizations,
\Psi_{XX}(t,\tau) = \frac{1}{8\tau}\big(\sin(12\tau) - \sin(8\tau)\big).
process is not ergodic with respect to covariance, although it is a wide-sense stationary process.
Stationarity and Ergodicity: Practical aspects Does mean value ergodicity exist? For a process to be ergodic with respect to mean value, it has to
be 1st-order stationary. Namely, its PDD and consequently mean value should not change in time.
Rigorously this can be checked only if we have an ensemble of realizations. We can then calculate
PDD and mean over this ensemble at all time moments, and then check if they do not change in
time within some numerical accuracy that we define ourselves.
But what if have only one realization of the process? Can we judge about the process mean value
being constant in time? A mathematically rigorous and totally useless answer is no.
Anecdote. Sherlock Holmes and Dr. Watson are travelling in the balloon. They have an accident
due to which their balloon lands unexpectedly. They need to find out where they are, and they ask
a passer-by “Where are we now?” The passer-by answers: “You are in the balloon.” Sherlock
Holmes says to Dr. Watson “This person is a mathematician: he gave us an absolutely precise and
an totally useless answer.”
35 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Truth about Time Series Analysis Is there any more useful answer? Yes, if we are ready to sacrifice mathematical rigour. But if we
are not, there is no point in ever starting time series analysis (a tough fact).
In what follows “approximately” will be the keyword. When we will discuss if the two (or more)
values or functions derived from experimental data are the same, we will mean “approximately the
same”. It means that the difference between them, although always exists, is less than a certain
sufficiently small number.
So, is there any means to assess whether the process value is approximately constant in time from
its single realization? Yes.
Mean value of realization: visual assessment
Intuitively
introduced
mean value We can inspect the realization visually and try to answer the question: Does the realization look like
having a constant mean value in time? In the case illustrated above, the answer is rather “no” than
“yes”, since the mean value introduced intuitively drifts in time.
Mean value of realization: numerical estimate Split the realization into several segments, possibly intersecting, each should be long enough.
Calculate the mean value over each segment as the average over the segment over time. Compare
the mean values. If they coincide within some accuracy, the process mean can be regarded as
constant.
Series
3000.
2500.
2000.
1500.
1000.
500.
0
20
40
60
80
100
120
140
In the realization above the three mean values estimated numerically from three different segments
of the same realization are noticeably different.
36 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES Warning: judge with care! All definitions of stationarity and ergodicity are applied to the processes that can be observed for an
infinitely long time. However, in practice, any process can be observed only during a finite time
interval.
Observation of the realization
during some small time T might
lead one to the conclusion that the
process mean is floating in time.
However, the increase of
observation time would reveal that
the process mean is rather
constant.
How to check if the process is 1st order ergodic?
For the process to be 1st order ergodic it should be in the first instance 1st order stationary, i.e. its 1dimensional PDD should not change in time. Rigorously this can be verified only if we have an
ensemble of realizations. We can then calculate the estimate p1 (k, i) of PDD, where ti is i-th time
moment, and k is the number of subinterval within which the value of the process falls at the given
time ti. We can check if the mean value is constant in time within some numerical accuracy.
What can be done with only one realization of the process?
Split the realization into several segments, possibly intersecting, each should be long enough.
Calculate the p1 (k) over each segment. Compare the numerically obtained functions. If they are
approximately the same, the process can be regarded as 1st order stationary.
Note, that the inverse is not necessarily true. If the process is 1st order stationary, it is not
necessarily 1st order ergodic. But if we have only one realization of the process, there is no means
to check ergodicity. In practice, we just hope that the process is ergodic.
37 RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES One-dimensional Probability Density Function
from experimental data
1. PDF from an Ensemble of Realizations
In reality the random processes are usually not quasideterministic. Moreover, typically we know
nothing about the processes or the systems in which they occur beyond their realizations. In spite of
that, can we estimate some statistical characteristics of such processes?
Assume that the experiment can be repeated many times under precisely the same conditions, and
thus we can launch the same random process A(t) many times. We will obtain an ensemble of N
realizations, which are sampled with step Δt.
aj(ti)
ti
Assume that we consider a realization number j, whose points are denoted as aj(ti)= aji , where ti is
the i-th time moment at which the value of the quantity being registered is known, ti = iΔt, where i
is effectively the discrete time, and Δt is the sampling step. Suppose there are L data points in each
discretized realization (time series). Let experimental data be given in the form of a Table below.
Number of trial j /
Time ti
i = 1, t i = Δt
i = 2, t i = 2Δt
i = 3, t i = 3Δt
...
i = L, t i = LΔt
j=1
a11
a12
a13
...
a1L
€2
a1
€
a22
a32
€
...
aL2
j=2
€
...
€
...
€
...
€
...
...
€
...
j=L
€
a1N
€
a2N
€
a3N
...
€
aLN
Each column number i represents values of some random variable A with its own PDF p1A (a) .
€
€ these PDFs.
Generally all p1A (a)€are different at€different time moments
ti. We wish to estimate
€
€
Algorithm for PDF estimation from an ensemble of realizations
The algorithm for estimation of the 1-dimensional PDF p1A (a,t) of the random process A(t) at each
time moment is given below.
1. Fix time at ti and consider elements of the
€ respective column.
2. Find aimax and aimin .
38
RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES 3. Split the range [aimin; aimax) into M equal subintervals of length Δx,
4. Define the following set of sub-intervals:
[a
min
i
€
;aimin + Δx ), [ aimin + Δx;aimin + 2Δx ) ,…,
It is obvious that aimin + MΔx = aimax . Each interval can be given a number k changing from 1 to M.
€
5. Reserve an array of numbers qk, k=1,…,M, and set all qk,=0
6. At the time moment ti consider in turn experimental points aij from different trials j: j is
sequentially changed between 1 and N. For each aij find the subinterval it falls into, i.e the
number k as follows:
In addition, number k satisfies the following conditions:
a imin ≤ a ij < a imin + Δx,
If
If a
min
i
j
+ 7Δx ≤ a i < a
min
i
then k = 1
+ 8Δx,
then k = 8
In the above notation [] means the integral part of the number.
7. For each aij ,€when k is found we add 1 to qk: qk =qk+1.
8. As a result, each of the array element qk is the number of data points that has fallen into the
k-th interval.
The array qk can be represented as a table:
k
1
2
…
M
qk
q1
q2
…
qM
In the table, the meaning of elements qk is as follows: qk is the number of times the event number k
has occurred. The sum of all qk is precisely equal to N.
39
RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES 9. Consider the probability P for the point aij to fall inside k-th interval, i.e.
Define P as qk /N (ratio of numbers of outcomes).
10. Use the definition of the probability density function
p1 (x,t) = Δxlim
→0
Ρ{X(t) ∈ [x, x + Δx)}
Δx
Importantly, we cannot numerically take a limit as Δx tends to zero. Give up this limit.
11. Substitute P€=qk /N, and as an approximation of “true” p1A (a,t) calculate
€
Here, p˜ 1 (k,i) shows the approximate probability of the process to take the value within the k-th
interval that includes ai.
€
2. PDF from Experimental Data: From Many to One Realization
If we can obtain an infinite number of realizations of the same random process, it does not matter
for us whether the process is stationary or not, or ergodic or not. We will be able to numerically
estimate its characteristics even if they change in time. We will also be able to tell from these
characteristics whether the process is stationary, or ergodic, or perhaps neither.
However, if we have only one realization of the process, we should be careful when judging about
the process characteristics.
Assume that the process is 1st order ergodic. Then its 1-dimensional PDF can be estimated from its
single realization as follows.
3. PDF from a Single Realization: Proof
1. Consider a 1st order ergodic process X(t) and its single realization x(t).
2. Consider a particular interval [x*;x*+Δx).
40
RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES 3. Consider a function f(y), such that
4. Consider f(x(t)).
f=1 if x(t) falls within [x*;x*+Δx). Let this occur at times t within intervals t ∈ [t1i ;t 2i ) .
f=0 if x(t) falls outside [x*;x*+Δx).
5. Consider an average of f(x(t)) over time interval T
€
T /2
f (x(t))
€
T
=
1
∫ f (x(t))dt =
T −T / 2
6. Consider an average of f(X(t)) over the ensemble of realizations. Use the fundamental
theorem for one random variable: here random variable x is the value of the random process
X(t) at the time moment t.
∞
f (X(t)) =
∫
f (X(t))p1X (x)dx
−∞
Consider f(X(t)).
f=1 if X(t) falls within [x*;x*+Δx).
f=0 if X(t) falls outside [x*;x*+Δx).
€
If Δx is very small, p1X (x) is almost constant within [x*;x*+Δx).Then the above expression is
approximately equal to
€
7. Ergodicity of the 1st order implies f (x(t)) = f (X(t)). The left-hand side is the average
€
41
RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES over time along one realization x(t) of the random process X(t). The right-hand side is the
average over ensemble of realizations xi(t) of the random process X(t). Then
and
If T → ∞
and Δx → 0 , the equality becomes exact.
4. PDF from a Single Realization: Algorithm
Below a sequence of steps is given that will allow you to estimate a 1-dimensional PDF p1A (a) of a
random process A(t) from its single realization a(t) sampled with step Δt, so that a dataset ai is
obtained of length L (i=1,2,…,L).
€
1. Find maximum amax and minimum amin values of the realization.
min
max
2. Split the interval [a ; a ] into M sub-intervals of length Δx each.
3. For each point of realization, find the subinterval numbered k into which
the point falls. k changes between 1 and M.
4. Calculate the number of points Lk in each subinterval numbered k.
5. Find estimate p˜ 1 (k) of PDF p1A (a) as
€
€
In fact this means calculating the total number of time steps the realization had spent within the k-th
subinterval.
5. Mean Value, Variance and Covariance from a Single Realization
Below the formulae are given for the numerical estimation of other statistical characteristics from a
single realization a(t) of an ergodic process, sampled with a step Δt and observed during T seconds.
Thus, we assume that experimental data we have is a sequence of L values a1,a2 ,...,aL .
Mean value
€
€
42
RANDOM PROCESSES AND TIME SERIES ANALYSIS. PART II: RANDOM PROCESSES In the above, we approximate an integral by a sum. We assume the partitioning of the interval [0,T]
(observation period) into L sub-intervals of length Δt each, and use the method of rectangles to
approximate the integral within each small sub-interval, see Figure below.
€
By analogy, we can derive the following approximate formulae that allow us to estimate mean
square, variance and covariance from a single realization of an ergodic random process.
Mean square
Variance
2
1 L
a ≈ ∑ (ai )
L i=1
2
σA2 = a 2 − a
2
Here we take estimates of a 2 and a from above.
€
L −L 4
(
+
1
*
Covariance
(
τ
)
=
Ψ
jΔt
≈
a
a
Ψ
) * L ∑ i i+ j -- − a 2,
€
AA
AA (
) L − 4 i=1
,
L
j€= 0,..., €
4
In estimating the covariance function using the formula above, we assume that any value of
( jΔt) is estimated by averaging over ¾ of the whole dataset to ensure that averaging is good. To
ΨAA €
satisfy this requirement, we can only estimate covariance for τ = jΔt from j=0 and up to a quarter
of the dataset length j = L 4 .
€
You can see that estimates of statistical characteristics
€ are much easier to obtain from a single
realization than from an ensemble of realizations.
€
43