
2
Probability and Stochastic Processes

In stochastic processes the future is not uniquely determined, but we have at least probability relations enabling us to make predictions.

William Feller [47], p. 420.

Signals and messages containing information about electrical, mechanical, chemical, biological, and other processes are usually affected by various types of noise and disturbances, the values of which often cannot be ignored. In such cases, the deterministic approximation becomes too rough, and probabilistic methods are used to achieve the best results. Under the influence of noise, any process becomes random, and accurate information extraction about its features requires mathematical methods describing random variables, stochastic processes, and SDEs. This chapter provides a brief introduction to the concepts and foundations of the theory of probability and stochastic processes, which will be used later in the discussion of methods of state estimation.

2.1 Random Variables

In engineering practice we often deal with some kind of experiment whose random outcomes cannot be used directly. For example, tracking distance can be measured via time of arrival. Thus, to each outcome we can assign a real number $X$, call it a random variable [191], and describe $X$ in terms of probability theory. A collection of random variables sets up a random process.

Scalar random variable: Since $X$ is random, it cannot be described in deterministic terms. But we may wonder how frequently $X$ occurs at or below some constant value $x$, which leads to the concept of probability. Let us think that $X$ occurs many times and assign an event $A$, which means that $X$ is happening at or below $x$. The probability that $X \leqslant x$ is calculated as

(2.1) $P\{X \leqslant x\} = \frac{\text{Possible outcomes favoring event } A}{\text{Total possible outcomes}}$

and is represented by the cumulative distribution function (cdf),

(2.2) $F_X(x) = P\{X \leqslant x\},$

which is equal to the probability that $X$ takes values not exceeding $x$ and has the following properties:

$\begin{aligned} 0 &\leqslant F_X(x) \leqslant 1, \\ F_X(\alpha) &\leqslant F_X(\beta) \ \text{if}\ \alpha \leqslant \beta, \\ P\{\alpha < X \leqslant \beta\} &= F_X(\beta) - F_X(\alpha), \\ F_X(-\infty) &= 0, \\ F_X(\infty) &= 1. \end{aligned}$
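The frequency ratio in (2.1) and the cdf properties can be checked numerically. The following is a minimal sketch that assumes a hypothetical experiment with uniformly distributed outcomes; the function name `empirical_cdf`, the sample size, and the values of `alpha` and `beta` are illustrative choices, not part of the text.

```python
import random

def empirical_cdf(samples, x):
    """Estimate F_X(x) as (outcomes with X <= x) / (total outcomes), as in (2.1)."""
    favorable = sum(1 for s in samples if s <= x)
    return favorable / len(samples)

random.seed(0)  # fixed seed so that repeated runs give the same numbers
# Hypothetical experiment: 10,000 outcomes drawn uniformly from [0, 1]
samples = [random.uniform(0.0, 1.0) for _ in range(10_000)]

alpha, beta = 0.2, 0.7
F_alpha = empirical_cdf(samples, alpha)
F_beta = empirical_cdf(samples, beta)

# P{alpha < X <= beta} counted directly, to compare with F(beta) - F(alpha)
p_interval = sum(1 for s in samples if alpha < s <= beta) / len(samples)

print(f"F(alpha) = {F_alpha:.3f}, F(beta) = {F_beta:.3f}")
print(f"P(alpha < X <= beta) = {p_interval:.3f}, "
      f"F(beta) - F(alpha) = {F_beta - F_alpha:.3f}")
```

The estimate is monotone in `x` and bounded by 0 and 1 by construction, which mirrors the listed properties.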

One might also be interested in a function that represents the concentration of the values of $X$ around $x$. The corresponding function $p_X(x)$ is called the probability density function (pdf). But if the random variable is discrete, it is represented with the probability mass function (pmf). An example is the Bernoulli distribution of two discrete random values 0 and 1.

Both functions are nonnegative, and the pdf has the following fundamental properties:

(2.3) $\int_{-\infty}^{\infty} p_X(x)\,\mathrm{d}x = 1,$

(2.4) $F_X(x) = \int_{-\infty}^{x} p_X(z)\,\mathrm{d}z \;\Longleftrightarrow\; p_X(x) = \frac{\mathrm{d}}{\mathrm{d}x} F_X(x).$

Another way to describe the properties of a random variable is to define the expectation of its exponential measure as

(2.5) $\Phi_X(j\theta) = \mathcal{E}\{e^{j\theta X}\} = \int_{-\infty}^{\infty} p_X(x)\, e^{j\theta x}\,\mathrm{d}x,$

which is the direct Fourier transform of the pdf with the variable $\theta$ playing the role of negative angular frequency. The function $\Phi_X(j\theta)$, which is usually complex, is called the characteristic function (cf) and represents $X$ in the transform domain. Given $\Phi_X(j\theta)$, the pdf of $X$ can be defined by the inverse Fourier transform as

(2.6) $p_X(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \Phi_X(j\theta)\, e^{-j\theta x}\,\mathrm{d}\theta \geqslant 0.$

In some cases, one can also use the logarithm of the characteristic function or log‐characteristic function

$\Psi_X(j\theta) = \ln \Phi_X(j\theta),$

which can be more convenient for some distributions.
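As a small illustration of the characteristic function and its logarithm, the sketch below evaluates the expectation of $e^{j\theta X}$ for a discrete variable, where the integral reduces to a sum over the pmf. The Bernoulli parameter `p`, the value of `theta`, and the helper names are assumptions made for this example only.

```python
import cmath

def characteristic_function(pmf, theta):
    """Phi_X(j*theta) = E{exp(j*theta*X)} for a discrete X given as {value: probability}."""
    return sum(prob * cmath.exp(1j * theta * x) for x, prob in pmf.items())

def log_characteristic_function(pmf, theta):
    """Psi_X(j*theta) = ln Phi_X(j*theta), principal branch of the complex logarithm."""
    return cmath.log(characteristic_function(pmf, theta))

p = 0.3  # hypothetical probability of the value 1
bernoulli = {0: 1.0 - p, 1: p}

theta = 0.5
phi = characteristic_function(bernoulli, theta)
closed_form = (1.0 - p) + p * cmath.exp(1j * theta)  # known cf of the Bernoulli law
psi = log_characteristic_function(bernoulli, theta)

print(phi, closed_form, psi)
```

At $\theta = 0$ the cf equals the total probability, one, and its magnitude never exceeds one.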

Multiple random variables: Let us now consider two random variables $X$ and $Y$ such that, by (2.2), $F_X(x) = P\{X \leqslant x\}$ and $F_Y(y) = P\{Y \leqslant y\}$. The joint cdf of $X$ and $Y$ is defined by

(2.7) $F_{X,Y}(x,y) = P(X \leqslant x, Y \leqslant y)$

and has the following main properties:

$\begin{aligned} 0 \leqslant F_{X,Y}(x,y) &\leqslant 1, \\ F_{X,Y}(x,\infty) &= F_X(x), \\ F_{X,Y}(\infty,y) &= F_Y(y), \\ F_{X,Y}(\infty,\infty) &= 1, \\ F_{X,Y}(x,-\infty) &= F_{X,Y}(-\infty,y) = 0, \\ F_{X,Y}(a,c) &\leqslant F_{X,Y}(b,d) \ \text{if}\ a \leqslant b \ \text{and}\ c \leqslant d. \end{aligned}$

The joint pdf $p_{X,Y}(x,y)$ and the joint cdf $F_{X,Y}(x,y)$ of random variables $X$ and $Y$ are related to each other as

(2.8) $p_{X,Y}(x,y) = \frac{\partial^2 F_{X,Y}(x,y)}{\partial x\,\partial y},$

(2.9) $F_{X,Y}(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} p_{X,Y}(v,u)\,\mathrm{d}u\,\mathrm{d}v$

and the joint pdf has the following properties:

$\begin{aligned} p_{X,Y}(x,y) &\geqslant 0, \\ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y &= 1, \\ p_X(x) &= \int_{-\infty}^{\infty} p_{X,Y}(x,y)\,\mathrm{d}y, \\ p_Y(y) &= \int_{-\infty}^{\infty} p_{X,Y}(x,y)\,\mathrm{d}x. \end{aligned}$
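The marginalization properties have a direct discrete analog in which the integrals become sums. The sketch below uses a hypothetical joint pmf of two discrete variables; the table values and the function names are assumptions chosen for illustration.

```python
# Hypothetical joint pmf of X in {0, 1} and Y in {0, 1, 2}, keyed by (x, y)
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.25, (1, 1): 0.15, (1, 2): 0.20,
}

def marginal_x(joint):
    """p_X(x): sum the joint pmf over all values of y."""
    out = {}
    for (x, _y), prob in joint.items():
        out[x] = out.get(x, 0.0) + prob
    return out

def marginal_y(joint):
    """p_Y(y): sum the joint pmf over all values of x."""
    out = {}
    for (_x, y), prob in joint.items():
        out[y] = out.get(y, 0.0) + prob
    return out

def joint_cdf(joint, x, y):
    """F_{X,Y}(x, y) = P(X <= x, Y <= y)."""
    return sum(prob for (xi, yi), prob in joint.items() if xi <= x and yi <= y)

p_x = marginal_x(joint)
p_y = marginal_y(joint)
print(p_x, p_y, joint_cdf(joint, 0, 1))
```

Taking `y` at its largest value in `joint_cdf` reproduces the marginal cdf of $X$, in line with $F_{X,Y}(x,\infty) = F_X(x)$.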

Given the distribution of a random variable $X$, the latter can be represented in general by an infinite number of quantitative measures, called moments, which are associated with the pdf, and cumulants, which are associated with the cf. For example, the zeroth moment is the total probability (i.e., one), the first moment is the mean or average, the second central moment is the variance, the third standardized moment is the skewness, and the fourth standardized moment is the kurtosis.

2.1.1 Moments and Cumulants

Two kinds of special characteristics called moments have found applications in the representation of random variables: raw (ordinary) moments and central moments.

Raw Moments

The raw moment of order $v$ of a random variable $X$ is defined as the average of $X^v$,

(2.10) $\mathcal{E}\{X^v\} = m_v = \int_{-\infty}^{\infty} x^v p_X(x)\,\mathrm{d}x$

and the infinite set of raw moments completely represents the variable $X$.

For a discrete population of $X$ given by the values $X_i$, $i \in [-N, N]$, the raw moment of order $v$ can be approximated by

(2.11) $m_v \approx \frac{1}{2N+1}\sum_{i=-N}^{N} X_i^v$

and the approximation becomes exact when $N \to \infty$.

For a population of $X(t)$ observed on a continuous time interval $[-T, T]$, the raw moment can also be calculated by

(2.12) $m_v \approx \frac{1}{2T}\int_{-T}^{T} X^v(t)\,\mathrm{d}t$

and the approximation becomes exact when $T \to \infty$.

The most commonly used raw moment is that of first order, $m_1$, which is known as the mean or average of $X$,

(2.13) $m_1 = \bar{X} = \mathcal{E}\{X\}.$

There are also two other special characteristics associated with the mean of $X$, called the median and the mode.

Median: The median is the value separating the higher half of the population of $X$ values from the lower half. Therefore, the median may be thought of as the "middle" value. For example, a sorted set with an odd number of values whose middle value is 5 has the median 5, and a set with an even number of values whose two middle values are 4 and 5 has the median 4.5. Replacing an extreme value of the first set with a distant outlier still gives the median 5. Therefore, this statistic is robust against outliers.

Mode: The mode of a population of $X$ is the value that appears most often. The mode is thus the value of $x$ at which the pmf takes its maximum value. The mode is not unique if the pmf takes the same maximum value at several points. The most extreme case is the uniform distribution, where all values occur equally often.

Mean, median, and mode: The mean, median, and mode are the same in symmetric unimodal distributions, such as the Gaussian distribution, and they are different in skewed distributions.
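The behavior of the three statistics can be tried out with Python's standard `statistics` module. The sets below are hypothetical and are chosen only so that the medians come out as 5, 4.5, and 5; they are not taken from the text.

```python
import statistics

# Hypothetical sets, chosen for illustration only
odd_set = [1, 3, 5, 7, 9]        # odd count: the median is the middle value
even_set = [1, 3, 4, 5, 7, 9]    # even count: average of the two middle values
outlier_set = [1, 3, 5, 7, 900]  # largest value replaced by an outlier

for name, data in [("odd", odd_set), ("even", even_set), ("outlier", outlier_set)]:
    print(name, "mean =", statistics.mean(data), "median =", statistics.median(data))

# The mode need not be unique: multimode returns every most frequent value
repeated = [1, 2, 2, 3, 3]
modes = statistics.multimode(repeated)
print("modes =", modes)
```

The outlier shifts the mean strongly while the median stays put, which is the robustness property mentioned above.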

Central Moments

It is also desirable to have moments relative to the mean (2.13) of a random variable $X$. The corresponding moment of order $v$ is called the central moment and is defined as the mean value of $(X - \bar{X})^v$,

(2.14) $\mu_v = \mathcal{E}\{(X-\bar{X})^v\} = \int_{-\infty}^{\infty} (x-\bar{X})^v p_X(x)\,\mathrm{d}x.$

For discrete and continuous populations of $X$, the central moment of order $v$ can be determined by, respectively,

(2.15) $\mu_v = \lim_{N\to\infty} \frac{1}{2N+1}\sum_{i=-N}^{N} (X_i-\bar{X})^v,$

(2.16) $\mu_v = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} [X(t)-\bar{X}]^v\,\mathrm{d}t.$

The most common central moment is that of second order, $\mu_2$, which is called the variance $\sigma_X^2$ and can also be calculated by

(2.17) $\sigma_X^2 = m_2 - m_1^2.$

The square root of the variance is called the standard deviation

(2.18) $\sigma_X = \sqrt{\sigma_X^2} \geqslant 0,$

which is always nonnegative and measures the dispersion of a variable $X$ about its mean value $\bar{X}$.
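The sample versions of the raw and central moments, together with the identity (2.17), can be sketched as follows. The population values are hypothetical, and the averages run over all available values rather than over a symmetric index range.

```python
import math

def raw_moment(samples, v):
    """Sample raw moment m_v: the average of X_i**v."""
    return sum(x ** v for x in samples) / len(samples)

def central_moment(samples, v):
    """Sample central moment mu_v: the average of (X_i - mean)**v."""
    mean = raw_moment(samples, 1)
    return sum((x - mean) ** v for x in samples) / len(samples)

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical population

m1 = raw_moment(samples, 1)            # mean
m2 = raw_moment(samples, 2)            # second raw moment
variance = central_moment(samples, 2)  # second central moment
sigma = math.sqrt(variance)            # standard deviation

print(m1, m2, variance, m2 - m1 ** 2, sigma)
```

For this data the mean is 5, the variance is 4, and $m_2 - m_1^2 = 29 - 25$ gives the same variance.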

Cumulants

The moments of a random variable can also be determined using the expansion of the cf in a Maclaurin series. The coefficients of the Maclaurin series of the log-characteristic function are called semi-invariants or cumulants $\kappa_v$ of order $v$ and are calculated as

$\kappa_v = j^{-v}\left[\frac{\mathrm{d}^v \ln \Phi(j\theta)}{\mathrm{d}\theta^v}\right]\Bigg|_{\theta=0}.$

For given cumulants $\kappa_v$, the characteristic function is defined by

$\Phi(j\theta) = \exp\left[\sum_{v=1}^{\infty} \frac{\kappa_v}{v!}(j\theta)^v\right].$

It follows from the definition (2.5) of $\Phi(j\theta)$ that the raw moments of $X$ can be specified via $\Phi(j\theta)$ as

$m_v = j^{-v}\frac{\mathrm{d}^v}{\mathrm{d}\theta^v}\Phi(j\theta)\Big|_{\theta=0}$

and that the cf can be restored through the raw moments as

$\Phi(j\theta) = \sum_{v=0}^{\infty} \frac{m_v}{v!}(j\theta)^v.$

It also follows that cumulants can be represented through moments and vice versa as $\kappa_1 = m_1$, $\kappa_2 = m_2 - m_1^2$, $\kappa_3 = m_3 - 3m_1 m_2 + 2m_1^3$, … and $m_1 = \kappa_1$, $m_2 = \kappa_2 + \kappa_1^2$, $m_3 = \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3$, …, due to exact transitions from one type of characteristics to another.
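The transitions between raw moments and cumulants can be sketched for the first three orders using the standard identities written in the comments below. The check uses the Poisson law, whose cumulants are all equal to its rate; the rate value 2 and the function names are assumptions made for this example.

```python
def cumulants_from_raw_moments(m1, m2, m3):
    """First three cumulants from the first three raw moments."""
    k1 = m1
    k2 = m2 - m1 ** 2                       # the variance
    k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3     # the third central moment
    return k1, k2, k3

def raw_moments_from_cumulants(k1, k2, k3):
    """Inverse transition: first three raw moments from the cumulants."""
    m1 = k1
    m2 = k2 + k1 ** 2
    m3 = k3 + 3 * k1 * k2 + k1 ** 3
    return m1, m2, m3

# Poisson law with rate 2 (assumed for illustration): every cumulant equals the rate
rate = 2.0
moments = raw_moments_from_cumulants(rate, rate, rate)
cumulants = cumulants_from_raw_moments(*moments)

print(moments, cumulants)
```

The round trip returns the original cumulants, which illustrates that the two sets of characteristics carry the same information.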

Skewness and Kurtosis

A symmetric distribution of $X$ is not always the case in physics and engineering, since many quantities are positive-valued, such as distance, range, and magnitude. On the other hand, many symmetric distributions differ from the normal law by a greater rectangularity or longer tails. In such cases, two other statistics are used: skewness and kurtosis.

Skewness: If a real-valued random variable $X$ is distributed asymmetrically about the mean, skewness is often used as a measure of asymmetry. The skewness value is calculated by

(2.19) $\gamma_1 = \frac{\mu_3}{\sqrt{\mu_2^3}} = \frac{\mu_3}{\sigma_X^3} = \frac{\kappa_3}{\sqrt{\kappa_2^3}}$

and it can be positive, negative, zero (as for symmetric distributions), or undefined.

Figure 2.1 illustrates the effect of skewness on a Gaussian distribution, which is unimodal and symmetric. It is seen that the mean, median, and mode are equal in the Gaussian distribution (Fig. 2.1b). However, negative skewness (Fig. 2.1a) and positive skewness (Fig. 2.1c) matter, so that the mean, median, and mode no longer coincide.

**Figure 2.1** Effects of skewness on unimodal distributions: (a) negatively skewed, (b) normal (no skew), and (c) positively skewed.

**Figure 2.2** Common forms of kurtosis: mesokurtic (normal), platykurtic (higher rectangularity), and leptokurtic (longer tails).

Kurtosis: In some cases, a real-valued random variable $X$ is distributed with multiple outliers, and kurtosis is used as a measure of the "tailedness" or "peakedness" of its pdf. In this sense, kurtosis is called a descriptor of the pdf shape. There are different ways of quantifying kurtosis and estimating it from the population of $X$. The most widely used measure of kurtosis is the fourth standardized moment

(2.20) $\gamma_2 = \frac{\mu_4}{\sigma_X^4} = \frac{\kappa_4}{\kappa_2^2} + 3,$

where the term 3 appears because $\kappa_4 = \mu_4 - 3\mu_2^2$.

Figure 2.2 illustrates three commonly recognized forms of kurtosis: leptokurtic, mesokurtic (normal), and platykurtic.

The difference is that the platykurtic form has a higher rectangularity and the leptokurtic form has longer tails compared to the mesokurtic form, which is normal.
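Both shape statistics can be estimated from a population through its central moments. The sketch below computes the third and fourth standardized moments for hypothetical data sets; the set values and function names are illustrative assumptions.

```python
def central_moment(samples, v):
    """Sample central moment of order v."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** v for x in samples) / len(samples)

def skewness(samples):
    """Third standardized moment, mu_3 / mu_2**(3/2), as in (2.19)."""
    return central_moment(samples, 3) / central_moment(samples, 2) ** 1.5

def kurtosis(samples):
    """Fourth standardized moment, mu_4 / mu_2**2."""
    return central_moment(samples, 4) / central_moment(samples, 2) ** 2

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]      # hypothetical, symmetric about 0
right_tailed = [0.0, 0.0, 0.0, 1.0, 9.0]     # hypothetical, long right tail
left_tailed = [-x for x in right_tailed]     # mirror image, long left tail

print(skewness(symmetric), skewness(right_tailed), skewness(left_tailed))
print(kurtosis(symmetric))
```

Mirroring the data flips the sign of the skewness and leaves the kurtosis unchanged, since the third moment is odd and the fourth is even in the deviations.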

Example 2.1 Gaussian random variable. A random variable $X$, distributed according to the normal law

(2.21) $p_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left[-\frac{(x-\bar{X})^2}{2\sigma_X^2}\right],$

is called a Gaussian random variable. It is fully represented by only two statistics: the mean $\bar{X}$ and the variance $\sigma_X^2$. Therefore, it is usually denoted as $X \sim \mathcal{N}(\bar{X}, \sigma_X^2)$. All Gaussian cumulants of order higher than two are zero, so that the skewness and the excess kurtosis (the fourth standardized moment minus 3) are zero.

The cdf of a Gaussian variable is given by

(2.22) $\begin{aligned} F_X(x) &= \int_{-\infty}^{x} p_X(z)\,\mathrm{d}z \\ &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{x-\bar{X}}{\sigma_X}} e^{-\frac{z^2}{2}}\,\mathrm{d}z = \Phi\left(\frac{x-\bar{X}}{\sigma_X}\right), \end{aligned}$

where

(2.23) $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{z^2}{2}}\,\mathrm{d}z = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$

is the probability integral and $\operatorname{erf}(x)$ is the error function.

A Gaussian random variable $X$ takes values from $-\infty$ to $\infty$, but the Gaussian pdf rapidly approaches zero when $x$ deviates from the mean $\bar{X}$. Therefore, it is often desirable to specify the probability that $X$ lies in some interval $[a, b]$, and thereby estimate the error probability that $X$ lies outside this interval.

The probability that a Gaussian random variable $X$ lies in the interval $[a, b]$ is defined as

(2.24) $\begin{aligned} P_X\{a \leqslant x \leqslant b\} &= F_X(b) - F_X(a) \\ &= \Phi\left(\frac{b-\bar{X}}{\sigma_X}\right) - \Phi\left(\frac{a-\bar{X}}{\sigma_X}\right) \end{aligned}$

via the probability integral (2.23).
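The interval probability (2.24) is easy to evaluate with the error function from Python's `math` module. The following sketch assumes a hypothetical mean and standard deviation and uses intervals of one, two, and three standard deviations around the mean; the function names are illustrative.

```python
import math

def probability_integral(x):
    """Phi(x) = 0.5 * [1 + erf(x / sqrt(2))], as in (2.23)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_interval_probability(a, b, mean, sigma):
    """P{a <= X <= b} for a Gaussian X, as in (2.24)."""
    return (probability_integral((b - mean) / sigma)
            - probability_integral((a - mean) / sigma))

mean, sigma = 1.0, 2.0  # hypothetical Gaussian parameters

inside = {}
for k in (1, 2, 3):
    inside[k] = gaussian_interval_probability(mean - k * sigma, mean + k * sigma,
                                              mean, sigma)
    # Error probability: X falls outside the k-sigma interval
    print(k, round(inside[k], 4), round(1.0 - inside[k], 4))
```

The results do not depend on the chosen mean and variance, because the interval bounds are standardized before the probability integral is applied.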

In Fig. 2.3 we summarize the previous analysis with connections between cdf $F_X(x)$, pdf $p_X(x)$, cf $\Phi_X(j\theta)$, raw moments $m_v$, central moments $\mu_v$, and cumulants $\kappa_v$ of a random variable $X$.

**Figure 2.3** Relationships and connections between cdf $F_X(x)$, pdf $p_X(x)$, cf $\Phi_X(j\theta)$, raw moments $m_v$, central moments $\mu_v$, and cumulants $\kappa_v$ of a random variable $X$.

2.1.2 Product Moments

Let a random variable $X$ represent one random measurement and a random variable $Y$ represent another random measurement. Thus, we can think of the two variables as distributed with a joint pdf $p_{X,Y}(x,y)$.

Similarly to a single variable, the product moment of two variables $X$ and $Y$ is defined as follows:

(2.26) $m_{11} = \mathcal{E}\{XY\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x y\, p_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y.$

Accordingly, the central product moment or the mean‐adjusted product moment is defined by

(2.27) $\begin{aligned} \mu_{11} &= \operatorname{cov}(X,Y) = \mathcal{E}\{(X-\bar{X})(Y-\bar{Y})\} \\ &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x-\bar{X})(y-\bar{Y})\, p_{X,Y}(x,y)\,\mathrm{d}x\,\mathrm{d}y \end{aligned}$

and is called the covariance.

In the special case where $X$ and $Y$ are independent, the joint pdf is represented by the product $p_{X,Y}(x,y) = p_X(x)\,p_Y(y)$, and the product moments become

(2.28) $m_{11} = \int_{-\infty}^{\infty} x\, p_X(x)\,\mathrm{d}x \int_{-\infty}^{\infty} y\, p_Y(y)\,\mathrm{d}y = \mathcal{E}\{X\}\,\mathcal{E}\{Y\},$

(2.29) $\begin{aligned} \mu_{11} &= \int_{-\infty}^{\infty} (x-\bar{X})\, p_X(x)\,\mathrm{d}x \int_{-\infty}^{\infty} (y-\bar{Y})\, p_Y(y)\,\mathrm{d}y \\ &= \mathcal{E}\{X-\bar{X}\}\,\mathcal{E}\{Y-\bar{Y}\}. \end{aligned}$

If two random variables $X$ and $Y$ are independent with properties (2.28) and (2.29), then $\mu_{11} = 0$, because $\mathcal{E}\{X-\bar{X}\} = 0$, and it follows that they are also uncorrelated. However, the converse is generally not true, except in a few special cases. Note also that terminologically, if $\mathcal{E}\{XY\} = 0$, then the random variables $X$ and $Y$ are orthogonal.

It is often convenient to use normalized measures. The normalized mean‐adjusted product moment is called the correlation coefficient or population Pearson correlation coefficient, and it is defined as

(2.30) $\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y},$

where $\sigma_X$ and $\sigma_Y$ are the standard deviations of variables $X$ and $Y$.

As the degree of correlation between two random variables $X$ and $Y$, the correlation coefficient ranges as $-1 \leqslant \rho_{X,Y} \leqslant 1$ due to the following limiting properties:

If $Y = X$, the correlation is maximum, $\rho_{X,Y} = 1$.
If $Y = -X$, it is obvious that $\rho_{X,Y} = -1$.
For uncorrelated $X$ and $Y$, we have $\operatorname{cov}(X,Y) = 0$ and hence $\rho_{X,Y} = 0$.
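The limiting properties of the correlation coefficient can be reproduced with sample estimates. In the sketch below the data are hypothetical, and the third case deliberately takes one variable as a deterministic function of the other to show that zero correlation does not imply independence.

```python
import math

def covariance(xs, ys):
    """Sample mean-adjusted product moment of two equally long populations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

def correlation_coefficient(xs, ys):
    """Sample Pearson correlation: covariance over the product of standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical population
negated = [-x for x in xs]              # the same variable with opposite sign
squared = [(x - 3.0) ** 2 for x in xs]  # depends on xs, yet is uncorrelated with it

rho_same = correlation_coefficient(xs, xs)
rho_negated = correlation_coefficient(xs, negated)
rho_squared = correlation_coefficient(xs, squared)

print(rho_same, rho_negated, rho_squared)
```

The three values come out as 1, -1, and 0 up to rounding.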

Product moments are fundamental when analyzing interactions between random variables. However, if two random variables are considered at two different points in some space or at two different points in time, it is necessary to use another function, called the correlation function.

2.1.3 Vector Random Variables

Let us now represent two random variables as column vectors

(2.31) $X = [X_1\ X_2\ \ldots\ X_n]^T \in \mathbb{R}^n,$

(2.32) $Y = [Y_1\ Y_2\ \ldots\ Y_m]^T \in \mathbb{R}^m,$

where entries are drawn from populations or variables of some random processes. Such vectors can be characterized by their autocorrelation and cross‐correlation.

The autocorrelation matrix $\mathcal{R}_X$ and autocovariance matrix $\mathcal{C}_X$ of a vector $X$ are defined as

(2.33) $\begin{aligned} \mathcal{R}_X &= \mathcal{E}\{X X^T\} \\ &= \begin{bmatrix} \mathcal{E}\{X_1^2\} & \mathcal{E}\{X_1 X_2\} & \ldots & \mathcal{E}\{X_1 X_n\} \\ \mathcal{E}\{X_2 X_1\} & \mathcal{E}\{X_2^2\} & \ldots & \mathcal{E}\{X_2 X_n\} \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{E}\{X_n X_1\} & \mathcal{E}\{X_n X_2\} & \ldots & \mathcal{E}\{X_n^2\} \end{bmatrix}, \end{aligned}$

(2.34) $\begin{aligned} \mathcal{C}_X &= \mathcal{E}\{(X-\bar{X})(X-\bar{X})^T\} \\ &= \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \ldots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \ldots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \ldots & \sigma_n^2 \end{bmatrix}, \end{aligned}$

where $\sigma_i^2$ is the variance of $X_i$ and $\sigma_{ij}$ is the covariance of $X_i$ and $X_j$ for $i \neq j$. It follows that matrices $\mathcal{R}_X$ and $\mathcal{C}_X$ are square and symmetric. It can also be shown that for any vector $z \in \mathbb{R}^n$ the following values are nonnegative,

$\begin{aligned} z^T \mathcal{R}_X z &\geqslant 0, \\ z^T \mathcal{C}_X z &\geqslant 0, \end{aligned}$

and both $\mathcal{R}_X$ and $\mathcal{C}_X$ are thus positive semidefinite. Note that if the previous relationships are strict inequalities for all nonzero $z$, matrices $\mathcal{R}_X$ and $\mathcal{C}_X$ are called positive definite.
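Symmetry and positive semidefiniteness can be observed on sample estimates of the two matrices. The sketch below averages outer products over a few hypothetical realizations of a three-dimensional vector; the data, the test vectors, and the function names are assumptions made for illustration.

```python
def mean_vector(rows):
    """Componentwise sample mean of a list of vector realizations."""
    n = len(rows)
    return [sum(row[i] for row in rows) / n for i in range(len(rows[0]))]

def second_moment_matrix(rows):
    """Sample estimate of E{V V^T} from a list of realizations of the vector V."""
    n, dim = len(rows), len(rows[0])
    return [[sum(row[i] * row[j] for row in rows) / n for j in range(dim)]
            for i in range(dim)]

def quadratic_form(z, matrix):
    """z^T M z for a vector z and a square matrix M."""
    dim = len(z)
    return sum(z[i] * matrix[i][j] * z[j] for i in range(dim) for j in range(dim))

# Hypothetical realizations of a vector with n = 3 components
realizations = [[1.0, 2.0, 0.5], [2.0, 1.0, 1.5], [3.0, 4.0, 0.0], [4.0, 3.0, 2.0]]

mean = mean_vector(realizations)
centered = [[x - m for x, m in zip(row, mean)] for row in realizations]

R = second_moment_matrix(realizations)  # autocorrelation matrix estimate
C = second_moment_matrix(centered)      # autocovariance matrix estimate

test_vectors = [[1.0, -1.0, 0.5], [0.0, 0.0, 1.0], [-2.0, 1.0, 3.0]]
forms = [(quadratic_form(z, R), quadratic_form(z, C)) for z in test_vectors]
print(forms)
```

Each quadratic form is an average of squared projections of the realizations onto `z`, which is why it cannot be negative.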

Similarly, the cross-correlation matrix $\mathcal{R}_{XY}$ and cross-covariance matrix $\mathcal{C}_{XY}$ of random vectors $X$ and $Y$ are defined by, respectively,

(2.35) $\begin{aligned} \mathcal{R}_{XY} &= \mathcal{E}\{X Y^T\} \\ &= \begin{bmatrix} \mathcal{E}\{X_1 Y_1\} & \mathcal{E}\{X_1 Y_2\} & \ldots & \mathcal{E}\{X_1 Y_m\} \\ \mathcal{E}\{X_2 Y_1\} & \mathcal{E}\{X_2 Y_2\} & \ldots & \mathcal{E}\{X_2 Y_m\} \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{E}\{X_n Y_1\} & \mathcal{E}\{X_n Y_2\} & \ldots & \mathcal{E}\{X_n Y_m\} \end{bmatrix}, \end{aligned}$

(2.36) $\begin{aligned} \mathcal{C}_{XY} &= \mathcal{E}\{(X-\bar{X})(Y-\bar{Y})^T\} \\ &= \mathcal{E}\{X Y^T\} - \bar{X}\bar{Y}^T. \end{aligned}$

2.1.4 Conditional Probability: Bayes' Rule

We have already mentioned that random variables can be distributed jointly. They can also be distributed conditionally, when the probability of one variable depends on another already observed. To arrive at probabilistic relations associated with such variables, turn to [47] and consider two events $A$ and $B$ that have a joint probability $P(AB)$. We can now introduce the conditional probability $P(A|B)$ of the event $A$, assuming that $B$ is observed. The conditional probability can be defined as

(2.38) $P(A|B) = \frac{P(AB)}{P(B)},$

where the probability $P(B)$ of event $B$ is called marginal probability because it does not depend on other events. The conditional probability has the following key properties [47,91]:

$\begin{aligned} P(A|B) &\geqslant 0, \\ P(A|B) &= 1 \text{ if } B = A, \\ P(A|B) &\text{ is undefined if } P(B) = 0. \end{aligned}$

Because $P(A|B)$ incorporates some already known information about the relation between two variables, it is also called an a posteriori probability, while $P(A)$ is also called an a priori probability.

By applying (2.38) to a collection of events $A_1, \ldots, A_n$, we arrive at the important chain rule

(2.39) $P(A_1,\ldots,A_n) = P(A_1|A_2,\ldots,A_n)P(A_2,\ldots,A_n),$

which for three events gives

(2.40) $P(A_1,A_2,A_3) = P(A_1|A_2,A_3)P(A_2|A_3)P(A_3).$

Note that the rule (2.39) plays a fundamental role in the Bayes theory.

Bayes' Theorem

Let us consider a general case of multiple events [145]. For a finite set of mutually exclusive events $A_1, \ldots, A_n$ called a finite partition and another event $B$ such that the events $BA_1, \ldots, BA_n$ are mutually exclusive, we can refer to the total probability theorem and write

$P(B) = P(BA_1) + \cdots + P(BA_n).$

Using (2.38), we can further modify this relation as

(2.41) $P(B) = P(B|A_1)P(A_1) + \cdots + P(B|A_n)P(A_n),$

because

(2.42) $P(BA_i) = P(B|A_i)P(A_i).$

On the other hand, since events $A_i$ and $B$ are interchangeable, we can write $P(A_i|B)P(B) = P(B|A_i)P(A_i)$ and conclude that

(2.43) $P(A_i|B) = P(B|A_i)\frac{P(A_i)}{P(B)},$

(2.44) $\propto P(B|A_i)P(A_i),$

(2.45) $= cP(B|A_i)P(A_i),$

which is stated by Bayes' theorem. Note that Thomas Bayes was the first to use conditional probability and describe the probability of an event via prior knowledge of conditions related to the event.

The conditional probability $P(A_i|B)$ represented with (2.43) is the likelihood of event $A_i$ given that event $B$ is true. Likewise, $P(B|A_i)$ is the likelihood of event $B$ given that event $A_i$ is true. The probability $P(B)$ is often omitted in (2.43), which gives the proportionality (2.44), and the constant $c$ is introduced in (2.45) to normalize $P(A_i|B)$ so that it sums to unity over all $i$.

Inserting (2.41) into (2.43) gives

(2.46) $\begin{aligned} P(A_i|B) &= \frac{P(B|A_i)P(A_i)}{P(B|A_1)P(A_1) + \cdots + P(B|A_n)P(A_n)} \\ &= \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}, \end{aligned}$

which generalizes Bayes' rule.
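As a small numerical illustration of (2.41) and (2.46), the sketch below computes the posterior probabilities over a finite partition. The function name `posterior` and the prior and likelihood values are hypothetical and chosen only for this example.

```python
def posterior(priors, likelihoods):
    """Return ([P(A_i | B)], P(B)) for a finite partition A_1, ..., A_n.

    priors[i]      is P(A_i); the priors are assumed to sum to 1.
    likelihoods[i] is P(B | A_i).
    """
    # Joint probabilities P(B A_i) = P(B | A_i) P(A_i), cf. (2.42).
    joint = [l * p for l, p in zip(likelihoods, priors)]
    # Total probability P(B), cf. (2.41).
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("P(B) = 0, so P(A_i | B) is undefined")
    # Bayes' rule in the form (2.46).
    return [j / evidence for j in joint], evidence

# Hypothetical numbers: three mutually exclusive causes A_1, A_2, A_3.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.01, 0.02, 0.05]

post, p_b = posterior(priors, likelihoods)
print(p_b)   # total probability of B
print(post)  # posterior probabilities, proportional to 5 : 6 : 10
```

Here the least probable cause a priori, $A_3$, becomes the most probable a posteriori because its likelihood dominates, which is the practical content of (2.44).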

Conditional Probability Density

Considering two random variables $X$ and $Y$ instead of two events $A$ and $B$ and omitting the derivation, which can be found in [45],[145],[79], the relation (2.38) can be equivalently rewritten as

(2.47) $p_{X|Y}(x|y) = \frac{p_{XY}(x,y)}{p_Y(y)},$

where $p_{X|Y}(x|y)$ is the conditional pdf of $X$ given $Y$, $p_{XY}(x,y)$ is the joint pdf of $X$ and $Y$, and $p_Y(y)$ is called the marginal pdf of $Y$ defined by

(2.48) $p_Y(y) = \int_x p_{XY}(x,y)\,\mathrm{d}x,$

(2.49) $= \int_x p_{Y|X}(y|x)p_X(x)\,\mathrm{d}x = \mathcal{E}_X\{p_{Y|X}(y|x)\}.$

For multiple random variables collected as $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$, (2.47) can be written as

(2.50) $p_{X|Y}(x_1,\ldots,x_n|y_1,\ldots,y_m) = \frac{p_{XY}(x_1,\ldots,x_n,y_1,\ldots,y_m)}{p_Y(y_1,\ldots,y_m)}$

and the conditional cdf is found by integrating (2.50) over $x_1, \ldots, x_n$ as

(2.51) $F_{X|Y}(x_1,\ldots,x_n|y_1,\ldots,y_m) = \int_{-\infty}^{x_1}\ldots\int_{-\infty}^{x_n} \frac{p_{XY}(z_1,\ldots,z_n,y_1,\ldots,y_m)}{p_Y(y_1,\ldots,y_m)}\,\mathrm{d}z_n\ldots\mathrm{d}z_1.$

In the case of three random variables $X_1$, $X_2$, and $X_3$, we accordingly have

$\begin{aligned} p(x_1|x_2,x_3) &= \frac{p(x_1,x_2,x_3)}{p(x_2,x_3)}, \\ F(x_1|x_2,x_3) &= \int_{-\infty}^{x_1} \frac{p(z,x_2,x_3)}{p(x_2,x_3)}\,\mathrm{d}z. \end{aligned}$

It is worth noting that the rule (2.48) allows excluding any primary variable from (2.50), and one can use (2.49) to exclude any conditional variable from (2.50). For example, given $p(x_1,x_2|y_1,y_2)$, we can find

(2.52) $p(x_1|y_1,y_2) = \int_{-\infty}^{\infty} p(x_1,x_2|y_1,y_2)\,\mathrm{d}x_2,$

(2.53) $p(x_1,x_2|y_2) = \int_{-\infty}^{\infty} p(x_1,x_2|y_1,y_2)p(y_1|y_2)\,\mathrm{d}y_1.$

Most generally, for a set $X_1, \ldots, X_n$ one can write

(2.54) $p(x_1|x_2,\ldots,x_n) = \frac{p_X(x_1,\ldots,x_n)}{p(x_2,\ldots,x_n)}$

and, by the chain rule (2.39), obtain

(2.55) $\begin{aligned} p_X(x_1,x_2,\ldots,x_n) = {} & p(x_1|x_2,\ldots,x_n)p(x_2|x_3,\ldots,x_n) \\ & \ldots p(x_{n-1}|x_n)p(x_n). \end{aligned}$
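The chain rule can be verified on a small discrete example. The sketch below is only an illustration under stated assumptions: a probability mass function over three binary variables stands in for the pdf, sums replace the integrals of the marginalization rule (2.48), and the weights that define the joint distribution are made up.

```python
from itertools import product

# Hypothetical joint pmf p(x1, x2, x3) of three binary variables.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
states = list(product([0, 1], repeat=3))
total = sum(weights)
p123 = {s: w / total for s, w in zip(states, weights)}

def marginal(p, keep):
    """Sum the joint pmf over every variable whose index is not in `keep`."""
    out = {}
    for state, value in p.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + value
    return out

p23 = marginal(p123, (1, 2))  # p(x2, x3)
p3 = marginal(p123, (2,))     # p(x3)

def chain(x1, x2, x3):
    """Right-hand side of the chain rule for three variables."""
    p1_given_23 = p123[(x1, x2, x3)] / p23[(x2, x3)]
    p2_given_3 = p23[(x2, x3)] / p3[(x3,)]
    return p1_given_23 * p2_given_3 * p3[(x3,)]

# Largest deviation between the joint pmf and its chain-rule factorization.
max_err = max(abs(chain(*s) - p123[s]) for s in states)
print(max_err)
```

The deviation is at the level of floating‐point rounding, as expected, because each conditional factor is a ratio of marginals that cancels against the next one.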

2.1.5 Transformation of Random Variables

Sometimes a deterministic transformation of one random variable with known pdf into another with unknown or desired pdf is required. For example, the pdf of the system input variable is known, and the pdf of the output variable must be defined. It is also often required to transform two nonlinearly related random variables into two linearly related ones, or vice versa. An example can be found in tracking where measurements made in polar coordinates are to be represented in Cartesian coordinates.

Single‐to‐Single Variable Transformation

Let two random variables, $X$ and $Y$, be related to each other as follows:

(2.56) $Y = g(X),$

(2.57) $X = h(Y) = g^{-1}(Y),$

where $g$ and $h$ are some known smooth functions. Suppose we know the pdf $p_X(x)$ for $X$ and would like to know the pdf $p_Y(y)$ for $Y$. To find $p_Y(y)$, we can refer to the dependencies of $Y$ on $X$ (2.56) and $X$ on $Y$ (2.57) and assert that the probability that $X$ lies within the interval $[x, x+\mathrm{d}x]$ is equal to the probability that $Y$ lies within $[y, y+\mathrm{d}y]$. This can be formalized as

$\int_x^{x+\mathrm{d}x} p_X(u)\,\mathrm{d}u = \begin{cases} \int_y^{y+\mathrm{d}y} p_Y(u)\,\mathrm{d}u & \text{if } \mathrm{d}y > 0, \\ -\int_y^{y+\mathrm{d}y} p_Y(u)\,\mathrm{d}u & \text{if } \mathrm{d}y < 0 \end{cases}$

and further rewritten equivalently in the differential form as

$p_X(x)\,\mathrm{d}x = p_Y(y)|\mathrm{d}y|.$

Since we assume that $\mathrm{d}x > 0$, we arrive at the transformation rule

(2.58) $p_Y(y) = \frac{\mathrm{d}x}{|\mathrm{d}y|}p_X(x) = \left|\frac{\partial h(y)}{\partial y}\right| p_X[h(y)].$
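To make the rule (2.58) concrete, the following sketch assumes a hypothetical transformation $Y = e^X$ with a standard normal $X$, so that $h(y) = \ln y$ and $|\partial h/\partial y| = 1/y$. The choice of transformation, the helper `integrate`, and the grid size are assumptions of this example. The probability $P\{Y \leqslant 2\}$ is computed once by integrating the transformed pdf and once directly from the cdf of $X$.

```python
import math

def p_x(x):
    """Standard normal pdf, the assumed known pdf of X."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def h(y):
    """Inverse transformation x = h(y) for the assumed y = g(x) = exp(x)."""
    return math.log(y)

def dh_dy(y):
    """Derivative of the inverse transformation."""
    return 1.0 / y

def p_y(y):
    """Transformed pdf according to (2.58)."""
    return abs(dh_dy(y)) * p_x(h(y))

def integrate(f, a, b, n=200000):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + k * step) for k in range(1, n))
    return total * step

# P{Y <= 2} from the transformed pdf (the lower limit avoids log(0)).
prob_from_py = integrate(p_y, 1e-9, 2.0)
# The same probability from the cdf of X, since Y <= 2 means X <= ln 2.
prob_from_px = 0.5 * (1.0 + math.erf(math.log(2.0) / math.sqrt(2.0)))

print(prob_from_py, prob_from_px)
```

Both numbers coincide to within the integration error, which confirms that the factor $|\partial h/\partial y|$ is what keeps the probability of corresponding intervals equal.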

Transformation of Vector Random Variables

Consider two sets of random variables, $X_1, \ldots, X_n$ represented with pdf $p_X(x_1,\ldots,x_n)$ and $Y_1, \ldots, Y_n$ with $p_Y(y_1,\ldots,y_n)$. Suppose the variables $X_i$ and $Y_i$, $i \in [1,n]$, are related to each other as

$X_i = h_i(Y_1,\ldots,Y_n), \quad Y_i = g_i(X_1,\ldots,X_n).$

Then the pdf $p_Y$ of $Y$ can be defined via the pdf $p_X$ of $X$ as

(2.59) $p_Y(y_1,\ldots,y_n) = |J|\,p_X[h_1(y_1,\ldots,y_n),\ldots,h_n(y_1,\ldots,y_n)],$

where $J$ is the determinant of the Jacobian matrix of the transformation and $|J|$ is its absolute value,

(2.60) $J\left(\frac{x_1,\ldots,x_n}{y_1,\ldots,y_n}\right) = \begin{vmatrix} \frac{\partial h_1}{\partial y_1} & \ldots & \frac{\partial h_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial h_n}{\partial y_1} & \ldots & \frac{\partial h_n}{\partial y_n} \end{vmatrix}.$
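The tracking example mentioned at the beginning of this section can serve as an illustration of (2.59) and (2.60). The sketch below assumes two independent zero‐mean Gaussian Cartesian coordinates with a hypothetical standard deviation `SIGMA` and polar variables $y_1 = r$, $y_2 = \theta$, so that $h_1 = r\cos\theta$ and $h_2 = r\sin\theta$. The Jacobian determinant is approximated by central differences, which is a choice of this example, and the result is compared with the closed form $r\,e^{-r^2/2\sigma^2}/(2\pi\sigma^2)$.

```python
import math

SIGMA = 2.0  # hypothetical standard deviation of each Cartesian coordinate

def p_x(x1, x2):
    """Joint pdf of two independent zero-mean Gaussian coordinates."""
    return math.exp(-(x1 * x1 + x2 * x2) / (2.0 * SIGMA ** 2)) / (2.0 * math.pi * SIGMA ** 2)

def h(r, theta):
    """Inverse transformation (x1, x2) = h(r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r, theta, eps=1e-6):
    """Determinant (2.60) approximated by central differences."""
    x1_rp, x2_rp = h(r + eps, theta)
    x1_rm, x2_rm = h(r - eps, theta)
    x1_tp, x2_tp = h(r, theta + eps)
    x1_tm, x2_tm = h(r, theta - eps)
    dh1_dr = (x1_rp - x1_rm) / (2.0 * eps)
    dh2_dr = (x2_rp - x2_rm) / (2.0 * eps)
    dh1_dt = (x1_tp - x1_tm) / (2.0 * eps)
    dh2_dt = (x2_tp - x2_tm) / (2.0 * eps)
    return dh1_dr * dh2_dt - dh1_dt * dh2_dr

def p_y(r, theta):
    """Joint pdf of (r, theta) according to (2.59)."""
    return abs(jacobian_det(r, theta)) * p_x(*h(r, theta))

def p_y_closed_form(r):
    """Known closed form: Rayleigh in r, uniform in theta over 2*pi."""
    return r / (2.0 * math.pi * SIGMA ** 2) * math.exp(-r * r / (2.0 * SIGMA ** 2))

print(jacobian_det(1.5, 0.7))                 # close to r = 1.5
print(p_y(1.5, 0.7), p_y_closed_form(1.5))
```

The determinant equals $r$, and the transformed pdf does not depend on $\theta$, so the range is Rayleigh distributed and the angle is uniform.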

2.2 Stochastic Processes

Recall that a random variable corresponds to some measurement outcome . Since exists at some time as , the variable is also a time function . The family of time functions dependent on is called a stochastic process or a random process , where and are variables. As a collection of random variables, a stochastic process can be a scalar stochastic process, which is a set of random variables in some coordinate space. It can also be a vector stochastic process , which is a collection of random variables , , in some coordinate space.

The following forms of stochastic processes are distinguished:

A stochastic process represented in continuous time with continuous values is called a continuous stochastic process.
A discrete stochastic process is a process that is represented in continuous time with discrete values.
A stochastic process represented in discrete time with continuous values is called a stochastic sequence.
A discrete stochastic sequence is a process represented in discrete time with discrete values.

Using the concept of random variables, time‐varying cdf and pdf of a scalar stochastic process $X(t)$ can be represented as

(2.63) $F(x,t) = P\{X(t) \leqslant x\},$

(2.64) $p(x,t) = \frac{\partial F(x,t)}{\partial x} \;\Longleftrightarrow\; F(x,t) = \int_{-\infty}^{x} p(u,t)\,\mathrm{d}u.$

For a set of variables $x = \{x_1,\ldots,x_n\}$ corresponding to a set of time instances $t = \{t_1,\ldots,t_n\}$, we respectively have

(2.65) $F(x;t) = P\{X(t_1) \leqslant x_1,\ldots,X(t_n) \leqslant x_n\},$

(2.66) $p(x;t) = \frac{\partial^n F(x_1,\ldots,x_n;t_1,\ldots,t_n)}{\partial x_1 \ldots \partial x_n},$

(2.67) $F(x;t) = \int_{-\infty}^{x_1}\ldots\int_{-\infty}^{x_n} p(u_1,\ldots,u_n;t_1,\ldots,t_n)\,\mathrm{d}u_n\ldots\mathrm{d}u_1.$

A stochastic process can be either stationary or nonstationary. There are the following types of stationary random processes:

Strictly stationary, whose unconditional joint pdf does not change when shifted in time; that is, for all $\tau$, it obeys
(2.68) $p(x_1,\ldots,x_n;t_1-\tau,\ldots,t_n-\tau) = p(x_1,\ldots,x_n;t_1,\ldots,t_n).$
Wide‐sense stationary, whose mean does not vary with respect to time and whose autocorrelation function depends only on the time shift.
Ergodic, whose probabilistic properties deduced from a single random sample are the same as for the whole process.

It follows from these definitions that a strictly stationary random process is also a wide‐sense stationary random process, and an ergodic process is less common among other random processes. All other random processes are called nonstationary.

2.2.1 Correlation Function

The function that describes the statistical correlation between random variables in some processes is called the correlation function. If the random variables represent the same quantity measured at two different points, then the correlation function is called the autocorrelation function. The correlation function of various random variables is called the cross‐correlation function.

Autocorrelation Function

The autocorrelation function $\mathcal{R}_X(t_1,t_2)$ of a scalar random variable $X$ measured at different time points $t_1$ and $t_2$ is defined as

(2.69a) $\mathcal{R}_X(t_1,t_2) = \mathcal{E}\{X(t_1)X(t_2)\}$

(2.69b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1x_2\,p_X(x_1,x_2;t_1,t_2)\,\mathrm{d}x_1\,\mathrm{d}x_2,$

where $p_X(x_1,x_2;t_1,t_2)$ is a joint time‐varying pdf of $X(t_1)$ and $X(t_2)$.

For mean‐adjusted processes, the correlation function is called an autocovariance function and is defined as

(2.70a) $\mathcal{C}_X(t_1,t_2) = \mathcal{E}\{[X(t_1)-\bar{X}(t_1)][X(t_2)-\bar{X}(t_2)]\}$

(2.70b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [x_1-\bar{X}(t_1)][x_2-\bar{X}(t_2)]\,p_X(x_1,x_2;t_1,t_2)\,\mathrm{d}x_1\,\mathrm{d}x_2.$

Both $\mathcal{R}_X(t_1,t_2)$ and $\mathcal{C}_X(t_1,t_2)$ tell us how much a variable $X(t_1)$ is coupled with its shifted version $X(t_2)$.

There exists a simple relation between the autocorrelation and autocovariance functions assuming a complex random process $X(t)$,

(2.71a) $\mathcal{R}_X(t_1,t_2) = \mathcal{E}\{X(t_1)X^*(t_2)\}$

(2.71b) $= \mathcal{C}_X(t_1,t_2) + \bar{X}(t_1)\bar{X}^*(t_2),$

where $X^*(t)$ is a complex conjugate of $X(t)$.

If the random process is stationary, its autocorrelation function does not depend on time, but depends on the time shift $\tau = t_2 - t_1$. For such processes, $\mathcal{R}_X(t_1,t_2)$ is converted to

(2.72a) $\mathcal{R}_X(\tau) = \mathcal{E}\{X(t)X(t+\tau)\}$

(2.72b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1x_2\,p(x_1,x_2;\tau)\,\mathrm{d}x_1\,\mathrm{d}x_2.$

If the joint pdf is not explicitly known and the stochastic process is supposed to be ergodic, then $\mathcal{R}_X(\tau)$ can be computed by averaging the product of the shifted variables over time as

(2.73) $\mathcal{R}_X(\tau) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} X(t)X(t+\tau)\,\mathrm{d}t,$

and we notice that the rule (2.73) is common to experimental measurements of correlation. Similarly to (2.73), the covariance function (2.70b) can be measured for ergodic random processes.
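The time‐average estimate can be tried on a simulated record. The sketch below is a discrete‐time analogue under stated assumptions: a first‐order autoregressive sequence $x_k = a x_{k-1} + w_k$ with unit‐variance white Gaussian $w_k$ stands in for an ergodic process, the coefficient `A`, the record length, and the seed are made up, and the integral over time is replaced by a sum normalized by the number of products. For this model the autocorrelation is known to be $a^{|m|}/(1-a^2)$ at lag $m$.

```python
import math
import random

random.seed(1)
A = 0.8       # hypothetical AR(1) coefficient, |A| < 1 for stationarity
N = 100_000   # hypothetical record length

# Single sample path; the first sample is drawn from the stationary distribution.
x = [random.gauss(0.0, math.sqrt(1.0 / (1.0 - A * A)))]
for _ in range(N - 1):
    x.append(A * x[-1] + random.gauss(0.0, 1.0))

def time_avg_autocorr(samples, lag):
    """Average of x[k] * x[k + lag] along one record (time average)."""
    n = len(samples) - lag
    return sum(samples[k] * samples[k + lag] for k in range(n)) / n

def theory(lag):
    """Ensemble autocorrelation of the assumed AR(1) model."""
    return A ** abs(lag) / (1.0 - A * A)

estimates = {m: time_avg_autocorr(x, m) for m in (0, 1, 5)}
for m, value in estimates.items():
    print(m, value, theory(m))
```

The estimates obtained from one record approach the ensemble values as the record grows, which is what the ergodicity assumption provides in practice.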

Cross‐Correlation Function

The correlation between two different random processes $X(t)$ and $Y(t)$ is described by the cross‐correlation function. The relationship remains largely the same if we consider these variables at two different time instances and define the cross‐correlation function as

(2.74a) $\mathcal{R}_{XY}(t_1,t_2) = \mathcal{E}\{X(t_1)Y(t_2)\}$

(2.74b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1y_2\,p(x_1,y_2;t_1,t_2)\,\mathrm{d}x_1\,\mathrm{d}y_2,$

and the cross‐covariance function as

(2.75a) $\mathcal{C}_{XY}(t_1,t_2) = \mathcal{E}\{[X(t_1)-\bar{X}(t_1)][Y(t_2)-\bar{Y}(t_2)]\}$

(2.75b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [x_1-\bar{X}(t_1)][y_2-\bar{Y}(t_2)]\,p(x_1,y_2;t_1,t_2)\,\mathrm{d}x_1\,\mathrm{d}y_2.$

For stationary random processes, function (2.74a) can be computed as

(2.76a) $\mathcal{R}_{XY}(\tau) = \mathcal{E}\{X(t)Y(t+\tau)\}$

(2.76b) $= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1y_2\,p(x_1,y_2;\tau)\,\mathrm{d}x_1\,\mathrm{d}y_2$

(2.76c) $= \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} X(t)Y(t+\tau)\,\mathrm{d}t,$

where the time average (2.76c) applies to ergodic processes, and function (2.75a) is modified accordingly.
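A common use of the cross‐correlation function is locating the time shift between two records. The sketch below is an illustration only, with made‐up values: a white Gaussian sequence `x`, a copy `y` delayed by a hypothetical `DELAY` samples and corrupted by independent noise, and a discrete time average over one record in place of the expectation in (2.76a).

```python
import random

random.seed(2)
N = 5000    # hypothetical record length
DELAY = 7   # hypothetical delay of y relative to x, in samples

# x is white Gaussian noise; y is x delayed by DELAY samples plus measurement noise.
x = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [(x[k - DELAY] if k >= DELAY else 0.0) + 0.5 * random.gauss(0.0, 1.0)
     for k in range(N)]

def cross_corr(a, b, lag):
    """Time-average estimate of E{a[k] * b[k + lag]} for a nonnegative lag."""
    n = len(a) - lag
    return sum(a[k] * b[k + lag] for k in range(n)) / n

lags = range(0, 21)
r_xy = [cross_corr(x, y, m) for m in lags]

# The lag at which the estimate peaks is taken as the delay estimate.
best_lag = max(lags, key=lambda m: r_xy[m])
print(best_lag, r_xy[best_lag])
```

The estimate is close to the variance of `x` at the true delay and close to zero elsewhere, so the position of the peak recovers the shift between the two records.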

Properties of Correlation Function

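The following key properties of the autocorrelation and cross‐correlation functions of two random processes $X(t)$ and $Y(t)$ are highlighted:

The autocorrelation function is non‐negative at equal time points, $\mathcal{R}_X(t,t) \geqslant 0$, and has the property $\mathcal{R}_X(t_1,t_2) = \mathcal{R}_X^*(t_2,t_1)$.
The cross‐correlation function is a Hermitian function, $\mathcal{R}_{XY}(t_1,t_2) = \mathcal{R}_{YX}^*(t_2,t_1)$.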
The following Cauchy‐Schwarz inequality holds,
(2.77) $|\mathcal{R}_{XY}(t_1,t_2)|^2 \leqslant \mathcal{E}\{|X(t_1)|^2\}\mathcal{E}\{|Y(t_2)|^2\} = \sigma_X^2(t_1)\sigma_Y^2(t_2),$
where the last equality holds for zero‐mean processes.
For a stationary random process $X(t)$, the following properties apply:
Symmetry: $\mathcal{R}_X(\tau) = \mathcal{R}_X(-\tau),$

Physical limit: $\lim_{\tau\to\infty} \mathcal{R}_X(\tau) = 0,$

Fourier transform: $\mathcal{F}\{\mathcal{R}_X(\tau)\} \geqslant 0,$ if it exists.

The latter property requires the study of a random process in the frequency domain, which we will do next.

2.2.2 Power Spectral Density

Spectral analysis of random processes in the frequency domain plays the same role as correlation analysis in the time domain. As in correlation analysis, two functions are recognized in the frequency domain: power spectral density (PSD) of a random process $X(t)$ and cross power spectral density (cross‐PSD) of two random processes $X(t)$ and $Y(t)$.

Power Spectral Density

The PSD of a random process is determined by the Fourier transform of its autocorrelation function. Because the Fourier transform requires a process that exists over all time, spectral analysis is applied to stationary processes that satisfy the Dirichlet conditions [165].

Dirichlet conditions: Any real‐valued periodic function can be expanded into the Fourier series if, over a period, the function 1) is absolutely integrable, 2) is finite, and 3) has a finite number of discontinuities.

The Wiener‐Khinchin theorem [145] states that the autocorrelation function $\mathcal{R}_X(\tau)$ of a wide‐sense stationary random process $X(t)$ is related to its PSD $\mathcal{S}_X(\omega)$ by the Fourier transform pair as

(2.78) $\mathcal{S}_X(\omega) = \mathcal{F}\{\mathcal{R}_X(\tau)\} = \int_{-\infty}^{\infty} \mathcal{R}_X(\tau)e^{-j\omega\tau}\,\mathrm{d}\tau,$

(2.79) $\mathcal{R}_X(\tau) = \mathcal{F}^{-1}\{\mathcal{S}_X(\omega)\} = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathcal{S}_X(\omega)e^{j\omega\tau}\,\mathrm{d}\omega,$

and has the following fundamental properties:

Since $\mathcal{R}_X(\tau)$ is conjugate symmetric, $\mathcal{R}_X(\tau) = \mathcal{R}_X^*(-\tau)$, it follows that $\mathcal{S}_X(\omega)$ is a real function of $\omega$ for a real or complex stochastic process.
If $X(t)$ is a real process, then $\mathcal{R}_X(\tau)$ is real and even, and $\mathcal{S}_X(\omega)$ is also real and even: $\mathcal{S}_X(\omega) = \mathcal{S}_X(-\omega)$. Otherwise, $\mathcal{S}_X(\omega)$ is not even.
The PSD of a stationary process $X(t)$ is nonnegative, $\mathcal{S}_X(\omega) \geqslant 0$.
The variance of a zero‐mean scalar $X(t)$ is provided by
(2.80) $\sigma_X^2 = \mathcal{R}_X(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathcal{S}_X(\omega)\,\mathrm{d}\omega,$
Given an LTI system with a frequency response $\mathcal{H}(j\omega)$, the PSD $\mathcal{S}_X(\omega)$ of an input process $X(t)$ projects to the PSD $\mathcal{S}_Y(\omega)$ of an output process $Y(t)$ as
(2.81) $\mathcal{S}_Y(\omega) = |\mathcal{H}(j\omega)|^2\mathcal{S}_X(\omega).$
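The last two properties can be combined in a short numerical check. The sketch below assumes a hypothetical first‐order low‐pass system $\mathcal{H}(j\omega) = 1/(1 + j\omega T_c)$ driven by a process with a flat PSD of level $N_0$; the values of `N0` and `T_C`, the integration limits, and the grid are choices of this example. The output variance obtained by integrating (2.81) as in (2.80) is compared with the closed form $N_0/(2T_c)$, which follows from $\int_{-\infty}^{\infty} \mathrm{d}\omega/(1+\omega^2T_c^2) = \pi/T_c$.

```python
import math

N0 = 2.0    # hypothetical level of the flat input PSD
T_C = 0.5   # hypothetical time constant of the low-pass system, in seconds

def s_x(omega):
    """Flat (white) input PSD."""
    return N0

def freq_response(omega):
    """Frequency response of the assumed first-order low-pass system."""
    return 1.0 / complex(1.0, omega * T_C)

def s_y(omega):
    """Output PSD according to (2.81)."""
    return abs(freq_response(omega)) ** 2 * s_x(omega)

def variance_from_psd(psd, w_max=1.0e4, n=400_000):
    """Approximate (2.80) by the trapezoidal rule on [-w_max, w_max]."""
    step = 2.0 * w_max / n
    total = 0.5 * (psd(-w_max) + psd(w_max))
    total += sum(psd(-w_max + k * step) for k in range(1, n))
    return total * step / (2.0 * math.pi)

var_numeric = variance_from_psd(s_y)
var_exact = N0 / (2.0 * T_C)
print(var_numeric, var_exact)
```

The small remaining difference comes from truncating the integration range. Note that the white input itself has no finite variance; only the filtered output does.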

Cross Power Spectral Density

Like the PSD, the cross‐PSD of two stationary random processes $X(t)$ and $Y(t)$ is defined by the Fourier transform pair as

(2.82) $\mathcal{S}_{XY}(j\omega) = \int_{-\infty}^{\infty} \mathcal{R}_{XY}(\tau)e^{-j\omega\tau}\,\mathrm{d}\tau,$

(2.83) $\mathcal{R}_{XY}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathcal{S}_{XY}(j\omega)e^{j\omega\tau}\,\mathrm{d}\omega,$

and we notice that the cross‐PSD usually has complex values and differs from the PSD in the following properties:

Since $\mathcal{R}_{XY}(\tau)$ and $\mathcal{R}_{YX}(\tau)$ are not necessarily even functions of $\tau$, it follows that $\mathcal{S}_{XY}(j\omega)$ and $\mathcal{S}_{YX}(j\omega)$ are not necessarily real functions.
Due to the Hermitian property $\mathcal{R}_{XY}(\tau) = \mathcal{R}_{YX}^{*}(-\tau)$, functions $\mathcal{S}_{XY}(j\omega)$ and $\mathcal{S}_{YX}(j\omega)$ are complex conjugates of each other, $\mathcal{S}_{XY}(j\omega) = \mathcal{S}_{YX}^{*}(j\omega)$, and the sum of $\mathcal{S}_{XY}(j\omega)$ and $\mathcal{S}_{YX}(j\omega)$ is real.

If $Z(t) = X(t) + Y(t)$ is the sum of two stationary random processes $X(t)$ and $Y(t)$, then the autocorrelation function $\mathcal{R}_Z(\tau)$ can be found as

$\begin{aligned} \mathcal{R}_Z(\tau) &= E\{[X(t) + Y(t)][X(t+\tau) + Y(t+\tau)]\} \\ &= \mathcal{R}_X(\tau) + \mathcal{R}_{XY}(\tau) + \mathcal{R}_{YX}(\tau) + \mathcal{R}_Y(\tau) . \end{aligned}$

Therefore, the PSD of $Z(t)$ is generally given by

$\mathcal{S}_Z(j\omega) = \mathcal{S}_X(\omega) + \mathcal{S}_{XY}(j\omega) + \mathcal{S}_{YX}(j\omega) + \mathcal{S}_Y(\omega) .$

A useful normalized measure of the cross‐PSD of two stationary processes $X(t)$ and $Y(t)$ is the coherence defined as

(2.84) $\gamma_{XY}^2 = \frac{|\mathcal{S}_{XY}(j\omega)|^2}{\mathcal{S}_X(\omega)\, \mathcal{S}_Y(\omega)} ,$

which plays the role of the correlation coefficient (2.30) in the frequency domain. Maximum coherence is achieved when two processes are equal, in which case $\mathcal{S}_{XY} = \mathcal{S}_X = \mathcal{S}_Y$ and therefore $\gamma_{XY}^2 = 1$. On the other extreme, when two processes are uncorrelated, we have $\gamma_{XY}^2 = 0$, and hence the coherence varies in the interval $0 \leqslant \gamma_{XY}^2 \leqslant 1$.
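
The two extremes of the coherence can be illustrated at a single frequency. The sketch below uses the magnitude‐squared form $|\mathcal{S}_{XY}|^2/(\mathcal{S}_X \mathcal{S}_Y)$, for which equal processes give exactly one. The intermediate case relies on an assumed model $Y = hX + N$ with noise $N$ uncorrelated with $X$. The function names and numeric values are hypothetical.

```python
def coherence(s_xy, s_x, s_y):
    """Magnitude-squared coherence |S_XY|^2 / (S_X * S_Y) at one frequency."""
    if s_x <= 0.0 or s_y <= 0.0:
        raise ValueError("auto-PSD values must be positive")
    return abs(s_xy) ** 2 / (s_x * s_y)


def coherence_linear_model(h, s_x, s_n):
    """Coherence for the assumed model Y = h * X + N, N uncorrelated with X.

    The cross-PSD equals h * S_X up to conjugation (which does not change
    its magnitude), and S_Y = |h|^2 * S_X + S_N.
    """
    s_xy = h * s_x
    s_y = abs(h) ** 2 * s_x + s_n
    return coherence(s_xy, s_x, s_y)


if __name__ == "__main__":
    print(coherence(4.0, 4.0, 4.0))                       # equal processes
    print(coherence(0.0, 4.0, 2.0))                       # uncorrelated processes
    print(coherence_linear_model(0.5 - 0.5j, 4.0, 2.0))   # partially coherent
```

In the assumed linear model the coherence equals $|h|^2 \mathcal{S}_X / (|h|^2 \mathcal{S}_X + \mathcal{S}_N)$, so it decreases from one toward zero as the uncorrelated noise grows.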

2.2.3 Gaussian Processes

As a collection of Gaussian variables, a Gaussian process is a stochastic process whose variables have a multivariate normal distribution. Because every finite linear combination of Gaussian variables is normally distributed, the Gaussian process plays an important role in modeling and state estimation as a useful and relatively simple mathematical idealization. It also helps in solving applied problems, since many physical processes after passing through narrowband paths acquire the property of Gaussianity. Some nonlinear problems can also be solved using the Gaussian approach [187].

Suppose that a Gaussian process is represented with a vector $X = [X_1 \ldots X_n]^T$ of Gaussian random variables corresponding to a set of time instances $t = [t_1 \ldots t_n]^T$ and that each variable is normally distributed with (2.37a). The pdf of this process is given by

(2.85) $p_X(x; t) = \frac{1}{\sqrt{(2\pi)^n |\mathcal{C}_X|}} \exp\left[ -\frac{1}{2} (x - \bar{X})^T \mathcal{C}_X^{-1} (x - \bar{X}) \right] ,$

where $x = [x_1 \ldots x_n]^T$, $\bar{X} = E\{X\}$, and the covariance $\mathcal{C}_X$ (2.34) is generally time‐varying. The standard notation of a Gaussian process is

$X \sim \mathcal{N}(\bar{X}, \mathcal{C}_X) .$

If the Gaussian process is a collection of uncorrelated random variables and, therefore, $E\{(X_i - \bar{X}_i)(X_j - \bar{X}_j)\} = 0$ holds for $i \neq j$ and $\mathcal{C}_X$ is diagonal, then pdf (2.85) becomes a product of the densities of the individual variables,

(2.86) $\begin{aligned} p_X(x; t) &= p_{X_1}(x_1; t_1) \ldots p_{X_n}(x_n; t_n) \\ &= \frac{1}{\sqrt{(2\pi)^n \sigma_{X_1}^2 \ldots \sigma_{X_n}^2}} \exp\left[ -\sum_{i=1}^{n} \frac{(x_i - \bar{X}_i)^2}{2\sigma_{X_i}^2} \right] . \end{aligned}$

The log‐likelihood corresponding to (2.85) is

(2.87) $\ln \mathcal{L}(X | x; t) = -\frac{1}{2} \left[ \ln |\mathcal{C}_X| + (x - \bar{X})^T \mathcal{C}_X^{-1} (x - \bar{X}) + n \ln(2\pi) \right]$

and information entropy representing the average rate at which information is produced by a Gaussian process (2.85) is given by [2]

(2.88) $\begin{aligned} H[p_X(x; t)] &= -\int_{-\infty}^{\infty} \ln[p_X(x; t)]\, p_X(x; t)\, \mathrm{d}x \\ &= \frac{n}{2} + \frac{n}{2} \ln(2\pi) + \frac{1}{2} \ln |\mathcal{C}_X| . \end{aligned}$
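
The log‐likelihood (2.87) and the entropy (2.88) are easy to evaluate for a small case. The following is a minimal sketch for $n = 2$, using the closed‐form determinant and inverse of a $2 \times 2$ covariance. The function names and test values are illustrative assumptions.

```python
import math


def log_likelihood_2d(x, mean, cov):
    """Log-likelihood (2.87) for n = 2 with cov given as [[a, b], [c, d]]."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    e0, e1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form e^T C^{-1} e using the closed-form 2x2 inverse.
    quad = (d * e0 * e0 - (b + c) * e0 * e1 + a * e1 * e1) / det
    n = 2
    return -0.5 * (math.log(det) + quad + n * math.log(2.0 * math.pi))


def entropy_2d(cov):
    """Entropy (2.88) for n = 2."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    n = 2
    return 0.5 * n + 0.5 * n * math.log(2.0 * math.pi) + 0.5 * math.log(det)


def log_pdf_1d(x, mean, var):
    """Logarithm of the scalar normal density."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


if __name__ == "__main__":
    diag_cov = [[4.0, 0.0], [0.0, 9.0]]
    joint = log_likelihood_2d((1.0, 2.0), (0.0, 0.0), diag_cov)
    separate = log_pdf_1d(1.0, 0.0, 4.0) + log_pdf_1d(2.0, 0.0, 9.0)
    print(joint, separate)                         # equal for a diagonal covariance
    print(entropy_2d([[1.0, 0.0], [0.0, 1.0]]))    # 1 + ln(2*pi)
```

For a diagonal covariance the joint log‐likelihood equals the sum of the scalar log‐densities, which is the logarithmic form of the factorization (2.86).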

Properties of Gaussian Processes

As a mathematical idealization of real physical processes, the Gaussian process exhibits several important properties that facilitate process analysis and state estimation. Researchers often choose to approximate data histograms with the normal law and use standard linear estimators, even if Gaussianity is clearly not observed. This is acceptable as long as the resulting errors are small; otherwise, a more accurate approximation is required. In general, each random process requires an individual optimal estimator, unless its histogram can be approximated by the normal law so that standard solutions apply.

The following properties of the Gaussian process are recognized:

The Gaussian process is completely determined by the mean $\bar{X}$ and the covariance matrix $\mathcal{C}_X$, which is diagonal for uncorrelated Gaussian variables.
If jointly Gaussian variables are uncorrelated, then they are also independent, and if they are independent, then they are uncorrelated.
The definitions of stationarity in the strict and wide sense are equivalent for Gaussian processes.
The conditional pdf of the jointly Gaussian stochastic processes $X(t)$ and $Y(t)$ is also Gaussian. This follows from Bayes' rule (2.47), according to which
$p_{X|Y}(x | y; t) = \frac{p_{XY}(x, y; t)}{p_Y(y; t)} .$
Linear transformation of a Gaussian process gives a Gaussian process; that is, an input Gaussian process that goes through a linear system remains a Gaussian process at the output.
A linear operator can be found to convert a correlated Gaussian process to an uncorrelated Gaussian process and vice versa.
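
The last property can be made concrete with a Cholesky factor. The sketch below assumes a $2 \times 2$ covariance $\mathcal{C}_X = LL^T$ and uses the fact that $Y = AX$ has covariance $A \mathcal{C}_X A^T$: the operator $A = L^{-1}$ decorrelates the variables, and $A = L$ restores the correlation. The Cholesky factor is only one possible choice of operator, and all function names and numeric values here are hypothetical.

```python
import math


def cholesky_2x2(cov):
    """Lower-triangular L with L L^T = cov for a symmetric positive-definite 2x2 matrix."""
    (a, b), (_, d) = cov
    l00 = math.sqrt(a)
    l10 = b / l00
    l11 = math.sqrt(d - l10 * l10)
    return [[l00, 0.0], [l10, l11]]


def inverse_lower_2x2(low):
    """Inverse of a lower-triangular 2x2 matrix."""
    (l00, _), (l10, l11) = low
    return [[1.0 / l00, 0.0], [-l10 / (l00 * l11), 1.0 / l11]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]


def transform_covariance(a, cov):
    """Covariance of Y = A X, namely A C_X A^T."""
    return matmul(matmul(a, cov), transpose(a))


if __name__ == "__main__":
    cov_x = [[4.0, 2.0], [2.0, 3.0]]          # correlated Gaussian variables
    low = cholesky_2x2(cov_x)
    whitener = inverse_lower_2x2(low)
    print(transform_covariance(whitener, cov_x))                   # close to identity
    print(transform_covariance(low, [[1.0, 0.0], [0.0, 1.0]]))     # close to cov_x
```

Because the transformation is linear, the output remains Gaussian, so the decorrelated variables are also independent.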

2.2.4 White Gaussian Noise

White Gaussian noise (WGN) occupies a special place among many other mathematical models of physical perturbations. As a stationary random process, WGN has the same intensity at all frequencies, and its PSD is thus constant in the frequency domain. Since the spectral intensity of any physical quantity decreases to zero with increasing frequency, it is said that WGN is an absolutely random process and as such does not exist in real life.

Continuous White Gaussian Noise

The WGN is the most widely used form of white processes. Since noise is usually associated with zero mean, $E\{w(t)\} = 0$, WGN is often referred to as additive WGN (AWGN), which means that it can be added to any signal without introducing a bias. Since the PSD of the scalar $w(t)$ is constant, its autocorrelation function is delta‐shaped and commonly written as

(2.89) $\mathcal{C}_w(\tau) = \mathcal{R}_w(\tau) = \frac{N_0}{2} \delta(\tau) ,$

where $\delta(\tau)$ is the Dirac delta [165] and $N_0$ is some constant. It follows from (2.89) that the variance of WGN is infinite,

$\sigma_w^2 = E\{w^2(t)\} = \mathcal{R}_w(0) = \frac{N_0}{2} \delta(0) = \infty ,$

and the Fourier transform (2.78) applied to (2.89) gives

(2.90) $\mathcal{S}_w(\omega) = \int_{-\infty}^{\infty} \frac{N_0}{2} \delta(\tau)\, e^{-j\omega\tau}\, \mathrm{d}\tau = \frac{N_0}{2} ,$

which means that the constant value $N_0/2$ in (2.89) has the meaning of a double‐sided PSD $\mathcal{S}_w$ of WGN, and $N_0$ is thus a one‐sided PSD. Due to this property, the conditional pdf $p_X(x_k | x_l)$, $k \neq l$, of white noise is marginal,

$p_X(x_k | x_l) = p_X(x_k) .$

For zero mean vector WGN $w(t) = [w_1(t) \ldots w_n(t)]^T$, the covariance is defined by

(2.91) $\begin{aligned} \mathcal{C}_w(\tau) &= \mathcal{R}_w(\tau) = E\{w(t)\, w^T(t+\tau)\} \\ &= \mathcal{S}_w \delta(\tau) = \frac{1}{2} \begin{bmatrix} N_{11} & \ldots & N_{1n} \\ \vdots & \ddots & \vdots \\ N_{n1} & \ldots & N_{nn} \end{bmatrix} \delta(\tau) , \end{aligned}$

where $\mathcal{S}_w$ is the PSD matrix of WGN, whose component $N_{ij}/2$, $\{i, j\} \in [1, n]$, is the PSD or cross‐PSD of $w_i(t)$ and $w_j(t)$.

Discrete White Gaussian Noise

In discrete time index $k$, WGN is a discrete signal $w_k$, the samples of which are a sequence of uncorrelated random variables. Discrete WGN is defined as the average of the original continuous WGN as

(2.92) $w_k = \frac{1}{\tau} \int_{t_k - \tau}^{t_k} w(t)\, \mathrm{d}t ,$

where $\tau = t_k - t_{k-1}$ is a proper time step.

Taking the expectation on both sides of (2.92) gives zero,

$E\{w_k\} = \frac{1}{\tau} \int_{t_k - \tau}^{t_k} E\{w(t)\}\, \mathrm{d}t = 0 ,$

and the variance of zero mean $w_k$ can be found as

(2.93) $\begin{aligned} \sigma_{w_k}^2 &= E\{w_k^2\} \\ &= \frac{1}{\tau^2} \int_{t_k - \tau}^{t_k} \int_{t_k - \tau}^{t_k} E\{w(\theta_1)\, w(\theta_2)\}\, \mathrm{d}\theta_1\, \mathrm{d}\theta_2 \\ &= \frac{N_0}{2\tau^2} \int_{t_k - \tau}^{t_k} \int_{t_k - \tau}^{t_k} \delta(\theta_1 - \theta_2)\, \mathrm{d}\theta_1\, \mathrm{d}\theta_2 \\ &= \frac{N_0}{2\tau^2} \int_{t_k - \tau}^{t_k} \mathrm{d}\theta_2 = \frac{N_0}{2\tau} = \frac{1}{\tau} \mathcal{S}_w , \end{aligned}$

where $\mathcal{S}_w$ is the PSD (2.90) of WGN.

Accordingly, the pdf of a discrete WGN can be written as

$p_{w_k}(x) = \sqrt{\frac{\tau}{\pi N_0}} \exp\left( -\frac{\tau x^2}{N_0} \right)$

and denoted as $w_k \sim \mathcal{N}(0, \frac{N_0}{2\tau})$. Since a continuous WGN can also be denoted as $w(t) \sim \mathcal{N}(0, \frac{N_0}{2})$, the notations become equivalent when $\tau = 1$.

For a discrete zero mean vector WGN $w_k = [w_{1k} \ldots w_{nk}]^T$, whose components are given by (2.92) to have variance (2.93), the covariance $R_w$ of $w_k$ is defined following (2.93) as

(2.94) $R_w = E\{w_k w_k^T\} = \frac{1}{\tau} \mathcal{S}_w = \frac{1}{2\tau} \begin{bmatrix} N_{11} & \ldots & N_{1n} \\ \vdots & \ddots & \vdots \\ N_{n1} & \ldots & N_{nn} \end{bmatrix} ,$

where $\mathcal{S}_w$ is the PSD matrix specified by (2.91). It follows from (2.91) and (2.94) that the covariance $R_w$ of the discrete WGN evolves to the covariance $\mathcal{R}_w(\tau)$ of the continuous WGN when $\tau \to 0$ as

$\lim_{\tau \to 0} R_w = \lim_{\tau \to 0} \frac{1}{\tau} \mathcal{S}_w \to \mathcal{S}_w \delta(\tau) = \mathcal{C}_w(\tau) = \mathcal{R}_w(\tau) .$

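Because $\mathcal{R}_w(\tau)$ is defined for any $\tau$, and $R_w$ is defined at $\tau \neq 0$, it follows that there is no direct connection between $R_w$ and $\mathcal{R}_w(\tau)$. Instead, $\mathcal{R}_w(\tau)$ can be viewed as a limiting case of $R_w$ when $\tau \to 0$.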
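
The scaling $\sigma_{w_k}^2 = N_0/2\tau$ in (2.93) can be checked by simulation. The sketch below is a rough illustration, not an exact construction: it assumes that continuous WGN can be approximated on a fine grid $\mathrm{d}t = \tau/m$ by independent $\mathcal{N}(0, N_0/2\mathrm{d}t)$ values, and then averages $m$ of them per step as in (2.92). The function names, the seed, and the parameter values are illustrative.

```python
import math
import random


def averaged_wgn(n0, tau, m, count, seed=0):
    """Draw `count` samples of w_k by averaging m fine-grid noise values per step."""
    rng = random.Random(seed)
    dt = tau / m
    fine_std = math.sqrt(n0 / (2.0 * dt))   # assumed fine-grid model of w(t)
    samples = []
    for _ in range(count):
        # (1/tau) * integral of w(t) dt ~ (1/tau) * sum(w_i * dt) = mean of m values
        samples.append(sum(rng.gauss(0.0, fine_std) for _ in range(m)) / m)
    return samples


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    n0, tau = 2.0, 0.01
    w = averaged_wgn(n0, tau, m=10, count=20_000)
    print("sample variance:", sample_variance(w), "N0 / (2 tau):", n0 / (2.0 * tau))
```

Halving `tau` in this sketch doubles the variance of $w_k$, which is the discrete counterpart of the infinite variance of continuous WGN.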

2.2.5 Markov Processes

The Markov (or Markovian) process, which is also called continuous‐time Markov chain or continuous random walks, is another idealization of real physical stochastic processes. Unlike white noise, the values of which do not correlate with each other at any two different time instances, correlation in the Markov process can be observed only between two nearest neighbors.

A stochastic process $X(t)$ is called Markov if its random variable $X(t)$, given $X(u)$ at $u < t$, does not depend on $X(s)$ since $s < u$. Thus, the following is an exhaustive property of all types of Markov processes.

Markov process: A random process $X(t)$ is Markovian if on any finite time interval of time instances $t_1 < t_2 < \cdots < t_n$ the conditional probability of a variable $X(t_n)$ given $X(t_1), X(t_2), \ldots, X(t_{n-1})$ depends solely on $X(t_{n-1})$; that is,

(2.95) $\begin{aligned} P\{X(t_n) &< x_n \mid X(t_1) = x_1, \ldots, X(t_{n-1}) = x_{n-1}\} \\ &= P\{X(t_n) < x_n \mid X(t_{n-1}) = x_{n-1}\} . \end{aligned}$

The following probabilistic statement can be made about the future behavior of a Markov process: if the present state of a Markov process at $t_k$ is known explicitly, then the future state at $t_i$, $i > k$, can be predicted without reference to any past state at $t_j$, $j < k$.

It follows from the previous definition that the multivariate pdf of a Markov process can be written as

(2.96) $\begin{aligned} p_X(x_1, \ldots, x_n) &= p_{X_1}(x_1)\, p_{X_2}(x_2 | x_1) \ldots p_{X_n}(x_n | x_{n-1}) \\ &= p_{X_1}(x_1) \prod_{i=1}^{n-1} p_{X_{i+1}}(x_{i+1} | x_i) , \end{aligned}$

which is the chain rule (2.55) for Markov processes. In particular, for two Markovian variables one has

$p_{X_1, X_2}(x_1, x_2) = p_{X_1}(x_1)\, p_{X_2}(x_2 | x_1) .$

An important example of Markov processes is the Wiener process, known in physics as Brownian motion.

When a Markov process is represented by a set of values at discrete time instances, it is called a discrete‐time Markov chain or simply Markov chain. For a finite set of discrete variables $\{X_1, \ldots, X_n\}$, specified at discrete time indexes $\{1, \ldots, n\}$, the conditional probability (2.95) becomes

(2.98) $\begin{aligned} P\{X_n &< x_n \mid X_1 = x_1, \ldots, X_{n-1} = x_{n-1}\} \\ &= P\{X_n < x_n \mid X_{n-1} = x_{n-1}\} . \end{aligned}$

Thus, the Markov chain is a stochastic sequence of possible events that obeys (2.98). A notable example of a Markov chain is the Poisson process [156,191].

Property (2.98) can be written in pdf format as

(2.99) $p(x_n | x_1, \ldots, x_{n-1}) = p(x_n | x_{n-1}) .$

Since (2.99) defines the distribution of $X_n$ via $X_{n-1}$, the conditional pdf $p(x_i | x_j)$ for $i > j$ is called the transitional pdf of the Markov process.

Example 2.8 Gauss‐Markov sequence. Given (2.97), the WGN $w(t)$ can also be defined as

$w(t) = \frac{\mathrm{d}}{\mathrm{d}t} W(t) \cong \frac{W(t) - W(t - \tau)}{\tau} ,$

which leads to a first‐order difference equation representing the Gauss‐Markov sequence [79]

(2.100) $W_k = W_{k-1} + \tau w_k , \quad k = 0, 1, \ldots ,$

where $w_k$ is a white Gaussian sequence of variables, and the initial condition is Gaussian and does not depend on $w_k$.

The sequence (2.100) is clearly Markov, since $W_k$ depends only on $W_{k-1}$ provided $w_k$. Therefore, it is also called the Gauss‐Markov colored process noise [185]. The transitional probability for this sequence can be written as

$p(x_k | x_{k-1}) = p(x_k - x_{k-1}) .$

With a zero initial condition $W_0 = 0$, the process noise (2.100) can be written in batch form as $W_k = \tau \sum_{i=1}^{k} w_i$ to have zero mean, $E\{W_k\} = 0$, and the variance

$\sigma_W^2 = E\{W^2(t)\} = \tau^2 \sum_{i=1}^{k} E\{w_i^2\} = \tau k \frac{N_0}{2} = \tau k\, \mathcal{S}_w ,$

where $E\{w_i^2\} = N_0/2\tau$ is defined by (2.93).

Note that if the noise in (2.100) is not scaled by $\tau$, as in the standard Gauss‐Markov sequence [79], then the variance becomes $\sigma_W^2 = k N_0/2\tau$.
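
The linear growth of the variance, $\sigma_W^2 = \tau k N_0/2$, can be reproduced by running the recursion (2.100) over many independent paths. This is a minimal sketch under the assumptions of a zero initial condition and $w_k \sim \mathcal{N}(0, N_0/2\tau)$ as in (2.93). The function names, the seed, and the parameter values are hypothetical.

```python
import math
import random


def simulate_final_values(n0, tau, k, paths, seed=1):
    """Run W_k = W_{k-1} + tau * w_k for k steps on `paths` independent paths."""
    rng = random.Random(seed)
    w_std = math.sqrt(n0 / (2.0 * tau))   # discrete WGN standard deviation from (2.93)
    finals = []
    for _ in range(paths):
        state = 0.0                        # zero initial condition
        for _ in range(k):
            state += tau * rng.gauss(0.0, w_std)
        finals.append(state)
    return finals


def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    n0, tau, k = 2.0, 0.1, 50
    finals = simulate_final_values(n0, tau, k, paths=5_000)
    print("sample variance:", sample_variance(finals))
    print("tau * k * N0 / 2:", tau * k * n0 / 2.0)
```

Since $\tau k$ is the elapsed time, the variance grows linearly in time, which is the characteristic behavior of the Wiener process sampled at discrete instances.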

The transitional probability can be obtained for any two random variables $X_k$ and $X_m$ of the Markov process [79]. To show this, one can start with the rule (2.52) by rewriting it as

(2.101) $p(x_k | x_{k-2}) = \int_{-\infty}^{\infty} p(x_k, x_{k-1} | x_{k-2})\, \mathrm{d}x_{k-1} .$

Now, according to the chain rule (2.96) and the Markov property (2.99), the integrand can be rewritten as

$\begin{aligned} p(x_k, x_{k-1} | x_{k-2}) &= p(x_k | x_{k-1}, x_{k-2})\, p(x_{k-1} | x_{k-2}) \\ &= p(x_k | x_{k-1})\, p(x_{k-1} | x_{k-2}) \end{aligned}$

and (2.101) is transformed into the Chapman‐Kolmogorov equation

(2.102) $p(x_k | x_{k-2}) = \int_{-\infty}^{\infty} p(x_k | x_{k-1})\, p(x_{k-1} | x_{k-2})\, \mathrm{d}x_{k-1} .$

Note that this equation also follows from the general probabilistic rule (2.53) and can be rewritten more generally for $m < k - 1$ as

(2.103) $p(x_k | x_m) = \int_{-\infty}^{\infty} p(x_k | x_{k-1})\, p(x_{k-1} | x_m)\, \mathrm{d}x_{k-1} .$
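
The Chapman‐Kolmogorov equation (2.102) can be verified numerically for a Gaussian random walk, for which the transitional pdf depends only on the increment. In this sketch, the one‐step increment is assumed to be $\mathcal{N}(0, \sigma^2)$, so the two‐step transitional pdf should be Gaussian with variance $2\sigma^2$. The function names and numeric values are illustrative.

```python
import math


def gauss_pdf(x, var):
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def transition(x_next, x_prev, var):
    """One-step transitional pdf p(x_k | x_{k-1}) = p(x_k - x_{k-1})."""
    return gauss_pdf(x_next - x_prev, var)


def two_step_transition(x_k, x_km2, var, half_width=12.0, n=4_000):
    """Midpoint-rule evaluation of the integral in (2.102) over x_{k-1}."""
    center = 0.5 * (x_k + x_km2)     # the integrand peaks midway between the end points
    std = math.sqrt(var)
    lower = center - half_width * std
    h = 2.0 * half_width * std / n
    total = 0.0
    for i in range(n):
        x_mid = lower + (i + 0.5) * h
        total += transition(x_k, x_mid, var) * transition(x_mid, x_km2, var)
    return total * h


if __name__ == "__main__":
    var = 0.7
    print(two_step_transition(1.0, -0.5, var))   # integral (2.102)
    print(gauss_pdf(1.0 - (-0.5), 2.0 * var))    # direct two-step Gaussian
```

Both printed values agree to numerical precision, reflecting that the variances of independent increments add over successive steps.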

The theory of Markov processes and chains establishes a special topic in the interpretation and estimation of real physical processes for a wide class of applications. The interested reader is referred to a number of fundamental and applied investigations discussed in [16,45,79,145,156,191].

2.3 Stochastic Differential Equation

Dynamic physical processes can be both linear and nonlinear with respect to variables and perturbations. Many of them can be generalized by a multivariate differential equation in the form

(2.104) $\frac{\mathrm{d}}{\mathrm{d}t} X(t) = f(X, w, t) ,$

where $f(\cdot)$ is a nonlinear function of a general vector stochastic process $X(t)$ and noise $w(t)$. We encounter such a case in trajectory measurements where values and disturbances in Cartesian coordinates are nonlinearly related to values measured in polar coordinates (see Example 2.5).

2.3.1 Standard Stochastic Differential Equation

Since noise is usually less intensive than measured values, another form of (2.104) has found more applications,

(2.105) $\frac{\mathrm{d}}{\mathrm{d}t} X(t) = f(X, t) + g(X, t)\, w(t) ,$

(2.106) $\mathrm{d}X(t) = f(X, t)\, \mathrm{d}t + g(X, t)\, \mathrm{d}W(t) ,$

where $f(X, t)$ and $g(X, t)$ are some known nonlinear functions, $w(t)$ is some noise, and $\mathrm{d}W(t) = w(t)\, \mathrm{d}t$. A differential equation (2.105) or (2.106) can be thought of as SDE, because one or more of its terms are random processes and the solution is also a random process. Since a typical SDE contains white noise $w(t)$ calculated by the derivative of the Wiener process $W(t)$, we will further refer to this case.

If the noise $w(t)$ in (2.105) is white Gaussian with zero mean and autocorrelation function $\mathcal{R}_w(\tau) = \frac{N_0}{2} \delta(\tau)$, and noise $W(t)$ in (2.106) is a Wiener process with zero mean and autocorrelation function $\mathcal{R}_W(t_1, t_2) = \frac{N_0}{2} \min(t_1, t_2)$, then (2.105) and (2.106) are called SDE if the following Lipschitz condition is satisfied.

Lipschitz condition: An equation (2.105) representing a scalar random process $X(t)$ is said to be SDE if zero mean noise $w(t)$ is white Gaussian with known variance and nonlinear functions $f(X, t)$ and $g(X, t)$ satisfy the Lipschitz condition

(2.107) $|f(x, t) - f(y, t)| + |g(x, t) - g(y, t)| \leqslant L |x - y|$

for constant $L > 0$.

The problem with integrating either (2.105) or (2.106) arises because the integrand, which has white noise properties, does not satisfy the Dirichlet condition, and thus the integral does not exist in the usual sense of Riemann and Lebesgue. However, solutions can be found if we use the Itô calculus and Stratonovich calculus.

2.3.2 Itô and Stratonovich Stochastic Calculus

Integrating SDE (2.106) from $t_0$ to $t$ gives

(2.108) $X(t) = X(t_0) + \int_{t_0}^{t} f(X, t)\, \mathrm{d}t + \int_{t_0}^{t} g(X, t)\, \mathrm{d}W(t) ,$

where it is required to know the initial $X(t_0)$, and the first integral must satisfy the Dirichlet condition.

The second integral in (2.108), called the stochastic integral, can be defined in the Lebesgue sense as [140]

(2.109) $\int_{t_0}^{t} g(X, t)\, \mathrm{d}W(t) = \lim_{n \to \infty} \sum_{k=1}^{n} g(X, \bar{t}_k) [W(t_k) - W(t_{k-1})] ,$

where the integration interval $[t_0, t]$ is divided into $n$ subintervals as $t_0 < t_1 < \cdots < t_n = t$ and $\bar{t}_k \in [t_{k-1}, t_k]$. This calculus allows one to integrate the stochastic integral numerically if $\bar{t}_k$ is specified properly.

Itô proved that a stable solution to (2.109) can be found by assigning $\bar{t}_k = t_{k-1}$. The corresponding solution for (2.108) was named the Itô solution, and SDE (2.106) was called the Itô SDE [140]. Another calculus was proposed by Stratonovich [193], who suggested assigning $\bar{t}_k = \frac{1}{2}(t_{k-1} + t_k)$ at the midpoint of the interval. To distinguish it from the Itô SDE, the Stratonovich SDE is often written as

(2.110) $\mathrm{d}X(t) = f(X, t)\, \mathrm{d}t + g(X, t) \circ \mathrm{d}W(t)$

with a circle in the last term. The circle is also introduced into the stochastic integral as $\int_{t_0}^{t} g(X, t) \circ \mathrm{d}W(t)$ to indicate that the calculus (2.109) is in the Stratonovich sense. The analysis of Stratonovich's solution is more complicated, but Stratonovich's SDE can always be converted to Itô's SDE using a simple transformation rule [193]. Moreover, a function $g$ that does not depend on $X$, $g(X, t) = g(t)$, makes both solutions equivalent.
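
The effect of the evaluation point can be seen on the classical integral $\int_0^T W\, \mathrm{d}W$. The sketch below assumes a unit‐intensity Wiener process, $E\{\mathrm{d}W^2\} = \mathrm{d}t$ (that is, $N_0/2 = 1$), with increments taken forward in time, $W(t_k) - W(t_{k-1})$. The left‐end sum approaches $(W^2(T) - T)/2$ and the midpoint sum approaches $W^2(T)/2$. The function name, the seed, and the step count are hypothetical.

```python
import math
import random


def ito_and_stratonovich_sums(t_end=1.0, n=20_000, seed=2):
    """Approximate int_0^T W dW with left-end and midpoint evaluation.

    The path is simulated on 2n half-steps so that W is available at the
    midpoint of every interval. Returns (ito_sum, stratonovich_sum, w_end).
    """
    rng = random.Random(seed)
    half_std = math.sqrt(t_end / (2 * n))   # std of W over half an interval
    w_left = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n):
        w_mid = w_left + rng.gauss(0.0, half_std)     # W at the interval midpoint
        w_right = w_mid + rng.gauss(0.0, half_std)    # W at the right end
        increment = w_right - w_left
        ito += w_left * increment      # integrand taken at the left end
        strat += w_mid * increment     # integrand taken at the midpoint
        w_left = w_right
    return ito, strat, w_left


if __name__ == "__main__":
    ito, strat, w_end = ito_and_stratonovich_sums()
    print("left-end sum:", ito, "expected about:", (w_end ** 2 - 1.0) / 2.0)
    print("midpoint sum:", strat, "expected about:", w_end ** 2 / 2.0)
    print("difference:", strat - ito, "expected about T/2 = 0.5")
```

The two sums differ by approximately $T/2$ regardless of the step size, which is why the choice of $\bar{t}_k$ matters whenever the integrand depends on the process itself.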

2.3.3 Diffusion Process Interpretation

Based on the rules of Itô and Stratonovich, the theory of SDE has been developed in great detail [16,45,125,145,156]. It has been shown that if the stochastic process is represented with (2.105) or (2.106), then it is a Markovian process [45]. Moreover, the process described using (2.105) or (2.106) belongs to the class of diffusion processes, which are described using the drift and diffusion coefficients [193] defined as

(2.111) $K_1(x,t) = \lim_{\tau\to 0}\frac{1}{\tau}\mathcal{E}\{[X(t+\tau) - X(t)] \mid X = x\},$

(2.112) $K_2(x,t) = \lim_{\tau\to 0}\frac{1}{\tau}\mathcal{E}\{[X(t+\tau) - X(t)]^2 \mid X = x\}$

and associated with the first and second moments of the stochastic process $X(t)$. Depending on the choice of $\bar{t}_k$ in (2.109), the drift and diffusion coefficients can be defined in different senses. In the Stratonovich sense, the drift and diffusion coefficients become

(2.113) $K_1(X,t) = f(X,t) + \frac{N_0}{2}\,g(X,t)\frac{\partial g(X,t)}{\partial X},$

(2.114) $K_2(X,t) = \frac{N_0}{2}\,g^2(X,t).$

In the Itô sense, the second term vanishes on the right-hand side of (2.113).
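The limits (2.111) and (2.112) can be approximated by averaging simulated increments over a short interval $\tau$. The following Python sketch is only an illustration: it assumes a linear model with $f = -\alpha X$ and constant $g = \alpha$ (so the second term in (2.113) is zero in either sense), a Wiener increment with variance $(N_0/2)\tau$, and hypothetical names such as `estimate_coefficients` and `n_samples`.

```python
import math
import random

def estimate_coefficients(x, tau, alpha=1.0, n0=2.0, n_samples=200000, seed=2):
    """Monte Carlo estimate of K1 and K2 at X = x, following (2.111)-(2.112).

    Assumed model: one Euler step of dX = -alpha * X dt + alpha dW,
    where dW is Gaussian with variance (n0 / 2) * tau.
    """
    rng = random.Random(seed)
    std_dw = math.sqrt(0.5 * n0 * tau)
    s1 = 0.0
    s2 = 0.0
    for _ in range(n_samples):
        dx = -alpha * x * tau + alpha * rng.gauss(0.0, std_dw)
        s1 += dx                              # accumulates the first moment
        s2 += dx * dx                         # accumulates the second moment
    return s1 / (n_samples * tau), s2 / (n_samples * tau)

k1, k2 = estimate_coefficients(x=0.5, tau=0.01)
print(k1, k2)   # roughly -alpha * x = -0.5 and alpha**2 * n0 / 2 = 1.0
```

For finite $\tau$ the second estimate carries a small bias of order $\tau$, and the first one is noisy because the random part of the increment dominates the drift over short intervals.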

In a more general case of a vector stochastic process $X(t)$, represented by a set of $n$ random subprocesses $X_1(t), \dots, X_n(t)$, the $i$th stochastic process can be described using the SDE [193]

(2.115) $\mathrm{d}X_i(t) = f_i(X,t)\,\mathrm{d}t + \sum_{l=1}^{n} g_{il}(X,t)\,\mathrm{d}W_l(t), \quad i \in [1,n],$

where the nonlinear functions $f_i(X,t)$ and $g_{il}(X,t)$ satisfy the Lipschitz condition (2.107) and $W_l(t)$, $l \in [1,n]$, are independent Wiener processes with zero mean and autocorrelation function

$\mathcal{E}\{[W_l(t_1) - W_l(t_2)][W_j(t_1) - W_j(t_2)]\} = \frac{N_l}{2}\,|t_2 - t_1|\,\delta_{lj},$

where $l, j \in [1,n]$, $\delta_{lj}$ is the Kronecker symbol, and $N_l/2$ is the PSD of $w_l(t)$.

Similarly to the scalar case, the vector stochastic process $X(t)$, described by the SDE (2.115), can be represented in the Stratonovich sense with the following drift coefficient and diffusion coefficient [193],

(2.116) $K_i(X,t) = f_i(X,t) + \sum_{l=1}^{n}\sum_{j=1}^{n}\frac{N_l}{2}\,g_{jl}(X,t)\frac{\partial g_{il}(X,t)}{\partial X_j},$

(2.117) $K_{ij}(X,t) = \sum_{l=1}^{n}\frac{N_l}{2}\,g_{il}(X,t)\,g_{jl}(X,t).$
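As a small worked illustration of (2.117), the following Python sketch evaluates the diffusion coefficients at a single point for given values of $g_{il}$ and $N_l$. The function name `diffusion_matrix` and the numbers used are hypothetical and chosen only for the example.

```python
def diffusion_matrix(g, psd):
    """Return K_ij = sum_l (N_l / 2) * g[i][l] * g[j][l], as in (2.117).

    g   : n x n nested list with the values g_il(X, t) at one point
    psd : list with the n values N_l
    """
    n = len(g)
    return [[sum(0.5 * psd[l] * g[i][l] * g[j][l] for l in range(n))
             for j in range(n)] for i in range(n)]

g = [[1.0, 0.0],
     [0.5, 2.0]]                 # example values of g_il (assumed)
psd = [2.0, 4.0]                 # example values of N_l (assumed)
K = diffusion_matrix(g, psd)
print(K)                         # [[1.0, 0.5], [0.5, 8.25]]
```

The result is symmetric by construction, since interchanging $i$ and $j$ in (2.117) leaves every term of the sum unchanged.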

Given the drift and diffusion coefficients, the SDE can be replaced by a probabilistic differential equation for the time-varying pdf $p(x,t)$ of $X(t)$. This equation is most often called the Fokker-Planck-Kolmogorov (FPK) equation, but it can also be found in the works of Einstein and Smoluchowski.

2.3.4 Fokker-Planck-Kolmogorov Equation

If the SDE (2.105) obeys the Lipschitz condition, then it can be replaced by the following FPK equation, which represents the dynamics of $X(t)$ in probabilistic terms,

(2.118) $\frac{\partial}{\partial t}p(x,t) = -\frac{\partial}{\partial x}\left[K_1(x,t)\,p(x,t)\right] + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[K_2(x,t)\,p(x,t)\right].$

This equation is also known as the first or forward Kolmogorov equation. Closed-form solutions of the partial differential equation (2.118) have been found so far only for a few simple cases. However, the stationary case, assuming $p(x,t) = p_{\mathrm{st}}(x)$ with $t \to \infty$, reduces it to a time-invariant probability flux

(2.119) $G(x) = K_1(x)\,p_{\mathrm{st}}(x) - \frac{1}{2}\frac{\partial}{\partial x}\left[K_2(x)\,p_{\mathrm{st}}(x)\right] = 0,$

which is equal to zero at all points of the range. Hence, the steady-state pdf can be defined as

(2.120) $p_{\mathrm{st}}(x) = \frac{c}{K_2(x)}\exp\left[2\int_{x_1}^{x}\frac{K_1(z)}{K_2(z)}\,\mathrm{d}z\right]$

by integrating (2.119) from some point $x_1$ to $x$ with the normalizing constant $c$. By virtue of this, in many cases, a closed-form solution is not required for (2.118), because pdf (2.120) contains all the statistical information about the stochastic process $X(t)$ as $t \to \infty$.
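The steady-state pdf (2.120) is straightforward to evaluate numerically once the drift and diffusion coefficients are given. The following Python sketch is an illustration under assumptions that are not in the text: a linear drift $K_1 = -\alpha x$, a constant diffusion $K_2 = \alpha^2 N_0/2$, a trapezoidal quadrature, and hypothetical names (`stationary_pdf`, `x_ref`, `n_quad`). For these coefficients the result should be a zero-mean Gaussian with variance $\alpha N_0/4$.

```python
import math

def stationary_pdf(k1, k2, xs, x_ref=0.0, n_quad=200):
    """Evaluate (2.120) on the uniform grid xs and normalize numerically.

    k1, k2 : callables returning the drift and diffusion coefficients
    x_ref  : lower integration limit x_1 (its effect is absorbed by c)
    """
    def exponent(x):
        # Trapezoidal rule for 2 * integral of K1/K2 from x_ref to x.
        h = (x - x_ref) / n_quad
        total = 0.5 * (k1(x_ref) / k2(x_ref) + k1(x) / k2(x))
        for m in range(1, n_quad):
            z = x_ref + m * h
            total += k1(z) / k2(z)
        return 2.0 * h * total

    unnorm = [math.exp(exponent(x)) / k2(x) for x in xs]
    dx = xs[1] - xs[0]
    c = 1.0 / (sum(unnorm) * dx)             # normalizing constant
    return [c * u for u in unnorm]

alpha, n0 = 1.0, 2.0                         # assumed example values
xs = [-6.0 + 0.01 * i for i in range(1201)]
p = stationary_pdf(lambda x: -alpha * x, lambda x: 0.5 * alpha ** 2 * n0, xs)
var = sum(x * x * px for x, px in zip(xs, p)) * 0.01
print(var, alpha * n0 / 4)                   # both close to 0.5
```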

The FPK equation (2.118) describes the forward dynamics of the stochastic process in probabilistic terms. But if it is necessary to study the inverse dynamics, one can use the second or backward Kolmogorov equation

(2.121) $\frac{\partial}{\partial s}p(x,s) = -K_1(x,s)\frac{\partial}{\partial x}p(x,s) - \frac{1}{2}K_2(x,s)\frac{\partial^2}{\partial x^2}p(x,s), \quad s \leqslant t,$

specifically to learn the initial distribution of $X$ at $s < t$, provided that the distribution at $t$ is already known.

2.3.5 Langevin Equation

A linear first-order SDE is known in physics as the Langevin equation, the physical nature of which can be found in Brownian motion. It also represents the first-order electric circuit driven by white noise $w(t)$. Although the Langevin equation is the simplest in the family of SDEs, it allows one to study the key properties of stochastic dynamics and plays a fundamental role in the theory of stochastic processes.

The Langevin equation can be written as

(2.122) $\frac{\mathrm{d}}{\mathrm{d}t}X(t) + \alpha X(t) = \alpha w(t),$

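where $\alpha$ is a constant and $w(t)$ is white Gaussian noise with zero mean, $\mathcal{E}\{w(t)\} = 0$, and the autocorrelation function $\mathcal{E}\{w(\theta_1)w(\theta_2)\} = \frac{N_0}{2}\delta(\theta_1 - \theta_2)$. If we assume that the stochastic process starts at time $t = 0$ as $X(0) = X_0$, then the solution to (2.122) can be written as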

(2.123) $X(t) = X_0 e^{-\alpha t} + \alpha e^{-\alpha t}\int_0^t e^{\alpha\theta}w(\theta)\,\mathrm{d}\theta,$

which has the mean

(2.124) $\mathcal{E}\{X(t)\} = \bar{X}(t) = X_0 e^{-\alpha t}$

and the variance

(2.125) $\begin{aligned} \sigma_X^2(t) &= \mathcal{E}\{[X(t) - \bar{X}(t)]^2\} \\ &= \alpha^2 e^{-2\alpha t}\int_0^t\!\!\int_0^t e^{\alpha\theta_1}e^{\alpha\theta_2}\,\mathcal{E}\{w(\theta_1)w(\theta_2)\}\,\mathrm{d}\theta_1\,\mathrm{d}\theta_2 \\ &= \frac{\alpha^2 N_0}{2}e^{-2\alpha t}\int_0^t\!\!\int_0^t e^{\alpha\theta_1}e^{\alpha\theta_2}\,\delta(\theta_1 - \theta_2)\,\mathrm{d}\theta_1\,\mathrm{d}\theta_2 \\ &= \frac{\alpha^2 N_0}{2}e^{-2\alpha t}\int_0^t e^{2\alpha\theta}\,\mathrm{d}\theta \\ &= \frac{\alpha N_0}{4}\left(1 - e^{-2\alpha t}\right). \end{aligned}$

Since the Langevin equation is linear and the driving noise is white Gaussian, it follows that $X(t)$ is also Gaussian with a nonstationary pdf

(2.126) $p(x,t) = \frac{1}{\sqrt{2\pi\sigma_X^2(t)}}\exp\left[-\frac{[x - \bar{X}(t)]^2}{2\sigma_X^2(t)}\right],$

where the mean is given by (2.124) and the variance by (2.125).
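The moments (2.124) and (2.125) can be cross-checked by direct simulation. The following Python sketch is a minimal illustration, assuming an Euler-Maruyama discretization of the Langevin equation with the noise scaled by $\alpha$ as in the solution (2.123), a Wiener increment of variance $(N_0/2)h$ per step $h$, and hypothetical names and parameter values (`simulate_langevin`, `n_paths`, $\alpha = 1$, $N_0 = 2$).

```python
import math
import random

def simulate_langevin(x0=1.0, alpha=1.0, n0=2.0, t_end=1.0,
                      n_steps=100, n_paths=4000, seed=3):
    """Sample mean and variance of X(t_end) from Euler-Maruyama paths.

    Assumed discretization of dX = -alpha * X dt + alpha dW, where the
    Wiener increment dW has variance (n0 / 2) * h on a step of length h.
    """
    rng = random.Random(seed)
    h = t_end / n_steps
    std_dw = math.sqrt(0.5 * n0 * h)
    finals = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += -alpha * x * h + alpha * rng.gauss(0.0, std_dw)
        finals.append(x)
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / (n_paths - 1)
    return mean, var

mean, var = simulate_langevin()
print(mean, math.exp(-1.0))                  # compare with (2.124)
print(var, 0.5 * (1.0 - math.exp(-2.0)))     # compare with (2.125)
```

The agreement is only approximate: the step size introduces a small discretization bias and the finite number of paths adds sampling error.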

The drift coefficient (2.113) and diffusion coefficient (2.114) are defined for the Langevin equation (2.122) as

$\begin{aligned} K_1(X,t) &= -\alpha X(t), \\ K_2 &= \frac{\alpha^2 N_0}{2}, \end{aligned}$

and hence the FPK equation (2.118) becomes

(2.127) $\frac{\partial}{\partial t}p(x,t) = \alpha\frac{\partial}{\partial x}\left[x\,p(x,t)\right] + \frac{\alpha^2 N_0}{4}\frac{\partial^2}{\partial x^2}p(x,t),$

whose solution is the nonstationary Gaussian pdf (2.126) with the moments (2.124) and (2.125).
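Equation (2.127) can also be integrated directly on a grid. The following Python sketch is an illustration rather than a recommended solver: it assumes an explicit scheme with central differences, zero pdf at the edges of a truncated range, and a narrow Gaussian initial pdf with variance `var0` in place of a deterministic $X_0$. All names and numerical values are hypothetical. Because the model is linear, the initial spread adds the term `var0 * exp(-2 * alpha * t)` to the variance (2.125).

```python
import math

def solve_fpk(x0=1.0, var0=0.05, alpha=1.0, n0=2.0, t_end=1.0,
              x_max=6.0, dx=0.1, dt=0.002):
    """Explicit finite-difference integration of (2.127).

    Assumptions of this sketch: Gaussian initial pdf with mean x0 and
    variance var0, and p = 0 at the edges of [-x_max, x_max].
    """
    n = int(round(2 * x_max / dx)) + 1
    xs = [-x_max + i * dx for i in range(n)]
    p = [math.exp(-(x - x0) ** 2 / (2 * var0)) for x in xs]
    mass = sum(p) * dx
    p = [v / mass for v in p]                # normalize the initial pdf
    d = alpha ** 2 * n0 / 4.0                # factor of the second derivative
    for _ in range(int(round(t_end / dt))):
        new = [0.0] * n
        for i in range(1, n - 1):
            drift = alpha * (xs[i + 1] * p[i + 1] - xs[i - 1] * p[i - 1]) / (2 * dx)
            diff = d * (p[i + 1] - 2 * p[i] + p[i - 1]) / dx ** 2
            new[i] = p[i] + dt * (drift + diff)
        p = new
    mean = sum(x * v for x, v in zip(xs, p)) * dx
    var = sum((x - mean) ** 2 * v for x, v in zip(xs, p)) * dx
    return mean, var

mean, var = solve_fpk()
expected_var = 0.05 * math.exp(-2.0) + 0.5 * (1.0 - math.exp(-2.0))
print(mean, math.exp(-1.0))                  # mean vs (2.124)
print(var, expected_var)                     # variance vs (2.125) plus initial spread
```

The time step was chosen small enough for the explicit scheme to remain stable on this grid; a finer grid would require a proportionally smaller step.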

Langevin's equation is a nice illustration of how a stochastic process can be investigated in terms of a probability distribution, rather than by solving its SDE. Unfortunately, most high-order stochastic processes cannot be explored in this way due to the complexity, and the best way is to go into the state space and use state estimators, which we will discuss in the next chapter.

2.4 Summary

In this chapter we have presented the basics of probability and stochastic processes, which are essential to understand the theory of state estimation. Probability theory and methods developed for stochastic processes play a fundamental role in understanding the features of physical processes driven and corrupted by noise. They also enable the formulation of feature extraction requirements from noise processes and the development of optimal and robust estimators.

A random variable $X$ of a physical process can be related to the corresponding measured element in a linear or nonlinear manner. A collection of random variables is a random process. The probability that $X$ will occur below some constant $x$ (event $A$) is calculated as the ratio of possible outcomes favoring event $A$ to total possible outcomes. The corresponding function of $x$ is called cdf. In turn, pdf represents the concentration of $X$ values around $x$ and is equal to the derivative of the cdf with respect to $x$, since cdf is the integral measure of pdf.

Each random process can be represented by a set of initial and central moments. The Gaussian process is the only one that is represented by a first-order raw moment (mean) and a second-order central moment (variance). Product moments represent the power of interaction between two different random variables $X$ and $Y$. The normalized measure of interaction is called the correlation coefficient, which lies in the range $[-1, 1]$. A stochastic process is associated with continuous time and a stochastic sequence with discrete time.

The conditional probability $P(A|B)$ of the event $A$ means that the event $B$ is observed. Therefore, it is also called the a posteriori probability, and $P(A)$ and $P(B)$ are called the a priori probabilities. Bayes' theorem states that the joint probability $P(AB)$ can be represented as $P(AB) = P(A|B)P(B) = P(B|A)P(A)$, and there is always a certain rule for converting two correlated random variables into two uncorrelated random variables and vice versa.

The autocorrelation function establishes the degree of interaction between the values of a variable measured at two different time points, while the cross-correlation establishes the degree of interaction between the values of two different variables. According to the Wiener-Khinchin theorem, PSD and correlation function are related to each other by the Fourier transform.

Continuous white Gaussian noise has infinite variance, while its discrete counterpart has finite variance. A stochastic process is called Markov if its random variable , for a given at , does not depend on , since .

The SDE can be solved either in the Itô sense or in the Stratonovich sense. It can also be viewed as a diffusion process and represented by the probabilistic FPK equation. The Langevin equation is a classical example of first-order stochastic processes associated with Brownian motion.

2.5 Problems

Two events $A$ and $B$ are mutually exclusive. Can they be uncorrelated and independent?
Two nodes and transmit the same message over the wireless network to a central station, which can only process the previously received one. Assuming a random delay in message delivery, what is the probability that 10 consecutive messages will belong to node ?
A network of 20 nodes contains 5 damaged ones. If we choose three nodes at random, what is the probability that at least one of these nodes is defective?
Show that if , then 1) and 2) .
The binomial coefficients are computed by . Show that 1) and 2) .
Given events $A$, $B$, and $C$ and using the chain rule, show that
$\begin{aligned} P(AB) &= P(A|B)P(B), \\ P(AB|C) &= P(A|BC)P(B|C), \\ P(ABC) &= P(A|BC)P(B|C)P(C). \end{aligned}$
Given two independent identically distributed random variables and with zero mean, variance , and joint pdf , find the pdf of the following variables: 1) , 2) , and 3) .
The Bernoulli distribution of the discrete random variable is given by pmf
(2.128) $f(X; p) = p^{X}(1 - p)^{1 - X}$

to represent random binary time delays in communication channels. Find the cdf for (2.128) and show that the mean is $p$ and the variance is $p(1 - p)$.
A generalized normal distribution of a random variable is given by the pdf
(2.129) $p_X(x) = \frac{\beta}{2\alpha\Gamma(1/\beta)}\,e^{-(|x - \mu|/\alpha)^{\beta}},$

where all parameters are real, $\mu$ is location, $\alpha$ is scale, and $\beta$ is shape. Find the moments and cumulants of (2.129). Prove that the mean is $\mu$ and the variance is $\alpha^2\Gamma(3/\beta)/\Gamma(1/\beta)$. Find the values of $\beta$ at which this distribution becomes Gaussian, almost rectangular, and heavy-tailed.
The random variable representing the measurement noise has a Laplace distribution with pdf
(2.130) $p_X(x \mid \mu, b) = \frac{1}{2b}\exp\left(-\frac{|x - \mu|}{b}\right),$

where $\mu$ is the location parameter and $b$ is the scale parameter. Find the cdf, the mean $\bar{X}$, and the variance $\sigma_X^2$ of this variable.
Consider a set of independent samples $X_i$, $i \in [1, N]$, each of which obeys the Laplace distribution (2.130) with variance $\sigma_L^2$. The maximum likelihood estimate of location $\mu$ is given by [13]
(2.131) $\hat{\mu} = \arg\min_{\mu}\sum_{i=1}^{N}\frac{1}{\sigma_L^2}\,|X_i - \mu|$

and is called the median estimate. Redefine this estimate considering $\mu$ as a state variable and explain the meaning of the estimate.
A measured random quantity has a Cauchy distribution with pdf
(2.132) $p_X(x \mid \mu, b) = \frac{1}{\pi}\,\frac{b}{(x - \mu)^2 + b^2},$

where $\mu$ is the location parameter and $b$ is the scale parameter. Prove that cdf is $\frac{1}{\pi}\arctan\left(\frac{x - \mu}{b}\right) + \frac{1}{2}$ and that the mean and variance are not definable.
Consider a measurable random variable represented by a Cauchy pdf (2.132) with and . The set of measured variables passes through an electric circuit, where it is saturated by the power supply as
$X_i = \begin{cases} X_i, & |X_i| < 5, \\ 5, & |X_i| \geqslant 5. \end{cases}$

Modify the Cauchy pdf (2.132) for saturated $X_i$ and numerically compute the mean and variance.
The measurement of some scalar constant quantity is corrupted by the Gauss-Markov noise , where and is the white Gaussian driving noise. Find the autocorrelation function and PSD of noise . Describe the properties of in two extreme cases: and .
Given a discrete-time random process , where and is some random driving force. Considering as input and as output, find the input-to-output transfer function .
Explain the physical nature of the skewness and kurtosis and how these measures help to recognize the properties of a random variable $X$. Illustrate the analysis based on the Bernoulli, Gaussian, and generalized normal distributions and provide some practical examples.
The joint pdf of $X$ and $Y$ is given by
$p_{X,Y}(x,y) = \begin{cases} x^2 + \frac{1}{3}xy, & 0 \leqslant x \leqslant 1,\ 0 \leqslant y \leqslant 2, \\ 0, & \text{otherwise}. \end{cases}$

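Prove that the marginal pdfs are $p_X(x) = 2x^2 + \frac{2}{3}x$ for $0 \leqslant x \leqslant 1$ and $p_Y(y) = \frac{1}{3} + \frac{1}{6}y$ for $0 \leqslant y \leqslant 2$.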
Two uncorrelated phase differences and are distributed with the conditional von Mises circular normal pdf
(2.133) $p(\phi \mid \gamma, \phi_0) = \frac{1}{2\pi I_0(\alpha)}\,e^{\alpha\cos(\phi - \phi_0)},$

where $\phi$ is a random phase mod $2\pi$ and $\phi_0$ is its deterministic constituent, $I_0(\alpha)$ is a modified Bessel function of the first kind and zeroth order, and $\alpha$ is a parameter sensitive to the power signal-to-noise ratio (SNR) $\gamma$. Show that the phase difference $\Psi$ with different SNRs is conditionally distributed by

(2.134) $p(\Psi \mid \gamma_1, \gamma_2, \Psi_0) = \frac{1}{2\pi}\,\frac{I_0(r)}{I_0(\alpha_1)\,I_0(\alpha_2)}$

and define the function for .
A stable system with random components is represented in discrete-time state space with the state equation and the observation equation . Considering as input and as output, find the input-to-output transfer function .
The Markov chain with three states is specified using the transition matrix
$M = \begin{bmatrix} 0.9 & 0.075 & 0.025 \\ 0.15 & 0.8 & 0.05 \\ 0.25 & 0.25 & 0.5 \end{bmatrix}.$

Represent this chain as , specify the transition matrix , and find .
Given a stationary process with derivative , show that for a given time the random variables and are orthogonal and uncorrelated.
The continuous-time three-state clock model is represented by SDE , where the zero mean noise has the following components: is the phase noise, is the frequency noise, and is the linear frequency drift noise. Show that the noise covariance for this model is given by
$\mathcal{C}_w(\tau) = \tau\begin{bmatrix} \mathcal{S}_{\phi} + \frac{\mathcal{S}_{\dot{\phi}}\tau^2}{3} + \frac{\mathcal{S}_{\ddot{\phi}}\tau^4}{20} & \frac{\mathcal{S}_{\dot{\phi}}\tau}{2} + \frac{\mathcal{S}_{\ddot{\phi}}\tau^3}{8} & \frac{\mathcal{S}_{\ddot{\phi}}\tau^2}{6} \\ \frac{\mathcal{S}_{\dot{\phi}}\tau}{2} + \frac{\mathcal{S}_{\ddot{\phi}}\tau^3}{8} & \mathcal{S}_{\dot{\phi}} + \frac{\mathcal{S}_{\ddot{\phi}}\tau^2}{3} & \frac{\mathcal{S}_{\ddot{\phi}}\tau}{2} \\ \frac{\mathcal{S}_{\ddot{\phi}}\tau^2}{6} & \frac{\mathcal{S}_{\ddot{\phi}}\tau}{2} & \mathcal{S}_{\ddot{\phi}} \end{bmatrix},$

if we provide the integration from to .
Two discrete random stationary processes and have autocorrelation functions
(2.135) $\mathcal{R}_w = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}, \quad \mathcal{R}_v = \begin{bmatrix} 1 & 1 & \dots & 1 \\ 1 & 1 & \dots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \dots & 1 \end{bmatrix},$

which are measured relative to each point on the horizon by shifting and by . What is the PSD of the first process and the second process ?
Given an LTI system with the impulse response , where , , and is the unit step function. In this system, the input is a stationary random process with the autocorrelation function applied at and disconnected at . Find the mean and the mean square value of the output signal and plot these functions.