Saturday 21 May 2016

Does the autocorrelation function completely describe a stochastic process?



Is a stochastic process completely described by its autocorrelation function?


If not, which additional properties would be needed?



Answer



What is meant by a complete description of a stochastic process? Well, mathematically, a stochastic process is a collection $\{X(t) : t \in {\mathbb T}\}$ of random variables, one for each time instant $t$ in an index set $\mathbb T$, where usually $\mathbb T$ is the entire real line or the positive real line, and a complete description means that for each integer $n \geq 1$ and $n$ time instants $t_1, t_2, \ldots, t_n \in \mathbb T$, we know the (joint) distributions of the $n$ random variables $X(t_1)$, $X(t_2)$, $\ldots, X(t_n)$. This is an enormous amount of information: we need to know the CDF of $X(t)$ for each time instant $t$, the (two-dimensional) joint CDF of $X(t_1)$ and $X(t_2)$ for all choices of time instants $t_1$ and $t_2$, the (three-dimensional) CDFs of $X(t_1)$, $X(t_2)$, and $X(t_3)$, etc. etc. etc.
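The idea of "one random variable per time instant" can be made concrete with a small numerical sketch. The process below, $X(t) = A\cos(t)$ with $A$ standard normal, is an illustrative assumption (not a process discussed above); fixing $t$ and drawing many realizations gives samples of the random variable $X(t)$, from which its CDF can be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stochastic process viewed as a collection of random variables X(t):
# fixing t and drawing many realizations gives samples of the random
# variable X(t), whose CDF can then be estimated empirically.
# (Illustrative choice of process: X(t) = A*cos(t) with A ~ N(0, 1).)
def sample_process(t, n_realizations):
    A = rng.standard_normal(n_realizations)  # one random outcome per realization
    return A * np.cos(t)                     # value of X(t) in each realization

def empirical_cdf(samples, x):
    return np.mean(samples <= x)             # fraction of samples at or below x

samples = sample_process(t=0.5, n_realizations=100_000)
# X(0.5) = A*cos(0.5) is zero-mean Gaussian, so its CDF at 0 is 1/2.
print(empirical_cdf(samples, 0.0))           # close to 0.5
```

A complete description would require such a CDF for every $t$, every joint CDF for every pair $(t_1, t_2)$, and so on, which is exactly the "enormous amount of information" referred to above.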


So naturally people have looked about for simpler descriptions and more restrictive models. One simplification occurs when the process is invariant to a change in the time origin. What this means is that



  • All the random variables in the process have identical CDFs: $F_{X(t_1)}(x) = F_{X(t_2)}(x)$ for all $t_1, t_2$.

  • Any two random variables separated by some specified amount of time have the same joint CDF as any other pair of random variables separated by the same amount of time. For example, the random variables $X(t_1)$ and $X(t_1 + \tau)$ are separated by $\tau$ seconds, as are the random variables $X(t_2)$ and $X(t_2 + \tau)$, and thus $F_{X(t_1), X(t_1 + \tau)}(x,y) = F_{X(t_2), X(t_2 + \tau)}(x,y)$.

  • Any three random variables $X(t_1)$, $X(t_1 + \tau_1)$, $X(t_1 + \tau_1 + \tau_2)$ spaced $\tau_1$ and $\tau_2$ apart have the same joint CDF as $X(t_2)$, $X(t_2 + \tau_1)$, $X(t_2 + \tau_1 + \tau_2)$, which are also spaced $\tau_1$ and $\tau_2$ apart,

  • and so on for all multidimensional CDFs. See, for example, Peter K.'s answer for details of the multidimensional case.



Effectively, the probabilistic descriptions of the random process do not depend on what we choose to call the origin on the time axis: shifting all time instants $t_1, t_2, \ldots, t_n$ by some fixed amount $\tau$ to $t_1 + \tau, t_2 + \tau, \ldots, t_n + \tau$ gives the same probabilistic description of the random variables. This property is called strict-sense stationarity and a random process that enjoys this property is called a strictly stationary random process or, more simply, a stationary random process.



Note that strict stationarity by itself does not require any particular form of CDF. For example, it does not say that all the variables are Gaussian.



The adjective strictly suggests that it is possible to define a looser form of stationarity. If the $N^{\text{th}}$-order joint CDF of $X(t_1), X(t_2), \ldots, X(t_N)$ is the same as the $N^{\text{th}}$-order joint CDF of $X(t_1+\tau), X(t_2+\tau), \ldots, X(t_N +\tau)$ for all choices of $t_1,t_2, \ldots, t_N$ and $\tau$, then the random process is said to be stationary to order $N$ and is referred to as an $N^{\text{th}}$-order stationary random process. Note that an $N^{\text{th}}$-order stationary random process is also stationary to order $n$ for each positive $n < N$. (This is because the $n^{\text{th}}$-order joint CDF is the limit of the $N^{\text{th}}$-order CDF as $N-n$ of the arguments approach $\infty$: a generalization of $F_X(x) = \lim_{y\to\infty}F_{X,Y}(x,y)$.) A strictly stationary random process then is a random process that is stationary to all orders $N$.


If a random process is stationary to (at least) order $1$, then all the $X(t)$'s have the same distribution and so, assuming the mean exists, $E[X(t)] = \mu$ is the same for all $t$. Similarly, $E[(X(t))^2]$ is the same for all $t$, and is referred to as the power of the process. All physical processes have finite power and so it is common to assume that $E[(X(t))^2] < \infty$, in which case, and especially in the older engineering literature, the process is called a second-order process. The choice of name is unfortunate because it invites confusion with second-order stationarity (cf. this answer of mine on stats.SE), and so here we will call a process for which $E[(X(t))^2]$ is finite for all $t$ (whether or not $E[(X(t))^2]$ is a constant) a finite-power process and avoid this confusion. But note again that



a first-order stationary process need not be a finite-power process.




Consider a random process that is stationary to order $2$. Now, since the joint distribution of $X(t_1)$ and $X(t_1 + \tau)$ is the same as the joint distribution function of $X(t_2)$ and $X(t_2 + \tau)$, $E[X(t_1)X(t_1 + \tau)] = E[X(t_2)X(t_2 + \tau)]$ and the value depends only on $\tau$. These expectations are finite for a finite-power process and their value is called the autocorrelation function of the process: $R_X(\tau) = E[X(t)X(t+\tau)]$ is a function of $\tau$, the time separation of the random variables $X(t)$ and $X(t+\tau)$, and does not depend on $t$ at all. Note also that $$E[X(t)X(t+\tau)] = E[X(t+\tau)X(t)] = E[X(t+\tau)X(t + \tau - \tau)] = R_X(-\tau),$$ and so the autocorrelation function is an even function of its argument.
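Both properties of $R_X(\tau)$ — that it does not depend on $t$ and that it is even in $\tau$ — can be checked numerically. The process below, $X(t) = \cos(t + \Theta)$ with $\Theta$ uniform on $[0, 2\pi)$, is an assumed example (for it, $R_X(\tau) = \frac 12\cos(\tau)$), not one specified in the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte-Carlo check that the autocorrelation of a second-order stationary
# process depends only on the separation tau and is even in tau.
# Assumed example process: X(t) = cos(t + Theta), Theta ~ Uniform[0, 2*pi),
# for which R_X(tau) = E[X(t)X(t+tau)] = cos(tau)/2 for every t.
def estimate_R(t, tau, n=1_000_000):
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.mean(np.cos(t + theta) * np.cos(t + tau + theta))

tau = 0.7
r_pos = estimate_R(t=0.0, tau=tau)
r_neg = estimate_R(t=3.2, tau=-tau)       # different t, opposite separation
print(r_pos, r_neg, 0.5 * np.cos(tau))    # all three nearly equal
```

The two estimates agree (up to Monte-Carlo error) with each other and with $\frac 12\cos(\tau)$, illustrating both the $t$-independence and the evenness derived above.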



A finite-power second-order stationary random process has the properties that



  1. Its mean $E[X(t)]$ is a constant

  2. Its autocorrelation function $R_X(\tau) = E[X(t)X(t+\tau)]$ is a function of $\tau$, the time separation of the random variables $X(t)$ and $X(t+\tau)$, and does not depend on $t$ at all.





The assumption of stationarity simplifies the description of a random process to some extent but, for engineers and statisticians interested in building models from experimental data, estimating all those CDFs is a nontrivial task, particularly when there is only a segment of one sample path (or realization) $x(t)$ on which measurements can be made. Two measurements that are relatively easy to make (because the engineer already has the necessary instruments on his workbench, or programs in MATLAB/Python/Octave/C++ in his software library) are the DC value $\frac 1T\int_0^T x(t)\,\mathrm dt$ of $x(t)$ and the autocorrelation function $R_x(\tau) = \frac 1T\int_0^T x(t)x(t+\tau)\,\mathrm dt$ (or its Fourier transform, the power spectrum of $x(t)$). Taking these measurements as estimates of the mean and the autocorrelation function of a finite-power process leads to a very useful model that we discuss next.
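These two time-average measurements are straightforward to compute from a sampled version of a single sample path. A minimal sketch, where the sample path (filtered noise plus a DC offset) is an illustrative assumption:

```python
import numpy as np

# Estimating the DC value (1/T) * integral of x(t) dt and the autocorrelation
# (1/T) * integral of x(t) x(t+tau) dt from one sampled sample path.
# The sample path below is an assumed example: a DC offset of 1.0 plus
# noise smoothed by a 20-tap moving average.
rng = np.random.default_rng(2)
dt = 0.01
t = np.arange(0.0, 100.0, dt)            # observation window [0, T), T = 100
x = 1.0 + np.convolve(rng.standard_normal(t.size),
                      np.ones(20) / 20, mode="same")

dc_value = np.mean(x)                    # discrete version of (1/T) * integral x dt

def autocorr(x, max_lag):
    # Discrete version of (1/T) * integral x(t) x(t+tau) dt
    # for tau = 0, dt, 2*dt, ..., (max_lag-1)*dt.
    n = x.size
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(max_lag)])

R = autocorr(x, max_lag=50)
print(dc_value)   # near 1.0, the DC offset
print(R[0])       # estimate of the power E[X(t)^2]
```

The value $R[0]$ estimates the power of the process, and the Fourier transform of $R$ would estimate the power spectrum mentioned above.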






A finite-power random process is called a wide-sense-stationary (WSS) process (also weakly stationary random process which fortunately also has the same initialism WSS) if it has a constant mean and its autocorrelation function $R_X(t_1, t_2) = E[X(t_1)X(t_2)]$ depends only on the time difference $t_1 - t_2$ (or $t_2 - t_1$).



Note that the definition says nothing about the CDFs of the random variables comprising the process; it is entirely a constraint on the first-order and second-order moments of the random variables. Of course, a finite-power second-order stationary (or $N^{\text{th}}$-order stationary (for $N>2$) or strictly stationary) random process is a WSS process, but the converse need not be true.



A WSS process need not be stationary to any order.



Consider, for example, the random process $\{X(t)\colon X(t)= \cos (t + \Theta), -\infty < t < \infty\}$ where $\Theta$ takes on four equally likely values $0, \pi/2, \pi$ and $3\pi/2$. (Do not be scared: the four possible sample paths of this random process are just the four signal waveforms of a QPSK signal.) Note that each $X(t)$ is a discrete random variable that, in general, takes on four equally likely values $\cos(t), \cos(t+\pi/2)=-\sin(t), \cos(t+\pi) = -\cos(t)$ and $\cos(t+3\pi/2)=\sin(t)$. It is easy to see that in general $X(t)$ and $X(s)$ have different distributions, and so the process is not even first-order stationary. On the other hand, $$E[X(t)] = \frac 14\cos(t)+ \frac 14(-\sin(t)) + \frac 14(-\cos(t))+\frac 14 \sin(t) = 0$$ for every $t$ while \begin{align} E[X(t)X(s)]&= \frac 14\left[\cos(t)\cos(s) + (-\cos(t))(-\cos(s)) + \sin(t)\sin(s) + (-\sin(t))(-\sin(s))\right]\\ &= \frac 12\left[\cos(t)\cos(s) + \sin(t)\sin(s)\right]\\ &= \frac 12 \cos(t-s). \end{align} In short, the process has zero mean and its autocorrelation function depends only on the time difference $t-s$, and so the process is wide-sense stationary. But it is not first-order stationary and so cannot be stationary to higher orders either.
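The QPSK example above involves only four equally likely phases, so the expectations can be verified by direct averaging:

```python
import numpy as np

# Direct check of the QPSK example: averaging over the four equally likely
# phases 0, pi/2, pi, 3*pi/2 gives E[X(t)] = 0 and
# E[X(t)X(s)] = cos(t - s)/2, even though X(t) and X(s) have different
# (discrete, four-point) distributions.
phases = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])

def mean_X(t):
    return np.mean(np.cos(t + phases))          # E[X(t)], averaging the 4 cases

def corr_X(t, s):
    return np.mean(np.cos(t + phases) * np.cos(s + phases))  # E[X(t)X(s)]

t, s = 1.3, 0.4                                 # arbitrary time instants
print(mean_X(t))                                # 0, up to roundoff
print(corr_X(t, s), 0.5 * np.cos(t - s))        # equal, up to roundoff
```

Trying other values of $t$ and $s$ confirms that only the difference $t-s$ matters, exactly as the derivation shows.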


Even for WSS processes that are second-order stationary (or strictly stationary) random processes, little can be said about the specific forms of the distributions of the random variables. In short,




A WSS process is not necessarily stationary (to any order), and the mean and autocorrelation function of a WSS process are not enough to give a complete statistical description of the process.



Finally, suppose that a stochastic process is assumed to be a Gaussian process ("proving" this with any reasonable degree of confidence is not a trivial task). This means that for each $t$, $X(t)$ is a Gaussian random variable and for all positive integers $n \geq 2$ and choices of $n$ time instants $t_1$, $t_2$, $\ldots, t_n$, the $n$ random variables $X(t_1)$, $X(t_2)$, $\ldots, X(t_n)$ are jointly Gaussian random variables. Now a joint Gaussian density function is completely determined by the means, variances, and covariances of the random variables, and in this case, knowing the mean function $\mu_X(t) = E[X(t)]$ (it need not be a constant as is required for wide-sense-stationarity) and the autocorrelation function $R_X(t_1, t_2) = E[X(t_1)X(t_2)]$ for all $t_1, t_2$ (it need not depend only on $t_1-t_2$ as is required for wide-sense-stationarity) is sufficient to determine the statistics of the process completely.
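In practice this means any finite-dimensional distribution of a Gaussian process can be written down (and sampled) from $\mu_X$ and $R_X$ alone. A sketch, where the particular mean and autocorrelation functions are assumed for illustration:

```python
import numpy as np

# For a Gaussian process, the vector (X(t_1), ..., X(t_n)) is multivariate
# normal with mean mu_X(t_i) and covariance R_X(t_i, t_j) - mu_X(t_i)*mu_X(t_j),
# so mu_X and R_X determine every joint distribution.
# Assumed example: mu_X(t) = 0 and R_X(t1, t2) = exp(-|t1 - t2|).
rng = np.random.default_rng(3)

t = np.array([0.0, 0.5, 1.0, 2.0])              # any finite set of time instants
mu = np.zeros(t.size)                           # mean function at those instants
R = np.exp(-np.abs(t[:, None] - t[None, :]))    # autocorrelation matrix
cov = R - np.outer(mu, mu)                      # covariance (= R here: zero mean)

samples = rng.multivariate_normal(mu, cov, size=200_000)
print(np.cov(samples, rowvar=False)[0, 1])      # close to R[0, 1] = exp(-0.5)
```

No further information about the process is needed: every higher-order joint CDF follows from the same mean vector and covariance matrix construction.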


If the Gaussian process is a WSS process, then it is also a strictly stationary Gaussian process. Fortunately for engineers and signal processors, many physical noise processes can be well-modeled as WSS Gaussian processes (and therefore strictly stationary processes), so that experimental observation of the autocorrelation function readily provides all the joint distributions. Furthermore, since Gaussian processes retain their Gaussian character as they pass through linear systems, and the output autocorrelation function is related to the input autocorrelation function as $$R_Y = h*\tilde{h}*R_X$$ so that the output statistics can also be easily determined, WSS processes in general and WSS Gaussian processes in particular are of great importance in engineering applications.
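The relation $R_Y = h*\tilde{h}*R_X$ can be checked numerically in discrete time. With unit-variance white noise input, $R_X[k] = \delta[k]$, so $R_Y$ reduces to $h*\tilde h$, the deterministic autocorrelation of the impulse response; the short filter below is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(4)

# Discrete-time check of R_Y = h * h~ * R_X for an LTI system with a WSS
# input. With unit-variance white noise input, R_X[k] = delta[k], so the
# relation reduces to R_Y = h * h~ (h~ is the time-reversed impulse response).
h = np.array([1.0, 0.5, 0.25])           # assumed example impulse response

x = rng.standard_normal(1_000_000)       # white input, so R_X[k] = delta[k]
y = np.convolve(x, h, mode="full")[: x.size]

def sample_autocorr(y, max_lag):
    n = y.size
    return np.array([np.dot(y[: n - k], y[k:]) / n for k in range(max_lag)])

R_y_est = sample_autocorr(y, max_lag=4)
R_y_theory = np.convolve(h, h[::-1])[len(h) - 1 :]   # h * h~ at lags 0, 1, 2
print(R_y_est[:3])    # close to R_y_theory
print(R_y_theory)     # [1.3125, 0.625, 0.25]
```

The estimated output autocorrelation matches $h*\tilde h$ (up to Monte-Carlo error), and for a correlated input one would additionally convolve with $R_X$ as in the formula above.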

