(Thereβs no code in this post, just some examples to explain what weβre going to do next.)

Suppose we define the state of the system whose evolution we want to study by a probability vector $\mathbf{p}(t)$ β at any moment in time, we have a probability distribution over a finite partition of the state space of the system (so that if we partition the state space into $N$ components, then $\mathbf{p}(t) \in \mathbb{R}^N$). Evolution of the system as a Markov chain is then defined by the evolution rule

$\mathbf{p}(t + \Delta{}t) = \mathbf{M} \mathbf{p}(t), \qquad (1)$

where $\mathbf{M} \in \mathbb{R}^{N \times N}$ is a *Markov matrix*. This approach to modelling the evolution of probability densities has the benefit both of being simple to understand and to implement (in terms of estimating the matrix $\mathbf{M}$ from data) and, as weβll see, of allowing us to distinguish between random βdiffusiveβ evolution and conservative βnon-diffusiveβ dynamics.

Weβll see how this works by examining a very simple example.

First, there are a couple of properties that the matrix $\mathbf{M}$ has to satisfy in order to ensure that the evolution of $\mathbf{p}(t)$ according to $(1)$ is consistent with $\mathbf{p}(t)$ being a probability distribution^{1}. If we rewrite $(1)$ as

$p_i(t + \Delta{}t) = \sum_j M_{ij} p_j(t),$

then, writing the property that a vector $\mathbf{x}$ represents a probability distribution as $\mathcal{P}[\mathbf{x}]$ (i.e. $x_i \geq 0$ and $\sum_i x_i = 1$), the condition that $\mathcal{P}[\mathbf{p}(t)] \Rightarrow \mathcal{P}[\mathbf{p}(t + \Delta{}t)]$ requires that

$M_{ij} \geq 0 \text{ for all $i$, $j$} \qquad \sum_i M_{ij} = 1 \text{ for all $j$} \qquad \sum_j M_{ij} = 1 \text{ for all $i$}.$

A matrix satisfying these conditions is called a *doubly stochastic matrix*.

Weβll see later how we can estimate Markov matrices from data, but for now weβre going to look at a very simple but interesting decomposition of Markov matrix that will give us some insight into the type of dynamics that our system exhibits. All we do is split our Markov matrix $\mathbf{M}$ into its symmetric and antisymmetric parts:

$\mathbf{M}^S = \frac12 (\mathbf{M} + \mathbf{M}^T) \qquad \mathbf{M}^A = \frac12 (\mathbf{M} - \mathbf{M}^T)$

Weβll use some simple examples to see how the symmetric component picks out diffusive dynamics and the antisymmetric component conservative dynamics.

For a first example, letβs suppose we have partitioned the state space of our system into four cells, and we have purely deterministic dynamics, with a cycle of transitions between the cells in the partition that goes as $1 \to 2 \to 3 \to 4 \to 1$. This is represented by a Markov matrix $\mathbf{M}_1$:

$\mathbf{M}_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$

which has symmetric and asymmetric parts $\mathbf{M}_1^S$ and $\mathbf{M}_1^A$:

$\mathbf{M}_1^S = \frac12 \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix} \qquad \mathbf{M}_1^A = \frac12 \begin{pmatrix} 0 & -1 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ -1 & 0 & 1 & 0 \end{pmatrix}$

The matrix $\mathbf{M}_1^A + |\mathbf{M}_1^A|$ recovers the original periodic motion:

$\mathbf{M}_1^A + |\mathbf{M}_1^A| = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$

For the next example, letβs use the same state space partitioning and cycle of transitions, but letβs say that there is a probability of 0.7 that the system follows the cycle at each time step, with equal probabilities that it instead steps into any of the other βoff-cycleβ states:

$\mathbf{M}_2 = \begin{pmatrix} 0.1 & 0.1 & 0.1 & 0.7 \\ 0.7 & 0.1 & 0.1 & 0.1 \\ 0.1 & 0.7 & 0.1 & 0.1 \\ 0.1 & 0.1 & 0.7 & 0.1 \end{pmatrix}$

The symmetric and asymmetric parts of the Markov matrix are:

$\mathbf{M}_2^S = \frac12 \begin{pmatrix} 0.1 & 0.4 & 0.1 & 0.4 \\ 0.4 & 0.1 & 0.4 & 0.1 \\ 0.1 & 0.4 & 0.1 & 0.4 \\ 0.4 & 0.1 & 0.4 & 0.1 \end{pmatrix} \qquad \mathbf{M}_2^A = \begin{pmatrix} 0 & -0.3 & 0 & 0.3 \\ 0.3 & 0 & -0.3 & 0 \\ 0 & 0.3 & 0 & -0.3 \\ -0.3 & 0 & 0.3 & 0 \end{pmatrix}$

and we find that

$\mathbf{M}_2^A + |\mathbf{M}_2^A| = \begin{pmatrix} 0 & 0 & 0 & 0.6 \\ 0.6 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0 \\ 0 & 0 & 0.6 & 0 \end{pmatrix}$

again recovering the non-diffusive part of the dynamics.

And for a third example, letβs use the same state space partitioning but purely random dynamics:

$\mathbf{M}_2 = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad \mathbf{M}_3^S = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad \mathbf{M}_3^A = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$

Here $\mathbf{M}_3^A + |\mathbf{M}_3^A| = 0$.

Comparing the three cases, we see that $\mathbf{M}_3 = \mathbf{M}_3^S$, i.e.Β there is no antisymmetric component at all, while $\mathbf{M}_1^A$ and $\mathbf{M}_2^A$ are both non-zero and seem to represent the predictable cyclic dynamics present in the system: the magnitude of the elements in $\mathbf{M}_2^A$ are smaller than those in $\mathbf{M}_1^A$, reflecting the reduced predictability in the second case because of the random diffusive element to the dynamics. In each case, the matrix $\mathbf{M}^A + |\mathbf{M}^A|$ extracts the non-diffusive part of the system dynamics.

Next, weβre going to try to calculate Markov matrices to describe the transitions between different regimes of atmospheric flow and use this symmetric/antisymmetric decomposition (in particular the $\mathbf{M}^A + |\mathbf{M}^A|$ matrix) to detect predictable cycles of transitions.

Iβm not going to describe this stuff with any level of formality. If you want the real skinny, read Crommelinβs paper and the references within, R. A. Pasmanter & A. Timmermann (2003). Cyclic Markov chains with an application to an intermediate ENSO model. Nonlin. Processes Geophys. 10, 197-210 Β [PDF].β©