Non-diffusive atmospheric flow #13: Markov matrix examples

(There’s no code in this post, just some examples to explain what we’re going to do next.)

Suppose we define the state of the system whose evolution we want to study by a probability vector p(t)\mathbf{p}(t) — at any moment in time, we have a probability distribution over a finite partition of the state space of the system (so that if we partition the state space into N components, then p(t)RN\mathbf{p}(t) \in \mathbb{R}^N). Evolution of the system as a Markov chain is then defined by the evolution rule

p(t+Δt)=Mp(t),(1)\mathbf{p}(t + \Delta{}t) = \mathbf{M} \mathbf{p}(t), \qquad (1)

where MRN×N\mathbf{M} \in \mathbb{R}^{N \times N} is a Markov matrix. This approach to modelling the evolution of probability densities has the benefit both of being simple to understand and to implement (in terms of estimating the matrix M\mathbf{M} from data) and, as we’ll see, of allowing us to distinguish between random “diffusive” evolution and conservative “non-diffusive” dynamics.

We’ll see how this works by examining a very simple example.

First, there are a couple of properties that the matrix M\mathbf{M} has to satisfy in order to ensure that the evolution of p(t)\mathbf{p}(t) according to (1) is consistent with p(t)\mathbf{p}(t) being a probability distribution I’m not going to describe this stuff with any level of formality. If you want the real skinny, read Crommelin’s paper and the references within, R. A. Pasmanter & A. Timmermann (2003). Cyclic Markov chains with an application to an intermediate ENSO model. Nonlin. Processes Geophys. 10, 197-210   [PDF]. . If we rewrite (1) as

pi(t+Δt)=jMijpj(t),p_i(t + \Delta{}t) = \sum_j M_{ij} p_j(t),

then, writing the property that a vector x\mathbf{x} represents a probability distribution as P[x]\mathcal{P}[\mathbf{x}] (i.e. xi0x_i \geq 0 and ixi=1\sum_i x_i = 1), the condition that P[p(t)]P[p(t+Δt)]\mathcal{P}[\mathbf{p}(t)] \Rightarrow \mathcal{P}[\mathbf{p}(t + \Delta{}t)] requires that

Mij0 for all i, jiMij=1 for all jjMij=1 for all i.M_{ij} \geq 0 \text{ for all i, j} \qquad \sum_i M_{ij} = 1 \text{ for all j} \qquad \sum_j M_{ij} = 1 \text{ for all i}.

A matrix satisfying these conditions is called a doubly stochastic matrix.

We’ll see later how we can estimate Markov matrices from data, but for now we’re going to look at a very simple but interesting decomposition of Markov matrix that will give us some insight into the type of dynamics that our system exhibits. All we do is split our Markov matrix M\mathbf{M} into its symmetric and antisymmetric parts:

MS=12(M+MT)MA=12(MMT)\mathbf{M}^S = \frac12 (\mathbf{M} + \mathbf{M}^T) \qquad \mathbf{M}^A = \frac12 (\mathbf{M} - \mathbf{M}^T)

We’ll use some simple examples to see how the symmetric component picks out diffusive dynamics and the antisymmetric component conservative dynamics.

For a first example, let’s suppose we have partitioned the state space of our system into four cells, and we have purely deterministic dynamics, with a cycle of transitions between the cells in the partition that goes as 123411 \to 2 \to 3 \to 4 \to 1. This is represented by a Markov matrix M1\mathbf{M}_1:

M1=(0001100001000010)\mathbf{M}_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}

which has symmetric and asymmetric parts M1S\mathbf{M}_1^S and M1A\mathbf{M}_1^A:

M1S=12(0101101001011010)M1A=12(0101101001011010)\mathbf{M}_1^S = \frac12 \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix} \qquad \mathbf{M}_1^A = \frac12 \begin{pmatrix} 0 & -1 & 0 & 1 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 \\ -1 & 0 & 1 & 0 \end{pmatrix}

The matrix M1A+M1A\mathbf{M}_1^A + |\mathbf{M}_1^A| recovers the original periodic motion:

M1A+M1A=(0001100001000010).\mathbf{M}_1^A + |\mathbf{M}_1^A| = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.

For the next example, let’s use the same state space partitioning and cycle of transitions, but let’s say that there is a probability of 0.7 that the system follows the cycle at each time step, with equal probabilities that it instead steps into any of the other “off-cycle” states:

M2=(0.10.10.10.70.70.10.10.10.10.70.10.10.10.10.70.1)\mathbf{M}_2 = \begin{pmatrix} 0.1 & 0.1 & 0.1 & 0.7 \\ 0.7 & 0.1 & 0.1 & 0.1 \\ 0.1 & 0.7 & 0.1 & 0.1 \\ 0.1 & 0.1 & 0.7 & 0.1 \end{pmatrix}

The symmetric and asymmetric parts of the Markov matrix are:

M2S=12(0.10.40.10.40.40.10.40.10.10.40.10.40.40.10.40.1)M2A=(00.300.30.300.3000.300.30.300.30)\mathbf{M}_2^S = \frac12 \begin{pmatrix} 0.1 & 0.4 & 0.1 & 0.4 \\ 0.4 & 0.1 & 0.4 & 0.1 \\ 0.1 & 0.4 & 0.1 & 0.4 \\ 0.4 & 0.1 & 0.4 & 0.1 \end{pmatrix} \qquad \mathbf{M}_2^A = \begin{pmatrix} 0 & -0.3 & 0 & 0.3 \\ 0.3 & 0 & -0.3 & 0 \\ 0 & 0.3 & 0 & -0.3 \\ -0.3 & 0 & 0.3 & 0 \end{pmatrix}

and we find that

M2A+M2A=(0000.60.600000.600000.60)\mathbf{M}_2^A + |\mathbf{M}_2^A| = \begin{pmatrix} 0 & 0 & 0 & 0.6 \\ 0.6 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0 \\ 0 & 0 & 0.6 & 0 \end{pmatrix}

again recovering the non-diffusive part of the dynamics.

And for a third example, let’s use the same state space partitioning but purely random dynamics:

M2=14(1111111111111111)M3S=14(1111111111111111)M3A=(0000000000000000)\mathbf{M}_2 = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad \mathbf{M}_3^S = \frac{1}{4} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix} \qquad \mathbf{M}_3^A = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}

Here M3A+M3A=0\mathbf{M}_3^A + |\mathbf{M}_3^A| = 0.

Comparing the three cases, we see that M3=M3S\mathbf{M}_3 = \mathbf{M}_3^S, i.e. there is no antisymmetric component at all, while M1A\mathbf{M}_1^A and M2A\mathbf{M}_2^A are both non-zero and seem to represent the predictable cyclic dynamics present in the system: the magnitude of the elements in M2A\mathbf{M}_2^A are smaller than those in M1A\mathbf{M}_1^A, reflecting the reduced predictability in the second case because of the random diffusive element to the dynamics. In each case, the matrix MA+MA\mathbf{M}^A + |\mathbf{M}^A| extracts the non-diffusive part of the system dynamics.

Next, we’re going to try to calculate Markov matrices to describe the transitions between different regimes of atmospheric flow and use this symmetric/antisymmetric decomposition (in particular the MA+MA\mathbf{M}^A + |\mathbf{M}^A| matrix) to detect predictable cycles of transitions.