7

It's well known that, if $Q$ is the matrix of transient state transition probabilities, and $$ N = \sum_{n=0}^{\infty} Q^n = (I - Q)^{-1}$$ then $N_{ij}$ describes the expected number of times the chain is in state $j$, given that it starts in state $i$. (source: Wiki absorbing markov chain).

I'm looking for the expected number of times the chain is in state $j$, given that it starts in state $i$ $\textbf{and}$ eventually absorbs in state $k$.

Motivation:

I'm trying to model the spread of a mutant gene throughout a population, and to do this, I'm using a markov chain. The absorbing states are $0$ and $M$ to represent the gene dying or existing in every member of the population. I want to calculate the number of times a specific amount of people have this gene before it dies out, given that it dies out.

NRH
  • 16,580
  • 56
  • 68
Samuel Reid
  • 123
  • 8

1 Answers1

4

Using the Wikipedia notation, the full matrix of transition probabilities is written as $$P = \left(\begin{array}{cc} Q & R \\ 0 & I \end{array}\right)$$ with the states organized so that the last states are all absorbing. In the OP's example from population genetics, the states would be organized as $1, 2, \ldots, M-1, 0, M$.

If $\tau$ denotes the first time the Markov chain enters the set of absorbing states, the absorption probability for state $k$ given that the Markov chain starts in state $i$ is $$B_{ik} := P(X_{\tau} = k \mid X_0 = i).$$ Observe that if $X_n = j$ for a transient state $j$ then $\tau > n$, and by the time homogeneity of the Markov chain $$P(X_{\tau} = k \mid X_n = j) = B_{jk}.$$ From Wikipedia it follows that $$B_{ik} = (NR)_{ik}.$$

The expected number of visits to state $j$ conditionally on the chain starting in $i$ and being absorbed in $k$ is $$\xi_{ik}(j) = E\left( \sum_{n=0}^{\infty} 1(X_n = j) \ \middle| \ X_0 = i, X_{\tau} = k\right) = \sum_{n=0}^{\infty} P(X_n = j \mid X_0 = i, X_{\tau} = k) .$$ Now the probabilities in the infinite sum can be rewritten as follows \begin{align*} P(X_n = j \mid X_0 = i, X_{\tau} = k) & = \frac{P(X_{\tau} = k, X_n = j \mid X_0 = i)}{P(X_{\tau} = k \mid X_0 = i)} \\ & = \frac{P(X_{\tau} = k \mid X_n = j, X_0 = i)P(X_n = j \mid X_0 = i)}{P(X_{\tau} = k \mid X_0 = i)} \\ & = \frac{P(X_{\tau} = k \mid X_n = j)}{P(X_{\tau} = k \mid X_0 = i)} (Q^n)_{ij} \\ & = \frac{B_{jk}}{B_{ik}}(Q^n)_{ij}. \end{align*} From this we obtain the formula \begin{align*} \xi_{ik}(j) & = \frac{B_{jk}}{B_{ik}}\sum_{n=0}^{\infty} (Q^n)_{ij} = \frac{B_{jk}}{B_{ik}}N_{ij}, \end{align*} where the absorption probabilities $B_{ij}$ and $B_{ik}$ can be computed from $N$ and $R$ using the formula above.

NRH
  • 16,580
  • 56
  • 68