
Let $X=(x_1, x_2, \ldots, x_{20})$, where $x_i\sim N(0,1)$ and $x_i, x_j$ are independent $\forall i\neq j$.

What is the probability of obtaining a sample $X$ containing at least two consecutive values $x_i$ and $x_{i+1}$ such that $ \begin{cases} |x_{i}| & > & 1.5 \\ |x_{i+1}| & > & 1.5 \\ x_i x_{i+1} & < & 0 \end{cases} $ ?

will198
  • The first condition should be that $|x_i|,|x_{i+1}|>1.5$ and the second condition is that $x_i x_{i+1}<0$ – will198 Apr 09 '15 at 09:50
  • 1
    Each of your 20 variables has a chance of about 0.0668 to be over 1.5 and the same chance of being under -1.5 . This reduces your problem to a question about discrete (3-valued) variables which could be solved with the chain rule. It must be possible to program a function for this, with your limit (1.5) and number of consecutive variables (20) as input. Do you have notions of either R, SAS or js? – Dirk Horsten Apr 09 '15 at 13:36
  • $0$? $|1.5|=1.5$. Or is there a typo in the question? The probability that two numbers are $>1.5$ and their product is $<0$ is $0$. – Dmitry Rubanovich Apr 09 '15 at 09:25
  • With $x_i, x_{i+1}>|1.5|$ I mean that $x_i, x_{i+1}>1.5$ or $x_i, x_{i+1}< -1.5$, and with $x_i x_{i+1}<0$ I mean that one value is $>0$ and the other is $<0$. For example, $x_i=1.8$ and $x_{i+1}=-2$ fit both conditions. – will198 Apr 09 '15 at 09:45
  • Then it is a typo. It should say $|x_i|,|x_{i+1}|>1.5$. – Dmitry Rubanovich Apr 09 '15 at 10:08

1 Answer


Run a Markov chain.

Let a "flip" (at index $i$) be the event that $X_{i-1}$ and $X_{i}$ are of opposite signs and both exceed $1.5$ in size. As we scan across any realization of $(X_i)$ looking for flips, we can exploit the symmetry of the standard Normal distribution to describe the process with just four states:

  • The Start, before $X_1$ is observed.

  • Zero, where $-1.5 \le X_{i-1} \le 1.5$.

  • One, where $|X_{i-1}| \gt 1.5$.

  • Flipped, where a flip occurs at $i$.

Start transitions into the (mixed) state $$\mu = (1-2p, 2p, 0)$$

(corresponding to the chances of being in states (Zero, One, Flipped)) where $$p = \Pr(X_1 \lt -1.5) = \Pr(X_1 \gt 1.5) \approx 0.0668072.$$ Because Start is never seen again, let's not bother to track it any further.

Zero transitions into One with probability $2p$ (when $|X_i|\gt 1.5$) and otherwise stays at Zero.

One transitions into Flipped with probability $p$: this occurs when $|X_i| \gt 1.5$ and $X_i$ has the opposite sign of $X_{i-1}$. It also transitions back to One with probability $p$ when $|X_i| \gt 1.5$ and $X_i$ has the same sign as $X_{i-1}$. Otherwise it transitions to Zero.

Flipped is an absorbing state: once there, nothing changes regardless of the value of $X_i$.

The transition matrix for (Zero, One, Flipped) (ignoring the transient Start) is therefore

$$\mathbb{P} = \left( \begin{array}{ccc} 1-2 p & 2 p & 0 \\ 1-2 p & p & p \\ 0 & 0 & 1 \\ \end{array} \right)$$

After leaving the start state (and entering the mixed state $\mu$), $20-1$ transitions will be made in the scan for a flip. The desired probability therefore is the third entry (corresponding to Flipped) in $$\mu \cdot \mathbb{P}^{20-1} \approx 0.149045.$$
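The matrix power is easy to check numerically. Here is a short sketch (in R, the language of the simulation in this answer) that builds $\mu$ and $\mathbb{P}$ from the definitions above and applies the $20-1 = 19$ transitions directly:

```r
# Direct computation of mu . P^19 from the definitions above.
p  <- pnorm(-1.5)                        # Pr(X < -1.5) = Pr(X > 1.5) ~ 0.0668072
mu <- c(1 - 2*p, 2*p, 0)                 # Initial distribution over (Zero, One, Flipped)
P  <- rbind(c(1 - 2*p, 2*p, 0),
            c(1 - 2*p,   p, p),
            c(      0,   0, 1))          # Transition matrix
v  <- mu
for (i in seq_len(20 - 1)) v <- v %*% P  # 19 transitions after leaving Start
v[3]                                     # Probability of ending in Flipped: ~ 0.149045
```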


Computational Details

We don't need to do $18$ matrix multiplications to obtain $\mathbb{P}^{19}$. Instead, after diagonalizing

$$\mathbb{P} = \mathbb{Q}^{-1} \mathbb{E} \mathbb{Q},$$

the answer for any exponent $n$ (even huge ones) can be computed via just one matrix multiplication as

$$\mu\cdot\mathbb{P}^n = \left(\mu\cdot\mathbb{Q}^{-1}\right) \mathbb{E}^n \mathbb{Q}$$

with

$$\mu \cdot \mathbb{Q}^{-1} = \left(1,\frac{-4 p^2+p+1-\sqrt{(2-7 p) p+1}}{2 \sqrt{(2-7 p) p+1}},-\frac{-4 p^2+p+1+\sqrt{(2-7 p) p+1}}{2 \sqrt{(2-7 p) p+1}}\right),$$

$$\mathbb{Q} = \left( \begin{array}{ccc} 0 & 0 & 1 \\ \frac{\left(1+p+\sqrt{-7 p^2+2 p+1}\right) \left(3 p-1+\sqrt{-7 p^2+2 p+1}\right)}{8 p^2} & -\frac{1+p+\sqrt{-7 p^2+2 p+1}}{2 p} & 1 \\ \frac{\left(1+p-\sqrt{-7 p^2+2 p+1}\right) \left(3 p-1-\sqrt{-7 p^2+2 p+1}\right)}{8 p^2} & -\frac{1+p-\sqrt{-7 p^2+2 p+1}}{2 p} & 1 \\ \end{array} \right) $$

and

$$\mathbb{E}^n = \left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \left(\frac{1}{2} \left(1-p-\sqrt{-7 p^2+2 p+1}\right)\right)^n & 0 \\ 0 & 0 & \left(\frac{1}{2} \left(1-p+\sqrt{-7 p^2+2 p+1}\right)\right)^n \\ \end{array} \right)$$
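As a numerical sanity check on these closed forms, `eigen()` can perform the diagonalization (it returns right eigenvectors, i.e. $\mathbb{P} = \mathbb{V}\,\mathbb{E}\,\mathbb{V}^{-1}$, so $\mathbb{V}$ plays the role of $\mathbb{Q}^{-1}$); a sketch:

```r
# Numerical diagonalization of P, giving P^n for any n in one multiplication.
p  <- pnorm(-1.5)
mu <- c(1 - 2*p, 2*p, 0)
P  <- rbind(c(1 - 2*p, 2*p, 0),
            c(1 - 2*p,   p, p),
            c(      0,   0, 1))
e  <- eigen(P)                        # All three eigenvalues are real here
V  <- e$vectors
Pn <- function(n) V %*% diag(e$values^n) %*% solve(V)   # P^n = V E^n V^{-1}
(mu %*% Pn(20 - 1))[3]                # ~ 0.149045, as before
(mu %*% Pn(10000))[3]                 # Essentially 1: a flip eventually occurs
```

The second evaluation illustrates why the diagonalization is worthwhile: huge exponents cost no more than small ones, and because Flipped is absorbing, the probability tends to $1$ as the sequence lengthens.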

A million-iteration simulation (using R) supports this result. Its output,

     Mean       LCL       UCL 
0.1488040 0.1477363 0.1498717

estimates the answer as $0.1488$ with a confidence interval $[0.1477, 0.1499]$ that includes $0.149045$.

n <- 20                                         # Length of the sequence
n.iter <- 1e6                                   # Length of the simulation
set.seed(17)                                    # Start at a reproducible point
x <- rnorm(n.iter*n)                            # The X_i
y <- matrix(sign(x) * (abs(x) > 3/2), n, n.iter) # -1, 0, or 1: sign when |x| > 1.5
flips <- colSums(y[-1, ] * y[-n, ] == -1)       # Number of flips in each sequence
x.bar <- mean(flips >= 1)                       # Fraction of sequences with a flip
s <- sqrt(x.bar * (1-x.bar) / n.iter)           # Standard error of the mean
(c(Mean=x.bar, x.bar + c(LCL=-3,UCL=3) * s))    # The results
whuber
  • 2
    For the curious, the technique that Whuber exploited to get the exponents of the transition matrix is sometimes called "Diagonalization" in elementary linear algebra textbooks. – Sycorax Apr 09 '15 at 15:50