What do the vertical bars mean in the first and third formulae?

$$v_i|z_i=k,\mu_k\sim\mathcal{N}(\mu_k, \sigma^2)$$

$$P(z_i=k)=\pi_k$$

$$\pi|\alpha\sim \text{Dir}\left(\frac{\alpha}{K}\mathbf{1}_K\right)$$

$$\mu_k\sim H(\lambda)$$

This formula is originally from here.

gung - Reinstate Monica
Jiang Xiang
  • 3
    I think in this context the vertical bars can be read as "given that". So the first line would mean $v_{i}$ **given that** $z_{i} = k,\mu_{k}\sim N(\mu_{k},\sigma^{2})$. See also [this list](http://en.wikipedia.org/wiki/Vertical_bar#Mathematics) on Wikipedia. – COOLSerdash Jul 31 '14 at 20:28

1 Answer

The vertical bar is often called a 'pipe'. It is widely used in mathematics, logic and statistics, and is typically read as 'given that'. In probability and statistics it often indicates a conditional probability, but it can also indicate a conditional distribution. You can read it as 'conditional on'.

For example, the third line can be read as "pi, conditional on alpha, is distributed as Dirichlet...". The idea of a distribution conditional on something else taking a specific value is very common in statistics. Perhaps the most typical example is the $Y$ values being normally distributed conditional on $X$ in regression models (for an example, see my answer here: What is the intuition behind conditional Gaussian distributions?).
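
One way to get a feel for the "conditional on" reading is to treat the four lines in the question as a sampling recipe. The sketch below is only an illustration, not code from the linked source: the function name `sample_mixture` and all numeric settings are made up for the example, and because the question does not say what $H(\lambda)$ is, it is assumed here to be a normal distribution with mean 0 and standard deviation `lam`.

```python
import random

def sample_mixture(n, K, alpha, sigma, lam, seed=0):
    """Draw n observations by following the four lines of the model in order."""
    rng = random.Random(seed)

    # pi | alpha ~ Dir(alpha/K * 1_K): alpha is a fixed number passed in.
    # A Dirichlet draw is obtained by normalising independent Gamma draws.
    gammas = [rng.gammavariate(alpha / K, 1.0) for _ in range(K)]
    total = sum(gammas)
    pi = [g / total for g in gammas]

    # mu_k ~ H(lambda): ASSUMPTION, H is taken to be N(0, lam^2).
    mu = [rng.gauss(0.0, lam) for _ in range(K)]

    # P(z_i = k) = pi_k: each observation picks a cluster label.
    z = rng.choices(range(K), weights=pi, k=n)

    # v_i | z_i = k, mu_k ~ N(mu_k, sigma^2): once the label and that
    # cluster's mean are known, v_i is a normal draw around mu_k.
    v = [rng.gauss(mu[k], sigma) for k in z]
    return pi, mu, z, v

# Hypothetical settings chosen only for illustration.
pi, mu, z, v = sample_mixture(n=20000, K=3, alpha=3.0, sigma=0.5, lam=5.0)

# "Conditioning on z_i = k" amounts to keeping only the draws with label k.
k = max(set(z), key=z.count)  # the most populated cluster
v_given_k = [vi for vi, zi in zip(v, z) if zi == k]
mean_given_k = sum(v_given_k) / len(v_given_k)
print(mu[k], mean_given_k)
```

Restricting attention to the draws with label `k` gives a sample whose average should sit close to `mu[k]`, which is what the first line of the model states.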

gung - Reinstate Monica
  • I know the idea of conditional probability; what confused me here is that conditional probability is usually written as P(A|B), which is different from this case. And how should the first formula be interpreted? – Jiang Xiang Jul 31 '14 at 20:43
  • 2
    "The probability of $v_i$, given that $z_i = k$ and given the value $\mu_k$, is distributed normally with mean $\mu_k$ and variance $\sigma^2$." – Sycorax Jul 31 '14 at 20:48
  • Sorry about that, @JiangXiang. It doesn't have to be a conditional probability; it can just indicate a conditional distribution, e.g. I edited to clarify that. I didn't mention the 1st formula, because COOLSerdash had already discussed it. I also agree w/ user777's reading (they are the same). – gung - Reinstate Monica Jul 31 '14 at 20:53
  • 1
    It would clarify matters to define what you mean by a "conditional distribution" and explicitly distinguish it from a "conditional probability." The uses of "$|$" in the first and third lines are really quite different, despite the similarity of notation, so relying on *the very same word* "conditional" to distinguish them might be more confusing than enlightening. – whuber Jul 31 '14 at 21:04
  • Thanks. For the particular problem in [Dirichlet Process](http://en.wikipedia.org/wiki/Dirichlet_process), in the first formula $$v_i|z_i=k,\mu_k\sim\mathcal{N}(\mu_k, \sigma^2)$$ it would be safe to ignore the $\mu_k$: $$v_i|z_i=k\sim\mathcal{N}(\mu_k, \sigma^2)$$ because as long as $z_{i}$ belongs to the k-th cluster, its mean should be $\mu_k$. Is it necessary to put the $\mu_k$ here? Or are there other reasons for this? – Jiang Xiang Jul 31 '14 at 21:07
  • @whuber How can I tell the difference between the vertical bars in the first and third formulae? – Jiang Xiang Jul 31 '14 at 21:11
  • Context: you know what the variables mean. When the stuff to the right of the bar is not a random variable, then the bar is not denoting a conditional distribution: it is merely a shorthand for a *parameter* of a distribution. – whuber Jul 31 '14 at 21:29