Which notation and why: $\text{P}()$, $\Pr()$, $\text{Prob}()$, or $\mathbb{P}()$

Question

Are these merely stylistic conventions (whether italicized or non-italicized), or are there substantive differences in the meanings of these notations?

Are there other notations meaning "the probability of" that should be considered in this question?

I feel like I see $\mathbb{P}()$ more in the context of measure-theoretic probability. – TrynnaDoStat, Jul 18 '14 at 17:45

score 23 · Accepted Answer · edited Dec 29 '20 at 17:22

Stylistic conventions, mainly, but with some underlying rationale.

$\mathbb{P}()$ and $\Pr()$ can be seen as two ways to "free up" the letter $\text{P}$ for other use: it is used to denote other things than "probability", for example in research with complicated and extensive notation where one starts to exhaust the available letters.

$\mathbb{P}()$ requires special fonts, which is a disadvantage. $\Pr()$ may be useful when the author wants the reader to think of probability in abstract and general terms, using the lower-case second letter "$r$" to dissociate the symbol as a whole from the usual way we write functions.
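
As a rough illustration of the typesetting side (a sketch, not taken from the answer itself): in LaTeX, `\Pr` is a built-in operator, `\mathbb` needs the `amssymb` (or `amsfonts`) package, and `\text` and `\DeclareMathOperator` come from `amsmath`. The macro name `\Prob` below is a hypothetical choice made only for this example.

```latex
\documentclass{article}
\usepackage{amsmath}   % provides \text and \DeclareMathOperator
\usepackage{amssymb}   % provides \mathbb (blackboard bold)

% Hypothetical macro name, chosen here only for illustration
\DeclareMathOperator{\Prob}{Prob}

\begin{document}
\begin{align*}
  &\text{P}(A)     && \text{upright P}\\
  &P(A)            && \text{italic P}\\
  &\Pr(A)          && \text{built-in operator}\\
  &\Prob(A)        && \text{user-defined operator}\\
  &\mathbb{P}(A)   && \text{blackboard bold, needs amssymb}
\end{align*}
\end{document}
```

Declaring an operator, rather than typing `Prob` directly in math mode, keeps the name upright and gives it operator spacing.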

For example, some problems are solved once one remembers that the cumulative distribution function of a random variable can be written and treated as the probability of an "inequality-event", and then applies the basic probability rules rather than functional analysis.
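
To make the "inequality-event" remark concrete, here is a short worked example (the symbols $F_X$, $a$, and $b$ are introduced here for illustration only). Writing the cumulative distribution function as a probability,

$$F_X(x) = \Pr(X \le x),$$

the event $\{X \le b\}$ splits into the disjoint events $\{X \le a\}$ and $\{a < X \le b\}$ whenever $a < b$, so additivity alone gives

$$\Pr(a < X \le b) = \Pr(X \le b) - \Pr(X \le a) = F_X(b) - F_X(a).$$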

In some cases, one may also see $\text{Prob}()$, again usually at the beginning of an argument that will end up in a specific formulation of how this probability is functionally determined.

The italic version $P()$ is also used, as is the lower-case form $p()$; the latter is used especially when discussing discrete random variables (where the probability mass function is a probability).

$\pi(\;,\;)$ is used for conditional ("transition") probabilities in Markov Theory.
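
One possible reading of this notation, as a sketch only (the chain $X_n$, the states $x$ and $y$, and the argument order are assumptions made here, and conventions vary by author):

$$\pi(x, y) = \Pr(X_{n+1} = y \mid X_n = x).$$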

edited Dec 29 '20 at 17:22

Alexis

answered Jul 18 '14 at 18:15

Alecos Papadopoulos

Thank you, I have included $\text{Prob}()$ in an edit to my question. Also: "it *is* used to denote other things than 'probability'" say it ain't so! ;) I think also that $\pi$ is sometimes used to describe the parameter corresponding to $p$ in a PMF. – Alexis Jul 18 '14 at 18:21
Well, Alexis, GASP indeed, but this is why, when reading a paper, one should never skip its preparatory sections: that is where the author defines the symbolic language he will use, and if he doesn't, he is sloppy. – Alecos Papadopoulos Jul 18 '14 at 19:13
I disagree on one point: I have mostly seen $p()$ used for a _continuous_ random variable, the thinking being that its probability density function evaluated at a point is similar to but distinct from the probability mass function of a discrete random variable evaluated at a point, which is a probability and can be denoted by $P()$. It is also my impression that $P()$ is more common than $\text{P}()$. – Nagel Jul 23 '14 at 13:08
@Nagel That's interesting. In which field? – Alecos Papadopoulos Jul 23 '14 at 16:05
@AlecosPapadopoulos: I am sure I have seen it repeatedly in statistical machine learning; I thought I had seen it in pure statistics texts too, but I shan't say for certain. – Nagel Jul 25 '14 at 09:46
@Nagel I wouldn't know about machine learning, but in information theory and econometrics, lower-case $p(x)$ usually denotes a probability mass function. $p$ alone usually denotes a specific probability. $P()$ is the general symbol for the probability of any $()$. Probability density functions in mathematical statistics are predominantly denoted by $f(x)$ or, better, $f_{X}(x)$, and by other function-related symbols like $g$ or $h$. – Alecos Papadopoulos Jul 25 '14 at 11:11
@AlecosPapadopoulos I'm also used to seeing $p(x)$ used to denote a density function, and $P$ to denote a probability mass. This seems a common convention in physics (in my experience, at least). – Will Vousden Feb 04 '16 at 11:58
@WillVousden Yes, notational conventions are not consistent across disciplines, in many cases. – Alecos Papadopoulos Feb 04 '16 at 13:32

score 5 · Answer 2 · answered Jul 18 '14 at 18:10

I've seen all three used in different undergrad classes and as far as I know, they're stylistic differences and all represent probability as you're thinking of it.

One other notation I've seen is in Sheldon Ross's "Introduction to Probability Theory", where $\mathbf{P}$ represents a probability matrix. He also uses $\pi_i$ as a notation for limiting probability, which a sequence of probabilities $(p_i)$ converges to.
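
As a sketch of how the two symbols commonly fit together (assuming the limit exists and does not depend on the starting state $i$; the index names are chosen here for illustration), the limiting probability of state $j$ can be written in terms of the $n$-step transition matrix:

$$\pi_j = \lim_{n \to \infty} \left(\mathbf{P}^n\right)_{ij}.$$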

Brandon Sherman

Would it be fair to say that $\pi$ and $p$ in the sense that you are referring to correspond to parameters and estimates of, say, a Bernoulli or binomial distribution? – Alexis Jul 18 '14 at 18:18
I pretty much always see $\theta$ used to represent a parameter in one of those distributions. Occasionally I've seen $p$ used as the parameter, but never $\pi$. I've never seen $\pi$ used outside the context of limiting probabilities. I'm not sure but I think it fits into the whole "use English letters for statistics and Greek letters for parameters" paradigm. – Brandon Sherman Jul 18 '14 at 18:26
And yet $p$ is a Latin (not English) letter (i.e. statistic), and $\pi$ is a Greek letter (i.e. parameter?). – Alexis Jul 18 '14 at 18:39
Depends on the context. I've only seen $\pi$ used in the context of limiting probabilities in stochastic processes. In that specific situation, the $p$s converge to $\pi$ as $n \rightarrow \infty$. – Brandon Sherman Jul 18 '14 at 18:44
You misunderstand my comment: the Latin-ness or Greek-ness of those letters does not in any way depend on context. – Alexis Jul 18 '14 at 19:03
Oh my bad. Yeah, obviously $p$ is Latin and $\pi$ is Greek. But the analogy I was trying to make is that $\bar{x} \to \mu$ as $n \to \infty$, and $\bar{x}$ is Latin and $\mu$ is Greek. Similarly, in stochastic processes, $p \to \pi$ as $n \to \infty$ and $p$ is Latin and $\pi$ is Greek. – Brandon Sherman Jul 19 '14 at 19:53

score 2 · Answer 3 · Taylor · 2020-12-29T17:31:10.020

This makes me think of Meyn and Tweedie's book. They use $P$ to denote the transition kernel for a Markov chain, and $\mathsf{P}$ for the law of the entire chain on $\mathsf{X}^{\infty}$. This answer is specific to Markov chains, but the distinction is obviously important.

The difference between $P$ and $\mathbb{P}$ (and $E$ and $\mathbb{E}$) from book to book is just a matter of aesthetic appeal, in my opinion. I can't generalize where I see $\Pr()$ or $\text{Prob}()$, really.

edited Dec 29 '20 at 17:31

answered Dec 29 '20 at 17:28

Taylor

+1 And, as with @29740 's comments, the italic-serif/Roman sans-serif somewhat parallels the Latin/Greek convention for estimate/estimand. – Alexis Dec 29 '20 at 17:30
