33

Applied probability is an important branch of probability, including computational probability. Since statistics, as I understand it, uses probability theory to construct models for dealing with data, I am wondering: what is the essential difference between a statistical model and a probability model? Does a probability model not need real data? Thanks.

David J.
Honglang Wang

1 Answer

31

A Probability Model consists of the triplet $(\Omega,{\mathcal F},{\mathbb P})$, where $\Omega$ is the sample space, ${\mathcal F}$ is a $\sigma$-algebra (the events) and ${\mathbb P}$ is a probability measure on ${\mathcal F}$.

Intuitive explanation. A probability model can be interpreted as a known random variable $X$. For example, let $X$ be a Normally distributed random variable with mean $0$ and variance $1$. In this case the probability measure ${\mathbb P}$ is associated with the Cumulative Distribution Function (CDF) $F$ through

$$F(x)={\mathbb P}(X\leq x) = {\mathbb P}(\omega\in\Omega:X(\omega)\leq x) =\int_{-\infty}^x \dfrac{1}{\sqrt{2\pi}}\exp\left({-\dfrac{t^2}{2}}\right)dt.$$
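As a rough numerical illustration of the formula above, the sketch below evaluates $F(x)$ for the standard Normal in two ways: through the error function and by integrating the density directly. The function names are hypothetical, and the lower cutoff of $-10$ is an assumption standing in for $-\infty$.

```python
import math


def normal_cdf(x: float) -> float:
    """F(x) = P(X <= x) for X ~ Normal(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_cdf_by_integration(x: float, lower: float = -10.0, steps: int = 100_000) -> float:
    """Trapezoidal approximation of the integral of the N(0, 1) density on [lower, x].

    `lower` is an assumed finite stand-in for -infinity; the mass below -10 is negligible.
    """
    if x <= lower:
        return 0.0

    def density(t: float) -> float:
        return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

    h = (x - lower) / steps
    total = 0.5 * (density(lower) + density(x))
    for i in range(1, steps):
        total += density(lower + i * h)
    return total * h


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        print(x, normal_cdf(x), normal_cdf_by_integration(x))
```

Both functions describe the same fully specified measure ${\mathbb P}$: nothing is estimated, which is the sense in which the random variable is "known".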

Generalisations. The definition of Probability Model depends on the mathematical definition of probability; see, for example, Free probability and Quantum probability.

A Statistical Model is a set ${\mathcal S}$ of probability models, that is, a set of probability measures/distributions on the sample space $\Omega$.

This set of probability distributions is usually selected for modelling a certain phenomenon from which we have data.

Intuitive explanation. In a Statistical Model, the parameters and the distribution that describe a certain phenomenon are both unknown. An example of this is the family of Normal distributions with mean $\mu\in{\mathbb R}$ and variance $\sigma^2\in{\mathbb R_+}$; that is, both parameters are unknown and you typically want to use the data set to estimate the parameters (i.e. to select an element of ${\mathcal S}$). This set of distributions can be chosen on any $\Omega$ and ${\mathcal F}$, but, if I am not mistaken, in a real example only those defined on the same pair $(\Omega,{\mathcal F})$ are reasonable to consider.
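To make "selecting an element of ${\mathcal S}$" concrete, here is a minimal sketch that picks one member of the Normal family from data. Maximum likelihood is used only as one possible selection rule (an assumption, not something the definition requires), and the "true" values, sample size and seed are arbitrary choices for the simulation.

```python
import random
import statistics


def fit_normal_mle(data):
    """Select (mu, sigma^2) from the Normal family by maximum likelihood.

    The MLE of the variance divides by n, not n - 1.
    """
    mu_hat = statistics.fmean(data)
    var_hat = sum((x - mu_hat) ** 2 for x in data) / len(data)
    return mu_hat, var_hat


# Simulated data set; in practice these values would be unknown to the analyst.
rng = random.Random(0)           # arbitrary seed, for reproducibility
true_mu, true_sigma = 2.0, 3.0   # assumed values used only to generate data
data = [rng.gauss(true_mu, true_sigma) for _ in range(5000)]

mu_hat, var_hat = fit_normal_mle(data)

if __name__ == "__main__":
    print("selected element of S: Normal(mu=%.3f, sigma^2=%.3f)" % (mu_hat, var_hat))
```

The output of `fit_normal_mle` is a single probability model; the statistical model is the whole family it was chosen from.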

Generalisations. This paper provides a very formal definition of Statistical Model, but the author mentions that "Bayesian model requires an additional component in the form of a prior distribution ... Although Bayesian formulations are not the primary focus of this paper". Therefore the definition of Statistical Model depends on the kind of model we use: parametric or nonparametric. In the parametric setting, the definition also depends on how the parameters are treated (e.g. Classical vs. Bayesian).

The difference is: in a probability model you know the probability measure exactly, for example a $\mbox{Normal}(\mu_0,\sigma_0^2)$, where $\mu_0,\sigma_0^2$ are known parameters, while in a statistical model you consider sets of distributions, for example $\mbox{Normal}(\mu,\sigma^2)$, where $\mu,\sigma^2$ are unknown parameters.
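In symbols, the contrast might be sketched as follows. The index set $\Theta$ and the notation ${\mathbb P}_\theta$ are introduced here for illustration only and assume the parametric case:

$$\text{probability model: } (\Omega,{\mathcal F},{\mathbb P}_{\theta_0}) \text{ with } \theta_0 \text{ known}; \qquad \text{statistical model: } {\mathcal S}=\{{\mathbb P}_\theta : \theta\in\Theta\} \text{ with } \theta \text{ unknown}.$$

For the Normal example, $\theta=(\mu,\sigma^2)$ and $\Theta={\mathbb R}\times{\mathbb R}_+$.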

Neither of them requires a data set, but I would say that a Statistical Model is usually selected for modelling one.

Xi'an
  • So probability models are about studying the properties of models mathematically, while statistical models are about selecting the proper probability model using the data. Right? – Honglang Wang Jun 23 '12 at 20:00
  • 2
    @HonglangWang That is correct to some extent. The main difference is that a probability model is only one (known) distribution, while a statistical model is a set of probability models; the data is used to select a model from this set or a smaller subset of models that better (in a certain sense) describe the phenomenon (in the light of the data). –  Jun 23 '12 at 20:52
  • 2
    (+1) This is a nice answer, though I have a couple of comments. First, I think this may be selling the probabilist a little bit short. It is not at all uncommon to consider a set of probability spaces in a probabilistic model, and indeed, the possible measures can even be *random* (constructed on a suitably larger space). Second, a Bayesian (in particular) might find this answer slightly disconcerting in that a Bayesian statistical model can often be viewed as a single probability model on a suitable product space $\Omega \times \Theta$. – cardinal Jun 24 '12 at 01:04
  • @gung I have modified my answer a bit to try to address some of your queries. It is difficult to define these models for all sorts of audiences. For example, cardinal is asking for more formal definitions while you prefer a more intuitive one. I tried to include comments in both directions, but it is not intended at all to fully cover this topic (which is quite extensive and even controversial). –  Jun 24 '12 at 19:48
  • @Procrastinator, I do think you have done a good job, & I certainly recognize how explaining this stuff can be very difficult when your audience has a bimodal distribution. I appreciate the clarifications; I still wonder a little about the normal: so $\mathbb P$ is the normal CDF, am I right that $\Omega$ is the real number line? What is $\mathcal F$ for the normal? That's the one that's the most mysterious to me. – gung - Reinstate Monica Jun 24 '12 at 19:54
  • 1
    @gung This is more of a measure-theory question. Regarding your first question, ${\mathbb P}$ is indeed defined through the CDF. Now, the interpretation of $\Omega$ is the difficult part because, formally, ${\mathbb P}(X\leq x)$ means ${\mathbb P}(\omega\in\Omega: X(\omega)\leq x)$, so the elements of $\Omega$ are not observable values. ${\mathcal F}$ is a $\sigma$-algebra which is the pre-image of the Borel $\sigma$-algebra under $X$; again, these sets are not observable. I am not sure how to explain this at an intuitive level. –  Jun 24 '12 at 20:04
  • 1
    I see, that does actually help a little. Thanks again, +1. – gung - Reinstate Monica Jun 24 '12 at 20:09
  • 1
    @gung: You might consider formulating your questions and doubts and posting them as a separate question if there is not already a duplicate available. Cheers. – cardinal Jun 24 '12 at 20:38
  • 2
    @gung $\Omega$ depends on the *application*; it is not determined by theory. For instance, $\Omega$ could be a set of Brownian motions describing the price of a financial derivative and $X$ could be the value attained at a fixed time $t$. In another application $\Omega$ could be a set of people and $X$ could be the lengths of their forearms. Generally, $\Omega$ is a mathematical model of the *physical* objects of study and $X$ is a numerical property of those objects. $\mathcal{F}$ is the set of possible events: those situations to which we want to ascribe probabilities. – whuber Jul 18 '12 at 18:51
  • @whuber, +1, taking the normal distribution case, would that mean $\mathcal F$ is the real number line? – gung - Reinstate Monica Jul 18 '12 at 19:13
  • 2
    @gung $\mathcal{F}$ is a *sigma algebra*: it's a collection of subsets (the "events"). In the financial application, it's a set of price histories; in the forearm measurements application, the events would be sets of *people.* We can talk about this more if you want in a chat room. – whuber Jul 18 '12 at 20:01