2

Illustrative example:

The pgf of $X \sim \mathrm{Bin}(m,p_1)$ is $G_1(z)=[(1-p_1)+p_1z]^m$. If $m \sim \mathrm{Bin}(n,p_2)$, substituting the per-trial factor $(1-p_1)+p_1z$ into the pgf of $m$ gives the compound pgf $G_2(z)=[(1-p_2)+p_2\{(1-p_1)+p_1z\}]^n=[(1-p_1p_2)+p_1p_2z]^n$, which is the pgf of a $\mathrm{Bin}(n,p_1p_2)$ distribution.
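A quick numerical check of this identity (my sketch, not from the question; it assumes numpy and scipy are available, and the parameter values are arbitrary):

```python
# Simulate the compound: draw m ~ Bin(n, p2), then x ~ Bin(m, p1),
# and compare the empirical frequencies with a direct Bin(n, p1*p2).
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
n, p1, p2 = 20, 0.6, 0.7
m = rng.binomial(n, p2, size=200_000)  # random number of inner trials
x = rng.binomial(m, p1)                # compound count

empirical = np.bincount(x, minlength=n + 1) / x.size
theoretical = binom.pmf(np.arange(n + 1), n, p1 * p2)
print(np.abs(empirical - theoretical).max())  # small: Monte Carlo error only
```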

Mathematically the final answer is still a valid distribution for $s=p_1 p_2$ as long as $0<s<1$. This can hold even when $p_i<0$ or $p_i>1$ individually, so long as the product $s$ lies strictly between 0 and 1.

For example, if $p_1=p_2=-\frac{1}{\sqrt 2}$, then both component distributions alternate between positive and negative 'probabilities', yet their compound has $s=p_1 p_2=\frac{1}{2}$, equivalent to flipping a fair coin $n$ times.
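To make the alternating signs concrete, here is a small sketch (mine, not from the question) that evaluates the signed 'pmf' coefficients of $[(1-p)+pz]^m$ at $p=-1/\sqrt 2$:

```python
# Signed 'pmf' of Bin(m, p) at p = -1/sqrt(2): coefficients of [(1-p)+p z]^m.
# Signs alternate with k, yet the values sum to 1 (binomial theorem),
# and the compound parameter s = p*p = 1/2 is an ordinary fair coin.
import math

p = -1 / math.sqrt(2)
m = 4  # arbitrary small number of trials, for display
signed_pmf = [math.comb(m, k) * p**k * (1 - p)**(m - k) for k in range(m + 1)]
print(signed_pmf)       # alternating positive/negative values
print(sum(signed_pmf))  # 1.0 (up to floating-point error)
print(p * p)            # 0.5, i.e. s = 1/2
```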

This reminds me of improper priors, where the 'probabilities' are nonnegative but do not sum (or integrate) to 100%. Here we have the reverse: a "distribution" whose sum is 100% but whose individual probabilities may be negative.

If improper priors are allowed when the posterior is valid, are negative probabilities allowed when the final answer is a valid distribution?

Silverfish
sheppa28
  • I'm negative 70.7% confident that $p_{\{1,2\}}=-\frac{1}{\sqrt 2}$ is valid. But see also my answer at http://stats.stackexchange.com/questions/4220/can-a-probability-distribution-value-exceeding-1-be-ok/160979#160979. – Mark L. Stone Sep 01 '16 at 01:48
  • 1
    I haven't looked into this personally but apparently negative probabilities are taken seriously by some. A guy named Gabor Szekely came up with something called a "half coin" which involves negative probabilities; might be worth checking that out. – dsaxton Sep 01 '16 at 01:49
  • 2
    Isn't positivity one of the axioms underlying probability theory? – shadowtalker Sep 01 '16 at 02:18
  • It's unclear what you might mean by "allowed": what is the statistical problem you are addressing? – whuber Sep 01 '16 at 21:46
  • As you said, you can get a binomial distribution by counting results from a Bernoulli process, and you can compute the pgf of that binomial using the probability of success in the Bernoulli process. However, the fact that you can get a valid distribution from the binomial formula with negative parameters doesn't mean that you can get a Bernoulli process with probabilities equal to those negative parameters. It works the other way around, but not this way. In fact, you have proved that it doesn't work by finding a counter-example that violates one axiom of probability, as @ssdecontrol pointed out. – Pere Sep 01 '16 at 21:49
  • I agree that positivity is an axiom of probability, and this violates that axiom. The statistical problem I'm wondering about is: why is violating another axiom, that the total probability equals 100%, accepted in practice (improper priors are used when the posterior is proper), while violating the positivity axiom is not allowed even when the same valid-posterior condition can be met? – sheppa28 Sep 02 '16 at 02:43

1 Answer

1

You can formulate this in terms of the following experiment: you flip two biased coins, with probabilities of heads $p_1$ and $p_2$ respectively, $n$ times. You then ask: how many simultaneous occurrences of heads happened? This is answered by the binomial distribution with parameters $n$ and $p_1p_2$. In this line of reasoning you are actively using the axioms of probability, which assume $0\leq p_1,p_2\leq 1$, along with assumptions of independence. However, you're right: if both $p$'s are negative, then $p_1p_2>0$ and it might still be a valid parameter for the binomial. A kind of analytic continuation for probability.
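As a sanity check of that two-coin reading (a sketch under the stated assumptions, with arbitrary valid parameters):

```python
# Count rounds in which both coins land heads; that count is Bin(n, p1*p2).
import numpy as np

rng = np.random.default_rng(1)
n, p1, p2, reps = 30, 0.4, 0.5, 100_000
heads1 = rng.random((reps, n)) < p1
heads2 = rng.random((reps, n)) < p2
both = (heads1 & heads2).sum(axis=1)  # simultaneous heads per experiment

print(both.mean(), n * p1 * p2)                 # ~6.0 vs 6.0
print(both.var(), n * p1 * p2 * (1 - p1 * p2))  # ~4.8 vs 4.8
```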

Now suppose you assert $p_1,p_2<0$ as a prior. This invalidates either prior as a probability distribution, which brings the entire experiment into question. Or maybe your friend flipped the first coin $n$ times and reported $p_1$, the number of heads over the total, as a negative number. This might be made valid if you extended the theory of the experiment to include a prior on your friend's report, which allows for them drunkenly reporting a negative $p_1$ value.

Here's the point. You are saying that $P:=p_1p_2$ is a kind of random variable, with some definition of priors on $p_1$ and $p_2$. You're taking that random variable and plugging it into a binomial distribution in the hope of getting a sensible answer. Whenever $0\leq P\leq 1$, you'll get such a sensible answer. Otherwise, you get nonsense.
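A minimal sketch of the 'nonsense' case (my illustration, not part of the answer): when the product $P$ falls outside $[0,1]$, the same formula still produces coefficients that sum to 1, but they are not probabilities.

```python
# Coefficients of [(1-P) + P z]^n; a genuine pmf only when 0 <= P <= 1.
import math

def signed_binomial(n, P):
    return [math.comb(n, k) * P**k * (1 - P)**(n - k) for k in range(n + 1)]

bad = signed_binomial(5, -0.81)  # e.g. p1 = 0.9 and p2 = -0.9 give P = -0.81
print(bad)                       # alternating signs: not a distribution
print(sum(bad))                  # still 1.0, by the binomial theorem
```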

To resolve such issues, we usually say that $0\leq p_1,p_2\leq 1$ holds with probability 1; any other option has probability 0. Conditioning on events that have zero probability is a big no-no in introductory probability, and is formally defined later on using measure theory and conditional expectation.

Alex R.