9

This is more of a philosophical question, but from a purely Bayesian standpoint, how does one actually form prior knowledge? If we need prior information to carry out valid inferences, then there seems to be a problem when we have to appeal to past experience to justify today's priors. We're apparently left with the same question about how yesterday's conclusions were valid, and a kind of infinite regress seems to follow in which no knowledge is ever warranted. Does this mean that ultimately prior information must be assumed in an arbitrary way, or perhaps based on a more "frequentist" style of inference?

dsaxton
  • 11,397
  • 1
  • 23
  • 45
  • What about prior knowledge coming from past experiments? – Christoph Hanck Mar 14 '16 at 20:02
  • That's the question I'm asking. How did *those* experiments generate actual knowledge? – dsaxton Mar 14 '16 at 20:04
  • You yourself are the product of a stochastic probabilistic process. (Unless you have much higher prior belief in your specific divine creation than I). So I suppose, yes, possessing consciousness and having the ability to reason and incarnating as the dsaxton who chose Pi Day to ask this question can be seen as being based on something like an infinite sequence of possible repetitions over an infinite multiverse. But that's probably overthinking it. – Dalton Hance Mar 14 '16 at 20:06
  • @dsaxton, for example by producing estimates of, say, the effectiveness of a drug, which we then seek to update via a new sample. – Christoph Hanck Mar 15 '16 at 06:54
  • Right, but does this then reduce to a kind of "cumulative" frequentism? – dsaxton Mar 15 '16 at 13:06
  • The prior captures what you know about the phenomenon. For instance, you went to school many years ago. Now you have to estimate the mean height of 2nd grade students in your local school district. Based on your own experience, you think that the height is probably "around 4 feet". So you could capture your prior as a Gaussian distribution $\mathcal{N}(4,0.01)$, or $4\pm 0.1$ feet. You can use this prior before getting the data and estimating the mean height. The problem in practical settings is that once you have seen the data, you can't re-use the same prior, because your beliefs have already changed. – Aksakal Mar 15 '16 at 16:55
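A minimal sketch of the conjugate normal updating described in the last comment above (the height example is Aksakal's; the measurement spread and the data values below are assumptions added for illustration):

```python
import numpy as np

# Hypothetical illustration of the comment above: a N(4, 0.1^2) prior on the
# mean height (in feet), updated with a made-up classroom sample. The
# observation spread and the sample values are assumptions for this sketch.
prior_mean, prior_sd = 4.0, 0.1                 # prior belief: "around 4 feet"
obs_sd = 0.3                                     # assumed known spread of heights
heights = np.array([4.1, 3.9, 4.3, 4.0, 4.2])   # made-up sample of 2nd graders

n = len(heights)
prior_prec = 1.0 / prior_sd**2                   # precision = 1 / variance
data_prec = n / obs_sd**2

# Conjugate normal-normal update: precisions add, and the posterior mean is a
# precision-weighted average of the prior mean and the sample mean.
post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean + data_prec * heights.mean()) / post_prec
post_sd = post_prec ** -0.5

print(f"posterior for mean height: N({post_mean:.2f}, {post_sd:.3f}^2)")
```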

2 Answers

9

Speaking of prior knowledge can be misleading; that is why you often see people speak instead of prior beliefs. You do not need any prior knowledge to set up a prior. If you did, how would Longley-Cook have managed with his problem?

Here is an example from the 1950s, when Longley-Cook, an actuary at an insurance company, was asked to price the risk of a mid-air collision between two planes, an event which, as far as he knew, had never happened before. The civilian airline industry was still very young but rapidly growing, and all Longley-Cook knew was that there had been no collisions in the previous five years.

The lack of data about mid-air collisions was not an obstacle to assigning a prior that led to fairly accurate conclusions, as described by Markus Gesmann. This is an extreme example of insufficient data and no prior knowledge, but in most real-life situations you would have some out-of-data beliefs about your problem that can be translated into priors.
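As a rough illustration of how such a prior could be encoded (a sketch with assumed hyperparameters, not necessarily the model Longley-Cook or Gesmann actually used), one can put a Gamma prior on a Poisson collision rate and update it with the five collision-free years:

```python
# Hypothetical sketch (assumed hyperparameters, not Longley-Cook's actual
# numbers): a Gamma(a, b) prior on the yearly rate of mid-air collisions,
# modelled as a Poisson process, updated with five collision-free years.
a, b = 0.5, 1.0                   # weak prior: mean rate a / b = 0.5 per year
years, collisions = 5, 0          # all that was known: no collisions in 5 years

# Gamma-Poisson conjugate update
a_post, b_post = a + collisions, b + years

print("posterior mean rate:", a_post / b_post)   # expected collisions per year

# The posterior predictive for next year is negative binomial; the chance of
# seeing no collision is (b_post / (b_post + 1)) ** a_post.
p_none = (b_post / (b_post + 1.0)) ** a_post
print("P(at least one collision next year):", 1 - p_none)
```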

There is a common misconception that priors need to be somehow "correct" or "unique". In fact, you can purposefully use "incorrect" priors to validate different beliefs against your data. Such an approach is described by Spiegelhalter (2004), who shows how a "community" of priors (e.g. "skeptical" or "optimistic") can be used in a decision-making scenario. In this case it is not even prior beliefs that are used to form the priors, but rather prior hypotheses.
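A small numerical sketch of the idea (the priors and trial counts below are made up for illustration, not taken from Spiegelhalter): update each member of a "community" of Beta priors with the same data and compare the conclusions.

```python
from scipy import stats

# Hypothetical sketch of a "community of priors": the priors and the trial
# counts below are assumptions for illustration, not Spiegelhalter's numbers.
trials, successes = 40, 26        # assumed outcome of a small trial

community = {
    "skeptical":  stats.beta(2, 8),   # doubts the treatment works
    "optimistic": stats.beta(8, 2),   # expects the treatment to work
    "vague":      stats.beta(1, 1),   # flat prior
}

for name, prior in community.items():
    a, b = prior.args
    # Beta-binomial conjugate update with the same data for every prior
    posterior = stats.beta(a + successes, b + trials - successes)
    print(f"{name:>10}: posterior mean = {posterior.mean():.2f}, "
          f"P(success rate > 0.5) = {posterior.sf(0.5):.2f}")
```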

Since the Bayesian approach includes both the prior and the data in your model, information from both sources is combined. The more informative your prior is relative to the data, the more influence it has on the posterior; the more informative your data, the less influence your prior has.
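For the conjugate normal model with known data variance, this trade-off is the standard precision-weighting result: with prior mean $\mu_0$ and prior variance $\tau^2$, sample mean $\bar{x}$ of $n$ observations with data variance $\sigma^2$,

$$\mu_{\text{post}} = \frac{\tau^{-2}\,\mu_0 + n\sigma^{-2}\,\bar{x}}{\tau^{-2} + n\sigma^{-2}},$$

so as $n$ grows the data term dominates and the prior's influence fades.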

In the end, "all models are wrong, but some are useful". Priors describe beliefs that you incorporate into your model; they do not have to be correct. It is enough that they are helpful for your problem, since we are dealing only with approximations of reality described by our models. Yes, they are subjective. As you already noticed, if we needed prior knowledge to form them, we would end up in a vicious circle. Their beauty is that they can be formed even when data are scarce, precisely so as to overcome that shortage.


Spiegelhalter, D. J. (2004). Incorporating Bayesian ideas into health-care evaluation. Statistical Science, 156-174.

Tim
  • 108,699
  • 20
  • 212
  • 390
  • 1
    A community of priors is a clear bastardization of the Bayesian approach. Unless one has multiple personalities, there can't be multiple priors. The prior is supposed to capture your prior belief, all that you know about the phenomenon. If you have multiple priors, you'll run into even more philosophical issues than the Bayesian approach already has. – Aksakal Mar 15 '16 at 16:50
  • 1
    @Aksakal using them as described by Spiegelhalter is pretty appealing: use different priors, compare the outcomes, and check how much they influence the result. Besides, they are a nice example of the point that a prior does not have to be "correct". – Tim Mar 15 '16 at 17:25
  • 2
    It's technically appealing but logically inconsistent. It basically undermines the whole basis of Bayesian statistics in beliefs, subjective probabilities, etc. If you have many priors, why not an infinite number of priors? In that case, how is this different from frequentist approaches? Once you run an infinite number of priors, your outcome will converge to a pure frequentist result, or to an uninformative prior, or something along those lines. – Aksakal Mar 15 '16 at 17:35
  • 1
    @Aksakal when creating a single prior you "weight" the different sources of evidence "in your head" to come up with something -- how does that differ from setting multiple priors, one per source of belief? – Tim Mar 15 '16 at 17:54
  • 1
    +1 but I agree with @Aksakal that multiple priors, and even the whole approach of Bayesian "model checking" (as e.g. advertised greatly by Andrew Gelman), are not exactly pure Bayesian. For a pure Bayesian, the idea of "validating different beliefs" does not make any sense. You have your beliefs; you get new data; you update your beliefs, and that's all there is to it. Gelman and Shalizi have a paper on [Philosophy and the practice of Bayesian statistics](http://www.stat.columbia.edu/~gelman/research/published/philosophy.pdf) that talks about this conundrum, but I never managed to read it through. – amoeba Mar 20 '16 at 17:34
  • 2
    Special thanks for the reference to Spiegelhalter 2004. I like this sentence from the closing paragraph: "The general statistical community, who are not stupid, have justifiably found somewhat tiresome the tone of hectoring self-righteousness that has often come from the Bayesian lobby." – amoeba Mar 20 '16 at 23:57
  • @amoeba you can also check his book "Bayesian Approaches to Clinical Trials and Health-Care Evaluation". – Tim Mar 21 '16 at 08:00

8

I think you're making the mistake of applying something like the frequentist concept of probability to the foundations of the subjective definition. All that a prior is in the subjective framework is a quantification of a current belief, before updating it. By definition, you don't need anything concrete to arrive at that belief and it doesn't need to be valid, you just need to have it and to quantify it.

A prior can be informative or uninformative, and it can be strong or weak. The point of those scales is that you don't have any implicit assumptions about the validity of your prior knowledge; you have explicit ones, and sometimes the explicit assumption can be "I have no information," or "I am not confident in the information I have." The point is, there is no requirement that prior knowledge be "valid", and that assumption is the only reason your scenario seems paradoxical.

By the way, if you like thinking about the philosophy of probability, you should read *The Emergence of Probability* by Ian Hacking and its sequel, *The Taming of Chance*. The first book especially is really illuminating about how the concept of probability came to have dual and seemingly incompatible definitions. As a teaser: did you know that until fairly recently, calling something "probable" meant that it was "approvable", i.e. that it was "approved by the authorities" or was a generally well-respected opinion? It had nothing whatsoever to do with any concept of likelihood.

amoeba
  • 93,463
  • 28
  • 275
  • 317
Robert E Mealey
  • 328
  • 2
  • 6
  • 1
    Interesting statement about the meaning of "probable". Until fairly recently is until when exactly? I find [11 occurrences of "probable" in the works of Shakespeare](http://www.opensourceshakespeare.org) and they seem to have the usual meaning. That's 400 years ago. – amoeba Mar 20 '16 at 17:41
  • 1
    Looking at all those quotes, I actually think that if you read them with the original definition in mind, they make more sense. But the best examples in Emergence are from Gibbon's Decline and Fall of the Roman Empire, one from Gibbon's personal notes and one from a footnote. The personal note read: "Let us conclude, then, though with some remainder of scepticism, that although Livy's narrative has more of probability, yet that of Polybius has more of truth." And the footnote, to Chapter xxiv of Decline and Fall, reads: "Such a fact is probable but undoubtedly false." – Robert E Mealey Mar 20 '16 at 17:47
  • Another really interesting one, also from Emergence, is from a mid 1700s writer named Thomas Church, in response to David Hume's attack on the credibility of miracles... – Robert E Mealey Mar 20 '16 at 17:57
  • Quoting from Emergence: "The author is at pains to insist that credibility is relative to the evidence. Church grants, 'that in common discourse, it is not unusual to call any thing credible or incredible, antecedent to our consideration of its proof. But if we examine our ideas, this will be found to be a loose unphilosophical way of expressing ourselves. All that can be meant is, that such a thing is possible or impossible, probable or improbable, or, at farthest, happening very frequently, or very seldom' [1750, p. 60]." – Robert E Mealey Mar 20 '16 at 17:57
  • But going back to the Shakespeare quotes, those are actually pretty fascinating examples. Most of them do make sense of a sort under the modern definition of probable, but all of them also make sense if you read them as referring to how "believable" or "plausible" the subject being referred to is or should be. – Robert E Mealey Mar 20 '16 at 18:05
  • This one, from All's Well That Ends Well, is, I think unambiguously defined in the original way: "None in the world; but return with an invention and clap upon you two or three probable lies: but we have almost embossed him; you shall see his fall to-night; for indeed he is not for your lordship's respect." Referring to probable, or believable, lies. – Robert E Mealey Mar 20 '16 at 18:10
  • By now I am confused; I thought that "plausible" or "believable" *is* the modern meaning of "probable". But in your answer you wrote about "approved by the authorities", which is a whole different meaning. – amoeba Mar 20 '16 at 18:25
  • Well, the modern definition according to Merriam-Webster is "likely to happen or to be true but not certain" (http://www.merriam-webster.com/dictionary/probable), whereas "plausible" is defined as "possibly true : believable or realistic". One of Hacking's main early points is that for medieval philosophers especially, the credibility of an idea had a lot to do with how many preceding respectable authorities approved it, or how "probable" it was. The modern definition was an attempt to redefine credibility in terms of appeals to empirical observation, not appeals to authority. – Robert E Mealey Mar 20 '16 at 18:53
  • "The authorities" in my original definition is ambiguous, and I understand why that would be confusing. A better way to say it might be that calling something probable mean that it was "approvable by the authorities, or by generally respected, intelligent people and institituions." A good illustrative example of this is Catholic probabilism, which is discussed at length in Emergence, since Pascal spent most of his public life attacking it. https://en.wikipedia.org/wiki/Catholic_probabilism – Robert E Mealey Mar 20 '16 at 19:00