
While attending conferences, I have noticed a bit of a push by advocates of Bayesian statistics for assessing the results of experiments. It is vaunted as more sensitive, more appropriate, and more selective towards genuine findings (fewer false positives) than frequentist statistics.

I have explored the topic somewhat, and I am left unconvinced so far of the benefits of using Bayesian statistics. Bayesian analyses were used to refute Daryl Bem's research supporting precognition, however, so I remain cautiously curious about how Bayesian analyses might benefit even my own research.

So I am curious about the following:

  • Power in a Bayesian analysis vs. a frequentist analysis
  • Susceptibility to Type 1 error in each type of analysis
  • The trade-off in complexity of the analysis (Bayesian seems more complicated) vs. the benefits gained. Traditional statistical analyses are straightforward, with well-established guidelines for drawing conclusions. The simplicity could be viewed as a benefit. Is that worth giving up?

Thanks for any insight!

onestop
  • Bayesian statistics is traditional statistics - can you give a concrete example of what you mean by traditional statistics? –  Mar 11 '11 at 19:54
  • @OphirYoktan: He's talking about frequency probability versus Bayesian probability. It's even mentioned in the question's title. –  Mar 11 '11 at 19:57
  • Wikipedia can clarify the distinction for people, hopefully. Frequentist inference: http://en.wikipedia.org/wiki/Frequentist_inference and Bayesian inference http://en.wikipedia.org/wiki/Bayesian_inference –  Mar 11 '11 at 20:14
  • I think this question should be moved over here: http://stats.stackexchange.com/ – Mark Lapierre Mar 11 '11 at 23:29
  • Hmm, maybe. I certainly tried framing it in terms of a skeptical assessment of Bayesian statistics. I would probably frame my query differently towards a devoted statistics crowd. –  Mar 11 '11 at 23:51
  • @MindDetective: It's definitely an interesting question (I'd love to see an answer from a professional statistician). But even when framed as a skeptical assessment of Bayesian statistics, it's still more about statistical methodology than it is about skepticism. You're far more likely to get a good answer from a statistician than you are from a skeptic. – Mark Lapierre Mar 12 '11 at 03:04
  • I was hoping there were a few that might be both around! :) –  Mar 12 '11 at 03:07
  • @MindDetective: I think it would be more appropriate here if it were specifically about the conclusions Bem reached (with skepticism fostered by the chosen statistical analysis), rather than about statistical analysis (with Bem's paper as an example). – Mark Lapierre Mar 12 '11 at 03:09
  • Voted to close as off-topic, sorry! :-) –  Mar 12 '11 at 12:04
  • I wasn't interested in discussing Bem's study. The topic was about the vaunted benefits of Bayesian statistics, NOT about precognition. Additionally, I think we should be skeptical of the claims that statisticians make as well as those of scientists or homeopathy proponents or politicians. Is there really no room to assess the claims of applied mathematics here? If it isn't interesting or answerable by anyone who frequents the site, that's fine. But off topic? That leaves a narrower definition of on-topic than I like, personally. –  Mar 12 '11 at 14:11
  • I also voted off-topic; I think this would fit better on stats.SE. –  Mar 13 '11 at 01:28
  • I asked a [question on meta](http://meta.skeptics.stackexchange.com/questions/200/are-applied-mathematics-questions-on-topic) about whether this should be on-topic. –  Mar 13 '11 at 01:46
  • The funny thing about Bayesian probability theory is that it has the theoretical apparatus to explain why people are skeptical of the theory! And it also tells you under what conditions this will happen! – probabilityislogic Mar 16 '11 at 09:25
  • Why is this not CW? It seems like a ripe situation for devolving into an argumentative, subjective discussion. At the very least, it seems like the wrong situation for awarding or taking reputation based on the answers. – cardinal Mar 16 '11 at 23:33
  • I think this question can potentially have a "good" or "correct" answer. E.g. if someone could say "for every frequentist test with type 1 error $\alpha$ and type 2 error $\beta$, there exists a Bayesian test with type 1 error $\alpha$ and type 2 error $\beta - x$", this would be a good answer. Or something like "every frequentist test is equivalent to a Bayesian test with an uninformative prior". I.e. this doesn't have to be a religious war between frequentists and Bayesians. I'm only arguing because I don't understand how the replies relate to the specific questions in the OP. – SheldonCooper Mar 16 '11 at 23:56
  • @sheldon - I would agree here (+1 from me), but at the same time it is a "loaded" question, in that it is essentially asking for a value judgment on behalf of the person answering the question. And there is nothing wrong with challenging either side - it should make both stronger (as long as people "keep their cool" and don't make things personal, stay on issue). I think that's one of the best things about this forum - you can express your ideas, and see what other people think. – probabilityislogic Mar 17 '11 at 11:46

3 Answers


A quick response to the bulleted content:

1) Power / Type 1 error in a Bayesian analysis vs. a frequentist analysis

Asking about Type 1 error and power (i.e. one minus the probability of Type 2 error) implies that you can put your inference problem into a repeated sampling framework. Can you? If you can't, then there isn't much choice but to move away from frequentist inference tools. If you can, and if the behavior of your estimator over many such samples is of relevance, and if you are not particularly interested in making probability statements about particular events, then there's no strong reason to move.
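
To make the repeated-sampling framing concrete, here is a minimal sketch (the `z_test_power` helper is illustrative, not from the answer) of how power and Type 1 error are defined for a two-sided one-sample z-test:

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Probability of rejecting H0 over repeated samples of size n,
    when the true standardized effect is effect_size (two-sided z-test)."""
    nd = NormalDist()
    crit = nd.inv_cdf(1 - alpha / 2)   # critical value of the test
    shift = effect_size * n ** 0.5     # how far the sampling dist. moves under H1
    # reject when |Z| > crit; Z is centred at `shift` under the alternative
    return (1 - nd.cdf(crit - shift)) + nd.cdf(-crit - shift)

print(round(z_test_power(0.5, 30), 3))   # power against a medium effect
print(round(z_test_power(0.0, 30), 3))   # with no true effect: the Type 1 rate, 0.05
```

Note that with `effect_size = 0` the "power" collapses to the Type 1 error rate $\alpha$; the whole construction presupposes the many-samples framework described above.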

The argument here is not that such situations never arise - certainly they do - but that they typically don't arise in the fields where the methods are applied.

2) The trade-off in complexity of the analysis (Bayesian seems more complicated) vs. the benefits gained.

It is important to ask where the complexity goes. In frequentist procedures the implementation may be very simple, e.g. minimize the sum of squares, but the principles may be arbitrarily complex, typically revolving around which estimator(s) to choose, how to find the right test(s), and what to think when they disagree. For an example, see the still-lively discussion, picked up in this forum, of different confidence intervals for a proportion!
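
That proportion-interval debate is easy to demonstrate. The sketch below (function names are mine, not from the thread) compares the textbook Wald interval with the Wilson score interval; they disagree most at the extremes:

```python
from statistics import NormalDist

def wald_ci(k, n, conf=0.95):
    """Wald interval for k successes in n trials: simple,
    but degenerate for small n or extreme proportions."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = k / n
    half = z * (p * (1 - p) / n) ** 0.5
    return max(0.0, p - half), min(1.0, p + half)

def wilson_ci(k, n, conf=0.95):
    """Wilson score interval: messier formula, better coverage."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * (p * (1 - p) / n + z**2 / (4 * n**2)) ** 0.5
    return centre - half, centre + half

print(wald_ci(0, 20))    # degenerate: (0.0, 0.0)
print(wilson_ci(0, 20))  # still a sensible interval
```

With zero successes out of twenty, the Wald interval collapses to a point while the Wilson interval still gives a usable upper bound, which is exactly the kind of disagreement the cited discussion turns on.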

In Bayesian procedures the implementation can be arbitrarily complex even in models that look like they 'ought' to be simple, usually because of difficult integrals, but the principles are extremely simple. It rather depends where you'd like the messiness to be.

3) Traditional statistical analyses are straightforward, with well-established guidelines for drawing conclusions.

Personally I can no longer remember, but certainly my students never found these straightforward, mostly due to the proliferation of principles described above. But the question is not really whether a procedure is straightforward, but whether it is closer to being right given the structure of the problem.

Finally, I strongly disagree that there are "well-established guidelines for drawing conclusions" in either paradigm. And I think that's a good thing. Sure, "find p<.05" is a clear guideline, but for what model, with what corrections, etc.? And what do I do when my tests disagree? Scientific or engineering judgement is needed here, as it is elsewhere.

conjugateprior
  • I'm not sure that asking about type 1/type 2 errors implies anything about a repeated sampling framework. It seems that even if my null hypothesis cannot be sampled repeatedly, it is still meaningful to ask about the probability of type 1 error. The probability in this case, of course, is not over all the possible hypotheses, but rather over all possible samples from my single hypothesis. – SheldonCooper Mar 16 '11 at 15:24
  • It seems to me that the general argument is this: although making a type 1 (or 2) error *can* be defined for a 'one shot' inference (Type 1 vs 2 is just part of a typology of mistakes I can make) unless my making this mistake is embedded in repeated trials neither error type can have a frequentist probability. – conjugateprior Mar 16 '11 at 15:53
  • What I'm saying is that making a type 1 (or 2) error is *always* embedded in repeated trials. Each trial is sampling a set of observations from the null hypothesis. So even if it is difficult to imagine sampling a different hypothesis, repeated trials are still there because it is easy to imagine sampling a different set of observations from that same hypothesis. – SheldonCooper Mar 16 '11 at 15:59
  • You are, I think, backing off from N-P to Fisher when you ask whether it doesn't still make sense to think of the p value as simultaneously a measure of divergence from some null, as a way to define a type 1 error, and as something that doesn't require a sampling framework. I'm not sure how to have all these at once, but perhaps I'm missing something. – conjugateprior Mar 16 '11 at 16:01
  • oops, we overlapped. In the terms of my original response to the question you're saying that you *can* imagine embedding in a repeated framework. OK, then Type 1 and perhaps Type 2 makes good sense. – conjugateprior Mar 16 '11 at 16:04
  • I say that it does require a sampling framework, but that this sampling framework is easy to have. As a frequentist, I don't know what it means to "sample a hypothesis". But I do know what it means to "sample observations from a hypothesis". So I do this sampling repeatedly and use it to define type 1 errors. So I do use a sampling framework (of course), but this is sampling of observations only, not sampling of hypotheses. – SheldonCooper Mar 16 '11 at 16:07
  • fwiw, I don't know what 'sample a hypothesis' might mean, and I'd guess that no Bayesian does either (except as part of some MCMC *implementation* detail). I guess that because you identify the use of random variables so tightly with the existence of random sampling, you're thinking that Bayesians must sample their parameters/hypotheses. (After all, they do treat them as random.) But for Bayesians the two are decoupled and the priority reversed: the random variables represent uncertainty, which *may or may not* be caused by a sampling process. – conjugateprior Mar 16 '11 at 20:24
  • I was referring to the text of your reply saying that "it implies that you can put your inference problem into a repeated sampling framework". I thought you meant sampling hypotheses somehow. I understand now this is not what you meant. On the other hand, sampling data from the hypothesis seems to be readily possible. If that's the case, would it be fair to summarize your argument as saying that in most cases there is no strong reason to use Bayesian statistics? – SheldonCooper Mar 16 '11 at 21:36
  • Riddle me this: how does one decide "what is random?" E.g. suppose you have an urn, and someone is sampling "at random" from the urn. Also suppose an "intelligent observer" is present who knows the exact contents of the urn. Is the sampling still "at random" even though the "intelligent observer" can predict with certainty exactly what will be drawn? Has anything about the urn changed if they are no longer present? – probabilityislogic Mar 17 '11 at 11:54
  • @SheldonCooper The short version: IF (you can [put your problem in a resampling framework] AND if the behavior of your estimator over many such samples is of relevance AND if you are not particularly interested in making probability statements about particular events AND you don't mind a plethora of competing principles) THEN there's no particular reason to change what you're doing. I don't know whether the antecedent describes 'most cases' but it doesn't describe most of my cases. – conjugateprior Mar 17 '11 at 13:47
  • The issue I have with the "repeated" nature of frequentist methods is that in order to work, the conditions must remain the same. But if the conditions remain the same, you should be able to pool your data sets together and get a better estimate. The frequentist ignores the past information precisely under the conditions when it is reasonable to take it into account. – probabilityislogic Mar 17 '11 at 22:46
  • +1 For a thoughtful analysis and attempt to respond directly to the OP. – whuber Mar 19 '11 at 04:34

Bayesian statistics can be derived from a few logical principles. Try searching for "probability as extended logic" and you will find more in-depth analyses of the fundamentals. But basically, Bayesian statistics rests on three basic "desiderata" or normative principles:

  1. The plausibility of a proposition is to be represented by a single real number.
  2. The plausibility of a proposition is to have qualitative correspondence with "common sense". If, given initial plausibility $p(A|C^{(0)})$, the state of information changes from $C^{(0)}\rightarrow C^{(1)}$ such that $p(A|C^{(1)})>p(A|C^{(0)})$ (A becomes more plausible) and also $p(B|AC^{(0)})=p(B|AC^{(1)})$ (given A, B remains just as plausible), then we must have $p(AB|C^{(0)})\leq p(AB|C^{(1)})$ (A and B together must become at least as plausible) and $p(\overline{A}|C^{(1)})<p(\overline{A}|C^{(0)})$ (not-A must become less plausible).
  3. The plausibility of a proposition is to be calculated consistently. This means: a) if a plausibility can be reasoned out in more than one way, all answers must be equal; b) in two problems where we are presented with the same information, we must assign the same plausibilities; and c) we must take account of all the information that is available. We must not add information that isn't there, and we must not ignore information which we do have.

These three desiderata (along with the rules of logic and set theory) uniquely determine the sum and product rules of probability theory. Thus, if you would like to reason according to the above three desiderata, then you must adopt a Bayesian approach. You do not have to adopt the "Bayesian philosophy", but you must adopt the numerical results. The first three chapters of this book describe these in more detail, and provide the proof.

And last but not least, the "Bayesian machinery" is the most powerful data-processing tool you have. This is mainly because of desideratum 3c), using all the information you have (which also explains why Bayes can be more complicated than non-Bayes). It can be quite difficult to decide "what is relevant" using your intuition. Bayes' theorem does this for you (and it does it without adding arbitrary assumptions, also due to 3c).

EDIT: to address the question more directly (as suggested in the comments), suppose you have two hypotheses $H_0$ and $H_1$. You have a "false positive" loss $L_1$ (reject $H_0$ when it is true: a type 1 error) and a "false negative" loss $L_2$ (accept $H_0$ when it is false: a type 2 error). Probability theory says you should:

  1. Calculate $P(H_0|E_1,E_2,\dots)$, where the $E_i$ are all the pieces of evidence related to the test: data, prior information, whatever you want the calculation to incorporate into the analysis
  2. Calculate $P(H_1|E_1,E_2,\dots)$
  3. Calculate the odds $O=\frac{P(H_0|E_1,E_2,\dots)}{P(H_1|E_1,E_2,\dots)}$
  4. Accept $H_0$ if $O > \frac{L_2}{L_1}$

You don't strictly need to introduce the losses, though. If you just look at the odds, you will get one of three results: i) definitely $H_0$, $O \gg 1$; ii) definitely $H_1$, $O \ll 1$; or iii) "inconclusive", $O\approx 1$.
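
The four steps can be sketched for a toy problem with two simple hypotheses about a coin's bias (all names and numbers here are illustrative, not from the answer):

```python
from math import comb

def posterior_odds(k, n, p0, p1, prior_odds=1.0):
    """Posterior odds of H0 (bias p0) vs H1 (bias p1) after observing
    k heads in n tosses; prior_odds = P(H0)/P(H1) before the data."""
    like0 = comb(n, k) * p0**k * (1 - p0)**(n - k)   # P(data | H0)
    like1 = comb(n, k) * p1**k * (1 - p1)**(n - k)   # P(data | H1)
    return prior_odds * like0 / like1

def decide(odds, loss_type1=1.0, loss_type2=1.0):
    """Step 4 above: accept H0 when the posterior odds exceed L2/L1."""
    return "accept H0" if odds > loss_type2 / loss_type1 else "accept H1"

# 60 heads in 100 tosses: is the coin fair (H0) or biased to 0.6 (H1)?
O = posterior_odds(k=60, n=100, p0=0.5, p1=0.6)
print(round(O, 3), decide(O))   # O is well below 1, so the data favour H1
```

With equal losses the threshold is 1, so the decision reduces to whichever hypothesis the odds favour; unequal losses simply move the cut-off, as in step 4.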

Now if the calculation becomes "too hard", then you must either approximate the numbers, or ignore some information.

For an actual example with worked-out numbers, see my answer to this question

probabilityislogic
  • I'm not sure how this answers the question. Frequentists of course disagree with desideratum 1 from this list, so the rest of the argument doesn't apply to them. It also doesn't answer any of the specific questions in the OP, such as "is Bayesian analysis more powerful or less error-prone than a frequentist analysis". – SheldonCooper Mar 16 '11 at 15:18
  • @sheldoncooper - if a frequentist disagrees with desideratum 1, then on what basis can they construct a 95% confidence interval? They must require an additional number. – probabilityislogic Mar 16 '11 at 22:32
  • @sheldoncooper - and further, sampling probabilities would have to be re-defined, because they too are only 1 number. A frequentist cannot reject desideratum 1 without rejecting their own theory – probabilityislogic Mar 16 '11 at 23:12
  • I'm not sure what additional number you are referring to. I'm also not sure what is the computation of $p(H_1|...)$ that you've introduced. The procedure in standard statistical tests is to compute $p(E_1, E_2, ... | H_0)$. If this probability is low, $H_0$ can be rejected; otherwise it cannot. – SheldonCooper Mar 16 '11 at 23:21
  • "they cannot reject desideratum 1 without rejecting their own theory" -- what do you mean by that? Frequentists have no notion of "plausibility". They have a notion of "frequency of occurrence in repeated trials". This frequency satisfies conditions similar to your three desiderata and thus happens to follow similar rules. Thus for anything for which the notion of frequency is defined, you can use laws of probability without any problem. – SheldonCooper Mar 16 '11 at 23:25
  • Just to clarify, in the frequentist interpretation the notation $p(...|H_0)$ does not of course represent conditioning on a random variable $H_0$. Rather, it's a convenient shortcut for $p_{H_0}(...)$, the probability of $...$ under model $H_0$. – SheldonCooper Mar 16 '11 at 23:59
  • @sheldon - if you reject desideratum 1, then presumably you require more than one number to represent the plausibility of an event. This means you cannot represent $P(\dots|H_0)$ by a single number. But this is what a frequentist does: calculates the likelihood (i.e. one number). Therefore, a frequentist cannot reject desideratum 1. For a frequentist, "plausibility" = "frequency", I would have thought. – probabilityislogic Mar 17 '11 at 02:25
  • "if you reject desideratum 1, then presumably you require more than one number to represent the plausibility of an event" -- sure, not that I care. "This means you cannot represent $P(\dots|H_0)$ by a single number." -- I disagree with that. For a frequentist $P(\dots|H_0)$ is not plausibility; in fact, frequentist doesn't care about plausibility at all. It's just frequency. – SheldonCooper Mar 17 '11 at 07:03
  • "For frequentist "plausibility"="frequency" I would have thought." -- again, I disagree. My understanding is that frequency is the only thing frequentists care about (hence the name). The notion of "plausibility" for a frequentist is at best some vague intuition which sometimes helps and sometimes doesn't, but definitely not something equivalent to probaiblity of an event. – SheldonCooper Mar 17 '11 at 07:07
  • So on what basis can a "frequentist" as you describe it make decisions if they do not adopt a notion of plausibility (or some other synonym of the word)? The world is uncertain, so how does the frequentist take account of this? Can you define what a confidence interval is without either adopting a notion of "likelihood" or "prediction"? (which are all equivalent to the notion of plausibility). It seems that by your notion you would only be able to describe the past, or what is observed, making it useless for the problems people want statistics for. (more later) – probabilityislogic Mar 17 '11 at 11:26
  • ...(cont'd) people use statistics to make inferences about what is unknown. Frequencies are very important, but they can only be observed, they cannot be used for prediction without adopting some connection between "likelihood" and "frequency". This all falls under the cloud of "plausible reasoning". It could be my use of terminology and particular words that are obscuring my point. And you still haven't answered my earlier question "on what basis can a confidence interval be used without a notion of plausibility?" – probabilityislogic Mar 17 '11 at 11:33
  • @sheldon - just looking at your earlier comment, you seem to think that by writing $P(H_0|\dots)$ I have somehow made $H_0$ "random". This is not true. This just means I want to know the extent to which the evidence I have (on the right hand side of "|") implies the hypothesis I am testing(I am using the word "implies" in *exactly the same way as in deductive logic*). – probabilityislogic Mar 17 '11 at 11:39

I am not familiar with Bayesian statistics myself, but I do know that The Skeptics' Guide to the Universe episode 294 has an interview with Eric-Jan Wagenmakers where they discuss Bayesian statistics. Here is a link to the podcast: http://www.theskepticsguide.org/archive/podcastinfo.aspx?mid=1&pid=294