10

In this Wikipedia article, there is this sentence:

This is a frequentist approach

Is 'this' referring to OLS?

Is it really 'a' rather than 'the'? What are some other frequentist approaches? As far as I know, we must minimize $[\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n]\,[\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n]'$.

Edit: This is in relation to my previous question. Tim mentioned 'Now, to estimate logistic regression in Bayesian way...'

So, how would one do it in a frequentist way? OLS? That's what I recall from a statistics class on linear regression. I guess there's MLE too.

BCLC
  • 2
    I think it's definitely 'a'. OLS is how a frequentist using ML estimation would approach the particular distributional model under discussion there, but there's nothing sacrosanct about that model (nor indeed anything that says a frequentist *has* to use ML estimation). – Glen_b Sep 07 '15 at 13:28
  • Christensen states that least squares is not a statistical procedure, but rather a geometric one. (*Plane Answers to Complex Questions*, 4th ed.) – Clarinetist Sep 07 '15 at 20:21

3 Answers

14

OLS by itself does not imply what type of inference (if any) is being done. I would say it is a mere descriptive statistic.

If you assume some generative model (a sampling distribution) and try to draw inferences about the OLS coefficients, you are then free to do either frequentist or Bayesian inference.
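To make the "mere descriptive statistic" reading concrete, here is a minimal sketch (not part of the original answer; the data are made up) in which the OLS coefficients are computed as a purely deterministic function of $(X, y)$, with no probability model invoked anywhere:

```python
import numpy as np

# Hypothetical data: no probability model is assumed anywhere below.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])  # design matrix with an intercept column
y = rng.normal(size=50)                                   # any numeric response

# The OLS "statistic": the minimizer of ||y - Xb||^2, obtained purely by
# linear algebra (Moore-Penrose pseudoinverse), as in the Wikipedia quote.
beta_pinv = np.linalg.pinv(X) @ y

# Equivalent normal-equations form (X'X)^{-1} X'y when X has full column rank.
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_pinv, beta_ne)  # identical up to floating-point error
```

Only once you put a sampling distribution (or a prior) behind $(X, y)$ does this number acquire a frequentist or Bayesian interpretation.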

JohnRos
  • 1
    OLS is a mere descriptive statistic? Isn't this a bit of a stretch? – Paul Sep 07 '15 at 14:47
  • 4
    @Paul: depends what you mean by OLS. Computing the minimizer of the squared error loss in the linear model class is an assumption-free exercise in optimization and/or matrix algebra. The inference stage (assuming a linear generative model, centered and independent error terms, ...) is not inherent to the definition of OLS (for many, but not all, authors). – JohnRos Sep 07 '15 at 14:52
  • Yeah. I get what you're doing, and I kind of like it philosophically. It just seems a bit unconventional to identify any "merely mathematical" procedure as a descriptive statistic. – Paul Sep 07 '15 at 15:08
  • That said I do like it from a pure theory point of view. A statistic is just a function of the data, and any deterministic mathematical procedure applied to the data (from OLS to LASSO and beyond) can be interpreted as a function of the data. – Paul Sep 07 '15 at 15:27
  • 1
    @Paul: It seems that some up-voters agree with this view. Then again, I vaguely recall debating it on this site in the past. The view presented was that no computation is done without some assumed model, so that all descriptive statistics have some implied inference. Needless to say, I disagree. – JohnRos Sep 07 '15 at 15:50
  • 1
    I am one of the upvoters :) This view is an acquired taste but one worth acquiring, and apparently popular among users of this site at least. – Paul Sep 07 '15 at 15:55
  • 1
    @Paul: here was the debate we had at the time: http://stats.stackexchange.com/questions/32782/when-do-i-need-a-model – JohnRos Sep 07 '15 at 15:58
  • 1
    Take the example of a regression on a constant only. This yields the sample mean as the estimated coefficient - few would debate that this is a descriptive statistic, no? A multiple regression just takes this a step further, as @JohnRos explains. – Christoph Hanck Sep 09 '15 at 07:32
  • Thanks JohnRos. So Wiki is wrong? – BCLC Sep 12 '15 at 09:01
  • @BCLC: Any particular statement in the wiki that you have in mind? Notice that "OLS" and "OLS in the linear model" are not exactly the same thing. – JohnRos Sep 12 '15 at 21:37
  • @JohnRos Wiki says 'The ordinary least squares solution is to estimate the coefficient vector using the Moore-Penrose pseudoinverse...This is a frequentist approach' So is it necessarily a frequentist approach as is claimed? – BCLC Sep 18 '15 at 11:33
  • @BCLC: projecting the data ("Moore-Penrose...") onto the span of some vectors is an assumption-free arithmetic operation. – JohnRos Sep 18 '15 at 11:37
  • @JohnRos So your answer is 'no' ? – BCLC Sep 18 '15 at 11:38
  • @BCLC: I did not read the wiki. I am saying that minimizing the squared error loss does not imply the type of inference you will be making. – JohnRos Sep 18 '15 at 11:44
  • @JohnRos I see. And whether it's frequentist or Bayesian [depends on the assumption/s](http://stats.stackexchange.com/a/171443/44339) ('[OLS is Bayesian regression, but with a (nonstandard) uniform prior](http://stats.stackexchange.com/questions/171423/is-ols-the-frequentist-approach-to-linear-regression/171427?noredirect=1#comment325512_171439)')? – BCLC Sep 18 '15 at 11:49
14

In the folk parlance of statistics, OLS tends to be identified as a frequentist approach to parameter estimation because it does not explicitly involve a prior distribution on the parameters being estimated. But strictly speaking, OLS is just a mathematical operation whose result has both frequentist and Bayesian interpretations.

From a frequentist perspective, if the usual linear model assumptions hold, the OLS parameter estimates are equal to the true parameter values plus an error of known distribution derived from the randomness of sampling. This enables us to extract information about the estimates (e.g. p-values and confidence intervals) which have theoretical guarantees that limit the chance of certain types of errors attributable to random sample variation.
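As a rough illustration of that frequentist reading (a sketch with simulated data, not part of the original answer), the OLS point estimates can be accompanied by standard errors, p-values and confidence intervals whose guarantees come entirely from the assumed sampling distribution:

```python
import numpy as np
from scipy import stats

# Simulated data satisfying the usual linear model assumptions
# (the numbers here are illustrative only).
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# OLS point estimates.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Frequentist machinery: estimate sigma^2, then standard errors from
# Var(beta_hat) = sigma^2 (X'X)^{-1}, and t-based p-values/intervals.
resid = y - X @ beta_hat
dof = n - X.shape[1]
sigma2_hat = resid @ resid / dof
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

t_stat = beta_hat / se
p_values = 2 * stats.t.sf(np.abs(t_stat), dof)
t_crit = stats.t.ppf(0.975, dof)
ci = np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se])

print(beta_hat, se, p_values)
print(ci)  # 95% confidence intervals: coverage guaranteed over repeated sampling
```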

From a Bayesian perspective, if the usual linear model assumptions hold, OLS provides the maximum a posteriori (MAP) parameter estimate under a uniform prior. The posterior distribution is a multivariate Gaussian with a peak at the MAP estimate, and we can use the posterior to update our prior on the parameters, or compute credible intervals if we so desire.
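And a matching sketch of the Bayesian reading (same simulated data; the error variance is treated as known purely to keep the posterior in closed form): under a flat prior the posterior is Gaussian, centred at the OLS/MAP estimate, and credible intervals fall straight out of it:

```python
import numpy as np
from scipy import stats

# Same simulated data as in the frequentist sketch above.
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)
sigma2 = 0.25  # noise variance, assumed known for this illustration

# Under a flat (improper uniform) prior on beta, the posterior is Gaussian:
#   beta | y ~ N( beta_OLS, sigma^2 (X'X)^{-1} ),
# so the MAP estimate / posterior mean is exactly the OLS solution.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
post_mean = beta_ols
post_cov = sigma2 * np.linalg.inv(X.T @ X)

# 95% credible intervals from the Gaussian posterior.
z = stats.norm.ppf(0.975)
cred = np.column_stack([post_mean - z * np.sqrt(np.diag(post_cov)),
                        post_mean + z * np.sqrt(np.diag(post_cov))])
print(post_mean)
print(cred)
```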

In general, the frequentist/Bayesian distinction is fraught with so much folk connotation and association that most statistical methods end up feeling like one or the other to practitioners. But at bottom, if this distinction has any objective meaning, it is about how you interpret probabilities, and your choice of parameter estimation method does not necessarily say anything about how you interpret probabilities. Only your interpretation of the parameter estimates can identify what side of the distinction you are (currently) working in.

Paul
3

The text from:

"The ordinary least squares solution... and y is the column n-vector"

denotes the frequentist approach to linear regression, and is generally termed "OLS". In Bayesian regression, there is a prior on the parameters. The often-used Normal prior on the betas also has a frequentist interpretation: ridge regression.
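As a quick numeric check of that last equivalence (a sketch with hypothetical data, and with the error variance taken as known for simplicity; not something from the article), the MAP estimate under a $N(0, \tau^2 I)$ prior on the betas coincides with the ridge estimator with penalty $\lambda = \sigma^2/\tau^2$:

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(2)
n, p = 80, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 0.0, -2.0]) + rng.normal(size=n)

sigma2, tau2 = 1.0, 0.5   # known error variance, prior variance on the betas
lam = sigma2 / tau2       # implied ridge penalty

# Ridge regression: minimize ||y - Xb||^2 + lam * ||b||^2.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Bayesian MAP: maximize -||y - Xb||^2/(2*sigma2) - ||b||^2/(2*tau2);
# setting the gradient to zero gives the same linear system.
beta_map = np.linalg.solve(X.T @ X / sigma2 + np.eye(p) / tau2, X.T @ y / sigma2)

print(beta_ridge)
print(beta_map)  # identical up to floating-point error
```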

JQVeenstra
  • Thanks JQVeenstra. So you disagree with Paul and John Ros? – BCLC Sep 12 '15 at 09:02
  • 1
    Actually, no. We are answering, in some ways, different questions. They are answering the title to this question: is OLS necessarily a frequentist approach? I am saying the interpretation of the Wikipedia article is definitely speaking from a frequentist perspective. Paul makes the point that OLS is Bayesian regression, but with a (nonstandard) uniform prior. Which is perfectly correct. The article, however, does not present it this way. – JQVeenstra Sep 13 '15 at 11:58