
(This question is an attempt to zoom in on the key issue in this question using as little information as possible.)

Let's say I want to derive the likelihood function of $\beta$ given $x$ and $y$ for the model $$y=x\beta+u$$ where $$u\sim NID(0,1).$$ I would start with these steps: \begin{align} \mathcal{L}(\beta\mid x,y) &=f_{XY}(x, y \mid \beta)\\ &=f_u(y=x\beta + u\mid\beta, x, y). \end{align} I think that I need the second step to continue the derivation, but I don't know how to motivate it; I take it entirely on intuition. Is it correct? If so, how do I justify it? If not, what would the next few steps look like?

Edit: Just to give a sense of where I'm going with this, the rest of the derivation would look like this: \begin{align} &=f_u(y-x\beta=u\mid\beta, x, y)\\ &=\varphi(y-x\beta). \end{align}
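For reference, the step being asked about is usually justified by a change of variables. A sketch, assuming $x$ and $\beta$ are held fixed so that $u\mapsto y=x\beta+u$ is a shift with unit Jacobian: \begin{align} f_{Y\mid X}(y\mid x,\beta)&=f_u(y-x\beta)\left|\frac{\partial u}{\partial y}\right|\\ &=f_u(y-x\beta)\cdot 1\\ &=\varphi(y-x\beta), \end{align} which matches the last line of the derivation in the edit above.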

Fredrik P
  • Bayes: $P(\beta|x,y)P(x,y)=P(x,y|\beta)P(\beta)$, so you say $P(\beta|x,y)\propto P(x,y|\beta)P(\beta)$. If you don't know anything about $\beta$, you can say $P(\beta|x,y)\propto P(x,y|\beta)$ and maximize it (a numerical sketch follows this comment thread). – Aksakal May 27 '15 at 13:03
  • @Aksakal I'm not entirely sure how that relates to what I'm doing here. How would that let me impose the model structure on the probability? (It's not that I don't know what to do after the last step in the question. It's just that I don't know how to motivate that last step.) – Fredrik P May 27 '15 at 13:52
  • What you call "model structure" in practical terms is simply a function that allows you to infer the residuals from the data set given the parameters; in other words, that is essentially your likelihood function. – Aksakal May 27 '15 at 13:55
  • Closely related threads, such as http://stats.stackexchange.com/questions/49443 and http://stats.stackexchange.com/questions/32103, illustrate the general procedure. It's not perfectly clear what you're trying to ask, though. It sounds at some points like you are requesting an exposition of Maximum Likelihood theory and at other points like you just want to derive ordinary least squares from a Normal likelihood. Could you edit this post to make it more apparent what kind of answers you are seeking? – whuber May 27 '15 at 19:34
  • @whuber I guess it is more along the lines of an exposition of Maximum Likelihood theory that I'm requesting (but perhaps without the Maximum part; I don't think that I'm maximizing anything here). I have tried to make it more apparent what kind of answer I'm seeking. Was it at all improved? – Fredrik P May 27 '15 at 19:59
  • I'm still unsure what you mean by "motivate" the derivation of a likelihood function. It's a perfectly well-defined mathematical quantity, so what would constitute further "motivation"? Given that detailed step-by-step illustrations, as well as working code, exist in many threads on this site, what do you need in addition to that? – whuber May 27 '15 at 20:02
  • @whuber Well, the first step follows from the [definition of the likelihood function for continuous variables](http://en.wikipedia.org/wiki/Likelihood_function#Continuous_probability_distribution), the third step is just reshuffling terms within the pdf (akin to doing the same thing inside a probability) and the fourth step follows from the distribution of $u$. But I don't know of any similar grounds for making the second step of the derivation, yet I believe that there are such grounds. So, what the question boils down to is: what are those grounds? – Fredrik P May 27 '15 at 20:10
  • @whuber I have never seen anyone derive likelihood functions in steps as detailed as these. Normally, they just state step 1 and then jump straight to step 4. – Fredrik P May 27 '15 at 20:11
  • I'm sure I've detailed this numerous times, at least for the stages from model development through parameter estimation. A quick search turns up a step-by-step account at http://stats.stackexchange.com/a/124408 and another at http://stats.stackexchange.com/a/70922. I don't understand what your steps 1 through 4 are, precisely, but what do you think is missing? – whuber May 27 '15 at 20:16
  • @whuber I don't think anything is missing. It is probably as detailed as it can get. I just can't put words on why I do step 2. In both the links you sent, you jumped straight to the conclusion. – Fredrik P May 27 '15 at 20:23
  • @whuber Perhaps the answer is just what Aksakal said. Something like, "given our assumed model we have". – Fredrik P May 27 '15 at 20:31
  • I really am curious: what is missing in the "jump"? Could you be specific about the two steps that are connected by this jump? – whuber May 27 '15 at 20:38
  • @whuber :-) I just can't give a description for what I do in the step $f_{XY}(x, y \mid \beta)=f_u(y=x\beta + u\mid\beta, x, y)$ that sounds convincing. My best shot would be something like "all the variation in $y$ comes from $u$, so, given the model, the pdfs of $y$ and $u$ are equal". How would you describe it? – Fredrik P May 27 '15 at 20:49
  • I'm not quite sure what that step amounts to, because the notation is vague. I guess it is analogous to what is called the "link function" in a generalized linear model: you wish to postulate a probability model for a response $y$ and that model's parameters depend on covariates $x$ and other parameters $\beta$. That is not a mathematical derivation, though: it is a statement of how you are intending to model the data. It is an expression of your understanding of the world. Perhaps the most detailed account I have of how that step is made appears at http://stats.stackexchange.com/a/64039. – whuber May 27 '15 at 22:08
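Aksakal's point above about maximizing $P(x,y\mid\beta)$ can be checked numerically. A minimal sketch in Python, assuming simulated data (the variable names, seed, and sample size are illustrative, not taken from the thread):

```python
# Minimal sketch (illustrative, assumed setup): evaluate the log-likelihood
# sum_i log phi(y_i - x_i * beta) on data simulated from y = x*beta + u,
# u ~ NID(0, 1), and check that maximizing it recovers beta.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)          # illustrative seed
n, beta_true = 200, 2.0                 # illustrative sample size and parameter
x = rng.normal(size=n)
y = x * beta_true + rng.normal(size=n)  # u ~ NID(0, 1)

def neg_log_lik(beta):
    # L(beta | x, y) = prod_i phi(y_i - x_i * beta), so the negative
    # log-likelihood is -sum_i log phi(y_i - x_i * beta).
    return -np.sum(norm.logpdf(y - x * beta))

beta_hat = minimize_scalar(neg_log_lik).x
print(beta_hat)  # close to beta_true, and matches the OLS estimate
```

Maximizing $\sum_i\log\varphi(y_i-x_i\beta)$ is the same as minimizing $\sum_i(y_i-x_i\beta)^2$, so the maximizer coincides with the ordinary least squares estimate.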

0 Answers