
I have a text file with two columns

$$\mathbf{x}=(x_1,\ldots,x_n)$$

$$\mathbf{y}=(y_1,\ldots,y_n)$$

I want to use the following model for the data

$$y_i=A\sin\left(\frac{x_i}{B}\right)+C\epsilon_i,$$

where $\epsilon_i\sim N(0,1)$ and independent.

By guessing I found that $A=5.2$, $B=5.3$ and $C=1.0$ give a pretty good fit. Now I want to write a function in R that computes the likelihood function (the probability of the observed data $y_1,\ldots,y_n$ given the observed values $x_1,\ldots,x_n$ and the parameter values). But before I do that, I need to understand what's going on mathematically.
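For concreteness, here is a minimal R sketch of data generated under this model with the guessed values above (the design points `x` are an assumption; in my case they come from the text file):

```r
# Simulate data from y_i = A*sin(x_i/B) + C*eps_i, eps_i ~ N(0,1) iid
set.seed(1)                                  # reproducibility
A <- 5.2; B <- 5.3; C <- 1.0                 # guessed parameter values
x <- seq(0, 50, length.out = 200)            # hypothetical design points
y <- A * sin(x / B) + C * rnorm(length(x))   # add independent N(0,1) noise scaled by C
```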

The posterior is given by

$$\pi(x_1,\ldots,x_n\mid y_1,\ldots,y_n)=\frac{\color{red}{\pi(y_1,\ldots ,y_n\mid x_1,\ldots,x_n)} \cdot \pi(x_1,\ldots,x_n)}{\pi(y_1,\ldots,y_n)},$$

where $\pi(y_1,\ldots,y_n\mid x_1,\ldots,x_n)$ is the likelihood. But how do I calculate the posterior, evidence and prior here? Any help is appreciated.

Parseval
  • The prior needs to be chosen by you, and you should choose it wisely (non-informative is the term). The posterior is then obtained by looking at a table of conjugate priors (if you chose poorly, you are screwed as far as doing this by hand goes). – user2974951 Dec 05 '18 at 13:46
  • If you assume the $\epsilon_i$ are independent then you can immediately write down the likelihood, purely mechanically, by applying the definitions of "$N(0,1)$" and independence. This has nothing to do with prior or posterior distributions. – whuber Dec 05 '18 at 14:25
  • @user2974951 - So if I want to use a flat prior, can I just let it be proportional to a constant? However, I still don't have my posterior and evidence, so I can't solve for the red-colored part yet. – Parseval Dec 05 '18 at 14:50
  • @whuber - I forgot to add in the question that the $\epsilon_i$ are independent; will edit. I'd be very glad if you could show me how to do this; I have not been able to find a similar example anywhere. Maybe you can link me somewhere where they show some example? – Parseval Dec 05 '18 at 14:52
  • This is a regression problem, so this search will turn up comparable examples: https://stats.stackexchange.com/search?q=likelihood+normal+independent+regression. The first hit arguably is a duplicate: https://stats.stackexchange.com/questions/47040/write-down-the-log-likelihood-function-for-this-model. The answer at https://stats.stackexchange.com/questions/305908/likelihood-in-linear-regression also works, even though it's an overly contorted account of something that is basically simple (IMHO). – whuber Dec 05 '18 at 15:18
  • @user2974951 "the posterior is then obtained by looking at a table of Conjugate-priors". Not quite. 1) You don't have to choose a conjugate prior. In fact, Jeffreys priors are popular priors, improper, and not conjugate to anything. 2) I don't know where such a table exists, and I think that a proper Bayesian training precludes needing such a table. – AdamO Dec 05 '18 at 15:22

1 Answer


You've written down the wrong expression for the posterior for Bayesian regression. Consider the $\vec{x}$ to be fixed by design; thus, you don't need a prior for the $x$. Rather, you need to set a prior for $A$, $B$, and $C$. Define the residual $r = y - \hat{y}$ in the usual way. The likelihood, based on the probability model for $\epsilon$ and given $A$, $B$, and $C$, would be:

$$L(\mathbf{y} \mid A, B, C)= \prod_{i=1}^n \phi\left(\frac{y_i - A \sin(x_i/B)}{C}\right)$$

where $\phi$ is the standard normal density.
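A minimal R sketch of this likelihood, assuming `x` and `y` are the two columns read from the file. Note that `dnorm` with `sd = C` automatically supplies the $1/C$ normalization for each factor; in practice the log-likelihood is preferred to avoid numerical underflow of the product:

```r
# Likelihood of y given parameters A, B, C under y_i = A*sin(x_i/B) + C*eps_i,
# with eps_i ~ N(0,1) independent.
likelihood <- function(A, B, C, x, y) {
  # dnorm(..., sd = C) evaluates (1/C) * phi((y_i - mean)/C) for each point
  prod(dnorm(y, mean = A * sin(x / B), sd = C))
}

# Log-likelihood: sum of log densities, numerically stable for large n
loglik <- function(A, B, C, x, y) {
  sum(dnorm(y, mean = A * sin(x / B), sd = C, log = TRUE))
}
```

With the data loaded, e.g. `loglik(5.2, 5.3, 1.0, x, y)` evaluates the fit at the guessed parameter values.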

AdamO
  • Thanks for this answer, I still don't understand a few things here: (1) In terms of what variable is the normal density expressed? (2) Why did you divide $y-\hat{y}$ by $C$ in the expression of the likelihood? (3) Why does one need to define an $r$? I can't find anything about that in this Wikipedia article: https://en.wikipedia.org/wiki/Bayesian_linear_regression – Parseval Dec 05 '18 at 15:48
  • @Parseval 1) The normal probability model is for the residuals. 2) Because $C$ is a dispersion parameter, and doing this gives standard normal RVs. 3) So one can calculate the likelihood. The term $r$ is a convenient shorthand for $Y-\mathbf{X}\beta$ in the linear regression case. You're not doing linear regression; rather, you're doing spectral regression, since the sinusoidal trend in X predicts the Y. But no matter, subtract observed and expected and get a residual. – AdamO Dec 05 '18 at 16:44
  • @AdamO - Sorry for wasting your time with the stupid questions. Everything makes sense now. Thank you, sir. – Parseval Dec 05 '18 at 17:17
  • @Parseval don't be rough on yourself. When sorting things out in the comments, I tend toward terseness so as to be clearer and more direct. – AdamO Dec 05 '18 at 17:28
  • If it's not too much to ask, could you please check if I got the following log-likelihood function correct in this thread: https://stats.stackexchange.com/questions/380540/check-if-log-likelihood-function-is-correctly-derived ? It's just a continuation of this question. – Parseval Dec 05 '18 at 21:54
  • The likelihood is missing a $C^{-n}$ factor. – Xi'an Dec 06 '18 at 16:08