
I want to fit a multivariate linear regression, with $Y_1, \dots, Y_4$ as the responses and $X_1, \dots, X_n$ as the explanatory variables. $X_1$ and $X_2$ are two components of a mixture, and $X_3, \dots, X_n$ are environmental covariates. My objectives are:

  1. To find which of the $X_i$ significantly affect the $Y_i$.
  2. To investigate covariance among $Y_i$, controlling for the $X_i$.
  3. To predict the values of $Y$ over the whole range of the mixture $(X_1, X_2, 1-X_1-X_2)$.

The $Y$ responses satisfy the usual assumptions for linear models well enough (normality, homoscedasticity).

A paper I've been reading uses the RStan package: the authors define priors, assess convergence, compute posterior predictive checks, and do a lot of other things I do not fully grasp. I am a noob in Bayesian statistics, so to avoid errors and save time I would rather use a simple multivariate regression that I understand better, such as lm(cbind(Y1, ..., Y4) ~ X1 + ... + Xn). I'm fully prepared to delve into Bayesian stats if necessary, though.
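
For concreteness, here is a minimal sketch of the kind of base-R fit I have in mind; the simulated data, the five predictors X1–X5, and the mixture grid are just placeholders for illustration, not my actual data:

```r
set.seed(1)
n  <- 100
X1 <- runif(n); X2 <- runif(n, 0, 1 - X1)        # two mixture components (X1 + X2 <= 1)
X3 <- rnorm(n); X4 <- rnorm(n); X5 <- rnorm(n)   # environmental covariates
Y  <- cbind(Y1 = rnorm(n), Y2 = rnorm(n), Y3 = rnorm(n), Y4 = rnorm(n))  # placeholder responses

# Multivariate linear regression: one lm() fit with a matrix response
fit <- lm(Y ~ X1 + X2 + X3 + X4 + X5)

# Objective 1: which predictors affect the responses
anova(fit)       # multivariate (MANOVA-type) tests per predictor
summary(fit)     # per-response coefficient tables

# Objective 2: covariance/correlation among the Y's after controlling for the X's
cov(residuals(fit))
cor(residuals(fit))

# Objective 3: predictions over the mixture simplex (X1, X2, 1 - X1 - X2),
# holding the environmental covariates at their means
grid <- expand.grid(X1 = seq(0, 1, 0.05), X2 = seq(0, 1, 0.05))
grid <- subset(grid, X1 + X2 <= 1)
grid$X3 <- mean(X3); grid$X4 <- mean(X4); grid$X5 <- mean(X5)
pred <- predict(fit, newdata = grid)   # matrix of predicted Y1..Y4 over the grid
head(pred)
```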

I found this post, but it is more about the differences between RStan and lme4 in general than about how they apply to a specific problem. My question is: for a simple problem like mine, is multivariate regression in base R sufficient? What does RStan allow in this case that base R does not?
