rstan: Diagnostics of regression

Question

I ran a simple normal regression in rstan with some informative priors. My data are heteroskedastic, and I would like to correct for that. However, I am new to Bayesian regression and rstan. My questions are:

Can I use the mean of the estimates of coefficients for forecasting out of sample?
How do I find the standard error of my regression? Can I still do $\sigma^2 (X'X)^{-1}$?
Can I just plot the residuals of the regression against the fitted values to check for heteroskedasticity?

Why do you want to use Bayesian linear regression if you want to use it *exactly the same* as frequentist linear regression? — Tim, Jun 11 '17 at 09:27
What is frequentist linear regression? I can't run a robust regression with trimmed outliers, as the fat tails are not outliers here. — Som Joy, Jun 12 '17 at 06:44

score 1 · Answer 1 · answered Nov 17 '21 at 16:12

Can I use the mean of the estimates of coefficients for forecasting out of sample?

You could, but why? A big selling point of Bayesian modelling is the ability to integrate over uncertainty in the parameters. You would do this by generating from the posterior predictive distribution, namely

$$ p(\tilde{y} \vert y) = \int p(\tilde{y}\vert \theta) p(\theta \vert y) d \theta $$

This integral can be approximated by taking each posterior draw of your coefficients and generating new data from the likelihood at that draw. In Stan, you might write a linear regression and its posterior predictive quantities as

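data{
  int<lower=0> n;   // number of observations
  int<lower=0> p;   // number of predictors (columns of X)
  vector[n] y;      // outcome
  matrix[n, p] X;   // design matrix
}
parameters{
  vector[p] beta;        // regression coefficients
  real<lower=0> sigma;   // residual standard deviation
}
model{
  // <priors_here> is a placeholder: substitute your own priors
  beta ~ <priors_here>;
  sigma ~ <priors_here>;
  y ~ normal(X * beta, sigma);
}
generated quantities{
  // Here is the posterior predictive
  vector[n] y_tilde;
  for(i in 1:n){
    // X[i] is row i of X, so X[i] * beta is a scalar mean
    y_tilde[i] = normal_rng(X[i] * beta, sigma);
  }
}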

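For each posterior draw, Stan computes the linear predictor for every row of X and then draws a Gaussian random variable with that mean and standard deviation sigma. Averaging these draws for a given row gives a point prediction for that row, and their spread gives the predictive uncertainty. For out-of-sample forecasts, pass the new design matrix in as data and generate from it in the same way.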
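As a rough sketch of that averaging step in R: the draws of y_tilde form a matrix with one row per posterior draw and one column per row of X. With a fitted model that matrix would typically come from `rstan::extract(fit)$y_tilde`; here it is simulated so the example runs without a fit, and `S`, `n`, and `true_mean` are made-up values for illustration only.

```r
# Stand-in for rstan::extract(fit)$y_tilde: an S x n matrix with one row per
# posterior draw and one column per row of X. Simulated here (an assumption)
# so the sketch runs without a fitted model.
set.seed(1)
S <- 4000                      # number of posterior draws (hypothetical)
n <- 5                         # number of rows in X (hypothetical)
true_mean <- c(1, 2, 3, 4, 5)  # hypothetical predictive means
y_tilde <- sapply(true_mean, function(m) rnorm(S, mean = m, sd = 1))

# Point prediction: average over draws, column by column.
pred_mean <- colMeans(y_tilde)

# 90% posterior predictive interval for each row of X (2 x n matrix).
pred_int <- apply(y_tilde, 2, quantile, probs = c(0.05, 0.95))

print(round(pred_mean, 2))
print(round(pred_int, 2))
```

Reporting the interval alongside the mean is what distinguishes this from plugging the posterior mean of the coefficients into the regression equation: the interval reflects both parameter uncertainty and residual noise.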

How do I find the standard error of my regression?

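Uncertainty in the parameters can be obtained directly from the posterior draws that Stan returns. The standard deviation of the draws for a coefficient plays the role of its standard error, so there is no need for the $\sigma^2 (X'X)^{-1}$ formula.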
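A minimal sketch of that in R, assuming the coefficient draws are available as a matrix with one row per draw and one column per coefficient (with a fitted model, `rstan::extract(fit)$beta` has that shape). The draws below are simulated placeholders, not output from a real fit.

```r
# Stand-in for rstan::extract(fit)$beta: an S x p matrix of posterior draws.
# The means and sds used to simulate it are hypothetical.
set.seed(2)
S <- 4000
beta_draws <- cbind(rnorm(S, mean = 0.5, sd = 0.1),
                    rnorm(S, mean = -1.2, sd = 0.3))

post_mean <- colMeans(beta_draws)        # posterior mean of each coefficient
post_sd   <- apply(beta_draws, 2, sd)    # posterior sd, the analogue of a standard error
cred_int  <- apply(beta_draws, 2, quantile, probs = c(0.025, 0.975))  # 95% credible intervals

print(round(rbind(mean = post_mean, sd = post_sd, cred_int), 3))
```

Note that in the table returned by `summary(fit)$summary`, the `sd` column is the posterior standard deviation, while `se_mean` is the Monte Carlo error of the posterior mean estimate; the former is the one comparable to a regression standard error.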

Can I just plot the residuals of the regression against the fitted values to check for heteroskedasticity?

That is certainly one way to check the model.
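One way this check might look in R, using the posterior mean of the coefficients to form fitted values. Everything here is illustrative: the data are simulated with noise that grows with `x`, and `beta_hat` is a hypothetical stand-in for the posterior mean you would compute from your own draws.

```r
# Hypothetical heteroskedastic data: the noise sd grows with x.
set.seed(3)
n <- 200
x <- runif(n, 0, 10)
X <- cbind(1, x)                       # design matrix with intercept
y <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)

# Stand-in for the posterior mean of beta (e.g. colMeans of the beta draws).
beta_hat <- c(1, 2)

fitted_vals <- as.vector(X %*% beta_hat)
resid_vals  <- y - fitted_vals

# A fan shape (spread widening with the fitted value) suggests heteroskedasticity.
plot(fitted_vals, resid_vals, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

The same idea can be applied per posterior draw, or to the y_tilde draws, to see whether data replicated from the model show the same spread pattern as the observed data.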

score 0 · Answer 2 · answered Jun 12 '17 at 07:47

Your first question is answered in the thread "Prediction based on bayesian model". Notice that you can make two different kinds of predictions: you can predict the distribution of the outcome, or you can give a point estimate. If you need a point estimate, then you can take the mean, median, mode (i.e. the maximum a posteriori estimate), etc., depending on your needs.

As for standard errors, what kind of errors do you need, and what for? With a Bayesian model you obtain the posterior distribution of your parameters and the posterior predictive distribution of your outcome. From those distributions you can easily obtain interval estimates (see credible intervals). In fact, Bayesian credible intervals give you much more information about the distribution of your outcomes than confidence intervals do, as discussed here.
