I am working on a few regression analyses (both simple and multiple), and in some cases the residuals are non-normal to varying degrees. As I understand it, the Gauss-Markov theorem states that normality of the residuals is not required for the coefficient point estimates to be unbiased, i.e., I can trust that the regression summary gives me the BLUE coefficient estimates. However, the standard errors may be biased, and the corresponding t- and p-values may therefore be incorrect.
A previous question (How to obtain p-values of coefficients from bootstrap regression?) asked whether p-values can be calculated from bootstrapped coefficients and their CIs. My question is slightly different: if I bootstrap the coefficients and then use the bootstrapped mean and the bootstrapped SE to re-calculate the t- and p-values, is that a sound approach?
Here is the code I use, in R:
# fit the linear model
mod <- lm(Y ~ X, data = dataset)
summary(mod)

# function that refits the model on a resampled data set
# and returns its coefficients
mod.bootstrap <- function(data, indices) {
  d <- data[indices, ]
  mod <- lm(Y ~ X, data = d)
  return(coef(mod))
}

# set seed for reproducibility
set.seed(1234)

# run the bootstrap (R = 2000 resamples) using the boot package
library(boot)
mod.boot <- boot(data = dataset, statistic = mod.bootstrap, R = 2000)
# now here is how I re-calculate t- and p-values for the slope
# (column 2 of mod.boot$t holds the bootstrapped slope estimates)
bootmean <- mean(mod.boot$t[, 2])
booter <- sd(mod.boot$t[, 2])  # was sd(boot.t[,2]), which referenced a non-existent object
tval <- bootmean / booter
p <- 2 * pt(-abs(tval), df = mod$df.residual)
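As a sanity check on the recomputed p-value, a percentile CI can be pulled from the same boot object (a sketch; `index = 2` assumes the slope is the second coefficient, as above):

```r
# percentile bootstrap 95% CI for the slope, from the same boot object
ci <- boot.ci(mod.boot, type = "perc", index = 2)
ci$percent[4:5]  # lower and upper limits of the percentile interval
```

If this interval excludes 0, that agrees qualitatively with a small recomputed p-value; if it straddles 0, the p-value should be large.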
The new t- and p-values come out fairly close to those in the lm summary, albeit a bit more conservative, as expected. Is this something I could report? I am very new to this, so I can't really tell whether my logic is valid.
Edit: Clarified a bit.