Reporting results of a bootstrap regression model that has violated the assumption of homoscedasticity

Question

I want to fit a multiple regression to test the relationship between several independent variables and a dependent variable. A few of the independent variables are standard demographic variables; the rest of the IVs are summed scores of responses to Likert items. The DV is also a sum of responses to Likert items.

The books I have read before and during my analysis are Discovering Statistics Using SPSS by Field (4th ed.) and R in Action: Data Analysis and Graphics with R.

The problem is that the homoscedasticity assumption has been violated. To test homoscedasticity, I used ncvTest() from the car package for R and the Breusch-Pagan test from the olsrr package for R. Both show a violation of homoscedasticity. Both books mentioned above suggest that bootstrapping should be used to obtain robust standard errors and confidence intervals when the homoscedasticity assumption is not met. In his book, Field notes that when there is heteroscedasticity, the hypothesis tests may not be valid.

I have bootstrapped using the Boot() function in R, as well as using SPSS. My problem is that I am unable to work out how to present my results after a bootstrap, and which values, other than the standard errors, may or should change as a result of the bootstrap.

I’ll use the output of both R and SPSS as examples to elaborate my question. In R, I know that rlm(), Boot(), and summary(ofboot) will give me the bootstrap standard errors. But when I report my results, should I report the R-squared, F, and t provided by rlm()? Should bootstrapping influence the R-squared, F, and t values? In other words, when I construct the regression results table, I will take the standard errors from the bootstrap results, but can I take the t, F, and coefficient values returned by rlm()?

SPSS provides two coefficient tables, one with and one without the bootstrap. The bootstrap coefficient table does not have a column for t. So, again, when reporting the results, do I report the t from the coefficient table without the bootstrap, the F from the ANOVA table, and the R-squared from the model summary table?

Field's book provides a bootstrap example, but the process is shown graphically, and since I’m totally blind, I cannot see the image to determine whether his book answers this very basic question.

A similar question has been answered before: https://stats.stackexchange.com/q/56870/99274. This question is somewhat of a duplicate. — Carl, Feb 27 '18 at 21:21
@kjetilbhalvorsen Maybe I'm not seeing the forest for the trees, but the question you link asks whether the bootstrap *can* be used for heteroscedastic data. The OP here seems to be saying, "I bootstrapped *because* my data were heteroscedastic... now what?" That is, they're wondering about the applicability of fit statistics and, of course, how to actually construct CIs and p-values from those bootstrapped results. "[the bootstrap] will blow the head off any problem provided you're willing to put the pieces together." — AdamO, Mar 06 '18 at 21:31

score 0 · Answer 1 · answered Feb 27 '18 at 19:53

The bootstrap is resampling with replacement. One can use it to build up a histogram that serves as an approximate empirical distribution of any parameter one wants to examine (see https://stats.stackexchange.com/a/130101/99274). If one runs 1000 bootstrap simulations and sorts the resulting estimates, discarding the 25 largest and the 25 smallest values leaves the middle 95%, whose endpoints form a 95% confidence interval. Similarly, any particular test value has a probability associated with it through its rank position. For example, if a value lies between the 9th and 10th smallest of 1000 values, then the one-tailed probability is approximately between 0.009 and 0.010.
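As a rough illustration of the percentile and rank ideas, here is a minimal sketch in plain Python rather than R. It assumes a single predictor, resampling of whole cases (x, y pairs), and simulated heteroscedastic data; none of these choices come from the question. The helper names (`ols_slope`, `bootstrap_slopes`, `percentile_ci`, `one_tailed_p`) are hypothetical and are not part of car, boot, or SPSS.

```python
import bisect
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_slopes(xs, ys, n_boot=1000, seed=0):
    """Resample cases with replacement; return the sorted slope estimates."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return sorted(slopes)


def percentile_ci(sorted_vals, alpha=0.05):
    """Drop the k smallest and k largest values; return the remaining endpoints."""
    k = round(len(sorted_vals) * alpha / 2)  # 25 when there are 1000 values
    return sorted_vals[k], sorted_vals[-k - 1]


def one_tailed_p(sorted_vals, value):
    """Share of bootstrap estimates at or below `value` (its rank position)."""
    return bisect.bisect_right(sorted_vals, value) / len(sorted_vals)


# Simulated data (an assumption for illustration): noise grows with x,
# so the errors are heteroscedastic.
data_rng = random.Random(42)
xs = [data_rng.uniform(1, 10) for _ in range(200)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0, 0.3 * x) for x in xs]

estimate = ols_slope(xs, ys)
slopes = bootstrap_slopes(xs, ys)
lower, upper = percentile_ci(slopes)
p_below_zero = one_tailed_p(slopes, 0.0)

print(f"slope = {estimate:.3f}, 95% percentile CI = ({lower:.3f}, {upper:.3f})")
print(f"one-tailed bootstrap probability of a slope <= 0: {p_below_zero:.3f}")
```

In this sketch the coefficient comes from the fit to the original data, while the interval and the rank-based probability come from the bootstrap distribution of that coefficient.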
