I am using body mass index (BMI) as a continuous variable. I checked its distribution with statistical tests and found that it is not normally distributed. I think this result is driven by outliers, of which there are a great many. How can I deal with the outliers? I am going to use this variable as the dependent variable in a linear regression.

AdamO
  • Relevant discussions here: https://stats.stackexchange.com/questions/2492/is-normality-testing-essentially-useless https://stats.stackexchange.com/questions/86835/normality-assumption-in-linear-regression https://stats.stackexchange.com/questions/163642/what-to-do-if-residuals-are-not-normally-distributed – AdamO Mar 20 '18 at 19:50
  • BMI is useless. Do you have a measure of waist circumference instead? BMI is no longer a risk factor for metabolic syndrome after adjusting for waist circumference, while waist circumference remains a risk factor. – AdamO Mar 20 '18 at 19:51

1 Answer

First, BMI is a really terrible measure.

Second, nothing in OLS regression requires that the DV be normal. The normality assumption applies to the errors (in practice, checked on the residuals), not to the DV itself. Outliers can still be a problem, but they can be handled with, e.g., quantile regression.
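To illustrate the point about errors versus the DV, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the binary predictor, the coefficients 22 and 8, the error standard deviation of 1.5, and the use of excess kurtosis as a rough normality summary. It is not the asker's data or model. The outcome is clearly non-normal (bimodal) because the predictor shifts its mean, yet the residuals from the OLS fit should look close to normal.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: a binary predictor shifts the mean of the outcome,
# while the errors around each group mean are normal.
n = 2000
x = [random.randint(0, 1) for _ in range(n)]
y = [22.0 + 8.0 * xi + random.gauss(0.0, 1.5) for xi in x]


def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for normally distributed values."""
    m = statistics.fmean(values)
    s2 = statistics.fmean([(v - m) ** 2 for v in values])
    m4 = statistics.fmean([(v - m) ** 4 for v in values])
    return m4 / s2 ** 2 - 3.0


# Closed-form OLS fit for a single predictor.
mx, my = statistics.fmean(x), statistics.fmean(y)
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

kurt_y = excess_kurtosis(y)
kurt_resid = excess_kurtosis(residuals)
print(f"slope = {slope:.2f}")
print(f"excess kurtosis of DV        = {kurt_y:.2f}")
print(f"excess kurtosis of residuals = {kurt_resid:.2f}")
```

With real data, the same idea applies: fit the model first, then inspect the residuals (a Q-Q plot is usually more informative than a formal test) rather than testing the raw DV.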

Peter Flom
  • It's interesting that Pearson rejected BMI = weight / height^3 (thus BMI as a weight-density) in favor of weight / height^2 because it made *his* data look more normal. He failed to project what modern agriculture would do to waistlines. – AdamO Mar 20 '18 at 19:51
  • I am using a multiple linear regression with gender, age, and ethnicity as independent variables, and one of the assumptions is normality of the data. If I were using logistic regression it wouldn't be an issue, I think. What I am asking about is possible approaches I can use. – Christakis Damianou Mar 20 '18 at 19:56
  • No, normality of the data is not an assumption of linear regression. That is incorrect. – Peter Flom Mar 20 '18 at 20:13
  • OK, thanks. I thought that the normal distribution is related to the standard error of the mean, which for my BMI variable is 0.091. Is that acceptable? – Christakis Damianou Mar 20 '18 at 20:24
  • It's impossible to say whether that is acceptable. – Peter Flom Mar 20 '18 at 21:31