How to analyse a continuous response having a bimodal distribution?

Question

I am investigating unconscious racial prejudice as a predictor for guilty or not guilty judgements (Using SPSS).

I have a continuous measure of unconscious racial prejudice (higher numbers equal higher levels of racial prejudice), and I want to see whether it can predict future judgements of guilt. My dependent variable is a scale where 0 = definitely not guilty and 100 = definitely guilty. My sample is not normally distributed, as it clusters around 25 and 75, giving me a binomial distribution. (In other words, people have on average been 50% confident in a guilty decision, or 50% confident in a not guilty decision. I had predicted that people would find it hard to decide and so would have very low levels of confidence. I was wrong! My sample chose either guilty or not guilty in equal numbers, but they were all very confident in the decisions they made!)

Is there any way of analysing my data? I note that binomial regressions and Ordinary Least Squares both need a dichotomous dependent variable. I cannot just divide my group into 'guilty' or 'not guilty', as it is predicted that higher unconscious racial prejudice will predict higher levels of confidence in a guilty verdict, and lower levels of unconscious prejudice will lead to higher levels of confidence in a not guilty verdict.

If anyone could help it would be much appreciated. It's for my psychology honours thesis.

SPSS is not optimal software when you need to use slightly more advanced statistics. In general you can do a regression with finite mixture models or you could choose quantile regression and model upper/lower quantiles apart from using ordinary least squares. Since you're dealing with a bimodal distribution you should probably try to bootstrap your model to avoid issues with normality. — Max Gordon, Oct 13 '12 at 08:55
There are a couple of misunderstandings here. The fact that your responses cluster around 25 & 75 does not make it a "binomial" distribution. The word you may be looking for is *bimodal*. Binomial (logistic) regression *does* require your response to be dichotomous, but OLS reg very much *does not*, nor does it matter that your response is bimodal & non-normal. In OLS reg, only the *residuals* need to be (roughly) normal, w/ how rough is acceptable depending on N. This Q may be helpful: [What if residuals are normally distributed but y is not?](http://stats.stackexchange.com/questions/12262/). — gung - Reinstate Monica, Oct 13 '12 at 14:15
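To make the bootstrap suggestion from the comments concrete, here is a minimal Python sketch (Python rather than SPSS, purely for illustration). It resamples (predictor, response) pairs and takes a percentile interval for the OLS slope. The data are synthetic and made up for this example, and the helper names `ols_slope` and `bootstrap_slope_interval` are hypothetical, not from any package.

```python
import math
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def bootstrap_slope_interval(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))
    slopes.sort()
    lo = slopes[int((1 - level) / 2 * n_boot)]
    hi = slopes[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Synthetic data (an assumption, not the asker's data): each simulated
# rating lands near 25 or 75, and the chance of the upper mode rises
# with the predictor, so the residuals are themselves far from normal.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = []
for xi in x:
    p_upper = 1 / (1 + math.exp(-xi))
    centre = 75 if rng.random() < p_upper else 25
    y.append(centre + rng.gauss(0, 5))

lo, hi = bootstrap_slope_interval(x, y)
print("slope estimate:", ols_slope(x, y))
print("95% percentile bootstrap interval:", (lo, hi))
```

Resampling whole pairs, rather than residuals, avoids assuming that the error distribution is the same at every value of the predictor.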

score -1 · Answer 1 · answered Oct 13 '12 at 12:27

Ordinary least squares (OLS) does not require a dichotomous dependent variable (DV); indeed, it can't be done with one. OLS requires a continuous DV, but it's a question of how close to continuous it needs to be. If your scale can take on any value from 0 to 100, that is probably close enough. Note, though, that OLS makes assumptions about the residuals of the model, not about the distribution of the DV itself.
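As a rough illustration of the residuals point, the following Python sketch simulates a case where the response is clearly bimodal but the OLS residuals are not. Everything here is an assumption made for the example: the data are synthetic (the predictor itself falls into two clusters), and `fit_line` and `excess_kurtosis` are hypothetical helper names.

```python
import random

rng = random.Random(0)
n = 400

# Synthetic predictor with two clusters, so the response is bimodal too.
prejudice = ([rng.gauss(-1.0, 0.2) for _ in range(n // 2)]
             + [rng.gauss(1.0, 0.2) for _ in range(n // 2)])
# Response clusters around 25 and 75, with normal errors around the line.
guilt = [50 + 25 * p + rng.gauss(0, 4) for p in prejudice]

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def excess_kurtosis(v):
    """Sample excess kurtosis: about 0 for normal data, strongly
    negative for a well-separated two-cluster sample."""
    m = sum(v) / len(v)
    var = sum((a - m) ** 2 for a in v) / len(v)
    m4 = sum((a - m) ** 4 for a in v) / len(v)
    return m4 / var ** 2 - 3.0

intercept, slope = fit_line(prejudice, guilt)
residuals = [g - (intercept + slope * p) for p, g in zip(prejudice, guilt)]

print("slope:", slope)
print("excess kurtosis of response:", excess_kurtosis(guilt))
print("excess kurtosis of residuals:", excess_kurtosis(residuals))
```

With real data the residuals may well turn out bimodal too, which is exactly what a residual plot or a check like this would reveal.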

A very flexible alternative to the OLS model, and one that deals well with bounded dependent variables such as yours, is beta regression. See Smithson. I do not know if this can be implemented in SPSS - it can be done in both SAS and R.
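For readers who want to see what beta regression involves, here is a small Python sketch of two ingredients: rescaling the 0 to 100 ratings into the open interval (0, 1), using the commonly cited adjustment (y(n - 1) + 0.5)/n, and the log-likelihood under a mean/precision parameterisation with a logit link. The ratings and predictor values are invented for illustration, the function names are hypothetical, and this is not a fitting routine; in practice you would use an existing implementation in R or SAS.

```python
import math

def squeeze_to_open_unit(scores, lower=0.0, upper=100.0):
    """Rescale bounded scores into (0, 1), pulling exact 0s and 1s inward."""
    n = len(scores)
    unit = [(s - lower) / (upper - lower) for s in scores]
    return [(u * (n - 1) + 0.5) / n for u in unit]

def beta_loglik(y, x, b0, b1, phi):
    """Log-likelihood of a beta regression: logit link for the mean,
    constant precision phi."""
    total = 0.0
    for yi, xi in zip(y, x):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        a, b = mu * phi, (1.0 - mu) * phi  # beta shape parameters
        total += (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
                  + (a - 1.0) * math.log(yi)
                  + (b - 1.0) * math.log(1.0 - yi))
    return total

# Hypothetical ratings and predictor values, made up for illustration.
ratings = [0, 20, 25, 30, 70, 75, 80, 100]
prejudice = [-1.5, -1.0, -0.8, -0.6, 0.6, 0.8, 1.0, 1.5]
y = squeeze_to_open_unit(ratings)

# A positive slope should fit these made-up data better than a negative one.
ll_pos = beta_loglik(y, prejudice, b0=0.0, b1=1.0, phi=10.0)
ll_neg = beta_loglik(y, prejudice, b0=0.0, b1=-1.0, phi=10.0)
print(ll_pos, ll_neg)
```

Fitting the model means maximising this log-likelihood over `b0`, `b1` and `phi`. A single beta distribution with this parameterisation does not by itself guarantee a good fit to a response with two interior modes, so the fit would still need checking.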

Peter Flom


It's not that OLS literally can't be done with a binary Y; it can hardly be done in a way that's consistent with the assumptions underlying it...I don't agree that simply having a Y on a 0 to 100 scale is probably good enough. – rolando2 Feb 10 '13 at 16:21
Well, not *just* a 0 to 100 scale. You would still need to check the assumptions. – Peter Flom Feb 10 '13 at 16:40
Peter, I am curious why you suggest beta regression to handle a *bimodal* dependent variable as described in this question. Is it solely because the DV is bounded? – whuber Feb 10 '13 at 17:00
@whuber Yes, that was why. But it might need something more complex like a mixture model. Smithson adds modeling of the variance as well as the mean. What would you suggest here? – Peter Flom Feb 10 '13 at 18:17
I would take @gung's comments to the question seriously: the first thing is to do *some* kind of quick fit--even by eye--and begin exploring the distribution of the residuals rather than focusing on the distribution of the responses. I would guess the modes are associated with, or indicate, some attribute of the subjects, so it would be interesting to explore any associations between those modes and all available covariates. – whuber Feb 10 '13 at 18:29
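One simple way to start exploring whether the modes are associated with a covariate is sketched below in Python. It assigns each response to the lower or upper mode and compares the covariate across the two groups. The cut at 50 is an assumption based on the question's description of clusters around 25 and 75, and the data are synthetic, generated only so the example runs.

```python
import math
import random

# Synthetic data (an assumption): ratings cluster near 25 or 75, and the
# chance of landing in the upper cluster rises with the predictor.
rng = random.Random(0)
prejudice, rating = [], []
for _ in range(300):
    x = rng.gauss(0, 1)
    p_upper = 1 / (1 + math.exp(-x))
    centre = 75 if rng.random() < p_upper else 25
    prejudice.append(x)
    rating.append(centre + rng.gauss(0, 5))

def mean(values):
    return sum(values) / len(values)

# Assumed cut point between the two modes.
cut = 50
lower = [x for x, y in zip(prejudice, rating) if y < cut]
upper = [x for x, y in zip(prejudice, rating) if y >= cut]

print("lower mode: n =", len(lower), "mean prejudice =", mean(lower))
print("upper mode: n =", len(upper), "mean prejudice =", mean(upper))
```

The same comparison can be repeated for every available covariate, and the within-mode spread of the ratings can then be examined separately from the question of which mode a subject falls into.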
