I have a dataset with the following types of predictors:
- binary (e.g., gender),
- nominal with 3 categories,
- ordinal, and
- continuous
Question:
What is the best way to set up a regression model that includes these different types of variable?
The lm() function in R handles the entire range of linear models, not just multiple regression. All you have to do is make sure each predictor is stored as the right type.
Binary is the special case of nominal where the number of levels is two, so treat it as a two-level factor.
Nominal variables must be stored as factors. They can be coerced to factors from character variables by using factor(). Note that linear models use one of the levels as a baseline, so it gets no coefficient of its own; its effect is absorbed into the intercept and the other levels are estimated relative to it. By default the baseline is the first in your list of levels. If you don't specify the order of the levels they will be put in alphabetical order. You can change the baseline using relevel().
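A minimal sketch of the factor() and relevel() steps described above. The variable `region` and its values are made up for illustration.

```r
# Hypothetical nominal predictor stored as character data
region <- c("north", "south", "west", "south", "north", "west")

# factor() sorts the levels alphabetically unless told otherwise,
# so "north" becomes the baseline here
region_f <- factor(region)
levels(region_f)

# relevel() moves the chosen level to the front, making it the baseline
region_f2 <- relevel(region_f, ref = "south")
levels(region_f2)
```

If you already know the order you want, you can also pass it directly with factor(region, levels = c("south", "north", "west")).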
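Ordinal predictors need to be ordered factors. Use ordered() to coerce characters or factors to ordered factors, and pass the levels argument to set the order explicitly; otherwise the levels are again sorted alphabetically, which is rarely the order you mean.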
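A short sketch of setting up an ordered factor, using a made-up `satisfaction` variable. One thing worth knowing before fitting: with R's default settings, ordered factors get polynomial contrasts rather than the baseline coding used for nominal factors, so the coefficients are labelled .L (linear), .Q (quadratic) and so on.

```r
# Hypothetical ordinal predictor stored as character data
satisfaction <- c("low", "high", "medium", "low", "medium", "high")

# Give the levels explicitly; alphabetical order would be high < low < medium
satisfaction_o <- ordered(satisfaction, levels = c("low", "medium", "high"))
is.ordered(satisfaction_o)
levels(satisfaction_o)

# Inspect the coding lm() will use for this predictor
# (polynomial contrasts under the default options)
contrasts(satisfaction_o)
```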
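Continuous predictors should be numeric (double). Use as.numeric() or as.double() to coerce them; note that double() on its own does not convert anything, it just creates a new vector of zeros of a given length.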
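Putting the pieces together, here is a rough sketch of a model with all four predictor types. Everything in it (the variable names, the simulated data, the outcome `y`) is invented for illustration and is not from the question.

```r
set.seed(1)  # simulated, purely illustrative data
n <- 60

# Pretend age arrived as text, as often happens after importing a file
age_chr <- as.character(sample(18:80, n, replace = TRUE))

dat <- data.frame(
  gender       = factor(rep(c("female", "male"), length.out = n)),
  region       = factor(rep(c("north", "south", "west"), length.out = n)),
  satisfaction = ordered(rep(c("low", "medium", "high"), each = n / 3),
                         levels = c("low", "medium", "high")),
  age          = as.numeric(age_chr)  # coerce to double
)

# Made-up continuous outcome
dat$y <- 2 + 0.05 * dat$age + rnorm(n)

# lm() picks the coding for each predictor from its type
fit <- lm(y ~ gender + region + satisfaction + age, data = dat)
names(coef(fit))
summary(fit)
```

The coefficient names show how each type was handled: one dummy for the binary factor, two for the three-level nominal factor, .L and .Q terms for the ordered factor, and a single slope for age.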
As the comments suggest, it is only by fully understanding and specifying your research design that you will establish what regression method best corresponds to your data.
If your dependent variable is categorical, which seems likely if you are dealing with social data, I would recommend reading extensively from Long and Freese to make an informed choice. Long and Freese use Stata, but equivalent commands exist in both R and SPSS.
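As one example of an R equivalent, a binary dependent variable can be modelled with glm() and a binomial family. This is only a sketch on simulated data; `voted`, `x` and `group` are hypothetical names, and whether a logit model suits your design is exactly the kind of choice the reading above should inform.

```r
set.seed(42)  # simulated, purely illustrative data
n <- 200
x     <- rnorm(n)                                   # continuous predictor
group <- factor(rep(c("a", "b"), length.out = n))   # binary predictor

# Made-up binary outcome generated from a logistic model
voted <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * x))

# Logistic regression: same formula interface as lm()
fit_logit <- glm(voted ~ x + group, family = binomial)
summary(fit_logit)$coefficients
```

For ordinal or multi-category outcomes, polr() in the MASS package and multinom() in the nnet package are possible starting points.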