
I am working with a questionnaire in which a particular answer to one question makes the following three questions not applicable to the respondent. I would like to include all of these variables as independent variables in a regression, but I don't know how. For example:

  1. Do you own a car?
  2. What is the color of the car?
  3. What is the brand of the car?
  4. How old is the car?

So there will be a high correlation between question 1 and questions 2, 3, and 4. I could remove question 1, but the multicollinearity among questions 2, 3, and 4 remains, because everyone who does not own a car has "not applicable" for all three. I would also lose the easily interpretable effect of owning a car.
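To make the problem concrete, here is a minimal sketch with made-up toy data. The responses, the variable names, and the helper `na_indicator` are purely illustrative assumptions, not my real questionnaire. It only shows the dependency I am worried about: the "not applicable" dummy of each follow-up question is exactly the complement of the car-ownership indicator.

```python
# Hypothetical toy responses; None marks "not applicable" for non-owners.
responses = [
    {"owns_car": 1, "color": "red",  "brand": "A",  "age": 3},
    {"owns_car": 1, "color": "blue", "brand": "B",  "age": 7},
    {"owns_car": 0, "color": None,   "brand": None, "age": None},
    {"owns_car": 1, "color": "red",  "brand": "B",  "age": 1},
    {"owns_car": 0, "color": None,   "brand": None, "age": None},
]


def na_indicator(rows, question):
    """Return a 0/1 column that is 1 where the answer is not applicable."""
    return [1 if row[question] is None else 0 for row in rows]


owns_car = [row["owns_car"] for row in responses]
na_color = na_indicator(responses, "color")
na_brand = na_indicator(responses, "brand")
na_age = na_indicator(responses, "age")

# Each "not applicable" dummy equals 1 - owns_car, so together with an
# intercept the columns are linearly dependent (perfect multicollinearity).
color_is_complement = all(n == 1 - o for n, o in zip(na_color, owns_car))

# The three "not applicable" dummies are also identical to one another,
# which is why dropping question 1 alone does not remove the problem.
na_columns_identical = na_color == na_brand == na_age

print(color_is_complement, na_columns_identical)
```

In this toy case both checks come out true, which is the situation described above: the dependency is exact rather than just a high correlation.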

I also can't really impute those values, since they are not missing at random.

Is there a way to solve this, or are there search terms I could use to find an answer?

Janosch
