
I would like to regress the price of food products on three sets of dummy variables:

1. the food product itself (13 products)
2. the country where the food product was priced (119 countries)
3. the continent where the food product was priced (5 continents)

With three sets of dummy variables, I will need to omit one category from each. But will I still run into collinearity issues with the third set ("continent"), causing most statistical packages to drop those dummy variables from their results?

StatsScared
  • If a continent dummy covers all of the included country dummies for that continent, then yes, you'll have perfect multicollinearity. This happens when one predictor variable is an exact linear function of one or more other predictor variables. – AlexK May 29 '19 at 23:59
  • Just to clarify, you mean that if all of my countries belong to one of the five continents I have as variables, then I'd have perfect multicollinearity? – StatsScared May 30 '19 at 00:09
  • Well, to be precise: let's say North America (continent) includes Canada, U.S. and Mexico. If you included all three countries as dummies, then the "North America" dummy would be perfectly collinear with those 3 country dummies and its coefficient could not be estimated. This is because, for every observation, the value of "North America" would just be the sum of the values of the three country dummies. If you have a continent dummy that is not exhaustive of the country dummies for that continent, then you would not have this issue for that continent. – AlexK May 30 '19 at 00:15
  • Thank you, very clear. Would you mind expanding, with an example, on your last sentence: case where a continent dummy is not exhaustive of the country dummies for that continent? – StatsScared May 30 '19 at 00:24
  • This would be the case if you had an observation with a non-missing value for the continent dummy, but no dummy variable for the country that observation came from. With North America, that would happen if you removed/omitted one of those three country dummies, but still coded the continent dummy for all observations in North America. – AlexK May 30 '19 at 00:33
  • Thanks again. But then, wouldn't I be fine if I omit one country and one continent? (I guess not, because there would still be continents with all of their countries present in the regression, correct?) – StatsScared May 30 '19 at 00:35
  • You'd need to consider this on a by-continent basis. You would have to remove one country on each continent, in addition to removing one continent. – AlexK May 30 '19 at 00:45
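
The point made in the comments can be checked numerically. The following is only a minimal sketch on made-up toy data: the five countries, the two continents, and the helper names `matrix_rank` and `design_matrix` are illustrative assumptions, not part of the original question, and the product dummies are left out for brevity. It compares the rank of the design matrix when one country and one continent are omitted overall with the rank when one country is omitted per continent plus one continent.

```python
from fractions import Fraction

# Hypothetical toy data (illustrative only): 5 countries nested in 2 continents.
CONTINENT_OF = {
    "Canada": "North America",
    "USA": "North America",
    "Mexico": "North America",
    "France": "Europe",
    "Spain": "Europe",
}


def matrix_rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(country_dummies, continent_dummies):
    """One row per country: intercept, country dummies, continent dummies.

    Repeating rows (more observations per country) would not change the rank.
    """
    rows = []
    for country, continent in CONTINENT_OF.items():
        row = [1]
        row += [int(country == c) for c in country_dummies]
        row += [int(continent == k) for k in continent_dummies]
        rows.append(row)
    return rows


# Case 1: omit one country (Spain) and one continent (Europe) overall.
# North America still has all three of its countries in the model.
naive = design_matrix(["Canada", "USA", "Mexico", "France"], ["North America"])

# Case 2: omit one country per continent (Mexico, Spain) plus one continent.
fixed = design_matrix(["Canada", "USA", "France"], ["North America"])

for name, X in [("naive", naive), ("fixed", fixed)]:
    print(name, "columns:", len(X[0]), "rank:", matrix_rank(X))
```

In the first case the matrix has 6 columns but rank 5, because the "North America" column equals the sum of the Canada, USA and Mexico columns, so a package would have to drop one of them. In the second case all 5 columns are linearly independent. Note that the second design carries no more information than intercept plus country dummies alone: the continent dummy simply takes the place of one country dummy.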
