When estimating the returns to education, you can measure education either as a quantitative variable (years of schooling) or as a set of indicator variables representing the different levels of education. What are the advantages and disadvantages of each approach?
-
Can you tell us a bit more about the context? With what do you want to compare them? – Maarten Buis Jan 20 '16 at 11:24
-
If you're asking about different schemes for encoding categorical variables there's a useful link [here](http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm) (& note that usage isn't always very clear - see [“Dummy variable” versus “indicator variable” for nominal/categorical data](http://stats.stackexchange.com/q/125608/17230)). If you're asking about whether to discretize continuous predictors see [What is the benefit of breaking up a continuous predictor variable?](http://stats.stackexchange.com/q/68834/17230). If you're asking something else then please edit your question to clarify. – Scortchi - Reinstate Monica Jan 20 '16 at 11:36
-
Welcome to the site. I echo @MaartenBuis's question and add a note that it should be "dummy variables", not "dummys". Finally, why is this tagged with econometrics? – Peter Flom Jan 20 '16 at 11:36
-
Hey, I'll try to be a bit more clear. The question is this: what are the advantages and disadvantages of estimating a model using dummy variables? For example, when estimating the returns to education, you can measure education by adding a quantitative variable such as school years, or by adding dummy variables for the different levels of education. The disadvantages of dummies are more important for me, because the advantages I more or less understand. I tagged econometrics because I am a student of economics. – or levy Jan 20 '16 at 13:14
-
When clarifying your questions please *edit* them rather than leaving important info. to be found (or not) in comments. Tag according to the question's topic, not your background (& if this is homework use `self-study`). Some discussion of how to treat ordinal predictors (such as "level of education") can be found in [Logistic regression and ordinal independent variables](http://stats.stackexchange.com/q/101511/17230) & [Continuous dependent variable with ordinal independent variable](http://stats.stackexchange.com/q/33413/17230). – Scortchi - Reinstate Monica Jan 20 '16 at 13:47
-
@orlevy I edited your question to correspond with what you told us in your comment. This is a big change in the question, so please tell me if this is what you want to know. – Maarten Buis Jan 21 '16 at 10:53
-
Why not [Ordinal variables (Wikipedia)](https://en.wikipedia.org/wiki/Ordinal_data)? See [What is the difference between categorical, ordinal and interval variables?](http://www.ats.ucla.edu/stat/mult_pkg/whatstat/nominal_ordinal_interval.htm) – Winks Jan 21 '16 at 11:31
-
@Winks As an explanatory/right-hand-side/independent/x-variable there is not much to be gained from declaring it ordinal; you would typically end up adding that as a series of indicator variables as well. – Maarten Buis Jan 21 '16 at 13:23
2 Answers
The advantage of years of education over educational levels is that you just get one effect of education; it is a more parsimonious model. Moreover, in some educational systems you could argue that it represents the "investment" in time the respondent made.
However, this won't work in all educational systems. In many European ones, students need to choose early on (e.g. at age 10 in Germany) between different tracks. In those tracked systems, the same number of years of education can correspond to very different levels of education.
If you have the actual years of education, does someone who had to repeat a year really have more education than someone who attained the same level in one go?
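To make the trade-off concrete, here is a minimal sketch in Python on made-up data. The six observations, the level names (`basic`, `vocational`, `academic`) and the helper functions `fit_years` and `fit_levels` are all hypothetical and only meant to illustrate a tracked system in which two tracks take the same number of years.

```python
from statistics import mean

# Hypothetical toy data from a tracked system: (years, level, log wage).
# Both upper tracks take 12 years but lead to different levels.
rows = [
    (9, "basic", 2.0), (9, "basic", 2.2),
    (12, "vocational", 2.4), (12, "vocational", 2.6),
    (12, "academic", 3.0), (12, "academic", 3.2),
]

def fit_years(rows):
    """Simple OLS of log wage on years: one intercept, one slope."""
    xs = [years for years, _, _ in rows]
    ys = [wage for _, _, wage in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def fit_levels(rows, reference="basic"):
    """OLS on level indicators. With a single categorical regressor the
    fitted values are group means, so each coefficient is the difference
    between a level's mean and the reference level's mean."""
    by_level = {}
    for _, level, wage in rows:
        by_level.setdefault(level, []).append(wage)
    means = {level: mean(wages) for level, wages in by_level.items()}
    intercept = means[reference]
    effects = {lvl: m - intercept for lvl, m in means.items() if lvl != reference}
    return intercept, effects

intercept_y, slope = fit_years(rows)
intercept_l, effects = fit_levels(rows)
print("years model: slope per year =", round(slope, 3))
print("level model: effects vs basic =", {k: round(v, 3) for k, v in effects.items()})
```

The years model spends a single parameter on education and therefore predicts the same wage for both 12-year tracks. The level model estimates a separate effect for each track, at the cost of one more parameter in this toy example (and more as the number of levels grows).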

If you are measuring education within a specific country, then I encourage you to throw in as many variables as you have to account for the quality of education. I am sure everyone who's been to school (of course that is everyone here!) remembers that we don't all come out equally educated after 12 years of grade school. Across states/provinces, and even within classrooms, this gap may be huge. Different universities also produce different levels of education; for example, an Ivy League university, on average, produces better alumni than a community college.
If you are doing a cross-country analysis, then you will probably only have years of schooling from the Human Development Indicators available to you. But there Kazakhstan will come out on the same level as Singapore. No offense to the post-Soviet state, just making a point.
In any case, you would have to control for a lot of different variables to reflect the quality of education. I would base the decision whether to use categorical or continuous variables on what goes better with the other variables available to you. In fact, I would do it both ways, check which one produces more convincing and statistically significant results, and pick the best one. There is no crime in looking for good data.
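One rough way to do it both ways and compare is sketched below in Python. The sample is made up, the helpers `rss_years`, `rss_levels` and `aic` are hypothetical names, and the comparison uses the Gaussian AIC (up to an additive constant) instead of statistical significance, which is a substitution on my part and only one of several reasonable criteria.

```python
from math import log
from statistics import mean

# Hypothetical toy sample: (years of schooling, level attained, log wage).
sample = [
    (9, "lower", 2.0), (9, "lower", 2.2),
    (12, "middle", 2.4), (12, "middle", 2.6),
    (16, "upper", 3.5), (16, "upper", 3.7),
]

def rss_years(rows):
    """Residual sum of squares from OLS of log wage on years."""
    xs = [years for years, _, _ in rows]
    ys = [wage for _, _, wage in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

def rss_levels(rows):
    """Residual sum of squares from OLS on level indicators
    (fitted values are the group means)."""
    by_level = {}
    for _, level, wage in rows:
        by_level.setdefault(level, []).append(wage)
    means = {level: mean(wages) for level, wages in by_level.items()}
    return sum((wage - means[level]) ** 2 for _, level, wage in rows)

def aic(rss, n, k):
    """Gaussian AIC up to a constant: n * ln(RSS / n) + 2k (lower is better)."""
    return n * log(rss / n) + 2 * k

n = len(sample)
n_levels = len({level for _, level, _ in sample})
print("AIC, years :", aic(rss_years(sample), n, k=2))
print("AIC, levels:", aic(rss_levels(sample), n, k=n_levels))
```

When every level corresponds to exactly one number of years, as in this sample, the years model is a restricted version of the level model, so the level model can never fit worse in terms of residual sum of squares. The penalty term is what lets the more parsimonious years model win when the extra indicators add little.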
