Fixed effects estimates different number of parameters with different datasets

Question

Consider a simple panel data model of wage determination, with two periods and only one regressor: a dummy for whether the individual lives in an urban or a rural area. Importantly, individuals can switch location between periods. $U$ is urban, $R$ is rural, $1$ is year 1, $2$ is year 2, $i$ indexes individuals, and $t \in \{1,2\}$. The model is:

$$ w_{i,t} = \alpha + \beta U1_{i,t} + \gamma U2_{i,t} + \phi R1_{i,t} + \theta R2_{i,t} + \epsilon_{i,t} $$

where $Ut_{i,t}$ and $Rt_{i,t}$ are dummies indicating whether individual $i$ is in an urban or a rural area in period $t$. (Note: I include all four categories just for clarity. Naturally, the four of them are collinear with the constant.)

For example, the data matrix might look like this:

$$ \begin{array}{cc|ccccc} i & t & \text{constant} & U1_{i,t} & U2_{i,t} & R1_{i,t} & R2_{i,t} \\ \hline 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 1 & 0 & 0 \\ 2 & 1 & 1 & 1 & 0 & 0 & 0 \\ 2 & 2 & 1 & 0 & 0 & 0 & 1 \end{array} $$

Individual 1 remains in an urban area in both periods, whereas individual 2 switches from urban to rural.

I am estimating a model like this using fixed-effects. The issue is the following:

If the dataset has switchers, the software returns estimates for four of the five model parameters ($\alpha, \beta, \gamma, \phi, \theta$); only four because of the constant.
If the dataset has no switchers, the software returns estimates for three of the five model parameters. Of these, at least one is an urban coefficient and at least one is a rural coefficient.

I am trying to understand this difference in the number of estimated parameters. I have been through formulas and matrix algebra, textbooks and Google, and so far I cannot resolve this. Now I want your help!

Further information:

This estimation pattern does not carry over to random effects. Regardless of whether switching exists or not, RE identifies the same number of coefficients. Thus, the result must be due to a combination of switching and the demeaning that is intrinsic to FE.
I've tried different software and commands, and the result holds.
Identification of all coefficients (but one) requires that the matrix $\sum_{i=1}^{N}(\ddot{X_{i}}'\ddot{X_{i}})$ have full rank, where $\ddot{X_{i}}$ is the demeaned data matrix for individual $i$ (Wooldridge (2010), p.304). According to my calculations, if there are no switchers, that matrix is diagonal, with element $(1,1)=NT$ and all remaining diagonal elements equal to $N_{u/r}\frac{T-1}{T}$, where $N_{u/r}$ is the number of individuals in each region. I cannot see how that matrix fails to have full rank, and so cannot see why not all coefficients are calculated. For example, in the case of individual 1 above, $\ddot{X_{i}}$ is:

$$ \begin{array}{ccccc} 1 & 0.5 & -0.5 & 0 & 0 \\ 1 & -0.5 & 0.5 & 0 & 0 \end{array} $$

and $\ddot{X_{i}}'\ddot{X_{i}}$ is:

$$ \begin{array}{ccccc} 2 & 0 & 0 & 0 & 0 \\ 0 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array} $$

Combined with another non-switcher who lives in a rural area, the sum is:

$$ \begin{array}{ccccc} 4 & 0 & 0 & 0 & 0 \\ 0 & 0.5 & 0 & 0 & 0 \\ 0 & 0 & 0.5 & 0 & 0 \\ 0 & 0 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 0 & 0.5 \end{array} $$

This clearly has full rank, and so on. So why aren't all coefficients estimated?
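The rank condition discussed above can be checked numerically. The following is only a minimal sketch, not the routine any particular software uses: it assumes the within transformation subtracts each individual's time average from every column (including the constant), and all function and variable names (`rows_for`, `demean`, `rank`, `within_rank`) are hypothetical. It stacks the demeaned rows for a panel without switchers and for one with switchers, then computes the exact rank of each stacked matrix.

```python
from fractions import Fraction

# Column order: constant, U1, U2, R1, R2 (as in the data matrix above).
def rows_for(loc1, loc2):
    """Two rows (t=1, t=2) for one individual; each location is 'U' or 'R'."""
    row1 = [1, int(loc1 == "U"), 0, int(loc1 == "R"), 0]
    row2 = [1, 0, int(loc2 == "U"), 0, int(loc2 == "R")]
    return [row1, row2]

def demean(rows):
    """Within transformation: subtract the individual's time average per column."""
    T = len(rows)
    means = [Fraction(sum(col), T) for col in zip(*rows)]
    return [[Fraction(x) - m for x, m in zip(row, means)] for row in rows]

def rank(matrix):
    """Exact matrix rank via Gaussian elimination on Fractions."""
    m = [list(r) for r in matrix]
    rk = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                f = m[r][c] / m[rk][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def within_rank(histories):
    """Rank of the stacked demeaned design for a list of (loc1, loc2) histories."""
    stacked = []
    for loc1, loc2 in histories:
        stacked.extend(demean(rows_for(loc1, loc2)))
    return rank(stacked)

no_switchers = [("U", "U"), ("R", "R")]
with_switchers = no_switchers + [("U", "R"), ("R", "U")]

rank_no_switch = within_rank(no_switchers)
rank_switch = within_rank(with_switchers)
print(rank_no_switch, rank_switch)  # prints: 2 3
```

The rank of the stacked demeaned matrix equals the rank of the summed cross-product $\sum_{i}\ddot{X_{i}}'\ddot{X_{i}}$, so under these assumptions the cross-product has rank 2 without switchers and rank 3 with them. That would line up with three and four reported estimates if the software adds its constant on top of the identified slopes.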

Note: I know this is a long and perhaps cumbersome question. But I promise to award a bounty of at least 200 reps to the correct answer (cannot set bounty right after asking). — luchonacho, Sep 09 '16 at 20:23
One note about the structure of your data. Why not restructure the matrix with one row per individual per period? Then, there are lots of ways to code the urban vs rural shifts with a more parsimonious set of dummy variables. http://www.ats.ucla.edu/stat/sas/webbooks/reg/chapter5/sasreg5.htm and also here http://www.ats.ucla.edu/stat/sas/webbooks/reg/chapter5/sasreg5.htm — Mike Hunter, Sep 09 '16 at 20:49
If rural and urban are the only two categories, and an individual cannot be in both during the same time period, why do you need separate indicators for each category? Why not just use $U_{i,t}\in\{0,1\}$? — GeoMatt22, Sep 09 '16 at 20:59
@GeoMatt22 but then how do you get the history of wages for each category? In my setting is given by the dummies and constant. In yours, how do I get the wage rate for rural in period 2? — luchonacho, Sep 10 '16 at 07:30
@luchonacho I may have misunderstood: what is "the wage rate"? Your first equation has $w_{i,t}$ so isn't there a wage rate for each individual in each period? So in the table at the top, only $w_{i=2,t=2}$ contributes to the rural wage rate in period 2? — GeoMatt22, Sep 10 '16 at 08:04
@GeoMatt22 I mean, I can see how much the wage changes for each sector, on each year. If you forget about the constant for a moment, each beta represents the average wage of workers in each sector, on each period (which I call the "wage rate"). — luchonacho, Sep 10 '16 at 08:10
@DJohnson Thanks. I am using simple coding, the one I prefer. Notice that in my case the variable **varies over time**, unlike race in the link you provide. That is why it might look less parsimonious, but it is the same thing. — luchonacho, Sep 10 '16 at 08:12
@GeoMatt22 it seems that if I use your suggestion, I am assuming that the effect of time on both rural and urban wages is the same. In my setting, they are free to be anything. — luchonacho, Sep 10 '16 at 08:33
@luchonacho I tried to explain in an "answer". Please comment there to help clarify, if my understanding is incorrect. — GeoMatt22, Sep 10 '16 at 09:16
Actually, it is not the same thing. Your approach loses the sensitivity of the factors to time and isn't able to capture many desired effects. — Mike Hunter, Sep 10 '16 at 11:09
@DJohnson Can you please elaborate on your last comment? I don't quite understand what you mean. — luchonacho, Sep 10 '16 at 22:50
The two approaches are completely different, have differing degrees of freedom and permit testing of different hypotheses. By using only 1 degree of freedom (your data structure) with multiple predictors, tests of the evolution of parameters over time are not possible. — Mike Hunter, Sep 10 '16 at 23:49

score 1 · Answer 1 · edited Apr 13 '17 at 12:44

I do not fully understand your question, and this does not seem like a complete answer. However, it is too long for another comment, so I will write out my issue here.

Your model is

$$w_{it} = \alpha + \beta_{u,1} D_{i,u,1} + \beta_{u,2} D_{i,u,2} + \beta_{r,1} D_{i,r,1} + \beta_{r,2} D_{i,r,2} + \epsilon_{it}$$

If I understand correctly, during period $t$ individual $i$ is either urban or rural, but not both, i.e.

$$D_{i,u,t} + D_{i,r,t} = 1$$

for $t\in\{1,2\}$.

(UPDATE: Apparently this is known as the "dummy variable trap". See case 4 here.)

So if we define a new dummy for "individual $i$ is urban in period $t$", we have

\begin{align}
U_{i,t} \equiv D_{i,u,t} \implies \beta_{u,t} D_{i,u,t} + \beta_{r,t} D_{i,r,t} &= \beta_{u,t} U_{i,t} + \beta_{r,t} (1-U_{i,t}) \\
&= \beta_{r,t} + ( \beta_{u,t} - \beta_{r,t} ) U_{i,t} \\
&\equiv \Delta{\alpha}_t + \Delta{\beta}_t U_{i,t}
\end{align}

So your model becomes

$$ w_{it} = A + \Delta{\beta}_1 U_{i,1} + \Delta{\beta}_2 U_{i,2} + \epsilon_{it} $$

where $A\equiv\alpha+\Delta{\alpha}_1+\Delta{\alpha}_2$.

So, because you only have two *independent* dummy variables, it seems like you can only estimate two coefficients, plus an intercept. That is, your original four coefficients are not identifiable: you can only estimate the urban-rural differences $\Delta\beta_t$.
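The non-identifiability argument above can be illustrated with a small numerical sketch. It assumes the constraint $D_{i,u,t} + D_{i,r,t} = 1$ used in this answer; the coefficient values and the names `fitted`, `coefs_a`, `coefs_b` are purely hypothetical. Two different coefficient vectors that share the same urban-rural differences produce identical fitted wages for every possible location history, so the data cannot tell them apart.

```python
from itertools import product

def fitted(coefs, d_u1, d_u2):
    """Fitted wage (no error term); rural dummies are 1 minus the urban dummies."""
    alpha, b_u1, b_u2, b_r1, b_r2 = coefs
    d_r1, d_r2 = 1 - d_u1, 1 - d_u2
    return alpha + b_u1 * d_u1 + b_u2 * d_u2 + b_r1 * d_r1 + b_r2 * d_r2

# Hypothetical coefficients: (alpha, beta_u1, beta_u2, beta_r1, beta_r2).
coefs_a = (10, 3, 4, 1, 2)

# Shift both period-1 coefficients up by c and the intercept down by c.
c = 5
coefs_b = (10 - c, 3 + c, 4, 1 + c, 2)

# Compare fitted values over all four (urban in t=1, urban in t=2) combinations.
same_fit = all(
    fitted(coefs_a, u1, u2) == fitted(coefs_b, u1, u2)
    for u1, u2 in product((0, 1), repeat=2)
)

# The urban-rural differences per period are unchanged by the shift.
diffs_a = (coefs_a[1] - coefs_a[3], coefs_a[2] - coefs_a[4])
diffs_b = (coefs_b[1] - coefs_b[3], coefs_b[2] - coefs_b[4])

print(same_fit, diffs_a == diffs_b)  # prints: True True
```

Any value of `c` gives the same result, which is one way to see why only the intercept and the differences $\Delta\beta_t$ are pinned down in this parameterization.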

answered Sep 10 '16 at 09:15

GeoMatt22


Thanks, that is a good advancement. However, recall that **in a model with switchers (urban-rural or rural-urban migration), FE can estimate all coefficients! (all but one if constant included)**. This is a fact, as the software returns an estimation of all of them. So the key distinction here is of a model **with and without switchers**. That is not incorporated in your answer. So there is something wrong/missing. I'm trying to figure out what. – luchonacho Sep 10 '16 at 22:15
When you say "switcher", does that mean $D_{i,u,t} + D_{i,r,t} = 2$ ? That would violate my assumption in the second equation above. – GeoMatt22 Sep 10 '16 at 22:17
No. That still holds (see the data table in the question). Switcher means that $D_{i,u,t} + D_{i,u,t+1} = 1 $ and $D_{i,r,t} + D_{i,r,t+1} = 1 $ (in a model with just two periods). So you are in rural in a period, and in urban in another period. – luchonacho Sep 10 '16 at 22:21
Then I am not sure you can estimate all 4 coefficients in a coupled problem. You could still fit separate models on stratified* samples (*or would the term be "conditional samples"?). – GeoMatt22 Sep 10 '16 at 22:26
Wait a minute. That seems to be the thing! As it turns out, for switchers, $D_{i,u,t}+D_{i,r,t} \neq 1$, whereas for non-switchers, $D_{i,u,t}+D_{i,r,t} = 1$, which explains why you can or cannot estimate them, as you showed! Try expanding your answer on that. Basically, your answer is for non-switchers only. If you expand to switchers, that might be it! I will do it as well in my notes, and come back tomorrow (kind of late here). – luchonacho Sep 10 '16 at 22:48
I just want to be sure we are talking about the same thing: When I say $D_{i,u,t}+D_{i,r,t} = 1$, I mean that $D_{i,u,1}+D_{i,r,1} = 1$ and $D_{i,u,2}+D_{i,r,2} = 1$. So for the $i=2$ case you called "switcher" in your example table, the constraint $D_{i,u,t}+D_{i,r,t} = 1$ still holds. – GeoMatt22 Sep 10 '16 at 22:54
Ok, changed the notation to make it clearer. Now the distinction I made earlier about switchers and non-switchers summing up to one or not holds. – luchonacho Sep 11 '16 at 11:10
