I am thinking about the difference between pooled cross-sectional data and unbalanced panel data, especially in the context of fixed-effects models.

What I can't get my head around is where exactly the difference between these two types of data sets lies. Suppose we draw from a pool of firms/households at 3 points in time. This definition alone would make it seem like pooled cross-sectional data.

However, by chance, the resulting data frame could look like:

Firm year y
A 2000 10
B 2000 12
A 2002 54
C 2002 11
A 2004 123
B 2004 24

That is, we observe firm A in every year, firm B in every year but the second, and firm C only in the second year. In my opinion, such a data set would perfectly match the Wikipedia definition of an unbalanced panel data set, which "is a dataset in which at least one panel member is not observed every period".
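To make the description concrete, here is a minimal sketch (assuming pandas, and assuming the three waves are 2000, 2002 and 2004, with firm B's last observation falling in the third wave) that counts how many periods each firm is observed in. The variable names are my own and purely illustrative.

```python
import pandas as pd

# Toy data from the example above; the years are assumed to be the
# three sampling waves 2000, 2002 and 2004.
df = pd.DataFrame({
    "firm": ["A", "B", "A", "C", "A", "B"],
    "year": [2000, 2000, 2002, 2002, 2004, 2004],
    "y": [10, 12, 54, 11, 123, 24],
})

# Number of distinct periods in the sample.
n_periods = df["year"].nunique()

# Number of distinct periods in which each firm is observed.
obs_per_firm = df.groupby("firm")["year"].nunique()

# A panel is balanced only if every firm is observed in every period.
is_balanced = bool((obs_per_firm == n_periods).all())

# Firms observed more than once carry within-firm variation over time.
repeated_units = obs_per_firm[obs_per_firm > 1].index.tolist()

print(obs_per_firm.to_dict())
print("balanced:", is_balanced)
print("observed more than once:", repeated_units)
```

Under these assumptions, firms A and B are observed repeatedly while C appears only once, which is what makes the data look like an unbalanced panel rather than three unrelated cross-sections.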

However, we started off with something that sounded very much like a pooled cross-section.

Can someone please elaborate a bit further on the difference between the two kinds of data sets? Furthermore, I am specifically interested in when it is possible to use unit fixed effects with pooled cross-sectional data, because it could also happen that we observe 100% distinct units at each point in time, meaning that we would need to estimate N dummies (one per observation) to include unit fixed effects.
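To illustrate the worry in the last sentence, here is a rough sketch (assuming pandas; the data, the regressor `x` and the helper `within` are hypothetical) of the within transformation applied to a sample in which every unit appears exactly once. It is meant only to show the mechanics, not to settle the question.

```python
import pandas as pd

# Hypothetical pooled cross-section: every unit is observed exactly once.
pooled = pd.DataFrame({
    "unit": ["A", "B", "C", "D"],
    "year": [2000, 2000, 2002, 2002],
    "x": [1.0, 2.0, 3.0, 5.0],
    "y": [10.0, 12.0, 54.0, 11.0],
})

def within(df, cols, unit="unit"):
    """Subtract unit-specific means (the fixed-effects 'within' transformation)."""
    return df[cols] - df.groupby(unit)[cols].transform("mean")

demeaned = within(pooled, ["x", "y"])

# With one observation per unit, each unit mean equals the observation itself,
# so demeaning removes all variation in x and y.
all_zero = bool((demeaned == 0).all().all())

# Observations left after spending one dummy per unit.
residual_df = len(pooled) - pooled["unit"].nunique()

print(demeaned)
print("all variation absorbed:", all_zero)
print("observations minus unit dummies:", residual_df)
```

In this sketch the N unit dummies fit the data perfectly, so nothing is left to estimate a slope on `x`; only units observed in more than one period would contribute to a fixed-effects estimate.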

Max