I have to build a regression model to estimate the impact of each of 70 units (these are geographic organizations). We have to control for 39 variables other than unit (the set of controls is determined by the federal government). Since I have the entire population, p-values are not an issue, nor is statistical power. One way to do this would be to create 69 dummy variables for the units (one per unit, with the remaining unit serving as the reference). Of the two dependent variables we will analyze, one is interval and the other has two levels (the federal government determined that we would run a linear probability model for it, so we will still run OLS). Any suggestions?
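To make the dummy-variable setup concrete, here is a minimal sketch in Python with NumPy. Everything in it is a stand-in: the data are simulated, there are 4 units instead of 70 and 2 controls instead of 39, and the helper name `dummy_design` is made up for illustration. It only shows the shape of the design matrix (intercept, one 0/1 column per non-reference unit, then the controls) and an OLS fit on it.

```python
import numpy as np

def dummy_design(unit_ids, controls, reference):
    """Build [intercept | one 0/1 column per non-reference unit | controls]."""
    unit_ids = np.asarray(unit_ids)
    # Every unit except the reference gets its own dummy column.
    units = [u for u in np.unique(unit_ids).tolist() if u != reference]
    columns = [np.ones(len(unit_ids))]
    columns += [(unit_ids == u).astype(float) for u in units]
    columns.append(np.asarray(controls, dtype=float))
    return np.column_stack(columns), units

# Simulated stand-in data (assumption: not the real units or controls).
rng = np.random.default_rng(0)
n = 200
unit_ids = rng.integers(0, 4, size=n)
controls = rng.normal(size=(n, 2))
true_unit_effect = np.array([0.0, 1.0, -0.5, 2.0])
# Noise-free outcome so the fit recovers the simulated effects exactly.
y = 10.0 + true_unit_effect[unit_ids] + controls @ np.array([0.3, -0.7])

X, units = dummy_design(unit_ids, controls, reference=0)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Each dummy coefficient is that unit's difference from the reference unit,
# holding the controls fixed.
unit_coefs = dict(zip(units, beta[1:1 + len(units)]))
```

With the binary dependent variable the same code would apply unchanged, since a linear probability model is just OLS on a 0/1 outcome.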
My goal is to find out how well the units are doing when controlling for a variety of factors. For example, we are trying to determine, for each unit, how much income it generates for its customers given factors such as the age or education of those customers, which the unit has no control over (these are the controls I mentioned). I have studied splines before, but I struggle to interpret them. In any case, the continuous predictors are not what primarily interests us in this analysis; it is the performance of the units. One problem we have is that there is very little theory to build on.
I decided to do what the federal government did, or will do, for this project, which is to run a fixed-effects regression with one dummy for every unit. But being new to fixed-effects regression, I have a follow-up question: do you remove one of the units as a reference level? And if you do, how would you choose that unit? We need to measure the impact of every unit.
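Here is a minimal sketch of what the reference-level question amounts to, again on simulated stand-in data with a made-up helper name (`fit_unit_effects`). As far as I understand it, when an intercept is included one dummy has to be dropped, but the choice of reference unit does not change the fitted values; it only changes what the unit coefficients are measured against. One possible way to get a number for every unit, sketched below, is to re-center the coefficients so each unit is compared with the unweighted average of all units. That re-centering is my assumption about a reasonable summary, not something the federal specification prescribes.

```python
import numpy as np

def fit_unit_effects(unit_ids, controls, y, reference):
    """OLS with one unit dropped; returns centered unit effects and fitted values."""
    unit_ids = np.asarray(unit_ids)
    all_units = np.unique(unit_ids).tolist()
    kept = [u for u in all_units if u != reference]
    X = np.column_stack(
        [np.ones(len(y))]
        + [(unit_ids == u).astype(float) for u in kept]
        + [np.asarray(controls, dtype=float)]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Effects relative to the reference unit (the reference itself is 0).
    relative = {reference: 0.0}
    relative.update({u: float(b) for u, b in zip(kept, beta[1:1 + len(kept)])})
    # Re-center: each unit versus the unweighted mean of all unit effects.
    mean_effect = sum(relative.values()) / len(relative)
    centered = {u: relative[u] - mean_effect for u in all_units}
    return centered, X @ beta

# Simulated stand-in data (assumption: 4 units, 2 controls, made-up effects).
rng = np.random.default_rng(1)
n = 300
unit_ids = rng.integers(0, 4, size=n)
controls = rng.normal(size=(n, 2))
y = (5.0 + np.array([0.0, 1.0, -0.5, 2.0])[unit_ids]
     + controls @ np.array([0.3, -0.7]) + rng.normal(scale=0.5, size=n))

centered_ref0, fitted_ref0 = fit_unit_effects(unit_ids, controls, y, reference=0)
centered_ref3, fitted_ref3 = fit_unit_effects(unit_ids, controls, y, reference=3)

# Same fit and same centered effects, whichever unit is dropped.
same_fit = np.allclose(fitted_ref0, fitted_ref3)
same_effects = all(
    abs(centered_ref0[u] - centered_ref3[u]) < 1e-8 for u in centered_ref0
)
```

In this sketch both `same_fit` and `same_effects` come out true, which suggests the choice of reference unit is a matter of presentation rather than substance, at least for plain OLS.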