I am analysing a panel data set covering 20 years and 55 counties, and I want to run a fixed-effects panel regression. I started with lm and dummy variables, and both the R-squared and the adjusted R-squared were around 0.4. Then I heard about plm. When I fitted the model with it, the R-squared dropped drastically and, even worse, the adjusted R-squared became negative. See the data and the function calls below. Please let me know if I am doing something wrong.
My data frame:
df <- structure(list(table.id = c("alameda, ca, 2003", "alameda, ca, 2004",
"alameda, ca, 2005"), location = c("Alameda, CA", "Alameda, CA",
"Alameda, CA"), year = c(2003, 2004, 2005), search.fund = c(0,
0, 0), search.fund.binary = c(0, 0, 0), time.avg = c(0, 0, 0),
distance.avg = c(0, 0, 0), avg.income.capita = c(40266, 41973,
43594), real.gdp = c(86355025, 88443534, 90705419), unemployment = c(6.8,
5.9, 5.1), education.rate = c(34.9, 34.9, 34.9), urban.id = c(1,
1, 1), no.establishments = c(46548.75, 46623.5, 46254.25),
no.building.permit = c(14828, 15239, 14883), population.size = c(1454163,
1445721, 1441545), no.establishments.capita = c(0.0320106824338124,
0.0322493067472908, 0.0320865807172166), no.building.permit.capita = c(0.0101969311555857,
0.0105407613225512, 0.0103243395107333)), row.names = c(NA,
3L), class = "data.frame")
lm model with dummy variables:
sf.lm.fe.nb <- lm(search.fund ~ education.rate + unemployment +
                    urban.id + no.establishments.capita + no.building.permit.capita +
                    factor(location) + factor(year), data = df)
summary(sf.lm.fe.nb)
plm model:
library(plm)
sf.plm.fe.nb <- plm(search.fund ~ education.rate + unemployment +
                      urban.id + no.establishments.capita + no.building.permit.capita,
                    data = df, model = "within", effect = "twoways",
                    index = c("location", "year"))
summary(sf.plm.fe.nb)
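
To narrow down where the gap comes from, here is a minimal sketch in base R on simulated data (the three rows above are not enough to fit either model). It fits the dummy-variable regression and then the same regression after two-way demeaning, which is my understanding of what a within estimator does for a balanced panel. That equivalence with plm is an assumption on my part, and every name in the block (sim, x, y, demean2) is hypothetical.

```r
# Simulated balanced panel: 55 locations x 20 years (hypothetical data).
set.seed(1)
n_loc  <- 55
n_year <- 20
sim <- expand.grid(location = seq_len(n_loc), year = seq_len(n_year))

# Unobserved location and year effects; x is correlated with the location effect.
loc_effect  <- rnorm(n_loc)[sim$location]
year_effect <- rnorm(n_year)[sim$year]
sim$x <- rnorm(nrow(sim)) + loc_effect
sim$y <- 0.5 * sim$x + 2 * loc_effect + year_effect + rnorm(nrow(sim))

# 1) Dummy-variable regression, as in the lm call above.
lsdv <- lm(y ~ x + factor(location) + factor(year), data = sim)

# 2) Two-way demeaning (valid in this form for a balanced panel only):
#    subtract location means and year means, add back the overall mean.
demean2 <- function(v, g1, g2) v - ave(v, g1) - ave(v, g2) + mean(v)
sim$y_w <- demean2(sim$y, sim$location, sim$year)
sim$x_w <- demean2(sim$x, sim$location, sim$year)
within_fit <- lm(y_w ~ x_w - 1, data = sim)

# Slopes on x from both fits.
slope_lsdv   <- unname(coef(lsdv)["x"])
slope_within <- unname(coef(within_fit)["x_w"])

# R-squared of each fit: raw y versus demeaned y as the baseline.
r2_lsdv   <- summary(lsdv)$r.squared
r2_within <- 1 - sum(resid(within_fit)^2) / sum(sim$y_w^2)

print(c(slope_lsdv = slope_lsdv, slope_within = slope_within))
print(c(r2_lsdv = r2_lsdv, r2_within = r2_within))
```

If the two slopes agree while the second R-squared is lower, the difference would come from what each R-squared measures (the dummies count as explained variance in the first fit but are removed before the second is computed) rather than from a mistake in either call.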