The fixed effects model your estimating is akin to estimating a separate intercept for each sireID
. The unit-specific intercepts don't appear in your summary output, but you can ask for them with the following function: fixef(fixed)
.
Estimating fixed effects along the other dimension (e.g., effect = "twoways
) isn't going to return a global intercept. The main takeaway from the post you referenced is that the overall intercept $\alpha$ is perfectly collinear with $\alpha_i$, as the sum of all the unit-specific effects (i.e., $\alpha_i$'s) is one.
Estimating your equation using lm()
and including dummies for all units will return an intercept, but the intercept is simply the effect for the omitted dummy. To see this in practice, I will use the Produc
dataset (U.S. state production from 1970 to 1986) from the plm
package.
# Load the plm package and the Produc dataset
library(plm)
data("Produc", package = "plm")
# Estimate a one-way fixed effects model
# Extract the state-specific intercepts
mod_within <- plm(hwy ~ gsp, index = c("state", "year"), model = "within", effect = "individual", data = Produc)
summary(fixef(mod_within))
The fixef()
function extracts the fixed effects from a plm
object. The output below reports the unique state-specific intercepts. Remember the effect for Alabama in the first row.
# Summary output from the fixef() function
# These are the state-specific intercepts...
Estimate Std. Error t-value Pr(>|t|)
ALABAMA 7390.71 185.97 39.742 < 2.2e-16 ***
ARIZONA 4749.54 180.96 26.246 < 2.2e-16 ***
ARKANSAS 3761.83 175.90 21.387 < 2.2e-16 ***
CALIFORNIA 37091.91 697.28 53.195 < 2.2e-16 ***
COLORADO 4501.00 186.95 24.076 < 2.2e-16 ***
CONNECTICUT 6142.66 192.52 31.907 < 2.2e-16 ***
DELAWARE 1897.09 171.51 11.061 < 2.2e-16 ***
FLORIDA 13175.73 266.92 49.362 < 2.2e-16 ***
GEORGIA 8331.90 208.42 39.977 < 2.2e-16 ***
IDAHO 2369.09 171.82 13.789 < 2.2e-16 ***
ILLINOIS 22536.50 360.88 62.449 < 2.2e-16 ***
INDIANA 8559.26 213.81 40.033 < 2.2e-16 ***
IOWA 8951.42 184.49 48.520 < 2.2e-16 ***
KANSAS 5983.13 181.48 32.969 < 2.2e-16 ***
KENTUCKY 9925.08 187.95 52.808 < 2.2e-16 ***
LOUISIANA 9972.78 217.73 45.804 < 2.2e-16 ***
MAINE 2205.62 172.15 12.812 < 2.2e-16 ***
MARYLAND 8259.07 198.04 41.704 < 2.2e-16 ***
MASSACHUSETTS 7203.77 225.36 31.965 < 2.2e-16 ***
MICHIGAN 16473.75 288.12 57.178 < 2.2e-16 ***
MINNESOTA 10479.91 199.22 52.604 < 2.2e-16 ***
MISSISSIPPI 5435.86 176.62 30.777 < 2.2e-16 ***
MISSOURI 9406.42 206.89 45.465 < 2.2e-16 ***
MONTANA 3644.11 171.82 21.208 < 2.2e-16 ***
NEBRASKA 4359.62 175.03 24.907 < 2.2e-16 ***
NEVADA 1969.22 172.23 11.434 < 2.2e-16 ***
NEW_HAMPSHIRE 1965.13 171.93 11.430 < 2.2e-16 ***
NEW_JERSEY 10672.72 262.80 40.612 < 2.2e-16 ***
NEW_MEXICO 3158.47 174.12 18.140 < 2.2e-16 ***
NEW_YORK 29833.23 529.86 56.304 < 2.2e-16 ***
NORTH_CAROLINA 7987.01 212.46 37.593 < 2.2e-16 ***
NORTH_DAKOTA 2746.73 171.58 16.008 < 2.2e-16 ***
OHIO 21086.53 317.17 66.484 < 2.2e-16 ***
OKLAHOMA 5029.61 188.38 26.700 < 2.2e-16 ***
OREGON 5566.66 180.65 30.815 < 2.2e-16 ***
PENNSYLVANIA 21292.22 325.81 65.351 < 2.2e-16 ***
RHODE_ISLAND 1783.01 172.00 10.366 < 2.2e-16 ***
SOUTH_CAROLINA 3806.90 179.89 21.162 < 2.2e-16 ***
SOUTH_DAKOTA 3024.36 171.36 17.650 < 2.2e-16 ***
TENNESSE 9005.20 195.23 46.125 < 2.2e-16 ***
TEXAS 24581.81 464.77 52.890 < 2.2e-16 ***
UTAH 3296.01 173.52 18.995 < 2.2e-16 ***
VERMONT 1792.38 171.08 10.477 < 2.2e-16 ***
VIRGINIA 11836.57 212.64 55.666 < 2.2e-16 ***
WASHINGTON 9110.27 198.90 45.804 < 2.2e-16 ***
WEST_VIRGINIA 6054.88 175.35 34.530 < 2.2e-16 ***
WISCONSIN 9116.06 203.39 44.820 < 2.2e-16 ***
WYOMING 2703.02 171.94 15.721 < 2.2e-16 ***
Estimating dummies for each state using lm()
results in equivalent estimates. However, we must drop a state since all the state dummies sum to one. The state
variable in the Produc
dataset is a factor variable; it's ordered alphabetically. R drops the first factor by default, which is Alabama.
# The least squares dummy variable (LSDV) estimator
# Include a dummy variable for each unit (i.e., state)
mod_lsdv <- lm(hwy ~ gsp + as.factor(state), data = Produc)
# Extract the intercept
mod_lsdv$coefficients['(Intercept)']
(Intercept)
7390.71
Note how the indicator for Alabama doesn't necessarily get dropped; it actually gets absorbed into the intercept. The omitted state effect could be any state in the panel.
In short, the intercepts usually aren't the focus in a fixed effects analysis; they're nuisance. In your case, the principal focus should be on X1
and X2
.
In accordance with the suggestion provided by @Helix123, I wanted to note that there is, in fact, a within_intercept()
function in the plm
package. I always found it a bit artificial but it does return an "overall intercept" for within models and its accompanying standard error.
Under the hood, it's actually a weighted mean of the fixed effects. You could calculate this fairly easily in the one-way case using the following code:
# The overall intercept is just a weighted mean of the state effects
weighted.mean(fixef(mod_within), pdim(mod_within)$Tint$Ti)
Simply feed the weighted.mean()
function the individual state effects, followed by a numeric vector of the weights, and you got your overall intercept.