The model coefficients are estimated contrasts, and which contrasts they represent depends on how R codes the levels of the factor density in the design matrix. Take a look at this:
fit <- glm(events ~ as.factor(density), df, family = poisson)
model.matrix(fit)
To see how these contrasts are coded, store the GLM as an object in the workspace and inspect its model matrix. The intercept in this case is now the log event rate when density is equal to 1 (which, in the data you generated, is the log of 1, i.e. close to 0). Each of the remaining parameters, such as the first, which is labeled as.factor(density)2, is the log relative rate comparing events when density is equal to 2 versus density equal to 1.
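As a rough check of this interpretation, the sketch below simulates data and compares the exponentiated coefficients with ratios of observed group means. The data-generating process (event rate equal to density, 200 observations per level, seed 1) is an assumption made for illustration, not your actual data.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed data: density takes values 1..5 and the event rate equals density
df <- data.frame(density = rep(1:5, each = 200))
df$events <- rpois(nrow(df), lambda = df$density)

fit <- glm(events ~ as.factor(density), data = df, family = poisson)

# Observed mean event count at each density level
groupMeans <- tapply(df$events, df$density, mean)

# exp(intercept) is the estimated event rate at density = 1
exp(coef(fit))[1]
groupMeans[1]

# exp of the remaining coefficients: rates relative to density = 1
exp(coef(fit))[-1]
groupMeans[-1] / groupMeans[1]
```

Because the factor model has one parameter per density level, the fitted rates reproduce the group means, so each pair of printed values should agree up to the convergence tolerance of the fit.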
Each of these parameter estimates has a known limiting distribution, by an argument based on the central limit theorem. The theory on this is well understood, but a bit advanced. Consult McCullagh & Nelder, "Generalized Linear Models" for a statement of the result. Basically, as with linear regression, the estimates of the natural parameters in generalized linear models are approximately normally distributed under replications of the study. Thus, we can calculate the limiting distribution under the null hypothesis and calculate the probability of observing model coefficients as inconsistent or more inconsistent with that null hypothesis than what was experimentally obtained. This is very similar to the usual interpretation of a $p$-value as obtained from OLS model parameters, or simple Pearson tests of contingency tables, or the t-test.
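To make the normal approximation concrete, here is a minimal sketch that recomputes the Wald z statistics and two-sided p-values by hand and prints them next to what summary() reports. The simulated data (event rate equal to density, 200 observations per level) are again an assumption for illustration, and the null hypothesis for each coefficient is that it equals 0, i.e. a relative rate of 1.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed data: density takes values 1..5 and the event rate equals density
df <- data.frame(density = rep(1:5, each = 200))
df$events <- rpois(nrow(df), lambda = df$density)

fit <- glm(events ~ as.factor(density), data = df, family = poisson)

est <- coef(fit)              # estimated log rate / log relative rates
se  <- sqrt(diag(vcov(fit)))  # asymptotic standard errors

# Under the null (coefficient = 0) the ratio is approximately standard normal
z <- est / se
p <- 2 * pnorm(-abs(z))       # two-sided tail probability

cbind(estimate = est, std.error = se, z = z, p = p)

# The same quantities as reported by R
summary(fit)$coefficients
```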
Note that, had you removed the as.factor coding of density, you would have estimated an averaged log relative rate comparing values of density differing by 1 unit, and the intercept would have been extrapolated to be the log event rate when density=0, which may or may not be a useless quantity. The log relative rates in the data you generated are not constant, so the model effects would represent an "averaged effect".
For instance:
## the actual relative rates comparing subsequent density values
relRates <- exp(diff(log(1:5)))
modelFit <- glm(events ~ density, data=df, family=poisson)
## model based relative rate, weighted by random data
exp(coef(modelFit))[2]
## the approximate average log relative rate, converted to relative rate
exp(mean(log(relRates)))
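For anyone who wants to run the comparison without the original data, here is a self-contained sketch. It assumes the event rate equals density for density 1 through 5, with 200 simulated observations per level; those choices are illustrative only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed data: density takes values 1..5 and the event rate equals density
df <- data.frame(density = rep(1:5, each = 200))
df$events <- rpois(nrow(df), lambda = df$density)

# True relative rates between successive density values: 2/1, 3/2, 4/3, 5/4
relRates <- exp(diff(log(1:5)))
relRates

# Single model-based relative rate per unit increase in density
modelFit <- glm(events ~ density, data = df, family = poisson)
exp(coef(modelFit))[2]

# Geometric mean of the true relative rates, for comparison
exp(mean(log(relRates)))

# exp(intercept): the extrapolated event rate at density = 0
exp(coef(modelFit))[1]
```

The single slope should land somewhere between the smallest and largest of the successive relative rates, which is the sense in which it is an "averaged effect".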