Let's say that I generate observations from logistic regression:
n <- 1000  # number of observations
p <- 5     # number of covariates
u <- dnorm(runif(n * p, min = 0, max = 1))   # density values of uniform draws (not normal random draws)
x <- matrix(u, nrow = n, byrow = TRUE)
b <- c(2, 1, 2, 3, 6, -3)                    # true coefficients, intercept first
prob <- 1/(1 + exp(-b %*% t(cbind(1, x))))   # inverse logit of the linear predictor
y <- rbinom(n, 1, as.vector(prob))           # Bernoulli outcomes
But now if I fit a model:
df <- data.frame(y, x)
mod <- glm(y ~ ., data = df, family = "binomial")
coefficients(mod)
The coefficients I get are:
(Intercept) X1 X2 X3 X4 X5
-7.0004768 16.5132458 -0.6912072 11.7006775 1.5552072 7.0992499
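For what it's worth, simulating a fresh data set of the same size from the same generator and refitting gives noticeably different numbers again, which is part of what confuses me (a self-contained sketch of the same setup; the exact estimates depend on the seed):

```r
set.seed(123)
n <- 1000; p <- 5
b <- c(2, 1, 2, 3, 6, -3)                     # same true coefficients as above
u2 <- dnorm(runif(n * p, min = 0, max = 1))
x2 <- matrix(u2, nrow = n, byrow = TRUE)
prob2 <- 1/(1 + exp(-b %*% t(cbind(1, x2))))
y2 <- rbinom(n, 1, as.vector(prob2))
coefficients(glm(y2 ~ x2, family = "binomial"))  # estimates vary run to run
```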
This may well be a very basic question showing a lack of knowledge, but why are the coefficients returned by glm different from those in b? A shot in the dark: when generating y I rely on probabilities fed into a binomial draw, so there is room for randomness. Am I right? In the sense that drawing four Bernoulli(0.5) values, i.e. rbinom(4, 1, 0.5), I could get e.g. both (1, 1, 0, 0) and (0, 1, 0, 0).
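To make that last point concrete, here is the kind of run-to-run variation I mean (a minimal sketch; the specific vectors depend on the seed):

```r
set.seed(1)
draws1 <- rbinom(4, 1, 0.5)  # four Bernoulli(0.5) values
set.seed(2)
draws2 <- rbinom(4, 1, 0.5)  # same distribution, typically a different vector
draws1
draws2
```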