As far as I know, the difference between a logistic model and a fractional response model (frm) lies in the dependent variable (Y): for the frm, Y takes values in the interval [0, 1], whereas for the logistic model Y is binary, {0, 1}. Further, the frm uses a quasi-likelihood estimator to determine its parameters.
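For reference, the quasi-likelihood usually meant here is the Bernoulli quasi-log-likelihood with a logistic mean function. The notation below (x_i for the covariate row, \beta for the coefficients, G for the logistic function) is my own assumption and is not taken from any particular package:

```latex
\ell_i(\beta) = y_i \log G(x_i \beta) + (1 - y_i) \log\bigl[1 - G(x_i \beta)\bigr],
\qquad G(z) = \frac{e^{z}}{1 + e^{z}}, \qquad y_i \in [0, 1].
```

This has the same form as the logistic log-likelihood; the only change is that y_i may be any value in [0, 1] rather than just 0 or 1.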
Normally, we can use glm to obtain logistic models via glm(y ~ x1+x2, data = dat, family = binomial(logit)). For the frm, we change family = binomial(logit) to family = quasibinomial(logit).
I noticed we can also use family = binomial(logit) to obtain the frm's parameters, since it gives the same estimated values. See the following example:
library(foreign)
mydata <- read.dta("k401.dta")
glm.bin <- glm(prate ~ mrate + age + sole + totemp,
data = mydata, family = binomial('logit'))
summary(glm.bin)
This returns:
Call:
glm(formula = prate ~ mrate + age + sole + totemp,
family = binomial("logit"),
data = mydata)
Deviance Residuals:
Min 1Q Median 3Q Max
-3.1214 -0.1979 0.2059 0.4486 0.9146
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.074e+00 8.869e-02 12.110 < 2e-16 ***
mrate 5.734e-01 9.011e-02 6.364 1.97e-10 ***
age 3.089e-02 5.832e-03 5.297 1.17e-07 ***
sole 3.636e-01 9.491e-02 3.831 0.000128 ***
totemp -5.780e-06 2.207e-06 -2.619 0.008814 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1166.6 on 4733 degrees of freedom
Residual deviance: 1023.7 on 4729 degrees of freedom
AIC: 1997.6
Number of Fisher Scoring iterations: 6
And for family = quasibinomial('logit'):
glm.quasi <- glm(prate ~ mrate + age + sole + totemp,
data = mydata, family = quasibinomial('logit'))
summary(glm.quasi)
This returns:
Call:
glm(formula = prate ~ mrate + age + sole + totemp,
family = quasibinomial("logit"),
data = mydata)
Deviance Residuals:
Min 1Q Median 3Q Max
-3.1214 -0.1979 0.2059 0.4486 0.9146
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.074e+00 4.788e-02 22.435 < 2e-16 ***
mrate 5.734e-01 4.864e-02 11.789 < 2e-16 ***
age 3.089e-02 3.148e-03 9.814 < 2e-16 ***
sole 3.636e-01 5.123e-02 7.097 1.46e-12 ***
totemp -5.780e-06 1.191e-06 -4.852 1.26e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for quasibinomial family taken to be 0.2913876)
Null deviance: 1166.6 on 4733 degrees of freedom
Residual deviance: 1023.7 on 4729 degrees of freedom
AIC: NA
Number of Fisher Scoring iterations: 6
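Comparing the two summaries above, each quasibinomial standard error equals the binomial one multiplied by the square root of the estimated dispersion (for the intercept, 8.869e-02 × sqrt(0.2913876) ≈ 4.788e-02). Here is a minimal sketch of that check on simulated data, since k401.dta is not reproduced here; the data frame sim and its columns are hypothetical:

```r
# Hypothetical simulated data with a fractional outcome in (0, 1).
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)
mu <- plogis(0.5 + 0.8 * x1 - 0.3 * x2)
y  <- rbeta(n, shape1 = mu * 5, shape2 = (1 - mu) * 5)
sim <- data.frame(y, x1, x2)

# binomial() warns about non-integer successes when y is fractional;
# the fit is still computed, so the warning is silenced here.
fit.bin   <- suppressWarnings(
  glm(y ~ x1 + x2, data = sim, family = binomial("logit")))
fit.quasi <- glm(y ~ x1 + x2, data = sim, family = quasibinomial("logit"))

se.bin   <- summary(fit.bin)$coefficients[, "Std. Error"]
se.quasi <- summary(fit.quasi)$coefficients[, "Std. Error"]
phi      <- summary(fit.quasi)$dispersion  # estimated dispersion parameter

# The point estimates are identical ...
same.coef <- all.equal(coef(fit.bin), coef(fit.quasi))
# ... and the SEs differ only by the factor sqrt(phi).
same.se   <- all.equal(unname(se.quasi), unname(se.bin * sqrt(phi)))
print(c(same.coef = same.coef, same.se = same.se))
```

So the two families fit the same mean model; quasibinomial only rescales the covariance matrix by the estimated dispersion instead of fixing it at 1.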
The estimated coefficients (betas) from the two families are the same; the difference is in the SE values. However, to obtain the correct SEs, we have to use library(sandwich), as in this post.
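To make concrete what a robust (sandwich) SE computes for this model, here is a hand-rolled base-R sketch of the simplest heteroskedasticity-robust form (often labelled HC0) for a logit-link fit. It is only an illustration on hypothetical simulated data, not the code from the linked post; in practice the sandwich package's vcovHC() would be the usual route.

```r
# Hypothetical simulated data with a fractional outcome in (0, 1).
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)
mu.true <- plogis(0.5 + 0.8 * x1 - 0.3 * x2)
y  <- rbeta(n, shape1 = mu.true * 5, shape2 = (1 - mu.true) * 5)
sim <- data.frame(y, x1, x2)

fit <- glm(y ~ x1 + x2, data = sim, family = quasibinomial("logit"))

X  <- model.matrix(fit)  # n x p design matrix
mu <- fitted(fit)        # fitted means G(x_i' beta)

# "Bread": inverse of X' W X with logit weights W = mu * (1 - mu).
bread <- solve(crossprod(X, X * (mu * (1 - mu))))
# "Meat": sum over i of (y_i - mu_i)^2 * x_i x_i'.
meat  <- crossprod(X * (sim$y - mu))

vcov.robust <- bread %*% meat %*% bread
se.robust   <- sqrt(diag(vcov.robust))

# Model-based (dispersion-scaled) SEs next to the robust ones.
print(cbind(estimate  = coef(fit),
            se.model  = sqrt(diag(vcov(fit))),
            se.robust = se.robust))
```

The robust SEs do not depend on whether binomial or quasibinomial was used for the fit, because the bread and meat are built only from the fitted means and the raw residuals.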
Now, my questions:
- What is the difference between these two pieces of code?
- Is the frm only about obtaining robust SEs?
If my understanding is not correct, please give some suggestions.