I carried out a binary logistic regression using glm(). Below you can see the (modified) output.
I included -1 in the model formula so that all levels are displayed, including the baseline that is used as the reference category.
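For reference, here is a minimal sketch of what removing the intercept does to a design matrix. The data frame `d` and the factors `f1` and `f2` are made up for illustration and are not my real variables. As far as I can tell, only the first factor in the formula gets a column for every level; later factors still lose their reference level.

```r
# Hypothetical toy data with two factors (not the real survey variables)
d <- data.frame(
  f1 = factor(c("a", "b", "c", "a", "b", "c")),
  f2 = factor(c("x", "y", "x", "y", "y", "x"))
)

# With an intercept: both factors drop their first level
with_int <- colnames(model.matrix(~ f1 + f2, data = d))
print(with_int)

# With -1: f1 keeps all three levels, f2 still drops "x"
no_int <- colnames(model.matrix(~ f1 + f2 - 1, data = d))
print(no_int)
```

This would be consistent with a factor that appears later in the formula still showing one level fewer than it has.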
My questions are:
1) Some groups are still missing. For example, "stfwrk" has 7 levels but only 6 are shown, and "StatEmplyd" belongs to a dichotomous variable (2 levels) but only one level is shown.
2) What do the suffixes on the "SEF" terms mean (".L", ".Q", "^4", etc.)? Why doesn't the output display the defined levels, as it does for "eduSecondary", for example? (The variable containing the "edu" items was created in much the same way as the SEF item, except that it is a character variable.) I assume it has to do with the cut() call and its labels argument. Is there another way to have these labels assigned at the exact cut points? Or how do I get the model to display "A", "B", etc.?
This is how I created the "SEF" item:
The original variable "SEFcls" is a factor with 488 levels. I used cut() to collapse the 488 levels into 5 groups:
SEF <- as.numeric(SEFcls)
SEF <- cut(SEF, breaks = c(1, 8, 52, 171, 279),
           labels = c("A", "B", "C", "D", "E"),
           ordered_result = TRUE, right = TRUE)
The variable SEF is now an ordered factor (Ord.factor) with 5 levels. When I run table() or summary() on it, I get the correct counts with the previously assigned labels:
A B C D E
3411 2098 1744 1120 141
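To make the issue reproducible, here is a small sketch of the same construction on toy data. The vector `codes`, the break points, and the names `sef_ord` and `sef_fac` are assumptions for illustration only; in particular, the breaks are invented, since cut() needs six break points to produce five labelled intervals. It compares the coefficient names R generates for an ordered factor with those for the same grouping stored as an unordered factor.

```r
# Hypothetical integer codes standing in for as.numeric(SEFcls)
codes <- c(3, 8, 20, 52, 100, 171, 200, 250, 260, 279)

# Five labels require six break points; these particular breaks are made up
sef_ord <- cut(codes,
               breaks = c(0, 8, 52, 171, 250, 279),
               labels = c("A", "B", "C", "D", "E"),
               ordered_result = TRUE, right = TRUE)

# Ordered factors use polynomial contrasts (contr.poly) by default,
# which is where names such as .L, .Q, .C and ^4 come from
poly_names <- colnames(contrasts(sef_ord))
ord_cols <- colnames(model.matrix(~ sef_ord))
print(ord_cols)

# The same grouping as an unordered factor uses treatment contrasts,
# so the level labels appear in the column names
sef_fac <- factor(sef_ord, levels = levels(sef_ord), ordered = FALSE)
fac_cols <- colnames(model.matrix(~ sef_fac))
print(fac_cols)
```

If this carries over to my data, passing the SEF variable to glm() as an unordered factor (or leaving out ordered_result = TRUE in cut()) should make the model show "B", "C", etc. relative to a reference level, though I am not sure whether that is the recommended approach for an ordinal predictor.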
Output:
Deviance Residuals:
Min 1Q Median 3Q Max
-2.2818 -0.4392 -0.2883 -0.0802 3.4007
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.146810 0.390622 -2.936 0.00333 **
age -0.010576 0.001831 -5.777 7.59e-09 ***
gendrFem -0.297994 0.058344 -5.108 3.26e-07 ***
stfwrk1 0.084032 0.384767 0.218 0.82712
stfwrk2 -0.332194 0.319690 -1.039 0.29875
stfwrk3 -0.157778 0.291956 -0.540 0.58891
stfwrk4 0.326489 0.282949 1.154 0.24855
stfwrk5 0.125305 0.265983 0.471 0.63757
stfwrk6 0.299977 0.269225 1.923 0.05448 .
hlthGood -0.033777 0.071748 -0.471 0.63780
hlthFair 0.169100 0.084155 2.009 0.04450 *
hlthBad -0.054457 0.132281 -0.412 0.68058
hlthVery bad 0.176113 0.240020 0.734 0.46311
SEF.L 0.694256 0.214092 3.243 0.00118 **
SEF.Q -0.042545 0.218816 -0.194 0.84584
SEF.C -0.165405 0.196888 -0.840 0.40085
SEF^4 -0.048406 0.154774 -0.313 0.75447
SEF^5 -0.037543 0.125438 -0.299 0.76471
SEF^6 0.004365 0.096144 0.045 0.96379
SEF^7 0.175295 0.083457 2.100 0.03569 *
eduSecondary -0.058418 0.152025 -0.384 0.70078
eduSnrClass 0.151239 0.126941 1.191 0.23349
eduSnrClass -0.081495 0.117589 -0.693 0.48828
eduThirdlvl -0.581437 0.131859 -4.410 1.04e-05 ***
eduDctrl -0.836041 0.390912 -2.139 0.03246 *
StatEmplyd -0.155013 0.088234 -1.757 0.07894 .
sclLess than once a month -0.044219 0.238654 -0.185 0.85301
sclOnce a month 0.183115 0.236095 0.776 0.43799
sclSeveral times a month 0.108922 0.231849 0.470 0.63850
sclOnce a week -0.009763 0.233962 -0.042 0.96671
sclSeveral times a week -0.031426 0.233323 -0.135 0.89286
sclEvery day 0.072457 0.242567 0.299 0.76516
cntryB -1.560045 0.180680 -8.634 < 2e-16 ***
cntryCz -1.683876 0.194952 -8.637 < 2e-16 ***
cntryGer -1.282113 0.150699 -8.508 < 2e-16 ***
cntryDen 0.151659 0.137432 1.104 0.26980
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 13733 on 20443 degrees of freedom
Residual deviance: 10185 on 20384 degrees of freedom
(19741 observations deleted due to missingness)
AIC: 10305
Number of Fisher Scoring iterations: 17