I have data from a survey in which I collected demographic information and quiz scores from college students. I ran backward model selection to determine which of my variables affect knowledge. My best model includes the variables Gender, Religion, Political party, School type * Hometown region, Curriculum, and Class size. Now I'm interested in determining whether there are significant differences between the categories of these variables (e.g., Democrats vs. Republicans, Democrats vs. 3rd Party, etc.). Here is the model output:
                                       Estimate Std. Error z value Pr(>|z|)
(Intercept)                             1.12742    0.11051  10.202  < 2e-16 ***
GenderMale                              0.27761    0.03665   7.574 3.62e-14 ***
ReligionNone                            0.38754    0.04057   9.553  < 2e-16 ***
ReligionPolytheist                      0.03883    0.08426   0.461 0.644918
PoliticalOther                         -0.05100    0.04064  -1.255 0.209471
PoliticalRepublican                    -0.26154    0.04931  -5.304 1.13e-07 ***
School_TypePublic                       0.31366    0.11628   2.697 0.006987 **
Home_RegionNortheast                    0.42175    0.15982   2.639 0.008318 **
Home_RegionSouth                        0.25988    0.12583   2.065 0.038897 *
Home_RegionWest                         0.47569    0.14199   3.350 0.000808 ***
CurriculumMajor                         0.14399    0.04894   2.942 0.003257 **
CurriculumNone                         -0.15485    0.04176  -3.708 0.000209 ***
Class_SizeAbove 400                     0.18364    0.04575   4.014 5.97e-05 ***
Class_SizeBetween 201 and 400           0.10113    0.04470   2.263 0.023664 *
School_TypePublic:Home_RegionNortheast -0.38518    0.17365  -2.218 0.026545 *
School_TypePublic:Home_RegionSouth     -0.42587    0.13869  -3.071 0.002136 **
School_TypePublic:Home_RegionWest      -0.46535    0.15436  -3.015 0.002573 **
I understand that the Intercept corresponds to the reference level of every variable at once (GenderFemale, ReligionMonotheist, PoliticalDemocrat, etc.). My question is this: since the intercept combines the reference levels of several variables, can I use the reported p-values to decide whether there is a difference between or among genders, religions, political parties, school types, hometown regions, curricula, and class sizes? That is, can I say that there is a significant difference between males and females (P < 0.005), or a non-significant difference between monotheists and polytheists (P = 0.645)?
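To show how I am currently reading the table: as far as I understand, each z value is the Estimate divided by its Std. Error, and each p-value is the matching two-sided normal tail probability. Here is a quick check in R using numbers copied from two rows of the output (the object names `est`, `se`, `z`, and `p` are just mine for illustration):

```r
# Numbers copied from the GenderMale and ReligionPolytheist rows of the output
est <- c(GenderMale = 0.27761, ReligionPolytheist = 0.03883)
se  <- c(GenderMale = 0.03665, ReligionPolytheist = 0.08426)

# Wald z statistic: estimate divided by its standard error
z <- est / se

# Two-sided p-value from the standard normal distribution
p <- 2 * pnorm(-abs(z))

print(round(z, 3))
print(signif(p, 3))
```

Up to rounding of the copied inputs, these should agree with the z value and Pr(>|z|) columns for those two rows.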
If so, can I use the relevel function to change the reference level of a variable and refit the model, so that I can check the significance of every pairwise comparison among its levels (e.g., ReligionPolytheist vs. ReligionNone)?
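For concreteness, this is a minimal sketch of the releveling I have in mind. It uses made-up data: the data frame `d`, the column `Religion2`, the simulated `Score`, and the Poisson family are all placeholders for illustration, not my actual survey data or model.

```r
set.seed(1)

# Hypothetical stand-in data (not the real survey)
n <- 300
d <- data.frame(
  Gender   = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  Religion = factor(sample(c("Monotheist", "None", "Polytheist"), n, replace = TRUE))
)
d$Score <- rpois(n, lambda = exp(1 + 0.3 * (d$Religion == "None") +
                                   0.2 * (d$Gender == "Male")))

# Original fit: the Religion reference level is "Monotheist" (first level alphabetically)
fit <- glm(Score ~ Gender + Religion, family = poisson, data = d)

# Make "None" the reference level and refit
d$Religion2 <- relevel(d$Religion, ref = "None")
fit2 <- glm(Score ~ Gender + Religion2, family = poisson, data = d)

# The Religion2Polytheist row now compares Polytheist with None
print(coef(summary(fit2)))
```

As far as I can tell, the refit is the same model with a different parameterization: the Religion2Monotheist estimate in `fit2` is just the negative of the ReligionNone estimate in `fit`, and only the set of comparisons printed in the summary changes.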
I may be overthinking this, but if someone could clarify this output interpretation for me it would be greatly appreciated.
Thank you,
Sara