Let's say I have a data like this, and I'm trying to build a mixed model.
studentId | courseId | courseName | year | courseGroup | timespent | count | mark
stud1 | 19 | M101 | 2008 | F | 12.3 | 23 | 3.7
stud1 | 21 | E102 | 2008 | C | 2.3 | 15 | 4
stud1 | 109 | H300 | 2008 | E | 22.3 | 5 | 3
stud2 | 19 | M101 | 2008 | F | 3.3 | 45 | 3
stud2 | 21 | E102 | 2008 | C | 12.3 | 56 | 3.3
stud3 | 200 | M101 | 2009 | F | 12.3 | 21 | 3.7
the full model would be:
lmer.model.full <- mark ~ courseGroup + timespent + count + courseGroup:(timespent+count) + (1|studentId) + (1|courseName/courseId)
Running the model results with the following summary:
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 3.909e+00 1.282e-01 8.089e+02 30.491 < 2e-16 ***
courseGroupE -2.404e-01 1.835e-01 6.417e+02 -1.311 0.19049
courseGroupF -1.105e-01 1.493e-01 2.246e+02 -0.740 0.46008
timespent -1.552e-02 5.065e-02 1.872e+03 -0.306 0.75932
count -5.244e-02 5.409e-02 1.869e+03 -0.969 0.33244
courseGroupE:timespent 8.740e-02 5.184e-02 1.823e+03 1.686 0.09196 .
courseGroupF:timespent 2.350e-03 3.992e-02 1.881e+03 0.059 0.95308
courseGroupE:count -6.546e-02 2.673e-02 1.158e+03 -2.449 0.01446 *
courseGroupF:count -7.015e-02 2.470e-02 1.373e+03 -2.840 0.00457 **
In order to proceed with the model selection using backward elimination, I should continue by removing the one with the highest p-value, which is timespent. But, the interaction between timespent and courseGroup is marginally significant, might become significant in later iterations. On the other hand, p-value for interaction between F group and timespent is 0.95. What should I do in this case and how to proceed further?
One more thing that puzzle me... what should I be doing in this case:
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 3.909e+00 1.282e-01 8.089e+02 30.491 < 2e-16 ***
courseGroupE -2.404e-01 1.835e-01 6.417e+02 -1.311 0.16049
courseGroupF -1.105e-01 1.493e-01 2.246e+02 -0.740 0.96008
timespent -1.552e-02 5.065e-02 1.872e+03 -0.306 0.25932
count -5.244e-02 5.409e-02 1.869e+03 -0.969 0.33244
courseGroupE:timespent 8.740e-02 5.184e-02 1.823e+03 1.686 0.09196 .
courseGroupF:timespent 2.350e-03 3.992e-02 1.881e+03 0.059 0.25308
courseGroupE:count -6.546e-02 2.673e-02 1.158e+03 -2.449 0.01446 *
courseGroupF:count -7.015e-02 2.470e-02 1.373e+03 -2.840 0.00457 **
Should I remove courseGroup from the model, leaving only with timespent and count, or remove count? What would be the best indicator that courseGroup should be excluded from the model?