I'm using matched pairs logistic regression (1-1 matched case-control; Hosmer and Lemeshow 2000) to model differences between vegetation selected at nest sites vs. paired random sites. To do this, I created a data frame that contained the difference in vegetation measurements between nest and random sites (so nest minus random) and used R to fit a logistic regression model, using a vector of all 1's as the 'Response' and a no-intercept model.
Here's the data frame (I only include 1 of the covariates, grass density, for the example):
nest<-structure(list(VerGR = c(1.380952381, 1.952380953, 2.666666667,
-3.809523809, 2.428571428, 2.142857143, 0.142857143, 2.095238095,
1.952380952, 3.333333334, 3.190476191, -2.857142857, 2.857142858,
-1.666666667, 0.523809524, 4.761904762, 0.571428571, 2.238095238,
-2.809523809, 0.857142857, 1.523809524, -2.476190476, -0.428571428,
-5.190476191, 4.142857143, 2.857142858, -2.476190476, 4.095238096,
1.428571428, 1.714285714, -2.80952381, 3.142857143, 2.809523809,
7.238095238, 2.523809523, 2.333333333, -0.095238096, -0.095238096,
-0.142857143, 4.047619048, 4.761904759, -1.285714285, -1.190476191,
2.523809524, -2.095238095, -2, 4.761904761, 8.952380952, 1.095238096,
5.666666666, -0.714285714, 0, 2.809523809, -0.238095239, 3.666666667,
0.904761905, -4.952380952, -3.666666667, 2, -0.619047619, 4.523809524,
1.523809524, 4.619047619, 6.142857143, 3.19047619, -2.190476191,
-1.666666667, 2.714285714, -1.285714286, 2.857142857, 2.761904762,
2.809523809, -7.142857139, -5.952380949, -1.19047619, 1.523809524,
-0.38095238, 5.571428571, 5.238095239, 2.047619048, 7.857142857,
0.61904761, 2.523809524, -1.190476191), Response = c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L)), .Names = c("VerGR", "Response"), class = "data.frame", row.names = c(NA,
-84L))
And the no-intercept logistic regression models I am running:
grass.mod <- glm(Response ~ VerGR - 1, data=nest, family="binomial")
grass2.mod <- glm(Response ~ VerGR + I(VerGR^2) - 1, data=nest, family="binomial")
For the most part the models run fine, and give the same parameter estimates as models implemented using the 'clogit' function from the survival R package. The data set for the clogit models is slightly different, with Responses = 1 (nest) or = 0 (random point), and includes a column called 'PairID' to indicate nest-random pairs. Here's what the clogit models look like:
library(survival)
grass.mod.clog <- clogit(Response ~ VerGR + strata(PairID), data=full)
grass2.mod.clog <- clogit(Response ~ VerGR + I(VerGR^2) + strata(PairID), data=full)
But when I run the glm's, I get these 2 warnings if using a quadratic term:
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
I'm able to satisfy the first warning if I use more iterations in the glm formula, but I'm not sure what is happening with the second warning. I would be glad to use the 'clogit' function (which works with quadratic terms), but I'm unsure how to create prediction plots to visually display the data when going that route. Any suggestions?