I have data from a field survey. The objective of the study is to relate number of seedling (respond variable, count data), landform (exploratory variable, categorical variable with 3 levels) and percent canopy coverage (exploratory variable, quantitative). In each habitat, I have data from five 25x25 meter plots. Within each plot I used three 2x2 meter subplots nested within the bigger plot, and number of seedlings were count from these subplots. Total number of observations is 60; 20 plots x 3 subplots. Only one kind of landform contained seedlings. Other two landforms contained no seedlings, so they are empty plots. the data looks like this:
data.frame': 60 obs. of 5 variables:
$ plot : Factor w/ 20 levels "HD2","LC1","LC2",..: 16 16 16 17 17 17 12 12 12 9 ...
$ subplot : Factor w/ 60 levels "HD2.1","HD2.2",..: 46 47 48 49 50 51 34 35 36 25 ...
$ av.canopy : num 92.2 92.2 92.2 92.3 92.3 ...
$ landform : Factor w/ 3 levels "abandoned","ridge",..: 2 2 2 2 2 2 2 2 2 2 ...
$ seedling.2016: int 6 7 5 2 5 4 4 6 4 0 ...
The problem is when I compared number of seedlings between landforms using this code:
seedling <- glmer(seedling.2016 ~ landform +(1|plot), family = poisson)
The result does not make sense for me-there were no any significant different between landforms event there is only one landform (ridge) that has seedlings, while other had no seedlings at al. It is also note that SEs are enormous. The result looks like this:
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: poisson ( log )
Formula: seedling.2016 ~ landform + (1 | plot)
Data: streblus.subplots
AIC BIC logLik deviance df.resid
294.9 303.3 -143.5 286.9 56
Scaled residuals:
Min 1Q Median 3Q Max
-6.3619 -0.0001 -0.0001 0.0000 8.7305
Random effects:
Groups Name Variance Std.Dev.
plot (Intercept) 2.637 1.624
Number of obs: 60, groups: plot, 20
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -20.412 1461.267 -0.014 0.989
landformridge 22.250 1461.265 0.015 0.988
landformtemp 1.066 390.540 0.003 0.998
When I changed link function to square root as this code:
Seedling2 <- glmer(seedling.2016 ~ landform + (1|plot), family = poisson(link = sqrt))
Fixed effects:
#Estimate Std. Error z value Pr(>|z|)
#(Intercept) -1.220e-05 5.296e-01 0.000 1
#landformridge 3.250e+00 7.429e-01 4.376 1.21e-05 ***
# landformtemp 1.018e-05 7.795e-01 0.000 1
Now number of seedlings in ridge is significantly higher than the other, and it makes more sense to me.
My question is: Is it valid in term of statistics to use square root link with Poisson distribution, there are any better methods available with better ground of statistics?