
I am learning about generalized estimating equations (GEE) and the geepack R package, and I am a little confused about a few points.

In a GEE model, we have $Var(Y_{it})=\phi_{it}\cdot V(\mu_{it})$, where $\phi$ is the scale parameter. We further decompose the working covariance of the response vector $Y$ into $V^{1/2}R(\alpha)V^{1/2}$, where $R(\alpha)$ is the working correlation matrix and $\alpha$ is the correlation parameter. Three link models are specified in geepack, for $\mu,\phi,\alpha$, respectively. See this PDF file for details.

(1) In GEE1, can I say we only need to make sure that the mean structure is correctly specified, i.e. the link model $g(\mu)=X\beta$ is correct, while it doesn't matter whether the link models for $\phi$ and $\alpha$ are correctly specified?

(2) By default, the geese.fit function sets scale.value = 1.0. Does this mean $\phi=1.0$? It is understandable that the default is alpha=NULL, since people can specify different correlation structures and the program will assign appropriate alpha values accordingly. My question is: how often do people explicitly model the scale parameter $\phi$?

(3) This question is closely related to (2) about the scale parameter $\phi$. Recall the variance function is $Var(Y_{it})=\phi_{it}\cdot V(\mu_{it})$. In the Gaussian case, we have $\phi=\sigma^2$ and $V(\mu_{it})=1$; in the binomial case, we have $\phi=1$ and $V(\mu_{it})=\mu_{it}(1-\mu_{it})$; in the Poisson case, we have $\phi=1$ and $V(\mu_{it})=\mu_{it}$. Can I say that, in the negative binomial case, $\phi=1$ and $V(\mu_{it})=\mu_{it}+\varphi\mu_{it}^2$? Here $\varphi$ is the NB2 dispersion parameter, NOT the scale parameter $\phi$ in GEE.

Thank you very much!

alittleboy
  • Usually, the scale parameter $\phi$ is chosen in the [same way as in a GLM](https://www.stata.com/support/faqs/statistics/gee/), especially in the case of overdispersion, as discussed by Hardin and Hilbe in their book (2nd ed., pp. 28 and 65): it is fixed for binomial and Poisson models, and estimated in the continuous case. – chl Oct 22 '20 at 11:48

1 Answer

Q1

If you use the sandwich estimator for the covariance, GEE's coefficient estimates and standard errors are consistent even if your working models for $\alpha$ and $\phi$ are wrong, provided the mean model is correctly specified. This is well covered elsewhere, e.g. Sandwich estimator intuition.
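To make the idea concrete, here is a minimal numpy sketch, not geepack's implementation. It assumes an identity link and an independence working correlation (so the point estimate is ordinary least squares), and it uses simulated data with a cluster-level covariate and a random intercept. All names (`ids`, `scores`, `robust_se`, and so on) are illustrative choices of mine. The working model is deliberately wrong about the within-cluster correlation, and the sandwich standard errors adjust for that while the naive ones do not.

```python
import numpy as np

# Simulated clustered data (an assumption for illustration only).
rng = np.random.default_rng(1)
n_clusters, n_per = 200, 4
ids = np.repeat(np.arange(n_clusters), n_per)
x = np.repeat(rng.normal(size=n_clusters), n_per)   # cluster-level covariate
u = np.repeat(rng.normal(size=n_clusters), n_per)   # shared random intercept
y = 1.0 + 2.0 * x + u + rng.normal(size=n_clusters * n_per)

X = np.column_stack([np.ones_like(x), x])

# Independence working correlation + identity link: the GEE solution is OLS.
bread = np.linalg.inv(X.T @ X)
beta = bread @ X.T @ y
resid = y - X @ beta

# Naive (model-based) covariance: trusts the independence working model.
sigma2 = resid @ resid / (len(y) - X.shape[1])
naive_se = np.sqrt(np.diag(sigma2 * bread))

# Sandwich covariance: bread * meat * bread, where the meat sums the outer
# products of the per-cluster score contributions X_i' r_i.
scores = np.zeros((n_clusters, X.shape[1]))
np.add.at(scores, ids, X * resid[:, None])
meat = scores.T @ scores
robust_se = np.sqrt(np.diag(bread @ meat @ bread))

print("beta     :", beta)
print("naive SE :", naive_se)
print("robust SE:", robust_se)
```

In this setup the robust standard errors should come out noticeably larger than the naive ones, because the naive formula ignores the positive within-cluster correlation.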

Q2

I don't know if it was the same back in 2012, but nowadays at least, scale.value is only used if scale.fix is TRUE.

Q3

Yes, you're exactly right. The scale parameter $\phi$ is distinct from the NB dispersion $\varphi$. They are similar in a vague sense ("addressing overdispersion") but different quantitatively. For example, if $\mu=1$, then $\varphi$ is added to the variance, whereas $\phi$ multiplies the variance.
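Written out, using the Poisson-type variance function $V(\mu)=\mu$ with a GEE scale parameter on one side and the NB2 form from the question on the other (the value $\mu=10$ is just an illustrative choice):

$$
\begin{aligned}
\text{scaled Poisson:}\quad & Var(Y)=\phi\,\mu, &\qquad \text{NB2:}\quad & Var(Y)=\mu+\varphi\,\mu^{2},\\
\mu=1:\quad & Var(Y)=\phi, & & Var(Y)=1+\varphi,\\
\mu=10:\quad & Var(Y)=10\,\phi, & & Var(Y)=10+100\,\varphi.
\end{aligned}
$$

So $\phi$ inflates the variance by the same factor at every mean, while the contribution of $\varphi$ grows with $\mu^2$.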

One key qualitative difference is that $\phi$ does not help re-weight observations to lend more credence to those with lower variance, whereas $\varphi$ does. You can plug in anything for $\phi$ and get the same coefficients out. Not so for $\varphi$; huge values will cause your estimates to depend overly on observations with small fitted values (for $\mu$). There is a nice recap of closely related techniques in this paper, which, as a bonus, involves a cute kind of animal (harbor seals).
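The re-weighting point can be checked numerically. The following is a rough sketch under my own assumptions, not anything from geepack: a log link, independent observations, simulated counts, and a hypothetical helper `fit_log_link` that solves the quasi-likelihood estimating equation by Fisher scoring with $Var(Y_i)=\phi\,(\mu_i+\varphi\mu_i^2)$. Because $\phi$ rescales every weight by the same constant, it cancels out of the update; $\varphi$ changes the relative weights and therefore the solution.

```python
import numpy as np

def fit_log_link(X, y, phi=1.0, varphi=0.0, n_iter=100):
    """Fisher scoring for a log-link mean model with working variance
    phi * (mu + varphi * mu**2). Hypothetical helper for illustration."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        var = phi * (mu + varphi * mu**2)
        w = mu**2 / var                     # (dmu/deta)^2 / Var(y)
        z = eta + (y - mu) / mu             # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return beta

# Simulated counts (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 40)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 0.8 * x)).astype(float)

beta_phi1 = fit_log_link(X, y, phi=1.0)             # Poisson-type variance
beta_phi5 = fit_log_link(X, y, phi=5.0)             # same, variance scaled by 5
beta_nb = fit_log_link(X, y, phi=1.0, varphi=2.0)   # NB2-type variance

print("phi = 1   :", beta_phi1)
print("phi = 5   :", beta_phi5)
print("varphi = 2:", beta_nb)
```

The first two fits should agree up to rounding error, while the third should differ, since a large $\varphi$ shifts weight toward observations with small fitted means.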

eric_kernfeld