
Suppose our model has a nuisance parameter $\eta_0$ for which we possess a consistent estimator $\hat{\eta}$.

We obtain an estimator $\hat{\theta}$ of a parameter of interest $\theta$ by finding the $\theta$ that solves the estimating equation

$$S_n(\theta, \hat{\eta}) = 0 $$

However, if $\eta_0$ were known, could we obtain a better estimator? Specifically, consider the following.

Question: Under which conditions is $\hat{\theta}$ asymptotically equivalent to $\tilde{\theta}$, where $\tilde{\theta}$ is an estimator obtained by solving the estimating equation

$$S_n(\theta, \eta_0) = 0 $$

which requires $\eta_0$ to be known?

Note that conditions for the consistency and $\sqrt{n}$-consistency of $\hat{\theta}$ have been provided in other posts.
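As a concrete toy illustration of the question (the model and all names below are my own, not from the post), take $Y_i = \theta_0 + \eta_0 X_i + \varepsilon_i$ with $E[X_i]=0$ and the estimating function $S_n(\theta,\eta) = n^{-1}\sum_i (Y_i - \theta - \eta X_i)$. The sketch below compares $\tilde{\theta}$ (root with $\eta_0$ known) to $\hat{\theta}$ (root with an OLS plug-in $\hat{\eta}$):

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, eta0 = 1.0, 2.0

def simulate(n):
    # X is centered, so the eta-derivative of S has mean -E[X] = 0 (orthogonality)
    x = rng.normal(size=n)
    y = theta0 + eta0 * x + rng.normal(size=n)
    xc = x - x.mean()
    eta_hat = (xc @ y) / (xc @ xc)             # sqrt(n)-consistent OLS slope for eta0
    theta_tilde = y.mean() - eta0 * x.mean()   # root of S_n(theta, eta0)
    theta_hat = y.mean() - eta_hat * x.mean()  # root of S_n(theta, eta_hat)
    return theta_hat, theta_tilde

for n in (100, 10_000, 1_000_000):
    theta_hat, theta_tilde = simulate(n)
    print(n, np.sqrt(n) * abs(theta_hat - theta_tilde))
```

Here $\hat{\theta} - \tilde{\theta} = (\eta_0 - \hat{\eta})\bar{X} = O_p(n^{-1})$, so the printed scaled difference $\sqrt{n}\,|\hat{\theta} - \tilde{\theta}|$ shrinks with $n$: in this example the two estimators are asymptotically equivalent.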

Guillaume F.

2 Answers


Background:

For the case $\eta_0$ known, we assume the existence of a function $S(\theta,\eta)$ such that

1) $\tilde{\theta} = \theta_0 + O_p(n^{-1/2})$

2) $S(\theta,\eta)$ is differentiable in $\theta$ at $(\theta_0,\eta_0)$ with a derivative matrix $\Gamma$ of full rank

3) $S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) = S_n(\tilde{\theta},\eta_0) - S_n(\theta_0,\eta_0) + o_p(n^{-1/2})$

From 2), we get a Taylor expansion about $\theta_0$,

$$S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) = \Gamma (\tilde{\theta} - \theta_0) + o_p(|\tilde{\theta} - \theta_0|)$$

Hence

$$ \tilde{\theta} - \theta_0 = \Gamma^{-1} \left( S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) \right) + o_p(n^{-1/2})$$

From 3),

$$ \tilde{\theta} - \theta_0 = \Gamma^{-1} \left(S_n(\tilde{\theta},\eta_0) - S_n(\theta_0,\eta_0) \right) + o_p(n^{-1/2})$$
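If, in addition, the population equation satisfies $S(\theta_0,\eta_0) = 0$ and $\tilde{\theta}$ is an (approximate) root, $S_n(\tilde{\theta},\eta_0) = o_p(n^{-1/2})$ — both typical for estimating equations, though not stated above — the linearization yields the familiar asymptotic normality. A sketch, assuming a CLT for the score at the truth:

$$\tilde{\theta} - \theta_0 = -\Gamma^{-1} S_n(\theta_0,\eta_0) + o_p(n^{-1/2}), \qquad \sqrt{n}\, S_n(\theta_0,\eta_0) \rightsquigarrow N(0,V) \implies \sqrt{n}\,(\tilde{\theta} - \theta_0) \rightsquigarrow N\left(0,\, \Gamma^{-1} V \Gamma^{-\top}\right)$$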

Note that assumption 3) is satisfied if assumptions 4-6 and 7a found here hold.

To obtain an equivalent estimator when $\eta_0$ is unknown, we need an equivalent linearization.

Solution 1:

Assume that, in addition to 1-3,

A) $\hat{\theta} = \theta_0 + O_p(n^{-1/2})$

B) $S(\hat{\theta},\eta_0) = S(\tilde{\theta},\eta_0) + o_p(n^{-1/2})$

Then we can write, from A),

$$ \hat{\theta} - \theta_0 = \Gamma^{-1} \left( S(\hat{\theta},\eta_0) - S(\theta_0,\eta_0) \right) + o_p(n^{-1/2})$$

From B),

$$ \hat{\theta} - \theta_0 = \Gamma^{-1} \left( S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) \right) + o_p(n^{-1/2})$$

Solution 2:

If we assume 1-3, A) and

C) $\hat{\eta} = \eta_0 + O_p(n^{-1/2})$

D) $S(\theta,\eta)$ is differentiable in $\eta$ at $(\theta_0,\eta_0)$ with a derivative matrix equal to zero

E) $S(\hat{\theta},\hat{\eta}) = S(\tilde{\theta},\eta_0) + o_p(n^{-1/2})$

Then we can perform the following Taylor expansion about $(\theta_0, \eta_0)$,

$$S(\hat{\theta},\hat{\eta}) - S(\theta_0,\eta_0) = \Gamma (\hat{\theta} - \theta_0) + o_p(|\hat{\theta} - \theta_0| + |\hat{\eta} - \eta_0|)$$

and thus

$$\begin{align} \hat{\theta} - \theta_0 &= \Gamma^{-1} \left( S(\hat{\theta},\hat{\eta}) - S(\theta_0,\eta_0)\right) + o_p(n^{-1/2}) \\ &= \Gamma^{-1} \left( S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0)\right) + o_p(n^{-1/2}) \end{align}$$

A sufficient condition for E) to hold is that 3) be true and

$\begin{align} S(\hat{\theta},\hat{\eta}) - S(\theta_0,\eta_0) &= S_n(\hat{\theta},\hat{\eta}) - S_n(\theta_0,\eta_0) + o_p(n^{-1/2}) \\ S_n(\hat{\theta},\hat{\eta}) - S_n(\tilde{\theta},\eta_0) &= o_p(n^{-1/2}) \end{align}$

Solution 3

If we assume 1-3, A) and

F) $\hat{\eta} = \eta_0 + o_p(1)$

G) $S(\theta,\eta)$ is uniformly differentiable in $\theta$ at $\theta_0$ on a neighborhood of $\eta_0$ with a derivative matrix $\Gamma(\eta)$

H) $\Gamma(\eta)$ is continuous and full rank at $\eta_0$, with $\Gamma = \Gamma(\eta_0)$

I) $S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) = S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) + o_p(n^{-1/2})$

Then from G) we can perform the following Taylor expansion about $\theta_0$, which is valid with probability tending to one,

$$\begin{align}S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) &= \Gamma(\hat{\eta}) (\hat{\theta} - \theta_0) + o_p(|\hat{\theta} - \theta_0|) \\ &= \Gamma(\hat{\theta} - \theta_0) + o_p(|\hat{\theta} - \theta_0|) \end{align}$$

with the second line true because of F) and H).

Hence, with I)

$$\begin{align}\hat{\theta} - \theta_0 &= \Gamma^{-1}\left(S(\hat{\theta},\hat{\eta}) - S(\theta_0,\hat{\eta}) \right) + o_p(n^{-1/2}) \\ &= \Gamma^{-1}\left(S(\tilde{\theta},\eta_0) - S(\theta_0,\eta_0) \right) + o_p(n^{-1/2}) \end{align}$$

Note that a sufficient condition for I) to hold is that both E) hold and

I') $S(\theta_0,\hat{\eta}) = S(\theta_0,\eta_0) + o_p(n^{-1/2})$

Both conditions D) and I') are asymptotic orthogonality assumptions.
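The orthogonality idea behind D) and I') can be made concrete with the partialling-out construction used for Neyman-orthogonal scores. In the hypothetical model $Y = \theta_0 D + \eta_0 W + \varepsilon$ (all details below are illustrative, not from the answer), the naive score has a non-vanishing $\eta$-derivative, while residualizing $D$ on $W$ forces it to zero:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical model: Y = theta0*D + eta0*W + noise, with D confounded by W
w = rng.normal(size=n)
d = 0.8 * w + rng.normal(size=n)
y = 1.0 * d + 2.0 * w + rng.normal(size=n)

# Naive score psi(theta, eta) = D*(Y - theta*D - eta*W); its eta-derivative
# is -D*W, whose mean estimates -E[DW] = -0.8: orthogonality fails.
naive = -(d * w).mean()

# Orthogonalized score replaces D by the residual of D on W (partialling out),
# so the eta-derivative -D_tilde*W has mean zero: orthogonality holds.
gamma = (w @ d) / (w @ w)
d_tilde = d - gamma * w
orth = -(d_tilde * w).mean()

print(naive, orth)  # naive is far from 0; orth is essentially 0
```

With an orthogonalized score, first-order errors in $\hat{\eta}$ do not propagate into the linearization for $\hat{\theta}$, which is exactly what D) (zero $\eta$-derivative) encodes.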


The other answer does not assume that $S_n(\theta, \eta)$ is differentiable. If we do assume differentiability, the work is simplified somewhat.

Background: Assume

1) $\tilde{\theta} = \theta_0 + o_p(1)$; $S_n(\tilde{\theta},\eta_0) = o_p(n^{-1/2})$; $S_n(\theta_0,\eta_0) = O_p(n^{-1/2})$

2) $S_n(\theta,\eta)$ is equidifferentiable (in probability) in $\theta$ at $(\theta_0,\eta_0)$ with a derivative matrix $\Gamma_n$

3) $\Gamma_n = \Gamma + o_p(1)$, with $\Gamma$ invertible

With probability tending to one, we can do a Taylor expansion about $\theta_0$,

$$\begin{align} S_n(\tilde{\theta},\eta_0) &= S_n(\theta_0,\eta_0) + \Gamma_n(\tilde{\theta} - \theta_0) + o_p(|\tilde{\theta} - \theta_0|) \\ &= S_n(\theta_0,\eta_0) + \Gamma(\tilde{\theta} - \theta_0) + o_p(|\tilde{\theta} - \theta_0|) \end{align}$$

Hence

$$\begin{align} \tilde{\theta} - \theta_0 &= -\Gamma^{-1}\left( S_n(\theta_0,\eta_0) \right) + o_p(n^{-1/2} + |\tilde{\theta} - \theta_0|) \\ &= -\Gamma^{-1}\left( S_n(\theta_0,\eta_0) \right) + o_p(n^{-1/2}) \end{align}$$
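The resulting expansion $\tilde{\theta} - \theta_0 = -\Gamma^{-1} S_n(\theta_0,\eta_0) + o_p(n^{-1/2})$ can be checked numerically. A sketch in a hypothetical Poisson-mean model with no nuisance parameter (chosen only because $\tilde{\theta}$ and $\Gamma$ have closed forms; all names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
theta0 = 0.5
mu0 = np.exp(theta0)

def lin_error(n):
    # Poisson-mean model: S_n(theta) = mean(y) - exp(theta), root theta_tilde = log(mean(y))
    y = rng.poisson(mu0, n)
    theta_tilde = np.log(y.mean())
    gamma = -mu0                           # derivative of S(theta) = mu0 - exp(theta) at theta0
    influence = -(y.mean() - mu0) / gamma  # -Gamma^{-1} S_n(theta0)
    return np.sqrt(n) * abs(theta_tilde - theta0 - influence)

for n in (100, 10_000, 1_000_000):
    print(n, lin_error(n))
```

The linearization error is $\log(1+\delta) - \delta = O_p(n^{-1})$ with $\delta = (\bar{Y} - \mu_0)/\mu_0$, so the printed $\sqrt{n}$-scaled error shrinks with $n$, consistent with the $o_p(n^{-1/2})$ remainder above.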

For the generalization to $\hat{\theta}$, we additionally assume

4) $(\hat{\theta},\hat{\eta}) = (\theta_0,\eta_0) + o_p(1)$; $S_n(\hat{\theta},\hat{\eta}) = o_p(n^{-1/2})$

Solution:

If we additionally assume either

5) $S_n(\hat{\theta},\eta_0) = o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$

or

6) $S_n(\theta_0,\eta_0) = -\Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$

Then we can perform the same Taylor expansion as in the background and thus get asymptotic equivalence of the two estimators.

I propose the following conditions to satisfy either 5) or 6):

Condition 1:

If we assume

A) There is an invertible $\Gamma$ such that, for every sequence of balls $U_n$ that shrinks to $\eta_0$, $$\sup_{\eta \in U_n}\left(- S_n(\hat{\theta},\eta) + S_n(\theta_0,\eta) + \Gamma(\hat{\theta} - \theta_0)\right) = o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$$

B) $S_n(\theta_0,\hat{\eta}) = S_n(\theta_0,\eta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$

From A) and B)

$$ \begin{align} S_n(\hat{\theta},\hat{\eta}) &= S_n(\theta_0,\hat{\eta}) + \Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|) \\ &= S_n(\theta_0,\eta_0) + \Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|) \end{align}$$

Therefore,

$$ S_n(\theta_0,\eta_0) = -\Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$$

which is condition 6), the desired result.

Note 1: Each of the following assumptions individually implies A):

A') $S_n(\theta,\eta)$ is uniformly equidifferentiable (in probability) in $\theta$ at $\theta_0$ on a neighborhood of $\eta_0$, with a derivative matrix $\Gamma_n(\eta)$ stochastically equicontinuous at $\eta_0$ and $\Gamma_n(\eta_0) = \Gamma + o_p(1)$, where $\Gamma$ is invertible

A'') $S_n(\theta,\eta)$ is differentiable (in probability) in $\theta$ in a neighborhood of $(\theta_0,\eta_0)$, with a derivative $\Gamma_n(\theta,\eta)$ equicontinuous at $(\theta_0,\eta_0)$ and $\Gamma_n(\theta_0,\eta_0) = \Gamma + o_p(1)$, where $\Gamma$ is invertible

Condition 2:

Assume,

A) $\hat{\eta} = \eta_0 + O_p(n^{-1/2})$

B) $S_n(\theta,\eta)$ is equidifferentiable (in probability) at $(\theta_0,\eta_0)$ with derivative matrix $[\Gamma_n, \Psi_n]$

C) $[\Gamma_n,\Psi_n] = [\Gamma, {\bf 0}] + o_p(1)$, with $\Gamma$ invertible

Then, performing a Taylor expansion about $(\theta_0, \eta_0)$,

$$\begin{align} S_n(\hat{\theta},\hat{\eta}) &= S_n(\theta_0,\eta_0) + \Gamma_n(\hat{\theta} - \theta_0) + \Psi_n(\hat{\eta} - \eta_0) + o_p(|\hat{\theta} - \theta_0| + |\hat{\eta} - \eta_0|) \\ &= S_n(\theta_0,\eta_0) + \Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|) \end{align} $$

Hence,

$$ S_n(\theta_0,\eta_0) = -\Gamma(\hat{\theta} - \theta_0) + o_p(n^{-1/2} + |\hat{\theta} - \theta_0|)$$
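Condition 2 can be checked numerically in a toy linear model (my own hypothetical example, not from the answer): with $S_n(\theta,\eta) = n^{-1}\sum_i (Y_i - \theta - \eta X_i)$ and $E[X]=0$, we have $\Gamma_n = -1$ and $\Psi_n = -\bar{X} = o_p(1)$, so C) holds and the final identity should have an $o_p(n^{-1/2})$ remainder:

```python
import numpy as np

rng = np.random.default_rng(3)
theta0, eta0 = 1.0, 2.0

def check(n):
    # S_n(theta, eta) = mean(y - theta - eta*x); Gamma_n = -1, Psi_n = -mean(x)
    x = rng.normal(size=n)                     # E[X] = 0, so Psi_n = o_p(1): condition C)
    y = theta0 + eta0 * x + rng.normal(size=n)
    xc = x - x.mean()
    eta_hat = (xc @ y) / (xc @ xc)             # sqrt(n)-consistent OLS slope: condition A)
    theta_hat = y.mean() - eta_hat * x.mean()  # root of S_n(theta, eta_hat)
    s0 = (y - theta0 - eta0 * x).mean()        # S_n(theta0, eta0)
    gamma = -1.0
    # sqrt(n)-scaled error in S_n(theta0, eta0) = -Gamma*(theta_hat - theta0) + o_p(...)
    return np.sqrt(n) * abs(s0 + gamma * (theta_hat - theta0))

for n in (100, 10_000, 1_000_000):
    print(n, check(n))
```

Here the remainder works out to $(\hat{\eta} - \eta_0)\bar{X} = O_p(n^{-1})$, so the printed $\sqrt{n}$-scaled error shrinks with $n$, as the display above requires.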
