
Is it true that $\sqrt n (\hat{\theta}-\theta) \ \rightarrow_d \ N(0,\sigma^2)$ implies $\text{plim} \ \hat{\theta} = \theta $? If so, how can I prove this?


Attempted proof:

Since $\hat{\theta} \rightarrow_{asy.} N(\theta, \frac{\sigma^2}{n})$, I have $\lim_{n \to \infty} E(\hat{\theta})= \theta$ and $\lim_{n \to \infty} Var(\hat{\theta})=0$. Therefore, by convergence in quadratic mean, $\text{plim} \ \hat{\theta} = \theta$. Is this proof correct?
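
For intuition, here is a minimal simulation sketch of this quadratic-mean argument in the simplest case where it applies, the sample mean of iid $N(\theta, \sigma^2)$ draws (the choice of estimator, parameter values, and seed is purely illustrative):

```python
# Minimal sketch: quadratic-mean convergence for the sample mean of iid
# N(theta, sigma^2) draws (this specific estimator is assumed purely for
# illustration). E(theta_hat) should stay near theta and
# E(theta_hat - theta)^2 = sigma^2 / n should shrink toward zero.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma = 2.0, 3.0
reps = 1_000  # Monte Carlo replications per sample size

for n in (10, 100, 1_000, 10_000):
    # Each row is one sample of size n; theta_hat is its sample mean.
    theta_hat = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
    mse = np.mean((theta_hat - theta) ** 2)  # estimates E(theta_hat - theta)^2
    print(f"n={n:>6}  mean(theta_hat)={theta_hat.mean():+.4f}  MSE={mse:.5f}")
```

The printed MSE should track $\sigma^2/n = 9/n$, shrinking by roughly a factor of ten at each step.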

  • Here's something to contemplate. Let $X$ be a Rademacher$(1/2)$ variable; that is, $\Pr(X=-1)=\Pr(X=1)=1/2.$ Define the sequence $X_n = (-1)^nX, n=1,2,3,\ldots.$ All the $X_n$ are identically distributed, so they converge in distribution trivially. What random variable would they converge to in probability? Do they converge to it? Now, although this is not the situation you have, it might provide some insight into what could go wrong with your proof. – whuber Oct 25 '18 at 14:49 (a simulation sketch of this example follows these comments)
  • See https://stats.stackexchange.com/questions/207264/root-n-consistent-estimator-but-root-n-doesnt-converge/207281#207281 – Christoph Hanck Oct 25 '18 at 14:55
  • @whuber Thank you for your reply. The $X_n$ do not converge in probability, because for all $n$, $X_n$ takes $-1$ or $1$ with probability $1/2$, do they? But I don't understand how your example relates to my proof. I would appreciate it if you gave me a little more information. – David Khan Oct 26 '18 at 01:21
  • @Christoph Thank you for your reply. I understand the proof at the link. – David Khan Oct 26 '18 at 01:22
  • Let $Z_n$ be any sequence of variables that converge to a standard Normal variable. Consider the sequence $X_nZ_n.$ – whuber Oct 26 '18 at 13:33
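
For concreteness, a minimal NumPy sketch of whuber's Rademacher example (the sample size and seed are arbitrary illustrative choices):

```python
# whuber's example: X ~ Rademacher(1/2), X_n = (-1)^n * X. Every X_n has the
# same distribution (so X_n -> X in distribution trivially), yet for odd n,
# |X_n - X| = 2 with probability 1, so X_n does not converge to X in
# probability.
import numpy as np

rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=100_000)  # draws of a Rademacher(1/2) variable

for n in (1, 2, 3, 1_000, 1_001):
    X_n = (-1) ** n * X
    same_dist = np.mean(X_n == 1)        # ~0.5 for every n
    far = np.mean(np.abs(X_n - X) >= 1)  # 1.0 for odd n, 0.0 for even n
    print(f"n={n:>5}  P(X_n=1)~{same_dist:.3f}  P(|X_n - X|>=1)~{far:.3f}")
```

The first column is constant in $n$ (identical distributions), while the second oscillates between $0$ and $1$, which is exactly the failure of convergence in probability.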

1 Answer


We are given that

$$\sqrt n (\hat{\theta}-\theta) \to_d N(0,\sigma^2) \tag{1} $$

Make the additional assumptions that

1) For the finite-sample distribution of $\sqrt n (\hat{\theta}-\theta)$, the absolute moment of order $2+\delta$ for some $\delta >0$ exists and is bounded uniformly in $n$.

2) The sequences of 1st and 2nd moments of $\{\sqrt n (\hat{\theta}-\theta)\}$ each converge to a constant.

Then these constants are the corresponding moments of the limiting distribution (assumption 1 guarantees uniform integrability, and uniform integrability together with convergence in distribution implies convergence of the moments). In particular, this means that

$$\lim \text {Var}[\sqrt{n}(\hat \theta -\theta)] = \sigma^2 \tag{2}$$

At the same time, we have

$$\lim \text {Var}[\sqrt{n}(\hat \theta -\theta)] = \lim \mathbb E\left[n(\hat \theta -\theta)^2\right] - \lim\left(\mathbb E[\sqrt{n}(\hat \theta -\theta)]\right)^2$$

$$= \lim \mathbb E\left[n(\hat \theta -\theta)^2\right] - \left(\lim\mathbb E[\sqrt{n}(\hat \theta -\theta)]\right)^2 = \lim \mathbb E\left[n(\hat \theta -\theta)^2\right] - 0 \tag{3}$$

where the limit can be moved inside the square because squaring is continuous and, by assumption 2, $\lim\mathbb E[\sqrt{n}(\hat \theta -\theta)]$ exists; it equals the mean of the limiting distribution, which is $0$.

Combining $(2)$ and $(3)$ we have

$$\lim \mathbb E\left[n(\hat \theta -\theta)^2\right] = \sigma^2 \implies \lim n\,\mathbb E(\hat \theta -\theta)^2 = \sigma^2 < \infty \implies \mathbb E(\hat \theta -\theta)^2 = O(1/n) $$

$$\implies \mathbb E(\hat \theta -\theta)^2 = o(1) \implies \mathbb E(\hat \theta -\theta)^2 \to 0 $$
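
As a concrete instance of this chain, take $\hat \theta = \bar X_n$, the sample mean of iid observations with mean $\theta$ and variance $\sigma^2$; then exactly

$$\mathbb E(\bar X_n - \theta)^2 = \text {Var}(\bar X_n) = \frac{\sigma^2}{n} = O(1/n) \to 0.$$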

The last result is convergence in quadratic mean, and convergence in quadratic mean of a random variable to a constant implies that this constant is also its probability limit (see here for an exposition of why).
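
For completeness, the step behind that implication is Chebyshev's inequality: for any $\varepsilon > 0$,

$$\Pr\left(|\hat \theta - \theta| > \varepsilon \right) \leq \frac{\mathbb E(\hat \theta -\theta)^2}{\varepsilon^2} \to 0.$$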

So $\hat \theta \to_p \theta$.

While in general it does not hold that convergence in distribution implies convergence in probability, we see how, for a subset of cases and under some additional conditions, we can go from the limiting distribution to the probability limit.

Alecos Papadopoulos
  • There seems to be a typo and we should have $E(\hat \theta - \theta)^2 = O(1/n)$. – sonicboom Dec 12 '20 at 23:08
  • How strong are the assumptions 1) and 2) in the case where $\theta$ and $\hat \theta$ are the coefficients and estimated coefficients of multiple linear regression in the usual setting (e.g. finite error variance with the errors independent of the predictors)? I assume these assumptions are underlying Hansen's statement in [equation (6.13)](https://www.ssc.wisc.edu/~bhansen/econometrics/Econometrics2013.pdf)? – sonicboom Dec 12 '20 at 23:29
  • @sonicboom They do, but you don't really need this, because convergence in probability is proved on its own for the $Q^{-1}$ matrix. Proving the convergence in distribution is done to be able to improve on the rate of convergence, not to prove convergence in probability. – Alecos Papadopoulos Dec 13 '20 at 00:40
  • I see how it works but it just seems a bit arbitrary because in 1) and 2) you make an assumption that involves $\sqrt{n}$ and ultimately this assumption results in the $O_p(n^{-1/2})$ convergence in probability and $O(1/n)$ in quadratic mean. You could have just as easily replaced the $\sqrt{n}$ with $n$ or $n^2$, etc., in 1) and 2) and obtained a different rate of convergence in probability and expectation. – sonicboom Dec 13 '20 at 22:18
  • I know $\sqrt{n}$ is special in the context of the CLT and thus your equation (1), but as things stand it seems very arbitrary in 1) and 2) because these are merely assumptions. We can get any convergence rate we want if we just make arbitrary assumptions. Is there some reason why 1) and 2) should be considered more than just arbitrary assumptions? – sonicboom Dec 13 '20 at 22:20
  • @sonicboom As I wrote, you do not need to invoke them. Just prove convergence in probability directly. – Alecos Papadopoulos Dec 13 '20 at 22:30