
What is the difference between a consistent estimator and an unbiased estimator?

The precise technical definitions of these terms are fairly complicated, and it's difficult to get an intuitive feel for what they mean. I can imagine a good estimator, and a bad estimator, but I'm having trouble seeing how any estimator could satisfy one condition and not the other.

MathematicalOrchid
  • 13
    Have you looked at the very first figure in the Wikipedia article on [consistent estimators](http://en.wikipedia.org/wiki/Consistent_estimator), which specifically explains this distinction? – whuber Jun 24 '12 at 16:45
  • 5
    I've read the articles for both consistency and bias, but I still don't really understand the distinction. (The figure you refer to claims that the estimator is consistent but biased, but doesn't explain _why_.) – MathematicalOrchid Jun 24 '12 at 16:47
  • 1
    Which part of the explanation do you need help with? The caption points out that each of the estimators in the sequence is biased and it also explains why the sequence is consistent. Do you need an explanation of how the bias in these estimators is apparent from the figure? – whuber Jun 24 '12 at 16:50
  • 7
    +1 The comment thread following one of these answers is very illuminating, both for what it reveals about the subject matter and as an interesting example of how an online community can work to expose and rectify misconceptions. – whuber Jan 12 '13 at 17:51
  • Related: http://stats.stackexchange.com/questions/173152/fisher-consistency-versus-standard-consistency/173319#173319 – kjetil b halvorsen Oct 01 '15 at 16:26
  • Strongly related: https://stats.stackexchange.com/questions/236328/how-does-one-explain-what-an-unbiased-estimator-is-to-a-layperson/236332#236332 – Ferdi Oct 31 '18 at 11:52

3 Answers


To define the two terms without using too much technical language:

  • An estimator is consistent if, as the sample size increases, the estimates (produced by the estimator) "converge" to the true value of the parameter being estimated. To be slightly more precise - consistency means that, as the sample size increases, the sampling distribution of the estimator becomes increasingly concentrated at the true parameter value.

  • An estimator is unbiased if, on average, it hits the true parameter value. That is, the mean of the sampling distribution of the estimator is equal to the true parameter value.

  • The two are not equivalent: Unbiasedness is a statement about the expected value of the sampling distribution of the estimator. Consistency is a statement about "where the sampling distribution of the estimator is going" as the sample size increases.

It certainly is possible for one condition to be satisfied but not the other - I will give two examples. For both examples consider a sample $X_1, ..., X_n$ from a $N(\mu, \sigma^2)$ population.

  • Unbiased but not consistent: Suppose you're estimating $\mu$. Then $X_1$ is an unbiased estimator of $\mu$ since $E(X_1) = \mu$. But, $X_1$ is not consistent since its distribution does not become more concentrated around $\mu$ as the sample size increases - it's always $N(\mu, \sigma^2)$!

  • Consistent but not unbiased: Suppose you're estimating $\sigma^2$. The maximum likelihood estimator is $$ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \overline{X})^2 $$ where $\overline{X}$ is the sample mean. It is a fact that $$ E(\hat{\sigma}^2) = \frac{n-1}{n} \sigma^2 $$ which can be derived using the information here. Therefore $\hat{\sigma}^2$ is biased for any finite sample size. We can also easily derive that $${\rm var}(\hat{\sigma}^2) = \frac{ 2\sigma^4(n-1)}{n^2}$$ From these facts we can informally see that the distribution of $\hat{\sigma}^2$ is becoming more and more concentrated at $\sigma^2$ as the sample size increases since the mean is converging to $\sigma^2$ and the variance is converging to $0$. (Note: This does constitute a proof of consistency, using the same argument as the one used in the answer here)
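For readers who like to see the two examples numerically, here is a minimal Monte Carlo sketch (Python with NumPy; the true parameter values, sample sizes, and replication count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2 = 5.0, 4.0        # hypothetical true parameter values
reps = 20_000                # Monte Carlo replications

for n in (10, 100, 1000):
    samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))

    x1 = samples[:, 0]              # estimator 1: the first observation X_1
    s2_mle = samples.var(axis=1)    # estimator 2: the MLE of sigma^2 (1/n divisor)

    print(f"n={n:5d} | X_1: mean={x1.mean():.3f}, var={x1.var():.3f} "
          f"| MLE s2: mean={s2_mle.mean():.3f}, var={s2_mle.var():.4f}")
```

The Monte Carlo mean of $X_1$ stays near $\mu$ for every $n$ while its variance stays near $\sigma^2$ (no concentration); the mean of $\hat{\sigma}^2$ sits below $\sigma^2$ by the factor $(n-1)/n$, but both its bias and its variance shrink toward zero as $n$ grows.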

Macro
  • 9
    (+1) Not all MLEs are consistent though: the general result is that there exists a consistent subsequence in the sequence of MLEs. For proper consistency a few additional requirements, e.g. identifiability, are needed. Examples of MLEs that aren't consistent are found in certain errors-in-variables models (where the "maximum" turns out to be a saddle-point). – MånsT Jun 25 '12 at 06:42
  • 2
    Well, the EIV MLEs that I mentioned are perhaps not good examples, since the likelihood function is unbounded and no maximum exists. They're good examples of how the ML approach can fail though :) I'm sorry that I can't give a relevant link right now - I'm on vacation. – MånsT Jun 25 '12 at 06:59
  • Thank you @MånsT. The necessary conditions were outlined in the link but that wasn't clear from the wording. – Macro Jun 25 '12 at 11:12
  • 2
    Just a side note: The parameter space is certainly not compact in this case, in contrast to the conditions at that link, nor is the log likelihood concave wrt $\sigma^2$ itself. The stated consistency result still holds, of course. – cardinal Jun 25 '12 at 12:43
  • 2
    You're right, @cardinal, I'll delete that reference. It's clear enough that $E(\hat{\sigma}^2) \rightarrow \sigma^2$ and ${\rm var}(\hat{\sigma}^2) \rightarrow 0$ but I don't want to stray from the point by turning this into an exercise of proving the consistency of $\hat{\sigma}^2$. – Macro Jun 25 '12 at 12:54
  • What is the consequence of inconsistency to OLS Assumptions or to BLUE? – Ivan Nov 10 '15 at 06:21
  • Hi, what does ${\rm var}(\hat{\sigma}^2)=\frac{2\sigma^4(n-1)}{n^2}$ mean? How should it be read, and why is it so? – Max Usanin Sep 27 '16 at 12:26

Consistency of an estimator means that as the sample size gets large the estimate gets closer and closer to the true value of the parameter. Unbiasedness is a finite-sample property that is not affected by increasing sample size. An estimate is unbiased if its expected value equals the true parameter value. This holds exactly for every sample size, whereas consistency is an asymptotic property that holds only approximately, in the limit.

To say that an estimator is unbiased means that if you took many samples of size $n$ and computed the estimate each time the average of all these estimates would be close to the true parameter value and will get closer as the number of times you do this increases. The sample mean is both consistent and unbiased. The sample estimate of standard deviation is biased but consistent.
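A tiny simulation sketch of those last two claims (Python/NumPy; the parameter values and sample sizes are arbitrary): the sample mean is unbiased at every $n$, while the sample standard deviation is biased low at every finite $n$, yet both its bias and its variance shrink as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0         # hypothetical true standard deviation
reps = 50_000       # Monte Carlo replications

for n in (5, 50, 500):
    x = rng.normal(0.0, sigma, size=(reps, n))
    xbar = x.mean(axis=1)         # sample mean: unbiased and consistent
    s = x.std(axis=1, ddof=1)     # sample standard deviation: biased low, but consistent
    print(f"n={n:4d} | mean of xbar={xbar.mean():+.4f} | "
          f"mean of s={s.mean():.4f} (true sigma={sigma}) | var of s={s.var():.5f}")
```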

Update following the discussion in the comments with @cardinal and @Macro: As described in the comments below, there are apparently pathological cases where the variance does not have to go to 0 for the estimator to be strongly consistent, and the bias does not have to go to 0 either.

Michael R. Chernick
  • 1
    Let me make sure I understand this: If an estimator is unbiased, then for _any_ sample size, the average estimate must equal the true value, but any one specific estimate can of course still be arbitrarily far away. If an estimator is consistent, then the distribution of estimates must _improve_ as the sample size increases. Is that about right? – MathematicalOrchid Jun 24 '12 at 20:08
  • For a consistent estimate the variance of the distribution goes to 0 and the estimate approaches the true parameter. The consistent estimate does not have to be unbiased but the bias goes to 0 as n goes to infinity. An unbiased estimate will be consistent if its variance goes to 0 as n approaches infinity. That is generally the case. – Michael R. Chernick Jun 24 '12 at 20:28
  • 11
    @MichaelChernick +1 for your answer but, regarding your comment, the variance of a consistent estimator does not necessarily goes to $0$. For example if $(X_1,...,X_n)$ is a sample from $\mbox{Normal}(\mu,1)$, $\mu\neq 0$, then $1/{\bar X}$ is a (strong) consistent estimator of $1/\mu$, but $\mbox{var}(1/{\bar X})=\infty$, for all $n$. –  Jun 24 '12 at 20:38
  • @Procrastinator For an estimator $s_n$ to be strongly consistent for a parameter $p$, $P[p-\epsilon \le s_n \le p+\epsilon]$ has to approach 1 as $n$ approaches infinity for every $\epsilon>0$. This cannot happen if the variance of $s_n$ does not go to 0. I presume the variance exists to begin with. You are taking an example where the variance never exists. – Michael R. Chernick Jun 24 '12 at 20:42
  • 6
    @Procrastinator: (+2) [The bias need not shrink to zero, either, even when the mean exists for each $n$](http://stats.stackexchange.com/questions/17706/how-to-show-that-an-estimator-is-consistent#comment31833_17707) – cardinal Jun 24 '12 at 20:46
  • My point for the OP is to give a fundamental understanding of what consistency means. So for estimators that are well behaved and have finite variances the estimate is converging to a constant and its distribution is becoming degenerate. – Michael R. Chernick Jun 24 '12 at 20:50
  • @cardinal The point here for the OP is to explain the difference between unbiasedness and consistency and not to confuse the issue with unusual counterexamples. But I do appreciate the need for a qualifying statement. I should have said that for estimators with finite variance to be consistent the variance must go to zero. – Michael R. Chernick Jun 24 '12 at 20:55
  • 1
    Given a sample of size $n$, $x_1,\ldots,x_n$, the estimator of the mean $\delta(x_1,\ldots,x_n)=x_1$ is unbiased but inconsistent, whatever the sample size. – Xi'an Jun 24 '12 at 22:46
  • @Xi'an A good example, but consistency/inconsistency is an asymptotic property and doesn't apply to any finite sample size $n$. What you mean is that $x_1$ does not converge to the mean of the distribution. – Michael R. Chernick Jun 24 '12 at 22:58
  • 5
    @MichaelChernick: Pardon my French! Consistency is by definition an asymptotic property, yes, so indeed the collection of estimators always taking the first observation in the sample is unbiased for all $n$'s and inconsistent. – Xi'an Jun 24 '12 at 23:24
  • My comment was with regard to your statement "is unbiased for all n's and inconsistent" It sounds like you are saying inconsistent for all n even though that is not what you intended. – Michael R. Chernick Jun 25 '12 at 00:46
  • 2
    The statement **"if you took many samples of size n and computed the estimate each time the average of all these estimates would be close to the true parameter value"** is pretty imprecise and would also be true about a (slightly) biased estimator. I'm not sure the bolded statement really adds a lot to the answer, either, as the previous statement - "An estimate is unbiased if its expected value equals the true parameter value" - already describe what unbiasedness means. – Macro Jun 25 '12 at 01:04
  • 1
    The idea was to say something understandable without mentioning limits. So that is the reason for saying close. Of course technically that could be said of estimators that are nearly unbiased. The point was not to give a precise definition but rather to give a clear indication as to what the term means. Certainly it would be very picky to downvote the answer, especially since this is only an informal description of a definition that already has been properly defined. – Michael R. Chernick Jun 25 '12 at 01:27
  • So many confusing comments! However, I still like this answer for emphasizing that one property is concerned with increasing sample sizes, while the other is not. – MathematicalOrchid Jun 25 '12 at 12:16
  • 1
    Thank you. It shouldn't be so complicated but that is unfortunately what mathematical statisticians do. – Michael R. Chernick Jun 25 '12 at 13:26
  • 6
    Michael, the body of your answer is pretty good; I think the confusion was introduced by your first comment, which leads with two statements that are plainly false and potential points of confusion. (Indeed, many students walk away from an introductory graduate statistics class with precisely these misconceptions due to poor delineation between the different modes of convergence and their meaning. Your last comment could be taken to be a little on the harsh side.) – cardinal Jun 25 '12 at 13:33
  • Note that the OP was not confused by my comments which are not false. But I did add some clarification that may or may not have been necessary. – Michael R. Chernick Jun 25 '12 at 13:39
  • 10
    Unfortunately, the first two sentences in your first comment and the entire second comment are false. But, I fear it is not fruitful to further try to convince you of these facts. – cardinal Jun 25 '12 at 13:48
  • 1
    It is a fault of many statisticians to be overly precise and thereby confusing. This is a problem I had but got over. For the most part our answers are not directed at graduate students in statistics and therefore do not need to be mathematically as precise as would be expected for graduate students. – Michael R. Chernick Jun 25 '12 at 13:50
  • @cardinal So then am I to understand you to say that consistency is not an asymptotic property that says that the estimator tends to approach the true value of the parameter as the sample size increases, and that unbiasedness is not a property that applies to all fixed sample sizes? – Michael R. Chernick Jun 25 '12 at 13:55
  • 3
    Michael, I think cardinal is referring to your **comments**, not the sentences in your answer. – Macro Jun 25 '12 at 13:59
  • @Macro Then the objection is to these comments? "For a consistent estimate the variance of the distribution goes to 0 and the estimate approaches the true parameter. The consistent estimate does not have to be unbiased but the bias goes to 0 as n goes to infinity." The first statement presumes the existence of a finite variance and so is correct under the implied assumption. The second sentence is perfectly defensible without any qualification. We both gave examples of such cases. – Michael R. Chernick Jun 25 '12 at 14:12
  • Michael, this isn't my battle at all (and I think @cardinal has explicitly said he's withdrawing) so this is the last I'll input but cardinal posts a link in the comments above apparently disproving your second statement. – Macro Jun 25 '12 at 14:16
  • @Macro The link leads to a statement that asserts that consistent estimates do not necessarily need to be asymptotically unbiased, but that is all; no proof is given for the assertion. Maybe you can fill in the details then. How can an estimator with a finite mean and variance be consistent without the variance going to zero and the estimate approaching the parameter value? – Michael R. Chernick Jun 25 '12 at 14:35
  • 13
    Here is an admittedly absurd, but **simple** example. The idea is to **illustrate** exactly what can go wrong and why. It *does* have practical applications. **Example**: Consider the typical iid model with finite second moment. Let $\hat\theta_n = \bar X_n + Z_n$ where $Z_n$ is independent of $\bar X_n$ and $Z_n = \pm a n$ each with probability $1/n^2$ and is zero otherwise, with $a > 0$ arbitrary. Then $\hat\theta_n$ is unbiased, has variance bounded *below* by $a^2$, and $\hat\theta_n \to \mu$ almost surely (it's strongly consistent). I leave as an exercise the case regarding the bias. – cardinal Jun 25 '12 at 14:48
  • 2
    So, the variance can be made *arbitrarily large* without affecting the (strong!) consistency of the estimator, even when the variance exists for all $n$. – cardinal Jun 25 '12 at 14:48
  • Ok thanks. What are the practical applications? Actually ${\rm var}(Z_n)=2a$. Sorry but I still don't see how the estimator can be strongly consistent with a non-zero bias that does not go to zero. – Michael R. Chernick Jun 25 '12 at 15:50
  • 3
    The variance is $2 a^2$, so the original statement is correct, though the bound is weak. All of the tools for the bias example are present in the previous comment. For example, $\hat\theta_n = \bar X_n + n|Z_n|$ shows the bias can be arbitrary, too. Behavior like this shows up in models with distributions involving heavy tails. – cardinal Jun 25 '12 at 16:22
  • @cardinal Heavy tails but with finite variance and discrete? Looks to me like $E(\hat\theta_n)= E(\bar X_n) + 2a$ approaches $\mu + 2a$ and not $\mu$. – Michael R. Chernick Jun 25 '12 at 16:48
  • 1
    Regarding your second sentence: **Precisely!** Regarding the first, it pertains to your (other) query regarding applications, a point that the examples (purposefully) do not speak to. – cardinal Jun 25 '12 at 16:51
  • @cardinal but the example does not show strong consistency. The bias persists in the limit. So the bias example is wrong! – Michael R. Chernick Jun 25 '12 at 17:17
  • 8
    Michael, The estimator *is* strongly consistent for $\mu$; the second term converges to zero almost surely! Recall the whole point was you asked for an example of a consistent estimator where the bias did not vanish! I've shown it not only does not vanish, but can be made arbitrarily bad. A very small tweak produces an example such that the bias *diverges* to $\infty$. You could also make it oscillate at your whim. – cardinal Jun 25 '12 at 17:24
  • 2
    Thanks for the examples @cardinal. I almost want to post a question "Can an estimator that isn't asymptotically unbiased still be consistent?" so you can get some points for that. – Macro Jun 25 '12 at 17:42
  • 4
    @Macro: Thanks for the nice comment. I realize I sound very strident here and that is not my intention. But, it is good to hash through these matters so that one better understands what's going on. In this particular case, some years ago I encountered a rather applied problem which prompted me to think about such examples. I haven't meant for it to sound like this is "obvious" or "trivial"; it is neither, which makes for a good chance to learn and improve one's intuition. – cardinal Jun 25 '12 at 17:48
  • 1
    @cardinal I know what I asked for but I do not see how you showed it. $n|Z_n|$ has expected value $2a$ since we have that $E[n|Z_n|] = n(na)/n^2 + n(na)/n^2 + 0\cdot(n^2-2)/n^2 = 2a$. So how is it that an estimator can be biased for a parameter $\mu$ for every $n$, and the bias does not go to 0 as $n$ goes to infinity, and yet the estimate is strongly consistent? I accept the example about the variance not going to 0 but this makes no sense to me so far and you certainly haven't explained it. – Michael R. Chernick Jun 25 '12 at 18:00
  • 1
    Michael, as you pointed out, the bias is constant in $n$ - it equals $2a$. But, clearly $n|Z_n| \rightarrow 0$ and we know $\overline{X}_{n} \rightarrow \mu$, implying $\hat{\theta}_n \rightarrow \mu$. Does that make it clear? – Macro Jun 25 '12 at 18:08
  • 3
    Michael, your most recent edit is baffling, especially in light of the careful, detailed comments I've made. I'd strongly urge you to edit further or roll them back. – cardinal Jun 25 '12 at 18:09
  • 3
    I don't like the bit about "apparently pathological cases". What the answer should make clear is that in practice one often proves consistency by showing convergence in $L_2$ (variance and bias tending to zero), but it should be made clear that although this is sufficient, it's not a necessary condition for convergence in probability. – leonbloy Jun 25 '12 at 18:35
  • @Macro As I showed and you agreed, $E[n|Z_n|]$ is not 0, it is $2a$. If $E[n|Z_n|]=2a$ for all $n$ then it must also be $2a$ in the limit. $n|Z_n| = n^2 a$ with probability $2/n^2$. But if $n|Z_n|$ does not go to 0 in expectation, how can $\hat\theta_n\to\mu$? It goes to $\mu+2a$. – Michael R. Chernick Jun 25 '12 at 18:58
  • @leonbloy Yes, the example shows that the sufficient condition is not necessary, but what is wrong with calling the counterexample pathological? – Michael R. Chernick Jun 25 '12 at 19:00
  • 1
    Michael, the bias is $E( \mu - \overline{X}_{n} + n|Z_n|) = 2a$, regardless of $n$ so clearly the bias isn't "going anywhere" as $n$ increases. But also, $\overline{X}_n + n|Z_n| \rightarrow \mu + 0 = \mu$ in probability, clearly, by the LLN and by looking at the definition of $Z_n$ (apparently this convergence also occurs almost surely but that is less obvious to me at a glance). – Macro Jun 25 '12 at 19:01
  • 1
    @Macro: The Borel-Cantelli Lemma gives an easy proof of the almost sure convergence while at the same time avoiding any reference whatsoever to the underlying probability space (which is an additional didactic motivation behind the chosen example). – cardinal Jun 25 '12 at 19:07
  • 1
    @Macro I guess it surprises me that the sequence $E[n|Z_n|]$ converges to $\mu+a$ and $n|Z_n|$ converges to 0 a.s., but I guess it is possible. I revised my edited answer. – Michael R. Chernick Jun 25 '12 at 19:19
  • 4
    Michael, this is a result of the fact that neither convergence in probability or almost sure convergence imply convergence in $L_p$, which is essentially what we're talking about with $p=1$. See http://en.wikipedia.org/wiki/Convergence_of_random_variables#Properties_4. – Macro Jun 25 '12 at 19:31
  • 1
    @Macro yes but it still seems surprising. – Michael R. Chernick Jun 25 '12 at 19:36
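To make the example in @cardinal's comments above concrete, here is a minimal simulation sketch (Python/NumPy; the values of $\mu$, $a$, and the path length $N$ are arbitrary choices). It follows one realization of $\hat\theta_n = \bar X_n + n|Z_n|$, where $Z_n = \pm a n$ each with probability $1/n^2$ and $0$ otherwise: the bias equals $2a$ at every $n \ge 2$, yet $Z_n$ is nonzero only finitely often along a path (Borel-Cantelli), so the realized estimates still converge to $\mu$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, a, N = 0.0, 3.0, 5000       # arbitrary choices for the illustration

ns = np.arange(1, N + 1)
xbar = np.cumsum(rng.normal(mu, 1.0, size=N)) / ns   # running sample mean, Xbar_n

# Z_n = +a*n or -a*n, each with probability 1/n^2, and 0 otherwise (valid for n >= 2).
p = 1.0 / ns.astype(float) ** 2
p[0] = 0.0                                 # n = 1 is a degenerate edge case; set Z_1 = 0
hit = rng.random(N) < 2 * p                # Z_n is nonzero with probability 2/n^2
sign = rng.choice([-1.0, 1.0], size=N)
z = np.where(hit, sign * a * ns, 0.0)

theta = xbar + ns * np.abs(z)   # the biased estimator: bias = E[n|Z_n|] = 2a for n >= 2

last = ns[z != 0].max() if np.any(z != 0) else 0
print("last n on this path with Z_n != 0:", last)        # finite on almost every path
print("theta at n = N:", theta[-1], "(close to mu =", mu, ")")
print("exact bias at every n >= 2: 2*a =", 2 * a)
```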

If we take a sample of size $n$ and calculate the difference between the estimator and the true parameter, this gives a random variable for each $n$. If we take the sequence of these random variables as $n$ increases, consistency means that both the mean and the variance of this difference go to zero as $n$ goes to infinity. Unbiasedness means that this random variable for a particular $n$ has mean zero.

So one difference is that bias is a property for a particular $n$, while consistency refers to the behavior as $n$ goes to infinity. Another difference is that bias has to do just with the mean (an unbiased estimator can be wildly wrong, as long as the errors cancel out on average), while consistency also says something about the variance.

An estimator can be unbiased for all $n$ but inconsistent if the variance doesn't go to zero, and it can be consistent but biased for all $n$ if the bias for each $n$ is nonzero, but going to zero. For instance, if the bias is $\frac 1 n$, the bias is going to zero, but it isn't ever equal to zero; a sequence can have a limit that it doesn't ever actually equal.
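For a concrete instance of that last point, take the $1/n$-divisor variance estimator from the accepted answer, whose bias is exactly $-\sigma^2/n$ (a tiny sketch, with an arbitrary choice of $\sigma^2$):

```python
sigma2 = 1.0                       # hypothetical true variance
for n in (10, 100, 1000, 10_000):
    bias = -sigma2 / n             # exact bias of the 1/n-divisor variance estimator
    print(f"n={n:6d}  bias={bias:+.6f}")   # shrinks toward 0 but never equals 0
```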

Acccumulation
  • Your definition of *consistent* is not general, an estimator can be consistent without variance going to zero, see https://stats.stackexchange.com/questions/120584/asymptotic-consistency-with-non-zero-asymptotic-variance-what-does-it-represen – kjetil b halvorsen Dec 11 '20 at 01:57