Questions tagged [variance]

The expected squared deviation of a random variable from its mean; or, the average squared deviation of data about their mean.

The variance of a random variable $X$ is the expected squared deviation from its mean:

$$\mbox{Var}\left[X\right] = \mbox{E}\left[\left(X - \mbox{E}\left[X\right]\right)^2\right] = \mbox{E}\left[X^2\right] - \left(\mbox{E}\left[X\right]\right)^2.$$

As such, the variance captures the "spread" of a random variable around its expected value. The square root of the variance is the standard deviation.
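As a quick numeric illustration (a minimal sketch in Python with NumPy; the support and probabilities below are made up), both forms of the definition give the same number:

```python
import numpy as np

# Hypothetical discrete distribution: P(X = 1) = 0.5, P(X = 2) = 0.3, P(X = 5) = 0.2
values = np.array([1.0, 2.0, 5.0])
probs = np.array([0.5, 0.3, 0.2])

mean = np.sum(probs * values)                            # E[X]
var_deviation = np.sum(probs * (values - mean) ** 2)     # E[(X - E[X])^2]
var_moments = np.sum(probs * values ** 2) - mean ** 2    # E[X^2] - (E[X])^2

print(var_deviation, var_moments)   # both 2.29
print(np.sqrt(var_deviation))       # the standard deviation
```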

The variance of a dataset is the mean squared deviation from its mean, sometimes called a "population variance."

The two kinds of variance are related. Variance in the first sense is a property of a random variable; one way to estimate that property from data (viewed as $n$ independent realizations of the variable) is to use the population variance of the data. A related estimator, called the "sample variance," equals $n/(n-1)$ times the population variance; this correction makes it an unbiased estimator of the variable's variance.
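A minimal sketch (Python with NumPy; the data vector is made up) of how the two data variances and the $n/(n-1)$ factor fit together:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])   # illustrative data
n = len(x)

pop_var = np.var(x, ddof=0)    # population variance: divisor n
samp_var = np.var(x, ddof=1)   # sample variance: divisor n - 1

print(pop_var, samp_var)                  # 4.0 and ~4.571
print(samp_var / pop_var, n / (n - 1))    # both equal 8/7
```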

Not all random variables have finite variance. This occurs when $\mbox{E}\left[X^2\right] $ diverges. For example, the Cauchy distribution (Student t distribution with 1 degree of freedom) does not have a finite variance.
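One way to see this empirically (a minimal sketch in Python with NumPy; the sample sizes and seed are arbitrary): the running sample variance of Cauchy draws never settles down, while that of standard normal draws converges to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
cauchy = rng.standard_cauchy(100_000)
normal = rng.standard_normal(100_000)

for n in (100, 1_000, 10_000, 100_000):
    print(n, np.var(cauchy[:n], ddof=1), np.var(normal[:n], ddof=1))
# The normal column approaches 1; the Cauchy column keeps jumping around
# because a few extreme draws dominate the sum of squares.
```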

3779 questions
262
votes
10 answers

How would you explain covariance to someone who understands only the mean?

...assuming that I'm able to augment their knowledge about variance in an intuitive fashion (Understanding "variance" intuitively) or by saying: It's the average distance of the data values from the 'mean' - and since variance is in square units,…
PhD
  • 13,429
  • 19
  • 45
  • 47
172
votes
7 answers

What's the difference between variance and standard deviation?

I was wondering what the difference between the variance and the standard deviation is. If you calculate the two values, it is clear that you get the standard deviation out of the variance, but what does that mean in terms of the distribution you…
Le Max
  • 3,559
  • 9
  • 26
  • 26
122
votes
8 answers

Bias and variance in leave-one-out vs K-fold cross validation

How do different cross-validation methods compare in terms of model variance and bias? My question is partly motivated by this thread: Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?. The answer…
98
votes
9 answers

Understanding "variance" intuitively

What is the cleanest, easiest way to explain the concept of variance to someone? What does it intuitively mean? If one is to explain this to their child, how would one go about it? It's a concept that I have difficulty in articulating - especially when…
PhD
  • 13,429
  • 19
  • 45
  • 47
95
votes
4 answers

Does the variance of a sum equal the sum of the variances?

Is it (always) true that $$\mathrm{Var}\left(\sum\limits_{i=1}^m{X_i}\right) = \sum\limits_{i=1}^m{\mathrm{Var}(X_i)} \>?$$
Abe
  • 3,561
  • 7
  • 27
  • 45
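For reference, the identity holds whenever the $X_i$ are pairwise uncorrelated (independence is sufficient); otherwise the covariance terms matter. A minimal simulation sketch (Python with NumPy; the distributions and coefficients are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Independent case: Var(X + Y) matches Var(X) + Var(Y).
x = rng.normal(0, 1, n)    # Var(X) = 1
y = rng.normal(0, 2, n)    # Var(Y) = 4
print(np.var(x + y), np.var(x) + np.var(y))    # both near 5

# Correlated case: Var(Y + Z) = Var(Y) + Var(Z) + 2 Cov(Y, Z), so the naive sum is off.
z = x + 0.8 * y
print(np.var(y + z),
      np.var(y) + np.var(z),
      np.var(y) + np.var(z) + 2 * np.cov(y, z)[0, 1])
```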
79
votes
5 answers

How exactly did statisticians agree to using (n-1) as the unbiased estimator for population variance without simulation?

The formula for computing variance has $(n-1)$ in the denominator: $s^2 = \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1}$ I've always wondered why. However, reading and watching a few good videos about "why" it is, it seems $(n-1)$ is a good unbiased…
PhD
  • 13,429
  • 19
  • 45
  • 47
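For reference, the algebra behind the $(n-1)$ denominator (assuming $n$ i.i.d. observations with mean $\mu$ and variance $\sigma^2$) is short:

$$\mathrm{E}\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \mathrm{E}\left[\sum_{i=1}^{n}x_i^2\right] - n\,\mathrm{E}\left[\bar{x}^2\right] = n\left(\sigma^2+\mu^2\right) - n\left(\frac{\sigma^2}{n}+\mu^2\right) = (n-1)\,\sigma^2,$$

so dividing the sum of squared deviations by $n-1$ gives an unbiased estimator of $\sigma^2$.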
67
votes
1 answer

Variance of product of multiple independent random variables

We know the answer for two independent variables: $$ {\rm Var}(XY) = E(X^2Y^2) − (E(XY))^2={\rm Var}(X){\rm Var}(Y)+{\rm Var}(X)(E(Y))^2+{\rm Var}(Y)(E(X))^2$$ However, if we take the product of more than two variables, ${\rm Var}(X_1X_2 \cdots…
damla
  • 791
  • 1
  • 7
  • 5
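A minimal Monte Carlo sketch (Python with NumPy; the normal parameters are made up) checking the two-variable formula quoted above, together with the general form for independent variables, $\mathrm{Var}\left(\prod_i X_i\right)=\prod_i \mathrm{E}\left[X_i^2\right]-\prod_i \left(\mathrm{E}\left[X_i\right]\right)^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000_000

x = rng.normal(1.0, 2.0, n)    # E[X] = 1,  Var(X) = 4
y = rng.normal(3.0, 1.0, n)    # E[Y] = 3,  Var(Y) = 1
z = rng.normal(-2.0, 0.5, n)   # E[Z] = -2, Var(Z) = 0.25

# Two variables: simulated Var(XY) vs. Var(X)Var(Y) + Var(X)E[Y]^2 + Var(Y)E[X]^2.
print(np.var(x * y), 4 * 1 + 4 * 9 + 1 * 1)    # both near 41

# Three variables: simulated variance vs. prod E[Xi^2] - prod (E[Xi])^2.
lhs = np.var(x * y * z)
rhs = (np.mean(x**2) * np.mean(y**2) * np.mean(z**2)
       - (np.mean(x) * np.mean(y) * np.mean(z)) ** 2)
print(lhs, rhs)    # both near 176.5
```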
59
votes
7 answers

Intuitive explanation of the bias-variance tradeoff?

I am looking for an intuitive explanation of the bias-variance tradeoff, both in general and specifically in the context of linear regression.
NPE
  • 5,351
  • 5
  • 33
  • 44
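One worked equation that often anchors that intuition (for squared-error loss at a fixed input $x$, with irreducible noise variance $\sigma^2$ and expectations taken over training samples):

$$\mathrm{E}\left[\left(y - \hat{f}(x)\right)^2\right] = \left(\mathrm{Bias}\left[\hat{f}(x)\right]\right)^2 + \mathrm{Var}\left[\hat{f}(x)\right] + \sigma^2,$$

so for a fixed level of expected error, shrinking one of the first two terms tends to inflate the other.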
58
votes
6 answers

How can a distribution have infinite mean and variance?

It would be appreciated if the following examples could be given: A distribution with infinite mean and infinite variance. A distribution with infinite mean and finite variance. A distribution with finite mean and infinite variance. A distribution…
58
votes
5 answers

What is the difference between N and N-1 in calculating population variance?

I did not get why there are N and N-1 when calculating population variance. When do we use N and when do we use N-1? It says that when the population is very big there is no difference between N and N-1, but it does not…
ilhan
  • 932
  • 3
  • 11
  • 19
51
votes
7 answers

When conducting a t-test why would one prefer to assume (or test for) equal variances rather than always use a Welch approximation of the df?

It seems like when the assumption of homogeneity of variance is met that the results from a Welch adjusted t-test and a standard t-test are approximately the same. Why not simply always use the Welch adjusted t?
russellpierce
  • 17,079
  • 16
  • 67
  • 98
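For reference, the Welch adjustment mentioned in the question estimates the degrees of freedom with the Welch–Satterthwaite approximation,

$$\nu \approx \frac{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1-1}+\dfrac{\left(s_2^2/n_2\right)^2}{n_2-1}},$$

which equals the pooled value $n_1+n_2-2$ when the two groups have equal sizes and equal sample variances; this is why the two tests agree closely under homogeneity of variance.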
49
votes
3 answers

Derive Variance of regression coefficient in simple linear regression

In simple linear regression, we have $y = \beta_0 + \beta_1 x + u$, where $u \sim iid\;\mathcal N(0,\sigma^2)$. I derived the estimator: $$ \hat{\beta_1} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}\ , $$ where $\bar{x}$…
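The derivation asked for typically proceeds by writing $\hat{\beta}_1$ as a linear combination of the $y_i$ (treating the $x_i$ as fixed): since $\sum_i (x_i-\bar{x})(y_i-\bar{y}) = \sum_i (x_i-\bar{x})\,y_i$, one has $\hat{\beta}_1 = \sum_i k_i y_i$ with $k_i = (x_i-\bar{x})/\sum_j (x_j-\bar{x})^2$, and the independence of the errors gives

$$\mathrm{Var}\left(\hat{\beta}_1\right) = \sigma^2 \sum_i k_i^2 = \frac{\sigma^2 \sum_i (x_i-\bar{x})^2}{\left(\sum_i (x_i-\bar{x})^2\right)^2} = \frac{\sigma^2}{\sum_i (x_i-\bar{x})^2}.$$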
45
votes
1 answer

Computing Cohen's Kappa variance (and standard errors)

The Kappa ($\kappa$) statistic was introduced in 1960 by Cohen [1] to measure agreement between two raters. Its variance, however, had been a source of contradictions for quite some time. My question is about which is the best variance…
Cesar
  • 984
  • 1
  • 9
  • 21
45
votes
5 answers

What is the difference between a population and a sample?

What is the difference between a population and a sample? What common variables and statistics are used for each one, and how do those relate to each other?
Baltimark
  • 2,028
  • 3
  • 19
  • 20
44
votes
5 answers

Why does increasing the sample size lower the (sampling) variance?

Big picture: I'm trying to understand how increasing the sample size increases the power of an experiment. My lecturer's slides explain this with a picture of 2 normal distributions, one for the null-hypothesis and one for the alternative-hypothesis…
user2740
  • 1,226
  • 2
  • 12
  • 19
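The identity behind the picture: for $n$ i.i.d. observations with variance $\sigma^2$, the sampling variance of the mean is

$$\mathrm{Var}\left(\bar{X}\right) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}\left(X_i\right) = \frac{\sigma^2}{n},$$

so larger $n$ narrows both sampling distributions, reduces their overlap, and thereby raises the power.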