15

As I understand it, we obtain the correlation by normalizing the covariance:

$$\rho_{i,j}=\frac{\text{cov}(X_i, X_j)}{\sigma_i \sigma_j}$$

where $\sigma_i=\sqrt{E[(X_i-\mu_i)^2]}$ is the standard deviation of $X_i$.

My concern is: what happens if one of the standard deviations equals zero? Is there any condition that guarantees it cannot be zero?

Thanks.

whuber
chepukha
  • 11
    No variable that has standard deviation 0 could possibly be correlated with another (non-constant) variable. Correlation is a measure of how large/small values in one variable correspond to large/small values in another variable - if one of the variables is equal to a constant with probability 1 (a consequence of having standard deviation 0), then it can't possibly give information about whether the other variable is small or large. I don't know what the convention is but it seems like the correlation should be defined as 0 in that case. – Macro Nov 14 '11 at 04:57
  • Thanks a lot Macro. I think your idea is the same as the answer below. However, I couldn't vote your comment up due to limitation in points. Thanks. – chepukha Nov 14 '11 at 05:15
  • 4
    You have accepted an answer already, and so I will write just a comment. If a random variable $Y$ has standard deviation $\sigma_Y = 0$, then $$\text{cov}(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]=0$$ for any other random variable $X$ (since $(Y-\mu_Y)=0$ with probability $1$). Thus, the definition of the correlation coefficient $\rho_{X,Y}=\frac{\text{cov}(X,Y)}{\sigma_X\sigma_Y}$ gives the indeterminate form $\frac{0}{0}$. It is conventional to _define_ $\rho_{X,Y}$ to be equal to $0$ in this case, and this can be defended on the grounds of the limiting value of $\rho_{X,Y}$ as $\sigma_Y \to 0$ etc. – Dilip Sarwate Nov 14 '11 at 12:00
  • 6
    @Dilip, if it is an answer it should go as an answer. It shouldn't matter whether an answer is accepted already. – Andy W Nov 14 '11 at 13:10
  • 1
    @Dilip The problem with the $\frac{0}{0}$ form is that even if it can be made to have a definite value by means of a limiting operation, the value depends on *how* you take the limit. Whence, the argument that $\rho_{X,Y}= 0$ is incomplete (and unconvincing). Can you cite a source that adopts this convention and supports it with a valid reason? – whuber Nov 14 '11 at 15:11
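
The path dependence raised in the last comment can be made concrete with a small worked example. This is only a sketch: the constant $c$, the scale $\varepsilon > 0$, the coefficient $a$, and the auxiliary variable $Z$ (assumed independent of $X$, with $0 < \sigma_Z < \infty$) are introduced here for illustration and do not come from the thread. Take $Y_\varepsilon = c + \varepsilon\,(aX + Z)$, whose standard deviation shrinks to zero as $\varepsilon \to 0$:

$$
\begin{aligned}
\text{cov}(X, Y_\varepsilon) &= \varepsilon\, a\, \sigma_X^2, \\
\sigma_{Y_\varepsilon} &= \varepsilon \sqrt{a^2 \sigma_X^2 + \sigma_Z^2} \;\to\; 0 \quad \text{as } \varepsilon \to 0, \\
\rho_{X, Y_\varepsilon} &= \frac{\varepsilon\, a\, \sigma_X^2}{\sigma_X \cdot \varepsilon \sqrt{a^2 \sigma_X^2 + \sigma_Z^2}} = \frac{a\, \sigma_X}{\sqrt{a^2 \sigma_X^2 + \sigma_Z^2}} .
\end{aligned}
$$

The factor $\varepsilon$ cancels, so the correlation stays fixed along the whole path while $\sigma_{Y_\varepsilon} \to 0$. Choosing $a = 0$ gives a limit of $0$, while other choices of $a$ give any value strictly between $-1$ and $1$ (and dropping $Z$ altogether gives $\pm 1$). Under these assumptions, no single limiting value is singled out.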

4 Answers

14

It's true that, if one of your SDs is 0, that equation is undefined. However, a better way to think about this is that if one of your SDs is 0, there is no correlation. In loose conceptual terms, a correlation tells you how one variable moves around as the other variable moves around. An SD of 0 implies that the variable is not 'moving around' at all. In a data sample, this happens only when the variable is a vector of a single repeated constant, such as `rep(constant, n_times)` in R.
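
As a rough illustration of the point above, here is a minimal Python sketch (standard library only) of the sample version of the formula in the question. The function name `sample_correlation` and the choice to return `None` for the undefined case are assumptions made for this example, not a convention from the thread.

```python
import math


def sample_correlation(x, y):
    """Pearson correlation of two equal-length samples, or None if undefined."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population-style (divide by n) covariance and standard deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sd_x == 0 or sd_y == 0:
        # The covariance is also 0 here, so the ratio would be 0/0.
        return None
    return cov / (sd_x * sd_y)


x = [1.0, 2.0, 3.0, 4.0]
constant = [5.0] * 4  # a variable that never "moves around"

print(sample_correlation(x, constant))              # undefined case
print(sample_correlation(x, [2.0, 4.0, 6.0, 8.0]))  # perfectly linear case
```

With real floating-point data, a constant column may produce a tiny nonzero SD through rounding, so an exact comparison with zero is a simplification.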

gung - Reinstate Monica
2

Another thing to think about is the set of underlying assumptions we make when we talk about means, standard deviations, and correlations.

If we are talking about a data sample, one common assumption is that the data are (at least approximately) normally distributed, or can be transformed to be so (e.g. via a log transform). If you observe a standard deviation of zero, there are two scenarios: either the true standard deviation is nonzero but so small that every sample in your dataset falls on the same value (this can happen, for example, if you measure at a coarse level of precision); or the model is misspecified.

In this second scenario, the standard deviation, and consequently the correlation, is a meaningless measure.

More generally, the underlying distributions must both have finite second moments and non-zero standard deviations for the correlation to be a valid concept.

tdc
  • It may be worth noting that the original question is about (theoretical) distributions, not about data. – whuber Nov 14 '11 at 15:12
  • If that is the case, then a standard deviation of zero would imply a degenerate distribution with measure only at the mean (i.e. the constant function) ... again the standard deviation only makes sense if the underlying distribution is normal. If the standard deviation is zero, the PDF of the Gaussian is not properly defined, and hence not permissible in the model. – tdc Nov 14 '11 at 15:34
  • I'm surprised at the appearance of Gaussians in your comment, Tom. This seems like an unnecessary restriction. Requiring existence of a pdf also seems restrictive (after all, no discrete distribution has a pdf). Note, too, that the SD is well defined--"meaningful"--whenever the second moment is finite, and this includes probability atoms (your "Dirac delta" functions). – whuber Nov 14 '11 at 15:56
  • OK, I agree I was probably being overly restrictive, but generally this is what people mean by the SD, e.g. from Wolfram: "Standard deviation can be defined for any distribution with finite first two moments, but it is most common to assume that the underlying distribution is normal." Do you take my point though, that if the SD = 0 for one of the variables, the basic assumptions underlying the statistical concept of correlation are not being met? – tdc Nov 14 '11 at 17:10
  • Yes, Tom, your last statement is spot on and I accept it gladly. However, the idea it expresses does not appear very prominently in your reply; if it's there, it's buried in the remarks about normal distributions, logs, delta functions, and the focus on data rather than the distributions themselves. BTW, one should be careful about statistical statements appearing on the Wolfram site: it is so heavily oriented towards mathematics that its characterizations about statistical practice can be questionable. Here, it's dead wrong: the use of SD goes way beyond Normal-distribution settings. – whuber Nov 14 '11 at 17:36
  • I've tried to incorporate the discussion into the text ... – tdc Nov 15 '11 at 17:22
  • +1 For adding a useful and thoughtful perspective to the question. – whuber Nov 15 '11 at 17:29
2

A correlation is the cosine of the angle between two vectors. To say that the standard deviation for Y is zero is the same as saying that the vector Y-mean(Y) is zero (or, more rigorously, that it represents zero in the appropriate vector space). So the question becomes "What can one say about the (cosine of the) angle between the zero vector and the vector X-mean(X)?". More generally, in any vector space with an inner product, what is meant by the angle between the zero vector and some other vector? There's only one answer to this, in my opinion, and that is that the concept of "angle" in this situation is meaningless, and so the concept of correlation in this situation is meaningless.
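
The geometric view can be sketched in a few lines of Python (standard library only). The helper names `centered` and `cosine_of_angle`, and the use of `None` to signal "no angle is defined", are assumptions made for this illustration.

```python
import math


def centered(v):
    """Return v minus its mean, i.e. the vector X - mean(X)."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def cosine_of_angle(u, v):
    """Cosine of the angle between u and v, or None if either is the zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The zero vector has no direction, so no angle can be measured.
        return None
    return dot / (norm_u * norm_v)


x = [1.0, 2.0, 4.0, 7.0]
z = [2.0, 1.0, 5.0, 6.0]
y = [3.0, 3.0, 3.0, 3.0]  # zero standard deviation: centered(y) is the zero vector

print(cosine_of_angle(centered(x), centered(z)))  # an ordinary correlation
print(cosine_of_angle(centered(x), centered(y)))  # no angle, hence no correlation
```

Centering a constant vector yields the zero vector, which is why the zero-SD case reduces to asking for an angle that does not exist.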

David Epstein
1

Disclaimer: I realize that there is already an accepted, quality answer, so this should be a comment, but I don't have the experience points to post one. @Dilip mentioned that you can define the correlation as 0 by convention, but this seems problematic, as it would have a very different interpretation from a correlation that is truly zero (with non-zero SDs). The original question says "if the SD of one variable is zero". If we just stop and think about the definition of 'variable', we get a much more direct path to the answer. A variable with 0 SD is not a variable at all; it is a constant. In that case you don't have two variables, so it doesn't make conceptual sense to define a correlation at all.