I want to construct GLMMs to evaluate resource selection, and I have a set of variables (some representing distances and others representing % of land cover).

Can I test for correlation between variables before standardizing them? I am not quite sure what I should do first.

mtao

3 Answers

Can I test for correlation between variables before standardizing them? I am not quite sure what I should do first.

Correlation will be the same regardless of whether you calculate it before or after standardization. To see this, it is enough to know that correlation is invariant to shifts in location and to rescaling by a positive constant. Take $b \in \mathbb{R}$ and $a>0$; then

$$ \begin{aligned} \text{Corr}(aX-b,Y) &= \frac{\text{Cov}(aX-b,Y)}{\sqrt{\text{Var}(aX-b)}\sqrt{\text{Var}(Y)}} \\ &= \frac{\text{Cov}(aX,Y)}{\sqrt{\text{Var}(aX)}\sqrt{\text{Var}(Y)}} \\ &= \frac{a \text{Cov}(X,Y)}{\sqrt{a^2 \text{Var}(X)}\sqrt{\text{Var}(Y)}} \\ &= \frac{a \text{Cov}(X,Y)}{a \sqrt{\text{Var}(X)}\sqrt{\text{Var}(Y)}} \\ &= \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)}\sqrt{\text{Var}(Y)}} \\ &= \text{Corr}(X,Y) \end{aligned} $$

The first equality is a definition.
The second uses the fact that both covariance and variance are invariant to location shifts.
The third uses the properties of covariance and variance with respect to multiplication by a constant.
The fourth uses the fact that $a>0$.
The fifth just cancels out the multipliers.
The sixth is again a definition.

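This covers standardization, which is subtracting the mean and dividing by the standard deviation (a positive number): with $a = 1/\sigma_X$ and $b = \mu_X/\sigma_X$ we get $aX - b = (X - \mu_X)/\sigma_X$. Since correlation is symmetric in its arguments, the same steps apply when $Y$ is standardized as well.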
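The result is easy to check numerically. Below is a minimal sketch in Python with NumPy; the two predictors (`distance`, `cover`) and the helper names are made up for illustration and are not from the question's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictors: a distance (in metres) and a land-cover percentage.
distance = rng.uniform(0, 5000, size=200)
cover = 40 - 0.005 * distance + rng.normal(0, 5, size=200)

def standardize(x):
    """Subtract the mean and divide by the (positive) sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)

def corr(x, y):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

r_raw = corr(distance, cover)
r_std = corr(standardize(distance), standardize(cover))

# The two values agree up to floating-point error.
print(r_raw, r_std)
```

So the order of the two steps does not matter for the correlation itself; checking it on the raw variables simply keeps the units interpretable while exploring the data.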

Richard Hardy

Yes, checking the correlations between your explanatory variables is part of data exploration, as suggested in Zuur et al. (2010), "A protocol for data exploration to avoid common statistical problems". This should be done before you standardize them and construct your GLMMs.

However, I'm not sure how standardizing your explanatory variables first would affect the correlations, but I would guess that the results would be roughly the same.

Zhu Weiji
Mud Warrior

+1 to both answers but just to state the obvious:

Linear correlation is defined as the scaled version of the covariance between two variables. The scaling factor is simply the product of the standard deviations of the two variables. Therefore, standardising (or, for that matter, any linear transformation with a positive slope) will not change the correlation: any prior rescaling that affects the covariance is nullified by the scale normalisation that gives the final correlation estimate. A negative slope only flips the sign.
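To illustrate the scaling argument, here is a small Python/NumPy sketch on simulated data. The variables `x` and `y` and the constants `a` and `b` are arbitrary choices for the example. It shows the covariance picking up the factor `a` while the correlation stays put, and a negative multiplier changing only the sign.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated, positively related variables (illustrative only).
x = rng.normal(size=500)
y = 0.6 * x + rng.normal(size=500)

def cov(u, v):
    """Sample covariance between two 1-D arrays."""
    return np.cov(u, v)[0, 1]

def corr(u, v):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(u, v)[0, 1]

a, b = 3.0, 7.0
x_pos = a * x - b    # linear transformation with positive slope
x_neg = -a * x - b   # same transformation with negative slope

print(cov(x, y), cov(x_pos, y))    # covariance is multiplied by a
print(corr(x, y), corr(x_pos, y))  # correlation is unchanged
print(corr(x_neg, y))              # same magnitude, opposite sign
```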

usεr11852