Questions tagged [pearson-r]

The Pearson product-moment correlation coefficient is a measure of the linear relationship between two variables $X$ and $Y$, giving a value between +1 and −1.

It is defined by the following equation:

$\rho_{XY} = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)} \times \sqrt{\text{Var}(Y)}}$

where

$\rho_{XY}$ = Pearson's correlation coefficient;
$\text{Cov}(X,Y)$ = covariance of random variables $X$ and $Y$;
$\text{Var}(X)$ = variance of random variable $X$;
$\text{Var}(Y)$ = variance of random variable $Y$.
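
As a minimal sketch of how the definition translates into a computation, the Python function below evaluates the covariance and the two variances on a sample and divides as in the formula. The function name `pearson` and the four data points are illustrative assumptions, not part of any library; the population ($1/n$) versions of covariance and variance are used, which gives the same ratio as the sample ($1/(n-1)$) versions because the factor cancels.

```python
import math

def pearson(x, y):
    """Pearson correlation as Cov(X, Y) / (sqrt(Var(X)) * sqrt(Var(Y)))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and variances; the 1/n factors cancel in the ratio.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

# Small illustrative sample (hypothetical values).
x = [1, 2, 3, 4]
y = [1, 4, 3, 5]
r = pearson(x, y)
print(round(r, 7))  # 0.8315218
```

The function is symmetric in its arguments, so `pearson(y, x)` returns the same value.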

While Pearson's $\rho$ is invariant under increasing linear transformations of either variable (and changes only in sign under decreasing ones), it is not invariant under arbitrary monotone transformations, such as the square root or log transforms commonly applied to skewed data. It is also sensitive to outliers, unlike rank-based measures of association such as Spearman's $\rho$ or Kendall's $\tau$.
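
The invariance properties described here can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names `pearson`, `ranks`, and `spearman` are hypothetical, the data are made-up right-skewed values, and the simple ranking helper assumes there are no ties. Spearman's coefficient is computed as the Pearson correlation of the ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation computed from covariance and variances."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

def ranks(values):
    """Ranks 1..n; assumes no tied values (illustrative only)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: y increases with x but is strongly right-skewed.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.5, 4.0, 6.5, 30.0, 50.0, 400.0]
log_y = [math.log(v) for v in y]

r_raw = pearson(x, y)
r_linear = pearson(x, [3.0 * v + 2.0 for v in y])  # increasing linear map
r_flipped = pearson(x, [-v for v in y])            # decreasing linear map
r_log = pearson(x, log_y)                          # monotone, non-linear map
s_raw = spearman(x, y)
s_log = spearman(x, log_y)

print("Pearson raw / linear / flipped / log:", r_raw, r_linear, r_flipped, r_log)
print("Spearman raw / log:", s_raw, s_log)
```

With these values, `r_linear` matches `r_raw` up to rounding and `r_flipped` is its negative, whereas `r_log` differs from `r_raw`. The two Spearman values are identical because the log transform preserves the ordering of `y`.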

470 questions
145
votes
5 answers

How to choose between Pearson and Spearman correlation?

How do I know when to choose between Spearman's $\rho$ and Pearson's $r$? My variable includes satisfaction and the scores were interpreted using the sum of the scores. However, these scores could also be ranked.
user3636
143
votes
6 answers

Pearson's or Spearman's correlation with non-normal data

I get this question frequently enough in my statistics consulting work that I thought I'd post it here. I have an answer, which is posted below, but I was keen to hear what others have to say. Question: If you have two variables that are not…
Jeromy Anglim
134
votes
9 answers

What is the difference between linear regression on y with x and x with y?

The Pearson correlation coefficient of x and y is the same, whether you compute pearson(x, y) or pearson(y, x). This suggests that doing a linear regression of y given x or x given y should be the same, but I don't think that's the case. Can…
user9097
73
votes
3 answers

How to use Pearson correlation correctly with time series

I have 2 time-series (both smooth) that I would like to cross-correlate to see how correlated they are. I intend to use the Pearson correlation coefficient. Is this appropriate? My second question is that I can choose to sample the 2 time-series as…
user1551817
64
votes
5 answers

Is it meaningful to calculate Pearson or Spearman correlation between two Boolean vectors?

There are two Boolean vectors, which contain 0 and 1 only. If I calculate the Pearson or Spearman correlation, are they meaningful or reasonable?
Zhilong Jia
31
votes
3 answers

finding p-value in pearson correlation in R

Is it possible to find the p-value for a Pearson correlation in R? To find the Pearson correlation, I usually do this: `col1 = c(1,2,3,4); col2 = c(1,4,3,5); cor(col1, col2)  # [1] 0.8315218`. But how can I find the p-value of this?
tubby
28
votes
3 answers

If linear regression is related to Pearson's correlation, are there any regression techniques related to Kendall's and Spearman's correlations?

Maybe this question is naive, but: If linear regression is closely related to Pearson's correlation coefficient, are there any regression techniques closely related to Kendall's and Spearman's correlation coefficients?
sitems
26
votes
2 answers

How to interpret Matthews correlation coefficient (MCC)?

The answer for the question Relation between the phi, Matthews and Pearson correlation coefficients? shows that the three coefficient methods are all equivalents. I'm not from statistics, so it should be an easy question. The Matthews paper…
daniel souza
23
votes
1 answer

Are random variables correlated if and only if their ranks are correlated?

Assume $X,Y$ are continuous random variables with finite second moments. The population version of Spearman's rank correlation coefficient $\rho_s$ can be defined as Pearson's product-moment coefficient $\rho$ of the probability integral transforms…
FSpanhel
23
votes
3 answers

Why is Pearson parametric and Spearman non-parametric

Apparently Pearson's correlation coefficient is parametric and Spearman's rho is non-parametric. I'm having trouble understanding this. As I understand it Pearson is computed as $$ r_{xy} = \frac{cov(X,Y)}{\sigma_x\sigma_y} $$ and Spearman is…
user2740
22
votes
2 answers

Shrunken $r$ vs unbiased $r$: estimators of $\rho$

There has been some confusion in my head about two types of estimators of the population value of Pearson correlation coefficient. A. Fisher (1915) showed that for bivariate normal population empirical $r$ is a negatively biased estimator of $\rho$,…
ttnphns
21
votes
3 answers

Analogy of Pearson correlation for 3 variables

I am interested in whether or not a "correlation" of three variables is something, and if so, what would this be? Pearson product moment correlation coefficient $$\frac{\mathrm{E}\{(X-\mu_X)(Y-\mu_Y)\}}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}$$ Now…
PascalVKooten
21
votes
6 answers

Completing a 3x3 correlation matrix: two coefficients of the three given

I was asked this question in an interview. Let's say we have a correlation matrix of the form $$\begin{bmatrix}1&0.6&0.8\\0.6&1&\gamma\\0.8&\gamma&1\end{bmatrix}$$ I was asked to find the value of $\gamma$, given this correlation matrix. I thought I could…
novice
17
votes
2 answers

Why is Pearson's ρ only an exhaustive measure of association if the joint distribution is multivariate normal?

This assertion was raised in the top response to this question. I think the 'why' question is sufficiently different that it warrants a new thread. Googling "exhaustive measure of association" did not produce any hits, and I'm not sure what that…
16
votes
1 answer

How to understand the correlation coefficient formula?

Can anyone help me understand the Pearson correlation formula? the sample $r$ = the mean of the products of the standard scores of variables $X$ and $Y$. I kind of understand why they need to standardize $X$ and $Y$, but how to understand the…
Aaron Lu