How to check if the correlation between two continuous variables is influenced by a categorical factor?

Question

I have a data frame (df) in which I see a correlation between two continuous variables (c1 and c2). I need to know whether this correlation differs between groups, which are identified by a categorical variable (v1).

I tried to test for an interaction using the lm function in R: I fitted lm(c1~c2*v1,data=df) and looked at the p-value of the interaction term c2:v1. I am not sure if this is correct. Please correct me if I am wrong.
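The interaction approach described above can be sketched as follows. This is only an illustration under stated assumptions: the names df, c1, c2 and v1 follow the question, but the data are simulated and the two-level factor with levels "a" and "b" is made up.

```r
# Simulated data: names follow the question, values are invented for illustration.
set.seed(1)
n  <- 200
v1 <- factor(rep(c("a", "b"), each = n / 2))
c2 <- rnorm(n)
# True slope of c2 is 0.5 in group "a" and 1.5 in group "b"
c1 <- ifelse(v1 == "a", 0.5, 1.5) * c2 + rnorm(n)
df <- data.frame(c1, c2, v1)

# In R's formula syntax, c2 * v1 expands to c2 + v1 + c2:v1
fit          <- lm(c1 ~ c2 * v1, data = df)
fit_explicit <- lm(c1 ~ c2 + v1 + c2:v1, data = df)
same_model   <- isTRUE(all.equal(coef(fit), coef(fit_explicit)))

# p-value for H0: the slope of c2 on c1 is the same in both groups
coefs         <- summary(fit)$coefficients
p_interaction <- coefs["c2:v1b", "Pr(>|t|)"]

print(coefs)
print(same_model)
print(p_interaction)
```

Note that the interaction term tests whether the regression slope of c1 on c2 differs between groups. That is related to, but not the same as, testing whether the correlation coefficients differ, since a correlation also depends on the spread of each variable within each group.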

score 1 · Answer 1 · edited May 23 '17 at 12:39

I think you should compute a Spearman correlation (by group).

See this link, to another stackexchange.com question.

Alternatively, see this other link, to another stackexchange.com question.
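A by-group Spearman correlation might look like the sketch below. The data are simulated and the factor levels "a" and "b" are hypothetical; only the variable names come from the question.

```r
# Simulated data: names follow the question, values are invented for illustration.
set.seed(2)
n  <- 100
df <- data.frame(
  v1 = factor(rep(c("a", "b"), each = n / 2)),
  c2 = rnorm(n)
)
df$c1 <- ifelse(df$v1 == "a", 0.3, 1.0) * df$c2 + rnorm(n)

# Run one Spearman test per level of v1 and collect rho and its p-value
by_group <- lapply(split(df, df$v1), function(d) {
  test <- cor.test(d$c1, d$c2, method = "spearman")
  c(rho = unname(test$estimate), p_value = test$p.value)
})
result <- do.call(rbind, by_group)
print(result)
```

Each p-value here tests whether the correlation within that group differs from zero. It does not by itself say whether the two groups' correlations differ from each other.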

edited May 23 '17 at 12:39 by Community · answered Jul 02 '15 at 06:25 by Fuca26

Thank you for those links. But I also need to know whether the correlations obtained for the different levels of the factor are statistically different from each other. Any comment on that? – Veera Jul 02 '15 at 06:30
Doesn't R give you p-values as well? That's what you should look at. – Fuca26 Jul 02 '15 at 06:35
I meant that if the correlation is 0.4 in subset1 and 0.6 in subset2, I need to know whether the correlation in subset1 is statistically different from the correlation in subset2. Hope you get me. – Veera Jul 02 '15 at 06:37
Let's say I am looking at the correlation between blood insulin and glucose levels in a group of participants. If I need to know how the correlation varies when I group them into diabetics and non-diabetics, what should I do? – Veera Jul 02 '15 at 06:49
I have tried to clarify your question. Roll it back if you do not like it or if it distorts the meaning of your initial question. – Fuca26 Jul 02 '15 at 06:51
Ok, given what I understand now, your initial approach seems to make sense. One thing though (I am not a doctor): is there any reason to think that being diabetic per se does not affect c1? – Fuca26 Jul 02 '15 at 06:57
That is just an example I gave, since it's a well-known fact. – Veera Jul 02 '15 at 06:59
Normally, when you use an interaction in your regression, you should also include the main effects; you would exclude the main effect of v1 or c2 only if you had a very strong reason for doing so. Note that in R's formula syntax c2*v1 already expands to c2 + v1 + c2:v1, so lm(c1~c2*v1,data=df) includes them, and lm(c1~c2*v1 + c2 + v1,data=df) fits the same model. But I am not sure it is the best way of doing that. – Fuca26 Jul 02 '15 at 07:05
See the previous comment. The coefficient on c2 gives you the effect of c2 on c1 when v1 is 0 (the reference level). When v1 is 1, the total effect of c2 on c1 is given by the sum of the coefficients on c2 and c2:v1. The coefficient on v1 gives you the effect of v1 on c1 when c2 is 0. If there were a difference between groups in the effect of c2 on c1, the p-value of the interaction term would lead you to reject H0 (that the interaction coefficient is zero); with no difference, you would normally fail to reject it. – Fuca26 Jul 02 '15 at 07:08
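For the question raised in the comments (is a correlation of 0.4 in one subset statistically different from 0.6 in another?), one commonly used option that the thread does not mention is Fisher's z-transformation for two independent Pearson correlations. The sketch below is an assumption-laden illustration, not the method proposed in the answer: compare_correlations is a hypothetical helper, and the group sizes of 50 are made up because the thread does not give them.

```r
# Hypothetical helper: compare two independent Pearson correlations
# using Fisher's z-transformation (atanh).
compare_correlations <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                           # Fisher z of the first correlation
  z2 <- atanh(r2)                           # Fisher z of the second correlation
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # standard error of z1 - z2
  z  <- (z1 - z2) / se
  p  <- 2 * pnorm(-abs(z))                  # two-sided p-value
  list(z = z, p_value = p)
}

# Correlations taken from the comments (0.4 and 0.6); sample sizes are assumed.
res <- compare_correlations(r1 = 0.4, n1 = 50, r2 = 0.6, n2 = 50)
print(res)
```

With these assumed sample sizes the p-value comes out at about 0.19, so the two correlations would not be judged different at the 5% level; larger groups could change that conclusion. The test assumes the two groups are independent samples and that Pearson correlations are appropriate for the data.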
