I have two correlation coefficients ($r_1$ and $r_2$), obtained within the same sample (20 subjects). My aim is to test whether they are significantly different. $r_1$ is the correlation between a neurophysiological parameter and a behavioural parameter in condition A; $r_2$ is the correlation between the same neurophysiological parameter and the same behavioural parameter in condition B.

I was thinking of applying a bootstrap procedure to each condition, in order to obtain two distributions of correlations. Then I could simply run a two-sample t-test on these distributions to test for a significant difference.

My questions:

  1. Does this procedure seem reasonable for my purpose (testing whether $r_1$ and $r_2$ are significantly different)?
  2. Is there a way to decide the number of iterations, or is it totally arbitrary? (For example, can I go with 1000?)
Ian_Fin
smndpln
  • 4
    If you're going to generate multiple correlation coefficients through bootstrapping then why not generate a distribution of $r_1-r_2$ and use that to calculate a p value for the difference in coefficients that you observed? If something is wrong with this then hopefully someone will explain why. – Ian_Fin Nov 16 '16 at 10:22
  • It seems that utobi does not agree (see below). In your opinion, is it better to use a permutation or a bootstrap approach in this situation? – smndpln Nov 16 '16 at 15:57
  • From skimming their response, it seems @utobi and I agree on what the relevant statistic might be ($r_1-r_2$). I have no opinion on whether a permutation or bootstrap approach to generating that statistic is more appropriate. Hopefully someone else, maybe utobi, will be able to answer that for you. – Ian_Fin Nov 16 '16 at 16:38
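The suggestion in the comments (bootstrap the difference $r_1-r_2$ directly) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the names `neuro_a`, `behav_a`, `neuro_b`, `behav_b` and `boot_cor_diff` are hypothetical, and subjects are resampled as whole units because both conditions come from the same 20 people. Note that running a t-test on two bootstrap distributions would treat bootstrap replicates as if they were observations, so its p-value would shrink as the number of iterations grows; a percentile interval for the difference avoids that.

```r
# Hypothetical simulated data: 20 subjects, each measured in conditions A and B
set.seed(1)
n <- 20
neuro_a <- rnorm(n); behav_a <- 0.6 * neuro_a + rnorm(n)
neuro_b <- rnorm(n); behav_b <- 0.1 * neuro_b + rnorm(n)

# Bootstrap the difference of correlations, resampling subjects
# (the same indices are used in A and B, which keeps the pairing intact)
boot_cor_diff <- function(neuro_a, behav_a, neuro_b, behav_b, nboot = 5000) {
  n <- length(neuro_a)
  replicate(nboot, {
    idx <- sample(n, n, replace = TRUE)
    cor(neuro_a[idx], behav_a[idx]) - cor(neuro_b[idx], behav_b[idx])
  })
}

# Observed difference and a 95% percentile bootstrap interval
obs_diff  <- cor(neuro_a, behav_a) - cor(neuro_b, behav_b)
boot_diff <- boot_cor_diff(neuro_a, behav_a, neuro_b, behav_b)
ci <- quantile(boot_diff, c(0.025, 0.975))
ci
```

If the interval excludes zero, that is some evidence that the two correlations differ; with only 20 subjects the interval will typically be wide.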

2 Answers

1

I think in your case it is best to use a permutation test in which you compute a permuted correlation for each condition and then take their difference. For instance, you can concatenate row-wise your two variables under condition A and those under condition B, so that you end up with a $(20 \cdot 2) \times 2$ matrix. Then you permute across these 40 rows and get the p-value, as shown in the following R code:

# set the number of permutations
nperm <- 5000 # needs to be large enough, but depends on the sample size

# preallocate a vector for the differences of correlations
cor.dif <- rep(NA, nperm)

# simulate some fake data
n1 <- n2 <- 20 # 20 observations per condition, as in the question
x1a <- runif(n1)
x2a <- rnorm(n2)
x1b <- rnorm(n1)
x2b <- runif(n2)

X1 <- cbind(x1a, x2a) # the two measurements in cond. A
X2 <- cbind(x1b, x2b) # the two measurements in cond. B

# concatenate row-wise X1 and X2
X <- rbind(X1, X2) # this is the matrix of 20*2 x 2

# now start permuting
for(i in 1:nperm){

  # sample n1 of the n1+n2 row indices, without replacement
  idx <- sample(n1+n2, n1, replace = FALSE)

  # calculate the permuted correlation in the first condition
  cor.1 <- cor(X[idx,1],X[idx,2])

  # calculate the permuted correlation in the second condition (remaining rows)
  cor.2 <- cor(X[-idx,1],X[-idx,2])

  # store the dif. of correlations
  cor.dif[i] <- cor.1-cor.2
}

# compute the empirical/actual difference of correlations (cond. A minus cond. B)
emp.cor.dif <- cor(x1a, x2a)-cor(x1b, x2b)

# look at the plot
hist(cor.dif)
abline(v = emp.cor.dif)

# compute the Monte Carlo approximation of the permutation p-value
2*min(mean(cor.dif>emp.cor.dif), mean(cor.dif<emp.cor.dif))
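On the question of how many iterations are enough: a Monte Carlo p-value is an estimated proportion, so its simulation error can be approximated with the usual binomial standard error $\sqrt{p(1-p)/B}$, where $B$ is the number of permutations. A small sketch (the helper name `mc_se` is made up for illustration, and the formula applies to a one-sided proportion such as `mean(cor.dif > emp.cor.dif)`; the doubled two-sided value has about twice that error):

```r
# Approximate Monte Carlo standard error of an estimated tail proportion p
# based on nperm random permutations (binomial approximation)
mc_se <- function(p, nperm) sqrt(p * (1 - p) / nperm)

# Simulation error near p = 0.05 for a few choices of nperm
nperm_grid <- c(1000, 5000, 10000)
se_grid <- mc_se(0.05, nperm_grid)
data.frame(nperm = nperm_grid, se = se_grid)
```

With 1000 permutations the error near $p = 0.05$ is roughly 0.007, which may be too coarse if the p-value is close to the threshold; going to 5000 or 10000 roughly halves or thirds it. So the number is not totally arbitrary: it can be chosen from the precision needed around the decision boundary.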
utobi
  • Thanks! I get the method here, but, actually, why should I correlate across conditions? – smndpln Nov 16 '16 at 10:58
  • If you are referring to my answer, you do not take correlation across conditions, but within conditions. Lastly, you take the difference of correlations at each condition and test if such a difference is zero or not. Is that clear? – utobi Nov 16 '16 at 11:06
  • 1
    Ok, that is clear, sorry for misreading the script. In your opinion, why should permutation be used here instead of bootstrap? – smndpln Nov 16 '16 at 11:11
  • Here and in general, I prefer permutation testing procedures because they exactly control type I error rate. – utobi Nov 16 '16 at 16:54
-1

If you are testing the effect of the behavioural parameter on the neurophysiological parameter under two different conditions, you can test the interaction between condition and the behavioural parameter in a model of the neurophysiological parameter. If adding the interaction term significantly improves the model, you can say that the association between the behavioural and the neurophysiological parameter differs between the two conditions.
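A minimal sketch of what "adding the interaction term" could look like in R is given below. The data are simulated and the column names (`subject`, `condition`, `behav`, `neuro`) are hypothetical. Two caveats apply: a plain `lm` ignores the fact that the same subjects appear in both conditions, and the interaction compares regression slopes, which is related to but not the same as comparing correlation coefficients.

```r
# Hypothetical long-format data: 20 subjects, two conditions each
set.seed(1)
n <- 20
dat <- data.frame(
  subject   = factor(rep(1:n, times = 2)),
  condition = factor(rep(c("A", "B"), each = n)),
  behav     = rnorm(2 * n)
)
# Simulate a stronger behav-neuro relationship in condition A than in B
dat$neuro <- ifelse(dat$condition == "A", 0.6, 0.1) * dat$behav + rnorm(2 * n)

# Model without and with the behav-by-condition interaction
fit_main <- lm(neuro ~ behav + condition, data = dat)
fit_int  <- lm(neuro ~ behav * condition, data = dat)

# F-test for whether the interaction term improves the fit
cmp   <- anova(fit_main, fit_int)
p_int <- cmp$`Pr(>F)`[2]
cmp
```

A small `p_int` would suggest that the slope of `neuro` on `behav` differs between conditions. Accounting for the repeated measures would need a different model (for example a mixed model with a per-subject term), which is beyond this sketch.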

  • What do you mean by "If the addition of interaction term is significant"? Also, I think you are talking about using an ANCOVA for this analysis. Am I right? – smndpln Nov 17 '16 at 14:40