
I want to calculate the correlation between variables which aren't normally distributed, so I am trying to use Spearman. There are some missing values and I don't know how to handle them.

The cor function documentation in R says:

"In the case of missing values, the ranks are calculated depending on the value of use, either based on complete observations, or based on pairwise completeness with reranking for each pair."

It also says:

"For cov and var, "pairwise.complete.obs" only works with the "pearson" method."

So how do I set "based on pairwise completeness with reranking for each pair" as the criterion for "use"?

cor(df, method="spearman", use=WHAT?)
Karolis Koncevičius
user11916948
  • Non-normal distributions really aren't a sufficient reason not to use Pearson correlation. The only issue might be how else to get at P-values or more generally carry out statistical inference. Much discussed here. See e.g. https://stats.stackexchange.com/questions/3730/pearsons-or-spearmans-correlation-with-non-normal-data – Nick Cox Mar 16 '20 at 12:59

1 Answer


Use:

library(Hmisc)
rcorr(as.matrix(data), type="spearman")

I have also used the following in the past. Can't try it right now since I'm at lunch. EDIT: This worked for me:

cor(data, method = 'spearman', use='pairwise.complete.obs')
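
Note that the sentence quoted in the question about "pairwise.complete.obs" only working with "pearson" refers to cov and var, not to cor. A minimal sketch of how one could check that cor re-ranks within each pair is below; the data frame and its column names (x, y, z) are made up for illustration and are not from the question.

```r
# Toy data with missing values (hypothetical, for illustration only).
df <- data.frame(
  x = c(1, 2, NA, 4, 5, 6),
  y = c(2, 1, 3, NA, 6, 5),
  z = c(6, 5, 4, 3, NA, 1)
)

# Spearman correlation matrix using pairwise-complete observations.
rho <- cor(df, method = "spearman", use = "pairwise.complete.obs")

# Manual check for one pair: keep only rows where both x and y are
# present, then compute Spearman on that subset (ranks are taken
# after the incomplete rows have been dropped).
ok <- complete.cases(df$x, df$y)
rho_xy <- cor(df$x[ok], df$y[ok], method = "spearman")

# Compare the matrix entry with the manual result.
same <- isTRUE(all.equal(rho[["x", "y"]], rho_xy))
print(same)
```

If the two values agree, the matrix entry for each pair is based only on the rows where both variables are observed, so different entries can be based on different numbers of rows.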