I have two data sets and I would like to measure how similar their distributions are in terms of their bin frequencies. In other words, I am not interested in the correlation between the sequences of data points but rather in how similar their distributional properties are. Currently I can only judge the similarity by eyeballing the histograms, which is not enough. I don't want to assume causality and I don't want to predict anything at this point, so I assume that correlation is the way to go.
Spearman's correlation coefficient is commonly used for non-normal data, and since I don't know anything about the real underlying distribution of my data, I think it would be a safe bet. I wonder whether this measure can also be used to compare distributional data (the bin counts) rather than the data points that are summarized in a distribution. Here is some example code in R that illustrates what I would like to check:
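# two samples from a standard normal and one from a uniform distribution
aNorm <- rnorm(1000000)
bNorm <- rnorm(1000000)
cUni <- runif(1000000)
# histograms with default breaks (chosen separately for each sample)
ha <- hist(aNorm)
hb <- hist(bNorm)
hc <- hist(cUni)
# bin counts of each histogram
print(ha$counts)
print(hb$counts)
print(hc$counts)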
# relatively similar: normal vs. normal (only the first n bins of each are compared)
n <- min(length(ha$counts), length(hb$counts))
cor.test(ha$counts[1:n], hb$counts[1:n], method="spearman")
# quite different: normal vs. uniform
n <- min(length(ha$counts), length(hc$counts))
cor.test(ha$counts[1:n], hc$counts[1:n], method="spearman")
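One thing I am unsure about in the code above: hist() chooses its break points separately for each sample, so bin i of one histogram need not cover the same interval as bin i of another, and cutting both count vectors to the first n bins does not align them. Below is a minimal sketch of a variant where all histograms share the same breaks. The fixed seed, the smaller sample size, and the choice of roughly 20 bins via pretty() are my own assumptions for illustration only.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# smaller samples than above, purely to keep the sketch fast
a <- rnorm(10000)
b <- rnorm(10000)
u <- runif(10000)

# common break points covering the range of all three samples
# (about 20 bins is an arbitrary choice)
breaks <- pretty(range(a, b, u), n = 20)

# plot = FALSE returns the histogram object without drawing it
ha <- hist(a, breaks = breaks, plot = FALSE)
hb <- hist(b, breaks = breaks, plot = FALSE)
hu <- hist(u, breaks = breaks, plot = FALSE)

# bin i now covers the same interval in every histogram,
# so the count vectors have equal length and can be compared position by position
rho_ab <- cor(ha$counts, hb$counts, method = "spearman")
rho_au <- cor(ha$counts, hu$counts, method = "spearman")
print(c(normal_vs_normal = rho_ab, normal_vs_uniform = rho_au))
```

With shared breaks the comparison is at least between matching intervals, although Spearman's coefficient still only looks at the rank order of the bin counts and not at their magnitudes.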
Does this make sense or am I violating some assumptions of the coefficient?
Thanks, R.