Can Spearman's rho be used to calculate correlations between nominal (i.e., locations such as 1 = City1, 2 = City2, 3 = City3) and metrical data (i.e., revenue generated in US dollars)?

I also heard that $\eta$ (i.e., the eta measure of effect size in ANOVA) could be used to achieve this. How would $\eta$ be interpreted?

Nick Cox
John Fisher
  • 1
    Rho is for ordinal or metrical variables, not nominal ones. Eta is the right choice (one of several). Eta is the multiple correlation coefficient, R, of the ANOVA. – ttnphns Oct 13 '14 at 16:51
  • 1
    Quite a similar question http://stats.stackexchange.com/q/119835/3277 – ttnphns Oct 13 '14 at 17:04
  • 1
    What exactly do you mean by "correlation" in this context? Association? Difference in mean? Difference in distribution? Something else? – whuber Oct 13 '14 at 17:34
  • Another relevant question, with a good answer: http://stats.stackexchange.com/questions/15958/how-to-interpret-and-report-eta-squared-partial-eta-squared-in-statistically – Silverfish Nov 18 '14 at 17:05

1 Answer

Nominal data as described above cannot be used in regression, because the ordering of the factor levels has no meaning, i.e., there is no reason City2 should come before City3 and after City1. Eta squared can be interpreted as the proportion of variance explained by the treatment (City). It makes more sense to interpret it after considering the value of the test statistic and the p-value associated with the test. Here's more on eta: http://www.uccs.edu/lbecker/glm_effectsize.html
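As a rough illustration of the "variance explained" reading, here is a minimal sketch that computes eta squared as the between-group sum of squares divided by the total sum of squares. The revenue figures and the helper name `eta_squared` are made up for this example and are not from the question. Note that the result does not depend on how the cities are numbered or ordered, which is why it suits a nominal grouping variable.

```python
import numpy as np

# Hypothetical revenue figures (US dollars) per city; illustrative only.
revenue_by_city = {
    "City1": np.array([100.0, 120.0, 110.0]),
    "City2": np.array([150.0, 160.0, 155.0]),
    "City3": np.array([90.0, 95.0, 100.0]),
}

def eta_squared(groups):
    """Return SS_between / SS_total for a one-way layout (hypothetical helper)."""
    all_values = np.concatenate(list(groups.values()))
    grand_mean = all_values.mean()
    # Total variation of revenue around the grand mean
    ss_total = ((all_values - grand_mean) ** 2).sum()
    # Variation of the city means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups.values())
    return ss_between / ss_total

eta2 = eta_squared(revenue_by_city)  # proportion of variance explained by city
eta = np.sqrt(eta2)                  # eta itself, on the scale of a multiple R

# Relabeling or reordering the cities leaves the value unchanged.
reordered = {name: revenue_by_city[name] for name in ["City3", "City1", "City2"]}
eta2_reordered = eta_squared(reordered)
```

With these invented numbers most of the variation sits between cities, so eta squared comes out close to 1; real revenue data would typically give a much smaller value.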

katya
  • I'm not trying to regress. Would it be possible to use either of these to get just a measurement of correlation? – John Fisher Oct 13 '14 at 16:47
  • Still, this does not change the answer: you can get a measure of variance explained by the treatment via ANOVA (if the underlying parametric assumptions are satisfied), but not a 'correlation', which is a measure of how well the relationship between two variables can be described by a monotonic function. – katya Oct 13 '14 at 16:52
  • 5
    The first sentence could easily be misunderstood: "[n]ominal data cannot be used in regression". The important qualifications are (1) as response variables (2) except that binary data (e.g. 0 or 1) can be used in a linear probability model (some fans, despite many objectors) or in e.g. logit or probit regression. – Nick Cox Oct 13 '14 at 17:23
  • good point, I added "as described above" – katya Oct 13 '14 at 18:45
  • 1
    @katya Sorry, but I don't think that change addresses my point. – Nick Cox Oct 14 '14 at 00:34
  • ok, I'd like to address it correctly: how can it be modified to address your point? Or do you suggest regression is possible with these data? – katya Oct 14 '14 at 03:22
  • If you can do an ANOVA you can do a regression, but city should be treated as a factor, i.e. coded using (number of cities - 1) binary dummies. As ttnphns commented, the R-squared for the regression corresponds to the Eta-squared for the ANOVA, so as a "variance explained" measure it can be seen as a kind of *multiple* correlation. It may not be the most useful measure of effect size for these data: rather than thinking about variance explained, might it be better to consider differences in mean revenue between cities, e.g. Cohen's d? – Silverfish Nov 18 '14 at 17:15
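
The dummy-coding point in the last comment can be checked numerically. The following is a minimal sketch, assuming made-up revenue data and plain NumPy least squares rather than any particular statistics package: it regresses revenue on an intercept plus (number of cities - 1) binary dummies, compares the regression R-squared with the ANOVA eta-squared, and computes a pooled-variance Cohen's d for one pair of cities. The helper `cohens_d` and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: nominal city labels and revenue in US dollars.
city = np.array(["City1"] * 3 + ["City2"] * 3 + ["City3"] * 3)
revenue = np.array([100.0, 120.0, 110.0, 150.0, 160.0, 155.0, 90.0, 95.0, 100.0])

levels = np.unique(city)
# Design matrix: intercept plus one 0/1 dummy per city except the first (reference) level.
X = np.column_stack(
    [np.ones(len(city))] + [(city == lvl).astype(float) for lvl in levels[1:]]
)

# Ordinary least squares fit of revenue on the dummies
coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
fitted = X @ coef
ss_total = ((revenue - revenue.mean()) ** 2).sum()
ss_resid = ((revenue - fitted) ** 2).sum()
r_squared = 1.0 - ss_resid / ss_total

# Eta-squared from the one-way ANOVA decomposition
ss_between = sum(
    (city == lvl).sum() * (revenue[city == lvl].mean() - revenue.mean()) ** 2
    for lvl in levels
)
eta_squared = ss_between / ss_total

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

d_city1_vs_city2 = cohens_d(revenue[city == "City1"], revenue[city == "City2"])
```

Here `r_squared` and `eta_squared` agree up to floating-point error, while Cohen's d describes a single pairwise difference in means, so with three cities there are three such comparisons to report.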