
I found this statement in some documentation but I could not make sense of it.

"Correlation is not a good metric for regression because it is scale and offset invariant".

I understand that correlation between height and weight remains the same whether we measure in kg or pound, but what does offset mean and how does this impact regression performance?

Marouen
  • Correlation between what? – Michael M Jun 13 '19 at 19:42
  • "but what does offset mean" I think he meant that the correlation remains exactly the same even if you add any constant to height or weight (or both). – Tamas Ferenci Jun 13 '19 at 20:13
  • Thanks! How does correlation being scale and offset invariant make it a bad metric for regression? – Marouen Jun 13 '19 at 20:57
  • The quotation is incomprehensible (and objectionable) out of context because we have no idea which *quality* of a regression this "metric" is supposed to reflect. Could you explain that? – whuber Jun 13 '19 at 21:52

1 Answer


The correlation coefficient can be used as a measure of a model's goodness of fit: we compare the predictions produced by the model with the observed values. In general, it is a good sign if the predicted values are highly correlated with the observed values; the higher the correlation, the better the model appears to fit the data.

Nevertheless, using correlation as a measure of goodness of fit has some pitfalls:

  • Imagine that a model, for some unknown reason, adds a huge constant to every predicted value. Because correlation is invariant to adding a constant (an offset), it can suggest a great result even though the predictions are very bad.
  • Imagine that a model, for some other unknown reason, multiplies every predicted value by a huge positive constant. The predictions may again be very bad while the correlation stays high, because correlation is invariant to scale.
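Both pitfalls can be checked numerically. The following is only a minimal sketch on synthetic data: the sample size, the noise level, the constants 1000 and 5000, and the helper names `pearson` and `mse` are all illustrative assumptions, and mean squared error is used here simply as one example of a metric that is not scale and offset invariant.

```python
import numpy as np

# Synthetic "observed" values and reasonably good predictions (assumed data).
rng = np.random.default_rng(0)
y_true = rng.normal(loc=70.0, scale=10.0, size=200)
y_pred = y_true + rng.normal(scale=2.0, size=200)

# A broken model: the same predictions, rescaled and shifted by huge constants.
y_bad = 1000.0 * y_pred + 5000.0


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


def mse(a, b):
    """Mean squared error between two 1-D arrays."""
    return np.mean((a - b) ** 2)


r_good = pearson(y_true, y_pred)
r_bad = pearson(y_true, y_bad)
mse_good = mse(y_true, y_pred)
mse_bad = mse(y_true, y_bad)

print(f"correlation: good={r_good:.4f}  bad={r_bad:.4f}")
print(f"MSE:         good={mse_good:.2f}  bad={mse_bad:.2f}")
```

The two correlations agree up to floating-point error, while the mean squared error of the rescaled predictions is larger by many orders of magnitude. Note that the invariance holds for positive scale factors; multiplying by a negative constant flips the sign of the correlation.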
cure
  • I see what you mean and don't disagree, but urge caution. For counterpoint to this--where a completely opposite answer is given!--please visit https://stats.stackexchange.com/a/13317/919. *Correlation (by itself) does not measure goodness of fit.* Indeed, one can and does compare (squared) correlation coefficients to compare models *that differ only in which explanatory variables are included.* As such, they provide information about one *relative* aspect of goodness of fit--but not in any absolute sense. – whuber Jun 13 '19 at 22:21