
An obvious way to find out whether the relationship between X and Y is linear is to construct a scatterplot.

But is there any other way, e.g. a statistical test?

My X and Y variables have quite a few data points (e.g. 5K each).

Ken Lee
  • A visual check is often quicker: try [Anscombe's quartet](https://stats.stackexchange.com/a/16505/2958) to see how fast this can be done. Quantitative methods, for example on the residuals from a linear fit, will often pick up "statistically significant" deviations from the assumptions, especially when you have a large number of observations, even when those deviations are small and not worth worrying about. – Henry May 15 '21 at 17:23
  • 5K datapoints is a very small dataset btw – Firebug May 15 '21 at 17:35
  • You can plot 5k points. – Tim May 15 '21 at 21:01
  • @Henry's comment is important (check the link), but if you need a direct response to the issue of plot/don't plot, then PLOT THE DATA! – Michael Lew May 15 '21 at 21:29

1 Answer

You can test for that by checking whether a model that allows for a non-linear effect (e.g., by including polynomial or spline terms) fits statistically significantly better than a model that has only a linear effect, for example by using an F-test.
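To make this concrete, here is a minimal sketch of such a comparison in plain Python. It assumes the simplest non-linear alternative, a single added quadratic term, and uses simulated data, so the function names (`fit_polynomial`, `linearity_f_test`) and the data are purely illustrative. Because only one extra term is tested and n is large, the p-value is approximated with the chi-square(1) tail rather than the exact F distribution; in practice a statistics package would give the exact value.

```python
import math
import random


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    k = degree + 1
    # Normal equations (X^T X) b = X^T y for the design matrix [1, x, x^2, ...]
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / a[i][i]
    return coef


def rss(x, y, coef):
    """Residual sum of squares of a fitted polynomial."""
    return sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )


def linearity_f_test(x, y):
    """Compare y ~ x against y ~ x + x^2; returns (F statistic, approximate p)."""
    n = len(x)
    mean_x = sum(x) / n
    xc = [xi - mean_x for xi in x]  # centring improves conditioning, same fit
    rss_lin = rss(xc, y, fit_polynomial(xc, y, 1))
    rss_quad = rss(xc, y, fit_polynomial(xc, y, 2))
    df_resid = n - 3  # the quadratic model has 3 parameters
    f_stat = max(rss_lin - rss_quad, 0.0) / (rss_quad / df_resid)
    # One extra term and large n: F(1, n-3) is close to chi-square(1),
    # whose upper tail is erfc(sqrt(f / 2)). This is an approximation.
    p_approx = math.erfc(math.sqrt(f_stat / 2.0))
    return f_stat, p_approx


# Simulated data (an assumption for illustration): 5,000 points each
random.seed(0)
x = [random.uniform(0, 10) for _ in range(5000)]
y_linear = [2 + 3 * xi + random.gauss(0, 1) for xi in x]
y_curved = [2 + 3 * xi + 0.5 * xi ** 2 + random.gauss(0, 1) for xi in x]

f_linear, p_linear = linearity_f_test(x, y_linear)
f_curved, p_curved = linearity_f_test(x, y_curved)
print(f"truly linear data: F = {f_linear:.2f}, approx. p = {p_linear:.3g}")
print(f"curved data:       F = {f_curved:.2f}, approx. p = {p_curved:.3g}")
```

With thousands of points, a test like this can flag curvature that is far too small to matter, so it is worth looking at the size of the estimated non-linear term (and at a plot) alongside the p-value.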

rep_ho