The famous Anscombe data set is often used to illustrate various problems in regression such as nonlinear relationships, outliers and influential points. But with only N = 11 points in each of its four data sets, there is a limited amount one can do.
Also, simple plots of the data make it pretty clear that the various simple linear regressions are problematic.
I thought of generating a larger data set (say, N = 1000) with similar patterns, by jittering each of the variables in the Anscombe data. Has anyone used this sort of approach as a didactic tool to illustrate violations of the OLS assumptions and ways to find them?
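To make the idea concrete, here is a minimal sketch of one way the jittering could be done: resample the 11 original (x, y) pairs with replacement and add a little Gaussian noise to each coordinate. The function names `jitter_dataset` and `ols_fit`, the noise standard deviations, and the seed are my own illustrative assumptions; only the Anscombe values themselves are given.

```python
import random

# Anscombe's quartet: 11 (x, y) pairs per data set.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def jitter_dataset(x, y, n=1000, sd_x=0.3, sd_y=0.3, seed=0):
    """Resample (x, y) pairs with replacement and add Gaussian jitter.

    n, sd_x, sd_y and seed are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    out = []
    for _ in range(n):
        xi, yi = rng.choice(pairs)  # pick one of the 11 original points
        out.append((xi + rng.gauss(0, sd_x), yi + rng.gauss(0, sd_y)))
    return out


def ols_fit(points):
    """Return (intercept, slope) of the simple least-squares line."""
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Build one enlarged data set per original and compare the fitted lines.
big = {name: jitter_dataset(x, y) for name, (x, y) in ANSCOMBE.items()}
for name, pts in big.items():
    b0, b1 = ols_fit(pts)
    print(f"{name}: n={len(pts)}, intercept={b0:.2f}, slope={b1:.2f}")
```

The noise scale is the main knob here: small values keep the original patterns recognizable, while larger values blur them, and jitter in x will tend to pull the fitted slope somewhat toward zero.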