The ancient Greeks famously sought to construct geometrical relationships using only a ruler and a compass. Given a set of points in a two-dimensional plane, is it possible to find the OLS line using only such instruments?

This question has absolutely no practical application that I can think of.

Firebug
Pablo Derbez
  • I was wondering if a physical device based on strings could be a practical implementation, but Hooke's law states that the force scales linearly with the distance rather than as its square... – Xi'an Feb 01 '21 at 05:56
  • @Xi'an Is that a problem, though? Since the *force* scales linearly, the *potential energy* scales with the square of distance. A physical contraption will move towards a configuration that minimises its potential energy - which is to say, the sum of squares of spring displacements, weighted by spring stiffness - so it should be quite possible to build a spring-based line-of-best-fit machine. – Geoffrey Brent Feb 01 '21 at 09:58
  • @GeoffreyBrent: thank you for correcting my abysmal mechanics...! But is it really feasible to build such a contraption with, e.g., rubber bands? For instance, how long should each band be? – Xi'an Feb 01 '21 at 11:02
  • @Xi'an It is not; you would end up implementing PCA, not linear regression (unless the springs are limited to "vertical" movement only). – Firebug Feb 01 '21 at 15:18
  • I was indeed thinking of imposing vertical forces! – Xi'an Feb 01 '21 at 15:44
  • Yes, you don't even need the compass! Although you'd need the visualization power of a human brain: take the ruler and draw a line through your points that looks like it goes through your cloud minimizing the distances to the outliers. Engineers and physicists have fitted straight lines like that for centuries! ;-) And it works pretty reliably. If the distribution isn't too crazy. – user2705196 Feb 01 '21 at 20:14
  • I think the given answer is awesome :-) I would love to see one that is more dependent on the relationship rather than performing the calculation of the formula. Maybe someday I'll play around with it. – Mike M Feb 01 '21 at 21:05
  • @Firebug Yeah, I was taking that restriction as read, but should've mentioned it explicitly. – Geoffrey Brent Feb 02 '21 at 01:20
  • @Xi'an Rubber bands don't quite follow Hooke's law, because when compressed they don't exert much restoring force. If you wanted to use those, you'd need something like opposing pairs of bands so that they're always in tension, as well as something to impose the vertical-only movement restriction. But it should be doable. – Geoffrey Brent Feb 02 '21 at 01:25
  • @Xi'an If you make such a physical device, please let us know! I love the idea. – Pablo Derbez Feb 02 '21 at 04:12
  • Simple answer: **of course.** Everyone learns the technique in grade school; use Euclid's construction to drop a perpendicular from a point to a line. You just have to work in more than two dimensions! A problem, you think? Not at all: in a simple regression the $x$ coordinates form one vector, the $y$ coordinates another, and (still according to Euclid), *they are included in a plane.* You might object that this plane "sits" in higher dimensions. So it does: but then again, compass constructions don't concern numbers: they are purely about *geometry.* – whuber Feb 02 '21 at 10:15

1 Answer

Loosely speaking, it's apparently possible to construct, with only a compass and ruler, any quantity which can be expressed "using only the integers 0 and 1 and the operations for addition, subtraction, multiplication, division, and square roots" -- the Wikipedia article on constructible numbers has more details. The slope of the OLS line has such a closed form in terms of the points' coordinates (it needs only sums, products, and a single division), so we can deduce that it's possible to construct the line.
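
For concreteness, here are the standard closed forms being referred to. They are not written out in the original answer; $n$ denotes the number of points, and every sum runs over $i = 1, \dots, n$. Without an intercept, the slope is

$$\hat m = \frac{\sum_i x_i y_i}{\sum_i x_i^2},$$

and with an intercept,

$$\hat\beta_1 = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}, \qquad \hat\beta_0 = \frac{1}{n}\sum_i y_i - \hat\beta_1 \cdot \frac{1}{n}\sum_i x_i.$$

Both use only addition, subtraction, multiplication, and division of the coordinates, so no square roots are needed at all.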

As someone who isn't an expert in compass and ruler constructions, I found this a bit unbelievable, so I gave it a try myself. The green line is the OLS fit for the three blue points, fitted without an intercept for simplicity.

You can play around with it here for yourself and drag around the blue points a bit.

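[Figure: compass-and-ruler construction of the no-intercept OLS line (green) for three blue points, with intermediate constructions shown in red]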

Here's roughly how the construction went: it turns out you can multiply two numbers by constructing similar triangles. So for each of the three $(x, y)$ points, I constructed $x^2$ on the x-axis and $xy$ on the y-axis (shown in red). Then I simply added up all the $x^2$'s and $xy$'s to get the green point in the top right, $(\sum x^2, \sum xy)$. The line from the origin through that point has slope $\sum xy / \sum x^2$, which is exactly the no-intercept OLS slope, so it is the OLS line.
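
As a numerical sanity check on the idea above, here is a minimal Python sketch. It is an assumption about how the construction could look in coordinates, not a record of the exact steps used in the figure: the function names and the three data points are made up for illustration, and the "similar triangles" step is modelled as joining $(1, 0)$ to $(0, b)$ and drawing the parallel through $(a, 0)$, which crosses the y-axis at height $ab$.

```python
def product_by_similar_triangles(a, b):
    """Return a*b via a similar-triangles (intercept) construction.

    Hypothetical coordinate version: join (1, 0) to (0, b), then draw the
    parallel through (a, 0); it meets the y-axis at height a*b.
    """
    dx, dy = 0.0 - 1.0, b - 0.0  # direction of the line from (1, 0) to (0, b)
    t = -a / dx                  # step along the parallel from (a, 0) until x = 0
    return t * dy                # height at which the parallel crosses the y-axis


def ols_point_through_origin(points):
    """Return (sum of x^2, sum of x*y): the point that defines the fitted line."""
    sum_xx = sum(product_by_similar_triangles(x, x) for x, _ in points)
    sum_xy = sum(product_by_similar_triangles(x, y) for x, y in points)
    return sum_xx, sum_xy


# Made-up example data, not the points from the original figure.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]

sum_xx, sum_xy = ols_point_through_origin(points)
slope = sum_xy / sum_xx  # slope of the line from the origin through that point

# For a no-intercept least-squares fit, the residuals are orthogonal to x.
residual_dot_x = sum(x * (y - slope * x) for x, y in points)
print(slope, residual_dot_x)
```

The orthogonality check at the end is the normal equation for the model $y_i = m x_i + \epsilon_i$; if it holds up to rounding, the line through the origin and $(\sum x^2, \sum xy)$ is the least-squares line.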

shimao
  • To get the slope of the regression, I don't understand how you derived the point at the lower end as (0,0). A regression line is not required to pass through (0,0), although it does for many functions and your approach may be correct here too. Is there a generalised approach for calculating a second point that is independent of the first? – simon coleman Feb 02 '21 at 09:42
  • @simoncoleman Because it's simpler, I fit a model without an intercept -- $y_i = mx_i + \epsilon_i$. Knowing that this is possible, it's pretty easy to convince myself (and hopefully everyone else) that there's nothing which would make fitting an intercept impossible. I just didn't bother because I didn't think the added complexity would give any deep insight. – shimao Feb 02 '21 at 14:47