From taking some online linear regression courses (Stanford Machine Learning and Johns Hopkins Linear Regression), it seems that there are at least three ways of finding the coefficients of a linear regression (compared in the sketch after the list):
- Minimizing the sum of squared errors using gradient descent (or another optimization method)
- Solving for the coefficients using matrices
- Calculating the slope directly as $\mathrm{Cov}(x,y)/\mathrm{Var}(x)$ and the intercept as $\bar{y} - \text{slope} \cdot \bar{x}$
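For concreteness, here is a minimal sketch of all three approaches in Python with NumPy; the toy data, variable names, and learning rate are my own choices, not anything from either course:

```python
import numpy as np

# Toy data: y = 1 + 2x + noise
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=100)

# (3) Closed form for simple regression: slope = Cov(x, y) / Var(x)
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()

# (2) Matrix solution (normal equations): beta = (X'X)^{-1} X'y
X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
beta = np.linalg.solve(X.T @ X, X.T @ y)

# (1) Gradient descent on the mean squared error (learning rate is arbitrary)
b = np.zeros(2)
for _ in range(2000):
    grad = (2.0 / len(y)) * X.T @ (X @ b - y)
    b -= 0.1 * grad

print(intercept, slope)  # closed form
print(beta)              # matrix solution: [intercept, slope]
print(b)                 # gradient descent: [intercept, slope]
```

All three print essentially the same pair of coefficients on this data, which is part of what prompted my question.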
I can understand the reasons for choosing between gradient descent and solving with matrices (e.g., the cost of inverting a matrix, or the matrix being singular), but I don't get why you wouldn't always just use the simple closed-form solution. It's computationally much less expensive and easier to understand.
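To be explicit, by "solving using matrices" I mean the usual normal-equations expression (standard OLS notation, not tied to either course), where $X$ has a leading column of ones for the intercept:

$$\hat{\beta} = (X^\top X)^{-1} X^\top y$$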
What are the reasons for choosing optimization or matrix solving over the simple closed-form solution? Is it that the closed-form solution doesn't work for multiple regressors? Or does the closed form generalize somehow to the matrix solution?