I am taking Andrew Ng's Machine Learning class on Coursera, and in the slide below he distinguishes principal component analysis (PCA) from linear regression. He says that in linear regression we draw vertical lines from the data points to the line of best fit, whereas in PCA we draw lines perpendicular to the fitted line, which gives the shortest distance from each point to it.
I thought that linear regression always uses some Euclidean distance metric to measure the error between what the hypothesis function predicts and what the actual data point was. Why doesn't it use the shortest (perpendicular) distance, à la PCA?
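
To make the two error measures concrete, here is a minimal sketch of how I understand them, assuming synthetic 2-D data and NumPy. The data, the variable names, and the helper functions `vertical_sse` and `perpendicular_sse` are my own illustration and not from the course. It fits one line by ordinary least squares and another along the first principal component, then scores both lines under both measures.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.5, size=200)

# Linear regression: minimizes the sum of squared VERTICAL distances.
slope_ols, intercept_ols = np.polyfit(x, y, 1)

# PCA: the first principal component of the centered data gives the line
# that minimizes the sum of squared PERPENDICULAR distances.
X = np.column_stack([x, y])
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
dx, dy = vt[0]                      # direction of the first component
slope_pca = dy / dx
intercept_pca = mean[1] - slope_pca * mean[0]


def vertical_sse(slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return np.sum((y - (slope * x + intercept)) ** 2)


def perpendicular_sse(slope, intercept):
    """Sum of squared perpendicular distances from the points to the line."""
    # Distance from (x, y) to y = m*x + b is |y - m*x - b| / sqrt(1 + m^2).
    return np.sum((y - (slope * x + intercept)) ** 2) / (1 + slope ** 2)


print("OLS line:", slope_ols, intercept_ols)
print("PCA line:", slope_pca, intercept_pca)
print("vertical SSE      (OLS, PCA):",
      vertical_sse(slope_ols, intercept_ols),
      vertical_sse(slope_pca, intercept_pca))
print("perpendicular SSE (OLS, PCA):",
      perpendicular_sse(slope_ols, intercept_ols),
      perpendicular_sse(slope_pca, intercept_pca))
```

On noisy data like this the two lines generally differ: the regression line has the smaller vertical error and the PCA line has the smaller perpendicular error, so the choice of distance really does change the answer.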