On the intuitive side, I have been thinking about the following.
The Pearson correlation is a two-dimensional linear approximation: it summarizes the linear relationship between exactly two variables. Linear regression with several predictors is an n-dimensional linear approximation. The latter therefore offers an estimate of the association that accounts for other features that might inflate or deflate the estimate obtained with the Pearson correlation.
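To make this concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variable names `x`, `y`, `z`, the coefficients, and the data-generating process are hypothetical. The feature `z` influences both `x` and `y`, so the two-variable summary of `x` and `y` absorbs part of its effect, while the regression that includes `z` recovers a slope close to the one used to generate the data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: z is an extra feature that drives both x and y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = 1.0 * x + 2.0 * z + rng.normal(size=n)  # slope of y on x is 1.0 by construction

# Two-variable view: Pearson correlation and the matching simple-regression slope.
r = np.corrcoef(x, y)[0, 1]
slope_simple = np.polyfit(x, y, 1)[0]

# n-dimensional view: regress y on an intercept, x, and z together.
X = np.column_stack([np.ones(n), x, z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
slope_adjusted = beta[1]

print(f"Pearson r             : {r:.3f}")
print(f"slope ignoring z      : {slope_simple:.3f}")
print(f"slope accounting for z: {slope_adjusted:.3f}")
```

In this setup the slope that ignores `z` comes out near 2, while the slope that accounts for `z` comes out near 1. Note also that the simple-regression slope is just the Pearson correlation rescaled by `std(y) / std(x)`, which is why the two-variable view cannot separate out the contribution of `z`.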
Example 1 (Pearson correlation). Consider a map with no altitude information, and suppose you can move across it in a straight line (the presence of rivers or cliffs does not matter). You know the time you left point A and the time you reached point B, so you can compute your speed.
Example 2 (linear regression). Suppose instead that the map includes altitude and you have to account for other information about the terrain you are crossing (e.g., rivers or cliffs). If the times at which you left point A and reached point B are the same as in Example 1, the speed you compute will be different (very likely higher).
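The arithmetic behind the two map examples can be sketched as follows. The numbers are made up for illustration (a 3000 m horizontal distance, a 400 m climb, one hour of travel), and the terrain is reduced to a single uniform slope, which is a strong simplification of "rivers or cliffs".

```python
import math

# Hypothetical trip from A to B; all values are assumptions for illustration.
horizontal_m = 3000.0  # distance read off a map without altitude
climb_m = 400.0        # altitude gained, visible only on the richer map
elapsed_h = 1.0        # same departure and arrival times in both examples

# Example 1: the flat map only sees the horizontal distance.
flat_distance_m = horizontal_m

# Example 2: accounting for altitude lengthens the path actually travelled.
terrain_distance_m = math.hypot(horizontal_m, climb_m)

speed_flat = flat_distance_m / elapsed_h
speed_terrain = terrain_distance_m / elapsed_h

print(f"speed on flat map   : {speed_flat:.1f} m/h")
print(f"speed with altitude : {speed_terrain:.1f} m/h")
```

With the same elapsed time, the longer path gives the higher speed, which is the sense in which the second estimate is "very likely higher".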
Although the linear regression offers only an approximation of the average speed, it is still better than the initial approximation you got with the Pearson correlation.
Does anyone see anything wrong with this example? (Your answers will be very useful, as I normally use this example in class.)
In any case, I hope this example helps clarify the difference between the two techniques.