I would like to find the parameters of $y_i=\beta_0+\beta_1x_{1i}+\beta_2x_{2i}$ using the least-squares principle, and I want to convince myself that the matrix notation is computationally more convenient than my approach below:
I want to minimize $LS(\beta)=\sum_i\bigl(y_i-(\beta_0+\beta_1x_{1i}+\beta_2x_{2i})\bigr)^2$
1. Differentiate $LS(\beta)$ with respect to $\beta_0$ and set the derivative to $0$:
$\beta_0=\bar{y}-\beta_1\bar{x}_1-\beta_2\bar{x}_2$ (substitute this $\beta_0$ into the equations below)
2. Differentiate $LS(\beta)$ with respect to $\beta_1$ and set the derivative to $0$:
$\beta_1=\frac{\sum(y_i-\bar{y})(x_{1i}-\bar{x}_1)\;-\;\beta_2\sum(x_{2i}-\bar{x}_2)(x_{1i}-\bar{x}_1)}{\sum(x_{1i}-\bar{x}_1)^2}$
3. Differentiate $LS(\beta)$ with respect to $\beta_2$ and set the derivative to $0$:
$\beta_2=\frac{\sum(y_i-\bar{y})(x_{2i}-\bar{x}_2)\;-\;\beta_1\sum(x_{1i}-\bar{x}_1)(x_{2i}-\bar{x}_2)}{\sum(x_{2i}-\bar{x}_2)^2}$
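To sanity-check the derivation numerically, here is a quick sketch (assuming NumPy; the data are made up). Substituting step 1 into steps 2 and 3 leaves two linear equations in $\beta_1,\beta_2$ built from centered sums, which I solve as a 2-by-2 system:

```python
import numpy as np

# Made-up data with known coefficients (1.0, 2.0, -0.5)
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Centered variables and the sums appearing in steps 2 and 3
c1, c2, cy = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()
S11, S22, S12 = (c1 * c1).sum(), (c2 * c2).sum(), (c1 * c2).sum()
S1y, S2y = (c1 * cy).sum(), (c2 * cy).sum()

# Steps 2 and 3 are two coupled linear equations:
#   S11*beta1 + S12*beta2 = S1y
#   S12*beta1 + S22*beta2 = S2y
beta1, beta2 = np.linalg.solve(np.array([[S11, S12], [S12, S22]]),
                               np.array([S1y, S2y]))
beta0 = y.mean() - beta1 * x1.mean() - beta2 * x2.mean()  # step 1
print(beta0, beta1, beta2)
```

This reproduces the usual least-squares estimates, which already hints that the scalar route is solving the same linear system the matrix route does.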
I could solve for $\beta_1$ or $\beta_2$ by substituting one equation into the other, but the resulting expression is very messy.
Q) We also know that the matrix solution is $\hat\beta=(X'X)^{-1}X'Y$, which costs $O(k^2(n+k))$, where $X$ is an $n \times k$ matrix.
Why not solve for all the $\beta$'s in this scalar fashion? (As written above, it seems we can solve them independently of each other!) The scalar approach would have complexity $O(n^2)$, so why would I choose the matrix notation? Or where am I being unreasonable?
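For comparison, a sketch of the matrix route on the same kind of made-up data (in practice one would call a QR-based solver such as `np.linalg.lstsq` rather than forming the inverse of $X'X$ explicitly, for numerical stability):

```python
import numpy as np

# Made-up data with known coefficients (1.0, 2.0, -0.5)
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix X is n-by-k with k = 3 (intercept column included)
X = np.column_stack([np.ones(n), x1, x2])

# Normal-equations solution (X'X)^{-1} X'y:
# forming X'X costs O(k^2 n); solving the k-by-k system costs O(k^3)
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)
```

Note that this is exactly the same system of $k$ linear equations the scalar derivation produces, just written compactly, so the matrix form cannot be asymptotically worse than the scalar one.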