
I am reposting a question that has not received an answer.

I am trying to solve $∇_wMSE=0$ and $∇_mMSE=0$, where $w$ and $m$ are matrices of unknown parameters and $MSE=(X⋅m⋅w−Y)^2$ ($X$ and $Y$ are matrices of known values).

If I had only one set of parameters, i.e., $∇_mMSE=0→∇_m(X⋅m−Y)^2=0$, the solution would be $m=(X^T⋅X)^{-1}⋅X^T⋅Y$.
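As a quick numerical check of the single-parameter case, here is a minimal NumPy sketch. The matrix sizes and the random data are made up for illustration, and the square is read as a sum of squared residuals (squared Frobenius norm), which is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 4, 3                  # hypothetical sizes: X is n x p, Y is n x q
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))

# Normal equations: solve (X^T X) m = X^T Y instead of forming the inverse
m_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Gradient of ||X m - Y||_F^2 with respect to m, evaluated at m_hat
grad_m = 2 * X.T @ (X @ m_hat - Y)

# Cross-check against NumPy's least-squares solver
m_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(np.abs(grad_m).max())         # should be at rounding-error level
print(np.allclose(m_hat, m_lstsq))
```

Solving the linear system rather than inverting $X^T⋅X$ gives the same $m$ when $X^T⋅X$ is invertible and is usually the numerically safer choice.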

With the two sets of parameters: $∇_wMSE=0→∇_w(X⋅m⋅w−Y)^2=0$

$→w=(m^T⋅X^T⋅X⋅m)^{-1}⋅m^T⋅X^T⋅Y$

How do I solve this last equation? I could apply the chain rule, multiplying by $∇_m(X⋅m)$, but I am not sure how to then multiply the equation by this result.
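For reference, one way to write out the condition on $m$, as a sketch. It assumes the square denotes the squared Frobenius norm and that $X^T⋅X$ and $w⋅w^T$ are invertible; neither assumption is stated above.

$$\nabla_m \lVert X m w - Y \rVert_F^2 = 2\,X^T (X m w - Y)\, w^T = 0$$

$$X^T X\, m\, w w^T = X^T Y w^T$$

$$m = (X^T X)^{-1} X^T Y w^T (w w^T)^{-1}$$

This expression for $m$ still contains $w$, and the expression for $w$ contains $m$, so the two conditions are coupled and have to hold simultaneously; neither one is a closed-form solution by itself.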

Thomas Lumley
learner
  • It would be easier to follow your post if you used math formatting: https://math.meta.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference – Sycorax Jun 14 '20 at 15:22
  • You cannot solve this as a linear equation (https://stats.stackexchange.com/questions/470818/what-is-limiting-about-a-linear-model); instead you need to find the minimum with some gradient method, starting with some m and w and improving the solution in small steps. – Sextus Empiricus Jun 14 '20 at 15:44
  • You may also have the problem that there are multiple solutions: if some m and w minimize the MSE, then m/k and w*k (for any nonzero k) give the same MSE and also minimize it. – Sextus Empiricus Jun 14 '20 at 15:49
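The two points made in the comments can be sketched in NumPy as follows. This is an illustration, not a tuned solver. The sizes, the synthetic data, the learning rate `lr`, the step count, and the rescaling factor `c` (playing the role of k in the comment) are all assumptions chosen for the example, and the loss is taken to be the sum of squared residuals divided by the number of rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, q = 50, 4, 2, 3            # hypothetical sizes: m is p x k, w is k x q
X = rng.normal(size=(n, p))
# Synthetic Y generated from a random m and w plus a little noise
Y = (X @ rng.normal(size=(p, k)) @ rng.normal(size=(k, q))
     + 0.1 * rng.normal(size=(n, q)))

def mse(m, w):
    """Sum of squared residuals of X m w - Y, divided by the number of rows."""
    return np.sum((X @ m @ w - Y) ** 2) / n

# Start from small random values and improve in small steps
m = 0.1 * rng.normal(size=(p, k))
w = 0.1 * rng.normal(size=(k, q))
loss_start = mse(m, w)

lr = 0.005                          # assumed step size, not tuned
for _ in range(5000):
    R = X @ m @ w - Y               # residuals at the current m and w
    grad_m = 2 * X.T @ R @ w.T / n  # gradient with respect to m
    grad_w = 2 * (X @ m).T @ R / n  # gradient with respect to w
    m -= lr * grad_m
    w -= lr * grad_w
loss_end = mse(m, w)

# Rescaling m and w in opposite directions leaves the product m w unchanged
c = 3.0
loss_rescaled = mse(m / c, w * c)

print(loss_start, loss_end, loss_rescaled)
```

Because only the product of m and w enters the loss, `loss_end` and `loss_rescaled` agree up to rounding, which is the non-uniqueness described in the last comment.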
