
In deriving the parameter estimate in OLS, we differentiate the following expression (in matrix form) with respect to $\beta$:

$$y^T y - 2\beta^T X^T y + \beta^T X^T X \beta$$

The part of the differentiation I don't understand is why $$\beta^T X^T X \beta$$ differentiates to

$$2X^T X \beta. $$

The way I thought it would work was to apply the product rule:

$$\triangledown(\beta^T)X^TX\beta + \beta^TX^TX\triangledown (\beta) = X^TX \beta + \beta^TX^TX $$

gunes
Bill

1 Answer


In your approach, the dimensions of the summands don't match: $X^TX\beta$ is a column vector, while $\beta^TX^TX$ is a row vector, so they cannot be added. If you write the term as $\beta^TA\beta$ and expand it, you get the following expression ($A$ is symmetric): $$\beta^TA\beta=\sum_{i,j}A_{ij}\beta_i\beta_j=\sum_i A_{ii}\beta_i^2+2\sum_{i<j}A_{ij}\beta_i\beta_j$$ The derivative with respect to $\beta_i$ is $$\frac{\partial \beta^TA\beta}{\partial \beta_i}=2A_{ii}\beta_i+2\sum_{j\neq i}A_{ij}\beta_j=2\sum_j A_{ij}\beta_j$$ This expression is the dot product of the $i$-th row of $A$ with $\beta$, multiplied by $2$, i.e. $2A_i^T\beta$, where $A_i$ denotes the $i$-th row of $A$ written as a column vector.
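As a small illustration of the expansion above, here is the $2\times 2$ case worked out. The entries $a$, $b$, $c$ are placeholder names introduced only for this example, with $A=\begin{pmatrix}a&b\\b&c\end{pmatrix}$ symmetric:

$$\begin{aligned}
\beta^TA\beta &= a\beta_1^2+2b\beta_1\beta_2+c\beta_2^2\\
\frac{\partial \beta^TA\beta}{\partial \beta_1} &= 2a\beta_1+2b\beta_2 = 2\,(a\ \ b)\,\beta\\
\frac{\partial \beta^TA\beta}{\partial \beta_2} &= 2b\beta_1+2c\beta_2 = 2\,(b\ \ c)\,\beta
\end{aligned}$$

Each partial derivative is twice the corresponding row of $A$ times $\beta$, which is the pattern the general formula describes.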

In numerator layout notation, which is commonly used in matrix calculus cheatsheets, a scalar differentiated by a vector produces a row vector, so we concatenate the derivatives with respect to each $\beta_i$ horizontally:

$$\frac{\partial\beta^TA\beta}{\partial\beta}=[2A_1^T\beta\ \dots\ 2A_n^T\beta]=2\beta^T[A_1\ \dots\ A_n]$$

Here each $A_i$ is the $i$-th row of $A$ written as a column. Concatenating these columns horizontally in their respective order gives $A^T$, which is $A$ since $A$ is symmetric. Therefore, the matrix in the above expression is $A$, and the derivative is the row vector $2\beta^TA$. Transposing it to get the column-vector gradient used in the OLS derivation, and letting $A=X^TX$, we have:

$$\left(\frac{\partial\beta^TA\beta}{\partial\beta}\right)^T=2A^T\beta=2A\beta=2X^TX\beta$$

which matches your formula.
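If it helps to see the result numerically, here is a minimal sketch (not part of the derivation) that compares $2X^TX\beta$ against a central finite-difference gradient of $\beta^TX^TX\beta$. The function names, the random data, the shapes and the step size are all assumptions chosen for illustration.

```python
import numpy as np

def quadratic_form(beta, A):
    """Return the scalar beta^T A beta."""
    return beta @ A @ beta

def numerical_gradient(f, beta, eps=1e-6):
    """Approximate the gradient of f at beta with central differences."""
    grad = np.zeros_like(beta)
    for i in range(beta.size):
        step = np.zeros_like(beta)
        step[i] = eps
        grad[i] = (f(beta + step) - f(beta - step)) / (2 * eps)
    return grad

# Hypothetical data: 20 observations, 3 regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
beta = rng.normal(size=3)
A = X.T @ X  # symmetric by construction

analytic = 2 * A @ beta  # the claimed derivative, 2 X^T X beta
numeric = numerical_gradient(lambda b: quadratic_form(b, A), beta)

# Should agree up to floating-point error.
print(np.allclose(analytic, numeric, atol=1e-5))
```

The check only confirms the formula at one random point; it is a sanity test, not a proof.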

gunes
  • How can I look at it from simply a matrix point of view? – Bill Jan 23 '21 at 17:37
  • For a rigorous account of matrix differentiation, see https://stats.stackexchange.com/a/257616/919. But as a practical matter, dimensional considerations like those used in the present answer are a powerful way to avoid mistakes. – whuber Jan 23 '21 at 18:22
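On the question of a purely matrix point of view, one possible sketch uses differentials; this is a standard argument rather than anything specific to this thread. The product rule from the question does apply, provided each term stays paired with $d\beta$ so that both terms are scalars:

$$d(\beta^TA\beta)=(d\beta)^TA\beta+\beta^TA\,d\beta=\beta^TA^T\,d\beta+\beta^TA\,d\beta=\beta^T(A+A^T)\,d\beta$$

The middle step uses the fact that the scalar $(d\beta)^TA\beta$ equals its own transpose. For symmetric $A$ this gives $2\beta^TA\,d\beta$, so the derivative is the row vector $2\beta^TA$ and the column-vector gradient is $2A\beta$. With $A=X^TX$ that is $2X^TX\beta$.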