Let $\beta$ be the parameter vector of a ridge regression with penalty weight $\lambda$.
The derivative of the penalty term is then given as:
\begin{equation} \frac{\partial \lambda \beta^T \beta}{\partial \beta}=2\lambda\beta. \end{equation}
Why is this?
I thought that $$\frac{d}{dx} x^T x = 2x^T,$$
which would imply that:
\begin{equation} \frac{\partial \lambda \beta^T \beta}{\partial \beta}=2\lambda\beta^{T}. \end{equation}
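
One way to probe the two candidate results is a numerical check. The sketch below is only illustrative: the helper names `penalty` and `numerical_gradient`, and the values of `lam` and `beta`, are assumptions made up for this example. It estimates each partial derivative $\partial(\lambda\beta^T\beta)/\partial\beta_k$ by central finite differences and compares it with $2\lambda\beta_k$.

```python
def penalty(beta, lam):
    """Ridge penalty lam * beta^T beta, with beta given as a list of floats."""
    return lam * sum(b * b for b in beta)


def numerical_gradient(f, beta, h=1e-6):
    """Central finite-difference estimate of each partial derivative of f."""
    grad = []
    for k in range(len(beta)):
        up = list(beta)
        up[k] += h
        down = list(beta)
        down[k] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Hypothetical values, chosen only for illustration.
lam = 0.5
beta = [1.0, -2.0, 3.0]

numeric = numerical_gradient(lambda b: penalty(b, lam), beta)
analytic = [2 * lam * b for b in beta]  # entries of 2 * lambda * beta

print(numeric)
print(analytic)
```

The check confirms the entries $2\lambda\beta_k$, but it cannot say whether they should be arranged as a column ($2\lambda\beta$) or as a row ($2\lambda\beta^{T}$). This suggests the two expressions differ only in the layout convention used for the derivative, not in their values.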