
I was able to use the following equation to find derivatives of matrix functions:

$f(X+h) = f(X) + Ah + o(|h|) \quad \cdots (1)$

where $h$ is a small displacement and $A$ is the Jacobian matrix. I found a couple more equations for derivatives of matrix functions:

$D_Yf(X) = \lim_{t \to 0} \frac{f(X+tY) - f(X)}{t} \quad \cdots (2)$

$D_Yf(X) = \lim_{t \to 0} \frac{f(X+tY) - f(X)}{t} = tr(Y^TU) \quad \cdots (3)$

It seems that equations (1) and (2) are equivalent, with (2) being more precise as a definition of the derivative (equation (2) also shows why terms of order $|h|^2$ are ignored). Furthermore, both (1) and (2) apply to functions $\mathbb{R}^n \to \mathbb{R}^m$. Equation (3) seems to be a special case of (2). Equation (3) was used as follows:

$f(X) = tr(AX)$

$$\begin{aligned}
D_Yf(X) &= \lim_{t \to 0} \frac{f(X+tY)-f(X)}{t} \\
&= \lim_{t \to 0} \frac{tr(A(X+tY)) - tr(AX)}{t} \\
&= \lim_{t \to 0} \frac{tr(AX+tAY) - tr(AX)}{t} \\
&= \lim_{t \to 0} \frac{tr(tAY)}{t} \\
&= \lim_{t \to 0} tr(AY) \\
&= tr(AY) \\
&= tr([AY]^T) \\
&= tr(Y^TA^T)
\end{aligned}$$

Matching this with (3) gives $U=A^T$, so the derivative of $f$ with respect to $X$ is taken to be $A^T$.
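The limit computed above can be checked numerically. The following is a minimal sketch, assuming NumPy and arbitrary random matrices; the shapes, the seed, the step size `t`, and the helper name `directional_derivative` are illustrative choices, not taken from the source.

```python
import numpy as np

def f(X, A):
    # f(X) = tr(AX)
    return np.trace(A @ X)

def directional_derivative(func, X, Y, t=1e-4):
    # Finite-difference version of equation (2): (f(X + tY) - f(X)) / t
    return (func(X + t * Y) - func(X)) / t

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # A is 3x4, so AX is square for a 4x3 X
X = rng.standard_normal((4, 3))
Y = rng.standard_normal((4, 3))   # direction, same shape as X

numeric = directional_derivative(lambda M: f(M, A), X, Y)
closed_form = np.trace(Y.T @ A.T)  # tr(Y^T U) with U = A^T

print(numeric, closed_form)
```

Because $f$ is linear in $X$, the finite difference should agree with $tr(Y^TA^T)$ up to floating-point rounding for any nonzero `t`, not only in the limit.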

A couple of questions regarding these three equations:

  1. Is $h$ in equation (1) the same as $tY$ in equation (2)? (Something seems to be missing in equation (1) to me...)

  2. I don't quite understand what $tr(Y^TU)$ means in equation (3). $Y$ seems to be a direction matrix and $U$ seems to be the Jacobian, but what exactly does that expression mean? And how does writing the result in the form $tr(Y^TU)$ give us the derivative?
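One standard identity that may help with reading $tr(Y^TU)$: it equals the sum of entrywise products $\sum_{i,j} Y_{ij}U_{ij}$, which is the ordinary dot product of the two matrices flattened into vectors. The sketch below, assuming NumPy and arbitrary random matrices (the shapes and seed are illustrative, not from the source), compares the three forms.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.standard_normal((4, 3))  # stands in for the direction matrix
U = rng.standard_normal((4, 3))  # stands in for the matrix U in equation (3)

trace_form = np.trace(Y.T @ U)         # tr(Y^T U)
elementwise_sum = np.sum(Y * U)        # sum over i, j of Y_ij * U_ij
flattened_dot = Y.ravel() @ U.ravel()  # dot product of the flattened matrices

print(trace_form, elementwise_sum, flattened_dot)
```

Under this reading, $tr(Y^TU)$ would be the matrix analogue of the vector formula $D_yf(x) = y^T\nabla f(x)$, with $U$ in the role of the gradient; whether that is the interpretation the linked presentation intends is an assumption here.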

EDIT: The source where I found equation (3) uses a column vector for the gradient (http://www.tc.umn.edu/~nydic001/docs/unpubs/Schonemann_Trace_Derivatives_Presentation.pdf).

MoneyBall
  • You should read the matrix cookbook. It covers all this stuff. http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf – gammer Jan 24 '17 at 05:46
  • @gammer The Matrix Cookbook seems, to me at least, like a cheat sheet that lists all the results without proofs. I want to know how the cookbook derived them. In fact, I was working on algebraically proving equations 33-45 in the cookbook and got stuck on this notion of differentiating a trace. – MoneyBall Jan 24 '17 at 05:52
  • Cool, well check it out if you want. Good luck. – gammer Jan 24 '17 at 06:20
  • Because I addressed (2) in detail, starting from the definitions, in your previous question at http://stats.stackexchange.com/questions/257579/what-justifies-this-calculation-of-the-derivative-of-a-matrix-function/257616#257616, and because $tY$ in $(2)$ obviously plays exactly the same role as $h$ in $(1)$, I don't understand what else you need to know. – whuber Jan 24 '17 at 16:17
  • @whuber okay so you're saying $tY$ is the same as $h$ so that confirms my first question. what exactly is $tr(Y^TU)$? Is U a jacobian to directional matrix Y? Then what about the trace? – MoneyBall Jan 24 '17 at 23:12
  • I have little idea what you mean, because $U$ comes in out of the blue: it doesn't appear anywhere in your post until you start asking about it. In my other answer I explained how expressions like this can be interpreted as linear forms, so I would suppose that might be the intention here. – whuber Jan 24 '17 at 23:15
  • @whuber the RHS of equation (3) is $tr(Y^TU)$. So by equating the derivative to the $tr(Y^TU)$, you're saying we want it as linear form. A linear form of the directional derivative times the jacobian? What exactly does $tr(Y^TU)$ represent? Because $U$ turns out to be the derivative of functions in the form of $tr(AX)$ – MoneyBall Jan 24 '17 at 23:21
  • The RHS of equation (3) makes little sense because there is no $U$ on the LHS. I suppose it's intended to define $U$ implicitly, but it's not very clear. Regardless, Wikipedia has a page about [linear forms](https://en.wikipedia.org/wiki/Linear_form) you can consult for more information. Incidentally, the derivative of $A\to\operatorname{tr}(AX)$ is ... *itself,* because this is a linear function. It is *not* the same thing as $X$--but you may think of $X$ as *representing* this derivative. – whuber Jan 24 '17 at 23:24
