By the chain rule,
$\frac{\partial x^{T}A^{T}y}{\partial \Sigma_{i,j}}=
\mbox{tr} \left( \left( \frac{\partial x^{T}A^{T}y}{\partial A^{T}} \right)^{T} \frac{\partial A^{T}}{\partial \Sigma_{i,j} } \right) $.
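Written out elementwise, this is just the ordinary multivariate chain rule,
$\frac{\partial x^{T}A^{T}y}{\partial \Sigma_{i,j}}=\sum_{k,l} \frac{\partial x^{T}A^{T}y}{\partial (A^{T})_{k,l}} \, \frac{\partial (A^{T})_{k,l}}{\partial \Sigma_{i,j}}$,
since $\mbox{tr}(B^{T}C)=\sum_{k,l}B_{k,l}C_{k,l}$.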
This chain rule formulation is described in many references on matrix calculus, such as The Matrix Cookbook by Petersen and Pedersen.
The first partial derivative is easy:
$ \frac{\partial x^{T}A^{T}y}{\partial A^{T}}=xy^{T}$.
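To see this, write $x^{T}A^{T}y=\sum_{k,l}x_{k}(A^{T})_{k,l}\,y_{l}$, so the derivative with respect to $(A^{T})_{k,l}$ is $x_{k}y_{l}$, which is the $(k,l)$ entry of $xy^{T}$.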
You can find a useful formula for the derivative of the Cholesky factor with respect to elements of $\Sigma$ on page 211 of Bayesian Filtering and Smoothing by Simo Särkkä.
(Note that the book uses $P=AA^{T}$ rather than $\Sigma=A^{T}A$, so the notation doesn't match directly; I've transposed everything from the book to match the notation used in your statement of the problem.) After the change of notation, this formula gives:
$\frac{\partial A^{T}}{\partial \Sigma_{i,j}}=A^{T} \Phi \left(A^{-T} E_{i,j} A^{-1} \right) $
where $\Phi_{k,l}(M)=M_{k,l}$ if $k>l$, $\Phi_{k,l}(M)=M_{k,l}/2$ if $k=l$, and $\Phi_{k,l}(M)=0$ if $k<l$; that is, $\Phi(M)$ is the lower triangle of $M$ with the diagonal halved. $E_{i,j}$ is the zero matrix with ones in the $(i,j)$ and $(j,i)$ positions (a single 1 when $i=j$).
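In case it's useful, here is a short NumPy sketch of $\Phi$ and of this derivative (the names `Phi` and `dAT_dSigma` are just my own labels, and I use explicit inverses for clarity rather than triangular solves):

```python
import numpy as np

def Phi(M):
    """Lower triangle of M, with the diagonal entries halved and zeros above."""
    out = np.tril(M).astype(float)
    out[np.diag_indices_from(out)] /= 2.0
    return out

def dAT_dSigma(A, i, j):
    """dA^T/dSigma_{i,j} for Sigma = A^T A, with A upper triangular
    (so A^T is the lower-triangular Cholesky factor of Sigma)."""
    n = A.shape[0]
    E = np.zeros((n, n))
    E[i, j] = 1.0
    E[j, i] = 1.0                         # single 1 on the diagonal when i == j
    AinvT = np.linalg.inv(A.T)            # A^{-T}
    return A.T @ Phi(AinvT @ E @ AinvT.T) # A^T * Phi(A^{-T} E_{i,j} A^{-1})
```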
I can't see any particular way to simplify this further.
I have tested this in MATLAB by comparing the formula against a finite-difference approximation, and the results match up.
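For what it's worth, here is the same kind of finite-difference check sketched in NumPy rather than MATLAB, reusing `Phi` and `dAT_dSigma` from the sketch above (`grad_entry` and the random SPD test matrix are my own choices). It assembles the trace formula and compares each entry against a central difference:

```python
def grad_entry(Sigma, x, y, i, j):
    """Analytic d(x^T A^T y)/dSigma_{i,j} via the trace chain rule."""
    AT = np.linalg.cholesky(Sigma)        # lower triangular, Sigma = AT AT^T, i.e. AT = A^T
    dAT = dAT_dSigma(AT.T, i, j)
    return np.trace((x @ y.T).T @ dAT)    # tr((xy^T)^T dA^T/dSigma_{i,j})

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)           # random SPD test matrix
x = rng.standard_normal((n, 1))
y = rng.standard_normal((n, 1))
f = lambda S: (x.T @ np.linalg.cholesky(S) @ y).item()   # x^T A^T y

h = 1e-6
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = 1.0
        E[j, i] = 1.0
        fd = (f(Sigma + h * E) - f(Sigma - h * E)) / (2 * h)
        assert np.isclose(grad_entry(Sigma, x, y, i, j), fd, atol=1e-5)
```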