Comparison between analytic gradient and numerical gradient for multivariate normal distribution wrt mean and covariance

Question

The analytic gradients of log multivariate normal distribution wrt mean and covariance matrix can be found at StackExchange post and The gradient of the log-likelihood of normal distributions.

I tried to use numerical differentiation (e.g., forward difference) to obtain the gradients by the following MATLAB code:

% analytic solution
mu = [1 1];
Cov = eye(2);
Cov(1,2) = 0.0; Cov(2,1) = Cov(1,2);
x = [1 2];
CovInv = inv(Cov);
x_m = x-mu;
mu_a = x_m*CovInv; % gradient of mean
cov_a = -0.5*(CovInv - CovInv*x_m'*x_m*CovInv); % gradient of covariance

% numerical solution (e.g., forward difference)
h = sqrt(eps);
prob = @(x,mu,Cov) exp(-0.5*(x-mu)/Cov*(x-mu)')/sqrt((2*pi)^2*det(Cov));
mu_n = nan(1,2); % for mean
p = log(prob(x,mu,Cov));
mu2 = mu;  mu2(1) = mu2(1) + h;  p2 = log(prob(x,mu2,Cov));
mu_n(1) = (p2 - p)/h;
mu2 = mu;  mu2(2) = mu2(2) + h;  p2 = log(prob(x,mu2,Cov));
mu_n(2) = (p2 - p)/h;
cov_n = nan(2,2);
Cov2 = Cov; Cov2(1,1) = Cov2(1,1) + h; p2 = log(prob(x,mu,Cov2));
cov_n(1,1) = (p2 - p)/h;
Cov2 = Cov; Cov2(1,2) = Cov2(1,2) + h; Cov2(2,1) = Cov2(2,1) + h; p2 = log(prob(x,mu,Cov2));
cov_n(1,2) = (p2 - p)/h; cov_n(2,1) = cov_n(1,2);
Cov2 = Cov; Cov2(2,2) = Cov2(2,2) + h; p2 = log(prob(x,mu,Cov2));
cov_n(2,2) = (p2 - p)/h;

mu_n is very close to mu_a, as well as for cov_n and cov_a.

However, changing the covariance from 0, Cov(1,2)=0.5; Cov(2,1)=Cov(1,2); for example, the cov_n(1,2) is always nearly twice larger than cov_a(1,2).

>> cov_n(1,2)/cov_a(1,2)
ans = 2.0000

It also happens when means are unequal, mu = [-1 0];.

What's wrong in this case?

If I still want to use numerical difference, what can I do?

Continue above question: About the Hessian matrix, how can I make sure the the numerical differentiation gives correct answer?

jjet's answer at https://stats.stackexchange.com/a/339552/142756 seems to be correct. — Keith Lau, Apr 16 '18 at 01:58
For Hessian: 'Ddf' function of 'MissMech' package in R is said to calculate the Hessian of the observed data Multivariate Normal Log-Likelihood with Incomplete Data. However, I double checked it with symbolic differentiation. 'Ddf' seems wrong for the hessian matrix of the covariance matrix. — Keith Lau, Apr 17 '18 at 07:55

Comparison between analytic gradient and numerical gradient for multivariate normal distribution wrt mean and covariance

0 Answers0