2

The analytic gradients of log multivariate normal distribution wrt mean and covariance matrix can be found at StackExchange post and The gradient of the log-likelihood of normal distributions.

I tried to use numerical differentiation (e.g., forward difference) to obtain the gradients by the following MATLAB code:

% analytic solution
mu = [1 1];
Cov = eye(2);
Cov(1,2) = 0.0; Cov(2,1) = Cov(1,2);
x = [1 2];
CovInv = inv(Cov);
x_m = x-mu;
mu_a = x_m*CovInv; % gradient of mean
cov_a = -0.5*(CovInv - CovInv*x_m'*x_m*CovInv); % gradient of covariance

% numerical solution (e.g., forward difference)
h = sqrt(eps);
prob = @(x,mu,Cov) exp(-0.5*(x-mu)/Cov*(x-mu)')/sqrt((2*pi)^2*det(Cov));
mu_n = nan(1,2); % for mean
p = log(prob(x,mu,Cov));
mu2 = mu;  mu2(1) = mu2(1) + h;  p2 = log(prob(x,mu2,Cov));
mu_n(1) = (p2 - p)/h;
mu2 = mu;  mu2(2) = mu2(2) + h;  p2 = log(prob(x,mu2,Cov));
mu_n(2) = (p2 - p)/h;
cov_n = nan(2,2);
Cov2 = Cov; Cov2(1,1) = Cov2(1,1) + h; p2 = log(prob(x,mu,Cov2));
cov_n(1,1) = (p2 - p)/h;
Cov2 = Cov; Cov2(1,2) = Cov2(1,2) + h; Cov2(2,1) = Cov2(2,1) + h; p2 = log(prob(x,mu,Cov2));
cov_n(1,2) = (p2 - p)/h; cov_n(2,1) = cov_n(1,2);
Cov2 = Cov; Cov2(2,2) = Cov2(2,2) + h; p2 = log(prob(x,mu,Cov2));
cov_n(2,2) = (p2 - p)/h;

mu_n is very close to mu_a, as well as for cov_n and cov_a.

However, changing the covariance from 0, Cov(1,2)=0.5; Cov(2,1)=Cov(1,2); for example, the cov_n(1,2) is always nearly twice larger than cov_a(1,2).

>> cov_n(1,2)/cov_a(1,2)
ans = 2.0000

It also happens when means are unequal, mu = [-1 0];.

What's wrong in this case?

If I still want to use numerical difference, what can I do?

Continue above question: About the Hessian matrix, how can I make sure the the numerical differentiation gives correct answer?

Keith Lau
  • 63
  • 6
  • jjet's answer at https://stats.stackexchange.com/a/339552/142756 seems to be correct. – Keith Lau Apr 16 '18 at 01:58
  • For Hessian: 'Ddf' function of 'MissMech' package in R is said to calculate the Hessian of the observed data Multivariate Normal Log-Likelihood with Incomplete Data. However, I double checked it with symbolic differentiation. 'Ddf' seems wrong for the hessian matrix of the covariance matrix. – Keith Lau Apr 17 '18 at 07:55

0 Answers0