I need to optimize the parameters of a Gaussian-shaped function to best fit my data points using maximum likelihood.
I first make an initial estimate for the parameters (mu, sigma and peak) and apply a gradient-based optimizer (fmincon in MATLAB) to refine them.
The cost function is the sum of the negative log probabilities of observing each data point, given the model parameters and the noise parameter:
sum(-log( normpdf( Y - model( X, params) ,0,error_sigma) ));
where
Y: the observations
normpdf: the normal probability density function
model: a function that takes the x values and the parameters to be optimized and returns the predicted values
params: the parameters of the model
error_sigma: the noise standard deviation of the data
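In MATLAB, the setup can be sketched roughly as follows (the Gaussian model form, parameter ordering, and bounds are my assumptions for illustration, not necessarily the exact code I use):

```matlab
% Negative log-likelihood of the residuals under zero-mean Gaussian noise.
% p = [mu, sigma, peak, error_sigma]: model parameters plus the noise
% parameter, optimized simultaneously (ordering is illustrative).
model = @(X, p) p(3) * exp(-(X - p(1)).^2 / (2 * p(2)^2));   % Gaussian shape
nll   = @(p, X, Y) sum(-log(normpdf(Y - model(X, p), 0, p(4))));

p0 = [0, 1, 1, 0.5];              % initial estimate
lb = [-Inf, eps, -Inf, eps];      % keep both sigmas strictly positive
pHat = fmincon(@(p) nll(p, X, Y), p0, [], [], [], [], lb, []);
```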
I optimize the model parameters and the noise parameter simultaneously so as to minimize the negative log-likelihood (i.e., maximize the likelihood).
I am puzzled by obtaining negative values of this cost. I would have expected that the negative log of a probability always evaluates to a positive value. However, the MATLAB function normpdf, especially for small sigma values, returns values bigger than 1.
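To illustrate: the density at the mean is 1/(sigma*sqrt(2*pi)), which exceeds 1 whenever sigma < 1/sqrt(2*pi) ≈ 0.399, and its negative log is then negative:

```matlab
% Density values are not bounded by 1 when sigma is small.
normpdf(0, 0, 0.1)         % = 1/(0.1*sqrt(2*pi)), about 3.989
-log(normpdf(0, 0, 0.1))   % about -1.384, a negative contribution to the cost
```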
I suspect this is because normpdf returns a probability density rather than a probability, so its values are not bounded by 1. I am therefore wondering whether I should use the normal cumulative distribution function (normcdf) instead of normpdf for the optimization, in order to get valid log-likelihood values. However, when I use normcdf, the parameters never converge as well as in the normpdf case.
I hope I am clear.