
In a program I am writing, I have a Gaussian Distribution function that returns the PDF given a specific vector. The issue is, this is obviously not the actual probability.

To further complicate matters, the Gaussian Distribution I am using involves a multivariate distribution, so it's difficult to arbitrarily change a few values by a minute amount and find the area to get the probability.

Edit: I am actually using a measure of log-likelihood (which is why I need the probability). So, to add to this: is it okay if I just use the density values, since my measures involve the differences in log-likelihood for an ML algorithm?

Eric Staner
  • I don't understand what your question is. – Taylor Jun 23 '16 at 18:06
  • Density is not probability -- but (for continuous random variables) density is what you use in likelihood. Please clarify your question. – Glen_b Jun 23 '16 at 18:29
  • @Glen_b I know that density is not probability. I just want to know a good way to approximate the probability given the density and the Gaussian distribution (since I will be using a continuous model applied for a discrete set) – Eric Staner Jun 23 '16 at 21:34
  • What probability do you mean? ... also "I will be using a continuous model applied for a discrete set" ... then describe your discrete variable and the continuous model in more detail in your post. – Glen_b Jun 23 '16 at 23:17

1 Answer


In a program I am writing, I have a Gaussian Distribution function that returns the PDF given a specific vector. The issue is, this is obviously not the actual probability.

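Yes, this is a probability density function, so it returns something like "probabilities per foot": probability per unit of the variable, not probability itself. See "Can a probability distribution value exceeding 1 be OK?" to find out more about PDFs. In fact, for a continuous variable the probability of any single exact value is zero; only regions (intervals, or volumes in the multivariate case) have nonzero probability.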
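To make the "per foot" point concrete, here is a minimal sketch using a univariate normal and only the Python standard library. The helper names (`normal_pdf`, `normal_cdf`) and the values of `sigma` and `width` are illustrative assumptions, not anything from the question. It shows a density value above 1, and that the probability of a small interval is roughly density times width (in the multivariate case, density times volume).

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a univariate normal at x (hypothetical helper)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a univariate normal, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

sigma = 0.1   # narrow distribution, chosen so the density exceeds 1
width = 0.01  # width of a small interval centred on x = 0

density_at_0 = normal_pdf(0.0, sigma=sigma)

# Exact probability of the interval, from the CDF.
exact_prob = normal_cdf(width / 2, sigma=sigma) - normal_cdf(-width / 2, sigma=sigma)

# Approximation: density at the centre times the interval width.
approx_prob = density_at_0 * width

print("density at 0:", density_at_0)
print("exact interval probability:", exact_prob)
print("density * width:", approx_prob)
```

The density is larger than 1 while the interval probability stays well below 1, and the two probability figures agree closely because the interval is small relative to `sigma`.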

I am actually using a measure of log-likelihood (which is why I need the probability). So, to add to this: is it okay if I just use the density values, since my measures involve the differences in log-likelihood for an ML algorithm?

Log-likelihood is defined in terms of log-probabilities (for discrete variables) or log-densities (for continuous variables):

$$ \ln L(\mu|X) = \sum^N_{i=1} \underbrace{\ln f(x_i, \mu)}_\text{log-probability, or log-density} $$
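The sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the covariance is taken to be diagonal and known (so that the standard library suffices), and the function names, data points, and variances are made up for the example.

```python
import math

def log_density_diag_gaussian(x, mu, var):
    """Log-density of a Gaussian with diagonal covariance at vector x.

    x, mu, var are equal-length sequences; var holds per-dimension variances.
    """
    total = 0.0
    for xi, mi, vi in zip(x, mu, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total

def log_likelihood(data, mu, var):
    """Sum of log-densities over all observations: ln L(mu | X)."""
    return sum(log_density_diag_gaussian(x, mu, var) for x in data)

# Illustrative 2-D observations and small variances (assumed values).
data = [(0.9, 2.1), (1.2, 1.8), (1.0, 2.0)]
var = (0.01, 0.01)

ll = log_likelihood(data, mu=(1.0, 2.0), var=var)
print("log-likelihood:", ll)
```

With variances this small the densities exceed 1, so the log-likelihood comes out positive. That is not an error; it is simply what summing log-densities rather than log-probabilities looks like.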

Moreover, as you correctly noticed, log-likelihood is used as a relative measure, so it does not matter whether the "probabilities" sum (or integrate) to one: we use it only to compare which parameter values are more "likely" given the data, not to calculate actual probabilities. See "Maximum Likelihood Estimation (MLE) in layman terms" to learn more.
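A small sketch of why differences in log-likelihood are unaffected by using densities directly. It assumes that a "probability" would be obtained by multiplying each density by a fixed, small bin width `delta` (a hypothetical choice, as are the data and parameter values). Under that assumption every term gains the same `log(delta)`, which cancels when two parameter values are compared.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a univariate normal at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

data = [0.8, 1.1, 1.3, 0.9]  # illustrative observations
sigma = 0.5                  # assumed known standard deviation
delta = 1e-3                 # hypothetical fixed bin width

def loglik_density(mu):
    """Log-likelihood built from log-densities."""
    return sum(normal_logpdf(x, mu, sigma) for x in data)

def loglik_binned(mu):
    """Log-likelihood built from approximate bin probabilities, density * delta."""
    return sum(normal_logpdf(x, mu, sigma) + math.log(delta) for x in data)

# Compare two candidate means under both constructions.
diff_density = loglik_density(1.0) - loglik_density(2.0)
diff_binned = loglik_binned(1.0) - loglik_binned(2.0)

print("difference using densities:", diff_density)
print("difference using bin probabilities:", diff_binned)
```

Both differences agree up to floating-point error, so ranking parameter values by log-density gives the same answer as ranking them by the approximate log-probabilities, as long as the bin width does not depend on the parameters.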

Tim