
Say I have an observed data set ($n_i$) and I want to obtain the best fit out of 10 data sets produced by a model that depends on a single parameter $a$ ($m_i(a),\;a=1..10$).

Suppose I use a Poisson likelihood:

$P_i(a)=\frac{m_i(a)^{n_i}}{e^{m_i(a)}n_i!}\; ; \; a=1..10$

where $m_i(a)$ represents the model value of bin $i$ (for a given value of the parameter $a$) and $n_i$ its observed value (which always remains the same). The likelihood of the whole data set, or cumulative likelihood, for each value of $a$ is then:

$L(a)=\prod\limits^{N}_{i=1} \frac{m_i(a)^{n_i}}{e^{m_i(a)}n_i!}\; ; \; a=1..10$

where $N$ is the total number of bins. Now, I want to pick the model data set that best fits my observed data set. Could I just choose the maximum value of $L(a)$, i.e., the maximum likelihood:

$Max\_Likelihood = max\,[ L(a)\; ; \; a=1..10]$

and say that the value of the parameter $a$ associated with that particular model data set is the best estimate of $a$, given my observed data set?
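For concreteness, this is roughly the selection I have in mind in code (a minimal sketch; the `observed` and `models` arrays below are placeholders, not my actual data):

```python
# Minimal sketch: pick the candidate value of a that maximizes the
# Poisson log-likelihood. All numbers are placeholders.
import numpy as np
from scipy.special import gammaln  # log(n_i!), constant in a

observed = np.array([3, 7, 2, 5, 4])                             # n_i
models = np.random.uniform(0.5, 8.0, size=(10, observed.size))   # m_i(a), a = 1..10

# log L(a) = sum_i [ n_i*log(m_i(a)) - m_i(a) - log(n_i!) ]
log_L = (observed * np.log(models) - models - gammaln(observed + 1)).sum(axis=1)

best_a = np.argmax(log_L) + 1   # +1 because a runs from 1 to 10
print(best_a)
```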

Or should I use a cumulative likelihood ratio, defined as:

$LR(a)= \prod\limits^{N}_{i=1} \frac{\frac{m_i(a)^{n_i}}{e^{m_i(a)}n_i!}}{\frac{n_i^{n_i}}{e^{n_i}n_i!}} = \prod\limits^{N}_{i=1} \left(\frac{m_i(a)}{n_i}\right)^{n_i}e^{n_i-m_i(a)}\; ; \; a=1..10$

and keep the value of $a$ that gives the maximum value of $LR(a)$, i.e., the maximum likelihood ratio:

$Max\_Likelihood\_Ratio = max\,[ LR(a)\; ; \; a=1..10]$

Since the observed data set ($n_i$) remains the same for all the modeled data sets ($m_i(a);\;a=1..10$), won't maximizing $LR(a)$ give me the same result (i.e., the same value of $a$) as maximizing $L(a)$?
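Writing out the logarithms (my own working, so please correct me if it is wrong), the two quantities seem to differ only by a term that does not depend on $a$:

$$\ln LR(a) = \sum_{i=1}^{N}\left[n_i\ln m_i(a) - m_i(a)\right] - \sum_{i=1}^{N}\left[n_i\ln n_i - n_i\right] = \ln L(a) + \underbrace{\sum_{i=1}^{N}\ln n_i! - \sum_{i=1}^{N}\left[n_i\ln n_i - n_i\right]}_{\text{independent of }a}$$

so the maximizing value of $a$ should be the same either way.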

Gabriel
  • I don't understand what you mean by "model data set". Typically, one wants to find the parameters that best fit a data set. – jrennie May 16 '12 at 13:51
  • "Model data set" means a data set produced by a model. In this case the model depends on a single parameter $a$. I **do** want to find the parameters that best fit my data set. In my case my _data set_ is my _observed data set_ and there's a **single** parameter $a$. Please tell me if this is still not clear. – Gabriel May 16 '12 at 13:56

2 Answers


To add a bit to jrennie's answer: using the likelihood ratio instead of the likelihood has two practical advantages:

  1. $-2$ times the log-likelihood-ratio approaches a $\chi^2$ distribution for large $N$, and can thus be used as a goodness-of-fit indicator (though for your example of 10 data points that's probably not relevant);

  2. The negative log-likelihood-ratio -- and all the individual terms in its sum -- are always non-negative (because the likelihood ratio itself is always between 0 and 1). This means that you can use fast, least-squares-type minimization algorithms (e.g., Levenberg-Marquardt) to find the minimum of the negative log-likelihood-ratio; a sketch of this follows the list. (Again, for your example problem with only 10 data points that's probably not relevant; but for problems with many data points and a complicated model, it can be useful.)
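For illustration, here is a rough sketch of point 2 (the continuous one-parameter `model(a, x)` and the simulated counts are purely hypothetical): each bin contributes one non-negative term of $-2\ln LR$, and the square roots of those terms can be handed to a Levenberg-Marquardt routine as residuals.

```python
# Rough illustration only: the model and the data below are made up.
import numpy as np
from scipy.optimize import least_squares

x = np.linspace(0.0, 10.0, 200)
observed = np.random.poisson(lam=5.0 + 0.5 * x)   # fake counts

def model(a, x):
    # hypothetical one-parameter model m_i(a); substitute your own
    return 5.0 + a * x

def deviance_residuals(params):
    m = np.clip(model(params[0], x), 1e-12, None)   # keep the model mean positive
    n = observed
    # one non-negative term of -2*ln LR per bin (zero-count bins contribute m)
    term = m - n + n * np.log(np.where(n > 0, n / m, 1.0))
    return np.sqrt(2.0 * term)

fit = least_squares(deviance_residuals, x0=[1.0], method='lm')  # Levenberg-Marquardt
print(fit.x)
```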

Peter Erwin

The parameter $a$ which maximizes the likelihood of the observed data is the maximum likelihood parameter. It is, in the maximum likelihood sense, the "best" parameter for the observed data. Note that this will not (necessarily) give you the best generalization performance.

As you note, the ratio should give you the same "best" parameter since $L(a) \propto LR(a)$.

Note that it is common to work with logarithms of the likelihood and the likelihood ratio (since taking the log does not change where the maximum occurs). It might be easier to convince yourself of the equivalence if you work with the log-likelihood and log-likelihood-ratio. It is also conventional to negate and minimize. You might be interested in this discussion of negative log-likelihood.
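For example, a quick numerical check (with made-up numbers) that negating and minimizing either quantity picks the same $a$:

```python
# Quick check with placeholder numbers: both criteria select the same a.
import numpy as np
from scipy.special import gammaln

observed = np.array([3, 7, 2, 5, 4])                             # n_i
models = np.random.uniform(0.5, 8.0, size=(10, observed.size))   # m_i(a), a = 1..10

neg_log_L  = (models - observed * np.log(models) + gammaln(observed + 1)).sum(axis=1)
neg_log_LR = (models - observed + observed * np.log(observed / models)).sum(axis=1)

# the two differ by a constant offset, so the minimizing index (best a) agrees
assert np.argmin(neg_log_L) == np.argmin(neg_log_LR)
```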

jrennie
  • Thank you for your answer @jrennie. I actually do use the log likelihood in my calculations. What I still don't fully understand is this: if both the _likelihood_ and the _likelihood_ **ratio** will give the same answer (best value of parameter $a$), why use a ratio at all? Maybe a better question is: in which cases is the _likelihood_ **ratio** the best option? – Gabriel May 16 '12 at 14:13
    @Gaba_p [Likelihood ratio](http://en.wikipedia.org/wiki/Likelihood-ratio_test) is typically used for comparing two different models. – jrennie May 16 '12 at 15:30