I have a set of data points $(X_i, y_i)$, where the $X_i$ are the independent variables, and I believe each $y_i$ can be modeled as being drawn from an exponential distribution with parameter $\lambda_i$.

If I use $X_i$ to predict $\lambda_i$, how can I evaluate the quality of my predicted distributions with respect to the observations $y_i$?

Edit: This is essentially the same question as "How to evaluate quality of probability estimator for Bernoulli experiments?", but in a continuous context rather than a binary one. It's not obvious to me what to use in this case instead of cross-entropy.

Stephan Kolassa
Thomas Johnson

2 Answers

The standard approach is to use the log-likelihood of the exponential distribution. This is exactly how the cross-entropy is derived: it is the negative log-likelihood of the Bernoulli distribution.
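To make the Bernoulli connection concrete, here is a short sketch of that derivation. The symbol $p_i$ (the predicted probability that a binary outcome $y_i \in \{0, 1\}$ equals 1) is introduced here for illustration and does not appear in the question. The Bernoulli pmf is $p_i^{y_i} (1 - p_i)^{1 - y_i}$, so its log-likelihood is

$$ LL(p_i; y_i) = y_i \log(p_i) + (1 - y_i) \log(1 - p_i) $$

and the usual cross-entropy loss is the negative of this quantity, summed (or averaged) over $i$.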

In the case of an exponential distribution, the pdf is:

$$ f(y; \lambda) = \lambda e^{-\lambda y} $$

So the log-likelihood is:

$$ LL(\lambda_i; y_i ) = \log(f(y_i; \lambda_i)) = \log(\lambda_i) - \lambda_i y_i$$

So, if $y_i$ are your true values and $\lambda_i$ are your predictions, an exponential model would maximize (or, equivalently, minimize the negative of):

$$ LL(\{\lambda_i\}; \{y_i\}) = \sum_i \left( \log(\lambda_i) - \lambda_i y_i \right) $$
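The summed log-likelihood above can be evaluated directly on held-out data. The following is a minimal sketch, not a definitive implementation: the function names `exponential_log_likelihood` and `mean_log_score` are made up for this example, and the data are simulated with an assumed true rate of 2.0 so that a correct and two incorrect sets of predictions can be compared.

```python
import math
import random


def exponential_log_likelihood(rates, ys):
    """Sum of log(lambda_i) - lambda_i * y_i over all observations."""
    if len(rates) != len(ys):
        raise ValueError("rates and ys must have the same length")
    total = 0.0
    for lam, y in zip(rates, ys):
        if lam <= 0 or y < 0:
            raise ValueError("need lambda_i > 0 and y_i >= 0")
        total += math.log(lam) - lam * y
    return total


def mean_log_score(rates, ys):
    """Average negative log-likelihood per observation; lower is better."""
    return -exponential_log_likelihood(rates, ys) / len(ys)


# Simulated check (assumption: all observations share a true rate of 2.0).
rng = random.Random(0)
true_rate = 2.0
ys = [rng.expovariate(true_rate) for _ in range(10_000)]

# Score three candidate sets of predictions against the same observations.
score_true = mean_log_score([true_rate] * len(ys), ys)
score_low = mean_log_score([1.0] * len(ys), ys)
score_high = mean_log_score([4.0] * len(ys), ys)
print(score_true, score_low, score_high)
```

In practice each observation would get its own predicted $\lambda_i$, and the score would be computed on data not used for fitting. Dividing by the number of observations only makes scores comparable across data sets of different sizes.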

Fitting models by maximizing the log-likelihood in this way leads to the theory of generalized linear models; the exponential model is a special case.
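As a rough illustration of fitting by maximum likelihood, here is a sketch that assumes a log link, $\lambda_i = \exp(b_0 + b_1 x_i)$, with a single scalar predictor. The link function, the plain gradient-ascent optimizer, the function names, and the simulated coefficients (0.5 and 1.0) are all assumptions made for this example; the answer does not prescribe them, and a GLM library would normally use a more robust fitting routine.

```python
import math
import random


def fit_exponential_log_link(xs, ys, step=0.5, n_iter=500):
    """Gradient ascent on the mean exponential log-likelihood.

    Assumes lambda_i = exp(b0 + b1 * x_i); returns (b0, b1).
    """
    b0, b1 = 0.0, 0.0
    n = len(ys)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            lam = math.exp(b0 + b1 * x)
            # Derivative of log(lam) - lam * y with respect to eta = log(lam).
            resid = 1.0 - lam * y
            g0 += resid
            g1 += resid * x
        b0 += step * g0 / n
        b1 += step * g1 / n
    return b0, b1


def mean_log_likelihood(b0, b1, xs, ys):
    """Mean of log(lambda_i) - lambda_i * y_i under the log-link model."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        total += eta - math.exp(eta) * y
    return total / len(ys)


# Simulated data (assumption: true coefficients b0 = 0.5, b1 = 1.0).
rng = random.Random(42)
xs = [rng.random() for _ in range(2000)]
ys = [rng.expovariate(math.exp(0.5 + 1.0 * x)) for x in xs]

b0_hat, b1_hat = fit_exponential_log_link(xs, ys)
print(b0_hat, b1_hat)
print(mean_log_likelihood(b0_hat, b1_hat, xs, ys))
```

The log link keeps every predicted rate positive, which is why it is used in this sketch. The fitted coefficients should land near the simulated ones, up to sampling noise.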

Matthew Drury

The standard way to assess predictive distributions is via scoring rules. The log-likelihood that Matthew Drury recommends is one example: it is the logarithmic scoring rule. There are others as well. Merkle & Steyvers (2013, Decision Analysis) discuss how different scoring rules relate to one another, and how to choose one.
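As one hedged illustration of an alternative, the continuous ranked probability score (CRPS) is a widely used proper scoring rule. It is chosen here purely as an example and is not singled out in the text above. For an exponential forecast with rate $\lambda$ and an observation $y \ge 0$, integrating the squared difference between the forecast CDF and the step function at $y$ gives the closed form $y - 2F(y)/\lambda + 1/(2\lambda)$, where $F(y) = 1 - e^{-\lambda y}$. The function name and the simulated data (true rate 2.0) below are assumptions for this sketch.

```python
import math
import random


def crps_exponential(lam, y):
    """CRPS of an Exponential(rate=lam) forecast at y >= 0; lower is better."""
    if lam <= 0 or y < 0:
        raise ValueError("need lam > 0 and y >= 0")
    cdf_at_y = 1.0 - math.exp(-lam * y)
    return y - 2.0 * cdf_at_y / lam + 1.0 / (2.0 * lam)


# Simulated observations (assumption: true rate 2.0 for every observation).
rng = random.Random(1)
ys = [rng.expovariate(2.0) for _ in range(10_000)]

# Average CRPS for three candidate forecast rates on the same observations.
mean_crps = {
    lam: sum(crps_exponential(lam, y) for y in ys) / len(ys)
    for lam in (1.0, 2.0, 4.0)
}
for lam, score in mean_crps.items():
    print(lam, score)
```

Unlike the logarithmic score, the CRPS is expressed in the same units as $y$, which can make it easier to interpret.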

More information can be found in the tag wiki, and we have a number of questions carrying the tag.

Stephan Kolassa