
As a learning exercise, I am trying to build a (very simple) recommender system that suggests which statistical model to use. It targets inference only, not prediction.

As part of verifying model assumptions, it would be nice to isolate a set of best-fitting distributions for the chosen variable. From reading various Q&As on this website [1, 2, 3] and looking at R packages, I understand this to be a very difficult problem.

From answers to other questions, I understand that no single distribution can be singled out, but it seems to me that a set of best-fitting distributions can be identified. Am I mistaken?
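To make "a set of best-fitting distributions" concrete, here is a minimal sketch of one common way to build such a set: fit each candidate by maximum likelihood, rank the candidates by AIC (which penalizes the number of parameters), and keep everything within a small AIC gap of the best. The three candidate families, the helper names, and the `max_delta=2.0` threshold are assumptions chosen for illustration; the families were picked only because their maximum-likelihood estimates have closed forms.

```python
import math
import random


def fit_normal(x):
    """Closed-form MLE for a normal; returns (max log-likelihood, n_params)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n  # the MLE divides by n, not n - 1
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    return loglik, 2


def fit_exponential(x):
    """Closed-form MLE for an exponential (requires positive data)."""
    n = len(x)
    total = sum(x)
    rate = n / total
    loglik = n * math.log(rate) - rate * total
    return loglik, 1


def fit_lognormal(x):
    """MLE for a lognormal: a normal fit on log(x) plus the Jacobian term."""
    logs = [math.log(v) for v in x]  # requires positive data
    loglik_on_logs, k = fit_normal(logs)
    return loglik_on_logs - sum(logs), k


def rank_by_aic(x, max_delta=2.0):
    """Rank candidates by AIC and shortlist those within max_delta of the best."""
    candidates = {
        "normal": fit_normal,
        "exponential": fit_exponential,
        "lognormal": fit_lognormal,
    }
    aic = {}
    for name, fit in candidates.items():
        loglik, k = fit(x)
        aic[name] = 2.0 * k - 2.0 * loglik
    best = min(aic.values())
    ranked = sorted(aic.items(), key=lambda item: item[1])
    shortlist = [name for name, value in ranked if value - best <= max_delta]
    return ranked, shortlist


# Simulated positive data, used only to exercise the functions.
random.seed(0)
sample = [random.lognormvariate(0.0, 0.5) for _ in range(500)]
ranked, shortlist = rank_by_aic(sample)
for name, value in ranked:
    print(f"{name:12s} AIC = {value:.1f}")
print("shortlist:", shortlist)
```

The shortlist is only relative to the candidates supplied: it says which of them fit better than the others, not that any of them fits well, so a goodness-of-fit check on the retained candidates would still be needed.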

For example, a (very naive) way would be to compare the kernel density estimate of the tested variable with the density of each targeted distribution, using some kind of pattern recognition algorithm, and then to sort the candidates by the resulting similarity coefficient and keep those that pass a cutoff...
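For concreteness, the density-comparison idea could be sketched as follows. This is only an illustration under several assumptions that are not part of the question: a Gaussian kernel with a normal-reference rule-of-thumb bandwidth, integrated squared error as the "coefficient", two candidate families fitted by simple moment matching, and an arbitrary, hypothetical `CUTOFF` value.

```python
import math
import random
import statistics


def gaussian_kde(data, bandwidth):
    """Return a function t -> Gaussian kernel density estimate at t."""
    scale = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return scale * sum(math.exp(-0.5 * ((t - v) / bandwidth) ** 2) for v in data)

    return density


def normal_pdf(mu, sigma):
    def pdf(t):
        z = (t - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    return pdf


def exponential_pdf(rate):
    def pdf(t):
        return rate * math.exp(-rate * t) if t >= 0 else 0.0

    return pdf


def integrated_squared_error(f, g, lo, hi, steps=400):
    """Midpoint-rule approximation of the integral of (f - g)^2 over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        total += (f(t) - g(t)) ** 2
    return h * total


# Simulated data, used only to exercise the functions.
random.seed(1)
sample = [random.gauss(10.0, 2.0) for _ in range(300)]

mean = statistics.fmean(sample)
sd = statistics.stdev(sample)
bandwidth = 1.06 * sd * len(sample) ** (-0.2)  # normal-reference rule of thumb
kde = gaussian_kde(sample, bandwidth)

# Candidate densities with parameters matched to the sample.
candidates = {
    "normal": normal_pdf(mean, sd),
    "exponential": exponential_pdf(1.0 / mean),
}

lo = min(sample) - 3.0 * bandwidth
hi = max(sample) + 3.0 * bandwidth
distances = {
    name: integrated_squared_error(kde, pdf, lo, hi)
    for name, pdf in candidates.items()
}

CUTOFF = 0.01  # hypothetical threshold, chosen arbitrarily for this sketch
shortlist = sorted(name for name, d in distances.items() if d <= CUTOFF)
for name, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{name:12s} ISE = {d:.5f}")
print("shortlist:", shortlist)
```

The result depends on the bandwidth and on the cutoff, neither of which has a principled value here, which is part of what makes this approach naive.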

And what about the fitdistrplus R package?

Can this be done?

Raoul
  • The best fit you can get to any data of $n$ observations is the empirical likelihood that places weight $1/n$ at each of the observed points. You need to penalize/control complexity somehow for this question to be well posed. – Andrew M Nov 07 '16 at 06:21

0 Answers