Suppose you are trying to estimate the pdf of a random variable $X$, for which you have a large number of i.i.d. samples $\{X_i\}_{i=1}^{n}$ (i.e. $n$ is very large, think thousands to millions).
One option is to estimate the mean and variance, and just assume the distribution is Gaussian.
At the other extreme, one can use a kernel density estimate to get something more accurate (especially with so much data).
The problem is that I need to evaluate the resulting pdf very fast. If I assume the pdf is Gaussian, then evaluating $f_X(x)$ is very fast, but the estimate might not be accurate. A kernel density estimate, on the other hand, will be far too slow to use, since each evaluation requires a pass over all $n$ samples.
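To make the two extremes concrete, here is a minimal sketch (assuming Python with NumPy/SciPy; the heavy-tailed sample data is just a placeholder) contrasting the cheap Gaussian fit with a kernel density estimate:

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)
samples = rng.standard_t(df=5, size=100_000)   # placeholder heavy-tailed data

# Gaussian assumption: two parameters, O(1) cost per query point.
mu, sigma = samples.mean(), samples.std(ddof=1)

def fast_pdf(x):
    return norm.pdf(x, loc=mu, scale=sigma)

# KDE: flexible, but each evaluation sums over all n samples -> O(n) per query.
kde = gaussian_kde(samples)

x = np.linspace(-5, 5, 1000)
gauss_vals = fast_pdf(x)   # essentially instantaneous
kde_vals = kde(x)          # noticeably slower as n grows
```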
So the question is: what are common ways to get pdf estimates that are more general than Gaussians, but in an incremental fashion? Ideally, I'd like a model with a number of parameters (say $k$) that can be used to trade off estimation accuracy against evaluation speed.
Possible directions I thought about are:
Estimate the moments of the distribution, and construct the pdf from these moments alone; $k$ here is the number of moments. But then, what is the model for the pdf based on these moments?
Gaussian mixtures with $k'$ components (here $k = 3k' - 1$, since for each component we keep a mean, a variance and a weight, and the weights sum to one). Is this a good idea? (A rough sketch of this option is given below.)
Any other ideas are welcome.
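For concreteness, here is the sketch of the mixture option mentioned above (assuming scikit-learn is available; the data and the choice $k' = 4$ are placeholders): fit a Gaussian mixture with $k'$ components via EM, then evaluate the density. Once fitted, the per-query cost grows with $k'$, not with $n$.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
samples = rng.standard_t(df=5, size=100_000).reshape(-1, 1)  # placeholder data

k_prime = 4                                    # number of mixture components
gmm = GaussianMixture(n_components=k_prime, random_state=0)
gmm.fit(samples)

def mixture_pdf(x):
    """Evaluate the fitted mixture density at the points x (shape (m,))."""
    return np.exp(gmm.score_samples(np.asarray(x).reshape(-1, 1)))

x = np.linspace(-5, 5, 1000)
pdf_vals = mixture_pdf(x)                      # cost scales with k', not n
```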
Thanks!
Related question: ML estimation.
Update / clarification:
Thanks for all the answers so far.
I really need the pdf (not the cdf, and not samples from the distribution). Specifically, I am using the scalar pdf estimates for Naive Bayes (NB) classification and regression: given the label, each feature has its own pdf, and the NB assumption is that the features are conditionally independent given the label. So in order to calculate the posterior (the probability of the label given the feature values), I need each of these pdfs evaluated at the observed feature values; a sketch of how they enter the posterior is below.
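A minimal sketch of that setup (my own illustration, not a definitive implementation; the helper names `fit_nb` and `nb_posterior` are hypothetical): here each class-conditional feature density is a plain Gaussian, but any of the faster-to-evaluate pdf models discussed above could be plugged in instead.

```python
import numpy as np
from scipy.stats import norm

def fit_nb(X, y):
    """X: (n, d) features, y: (n,) integer labels. Returns per-class priors
    and per-feature (mean, std) for Gaussian class-conditional densities."""
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}
    params = {c: (X[y == c].mean(axis=0), X[y == c].std(axis=0, ddof=1))
              for c in classes}
    return priors, params

def nb_posterior(x, priors, params):
    """Posterior over classes for one feature vector x, using
    log p(c | x) ∝ log p(c) + sum_j log f_j(x_j | c)."""
    classes = list(priors)
    log_post = np.array([
        np.log(priors[c])
        + norm.logpdf(x, loc=params[c][0], scale=params[c][1]).sum()
        for c in classes
    ])
    log_post -= log_post.max()     # for numerical stability
    post = np.exp(log_post)
    return dict(zip(classes, post / post.sum()))
```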