Suppose I have a system that at each time $t_i$ produces $N$ i.i.d. samples from an unknown distribution $f(x;t)$. I want to estimate this distribution in an online manner. If I had only the observations at a single time $t_i$, I could use kernel density estimation:
$$f(x;t_i)\approx \frac{1}{Nh}\sum_{j=1}^{N} K\left(\frac{x-x_j}{h}\right)$$

However, the system generates $N$ new samples at every time step. If the distribution does not depend on $t$, then after $T$ observation times I would have the approximation
$$f(x;t) = f(x)\approx \frac{1}{NTh}\sum_{i=1}^{T}\sum_{j=1}^{N} K\left(\frac{x-x_{j,t_i}}{h}\right)$$

The number of terms in this sum grows with time, so I would have to store more and more raw samples, and the cost of evaluating the estimate would keep increasing. I am therefore looking for a method that does not require keeping all of the previous raw data, for example something based on a moving average. The estimator above is also not suitable for time-varying distributions, since it weights old and new samples equally.
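To make the "moving average" idea concrete, here is a minimal sketch (not a standard named method, just an illustration) of what I have in mind: keep the density estimate on a fixed grid of points and blend in the KDE of each new batch with an exponential forgetting factor, so memory and per-step cost depend only on the grid size and the batch size $N$, not on the number of batches seen so far. The class name `ForgettingGridKDE` and the parameter `forgetting` are my own placeholders.

```python
import numpy as np

def gaussian_kernel(u):
    # standard Gaussian kernel K(u)
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

class ForgettingGridKDE:
    """Density estimate on a fixed grid, updated with exponential forgetting."""

    def __init__(self, grid, bandwidth, forgetting=0.05):
        self.grid = np.asarray(grid)      # fixed evaluation points for f(x; t)
        self.h = bandwidth                # kernel bandwidth h
        self.lam = forgetting             # weight given to the newest batch
        self.density = np.zeros_like(self.grid, dtype=float)
        self._initialized = False

    def update(self, batch):
        """Incorporate the N samples observed at the current time step."""
        batch = np.asarray(batch)
        # KDE of the current batch alone, evaluated on the grid:
        # (1 / (N h)) * sum_j K((x - x_j) / h)
        u = (self.grid[:, None] - batch[None, :]) / self.h
        batch_density = gaussian_kernel(u).mean(axis=1) / self.h
        if not self._initialized:
            self.density = batch_density
            self._initialized = True
        else:
            # exponential moving average: old batches decay geometrically,
            # so a slowly drifting f(x; t) can still be tracked
            self.density = (1 - self.lam) * self.density + self.lam * batch_density
        return self.density
```

For example, feeding it batches from a slowly drifting Gaussian:

```python
rng = np.random.default_rng(0)
est = ForgettingGridKDE(grid=np.linspace(-5, 5, 200), bandwidth=0.3, forgetting=0.1)
for t in range(100):
    est.update(rng.normal(loc=0.02 * t, size=50))
```

This discards the raw samples after each update, but the grid, bandwidth, and forgetting factor all have to be chosen in advance, which is part of what I am unsure about.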
Are there extensions of kernel density estimation, or other methods, that can estimate or learn the data distribution in an online manner without storing all of the past samples? Can such a method also track a time-varying distribution?