
Please note: I posted this question on MathOverflow first; someone there advised me that it might fit better here on stats.stackexchange. This is the link to the original post.

I currently have to fit some heavy-tailed data. As the fitted (positive, continuous) distributions will be used in a numerical integration procedure that involves a Fourier transformation, I am restricted to distributions with an analytic characteristic function. Some first analysis (Hill plot etc.) showed that the tails can be fitted quite well by a stable distribution. However, closer to zero this is not the case. So I played a little with a mixture (or "mixed" – a term that seems to be heavily overloaded in statistics) of a stable and an exponential distribution, i.e.: $$ f(x)=c_1\lambda\exp(-\lambda x)+c_2f_\text{stable}(x)\,, $$ where $c_1+c_2=1$. This seems to improve the fit significantly. The question remains how to fit the mixed distribution. From what I've read, it seems reasonable to consider minimum-distance estimators, e.g. minimizing the Anderson–Darling statistic to achieve a maximum goodness of fit. I did not find any implemented algorithms for the minimum-distance procedure, so I wanted to pick a numerical optimization algorithm that allows constraints (which I need) and implement the estimator myself.
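To make the minimum-distance idea concrete, here is a minimal sketch in Python/SciPy of what such an estimator could look like: minimize the Anderson–Darling statistic of the mixture CDF over the parameters, with box constraints handled by the optimizer. All function names are mine, the stable CDF comes from `scipy.stats.levy_stable`, and fixing $\beta$ is just one example of imposing a hard constraint – this is a sketch of the approach, not a tuned implementation.

```python
import numpy as np
from scipy import optimize, stats

def ad_statistic(sample, cdf):
    """Anderson-Darling statistic of `sample` against a CDF callable."""
    x = np.sort(sample)
    n = len(x)
    u = np.clip(cdf(x), 1e-12, 1 - 1e-12)   # avoid log(0) in the tails
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum (2i-1) [ln u_(i) + ln(1 - u_(n+1-i))]
    return -n - np.mean((2 * i - 1) * (np.log(u) + np.log(1 - u[::-1])))

def mixture_cdf(x, w, lam, alpha, beta, loc, scale):
    """CDF of the mixture w * Exp(lam) + (1 - w) * Stable(alpha, beta, loc, scale)."""
    return (w * stats.expon.cdf(x, scale=1.0 / lam)
            + (1 - w) * stats.levy_stable.cdf(x, alpha, beta, loc=loc, scale=scale))

def fit_mixture(sample, theta0, beta=1.0):
    """Minimum-AD fit over (w, lam, alpha, loc, scale); beta held fixed as an
    example of a hard constraint, the bounds enforce e.g. loc >= 0."""
    def objective(theta):
        w, lam, alpha, loc, scale = theta
        return ad_statistic(
            sample, lambda x: mixture_cdf(x, w, lam, alpha, beta, loc, scale))
    bounds = [(0.0, 1.0),      # mixture weight w
              (1e-6, None),    # exponential rate lam > 0
              (0.1, 2.0),      # stability index alpha
              (0.0, None),     # location constrained to be nonnegative
              (1e-6, None)]    # scale > 0
    # L-BFGS-B uses a numerical gradient, so no analytic Jacobian is needed
    return optimize.minimize(objective, theta0, method="L-BFGS-B", bounds=bounds)
```

Evaluating `levy_stable.cdf` is itself numerical and slow, so each optimizer step is expensive; since computational effort is stated not to matter, a derivative-free bounded method would also be an option.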

Does this approach make sense? Any recommendations for the optimization method? I do not have an analytic Jacobian, of course. Is there a tested, implemented method, or should I use a different approach altogether? MLE is "involved", as the stable distribution has no closed-form density or distribution function.

Remark: Computational effort is not relevant. However, as this is only a minor part of the thesis, I'd rather not spend too much time on it. Parameter estimation for heavy-tailed distributions is a vast field, and combined with my lack of experience this could end in a disaster. So I'd be happy if someone pointed me in the right direction.

user13655
  • Two questions: (1) Is there an illustrative plot you could include and (2) Can you give some sense as to why you want/need the $\alpha$-stability? At first I thought you were trying to maintain closure under linear transformations, but apparently not since introducing the mixture will break that. – cardinal Jan 29 '12 at 13:16
  • I want the $\alpha$-stability, because I have a Pareto-like tail (the Hill Plot suggests this) and I need a characteristic function in closed form to use it for numerical integration. Any numerical approximation of the characteristic function would likely ruin/complicate my numerical integration procedure. This is true for most fat-tail distributions like Burr/Log-normal etc. I can provide plots, if needed. – user13655 Jan 29 '12 at 21:30

1 Answer


Due to the intractable nature of the $\alpha$-stable distributions (no closed-form likelihood), I would suggest using an ABC technique to estimate the mixture. Peters et al. have a recent paper on this. (Here are some slides from Gareth Peters, as well.) Barthelmé and Chopin use a slightly different technique based on expectation-propagation to estimate their $\alpha$-stable model. Even though those papers do not directly address the mixture issue, the mixture part is the easy part: a Gibbs sampler that separates the data into two groups, one per component, at each (Gibbs) step means that both distributions can be estimated separately.
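The Gibbs decomposition described above can be sketched generically: given the current parameters, draw a latent component label for each observation, update the mixture weight from its conjugate Beta conditional, then update each component's parameters from its own subsample by whatever method is convenient (EP-ABC, ABC, or a closed-form conjugate draw). A minimal sketch, with the density and per-component update functions left as user-supplied callables (all names mine):

```python
import numpy as np

def sample_allocations(x, w, dens1, dens2, rng):
    """One Gibbs step for the latent labels: P(z_i = True) ~ w * f1(x_i)."""
    p1 = w * dens1(x)
    p2 = (1 - w) * dens2(x)
    return rng.random(len(x)) < p1 / (p1 + p2)   # True -> component 1

def gibbs_step(x, w, dens1, dens2, update1, update2, rng, a=1.0, b=1.0):
    """Allocate observations, then update weight and components separately."""
    z = sample_allocations(x, w, dens1, dens2, rng)
    n1 = z.sum()
    # Conjugate update for the weight under a Beta(a, b) prior
    w_new = rng.beta(a + n1, b + len(x) - n1)
    theta1 = update1(x[z])    # e.g. conjugate draw for the exponential rate
    theta2 = update2(x[~z])   # e.g. an EP-ABC / ABC move for the stable part
    return z, w_new, theta1, theta2
```

Each `gibbs_step` only ever hands each component its own subsample, which is exactly what allows an intractable stable component to be handled by ABC while the exponential component stays conjugate.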

Xi'an
  • Thank you for sharing these interesting papers. I think I can implement the second one in reasonable time. However, there is a problem: none of these methods allows for parameter constraints, i.e. I need $\beta=1$ and $\delta\geq0$ in the stable distribution. If $\delta<0$ the distribution puts positive probability on negative values, and some estimators for stable distributions returned such values. It seems inappropriate to just floor $\delta$ in the algorithm. – user13655 Jan 29 '12 at 21:54
  • I do not understand why you could not include those constraints into the prior distribution. Setting $\beta=1$ is straightforward, obviously, and choosing a prior on $\delta$ in $\mathbb{R}_+$ like an $\mathcal{E}xp(\mu)$ distribution is sufficient to ensure $\delta>0$ in the estimation. – Xi'an Jan 30 '12 at 06:23
  • If I understand correctly, you propose using the general Gibbs sampling procedure for mixture models as outlined on p. 25 of http://www.ceremade.dauphine.fr/~xian/mixo.pdf and then using e.g. EP-ABC instead of step 1.3? Is this correct? – user13655 Jan 31 '12 at 16:57
  • @user13655: Exactly! Yes, indeed, the mixture structure can be broken down by a Gibbs sampler. Once the data is divided into groups, each component can be analysed in the most convenient manner, e.g. by EP-ABC if everything else is too hard. – Xi'an Jan 31 '12 at 18:01
  • Thank you! I contacted Simon Barthelmé, who provided me with the Matlab code for their paper, so implementation should be straightforward now. – user13655 Feb 01 '12 at 19:09
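On the constraint point raised in the comments: in any ABC scheme the restrictions $\beta=1$ and $\delta\geq0$ are enforced simply by only ever proposing from a prior supported on the allowed region, e.g. an exponential prior on $\delta$ as suggested above. A toy rejection-ABC sketch for the stable component alone, using `scipy.stats.levy_stable` for simulation – the summary statistics, prior ranges, and tolerance are illustrative choices of mine, not taken from the cited papers:

```python
import numpy as np
from scipy import stats

def summaries(x):
    """Quantile-based summaries, which stay finite for heavy-tailed samples."""
    return np.quantile(x, [0.1, 0.25, 0.5, 0.75, 0.9])

def rejection_abc(data, n_draws, tol, mu=1.0, rng=None):
    """Rejection ABC for Stable(alpha, beta=1, delta, scale).

    The constraints come for free from the prior support:
    beta is never proposed (fixed at 1), and delta ~ Exp(mu) lives on [0, inf).
    """
    rng = np.random.default_rng(rng)
    s_obs = summaries(data)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(1.1, 2.0)       # illustrative prior range for alpha
        delta = rng.exponential(1.0 / mu)   # Exp(mu) prior => delta >= 0 always
        scale = rng.uniform(0.1, 3.0)
        sim = stats.levy_stable.rvs(alpha, 1.0, loc=delta, scale=scale,
                                    size=len(data), random_state=rng)
        if np.linalg.norm(summaries(sim) - s_obs) < tol:
            accepted.append((alpha, delta, scale))
    return accepted
```

Every accepted draw satisfies the constraints by construction, so no ad-hoc flooring of $\delta$ is ever needed; the same prior-support trick carries over unchanged when this step is embedded in the Gibbs sampler discussed above.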