
Suppose that you observe $F_1,F_2,\ldots,F_k$, each independently drawn from a non-central F distribution with common, known degrees of freedom $\nu_1, \nu_2$ and with unknown non-centrality parameters $\lambda_1,\lambda_2,\ldots,\lambda_k$. Suppose that the sample is ordered as $F_{(1)} \le F_{(2)} \le \ldots \le F_{(k)}$. I would like to somehow estimate $\lambda_{(k)}$, that is, the non-centrality parameter associated with the largest of the $F_i$. (Actually, procedures for estimating an arbitrary $\lambda_{(i)}$ would be nice too.)

There is a large body of literature on estimation after selection that typically assumes the RVs are normally distributed (Gupta and Miescke, inter alios), or exponential, uniform, etc. I looked into the normalizing transforms of the non-central F, but these require you to know the $\lambda_i$ (they seem to exist to construct tables), and don't work well with plug-in estimates $\hat{\lambda}_i$.

Milton & Rizvi (1989) is a related reference.
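To make the difficulty concrete, here is a small simulation sketch in Python with scipy (the degrees of freedom and $\lambda$ values below are arbitrary choices, not from any particular application). It applies the moment-based unbiased estimator of $\lambda$ — which is unbiased for a single, pre-specified population — to the population achieving $F_{(k)}$, and shows the resulting selection bias:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
df1, df2 = 4, 30                             # common, known degrees of freedom
lam = np.array([1.0, 2.0, 5.0, 10.0])        # arbitrary noncentrality parameters
n_rep = 20000

# One noncentral-F observation per population, replicated n_rep times.
f = stats.ncf.rvs(df1, df2, lam, size=(n_rep, lam.size), random_state=rng)
idx = f.argmax(axis=1)                       # which population gave F_(k)

# Moment-based unbiased estimator for a single noncentral F (needs df2 > 2):
# E[F] = df2 (df1 + lam) / (df1 (df2 - 2)), so lam_hat = F df1 (df2-2)/df2 - df1.
lam_hat = f[np.arange(n_rep), idx] * df1 * (df2 - 2) / df2 - df1

# Positive on average: conditioning on being the maximum inflates the estimate.
print("mean bias after selection:", (lam_hat - lam[idx]).mean())
```

The estimator is unbiased unconditionally, but conditional on selection the bias is positive, which is exactly the post-selection problem described above.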

shabbychef
  • I am both puzzled and curious as to why you would be interested in solving such a problem. I think the term "largest sample" is not what you mean. Don't you mean that you want to estimate the noncentrality parameter of the maximum of the Fs? Also, if you really mean i.i.d., then you are assuming all the noncentrality parameters are the same. If you are allowing them to be different, then maybe you mean that the noncentral Fs are independent but not necessarily identically distributed. In the case where you have k different lambdas, I see no way to identify any of the noncentrality parameters. – Michael R. Chernick Jul 20 '12 at 22:30
  • If you assume all of them are the same (the iid case) there is hope and you could apply maximum likelihood. Perhaps in the non iid case if you assume some are the same and hence reduce the number of parameters the likelihood approach might still work. – Michael R. Chernick Jul 20 '12 at 22:33
  • You can always write down the cumulative distribution for the maximum. But it will involve not only the lambda that you want to estimate but also the other k-1 that are nuisance parameters. – Michael R. Chernick Jul 20 '12 at 22:35
  • sorry for the confusion; yes, I meant 'independently', not 'i.i.d.' That is fixed. Milton & Rizvi give some background on a similar problem. Suppose you are trying to select among $k$ populations, some subset with (vector) means far from the origin. The selection part is relatively straightforward, but after performing selection, getting good unbiased estimates of the distances of the means to the origin is difficult. I hope that clarifies somewhat. – shabbychef Jul 20 '12 at 22:36
  • So how do the noncentral Fs enter the problem? Usually they come about in analysis of variance under the alternative that the means are different. For testing, the null distribution (the central F) is all that is needed to get the critical values; non-centrality parameters would only play a role in power/sample-size determination. Things are still too vague. I still think there are too many parameters running around to reach a sensible solution. You have k unknown noncentrality parameters, plus numerator and denominator degrees of freedom, which I assume are known. – Michael R. Chernick Jul 20 '12 at 23:21
  • But you have only one observation from each distribution. How do you get good estimates of any parameters based on a sample of size 1? I could get an unbiased estimate of the mean based on one observation, but it wouldn't be very accurate. Because the variance is a function of the mean for the exponential, I can estimate a variance based on one observation if I sampled from an exponential, but I couldn't for a normal distribution. So what makes you think you can get a decent solution to the problem that you are posing? – Michael R. Chernick Jul 20 '12 at 23:26
  • I am posing the question in terms of the non-central F; in reality, for each population, I observe $\nu_2 + \nu_1$ samples of a $\nu_1$-variate Gaussian vector, then construct Hotelling's $T^2$. I then multiply the $T_i^2$ by $\nu_2 / (\nu_1 (\nu_1 + \nu_2 - 1))$ to get a non-central F. By collecting more observations of the vectors, the $\lambda_i$ get inflated. – shabbychef Jul 21 '12 at 00:02
  • Maybe this is more apropos of the chat room, but, for example, Hwang showed that Lindley's shrinkage improves Bayes risk when noise is Gaussian. – shabbychef Jul 21 '12 at 00:03
  • I still don't see how you could possibly get a reasonable estimate of any noncentrality parameter with so little data. – Michael R. Chernick Jul 21 '12 at 04:43
  • I downvoted the question because, even after the questions and answers, your problem seems either to not have a good solution or to be improperly stated. – Michael R. Chernick Jul 21 '12 at 05:33
  • The F statistics are summary statistics. By analogy, suppose you observed a million draws from a normal RV with unknown mean $\mu$ and known variance $\sigma^2$, then computed the sample mean. The sample mean is normally distributed. You might just say that you observe a single Gaussian with mean $\mu$ and variance $\sigma^2/10^6$. Is that single draw a 'reasonable estimate' of the parameter $\mu$? It depends on the scale of $\mu$ versus $\sigma^2 / 10^6$. – shabbychef Jul 21 '12 at 05:35
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/4187/discussion-between-shabbychef-and-michael-chernick) – shabbychef Jul 21 '12 at 05:40
  • In the example you give, the sample size enters the denominator of the variance of the sample mean, so for large n the variance of the estimate of the mean will be small and continue to decrease with increasing n. I suppose the analogy with the F distribution would be the degrees of freedom in the numerator and denominator, but what is the estimator of the noncentrality parameter that would have a small variance based on that single F statistic? Can you give me a good answer? I think continuing in chat would be appropriate if this were a side issue. – Michael R. Chernick Jul 21 '12 at 06:14
  • But I think it is fundamental to whether or not the question is sensible. – Michael R. Chernick Jul 21 '12 at 06:15
  • You can find some estimators in [this paper](http://www.jstor.org/stable/3315657). Maximum likelihood estimators do not seem to be too difficult to calculate numerically for moderate $k$ (at least that is my impression after some numerical experiments). I would not spend lots of time looking for closed-form estimators of such cumbersome distributions. – Procrastinator Jul 21 '12 at 13:45
  • @Procrastinator thanks for the link! I had been using the unbiased estimator, but the KRS estimator helps me somewhat. The MLE is also easy to compute by brute force. – shabbychef Jul 24 '12 at 05:07
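For reference, the two single-observation estimators mentioned in the last comment can be sketched as follows in Python with scipy. The upper bound on the brute-force search is an arbitrary cap, not a principled choice:

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

def lam_unbiased(f, df1, df2):
    """Moment-based unbiased estimator of the noncentrality parameter
    from one noncentral-F observation (requires df2 > 2; may be negative)."""
    return f * df1 * (df2 - 2) / df2 - df1

def lam_mle(f, df1, df2, upper=1e4):
    """Brute-force MLE from one observation: bounded 1-d minimization of
    the negative log-likelihood (the upper bound is an arbitrary cap)."""
    nll = lambda lam: -stats.ncf.logpdf(f, df1, df2, lam)
    return minimize_scalar(nll, bounds=(0.0, upper), method="bounded").x

f_obs, df1, df2 = 4.0, 4, 30   # illustrative values only
print(lam_unbiased(f_obs, df1, df2), lam_mle(f_obs, df1, df2))
```

Neither estimator corrects for selection; applied to $F_{(k)}$ they inherit the selection bias discussed in the comments above.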

0 Answers