I don't have a very strong background in statistics, so I have a few conceptual questions, and there is a strong possibility I'm missing something obvious.
Suppose I'm interested in estimating the 99th percentile of body weight in the United States, and I have data from every town in every state. I could simply pool all the data and compute the 99th percentile, but I'm not sure how to quantify the uncertainty of that estimate. The pooled data isn't really normally distributed: it has a very large excess kurtosis, so I don't believe a confidence interval that assumes normality would hold. But perhaps I'm missing a basic statistical concept.
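To make the first option concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption for illustration: the data is synthetic (lognormal weights with a made-up town effect, not real measurements), `pooled_quantile_bootstrap` is a hypothetical helper name, and the percentile bootstrap is just one way to attach an interval to the pooled estimate. Whether such an interval is trustworthy for a quantile this far in the tail is part of what I'm asking.

```python
import numpy as np

def pooled_quantile_bootstrap(weights, q=0.99, n_boot=500, level=0.95, seed=0):
    """Pooled q-quantile with a percentile-bootstrap interval (illustrative only)."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    estimate = np.quantile(weights, q)
    n = weights.size
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample individuals with replacement and recompute the quantile.
        resample = rng.choice(weights, size=n, replace=True)
        boot[b] = np.quantile(resample, q)
    alpha = 1.0 - level
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return estimate, lo, hi

# Synthetic stand-in for the real data: 40 towns, 500 people each,
# with a town-specific shift in the log-mean (an assumption, not a fit).
data_rng = np.random.default_rng(42)
town_weights = [
    data_rng.lognormal(mean=4.3 + 0.05 * data_rng.standard_normal(), sigma=0.2, size=500)
    for _ in range(40)
]
pooled = np.concatenate(town_weights)

est, lo, hi = pooled_quantile_bootstrap(pooled)
print(f"pooled 99th percentile: {est:.1f}, 95% bootstrap interval: ({lo:.1f}, {hi:.1f})")
```

Note that this resamples individuals as if they were one i.i.d. sample, which ignores the town structure entirely.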
The second option would be to find the 99th percentile in each town and apply the central limit theorem, treating each town's estimate as an independent random variable. I know that sample quantiles are asymptotically normal in general, but that this can fail for extreme quantiles. I have used MATLAB simulations to convince myself that it holds for the 0.99 quantile. However, body weight in each town is not identically distributed; you can imagine that low-income urban areas will have heavier individuals. So I must apply the Lyapunov or Lindeberg-Feller CLT. Is this a valid thing to do? These central limit theorems seem to make statements about a sum of random variables normalized by the square root of the sum of their variances, rather than about a mean or a quantile. How would confidence interval estimates change under these theorems? Any references or insight would be greatly appreciated. Thanks.
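A simulation check of this kind could be sketched as follows in Python (this is not the MATLAB code mentioned above). The lognormal model, its parameters, and the function names are all assumptions chosen for illustration. The sketch draws many samples for a single hypothetical town, computes the 0.99 sample quantile of each, and compares the spread of those quantiles with the standard asymptotic result for i.i.d. data, namely that the sample p-quantile has approximate standard deviation sqrt(p(1-p)/n) / f(q_p), where f is the density at the true quantile q_p.

```python
import numpy as np
from statistics import NormalDist

def simulate_sample_quantiles(n, p=0.99, n_rep=1000, mu=4.3, sigma=0.2, seed=0):
    """Sample p-quantiles from n_rep independent lognormal samples of size n."""
    rng = np.random.default_rng(seed)
    samples = rng.lognormal(mean=mu, sigma=sigma, size=(n_rep, n))
    return np.quantile(samples, p, axis=1)

def asymptotic_sd(n, p=0.99, mu=4.3, sigma=0.2):
    """Asymptotic sd of the sample p-quantile under the assumed lognormal model."""
    z = NormalDist().inv_cdf(p)                   # standard normal p-quantile
    q = np.exp(mu + sigma * z)                    # true lognormal p-quantile
    density = NormalDist().pdf(z) / (sigma * q)   # lognormal density at q
    return np.sqrt(p * (1 - p) / n) / density

n = 5000  # hypothetical town size
q_hat = simulate_sample_quantiles(n)

# Ratio near 1 means the simulated spread matches the asymptotic formula.
ratio = q_hat.std(ddof=1) / asymptotic_sd(n)

# Sample skewness of the simulated quantiles; near 0 for a normal shape.
centered = (q_hat - q_hat.mean()) / q_hat.std(ddof=1)
skew = np.mean(centered ** 3)

print(f"simulated sd / asymptotic sd: {ratio:.3f}, skewness: {skew:.3f}")
```

This only covers a single town with i.i.d. weights, so it says nothing by itself about combining towns whose distributions differ.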