
I have samples from many related binomial distributions. In one case I have hundreds of samples, and some of the sample sizes are themselves only in the hundreds. In another case I have only dozens of samples, but most of the sample sizes are in the thousands. Each sample gives a proportion $\bar{x}_i$ as an estimate of $p_i$.

I accept the assumption that $p/(1-p)$ is log-normally distributed across all the samples. (Or at least among the population of possible $p$'s, from which my sample of samples is drawn, which is close enough.)

If I knew the lognormal parameters, then for each sample, I could find the maximum likelihood of $p$, given the pdf of the lognormal and the pdf of the normal that approximates the binomial, with its sample size.
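A minimal sketch of that per-sample step, under stated assumptions: it works on the log-odds scale $\theta = \log(p/(1-p))$, takes the Normal parameters `mu` and `sigma` as given, and uses the exact binomial likelihood rather than the normal approximation mentioned above. The function name `map_logit` is hypothetical. The result is the posterior mode of $\theta$, which is not the same point as the mode on the $p$ scale.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import expit


def map_logit(k, n, mu, sigma):
    """Posterior mode of theta = log(p/(1-p)) for k successes in n trials,
    assuming theta ~ Normal(mu, sigma^2). Hypothetical helper, one sample."""

    def score(theta):
        # Derivative of the log posterior with respect to theta:
        # binomial part (k - n*p) plus the Normal prior part.
        return k - n * expit(theta) - (theta - mu) / sigma**2

    # score() is strictly decreasing in theta, positive at the left end
    # and negative at the right end of this bracket, so it has one root.
    half_width = n * sigma**2 + 1.0
    return brentq(score, mu - half_width, mu + half_width)


# Usage: a sample with no successes still gets a finite estimate.
theta_hat = map_logit(k=0, n=50, mu=-1.0, sigma=0.5)
p_hat = expit(theta_hat)
```

A bracketing root finder is used here instead of Newton's method because plain Newton steps can overshoot badly when the starting value is far from the mode.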

But first I need to look at the set of samples to estimate the lognormal parameters. And in some cases, $\bar{x}$ will be 0 or 1, so $\log(\bar{x}/(1-\bar{x}))$ is not finite. Obviously, the large samples should have greater weight than the small samples in determining the mean of $\log(p/(1-p))$. Is there some reasonably fast computation that will give me the lognormal parameters, which I can then use to regress each of the samples to the mean?
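One possible approach, sketched under assumptions rather than offered as a definitive answer: treat each $\theta_i = \log(p_i/(1-p_i))$ as Normal($\mu$, $\sigma^2$), integrate $\theta_i$ out numerically on a fixed grid, and maximize the resulting marginal likelihood over $(\mu, \sigma)$. Samples with 0 or $n_i$ successes need no special handling, and larger samples automatically carry more weight because their likelihoods are more sharply peaked. The function name `fit_logit_normal`, the grid range of $\pm 12$, and the grid spacing are all illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp


def fit_logit_normal(k, n, grid=None):
    """Marginal maximum likelihood for (mu, sigma) of theta = log(p/(1-p)),
    given success counts k and sample sizes n. Hypothetical helper."""
    if grid is None:
        # Assumed integration grid on the log-odds scale (step 0.005).
        grid = np.linspace(-12.0, 12.0, 4801)
    k = np.asarray(k, dtype=float)[:, None]
    n = np.asarray(n, dtype=float)[:, None]
    step = grid[1] - grid[0]

    # Binomial log-likelihood of every sample at every grid value of theta.
    # The binomial coefficient is dropped; it does not involve mu or sigma.
    loglik = -k * np.logaddexp(0.0, -grid) - (n - k) * np.logaddexp(0.0, grid)

    def neg_marginal(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)
        log_prior = (-0.5 * ((grid - mu) / sigma) ** 2
                     - log_sigma - 0.5 * np.log(2.0 * np.pi))
        # Riemann-sum approximation of the integral over theta, per sample.
        per_sample = logsumexp(loglik + log_prior, axis=1) + np.log(step)
        return -per_sample.sum()

    # Start from the pooled proportion and sigma = 1.
    pooled = np.clip(k.sum() / n.sum(), 1e-6, 1.0 - 1e-6)
    start = np.array([np.log(pooled / (1.0 - pooled)), 0.0])
    res = minimize(neg_marginal, start, method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])


# Usage on simulated data (the true values -1.0 and 0.7 are made up).
rng = np.random.default_rng(0)
theta = rng.normal(-1.0, 0.7, size=300)
n_obs = rng.integers(50, 500, size=300)
k_obs = rng.binomial(n_obs, 1.0 / (1.0 + np.exp(-theta)))
mu_hat, sigma_hat = fit_logit_normal(k_obs, n_obs)
```

The fixed grid keeps the sketch short, but it becomes inaccurate if the fitted $\sigma$ or the per-sample likelihood width falls well below the grid spacing; a finer grid or an adaptive quadrature rule would be needed in that situation.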

dsaxton
Dvd Avins
  • You need to use the sample sizes $n_i$ along with the $\bar{x}_i$ in order to carry out anything like this. Your remarks about maximum likelihood are obscure, because you seem actually to be performing some kind of Bayesian computation rather than ML. Similarly, the final remark about "regress each of the samples to the mean" is difficult to comprehend. Could you elaborate on your post by describing the actual problem you face, rather than this abstraction of it? That would help communicate your ideas more reliably. – whuber Sep 21 '16 at 20:56
  • Can you use a beta distribution directly on the $p_i$'s rather than a log-normal on the odds? Then you could use http://stats.stackexchange.com/questions/96481/how-to-specify-a-bayesian-binomial-model-with-shrinkage-to-the-population – jaradniemi Sep 21 '16 at 21:20
  • This is a better link http://stats.stackexchange.com/questions/24537/priors-for-hierarchial-bayesian-binomial-model and yes, you could obtain MLEs fairly easily since the $p_i$'s can be integrated out and you will then have a 2-dimensional optimization. – jaradniemi Sep 21 '16 at 21:23
  • I am attempting to estimate skill levels of a pass/fail skill in a population of people. I succeed in getting the whole known population of people to demonstrate their skill by attempting the task. Each subject who participates tries however many times they have patience for, which varies considerably. The known population may be treated as a sample of a wider population. If I know (or have useful estimates of) the mean and variance of the skill levels in the population, I can, in Bayesian fashion, take the product of the density functions to find the maximum likelihood of each skill level. – Dvd Avins Sep 22 '16 at 00:29
  • It's getting late here. I'll look at whether a beta distribution is close enough, and also at the other link, tomorrow (or tonight if I can't sleep). – Dvd Avins Sep 22 '16 at 00:32
  • Why would $\log(p/(1-p))$ be normal? – Glen_b Sep 22 '16 at 04:57
  • @Glen If X is normal, X+K1 is normal (the K's representing constants). If log(Y) is normal, log(Y*K2) is log(Y) + log(K2), which is still normal. I have reason to believe that the skill level, as represented by the log odds ratio relative to the population, log((p/(1-p))/(pw/(1-pw))), is reasonably approximated by a normal distribution with mean 0, where pw is the mean p in the population. Since pw/(1-pw) is a constant, log(p/(1-p)) must also be approximately normal, with a mean of log(pw/(1-pw)). (Sorry, I don't know the markdown to show math. I don't see it in the formatting link.) – Dvd Avins Sep 22 '16 at 14:38
  • Mathematical formatting via mathjax - http://meta.math.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference# -- If you know LaTeX it's trivial to use, if you don't, it's not hard to get started; begin by putting dollar signs `$` either end of an expression and \ in front of function names. so `$\log(Y)$` is $\log(Y)$. use `^` to get superscripts and `_` to get subscripts, enclosing any superscript or subscript longer than one character in braces `{ }` ... replace `*` with `\cdot` or `\times` (or in many cases just a space) so $a\cdot b, a\times b, ab$ are products – Glen_b Sep 22 '16 at 23:29
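
For comparison, the beta-prior alternative raised in the comments above can be sketched as follows. This is only an illustration under the assumption that $p_i \sim \text{Beta}(a, b)$ replaces the log-normal odds model; the $p_i$ then integrate out in closed form, leaving a two-parameter optimization. The function name `fit_beta_binomial` is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln


def fit_beta_binomial(k, n):
    """Maximum likelihood for Beta(a, b) with the p_i integrated out,
    plus the posterior-mean (shrunken) estimate of each p_i."""
    k = np.asarray(k, dtype=float)
    n = np.asarray(n, dtype=float)

    def neg_loglik(log_ab):
        a, b = np.exp(log_ab)  # optimize on the log scale so a, b stay > 0
        # Beta-binomial log-likelihood without the binomial coefficient.
        return -np.sum(betaln(k + a, n - k + b) - betaln(a, b))

    res = minimize(neg_loglik, x0=np.zeros(2), method="Nelder-Mead")
    a, b = np.exp(res.x)
    shrunk = (k + a) / (n + a + b)  # posterior mean of each p_i
    return a, b, shrunk


# Usage on simulated data (Beta(2, 6) is a made-up ground truth).
rng = np.random.default_rng(1)
p_true = rng.beta(2.0, 6.0, size=300)
n_obs = rng.integers(20, 400, size=300)
k_obs = rng.binomial(n_obs, p_true)
a_hat, b_hat, p_shrunk = fit_beta_binomial(k_obs, n_obs)
```

Samples with $\bar{x} = 0$ or $1$ are pulled strictly inside $(0, 1)$ by the posterior mean, and small samples are pulled further toward $a/(a+b)$ than large ones.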

0 Answers