
I have almost the same question as "How can I efficiently model the sum of Bernoulli random variables?", with the following differences:

(1) The number of random variables being summed is about $N=20$ (case 1) or $N=90$ (case 2).

(2) $p_i \approx 0.13$ (case 1).

(3) A model based on the Poisson law is not precise enough.

(4) The approximation must also be good enough to model partial sums such as $\sum_{i=k}^{N} X_i$ for $k = 1, \dots, N$.

(5) We have empirical data for every $X_i$. A plot of the data shows that $\Pr(X_i=1)$ depends almost linearly on $i$ for $i = 1, \dots, 6$ and is then almost constant, or has only a slight linear trend, for $i = 7, \dots, 20$.

Actually, I am not sure about (3), since Le Cam's inequality looks very general...
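Whether (3) holds can be checked numerically. Below is a minimal sketch, assuming the $X_i$ are independent and using case 1 with every $p_i = 0.13$ as a stand-in for the empirical values. It computes the exact distribution of the sum (the Poisson binomial distribution) by convolving one variable at a time, then compares it with the Poisson law of the same mean and with the Le Cam bound $2\sum_i p_i^2$. The function names are illustrative only.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables."""
    pmf = [1.0]  # empty sum: P(S = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1.0 - p)  # case X_i = 0
            nxt[s + 1] += mass * p      # case X_i = 1
        pmf = nxt
    return pmf


def poisson_pmf(lam, k):
    """P(K = k) for K ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def l1_distance_to_poisson(probs):
    """Sum over k >= 0 of |P(S = k) - Poisson(k)|, with lam = sum(probs)."""
    lam = sum(probs)
    exact = poisson_binomial_pmf(probs)
    pois = [poisson_pmf(lam, k) for k in range(len(exact))]
    inside = sum(abs(e - q) for e, q in zip(exact, pois))
    # Poisson mass on k > N, where the exact PMF is zero.
    tail = max(0.0, 1.0 - sum(pois))
    return inside + tail


probs = [0.13] * 20  # assumed values for case 1, not the empirical data
le_cam_bound = 2 * sum(p * p for p in probs)
l1 = l1_distance_to_poisson(probs)
print(f"L1 distance: {l1:.4f}, Le Cam bound: {le_cam_bound:.4f}")
```

For these assumed values the bound is $2 \cdot 20 \cdot 0.13^2 = 0.676$, so on its own it says little about whether the Poisson approximation is precise enough; the computed distance is the more informative number. The same convolution gives the exact law for any set of independent $p_i$, so an approximation may not be needed at all for $N=20$ or $N=90$.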

What class of models could we try?

Andrey

1 Answer


Sums of Bernoullis are exactly binomially distributed, so one would often use logistic regression.

Patrick McCann
  • Sums of **identically** distributed Bernoullis are binomial. Ours differ slightly. We currently use the model $p_i = p\,(1 + \delta\, i/N)$, but there should be a better one. – Andrey Apr 19 '11 at 08:57
  • Can you paste the first few rows of your data set? – Patrick McCann Apr 29 '11 at 21:29
  • Here it is (10 rows): {{1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0}, {0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0}, {0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0}, {0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, {1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0}, {0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0}, {0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0}} – Andrey Jun 28 '11 at 12:42
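
The rows above could be used as follows. This is only a sketch, assuming each row is one observation of $(X_1, \dots, X_{18})$ and that the columns are independent: estimate each $p_i$ by its column frequency, then compute the exact distribution of the partial sum $\sum_{i=k}^{N} X_i$ by convolution. Ten rows give very noisy estimates, so this illustrates the mechanics rather than a fitted model; the function names are illustrative.

```python
# The 10 rows pasted in the comment above (18 columns each).
rows = [
    [1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
]


def column_frequencies(data):
    """Estimate Pr(X_i = 1) for each column as the fraction of ones."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


def pmf_of_sum(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1.0 - p)  # case X_i = 0
            nxt[s + 1] += mass * p      # case X_i = 1
        pmf = nxt
    return pmf


def partial_sum_pmf(probs, k):
    """PMF of X_k + ... + X_N, with k counted from 1."""
    return pmf_of_sum(probs[k - 1:])


p_hat = column_frequencies(rows)
for k in (1, 7):
    pmf = partial_sum_pmf(p_hat, k)
    mean = sum(s * m for s, m in enumerate(pmf))
    print(f"k={k}: mean of partial sum = {mean:.3f}")
```

Replacing `p_hat` with the values from a smoother model, such as the linear $p_i$ model mentioned in the first comment, would let the two sets of partial-sum distributions be compared directly.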