I have a $10000\times2$ table (see below) in which column A contains observed data and column B contains the output of a model that is supposed to fit those observations.
$n_{A}$ is the total number of points in column A and $n_{B}$ is the total number of points in column B; the two are not necessarily equal.
A B
-----
1 0
1 0
1 0
0 1
1 0
2 0
1 0
1 0
1 1
0 0
0 0
0 2
0 1
5 7
1 0
1 0
6 4
(...)
How should I characterize the goodness of fit of this model output (B) with my observed data (A)?
The Chi-Squared test:
$\chi^{2}=\sum\limits_{i=1}^{N} \frac{(O_i-E_i)^2}{E_i}$
(where $O_i$ is the observed frequency and $E_i$ is the expected frequency) raises the question of what to do when $E_i=0$. As you can see in the table above, there are many bins where $E_i$ is zero.
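To make the problem concrete, here is a minimal Python sketch that evaluates the per-bin terms of the formula above on the 17 rows shown in the table. The helper name `pearson_terms` and the choice to mark undefined bins with `None` are my own assumptions for illustration. Skipping those bins is only a way to count them, not a suggested fix.

```python
# First 17 rows of the table above (A = observed, B = model output).
observed = [1, 1, 1, 0, 1, 2, 1, 1, 1, 0, 0, 0, 0, 5, 1, 1, 6]
expected = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 2, 1, 7, 0, 0, 4]


def pearson_terms(obs, exp):
    """Return the per-bin terms (O_i - E_i)^2 / E_i.

    A bin with E_i = 0 is marked with None, because the term is then
    either infinite (O_i > 0) or the indeterminate form 0/0 (O_i = 0).
    """
    terms = []
    for o, e in zip(obs, exp):
        if e == 0:
            terms.append(None)
        else:
            terms.append((o - e) ** 2 / e)
    return terms


terms = pearson_terms(observed, expected)
undefined_bins = sum(t is None for t in terms)
# Sum over the bins where the term is defined (illustration only).
partial_chi2 = sum(t for t in terms if t is not None)

print(f"{undefined_bins} of {len(terms)} bins have E_i = 0")
print(f"sum over the remaining bins: {partial_chi2:.4f}")
```

On these rows, most bins fall in the undefined case, so the statistic as written cannot be evaluated without deciding how to treat them.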
The same happens if I use a Poisson log-likelihood ratio test. This is how I've seen it expressed (taken from here):
$-2\ln\lambda=2 \sum\limits_{i=1}^{N} \left(E_{i} - O_{i} + O_{i}\ln\frac{O_{i}}{E_{i}}\right)$
It clearly has the same problem as the chi-squared test above whenever $E_i=0$.
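A similar sketch for the Poisson expression, again on the 17 rows shown in the table. The function name `poisson_lr_term` is hypothetical, and I am assuming the usual convention that $O_i\ln O_i$ is taken as $0$ when $O_i=0$. Under that convention a bin with $O_i=0$ is harmless even if $E_i=0$. The terms that blow up are those with $E_i=0$ and $O_i>0$, since a Poisson distribution with mean zero assigns probability zero to any positive count.

```python
import math

# First 17 rows of the table above (A = observed, B = model output).
observed = [1, 1, 1, 0, 1, 2, 1, 1, 1, 0, 0, 0, 0, 5, 1, 1, 6]
expected = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 2, 1, 7, 0, 0, 4]


def poisson_lr_term(o, e):
    """One bin's term E_i - O_i + O_i * ln(O_i / E_i).

    Assumed convention: O_i * ln(O_i / E_i) is taken as 0 when O_i = 0.
    """
    if o == 0:
        return float(e)   # only the E_i part survives
    if e == 0:
        return math.inf   # O_i > 0 while the model predicts exactly 0
    return e - o + o * math.log(o / e)


terms = [poisson_lr_term(o, e) for o, e in zip(observed, expected)]
infinite_bins = sum(math.isinf(t) for t in terms)
statistic = 2 * sum(terms)

print(f"{infinite_bins} of {len(terms)} bins give an infinite term")
print(f"-2 ln(lambda) = {statistic}")
```

As long as even one bin has $E_i=0$ with $O_i>0$, the whole sum is infinite, so the statistic cannot rank models that share this feature.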
Fisher's exact test was mentioned in the original question (here) but later retracted.
So what can I do with this table?
This question comes from this original post. I've tried to phrase it as simply as possible.
A very similar question was asked here but it was never fully answered.