As stated in the title, I'm trying to understand how to test goodness-of-fit for the binomial distribution; to this end, I followed what is suggested in this link.
In particular, I'm investigating whether the dependent variable, here denoted $Y$, can be modelled with a binomial distribution. $Y$ is a dummy variable that takes the value 0 (failure) or 1 (success); it is also related to a categorical variable taking values from 0 to 9 (the probability of success increases with these values).
I need to test whether $Y$ follows a binomial distribution $B(p,n)$, where:
- $p$: probability of success;
- $n$: number of repeated experiments (trials);
$n$ has been set to 1, since the experiment is never repeated.
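To make the setup concrete (a sketch, reading $p$ as the probability of success, which is my assumption): with $n = 1$ the binomial probability mass function reduces to the Bernoulli one,

$$
P(Y = y) = \binom{1}{y}\, p^{y} (1-p)^{1-y} = p^{y} (1-p)^{1-y}, \qquad y \in \{0, 1\},
$$

so that $P(Y = 1) = p$ and $P(Y = 0) = 1 - p$.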
So, I implemented the following SAS code to estimate the parameter $p$:
proc genmod data = dataset;
model Y(event='1') = /dist=binomial; /* model P(Y = 1); for a binary response GENMOD otherwise models the lower level */
output out = predbin
p = p; /* p: binomial parameter estimate */
run;
data expected_binomial_distribution;
set predbin;
do Y = 0 to 1;
prob_bin = pdf("binomial",Y,p,1);
output;
end;
stop;
run;
and the following code to test the goodness-of-fit:
proc means sum nway data = expected_binomial_distribution;
class Y;
var prob_bin;
output out = goodness_of_fit sum=_testp_;
run;
ods output onewaylrchisq = LR_SpecifiedProportions
lrchisqMC = LR_Exact_MC;
proc freq data = dataset;
table Y / chisq(testp = goodness_of_fit
df = 1
lrchisq lrchi);
run;
ods output close;
The degrees of freedom are set to 1 because $Y$ can take only 2 values.
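For reference, my understanding is that the likelihood-ratio chi-square statistic requested with lrchisq compares observed and expected counts as follows (the symbols $O_y$, $N$ and $\pi_y$ are my own notation, not taken from the SAS output):

$$
G^2 = 2 \sum_{y \in \{0,1\}} O_y \,\ln\!\left(\frac{O_y}{N\,\pi_y}\right)
$$

where $O_y$ is the observed count of $Y = y$, $N$ is the total number of observations, and $\pi_y$ is the null proportion supplied through testp (here, the values stored in _testp_).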
Is this interpretation of what the link suggests for fitting and testing the binomial distribution correct?
Moreover, since the test rejects the null hypothesis, I understand that $Y$ cannot be modelled with a logistic regression model; so, what alternatives are there for modelling the dummy variable $Y$, in terms of distribution (negative binomial, Poisson, ...) or link function (logit, ...)?