I have two count variables for several hundred thousand comparisons, one expected and one observed, and I would like to test if the counts are significantly different.
One possible approach I have looked into is simply subtracting the expected count from the observed count for each row, building a 95% CI for the mean difference, and checking whether 0 falls within it. However, I am unsure whether this is the best method, and I am wondering if there is something more appropriate for this kind of analysis.
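To make this first approach concrete, here is a minimal sketch of what I have in mind, in Python. The function name `mean_difference_ci` and the counts are made up purely for illustration, and the interval relies on a normal approximation (z = 1.96), which I am assuming is reasonable with several hundred thousand rows but is crude for the tiny toy sample shown here.

```python
import math
import statistics

def mean_difference_ci(observed, expected, z=1.96):
    """Normal-approximation CI for the mean of (observed - expected).

    Hypothetical helper for illustration; z=1.96 gives roughly 95% coverage
    when the number of rows is large.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Row-wise differences between observed and expected counts
    diffs = [o - e for o, e in zip(observed, expected)]
    mean = statistics.fmean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n))
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean - z * se, mean + z * se

# Made-up counts, for illustration only
observed = [12, 7, 30, 4, 18, 9, 22, 15]
expected = [10, 8, 25, 5, 15, 9, 20, 13]

low, high = mean_difference_ci(observed, expected)
contains_zero = low <= 0 <= high
print(f"95% CI for mean difference: ({low:.2f}, {high:.2f}); contains 0: {contains_zero}")
```

As far as I can tell, this is essentially a paired comparison of the two columns, and it treats every row's difference as equally variable, which may not hold for counts whose variance grows with their mean.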
I have also looked into using a GLM for count data to estimate a slope and test whether it is equal to 1. However, I have not seen any examples of this being used with count predictor variables, apart from someone else asking about it here: "Does using count data as independent variable violate any of GLM assumptions?" From that question, it appears that this would be okay if certain things are taken into account. But does this overcomplicate something as simple as "is the difference between observed and expected different from zero?"