Normalization of count data of time periods with different length

Question

I have count data from two time periods of different lengths. The same kind of event is counted in both periods.

Period 1 is 120 hours
Period 2 is 48 hours

At the end I have something like this table:

Event occurred in Period 1       Event occurred in Period 2
        275 times                          129 times

I want to compare the two counts with, e.g., a chi-squared test. Of course, if I did this without normalization, the result wouldn't be reliable. What is a good normalization/standardization method (I know these are different terms) to accomplish this? I'd appreciate any thoughts on the topic.

EDIT: I accidentally switched the periods; the data above are now corrected.

Glen_b · Accepted Answer · 2016-04-07T11:17:29.393

Generally you don't make them comparable by doing something to the counts, but you do take account of the different exposures in computing the expected values in the chi-squared test.

Under a null hypothesis of equal event rates (events per hour), the two periods can simply be combined to estimate the rate ... that is $275+129$ events in $120+48$ hours, so we estimate the rate as $\frac{275+129}{120+48}$ events per hour, and the expected count in period 1 is then $(275+129)\frac{120}{120+48}\approx 288.57$ and in period 2 is $(275+129)\frac{48}{120+48}\approx 115.43$.

With those expected values, the chi-squared goodness-of-fit statistic, $\sum_i \frac{(O_i-E_i)^2}{E_i}$, is straightforward to calculate by hand; with $k=2$ periods it has $k-1=1$ degree of freedom in this example. However, it's a pretty standard calculation. For example, here it is in R:

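# observed event counts and exposure times (hours) for periods 1 and 2
eventcounts <- c(275, 129)
exposuretime <- c(120, 48)
# rescale.p = TRUE turns the exposure times into proportions summing to 1
chisq.test(eventcounts, p = exposuretime, rescale.p = TRUE)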

        Chi-squared test for given probabilities

data:  eventcounts
X-squared = 2.2339, df = 1, p-value = 0.135

which is the same result as doing it by hand.
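
As a rough sketch of that hand calculation, using the rounded expected counts from above (so the last digit can differ slightly from the R output):

$$\chi^2 = \frac{(275-288.57)^2}{288.57} + \frac{(129-115.43)^2}{115.43} \approx 0.638 + 1.595 \approx 2.23$$

With 1 degree of freedom this is below the 5% critical value of about $3.84$, which is consistent with the p-value of 0.135 reported above.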

Thanks @Glen_b. I didn't check the chisq.test function carefully enough, so I didn't realize that one can pass probabilities to it. Very nice and clear explanation. Thanks a lot. — Tobias, Apr 07 '16 at 11:07
One last short question @Glen_b: Am I right that I can interpret this test as: there is no significant difference in the event rates between the two time periods? — Tobias, Apr 07 '16 at 11:39
