
Hypothetical example:

A researcher counts the number of persons entering a building over a one hour period and records this number.

The following day the researcher performs the same count during the same hour of the day.

Can these two numbers be compared for a statistically significant difference using a t-test, despite the fact that neither is a mean?

(Please ignore the fact that taking single readings and trying to infer meaningful findings is abhorrently bad science; this is intended as a statistical question.)

kjetil b halvorsen
Unencoded
  • Have you heard of the [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution)? – usεr11852 Mar 23 '17 at 00:08
  • I have, to a limited extent. I assumed it would not apply here, as it would require a consistent flow of persons entering the building, which may not be the case (say, for a cafe, where lunch time is likely to be very busy)? – Unencoded Mar 23 '17 at 00:14
  • Why would you need a steady (I guess that's what you mean by "consistent") flow? You fixed the time-interval so what happens within that time-interval is irrelevant (at first instance at least). In any case, check whuber's and Rob's answers in this [thread](http://stats.stackexchange.com/questions/9561), I think they will address your issue fine. Your suspicion was correct: Nope! You should not use a standard $t$-test. – usεr11852 Mar 23 '17 at 01:01
  • I was just going off the list of assumptions in the wikipedia article you linked, apologies for causing any frustration, this is certainly a grey area for me. – Unencoded Mar 23 '17 at 01:08
  • No problem at all. You are looking for an "E-test" most probably. :D – usεr11852 Mar 23 '17 at 01:09
  • I appreciate your patience, I will look into an E-test then, many thanks for your help :D – Unencoded Mar 23 '17 at 01:14

1 Answer


The t-test requires an estimate of the standard deviation of a normally distributed variable. So it is not going to work with a single observation per day, from which you cannot calculate such an estimate.
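To spell out why a single reading per day is not enough, here is the usual pooled two-sample statistic. The symbols $\bar{x}_i$, $s_i^2$ and $n_i$ (sample mean, sample variance and sample size for day $i$) are notation introduced here for illustration only:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
$$

With $n_1 = n_2 = 1$ each $s_i^2$ is itself undefined and the pooled degrees of freedom $n_1 + n_2 - 2$ are zero, so $s_p$ cannot be computed and there is no $t$ distribution to refer the statistic to.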

But in this particular case you could assume that the variable is Poisson distributed (instead of normal). For a Poisson variable the variance equals the mean, so a single count already gives you an estimate of the variance. It is not a great assumption, though: some over-dispersion is likely, and your estimate will only be as good as your assumption.
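As a rough illustration of what the Poisson assumption buys you, here is a minimal sketch in Python (standard library only). The function name `poisson_two_count_test` and the example counts are made up for illustration. It assumes both counts are independent Poisson counts over equal-length intervals; under the null hypothesis of equal rates, the first count given the total is Binomial(total, 0.5), which gives an exact conditional test, and the difference divided by the square root of the total gives a normal approximation. If the real data are over-dispersed, both p-values will be too small.

```python
import math


def poisson_two_count_test(x1, x2):
    """Compare two counts from equal-length intervals.

    ASSUMPTION: x1 and x2 are independent Poisson counts.
    Returns (z, p_normal, p_exact), both p-values two-sided.
    """
    n = x1 + x2
    if n == 0:
        # No events at all: the data carry no evidence of a difference.
        return 0.0, 1.0, 1.0

    # Normal approximation: Var(x1 - x2) = rate1 + rate2, estimated by x1 + x2.
    z = (x1 - x2) / math.sqrt(n)
    p_normal = math.erfc(abs(z) / math.sqrt(2))

    # Exact conditional test: under H0, x1 | n ~ Binomial(n, 0.5).
    # Sum the probabilities of all outcomes no more likely than the observed one.
    pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    threshold = pmf[x1] * (1 + 1e-9)  # small tolerance for float ties
    p_exact = min(1.0, sum(p for p in pmf if p <= threshold))

    return z, p_normal, p_exact


if __name__ == "__main__":
    # Hypothetical counts: 120 persons on day one, 150 on day two.
    z, p_normal, p_exact = poisson_two_count_test(120, 150)
    print(f"z = {z:.3f}, normal p = {p_normal:.4f}, exact p = {p_exact:.4f}")
```

This is not the E-test mentioned in the comments, only the simpler conditional comparison; it is meant to show how a variance estimate falls out of the distributional assumption rather than out of replicate measurements.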

> Please ignore the fact that taking single readings and trying to infer meaningful findings is abhorrently bad science, this is intended as a statistical question.

This is hard to ignore. What is the point of the question if it is intended as a purely statistical one?

Sextus Empiricus