
I'm just making this up to understand the rules...

Claim: People wearing roller blades can get to places in less time than those who don't (assuming the others travel on foot).

Now, I'd like to express this using symbols (that's what I call them, unless there's another name for this).

I'm guessing: If there's no problem with the symbol to be used, I'd like to use t as the time taken to walk. So my $H_0$ would be that even with roller blades on, they would need the same amount of time.

Based on this, is it correct to express the following?

$H_0: \mu = t$

Or is there a better way to express this?

Please feel free to correct me - I really want to learn this!

Thanks!

itsols

1 Answer


For simplicity, let us fix definite start and end points for the hypothesis, so in words the hypothesis might be

"People traveling from point A to point B by roller blade arrive sooner than those who walk."

Probably the simplest way to express this hypothesis is to add the qualifier "... on average".

Then the hypothesis becomes a definite statement about the conditional expectation of the travel time, given the mode of transportation.

To proceed you would then introduce a symbol for the travel time, say $T$, and a symbol for the mode of travel. If we are assuming that all travelers either use roller blades or they walk, then we can represent the travel mode by a dummy variable such as $$R=\begin{cases}1 & \text{if roller blade} \\ 0 & \text{if walk} \end{cases}$$

Then the hypothesis becomes

$$\mathbb{E}[\,T\mid R=1\,] < \mathbb{E}[\,T\mid R=0\,]$$

where $\mathbb{E}[\,x\mid a\,]$ should be read as "the expected value of $x$, given $a$".

Usually the symbol $H_0$ that you use is reserved for the null hypothesis, which is typically the negation of the hypothesis of interest. So for example in this case, $H_0$ might be "roller blading is no faster than walking".
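As a sketch of how that null could be written with the symbols $T$ and $R$ introduced above (the label $H_1$ for the hypothesis of interest is an assumption added here, not notation used elsewhere in this answer), negating the strict inequality gives

$$H_0:\ \mathbb{E}[\,T\mid R=1\,] \ge \mathbb{E}[\,T\mid R=0\,] \qquad \text{versus} \qquad H_1:\ \mathbb{E}[\,T\mid R=1\,] < \mathbb{E}[\,T\mid R=0\,]$$

The equality case $\mathbb{E}[\,T\mid R=1\,] = \mathbb{E}[\,T\mid R=0\,]$ is the boundary of this null, which is the "same amount of time" statement in the question.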


One last point: The hypothesis as expressed in the equation above is not actually empirically testable, because the theoretical expectations cannot be literally computed. So usually a next step is to use the theoretical hypothesis to generate predictions about what would be observed in a dataset (sample) of observed $(T,R)$ pairs.
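To make that last step concrete, here is a minimal Python sketch of one possible way to confront the hypothesis with a sample of $(T,R)$ pairs. Everything in it is an assumption for illustration: the travel times are made up, the function names `mean_difference` and `one_sided_permutation_p_value` are hypothetical, and the permutation test is just one choice of procedure (it tests the stronger null that the mode of travel makes no difference at all to the distribution of $T$, which includes the equal-means case).

```python
import random
import statistics


def mean_difference(times, modes):
    """Sample analogue of E[T | R=1] - E[T | R=0]."""
    blade = [t for t, r in zip(times, modes) if r == 1]
    walk = [t for t, r in zip(times, modes) if r == 0]
    return statistics.mean(blade) - statistics.mean(walk)


def one_sided_permutation_p_value(times, modes, n_permutations=2000, seed=0):
    """Estimate how often shuffled R labels give a difference at least
    as negative as the observed one (one-sided permutation test)."""
    rng = random.Random(seed)
    observed = mean_difference(times, modes)
    shuffled = list(modes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between T and R
        if mean_difference(times, shuffled) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Made-up travel times in minutes from A to B, purely for illustration.
times = [12.1, 11.4, 13.0, 12.6, 11.9, 19.8, 21.2, 20.5, 18.9, 20.1]
modes = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # R = 1 roller blade, R = 0 walk

observed_diff = mean_difference(times, modes)
p_value = one_sided_permutation_p_value(times, modes)
print(f"sample mean difference (blade - walk): {observed_diff:.2f} minutes")
print(f"one-sided permutation p-value: {p_value:.4f}")
```

A small p-value here would count as evidence against "roller blading is no faster than walking"; a two-sample t-test would be another common choice if its assumptions seem reasonable for the data.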

GeoMatt22
  • That is one of the most beautiful explanations I've seen in a long time on the entire Stack Exchange. Thanks! I do have 2 questions on what you've mentioned. First, how do I express the statement **roller blading is no faster than walking** in symbols? Second, I thought that **T** is reserved for other things like the T-score. So are we really free to use that symbol? Thanks again! – itsols Sep 29 '16 at 01:19
  • For the first: In the hypothesis (gray box), change $ – GeoMatt22 Sep 29 '16 at 01:27