I have the proportion of completed activities for each user in a group that used a particular application. There is one number per participant, for instance:

  • Participant #1 --> 0.85
  • Participant #2 --> 0.48
  • ...
  • Participant #23 --> 0.76

I recorded these results in order to show that the users can complete their tasks with a level of completeness of at least 80%. For this, the null hypothesis that I formulated (and that I want to reject) is: H0: σ = 0.8

And the alternative is: H1: σ > 0.8

This sounds logical to me, but I have been reading some papers and books, and all the examples I found of small single-sample cases contrast two kinds of users, for instance expert users against novice users in the same group.

What do you think? Can you point me to some literature with an example like mine (where you want to show that the sampled participants can do something to a given percentage)?

Thanks in advance,

gal007
  • What does $\sigma$ represent here? It seems to me that you probably have heterogeneous users, so they would likely have different expected proportions – Glen_b Nov 25 '16 at 21:35
  • I understood (though I'm not entirely sure) that when the sample you have is small, you need to use σ. Is that right? – gal007 Nov 25 '16 at 23:24
  • I'm really not sure what you're saying there (can you point to a source for it? That might give enough context to figure out what you mean), but it might make more sense if you explained what $\sigma$ is intended to represent here. – Glen_b Nov 25 '16 at 23:26
  • I think it represents the mean success rate for the usage of the application – gal007 Nov 26 '16 at 02:12
  • Is it ok to use the mean to demonstrate the hypothesis here? – gal007 Nov 26 '16 at 02:18
  • Okay then what does "when the sample you have is small, then you need to use σ" mean? It's also not clear what you're after when you ask "Is it ok to use the mean to demonstrate the hypothesis here?" ... You have a hypothesis about a population mean. Are you now asking if it's okay to use the sample mean to test the hypothesis? Or are you just asking the original question again? (If it's the second thing, please don't pester people by re-asking your posted question in comments. We can already see that you asked a question.) – Glen_b Nov 26 '16 at 03:08

1 Answer

The answer to your titular question is "yes", and you've done so yourself in the body of your question. I'm pretty sure that is not much help to you so I will answer the different question of how to test whether the observed data support the hypothesis that the population mean is greater than 0.8.

I suspect that your confusion may stem from the fact that your data are a set of proportions, whereas the usual examples dealing with testing proportions (or confidence intervals for proportions) use data that are a single proportion such as ten heads from fifteen tosses of a coin = 10/15.

The first thing that might help is to subtract your null hypothesis value from each of the data values. Then your null hypothesis is the familiar $H_0: \mu = 0$, where $\mu$ is the population mean of the shifted values.

Next, recognise that your data are not normally distributed and either choose to use a non-parametric test or assume that the data are normal enough to use, for example, a single sample t-test. For the latter I would suggest that if the (un-subtracted) proportions get much higher than 0.8 or less than 0.2 then the bounds of the actual distribution will make the normal approximation problematical. However, if all of the (un-subtracted) data are close to 0.5 then there will probably be no problem with an assumption of normality with n=23 participants.
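As a rough sketch of the subtract-then-test idea, the following uses only the Python standard library to compute a one-sample t statistic against 0.8. The `scores` values are made-up placeholders rather than the asker's data, and `one_sample_t` is a hypothetical helper name. It assumes the data are normal enough for a t-test, which is exactly the judgment call described above.

```python
import math
import statistics


def one_sample_t(values, null_value):
    """Return (t statistic, degrees of freedom) for H0: mean == null_value."""
    # Subtract the null value so the test is against zero.
    diffs = [v - null_value for v in values]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(diffs)
    t = mean_diff / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical completion proportions, for illustration only.
scores = [0.85, 0.48, 0.76, 0.90, 0.95, 0.88, 0.92, 0.81]
t_stat, df = one_sample_t(scores, 0.8)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The one-sided p-value is the upper-tail probability of a t distribution with `df` degrees of freedom. If SciPy is available, `scipy.stats.ttest_1samp(scores, 0.8, alternative="greater")` should return the statistic and p-value directly.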

You might consider using a permutation test (see "Permutation test comparing a single sample against a mean") or look here for alternatives (http://www.basic.northwestern.edu/statguidefiles/ttest_one_alts.html).
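One common form of a one-sample permutation test is the sign-flip test, sketched below with the Python standard library. This is an illustration of that general technique, not necessarily the procedure in the linked thread. It assumes that, under the null hypothesis, the values are symmetric about 0.8. The function name `sign_flip_test`, its parameters, and the `scores` values are all hypothetical.

```python
import random


def sign_flip_test(values, null_value, n_resamples=10000, seed=0):
    """Monte Carlo one-sided p-value for H1: centre > null_value."""
    rng = random.Random(seed)
    diffs = [v - null_value for v in values]
    observed = sum(diffs) / len(diffs)
    hits = 0
    for _ in range(n_resamples):
        # Under H0 (symmetry about null_value) each sign is equally likely.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if sum(flipped) / len(flipped) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical completion proportions, for illustration only.
scores = [0.85, 0.48, 0.76, 0.90, 0.95, 0.88, 0.92, 0.81]
print(f"one-sided p = {sign_flip_test(scores, 0.8):.3f}")
```

With 23 participants there are 2^23 possible sign patterns, so random resampling is used here in place of full enumeration.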

Finally, consider whether your purpose might be best served by focussing on how many of the participants exceeded the 80% threshold, or what the median value is. There are many data analysis purposes for which the hypothesis and significance tests are not really necessary or helpful.
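These descriptive summaries take only a few lines to compute. The sketch below uses made-up `scores` and a hypothetical helper `summarize`. It counts values strictly above the threshold; use `>=` instead if "at least 80%" is the intended criterion.

```python
import statistics


def summarize(values, threshold):
    """Return (count above threshold, fraction above threshold, median)."""
    above = sum(1 for v in values if v > threshold)
    return above, above / len(values), statistics.median(values)


# Hypothetical completion proportions, for illustration only.
scores = [0.85, 0.48, 0.76, 0.90, 0.95, 0.88, 0.92, 0.81]
count, fraction, median = summarize(scores, 0.8)
print(f"{count} of {len(scores)} above threshold ({fraction:.0%}); median = {median:.3f}")
```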

Michael Lew
  • thank you. I had the doubt also because I read a book (I just found it again) saying: "A hypothesis of the form ‘method A is good for task T’ is not testable but a hypothesis of the form ‘method A requires less time than method B to accomplish task T’ is". And in my case I am trying to prove the first, don't you think? :( – gal007 Nov 26 '16 at 11:41
  • Yes, you may be trying to "prove" an unprovable. Hence my suggestion that you consider reporting the fraction of participants exceeding your threshold or the median success. Perhaps a confidence interval would suffice. – Michael Lew Nov 26 '16 at 20:32