44

When teaching an introductory level class, the teachers I know tend to invent some numbers and a story in order to exemplify the method they are teaching.

What I would prefer is to tell a real story with real numbers. However, these stories need to relate to a very tiny dataset, one small enough to allow manual calculations.

Any suggestions for such datasets would be very welcome.

Some sample topics for the tiny datasets:

  • correlation/regression (basic)
  • ANOVA (1/2 ways)
  • z/t tests - one/two un/paired samples
  • comparisons of proportions - two/multi way tables
Andre Silva
Tal Galili

9 Answers

27

The Data and Story Library (DASL) is an "online library of datafiles and stories that illustrate the use of basic statistics methods".

This site seems to have what you need, and you can search it for particular data sets.

David LeBauer
  • Hi David - the site you linked to is really great - thank you. – Tal Galili Jan 04 '11 at 09:37
  • Service currently unavailable (as of April 2016) – Felipe Apr 17 '16 at 20:42
  • @FelipeAlmeida I just accessed the site; please check again, perhaps on a different computer / device – David LeBauer Apr 17 '16 at 21:12
  • @DavidLeBauer have you tried clicking on "list all topics" and then selecting one of the methods? see [this link here](http://i.imgur.com/HWLhHTV.png) – Felipe Apr 17 '16 at 21:22
  • 1
    @FelipeAlmeida I see. I spoke with the site's maintainer, who says '...Look for a new, more modern, and much better DASL coming soon at dasl.datadesk.com.' – David LeBauer Apr 18 '16 at 03:51
  • Although the "list all topics" functionality doesn't work, you can still [view a large number of datasets by searching specifically for "datafile"](http://lib.stat.cmu.edu/cgi-bin/dasl.cgi?query=datafile). – Ninjakannon May 08 '17 at 15:33
23

There's a book called "A Handbook of Small Datasets" by D.J. Hand, F. Daly, A.D. Lunn, K.J. McConway and E. Ostrowski. The Statistics Department at NCSU has posted the datasets from this book online here.

The website above gives only the data; to get the story behind the numbers (anything beyond what you can glean from a data set's title), you would need to read the book. But the data sets are small, and they are real.

  • These are just the right size. You can find the book by searching for "Handbook of Small Datasets" on Google Scholar, and parts of it can be viewed on Google Books. – Felipe Apr 17 '16 at 21:00
  • The given link is broken. Please update the link. Thanks – MYaseen208 Apr 15 '19 at 13:52
13

For two-way tables, I like the data on gender and survival of the Titanic passengers:

       | Alive  Dead | Total
-------+-------------+------
Female | 308    154  |  462
Male   | 142    709  |  851
-------+-------------+------
Total  | 450    863  | 1313

With this data, one can discuss things like the chi-square test for independence and measures of association, such as the relative rate and the odds ratio. For example, female passengers were ~4 times more likely to survive than male passengers (308/462 vs. 142/851). At the same time, male passengers were ~2.5 times more likely to die than female passengers (709/851 vs. 154/462). The odds ratio, however, is ~10 either way, whether you compare the odds of surviving (female vs. male) or the odds of dying (male vs. female).
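For anyone who wants to check the hand calculations, here is a minimal sketch in plain Python. It uses only the four cell counts from the table above; the variable names are mine, and the chi-square statistic is the ordinary Pearson statistic without a continuity correction (an assumption on my part, since the answer does not say which variant to use).

```python
# Cell counts copied from the two-way table above.
female_alive, female_dead = 308, 154
male_alive, male_dead = 142, 709

female_total = female_alive + female_dead
male_total = male_alive + male_dead
n = female_total + male_total

# Relative rates: ratio of two proportions.
rr_survive = (female_alive / female_total) / (male_alive / male_total)
rr_die = (male_dead / male_total) / (female_dead / female_total)

# Odds ratio: the cross-product ratio, identical for surviving
# (female vs. male) and for dying (male vs. female).
odds_ratio = (female_alive * male_dead) / (female_dead * male_alive)

# Pearson chi-square for independence (no continuity correction):
# sum over cells of (observed - expected)^2 / expected.
observed = [[female_alive, female_dead], [male_alive, male_dead]]
row_totals = [female_total, male_total]
col_totals = [female_alive + male_alive, female_dead + male_dead]
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

print(f"relative rate of survival (F vs. M): {rr_survive:.2f}")
print(f"relative rate of dying (M vs. F):    {rr_die:.2f}")
print(f"odds ratio:                          {odds_ratio:.2f}")
print(f"chi-square (1 df):                   {chi2:.1f}")
```

The expected counts are small enough to work out on a blackboard (for example, 462 × 450 / 1313 for the female/alive cell), so the loop can be replaced by four hand calculations in class.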

Wolfgang
9

The Journal of Statistics Education has an archive of educational data sets.

David LeBauer
6

CAUSEweb has data sets as well as lots of other teaching resources.

See http://www.causeweb.org/resources/datasets/ for the datasets.

CAUSE stands for Consortium for the Advancement of Undergraduate Statistics Education.

4

This is probably such an obvious answer that it hardly needs mentioning, but for correlation or linear regression, Anscombe's quartet is a logical choice. Although it is not a real story with real data, I think it is such a simple example that it would reasonably fit your criteria.
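As a sketch of the kind of hand calculation this supports, the snippet below computes the correlation and least-squares line for the first of the four data sets. The eleven (x, y) pairs are typed in as I recall them from Anscombe's 1973 paper, so treat them as an assumption and check them against a published copy (for instance, R's built-in `anscombe` data frame) before using them in class.

```python
from math import sqrt

# Anscombe's first data set (values assumed; verify against a published copy).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)             # Pearson correlation
slope = sxy / sxx                     # least-squares slope
intercept = mean_y - slope * mean_x   # least-squares intercept

print(f"r = {r:.3f}, fitted line: y = {intercept:.2f} + {slope:.3f} x")
```

With eleven points and integer x values, the sums of squares are manageable by hand, which is what makes the quartet usable in an introductory class.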

Andy W
3

StatSci.org is a nice source for datasets.

MYaseen208
2

A nice article entitled "Resource Discovery for Teaching Statistics" sheds light on this topic.

MYaseen208
  • Just finished reading most of the paper (I skimmed a few parts) - it is indeed a good review of the situation. It will be interesting to see how this will develop in the future... – Tal Galili Dec 09 '13 at 21:40
  • 2
    Is it possible you could add the key points here, or give a summary? The link may go dead at some point, & it will also help readers know if they want to pursue the link further without having to click on it. – gung - Reinstate Monica Jun 06 '14 at 20:10
1

https://tuvalabs.com

I am sure you found what you were looking for long ago, but for anyone else who comes across this thread: TuvaLabs is a nice source of datasets for classrooms. It curates datasets together with a story, a description, small exercises, and visualization tools, and you can also request datasets on it.

Mutant