
From David Salsburg's book The Lady Tasting Tea:

Although the reader may not believe it, literary style plays an important role in mathematical research. Some mathematical writers seem unable to produce articles that are easy to understand. Others seem to get a perverse pleasure out of generating many lines of symbolic notation so filled with detail that the general idea is lost in the picayune.

But some authors have the ability to display complicated ideas with such force and simplicity that the development appears to be obvious in their exposition. Only upon reviewing what has been learned does the reader realize the great power of the results. Such an author was Jerzy Neyman. It is a pleasure to read his papers. The ideas evolve naturally, the notation is deceptively simple, and the conclusions appear to be so natural that you find it hard to see why no one produced these results long before.

What are other specific examples of such well-written papers on statistics or machine learning?

The idea is to have a list of "this is how you should write" papers.

Please try to provide:

  • Full bibliographic citation such as:

    Carl E. Rasmussen, "The Infinite Gaussian Mixture Model", in Advances in Neural Information Processing Systems 12, Vol. 12 (2000)

  • If you include links, point them to publicly accessible repositories where possible (e.g. http://arxiv.org/).

  • A short, informal, comprehensible review of what the paper is about and why it is an example of an exceptionally well-written paper.

alberto
  • 4
    This is a good & nice question, it could be useful to have that list! – kjetil b halvorsen Feb 20 '15 at 14:33
  • 3
    @kjetilbhalvorsen Yes, indeed, but questions like this are not on topic for [se] sites in general. – Gavin Simpson Feb 20 '15 at 14:48
  • 5
    I agree w/ @kjetilbhalvorsen. It might be nice to have this Q on the site. Perhaps it could be made CW, though--it isn't really on-topic w/i a strict reading of the guidelines. – gung - Reinstate Monica Feb 20 '15 at 14:49
  • 2
    Yes, it should be made community wiki and reopened. – kjetil b halvorsen Feb 20 '15 at 14:51
  • 1
    Besag's "Spatial Interaction and the Statistical Analysis of Lattice Systems" published in 1974 is a good example of clear and vigorous writing. – Zen Feb 20 '15 at 14:51
  • 1
    +1 to the CW, that was my intention. But @Gavin how is this question more off-topic than http://stats.stackexchange.com/questions/1337/statistics-jokes ? – alberto Feb 20 '15 at 14:58
  • 1
    Ferguson's masterpiece: http://projecteuclid.org/euclid.aos/1176342360 – Zen Feb 20 '15 at 15:38
  • 3
    @alberto That one (or even some) of these debatably OT posts didn't get closed is not justification for leaving all such questions open *here*. I'd be just as interested in the responses you get as the next user of this site, but questions like this don't generate answers in the sense of the [se] system. – Gavin Simpson Feb 20 '15 at 15:50
  • One question: should I be the one to make it a "community wiki" post? (I don't see any checkbox) – alberto Feb 20 '15 at 16:02
  • 5
    I made it CW, but I agree with @Gavin: although in the distant past we have tolerated these "big list" questions, they do not fit the SE model. They tend to generate unsupported opinions and lots of links that eventually die. One way to get such a thread to fit on an SE site is to request an *objective analysis* (with supporting examples!). So if you're willing to ask your question in that (more constructive) way, I would be among those voting to reopen it--and it needn't even be CW in that format, either. – whuber Feb 20 '15 at 16:47
  • @whuber, I'm not sure what you mean by "objective analysis" (of clarity, style...). See the new edit, I hope there is some improvement :) But feel free to edit the question anyway! – alberto Feb 20 '15 at 18:51
  • I don't think whether this question is CW or not CW is critical to it being *on-topic*. "Big-list" questions have to walk a very fine line to fit the SE model (if the 'grandfathered-in' Jokes one was asked today, I'd vote to close it without hesitation, it's just not SE's concept of a suitable question). That's not to say I think this question is *uninteresting* - quite the opposite (*interestingness* is not itself sufficient). I think the extent to which this question seeks to avoid the page becoming a stream of unsupported opinion and dead links would be critical to it remaining open. – Glen_b Feb 21 '15 at 00:40

1 Answer


I'll give it a shot...:

Benjamini, Yoav; Hochberg, Yosef (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B 57 (1): 289–300. MR 1325392.

Link to PDF: http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_hochberg1995.pdf

I think the importance of the paper is indisputable. In fields like genomics, experiments involving thousands of tests are the norm, and the BH method is the most popular way to address the multiple testing issue. Not surprisingly, perhaps, this paper appears among the top 100 most cited articles.

Is it beautifully written? I think so. In this paper you have 1) The mathematical formalism (although I can't judge whether this could be made better); 2) An understandable, plain English explanation of what the problem is, why other methods are unsatisfactory and how the BH method works; 3) A simple worked example of how it is done.
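For readers who have not met the procedure, here is a minimal sketch of the BH step-up rule as it is usually stated: sort the m p-values, find the largest rank k with p_(k) <= (k/m)·q, and reject the k hypotheses with the smallest p-values. This is my own illustration, not code from the paper; the function name `benjamini_hochberg`, the level `q` and the p-values are assumptions made up for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Illustrative sketch of the BH step-up rule at level q
    (hypothetical helper, not taken from the paper).
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Made-up p-values, for illustration only.
example = [0.012, 0.045, 0.028, 0.004, 0.5]
decisions = benjamini_hochberg(example, q=0.05)
```

Note the "step-up" behaviour: a p-value that misses its own threshold is still rejected if a larger p-value further down the sorted list meets its threshold.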

(I'm very interested in this question; I hope others come up with answers & opinions.)

dariober
  • Related to this I would like to add [controlling the false discovery rate via knockoffs](http://statweb.stanford.edu/~candes/papers/FDR_regression.pdf). – Gumeo Oct 23 '15 at 09:47