qqnorm(x)
plot(qnorm(seq(80)/81),sort(x))

Finding that the plots produced by the commands above are slightly different from each other, I tried this:

qqnorm(qnorm(seq(80)/81))

I got a slightly less than perfect line. I would have regressed the result of qnorm(seq(80)/81) on whatever variable is plotted on the $x$-axis and then plotted the residuals against that predictor, expecting to see some graceful curve, except that I don't know what to use as the predictor. Such a residual plot might reveal more than just the graceful S-shaped curve I'd anticipate.

So my question is this: if the thing on the $x$-axis in the plot produced by qqnorm is not what I get from qnorm(seq(80)/81) (and what I did shows that indeed it is not), then what is it?

gung - Reinstate Monica
Michael Hardy
  • One other thing I could imagine it might be is the _expected values_ of the standard normal order statistics. But I don't know an efficient way to compute those or any standard commands to get them, unless that's what I actually get from qqnorm. – Michael Hardy Aug 02 '15 at 21:05
  • This seems to be a question about R code, not the nature of qq-plots themselves. If this is actually a statistical question, please edit it to make that aspect more obvious. – gung - Reinstate Monica Aug 02 '15 at 21:11
  • See the help page for `ppoints`, as referenced in that for `qqnorm`. – Scortchi - Reinstate Monica Aug 02 '15 at 21:52
  • As Scortchi suggests, the precise details of how the quantiles are calculated are in `ppoints`. They're not quite expected quantiles, but the use of a=3/8 for n ≤ 10 is a good approximation suggested by Blom (1958); I am not sure why the function changes to a=1/2 above that, but it only makes a visually noticeable difference at the two extreme points, and for n past about 30 or so not even then. – Glen_b Aug 03 '15 at 00:35
  • See [this question](http://stats.stackexchange.com/questions/9001/approximate-order-statistics-for-normal-random-variables) and also [this plot](http://i.stack.imgur.com/pAsQd.png) of expected normal order statistics against Blom's approximation for n=10 with a line through $O$ with slope 1 in red. – Glen_b Aug 03 '15 at 00:59

1 Answer


Based on comments above and pages to which they link, it appears that the command

ppoints(seq(80))

gives us this (the default offset in `ppoints` is $a = 1/2$ when there are more than $10$ points):

\begin{align} & \left( \frac{1}{2\cdot 80},\ \frac{3}{2\cdot 80},\ \frac{5}{2\cdot 80},\ \frac{7}{2\cdot 80},\ \ldots,\ \frac{159}{2\cdot 80} \right) \\[10pt] = {} &\left( \ldots,\ \frac{2i-1}{2\cdot 80},\ \ldots : i=1,\ldots,80 \right) \tag 1 \end{align}

and

qnorm(ppoints(seq(80)))

gives us precisely what is plotted on the $x$-axis when one uses the command

qqnorm(x)

where $x$ is a vector with $80$ components.
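
This claim can be checked numerically. The following is only a sketch: the data vector `x` is simulated for illustration (the original data are not shown), and `qqnorm` is called with `plot.it = FALSE` so that it returns the plotted coordinates instead of drawing them.

```r
# Illustrative data only; any numeric vector of length 80 would do.
set.seed(1)
x <- rnorm(80)

# qqnorm() returns the coordinates it would plot; $x holds the
# theoretical quantiles, in the same order as the input data.
qq <- qqnorm(x, plot.it = FALSE)

# Sorted, they should agree with qnorm(ppoints(80)) up to rounding.
theoretical <- qnorm(ppoints(80))
all.equal(sort(qq$x), theoretical)

# ppoints() applied to a vector of length > 1 uses only its length,
# so ppoints(seq(80)) and ppoints(80) should agree.
all.equal(ppoints(seq(80)), ppoints(80))
```

Note that `qq$x` has to be sorted before the comparison, because `qqnorm` pairs each theoretical quantile with the corresponding data point rather than returning them in increasing order.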

My first guess had been that instead of $(1)$ we would have

\begin{align} & \left( \frac 1 {81},\ \frac 2 {81},\ \frac 3 {81},\ \ldots,\ \frac{80}{81} \right) \\[10pt] = {} & \left( \ldots,\ \frac i {80+1},\ \ldots\ : i=1,\ldots,80 \right). \end{align}
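
The two candidate sets of plotting positions can be compared directly. This sketch relies on the documented form of `ppoints`, which computes $(i-a)/(n+1-2a)$; under that formula `a = 0` reproduces the first guess $i/(n+1)$ and `a = 1/2` reproduces $(1)$. The variable names are illustrative.

```r
n <- 80
i <- seq_len(n)

p_guess <- i / (n + 1)            # first guess: i / (n + 1)
p_used  <- (2 * i - 1) / (2 * n)  # expression (1): (2i - 1) / (2n)

# Both are special cases of ppoints(): (i - a) / (n + 1 - 2a).
all.equal(p_guess, ppoints(n, a = 0))
all.equal(p_used,  ppoints(n))    # default a = 1/2 because n > 10

# Difference between the two sets of normal quantiles. It is close
# to zero in the middle and largest in magnitude in the tails.
d <- qnorm(p_used) - qnorm(p_guess)
plot(qnorm(p_used), d,
     xlab = "qnorm(ppoints(80))",
     ylab = "difference from qnorm(seq(80)/81)")
```

The plot of `d` plays the role of the residual plot mentioned in the question: it shows how the quantiles from $(1)$ are spread slightly wider than those from $i/(n+1)$, with the gap concentrated at the extremes.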

Michael Hardy