
This might be a little unusual, but please bear with me. I'm working on a theoretical exercise in chromatography (chemistry), a method used to separate different molecules. It is assumed that after performing the chromatography procedure, two different molecules separate from each other along a one-dimensional "tube" in zones with Gaussian (normally distributed) shapes, where the y-axis represents the concentration of each molecule along that one-dimensional space. The maxima of the peaks that represent the two molecules are separated from each other by a large number of standard deviations (10 in the following example):

[Figure: molecules separated from each other in zones with Gaussian shapes]

I want to calculate the theoretical amount (cumulative probability) of molecule A in the space that corresponds to the maximum of molecule B's peak. Thus, I am trying to calculate the upper-tail probability of a standard normal distribution beyond an extremely high (>10) z-score. I have tried using R (and Matlab) to calculate this:

1-pnorm(x)

where x is the large z-score. At x = 8 this returns a tail probability of 6.661338e-16, but anything above x = 8 returns 0. I know this is probably due to the limits of floating-point precision and rounding. My question: is there a way (a configuration of R or Matlab, specialized software, or an existing table) to calculate the upper-tail probability of a standard normal distribution for z-scores approaching 40, without the answer being rounded to 0?
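For example, the behaviour in an R session looks like this (the value at x = 8 is itself already affected by rounding; the exact tail probability at z = 8 is about 6.2e-16):

    1 - pnorm(8)   # 6.661338e-16 -- already limited by double-precision rounding in the subtraction
    1 - pnorm(9)   # 0 -- the true tail (~1e-19) is below the spacing of doubles near 1 (~1.1e-16)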

    Instead of `1-pnorm(x)`, use `pnorm(-x)`. – MAB Jul 05 '17 at 19:00
  • @Patty No, it will not. Try `pnorm(-30)` and `1-pnorm(30)`. – MAB Jul 05 '17 at 21:20
  • Well I stand corrected! I guess that makes sense, it's the operation of subtraction that causes the rounding. Why not answer his question? Might help future people with his problem. – Patty Jul 05 '17 at 21:21
  • Don't bother to compute the value: **compute its logarithm.** The values will underflow double-precision floats once $z$ exceeds 37 or so, but the logs will be fine. [Mills' ratio](https://stats.stackexchange.com/a/7206/919) will give a decent approximation: it says (for $z\gg 0$) the log is approximately $$\log(1-\Phi(z)) \approx -\frac{1}{2}\left(\log(2\pi) + z^2 + \log(z^2)\right).$$ It's good to $1\%$ at $z=10$, better than $0.1\%$ for $z\approx 32$, and gets better as $z$ grows. – whuber Jul 05 '17 at 21:36
  • Patty's right to say to use pnorm(-x) rather than 1-pnorm(x) for large x; the reason why you get 0 for 1-pnorm(30) is due to subtracting two things that are essentially 1 when you're really interested in the difference. pnorm(-30) is a good way to get it. However the "R way" to get upper tail areas would be `pnorm(30,lower.tail=FALSE)` (and I'd add `,log=TRUE`). However, for extreme values out beyond that you might as well use the approximation $\phi(x)/x$, (whuber's suggestion) -- and as he says, work on the log-scale. You should get at least 3 figure accuracy out that far. – Glen_b Jul 06 '17 at 01:41 (see the worked sketch after these comments)
  • Note that `pnorm(x,lower.tail=FALSE,log=TRUE)` seems to work correctly out a very long way – Glen_b Jul 06 '17 at 01:48
  • Fantastic! Thank you very much for your comments, I have learned a lot! – Mirzo Kanoatov Jul 06 '17 at 13:08
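Pulling the suggestions in these comments together, here is a minimal R sketch (the helper names `log_upper_tail`, `log_upper_tail_mills`, and `format_tail` are illustrative, not from the thread). It computes the log of the upper-tail probability both with `pnorm(..., lower.tail = FALSE, log.p = TRUE)` and with the Mills'-ratio approximation quoted above, then converts the log back to a readable mantissa-and-exponent form, since the probability itself underflows a double once z exceeds roughly 37:

    ## Log of the upper-tail probability P(Z > z), computed entirely on the
    ## log scale so it does not underflow even for very large z.
    log_upper_tail <- function(z) pnorm(z, lower.tail = FALSE, log.p = TRUE)

    ## Mills'-ratio approximation from the comments:
    ## log(1 - Phi(z)) ~ -(log(2*pi) + z^2 + log(z^2)) / 2  for large z
    log_upper_tail_mills <- function(z) -(log(2 * pi) + z^2 + log(z^2)) / 2

    ## Illustrative helper: print a natural-log probability as "m x 10^e",
    ## which stays readable even when exp(log_p) would underflow to 0.
    format_tail <- function(log_p) {
      log10_p <- log_p / log(10)
      e <- floor(log10_p)
      sprintf("%.3f x 10^%d", 10^(log10_p - e), as.integer(e))
    }

    z <- c(10, 20, 40)
    cbind(z, log_R = log_upper_tail(z), log_mills = log_upper_tail_mills(z))

    ## At z = 40 the log is about -804.6, i.e. a probability of roughly
    ## 3.7e-350 -- far below the smallest positive double (~4.9e-324).
    format_tail(log_upper_tail(40))

The two log columns should agree closely over this range: within about a percent on the probability scale at z = 10, and much better further out, consistent with the accuracy described in the comments.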

0 Answers