This is strictly a nomenclature question. I have no particular problem finding double integrals of the type $\int\int\text{pdf}(y) \, d y \,d x$, and I find them quite useful. Whereas we have a good name for $\int\text{pdf}(x) \, dx=\text{CDF}(\textit{x})$, where CDF is the cumulative distribution (credit: @NickCox, A.K.A., density) function, what I do not have is a good name for the integral of the CDF.

I suppose one could call it an accumulated cumulative distribution (ACD), DID (double integral of density) or CDF2, but I have not seen anything of the sort. For example, one would hesitate to use "ccdf" or "CCDF", as that is already taken as an abbreviation for complementary cumulative distribution function, which some prefer to saying "survival function," S$(t)$, as the latter is, strictly speaking, for an RV, whereas CCDF is not from an RV; it is a function equal to 1-CDF, which may relate to probability, but does not have to. For example, PDF often refers to situations in which there are no probabilities, and a more general term for PDF is "density function". However, $df$ is already taken as "degrees of freedom", so the entire literature is stuck with "PDF". So what about DIPDF, "double integral of PDF"? A bit long, that is. DIDF? ICDF for integral of cumulative distribution (density) function? How about ICD, integral of cumulative distribution? I like that one, it is short and says it all.

@whuber gave some examples of how these are used in his comment below and I quote "That's right. I establish a general formula for certain definite integrals of CDFs at stats.stackexchange.com/a/446404/919. Also closely related are stats.stackexchange.com/questions/413331, stats.stackexchange.com/questions/105509, stats.stackexchange.com/questions/222478, and stats.stackexchange.com/questions/18438 -- and I know there are more."

Thanks to @whuber's contributions the text of this question is now clearer than prior versions. Regrets to @SextusEmpiricus, we have both spent too much time on this.

And the accepted answer is "super-cumulative" distribution, because that name is catchy and has been used before, although frankly, without being told, I would not have known that, which is why, after all, I asked. Now, for the first time, we define SCD as its acronym. I wanted an acronym because unlike elsewhere, where $S(x)$ is used for SCD$(x)$ (not mentioning names), I wanted something that was unique enough to not cause confusion. Now granted, I may be using SCD outside of a purely statistical context in my own work, but as everyone uses PDF, even when there is no p to speak of, that is at most a venial sin.

Edit: Upon further consideration, I will call pdf as $f$ of whatever, e.g., $f(x)$, CDF as $F(x)$ and double integrals as $\mathcal{F}(x)$ just to make things simpler.
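
To make the notation concrete, here is a minimal sketch, assuming an exponential distribution with rate 1 (chosen only for illustration; the function names `f`, `F`, `script_F` and `midpoint_integral` are hypothetical). It compares a numerical integral of $F$ from $0$ to $x$ with the closed form $\mathcal{F}(x) = x - (1-e^{-\lambda x})/\lambda$ that holds for this particular distribution.

```python
import math

RATE = 1.0  # assumed rate of the illustrative exponential distribution


def midpoint_integral(func, a, b, n=20_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def f(x):
    """pdf of the exponential distribution."""
    return RATE * math.exp(-RATE * x)


def F(x):
    """CDF, i.e. the integral of f from 0 to x."""
    return 1.0 - math.exp(-RATE * x)


def script_F(x):
    """Closed form of the integral of F from 0 to x (exponential case only)."""
    return x - (1.0 - math.exp(-RATE * x)) / RATE


x = 2.0
numeric = midpoint_integral(F, 0.0, x)
print(numeric, script_F(x))
```

The two printed values should agree to many decimal places; for any other distribution only `f`, `F` and the closed form would need to change.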

Carl
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/117344/discussion-on-question-by-carl-what-should-the-0-to-t-integral-of-a-cdf-be-calle). – whuber Dec 16 '20 at 14:07
  • This question seems not to be about integrating CDFs, but rather about integrating over the indexing space ("time") of a stochastic process. This makes it rather vague and confusing and also suggests there should not be any general term for such a broad, ill-defined procedure. – whuber Dec 16 '20 at 15:02
  • @whuber The question is about integrating CDF's from the lower limit of the CDF support to some independent variable value. Whether this is a stochastic process or not is not relevant. Whether the independent variable is time or not is irrelevant. Due to irrelevant concerns and a few relevant ones, this question has gained the dubious title of 'most commented.' Perhaps in the future, I will not state that I have a use for something as that invites the reader to think that they can judiciously second guess my intentions without having a full deck of cards. – Carl Dec 20 '20 at 16:36
  • Your comments apply to the completely revised version of the question. There was no second guessing when I wrote my previous comments: they were based on evidence of confusion between the indefinite integral of the CDF and time-integrals of time-varying CDFs. – whuber Dec 20 '20 at 18:34
  • @whuber I accept blame for confusing the reader. However, it is also important that you now state that the confusion no longer pertains. I leave you with the following, I may have been confusing, but I was not confused. – Carl Dec 21 '20 at 14:31
  • @Carl, those examples by Whuber are no examples of integrals of CDF but they are examples of integrals of 1-CDF. – Sextus Empiricus Dec 21 '20 at 15:02
  • @SextusEmpiricus OK. When I have needed the integral of CCDF, I have derived it from the integral of CDF. If you know the one, you can write the other. – Carl Dec 22 '20 at 05:00
  • That's incorrect, because typically the integral of a CDF (often taken from $0$ to $\infty$) *must* diverge. – whuber Aug 28 '21 at 19:16
  • @whuber CDF=1-CCDF. Thus, $\int_0^x \text{CDF}\,dx=x-\int_0^x \text{CCDF}\,dx$. Whereas $\lim_{x \to \infty}\int_0^x \text{CDF}\,dx\to\infty$, that is not how it is used. Typically, it is used to find the average CCD on the interval $(a,\,b)$ of $x$. – Carl Aug 29 '21 at 20:54

2 Answers

Disclaimer

What should the integral of a CDF be called

I suggest the following name "integral of a CDF". Unless there is something intuitive about this integral I do not see why we should aim for a different name. The following answer will only show that the current status is that there is no intuitive idea behind the double integral of a PDF or integral of a CDF (and that the examples are not examples of integrals of a CDF). It is not a direct answer to the question (instead it is an answer to why we cannot answer the question).

This is not an answer suggesting a name. It is a summary of several comments that may be helpful to achieve an answer.

At the moment it is, to me, not very clear what the double integral of the probability density function is supposed to mean. The two examples have some problems: (1) Your examples are physics and not probability. Is there a use for the double integral of a probability density? (2) In addition, the examples are not examples of a double integration.

In this answer I will argue why the double integral of a pdf is problematic* **, and possibly this may lead to clarifications of the examples, and eventually inspiration for a name for this integral.

* There are several notions of the integral of $1-CDF$ like in the questions:

but I do not know of anything that integrates the $CDF$

** By problematic I mean that it is an integral of an extensive property but not in an additive way with disjoint sets. Or, the integrand $dx$, a measure of space, is the quantity which we add up, weighted by 1-CDF(x), so we must see it intuitively as a sum over $dx$.

The integral over $1-F(x)$ can be converted into a sum over the quantile function, $\int_0^b (1-F(x)) dx = \int_{F(0)}^{F(b)} Q(p) dp + b\,(1-F(b))$, where the boundary term $b\,(1-F(b))$ vanishes as $b \to \infty$ when the mean is finite. These are related by the integral of inverse functions, making the integral over $1-F(x)$ equivalent to an integral over the quantile function. For the integral over $F(x)$ you do not have the same equivalence. Without this equivalence I do not see any intuition for the use of such integrals and it becomes difficult to come up with a name.
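
As a numerical sanity check of the link between $1-F$ and the quantile function, here is a minimal sketch assuming an exponential distribution with rate 1 (an arbitrary choice for illustration; the helper names are hypothetical). Note that for a finite upper limit $b$ the inverse-function identity carries the boundary term $b\,(1-F(b))$, which vanishes as $b\to\infty$ when the mean is finite.

```python
import math


def midpoint_integral(func, a, b, n=20_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def cdf(x):
    """F(x) for the assumed Exponential(1) distribution."""
    return 1.0 - math.exp(-x)


def quantile(p):
    """Q(p), the inverse of F, for the same distribution."""
    return -math.log(1.0 - p)


b = 1.5
lhs = midpoint_integral(lambda x: 1.0 - cdf(x), 0.0, b)      # integral of 1 - F
quantile_part = midpoint_integral(quantile, cdf(0.0), cdf(b))  # integral of Q
boundary = b * (1.0 - cdf(b))                                  # boundary term
print(lhs, quantile_part + boundary)
```

The two printed numbers should coincide up to integration error; dropping `boundary` leaves a gap that shrinks as `b` grows.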


Densities

The meaning of density has been a subject in this question: What do we exactly mean by "density" in Probability Density function (PDF)?

In my answer to that question I relate densities to the Radon-Nikodym derivative

  • Densities as the ratio of two measures on the same space. $$\rho = \frac{d \nu}{d \mu}$$
  • These two quantities/measures are extensive properties. The ratio is an intensive property
  • By integration of this density you get an extensive property. $$\nu(S) = \int_S \rho d \mu$$

So the integral of a probability density (or a normalized density as used in your examples) will give 'probability' as outcome. However an integral of the extensive property 'probability' gives a value with no clear use.


Example 2

In your second example, decay of some amount of radioactive material, your double integral is not resulting from a double integral of an intensive property.

The amount of material $M(t)$ follows a differential equation (with $\dot{}$ referring to differentiation in time):

$$\dot{M}(t)= -\frac{\ln(2)}{\tau} \cdot M(t) = -\lambda \cdot M(t)$$

where $\tau$ is the half-life, and $\lambda$ is the rate of decay. With $M(0)=1$, the solution is:

$$\begin{array}{rlcrcl} \text{amount of mass} & [mass] &:& M(t) &=& e^{-\lambda t} \\ \text{loss rate} & [mass/time] &:& -\dot{M}(t) &=& \lambda e^{-\lambda t} \\ \end{array}$$

Because of that differential equation we can write $\dot{M}(t)$ or $M(t)$ as an integral of itself by using $M(t) - M(r) = - \int_t^r \dot{M}(s) ds$ and if $M(\infty) = 0$ then

$$M(t) = M(t) - M(\infty) = - \int_t^\infty \dot{M}(s) ds = \lambda \int_t^\infty {M}(s) ds $$

In your example you compute the total loss $Q(a,b)$ (and, relatedly, the average loss $Q(a,b)/(b-a)$) in some time period from $a$ to $b$ as a function of the mass. It is in that way that you get the double integral

$$\begin{array}{rrcl} \text{total loss between $a$ and $b$} :& Q(a,b) &=& \int_a^b \dot M(t) dt = M(b) - M(a)\\ &&=& \int_a^b -\lambda M(t) dt \\ &&=& \int_{a}^b - \lambda \left(\lambda \int_t^\infty {M}(s) ds \right) dt \\ && =& - \lambda^2 \int_{a}^b \int_t^\infty {M}(s) ds dt \end{array}$$
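
The chain of equalities can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the decaying solution $M(t)=e^{-\lambda t}$ with $M(0)=1$ (which satisfies the differential equation and $M(\infty)=0$), an arbitrary half-life of 2 time units, and a truncated upper limit in place of $\infty$; all names are hypothetical.

```python
import math

HALF_LIFE = 2.0                  # assumed half-life tau (arbitrary time units)
LAM = math.log(2.0) / HALF_LIFE  # decay rate lambda = ln(2) / tau


def mass(t):
    """Remaining mass fraction M(t) = exp(-lambda * t), with M(0) = 1."""
    return math.exp(-LAM * t)


def midpoint_integral(func, a, b, n):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def tail_integral(t, n=2000):
    """Approximate int_t^inf M(s) ds, truncating the upper limit at t + 40/lambda."""
    return midpoint_integral(mass, t, t + 40.0 / LAM, n)


a, b = 1.0, 4.0
direct = mass(b) - mass(a)                                        # M(b) - M(a)
via_double = -LAM**2 * midpoint_integral(tail_integral, a, b, 200)  # double integral
print(direct, via_double)
```

Both values should agree to roughly four decimal places, which is consistent with the double integral being a consequence of $\dot{M}(t) \propto M(t)$ rather than something specific to densities.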

BTW. In this example the integral $\int_t^\infty {M}(s) ds$ does actually not relate to an integral of the CDF but instead it is an integral of the survival function.

So, in this example the double integral arises from the relationship $\dot{M}(t) \propto M(t)$ and it is not so much a double integral of the intensive property 'density'. There is a factor $\lambda$ with units $[1/time]$ which changes the extensive property 'amount of mass' into an intensive property 'loss rate'.

Plainly integrating two times the pdf has no meaning, and it gets only a meaning through the differential equation.

This indicates that for those examples where this double integral occurs we can use the actual physical meaning of the integral to 'give a name' to the double integral.

BTW, in your example the average radiation exposure (as a fraction) is

$$\dfrac{\text{CDF}(t_2) - \text{CDF}(t_1)}{t_2-t_1} \quad \text{with units} \frac{1}{[time]}$$

instead of

$$\dfrac{\int_{0}^{t_2}\text{CDF}(t)\,d t-\int_{0}^{t_1}\text{CDF}(t)\,d t}{t_2-t_1} \quad \text{with units} \frac{[time]}{[time]}$$

You can see this based on the units. The total fraction of radiation exposure is unitless. The average fraction of radiation exposure must have units $[1/time]$. The coefficient $\lambda$ is missing to give the expression the right dimensions.

Example 1

You can shift up and down one integral because the quantity is an integral of itself. This is also clear from the article that you link from the comments, 'Comparison of the gamma-Pareto convolution with conventional methods of characterising metformin pharmacokinetics in dogs', Journal of Pharmacokinetics and Pharmacodynamics, volume 47, pages 19–45 (2020).

In that article it is written

the average mass over the dose interval, which written from the survival function equals $\Delta S(t)/\tau$, i.e., $S \tau(i) = \frac{1}{\tau} \lbrace S[\tau(i-1)] - S(\tau i) \rbrace$, for $i=1,2,3, \dots$.

In the question you write

Then to find the average drug mass during a dosing interval, we need an integral average of the summed CCDF during that interval

which relates to the integral $\dfrac{\int_{0}^{t_2}\text{CDF}(t)\,d t-\int_{0}^{t_1}\text{CDF}(t)\,d t}{t_2-t_1}$

If you are looking for a name of this integral, then why not just use the name for the equivalent $\Delta S(t)/\tau$?

Sextus Empiricus
  • Re "problematic:" integrals of the CDF directly give what one might call "partial expectations." They can be used (with proper normalization) to give succinct formulas for expectations of truncated versions of the variable. – whuber Dec 16 '20 at 14:23
  • @whuber but that is the integral of $1-CDF$ right? There must be a question about that. – Sextus Empiricus Dec 16 '20 at 14:29
  • That's right. I establish a general formula for certain definite integrals of CDFs at https://stats.stackexchange.com/a/446404/919. Also closely related are https://stats.stackexchange.com/questions/413331, https://stats.stackexchange.com/questions/105509, https://stats.stackexchange.com/questions/222478, and https://stats.stackexchange.com/questions/18438 -- and I know there are more. – whuber Dec 16 '20 at 14:55
  • @SextusEmpiricus When the CDF is already a good fraction of a page in length, there is little motive for also defining $\Delta$CCDF as a separate function, as $\Delta$CCDF$=1-$CDF$(t_2)-[1-$CDF$(t_1)]=$CDF$(t_1)-$CDF$(t_2)$ – Carl Dec 16 '20 at 15:01
  • @Carl I do not get your last comment. Are you saying $\Delta CCDF = \Delta CDF$? – Sextus Empiricus Dec 16 '20 at 15:03
  • @SextusEmpiricus No. Notice the difference, $F(t_1)-F(t_2)\neq F(t_2)-F(t_1)$. – Carl Dec 16 '20 at 15:09
  • @Carl I see now that you meant $CCDF(t) = 1-CDF(t)$ and not $CCDF(t) = \int_0^t CDF(s) ds$. But, now that this formula is cleared up, I am actually still confused with the comment. What is the point? – Sextus Empiricus Dec 16 '20 at 15:15
  • @whuber Thank you. A mean integral on an interval would indeed be that interval's expectation. And Sextus Empiricus, there are lots of points, not the least of which is that physics is either considered a branch of statistics, or, statistics is a branch of physics. Take your choice. – Carl Dec 16 '20 at 15:25
  • I think you continue to confuse things, Carl: a mean integral over an interval of time would be an *average* value, to be sure, but it wouldn't be related to any given CDF. The confusion here is that you seem to be dealing with a *parameterized family of CDFs* rather than a single CDF and you are not integrating "the" CDF with respect to its variable; you are integrating the values of these CDFs over a range of parameters. Both your language and your notation are ambiguous and need clarifying. – whuber Dec 16 '20 at 16:35
  • @whuber I asked a simple question. What would you call the double integral of a PDF? Then all hell breaks loose with extraneous comments. So I gave examples. Then I get more extraneous comments, but no answers other than yours, which is the first and only insight relevant to the original question. I regret asking, next time I will just figure it out myself. In general to find a definite integral from $a$ to $b$, where $a – Carl Dec 17 '20 at 10:44
  • @whuber con't $a$. Now when the integrand is already a CDF (or $1-\text{CDF}=\text{CCDF}$) that implies that the double integral exists in some form and thus illustrates a usage of such things, which unnecessary aside was brought to you by the silly comment to the effect that there is no "use" for the double integral of a PDF. If you guys would spend more time thinking about the question than trying to pick it apart there would be fewer words, less confusion, and less nonsense. Please take your comment that I upvoted, post it as an answer and I will award the bounty to you. – Carl Dec 17 '20 at 10:57
  • @Carl you sound a bit upset and blame a lot of other people. For what it is worth, I have incorporated Whuber's comment with the 2nd and 3rd revisions to my answer, just a few minutes after his comment. I hope that you agree that finding a name for some new function requires gaining an intuition about its use. In my answer I explained that the examples show no use and the integrals of the CDF can be related just as well as integrals of the PDF because they relate to each other by a constant. If you are looking for a name, then adopt the name that is currently used for integrals of the PDF. – Sextus Empiricus Dec 17 '20 at 11:18
  • @SextusEmpiricus I know you are trying, but, for me, there is little or nothing in your answer that I did not already know. Sure, my question was not posited in the best possible light; I'll patch it up a bit. If you want me to accept your answer put in whuber's links, AND, do some thinking about what to call the double integral of a PDF, AND, put that in. – Carl Dec 17 '20 at 11:26
  • My comments are intended only to point out what appears to be a fundamental ambiguity in this question. IMHO, it won't have a good answer and likely can't be answered adequately unless the respondent makes the effort to refine and narrow the question. – whuber Dec 17 '20 at 15:09
  • @Carl, what you do *not* know is *how* others do not know (or interpret) your question. If my answer shows a lack of understanding, and as whuber says there is a lot of ambiguity, then you could try to improve the question-answer on your side as well (even if the question is not an answer it is a feedback about your concept). If it is so different to introduce a new concept that desires a new name does it deserve a new name? First the concept needs to be explained clearly such that others can understand it without too much trouble. – Sextus Empiricus Dec 17 '20 at 15:12
  • Read my answer as a Socratic question: "What do you *mean* with this double integral?". I placed it only as an answer because the comments were getting too difficult. What is the *point* of this double integral? (I believe that the examples do not show this well and that's the focus of my answer, which shows why) If you find this point then you have a clue about a good name for it. But at this moment, there is just the clinical term 'cumulative cumulative density function' or 'double cumulative density function', and there is no lead to replace it with something more intuitive. – Sextus Empiricus Dec 17 '20 at 15:18
  • @SextusEmpiricus I have done a lot of editing. Sure, I can see how some things were confusing, and have tried to address that. – Carl Dec 17 '20 at 15:19
  • @SextusEmpiricus The double integral of a PDF. Got a name for it? I'm totally confused by your asking what I *mean*. It's math. What would you have me say for meaning? It is what it is. It is useful, it is used, it could do with a name. The single integral of a PDF has a name; CDF. What have you got against asking for a name for the double integral? What further can I do to clarify this question? – Carl Dec 17 '20 at 15:21
  • @Carl I suggest the following name "double integral of a PDF". Unless there is something intuitive about this integral I do not see why we should aim for a different name. – Sextus Empiricus Dec 17 '20 at 15:23
  • @SextusEmpiricus How about ICD, integral of cumulative distribution? I like that one, it is short and says it all. I want an acronym. The reason why I don't like DIPDF, DID or other acronyms from the PDF is that the use of the double integral relates more to the CDF than the PDF. – Carl Dec 17 '20 at 15:31
  • @Carl ICD sounds *nice*. Sure, go ahead and use it. However, you're talking about an *entirely* new concept. Without explaining the concept, why a special name? I believe, we should first get a grasp of the concept and its meaning before we start introducing special names. Otherwise do it like the chemists with their system of naming chemicals (which may differ and you can have different nomenclature so 'integrated CDF' and 'double integrated PDF' might both work) or like the astronomers in naming new objects with some number. **Trivial names only come when the subject is special** – Sextus Empiricus Dec 17 '20 at 15:40
  • Carl, the problems are that (1) the phrase "double integral of a PDF" does not have a clear meaning; (2) your only attempt to define it mathematically is nonsensical (because "$dx$" appears twice in the integral); and (3) some of the text of your question suggests you don't have a simple PDF, but that you have a family of them over which you are integrating. Thus, *nobody can tell what you're trying to ask.* – whuber Dec 17 '20 at 15:56
  • @whuber Yeah there are problems. For multiple dosing using a repeat of the same dose, the unit dose mass (a CCDF), the algorithm for finding mean dose mass during the dosing interval is simpler because the average integral during the $n^{th}$ $\tau$, where $\tau$ is the temporal dosing interval duration, e.g., every 12 h, reduces to $\dfrac{\int_0^{n\,\tau}\text{CCDF}\,d\,t}{\tau}$, if superposition can be assumed. – Carl Dec 17 '20 at 17:27
  • To adjust this so that the same average dose mass is present during each dose interval, we multiply each dose by the dose scaling constants required to do that. This does not change the CCDF, but it does make things rather more complicated as the CCDF contribution from prior $\tau$ intervals, and the current dose must be independently scaled. – Carl Dec 17 '20 at 17:42
  • @Carl the article that you cited uses PDF instead of CDF $$\dfrac{\int_0^{n\,\tau}\text{PDF}\,d\,t}{\tau}$$ instead of $$\dfrac{\int_0^{n\,\tau}\text{CCDF}\,d\,t}{\tau}$$ and the trick to get a constant concentration is more similar to a convolution than straightforward double integration of the PDF or single integration of the CDF. – Sextus Empiricus Dec 17 '20 at 18:26
  • @SextusEmpiricus Yes, for mean concentration in a dosing interval, whereas for mean dose mass the second integral applies. – Carl Dec 17 '20 at 18:33
  • @SextusEmpiricus Look, that extraneous material was included only because you and others doubted that (1) there is any such thing (2) that there are uses for it. Whuber found a bunch of uses that better apply to statistics. That it is a perfectly general problem should no longer be in any doubt. That Example (1) is hard to understand, is because I told the truth about what I need the nomenclature for. Example (1) cannot be worked without the integral of unit drug mass, a CCDF. In that, it is not optional, it is necessary, that it is difficult to understand is because I was cut no slack. – Carl Dec 17 '20 at 18:44
  • @Carl the matter is not so difficult to understand (it's just double integrals). That is not the problem. The problem is that it is being sort of decorated with confusing terms, and there is *no* double integral of the CDF. Not in the examples, not in the linked article, neither in Whuber's links. – Sextus Empiricus Dec 17 '20 at 18:50
  • @Carl I believe that I might be the only person here that might invest some time in your question but I find your responses a bit toxic in the way that you respond to any criticism. How do you want me to stay interested? – Sextus Empiricus Dec 17 '20 at 18:58
  • @SextusEmpiricus There is no double integral of the CDF, there is a single integral of the CDF, which is then the double integral of a PDF, because the CDF is an integral of a PDF. Look, the examples I put in from what I am familiar with, which is not what most people want to see, but it's what I had to give as examples. Examples were not needed at all if one treats the question as honest and answers in the abstract. That I was not accorded the benefit of the doubt has led to busy work that I was not expecting to be put through. Also, communication is a two way street and you also should – Carl Dec 17 '20 at 21:10
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/117400/discussion-between-carl-and-sextus-empiricus). – Carl Dec 17 '20 at 21:10
  • You did a lot of work on this, so I am awarding you a +1 vote. – Carl Dec 19 '20 at 18:30

I am mentioning here one term for integral of CDF used by Prof. Avinash Dixit in his lecture note on Stochastic Dominance (which I happen to have very recently stumbled upon). Obviously, this is not a very generally accepted term otherwise it would have been discussed already on this thread.

He calls it the super-cumulative distribution function, and it is used in an equivalent definition of Second Order Stochastic Dominance. Let $X$ and $Y$ be two r.v.s such that $E(X) = E(Y)$ and which have the same bounded support. Further, let $S_x(\cdot), S_y(\cdot)$ be the respective super-cumulative distribution functions.

We say that $X$ is second order stochastic dominant over $Y$ iff $S_x(w) < S_y(w)$ for all values of $w$ in support of $X, Y$.

It may also be interesting to note that for First Order Stochastic Dominance, the condition gets simply replaced by CDF in place of super-cdf.
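
As an illustration of the definition, here is a minimal sketch with two distributions of my own choosing (they are not taken from the lecture note): $Y$ uniform on $[0,1]$ and $X$ symmetric triangular on $[0,1]$, both with mean $1/2$. The super-cumulative functions are obtained by numerically integrating each CDF; all function names are hypothetical. Because the means are equal, the two super-cumulative functions coincide at the upper end of the support, so the strict inequality is checked only at interior grid points.

```python
def cdf_uniform(w):
    """CDF of Y ~ Uniform(0, 1)."""
    return min(max(w, 0.0), 1.0)


def cdf_triangular(w):
    """CDF of X ~ symmetric triangular distribution on [0, 1] (mode 0.5)."""
    if w <= 0.0:
        return 0.0
    if w >= 1.0:
        return 1.0
    return 2.0 * w * w if w <= 0.5 else 1.0 - 2.0 * (1.0 - w) ** 2


def super_cumulative(cdf, grid):
    """Running trapezoidal integral of cdf from grid[0] up to each grid point."""
    values = [0.0]
    for left, right in zip(grid, grid[1:]):
        values.append(values[-1] + 0.5 * (cdf(left) + cdf(right)) * (right - left))
    return values


n = 1000
grid = [i / n for i in range(n + 1)]
S_x = super_cumulative(cdf_triangular, grid)
S_y = super_cumulative(cdf_uniform, grid)

# Second-order dominance check: S_x below S_y at every interior grid point.
x_dominates_y = all(S_x[i] < S_y[i] for i in range(1, n))
print(x_dominates_y, S_x[-1], S_y[-1])
```

Here $X$ is the less spread-out variable, so one would expect it to dominate $Y$ in the second-order sense; swapping the roles of the two CDFs should make the check fail.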

kjetil b halvorsen
Dayne
  • The lecture note link is not available to me when I follow the link. It may be that you have privileges for viewing it that are not shared. Perhaps you could obtain permission to share these notes. – Carl Dec 19 '20 at 02:55
  • @Carl: I think it should be accessible. You can try right clicking and click on 'save link as' as this link is directly to pdf file. Alternatively try [this](https://www.princeton.edu/~dixitak/Teaching/EconomicsOfUncertainty/Slides&Notes/) link and go to lecture note 4. – Dayne Dec 19 '20 at 03:03
  • The first procedure did not work, but the second did, reading it now. Suggest you link to https://www.princeton.edu/~dixitak/Teaching/EconomicsOfUncertainty/Slides&Notes/Notes04.pdf – Carl Dec 19 '20 at 03:30
  • Indeed, these notes call the integral of the CDF a "super-cumulative" distribution function on page 3, and propose using it to compare Second Order Stochastic Dominance. – Carl Dec 19 '20 at 15:15
  • Moreover, "super-cumulative" distribution function has been used as a term for a discrete RV in at least [one working paper](https://financetheory.org/public/storage/working_paper/00019-00.pdf). – Carl Dec 19 '20 at 15:31
  • And in [another paper](https://s18798.pcdn.co/dimitrilanda/wp-content/uploads/sites/7118/2020/04/Gordon_Landa_NCFS_12.1.pdf) in an expected value context. – Carl Dec 19 '20 at 15:40
  • @Carl: this second link says *"sometimes called ..."*. So it appears that more than one writer has used this terminology. – Dayne Dec 19 '20 at 15:45
  • Indeed. And I want an acronym. How does SCD look to you? – Carl Dec 19 '20 at 17:05
  • @Carl: Sure if it becomes more commonly used then SCD will surely come in fashion (maybe not as much as cdf - as that will remain more widely used). – Dayne Dec 19 '20 at 18:26