
I have a grid approximation of a cdf, $F_x$. The cdf has support on $x \ge 0$.

From there, calculating $E[x]$ is straightforward with standard numerical integration techniques.
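For concreteness, a minimal sketch of that step, assuming the grid cdf is tabulated as arrays `x_grid` / `F_vals` (an Exp(1) cdf is used here as a hypothetical stand-in for my empirical one), and using the identity $E[x] = \int_0^\infty (1 - F_x(t))\,dt$ for a nonnegative variable:

```python
import numpy as np

# Hypothetical grid cdf for a nonnegative variable: Exp(1) as a stand-in
# for the tabulated empirical cdf.
x_grid = np.linspace(0.0, 20.0, 2001)
F_vals = 1.0 - np.exp(-x_grid)

# For X >= 0, E[X] = integral of (1 - F(x)) dx over [0, inf),
# truncated here at the top of the grid.
E_x = np.trapz(1.0 - F_vals, x_grid)
```

For the Exp(1) stand-in this lands very close to the true mean of 1, with the error coming from the grid spacing and the truncation of the upper tail.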

In my case, however, I have a variable $y$, which is randomly distributed according to a pdf $f_y$.

$y$ defines the percentile in which $x$ is likely to be. The relationship is $y = F_x(x)$

My goal is to calculate $E[x]$ knowing that $y$ is distributed as $f_y$. Can I simply calculate $E[x]$ by integrating against $f_y$, mapping $y$ to $x$ via the quantile function $x = F_x^{-1}(y)$? Or does something get lost there (which is my concern)?
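Concretely, the computation I have in mind is $E[x] = \int_0^1 F_x^{-1}(y)\, f_y(y)\, dy$. A sketch under the same hypothetical Exp(1) stand-in, with the quantile function obtained by inverse interpolation of the grid cdf, and a uniform $f_y$ (which, if the approach is sound, should recover the plain $E[x]$):

```python
import numpy as np

# Hypothetical grid cdf: Exp(1) as a stand-in for the tabulated F_x.
x_grid = np.linspace(0.0, 20.0, 2001)
F_vals = 1.0 - np.exp(-x_grid)

# Quantile function by inverse interpolation of the grid cdf
# (valid because F_vals is strictly increasing here).
def quantile(y):
    return np.interp(y, F_vals, x_grid)

# E[x] = integral over (0,1) of F_x^{-1}(y) f_y(y) dy,
# with the endpoints clipped to avoid the cdf's flat tails.
y_grid = np.linspace(1e-6, 1.0 - 1e-6, 10001)
f_y = np.ones_like(y_grid)  # uniform f_y as a sanity check
E_x = np.trapz(quantile(y_grid) * f_y, y_grid)
```

Swapping `f_y` for a non-uniform density on $(0,1)$ is the case I am actually asking about.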

EDIT

I have been reflecting on the problem formulation.

In a Bayesian framework where $c$ represents client data and $s$ represents sales, we have:

$$p(s|c) = \frac{p(c|s)\, p(s)}{p(c)}$$

In the current setup, $p(s)$ is an empirical cdf, which looks very irregular and which I would not try to fit, while $c$ is a ranking on clients. I am wondering whether, in this scenario, we can get a numerical estimate of $E[s|c]$, and what role is played by the probability integral transform?
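As a sanity check on the probability integral transform part, a minimal sketch (with Exp(1) as a hypothetical stand-in for the sales distribution): applying the true cdf to draws from any continuous distribution yields values that look Uniform(0,1), regardless of the shape of the original distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draws from a hypothetical skewed sales distribution (Exp(1) stand-in).
s = rng.exponential(scale=1.0, size=100_000)

# Probability integral transform: apply the true cdf to the draws.
y = 1.0 - np.exp(-s)  # F_S(s) for Exp(1)

# The transformed values behave like Uniform(0,1):
# mean near 1/2, variance near 1/12.
print(y.mean(), y.var())
```

This is what makes me doubt that an estimated distribution of $y$ can carry extra information if $y$ is literally defined as $F_x(x)$.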

IcannotFixThis
  • That's fine. In general, if $y$ has some distribution and $x$ is some function of $y$, $h(y)=x$ then $E[x] = \int h(y) f_y dy$. This is the so called law of the unconscious statistician! – CloseToC Jun 10 '15 at 17:55
  • If $Y = F_X(X)$ then by the probability integral transform $y$ is necessarily distributed as a $U(0,1)$ Uniform, not according to some arbitrary distribution. – Alecos Papadopoulos Jun 10 '15 at 18:07
  • @Alecos: This is true. This is the missing piece. Thanks. But in my current setup I have, in some cases, information on the ranking, i.e. the distribution of $y$. How do I incorporate that? – IcannotFixThis Jun 10 '15 at 20:06
  • You mean you have information on _realizations_ of $Y$? – Alecos Papadopoulos Jun 10 '15 at 20:07
  • Yes, exactly. Different distributions of $y$ realizations for different populations – IcannotFixThis Jun 10 '15 at 20:14
  • It doesn't matter. As long as $Y$ is actually defined and generated as in my previous comment, any observed variability is just that: uninformative (or worse, misleading) sample variability. The underlying distribution of $Y$ will always be $U(0,1)$, giving in reality each percentile an equal chance of materializing, irrespective of the actual realizations. You cannot infer from it anything more about the distribution of $X$ and/or its expected value. – Alecos Papadopoulos Jun 10 '15 at 20:40
  • @AlecosPapadopoulos: Thanks. Let's assume that we have estimated $F_x$ based on some experiment and some data $\theta$. And that we have estimated the distribution of $y$ based on another experiment and data $\theta '$. I am not sure the probability integral transform rules out that the estimated distribution of $y$ could be informative in estimating $E[x]$. Am I wrong? – IcannotFixThis Jun 10 '15 at 21:19
  • Unfortunately, you are, because the distribution of $Y$ _if $Y$ is defined as above_, is _always_ $U(0,1)$, no matter what any estimation of it based on any finite sample may show. – Alecos Papadopoulos Jun 10 '15 at 22:01

0 Answers