
Consider two random variables $X$ and $Y$ with densities $${f}_1(x) = \frac{1}{n_1h_1} \sum\limits_{i=1}^{n_1}K\left(\frac{x-u_i}{h_1}\right) ~~~~\text{and} ~~~~ {f}_2(y) = \frac{1}{n_2h_2} \sum\limits_{j=1}^{n_2}K\left(\frac{y-v_j}{h_2}\right),$$ respectively, where $K$ is a kernel and $h_1$ and $h_2$ are the corresponding bandwidth parameters. How can one then write the joint density of $X$ and $Y$?
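For concreteness, here is a minimal numerical sketch of the two estimators above, assuming a Gaussian kernel $K$ and made-up samples standing in for the $u_i$ and $v_j$ (the bandwidths are illustrative only):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical samples standing in for the u_i and v_j of the question.
rng = np.random.default_rng(0)
u = rng.normal(0.0, 1.0, size=200)     # n1 = 200 points behind f_1
v = rng.normal(2.0, 0.5, size=300)     # n2 = 300 points behind f_2
h1, h2 = 0.3, 0.2                      # illustrative bandwidths

def f1(x):
    # (1 / (n1 * h1)) * sum_i K((x - u_i) / h1), with K the standard normal pdf
    return norm.pdf((np.atleast_1d(x) - u[:, None]) / h1).sum(axis=0) / (len(u) * h1)

def f2(y):
    # (1 / (n2 * h2)) * sum_j K((y - v_j) / h2)
    return norm.pdf((np.atleast_1d(y) - v[:, None]) / h2).sum(axis=0) / (len(v) * h2)

grid = np.linspace(-4.0, 4.0, 9)
print(f1(grid))
print(f2(grid))
```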

Shanks

2 Answers


It depends on what assumptions you are willing to make. Basically, you need a multivariate kernel. If you can assume that the variables are independent, you can use the product kernel

$$ \hat{f}_h(\mathbf{x}) = \frac{1}{n} \sum_{i=1}^n \prod_{j=1}^d \frac{1}{h_j} K\Big(\frac{x_j-\mathbf{X}_{ij}}{h_j}\Big) $$

It won't, however, do anything to handle dependence or correlation between the variables. If you can't assume independence, you need a proper multivariate kernel density estimator, defined in terms of multivariate kernels.
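For concreteness, a minimal sketch of the product-kernel estimator above, assuming a Gaussian kernel and made-up data (the function name and the numbers are illustrative only):

```python
import numpy as np
from scipy.stats import norm

def product_kernel_kde(X, h, x):
    """Evaluate the product-kernel KDE above at a single query point x.

    X : (n, d) data matrix, h : (d,) per-dimension bandwidths, x : (d,) query.
    A Gaussian kernel K is assumed.
    """
    X = np.asarray(X, dtype=float)
    # K((x_j - X_ij) / h_j) / h_j, for every observation i and dimension j
    k = norm.pdf((x - X) / h) / h        # shape (n, d)
    return k.prod(axis=1).mean()         # (1/n) * sum_i prod_j ...

# Made-up correlated data, just to show the call.
rng = np.random.default_rng(1)
data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=500)
print(product_kernel_kde(data, h=np.array([0.3, 0.3]), x=np.array([0.0, 0.0])))
```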

Tim
  • If we consider the joint density $${f}_{12}(x,y) = \frac{1}{n} \sum\limits_{i=1}^n |\pmb{H}|^{-1/2} \pmb{K}\left( \pmb{H}^{-1/2} \begin{pmatrix} x-u_i\\ y-v_i \end{pmatrix} \right),$$ where $\pmb{H} = \begin{pmatrix} h_1^2 & h_{12}\\ h_{12} & h_2^2 \end{pmatrix}$, I thought this puts bivariate Gaussian distributions only at the points $\{(u_i,v_i)^\top,~ i=1,\ldots,n\}$. Should the joint density not consider all the points $\{(u_i,v_j)^\top,~ i,j=1,\ldots,n\}$? (A numerical sketch of this estimator appears after the comment thread.) – Shanks Sep 27 '21 at 07:55
  • @Shanks not sure what you mean? It *does* consider all the points. It works the same as univariate KDE, except that instead of a univariate kernel it uses a multivariate kernel per each row of the dataset. – Tim Sep 27 '21 at 07:58
  • If we are defining a KDE on a set of vectors $\{\pmb{x}_1, \ldots, \pmb{x}_n\}$, then the joint density is correct. But, here, I am trying to form a joint density from two sets of univariate points. E.g., given a set of vectors $\{\pmb{x}_1, \ldots, \pmb{x}_n\}$, a joint Dirac delta is $\frac 1n \sum\limits_{i=1}^n\delta(\pmb{X}-\pmb{x}_i)$. But, if we have two densities $\frac 1n \sum\limits_{i=1}^n \delta (x-x_i)$ and $\frac 1n \sum\limits_{i=1}^n \delta (y-y_i)$, their joint density is $\frac{1}{n^2}\sum\limits_{i=1}^n \sum\limits_{j=1}^n \delta(x-x_i, y-y_j)$. – Shanks Sep 27 '21 at 08:13
  • In other words, this joint density puts mass only on the $n$ diagonal points, whereas it should put weight on all $n^2$ points. – Shanks Sep 27 '21 at 08:16
  • @Shanks but with univariate KDE you aren't "looking at all the points" either. Do you want to use KDE, or do you want to fit some other kind of nonparametric distribution to the data? With KDE you consider all the data by using a kernel per each datapoint. – Tim Sep 27 '21 at 08:40
  • @Shanks also it's not clear to me why you would want to look at disjoint pairs. Say that $x$ is height and $y$ is weight; looking at $(x_i, y_i)$ tells you how the $i$-th person's height relates to their weight. What would be the point of looking at how my height correlates with your weight and the weights of all the other people? – Tim Sep 27 '21 at 12:04
  • Simply assume I have $n_1$ points for the first KDE and $n_2$ points for the second KDE. Now, it should be clear. – Shanks Sep 27 '21 at 16:57
  • @Shanks and are the X and Y points independent of each other? – Tim Sep 27 '21 at 20:33
  • No, not necessarily. – Shanks Sep 27 '21 at 22:00
  • @Shanks what are X and Y in your question? Is it like the height of one group of people and the weight of another group of people? If so, that tells you nothing about the joint distribution: you can at best assume they are independent, but then you can't even use a multivariate KDE, you can just multiply the two univariate KDEs. – Tim Sep 28 '21 at 05:36
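For concreteness, a minimal sketch of the full-bandwidth-matrix estimator written out in the first comment above, assuming a bivariate Gaussian kernel and made-up paired data (the function name, the data, and the particular $\pmb{H}$ are illustrative only):

```python
import numpy as np

def bivariate_kde(points, H, query):
    """KDE with a full bandwidth matrix H, as in the first comment above.

    points : (n, 2) array of paired observations (u_i, v_i),
    H      : (2, 2) symmetric positive-definite bandwidth matrix,
    query  : (2,) evaluation point (x, y). A bivariate Gaussian kernel is assumed.
    """
    Hinv = np.linalg.inv(H)
    diff = query - points                               # rows (x - u_i, y - v_i)
    # K(H^{-1/2} d) = exp(-d' H^{-1} d / 2) / (2*pi) for the Gaussian kernel
    quad = np.einsum('ij,jk,ik->i', diff, Hinv, diff)
    kernels = np.exp(-0.5 * quad) / (2.0 * np.pi)
    return kernels.mean() / np.sqrt(np.linalg.det(H))   # (1/n) sum_i |H|^{-1/2} K(...)

# Made-up paired data; H collects h1^2, h2^2 and the cross term h12.
rng = np.random.default_rng(2)
uv = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=400)
H = np.array([[0.09, 0.02], [0.02, 0.09]])
print(bivariate_kde(uv, H, np.array([0.0, 0.0])))
```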

Kernel density estimation (KDE) gives you the marginal density function: $$f_1(x_1)=\int_{X_2} f(x_1,x_2)\,dx_2\approx f_K(x_1),$$ where $f(x_1,x_2)$ is the joint density function and $f_K$ is your KDE. Unfortunately, in the general case it is impossible to recover the joint density solely from the marginals, i.e. without any additional assumptions. There are special cases where it is possible, for instance when the variables are independent.

Sklar's theorem states that a joint distribution can be constructed from its marginals plus a copula. For instance, in the case of a Gaussian distribution, the copula is determined simply by the correlation matrix of the variables. The problem is that Sklar's theorem doesn't tell us how to estimate this copula. However, if you somehow knew it, you could plug your marginals into the copula and get the joint distribution.
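As a sketch of that last step (plugging KDE marginals into a known copula), here is a minimal example assuming a bivariate Gaussian copula with a given, not estimated, correlation $\rho$ and made-up samples behind the two marginals:

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, multivariate_normal

# Hypothetical samples behind the two KDE marginals.
rng = np.random.default_rng(3)
u = rng.normal(0.0, 1.0, size=300)
v = rng.gamma(2.0, 1.0, size=300)
kde_x, kde_y = gaussian_kde(u), gaussian_kde(v)

def joint_density(x, y, rho):
    """f(x, y) = c(F1(x), F2(y); rho) * f1(x) * f2(y) for a Gaussian copula c."""
    f1, f2 = kde_x(x)[0], kde_y(y)[0]                   # KDE marginal densities
    F1 = kde_x.integrate_box_1d(-np.inf, x)             # KDE marginal CDFs
    F2 = kde_y.integrate_box_1d(-np.inf, y)
    z1, z2 = norm.ppf(F1), norm.ppf(F2)
    # Gaussian copula density: phi_2(z1, z2; rho) / (phi(z1) * phi(z2))
    num = multivariate_normal.pdf([z1, z2], mean=[0.0, 0.0],
                                  cov=[[1.0, rho], [rho, 1.0]])
    return num / (norm.pdf(z1) * norm.pdf(z2)) * f1 * f2

print(joint_density(0.0, 2.0, rho=0.5))   # rho is assumed known, not estimated
```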

By the way, copulas are a very common technique in some fields, and in some applications they do use KDEs as the marginals. An excellent short paper that details applications of copulas in finance, "Copulas for Finance: A Reading Guide and Some Applications" by Eric Bouyé et al., has all the equations you need to get the joint distribution.

Aksakal
  • (+1) Copulas are quite natural in this context with the assumption of a joint kernel $K(X,Y)$. These intro lecture notes may be helpful too: http://www.columbia.edu/~mh2078/QRM/Copulas.pdf – msuzen Sep 27 '21 at 19:38