Let's say one has a finite collection of i.i.d. samples from an unknown source distribution $S=\{x_{i} | i \in [1,n_{S}], x_{i} \sim p_{X_{S}}(x)\}$. Where each $x$ is multidimensional and has continuous and discrete components.
One also has another finite collection of i.i.d. samples from an unknown target distribution $T=\{x_{j} | i \in [1,n_{T}], x_{j} \sim p_{X_{T}}(x) \in \mathbb{R}^{d}\}$ with matching features (same dimensions and same levels/range for each component).
What would be a mathematical method to optimally subsample $S$ into a collection $S'$ so that $S'$ and $T$ are the closest statistically (in whatever mathematical sense where this problem is the easiest : E.M distance/KL, etc.) ?
All I can think of is some kind of rejection sampling method but the details of it are not clear to me.
Is it possible to add some constraints over $S'$ like its size ? (sequential sampling method ?)
I am looking for ideas or pointers towards literature.