Assume one has two data samples: the first, $X = x_{1}, \dots, x_{n}$, and the second, $Y = y_{1}, \dots, y_{m}$, with $m \ll n$. I want to check whether $Y$ was generated by the same data generating process (DGP) as $X$. One direct approach is to apply the two-sample Kolmogorov-Smirnov (KS) test to the two samples: if the p-value is very small, the hypothesis of a common DGP is rejected.
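For concreteness, here is a minimal sketch of that direct approach, assuming SciPy's `ks_2samp` is used. The helper name `ks_same_dgp`, the threshold `alpha = 0.05`, and the synthetic normal data are only illustrative assumptions, not part of my actual data.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_same_dgp(x, y, alpha=0.05):
    """Two-sample KS test; returns (statistic, p-value, reject H0?)."""
    res = ks_2samp(x, y)
    return res.statistic, res.pvalue, bool(res.pvalue < alpha)


# Synthetic stand-ins for X and Y (assumed for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=5000)                  # large sample, n = 5000
y_shifted = rng.normal(loc=5.0, size=50)   # small sample from a clearly different DGP

stat, pval, reject = ks_same_dgp(x, y_shifted)
print(f"D = {stat:.3f}, p = {pval:.3g}, reject H0: {reject}")
```

With such a large shift the test rejects easily; the interesting cases are of course the ones where the difference is subtle and $m$ is small.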
Another approach that comes to mind is the following:
From the first set $X$, draw several bootstrap samples of size $m$. Then test each of these samples against $Y$, which yields multiple tests and multiple p-values.
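A rough sketch of what I have in mind, again assuming SciPy's `ks_2samp`. The function name `bootstrap_ks_pvalues`, the number of resamples `n_boot`, and the synthetic data are hypothetical choices for illustration. Note that the resulting p-values all share the same $X$ and $Y$, so they are not independent of each other.

```python
import numpy as np
from scipy.stats import ks_2samp


def bootstrap_ks_pvalues(x, y, n_boot=200, seed=0):
    """Resample size-m subsets of x (with replacement) and KS-test each against y."""
    rng = np.random.default_rng(seed)
    m = len(y)
    pvals = np.empty(n_boot)
    for b in range(n_boot):
        x_star = rng.choice(x, size=m, replace=True)  # bootstrap sample of size m
        pvals[b] = ks_2samp(x_star, y).pvalue
    return pvals


# Synthetic stand-ins for X and Y (assumed for illustration only).
rng = np.random.default_rng(42)
x = rng.normal(size=5000)
y = rng.normal(size=50)

pvals = bootstrap_ks_pvalues(x, y)
print(pvals.shape, pvals.min(), pvals.max())
```

This only produces the collection of p-values; how to combine them into a single decision is exactly the open question.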
What would be a correct procedure for doing this? I do not think that simply averaging the p-values would make much sense...
Edit: I am aware that, if the null hypothesis is true, the p-value is a uniformly distributed random variable. Therefore, for a significance level $\alpha$, about $\alpha \cdot 100\%$ of the p-values will be less than $\alpha$ and about $(1-\alpha) \cdot 100\%$ will be greater.
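This can be checked with a small simulation. The sketch below assumes independent pairs of samples drawn from the same standard normal distribution; the sizes `n`, `m`, the number of replications `n_rep`, and the seed are arbitrary. Because the KS statistic is discrete in finite samples, the observed fraction is only approximately $\alpha$ and tends to fall slightly below it.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
alpha = 0.05
n, m, n_rep = 200, 50, 1000  # illustrative sizes, not taken from real data

# Each replication draws fresh, independent X and Y from the same DGP (H0 true).
pvals = np.array([
    ks_2samp(rng.normal(size=n), rng.normal(size=m)).pvalue
    for _ in range(n_rep)
])

frac_below = np.mean(pvals < alpha)
print(f"fraction of p-values below {alpha}: {frac_below:.3f}")
```

Unlike in this simulation, the p-values from the bootstrap scheme above all reuse the same $Y$, so this uniformity argument does not transfer to them directly.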