I have two sets of vectors and want a differentiable measure that quantifies, or at least approximates, how separable the two sets are. Ideally this metric would correlate well with the performance of an RF trained to separate the data.
Looking online, I found the Bhattacharyya distance, which looks like what I want, except that it is defined for distributions. According to Wikipedia, "it is used to measure the separability of classes in classification." I tried using it, but because my vectors are high-dimensional, the sample covariance matrices are singular and the result is undefined.
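The failure can be reproduced with a minimal sketch. It assumes the closed-form Bhattacharyya distance between two Gaussians fitted to each set; the function name `bhattacharyya_gaussian`, the synthetic data, and the sample sizes are all illustrative and not taken from the actual problem.

```python
import numpy as np

def bhattacharyya_gaussian(x1, x2):
    """Bhattacharyya distance between Gaussians fitted to two sample sets.

    Rows are samples, columns are features. Illustrative helper only.
    """
    mu1, mu2 = x1.mean(axis=0), x2.mean(axis=0)
    s1 = np.cov(x1, rowvar=False)
    s2 = np.cov(x2, rowvar=False)
    s = 0.5 * (s1 + s2)  # averaged covariance
    diff = mu1 - mu2
    # Mean term: needs the inverse of s, so s must be non-singular.
    mean_term = 0.125 * diff @ np.linalg.solve(s, diff)
    # Covariance term: needs log-determinants of s, s1 and s2.
    _, logdet_s = np.linalg.slogdet(s)
    _, logdet_s1 = np.linalg.slogdet(s1)
    _, logdet_s2 = np.linalg.slogdet(s2)
    cov_term = 0.5 * (logdet_s - 0.5 * (logdet_s1 + logdet_s2))
    return mean_term + cov_term

rng = np.random.default_rng(0)

# Low-dimensional case (assumed sizes): many more samples than features.
low_a = rng.normal(0.0, 1.0, size=(500, 3))
low_b = rng.normal(3.0, 1.0, size=(500, 3))
low_dim_distance = bhattacharyya_gaussian(low_a, low_b)

# High-dimensional case (assumed sizes): fewer samples than features.
n_samples, n_features = 20, 100
high_a = rng.normal(0.0, 1.0, size=(n_samples, n_features))
cov_rank = np.linalg.matrix_rank(np.cov(high_a, rowvar=False))
print(low_dim_distance, cov_rank, n_features)
```

With 20 samples in 100 dimensions, the sample covariance has rank at most 19, so its determinant is zero and its inverse does not exist; both terms of the formula break down.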
Any suggestions for what metric I might be able to use instead?