I have a data set of 10000 samples in 40 dimensions, generated by expensive computational model evaluations. It is a mixture of several sources: random runs, Latin hypercube DOE, radial design DOE, linear parameter studies, and, for a large part, the history data produced by several optimization runs using genetic algorithms.
My thought was that a large part of the function evaluations generated during the genetic algorithm runs could somehow be combined with the random and Latin hypercube samples, in order to obtain a larger sample set for a variance-based sensitivity analysis.
I came up with two ideas, but I am an engineer, not a mathematician:
1) Compute the covariance (or correlation) matrix of the total sample matrix and filter out samples until the off-diagonal terms are smaller than some threshold, to avoid correlations between the inputs (see the first sketch below).
2) Apply some sort of minimum distance filter, so that regions with tightly clustered samples are thinned out (see the second sketch below).
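
To make idea 1 concrete, here is roughly what I have in mind (a rough sketch only, assuming the pooled samples sit in a 10000 x 40 NumPy array `X`; the threshold, the greedy removal criterion and the candidate subsampling are placeholder choices I made up, and I don't know whether this is statistically sound):

```python
import numpy as np

def max_offdiag_corr(X):
    """Largest absolute off-diagonal entry of the correlation matrix of X (rows = samples)."""
    C = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(C, 0.0)
    return np.max(np.abs(C))

def greedy_decorrelate(X, threshold=0.05, candidates_per_step=200, seed=0):
    """Greedily drop samples until every pairwise input correlation is below `threshold`.
    Only a random subset of rows is tried as removal candidates at each step, to keep
    the loop tractable for ~10000 samples. Returns the indices of the retained rows."""
    rng = np.random.default_rng(seed)
    keep = np.arange(X.shape[0])
    while max_offdiag_corr(X[keep]) > threshold and len(keep) > X.shape[1] + 1:
        trial = rng.choice(len(keep), size=min(candidates_per_step, len(keep)), replace=False)
        # score each candidate: max off-diagonal correlation after removing that row
        scores = [max_offdiag_corr(np.delete(X[keep], j, axis=0)) for j in trial]
        keep = np.delete(keep, trial[int(np.argmin(scores))])
    return keep
```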
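
And a similarly rough sketch of idea 2, the minimum distance filter (the radius `r_min` and the Euclidean metric in the min-max scaled unit cube are again just placeholders I picked for illustration):

```python
import numpy as np

def distance_thin(X, r_min=0.05):
    """Keep a sample only if it lies farther than `r_min` (Euclidean distance in the
    min-max scaled unit hypercube) from every sample kept so far, which thins out
    the densely populated regions left behind by the GA history."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = (X - lo) / np.where(hi > lo, hi - lo, 1.0)   # scale each input to [0, 1]
    kept_pts = np.empty((0, U.shape[1]))
    kept_idx = []
    for i, u in enumerate(U):
        if kept_pts.shape[0] == 0 or np.linalg.norm(kept_pts - u, axis=1).min() > r_min:
            kept_idx.append(i)
            kept_pts = np.vstack([kept_pts, u])
    return np.array(kept_idx)
```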
Would that be sufficient? Are there any tests for randomness that I could use? The problem is that I don't know the right terminology; maybe ready-to-run methods for this kind of problem already exist, but I don't know how to find them because I don't know their names.
I am thankful for any helpful suggestions.