In Machine Learning (ML) tasks, one splits the dataset into training and test sets. We train the ML model based on the training test, and then we evaluate the performance of the model with the test set.
It is always crucial to have "independent" test set, which are those samples that the model has not seen during the training phase.
In research, it is significant to prove that the test set is independent. However, this question arises:
How do we guarantee the "independence" of our test set?