I am creating artificial missing values in a dataset using the two well-known mechanisms MAR and NMAR. I want to validate what I create, but I can't find any statistical test that, given the observed, missing, and complete data, can tell me whether MAR holds.
My idea is that under MAR the distributions of the complete and the observed data should stay pretty much the same, so I tried the two-sample KS test and the t-test between the complete and the observed data... but they always reject the MAR hypothesis.
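The comparison described above can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the exact code that was run: the variable names, the correlation between K and Y, the sample size, and the deletion probability are all assumptions. It also shows one situation in which a rejection is expected. When K is correlated with Y, deleting K values based on Y shifts the observed marginal distribution of K even though the mechanism is MAR, because equality of the observed and complete marginals is implied by MCAR, not by MAR.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 5000

# Synthetic complete data (assumption): K is correlated with Y.
y = rng.normal(size=n)
k = 0.8 * y + rng.normal(scale=0.6, size=n)

# MAR deletion: K is removed with probability 0.4 in rows where Y < mean(Y).
# Missingness depends only on Y, never on K itself.
drop = (y < y.mean()) & (rng.random(n) < 0.4)
k_observed = k[~drop]

# Compare the complete K column with its observed part.
# Note: the observed sample is a subset of the complete one, so the two
# samples are not independent; the p-values are only indicative.
ks = stats.ks_2samp(k, k_observed)
tt = stats.ttest_ind(k, k_observed, equal_var=False)

print(f"missing rate: {drop.mean():.2%}")
print(f"KS p-value:   {ks.pvalue:.3g}")
print(f"t-test p-value: {tt.pvalue:.3g}")
```

If the `0.8` coefficient is set to `0` (K independent of Y), the observed and complete distributions of K should look alike again, which suggests the rejection is driven by the K-Y dependence rather than by a fault in the generator.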
Any help would be appreciated. Thanks!
P.S. I know that MAR is not testable in general, but in this case I do know the values of the missing data!
To be more precise: given a complete dataset, I am generating MAR clones with 10 to 20% missing values. The procedure is the following. I choose two random columns, Y and K, and a random evaluation rule that is applied to each value in column Y (value < mean, value > threshold, or some other function). For each row R where the rule evaluates to true, I remove the entry of column K in row R with a certain probability p. In this way the "missingness" of the values in column K does not depend on K itself but only on the fully observed column Y (which is the definition of MAR). I now need a statistical procedure to test this "missing-value generator". In other words, given a sample dataset Z (and the original dataset, if needed), I want to be able to say (with a certain p-value, of course) whether the artificial MAR holds or not.
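A minimal sketch of the generator described above, assuming pandas and NumPy. The function name `make_mar`, its parameters, the default rule (value below the column mean), and the synthetic example data are hypothetical and only illustrate the procedure:

```python
import numpy as np
import pandas as pd

def make_mar(df, target_col, driver_col, p=0.3, rule=None, seed=0):
    """Return a copy of df in which entries of target_col are removed
    with probability p in the rows where rule(df[driver_col]) is True."""
    rng = np.random.default_rng(seed)
    if rule is None:
        # Assumed default rule: driver value below its column mean.
        rule = lambda col: col < col.mean()
    out = df.copy()
    eligible = rule(out[driver_col]).to_numpy()       # rule evaluated on Y
    drop = eligible & (rng.random(len(out)) < p)      # remove with probability p
    out.loc[drop, target_col] = np.nan                # only K is touched
    return out

# Synthetic complete dataset (assumption, for illustration only).
rng = np.random.default_rng(42)
complete = pd.DataFrame({
    "Y": rng.normal(size=2000),
    "K": rng.normal(size=2000),
})

clone = make_mar(complete, target_col="K", driver_col="Y", p=0.3)
missing_rate = clone["K"].isna().mean()
print(f"missing rate in K: {missing_rate:.2%}")
```

With a rule that selects roughly half of the rows and p = 0.3, the missing rate lands around 15%; p and the rule would need tuning to hit a specific rate in the 10 to 20% range.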