Fixing a deliberately biased sample

Question

I learned today that a staff member at my company deliberately biased a sample set. They selected items for the sample that were known to be different from (more positive than) the population as a whole.

I am now trying to understand the best way of dealing with it while minimizing rework.

Unfortunately, reworking any of the items in the test set is expensive and time-consuming.

We are testing computerized object recognition in complex environments. We score the computerized system and compare its output to human-categorized images. It takes a human half a day to score a single sample item for this test.

So, I'd like to keep as many of the items already scored as part of my new sample set as possible.

I would like advice on the best way to do this.

My thought so far is to treat the situation as a stratified sample in which some strata are already complete while others are not. I would then randomly select additional items that fit the definition of the incomplete strata.
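To make the idea concrete, here is a minimal sketch of how the top-up could be planned. It assumes every item in the population can be assigned to a stratum before the expensive scoring, and that within a stratum the already-scored items can be treated as if they were picked at random. The stratum labels, the counts, and the function names `top_up_plan` and `draw_top_up` are all hypothetical.

```python
import random


def top_up_plan(pop_counts, scored_counts):
    """Extra items needed per stratum so that, keeping every scored item,
    the final sample is roughly proportional to the population."""
    N = sum(pop_counts.values())
    # Smallest total size at which no stratum is over-represented
    # (integer ceiling division avoids floating-point surprises).
    n = max(-(-scored_counts.get(h, 0) * N // pop_counts[h]) for h in pop_counts)
    extras = {}
    for h, N_h in pop_counts.items():
        target = -(-n * N_h // N)  # proportional allocation, rounded up
        extras[h] = target - scored_counts.get(h, 0)
    return extras


def draw_top_up(pop_strata, scored_ids, extras, seed=0):
    """Randomly pick the extra items from the not-yet-scored part of each stratum."""
    rng = random.Random(seed)
    picks = {}
    for h, k in extras.items():
        candidates = sorted(i for i, s in pop_strata.items()
                            if s == h and i not in scored_ids)
        picks[h] = rng.sample(candidates, k)
    return picks


# Hypothetical numbers: 1000 items in the population, 10 already scored,
# with the scored items skewed toward the "high" stratum.
pop_counts = {"low": 600, "mid": 300, "high": 100}
scored_counts = {"low": 2, "mid": 3, "high": 5}
extras = top_up_plan(pop_counts, scored_counts)
# extras == {"low": 28, "mid": 12, "high": 0}
```

With these made-up numbers, keeping all 10 scored items and restoring proportionality would take 40 more items, which at half a day each is a lot of rework. That is the trade-off I am trying to weigh.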

At any rate, please share your thoughts and advice. Thanks in advance!

No! Say it isn't so that someone would select a (very) biased sample! But more seriously, you might want to read about propensity scoring which attempts to "correct" for a biased sample. That's an oversimplification but it might help your situation. — JimB, Aug 21 '15 at 05:28
I looked this up online, and I think some of the ideas fit my situation, although the overall approach seems to target a different type of problem. I do have a confounder. Every item in the population received preliminary human scores; one of those scores was used to bias the sample, and it is correlated with what we want to measure in the more sophisticated analysis. — JZZ, Aug 21 '15 at 15:34
So now I need to assess the distribution of that confounder in the biased sample and compare it to the population distribution. Where there are "gaps" in the sample, I need to supplement it with new items from the population to get similar distributions. Thoughts? Other approaches to suggest? — JZZ, Aug 21 '15 at 15:38
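The comparison described in the comments can be sketched roughly as follows. This is only an illustration: it assumes the preliminary score has been binned into a few strata, and the labels, counts, and scores are invented. Each stratum's weight is its population share divided by its sample share, so a weight above 1 marks a "gap" and a weight below 1 marks over-representation. This is plain post-stratification weighting, a simpler relative of the propensity scoring mentioned above, not propensity scoring itself.

```python
def poststrat_weights(pop_counts, sample_counts):
    """Per-stratum weight = population share / sample share.
    Strata with no sampled items cannot be reweighted and are skipped;
    they can only be covered by scoring new items."""
    N = sum(pop_counts.values())
    n = sum(sample_counts.values())
    return {h: (pop_counts[h] * n) / (N * c)
            for h, c in sample_counts.items() if c > 0}


def weighted_mean(values, strata, weights):
    """Mean of the expensive scores, reweighted toward the population mix."""
    num = sum(weights[s] * v for v, s in zip(values, strata))
    den = sum(weights[s] for s in strata)
    return num / den


# Hypothetical data: 10 scored items, skewed toward the "high" stratum.
pop_counts = {"low": 600, "mid": 300, "high": 100}
sample_counts = {"low": 2, "mid": 3, "high": 5}
strata = ["low"] * 2 + ["mid"] * 3 + ["high"] * 5
values = [0.4, 0.5, 0.6, 0.6, 0.6, 0.9, 0.9, 0.9, 0.9, 0.9]

weights = poststrat_weights(pop_counts, sample_counts)
raw = sum(values) / len(values)
adjusted = weighted_mean(values, strata, weights)
```

With these invented numbers the unweighted mean is about 0.72 and the reweighted mean is about 0.54. Reweighting needs no new scoring, but it leans heavily on the few items in the under-represented strata, so the estimate gets noisier there; adding items to those strata is what reduces that noise.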
