Preface
I work at an e-commerce SaaS company. I have been asked to analyze the relationship between the behavior of potential customers during free trials and their conversion to paying customers.
While analyzing the data, I learned that our data collection systematically fails to capture certain behaviors of some prospects/customers.
I have communicated this bias in the data to my stakeholders, but I have been unable to convince them of the risks of making decisions based on analyses of bad data. I think part of the issue is that they believe examples from outside the business domain do not apply to us.
Question
What is a recent (i.e. within the last 10 years) example of statistics or data science producing negative consequences because it relied on bad data?
I am seeking a reference (article, white paper, journal publication, etc.) describing a recent example in the business/commercial domain in which using flawed data led to a significantly negative business outcome. Here I define "bad data" as data that misrepresents reality due to collection failures.
Questions I have already looked at, but which do not answer mine:
- Improper use of statistical tools: about methods/algorithms, not bad or biased data
- Dewey Defeats Truman: political domain (i.e. not business), and too old
- References on the misuse of stats in a business context: closed and unanswered
- Examples of wrong or crazy inferences being drawn from big data: public health domain (i.e. not business)