In light of the article "Data Science Has Become About Lending False Credibility To Decisions We've Already Made," published in Forbes, I would appreciate input from the statistical and data science community:
1. What can be done to ensure credibility in findings based on machine learning and big data?
2. Is hypothesis testing inherently superior and more credible than machine learning?
The article begins with:
One of the greatest failures of data science (...) [i]t no longer matters what our data actually says or whether the data we are using is in any way relevant to the questions we ask of it. All that matters is that we can justify our preordained decisions with the certainty of “data.”
As we rapidly undermine the promise of data science, will our trust in data fade with it?
[O]ur era of searching data for answers has devolved into searching data until we find support for the answer we've already decided upon.
It concludes by saying:
Putting this all together, data science is no longer about analyzing data or giving our data the opportunity to speak to us.
Most dangerously, it has become about the misuse of statistics, data, research methodologies and the scientific method to lend false credibility to decisions that have already been made.
We no longer devise a hypothesis and test it using data. We start with the conclusion we want and find the data and methods to support it.
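To make the quoted concern concrete, here is a minimal Python sketch (my own illustration, not from the article; the sample sizes, feature count, and 0.05 threshold are arbitrary choices) of what "searching data until we find support" can look like: testing many features of pure noise against an outcome until one appears statistically significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Pure noise: there is no real relationship between the outcome and any feature.
n_samples, n_features = 200, 50
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)

# "Search the data": test every feature against the outcome and keep the best-looking one.
p_values = [stats.pearsonr(X[:, j], y)[1] for j in range(n_features)]
best = int(np.argmin(p_values))

print(f"Smallest p-value: {p_values[best]:.4f} (feature {best})")
print(f"Features 'significant' at 0.05: {sum(p < 0.05 for p in p_values)}")
# With 50 tests on noise we expect roughly 2-3 spurious "findings" at the 0.05 level,
# so an analyst who only reports the winning test can almost always produce support
# for a preordained conclusion unless the search is disclosed and corrected for
# (e.g., multiple-comparison adjustment or confirmation on held-out data).
```

Run repeatedly with different seeds, the smallest p-value falls below 0.05 in the large majority of runs even though nothing real is present, which is one mechanism by which data can be made to "support" a conclusion chosen in advance.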