These issues have been known for a long time. The problem started in education research and psychology and has since spread even to physics. There is no one in particular to blame, and apparently nothing can stop it.
We are quite in danger of sending highly trained and highly
intelligent young men out into the world with tables of erroneous
numbers under their arms, and with a dense fog in the place where
their brains ought to be. In this century, of course, they will be
working on guided missiles and advising the medical profession on the
control of disease, and there is no limit to the extent to which they
could impede every sort of national effort.
Fisher, R A (1958). "The Nature of Probability". Centennial Review 2: 261–274.
The usual application of statistics in psychology consists of testing
a "null hypothesis" that the investigator hopes is false. For example,
he tests the hypothesis that the experimental group is the same as
the control group even though he has done his best to make them
perform differently. Then a "significant" difference is obtained which
shows that the data do not agree with the hypothesis tested. The
experimenter is then pleased because he has shown that a hypothesis he
didn't believe, isn't true. Having found a "significant difference,"
the more important next step should not be neglected. Namely,
formulate a hypothesis that the scientist does believe and show that
the data do not differ significantly from it. This is an indication
that the newer hypothesis may be regarded as true. A definite
scientific advance has been achieved.
MATHEMATICAL SOLUTIONS FOR PSYCHOLOGICAL PROBLEMS. HAROLD GULLIKSEN. American Scientist, Vol. 47, No. 2 (JUNE 1959), pp. 178-201.
The major point of this paper is that the test of significance does
not provide the information concerning psychological phenomena
characteristically attributed to it; and that, furthermore, a great
deal of mischief has been associated with its use. What will be said
in this paper is hardly original. It is, in a certain sense, what
"everybody knows." To say it "out loud" is, as it were, to assume the
role of the child who pointed out that the emperor was really
outfitted only in his underwear. Little of that which is contained in
this paper is not already available in the literature, and the
literature will be cited.
THE TEST OF SIGNIFICANCE IN PSYCHOLOGICAL RESEARCH. DAVID BAKAN. Psychological Bulletin. VOL. 66, No. 6. DECEMBER 1966.
The puzzle, sufficiently striking (when clearly discerned) to be
entitled to the designation “paradox,” is the following: In the
physical sciences, the usual result of an improvement in experimental
design, instrumentation, or numerical mass of data, is to increase the
difficulty of the “observational hurdle” which the physical theory of
interest must successfully surmount; whereas, in psychology and some
of the allied behavior sciences, the usual effect of such improvement
in experimental precision is to provide an easier hurdle for the
theory to surmount. Hence what we would normally think of as improvements
in our experimental method tend (when predictions materialize)
to yield stronger corroboration of the theory in physics, since to
remain unrefuted the theory must have survived a more difficult test;
by contrast, such experimental improvement in psychology typically
results in a weaker corroboration of the theory, since it has now been
required to survive a more lenient test.
THEORY-TESTING IN PSYCHOLOGY AND PHYSICS: A METHODOLOGICAL PARADOX. PAUL E. MEEHL. Philosophy of Science, 1967, Vol. 34, 103–115.
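One ingredient of Meehl's argument can be sketched numerically: if the null hypothesis of "no difference" is never exactly true, then simply collecting more data makes a "significant" result close to certain, so passing the test says less and less about the theory. The snippet below is only an illustration under assumptions of my own, not anything taken from the papers quoted above: a two-sided two-sample z-test at alpha = 0.05, a tiny standardized difference of 0.05, and a hypothetical helper named power_two_sample_z.

```python
import math

Z_CRIT = 1.959964  # two-sided critical value for alpha = 0.05 (assumed level)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_two_sample_z(effect_size, n_per_group):
    """Probability of rejecting 'no difference' in a two-sided two-sample z-test.

    effect_size is the true standardized mean difference between the groups
    (an assumed value, chosen for illustration only).
    """
    # Expected value of the z statistic when the true difference is effect_size.
    shift = effect_size * math.sqrt(n_per_group / 2.0)
    # Rejection happens in either tail of the test.
    return normal_cdf(shift - Z_CRIT) + normal_cdf(-shift - Z_CRIT)


# A negligible true difference becomes almost certain to be "significant"
# once the sample is large enough.
for n in (20, 200, 2000, 20000, 200000):
    print(n, round(power_two_sample_z(0.05, n), 3))
```

With a difference this small the rejection rate starts near the nominal 5% and climbs toward 1 as the group size grows, which is the sense in which better precision gives the psychological theory an easier hurdle rather than a harder one.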