This seems to be a qualitative way of expressing the loss of confidence in the p-values that frequentist hypothesis testing uses to quantify the significance of results.
First, p-values are notoriously difficult to interpret/explain. Quoting Andrew Gelman...
The casual view of the P value as posterior probability of the truth of the null hypothesis is false and not even close to valid under any reasonable model, yet this misunderstanding persists even in high-stakes settings. ... The formal view of the P value as a probability conditional on the null is mathematically correct but typically irrelevant to research goals (hence, the popularity of alternative—if wrong—interpretations).
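To see concretely why a p-value is not the posterior probability of the null, here is a small simulation (my illustration, not Gelman's, and it rests on two stated assumptions: a 50% prior chance that the null is true, and a modest true effect when it is false). Among experiments whose p-value lands just under 0.05, the null is still true far more than 5% of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n = 100_000, 30   # number of simulated experiments, sample size each
effect = 0.5              # assumed effect size when the null is false

# Each experiment: the null (mean = 0) is true with prior probability 0.5
h0_true = rng.random(n_sims) < 0.5
true_means = np.where(h0_true, 0.0, effect)
data = rng.normal(true_means[:, None], 1.0, size=(n_sims, n))

# Two-sided one-sample t-test of mean = 0 for every experiment
_, p = stats.ttest_1samp(data, 0.0, axis=1)

# Among "just significant" results, how often is the null actually true?
borderline = (p > 0.03) & (p < 0.05)
print("P(null true | 0.03 < p < 0.05) ~", round(h0_true[borderline].mean(), 2))
# Comes out around 0.2 under these assumptions -- nowhere near 3-5%.
```

The exact number moves with the prior and the effect size, which is precisely the point: the p-value alone cannot tell you how probable the null is.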
Second, as described in this July 2017 article from Nature, the use of frequentist p-values as a test of significance has in recent years helped to produce a slew of results deemed significant that cannot be reproduced...
Shlomo Argamon, a computer scientist at the Illinois Institute of Technology in Chicago, says... “no matter what confidence level you choose, if there are enough different ways to design your experiment, it becomes highly likely that at least one of them will give a statistically significant result just by chance.”
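Argamon's observation is easy to quantify under the simplifying assumption that the k different analyses are independent and every null hypothesis is actually true: the chance that at least one of them comes out significant at level alpha is 1 - (1 - alpha)^k.

```python
# Chance of at least one false positive across k independent analyses,
# each tested at level alpha, when every null hypothesis is true.
alpha = 0.05
for k in (1, 5, 10, 20):
    print(f"{k:2d} analyses -> P(at least one p < {alpha}) = {1 - (1 - alpha) ** k:.2f}")
# 1 -> 0.05,  5 -> 0.23,  10 -> 0.40,  20 -> 0.64
```

With 20 plausible ways to slice the data, a spurious "significant" result is more likely than not; real analyses are usually correlated rather than independent, but the qualitative conclusion is the same.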
Now, the risk described above can be largely eliminated if you have a simple experimental design that can be repeated, but if your experiment is non-repeatable, then you are stuck with the single p-value you get on the one try.
These problems with p-values may be correctable, and it is probably a bit of an overstatement to say they are inherent to a "frequentist conception of probability", but it is also true that Bayesian methods are less encumbered by these particular issues.