Questions tagged [hypothesis-testing]

Hypothesis testing assesses whether data are inconsistent with a given hypothesis rather than being an effect of random fluctuations.

9227 questions
355 votes · 16 answers

Is normality testing 'essentially useless'?

A former colleague once argued to me as follows: We usually apply normality tests to the results of processes that, under the null, generate random variables that are only asymptotically or nearly normal (with the 'asymptotically' part…
shabbychef
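A minimal R sketch of the phenomenon the question describes (sample size and distribution are my own choices, not from the thread): a formal normality test applied to a large sample from a distribution that is only nearly normal will typically reject, even though the departure hardly matters in practice.

    # Nearly-normal data: a t-distribution with 10 df looks close to normal,
    # but Shapiro-Wilk (capped at n = 5000 in R) typically rejects it at this
    # sample size.
    set.seed(1)
    x <- rt(5000, df = 10)
    shapiro.test(x)          # p-value is usually tiny
    qqnorm(x); qqline(x)     # yet the bulk of the QQ-plot looks nearly straight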
280 votes · 16 answers

What is the meaning of p values and t values in statistical tests?

After taking a statistics course and then trying to help fellow students, I noticed one subject that inspires much head-desk banging is interpreting the results of statistical hypothesis tests. It seems that students easily learn how to perform the…
169 votes · 16 answers

Are large data sets inappropriate for hypothesis testing?

In a recent article of Amstat News, the authors (Mark van der Laan and Sherri Rose) stated that "We know that for large enough sample sizes, every study—including ones in which the null hypothesis of no effect is true — will declare a statistically…
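A small simulation sketch of the quoted claim (all numbers invented): with a large enough sample, even a negligible true difference is declared statistically significant.

    set.seed(1)
    n <- 1e6
    x <- rnorm(n, mean = 0)
    y <- rnorm(n, mean = 0.01)    # true difference of 0.01 SD -- practically nil
    t.test(x, y)$p.value          # typically far below 0.05 at this sample size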
141 votes · 8 answers

Is Facebook coming to an end?

Recently, this paper has received a lot of attention (e.g. from WSJ). Basically, the authors conclude that Facebook will lose 80% of its members by 2017. They base their claims on an extrapolation of the SIR model, a compartmental model frequently…
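For readers unfamiliar with the model being extrapolated, here is a bare-bones discrete-time SIR sketch in R (parameter values are invented for illustration, not taken from the paper): adoption spreads by "infection" from active users and decays as users "recover", i.e. abandon the network.

    sir <- function(beta, gamma, S0 = 0.99, I0 = 0.01, steps = 200) {
      S <- I <- R <- numeric(steps)
      S[1] <- S0; I[1] <- I0; R[1] <- 0
      for (t in 2:steps) {
        new_inf <- beta  * S[t - 1] * I[t - 1]   # susceptibles who join
        new_rec <- gamma * I[t - 1]              # active users who leave
        S[t] <- S[t - 1] - new_inf
        I[t] <- I[t - 1] + new_inf - new_rec
        R[t] <- R[t - 1] + new_rec
      }
      data.frame(step = seq_len(steps), S = S, I = I, R = R)
    }
    # Rise-and-fall curve of the "infected" (active-user) fraction:
    plot(sir(beta = 0.3, gamma = 0.05)$I, type = "l", ylab = "active fraction")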
124 votes · 7 answers

How to choose between t-test or non-parametric test e.g. Wilcoxon in small samples

Certain hypotheses can be tested using Student's t-test (maybe using Welch's correction for unequal variances in the two-sample case), or by a non-parametric test like the paired Wilcoxon signed-rank test, the Wilcoxon–Mann–Whitney U test, or the…
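As a concrete small-sample example in R (the measurements are made up), you can run both the paired t-test and the Wilcoxon signed-rank test on the same data; the point of the question is deciding which result to trust, which depends on what you are willing to assume about the underlying distribution.

    set.seed(1)
    before <- c(12.1, 14.3, 11.8, 13.5, 15.0, 12.7, 13.9, 14.6)   # invented measurements
    after  <- before + rnorm(8, mean = 0.8, sd = 1.5)
    t.test(after, before, paired = TRUE)        # parametric: paired t-test
    wilcox.test(after, before, paired = TRUE)   # non-parametric: signed-rank test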
116 votes · 10 answers

ASA discusses limitations of $p$-values - what are the alternatives?

We already have multiple threads tagged as p-values that reveal lots of misunderstandings about them. Ten months ago we had a thread about a psychological journal that "banned" $p$-values; now the American Statistical Association (2016) says that with our…
Tim
107 votes · 7 answers

T-test for non normal when N>50?

Long ago I learnt that a normal distribution was necessary to use a two-sample t-test. Today a colleague told me that she learnt that for N>50 a normal distribution was not necessary. Is that true? If so, is that because of the central limit theorem?
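One way to probe the colleague's rule of thumb is a quick simulation (a sketch of my own, assuming strongly skewed exponential data and n = 50 per group): estimate the actual Type I error rate of the two-sample t-test and compare it with the nominal 5%.

    set.seed(1)
    pvals <- replicate(10000, {
      x <- rexp(50); y <- rexp(50)     # both groups from the same skewed distribution
      t.test(x, y)$p.value
    })
    mean(pvals < 0.05)                 # usually close to the nominal 0.05 here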
102 votes · 9 answers

Is this really how p-values work? Can a million research papers per year be based on pure randomness?

I'm very new to statistics, and I'm just learning to understand the basics, including $p$-values. But there is a huge question mark in my mind right now, and I kind of hope my understanding is wrong. Here's my thought process: Aren't all researches…
n_mu_sigma
101 votes · 3 answers

What are examples where a "naive bootstrap" fails?

Suppose I have a set of sample data from an unknown or complex distribution, and I want to perform some inference on a statistic $T$ of the data. My default inclination is to just generate a bunch of bootstrap samples with replacement, and calculate…
raegtin
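A classic counterexample, sketched in R (the Uniform(0, θ) setup is the textbook case, not necessarily one from the thread): the naive bootstrap fails for the sample maximum, because the bootstrap distribution puts a large point mass on the observed maximum while the true sampling distribution is continuous.

    set.seed(1)
    x    <- runif(100)                                    # true theta = 1
    boot <- replicate(5000, max(sample(x, replace = TRUE)))
    mean(boot == max(x))      # roughly 1 - (1 - 1/n)^n, about 0.63, rather than ~0
    hist(boot, main = "Bootstrap distribution of the sample maximum")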
95 votes · 2 answers

How much do we know about p-hacking "in the wild"?

The phrase p-hacking (also: "data dredging", "snooping" or "fishing") refers to various kinds of statistical malpractice in which results become artificially statistically significant. There are many ways to procure a "more significant" result,…
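A toy simulation (entirely my own construction) of one of those "many ways": test 20 independent outcomes when every null is true and report whichever p-value is smallest; the chance of a "significant" finding is then about 1 − 0.95^20 ≈ 0.64 rather than 0.05.

    set.seed(1)
    hits <- replicate(2000, {
      p <- replicate(20, t.test(rnorm(30), rnorm(30))$p.value)   # 20 true nulls
      min(p) < 0.05
    })
    mean(hits)    # around 0.64 -- far above the nominal 5%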
91 votes · 4 answers

When to use Fisher and Neyman-Pearson framework?

I've been reading a lot lately about the differences between Fisher's method of hypothesis testing and the Neyman-Pearson school of thought. My question is: ignoring philosophical objections for a moment, when should we use Fisher's approach of…
Stijn
84 votes · 9 answers

Why is it possible to get significant F statistic (p<.001) but non-significant regressor t-tests?

In a multiple linear regression, why is it possible to have a highly significant F statistic (p<.001) but very high p-values on all the regressors' t-tests? In my model, there are 10 regressors. One has a p-value of 0.1 and the rest are above…
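A minimal R construction of the situation (simulated data, not the questioner's model): two nearly collinear regressors that jointly explain y give a highly significant overall F-statistic while neither individual coefficient's t-test is significant.

    set.seed(1)
    n  <- 100
    x1 <- rnorm(n)
    x2 <- x1 + rnorm(n, sd = 0.05)   # almost a copy of x1
    y  <- x1 + x2 + rnorm(n)
    summary(lm(y ~ x1 + x2))         # large F, yet both t-tests typically non-significant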
81 votes · 9 answers

Regarding p-values, why 1% and 5%? Why not 6% or 10%?

Regarding p-values, I am wondering why $1$% and $5$% seem to be the gold standard for "statistical significance". Why not other values, like $6$% or $10$%? Is there a fundamental mathematical reason for this, or is this just a widely held…
Contango
81 votes · 11 answers

How to obtain the p-value (check significance) of an effect in a lme4 mixed model?

I use lme4 in R to fit the mixed model lmer(value ~ status + (1 | experiment)), where value is continuous and status and experiment are factors, and I get output beginning: Linear mixed model fit by REML Formula: value ~ status + (1 | experiment) AIC BIC logLik…
ECII
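One widely used route (a sketch, not necessarily the thread's accepted answer; it assumes the data live in a data frame, here called dat, with columns value, status and experiment) is to refit the same model with lmerTest, which adds Satterthwaite degrees of freedom and p-values to the lmer output.

    library(lmerTest)                                  # loads lme4 as well
    fit <- lmer(value ~ status + (1 | experiment), data = dat)
    summary(fit)    # coefficient table now includes df and Pr(>|t|)
    anova(fit)      # F-tests for the fixed effect 'status'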
73 votes · 15 answers

Why would parametric statistics ever be preferred over nonparametric?

Can someone explain to me why anyone would choose a parametric over a nonparametric statistical method for hypothesis testing or regression analysis? In my mind, it's like going rafting and choosing a non-water-resistant watch, because you may…
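One standard answer can be illustrated with a small power simulation (setup entirely mine): when the parametric assumptions actually hold, the t-test is somewhat more powerful than its non-parametric counterpart.

    set.seed(1)
    res <- replicate(5000, {
      x <- rnorm(20); y <- rnorm(20, mean = 0.8)   # normal data, genuine effect
      c(t        = t.test(x, y)$p.value      < 0.05,
        wilcoxon = wilcox.test(x, y)$p.value < 0.05)
    })
    rowMeans(res)   # estimated power of each test; the t-test is usually slightly higher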