Most Popular
1500 questions
82
votes
2 answers
Likelihood ratio vs Bayes Factor
I'm rather evangelistic with regard to the use of likelihood ratios for representing the objective evidence for/against a given phenomenon. However, I recently learned that the Bayes factor serves a similar function in the context of Bayesian…

Mike Lawrence
82
votes
4 answers
Why not approach classification through regression?
Some material I've seen on machine learning said that it's a bad idea to approach a classification problem through regression. But I think it's always possible to do a continuous regression to fit the data and truncate the continuous prediction to…

Strin
81
votes
9 answers
Regarding p-values, why 1% and 5%? Why not 6% or 10%?
Regarding p-values, I am wondering why $1$% and $5$% seem to be the gold standard for "statistical significance". Why not other values, like $6$% or $10$%?
Is there a fundamental mathematical reason for this, or is this just a widely held…

Contango
81
votes
5 answers
What is regularization in plain English?
Unlike other articles, I found the Wikipedia entry for this subject unreadable for a non-math person (like me).
I understood the basic idea, that you favor models with fewer rules. What I don't get is how do you get from a set of rules to a…

Meh
81
votes
2 answers
XKCD's modified Bayes theorem: actually kinda reasonable?
I know this is from a comic famous for taking advantage of certain analytical tendencies, but it actually looks kind of reasonable after a few minutes of staring. Can anyone outline for me what this "modified Bayes theorem" is doing?

eric_kernfeld
81
votes
2 answers
What is a global max pooling layer and what is its advantage over a max pooling layer?
Can somebody explain what a global max pooling layer is, and why and when we use it for training a neural network? Does it have any advantage over an ordinary max pooling layer?

Eka
81
votes
5 answers
Explain the difference between multiple regression and multivariate regression, with minimal use of symbols/math
Are multiple and multivariate regression really different? What is a variate anyways?

Neil McGuigan
81
votes
11 answers
How to obtain the p-value (check significance) of an effect in a lme4 mixed model?
I use lme4 in R to fit the mixed model
lmer(value~status+(1|experiment))
where value is continuous, status and experiment are factors, and I get
Linear mixed model fit by REML
Formula: value ~ status + (1 | experiment)
AIC BIC logLik…

ECII
81
votes
9 answers
Probability of a single real-life future event: What does it mean when they say that "Hillary has a 75% chance of winning"?
As the election is a one-time event, it is not an experiment that can be repeated. So exactly what does the statement "Hillary has a 75% chance of winning" technically mean? I am seeking a statistically correct definition, not an intuitive or…

pitosalas
81
votes
0 answers
How can a regression be significant yet all predictors be non-significant?
My multiple regression model has a statistically significant F value; however, all beta values are statistically non-significant.
All the regression assumptions are met. No multicollinearity was found. Correlations among all predictors are…

Serene
81
votes
5 answers
Please explain the waiting paradox
A few years ago I designed a radiation detector that works by measuring the interval between events rather than counting them. My assumption was that, when measuring non-contiguous samples, on average I would measure half of the actual interval.…

Stephen Sackett
81
votes
4 answers
Can bootstrap be seen as a "cure" for the small sample size?
This question has been triggered by something I read in this graduate-level statistics textbook and also (independently) heard during this presentation at a statistical seminar. In both cases, the statement was along the lines of "because the sample…

James
80
votes
6 answers
Choosing a clustering method
When using cluster analysis on a data set to group similar cases, one needs to choose among a large number of clustering methods and measures of distance. Sometimes, one choice might influence the other, but there are many possible combinations of…

Brett
80
votes
2 answers
What is a "kernel" in plain English?
There are several distinct usages:
kernel density estimation
kernel trick
kernel smoothing
Please explain what the "kernel" in them means, in plain English, in your own words.

Neil McGuigan
80
votes
3 answers
Best way to present a random forest in a publication?
I am using the random forest algorithm as a robust classifier of two groups in a microarray study with 1000s of features.
What is the best way to present the random forest so that there is enough information to make it reproducible in a paper?
Is…

danielsbrewer