Most Popular

1500 questions
82 votes
2 answers

Likelihood ratio vs Bayes Factor

I'm rather evangelistic with regard to the use of likelihood ratios for representing the objective evidence for/against a given phenomenon. However, I recently learned that the Bayes factor serves a similar function in the context of Bayesian…
Mike Lawrence
82 votes
4 answers

Why not approach classification through regression?

Some material I've seen on machine learning said that it's a bad idea to approach a classification problem through regression. But I think it's always possible to do a continuous regression to fit the data and truncate the continuous prediction to…
Strin
81 votes
9 answers

Regarding p-values, why 1% and 5%? Why not 6% or 10%?

Regarding p-values, I am wondering why $1$% and $5$% seem to be the gold standard for "statistical significance". Why not other values, like $6$% or $10$%? Is there a fundamental mathematical reason for this, or is this just a widely held…
Contango
81 votes
5 answers

What is regularization in plain English?

Unlike other articles, I found the Wikipedia entry for this subject unreadable for a non-math person (like me). I understood the basic idea, that you favor models with fewer rules. What I don't get is how you get from a set of rules to a…
Meh
81 votes
2 answers

XKCD's modified Bayes theorem: actually kinda reasonable?

I know this is from a comic famous for taking advantage of certain analytical tendencies, but it actually looks kind of reasonable after a few minutes of staring. Can anyone outline for me what this "modified Bayes theorem" is doing?
eric_kernfeld
81 votes
2 answers

What is a global max pooling layer and what is its advantage over a max pooling layer?

Can somebody explain what a global max pooling layer is, and why and when we use it for training a neural network? Does it have any advantage over an ordinary max pooling layer?
Eka
81 votes
5 answers

Explain the difference between multiple regression and multivariate regression, with minimal use of symbols/math

Are multiple and multivariate regression really different? What is a variate anyway?
81 votes
11 answers

How to obtain the p-value (check significance) of an effect in an lme4 mixed model?

I use lme4 in R to fit the mixed model lmer(value~status+(1|experiment)), where value is continuous and status and experiment are factors, and I get: Linear mixed model fit by REML Formula: value ~ status + (1 | experiment) AIC BIC logLik…
ECII
81 votes
9 answers

Probability of a single real-life future event: What does it mean when they say that "Hillary has a 75% chance of winning"?

As the election is a one-time event, it is not an experiment that can be repeated. So exactly what does the statement "Hillary has a 75% chance of winning" technically mean? I am seeking a statistically correct definition, not an intuitive or…
pitosalas
81 votes
0 answers

How can a regression be significant yet all predictors be non-significant?

My multiple regression model has a statistically significant F value; however, all beta values are statistically non-significant. All the regression assumptions are met. No multicollinearity was found. Correlations among all predictors are…
Serene
81 votes
5 answers

Please explain the waiting paradox

A few years ago I designed a radiation detector that works by measuring the interval between events rather than counting them. My assumption was that, when measuring non-contiguous samples, on average I would measure half of the actual interval.…
Stephen Sackett
81 votes
4 answers

Can bootstrap be seen as a "cure" for the small sample size?

This question has been triggered by something I read in this graduate-level statistics textbook and also (independently) heard during this presentation at a statistical seminar. In both cases, the statement was along the lines of "because the sample…
James
80 votes
6 answers

Choosing a clustering method

When using cluster analysis on a data set to group similar cases, one needs to choose among a large number of clustering methods and measures of distance. Sometimes, one choice might influence the other, but there are many possible combinations of…
Brett
80 votes
2 answers

What is a "kernel" in plain English?

There are several distinct usages: kernel density estimation, kernel trick, kernel smoothing. Please explain what the "kernel" in them means, in plain English, in your own words.
Neil McGuigan
80 votes
3 answers

Best way to present a random forest in a publication?

I am using the random forest algorithm as a robust classifier of two groups in a microarray study with 1000s of features. What is the best way to present the random forest so that there is enough information to make it reproducible in a paper? Is…
danielsbrewer