Model selection is the problem of judging which model from some set performs best. Popular methods include the $R^2$, AIC, and BIC criteria, held-out test sets, and cross-validation. To some extent, feature selection is a subproblem of model selection.
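As a concrete illustration of one of these approaches, here is a minimal sketch of cross-validation-based selection among a few candidate regressions (assuming scikit-learn; the candidates, data, and default scoring are placeholders, not recommendations):

```python
# Minimal sketch: pick among candidate models by k-fold cross-validation.
# Assumes scikit-learn; the candidates and synthetic data are illustrative only.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

candidates = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
}

# Mean 5-fold CV score (R^2 by default for regressors) for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
print(scores, "-> selected:", best_name)
```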
Questions tagged [model-selection]
1800 questions
734 votes · 11 answers
How to choose the number of hidden layers and nodes in a feedforward neural network?
Is there a standard and accepted method for selecting the number of layers, and the number of nodes in each layer, in a feed-forward neural network? I'm interested in automated ways of building neural networks.

Rob Hyndman · 51,928
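One automated (if brute-force) approach to the architecture question above is to treat the hidden-layer sizes as hyperparameters and search over them with cross-validation. A minimal sketch with scikit-learn's MLPClassifier; the candidate layer sizes are arbitrary placeholders, not a recommended grid:

```python
# Minimal sketch: search over hidden-layer architectures by cross-validation.
# Assumes scikit-learn; the candidate layer sizes below are arbitrary.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(8,), (32,), (64,), (32, 16), (64, 32)],
}
search = GridSearchCV(
    MLPClassifier(max_iter=2000, random_state=0),
    param_grid,
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```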
265 votes · 13 answers
Is there any reason to prefer the AIC or BIC over the other?
The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a preference based on the stringency of the criteria,…

russellpierce · 17,079
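For reference, the two criteria the question contrasts are usually written as follows, with $\hat{L}$ the maximized likelihood, $k$ the number of estimated parameters, and $n$ the sample size:

$$\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{BIC} = k\ln n - 2\ln\hat{L}.$$

The BIC's per-parameter penalty $\ln n$ exceeds the AIC's factor of 2 once $n > e^2 \approx 7.4$, which is the sense in which BIC penalizes free parameters more heavily.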
254 votes · 3 answers
How to know that your machine learning problem is hopeless?
Imagine a standard machine-learning scenario: you are confronted with a large multivariate dataset and have only a blurry understanding of it. You need to make predictions about some variable based on what you have. As…

Tim · 108,699
242 votes · 7 answers
How to choose a predictive model after k-fold cross-validation?
I am wondering how to choose a predictive model after doing K-fold cross-validation.
This may be awkwardly phrased, so let me explain in more detail: whenever I run K-fold cross-validation, I use K subsets of the training data, and end up with K…

Berk U. · 4,265
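A minimal sketch of the situation the question describes, together with the common convention that the k per-fold fits are used only to estimate performance while a single final model is refit on all of the training data (assuming scikit-learn; not necessarily what the asker ends up doing):

```python
# Minimal sketch: k-fold CV produces k fitted models, but they are only used
# to estimate performance; a single final model is then refit on the full
# training set. Assumes scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=400, n_features=15, random_state=0)

fold_scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression(max_iter=1000)   # one of the k throwaway fits
    model.fit(X[train_idx], y[train_idx])
    fold_scores.append(model.score(X[val_idx], y[val_idx]))

print("estimated accuracy:", np.mean(fold_scores))

# The k per-fold models are discarded; the chosen specification is refit once.
final_model = LogisticRegression(max_iter=1000).fit(X, y)
```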
228 votes · 8 answers
Algorithms for automatic model selection
I would like to implement an algorithm for automatic model selection.
I am thinking of doing stepwise regression, but anything will do (it has to be based on linear regressions, though).
My problem is that I am unable to find a methodology, or an…

S4M · 2,432
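Stepwise procedures of the kind the asker has in mind are easy to sketch, whatever their statistical merits. A rough forward-selection-by-AIC loop with statsmodels; the data, variable names, and stopping rule are illustrative only:

```python
# Rough sketch of forward stepwise selection by AIC on a linear regression.
# Assumes statsmodels and pandas; shown for illustration, not endorsement.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 8)), columns=[f"x{i}" for i in range(8)])
y = 2 * X["x0"] - 3 * X["x3"] + rng.normal(size=200)

selected, remaining = [], list(X.columns)
current_aic = sm.OLS(y, np.ones((len(y), 1))).fit().aic   # intercept-only model

while remaining:
    # AIC of the current model plus each remaining column, one at a time.
    aics = {c: sm.OLS(y, sm.add_constant(X[selected + [c]])).fit().aic
            for c in remaining}
    best_col, best_aic = min(aics.items(), key=lambda kv: kv[1])
    if best_aic >= current_aic:          # no candidate improves AIC: stop
        break
    selected.append(best_col)
    remaining.remove(best_col)
    current_aic = best_aic

print("selected columns:", selected)
```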
178 votes · 5 answers
Training on the full dataset after cross-validation?
TL;DR: Is it ever a good idea to train an ML model on all the data available before shipping it to production? Put another way, is it ever OK to train on all available data and not check whether the model overfits, or get a final read of the expected…

Amelio Vazquez-Reina · 17,546
131 votes · 4 answers
Nested cross validation for model selection
How can one use nested cross validation for model selection?
From what I read online, nested CV works as follows:
There is the inner CV loop, where we may conduct a grid search (e.g. running K-fold for every available model, e.g. combination of…

Amelio Vazquez-Reina · 17,546
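A minimal sketch of the nested scheme the question describes, with scikit-learn's GridSearchCV as the inner loop and cross_val_score as the outer loop (estimator and grid values are placeholders):

```python
# Minimal sketch of nested cross-validation: the inner loop tunes
# hyperparameters, the outer loop estimates the generalization error of the
# whole tuning procedure. Assumes scikit-learn; grid values are placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)

# Inner loop: grid search over C and gamma, refit on each outer training fold.
tuned_svc = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]},
                         cv=inner_cv)

# Outer loop: estimate how well "SVC plus this tuning procedure" performs.
outer_scores = cross_val_score(tuned_svc, X, y, cv=outer_cv)
print(outer_scores.mean())
```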
95 votes · 2 answers
How much do we know about p-hacking "in the wild"?
The phrase p-hacking (also: "data dredging", "snooping" or "fishing") refers to various kinds of statistical malpractice in which results become artificially statistically significant. There are many ways to procure a "more significant" result,…

Silverfish · 20,678
87 votes · 5 answers
What are modern, easily used alternatives to stepwise regression?
I have a dataset with around 30 independent variables and would like to construct a generalized linear model (GLM) to explore the relationship between them and the dependent variable.
I am aware that the method I was taught for this situation,…

fmark · 4,666
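Penalized regression such as the lasso is one standard modern alternative to stepwise selection; a minimal sketch with scikit-learn's cross-validated lasso (shown only as an illustration, and whether it suits this asker's GLM depends on the response type):

```python
# Minimal sketch of an alternative to stepwise selection: the lasso, with the
# penalty strength chosen by cross-validation. Assumes scikit-learn.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# ~30 predictors, only a few of which actually matter.
X, y = make_regression(n_samples=250, n_features=30, n_informative=5,
                       noise=5.0, random_state=0)

model = LassoCV(cv=5, random_state=0).fit(X, y)
kept = np.flatnonzero(model.coef_)        # predictors with nonzero coefficients
print("alpha chosen by CV:", model.alpha_)
print("predictors retained:", kept)
```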
85 votes · 14 answers
Why haven't robust (and resistant) statistics replaced classical techniques?
When solving business problems with data, it is common that at least one key assumption underpinning classical statistics is invalid. Most of the time, no one bothers to check those assumptions, so you never actually know.
For instance, that so…

doug · 9,901
77 votes · 6 answers
Is variable selection for predictive modeling really needed in 2016?
This question was asked on CV some years ago; it seems worth revisiting in light of (1) an order of magnitude better computing technology (e.g. parallel computing, HPC, etc.) and (2) newer techniques, e.g. [3].
First, some context. Let's assume the goal…

horaceT · 3,162
65 votes · 2 answers
Why only three partitions? (training, validation, test)
When you are trying to fit models to a large dataset, the common advice is to partition the data into three parts: the training, validation, and test dataset.
This is because the models usually have three "levels" of parameters: the first…

charles.y.zheng · 7,346
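A minimal sketch of the three-way partition the question refers to, done with two successive scikit-learn splits (the 60/20/20 proportions are arbitrary):

```python
# Minimal sketch: split data into training, validation, and test sets
# (roughly 60/20/20 here). Assumes scikit-learn; proportions are arbitrary.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# First split off the test set, then carve the validation set out of the rest.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2,
                                                  random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25,
                                                  random_state=0)

# Typical roles: fit on X_train, tune/compare on X_val, report once on X_test.
print(len(X_train), len(X_val), len(X_test))   # 600 200 200
```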
62 votes · 2 answers
A more definitive discussion of variable selection
Background
I'm doing clinical research in medicine and have taken several statistics courses. I've never published a paper using linear/logistic regression and would like to do variable selection correctly. Interpretability is important, so no fancy…

sharper_image · 737
60 votes · 3 answers
Linear model with log-transformed response vs. generalized linear model with log link
In the paper titled "Choosing Among Generalized Linear Models Applied to Medical Data", the authors write:
In a generalized linear model, the mean is transformed, by the link
function, instead of transforming the response itself. The two methods
…

miura · 3,364
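The distinction the quoted passage draws can be written out explicitly: a linear model on the log-transformed response models the mean of $\log Y$, whereas a GLM with a log link models the log of the mean of $Y$ (the notation below is generic, not taken from the paper):

$$\text{log-transformed LM:}\quad \mathbb{E}[\log Y \mid X] = X\beta, \qquad \text{log-link GLM:}\quad \log \mathbb{E}[Y \mid X] = X\beta.$$

These coincide only under extra assumptions; for example, if $\log Y$ is normal with constant variance $\sigma^2$, then $\mathbb{E}[Y \mid X] = \exp(X\beta + \sigma^2/2)$, not $\exp(X\beta)$.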
49 votes · 3 answers
AIC, BIC, CIC, DIC, EIC, FIC, GIC, HIC, IIC: can I use them interchangeably?
On p. 34 of his PRNN (Pattern Recognition and Neural Networks), Brian Ripley comments that "The AIC was named by Akaike (1974) as 'An Information Criterion' although it seems commonly believed that the A stands for Akaike". Indeed, when introducing the AIC statistic, Akaike (1974, p.719)…

Hibernating · 3,723