Questions tagged [modeling]

This tag describes the process of creating a statistical or machine learning model. Always add a more specific tag.

A statistical model formalizes relationships between variables as mathematical equations. It describes how one or more random variables are related to one or more other variables. The model is statistical because the variables are related stochastically rather than deterministically.

2392 questions

265

votes

13 answers

Is there any reason to prefer the AIC or BIC over the other?

The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a preference based on the stringency of the criteria,…

asked Jul 23 '10 at 20:49

russellpierce

17,079
16
67
98
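As a rough illustration of the penalty difference raised in the AIC/BIC question above, the sketch below uses the standard textbook definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The function names and the example numbers are hypothetical and are not taken from the question or its answers.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (k = number of estimated parameters)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L (n = number of observations)."""
    return k * math.log(n) - 2 * log_likelihood


def penalty_gap(k: int, n: int) -> float:
    """BIC penalty minus AIC penalty; positive whenever ln(n) > 2."""
    return k * (math.log(n) - 2)


if __name__ == "__main__":
    # Hypothetical fit: log-likelihood of -100 with 3 estimated parameters.
    for n in (7, 8, 100):
        print(n, aic(-100.0, 3), bic(-100.0, 3, n), penalty_gap(3, n))
```

Under these definitions the BIC penalty per parameter exceeds the AIC penalty only once ln(n) > 2, that is, for samples of 8 or more observations.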

254

votes

3 answers

How to know that your machine learning problem is hopeless?

Imagine a standard machine-learning scenario: You are confronted with a large multivariate dataset and you have a pretty blurry understanding of it. What you need to do is to make predictions about some variable based on what you have. As…

machine-learning forecasting modeling model-selection forecastability

asked Jul 05 '16 at 08:22

Tim

108,699
20
212
390

111

votes

5 answers

Using k-fold cross-validation for time-series model selection

Question: I want to be sure of something: is the use of k-fold cross-validation with time series straightforward, or does one need to pay special attention before using it? Background: I'm modeling a time series of 6 years (with semi-markov…

time-series modeling cross-validation

asked Aug 10 '11 at 17:20

Mickaël S

1,258
3
10
6
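For the time-series cross-validation question above, one commonly used alternative to shuffled k-fold is a forward-chaining (expanding-window) split, in which every test block lies strictly after its training data. The sketch below is a minimal pure-Python illustration; the function name and parameters are hypothetical, and it is not presented as the approach taken in the question or its answers.

```python
from typing import Iterator, List, Tuple


def forward_chaining_splits(
    n_obs: int, n_folds: int, min_train: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) where each test block follows its training window."""
    if n_folds < 1 or min_train < 1 or min_train >= n_obs:
        raise ValueError("need n_folds >= 1 and 1 <= min_train < n_obs")
    test_size = (n_obs - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * test_size  # training window grows each fold
        test_end = train_end + test_size
        yield list(range(train_end)), list(range(train_end, test_end))


if __name__ == "__main__":
    # Hypothetical series of 10 ordered observations, 3 folds, at least 4 training points.
    for train_idx, test_idx in forward_chaining_splits(10, 3, 4):
        print(train_idx, test_idx)
```

Because indices are never shuffled, no observation from the future of a test block appears in its training set. Whether this restriction is needed depends on the model and on how strongly the residuals are autocorrelated.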

101

votes

18 answers

Including the interaction but not the main effects in a model

Is it ever valid to include a two-way interaction in a model without including the main effects? What if your hypothesis is only about the interaction, do you still need to include the main effects?

regression modeling interaction regression-coefficients

asked May 20 '11 at 01:19

Glen

6,320
4
37
59

votes

3 answers

Can someone explain Gibbs sampling in very simple words?

I'm doing some reading on topic modeling (with Latent Dirichlet Allocation) which makes use of Gibbs sampling. As a newbie in statistics (well, I know things like binomials, multinomials, priors, etc.), I find it difficult to grasp how Gibbs sampling…

modeling sampling conditional-probability gibbs

asked May 01 '11 at 19:37

Thea

votes

24 answers

Rules of thumb for "modern" statistics

I like G van Belle's book on Statistical Rules of Thumb, and to a lesser extent Common Errors in Statistics (and How to Avoid Them) from Phillip I Good and James W. Hardin. They address common pitfalls when interpreting results from experimental and…

modeling exploratory-data-analysis rule-of-thumb

asked Sep 16 '10 at 10:21

chl

50,972
18
205
364

votes

11 answers

Why should I be Bayesian when my model is wrong?

Edits: I have added a simple example: inference of the mean of the $X_i$. I have also slightly clarified why the credible intervals not matching confidence intervals is bad. I, a fairly devout Bayesian, am in the middle of a crisis of faith of…

bayesian modeling philosophical misspecification

asked Apr 20 '17 at 15:28

Guillaume Dehaene

2,137
1
10
18

votes

14 answers

What is the meaning of "All models are wrong, but some are useful"

"Essentially, all models are wrong, but some are useful." --- Box, George E. P.; Norman R. Draper (1987). Empirical Model-Building and Response Surfaces, p. 424, Wiley. ISBN 0471810339. What exactly is the meaning of the above phrase?

modeling

asked Apr 27 '13 at 08:39

gpuguy

1,063
3
10
10

votes

6 answers

Variable selection for predictive modeling really needed in 2016?

This question was asked on CV some years ago, but it seems worth a repost in light of 1) order-of-magnitude better computing technology (e.g. parallel computing, HPC, etc.) and 2) newer techniques, e.g. [3]. First, some context. Let's assume the goal…

machine-learning modeling feature-selection model-selection prediction

asked May 28 '16 at 20:13

horaceT

3,162
3
15
19

votes

4 answers

Why does including latitude and longitude in a GAM account for spatial autocorrelation?

I have produced generalized additive models for deforestation. To account for spatial autocorrelation, I have included latitude and longitude as a smoothed interaction term (i.e. s(x,y)). I've based this on reading many papers where the authors say…

r modeling spatial autocorrelation generalized-additive-model

asked Sep 01 '12 at 14:00

gisol

votes

6 answers

Model for predicting number of Youtube views of Gangnam Style

PSY's music video "Gangnam style" is popular: after a little more than 2 months it has about 540 million viewers. I learned this from my preteen children at dinner last week, and soon the discussion turned to whether it was possible to do…

modeling internet

asked Oct 27 '12 at 05:49

FredrikD

votes

7 answers

Do all interaction terms need their individual terms in a regression model?

I am actually reviewing a manuscript where the authors compare 5-6 logit regression models with AIC. However, some of the models have interaction terms without including the individual covariate terms. Does it ever make sense to do this? For example…

regression modeling interaction aic

asked May 04 '12 at 02:10

djhocking

1,701
3
17
21

votes

7 answers

What is a "saturated" model?

What is meant when we say we have a saturated model?

modeling regression

asked Jul 20 '10 at 12:09

Graham Cookson

7,543
6
41
35

votes

3 answers

Variables are often adjusted (e.g. standardised) before making a model - when is this a good idea, and when is it a bad one?

In what circumstances would you want to, or not want to, scale or standardize a variable prior to model fitting? And what are the advantages and disadvantages of scaling a variable?

modeling predictive-models feature-selection mathematical-statistics standardization

asked Dec 01 '11 at 16:29

Andrew

5,478
5
21
21

votes

4 answers

What is so cool about de Finetti's representation theorem?

From Theory of Statistics by Mark J. Schervish (page 12): Although DeFinetti's representation theorem 1.49 is central to motivating parametric models, it is not actually used in their implementation. How is the theorem central to parametric…

probability mathematical-statistics modeling exchangeability

asked Aug 16 '12 at 17:40

gui11aume

13,383
2
44
89
