Questions tagged [misspecification]

Misspecification of statistical models, such as missing variables/predictors, wrong functional form, wrong specification of the variance or covariance structure, or other problems with model specification.

See https://en.wikipedia.org/wiki/Specification_(regression) and http://www.statisticshowto.com/model-misspecification/

114 questions
87 votes, 11 answers

Why should I be Bayesian when my model is wrong?

Edits: I have added a simple example: inference of the mean of the $X_i$. I have also slightly clarified why the credible intervals not matching confidence intervals is bad. I, a fairly devout Bayesian, am in the middle of a crisis of faith of…
Guillaume Dehaene
33 votes, 6 answers

Inclusion of lagged dependent variable in regression

I'm very confused about whether it's legitimate to include a lagged dependent variable in a regression model. Basically I think if this model focuses on the relationship between the change in Y and other independent variables, then adding a lagged…
user22109
28 votes, 2 answers

Is it true that Bayesian methods don't overfit?

Is it true that Bayesian methods don't overfit? (I saw some papers and tutorials making this claim) For example, if we apply a Gaussian Process to MNIST (handwritten digit classification), but only show it a single sample, will it revert to the…
24 votes, 2 answers

Why doesn't Wilks' 1938 proof work for misspecified models?

In the famous 1938 paper ("The large-sample distribution of the likelihood ratio for testing composite hypotheses", Annals of Mathematical Statistics, 9:60-62), Samuel Wilks derived the asymptotic distribution of $2 \times LLR$ (log likelihood…
16 votes, 2 answers

Statistical Inference Under Misspecification

The classical treatment of statistical inference relies on the assumption that a correctly specified statistical model exists. That is, the distribution $\mathbb{P}^*(Y)$ that generated the observed data $y$ is part of the statistical model…
10 votes, 2 answers

When to use (non)parametric test of homoscedasticity assumption?

If one is testing the assumption of homoscedasticity, parametric (Bartlett Test of Homogeneity of Variances, bartlett.test) and non-parametric (Fligner-Killeen Test of Homogeneity of Variances, fligner.test) tests are available. How to tell which kind to…
Roman Luštrik
9 votes, 2 answers

Statistical inference under model misspecification

I have a general methodological question. It might have been answered before, but I am not able to locate the relevant thread. I will appreciate pointers to possible duplicates. (Here is an excellent one, but with no answer. This is also similar in…
Richard Hardy
8 votes, 1 answer

Effects of model selection and misspecification testing on inference: Probabilistic Reduction approach (Aris Spanos)

This question is about pre-test bias, inference after model selection and data snooping within the Probabilistic Reduction (PR) methodology by Aris Spanos (which is related to the Error Statistics philosophy by Deborah Mayo; see e.g. her blog). I…
7 votes, 1 answer

Test incorrect functional form when residuals have non-normal distribution

J. B. Ramsey (in "Tests for specification errors in classical linear least-squares regression analysis." Journal of the Royal Statistical Society. 1969) says that the RESET test assumes that the residuals are normally distributed. If one wants to…
Baumann
7 votes, 1 answer

Distribution of random effects

Why do we usually assume that random effects come from a normal distribution? Can we assume another distribution? Or is it perhaps because the CLT indicates that a random effect is normally distributed?
6 votes, 4 answers

What is the benefit of regression with Student-t residuals over OLS regression?

Sometimes I see advice to fit regressions with Student-t residuals rather than using OLS (which is equivalent to assuming normally distributed residuals) if the distribution of the residuals is heavy-tailed. However, since the OLS estimator is BLUE…
6 votes, 4 answers

Could logistic regression be used to detect large errors in least squares regression?

I have the following linear model: $$w^*=\text{arg min}_w\sum_{i=1}^N \bigg(Y_i-\sum_{j=1}^M X_{i,j}\times w_j\bigg)^2$$ Let $T \in N^*$ and $e_i=|Y_i-\sum_{j=1}^M X_{i,j}\times w_j|$. It's possible using logistic regression to predict which errors…
6 votes, 3 answers

Statistical test to determine if a relationship is linear?

What is the best statistical test to use if I measure the value of $Y$ (e.g. pH) for specific values of $X$ (e.g. temperature), say $X=0,10,20,30,\ldots,100$, and I want to test whether the relationship between $X$ and $Y$ is linear? (i.e. $H_0$: The…
6 votes, 2 answers

Conditional vs. Marginal models

I have data with an outcome of 0 or 1 (binary) representing success or failure. I also have two comparison groups (Treatment vs. Control). Each subject in the study contributed 2 observations (the treatment is ear drops, so 2 ears). I wanted to…
6 votes, 1 answer

Fixed Regressor Conspiracy and Connection to Exchangeability

In the simple regression model, regressors are treated as fixed rather than stochastic. Whoever picks the experimental values for the regressors decides how frequently to include each value. These can be equally weighted (i.e. 10 samples of 20mg…
Cagdas Ozgenc