Questions tagged [robust]

Robustness in general refers to a statistic's insensitivity to deviations from its underlying assumptions (Huber and Ronchetti, 2009).

Robust statistics are insensitive to outliers and to deviations from their underlying assumptions. Such methods are useful when it is not possible to detect and remove outliers or to appropriately test the assumptions required by a given statistic. A robust statistic is meant to achieve three goals:

efficiency - it should have optimal or nearly optimal efficiency at the assumed model
stability - small deviations from the assumptions should have only a small influence on performance
breakdown - larger deviations from the assumptions should not lead to a complete failure

Examples of robust statistics are median regression as an estimation technique and Huber-White standard errors for statistical inference. Note that "robust" is not equivalent to "better". Robustness always involves a compromise: it sacrifices some efficiency to insure against larger deviations from the assumptions of the model (Anscombe, 1960).
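
As a rough illustration of the stability and breakdown goals, the sketch below compares the mean and standard deviation with the median and the scaled median absolute deviation (MAD) on a small sample, before and after one value is replaced by a gross outlier. The data and the helper name `mad_scale` are made up for this example; the constant 1.4826 is the usual factor that makes the MAD comparable to the standard deviation under normality.

```python
import statistics

# Hypothetical data: five measurements near 10, then the same sample
# with one value replaced by a gross outlier.
clean = [9.8, 10.1, 10.0, 9.9, 10.2]
contaminated = clean[:-1] + [1000.0]


def mad_scale(values, k=1.4826):
    """Median absolute deviation, scaled by k so that it estimates the
    standard deviation when the data are normal (illustrative helper)."""
    med = statistics.median(values)
    return k * statistics.median([abs(v - med) for v in values])


for name, sample in [("clean", clean), ("contaminated", contaminated)]:
    print(
        name,
        "mean=%.2f" % statistics.mean(sample),
        "median=%.2f" % statistics.median(sample),
        "sd=%.2f" % statistics.stdev(sample),
        "mad=%.2f" % mad_scale(sample),
    )
```

In this toy sample the median and the MAD are unchanged by the outlier, while the mean and the standard deviation are dominated by it. The price, as noted above, is some loss of efficiency when the data really do follow the assumed model.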

For further reading see

Huber, P.J. and Ronchetti, E.M. (2009) "Robust Statistics", 2nd Edition, Wiley Series in Probability and Statistics, John Wiley & Sons, Inc., New Jersey
Anscombe, F.J. (1960) "Rejection of Outliers", Technometrics, Vol. 2, pp. 123-147

519 questions

14 answers

Why haven't robust (and resistant) statistics replaced classical techniques?

When solving business problems using data, it's common that at least one key assumption that underpins classical statistics is invalid. Most of the time, no one bothers to check those assumptions so you never actually know. For instance, that so…

asked Aug 03 '10 at 07:49

doug

4 answers

Replicating Stata's "robust" option in R

I have been trying to replicate the results of the Stata option robust in R. I have used the rlm command from the MASS package and also the command lmrob from the package "robustbase". In both cases the results are quite different from the "robust"…

r stata robust robust-standard-error

asked Sep 28 '14 at 12:42

user56579

4 answers

Fast linear regression robust to outliers

I am dealing with linear data with outliers, some of which are more than 5 standard deviations away from the estimated regression line. I'm looking for a linear regression technique that reduces the influence of these points. So far what I did is…

regression linear-model outliers robust fused-lasso

asked Dec 19 '12 at 10:47

Matteo Fasiolo

3 answers

Why do we care so much about normally distributed error terms (and homoskedasticity) in linear regression when we don't have to?

I suppose I get frustrated every time I hear someone say that non-normality of residuals and/or heteroskedasticity violates OLS assumptions. To estimate parameters in an OLS model neither of these assumptions is necessary by the Gauss-Markov…

regression assumptions normality-assumption robust teaching

asked Dec 30 '14 at 22:22

Zachary Blumenfeld

2 answers

Why should we use t errors instead of normal errors?

In this blog post by Andrew Gelman, there is the following passage: The Bayesian models of 50 years ago seem hopelessly simple (except, of course, for simple problems), and I expect the Bayesian models of today will seem hopelessly simple, 50…

distributions bayesian normal-distribution model robust

asked Oct 20 '14 at 16:15

Potato

2 answers

Error "system is computationally singular" when running a glm

I'm using the robustbase package to run a glm estimation. However when I do it, I get the following error: Error in solve.default(crossprod(X, DiagB * X)/nobs, EEq) : system is computationally singular: reciprocal condition number =…

r generalized-linear-model robust

asked Nov 13 '13 at 18:11

NK1

4 answers

Why isn't RANSAC most widely used in statistics?

Coming from the field of computer vision, I've often used the RANSAC (Random Sample Consensus) method for fitting models to data with lots of outliers. However, I've never seen it used by statisticians, and I've always been under the impression…

outliers bootstrap robust

asked Jul 21 '10 at 14:30

Bossykena

6 answers

What would a robust Bayesian model for estimating the scale of a roughly normal distribution be?

There exist a number of robust estimators of scale. A notable example is the median absolute deviation which relates to the standard deviation as $\sigma = \mathrm{MAD}\cdot1.4826$. In a Bayesian framework there exist a number of ways to robustly…

r bayesian estimation standard-deviation robust

asked Jan 13 '14 at 16:08

Rasmus Bååth

8 answers

Replacing outliers with mean

This question was asked by my friend who is not internet savvy. I have no statistics background and I've been searching around the internet for this question. The question is: is it possible to replace outliers with the mean value? If it's possible, is…

mean outliers robust winsorizing

asked Nov 29 '13 at 14:08

Alun

2 answers

Are 50% confidence intervals more robustly estimated than 95% confidence intervals?

My question flows out of this comment on a blog post by Andrew Gelman in which he advocates the use of 50% confidence intervals instead of 95% confidence intervals, although not on the grounds that they are more robustly estimated: I prefer 50% to…

confidence-interval assumptions robust

asked Nov 27 '16 at 06:28

user1205901 - Reinstate Monica

1 answer

What are the multidimensional versions of median

What are the multidimensional versions of the median and what are their pros and cons? I confess this doesn't have a single answer, but I think it is a useful question to ask and will be a benefit to others as well. How stable it is (i.e. how many…

multivariate-analysis robust median

asked Nov 06 '12 at 23:26

John Robertson

5 answers

How robust is the independent samples t-test when the distributions of the samples are non-normal?

I've read that the t-test is "reasonably robust" when the distributions of the samples depart from normality. Of course, it's the sampling distribution of the differences that is important. I have data for two groups. One of the groups is highly…

t-test assumptions normality-assumption robust

asked Oct 09 '12 at 00:29

Archaeopteryx

2 answers

Is a weighted $R^2$ in robust linear model meaningful for goodness of fit analysis?

I estimated a robust linear model in R with MM weights using the rlm() in the MASS package. `R` does not provide an $R^2$ value for the model, but I would like to have one if it is a meaningful quantity. I am also interested to know if there is any…

r goodness-of-fit r-squared robust rlm

asked Jan 30 '14 at 03:31

CraigMilligan

4 answers

Mean and Median properties

Can somebody clearly explain to me the mathematical logic that would link two statements (a) and (b) together? Let us have a set of values (some distribution). Now, a) Median does not depend on every value [it just depends on one or two middle…

mean median robust sensitivity-analysis types-of-averages

asked Feb 16 '11 at 19:33

ttnphns

3 answers

Crash course in robust mean estimation

I have a bunch (around 1000) of estimates and they are all supposed to be estimates of long-run elasticity. A little more than half of these are estimated using method A and the rest using method B. Somewhere I read something like "I think method B…

mean outliers robust references

asked Mar 03 '12 at 17:41

Ondrej
