Questions tagged [confounding]

In statistical models, confounding is said to occur when the apparent dependence of the response on a predictor is partially or wholly due to the dependence of both on a third variable not included in the model, or on a linear combination of other variables that are included in the model. Confounding with variables included in the model is often called multicollinearity. In design of experiments, confounding is also called *aliasing*.

In statistical models, confounding is said to occur when the apparent dependence of the response on a predictor is partially or wholly due to the dependence of both on a third variable not included in the model. The causal relation, if any, between predictor and response is thus obscured. Observational studies are especially prone to confounding, the only remedy being to include all potential confounders in the model. Experiments mitigate confounding through randomization, although when randomization is restricted by blocking, some main effects or interactions may be partially or wholly confounded with blocks.
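
As a rough illustration of the definition above, here is a minimal simulation sketch. Everything in it is an assumption made for the example: the variable names `z`, `x`, `y`, the coefficients, and the helper functions `slope`, `residuals` and `simulate` are hypothetical and do not come from any question under this tag. The predictor `x` has no effect on the response `y`; both depend only on the omitted variable `z`. Regressing `y` on `x` alone still gives a clearly non-zero slope, while adjusting for `z` (done here by residualizing both variables on `z`, which gives the same coefficient on `x` as a regression of `y` on `x` and `z`) brings it back to about zero.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x, with an intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a least-squares fit on x, with an intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=20000, seed=0):
    """Hypothetical data: z drives both x and y; x has no effect on y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]        # x depends on z
    y = [2 * zi + rng.gauss(0, 1) for zi in z]    # y depends on z only
    return z, x, y


z, x, y = simulate()

# Model that omits z: the slope on x is roughly 1, purely through z.
naive = slope(x, y)

# Model that adjusts for z: residualize x and y on z, then regress.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"slope of y on x, z omitted:  {naive:.3f}")
print(f"slope of y on x, z adjusted: {adjusted:.3f}")
```

In this setup the naive slope comes out near cov(x, y) / var(x) = 2 / 2 = 1, even though the true effect of `x` on `y` is zero. With real data the confounder may be unmeasured, in which case this kind of adjustment is not available.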

323 questions
190 votes, 5 answers

How exactly does one “control for other variables”?

Here is the article that motivated this question: Does impatience make us fat? I liked this article, and it nicely demonstrates the concept of “controlling for other variables” (IQ, career, income, age, etc) in order to best isolate the true…
107 votes, 15 answers

US Election results 2016: What went wrong with prediction models?

First it was Brexit, now the US election. Many model predictions were off by a wide margin. Are there lessons to be learned here? As late as 4 pm PST yesterday, the betting markets were still favoring Hillary 4 to 1. I take it that the betting…
horaceT
51 votes, 11 answers

Famous easy to understand examples of a confounding variable invalidating a study

Are there any well-known statistical studies that were originally published and thought to be valid, but later had to be thrown out due to a confounding variable that wasn't taken into account? I'm looking for something easy to understand that…
29 votes, 3 answers

Basic Simpson's paradox

I have a question about something that my statistics teacher said about the following problem: There are two hospitals named Mercy and Hope in your town. You must choose one of these in which to undergo an operation. You decide to base your…
swiecki
25 votes, 3 answers

Can a confounding factor hide a possible causal relationship? (as opposed to find a spurious one)

I'm a rookie with statistics, and I'm struggling to understand this: it is well known that a confounding factor can cause a spurious association, leading to rejecting a true null hypothesis (i.e. due to the confounding factor Z, I could conclude…
19 votes, 5 answers

Why is controlling for too many variables considered harmful?

I am trying to understand the point of the second panel in the following xkcd comic: Specifically, how can one be misled by controlling too many confounding variables in one's models? Any pointers to what this criticism is called in the…
nsimplex
18 votes, 1 answer

Confounder - definition

According to M. Katz in his book Multivariable analysis (Section 1.2, page 6), "A confounder is associated with the risk factor and causally related to the outcome." Why must the confounder be causally related to the outcome? Would it be enough for…
marco
18 votes, 2 answers

Unconfoundedness in Rubin's Causal Model- Layman's explanation

When implementing Rubin's causal model, one of the (untestable) assumptions that we need is unconfoundedness, which means $$(Y(0),Y(1))\perp T|X$$ Where the LHS are the counterfactuals, the T is the treatment, and X are the covariates that we…
RayVelcoro
17 votes, 4 answers

Why does propensity score matching work for causal inference?

Propensity score matching is used to make causal inferences in observational studies (see the Rosenbaum / Rubin paper). What's the simple intuition behind why it works? In other words, why if we make sure the probability of participating in the…
max
15 votes, 4 answers

Confounding variables in machine learning predictions?

In classical statistics, a confounding variable is a critical concept, since it can distort our view of the relationship between the input variables and the outcome variable. Many forms of control and adjustment are sought in statistics to eliminate, avoid or…
15 votes, 3 answers

Do we really need to include "all relevant predictors?"

A basic assumption of using regression models for inference is that "all relevant predictors" have been included in the prediction equation. The rationale is that failure to include an important real-world factor leads to biased coefficients and…
ATJ
14 votes, 1 answer

Techniques for analyzing ratios

I am looking for advice and comments that deal with the analysis of ratios and rates. In the field in which I work analysis of ratios in particular is widespread but I have read a few papers that suggest this can be problematic, I am thinking…
12 votes, 3 answers

A potential confound in an experiment design

Overview of the question. Warning: This question requires a lot of set-up. Please bear with me. A colleague of mine and I are working on an experiment design. The design must work around a large number of constraints, which I will list below. I…
11 votes, 3 answers

What examples of lurking variables in controlled experiments are there in publications?

In this paper: "Lurking Variables: Some Examples", Brian L. Joiner, The American Statistician, Vol. 35, No. 4 (Nov. 1981), pp. 227-233, Brian Joiner claims that "randomization is not a panacea". This is contrary to common statements such as the one…
Flask
11 votes, 2 answers

Is it possible to have a variable that acts as both an effect modifier and a confounder?

Is it possible to have a variable that acts as both an effect (measurement) modifier and a confounder for a given pair of risk-outcome associations? I'm still a little unsure of the distinction. I've looked at graphical notation to help me…
user1447630