Questions tagged [survey-weights]

Survey weights are used when data are collected according to a probability sampling design with unequal probabilities of selection and/or response.

In survey sampling, the inferential goal is to generalize from the sample to the finite population. The original motivation for survey weights comes from the Horvitz-Thompson estimator of the population total: $$ t[y] = \sum_{i \in \mbox{units in sample}} \frac{y_i}{\pi_i} $$ where $\pi_i$ is the probability of selection of unit $i$. In this expression, $1/\pi_i$ can be interpreted as a weight attached to unit $i$, $w_i=\pi_i^{-1}$.
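As a minimal sketch of the estimator above, the following Python snippet computes $t[y]$ from sampled values and their selection probabilities. The function name `horvitz_thompson_total` and the toy numbers are illustrative assumptions, not taken from any particular survey or library.

```python
def horvitz_thompson_total(y, pi):
    """Estimate a population total as the sum of y_i / pi_i over sampled units.

    y  : observed values for the sampled units
    pi : selection probabilities of the same units (each in (0, 1])
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    total = 0.0
    for y_i, pi_i in zip(y, pi):
        if not 0.0 < pi_i <= 1.0:
            raise ValueError("selection probabilities must lie in (0, 1]")
        total += y_i / pi_i  # equivalent to w_i * y_i with w_i = 1 / pi_i
    return total


# Hypothetical sample of three units with unequal selection probabilities.
y = [10.0, 20.0, 30.0]
pi = [0.5, 0.25, 0.125]

weights = [1.0 / p for p in pi]            # w_i = 1 / pi_i
estimate = horvitz_thompson_total(y, pi)   # 10*2 + 20*4 + 30*8
print(weights, estimate)
```

Each weight can be read as the number of population units that the sampled unit stands in for, which is why units selected with low probability contribute more to the estimated total.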

In practice, survey weights also include corrections for nonresponse, lack of population coverage, and other corrections for imbalance between the sample and the population.
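One common correction of this kind is post-stratification, sketched below under simplifying assumptions: each respondent belongs to exactly one cell, and the population total for each cell is known. The function name `poststratify`, the cell labels, and the numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def poststratify(base_weights, cells, population_totals):
    """Rescale base weights so that they sum to known population totals per cell.

    base_weights      : design weights (e.g. inverse selection probabilities)
    cells             : cell label for each sampled unit
    population_totals : dict mapping cell label -> known population count
    """
    # Sum of base weights among sampled units in each cell.
    weighted_counts = defaultdict(float)
    for w, c in zip(base_weights, cells):
        weighted_counts[c] += w

    # Multiply each weight by (population total) / (weighted sample total).
    return [
        w * population_totals[c] / weighted_counts[c]
        for w, c in zip(base_weights, cells)
    ]


# Hypothetical sample in which the "old" cell is underrepresented.
base_weights = [10.0, 10.0, 10.0, 10.0]
cells = ["young", "young", "young", "old"]
population_totals = {"young": 20.0, "old": 20.0}

adjusted = poststratify(base_weights, cells, population_totals)
print(adjusted)
```

After the adjustment, the weights in each cell add up to that cell's population total, so the underrepresented cell receives larger weights. Production weighting typically also involves nonresponse modeling, raking over several margins, and weight trimming, none of which is shown here.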

References:

  1. Korn and Graubard (1995 JRSS-A)
  2. Korn and Graubard (1999 Wiley book)
  3. Heeringa, West and Berglund (2010 Chapman and Hall)
  4. Valliant, Dever and Kreuter (2013 Springer book)
  5. Kolenikov and Pitblado (2014 chapter in Wiley handbook)
  6. Pfeffermann (1996 SMMR)
  7. Lohr (2009 Cengage textbook)
  8. Lavallee and Beaumont (2015 invited article in SMIF)

163 questions
17
votes
2 answers

Two worlds collide: Using ML for complex survey data

I am stuck on a seemingly easy problem, but I haven't found a suitable solution for several weeks now. I have quite a lot of poll/survey data (tens of thousands of respondents, say 50k per dataset), coming from something I hope is called complexly…
kotrfa
  • 618
  • 1
  • 6
  • 15
10
votes
5 answers

What is calibration?

What does it mean to calibrate survey weights? Also, what are other definitions of calibration in statistics? I have heard it used in several contexts, particularly risk prediction (referring to whether the total number of predicted events in a…
AdamO
  • 52,330
  • 5
  • 104
  • 209
9
votes
3 answers

Using post-stratification weights in R survey package

I am analyzing a dataset that has a variable for post-stratification weights. As this is a complex survey, the plan is to use the R survey package. I have been reading its documentation and feel able to set up a survey design correctly. So far, so…
FabF
  • 121
  • 1
  • 8
9
votes
3 answers

Recommend references on survey sample weighting

Let's aim for some at an introductory level, some articles and some textbooks. Applied is more helpful, including R code is great. Thanks!
Michael Bishop
  • 2,171
  • 3
  • 21
  • 31
8
votes
1 answer

Variables for post-stratification weights?

What justifies the use of a variable for post-stratification? I am working with a survey of a non-profit's constituents, with 2500 responses out of a much larger sample and even larger population. I have many variables about the target…
Andrew
  • 656
  • 5
  • 11
8
votes
1 answer

In propensity score analysis, what are options to deal with very small or large propensities?

$\newcommand{\P}{\mathbb{P}}$I am concerned with observational data in which the treatment assignment can be explained exceedingly well. For example, a logistic regression of $$\P(A =1 |X) = (1+ \exp(-(X\beta)))^{-1}$$ where $A$ treatment…
tomka
  • 5,874
  • 3
  • 30
  • 71
7
votes
0 answers

When to use longitudinal (panel) weights vs cross-section weights in complex surveys

I'm currently working with a longitudinal dataset, the Kauffman Firm Survey. The survey tracks about 5000 firms from 2004 to 2009. Firms die out over the years. It has both cross-sectional weights and longitudinal weights. I've checked out…
Robert
  • 275
  • 3
  • 6
6
votes
1 answer

Are sampling weights necessary in logistic regression?

When should I use weights when performing a logistic regression? The weights I'm referring to are sampling weights from a survey. Or should I just use the unweighted data?
user22119
  • 599
  • 4
  • 10
6
votes
1 answer

How can I use Propensity Scores to adjust for survey non-response bias?

Say I estimate the probability that each member of my target population responds to a survey using propensity scores. I am having a hard time finding a clear explanation of how I can use the propensity scores to adjust a continuous outcome of…
6
votes
0 answers

Defining quantiles for complex survey samples

I am looking to accumulate a comprehensive list of definitions for quantiles under complex sampling that have been published or implemented in software. I'm not worrying about the separate problem of uncertainty estimation. Without the complication…
Thomas Lumley
  • 21,784
  • 1
  • 22
  • 73
5
votes
1 answer

Stratified survey calculations by hand and with survey package don't agree. Simulation results

Bounty info: I originally emailed Thomas Lumley at an old email address. He did respond to an email to his new address. Note: Long post (lots of code) I can’t seem to replicate the results of the survey function using very basic by-hand…
abalter
  • 770
  • 6
  • 18
5
votes
1 answer

What is the effect of using survey sample weights for a sub-sample?

If a sub-sample of the survey sample, selected based on certain demographic characteristics of the data (e.g. age, race etc.), is used, which means the sub-sample might not be representative of the population anymore, is it better to not use…
tvl
  • 61
  • 6
5
votes
2 answers

F-Test for Equality of Variances with Weighted Survey Data

I would like to use an F-Test for Equality of Variances on a variable to compare two groups. Normally, this would be done with an sdtest command in Stata or a var.test command in R. However, the data come from a multi-stage, stratified random sample…
coip
  • 283
  • 2
  • 13
4
votes
1 answer

Calibration of weights for market research survey

I have a stratified random sample based on a sampling frame formed from our CRM system's data. Now when I look at responses in different strata, they seem to differ. Some strata have much higher response rates. My survey variables are mostly…
Analyst
  • 2,527
  • 10
  • 11
4
votes
2 answers

How to estimate the (approximate) variance of the weighted mean?

Background: weighted mean. In the context of survey statistics, it often happens that a sample of respondents from a survey is assigned weights to adjust their answers to the general population. These weights are often fitted using the inverse of estimated…