How to use weighted data in linear modeling?

Question

I have data from several studies. Each study reports the mean for its sub-sample together with the sample size n, so n is a frequency. There is one categorical predictor. The dependent variable is a percentage (the percentage issue has a separate question).

I used n/sum(n) as the weights and made weighted histograms and density plots as well as some linear models (IIRC, some of the functions want probabilities for weights).

Am I correct in using this weight for histograms and densities?
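
To make the histogram part concrete, here is a minimal sketch in Python/NumPy. The study means, sample sizes, and bin edges are made-up values, not my real data. It checks whether a density-scaled histogram changes when the weights are n versus n/sum(n); with `density=True`, NumPy normalizes by the total weight, so the two should agree.

```python
import numpy as np

# Hypothetical study-level data: one mean (a percentage) and one sample size per study.
means = np.array([12.0, 18.5, 25.0, 31.0, 44.5, 47.0])
n = np.array([20, 35, 50, 15, 80, 40])

bins = np.linspace(0, 50, 6)  # five bins of width 10

# Density histogram weighted by the raw counts n.
dens_raw, _ = np.histogram(means, bins=bins, weights=n, density=True)

# Density histogram weighted by the proportions n / sum(n).
dens_prop, _ = np.histogram(means, bins=bins, weights=n / n.sum(), density=True)

# True if rescaling the weights leaves the density histogram unchanged.
same_density = np.allclose(dens_raw, dens_prop)
print(same_density)
```

For a density-scaled histogram, only the relative sizes of the weights seem to matter, not whether they sum to 1.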

I see these questions where many different types of "weights" are used:

uses n^(3/4) and n

uses 1/n and 1/n^2

uses n

uses 1/n and 1/sqrt(n) - this is trying to deal with heteroscedastic data

These different weights are confusing.

My data do not seem to be normally distributed, and the variances within the groups defined by the factor are quite different.

I can sort of see using 1/sqrt(n) to deal with heteroscedasticity, but given that my n is a frequency, it seems like we would be throwing information away.
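
A short derivation may help compare these choices. It rests on assumptions that are not stated in my data description: each study mean averages n_i independent observations, and those observations share a common variance (or a common variance within each group g). Under those assumptions, inverse-variance weighting points to weights proportional to n rather than 1/n or 1/sqrt(n).

```latex
% Assumption: study i averages n_i independent observations
% with common variance \sigma^2.
\[
\operatorname{Var}(\bar{y}_i) = \frac{\sigma^2}{n_i}
\quad\Longrightarrow\quad
w_i \propto \frac{1}{\operatorname{Var}(\bar{y}_i)} \propto n_i
\]
% If instead the variance differs by group g(i):
\[
w_i \propto \frac{n_i}{\sigma_{g(i)}^2}
\]
```

If the independence or common-variance assumption fails, these weights would need to be revisited.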

So is using n/sum(n) for the weights a reasonable thing to do with linear models?
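
As a sanity check on the scaling question, here is a minimal weighted-least-squares sketch in NumPy. The data are made up and `wls` is a hypothetical helper, not a function from any library. It treats the weights as relative precision weights, estimates the residual variance from the weighted residuals, and compares the fit with weights n against the fit with weights n/sum(n).

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares; returns coefficients and their standard errors."""
    sw = np.sqrt(w)
    Xw = X * sw[:, None]          # scale each row by sqrt(weight)
    yw = y * sw
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta        # weighted residuals
    dof = X.shape[0] - X.shape[1] # number of rows minus number of coefficients
    sigma2 = resid @ resid / dof  # estimated residual variance
    cov = sigma2 * np.linalg.inv(Xw.T @ Xw)
    return beta, np.sqrt(np.diag(cov))

# Hypothetical data: six study means, one two-level factor (dummy-coded), study sizes n.
y = np.array([12.0, 18.5, 25.0, 31.0, 44.5, 47.0])
group = np.array([0, 0, 0, 1, 1, 1])
n = np.array([20.0, 35.0, 50.0, 15.0, 80.0, 40.0])
X = np.column_stack([np.ones_like(y), group])

beta_n, se_n = wls(X, y, n)            # weights = n
beta_p, se_p = wls(X, y, n / n.sum())  # weights = n / sum(n)

print(np.allclose(beta_n, beta_p), np.allclose(se_n, se_p))
```

In this sketch the coefficients and standard errors do not depend on the overall scale of the weights, because the estimated residual variance rescales to compensate. Software that treats weights as literal frequencies may behave differently.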

Do I need to use this trick (which uses n) to get the correct p-value?
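
The linked trick is not reproduced here, so the following is only a related illustration under my own assumptions. It uses made-up data and a hypothetical `fit` helper to contrast two readings of n: as a precision weight on 6 rows, and as a literal frequency where each study mean is repeated n times. The coefficients agree, but the residual degrees of freedom differ, and so do the standard errors (and hence the p-values).

```python
import numpy as np

def fit(X, y, w):
    """Weighted least squares; returns coefficients, standard errors, residual dof."""
    sw = np.sqrt(w)
    Xw, yw = X * sw[:, None], y * sw
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    dof = X.shape[0] - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(Xw.T @ Xw)
    return beta, np.sqrt(np.diag(cov)), dof

# Hypothetical data: six study means, a two-level factor, integer study sizes.
y = np.array([12.0, 18.5, 25.0, 31.0, 44.5, 47.0])
group = np.array([0, 0, 0, 1, 1, 1])
n = np.array([20, 35, 50, 15, 80, 40])
X = np.column_stack([np.ones_like(y), group])

# Reading 1: n as a weight on the 6 study-level rows.
beta_w, se_w, dof_w = fit(X, y, n.astype(float))

# Reading 2: n as a frequency, i.e. each row repeated n times with unit weights.
X_rep = np.repeat(X, n, axis=0)
y_rep = np.repeat(y, n)
beta_rep, se_rep, dof_rep = fit(X_rep, y_rep, np.ones(len(y_rep)))

print(np.allclose(beta_w, beta_rep))  # do the coefficients agree?
print(dof_w, dof_rep)                 # residual degrees of freedom under each reading
print(se_w / se_rep)                  # ratio of standard errors
```

Repeating the study means treats them as if they were individual observations, which ignores the within-study spread, so the smaller standard errors from the second reading should be interpreted with caution.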

