I have data from several studies. Each study reports the mean for its subsample along with the sample size n, so n is a frequency. There is one categorical predictor. The dependent variable is a percentage (the percentage issue has a separate question).
I used n/sum(n) for the weights and made weighted histograms and density plots, as well as some linear models (if I recall correctly, some of the functions want probabilities for weights).
Am I correct in using this weight for histograms and densities?
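To make the histogram part concrete, here is a minimal sketch in Python with NumPy. The study means, sample sizes, and bin edges are made up for illustration and are not my real data. It compares a density histogram weighted by raw n with one weighted by n/sum(n).

```python
import numpy as np

# Hypothetical study-level data: one mean (a percentage) and one sample size per study.
study_mean = np.array([12.0, 18.5, 22.0, 35.0, 41.5, 47.0])
n = np.array([30, 120, 45, 200, 60, 15])

bins = np.linspace(0, 50, 6)  # five bins of width 10 (arbitrary choice)

# Weights normalised to sum to 1, as described above.
w_prop = n / n.sum()

# Density histograms: bar areas are normalised to integrate to 1,
# so any constant rescaling of the weights cancels out.
dens_raw, _ = np.histogram(study_mean, bins=bins, weights=n, density=True)
dens_prop, _ = np.histogram(study_mean, bins=bins, weights=w_prop, density=True)

# Without density=True, the normalised weights give the share of the
# total sample size that falls in each bin.
mass_prop, _ = np.histogram(study_mean, bins=bins, weights=w_prop)

print(np.allclose(dens_raw, dens_prop))
print(mass_prop.sum())
```

In this sketch the two density histograms coincide, so for a normalised histogram the choice between n and n/sum(n) only matters for functions that insist on weights summing to 1.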
I see these questions where many different types of "weights" are used:
- One uses 1/n and 1/sqrt(n); this is trying to deal with heteroscedastic data.
These different weights are confusing.
My data do not seem to be normally distributed, and they include a factor whose groups have quite different variances.
I can sort of see using 1/sqrt(n) to deal with heteroscedasticity, but given that my n is a frequency, it seems like we would be throwing information away.
So is using n/sum(n) for the weights a reasonable thing to do with linear models?
Do I need to use this trick (which uses n) to get the correct p-value?
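To illustrate the linear-model question, here is a minimal sketch of weighted least squares written directly in NumPy. The data, the two-level dummy coding, and the helper name `wls` are all hypothetical, and the weights are treated as precision weights (one row per study), which may not match how a given package interprets them. It compares weights n, n/sum(n), and 1/n.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: returns coefficients and their standard errors.

    Hypothetical helper; weights are treated as precision weights,
    with the residual variance estimated from the weighted residuals.
    """
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = (w * resid**2).sum() / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(XtWX)))
    return beta, se

# Hypothetical data: six study means, sample sizes, and a two-level factor.
y = np.array([12.0, 18.5, 22.0, 35.0, 41.5, 47.0])
n = np.array([30.0, 120.0, 45.0, 200.0, 60.0, 15.0])
group = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(group), group])  # intercept + dummy

beta_n, se_n = wls(X, y, n)                # weights = n
beta_p, se_p = wls(X, y, n / n.sum())      # weights = n / sum(n)
beta_inv, se_inv = wls(X, y, 1.0 / n)      # weights = 1 / n

print(beta_n, se_n)
print(beta_p, se_p)
print(beta_inv, se_inv)
```

Under this formulation, multiplying all weights by a constant leaves the coefficients and standard errors (and hence the t statistics) unchanged, so n and n/sum(n) agree, while 1/n gives a different fit. Software that treats weights as frequencies, i.e. as replicated rows that change the degrees of freedom, would behave differently.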