Using variance of y variable as weights for weighted linear regression when both x and y variables contain negative values?

Question

I am currently dealing with a weighted linear regression problem in the context of an instrument calibration in analytical chemistry. Let's assume I have a response variable y and a predictor variable x. Both contain several negative values. I perform a linear regression and notice that the assumption of homoscedasticity is not met (the residuals increase with increasing fitted values). Normally, in such a case, I would try pragmatic weights such as 1/x^0.5, 1/x, 1/x^2, 1/y^0.5, 1/y and 1/y^2 to deal with the heteroscedasticity. I consider these weights pragmatic because in analytical chemistry there are usually not enough replicates at each level (usually at most 2) to apply variance-based (S^2) weighting factors (e.g. 1/S(y)^2), which usually solve the heteroscedasticity problem best. Unfortunately, weighted linear regression cannot use negative weights, so none of the x- and y-based weighting factors are applicable here. Now let's assume that I have enough replicates to make reliable variance estimates for every level. Would it then be valid to use these variances for weighting?
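To make the variance-based option concrete, here is a minimal sketch of weighting by 1/S(y)^2 estimated from replicates. The data, the helper name `wls_fit` and the three replicates per level are all assumptions made for illustration, not part of my actual calibration. The sample variance is never negative, so these weights stay positive even when x and y are negative (a level with zero sample variance would still need special handling).

```python
import numpy as np

def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w  # scale each observation's column by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)

# Hypothetical calibration levels, including negative x values.
levels = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])

# Hypothetical replicates: true line y = -1 + 2x, spread growing with x.
sd = 0.1 + 0.1 * (levels - levels.min())
offsets = np.array([-1.0, 0.0, 1.0])  # three replicates per level
y_rep = (-1.0 + 2.0 * levels)[:, None] + sd[:, None] * offsets[None, :]

# Per-level sample variance S(y)^2 and the resulting weights 1/S(y)^2.
s2 = y_rep.var(axis=1, ddof=1)
w_level = 1.0 / s2

# Expand to one entry per observation and fit.
n_rep = y_rep.shape[1]
x_all = np.repeat(levels, n_rep)
y_all = y_rep.ravel()
w_all = np.repeat(w_level, n_rep)

a, b = wls_fit(x_all, y_all, w_all)
print(f"intercept = {a:.3f}, slope = {b:.3f}")
```

The weights depend only on the spread of the replicates at each level, not on the sign or magnitude of x or y.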

Thanks a lot in advance. XEZ

The weights you proposed don't make any sense. If you were to add 100 to each y value and 100 to each x value, the weights would change dramatically, but the slope and its standard error should not change. The weights are affected by things the regression estimates are not affected by, so they would be completely inappropriate. — Noah, Jun 17 '21 at 18:34
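A small sketch of the shift argument, using made-up numbers (the arrays below are assumptions for illustration only): adding 100 to every x and y leaves the unweighted slope unchanged, while the relative spread of the 1/x weights collapses.

```python
import numpy as np

# Hypothetical data, all positive so that 1/x weights are defined.
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([1.1, 1.9, 4.3, 7.6, 16.4])

# Ordinary least-squares slope before and after shifting both variables.
slope = np.polyfit(x, y, 1)[0]
slope_shifted = np.polyfit(x + 100.0, y + 100.0, 1)[0]

# Ratio of largest to smallest 1/x weight before and after the shift.
w = 1.0 / x
w_shifted = 1.0 / (x + 100.0)
ratio = w.max() / w.min()
ratio_shifted = w_shifted.max() / w_shifted.min()

print(slope, slope_shifted)
print(ratio, ratio_shifted)
```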
Thank you for your comment. However, I do not agree with it. The application of the above weighting factors is common in analytical chemistry. The usefulness of these weights has already been proven in many studies. The background is that, as a rule, the error increases with increasing concentration (x-values). This is because higher concentrations have a greater influence on the model. References: [link](https://doi.org/10.1021/ac5018265); [link](https://doi.org/10.1016/j.cca.2009.11.021); [link](https://doi.org/10.1016/S1570-0232(02)00244-1) — XEZ, Jun 18 '21 at 06:41
Interesting, I was not aware of this. It does seem very specific to the problem where all x values are positive. One alternative would be to use weights with the absolute value of x, e.g., 1/|x|^.5. This way magnitude is still preserved and negative values can be handled. — Noah, Jun 18 '21 at 06:55
I had the same idea at the beginning. The problem is that when both positive and negative x values (or y values, for 1/y weights) exist, the weights no longer change monotonically with increasing x. E.g., for the x series -2, -1, 1, 2 the 1/|x| weights would be 0.5, 1, 1, 0.5. Thus very low x values would be penalized as much as very high x values. — XEZ, Jun 18 '21 at 07:11
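The 1/|x| weights for that example series can be checked with a few lines (a sketch, using only the four x values mentioned in the comment):

```python
import numpy as np

x = np.array([-2.0, -1.0, 1.0, 2.0])

# Absolute-value weights: symmetric around zero, so not monotone in x.
w = 1.0 / np.abs(x)
print(w.tolist())  # [0.5, 1.0, 1.0, 0.5]
```

The lowest and highest x values receive the same weight, which is the symmetry the comment objects to.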

