I am doing weighted linear regression. I decided to weight my data points by the number of genes per chromosome per species. The graph I made on that basis wasn't much different from the unweighted linear regression. However, when I took the square root of the number of genes per chromosome per species and used that as my weight, the graph differed from the unweighted linear regression. Can someone tell me why I got such a difference? And is taking a square root a better way to weight my data points? If so, why? Could you give me a statistical reason? I ask because I read somewhere that taking the square root for weights is common practice.
1 Answer
I do not think that there is a statistical reason for what you happened to observe in this case. Square roots or logarithms are commonly taken to make variables with skewed distributions more symmetric. I don't know of any other way in which square-root transformations are useful.

Michael R. Chernick
The square-root transformation has a long history of being used to try to stabilize variance in count data, which might be related to its use in gene-number data. See [this discussion](https://stats.stackexchange.com/q/46418/28500). I still remember that from my long-lost copy of Snedecor and Cochran. – EdM Aug 14 '20 at 15:31
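
The variance-stabilizing property mentioned in the comment can be checked numerically. The following is a minimal sketch, assuming the counts follow a Poisson distribution; that is an assumption made purely for illustration and says nothing about how gene numbers are actually distributed. The function names and the means 10, 40, and 160 are hypothetical choices. It computes the exact variance of a raw count and of its square root by summing over the Poisson probability mass function.

```python
import math


def poisson_moments(lam, transform):
    """Exact mean and variance of transform(X) for X ~ Poisson(lam).

    Sums over the probability mass function up to a cutoff far in the
    upper tail, so the truncation error is negligible.
    """
    k_max = int(lam + 20 * math.sqrt(lam) + 50)
    p = math.exp(-lam)  # P(X = 0)
    mean = 0.0
    second = 0.0
    for k in range(k_max + 1):
        y = transform(k)
        mean += p * y
        second += p * y * y
        p *= lam / (k + 1)  # recurrence: P(X = k + 1) from P(X = k)
    return mean, second - mean ** 2


def variance_table(means=(10, 40, 160)):
    """Return (mean, Var[X], Var[sqrt(X)]) for each illustrative Poisson mean."""
    rows = []
    for lam in means:
        _, var_raw = poisson_moments(lam, float)
        _, var_sqrt = poisson_moments(lam, math.sqrt)
        rows.append((lam, var_raw, var_sqrt))
    return rows


if __name__ == "__main__":
    for lam, var_raw, var_sqrt in variance_table():
        print(f"mean={lam:>4}  Var[X]={var_raw:8.3f}  Var[sqrt(X)]={var_sqrt:.4f}")
```

Under this assumption the variance of the raw count grows in step with its mean, while the variance of the square root stays close to 1/4 across the whole range. This illustrates the transformation of a count variable; whether it also justifies square-root weights in a regression depends on how the error variance of each data point relates to its gene count.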