Skewed continuous data & Linear Regression

Question

I'm new to statistics.

I created a data set of around 10,000 observations and want to examine the relationship between a variable A, which takes continuous values between 0 and 10, and a variable B, which takes continuous values between 0 and 1.

However, I noticed that the distribution of my independent variable is heavily left-skewed: in over half of the observations it is clumped at 9-10. This is due to the nature of the data; I can't collect more observations.

I'm at a bit of a loss how to proceed. Here's what my naivete came up with:

- bin the independent variable into several classes and use under/oversampling techniques
- resample from the observations in a specific ratio to get to a more normal distribution

Would this make sense? Are there other ways to deal with this?

Why do you need to do anything? Most procedures and models for examining such relationships make no requirements of the independent variables, except that they be nonconstant. — whuber, Feb 22 '22 at 22:17
It may help you to read a bit more about what assumptions linear regression *does* require, e.g. https://stats.stackexchange.com/questions/16381/ — Silverfish, Feb 22 '22 at 23:46
If your goal is just to quantify the strength of the relationship and find its direction, calculate the Spearman correlation coefficient. — Daniel Dostal, Feb 22 '22 at 23:56
@hachiko If, in your dataset, a variable only takes one value, then it is constant. If it takes at least two values, it is non-constant. If you want to analyse how that variable is related to other variables, it's important for it to be non-constant: eg if you wanted to see how family size is related to educational attainment, but every participant in your study is an only child, you simply have no idea whether their grades would be better/worse if they came from a larger family! No clever algorithm or statistical methodology can tell you, because the data does not speak of this relationship — Silverfish, Feb 25 '22 at 02:22
