I'm very new to statistics and I want to be able to interpret the following regression data. What does y=.33 - .0000007x mean? Also, what is r^2? Please let me know what y, x, and r^2 tell us.
1 Answer
The equation
$y = 0.33 - 0.000007 * x$
where * denotes multiplication, describes the line that you have fitted through your data. A line is typically defined by two features: 1) where it intersects the y-axis (i.e. the value of y when x = 0), and 2) its slope (i.e. rise over run).
The first number in the equation (0.33) tells you where the line hits the y-axis, i.e. the value of y when x is zero.
The second part of the equation describes the change you expect in y for every one-unit increase in x: y will decrease (because the coefficient is negative) by 0.000007 for every one-unit increase in x. The equation effectively lets you plug in a value of x and work out the y value you would expect.
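As a rough sketch of what "plugging in a value of x" looks like, the snippet below evaluates the fitted line at a few points. The coefficients are the ones written in the equation above; the function name `predict` and the x values are only illustrative.

```python
# Coefficients as written in the fitted equation above.
INTERCEPT = 0.33      # value of y when x = 0
SLOPE = -0.000007     # change in y for a one-unit increase in x


def predict(x):
    """Return the y value the fitted line expects for a given x."""
    return INTERCEPT + SLOPE * x


# Illustrative x values (not taken from the asker's data).
for x in (0, 1, 10_000):
    print(f"x = {x:>6}: expected y = {predict(x):.6f}")
```

Note how small the slope is: x has to move by thousands of units before the expected y changes in the second decimal place.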
$R^2$, however, tells you about the strength of the linear relationship between the two variables. For a line fitted with an intercept, as here, $R^2$ can take any value between 0 and 1. A value of 1 means every point falls exactly on the fitted line. A value of 0 means the line explains none of the variation in y, i.e. there is no evidence of a linear relationship between the two variables (some other kind of relationship could still exist). It can also be interpreted, to a degree, as the predictive capacity of your model. A high $R^2$ suggests that x is useful for predicting y - for instance an $R^2$ of 0.9 means that your model captures 90% of the variation in your data.
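To make "share of the variation captured" concrete, here is a minimal sketch that computes $R^2$ as one minus the ratio of the residual sum of squares to the total sum of squares, and compares it with the squared Pearson correlation. The data are made up for illustration (they are not the asker's data), and the helper names `fit_line`, `r_squared` and `pearson_r` are my own.

```python
from statistics import mean

# Made-up illustrative data, NOT the asker's data.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(xs, ys, intercept, slope):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


intercept, slope = fit_line(xs, ys)
r2 = r_squared(xs, ys, intercept, slope)
r = pearson_r(xs, ys)
print(f"R^2 = {r2:.4f}, squared correlation = {r * r:.4f}")
```

For a least-squares line with an intercept the two printed numbers agree. Because $R^2$ is defined through sums of squares rather than as a literal square, a line that fits worse than the flat line at the mean of y scores below zero, e.g. `r_squared(xs, ys, 0.0, -1.0)`.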
However, your value is very close to zero and suggests that the model only describes 0.07% of the variation in your data. It probably does not help that you have (I think) fitted the wrong kind of regression line to your data. I suspect you have fitted what is called a simple linear regression between these two variables, which assumes a response that can theoretically range between negative infinity and infinity. Your response, since it is a percentage, can only range between 0 and 100, meaning that you really need a different kind of model - probably a binomial regression. Here is a link that goes into this further if you want to find out more.
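A small sketch of why the bounded response matters: the straight line, using the coefficients as written in this answer, eventually predicts a negative value, which a percentage cannot take. I am assuming a logit link for the binomial alternative, which is a common choice but not the only one. The linear-predictor values passed to `inverse_logit` are arbitrary and are not fitted coefficients.

```python
import math

# Coefficients as written in the fitted equation in this answer.
INTERCEPT = 0.33
SLOPE = -0.000007


def linear_prediction(x):
    """Straight-line prediction; nothing keeps it inside the valid range."""
    return INTERCEPT + SLOPE * x


def inverse_logit(eta):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# x at which the straight line crosses y = 0.
x_zero = -INTERCEPT / SLOPE
print(f"line reaches 0 at x = {x_zero:.0f}")
print(f"line at x = 100000: {linear_prediction(100_000):.2f}")

# Arbitrary linear-predictor values, purely to show the bounded output.
for eta in (-5, 0, 5):
    print(f"inverse_logit({eta}) = {inverse_logit(eta):.4f}")
```

A model fitted on the logit scale would have its own coefficients; the point is only that its predictions stay within the valid range however far x moves.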

-
Your description of $R^2$ is problematic and is really more like the usual Pearson correlation between two variables. My suspicion is that I could get $R^2< – Dave Nov 20 '19 at 04:43
-
If R^2 is actually *squared*, how could it be negative? – James Phillips Nov 20 '19 at 12:08
-
@James Because $R^2$ is not defined as the square of something, it actually can be negative in some circumstances. See https://stats.stackexchange.com/questions/183265/what-does-negative-r-squared-mean. However, in this case (where an intercept is included in the model) one can mathematically prove that $R^2$ actually is the square of the correlation coefficient, whence it cannot be negative. Andre: you have confused $R^2$ with the correlation coefficient in your explanation. – whuber Nov 20 '19 at 16:11
-
@whuber your kind explanation makes sense. – James Phillips Nov 20 '19 at 18:08
-
@Whuber - My mistake! I will adjust accordingly when I am next at a PC – André.B Nov 20 '19 at 18:16
-
Thank you. Might I suggest adding some nuance when you get a chance? In particular, your statement "value of 0 means that there is no relationship between the two variables" could easily be misconstrued, because it's clear you are referring to *evidence* of a *particular kind* of relationship and therefore you cannot justify concluding "no relationship" (whatsoever). – whuber Nov 20 '19 at 19:46
-
Good point! I have made an amendment, albeit a small one. Is it less ambiguous now? – André.B Nov 20 '19 at 19:48