
I built a multiple linear regression model that uses 3 predictors to determine the response (temperature in K). The model gives an $R^2$ of 0.234, which suggests the model is a failure. But the RMSE works out to 0.935 K, which seems very good.

So my model has a very good RMSE but a very bad $R^2$. My question is: what conclusions can be drawn about this model in terms of its accuracy?

Lostman
    Why do you believe your $R^2$ is bad and your RMSE is good? [Is my model any good, based on the diagnostic metric (R2 / AUC / accuracy / etc) value?](https://stats.stackexchange.com/q/414349/1352) – Stephan Kolassa Jul 16 '19 at 07:06

1 Answer


First, an $R^2 = 0.234$ isn't necessarily bad, and an RMSE of $0.935$ K isn't necessarily good. What counts as a good $R^2$ depends on the field of study, and what counts as a good RMSE depends on the scale of measurement.

Second, the two measure different things. $R^2$ measures how much of the variation in the dependent variable is accounted for by the model. RMSE measures how close the predicted values are to the actual ones. The code below fits some models with varying $R^2$ and RMSE. (Output after a `#` is shown as a comment.)

install.packages("Metrics")  # if not already installed
library(Metrics)

set.seed(1234)

x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)

y1 <- 10*x1 + 10*x2 + 10*x3 + rnorm(1000, 0, 5000)  # strong signal, huge noise
y2 <- 10*x1 + 10*x2 + 10*x3 + rnorm(1000, 0, 5)     # strong signal, modest noise
y3 <- x1/10 + x2/10 + x3/10 + rnorm(1000, 0, 5000)  # tiny signal, huge noise
y4 <- rnorm(1000, 0, 1)                              # pure noise

m1 <- lm(y1 ~ x1 + x2 + x3)
summary(m1)                 # R2 = 0.000
rmse(y1, m1$fitted.values)  # ~5000: low R2, huge RMSE

m2 <- lm(y2 ~ x1 + x2 + x3)
summary(m2)                 # R2 = 0.93
rmse(y2, m2$fitted.values)  # ~5: high R2, moderate RMSE

m3 <- lm(y3 ~ x1 + x2 + x3)
summary(m3)                 # R2 = 0.000
rmse(y3, m3$fitted.values)  # ~5000: low R2, huge RMSE

m4 <- lm(y4 ~ x1 + x2 + x3)
summary(m4)                 # R2 = 0.001
rmse(y4, m4$fitted.values)  # ~1: low R2, small RMSE
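The two quantities are also linked algebraically through the variance of the response, which lets you read something useful out of the question's numbers. Since $R^2 = 1 - \mathrm{RSS}/\mathrm{TSS}$ and $\mathrm{RMSE} = \sqrt{\mathrm{RSS}/n}$, using the in-sample variance $\widehat{\operatorname{Var}}(y) = \mathrm{TSS}/n$ we get

$$\mathrm{RMSE}^2 = (1 - R^2)\,\widehat{\operatorname{Var}}(y).$$

Plugging in your values, $0.935^2 = (1 - 0.234)\,\widehat{\operatorname{Var}}(y)$ gives $\operatorname{sd}(y) \approx 1.07$ K. In other words, your temperatures barely vary to begin with, so even a model that explains little of that variance still predicts within about a degree, which is the pattern `m4` illustrates.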
Peter Flom