I am using Random Forest to predict the material removal rate (MRR), but the predictions have been quite far off the mark. Even linear regression gave a much better result. I don't know where I'm going wrong. Below is my code in R:
data <- structure(list(A = c(50L, 50L, 50L, 50L, 50L, 60L, 60L, 60L,
60L, 60L, 70L, 70L, 70L, 70L, 70L, 80L, 80L, 80L, 80L, 80L, 90L,
90L, 90L, 90L, 90L), B = c(3L, 5L, 7L, 9L, 11L, 3L, 5L, 7L, 9L,
11L, 3L, 5L, 7L, 9L, 11L, 3L, 5L, 7L, 9L, 11L, 3L, 5L, 7L, 9L,
11L), C = c(100L, 200L, 300L, 400L, 500L, 200L, 300L, 400L, 500L,
100L, 300L, 400L, 500L, 100L, 200L, 400L, 500L, 100L, 200L, 300L,
500L, 100L, 200L, 300L, 400L), D = c(65L, 70L, 75L, 80L, 85L,
75L, 80L, 85L, 65L, 70L, 85L, 65L, 70L, 75L, 80L, 70L, 75L, 80L,
85L, 65L, 80L, 85L, 65L, 70L, 75L), E = c(0.2, 0.3, 0.4, 0.5,
0.6, 0.5, 0.6, 0.2, 0.3, 0.4, 0.3, 0.4, 0.5, 0.6, 0.2, 0.6, 0.2,
0.3, 0.4, 0.5, 0.4, 0.5, 0.6, 0.2, 0.3), MRR = c(8.926014, 14.10501,
38.40095, 48.49642, 88.21002, 4.892601, 15.179, 26.92124, 38.78282,
89.16468, 5.298329, 10.04773, 18.30549, 49.21241, 79.57041, 2.362768,
4.868735, 22.52983, 44.8926, 49.06921, 1.312649, 7.207637, 18.61575,
25.1074, 48.01909)), class = "data.frame", row.names = c(NA,
-25L))
# Split the data into training (70%) and test (30%) sets
library(caTools)
set.seed(123)
split <- sample.split(data$MRR, SplitRatio = 0.7)
training_set <- subset(data, split == TRUE)
test_set <- subset(data, split == FALSE)
# Build the random forest model and predict on the test set
library(randomForest)
set.seed(123)
rforest <- randomForest(x = training_set[-6],  # predictors A-E (column 6 is MRR)
                        y = training_set$MRR,
                        ntree = 500)
pred_rforest <- predict(rforest, test_set[, 1:5])
# Also build a decision tree model for comparison
library(rpart)
dtree <- rpart(formula = MRR ~ .,
               data = training_set,
               control = rpart.control(minsplit = 1))
pred_dtree <- predict(dtree, test_set[, 1:5])
# Check the accuracy with MAPE(y_pred, y_true); lower is better
library(MLmetrics)
MAPE(pred_dtree, test_set[, 6])
MAPE(pred_rforest, test_set[, 6])
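For comparison, the linear regression baseline mentioned at the top can be evaluated on the same split. This is only a minimal sketch, assuming the linear model uses all five predictors. The helper name `lm_baseline_mape` and its `target` argument are made up here for illustration, and the MAPE is computed by hand as the mean of |actual - predicted| / |actual| rather than through MLmetrics:

```r
# Hypothetical helper (not part of the original code): fit a linear model on
# the training rows and return the MAPE on the test rows.
lm_baseline_mape <- function(train, test, target = "MRR") {
  # Regress the target on every other column of the training data
  fit <- lm(as.formula(paste(target, "~ .")), data = train)

  # Predict the held-out rows
  pred <- predict(fit, newdata = test)
  actual <- test[[target]]

  # Mean absolute percentage error, as a fraction (0.25 means 25%)
  mean(abs((actual - pred) / actual))
}
```

Calling `lm_baseline_mape(training_set, test_set)` after the split above returns a value on the same scale as the two `MAPE()` calls, so the three models can be compared directly.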
Both models gave very poor MAPE values on the test set.
Any help would mean a lot.