In machine learning, many algorithms need the input features to be scaled in order to perform well. This is my code:
# ensure results are repeatable
set.seed(7)
# load the library
library(caret)
# load the dataset
data(iris)
head(iris)
X=scale(iris[,-5])
X=data.frame(X)
head(X)
y=iris[,5]
y=data.frame(y)
head(y)
X=cbind(X,y)
# prepare training scheme
control <- trainControl(method="repeatedcv", number=5, repeats=1)
# train the model
model <- train(y~., data=X, method="svmLinear2", trControl=control, tuneLength=5)
# summarize the model
print(model)
#saving model
save(model, file="model.Rdata")
#loading model
supmod<-load("model.Rdata")
#new data
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 4.2 3.2 1.7 0.23
new<-c(4.2,3.2,1.7,0.23)
pre<-predict(supmod,new)
# I don't know how to predict with this model on unseen data
I have two questions about the code above: one about scaling the new data, and one about an error that occurs when I pass the new data to the loaded model.
The raw (unscaled) iris feature data looks like this:
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 5.1 3.5 1.4 0.2
2 4.9 3.0 1.4 0.2
3 4.7 3.2 1.3 0.2
4 4.6 3.1 1.5 0.2
5 5.0 3.6 1.4 0.2
6 5.4 3.9 1.7 0.4
Before passing the data to the SVM algorithm it has to be scaled. I use scale() for this, and the result looks like this:
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 -0.8976739 1.01560199 -1.335752 -1.311052
2 -1.1392005 -0.13153881 -1.335752 -1.311052
3 -1.3807271 0.32731751 -1.392399 -1.311052
4 -1.5014904 0.09788935 -1.279104 -1.311052
5 -1.0184372 1.24503015 -1.335752 -1.311052
6 -0.5353840 1.93331463 -1.165809 -1.048667
It is this scaled data that I use to train and test the model. Let's say I have successfully trained the model and want to use it to predict new, unseen data (e.g. this one row):
Sepal.Length Sepal.Width Petal.Length Petal.Width
4.2 3.2 1.7 0.23
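To make the scaling question concrete, here is a minimal sketch of what reusing the training scaling on this row could look like. It relies on the fact that scale() stores the column means and standard deviations it used as the attributes "scaled:center" and "scaled:scale". The variable names (`scaled`, `center`, `spread`, `new_row`, `new_scaled`) are only illustrative and are not part of the code above.

```r
# Standardize the training features; scale() records the column means and
# standard deviations it used as attributes on the returned matrix.
data(iris)
scaled <- scale(iris[, -5])
center <- attr(scaled, "scaled:center")  # per-column training means
spread <- attr(scaled, "scaled:scale")   # per-column training standard deviations

# The new, unscaled observation as a one-row data frame whose column names
# match the training features.
new_row <- data.frame(Sepal.Length = 4.2, Sepal.Width = 3.2,
                      Petal.Length = 1.7, Petal.Width = 0.23)

# Apply the *training* parameters to the new row. Scaling the row with its
# own mean and sd would put it on a different scale than the training data.
new_scaled <- as.data.frame(scale(new_row, center = center, scale = spread))
print(new_scaled)
```

As far as I understand, caret's train() can also take preProcess = c("center", "scale"), in which case the scaling estimated on the training data is stored with the model and applied by predict() to new data, so the raw row could be passed directly. I have not used that option in the code above.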
- Do I need to scale this new data, or can I pass it directly to my model?
- The second question is about a coding error. The call
predict(supmod,new)
returns this error:
Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "character"
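The class "character" in the message is consistent with how load() behaves: it restores the saved objects under their original names and returns only those names as a character vector. The sketch below reproduces this with a stand-in model. It uses a base R lm fit instead of the caret SVM so that it runs without extra packages, and the temporary file path and the variable `loaded_names` are hypothetical names chosen for the illustration.

```r
# Stand-in model (base R lm, NOT the caret SVM from the question) so the
# sketch runs without extra packages.
data(iris)
model <- lm(Petal.Width ~ Petal.Length, data = iris)

# Hypothetical temporary file path used only for this illustration.
path <- file.path(tempdir(), "model.Rdata")
save(model, file = path)
rm(model)

# load() restores objects under their saved names and returns those names,
# not the objects themselves.
loaded_names <- load(path)
print(loaded_names)         # the name of the restored object
print(class(loaded_names))  # a character vector, which predict() cannot use

# The restored object is available again as `model`; get() fetches it by name.
supmod <- get(loaded_names[1])

# predict() expects the new data as a data frame whose column names match
# the predictors used in training, not a bare numeric vector.
new <- data.frame(Petal.Length = 1.7)
pre <- predict(supmod, newdata = new)
print(pre)
```

With the caret model, the same pattern would presumably be to call load("model.Rdata") without assigning its result, and then predict(model, newdata = ...) with a one-row data frame whose columns are named Sepal.Length, Sepal.Width, Petal.Length and Petal.Width. An alternative is saveRDS() and readRDS(), since readRDS() returns the object itself and its result can be assigned to any name.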