
I have built a model to predict Upsell probability. When I use the function confusionMatrix from the caret package, I get the following results:

> confusionMatrix(data = predict_svm_test_5, test_td$UpSell_Ind)
Confusion Matrix and Statistics

             Reference
Prediction    0    1
          0 7976 2886
          1  217  644

           Accuracy : 0.7353          
             95% CI : (0.7272, 0.7433)
No Information Rate : 0.6989          
P-Value [Acc > NIR] : < 2.2e-16       

              Kappa : 0.1987          
Mcnemar's Test P-Value : < 2.2e-16       

        Sensitivity : 0.9735          
        Specificity : 0.1824          
     Pos Pred Value : 0.7343          
     Neg Pred Value : 0.7480          
         Prevalence : 0.6989          
     Detection Rate : 0.6804          
Detection Prevalence : 0.9266          
  Balanced Accuracy : 0.5780          

   'Positive' Class : 0  

However, I expected to see the confusion matrix as follows:

             Reference
Prediction   1      0
      1     644    217
      0    2886   7976
Specificity (TNR): 0.9735
Sensitivity (TPR): 0.1824

Here 1 means there was an Upsell (Event) and 0 means no Upsell (No Event), based on the caret package PDF documentation (pages 24-25).
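For reference, here is how I calculated my values from the same counts (a quick R sketch; the TP/FP/TN/FN names are just my shorthand):

TP <- 644   # predicted 1, reference 1
FN <- 2886  # predicted 0, reference 1
TN <- 7976  # predicted 0, reference 0
FP <- 217   # predicted 1, reference 0

TP / (TP + FN)  # Sensitivity (TPR) with 1 as the event: 0.1824
TN / (TN + FP)  # Specificity (TNR) with 1 as the event: 0.9735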

Now my question: how do I interpret the results of confusionMatrix? The values given by the function are different from the values that I calculate.

Thanks in advance for the help.

EsBee
  • This question isn't clear, to me at least. The `caret` values and the values you calculated are identical; the difference in confusion matrix layout is cosmetic, not substantive. – charles Jun 09 '15 at 00:32
  • Thanks @charles for the response. OK, I understand that the layout is cosmetic. But what confuses me is the relationship between the Specificity and Sensitivity values from my calculation and those output by caret's confusionMatrix function: my calculated Specificity equals caret's Sensitivity. How do I interpret that? Does it mean the True Positive Rate of 0 (from caret) equals the True Negative Rate of 1 (from my calculation)? This is where I am confused. – EsBee Jun 09 '15 at 15:35
  • This is just a coding issue, I think. Sensitivity/specificity require a definition of "positive", and here `caret` has automatically chosen a value different from the one you want. I don't use the `caret` package, but something like `confusionMatrix(pred, ref, positive=1)` might work; you want to use the `positive=` option. – charles Jun 10 '15 at 00:21

1 Answer


Thanks @charles for pointing me to "positive". `positive = 1` did not work, though, because the `positive` argument only accepts a character value. But I was able to get what I wanted using the following:

> levels(test_td$UpSell_Ind)
[1] "0" "1"

> confusionMatrix(data = predict_glm_vif_test, test_td$UpSell_Ind, positive = levels(test_td$UpSell_Ind)[2])

Confusion Matrix and Statistics

          Reference
Prediction    0    1
         0 8104 3241
         1   89  289

           Accuracy : 0.7159          
             95% CI : (0.7077, 0.7241)
No Information Rate : 0.6989          
P-Value [Acc > NIR] : 2.701e-05       

              Kappa : 0.0952          
 Mcnemar's Test P-Value : < 2.2e-16       

        Sensitivity : 0.08187         
        Specificity : 0.98914         
     Pos Pred Value : 0.76455         
     Neg Pred Value : 0.71432         
         Prevalence : 0.30112         
     Detection Rate : 0.02465         
 Detection Prevalence : 0.03224         
  Balanced Accuracy : 0.53550         

   'Positive' Class : 1  
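As an aside: since the factor levels here are the character strings "0" and "1", passing the string directly should work just as well (it is equivalent to the levels()[2] call above):

confusionMatrix(data = predict_glm_vif_test, test_td$UpSell_Ind, positive = "1")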
EsBee