
I just ran a multivariate LR (logistic regression) analysis to check class probabilities. From this analysis I could see that a combination of four variables would increase my test accuracy.

Here is the output of this analysis:

logit(P) = log(P / (1 - P)) = 13.458 - 0 * Variable A - 0.106 * Variable B - 0.004 * Variable C + 0.008 * Variable D, where P is Pr(y=1|x).

The best threshold (or cutoff) for the predicted P is 0.15.

Original labels: NTB/TB --> labels in the logistic regression: 0/1. Note: the class/response value is recommended as (Case: 1 and Control: 0).

    Term          Estimate  Std. Error  z value  Pr(>|z|)  Odds
    (Intercept)     13.458       8.354    1.611     0.107     -
    Variable A       0           0       -1.046     0.296  1
    Variable B      -0.106       0.079   -1.338     0.181  0.9
    Variable C      -0.004       0.002   -1.649     0.099  1
    Variable D       0.008       0.007    1.192     0.233  1.01

My question is: I want to validate this LR model on new samples. How can I apply it? Should I simply apply this formula: 13.458 - 0 * Variable A - 0.106 * Variable B - 0.004 * Variable C + 0.008 * Variable D?

I am a biologist with a very limited statistical background.

Many thanks to you all

  • Additionally, this search may be useful: https://stats.stackexchange.com/search?tab=votes&q=%5blogistic%5d%20interpret – Sycorax Aug 09 '19 at 13:10

1 Answer


That formula gives you the value of the logit for each new sample, but you have to apply the inverse logit function, P = 1 / (1 + exp(-logit)), to that value to get P. Then you compare each value of P to your threshold (0.15 in your output) to classify each new sample.
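As a rough illustration of those steps, here is a minimal Python sketch. It assumes the rounded coefficients from the question are good enough (in practice you would use the full-precision values from the fitted model), that a sample is classified as 1 (TB) when P is at or above the cutoff, and the variable values in the example sample are made up purely for demonstration. The function names are my own, not from any package.

```python
import math

# Coefficients copied from the rounded output in the question (assumption:
# rounding is acceptable; prefer the unrounded values from the fitted model).
INTERCEPT = 13.458
COEFS = {"A": 0.0, "B": -0.106, "C": -0.004, "D": 0.008}
THRESHOLD = 0.15  # cutoff reported in the question


def logit_score(sample):
    """Linear predictor (log-odds) for one sample, given as {variable: value}."""
    return INTERCEPT + sum(COEFS[name] * sample[name] for name in COEFS)


def inverse_logit(z):
    """Convert log-odds z to a probability, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(sample, threshold=THRESHOLD):
    """Return (P, label); label 1 = TB, 0 = NTB (assumes P >= threshold means 1)."""
    p = inverse_logit(logit_score(sample))
    return p, (1 if p >= threshold else 0)


# Hypothetical new sample; these measurements are invented for illustration.
new_sample = {"A": 100.0, "B": 120.0, "C": 500.0, "D": 50.0}
p, label = classify(new_sample)
print(f"logit = {logit_score(new_sample):.3f}, P = {p:.3f}, predicted label = {label}")
```

To validate the model, you would run this over every new sample and compare the predicted labels with the known NTB/TB status.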

Here’s an online tutorial that uses R, but also explains the process:

http://r-statistics.co/Logistic-Regression-With-R.html

Joe
  • Thanks a lot. Another question: why does Variable A have a coefficient of 0 (zero)? As in logit(P) = log(P / (1 - P)) = 13.458 - 0 * Variable A - 0.106 * Variable B - 0.004 * Variable C + 0.008 * Variable D. What does it mean? – Leonardo Araujo Aug 12 '19 at 08:52
  • A logistic regression assumes a linear relationship between the predictors and the log-odds of the response. If the output of the regression is a coefficient of zero for variable A, that means that variable A does not help predict Y, at least not in your sample. – Joe Aug 13 '19 at 10:09
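
One thing worth checking in addition: the table reports a z value of -1.046 for Variable A, which could not happen if the estimate were exactly zero, so the displayed 0 is probably a small negative number rounded to three decimals. The sketch below uses entirely hypothetical numbers (the true estimate cannot be recovered from the rounded table) to show how such a coefficient can print as 0 and still shift the logit when the variable takes large values.

```python
import math

# Hypothetical values for illustration only; not taken from the fitted model.
hypothetical_coef_a = -0.0004
hypothetical_value_a = 5000.0

# What a table rounded to three decimals would display for this coefficient.
displayed = round(hypothetical_coef_a, 3)

# What the term actually contributes to the logit for this sample.
contribution = hypothetical_coef_a * hypothetical_value_a

print(f"displayed coefficient: {abs(displayed):.3f}")
print(f"contribution to logit: {contribution:.3f}")
print(f"odds ratio per unit:   {math.exp(hypothetical_coef_a):.2f}")
```

If that is the situation here, applying the rounded formula by hand could give different probabilities from the fitted model, so it is safer to predict from the model object itself or to print the coefficients with more digits.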