I have a dataset in which I need to find the relationship between a person's smoking habit (Yes/No) and their probability (risk) of getting heart disease.

Please suggest how I can find the correlation between a two-level categorical feature (smoking) and a continuous variable (risk) in Python. Can the Pearson correlation coefficient be used?

ekta
  • maybe check this ? https://stats.stackexchange.com/questions/102778/correlations-between-continuous-and-categorical-nominal-variables/102800#102800 – StupidWolf Apr 23 '20 at 19:42

1 Answer

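You could perform logistic regression (https://en.wikipedia.org/wiki/Logistic_regression), with the yes/no variable as the outcome and the continuous variable as the predictor. The absolute value of the fitted slope coefficient will roughly tell you how useful your continuous variable is for predicting the yes/no variable, although its size also depends on the scale of the continuous variable. If the slope is zero, it has no predictive power.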
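As a rough illustration, here is a minimal, dependency-free sketch of both ideas. Everything in it is an assumption for the example: the data are synthetic, and the names `smoker`, `risk`, `pearson` and `fit_logistic` are made up. It codes smoking as 0/1, computes the ordinary Pearson coefficient (which, for a 0/1 variable against a continuous one, is the point-biserial correlation), and fits a one-predictor logistic regression by Newton's method to get the slope.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_logistic(x, y, iters=25):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) with Newton's method; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of the 2x2 matrix X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Synthetic data (assumption): smokers get a higher average risk.
random.seed(0)
smoker = [random.randint(0, 1) for _ in range(200)]
risk = [min(1.0, max(0.0, 0.3 + 0.2 * s + random.gauss(0, 0.15))) for s in smoker]

r = pearson(smoker, risk)            # point-biserial correlation
b0, b1 = fit_logistic(risk, smoker)  # slope of risk when predicting smoking

print("point-biserial r:", round(r, 3))
print("logistic slope:", round(b1, 3))
```

On real data you would more likely call `scipy.stats.pointbiserialr` for the correlation and a library such as statsmodels or scikit-learn for the regression; the hand-written versions here are only meant to show what is being computed.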

user3433489