My data frame contains both categorical and numerical variables, and the target variable is categorical. It also has a substantial number of missing values, which I can neither omit nor impute in any suitable way. I want to use multivariate analysis to find correlations, if any, between the independent variables, and then either group the correlated variables together or omit one of them.

Is there a standard procedure for doing this? Please let me know whether my approach is sound and, if it is not, suggest an alternative.

My sample dataset looks like this:

primary_edu<-c(38,49,57,NA,88,91,NA,39,50)  
highschool_edu<-c(50,55,62,71,NA,70,NA,77,60)  
graduation_edu<-c(52,57,68,NA,NA,91,NA,NA,50)  
postgraduation_edu<-c(60,67,72,NA,NA,91,NA,NA,55)  
married<-c("yes","yes","yes",NA,"no","yes","yes","no",NA)  
region<-c("A","A","C","C",NA,"B","B","A","D")  
position <-c("level1","level1","level2","level1",NA,"level3",NA,"level3","level2")  
promoted<-c("yes","no","yes","yes","no","yes","no","no","no")  

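# Build the data frame directly: wrapping the columns in cbind() first would coerce every column to character
df_sample<-data.frame(primary_edu,highschool_edu,graduation_edu,postgraduation_edu,married,region,position,promoted)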

My target variable is promoted.
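
To make the goal concrete, here is a minimal sketch in base R of one way to look at pairwise association among the predictors without dropping whole rows. It is an illustration under stated assumptions, not an established procedure for this problem: numeric pairs use Pearson correlation on pairwise-complete observations, and categorical pairs use Cramér's V, which is one common choice that I am swapping in here. The helper `cramers_v` and the object names `num_cor` and `n_pairs` are hypothetical names introduced for this example. With only nine rows the numbers are purely illustrative.

```r
# Sample data from the question, built with data.frame() so numeric columns stay numeric.
df_sample <- data.frame(
  primary_edu        = c(38, 49, 57, NA, 88, 91, NA, 39, 50),
  highschool_edu     = c(50, 55, 62, 71, NA, 70, NA, 77, 60),
  graduation_edu     = c(52, 57, 68, NA, NA, 91, NA, NA, 50),
  postgraduation_edu = c(60, 67, 72, NA, NA, 91, NA, NA, 55),
  married  = c("yes", "yes", "yes", NA, "no", "yes", "yes", "no", NA),
  region   = c("A", "A", "C", "C", NA, "B", "B", "A", "D"),
  position = c("level1", "level1", "level2", "level1", NA,
               "level3", NA, "level3", "level2"),
  promoted = c("yes", "no", "yes", "yes", "no", "yes", "no", "no", "no"),
  stringsAsFactors = TRUE
)

# Split the predictors by type; the target (promoted) is left out.
num_vars <- names(df_sample)[sapply(df_sample, is.numeric)]
cat_vars <- setdiff(names(df_sample)[sapply(df_sample, is.factor)], "promoted")

# Numeric vs. numeric: for each pair, use only the rows where both are observed.
num_cor <- cor(df_sample[num_vars], use = "pairwise.complete.obs")

# How many complete pairs stand behind each coefficient.
observed <- ifelse(is.na(as.matrix(df_sample[num_vars])), 0, 1)
n_pairs  <- crossprod(observed)

# Categorical vs. categorical: Cramér's V (hypothetical helper, assumption of this sketch).
cramers_v <- function(x, y) {
  tab <- table(x, y)  # table() drops pairs where either value is NA
  tab <- tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]  # drop empty levels
  if (min(dim(tab)) < 2) return(NA_real_)  # undefined with a single row or column
  # Warnings about small expected counts are suppressed; they are expected here.
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

print(round(num_cor, 2))
print(n_pairs)
print(cramers_v(df_sample$married, df_sample$region))
```

Reading `num_cor` together with `n_pairs` matters because each coefficient may rest on a different, and possibly very small, subset of rows. Association between a numeric and a categorical predictor is not covered by this sketch.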

areddy
  • Please carefully read [this](http://stackoverflow.com/help/mcve) and [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and then update your question to conform to these guidelines. Also, I noticed you have asked several questions but have not accepted a single answer. Please read [this](http://stackoverflow.com/help/someone-answers). – jlhoward Sep 23 '15 at 05:36
  • Thank you for the suggestions. I have provided the sample dataset. –  Sep 23 '15 at 06:09
  • Are you asking about how to group independent variables based on correlations (then the nature and even existence of your dependent variable is of no relevance), or are you asking about how to build a good regression model and want to group/omit variables specifically for this purpose? – amoeba Sep 23 '15 at 10:38
  • Does [this](http://stats.stackexchange.com/questions/108007/correlations-with-categorical-variables) help? – jlhoward Sep 23 '15 at 17:28
  • @jlhoward Thank you so much for the link. It helped. :) – areddy Sep 30 '15 at 07:51

0 Answers