My data frame contains both categorical and numerical variables, and the target variable is categorical. It also has a substantial number of missing values, which I can neither drop nor impute in any suitable way. Even so, I want to use multivariate analysis to find correlations between the independent variables, if any exist, so that I can group correlated variables together or omit one of them.
Is there a standard procedure for this? Please advise on my approach, or suggest an alternative if it is not workable.
My sample dataset looks like this:
primary_edu<-c(38,49,57,NA,88,91,NA,39,50)
highschool_edu<-c(50,55,62,71,NA,70,NA,77,60)
graduation_edu<-c(52,57,68,NA,NA,91,NA,NA,50)
postgraduation_edu<-c(60,67,72,NA,NA,91,NA,NA,55)
married<-c("yes","yes","yes",NA,"no","yes","yes","no",NA)
region<-c("A","A","C","C",NA,"B","B","A","D")
position <-c("level1","level1","level2","level1",NA,"level3",NA,"level3","level2")
promoted<-c("yes","no","yes","yes","no","yes","no","no","no")
# Pass the vectors directly: wrapping them in cbind() first coerces every column to character.
df_sample <- data.frame(primary_edu, highschool_edu, graduation_edu, postgraduation_edu, married, region, position, promoted)
My target variable is promoted.
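To make the goal concrete, here is a minimal sketch of one way to look at correlation among the numeric predictors without dropping or imputing rows. It assumes that pairwise-complete correlation is acceptable for a first look, which is my assumption and not an established answer to the question. The names `edu`, `cor_pairwise`, and `n_pairwise` are made up for illustration, and the numeric columns are rebuilt here so that the snippet runs on its own.

```r
# Rebuild the numeric columns from the sample so the sketch is self-contained.
edu <- data.frame(
  primary_edu        = c(38, 49, 57, NA, 88, 91, NA, 39, 50),
  highschool_edu     = c(50, 55, 62, 71, NA, 70, NA, 77, 60),
  graduation_edu     = c(52, 57, 68, NA, NA, 91, NA, NA, 50),
  postgraduation_edu = c(60, 67, 72, NA, NA, 91, NA, NA, 55)
)

# Correlate each pair of columns using only the rows where both values are observed.
cor_pairwise <- cor(edu, use = "pairwise.complete.obs")

# Count how many rows each pairwise correlation is actually based on.
# 1 - is.na(edu) is a 0/1 matrix of observed cells; crossprod() sums the overlaps.
n_pairwise <- crossprod(1 - is.na(edu))

print(round(cor_pairwise, 2))
print(n_pairwise)
```

With this much missing data, each coefficient rests on a different and often very small subset of rows, so I would read `cor_pairwise` alongside `n_pairwise` before deciding to group or drop a variable. This sketch also leaves out the categorical variables (married, region, position) entirely.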