I have a dependent variable (DV) and about 200 independent variables (IVs). I want to find out which 10-20 of these IVs matter most for the DV. I could do:
- PCA - However, it'll only tell me which 10-20 variables carry most of the information in the dataset, regardless of the DV
- Regression and check p-values - However, I don't know what kind of curve the relationship follows, so I can't assume simple linear regression is appropriate
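
To make both concerns concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption made up for illustration (the two IVs `x_big` and `x_small`, the quadratic relationship, the noise level); it is not my real dataset. It shows PCA favouring a high-variance variable that has nothing to do with the DV, and a linear measure missing a variable that drives the DV through a curve.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical IVs: x_big has large variance but no link to the DV;
# x_small has small variance and drives the DV through a quadratic curve.
x_big = rng.normal(scale=10.0, size=n)
x_small = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([x_big, x_small])
y = x_small ** 2 + rng.normal(scale=0.01, size=n)

# PCA via eigen-decomposition of the covariance matrix of X.
# Note that y is never used here: PCA only looks at the IVs.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
first_pc = eigvecs[:, np.argmax(eigvals)]
pc1_loadings = np.abs(first_pc)  # weight of each IV in the first component

# Pearson correlation only measures linear association.
linear_corr = np.corrcoef(x_small, y)[0, 1]
# Correlation after applying the (here known, normally unknown) true transform.
quadratic_corr = np.corrcoef(x_small ** 2, y)[0, 1]

print("PC1 loadings [x_big, x_small]:", pc1_loadings)
print("corr(x_small, y):   ", linear_corr)
print("corr(x_small^2, y): ", quadratic_corr)
```

In this toy setup the first component is dominated by `x_big`, and the linear correlation between `x_small` and the DV is close to zero even though `x_small` determines the DV almost entirely. With my real data I don't know the right transform in advance, which is the problem.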
Any ideas would be highly appreciated!