I am trying to determine which independent variables matter most for the dependent variable.
I have tried two methods, a correlation matrix and scaled coefficients in a regression, and they give slightly different answers.
Clarification needed:
The correlation matrix and the regression coefficients point to slightly different sets of X-variables as the most important. Which method would you use? For example:
The correlation matrix shows that crime has a strong negative correlation with income (Y), and that public transit, education, and population each have a strong positive correlation with income (Y).
The scaled coefficients from the regression show that access to public transit, education, and access to tutors each have a strong positive relationship with income (Y).
Correlation in R with corrr::correlate(data):
- crime
- transit
- education
- population
Scaled coefficients in a regression in R with lm(scale(y) ~ scale(x1) + scale(x2) + scale(x3) + ...):
- transit
- education
- tutors
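To make the comparison concrete, here is a minimal sketch of the two approaches on simulated data. Everything in it is an assumption for illustration: the variable names mirror the ones above, but the data, effect sizes, and the way population and crime are generated are hypothetical, and base R's cor() stands in for corrr::correlate(). It shows one way the two rankings can differ: a predictor can be clearly correlated with Y on its own and still get a near-zero coefficient once the other predictors are in the model.

```r
# Simulated data; all names and effect sizes below are hypothetical.
set.seed(42)
n <- 500
transit   <- rnorm(n)
education <- rnorm(n)
tutors    <- rnorm(n)

# population and crime have no direct effect on income here;
# they are only related to income through transit and education.
population <- 0.8 * transit + rnorm(n, sd = 0.6)
crime      <- -0.7 * education + rnorm(n, sd = 0.7)
income     <- 1.0 * transit + 1.0 * education + 0.5 * tutors + rnorm(n)

data <- data.frame(income, crime, transit, education, population, tutors)

# Method 1: pairwise (marginal) correlation of each predictor with income
marginal <- cor(data)[-1, "income"]

# Method 2: standardized coefficients, all predictors fitted jointly
fit     <- lm(income ~ ., data = as.data.frame(scale(data)))
partial <- coef(fit)[-1]  # drop the intercept

# Side-by-side comparison, one row per predictor
round(cbind(marginal, partial), 2)
```

In this sketch the correlation column answers "how does Y move with this variable alone?", while the coefficient column answers "how does Y move with this variable when the others are held fixed?", so the two need not agree.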
Which would you use, and why? I lean toward the regression because it specifies an actual relationship. Or would you do something else?
Also, I thought that collinearity (high correlation) was a problem in regressions, which makes these two methods seem at odds with each other. Thank you in advance for any clarification or guidance.
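On the collinearity point, the concern in a regression is correlation among the predictors themselves, not between a predictor and Y. One common diagnostic is the variance inflation factor, VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the remaining predictors. Below is a minimal sketch computed by hand in base R; the predictors x1, x2, x3 and their relationships are hypothetical, chosen only so that one pair is strongly collinear.

```r
# Hypothetical predictors: x2 is built to be strongly collinear with x1.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# For each predictor, regress it on the others and convert R^2 to a VIF.
vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

round(vif, 2)
```

Values near 1 suggest a predictor is nearly independent of the others; large values suggest its coefficient is estimated imprecisely because another predictor carries much of the same information.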