I'm trying to find a way to measure the correlation between whether a state has legalized marijuana (yes/no) and that state's crime rate. I'm honestly so confused.
-
Correlation is not causality. Causality is often near to correlation. Crime is noisy, is driven by weather, economics, and the moon. You have to account for all other contributors that are more significant than the legalization before you can start getting a clear read on the impact of legalization. Google could do it. The NSA likely already has. The FBI might or might not, depending on whether O. really likes cocaine users and what he has directed them to do. – EngrStudent Feb 08 '16 at 03:33
-
...what are you talking about, EngrStudent? – Matt Brems Feb 08 '16 at 06:02
-
You could discretize the crime rates variable and then calculate correlation. http://stats.stackexchange.com/questions/108007/correlations-with-categorical-variables – Omri374 Feb 08 '16 at 08:12
3 Answers
There must be many variables other than crime rates that influence legalization. You need to obtain and control for some of those other relevant variables. Then, since legalization is a binary (yes/no) variable, I would start out, maybe, with logistic regression. See, for instance, "What is the difference between linear regression and logistic regression?".
Some other posts about correlation between quantitative and binary variables: "Is there a correlation index for Binary Variable vs Quantitative variable?" and "Correlations between continuous and categorical (nominal) variables".
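As a rough illustration of the logistic regression idea, here is a minimal sketch in plain Python. It uses a single predictor only to keep the example short; a real analysis would include the control variables mentioned above. The data are made up for illustration (not real crime figures), and the names `fit_logistic` and `sigmoid` are hypothetical helpers, not part of any library.

```python
from math import exp

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood. Returns (b0, b1)."""
    n = len(x)
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = yi - sigmoid(b0 + b1 * xi)  # observed minus predicted
            g0 += err
            g1 += err * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up, standardized crime rates and legalization status (1 = legalized).
crime_rate_z = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
legalized = [0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(crime_rate_z, legalized)
p_high = sigmoid(b0 + b1 * 1.5)   # predicted probability at a high crime rate
p_low = sigmoid(b0 + b1 * -1.5)   # predicted probability at a low crime rate
print(b0, b1, p_low, p_high)
```

A positive `b1` here only says that legalization and crime rate move together in this toy data; it says nothing about which one drives the other.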

To answer the title problem: you can encode 'yes' as 1, and 'no' as 0, and calculate a thing called Point-Biserial Correlation.
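For reference, a minimal sketch of that calculation in plain Python. The point-biserial coefficient is the ordinary Pearson correlation applied to a 0/1 variable and a numeric one. The function name `point_biserial` and the numbers are made up for illustration and are not real crime data.

```python
from math import sqrt

def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a numeric one."""
    if len(binary) != len(values):
        raise ValueError("inputs must have the same length")
    n = len(values)
    group1 = [v for b, v in zip(binary, values) if b == 1]
    group0 = [v for b, v in zip(binary, values) if b == 0]
    if not group1 or not group0:
        raise ValueError("both groups must be non-empty")
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    mean_all = sum(values) / n
    # Population standard deviation of all values (divide by n, not n - 1).
    sd = sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("values must not all be identical")
    p = len(group1) / n  # share of observations coded 1
    return (mean1 - mean0) / sd * sqrt(p * (1 - p))

# Hypothetical data: 1 = legalized, 0 = not; made-up crime rates.
legal = [1, 1, 0, 0]
crime_rate = [2.0, 4.0, 6.0, 8.0]
r = point_biserial(legal, crime_rate)
print(r)
```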
But please don't do this.
The question you have asked in this context is in fact a hard and deep one. If you use simple correlation or regression analysis, you should assume that you do not have the right answer, and that any correlation you find may be driven by something other than drug laws.
There are methods that may be applicable here, such as regression discontinuity design, difference-in-differences (or difference-in-difference-in-differences), and instrumental variables. These and other methods aim to recover the 'true' effect of the legalization of marijuana on crime. Luckily (or not), somebody has already done this in this article. Starting from there and searching for other articles on the topic should give a general view of how such things can be done.
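To give a feel for one of these, here is a minimal sketch of the simplest two-group, two-period difference-in-differences estimate in plain Python. The function names and all numbers are made up for illustration and are not taken from the article. The estimate is only meaningful under the usual parallel-trends assumption, namely that the two groups of states would have followed the same trend had nothing been legalized.

```python
def mean(xs):
    return sum(xs) / len(xs)

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Made-up crime rates per 100,000 people.
# "Treated" states legalized between the two periods; "control" states did not.
treated_before = [400.0, 420.0]
treated_after = [390.0, 410.0]
control_before = [300.0, 320.0]
control_after = [310.0, 330.0]

effect = diff_in_diff(treated_before, treated_after,
                      control_before, control_after)
print(effect)
```

In practice the same quantity is usually estimated with a regression that includes a treatment-by-period interaction term, which also makes it possible to add control variables.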
I assumed that the question of interest is legalisation -> crime. Of course, it might be the other way around, as the other answer suggests.

Lila, it is not possible to calculate the correlation between a categorical variable and a quantitative variable. Correlation only exists between two quantitative variables.
