I am doing a text analysis in which I want to identify dependencies among categorical variables. As an example, take this dataset:
  pos1 pos2 pos3
1    A    B    C
2    A    B    D
3    A    B    A
4    B    B    D
5    A    B    B
Here the columns indicate the position in the text and the rows indicate different texts. From this example it is obvious that A at position 1 is accompanied by B at position 2. I thought of calculating a correlation coefficient, such as Pearson's, but to do so I would have to convert this dataset to a binary matrix. Considering this question, I think Pearson or Spearman would not be a good choice. Is there a way to calculate the association of these categorical variables, such that one can see, for example, that A at position 1 is commonly accompanied by B at position 2?
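
To make concrete what kind of output I am after, here is a minimal sketch in plain Python. It only counts how often each value at one position co-occurs with each value at another position and turns that into a conditional frequency. The function name `conditional_frequencies` and the tuple layout of `rows` are my own choices for illustration, and this is not meant as a proper association measure.

```python
from collections import Counter

# The example dataset from above: one row per text, one column per position.
rows = [
    ("A", "B", "C"),
    ("A", "B", "D"),
    ("A", "B", "A"),
    ("B", "B", "D"),
    ("A", "B", "B"),
]

def conditional_frequencies(rows, i, j):
    """Estimate the share of rows with value y at column j among rows with value x at column i.

    Returns a dict keyed by (x, y). Hypothetical helper, for illustration only.
    """
    pair_counts = Counter((r[i], r[j]) for r in rows)   # joint counts of (x, y)
    first_counts = Counter(r[i] for r in rows)          # counts of x alone
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

# Columns are 0-indexed here, so 0 is pos1 and 1 is pos2.
cond = conditional_frequencies(rows, 0, 1)
for (x, y), p in sorted(cond.items()):
    print(f"pos1={x} -> pos2={y}: {p:.2f}")
```

Note that in this toy data position 2 is always B, so the conditional frequency of B given A is no different from the overall frequency of B. A useful measure would have to account for that, which is part of why I am asking.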