Find Clusters of Responses in Survey

Question

I have results from a survey:

ID    Ques    Answer
1      a        yes
1      b        yes
1      c        no
2      a        no
2      b        yes
2      c        no
3      a        no
3      b        no
3      c        no

I would like to see whether there are any relationships or clusters among the "yes" answers. I have data for around 2000 participants, with around 30 questions each. I'm wondering if I should construct a distance matrix by converting yes/no to 0/1, following this previous question. Any idea how I could reproduce similar methods in SAS or Python?
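To make the 0/1 idea concrete, here is a minimal sketch in plain Python using the sample rows above. It pivots the long table into one row per respondent and computes a Jaccard distance between questions. The choice of Jaccard, the treatment of a missing answer as "no", and the convention that two all-"no" questions have distance 0 are assumptions made for illustration, not something the survey dictates.

```python
from itertools import combinations

# Long-format rows (ID, question, answer), copied from the sample table above.
rows = [
    (1, "a", "yes"), (1, "b", "yes"), (1, "c", "no"),
    (2, "a", "no"),  (2, "b", "yes"), (2, "c", "no"),
    (3, "a", "no"),  (3, "b", "no"),  (3, "c", "no"),
]

ids = sorted({r[0] for r in rows})
questions = sorted({r[1] for r in rows})

# Wide 0/1 table: one row per respondent, one column per question.
# Assumption: a missing (ID, question) pair is treated as "no" (0).
encoded = {(pid, q): 1 if ans == "yes" else 0 for pid, q, ans in rows}
matrix = [[encoded.get((pid, q), 0) for q in questions] for pid in ids]

def jaccard_distance(u, v):
    """1 - |both yes| / |either yes|; taken as 0.0 when neither has any yes."""
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either

# Distances between questions (columns), since the aim is to group questions.
columns = {q: [row[i] for row in matrix] for i, q in enumerate(questions)}
distances = {
    (p, q): jaccard_distance(columns[p], columns[q])
    for p, q in combinations(questions, 2)
}
```

The resulting `distances` dictionary could then be handed to any hierarchical clustering routine that accepts pairwise distances.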

***EDIT: I think I'm trying to cluster questions, as opposed to people. I would like to see whether a group of questions is often answered "yes" together or, conversely, whether answering "yes" to a is predictive of answering "no" to b.
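One way to capture both directions ("yes" together, or "yes" to a predicting "no" to b) is a signed association measure between pairs of questions. The sketch below uses the phi coefficient for two 0/1 vectors; the choice of phi and the example vectors are hypothetical and only illustrate the idea. Values near +1 mean the two questions tend to get the same answer, values near -1 mean opposite answers.

```python
from math import sqrt

def phi(u, v):
    """Phi coefficient of two 0/1 vectors; None if either vector is constant."""
    n11 = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(u, v) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(u, v) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(u, v) if x == 0 and y == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return None if denom == 0 else (n11 * n00 - n10 * n01) / denom

# Hypothetical 0/1 answers from six respondents (illustration only).
a = [1, 1, 1, 0, 0, 0]
b = [1, 1, 1, 0, 0, 0]  # always answered like a
c = [0, 0, 0, 1, 1, 1]  # always answered opposite to a
d = [1, 1, 1, 1, 1, 1]  # everyone said yes, so no association can be measured

same = phi(a, b)        # strong positive association
opposite = phi(a, c)    # strong negative association
undefined = phi(a, d)   # None: a constant column carries no information
```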

score 1 · Answer 1 · answered Oct 07 '19 at 22:45

What you are describing sounds more like frequent itemset mining to me, because not all questions will cluster.

Questions are your items, and each questionnaire is a transaction. You'll need fairly high support thresholds, since each user answers each question.
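As a rough sketch of that framing: each respondent becomes the set of questions they answered "yes" to, and an itemset is frequent when at least `min_support` of respondents contain it. The transactions, the function name `frequent_itemsets`, and the `max_size` cap are all hypothetical, and the code counts every candidate by brute force instead of using Apriori-style pruning, which is assumed to be acceptable for about 30 questions and small itemset sizes.

```python
from itertools import combinations

# Hypothetical transactions: the questions each respondent answered "yes" to.
transactions = [
    {"a", "b"},
    {"b"},
    {"a", "b", "c"},
    {"a", "b"},
    set(),
]

def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset tuple: support} for itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            # Support = fraction of respondents whose "yes" set contains the candidate.
            support = sum(1 for t in transactions if set(candidate) <= t) / n
            if support >= min_support:
                result[candidate] = support
    return result

frequent = frequent_itemsets(transactions, min_support=0.5)
```

To also look for "yes to a goes with no to b" patterns, one option is to encode each answer as its own item, for example `a=yes` and `b=no`, before building the transactions.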

Has QUIT--Anony-Mousse
