Let’s begin with a simple illustration:
ID  A1  A2  A3  A4  SUM  A1/SUM  A2/SUM  A3/SUM  A4/SUM
1   0   1   1   0   2    0       0.5     0.5     0
2   1   1   1   1   4    0.25    0.25    0.25    0.25
3   0   0   1   0   1    0       0       1       0
4   1   1   1   0   3    0.33    0.33    0.33    0
I have a set of ordinal binary data, where 1 means that the respondent used to work in a given area and 0 means that he/she did not. There are 35 such variables, and I have to reduce this number by finding the most similar cases. (I will then use the resulting groups to compute group means of some Likert-scale indicators, and the difference between each individual's indicator and the mean indicator in the area, to use as an independent variable in a logistic regression.)
Can I convert this binary data to numerical data by dividing each “1” response by the respondent's total number of “1” responses (this would indicate “how much” a person used to work in a given area), and then use these variables in factor analysis/principal component analysis to find the most similar groups?
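To make the transformation concrete, here is a minimal sketch in Python with NumPy, using only the four example respondents and the four variables A1 to A4 from the table. The names `X`, `row_sums` and `shares` are just illustrative, and leaving all-zero rows as zeros is my own assumption (no such row appears in the example).

```python
import numpy as np

# Toy data from the table: rows are respondents 1-4, columns are A1..A4.
X = np.array([
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
], dtype=float)

# SUM column: number of "1" responses per respondent.
row_sums = X.sum(axis=1, keepdims=True)

# Divide each response by the respondent's row sum.
# Assumption: a respondent with no "1" responses is left as all zeros
# instead of producing a division by zero.
shares = np.divide(X, row_sums, out=np.zeros_like(X), where=row_sums > 0)

print(np.round(shares, 2))
```

By construction, every row of `shares` with at least one “1” response sums to 1.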