I have a table that people fill in with some information, about 30 fields.
Sample Data:
ID Name Gender ZIP Phone Income Expense Family_size ......
1 A1 M 2321 5325222 45000 3553 5
Most of the fields are mandatory, and I suspect that some columns are just being filled with random values or 0.
My guess is that the columns/fields with little variance/randomness are the ones that people are not bothering to fill in properly (although some fields will naturally have little variance, for example gender).
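To make the idea concrete, here is a rough sketch of the kind of check I have in mind. The toy data frame, the helper names `norm_entropy` and `zero_share`, and the use of normalised Shannon entropy as a stand-in for the "variance" of a categorical column are all my own assumptions, not an established method.

```r
# Hypothetical toy data mimicking the form table (values are made up)
df <- data.frame(
  Gender      = c("M", "F", "M", "F", "M", "F"),
  Income      = c(45000, 0, 0, 0, 52000, 0),
  Family_size = c(5, 3, 4, 2, 6, 3),
  stringsAsFactors = FALSE
)

# Normalised Shannon entropy of a categorical column (assumed spread measure):
# 0 = every row has the same value, 1 = all observed levels equally frequent
norm_entropy <- function(x) {
  p <- prop.table(table(x))   # relative frequency of each level (NAs dropped)
  p <- p[p > 0]               # ignore unused factor levels
  if (length(p) < 2) return(0)
  -sum(p * log(p)) / log(length(p))
}

# Share of non-missing rows that are exactly zero in a numeric column
zero_share <- function(x) mean(x == 0, na.rm = TRUE)

is_num <- sapply(df, is.numeric)

# One score per column: entropy for categorical, zero share for numeric
entropy_by_col <- sapply(df[!is_num], norm_entropy)
zeros_by_col   <- sapply(df[is_num], zero_share)

print(entropy_by_col)
print(zeros_by_col)
```

Columns with a very high share of zeros, or categorical columns with entropy close to 0, would be the ones I flag for a closer look. A low score alone would not prove a field is badly filled, since a field like gender has few levels by nature.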
Question1:
Does my approach make sense? How can the variance of a categorical variable be calculated?
Question2:
Is there a better way to do this? If yes, how?
P.S.: There is no data on the amount of time spent, which could otherwise be used as a proxy to assess this.
I am using R for my analysis!