I'm working with a dataset that has a few variables I'm having difficulty preprocessing. One of them, MENTHLTH, is a numeric variable.
The variable measures how many of the last 30 days were bad mental health days for the respondent. So if you put 1, you had one bad mental health day in the past 30 days; if you put 30, all of them were bad. However, there are exceptions: if you had no bad days in the past 30 days you'd put 88, and if you were not sure you'd put 77.
About 1/3 of the responses have a value of 1-30, nearly 2/3 have a value of 88, and the remainder are 77 or blank.
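To make the coding concrete, here is a rough Python sketch of how the raw codes could be decoded into a plain day count. The function name `decode_menthlth` is hypothetical, and two choices in it are assumptions on my part rather than anything the dataset prescribes: mapping 88 to 0 days, and treating 77, blanks, and any other out-of-range code as missing.

```python
from typing import Optional


def decode_menthlth(raw: Optional[str]) -> Optional[int]:
    """Map a raw MENTHLTH code to a day count (0-30), or None when unknown."""
    if raw is None or raw.strip() == "":
        return None  # blank response: treated as missing (assumption)
    code = int(float(raw))  # tolerate values stored as "88.0"
    if code == 88:
        return 0  # 88 means no bad days, so recode to 0 (assumption)
    if code == 77:
        return None  # 77 means "not sure": treated as missing (assumption)
    if 1 <= code <= 30:
        return code  # genuine day counts pass through unchanged
    return None  # any other code: treated as missing (assumption)


sample = ["1", "30", "88", "77", "", None]
decoded = [decode_menthlth(v) for v in sample]
print(decoded)  # [1, 30, 0, None, None, None]
```

With this recode the variable would run from 0 to 30 with a large spike at 0, and the `None` entries would become missing values (written as `?` in an ARFF file) before loading into Weka.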
So how should I deal with this variable? Should I make it nominal and bin the values into meaningful groupings, or can I continue to treat it as a numeric variable?
I'm running it through Weka.