For instance, I have a categorical variable called 'category'. Here are its categories and their counts, as reported in Python:
Category_Others 1430
Application/Website 1345
Backups 900
Software/OS 800
Hardware/Virtual 700
Network 302
I am considering two options for encoding this variable, but I am not sure which one is more appropriate.
Either I assign an integer code to each category as follows:
Category_Others ---- 0
Application/Website ---- 1
Backups ---- 2
Software/OS ---- 3
Hardware/Virtual ---- 4
Network ---- 5
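To make this first option concrete, here is a minimal sketch of how I would apply the mapping. The small `incident` frame, the `category_codes` dict and the `category_code` column name are made up for illustration; only the 'category' column and the codes come from the list above.

```python
import pandas as pd

# Hypothetical stand-in for the real data frame; only the column name is real.
incident = pd.DataFrame({
    "category": [
        "Network", "Backups", "Category_Others", "Software/OS",
        "Application/Website", "Hardware/Virtual", "Backups",
    ]
})

# Explicit mapping copied from the list above.
category_codes = {
    "Category_Others": 0,
    "Application/Website": 1,
    "Backups": 2,
    "Software/OS": 3,
    "Hardware/Virtual": 4,
    "Network": 5,
}

# Series.map looks each label up in the dict; labels missing from the dict become NaN.
incident["category_code"] = incident["category"].map(category_codes)
print(incident)
```

This gives a single numeric column, so the codes carry an implied ordering (Network = 5 is "larger" than Backups = 2).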
Or I introduce 5 new dummy variables into my data frame, so I end up with 5 more columns (one named after each category except the first, which is dropped as the baseline).
For example, in Python:
category = pd.get_dummies(incident['category'], drop_first=True)
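For reference, a small self-contained sketch of what that call produces. The sample `incident` frame and the names `dummies` and `incident_encoded` are made up for illustration, not my real data. As far as I can tell, `pd.get_dummies` sorts the labels, so with `drop_first=True` the dropped baseline is 'Application/Website'.

```python
import pandas as pd

# Hypothetical stand-in for the real data frame; only the column name is real.
incident = pd.DataFrame({
    "category": [
        "Network", "Backups", "Category_Others", "Software/OS",
        "Application/Website", "Hardware/Virtual", "Backups",
    ]
})

# One indicator column per category, minus the first category in sorted order.
dummies = pd.get_dummies(incident["category"], drop_first=True)
print(dummies.columns.tolist())

# Attach the indicator columns to the original frame (column names do not overlap).
incident_encoded = incident.join(dummies)
print(incident_encoded.head())
```

A row whose category is the dropped baseline has zeros (or False, depending on the pandas version) in all 5 indicator columns.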