How to measure volatility of a categorial variable?

Question

I have a variable that is a rank of 6 categories though 12 months. The categories are S, A, B, C, D, and E. Where S is best and E is worst. The variable can change overtime, for example one observation can be a S in the first month, a B in the second month, etc... I would like to analyze how multiple observations that in the month zero started at one category and changed over to others categories over the course of 12 months.

How can I achieve a sense of volatility of each category? For example, in what degree the tier that started at B (or S) in the month zero changed over time? How can I measure if the letter B (or S) tends to be better or worst and in what degree?

This is an example of the data (NA is a null):

date   tier M0  M1  M2  M3  M4  M5  M6  M7  M8  M9  M10 M11 M12
Jan-20  B   B   B   B   B   B   A   A   S   A   B   B   B   B
Jan-20  B   B   B   A   C   A   A   A   A   A   B   A   A   A
Jan-20  B   B   B   B   B   B   D   D   D   B   D   D   D   D
Jan-20  B   B   B   A   S   S   S   S   S   S   S   S   S   S
Jan-20  B   B   B   A   B   E   E   E   E   E   E   E   E   E
Jan-20  S   S   S   S   S   S   S   S   S   S   S   S   S   S
Jan-20  S   S   S   S   S   S   S   S   S   B   B   A   NA  NA
Jan-20  S   S   S   S   C   C   C   E   E   B   C   C   B   B
Jan-20  S   S   S   A   A   B   A   E   C   C   B   B   B   E
Jan-20  S   S   S   A   C   B   C   E   E   B   E   C   B   A

I've assign numbers to each category. S = 3, A = 2, B = 1, C = 0, D = -1, E = -2.

Then I calculated the average of each tier by the tier they started with by month (each column).

data_final %>% 
  group_by(TIER) %>% 
  summarise(M0 = mean(TIER_00),
            M1 = mean(TIER_01),
            M2 = mean(TIER_02),
            M3 = mean(TIER_03),
            M4 = mean(TIER_04),
            M5 = mean(TIER_05),
            M6 = mean(TIER_06),
            M7 = mean(TIER_07),
            M8 = mean(TIER_08),
            M9 = mean(TIER_09),
            M10 = mean(TIER_10),
            M11 = mean(TIER_11),
            M12 = mean(TIER_12))

This result in a single row with each tier and their average by month. After that, I calculated the standard deviation to obtain a sense of distance between each month and multiplied that by the square root of N = 13 (because I am starting at 0). But couldn't interpret the results. I am having problems to create a sense of distance between each categories.

How to measure volatility of a categorial variable?

0 Answers0