I have a variable that is a rank of 6 categories though 12 months. The categories are S, A, B, C, D, and E. Where S is best and E is worst. The variable can change overtime, for example one observation can be a S in the first month, a B in the second month, etc... I would like to analyze how multiple observations that in the month zero started at one category and changed over to others categories over the course of 12 months.
How can I achieve a sense of volatility of each category? For example, in what degree the tier that started at B (or S) in the month zero changed over time? How can I measure if the letter B (or S) tends to be better or worst and in what degree?
This is an example of the data (NA is a null):
date tier M0 M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12
Jan-20 B B B B B B A A S A B B B B
Jan-20 B B B A C A A A A A B A A A
Jan-20 B B B B B B D D D B D D D D
Jan-20 B B B A S S S S S S S S S S
Jan-20 B B B A B E E E E E E E E E
Jan-20 S S S S S S S S S S S S S S
Jan-20 S S S S S S S S S B B A NA NA
Jan-20 S S S S C C C E E B C C B B
Jan-20 S S S A A B A E C C B B B E
Jan-20 S S S A C B C E E B E C B A
I've assign numbers to each category. S = 3, A = 2, B = 1, C = 0, D = -1, E = -2.
Then I calculated the average of each tier by the tier they started with by month (each column).
data_final %>%
group_by(TIER) %>%
summarise(M0 = mean(TIER_00),
M1 = mean(TIER_01),
M2 = mean(TIER_02),
M3 = mean(TIER_03),
M4 = mean(TIER_04),
M5 = mean(TIER_05),
M6 = mean(TIER_06),
M7 = mean(TIER_07),
M8 = mean(TIER_08),
M9 = mean(TIER_09),
M10 = mean(TIER_10),
M11 = mean(TIER_11),
M12 = mean(TIER_12))
This result in a single row with each tier and their average by month. After that, I calculated the standard deviation to obtain a sense of distance between each month and multiplied that by the square root of N = 13 (because I am starting at 0). But couldn't interpret the results. I am having problems to create a sense of distance between each categories.