I am trying to estimate a multi-state model, following the setup in the book by Mills (2011). The research units can occupy four states (a, b, c, and d) and can move between any of them in either direction. My problem is how to set up my data for this.
I started out with the dataset in this format:
ID year state v1 v2
1 2000 a
1 2001 a
1 2003 b
2 1990 a
2 1995 a
In Mills' book (page 206), however, the suggested format is:
ID from to trans Tstart Tstop time status v1 v2
where from and to show all possible movements between states, trans is a sequence number of the transitions per ID, Tstart, Tstop and time are the time variables, status indicates whether the unit made that move or not (censored), and v1 and v2 are covariates.
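To make the target layout concrete, the reshaping can be sketched as follows. This is only a sketch in plain Python to spell out the logic; it is not mstate code, and the same steps should carry over to an R data frame. Several things here are assumptions rather than anything taken from Mills: a move is dated at the first year the unit is observed in the new state, time is measured in years since each unit's first observation, transitions are numbered row by row over all ordered pairs of states, covariates are taken from the first observation of each spell, and the covariate values themselves are made-up placeholders. The helper name `to_transition_rows` is hypothetical.

```python
from itertools import groupby

STATES = ["a", "b", "c", "d"]

# Assumption: number every ordered pair of distinct states row by row,
# so a->b = 1, a->c = 2, a->d = 3, b->a = 4, and so on (12 in total).
TRANS_ID = {}
for origin in STATES:
    for dest in STATES:
        if origin != dest:
            TRANS_ID[(origin, dest)] = len(TRANS_ID) + 1

# (ID, year, state, v1, v2). The v1/v2 values are made-up placeholders;
# the original data show them as blank.
panel = [
    (1, 2000, "a", 0.5, "x"),
    (1, 2001, "a", 0.5, "x"),
    (1, 2003, "b", 0.7, "x"),
    (2, 1990, "a", 0.1, "y"),
    (2, 1995, "a", 0.1, "y"),
]


def to_transition_rows(panel):
    """Turn one-row-per-observation data into one row per possible move."""
    rows = []
    ordered = sorted(panel, key=lambda r: (r[0], r[1]))
    for unit_id, obs in groupby(ordered, key=lambda r: r[0]):
        obs = list(obs)
        first_year = obs[0][1]
        last_year = obs[-1][1]

        # Collapse consecutive observations in the same state into spells,
        # keeping the first record of each spell (with its covariates).
        spells = [obs[0]]
        for rec in obs[1:]:
            if rec[2] != spells[-1][2]:
                spells.append(rec)

        for i, (_, year, state, v1, v2) in enumerate(spells):
            if i + 1 < len(spells):
                # The spell ends when the unit is first seen in a new state.
                stop_year, next_state = spells[i + 1][1], spells[i + 1][2]
            else:
                # Last spell: censored at the final observation.
                stop_year, next_state = last_year, None
            if stop_year == year:
                continue  # zero-length final spell, no time at risk

            # One row per destination reachable from the current state.
            for dest in STATES:
                if dest == state:
                    continue
                tstart = year - first_year
                tstop = stop_year - first_year
                rows.append({
                    "ID": unit_id,
                    "from": state,
                    "to": dest,
                    "trans": TRANS_ID[(state, dest)],
                    "Tstart": tstart,
                    "Tstop": tstop,
                    "time": tstop - tstart,
                    "status": 1 if dest == next_state else 0,
                    "v1": v1,
                    "v2": v2,
                })
    return rows


rows = to_transition_rows(panel)
for row in rows:
    print(row)
```

For the five sample observations this gives six rows. ID 1 contributes three rows for its spell in a (Tstart 0, Tstop 3), with status 1 on the a to b row and 0 on a to c and a to d. Its final spell in b starts at the last observation, so it has no time at risk and is dropped. ID 2 never leaves a, so it contributes three censored rows with Tstop 5. The covariates stay attached because they are copied from the spell's first observation rather than being discarded during the reshape.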
I am unsure how to set the data frame up, since in my case there are many movements between states in various directions. I started by adding a column to my original data, so that I have data of type "SPELL", as it is called in the TraMineR manual. However, when I do that and convert it to, say, STS format, I also lose my covariates, which I need to keep for the multi-state model.
EDIT: In sum, how do I go from the format above to the format required by the mstate package?