The R TraMineR library is a toolkit for exploring and rendering categorical sequence data, such as sequences describing family life trajectories or professional careers. This CV "traminer" tag is intended for conceptual questions about the analyses that can be done with the library, as well as more general questions about sequence analysis. For TraMineR coding questions, use the "traminer" tag on Stack Overflow.
Questions tagged [traminer]
78 questions
10
votes
1 answer
Do low silhouette widths mean the data has little underlying structure?
I am new to sequence analysis, and I was wondering how you react if the average silhouette widths (ASW) from cluster analyses of Optimal Matching-based dissimilarity matrices are low (around .25). Would it seem appropriate to conclude that there is…

JeremyR
- 151
- 1
- 4
9
votes
2 answers
similarity measure between two different ordered sequences
I know we can quantify the similarity between two sequences of the same length and with the same elements by rank-order correlation. But how can one measure the similarity between two sequences of different lengths that have only some elements in common?
For…

sgyf
- 93
- 1
- 3
8
votes
1 answer
When and how to use weights for sequence analysis in social science?
Weighting in sequence analysis
So far, I have scarcely found papers that address the issue of weighting for sequence analysis (using for example the optimal matching algorithm). Sequence analysis normally involves several steps:
setting or…

non-numeric_argument
- 547
- 3
- 18
6
votes
1 answer
What's the algorithm for finding sequences used by TraMineR?
I'm working on an analysis to find frequent sequences in an event-state dataset using the R package TraMineR (and arulesSequences too).
In arulesSequences the algorithm used to find frequent sequences is the cSPADE algorithm.
But what is the…

Stefan
- 63
- 3
6
votes
1 answer
Index plot for each cluster sorted by the silhouette
After a cluster analysis, I'm trying to draw the index plot of the silhouette values for each cluster rather than for the complete dataset
(like in the WeightedCluster Library Manual by Matthias Studer). First of all, is that theoretically correct?…

emanuela.struffolino
- 247
- 1
- 7
6
votes
2 answers
Other substitution matrices for missing value state in sequence analysis with TraMineR?
We have a question about how to deal with missing values/gaps within sequences. We would like to set up our own substitution-cost matrix for the Optimal Matching process. As far as we know, TraMineR allows creating custom cost matrices - but only in case…

Oliver
- 71
- 2
6
votes
1 answer
How to find the rows that meet some conditions in a sequence data set
Here is the summary of my sequence data generated in the SPELL format. All sequences are supposed to have the same length of 1440, but the summary shows that they do not (see "min/max sequence length:91/1440").
I want to find the rows: …

POTENZA
- 361
- 1
- 4
5
votes
2 answers
Should normalization completely weed out correlation?
I have two variables: ordering & length. The former measures the ordering of a sequence (i.e. all permutations of A-B-C), and the latter is the length of the sequence (i.e. A-B-C has a length of 3). These are highly correlated, and I want to…

histelheim
- 2,465
- 4
- 23
- 40
5
votes
1 answer
Separate opening and extension penalties for indels
ClustalG, the social science version of ClustalX, can use different 'opening' and 'extension' penalties for indels, so that an indel operation can have more weight when it adds the first element for a new event or gets rid of the last element for an…

POTENZA
- 361
- 1
- 4
5
votes
1 answer
How to estimate the centroid of clustered sequences?
I have run a sequence analysis using the Optimal Matching algorithm. Afterwards, I clustered the resulting distance matrix using the Ward algorithm and calculated silhouettes as measures of cluster quality and to identify representative…

non-numeric_argument
- 547
- 3
- 18
5
votes
2 answers
Modifying the time granularity of a state sequence
I created a sequence object from my SPELL-formatted data set. The sequence length of the sequence object is 1440 (i.e., 1-min intervals for a day).
Is there any easy way for TraMineR to convert the sequence length from 1440 to 288 (i.e., 5-min…

POTENZA
- 361
- 1
- 4
5
votes
1 answer
Problems with Groups in TraMineR
I'm having a problem using the group function in TraMineR. I have a data set that contains SPELL data, so multiple rows per case. I also have demographic data per case, at one row per case. I merge these together and end up with data that has a…

mCorey
- 363
- 1
- 6
5
votes
1 answer
Sequence analysis - Clusters quality - Time Use
I am trying to run sequence clustering on time use data, but I fail to obtain an "acceptable" clustering solution according to Studer (2010).
The sequences have 76 episodes of 15-minute slots (12 states). The data recorded people's activities during one…

giac
- 821
- 5
- 20
4
votes
1 answer
How to measure multichannel distances between "event" sequences?
In TraMineR, seqdistmc is used to measure multichannel distances between "state" sequences. I am wondering if there is a function to measure multichannel distances between "event" sequences.

POTENZA
- 361
- 1
- 4
4
votes
2 answers
minimum criteria for datasets used with TraMineR
I posted this to the TraMineR user list and it was suggested that it would be appropriate to post it here as well.
Any suggestions as to how to determine the minimum dataset size and missingness characteristics to which TraMineR may be applied…

Shawn
- 43
- 5