Finding statistically relevant patterns between data examples where the values are delivered in a sequence [Wikipedia]
Questions tagged [sequential-pattern-mining]
67 questions
10
votes
3 answers
High autocorrelation when taking the L-th order of difference of a sequence of independent random numbers
To explain this question in more detail, I'll first elaborate my approach:
I simulated a sequence of independent random numbers $X = \{x_1,...,x_N\}$.
I then take $L$ times the difference; i.e. I create the variables:
$dX_{1} =…

JohnAndrews
- 537
- 1
- 5
- 15
10
votes
2 answers
Multivariate time series clustering
I am collecting a group of multivariate time sequences. For example, there are 2000 time series. Each time series is of 12 dimensions.
Are there any systematic models/algorithms that can cluster multivariate time series? For instance, I would like…

user3269
- 4,622
- 8
- 43
- 53
9
votes
2 answers
Best use of LSTM for within sequence event prediction
Assume the following 1 dimensional sequence:
A, B, C, Z, B, B, #, C, C, C, V, $, W, A, % ...
Letters A, B, C, .. here represent 'ordinary' events.
Symbols #, $, %, ... here represent 'special' events
The temporal spacing between all events is…

dgorissen
- 191
- 1
- 5
8
votes
2 answers
Identifying sequential patterns
I am working with sequence data which are long lists of malware win-api calls. I am trying to cast the problem of identifying 'malware behavior' into one of finding sequential patterns. I treat each api call as a single item Itemset. The number of…

chet
- 285
- 1
- 5
6
votes
1 answer
What's the algorithm for finding sequences used by TraMineR?
I'm working an analysis about finding frequent sequences in a event-state dataset using the R package TraMineR (and arulesSequences too).
In arulesSequences the algorithm used to find frequent sequences is the cSPADE algorithm.
But what is the…

Stefan
- 63
- 3
6
votes
2 answers
Machine learning on non-fixed-length sequential data?
I have a problem which I'd like to apply machine learning (supervised classification) to, however, the data is sequential and each row in the data vector has its own length. This implies that the number of features in each row is non-constant (think…

Magnus
- 173
- 1
- 5
6
votes
3 answers
Cluster Sequences of data with different length
I need to cluster sequences of data that have different length.
I am using Matlab and my first question is related to the method.
Is KMeans sufficient to achieve this?
IN KMeans I have to use the following command to cluster a set of data stored in…

user55534
- 101
- 2
- 5
5
votes
1 answer
Binary Pattern Prediction
I have 400K observations, each one is a growing binary pattern where order matters. The goal is to predict the likelihood of the next character.
For example,
10101010101? - as humans we can eyeball and say with high confidence that the next…

Josh
- 383
- 2
- 9
4
votes
1 answer
Identify changepoints in 1/0 sequence
Say I have a sequence that looks like this:
0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0
In general terms, I am interested in determining when there is an over-abundance of 1's in close proximity. Not necessarily in a row, but within a…

jalapic
- 359
- 2
- 10
4
votes
3 answers
pattern recognition for sequence data
There are a lot of data sequences, I am trying to find pairs of sequences that are similar with other.
Trivially, we can define some distance measure, and compare each pair of sequence in terms of this measure. Or even we can solve this problem in a…

user785099
- 1,105
- 3
- 14
- 24
3
votes
1 answer
Sequential classification, combining predictions
What is the best way to combine outputs from a binary classifier, which outputs probabilities, and is applied to a sequence of non-iid inputs?
Here's a scenario: Say I have a classifier which does an OK, but not great, job of classifying whether or…

bill_e
- 2,681
- 1
- 19
- 33
3
votes
0 answers
Which distance metric to use to cluster categorical sequences (clickstreams or clickpaths)?
For my research, I want to cluster website visitors based on their clickstreams to understand different information behavior patterns (i.e., customer/visitor journeys). The data can be characterized as a number of sequences of predefined states…

MLud
- 51
- 4
3
votes
0 answers
Sequential Prediction: Data Modeling and Classical Algorithms
I have data that can be called demographic data.
Raw data
Person 0001
\begin{array}{|c|c|}
\hline
Feb\,1981- Apr\,85 & engaged\,\,in\,\,\underline{activity}\,\,\textit{A}\,\,of \,\,\underline{type}\,\,\textbf{square}\,\\ \hline
Apr\,1985- July\,86 &…
3
votes
1 answer
Studying fluctuations / trajectories over time - objective methods of grouping?
I'm trying to study fluctuations of a disease's activity over time, for example the f;uctiation in severity of chronic pain (in the absence of obvious triggers).
Individuals generally demonstrate one of a number of trajectories; eg. those with…

bobmcpop
- 1,063
- 1
- 14
- 20
3
votes
1 answer
How to detect variable seasonality pattern
We have predicted and actual (daily) data for past 3 years. We use 90 days of data for prediction. Generally our predictions are very accurate, but we receive unusual traffic for few days/weeks ( like thanksgiving-for few days, Christmas - around 2…

user2451089
- 31
- 2