Suppose that I have a set of accelerometer data collected with one sensor and one label for each measured data point. These labels describe different states of my system e.g., $state_A, state_B, state_C$, etc., and I want to use this information to train a classifier to recognize these states mentioned earlier.
Now, let's say that I want to use a fixed sliding window to extract some features rather than feeding the raw data to the classifier. The issue is, that some of these sliding windows could contain more than one unique label: e.g. the time window contains the transition from $state_A$ to $state_C$. What should I do with these kind of windows? Should I discard them? Should I set a threshold to determine if I use them or not (e.g. if 90% or more of the measured points cointain the same label, then it is ok to use the window)? Are there any best practices handling these kind of situations?
In the following figure I add an example of this issue: the image shows a plot of three different states that the system could have, and they are encoded as 1,2, and 3 for visualization purposes. Let's say that I want to take windows of every 25 samples without overlapping, so the vertical red lines show the beginning and the end of every time window.
There are windows that only possess one unique state, but some others contain more than one.