I am working on an anomaly detection application that uses keystroke dynamics.
This is the pool of features that I have at my disposal:
- hold time = R(i) - P(i)
- key-up to key-down = P(i+1) - R(i)
- key-up to key-up = R(i+1) - R(i)
- key-down to key-down = P(i+1) - P(i)
- key-down to key-up = R(i+1) - P(i)
Where,
- P(i) is the press time of the current key
- R(i) is the release time of the current key
- R(i+1) is the release time of the next key
- P(i+1) is the press time of the next key
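To make the definitions concrete, the five features could be computed along these lines. This is only a sketch: the function name `keystroke_features`, the dictionary keys, and the sample timestamps (assumed to be in milliseconds) are made up for illustration and are not part of any library.

```python
from typing import Dict, List


def keystroke_features(press: List[float], release: List[float]) -> Dict[str, List[float]]:
    """Compute the five timing features from per-key press/release timestamps.

    press[i] is P(i) and release[i] is R(i). The names used here are
    hypothetical and chosen only for this example.
    """
    if len(press) != len(release):
        raise ValueError("press and release must have the same length")
    n = len(press)
    return {
        # hold time = R(i) - P(i), one value per key
        "hold": [release[i] - press[i] for i in range(n)],
        # key-up to key-down = P(i+1) - R(i), one value per consecutive pair
        "up_down": [press[i + 1] - release[i] for i in range(n - 1)],
        # key-up to key-up = R(i+1) - R(i)
        "up_up": [release[i + 1] - release[i] for i in range(n - 1)],
        # key-down to key-down = P(i+1) - P(i)
        "down_down": [press[i + 1] - press[i] for i in range(n - 1)],
        # key-down to key-up = R(i+1) - P(i)
        "down_up": [release[i + 1] - press[i] for i in range(n - 1)],
    }


# Made-up timestamps for three keystrokes (milliseconds).
features = keystroke_features(press=[0, 150, 320], release=[90, 260, 400])
print(features["hold"])     # [90, 110, 80]
print(features["up_down"])  # [60, 60]
```

Note that the hold time gives one value per key, while the other four features give one value per pair of consecutive keys.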
I am aware that the "best" features will be the ones with high variance.
What statistical method(s) can I employ for selecting the "best" features?