I have a timeseries classification dataset that has about 2000 data points (binary classification).
When I applied an LSTM model directly to the raw data, I got really bad results (nearly 0.1). But when I used the first-order differences
of the time series, I got better results (nearly 0.3).
Example of a first-order difference (each output value is the current element minus the previous one):
[1, 6, 6, 8, 9] -> [5, 0, 2, 1]
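
For reference, this is roughly how the differencing step can be written. It is only a minimal sketch in plain Python; the function name `first_order_difference` is made up for illustration and is not from any library.

```python
def first_order_difference(series):
    """Return the differences between consecutive values of a sequence."""
    # Pair each value with its successor and subtract: series[i + 1] - series[i].
    # The result is one element shorter than the input.
    return [nxt - cur for cur, nxt in zip(series, series[1:])]


print(first_order_difference([1, 6, 6, 8, 9]))  # [5, 0, 2, 1]
```

Note that the differenced series is one element shorter than the original, so the model's input length changes accordingly.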
Therefore, I concluded that an LSTM works better when it is given more heavily preprocessed data.
While searching on Google, I discovered that there is also something called second-order differences.
However, I am not clear on what the purpose of a second-order difference is or how to calculate it. Please let me know your thoughts on this.
I am happy to provide more details if needed.