I have the following problem, which I am considering modelling with an HMM. However, it seems like a non-standard application, and it may make no sense to use an HMM at all. I already have some practical experience with HMMs on time series and fairly good theoretical knowledge.

Problem: I am observing a continuous, time-series-like process, minute data $\{y_t\}$, together with a number of regressors. I want to predict the mean for every hour given the past hour; more precisely, I can use roughly the past 45 minutes of data.

So I was thinking of defining the states by whether the hourly mean is greater than zero or not. I would then have a binary state variable. But with this definition, I can't see how to define a state space model

$y_t|x_t \sim p(y_t|x_t)$

$x_t|x_{t-1} \sim p(x_t|x_{t-1}),$

while in my setting, e.g., $y_1, \ldots, y_{60}$ are all generated by one state. The state should therefore last 60 minutes: one state generates a sequence of 60 observations with zero probability of transition in between, and a transition can occur only at the end of each hour. The closest problem I can think of is regime detection in financial data, e.g. bull and bear markets.

Thus I don't see a straightforward application of an HMM, if it is possible at all, but I would like to use one because of its probabilistic interpretation. I wonder whether it would be better to define the states as above and turn the problem into classification. I can provide more information if this catches your attention. Thanks in advance.

reicja
  • Tell me if I'm thinking about this correctly: your dependent variable is, say, the temperature, and we have a bunch of other independent variables like cloud coverage and wind speed? – redress May 24 '17 at 20:47
  • Yes, let's say it is like that. – reicja May 24 '17 at 20:50
  • Sounds like a classic multiple linear regression problem... what am I missing? – redress May 24 '17 at 20:52
  • Well the thing I want to predict is the average of the next hour. Say, it is 10:00, so I can use minute measurements up to 10:45 and at 10:45 I have to make the prediction for the average of the next hour, from 11:00 to 12:00. Of course, I considered a lot of regression techniques but in my case there is not strong linear dependence/correlation this far ahead. – reicja May 24 '17 at 20:58
  • (+1) @redress You can definitely formulate this as a classic regression problem. If you do so and use overlapping observations, you'll need to account for the correlation in your residuals that the overlap induces (e.g. Hansen-Hodrick or Newey-West standard errors). – Matthew Gunn May 24 '17 at 23:00
  • I know that. Because regression approaches have kind of failed so far, I am looking for different approaches. – reicja May 25 '17 at 05:35

2 Answers

> but in my case there is not strong linear dependence/correlation this far ahead

I would consider a recurrent neural network, where the shape of your input array is

   [number_of_observations, number_of_variables, 45]
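
As a rough illustration of how such an array could be assembled from the minute data, here is a minimal NumPy sketch. The function name `make_hourly_windows`, the `window` and `hour` parameters, and the choice to pair the first 45 minutes of each hour with the mean of the following hour are my assumptions based on the question's description, not a fixed recipe.

```python
import numpy as np

def make_hourly_windows(features, y, window=45, hour=60):
    """Build RNN inputs and hourly-mean targets from minute data.

    features : array of shape (n_minutes, n_variables), minute-level regressors
    y        : array of shape (n_minutes,), the observed process
    Returns X of shape (n_samples, n_variables, window) and one target per
    sample: the mean of y over the hour after the one the window came from.
    Needs at least two full hours of data.
    """
    features = np.asarray(features, dtype=float)
    y = np.asarray(y, dtype=float)
    n_hours = len(y) // hour

    X, targets = [], []
    for h in range(n_hours - 1):          # the last hour has no following hour
        start = h * hour
        # first `window` minutes of hour h, transposed to (n_variables, window)
        X.append(features[start:start + window].T)
        # mean of y over the whole of hour h + 1
        targets.append(y[(h + 1) * hour:(h + 2) * hour].mean())
    return np.stack(X), np.array(targets)

# Hypothetical usage with synthetic data: 5 hours, 3 regressors.
rng = np.random.default_rng(0)
features = rng.normal(size=(300, 3))
y = rng.normal(size=300)
X, targets = make_hourly_windows(features, y)
print(X.shape, targets.shape)
```

Note that many RNN implementations expect the time axis before the variable axis, i.e. `(samples, timesteps, variables)`; if that applies, `X.transpose(0, 2, 1)` converts the layout.
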
redress

So, after a little while I found exactly what I had in mind, which is an HMM with multiple emissions per state. For me, the problem is sufficiently covered in another HMM question.
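
To make the idea concrete, here is a minimal sketch of forward filtering when the hidden state changes only once per hour and each state emits a whole block of minute observations. It assumes, purely for illustration, that the observations within a block are conditionally independent Gaussians given the state; the names `block_log_lik` and `forward_filter` and all parameter values are hypothetical.

```python
import numpy as np

def block_log_lik(block, means, stds):
    """Log-likelihood of one block of observations under each state.

    Assumption: given the state, observations in the block are i.i.d.
    Gaussian, so the block log-likelihood is the sum of per-minute terms.
    """
    block = np.asarray(block, dtype=float)[:, None]      # (n_obs, 1)
    means = np.asarray(means, dtype=float)               # (n_states,)
    stds = np.asarray(stds, dtype=float)                 # (n_states,)
    log_pdf = (-0.5 * np.log(2 * np.pi * stds**2)
               - 0.5 * ((block - means) / stds) ** 2)    # (n_obs, n_states)
    return log_pdf.sum(axis=0)                           # (n_states,)

def forward_filter(blocks, init, trans, means, stds):
    """Filtered state probabilities per block, plus the next-block forecast.

    blocks : list of 1-D arrays, one per hour (blocks may differ in length,
             e.g. 45 observations for a partially observed hour)
    init   : initial state probabilities, shape (n_states,)
    trans  : hourly transition matrix, rows sum to one
    """
    prior = np.asarray(init, dtype=float)
    trans = np.asarray(trans, dtype=float)
    filtered = []
    for block in blocks:
        # Bayes update with the whole block, done in log space for stability
        log_post = np.log(prior) + block_log_lik(block, means, stds)
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        filtered.append(post)
        # transition happens only between blocks
        prior = post @ trans
    return np.array(filtered), prior

# Hypothetical usage: state 0 has a negative mean, state 1 a positive mean.
means, stds = np.array([-1.0, 1.0]), np.array([1.0, 1.0])
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
rng = np.random.default_rng(1)
blocks = [rng.normal(1.0, 1.0, size=60), rng.normal(-1.0, 1.0, size=45)]
filtered, next_hour = forward_filter(blocks, [0.5, 0.5], trans, means, stds)
print(filtered)
print(next_hour)
```

Because the update accepts blocks of any length, the 45 minutes observed so far can be used to filter the current hour's state, and the second return value then gives the state probabilities for the following hour.
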

reicja