
I have a year of data from an anemometric tower. The variable I'm interested in is wind speed at 50 m. My idea is to fit several models to this data (ARIMA, SARIMA, Holt-Winters, NNETAR, and a hybrid model) and then forecast. But I have some doubts about creating the time series object.

I have 52704 observations from 2008-01-01 00:00:00 to 2008-12-31 23:50:00. A measurement is made every 10 seconds, so each minute has 6 observations across 11 variables. I don't know whether the right value for the frequency argument is 6. Since the measurement interval is so short, I think the "cycle" is one minute, so the frequency may be 6, since 60/10 = 6.
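
One way to sanity-check the sampling interval before picking a frequency is to derive it from the timestamps and the observation count quoted above. The sketch below assumes those figures are exact and that the series has no gaps; `obs_per_cycle` is a hypothetical helper, not part of any library. With these numbers the implied spacing works out to 600 seconds (10 minutes) rather than 10 seconds, which would mean 6 observations per hour and 144 per day, so the stated interval may be worth double-checking against the raw file.

```python
from datetime import datetime

# Values quoted in the question; assumed exact, with no missing records.
start = datetime(2008, 1, 1, 0, 0, 0)
end = datetime(2008, 12, 31, 23, 50, 0)
n_obs = 52704

# n_obs equally spaced points span (n_obs - 1) intervals.
step_seconds = (end - start).total_seconds() / (n_obs - 1)


def obs_per_cycle(cycle_seconds, step_seconds):
    """Number of observations in one cycle of the given length (hypothetical helper)."""
    return cycle_seconds / step_seconds


per_hour = obs_per_cycle(3600, step_seconds)
per_day = obs_per_cycle(86400, step_seconds)

print("implied spacing (s):", step_seconds)
print("observations per hour:", per_hour)
print("observations per day:", per_day)
```

If the spacing really is 10 minutes, a daily cycle would correspond to a frequency of 144 rather than 8640; which cycle is actually present still has to be checked against the data.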

In such a case, how do I choose the proper frequency value, and how do I validate it?

P.S. I stumbled upon this question and this post, but I still don't understand how to choose the proper "cycle" for my case.

Dagdeloo
  • The frequency is the interval at which you expect some discernible characteristic of your data to repeat regularly. Since you don't say what your data are, we have no way to tell whether you should specify a frequency or what it should be. Maybe you could supply a little more information? – whuber Jan 23 '21 at 00:26
  • @whuber I've added some context about my problem. In short: this data is from an anemometric tower, and I'm trying to predict the wind speed. – Dagdeloo Jan 23 '21 at 03:23
  • Meteorology suggests looking at one day and one year seasons. That would be frequencies of 8640 and 3155695, respectively. – whuber Jan 23 '21 at 18:12
  • So, what are your thoughts on aggregating the observed data by minute (taking the mean of every 6 observations)? I know this goes beyond the scope of the question, but I'm worried about overfitting. – Dagdeloo Jan 23 '21 at 18:14
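
The daily and yearly frequencies quoted in the comments, and the effect of averaging every 6 observations, can be reproduced with simple arithmetic. This is a minimal sketch, assuming a 10-second sampling step as stated in the question and a mean Gregorian year of 365.2425 days; `obs_per_cycle` and `block_mean` are hypothetical helpers, and the wind speeds are made-up values for illustration only.

```python
def obs_per_cycle(cycle_seconds, step_seconds):
    """Number of observations in one cycle of the given length (hypothetical helper)."""
    return cycle_seconds / step_seconds


def block_mean(values, k):
    """Average consecutive non-overlapping blocks of k values.

    A trailing incomplete block is dropped.
    """
    n_blocks = len(values) // k
    return [sum(values[i * k:(i + 1) * k]) / k for i in range(n_blocks)]


DAY = 86400                # seconds in one day
YEAR = 365.2425 * DAY      # seconds in a mean Gregorian year (assumption)

raw_step = 10              # seconds between observations, as stated in the question
day_freq_raw = obs_per_cycle(DAY, raw_step)
year_freq_raw = obs_per_cycle(YEAR, raw_step)

k = 6                      # observations averaged into one value
agg_step = raw_step * k    # spacing of the averaged series, in seconds
day_freq_agg = obs_per_cycle(DAY, agg_step)

# Made-up wind speeds (m/s), two blocks of six readings each.
speeds = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 6.0, 6.2, 5.8, 6.1, 5.9, 6.0]
block_means = block_mean(speeds, k)

print("daily frequency, raw:", day_freq_raw)
print("yearly frequency, raw (rounded):", round(year_freq_raw))
print("daily frequency, averaged:", day_freq_agg)
print("block means:", block_means)
```

Averaging changes only the number of observations per cycle (from 8640 to 1440 per day under these assumptions); the length of the daily cycle itself is unchanged.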

0 Answers