Assume a time series is stored in nested form in a list column; see "Convert pandas df with data in a “list column” into a time series in long format. Use three columns: [list of data] + [timestamp] + [duration]" for details. This question is not about how to unnest a list column, though. Assume that you already have a long-format structure in which all list elements have been unnested into a normal column. The example here takes 2 lists with 4 elements each, giving 8 rows in total.

Part1: Assume that every list of the original list column has the same number of list elements (here, the first list with 4 items around 08:53, and the second with 4 items around 08:55).

                     value
datetimeindex
2016-05-04 08:53:20  1
2016-05-04 08:53:21  2
2016-05-04 08:53:22  1
2016-05-04 08:53:23  9
2016-05-04 08:55:00  2
2016-05-04 08:55:01  2
2016-05-04 08:55:02  3
2016-05-04 08:55:03  0

Now to the actual question. The documentation of statsmodels.tsa.seasonal.seasonal_decompose reads:

Definition of period

"period, int, optional"

Period of the series. Must be used if x is not a pandas object or if the index of x does not have a frequency. Overrides default periodicity of x if x is a pandas object with a timeseries index.

What is meant here by "Period of the series"? Is it:

  1. the number of lists, here 2.
  2. the standard size of a list. This would be 4 in the example.
  3. something other than 1. or 2.

Please also explain what, if anything, would be different in a "Part2 setting":

Part2: Assume that every list of the original list column has a varying number of list elements (here, the first list with 4 items around 08:53, and the second with just 3 items around 08:55).

                     value
datetimeindex
2016-05-04 08:53:20  1
2016-05-04 08:53:21  2
2016-05-04 08:53:22  1
2016-05-04 08:53:23  9
2016-05-04 08:55:00  2
2016-05-04 08:55:01  2
2016-05-04 08:55:02  3

The examples should make the question clear; no programming (especially not with the examples) is needed for an accepted answer.

Context:

This question arose from decompose() for time series: ValueError: You must specify a period or x must be a pandas object with a DatetimeIndex with a freq not set to None.
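For reference, here is a minimal sketch of why the Part1 data runs into that error message: the index carries no frequency, and none can be inferred because the gaps between timestamps are not all equal. It uses only pandas; the variable names are chosen for illustration and are not taken from the linked question.

```python
import pandas as pd

# Timestamps and values copied from the Part1 table:
# two bursts of four observations, one second apart within each burst.
index = pd.DatetimeIndex(
    [
        "2016-05-04 08:53:20",
        "2016-05-04 08:53:21",
        "2016-05-04 08:53:22",
        "2016-05-04 08:53:23",
        "2016-05-04 08:55:00",
        "2016-05-04 08:55:01",
        "2016-05-04 08:55:02",
        "2016-05-04 08:55:03",
    ],
    name="datetimeindex",
)
df = pd.DataFrame({"value": [1, 2, 1, 9, 2, 2, 3, 0]}, index=index)

# No frequency is attached to the index ...
freq_attached = df.index.freq

# ... and pandas cannot infer one, because the spacing is irregular.
freq_inferred = pd.infer_freq(df.index)

# The distinct gaps between consecutive timestamps (1 second and 97 seconds).
gaps = df.index.to_series().diff().dropna().unique()

print(freq_attached, freq_inferred, gaps)
```

With no frequency on the index, the docstring quoted above says that `period` must be given explicitly.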

  • Bounties become more noticeable towards the end of the period, as they move towards the top of the [list](https://stats.stackexchange.com/?tab=bounties). That said, it is notable that this Q has only 21 views, a number of which are me checking on it. You may want to ask a question on [meta.stats.SE] to see if there's a way to raise the profile of this Q &/or make it more answerable. (Again, this doesn't seem to be a programming Q, & there are so many Qs on [SO] that they usually get less attention than here.) – gung - Reinstate Monica Sep 09 '20 at 12:49
  • (1) The time-series data's having previously been stored in a nested list isn't obviously relevant. Some explanation is required of what that has to do with the choice of period for decomposition. (2) *Period* is the no. of observations in a seasonal cycle - e.g. if you've daily observations & weekly seasonality, the period is 7. But your observations are at irregular intervals, & how to perform seasonal decomposition on irregular time series is perhaps the crux of your question (see e.g. https://stats.stackexchange.com/q/244042/17230 & https://stackoverflow.com/q/12623027/1864816). – Scortchi - Reinstate Monica Sep 16 '20 at 08:40

0 Answers