For a clickstream simulation I need to generate sequences of randomly distributed timestamps.
Each sequence should:
- start and end between sim_start and sim_end,
- contain hit_count timestamps.
Computers have different ways of storing time data. For example, R uses the date-time classes POSIXlt and POSIXct. From the documentation:
Class "POSIXct" represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector.
So a POSIXct time is stored as a number of seconds:
Sys.time()
## [1] "2017-02-03 10:34:35 CET"
as.numeric(Sys.time())
## [1] 1486114478
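The conversion works in both directions. As a minimal sketch, using a fixed timestamp chosen only for illustration and an explicit UTC time zone so that the result does not depend on local settings:

```r
# Fixed example timestamp, pinned to UTC so the result is reproducible
ts_utc <- as.POSIXct("2017-02-03 09:34:38", tz = "UTC")

# Date-time -> seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(ts_utc)

# Seconds -> date-time again
ts_back <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# TRUE if the round trip preserved the instant
ts_back == ts_utc
```

Because the stored value is just a number, ordinary arithmetic and random number generators can be applied to it directly.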
This means that to sample timestamps you simply need to sample values from $0$ to $k$ (the maximal number of seconds from the origin of your choice) and then transform them to timestamps, e.g.
u <- runif(10, 0, 60) # random offsets in seconds, uniform between 0 and 60
# sample within 60 seconds after this origin (i.e. time 0); the origin string
# is interpreted as UTC, which is why the output below shows 09:00 in CET
as.POSIXlt(u, origin = "2017-02-03 08:00:00")
## [1] "2017-02-03 09:00:44 CET" "2017-02-03 09:00:30 CET" "2017-02-03 09:00:06 CET" "2017-02-03 09:00:12 CET" "2017-02-03 09:00:36 CET"
## [6] "2017-02-03 09:00:16 CET" "2017-02-03 09:00:18 CET" "2017-02-03 09:00:34 CET" "2017-02-03 09:00:22 CET" "2017-02-03 09:00:35 CET"
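Applied to the question, the same idea might look like the sketch below. The helper name sample_timestamps is hypothetical, the names sim_start, sim_end and hit_count are taken from the question, and the choices of UTC and of sorting the result are assumptions made for illustration only.

```r
# Hypothetical helper (not part of base R): draw `hit_count` uniformly
# distributed timestamps between `sim_start` and `sim_end`.
sample_timestamps <- function(sim_start, sim_end, hit_count) {
  start <- as.POSIXct(sim_start, tz = "UTC")  # assumption: times are given in UTC
  end   <- as.POSIXct(sim_end, tz = "UTC")
  stopifnot(end > start, hit_count >= 1)

  # Length of the simulation window in seconds
  window <- as.numeric(difftime(end, start, units = "secs"))

  # Uniform offsets from the start, sorted so the sequence is chronological
  offsets <- sort(runif(hit_count, min = 0, max = window))
  start + offsets
}

set.seed(1)  # only to make the sketch reproducible
hits <- sample_timestamps("2017-02-03 08:00:00", "2017-02-03 09:00:00",
                          hit_count = 5)
hits
```

Calling the function once per simulated session would give one sequence per session, each lying entirely inside the window.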
Outside of R you can follow the same procedure by sampling some values and adding them to (or subtracting them from) some time object, like =NOW() in Excel or systime in databases, etc.
Notice that this procedure also lets you sample non-uniformly distributed times if you draw the offsets from a different distribution, for example a normal distribution, as in the example below.
hist(as.POSIXlt("2017-02-03 08:00:00") + rnorm(1e6, 0, 60*60), 100)
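A normal distribution is unbounded, so some draws can fall outside a fixed simulation window. One possible way to handle this, sketched here under assumed values for the window, the peak time and the standard deviation, is to redraw the offending timestamps until all of them lie between sim_start and sim_end:

```r
# Hypothetical window and peak time, chosen only for illustration
sim_start <- as.POSIXct("2017-02-03 08:00:00", tz = "UTC")
sim_end   <- as.POSIXct("2017-02-03 10:00:00", tz = "UTC")
peak      <- as.POSIXct("2017-02-03 09:00:00", tz = "UTC")
hit_count <- 1000
sd_secs   <- 30 * 60  # assumed spread: 30 minutes

set.seed(1)  # only to make the sketch reproducible
hits <- peak + rnorm(hit_count, mean = 0, sd = sd_secs)

# Redraw any timestamp that fell outside the window until all are inside
outside <- hits < sim_start | hits > sim_end
while (any(outside)) {
  hits[outside] <- peak + rnorm(sum(outside), mean = 0, sd = sd_secs)
  outside <- hits < sim_start | hits > sim_end
}

hits <- sort(hits)
range(hits)
```

This amounts to sampling from a normal distribution truncated to the window. Clipping out-of-range values to the boundaries would be simpler, but it would pile up hits at exactly sim_start and sim_end.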