I am a PhD student.
I have a data set (waiting time in minutes between tweets) which looks almost symmetric to the naked eye.
I've tried fitting a couple of distributions to this data, and the closest I can get is a Cauchy distribution with a p-value of 0.02. I'd like to see whether the data is a reasonable fit to a Laplace (double exponential) distribution.
mydata = c(251, 178, 342, 252, 253, 213, 335, 273, 250, 325, 253, 252, 254, 252, 240, 248,
253, 250, 250, 247, 257, 259, 250, 254, 251, 250, 251, 254, 248, 265, 239, 260,
253, 311, 252, 311, 250, 249, 251, 212, 289, 243, 253, 252, 254, 249, 250,
259, 268, 346, 312, 263, 287, 281, 334, 239, 218, 280, 5, 255, 251, 255,
266, 325, 248, 249, 250, 251, 171, 326, 195, 198, 281, 271, 265, 267, 250, 251,
278, 264, 252, 265, 250, 243, 267, 250, 252, 253, 244, 252, 259, 132, 275, 182,
336, 250, 251, 253, 358, 252, 276, 281, 255, 252, 191, 277, 283, 193, 213,
268, 277, 250, 236, 241, 296, 242, 249, 251, 250, 262, 250, 219, 263, 267, 245,
254, 251, 251, 234, 259, 264, 261, 246, 254, 264, 276, 236, 245, 253, 222, 240,
250, 250, 252, 239, 254, 250, 263, 267, 251, 255, 256, 252, 243, 257,
251, 252, 252, 242, 229, 250, 265, 252, 237, 270, 212, 268, 290, 256, 239, 239,
263, 251, 248, 252, 249, 241, 268, 261, 254, 256, 258, 250, 251, 250, 259, 257, 197,
282, 461, 257, 250, 250, 250, 251, 253, 253, 251, 250, 263, 247, 254,
251, 256, 250, 250, 177, 305, 275, 203, 260, 250, 251, 252, 239, 274, 167, 262,
251, 272, 251, 264, 250, 256, 226, 257, 270, 240, 239, 255)
This gives an interesting histogram:
hist(mydata, breaks = 25, freq = FALSE)
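For reference, a Laplace fit does not strictly need nls: the maximum-likelihood estimates are the sample median (location) and the mean absolute deviation from the median (scale). Below is a minimal sketch of that idea. The helper names `fit_laplace` and `plaplace` are hypothetical (not from any package), and only the first 16 values of `mydata` are used so the block runs on its own.

```r
# Hypothetical helpers, not from any package: Laplace MLE and CDF.
fit_laplace <- function(x) {
  mu <- median(x)            # MLE of the location parameter
  b  <- mean(abs(x - mu))    # MLE of the scale parameter
  list(mu = mu, b = b)
}

plaplace <- function(q, mu, b) {
  # Laplace cumulative distribution function
  ifelse(q < mu, 0.5 * exp((q - mu) / b), 1 - 0.5 * exp(-(q - mu) / b))
}

# First 16 values of mydata, so the sketch runs on its own
x <- c(251, 178, 342, 252, 253, 213, 335, 273,
       250, 325, 253, 252, 254, 252, 240, 248)

fit <- fit_laplace(x)

# Kolmogorov-Smirnov test against the fitted Laplace CDF.
# Ties in the data trigger a warning (suppressed here), and the p-value is
# only approximate because mu and b were estimated from the same sample.
ks <- suppressWarnings(ks.test(x, plaplace, mu = fit$mu, b = fit$b))
```

With the full vector, `fit_laplace(mydata)` would give the parameters to overlay on the histogram, and `ks$p.value` could be compared with the Cauchy result.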
I looked at a couple of prior posts on this subject, and they appear to use the nls function (for example, Double exponential fit in R). However, the data used there appears to have two variables to model.
I then looked at converting my data to a frequency table and trying the above solution:
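A sketch of how such a frequency table could be built is below. It assumes the table comes from `table()`, which would explain the `Freq` column name. The column name `times` matches the calls that follow, and only the first 16 values of `mydata` are used so the block runs on its own.

```r
# First 16 values of mydata, so the sketch runs on its own
x <- c(251, 178, 342, 252, 253, 213, 335, 273,
       250, 325, 253, 252, 254, 252, 240, 248)

# Count how often each distinct waiting time occurs.
# Naming the table dimension "times" gives a data frame with
# columns "times" and "Freq".
mydataframe <- as.data.frame(table(times = x), stringsAsFactors = FALSE)

# table() stores the values as labels, so convert them back to numbers
mydataframe$times <- as.numeric(mydataframe$times)
```

Each row then holds one distinct waiting time and the number of times it was observed.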
nonlin <- function(t, a, b, c) { a * (exp(-(abs(t-c) / b))) }
nlsfit <- nls(Freq ~ nonlin(times, a, b, c), data=mydataframe, start=list(a=2, b=2, c=2.5))
However, I get an error ("singular gradient matrix at initial parameter estimates"), apparently because the starting parameters are bad. I then found another post (Why is nls() giving me "singular gradient matrix at initial parameter estimates" errors?) that uses some code to estimate the starting parameters:
c.0 <- min(mydataframe$times) * 0.5
model.0 <- lm(log(times - c.0) ~ Freq, data=mydataframe)
start <- list(a=exp(coef(model.0)[1]), b=coef(model.0)[2], c=c.0)
model <- nls(times ~ a * exp(b * Freq) + c, data = mydataframe, start = start)
I get another error when running the last line:
Error in nls(times ~ a * exp(b * Freq) + c, data = mydataframe, start = start) :
step factor 0.000488281 reduced below 'minFactor' of 0.000976562
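Regarding the first error, one plausible cause is that the starting values a = 2, b = 2, c = 2.5 lie far from data centred near 250, so the model evaluates to essentially zero everywhere and the gradient carries no information. The sketch below checks this and computes data-driven starting values as one possible alternative. The choices of median, mean absolute deviation and peak height are assumptions, not taken from either linked post, and only the first 16 values of `mydata` are used so the block runs on its own.

```r
# Model from the first nls attempt
nonlin <- function(t, a, b, c) a * exp(-abs(t - c) / b)

# First 16 values of mydata, so the sketch runs on its own
x <- c(251, 178, 342, 252, 253, 213, 335, 273,
       250, 325, 253, 252, 254, 252, 240, 248)

# Original starting values: the curve is numerically flat at zero near the data
at_bad_start <- nonlin(x, a = 2, b = 2, c = 2.5)

# Data-driven starting values (assumed choices for illustration)
c0 <- median(x)             # centre of the peak
b0 <- mean(abs(x - c0))     # spread around the centre
a0 <- 1 / (2 * b0)          # peak height of a Laplace density;
                            # for raw counts, max(Freq) may suit better

at_new_start <- nonlin(x, a = a0, b = b0, c = c0)

range(at_bad_start)
range(at_new_start)
```

Passing `start = list(a = a0, b = b0, c = c0)` to the first `nls` call would at least begin the search where the model has a usable gradient, though convergence is not guaranteed because `abs()` makes the model non-smooth at `c`.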
Any help folks can provide is much appreciated. Jonathan