I'm studying the growth of a population (users of a website).
I have the user count for each time block (which is 2 weeks long).
Now, I'd like to understand if this growth follows a well-known growth curve.
To start with, I ran this regression in R on the 31 observations below with glm(tot_users ~ block_id, family = binomial(logit), data = df).
These are the observations:
25,83,111,164,251,370,557,815,1154,1513,2032,2605,3590,4904,5718,6602,7628,8727,9471,10263,11047,11799,12441,13040,13634,14168,14582,15143,15649,16164,16472
This is what I get:
Coefficients:
(Intercept)  -9.397e+00
block_id10    6.331e+00
block_id11    6.667e+00
block_id12    7.148e+00
block_id13    7.398e+00
block_id14    7.714e+00
block_id15    7.968e+00
block_id16    8.151e+00
block_id17    8.364e+00
block_id18    8.623e+00
block_id19    8.979e+00
block_id2     1.958e-13
block_id20    9.263e+00
block_id21    9.430e+00
block_id22    9.591e+00
block_id23    9.817e+00
block_id24    9.985e+00
block_id25    1.020e+01
block_id26    1.048e+01
block_id27    1.067e+01
block_id28    1.098e+01
block_id29    1.126e+01
block_id3     6.932e-01
block_id30    1.196e+01
block_id31    3.196e+01
block_id4     2.946e+00
block_id5     3.468e+00
block_id6     4.754e+00
block_id7     5.555e+00
block_id8     5.896e+00
block_id9     6.150e+00
Degrees of Freedom: 30 Total (i.e. Null); 0 Residual
Null Deviance: 17.57
Residual Deviance: 3.167e-10 AIC: 74.87
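One thing worth checking in this output: the coefficient names (block_id10, ..., block_id19, block_id2, ...) are sorted as text, and there is one coefficient per block, which suggests that block_id was treated as a categorical variable rather than a number. The sketch below is only an illustration, assuming the observations are blocks 1 to 31 in the order listed; the column block_chr is a hypothetical stand-in for a block_id stored as text. It compares the two design matrices without refitting the model.

```r
# Observations from the question, assumed to be blocks 1..31 in order.
tot_users <- c(25, 83, 111, 164, 251, 370, 557, 815, 1154, 1513,
               2032, 2605, 3590, 4904, 5718, 6602, 7628, 8727, 9471, 10263,
               11047, 11799, 12441, 13040, 13634, 14168, 14582, 15143, 15649,
               16164, 16472)
df <- data.frame(block_id = seq_along(tot_users), tot_users = tot_users)

# Numeric block_id: an intercept plus a single slope.
X_numeric <- model.matrix(~ block_id, data = df)

# Text/factor block_id (hypothetical column): an intercept plus one dummy
# column for every level except the baseline.
df$block_chr <- factor(as.character(df$block_id))
X_factor <- model.matrix(~ block_chr, data = df)

ncol(X_numeric)                # 2 parameters
ncol(X_factor)                 # 31 parameters, one per observation
nrow(df) - ncol(X_factor)      # residual degrees of freedom: 0
head(levels(df$block_chr))     # levels sorted as text, as in the output above
```

With as many parameters as observations, a model reproduces the data exactly, so a residual deviance near zero would not by itself say anything about the shape of the growth curve.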
If I plot this model, I get a surprisingly close approximation of the data. The residual deviance is extremely low, which would suggest that the model fits very well (does it?). My questions are:
1) Can I assume that the growth of my variable follows a logistic curve?
2) What other curves do you think I should try to fit?
3) What is the usual validation process to make the regression results publishable? What measures are usually reported?
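As a possible starting point for questions 1 and 3, a three-parameter logistic growth curve can be fitted directly to the counts by nonlinear least squares, using nls() with the self-starting model SSlogis() from base R. This is a minimal sketch, not a definitive analysis: it assumes the observations are blocks 1 to 31 in the order listed, treats block_id as a number, and assumes roughly constant error variance. Convergence of nls() is not guaranteed in general.

```r
# Observations from the question, assumed to be blocks 1..31 in order.
tot_users <- c(25, 83, 111, 164, 251, 370, 557, 815, 1154, 1513,
               2032, 2605, 3590, 4904, 5718, 6602, 7628, 8727, 9471, 10263,
               11047, 11799, 12441, 13040, 13634, 14168, 14582, 15143, 15649,
               16164, 16472)
df <- data.frame(block_id = seq_along(tot_users), tot_users = tot_users)

# Three-parameter logistic curve:
#   tot_users = Asym / (1 + exp((xmid - block_id) / scal))
# Asym is the plateau, xmid the inflection point, scal the time scale.
fit_logis <- nls(tot_users ~ SSlogis(block_id, Asym, xmid, scal), data = df)

coef(fit_logis)              # estimated Asym, xmid, scal
summary(fit_logis)$sigma     # residual standard error, in users
AIC(fit_logis)               # for comparison with other candidate curves

# Residuals by block; a systematic pattern here would indicate that the
# logistic shape does not describe the growth well.
res <- residuals(fit_logis)
round(res, 1)
```

The same pattern could be repeated with another self-starting model, for example SSgompertz(), and the fits compared by AIC and by their residual patterns.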
Thanks for any hint. Mulone