
As an example, suppose that I am interested in predicting what proportion of a rain water tank will be full today. I have the following pieces of information:

  1. The proportion of the rain water tank that was full yesterday.
  2. Other predictors, such as the number of people using the rain water tank.

What sort of analysis should I use to model this proportion? Can I:

  1. Fit a logistic regression model and use the proportion of the rain water tank that was full yesterday as one of the predictors.
  2. Calculate the change in proportion between today and yesterday, and use that as the response in a regression model.
  3. Use a Bayesian type model where the proportion of the rain water tank that was full yesterday is the prior in the model.
  4. Use time series analysis, even though there are only two points in time where we observe the fullness of the tank.
Alex
  • You can do all four of the above as well as other methods like Markov chains, glm, etc... there are many possible answers to your question and it depends on your data as to which model fits well... voting to close as too broad – Gaurav Sep 29 '15 at 04:27
  • No point in using time series for only two time periods; I'd add the previous period as a predictor, but I wouldn't use the usual logistic regression as the variance function will be wrong; that is for a count proportion, not a continuous proportion. If you can't fit a model suitable for a continuous proportion, then you might consider a *quasi-binomial* logistic regression, since you can at least model the mean and variance in similar fashion to a beta model. – Glen_b Sep 29 '15 at 06:53
  • But the question doesn't specifically say that you're interested in predicting tank level based on previous day only... You have other predictors as mentioned and for all we know you can always use more than 1 day previous level data to find a pattern... If you have some reason to assume that only the previous day level matters, then discrete time Markov chain is what I'd use... – Gaurav Sep 29 '15 at 10:08
  • It's arguable this question is too broad in its current form but it is difficult to imagine a way to make it more focused; so long as it is asking for modelling *approaches* (and there's more than one way to approach this) rather than *how to fit these models* (which would best be asked as separate questions for each approach, assuming they're not duplicates) I think this is on-topic here. – Silverfish Sep 29 '15 at 12:35
  • Thank you for your comments everyone. @Gaurav, no, it is not possible to use data for any previous days. Consider the response as samples of the tank level on independent days. It just happens that we know the tank level on the previous day + other predictors – Alex Sep 29 '15 at 22:36
  • @Glen_b your comment seems to be very helpful, but I do not understand most of what you are saying. It would be really appreciated if you could flesh out your suggestion of a quasi-binomial logistic regression as an answer, or point me to some references. I did find this, which seems to be in line with what you are suggesting: http://www.petrkeil.com/?p=603 – Alex Sep 29 '15 at 22:40
  • And this is a CV question for quasibinomial: http://stats.stackexchange.com/questions/91724/what-is-quasibinomial – Alex Sep 29 '15 at 23:06
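
For readers following the quasi-binomial suggestion in the comments, here is a minimal sketch of what such a fit could look like. It keeps the logistic (logit-link) model for the mean and assumes the variance is phi * mu * (1 - mu), with the dispersion phi estimated from the Pearson residuals. The coefficient estimates are the same as those from a logistic regression fitted to the fractional response; only the standard errors are rescaled by sqrt(phi). Everything in the example is an assumption made for illustration: the data are simulated, the predictor names `prev` and `users` and the "true" coefficients are hypothetical, and the fitting routine is a hand-written IRLS loop rather than any particular library's implementation (in R the equivalent would be `glm` with `family = quasibinomial`).

```python
import math
import random


def inv_logit(eta):
    # Clamp the linear predictor so exp() cannot overflow and mu stays in (0, 1).
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_quasibinomial(X, y, n_iter=50, tol=1e-10):
    """Fit a logit-link mean model by IRLS; return (beta, phi).

    X: list of rows, each starting with 1.0 for the intercept.
    y: continuous proportions in [0, 1].
    phi: Pearson estimate of the dispersion in Var(y) = phi * mu * (1 - mu).
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        XtWX = [[0.0] * p for _ in range(p)]
        XtWz = [0.0] * p
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = inv_logit(eta)
            w = mu * (1.0 - mu)          # IRLS weight for the logit link
            z = eta + (yi - mu) / w      # working response
            for j in range(p):
                XtWz[j] += xi[j] * w * z
                for k in range(p):
                    XtWX[j][k] += xi[j] * w * xi[k]
        new_beta = solve(XtWX, XtWz)
        delta = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if delta < tol:
            break
    pearson = 0.0
    for xi, yi in zip(X, y):
        mu = inv_logit(sum(b * v for b, v in zip(beta, xi)))
        pearson += (yi - mu) ** 2 / (mu * (1.0 - mu))
    phi = pearson / (len(y) - p)
    return beta, phi


def predict(beta, prev, users):
    """Predicted proportion full today, given yesterday's level and user count."""
    return inv_logit(beta[0] + beta[1] * prev + beta[2] * users)


# Simulated, purely hypothetical data: 200 independent days.
random.seed(0)
X, y = [], []
for _ in range(200):
    prev = random.uniform(0.05, 0.95)    # proportion full yesterday
    users = random.randint(0, 10)        # number of people using the tank
    mu = inv_logit(-1.0 + 4.0 * prev - 0.15 * users)
    today = min(0.99, max(0.01, mu + random.gauss(0.0, 0.05)))
    X.append([1.0, prev, float(users)])
    y.append(today)

beta, phi = fit_quasibinomial(X, y)
print("coefficients (intercept, prev, users):", beta)
print("estimated dispersion phi:", phi)
print("predicted level for prev=0.8, users=2:", predict(beta, 0.8, 2))
```

Yesterday's level enters simply as one more predictor, which matches the setup described above where the days are treated as independent samples. A dispersion well below 1 would indicate that the proportions vary much less than a binomial model with one trial per day would imply, which is the reason the plain logistic variance function is not appropriate here.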
