I think your data can be viewed as multilevel data, where measurements are nested in subjects. For 20% of your cases measurements at 2-3 time points are missing. Maximum likelihood (ML) fitting of a mixed-effects model will assume that these observations are missing at random (MAR).
Having said this, if you fit a multilevel model, each subject will have its own intercept to allow for heterogeneity in initial scores at t=1. I would additionally include a random slope for the time indicator(s) to allow for between-subject differences in the change across time.
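As a rough sketch of what such a model looks like, assuming a single linear time variable and using my own notation ($y_{ti}$ for the score of subject $i$ at time $t$; your actual time coding may differ):

$$
y_{ti} = \beta_0 + \beta_1\,\text{time}_{ti} + u_{0i} + u_{1i}\,\text{time}_{ti} + \varepsilon_{ti},
\qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim N(\mathbf{0}, \Sigma),
\quad
\varepsilon_{ti} \sim N(0, \sigma^2)
$$

Here $u_{0i}$ is the subject-specific deviation in the intercept and $u_{1i}$ the subject-specific deviation in the slope of time. If time is coded so that the first measurement occasion is 0, the intercept corresponds to the score at t=1.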
About the quantity of missing information: 20% missing observations still seems low enough to allow ML fitting. If you want to be really sure that the missing data do not cause any problems, you could use a multiple imputation (MI) model before estimating the multilevel model. Multiple imputation also assumes that the data are MAR, but the Bayesian nature of the method may be easier to accommodate in BUGS. An alternative in R is the package mice.
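However, if data are missing only in the dependent variable (DV), the big advantage of multilevel modeling is that its ML algorithm can handle the missing data during model fitting: it uses all available observations from each subject, so subjects with incomplete measurements do not have to be dropped. If the ML algorithm does not have any convergence issues, the estimates from MI and ML should be very similar.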