Context
- You have 200 observations of an individual's time to run 100 metres, measured once a day for 200 days.
- Assume the individual was not a runner before commencing practice.
- At each time point, based on that day's observation and the 199 other observations, you want to estimate the latent time it would take the individual to run 100 metres if they (a) applied maximal effort; and (b) had a reasonably good run for them (i.e., no major problems with the run, but still a typical run). Let's call this latent potential.
Of course, the actual data would not measure latent potential directly; the data would be noisy (see the simulation sketch after this list):
- Times would vary from run to run.
- On some days the individual would be particularly slow because of one or more problems (e.g., tripping at the start, getting a cramp halfway through, not putting in much effort). Such problems would produce large outliers.
- On some days the individual would be slower than you'd expect, perhaps because of more minor issues.
- In general, with practice the runner's latent potential would be expected to improve (i.e., get faster).
- In rare cases, latent potential could worsen (e.g., because of injury).
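To make these assumptions concrete, here is a small simulation of the kind of data-generating process I have in mind (all parameter values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n_days = 200

# Latent potential: improves with practice along a learning curve,
# with a rare lasting setback (e.g., an injury around day 120).
potential = 16.0 - 3.0 * (1 - np.exp(-np.arange(n_days) / 60.0))
potential[120:] += 0.8

# Observed times: potential + small symmetric luck + minor slowdowns
# + occasional large positive outliers (tripping, cramps, low effort).
luck = rng.normal(0.0, 0.15, n_days)
minor = rng.exponential(0.2, n_days)
problem = rng.binomial(1, 0.05, n_days) * rng.exponential(3.0, n_days)
observed = potential + luck + minor + problem
```

Running this gives 200 right-skewed times scattered around a falling trend, which is the shape of data I'm asking about.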
The implications of this:
- The occasional slow time might provide minimal information about what the individual is capable of.
- A fast time suggests that the individual is capable of such a fast time, but a small part of it might be good fortune on the day (e.g., a favourable wind, a little luck at the start). One way to formalise this asymmetry is sketched below.
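One way to formalise these implications (purely a sketch; every distributional choice here is an assumption on my part) is a state-space model in which latent potential $\theta_t$ evolves slowly, and each observed time $y_t$ is potential plus symmetric luck plus a nonnegative "problem" delay:

$$y_t = \theta_t + \varepsilon_t + \delta_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$$
$$\theta_t = \theta_{t-1} + \eta_t, \qquad \eta_t \sim \mathcal{N}(\mu, \sigma_\eta^2), \quad \mu < 0 \text{ (practice trend)}$$
$$\delta_t \sim \begin{cases} 0 & \text{with probability } 1 - \pi \\ \mathrm{Exponential}(\lambda) & \text{with probability } \pi \end{cases}$$

Because $\delta_t$ is never negative, slow times are largely explained away by the delay component, while fast times pull the estimate of $\theta_t$ down, matching the asymmetry above. For tractability, the mixture could be collapsed into a single exponential term, giving an exponentially modified Gaussian (ExGaussian) likelihood.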
The question: How could one estimate latent potential at each of the 200 time points based on the available data and a few assumptions about the nature of running times?
Initial Thoughts: I imagine some form of Bayesian approach could combine the available information and assumptions into an estimate, but I'm not sure where to look for such models. I'm also not clear on how the effectiveness of such a model would be evaluated.
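In case it helps to see the direction I'm considering, here is a rough PyMC sketch of such a model (the priors, parameter values, and the choice of an ExGaussian likelihood are all my own assumptions, not established modelling choices):

```python
import numpy as np
import pymc as pm

def fit_latent_potential(observed_times: np.ndarray):
    n = len(observed_times)
    with pm.Model():
        # Latent potential: Gaussian random walk with a (typically negative)
        # drift, so practice gains and rare setbacks both live in the steps.
        drift = pm.Normal("drift", mu=-0.02, sigma=0.02)
        sigma_step = pm.HalfNormal("sigma_step", sigma=0.05)
        potential = pm.GaussianRandomWalk(
            "potential", mu=drift, sigma=sigma_step,
            init_dist=pm.Normal.dist(16.0, 2.0), steps=n - 1,
        )
        # Right-skewed likelihood: Normal jitter (luck in either direction)
        # plus an Exponential slowdown (problems only ever add time).
        sigma_obs = pm.HalfNormal("sigma_obs", sigma=0.3)
        nu = pm.HalfNormal("nu", sigma=1.0)  # mean extra time from problems
        pm.ExGaussian("y", mu=potential, sigma=sigma_obs, nu=nu,
                      observed=observed_times)
        return pm.sample(1000, tune=1000, target_accept=0.95)
```

The posterior over "potential" would then be the estimate of latent potential at each of the 200 days. On evaluation, the only approach I can think of is simulation-based checking: fit the model to data simulated as above, where the true potential is known, and see whether the posterior recovers it; posterior predictive checks (e.g., pm.sample_posterior_predictive) could assess fit to the real data. I'd welcome pointers to standard approaches.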