
I have a time course experiment that follows 8 treatment groups of 12 fish for 24 hours with observations made at 5 second intervals. Among the measurements made is how far each fish travels (in mm) between observations. The 24 hours are divided into 1 dark period and 1 light period.

Here is a plot of the movements of the 12 individual fish in treatment group H for the first hour of the dark period:

[Figure: control group during 1st hour of dark]

You can see that some fish have long periods of inactivity, some short periods, and some have none during this particular window. I need to combine the data from all 12 fish in the treatment group in such a way as to identify the length and frequency of the rest periods during the entire dark period and the entire light period. I need to do this for each treatment group. Then I need to compare the differences between their rest period lengths and frequencies.

I'm not a stats gal, and I'm completely at sea. The problem resembles sequence alignment to me (I have a bioinformatics background), so I'm thinking of hidden Markov models, but this may be way off base. Could anyone suggest a good approach to this problem and perhaps a small example in R?

Thanks!

dnagirl
  • This seems to have several subproblems. First, you need to find and extract the periods of inactivity for each fish. A person can do it by just looking, but if you have a lot of data, maybe you want to do it automatically. Second, you need to combine them for all the fish in one group. Third, you need to compare them across groups. Which of these do you need help with? Since you mentioned HMMs, I'm guessing the first. If so, a trivial method would be just to find any interval when motion = 0. Why wouldn't it work? If there is one brief spike during rest, do you count it as one long period – SheldonCooper Jul 15 '11 at 01:45
  • or two short periods? If the motion is not 0, but very close to 0, do you want to count that as rest? – SheldonCooper Jul 15 '11 at 01:46
  • @SheldonCooper: it's the first issue I'm having trouble with. Minimal activity should not count as a disruption of rest. So sample c93 has 1 (or possibly 2) period(s) of inactivity, but sample c87 has 4. I was thinking of normalizing the activity scale to 1 based on individual maxes so that the change in activity is proportional rather than absolute. I also had the thought that since these observations appear roughly cyclic, perhaps Fourier analysis might be the appropriate approach. – dnagirl Jul 15 '11 at 12:22
  • I think we'll need a formal definition of "inactivity" or "rest". It's OK if you state it in terms of what the fish does rather than in terms of what the plots look like (we can translate later), but it has to be formal. If a formal definition is exactly what you need help with, here are some things to consider: 1. I don't think there is any standard definition of "inactivity in time series" that you could just take and use; 2. it could be useful to know more context about your experiments to come up with a definition; – SheldonCooper Jul 15 '11 at 19:28
  • 3. in biology people are often very accepting of simple and straightforward definitions such as "inactivity is any continuous period when motion is less than 0.1, possibly interrupted by spikes of height no more than 2 and duration no more than 20 sec, spaced at least 5 minutes apart". Sure, this definition is brittle and thresholds seem arbitrary (what if they are spaced 4 min 55 sec apart?), but the main purpose is to remove the bias a person might have in classifying rest periods using common sense. – SheldonCooper Jul 15 '11 at 19:32
  • What is the main goal of your experiment? Do you use different species? What are the dependent and independent variables? – Pantera Jul 27 '11 at 08:02
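
The threshold-style definition suggested in the comments above could be sketched in base R roughly as follows. This is a simplified illustration, not a recommended definition: the threshold (0.1 mm), the tolerated spike length (1 sample) and the toy `movement` vector are assumptions, and the spike-height and spike-spacing conditions from the comment are not implemented. Only the 5-second sampling interval comes from the question.

```r
# Toy distances (mm) moved per 5-second interval for one fish (made-up data).
movement <- c(0, 0, 0, 3, 0, 0, 5, 6, 7, 8, 0, 0)

rest_threshold <- 0.1  # assumed: movement below this counts as rest
max_spike_len  <- 1    # assumed: active runs up to this many samples are tolerated
interval_sec   <- 5    # sampling interval from the question

# Run-length encode the rest/active indicator.
runs <- rle(movement < rest_threshold)
n <- length(runs$values)

# Relabel short active runs as rest when they are interior runs, i.e. they
# have a rest run on both sides (runs alternate, so being interior is enough).
idx <- seq_len(n)
short_spike <- !runs$values & runs$lengths <= max_spike_len & idx > 1 & idx < n
runs$values[short_spike] <- TRUE

# Re-encode so neighbouring rest runs merge, then convert lengths to seconds.
merged <- rle(inverse.rle(runs))
rest_lengths_sec <- merged$lengths[merged$values] * interval_sec
n_rest_periods <- length(rest_lengths_sec)
```

Here `rest_lengths_sec` holds the duration of each rest period and `n_rest_periods` their count for one fish; the same steps could be repeated per fish and per light/dark period.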

1 Answer


I think an HMM-based analysis could be helpful for you. Since you know that you are looking for a distinction between rest and motion, you can just postulate a 2-state model. For HMMs, you need to specify the emission probability for each state. My first try would be to use an exponential (or a gamma?) for the resting phase (since it is bounded by zero from below) and a normal distribution for the other state (you should set the initial parameters to some reasonable values). You can then calculate the posterior state distribution along with the maximum-likelihood estimates for your parameters. The posterior state sequence gives you the estimated lengths of the resting and activity periods (just count the lengths of the runs of successive identical states). You could even put the dark/light period into the model as a covariate.
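
Counting runs of successive states can be done in base R with `rle()`. The sketch below assumes you already have a decoded state sequence for one fish; the `state` and `period` vectors are made-up toy data, the helper `rest_bouts()` is hypothetical, and the labelling "1 = rest" is an assumption (with a fitted model you would check which state has the smaller mean activity).

```r
# Hypothetical decoded states for one fish: 1 = rest, 2 = active (assumed labels).
state  <- c(1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 1, 1)
# Toy dark/light labels for the same 12 observations.
period <- rep(c("dark", "light"), each = 6)
interval_sec <- 5  # sampling interval from the question

# Durations (in seconds) of all rest runs in a state sequence.
rest_bouts <- function(s, rest_label = 1, dt = interval_sec) {
  r <- rle(s == rest_label)
  r$lengths[r$values] * dt
}

# Compute rest bouts separately within the dark and the light period.
bouts <- lapply(split(state, period), rest_bouts)

# Frequency and mean length of rest bouts per period.
bout_count <- sapply(bouts, length)
bout_mean  <- sapply(bouts, mean)
```

Splitting by period first means a rest bout that spans the dark/light boundary is counted once in each period; whether that is appropriate depends on how you want to define a bout.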

depmixS4 (http://cran.r-project.org/web/packages/depmixS4/index.html) is a great package for HMMs. Its vignette (http://cran.r-project.org/web/packages/depmixS4/vignettes/depmixS4.pdf) has very useful information about its application and the usage of constraints and covariates with HMMs as well.

One problem I'm seeing is that you have multiple fish. You should start by fitting an HMM for each fish separately. Maybe you could combine fish if you could somehow "normalize" the activity so that they yield the same emission probability parameters. Or you could use the fish number as a covariate.
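
One possible way to "normalize" is to scale each fish by its own maximum, as suggested in the question's comments. A minimal base R sketch, assuming a long-format data frame with hypothetical columns `fish` and `activity` (the values below are made up):

```r
# Toy long-format data: one row per observation, rows grouped by fish.
fish_data <- data.frame(
  fish     = rep(c("c87", "c93"), each = 4),
  activity = c(0, 2, 4, 8, 0, 1, 5, 10)
)

# Scale each fish's activity to [0, 1] by its own maximum
# (a fish that never moves is left at 0 to avoid dividing by zero).
scale_by_max <- function(x) if (max(x) > 0) x / max(x) else x
fish_data$activity_scaled <- ave(fish_data$activity, fish_data$fish,
                                 FUN = scale_by_max)

# Lengths of the consecutive per-fish blocks. depmix() has an `ntimes`
# argument that takes such a vector of lengths of independent series.
series_lengths <- rle(as.character(fish_data$fish))$lengths
```

Whether per-fish scaling is a sensible normalization is a judgment call; it makes emission parameters more comparable across fish but is sensitive to a single extreme observation.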

Some example code:

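library(depmixS4)  # HMM fitting
set.seed(1)        # fit() starts from random values; fix the seed for reproducibility
# 2-state HMM (rest vs. motion) with Gaussian emissions; 'yourdata' needs an 'activity' column
mod <- depmix(activity ~ 1, data = yourdata, nstates = 2,
              family = gaussian())
fm <- fit(mod)     # maximum-likelihood fit (named 'fm' to avoid masking stats::fitted)
summary(fm)        # estimated transition and emission parameters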

There are many, many other possibilities, though; check out the links above!

Good luck with your project!

thias