Compare distributions (with replicates) and visualization

Question

The experimental set-up is an almost typical control vs treatment, in which we are counting how many reads map to a particular class of genes in each condition (RNA-seq), done in triplicate. An example table would look like this:

feature  c1  c2  c3  t1  t2  t3
A        10   7  11   5   3   4
B         9  10  15   6   7   8
C        15   6  10  11   5   9
(... up to several hundred features)

The question we are asking is:

"Is the distribution of counts in the control different from the treatment?".

Which test would be most appropriate for this? From a previous question, an ANOVA sounds like the way to go. In R, I could melt the data.table and run aov(Counts ~ Conditions, data). Would this be acceptable?

Extra points for suggestions on how to represent the data, including the replicates. Since there is more than one treatment, something like this, or boxplots for each replicate, will become too cluttered.

If you are dealing with count data you should probably use a GLM and not an ANOVA. — Roland, May 28 '15 at 10:57

brumar · Accepted Answer · 2015-06-01T14:09:16.270

Edit: after Roland's comments, I think a Poisson GLM is suitable here.

I could be wrong, but if your triplicates come in pairs (c1 counting more or less the same thing as t1), using two factors might be more convenient. It allows you to represent the category and the treatment/control condition separately, since you want to focus on the difference between control and treatment.

It would then give:
glm(Counts ~ Condition + Categ, data = data, family = poisson)
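As a rough end-to-end sketch of that model, the snippet below reshapes the three example rows from the question into long format and fits the Poisson GLM. The column names `Counts`, `Condition` and `Categ` follow the formula above; the names `wide`, `long` and `Sample`, and the rule that the first letter of a column name encodes the condition, are assumptions made for illustration only. The last lines compute a crude dispersion estimate; if it is well above 1, the `quasipoisson` family may be the safer choice.

```r
# Toy data copied from the example table in the question (three features only).
wide <- data.frame(
  feature = c("A", "B", "C"),
  c1 = c(10, 9, 15), c2 = c(7, 10, 6), c3 = c(11, 15, 10),
  t1 = c(5, 6, 11),  t2 = c(3, 7, 5),  t3 = c(4, 8, 9)
)

# Reshape to long format with base R: one row per (feature, replicate).
sample_cols <- setdiff(names(wide), "feature")
long <- data.frame(
  Categ  = factor(rep(wide$feature, times = length(sample_cols))),
  Sample = rep(sample_cols, each = nrow(wide)),
  Counts = unlist(wide[sample_cols], use.names = FALSE)
)

# Assumption: column names starting with "c" are controls, the rest treatments.
long$Condition <- factor(
  ifelse(substr(long$Sample, 1, 1) == "c", "control", "treatment")
)

# Poisson GLM with condition and feature category as additive factors.
fit <- glm(Counts ~ Condition + Categ, data = long, family = poisson)
summary(fit)

# Crude overdispersion check: Pearson chi-square over residual degrees of freedom.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Same mean model, but with the dispersion estimated from the data.
fit_quasi <- glm(Counts ~ Condition + Categ, data = long, family = quasipoisson)
summary(fit_quasi)
```

With "control" as the reference level, the `Conditiontreatment` coefficient is the log ratio of expected counts in treatment versus control, averaged over categories.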

edited Jun 01 '15 at 14:09

answered May 28 '15 at 10:32

I don't understand why you'd use `family = binomial`. I'd expect that the `poisson` or `quasipoisson` families would be more appropriate. – Roland Jun 01 '15 at 13:27
I was thinking that a binomial glm was a fairly standard way to go. After reading your comment I had a look at this question http://stats.stackexchange.com/questions/60643/difference-between-binomial-negative-binomial-and-poisson-regression and understood that things are more complicated. As I lack experience and knowledge for the moment, I'll edit to be in line with your suggestion. Thank you for bringing that up. – brumar Jun 01 '15 at 14:06
