
I would appreciate any suggestions on how to analyse the following dataset. It was suggested that I use ANOVA, but I wanted to check with the community first.

I have a protein for which I have calculated the entropy of each reading frame (blue, red and yellow). In the attached figure, each bar is the sum of the calculated values; it would probably be better to show means rather than sums. Eyeballing it, the yellow frame has a higher entropy than the blue and red frames, whereas red and blue look similar. However, I need to back this up statistically, and I would only compare reading frames within one genotype. It's been a while, so I'm unsure whether to use ANOVA (repeated measures, since there are 3 groups?), pairwise.t.test in R, or even Friedman's test. Perhaps this question is similar to "How to test hypothesis of no group differences?"

I didn't find an existing question and answer that matched my situation, so apologies if I missed one.

[Figure: bar chart of summed entropy per reading frame (blue, red, yellow) for each genotype]

Liam

1 Answer


A two-way ANOVA or a couple of t-tests is probably just fine. Is the entropy Gaussian distributed? Probably not exactly, but you're going to get tiny $p$-values off of this regardless. For example, I threw it into R and fit a quick linear model:

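# Values read off the figure: 8 genotypes (A-H) x 3 reading frames
d <- data.frame(genotype=factor(rep(c('A','B','C','D','E','F','G','H'), times=3)),
                frame=factor(rep(c('f0','f1','f2'), each=8)),
                entropy=c(12.1, 20.2, 21.2, 18.0, 18.0, 21.2, 14.8, 19.0,  # f0
                          9.6, 8.0, 7.6, 7.3, 5.3, 10.8, 4.8, 3.6,         # f1
                          5.1, 7.8, 7.8, 7.3, 7.1, 7.8, 6.4, 7.6))         # f2

# Additive model: genotype acts as a blocking factor, frame is the effect of interest
f <- lm(entropy ~ genotype + frame, data=d)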

(I renamed frame 3 to f0 so that it sorts first and becomes R's reference level. That way the framef1 and framef2 coefficients are directly the differences of frames 1 and 2 from frame 3, without futzing with anything.) Then summary(f) gives

Call:
lm(formula = entropy ~ genotype + frame, data = d)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1292 -0.7990 -0.0438  0.9719  4.3083 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  16.2292     1.3792  11.768 1.20e-08 ***
genotypeB     3.0667     1.7445   1.758   0.1006    
genotypeC     3.2667     1.7445   1.873   0.0822 .  
genotypeD     1.9333     1.7445   1.108   0.2864    
genotypeE     1.2000     1.7445   0.688   0.5028    
genotypeF     4.3333     1.7445   2.484   0.0263 *  
genotypeG    -0.2667     1.7445  -0.153   0.8807    
genotypeH     1.1333     1.7445   0.650   0.5264    
framef1     -10.9375     1.0683 -10.238 6.97e-08 ***
framef2     -10.9500     1.0683 -10.250 6.87e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 2.137 on 14 degrees of freedom
Multiple R-squared: 0.9157, Adjusted R-squared: 0.8615 
F-statistic:  16.9 on 9 and 14 DF,  p-value: 4.697e-06 
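
The summary above compares f1 and f2 each against f0, but not against each other, and it reports one t-test per level rather than one test per factor. A minimal sketch of how both could be pulled out, assuming the same values read off the figure (the name f1ref is just an illustrative choice):

```r
# Same data as above (values read off the figure, so approximate)
d <- data.frame(genotype = factor(rep(LETTERS[1:8], times = 3)),
                frame    = factor(rep(c("f0", "f1", "f2"), each = 8)),
                entropy  = c(12.1, 20.2, 21.2, 18.0, 18.0, 21.2, 14.8, 19.0,
                             9.6, 8.0, 7.6, 7.3, 5.3, 10.8, 4.8, 3.6,
                             5.1, 7.8, 7.8, 7.3, 7.1, 7.8, 6.4, 7.6))
f <- lm(entropy ~ genotype + frame, data = d)

# Two-way ANOVA table: one F-test per factor (genotype, frame)
print(anova(f))

# Make f1 the reference level so the f2 coefficient is the f2 - f1 difference
d$frame <- relevel(d$frame, ref = "f1")
f1ref <- lm(entropy ~ genotype + frame, data = d)
print(coef(summary(f1ref))["framef2", ])
```

The releveled fit is the same model with a different baseline, so the residuals and the ANOVA table do not change; only which pairwise differences show up as coefficients.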

Entropy is the result of additive effects, none of them particularly wild, so a Gaussian approximation should be reasonable, and frame $p$-values of about $7 \times 10^{-8}$ are plenty to go on.
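
If the Gaussian assumption is still a worry, the other options mentioned in the question are cheap to run as a cross-check. A rough sketch, again assuming the values read off the figure, with genotype treated as the block (or pairing) variable:

```r
# Same values as above (read off the figure, so approximate)
entropy <- c(12.1, 20.2, 21.2, 18.0, 18.0, 21.2, 14.8, 19.0,
             9.6, 8.0, 7.6, 7.3, 5.3, 10.8, 4.8, 3.6,
             5.1, 7.8, 7.8, 7.3, 7.1, 7.8, 6.4, 7.6)

# Rows = genotypes (blocks), columns = reading frames (filled column-wise)
m <- matrix(entropy, nrow = 8,
            dimnames = list(genotype = LETTERS[1:8],
                            frame = c("f0", "f1", "f2")))

# Friedman test: rank-based test for a frame effect within genotypes
ft <- friedman.test(m)
print(ft)

# Paired t-tests between frames, paired on genotype, Holm-adjusted
frame <- factor(rep(c("f0", "f1", "f2"), each = 8))
pt <- pairwise.t.test(entropy, frame, paired = TRUE,
                      p.adjust.method = "holm")
print(pt)
```

The pairing relies on the genotypes being listed in the same order within each frame, which is how the vector above is laid out. In these data f0 is the largest frame in every single genotype, so the rank-based test should tell the same story as the linear model.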

user873