I've been having trouble comparing two sets of data. I can't work out how to test whether two fitted models describe the same set of data or different sets.
Here is a portion of my data:
ZT WT_PAL Line_37_PAL WT_PhPRR5 Line_37_PhPRR5 WT_EOBI Line_37_EOBI WT_EOBII Line_37_EOBII WT_CM1 Line_37_CM1 WT_ADT
1 0 0.08017366 0.000959987 0.26035363 0.03264146 1.46476869 0.009786237 4.16477772 0.000742414 0.07395887 0.000456353 0.06000000
2 0 0.05930462 0.021197691 0.26147552 0.22926780 1.57837816 0.926847383 1.15031587 0.461807744 0.03682062 0.101097795 0.05322561
3 0 0.14389513 0.756356081 0.63035752 0.72129878 1.76452175 0.640368308 2.42348584 1.364089162 0.12954215 0.892205209 0.13821109
4 4 0.12194367 0.297290671 0.13444482 0.14225469 0.99144104 1.131902963 0.91522009 0.910081812 0.29664680 0.505630813 0.51706760
5 4 0.06025697 0.164053161 0.15448683 0.26627386 1.31917230 1.519721821 0.62925084 2.483566296 0.12296628 0.364813045 0.35061055
6 4 0.20896743 0.249435523 1.23052341 0.61818565 1.77819303 1.284683192 1.41398975 1.523446689 0.30023862 0.282538740 0.56811626
7 8 2.38864472 0.042225180 1.54472331 0.04236890 1.04169534 0.860432687 0.26977645 2.001020769 2.93724542 1.340914776 3.00230489
8 8 2.27484249 0.108464160 1.27963226 0.21218338 0.92997042 0.999347054 0.24756421 0.878011535 2.36280758 0.564269963 2.05923549
9 8 1.72728498 0.284489142 1.17311707 0.63301025 0.73380469 0.863829602 0.20109633 0.831139775 2.37338677 1.046991612 2.24797092
10 12 1.13821434 0.462596491 2.22919520 0.15287139 0.34310114 0.817010999 0.29965738 0.236064056 1.18592546 0.725928756 1.01932917
11 12 1.10145755 0.368458720 2.13568842 0.39531534 0.33147292 1.107039633 0.32343745 0.888220142 0.98362898 0.663785645 0.93808648
12 12 1.91985246 0.219754262 1.44412345 0.66775319 0.22753689 0.513590231 0.07657606 1.100251286 1.75011191 0.251849690 1.61130028
13 16 0.68005324 0.396014538 0.31868826 0.14759449 0.38865638 0.778205100 1.09767555 0.627603654 0.55060102 0.784160371 0.60319061
14 16 0.83616544 0.514261850 0.21921500 0.19384070 0.22801491 1.029590354 0.12193953 0.494258870 0.62367453 0.868126888 0.59068953
15 16 0.59058070 0.758966630 0.56687274 0.80844039 0.12417071 0.698339222 0.12503996 1.321782313 0.50518054 1.127351763 0.90570233
16 20 0.30896858 0.376021422 0.18652112 0.16757942 0.50239187 0.823056297 0.30242397 0.549940528 0.32069459 0.464616256 0.33701357
17 20 0.04854291 0.231663315 0.07268395 0.10814706 0.07590502 0.620767904 0.03008203 0.491554754 0.04180077 0.374756383 0.04942141
18 20 0.81359279 0.833815983 0.58218634 0.32892256 0.35501741 0.381413660 0.34660498 0.558786138 0.43100429 0.645363500 0.99771479
What I would like to do is test whether the expression profile of Line_37_PAL over time is significantly different from that of WT_PAL.
The first thing I did was fit polynomial models of increasing degree to WT_PAL:
fitWT_PAL_1 <- lm(data1$WT_PAL ~ data1$ZT)
fitWT_PAL_2 <- lm(data1$WT_PAL ~ data1$ZT + I(data1$ZT^2))
fitWT_PAL_3 <- lm(data1$WT_PAL ~ data1$ZT + I(data1$ZT^2) + I(data1$ZT^3))
fitWT_PAL_4 <- lm(data1$WT_PAL ~ data1$ZT + I(data1$ZT^2) + I(data1$ZT^3) + I(data1$ZT^4))
fitWT_PAL_5 <- lm(data1$WT_PAL ~ data1$ZT + I(data1$ZT^2) + I(data1$ZT^3) + I(data1$ZT^4) + I(data1$ZT^5))
Of these, fitWT_PAL_4 fit the data best.
I then did the same for Line_37_PAL, where fit37_PAL_5 proved to be the best fit.
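For context, one common way to compare nested polynomial fits like these is a sequential F-test with anova() alongside AIC. The sketch below is an assumption about how such a comparison might look, not necessarily how the models above were chosen. It uses only the 18 WT_PAL rows shown above and swaps the I(ZT^k) terms for poly(), which spans the same polynomial space and so gives the same fitted values.

```r
# ZT and WT_PAL copied from the 18 rows shown above (only a portion of the full data)
data1 <- data.frame(
  ZT = rep(c(0, 4, 8, 12, 16, 20), each = 3),
  WT_PAL = c(0.08017366, 0.05930462, 0.14389513,
             0.12194367, 0.06025697, 0.20896743,
             2.38864472, 2.27484249, 1.72728498,
             1.13821434, 1.10145755, 1.91985246,
             0.68005324, 0.83616544, 0.59058070,
             0.30896858, 0.04854291, 0.81359279)
)

# Fit polynomials of degree 1 to 5; poly() builds orthogonal polynomial terms,
# which avoids the numerical trouble of raw powers such as ZT^5
fits <- lapply(1:5, function(d) lm(WT_PAL ~ poly(ZT, d), data = data1))

# Sequential F-tests: each row asks whether one extra degree improves the fit
cmp <- anova(fits[[1]], fits[[2]], fits[[3]], fits[[4]], fits[[5]])
print(cmp)

# AIC for each degree (lower is better) as a second opinion
aic <- sapply(fits, AIC)
print(aic)
```

These comparisons are valid because all five models share the same response (WT_PAL) and each lower-degree model is nested in the next.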
Next, I wanted to test whether the two models describe the same expression profile, or whether the profiles they describe are in fact different.
But when I run anova() on the two fits I get:
> anova(fit37_PAL_4, fitWT_PAL_1)
Analysis of Variance Table
Response: data1$Line_37_PAL
Df Sum Sq Mean Sq F value Pr(>F)
data1$ZT 1 0.10797 0.107974 1.7944 0.1901
I(data1$ZT^2) 1 0.00342 0.003422 0.0569 0.8131
I(data1$ZT^3) 1 0.12717 0.127171 2.1134 0.1561
I(data1$ZT^4) 1 0.04095 0.040949 0.6805 0.4157
Residuals 31 1.86536 0.060173
Warning message:
In anova.lmlist(object, ...) :
models with response ‘"data1$WT_PAL"’ removed because response differs from model 1
I'm assuming this is because my Y-values come from two different sets of data? Please correct me if I'm wrong; I would be thankful for any advice you might be able to give.
I also compared the predicted values of a model against the actual values using t.test(x, y, paired = TRUE), but that only tests the difference in means of the two populations, not possible differences in expression patterns. How should I proceed?
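Since the warning says anova() drops models whose response differs, one arrangement that would give it a single response is to stack both lines into one long data frame and compare a model with a shared time course against one where the time course may differ by line. This is a minimal sketch under assumptions: the names long, line, expr, fit_common and fit_separate are hypothetical, the degree-4 polynomial is chosen only for illustration, and only the 18 rows shown above are used.

```r
# Values copied from the 18 rows shown above (only a portion of the full data)
zt <- rep(c(0, 4, 8, 12, 16, 20), each = 3)
wt_pal <- c(0.08017366, 0.05930462, 0.14389513,
            0.12194367, 0.06025697, 0.20896743,
            2.38864472, 2.27484249, 1.72728498,
            1.13821434, 1.10145755, 1.91985246,
            0.68005324, 0.83616544, 0.59058070,
            0.30896858, 0.04854291, 0.81359279)
line37_pal <- c(0.000959987, 0.021197691, 0.756356081,
                0.297290671, 0.164053161, 0.249435523,
                0.042225180, 0.108464160, 0.284489142,
                0.462596491, 0.368458720, 0.219754262,
                0.396014538, 0.514261850, 0.758966630,
                0.376021422, 0.231663315, 0.833815983)

# Long format: one response column (expr) plus a factor marking the line
long <- data.frame(
  ZT   = rep(zt, 2),
  line = factor(rep(c("WT", "Line_37"), each = length(zt)),
                levels = c("WT", "Line_37")),
  expr = c(wt_pal, line37_pal)
)

# Shared time course for both lines (hypothetical degree 4)
fit_common <- lm(expr ~ poly(ZT, 4), data = long)

# Separate time course per line: adds a line effect and line-by-time interactions
fit_separate <- lm(expr ~ poly(ZT, 4) * line, data = long)

# Both models have the same response, so anova() can compare them;
# a small p-value suggests the two lines follow different profiles
cmp <- anova(fit_common, fit_separate)
print(cmp)
```

The F-test here asks whether letting the intercept and every polynomial term vary by line explains significantly more variance than a single shared curve, which addresses the shape of the profile rather than only the difference in means.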