I have a dataset of counts of four different metadata factors associated with a gene and two experimental groups, FGT and free, with 52 and 40 unique genes respectively. The first 100 rows can be found here: https://pastebin.com/PAG5pCDh (I can provide more)
I fitted a Poisson GLM to the count data and found that the variable origin is a significant predictor, since the originfree coefficient is significant (I think I am understanding that correctly?). How do I determine whether originfree is associated with a higher or a lower count?
A truncated output of coefficients for the glm looks like this:
Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)           -7.100e-01  5.827e-01  -1.218 0.223062
originfree            -2.921e-01  8.830e-02  -3.308 0.000939 ***
variableDuplication    1.427e-01  1.116e-01   1.279 0.201013
variableKnown_target  -1.609e+00  2.000e-01  -8.047 8.47e-16 ***
variablePhylogeny      1.310e-01  1.119e-01   1.171 0.241491
geneGrpE               1.792e+00  6.236e-01   2.873 0.004063 **
genePGK               -4.455e-15  8.165e-01   0.000 1.000000
geneRibosomal_S14      6.931e-01  7.071e-01   0.980 0.326959
geneSHMT               2.079e+00  6.124e-01   3.396 0.000684 ***
geneTIGR00009          9.758e-15  8.165e-01   0.000 1.000000
geneTIGR00057          6.931e-01  7.071e-01   0.980 0.326959
geneTIGR00069         -6.149e-15  8.165e-01   0.000 1.000000
geneTIGR00079          1.386e+00  6.455e-01   2.148 0.031743 *
geneTIGR00105          1.386e+00  6.455e-01   2.148 0.031743 *
I see that originfree is significant, which I understand to mean that whether or not an observation is originfree significantly affects the model's ability to predict count (please correct me if I am wrong).
Now, how do I find out whether originfree is associated with an increase or a decrease in the count of the four metadata factors? Would I have to run a separate GLM on a subset of the data frame for each metadata factor in order to work this out?
My alternative hypothesis is that it would lead to a decrease.
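For reference, here is a rough sketch of how the sign of the coefficient would be read. It assumes the model uses the default log link for a Poisson GLM and that FGT is the reference level of origin (there is no originFGT row in the output, which suggests it is). The symbols μ (expected count) and β are my own notation and do not appear in the R output.

```latex
% Additive Poisson model on the log scale (notation assumed, not from the R output)
\log \mu \;=\; \beta_0
  \;+\; \beta_{\text{originfree}} \cdot \mathbb{1}[\text{origin} = \text{free}]
  \;+\; \beta_{\text{variable}}
  \;+\; \beta_{\text{gene}}

% Ratio of expected counts, free vs. FGT, with variable and gene held fixed
\frac{\mu_{\text{free}}}{\mu_{\text{FGT}}}
  \;=\; \exp\!\big(\hat\beta_{\text{originfree}}\big)
  \;=\; \exp(-0.2921)
  \;\approx\; 0.747
```

If this reading is right, the negative estimate means the expected count for free is roughly 25% lower than for FGT, with variable and gene held fixed. Because the model is additive, that ratio is the same for all four metadata factors. Checking whether the effect differs between factors would presumably call for an origin-by-variable interaction term in a single model rather than separate fits on subsets.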