I've been using a negative binomial model to relate the abundance of particles in the ocean to their size.
My GAM, in R code, looks like this:
gam(TotalParticles ~ log(lb), offset = log(binsize * vol), data = df, family = "nb")
I'm using gam rather than glm because the former can handle negative binomial and Poisson regressions with the same syntax, and I was comparing the two at one point.
Essentially I'm interested in the slope of the relationship between the log of size (lb) and the log of particle number (TotalParticles). The particles have been binned by size, so I normalize to the width of each bin (binsize). The particles are also collected from different volumes of water, so I also normalize to that volume (vol).
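To make the setup concrete, here is a minimal sketch using simulated data. Everything in it is made up for illustration (the sample size, the ranges of lb, binsize and vol, the "true" coefficients b0 and b1, and the dispersion), and it assumes the mgcv package with its nb() family and default log link. It fits the model with the offset passed as an argument, as above, and again with the offset written inside the formula, so the two slopes can be compared.

```r
library(mgcv)

set.seed(1)
n <- 200

# Hypothetical data: lower bin bound, bin width, and sampled volume
df <- data.frame(
  lb      = exp(runif(n, log(0.1), log(10))),
  binsize = runif(n, 0.05, 0.5),
  vol     = runif(n, 1, 20)
)

# Assumed "true" intercept and slope on the log scale (illustrative values)
b0 <- 2
b1 <- -1.5

# Expected count = normalized abundance * bin width * volume
mu <- exp(b0 + b1 * log(df$lb)) * df$binsize * df$vol
df$TotalParticles <- rnbinom(n, mu = mu, size = 5)

# Offset supplied as an argument, as in my model
m_arg <- gam(TotalParticles ~ log(lb), offset = log(binsize * vol),
             data = df, family = nb())

# Same model with the offset written inside the formula
m_frm <- gam(TotalParticles ~ log(lb) + offset(log(binsize * vol)),
             data = df, family = nb())

# Second coefficient is the slope on log(lb) in both fits
coef(m_arg)
coef(m_frm)
```

If the model is specified the way I think it is, both fits should return a slope close to the assumed b1. One practical difference I'm aware of: mgcv documents that an offset passed as an argument is ignored by predict(), whereas an offset in the formula is not.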
I'm trying to express this in math and words in a corresponding manuscript. Right now, based on this post, I'm writing the equation as:
$$ \ln\left(\frac{E(\mathrm{Total\,Particles})}{\mathrm{Volume} \times \mathrm{Binsize}}\right) = b_0 + b_1\,\ln(\mathrm{Size}) $$
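My reasoning for this form, as far as I understand it, is sketched here. It assumes the model uses a log link (which I believe is the default for the negative binomial family) and that the offset enters the linear predictor with its coefficient fixed at 1. Size corresponds to lb in the code. Under those assumptions the fitted model is

$$ \ln E(\mathrm{Total\,Particles}) = b_0 + b_1\,\ln(\mathrm{Size}) + \ln(\mathrm{Volume} \times \mathrm{Binsize}) $$

and subtracting the offset term from both sides gives

$$ \ln E(\mathrm{Total\,Particles}) - \ln(\mathrm{Volume} \times \mathrm{Binsize}) = b_0 + b_1\,\ln(\mathrm{Size}) $$

which is the same as the equation with the ratio inside the logarithm, because Volume and Binsize are treated as known constants for each observation rather than as random quantities.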
My co-authors keep asking me what E means in this context. My understanding is that we are predicting an expected "negative binomial" distribution of total particle numbers from size.
So far I have written:
The term on the left describes the expected volume and bin-size normalized count data, assuming a negative binomial distribution of residuals, with E referring to the conditional expectation of total particle numbers, assuming a negative binomial distribution.
I have two questions:
Does my equation actually correctly represent the linear model written in code? If not, is there a better way of writing it?
Is there a better way to express in words what is going on on the left-hand side of the equation?