Length-weight relationships (LWR) are widely used in fisheries studies: weight and length are recorded and log-transformed (base e or base 10), and the linear regression equation of log weight on log length is then derived. This equation can be used in future studies to estimate weight when only length is known.
I have an abundant fish species for which I wish to test how the LWR may be altered by site (3-level factor), season (2-level factor) and year (2-level factor).
I use a linear regression (in R) to test log weight as the dependent variable against the main effects and interactions of log length, site, season and year:
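For reference, the relationship described above is conventionally written as a power law that becomes linear after taking logs. The symbols $W$ (weight), $L$ (length), $a$ and $b$ are the usual convention and are not taken from my data:

$$W = a L^{b} \quad\Longrightarrow\quad \log W = \log a + b \log L$$

Here $\log a$ is the regression intercept and $b$ the slope, so a factor that changes the LWR can shift the intercept (a main effect) or the slope (an interaction with log length).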
log weight ~ log length * site * season * year
I derive a minimum adequate model using analysis of deviance between models with and without each term included to identify non-significant terms for elimination (starting with higher order interactions).
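A minimal sketch of this workflow follows, using simulated data because I cannot share the real set. The data frame `fish`, its column names and the values a = 0.01 and b = 3 are all assumptions made for illustration. It fits the full model, drops the 4-way interaction and compares the two nested fits with an F test.

```r
# Simulated stand-in data; names and parameter values are illustrative assumptions.
set.seed(1)
n <- 240
fish <- data.frame(
  length = runif(n, 10, 40),
  site   = factor(rep(c("A", "B", "C"), length.out = n)),
  season = factor(rep(c("wet", "dry"), each = 3, length.out = n)),
  year   = factor(rep(c("2010", "2011"), each = 6, length.out = n))
)
# W = a * L^b with multiplicative log-normal error
fish$weight <- 0.01 * fish$length^3 * exp(rnorm(n, sd = 0.1))

# Full model: all main effects and interactions
full <- lm(log(weight) ~ log(length) * site * season * year, data = fish)

# Remove the highest-order interaction and compare the nested models
reduced <- update(full, . ~ . - log(length):site:season:year)
cmp <- anova(reduced, full)
print(cmp)

# drop1() tests each term that can be removed without violating marginality
print(drop1(reduced, test = "F"))
```

For an `lm` fit, `anova()` on two nested models reports an F test on the change in residual sum of squares, which is the Gaussian counterpart of an analysis of deviance.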
This analysis has 2 aims:
1) to determine if interactive and then main effects of length, site, season and year significantly impact the LWR
2) to quantify the model variance accounted for by each significant model term, and thus the relative importance of each variable (quantified as the model term's sum of squares (SS) expressed as a proportion of the SS of the null model)
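The calculation in aim 2 can be sketched as below, again on simulated data with assumed names (`d`, `prop`) and a reduced set of predictors to keep it short. Note that `anova()` on a single `lm` fit returns sequential sums of squares, so in an unbalanced design the proportions depend on the order of the terms.

```r
# Simulated stand-in data; names and parameter values are illustrative assumptions.
set.seed(2)
n <- 120
d <- data.frame(
  length = runif(n, 10, 40),
  site   = factor(rep(c("A", "B", "C"), length.out = n))
)
d$weight <- 0.01 * d$length^3 * exp(rnorm(n, sd = 0.1))

fit <- lm(log(weight) ~ log(length) * site, data = d)
tab <- anova(fit)                      # sequential sums of squares per term

# Residual SS of the intercept-only (null) model = total SS
null_ss <- deviance(lm(log(weight) ~ 1, data = d))

# Each term's SS as a proportion of the null model SS
prop <- tab[["Sum Sq"]] / null_ss
names(prop) <- rownames(tab)
print(round(prop, 4))
```

The proportions, including the residual row, sum to one because the term and residual sums of squares partition the total.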
Note, 4- and 3-way interactions are difficult to interpret - this is not my objective; my objective is simply to establish variables' significance and quantify effect size.
My model residuals appear to be normally distributed with homogeneous variance.
I am aware there is discussion in the literature regarding potential caveats of using log-transformed response variables:
1) the partitioning of model variance among the independent variables and interaction terms may be affected
2) predictions back-transformed from the log scale may not estimate the mean accurately, as the mean of the log-transformed response is not the same as the logarithm of the mean response
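The second caveat can be made concrete under the usual assumption of normal errors on the log scale. The notation ($\mu$ for the linear predictor, $\sigma^2$ for the residual variance) is illustrative:

$$\log W = \mu + \varepsilon,\quad \varepsilon \sim N(0, \sigma^{2}) \quad\Longrightarrow\quad E[W] = e^{\mu + \sigma^{2}/2} > e^{\mu} = e^{E[\log W]} \quad \text{for } \sigma^{2} > 0$$

So, under that assumption, exponentiating a fitted value from the log-log model estimates the median (geometric mean) of weight, not its arithmetic mean.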
Some advice instead suggests using an untransformed response variable within a generalised linear model with a log link function (see: Linear model with log-transformed response vs. generalized linear model with log link).
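As I understand it, that alternative would look something like the sketch below. The Gamma family is my assumption (the advice only specifies the log link), and the data are simulated with assumed names and parameter values.

```r
# Simulated stand-in data; names and parameter values are illustrative assumptions.
set.seed(3)
n <- 120
d <- data.frame(
  length = runif(n, 10, 40),
  site   = factor(rep(c("A", "B", "C"), length.out = n))
)
d$weight <- 0.01 * d$length^3 * exp(rnorm(n, sd = 0.1))

# Log-transformed response: models the mean of log(weight)
lm_fit <- lm(log(weight) ~ log(length) + site, data = d)

# Log link on the raw response: models the log of the mean of weight
glm_fit <- glm(weight ~ log(length) + site,
               family = Gamma(link = "log"), data = d)

# Side-by-side coefficients from the two approaches
print(cbind(lm = coef(lm_fit), glm = coef(glm_fit)))
```

Length stays log-transformed in both fits, so the slope is still the LWR exponent; only the treatment of weight differs.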
However, as fish LWRs use log-transformed data as standard, it would seem more suitable to test the effects of site, season and year using log-transformed length and weight data.
I therefore wonder whether my analysis above 1) is appropriate and robust (i.e. not wrong), or 2) could be improved?