I am quite new to R and to coding, so please forgive any lack of in-depth information. I am also new to linear models, particularly with large data sets. I have used the gls function in the nlme package to assess water quality data, and I need to understand the output and what to report in an article.
I want to look at the relationship between water flow and various parameters (electrical conductivity, pH, etc.) over a long time period (50 years). The data are stationary, and there are many data points, so autocorrelation is present (I tested for this elsewhere). This is why I am using gls instead of ordinary least squares (it was also suggested to me by a reviewer for a paper). I ran the code to look at flow and electrical conductivity (ec); the dataset is named rivin.
I ran a first model (m1) using the following script:
m1 <- gls(flow ~ ec, data = rivin)
and then a second one, adding an AR(1) correlation structure:
m2 <- update(m1, correlation = corAR1())
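For anyone who wants to reproduce the setup, here is a minimal sketch of the same two-model workflow on simulated data. The data frame `sim`, its values, and the simulation settings are made up for illustration and are not my real rivin data. As far as I understand, corAR1() with its default form (~1) takes the row order as the time order, so the rows are assumed to be sorted by date and equally spaced.

```r
library(nlme)

set.seed(1)
n <- 228  # same number of observations as in my data

# Hypothetical predictor and AR(1) errors (illustrative values only)
ec  <- runif(n, min = 2.4, max = 2.9)
err <- as.numeric(arima.sim(list(ar = 0.3), n = n, sd = 0.35))

# Simulated response: flow falls as ec rises, plus autocorrelated noise
sim <- data.frame(ec = ec, flow = 4 - 1.1 * ec + err)

# Model 1: gls with independent errors
m1 <- gls(flow ~ ec, data = sim)

# Model 2: same fixed effects, AR(1) errors in row order
m2 <- update(m1, correlation = corAR1())

# Likelihood ratio comparison of the two fits
cmp <- anova(m1, m2)
print(cmp)
```

Both models are fitted by REML (the gls default) and have the same fixed effects, so only the correlation structure differs between them.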
I then used anova() to compare the two models, and this is the output:
   Model df      AIC      BIC    logLik   Test  L.Ratio p-value
m1     1  3 183.2906 193.5522 -88.64531
m2     2  4 164.8020 178.4841 -78.40098 1 vs 2 20.48866  <.0001
Does this mean that m2 is significantly different from m1?
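To check my reading of the table, the L.Ratio, p-value and AIC values can be reproduced by hand from the logLik and df columns. This is only a sketch: it assumes the test statistic is compared against a chi-squared distribution with 1 degree of freedom (the difference in df between the models), and the variable names are my own.

```r
# Values copied from the anova() output above
ll_m1 <- -88.64531  # logLik of m1 (3 parameters)
ll_m2 <- -78.40098  # logLik of m2 (4 parameters)

# Likelihood ratio statistic: twice the difference in log-likelihoods
lr <- 2 * (ll_m2 - ll_m1)

# Upper-tail chi-squared probability with 4 - 3 = 1 degree of freedom
p <- pchisq(lr, df = 4 - 3, lower.tail = FALSE)

# AIC = -2 * logLik + 2 * (number of parameters)
aic_m1 <- -2 * ll_m1 + 2 * 3
aic_m2 <- -2 * ll_m2 + 2 * 4

print(c(L.Ratio = lr, p.value = p, AIC.m1 = aic_m1, AIC.m2 = aic_m2))
```

The statistic comes out at 20.48866, matching the table, and the p-value is well below 0.0001.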
I then looked at the summary of m2:
summary(m2)
Generalized least squares fit by REML
  Model: flow ~ ec
  Data: rivin
      AIC      BIC    logLik
  164.802 178.4841 -78.40098

Correlation Structure: AR(1)
 Formula: ~1
 Parameter estimate(s):
      Phi
0.3111562

Coefficients:
                Value Std.Error   t-value p-value
(Intercept)  3.936472 0.5951170  6.614619       0
ec          -1.106382 0.2228789 -4.964047       0

 Correlation:
   (Intr)
ec -0.999

Standardized residuals:
        Min          Q1         Med          Q3         Max
-2.70663729 -0.58400432  0.03536558  0.33392867  5.01270171

Residual standard error: 0.3557397
Degrees of freedom: 228 total; 226 residual
Here I would just like to know how to interpret the results and what to report in an article. Do the Coefficients results indicate that ec decreases as flow increases, and that this is significant? What does the Correlation (Intr) value show me? Does the value of -0.999 indicate collinearity between the two variables and make the model invalid? And what do the Standardized residuals results indicate?
Thank you in advance.