My data is quite heteroscedastic, so I decided to try fitting a GLS model in Python with the statsmodels package.
The data has two continuous feature variables with skewed distributions and a continuous response variable. The data is NOT time series. I did not know how to specify "sigma" in the model, so I just left it as "None".
The model performed well, with an R² value of ~98%! However, I am completely clueless as to how else to evaluate the model, particularly in comparison to my earlier OLS and polynomial regression models.
I tried to compute the Mean Absolute Error (MAE) by running a batch of predictions and averaging how far they were from the actuals. But this approach yielded a pretty large error.
How else can I evaluate the model? Am I running predictions correctly?
I have never worked with GLS or WLS models before, and the math goes a bit over my head. Any tips would be appreciated. Thanks!