Let’s say I want to explain one variable with six regressors, and I don’t know whether the relationship between my dependent variable and each regressor is linear. With six regressors I can’t inspect these relationships graphically, since the best I can do is a 3D plot. However, based on the scientific literature, these relationships are all likely to be linear, so I choose to fit a linear regression (let’s assume the errors are Gaussian). My variable-selection procedure keeps all six regressors. I check the residuals for normality, homoscedasticity, and independence, and everything is fine. Now I plot my fitted values against my observed data.
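To make the setup concrete, here is a minimal sketch of what I mean, using simulated data and plain least squares as a stand-in for my actual model (the data, coefficients, and sample size are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 6

# Hypothetical simulated data: six regressors, a truly linear response.
X = rng.normal(size=(n, p))
beta_true = np.array([1.5, -2.0, 0.5, 0.0, 3.0, -1.0])
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# Ordinary least squares with an intercept.
Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)
fitted = Xd @ beta_hat
resid = y - fitted

# Agreement between fitted and observed values, summarized as R^2;
# this is the quantity behind my fitted-vs-observed plot.
r2 = 1 - resid.var() / y.var()
print(round(r2, 2))
```

In my real analysis the residual checks (normality, homoscedasticity, independence) would be run on `resid` before looking at the fitted-vs-observed plot.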
Possibility one: my model fits my data well, in the sense that there’s a good linear relationship between my fitted values and my observed data. Can I then validate my hypothesis of linearity between my dependent variable and my regressors? Does such a good fit necessarily confirm this hypothesis?
Possibility two: my model fits my data poorly; my fitted values aren’t aligned with my observed data, but instead show a non-linear trend. So I reject my hypothesis of linearity and try to fit a non-linear model (a GAM, …). Am I right to do so?
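To illustrate possibility two, here is a toy example with a single, deliberately quadratic regressor. I use a hand-added x² term only to keep the sketch dependency-free; a GAM spline would play the same role in practice:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Toy data where the true relationship is quadratic, not linear.
x = rng.uniform(-2, 2, size=n)
y = x**2 + rng.normal(scale=0.2, size=n)

def r2(y, fitted):
    return 1 - ((y - fitted) ** 2).mean() / y.var()

# Linear fit: the fitted values systematically miss the curvature,
# so the fitted-vs-observed plot shows a non-linear trend.
Xlin = np.column_stack([np.ones(n), x])
fitted_lin = Xlin @ np.linalg.lstsq(Xlin, y, rcond=None)[0]

# Adding a smooth non-linear term (here x**2; a GAM spline would be
# the flexible version of this) recovers the trend.
Xq = np.column_stack([np.ones(n), x, x**2])
fitted_q = Xq @ np.linalg.lstsq(Xq, y, rcond=None)[0]

print(round(r2(y, fitted_lin), 2), round(r2(y, fitted_q), 2))
```

The linear model's R² is near zero here while the model with the non-linear term fits well, which is the pattern I would take as grounds for rejecting linearity.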
Is this method okay, or are there some parameters I’m not taking into account? If it is okay, are there extra precautions to take for linear mixed models, generalized linear models, or GLMMs?
Methods that I’ve seen for checking linearity often involve smoothing splines, LOESS, etc., but I’m wondering whether such methods can be used with more than two predictors. Or is there a way that I can plot one variable against another while "correcting" for the effects of the others, without fitting a new model?
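One candidate I have in mind for that last question is the component-plus-residual (partial residual) plot: for regressor j, you plot the model residuals plus b_j·x_j against x_j, which shows the relationship between y and x_j with the other regressors’ linear effects removed. A minimal sketch with simulated data (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 6

# Hypothetical simulated data with a truly linear response.
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.5, 1.5, -0.5, 1.0])
y = X @ beta + rng.normal(size=n)

# Fit the full linear model with an intercept.
Xd = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ b

# Partial residual for regressor j: resid + b_j * x_j.
# Plotting `partial` against X[:, j] shows y's relationship with x_j
# after removing the other regressors' linear effects; a curved
# pattern in that plot would flag nonlinearity in x_j.
j = 0
partial = resid + b[j + 1] * X[:, j]
slope = np.polyfit(X[:, j], partial, 1)[0]
print(round(slope, 2))
```

Here the recovered slope is close to the true coefficient (2.0), and since the residuals are orthogonal to x_j, the partial-residual slope equals the fitted coefficient b_j exactly.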