My aim is to estimate the association between chocolate intake (continuous predictor) and blood pressure (continuous outcome) using multiple linear regression. I have to include many covariates to adjust for confounding, and many of them are continuous.
But I don't understand whether I should keep the continuous covariates as continuous or convert them into categorical variables. When I categorize some of the continuous variables, I see that they are not linearly related to the outcome (and linearity is one of the assumptions of linear regression models).
For example, with fiber intake, using the lowest intake group as the reference group, the beta coefficients for the other intake groups don't increase or decrease steadily as intake gets higher across the groups. And for many of my covariates, the p-values are lower and the $R^2$ is higher when the covariate is entered as categorical than when it is entered as continuous. My questions are:
Should I be concerned when a continuous covariate isn't linearly related to the outcome? (Or is this not important because I am only interested in the association between the predictor and the outcome?)
When choosing whether to model a covariate as continuous or categorical, should I pick whichever gives the lowest p-value and the highest $R^2$?
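To make the comparison concrete, here is a minimal sketch of what I am doing, run on simulated data rather than my real data. Everything in it is an assumption made up for illustration: the variable names (`choc`, `fiber`, `bp`), the effect sizes, the U-shaped fiber effect, and the small `ols` helper (written with the standard library only so it runs anywhere). It fits the same model twice, once with fiber as a single linear term and once with fiber cut into quartiles, and prints the chocolate coefficient and $R^2$ for each.

```python
import random

def ols(rows, y):
    """Least-squares fit via the normal equations; returns (coefficients, R^2)."""
    p = len(rows[0])
    # Build the augmented system [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(a[k][c]))
        a[c], a[piv] = a[piv], a[c]
        for k in range(p):
            if k != c:
                f = a[k][c] / a[c][c]
                a[k] = [u - f * v for u, v in zip(a[k], a[c])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Simulated data (all numbers are invented for illustration).
random.seed(1)
n = 1000
fiber = [random.uniform(0, 40) for _ in range(n)]
choc = [10 + 0.5 * f + random.gauss(0, 5) for f in fiber]  # correlated with fiber
bp = [130 - 0.3 * c + 0.05 * (f - 20) ** 2 + random.gauss(0, 5)
      for c, f in zip(choc, fiber)]  # fiber effect is U-shaped, not linear

# Model A: fiber entered as a single linear term.
X_lin = [[1.0, c, f] for c, f in zip(choc, fiber)]
beta_lin, r2_lin = ols(X_lin, bp)

# Model B: fiber cut into quartiles, lowest quartile as the reference group.
cuts = sorted(fiber)
q1, q2, q3 = cuts[n // 4], cuts[n // 2], cuts[3 * n // 4]

def dummies(f):
    return [float(q1 <= f < q2), float(q2 <= f < q3), float(f >= q3)]

X_cat = [[1.0, c] + dummies(f) for c, f in zip(choc, fiber)]
beta_cat, r2_cat = ols(X_cat, bp)

print("linear fiber:   chocolate beta = %.3f, R^2 = %.3f" % (beta_lin[1], r2_lin))
print("quartile fiber: chocolate beta = %.3f, R^2 = %.3f" % (beta_cat[1], r2_cat))
print("quartile betas vs. lowest group:", [round(b, 2) for b in beta_cat[2:]])
```

In this made-up setup the quartile coefficients are not monotone, which is the same pattern I see with fiber in my own data.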
I would be very grateful for answers as I have been struggling to understand this for a long time.