I have a data set consisting of 104 responses from an online survey (the data have been cleaned), and I want to test the relation between 14 independent variables (12 variables about which hypotheses have been generated, plus 2 "control" variables) and 1 dependent variable. The hypotheses were formulated on the basis of the literature.
I chose to do a regression analysis even though the sample size is not optimal. After running the "lm()" function in R, I got an F-statistic with a p-value below 5%, indicating that at least one of the independent variables is related to the dependent variable. What is more, looking at the p-values of the individual independent variables, 2 of them are below 5%.
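A minimal sketch of the workflow described above, using simulated data in place of the actual survey. The names `survey`, `y`, and `x1` to `x14` are hypothetical placeholders, and the simulated effects of `x1` and `x2` are assumptions made purely for illustration, not results from the real data:

```r
# Simulated stand-in for the survey data: 104 rows, 14 predictors, 1 outcome.
# All names (survey, y, x1..x14) are hypothetical.
set.seed(1)
n <- 104
p <- 14
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("x", seq_len(p))
survey <- as.data.frame(X)

# Assumed data-generating process, for illustration only:
# y depends on x1 and x2 and on nothing else.
survey$y <- 0.5 * survey$x1 - 0.4 * survey$x2 + rnorm(n)

# Full model: the outcome regressed on all 14 predictors.
fit <- lm(y ~ ., data = survey)
s <- summary(fit)

# Overall F-test p-value (summary() prints it but does not store it,
# so it is recomputed from the stored F-statistic and degrees of freedom).
f <- s$fstatistic
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Per-coefficient p-values, with the intercept dropped.
coef_p <- s$coefficients[-1, "Pr(>|t|)"]

print(overall_p)
print(names(coef_p)[coef_p < 0.05])
```

With 14 predictors plus an intercept, the model uses 15 parameters, leaving 104 - 15 = 89 residual degrees of freedom.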
What I want to ask is: should I keep all the variables in the model and report the results, or should I re-run the analysis including only the 2 variables with the low p-values? Or is it better to do variable selection using automated methods, e.g. stepwise selection?
My goal is to draw causal conclusions (explanatory regression) about the relation between the independent variables and the dependent variable, so using stepwise methods for variable selection does not seem appropriate in my case, as I am not aiming for prediction.