I'm trying to write an R function that keeps relevant variables based on their absolute t-value (or p-value, whichever is easier to code).
Basically, I want to run one regression (1) and retain all variables that are significant (based on the t-value or p-value), then run a second regression (2) and do the same. Finally, I want to take all variables retained from regressions (1) and (2) and combine them in a final regression (3).
Currently I am struggling with the step of retaining variables based on their t-value. I can extract the t-values, but I don't know how to code that the corresponding variables should be "retained" and kept for the later regression.
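To make the goal concrete, here is a rough sketch of the three-step workflow on simulated data. Everything in it is an assumption made for illustration: the data frame `dat`, the variable split (x1-x3 in regression 1, x4-x6 in regression 2), the helper name `keep_significant`, and the cutoff of 1.96. It uses `reformulate()` from base R to build the formula from variable names.

```r
# Simulated stand-in data (hypothetical): only x1 and x5 truly affect y
set.seed(1)
n <- 200
dat <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))
names(dat) <- paste0("x", 1:6)
dat$y <- 2 * dat$x1 - 1.5 * dat$x5 + rnorm(n)

# Hypothetical helper: regress y on `vars` and return the names of the
# predictors whose absolute t-value exceeds `cutoff`
keep_significant <- function(vars, data, cutoff = 1.96) {
  fit <- lm(reformulate(vars, response = "y"), data = data)
  tvals <- coef(summary(fit))[, "t value"]
  tvals <- tvals[names(tvals) != "(Intercept)"]  # the intercept is not a variable
  names(tvals)[abs(tvals) > cutoff]
}

kept1 <- keep_significant(c("x1", "x2", "x3"), dat)  # regression (1)
kept2 <- keep_significant(c("x4", "x5", "x6"), dat)  # regression (2)

# Regression (3): all variables retained from (1) and (2)
final_vars <- union(kept1, kept2)
final_fit <- lm(reformulate(final_vars, response = "y"), data = dat)
```

Working with variable names rather than copies of the columns means the final regression can be rebuilt from `final_vars` alone. Note that `reformulate()` would fail if no variable passed the cutoff, so a real function would need to handle that case.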
What I have currently (example edited for better understanding):
summaryI1 <- summary(lm(y~x1+x2+x3+x4)) #x1-x4 are the vars included in regression 1
(coefI1 <- coef(summaryI1))
tI1 <- coefI1[, "t value"] #extracts the t-values for all the x's
aboveI1 <- which(abs(tI1) > cutoff) #positions of vars whose |t| exceeds cutoff (a number I set beforehand)
Suppose x1 and x2 are significant and I would like to keep them "aside" for the next regression. How would I tell R to create a new matrix containing only x1 and x2, given that their absolute t-values are higher than my cutoff?
I'm not that experienced in R programming yet, but I'm learning. Can anyone help?
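For illustration, this is roughly the result I am after, sketched on simulated data. The matrix `X`, the data frame `dat1`, the names `sig1` and `X_kept1`, and the value 1.96 for the cutoff are all made up for this example. It assumes the predictors are stored as named columns, so the coefficient names match the column names.

```r
# Simulated stand-in data (hypothetical): x1 and x2 are built to be significant
set.seed(42)
n <- 100
X <- matrix(rnorm(n * 4), ncol = 4,
            dimnames = list(NULL, paste0("x", 1:4)))
y <- 3 * X[, "x1"] + 2 * X[, "x2"] + rnorm(n)
dat1 <- data.frame(y = y, X)

# Regression 1 and its t-values
tI1 <- coef(summary(lm(y ~ ., data = dat1)))[, "t value"]

# Names of the predictors whose absolute t-value exceeds the cutoff
t_cutoff <- 1.96  # assumed threshold
sig1 <- setdiff(names(tI1)[abs(tI1) > t_cutoff], "(Intercept)")

# New matrix holding only the retained columns; drop = FALSE keeps it a
# matrix even when a single variable survives
X_kept1 <- X[, sig1, drop = FALSE]
```

The same selection could be made on p-values by taking the `"Pr(>|t|)"` column of the coefficient table and keeping the names where it falls below a chosen level.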