I'd really appreciate help using Stata to perform a manual stepwise forward logistic regression.
I have 37 biologically plausible, statistically significant categorical variables linked to disease outcome. I need to end up with a final multivariable model.
I started with the first variable (the most significant and most plausible). The general form of the command and the resulting odds ratio output are:
xi:logistic outcome i.variable1
. xi:logistic casecontrol i.breed_groupall
i.breed_group~l   _Ibreed_gro_0-7     (naturally coded; _Ibreed_gro_0 omitted)
Logistic regression                               Number of obs   =        995
                                                  LR chi2(6)      =      83.87
                                                  Prob > chi2     =     0.0000
Log likelihood = -422.36813                       Pseudo R2       =     0.0903
------------------------------------------------------------------------------
 casecontrol | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ibreed_gr~2 |   1.757143   .6861797     1.44   0.149      .817345    3.777537
_Ibreed_gr~3 |   1.952381   .8811439     1.48   0.138     .8061249    4.728537
_Ibreed_gr~4 |   1.464286    .530121     1.05   0.292     .7202148    2.977074
_Ibreed_gr~5 |   6.192708   1.779109     6.35   0.000     3.526453    10.87485
_Ibreed_gr~6 |   3.880357   1.103611     4.77   0.000     2.222193    6.775816
_Ibreed_gr~7 |    .636646   .2555236    -1.13   0.261     .2899083    1.398091
------------------------------------------------------------------------------
What should I look for to decide whether both variables should stay in the model once I add my second choice? For example, when I type:
xi:logistic outcome i.variable1 i.variable2
. xi:logistic casecontrol i.breed_groupall i.height_category
i.breed_group~l   _Ibreed_gro_0-7     (naturally coded; _Ibreed_gro_0 omitted)
i.height_cate~y   _Iheight_ca_0-4     (naturally coded; _Iheight_ca_0 omitted)
Logistic regression                               Number of obs   =        992
                                                  LR chi2(10)     =     132.25
                                                  Prob > chi2     =     0.0000
Log likelihood = -396.05629                       Pseudo R2       =     0.1431
------------------------------------------------------------------------------
 casecontrol | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ibreed_gr~2 |   1.002262   .4145417     0.01   0.996     .4455736    2.254465
_Ibreed_gr~3 |   1.185087    .557553     0.36   0.718     .4712828    2.980017
_Ibreed_gr~4 |   1.774162   .6616502     1.54   0.124     .8541793    3.685001
_Ibreed_gr~5 |   1.452416    .541494     1.00   0.317     .6994292     3.01605
_Ibreed_gr~6 |   1.256098   .4377793     0.65   0.513     .6343958    2.487064
_Ibreed_gr~7 |   .4238999   .1777699    -2.05   0.041     .1863361    .9643388
_Iheight_c~1 |   2.780857    .918967     3.09   0.002     1.455088    5.314572
_Iheight_c~2 |   5.402833   2.246817     4.06   0.000     2.391342    12.20679
_Iheight_c~3 |   13.50715   5.989787     5.87   0.000     5.663642    32.21303
_Iheight_c~4 |   16.85605   8.674745     5.49   0.000     6.147467    46.21846
------------------------------------------------------------------------------
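A categorical variable enters the model as several indicator terms, so one common approach is to test those terms jointly rather than reading the individual rows. The following is only a minimal sketch of that idea, not something taken from the output above. It assumes the xi-generated names shown there (_Iheight_ca_*); the names full, reduced and insample are hypothetical labels chosen for illustration. The smaller model is refit on the observations used by the larger one because the two fits above use 995 and 992 observations, and a likelihood-ratio test needs both models estimated on the same sample.

```stata
* Larger model: breed group plus height category
xi: logistic casecontrol i.breed_groupall i.height_category
estimates store full                 // "full" is a hypothetical label

* Remember which observations the larger model used
generate byte insample = e(sample)   // "insample" is a hypothetical name

* Joint Wald test of all height-category indicators
testparm _Iheight_ca_*

* Smaller model refit on exactly the same observations
xi: logistic casecontrol i.breed_groupall if insample
estimates store reduced              // "reduced" is a hypothetical label

* Likelihood-ratio test: does height category improve the fit?
lrtest full reduced
```

A small p-value from testparm or lrtest would suggest that height category as a whole adds information beyond breed group. Swapping the roles of the two variables gives the corresponding test for breed group.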
How do I tell whether to keep one or both of these variables, or whether one or both of them add nothing to the model?
If, for example, I keep both and then add a third variable, how do I know which of them to keep?
. xi:logistic casecontrol i.height_category i.breed_groupall i.combinedweight
i.height_cate~y   _Iheight_ca_0-4     (naturally coded; _Iheight_ca_0 omitted)
i.breed_group~l   _Ibreed_gro_0-7     (naturally coded; _Ibreed_gro_0 omitted)
i.combinedwei~t   _Icombinedw_0-5     (naturally coded; _Icombinedw_0 omitted)
Logistic regression                               Number of obs   =        891
                                                  LR chi2(14)     =     123.58
                                                  Prob > chi2     =     0.0000
Log likelihood = -346.82026                       Pseudo R2       =     0.1512
------------------------------------------------------------------------------
 casecontrol | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iheight_c~1 |   3.397418   1.235922     3.36   0.001     1.665316    6.931087
_Iheight_c~2 |   6.321891   3.119652     3.74   0.000     2.403289    16.62983
_Iheight_c~3 |   14.36312   7.797083     4.91   0.000     4.956448    41.62242
_Iheight_c~4 |   23.39157   15.34571     4.81   0.000      6.46607    84.62101
_Ibreed_gr~2 |   .7339708   .3482777    -0.65   0.515     .2895834    1.860303
_Ibreed_gr~3 |   1.060443   .5318773     0.12   0.907      .396787    2.834113
_Ibreed_gr~4 |   1.644423   .6502556     1.26   0.208      .757569    3.569479
_Ibreed_gr~5 |   1.246412   .4940969     0.56   0.578     .5731024     2.71076
_Ibreed_gr~6 |   1.262449   .4661338     0.63   0.528     .6122442    2.603172
_Ibreed_gr~7 |    .401331   .1782529    -2.06   0.040     .1680497    .9584456
_Icombined~1 |    1.21903   .7160999     0.34   0.736     .3854692    3.855132
_Icombined~2 |   1.238685   .3967314     0.67   0.504     .6612025    2.320531
_Icombined~4 |   1.764532   .5970935     1.68   0.093     .9090641    3.425031
_Icombined~5 |   2.107871   1.118772     1.40   0.160     .7448366    5.965227
------------------------------------------------------------------------------
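The three outputs above are fit on 995, 992 and 891 observations, so their log likelihoods and LR chi2 values cannot be compared directly (the LR chi2 even falls from 132.25 to 123.58 when the third variable is added). A rough sketch of one way to put candidate models on a common footing is below. It assumes the variables are numeric, and the names touse, base and cand1 are hypothetical, chosen only for illustration.

```stata
* Flag observations with no missing values on any variable under consideration
mark touse                           // "touse" is a hypothetical name
markout touse casecontrol breed_groupall height_category combinedweight

* Current two-variable model, restricted to the common sample
xi: logistic casecontrol i.breed_groupall i.height_category if touse
estimates store base                 // "base" is a hypothetical label

* Candidate third variable added, same sample
xi: logistic casecontrol i.breed_groupall i.height_category i.combinedweight if touse
estimates store cand1                // "cand1" is a hypothetical label

* Does the candidate variable improve on the two-variable model?
lrtest cand1 base
```

The same pattern could be repeated for each remaining candidate, at the cost of dropping every observation that is missing on any variable listed in markout.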
Any help would be gratefully received.
(I believe I am using Stata 9.1.)