Yes, you need to partial out the exogenous variables using the partial option in ranktest. So the correct syntax should be:
ranktest (endog_var)(Z1 Z2), partial(exogen_var) full robust
This is also done in the documentation for ranktest (at the bottom of the help file). You can check this by comparing your results to the Kleibergen-Paap rk statistic reported by ivreg2 with robust standard errors.
As an example:
// use a toy data set
sysuse auto
// run the iv regression with two instruments using ivreg2
ivreg2 price weight (mpg = foreign trunk ), first robust
/* this is the output from the first stage diagnostics for underidentification
Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic Chi-sq(2)=1.90 P-val=0.3863
*/
// Test 1
// compare the Kleibergen-Paap rk LM test from ivreg2 with the manual test (partial out exogenous variables)
ranktest (mpg) (foreign trunk), partial(weight) full robust
/* output
Kleibergen-Paap rk LM test of rank of matrix
Test statistic robust to heteroskedasticity
Test of rank= 0 rk= 1.90 Chi-sq( 2) pvalue=0.386287
*/
// Test 2
// compare the Kleibergen-Paap rk LM test from ivreg2 with the manual test (not partialling out exogenous variables)
ranktest (mpg) (foreign trunk weight), full robust
/* output
Kleibergen-Paap rk LM test of rank of matrix
Test statistic robust to heteroskedasticity
Test of rank= 0 rk= 30.70 Chi-sq( 3) pvalue=0.000001
*/
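You can see that test 1 reproduces the rk statistic and p-value from the ivreg2 output (1.90, p = 0.386), whilst test 2 does not. In test 2, weight is treated as an additional excluded instrument instead of being partialled out, so both the statistic (30.70) and the degrees of freedom (3 instead of 2) differ.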
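The same comparison can be made from the stored results rather than by reading the output. This is only a minimal sketch: it assumes that ivreg2 leaves the underidentification statistic in e(idstat) and that ranktest leaves its statistic in r(chi2), which is how I recall the help files describing them. Run ereturn list after ivreg2 and return list after ranktest to confirm the names in your installed versions. The scalar names kp_ivreg2, kp_ranktest and kp_diff are made up for the illustration.

```stata
// load the toy data set (clear drops anything already in memory)
sysuse auto, clear

// IV regression; assumed: e(idstat) holds the Kleibergen-Paap rk LM statistic
ivreg2 price weight (mpg = foreign trunk), first robust
scalar kp_ivreg2 = e(idstat)

// manual test with the exogenous regressor partialled out;
// assumed: r(chi2) holds the reported test statistic
ranktest (mpg) (foreign trunk), partial(weight) full robust
scalar kp_ranktest = r(chi2)

// show both statistics and their difference
scalar kp_diff = kp_ivreg2 - kp_ranktest
display "ivreg2 statistic:   " kp_ivreg2
display "ranktest statistic: " kp_ranktest
display "difference:         " kp_diff
```

If the two commands are testing the same thing, the difference should be zero up to rounding.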
And yes, the matrix may not be of full rank if your instruments are not strong enough. ivreg2 also provides the F-test for the excluded instruments (see the Angrist and Pischke F-statistic). For more information see section 7 of Baum et al. (2007), "Enhanced routines for instrumental variables / generalized method of moments estimation and testing". Using ivreg2 is generally a better strategy than doing 2SLS "by hand": the routine reports a whole range of useful test statistics, and it gives you the correct standard errors, which the manual two-step approach does not.