To begin with, you have to define the concept of equivalence. One may regard two models as equivalent when they produce almost the same forecasting accuracy (relevant for time series and panel data); another may care whether the fitted values from the two models are close. The former is assessed by cross-validation (usually jack-knife or out-of-sample tests; Rob Hyndman's accuracy() does this nicely), the latter by the minimization of some information criterion.
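For the forecasting-accuracy route, here is a minimal sketch assuming the forecast package and the built-in AirPassengers series (the two candidate models and the train/test split are purely illustrative):

library(forecast)

# Hold out the last two years as a test set
train <- window(AirPassengers, end = c(1958, 12))
test  <- window(AirPassengers, start = c(1959, 1))

# Two candidate models fitted to the same training data
fit_ets   <- ets(train)
fit_arima <- auto.arima(train)

# Compare out-of-sample accuracy (RMSE, MAE, MAPE, ...)
accuracy(forecast(fit_ets,   h = length(test)), test)
accuracy(forecast(fit_arima, h = length(test)), test)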
In microeconometrics the usual choice is $BIC$, though you may also consider $AIC$ when working with small sample sizes. Note that model choice based on minimizing an information criterion is also valid for nested models.
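In R both criteria are available for any fitted model that supplies a log-likelihood; a sketch using the built-in mtcars data (the two non-nested specifications are just for illustration):

fit1 <- lm(mpg ~ wt + hp,   data = mtcars)
fit2 <- lm(mpg ~ wt + qsec, data = mtcars)

AIC(fit1, fit2)  # smaller is better
BIC(fit1, fit2)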
A nice discussion is given in the must-have book by Cameron and Trivedi (Chapter 8.5 provides an excellent review of the methods); more specific theoretical details can be found in Hong and Preston.
Roughly speaking, of two models the more parsimonious one (fewer parameters to estimate, hence more degrees of freedom) will be suggested as preferable. An information criterion introduces a penalty that restricts the inclusion of additional explanatory variables into a linear model, conceptually similar to the penalty built into adjusted $R^2$.
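To make the penalties explicit: for a model with $k$ estimated parameters, maximized likelihood $L$, and $n$ observations, $AIC = -2\ln L + 2k$ and $BIC = -2\ln L + k\ln n$, so $BIC$ penalizes extra parameters more heavily whenever $\ln n > 2$, i.e. for $n \ge 8$.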
However, you may not be interested merely in choosing the model that minimizes the selected information criterion. The equivalence concept implies that a test statistic should be formulated, so you may go for likelihood ratio tests: either the Cox or the Vuong $LR$ test, or the Davidson-MacKinnon $J$ test.
Finally, judging by the tags, you may just be interested in R functions:
library(lmtest)
coxtest(fit1, fit2)
jtest(fit1, fit2)
where fit1 and fit2 are two non-nested fitted linear regression models, coxtest performs the Cox $LR$ test, and jtest the Davidson-MacKinnon $J$ test.
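The Vuong test is not in lmtest; a minimal sketch assuming the pscl package, whose vuong() function compares two non-nested models fitted to the same outcome:

library(pscl)
vuong(fit1, fit2)  # positive statistic favours fit1, negative favours fit2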