What I would do here is compare the two models' predictions to each other and to the actual values, graphically. That is, first fit both models and save the predicted values from each. Then create (a) scatter plots of (1) the probit predictions against the logistic predictions and (2) each model's predictions against the actual values, and (b) a density plot (or maybe a box plot) of each model's errors, to find any outliers.
Then make a decision.
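As a rough illustration of the steps above, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption made for illustration: the simulated data, the single predictor, the hypothetical helper `fit_binary_model`, and the plain gradient-ascent fit. In practice you would fit both models with your usual statistics software. The sketch only produces the predicted probabilities and errors that the plots would be drawn from.

```python
import math
import random

def logistic_cdf(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def fit_binary_model(x, y, cdf, pdf, steps=2000, lr=0.5):
    """Hypothetical helper: fit P(y=1|x) = cdf(b0 + b1*x) by gradient
    ascent on the mean log-likelihood (a simple stand-in for a real fitter)."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            z = b0 + b1 * xi
            # Clip to avoid division by zero in the score.
            p = min(max(cdf(z), 1e-12), 1.0 - 1e-12)
            w = (yi - p) * pdf(z) / (p * (1.0 - p))
            g0 += w
            g1 += w * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Simulated data (an assumption; substitute your own x and y).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [1 if random.random() < logistic_cdf(-0.3 + 1.2 * xi) else 0 for xi in x]

# Fit both models and save the predicted probabilities from each.
logit_b0, logit_b1 = fit_binary_model(x, y, logistic_cdf, logistic_pdf)
probit_b0, probit_b1 = fit_binary_model(x, y, normal_cdf, normal_pdf)
p_logit = [logistic_cdf(logit_b0 + logit_b1 * xi) for xi in x]
p_probit = [normal_cdf(probit_b0 + probit_b1 * xi) for xi in x]

# Errors of each model (actual minus predicted), for a density or box plot.
err_logit = [yi - pi for yi, pi in zip(y, p_logit)]
err_probit = [yi - pi for yi, pi in zip(y, p_probit)]

# Largest disagreement between the two sets of predictions.
max_gap = max(abs(a - b) for a, b in zip(p_logit, p_probit))
print("largest |logit - probit| prediction gap:", round(max_gap, 4))
```

The lists `p_logit` and `p_probit` give the model-versus-model scatter plot, each of them against `y` gives the model-versus-actual plots, and `err_logit` and `err_probit` feed the density or box plots, using whatever plotting tool you prefer.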
In my experience, though, the two models often make similar predictions. Different substantive fields nonetheless have a tradition of using one or the other (e.g., psychology uses the logistic much more than the probit; I think the situation is the reverse in economics, but I am less versed in that literature).