At my firm I am developing a model for a binary outcome using a probit specification. When I benchmarked it against a logit specification, the logit slightly improved the goodness-of-fit.
A colleague argued that this is purely luck, because the probit and the logit are very similar. He also said that I could apply a monotonic transformation to my data and get better results with the probit.
What's the intuition behind this argument?
Details: I regress a binary variable taking the value 1 if an individual is in financial distress and 0 otherwise. The data cover a period of 20 years for 1000 individuals. The probit model has the form $P(Y=1|X)=\Phi(X\beta)$, while the logit model is given by $P(Y=1|X)=\frac{1}{1+e^{-X\beta}}$. The explanatory variables are macro variables such as GDP, unemployment, etc. For each year, I then computed the average of the actual values and the average of the predicted values across all individuals. From these yearly averages I computed an $R^2$, defined as the squared correlation between actual and predicted values. When I repeated the process with the logit, I obtained a slightly higher $R^2$.
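To make the setup concrete, here is a minimal sketch of the comparison in plain Python. It is not my production code: the data are simulated, the yearly index `x_by_year` stands in for a fitted $X\beta$ (no coefficients are actually estimated), and the names `probit_prob`, `logit_prob`, `yearly_r2` and the rescaling constant `SCALE = 1.702` are illustrative assumptions. It shows how close the two link functions are once the logit index is rescaled, and how the yearly-average $R^2$ described above can be computed.

```python
import math
import random


def probit_prob(z):
    """Probit link: standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_prob(z):
    """Logit link: logistic CDF, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Assumption: multiplying the index by roughly 1.702 makes the logistic
# curve track the normal CDF closely over the whole real line.
SCALE = 1.702
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(probit_prob(z) - logit_prob(SCALE * z)) for z in grid)


def squared_corr(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov * cov / (var_a * var_b)


def yearly_r2(actual_by_year, predicted_by_year):
    """R^2 as described in the question: average actual and predicted
    values across individuals within each year, then square the
    correlation between the two series of yearly averages."""
    actual_means = [sum(year) / len(year) for year in actual_by_year]
    predicted_means = [sum(year) / len(year) for year in predicted_by_year]
    return squared_corr(actual_means, predicted_means)


# Simulated (hypothetical) panel: 20 years, 1000 individuals.
# The macro index varies by year only, so every individual shares the
# same predicted probability within a year.
random.seed(0)
N_YEARS, N_INDIVIDUALS = 20, 1000
x_by_year = [random.gauss(-1.0, 0.5) for _ in range(N_YEARS)]

# Outcomes are drawn from a probit data-generating process (an assumption).
actual = [
    [1 if random.random() < probit_prob(x) else 0 for _ in range(N_INDIVIDUALS)]
    for x in x_by_year
]

# "Predictions" from each link, using the true index instead of an estimate.
pred_probit = [[probit_prob(x)] * N_INDIVIDUALS for x in x_by_year]
pred_logit = [[logit_prob(SCALE * x)] * N_INDIVIDUALS for x in x_by_year]

r2_probit = yearly_r2(actual, pred_probit)
r2_logit = yearly_r2(actual, pred_logit)

print(f"max |Phi(z) - logistic({SCALE} z)| on grid: {max_gap:.4f}")
print(f"yearly-average R^2, probit link: {r2_probit:.4f}")
print(f"yearly-average R^2, logit link:  {r2_logit:.4f}")
```

Changing the seed gives a sense of how much the two $R^2$ values move from sample to sample relative to the gap between them.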