I often read that if we believe the functional form of the dependent variable is a cumulative normal distribution, we may use probit, and if we believe the dependent variable follows a logistic response, we should use logistic regression. How can I deduce this in practice?
-
"Deduce" is a term with a more mathematical flavor, whereas choosing logit or probit as a link function is a choice of model form. You cannot deduce it; at best you infer it from data. Usually, the logit model and the probit model don't differ much. – Zhanxiong May 18 '20 at 21:21
-
They gave me different results, that's why I need to know which one I should consider. – Khouloud Bennour May 18 '20 at 22:51
-
Of course they can't be exactly the same. To determine which form you should use, you could check the goodness of fit metrics for them respectively. Or if your interest is of prediction, split training-test data sets (or do cross-validation) and compare their prediction accuracies. The point is, you have to rely on your data to determine which one to use. – Zhanxiong May 18 '20 at 23:02
-
@KhouloudBennour Are the index function coefficients different or are the marginal effects different? The former is expected, but the latter is not typical. You can use the rule of thumb that the logit/probit coefficient ratio is roughly 1.6-1.8. – dimitriy May 18 '20 at 23:08
-
@Zhanxiong That's what I am planning to do. But I thought there was something I need to check (like the distribution of the response variable) before even choosing one of the models. – Khouloud Bennour May 18 '20 at 23:20
-
@DimitriyV.Masterov I didn't compute their margins. Are the margins of both models equal? – Khouloud Bennour May 18 '20 at 23:21
-
In other words, how can I determine the functional form of my dependent variable (whether it is a cumulative normal distribution or a logistic response)? Please, I need it urgently. – Khouloud Bennour May 18 '20 at 23:58
-
They are very close, so if they gave you different results, and that really means *meaningfully* different, i.e. *different conclusions*, you should include the details in the post. That would be interesting. Otherwise, this is a duplicate of: https://stats.stackexchange.com/questions/20523/difference-between-logit-and-probit-models?noredirect=1&lq=1 – kjetil b halvorsen May 19 '20 at 03:39
1 Answer
If I'm understanding this correctly, you're essentially asking how to assess whether a given statistical model fits a set of observations, and specifically how to choose between the probit and logit forms. Unless you know the assumptions behind how the data were generated, you cannot "deduce" the distribution; the best you can do is apply goodness-of-fit tests to see how closely each model matches the observations.
These are two distinct distributions with different assumptions embedded in their derivations, and they are not the only possible choices for modeling a binary response. They tend to be tried together because both have convenient properties as link functions for GLMs with a binary response. Unless you know that the data were generated from the distributions you describe, i.e. a cumulative normal for probit or a logistic response for logit, you cannot "deduce" this in practice in any better way than with these goodness-of-fit tests and comparisons.
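To make the comparison concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are simulated (one predictor, a logistic truth with intercept -0.5 and slope 1.0), and the helper names such as `fit_binary_glm` are made up for this example. It fits both models by Fisher scoring and then compares log-likelihood, AIC, the slope ratio, and the average marginal effect.

```python
import math
import random

def logistic_cdf(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def clip(p, eps=1e-10):
    # Keep probabilities away from 0 and 1 so logs and divisions are safe.
    return min(max(p, eps), 1.0 - eps)

def log_likelihood(beta, xs, ys, cdf):
    a, b = beta
    total = 0.0
    for x, y in zip(xs, ys):
        p = clip(cdf(a + b * x))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_binary_glm(xs, ys, cdf, pdf, iters=50):
    # Fisher scoring for P(y=1|x) = cdf(a + b*x) with two parameters.
    a, b = 0.0, 0.0
    for _ in range(iters):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y in zip(xs, ys):
            eta = a + b * x
            p = clip(cdf(eta))
            f = pdf(eta)
            w = f / (p * (1.0 - p))
            s0 += (y - p) * w          # score, intercept
            s1 += (y - p) * w * x      # score, slope
            i00 += f * w               # Fisher information entries
            i01 += f * w * x
            i11 += f * w * x * x
        det = i00 * i11 - i01 * i01
        da = (i11 * s0 - i01 * s1) / det
        db = (i00 * s1 - i01 * s0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b

def average_marginal_effect(beta, xs, pdf):
    # Mean over the sample of dP(y=1|x)/dx = pdf(a + b*x) * b.
    a, b = beta
    return sum(pdf(a + b * x) * b for x in xs) / len(xs)

# Simulated data (assumption: the true model is logistic).
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [1 if random.random() < logistic_cdf(-0.5 + 1.0 * x) else 0 for x in xs]

logit_fit = fit_binary_glm(xs, ys, logistic_cdf, logistic_pdf)
probit_fit = fit_binary_glm(xs, ys, normal_cdf, normal_pdf)

ll_logit = log_likelihood(logit_fit, xs, ys, logistic_cdf)
ll_probit = log_likelihood(probit_fit, xs, ys, normal_cdf)
aic_logit = 2 * 2 - 2 * ll_logit    # AIC = 2k - 2*logL with k = 2 parameters
aic_probit = 2 * 2 - 2 * ll_probit

coef_ratio = logit_fit[1] / probit_fit[1]
ame_logit = average_marginal_effect(logit_fit, xs, logistic_pdf)
ame_probit = average_marginal_effect(probit_fit, xs, normal_pdf)

print("log-likelihood  logit:", ll_logit, " probit:", ll_probit)
print("AIC             logit:", aic_logit, " probit:", aic_probit)
print("slope ratio logit/probit:", coef_ratio)
print("avg. marginal effect  logit:", ame_logit, " probit:", ame_probit)
```

On data like this the two fits are usually hard to tell apart: the raw coefficients differ by a scale factor, while the log-likelihoods and marginal effects are close. With real data you would more likely use a library (for example `Logit` and `Probit` in statsmodels, which expose `llf`, `aic` and `get_margeff()`) and add a held-out or cross-validated comparison.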
