@Tom's response is excellent, but I'd like to offer a version that's more heuristic and that introduces an additional concept.
Logistic regression
Imagine we have a number of binary questions. If we want to model the probability of responding yes to any one of those questions as a function of some independent variables, we use logistic regression:
$P(y_i = 1) = \frac{1}{1 + \exp(-X\beta)} = \operatorname{logit}^{-1}(X\beta)$
where $i$ indexes the questions (i.e. the items), $X$ is a vector of respondent characteristics, and $\beta$ is the effect of each of those characteristics in log-odds terms.
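In R, for example, this is a single call to `glm()`. A minimal sketch, assuming a hypothetical data frame `survey` with a 0/1 outcome `yes` and respondent characteristics `age` and `education` (all names here are made up for illustration):

```r
## logistic regression of a binary response on respondent characteristics
fit <- glm(yes ~ age + education, data = survey, family = binomial)
summary(fit)     # each coefficient is a beta, in log-odds terms
exp(coef(fit))   # exponentiate to interpret the betas as odds ratios
```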
IRT
Now, note that I said we had a number of binary questions. Those questions might all get at some kind of latent trait, e.g. verbal ability, level of depression, level of extraversion. Often, we are interested in the level of the latent trait itself.
For example, in the Graduate Record Exam, we're interested in characterizing the verbal and math ability of various applicants, so we want a good measure of their score. We could obviously count how many questions someone got correct, but that treats all questions as worth the same amount; it doesn't account for the fact that questions vary in difficulty. The solution is item response theory. Again, we're (for now) not interested in either $X$ or $\beta$; we're just interested in the person's verbal ability, which we'll call $\theta$. We use each person's pattern of responses to all the questions to estimate $\theta$:
$P(y_{ij} = 1) = \operatorname{logit}^{-1}[a_i(\theta_j - b_i)]$
where $a_i$ is the discrimination of item $i$, $b_i$ is its difficulty, and $\theta_j$ is person $j$'s level of the latent trait.
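To make this concrete, here is a minimal sketch that simulates responses from the 2PL model above and recovers the parameters with the mirt package; the sample sizes and object names are made up for illustration:

```r
library(mirt)

set.seed(42)
n_persons <- 500
n_items   <- 12
a     <- runif(n_items, 0.8, 2.0)   # item discriminations a_i
b     <- rnorm(n_items)             # item difficulties b_i
theta <- rnorm(n_persons)           # each person's latent trait theta_j

## P(y_ij = 1) = logit^-1[a_i * (theta_j - b_i)]
eta  <- sweep(outer(theta, b, "-"), 2, a, "*")
resp <- matrix(rbinom(n_persons * n_items, 1, plogis(eta)),
               n_persons, n_items)

## fit a unidimensional 2PL model and recover the parameters
fit <- mirt(as.data.frame(resp), model = 1, itemtype = "2PL")
coef(fit, IRTpars = TRUE, simplify = TRUE)  # estimated a_i and b_i
theta_hat <- fscores(fit)                   # EAP estimates of theta_j
```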
So, that's one obvious distinction between regular logistic regression and IRT. In the former, we're interested in the effects of independent variables on one binary dependent variable. In the latter, we use a bunch of binary (or categorical) variables to predict some latent trait. The original post said that $\theta$ is our independent variable. I'd respectfully disagree: I think it's more accurate to say that $\theta$ is the dependent variable in IRT.
I used binary items and logistic regression for simplicity, but the approach generalizes to ordered items and ordered logistic regression.
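As a sketch of those ordered analogues (the objects `dat`, `outcome`, and `likert_items` below are hypothetical; `outcome` would need to be an ordered factor):

```r
library(MASS)   # for polr()
library(mirt)

## ordered logistic regression: an ordered-factor outcome on covariates
fit_ord <- polr(outcome ~ age + education, data = dat)

## graded response model: the IRT analogue for ordered (e.g., Likert) items
fit_grm <- mirt(likert_items, model = 1, itemtype = "graded")
```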
Explanatory IRT
What if you were interested in the things that predict the latent trait, though, i.e. the $X$s and $\beta$s previously mentioned?
As mentioned earlier, one way to estimate the latent trait is to just count the number of correct answers, or to add up the values of your Likert (i.e. categorical) items. That has its flaws: you're assuming that each item (or each level of each item) is worth the same amount of the latent trait. Nonetheless, this approach is common enough in many fields.
Perhaps you can see where I'm going with this: you can use IRT to estimate each person's level of the latent trait, then use those estimates as the outcome in a regular linear regression. That would ignore the uncertainty in each person's estimated latent trait, though.
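Continuing the simulated `fit` from the sketch above, the two-step approach might look like this (`x1` and `x2` are hypothetical person-level covariates invented for illustration):

```r
## step 1: IRT point estimates; step 2: ordinary linear regression
persondata <- data.frame(x1 = rnorm(n_persons), x2 = rnorm(n_persons))
theta_hat  <- fscores(fit)[, 1]   # point estimates of theta_j
twostep    <- lm(theta_hat ~ x1 + x2, data = persondata)
summary(twostep)  # note: this ignores the uncertainty in theta_hat
```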
A more principled approach would be to use explanatory IRT: you simultaneously estimate $\theta$ using an IRT model and estimate the effect of your $X$s on $\theta$, as if you were running a linear regression. You can even extend this approach to include random effects to represent, for example, the fact that students are nested in schools.
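As a hedged sketch of what this looks like in mirt, using its latent regression interface (`covdata` and `formula` arguments), and continuing the simulated objects from above:

```r
## explanatory IRT: theta is estimated and regressed on the person-level
## covariates in one model, rather than in two separate steps
exp_fit <- mirt(as.data.frame(resp), model = 1, itemtype = "2PL",
                covdata = persondata, formula = ~ x1 + x2)
coef(exp_fit, simplify = TRUE)  # includes the latent regression betas
```

For random effects (e.g., students nested in schools), mirt's `mixedmirt()` accepts lme4-style terms such as `random = ~ 1|school`; see its documentation for the exact interface.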
More reading is available in Phil Chalmers' excellent intro to his mirt package. If you already understand the nuts and bolts of IRT, I'd go straight to the Mixed Effects IRT section of those slides. Stata can also fit explanatory IRT models (although I believe it can't fit the random-effects explanatory IRT models described above).