I have always thought of a regression model as describing Y given X — that is, Y occurs after X is observed.
- Linear regression
Like this:
Example 1.
price of egg = b0 + b1*(chicken's age) + b2*(chicken's biological status) + b3*(time since the egg was laid)
- Logistic Regression
But I'm confused about logistic regression (logit link). It is based on the odds ratio, which has a symmetry property: OR(Y given X) = OR(X given Y) (https://en.wikipedia.org/wiki/Odds_ratio#Symmetry)
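This symmetry can be checked directly on any 2×2 table; here is a quick sketch in Python (the counts a, b, c, d are arbitrary, chosen only for illustration):

```python
# Odds-ratio symmetry on a 2x2 table of counts:
#            X=0   X=1
#   Y=0       a     b
#   Y=1       c     d
a, b, c, d = 5, 7, 11, 13  # arbitrary counts

# Odds of Y=1 within each X group, then their ratio (exposure -> outcome)
or_y_given_x = (d / b) / (c / a)
# Odds of X=1 within each Y group, then their ratio (outcome -> exposure)
or_x_given_y = (d / c) / (b / a)

# Both reduce to the same cross-product ratio a*d / (b*c)
print(or_y_given_x, or_x_given_y)
```

Algebraically, both directions collapse to ad/(bc), which is why swapping outcome and exposure leaves the odds ratio unchanged.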
So, extending this to a multiple logistic equation, I think it implies: "Y does not always occur after X; X can occur after Y."(?)
i.e.
Example 1.
"occurrence of lung cancer = b0 + b1*(age) + b2*(number of comorbidities) + b3*(smoking status before lung cancer occurred)
is equal to
smoking status before lung cancer occurred = b0 + b1*(age) + b2*(number of comorbidities) + b3*(occurrence of lung cancer)"
or
Example 2.
"Dead = b0 + b1*(age) + b2*(number of comorbidities) + b3*(lung cancer)
is equal to
lung cancer occurred = b0 + b1*(age) + b2*(number of comorbidities) + b3*(Dead)"
Is it correct?
EDIT
I found a similar question: Relationship between regressing Y on X, and X on Y in logistic regression
But in my new example below, the odds ratio from multiple logistic regression is not the same as the odds ratio from simple logistic regression in the original question.
> y = c(0,0,0,1,1,1,1,1,1,1)
> x = c(0,1,1,0,0,0,1,1,1,1)
> z1 = c(0,1,1,1,1,0,0,0,1,1)
> z2 = c(1,1,0,0,1,0,1,1,0,1)
> z3 = c(0,1,0,1,1,0,1,0,1,0)
>
> fit = glm(y ~ x, family=binomial(link="logit"))
> coef(summary(fit))
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.0986123 1.154700 0.9514270 0.3413877
x -0.4054651 1.443375 -0.2809146 0.7787759
> fit = glm(x ~ y, family=binomial(link="logit"))
> coef(summary(fit))
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.6931472 1.224745 0.5659524 0.5714261
y -0.4054651 1.443375 -0.2809145 0.7787760
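The identical slope (-0.4054651) in both simple fits is just the sample log odds ratio from the 2×2 table of y and x, which is symmetric; a quick check in Python:

```python
import math

# The same data as in the R transcript above
y = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
x = [0, 1, 1, 0, 0, 0, 1, 1, 1, 1]

# Cross-tabulate: n[i][j] = count of observations with y == i and x == j
n = [[0, 0], [0, 0]]
for yi, xi in zip(y, x):
    n[yi][xi] += 1
# Resulting table: n = [[1, 2], [3, 4]]

# The sample odds ratio is the cross-product ratio, symmetric in y and x
odds_ratio = (n[0][0] * n[1][1]) / (n[0][1] * n[1][0])
print(math.log(odds_ratio))  # -0.405465..., the slope in both glm fits
```

This is why glm(y ~ x) and glm(x ~ y) report the same coefficient: with a single binary predictor, the fitted slope is exactly this log cross-product ratio.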
>
> fit = glm(y~x + z1 + z2 + z3, family=binomial(link="logit"))
> epiDisplay::logistic.display(fit)
Logistic regression predicting y
crude OR(95%CI) adj. OR(95%CI) P(Wald's test) P(LR-test)
x: 1 vs 0 0.6667 (0.0394,11.2853) 1.0057 (0.0422,23.9878) 0.997 0.997
z1: 1 vs 0 0.67 (0.04,11.29) 0.3 (0.01,11.61) 0.516 0.496
z2: 1 vs 0 0.67 (0.04,11.29) 0.49 (0.02,11.29) 0.659 0.654
z3: 1 vs 0 2.67 (0.16,45.14) 4.47 (0.15,133.82) 0.388 0.357
Log-likelihood = -5.5623
No. of observations = 10
AIC value = 21.1245
> fit = glm(x~y + z1 + z2 + z3, family=binomial(link="logit"))
> epiDisplay::logistic.display(fit)
Logistic regression predicting x
crude OR(95%CI) adj. OR(95%CI) P(Wald's test) P(LR-test)
y: 1 vs 0 0.67 (0.04,11.29) 0.96 (0.04,23.87) 0.979 0.979
z1: 1 vs 0 2 (0.15,26.73) 3.37 (0.11,99.3) 0.482 0.462
z2: 1 vs 0 2 (0.15,26.73) 2.85 (0.15,55.24) 0.488 0.475
z3: 1 vs 0 1 (0.08,12.56) 0.61 (0.02,15.96) 0.769 0.765
Log-likelihood = -6.2909
No. of observations = 10
AIC value = 22.5819
Why does this phenomenon occur?