I'm studying inference and I would like to better understand how the classical distributions were derived. I managed to derive the t-distribution using SymPy, but my result is a bit different from what was expected. Here is what I have tried:
First, I define a standard normal and a chi distribution:
from sympy import *
from sympy.stats import Normal, Chi, StudentT, density
from sympy.abc import x, y, z, k
X = Normal('X', 0, 1)
Y = Chi('Y', k)
normal = density(X).pdf(x)
chi = density(Y).pdf(y)
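As a quick sanity check (my own addition, using a concrete $k=3$ so the integrals evaluate without extra assumptions), both densities should integrate to 1 over their supports:

```python
from sympy import integrate, oo, simplify
from sympy.stats import Normal, Chi, density
from sympy.abc import x, y

# Sanity check with a concrete k (k = 3): both pdfs should integrate to 1.
X3 = Normal('X3', 0, 1)
Y3 = Chi('Y3', 3)
total_normal = integrate(density(X3)(x), (x, -oo, oo))
total_chi = integrate(density(Y3)(y), (y, 0, oo))   # Chi is supported on (0, oo)
```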
As I understand, we can recover the density of the random variable given by the ratio of two independent random variables ($Z=X/Y$) using:

$$f_Z(z)=\int_{-\infty}^{\infty} \left|y\right|\, f_X(zy)\, f_Y(y)\, dy$$

In this case, the above expression is given by:
integrate(abs(y)*normal.subs({x:z*y})*chi, y)
$$\frac{\sqrt{2} \cdot 2^{1 - \frac{k}{2}} \int \frac{y^{k} e^{- \frac{y^{2}}{2}} e^{- \frac{y^{2} z^{2}}{2}} \left|{y}\right|}{y}\, dy}{2 \sqrt{\pi} \Gamma\left(\frac{k}{2}\right)} $$
Computing this integral over $(0,\infty)$ (the support of the chi distribution) gives:
result=integrate(abs(y)*normal.subs({x:z*y})*chi, (y,0,+oo))
result
$$\begin{cases} \frac{\left(z^{2} + 1\right)^{- \frac{k}{2} - \frac{1}{2}} \Gamma\left(\frac{k}{2} + \frac{1}{2}\right)}{\sqrt{\pi} \Gamma\left(\frac{k}{2}\right)} & \text{for}\: \left(2 \left|{\arg{\left(z \right)}}\right| \leq \frac{\pi}{2} \wedge \frac{\operatorname{re}{\left(k\right)}}{2} + \frac{1}{2} > 0\right) \vee \left(\frac{\operatorname{re}{\left(k\right)}}{2} + \frac{1}{2} > 0 \wedge 2 \left|{\arg{\left(z \right)}}\right| < \frac{\pi}{2}\right) \\\int\limits_{0}^{\infty} \frac{\sqrt{2} \cdot 2^{1 - \frac{k}{2}} y^{k - 1} e^{- \frac{y^{2}}{2}} e^{- \frac{y^{2} z^{2}}{2}} \left|{y}\right|}{2 \sqrt{\pi} \Gamma\left(\frac{k}{2}\right)}\, dy & \text{otherwise} \end{cases} $$
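As an aside, a sketch of my own: declaring the symbols with positivity assumptions ($k>0$, and $z>0$ for simplicity, since the density is even in $z$) lets SymPy resolve the convergence conditions and return the first branch directly instead of a `Piecewise`. The names `kp`, `zp`, `yp` below are mine, and the pdfs are written out by hand:

```python
from sympy import Symbol, integrate, exp, sqrt, pi, gamma, oo

# With positivity assumptions SymPy can resolve the convergence conditions.
kp = Symbol('k', positive=True)
zp = Symbol('z', positive=True)  # the density is even in z, so z > 0 loses nothing
yp = Symbol('y', positive=True)

normal_p = exp(-(zp*yp)**2/2)/sqrt(2*pi)                        # N(0,1) pdf at x = z*y
chi_p = 2**(1 - kp/2)*yp**(kp - 1)*exp(-yp**2/2)/gamma(kp/2)    # Chi(k) pdf
closed = integrate(yp*normal_p*chi_p, (yp, 0, oo))
```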
The interest here lies in the first expression, which looks like a t-distribution but is a bit different: it seems to me that I should have $\left(1+\frac{z^{2}}{k}\right)$ instead of $\left(z^{2}+1\right)$. Namely, the t-distribution should be:
$$\frac{\Gamma[(k+1)/2]}{\sqrt{k\pi}\Gamma(k/2)}\left(1+\frac {z^2}{k}\right)^{-\frac 12 (k+1)}$$
I thought that perhaps the two expressions were equivalent, so I numerically compared my density with the real t-distribution. For $k=20$ and $z=40$, the real t-distribution gives:
Z=StudentT("z", k)
N(density(Z)(z).subs({k:20,z:40}))
3.60073318935429⋅10^-21
while my density gives:
N(result.args[0][0].subs({k:20,z:40}))
3.98006030714142⋅10^-34
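One probe I tried (my own guess, not something I have derived — the names `kv`, `my_pdf`, `rescaled` are mine): if the two densities differed only by the change of variable $t=\sqrt{k}\,z$, then evaluating my expression at $z=40/\sqrt{20}$ and dividing by the Jacobian factor $\sqrt{20}$ should reproduce the t-density at 40:

```python
from sympy import sqrt, pi, gamma, S, N
from sympy.stats import StudentT, density
from sympy.abc import z

# Hypothetical check: do the two expressions agree after the change of
# variable t = sqrt(k)*z (with the matching 1/sqrt(k) Jacobian factor)?
kv = 20
t_pdf_at_40 = density(StudentT('T', kv))(40)

# My density from the integration above, written out explicitly:
my_pdf = gamma(S(kv + 1)/2)/(sqrt(pi)*gamma(S(kv)/2))*(1 + z**2)**(-S(kv + 1)/2)
rescaled = my_pdf.subs(z, 40/sqrt(kv))/sqrt(kv)
```

If the two values agree, the discrepancy would be a scaling issue rather than a rounding error.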
Are the results different because I did something wrong in the derivation, or is it something related to rounding error? What am I doing wrong here? Am I on the right track?