Where is the error term behind the following model:
$$h_i(t) = h_0(t) \exp \left ( \sum_{k = 1}^p \beta_k z_{ik} \right )$$
The distributional assumptions behind a relative risk model are hidden in the baseline hazard function $h_0(t)$. If you specify a form for this function, then you completely specify the distribution of your data.
For example, $h_0(t) = \phi \psi t^{\phi - 1}$ corresponds to the Weibull distribution.
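As a quick sanity check of that correspondence, one can simulate survival times from this Weibull hazard by inverse-transform sampling; the parameter values below are chosen purely for illustration.

```python
import numpy as np

# Weibull baseline hazard h0(t) = phi * psi * t**(phi - 1) implies
# cumulative hazard H0(t) = psi * t**phi and survival S(t) = exp(-psi * t**phi).
# Inverse-transform sampling: T = (-log(U) / psi)**(1/phi), U ~ Uniform(0, 1).
phi, psi = 1.5, 0.2                      # illustrative shape/scale (assumed)
rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
t = (-np.log(u) / psi) ** (1 / phi)

# Empirical survival fractions should match the closed-form S(t).
for t0 in (1.0, 2.0, 4.0):
    empirical = (t > t0).mean()
    theoretical = np.exp(-psi * t0 ** phi)
    print(f"S({t0}) empirical={empirical:.3f} theoretical={theoretical:.3f}")
```

So specifying $h_0(t)$ really does pin down the whole distribution; there is no leftover slot for an additive error.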
There absolutely is an "error" in survival analysis.
You can define the "time to event" according to a probability model with some $$g(T) = b (X, t) + \epsilon(X,t)$$
where $g$ would usually be something like a log transform. Of course, requiring $\epsilon$ to be normal, identically distributed, or even stationary is a rather strong assumption that just doesn't play out in real life. But if we allow $\epsilon$ to be quite general, the Cox proportional hazards model is a special case of the above display. Is this an abuse of notation? Maybe. Note that we are not guaranteed any of the desirable properties, such as independence between the parameters. But if we think carefully about what an error is, it's not that it doesn't exist; it's just not a notation that helps facilitate scientific investigation.
This "fully parametric" approach can be very efficient when its assumptions hold. A fully parametric Weibull model is actually a lot like a linear regression model for survival data, where the scale parameter plays a role much like an error variance (dispersion parameter).
You could predict survival time for a given subject, subtract that from the observed survival time, and this "residual" can be flexibly modeled using semiparametric splines to describe the distribution and mean-variance relationship. More commonly, we use Schoenfeld residuals and their theoretical basis to assess the appropriateness of the proportional hazards assumption (the related martingale residuals are the difference between observed events and the expected cumulative hazard).
Theoretically, $S(T) \sim \mathrm{Uniform}(0,1)$; that is, under the quantile transform, the survival times are uniform, and their empirical process follows a Brownian bridge. So there is a relation between the probability model and a fundamentally random process. One could inspect diagnostic plots to assess the adequacy of $\hat{S}$ as an estimator of $S$.
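A small numerical sketch of this quantile-transform idea, using an Exponential(1) lifetime where $S(t)=e^{-t}$ is known exactly (the distribution is assumed only for illustration):

```python
import numpy as np

# If S is the true survival function of T, then S(T) ~ Uniform(0, 1).
# Sketch with an Exponential(1) lifetime, where S(t) = exp(-t).
rng = np.random.default_rng(1)
t = rng.exponential(size=50_000)
u = np.exp(-t)                      # quantile transform S(T)

# Kolmogorov-style check: max gap between the ECDF of u and the Uniform CDF.
u_sorted = np.sort(u)
ecdf = np.arange(1, u_sorted.size + 1) / u_sorted.size
max_gap = np.abs(ecdf - u_sorted).max()
print(f"max ECDF deviation from Uniform(0,1): {max_gap:.4f}")
```

The same check with $\hat{S}$ in place of the true $S$ is exactly the kind of diagnostic described above.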
Simple Linear Regression Model
\begin{equation} Y_i=B_0+B_1 X_i+\epsilon_i \end{equation}
where
$Y_i$ is the value of the response variable in the $i$th trial, and
$\epsilon_i$ is a random error term with mean $E[\epsilon_i]=0$ and variance $\sigma^2[\epsilon_i]=\sigma^2$, so that
\begin{equation} E[Y_i ]=B_0+B_1 X_i \end{equation}
Consider the simple linear regression model
\begin{equation} Y_i=B_0+B_1 X_i+\epsilon_i, \qquad Y_i \in \{0,1\} \end{equation}
where the outcome $Y_i$ is binary, taking on the value of either 0 or 1. The expected response $E[Y_i]$ has a special meaning in this case. Since $E[\epsilon_i]=0$, we have:
\begin{equation} E[Y_i ]=B_0+B_1 X_i \end{equation}
Consider $Y_i$ to be a Bernoulli random variable for which we can state the probability distribution as follows:
\begin{equation} P(Y_i=1)=\pi_i \end{equation} \begin{equation} P(Y_i=0)=1-\pi_i \end{equation}
Since $E[Y_i]=\pi_i$, it follows that
\begin{equation} E[Y_i ]=B_0+B_1 X_i= \pi_i \end{equation}
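The identity $E[Y_i]=B_0+B_1 X_i=\pi_i$ treats a probability as a linear function of $X$. A quick simulation (with made-up coefficients) shows why this is awkward: least-squares fitted "probabilities" can leave $[0,1]$.

```python
import numpy as np

# Linear probability model: fit E[Y] = B0 + B1*X by least squares on binary Y.
# Illustrative data (assumed): probability of "success" rises with x.
rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, size=500)
p_true = 1 / (1 + np.exp(-2 * x))          # true Bernoulli probabilities
y = rng.binomial(1, p_true)

X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]
pi_hat = b0 + b1 * x
print(f"fitted pi range: [{pi_hat.min():.2f}, {pi_hat.max():.2f}]")
# The fitted "probabilities" spill outside [0, 1], which is one motivation
# for the logistic model in the next section.
```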
Simple Logistic Regression Model
First, we require a formal statement of the simple logistic regression model. Recall that when the response variable is binary, taking on the values 1 and 0 with probabilities $\pi$ and $1-\pi$, respectively, $Y$ is a Bernoulli random variable with parameter $E[Y]=\pi$. We can state the simple logistic regression model in the following fashion:
The $Y_i$ are independent Bernoulli random variables with expected value $E[Y_i]=\pi_i$, where:
\begin{equation} E[Y_i ] =\pi_i= \frac{\exp(B_0+B_1 X_i)}{1+\exp(B_0+B_1 X_i)} \end{equation}
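A sketch of this model with made-up coefficients $B_0=-1$, $B_1=0.8$: the only randomness is the Bernoulli draw itself, so the empirical mean of $Y$ at a fixed $x$ recovers $\pi$ with no additive error in sight.

```python
import numpy as np

# In the logistic model the randomness lives in the Bernoulli draw itself:
# Y_i ~ Bernoulli(pi_i) with pi_i = exp(B0 + B1*x_i) / (1 + exp(B0 + B1*x_i)).
# Illustrative coefficients (assumed): B0 = -1, B1 = 0.8.
b0, b1 = -1.0, 0.8
rng = np.random.default_rng(3)

x0 = 1.5                                   # one fixed covariate value
pi0 = np.exp(b0 + b1 * x0) / (1 + np.exp(b0 + b1 * x0))
y = rng.binomial(1, pi0, size=200_000)     # many subjects at the same x

print(f"pi(x=1.5) = {pi0:.4f}, empirical mean of Y = {y.mean():.4f}")
# No epsilon is added: the spread of Y around pi0 is pi0*(1 - pi0),
# fully determined by pi0 rather than by a separate error term.
```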
Poisson Distribution
\begin{equation} f(Y)=\frac{\mu^Y \exp(-\mu)}{Y!} \end{equation}
$E[Y]=\mu$
$\sigma^2[Y]=\mu$
Poisson Regression Model
The Poisson regression model, like any nonlinear regression model, can be stated as follows:
\begin{equation} Y_i=E[Y_i ]+\epsilon_i, \qquad i=1,2,\ldots,n \end{equation}
The mean response for the $i$th case, denoted now by $\mu_i$ for simplicity, is assumed as always to be a function of the set of predictor variables $X_1,\ldots,X_{p-1}$. We use the notation $\mu(X_i,B)$ to denote the function that relates the mean response $\mu_i$ to $X_i$, the values of the predictor variables for case $i$, and $B$, the values of the regression coefficients. Some commonly used functions for Poisson regression are:
\begin{equation} \mu_i= \mu(X_i,B)=X_i'B \end{equation}
\begin{equation} \mu_i= \mu(X_i,B)=\exp(X_i'B) \end{equation}
\begin{equation} \mu_i= \mu(X_i,B)=\log_e(X_i'B) \end{equation}
Models of this form are called generalized linear models (GLMs).
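A minimal simulation of the log-link case $\mu_i=\exp(X_i'B)$, with illustrative coefficients (assumed, not from any dataset):

```python
import numpy as np

# Poisson regression with the log link: mu_i = exp(B0 + B1*x_i).
# Illustrative coefficients (assumed): B0 = 0.5, B1 = 0.3.
b0, b1 = 0.5, 0.3
rng = np.random.default_rng(4)
x = rng.uniform(0, 2, size=100_000)
mu = np.exp(b0 + b1 * x)
y = rng.poisson(mu)

# GLM signature: mean and variance move together (both equal mu),
# so there is no separate dispersion parameter to attach to an epsilon.
print(f"overall mean={y.mean():.3f}, mean of mu={mu.mean():.3f}")
```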
Survival analysis
Consider an AFT model with one predictor $X$. The model can be expressed on the log scale as: \begin{equation} \log(T)= a_0+a_1 X+\epsilon \end{equation}
where $\epsilon$ is a random error following some distribution. The distribution of $T$ determines the distribution of $\epsilon$:
- $T$ Exponential $\leftrightarrow$ $\epsilon$ extreme value
- $T$ Weibull $\leftrightarrow$ $\epsilon$ extreme value
- $T$ Log-logistic $\leftrightarrow$ $\epsilon$ logistic
- $T$ Lognormal $\leftrightarrow$ $\epsilon$ normal
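A minimal simulation of the AFT display, assuming made-up coefficients and a normal $\epsilon$ (the lognormal case in the pairing above):

```python
import numpy as np

# AFT sketch: log(T) = a0 + a1*x + eps. With eps ~ Normal, T is lognormal.
a0, a1, sigma = 1.0, -0.5, 0.4            # illustrative values (assumed)
rng = np.random.default_rng(5)
x = rng.uniform(0, 2, size=100_000)
eps = rng.normal(0, sigma, size=x.size)   # the error term is explicit here
t = np.exp(a0 + a1 * x + eps)

# E[log T | x] should equal a0 + a1*x; check near x = 1.
mask = np.abs(x - 1.0) < 0.05
print(f"mean log T near x=1: {np.log(t[mask]).mean():.3f} (target {a0 + a1:.3f})")
```

Unlike the Cox model below, the AFT formulation has an explicit $\epsilon$, which is why it reads like a linear regression on the log scale.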
In the Cox proportional hazards model, by contrast, the distributional assumptions are hidden in the baseline hazard function $h_0(t)$.
This answer is limited to frequentist statistics and statistical models without random effects.
In fact, statistical modeling amounts to finding the conditional distribution of the response variable given fixed values of the covariates, i.e., the distribution of $Y|X=x$. When writing a statistical model, following these three steps will keep you from mathematical mistakes.
Find the form of the distribution of $Y|X=x$.
List the parameters that determine the distribution.
Write down how the covariates determine those parameters through the unknown constant parameters.
Example 1: Subject = 5-16 year old boys (indexed by $i$), response variable $Y$ = height, Covariate $X$ = age.
E1-1: Distribution form: $Y_i|X_i \sim Normal$
E1-2: Parameters for normal: mean $\mu_i$ and variance $\sigma_i^2$
E1-3: Functions for parameters: $\mu_i = \mu_0+\beta X_i$ and $\sigma_i^2=\sigma^2$
It is the same as $Y_i = \mu_0 +\beta X_i +\epsilon_i$ and $\epsilon_i \sim N(0,\sigma^2)$
Example 2: Subject = men older than 65 years (indexed by $i$), response variable ($Y$) = dead or alive in the next full year, covariate ($X$) = age.
E2-1: Distribution form: $Y_i|X_i$ follows Bernoulli with parameter $\pi_i$, with $Y_i=1$ denoting death.
E2-2: Parameter for Bernoulli: $\pi_i$, the probability that the $i$-th person dies in the next year
E2-3: Function of parameters: $\pi_i = \frac{e^{\beta_0+\beta_1X_i}}{1+e^{\beta_0+\beta_1X_i}}$ or $log(\frac {\pi_i}{1-\pi_i}) = \beta_0+\beta_1X_i$
It is logistic regression, and there is no $\epsilon$ after $\beta_0+\beta_1X_i$.
Example 3: Subject = 10-minute intervals at a specific street from 6:00 am to 9:00 am (indexed by $i$), response variable $Y_i=$ # of cars that passed a specific place, covariate $X_i=\mathrm{int}((\text{beginning time} - \text{6:00, in minutes})/10)$
E3-1: Distribution: Poisson
E3-2: Parameter: $\lambda_i$
E3-3: Function of parameter: $\lambda_i = e^{\beta_0+\beta_1X_i}$ or $log(\lambda_i)=\beta_0+\beta_1X_i$
It is Poisson regression, and again there is no $\epsilon$ after $\beta_0+\beta_1X_i$.
OP's question:
Distribution: any probability distribution belonging to the proportional hazards family.
Parameters: they depend on the distribution, but we do not need to know them because of the proportional hazards assumption.
Function of parameters: $$h_i(t) = h_0(t) \exp \left ( \sum_{k = 1}^p \beta_k z_{ik} \right )$$ One needs to know that the hazard function determines the probability distribution of the survival time.
Obviously, there is no position for $\epsilon$.
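The three steps can be sketched numerically for the simplest proportional hazards family, a constant baseline hazard $h_0(t)=\lambda_0$ (chosen purely for illustration), under which $T_i$ is exponential with rate $\lambda_0 e^{\beta z_i}$:

```python
import numpy as np

# Proportional hazards with constant baseline hazard h0(t) = lam0:
# subject i's hazard is lam0 * exp(beta * z_i), so T_i is Exponential
# with that rate. The hazard alone pins down the survival distribution.
lam0, beta = 0.1, 0.7                     # illustrative values (assumed)
rng = np.random.default_rng(7)
z = rng.binomial(1, 0.5, size=200_000)    # binary covariate
rate = lam0 * np.exp(beta * z)
t = rng.exponential(1 / rate)

# Mean survival time should be 1/rate in each covariate group; no epsilon
# is added anywhere, the randomness is the draw from the distribution itself.
for g in (0, 1):
    theory = 1 / (lam0 * np.exp(beta * g))
    print(f"z={g}: mean T = {t[z == g].mean():.2f} (theory {theory:.2f})")
```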
If you still do not believe that there is no $\epsilon$ in the logistic, Poisson, and Cox proportional hazards models, consider the following two questions.
In the linear model, $\epsilon$ appears in the process of model establishment. In the final conclusions, we can and need to estimate the variance of $\epsilon$. We can also estimate $\epsilon$ itself by $Y-\hat Y$. We also know that $Var(\hat \beta) = (X'X)^{-1}\sigma^2$.
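These three linear-model facts can be checked directly in a few lines (the coefficients and error variance below are made up for the sketch):

```python
import numpy as np

# Sketch of the linear-model facts above: epsilon is estimated by Y - Y_hat,
# and Var(beta_hat) = (X'X)^{-1} sigma^2. Illustrative values (assumed).
beta_true = np.array([2.0, -1.0])
sigma = 0.5
rng = np.random.default_rng(6)

x = rng.uniform(0, 5, size=1_000)
X = np.column_stack([np.ones_like(x), x])
eps = rng.normal(0, sigma, size=x.size)
y = X @ beta_true + eps

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat                       # estimate of epsilon itself
s2 = resid @ resid / (x.size - 2)              # estimate of sigma^2
cov_beta = np.linalg.inv(X.T @ X) * s2         # (X'X)^{-1} sigma^2 plug-in
print(f"beta_hat={beta_hat.round(3)}, s={np.sqrt(s2):.3f}")
```

In the other three model families there is simply no analogous quantity to estimate, which is the point of the questions that follow.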
In the other three kinds of models, if you insist there is an $\epsilon$, why does it not appear in the process of model establishment? What is the effect of $\epsilon$ on the model? Did, and could, we estimate anything related to $\epsilon$?
So if you insist there is an $\epsilon$ in those three kinds of models, then $\epsilon$ acts like a ghost: when you want it, it appears; when you do not, it disappears. But in mathematical statistics, this kind of ghost is not allowed in a model.
You may ask why it is acceptable that the baseline hazard function $\lambda_0(t)$ also appears in the model specification yet disappears in the model-fitting process and final results. The reason is that in the process of model establishment, $\lambda_0(t)$ is cancelled under the proportional hazards assumption. And if you are really interested in $\lambda_0(t)$, you can get its estimate, unlike $\epsilon$, which cannot be estimated.
Why does the linear model $Y\sim N(X\beta, \sigma^2)$ have an alternative expression $Y=X\beta + \epsilon$ with $\epsilon \sim N(0,\sigma^2)$, while the other models have no alternative expression involving $\epsilon$?
(will continue)