In Hothorn et al., the test statistic is specified as
$$ T_j(L_n, w) = \operatorname{vec}\left(\sum_{i=1}^n w_i \, g_j(X_{ji}) \, h(Y_i, (Y_1,\dots,Y_n))^T\right) $$
What is the exact form of this test statistic with a continuous response and categorical and numerical predictors?
If both the regressor $X_{ji}$ and the response $Y_i$ are numeric, then both $g(\cdot)$ and $h(\cdot)$ are chosen to be the identity by default. Thus, the linear test statistic $T_j$ is simply the weighted sum of products $\sum_i w_i X_{ji} Y_i$. This is essentially the main ingredient of a covariance or correlation, and after the subsequent standardization of $T_j$ it becomes a correlation test statistic.
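To make the numeric case concrete, here is a minimal sketch in plain Python, assuming unit weights $w_i = 1$ and identity transformations. Under permutation of the responses, the conditional mean of $T_j$ is $n \bar X_j \bar Y$ and its conditional variance is $\frac{1}{n-1} \sum_i (X_{ji} - \bar X_j)^2 \sum_i (Y_i - \bar Y)^2$, so the standardized statistic equals $\sqrt{n-1}$ times the Pearson correlation. The function names and the toy data are made up for illustration and are not part of any package.

```python
import math

def linear_statistic(x, y):
    """T_j = sum_i x_i * y_i (unit weights, g and h both the identity)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def conditional_moments(x, y):
    """Permutation mean and variance of T_j for scalar x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    mean = n * x_bar * y_bar
    var = sxx * syy / (n - 1)
    return mean, var

def standardized_statistic(x, y):
    """(T_j - mean) / sqrt(var), the standardized linear statistic."""
    mean, var = conditional_moments(x, y)
    return (linear_statistic(x, y) - mean) / math.sqrt(var)

def pearson_r(x, y):
    """Ordinary sample correlation, used here only as a cross-check."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical), only to show the two quantities agree.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
z = standardized_statistic(x, y)
r = pearson_r(x, y)
print(z, math.sqrt(len(x) - 1) * r)  # the two printed values coincide
```

The two printed numbers agree up to floating-point error, which is the sense in which the standardized $T_j$ is a correlation test statistic.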
If one of the variables is categorical, then the corresponding transformation ($g(\cdot)$ or $h(\cdot)$) is the matrix of all dummy variables. Consequently, the standardized test statistic for two categorical variables corresponds to a $\chi^2$ test statistic. If one variable is numeric and the other categorical, you obtain an ANOVA-type test. Other transformations are also possible, for example for censored survival responses or ordinal responses.
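For a continuous response and a categorical predictor, the dummy coding turns $T_j$ into a vector with one entry per level, namely the sum of the responses in that level. The following sketch, again assuming unit weights, computes this vector, its permutation mean $n_k \bar Y$, the per-level standardized statistics, and a quadratic form that works out to $(n-1) \cdot SS_{between} / SS_{total}$. All names and the toy data are hypothetical. The quadratic form uses $\mathrm{diag}(1/n_k)$ as a generalized inverse of the singular covariance matrix, a shortcut that is valid here because the centered statistic sums to zero.

```python
import math

def dummy_linear_statistic(groups, y):
    """T_j as a vector: for each level, the sum of y over that level."""
    levels = sorted(set(groups))
    t = [sum(yi for gi, yi in zip(groups, y) if gi == lev) for lev in levels]
    return levels, t

def anova_type_statistics(groups, y):
    """Standardized per-level statistics and the quadratic form (unit weights)."""
    n = len(y)
    y_bar = sum(y) / n
    syy = sum((yi - y_bar) ** 2 for yi in y)
    levels, t = dummy_linear_statistic(groups, y)
    counts = [groups.count(lev) for lev in levels]
    # Permutation mean of each entry: n_k * y_bar.
    mu = [nk * y_bar for nk in counts]
    # Diagonal of the permutation covariance: syy / (n - 1) * n_k * (1 - n_k / n).
    var = [syy / (n - 1) * nk * (1 - nk / n) for nk in counts]
    standardized = [(tk - mk) / math.sqrt(vk) for tk, mk, vk in zip(t, mu, var)]
    # Quadratic form, using diag(1 / n_k) as a generalized inverse (assumption).
    quad = (n - 1) / syy * sum((tk - mk) ** 2 / nk
                               for tk, mk, nk in zip(t, mu, counts))
    return levels, standardized, quad

# Toy data (hypothetical): three levels, two observations each.
groups = ["a", "a", "b", "b", "c", "c"]
y = [1.0, 2.0, 4.0, 5.0, 7.0, 9.0]
levels, standardized, quad = anova_type_statistics(groups, y)
print(levels, standardized, quad)
```

Whether the maximum of the absolute standardized entries or the quadratic form is used as the final test statistic depends on the settings of the software you are using.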
If you want to carry out the tests "by hand" you can explore the `independence_test()` function from the `coin` package for conditional inference. An introduction is available in Hothorn et al.'s "A Lego System for Conditional Inference" (doi:10.1198/000313006X118430), a preprint version of which is also available in the package as `vignette("LegoCondInf", package = "coin")`.