Questions tagged [semiparametric]

Semiparametric probability models are a general class of models for estimation and inference that contain both a parametric component and a nonparametric component.

David Cox coined the term semiparametric when formulating the seminal Cox model and describing the role of the baseline hazard. At that time, several generalizations of the likelihood function existed, such as quasi-likelihood, conditional likelihood, and partial likelihood, with little theory to reconcile their distinct approaches to inference and to handling nuisance parameters. Semiparametric theory provided that reconciliation.

This framework allows models such as generalized linear mixed models, Cox models, and conditional logistic regression to be written as the product of a parametric component and a nonparametric component. The nonparametric component can be "partialled" or "conditioned" out to obtain estimates of the parametric component that can be compared on a likelihood-like scale.
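
As a hedged illustration of "partialling out", consider the Cox model in its standard textbook form. The notation here is an assumption and is not taken from the description above: $x_i$ are covariates, $\beta$ is the parametric component, $\lambda_0(t)$ is the unspecified baseline hazard, $\delta_i$ is the event indicator, and $R(t_i)$ is the set of subjects still at risk just before event time $t_i$. Tied event times are assumed absent.

$$ \lambda(t \mid x_i) = \lambda_0(t)\exp(x_i^T\beta), \qquad L_p(\beta) = \prod_{i:\,\delta_i = 1} \frac{\lambda_0(t_i)\exp(x_i^T\beta)}{\sum_{j \in R(t_i)} \lambda_0(t_i)\exp(x_j^T\beta)} = \prod_{i:\,\delta_i = 1} \frac{\exp(x_i^T\beta)}{\sum_{j \in R(t_i)} \exp(x_j^T\beta)} $$

The baseline hazard $\lambda_0(t_i)$ cancels from every ratio, so the partial likelihood $L_p$ depends on $\beta$ alone and can be maximized without specifying the nonparametric component.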

Semiparametric estimators share several properties with maximum likelihood estimators, such as root-n consistency, asymptotic unbiasedness, and, under regularity conditions, existence and uniqueness of solutions.

36 questions
24
votes
5 answers

When is quantile regression worse than OLS?

Apart from some unique circumstances where we absolutely must understand the conditional mean relationship, what are the situations where a researcher should pick OLS over Quantile Regression? I don't want the answer to be "if there is no use in…
20
votes
2 answers

Generalized additive models -- who does research on them besides Simon Wood?

I use GAMs more and more. When I go to provide references for their various components (smoothing parameter selection, various spline bases, p-values of smooth terms), they are all from one researcher -- Simon Wood, at the University of Bath, in…
12
votes
0 answers

Is the Wilcoxon two-sample test maximally powered to detect proportional odds alternatives?

We know from the literature that: the Wilcoxon-Mann-Whitney two-sample rank sum test is optimal for detecting simple location shifts when comparing two continuous random variables that each have a logistic distribution; the Wilcoxon test is a special…
Frank Harrell
12
votes
3 answers

In survival analysis, when should we use fully parametric models over semi-parametric ones?

This question is the counterpoint to the other question, "In survival analysis, why do we use semi-parametric models (Cox proportional hazards) instead of fully parametric models?" Indeed, it clearly demonstrates the advantages of Cox Proportional…
Dan Chaltiel
9
votes
1 answer

Probabilistic interpretation of Thin Plate Smoothing Splines

TLDR: Do thin plate regression splines have a probabilistic/Bayesian interpretation? Given input-output pairs $(x_i,y_i)$, $i=1,\dots,n$, I want to estimate a function $f(\cdot)$ as follows \begin{equation}f(x)\approx u(x)=\phi(x_i)^T\beta…
5
votes
0 answers

How general is the backfitting algorithm?

Hastie & Tibshirani's original approach to fitting generalized additive models was the backfitting algorithm. For a model of the form $$ y = \alpha + \displaystyle\sum_k f_k(x_k) + \epsilon $$ Initialize the $\alpha$ and $f_k$ at reasonable…
generic_user
5
votes
1 answer

Gam with low E.D.F (estimated degrees of freedom) value in main effect, not interaction term

I have a gam model with the following structure: gam(sv ~ s(day, bs="ts") + s(range, bs="ts") + s(time, bs="cc") + ti(day, range, bs=c("ts", "ts")), data=train.all, method="REML") In this model's summary range has an e.d.f of 0.95, and…
Hannah
5
votes
2 answers

Understanding Big/Little $O_p$/$o_p$ Notation for Estimators

I am reading a text about Single Index Models (SIM), where a SIM is defined as $E[Y|X=x] = G(x' \beta)$, with $G$ and $\beta$ unknown. After proposing an estimator for the function $G$, the following statement is given $\sqrt{n h_n}[G_n(z) -…
5
votes
3 answers

Book for introductory nonparametric econometrics/statistics

My work involves a lot of econometrics, and I have a good background in it. Nevertheless, I am regularly faced with some semi- or nonparametric techniques (for instance I had to use quantile regressions, partial estimation, or nonparametric…
Anthony Martin
4
votes
1 answer

Efficient influence function in proportional hazards model

I was hoping someone could help me with this problem in the Cox proportional hazards model. I am given the following setup. $T$ is a non-negative random variable with a continuous distribution and hazard function $\lambda_T(t)$. $T$ has density $f_T(t) =…
3
votes
0 answers

OLR with rms: proportional odds assumption

I am fitting an ordinal logistic regression model with the rms package. My data involves a three-level ordered outcome (see reproducible data below), as well as continuous and binary predictors: fit <- lrm(y ~ grp + sprt + chrs + age + …
Uri
3
votes
0 answers

JuliaOpt Empirical Likelihood Estimation

I am trying to perform an empirical likelihood estimation in a regression setting using JuliaOpt (Convex or JuMP) and ran into difficulties using either API. The problem: Empirical likelihood for regression can be written as a maximization problem…
3
votes
0 answers

Testing semi-parametric versus parametric model

I am estimating a (semi)parametric and a parametric model for a panel data set, and I want to test the functional form by applying the method proposed by Henderson et al. (2008, p.266–267). In particular, given the two models: $$ \begin{aligned} Y…
Bob
2
votes
0 answers

Do robust standard errors protect you from proportional odds assumptions?

Cox Proportional Hazards models are traditionally taught alongside proportional hazards assumptions. There is a corresponding test of proportionality. However, if standard errors are calculated from sandwich estimators, there's no need to worry…
AdamO
2
votes
0 answers

MCMC fitting of a Dirichlet Process or Polya Tree prior to the residuals in a (simple linear regression)/(2-independent-samples) problem

Consider a simple location-shift semi-parametric model with two mutually-independent samples (here $F$ is a cumulative distribution function (CDF) on $\mathbb{ R }$, the $C_i$ and $T_j$ are real-valued outcomes for subjects in control $( C )$ and…