I'm reading an article on the use of influence curves in robust estimation (Hampel, 1974), which includes the following definition of the influence curve of an estimator $T$:
Let $R$ be the real line, let $T$ be a real-valued functional defined on some subset of the set of all probability measures on $R$, and let $F$ denote a probability measure on $R$ for which $T$ is defined. Denote by $\delta_x$ the probability measure determined by the point mass $1$ in any given point $x \in R$. Mixtures of $F$ and some $\delta_x$ are written as $(1 - \epsilon)F + \epsilon \delta_x$, for $0 < \epsilon < 1$. Then the influence curve $IC_{T,F} (.)$ of (the "estimator") $T$ at (the "underlying probability distribution") $F$ is defined pointwise by $IC_{T,F}(x) = \lim_{\epsilon \to 0} \{ T[(1 - \epsilon)F + \epsilon \delta_x] -T(F) \}/\epsilon$ if this limit is defined for every point $x \in R$.
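To make the definition concrete, here is a minimal numerical sketch. It is not from Hampel's article: it assumes $F$ is a discrete distribution given by support points and weights, takes $T$ to be the mean functional, and replaces the limit with a finite difference at a small $\epsilon$. The names `contaminate`, `mean_functional`, and `influence_curve` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def contaminate(points, weights, x, eps):
    """Support and weights of (1 - eps) * F + eps * delta_x for a discrete F."""
    new_points = np.append(points, x)
    # Every existing weight is shrunk by (1 - eps); the point x receives mass eps.
    new_weights = np.append((1.0 - eps) * weights, eps)
    return new_points, new_weights

def mean_functional(points, weights):
    """T(F) = integral of x dF(x), which is a weighted sum for a discrete F."""
    return float(np.sum(points * weights))

def influence_curve(T, points, weights, x, eps=1e-6):
    """Finite-difference approximation of IC_{T,F}(x) (assumes a small fixed eps)."""
    p_eps, w_eps = contaminate(points, weights, x, eps)
    return (T(p_eps, w_eps) - T(points, weights)) / eps

# F puts mass 1/4 on each of 1, 2, 3, 4, so T(F) = 2.5.
points = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.full(4, 0.25)
ic_at_10 = influence_curve(mean_functional, points, weights, x=10.0)
```

For the mean, $T[(1 - \epsilon)F + \epsilon \delta_x] = (1 - \epsilon)T(F) + \epsilon x$, so the difference quotient equals $x - T(F)$ for every $\epsilon$, and `ic_at_10` should come out close to $10 - 2.5 = 7.5$.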
What is the quantity $\delta_x$ measuring?
Is $\delta_x$ the same as the infinitesimal probability $p_X(x)\,dx$ that a density $p_X$ (say, with cumulative distribution function $P$) assigns to the interval $[x, x+dx]$? $\delta_x$ is also called an "atomic probability measure" later in the article.
If so, then $IC_{T,F}(x)$ measures the "rate of change" of the functional $T(F)$ as you mix a small amount ($\epsilon$) of an alternate distribution $P$ into $F$. Is that correct?
I'm trying to wrap my mind around how one might have a weighted mixture of two probability distributions. It's an important concept to understand for new causal inference techniques such as Targeted Maximum Likelihood Estimation.
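One common way to picture a weighted mixture $(1 - \epsilon)F + \epsilon G$ is as a two-stage draw: with probability $\epsilon$ sample from $G$, otherwise sample from $F$. The sketch below illustrates that reading under assumptions that are mine rather than the article's: $F$ is standard normal, $G$ is a distribution that always returns the single value $x = 5$, and `sample_mixture` is a hypothetical helper.

```python
import numpy as np

def sample_mixture(rng, sample_f, sample_g, eps, n):
    """Draw n values from (1 - eps) * F + eps * G via a two-stage draw."""
    # Stage 1: decide, independently for each draw, which component it comes from.
    from_g = rng.random(n) < eps
    # Stage 2: take the value from G where selected, otherwise from F.
    return np.where(from_g, sample_g(n), sample_f(n))

rng = np.random.default_rng(0)
x = 5.0
draws = sample_mixture(
    rng,
    sample_f=lambda n: rng.standard_normal(n),  # assumed F: standard normal
    sample_g=lambda n: np.full(n, x),           # assumed G: always returns x
    eps=0.1,
    n=100_000,
)

# Share of draws that landed exactly on x; it should be close to eps.
frac_at_x = float(np.mean(draws == x))
```

Under these assumptions roughly a fraction $\epsilon$ of the draws equal $x$ exactly, and the sample mean is near $(1 - \epsilon) \cdot 0 + \epsilon \cdot 5 = 0.5$.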