In the thread *Is there any statistical test that is parametric and non-parametric?*, @JohnRos gives an answer saying that
Parametric is used in (at least) two meanings:
- A - To declare you are assuming the family of the noise distribution up to its parameters.
- B - To declare you are assuming the specific functional relationship between the explanatory variables and the outcome.
@whuber counters that
The two meanings in the first paragraph frequently have a unified treatment in the literature: that is, there appears to be no fundamental or important distinction between them.
**Question: I am failing to see exactly how these two meanings admit a unified treatment, and I wonder if anyone could provide an explanation.**
For example, I find the definition used in the tag information on nonparametric (created by @whuber) similar to A:
Most statistical procedures derive their justification from a probability model of the observations to which they are applied. Such a model posits that the data appear to be related in a specific way to draws from some probability distribution that is an unknown member of some family of distributions. The family of distributions for a parametric procedure can be described in a natural way by a finite set of real numbers, the "parameters." Examples include the family of Binomial distributions (which can be parameterized by the chance of a "success") and the family of Normal distributions (usually parameterized by an expectation $\mu$ and variance $\sigma^2$). When such a description is not possible, the procedure is termed "nonparametric." Wikipedia provides a list of some non-parametric procedures.
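To make meaning A concrete for myself, here is a minimal sketch (Python, with simulated data of my own choosing, not taken from the tag wiki) contrasting a procedure that assumes the Normal family, so that estimating $\mu$ and $\sigma^2$ pins down the entire distribution, with a nonparametric counterpart that uses the empirical CDF and is not described by any finite set of parameters:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=200)  # data whose distribution we pretend not to know

# Meaning A (parametric): assume the Normal family; the whole distribution
# is then determined by the two parameters mu and sigma.
mu_hat, sigma_hat = x.mean(), x.std(ddof=1)
p_parametric = stats.norm.cdf(6.0, loc=mu_hat, scale=sigma_hat)

# Nonparametric counterpart: no family assumed; use the empirical CDF,
# which cannot be summarized by a finite set of real-valued parameters.
p_nonparametric = (x <= 6.0).mean()

print(p_parametric, p_nonparametric)  # two estimates of P(X <= 6)
```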
but I cannot easily reconcile it with the description of the notion in James et al., *An Introduction to Statistical Learning*, p. 21, which is similar to B:
Parametric methods involve a two-step model-based approach.
- First, we make an assumption about the functional form, or shape, of $f$. For example, one very simple assumption is that $f$ is linear in $X$: $$ f(X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p. \tag{2.4} $$ This is a linear model, which will be discussed extensively in Chapter 3. Once we have assumed that $f$ is linear, the problem of estimating $f$ is greatly simplified. Instead of having to estimate an entirely arbitrary $p$-dimensional function $f(X)$, one only needs to estimate the $p+1$ coefficients $\beta_0,\beta_1,\dots,\beta_p$.
- After a model has been selected, we need a procedure that uses the training data to fit or train the model. In the case of the linear model (2.4), we need to estimate the parameters $\beta_0,\beta_1,\dots,\beta_p$. That is, we want to find values of these parameters such that $$ Y \approx \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p. $$ The most common approach to fitting the model (2.4) is referred to as (ordinary) least squares, which we discuss in Chapter 3. However, least squares is one of many possible ways to fit the linear model. In Chapter 6, we discuss other approaches for estimating the parameters in (2.4).
The model-based approach just described is referred to as parametric; it reduces the problem of estimating $f$ down to one of estimating a set of parameters.
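And here is an equally minimal sketch of how I read meaning B, in the spirit of (2.4): once $f$ is assumed linear, estimating $f$ reduces to estimating the $p+1$ coefficients by ordinary least squares. The simulated data and coefficient values are my own illustration, not from the book:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 2
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([2.0, -3.0]) + rng.normal(scale=0.5, size=n)

# Meaning B (parametric): assume f is linear in X, so estimating f
# reduces to estimating the p + 1 coefficients beta_0, ..., beta_p.
X_design = np.column_stack([np.ones(n), X])               # add intercept column
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)   # ordinary least squares fit

print(beta_hat)  # estimates of beta_0, beta_1, beta_2
```

In the first sketch the parameters index a family of distributions for the observations; in the second they index a family of regression functions, and I do not see why these should count as the same notion.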
Again, my question can be found above in bold print.