Classical MDS (cMDS, also known as PCoA) preserves global pairwise distances, which is characteristic of linear techniques. Metric MDS, however, minimizes a cost function (stress), while non-metric MDS (nMDS) preserves only the rank order of the dissimilarities between points. It seems to me that these stress-based variants produce a kind of embedding that would be nonlinear, yet both cMDS and nMDS are listed as linear techniques in this article. Conversely, Wikipedia describes MDS in general as a form of nonlinear dimensionality reduction.
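To make the "linear" case concrete, here is a minimal NumPy sketch of classical MDS as I understand it: double-center the squared distance matrix and take the top eigenvectors. The function name `classical_mds` and the random test data are my own illustrative choices, not taken from any library. When the input distances are Euclidean, the result should agree with the PCA scores up to the sign of each axis, which is presumably why cMDS gets labelled linear.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical MDS (PCoA) from an n x n matrix of pairwise distances D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    lam = np.clip(eigvals[idx], 0.0, None)    # guard against tiny negatives
    return eigvecs[:, idx] * np.sqrt(lam)

# Illustrative data (an assumption): 30 random points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

Y = classical_mds(D, k=2)

# PCA scores of the same data, computed via SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_scores = U[:, :2] * S[:2]

# Compare up to a sign flip per axis.
print(np.allclose(np.abs(Y), np.abs(pca_scores), atol=1e-6))
```

Note that this equivalence relies on the distances being Euclidean; with other dissimilarities the eigendecomposition still runs, but the link to a linear projection of the original coordinates no longer holds.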
It is also possible to modify the MDS cost function so that smaller distances are preserved preferentially, as in the case of Sammon mapping. This is definitely a nonlinear technique.
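As a rough illustration of the difference, the sketch below compares a raw (unweighted) metric stress with Sammon's stress, which, as I understand it, divides each squared error by the original distance and normalizes by the sum of the original distances. The function names and the three-point example are my own assumptions for illustration, and the code assumes no two original points coincide (otherwise there would be a division by zero).

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def raw_stress(D_orig, Y):
    """Sum of squared differences between original and embedded distances."""
    iu = np.triu_indices_from(D_orig, k=1)    # each pair counted once
    d_orig = D_orig[iu]
    d_emb = pairwise_distances(Y)[iu]
    return np.sum((d_orig - d_emb) ** 2)

def sammon_stress(D_orig, Y):
    """Sammon's stress: squared errors weighted by 1 / original distance."""
    iu = np.triu_indices_from(D_orig, k=1)
    d_orig = D_orig[iu]                       # assumed strictly positive
    d_emb = pairwise_distances(Y)[iu]
    return np.sum((d_orig - d_emb) ** 2 / d_orig) / np.sum(d_orig)

# Illustrative example (an assumption): three points on a line.
X = np.array([[0.0], [1.0], [3.0]])   # original distances: 1, 3, 2
Y = np.array([[0.0], [2.0], [3.0]])   # embedded distances: 2, 3, 1
D_orig = pairwise_distances(X)

print(raw_stress(D_orig, Y))      # both unit errors count equally
print(sammon_stress(D_orig, Y))   # the error on the shorter pair counts more
```

In this example the same absolute error of 1 occurs on a pair at original distance 1 and on a pair at original distance 2; the raw stress treats them identically, whereas Sammon's stress gives the shorter pair twice the weight.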
So: are multidimensional scaling and its variants considered linear or nonlinear dimensionality reduction techniques, and why?