I am reading Exercise 6.4 from *The Elements of Statistical Learning* (Hastie, Tibshirani and Friedman) and I am having difficulty interpreting exactly what is being asked in the following question:
Ex. 6.4 Suppose that the $p$ predictors $X$ arise from sampling relatively smooth analog curves at $p$ uniformly spaced abscissa values. Denote by $\operatorname{Cov}(X|Y) = \Sigma$ the conditional covariance matrix of the predictors, and assume this does not change much with $Y$. Discuss the nature of the Mahalanobis choice $A = \Sigma^{-1}$ for the metric in (6.14). How does this compare with $A = I$? How might you construct a kernel $A$ that (a) downweights high-frequency components in the distance metric; (b) ignores them completely?
Note that (6.14) is the kernel given by $$K_{\lambda, A}(x_0, x) = D\left(\frac{(x - x_0)^T A (x - x_0)}{\lambda}\right)$$
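Just to make the notation in (6.14) concrete for myself, here is a tiny numerical sketch comparing the weight produced by $A = I$ with the Mahalanobis choice $A = \Sigma^{-1}$. The Epanechnikov profile for $D$, the Gaussian-shaped $\Sigma$, and the value of $\lambda$ are placeholders I chose, not part of the exercise:

```python
import numpy as np

def epanechnikov(t):
    # Compactly supported profile D(t); any reasonable choice of D works here.
    return np.where(np.abs(t) <= 1, 0.75 * (1 - t**2), 0.0)

def kernel_weight(x0, x, A, lam):
    # K_{lambda, A}(x0, x) = D( (x - x0)^T A (x - x0) / lambda )
    d = x - x0
    return epanechnikov(d @ A @ d / lam)

rng = np.random.default_rng(0)
p = 5  # p uniformly spaced abscissa values
t = np.arange(p)

# A made-up "smooth curve" covariance: neighbouring abscissae are highly correlated.
Sigma = np.exp(-0.5 * (t[:, None] - t[None, :])**2 / 2.0**2)

x0 = rng.multivariate_normal(np.zeros(p), Sigma)
x = rng.multivariate_normal(np.zeros(p), Sigma)

lam = 25.0  # chosen large enough that the weights are not trivially zero
w_identity = kernel_weight(x0, x, np.eye(p), lam)
w_mahalanobis = kernel_weight(x0, x, np.linalg.inv(Sigma), lam)
print(w_identity, w_mahalanobis)
```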
To me it sounds like there are some number of smooth analog curves and each observation $x_i$ is a sample of one of them at the $p$ abscissa values, in which case $Y$ would be a categorical variable indicating which curve a given sample has been drawn from. Then $\Sigma_j$ would be the covariance matrix of all of the observations belonging to curve $j$. I don't think this makes sense, however, since we would then have a different weight matrix $A$ for each of the analog curves.
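To make this reading concrete, here is roughly the data-generating picture I have in mind; the two curve families, the noise level, and the sample sizes are entirely made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n_per_class = 20, 100
t = np.linspace(0, 1, p)  # p uniformly spaced abscissa values

def sample_curve(y):
    # Each observation is a noisy sample of a smooth curve from class y.
    phase = rng.normal(scale=0.2)
    amp = 1.0 + 0.1 * rng.normal()
    base = np.sin(2 * np.pi * (t + phase)) if y == 0 else np.cos(2 * np.pi * (t + phase))
    return amp * base + 0.05 * rng.normal(size=p)

X0 = np.array([sample_curve(0) for _ in range(n_per_class)])
X1 = np.array([sample_curve(1) for _ in range(n_per_class)])

# Under this reading, Sigma_j is the within-class sample covariance,
# and the two classes need not share the same Sigma_j.
Sigma_0 = np.cov(X0, rowvar=False)
Sigma_1 = np.cov(X1, rowvar=False)
print(Sigma_0.shape, np.linalg.norm(Sigma_0 - Sigma_1))
```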
I suspect this answer will be useful in solving the problem, but I still can't quite get a concrete interpretation of exactly what is being asked.
How exactly should this question be interpreted?