Distance defined by second moment, akin to Mahalanobis distance?

Question

In ordinary linear regression ($c=0$) and ridge regression ($c > 0$), for an $N \times D$ design matrix $X$ ($N$ observations, $D$ dimensions), the $N \times N$ hat matrix is given by:

$$H = X (X^T X + c I)^{-1} X^T$$

Writing $x_n^T$ for the $n$-th row of $X$, the $ij$-th element is given by

$$[H]_{ij} = x_i^T \Big( \underbrace{\sum_n x_n x_n^T + c I }_{:= M}\Big)^{-1} x_j.$$
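As a quick numerical sanity check of the elementwise formula, here is a minimal NumPy sketch. The sizes `N`, `D`, the penalty `c`, the indices `i`, `j`, and the random data are arbitrary choices made for illustration only; nothing here comes from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, c = 6, 3, 0.5          # illustrative sizes and ridge penalty (assumptions)
X = rng.normal(size=(N, D))  # row n of X plays the role of x_n^T

# M = sum_n x_n x_n^T + c I, which is the same matrix as X^T X + c I
M = sum(np.outer(x, x) for x in X) + c * np.eye(D)

# Full hat matrix H = X M^{-1} X^T (solve avoids forming the inverse explicitly)
H = X @ np.linalg.solve(M, X.T)

# Single entry computed directly as x_i^T M^{-1} x_j
i, j = 1, 4
h_ij = X[i] @ np.linalg.solve(M, X[j])

print(np.isclose(H[i, j], h_ij))
```

So each entry of $H$ is a bilinear form in $x_i$ and $x_j$ weighted by $M^{-1}$, and $H$ is symmetric because $M$ is.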

What effect does this $D \times D$ matrix $M$ (specifically, its inverse) have on the inner product between $x_i$ and $x_j$?

It looks similar to the Mahalanobis distance, but it is built from the non-centered data and the non-centered (raw) second moment, rather than from centered data and the centered second moment (the covariance).
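Two standard linear-algebra facts make the comparison concrete, sketched below under illustrative assumptions (random data with a nonzero mean, arbitrary `N`, `D`, `c`). First, since $M$ is symmetric positive definite for $c > 0$, it has a Cholesky factor $M = L L^T$, so $x_i^T M^{-1} x_j$ is an ordinary Euclidean inner product between the transformed vectors $L^{-1} x_i$ and $L^{-1} x_j$. Second, the raw second moment differs from the centered scatter matrix only by a rank-one term in the mean: $\sum_n x_n x_n^T = \sum_n (x_n - \bar{x})(x_n - \bar{x})^T + N \bar{x} \bar{x}^T$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, c = 50, 3, 0.1                   # illustrative values (assumptions)
X = rng.normal(loc=2.0, size=(N, D))   # deliberately non-centered data

M = X.T @ X + c * np.eye(D)

# Whitening view: M = L L^T, z_n = L^{-1} x_n
L = np.linalg.cholesky(M)              # lower-triangular Cholesky factor
Z = np.linalg.solve(L, X.T).T          # row n is (L^{-1} x_n)^T

gram_M = X @ np.linalg.solve(M, X.T)   # entries x_i^T M^{-1} x_j
gram_Z = Z @ Z.T                       # entries z_i^T z_j (plain dot products)

# Raw second moment = centered scatter + N * (mean outer mean)
m = X.mean(axis=0)
Xc = X - m
raw = X.T @ X
decomp = Xc.T @ Xc + N * np.outer(m, m)

print(np.allclose(gram_M, gram_Z), np.allclose(raw, decomp))
```

In this sketch the only structural difference from the Mahalanobis setup is the extra $N \bar{x} \bar{x}^T$ term (plus the ridge term $cI$) and the fact that the vectors themselves are not centered.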

"Metric for the second centered moment" doesn't make sense: I think you mean "metric based on the second centered moment." It's equivalent to a Euclidean metric. So is the situation based on raw second moments, so the two have a great deal in common. What, specifically, are you looking for in an answer then? — whuber, Dec 14 '21 at 18:36
Yes, I'll change my phrasing to "metric based on the second centered moment". I'm looking for whether the metric based on the second moment (a) has a name and (b) what properties it has — Rylan Schaeffer, Dec 14 '21 at 18:41
"What properties it has" is too broad and vague to be suitable here: that's why I'm suggesting you make your question more specific. — whuber, Dec 14 '21 at 18:43
I'm not sure how to be more specific. I don't know how to ask for specific properties that I don't know exist — Rylan Schaeffer, Dec 14 '21 at 18:44
Please bear in mind that this site, like all SE sites, is for [practical detailed questions](https://stats.stackexchange.com/tour): "Focus on questions about an actual problem you have faced. Include details about what you have tried and exactly what you are trying to do." — whuber, Dec 14 '21 at 18:46
I've said this before and I'll say it again: I feel like you interpret that statement far too narrowly. — Rylan Schaeffer, Dec 14 '21 at 18:48
The records (of close votes, edits, etc.) show that for a very long time I have interpreted these guidelines more *broadly* than most users and almost all diamond mods. In the present case, although one might argue about what an "actual problem" is and what "what you have tried" might amount to, you haven't yet exhibited information about either one. By all rights I should have closed this thread immediately, but I am hopeful--based on your previous activities here on CV--that you will know how to make this question on topic and will do so before any confusing "answers" appear. — whuber, Dec 14 '21 at 18:53
In rephrasing this question, I've made it more narrow so that someone else interested in understanding the metric defined using the second noncentral moment will not be able to find my question. I think that constraining questions to be so focused on details means that answers are less generally applicable. — Rylan Schaeffer, Dec 14 '21 at 18:58
I agree: requiring questions to be focused makes them less general. I have worked around that in my own posts by often offering a solution to a generalization of the question. One way to approach this as the poser of a question would be to ask a specific question, but then describe the context and invite optional generalized answers. BTW, your edit is interesting, because (for the first time) it suggests developing an answer based on the "data augmentation" interpretation of Ridge Regression at https://stats.stackexchange.com/a/164546/919. — whuber, Dec 14 '21 at 19:55
Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/132330/discussion-between-rylan-schaeffer-and-whuber). — Rylan Schaeffer, Dec 14 '21 at 20:03

