21

Is it that in standardization variance is known while in studentization it is not known and therefore estimated? Thank you.

58485362
  • 2
    You may want to clarify the context of your question. What kind of standardization, what kind of studentization? What are these values being used for? – russellpierce May 22 '14 at 17:29
  • 3
    If you're asking about *residuals*, then the terminology is not (ahem) *standardized*. Different authors use different names for the same thing and occasionally, and sadly most confusingly, the same name for different things. There are what I call (i) *scaled* residuals ($(y_i-\hat{y}_i)/s$, called *standardized* residuals by some authors); (ii) *internally studentized* residuals (called *standardized* by some authors/packages, *studentized* by others); (iii) *externally studentized* / *studentized deleted* residuals – Glen_b May 22 '14 at 22:37
  • Unless you clarify in which context you are using standardization/studentization, you won't get the exact answer you are looking for :( – the Jun 03 '21 at 17:17

4 Answers

21

A short recap. Consider the model $y=X\beta+\varepsilon$, where $X$ is $n\times p$. The least-squares estimate is $\hat\beta=(X'X)^{-1}X'y$ and the fitted values are $\hat y=X\hat\beta=X(X'X)^{-1}X'y=Hy$, where $H=X(X'X)^{-1}X'$ is the "hat matrix". Residuals are $$e=y-\hat y=y-Hy=(I-H)y$$ The population variance $\sigma^2$ is unknown and can be estimated by $MSE=e'e/(n-p)$, the mean square error.

Semistudentized residuals are defined as $$e_i^*=\frac{e_i}{\sqrt{MSE}}$$ but this ignores that the residuals do not share a common variance: the variance of $e_i$ depends on both $\sigma^2$ and $X$, namely $V(e_i)=\sigma^2(1-h_{ii})$, and is estimated by $$\widehat V(e_i)=MSE(1-h_{ii})$$ where $h_{ii}$ is the $i$th diagonal element of the hat matrix.

Standardized residuals, also called internally studentized residuals, are: $$r_i=\frac{e_i}{\sqrt{MSE(1-h_{ii})}}$$
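To make the definitions above concrete, here is a minimal sketch for the special case of simple linear regression ($p=2$), where the leverages have the closed form $h_{ii}=1/n+(x_i-\bar x)^2/S_{xx}$. The data and the function names (`fit_line`, `leverages`, `residual_diagnostics`) are made up for illustration and are not taken from Kutner et al.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def leverages(x):
    """Diagonal of the hat matrix for an intercept-plus-slope model."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def residual_diagnostics(x, y):
    """Return residuals, leverages, MSE, semistudentized and
    internally studentized (standardized) residuals."""
    n, p = len(x), 2  # p = 2 parameters: intercept and slope
    b0, b1 = fit_line(x, y)
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    h = leverages(x)
    mse = sum(ei ** 2 for ei in e) / (n - p)
    semi = [ei / math.sqrt(mse) for ei in e]
    internal = [ei / math.sqrt(mse * (1.0 - hi)) for ei, hi in zip(e, h)]
    return e, h, mse, semi, internal


# Hypothetical data; the last point has a far-out x value (high leverage).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 12.0]

e, h, mse, semi, internal = residual_diagnostics(x, y)
for i in range(len(x)):
    print(f"i={i}  e={e[i]: .4f}  h={h[i]:.4f}  "
          f"semi={semi[i]: .4f}  internal={internal[i]: .4f}")
```

Because $0<1-h_{ii}<1$, each internally studentized residual is at least as large in absolute value as the corresponding semistudentized one, and the gap is widest for high-leverage points.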

However, the single $e_i$ and $MSE$ are not independent, so $r_i$ can't have a $t$ distribution. The procedure is then to delete the $i$th observation, fit the regression function to the remaining $n-1$ observations, and get a new prediction for the deleted case, which can be denoted by $\hat y_{i(i)}$. The difference: $$d_i=y_i-\hat y_{i(i)}$$ is called the deleted residual. An equivalent expression that does not require refitting is: $$d_i=\frac{e_i}{1-h_{ii}}$$ Denoting the new $X$ and $MSE$ by $X_{(i)}$ and $MSE_{(i)}$, since they do not depend on the $i$th observation, we get: $$t_i=\frac{d_i}{\sqrt{\frac{MSE_{(i)}}{1-h_{ii}}}} =\frac{e_i}{\sqrt{MSE_{(i)}(1-h_{ii})}}\sim t_{n-p-1}$$ The $t_i$'s are called studentized (deleted) residuals, or externally studentized residuals.
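The deletion procedure can be sketched by brute force: refit the line without case $i$, compute $d_i$ and $MSE_{(i)}$ from that refit, and form $t_i$. This is only an illustration for simple linear regression ($p=2$) on made-up data, and the helper names `ols` and `externally_studentized` are hypothetical. The final loop checks numerically that the refit agrees with the shortcut $d_i=e_i/(1-h_{ii})$.

```python
import math


def ols(x, y):
    """Fit y = b0 + b1*x; return (b0, b1, residuals, leverages, mse)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    h = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    mse = sum(ei ** 2 for ei in e) / (n - 2)
    return b0, b1, e, h, mse


def externally_studentized(x, y):
    """Return (t, d): studentized deleted residuals and deleted residuals,
    each obtained by actually refitting without observation i."""
    _, _, _, h, _ = ols(x, y)  # leverages come from the full fit
    t, d = [], []
    for i in range(len(x)):
        x_del = x[:i] + x[i + 1:]
        y_del = y[:i] + y[i + 1:]
        b0_i, b1_i, _, _, mse_i = ols(x_del, y_del)
        d_i = y[i] - (b0_i + b1_i * x[i])  # y_i minus its deleted prediction
        d.append(d_i)
        t.append(d_i / math.sqrt(mse_i / (1.0 - h[i])))
    return t, d


# Hypothetical data, same shape as a small textbook example.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 12.0]

t, d = externally_studentized(x, y)
_, _, e, h, mse = ols(x, y)
for i in range(len(x)):
    shortcut = e[i] / (1.0 - h[i])
    print(f"i={i}  d_refit={d[i]: .6f}  d_shortcut={shortcut: .6f}  t={t[i]: .4f}")
```

In practice one would not refit $n$ times; the loop is there only to show that the closed-form expressions reproduce the leave-one-out quantities.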

See Kutner et al., Applied Linear Statistical Models, Chapter 10.

Edit: I must say that the answer by rpierce is perfect. I thought that the OP was about standardized and studentized residuals (and dividing by the population standard deviation to get standardized residuals looked odd to me, of course), but I was wrong. I hope that my answer can help someone even if OT.

Sergio
  • 2
    ... and this answer is correct in defining studentized residuals from a regression equation. There is no definition of a corresponding standardized residual. The regression framework doesn't seem to apply to the question asked. But this is still a valuable contribution; +1 – russellpierce May 22 '14 at 16:57
  • 2
    @rpierce, you are right: as soon as I read "studentization" I read "residuals" too, but they only were in my mind ;-) Sorry. I have noticed my oversight only after the last click. – Sergio May 22 '14 at 17:04
9

In the social sciences it is typically said that Studentized scores use Student's/Gosset's calculation for estimating the population variance/standard deviation from the sample variance/standard deviation ($s$). In contrast, Standardized scores (a noun, a particular type of statistic, the Z score) are said to use the population standard deviation ($\sigma$).

However, it appears there are some terminological differences across fields (please see the comments on this answer). Therefore, one ought to proceed with caution in making these distinctions. Moreover, studentized scores are rarely called such, and one typically sees 'studentized' values in the context of regression. @Sergio provides details about those types of studentized deleted residuals in his answer.
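As a small illustration of the score-level distinction drawn in the first paragraph, the sketch below computes Z scores with population parameters that are assumed to be known, and studentized scores using the sample mean and the sample standard deviation $s$. The numbers ($\mu=100$, $\sigma=15$, and the sample itself) are invented for the example, and the function names are hypothetical.

```python
import statistics


def z_scores(values, mu, sigma):
    """Standard (Z) scores: population mean and SD are taken as known."""
    return [(v - mu) / sigma for v in values]


def studentized_scores(values):
    """Scores scaled by sample estimates: mean and s (n - 1 denominator)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical sample and assumed population parameters.
sample = [98.0, 105.0, 110.0, 93.0, 121.0]
z = z_scores(sample, mu=100.0, sigma=15.0)
t = studentized_scores(sample)

for v, zi, ti in zip(sample, z, t):
    print(f"value={v:6.1f}  z={zi: .3f}  studentized={ti: .3f}")
```

By construction the studentized scores have sample mean 0 and sample standard deviation 1, whereas the Z scores need not, since they are measured against the assumed population values.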

russellpierce
  • 2
    [Wikipedia](https://en.wikipedia.org/wiki/Studentization) adds, "The term is also used for the standardisation of a higher-degree statistic by another statistic of the same degree: for example, an estimate of the third central moment would be standardised by dividing by the cube of the sample standard deviation." – Nick Stauner May 22 '14 at 15:48
  • So standardization is not possible if the population variance is not known? – 58485362 May 22 '14 at 15:49
  • 2
    I think it would be safer to say that Studentization is the form of standardization available if the population variance is unknown. This takes the form of a technical, terminological point of distinction rather than a misleading statement about the more general, broadly-used term. – Nick Stauner May 22 '14 at 15:56
  • This answer puzzles me because it contradicts the commonest use of the term "standardize," which is to recenter and rescale a dataset by its mean and standard deviation: not by any population mean and SD, which necessarily are unknown! How can we reconcile this use of the term with the distinction given in this answer? – whuber May 22 '14 at 16:52
  • 2
    @whuber: The context of the question was basic, so I gave a basic answer. Standard scores (Z) are computed in introductory stats and $\sigma$ is given to them. Sometimes you do actually have the population standard deviation (e.g. a non-missing data census of 10 people). – russellpierce May 22 '14 at 16:54
  • I believe most readers would prefer an objectively correct answer, even if it requires further explanation or is complex to set out, over one that is basic but unnecessarily limited, wrong, or misleading. – whuber May 22 '14 at 16:59
  • I think we have very different styles in this regard. I tend to try to provide answers that suit the level of the question asker and are accessible to 1st year grad stats students. You provide very detailed and technically astute answers that I can't even begin to fathom. That stats programs have decided to call $M\over{s}$ 'standardize' is a source of large confusion for students. I think @NickStauner's contribution largely clarified the dual use of the term 'standardize' as particular action, Z scoring, versus the general usage of the term nicely. – russellpierce May 22 '14 at 17:19
  • I would certainly welcome your answer! – russellpierce May 22 '14 at 17:28
  • I posted my first comment because I was thoroughly confused by @Nick's two comments. The quotation in the first one divides by a *sample* SD to "standardize" a value but the second one appears to contradict that by asserting it should be called "studentization." I don't think there's anything difficult or complicated about this issue, and the right answer might be that different authors use the terms differently, but the various answers and comments in this thread so far sure haven't resolved the question! I like Sergio's answer, technical though it is, for its clarity and authoritativeness. – whuber May 22 '14 at 18:47
  • @whuber: TBH, my formal education in statistics left me utterly naive about the distinction in question here. My comments lean entirely on Wikipedia, which may be wrong of course. Your first comment seems to contradict [the page on standard scores](https://en.wikipedia.org/wiki/Standard_score), which states, "The z-score is only defined if one knows the population parameters; if one only has a sample set, then the analogous computation with sample mean and sample standard deviation yields the Student's t-statistic." There would seem to be a common misconception here somewhere, but whose is it? – Nick Stauner May 22 '14 at 20:55
  • 1
    @NickStauner: There is a verb 'standardize' which corresponds to your first comment. One can (verb) standardize any metric. The noun I think of when I hear 'standardized' as it was used in the question (as a noun) is a 'standardized score'. However that term refers to one particular calculation, Z, and that does use the population parameter of standard deviation in the denominator. As I noted above stats programs have increased the befuddlement on people about what a 'standard score' is. – russellpierce May 22 '14 at 21:04
  • In short, studentization is a way of standardizing (verb) but is not the same as standardizing (a la standardized score; a noun). – russellpierce May 22 '14 at 21:06
  • Here I've actually committed the same crime as @Sergio because I imagined a 'score' in the question that was not present... only implied by the form of the first sentence of the body of the question. – russellpierce May 22 '14 at 21:06
  • 1
    @Nick You're right: I see that Wikipedia consistently makes a strong distinction (it's repeated at https://en.wikipedia.org/wiki/Standardizing). However, that distinction is not reflected in textbooks I have taught from, such as Devore's *Probability and Statistics*, which uses "standardize" for shifting and rescaling based on the sample statistics (or, as in Freedman *et al*'s *Statistics*, the term "standardized" is used in both senses). – whuber May 22 '14 at 21:07
  • 1
    To clarify my second comment, my suggestion (following my naive interpretation of Wikipedia, for lack of a more authoritative reference) was that studentization (can't decide whether to capitalize this again) could be considered a subtype of standardization. Because people like me sure as heck don't know the difference without consulting Wikipedia, I suspect we use "standardize" more broadly and without hesitation to refer to using whichever form of the *SD* is available (probably *s* most often, but maybe $\sigma$). Studentization at least seems to specify use of a sample *SD*, not $\sigma$. – Nick Stauner May 22 '14 at 21:11
  • Hence I don't think it's constructive to insist upon a definition of standardization that sets it apart from studentization in the sense of the OP's reply. But of course it's not up to me to decide what the words actually mean, and I'd be happy to see several independent references agree with one another and disagree with me...I'm a little pessimistic about that though. – Nick Stauner May 22 '14 at 21:14
  • 2
    @Nick That sounds like a good resolution, given that various authorities do use "standardization" broadly but none (AFAIK) ever use "studentize" in such a broad sense. – whuber May 22 '14 at 21:15
  • @whuber: It looks like the first book is for Engineering majors and the second, I'm guessing (based on the 'people who bought this book also bought' on Amazon) tends to be used in public health? I think you are right there is probably some terminological inconsistency across authors (and fields) here. Wikipedia probably just got slammed with more social science folk tweaking the article. Hopefully this long comment thread will serve as a warning sign to future readers. – russellpierce May 22 '14 at 21:15
  • 2
    @rpierce The second book (Freedman, Pisani, and Purves) has been around for about 40 years, through five (largely unchanged) editions, and started life as the text for UC Berkeley's intro stats course. It covers just about all conceivable fields, not just public health. On the other hand, one of its strengths is to avoid emphasizing small, meaningless, or overly technical distinctions, so although it is a good guide to statistics generally, it cannot be relied on for settling arcane matters. – whuber May 22 '14 at 21:19
3

I am very late in answering this question! But I couldn't find an answer in very simple language, so here is a humble attempt.

Why do we standardize? Imagine you have two models: one predicts craziness from the amount of time spent studying statistics, while the other predicts log(craziness) from the amount of time spent on statistics.

It would be hard to compare the residuals because they are in different units. So we standardize them (a similar idea to the Z-score).

Standardized residuals: the residuals divided by an estimate of their standard deviation. In general, an absolute value greater than 3 is a cause for concern.

We use these to investigate outliers in the model.

Studentized residuals: we use these to study the stability of the model.

The process is simple. We remove an individual case from the model and find the new predicted value for it. The difference between the original observed value and this new predicted value can be standardized by dividing by its standard error. This value is the studentized residual.

For more info, see Discovering Statistics Using R: http://www.statisticshell.com/html/dsur.html

NBhoyar
1

Wikipedia has a good overview at https://en.wikipedia.org/wiki/Normalization_(statistics):

Standard score $\frac{X - \mu}{\sigma}$ : Normalizing errors when population parameters are known. Works well for populations that are normally distributed.

Student's t-statistic $\frac{X - \overline{X}}{s}$ : Normalizing residuals when population parameters are unknown (estimated).

asmaier