
I just spent the last 2 hours trying to understand degrees of freedom. I read multiple interpretations, but one really stuck with me: it's the number of data points that are free to vary AFTER the parameter is calculated, and from then on it is used in place of the sample size in any calculation that involves said parameter. To my knowledge this is the most widely agreed-upon definition (it's even the definition used by Stack Exchange for the tag :D). That being said, I have 2 questions:
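The "free to vary" reading can be made concrete with a minimal sketch. It assumes a small made-up sample and a hypothetical helper, `last_point_given_mean`, neither of which comes from any source quoted here. Once the sample mean is held fixed, n - 1 of the values can be chosen arbitrarily, and the remaining one is forced by the constraint that the values sum to n times the mean.

```python
def last_point_given_mean(free_points, mean, n):
    """Return the single value that forces n points to have the given mean.

    Hypothetical helper for illustration: the first n - 1 points are chosen
    freely, and the constraint sum(points) == n * mean pins down the last one.
    """
    if len(free_points) != n - 1:
        raise ValueError("exactly n - 1 points must be supplied")
    return n * mean - sum(free_points)


# Made-up sample of n = 5 values.
sample = [2.0, 4.0, 6.0, 8.0, 10.0]
n = len(sample)
mean = sum(sample) / n

# Change the first four values arbitrarily; the fifth is then forced.
free_points = [1.0, 1.0, 7.0, 9.0]
forced = last_point_given_mean(free_points, mean, n)
new_sample = free_points + [forced]

print(forced)                        # 12.0
print(sum(new_sample) / n == mean)   # True: the mean is preserved
```

In this toy case, four values vary freely and one does not, which is the sense in which a sample of 5 with a fixed mean has 4 degrees of freedom.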

1- Why do some sources say the opposite: that degrees of freedom is the number of values that are effectively LOCKED IN once the parameter is calculated, so that the variance, for example, is divided by n - dof? In my opinion this runs contrary to common sense when compared with degrees of freedom in other sciences. (This convention is adopted by libraries such as NumPy in Python.)
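For reference, NumPy's `numpy.var` takes a `ddof` argument, documented as "Delta Degrees of Freedom", and divides by `N - ddof`. The sketch below mirrors that convention in plain Python with a hypothetical `variance` helper (not NumPy's implementation) and made-up data. Under this reading, `ddof` is the amount subtracted from n, so `n - ddof` may be what the first definition calls the degrees of freedom.

```python
import statistics


def variance(data, ddof=0):
    """Sum of squared deviations from the sample mean, divided by n - ddof.

    Hypothetical helper mirroring the "delta degrees of freedom" convention:
    ddof is the number subtracted from n, not the degrees of freedom left over.
    """
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more data points than ddof")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values

pop_style = variance(data, ddof=0)     # divisor n
sample_style = variance(data, ddof=1)  # divisor n - 1

# The standard library exposes the same two divisors under separate names.
print(pop_style, statistics.pvariance(data))
print(sample_style, statistics.variance(data))
```

Each printed pair should agree up to floating-point rounding, which suggests the two conventions describe the same divisor from opposite directions.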

2- Why does the concept of degrees of freedom exist only for samples? If the definition is correct and we divide by (n-1) because "the last data point does not contribute new information", what makes that data point important in the case of a population and negligible in the case of a sample?
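A small simulation may help frame this question. It is only a sketch under stated assumptions: the population is taken to be normal with mean 0 and variance 1, samples have size 5, and the function name `average_variance_estimates` is hypothetical. It compares three estimators averaged over many samples: deviations from the known population mean divided by n, deviations from the sample mean divided by n, and deviations from the sample mean divided by n - 1.

```python
import random


def average_variance_estimates(trials=20000, n=5, seed=0):
    """Average three variance estimators over repeated samples.

    Assumed setup for illustration: a normal population with mean 0 and
    variance 1. Returns the averages of
      (1) squared deviations from the known population mean, divided by n,
      (2) squared deviations from the sample mean, divided by n,
      (3) squared deviations from the sample mean, divided by n - 1.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss_known = sum((x - mu) ** 2 for x in xs)     # mean not estimated
        ss_sample = sum((x - xbar) ** 2 for x in xs)  # mean estimated from xs
        totals[0] += ss_known / n
        totals[1] += ss_sample / n
        totals[2] += ss_sample / (n - 1)
    return tuple(total / trials for total in totals)


known, biased, unbiased = average_variance_estimates()
print(known, biased, unbiased)
```

With these settings the first and third averages should land close to the true variance of 1, while the second should sit near (n - 1)/n = 0.8. That pattern is consistent with the n - 1 correction being tied to estimating the mean from the same data, not to samples being smaller than populations.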

user28324
  • Your question 2 is answered [here](https://stats.stackexchange.com/questions/406327/). – Ben Jan 08 '22 at 20:50
  • Could you make separate questions of it rather than asking multiple questions at once? – Tim Jan 08 '22 at 20:55
  • With respect to 3, populations don't have data points or sample sizes. Only samples from populations have data points and sample sizes. – jbowman Jan 08 '22 at 21:10
  • @Tim Sure. Since question 2 is a duplicate, I will make separate questions for 1 and 3, and delete this one right after I clear things up with jbowman; maybe I can get question 3 answered too :) – user28324 Jan 08 '22 at 21:31
  • @jbowman Can you please elaborate on what you mean by populations not having data points? They don't have sample sizes, but they do have a population size, and one can make the argument that after calculating the population mean, all of the population can vary except one point. – user28324 Jan 08 '22 at 21:34
  • @Ben Thanks! I can't understand a single word, but it's nice to know that I am thinking in the right direction. – user28324 Jan 08 '22 at 21:36
  • Please check out our [thread on DoF.](https://stats.stackexchange.com/questions/16921/how-to-understand-degrees-of-freedom) – whuber Jan 08 '22 at 22:57
  • No, you can't vary the population, because if you do it's a different population. The mean of the population is calculated by, writing simplistically, summing/integrating the values of the population times their probabilities, not by just averaging the population values. The population is not just "the biggest sample", which seems to be along the lines of your thinking. It's, effectively, a list of the values the sample elements can take, but it is not itself a sample. This list can be finite, countably infinite, or uncountably infinite in "length" - think of the Normal dist'n. – jbowman Jan 08 '22 at 23:03
  • @jbowman I guess the problem here is my understanding of d.o.f. The way I see it, it's the number of data points that "add new information" when calculating a parameter or statistic. – user28324 Jan 09 '22 at 05:24
  • @Ben I am curious, what courses should I study to learn stuff like this? It seems like advanced linear algebra and mathematical statistics. – user28324 Jan 09 '22 at 11:05
  • Advanced linear algebra would probably cover it. – Ben Jan 09 '22 at 11:49
  • @Ben Theorems about the dimensions of images and kernels of linear maps are usually considered the most basic parts of linear algebra and therefore we can be sure they are covered in any linear algebra text. – whuber Jan 09 '22 at 18:43
  • @whuber Not if you're taking linear algebra for engineering applications :D – user28324 Jan 11 '22 at 11:20
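jbowman's description of the population mean as a probability-weighted sum can be sketched for a finite case. The example population, a fair six-sided die, is an assumption chosen for illustration and does not appear in the thread. Nothing is estimated from data here, so no n or n - 1 divisor appears at all.

```python
from fractions import Fraction

# Assumed example population: a fair six-sided die, given as value -> probability.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Population mean: sum of each value times its probability.
population_mean = sum(x * p for x, p in distribution.items())

# Population variance: probability-weighted squared deviation from that mean.
population_variance = sum(
    (x - population_mean) ** 2 * p for x, p in distribution.items()
)

print(population_mean)      # 7/2
print(population_variance)  # 35/12
```

Changing any probability or face value defines a different distribution, which matches the comment's point that a population cannot be "varied" the way sample values can.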

0 Answers