For a system, dispersion measures how the population deviates from the mean. Intuitively, the greater the dispersion in the system, the greater the disorder, i.e. the entropy. A jar containing only red marbles has 0 dispersion (if we measure dispersion by color) as well as 0 entropy. However, the following scenario confuses me in terms of dispersion and entropy:

Say a class has 2 students, and both obtain 10 marks in some test. The class average is then 10 and the variance (dispersion) is 0. The entropy, on the other hand, comes out non-zero, which is counterintuitive: $$\mu=\sum_i p(x_i)\,x_i=0.5(10)+0.5(10)=10$$ $$\sigma^2=\frac{\sum_i (x_i-\mu)^2}{N}=\frac{(10-10)^2+(10-10)^2}{2}=0$$ $$H(X)=-\sum_i p(x_i)\log_2 p(x_i)=0.5\log_2(2)+0.5\log_2(2)=1$$

GENIVI-LEARNER
  • Have a look at this https://stats.stackexchange.com/questions/483535/how-to-include-the-observed-values-not-just-their-probabilities-in-information/485555#485555 where it is shown that variance can be seen as a kind of entropy! How useful that is, is another question. – kjetil b halvorsen Aug 27 '21 at 14:48

1 Answer

You are trying to calculate the entropy from the sample (your mean and variance are likewise the sample mean and sample variance). However, entropy is a function of the distribution, in this case the PMF. Let $X$ be the RV denoting the grade a randomly chosen student gets. You need the PMF of $X$ to calculate the actual entropy, but you can still estimate it using the empirical PMF, which is: $$\hat p_X(x)=\begin{cases}1, & x=10\\0, & \text{otherwise}\end{cases}$$ The empirical entropy is then: $$\hat H(\hat p_X)=-1\times\log 1=0$$
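
As a rough illustration of the empirical-PMF estimate above, here is a minimal Python sketch. The helper names `empirical_pmf` and `entropy_bits` are made up for this example, and base-2 logarithms are an assumption (the base only changes the units).

```python
from collections import Counter
from math import log2


def empirical_pmf(samples):
    """Map each distinct observed value to its relative frequency."""
    counts = Counter(samples)
    n = len(samples)
    return {value: count / n for value, count in counts.items()}


def entropy_bits(pmf):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    return sum(-p * log2(p) for p in pmf.values() if p > 0)


grades = [10, 10]                 # both students scored 10
pmf = empirical_pmf(grades)       # all mass sits on the single value 10
print(pmf)
print(entropy_bits(pmf))          # zero: one possible value, no uncertainty

# For contrast, two different grades give one bit of entropy.
print(entropy_bits(empirical_pmf([10, 20])))
```

The key step is that probabilities are attached to distinct grade values, not to individual students, so two identical grades collapse into one outcome with probability 1.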

gunes
  • I didn't quite understand it. Why is my computation of entropy for the sample wrong? Is it possible to elaborate on the answer a bit? – GENIVI-LEARNER Apr 30 '20 at 14:11
  • Sure, it's not correct because you're not using the correct probability mass function. What is your distribution of grades in the classroom? You're saying that the grade is 10 with 1/2 prob, and the grade is again 10 with again 1/2 prob. – gunes Apr 30 '20 at 14:14
  • But I used the same 1/2 prob for both in the mean formula and it checks out. Right? – GENIVI-LEARNER Apr 30 '20 at 14:20
  • It'd check out as well when you do $1/3\times10+1/3\times10+1/3\times10=10$, but would it be correct? Define your RV, e.g. X = grade of student. And then, define your distribution. I don't see two different values of $X$ with probabilities 1/2. – gunes Apr 30 '20 at 14:21
  • I see. So my mean formula is wrong? – GENIVI-LEARNER Apr 30 '20 at 14:22
  • Yours is the sample mean, which is $\hat \mu=\frac{1}{n}\sum_{i=1}^n x_i$; it's not the expected value, because you don't have the distribution. – gunes Apr 30 '20 at 14:23
  • Essentially both 10 & 10 are distributed uniformly. So if we have 3 students, all with grades of 10, then the uniform distribution will assign 1/3 to each, right? – GENIVI-LEARNER Apr 30 '20 at 14:23
  • I am quite eager to know why we can't say that the grades of 10 are uniformly distributed. – GENIVI-LEARNER May 01 '20 at 13:41
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/107454/discussion-between-gunes-and-genivi-learner). – gunes May 01 '20 at 13:42
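
The point debated in the comments above (why the 1/2 weights "check out" for the mean but not for entropy) can be checked numerically. This is only a sketch: the names `weighted_mean`, `entropy_bits`, `per_student`, and `per_value` are invented for illustration, and base-2 logs are assumed.

```python
from math import log2


def weighted_mean(pairs):
    """Sum of weight * value over (weight, value) pairs."""
    return sum(p * x for p, x in pairs)


def entropy_bits(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return sum(-p * log2(p) for p in probs if p > 0)


# Weights indexed by student: two entries that share the value 10.
per_student = [(0.5, 10), (0.5, 10)]
# Weights indexed by distinct grade value: an actual PMF of X.
per_value = [(1.0, 10)]

# The mean is the same either way, because splitting the mass on a
# single value into pieces does not change sum(p * x).
print(weighted_mean(per_student), weighted_mean(per_value))

# Entropy depends only on the probabilities, so it does change:
# treating the two students as distinct outcomes gives 1 bit.
print(entropy_bits([p for p, _ in per_student]))
print(entropy_bits([p for p, _ in per_value]))
```

Under these assumptions, the non-zero entropy in the question is the entropy of "which student was picked", not of the grade $X$.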