Questions tagged [mean]

The expected value of a random variable; or a location measure for a sample.

The mean of a probability distribution is also called its expected value. For a discrete random variable $X$, it is defined as:

$$E[X] = \mu = \sum_{x}x P(X=x)$$

where $P(X=x)$ is the probability mass function and the sum is taken over all values $x$ that $X$ can take. For a continuous random variable, replace the sum with an integral and the probability mass function with the probability density function.
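
As a minimal sketch of the discrete definition, the Python below computes $E[X]$ for a fair six-sided die. The die example and the helper name `expected_value` are illustrative assumptions, not part of the tag description. Exact fractions are used so that the check that the probabilities sum to 1 is not affected by floating-point rounding.

```python
from fractions import Fraction


def expected_value(pmf):
    """Return E[X] = sum over x of x * P(X = x) for a discrete pmf.

    `pmf` maps each value x in the support to its probability P(X = x).
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: a fair six-sided die, each face with probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
mu = expected_value(die_pmf)
print(mu)  # 7/2
```

Note that the expected value, 3.5, is not itself a value the die can show.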

The mean of a sample of points $(x_1, ..., x_n)$, also known as the sample mean, is the arithmetic average of all points, defined as:

$$\bar{x} = \frac1n \sum_{i=1}^{n}x_i$$
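
The sample mean formula translates directly into code. The sketch below uses made-up data and a hypothetical helper `sample_mean`, and compares the result with `statistics.mean` from the Python standard library.

```python
import statistics


def sample_mean(xs):
    """Return the arithmetic average (1/n) * sum of x_i for a non-empty sample."""
    n = len(xs)
    if n == 0:
        raise ValueError("sample mean is undefined for an empty sample")
    return sum(xs) / n


# Hypothetical sample of n = 5 points.
sample = [2.0, 4.0, 4.0, 5.0, 10.0]
x_bar = sample_mean(sample)
print(x_bar)  # 5.0

# The standard library gives the same value for this sample.
assert x_bar == statistics.mean(sample)
```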

In the literature, $\mu$ is reserved for the true population mean, while $\bar{x}$ denotes the mean of a sample of points drawn from that population.
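
To illustrate the distinction between $\mu$ and $\bar{x}$, here is a small simulation sketch. It assumes a population of fair die rolls, for which $\mu = 3.5$ is known exactly; the seed, the sample sizes, and the function name `simulate_sample_mean` are arbitrary choices made for illustration. The population mean is fixed, while each sample mean is an estimate that depends on the draws.

```python
import random

MU = 3.5  # true population mean of a fair six-sided die


def simulate_sample_mean(n, rng):
    """Draw n die rolls from the population and return their sample mean."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    x_bar = simulate_sample_mean(n, rng)
    # x_bar varies from sample to sample; MU does not.
    print(f"n={n:>6}  x_bar={x_bar:.4f}  |x_bar - mu|={abs(x_bar - MU):.4f}")
```

With larger samples the sample mean tends to land closer to $\mu$, although any single run can deviate.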

2514 questions
280 votes, 16 answers

Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean?

It seems that through various related questions here, there is consensus that the "95%" part of what we call a "95% confidence interval" refers to the fact that if we were to exactly replicate our sampling and CI-computation procedures many times,…
Mike Lawrence
128 votes, 10 answers

Why does the Cauchy distribution have no mean?

From the distribution's density function we could identify a mean (=0) for the Cauchy distribution, just as the graph below shows. But why do we say the Cauchy distribution has no mean?
108 votes, 4 answers

Difference between standard error and standard deviation

I'm struggling to understand the difference between the standard error and the standard deviation. How are they different and why do you need to measure the standard error?
louis xie
98 votes, 5 answers

Mean absolute error OR root mean squared error?

Why use Root Mean Squared Error (RMSE) instead of Mean Absolute Error (MAE)? Hi, I've been investigating the error generated in a calculation. I initially calculated the error as a Root Mean Normalised Squared Error. Looking a little closer, I…
user1665220
95 votes, 8 answers

If mean is so sensitive, why use it in the first place?

It is a known fact that the median is resistant to outliers. If that is the case, when and why would we use the mean in the first place? One thing I can think of is to understand the presence of outliers, i.e. if the median is far from the mean,…
Legend
59 votes, 2 answers

How should one interpret the comparison of means from different sample sizes?

Take the case of book ratings on a website. Book A is rated by 10,000 people with an average rating of 4.25 and the variance $\sigma = 0.5$. Similarly Book B is rated by 100 people and has a rating of 4.5 with $\sigma = 0.25$. Now because of the…
PhD
58 votes, 6 answers

How can a distribution have infinite mean and variance?

It would be appreciated if the following examples could be given: A distribution with infinite mean and infinite variance. A distribution with infinite mean and finite variance. A distribution with finite mean and infinite variance. A distribution…
49 votes, 6 answers

Is Amazon's "average rating" misleading?

If I understand correctly, book ratings on a 1-5 scale are Likert scores. That is, a 3 for me may not necessarily be a 3 for someone else. It's an ordinal scale IMO. One shouldn't really average ordinal scales but can definitely take the mode,…
PhD
48 votes, 14 answers

Why is median age a better statistic than mean age?

If you look at Wolfram Alpha or this Wikipedia page, List of countries by median age, clearly the median seems to be the statistic of choice when it comes to ages. I am not able to explain to myself why the arithmetic mean would be a worse statistic.…
Lazer
48 votes, 5 answers

What can we say about population mean from a sample size of 1?

I am wondering what we can say, if anything, about the population mean, $\mu$ when all I have is one measurement, $y_1$ (sample size of 1). Obviously, we'd love to have more measurements, but we can't get them. It seems to me that since the sample…
thedu
40 votes, 14 answers

Regression to the mean vs gambler's fallacy

On the one hand, I have regression to the mean, and on the other hand I have the gambler's fallacy. The gambler's fallacy is defined by Miller and Sanjurjo (2019) as “the mistaken belief that random sequences have a systematic tendency towards…
Luis P.
37 votes, 5 answers

Will the fact that my Italian son is going to attend a primary school change the expected number of Italian children to be present in his class?

This is a question stemming from a real-life situation, for which I have been genuinely puzzled about its answer. My son is due to start primary school in London. As we are Italian, I was curious to know how many Italian children are already…
jj90213
36 votes, 2 answers

Is there a reliable nonparametric confidence interval for the mean of a skewed distribution?

Very skewed distributions such as the log-normal do not result in accurate bootstrap confidence intervals. Here is an example showing that the left and right tail areas are far from the ideal 0.025 no matter which bootstrap method you try in…
Frank Harrell
35 votes, 3 answers

Why does minimizing the MAE lead to forecasting the median and not the mean?

From the Forecasting: Principles and Practice textbook by Rob J Hyndman and George Athanasopoulos, specifically the section on accuracy measurement: A forecast method that minimizes the MAE will lead to forecasts of the median, while minimizing…
Brans Ds
35 votes, 5 answers

What is the difference between "mean value" and "average"?

Wikipedia explains: For a data set, the mean is the sum of the values divided by the number of values. This definition however corresponds to what I call "average" (at least that's what I remember learning). Yet Wikipedia once more quotes: There…
neydroydrec