Questions tagged [median]

The median is the value below which half of the data, or half of the probability distribution, lies. When the sample size is odd, the median is the 'middle' value of the ordered sample.

The median is the value above which, and below which, half the data lies. If the number of observations is odd, the median is the middle value of the ordered sample; if the number of observations is even, the median is conventionally taken to be the mean of the two middle values of the ordered sample.

  • The median can be considered an appropriate measure of central tendency for an ordinal level variable.
  • It is often considered a more representative measure than the mean, even for interval data, when the distribution is highly skewed.
  • The median is a more robust (but less efficient) measure than the mean when the data may be contaminated.
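
The odd/even rule and the robustness point above can be illustrated with a minimal sketch. The function names `sample_median` and `sample_mean` and the two small data sets are hypothetical, chosen only for illustration; the even-sample case uses the "mean of the two middle values" convention described above.

```python
def sample_median(values):
    """Median of a non-empty sequence of numbers (illustrative helper)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd sample size: the single middle value of the ordered sample.
        return ordered[mid]
    # Even sample size: the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def sample_mean(values):
    """Arithmetic mean, included only for comparison with the median."""
    return sum(values) / len(values)


# Made-up data: the second sample replaces one value with a gross outlier.
clean = [2, 3, 5, 7, 11]
contaminated = [2, 3, 5, 7, 1100]

print(sample_median(clean), sample_mean(clean))
print(sample_median(contaminated), sample_mean(contaminated))
```

In this example the single contaminated value moves the mean substantially while the median stays at the same middle observation, which is the sense in which the median is robust.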
643 questions
141 votes, 5 answers

Percentile vs quantile vs quartile

What is the difference between the three terms below? Percentile, quantile, quartile.
luciano
95 votes, 8 answers

If mean is so sensitive, why use it in the first place?

It is a known fact that median is resistant to outliers. If that is the case, when and why would we use the mean in the first place? One thing I can think of perhaps is to understand the presence of outliers i.e. if the median is far from the mean,…
Legend
76 votes, 5 answers

Central limit theorem for sample medians

If I calculate the median of a sufficiently large number of observations drawn from the same distribution, does the central limit theorem state that the distribution of medians will approximate a normal distribution? My understanding is that this is…
54 votes, 10 answers

What is a good algorithm for estimating the median of a huge read-once data set?

I'm looking for a good algorithm (meaning minimal computation, minimal storage requirements) to estimate the median of a data set that is too large to store, such that each value can only be read once (unless you explicitly store that value). There…
PeterR
48 votes, 14 answers

Why is median age a better statistic than mean age?

If you look at Wolfram Alpha or this Wikipedia page, List of countries by median age, clearly the median seems to be the statistic of choice when it comes to ages. I am not able to explain to myself why the arithmetic mean would be a worse statistic.…
Lazer
42 votes, 5 answers

Confidence interval for median

I have to find a 95% C.I. on the median and other percentiles. I don't know how to approach this. I mainly use R as a programming tool.
Dominic Comtois
38 votes, 7 answers

Is there an accepted definition for the median of a sample on the plane, or higher ordered spaces?

If so, what? If not, why not? For a sample on the line, the median minimizes the total absolute deviation. It would seem natural to extend the definition to R², etc., but I've never seen it. But then, I've been out in left field for a long time.
phv3773
36 votes, 2 answers

Is there a reliable nonparametric confidence interval for the mean of a skewed distribution?

Very skewed distributions such as the log-normal do not result in accurate bootstrap confidence intervals. Here is an example showing that the left and right tail areas are far from the ideal 0.025 no matter which bootstrap method you try in…
Frank Harrell
35 votes, 3 answers

Why does minimizing the MAE lead to forecasting the median and not the mean?

From the Forecasting: Principles and Practice textbook by Rob J Hyndman and George Athanasopoulos, specifically the section on accuracy measurement: A forecast method that minimizes the MAE will lead to forecasts of the median, while minimizing…
Brans Ds
32 votes, 3 answers

Why does basic hypothesis testing focus on the mean and not on the median?

In basic under-grad statistics courses, students are (usually?) taught hypothesis testing for the mean of a population. Why is it that the focus is on the mean and not on the median? My guess is that it is easier to test the mean due to the central…
nafrtiti
29 votes, 2 answers

How to construct a 95% confidence interval of the difference between medians?

My problem: parallel group randomized trial having a very right-skewed distribution of the primary outcome. I do not want to assume normality and use normal-based 95% CIs (i.e. using 1.96 X SE). I am comfortable expressing the measure of central…
pmgjones
29 votes, 1 answer

When, if ever, is a median statistic a sufficient statistic?

I came across a casual remark on The Chemical Statistician that a sample median could often be a choice for a sufficient statistic but, besides the obvious case of one or two observations where it equals the sample mean, I cannot think of another…
Xi'an
27 votes, 1 answer

What are the multidimensional versions of the median?

What are the multidimensional versions of the median and what are their pros and cons? I confess this doesn't have a single answer, but I think it is a useful question to ask and will be a benefit to others as well. How stable it is (i.e. how many…
John Robertson
25 votes, 4 answers

Why does the mean tend to be more stable in different samples than the median?

Section 1.7.2 of Discovering Statistics Using R by Andy Field, et al., while listing virtues of mean vs median, states: ... the mean tends to be stable in different samples. This after explaining median's many virtues, e.g. ... The median is…
Alok Lal
24 votes, 2 answers

Is it possible to accumulate a set of statistics that describes a large number of samples such that I can then produce a boxplot?

I must clarify immediately that I am a practicing software developer, not a statistician, and that my college stats class was a very long time ago… That said, I would like to know if there is a method for accumulating a set of descriptive statistics…