
I have a large amount of data. If I have to represent the data by a measure of central tendency, which one should I use: the mean or the median? The standard deviation is also very large. Thanks.

  • What do you mean by "better"? – Anna SdTC Feb 07 '17 at 09:32
  • I think you are asking the wrong question. You first have to ask yourself what you want. – Learn_and_Share Feb 07 '17 at 09:36
  • @AnnaSdTC I mean which one is better, the mean or the median, given that the data has a huge standard deviation. – Sandra Vian Feb 07 '17 at 09:53
  • @MedNait I want to represent my data, but I am confused about which one I should use. – Sandra Vian Feb 07 '17 at 09:54
  • "Better" has no formal definition in mathematics or statistics. What do you want to represent? Do you have outliers? How skewed is your distribution? Is the data a discrete variable or continuous? Do you want the measure of central tendency to take one of the possible values of the data? – Anna SdTC Feb 07 '17 at 10:00
  • There are other options too, like the mode (as @AnnaSdTC hints). Why do you think that (a) your standard deviation is huge and (b) the size of the standard deviation is relevant here? – mdewey Feb 07 '17 at 10:07
  • When I compute the mean and the standard deviation is also huge, is it okay for me to use the mean, or do I have to use another measure of central tendency like the median or mode? @mdewey – Sandra Vian Feb 07 '17 at 10:15
  • @AnnaSdTC I have outliers. The data is continuous. Yes, I want the measure of central tendency to take one of the possible values of the data. – Sandra Vian Feb 07 '17 at 10:17
  • Since the mean is very sensitive to outliers and rarely takes one of the values of the data (e.g., the mean number of kids per woman can be 2.3 or any other non-integer), you may want to use the median. – Anna SdTC Feb 07 '17 at 10:18
  • @AnnaSdTC Thanks a lot. I want to ask something: if I want the measure of central tendency to take one of the possible values of the data, can I decide that based on the standard deviation? – Sandra Vian Feb 07 '17 at 10:21
  • The standard deviation does not measure central tendency but deviation. – Anna SdTC Feb 07 '17 at 10:22
  • @AnnaSdTC Oooh. I get it. Thanks. It really helps me :) – Sandra Vian Feb 07 '17 at 10:23
  • If you have outliers you may prefer to summarise using the median and inter-quartile range rather than mean and sd. – mdewey Feb 07 '17 at 12:16

1 Answer

As the comments have pointed out, the mean and the median answer different questions. Here is an example:

There are 4 people in a very small country. The only available food is imported pizza; one of the inhabitants eats one pizza a day, but the other three don't eat at all. The mean consumption is 0.25 pizzas/day/inhabitant, but the median is 0 pizzas/day/inhabitant.

The local food dealer needs to know how many pizzas to import into the country. He would take the mean (0.25) and multiply it by the population (4) to get the right number of pizzas to import every day (1).

The local health organisation is likely more interested in knowing how well fed the population is. The median (0) shows that at least half of the population is starving, while the mean (0.25) wouldn't show it.
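The pizza example can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names are made up for illustration.

```python
from statistics import mean, median

# Pizzas eaten per day by each of the 4 inhabitants in the example
pizzas_per_day = [1, 0, 0, 0]

avg = mean(pizzas_per_day)    # 0.25 pizzas/day/inhabitant
mid = median(pizzas_per_day)  # 0 pizzas/day/inhabitant

# The dealer's question: total pizzas to import per day
total_to_import = avg * len(pizzas_per_day)

print(f"mean={avg}, median={mid}, import per day={total_to_import}")
```

The mean recovers the total (what the dealer needs), while the median describes what a typical inhabitant eats (what the health organisation needs).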

In general, when we are interested in aggregate values, the mean is more useful; when we are more interested in typical values, the median is often more informative.

And of course, no single number can summarize all the information in a distribution. Sometimes the mean or the median will be enough for a given purpose; otherwise we may need the standard deviation, histograms, percentiles and so on.
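As a rough illustration of why several summaries can be worth reporting together, here is a small sketch that computes the mean and standard deviation alongside the median and inter-quartile range. The data set is invented (eight small values plus one outlier), and `summarize` is a hypothetical helper, not something from the question.

```python
from statistics import mean, median, stdev, quantiles

def summarize(values):
    """Return mean/sd and median/IQR for a list of numbers."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "sd": stdev(values),
        "median": median(values),
        "iqr": q3 - q1,
    }

# Invented data: mostly small values and a single large outlier
data = [2, 3, 3, 4, 4, 5, 5, 6, 250]
summary = summarize(data)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With data like this, the single outlier pulls the mean and standard deviation far above the bulk of the values, while the median and inter-quartile range stay close to where most of the data sits.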

Pere