Questions tagged [skewness]

Skewness measures (or refers to) a degree of asymmetry in the distribution of a variable.

Skewness usually refers to a standardized third-order measure of asymmetry in a distribution: that is, the third central moment divided by the cube of the standard deviation. Histograms of positively skewed distributions will typically have a long "tail" of relatively high values; those of negatively skewed distributions will usually have a long tail of relatively low values.

More generally, and much more qualitatively, "skew" is sometimes used synonymously with "asymmetric". Note, however, that a distribution can be asymmetric but have zero skewness.
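As a small illustration of that last point, here is a sketch using a made-up dataset (the values are an assumption chosen for the example, not taken from any question on this page). Its third central moment is exactly zero, yet it is not symmetric about its mean:

```python
from collections import Counter

# Hypothetical dataset: one -3, five -1 and four 2 (chosen for illustration).
data = [-3] + [-1] * 5 + [2] * 4

n = len(data)
mean = sum(data) / n                           # 0.0
m2 = sum((x - mean) ** 2 for x in data) / n    # second central moment
m3 = sum((x - mean) ** 3 for x in data) / n    # third central moment
skew = m3 / m2 ** 1.5                          # moment-based skewness

# A dataset is symmetric about its mean if reflecting every point
# through the mean reproduces the same multiset of values.
reflected = Counter(2 * mean - x for x in data)
is_symmetric = reflected == Counter(data)

print(skew, is_symmetric)
```

The cubes sum to -27 - 5 + 32 = 0, so the skewness is zero, while the reflected values (3, 1, -2) clearly differ from the original ones.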

The usual measure of skewness for a dataset $x_i$ ($i=1,2,\ldots,n$) with mean $\bar{x}$ is given by:

$$\frac{ \frac{1}{n} \sum_{i = 1}^{n}{\left(x_i - \bar{x}\right)^3}}{\left( \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 \right)^{\frac{3}{2}}}$$
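The formula above can be sketched directly in plain Python. The function name `sample_skewness` and the example data are assumptions made for illustration; this is a minimal sketch of the moment-based measure, not a reference implementation:

```python
def sample_skewness(xs):
    """Moment-based skewness: third central moment over the
    second central moment raised to the power 3/2 (both use 1/n)."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


right_skewed = [1, 1, 1, 5]                # long tail of high values
left_skewed = [-x for x in right_skewed]   # mirror image
symmetric = [1, 2, 3]

print(sample_skewness(right_skewed))   # positive
print(sample_skewness(left_skewed))    # negative, same magnitude
print(sample_skewness(symmetric))      # zero
```

Note that this uses the $1/n$ divisor in both moments, as in the formula; bias-adjusted variants of sample skewness rescale the result and will give slightly different numbers for small $n$.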

655 questions
58
votes
3 answers

What does standard deviation tell us in non-normal distribution

In a normal distribution, the 68-95-99.7 rule gives the standard deviation a lot of meaning, but what would the standard deviation mean in a non-normal distribution (multimodal or skewed)? Would all data values still fall within 3 standard deviations? Do…
Zuhaib Ali
34
votes
6 answers

Can somebody offer an example of a unimodal distribution which has a skewness of zero but which is not symmetrical?

In May 2010 Wikipedia user Mcorazao added a sentence to the skewness article that "A zero value indicates that the values are relatively evenly distributed on both sides of the mean, typically but not necessarily implying a symmetric distribution."…
Andy McKenzie
32
votes
3 answers

Outlier Detection on skewed Distributions

Under a classical definition of an outlier as a data point more than 1.5 × IQR beyond the upper or lower quartile, there is an assumption of a non-skewed distribution. For skewed distributions (Exponential, Poisson, Geometric, etc.) is the best way to…
31
votes
4 answers

Does mean=mode imply a symmetric distribution?

I know this question has been asked with the case mean=median, but I did not find anything related to mean=mode. If the mode equals the mean, can I always conclude this is a symmetric distribution? Will I be forced to know also the median for this…
tzipy
30
votes
5 answers

What is the reason the log transformation is used with right-skewed distributions?

I once heard that the log transformation is the most popular one for right-skewed distributions in linear regression or quantile regression. I would like to know whether there is any reason underlying this statement. Why is the log transformation suitable for…
user3269
28
votes
12 answers

Real life examples of distributions with negative skewness

Inspired by "real-life examples of common distributions", I wonder what pedagogical examples people use to demonstrate negative skewness? There are many "canonical" examples of symmetric or normal distributions used in teaching - even if ones like…
Silverfish
24
votes
3 answers

t-test on highly skewed data

I have a data set with tens of thousands of observations of medical cost data. This data is highly skewed to the right and has a lot of zeros. It looks like this for two sets of people (in this case two age bands with > 3000 obs each): Min. 1st…
Chris
24
votes
2 answers

How to handle the difference between the distribution of the test set and the training set?

I think one basic assumption of machine learning or parameter estimation is that the unseen data come from the same distribution as the training set. However, in some practical cases, the distribution of the test set will almost certainly be different from…
23
votes
3 answers

How can I calculate the confidence interval of a mean in a non-normally distributed sample?

How can I calculate the confidence interval of a mean in a non-normally distributed sample? I understand bootstrap methods are commonly used here, but I am open to other options. While I am looking for a non-parametric option, if someone can…
23
votes
4 answers

How to tell if my data distribution is symmetric?

I know that if the median and mean are approximately equal then this means there is a symmetric distribution but in this particular case I'm not certain. The mean and median are quite close (only 0.487m/gall difference) which would lead me to say…
user72943
22
votes
7 answers

Why is skewed data not preferred for modelling?

Most of the time when people talk about variable transformations (for both predictor and response variables), they discuss ways to treat skewness of the data (like the log transformation, the Box-Cox transformation, etc.). What I am not able to…
saurav shekhar
22
votes
2 answers

Non-normal distributions with zero skewness and zero excess kurtosis?

A mostly theoretical question. Are there any examples of non-normal distributions that have their first four moments equal to those of the normal distribution? Could they exist in theory?
21
votes
4 answers

Should the mean be used when data are skewed?

Often introductory applied statistics texts distinguish the mean from the median (often in the context of descriptive statistics and motivating the summarization of central tendency using the mean, median and mode) by explaining that the mean is…
Alexis
21
votes
4 answers

Transformation to increase kurtosis and skewness of normal r.v

I'm working on an algorithm that relies on the fact that observations $Y$s are normally distributed, and I would like to test the robustness of the algorithm to this assumption empirically. To do this, I was looking for a sequence of transformations…
20
votes
3 answers

How to assess skewness from a boxplot?

How to decide skewness by looking at a boxplot built from this data: 340, 300, 520, 340, 320, 290, 260, 330? One book says, "If the lower quartile is farther from the median than the upper quartile, then the distribution is negatively skewed."…
JerryW