Questions tagged [descriptive-statistics]

Descriptive statistics summarize features of a sample, such as the mean and standard deviation, the median and quartiles, and the maximum and minimum. With multiple variables, they may include correlations and crosstabs. They can also include visual displays such as boxplots, histograms and scatterplots.

Descriptive statistics summarize features of a sample.

Common descriptive statistics include the mean and standard deviation, particular quantiles such as the median and quartiles, the maximum and minimum, the range and interquartile range, and five-number summaries. With multiple variables, they may also include correlations and crosstabs.

Descriptive statistics may include visual displays such as boxplots, histograms and scatterplots.

1646 questions
162
votes
5 answers

What's the difference between Normalization and Standardization?

At work we were discussing this as my boss has never heard of normalization. In Linear Algebra, Normalization seems to refer to the dividing of a vector by its length. And in statistics, Standardization seems to refer to the subtraction of a mean…
Chris
  • 1,629
  • 3
  • 11
  • 3
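To make the two senses in this question concrete, here is a minimal sketch. It assumes "normalization" means scaling a vector to unit Euclidean length and "standardization" means converting to z-scores using the sample standard deviation. The function names and the data are made up for illustration.

```python
import math
import statistics


def normalize(vector):
    """Scale a vector to unit Euclidean length (the linear-algebra sense)."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


def standardize(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


data = [2.0, 4.0, 4.0, 6.0]  # illustrative values, not from the question
unit = normalize(data)       # has Euclidean length 1
z = standardize(data)        # has mean 0 and sample standard deviation 1
```

The two operations answer different needs: the first fixes the length of the vector, the second fixes the location and spread of the values.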
141
votes
5 answers

Percentile vs quantile vs quartile

What is the difference between the three terms below? percentile quantile quartile
luciano
  • 12,197
  • 30
  • 87
  • 119
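A short sketch of how the three terms relate, using Python's standard `statistics.quantiles`: quartiles are the quantiles that cut the data into 4 groups, and percentiles are the quantiles that cut it into 100. The data are made up, and the cut points depend on the interpolation method (the function's default is used here).

```python
import statistics

# Illustrative sorted sample with 11 values (not from the question).
data = [3, 7, 8, 12, 13, 14, 18, 21, 22, 27, 30]

quartiles = statistics.quantiles(data, n=4)      # 3 cut points, 4 groups
percentiles = statistics.quantiles(data, n=100)  # 99 cut points, 100 groups

median = statistics.median(data)
second_quartile = quartiles[1]         # the 0.5 quantile
fiftieth_percentile = percentiles[49]  # also the 0.5 quantile
```

For this sample the median, the second quartile and the 50th percentile coincide, which is the sense in which quartiles and percentiles are special cases of quantiles.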
90
votes
5 answers

How to 'sum' a standard deviation?

I have a monthly average for a value and a standard deviation corresponding to that average. I am now computing the annual average as the sum of monthly averages. How can I represent the standard deviation for the summed average? For example…
klonq
  • 1,167
  • 2
  • 9
  • 9
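One common way to approach this, sketched below, relies on the fact that variances (not standard deviations) add when the summed quantities are independent. Independence between months is an assumption here and may not hold for the asker's data. The function name and the numbers are hypothetical.

```python
import math


def sd_of_sum(sds):
    """Standard deviation of a sum of independent quantities.

    Assumes independence: the variances add, so the combined standard
    deviation is the square root of the sum of squared deviations.
    """
    return math.sqrt(sum(sd ** 2 for sd in sds))


monthly_sds = [3.0, 4.0]  # two illustrative monthly standard deviations
combined = sd_of_sum(monthly_sds)
```

If the months are correlated, covariance terms would have to be added to the sum of variances, so this sketch would understate or overstate the spread.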
69
votes
8 answers

What are good basic statistics to use for ordinal data?

I have some ordinal data gained from survey questions. In my case they are Likert style responses (Strongly Disagree-Disagree-Neutral-Agree-Strongly Agree). In my data they are coded as 1-5. I don't think means would mean much here, so what basic…
PaulHurleyuk
  • 1,549
  • 3
  • 16
  • 18
66
votes
12 answers

What does orthogonal mean in the context of statistics?

In other contexts, orthogonal means "at right angles" or "perpendicular". What does orthogonal mean in a statistical context? Thanks for any clarifications.
pmgjones
  • 5,543
  • 8
  • 36
  • 36
58
votes
5 answers

Correlations between continuous and categorical (nominal) variables

I would like to find the correlation between a continuous (dependent variable) and a categorical (nominal: gender, independent variable) variable. Continuous data is not normally distributed. Before, I had computed it using the Spearman's $\rho$.…
54
votes
8 answers

Modern successor to Exploratory Data Analysis by Tukey?

I've been reading Tukey's book "Exploratory Data Analysis". Being written in 1977, the book emphasizes paper/pencil methods. Is there a more 'modern' successor which takes into account that we can now instantaneously plot large data sets?
46
votes
3 answers

Empirical relationship between mean, median and mode

For a unimodal distribution that is moderately skewed, we have the following empirical relationship between the mean, median and mode: $$ \text{(Mean - Mode)}\sim 3\,\text{(Mean - Median)} $$ How was this relationship derived? Did Karl Pearson…
40
votes
7 answers

Why shouldn't the denominator of the covariance estimator be n-2 rather than n-1?

The denominator of the (unbiased) variance estimator is $n-1$ as there are $n$ observations and only one parameter is being estimated. $$ \mathbb{V}\left(X\right)=\frac{\sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2}}{n-1} $$ By the same token I…
39
votes
8 answers

Graphical data overview (summary) function in R

I'm sure I've come across a function like this in an R package before, but after extensive Googling I can't seem to find it anywhere. The function I'm thinking of produced a graphical summary for a variable given to it, producing output with some…
38
votes
7 answers

How to interpret the coefficient of variation?

I am trying to understand the Coefficient of Variation. When I try to apply it to the following two samples of data I am unable to understand how to interpret the results. Let's say sample 1 is $\{0, 5, 7, 12, 11, 17\}$ and sample 2 is $\{10, 15, 17…
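As a rough illustration, the coefficient of variation is the standard deviation divided by the mean. The sketch below applies it to sample 1 only, since sample 2 is cut off in the excerpt. Using the sample (n-1) standard deviation is an assumption, and the function name is made up.

```python
import statistics

sample_1 = [0, 5, 7, 12, 11, 17]  # sample 1 as given in the question


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)


cv_1 = coefficient_of_variation(sample_1)
```

Because the mean sits in the denominator, adding a constant to every value leaves the standard deviation unchanged but shrinks the coefficient of variation.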
30
votes
2 answers

Is variation the same as variance?

This is my first question on Cross Validated here, so please help me out even if it seems trivial :-) First of all, the question might be an outcome of language differences or perhaps me having real deficiencies in statistics. Nevertheless, here it…
ŁukaszBachman
  • 435
  • 1
  • 5
  • 9
30
votes
12 answers

Command-line tool to calculate basic statistics for stream of values

Is there any command-line tool that accepts a stream of numbers (in ASCII format) from standard input and gives the basic descriptive statistics for this stream, such as min, max, average, median, RMS, quantiles etc.? The output is welcome to be…
mbaitoff
  • 757
  • 1
  • 8
  • 16
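A minimal sketch of what such a tool could look like in Python, not a reference to any existing utility. It assumes whitespace-separated numbers, and the function name `summarize` is hypothetical.

```python
import math
import statistics


def summarize(lines):
    """Compute basic statistics for whitespace-separated numbers.

    `lines` is any iterable of text lines, such as sys.stdin.
    """
    values = [float(token) for line in lines for token in line.split()]
    if not values:
        raise ValueError("no numeric input")
    squares = [v * v for v in values]
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "rms": math.sqrt(statistics.fmean(squares)),
    }
```

In a script, calling `summarize(sys.stdin)` (after `import sys`) and printing the result would let it sit at the end of a shell pipeline. Note that this version holds all values in memory, which the median requires anyway.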
27
votes
1 answer

Generate two variables with precise pre-specified correlation

UPDATE: Solution Thanks to Greg Snow for pointing out the empirical = TRUE command in mvrnorm (multivariate random normal stuff)! Here's the explicit code: samples = 200 r = 0.83 library('MASS') data = mvrnorm(n=samples, mu=c(0, 0),…
Jonas Lindeløv
  • 1,778
  • 1
  • 17
  • 28
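The excerpt's solution uses `mvrnorm` from R's MASS package with `empirical = TRUE`. The sketch below is not that code. It is a different, hand-rolled construction in Python that should give the same kind of result: make a second vector exactly orthogonal to the first, then mix the two. The values 200 and 0.83 come from the excerpt, and all function names are hypothetical.

```python
import math
import random


def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def pair_with_exact_correlation(n, r, seed=0):
    """Return x, y whose sample Pearson correlation equals r (up to rounding)."""
    rng = random.Random(seed)
    x = unit(center([rng.gauss(0, 1) for _ in range(n)]))
    e = center([rng.gauss(0, 1) for _ in range(n)])
    proj = dot(e, x)  # x has unit length, so this is the projection coefficient
    e = unit([b - proj * a for a, b in zip(x, e)])  # now orthogonal to x
    y = [r * a + math.sqrt(1 - r * r) * b for a, b in zip(x, e)]
    return x, y


def pearson(u, v):
    """Sample Pearson correlation coefficient."""
    u, v = center(u), center(v)
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


x, y = pair_with_exact_correlation(n=200, r=0.83)
```

Both vectors come out centered with unit length, so they can be rescaled and shifted to any desired means and standard deviations without changing the correlation.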
25
votes
3 answers

What can one conclude about the data when arithmetic mean is very close to geometric mean?

Is there anything significant about a geometric mean and arithmetic mean that fall very close to one another, say ~0.1%? What conjectures can be made about such a data set? I've been working on analyzing a data set, and I notice that ironically the…
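A small numerical sketch of one reading of this question: for positive data the geometric mean never exceeds the arithmetic mean, and the gap shrinks as the relative spread shrinks. The second-order approximation GM/AM ≈ exp(-CV²/2) used below is an assumption that is reasonable only when the coefficient of variation is small. The data are made up.

```python
import math
import statistics

data = [100, 101, 99, 102, 98]  # illustrative values with small relative spread

am = statistics.mean(data)
gm = statistics.geometric_mean(data)

# Squared coefficient of variation, using the population variance.
cv_squared = statistics.pvariance(data) / am ** 2

ratio = gm / am                           # at most 1 for positive data
approx_ratio = math.exp(-cv_squared / 2)  # small-spread approximation
```

Under that approximation, a gap of about 0.1% between the two means would correspond to a coefficient of variation of a few percent, that is, data tightly clustered relative to their mean.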