Questions tagged [definition]

This tag indicates questions about definitions of statistical terms. Use the more general tag [terminology] for questions about statistical parlance that are not specifically about definitions.

490 questions
522
votes
23 answers

Why square the difference instead of taking the absolute value in standard deviation?

In the definition of standard deviation, why do we have to square the difference from the mean to get the mean (E) and take the square root back at the end? Can't we just simply take the absolute value of the difference instead and get the expected…
c4il
372
votes
9 answers

What is the difference between fixed effect, random effect and mixed effect models?

In simple terms, how would you explain (perhaps with simple examples) the difference between fixed effect, random effect and mixed effect models?
Andrew
192
votes
14 answers

What is a data scientist?

Having recently graduated from my PhD program in statistics, I have spent the last couple of months searching for work in the field of statistics. Almost every company I considered had a job posting with the job title "Data Scientist". In fact,…
RustyStatistician
117
votes
14 answers

Maximum Likelihood Estimation (MLE) in layman terms

Could anyone explain maximum likelihood estimation (MLE) in detail, in layman's terms? I would like to understand the underlying concept before going into the mathematical derivation or equations.
107
votes
12 answers

When should linear regression be called "machine learning"?

In a recent colloquium, the speaker's abstract claimed they were using machine learning. During the talk, the only thing related to machine learning was that they perform linear regression on their data. After calculating the best-fit coefficients…
103
votes
12 answers

What, precisely, is a confidence interval?

I know roughly and informally what a confidence interval is. However, I can't seem to wrap my head around one rather important detail: According to Wikipedia: A confidence interval does not predict that the true value of the parameter has a…
dsimcha
90
votes
9 answers

What is meant by a "random variable"?

What do they mean when they say "random variable"?
Baltimark
87
votes
7 answers

What are principal component scores?

What are principal component scores (PC scores, PCA scores)?
vrish88
66
votes
4 answers

What is a contrast matrix?

What exactly is a contrast matrix (a term pertaining to an analysis with categorical predictors), and how exactly is a contrast matrix specified? I.e., what are the columns, what are the rows, what are the constraints on that matrix, and what does number in…
49
votes
3 answers

What is the difference between posterior and posterior predictive distribution?

I understand what a posterior is, but I'm not sure what the latter means. How are the two different? Kevin P. Murphy indicated in his textbook, Machine Learning: A Probabilistic Perspective, that it is "an internal belief state". What does that really…
A.D
45
votes
8 answers

Rigorous definition of an outlier?

People often talk about dealing with outliers in statistics. The thing that bothers me about this is that, as far as I can tell, the definition of an outlier is completely subjective. For example, if the true distribution of some random variable…
dsimcha
44
votes
6 answers

Is a time series the same as a stochastic process?

A stochastic process is a process that evolves over time, so is it really a fancier way of saying "time series"?
Victor
34
votes
9 answers

What is the difference between an estimator and a statistic?

I learned that a statistic is an attribute you can obtain from samples. Taking many samples of the same size, calculating this attribute for all of them and plotting the pdf, we get the distribution of the corresponding attribute or the distribution of…
gutto
31
votes
3 answers

What is a latent space?

In the context of machine learning, I often hear the term latent space, sometimes qualified as a "high dimensional" or "low dimensional" latent space. I am a bit puzzled by this term (as it is almost never defined rigorously). Can someone…
Fraïssé
31
votes
5 answers

Wikipedia entry on likelihood seems ambiguous

I have a simple question regarding "conditional probability" and "Likelihood". (I have already surveyed this question here but to no avail.) It starts from the Wikipedia page on likelihood. They say this: The likelihood of a set of parameter…