
As a math guy trying to understand Principal Component Analysis from the standpoint of linear algebra, I am following along with a paper I found. As I read it, I want to understand the statistics vocabulary involved and how the following terms relate to one another:

  1. Population
  2. Variable
  3. Experiment
  4. Sample
  5. Observation
  6. Outcome

I'd like to lay out my understanding of these words and their relationships, ask for corrections, and pose some clarifying questions:

  • A population is the set of all possible outcomes of an experiment.

  • A sample is a subset of the population, and is therefore a set of outcomes.

  • A random variable is a mapping from the population of all possible real-life outcomes of an experiment to numerical values.

This leaves a couple of questions:

  1. What is an observation?
  2. If I draw blood from $n$ people, and each draw yields $m$ data points (such as platelet count, plasma levels, blood alcohol content), what is each draw of blood called? It seems like it should be a sample, but I thought samples were directly related to outcomes of single random variables. In this case, a sample of blood contains many variables. Is this where the word observation comes in?

  3. Is the experiment the act of drawing blood $n$ times? Or is it the act of observing the variables within each sample (medically speaking) of blood?
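
To make questions 2 and 3 concrete, here is a minimal sketch of how I picture the blood-draw data: an $n \times m$ table with one row per draw and one column per measured quantity. The row/column convention, the variable names, and the random placeholder numbers are all my own assumptions for illustration, not anything taken from the paper.

```python
import random

# Hypothetical setup: n people, m measurements per blood draw.
n, m = 5, 3
variables = ["platelet_count", "plasma_level", "blood_alcohol"]  # illustrative names only

rng = random.Random(0)  # fixed seed so the made-up numbers are reproducible

# data[i][j] = value of variable j measured on person i's blood draw.
# The values are random placeholders, not real measurements.
data = [[rng.random() for _ in range(m)] for _ in range(n)]

# One row: everything measured on a single draw
# (my guess at what "observation" might refer to).
row = data[0]

# One column: the n recorded values of a single variable.
column = [data[i][1] for i in range(n)]

print(len(data), len(row), len(column))
```

In this picture, the question is which word (sample, observation, outcome) names a row, which names a column, and which names the whole table.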

rocksNwaves
  • Have you considered searching our site? We have loads of threads on these topics. In fact, we have a bunch of posts that reference *every one of them:* see https://stats.stackexchange.com/search?tab=votes&q=observation%20sample%20variable%20experiment%20outcome. My post at https://stats.stackexchange.com/questions/50/what-is-meant-by-a-random-variable/54894#54894 covers all but the term "sample." Please be aware, too, that various scientific and social science communities use these words in different ways than statisticians: you need to be aware who is writing for whom. – whuber Feb 27 '20 at 20:26
  • @whuber I have, and the language is so contradictory from post to post that it about drove me mad. – rocksNwaves Mar 03 '20 at 22:23
  • Agreed. It is useful to distinguish posts by non-statisticians (usually as questions) from posts from statisticians in this regard--statisticians tend to be more careful and consistent with this terminology. – whuber Mar 03 '20 at 22:26
  • @whuber I got a fairly satisfactory answer in the form of a comment here: https://stats.stackexchange.com/questions/107912/what-is-the-difference-between-sample-and-outcome-plus-events-and-observations/107936?noredirect=1#comment840181_107936 – rocksNwaves Mar 03 '20 at 22:50
