In a machine learning data set, each feature is treated as a random variable. A random variable is a function that maps the outcomes in a sample space to real values. Since each feature is a random variable, what is its sample space? In other words, what is the domain of the mapping?

praveen
  • I marked this as a duplicate of another, more general question. I believe that question applies equally to what you ask, since the definition of a random variable is the same in probability theory, statistics, and machine learning. – Tim May 25 '20 at 09:18
  • My question here is not about what a random variable is. It is about which events the random variable maps. In the data we typically see the values, but the input domain is unknown. I would like to know about that domain, not what a random variable is. – praveen May 25 '20 at 10:46
  • Those events are part of the definition of a random variable. Is there any reason why the linked thread does not answer your question? – Tim May 25 '20 at 10:48

1 Answer

Just look at the empirical distribution of your features, e.g. with a histogram if they take real values. If they take discrete values, such as categories, the same idea applies. Each value in your sample space (an interval or a category) is then mapped to a probability.
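A minimal sketch of what looking at the empirical distribution could mean in practice, using only the standard library. The feature columns `colors` and `heights` and the helper names `empirical_pmf` and `histogram` are made up for illustration; they are not from any particular library. Note that this only summarizes the observed values of a feature.

```python
from collections import Counter


def empirical_pmf(values):
    """Relative frequency of each distinct observed value."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def histogram(values, bins):
    """Relative frequency of observations in `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All observations are identical: put everything in the first bin.
        return [1.0] + [0.0] * (bins - 1)
    counts = [0] * bins
    for x in values:
        # The maximum is placed in the last bin instead of overflowing.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


# Hypothetical feature columns, invented for this example.
colors = ["red", "blue", "red", "green", "red", "blue"]
heights = [150.0, 160.0, 165.0, 170.0, 175.0, 190.0]

pmf = empirical_pmf(colors)        # one relative frequency per category
hist = histogram(heights, bins=4)  # one relative frequency per interval
```

The relative frequencies in `pmf` and in `hist` each sum to 1, so both can be read as estimated probabilities of the categories or intervals.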

Tinu
  • This is not correct. See https://stats.stackexchange.com/questions/50/what-is-meant-by-a-random-variable/54894#54894 for a description of what a random variable is. – Tim May 25 '20 at 09:17