In a machine learning data set, each feature is treated as a random variable. A random variable is a function that maps outcomes in a sample space to real values. Given that each feature is a random variable, what is its sample space? In other words, what is the domain that it maps from?
-
I marked this as a duplicate of another, more general question. I believe that question applies to what you ask as well, since the definition of a random variable is the same in probability theory, statistics, and machine learning. – Tim May 25 '20 at 09:18
-
My question here is not about what a random variable is. It is about which events the random variable maps. In the data we typically see only the values; the input domain is unknown. I would like to know about that domain, not about what a random variable is. – praveen May 25 '20 at 10:46
-
Those events are part of the definition of a random variable. Is there any reason why the linked thread does not answer your question? – Tim May 25 '20 at 10:48
1 Answer
0
Just look at the empirical distribution of your features, e.g. with a histogram if the values are real. If the values are discrete, such as categories, the same idea applies. Each value in your sample space (an interval or a category) is then mapped to a probability.
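A minimal sketch of computing such an empirical distribution, using only the standard library. The feature values, the bin range, and the helper names `empirical_pmf` and `histogram` are made up for illustration.

```python
from collections import Counter

def empirical_pmf(values):
    """Relative frequency of each distinct value in a sample."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}

def histogram(values, n_bins, lo, hi):
    """Relative frequency of values in n_bins equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        # Values at the right edge fall into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

# Hypothetical categorical feature.
colors = ["red", "blue", "red", "green", "red", "blue"]
# Hypothetical real-valued feature (heights in metres).
heights = [1.52, 1.58, 1.63, 1.66, 1.71, 1.74, 1.77, 1.86]

color_pmf = empirical_pmf(colors)
height_hist = histogram(heights, n_bins=4, lo=1.5, hi=1.9)
```

Here `color_pmf` assigns a relative frequency to each category, and `height_hist` assigns one to each of the four intervals between 1.5 and 1.9. Note that both describe the observed values of the feature, not the underlying outcomes that produced them.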

Tinu
-
This is not correct. See https://stats.stackexchange.com/questions/50/what-is-meant-by-a-random-variable/54894#54894 for a description of what a random variable is. – Tim May 25 '20 at 09:17
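To make the distinction in this comment concrete, here is a small sketch under one common reading, which is an assumption and not something stated in this thread: the sample space is the population of units a row could be drawn from, and each feature is a function defined on that population. The people, their attributes, and the function names are all hypothetical.

```python
import random

# Hypothetical sample space: every unit that could end up as a row.
sample_space = {
    "alice": {"height_cm": 165.0, "smoker": False},
    "bob":   {"height_cm": 180.0, "smoker": True},
    "carol": {"height_cm": 172.0, "smoker": False},
}

def height(omega):
    """Feature as a random variable: maps an outcome to a real number."""
    return sample_space[omega]["height_cm"]

def smoker(omega):
    """Binary feature encoded as 0.0 or 1.0."""
    return 1.0 if sample_space[omega]["smoker"] else 0.0

# Drawing a data set: sample outcomes, then apply each feature to them.
rng = random.Random(0)
outcomes = [rng.choice(sorted(sample_space)) for _ in range(5)]
dataset = [(height(w), smoker(w)) for w in outcomes]
```

The data set contains only the values `height(w)` and `smoker(w)`; the outcomes `w` themselves are usually not recorded, which is why the domain is not visible in the data.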