I have a data set of about 200,000 observations on establishments in my district and their operating times. I need to verify the accuracy of my data, so I am trying to take a simple random sample of 500 observations and have their operating times physically ascertained. To draw the sample, I followed these steps in MS Excel:
- Assign a random value between 0 and 1 to each of the 200,000 observations using the `Rand()` function;
- Sort the data in descending order on the basis of the random values assigned;
- Select the top 500 observations.
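
To make the procedure concrete, here is a minimal sketch of the same three steps in Python rather than Excel. The function name `sample_by_random_keys`, the use of row indices in place of the actual records, and the fixed seed are all assumptions made for illustration; this is not what I actually ran.

```python
import random


def sample_by_random_keys(n_population, n_sample, seed=None):
    """Pick n_sample row indices by sorting on random keys (hypothetical helper)."""
    rng = random.Random(seed)
    # Step 1: attach a uniform random key in [0, 1) to every row index.
    keyed = [(rng.random(), i) for i in range(n_population)]
    # Step 2: sort in descending order of the random key.
    keyed.sort(reverse=True)
    # Step 3: keep the row indices of the top n_sample keys.
    return [i for _, i in keyed[:n_sample]]


# 200,000 observations, sample of 500; the seed is only there for repeatability.
sample = sample_by_random_keys(200_000, 500, seed=42)
```

One difference from the spreadsheet version: here the keys are generated once and stay fixed, whereas Excel's `Rand()` recalculates whenever the sheet changes, so in Excel the random values would presumably need to be pasted as values before sorting.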
I would like to know if this is the right way to take a simple random sample. If not, why not, and what should I do to make it random?