Questions tagged [random]

20 questions
31
votes
4 answers

R: Problem with runif: generated number repeats (more often than expected) after less than 100 000 steps

After executing the code RNGkind(kind="Mersenne-Twister"); set.seed(123); n = 10^5; x = runif(n); print(x[22662] == x[97974]) (Mersenne-Twister is the default anyway), TRUE is output! If I use, e.g., RNGkind(kind="Knuth-TAOCP-2002"), the same thing happens: I get "only" 99 995…
Antoine
  • 738
  • 5
  • 15
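A rough birthday-problem check of the question above, as a sketch rather than a definitive answer: it assumes the generator can return only about 2^32 distinct values (an assumption about runif's resolution, not something stated in the excerpt) and that draws are independent. The function names are illustrative.

```python
import math

def expected_duplicate_pairs(n: int, m: int) -> float:
    """Expected number of equal pairs among n independent uniform draws from m values."""
    return n * (n - 1) / 2 / m

def prob_any_duplicate(n: int, m: int) -> float:
    """Poisson approximation to P(at least one repeated value)."""
    return 1.0 - math.exp(-expected_duplicate_pairs(n, m))

n = 10**5   # number of draws, as in the question
m = 2**32   # ASSUMPTION: about 2^32 distinct possible outputs
pairs = expected_duplicate_pairs(n, m)
p_dup = prob_any_duplicate(n, m)
print(f"expected duplicate pairs: {pairs:.3f}")
print(f"P(at least one duplicate) ~ {p_dup:.3f}")
```

Under that assumption, a repeat within 100 000 draws is more likely than not, so a single coincidence is not by itself evidence of a faulty generator.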
5
votes
1 answer

Number of expected turns to get $k$ distinct numbers

We have a random number generator, which generates random numbers from $1$ to $N$. Each number has an equal probability of occurring (equiprobable). Find the expected number of turns to get $k$ distinct numbers from the random number…
Sak1sham
  • 53
  • 5
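One standard way to approach this question is the partial coupon-collector argument: once i distinct numbers have been seen, the wait for a new one is geometric with success probability (N - i)/N, so the expected total is the sum of N/(N - i) for i = 0, …, k - 1. A minimal sketch, assuming independent equiprobable draws; the function name is hypothetical.

```python
from fractions import Fraction

def expected_turns(N: int, k: int) -> Fraction:
    """Expected draws from {1..N} (uniform, independent) until k distinct values appear."""
    if not 0 <= k <= N:
        raise ValueError("k must satisfy 0 <= k <= N")
    # After i distinct values, a new one appears with probability (N - i) / N,
    # so the expected wait for the next new value is N / (N - i).
    return sum((Fraction(N, N - i) for i in range(k)), Fraction(0))

print(expected_turns(6, 6))          # all faces of a fair die
print(float(expected_turns(100, 50)))
```

Exact fractions are used so that small cases can be checked by hand.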
3
votes
1 answer

Do the levels of a random effect need to be present in all the levels of a fixed effect?

Study design: In our study, we have a 2x2x2 design with factors Prime (determiner, pronoun), Category (noun, verb) and Masking (masked presentation, unmasked presentation). Our stimulus set included two prime words (determiner 'a' and pronoun 'he')…
Elena
  • 33
  • 3
3
votes
1 answer

Random forest classifier. Some of my data is overrepresented. Is this an issue?

I am using a random forest classifier to predict plant color in my study species, using a variety of environmental variables. My data comes from citizen scientists and I am worried that the class imbalance I'm seeing between my color categories may…
3
votes
1 answer

Is there a way to determine if one picture is more random than another?

I have two pictures. I only want to keep pictures that "display" randomness. Is there a way to distinguish more random from less? In these plots, time is the vertical axis. So, the second plot has a slight repeating pattern, while the first plot is…
Krits
  • 41
  • 3
3
votes
1 answer

When to stop enumerating a fixed set of unknown cardinality via random sampling?

DNS resolution can sometimes return one of multiple IP addresses, for load balancing. I would like to enumerate a list of IPs for a service so I can whitelist traffic to a domain without performing an excessive amount of reverse lookups. How many…
Iiridayn
  • 141
  • 6
2
votes
2 answers

Calculate probability that I have fully sampled a set

Let's say I have a set of items out of which I randomly take 5%. I then perform some action on these items and put them back in the set. After how many repetitions of this process can I be 95% sure that I have performed this action on all the…
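The excerpt does not give the set size, so the following is only a sketch under stated assumptions: the set has a known number of items (the value 100 below is hypothetical), each round picks exactly 5% of them uniformly at random, and rounds are independent. Each item then misses a given round with probability 0.95, and a union bound gives P(some item never picked after r rounds) ≤ M · 0.95^r. Solving for r yields a sufficient, slightly conservative number of rounds.

```python
import math

def rounds_needed(num_items: int, fraction: float = 0.05, confidence: float = 0.95) -> int:
    """Smallest r with num_items * (1 - fraction)**r <= 1 - confidence (union bound)."""
    miss = 1.0 - fraction        # chance a given item is skipped in one round
    target = 1.0 - confidence    # allowed chance that any item was never picked
    return math.ceil(math.log(target / num_items) / math.log(miss))

M = 100  # HYPOTHETICAL set size, not given in the question
r = rounds_needed(M)
print(f"{r} rounds suffice for {M} items")
```

Because the union bound overestimates the failure probability, the true number of rounds required can be somewhat smaller.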
2
votes
2 answers

Why do I get a very large observation when I generate data from a t-distribution in R?

I want to generate 200 samples from a t-distribution with 1 degree of freedom, each of size 10, and in R I use this code: set.seed(1234); B <- matrix(rt(10*200, 1), 200). But when I look at sample number 167 (B[167,]), I find this high number…
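For context, a t-distribution with 1 degree of freedom is the standard Cauchy distribution, whose tail probability is P(|T| > x) = 1 - (2/π)·arctan(x). The sketch below, with illustrative names, uses that formula to estimate how many of the 10 × 200 = 2000 draws would be expected to exceed a few thresholds in absolute value; it does not reproduce the R output from the question.

```python
import math

def cauchy_tail_prob(x: float) -> float:
    """P(|T| > x) for a t variable with 1 degree of freedom (standard Cauchy), x >= 0."""
    return 1.0 - (2.0 / math.pi) * math.atan(x)

n_draws = 10 * 200  # total number of values generated in the question
for threshold in (10, 100, 1000):
    expected = n_draws * cauchy_tail_prob(threshold)
    print(f"|value| > {threshold}: expect about {expected:.2f} of {n_draws} draws")

expected_over_100 = n_draws * cauchy_tail_prob(100)
```

Values in the hundreds are therefore unremarkable in a simulation of this size.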
2
votes
0 answers

R: Simulating coin flips

Here is a problem I thought of: Suppose I am watching someone flip a fair coin. Each flip is completely independent from the previous flip. I watch this person flip 3 consecutive heads. I interrupt this person and ask the following question: If the…
stats_noob
  • 5,882
  • 1
  • 21
  • 42
2
votes
1 answer

Excluding participants due to issue in randomisation?

I have been carrying out a study and, due to a human error in the randomisation process, instead of getting equal-sized groups (I have 8 and aimed to have 100 per group), I now have around 90-100 participants per group (in 7 groups) and for one group…
Cristina
  • 21
  • 1
2
votes
1 answer

Probability of facing a specific number when having N random numbers from a "discrete uniform distribution of N numbers"

What I know: with R as a random variable from a discrete uniform distribution on the 1000 numbers [1, 1000], there is a 1/1000 chance to have R=123 (or any other number in [1, 1000]). What I think I know: so if we test this 1000 times, there must be a…
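Assuming the question concerns the chance of seeing a specific value at least once in N independent draws from N equally likely values (the excerpt is truncated, so this reading is an assumption), the probability is 1 - (1 - 1/N)^N, which approaches 1 - 1/e as N grows. A small sketch with a hypothetical function name:

```python
import math

def prob_at_least_once(n_values: int, n_draws: int) -> float:
    """P(a fixed value appears at least once in n_draws uniform draws from n_values values)."""
    return 1.0 - (1.0 - 1.0 / n_values) ** n_draws

p = prob_at_least_once(1000, 1000)
limit = 1.0 - math.exp(-1.0)
print(f"N = 1000: {p:.4f}")
print(f"limit 1 - 1/e: {limit:.4f}")
```

So drawing 1000 times does not guarantee a hit; the expected number of hits is 1, but the chance of at least one is well below 1.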
2
votes
1 answer

Can we sample from both pdf and cdf?

My question is quite generic. I am currently studying algorithms for generating random numbers from distributions: in the inverse transform method, we get the cumulative distribution function in the end and take the random variable from there. Whereas…
1
vote
0 answers

ICC in a random intercept model with a predictor variable

Is it possible, and is it correct, to calculate the ICC in a random intercept model that includes a predictor variable? Does this value give us the variation of the outcome that is purely related to subject heterogeneity?
loreen
  • 11
  • 2
1
vote
1 answer

Undersampling approach in different types of study

If I want to use an undersampling approach to construct the machine learning model, I am wondering if there are any criteria to determine how many times I should sample the data from the majority group (the minority is 14% and the majority is 86%)…
tassaneel
  • 13
  • 2
1
vote
0 answers

Criteria to be called a random variable?

I've read that to call something a random variable, that thing must be the result of a statistical experiment. That got me thinking: in which situations might we have an actual bias? For example, medical diagnosis is an interesting case. The people…
edward84
  • 21
  • 1