Questions tagged [simulation]

A vast area which includes generating results from computer models.

Using some model of a "reality", a simulator is run which generates results. A trivial example would be a computer pseudo-random number generator for the Normal distribution. More complex examples include running simulated cars into barriers to study their behaviour on impact, and numerical weather forecasting, where many scenarios are run through ensemble models and the most likely one is chosen.

Simulation is almost always cheaper and faster than the real thing. Sometimes simulation is the only reasonable ethical option: for example, when evaluating the safety of a device intended for human use, you cannot ethically test it on humans where there is a risk of harm.

In statistics, simulation is often used to test new statistical algorithms: you simulate some data with known parameters, test how well the new algorithm can identify those known parameters, and compare the results against older algorithms. With real data this check is usually not possible, because you often do not know the true values of the parameters.
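As a minimal sketch of this idea (not taken from any question on this page), the Python snippet below simulates Normal data with a known mean and compares two estimators of that mean, the sample mean and the sample median, by their root-mean-square error. The function name `simulate_rmse`, the parameter values, and the seed are illustrative assumptions.

```python
import random
import statistics


def simulate_rmse(estimator, true_mean=5.0, true_sd=2.0, n=50, reps=2000, seed=1):
    """Estimate the RMSE of `estimator` for the mean of Normal(true_mean, true_sd).

    All default values are arbitrary choices for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    squared_errors = []
    for _ in range(reps):
        # Simulate one dataset whose true mean we know by construction.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        # Record how far the estimate lands from the known truth.
        squared_errors.append((estimator(sample) - true_mean) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


# Using the same seed means both estimators see identical simulated datasets.
rmse_mean = simulate_rmse(statistics.fmean)
rmse_median = simulate_rmse(statistics.median)
print(f"RMSE of sample mean:   {rmse_mean:.3f}")
print(f"RMSE of sample median: {rmse_median:.3f}")
```

For Normal data the sample mean should show the smaller error, which is the kind of comparison that is only possible because the true mean was fixed in advance.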

1656 questions
68
votes
8 answers

How to simulate data that satisfy specific constraints such as having specific mean and standard deviation?

This question is motivated by my question on meta-analysis. But I imagine that it would also be useful in teaching contexts where you want to create a dataset that exactly mirrors an existing published dataset. I know how to generate random data…
Jeromy Anglim
63
votes
5 answers

Why does collecting data until finding a significant result increase Type I error rate?

I was wondering exactly why collecting data until a significant result (e.g., $p \lt .05$) is obtained (i.e., p-hacking) increases the Type I error rate? I would also highly appreciate an R demonstration of this phenomenon.
Reza
53
votes
2 answers

How to simulate artificial data for logistic regression?

I know I'm missing something in my understanding of logistic regression, and would really appreciate any help. As far as I understand it, the logistic regression assumes that the probability of a '1' outcome given the inputs, is a linear combination…
zorbar
44
votes
6 answers

When to use simulations?

So this is a very simple and basic question. However, when I was in school, I paid very little attention to the whole concept of simulations in class and that's left me a little terrified of that process. Can you explain the simulation process in…
AMathew
42
votes
2 answers

Simulation of logistic regression power analysis - designed experiments

This question is in response to an answer given by @Greg Snow in regards to a question I asked concerning power analysis with logistic regression and SAS Proc GLMPOWER. If I am designing an experiment and will analyze the results in a factorial…
B_Miner
42
votes
8 answers

Approximate $e$ using Monte Carlo Simulation

I've been looking at Monte Carlo simulation recently, and have been using it to approximate constants such as $\pi$ (circle inside a rectangle, proportionate area). However, I'm unable to think of a corresponding method of approximating the value of…
31
votes
5 answers

Generating random numbers manually

How can I manually generate a random number from a given distribution, as for instance, 10 realisations from the standard normal distribution?
29
votes
5 answers

What are examples of statistical experiments that allow the calculation of the golden ratio?

There are some very simple experiments that a kid can do at home, whose results allow one to statistically approximate famous numbers such as $\pi$ or $e$. An example where $\pi$ shows up is perhaps the most famous one of its kind. In Buffon's…
29
votes
2 answers

How well does bootstrapping approximate the sampling distribution of an estimator?

Having recently studied bootstrap, I came up with a conceptual question that still puzzles me: You have a population, and you want to know a population attribute, i.e. $\theta=g(P)$, where I use $P$ to represent population. This $\theta$ could be…
KevinKim
27
votes
1 answer

Generate two variables with precise pre-specified correlation

UPDATE: Solution Thanks to Greg Snow for pointing out the empirical = TRUE command in mvrnorm (multivariate random normal stuff)! Here's the explicit code: samples = 200 r = 0.83 library('MASS') data = mvrnorm(n=samples, mu=c(0, 0),…
Jonas Lindeløv
27
votes
2 answers

Why is it necessary to sample from the posterior distribution if we already KNOW the posterior distribution?

My understanding is that when using a Bayesian approach to estimate parameter values: The posterior distribution is the combination of the prior distribution and the likelihood distribution. We simulate this by generating a sample from the…
Dave
26
votes
1 answer

When would one use Gibbs sampling instead of Metropolis-Hastings?

There are different kinds of MCMC algorithms: Metropolis-Hastings, Gibbs, and importance/rejection sampling (related). Why would one use Gibbs sampling instead of Metropolis-Hastings? I suspect there are cases when inference is more tractable with…
25
votes
1 answer

How to create a multivariate Brownian Bridge?

It is known that a standard multivariate Brownian bridge $y(\mathbf u)$ is a centered Gaussian process with covariance function $$ \mathbb E(y(\mathbf u) y(\mathbf v)) = \prod_{j=1}^d (u_j \wedge v_j) - \prod_{j=1}^d u_j v_j $$ I am not sure…
andeliyeasi
25
votes
2 answers

What is importance sampling?

I'm trying to learn reinforcement learning and this topic is really confusing to me. I have taken an introduction to statistics, but I just couldn't understand this topic intuitively.
21
votes
1 answer

How to sample from Cantor distribution?

What would be the best way to sample from Cantor distribution? It only has cdf and we can't invert it.
Tim