1

I have a random variable $X$ and its pdf $f(x)=4x^3$ - which happens to be distributed as $Beta(4,1)$.

I would like to approximate $E(X)$, $Var(X)$, $P(X>0.5)$ using Monte Carlo simulation with 1,000 draws 5 times.

I know that, $$E[\bar X]=\frac{1}{n}\sum_{1=1}^n X_i=\hat \mu =\bar X \\ Var[\bar X]=\frac{1}{n-1}\sum_{1=1}^n (X_i-\bar X)^2 =\hat \sigma^2 X$$

In R, I would like to use the replicate function to simulate 1000 draws. Then

> replicate(n=1000, expr=mean(rbeta(1,shape1=4,shape2=1)), simplify = "array")

First is this the correct code? I am not summing $X_i$'s and then dividing by $n$ hoping that mean() will do both? And what is the simplify parameter (ELI5). From ETHZ,

logical or character string; should the result be simplified to a vector, matrix or higher dimensional array if possible? For sapply it must be named and not abbreviated. The default value, TRUE, returns a vector or matrix if appropriate, whereas if simplify = "array" the result may be an array of “rank” (=length(dim(.))) one higher than the result of FUN(X[[i]]).

For $Var[\bar X]$ and $P[\bar X<0.5]$ I am quite lost. Based on intuition, I will have $\bar X$ so to approximate $Var[\bar X]$, I can simulate $X_i$ via rbeta(n=1000, shape1=4, shape2=1) and putting them in a column vector $\underline{a}$. With 1000 $\bar X$ also in a column vector $\underline{b}$, I can perform $$\underline{a}-\underline{b}=\underline{c}.$$

I can square each entry of $\underline{c}$ and then sum them up and divide by $n-1$. Yet I have a suspicion that there is a 1 line solution.

Side note: I alsp know by the Central Limit Theorem, $$\bar X \dot\sim N[E(\bar X)=\mu, Var(\bar X)=\frac{\sigma^2}{n}]$$

as $n\to\infty $ or $(n\ge 30)$.

ozarka
  • 179
  • 4
  • Central limit theorem does **not** say anything about $n \ge 30$ and toes **not** imply that $\infty \approx 30$! See http://stats.stackexchange.com/questions/2541/is-there-a-reference-that-suggest-using-30-as-a-large-enough-sample-size – Tim Feb 02 '17 at 08:49

1 Answers1

1

If a understand correclty you want to draw samles with 5 observations each time and calculate these estimates for every instance. First randomly draw your 5 observations

    x = rbeta(5,shape1=4,shape2=1)

The calculate the quantities you need

    mean(x)
    var(x)
    mean(x>0.5)

If you want to do this a thousand times you can indeed use

    replicate(n=1000, expr={x=rbeta(5,shape1=4,shape2=1); c(mean=mean(x), var=var(x), prob05 = mean(x<0.05))}, simplify = "array")

The simplify="array" just means that instead of a list of length 1000 with vectors of length 3 it will return a 3-by-1000 matrix

Knarpie
  • 1,522
  • 9
  • 22