Say we have $X \sim \text{Beta}(\alpha, \beta)$. What's the sampling distribution of its sample mean?
In other words, what distribution does the sample mean $\bar{X}$ of a Beta follow?
I thought this was an interesting question, so here's a quick visual exploration. For $X\sim \text{Beta}(\alpha_1,\alpha_2)$, I first selected 4 separate Beta distributions (PDFs shown below).
Then I collected $K = 5000$ sample means, $\bar X = \frac{1}{n}\sum_{i=1}^n x_i$ with $n = 5000$ draws each, and plotted the corresponding histograms as shown below. The results look Normal and I'm inclined to believe @ChristophHanck's assertion that the Central Limit Theorem (CLT) is at work here.
MATLAB code
% Parameters
n = 5000;
K = 5000;
% Define Beta distributions
pd1 = makedist('Beta',0.25,0.45);
pd2 = makedist('Beta',0.25,2.5);
pd3 = makedist('Beta',4,0.15);
pd4 = makedist('Beta',3.5,5);
% Collect Sample Means
X1bar = zeros(K,1);
X2bar = zeros(K,1);
X3bar = zeros(K,1);
X4bar = zeros(K,1);
for k = 1:K % get K sample means
    X1bar(k) = mean(random(pd1,n,1)); % take mean of n samples
    X2bar(k) = mean(random(pd2,n,1));
    X3bar(k) = mean(random(pd3,n,1));
    X4bar(k) = mean(random(pd4,n,1));
end
% Plot Beta distribution PDFs
Xsupport = 0:.01:1;
figure, hold on, box on
title('Beta(\alpha_1,\alpha_2) PDFs')
plot(Xsupport,pdf(pd1,Xsupport),'r-','LineWidth',2.2)
plot(Xsupport,pdf(pd2,Xsupport),'b-','LineWidth',2.2)
plot(Xsupport,pdf(pd3,Xsupport),'k-','LineWidth',2.2)
plot(Xsupport,pdf(pd4,Xsupport),'g-','LineWidth',2.2)
legend('(0.25,0.45)','(0.25,2.5)','(4,0.15)','(3.5,5)')
figure
s(1) = subplot(2,2,1); hold on, box on % semicolon keeps the axes handle from being echoed
histogram(X1bar,'FaceColor','r')
s(2) = subplot(2,2,2); hold on, box on
histogram(X2bar,'FaceColor','b')
s(3) = subplot(2,2,3); hold on, box on
histogram(X3bar,'FaceColor','k')
s(4) = subplot(2,2,4); hold on, box on
histogram(X4bar,'FaceColor','g')
title(s(1),'(0.25,0.45)')
title(s(2),'(0.25,2.5)')
title(s(3),'(4,0.15)')
title(s(4),'(3.5,5)')
Edit: This post was a quick attempt to give the OP something to work with. As pointed out, the Central Limit Theorem (CLT) implies these results will hold for i.i.d. samples from any distribution with a finite variance.
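To make the CLT statement concrete: a $\text{Beta}(\alpha,\beta)$ variable has mean $\mu = \frac{\alpha}{\alpha+\beta}$ and variance $\sigma^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$, so for large $n$ the sample mean should be approximately $N(\mu, \sigma^2/n)$. The following is only a minimal sketch of that check for the first example. The seed, the smaller `n` and `K`, and the variable names are illustrative choices rather than part of the code above, and `betarnd` assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Compare simulated sample means with the CLT approximation N(mu, sigma^2/n)
rng(1);                          % fixed seed, only for reproducibility
a = 0.25; b = 0.45;              % shape parameters of the first example
n = 500;                         % draws per sample mean (smaller than above, for speed)
K = 2000;                        % number of sample means

mu     = a/(a+b);                % Beta mean
sigma2 = a*b/((a+b)^2*(a+b+1));  % Beta variance
s      = sqrt(sigma2/n);         % CLT standard deviation of the sample mean

Xbar = mean(betarnd(a,b,n,K), 1)';   % each column is one sample of size n

fprintf('mean of Xbar: %.4f (theory %.4f)\n', mean(Xbar), mu);
fprintf('std  of Xbar: %.5f (theory %.5f)\n', std(Xbar), s);

% Overlay the approximating normal density on the histogram
figure, hold on, box on
histogram(Xbar,'Normalization','pdf','FaceColor','r')
xg = linspace(min(Xbar), max(Xbar), 200);
plot(xg, exp(-(xg-mu).^2/(2*s^2))/(s*sqrt(2*pi)), 'k-', 'LineWidth', 2)
```

If the normal curve tracks the histogram and the two printed pairs agree closely, that is consistent with the CLT explanation; it is a sanity check, not a proof.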
Note: for the same question, see also "Sum of n i.i.d. Beta-distributed variables".
For the case of a uniform distribution, $\text{Beta}(1,1)$, the distribution of the sum of a number of independent variables (the mean is just this sum divided by $n$) is known as the Irwin-Hall distribution.
If $$X_n = \sum_{i=1}^n U_i \quad \text{ with } \quad U_i \sim \text{Beta}(1,1)$$
then the density of $X_n$ is a spline of degree $n-1$
$$f_X(x;n) = \frac{1}{(n-1)!} \sum_{j=0}^{n-1} a_j(k,n)x^j \quad \text{ for } \quad k \leq x \leq k+1$$
where the $a_j(k,n)$ can be described by a recurrence relation:
$$a_j(k,n) = \begin{cases} 1 & \quad k=0,j=n-1 \\ 0 & \quad k=0,j< n-1 \\ a_j(k-1,n) + (-1)^{n+k-j-1} {{n}\choose{k}} {{n-1}\choose{j}} k^{n-j-1} & \quad k>0 \end{cases}$$
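The recurrence is easy to run numerically. Below is a rough sketch that builds the coefficients $a_j(k,n)$ piece by piece and evaluates the resulting density on a grid; the choice $n=3$, the grid, and the variable names (`A`, `f`, ...) are assumptions made for illustration. The recurrence step is applied for every $k \geq 1$.

```matlab
% Irwin-Hall density from the coefficient recurrence
n = 3;                                % number of uniform summands (illustrative)
A = zeros(n, n);                      % A(k+1, j+1) holds a_j(k,n)
A(1, n) = 1;                          % k = 0: only a_{n-1}(0,n) is nonzero
for k = 1:n-1
    for j = 0:n-1
        A(k+1, j+1) = A(k, j+1) + (-1)^(n+k-j-1) * nchoosek(n,k) ...
                      * nchoosek(n-1,j) * k^(n-j-1);
    end
end

x = linspace(0, n, 601);
k = min(floor(x), n-1);               % index of the piece with k <= x <= k+1
f = zeros(size(x));
for i = 1:numel(x)
    % polyval expects the highest power first, so flip the coefficient row
    f(i) = polyval(fliplr(A(k(i)+1, :)), x(i)) / factorial(n-1);
end

area = trapz(x, f);                   % should be close to 1
```

For $n=3$ the middle piece comes out as $(-2x^2+6x-3)/2$, which is the familiar quadratic piece of the sum of three uniforms.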
You could see the above formula as being constructed by a repeated convolution of $X_{n-1}$ with $U_n$ where the integral is solved piecewise. Can we possibly generalize this for Beta distributed variables with any $\alpha$ and $\beta$?
Let $$X_n(\alpha,\beta) = \sum_{i=1}^n U_i \quad \text{ with } \quad U_i \sim \text{Beta}(\alpha,\beta)$$
We expect the density $f_X(x;n,\alpha,\beta)$ to be split into $n$ pieces (though possibly not a spline anymore). Omitting the normalizing constant of the Beta density, the convolution that gives the distribution of $X_{n}(\alpha,\beta) = X_{n-1}(\alpha,\beta)+U_n$ will be something like:
$$f_X(x;n,\alpha,\beta) = \int^{\text{min}(1,x)}_{1-\text{min}(1,n-x)} f_X(x-y;n-1,\alpha,\beta) y^{\alpha-1}(1-y)^{\beta-1} dy$$
For $n=2$, where $f_X(\cdot\,;1,\alpha,\beta)$ is the Beta density itself, this becomes:
$$f_X(x;2,\alpha,\beta) = \begin{cases} \int_{0\phantom{-x}}^{x} ((x-y)y)^{\alpha-1}((1-x+y)(1-y))^{\beta-1} dy & \quad \text{if $0 \leq x \leq 1$} \\ \int_{x-1}^{1} ((x-y)y)^{\alpha-1}((1-x+y)(1-y))^{\beta-1} dy & \quad \text{if $1 \leq x \leq 2$} \end{cases}$$
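When no closed form is at hand, the repeated convolution can at least be approximated on a grid. The sketch below is one way to do that with `conv`, under the assumption that $\alpha, \beta \geq 1$ so the Beta density stays bounded at the endpoints; the values of `a`, `b`, `n` and the grid spacing `h` are arbitrary illustrative choices. Unlike the integrals above, it keeps the normalizing constant $1/B(\alpha,\beta)$.

```matlab
% Grid approximation of the density of a sum of n i.i.d. Beta(a,b) variables
a = 2; b = 2; n = 3;                  % illustrative values, a,b >= 1 assumed
h = 1e-3;                             % grid spacing
y = 0:h:1;
g = y.^(a-1) .* (1-y).^(b-1) / beta(a,b);   % normalized Beta(a,b) PDF on [0,1]

f = g;                                % density of X_1
for m = 2:n
    f = conv(f, g) * h;               % Riemann-sum version of the convolution integral
end
x = (0:numel(f)-1) * h;               % grid on the support [0, n] of X_n

area  = trapz(x, f);                  % should be close to 1
meanX = trapz(x, x.*f);               % should be close to n*a/(a+b)
```

Dividing `x` by `n` (and multiplying `f` by `n`) turns this into an approximate density of the sample mean.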
For integer $\alpha$ and $\beta$, terms like $((x-y)y)^{\alpha-1}$ and $((1-x+y)(1-y))^{\beta-1}$ expand into polynomials in $y$, so the integral is straightforward to solve.
For example:
$$\begin{array}{rcl} f_X(x;2,2,2) &=& \begin{cases} \frac{1}{30} x^3(x^2-5x+5) & \quad \text{if $0 \leq x \leq 1$} \\ \frac{1}{30}(2-x)^3(x^2+x-1) & \quad \text{if $1 \leq x \leq 2$} \end{cases}\\ \\ f_X(x;2,3,3) &=& \begin{cases} \frac{1}{630} x^5(x^4-9x^3+30x^2-42x+21) & \quad \text{if $0 \leq x \leq 1$} \\ \frac{1}{630}(2-x)^5(x^4+x^3-2x+1) & \quad \text{if $1 \leq x \leq 2$} \end{cases} \end{array}$$
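These polynomials can be checked against the integral numerically. Here is a small sketch for the $\alpha=\beta=2$ case; the test points `xs` and the helper names `kern`, `I` and `p` are made up for the illustration. Note that the integrals as written leave out the factor $1/B(\alpha,\beta)^2$ (here $36$), so the last line restores it before checking that the total mass is 1.

```matlab
% Check the closed form for n = 2, alpha = beta = 2 against numerical integration
a = 2; b = 2;
kern = @(y,x) ((x-y).*y).^(a-1) .* ((1-x+y).*(1-y)).^(b-1);  % integrand for n = 2
I = @(x) integral(@(y) kern(y,x), max(0,x-1), min(1,x));     % limits of the two pieces

% Piecewise polynomial from the formula above
p = @(x) (x<=1).*x.^3.*(x.^2-5*x+5)/30 + (x>1).*(2-x).^3.*(x.^2+x-1)/30;

xs  = [0.3 0.8 1.0 1.4 1.9];
err = arrayfun(@(x) abs(I(x) - p(x)), xs);
maxerr = max(err);                    % should be at the level of the integration tolerance

% Restore the omitted constant 1/B(a,b)^2 to get a proper density
total = integral(p, 0, 2) / beta(a,b)^2;   % should be close to 1
```

The same check works for $\alpha=\beta=3$ after swapping in the second polynomial and setting `a = 3; b = 3`.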
The solution for integer values of $\alpha$ and $\beta$ will be a spline as well. Possibly this could be cast in some nice (or more likely not so nice) formula for more general situations (not just $n=2$ with $\alpha=\beta=2$ or $\alpha=\beta=3$). But at this point one needs quite a few cups of coffee, or better yet an infusion, to tackle this stuff.