As you've described the problem, ${\boldsymbol Y} = \{Y_1, Y_2, Y_3, Y_4\}$ will have a multivariate normal distribution with mean ${\boldsymbol \mu} = (\beta, \beta, \beta, \beta)'$ and covariance matrix
$$ \Sigma = \sigma^2 \left( \begin{array}{cccc}
2 & 0 & 1 & 1 \\
0 & 2 & 1 & 1 \\
1 & 1 & 3 & 1 \\
1 & 1 & 1 & 3 \\ \end{array} \right) $$
Normally this type of covariance structure model would require software like MPLUS; it may be possible to "trick" lme into fitting a model like this, but the model is simple enough to "build your own".
I'm not sure about getting the unbiased minimum variance estimator (although I'm sure the ordinary sample mean would be competitive), but I can describe how to get the maximum likelihood estimator (MLE), which is desirable for the reasons mentioned by Michael Chernick. The log-likelihood for a single observation of ${\boldsymbol Y}$ is
$$L(\beta, \sigma^2) = \log \left( \frac{1}{(2\pi)^2 |\Sigma|^{1/2}} \right) -\frac{1}{2} ({\boldsymbol Y}-{\boldsymbol \mu})' \Sigma^{-1} ({\boldsymbol Y}-{\boldsymbol \mu}) $$
which is a function of $\beta$ and $\sigma^2$ only, since ${\boldsymbol \mu}$ depends only on $\beta$ and $\Sigma$ depends only on $\sigma^2$. We sum this over the observations and maximize the resulting function over $(\beta, \sigma^2)$ to get the MLE.
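Written out, for $N$ independent observations ${\boldsymbol Y}_1, \ldots, {\boldsymbol Y}_N$ the summed log-likelihood is
$$ \ell(\beta, \sigma^2) = \sum_{i=1}^{N} \left[ \log \left( \frac{1}{(2\pi)^2 |\Sigma|^{1/2}} \right) -\frac{1}{2} ({\boldsymbol Y}_i-{\boldsymbol \mu})' \Sigma^{-1} ({\boldsymbol Y}_i-{\boldsymbol \mu}) \right] $$
(As an aside on the minimum variance question: writing $\Sigma = \sigma^2 s$, where $s$ is the known matrix displayed above, the generalized least squares estimator $\hat{\beta} = ({\boldsymbol 1}' s^{-1} {\boldsymbol 1})^{-1} {\boldsymbol 1}' s^{-1} \bar{\boldsymbol Y}$, with $\bar{\boldsymbol Y}$ the sample mean vector and ${\boldsymbol 1}$ a vector of ones, is the best linear unbiased estimator of $\beta$, and it does not involve $\sigma^2$ at all since the scalar cancels.)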
I'll use the dmnorm() function from the R package mnormt to do this, and give a rather crudely programmed example:
library(mnormt)  # provides rmnorm() and dmnorm()
set.seed(1234)
N <- 100
# covariance structure, up to the scale factor sigma^2
s <- matrix(c(2, 0, 1, 1,
              0, 2, 1, 1,
              1, 1, 3, 1,
              1, 1, 1, 3), 4, 4, byrow = TRUE)
# generate data where true values are beta = 1, sigma^2 = 3
y <- list()
for(i in 1:N) y[[i]] <- rmnorm(1, mean = rep(1, 4), varcov = 3*s)
# negative log-likelihood; P[1] is beta, P[2] is sigma squared
L <- function(P)
{
   # crude barrier to prevent sigma squared from going negative
   if( P[2] <= 0 ) return(Inf)
   like <- 0
   for(i in 1:N)
   {
      like <- like + dmnorm(y[[i]], mean = rep(P[1], 4), varcov = P[2]*s, log = TRUE)
   }
   # negated because optim() minimizes by default
   return(-like)
}
# choose arbitrary starting values of beta = 1, sigma^2 = 1 for the optimization
optim(c(1,1), L)$par
[1] 0.9109401 3.0786393
These estimates are reasonably close to the true values $\beta = 1$ and $\sigma^2 = 3$ used to generate the data.
You can get approximate confidence intervals either by bootstrapping or by using the Fisher information, which requires second derivatives of the log-likelihood, obtained either numerically (optim() returns the Hessian at the optimum if you call it with hessian=TRUE) or analytically; for the latter you may find this thread helpful.
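Here is a minimal sketch of the numerical route, reusing L and the simulated data from above; it assumes the usual large-sample normal approximation for the MLE applies here.
fit <- optim(c(1,1), L, hessian = TRUE)
# the Hessian of the negative log-likelihood is the observed Fisher
# information; its inverse approximates the covariance matrix of the MLE
se <- sqrt(diag(solve(fit$hessian)))
# approximate 95% Wald intervals for (beta, sigma^2)
cbind(estimate = fit$par, lower = fit$par - 1.96*se, upper = fit$par + 1.96*se)
The bootstrap alternative is analogous: resample the list y with replacement, re-run the optimization on each resample, and take quantiles of the resulting estimates.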