I am running an experiment in which I draw samples from a multivariate Bernoulli distribution (here taking values -1 or +1) with a single correlation coefficient (i.e., the same correlation for every pair). For each draw I take the absolute value of the mean of the Bernoulli realizations, and then I average that quantity across draws. My interest in the absolute value of the mean is business-driven.
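In symbols, the quantity I am after can be written as below. This is only a sketch of the setup: the names X_i, p and A_n are notation introduced here for illustration (p plays the role of `sp` in the code that follows), and the n = 2 case with p = 1/2 is included purely as a sanity check on the simulation, since there the absolute mean is 1 when the two variables agree and 0 when they disagree.

```latex
% Notation (X_i, p, A_n) is introduced here for illustration only.
\[
X_i \in \{-1, +1\}, \qquad P(X_i = +1) = p, \qquad
\operatorname{Corr}(X_i, X_j) = \rho \quad (i \neq j)
\]

% Target quantity: expected absolute value of the mean of n such variables.
\[
A_n(\rho) = \mathbb{E}\left| \frac{1}{n} \sum_{i=1}^{n} X_i \right|
\]

% Sanity check for n = 2, p = 1/2 (zero mean, unit variance), where
% E[X_1 X_2] = \rho = P(X_1 = X_2) - P(X_1 \neq X_2) = 2 P(X_1 = X_2) - 1.
\[
P(X_1 = X_2) = \frac{1 + \rho}{2}, \qquad
A_2(\rho) = 1 \cdot \frac{1 + \rho}{2} + 0 \cdot \frac{1 - \rho}{2} = \frac{1 + \rho}{2}
\]
```

So for rho = 0.2 and n = 2 the simulated value should come out close to 0.6. What I am missing is the analogue for general n.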
Some code might help illustrate the experiment:
library(data.table)
library(bindata)

# Simulate n.draws draws of n equicorrelated +/-1 variables and return the
# average, across draws, of the absolute value of the per-draw mean.
sim.agreement <- function(n = 2, rho = 0.2, sp = 0.5, n.draws = 1e5) {
  # Exchangeable correlation matrix: 1 on the diagonal, rho off the diagonal
  bincorr <- (1 - rho) * diag(n) + rho
  tmp <- as.data.table(rmvbin(n.draws, margprob = rep(sp, n), bincorr = bincorr))
  # Recode the 0/1 draws to -1/+1
  for (i in names(tmp)) tmp[, (i) := ifelse(get(i) == 0, -1, get(i))]
  tmp[, sim := 1:.N]
  tmp <- melt(tmp, id.vars = "sim", value.factor = FALSE)
  # |mean| within each draw, then the average across draws
  tmp <- tmp[, .(agreement = abs(mean(value))), by = .(sim)]
  tmp[, .(mean.agreement = mean(agreement), n = n)]
}

# Run the simulation over a grid of n (2 to 6) and rho (0.2 to 0.6)
out <- rbindlist(lapply(2:6, function(n) {
  rbindlist(lapply(seq(0.2, 0.6, 0.1), function(rho) {
    sim.agreement(n = n, rho = rho)[, rho := rho]
  }), use.names = TRUE, fill = TRUE)
}), use.names = TRUE, fill = TRUE)
Thus, I can map each correlation coefficient to the expected value of the absolute mean. Below is a table for some values of rho (rows) and n (columns):
Question: Given a correlation coefficient (fixed across pairs) and the number of Bernoulli variables, is there an analytical way to get the expected absolute value of the mean?