Education data was just released, providing counts of test results in bins corresponding to letter grades. The bins are given by a partition $m=[m_0,m_1,\dots,m_k]$ of the numerical grade scale, and the frequency within each bin by $z_j=\#\{x_i \mid m_{j-1}\leq x_i<m_j\},\ j=1,\dots,k$, where $x_i,\ i=1,\dots,n$, are the individual student scores. Only $m$ and $(z_1,\dots,z_k)$ are reported; the raw scores $\{x_1,\dots,x_n\}$ are not released.
Of course one could calculate the overall mean by assuming each bin's scores average to its midpoint, $\bar\mu_j=\frac{m_{j-1}+m_j}{2}$, and then aggregating $$\bar\mu=\frac1n \sum_{j=1}^k z_j \bar\mu_j.$$ This implicitly assumes that the data are symmetrically distributed (e.g., uniformly) within each bin, which is unreasonable given the unimodal nature of the overall distribution.
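For concreteness, here is a minimal sketch of that midpoint estimator in Python; the edges `m` and counts `z` below are made-up illustration values, not real data:

```python
import numpy as np

# Made-up illustration values: letter-grade cut points and bin counts.
m = np.array([0, 60, 70, 80, 90, 100])  # bin edges m_0, ..., m_k
z = np.array([8, 15, 30, 32, 15])       # counts z_1, ..., z_k
n = z.sum()

# Bin midpoints (m_{j-1} + m_j) / 2 and the midpoint-based mean.
mu_bin = (m[:-1] + m[1:]) / 2
mu_bar = (z * mu_bin).sum() / n
```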
A similarly naïve formula for the variance, $$\text{Var}(x)=\frac1{n-1} \sum_{j=1}^k z_j(\bar\mu_j-\bar\mu)^2,$$ would underestimate the true variance, since it ignores the within-bin spread entirely. Assuming a uniform distribution within each bin could be an improvement, but it still ignores the fact that the probability mass within each bin is likely higher at the boundary closer to the overall mean and lower at the boundary farther away from it.
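Continuing the sketch above, the naive variance and the uniform-within-bin refinement (adding each bin's uniform variance $(m_j-m_{j-1})^2/12$ via the law of total variance) would be:

```python
# Naive variance: treats all mass as sitting at the bin midpoints,
# i.e., captures only the between-bin component of the variance.
var_naive = (z * (mu_bin - mu_bar) ** 2).sum() / (n - 1)

# Uniform-within-bin refinement: by the law of total variance, add the
# average within-bin variance, (m_j - m_{j-1})^2 / 12 for a uniform bin.
widths = np.diff(m)
var_uniform = var_naive + (z * widths ** 2 / 12).sum() / n
```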
My hunch is that this requires either a parametric assumption about the overall distribution (which, given the data, I am reluctant to make) or some kind of kernel estimate. This seems like a fairly standard problem. Does anyone have a solution?
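For what it's worth, here is what the parametric route might look like, purely as an illustration: a binned maximum-likelihood fit under a (probably inappropriate) normal assumption, conditioning the model on the observed range $[m_0,m_k]$. It reuses `m`, `z`, `mu_bar`, and `var_uniform` from the sketches above and assumes `scipy` is available.

```python
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_lik(theta):
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)           # keep sigma positive
    # Model probability of each bin [m_{j-1}, m_j), conditioned on the
    # observed range [m_0, m_k] since all n scores fall inside it.
    p = np.diff(norm.cdf(m, loc=mu, scale=sigma))
    p = np.clip(p / p.sum(), 1e-12, None)  # guard against log(0)
    return -(z * np.log(p)).sum()       # multinomial log-likelihood

# Start from the crude binned estimates of the mean and variance.
res = minimize(neg_log_lik, x0=[mu_bar, 0.5 * np.log(var_uniform)])
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
```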