I have some microarray data (~15 samples) which I've clustered via pam over a range of values of k (the number of clusters), and I want to find the optimal k using BIC.
I basically want to re-implement the BIC score from the x-means paper [1], and this stats.stackexchange post answered some basic questions. But it seems that their definition of $\sigma$ is for the unidimensional case. How would I calculate the covariance matrix for my multidimensional dataset to plug into the multivariate Gaussian log-likelihood function?
I could be missing something obvious, but I can't seem to find a reference to explain the multivariate case for cluster models. I can add a reproducible example if needed.
Update: Here's the formula for the variance:

$$ \sigma^2 = \frac{1}{R-K}\sum_{i}(x_i - \mu_{(i)})^2 $$

Here, $x_i$ is a sample point and $\mu_{(i)}$ is the center of the cluster that the sample belongs to. In the multivariate case, a point is a vector of size $n$ (for example, row $i$ of the data matrix), so the mean $\mu_{(i)}$ should also be a $1 \times n$ vector. How, then, do they get a single number for the variance?
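To make the question concrete, here is a minimal sketch of one possible reading of the formula, assuming that $(x_i - \mu_{(i)})^2$ stands for the squared Euclidean distance $\lVert x_i - \mu_{(i)} \rVert^2$, i.e. one variance shared by all clusters and all dimensions (a spherical model). That reading is an assumption on my part, and some treatments additionally divide by the dimension $n$. The function name `pooled_spherical_variance` and the toy data are hypothetical and only for illustration; with pam the centers would be medoids rather than means.

```python
import numpy as np

def pooled_spherical_variance(X, labels, centers):
    """Scalar variance: sum of squared Euclidean residuals divided by (R - K).

    Assumes a single variance shared by all clusters and dimensions;
    this is an interpretation of the formula, not a confirmed definition.
    """
    X = np.asarray(X, dtype=float)              # shape (R, n): R points, n features
    labels = np.asarray(labels)                 # shape (R,): cluster index per point
    centers = np.asarray(centers, dtype=float)  # shape (K, n): one center per cluster
    R, K = X.shape[0], centers.shape[0]
    if R <= K:
        raise ValueError("need more points than clusters")
    residuals = X - centers[labels]             # each point minus its own center
    return float(np.sum(residuals ** 2) / (R - K))

# Toy data (made up): 4 points in 2 dimensions, 2 clusters.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
labels = np.array([0, 0, 1, 1])
centers = np.array([[1.0, 0.0], [10.0, 11.0]])

sigma2 = pooled_spherical_variance(X, labels, centers)
print(sigma2)
```

Under this reading the vector residual collapses to a scalar through the sum over coordinates, which would explain the single number, but it says nothing about a full covariance matrix.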
[1] Pelleg & Moore, X-means: Extending K-means with Efficient Estimation of the Number of Clusters.