
I have a data set and want to know whether the observations fall into 1, 2 or 3 groups.

I started exploring the question by using k-means in MATLAB. Judging by eye from each point's distance to its cluster centroid, it looks like I might have two clusters, but I would like to be able to back this up with some statistics.

I have read that using the Bayesian information criterion (BIC) to compare the candidate numbers of clusters might be a good way to do this.
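For reference, the definition of BIC found in most textbooks is given below. The symbols are not from the original post: $\hat{L}$ is the maximized likelihood of the fitted model, $p$ is its number of free parameters, and $n$ is the number of observations. With this sign convention a lower BIC is preferred; some software reports the negative of this quantity, so it is worth checking the convention before comparing values.

$$
\mathrm{BIC} = -2 \ln \hat{L} + p \ln n
$$

Because BIC needs a likelihood, it applies most directly to model-based clustering (for example, a Gaussian mixture) rather than to plain k-means.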

I am new to clustering, so when I read the answers to similar questions, which refer to research papers on the topic, I just cannot follow them. I was hoping that someone could give me some basic information on whether BIC or something else might be suitable, along with some MATLAB code (computing BIC or something similar) with explanations for dummies (I am also new to MATLAB), so that I can use it.
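A minimal sketch of the kind of code being asked about might look like the following. It is an illustration under stated assumptions, not a definitive recipe: it assumes the Statistics and Machine Learning Toolbox is available (for `fitgmdist`), it uses made-up synthetic data in a placeholder matrix `X`, and it fits Gaussian mixture models instead of k-means, because BIC requires a likelihood.

```matlab
% Sketch: compare 1, 2 and 3 clusters by BIC using Gaussian mixture models.
% Assumes the Statistics and Machine Learning Toolbox (fitgmdist).
% X is hypothetical synthetic data; replace it with your own n-by-d matrix
% (rows = observations, columns = variables).
rng(1);                                   % make the random starts reproducible
X = [randn(100, 2); randn(100, 2) + 4];   % placeholder data: two separated blobs

maxK = 3;                                 % largest number of clusters to try
bic  = zeros(1, maxK);
for k = 1:maxK
    % Fit a k-component Gaussian mixture. Several replicates guard against
    % poor local optima; a small regularization value keeps the estimated
    % covariance matrices invertible.
    gm = fitgmdist(X, k, 'Replicates', 5, 'RegularizationValue', 1e-6);
    bic(k) = gm.BIC;                      % MATLAB's convention: lower is better
end

[~, bestK] = min(bic);                    % index of the smallest BIC
fprintf('BIC for k = 1..%d: %s\n', maxK, mat2str(bic, 6));
fprintf('Number of clusters with the lowest BIC: %d\n', bestK);
```

The BIC values matter only relative to each other on the same data. A small gap between two candidate values of k is weak evidence, so it is worth looking at the size of the differences and not only at which k is smallest.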

Glen_b
Luci Lu
  • 2
    Here is how BIC and AIC are computed (in SPSS) for clustering: http://stats.stackexchange.com/q/55147/3277 – ttnphns Oct 04 '13 at 20:44
  • Welcome to the site, @Lucia. I'm sympathetic to your desire for simple code, but asking for code is off-topic for CV (see our [help page](http://stats.stackexchange.com/help)); your request might be better sent to MATLAB's tech support. If you are having difficulty understanding the ideas from another CV thread on this topic, link to the thread & ask about the part that you don't understand. Otherwise, this Q may need to be closed. – gung - Reinstate Monica Oct 04 '13 at 21:29
  • 6
    This question appears to be off-topic because it is about asking for code. – gung - Reinstate Monica Oct 04 '13 at 21:29
  • 2
    I think there are two parts to this question: (a) "I want to back up my by eye judgement with statistics" + "I've read using BIC might be good" + "I just cannot follow answers and papers on using BIC for this" + "hoping for explanations for dummies" and (b) "pass me MATLAB code for computing BIC or such". (a) is on topic, (b) is asking for code. I think the post could be much better worded to indicate the statistical content, but even as it stands I think it's on topic. – Glen_b Oct 04 '13 at 23:30
  • Welcome to the site. If you can separate your question into two, one asking about BIC and clustering and the other asking for code, then the first one could stay here. – Peter Flom Oct 14 '13 at 10:10

0 Answers