I'm running a bunch of experiments with randomly picked "knobs", and I'm recording various event types and the times at which they occurred during each experiment. I'm particularly interested in getting a good variety of events happening simultaneously, so I process the event timings to create a matrix that shows the number of times two events happened near each other in time, like this:
      e1   e2   e3   e4   e5
e1
e2     6
e3    11    2
e4     0   11    4
e5     1   14    1   15
My goal is to cluster the experiments based on the above data, finding big clusters that produce similar data (so I can run fewer of them) and finding outliers/small clusters (so that I can run more of them and even things out).
What would some appropriate clustering algorithms be to deal with data like this?
Using what I'm familiar with, I could normalize all experiments and then compare the matrices and calculate the distance between any two experiments... then use MDS to convert to 2D locations, and use DBSCAN to cluster. That, however, seems like a lot of steps where data can turn from good to useless if I'm not carefully tuning each step.
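To make the first two steps concrete, here is a minimal sketch of what I mean by normalizing and then computing the distance between any two experiments. The function names, the choice to divide each matrix by its total count, the Euclidean distance, and the second and third experiments are all assumptions made up for illustration, not something I've settled on. MDS and DBSCAN would then run on the resulting distance matrix.

```python
import math

def flatten_lower(matrix):
    """Return the strictly lower-triangular entries of a square matrix as a flat list."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

def normalize(counts):
    """Scale counts so they sum to 1 (assumption: only relative frequencies matter)."""
    total = sum(counts)
    return [c / total for c in counts] if total else list(counts)

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(experiments):
    """Build the full N x N distance matrix between experiments."""
    vectors = [normalize(flatten_lower(m)) for m in experiments]
    n = len(vectors)
    return [[distance(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

# The example matrix from above (diagonal and upper triangle filled with 0).
exp_a = [
    [0, 0, 0, 0, 0],
    [6, 0, 0, 0, 0],
    [11, 2, 0, 0, 0],
    [0, 11, 4, 0, 0],
    [1, 14, 1, 15, 0],
]
# Hypothetical second experiment: same proportions, twice the counts.
exp_b = [[2 * v for v in row] for row in exp_a]
# Hypothetical third experiment: every co-occurrence is between e4 and e5.
exp_c = [[0] * 5 for _ in range(5)]
exp_c[4][3] = 65

dist = pairwise_distances([exp_a, exp_b, exp_c])
for row in dist:
    print(["%.3f" % d for d in row])
```

With this normalization, exp_a and exp_b end up at distance zero because only the proportions differ, while exp_c sits far from both. Whether that is the right notion of "similar" is exactly the kind of per-step tuning I'm worried about.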
Is there some simpler methodology to determine the similarity of a bunch of matrices, and highlight those that are most dissimilar from the others?
Update: Adding more clarity (hopefully :) ) To simplify things, let's ignore what the matrices represent and just say that I have N observations, where each has a 2D set of attributes. How do I cluster the observations, with the goal of finding those that are the most different from the others?