2

I am in the final year of my undergraduate program and I am doing my capstone project related to k-means clustering. I am trying to get good results on my dataset by optimizing the k-means algorithm using the Newton-Rapshon method in R. The Newton-Raphson algorithm is a root solving function which takes an equation as an argument and gives out the approximate root. I can't understand how to integrate that with the k-means algorithm for optimization purposes. Can anyone help me out with this, it would be much appreciated.

  • I am in agreement with Dougal's answer. N-R + k-means wouldn't be a great combination. If you're really settled on trying to look at k-means, I suggest reading through this post first http://stats.stackexchange.com/questions/133656/how-to-understand-the-drawbacks-of-k-means/133694#133694 It should give you appropriate context about the limitations of k-means, then maybe you can think of a project where you try to "optimize" k-means. – Jon Feb 01 '17 at 17:40

1 Answers1

6

The k-means algorithm is a specific algorithm that doesn't really have a spot to plug in Newton-Raphson.

You could consider using Newton-Raphson to optimize the k-means problem: finding assignments of points to sets to minimize the total squared distance from their means. The objective function is$\DeclareMathOperator*{\argmin}{\mathrm{arg\,min}}$ $$ \argmin_{S_1, \dots, S_k} \sum_{i=1}^k \sum_{x \in S_i} \left\lVert x - \frac{1}{\lvert S_i \rvert} \sum_{x' \in S_i} x' \right\rVert^2 .$$ Unfortunately, since the sets $S_i$ are discrete objects, it's going to be hard to phrase the problem in a way that Newton-Raphson can work on, since it's for functions on continuous input spaces.

I'm not sure why you decided to use Newton-Raphson – it's not a very natural fit. If you're just trying to analyze some data, use an existing k-means package. But if you're really dedicated to doing this idea, you could try a relaxation to soft assignments, like in a Gaussian mixture model. The standard thing to do there would be expectation maximization, but Newton-Raphson would also work.

Danica
  • 21,852
  • 1
  • 59
  • 115