The Geomblog has an interesting post on estimating...

2015-05-31T13:41:48.737-07:00

The Geomblog has an interesting post on estimating the number of clusters in a given dataset: http://blog.geomblog.org/2010/03/this-is-part-of-occasional-series-of.html

You might also like to look at a previous publication of mine, where I show how an exact optimum can be found for the elbow method when plotting the number of clusters against root mean squared error: http://eprints.qut.edu.au/53371/

I also think it would be quite feasible to incorporate an approach like X-means into these algorithms: https://www.cs.cmu.edu/~dpelleg/download/xmeans.pdf

Not sure if my previous comment was submitted. B...

2015-05-31T11:59:31.436-07:00

Not sure if my previous comment was submitted. Do you have any suggestions for a document clustering algorithm that comes close to the scale of Topsig but does not require the number of clusters to be specified in advance?

For example, affinity propagation.

Comments on Perplexing Permutations: Web Scale Document Clustering: Clustering 733 Million Web Pages
