Today I released ClusterEval 1.0. The program compares a clustering against a ground-truth set of categories using several cluster quality measures: Purity, Entropy, Negentropy, F1 and NMI. It also includes a novel approach called 'Divergence from a Random Baseline' that augments existing measures to correct for ineffective clusterings. It was used to evaluate clustering in the XML Mining track at INEX in 2009 and 2010, and will be used in the upcoming Social Event Detection task at MediaEval in 2013.
Further details describing the use and functionality of this software are available in the manual.
Complete details of the quality measures can be found in the paper 'Document Clustering Evaluation: Divergence from a Random Baseline'.
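To give a feel for what these measures do, here is a minimal Python sketch of Purity together with one possible reading of the random-baseline idea. It is not ClusterEval's code and does not use its interface. The function names `purity` and `divergence_from_random` are made up for illustration, and the baseline construction (shuffling cluster assignments so that cluster sizes stay the same, then subtracting the average baseline score) is my assumption; the paper has the actual definition.

```python
import random
from collections import Counter


def purity(clusters, labels):
    """Fraction of documents that carry the majority label of their cluster."""
    counts_by_cluster = {}
    for cluster_id, label in zip(clusters, labels):
        counts_by_cluster.setdefault(cluster_id, Counter())[label] += 1
    majority_total = sum(
        counts.most_common(1)[0][1] for counts in counts_by_cluster.values()
    )
    return majority_total / len(labels)


def divergence_from_random(metric, clusters, labels, trials=100, seed=0):
    """Score of the real clustering minus the mean score of shuffled ones.

    ASSUMPTION: the baseline keeps the cluster sizes of the real clustering
    and reassigns documents at random. See the paper for the real definition.
    """
    rng = random.Random(seed)
    shuffled = list(clusters)
    baseline_total = 0.0
    for _ in range(trials):
        rng.shuffle(shuffled)  # same cluster sizes, random membership
        baseline_total += metric(shuffled, labels)
    return metric(clusters, labels) - baseline_total / trials


labels = ["a", "a", "a", "b", "b", "b"]
matching = [0, 0, 0, 1, 1, 1]    # clusters line up with the categories
singletons = [0, 1, 2, 3, 4, 5]  # one document per cluster

print(purity(matching, labels))    # 1.0
print(purity(singletons, labels))  # 1.0, although this clustering is useless
print(divergence_from_random(purity, matching, labels))
print(divergence_from_random(purity, singletons, labels))  # 0.0
```

Putting every document in its own cluster gets a perfect Purity score without telling you anything about the data. Under the assumption above, a shuffled version of that clustering scores just as well, so the difference drops to zero, while the clustering that matches the categories keeps a positive score.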
The Social Event Detection task at MediaEval involves automated detection of social events from real-life social networks. If this sounds of interest to you, head over to the task description page and register.