Discovery of clusters in numeric data by a hybridization of an ant colony with the minimum distance classification

D. Steinberg, Laboratoire d'Informatique de Tours, Université de Tours, Tours, France

M. Slimane, Laboratoire d'Informatique de Tours, Université de Tours, Tours, France
Email: slimane@univ-tours.fr

Nicolas Monmarché, Laboratoire d'Informatique de Tours, Université de Tours, Tours, France
Email: monmarche@rabelais.univ-tours.fr

G. Venturini, Laboratoire d'Informatique de Tours, Université de Tours, Tours, France

We consider in this paper the problem of unsupervised clustering, where clusters must be found in a data set without a priori knowledge of the correct number of classes. We contribute to the study of clustering ants from the knowledge discovery point of view, with the aim of solving real-world problems. The method we present is based on the hybridization of a stochastic ant-based algorithm and the deterministic minimum distance classification. Each numerical data item is represented by an object. These objects are scattered over a 2D grid. Ant-like agents move over the grid and are allowed to pick up and drop objects, thus creating heaps of objects (i.e. classes). The method proceeds hierarchically: the ant-based algorithm is first used to create numerous small but homogeneous heaps. The minimum distance classification is then applied to this intermediate clustering. The ants are then used again, and this time are allowed to manipulate entire heaps of objects. The minimum distance classification is then applied one last time to refine the results.

This algorithm was tested on various numerical databases, both real-world and artificial. The results usually showed good convergence toward the correct number of classes, and low misclassification rates.
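
The minimum distance classification step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that heaps are plain lists of numeric vectors, that the distance is Euclidean, and that a heap is summarized by the mean of its objects; the names `centroid` and `minimum_distance_classification` are hypothetical. Every object is reassigned to the heap whose center is nearest, and this is repeated until no object moves. Heaps that become empty are dropped, so the number of classes can shrink.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length numeric vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def minimum_distance_classification(heaps, max_iter=100):
    """Reassign each object to the heap with the nearest centroid until stable.

    `heaps` is a list of lists of numeric vectors (an assumed representation
    of the heaps built by the ants). Euclidean distance is an assumption.
    """
    heaps = [list(h) for h in heaps if h]  # ignore empty heaps
    for _ in range(max_iter):
        centers = [centroid(h) for h in heaps]
        new_heaps = [[] for _ in centers]
        for heap in heaps:
            for x in heap:
                # Index of the closest heap center for this object.
                nearest = min(
                    range(len(centers)),
                    key=lambda k: math.dist(x, centers[k]),
                )
                new_heaps[nearest].append(x)
        new_heaps = [h for h in new_heaps if h]  # drop emptied heaps
        if new_heaps == heaps:  # no object changed heap: converged
            break
        heaps = new_heaps
    return heaps


if __name__ == "__main__":
    # Three rough heaps, as an ant phase might leave them (made-up data).
    initial = [
        [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
        [(5.1, 5.0), (4.9, 5.1)],
        [(0.0, 0.1)],
    ]
    for heap in minimum_distance_classification(initial):
        print(heap)
```

In this toy example the misplaced object (5.0, 5.0) joins the heap near (5, 5), the first heap empties and disappears, and two heaps of three objects remain.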