Abstract: This paper presents an overview of clustering and the clustering methods used in data mining. First, the measures used to determine whether two clusters are similar or dissimilar are defined. Next, the clustering methods are presented, divided into hierarchical, partitional, and evolutionary algorithms. Finally, clustering is performed on large data sets, and the challenges this raises are discussed.

 

Keywords: Clustering, similarity measures, clustering analysis, k-means.