International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)
A monthly peer-reviewed and refereed journal
ISSN (Online): 2278-1021 | ISSN (Print): 2319-5940 | Since 2012
Volume 5, Issue 4, April 2016

Document Clustering Analysis Based on Hybrid Clustering Algorithm

Neha Garg, R.K. Gupta

DOI: 10.17148/IJARCCE.2016.54190

Abstract: In today's era of the World Wide Web, the volume of digitized text documents is growing rapidly. With such a large collection of documents on the web, there is a need to group them into clusters. Document clustering plays an important role in navigating and organizing documents effectively. The k-means algorithm is the most commonly used document clustering algorithm, and it takes less computation time than matrix-based clustering algorithms. Its major drawback is that it is quite sensitive to the selection of initial cluster centroids. This article proposes a hybrid genetic k-means clustering algorithm that improves the quality of the clusters. The authors also compare the hybrid algorithm with the k-means algorithm on two different text document datasets. The experimental results show that the proposed method is more effective and converges to more accurate clusters than standard k-means.



Keywords: Document Clustering, Cosine Similarity, k-means, Genetic Algorithm, Purity measure.
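
As an illustration of the idea summarized in the abstract, the following is a minimal sketch of a genetic search over initial centroids for cosine-similarity k-means, together with the purity measure. It is not the authors' implementation: the abstract does not specify the chromosome encoding, fitness function, or genetic operators, so everything here is an assumption. Each chromosome is taken to be a set of k document indices used as initial centroids, fitness is taken to be the mean cosine similarity of documents to their assigned centroid, and the function names (`normalize`, `kmeans_cosine`, `genetic_kmeans`, `purity`) and default parameters are hypothetical.

```python
import numpy as np

def normalize(X):
    """Scale rows to unit length so that dot products equal cosine similarity."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)

def kmeans_cosine(X, init_centroids, iters=20):
    """Cosine-similarity k-means on unit-length rows X, from given initial centroids."""
    centroids = normalize(init_centroids.copy())
    for _ in range(iters):
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
        centroids = normalize(centroids)
    sims = X @ centroids.T
    # Assumed fitness: mean similarity of each document to its closest centroid.
    return np.argmax(sims, axis=1), float(np.mean(np.max(sims, axis=1)))

def genetic_kmeans(X, k, pop_size=10, generations=5, mutation_rate=0.2, seed=0):
    """Assumed hybrid scheme: a GA evolves sets of k seed documents (pop_size >= 4)."""
    rng = np.random.default_rng(seed)
    X = normalize(np.asarray(X, dtype=float))
    n = len(X)
    population = [rng.choice(n, size=k, replace=False) for _ in range(pop_size)]
    best_fit, best_labels = -np.inf, None
    for _ in range(generations):
        scored = []
        for chrom in population:
            labels, fit = kmeans_cosine(X, X[chrom])
            scored.append((fit, chrom, labels))
        scored.sort(key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_labels = scored[0][0], scored[0][2]
        # Selection: the fitter half survives unchanged.
        parents = [chrom for _, chrom, _ in scored[: pop_size // 2]]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2, replace=False)
            cut = int(rng.integers(1, k)) if k > 1 else 0
            # One-point crossover of two parents' seed indices.
            child = np.concatenate([parents[a][:cut], parents[b][cut:]])
            if rng.random() < mutation_rate:
                # Mutation: replace one seed with a random document index.
                child[rng.integers(k)] = rng.integers(n)
            children.append(child)
        population = parents + children
    return best_labels, best_fit

def purity(labels, truth):
    """Fraction of documents that belong to the majority class of their cluster."""
    labels, truth = np.asarray(labels), np.asarray(truth)
    total = 0
    for j in np.unique(labels):
        _, counts = np.unique(truth[labels == j], return_counts=True)
        total += counts.max()
    return total / len(labels)

if __name__ == "__main__":
    # Toy term-frequency vectors: two topics that use disjoint vocabulary.
    docs = np.array([[3, 1, 0, 0], [2, 2, 0, 0], [4, 1, 0, 0],
                     [0, 0, 1, 3], [0, 0, 2, 2], [0, 0, 1, 4]])
    truth = [0, 0, 0, 1, 1, 1]
    labels, fitness = genetic_kmeans(docs, k=2)
    print("labels:", labels, "fitness:", round(fitness, 3))
    print("purity:", purity(labels, truth))
```

In this sketch the genetic search only chooses where k-means starts, so each fitness evaluation is a full k-means run. That keeps the code short but makes it considerably more expensive than a single k-means run; other hybrid designs interleave a single k-means step with the genetic operators instead.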

How to Cite:

[1] Neha Garg, R.K. Gupta, “Document Clustering Analysis Based on Hybrid Clustering Algorithm,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2016.54190