International Journal of Advanced Research in Computer and Communication Engineering (a monthly peer-reviewed and refereed journal)
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 7, Issue 2, February 2018

A Non-Overlap based Cluster using Canopy and Parallel K-Means for MapReduce Framework

Summia Parveen, Amirjohn BeeBee, Thoufiya. M, Safana Jasmine

DOI: 10.17148/IJARCCE.2018.7243

Abstract: Traditional data mining algorithms run into severe bottlenecks when they deal with large data sets. This paper presents a technique for clustering large, high-dimensional data sets. The main idea is to use an inexpensive, approximate distance measure to efficiently partition the data into overlapping subsets called canopies. Once the canopies are built, the desired clustering is performed by measuring exact distances only between points that occur in a common canopy. Using canopies, large clustering problems that were formerly impossible become practical and efficient. K-Means is a typical distance-based clustering algorithm. Here, the canopy clustering algorithm is implemented as an efficient clustering technique by means of knowledge integration. Having studied canopy clustering and the K-Means paradigm of computing, we find the combination appropriate for implementing a clustering algorithm. This paper shows some advantages of canopy clustering over the K-Means clustering mechanism and proposes canopy clustering as a pre-clustering step for the K-Means method. We use Hadoop's MapReduce programming model to run K-Means clustering with canopy clustering. The experimental results show that the Canopy + K-Means algorithm runs faster than the K-Means algorithm alone. Both show a good speed-up ratio in a Hadoop environment, and Canopy + K-Means performs much better than K-Means.
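
The two-stage idea described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' Hadoop implementation: the function names, the choice of Manhattan distance as the inexpensive measure, the thresholds `t1` (loose) and `t2` (tight), and the use of canopy centers as initial K-Means centroids are all assumptions made for illustration. The sketch builds canopies with the cheap distance, then runs K-Means while computing exact Euclidean distances only to centroids whose canopy contains the point.

```python
from math import dist  # exact Euclidean distance (Python 3.8+)


def cheap_distance(a, b):
    """Assumed inexpensive measure: Manhattan (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_canopies(points, t1, t2):
    """Return a list of (center_index, member_indices); requires t1 > t2 >= 0."""
    if not t1 > t2 >= 0:
        raise ValueError("thresholds must satisfy t1 > t2 >= 0")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        d = {i: cheap_distance(points[center], points[i]) for i in remaining}
        # Every remaining point within the loose threshold joins this canopy.
        canopies.append((center, [i for i in remaining if d[i] <= t1]))
        # Points within the tight threshold leave the candidate list;
        # points between t2 and t1 stay and may join further canopies.
        remaining = [i for i in remaining if d[i] > t2]
    return canopies


def canopy_kmeans(points, t1, t2, max_iter=20):
    """K-Means seeded by canopy centers; returns (labels, centroids, canopies)."""
    canopies = build_canopies(points, t1, t2)
    centroids = [tuple(points[c]) for c, _ in canopies]
    # candidates[i] lists the canopies (hence centroids) that contain point i.
    candidates = [[] for _ in points]
    for k, (_, members) in enumerate(canopies):
        for i in members:
            candidates[i].append(k)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Exact distances are computed only to centroids sharing a canopy.
        new_labels = [
            min(candidates[i], key=lambda k: dist(points[i], centroids[k]))
            for i in range(len(points))
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in range(len(centroids)):
            assigned = [points[i] for i, lab in enumerate(labels) if lab == k]
            if assigned:  # keep the old centroid if no point was assigned
                centroids[k] = tuple(sum(col) / len(assigned) for col in zip(*assigned))
    return labels, centroids, canopies
```

In this sketch the number of clusters equals the number of canopies, so `t1` and `t2` take the place of choosing K by hand. In a MapReduce setting the canopy-building and assignment steps would be split across mappers and reducers, which is omitted here.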



Keywords: Abnormal Event Detection, Outlier Detection, Video Data Stream, Sparse Learning, Dynamic Detection.

How to Cite:

[1] Summia Parveen, Amirjohn BeeBee, Thoufiya. M, Safana Jasmine, “A Non-Overlap based Cluster using Canopy and Parallel K-Means for MapReduce Framework,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2018.7243