International Journal of Advanced Research in Computer and Communication Engineering (a monthly peer-reviewed and refereed journal)
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 4, Issue 6, June 2015

Distributed Computing Based Methods for Anomaly Analysis in Large Datasets

Remya G, Anuraj Mohan

DOI: 10.17148/IJARCCE.2015.4692

Abstract: Anomaly detection is one of the important domains in data mining. Both supervised and unsupervised learning methods are used for anomaly detection tasks. In this paper, emphasis is given to distance-based prediction of anomalies. We studied the traditional methods, which involve index-based, nested-loop and cell-based approaches to anomaly detection. As datasets become very large, the task of detecting anomalies becomes computationally expensive. With the push towards big data mining, it will become increasingly necessary to adapt existing anomaly detection algorithms to various distributed computing platforms. Our paper is a survey of the different strategies that can be adopted for anomaly analysis using distributed computing techniques. First, we studied the concept of the anomaly detection solving set, a subset of the input data set that serves as a model for predicting anomalies. The solving set contains just enough points to detect the top anomalies while considering only a subset of all the pairwise distances in the data set. Then we analysed the possibility of using the MapReduce framework for performing anomaly analysis. A MapReduce-based solving set algorithm for anomaly detection using the Hadoop framework is also proposed.
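
As a rough illustration of how distance-based anomaly scoring can be split into map and reduce steps, the following Python sketch may help. It is not the solving set algorithm proposed in the paper: it is an exhaustive nested-loop baseline that compares every point with every other point, and it runs in a single process rather than on Hadoop. The anomaly score used here (the sum of distances to the k nearest neighbours) and all function names (`map_partition`, `reduce_scores`, `top_anomalies`) are assumptions made for illustration only.

```python
import heapq
import math
from collections import defaultdict


def map_partition(points, partition, k):
    """Map step (hypothetical): for every point, emit its k smallest
    distances to the points whose indices are in this partition."""
    for i, p in enumerate(points):
        dists = [math.dist(p, points[j]) for j in partition if j != i]
        yield i, heapq.nsmallest(k, dists)


def reduce_scores(emitted, k):
    """Reduce step (hypothetical): merge the partial lists of each point,
    keep the k smallest distances overall, and sum them into a score."""
    merged = defaultdict(list)
    for i, partial in emitted:
        merged[i].extend(partial)
    return {i: sum(heapq.nsmallest(k, d)) for i, d in merged.items()}


def top_anomalies(points, k, n, num_partitions):
    """Return the indices of the n points with the largest scores."""
    indices = list(range(len(points)))
    # Round-robin split; on a cluster each partition would be one map task.
    partitions = [indices[s::num_partitions] for s in range(num_partitions)]
    emitted = []
    for part in partitions:
        emitted.extend(map_partition(points, part, k))
    scores = reduce_scores(emitted, k)
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_anomalies(data, k=2, n=1, num_partitions=2))
```

Merging per-partition k-smallest lists gives the same result as a single global pass, because the k nearest neighbours of a point must each appear among the k nearest within their own partition. A solving set approach, as described in the abstract, aims to avoid this exhaustive comparison by computing only a subset of the pairwise distances.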



Keywords: Anomaly, Distributed Computing, Map Reduce, Hadoop.

How to Cite:

[1] Remya G, Anuraj Mohan, “Distributed Computing Based Methods for Anomaly Analysis in Large Datasets,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2015.4692