Abstract: Big Data is a relatively new term used to describe the storage of huge, complex data sets. Various techniques have been proposed for mining such data. How big data can be mined remains an open research problem, especially in the domain of text mining. This paper presents an effective technique that includes preprocessing of a huge dataset (i.e., text mining) for finding the nearest (shortest-distance) neighbor. The dataset used here is drawn from Facebook. The proposed system provides a solution for storing and retrieving huge data that can be adapted to any environment.

Keywords: Dataset, Preprocessing, Clustering, Big Data, K-Means.