Abstract: In an age of automation, users expect an instant answer to any query, and a range of applications and software has emerged to meet that need. Information is central to human life and is available as documents, databases, multimedia resources, and other forms. In this project, we extract appropriate keywords from several input documents, match the extracted keywords against the available documents, and recommend suitable documents to the participants for reference. Document clustering typically involves examining hundreds of thousands of files. Much of the data in those files is unstructured text, which is difficult for computer examiners to analyze, so automated methods of analysis are of great interest. Algorithms for clustering documents can facilitate the discovery of new and useful knowledge from the documents under analysis. We propose a more efficient document clustering algorithm that improves document search and analysis and recommends the results best suited to the submitted query. The software accepts any number of documents and lets the user view all of them. After tokenization and clustering, it produces the extracted keywords with their frequencies (word counts) and recommends the resulting data sets to the user. This reduces the time and effort needed to find the required data.

Keywords: Keyword Extraction, Stop-Words Analysis and Removal, Stemming of Clusters, Data Clustering Techniques and Document Recommendation
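
The pipeline outlined in the abstract (tokenization, stop-word removal, keyword frequency counting, and matching keywords against documents) might be sketched roughly as follows. This is only an illustrative sketch and not the authors' algorithm: the stop-word list, the function names `extract_keywords` and `recommend`, the scoring rule, and the sample documents are all assumptions, and the stemming and clustering steps are omitted for brevity.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; the paper does not specify one).
STOP_WORDS = {"a", "an", "and", "are", "as", "for", "in", "is",
              "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(text, top_n=10):
    """Return up to top_n (keyword, frequency) pairs after stop-word removal."""
    counts = Counter(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts.most_common(top_n)


def recommend(query, documents, top_n=10):
    """Rank documents by the summed frequency of query terms among their keywords.

    Hypothetical scoring rule chosen for illustration only.
    """
    query_terms = {t for t in tokenize(query) if t not in STOP_WORDS}
    scored = []
    for name, text in documents.items():
        keywords = dict(extract_keywords(text, top_n))
        score = sum(keywords.get(term, 0) for term in query_terms)
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties broken by document name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Made-up sample documents for demonstration.
documents = {
    "doc1": "Clustering of documents groups similar documents together.",
    "doc2": "Stemming reduces words to a root form before clustering.",
    "doc3": "The weather is sunny with light wind.",
}

if __name__ == "__main__":
    print(extract_keywords(documents["doc1"]))
    print(recommend("clustering documents", documents))
```

Because no stemming is applied here, a query term such as "document" would not match the keyword "documents"; a stemming step, as named in the keywords, would be one way to close that gap.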


DOI: 10.17148/IJARCCE.2019.8120
