Abstract: Document summarization extracts the salient information from an original text document and presents it to the user as a concise summary. Such summaries are crucial for understanding and describing the content of a text. Extractive summarization selects the essential paragraphs, sentences, and other units of the original document and presents a summary that contains only parts of that document. The effectiveness of a summary depends on identifying and presenting the key entities in the document. The proposed system creates an extractive summary of multiple documents and lets the user assess the relevance of the contents of those documents. A user interface accepts a query over the set of documents and presents the most relevant documents in order of relevance. Simple machine learning algorithms are used for these tasks, and the performance evaluation of the system could support further research toward abstractive summarization using deep neural networks.
Keywords: Summarization, Machine Learning, Tokenization, Algorithms, spaCy
DOI: 10.17148/IJARCCE.2020.9327
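
To make the two ideas in the abstract concrete, the following is a minimal sketch of extractive summarization and query-based document ranking. It is not the authors' implementation: the abstract does not specify the scoring method, so word-frequency scoring is assumed here, and the Python standard library stands in for spaCy tokenization. The names `STOPWORDS`, `tokenize`, `split_sentences`, `summarize`, and `rank_documents` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and",
             "it", "that", "this", "for", "on", "with", "as", "by"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # document-wide word frequencies

    def score(sentence):
        # Average document frequency of the sentence's tokens.
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])  # restore document order
    return [sentences[i] for i in keep]


def rank_documents(query, documents):
    """Order document names by the share of their tokens that match the query."""
    query_terms = set(tokenize(query))
    scores = {}
    for name, text in documents.items():
        tokens = tokenize(text)
        hits = sum(1 for t in tokens if t in query_terms)
        scores[name] = hits / len(tokens) if tokens else 0.0
    return sorted(scores, key=scores.get, reverse=True)
```

Because `summarize` returns sentences copied verbatim from the input, the output contains only parts of the original document, which is the defining property of an extractive summary. A call such as `rank_documents("cats", {"a": "Cats chase mice.", "b": "The weather is mild today."})` places document `a` first because only that document contains the query term.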