Abstract: Plagiarism is the practice of presenting someone else's words or ideas as one's own. In many nations, plagiarism is considered a violation of moral rights. It has risen significantly as technology develops and Internet usage expands, and it is frequently seen in a variety of written work, including research papers, blogs, essays, and assignments. In this paper we employ two methods of finding plagiarized text. The first builds a plagiarism detector that compares a given answer text file against a source text file and, depending on the similarities between the two, labels the answer text file as original or plagiarized. A Support Vector Machine (SVM) is used to create the binary classification model that identifies plagiarism. The second creates a web application that identifies plagiarism in text, offering a sentence-by-sentence analysis with the percentage of plagiarism and a link to a potential source article. The application also includes a method to check for source code plagiarism within a directory.

Keywords: SVM, NLP, Machine learning, Plagiarism Detection, n-gram containment.
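
As an illustration of the n-gram containment feature named in the keywords, the following is a minimal Python sketch, not the authors' implementation. It assumes the common definition of containment: the number of n-grams shared by the answer and the source, divided by the number of n-grams in the answer. The function names `ngram_counts` and `containment`, the tokenization rule, and the default `n` are hypothetical choices made for this example.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count the word n-grams in text (lowercased, alphanumeric tokens only)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also occur in the source.

    Assumed definition: sum of the per-n-gram minimum counts shared by the
    two texts, divided by the total number of n-grams in the answer.
    """
    answer_counts = ngram_counts(answer, n)
    source_counts = ngram_counts(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        # Answer is empty or shorter than n tokens: nothing to compare.
        return 0.0
    shared = sum((answer_counts & source_counts).values())
    return shared / total


if __name__ == "__main__":
    source_text = "The cat sat on the mat"
    answer_text = "the cat sat"
    print(containment(answer_text, source_text, n=2))
```

A score near 1 means most of the answer's n-grams appear in the source. Scores computed for several values of `n` could serve as the input features of a binary classifier such as an SVM, which is one plausible reading of the pipeline the abstract describes.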

DOI: 10.17148/IJARCCE.2022.117114
