Abstract: Source code plagiarism has been a concern for many instructors in the computer science field, owing to the easy availability of content in the internet era. The tool developed here detects plagiarism in the source code of students learning programming languages, to meet the needs of teachers and help them monitor students' submissions. The tool currently supports the Java programming language. It works in three phases: tokenization, followed by N-gram representation of the source code, and then comparison using the Greedy String Tiling algorithm. The response time of the tool is one minute for 50 source code files of 75 lines of code (LOC) each. According to the research, the results given by the tool are ninety-nine percent correct. The aim of this project is to develop a tool for detecting plagiarism in both source code and generic text documents, overcoming the drawbacks of existing approaches.

Keywords: Plagiarism, Source Code, Text Document, Tokens, Detection Tool.
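
The three phases named in the abstract (tokenization, N-gram representation, Greedy String Tiling comparison) can be illustrated with a minimal Python sketch. It is not the tool's implementation. The token classes (`ID`, `NUM`), the keyword subset, the N-gram size `n=3`, the minimum match length, and the coverage-based similarity formula are all assumptions made for illustration, and comments and string literals are not handled.

```python
import re

# Assumed, partial keyword list; the tool's real token set is not given.
JAVA_KEYWORDS = {"class", "public", "static", "void", "int", "double",
                 "for", "while", "if", "else", "return", "new"}

# Identifiers/keywords, then digit runs, then any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(source):
    """Phase 1: map lexemes to tokens so that renaming identifiers has no effect."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in JAVA_KEYWORDS:
            tokens.append(lexeme)
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")    # every identifier becomes the same token
        elif lexeme[0].isdigit():
            tokens.append("NUM")   # every numeric literal becomes the same token
        else:
            tokens.append(lexeme)  # operators and punctuation are kept as-is
    return tokens

def ngrams(tokens, n):
    """Phase 2: overlapping windows of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def greedy_string_tiling(a, b, min_match=2):
    """Phase 3: basic Greedy String Tiling, without the Karp-Rabin speed-up.

    Returns tiles (start_in_a, start_in_b, length), found longest first.
    """
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_match = min_match
        matches = []
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k == max_match:
                    matches.append((i, j, k))
                elif k > max_match:
                    matches = [(i, j, k)]  # a longer match replaces shorter ones
                    max_match = k
        if not matches:
            break
        for i, j, k in matches:
            # Skip matches occluded by a tile placed earlier in this round.
            if not any(marked_a[i:i + k]) and not any(marked_b[j:j + k]):
                for d in range(k):
                    marked_a[i + d] = True
                    marked_b[j + d] = True
                tiles.append((i, j, k))
    return tiles

def similarity(src_a, src_b, n=3, min_match=2):
    """Assumed score: share of N-grams covered by tiles, between 0.0 and 1.0."""
    a = ngrams(tokenize(src_a), n)
    b = ngrams(tokenize(src_b), n)
    if not a or not b:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match))
    return 2 * covered / (len(a) + len(b))
```

In this sketch, `similarity("int sum = 0; for (int i = 0; i < n; i++) { sum += i; }", "int total = 0; for (int k = 0; k < limit; k++) { total += k; }")` evaluates to 1.0, because the two fragments differ only in identifier names and therefore produce identical token sequences.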