Abstract: This paper proposes a method for identifying Roman, Devanagari, Kannada, Tamil, Telugu and Malayalam scripts at the text block level, using features derived from the Correlation property of the Gray Level Co-occurrence Matrix (GLCM) and the multiresolution property of the Discrete Wavelet Transform (DWT) of input handwritten document text blocks. The two-dimensional DWT extracts spatial features, and the Correlation of the GLCM extracts texture features. The patterns in a handwritten text block typically comprise spatial texture primitives. The primary aim of this paper is therefore to show the effectiveness of the DWT and the Correlation of the GLCM in describing handwritten text blocks of six Indian scripts. Extensive experiments were conducted on a dataset of 100 text blocks per script, using bi-script and tri-script combinations of the six scripts, and script recognition was carried out with three classifiers: nearest neighbor (NN), linear discriminant analysis (LDA) and support vector machine (SVM). With the SVM classifier, the average script classification accuracies achieved for bi-script and tri-script combinations are 96.4333% and 93.9833%, respectively.

 

Keywords: Bilingual, Trilingual, Script Recognition, Discrete Wavelet Transform, Correlation of Gray Level Co-occurrence Matrix, Texture Features, Text Block Level, Nearest Neighbor, Linear Discriminant Analysis, Support Vector Machine Classifier.
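
The feature extraction outlined in the abstract (GLCM Correlation computed on DWT sub-bands of a text block) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the Haar wavelet, the single decomposition level, the 8 quantization levels, the two pixel offsets, and all function names are assumptions made for the example, since the abstract does not specify them.

```python
def haar_dwt2(img):
    """One level of a 2-D Haar DWT (assumed wavelet) on a list-of-lists image.
    Returns four sub-bands: approximation, row-difference, column-difference, diagonal."""
    rows, cols = len(img) // 2 * 2, len(img[0]) // 2 * 2
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        ra, rr, rc, rd = [], [], [], []
        for c in range(0, cols, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + d + e) / 2.0)
            rr.append((a + b - d - e) / 2.0)
            rc.append((a - b + d - e) / 2.0)
            rd.append((a - b - d + e) / 2.0)
        approx.append(ra); row_diff.append(rr); col_diff.append(rc); diag.append(rd)
    return approx, row_diff, col_diff, diag


def quantize(band, levels=8):
    """Map real-valued coefficients to integer gray levels 0..levels-1."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant band
    return [[min(levels - 1, int((v - lo) / span * levels)) for v in row]
            for row in band]


def glcm_correlation(q, levels=8, dr=0, dc=1):
    """Correlation of the (non-symmetric) GLCM of q for pixel offset (dr, dc)."""
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0
    for r in range(len(q)):
        for c in range(len(q[0])):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < len(q) and 0 <= c2 < len(q[0]):
                counts[q[r][c]][q[r2][c2]] += 1
                total += 1
    if total == 0:
        return 0.0
    p = [[v / total for v in row] for row in counts]
    idx = range(levels)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    if var_i == 0 or var_j == 0:
        return 1.0  # assumption: a constant texture is treated as fully correlated
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
    return cov / (var_i * var_j) ** 0.5


def script_features(img, levels=8):
    """Feature vector: GLCM Correlation of each sub-band at two assumed offsets."""
    feats = []
    for band in haar_dwt2(img):
        q = quantize(band, levels)
        for dr, dc in ((0, 1), (1, 0)):
            feats.append(glcm_correlation(q, levels, dr, dc))
    return feats
```

Under these assumptions, `script_features(block)` yields eight values per text block (four sub-bands times two offsets), which could then be passed to any of the classifiers named above.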