Abstract: This paper proposes a method for identifying Roman, Devanagari,
Kannada, Tamil, Telugu and Malayalam scripts at the text block level. The
method combines two features of the input handwritten document text blocks:
the Correlation property of the Gray Level Co-occurrence Matrix (GLCM) and the
multiresolution property of the Discrete Wavelet Transform (DWT). The
two-dimensional DWT extracts spatial features, and the Correlation of the GLCM
extracts texture features.
Handwritten text blocks typically exhibit patterns composed of spatial texture
primitives.
Therefore, the primary aim of this paper is to show the efficiency of DWT and
the Correlation of GLCM in describing handwritten text blocks of six Indian
scripts. Extensive experiments were conducted on a dataset of 100 text blocks
per script, using bi-script and tri-script combinations of the six scripts.
Script recognition is carried out using three classifiers: nearest neighbor
(NN), linear discriminant analysis (LDA) and support vector machine (SVM).
With the SVM classifier, the average script classification accuracy is
96.4333% for bi-script combinations and 93.9833% for tri-script combinations.
Keywords: Bilingual, Trilingual, Script Recognition, Discrete Wavelet Transform, Correlation of Gray Level Co-occurrence Matrix, Texture Features, Text Block Level, Nearest Neighbor, Linear Discriminant Analysis, Support Vector Machine Classifier.
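
As an illustration of the two feature types named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a text block is a list of rows of integer gray levels, uses a one-level Haar wavelet as a stand-in for the DWT (the abstract does not name the wavelet), and combines the values into a feature vector in a way chosen only for illustration. The function names glcm, glcm_correlation, haar_dwt2 and block_features are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_correlation(p):
    """Correlation property: covariance of (i, j) under p over sigma_i * sigma_j."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    mu_i = sum(i * p[i][j] for i, j in cells)
    mu_j = sum(j * p[i][j] for i, j in cells)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i, j in cells)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i, j in cells)
    if var_i == 0 or var_j == 0:
        return 1.0  # assumed convention for a constant block
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i, j in cells)
    return cov / math.sqrt(var_i * var_j)


def haar_dwt2(image):
    """One-level 2-D orthonormal Haar transform (assumes even height and width).

    Returns four sub-bands: approximation, row-difference detail,
    column-difference detail, and diagonal detail.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(image) - 1, 2):
        bands = ([], [], [], [])
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            e, d = image[r + 1][c], image[r + 1][c + 1]
            bands[0].append((a + b + e + d) / 2)
            bands[1].append((a + b - e - d) / 2)
            bands[2].append((a - b + e - d) / 2)
            bands[3].append((a - b - e + d) / 2)
        approx.append(bands[0])
        row_diff.append(bands[1])
        col_diff.append(bands[2])
        diag.append(bands[3])
    return approx, row_diff, col_diff, diag


def block_features(image, levels):
    """Illustrative feature vector: GLCM correlation at two offsets plus
    the mean absolute value of each Haar detail sub-band."""
    feats = [
        glcm_correlation(glcm(image, levels, dx=1, dy=0)),
        glcm_correlation(glcm(image, levels, dx=0, dy=1)),
    ]
    for band in haar_dwt2(image)[1:]:
        values = [abs(v) for row in band for v in row]
        feats.append(sum(values) / len(values))
    return feats
```

A feature vector of this kind, computed for each text block, could then be passed to any of the three classifiers mentioned above. In practice, established libraries would normally replace these hand-written loops.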