Analysis of Phonetic Matching Approaches for Indic Languages
Sandeep Chaware, Srikantha Rao
Abstract: Phonetic matching plays an important role in multilingual information retrieval, where data is handled in multiple languages. Users need information in their local language, which may differ from the language in which the data is maintained. In such an environment, we need a system that matches strings phonetically, whether the match is exact or approximate. Many kinds of errors and variations could be considered; here we consider typographical errors, spelling errors that differ in a vowel, and the matching of compound words. There are many existing approaches, such as soundex, q-gram, and phoenix, but they may produce ambiguous matches or may not be applicable to Indian languages. In this paper, we propose three approaches, namely Soundex, Q-gram, and Indic-Phonetic, and generate test cases such as Hindi or Marathi (LOS), differ in vowel, and compound words for Hindi and Marathi. We evaluated the three approaches for Hindi and Marathi and found that the Indic-Phonetic approach is more efficient and accurate than the other two.
Keywords: Soundex, Q-gram, Indic-phonetic, threshold, phonetic matching
How to Cite:
[1] Sandeep Chaware, Srikantha Rao, “Analysis of Phonetic Matching Approaches for Indic Languages,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)
