Annotating Assamese Corpus using the Standard POS Tagset
Abstract: Assamese is the official language of the Indian state of Assam and has about 25 million native speakers. However, being a regional language, it still lacks language resources such as corpora, language technology tools, and guidelines. A digitized Assamese corpus tagged at the Part-of-Speech (POS) level can help tremendously in Natural Language Processing (NLP) applications, linguistic studies, and other linguistic research. The development of an annotated Assamese corpus has therefore become an unavoidable task.
Keywords: Assamese, POS, BIS, NLP.
How to Cite:
[1] Bipul Roy, Bipul Syam Purkayastha, “Annotating Assamese Corpus using the Standard POS Tagset,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2016.5879
