Annotating Assamese Corpus using the Standard POS Tagset

Bipul Roy; Bipul Syam Purkayastha

doi:10.17148/IJARCCE.2016.5879

Volume 5, Issue 8, August 2016

Abstract: Assamese is the official language of the Indian state of Assam and has about 25 million native speakers. However, being a regional language, it still lacks language resources such as corpora, language technology tools, and annotation guidelines. A digitized Assamese corpus tagged at the Part-of-Speech (POS) level can help tremendously in Natural Language Processing (NLP) applications, linguistic studies, and linguistic research. The development of an annotated Assamese corpus has therefore become an unavoidable task.

Keywords: Assamese, POS, BIS, NLP.

How to Cite:

[1] Bipul Roy, Bipul Syam Purkayastha, “Annotating Assamese Corpus using the Standard POS Tagset,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2016.5879