Comparison of Extracting Content with Minimization of Lexeme in a Text Corpus by Using Different Dimension Reduction Techniques

I.KIRUBARAJI; R.JOTHILAKSHMI

Volume 1, Issue 10, December 2012

Abstract: Document retrieval is a branch of information retrieval in which information is extracted, or appropriate knowledge is gained, from unstructured text, i.e., text in the form of natural language, HTML, or AML. Each document is represented in the term vector model, in which the identifiers of objects serve as index terms. A single document can contain more than ten thousand index terms, so seeking information in such an archive is not easy. The dimension of term vector models is high, so retrieving pertinent information from this large space is laborious, and scaling the data is difficult. For effective information retrieval, the dimension of each document's feature vector should be reduced. This is achieved by different dimension reduction techniques. This paper focuses on popular dimension reduction techniques such as LLE, t-SNE, Isomap, and LDA, and on their advantages and disadvantages.
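
The term vector model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name build_term_vectors, the whitespace tokenizer, and the three toy documents are all assumptions made for illustration. It shows how the vector dimension equals the number of distinct index terms in the corpus, which is why real collections produce very high-dimensional vectors.

```python
from collections import Counter


def build_term_vectors(docs):
    """Return (vocabulary, vectors): one term-count vector per document.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which a real system would replace.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Index terms: the sorted set of all distinct tokens in the corpus.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One coordinate per index term; absent terms count as 0.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Toy corpus (assumed data, not taken from the paper).
docs = [
    "dimension reduction helps retrieval",
    "retrieval of documents uses term vectors",
    "term vectors have high dimension",
]
vocabulary, vectors = build_term_vectors(docs)
print(len(vocabulary), "index terms;", len(vectors), "documents")
```

Every document vector has one coordinate per index term, so the dimension grows with the vocabulary of the whole corpus rather than with the length of any single document. Techniques such as LLE, t-SNE, Isomap, and LDA would take vectors of this kind as input and map them to a much smaller number of coordinates.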

Keywords: documents, vectors, terms, dimension reduction

How to Cite:

[1] MS. I.KIRUBARAJI, MS. R.JOTHILAKSHMI, “Comparison of Extracting Content with Minimization of Lexeme in a Text Corpus by Using Different Dimension Reduction Techniques,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)

This work is licensed under a Creative Commons Attribution 4.0 International License.