International Journal of Advanced Research in Computer and Communication Engineering
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012

Web Page Categorization with Extended TDW Scheme

Arun P R, Sumesh M S, Eldhose P Sim

Abstract: The exponential growth of the internet over the past decade has led to millions of web pages being published on every subject. The internet provides a medium for communication between computers and for accessing online documents, but it does not organize this large amount of data. Subject-based web directories such as the Open Directory Project's (ODP) Directory Mozilla (DMOZ) and Yahoo organize web pages in a hierarchy. Because of the rapid growth in the number of web pages, categorization requires machine learning techniques to maintain such web directory services automatically. The textual information in a page serves as a hint for assigning it to a class. Here we propose a method that uses an extended TDW scheme for feature representation and a naïve Bayesian classifier to build the classification model. Web page categorization offers a wide range of benefits, including knowledge base construction, improved quality of web search results, web content filtering, and focused crawling.

Keywords: Categorization, Extended TDW Matrix, Naive Bayesian, Feature selection.
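
As an illustration of the classification step only, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing. It is not the authors' implementation: the extended TDW weighting is not specified on this page, so plain term counts stand in for it here, and the function names, the whitespace tokenizer, and the toy pages and categories are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Fit a multinomial naive Bayes model on raw term counts.

    Raw counts are an assumption standing in for the extended TDW weights.
    Returns (log_priors, term_counts, total_terms, vocabulary).
    """
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> count
    vocabulary = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocabulary.update(tokens)
    log_priors = {c: math.log(n / len(labels)) for c, n in class_docs.items()}
    total_terms = {c: sum(term_counts[c].values()) for c in class_docs}
    return log_priors, term_counts, total_terms, vocabulary


def classify(text, model):
    """Return the class with the highest posterior log-score for the text."""
    log_priors, term_counts, total_terms, vocabulary = model
    vocab_size = len(vocabulary)
    scores = {}
    for c, log_prior in log_priors.items():
        score = log_prior
        for term in tokenize(text):
            if term not in vocabulary:
                continue  # ignore terms never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (term_counts[c][term] + 1) / (total_terms[c] + vocab_size)
            )
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy "pages" and categories, for illustration only.
docs = [
    "football match score goal",
    "team wins league match",
    "stock market shares fall",
    "bank raises interest rates market",
]
labels = ["sports", "sports", "finance", "finance"]
model = train_naive_bayes(docs, labels)
```

With this toy data, `classify("league goal", model)` selects `sports` and `classify("market shares", model)` selects `finance`. A real system would replace the raw counts with the weighting scheme described in the paper and apply feature selection before training.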

How to Cite:

[1] Arun P R, Sumesh M S, Eldhose P Sim, “Web Page Categorization with Extended TDW Scheme,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)

This work is licensed under a Creative Commons Attribution 4.0 International License.