International Journal of Advanced Research in Computer and Communication Engineering
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
Volume 1, Issue 6, August 2012

Web Text Mining for news by Classification

Ms. Sarika Y. Pabalkar

Pad. Dr. D.Y. Patil Institute of Engineering and Technology, Pimpri, Pune, Maharashtra, India.

Abstract: Most information resources on the World Wide Web are published as HTML or XML pages, and the number of web pages is increasing rapidly as the web expands. To make better use of web information, technologies that can automatically reorganize and manipulate web pages are being pursued, such as web information retrieval, web page classification, and other web mining work. Research on and application of Web text mining is an important branch of data mining. At present, people mainly use search engines such as Google and Yahoo to look up Web information. These search engines cover a very wide range of content, but their level of intelligence is low: they can hardly provide individualized service according to the different needs of different users, and it is very difficult to mine the data further. Web text mining aims to resolve this problem. In Web text mining, text extraction and the feature representation of the extracted content are the foundation of the mining work, and text classification is the most important and basic mining method. Classification means assigning each text in a text set to a class according to the definition of a classification system. The challenge is therefore not only to find all occurrences of a subject, but also to filter out just those that have the desired meaning. The development of techniques for mining unstructured, semi-structured, and fully structured textual data has become increasingly important in industry.

Keywords: Text Mining, Extraction, Classification, Stemming, Stopword Removal
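
The pipeline named in the abstract and keywords (extraction, stopword removal, stemming, classification) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's method: the stopword list is a tiny hypothetical one, `stem` is a crude suffix stripper standing in for a real stemmer, the classifier is a multinomial naive Bayes with add-one smoothing (the abstract does not name a classifier), and the four training headlines are invented.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list; real systems use far larger ones.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and",
             "is", "for", "by", "at", "as", "after"}


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, extract alphabetic tokens, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        self.vocab = set()

    def train(self, text, label):
        terms = preprocess(text)
        self.class_docs[label] += 1
        self.term_counts[label].update(terms)
        self.vocab.update(terms)

    def classify(self, text):
        terms = preprocess(text)
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            # Log prior plus smoothed log likelihood of each term.
            score = math.log(n_docs / total_docs)
            total_terms = sum(self.term_counts[label].values())
            denom = total_terms + len(self.vocab)
            for t in terms:
                score += math.log((self.term_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training headlines, for illustration only.
model = NaiveBayes()
model.train("The team wins the final match after scoring two goals", "sports")
model.train("Striker scores in the cup match", "sports")
model.train("Markets rallied as the bank raised interest rates", "business")
model.train("Shares of the bank dropped in trading", "business")

print(model.classify("The bank raises rates"))
print(model.classify("Goals scored in the match"))
```

Stemming matters here because it maps "raised" and "raises", or "scoring" and "scored", to the same term, so a headline can match training text that uses a different word form.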
This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite:

[1] Ms. Sarika Y. Pabalkar, “Web Text Mining for news by Classification,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE)
