Abstract: The World Wide Web hosts
a large number of online databases, and that number keeps growing every day. The data in
the web pages is generally wrapped in the form of data records. Such web pages
are generated dynamically. This paper focuses on extracting the data from the
web pages. To date, several techniques have been proposed for retrieving
information from web pages, but all suffer from a common problem:
dependence on the programming language used to design the web pages. This paper therefore
focuses on utilizing the visual features of a web page to extract data
from deep web pages. To make the system more efficient, the visual features can be combined with
non-visual information such as symbols and tags. The approach of this paper is
independent of any specific web programming language, and hence it can be
extended to web pages with different underlying architectures.
Keywords: Web mining, Web data extraction, visual features of deep Web pages, wrapper generation