Abstract: The World Wide Web is a vast repository of knowledge drawn from many diverse sources. Data is particularly vital to data science, computer vision, artificial intelligence, machine learning, and deep learning, and because it already shapes the operations of organizations worldwide, it continues to occupy center stage in the technology landscape. Web scraping has long been used to obtain this data in a usable form and remains relevant today. Information on the internet appears in numerous forms and is accessed in many different ways, which can make web-based indexing or semantic processing of the material laborious. Web scraping addresses this problem: it converts unstructured web data into structured data that can be stored in a central database or spreadsheet for analysis. Common web scraping techniques include traditional copy-and-paste, text wrapping and regular-expression matching, HTTP programming, HTML parsing, DOM parsing, web-scraping software, vertical aggregation platforms, semantic annotation recognizing, and computer-vision-based webpage analysers. This comparative analysis of web scraping tools and the techniques behind them aims to raise readers' awareness of the technology and aid their search for knowledge.
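As an illustration of the HTTP programming and HTML parsing techniques listed above, the following minimal sketch shows how unstructured web data can be turned into a structured spreadsheet. It assumes Python with the requests and beautifulsoup4 libraries, a hypothetical listing page at https://example.com/books, and CSS selectors (article.product, h2.title, span.price) that are not part of the paper; any real page would need its own selectors.

import csv
import requests
from bs4 import BeautifulSoup

# Hypothetical target page; any listing with repeated HTML blocks works the same way.
URL = "https://example.com/books"

# HTTP programming: fetch the raw, unstructured HTML document.
response = requests.get(URL, timeout=10)
response.raise_for_status()

# HTML parsing: build a navigable tree from the markup.
soup = BeautifulSoup(response.text, "html.parser")

# Extract structured records; the selectors below are assumptions about the page layout.
rows = []
for item in soup.select("article.product"):
    title = item.select_one("h2.title")
    price = item.select_one("span.price")
    if title and price:
        rows.append({"title": title.get_text(strip=True),
                     "price": price.get_text(strip=True)})

# Store the structured data in a spreadsheet-friendly CSV file for later analysis.
with open("books.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)

The same pattern underlies most of the surveyed tools: an HTTP client retrieves the page, a parser exposes the document structure, and selected fields are written to tabular storage.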

Keywords: Data science, computer vision, artificial intelligence, machine learning, deep learning, web scraping


DOI: 10.17148/IJARCCE.2024.13596
