Abstract: Information retrieval has become an essential task: large volumes of data are now held in distributed environments, and retrieving that content requires extra care and attention. In this paper, a new semi-supervised, lexicon-based optimized information retrieval method is proposed. Unlike existing classification methods, the proposed system reduces the need for a training process and improves information retrieval efficiency. The proposal uses a semi-supervised bag-of-words representation for information retrieval. The improved entropy and lexicon method effectively performs information retrieval on different types of text, image, and video datasets. This drastically reduces training time and retrieval time in large-dataset environments. The implementation and experiments used multi-class image, video, and text datasets with lexicon entropy. The system significantly improves average precision, accuracy, and storage efficiency by deploying a highly configured information retrieval method.

Keywords: Information retrieval, CBIR, classification, web extraction, image dataset, dictionary learning, entropy optimization, image retrieval, time-series retrieval.