Abstract: De-duplication is the process of identifying all groups of records within a data set that refer to the same real-world entity. Data gathered from multiple sources may suffer from quality problems. De-duplication also provides information about the similarity between two entities. This paper focuses on the correct identification of duplicates in a database by applying windowing and blocking techniques. The objective is to achieve better precision and good efficiency while reducing the false positive rate, all based on the estimated similarities of records.

Keywords: Access control, big data, cloud computing, data deduplication, proxy re-encryption.
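
To make the two candidate-selection ideas named in the abstract concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes records are plain strings, uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure, and picks an illustrative threshold of 0.85; the function names, the blocking key (first letter), the window size, and the sample records are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; a stand-in measure, not the paper's own."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def blocking_pairs(records, key):
    """Blocking: only records that share a blocking key become candidates."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[key(rec)].append(idx)
    pairs = set()
    for members in blocks.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs


def windowing_pairs(records, sort_key, window=3):
    """Windowing: sort the records, then pair each record only with the
    next (window - 1) records in sorted order."""
    order = sorted(range(len(records)), key=lambda i: sort_key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            pairs.add((min(i, j), max(i, j)))
    return pairs


def find_duplicates(records, candidate_pairs, threshold=0.85):
    """Keep candidate pairs whose similarity reaches the (assumed) threshold."""
    return sorted(
        (i, j)
        for i, j in candidate_pairs
        if similarity(records[i], records[j]) >= threshold
    )


# Hypothetical sample data for illustration only.
records = [
    "John Smith, 12 Oak Street",
    "Jon Smith, 12 Oak Street",
    "Mary Jones, 4 Elm Road",
    "Mary Jones, 4 Elm Rd",
    "Peter Brown, 9 Pine Avenue",
]

by_block = blocking_pairs(records, key=lambda r: r[0].lower())
by_window = windowing_pairs(records, sort_key=str.lower, window=2)

print(find_duplicates(records, by_block))
print(find_duplicates(records, by_window))
```

On this sample, both strategies report the record pairs (0, 1) and (2, 3) as duplicates while comparing far fewer pairs than an exhaustive all-against-all comparison would. Blocking risks missing duplicates whose keys differ, and windowing risks missing duplicates that sort far apart, so the choice of key and window size trades efficiency against missed matches.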