International Journal of Advanced Research in Computer and Communication Engineering

A monthly peer-reviewed online and print journal

ISSN Online 2278-1021
ISSN Print 2319-5940

Abstract: An increasing number of organizations maintain collections of data about individuals. Hospitals keep medical records of their patients, commerce companies collect information about their clients, and web service companies keep track of the preferences of their users. Publishing these data can be useful for research, epidemiological studies, commerce development, statistical analysis, etc. However, data that contain important or sensitive details cannot always be kept open on the internet while remaining confidential and safe. Removing directly identifying information (such as names) from the published records is not enough to guarantee individuals' privacy, and personal data can be misused for a variety of purposes. Maintaining privacy for high-dimensional databases has become difficult. A potential attacker could infer a record's identity by linking public external information sources (voter registration lists, phone directories, etc.) with a combination of other attributes, such as age, gender and postal code, which individually are not generally unique per person. The main goal is to balance the privacy and the utility of the shared data. l-diversity is one method for protecting data that are highly sensitive and confidential. By ensuring a greater distribution of sensitive attribute values within each group of records, l-diversity increases data protection and helps keep the database safe and secure. The method protects against attribute disclosure and is an enhancement of the k-anonymity technique.
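
As an illustration of the idea described in the abstract, the following minimal sketch checks the simplest variant, distinct l-diversity: every group of records that shares the same quasi-identifier values must contain at least l different sensitive values. The abstract does not state which variant of l-diversity is used, so this choice is an assumption. The function names, the sample records and the generalized values are hypothetical and are not taken from the Adult dataset.

```python
from collections import defaultdict


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found in any
    group of records sharing the same quasi-identifier values.
    Returns 0 for an empty table."""
    groups = defaultdict(set)
    for row in records:
        # Records with identical quasi-identifier values form one group.
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return min((len(values) for values in groups.values()), default=0)


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every group holds at least l distinct sensitive values."""
    return distinct_l_diversity(records, quasi_identifiers, sensitive) >= l


# Hypothetical, already generalized records (illustrative only).
QUASI_IDENTIFIERS = ["age", "gender", "zip"]
records = [
    {"age": "20-29", "gender": "F", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "gender": "F", "zip": "476**", "disease": "asthma"},
    {"age": "30-39", "gender": "M", "zip": "479**", "disease": "cancer"},
    {"age": "30-39", "gender": "M", "zip": "479**", "disease": "cancer"},
]

if __name__ == "__main__":
    print(distinct_l_diversity(records, QUASI_IDENTIFIERS, "disease"))
    print(is_l_diverse(records, QUASI_IDENTIFIERS, "disease", 2))
```

In this sample every group contains two records, so the table is 2-anonymous, yet the second group holds a single disease value. Anyone who can place a person in that group learns the disease, which is the attribute disclosure that l-diversity is meant to prevent.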

Keywords: Sensitive Attribute, Privacy Preserving Data, Adult Dataset, l-Diversity Model, Quasi Identifier, Utility, Sensitive Value


DOI: 10.17148/IJARCCE.2019.8117