Abstract: This paper presents a clustering-based k-anonymization technique that minimizes information loss while preserving data utility. In privacy-preserving data mining (PPDM), anonymization-based approaches are used to protect the privacy of individuals. However, these approaches suffer from information loss. To minimize information loss and ensure data quality, we propose a new approach, called systematic clustering, that is applied together with an equal combination of quasi-identifier and sensitive attributes. The proposed approach first generates sub-databases by an equal combination of quasi-identifier and sensitive attributes, then groups similar records together, and finally anonymizes each group individually. We evaluate the approach empirically, using information loss and execution time as the key metrics.

Keywords: quasi-identifier, sensitive attribute, sub-databases, systematic clustering, anonymization, PPDM.
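
To make the group-then-anonymize idea in the abstract concrete, the following is a minimal sketch and not the systematic clustering algorithm proposed in this paper. It assumes numeric quasi-identifiers, replaces the clustering step with a simple sort-and-partition heuristic, and omits the sub-database generation step. The function names (cluster_records, anonymize, information_loss), the range-based loss measure, and the sample records are all hypothetical and chosen for illustration only.

```python
from typing import Any, Dict, List, Sequence

Record = Dict[str, Any]


def cluster_records(records: Sequence[Record], qi: Sequence[str], k: int) -> List[List[Record]]:
    """Group records with similar quasi-identifier (QI) values into groups of size >= k."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    # Assumption: sorting by the QI tuple stands in for a real clustering step.
    ordered = sorted(records, key=lambda r: tuple(r[a] for a in qi))
    groups = [list(ordered[i:i + k]) for i in range(0, len(ordered), k)]
    # A short final group would violate k-anonymity, so merge it into its neighbour.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def anonymize(groups: Sequence[Sequence[Record]], qi: Sequence[str]) -> List[Record]:
    """Replace each QI value by the (min, max) range of its group; other attributes are kept."""
    out: List[Record] = []
    for group in groups:
        ranges = {a: (min(r[a] for r in group), max(r[a] for r in group)) for a in qi}
        for r in group:
            out.append({**r, **ranges})
    return out


def information_loss(groups: Sequence[Sequence[Record]], qi: Sequence[str]) -> float:
    """Average normalized range width per record and attribute (0 means no generalization)."""
    records = [r for g in groups for r in g]
    total = 0.0
    for a in qi:
        span = max(r[a] for r in records) - min(r[a] for r in records)
        if span == 0:
            continue  # attribute is constant, so generalizing it loses nothing
        for g in groups:
            width = max(r[a] for r in g) - min(r[a] for r in g)
            total += len(g) * width / span
    return total / (len(records) * len(qi))


# Hypothetical sample data: "age" is the quasi-identifier, "disease" the sensitive attribute.
records = [
    {"age": 44, "disease": "flu"},
    {"age": 21, "disease": "asthma"},
    {"age": 50, "disease": "diabetes"},
    {"age": 25, "disease": "flu"},
    {"age": 29, "disease": "cancer"},
    {"age": 40, "disease": "asthma"},
]
groups = cluster_records(records, ["age"], k=3)
table = anonymize(groups, ["age"])
loss = information_loss(groups, ["age"])
```

In this sketch every released record shares its generalized age range with at least k - 1 other records, and the loss value grows as the groups become wider relative to the full range of the attribute. A clustering step that forms tighter groups would lower this value, which is the trade-off the abstract refers to.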