Abstract: Techniques for storing data across multiple clouds have advanced significantly in recent years. Because data is spread across several cloud storage providers, users gain a degree of control over information leakage. However, an unplanned distribution of data chunks can still cause substantial information leakage, even when multiple clouds are used. This article examines the information leakage caused by unplanned data distribution in multi-cloud storage systems and introduces a solution called Store Simulation (Store Sim). The system aims to minimize the leakage of customer data across multiple clouds by keeping syntactically similar data on the same cloud.
A key component of Store Sim is the efficient construction of similarity-preserving functions. These functions use signatures based on MinHash and Bloom filters to compute information leakage. In addition, the system uses a clustering-based method for generating storage plans, which distributes data chunks across multiple clouds efficiently.
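As a rough illustration of what a similarity-preserving signature does, the sketch below computes MinHash signatures for two text chunks and estimates their Jaccard similarity from the fraction of matching signature positions. It is not the Store Sim implementation: the character-shingle size `k`, the number of hash functions `num_hashes`, the seeded BLAKE2b hashing, and all function names are assumptions chosen for this example.

```python
import hashlib


def shingles(text, k=4):
    """Return the set of overlapping k-character substrings of text (assumed chunk representation)."""
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def _seeded_hash(seed, item):
    """Hash an item under a given seed; each seed acts as one hash function."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over a non-empty set."""
    return [min(_seeded_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of positions where two signatures agree estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog")
    b = shingles("the quick brown fox jumps over the lazy cat")
    print("exact:    ", exact_jaccard(a, b))
    print("estimated:", estimated_jaccard(minhash_signature(a), minhash_signature(b)))
```

Because the signatures have a fixed length regardless of chunk size, comparing them is cheaper than comparing the chunks themselves. Under the assumptions above, a placement step could use such estimates to group similar chunks on the same cloud.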
To evaluate this strategy, we used two real datasets, one from GitHub and one from Wikipedia. The results show that the storage plans produced by Store Sim can reduce information leakage by up to 60% compared with unplanned placement. Furthermore, our investigation into system vulnerabilities shows that this approach makes information attacks more difficult.
Keywords: Multi-cloud storage systems, Information leakage, Data distribution, Syntactic similarity, MinHash, Bloom filters, Clustering-based storage plan creation, Real datasets, Information attacks, Data loss prevention.
DOI: 10.17148/IJARCCE.2023.12364