Design and Analysis of Privacy Models against Background Knowledge in Privacy-Preserving Data Publishing
Abstract
"Humongous amount of data gets collected by various online applications like social networks, cellular technologies, the healthcare sector, location - based services, and many more. The collected data can be accessed by third - party applications to study social and economic issues of society, leverage research, propose healthcare and business solutions, and even track a pandemic. As a result, online collected - data is a significant contributor in recent times. Despite the umpteen usefulness of online collected - data, it is vulnerable to privacy threats due to the presence of sensitive information of individual(s). Adding to that, the adversary has also become strong and powerful in terms of capabilities and access to knowledge. Knowledge is freely available in the public domain from sources like social profiles, social relations, previously published data and many more. As a result, privacy - preserving data publishing is a challenging research direction to venture upon. Our work mainly focuses on designing privacy models against background knowledge. Briefly, background knowledge is knowledge present with adversary used to disclose privacy of the individual(s). This makes background knowledge highly uncertain and inaccurate in nature as we cannot quantify the amount of knowledge present with the adversary. In this work, we design and analyze privacy solutions based on background knowledge. First of all, we propose an adversarial model against background knowledge and analyze existing and prominent privacy models against it. Secondly, we propose a privacy model (q, [lb, ub]+sp, a) - Private against background knowledge. The background knowledge assumption is comprehensive and realistic, which makes the proposed privacy model
more strong and comprehensive in nature. The proposed privacy model has been theoretically analyzed against a strong adversary. Also, the proposed privacy model has been evaluated experimentally and compared with existing literature. Progressively, our research work extends to Social Networks, which is an important application of privacy - preserving data publishing. Social network data has become an important resource in recent times but is prone to privacy threats. Thirdly, we propose a privacy model named Rule Anonymity against rule - based mining techniques in social networks. The rule - based mining techniques can predict unpublished sensitive information by generating rules. This makes it a challenging adversarial assumption. A rule - based anonymization technique has been proposed that incorporates the Rule Anonymity principle. We analyze the rule - based anonymization technique against a strong adversary having the capability of rule - based mining technique. The experimental evaluation of the rule - based anonymization technique shows positive results in terms of privacy when compared with existing literature. Fourth, we propose a de - anonymization technique against adversary’s background knowledge. The adversary’s background knowledge considers a comprehensive background knowledge that is imprecise and inaccurate in nature. We suggested distance metrics that consider imprecise and inaccurate identification and structural information. The de - anonymization technique has been implemented on a real social dataset and exhibits positive results in terms of de - anonymization accuracy. Fifth, we propose a privacy - preserving technique against comprehensive adversarial background knowledge. We have evaluated the proposed privacy model (q, [lb, ub]+sp, a) - Private on the Adult dataset and Census Income dataset and compared it with existing literature in terms of privacy. For social networks, we have used the Facebook dataset to evaluate the proposed privacy models and techniques."
Collections
- PhD Theses