Results 1 - 10 of 69
Personal Privacy vs Population Privacy: Learning to Attack Anonymization
"... Over the last decade great strides have been made in developing techniques to compute functions privately. In particular, Differential Privacy gives strong promises about conclusions that can be drawn about an individual. In contrast, various syntactic methods for providing privacy (criteria such as ..."
Abstract
-
Cited by 12 (0 self)
Over the last decade great strides have been made in developing techniques to compute functions privately. In particular, Differential Privacy gives strong promises about conclusions that can be drawn about an individual. In contrast, various syntactic methods for providing privacy (criteria such as k-anonymity and l-diversity) have been criticized for still allowing private information of an individual to be inferred. In this paper, we consider the ability of an attacker to use data meeting privacy definitions to build an accurate classifier. We demonstrate that even under Differential Privacy, such classifiers can be used to infer “private” attributes accurately in realistic data. We compare this to similar approaches for inference-based attacks on other forms of anonymized data. We show how the efficacy of all these attacks can be measured on the same scale, based on the probability of successfully inferring a private attribute. We observe that the accuracy of inference of private attributes for differentially private data and l-diverse data can be quite similar.
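A minimal sketch of such an inference attack, scored as the probability of correctly guessing the private attribute (illustrative only; the classifier choice, the released_df/holdout_df tables, and the column handling are hypothetical placeholders, not the paper's experimental pipeline):

    # Hypothetical sketch: an attacker fits a classifier to a released
    # (anonymized or differentially private) table and tries to recover a
    # "private" attribute for individuals whose quasi-identifiers it knows.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    def inference_attack_accuracy(released_df, holdout_df, quasi_identifiers, sensitive):
        """Fraction of target individuals whose sensitive value is inferred correctly."""
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(pd.get_dummies(released_df[quasi_identifiers]), released_df[sensitive])
        # Align the targets' one-hot columns with those seen during training.
        X_target = (pd.get_dummies(holdout_df[quasi_identifiers])
                    .reindex(columns=model.feature_names_in_, fill_value=0))
        predictions = model.predict(X_target)
        return (predictions == holdout_df[sensitive]).mean()

Reporting this single accuracy number is what puts attacks on differentially private, k-anonymous, and l-diverse releases on the same scale.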
Coupled-Worlds Privacy: Exploiting Adversarial Uncertainty in Statistical Data Privacy
"... Abstract—We propose a new framework for defining privacy in statistical databases that enables reasoning about and exploiting adversarial uncertainty about the data. Roughly, our framework requires indistinguishability of the real world in which a mech-anism is computed over the real dataset, and an ..."
Abstract
-
Cited by 8 (1 self)
We propose a new framework for defining privacy in statistical databases that enables reasoning about and exploiting adversarial uncertainty about the data. Roughly, our framework requires indistinguishability of the real world, in which a mechanism is computed over the real dataset, and an ideal world, in which a simulator outputs some function of a “scrubbed” version of the dataset (e.g., one in which an individual user’s data is removed). In each world, the underlying dataset is drawn from the same distribution in some class (specified as part of the definition), which models the adversary’s uncertainty about the dataset. We argue that our framework provides meaningful guarantees in a broader range of settings as compared to previous efforts to model privacy in the presence of adversarial uncertainty. We also show that several natural, “noiseless” mechanisms satisfy our definitional framework under realistic assumptions on the distribution of the underlying data.
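A schematic rendering of that two-world requirement, assuming an $(\varepsilon,\delta)$-style indistinguishability bound (the paper's formal definition fixes the quantifiers, parameters, and the simulator's power precisely; this is only an informal sketch): for every distribution $D$ in the class $\Delta$ and every output event $S$,
\[
\Pr_{X \sim D}\bigl[M(X) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr_{X \sim D}\bigl[\mathrm{Sim}(\mathrm{scrub}(X)) \in S\bigr] + \delta,
\]
and symmetrically with the two worlds exchanged, where $M$ is the mechanism, $\mathrm{Sim}$ the simulator, and $\mathrm{scrub}$ the operation that removes (or replaces) the targeted individual's data.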
m-Privacy for Collaborative Data Publishing
"... Abstract—In this paper, we consider the collaborative data publishing problem for anonymizing horizontally partitioned data at multiple data providers. We consider a new type of “insider attack ” by colluding data providers who may use their own data records (a subset of the overall data) in additio ..."
Abstract
-
Cited by 8 (4 self)
In this paper, we consider the collaborative data publishing problem for anonymizing horizontally partitioned data at multiple data providers. We consider a new type of “insider attack” by colluding data providers who may use their own data records (a subset of the overall data), in addition to external background knowledge, to infer the data records contributed by other data providers. The paper addresses this new threat and makes several contributions. First, we introduce the notion of m-privacy, which guarantees that the anonymized data satisfies a given privacy constraint against any group of up to m colluding data providers. Second, we present heuristic algorithms that exploit the equivalence-group monotonicity of privacy constraints and adaptive ordering techniques to check m-privacy efficiently for a set of records. Finally, we present a data-provider-aware anonymization algorithm with adaptive m-privacy checking strategies that ensures high utility and m-privacy of the anonymized data with good efficiency. Experiments on real-life datasets suggest that our approach achieves utility and efficiency better than or comparable to existing and baseline algorithms while providing an m-privacy guarantee.
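A brute-force sketch of the m-privacy check as defined above (the paper's heuristic algorithms prune this search via equivalence-group monotonicity and adaptive ordering; the records format and the satisfies_constraint predicate below are hypothetical placeholders):

    # Naive m-privacy check over one anonymized group of (provider_id, record) pairs.
    from itertools import combinations

    def is_m_private(records, m, satisfies_constraint):
        """True if every coalition of up to m providers, after removing its own
        records from view, still faces data meeting the base privacy constraint
        (e.g., an l-diversity check)."""
        providers = {pid for pid, _ in records}
        for size in range(0, m + 1):                  # size 0 = no collusion at all
            for coalition in combinations(providers, size):
                remaining = [rec for pid, rec in records if pid not in coalition]
                if not satisfies_constraint(remaining):
                    return False
        return True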
Publishing Microdata with a Robust Privacy Guarantee
"... Today, the publication of microdata poses a privacy threat. Vast research has striven to define the privacy condition that microdata should satisfy before it is released, and devise algorithms to anonymize the data so as to achieve this condition. Yet, no method proposed to date explicitly bounds th ..."
Abstract
-
Cited by 5 (1 self)
Today, the publication of microdata poses a privacy threat. Vast research has striven to define the privacy condition that microdata should satisfy before release, and to devise algorithms that anonymize the data so as to achieve this condition. Yet no method proposed to date explicitly bounds the percentage of information an adversary gains after seeing the published data for each sensitive value therein. This paper introduces β-likeness, an appropriately robust privacy model for microdata anonymization, along with two anonymization schemes designed for it, one based on generalization and the other on perturbation. Our model postulates that an adversary’s confidence in the likelihood of a certain sensitive-attribute (SA) value should not increase, in relative-difference terms, by more than a predefined threshold. Our techniques aim to satisfy a given β threshold with little information loss. We experimentally demonstrate that (i) our model provides an effective privacy guarantee in a way that predecessor models cannot, (ii) our generalization scheme is more effective and efficient at its task than methods adapting algorithms for the k-anonymity model, and (iii) our perturbation method outperforms a baseline approach. Moreover, we discuss in detail the resistance of our model and methods to attacks proposed in previous research.
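A small sketch of the relative-difference condition behind β-likeness as described above (illustrative only; the paper's exact dissimilarity function and its treatment of rare values may differ):

    # Check whether one equivalence class respects a beta threshold: for each
    # sensitive value, the in-class frequency q may exceed the table-wide
    # frequency p by at most a relative factor beta.
    from collections import Counter

    def satisfies_beta_likeness(table_sa_values, group_sa_values, beta):
        n, g = len(table_sa_values), len(group_sa_values)
        prior = {v: c / n for v, c in Counter(table_sa_values).items()}
        posterior = {v: c / g for v, c in Counter(group_sa_values).items()}
        for v, q in posterior.items():
            p = prior[v]                      # group values are drawn from the table
            if (q - p) / p > beta:            # adversary's relative information gain
                return False
        return True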
Cryptography and the Economics of Supervisory Information: Balancing Transparency and Confidentiality
"... w o r k i n g p a p e r ..."
Differentially private synthesization of multi-dimensional data using copula functions
In EDBT, 2014.
"... Differential privacy has recently emerged in private statisti-cal data release as one of the strongest privacy guarantees. Most of the existing techniques that generate differentially private histograms or synthetic data only work well for single dimensional or low-dimensional histograms. They becom ..."
Abstract
-
Cited by 4 (4 self)
Differential privacy has recently emerged in private statistical data release as one of the strongest privacy guarantees. Most existing techniques that generate differentially private histograms or synthetic data work well only for single-dimensional or low-dimensional histograms; they become problematic for high-dimensional and large-domain data due to increased perturbation error and computational complexity. In this paper, we propose DPCopula, a differentially private data synthesization technique using copula functions for multi-dimensional data. The core of our method is to compute a differentially private copula function from which we can sample synthetic data. Copula functions describe the dependence between multivariate random vectors and allow us to build the multivariate joint distribution from one-dimensional marginal distributions. We present two methods for estimating the parameters of the copula functions with differential privacy: maximum likelihood estimation and Kendall’s τ estimation. We give formal proofs of the privacy guarantee as well as the convergence property of our methods. Extensive experiments using both real and synthetic datasets demonstrate that DPCopula generates highly accurate synthetic multi-dimensional data with significantly better utility than state-of-the-art techniques.
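A very rough sketch of a Gaussian-copula synthesis step in the spirit of the Kendall's τ variant described above (illustrative only: the noise scale, privacy-budget split, marginal estimation, and the repair of the noisy correlation matrix in DPCopula are more involved than shown):

    import numpy as np
    from scipy import stats

    def synthesize_gaussian_copula(data, epsilon, n_samples, rng=np.random.default_rng()):
        n, d = data.shape
        corr = np.eye(d)
        for i in range(d):
            for j in range(i + 1, d):
                tau, _ = stats.kendalltau(data[:, i], data[:, j])
                tau += rng.laplace(scale=2.0 / (n * epsilon))      # placeholder noise scale
                rho = np.sin(np.pi * np.clip(tau, -1.0, 1.0) / 2)  # tau -> copula correlation
                corr[i, j] = corr[j, i] = rho
        # The paper projects corr back to a valid (positive semidefinite) correlation
        # matrix and uses private marginals; both steps are omitted in this sketch.
        z = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
        u = stats.norm.cdf(z)
        return np.column_stack([np.quantile(data[:, k], u[:, k]) for k in range(d)])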
Secure Multiparty Aggregation with Differential Privacy: A Comparative Study
"... This paper considers the problem of secure data aggregation in a distributed setting while preserving differential privacy for the aggregated data. In particular, we focus on the secure sum aggregation. Security is guaranteed by secure multiparty computation protocols using well known security schem ..."
Abstract
-
Cited by 3 (2 self)
This paper considers the problem of secure data aggregation in a distributed setting while preserving differential privacy for the aggregated data. In particular, we focus on secure sum aggregation. Security is guaranteed by secure multiparty computation protocols using well-known security schemes: Shamir’s secret sharing, perturbation-based schemes, and various encryption schemes. Differential privacy of the final result is achieved by the distributed Laplace perturbation mechanism (DLPA). Partial random noise is generated by all participants, who draw random variables from Gamma or Gaussian distributions such that the aggregated noise follows a Laplace distribution and thereby satisfies differential privacy. We also introduce a new, efficient distributed noise generation scheme with partial noise drawn from Laplace distributions. We compare the protocols with different privacy mechanisms and security schemes in terms of their complexity and security characteristics. More importantly, we implemented all protocols and present an experimental comparison of their performance and scalability in a real distributed environment.
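A small sketch of the distributed Laplace idea described above, using the fact that a Laplace variable decomposes into a sum of differences of Gamma variables (the secure-sum layer of secret sharing or encryption is omitted, and the function names are hypothetical):

    import numpy as np

    def partial_noise(n_parties, scale, rng):
        # One participant's contribution: difference of two Gamma(1/n, scale) draws.
        return rng.gamma(1.0 / n_parties, scale) - rng.gamma(1.0 / n_parties, scale)

    def noisy_sum(values, scale, rng=np.random.default_rng()):
        n = len(values)
        shares = [v + partial_noise(n, scale, rng) for v in values]
        return sum(shares)            # aggregated noise ~ Laplace(0, scale)

No single party knows the total noise, yet the aggregator sees the exact sum protected by a single Laplace(scale) perturbation.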
On Learning Cluster Coefficient of Private Networks
"... Abstract—Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as clustering coefficient or modularity often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabula ..."
Abstract
-
Cited by 3 (2 self)
Enabling accurate analysis of social network data while preserving differential privacy has been challenging, since graph features such as the clustering coefficient or modularity often have high sensitivity, unlike traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we treat a graph statistic as a function f and develop a divide-and-conquer approach to enforce differential privacy. The basic procedure is to first decompose the target computation f into several less complex unit computations f1, ..., fm connected by basic mathematical operations (addition, subtraction, multiplication, division), then perturb the output of each fi with Laplace noise derived from its own sensitivity value and the distributed privacy threshold ϵi, and finally combine the perturbed fi into the perturbed output of f. We examine how the various operations affect the accuracy of complex computations. When unit computations have large global sensitivity values, we enforce differential privacy by calibrating noise based on the smooth sensitivity rather than the global sensitivity; by doing so, we achieve the strict differential privacy guarantee with noise of smaller magnitude. We illustrate our approach using the clustering coefficient, a popular statistic in social network analysis. Empirical evaluations show that the developed divide-and-conquer approach outperforms the direct approach.
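A minimal sketch of the divide-and-conquer recipe described above (the unit functions, sensitivities, budget split, and combine step are placeholders; the paper additionally switches to smooth sensitivity for units with large global sensitivity):

    import numpy as np

    def private_composite(graph, units, sensitivities, epsilons, combine,
                          rng=np.random.default_rng()):
        # Perturb each unit computation f_i(graph) with Laplace noise calibrated
        # to its own sensitivity and budget share, then recombine arithmetically.
        noisy = [f(graph) + rng.laplace(scale=s / eps)
                 for f, s, eps in zip(units, sensitivities, epsilons)]
        return combine(noisy)

    # Hypothetical use: a node's clustering coefficient from two separately
    # perturbed counts, triangles / connected_pairs.
    # cc = private_composite(G, [num_triangles, num_connected_pairs],
    #                        [s_tri, s_pairs], [eps / 2, eps / 2],
    #                        lambda v: v[0] / max(v[1], 1.0))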
Membership Privacy: A Unifying Framework For Privacy Definitions
"... We formalize positive membership privacy, which prevents the adversary from significantly increasing its ability to conclude that an entity is in the input dataset, and negative membership privacy, which prevents leaking of non-membership. These notions are parameterized by a family of distributions ..."
Abstract
-
Cited by 3 (0 self)
We formalize positive membership privacy, which prevents the adversary from significantly increasing its ability to conclude that an entity is in the input dataset, and negative membership privacy, which prevents leaking of non-membership. These notions are parameterized by a family of distributions that captures the adversary’s prior knowledge. The power and flexibility of the proposed membership privacy framework lie in the ability to choose different distribution families to instantiate membership privacy. Many privacy notions in the literature are equivalent to membership privacy with interesting distribution families, including differential privacy, differential identifiability, and differential privacy under sampling. Casting these notions into the framework leads to a deeper understanding of their strengths and weaknesses, as well as their relationships to each other. The framework also provides a principled approach to developing new privacy notions under which better utility can be achieved than is possible under differential privacy.
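Roughly, and only as a schematic (the formal definition quantifies over the distribution family and treats the positive and negative directions with more care), positive membership privacy with parameter $\gamma \ge 1$ asks that for every distribution $D$ in the family, every entity $t$, and every output $S$ of the mechanism $M$,
\[
\Pr_{T \sim D}\bigl[t \in T \mid M(T) = S\bigr] \;\le\; \gamma \cdot \Pr_{T \sim D}\bigl[t \in T\bigr],
\]
so that observing the output cannot multiply the adversary's belief in $t$'s membership by more than $\gamma$; negative membership privacy bounds the analogous gain about non-membership.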