Results 1  10
of
153
Differential privacy: A survey of results
 In Theory and Applications of Models of Computation
, 2008
"... Abstract. Over the past five years a new approach to privacypreserving ..."
Abstract

Cited by 258 (0 self)
 Add to MetaCart
Abstract. Over the past five years a new approach to privacypreserving
Smooth sensitivity and sampling in private data analysis
 In STOC
, 2007
"... We introduce a new, generic framework for private data analysis. The goal of private data analysis is to release aggregate information about a data set while protecting the privacy of the individuals whose information the data set contains. Our framework allows one to release functions f of the data ..."
Abstract

Cited by 173 (16 self)
 Add to MetaCart
(Show Context)
We introduce a new, generic framework for private data analysis. The goal of private data analysis is to release aggregate information about a data set while protecting the privacy of the individuals whose information the data set contains. Our framework allows one to release functions f of the data with instancebased additive noise. That is, the noise magnitude is determined not only by the function we want to release, but also by the database itself. One of the challenges is to ensure that the noise magnitude does not leak information about the database. To address that, we calibrate the noise magnitude to the smooth sensitivity of f on the database x — a measure of variability of f in the neighborhood of the instance x. The new framework greatly expands the applicability of output perturbation, a technique for protecting individuals ’ privacy by adding a small amount of random noise to the released statistics. To our knowledge, this is the first formal analysis of the effect of instancebased noise in the context of data privacy. Our framework raises many interesting algorithmic questions. Namely, to apply the framework one must compute or approximate the smooth sensitivity of f on x. We show how to do this efficiently for several different functions, including the median and the cost of the minimum spanning tree. We also give a generic procedure based on sampling that allows one to release f(x) accurately on many databases x. This procedure is applicable even when no efficient algorithm for approximating smooth sensitivity of f is known or when f is given as a black box. We illustrate the procedure by applying it to kSED (kmeans) clustering and learning mixtures of Gaussians.
What Can We Learn Privately?
 49TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE
, 2008
"... Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large reallife data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or sp ..."
Abstract

Cited by 99 (9 self)
 Add to MetaCart
(Show Context)
Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large reallife data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in the contexts where aggregate information is released about a database containing sensitive information about individuals. We present several basic results that demonstrate general feasibility of private learning and relate several models previously studied separately in the contexts of privacy and standard learning.
Differential privacy and robust statistics
 STOC'09
, 2009
"... We show by means of several examples that robust statistical estimators present an excellent starting point for differentially private estimators. Our algorithms use a new paradigm for differentially private mechanisms, which we call ProposeTestRelease (PTR), and for which we give a formal definit ..."
Abstract

Cited by 91 (2 self)
 Add to MetaCart
(Show Context)
We show by means of several examples that robust statistical estimators present an excellent starting point for differentially private estimators. Our algorithms use a new paradigm for differentially private mechanisms, which we call ProposeTestRelease (PTR), and for which we give a formal definition and general composition theorems.
Differentially Private Aggregation of Distributed TimeSeries with Transformation and Encryption
"... We propose the first differentially private aggregation algorithm for distributed timeseries data that offers good practical utility without any trusted server. This addresses two important challenges in participatory datamining applications where (i) individual users wish to publish temporally co ..."
Abstract

Cited by 88 (3 self)
 Add to MetaCart
(Show Context)
We propose the first differentially private aggregation algorithm for distributed timeseries data that offers good practical utility without any trusted server. This addresses two important challenges in participatory datamining applications where (i) individual users wish to publish temporally correlated timeseries data (such as location traces, web history, personal health data), and (ii) an untrusted thirdparty aggregator wishes to run aggregate queries on the data. To ensure differential privacy for timeseries data despite the presence of temporal correlation, we propose the Fourier Perturbation Algorithm (FPAk). Standard differential privacy techniques perform poorly for timeseries data. To answer n queries, such techniques can result in a noise of Θ(n) to each query answer, making the answers practically useless if n is large. Our FPAk algorithm perturbs the Discrete Fourier Transform of the query answers. For answering n queries, FPAk improves the expected error from Θ(n) to roughly Θ(k) where k is the number of Fourier coefficients that can (approximately) reconstruct all the n query answers. Our experiments show that k ≪ n for many reallife datasets resulting in a huge errorimprovement for FPAk. To deal with the absence of a trusted central server, we propose the Distributed Laplace Perturbation Algorithm (DLPA) to add noise in a distributed way in order to guarantee differential privacy. To the best of our knowledge, DLPA is the first distributed differentially private algorithm that can scale with a large number of users: DLPA outperforms the only other distributed solution for differential privacy proposed so far, by reducing the computational load per user from O(U) to O(1) where U is the number of users. 1
Airavat: Security and Privacy for MapReduce
, 2009
"... The cloud computing paradigm, which involves distributed computation on multiple largescale datasets, will become successful only if it ensures privacy, confidentiality, and integrity for the data belonging to individuals and organizations. We present Airavat, a novel integration of decentralized i ..."
Abstract

Cited by 82 (2 self)
 Add to MetaCart
(Show Context)
The cloud computing paradigm, which involves distributed computation on multiple largescale datasets, will become successful only if it ensures privacy, confidentiality, and integrity for the data belonging to individuals and organizations. We present Airavat, a novel integration of decentralized information flow control (DIFC) and differential privacy that provides strong security and privacy guarantees for MapReduce computations. Airavat allows users to use arbitrary mappers, prevents unauthorized leakage of sensitive data during the computation, and supports automatic declassification of the results when the latter do not violate individual privacy. Airavat minimizes the amount of trusted code in the system and allows users without security expertise to perform privacypreserving computations on sensitive data. Our prototype implementation demonstrates the flexibility of Airavat on a wide variety of case studies. The prototype is efficient, with runtimes on Amazon’s cloud computing infrastructure within 25 % of a MapReduce system with no security.
Composition attacks and auxiliary information in data privacy
 CoRR
, 2008
"... Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channel ..."
Abstract

Cited by 78 (6 self)
 Add to MetaCart
(Show Context)
Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. Specifically, we investigate the effectiveness of current anonymization schemes in preserving privacy when multiple organizations independently release anonymized data about overlapping populations. 1. We investigate composition attacks, in which an adversary uses independent anonymized releases to breach privacy. We explain why recently proposed models of limited auxiliary information fail to capture composition attacks. Our experiments demonstrate that even a simple instance of a composition attack can breach privacy in practice, for a large class of currently proposed techniques. The class includes kanonymity and several recent variants. 2. On a more positive note, certain randomizationbased notions of privacy (such as differential privacy) provably resist composition attacks and, in fact, the use of arbitrary side information. This resistance enables “standalone ” design of anonymization schemes, without the need for explicitly keeping track of other releases. We provide a precise formulation of this property, and prove that an important class of relaxations of differential privacy also satisfy the property. This significantly enlarges the class of protocols known to enable modular design. 1.
No Free Lunch in Data Privacy
"... Differential privacy is a powerful tool for providing privacypreserving noisy query answers over statistical databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. It is frequently accompanied by popularized claims that i ..."
Abstract

Cited by 78 (6 self)
 Add to MetaCart
(Show Context)
Differential privacy is a powerful tool for providing privacypreserving noisy query answers over statistical databases. It guarantees that the distribution of noisy query answers changes very little with the addition or deletion of any tuple. It is frequently accompanied by popularized claims that it provides privacy without any assumptions about the data and that it protects against attackers who know all but one record. In this paper we critically analyze the privacy protections offered by differential privacy. First, we use a nofreelunch theorem, which defines nonprivacy as a game, to argue that it is not possible to provide privacy and utility without making assumptions about how the data are generated. Then we explain where assumptions are needed. We argue that privacy of an individual is preserved when it is possible to limit the inference of an attacker about the participation of the individual in the data generating process. This is different from limiting the inference about the presence of a tuple (for example, Bob’s participation in a social network may cause edges to form between pairs of his friends, so that it affects more than just the tuple labeled as “Bob”). The definition of evidence of participation, in turn, depends on how the data are generated – this is how assumptions enter the picture. We explain these ideas using examples from social network research as well as tabular data for which deterministic statistics have been previously released. In both cases the notion of participation varies, the use of differential privacy can lead to privacy breaches, and differential privacy does not always adequately limit inference about participation.
Differential privacy under continual observation
 In STOC
, 2010
"... Differential privacy is a recent notion of privacy tailored to privacypreserving data analysis [10]. Up to this point, research on differentially private data analysis has focused on the setting of a trusted curator holding a large, static, data set; thus every computation is a “oneshot ” object: ..."
Abstract

Cited by 65 (3 self)
 Add to MetaCart
(Show Context)
Differential privacy is a recent notion of privacy tailored to privacypreserving data analysis [10]. Up to this point, research on differentially private data analysis has focused on the setting of a trusted curator holding a large, static, data set; thus every computation is a “oneshot ” object: there is no point in computing something twice, since the result will be unchanged, up to any randomness introduced for privacy. However, many applications of data analysis involve repeated computations, either because the entire goal is one of monitoring, e.g., of traffic conditions, search trends, or incidence of influenza, or because the goal is some kind of adaptive optimization, e.g., placement of data to minimize access costs. In these cases, the algorithm must permit continual observation of the system’s state. We therefore initiate a study of differential privacy under continual observation. We identify the problem of maintaining a counter in a privacy preserving manner and show its wide applicability to many different problems.
Privacypreserving aggregation of timeseries data
 In NDSS
, 2011
"... We consider how an untrusted data aggregator can learn desired statistics over multiple participants ’ data, without compromising each individual’s privacy. We propose a construction that allows a group of participants to periodically upload encrypted values to a data aggregator, such that the aggre ..."
Abstract

Cited by 64 (5 self)
 Add to MetaCart
We consider how an untrusted data aggregator can learn desired statistics over multiple participants ’ data, without compromising each individual’s privacy. We propose a construction that allows a group of participants to periodically upload encrypted values to a data aggregator, such that the aggregator is able to compute the sum of all participants ’ values in every time period, but is unable to learn anything else. We achieve strong privacy guarantees using two main techniques. First, we show how to utilize applied cryptographic techniques to allow the aggregator to decrypt the sum from multiple ciphertexts encrypted under different user keys. Second, we describe a distributed data randomization procedure that guarantees the differential privacy of the outcome statistic, even when a subset of participants might be compromised. 1