Results 1 - 10
of
15
Worst-case background knowledge in privacy
- In ICDE
, 2007
"... Recent work has shown the necessity of considering an attacker’s background knowledge when reasoning about privacy in data publishing. However, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, it is important to consider the worst-case. In this pa ..."
Abstract
-
Cited by 56 (1 self)
- Add to MetaCart
Recent work has shown the necessity of considering an attacker’s background knowledge when reasoning about privacy in data publishing. However, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, it is important to consider the worst-case. In this paper, we initiate a formal study of worst-case background knowledge. We propose a language that can express any background knowledge about the data. We provide a polynomial time algorithm to measure the amount of disclosure of sensitive information in the worst case, given that the attacker has at most k pieces of information in this language. We also provide a method to efficiently sanitize the data so that the amount of disclosure in the worst case is less than a specified threshold. 1.
Privacy skyline: Privacy with multidimensional adversarial knowledge
- in VLDB, 2007
, 2007
"... Privacy is an important issue in data publishing. Many organizations distribute non-aggregate personal data for research, and they must take steps to ensure that an adversary cannot predict sensitive information pertaining to individuals with high confidence. This problem is further complicated by t ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
Privacy is an important issue in data publishing. Many organizations distribute non-aggregate personal data for research, and they must take steps to ensure that an adversary cannot predict sensitive information pertaining to individuals with high confidence. This problem is further complicated by the fact that, in addition to the published data, the adversary may also have access to other resources (e.g., public records and social networks relating individuals), which we call external knowledge. A robust privacy criterion should take this external knowledge into consideration. In this paper, we first describe a general framework for reasoning about privacy in the presence of external knowledge. Within this framework, we propose a novel multidimensional approach to quantifying an adversary’s external knowledge. This approach allows the publishing organization to investigate privacy threats and enforce privacy requirements in the presence of various types and amounts of external knowledge. Our main technical contributions include a multidimensional privacy criterion that is more intuitive and flexible than previous approaches to modeling background knowledge. In addition, we provide algorithms for measuring disclosure and sanitizing data that improve computational efficiency several orders of magnitude over the best known techniques. 1.
Materialized views in probabilistic databases for information exchange and query optimization
- IN PROCEEDINGS OF VLDB
, 2007
"... Views over probabilistic data contain correlations between tuples, and the current approach is to capture these correlations using explicit lineage. In this paper we propose an alternative approach to materializing probabilistic views, by giving conditions under which a view can be represented by a ..."
Abstract
-
Cited by 24 (9 self)
- Add to MetaCart
Views over probabilistic data contain correlations between tuples, and the current approach is to capture these correlations using explicit lineage. In this paper we propose an alternative approach to materializing probabilistic views, by giving conditions under which a view can be represented by a block-independent disjoint (BID) table. Not all views can be represented as BID tables and so we propose a novel partial representation that can represent all views but may not define a unique probability distribution. We then give conditions on when a query’s value on a partial representation will be uniquely defined. We apply our theory to two applications: query processing using views and information exchange using views. In query processing on probabilistic data, we can ignore the lineage and use materialized views to more efficiently answer queries. By contrast, if the view has explicit lineage, the query evaluation must reprocess the lineage to compute the query resulting in dramatically slower execution. The second application is information exchange when we do not wish to disclose the entire lineage, which otherwise may result in shipping the entire database. The paper contains several theoretical results that completely solve the problem of deciding whether a conjunctive view can be represented as a BID and whether a query on a partial representation is uniquely determined. We validate our approach experimentally showing that representable views exist in real and synthetic workloads and show over three magnitudes of improvement in query processing versus a lineage based approach.
Managing uncertainty in social networks
- IEEE DATA ENGINEERING BULLETIN
, 2007
"... Social network analysis (SNA) has become a mature scientific field over the last 50 years and is now an area with massive commercial appeal and renewed research interest. In this paper, we argue that new methods for collecting social nework strucuture, and the shift in scale of these networks, intro ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
Social network analysis (SNA) has become a mature scientific field over the last 50 years and is now an area with massive commercial appeal and renewed research interest. In this paper, we argue that new methods for collecting social nework strucuture, and the shift in scale of these networks, introduces a greater degree of imprecision that requires rethinking on how SNA techniques can be applied. We discuss a new area in data management, probabilistic databases, whose main research goal is to provide tools to manage and manipulate imprecise or uncertain data. We outline the application building blocks necessary to build a large scale social networking application and the extent to which current research in probabilisitc databases addresses these challenges.
Data Privacy for ALC Knowledge Bases
"... Abstract. Information systems support data privacy by granting access only to certain (public) views. The data privacy problem is to decide whether hidden (private) information may be inferred from the public views and some additional general background knowledge. We study the problem of provable pr ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
Abstract. Information systems support data privacy by granting access only to certain (public) views. The data privacy problem is to decide whether hidden (private) information may be inferred from the public views and some additional general background knowledge. We study the problem of provable privacy in the context of ALC knowledge bases. First we show that the ALC privacy problem wrt. concept retrieval and subsumption queries is ExpTime-complete. Then we provide a sufficient condition for data privacy that can be checked in PTime. 1
T.: A formal model of data privacy
- PSI’06. Volume 4378 of LNCS
, 2007
"... Abstract. Information systems support data privacy by constraining user’s access to public views and thereby hiding the non-public underlying data. The privacy problem is to prove that none of the private data can be inferred from the information which is made public. We present a formal definition ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Abstract. Information systems support data privacy by constraining user’s access to public views and thereby hiding the non-public underlying data. The privacy problem is to prove that none of the private data can be inferred from the information which is made public. We present a formal definition of the privacy problem which is based on the notion of certain answer. Then we investigate the privacy problem in the contexts of relational databases and ontology based information systems. 1
I.: Privacy-preserving query answering in logic-based information systems
- In: ECAI
, 2008
"... Abstract. We study privacy guarantees for the owner of an information system who wants to share some of the information in the system with clients while keeping some other information secret. The privacy guarantees ensure that publishing the new information will not compromise the secret one. We pre ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Abstract. We study privacy guarantees for the owner of an information system who wants to share some of the information in the system with clients while keeping some other information secret. The privacy guarantees ensure that publishing the new information will not compromise the secret one. We present a framework for describing privacy guarantees that generalises existing probabilistic frameworks in relational databases. We also formulate different flavors of privacy-preserving query answering as novel, purely logic-based reasoning problems and establish general connections between these reasoning problems and the probabilistic privacy guarantees. 1
Auditing a Batch of SQL Queries
"... Abstract. In this paper, we study the problem of auditing a batch of SQL queries: given a set of SQL queries that have been posed over a database, determine whether some subset of these queries have revealed private information about an individual or group of individuals. In [2], the authors studied ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract. In this paper, we study the problem of auditing a batch of SQL queries: given a set of SQL queries that have been posed over a database, determine whether some subset of these queries have revealed private information about an individual or group of individuals. In [2], the authors studied the problem of determining whether any single SQL query in isolation revealed information forbidden by the database system’s data disclosure policies. In this paper, we extend this work to the problem of auditing a batch of SQL queries. We define two different notions of auditing- semantic auditing and syntactic auditing- and show that while syntactic auditing seems more desirable, it is in fact NP-hard to achieve. The problem of semantic auditing of a batch of SQL queries is, however, tractable and we give a polynomial time algorithm for this purpose. 1
F.4.1 [Mathematical Logic]: Proof theory General Terms: Theory, security
, 2008
"... Data privacy, description logic ..."

