Computational Disclosure Control - A Primer on Data Privacy Protection
- Massachusetts Institute of Technology
, 2001
Cited by 23 (2 self)
Today's globally networked society places great demand on the dissemination and sharing of person-specific data for many new and exciting uses. Even situations where aggregate statistical information was once the reporting norm now rely heavily on the transfer of microscopically detailed transaction and encounter information. This happens at a time when more and more historically public information is also electronically available. When these data are linked together, they provide an electronic shadow of a person or organization that is as identifying and personal as a fingerprint, even when the information contains no explicit identifiers such as name and phone number. Other distinctive data, such as birth date and ZIP code, often combine uniquely and can be linked to publicly available information to re-identify individuals. Producing anonymous data that remain specific enough to be useful is often a very difficult task, and practice today tends either to assume incorrectly that confidentiality is maintained when it is not, or to produce data that are practically useless. The goal of the work presented in this book is to explore computational techniques for releasing useful information in such a way that the identity of any individual or entity contained in data cannot be recognized while the data remain practically useful. I begin by demonstrating ways to learn information about entities from publicly available information. I then provide a formal framework for reasoning about disclosure control and the ability to infer the identities of entities contained within the data. I formally define and present null-map, k-map and wrong-map as models of protection. Each model provides protection by ensuring that released information maps to no, k or incorrect entities, respectively. The book ends by examining four computational systems that attempt to maintain privacy while releasing electronic information.
These systems are: (1) my Scrub System, which locates personally-identifying information in letters between doctors and notes written by clinicians; (2) my Datafly II System, which generalizes and suppresses values in field-structured data sets; (3) Statistics Netherlands' m-Argus System, which is becoming a European standard for producing public-use data; and (4) my k-Similar algorithm, which finds optimal solutions such that data are minimally distorted while still providing adequate protection. By introducing anonymity and quality metrics, I show that Datafly II can overprotect data, Scrub and m-Argus can fail to provide adequate protection, but k-Similar finds optimal results.
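The idea that birth date and ZIP code "often combine uniquely", and that generalizing values can restore ambiguity, can be illustrated with a small sketch. This is not Datafly II or k-Similar; the records, the `generalize` rule (keep the birth year, truncate the ZIP code) and the function names are all hypothetical. It also counts group sizes only inside the released table, which is a simplification: the k-map model in the abstract is stated in terms of the entities that released information maps to.

```python
from collections import Counter

def generalize(record, zip_digits=3):
    """Coarsen the quasi-identifiers: keep only the birth year and mask
    the trailing digits of the ZIP code. (Hypothetical rule.)"""
    birth_date, zip_code, diagnosis = record
    masked_zip = zip_code[:zip_digits] + "*" * (len(zip_code) - zip_digits)
    return (birth_date[:4], masked_zip, diagnosis)

def smallest_group(records):
    """Size of the smallest set of records sharing (birth, ZIP) values."""
    counts = Counter((birth, zip_code) for birth, zip_code, _ in records)
    return min(counts.values())

# Hypothetical toy data: (birth date, ZIP code, sensitive value)
raw = [
    ("1964-09-20", "02139", "flu"),
    ("1964-03-02", "02138", "asthma"),
    ("1965-07-11", "02141", "flu"),
    ("1965-01-30", "02142", "diabetes"),
]

released = [generalize(r) for r in raw]

print(smallest_group(raw))       # 1: every raw record is unique
print(smallest_group(released))  # 2: each released record has a twin
```

The trade-off named in the abstract is visible even here: the released table is less identifying, but exact birth dates and full ZIP codes are no longer available to its users.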
A deontic logic for reasoning about confidentiality
Cited by 11 (2 self)
This paper presents a deontic logic Σ for reasoning about permission or prohibition to know some parts of the database content in the context of a multilevel confidentiality policy. The most important logical features in the definition of a multilevel policy are that each confidentiality level is defined by a set of sentences and that, when the policy is designed, the permission to know is not necessarily the complement of the prohibition to know. These concepts are formalized in a modal logic where deontic modalities, doxastic modalities and confidentiality levels are interpreted by non-standard modal models. The corresponding axiomatics is also presented in the paper and its soundness and completeness have been proved. A limitation of the Σ logic is that sentences in the scope of modalities are sentences of Propositional Calculus. Finally, it is shown how the logic can be used to express constraints to guarantee the consistency of a policy or to prevent the existence of inference channels, that is, the possibility of inferring sentences that one is not permitted to know from other sentences that one is permitted to know. Both deductive and abductive channels are considered. Keywords: knowledge representation, deontic logic, database security.
A Practical Formalism for Imprecise Inference Control
- Proceedings of the 8th IFIP WG11.3 Workshop on Database Security
, 1994
Cited by 8 (4 self)
This paper describes a powerful, yet practical, formalism for modeling and controlling imprecise FD-based inference in relational database systems. The formalism provides a canonical representation of inference which unifies precise inference and the primitive imprecise inference mechanisms of abduction and partial deduction. Whereas other imprecise (partial) inference models estimate the probability of making inferences, the formalism supports the analysis of the actual imprecise values inferred in a database extension. Imprecise inference is analyzed by transforming a precise database augmented with additional "catalytic" relations, conveying possibly imprecise a priori knowledge, into an equivalent imprecise database. The analysis of imprecise inference and the related inference control methodology are highly flexible and robust. They can be directly applied to classical, MLS, and imprecise databases. With minimal modifications, they also can be used in knowledge discovery or databa...
Compromising Privacy with Trail Re-Identification: The REIDIT Algorithms
, 2002
Cited by 5 (3 self)
Re-identification is the process of relating unique and specific entities to seemingly anonymous data, and as such, is an attack on the privacy of a data collection. This work introduces a new re-identification attack, termed the trail problem, for data distributed over multiple locations. Through the use of data trails, an adversary can independently reconstruct the trails of locations that identified entities and their un-identified data visited, which can then be employed for re-identification via trail matching. The attack strategy is based on the premise that data collecting institutions partition and release a dataset as multiple subsets, such that one release contains identifying attributes (e.g. name, social security number, phone number) and a second is devoid of these attributes (e.g. DNA sequences). The trail attack depends on whether the identified data are always collected with the un-identified data, termed complete, or whether one of the attributes is under-collected, termed incomplete. Both the complete and incomplete trail problems are formalized and several novel algorithms for re-identification are introduced. Examples are drawn from the areas of clickstream, DNA sequence, health, and video data.
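A minimal sketch of trail matching in the complete case, where identified and un-identified data are always collected together, might look as follows. It is not one of the REIDIT algorithms from the paper; the location names, entity names and the functions `build_trails` and `match_unique_trails` are hypothetical. Each location publishes one list of identities and one list of de-identified items, and an identity is linked to an item only when both have the same trail and that trail is unique on each side.

```python
from collections import defaultdict

def build_trails(releases):
    """Map each entity to the frozenset of locations whose release lists it."""
    trails = defaultdict(set)
    for location, entities in releases.items():
        for entity in entities:
            trails[entity].add(location)
    return {entity: frozenset(locs) for entity, locs in trails.items()}

def group_by_trail(trails):
    """Invert an entity -> trail mapping into trail -> list of entities."""
    groups = defaultdict(list)
    for entity, trail in trails.items():
        groups[trail].append(entity)
    return groups

def match_unique_trails(identified, deidentified):
    """Link an identity to a de-identified item when exactly one of each
    shares a given trail (complete-trail assumption)."""
    names_by_trail = group_by_trail(build_trails(identified))
    items_by_trail = group_by_trail(build_trails(deidentified))
    matches = {}
    for trail, names in names_by_trail.items():
        items = items_by_trail.get(trail, [])
        if len(names) == 1 and len(items) == 1:
            matches[names[0]] = items[0]
    return matches

# Hypothetical releases from three locations
identified = {
    "H1": ["alice", "bob", "carol"],
    "H2": ["alice", "carol"],
    "H3": ["bob", "carol", "dave", "erin"],
}
deidentified = {
    "H1": ["d1", "d2", "d3"],
    "H2": ["d1", "d3"],
    "H3": ["d2", "d3", "d4", "d5"],
}

matches = match_unique_trails(identified, deidentified)
print(matches)
```

In this toy data, dave and erin share the trail {H3}, so neither is linked; the other three identities each have a trail that no one else shares.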
Designing Security Agents for the DOK Federated System
- Database Security XI
, 1998
Cited by 4 (0 self)
This paper addresses two main issues of the DOK system [15], namely the design of a framework for enforcing security policies and a secure architecture that implements such a framework. Federated security policies are expressed as logic-based expressions (called "aggregation constraints") specifying the different combinations of transactions that a user is not allowed to issue, either in single or multiple states of a federation. To enable efficient monitoring of aggregation constraints, state transition graphs are generated to model the different sub-computations of the constraints. Two marking techniques, namely LMT (Linear Marking Technique) and ZMT (Zigzag Marking Technique), are proposed to detect violations of federated security policies. To enable effective enforcement of security policies, we designed a secure DOK architecture using specialised agents: (i) coordination agents allow the coordination of different federated activities, (ii) task agents perform specific tasks of th...
IRI: A Quantitative Approach to Inference Analysis in Relational Databases
- Proc. IFIP WG 11.3 Working Conference on Database Security
, 1997
Cited by 3 (1 self)
A new approach is introduced to evaluate inference risks in element-level labelling relational databases. Techniques from rough set theory are used to capture the semantics of data, and a quantitative measure, the Inference Risk Index (IRI), has been defined to characterise possible inference risks due to material implications reflected by the data. The approach is shown to be able to take into account all certain and possible material implications in the data, including functional dependencies. It can also be used to address inference threats posed by rule-induction techniques from data mining. A major advantage of our approach is that the quantitative measure IRI is computed directly from data without knowledge input from the System Security Officer. The computation is efficient and allows for real-time monitoring of inference risks during database run-time. Therefore, we are able to follow the changes in data patterns during database lifetime. Keywords: inference risk, relational databa...
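The distinction between certain and possible implications has a standard rough-set reading: the lower approximation of a set collects the rows whose visible attribute values guarantee membership, and the upper approximation collects the rows whose visible values merely allow it. The sketch below computes both for a toy table. It does not compute the IRI, whose definition is not given in the abstract; the table, the attribute names and the function `approximations` are hypothetical.

```python
from collections import defaultdict

def approximations(rows, visible, target):
    """Return (lower, upper) approximations, as sets of row indices, of the
    rows satisfying `target`, using only the `visible` attributes."""
    # Rows with identical visible values are indiscernible to a low user.
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in visible)].add(i)

    goal = {i for i, row in enumerate(rows) if target(row)}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= goal:   # visible values certainly imply the target
            lower |= members
        if members & goal:    # visible values possibly imply the target
            upper |= members
    return lower, upper

# Hypothetical table: `rank` and `unit` are visible, `mission` is protected
rows = [
    {"rank": "major",   "unit": "A", "mission": "secret"},
    {"rank": "major",   "unit": "A", "mission": "secret"},
    {"rank": "captain", "unit": "B", "mission": "secret"},
    {"rank": "captain", "unit": "B", "mission": "routine"},
    {"rank": "private", "unit": "C", "mission": "routine"},
]

lower, upper = approximations(
    rows, ["rank", "unit"], lambda r: r["mission"] == "secret"
)
print(sorted(lower))  # rows where the visible values settle the question
print(sorted(upper))  # rows where a secret mission cannot be ruled out
```

Here the rows for (major, A) fall in the lower approximation, so a low-level user who sees those values can infer the protected one with certainty, while the (captain, B) rows appear only in the upper approximation.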
On Rough Sets and Inference Analysis
- In International Workshop on Information Security, LNCS
, 1997
Cited by 2 (0 self)
In this paper, we give an overview of a promising approach to inference detection and analysis in relational databases, first introduced in [25]. The approach employs techniques from rough set theory and is able to take into account all certain and possible material implications in the data, including functional dependencies. It can also be used to address inference threats posed by rule-induction techniques from data mining. A major advantage of this approach is that the quantitative measure IRI is computed directly from data without knowledge input from the System Security Officer. By comparing it with other techniques, we attempt to convey the merits of the rough-set-based approach. 1 Introduction In multilevel databases, the inference problem has long been identified as a major threat to security. An inference problem in a multilevel database arises when a user with a low-level clearance, accessing information of low classification, is able to draw conclusions about information at higher ...
Inference and Aggregation Issues In Secure Database Management Systems
Cited by 1 (0 self)
This report is the first of five companion documents to the Trusted Database Management System Interpretation of the Trusted Computer System Evaluation Criteria. The companion documents address topics that are important to the design and development of secure database management systems, and are written for database vendors, system designers, evaluators, and researchers. This report addresses inference and aggregation issues in secure database management systems. Keith F. Brewster, Acting Chief, Partnerships and Processes, May
Aggregation
The aggregation problem arises whenever some collection of data has a classification strictly greater than that of the individual data forming the aggregate. This paper addresses such a problem in the context of federated databases, where data can be distributed and heterogeneous. Aggregation constraints are modelled as logic expressions of the FEderated Logic Language (FELL), and these specify the combinations of basic transactions that need to be checked against either single or multiple states of a federation. Transitional graphs are built from aggregation constraints to model all the required sub-computations for enforcing federated security policies. We also develop a monitoring technique, called Non-Linear Marking Technique (NLMT), to check the nodes of transitional graphs against the past and current states of a federation. Keywords: Federated databases, inference and aggregation, transitional graph. 1 Introduction Database security is concerned with ensuring privacy, secrecy and inte...

