State-of-the-art in Privacy Preserving Data Mining
- SIGMOD Record, 2004
Cited by 159 (6 self)
We provide here an overview of the new and rapidly emerging research area of privacy preserving data mining. We also propose a classification hierarchy that sets the basis for analyzing the work which has been performed in this context. A detailed review of the work accomplished in this area is also given, along with the coordinates of each work to the classification hierarchy. A brief evaluation is performed, and some initial conclusions are made.
Random projection-based multiplicative data perturbation for privacy preserving distributed data mining
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006
Cited by 94 (6 self)
This paper explores the possibility of using multiplicative random projection matrices for privacy preserving distributed data mining. It specifically considers the problem of computing statistical aggregates like the inner product matrix, correlation coefficient matrix, and Euclidean distance matrix from distributed privacy sensitive data possibly owned by multiple parties. This class of problems is directly related to many other data-mining problems such as clustering, principal component analysis, and classification. This paper makes primary contributions on two different grounds. First, it explores Independent Component Analysis as a possible tool for breaching privacy in deterministic multiplicative perturbation-based models such as random orthogonal transformation and random rotation. Then, it proposes an approximate random projection-based technique to improve the level of privacy protection while still preserving certain statistical characteristics of the data. The paper presents extensive theoretical analysis and experimental results. Experiments demonstrate that the proposed technique is effective and can be successfully used for different types of privacy-preserving data mining applications.
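The idea that a random projection can preserve inner products can be illustrated with a minimal sketch. It assumes a k x m matrix whose entries are drawn i.i.d. from N(0, sigma^2) and a scaling of 1 / (sqrt(k) * sigma), under which inner products are preserved in expectation. The function names, parameters, and toy vectors are hypothetical and are not taken from the paper.

```python
import math
import random


def random_projection(records, k, sigma=1.0, seed=0):
    """Project each m-dimensional record to k dimensions (illustrative only).

    Assumption: every entry of the k x m matrix is drawn i.i.d. from
    N(0, sigma^2), and the output is scaled by 1 / (sqrt(k) * sigma).
    """
    rng = random.Random(seed)
    m = len(records[0])
    matrix = [[rng.gauss(0.0, sigma) for _ in range(m)] for _ in range(k)]
    scale = 1.0 / (math.sqrt(k) * sigma)
    return [
        [scale * sum(row[j] * x[j] for j in range(m)) for row in matrix]
        for x in records
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
    px, py = random_projection([x, y], k=3000)
    # The two values should be close, but not identical.
    print("original inner product: ", dot(x, y))
    print("projected inner product:", dot(px, py))
```

The approximation error shrinks as k grows, so in this sketch k trades accuracy of the aggregates against how far the released data departs from the original.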
Database security -- concepts, approaches, and challenges
- IEEE TRANS. DEPENDABLE SECUR. COMPUT., 2005
Cited by 28 (3 self)
As organizations increase their reliance on possibly distributed information systems for daily business, they become more vulnerable to security breaches even as they gain productivity and efficiency advantages. Though a number of techniques, such as encryption and electronic signatures, are currently available to protect data when transmitted across sites, a truly comprehensive approach for data protection must also include mechanisms for enforcing access control policies based on data contents, subject qualifications and characteristics, and other relevant contextual information, such as time. It is well understood today that the semantics of data must be taken into account in order to specify effective access control policies. Also, techniques for data integrity and availability specifically tailored to database systems must be adopted. In this respect, over the years the database security community has developed a number of different techniques and approaches to assure data confidentiality, integrity, and availability. However, despite such advances, the database security area faces several new challenges. Factors such as the evolution of security concerns, the “disintermediation” of access to data, and new computing paradigms and applications, such as grid-based computing and on-demand business, have introduced both new security requirements and new contexts in which to apply and possibly extend current approaches. In this paper, we first survey the most relevant concepts underlying the notion of database security and summarize the most well-known techniques. We focus on access control systems, to which a large body of research has been devoted, and describe the key access control models, namely, the discretionary and mandatory access control models, and the role-based access control (RBAC) model. We also discuss security for advanced data management systems, and cover topics such as access control for XML. We then discuss current challenges for database security and some preliminary approaches that address some of these challenges.
On Inverse Frequent Set Mining
- 2003
Cited by 26 (2 self)
Frequent set mining is a well-known technique to summarize binary data. However, it is an open problem how difficult it is to invert frequent set mining, i.e., how difficult it is to find a binary data set that is compatible with frequent set mining results, the frequent sets. This inverse data mining problem is related to the questions of how well privacy is preserved in the frequent sets and how well the frequent sets characterize the original data set. In this paper we analyze the computational complexity of the problem of finding a binary data set compatible with a given collection of frequent sets and show that in many cases the problem is computationally very difficult.
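To make the notion of compatibility concrete, the following minimal sketch checks a candidate data set against a collection of itemsets. It assumes that "compatible" means each listed itemset occurs in exactly the stated number of rows, which may be narrower than the paper's definition; the function names and toy data are hypothetical. Checking a given data set this way is straightforward, whereas the abstract's hardness result concerns finding such a data set.

```python
def support(dataset, itemset):
    """Count the rows (sets of items) that contain every item of `itemset`."""
    return sum(1 for row in dataset if itemset <= row)


def is_compatible(dataset, required_supports):
    """Return True when every listed itemset has exactly its required support.

    Assumption: `required_supports` maps frozenset itemsets to row counts.
    """
    return all(
        support(dataset, itemset) == count
        for itemset, count in required_supports.items()
    )


if __name__ == "__main__":
    rows = [{"a", "b"}, {"a"}, {"b", "c"}]
    required = {frozenset("a"): 2, frozenset("ab"): 1, frozenset("c"): 1}
    print(is_compatible(rows, required))
```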
Secure Association Rule Sharing
- Advances in Knowledge Discovery and Data Mining, 8th Pacific-Asia Conference, PAKDD 2004, 2004
Cited by 17 (1 self)
The sharing of association rules is often beneficial in industry, but requires privacy safeguards. One may decide to disclose only part of the knowledge and conceal strategic patterns which we call restrictive rules. These restrictive rules must be protected before sharing since they are paramount for strategic decisions and need to remain private. To address this challenging problem, we propose a unified framework for protecting sensitive knowledge before sharing. This framework encompasses: (a) an algorithm that sanitizes restrictive rules, while blocking some inference channels. We validate our algorithm against real and synthetic datasets; (b) a set of metrics to evaluate attacks against sensitive knowledge and the impact of the sanitization. We also introduce a taxonomy of sanitizing algorithms and a taxonomy of attacks against sensitive knowledge.
A Survey of Quantification of Privacy Preserving Data Mining Algorithms
Cited by 15 (0 self)
The aim of privacy preserving data mining (PPDM) algorithms is to extract relevant knowledge from large amounts of data while at the same time protecting sensitive information. An important aspect in the design of such algorithms is the identification of suitable evaluation criteria and the development of related benchmarks. Recent research in the area has devoted much effort to determining a trade-off between the right to privacy and the need for knowledge discovery. It is often the case that no privacy preserving algorithm exists that outperforms all the others on all possible criteria. Therefore, it is crucial to provide a comprehensive view on a set of metrics related to existing privacy preserving algorithms so that we can gain insights on how to design more effective measurement and PPDM algorithms. In this chapter, we review and summarize existing criteria and metrics in evaluating privacy preserving techniques.
A new framework of privacy preserving data sharing
- Proc. of the IEEE ICDM Workshop on Privacy and Security Aspects of Data Mining, 2004
Cited by 13 (2 self)
We introduce a dataset reconstruction based framework for privacy preserving data sharing. The proposed framework uses a constraint-based inverse itemset lattice mining technique to automatically generate a sample dataset to be released for sharing. In this framework, data owners can control the potential mine-able knowledge (frequent itemsets in our context) from the released dataset. Before generating the sample dataset, the potential mine-able knowledge set is checked for two aspects. The first is compliance with the user-specified security constraint and the trade-off principle, so that the sensitive patterns are well protected while the side-effect is tolerable. The second is consistency among itemset supports in the lattice, so that inverse dataset reconstruction is sensible. This mechanism offers the data owner total control of the potentially discoverable knowledge from publicly accessible datasets, and at the same time the released data matches the main features of the original dataset for sharable knowledge; thus the user’s privacy can be preserved.
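One simple form a consistency check on itemset supports could take is sketched below. It tests only anti-monotonicity (an itemset can never be less frequent than any of its supersets), which is a necessary condition and not the paper's full verification procedure; the function name and toy supports are hypothetical.

```python
def inconsistent_pairs(supports):
    """List (subset, superset) pairs that violate support anti-monotonicity.

    Assumption: `supports` maps frozenset itemsets to support counts. A pair is
    reported when a proper subset has a smaller support than its superset.
    """
    return [
        (small, large)
        for small in supports
        for large in supports
        if small < large and supports[small] < supports[large]
    ]


if __name__ == "__main__":
    good = {frozenset("a"): 3, frozenset("b"): 2, frozenset("ab"): 2}
    bad = {frozenset("a"): 1, frozenset("ab"): 2}
    print(inconsistent_pairs(good))
    print(inconsistent_pairs(bad))
```

An empty result does not guarantee that a matching dataset exists; it only rules out the most direct contradictions before reconstruction is attempted.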
Reconstruction-Based Association Rule Hiding
Cited by 8 (0 self)
As large repositories of data contain confidential rules that must be protected before being published, association rule hiding becomes one of the important privacy preserving data mining problems. Compared with traditional data modification methods, data reconstruction is a new and promising, but not sufficiently investigated, method, which is inspired by the inverse frequent set mining problem. In my PhD period, my research focuses on further investigating reconstruction-based techniques for association rule hiding. Particularly, we have proposed an FP-tree-based method for inverse frequent set mining, which can be used in our proposed reconstruction-based framework. We hope the proposed solution will fetch up the new reconstruction-based research track and work well according to the evaluation metrics including hiding effects, data utility, and time performance.
Inference Attacks in Peer-to-Peer Homogeneous Distributed Data Mining
- In 16th European Conference on Artificial Intelligence (ECAI 04), 2004
Cited by 4 (3 self)
Spontaneous formation of peer-to-peer agent-based data mining systems seems a plausible scenario in years to come. However, the emergence of peer-to-peer environments further exacerbates privacy and security concerns that arise when performing data mining tasks. We analyze potential threats to data privacy in a peer-to-peer agent-based distributed data mining scenario, and discuss inference attacks which could compromise data privacy in a peer-to-peer distributed clustering scheme known as KDEC.