Results 1 - 10 of 117
Privacy Preserving Association Rule Mining in Vertically Partitioned Data
- In The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002
Cited by 295 (21 self)
Privacy considerations often constrain data mining projects. This paper addresses the problem of association rule mining where transactions are distributed across sources. Each site holds some attributes of each transaction, and the sites wish to collaborate to identify globally valid association rules. However, the sites must not reveal individual transaction data. We present a two-party algorithm for efficiently discovering frequent itemsets with minimum support levels, without either site revealing individual transaction values.
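As a rough illustration of the setting (not the paper's protocol), the support of an itemset whose items are held by two different sites can be written as the dot product of the sites' 0/1 columns over the shared transactions. The sketch below computes that quantity in the clear, so it reveals both columns; a privacy-preserving protocol would aim to obtain the same count without either site seeing the other's column. The function name `support_count` and the sample data are hypothetical.

```python
def support_count(site_a_column, site_b_column):
    """Count transactions in which both sites' itemsets are present.

    Each column holds one 0/1 flag per transaction, in the same
    transaction order at both sites (an assumption of this sketch).
    """
    return sum(a * b for a, b in zip(site_a_column, site_b_column))


# Hypothetical data: six transactions, one itemset flag per site.
site_a = [1, 0, 1, 1, 0, 1]
site_b = [1, 1, 1, 0, 0, 1]

count = support_count(site_a, site_b)
support = count / len(site_a)
is_frequent = support >= 0.4  # 0.4 is an arbitrary minimum support level
```

Here the joint itemset appears in 3 of the 6 transactions, so it passes the assumed minimum support of 0.4.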
The new Casper: Query processing for location services without compromising privacy
- IN PROC. OF THE 32ND INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, VLDB, 2006
Cited by 234 (7 self)
In this paper, we present a new privacy-aware query processing framework, Casper, in which mobile and stationary users can obtain snapshot and/or continuous location-based services without revealing their private location information. In particular, we propose a privacy-aware query processor embedded inside a location-based database server to deal with snapshot and continuous queries based on the knowledge of the user’s cloaked location rather than the exact location. Our proposed privacy-aware query processor is completely independent of how we compute the user’s cloaked location. In other words, any existing location anonymization algorithm that blurs the user’s private location into cloaked rectilinear areas can be employed to protect the user’s location privacy. We first propose a privacy-aware query processor that not only supports three new privacy-aware query types, but also achieves a trade-off between query processing cost and answer optimality. Then, to improve system scalability of processing continuous privacy-aware queries, we propose a shared execution paradigm that shares query processing among a large number of continuous queries. The proposed scalable paradigm can be tuned through two parameters to trade off between system scalability and answer optimality. Experimental results show that our query processor achieves high quality snapshot and continuous location-based services while
Privacy-Preserving K-Means Clustering over Vertically Partitioned Data
- IN SIGKDD, 2003
Cited by 167 (10 self)
Privacy and security concerns can prevent sharing of data, derailing data mining projects. Distributed knowledge discovery, if done correctly, can alleviate this problem. The key is to obtain valid results, while providing guarantees on the (non)disclosure of data. We present a method for k-means clustering when different sites contain different attributes for a common set of entities. Each site learns the cluster of each entity, but learns nothing about the attributes at other sites.
Random projection-based multiplicative data perturbation for privacy preserving distributed data mining
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006
Cited by 94 (6 self)
This paper explores the possibility of using multiplicative random projection matrices for privacy-preserving distributed data mining. It specifically considers the problem of computing statistical aggregates like the inner product matrix, correlation coefficient matrix, and Euclidean distance matrix from distributed privacy-sensitive data possibly owned by multiple parties. This class of problems is directly related to many other data-mining problems such as clustering, principal component analysis, and classification. This paper makes primary contributions on two different grounds. First, it explores Independent Component Analysis as a possible tool for breaching privacy in deterministic multiplicative perturbation-based models such as random orthogonal transformation and random rotation. Then, it proposes an approximate random projection-based technique to improve the level of privacy protection while still preserving certain statistical characteristics of the data. The paper presents extensive theoretical analysis and experimental results. Experiments demonstrate that the proposed technique is effective and can be successfully used for different types of privacy-preserving data mining applications.
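A minimal sketch of why a random projection can preserve inner products in expectation, assuming a k-by-d matrix R with independent standard normal entries and a 1/sqrt(k) scaling. This is a generic illustration of the statistical property, not the paper's specific construction; the helper names and sample vectors are hypothetical, and k is chosen large here only so that the estimate is visibly close to the exact value.

```python
import random


def random_projection_matrix(k, d, rng):
    """Return a k x d matrix with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]


def project(matrix, vector):
    """Compute (1 / sqrt(k)) * R @ vector using plain lists."""
    scale = 1.0 / (len(matrix) ** 0.5)
    return [scale * sum(r * v for r, v in zip(row, vector)) for row in matrix]


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(42)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [5.0, 4.0, 3.0, 2.0, 1.0]

R = random_projection_matrix(4000, len(x), rng)

exact = dot(x, y)
# The inner product of the projected vectors is an unbiased estimate of
# the exact inner product; its variance shrinks as k grows.
estimate = dot(project(R, x), project(R, y))
```

Only the projected vectors would need to be shared to form the estimate, which is what makes the approach attractive for aggregates such as inner product or Euclidean distance matrices.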
PDA: privacy-preserving data aggregation in wireless sensor networks
- IN: PROCEEDINGS OF THE IEEE INFOCOM 2007, 2007
Cited by 54 (2 self)
Providing efficient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present two privacy-preserving data aggregation schemes for additive aggregation functions. The first scheme, Cluster-based Private Data Aggregation (CPDA), leverages a clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The second scheme, Slice-Mix-AggRegaTe (SMART), builds on slicing techniques and the associative property of addition. It has the advantage of incurring less computation overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We assess the two schemes by privacy-preservation efficacy, communication overhead, and data aggregation accuracy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme, TAG, in which no data privacy protection is provided. Results show the efficacy and efficiency of our schemes. To the best of our knowledge, this paper is among the first on privacy-preserving data aggregation in wireless sensor networks.
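The slicing idea behind an additive scheme such as SMART can be sketched as follows, under simplifying assumptions: each node splits its private integer reading into random pieces that sum to the reading, keeps one piece, and hands the rest to other nodes; every node then reports only the sum of the pieces it holds. Because addition is associative and commutative, the reported values still add up to the true total. Link encryption, neighbor selection, and the aggregation tree are omitted, and all names and parameters below are hypothetical rather than taken from the paper.

```python
import random


def slice_value(value, num_slices, rng):
    """Split an integer into num_slices random integers that sum to value."""
    pieces = [rng.randint(-1000, 1000) for _ in range(num_slices - 1)]
    pieces.append(value - sum(pieces))
    return pieces


def slice_and_mix(readings, num_slices, seed=0):
    """Return the per-node mixed values after slicing and redistribution.

    Assumes at least two nodes, so every node has someone to send to.
    """
    rng = random.Random(seed)
    n = len(readings)
    mixed = [0] * n
    for node, value in enumerate(readings):
        pieces = slice_value(value, num_slices, rng)
        mixed[node] += pieces[0]  # keep one slice locally
        others = [i for i in range(n) if i != node]
        for piece in pieces[1:]:
            # Send each remaining slice to a randomly chosen other node.
            mixed[rng.choice(others)] += piece
    return mixed


readings = [21, 19, 24, 22]          # private per-node sensor values
mixed = slice_and_mix(readings, num_slices=3)
total = sum(mixed)                   # what the aggregator would compute
```

An individual entry of `mixed` generally bears no obvious relation to that node's reading, while `total` equals the sum of the original readings.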
Privacy Preserving Frequent Itemset Mining
- 2002
Cited by 53 (3 self)
One crucial aspect of privacy preserving frequent itemset mining is the fact that the mining process deals with a trade-off: privacy and accuracy, which are typically contradictory, and improving one usually incurs a cost in the other. One alternative to address this particular problem is to look for a balance between hiding restrictive patterns and disclosing nonrestrictive ones. In this paper, we propose a new framework for enforcing privacy in mining frequent itemsets. We combine, in a single framework, techniques for efficiently hiding restrictive patterns: a transaction retrieval engine relying on an inverted file and Boolean queries; and a set of algorithms to "sanitize" a database. In addition, we introduce performance measures for mining frequent itemsets that quantify the fraction of mining patterns which are preserved after sanitizing a database. We also report the results of a performance evaluation of our research prototype and an analysis of the results.
Towards Collaborative Security and P2P Intrusion Detection
- In Proceedings of the IEEE Information Assurance Workshop (IAW), 2005
Cited by 49 (2 self)
The increasing array of Internet-scale threats is a pressing problem for every organization that utilizes the network. Organizations have limited resources to detect and respond to these threats. The end-to-end (E2E) sharing of information related to probes and attacks is a facet of an emerging trend toward "collaborative security." The key benefit of a collaborative approach to intrusion detection is a better view of global network attack activity. Augmenting the information obtained at a single site with information gathered from across the network can provide a more precise model of an attacker's behavior and intent. While many organizations see value in adopting such a collaborative approach, some challenges must be addressed before intrusion detection can be performed on an inter-organizational scale. We report on our...
Privacy-preserving decision trees over vertically partitioned data
- IN THE PROCEEDINGS OF THE 19TH ANNUAL IFIP WG 11.3 WORKING CONFERENCE ON DATA AND APPLICATIONS SECURITY
Cited by 45 (2 self)
Privacy and security concerns can prevent sharing of data, derailing data mining projects. Distributed knowledge discovery, if done correctly, can alleviate this problem. In this paper, we tackle the problem of classification. We introduce a generalized privacy preserving variant of the ID3 algorithm for vertically partitioned data distributed over two or more parties. Along with the algorithm, we give a complete proof of security that gives a tight bound on the information revealed.
A Practical Approach to Solve Secure Multi-Party Computation Problems
- IN NEW SECURITY PARADIGMS WORKSHOP, 2002
Cited by 43 (1 self)
Secure Multi-party Computation (SMC) problems deal with the following situation: Two (or many) parties want to jointly perform a computation. Each party needs to contribute its private input to this computation, but no party should disclose its private inputs to the other parties, or to any third party. With the proliferation of the Internet, SMC problems become more and more important. So far no practical solution has emerged, largely because SMC studies have been focusing on zero information disclosure, an ideal security model that is expensive to achieve. Aiming at developing practical solutions to SMC problems, we propose a new paradigm, in which we use an acceptable security model that allows partial information disclosure. Our conjecture is that by lowering the restriction on the security, we can achieve a much better performance. The paradigm is motivated by the observation that in practice people do accept a less secure but much more efficient solution, because disclosing information about their private data to a certain degree is sometimes a risk that many people would rather take if the performance gain is significant enough. Moreover, in our paradigm, the security is adjustable, such that users can adjust the level of security based on their definition of the acceptable security. We have developed a number of techniques under this new paradigm, and are currently conducting extensive studies based on this new paradigm.
Privacy preserving regression modelling via distributed computation
- In Proc. Tenth ACM SIGKDD Internat. Conf. on Knowledge Discovery and Data Mining, 2004
"... www.niss.org ..."