CiteSeerX

Results 1 - 10 of 141,035

Privacy-Preserving Data Mining

by Rakesh Agrawal, Ramakrishnan Srikant, 2000
"... A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Specifically, we address the following question. Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models with ..."
Cited by 844 (3 self)
without access to precise information in individual data records? We consider the concrete case of building a decision-tree classifier from training data in which the values of individual records have been perturbed. The resulting data records look very different from the original records

Automatic Subspace Clustering of High Dimensional Data

by Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan - Data Mining and Knowledge Discovery, 2005
"... Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the or ..."
Cited by 724 (12 self)
identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.

Optimizing Search Engines using Clickthrough Data

by Thorsten Joachims, 2002
"... This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system should present relevant documents high in the ranking, with less relevant documents following below. While previous approaches ..."
Cited by 1314 (23 self)
-log of the search engine in connection with the log of links the users clicked on in the presented ranking. Such clickthrough data is available in abundance and can be recorded at very low cost. Taking a Support Vector Machine (SVM) approach, this paper presents a method for learning retrieval functions. From a

A comparison of Bayesian methods for haplotype reconstruction from population genotype data

by Matthew Stephens, Peter Donnelly - Am J Hum Genet, 2003
"... In this report, we compare and contrast three previously published Bayesian methods for inferring haplotypes from genotype data in a population sample. We review the methods, emphasizing the differences between them in terms of both the models ("priors") they use and the computational str ..."
Cited by 557 (7 self)
individuals to assist in this endeavor, but in general such data may be either unavailable or only partially informative. We focus here on the problem of statistically inferring haplotypes from unphased genotype data for a sample of ("unrelated") individuals from a population. Several approaches

ℓ-diversity: Privacy beyond k-anonymity

by Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, Muthuramakrishnan Venkitasubramaniam - In ICDE, 2006
"... Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with resp ..."
Cited by 672 (13 self)
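
As a rough illustration of the definition quoted in this entry, the sketch below checks whether a small table is k-anonymous. The function name `is_k_anonymous`, the column names, the sample rows, and the choice of which attributes count as identifying are all assumptions made for this example; none of them come from the paper.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of values over the chosen
    attributes is shared by at least k records.

    `quasi_identifiers` is a hypothetical parameter naming the columns
    treated as identifying; which columns those are is an assumption.
    """
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized sample rows (not from any real dataset).
records = [
    {"zip": "130**", "age": "<30", "condition": "flu"},
    {"zip": "130**", "age": "<30", "condition": "cancer"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
    {"zip": "148**", "age": ">=40", "condition": "flu"},
]

two_anonymous = is_k_anonymous(records, ["zip", "age"], 2)
three_anonymous = is_k_anonymous(records, ["zip", "age"], 3)
```

Each (zip, age) combination here is shared by exactly two rows, so every row has one indistinguishable partner: the table satisfies the check for k = 2 but not for k = 3.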

The role of deliberate practice in the acquisition of expert performance

by K. Anders Ericsson, Ralf Th. Krampe, Clemens Tesch-Römer - Psychological Review, 1993
"... The theoretical framework presented in this article explains expert performance as the end result of individuals' prolonged efforts to improve performance while negotiating motivational and external constraints. In most domains of expertise, individuals begin in their childhood a regimen of ef ..."
Cited by 690 (15 self)
to that of the rest of the population. Speculations on the causes of these individuals' extraordinary abilities and performance are as old as the first records of their achievements. Early accounts commonly attribute these individuals' outstanding performance to divine intervention, such as the

Security without identification: transaction systems to make Big Brother obsolete

by David Chaum
"... The large-scale automated transaction systems of the near future can be designed to protect the privacy and maintain the security of both individuals and organizations. Computerization is robbing individuals of the ability to monitor and control the ways information about them is used. A ..."
Cited by 505 (3 self)

An algorithm for finding best matches in logarithmic expected time

by Jerome H. Friedman, Jon Louis Bentley, Raphael Ari Finkel - ACM Transactions on Mathematical Software, 1977
"... An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to a given query record. The computation required to organize the file is proportional to kN log N. The expected number of recor ..."
Cited by 764 (2 self)

A semantics of multiple inheritance

by Luca Cardelli - Information and Computation, 1988
"... There are two major ways of structuring data in programming languages. The first and common one, used for example in Pascal, can be said to derive from standard branches of mathematics. Data is organized as Cartesian products (i.e. record types), disjoint sums (i.e. unions or variant types) and func ..."
Cited by 528 (9 self)

k-anonymity: a model for protecting privacy

by Latanya Sweeney - International Journal on Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
"... Consider a data holder, such as a hospital or a bank, that has a privately held collection of person-specific, field structured data. Suppose the data holder wants to share a version of the data with researchers. How can a data holder release a version of its private data with scientific guarantees ..."
Cited by 1313 (15 self)
guarantees that the individuals who are the subjects of the data cannot be re-identified while the data remain practically useful? The solution provided in this paper includes a formal protection model named k-anonymity and a set of accompanying policies for deployment. A release provides k

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University