• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

DMCA

ABSTRACT Detecting Privacy Leaks Using Corpus-based Association Rules

Cached

  • Download as a PDF

Download Links

  • [crypto.stanford.edu]
  • [www-cs-students.stanford.edu]
  • [xenon.stanford.edu]
  • [www.parc.com]
  • [crypto.stanford.edu]
  • [xenon.stanford.edu]

  • Save to List
  • Add to Collection
  • Correct Errors
  • Monitor Changes
by Richard Chow
Citations:13 - 1 self
  • Summary
  • Citations
  • Active Bibliography
  • Co-citation
  • Clustered Documents
  • Version History

BibTeX

@MISC{Chow_abstractdetecting,
    author = {Richard Chow},
    title = {ABSTRACT Detecting Privacy Leaks Using Corpus-based Association Rules},
    year = {}
}

Share

Facebook Twitter Reddit Bibsonomy

OpenURL

 

Abstract

Detecting inferences in documents is critical for ensuring privacy when sharing information. In this paper, we propose a refined and practical model of inference detection using a reference corpus. Our model is inspired by association rule mining: inferences are based on word co-occurrences. Using the model and taking the Web as the reference corpus, we can find inferences and measure their strength through web-mining algorithms that leverage search engines such as Google or Yahoo!. Our model also includes the important case of private corpora, to model inference detection in enterprise settings in which there is a large private document repository. We find inferences in private corpora by using analogues of our Web-mining algorithms, relying on an index for the corpus rather than a Web search engine. We present results from two experiments. The first experiment demonstrates the performance of our techniques in identifying all the keywords that allow for inference of a particular topic (e.g. “HIV") with confidence above a certain threshold. The second experiment uses the public Enron e-mail dataset. We postulate a sensitive topic and use the Enron corpus and the Web together to find inferences for the topic. These experiments demonstrate that our techniques are practical, and that our model of inference based on word co-occurrence is well-suited to efficient inference detection.

Keyphrases

inference detection    reference corpus    private corpus    word co-occurrence    web-mining algorithm    search engine    sensitive topic    second experiment    web search engine    public enron e-mail dataset    large private document repository    particular topic    enterprise setting    certain threshold    enron corpus    first experiment    association rule mining    important case    practical model    present result   

Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University