Results 1 - 10 of 2,463

Query by Committee

by H. S. Seung, M. Opper, H. Sompolinsky, 1992
"... We propose an algorithm called query by committee, in which a committee of students is trained on the same data set. The next query is chosen according to the principle of maximal disagreement. The algorithm is studied for two toy models: the high-low game and perceptron learning of another perceptr ..."
Cited by 432 (3 self)
perceptron. As the number of queries goes to infinity, the committee algorithm yields asymptotically finite information gain. This leads to generalization error that decreases exponentially with the number of examples. This is in marked contrast to learning from randomly chosen inputs, for which the information

Selective sampling using the Query by Committee algorithm

by Yoav Freund, H. Sebastian Seung, Eli Shamir, Naftali Tishby - Machine Learning, 1997
"... We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the numbe ..."
Cited by 433 (7 self)

Network Applications of Bloom Filters: A Survey

by Andrei Broder, Michael Mitzenmacher - Internet Mathematics, 2002
"... A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in ..."
Cited by 522 (17 self)

Neural network ensembles, cross validation, and active learning

by Anders Krogh, Jesper Vedelsby - Neural Information Processing Systems 7, 1995
"... Learning of continuous valued functions using neural network ensembles (committees) can give improved accuracy, reliable estimation of the generalization error, and active learning. The ambiguity is defined as the variation of the output of ensemble members averaged over unlabeled data, so it qua ..."
Cited by 479 (6 self)

Toward Optimal Active Learning through Sampling Estimation of Error Reduction

by Nicholas Roy, Andrew McCallum - In Proc. 18th International Conf. on Machine Learning, 2001
"... This paper presents an active learning method that directly optimizes expected future error. This is in contrast to many other popular techniques that instead aim to reduce version space size. These other methods are popular because for many learning models, closed form calculation of the expec ..."
Cited by 353 (2 self)
of the expected future error is intractable. Our approach is made feasible by taking a sampling approach to estimating the expected reduction in error due to the labeling of a query. In experimental results on two real-world data sets we reach high accuracy very quickly, sometimes with four times fewer

Eigentaste: A Constant Time Collaborative Filtering Algorithm

by Ken Goldberg, Theresa Roeder, Dhruv Gupta, Chris Perkins, 2000
"... Eigentaste is a collaborative filtering algorithm that uses universal queries to elicit real-valued user ratings on a common set of items and applies principal component analysis (PCA) to the resulting dense subset of the ratings matrix. PCA facilitates dimensionality reduction for offline clusterin ..."
Cited by 378 (6 self)

Open information extraction from the web

by Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead, Oren Etzioni - In IJCAI, 2007
"... Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to ma ..."
Cited by 373 (39 self)
of relational tuples without requiring any human input. The paper also introduces TEXTRUNNER, a fully implemented, highly scalable OIE system where the tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries. We report on experiments over a 9,000,000 Web

Identification of protein coding regions by database similarity search

by Warren Gish, David J. States - Nature Genetics, 1993
"... Sequence similarity between a translated nucleotide sequence and a known biological protein can provide strong evidence for the presence of a homologous coding region, and such similarities can often be identified even between distantly relat ..."
Cited by 262 (2 self)
characterized the sensitivity of BLASTX recognition to the presence of substitution, insertion and deletion errors in the query sequence and to sequence divergence. Reading frames were reliably identified in the presence of 1% query errors, a rate that is typical for primary nucleotide sequence data. BLASTX

Skip Graphs

by James Aspnes, Gauri Shah - Proc. of the 14th Annual ACM-SIAM Symp. on Discrete Algorithms, 2003
"... Skip graphs are a novel distributed data structure, based on skip lists, that provide the full functionality of a balanced tree in a distributed system where resources are stored in separate nodes that may fail at any time. They are designed for use in searching peer-to-peer systems, and by providin ..."
Cited by 306 (9 self)
... and by providing the ability to perform queries based on key ordering, they improve on existing search tools that provide only hash table functionality. Unlike skip lists or other tree data structures, skip graphs are highly resilient, tolerating a large fraction of failed nodes without losing connectivity

Finding Application Errors and Security Flaws Using PQL: a Program Query Language

by Michael Martin, Benjamin Livshits, Monica S. Lam, 2005
"... A number of effective error detection tools have been built in recent years to check if a program conforms to certain design rules. An important class of design rules deals with sequences of events associated with a set of related objects. This paper presents a language called PQL (Program Query Lan ..."
Cited by 188 (5 self)