Results 1 - 10 of 3,884
Scalable knowledge harvesting with high precision and high recall
In WSDM, 2011
Cited by 53 (6 self)
Harvesting relational facts from Web sources has received great attention for automatically constructing large knowledge bases. State-of-the-art approaches combine pattern-based gathering of fact candidates with constraint-based reasoning. However, they still face major challenges regarding the trade-offs between precision, recall, and scalability. Techniques that scale well are susceptible to noisy patterns that degrade precision, while techniques that employ deep reasoning for high precision cannot cope with Web-scale data. This paper presents a scalable system, called PROSPERA, for high ...
The Relationship Between Precision-Recall and ROC Curves
In ICML ’06: Proceedings of the 23rd International Conference on Machine Learning, 2006
Cited by 415 (4 self)
Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm’s performance. We show that a deep ...
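The contrast this abstract draws between ROC and PR views on skewed data can be illustrated with a small sketch. The confusion-matrix counts below are hypothetical, chosen only to show the effect; the function names are likewise illustrative and do not come from the paper.

```python
def roc_point(tp, fp, fn, tn):
    """Return (false positive rate, true positive rate) for one classifier."""
    return fp / (fp + tn), tp / (tp + fn)


def pr_point(tp, fp, fn, tn):
    """Return (recall, precision) for one classifier."""
    return tp / (tp + fn), tp / (tp + fp)


# Hypothetical skewed dataset: 100 positives, 10,000 negatives.
classifier_a = dict(tp=80, fp=100, fn=20, tn=9900)
classifier_b = dict(tp=80, fp=1000, fn=20, tn=9000)

for name, counts in (("A", classifier_a), ("B", classifier_b)):
    fpr, tpr = roc_point(**counts)
    recall, precision = pr_point(**counts)
    print(f"{name}: FPR={fpr:.2f} TPR={tpr:.2f} "
          f"recall={recall:.2f} precision={precision:.2f}")
```

Under these assumed counts the two classifiers share the same true positive rate and differ by only 0.09 in false positive rate, yet precision falls from roughly 0.44 to roughly 0.07, which is the kind of difference a PR curve makes visible.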
Active Bucket Categorization for High Recall Video Retrieval
Cited by 1 (1 self)
There are large amounts of digital video available. High recall retrieval of these requires going beyond the ranked results, the common target in high precision retrieval. To aid high recall retrieval, we propose Active Bucket Categorization, a multi-category interactive learning strategy ...
Text Chunking using Transformation-Based Learning
1995
Cited by 523 (0 self)
Eric Brill introduced transformation-based learning and showed that it can do part-of-speech tagging with fairly high accuracy. The same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive "baseNP" chunks. For ...
Cumulated Gain-based Evaluation of IR Techniques
ACM Transactions on Information Systems, 2002
Cited by 694 (3 self)
Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques in this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their ability to retrieve highly relevant documents. This can be done by extending traditional evaluation methods, i.e., recall and precision based on binary relevance assessments, to graded relevance assessments ...
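As a rough illustration of evaluation over graded relevance, the following is a minimal sketch of cumulated gain (CG) and discounted cumulated gain (DCG) as they are commonly defined: gains are summed down the ranked list, and from rank b onward each gain is divided by the base-b logarithm of its rank. The gain vector in the usage example is an illustrative assumption on a 0-3 relevance scale, and the function names are hypothetical.

```python
import math


def cumulated_gain(gains):
    """Running sum of graded relevance scores down a ranked list."""
    result, total = [], 0.0
    for gain in gains:
        total += gain
        result.append(total)
    return result


def discounted_cumulated_gain(gains, b=2):
    """Running sum where gains at rank >= b are divided by log_b(rank)."""
    result, total = [], 0.0
    for rank, gain in enumerate(gains, start=1):
        total += gain if rank < b else gain / math.log(rank, b)
        result.append(total)
    return result


# Illustrative graded judgments (0 = not relevant, 3 = highly relevant).
ranked_gains = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
print(cumulated_gain(ranked_gains))
print([round(x, 2) for x in discounted_cumulated_gain(ranked_gains)])
```

A larger base b applies a gentler discount, which models a more patient user who is willing to examine results further down the list.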
An extensive empirical study of feature selection metrics for text classification
J. of Machine Learning Research, 2003
Cited by 496 (15 self)
Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison of twelve feature selection methods (e.g. Information Gain) evaluated on a benchmark of 229 text classification problem instances that were gathered from Reuters, TREC, OHSUMED, etc. The results are analyzed from multiple goal perspectives (accuracy, F-measure, precision, and recall), since each is appropriate ...
The Challenge of High Recall in Biomedical Systematic Search
Cited by 4 (1 self)
Clinical systematic reviews are based on expert, laborious search of well-annotated literature. Boolean search on bibliographic databases, such as MEDLINE, continues to be the preferred discovery method, but the size of these databases, now approaching 20 million records, makes it impossible to fully trust these searching methods. We are investigating the trade-offs between Boolean and ranked retrieval. Our findings show that although Boolean search has limitations, it is not obvious that ranking is superior, and illustrate that a single query cannot be used to resolve an information need. Our experiments show that a combination of less complicated Boolean queries and ranked retrieval outperforms either of them individually, leading to possible time savings over the current process.
Antecedent Selection Techniques for High-Recall Coreference Resolution
We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among ...
Impact of Surrogate Assessments on High-Recall Retrieval
We are concerned with the effect of using a surrogate assessor to train a passive (i.e., batch) supervised-learning method to rank documents for subsequent review, where ...
IR evaluation methods for retrieving highly relevant documents
2000
Cited by 414 (5 self)
This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR ...