Results 1 - 10 of 62,189
Matching words and pictures
- Journal of Machine Learning Research, 2003
"... We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation ..."
Cited by 665 (40 self)
Query Expansion Using Local and Global Document Analysis
- In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1996
"... Automatic query expansion has long been suggested as a technique for dealing with the fundamental issue of word mismatch in information retrieval. A number of approaches to expansion have been studied and, more recently, attention has focused on techniques that analyze the corpus to discover word re ..."
Cited by 610 (24 self)
global analysis techniques, such as word context and phrase structure, on the local set of documents produces results that are both more effective and more predictable than simple local feedback.
1 Introduction
The problem of word mismatch is fundamental to information retrieval. Simply stated, it means
Semantic similarity based on corpus statistics and lexical taxonomy
- Proc. of the 10th International Conference on Research in Computational Linguistics, ROCLING’97, 1997
"... This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the semantic distance between nodes in the semantic space constructed by the taxonomy can be better quantifie ..."
Cited by 873 (0 self)
A Sequential Algorithm for Training Text Classifiers
- 1994
"... The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was ..."
Cited by 631 (10 self)
was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be manually classified to achieve a given level of effectiveness.
1 Introduction
Text classification is the automated
The Advantages of Evolutionary Computation
- 1997
"... Evolutionary computation is becoming common in the solution of difficult, real-world problems in industry, medicine, and defense. This paper reviews some of the practical advantages to using evolutionary algorithms as compared with classic methods of optimization or artificial intelligence. Specific ..."
Cited by 541 (6 self)
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
- In Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007
"... Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedi ..."
Cited by 562 (9 self)
Meta-analysis in clinical trials
- Controlled Clinical Trials, 1986
"... This paper examines eight published reviews each reporting results from several related trials. Each review pools the results from the relevant trials in order to evaluate the efficacy of a certain treatment for a specified medical condition. These reviews lack consistent assessment of hom ..."
Cited by 1303 (0 self)
relevant covariates which would reduce the heterogeneity and allow for more specific therapeutic recommendations. We suggest a simple noniterative procedure for characterizing the distribution of treatment effects in a series of studies.
KEY WORDS: random effects model, heterogeneity of treatment effects
Probabilistic Latent Semantic Indexing
- 1999
"... Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized ..."
Cited by 1225 (10 self)
model is able to deal with domain-specific synonymy as well as with polysemous words. In contrast to standard Latent Semantic Indexing (LSI) by Singular Value Decomposition, the probabilistic variant has a solid statistical foundation and defines a proper generative data model. Retrieval experiments
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
- Nucleic Acids Res., 1997
"... The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantia ..."
Cited by 8572 (88 self)
substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method