Results 1 - 10
of
1,502
Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language
, 1999
"... This article presents a measure of semantic similarityinanis-a taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The a ..."
Abstract
-
Cited by 320 (10 self)
- Add to MetaCart
This article presents a measure of semantic similarityinanis-a taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their e#ectiveness. 1. Introduction Evaluating semantic relatedness using network representations is a problem with a long history in arti#cial intelligence and psychology, dating back to the spreading activation approach of Quillian #1968# and Collins and Loftus #1975#. Semantic similarity represents a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar. Rada et al. #Rada, Mili, Bicknell, & Blett...
LabelMe: A Database and Web-Based Tool for Image Annotation
, 2008
"... We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sha ..."
Abstract
-
Cited by 232 (37 self)
- Add to MetaCart
We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sharing of such annotations. Using this annotation tool, we have collected a large dataset that spans many object categories, often containing multiple instances over a wide variety of images. We quantify the contents of the dataset and compare against existing state of the art datasets used for object recognition and detection. Also, we show how to extend the dataset to automatically enhance object labels with WordNet, discover object parts, recover a depth ordering of objects in a scene, and increase the number of labels using minimal user supervision and images from the web.
Faceted Metadata for Image Search and Browsing
, 2003
"... There are currently two dominant interface types for searching and browsing large image collections: keywordbased search, and searching by overall similarity to sample images. We present an alternative based on enabling users to navigate along conceptual dimensions that describe the images. The inte ..."
Abstract
-
Cited by 225 (2 self)
- Add to MetaCart
There are currently two dominant interface types for searching and browsing large image collections: keywordbased search, and searching by overall similarity to sample images. We present an alternative based on enabling users to navigate along conceptual dimensions that describe the images. The interface makes use of hierarchical faceted metadata and dynamically generated query previews. A usability study, in which 32 art history students explored a collection of 35,000 fine arts images, compares this approach to a standard image search interface. Despite the unfamiliarity and power of the interface (attributes that often lead to rejection of new search interfaces), the study results show that 90% of the participants preferred the metadata approach overall, 97% said that it helped them learn more about the collection, 75% found it more flexible, and 72% found it easier to use than a standard baseline system. These results indicate that a category-based approach is a successful way to provide access to image collections.
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
- In Proceedings of the 20th International Joint Conference on Artificial Intelligence
, 2007
"... Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedi ..."
Abstract
-
Cited by 172 (7 self)
- Add to MetaCart
Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia. We use machine learning techniques to explicitly represent the meaning of any text as a weighted vector of Wikipedia-based concepts. Assessing the relatedness of texts in this space amounts to comparing the corresponding vectors using conventional metrics (e.g., cosine). Compared with the previous state of the art, using ESA results in substantial improvements in correlation of computed relatedness scores with human judgments: from r =0.56 to 0.75 for individual words and from r =0.60 to 0.72 for texts. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users. 1
ConceptNet: A Practical Commonsense Reasoning Toolkit
- BT TECHNOLOGY JOURNAL
, 2004
"... ConceptNet is a freely available commonsense knowledgebase and natural-language-processing toolkit which supports many practical textual-reasoning tasks over real-world documents including topic-jisting (e.g. a news article containing the concepts, "gun," "convenience store," "demand money" and "mak ..."
Abstract
-
Cited by 167 (5 self)
- Add to MetaCart
ConceptNet is a freely available commonsense knowledgebase and natural-language-processing toolkit which supports many practical textual-reasoning tasks over real-world documents including topic-jisting (e.g. a news article containing the concepts, "gun," "convenience store," "demand money" and "make getaway" might suggest the topics "robbery" and "crime"), affect-sensing (e.g. this email is sad and angry), analogy-making (e.g. "scissors," "razor," "nail clipper," and "sword" are perhaps like a "knife" because they are all "sharp," and can be used to "cut something"), and other contextoriented inferences. The knowledgebase is a semantic network presently consisting of over 1.6 million assertions of commonsense knowledge encompassing the spatial, physical, social, temporal, and psychological aspects of everyday life. Whereas similar large-scale semantic knowledgebases like Cyc and WordNet are carefully handcrafted, ConceptNet is generated automatically from the 700,000 sentences of the Open Mind Common Sense Project -- a World Wide Web based collaboration with over 14,000 authors.
Summarizing Text Documents: Sentence Selection and Evaluation Metrics
- In Research and Development in Information Retrieval
, 1999
"... Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for incl ..."
Abstract
-
Cited by 156 (5 self)
- Add to MetaCart
Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for inclusion in a summary. This paper presents our analysis of news-article summaries generated by sentence selection. Sentences are ranked for potential inclusion in the summary using a weighted combination of statistical and linguistic features. The statistical features were adapted from standard IR methods. The potential linguistic ones were derived from an analysis of news-wire summaries. Toevaluate these features we use a normalized version of precision-recall curves, with a baseline of random sentence selection, as well as analyze the properties of such a baseline. We illustrate our discussions with empirical results showing the importance of corpus-dependent baseline summarization standards, compression ratios and carefully crafted long queries.
Content-Based Book Recommending Using Learning for Text Categorization
- IN PROCEEDINGS OF THE FIFTH ACM CONFERENCE ON DIGITAL LIBRARIES
, 1999
"... Recommender systems improve access to relevant products and information by making personalized suggestions based on previous examples of a user's likes and dislikes. Most existing recommender systems use collaborative filtering methods that base recommendations on other users' preferences. By contra ..."
Abstract
-
Cited by 141 (6 self)
- Add to MetaCart
Recommender systems improve access to relevant products and information by making personalized suggestions based on previous examples of a user's likes and dislikes. Most existing recommender systems use collaborative filtering methods that base recommendations on other users' preferences. By contrast, content-based methods use information about an item itself to make suggestions. This approach has the advantage of being able to recommend previously unrated items to users with unique interests and to provide explanations for its recommendations. We describe a content-based book recommending system that utilizes information extraction and a machine-learning algorithm for text categorization. Initial experimental results demonstrate that this approach can produce accurate recommendations.
80 million tiny images: a large dataset for non-parametric object and scene recognition
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
"... ..."
Dynamic Service Matchmaking Among Agents in Open Information Environments
- SIGMOD Record
, 1999
"... The amount of services and deployed software agents in the most famous offspring of the Internet, the World Wide Web, is exponentially increasing. In addition, the Internet is an open environment, where information sources, communication links and agents themselves may appear and disappear unpredict ..."
Abstract
-
Cited by 127 (17 self)
- Add to MetaCart
The amount of services and deployed software agents in the most famous offspring of the Internet, the World Wide Web, is exponentially increasing. In addition, the Internet is an open environment, where information sources, communication links and agents themselves may appear and disappear unpredictably.Thus, an effective, automated search and selection of relevant services or agents is essential for human users and agents as well. We distinguish three general agent categories in the Cyberspace, serviceproviders, servicerequester, and middle agents. Service providers provide some type of service, such as finding information, or performing some particular domain specific problem solving. Requester agents need provider agents to perform some service for them. Agents that help locate others are called middle agents [2]. Matchmaking is the process of finding an appropriate provider for a requester through a middl...

