Results 1 - 10 of 26
Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics
"... The ability to associate images with natural language sentences that describe what is depicted in them is a hallmark of image understanding, and a prerequisite for applications such as sentence-based image search. In analogy to image search, we propose to frame sentence-based image annotation as the ..."
Abstract
-
Cited by 44 (2 self)
- Add to MetaCart
(Show Context)
The ability to associate images with natural language sentences that describe what is depicted in them is a hallmark of image understanding, and a prerequisite for applications such as sentence-based image search. In analogy to image search, we propose to frame sentence-based image annotation as the task of ranking a given pool of captions. We introduce a new benchmark collection for sentence-based image description and search, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. We introduce a number of systems that perform quite well on this task, even though they are only based on features that can be obtained with minimal supervision. Our results clearly indicate the importance of training on multiple captions per image, and of capturing syntactic (word order-based) and semantic features of these captions. We also perform an in-depth comparison of human and automatic evaluation metrics for this task, and propose strategies for collecting human judgments cheaply and on a very large scale, allowing us to augment our collection with additional relevance judgments of which captions describe which image. Our analysis shows that metrics that consider the ranked list of results for each query image or sentence are significantly more robust than metrics that are based on a single response per query. Moreover, our study suggests that the evaluation of ranking-based image description systems may be fully automated.
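The abstract frames caption annotation as ranking a pool of captions and argues that rank-based metrics are more robust than single-response metrics. A minimal sketch of both steps, assuming cosine similarity over placeholder feature vectors (image_vec and caption_vecs are stand-ins for whatever representation a system produces, not the paper's features):

```python
import numpy as np

def rank_captions(image_vec, caption_vecs):
    """Rank a pool of caption vectors by cosine similarity to an image vector.

    caption_vecs: (n_captions, d) matrix; image_vec: (d,) vector.
    Returns caption indices, best match first.
    """
    sims = caption_vecs @ image_vec / (
        np.linalg.norm(caption_vecs, axis=1) * np.linalg.norm(image_vec) + 1e-12
    )
    return np.argsort(-sims)

def recall_at_k(ranking, gold_indices, k):
    """Rank-based metric: fraction of gold captions found in the top k."""
    return len(set(ranking[:k].tolist()) & set(gold_indices)) / len(gold_indices)

# Toy usage with random vectors standing in for real features.
rng = np.random.default_rng(0)
ranking = rank_captions(rng.normal(size=64), rng.normal(size=(100, 64)))
print(recall_at_k(ranking, gold_indices={3, 17}, k=10))
```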
Integrating Syntactic and Semantic Analysis into the Open Information Extraction Paradigm
"... In this paper we present an approach aimed at enriching the Open Information Extraction paradigm with semantic relation ontologization by integrating syntactic and semantic features into its workflow. To achieve this goal, we combine deep syntactic analysis and distributional semantics using a short ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
In this paper we present an approach aimed at enriching the Open Information Extraction paradigm with semantic relation ontologization by integrating syntactic and semantic features into its workflow. To achieve this goal, we combine deep syntactic analysis and distributional semantics using a shortest path kernel method and soft clustering. The output of our system is a set of automatically discovered and ontologized semantic relations.
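The approach combines deep syntactic analysis with a shortest path kernel. A common building block there is the shortest dependency path between two argument nodes; below is a minimal BFS sketch, assuming a dependency graph encoded as an adjacency dict of (neighbor, dependency_label) pairs — our hypothetical format, not the authors' data structure:

```python
from collections import deque

def shortest_dep_path(graph, src, dst):
    """BFS shortest path between two nodes of a dependency graph.

    graph: dict mapping a token to a list of (neighbor, dependency_label)
    pairs (treated as undirected); returns the node sequence or None.
    """
    queue = deque([(src, [src])])
    seen = {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            return path
        for nbr, _label in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, path + [nbr]))
    return None

# Toy graph: "cats" <-nsubj- "chase" -dobj-> "mice"
g = {"cats": [("chase", "nsubj")],
     "chase": [("cats", "nsubj"), ("mice", "dobj")],
     "mice": [("chase", "dobj")]}
print(shortest_dep_path(g, "cats", "mice"))  # ['cats', 'chase', 'mice']
```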
Semantic convolution kernels over dependency trees: smoothed partial tree kernel
In CIKM, 2011
"... In recent years, natural language processing techniques have been used more and more in IR. Among other syntactic and semantic parsing are effective methods for the design of complex applica-tions like for example question answering and sentiment analy-sis. Unfortunately, extracting feature represen ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
In recent years, natural language processing techniques have been used more and more in IR. Among others, syntactic and semantic parsing are effective methods for the design of complex applications such as question answering and sentiment analysis. Unfortunately, extracting feature representations suitable for machine learning algorithms from linguistic structures is typically difficult. In this paper, we describe one of the most advanced pieces of technology for automatic engineering of syntactic and semantic patterns. This method merges convolution dependency tree kernels with lexical similarities. It can efficiently and effectively measure the similarity between dependency structures whose lexical nodes are partly or completely different. Its use in powerful algorithms such as Support Vector Machines (SVMs) allows for the fast design of accurate automatic systems. We report experiments on question classification, which show an unprecedented result, i.e., a 41% error reduction over the former state of the art, along with an analysis of the desirable properties of the approach.
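The key idea is a convolution tree kernel whose lexical node matches are smoothed by a similarity function rather than exact equality. The snippet below is only a toy recursion illustrating that substitution; the actual Smoothed Partial Tree Kernel sums over all shared partial tree fragments, which this sketch does not do. word_sim stands in for any lexical similarity, e.g. a distributional cosine:

```python
def smoothed_tree_match(a, b, word_sim, mu=0.4):
    """Toy recursive match between two dependency nodes.

    Replaces the hard lexical match of a plain tree kernel with a graded
    word_sim score, decaying contributions from deeper (order-aligned)
    children by mu. Nodes are dicts: {"word": str, "children": [...]}.
    """
    score = word_sim(a["word"], b["word"])
    for ca, cb in zip(a.get("children", []), b.get("children", [])):
        score += mu * smoothed_tree_match(ca, cb, word_sim, mu)
    return score

# Toy similarity: 1.0 for identical words, 0.5 for a stipulated synonym pair.
sim = lambda u, v: 1.0 if u == v else (0.5 if {u, v} == {"buy", "purchase"} else 0.0)
t1 = {"word": "buy", "children": [{"word": "book"}]}
t2 = {"word": "purchase", "children": [{"word": "book"}]}
print(smoothed_tree_match(t1, t2, sim))  # 0.5 + 0.4 * 1.0 = 0.9
```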
Grammatical Feature Engineering for fine-grained IR tasks
"... Abstract. Information Retrieval tasks include nowadays more and more complex information in order to face contemporary challenges such as Opinion Mining (OM) or Question Answering (QA). These are examples of tasks where complex linguistic information is required for reasonable performances on realis ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
(Show Context)
Information Retrieval tasks nowadays incorporate increasingly complex information in order to face contemporary challenges such as Opinion Mining (OM) or Question Answering (QA). These are examples of tasks where complex linguistic information is required for reasonable performance on realistic data sets. As natural language learning is usually applied to these tasks, rich structures such as parse trees are a critical requirement, as they demand complex resources and accurate pre-processing. In this paper, we show how good-quality language learning methods can be applied to the above tasks by using grammatical representations simpler than parse trees. These features are shown to achieve state-of-the-art accuracy in different IR tasks, such as OM and QA.
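The claim is that shallow grammatical representations can replace full parse trees for tasks like OM and QA. One standard instance of such a representation (our illustration; the paper's exact feature set may differ) is a bag of POS-tag n-grams:

```python
from collections import Counter

def pos_ngram_features(pos_tags, n=2):
    """Bag of POS-tag n-grams: a shallow grammatical representation that
    captures local word-order patterns without a full parse tree."""
    return Counter(tuple(pos_tags[i:i + n]) for i in range(len(pos_tags) - n + 1))

# "What is the capital of Italy?" tagged with Penn Treebank POS tags.
print(pos_ngram_features(["WP", "VBZ", "DT", "NN", "IN", "NNP"]))
```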
Talking Robots
"... Abstract. In the last years robotic platforms have appeared in many research and everyday life contexts. An easy way of interacting with them has then become a necessity. Human Robot Interaction is the research field that aims at studying how robots can interact with humans in the most natural way. ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
(Show Context)
In recent years robotic platforms have appeared in many research and everyday-life contexts, and an easy way of interacting with them has become a necessity. Human Robot Interaction is the research field that studies how robots can interact with humans in the most natural way. In this work we present preliminary studies that we have carried out in this direction, focusing on Natural Language based interaction, with particular attention to the grounding problem. In particular, we study how Statistical Machine Learning techniques can be applied to Natural Language as it is used to interact with robots. Moreover, we also investigate how this approach can be integrated into such complex systems.
Semantic Kernels for Semantic Parsing
"... We present an empirical study on the use of semantic information for Concept Seg-mentation and Labeling (CSL), which is an important step for semantic parsing. We represent the alternative analyses out-put by a state-of-the-art CSL parser with tree structures, which we rerank with a classifier train ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
We present an empirical study on the use of semantic information for Concept Segmentation and Labeling (CSL), an important step in semantic parsing. We represent the alternative analyses output by a state-of-the-art CSL parser as tree structures, which we rerank with a classifier trained on two types of semantic tree kernels: one processing structures built from words, concepts and Brown clusters, and another using semantic similarity among the words composing the structure. The results on a corpus from the restaurant domain show that our semantic kernels exploiting similarity measures outperform state-of-the-art rerankers.
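The reranker here is a kernel-based classifier over candidate analyses. A minimal way to wire this up, assuming the tree-kernel values between candidates are precomputed (a simplification of the preference-pair reranking typically used in this line of work), is scikit-learn's precomputed-kernel SVM:

```python
import numpy as np
from sklearn.svm import SVC

def train_reranker(train_kernel, labels):
    """Fit an SVM on a precomputed (n_train x n_train) kernel matrix whose
    entries are tree-kernel values between training candidate analyses."""
    clf = SVC(kernel="precomputed")
    clf.fit(train_kernel, labels)
    return clf

def pick_best(clf, candidate_kernel):
    """candidate_kernel: (n_candidates x n_train) kernel values of the new
    candidates against the training set; returns the best candidate index."""
    return int(np.argmax(clf.decision_function(candidate_kernel)))
```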
Structured Kernel-Based Learning for the Frame Labeling over Italian Texts
"... Abstract. In this paper two systems participating to the Evalita Frame Labeling over Italian Texts challenge are presented. The first one, i.e. the SVM-SPTK system, implements the Smoothed Partial Tree Kernel that models semantic roles by implicitly combining syntactic and lexical information of an ..."
Abstract
- Add to MetaCart
(Show Context)
In this paper two systems participating in the Evalita Frame Labeling over Italian Texts challenge are presented. The first, the SVM-SPTK system, implements the Smoothed Partial Tree Kernel, which models semantic roles by implicitly combining syntactic and lexical information from annotated examples. The second, the SVM-HMM system, implements a flexible approach based on a Markovian formulation of the SVM learning algorithm. In the challenge, the SVM-SPTK system obtains state-of-the-art results in almost all tasks. The SVM-HMM system also performs well, achieving the second-best scores in the Frame Prediction and Argument Classification tasks, which is notable given that it does not rely on full syntactic parsing.
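SVM-HMM style labelers couple SVM-trained per-token scores with Markovian decoding over label transitions. The decoding step looks roughly like the generic Viterbi sketch below; the scores are assumed to come from the trained model, and this is not the authors' implementation:

```python
import numpy as np

def viterbi(emission, transition):
    """Best label sequence under per-token scores plus label-transition
    scores, the Markovian part of an SVM-HMM style labeler.

    emission: (T, L) per-token label scores; transition: (L, L) scores
    for moving from label i to label j.
    """
    T, L = emission.shape
    dp = np.empty((T, L))
    back = np.zeros((T, L), dtype=int)
    dp[0] = emission[0]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + transition + emission[t][None, :]
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0)
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two tokens, three labels: decode the highest-scoring label path.
print(viterbi(np.array([[2., 0., 0.], [0., 1., 0.]]), np.zeros((3, 3))))  # [0, 1]
```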
Encoding syntactic dependencies using Random Indexing and Wikipedia as a corpus
"... Abstract. Distributional approaches are based on a simple hypothesis: the meaning of a word can be inferred from its usage. The application of that idea to the vector space model makes possible the construction of a WordSpace in which words are represented by mathematical points in a geometric space ..."
Abstract
- Add to MetaCart
(Show Context)
Distributional approaches are based on a simple hypothesis: the meaning of a word can be inferred from its usage. Applying that idea to the vector space model makes it possible to construct a WordSpace in which words are represented by points in a geometric space, with similar words lying close together. The definition of "word usage" depends on the definition of the context used to build the space, which can be the whole document, the sentence in which the word occurs, a fixed window of words, or a specific syntactic context. However, in its original formulation WordSpace can take into account only one definition of context at a time. We propose an approach based on vector permutation and Random Indexing to encode several syntactic contexts in a single WordSpace. We build our WordSpace from the WaCkypedia EN corpus, a 2009 dump of the English Wikipedia (about 800 million tokens) annotated with syntactic information produced by a full dependency parser.
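Random Indexing assigns each item a sparse ternary index vector and accumulates copies of it into the target word's vector; permuting the copy before accumulation marks which syntactic context it came from, so several contexts share one space. A minimal sketch, where np.roll plays the role of the permutation and the mapping from dependency relations to shift offsets is our hypothetical choice:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, NNZ = 1000, 10  # space dimensionality; nonzeros per index vector

def index_vector():
    """Sparse ternary random index vector, as used in Random Indexing."""
    v = np.zeros(DIM)
    pos = rng.choice(DIM, size=NNZ, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], size=NNZ)
    return v

def add_context(word_space, word, context_vec, relation_shift):
    """Accumulate a context's index vector into the word's vector after
    permuting it (here: a circular shift) to mark the syntactic relation."""
    vec = word_space.setdefault(word, np.zeros(DIM))
    vec += np.roll(context_vec, relation_shift)

# Hypothetical relation -> shift encoding; "dog" seen as subject of "barks".
SHIFT = {"nsubj": 1, "dobj": 2}
space, barks = {}, index_vector()
add_context(space, "dog", barks, SHIFT["nsubj"])
```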
Walk-based Computation of Contextual Word Similarity
"... We propose a new measure of semantic similarity between words in context, which exploits the syntactic/semantic structure of the context surrounding each target word. For a given pair of target words and their sentential contexts, labeled directed graphs are made from the output of a semantic parser ..."
Abstract
- Add to MetaCart
(Show Context)
We propose a new measure of semantic similarity between words in context, which exploits the syntactic/semantic structure of the context surrounding each target word. For a given pair of target words and their sentential contexts, labeled directed graphs are built from the output of a semantic parser on these sentences. Nodes in these graphs represent words in the sentences, and labeled edges represent syntactic/semantic relations between them. The similarity between the target words is then computed as the sum of the similarities of walks starting from the target words (nodes) in the two graphs. The proposed measure is tested on word sense disambiguation and paraphrase ranking tasks, and the results are promising: the proposed measure outperforms existing methods that completely ignore or do not fully exploit syntactic/semantic structural co-occurrences between a target word and its neighbors.
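The measure aggregates over walks leaving the two target nodes in their respective graphs. The sketch below counts exactly matching walk sequences up to a fixed length, assuming graphs as adjacency dicts of (neighbor, edge_label) pairs; the paper instead uses a graded similarity between walks, so treat this as the skeleton only:

```python
def walks_from(graph, node, length, prefix=()):
    """Yield label/node sequences for all walks of a given length from node.
    graph: dict mapping node -> list of (neighbor, edge_label) pairs."""
    if length == 0:
        yield prefix
        return
    for nbr, label in graph.get(node, []):
        yield from walks_from(graph, nbr, length - 1, prefix + (label, nbr))

def walk_overlap(g1, n1, g2, n2, max_len=2):
    """Count walk sequences shared by the two target nodes' neighborhoods."""
    return sum(
        len(set(walks_from(g1, n1, k)) & set(walks_from(g2, n2, k)))
        for k in range(1, max_len + 1)
    )
```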