Discriminative Reranking for Spoken Language Understanding
- IEEE Transactions on Audio, Speech & Language Processing, 2012
Cited by 8 (0 self)
Spoken Language Understanding (SLU) is concerned with the extraction of meaning structures from spoken utterances. Recent computational approaches to SLU, e.g. Conditional Random Fields (CRF), optimize local models by encoding several features, mainly based on simple n-grams. In contrast, recent works have shown that the accuracy of CRF can be significantly improved by modeling long-distance dependency features. In this paper, we propose novel approaches to encode all possible dependencies between features and, most importantly, among parts of the meaning structure, e.g. concepts and their combination. We rerank hypotheses generated by local models, e.g. Stochastic Finite State Transducers (SFSTs) or Conditional Random Fields (CRF), with a global model. The latter encodes a very large number of dependencies (in the form of trees or sequences) by applying kernel methods to the space of all meaning (sub)structures. We performed comparative experiments between SFST, CRF, Support Vector Machines (SVMs) and our proposed discriminative reranking models (DRMs) on representative conversational speech corpora in three different languages: the ATIS (English), the MEDIA (French) and the LUNA (Italian) corpora. These corpora have been collected within three different domain applications of increasing complexity: informational, transactional and problem-solving tasks, respectively. The results show that our DRMs consistently outperform the state-of-the-art models based on CRF.
A.: On reverse feature engineering of syntactic tree kernels
- In: Proceedings of the Fourteenth Conference on Computational Natural Language Learning, 2010
Cited by 6 (0 self)
In this paper, we provide a theoretical framework for feature selection in tree kernel spaces based on gradient-vector components of kernel-based machines. We show that a huge number of features can be discarded without a significant decrease in accuracy. Our selection algorithm is as accurate as and much more efficient than those proposed in previous work. Comparative experiments on three interesting and very diverse classification tasks, i.e. Question Classification, Relation Extraction and Semantic Role Labeling, support our theoretical findings and demonstrate the algorithm performance.
Fast Random Walk Graph Kernel
Cited by 6 (2 self)
The random walk graph kernel has been used as an important tool for various data mining tasks, including classification and similarity computation. Despite its usefulness, however, it suffers from an expensive computational cost, which is at least O(n^3) or O(m^2) for graphs with n nodes and m edges. In this paper, we propose Ark, a set of fast algorithms for random walk graph kernel computation. Ark is based on the observation that real graphs have much lower intrinsic ranks compared with the orders of the graphs. Ark exploits the low-rank structure to quickly compute random walk graph kernels in O(n^2) or O(m) time. Experimental results show that our method is up to 97,865× faster than the existing algorithms, while providing more than 91.3% of the accuracies.
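The low-rank idea in this abstract can be illustrated with a small sketch. It is not the Ark algorithm itself: it assumes undirected graphs (symmetric adjacency matrices), uniform starting and stopping distributions, and a decay factor `c` small enough for the walk series to converge. The function names and the rank parameter `r` are hypothetical. The exact version solves a linear system on the Kronecker product graph; the approximate version keeps only the `r` largest-magnitude eigenpairs of each graph and never builds the product matrix.

```python
import numpy as np

def rw_kernel_exact(A1, A2, c=0.1):
    """Random walk kernel q^T (I - c W)^{-1} p on the product graph W = A1 (x) A2."""
    n = A1.shape[0] * A2.shape[0]
    W = np.kron(A1, A2)                     # adjacency of the direct product graph
    p = np.full(n, 1.0 / n)                 # uniform starting distribution (assumption)
    q = np.full(n, 1.0 / n)                 # uniform stopping distribution (assumption)
    x = np.linalg.solve(np.eye(n) - c * W, p)
    return float(q @ x)

def top_eigenpairs(A, r):
    """Return the r eigenpairs of symmetric A with the largest absolute eigenvalues."""
    vals, vecs = np.linalg.eigh(A)
    keep = np.argsort(-np.abs(vals))[:r]
    return vals[keep], vecs[:, keep]

def rw_kernel_lowrank(A1, A2, c=0.1, r=2):
    """Approximate the same kernel from rank-r eigendecompositions of A1 and A2."""
    n1, n2 = A1.shape[0], A2.shape[0]
    l1, U1 = top_eigenpairs(A1, r)
    l2, U2 = top_eigenpairs(A2, r)
    p1, p2 = np.full(n1, 1.0 / n1), np.full(n2, 1.0 / n2)
    q1, q2 = p1, p2
    lam = np.outer(l1, l2)                  # eigenvalues of the approximated product graph
    mid = c * lam / (1.0 - c * lam)         # correction term from the Woodbury identity
    left = np.outer(q1 @ U1, q2 @ U2)       # q projected onto the kept eigenvectors
    right = np.outer(U1.T @ p1, U2.T @ p2)  # p projected onto the kept eigenvectors
    return float((q1 @ p1) * (q2 @ p2) + np.sum(left * mid * right))

# Toy graphs: a triangle and a 3-node path.
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
exact = rw_kernel_exact(A1, A2)
full_rank = rw_kernel_lowrank(A1, A2, r=3)
```

With `r` equal to the number of nodes, the decomposition is exact, so the two functions agree up to floating-point error; smaller `r` trades accuracy for less work.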
Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
- In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2012
Cited by 3 (1 self)
In this paper, we propose innovative representations for automatic classification of verbs according to mainstream linguistic theories, namely VerbNet and FrameNet. First, syntactic and semantic structures capturing essential lexical and syntactic properties of verbs are defined. Then, we design advanced similarity functions between such structures, i.e., semantic tree kernel functions, for exploiting distributional and grammatical information in Support Vector Machines. The extensive empirical analysis on VerbNet class and frame detection shows that our models capture meaningful syntactic/semantic structures, which allows for improving the state-of-the-art.
Explicit and Implicit Syntactic Features for Text Classification
Cited by 2 (1 self)
Syntactic features are useful for many text classification tasks. Among these, tree kernels (Collins and Duffy, 2001) have been perhaps the most robust and effective syntactic tool, appealing for their empirical success, but also because they do not require an answer to the difficult question of which tree features to use for a given task. We compare tree kernels to different explicit sets of tree features on five diverse tasks, and find that explicit features often perform as well as tree kernels on accuracy and always in orders of magnitude less time, and with smaller models. Since explicit features are easy to generate and use (with publicly available tools), we suggest they should always be included as baseline comparisons in tree kernel method evaluations.
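For readers unfamiliar with the implicit side of this comparison, the following is a minimal sketch of the subset tree kernel recursion usually attributed to Collins and Duffy (2001). It counts the tree fragments two parse trees share without enumerating them. The nested-tuple tree encoding, the function names, and the `decay` parameter are assumptions made for illustration and are not taken from the paper or its tools.

```python
def production(node):
    """Grammar production at an internal node: (label, child symbols).
    Leaf children stay strings; internal children become 1-tuples so the two never collide."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else (c[0],) for c in children))

def internal_nodes(tree):
    """List every internal (non-leaf) node of a tree given as nested tuples."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes

def common_fragments(n1, n2, decay=1.0):
    """Weighted count of fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # equal productions guarantee c2 is internal too
            score *= 1.0 + common_fragments(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(common_fragments(a, b, decay)
               for a in internal_nodes(t1)
               for b in internal_nodes(t2))

dog = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
cat = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))
same = tree_kernel(dog, dog)   # 24.0: every fragment of the tree matches itself
cross = tree_kernel(dog, cat)  # 15.0: fragments containing "dog" or "cat" drop out
```

An explicit-feature baseline would instead enumerate a chosen set of fragments as sparse features; the kernel avoids that choice at the cost of a pairwise computation over nodes.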
Discriminative Reranking of Discourse Parses Using Tree Kernels
Cited by 2 (0 self)
In this paper, we present a discriminative approach for reranking discourse trees generated by an existing probabilistic discourse parser. The reranker relies on tree kernels (TKs) to capture the global dependencies between discourse units in a tree. In particular, we design new computational structures of discourse trees, which, combined with standard TKs, originate novel discourse TKs. The empirical evaluation shows that our reranker can improve the state-of-the-art sentence-level parsing accuracy from 79.77% to 82.15%, a relative error reduction of 11.8%, which in turn pushes the state-of-the-art document-level accuracy from 55.8% to 57.3%.
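The sentence-level figures above can be checked by hand. Assuming the usual definition of relative error reduction (the drop in error divided by the baseline error, which the abstract does not spell out), the arithmetic works out as follows:

```latex
\[
\frac{(100 - 79.77) - (100 - 82.15)}{100 - 79.77}
  = \frac{20.23 - 17.85}{20.23}
  = \frac{2.38}{20.23}
  \approx 0.118 = 11.8\%
\]
```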
Kernel-Based Machines for Abstract and Easy Modeling of Automatic Learning
Cited by 1 (0 self)
The modeling of system semantics (in several ICT domains) by means of pattern analysis or relational learning is a product of the latest results in statistical learning theory. For example, the modeling of natural language semantics expressed by text, images, and speech in information search (e.g. Google, Yahoo) or DNA sequence labeling in bioinformatics represent distinguished cases of successful use of statistical machine learning. This success is due to the ability to overcome the concrete limitations of logic/rule-based approaches to semantic modeling: although, from a knowledge engineer's perspective, rules are natural methods to encode system semantics, the noise, ambiguity and errors affecting dynamic systems prevent such approaches from being effective, e.g. they are not flexible enough. In contrast, statistical relational learning, applied to representations of system states, i.e. training examples, can produce semantic models of system behavior based on a large number of attributes. As the values of the latter are automatically learned, they reflect the flexibility of statistical settings and the overall model
Towards Using Reranking in Hierarchical Classification
We consider the use of reranking as a way to relax typical independence assumptions often made in hierarchical multilabel classification. Our reranker is based on (i) an algorithm that generates promising k-best classification hypotheses from the output of local binary classifiers that classify nodes of a target tree-shaped hierarchy; and (ii) a tree kernel-based reranker applied to the classification tree associated with the hypotheses above. We carried out a number of experiments with this model on the Reuters corpus: we first show the potential of our algorithm by computing the oracle classification accuracy. This demonstrates that there is significant room for improvement of the hierarchical classifier. Then, we measured the accuracy achieved by the reranker, which shows a significant performance improvement over the baseline.
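The oracle accuracy mentioned above is the score a perfect reranker would reach: an example counts as correct if the gold answer appears anywhere in its k-best list. The sketch below assumes exact-match accuracy over label sets, which may differ from the measure used in the paper; the function names and toy data are hypothetical.

```python
def top1_accuracy(kbest_lists, gold):
    """Fraction of examples whose first-ranked hypothesis equals the gold label set."""
    hits = sum(1 for hyps, g in zip(kbest_lists, gold) if hyps and hyps[0] == g)
    return hits / len(gold)

def oracle_accuracy(kbest_lists, gold):
    """Fraction of examples whose gold label set appears anywhere in the k-best list."""
    hits = sum(1 for hyps, g in zip(kbest_lists, gold) if g in hyps)
    return hits / len(gold)

# Each hypothesis is a set of category labels; each example has a 2-best list.
kbest_lists = [
    [frozenset({"A"}), frozenset({"A", "B"})],  # gold is ranked second
    [frozenset({"C"}), frozenset({"D"})],       # gold is ranked first
    [frozenset({"B"}), frozenset({"A"})],       # gold is missing from the list
]
gold = [frozenset({"A", "B"}), frozenset({"C"}), frozenset({"E"})]

baseline = top1_accuracy(kbest_lists, gold)
oracle = oracle_accuracy(kbest_lists, gold)
```

The gap between `baseline` and `oracle` is the headroom a reranker can exploit; no reranker over the same lists can exceed the oracle figure.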
Prototypical Opinion Holders: What We can Learn from Experts and Analysts
In order to automatically extract opinion holders, we propose to harness the contexts of prototypical opinion holders, i.e. common nouns, such as experts or analysts, that describe particular groups of people whose profession or occupation is to form and express opinions towards specific items. We assess their effectiveness in supervised learning, where these contexts are regarded as labeled training data, and in rule-based classification, which uses predicates that frequently co-occur with mentions of the prototypical opinion holders. Finally, we also examine how far knowledge gained from these contexts can compensate for the lack of large amounts of labeled training data in supervised learning by considering various amounts of actually labeled training sets.
Social Network Extraction from Texts: A Thesis Proposal
In my thesis, I propose to build a system that would enable extraction of social interactions from texts. To date I have defined a comprehensive set of social events and built a preliminary system that extracts social events from news articles. I plan to improve the performance of my current system by incorporating semantic information. Using domain adaptation techniques, I propose to apply my system to a wide range of genres. By extracting linguistic constructs relevant to social interactions, I will be able to empirically analyze different kinds of linguistic constructs that people use to express social interactions. Lastly, I will attempt to make convolution kernels more scalable and interpretable.