
Kernel methods, syntax and semantics for relational text categorization. (2008)

by Alessandro Moschitti
Venue: In CIKM.
Results 1 - 10 of 42

Cross-lingual annotation projection for semantic roles

by Sebastian Padó, Mirella Lapata - Journal of Artificial Intelligence Research, 2009
Cited by 38 (3 self)
This article considers the task of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages. We propose a general framework that is based on annotation projection, phrased as a graph optimization problem. It is relatively inexpensive and has the potential to reduce the human effort involved in creating role-semantic resources. Within this framework, we present projection models that exploit lexical and syntactic information. We provide an experimental evaluation on an English-German parallel corpus which demonstrates the feasibility of inducing high-precision German semantic role annotation both for manually and automatically annotated English data.

Citation Context

...ing from information extraction (Surdeanu, Harabagiu, Williams, & Aarseth, 2003) to the modeling of textual entailment relations (Tatu & Moldovan, 2005; Burchardt & Frank, 2006), text categorization (Moschitti, 2008), question answering (Narayanan & Harabagiu, 2004; Frank, Krieger, Xu, Uszkoreit, Crysmann, Jörg, & Schäfer, 2007; Moschitti, Quarteroni, Basili, & Manandhar, 2007; Shen & Lapata, 2007), machine tran...

End-to-End Relation Extraction Using Distant Supervision from External Semantic Repositories

by Truc-vien T. Nguyen, Alessandro Moschitti
Cited by 27 (2 self)
In this paper, we extend distant supervision (DS) based on Wikipedia for Relation Extraction (RE) by considering (i) relations defined in external repositories, e.g. YAGO, and (ii) any subset of Wikipedia documents. We show that training data constituted by sentences containing pairs of named entities in target relations is enough to produce reliable supervision. Our experiments with state-of-the-art relation extraction models, trained on the above data, show a meaningful F1 of 74.29% on a manually annotated test set: this highly improves the state-of-art in RE using DS. Additionally, our end-to-end experiments demonstrated that our extractors can be applied to any general text document.

Citation Context

...oblem. The One vs. Rest strategy is employed by selecting the instance with largest margin as the final answer. We carried out 5-fold cross-validation with the tree kernel toolkit 4 (Moschitti, 2004; Moschitti, 2008). 4.2 Results on Wikipedia RE We created a test set by sampling 200 articles from Freebase (these articles are not used for training). An expert annotator, for each sentence, labeled all possible pai...
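The one-vs-rest decision and the 5-fold split mentioned in this context can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function names, the relation labels, and the margin values are all hypothetical, and the contiguous fold split is an assumption (the excerpt does not say how folds were formed).

```python
from typing import Dict, List, Tuple


def one_vs_rest_label(margins: Dict[str, float]) -> str:
    """Return the label whose binary classifier reports the largest margin."""
    return max(margins, key=margins.get)


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train, test) index lists."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    splits = []
    for i in range(k):
        # Training indices are every fold except the i-th, which is held out.
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, folds[i]))
    return splits


# Hypothetical margins from three binary (one-vs-rest) classifiers.
margins = {"born_in": 0.3, "works_for": 1.2, "no_relation": -0.5}
predicted = one_vs_rest_label(margins)

# Five (train, test) splits over ten hypothetical instances.
splits = k_fold_indices(10, k=5)
```

Each instance receives exactly one label, the one with the highest score, and every instance appears in exactly one test fold.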

Syntactic and semantic structure for opinion expression detection

by Richard Johansson, Alessandro Moschitti - In Proceedings of the 14th Conference on Computational Natural Language Learning, 2010
Cited by 21 (4 self)
We demonstrate that relational features derived from dependency-syntactic and semantic role structures are useful for the task of detecting opinionated expressions in natural-language text, significantly improving over conventional models based on sequence labeling with local features. These features allow us to model the way opinionated expressions interact in a sentence over arbitrary distances. While the relational features make the prediction task more computationally expensive, we show that it can be tackled effectively by using a reranker. We evaluate a number of machine learning approaches for the reranker, and the best model results in a 10-point absolute improvement in soft recall on the MPQA corpus, while decreasing precision only slightly.

Citation Context

...oehdorn and Moschitti, 2007b), to syntactically contextualize word similarities may improve the reranker accuracy. (ii) The latter can be further boosted by studying complex structural kernels, e.g. (Moschitti, 2008; Nguyen et al., 2009; Dinarelli et al., 2009). (iii) More specific predicate argument structures such those proposed in FrameNet, e.g. (Baker et al., 1998; Giuglea and Moschitti, 2004; Giuglea and Mo...

Structural relationships for large-scale learning of answer reranking

by Aliaksei Severyn, Alessandro Moschitti - In SIGIR, 2012
Cited by 18 (10 self)
Supervised learning applied to answer re-ranking can highly improve on the overall accuracy of question answering (QA) systems. The key aspect is that the relationships and properties of the question/answer pair composed of a question and the supporting passage of an answer candidate, can be efficiently compared with those captured by the learnt model. In this paper, we define novel supervised approaches that exploit structural relationships between a question and their candidate answer passages to learn a re-ranking model. We model structural representations of both questions and answers and their mutual relationships by just using an off-the-shelf shallow syntactic parser. We encode structures in Support Vector Machines (SVMs) by means of sequence and tree kernels, which can implicitly represent question and answer pairs in huge feature spaces. Such models together with the latest approach to fast kernel-based learning enabled the training of our rerankers on hundreds of thousands of instances, which previously rendered intractable for kernelized SVMs. The results on two different QA datasets, e.g., Answerbag and Jeopardy! data, show that our models deliver large improvement on passage re-ranking tasks, reducing the error in Recall of BM25 baseline by about 18%. One of the key findings of this work is that, despite its simplicity, shallow syntactic trees allow for learning complex relational structures, which exhibits a steep learning curve with the increase in the training size.

Citation Context

...ations. Additionally, [29, 1, 35] report significant improvement by exploiting expensive linguistic approaches, e.g., predicate argument structures, for re-ranking candidate answer lists. Our work in [23, 21] was the first to exploit kernel methods for modeling answer re-ranking using syntactic and shallow semantic tree kernels based on predicate argument structures. However, our method lacked the use of ...

Embedding semantic similarity in tree kernels for domain adaptation of relation extraction

by Barbara Plank, Alessandro Moschitti - In ACL, 2013
Cited by 13 (4 self)
Relation Extraction (RE) is the task of extracting semantic relationships between entities in text. Recent studies on relation extraction are mostly supervised. The clear drawback of supervised methods is the need of training data: labeled data is expensive to obtain, and there is often a mismatch between the training data and the data the system will be applied to. This is the problem of domain adaptation. In this paper, we propose to combine (i) term generalization approaches such as word clustering and latent semantic analysis (LSA) and (ii) structured kernels to improve the adaptability of relation extractors to new text genres/domains. The empirical evaluation on ACE 2005 domains shows that a suitable combination of syntax and lexical generalization is very promising for domain adaptation.

Citation Context

...case of structural kernels, K determines the shape of the substructures describing the objects. Commonly used kernels in NLP are string kernels (Lodhi et al., 2002) and tree kernels (Moschitti, 2006; Moschitti, 2008). [Figure 1: Syntactic tree kernel (STK).] Syntactic tree kernels (Collins ...
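As a rough illustration of what a syntactic tree kernel computes, the sketch below counts the tree fragments shared by two parse trees, following the standard Collins and Duffy style recursion. It is not the toolkit referenced in the excerpt: the tuple-based tree encoding, the function names, the decay parameter `lam`, and the example trees are all assumptions made for this example.

```python
from typing import List, Tuple, Union

# A tree is either a word (str) or a tuple (label, child, child, ...).
Tree = Union[str, tuple]


def nodes(t: Tree) -> List[tuple]:
    """Collect every internal (non-word) node of a tree."""
    if isinstance(t, str):
        return []
    out = [t]
    for child in t[1:]:
        out.extend(nodes(child))
    return out


def production(n: tuple) -> Tuple:
    """Grammar production at a node; words are tagged so they never match labels."""
    return (n[0],) + tuple(("w", c) if isinstance(c, str) else c[0] for c in n[1:])


def delta(n1: tuple, n2: tuple, lam: float) -> float:
    """Weighted count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # equal productions imply c2 is a node too
            score *= 1.0 + delta(c1, c2, lam)
    return score


def stk(t1: Tree, t2: Tree, lam: float = 1.0) -> float:
    """Syntactic tree kernel: sum delta over all node pairs (no memoization)."""
    return sum(delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))


# Hypothetical toy parse trees.
t1 = ("NP", ("D", "the"), ("N", "dog"))
t2 = ("NP", ("D", "a"), ("N", "dog"))
print(stk(t1, t1))  # 6.0: t1 contains six fragments
print(stk(t1, t2))  # 3.0: the trees share three fragments
```

With `lam` set to 1 the kernel value is a plain count of shared fragments; values below 1 down-weight larger fragments. A practical implementation would cache `delta` over node pairs instead of recomputing it recursively.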

Large-scale support vector learning with structural kernels

by Aliaksei Severyn, Alessandro Moschitti - In ECML/PKDD, 2010
Cited by 11 (4 self)
In this paper, we present an extensive study of the cutting-plane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In particular, we carry out a comprehensive experimentation on two interesting natural language tasks, e.g. predicate argument extraction and question answering. Our results show that (i) CPA applied to train a non-linear model with different tree kernels fully matches the accuracy of the conventional SVM algorithm while being ten times faster; (ii) by using smaller sampling sizes to approximate subgradients in CPA we can trade off accuracy for speed, yet the optimal parameters and kernels found remain optimal for the exact SVM. These results open numerous research perspectives, e.g. in natural language processing, as they show that complex structural kernels can be efficiently used in real-world applications. For example, for the first time, we could carry out extensive tests of several tree kernels on millions of training instances. As a direct benefit, we could experiment with a variant of the partial tree kernel, which we also propose in this paper.

Joint Distant and Direct Supervision for Relation Extraction

by Truc-vien T. Nguyen, Alessandro Moschitti
Cited by 10 (1 self)
Supervised approaches to Relation Extraction (RE) are characterized by higher accuracy than unsupervised models. Unfortunately, their applicability is limited by the need of training data for each relation type. Automatic creation of such data using Distant Supervision (DS) provides a promising solution to the problem. In this paper, we study DS for designing end-to-end systems of sentence-level RE. In particular, we propose a joint model between Web data derived with DS and manually annotated data from ACE. The results show (i) an improvement on the previous state-of-the-art in ACE, which provides important evidence of the benefit of DS; and (ii) a rather good accuracy on extracting 52 types of relations from Web data, which suggests the applicability of DS for general RE.

Citation Context

...ed as a multiclass classification problem. We employed one vs. rest, selecting the instance with largest margin as the final label. We used the Tree Kernel toolkit3 (Moschitti, 2004; Moschitti, 2006; Moschitti, 2008) as SVM platform to implement CK1 and CSK (see Section 4.1). The training phase with convolution kernels on syntactic parse tree and diverse sequence kernels on the large DS data took 3 days. For tes...

Learning Adaptable Patterns for Passage Reranking

by Aliaksei Severyn, Massimo Nicosia, Alessandro Moschitti
Cited by 9 (5 self)
This paper proposes passage reranking models that (i) do not require manual feature engineering and (ii) greatly preserve accuracy, when changing application domain. Their main characteristic is the use of relational semantic structures representing questions and their answer passages. The relations are established using information from automatic classifiers,

Citation Context

...aditional handcrafted rules. To reduce the burden of manual feature engineering for QA, we proposed structural models based on kernel methods, (Moschitti et al., 2007; Moschitti and Quarteroni, 2008; Moschitti, 2008) with passages limited to one sentence. Their main idea is to: (i) generate question and passage pairs, where the text passages are retrieved by a search engine; (ii) assuming those containing the co...

Discriminative Reranking for Spoken Language Understanding

by Marco Dinarelli, Alessandro Moschitti - IEEE Transactions on Audio, Speech & Language Processing, 2012
Cited by 8 (0 self)
Spoken Language Understanding (SLU) is concerned with the extraction of meaning structures from spoken utterances. Recent computational approaches to SLU, e.g. Conditional Random Fields (CRF), optimize local models by encoding several features, mainly based on simple n-grams. In contrast, recent works have shown that the accuracy of CRF can be significantly improved by modeling long-distance dependency features. In this paper, we propose novel approaches to encode all possible dependencies between features and most importantly among parts of the meaning structure, e.g. concepts and their combination. We rerank hypotheses generated by local models, e.g. Stochastic Finite State Transducers (SFSTs) or Conditional Random Fields (CRF), with a global model. The latter encodes a very large number of dependencies (in the form of trees or sequences) by applying kernel methods to the space of all meaning (sub) structures. We performed comparative experiments between SFST, CRF, Support Vector Machines (SVMs) and our proposed discriminative reranking models (DRMs) on representative conversational speech corpora in three different languages: the ATIS (English), the MEDIA (French) and the LUNA (Italian) corpora. These corpora have been collected within three different domain applications of increasing complexity: informational, transactional and problem-solving tasks, respectively. The results show that our DRMs consistently outperform the state-of-the-art models based on CRF.

Citation Context

...features). In more detail, we represent the dependency in the conceptual structure by means of semantic trees and sequences that we designed. Then, we apply tree and sequence kernels defined in [16], [18], [20] for extracting all possible substructures, which correspond to different semantic/syntactic dependency features. It should be noted that ours is the first comprehensive study on using such rich...

Automatic Feature Engineering for Answer Selection and Extraction

by Aliaksei Severyn, Alessandro Moschitti
Cited by 8 (4 self)
This paper proposes a framework for automatically engineering features for two important tasks of question answering: answer sentence selection and answer extraction. We represent question and answer sentence pairs with linguistic structures enriched by semantic information, where the latter is produced by automatic classifiers, e.g., question classifier and Named Entity Recognizer. Tree kernels applied to such structures enable a simple way to generate highly discriminative structural features that combine syntactic and semantic information encoded in the input trees. We conduct experiments on a public benchmark from TREC to compare with previous systems for answer sentence selection and answer extraction. The results show that our models greatly improve on the state of the art, e.g., up to 22% on F1 (relative improvement) for answer extraction, while using no additional resources and no manual feature engineering.

Citation Context

...ce re-ranking and answer extraction, is required to yield a better performance. 7 Related Work Tree kernel methods have found many applications for the task of answer reranking which are reported in (Moschitti, 2008; Moschitti, 2009; Moschitti and Quarteroni, 2008; Severyn and Moschitti, 2012). However, their methods lack the use of important relational information between a question and a candidate answer, whic...
