  Inductive Logic Programming for Corpus-Based Acquisition of Semantic Lexicons

by Pascale Sébillot, Pierrette Bouillon
ftp://issco-ftp.unige.ch/pub/publications/sebillot-lll-2000.ps.gz

Abstract:

In this paper, we propose an Inductive Logic Programming (ILP) learning method that automatically extracts specific Noun-Verb (N-V) pairs from a corpus in order to build semantic lexicons based on the principles of Pustejovsky's Generative Lexicon (GL) (Pustejovsky, 1995). In one component of this lexical model, the qualia structure, words are described in terms of semantic roles. For example, the telic role indicates the purpose or function of an item (cut for knife), the agentive role its mode of creation (build for house), etc. The qualia structure of a noun consists mainly of verbal associations, which encode relational information. The ILP method that we have developed makes it possible to automatically extract from a corpus the N-V pairs whose elements are linked by one of the semantic relations defined in the GL qualia structure, and to distinguish them, in terms of their surrounding categorial context, from N-V pairs that also occur in sentences of the corpus but are not relevant. The method has been validated both theoretically and empirically on a technical corpus. The extracted N-V pairs will subsequently be used for index expansion in information retrieval applications.
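As an illustration only, the qualia roles and the index-expansion use mentioned in the abstract could be sketched as follows. This is a minimal sketch, not the authors' implementation: the names `QualiaStructure` and `expand_index` are hypothetical, only the telic and agentive roles are modeled, and the two lexicon entries simply reuse the abstract's own examples (cut for knife, build for house).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class QualiaStructure:
    """Hypothetical container for the verbs linked to a noun by qualia roles."""
    noun: str
    telic: Set[str] = field(default_factory=set)     # purpose or function
    agentive: Set[str] = field(default_factory=set)  # mode of creation

    def verbs(self) -> Set[str]:
        """All verbs related to the noun through any modeled role."""
        return self.telic | self.agentive


def expand_index(terms: List[str], lexicon: Dict[str, QualiaStructure]) -> List[str]:
    """Append qualia-related verbs after each index term found in the lexicon."""
    expanded: List[str] = []
    for term in terms:
        expanded.append(term)
        entry = lexicon.get(term)
        if entry is not None:
            # Sorted so the output order is deterministic.
            expanded.extend(sorted(entry.verbs()))
    return expanded


# Toy lexicon built from the examples given in the abstract (assumption:
# in practice these pairs would come from the ILP extraction step).
lexicon = {
    "knife": QualiaStructure("knife", telic={"cut"}),
    "house": QualiaStructure("house", agentive={"build"}),
}

print(expand_index(["knife", "door"], lexicon))
```

Terms without a lexicon entry ("door" here) pass through unchanged, so the expansion only adds verbs for nouns whose N-V pairs have been acquired.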

Citations

555 The generative lexicon – Pustejovsky - 1991
486 Inverse entailment and Progol – Muggleton - 1995
345 Inductive logic programming : Theory and methods – Muggleton, Raedt - 1994
338 Automatic acquisition of hyponyms from large text corpora – Hearst - 1992
203 Explorations in Automatic Thesaurus Discovery – Grefenstette - 1994
28 Knowledge acquisition of predicate argument structures from technical texts using machine learning: The system ASIUM – Faure, Nédellec - 1999
24 Extraction de liens sémantiques entre termes à partir de corpus de textes techniques – Morin - 1999
21 Corpus-derived first, second and third-order word affinities – Grefenstette - 1994
21 SQLET: Short Query Linguistic Expansion Techniques, palliating one-word queries by providing intermediate structure to text – Grefenstette - 1997
18 MMORPH - the multext morphology program – Petitpierre, Russell - 1998
15 Multext: Multilingual Text Tools and Corpora – Armstrong - 1996
14 The form of information in science: Analysis of an immunology sublanguage – Harris, Gottfried, et al. - 1989
11 Part-of-speech disambiguation using ILP – Cussens - 1996
10 Semantic Feature Extraction from Technical Texts with Limited Human Intervention – Agarwal - 1995
10 Automatic extraction of subcategorisation from corpora – Briscoe, Carroll - 1997
9 Learning for Semantic Interpretation: Scaling Up Without Dumbing Down – Mooney - 1999
8 Développement de lexiques à grande échelle – Bouillon, Lehmann, et al. - 1998
7 Semantic interpretation of binominal sequences and information retrieval – Fabre, Sébillot - 1999
7 Acquisition automatique d'informations lexicales à partir de corpus : un bilan. Research report n° 3321 – Pichon, Sébillot - 1997
6 From corpus to lexicon: From contexts to semantic features – Pichon, Sébillot - 2000
4 Regroupements issus de dépendances syntaxiques en corpus : catégorisation et confrontation avec deux modélisations conceptuelles – Bouaud, Habert, et al. - 1997
4 Les linguistiques de corpus. Armand Colin/Masson – Habert, Nazarenko, et al. - 1997
3 The Polymorphism of Verbs Exhibiting Middle Transitive Alternations in English – Bassac, Bouillon - 2000
2 Tagger Overview – Armstrong, Bouillon, et al. - 1995