Download:
|
by Pascale Sébillot, Pierrette Bouillon
ftp://issco-ftp.unige.ch/pub/publications/sebillot-lll-2000.ps.gz
Abstract:
In this paper, we propose an Inductive Logic Programming (ILP) learning method that automatically extracts special Noun-Verb (N-V) pairs from a corpus in order to build semantic lexicons based on the principles of Pustejovsky's Generative Lexicon (GL) (Pustejovsky, 1995). In one component of this lexical model, the qualia structure, words are described in terms of semantic roles. For example, the telic role indicates the purpose or function of an item (cut for knife), the agentive role its mode of creation (build for house), etc. The qualia structure of a noun is mainly made up of verbal associations that encode relational information. The method we have developed automatically extracts from a corpus the N-V pairs whose elements are linked by one of the semantic relations defined in the GL qualia structure, and distinguishes them, in terms of surrounding categorial context, from N-V pairs that also occur in sentences of the corpus but are not relevant. The method has been validated theoretically and empirically on a technical corpus. The extracted N-V pairs will be used in information retrieval applications for index expansion.
Citations
|
555
|
The generative lexicon
– Pustejovsky
- 1991
|
|
486
|
Inverse entailment and Progol
– Muggleton
- 1995
|
|
345
|
Inductive logic programming : Theory and methods
– Muggleton, De Raedt
- 1994
|
|
338
|
Automatic acquisition of hyponyms from large text corpora
– Hearst
- 1992
|
|
203
|
Explorations in Automatic Thesaurus Discovery
– Grefenstette
- 1994
|
|
28
|
Knowledge acquisition of predicate argument structures from technical texts using machine learning: The system ASIUM
– Faure, Nédellec
- 1999
|
|
24
|
Extraction de liens sémantiques entre termes à partir de corpus de textes techniques
– Morin
- 1999
|
|
21
|
Corpus-derived first, second and third-order word affinities
– Grefenstette
- 1994
|
|
21
|
SQLET: Short Query Linguistic Expansion Techniques, palliating one-word queries by providing intermediate structure to text
– Grefenstette
- 1997
|
|
18
|
MMORPH - the multext morphology program
– Petitpierre, Russell
- 1998
|
|
15
|
Multext: Multilingual Text Tools and Corpora
– Armstrong
- 1996
|
|
14
|
The form of information in science: Analysis of an immunology sublanguage
– Harris, Gottfried, et al.
- 1989
|
|
11
|
Part-of-speech disambiguation using ILP
– Cussens
- 1996
|
|
10
|
Semantic Feature Extraction from Technical Texts with Limited Human Intervention
– Agarwal
- 1995
|
|
10
|
Automatic extraction of subcategorisation from corpora
– Briscoe, Carroll
- 1997
|
|
9
|
Learning for Semantic Interpretation: Scaling Up Without Dumbing Down
– Mooney
- 1999
|
|
8
|
Développement de lexiques à grande échelle
– Bouillon, Lehmann, et al.
- 1998
|
|
7
|
Semantic interpretation of binominal sequences and information retrieval
– Fabre, Sébillot
- 1999
|
|
7
|
Acquisition automatique d'informations lexicales à partir de corpus : un bilan. Research report n° 3321
– Pichon, Sébillot
- 1997
|
|
6
|
From corpus to lexicon: From contexts to semantic features
– Pichon, Sébillot
- 2000
|
|
4
|
Regroupements issus de dépendances syntaxiques en corpus : catégorisation et confrontation avec deux modélisations conceptuelles
– Bouaud, Habert, et al.
- 1997
|
|
4
|
Les linguistiques de corpus. Armand Colin/Masson
– Habert, Nazarenko, et al.
- 1997
|
|
3
|
The Polymorphism of Verbs Exhibiting Middle Transitive Alternations in English
– Bassac, Bouillon
- 2000
|
|
2
|
Tagger Overview
– Armstrong, Bouillon, et al.
- 1995
|