Verb Class Disambiguation Using Informative Priors
- COMPUTATIONAL LINGUISTICS
, 2004
- Cited by 48 (4 self)
Levin’s (1993) study of verb classes is a widely used resource for lexical semantics. In her framework, some verbs, such as give, exhibit no class ambiguity. But other verbs, such as write, have several alternative classes. We extend Levin’s inventory to a simple statistical model of verb class ambiguity. Using this model we are able to generate preferences for ambiguous verbs without the use of a disambiguated corpus. We additionally show that these preferences are useful as priors for a verb sense disambiguator.
BiFrameNet: Bilingual frame semantics resources construction by cross-lingual induction
- In Proceedings of the 20th International Conference on Computational Linguistics
, 2004
- Cited by 20 (1 self)
We present a novel automatic approach to constructing a bilingual semantic network, the BiFrameNet, to enhance statistical and transfer-based machine translation systems. BiFrameNet is a frame semantic representation, and contains semantic structure transfers between English and Chinese. The English FrameNet and the Chinese HowNet provide us with two different views of the semantic distribution of lexicon by linguists. We propose to induce the mapping from the English lexical entries in FrameNet to Chinese word senses in HowNet, furnishing a bilingual semantic lexicon which simulates the “concept lexicon” purportedly used by human translators, and which can thus be beneficial to machine translation systems. BiFrameNet also contains bilingual example sentences that have the same semantic roles. We automatically induce Chinese example sentences and their semantic roles, based on semantic structure alignment from the first stage of our work, as well as shallow syntactic structure. In addition to its utility for machine-aided and machine translations, our work is also related to the spatial models proposed by cognitive scientists in the framework of artifactual simulations of the translation process.
Identifying Concepts Across Languages: A First Step towards a Corpus-based Approach to Automatic Ontology Alignment
- Cited by 11 (0 self)
This paper presents a first step towards the creation of a bilingual ontology through the alignment of two monolingual ontologies: the American English WordNet and the Mandarin Chinese HowNet. These two ontologies have structures which are very different from each other, as well as being constructed for two very different languages, which makes this an appropriate and challenging task for our algorithm.
Using ontological and document similarity to estimate museum exhibit relatedness
- ACM Journal on Computing and Cultural Heritage
, 2011
- Cited by 11 (4 self)
Exhibits within Cultural Heritage collections such as museums and art galleries are arranged by experts with intimate knowledge of the domain, but there may exist connections between individual exhibits that are not evident in this representation. For example, the visitors to such a space may have their own opinions on how exhibits relate to one another. In this paper, we explore the possibility of estimating the perceived relatedness of exhibits by museum visitors through a variety of ontological and document similarity-based methods. Specifically, we combine the Wikipedia category hierarchy with lexical similarity measures, and evaluate the correlation with the relatedness judgements of visitors. We compare our measure with simple document similarity calculations, based on either Wikipedia documents or web pages taken from the website for the museum of interest. We also investigate the hypothesis that physical distance in the museum space is a direct representation of the conceptual distance between exhibits. We demonstrate that ontological similarity measures are highly effective at capturing perceived relatedness and that the proposed raco (Related Article Conceptual Overlap) method achieves results closer to the relatedness judgements provided by human annotators than existing state-of-the-art measures of semantic relatedness.
Multilingual Computational Semantic Lexicons in Action: The Wysinnwyg Approach To NLP
- IN PROCEEDINGS OF THE 36TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 17TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS (COLING-ACL-98)
, 1998
- Cited by 9 (1 self)
Much effort has been put into computational lexicons over the years, and most systems give much room to (lexical) semantic data. However, in these systems, the effort put on the study and representation of lexical items to express the underlying continuum existing in 1) language vagueness and polysemy, and 2) language gaps and mismatches, has remained embryonic. A sense enumeration approach fails from a theoretical point of view to capture the core meaning of words, let alone relate word meanings to one another, and complicates the task of NLP by multiplying ambiguities in analysis and choices in generation. In this paper, I study computational semantic lexicon representation from a multilingual point of view, reconciling different approaches to lexicon representation: i) vagueness for lexemes which have a more or less finer grained semantics with respect to other languages; ii) underspecification for lexemes which have multiple related facets; and, iii) lexical rules to relate systematic polysemy to systematic ambiguity. I build on a What You See Is Not Necessarily What You Get (WYSINNWYG) approach to provide the NLP system with the "right" lexical data already tuned towards a particular task. In order to do so, I argue for a lexical semantic approach to lexicon representation. I exemplify my study through a cross-linguistic investigation on spatially-based expressions.
A Multi-Level Approach to Interlingual MT: Defining the Interface between Representational Languages
- International Journal of Expert Systems
- Cited by 8 (2 self)
This paper describes a multi-level design, i.e., a non-uniform approach to interlingual machine translation (MT), in which distinct representational languages are used for different types of knowledge. We demonstrate that a linguistically-motivated "division of labor" across multiple representation levels has not complicated, but rather has readily facilitated, the identification and construction of systematic relations at the interface between each level. Our approach assumes an interlingua derived from the lexical semantics and predicate decomposition approaches of Jackendoff (1983; 1990) and Levin and Rappaport-Hovav (1995a; 1995b). We describe a model of interpretation and representation of natural language sentences which has been implemented as part of an interlingual MT system called PRINCITRAN.
Capturing motion verb generalizations in synchronous tree-adjoining grammar
- Predicative Forms in Natural Language and in Lexical Knowledge Bases
, 1999
- Cited by 6 (3 self)
This paper describes the use of verb class memberships as a means of capturing generalizations about manner-of-motion verbs in Synchronous Tree Adjoining Grammars, STAGs, [20, 21, 22]. This approach allows STAGs, which are essentially transfer-based, to take advantage of the same types of generalizations which are generally thought of as wholly the domain of interlingua systems, without giving up any of the lexical specificity unique to transfer-based systems. In this way a machine translation system based on STAGs can respond with seamless flexibility to a wide spectrum of phenomena being presented for translation, ranging from idioms and idiosyncratic lexical items to well-behaved verbs that follow lexical rules.
Are WordNet sense distinctions appropriate for computational lexicons?
- In Advanced Papers of the SENSEVAL Workshop
, 1998
- Cited by 5 (0 self)
In this paper we specifically address the question of polysemy with respect to verbs, and whether or not the sense distinctions that are made in online dictionary resources are appropriate for computational lexicons. We examine the use of sets of related syntactic frames and verb classes as a means of simplifying the task of defining different senses, and we focus on the mismatches between distinctions that can readily be made with these tools and some of the distinctions that are made in WordNet.
Argument Status in Japanese Verb Sense Disambiguation
- IN EIGHTH INTERNATIONAL CONFERENCE ON THEORETICAL AND METHODOLOGICAL ISSUES IN MACHINE TRANSLATION: TMI-99
, 1999
- Cited by 5 (3 self)
This research aims to incorporate argument status-based modelling within an otherwise selectional constraint-based system of verb sense disambiguation, to capture effects such as underspecification, surface case alternation and semantic backing-off. The proposed implementation hinges around a description of the general behavioural characteristics of integral complements, complements, middles and adjuncts through a pre-determined weighting schema. On limited evaluation, the resultant system returned an accuracy of over 83%, and was further shown to significantly outperform baseline methods.