From treebank to propbank
- In Language Resources and Evaluation
, 2002
Cited by 163 (8 self)
This paper describes our approach to the development of a Proposition Bank, which involves the addition of semantic information to the Penn English Treebank. Our primary goal is the labeling of syntactic nodes with specific argument labels that preserve the similarity of roles such as the window in "John broke the window" and "The window broke". After motivating the need for explicit predicate argument structure labels, we briefly discuss the theoretical considerations of predicate argument structure and the need to maintain consistency across syntactic alternations. The issues of consistency of argument structure across both polysemous and synonymous verbs are also discussed, and we present our actual guidelines for these types of phenomena, along with numerous examples of tagged sentences and verb frames. Metaframes are introduced as a technique for handling similar frames among near-synonymous verbs. We conclude with a summary of the current status of the annotation process.
Handling Translation Divergences: Combining Statistical and Symbolic Techniques in Generation-Heavy Machine Translation
- In Fifth Conference of the Association for Machine Translation in the Americas, AMTA-2002
, 2002
Cited by 16 (5 self)
This paper describes a novel approach to handling translation divergences in a Generation-Heavy Hybrid Machine Translation (GHMT) system. The translation divergence problem is usually reserved for Transfer and Interlingual MT because it requires a large combination of complex lexical and structural mappings. A major requirement of these approaches is the accessibility of large amounts of explicit symmetric knowledge for both source and target languages. This limitation renders Transfer and Interlingual approaches ineffective in the face of structurally-divergent language pairs with asymmetric resources. GHMT addresses the more common form of this problem, source-poor/target-rich, by fully exploiting symbolic and statistical target-language resources.
Generation-Heavy Hybrid Machine Translation
, 2002
Cited by 11 (1 self)
This paper describes Generation-Heavy Hybrid Machine Translation (GHMT), a novel approach for translating between structurally-divergent language pairs with asymmetrical resources. The approach depends on the existence of rich target language resources such as word lexical semantics, categorial variations and subcategorization frames. These resources are used to overgenerate multiple lexico-structural variations from a target-glossed syntactic dependency representation of the source language sentence. This symbolic overgeneration, which accounts for a wide range of possible variations, is constrained by a statistical target-language model. The exploitation of target language resources (symbolic and statistical) to handle a problem usually reserved for Transfer and Interlingual MT is useful for translation from source languages with scarce linguistic resources. A preliminary evaluation on the application of this approach to Spanish-English MT is conducted with promising results.
DUSTer: A Method for Unraveling Cross-Language Divergences for Statistical Word-level Alignment
- Proceedings of AMTA-02
, 2002
Cited by 8 (2 self)
The frequent occurrence of divergences (structural differences between languages) presents a great challenge for statistical word-level alignment. In this paper, we introduce DUSTer, a method for systematically identifying common divergence types and transforming an English sentence structure to bear a closer resemblance to that of another language. Our ultimate goal is to enable more accurate alignment and projection of dependency trees in another language without requiring any training on dependency-tree data in that language. We present an empirical analysis comparing the complexities of performing word-level alignments with and without divergence handling. Our results suggest that our approach facilitates word-level alignment, particularly for sentence pairs containing divergences.
Identification of Divergence for English to Hindi EBMT
- In Proceedings of MT SUMMIT III. 141--148
Cited by 6 (1 self)
Divergence is a key aspect of translation between two languages. Divergence occurs when structurally similar sentences of the source language do not translate into sentences that are similar in structure in the target language. Divergence assumes special significance in the domain of Example-Based Machine Translation (EBMT). An EBMT system generates a translation of a given sentence by retrieving similar past translation examples from its example base and then adapting them suitably to meet the current translation requirements. Divergence poses a great challenge to the success of EBMT. The present work provides a technique for identification of divergence without going into the semantic details of the underlying sentences. This identification helps in partitioning the example database into divergence / non-divergence categories, which in turn should facilitate efficient retrieval and adaptation in an EBMT system.
Development and Evaluation of a Korean Treebank and its Application to NLP
- In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC '02)
, 2002
Cited by 6 (1 self)
This paper discusses issues in building a 54-thousand-word Korean Treebank using a phrase structure annotation, along with developing annotation guidelines based on the morpho-syntactic phenomena represented in the corpus. Various methods that were employed for quality control are presented. An evaluation of the quality of the Treebank and some of the NLP applications under development using the Treebank are also presented.
Inducing Lexico-Structural Transfer Rules from Parsed Bi-texts
- In: Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics -- DDMT Workshop
, 2001
Cited by 5 (1 self)
This paper describes a novel approach to inducing lexico-structural transfer rules from parsed bi-texts using syntactic pattern matching, statistical cooccurrence and error-driven filtering.
Dependency and relational structure in treebank annotation
- COLING 2004 Recent Advances in Dependency Grammar
, 2004
Cited by 5 (1 self)
Among the variety of proposals currently making the dependency perspective on grammar more concrete, there are several treebanks whose annotation exploits some form of Relational Structure that we can consider a generalization of the fundamental idea of dependency to various degrees and with reference to different types of linguistic knowledge. The paper describes the Relational Structure as the common underlying representation of treebanks, which is motivated by both theoretical and task-dependent considerations. Then it presents a system for the annotation of the Relational Structure in treebanks, called Augmented Relational Structure, which allows for a systematic annotation of various components of linguistic knowledge crucial in several tasks. Finally, it shows a dependency-based annotation for an Italian treebank, i.e. the Turin University Treebank, that implements the Augmented Relational Structure.
From TreeBank to PropBank
, 2002
Cited by 4 (0 self)
This paper describes our approach to the development of a Proposition Bank, which involves the addition of semantic information to the Penn English Treebank. Our primary goal is the labeling of syntactic nodes with specific argument labels that preserve the similarity of roles such as the window in "John broke the window" and "The window broke". After motivating the need for explicit predicate argument structure labels, we briefly discuss the theoretical considerations of predicate argument structure and the need to maintain consistency across syntactic alternations. The issues of consistency of argument structure across both polysemous and synonymous verbs are also discussed, and we present our actual guidelines for these types of phenomena, along with numerous examples of tagged sentences and verb frames. Metaframes are introduced as a technique for handling similar frames among near-synonymous verbs. We conclude with a summary of the current status of the annotation process.
Maurice Gross’ grammar lexicon and natural language processing
- In Proceedings of the 2nd Language and Technology Conference
, 2005
Cited by 3 (1 self)
Maurice Gross’ grammar lexicon contains extremely rich and exhaustive information about the morphosyntactic and semantic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is still restricted. In this paper, we first argue that the information contained in the grammar lexicon is potentially useful for Natural Language Processing (NLP). We then sketch a way to translate this information into a format which is arguably more amenable for use by NLP systems.
1. Maurice Gross’s grammar lexicon
Much work in syntax concentrates on identifying and formalising general syntactic rules that are thought to be valid of a large class of words. Typically, Chomsky’s transformation rules describe systematic relations between syntactic structures. And more recently, the lexical rules of, e.g., Lexical Functional Grammar systematically describe a pair of syntactic categories deemed to hold of a given class of words. But as Chomsky himself observed (Chomsky, 1965),

