Building a Japanese Parsed Corpus while Improving the Parsing System
- In Proceedings of the NLPRS, 1998
Cited by 58 (5 self)
In January 1996, we started a project to construct a Japanese parsed corpus and to simultaneously improve a morphological analyzer and a parser. In this project, human annotators not only correct the erroneous analyses produced by the parsing system but also improve the parsing system and grammar: finding problematic fixed expressions, picking up phrases that have exceptional functions, classifying unseen types of clauses, and so on.
1 Introduction
Corpus-based methods have become a central paradigm in present-day NLP. Corpus-based NLP usually takes the following steps:
1. plain corpora are first assigned tags automatically (this paper limits the scope to part-of-speech tags and syntactic tags),
2. those tags are corrected by human annotators, producing correctly tagged corpora,
3. the tagged corpora are used to construct or improve an NLP system/grammar.
Corpus construction projects so far have not fully exploited the automatic parser. In the case of Penn Treebank Project (Marcus...
“Dialog Navigator”: A question answering system based on large text knowledge base
- In Proc. COLING, 2002
Cited by 13 (3 self)
This paper describes a dialog-based QA system, Dialog Navigator, which can answer questions based on a large text knowledge base. In real-world QA systems, vagueness of questions is a big problem. Our system can navigate users to the desired answers using the following methods: asking users back with dialog cards, and extracting a description from each retrieved text. Another feature of the system is that it retrieves relevant texts precisely, using question types, a synonymous expression dictionary, and modifier-head relations in Japanese sentences.
HPSG-Style Underspecified Japanese Grammar with Wide Coverage
- 1998
Cited by 12 (2 self)
This paper describes a wide-coverage Japanese grammar based on HPSG. The aim of this work is to see the coverage and accuracy attainable using an underspecified grammar. Underspecification, allowed in a typed feature structure formalism, enables us to write down a wide-coverage grammar concisely. The grammar we have implemented consists of only 6 ID schemata, 68 lexical entries (assigned to functional words), and 63 lexical entry templates (assigned to parts of speech (POSs)). Furthermore, word-specific constraints such as subcategorization of verbs are not fixed in the grammar. However, this grammar can generate parse trees for 87% of the 10000 sentences in the Japanese EDR corpus. The dependency accuracy is 78% when a parser uses the heuristic that every bunsetsu is attached to the nearest possible one.
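The nearest-attachment heuristic mentioned in this abstract can be illustrated with a minimal sketch. It assumes head-final dependencies (each bunsetsu attaches to some later bunsetsu) and uses a hypothetical `can_attach(i, j)` predicate as a stand-in for whatever the grammar permits. The function names and the toy constraint set are illustrative assumptions, not taken from the paper, and the sketch does not enforce non-crossing dependencies.

```python
def nearest_attachment(n, can_attach):
    """Attach each bunsetsu to the nearest later bunsetsu the grammar allows.

    n          -- number of bunsetsu in the sentence (indexed 0..n-1)
    can_attach -- hypothetical predicate: can_attach(i, j) is True when the
                  grammar permits bunsetsu i to depend on bunsetsu j
    Returns a list of head indices; None means "no head found" (the final
    bunsetsu is treated as the root and always gets None).
    """
    heads = []
    for i in range(n - 1):
        # Scan rightwards and stop at the first permitted head.
        head = next((j for j in range(i + 1, n) if can_attach(i, j)), None)
        heads.append(head)
    heads.append(None)  # assumption: the last bunsetsu is the root
    return heads


# Toy example: the set of permitted (dependent, head) pairs is made up.
allowed = {(0, 2), (0, 3), (1, 2), (2, 3)}
heads = nearest_attachment(4, lambda i, j: (i, j) in allowed)
print(heads)  # [2, 2, 3, None]
```

Bunsetsu 0 could attach to either 2 or 3 here; the heuristic picks 2 because it is closer.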
Semantic Analysis of Japanese Noun Phrases: A New Approach To . . .
- In Proc. of the 37th Annual Meeting of the Association for Computational Linguistics (ACL, 1999
Cited by 11 (3 self)
This paper presents a new method of analyzing Japanese noun phrases of the form N1 no N2. The Japanese postposition no roughly corresponds to of, but it has much broader usage. The method exploits a definition of N2 in a dictionary. For example, rugby no coach can be interpreted as a person who teaches technique in rugby. We illustrate the effectiveness of the method by the analysis of 300 test noun phrases.
A Probabilistic Disambiguation Method Based on Psycholinguistic Principles
- In Proceedings of the Fourth Workshop on Very Large Corpora (WVLC-4, 1996
Cited by 10 (0 self)
We address the problem of structural disambiguation in syntactic parsing. In psycholinguistics, a number of principles of disambiguation have been proposed, notably the Lexical Preference Rule (LPR), the Right Association Principle (RAP), and the Attach Low and Parallel Principle (ALPP). We argue that in order to improve disambiguation results it is necessary to implement these principles on the basis of a probabilistic methodology. We define a 'three-word probability' for implementing LPR, and a 'length probability' for implementing RAP and ALPP. Furthermore, we adopt the 'back-off' method to combine these two types of probabilities. Our experimental results indicate our method to be effective, attaining an accuracy of 89.2%.
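One way to picture the back-off combination is the sketch below. The abstract does not say how the two probabilities are estimated or combined, so everything here is an assumption made for illustration: the three-word probability is taken to be a relative frequency over (head, link word, dependent) triples, and the length probability is consulted only when none of the candidate triples has been observed. All names (`three_word_prob`, `choose_head`, the count tables) are hypothetical.

```python
def three_word_prob(triple, attached, seen):
    """Relative frequency with which a word triple was attached.

    Returns None when the triple was never observed, which is the signal
    to back off. (Assumed estimator, not the paper's definition.)
    """
    total = seen.get(triple, 0)
    if total == 0:
        return None
    return attached.get(triple, 0) / total


def choose_head(candidates, attached, seen, length_prob):
    """Pick an attachment site from candidates given as (triple, distance).

    Lexical evidence is used when at least one candidate triple has been
    observed; otherwise the decision backs off to a distance-based table.
    """
    scored = [(three_word_prob(t, attached, seen), i)
              for i, (t, _) in enumerate(candidates)]
    known = [(p, i) for p, i in scored if p is not None]
    if known:
        return max(known, key=lambda pi: pi[0])[1]
    # Back-off step: no lexical evidence, so rely on attachment distance.
    return max(range(len(candidates)),
               key=lambda i: length_prob.get(candidates[i][1], 0.0))


# Toy counts, invented for the example.
attached = {("ate", "with", "fork"): 9}
seen = {("ate", "with", "fork"): 10, ("pizza", "with", "fork"): 10}
length_prob = {1: 0.7, 2: 0.3}

cands = [(("pizza", "with", "fork"), 1), (("ate", "with", "fork"), 2)]
print(choose_head(cands, attached, seen, length_prob))  # 1 (lexical evidence)

unseen = [(("pizza", "with", "zest"), 1), (("ate", "with", "zest"), 2)]
print(choose_head(unseen, attached, seen, length_prob))  # 0 (backed off)
```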
Coordinate Noun Phrase Disambiguation in a Generative Parsing Model
Cited by 7 (0 self)
In this paper we present methods for improving the disambiguation of noun phrase (NP) coordination within the framework of a lexicalised history-based parsing model. As well as reducing noise in the data, we look at modelling two main sources of information for disambiguation: symmetry in conjunct structure, and the dependency between conjunct lexical heads. Our changes to the baseline model result in an increase in NP coordination dependency f-score from 69.9% to 73.8%, which represents a relative reduction in f-score error of 13%.
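The 13% figure follows from the two f-scores if "f-score error" is read as 100 minus the f-score, which is the usual convention but is an assumption here. A short check (the function name is illustrative):

```python
def relative_error_reduction(baseline_f, improved_f):
    """Fraction of the baseline error (100 - f-score) that was removed."""
    baseline_error = 100.0 - baseline_f
    improved_error = 100.0 - improved_f
    return (baseline_error - improved_error) / baseline_error


# Error drops from 30.1 to 26.2 points, i.e. 3.9 / 30.1 of the baseline error.
rer = relative_error_reduction(69.9, 73.8)
print(f"{rer:.1%}")  # 13.0%
```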
Finding Structural Correspondences from Bilingual Parsed Corpus for Corpus-based Translation
- Proceedings of the 18th International Conference on Computational Linguistics (COLING-00, 2000
Cited by 7 (0 self)
In this paper, we describe a system and methods for finding structural correspondences from the paired dependency structures of a source sentence and its translation in a target language. The system we have developed finds word correspondences first, then finds phrasal correspondences based on word correspondences. We have also developed a GUI system with which a user can check and correct the correspondences retrieved by the system. These structural correspondences will be used as raw translation patterns in a corpus-based translation system.
Converting Text into Agent Animations: Assigning Gestures to Text
Cited by 4 (3 self)
This paper proposes a method for assigning gestures to text based on lexical and syntactic information. First, our empirical study identified lexical and syntactic information strongly correlated with gesture occurrence and suggested that syntactic structure is more useful for judging gesture occurrence than local syntactic cues. Based on the empirical results, we have implemented a system that converts text into an animated agent that gestures and speaks synchronously.
Realistic Parsing: Practical Solutions of Difficult Problems
- 1995
Cited by 3 (2 self)
This paper describes work on the linguistic analysis of texts within a project devoted to knowledge acquisition from text. We focus on syntactic processing and present some key elements of the project's parser that allow it to deal successfully with technical texts. The parser is fully implemented and tested on a variety of real texts; improvements and enhancements are in progress. Because our knowledge acquisition method assumes no a priori model of the domain of the source text, the parser relies as much as possible on lexical and syntactic clues. That is why it strives for full syntactic analysis rather than some form of text skimming. We present a practical approach to four acknowledged difficult problems which to date have no generally acceptable answers: phrase attachment; time constraints for problematic input (how to avoid long and unproductive computation); parsing conjoined structures (how to preserve broad coverage without losing control of the parsing...
A Discriminative Learning Model for Coordinate Conjunctions
Cited by 3 (2 self)
We propose a sequence-alignment-based method for detecting and disambiguating coordinate conjunctions. In this method, averaged perceptron learning is used to adapt the substitution matrix to the training data drawn from the target language and domain. To reduce the cost of training data construction, our method accepts training examples in which complete word-by-word alignment labels are missing and only the boundaries of coordinated conjuncts are marked. We report promising empirical results in detecting and disambiguating coordinated noun phrases in the GENIA corpus, even though a relatively small number of training examples and minimal features are employed.

