The Grammar Matrix: An Open-Source Starter-Kit for the Rapid Development of Cross-Linguistically Consistent Broad-Coverage Precision Grammars
- Proceedings of the Workshop on Grammar Engineering and Evaluation at the 19th International Conference on Computational Linguistics, 2002
- Cited by 32 (9 self)
The grammar matrix is an open-source starter-kit for the development of broad-coverage HPSGs. By using a type hierarchy to represent cross-linguistic generalizations and providing compatibility with other open-source tools for grammar engineering, evaluation, parsing and generation, it facilitates not only quick start-up but also rapid growth towards the wide coverage necessary for robust natural language processing and the precision parses and semantic representations necessary for natural language understanding.
Compositional Semantics in a Multilingual Grammar Resource
- In E. M ..., 2003
- Cited by 14 (6 self)
In this paper we present the methodology and mechanisms employed for semantic composition in the Matrix grammar starter-kit, using Minimal Recursion Semantics (MRS) with grammars written in the Head-driven Phrase Structure Grammar (HPSG) framework.
A Lexicon Module for a Grammar Development Environment
- Proceedings of the 4th International Conference of Language Resources and ..., 2004
- Cited by 6 (3 self)
Past approaches to developing an effective lexicon component in a grammar development environment have suffered from a number of usability and efficiency issues. We present a lexical database module currently in use by a number of grammar development projects. The database module presented addresses issues which have caused problems in the past, and the power of a database architecture provides a number of practical advantages as well as a solid framework for future extension.
The Hinoki Treebank: A Treebank for Text Understanding
- In Proc. of the First IJCNLP, Lecture Notes in Computer Science, 2004
- Cited by 4 (2 self)
In this paper we describe the motivation for and construction of a new Japanese lexical resource: the Hinoki treebank. The treebank is built from dictionary definition sentences, and uses an HPSG grammar to encode the syntactic and semantic information. We then show how this treebank can be used to extract thesaurus information from definition sentences in a language-neutral way using minimal recursion semantics.
Parallel Distributed Grammar Engineering for Practical Applications
- Cited by 4 (2 self)
Based on a detailed case study of parallel grammar development distributed across two sites, we review some of the requirements for regression testing in grammar engineering, summarize our approach to systematic competence and performance profiling, and discuss our experience with grammar development for a commercial application.
High precision treebanking: Blazing useful trees using POS information
- In Proceedings of the 43rd Meeting of the Association for Computational Linguistics, 2005
- Cited by 2 (1 self)
In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part-of-speech tags. The Hinoki treebank is a Redwoods-style treebank of Japanese dictionary definition sentences. 5,000 sentences are annotated by three different annotators and the agreement evaluated. An average agreement of 65.4% was found using strict agreement, and 83.5% using labeled precision. Exploiting POS tags allowed the annotators to choose the best parse with 19.5% fewer decisions.
By
, 2004
I gratefully acknowledge, with deepest appreciation, the many people who supported me. I first express my appreciation to my dissertation committee members, Takao Gunji, Joseph Emonds, and Taisuke Nishigauchi. Takao Gunji, my adviser, gave me many theoretical ideas and inspired me in diverse ways. Actually, his writings interested me in theoretical linguistics and made me turn to grammar development, in which linguistics and NLP collaborate. I enjoyed arguing with Joseph Emonds about mathematical aspects of linguistic theory. I was really impressed with his wise comments. My gratitude also goes to Taisuke Nishigauchi, whose guidance has been invaluable since I decided to major in linguistics. I am also deeply grateful to Francis Bond for lots of advice ranging from engineering and technical aspects to the spiritual side. I owe him almost all of the technical expertise that is central to my dissertation. He also gave me a chance to take part in the HINOKI project, through which I could meet many interesting people and have a good experience. I learned so many things about NLP from Takaaki Tanaka and Sanae Fujita through the HINOKI project. Their help constitutes an integral part of my dissertation. I am indebted to Yuji Matsumoto, one of the groundbreakers of HPSG-based NLP in
NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
In this paper we present a quantitative and qualitative analysis of annotation in the Hinoki treebank of Japanese, and investigate a method of speeding annotation by using part-of-speech tags. The Hinoki treebank is a Redwoods-style treebank of Japanese dictionary definition sentences. 5,000 sentences are annotated by three different annotators and the agreement evaluated. An average agreement of 65.4% was found using strict agreement, and 83.5% using labeled precision. Exploiting POS tags allowed the annotators to choose the best parse with 19.5% fewer decisions.
Effectiveness of Methods for Syntactic and Semantic Recognition of Numeral Strings: Tradeoffs Between Number of Features and Length of Word N-Grams
This paper describes and compares the use of methods based on N-grams (specifically trigrams and pentagrams), together with five features, to recognise the syntactic and semantic categories of numeral strings representing money, number, date, etc., in texts. The system employs three interpretation processes: word N-gram construction with a tokeniser; rule-based processing of numeral strings; and N-gram-based classification. We extracted numeral strings from 1,111 online newspaper articles. For numeral string interpretation, we chose 112 (10%) of the 1,111 articles to provide unseen test data (1,278 numeral strings), and used the remaining 999 articles to provide 11,525 numeral strings for use in extracting N-gram-based constraints to disambiguate the meanings of the numeral strings. The word trigrams method resulted in 83.8% precision, 81.2% recall ratio, and 82.5% in F-measurement ratio. The word pentagrams method resulted in 86.6% precision, 82.9% recall ratio, and 84.7% in F-measurement ratio. Keywords: numeral strings, N-grams, named entity recognition, natural language processing.
morphemes. TABLE OF CONTENTS
, 2009
This is to certify that I have examined this copy of a doctoral dissertation by

