Results 1 - 7 of 7
Profer: Predictive, Robust Finite-State Parsing for Spoken Language
- In Proceedings of ICASSP, 1999
- Cited by 16 (3 self)
The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network, or as a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser can ultimately provide an efficient vehicle for tightly integrating higher-level linguistic knowledge into speech recognition. We report on our development of this parser, with an example of its use, and a description of how it compares to both finite-state predictors and chart-based semantic parsers, while combining the elements of...
Unsupervised POS-Tagging Improves Parsing Accuracy and Parsing Efficiency
- In Proceedings of IWPT, 2001
- Cited by 7 (0 self)
It is shown that a simple POS-tagger can be used to filter the results of lexical analysis of a wide-coverage computational grammar. The reduction of the number of lexical categories not only greatly improves parsing efficiency, but in our experiments also gave rise to a mild increase in parsing accuracy, in contrast to results reported in earlier work on supervised tagging. The novel aspect of our approach is that the POS-tagger does not require any human-annotated data - but rather uses the parser output obtained on a large training set.
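The filtering idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not say which criterion decides what gets filtered, so the relative threshold used here, the function name `filter_lexical_categories`, and the data layout (one set of candidate categories and one tag-to-score dictionary per word) are all assumptions made for illustration.

```python
def filter_lexical_categories(candidates, tag_scores, threshold=0.1):
    """Drop lexical categories that a tagger considers unlikely.

    candidates: list with one set of category names per word (assumed layout)
    tag_scores: list with one dict per word mapping category -> tagger score
    threshold:  hypothetical cut-off relative to the best score for the word
    """
    filtered = []
    for cats, scores in zip(candidates, tag_scores):
        # Best score among the categories the lexicon proposes for this word.
        best = max((scores.get(c, 0.0) for c in cats), default=0.0)
        kept = {c for c in cats if scores.get(c, 0.0) >= threshold * best}
        # Never leave a word without any category: fall back to all of them.
        filtered.append(kept or set(cats))
    return filtered


candidates = [{"verb_particle", "verb_trans"}, {"noun"}]
tag_scores = [{"verb_particle": 0.02, "verb_trans": 0.9}, {}]
result = filter_lexical_categories(candidates, tag_scores)
```

In this toy input the particle-verb reading scores far below the transitive reading and is dropped, while the second word keeps its only category because the tagger has no opinion about it. The parser would then run on the reduced category sets.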
Combining Acoustic Confidence Scores with Deep Semantic Analysis for Clarification Dialogues
- In Proceedings of the 5th International Workshop on Computational Semantics (IWCS-5), 2003
- Cited by 6 (0 self)
This paper describes a technique to include acoustic confidence scores as returned by automated speech recognisers in generic semantic representations. The method we propose requires only minimal changes to an existing grammar used for speech applications. Special attention is paid to the treatment of multi-word lexemes and combining several (N-best) speech recognition results into one semantic representation. The approach has been implemented and tested using the Nuance speech recognition software and a chart parser, in the formalism of underspecified discourse representations. The potential relevance of confidence scores in rich semantic representations is illustrated by generating more flexible clarification questions in dialogue systems.
Statistical Parsing of Dutch using Maximum Entropy Models with Feature Merging
- In Proceedings of the Natural Language Processing Pacific Rim Symposium, 2001
- Cited by 2 (0 self)
In this project report we describe work in statistical parsing using the maximum entropy technique and the Alpino language analysis system for Dutch. A major difficulty in this domain is the lack of sufficient corpus data available for training. Among other problems, this sparseness of data increases the danger of the model overfitting the training data, making it particularly important that the selection of statistical features upon which to base the model be optimal. To this end we have adapted the notion of feature merging, a means of constructing equivalence classes of statistical features based upon common elements within them. In spite of promising preliminary results, subsequent tests have not enabled us to conclude whether this approach helps the kind of models we are working with.
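Feature merging, as described in this abstract, groups features into equivalence classes by the elements they share. A rough Python sketch of that idea follows. The abstract does not specify how the common elements are chosen, so the tuple-shaped features, the `merge_features` helper, and the choice of "first two components" as the shared part are hypothetical.

```python
from collections import Counter


def merge_features(feature_counts, key):
    """Collapse features into equivalence classes and sum their counts.

    feature_counts: dict mapping a feature (here a tuple) to its corpus count
    key:            function returning the part of a feature that defines
                    its equivalence class (an assumption of this sketch)
    """
    merged = Counter()
    for feature, count in feature_counts.items():
        merged[key(feature)] += count
    return merged


counts = {
    ("np", "det", "dog"): 1,
    ("np", "det", "cat"): 2,
    ("vp", "v", "sees"): 1,
}
# Hypothetical merge: ignore the lexical item, keep the structural part.
merged = merge_features(counts, key=lambda f: f[:2])
```

Merging trades specificity for better-populated counts, which is the motivation the abstract gives when training data is sparse.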
Robust Efficient Parsing for Spoken Dialogue Processing
- 1998
ion (Johnson and Dorre, [39])
- x(A,B,f(A,B),g(A,h(B,i(C)))) => x(A,B,f(_,_),g(_,_))
- parsewithweakening(Cat,P0,P,E0,E) :- weaken(Cat,WeakenedCat), parse(WeakenedCat,P0,P,E0,E), Cat=WeakenedCat.
- Really helps!

Ambiguity Packing
- A parser should not construct all parse trees (exponential)
- Instead, a compact representation of all such parse trees is constructed
  -- grammar [42, 9]
  -- parse forest [76]
  -- packed structures [3]
- Here: for every `result item' keep track of the lexical entry and references of other result items that were used to create it
- Results in a lexicalized tree substitution grammar
- which generates the input sentence with all its parse trees

Bottom-up Inactive-chart Parser
Item form: [i, X, j]
Axioms:
Goals: [0, S, n]
Inference Rules:
  Scan: [q_i, w_i, q_{i+1}]
  Complete: from [q_k, X_1, q_k'] [q_k', X_2, q_k''] ... [q_m', X_l, q_m] infer [q_k, X_0, q_m], given X_0 -> X_1 ... X_l

Bottom-up Inactive-chart Parser
Inference Rules: Scan [q_i, w_i, q_{i+...
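The Scan and Complete rules of the bottom-up inactive-chart parser in this entry can be read as a small deduction system. The Python sketch below is one way to run it to a fixpoint, under a few assumptions: positions are plain integers rather than the states q_i, words are introduced by ordinary rules such as Det -> the, there are no empty right-hand sides, and the names `recognize`, `end_positions` and `GRAMMAR` are invented for the example. It is a recognizer only and does no ambiguity packing.

```python
from collections import defaultdict


def end_positions(ends, rhs, start):
    """Return every position reachable from `start` by chaining chart
    items for the symbols of `rhs`, left to right."""
    positions = {start}
    for symbol in rhs:
        positions = {j for i in positions for j in ends.get((i, symbol), ())}
    return positions


def recognize(words, rules, goal="S"):
    """Bottom-up recognizer storing only inactive items (i, X, j)."""
    n = len(words)
    ends = defaultdict(set)  # (i, X) -> set of end positions j
    # Scan: one item (i, w_i, i+1) per input word.
    for i, word in enumerate(words):
        ends[(i, word)].add(i + 1)
    # Complete: apply X0 -> X1 ... Xl until no new item can be derived.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for k in range(n):
                for m in end_positions(ends, rhs, k):
                    if m not in ends[(k, lhs)]:
                        ends[(k, lhs)].add(m)
                        changed = True
    chart = {(i, x, j) for (i, x), js in ends.items() for j in js}
    # Goal item: the start symbol spanning the whole input.
    return chart, (0, goal, n) in chart


GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("sees",)),
]
chart, ok = recognize("the dog sees the cat".split(), GRAMMAR)
```

The naive fixpoint loop keeps the code short. An agenda-driven version would avoid rescanning every rule on each pass, at the cost of more bookkeeping.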
A Practical Semantic Type Representation for Natural Language Understanding
- 2000
Reasoning about semantic classes and determining compatibility of the words in a given context is an important procedure used in many modules of natural language understanding systems. However, most existing systems do not devote much attention to their ontological knowledge representations, resulting in implementations that are not portable to other domains. At the same time, statistical methods are more robust and less labor-intensive to develop, but typically result in models that are not easily interpretable by humans. We propose a semantic feature representation for use in practical dialogue systems and argue that it can offer advantages in terms of lexicon development and portability - in particular for defining selectional restrictions - and can also be useful for other system modules that do logical inference. We then propose to develop statistical methods allowing us to learn parts of our representation from corpus data. The author wishes to thank James Allen, Jason Eisner, Len...
Gertjan van Noord Alfa-informatica & BCN
Abstract

It is shown that a simple POS-tagger can be used to filter the results of lexical analysis of a wide-coverage computational grammar. The reduction of the number of lexical categories not only greatly improves parsing efficiency, but in our experiments also gave rise to a mild increase in parsing accuracy, in contrast to results reported in earlier work on supervised tagging. The novel aspect of our approach is that the POS-tagger does not require any human-annotated data - but rather uses the parser output obtained on a large training set.

1 Introduction

Full parsing of unrestricted texts on the basis of a wide-coverage computational HPSG grammar remains a challenge. In our recent experience in the development of the Alpino system, discussed in section 2, we found that even in the presence of various clever chart parsing and ambiguity packing techniques, lexical ambiguity in particular has an important effect on parsing efficiency. In some cases, a category assigned to a word is obviously wrong for the sentence the word occurs in. For instance, in a lexicalist grammar the two occurrences of called in (1) will be associated with two distinct lexical categories. The entry associated with (1-a) will reflect the requirement that the verb combines syntactically with the particle `up'. Clearly, this lexical category is irrelevant for the analysis of sentence (1-b), since no such particle occurs in the sentence.

