Optimality Theory: Constraint interaction in Generative Grammar
, 1993
Cited by 789 (23 self)
~ ROA Version, 8/2002. Essentially identical to the Tech Report, with new pagination (but the same footnote and example numbering); correction of typos, oversights & outright errors; improved typography; and occasional small-scale clarificatory rewordings. Citation should include reference to this version.
Information Structure and the Syntax-Phonology Interface
, 1998
Cited by 90 (3 self)
The paper proposes a theory relating syntax, semantics, and intonational prosody, and covering a wide range of English intonational tunes and their semantic interpretation in terms of focus and information structure. The theory is based on a version of combinatory categorial grammar which directly pairs phonological and logical forms without intermediary representational levels.
Generation of Affect in Synthesized Speech
- Journal of the American Voice I/O Society
, 1990
Cited by 78 (5 self)
When compared to human speech, synthesized speech is distinguished by insufficient intelligibility, inappropriate prosody and inadequate expressiveness. These are serious drawbacks for conversational computer systems. Intelligibility is basic --- intelligible phonemes are necessary for word recognition. Prosody --- intonation (melody) and rhythm --- clarifies syntax and semantics and aids in discourse flow control. Expressiveness, or affect, provides information about the speaker's mental state and intent beyond that revealed by word content. My work explores improvements to the affective component of synthesized speech. It is embodied in the Affect Editor program, which is intended to show that variations in affect can be generated in synthetic speech and to point the way towards improving the recognizability and naturalness of the affe...
A Hierarchical Stochastic Model for Automatic Prediction of Prosodic Boundary Location
- Computational Linguistics
, 1994
"... Prosodic phrase structure ..."
Heads and Phrases. Type Calculus for Dependency and Constituent Structure
- Journal of Logic, Language and Information
, 1991
Cited by 41 (10 self)
From a logical perspective, categorial type systems can be situated within a landscape of substructural logics --- logics with a structure-sensitive consequence relation. Research on these logics has shown that the inhabitants of the substructural hierarchy can be systematically related by embedding translations on the basis of structural modalities. The modal operators offer controlled access to stronger logics from within weaker ones by licensing of structural operations. Linguistic material exhibits structure in dimensions not covered by the standard structural rules. The purpose of this paper is to generalize the modalisation and licensing strategy to two such dimensions: phrasal structure and headedness. Phrasal domain-sensitive type systems capture the notion of constituent structure; constituency relaxation can be licensed via an associativity modality. The opposition between heads and non-heads introduces dependency structure, an autonomous dimension of linguistic structure wh...
The Computational Processing of Intonational Prominence: A Functional Prosody Perspective
, 1997
Cited by 16 (2 self)
Intonational prominence, or accent, is a fundamental prosodic feature that is said to contribute to discourse meaning. This thesis outlines a new, computational theory of the discourse interpretation of prominence, from a FUNCTIONAL PROSODY perspective. Functional prosody makes the following two important assumptions: first, there is an aspect of prominence interpretation that centrally concerns discourse processes, namely the discourse focusing nature of prominence; and second, the role of prominence in language processing in general, and discourse processing in particular, is not essentially separate from the processing of other grammatical, nonprosodic information. This thesis develops a computational theory of prominence interpretation by explaining how prominence serves as an inference cue in discourse processing. Prominence signals changes in the attentional status of entities in a discourse model, while nonprominence signals that the realized entities are already in discourse fo...
Prosody modeling in concept-to-speech generation
, 2002
Cited by 15 (1 self)
With the development of speech recognition and synthesis technology, speech interfaces for practical applications are in high demand. For applications like spoken dialogue systems, where not only the waveform but also the content of a system's query/response has to be generated automatically, a Concept-to-Speech system is needed. One key module in a Concept-to-Speech system is prosody modeling. It determines how prosody (intonation), the suprasegmental aspect of speech that communicates the structure and meaning of utterances, should be represented and generated automatically. Since prosody is directly affected by the meaning and structure of the sentences automatically produced by a natural language generator, and at the same time has significant influence on the naturalness and effectiveness of the synthesized speech, its performance is critical to the success of a Concept-to-Speech system, where natural language generation and speech synthesis are used together to generate the final spoken output. In this thesis, I focus on two aspects of the prosody modeling process. First, I explore novel features that are available during natural language generation, such as the meaning, structure, and context of sentences, and demonstrate how these features are related to prosody, based on empirical evidence derived from annotated speech corpora. Second, I propose a new prosody modeling approach that automatically combines different natural language features for prosody prediction. More specifically, I designed an augmented instance-based learning algorithm that makes use of the natural prosody in human speech to produce natural and vivid synthesized speech. Our subjective evaluation demonstrates the effectiveness of this approach. I implement the prosody modeling system for a medical application called MAGIC.
Automatic Prosodic Analysis for Computer Aided Pronunciation Teaching
, 1994
Cited by 14 (0 self)
Correct pronunciation of spoken language requires the appropriate modulation of acoustic characteristics of speech to convey linguistic information at a suprasegmental level. Such prosodic modulation is a key aspect of spoken language and is an important component of foreign language learning, for purposes of both comprehension and intelligibility. Computer aided pronunciation teaching involves automatic analysis of the speech of a non-native talker in order to provide a diagnosis of the learner's performance in comparison with the speech of a native talker. This thesis describes research undertaken to automatically analyse the prosodic aspects of speech for computer aided pronunciation teaching. It is necessary to describe the suprasegmental composition of a learner's speech in order to characterise significant deviations from a native-like prosody, and to offer some kind of corrective diagnosis. Phonological theories of prosody aim to describe the suprasegmental composition of speech...
Hierarchical structure and word strength prediction of Mandarin prosody
- International Journal of Speech Technology
, 2003
Cited by 13 (9 self)
We use Stem-ML to build an automatic learning system for Mandarin prosody that allows us to make quantitative measurements of prosodic strengths. Stem-ML is a phenomenological model of the muscle dynamics and planning process that controls the tension of the vocal folds. Because Stem-ML describes the interactions between nearby tones or accents, we were able to use a highly constrained model with only one accent template for each lexical tone category, and a single prosodic strength per word. The model accurately reproduces the intonation of the speaker, capturing 87% of the variance of f0. The result reveals strong alternating metrical patterns in words, and shows that the speaker uses word strength to mark a hierarchy of boundaries.

