Glottal Open Quotient Estimation Using Linear Prediction
- In Proc. Intern. Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications
, 1999
Abstract
-
Cited by 5 (1 self)
A new method for estimating the voice open quotient is presented. Assuming abrupt glottal closures, the glottal flow waveform is considered as the impulse response of an anticausal two-pole filter. It is defined by four parameters: T0, Av, Oq and m. The last three are estimated by a second-order linear prediction of the inverse filtered speech. Results on synthetic and natural speech signals are reported and compared with measurements on the corresponding electroglottographic signals. 1 Introduction Analysis of the acoustic parameters of the voice source is a challenging issue in the domains of speech communication (e.g. speech analysis and synthesis) and speech and voice pathology research. Vocal fold vibration is responsible for voice and speech quality. Direct measurement of the glottal activity is still difficult, and thus many methods for voice analysis are based on processing of the acoustic signal. According to the linear source/filter theory of speech production [1], the...
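The second-order linear prediction step named in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it uses the textbook autocorrelation method on a causal signal, and the function names `lpc2` and `impulse_response` are hypothetical. Applying it to the anticausal model would presumably require time-reversing each glottal period first, which is an assumption and is not shown here.

```python
def autocorr(x, lag):
    """Biased autocorrelation sum of x at the given non-negative lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def lpc2(x):
    """Second-order linear prediction by the autocorrelation method.

    Returns (a1, a2) such that x[n] is approximated by
    a1 * x[n-1] + a2 * x[n-2] in the least-squares sense.
    """
    r0, r1, r2 = autocorr(x, 0), autocorr(x, 1), autocorr(x, 2)
    det = r0 * r0 - r1 * r1  # determinant of the 2x2 Toeplitz system
    if det == 0:
        raise ValueError("signal is degenerate; cannot solve normal equations")
    # Solve [[r0, r1], [r1, r0]] @ [a1, a2] = [r1, r2] in closed form.
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2


def impulse_response(a1, a2, length):
    """Impulse response of the all-pole filter 1 / (1 - a1 z^-1 - a2 z^-2)."""
    h = [0.0] * length
    for n in range(length):
        h[n] = (1.0 if n == 0 else 0.0)
        if n >= 1:
            h[n] += a1 * h[n - 1]
        if n >= 2:
            h[n] += a2 * h[n - 2]
    return h


# Usage: recover the coefficients of a stable two-pole resonator.
a1_est, a2_est = lpc2(impulse_response(1.7, -0.81, 600))
```

For a stable two-pole filter whose impulse response has decayed within the analysis window, the estimated coefficients match the filter coefficients closely, which is why a second-order predictor suits a two-pole source model.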
Comparison of multiple voice source parameters in different phonation types
- Proceedings of Interspeech 2007
Abstract
-
Cited by 4 (0 self)
A large sample of vowels produced by male and female speakers was inverse filtered and parameterized using 21 different glottal flow parameters. The performance of the different parameters in expressing the phonation type was then tested using objective statistical methods. The comparison of the results revealed marked differences in the parameters' performance, and therefore, guidelines for parameter use and comparison were established. Index Terms: voice quality, phonation type, inverse filtering, voice source, parameterization
Source-filter separation for articulation-to-speech synthesis
- in Proc. ICSLP2004, Jeju, Korea
, 2004
Abstract
-
Cited by 2 (2 self)
In this paper we examine a method for separating out the vocal-tract filter response from the voice source characteristic using a large articulatory database. The method realises such separation for voiced speech using an iterative approximation procedure under the assumption that the speech production process is a linear system composed of a voice source and a vocal-tract filter, and that each of the components is controlled independently by different sets of factors. Experimental results show that the spectral variation is evidently influenced by the fundamental frequency or the power of speech, and that the tendency of the variation may be related closely to speaker identity. The method enables independent control over the voice source characteristic in our articulation-to-speech synthesis.
J. Thompson J. S. Mason
Abstract
A large proportion of current speech-related research is founded upon databases collected under controlled conditions. The manner in which they are collected is likely to induce variations due to anxiety, and analysis of a typical speech database is presented supporting this. Trends in speaker identification (SI) error rates correlate with postulated trends in anxiety levels along the time course of data collection. Further analysis using estimated glottal waveforms shows a speaker that causes high SI errors to have speech that exhibits characteristics of higher stress at the initial phase of database recording. 2. INTRODUCTION Speech variability, both within speaker (intra) and between speakers (inter), is a topic of major interest in speech research. Much effort has been directed towards the study of variation across speakers (inter-variation), particularly with regard to speaker adaptation in speech recognition systems. Significantly less research has been devoted to the study of withi...
MATHEMATICAL METHODS FOR LINEAR PREDICTIVE SPECTRAL MODELLING OF SPEECH
Abstract
Brief is the flight of the brightest stars. In February 2008, we were struck by the sudden departure of our colleague and friend, Carlo Magi. His demise, at the age of 27, was not only a great personal shock for those who knew him, but also a loss for science. During his brief career in science, he displayed rare talent in both the development of mathematical theory as well as application of mathematical results in practical problems of speech science. Apart from mere technical contributions, Carlo was also a valuable and highly appreciated member of the research team. Among colleagues, he was well known for venturing into exciting philosophical discussions and his positive as well as passionate attitude was motivating for everyone around him. His contributions will be fondly remembered. Carlo’s work was tragically interrupted just a few months before the defence of his doctoral thesis. Shortly after Carlo’s passing, we realised that the thesis he was working on had to be finished. We, the colleagues of Carlo, did not have a choice, it was obvious to us that finishing his work was our obligation. It is our hope that this posthumous doctoral dissertation of Carlo Magi will honour the life and work of our dear colleague, as well as remind us how fortunate we were to have worked with such a talent.
Speech: Analysis of the Glottal Flow Using the Normalised Amplitude Quotient
Abstract
Emotions in short vowel segments of continuous speech were analysed using inverse filtering and a recently developed glottal flow parameter, the normalised amplitude quotient (NAQ). Simulated emotion portrayals were produced by 9 professional stage actors. Separated /a:/ vowel segments were inverse filtered and parameterised using NAQ. Statistical analyses showed significant differences among most of the emotions studied. Results also demonstrated clear gender differences. Inverse filtering, together with NAQ, was shown to be a promising method for the analysis of emotional content in continuous speech. Copyright © 2006 S. Karger AG, Basel
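The abstract does not define NAQ. As a rough illustration, the sketch below uses the definition commonly given in the literature: the peak-to-peak flow amplitude divided by the product of the magnitude of the negative peak of the flow derivative and the period length. The function name `naq`, the single-cycle input, and the first-difference derivative are assumptions made for this example, not details taken from the study.

```python
def naq(flow_cycle, fs):
    """Normalised amplitude quotient of one glottal flow cycle.

    flow_cycle: flow samples covering exactly one fundamental period
                (assumed to come from an inverse filtering step).
    fs:         sampling rate in Hz.
    """
    n = len(flow_cycle)
    if n < 2:
        raise ValueError("need at least two samples")
    # AC flow amplitude: peak-to-peak value of the flow pulse.
    f_ac = max(flow_cycle) - min(flow_cycle)
    # Magnitude of the steepest descent, using a first difference
    # scaled by fs as a simple derivative estimate.
    d_peak = max((flow_cycle[i] - flow_cycle[i + 1]) * fs for i in range(n - 1))
    if d_peak <= 0:
        raise ValueError("flow never decreases; no closing phase found")
    period = n / fs  # period length in seconds
    return f_ac / (d_peak * period)


# Usage: a triangular pulse with a slow opening and a fast closing phase.
pulse = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
value = naq(pulse, 1000.0)
```

Because both the amplitude and the time axis are normalised, the result does not depend on the sampling rate or on the absolute flow scale, which is what makes the quotient comparable across recordings.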
Alternative Measures of Phonation: Collision Threshold Pressure & Electroglottographic Spectral Tilt. Extra: Perception of Swedish Accents
, 2010
Abstract
The collision threshold pressure (CTP), i.e. the smallest amount of subglottal pressure needed for vocal fold collision, has been explored as a possible complement or alternative to the now commonly used phonation threshold pressure (PTP), i.e. the smallest amount of subglottal pressure needed to initiate and sustain vocal fold oscillation. In addition, the effects of vocal warm-up (Paper 1) and vocal loading (Paper 2) on the CTP and the PTP have been investigated. Results confirm previous findings that PTP increases with an increase in fundamental frequency (F0) of phonation and this is true also for CTP, which on average is about 4 cm H2O higher than the PTP. Statistically significant increases of the CTP and PTP after vocal loading were confirmed and after the vocal warm-up, the threshold pressures were generally lowered although these results were significant only for the females. The vocal loading effect was minor for the two singer subjects who participated in the experiment of Paper 2.
The GlottHMM Entry for Blizzard Challenge 2011: Utilizing Source Unit Selection in HMM-Based Speech Synthesis for Improved Excitation Generation
Abstract
This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2011. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that utilizes glottal inverse filtering for separating the vocal tract and the glottal source from the speech signal and models both components individually. In this year's entry, stabilized weighted linear prediction (SWLP) is used to yield more robust estimates of the vocal tract filter of the high-pitched female voice. After the inverse filtering, the resulting source signal is parameterized into excitation features and a glottal flow pulse library, consisting of a variety of different glottal flow pulses. In the synthesis stage, a unit selection scheme is used for reconstructing the source signal: by minimizing the target and concatenation costs, the best matching glottal flow pulses are selected from the pulse library in order to create a natural voice source. Finally, speech is synthesized by filtering the excitation signal with the vocal tract filter. Index Terms: speech synthesis, hidden Markov model, glottal inverse filtering, glottal flow pulse library, unit selection
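Selecting units by minimizing target and concatenation costs is usually solved with dynamic programming. The sketch below shows that generic search only. It makes no claim about how GlottHMM represents pulses or defines its costs: the function name `select_units`, the scalar "pulse" features, and both cost functions are hypothetical stand-ins.

```python
def select_units(targets, library, target_cost, concat_cost):
    """Viterbi-style search over a unit library.

    Returns one library index per target such that the sum of target
    costs plus concatenation costs between neighbours is minimal.
    """
    if not targets or not library:
        raise ValueError("targets and library must be non-empty")
    n_units = len(library)
    # best[i]: lowest cost of any path that ends in library unit i.
    best = [target_cost(targets[0], unit) for unit in library]
    backpointers = []
    for target in targets[1:]:
        new_best, pointers = [], []
        for cur in library:
            prev = min(range(n_units),
                       key=lambda p: best[p] + concat_cost(library[p], cur))
            new_best.append(best[prev] + concat_cost(library[prev], cur)
                            + target_cost(target, cur))
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace the cheapest path back from the final step.
    idx = min(range(n_units), key=lambda i: best[i])
    path = [idx]
    for pointers in reversed(backpointers):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path


# Usage with hypothetical one-dimensional pulse features.
library = [0.0, 1.0, 2.0]
targets = [0.1, 1.9]
distance = lambda a, b: abs(a - b)
smooth = select_units(targets, library, distance, lambda a, b: 10 * abs(a - b))
```

With a heavy concatenation cost the search prefers a sequence of similar units even when individual targets are matched less closely, which is the trade-off a unit selection scheme of this kind balances.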
Occupational voice -- Studying voice production and preventing voice problems with special emphasis on call-centre employees
, 2007
Changes in . . . and Subjective Voice Complaints in Call Center Customer-Service Advisors During One Working Day
, 2008
Abstract
The aim of this study was to investigate how different acoustic parameters, extracted both from speech pressure waveforms and glottal flows, can be used in measuring vocal loading in modern working environments and how these parameters reflect the possible changes in the vocal function during a working day. In addition, correlations between objective acoustic parameters and subjective voice symptoms were addressed. The subjects were 24 female and 8 male customer-service advisors, who mainly use the telephone during their working hours. Speech samples were recorded from continuous speech four times during a working day and voice symptom questionnaires were completed simultaneously. Among the various objective parameters, only F0 showed a statistically significant increase for both genders. No correlations between the changes in objective and subjective parameters appeared. However, the results encourage researchers within the field of occupational voice use to apply versatile measurement techniques in studying occupational voice loading.

