Abstract — The last decade has witnessed the emergence of a new breed of human-computer interfaces that combine several human language technologies to enable humans to converse with computers using spoken dialogue for information access, creation, and processing. In this paper, we introduce the nature of these conversational interfaces and describe the underlying human language technologies on which they are based. After summarizing some of the recent progress in this area around the world, we discuss development issues faced by researchers creating these kinds of systems and present some of the ongoing and unmet research challenges in this field.

Keywords—Conversational interfaces, spoken dialogue systems, speech understanding systems.