Subword-based Approaches for Spoken Document Retrieval
, 2000
Cited by 40 (0 self)
This thesis explores approaches to the problem of spoken document retrieval (SDR), which is the task of automatically indexing and then retrieving relevant items from a large collection of recorded speech messages in response to a user-specified natural language text query. We investigate the use of subword unit representations for SDR as an alternative to words generated by either keyword spotting or continuous speech recognition. Our investigation is motivated by the observation that word-based retrieval approaches either have to know a priori which keywords to search for, or require a very large recognition vocabulary in order to cover the contents of growing and diverse message collections. The use of subword units in the recognizer constrains the size of the vocabulary needed to cover the language, and the use of subword units as indexing terms allows for the detection of new user-specified query terms during retrieval. Four
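As an illustration of indexing with subword units rather than words, here is a minimal sketch in Python. It assumes that each message has already been transcribed into a sequence of phone symbols and that the indexing terms are overlapping phone trigrams; the abstract above does not say which subword units the thesis uses, so the unit choice, the function names, and the toy transcriptions are all hypothetical.

```python
from collections import Counter, defaultdict

def phone_ngrams(phones, n=3):
    """Return the overlapping n-grams of a phone sequence as tuples."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]

def build_index(documents, n=3):
    """Map each phone n-gram to a Counter of the documents containing it."""
    index = defaultdict(Counter)
    for doc_id, phones in documents.items():
        for gram in phone_ngrams(phones, n):
            index[gram][doc_id] += 1
    return index

def search(index, query_phones, n=3):
    """Rank documents by how many query n-gram occurrences they contain."""
    scores = Counter()
    for gram in phone_ngrams(query_phones, n):
        for doc_id, count in index.get(gram, {}).items():
            scores[doc_id] += count
    return scores.most_common()

# Hypothetical phone transcriptions of two recorded messages.
documents = {
    "msg1": "dh ax w eh dh er ih z k ow l d".split(),
    "msg2": "s t aa k m aa r k ax t n uw z".split(),
}
index = build_index(documents)

# The query word is converted to phones and matched by n-gram overlap,
# so it does not need to be in any recognition vocabulary.
results = search(index, "w eh dh er".split())
```

Because matching happens on phone n-grams, a query term that was never in a word-level vocabulary can still be found, which is the property the abstract highlights.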
Supporting access to large digital oral history archives
- Proceedings of the Joint Conference on Digital Libraries
, 2002
Cited by 14 (4 self)
This paper describes our experience with the creation, indexing, and provision of access to a very large archive of videotaped oral histories: 116,000 hours of digitized interviews in 32 languages from 52,000 survivors, liberators, rescuers, and witnesses of the Nazi Holocaust. It goes on to identify a set of critical research issues that must be addressed if we are to provide full and detailed access to collections of this size: issues in user requirement studies, automatic speech recognition, automatic classification, segmentation, summarization, retrieval, and user interfaces. The paper ends by inviting others to discuss use of these materials in their own research.
Audio Indexing and Retrieval of Complete Broadcast News Shows
, 2000
Cited by 6 (2 self)
This paper describes a system for retrieving relevant portions of complete broadcast news shows starting with only the audio data. A novel system of automatically detecting and removing commercials is described and shown to increase the performance of the system whilst also reducing the computational effort required. The sophisticated large vocabulary speech recogniser which produces the high-quality transcriptions and the window-based retrieval system with post-merging are also described. Results are
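A window-based retrieval scheme with post-merging can be sketched roughly as follows. This is only an illustration under stated assumptions: the transcript is taken to be a list of (time in seconds, word) pairs, windows are 30 seconds long with a 15 second shift, a window's score is a simple count of query-term matches, and merging joins retrieved windows that overlap or touch. None of these parameters or function names come from the paper.

```python
def make_windows(words, length=30.0, shift=15.0):
    """Cut (time_sec, word) pairs into overlapping fixed-length time windows."""
    if not words:
        return []
    end_time = words[-1][0]
    windows, start = [], 0.0
    while start <= end_time:
        text = [w for t, w in words if start <= t < start + length]
        windows.append((start, start + length, text))
        start += shift
    return windows

def score(window_words, query_terms):
    """Count how many words in the window are query terms."""
    return sum(1 for w in window_words if w in query_terms)

def retrieve(words, query_terms, length=30.0, shift=15.0):
    """Score every window, keep the matching ones, then merge neighbours."""
    hits = [(s, e, score(text, query_terms))
            for s, e, text in make_windows(words, length, shift)]
    hits = [h for h in hits if h[2] > 0]
    merged = []
    for s, e, sc in hits:  # hits are already ordered by start time
        if merged and s <= merged[-1][1]:
            # Post-merging: extend the previous segment, keep the best score.
            ps, pe, psc = merged[-1]
            merged[-1] = (ps, max(pe, e), max(psc, sc))
        else:
            merged.append((s, e, sc))
    return merged

# Hypothetical time-stamped transcript of one show.
transcript = [(2.0, "election"), (20.0, "results"),
              (50.0, "weather"), (100.0, "election")]
segments = retrieve(transcript, {"election"})
# segments == [(0.0, 30.0, 1), (75.0, 120.0, 1)]
```

Merging keeps the user from seeing the same stretch of audio returned twice when adjacent overlapping windows both match the query.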
A System for the Retrieval of Italian Broadcast News
- Speech Communication
, 2000
Cited by 5 (2 self)
This paper presents a prototype for the retrieval of Italian broadcast news, which has been developed at ITC-irst. The architecture employs a speech recognition engine for the automatic transcription of audio news. Moreover, it features document indexing based on part-of-speech tagging of text coupled with morphological analysis, and query expansion exploiting the Italian WordNet thesaurus. Query-document matching is based on a statistical term weighting scheme. The system was tested on a 203-story collection of audio news, augmented with 9,500 newspaper articles. The evaluation was based on a "known item" retrieval task and aimed at evaluating the impact of speech recognition errors and query expansion on retrieval performance.
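The combination of query expansion and statistical term weighting can be illustrated with a small Python sketch. It makes several assumptions that are not in the abstract: a hand-made synonym dictionary stands in for the Italian WordNet thesaurus, a smoothed TF-IDF score stands in for the paper's unspecified weighting scheme, and the three toy stories and all function names are hypothetical.

```python
import math
from collections import Counter

def idf(term, docs):
    """Smoothed inverse document frequency of a term over the collection."""
    df = sum(1 for words in docs.values() if term in words)
    return math.log((len(docs) + 1) / (df + 1)) + 1.0

def expand(query, synonyms):
    """Append thesaurus synonyms of each query term, without duplicates."""
    expanded = list(query)
    for term in query:
        expanded.extend(s for s in synonyms.get(term, []) if s not in expanded)
    return expanded

def rank(query, docs, synonyms):
    """Return ids of matching documents, best TF-IDF score first."""
    terms = expand(query, synonyms)
    scores = {}
    for doc_id, words in docs.items():
        tf = Counter(words)
        total = sum(tf[t] * idf(t, docs) for t in terms)
        if total > 0:
            scores[doc_id] = total
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical transcribed stories (already lemmatized) and a toy thesaurus.
docs = {
    "story1": ["governo", "approva", "legge"],
    "story2": ["squadra", "vince", "partita"],
    "story3": ["esecutivo", "discute", "bilancio"],
}
synonyms = {"governo": ["esecutivo"]}

without_expansion = rank(["governo"], docs, {})
with_expansion = rank(["governo"], docs, synonyms)
```

In this toy collection the expanded query also reaches the story that uses a synonym of the query term, which is the kind of effect the evaluation described above sets out to measure.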
Towards Automatic Real Time Preparation of On-Line Video Proceedings for Conference Talks and Presentations
- Paper presented at the 34th Hawaii International Conference on System Sciences
, 2001
Cited by 3 (0 self)
How many times did you miss a conference talk on a parallel track and wish you had a second chance to see it? Or did you just want to see a few talks from a conference you did not attend? Video proceedings, which contain videos of all the conference talks, would be of great value in these cases and in many others. With recent progress in digital video, streaming technology, large storage, the Internet, and especially video indexing and retrieval technology, video proceedings have finally become a reality. The key challenge is to create them efficiently and to provide the user with easy, intuitive, and rapid access to the talks and the snippets of video that they are looking for. This paper describes an application that allows nearly automatic, real-time creation of video proceedings. All the talks are captured on video and are automatically indexed by speech recognition and video analysis tools. The abstracts and speakers' biographies are extracted from the text proceedings and converted to web pages. Free text search in speech and efficient multi-view video browsing are combined with a table of contents to compose fully searchable and browsable video proceedings. The paper covers different aspects of the problem, an overview of the CueVideo system, and examples from two conferences that we have processed using this system.
ISSUES IN SPEECH-BASED RETRIEVAL OF VIDEO
Cited by 1 (0 self)
This paper discusses issues arising when applying the IBM Audio-Indexing System to retrieval of video. Issues discussed include the relationship between speech transcription accuracy and retrieval performance, query processing schemes, and the critical problem of mapping between cues in speech and the relevant video shots. The temporal relationship between the occurrence of cues in speech transcripts and relevant shots is quantified and then simple schemes for performing this mapping are described and evaluated. Experiments demonstrate the promise of more sophisticated schemes involving up-front video ranking and one possible implementation is discussed. Techniques are evaluated using the TREC-2002 Video Track queries and corpus, comprising a total of 68.45 hours of video.
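One simple way to map speech cues to video shots is to credit every shot that overlaps a fixed time window around each cue. The Python sketch below shows that idea only. The window bounds (0 seconds before, 10 seconds after the cue), the shot boundaries, and the function name are assumptions made for illustration; the abstract does not state the schemes or offsets the paper actually evaluates.

```python
def shots_near_cues(shots, cue_times, before=0.0, after=10.0):
    """Rank shots by how many cue windows [t - before, t + after] they overlap.

    shots is a list of (shot_id, start_sec, end_sec) tuples;
    cue_times holds the times at which query terms occur in the transcript.
    """
    scores = {}
    for shot_id, start, end in shots:
        overlaps = sum(1 for t in cue_times
                       if start <= t + after and end >= t - before)
        if overlaps:
            scores[shot_id] = overlaps
    # Highest count first; ties are broken by shot id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical shot boundaries and cue times, in seconds.
shots = [("s1", 0.0, 8.0), ("s2", 8.0, 20.0), ("s3", 20.0, 45.0)]
cues = [5.0, 12.0]
ranked = shots_near_cues(shots, cues)
# ranked == [("s2", 2), ("s1", 1), ("s3", 1)]
```

An asymmetric window like this one encodes the assumption that relevant pictures tend to follow the spoken cue; the measured temporal relationship mentioned in the abstract would determine the real bounds.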

