Incorporating speaker and discourse features into speech summarization
- In: Proc. of the HLT-NAACL 2006, 2006
Cited by 38 (12 self)
We have explored the usefulness of incorporating speech and discourse features in an automatic speech summarization system applied to meeting recordings from the ICSI Meetings corpus. By analyzing speaker activity, turn-taking and discourse cues, we hypothesize that such a system can outperform solely text-based methods inherited from the field of text summarization. The summarization methods are described, two evaluation methods are applied and compared, and the results clearly show that utilizing such features is advantageous and efficient. Even simple methods relying on discourse cues and speaker activity can outperform text summarization approaches.
Content-based Access to Spoken Audio
- IEEE Signal Processing Magazine, 2005
Cited by 26 (1 self)
This article describes approaches to content-based access to spoken audio with a qualitative and tutorial emphasis. We describe how the analysis, retrieval and delivery phases contribute to making spoken audio content more accessible, and we outline a number of outstanding research issues. We also discuss the main application domains and try to identify important issues for future developments. The structure of the article is based on a general system architecture for content-based access, which is depicted in Figure 1. Although the tasks within each processing stage may appear unconnected, the interdependencies and the sequence with which they take place vary
Extrinsic Summarization Evaluation: A Decision Audit Task
Cited by 19 (7 self)
In this work we describe a large-scale extrinsic evaluation of automatic speech summarization technologies for meeting speech. The particular task is a decision audit, wherein a user must satisfy a complex information need, navigating several meetings in order to gain an understanding of how and why a given decision was made. We compare the usefulness of extractive and abstractive technologies in satisfying this information need, and assess the impact of automatic speech recognition (ASR) errors on user performance. We employ several evaluation methods for participant performance, including post-questionnaire data, human subjective and objective judgments, and an analysis of participant browsing behaviour.
Speech summarization without lexical features for Mandarin broadcast news
- Association for Computational Linguistics, 2007
Cited by 19 (1 self)
We present the first known empirical study on speech summarization without lexical features for Mandarin broadcast news. We evaluate acoustic, lexical and structural features as predictors of summary sentences. We find that the summarizer yields good performance, with an average F-measure of 0.5646, even when using only the combination of acoustic and structural features, which are independent of lexical features. In addition, we show that structural features are superior to lexical features and that our summarizer performs surprisingly well, with an average F-measure of 0.3914, when using only acoustic features. These findings enable us to summarize speech without placing a stringent demand on speech recognition accuracy.
Improving lecture speech summarization using rhetorical information
- 2007
Cited by 13 (2 self)
We propose a novel method of extractive summarization of lecture speech based on unsupervised learning of its rhetorical structure. We present empirical evidence showing that rhetorical structure is the underlying semantics which is then rendered in linguistic and acoustic/prosodic forms in lecture speech. We present a first thorough investigation of the relative contribution of linguistic versus acoustic features and show that, at least for lecture speech, what is said is more important than how it is said. We base our experiments on conference speeches and their corresponding presentation slides, because the slides are a faithful description of the rhetorical structure of the speeches. We find that discourse features from broadcast news are not applicable to lecture speech. By using rhetorical structure information in our summarizer, its performance reaches 67.87% ROUGE-L F-measure at 30% compression, surpassing all previously reported results. The performance is also superior to the 66.47% ROUGE-L F-measure of the baseline summarizer without rhetorical information. We also show that, despite a 29.7% character error rate in speech recognition, extractive summarization performs relatively well, underlining the fact that spontaneity in lecture speech does not affect the central meaning of lecture speech.
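For readers unfamiliar with the ROUGE-L F-measure quoted in this abstract, the following is a minimal sketch of the metric for a single candidate and a single reference, based on the longest common subsequence (LCS) of their tokens. The function names, the whitespace tokenization, and the `beta` default are assumptions made for illustration; the paper's own evaluation setup (for example character-level units for Chinese, or summary-level aggregation) is not described in the abstract and may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """ROUGE-L F-measure for one candidate/reference pair.

    Tokenization by whitespace is an illustrative assumption.
    """
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    lcs = lcs_length(cand_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)
```

With `beta=1.0` the score is the harmonic mean of LCS-based precision and recall, so a candidate identical to the reference scores 1.0 and a candidate sharing no tokens scores 0.0.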
Multi-stage compaction approach to broadcast news summarisation
- In: Proceedings of Eurospeech 2005, 2005
Cited by 10 (1 self)
This paper presents a fully automatic, multi-stage compaction approach to broadcast news summarisation, targeting transcripts from automatic speech recognition (ASR) systems. It employs a network of multi-layer perceptrons to remove incorrectly transcribed words based on confidence scores, and to select significant chunks at multiple stages based on tf.idf scores and named entity frequency. The resulting summaries are assessed using a combination of a cross-comprehension test and a fluency test, and are finally compared with an automatic evaluation scheme. The experimental results show the approach can produce summaries with good information content.
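The tf.idf scores mentioned in this abstract can be illustrated with a small sketch. It computes plain tf.idf term weights over a set of tokenized documents and ranks candidate chunks by their mean weight. This is only the scoring ingredient: the paper feeds such scores, together with named entity frequency, into multi-layer perceptrons, which are not reproduced here. The function names and the unsmoothed `log(N / df)` form are assumptions for illustration.

```python
import math
from collections import Counter


def tfidf_scores(documents):
    """Return one {term: tf * idf} dict per document.

    documents: list of token lists. idf is log(N / df), unsmoothed
    (an illustrative choice, not taken from the paper).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in documents:
        tf = Counter(tokens)
        scores.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return scores


def rank_chunks(chunks, term_weights):
    """Sort chunks (token lists) by mean term weight, highest first."""
    def score(chunk):
        if not chunk:
            return 0.0
        return sum(term_weights.get(t, 0.0) for t in chunk) / len(chunk)
    return sorted(chunks, key=score, reverse=True)
```

Terms that occur in every document receive a weight of zero under this form, so chunks made up of common words fall to the bottom of the ranking.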
Chinese Spoken Document Summarization Using Probabilistic Latent Topical Information
- In: Proc. ICASSP 2006
Cited by 7 (4 self)
The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.
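The idea of treating each sentence as a model that generates the document, then ranking sentences by likelihood, can be sketched as follows. The sketch assumes a simple unigram mixture: each document word is generated either from the sentence's own word distribution or from a background distribution, with interpolation weight `lam`. The mixture form, the use of the document itself as the background, and all names are assumptions for illustration; the abstract does not specify the model's details, and the relevance model component is omitted.

```python
import math
from collections import Counter


def rank_sentences_by_likelihood(sentences, lam=0.5):
    """Rank sentences by the log-likelihood of the whole document
    under each sentence's smoothed unigram model.

    sentences: list of token lists forming one document.
    lam: weight of the sentence model, must be in [0, 1).
    Returns sentence indices, most likely generator first.
    """
    doc_tokens = [t for sent in sentences for t in sent]
    doc_counts = Counter(doc_tokens)
    total = len(doc_tokens)
    # Background model estimated from the document itself (assumption).
    background = {t: c / total for t, c in doc_counts.items()}

    scored = []
    for idx, sent in enumerate(sentences):
        sent_counts = Counter(sent)
        log_lik = 0.0
        for term, count in doc_counts.items():
            p_sent = sent_counts[term] / len(sent) if sent else 0.0
            log_lik += count * math.log(lam * p_sent + (1 - lam) * background[term])
        scored.append((log_lik, idx))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]
```

Sentences whose words cover the frequent terms of the document explain it best and rank highest; the background term keeps the likelihood finite for document words that a sentence does not contain.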
A Pilot Study of Opinion Summarization in Conversations
Cited by 6 (0 self)
This paper presents a pilot study of opinion summarization on conversations. We create a corpus containing extractive and abstractive summaries of speakers' opinions towards a given topic using 88 telephone conversations. We adopt two methods to perform extractive summarization. The first one is a sentence-ranking method that linearly combines scores measured from different aspects including topic relevance, subjectivity, and sentence importance. The second one is a graph-based method, which incorporates topic and sentiment information, as well as additional information about sentence-to-sentence relations extracted based on dialogue structure. Our evaluation results show that both methods significantly outperform the baseline approach that extracts the longest utterances. In particular, we find that incorporating dialogue structure in the graph-based method contributes to the improved system performance.
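The sentence-ranking method and the longest-utterance baseline described in this abstract can be sketched in a few lines. The per-sentence scores are assumed to be precomputed; the score names (`topic`, `subjectivity`, `importance`), the weights, and the data layout are hypothetical and chosen only to mirror the three aspects the abstract lists.

```python
def rank_by_linear_combination(sentences, weights):
    """Sort sentences by a weighted sum of their aspect scores.

    sentences: list of dicts with "text" and "scores" keys (hypothetical layout).
    weights: {aspect_name: weight}; missing aspect scores count as 0.
    """
    def combined(sentence):
        scores = sentence["scores"]
        return sum(w * scores.get(name, 0.0) for name, w in weights.items())
    return sorted(sentences, key=combined, reverse=True)


def longest_utterances(sentences, k):
    """Baseline: pick the k utterances with the most whitespace tokens."""
    by_length = sorted(sentences, key=lambda s: len(s["text"].split()), reverse=True)
    return by_length[:k]
```

In practice the weights would be tuned on held-out data; the abstract does not say how the authors set them.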
Word Topical Mixture Models for Extractive Spoken Document Summarization
Cited by 5 (2 self)
This paper considers extractive summarization of Chinese spoken documents. In contrast to conventional approaches, we attempt to deal with the extractive summarization problem under a probabilistic generative framework. A word topical mixture model (w-TMM) was proposed to explore the co-occurrence relationship between words of the language. Each sentence of the spoken document to be summarized was treated as a composite word TMM model for generating the document, and sentences were ranked and selected according to their likelihoods. Various kinds of modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with the other conventional summarization approaches. The experiments were performed on the Chinese broadcast news collected in Taiwan. Noticeable performance gains were obtained. The proposed summarization technique has also been properly integrated into our prototype system for voice retrieval of broadcast news via mobile devices.
Effect of Recognition Errors on Text Clustering
- 2004
Cited by 3 (3 self)
This paper presents clustering experiments performed over noisy texts (i.e. texts that have been extracted through an automatic process like character or speech recognition). The effect of recognition errors is investigated by comparing clustering results performed over both clean (manually typed data) and noisy (automatic speech transcriptions) versions of the same speech recording corpus. (IDIAP–RR 04-82)