22 citations found.
A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, pages 1005--1008, 1996.


This paper is cited in the following contexts:
Natural Language Queries on Natural Language Data: .. - Armstrong, Clark, .. (2003)   (2 citations)  (Correct)

....[Li and Roth 2001] identification of non recursive syntactic constituents) which includes preliminary steps of named entity recognition. The shallow dialogue processing itself aims at the following interrelated subtasks (see also Figure 1): 1. dialogue segmentation into individual utterances [Stolcke and Shriberg 1996] and into episodes that focus on a coherent topic [Choi 2000]; dialogue tagging with discourse acts and adjacency pairs detection [Stolcke, Ries et al. 2000]: each utterance is assigned a tag that identifies its discourse function, then pairs of utterances such as question answer, ....

Stolcke, A.; Shriberg, E.: Automatic Linguistic Segmentation of Conversational Speech. The 4th International conference on Spoken Language Processing (ICSLP96) , Philadelphia, 1996.


Placing Structuring Elements In A Word Sequence For.. - Weilhammer, Ruske   (Correct)

....We give a detailed description of the algorithm and present first results of a system trained on a small corpus. 1 INTRODUCTION Predicting the positions of structuring elements in written or spontaneous speech by using statistical methods has been very popular and successful in recent years. [5] and [6] tried to predict sentence boundaries and disfluencies, [3] worked on the prediction of prosodic information. In both cases the task was to find the position of structural elements with a specific meaning. Class based language models are widely used in speech recognition. Such models use ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, Philadelphia, USA, volume 2, pages 1005--1008, 1996.


Utterance Units in Spoken Dialogue - Traum, Heeman (1997)   (5 citations)  (Correct)

....consensus as to what defines an utterance unit, most attempts make use of one or more of the following factors. Speech by a single speaker, speaking without interruption by speech of the other, constituting a single Turn (e.g. [7, 8, 22, 27]). Has syntactic and/or semantic completion (e.g. [7, 22, 21, 32]). Defines a single speech act (e.g. [22, 16, 20]). Is an intonational phrase (e.g. [12, 9, 7, 11]). Separated by a pause (e.g. [22, 11, 29, 33]). While the turn has the great advantage of having easily recognized boundaries, 3 there are several difficulties with treating it as a ....

Andreas Stolcke and Elizabeth Shriberg, `Automatic linguistic segmentation of conversational speech', in Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP-96), (October 1996).


Automatic Detection Of Semantic Boundaries - Cettolo, Corazza (1997)   (Correct)

....to occur at SBs, but in a different way compared to how they occur in the segment interior. In fact, in correspondence of SBs they are usually surrounded by silent pauses. A rule based approach is discussed in [12] obtaining good results, while a completely statistical approach is presented in [8], where only lexical information is considered. Interesting results in SB prediction are obtained by using an n gram LM also including some extra linguistic phenomena. Finally, an approach based on multi layer perceptron is presented in [2] for a similar problem, the prediction of syntactic ....

....model performance also is good. 3. CLASSIFIER The main advantage of statistical approaches over rule based ones is that automatic training algorithms allow the construction of the classifier from problem related data. Therefore, they are easier to use on new domains. The results presented in [8, 2] show how such statistical approaches perform well. Nevertheless, when a statistical approach is used, it is often very difficult to find out which kind of information is useful in some situation, and then to decide which information source is better. On the contrary, Binary Decision Trees (BDTs) ....

[Article contains additional citation context not shown here]

A. Stolcke and E. Shriberg. Automatic Linguistic Segmentation of Conversational Speech. In Proc. of ICSLP, Philadelphia, USA, 1996.


Automatic Detection Of Semantic Boundaries Based On.. - Cettolo, Falavigna (1998)   (1 citation)  (Correct)

.... however, the presence and the length of filled pauses seem to have a significant correlation with semantic boundaries [12]. An overview on the use of prosody in speech based systems can be found in [7]. For SB detection, the main source of information lies on the linguistic content of the sentence [10, 2, 14, 9, 4]. However, the detector can use only the recognizer output, in addition to the acoustic parameters, and then it cannot be based on linguistic information requiring sophisticated processing. Word n grams can effectively capture statistical relations between words and SBs [10]. In fact, once a ....

....sentence [10, 2, 14, 9, 4]. However, the detector can use only the recognizer output, in addition to the acoustic parameters, and then it cannot be based on linguistic information requiring sophisticated processing. Word n grams can effectively capture statistical relations between words and SBs [10]. In fact, once a word SB n gram model is estimated on a training set, simple dynamic programming algorithms can find the best segmentation or the k best segmentations of a test input sentence. In this paper, a detailed description of some techniques aimed at extracting acoustic and lexical ....

A. Stolcke and E. Shriberg. Automatic Linguistic Segmentation of Conversational Speech. In ICSLP, Philadelphia, USA, 1996.


High Performance Segmentation of Spontaneous Speech Using Part .. - Marsal Gavaldà (1997)   (1 citation)  (Correct)

....is thus how to correctly segment an utterance into clauses. The segmentation procedure described in Lavie et al. (1996) uses a combination of acoustic information, statistical calculation of boundary trigrams, some highly indicative keywords and also some heuristics from the parser itself. Stolcke and Shriberg (1996) studied the relevance of several word level features for segmentation performance on the Switchboard corpus (see Godfrey et al. 1992). Their best results were achieved by using part of speech n grams, enhanced by a couple of trigger words and biases. Another, more acoustics based approach for ....

....several neural nets yield extremely good performance. While Lavie et al. (1996) just report an improvement in the end to end performance of the Janus speech to speech translation system when using their segmentation method but do not give details the performance of the segmentation method itself, Stolcke and Shriberg (1996) are more explicit and provide precision and recall results. Moreover Lavie et al. (1996) deal with Spanish input whereas Stolcke and Shriberg (1996), like us, drew their data from the Switchboard corpus. Type Harmful Reason Context false positive no trigger word to work b and when I had ....

[Article contains additional citation context not shown here]

A. Stolcke and E. Shriberg. 1996. Automatic linguistic segmentation of conversational speech. In Proceedings of the ICSLP-96, pp. 1005--1008. K. Takagi and S. Itahashi. 1996. Segmentation of spoken dialogue by interjections, disfluent utterances and pauses. In Proceedings of the ICSLP-96, pp. 697--700.


Is The Speaker Done Yet? - Faster And More (2002)   Self-citation (Stolcke Shriberg)   (Correct)

No context found.

A. Stolcke and E. Shriberg, "Automatic linguistic segmentation of conversational speech", in H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, Oct. 1996.


Is The Speaker Done Yet? - Faster And More (2002)   Self-citation (Stolcke Shriberg)   (Correct)

....(HMM) where the word event (word non EOU or word EOU) pairs correspond to states and the words to observations, with the transition probabilities given by the N gram language model. Then, the forward backward probabilities could be used to obtain the probability of EOU in the current boundary [12]. In our case, however, we are restricted to using only past information. Therefore, we use only N gram probabilities conditioned on the word history. 2.3. Knowledge source combination To make use of both prosodic and word information, we compute the following score at each boundary for each ....

A. Stolcke and E. Shriberg, "Automatic linguistic segmentation of conversational speech", in H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, Oct. 1996.


Dependency Language Modeling - Stolcke, Chelba, Engle, Jimenez.. (1997)   (8 citations)  Self-citation (Stolcke)   (Correct)

....on complete sentences or utterances. Unfortunately, the Switchboard test set is not commonly segmented in that way, due to the fact that spontaneous speech has not simple cues for utterance boundaries; the automatic linguistic segmentation of spontaneous speech is the subject of ongoing research [27]. To work around this problem we relied on the hand segmentation of the word level transcripts that had been done as part of the treebanking effort [18]. A total of 1.4 million words from the training corpus had been annotated for linguistic segment (utterance) boundaries, from which we drew the ....

....i.e. incorporating only N gram constraints. At this point we ran into the memory and filesize demands of the current MEMT, as described in Section 3.3.4. Specifically, the event file size for a model 3 To extend the linguistic segmentation to the full training corpus, the automatic segmenter of [27] was used. 17 Language model WER 2 4 backoff bigram 48.0 2 4 ME trigram 48.3 2 4 2 backoff trigram 46.3 1 1 2 backoff trigram 46.2 2 4 2 ME trigram 48.4 2 4 2 ME trigram (no smoothing) 48.7 Table 3: N gram results for backoff and ME models. with the same number of parameters as a ....

Andreas Stolcke and Elizabeth Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, volume 2, pages 1005--1008, Philadelphia, PA, October 1996.


Combining Words And Prosody For Information.. - Hakkani-Tür, Tür.. (1999)   (8 citations)  Self-citation (Stolcke Shriberg)   (Correct)

....prosodic information for information extraction. The general framework for combining lexical and prosodic cues for tagging speech with various kinds of hidden structural information is a further development of our earlier work on detecting sentence boundaries and disfluencies in spontaneous speech [16, 14, 12, 15]. 2. PROSODIC MODELING 2.1. Data For all tasks, the prosodic model used a wide range of features that were automatically extracted from about 70 hours (700 thousand words) of the Linguistic Data Consortium (LDC) 1997 Broadcast News (BN) corpus. Sentence boundaries were automatically determined ....

....only on the hidden variable in question (boundary no boundary, NE tag) but not on the words. We discuss various approaches around this strong assumption in [15]. 3.1. Sentence Segmentation For sentence segmentation, we relied on a hidden event N gram language model (LM) of the type used in [14]. The states of the HMM consist of the end of sentence status of each word (boundary or no boundary) plus preceding words and possibly boundary tokens to fill up the N gram context (N = 4 in our experiments). Transition probabilities are given by the N gram probabilities. HMM observations consist ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, 1996.


Dependency Language Modeling - Stolcke, Chelba, Engle, Jimenez.. (1997)   (8 citations)  Self-citation (Stolcke)   (Correct)

....evaluated on complete sentences or utterances. Unfortunately, the Switchboard corpus as supplied to us was not segmented in that way, since spontaneous speech has no simple cues for utterance boundaries. The automatic linguistic segmentation of spontaneous speech is the subject of ongoing research [30]. To work around this problem we relied on the hand segmentation of the word level transcripts that had been done as part of the treebanking effort [20]. A total of 1.4 million words from the training corpus had been annotated for linguistic segment (utterance) boundaries, from which we drew the ....

.... developed at last year's workshop (LM95) was competitive with the standard trigram [32]. That model used a somewhat different smoothing method, which might account for the difference; in any case, 2 To extend the linguistic segmentation to the full training corpus, the automatic segmenter of [30] was used. 25 Language model WER 2 4 backoff bigram 48.0 2 4 ME bigram 48.3 2 4 2 backoff trigram 46.3 1 1 2 backoff trigram 46.2 2 4 2 ME trigram 48.4 2 4 2 ME trigram (no smoothing) 48.7 Table 12: N gram results for backoff and ME models. Model Link N gram constraints Link WER Types ....

Andreas Stolcke and Elizabeth Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, volume 2, pages 1005--1008, Philadelphia, October 1996.


Modeling The Prosody Of Hidden Events For Improved.. - Stolcke, Shriberg.. (1999)   (11 citations)  Self-citation (Stolcke Shriberg)   (Correct)

....as a knowledge source for word recognition. Prosodic cues have been studied mainly for the purpose of automatic detection of disfluencies [10, 12] and sentence boundaries [8]. The correlation between hidden events and word cues has likewise been exploited, for detecting both sentence boundaries [14, 8] and disfluencies [2, 6, among others], although recent work has also shown that speech language models can be improved by incorporating hidden events into the model [5]. For the present work we reused prosodic and language models of hidden events previously developed for automatic detection from ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, 1996.


A Prosody-Only Decision-Tree Model For Disfluency Detection - Shriberg, Bates, Stolcke (1997)   (9 citations)  Self-citation (Stolcke Shriberg)   (Correct)

.... that speakers hesitate before less predictable words; thus, transition probabilities should be dynamically adjusted in the vicinity of hesitations [9]. Automatic detection of disfluencies could also benefit higher level modeling, for example, the automatic segmentation of speech into sentences [11], and the modeling of discourse or topic structure [13]. 1.2. Why use prosody Various approaches to automatic disfluency detection have been proposed in past work [8, 1, 7, 4]. These studies have focused on task oriented dialog and have used a combination of lexical and prosodic features. ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, 1996.


Automatic Detection Of Sentence Boundaries And Disfluencies Based .. - Stolcke (1998)   (10 citations)  Self-citation (Stolcke Shriberg)   (Correct)

....extraction, it is important to find the location and extent of disfluencies (including self repairs) so that a speaker's intended meaning can be inferred. We will refer to sentence boundaries and disfluencies collectively as our target events. Prior work on utterance boundary detection [8, 12] as well as on disfluency detection [5, 10] has addressed this problem, but not in a completely realistic framework. Previous work has assumed either a correct word sequence, or knowledge of the word boundaries. In reality, word information is not known, but has to be hypothesized using a speech ....

....During testing, the model can be used as a hidden Markov model (HMM) in which the word event pairs correspond to states and the words to observations, with the transition probabilities given by the N gram model. The model is a generalization of the hidden segment boundary language model used in [12] where the number of event types and the context length can be arbitrary. Given a word sequence, a forward backward dynamic programming algorithm is used to compute the posterior probability $P_{\mathrm{LM}}(E_i \mid W)$ of an event $E_i$ at position $i$. We trained a word event N gram model from 1.2M words of ....

[Article contains additional citation context not shown here]

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, 1996.


Dialog Act Modeling for Conversational Speech - Stolcke, Shriberg, Bates.. (1998)   (9 citations)  Self-citation (Stolcke Shriberg)   (Correct)

.... The raw Switchboard data is not segmented in a linguistically consistent way; we therefore made use of a version that had been hand segmented at the utterance level (Meteer & others 1995). Automatic segmentation of spontaneous speech is an open research problem in its own right (Mast et al. 1996; Stolcke & Shriberg 1996), but we decided not to confound the DA detection task with the additional problems introduced by automatic segmentation. We chose to follow a recent standard for shallow discourse structure annotation, the Discourse Annotation and Markup System of Labeling (DAMSL) tag set, which was recently ....

Stolcke, A., and Shriberg, E. 1996. Automatic linguistic segmentation of conversational speech. In ICSLP-96, 1005--1008.


Combining Words and Speech Prosody for Automatic.. - Stolcke, Shriberg, .. (1999)   (3 citations)  Self-citation (Stolcke Shriberg)   (Correct)

....this study and in relation to other work. The general framework for combining lexical and prosodic cues for tagging speech with various kinds of hidden structural information is a further development of our earlier work on sentence segmentation and disfluency detection for spontaneous speech [10, 12, 13]. 2. Approach Topic segmentation in the paradigm used by us and others [15] proceeds in two phases. In the first phase, the input is divided into contiguous strings of words assumed to belong to one topic each. We refer to this step as chopping. For example, in textual input, the natural units ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In H. T. Bunnell and W. Idsardi, editors, Proc. ICSLP, vol. 2, pp. 1005--1008, Philadelphia, 1996.


Dialog Act Modeling for Conversational Speech - Stolcke, Shriberg, Bates.. (1998)   (9 citations)  Self-citation (Stolcke Shriberg)   (Correct)

.... The raw Switchboard data is not segmented in a linguistically consistent way; we therefore made use of a version that had been hand segmented at the utterance level (Meteer & others 1995). Automatic segmentation of spontaneous speech is an open research problem in its own right (Mast et al. 1996; Stolcke & Shriberg 1996), but we decided not to confound the DA detection task with the additional problems introduced by automatic segmentation. We chose to follow a recent standard for shallow discourse structure annotation, the Dialog Act Markup in Several Layers (DAMSL) tag set, which was recently designed by the ....

Stolcke, A., and Shriberg, E. 1996. Automatic linguistic segmentation of conversational speech. In Proc. ICSLP, 1005--1008.


Modeling Linguistic Segment And Turn Boundaries For N-Best.. - Stolcke (1997)   (3 citations)  Self-citation (Stolcke)   (Correct)

....to the most likely segmentation of a word sequence according to the language model. This is the basis of a simple automatic segmentation algorithm, and can be used to segment spontaneous speech transcripts into linguistic utterance units where hand segmented transcripts are not available [14]. An approximate version of the hidden segmentation language model that does not require the forward algorithm has been used previously to study the effect of segmentation on language model perplexity [5]. 3. N BEST LIST RESCORING To apply the hidden segmentation model to the output of a speech ....

....words were available in hand segmented form. We therefore trained an automatic linguistic segmenter on this data, and used it to segment the remaining training data. This method had previously been shown to give good segment boundary detection accuracy on this corpus (85 recall, 3 false alarms) [14]. The hand segmented and the automatically segmented training data were pooled, resulting in a linguistic segment language model based on the same amount of training data as the acoustic segment language model. The test set consisted of 25 Switchboard conversations (24,000 words) and was ....

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, vol. 2, pp. 1005--1008, Philadelphia, 1996.


Structural Event Detection for Rich Transcription of Speech - Liu (2004)   (Correct)

No context found.

A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the International Conference on Spoken Language Processing, pages 1005--1008, 1996.


Specification of a Shallow Dialog Analysis Model - Andrei Popescu-Belis Maria   (Correct)

No context found.

Stolcke, A., and Shriberg, E., "Automatic Linguistic Segmentation of Conversational Speech", In Proc. of ICSLP96, Philadelphia, PA, USA, 1996, p. 1005--1008.


Intonational Boundaries, Speech Repairs and Discourse Markers: .. - Heeman, Allen (1997)   (1 citation)  (Correct)

No context found.

Stolcke, A. and E. Shriberg. 1996a. Automatic linguistic segmentation of conversational speech. In Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP-96), October.


Linguistically Engineered Tools for Speech Recognition.. - Van Ess-Dykema, Ries (1998)   (Correct)

No context found.

SS96. A. Stolcke and E. Shriberg. Automatic linguistic segmentation of conversational speech. In Proceedings of the ICSLP, Philadelphia, USA, 1996.

