Bellegarda, J. R. (1998). Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 677-680, Seattle, USA.
....information, various methods are used to identify the most relevant portions of the out-of-domain data prior to combination. Past work on preselection has been based on word frequency counts [17], probability (or perplexity) of word or part-of-speech sequences [8], latent semantic analysis [1], and information retrieval techniques [12, 8]. Perplexity-based clustering has also been used for defining topic-specific subsets of in-domain data [6, 4, 13], and test set perplexity has been used to prune documents from a training corpus [10]. The most common method for using the additional ....
J. Bellegarda. Exploiting both local and global constraints for multispan statistical language modeling. In Proc. ICASSP, pages II:677--680, 1998.
....As the name suggests, the goal of LSA is to find a data mapping which provides information well beyond the lexical level and reveals semantical relations between the entities of interest. Due to its generality, LSA has proven to be a valuable analysis tool with a wide range of applications (e.g. [3, 5, 8, 1]). Yet its theoretical foundation remains to a large extent unsatisfactory and incomplete. This paper presents a statistical view on LSA which leads to a new model called Probabilistic Latent Semantics Analysis (PLSA). In contrast to standard LSA, its probabilistic variant has a sound statistical ....
J.R. Bellegarda. Exploiting both local and global constraints for multi-span statistical language modeling. In Proceedings of ICASSP'98, volume 2, pages 677--80, 1998.
....been in one of the two directions mentioned above: either incorporation of topic dependencies or of syntactic structure, but not both. Several models which combine topic-related information with N-gram models have been studied, e.g. by Bellegarda (1998), Clarkson and Robinson (1997), Chen and Rosenfeld (1998), Iyer and Ostendorf (1996), Kneser et al. (1997), and Martin et al. (1997). The essential idea in all these papers comes from the information retrieval (IR) literature, where extensive use is made of weighted word frequencies to discern the topic ....
.... Suhm (1996) describe a topic-dependent model very similar to the topic-dependent model presented here and demonstrate improvement in perplexity over a bigram model with a relatively small 5000-word vocabulary (Lafferty, 1996). The approach based on latent semantic analysis recently proposed in Bellegarda (1998) is a refreshing departure from these methods and is most similar to our approach in philosophy. It remains of interest to us to compare our method with Bellegarda's under identical experimental conditions. Relatively fewer models which utilize syntactic structure for improving over N-gram models ....
[Article contains additional citation context not shown here]
Bellegarda, J. R. (1998). Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 677--680, Seattle, WA.
....given that the topic is business news is much greater than that in the whole corpus. This difference has been successfully used to discern the topic of a document in information retrieval. Several models that combine topic-related information with N-gram models have been studied, e.g. in [1, 2, 3, 8, 5, 6, 11, 12, 19, 21]. Most schemes [8, 11, 12, 21] exploited these differences by constructing separate N-gram models for each individual topic. For example, in [12] the equation (1) was modified as
$$P(w_1, w_2, \ldots, w_T) = \sum_{k=1}^{m} \lambda_k \left[ \prod_{i=1}^{T} P_k(w_i \mid w_{i-n+1}, \ldots, w_{i-1}) \right] \qquad (3)$$
where $\lambda_k$ was ....
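The mixture of topic-specific N-gram models described in the excerpt above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the bigram order, the add-alpha smoothing, the toy corpora, and the names `train_bigram` and `mixture_sentence_prob` are all assumptions made for illustration.

```python
from collections import Counter


def train_bigram(sentences, vocab, alpha=1.0):
    """Return an add-alpha smoothed bigram estimator P(word | prev).

    Hypothetical helper: the smoothing scheme is an assumption,
    not taken from the cited papers.
    """
    bigrams = Counter()
    contexts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    size = len(vocab)

    def prob(word, prev):
        return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * size)

    return prob


def mixture_sentence_prob(sentence, topic_models, weights):
    """Sum over topics of weight_k * product_i P_k(w_i | w_{i-1})."""
    tokens = ["<s>"] + sentence
    total = 0.0
    for weight, prob in zip(weights, topic_models):
        product = 1.0
        for prev, word in zip(tokens, tokens[1:]):
            product *= prob(word, prev)
        total += weight * product
    return total


# Toy data (assumed): one tiny corpus per topic.
vocab = {"stocks", "rose", "team", "won"}
finance_model = train_bigram([["stocks", "rose"]], vocab)
sports_model = train_bigram([["team", "won"]], vocab)
```

Calling `mixture_sentence_prob(["stocks", "rose"], [finance_model, sports_model], [0.5, 0.5])` scores the sentence under both topic models and combines the two sentence-level probabilities with the mixture weights. Because each topic model is trained only on its own subset, this design also shows the data fragmentation that such schemes can suffer from.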
....N-gram model would not destroy the full history dependence of the original model. Dynamics were modeled by cache-like notions rather than a semantic notion of topic. No statistically significant improvement has been reported in [19]. The approach based on latent semantic analysis (LSA) proposed in [1, 2, 3] was a refreshing departure from these methods. In order to capture the global constraints, a matrix of co-occurrences between words and documents was accumulated from the training data by simply keeping track of which word was found in what document. K-means and bottom-up clustering were ....
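The word-document co-occurrence matrix mentioned in the excerpt above can be accumulated as in the following minimal Python sketch. The function name `word_document_matrix` and the toy documents are assumptions; LSA-style methods typically go on to weight the counts and apply a singular value decomposition, and both steps are omitted here.

```python
from collections import Counter


def word_document_matrix(documents):
    """Count how often each word occurs in each document.

    Returns (vocab, matrix), where matrix[i][j] is the count of
    vocab[i] in documents[j]. Hypothetical helper for illustration.
    """
    vocab = sorted({word for doc in documents for word in doc})
    row_of = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(documents) for _ in vocab]
    for j, doc in enumerate(documents):
        for word, count in Counter(doc).items():
            matrix[row_of[word]][j] = count
    return vocab, matrix


# Toy data (assumed): two short tokenized documents.
documents = [["stocks", "rose", "stocks"], ["team", "won"]]
vocab, matrix = word_document_matrix(documents)
```

Each row of `matrix` describes one word by the documents it appears in, and each column describes one document by its words. Words or documents with similar vectors can then be grouped, for example by the clustering methods the excerpt names.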
J. R. Bellegarda, "Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling," in Proc. ICASSP'98, Vol. 2, pp. 677-680, May 12-15, 1998.
....dynamic models of the topic of the conversation. We present a compact model that integrates these dependencies with N-grams in a statistically sound manner in the maximum entropy (ME) framework. Several models which combine topic-related information with N-gram models have been studied, e.g. in [1, 4, 3, 8, 9, 10]. The essential idea comes from the information retrieval (IR) literature, where extensive use is made of weighted term frequencies to discern the topic or genre of a document. Most schemes [4, 8, 10] exploit these differences for language modeling by constructing separate N-gram models for each ....
....objective of fast rescoring. The work on read speech in [9] is similar; dynamics there are modeled by cache-like notions rather than a semantic notion of topic. The approach based on latent semantic analysis recently proposed in [1] is a refreshing departure from these methods and presents perplexities on newspaper text. In the method presented here the term frequencies are treated as topic-dependent salient features of a corpus, just as overall N-gram frequencies are topic-independent salient features. An admissible model is ....
[Article contains additional citation context not shown here]
J. R. Bellegarda, "Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling," in Proc. ICASSP'98, Vol. 2, pp. 677-680, May 12-15, 1998.
....In language modeling, a more accurate expectation of the probability of these topic-relevant words may be estimated via determination of semantic content of the document, e.g. by topic detection. Several models that combine topic-related information with N-gram models have been studied, e.g. in [8, 1, 5, 3, 4, 11, 9, 12, 13, 14]. Most of them [8, 5, 9, 13, 14] exploit these differences for language modeling by constructing separate N-gram models for each individual topic. This results in fragmentation of the training data and therefore may hurt the discriminating power of a well-trained global N-gram model. Some use an ....
J. R. Bellegarda, "Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling," in Proc. ICASSP'98, Vol. 2, pp. 677-680, May 12-15, 1998.
No context found.
Bellegarda, J. R. (1998). Exploiting Both Local and Global Constraints for Multispan Statistical Language Modeling. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 677-680, Seattle, USA.
No context found.
Bellegarda, J.R. Exploiting both local and global constraints for multi-span statistical language modeling, IEEE Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing, vol. 2, 677-680, 1998.
No context found.
Bellegarda, J.R. Exploiting both local and global constraints for multi-span statistical language modeling, IEEE Proc. Intl. Conf. on Acoustics, Speech, and Signal Processing, vol. 2, 677-680, 1998.
No context found.
Jerome Bellegarda. 1998. Exploiting both local and global constraints for multispan statistical language modeling. In IEEE ICASSP-98, Seattle, Washington, May.