108 citations found.
C. E. Shannon. Prediction and entropy of printed English. The Bell System Technical Journal, 30:50--64, January 1951.

This paper is cited in the following contexts:

Metrics for text entry research: An evaluation of - Kspc   (Correct)

....representing useful information transfer is Bandwidth Utilised = C / (C + INF + IF + F) (8) and the proportion of wasted bandwidth is Bandwidth Wasted = (INF + IF + F) / (C + INF + IF + F) (9). Purists will rightfully disagree. The units of C, INF, IF, and F, are characters, not bits. Yet Shannon [6] argues that it is possible to measure the information content of a character. A future direction for this work is to cast these formulae in terms of information content, instead of characters. Then these relations would also apply to non-character-based text input methods. Equations 8 and 9 ....

Shannon, C.E., (1951) Prediction and entropy of printed English. Bell System Technical Journal, 30, 51-64.


The Representation of Natural Language to Enable Neural Networks.. - Lyon (1994)   (1 citation)  (Correct)

....If a sentence is represented as a bag of words with no information on their order little knowledge can be gleaned. The approach taken here is to combine adjacent words and represent the sentence as a set of pairs and triples. These tuples partially preserve sequential order. Shannon (1951) [25] investigated the amount of information held in sequences of letters in English text, and his results were later extended to words. He found that the additional information held by sequences longer than 3 was very small. This is an important natural constraint. However, the great number of words ....

.... If words are ranked in order of frequency, denoted by n, there is an empirical relationship between the probability of the word at rank n occurring, p(n), and n itself, known as Zipf's Law: p(n) ∝ 1/n. This gives a surprisingly good approximation to word probabilities in English and other languages [25, 1951], and indicates the extent to which a significant number of words occur infrequently. The LOB corpus of about 1 million word tokens contains about 50,000 different word forms, of which about 42,000 occur less than 10 times each [21, 1987]. In the Brown corpus of comparable size 40% of word forms ....

[Article contains additional citation context not shown here]

C E shannon. Prediction and Entropy of Printed English. Bell System Technical Journal, 1951.


Unknown - Figure There Are   (Correct)

....Visualization Similar to the behavior of N-block models in the text domain, the accuracy of N-block models for describing the contents of images increases when we increase the value of N. We look at some examples to get an intuition by generalizing a visualization technique proposed by Shannon [79, 80, 81] from the text domain to the image domain. The idea is to train various N-blocks and then use each to generate random images. Given a training database D of M images and a codebook of N codes c_1, ..., c_N, let I M be the VQ encoded images, we train the uni-block by calculating the ....

C. Shannon. Prediction and entropy of printed English. Bell Systems Technical Journal, 30:50--64, 1951.


Generic Summaries for Indexing in Information Retrieval -.. - Sakai, Jones (2001)   (1 citation)  (Correct)

.... informative as the fulltexts for a task of solving English reading comprehension questions [9]. Although this work is sometimes misquoted as if they discovered the optimal CR for any kind of text for any purpose [1, 27], it is not clear that Shannon's 25% redundancy limit from information theory [19] applies to text summarization, or whether reading comprehension questions cover all informative content. Moreover, since the CR can affect summary evaluation [5], we examine extracts at 5%, 10%, 30% and 50% CR. We also generate abstract length 3 query Single search runs summary index ....

Shannon, C. E.: Prediction and Entropy of Printed English, Bell Systems Technical Journal 30, pp. 50-64 (1951).


The Design, Implementation and Use of the Ngram Statistics.. - Banerjee, Pedersen (2003)   (Correct)

....task in natural language processing. An Ngram might be considered interesting if it occurs more often than would be expected by chance, or has some tendency to predict the occurrence of other phenomena in text. There is a long history of research in this area. Character Ngrams were used by Shannon [10] to estimate the per-letter entropy of the English language. In the last decade there has been a large amount of work in developing corpus-based techniques to identify collocations in text (e.g. [2], [3], [6], [9]). This paper describes the Ngram Statistics Package (NSP), a general purpose ....

C. Shannon. Prediction and entropy of printed English. The Bell System Technical Journal, 30(50-64), 1951.


Fragment Completion in Humans and Machines - Jacobs, Rudra (2003)   (Correct)

....Seidenberg [17] proposed a model based on trigrams. Srinivas et al. [21] indicate that assuming orthographic encoding is in most cases sufficient to describe word completion performance in humans. Orthographic Markov models of words have often been used computationally, as, for example, in Shannon's [19] famous work. Following this work, our model is also orthographic. We find that a bigram rather than a trigram representation is sufficient, and leads to a simpler model. Contradicting evidence exists for the influence of fragment length on word completion. Oloffsson and Nyberg [12] failed to ....

C. Shannon. "Prediction and Entropy of Printed English," Bell Systems Technical Journal, 30:50-64, 1951.


Using Software Metrics and Evolutionary Decision Trees.. - Kokol, Podgorelec.. (2001)   (Correct)

....the definition of the complexity and the choice of the system being studied. Using the assumption that meaning and information content in text is founded on the correlation between language symbols, one of the meaningful measures of complexity of human writings is entropy as established by Shannon [11]. Yet, when a text is very long it is almost impossible to calculate the Shannon information entropy, so Grassberger [12] proposed an approximate method to estimate entropy. But entropy does not reveal directly the correlation properties of texts, so another more general measure is needed. One ....

Shannon, C.E., "Prediction and entropy of printed English", Bell System Technical Journal, 1951, pp 30, 50.


The Entropy Of English Using Ppm-Based Models - Teahan, Cleary (1996)   (8 citations)  Self-citation (Shannon)   (Correct)

....ENTROPY OF ENGLISH USING PPM BASED MODELS W. J. Teahan, John G. Cleary Department of Computer Science, University of Waikato, New Zealand Over 45 years ago Claude E. Shannon estimated the entropy of English to be about 1 bit per character [16]. He did this by having human subjects guess samples of text, letter by letter. From the number of guesses made by each subject he estimated upper and lower bounds of 1.3 and 0.6 bits per character (bpc) for the entropy of English. Shannon's methodology was not improved upon until 1978 when Cover ....

C.E. Shannon. Prediction and entropy of printed English. Bell System Technical Journal, pages 50-64, 1951.


Log-Linear Interpolation of Language Models - Gutkin (2006)   (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed English. The Bell System Technical Journal, 30:50--64, January 1951.


Log-Linear Interpolation of Language Models - Gutkin (2000)   (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed English. The Bell System Technical Journal, 30:50--64, January 1951.


A Maximum Entropy Approach - To Adaptive Statistical   (Correct)

No context found.

C. E. Shannon. Prediction and Entropy of Printed English. Bell Systems Technical Journal, Volume 30, pages 50--64, 1951.


Estimating Entropy of the Source: The Case of - Natural Language Anna   (Correct)

No context found.

C. Shannon. Prediction and Entropy of Printed English. Bell System Technical Journal, (30):50-64. 1951.


The TEXTURE Benchmark: Measuring Performance of - Text Queries On (2005)   (Correct)

No context found.

C. Shannon. Prediction and entropy of printed English. Technical report, Bell Systems, 1951.


CXHist: An On-line Classification-Based Histogram for XML.. - Lim, Wang, Vitter   (Correct)

No context found.

C. Shannon. Prediction and entropy of printed english. Bell Systems Technical Journal, 30:50--64, 1951.


Statistical Language Modelling - Gotoh, Renals (2003)   (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed English. Bell System Technical Journal, 30(1):50--64, January 1951.


Grammatical Trigrams: A Probabilistic Model of Link Grammar - Lafferty, Sleator, Temperley (1992)   (46 citations)  (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed English. Bell Syst. Tech. J., Vol. 30, pp. 50-64, 1951.


CSE 254 (Spring 2003) - Growing Gram Trees   (Correct)

No context found.

C. Shannon. Prediction and entropy of printed english. Technical Report 30, Bell Systems, 1951.


CSE 256 (Spring 2004) - For   (Correct)

No context found.

C. Shannon. Prediction and entropy of printed english. Technical Report 30, Bell Systems, 1951.


Where Genetic Algorithms Excel - Baum, Boneh, Garrett (1995)   (4 citations)  (Correct)

No context found.

C. E. Shannon (1951) Prediction and entropy of printed English, Bell System Technical J. pp50-64, January.


The Functional Load of Phonological Contrasts - Surendran (2003)   (Correct)

No context found.

Shannon, Claude E. Prediction and entropy of printed English, Bell Systems Technical Journal 30:50-64, 1951.


Stochastic Lexicalized Context-Free Grammar - Schabes, Waters (1993)   (12 citations)  (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed english. The Bell System Technical Journal, 30:50--64, 1951.


Input-based Language Modelling in the Design of High.. - Soukoreff, MacKenzie (2003)   (1 citation)  (Correct)

No context found.

Shannon, C. E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30, 51-64.


Probabilistic Language Modelling - Part Iii Project   (Correct)

No context found.

C. E. Shannon. Prediction and entropy of printed English. Bell systems technical journal, 30:50-64, January 1951.


Universal Prediction - Merhav, Feder (1998)   (17 citations)  (Correct)

No context found.

C. E. Shannon, "Prediction and entropy of printed English," Bell Sys. Tech. J., vol. 30, pp. 50--64, 1951.


Word Association and MI-Trigger-based Language Modeling - Guodong Zhou Kimteng   (Correct)

No context found.

Shannon C.E. "Prediction and Entropy of Printed English". Bell Systems Technical Journal, Vol.30, pp.50-64, 1951.

CiteSeer.IST - Copyright Penn State and NEC