J. Schmidhuber and S. Heil. Sequential Neural Text Compression. Technical Report FKI- -94, Fakultät für Informatik, Technische Universität München, 1994.

This paper is cited in the following contexts:
Image compression with neural networks - A survey - Jiang (1999)   (1 citation)  (Correct)

....To this end, such a neural network application can be viewed as an indirect application for lossless image compression. By considering each pixel as an individual symbol in a predefined scanned order, other general lossless data compression neural networks can also be applied for image compression [28,59]. In fact, theoretical results are analysed [56] based on Kolmogorov's mapping neural network existence theorem [20] that a C grey level image of n×n can be completely described by a three layer neural network with 2⌈log n⌉ inputs, 4⌈log n⌉+2 hidden neurones and ⌈log C⌉ output neurones [56]. This ....

J. Schmidhuber, S. Heil, Sequential neural text compression, IEEE Trans. Neural Networks 7 (1) (January 1996) 142-146.


Online Symbolic-Sequence Prediction with Recurrent.. - Pérez-Ortiz.. (2001)   (Correct)

....of next symbol probabilities. Arithmetic compression is used to measure the quality of the predictor. 1 Introduction RNN are specially suitable for temporal sequence processing. RNN develop state representations of this kind of sequences. Previous works with similar sequences [1, 2, 3] concentrate on offline prediction (for example, grammatical inference). Here we concentrate on online prediction where the network is trained to give in real time an output as correct as possible for every component of the sequence supplied as input. 2 Methodology 2.1 Architecture We ....

Schmidhuber, J. and S. Heil (1996), "Sequential neural text compression", IEEE Transactions on Neural Networks, 7(1), pp. 142-146.


Online Symbolic-Sequence Prediction with.. - Pérez-Ortiz.. (2001)   (Correct)

....for predicting online the next component in a symbolic sequence. Arithmetic compression [7] is used to evaluate the quality of the predictor. Different symbolic sequence sources ranging from finite state machines to texts in human language are considered in the experiments. Unlike previous works [3, 11, 13] which performed neural offline prediction on the kind of sequences studied here, in this paper we concentrate on online prediction. 2 Prediction with DTRNN We have chosen two classical DTRNN: Elman's simple recurrent network (SRN) [4] and the recurrent error propagation network (REPN) [10]. Both ....

Schmidhuber, J. and S. Heil (1996), "Sequential neural text compression", IEEE Transactions on Neural Networks, 7(1), pp. 142-146.


Text Compression via Alphabet Re-Representation (Extended.. - Long, Natsev, Vitter   (Correct)

....the network through blending and partitioning; and second, by doing adaptive size adjustments through manipulating the d and h parameters during training. 6 Results and Discussion We have run two sets of experiments to test our approach. The first set is identical to the experiments reported in [5], where the authors propose a similar neural network approach that consists of the same input output structure but uses only one hidden layer. As reported in [5] the number of hidden nodes used in their single hidden layer is 440, the context size is 5, and the alphabet size is 80. The matching ....

....and Discussion We have run two sets of experiments to test our approach. The first set is identical to the experiments reported in [5] where the authors propose a similar neural network approach that consists of the same input output structure but uses only one hidden layer. As reported in [5], the number of hidden nodes used in their single hidden layer is 440, the context size is 5, and the alphabet size is 80. The matching configuration that we used had an alphabet size of 256 so that it can handle even binary files but all characters that were predicted with essentially a zero ....

[Article contains additional citation context not shown here]

SCHMIDHUBER, J., AND HEIL, S. Sequential neural text compression. IEEE Transactions on Neural Networks 7, 1 (January 1996), 142-146.


Extracting Finite State Representations from Recurrent Neural.. - Tino, Köteles (1999)   (5 citations)  (Correct)

....entropy, i.e. difficult to predict, construct a stochastic model that has the information theoretic properties similar to those of S. So far, connectionist approaches to extraction of useful statistics from long symbolic sequences have been primarily concerned with compression issues [37] or detection and analysis of significant subsequences [20, 38]. In this paper, we study three main issues associated with training RNNs on long chaotic sequences. 1. Chaotic sequences are by nature unpredictable. Consequently, one can hardly expect RNNs to be able to exactly learn the prediction ....

J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.


A Neural Probabilistic Language Model - Bengio, Ducharme, Vincent (2000)   (7 citations)  (Correct)

....we push this idea to a large scale, and concentrate on learning a statistical model of the distribution of word sequences, rather than learning the role of words in a sentence. The proposed approach is also related to previous proposals of character based text compression using neural networks [11]. Learning a clustering of words [10, 1] is also a way to discover similarities between words. In the model proposed here, instead of characterizing the similarity with a discrete random or deterministic variable (which corresponds to a soft or hard partition of the set of words) we use a ....

Jürgen Schmidhuber. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.


Online Symbolic-Sequence Prediction with.. - Pérez-Ortiz.. (2001)   (Correct)

....for predicting online the next component in a symbolic sequence. Arithmetic compression [7] is used to evaluate the quality of the predictor. Different symbolic sequence sources ranging from finite state machines to texts in human language are considered in the experiments. Unlike previous works [3, 11, 13] which performed neural offline prediction on the kind of sequences studied here, in this paper we concentrate on online prediction. 2 Prediction with DTRNN We have chosen two classical DTRNN: Elman's simple recurrent network (SRN) [4] and the recurrent error propagation network (REPN) [10]. Both ....

Schmidhuber, J. and S. Heil (1996), "Sequential neural text compression", IEEE Transactions on Neural Networks, 7(1), pp. 142-146.


A Neural Probabilistic Language Model - Bengio, Ducharme, Vincent (2000)   (7 citations)  (Correct)

....on learning a statistical model of the distribution of word sequences, rather than learning the role of words in a sentence. The proposed approach is also related to previous proposals of character based text compression using neural networks to predict the probability of the next character (Schmidhuber, 1996). The idea of discovering some similarities between words to obtain generalization from training sequences of words to new sequences of words is not new. For example, it is exploited in approaches that are based on learning a clustering of the words (Pereira, Tishby and Lee, 1993; Baker and ....

Schmidhuber, J. (1996). Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146.


Text Compression Via Alphabet Re-Representation - Natsev (1997)   (Correct)

....in the language due to the underlying higher order grammatical structure. Methods of the third type (neural networks) along with other not so conventional PR approaches (e.g. fuzzy logic) have already been applied to image compression (see [4, 7]) and recently to text compression as well (see [13]). For more information about these pattern recognition approaches consult [12]. Now let us look at PPM's classical statistics approach to text compression in the context of pattern recognition. Consider a string S of length N over an alphabet A = {α_1, α_2, ..., α_σ} of size σ. Let ....

....a crucial role in the complexity of the network both in time and space requirements. For comparison, consider a network with the same input and output structure but with a single hidden layer, which is completely interconnected with both input and output layers (this network was proposed in [13]). Given roughly the same other parameters (e.g. n = 5), such a network, with 440 hidden nodes, performs worse than a network with our architecture that contains a total of 250 hidden nodes (d = 30; h = 20). In addition to cutting the number of hidden nodes almost in half, our approach requires ....

[Article contains additional citation context not shown here]

Jürgen Schmidhuber and Stefan Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, January 1996.


Text Compression Via Alphabet Re-Representation - Long, Natsev, Vitter   (Correct)

....α_i in the string, and N_p is the total number of occurrences of pattern p. Alternative, and not so conventional, PR approaches (such as neural networks and fuzzy logic) have already been applied to image compression (Cottrell, Munro, Zipser, 1989) and recently to text compression as well (Schmidhuber & Heil, 1996). The ability of neural networks, in particular, to learn smooth data suggests that they may be able to do a better job in modeling the character probabilities by employing some form of intelligent learning. One of the main challenges of neural ....

....heuristics (such as the greedy approach and Lin-Kernighan's 2-OPT heuristic) Faloutsos Lin, Murtagh, 1983; Lawler, Lenstra, Rinnooy, Shmoys, 1985) 9 Results and discussion We have run two sets of experiments to test our approach. The first set is identical to the experiments reported in (Schmidhuber & Heil, 1996), where the authors propose a similar neural network that consists of the same input output structure but uses only one hidden layer. As reported in (Schmidhuber & Heil, 1996), the number of hidden nodes used in their single hidden layer is 440, the context size is 5, and the alphabet size is 80. ....

[Article contains additional citation context not shown here]

Schmidhuber, J., & Heil, S. (1996). Sequential neural text compression. IEEE Transactions on Neural Networks, 7 (1), 142--146.


Schmidhuber's Lab - Turteltaub (1999)   (Correct)

....100 papers on diverse topics including fine arts [96] and the nature of surprises [97]. Apparently he even founded a religion [94]. Most of his articles, however, are about machines that learn from experience. I have started to compile an incomplete list of references to work by him and his lab [117, 116, 39, 50, 40, 42, 43, 41, 52, 49, 56, 44, 54, 47, 48, 51, 53, 57, 46, 68, 45, 55, 69, 64, 65, 59, 66, 58, 67, 60, 63, 61, 73, 71, 79, 70, 74, 62, 72, 75, 78, 82, 80, 76, 81, 77, 84, 89, 88, 94, 87, 85, 96, 83, 100, 86, 90, 99, 91, 93, 105, 119, 95, 92, 97, 120, 118, 98, 125, 130, 129, 126, 128, 124, 123, 122, 131, 127, 35, 34, 36, 38, 32, 33, 37, 27, 28, 25, 24, 22, 23, 15, 9, 21, 10, 16, 26, 17, 18, 6, 7, 8, 13, 11, 20, 19, 14, 12, 115, 114, 121, 30, 106, 108, 107, 29, 31, 109, 110, 111, 112, 113, 5, 101, 103, 104, 4, 3, 2, 1, 102]. Hopefully I'll be able to add missing entries soon. Future work will concentrate on categorizing related papers and establishing common threads. ....

J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.


Schmidhuber's Lab - Turteltaub (1999)   (Correct)

....100 papers on diverse topics including fine arts [96] and the nature of surprises [97]. Apparently he even founded a religion [94]. Most of his articles, however, are about machines that learn from experience. I have started to compile an incomplete list of references to work by him and his lab [117, 116, 39, 50, 40, 42, 43, 41, 52, 49, 56, 44, 54, 47, 48, 51, 53, 57, 46, 68, 45, 55, 69, 64, 65, 59, 66, 58, 67, 60, 63, 61, 73, 71, 79, 70, 74, 62, 72, 75, 78, 82, 80, 76, 81, 77, 84, 89, 88, 94, 87, 85, 96, 83, 100, 86, 90, 99, 91, 93, 105, 119, 95, 92, 97, 120, 118, 98, 125, 130, 129, 126, 128, 124, 123, 122, 131, 127, 35, 34, 36, 38, 32, 33, 37, 27, 28, 25, 24, 22, 23, 15, 9, 21, 10, 16, 26, 17, 18, 6, 7, 8, 13, 11, 20, 19, 14, 12, 115, 114, 121, 30, 106, 108, 107, 29, 31, 109, 110, 111, 112, 113, 5, 101, 103, 104, 4, 3, 2, 1, 102]. Hopefully I'll be able to add missing entries soon. Future work will concentrate on categorizing related papers and establishing common threads. ....

J. Schmidhuber and S. Heil. Sequential Neural Text Compression. Technical Report FKI- -94, Fakultät für Informatik, Technische Universität München, 1994.


Text Compression via Alphabet Re-Representation (Extended.. - Long, Natsev, Vitter   (Correct)

....the network through blending and partitioning; and second, by doing adaptive size adjustments through manipulating the d and h parameters during training. 6 Results and Discussion We have run two sets of experiments to test our approach. The first set is identical to the experiments reported in [5], where the authors propose a similar neural network approach that consists of the same input output structure but uses only one hidden layer. As reported in [5] the number of hidden nodes used in their single hidden layer is 440, the context size is 5, and the alphabet size is 80. The matching ....

....and Discussion We have run two sets of experiments to test our approach. The first set is identical to the experiments reported in [5] where the authors propose a similar neural network approach that consists of the same input output structure but uses only one hidden layer. As reported in [5], the number of hidden nodes used in their single hidden layer is 440, the context size is 5, and the alphabet size is 80. The matching configuration that we used had an alphabet size of 256 so that it can handle even binary files but all characters that were predicted with essentially a zero ....

[Article contains additional citation context not shown here]

Schmidhuber, J., and Heil, S. Sequential neural text compression. IEEE Transactions on Neural Networks 7, 1 (January 1996), 142--146.


The Use of a Bayesian Neural Network Model for Classification Tasks - Holst (1997)   (Correct)

....the development of some process from a sequence of instrument readings at different times [Cichocki and Unbehauen, 1993], analysis of EEG patterns [Ingber, 1997], recognition of gesture sequences [Sandberg, 1997], and prediction of the following character from the previous characters in a text [Schmidhuber and Heil, 1996]. There are at least two aspects of sequences which require attention. One is how to code the sequences when feeding them to the network, and the other is how to handle the kind of dependencies that are typically present within sequences. Let us first consider the case ....

Schmidhuber J. and Heil S. (1996). Sequential neural text compression. IEEE Trans. Neural Networks 7: 142--146.


Modeling Complex Symbolic Sequences with Neural and Hybrid.. - Tino, Köteles (1996)   (Correct)

....positive entropy, i.e. difficult to predict, construct a stochastic model that has the information theoretic properties similar to those of S. So far, connectionist approaches to extraction of useful statistics from long symbolic sequences have been primarily concerned with compression issues [38] or detection and analysis of significant subsequences [25, 40]. In this paper, we investigate the potential of recurrent neural networks (RNNs) to be trained as chaotic symbolic sequence models and possibilities of reformulating the knowledge extracted by RNNs in a compact form of finite state ....

....intuition [45]. The amount of memory needed to save the block conditional probabilities of symbols grows exponentially with the order of MS. This look-up table approach can be made feasible using a learning system (e.g. a feed forward neural network) that learns the conditional probabilities [38]. The problem with this method is that we have no control over the mechanism of generalization to unseen (or very rarely seen) blocks of symbols. However, Schmidhuber [38] reported impressive results obtained using this method in compressing long, well structured symbolic sequences. We propose the ....

[Article contains additional citation context not shown here]

J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.


Extracting Finite State Representations from Recurrent Neural.. - Tino, Köteles (1999)   (5 citations)  (Correct)

....positive entropy, i.e. difficult to predict, construct a stochastic model that has the information theoretic properties similar to those of S. So far, connectionist approaches to extraction of useful statistics from long symbolic sequences have been primarily concerned with compression issues [37] or detection and analysis of significant subsequences [20, 38]. In this paper, we study three main issues associated with training RNNs on long chaotic sequences. 1. Chaotic sequences are by nature unpredictable. Consequently, one can hardly expect RNNs to be able to exactly learn the prediction ....

J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.


The Work of Schmidhuber 1987-2002 - Hufnagel (2002)   Self-citation (Schmidhuber)   (Correct)

No context found.

J. Schmidhuber and S. Heil. Sequential Neural Text Compression. Technical Report FKI- -94, Fakultät für Informatik, Technische Universität München, 1994.


Predictive Coding With Neural Nets: Application To Text.. - Schmidhuber, Heil (1995)   (1 citation)  Self-citation (Schmidhuber Heil)   (Correct)

....next characters, given n previous characters. P's outputs are fed into algorithms that generate short codes for characters with low information content (characters with high predicted probability) and long codes for characters conveying a lot of information (highly unpredictable characters) [5]. Two such standard coding algorithms are employed: Huffman Coding (see e.g. [1]) and Arithmetic Coding (see e.g. [7]). With the off-line variant of the approach, P's training phase is based on a set F of training files. After training, the weights are frozen. Copies of P are installed at all ....

J. H. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 1994. Accepted for publication.


Neural Predictors For Detecting And Removing Redundant Information - Schmidhuber (1998)   (1 citation)  Self-citation (Schmidhuber)   (Correct)

....strings) 3 EXAMPLE 2: Text Compression The example from the previous section was based on artificial data from a stochastic automaton. Can neural predictors offer something for redundancy reduction in natural language? How do they compare to standard data compression algorithms? The method [28] reviewed in this section is an instance of a strategy known as predictive coding or model based coding. A neural predictor network P is trained to approximate the conditional probability distribution of possible characters, given the previous characters. P's outputs are fed into the ....

....characters. P's outputs are fed into the Arithmetic Coding algorithm (e.g. [41]) that generates short codes for characters with low information content (characters with high predicted probability) and long codes for characters conveying a lot of information (highly unpredictable characters) [28]. 3.1 PREDICTING CONDITIONAL PROBABILITIES With the offline variant of the approach, P's training phase is based on a set F of training files. Assume that the alphabet contains k possible characters z_1, z_2, ..., z_k. The (local) representation of z_i is a binary k dimensional vector ....

[Article contains additional citation context not shown here]

J. Schmidhuber and S. Heil. Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1):142--146, 1996.
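
Illustrative note on the predictive coding idea quoted in the contexts above (short codes for characters with high predicted probability, long codes for unpredictable ones): a minimal Python sketch, not the authors' implementation. It assumes the standard information-theoretic estimate that an ideal entropy coder spends about -log2(p) bits on a symbol predicted with probability p. The names ideal_code_length_bits, total_code_length_bits and toy_predictor are hypothetical, and toy_predictor is a fixed stand-in for the trained predictor network P.

```python
import math


def ideal_code_length_bits(probability):
    """Ideal code length (in bits) for a symbol predicted with this probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def total_code_length_bits(text, predict):
    """Sum the ideal code lengths of all characters in text.

    predict(context) must return a dict mapping each possible next
    character to its predicted probability, given the preceding text.
    """
    total = 0.0
    for i, ch in enumerate(text):
        distribution = predict(text[:i])
        total += ideal_code_length_bits(distribution[ch])
    return total


def toy_predictor(context):
    # Hypothetical stand-in for the predictor P: a fixed distribution
    # that ignores the context entirely (assumption for illustration).
    return {"a": 0.5, "b": 0.25, "c": 0.25}


if __name__ == "__main__":
    print(total_code_length_bits("aab", toy_predictor))
```

A better predictor assigns higher probability to the characters that actually occur, which lowers the total; this is why the quoted contexts use compression performance as a measure of predictor quality.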

CiteSeer.IST - Copyright Penn State and NEC