Architectural Bias in Recurrent Neural Networks - Fractal Analysis
- IEEE Transactions on Neural Networks
, 1931
Cited by 23 (5 self)
We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e. even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tino, 2002; Tino, Cernansky & Benuskova, 2002; Tino, Cernansky & Benuskova, 2002a). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this paper we further extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram -- a scenario that has been frequently considered in the past, e.g. when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box-counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be explored to build Markovian predictive models, but also the activations form fractal clusters the dimension of which can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
Dimensions of Neural-symbolic Integration - A Structured Survey
- We Will Show Them: Essays in Honour of Dov Gabbay
, 2005
Cited by 17 (6 self)
Introduction: Research on integrated neural-symbolic systems has made significant progress in the recent past. In particular the understanding of ways to deal with symbolic knowledge within connectionist systems (also called artificial neural networks) has reached a critical mass which enables the community to strive for applicable implementations and use cases. Recent work has covered a great variety of logics used in artificial intelligence and provides a multitude of techniques for dealing with them within the context of artificial neural networks. Already in the pioneering days of computational models of neural cognition, the question was raised how symbolic knowledge can be represented and dealt with within neural networks. The landmark paper [McCulloch and Pitts, 1943] provides fundamental insights how propositional logic can be processed using simple artificial neural networks. Within the following decades, however, the topic did not receive much attention as research in arti
Mathematical Aspects of Neural Networks
- European Symposium of Artificial Neural Networks 2003
, 2003
Cited by 5 (4 self)
In this tutorial paper about mathematical aspects of neural networks, we will focus on two directions: on the one hand, we will motivate standard mathematical questions and well studied theory of classical neural models used in machine learning. On the other hand, we collect some recent theoretical results (as of beginning of 2003) in the respective areas. Thereby, we follow the dichotomy offered by the overall network structure and restrict ourselves to feedforward networks, recurrent networks, and self-organizing neural systems, respectively.
Self-Organizing Maps for Time Series
, 2005
Cited by 2 (0 self)
We review a recent extension of the self-organizing map (SOM) for temporal structures with a simple recurrent dynamics leading to sparse representations, which allows an efficient training and a combination with arbitrary lattice structures. We discuss its practical applicability and its theoretical properties. Afterwards, we put the approach into a general framework of recurrent unsupervised models. This generic formulation also covers a variety of well-known alternative approaches including the temporal Kohonen map, the recursive SOM, and SOM for structured data. Based on this formulation, mathematical properties of the models are investigated. Interestingly, the dynamic can be generalized from sequences to more general tree structures thus opening the way to unsupervised processing of general data structures.
Compositionality in Neural Systems
, 2003
Cited by 1 (1 self)
In real life, people deal with composite structures: Written English language is built of 26 characters and a few additional symbols which form syllables, words, sentences, articles, roadmaps, and finally the `Handbook of Brain Theory'. Spoken language consists of raw acoustic waves at a basic level; at a higher level, it can be decomposed into phonemes which form words, sentences, a poem or speech. Visual data can be decomposed into pixels with various colors and intensities; alternatively, the raw image data may be represented by features like edges or texture, which are grouped to complex contours, objects, and, finally, a complete scene like the image of our grandmother who sits knitting on a chair. Moreover, not only real life data are processed as composite objects, artificial data created by humans or virtual objects have a composite structure, either: Web sites consist of single pages with head and body, links, tabulars, figures, enumerations, etc. Computer programs ...
Conditional Neural Fields
Cited by 1 (0 self)
Conditional random fields (CRF) are widely used for sequence labeling such as natural language processing and biological sequence analysis. Most CRF models use a linear potential function to represent the relationship between input features and output. However, in many real-world applications such as protein structure prediction and handwriting recognition, the relationship between input features and output is highly complex and nonlinear, which cannot be accurately modeled by a linear function. To model the nonlinear relationship between input and output we propose a new conditional probabilistic graphical model, Conditional Neural Fields (CNF), for sequence labeling. CNF extends CRF by adding one (or possibly more) middle layer between input and output. The middle layer consists of a number of gate functions, each acting as a local neuron or feature extractor to capture the nonlinear relationship between input and output. Therefore, conceptually CNF is much more expressive than CRF. Experiments on two widely-used benchmarks indicate that CNF performs significantly better than a number of popular methods. In particular, CNF is the best among approximately 10 machine learning methods for protein secondary structure prediction and also among a few of the best methods for handwriting recognition.
Growing RSOM:GRSOM, Temporal extensions of SOM, Hybridization of SOM with GA:GASOM.
Solving complex multivariable problems such as context-independent, multi-speaker speech recognition requires the application of neural structures. One tool that proves powerful in the classification field is the Kohonen map, known as the SOM. This map is characterized by the representation of static data. We therefore seek to enrich the intelligibility and performance of this model, in order to approach what biology imposes, by giving it a kind of logical "memory" stack: a temporal context introduced through feedback. The temporal Kohonen map (TKM) realises this feedback with leaky integrators; the recurrent SOM (RSOM) improves on the TKM with recurrent leaky integrators; and, more recently, the recursive SOM applies the principle of self-reference. We also introduce the possibility of hybridizing the SOM with a genetic algorithm (GA), in an attempt to approach the natural evolution of human thought with regard to recognition.

