## Artificial Neural Networks: A Tutorial (1996)

Venue: IEEE Computer

Citations: 104 (4 self)

### Citations

5962 | Neural Networks: A Comprehensive Foundation - Haykin - 1999
Citation Context: ...d. The field of artificial neural networks has provided alternative approaches for solving these problems. It has been established that a large number of applications can benefit from the use of ANNs [1, 9, 7]. Artificial neural networks, which are also referred to as neural computation, network computation, connectionist models, and parallel distributed processing (PDP), are massively parallel computing s...

2366 | Neural networks and physical systems with emergent collective computational abilities - Hopfield - 1982
Citation Context: ...pfield network The Hopfield network is a special type of recurrent network which uses the network energy function as a tool for designing recurrent networks and for understanding their dynamic behavior [10]. It is Hopfield's formulation that made explicit the principle of storing information as dynamically stable attractors, and popularized the use of recurrent networks for associative memory and for so...
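The energy-function view described in this context can be made concrete in a few lines. This is a hedged sketch, not Hopfield's own formulation: the function names are invented, one bipolar pattern is stored with the outer-product (Hebbian) rule, and recall uses random asynchronous updates, each of which can only lower the energy E(s) = -1/2 sᵀWs, so stored patterns act as stable attractors.

```python
import numpy as np

def hopfield_weights(patterns):
    """Hebbian (outer-product) weight matrix for +/-1 patterns,
    with self-connections zeroed out."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)
    return W

def energy(W, s):
    """Hopfield energy E(s) = -1/2 s^T W s (bias terms omitted)."""
    return -0.5 * s @ W @ s

def recall(W, s, steps=200, seed=0):
    """Asynchronous updates: align one unit at a time with its local
    field; each accepted flip never raises the energy."""
    s = s.copy()
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = 1 if W[i] @ s >= 0 else -1
    return s
```

Starting from a corrupted copy of a stored pattern, the updates descend the energy surface back to the stored attractor.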

2215 | Introduction to the Theory of Neural Computation - Hertz, Krogh, et al. - 1991
Citation Context: ...d. The field of artificial neural networks has provided alternative approaches for solving these problems. It has been established that a large number of applications can benefit from the use of ANNs [1, 9, 7]. Artificial neural networks, which are also referred to as neural computation, network computation, connectionist models, and parallel distributed processing (PDP), are massively parallel computing s...

1577 | Organization of Behavior - Hebb - 1949

1162 | A logical calculus of the ideas immanent in nervous activity - McCulloch, Pitts - 1943
Citation Context: ...e mind originates and how the brain computes. These efforts may be traced back to Aristotle. Yet, the modern era of computational neural modeling began with the pioneering work of McCulloch and Pitts [15] in 1943, who introduced a computational model of the neuron and a logical calculus of neural networks. McCulloch-Pitts' classic paper was widely read at the time (and is still read), generating considera...
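The McCulloch-Pitts model mentioned here is simple enough to state directly: binary inputs, fixed weights, and a hard threshold. The sketch below (the names are ours, not from the paper) wires single threshold units into the classic logic gates:

```python
def mcp_neuron(weights, threshold):
    """McCulloch-Pitts unit: fires (outputs 1) iff the weighted sum of
    its binary inputs reaches the threshold."""
    def fire(*inputs):
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= threshold else 0
    return fire

# Classic logic gates realized as single threshold units:
AND = mcp_neuron([1, 1], 2)
OR  = mcp_neuron([1, 1], 1)
NOT = mcp_neuron([-1], 0)
```

Because each unit computes a logical function, networks of such units form a logical calculus, which is the paper's central observation.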

1143 | The perceptron: A probabilistic model for information storage and organization in the brain - Rosenblatt - 1958
Citation Context: ...er class if y = 0. The linear equation ∑_{j=1}^{n} w_j x_j - θ = 0 defines the decision boundary (a hyperplane in the n-dimensional input space) which divides the space into two halves. Rosenblatt [20] developed a learning procedure to determine the weights and threshold in a perceptron, given a set of training patterns. The perceptron learning procedure can be described as follows. 1. Initialize t...
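The procedure the context begins to enumerate can be sketched as below, assuming the standard error-correction form of the perceptron rule; the function names and the folding of the threshold θ into a trainable bias weight are our choices, not the tutorial's notation:

```python
import numpy as np

def perceptron_train(X, y, lr=1.0, epochs=100):
    """Rosenblatt-style learning: cycle through the training patterns,
    threshold the weighted sum, and correct the weights only on errors."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # constant input plays the role of -theta
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, t in zip(Xb, y):
            pred = 1 if w @ xi > 0 else 0
            if pred != t:
                w += lr * (t - pred) * xi       # error-driven update
                mistakes += 1
        if mistakes == 0:                       # converged (separable data)
            break
    return w

def perceptron_predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)
```

On linearly separable data (e.g. the AND function) the convergence theorem guarantees this loop terminates with a separating hyperplane.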

750 | Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 1: Foundations - Rumelhart, McClelland - 1986
Citation Context: ...t publications appeared, which changed the course of ANN research. Perhaps more than any other publication, the 1982 paper by Hopfield [10] and the two-volume book by Rumelhart and McClelland in 1986 [21] were the most influential publications. In 1982, Hopfield introduced the idea of an energy function from statistical physics to formulate a new way of understanding the computation of recurrent netwo...

733 | An Introduction to Computing with Neural Nets - Lippmann - 1987
Citation Context: ...epeat for the next pattern until the error in the output layer is below a pre-specified threshold or the maximum number of iterations is reached. A geometric interpretation (adopted and modified from [14]) shown in Figure 14 can help us understand the role of hidden units (with the threshold activation function). Each unit in the first hidden layer forms a hyperplane in the pattern space; boundaries ...
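The geometric interpretation in this context can be made concrete with a hand-wired two-layer threshold network for XOR: each hidden unit defines one hyperplane (here, a line in the plane), and the output unit combines the resulting half-spaces. The weights below are illustrative choices of ours, not values from [14]:

```python
def step(x):
    """Hard threshold activation."""
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    """Two hidden threshold units, each a hyperplane in the input space;
    the output unit intersects their half-spaces to realize XOR, which
    no single-layer perceptron can represent."""
    h1 = step(x1 + x2 - 0.5)        # fires when at least one input is 1
    h2 = step(x1 + x2 - 1.5)        # fires only when both inputs are 1
    return step(h1 - 2 * h2 - 0.5)  # h1 AND NOT h2
```

The region where the network outputs 1 is the strip between the two hyperplanes, exactly the kind of composite decision region a single hyperplane cannot form.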

680 | Perceptrons: An introduction to computational geometry - Minsky, Papert - 1969
Citation Context: ...surfaces. ANNs generated a great deal of enthusiasm in the 1960s. It appeared as if such a machine could do any type of computation. However, this enthusiasm was dampened by Minsky and Papert's book [17], which demonstrated the fundamental limitations of the computing power of one-layer perceptrons. They showed that certain rather simple computations, such as the Exclusive-OR (XOR) problem, could not b...

475 | Backpropagation Applied to Handwritten Zip-Code Recognition - Le Cun, Boser, et al. - 1989
Citation Context: ...ork with 50 hidden units is found to produce good generalization ability. Not all OCR systems explicitly extract features from the raw data. A typical example is the network developed by Le Cun et al. [13] for zip-code recognition. The network architecture is shown in Figure 21. A 16 × 16 normalized gray-level image is presented to a feedforward network with three hidden layers. The feature extrac...

191 | Pattern Recognition by Self-Organizing Neural Networks - Carpenter, Grossberg - 1991
Citation Context: ...each learning algorithm can perform. Due to space limitations, we will not discuss some of the other algorithms, including ADALINE, MADALINE [22], linear discriminant analysis (see [11]), ART2, ARTMAP [4], Sammon's projection (see [11]), principal component analysis (see [9]), and the RBF learning algorithm (see [7]). Interested readers can further read the corresponding references. Note that in order to ...

146 | Neurocomputing (Foundations of Research) - Anderson, Rosenfeld - 1988
Citation Context: ... mode in which both the visible and hidden neurons are allowed to operate freely. Boltzmann learning is a stochastic learning rule derived from information-theoretic and thermodynamic principles (see [2]). The objective of Boltzmann learning is to adjust the connection weights such that the states of visible units satisfy a particular desired probability distribution. According to the Boltzmann learn...
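Per weight, the Boltzmann rule described here reduces to a difference of pairwise co-activation statistics between the clamped phase and the free-running phase, Δw_ij = η(ρ⁺_ij - ρ⁻_ij). A minimal sketch of just that update follows; estimating ρ⁺ and ρ⁻ by Gibbs sampling is omitted, and all names are illustrative:

```python
import numpy as np

def boltzmann_update(W, rho_clamped, rho_free, lr=0.01):
    """One Boltzmann weight update: dW_ij = lr * (rho+_ij - rho-_ij),
    where rho+ / rho- are mean pairwise co-activations measured with the
    visible units clamped vs. running freely."""
    dW = lr * (rho_clamped - rho_free)
    dW = 0.5 * (dW + dW.T)        # keep the weight matrix symmetric
    np.fill_diagonal(dW, 0.0)     # no self-connections
    return W + dW
```

When the two phases agree (ρ⁺ = ρ⁻), the update vanishes: the visible units then already follow the desired distribution, which is the rule's stopping condition.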

98 | Learning Machines: Foundations of Trainable Pattern Classifying Systems - Nilsson - 1965
Citation Context: ...st proof of the perceptron convergence theorem. In 1960, Widrow and Hoff introduced the least mean square (LMS) algorithm for the Adaline (Adaptive Linear Element). Nilsson's book on machine learning [19] was the best-written exposition of linearly separable patterns in hypersurfaces. ANNs generated a great deal of enthusiasm in the 1960s. It appeared as if such a machine could do any type of computa...
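The Widrow-Hoff LMS rule mentioned here can be sketched as below; unlike the perceptron rule, the error is measured on the linear output before any thresholding, so the update follows the gradient of the squared error. The function name and the bias-folding convention are our choices:

```python
import numpy as np

def lms_train(X, y, lr=0.05, epochs=300):
    """LMS (delta rule) for an Adaline: per pattern, nudge the weights
    in proportion to the error of the *linear* output."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, t in zip(Xb, y):
            err = t - w @ xi                    # linear error, not thresholded
            w += lr * err * xi                  # delta rule
    return w
```

On targets that are exactly linear in the inputs, the weights converge to the generating coefficients for a sufficiently small learning rate.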

70 | Logical versus analogical or symbolic versus connectionist or neat versus scruffy - Minsky - 1991
Citation Context: ...mbolic representation, (ii) searching-based reasoning using rules, logic, and knowledge databases, and (iii) expert-based learning (expert systems). AI takes the top-down strategy to solve problems [16]: begin at the level of commonsense psychology, and hypothesize what processes could solve a problem. If the problem cannot be solved in a single step, break the problem into subproblems. This proced...

43 | Self-Organization and Associative Memory, Third edition - Kohonen - 1989
Citation Context: ...odebook and Voronoi tessellation generated by the unsupervised competitive learning rule may not be the best for pattern classification purposes (see Figure 12(a)). Learning vector quantization (LVQ) [12] is a supervised competitive learning technique which uses pattern class information to adjust the Voronoi vectors slightly, so as to improve classification accuracy. In LVQ, the weight updating rule ...
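The weight-updating rule the context begins to describe can be sketched as a single step of LVQ1, one standard variant: the nearest codebook (Voronoi) vector moves toward the input if the class labels agree, and away from it if they disagree. The function name and learning rate are illustrative:

```python
import numpy as np

def lvq1_step(prototypes, labels, x, x_label, lr=0.1):
    """One LVQ1 update on the codebook: attract the winning prototype
    when its class matches the input's, repel it otherwise."""
    d = np.linalg.norm(prototypes - x, axis=1)
    w = int(np.argmin(d))                        # winning prototype
    sign = 1.0 if labels[w] == x_label else -1.0
    prototypes[w] += sign * lr * (x - prototypes[w])
    return w
```

Repeated over a labeled training set with a decaying learning rate, these small adjustments shift the Voronoi boundaries toward the Bayes decision boundary.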

33 | A Comparative Study of Different Classifiers for Handprinted Character Recognition - Mohiuddin, Mao - 1994
Citation Context: ... spline curve approximation, and Fourier descriptors. There is no clear evidence as to which feature set is best for a given application. Figure 20 shows a typical scheme for extracting zone features [18]: contour direction and bending points. Contour direction features are generated by dividing the binary image array into rectangular and diagonal zones and computing histograms of chain codes in these...
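A minimal sketch of the chain-code zone features described here, assuming the character contour has already been extracted as an ordered list of (row, col) pixels; contour following itself is omitted, only rectangular zones are handled, and the grid size and names are our choices rather than the scheme of [18]:

```python
import numpy as np

# 8-direction chain codes keyed by the (dx, dy) of a unit contour step
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def zone_chaincode_histograms(contour, shape, zones=(4, 4)):
    """Walk an ordered contour, convert each step to an 8-direction
    chain code, and accumulate one code histogram per rectangular zone
    of the image; the flattened histograms form the feature vector."""
    zr, zc = zones
    h, w = shape
    hist = np.zeros((zr, zc, 8))
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        code = DIRS[(c1 - c0, r1 - r0)]
        zi = min(r0 * zr // h, zr - 1)           # zone of the step's start
        zj = min(c0 * zc // w, zc - 1)
        hist[zi, zj, code] += 1
    return hist.ravel()
```

Histogramming per zone keeps the features tolerant of small intra-class shape variations while still recording where on the character each stroke direction occurs.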

20 | Computing with structured neural networks - Feldman, Fanty, et al. - 1988
Citation Context: ...cannot take more than about one hundred serial stages. In other words, the brain runs parallel programs that are about 100 steps long for such perceptual tasks. This is known as the hundred step rule [6]. The same timing considerations show that the amount of information sent from one neuron to another must be very small (a few bits). This implies that critical information is not transmitted directly...

16 | Neural Networks and Pattern Recognition - Jain, Mao - 1994
Citation Context: ...rn recognition system involves the following three main steps: (i) data acquisition and preprocessing, (ii) representation or feature extraction, and (iii) decision making or clustering. Jain and Mao [11] have addressed a number of common links between ANNs and statistical pattern recognition (SPR). There is a close correspondence between some of the popular ANN models and traditional pattern recognit...

3 | Character segmentation in document OCR: Progress and hope - Casey
Citation Context: ...lt for machine-printed text when techniques such as "kerning" are employed. Noise could cause otherwise separated characters to be touching. Various techniques can be used to split composite patterns [5]. One effective technique is to break the composite pattern into smaller patterns (over-segmentation) and find the correct character segmentation points using the output of a pattern classifier. Figure ...

2 | The first census optical character recognition system conference - Wilkinson, Geist, et al. (eds.)
Citation Context: ...tion) and find the correct character segmentation points using the output of a pattern classifier. Figure 19 shows the size-normalized character bitmaps of a sample set from the NIST character database [23]. We can see substantial intra-class variations. The goal of feature extraction is to extract the most relevant measurements from the sensed data, so as to minimize the within-class variability while ...