Results 1 - 10
of
106
Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines
- ADVANCES IN KERNEL METHODS - SUPPORT VECTOR LEARNING
, 1998
"... This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possi ..."
Abstract
-
Cited by 206 (3 self)
- Add to MetaCart
This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while the standard chunking SVM algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation, hence SMO is fastest for linear SVMs and sparse data sets. On realworld sparse data sets, SMO can be more than 1000 times...
Constructive Incremental Learning from Only Local Information
, 1998
"... ... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields. ..."
Abstract
-
Cited by 126 (35 self)
- Add to MetaCart
... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
The neural basis of cognitive development: A constructivist manifesto
- Behavioral and Brain Sciences
, 1997
"... Quartz, S. & Sejnowski, T.J. (1997). The neural basis of cognitive development: A constructivist manifesto. ..."
Abstract
-
Cited by 106 (0 self)
- Add to MetaCart
Quartz, S. & Sejnowski, T.J. (1997). The neural basis of cognitive development: A constructivist manifesto.
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
- IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract
-
Cited by 47 (2 self)
- Add to MetaCart
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented. Keywords--- Constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascade-correlation, resource-allocating network, group method of data handling. I. Introduction A. Problems with Fixed Size Networks I N recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
On-line EM Algorithm for the Normalized Gaussian Network
, 1999
"... A Normalized Gaussian Network (NGnet) (Moody and Darken 1989) is a network of local linear regression units. The model softly partitions the input space by normalized Gaussian functions and each local unit linearly approximates the output within the partition. In this article, we propose a new on ..."
Abstract
-
Cited by 45 (6 self)
- Add to MetaCart
A Normalized Gaussian Network (NGnet) (Moody and Darken 1989) is a network of local linear regression units. The model softly partitions the input space by normalized Gaussian functions and each local unit linearly approximates the output within the partition. In this article, we propose a new on-line EM algorithm for the NGnet, which is derived from the batch EM algorithm (Xu, Jordan and Hinton 1995) by introducing a discount factor. We show that the on-line EM algorithm is equivalent to the batch EM algorithm if a specific scheduling of the discount factor is employed. In addition, we show that the on-line EM algorithm can be considered as a stochastic approximation method to find the maximum likelihood estimator. A new regularization method is proposed in order to deal with a singular input distribution. In order to manage dynamic environments, where the input-output distribution of data changes over time, unit manipulation mechanisms such as unit production, unit deletion...
Problem Solving With Reinforcement Learning
, 1995
"... This dissertation is submitted for consideration for the dwree of Doctor' of Philosophy at the Uziver'sity of Cambr'idge Summary This thesis is concerned with practical issues surrounding the application of reinforcement lear'ning techniques to tasks that take place in high dimensional continuous ..."
Abstract
-
Cited by 42 (0 self)
- Add to MetaCart
This dissertation is submitted for consideration for the dwree of Doctor' of Philosophy at the Uziver'sity of Cambr'idge Summary This thesis is concerned with practical issues surrounding the application of reinforcement lear'ning techniques to tasks that take place in high dimensional continuous state-space environments. In particular, the extension of on-line updating methods is considered, where the term implies systems that learn as each experience arrives, rather than storing the experiences for use in a separate off-line learning phase. Firstly, the use of alternative update rules in place of standard Q-learning (Watkins 1989) is examined to provide faster convergence rates. Secondly, the use of multi-layer perceptton (MLP) neural networks (Rumelhart, Hinton and Williams 1986) is investigated to provide suitable generalising function approximators. Finally, consideration is given to the combination of Adaptive Heuristic Critic (AHC) methods and Q-learning to produce systems combining the benefits of real-valued actions and discrete switching
Representation, Similarity, and the Chorus of Prototypes
- Minds and Machines
, 1995
"... It is proposed to conceive of representation as an emergent phenomenon that is supervenient on patterns of activity of coarsely tuned and highly redundant feature detectors. The computational underpinnings of the outlined theory of representation are (1) the properties of collections of overlappi ..."
Abstract
-
Cited by 38 (8 self)
- Add to MetaCart
It is proposed to conceive of representation as an emergent phenomenon that is supervenient on patterns of activity of coarsely tuned and highly redundant feature detectors. The computational underpinnings of the outlined theory of representation are (1) the properties of collections of overlapping graded receptive fields, as in the biological perceptual systems that exhibit hyperacuity-level performance, and (2) the sufficiency of a set of proximal distances between stimulus representations for the recovery of the corresponding distal contrasts between stimuli, as in multidimensional scaling. The present preliminary study appears to indicate that this concept of representation is computationally viable, and is compatible with psychological and neurobiological data. 1 Introduction A perceptual system confronted with a stimulus must (i) decide whether the stimulus belongs to an already encountered category, and (ii) if necessary, create a new category record for the stimulus a...
Fast learning with incremental RBF Networks
- Neural Processing Letters
, 1994
"... We present a new algorithm for the construction of radial basis function (RBF) networks. The method uses accumulated error information to determine where to insert new units. The diameter of the localized units is chosen based on the mutual distances of the units. To have the distance information al ..."
Abstract
-
Cited by 37 (7 self)
- Add to MetaCart
We present a new algorithm for the construction of radial basis function (RBF) networks. The method uses accumulated error information to determine where to insert new units. The diameter of the localized units is chosen based on the mutual distances of the units. To have the distance information always available, it is held up-to-date by a Hebbian learning rule adapted from the "Neural Gas" algorithm. The new method has several advantages over existing methods and is able to generate small, well-generalizing networks with comparably few sweeps through the training data. 1 Introduction We propose a new algorithm for the construction of radial basis function (RBF) networks. An RBF network consists of a number of localized units (we use Gaussians in the following) positioned in input vector space. The localized units are completely connected by a set of weighted connections to a set of output units. The output units simply do a summation over their inputs. Networks of this type and vari...
Subspace information criterion for model selection
- Neural Computation
, 2001
"... The problem of model selection is considerably important for acquiring higher levels of generalization capability in supervised learning. In this paper, we propose a new criterion for model selection called the subspace information criterion (SIC), which is a generalization of Mallows ’ C L. It is a ..."
Abstract
-
Cited by 27 (16 self)
- Add to MetaCart
The problem of model selection is considerably important for acquiring higher levels of generalization capability in supervised learning. In this paper, we propose a new criterion for model selection called the subspace information criterion (SIC), which is a generalization of Mallows ’ C L. It is assumed that the learning target function belongs to a specified functional Hilbert space and the generalization error is defined as the Hilbert space squared norm of the difference between the learning result function and target function. SIC gives an unbiased estimate of the generalization error so defined. SIC assumes the availability of an unbiased estimate of the target function and the noise covariance matrix, which are generally unknown. A practical calculation method of SIC for least mean squares learning is provided under the assumption that the dimension of the Hilbert space is less than the number of training examples. Finally, computer simulations in two examples show that SIC works well even when the number of training examples is small.

