Results 1 - 10 of 2,035
A Practical Bayesian Framework for Backprop Networks. Neural Computation, 1991.
"... A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible: (1) objective comparisons between solutions using alternative network architectures ..."
Cited by 496 (20 self)
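The "objective comparisons" the abstract refers to are rankings of candidate models by Bayesian evidence (marginal likelihood). A minimal sketch of that idea, not MacKay's code: it uses a toy linear-in-parameters model, where the Gaussian evidence is exact rather than a Laplace approximation, and the hyperparameters alpha, beta and the polynomial "architectures" are assumptions of this sketch.

    import numpy as np

    def log_evidence(Phi, y, alpha=1.0, beta=25.0):
        # Log marginal likelihood of a linear-in-parameters model with a
        # Gaussian prior (precision alpha) and Gaussian noise (precision beta).
        N, k = Phi.shape
        A = alpha * np.eye(k) + beta * Phi.T @ Phi      # posterior precision
        m = beta * np.linalg.solve(A, Phi.T @ y)        # posterior mean
        E = beta / 2 * np.sum((y - Phi @ m) ** 2) + alpha / 2 * m @ m
        return (k / 2 * np.log(alpha) + N / 2 * np.log(beta) - E
                - 0.5 * np.linalg.slogdet(A)[1] - N / 2 * np.log(2 * np.pi))

    rng = np.random.default_rng(0)
    x = np.linspace(-1, 1, 30)
    y = np.sin(np.pi * x) + 0.2 * rng.standard_normal(30)
    for degree in (1, 3, 9):                            # candidate "architectures"
        Phi = np.vander(x, degree + 1, increasing=True)
        print(degree, log_evidence(Phi, y))

The model with the highest evidence wins the comparison; no held-out validation set is needed, which is the point of the framework.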
Learning and Evolution in Populations of Backprop Networks, 1993.
"... This paper reports an investigation of the relationship between learning and evolution in populations of backprop networks. The simulation environment, which has been used previously by Parisi, Nolfi and Cecconi (Parisi, Nolfi and Cecconi, 1991), consists of a two dimensional grid with a random samp ..."
Cited by 1 (0 self)
Efficient BackProp, 1998.
"... The convergence of back-propagation learning is analyzed so as to explain common phenomena observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers expl ..."
Cited by 209 (31 self)
"... a network using backprop requires making many seemingly arbitrary choices such as the number ..."
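Two of the paper's tricks are concrete enough to state in a few lines: normalize each input variable to zero mean and unit variance, and use the scaled activation f(x) = 1.7159 tanh(2x/3), chosen so that f(1) = 1 and f(-1) = -1 on normalized inputs. A sketch of both (the function names are mine):

    import numpy as np

    def normalize_inputs(X):
        # Center each input variable and scale to unit variance so the
        # error surface is better conditioned for gradient descent.
        return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

    def scaled_tanh(x):
        # The paper's recommended sigmoid: f(x) = 1.7159 * tanh(2x/3),
        # which satisfies f(1) = 1 and f(-1) = -1.
        return 1.7159 * np.tanh(2.0 * x / 3.0)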
Learning to rank using gradient descent. In ICML, 2005.
"... We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data f ..."
Cited by 510 (17 self)
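The "simple probabilistic cost function" is a cross entropy on pairwise preferences: the modelled probability that item i outranks item j is a logistic function of the score difference. A sketch of the cost for one pair; scalar scores stand in for the neural network outputs the paper uses.

    import numpy as np

    def pairwise_cost(s_i, s_j, p_target):
        # o = s_i - s_j is the score difference; sigma(o) is the modelled
        # probability that i ranks above j. The cross entropy against the
        # target probability simplifies to -p_target * o + log(1 + exp(o)).
        o = s_i - s_j
        return -p_target * o + np.log1p(np.exp(o))

    print(pairwise_cost(2.0, -1.0, 1.0))   # correct order: cost near zero
    print(pairwise_cost(-1.0, 2.0, 1.0))   # inverted order: large cost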
Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 1995.
Cited by 730 (8 self)
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form ⟨x_i, f(x_i)⟩. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that, like the other methods, the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
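The technique itself is compact: assign each class a row of a binary code matrix, train one binary classifier per column, and classify new points by the nearest codeword in Hamming distance. A sketch, with a least-squares linear base learner standing in for the paper's C4.5 and backprop learners, and a random code matrix standing in for the paper's error-correcting constructions:

    import numpy as np

    def train_ecoc(X, y, code):
        # code: (n_classes, n_bits) matrix over {-1, +1}; column b defines
        # the b-th binary subproblem. One linear classifier per column,
        # all fit at once by least squares.
        Xb = np.hstack([X, np.ones((len(X), 1))])       # bias column
        targets = code[y]                               # (n_samples, n_bits)
        W, *_ = np.linalg.lstsq(Xb, targets, rcond=None)
        return W

    def predict_ecoc(X, W, code):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        bits = np.sign(Xb @ W)                          # predicted codeword
        # Decode to the class whose codeword is nearest in Hamming distance;
        # the code's row separation is what corrects individual bit errors.
        dist = (bits[:, None, :] != code[None, :, :]).sum(axis=2)
        return dist.argmin(axis=1)

    rng = np.random.default_rng(0)
    code = np.where(rng.random((5, 15)) < 0.5, -1.0, 1.0)  # stand-in code

The paper constructs codes with guaranteed row and column separation rather than sampling them; a long random code is only a rough stand-in for that design.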
Guiding Backprop by Inserting Rules
"... Abstract. We report on an experiment where we inserted symbolic rules into a neural network during the training process. This was done to guide the learning and to help escape local minima. The rules are constructed by analysing the errors made by the network after training. This process can be repe ..."
Cited by 2 (1 self)
Long Short-term Memory, 1995.
"... "Recurrent backprop" for learning to store information over extended time intervals takes too long. The main reason is insufficient, decaying error back flow. We briefly review Hochreiter's 1991 analysis of this problem. Then we overcome it by introducing a novel, efficient method c ..."
Cited by 387 (58 self)
Practical Issues in Temporal Difference Learning. Machine Learning, 1992.
"... This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(lambda) algorithm, can be successfully applied to complex real-world problems. A number of important practical issues are identified and discussed from a general theoretical perspect ..."
Cited by 418 (2 self)
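For reference, the update under discussion is TD(lambda) with eligibility traces. A sketch with a linear value function; in the paper's application to a backprop network, the gradient of the network output takes the place of the feature vector in the trace.

    import numpy as np

    def td_lambda_episode(features, rewards, w, alpha=0.1, gamma=1.0, lam=0.7):
        # features: one vector per visited state, including the terminal
        # state (so len(features) == len(rewards) + 1). V(s) = w . phi(s).
        e = np.zeros_like(w)
        for t, r in enumerate(rewards):
            phi, phi_next = features[t], features[t + 1]
            delta = r + gamma * (w @ phi_next) - w @ phi   # TD error
            e = gamma * lam * e + phi                      # decaying trace
            w = w + alpha * delta * e                      # credit past states
        return w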
Bayesian Model Comparison and Backprop Nets. Advances in Neural Information Processing Systems 4, 1992.
"... The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and ..."
Cited by 23 (0 self)
Forward models: Supervised learning with a distal teacher. Cognitive Science, 1992.
"... Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the "teacher" in supervised lea ..."
Cited by 410 (8 self)
"... supervised learning algorithm that is capable of learning in multi-layer networks."
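The paper's resolution of the distal-teacher problem: first learn a forward model of the environment, then train the controller by propagating the outcome error back through the frozen model, since no target exists for the action itself. A sketch with linear stand-ins for the paper's multilayer networks, using the loss 0.5 * ||y_star - y_hat||^2:

    import numpy as np

    def distal_teacher_step(C, M, x, y_star, lr=0.05):
        # C: controller mapping intention x to action a.
        # M: forward model of the environment, already learned and frozen.
        a = C @ x
        y_hat = M @ a                  # predicted distal outcome
        err = y_star - y_hat           # error lives in outcome space, not action space
        grad_a = -M.T @ err            # backprop the outcome error through the model
        C -= lr * np.outer(grad_a, x)  # gradient step on the controller only
        return C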