Greedy layer-wise training of deep networks
In NIPS, 2007
Cited by 105 (18 self)
Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
Intrinsically motivated learning of hierarchical collections of skills
2004
Cited by 80 (15 self)
Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. In this paper we present initial results from a computational study of intrinsically motivated learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy. At the core of the model are recent theoretical and algorithmic advances in computational reinforcement learning, specifically, new concepts related to skills and new learning algorithms for learning with skill hierarchies.
Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
Journal of Machine Learning Research, 2006
Cited by 45 (8 self)
This paper introduces a novel spectral framework for solving Markov decision processes (MDPs) by jointly learning representations and optimal policies. The major components of the framework described in this paper include: (i) a general scheme for constructing representations or basis functions by diagonalizing symmetric diffusion operators; (ii) a specific instantiation of this approach where global basis functions called proto-value functions (PVFs) are formed using the eigenvectors of the graph Laplacian on an undirected graph formed from state transitions induced by the MDP; (iii) a three-phased procedure called representation policy iteration (RPI), comprising a sample collection phase, a representation learning phase that constructs basis functions from samples, and a final parameter estimation phase that determines an (approximately) optimal policy within the (linear) subspace spanned by the (current) basis functions; (iv) a specific instantiation of the RPI framework using least-squares policy iteration (LSPI) as the parameter estimation method; (v) several strategies for scaling the proposed approach to large discrete and continuous state spaces, including the Nyström extension for out-of-sample interpolation of eigenfunctions, and the use of Kronecker sum factorization to construct compact eigenfunctions in product spaces such as factored MDPs; and (vi) a series of illustrative discrete and continuous control tasks, which both illustrate the concepts and provide a benchmark for evaluating the proposed approach. Many challenges remain to be addressed in scaling the proposed framework to large MDPs, and several elaborations of the proposed framework are briefly summarized at the end.
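The construction in item (ii) can be illustrated with a minimal sketch. It assumes the combinatorial Laplacian L = D - A (the abstract does not say which Laplacian variant is used) and a hypothetical 10-state chain as the state graph. The function names `proto_value_functions` and `chain_adjacency` are illustrative and do not come from the paper.

```python
import numpy as np

def chain_adjacency(n):
    """Adjacency matrix of an undirected chain of n states (hypothetical toy state graph)."""
    a = np.zeros((n, n))
    idx = np.arange(n - 1)
    a[idx, idx + 1] = 1.0  # edge from state i to state i + 1
    a[idx + 1, idx] = 1.0  # symmetric edge, so the graph is undirected
    return a

def proto_value_functions(adjacency, k):
    """Return the k smallest eigenvalues and eigenvectors of L = D - A.

    Assumption: the combinatorial Laplacian is used here for simplicity.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[:k], eigenvectors[:, :k]

# Columns of `basis` are candidate basis functions over the 10 states.
values, basis = proto_value_functions(chain_adjacency(10), k=3)
```

Each column of `basis` assigns one number per state, and a value function would then be approximated as a linear combination of these columns. On a connected graph the first column is constant, and later columns vary progressively faster along the chain.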
Learning teleoreactive logic programs from problem solving
Proceedings of the Fifteenth International Conference on Inductive Logic Programming, 2005
Cited by 15 (7 self)
In this paper, we focus on the problem of learning reactive skills for use by physical agents. We propose a new representation for such procedures, teleoreactive logic programs, along with an interpreter that utilizes them to achieve goals. After this, we describe a learning method that acquires these structures in a cumulative manner through problem solving. We report experiments in three domains that involve multiple levels of skilled behavior. We also review related work and discuss directions for future research.
Learning preconditions for planning from plan traces and HTN structure
Computational Intelligence, 2005
Cited by 9 (2 self)
A great challenge in developing planning systems for practical applications is the difficulty of acquiring the domain information needed to guide such systems. This paper describes a way to learn some of that knowledge. More specifically, the following points are discussed. (1) We introduce a theoretical basis for formally defining algorithms that learn preconditions for Hierarchical Task Network (HTN) methods. (2) We describe Candidate Elimination Method Learner (CaMeL), a supervised, eager, and incremental learning process for preconditions of HTN methods. We state and prove theorems about CaMeL’s soundness, completeness, and convergence properties. (3) We present empirical results about CaMeL’s convergence under various conditions. Among other things, CaMeL converges the fastest on the preconditions of the HTN methods that are needed the most often. Thus CaMeL’s output can be useful even before it has fully converged.
On-Line Cumulative Learning of Hierarchical Sparse n-Grams
Proceedings of the Third International Conference on Development and Learning, 2004
Cited by 7 (0 self)
We present a system for on-line, cumulative learning of hierarchical collections of frequent patterns from unsegmented data streams. Such learning is critical for long-lived intelligent agents in complex worlds. Learned patterns enable prediction of unseen data and serve as building blocks for higher-level knowledge representation. We introduce a novel sparse n-gram model that, unlike pruned n-grams, learns on-line by stochastic search for frequent n-tuple patterns. Adding patterns as data arrives complicates probability calculations. We discuss an EM approach to this problem and introduce hierarchical sparse n-grams, a model that uses a better solution based on a new method for combining information across levels. A second new method for combining information from multiple granularities (n-gram widths) enables these models to more effectively search for frequent patterns (an on-line, stochastic analog of pruning in association rule mining). The result is an example of a rare combination: unsupervised, on-line, cumulative, structure learning. Unlike prediction suffix tree (PST) mixtures, the model learns with no size bound but using less space than the data. It does not repeatedly iterate over data (unlike MaxEnt feature construction). It discovers repeated structure on-line and (unlike PSTs) uses this to learn larger patterns. The type of repeated structure is limited (e.g., compared to hierarchical HMMs) but still useful, and these are important first steps towards learning repeated structure in more expressive representations, which has seen little progress especially in unsupervised, on-line contexts.
Learning representation and control in Markov decision processes: New frontiers
Foundations and Trends in Machine Learning, 2009
Structural Abstraction Experiments in Reinforcement Learning
Cited by 5 (0 self)
A challenge in applying reinforcement learning to large problems is how to manage the explosive increase in storage and time complexity. This is especially problematic in multi-agent systems, where the state space grows exponentially in the number of agents. Function approximation based on simple supervised learning is unlikely to scale to complex domains on its own, but structural abstraction that exploits system properties and problem representations shows more promise. In this paper, we investigate several classes of known abstractions: 1) symmetry, 2) decomposition into multiple agents, 3) hierarchical decomposition, and 4) sequential execution. We compare memory requirements, learning time, and solution quality empirically in two problem variations. Our results indicate that the most effective solutions come from combinations of structural abstractions, and encourage development of methods for automatic discovery in novel problem formulations.
Acquisition of Hierarchical Reactive Skills in a Unified Cognitive Architecture
Cited by 4 (2 self)
In this paper, we review Icarus, a cognitive architecture that utilizes hierarchical skills and concepts for reactive execution in physical environments. In addition, we present two extensions to the framework. The first involves the incorporation of means-ends analysis, which lets the system compose known skills to solve novel problems. The second involves the storage of new skills that are based on successful means-ends traces. We report experimental studies of these mechanisms on three distinct domains. Our results suggest that the two methods interact to acquire useful skill hierarchies that generalize well and that reduce the effort required to handle new tasks. We conclude with a discussion of related work on learning and prospects for additional research, including extending the framework to cover developmental phenomena. Key words: incremental learning, cognitive architecture, reactive control, problem solving, hierarchical skills
Scalable knowledge acquisition through memory organization
Helsinki University of Technology, 2005
Cited by 3 (2 self)
Memory organization plays a critical role in knowledge acquisition. An agent must select a small subset of existing knowledge to serve as the basis for new learning; otherwise each problem becomes more complex than the previous. Selecting this subset remains a challenge, however. We propose that existing knowledge be organized in order for a learning agent to achieve its full potential. The SCALE algorithm is presented as a method for knowledge acquisition and organization, and is used to demonstrate both the computational and training benefits of memory organization.

