Caruana, R. (1993). Multi-task learning: a knowledge-based source of inductive bias. In Proceedings of the Tenth Conference on Machine Learning, pp. 41--48, San Mateo, CA, USA. Morgan Kaufmann.
....a related class of problems. Moreover, these approaches do not relate problem solution similarity to the quality of injected solutions and performance. Work on multitask learning suggests, as we do, that it is easier to learn many tasks at once, rather than to learn these same tasks separately [Caruana, 1997b, Caruana et al. 1996, Caruana, 1997a]. Multitask learning (MTL) can be applied to clusters of related tasks in parallel or in sequence and provides an inductive bias that often leads to better generalization performance on the tasks. While MTL addresses generalization, we lean towards improving ....
Caruana, R. (1997b). Multitask learning: A knowledge-based source of inductive bias. Machine Learning, 28:41--75.
....when learning a new function. Transfer in the functional decomposition approach is particularly effective if the complexity of g is much larger than that of the individual h_i's. Examples of functional decomposition with f_i = h_i ∘ g have become popular in recent neural network literature [1, 7, 11, 45, 55, 60, 61]. All these approaches assume that each f_i can be represented by two-layered multilayer perceptrons which share the same first hidden layer (input-to-hidden weights). Examples of the opposite functional decomposition, i.e. f_i = g ∘ h_i, are often found in speaker adaptive speech recognition ....
....incrementally one by one, or all in parallel. Both methodologies have potential advantages and disadvantages. If tasks arrive one after another (see e.g. [45, 55]), incremental approaches do not have to memorize training data, thus consume less memory. However, non-incremental approaches (cf. [1, 7, 11, 60, 61]) might discover commonalities between different learning tasks that are difficult to find if learning tasks are processed sequentially [12]. • Unselective vs. selective transfer. Most approaches weight learning tasks equally when transferring knowledge between them. As shown in a recent study ....
[Article contains additional citation context not shown here]
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In P. E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
....used in our work to speed sensory concept learning is to learn multiple categories using a shared structure. This idea is fairly well known in neural nets, where the tradeoff between using multiple single-output neural nets vs. one multi-output neural net has been well studied. Work by Caruana [3] shows that even when the goal is to learn a single concept, it helps to use a multi-output net to learn related concepts. Figure 6 illustrates the basic idea. In the recycling domain, for example, the robot learns not just the concept of trash can, but also whether the object is near or ....
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In Proceedings of the Tenth International Conference on Machine Learning, pages 41--48. Morgan Kaufmann, 1993.
....of certain types can successfully guide and improve generalization. In his approach, he gives hints to neural networks in the form of additional output units that learn a closely related task. These hints constrain the internal representation developed by the network. In a more general way, Caruana [Caruana, 1993] recently proposed to learn whole collections of tasks in parallel, using a shared internal representation. He conjectures that multi-task learning will make neural network learning algorithms scale to more complex learning tasks. Both approaches to the lifelong learning problem described in this ....
Richard Caruana. Multitask learning: A knowledge-based source of inductive bias. Submitted for publication, 1993.
.... learned from other, similar speakers (e.g. see [Hild and Waibel, 1993]). Other approaches that use related functions to change the bias of an inductive learner can be found in [Utgoff, 1986], [Rendell et al. 1987], [Suddarth and Kergosien, 1990], [Moore et al. 1992], [Sutton, 1992], [Caruana, 1993], [Pratt, 1993] and [Baxter, 1995]. Table 1 summarizes the problem definitions of the standard and the lifelong supervised learning problem. In lifelong supervised learning, the learner is given a collection Y of support sets, in addition to the training set X and the hypothesis space H. This ....
....manages to extract useful invariance information in this domain, even if these invariances defy simple interpretation. 3.4 Using Support Sets as Hints. A related family of methods for the transfer of knowledge across learning tasks is proposed in [Suddarth and Kergosien, 1990], [Pratt, 1993], [Caruana, 1993]. In a nutshell, these approaches develop improved internal representations by considering multiple functions in F (sequentially, or simultaneously). Following these ideas, we trained a single classification network providing the support data as hints for the development of more appropriate ....
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In Paul E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
....for learning f_n. 3.1 Back Propagation. Standard Back Propagation can be used to learn the indicator function f_n, using X as training set. This approach does not employ the support sets, hence is unable to transfer knowledge across learning tasks. 3.2 Learning With Hints. Learning with hints [1, 4, 6, 16] constructs a neural network with n output units, one for each function f_k (k = 1, 2, ..., n). This network is then trained to simultaneously minimize the error on both the support sets {X_k} and the training set X. By doing so, the internal representation of this network is not only ....
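The learning-with-hints setup in the excerpt above (one output unit per function f_k, all sharing one hidden layer, with a single error summed over the support sets and the training set) might be sketched roughly as follows. This is only an illustration under stated assumptions: the names init_net, forward and joint_error, the sigmoid units, and the squared-error measure are hypothetical choices not taken from the cited papers, and the gradient-descent weight updates are omitted.

```python
import math
import random


def init_net(n_in, n_hidden, n_tasks, seed=0):
    """Random weights for one shared hidden layer and one output unit per task."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
             for _ in range(n_tasks)]
    return w_hidden, w_out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(net, x):
    """Return (hidden activations, one output per task) for input vector x."""
    w_hidden, w_out = net
    # The hidden layer is computed once and reused by every task's output unit.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, outputs


def joint_error(net, examples):
    """Summed squared error over examples drawn from all tasks.

    Each example is (x, k, target): x is the input vector, k is the index of
    the function f_k the label belongs to, and target is that label.
    (Assumed encoding; only output unit k is penalized for each example.)
    """
    total = 0.0
    for x, k, target in examples:
        _, outputs = forward(net, x)
        total += 0.5 * (outputs[k] - target) ** 2
    return total
```

Because every task's error depends on the same hidden-layer weights, minimizing joint_error pushes the shared representation to serve all n functions at once, which is the effect the excerpt describes.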
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In P. E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
.... learned from other, similar speakers (e.g. see [Hild and Waibel, 1993]). Other approaches that use related functions to change the bias of an inductive learner can be found in [Utgoff, 1986], [Rendell et al. 1987], [Suddarth and Kergosien, 1990], [Moore et al. 1992], [Sutton, 1992], [Caruana, 1993], and [Pratt, 1993]. Table 1 summarizes the problem definitions of standard supervised learning and the lifelong supervised learning problem. In lifelong supervised learning, the learner is given a collection Y of support sets, in addition to the training set X and the hypothesis space H. This ....
....to extract useful invariance information in this domain, even if these invariances defy simple interpretation. 3.4 Using Support Sets as Hints. A related family of methods for the transfer of knowledge across learning tasks is proposed in [Suddarth and Kergosien, 1990], [Pratt, 1993], [Caruana, 1993]. In a nutshell, these approaches develop improved internal representations by considering multiple functions in F (sequentially, or simultaneously). Following these ideas, we trained a single classification network providing the support data as hints for the development of more appropriate ....
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In Paul E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
....domain-specific knowledge in order to guide the generalization in a knowledgeable way. To date, a variety of strategies is available for the transfer of domain-specific knowledge across multiple learning tasks: • learning internal representations for artificial neural networks, e.g. [1, 5, 9, 22, 23, 25, 26, 29, 27], • learning distance metrics, e.g. [4, 18, 33], • learning to re-represent the data, e.g. [14, 33], • learning invariances in classification, e.g. [6, 16, 31], • learning algorithmic parameters and choosing algorithms, e.g. [24, 30, 35], and 1. For each pair of support tasks n ....
Caruana, R. Multitask Learning: A Knowledge-Based Source of Inductive Bias. in: Proceedings of the Tenth International Conference on Machine Learning, edited by P. E. Utgoff. Morgan Kaufmann, San Mateo, CA, 1993, pp. 41--48.
....all related by the fact that they are parts of the same scene. The data described under Type A was hand-classified, with the location of relevant features being defined in the images using a mouse. A neural network was then trained to predict the door type and the location of the relevant features [1]. 5 On Data Collecting. Our data collection experiences may be distilled into a number of rules, designed to encourage other robot users to collect and use data. 5.1 Encouraging Data Collection. • It should be as simple as possible to collect data, so that all users are encouraged to collect ....
Rich Caruana. Multitask learning: A knowledge-based source of inductive bias. PhD Thesis Proposal, Carnegie Mellon University, May 1994.
....the shoe and the glasses, the best invariance network classified only 53.2% of all image pairs correctly. In order to improve these results, we applied a learning technique that focusses learning by incorporating additional training information, adopted from [Suddarth and Kergosien, 1990], [Caruana, 1993]. Their technique rests on the assumption that in addition to the learning task of interest, some related learning tasks, using the same input representation and the same training data (with different target values), are available. Instead of training on a single task, a network is Figure 3: ....
....functions is Caruana's multi-task learning algorithm. In his approach, multiple, related tasks are trained simultaneously in a single neural network, forcing the networks to share hidden units. He reports that hidden internal representations are developed which lead to improved generalization [Caruana, 1993]. Notice that these results match our findings when training the invariance network. All these approaches develop better internal representations of the data by considering multiple functions in F with the goal of improving generalization. • Spotting relevant features. Another approach, which ....
Richard Caruana. Multitask learning: A knowledge-based source of inductive bias. In Paul E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
.... possibilities (see below). As suggested above, there is also a certain relation between our meta-learning scenario and the notion of transfer of knowledge between learning tasks, as mentioned, though in rather different settings, by e.g. Pratt et al. (1991), Ourston and Mooney (1991), Caruana (1993), and Thrun and Mitchell (1995). While these authors study the effect of cross-category transfer, MetaL(B) can be interpreted as performing cross-context transfer. The effect can be most clearly seen in the Schubert experiment above. In terms of the dynamic selection of predictors, there is some ....
Caruana, R.A. (1993). Multitask Learning: A Knowledge-based Source of Inductive Bias. In Proceedings of the 10th International Conference on Machine Learning (ML-93), Amherst, MA. San Mateo, CA: Morgan Kaufmann.
.... [7]. Others proposed hierarchical approaches, in which the building blocks, once learned, can be applied to multiple tasks [3, 7, 21]. A third way for the transfer of knowledge is concerned with the construction of better internal representations, which improve generalization across multiple tasks [1, 16, 19, 23]. While this list is clearly incomplete, it nevertheless illustrates the impor .... [Figure 6: Learning navigation. Traces of three early and three late episodes ....]
Richard Caruana. Multitask learning: A knowledge-based source of inductive bias. Submitted for publication, 1993.
....of certain types can successfully guide and improve generalization. In his approach, he gives hints to neural networks in the form of additional output units that learn a closely related task. These hints constrain the internal representation developed by the network. In a more general way, Caruana [Caruana, 1993] recently proposed to learn whole collections of tasks in parallel, using a shared internal representation. He conjectures that multi-task learning will make neural network learning algorithms scale to more complex learning tasks. Both approaches to the lifelong learning problem described in this ....
Richard Caruana. Multitask learning: A knowledge-based source of inductive bias. In Paul E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
....the generalization accuracy of an inductive learning algorithm depends on the representation of the data. In the context of neural network learning, several researchers have proposed methods for learning data representations that are tailored towards the built-in bias of artificial neural networks [58, 52, 44, 9, 5]. The basic idea here is the same as in Section 3.3. To re-represent the data, these approaches train a neural network, g : I → I′, which maps input patterns in I to a new space, I′. This new space I′ forms the input space for further, task-specific neural network learning. The ....
....network. Hence, it is possible to use standard Back Propagation to tune the weights of the transformation network g, along with the weights of the respective classification network. While some authors [52, 44] have proposed to process the support sets and the training set sequentially, others [58, 9, 5] are in favor of training g in parallel, using all n tasks simultaneously. Sequential training offers the advantage that not all training data has to be available at all times. However, it faces the potential burden of catastrophic forgetting in Back Propagation, which basically arises from the ....
[Article contains additional citation context not shown here]
Caruana, R. Multitask Learning: A Knowledge-Based Source of Inductive Bias. in: Proceedings of the Tenth International Conference on Machine Learning, edited by P. E. Utgoff. Morgan Kaufmann, San Mateo, CA, 1993, pp. 41--48.
....include that described by Giles and Omlin [Giles and Omlin, 1993], who study how to initialize networks that learn finite state automata with grammatical rules. Berenji [Berenji, 1992] describes how to insert fuzzy logic rules into neural networks for control. Caruana and Suddarth [Caruana, 1993, Suddarth, 1990] both study the benefits of simultaneously learning multiple tasks. 5.2 Adaptive learning. Transfer is also related to research in adaptive learning, in both the symbolic machine learning and neural network research communities. In symbolic ....
Richard A. Caruana. Multitask learning: A knowledge-based source of inductive bias. In Proceedings of the tenth international conference on machine learning, pages 41--48, University of Massachusetts, June 1993. Machine Learning.
.... Functional transfer does not involve the explicit assignment of prior task representation to a new task; rather, it employs the use of implicit pressures from supplemental training examples [Sudd90, AM95], the parallel learning of related tasks constrained to use a common internal representation [Caru93, Caru95, Baxt95a], or the use of historical training information (most commonly the learning rate or gradient of the error surface) to augment the standard weight update equations [Naik92, Naik93, Mitc93, Thru93, Thru94a, Thru94b]. These pressures serve to reduce the effective hypothesis space in which the ....
Richard A. Caruana, Multitask Learning: A Knowledge-Based Source of Inductive Bias, Proceedings of the tenth international conference on machine learning, University of Massachusetts, pp. 41--48, June 1993.
....by using the domain information contained in the training signals of related tasks as an inductive bias [8]. Caruana has shown many potential uses for MTL [7]. This paper explores MTL's effectiveness for software agents. MTL was first studied for back propagation neural networks [6], [4], [1], but Caruana has since introduced a method for incorporating the technique into other learning algorithms such as k-nearest neighbor (kNN) and decision trees [7]. This paper first describes results obtained using the technique he suggested for kNN. Then we will suggest an improvement to the ....
Rich Caruana. Multitask learning: A knowledge-based source of inductive bias. In Proceedings of the 10th International Conference on Machine Learning, pages 41--48, University of Massachusetts, Amherst, 1993.
....in a knowledgeable way. To date, a variety of strategies is available for the transfer of domain-specific knowledge across multiple learning tasks (see [26, 27] for a more detailed survey and comparison): • learning internal representations for artificial neural networks, e.g. [1, 4, 8, 19, 21, 22, 24], • learning distance metrics, e.g. [4, 16, 29], • learning to re-represent the data, e.g. [12, 26, 29], • learning invariances in classification, e.g. [5, 13, 26, 28], • learning algorithmic parameters and choosing algorithms, e.g. [6, 20, 25, 30], and • learning domain ....
R. Caruana. Multitask learning: A knowledge-based source of inductive bias. In P. E. Utgoff, editor, Proceedings of the Tenth International Conference on Machine Learning, pages 41--48, San Mateo, CA, 1993. Morgan Kaufmann.
....on the shared hidden layer. The outputs for Tasks 2-4 are ignored when the net is used to make predictions for Task 1. More complex architectures and algorithms than backprop on a fully connected hidden layer sometimes work better. The benefit of MTL in connectionist nets has been established [4][22][3] and some of the underlying mechanisms elucidated [2]. But how useful is multitask transfer? How often will training data be available for extra tasks that are usefully related to the main task? Will multitask transfer work with learning methods other than neural nets? This paper addresses ....
....of the selected splits. How do we select splits good for multiple tasks? The basic approach is straightforward: compute the information gain of each split for each task individually, combine the gains, and select the split with the best aggregate performance. The MTL TDIDT algorithm presented in [4] combines task gains by averaging them; the selected splits are the ones whose average utility across all tasks is highest. There is a problem with simple averaging. Splits good for Task 1 are not necessarily good for Task 2. Because each split in a decision tree affects all nodes below it, it is ....
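The split-selection rule described in the excerpt above (compute each split's information gain per task, average the gains, pick the split with the highest average) can be illustrated with a minimal sketch. The function names, the boolean-mask encoding of a candidate split, and the use of Shannon entropy in bits are assumptions made for this illustration; the excerpt does not specify these details, and this is not presented as the cited algorithm's actual implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(labels, mask):
    """Information gain of a binary split on one task's labels.

    mask[i] is True if example i goes to the left branch (assumed encoding).
    """
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_split_by_average_gain(splits, task_labels):
    """Pick the candidate split whose gain, averaged over all tasks, is highest.

    splits: dict mapping a split name to its boolean mask over the examples.
    task_labels: one label list per task, all over the same examples.
    """
    def average_gain(mask):
        return sum(info_gain(labels, mask) for labels in task_labels) / len(task_labels)

    return max(splits, key=lambda name: average_gain(splits[name]))
```

With unweighted averaging, every task counts equally, so a split that is useless for the main task can still win if it helps the extra tasks; this is the weakness of simple averaging that the excerpt goes on to raise.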
R. Caruana, "Multitask Learning: A Knowledge-Based Source of Inductive Bias," Proceedings of the 10th International Conference on Machine Learning, pp. 41-48, 1993.
....that develop in the hidden layer for one task to be used by other tasks. Sharing what is learned by different tasks while tasks are trained in parallel is the central idea in multitask learning [Suddarth & Kergosien 1990; Dietterich, Hild & Bakiri 1990, 1995; Suddarth & Holden 1991; Caruana 1993a, 1993b, 1994, 1995; Baxter 1994, 1995, 1996; Caruana & de Sa 1996]. [Figure 1.2: Multitask Backprop (MTL) of four tasks with the same inputs.] Multitask learning is a collection of learning algorithms, analysis methods, and heuristics, not a single learning ....
....the simple, fully connected MTL architectures presented here. One interesting feature of committee MTL architectures is that multiple copies of the main task are used, and this improves performance on the main task. Sometimes this same effect is observed with simpler, fully connected MTL nets, too [Caruana 1993]. [Dietterich & Bakiri 1995] examine a much more sophisticated approach to benefitting from multiple copies of the main task by using multi-bit error-correcting codes as the output representation. 8.9.2 Input Reconstruction (IRE). Pomerleau's ALVINN system used artificial neural nets to learn to ....
[Article contains additional citation context not shown here]
Caruana, R., "Multitask Learning: A Knowledge-Based Source of Inductive Bias," Proceedings of the 10th International Conference on Machine Learning, ML-93, University of Massachusetts, Amherst, 1993, pp. 41-48.
....other tasks. In this application, we use MTL to benefit from the future lab results. The extra lab values are used as extra backprop outputs as shown in Figure 1. The extra outputs bias the shared hidden layer towards representations that better capture important features of the domain. See [2][3][9] for details about MTL and [1] for other ways of using extra outputs to bias learning. The MTL net has 64 hidden units. Table 3 shows the mean performance of ten runs of MTL with rankprop. The bottom row shows the improvement over rankprop .... [Table 3 column headers: Age, Chest Pain, Asthmatic, Diabetic, Heart Murmur ....]
R. Caruana, "Multitask Learning: A Knowledge-Based Source of Inductive Bias," Proceedings of the 10th International Conference on Machine Learning, pp. 41-48, 1993.
....representations that arise in the hidden layer for one task to be used by other tasks. Sharing what is learned by different tasks while tasks are trained in parallel is the central idea in multitask learning [Suddarth & Kergosien 1990; Dietterich, Hild & Bakiri 1990, 1995; Suddarth & Holden 1991; Caruana 1993a, 1993b, 1994, 1995; Baxter 1994, 1995, 1996; Caruana & de Sa 1996]. MTL is an inductive transfer method that uses the domain-specific information contained in the training signals of related tasks. It does this by learning the multiple tasks in parallel while using a shared representation. In ....
....individually, combine the gains, and select the split with the best aggregate performance. As in MTL KNN LCWA, parameters are introduced to control how much emphasis is given to the extra tasks. Weighting the extra tasks this way yields better performance than the simpler approach presented in [Caruana 1993], which combined task gains by averaging them; recursive splitting algorithms often suffer when the data becomes sparse low in the tree, so it is important that early splits are sensitive to performance on the main task. See [Caruana 1997] for more detail about how the parameters can be learned ....
[Article contains additional citation context not shown here]
Caruana, R., "Multitask Learning: A Knowledge-Based Source of Inductive Bias," Proceedings of the 10th International Conference on Machine Learning, ML-93, University of Massachusetts, Amherst, 1993, pp.
Caruana, R. 1997b. Multitask learning: A knowledge-based source of inductive bias. Machine Learning 28:41--75.
Caruana, R. A. (1993), Multitask learning: A knowledge-based source of inductive bias, in "Proceedings of the Tenth International Conference on Machine Learning", Morgan Kaufmann Publishers, Inc., pp. 41--48.