Artificial Neural Networks for Document Analysis and Recognition
- IEEE TPAMI, 2003
Cited by 15 (5 self)
Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks, such as pre-processing, layout analysis, character segmentation, word recognition, and signature verification, have also been addressed effectively, with very promising results. This paper surveys the most significant problems in the area of off-line document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is given to the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.
Domain Adaptation via Transfer Component Analysis
Cited by 13 (8 self)
Domain adaptation solves a learning problem in a target domain by utilizing the training data in a different but related source domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we propose to find such a representation through a new learning method, transfer component analysis (TCA), for domain adaptation. TCA tries to learn some transfer components across domains in a Reproducing Kernel Hilbert Space (RKHS) using Maximum Mean Discrepancy (MMD). In the subspace spanned by these transfer components, data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. The main contribution of our work is that we propose a novel feature representation in which to perform domain adaptation via a new parametric kernel using feature extraction methods, which can dramatically reduce the distance between domain distributions by projecting data onto the learned transfer components. Furthermore, our approach can handle large datasets and naturally leads to out-of-sample generalization. The effectiveness and efficiency of our approach are verified by experiments on two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.
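To make the distance mentioned in the abstract above concrete, the following is a minimal sketch of the standard biased empirical estimate of squared MMD between two samples. It is not the TCA algorithm itself, which additionally learns a projection. The RBF kernel, the `gamma` value, the function names, and the toy data are all assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, gamma=1.0):
    """Biased empirical estimate of squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))

# Toy samples (illustrative only): one target close to the source, one far away.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
near = [[0.05, 0.0], [0.0, 0.05], [-0.05, -0.05]]
far = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

print("MMD^2 source vs near:", mmd_squared(source, near))
print("MMD^2 source vs far: ", mmd_squared(source, far))
```

A small value indicates that the two samples look alike under the chosen kernel, which is the property the abstract says holds for domains after projection onto the transfer components.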
Genetic programming for cross-task knowledge sharing
- In Genetic and Evolutionary Computation Conference (GECCO), 2007
Cited by 2 (2 self)
We consider multitask learning of visual concepts within a genetic programming (GP) framework. The proposed method evolves a population of GP individuals, each composed of several GP trees that process visual primitives derived from input images. The two main trees are delegated to solving two different visual tasks and are allowed to share knowledge with each other by calling the remaining GP trees (subfunctions) included in the same individual. The method is applied to the visual learning task of recognizing simple shapes, using the generative approach based on visual primitives introduced in [17]. We compare this approach to a reference method devoid of knowledge sharing, and conclude that in the worst case cross-task learning performs equally well, and in many cases it leads to significant performance improvements in one or both solved tasks.
Generalizing to a zero-data task: a computational chemistry case study
- 2006
Cited by 1 (0 self)
We investigate the problem of learning several tasks simultaneously in order to transfer the acquired knowledge to a completely new task for which no training data are available. Assuming that the tasks share some representation that we can discover efficiently, such a scenario should lead to a better model of the new task, as compared to the model that is learned by only using the knowledge of the new task. We have evaluated several supervised learning algorithms in order to discover shared representations among the tasks defined in a computational chemistry/drug discovery problem. We have cast the problem from a statistical learning point of view and set up the general hypotheses that have to be tested in order to validate the multi-task learning approach. We have then evaluated the performance of the learning algorithms and shown that it is indeed possible to learn a shared representation of the tasks that allows generalization to a new task for which no training data are available. From a theoretical point of view, our contribution also comprises a modification to the Support Vector Machine algorithm, which can produce state-of-the-art results using multi-task learning concepts at its core. From a practical point of view, our contribution is that this algorithm can be readily used by pharmaceutical companies for virtual screening campaigns.
Conditional Graphical Models for Protein Structure Prediction
- 2005
Cited by 1 (0 self)
It is widely believed that protein structures play key roles in determining the functions, activity, stability and subcellular localization of proteins, and the mechanisms of protein-protein interactions in cells. However, it is extremely labor-intensive and sometimes even impossible to experimentally determine the structures of hundreds of thousands of protein sequences. In this thesis, we aim to design computational methods to predict protein structures from sequences. Since protein structures involve many aspects, we focus on predicting the general protein structural topologies (as opposed to specific 3-D coordinates) at different levels, including secondary structures, super-secondary structures and quaternary folds for homogeneous multimers. Specifically, given a protein sequence, our goal is to predict what the secondary structure elements are, how they arrange themselves in three-dimensional space, and how multiple chains associate into complexes. Traditional approaches for protein structure prediction are sequence-based, i.e. searching the database using PSI-BLAST or matching against a hidden Markov model (HMM) profile built from sequences with similar structures. These methods work well for simple conserved structures with strong sequence similarities, but fail when the similarity across proteins is poor and/or there exist long-range interactions,
Multi-Task Code Reuse in Genetic Programming
We propose a method of knowledge reuse between evolutionary processes that solve different optimization tasks. We define the method in the framework of tree-based genetic programming (GP) and implement it as code reuse between GP trees that evolve in parallel in separate populations delegated to particular tasks. The technical means of code reuse is a crossbreeding operator which works very similarly to standard tree-swapping crossover. We consider two variants of this operator, which differ in the way they handle the incompatibility of terminals between the considered problems. In the experimental part we demonstrate that such code reuse is usually beneficial and leads to success rate improvements when solving the common Boolean benchmarks.
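As a rough illustration of what a crossbreeding step between populations might look like, the sketch below grafts a random subtree from a donor tree (evolved for one task) into a receiver tree (evolved for another). The abstract does not say how its two variants resolve terminal incompatibility, so the rule used here, replacing unknown terminals with random terminals of the receiving task, is purely a hypothetical stand-in. The tuple-based tree encoding and all function names are likewise assumptions.

```python
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def remap_terminals(tree, allowed, rng):
    """Hypothetical compatibility rule: swap foreign terminals for allowed ones."""
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(remap_terminals(c, allowed, rng) for c in tree[1:])
    return tree if tree in allowed else rng.choice(allowed)

def crossbreed(receiver, donor, receiver_terminals, rng):
    """Graft a random donor subtree into a random position of the receiver."""
    _, fragment = rng.choice(list(subtrees(donor)))
    path, _ = rng.choice(list(subtrees(receiver)))
    fragment = remap_terminals(fragment, receiver_terminals, rng)
    return replace_at(receiver, path, fragment)

def terminals_of(tree):
    """Collect the set of terminal symbols used in a tree."""
    if isinstance(tree, tuple):
        return set().union(*(terminals_of(c) for c in tree[1:]))
    return {tree}

# Trees are nested tuples: (operator, child, ...); bare strings are terminals.
rng = random.Random(0)
receiver = ("and", ("or", "x0", "x1"), ("not", "x0"))
donor = ("or", ("and", "y0", "y1"), ("not", "y2"))
child = crossbreed(receiver, donor, ["x0", "x1"], rng)
print(child)
```

Because trees are immutable tuples, the receiver and donor are left untouched and the offspring is a new tree, which keeps the two populations independent apart from the grafted code.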

