Improving the Convergence of the Backpropagation Algorithm Using Learning Rate Adaptation Methods
1999
Cited by 19 (13 self)
This article focuses on gradient-based backpropagation algorithms that use either a common adaptive learning rate for all weights or an individual adaptive learning rate for each weight and apply the Goldstein/Armijo line search. The learning-rate adaptation is based on descent techniques and estimates of the local Lipschitz constant that are obtained without additional error function and gradient evaluations. The proposed algorithms improve the backpropagation training in terms of both convergence rate and convergence characteristics, such as stable learning and robustness to oscillations. Simulations are conducted to compare and evaluate the convergence behavior of these gradient-based training algorithms with several popular training methods.
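The idea of setting the learning rate from a local Lipschitz estimate and then checking an Armijo-type sufficient-decrease condition can be sketched as follows. This is an illustrative sketch, not the authors' algorithm: the function name `lipschitz_armijo_descent`, the step rule `eta = 1 / (2 * Lambda)`, the constant `c`, and the halving factor are all assumptions made for the example. The Lipschitz estimate reuses the stored previous weights and gradient, so it needs no extra gradient evaluations; the Armijo check does evaluate the error function.

```python
import math

def lipschitz_armijo_descent(f, grad, w0, eta0=0.1, c=1e-4, iters=200, tol=1e-10):
    """Gradient descent with a common adaptive learning rate (illustrative).

    The rate is set from a local Lipschitz estimate
        Lambda = ||g_k - g_{k-1}|| / ||w_k - w_{k-1}||
    as eta = 1 / (2 * Lambda) (an assumed rule), then halved until an
    Armijo-type sufficient-decrease condition holds.
    """
    w = list(w0)
    g = grad(w)
    eta = eta0
    prev_w = prev_g = None
    for _ in range(iters):
        gnorm2 = sum(gi * gi for gi in g)
        if gnorm2 < tol:
            break
        if prev_w is not None:
            dw = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, prev_w)))
            dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g, prev_g)))
            if dw > 0 and dg > 0:
                eta = dw / (2.0 * dg)  # equals 1 / (2 * Lambda)
        fw = f(w)
        # Armijo backtracking: shrink eta until the error decreases enough.
        for _ in range(60):
            trial = [wi - eta * gi for wi, gi in zip(w, g)]
            if f(trial) <= fw - c * eta * gnorm2:
                break
            eta *= 0.5
        prev_w, prev_g = w, g
        w = [wi - eta * gi for wi, gi in zip(w, g)]
        g = grad(w)
    return w

# Toy error function with unequal curvature along the two weights.
def toy_error(w):
    return (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2

def toy_grad(w):
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]

w_star = lipschitz_armijo_descent(toy_error, toy_grad, [0.0, 0.0])
```

An individual rate per weight, as the abstract also mentions, would apply the same estimate coordinate by coordinate; that variant is not shown here.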
The Joint Manifold Model for Semi-supervised Multi-valued Regression
Cited by 18 (3 self)
Many computer vision tasks may be expressed as the problem of learning a mapping between image space and a parameter space. For example, in human body pose estimation, recent research has directly modelled the mapping from image features (z) to joint angles (θ). Fitting such models requires training data in the form of labelled (z, θ) pairs, from which the conditional densities p(θ|z) are learned. Inference is then simple: given test image features z, the conditional p(θ|z) is immediately computed. However, large amounts of training data are required to fit the models, particularly when the spaces are high dimensional. We show how unlabelled data (samples from the marginal distributions p(z) and p(θ)) may be used to improve fitting. This is valuable because it is often significantly easier to obtain unlabelled than labelled samples. We use a Gaussian process latent variable model to learn the mapping from a shared latent low-dimensional manifold to the feature and parameter spaces. This extends existing approaches to (a) use unlabelled data, and (b) represent one-to-many mappings. Experiments on synthetic and real problems demonstrate how the use of unlabelled data improves over existing techniques. In our comparisons, we include existing approaches that are explicitly semi-supervised as well as those which implicitly make use of unlabelled examples.
Intelligent Systems: Architectures and Perspectives, Innovations in Intelligent Systems
In Studies in Fuzziness and Soft Computing, Springer Verlag Germany, Chapter, 2002
Cited by 16 (13 self)
The integration of different learning and adaptation techniques to overcome individual limitations and to achieve synergetic effects through the hybridization or fusion of these techniques has, in recent years, contributed to a large number of new intelligent system designs. Computational intelligence is an innovative framework for constructing intelligent hybrid architectures involving Neural Networks (NN), Fuzzy Inference Systems (FIS), Probabilistic Reasoning (PR) and derivative-free optimization techniques such as Evolutionary Computation (EC). Most of these hybridization approaches, however, follow an ad hoc design methodology, justified by success in certain application domains. Due to the lack of a common framework, it often remains difficult to compare the various hybrid systems conceptually and to evaluate their performance comparatively. This chapter introduces the different generic architectures for integrating intelligent systems. The design aspects and perspectives of different hybrid architectures like NN-FIS, EC-FIS, EC-NN, FIS-PR and NN-FIS-EC systems are presented. Some conclusions are also provided towards the end.
Augmented statistical models for speech recognition
In Proc. ICASSP, 2006
Cited by 16 (9 self)
Recently there has been significant interest in developing new acoustic models for speech recognition. One such model that allows complex dependencies to be represented is the augmented statistical model. This incorporates additional dependencies by constructing a local exponential expansion of a standard HMM. Unfortunately, the resulting model often has an intractable normalisation term, rendering training difficult for all but binary classification tasks. In this paper, conditional augmented (C-Aug) models are proposed as an attractive alternative. Instead of modelling utterance likelihoods and inferring decision boundaries, C-Aug models directly model the posterior probability of class labels, conditioned on the utterance. The resulting model is easy to normalise and can be trained using conditional maximum likelihood estimation. In addition, because the model is convex, the optimisation converges to a global maximum.
Intrusion Detection Using Ensemble of Soft Computing Paradigms
International Conference on Intelligent Systems Design and Applications, 2004
Cited by 16 (12 self)
Soft computing techniques are increasingly being used for problem solving. This paper addresses using an ensemble approach of different soft computing techniques for intrusion detection. Due to increasing incidents of cyber attacks, building effective intrusion detection systems (IDSs) is essential for protecting information systems security, and yet it remains an elusive goal and a great challenge. Two classes of soft computing techniques are studied: Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). We show that an ensemble of ANNs and SVMs is superior to individual approaches for intrusion detection in terms of classification accuracy.
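The abstract does not say how the ensemble combines its members, so the sketch below assumes one simple rule, plain majority voting over per-connection labels. The function name `majority_vote`, the label strings, and the three hypothetical base classifiers are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label predictions from several classifiers by majority vote.

    predictions: list of equal-length label lists, one list per classifier.
    Ties are broken in favour of the earliest classifier in the list.
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must label the same samples")
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label (in classifier order) that reaches the top count.
        combined.append(next(l for l in labels if counts[l] == top))
    return combined

# Hypothetical outputs of three base classifiers on four connections.
ann_a = ["normal", "attack", "attack", "normal"]
svm_a = ["normal", "normal", "attack", "attack"]
ann_b = ["attack", "attack", "attack", "normal"]
ensemble_labels = majority_vote([ann_a, svm_a, ann_b])
```

With only two members, majority voting reduces to the tie-break rule, so a practical ensemble would more likely use several trained models or weighted scores.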
Global Search Methods For Solving Nonlinear Optimization Problems
1997
Cited by 15 (1 self)
... these new methods, we develop a prototype, called Novel (Nonlinear Optimization Via External Lead), that solves nonlinear constrained and unconstrained problems in a unified framework. We show experimental results in applying Novel to solve nonlinear optimization problems, including (a) the learning of feedforward neural networks, (b) the design of quadrature-mirror-filter digital filter banks, (c) the satisfiability problem, (d) the maximum satisfiability problem, and (e) the design of multiplierless quadrature-mirror-filter digital filter banks. Our method achieves better solutions than existing methods, or achieves solutions of the same quality but at a lower cost.
An Optimization Perspective on Kernel Partial Least Squares Regression
Advances in Learning Theory: Methods, Models and Applications, 2003
Cited by 13 (4 self)
This work provides a novel derivation based on optimization for the partial least squares (PLS) algorithm for linear regression and the kernel partial least squares (K-PLS) algorithm for nonlinear regression. This derivation makes the PLS algorithm, popularly and successfully used for chemometrics applications, more accessible to machine learning researchers. The work introduces Direct K-PLS, a novel way to kernelize PLS based on direct factorization of the kernel matrix. Computational results and discussion illustrate the relative merits of K-PLS and Direct K-PLS versus closely related kernel methods such as support vector machines and kernel ridge regression.
* This work was supported by NSF grant number IIS-9979860. Many thanks to Roman Rosipal, Nello Cristianini, and Johan Suykens for many helpful discussions on PLS and kernel methods, Sean Ekans from Concurrent Pharmaceutical for providing molecule descriptions for the Albumin data set, Curt Breneman and N. Sukumar for generating descriptors for the Albumin data, and Tony Van Gestel for an efficient Gaussian kernel
Hybrid Intelligent Systems for Stock Market Analysis
Computational Science, Springer-Verlag Germany, Vassil N. Alexandrov et al. (Eds.), ISBN, 2001
Cited by 12 (5 self)
The use of intelligent systems for stock market predictions has been widely established. This paper deals with the application of hybridized soft computing techniques for automated stock market forecasting and trend analysis. We make use of a neural network for one-day-ahead stock forecasting and a neuro-fuzzy system for analyzing the trend of the predicted stock values. To demonstrate the proposed technique, we considered the popular Nasdaq-100 index of the Nasdaq Stock Market(SM). We analyzed 24 months of stock data for the Nasdaq-100 main index as well as six of the companies listed in the Nasdaq-100 index. Input data were preprocessed using principal component analysis and fed to an artificial neural network for stock forecasting. The predicted stock values are further fed to a neuro-fuzzy system to analyze the trend of the market. The forecasting and trend prediction results using the proposed hybrid system are promising and certainly warrant further research and analysis.
A faster iterative scaling algorithm for conditional exponential model
In The 20th International Conference on Machine Learning, 2003
Cited by 11 (0 self)
The conditional exponential model has been one of the most widely used conditional models in the machine learning field, and improved iterative scaling (IIS) has been one of the major algorithms for finding the optimal parameters for the conditional exponential model. In this paper, we propose a faster iterative algorithm named FIS that is able to find the optimal parameters faster than the IIS algorithm. The theoretical analysis shows that the proposed algorithm yields a tighter bound than the traditional IIS algorithm. Empirical studies on text classification over three different datasets show that the new iterative scaling algorithm converges substantially faster than both the IIS algorithm and the conjugate gradient algorithm (CG). Furthermore, we examine the quality of the optimal parameters found by each learning algorithm in the case of incomplete training. Experiments show that, when only a limited amount of computation is allowed (e.g. no convergence is achieved), the new algorithm FIS is able to obtain lower testing errors than both the IIS method and the CG method.
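For readers unfamiliar with the model being fitted, a conditional exponential model in its standard form assigns p(y|x) proportional to exp(sum_i lambda_i * f_i(x, y)), normalised over the labels. The sketch below only evaluates that distribution for given weights; it does not implement IIS or FIS. The names `conditional_exponential` and `toy_features`, and the two toy feature functions, are illustrative assumptions.

```python
import math

def conditional_exponential(weights, features, x, labels):
    """Return p(y | x) for each label under a conditional exponential model.

    weights:  list of lambda_i
    features: callable (x, y) -> list of f_i(x, y), same length as weights
    """
    scores = {
        y: sum(w * f for w, f in zip(weights, features(x, y)))
        for y in labels
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    m = max(scores.values())
    exps = {y: math.exp(s - m) for y, s in scores.items()}
    z = sum(exps.values())  # normaliser Z(x)
    return {y: e / z for y, e in exps.items()}

# Toy text-classification features (hypothetical): word/label co-occurrence.
def toy_features(x, y):
    return [
        1.0 if ("goal" in x and y == "sports") else 0.0,
        1.0 if ("vote" in x and y == "politics") else 0.0,
    ]

probs = conditional_exponential(
    [2.0, 2.0], toy_features, ["goal", "match"], ["sports", "politics"]
)
```

Iterative scaling methods such as IIS adjust the `weights` so that expected feature counts under this distribution match the empirical counts in the training data.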
Financial Forecasting through Unsupervised Clustering and Evolutionary Trained Neural Networks
Proceedings of the Congress on Evolutionary Computation (CEC, 2003
Cited by 9 (5 self)
This paper presents a time series forecasting methodology and applies it to generate one-step-ahead predictions for the daily foreign exchange spot rates. The methodology draws from the disciplines of chaotic time series analysis, clustering, artificial neural networks and evolutionary computation. In brief, clustering is applied to identify neighborhoods in the reconstructed state space of the system, and subsequently neural networks are trained to model the dynamics of each neighborhood separately. The results obtained through this approach are promising.
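The "reconstructed state space" mentioned above is commonly built by time-delay embedding, where each state vector collects lagged values of the series and the next value serves as the one-step-ahead target. The sketch below shows only that first stage under this assumption; the abstract does not give the embedding parameters, and the function name `delay_embed` and the sample rates are hypothetical. The clustering and per-cluster network training stages are not shown.

```python
def delay_embed(series, dim, lag=1):
    """Build delay vectors and one-step-ahead targets from a scalar series.

    Each input is [x_t, x_{t+lag}, ..., x_{t+(dim-1)*lag}] and its target
    is the value one step after the last component.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    span = (dim - 1) * lag
    inputs, targets = [], []
    for t in range(len(series) - span - 1):
        inputs.append([series[t + k * lag] for k in range(dim)])
        targets.append(series[t + span + 1])
    return inputs, targets

# Hypothetical daily spot rates, for illustration only.
rates = [1.10, 1.12, 1.11, 1.15, 1.14, 1.16]
states, next_values = delay_embed(rates, dim=3, lag=1)
```

Each row of `states` could then be assigned to a cluster, with a separate predictor fitted on the rows and targets of that cluster.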

