Supervised Neural Gas with General Similarity Measure
Neural Processing Letters, 2003
Cited by 37 (21 self)
Prototype-based classification offers intuitive and sparse models with excellent generalization ability. However, these models usually crucially depend on the underlying Euclidean metric; moreover, online variants likely suffer from the problem of local optima. We here propose a generalization of learning vector quantization with three additional features: (I) it directly integrates neighborhood cooperation, hence is less affected by local optima; (II) the method can be combined with any differentiable similarity measure, whereby metric parameters such as relevance factors of the input dimensions can automatically be adapted according to the given data; (III) it obeys gradient dynamics and hence shows very robust behavior, and the chosen objective is related to margin optimization.
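Feature (II) above, relevance factors adapted jointly with the prototypes inside a differentiable distance, can be sketched as a GRLVQ-style online update. The learning rates, the exact gradient bookkeeping, and the normalization of the relevance vector below are illustrative assumptions, not the paper's precise scheme:

```python
import numpy as np

def weighted_dist(x, w, lam):
    """Squared distance with per-feature relevance weights lam (lam >= 0, sum 1)."""
    return np.sum(lam * (x - w) ** 2)

def grlvq_step(x, y, protos, labels, lam, lr_w=0.05, lr_l=0.01):
    """One online update of relevance-adaptive LVQ (a GRLVQ-style sketch).

    Moves the closest correct prototype toward x, the closest wrong one away,
    and adapts the relevance factors by a gradient step on the same distances.
    """
    d = np.array([weighted_dist(x, w, lam) for w in protos])
    correct = labels == y
    j = np.where(correct)[0][np.argmin(d[correct])]      # best matching correct prototype
    k = np.where(~correct)[0][np.argmin(d[~correct])]    # best matching wrong prototype
    dp, dm = d[j], d[k]
    denom = (dp + dm) ** 2 + 1e-12
    # derivatives of the GLVQ cost mu = (d+ - d-) / (d+ + d-)
    gp, gm = 2 * dm / denom, 2 * dp / denom
    protos[j] += lr_w * gp * lam * (x - protos[j])        # attract correct prototype
    protos[k] -= lr_w * gm * lam * (x - protos[k])        # repel wrong prototype
    # relevance update: shrink weights of dimensions that increase the cost
    lam -= lr_l * (gp * (x - protos[j]) ** 2 - gm * (x - protos[k]) ** 2)
    lam = np.clip(lam, 0.0, None)
    lam /= lam.sum()
    return protos, lam
```

Iterating this step over a labeled sample adapts both the prototype positions and the per-dimension relevances; irrelevant input dimensions end up with small weights in `lam`.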
Computational Intelligence Methods for Rule-Based Data Understanding
Proceedings of the IEEE, 2004
Cited by 31 (3 self)
... This paper is focused on the extraction and use of logical rules for data understanding. All aspects of rule generation, optimization, and application are described, including the problem of finding good symbolic descriptors for continuous data, trade-offs between accuracy and simplicity at the rule-extraction stage, and trade-offs between rejection and error level at the rule-optimization stage. Stability of rule-based description, calculation of probabilities from rules, and other related issues are also discussed. Major approaches to the extraction of logical rules based on neural networks, decision trees, machine learning, and statistical methods are introduced. Optimization and application issues for sets of logical rules are described. Applications of such methods to benchmark and real-life problems are reported and illustrated with simple logical rules for many datasets. Challenges and new directions for research are outlined.
Uncertainty of Data, Fuzzy Membership Functions, and Multi-Layer Perceptrons
2004
Cited by 20 (7 self)
The probability that a crisp logical rule applied to imprecise input data is true may be computed using a fuzzy membership function. All reasonable assumptions about input uncertainty distributions lead to membership functions of sigmoidal shape. Convolution of several inputs with uniform uncertainty leads to bell-shaped, Gaussian-like uncertainty functions. Relations between input uncertainties and fuzzy rules are systematically explored and several new types of membership functions discovered. Multilayer perceptron (MLP) networks are shown to be a particular implementation of hierarchical sets of fuzzy threshold logic rules based on sigmoidal membership functions. They are equivalent to crisp logical networks applied to input data with uncertainty. Leaving fuzziness on the input side makes the networks or the rule systems easier to understand. Practical applications of these ideas are presented for the analysis of questionnaire data and gene expression data.
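The claim that a crisp rule plus input uncertainty yields a sigmoidal membership function can be checked numerically. The threshold rule, the uniform noise width, and the Monte Carlo comparison below are a minimal illustration under assumed parameters, not code from the paper:

```python
import numpy as np

def crisp_rule(x, theta=0.0):
    """A crisp logical rule: true when x exceeds the threshold theta."""
    return x > theta

def membership_uniform(x, theta=0.0, a=1.0):
    """P(x + u > theta) for noise u ~ Uniform(-a, a): a piecewise-linear sigmoid."""
    return np.clip((x - theta + a) / (2 * a), 0.0, 1.0)

# Monte Carlo: averaging the crisp rule over the input uncertainty
# reproduces the sigmoidal membership function.
rng = np.random.default_rng(0)
xs = np.linspace(-2, 2, 9)
mc = np.array([np.mean(crisp_rule(x + rng.uniform(-1, 1, 100_000))) for x in xs])
assert np.allclose(mc, membership_uniform(xs), atol=0.02)
```

Summing several such uniformly uncertain inputs convolves their densities, which is why the aggregated uncertainty in the abstract becomes bell-shaped and Gaussian-like.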
Towards comprehensive foundations of computational intelligence
In: Duch W, Mandziuk J, Eds, Challenges for Computational Intelligence, 2007
Cited by 20 (13 self)
Although computational intelligence (CI) covers a vast variety of different methods, it still lacks an integrative theory. Several proposals for CI foundations are discussed: computing and cognition as compression, meta-learning as search in the space of data models, (dis)similarity-based methods providing a framework for such meta-learning, and a more general approach based on chains of transformations. Many useful transformations that extract information from features are discussed. Heterogeneous adaptive systems are presented as a particular example of transformation-based systems, and the goal of learning is redefined to facilitate the creation of simpler data models. The need to understand data structures leads to techniques for logical and prototype-based rule extraction, and to generation of multiple alternative models, while the need to increase the predictive power of adaptive models leads to committees of competent models. Learning from partial observations is a natural extension towards reasoning based on perceptions, and an approach to intuitive solving of such problems is presented. Throughout the paper, neurocognitive inspirations are frequently used and are especially important in modeling of the higher cognitive functions. Promising directions such as liquid and laminar computing are identified and many open problems presented.
Fuzzy rule-based systems derived from similarity to prototypes
Lecture Notes in Computer Science, Volume 3316, 2004
Cited by 19 (12 self)
Relations between similarity-based systems, evaluating similarity to some prototypes, and fuzzy rule-based systems, aggregating values of membership functions, are investigated. Similarity measures based on information theory and probabilistic distance functions lead to a new type of membership functions applicable to symbolic data. Fuzzy membership functions, on the other hand, lead to a new type of distance functions. Several such novel functions are presented. This approach opens new ways to generate fuzzy rules based either on individual features or on their combinations used to evaluate similarity. Transition from prototype-based rules using similarity to fuzzy rules is illustrated using artificial data in two dimensions. As an illustration of the usefulness of prototype-based rules, very simple rules are derived for leukemia gene expression data.
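The correspondence between the two system types can be illustrated with Gaussian membership functions, where aggregating per-feature memberships by product is the same as transforming a weighted squared distance to the prototype. The particular functions and values below are assumptions for the sketch, not the paper's own choices:

```python
import numpy as np

def membership(x, p, s):
    """Per-feature Gaussian membership functions centered on prototype p."""
    return np.exp(-((x - p) ** 2) / (2 * s ** 2))

x = np.array([0.3, 1.2])   # input vector
p = np.array([0.0, 1.0])   # prototype
s = np.array([0.5, 0.5])   # per-feature widths

# Product aggregation of memberships == similarity transform of a
# weighted squared-Euclidean distance to the prototype.
prod_membership = np.prod(membership(x, p, s))
dist = np.sum((x - p) ** 2 / (2 * s ** 2))
assert np.isclose(prod_membership, np.exp(-dist))
```

The identity runs both ways: a fuzzy rule with product aggregation defines a distance function, and a distance-to-prototype rule defines membership functions, which is the relation the abstract explores.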
Coloring Black Boxes: Visualization of Neural Network Decisions
Int. Joint Conference on Neural Networks, 2003
Cited by 16 (8 self)
Neural networks are commonly regarded as black boxes performing incomprehensible functions. For classification problems, networks provide maps from a high-dimensional feature space to a K-dimensional image space. Images of training vectors are projected on polygon vertices, providing visualization of the network function. Such visualization may show the dynamics of learning, allow for comparison of different networks, display training vectors around which potential problems may arise, show differences due to regularization and optimization procedures, investigate the stability of network classification under perturbation of original vectors, and place a new data sample in relation to training data, allowing for estimation of confidence in the classification of a given sample. An illustrative example for the three-class Wine data is described. The visualization method proposed here is applicable to any black-box system that provides continuous outputs.
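One simple way to realize such a polygon-vertex map is a barycentric projection of the K output activations onto the vertices of a regular K-gon; the normalization used here is an assumption, since the abstract does not spell out the exact mapping:

```python
import numpy as np

def polygon_projection(outputs):
    """Map K-dimensional network outputs to 2-D points inside a regular K-gon.

    Each class owns one vertex; a sample's point is the barycentric
    combination of the vertices weighted by its (normalized) outputs.
    """
    K = outputs.shape[1]
    angles = 2 * np.pi * np.arange(K) / K
    vertices = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (K, 2)
    weights = outputs / outputs.sum(axis=1, keepdims=True)
    return weights @ vertices

# A confident "class 0" output lands exactly on vertex 0;
# maximal confusion lands at the polygon's center.
pts = polygon_projection(np.array([[1.0, 0.0, 0.0], [1/3, 1/3, 1/3]]))
```

Plotting these 2-D points for all training vectors gives the kind of picture the abstract describes: clusters near vertices for confident classifications and points drifting toward the center where the network is uncertain.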
Rule Extraction: Using Neural Networks or for Neural Networks?
Journal of Computer Science and Technology, 2004
Cited by 15 (2 self)
In the research on rule extraction from neural networks, fidelity describes how well the rules mimic the behavior of a neural network, while accuracy describes how well the rules generalize. This paper identifies the fidelity-accuracy dilemma. It argues for distinguishing rule extraction using neural networks from rule extraction for neural networks according to their differing goals, where fidelity and accuracy, respectively, should be excluded from the rule-quality evaluation framework.
Learning highly non-separable Boolean functions using Constructive Feedforward Neural Network
Cited by 11 (8 self)
Learning problems with inherently non-separable Boolean logic are still a challenge that has not been addressed by neural or kernel classifiers. The k-separability concept introduced recently allows for characterization of the complexity of non-separable learning problems. A simple constructive feedforward network that uses a modified form of the error function and window-like functions to localize outputs after projection on a line has been tested on such problems with quite good results. The computational cost of training is low because most nodes and connections are fixed and only the weights of one node are modified at each training step. Several examples of learning Boolean functions and results of classification tests on real-world multi-class datasets are presented.
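The core idea, projecting onto a line and localizing outputs with window functions, is easiest to see on parity, the classic non-separable Boolean problem. The 4-bit case, the diagonal projection direction, and the hard window below are illustrative choices, not the paper's trained network:

```python
import numpy as np
from itertools import product

# All 4-bit inputs and their parity labels: not linearly separable.
X = np.array(list(product([0, 1], repeat=4)))
y = X.sum(axis=1) % 2

# Projection on the diagonal w = (1,1,1,1) maps each input to its bit count.
proj = X @ np.ones(4)

# Along this line the classes form k = 5 pure intervals (counts 0..4
# alternate in parity), so parity is k-separable with k = 5.
for c in range(5):
    labels = y[proj == c]
    assert labels.min() == labels.max() == c % 2

def window(t, a, b):
    """Hard window-like function localizing outputs after the projection."""
    return (a - 0.5 <= t) & (t <= b + 0.5)

# Union of windows over the even intervals recovers the even-parity class.
even = window(proj, 0, 0) | window(proj, 2, 2) | window(proj, 4, 4)
assert np.array_equal(even, y == 0)
```

A constructive network in this spirit would learn the projection weights and window parameters instead of fixing them, but the separation structure it exploits is exactly the one shown here.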
Projection Pursuit Constructive Neural Networks Based on Quality of Projected Clusters
Lecture Notes in Computer Science, Volume 5164, 2008, 754–762
Cited by 10 (4 self)
A linear projection pursuit index measuring the quality of projected clusters (QPC) is used to discover non-local clusters in high-dimensional multi-class data, reduce dimensionality, select features, visualize data, and classify. Constructive neural networks that optimize the QPC index are able to discover the simplest models of complex data, solving problems that standard networks based on error minimization are not able to handle. Tests on problems with complex Boolean logic and a few real-world datasets show the high efficiency of this approach.
Heterogeneous Adaptive Systems
World Congress of Computational Intelligence, 2002
Cited by 9 (6 self)
Most adaptive systems are homogeneous, i.e. they are built from processing elements of the same type. MLP neural networks and decision trees use nodes that partition the input space by hyperplanes. Other types of neural networks use nodes that provide spherical or ellipsoidal decision borders. This may not be the best inductive bias for given data, frequently requiring a large number of processing elements even in cases when simple solutions exist. In heterogeneous adaptive systems (HAS), different types of decision borders are used at each stage, enabling discovery of the most appropriate bias for the data. Neural, decision tree, and similarity-based systems of this sort are described here. Results from a novel heterogeneous decision tree algorithm are presented as an example of this approach.