W. Duch, R. Adamczak, and N. Jankowski, "New Developments in the Feature Space Mapping model," in Proc. 3rd Conf. on Neural Networks, Kule, Poland, Oct. pp. 65--70, 1997.
....have crisp logical rules, which usually can be obtained by decision trees or inductive logic programming. It is also possible to obtain crisp logical rules from neural networks. Here we briefly describe two such methods developed in our group: FSM and MLP2LN. FSM. Feature Space Mapping (FSM) [4] [5] has been introduced as a universal adaptive system based on multidimensional separable functions. Viewed from various perspectives, FSM is a neurofuzzy system, a density estimation network, a memory-based system or an example of a self-organizing system. The main idea is simple: components of the ....
....also to use other localized functions, such as triangular, trapezoidal or rectangular ones. Rectangular functions, $G(x_i; D_i) = 1$ if $x_i \in D_i$ and $0$ otherwise, are very useful for extraction of crisp logical rules in the FSM network. Further details about this network are available in [5]. MLP2LN. MLP2LN [6] is a smooth transformation from an MLP network into a network performing logical operations (Logical Network, LN). This transformation is achieved during network training by: gradually increasing the slope of sigmoidal functions to obtain crisp decision regions; simplifying ....
[Article contains additional citation context not shown here]
W. Duch, R. Adamczak (1997), New developments in Feature Space Mapping model, Third Conference on Neural Networks and Their Applications, Kule, pp. 65-70
....the Resource Allocation Network (RAN) [7] or function estimation approach [8] initialize a single unit only. In FSM [2] a few initial units are created using techniques based on decision trees, dendrograms and the variable resolution scale for data filtering. Transfer functions used in FSM [9] [10] include rectangular, biradial and Gaussian functions. Besides centers and dispersions, parametrization of rotations is also desired. A rotation matrix in N dimensions theoretically needs only N - 1 angles, but in practice it is very difficult to specify rotation parameters (we have found an ....
....Besides centers and dispersions, parametrization of rotations is also desired. A rotation matrix in N dimensions theoretically needs only N - 1 angles, but in practice it is very difficult to specify rotation parameters (we have found an interesting solution to this problem, presented in [10]). Calculation of activation with a full rotation matrix requires N^2 operations for matrix multiplication. Unconstrained transformation matrices are even worse, since the number of adaptive parameters is of the order of N^2 per node, which for large N is quite impractical. We have followed ....
R. Adamczak, W. Duch and N. Jankowski, New developments in the Feature Space Mapping model, Proc. third conference on neural networks and their applications, Kule, Poland, 14-18.10.1997 (this volume)
No context found.
W. Duch, R. Adamczak, and N. Jankowski, "New Developments in the Feature Space Mapping model," in Proc. 3rd Conf. on Neural Networks, Kule, Poland, Oct. pp. 65--70, 1997.
No context found.
Adamczak, R., Duch, W., Jankowski, N.: New developments in the feature space mapping model. In: Third Conference on Neural Networks and Their Applications, Kule, Poland, Polish Neural Networks Society (1997) 65--70
....It may also be used as an associative memory, predicting the values of a part of the X vector from any other part. In previous papers the general philosophy, the training algorithm, parameterization of rotated bicentral transfer functions and initialization of neural network parameters have been described [2, 4, 5]. The FSM model has been extended here to allow for different transfer functions that are added during the process of network growth. After a brief description of the FSM network, recent developments are described and issues connected with optimization of transfer functions are discussed. Results ....
.... ) # (3) For = 2, 2 this function becomes a standard sigmoidal transfer function with weighted activation. Training algorithm: overall complexity, counted as the total number of adaptive parameters of the network, should be minimized. The algorithm used previously to train the FSM network [2, 4] estimated probability density for all classes, requiring explicit representation of convex and concave decision borders. The number of functions required to represent an area outside of a sphere may be quite large. The algorithm has been modified to allow for the ELSE class that has no explicit ....
[Article contains additional citation context not shown here]
W. Duch, R. Adamczak, N. Jankowski, New developments in the Feature Space Mapping model. 3rd Conf. on Neural Networks, Kule, Poland, pp. 65-70 (1997)
....reliability of the rules has not been estimated in these studies since the test set was quite small. New rule extraction methods based on the SSV (Separability Split Value) decision tree [7] and neural methods based on the constrained multilayer perceptron [8] and probability density estimation [9] have been used recently for medical data. Simple rules were discovered, giving accurate results [10]. Selection of reference vectors in the similarity-based methods allows for prototype-based understanding of data [11]. These methods are applied here to the melanoma data. In the next section the ....
....set of crisp logical rules, although fuzzy rules with soft trapezoid membership functions may also be extracted. The procedure is almost fully automatic, giving the user a choice between the simplest possible description of the data and perhaps more accurate, but more complex, description. FSM [9], Feature Space Mapping, is a neural network estimating probability density of data using separable transfer functions. Each component of a transfer function may be interpreted as a context dependent membership function. Using rectangular functions crisp logic rules are derived, while trapezoidal, ....
[Article contains additional citation context not shown here]
Duch, W., Adamczak, R., Jankowski, N., New developments in the Feature Space Mapping model. 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....They use only 3N adaptive parameters, are localized (window type) and separable. In contrast to the Gaussian functions, a single bicentral function may approximate rather complex decision borders. The next step towards even greater flexibility requires rotation of contours provided by each unit [11, 1]. A full N x N rotation matrix is very hard to parameterize with N - 1 independent rotation angles. Using transfer functions with just N - 1 additional parameters per neuron one can implement rotations defining a product form of the combination of sigmoids: C P (x; t, t# , r) N i (A3 i ) 1 ....
R. Adamczak, W. Duch, and N. Jankowski. New developments in the feature space mapping model. In Third Conference on Neural Networks and Their Applications, pages 65--70, Kule, Poland, October 1997.
....neural models. Initial values of the intervals for continuous linguistic variables may be determined by the analysis of histograms, dendrograms, decision trees or clusterization methods. FSM, our probability density estimation neurofuzzy network, is initialized using various clusterization methods [4, 3]. Optimization of linguistic variables is a part of the FSM learning process. These variables are modeled using rectangular, triangular, trapezoidal, Gaussian or soft trapezoidal functions. The learning algorithm [4] finds optimal intervals and logical rules at the same time. These intervals are ....
....neurofuzzy network, is initialized using various clusterization methods [4, 3]. Optimization of linguistic variables is a part of the FSM learning process. These variables are modeled using rectangular, triangular, trapezoidal, Gaussian or soft trapezoidal functions. The learning algorithm [4] finds optimal intervals and logical rules at the same time. These intervals are gradually increased if the number of errors does not grow, until the whole data range for a given feature is covered, indicating that the feature may be removed. MLP (multilayered perceptron) neural models were ....
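The interval-widening idea in the context above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the FSM learning algorithm from [4]: it assumes a single crisp rule given as one closed interval per feature, a two-class problem (rule fires or not), and a fixed widening step. The names `rule_fires`, `count_errors`, `widen_feature` and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def rule_fires(x: Sequence[float], intervals: Sequence[Interval]) -> bool:
    """A crisp rule fires when every feature lies inside its interval."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(x, intervals))


def count_errors(X, y, intervals) -> int:
    """Number of vectors for which the rule output disagrees with the label."""
    return sum(rule_fires(x, intervals) != bool(label) for x, label in zip(X, y))


def widen_feature(X, y, intervals: Sequence[Interval], k: int,
                  step: float) -> Tuple[List[Interval], bool]:
    """Widen the interval of feature k while the error count does not grow.

    Returns the new intervals and a flag that is True when the interval
    ends up covering the whole data range of feature k (assumption: such
    a feature no longer constrains the rule and may be dropped).
    """
    if step <= 0:
        raise ValueError("step must be positive")
    intervals = list(intervals)
    data_lo = min(x[k] for x in X)
    data_hi = max(x[k] for x in X)
    best = count_errors(X, y, intervals)
    while True:
        lo, hi = intervals[k]
        if lo <= data_lo and hi >= data_hi:
            return intervals, True          # whole range covered
        trial = list(intervals)
        # Clip to the data range so the loop always terminates.
        trial[k] = (max(lo - step, data_lo), min(hi + step, data_hi))
        errors = count_errors(X, y, trial)
        if errors > best:
            return intervals, False         # widening would add errors
        intervals, best = trial, errors


if __name__ == "__main__":
    # Toy data: the class depends on feature 0 only; feature 1 is irrelevant.
    X = [(0.4, 0.1), (0.5, 0.9), (0.6, 0.5), (0.1, 0.2), (0.9, 0.8)]
    y = [1, 1, 1, 0, 0]
    rule = [(0.3, 0.7), (0.4, 0.6)]
    rule, drop1 = widen_feature(X, y, rule, k=1, step=0.1)
    rule, drop0 = widen_feature(X, y, rule, k=0, step=0.1)
    print("feature 1 removable:", drop1)
    print("feature 0 removable:", drop0)
```

In this toy setting the interval on the irrelevant feature grows to the full data range, while widening the relevant feature stops as soon as it would start misclassifying vectors.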
[Article contains additional citation context not shown here]
W. Duch, R. Adamczak, N. Jankowski. New developments in the Feature Space Mapping model, 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....or a triangular fuzzy number. Unfortunately, histograms for all features frequently overlap. Therefore we have developed several methods for determination of initial linguistic variables. A. Selection using density networks. Feature Space Mapping (FSM) is a constructive neural network [42] [53] [54] that estimates the probability density p(C X ,Y ; M) of input X output Y pairs in each class C. Nodes of this network use localized, separable transfer functions, providing good linguistic variables. Crisp decision regions are obtained by using rectangular transfer functions; if this is not ....
....functions (or soft rectangular functions that are changed into rectangular during training) may be easier to use. FSM uses efficient clusterization procedures (based on dendrograms or decision trees) for initialization, frequently obtaining quite good results without any training (see papers [42] [53] [54], where details of the training algorithm are described). Each network node covers a cluster of input vectors. The training procedure changes the node parameters (such as their positions in the input space) until the error function reaches a minimum. Nodes that cover only a few training ....
[Article contains additional citation context not shown here]
W. Duch, R. Adamczak, N. Jankowski, "New developments in the Feature Space Mapping model". 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
No context found.
W. Duch, R. Adamczak, N. Jankowski. New developments in the Feature Space Mapping model, 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
.... 63.2 FOIL (inductive logic) 99 60.1 T2 (rules from decision tree) 67.5 53.3 1R (rules) 58.4 50.3 Naive Bayes 46.6 IB2 IB4 81.2 85.5 43.6 44.6 (Bayes, LDA) and pattern recognition methods (LVQ). The best classification results were obtained with the committee of 50 FSM neural networks [14, 15] (in Table 1 shown as FSM 50), reaching 81%. The k nearest neighbors (kNN) method with k=1, Manhattan distance function and selection of features gives 80.4% accuracy (for details see [13]), and after feature weighting 82.8% (the training accuracy of kNN is estimated using the leave-one-out method). K ....
Duch, W., Adamczak, R., Jankowski, N. (1997) New developments in the Feature Space Mapping model. 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....input data by moving, decreasing and increasing the nodes, or by adding new nodes if necessary. The FSM network may use any separable transfer functions, including triangular, trapezoidal, Gaussian, or the biradial combinations $\prod_i \left(s(x_i - b_i) - s(x_i - b'_i)\right)$ of sigmoidal functions [8] with soft trapezoidal shapes. If the gain of the sigmoidal functions s(x) is slowly increased during learning, rectangular functions are smoothly recovered. Another approach is based on a general separability criterion introduced recently [9]. The best split value is the one which separates the maximal ....
W. Duch, R. Adamczak, N. Jankowski, New developments in the Feature Space Mapping model, 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....per unit and may represent quite complex decision borders. Semi-bicentral functions and bicentral functions with independent slopes provide local and non-local units in one network. The next step towards even greater flexibility requires individual rotation of contours provided by each unit [78, 79]. Of course one could introduce a rotation matrix operating on the [page header: NEURAL COMPUTING SURVEYS 2, 163-212, 1999, http://www.icsi.berkeley.edu/~jagota/NCS] [figure residue: plot panels titled by Biases and Slopes values] ....
....needed in this case. Rotation adds only N - 1 parameters for the C_P(·) function and N parameters for the C_K(·) function. Bicentral functions with rotations (as well as multivariate Gaussian functions with rotation) have been implemented so far only in two neural network models, the Feature Space Mapping [65, 79] and the IncNet [80, 81, 82]. Bicentral functions with rotation and two slopes. The most complex bicentral function is obtained by combining rotations with two independent slopes (see Fig. 29). ....
[Article contains additional citation context not shown here]
R. Adamczak, W. Duch, and N. Jankowski, "New developments in the feature space mapping model", in Third Conference on Neural Networks and Their Applications, Kule, Poland, Oct. 1997. pp. 65--70.
....criminal tendencies etc. determined by expert psychologists, and the second for men, with 1167 cases and 28 classes. Rules were generated using the C4.5 classification tree [11], a very good classification system which may generate logical rules, and the Feature Space Mapping (FSM) neural network [12,13], since these two systems were the easiest to use on such complex data. These results are for the reclassification accuracy only, using generated sets of rules. Statistical estimation of generalization by 10-fold crossvalidation gave 82-85% correct answers with FSM (crisp unoptimized rules) and ....
Duch, W., Adamczak, R., Jankowski, N. (1997) New developments in the Feature Space Mapping model. 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....example, in a larger modular network W (X) WG(X ; P k , x0 ) may be defined for a number of reference points P k in the input space. Networks appropriate for this region of space may then perform specific actions. The architecture should be similar to the constructive RBF type of networks (cf. [10,20,21]) with appropriate action conditions performed by the output nodes. 3 Black box systems, assessment of reliability and data explanation Neural systems usually do not provide explanation or even an assessment of reliability of their decisions. The same is true for other CI methods, such as ....
....malingerers, persons with criminal tendencies etc. determined by expert psychologists. Crisp logical rules for these data were generated using the C4.5 classification tree [16], a very good classification system which may generate logical rules, and the Feature Space Mapping (FSM) neural network [20,21], since these two systems were the easiest to use on such complex data. Statistical estimation of generalization by 10-fold crossvalidation gave 82-85% correct answers for crisp unoptimized FSM rules; for the C4.5 decision tree the accuracy was in the 79-84% range. If Gaussian uncertainty is included ....
[Article contains additional citation context not shown here]
Duch, W., Adamczak, R., Jankowski, N. (1997) New developments in the Feature Space Mapping model. 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
.... or using special linguistic units (L units) in an MLP (multilayer perceptron) network [14]. The FSM neural network uses arbitrary separable transfer functions, including triangular, trapezoidal, Gaussian, or the bicentral combinations $\prod_i \left(s(x_i - b_i) - s(x_i - b'_i)\right)$ of sigmoidal functions [8] with soft trapezoidal shapes. If the slope of the sigmoidal functions s(x) is slowly increased during learning, rectangular functions are smoothly recovered. After training, nodes of the FSM network are analyzed, providing good intervals for logical variables. Linguistic neural units (L units) ....
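The claim that rectangular functions are recovered as the sigmoid slope grows can be checked numerically. The sketch below is not the FSM implementation: it assumes the per-feature bicentral term is a difference of two logistic sigmoids, s(slope * (x_i - b_i)) - s(slope * (x_i - b'_i)) with b_i < b'_i, and that the node output is the product over features. The function names, the single shared `slope` parameter and the example intervals are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bicentral(x: Sequence[float], lower: Sequence[float],
              upper: Sequence[float], slope: float) -> float:
    """Product over features of a soft window between lower[i] and upper[i].

    Assumed form: s(slope * (x_i - lower_i)) - s(slope * (x_i - upper_i)).
    """
    value = 1.0
    for xi, lo, hi in zip(x, lower, upper):
        value *= sigmoid(slope * (xi - lo)) - sigmoid(slope * (xi - hi))
    return value


def rectangular(x: Sequence[float], lower: Sequence[float],
                upper: Sequence[float]) -> float:
    """Crisp counterpart: 1 inside the hyperbox, 0 outside."""
    inside = all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))
    return 1.0 if inside else 0.0


if __name__ == "__main__":
    lower, upper = [0.0, 0.0], [1.0, 2.0]
    inside_point, outside_point = [0.5, 1.0], [1.5, 1.0]
    for slope in (1.0, 10.0, 1000.0):
        print(slope,
              bicentral(inside_point, lower, upper, slope),
              bicentral(outside_point, lower, upper, slope))
```

With a small slope the window is soft (values strictly between 0 and 1); with a large slope the output is effectively 1 inside the hyperbox and 0 outside, matching `rectangular` away from the box edges.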
....functions (or soft rectangular functions that are changed into rectangular during training) is frequently simpler to use. FSM uses good clusterization procedures (based on dendrograms or decision trees) for initialization, frequently obtaining quite good results without any training [7] [8]. After training nodes that cover only a few training vectors are removed and nodes that cover many training vectors are optimized. The node covering the largest number of vectors, assigned to class C i , is selected (this node corresponds to the most general logical rule) choosing feature k = ....
W. Duch, R. Adamczak, N. Jankowski, New developments in the Feature Space Mapping model, 3rd Conf. on Neural Networks, Kule, Poland, Oct. 1997, pp. 65-70
....delusions, psychosis, etc. The data sets consist of 1027 and 1167 examples, for the 27-class and 28-class sets respectively. Figure 4 shows the learning of one single class, displaying the changes of accuracy and the number of neurons. In Tables 2 and 3 a comparison of generalization for IncNet, FSM (Adamczak et al. 1997) and C4.5 is shown. In Table 2 the overall performance is presented, and in Table 3 the generalization after dividing the whole set into training and testing sets for 10 90 and 5 95 learning. Figure 5 shows the confusion matrix (on the left). It clearly shows that there are just a few ....
Adamczak, R., Duch, W., and Jankowski, N. (1997). New developments in the Feature Space Mapping model.
....delusions, psychosis, etc. The data sets consist of 1027 and 1167 examples, for the 27-class and 28-class sets respectively. Figure 4 shows the learning of one single class, displaying the changes of accuracy and the number of neurons. In Tables 2 and 3 a comparison of generalization for IncNet, FSM (Adamczak et al. 1997) and C4.5 is shown. In Table 2 the overall performance is presented, and in Table 3 the generalization after dividing the whole set into training and testing sets for 10 90 and 5 95 learning. Figure 5 shows the confusion matrix (on the left). It clearly shows that there are just a few ....
Adamczak, R., Duch, W., and Jankowski, N. (1997). New developments in the feature space mapping model.
....by Sugeno, Kosi#ski, and Horikawa [9] (Table 1). Although this function is frequently used for testing the approximation capabilities of adaptive systems, there is no standard procedure to select the training points, and thus the results are rather hard to compare. For training, 216 points from the [1, 6] interval, and for testing, 125 points from the [1.5, 5.5] interval, were randomly chosen. All tests were performed using the same (if possible) or similar initial parameters. The Av Model APE TRS APE TES GMDS model Kongo 4.7 5.7 Fuzzy model 1 Sugeno 1.5 2.1 Fuzzy model 2 Sugeno 0.59 3.4 FNN Type 1 ....
....[1 unit of time is a single learning pair presentation. 1027 and 1167 examples, for the 27-class and 28-class sets respectively. Figure 3 shows the learning of one single class, displaying the changes of accuracy and the number of neurons. In Tables 2 and 3 a comparison of generalization for IncNet, FSM [1] and C4.5 is shown. In Table 2 the overall performance is presented, and in Table 3 the generalization after dividing the whole set into training and testing sets for 10 90 and 5 95 learning. Figure 4 shows the confusion matrix (on the left). It clearly shows that there are just a few ....
R. Adamczak, W. Duch, and N. Jankowski. New developments in the Feature Space Mapping model. In Third Conference on Neural Networks and Their Applications, pages 65--70, Kule, Poland, Oct. 1997.