| W. Duch and N. Jankowski, "New neural transfer functions." Neural Computing Surveys, vol. 2, pp. 639--658, 1999. |
....(for example, Euler's angles) and to calculate the derivatives necessary for the backpropagation training procedure. Rotated densities in all dimensions may be obtained in two ways using transfer functions with just $N-1$ additional parameters per neuron. In the first approach (for the second see [Duch and Jankowski, 1999; 1997]) a product form of the combination of sigmoids is used (see Fig. 3): $C_P(x; t, t', R) = \prod_{i}^{N} \left( \sigma(R_i x + t_i) - \sigma(R_i x + t'_i) \right)$ (10), $SC_P(x; t, t', p, r, R) = \prod_{i}^{N} \left( p_i \cdot \sigma(R_i x + t_i) + r_i \cdot \sigma(R_i x + t'_i) \right)$ (11), where $R_i$ is the $i$-th row of the rotation ....
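The product form quoted above can be sketched in a few lines. This is only an illustration: the operators inside each factor were lost in extraction, so the difference of two sigmoids used here is an assumption, and the names `rotated_window_product`, `R`, `t` and `t_prime` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def rotated_window_product(x, R, t, t_prime):
    """Product over rows R_i of sigma(R_i.x + t_i) - sigma(R_i.x + t'_i).

    Assumed reading of the garbled formula; each factor is a soft window
    along the direction given by row R_i.
    """
    value = 1.0
    for row, ti, tpi in zip(R, t, t_prime):
        activation = sum(r * xi for r, xi in zip(row, x))
        value *= sigmoid(activation + ti) - sigmoid(activation + tpi)
    return value
```

With `R` set to the identity the function reduces to an axis-aligned soft window; any other matrix rotates or shears the contours.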
W. Duch and N. Jankowski. New neural transfer functions. Journal of Applied Mathematics and Computer Science, 7(3):639--658, 1997.
.... a window-type rectangular function [3, 8] (for simplicity of notation the dependence of L functions on the slope is not shown below). A similar smooth transformation is used in the FSM network using biradial transfer functions, which are combinations of products of $L(x_i; b_i; b'_i)$ functions [9] with some additional parameters. Outputs of L units $L(x_i; b_i; b'_i)$ are usually combined and filtered through another sigmoid, $\sigma(\sum_{ij} L(x_i; b_{ij}; b'_{ij}))$, or the product $\prod_{ij} L(x_i; b_{ij}; b'_{ij})$ of these functions is used. [Figure: diagram of an L unit with input x, weights W and biases b] ....
....rules from data has been presented. Neural networks, either density estimation (FSM) or constrained multilayered perceptrons (MLPs), are used to obtain initial sets of rules. FSM is trained either directly with the rectangular functions or by making a smooth transition from biradial functions [9] or trapezoidal functions to rectangular functions. MLPs are trained with constraints that change them into networks processing logical functions, either by simplification of typical MLPs or by incremental construction of networks performing logical functions. In this paper we have used mostly the ....
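A minimal sketch of the two ways of combining L-unit outputs mentioned above. It assumes that an L unit is the difference of two sigmoids with a shared slope (the definition is not spelled out in this excerpt), and the function names and the `slope` argument are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def l_unit(x, b, b_prime, slope=1.0):
    """Soft window on [b, b_prime], assumed to be a difference of two sigmoids."""
    return sigmoid(slope * (x - b)) - sigmoid(slope * (x - b_prime))


def combine_sum(xs, bounds, slope=1.0):
    """Sum the L-unit outputs and filter the sum through another sigmoid."""
    total = sum(l_unit(x, b, bp, slope) for x, (b, bp) in zip(xs, bounds))
    return sigmoid(total)


def combine_product(xs, bounds, slope=1.0):
    """Multiply the L-unit outputs; one closed window closes the whole unit."""
    value = 1.0
    for x, (b, bp) in zip(xs, bounds):
        value *= l_unit(x, b, bp, slope)
    return value
```

The product behaves like a conjunction of the one-dimensional windows, while the filtered sum still responds when only some windows are open.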
W. Duch and N. Jankowski, "New neural transfer functions". Applied Mathematics and Computer Science 7, 639-658 (1997)
.... process fuzzy membership functions ("soft trapezoids") are transformed into rectangular functions [4, 9]. A similar smooth transformation is used in the FSM network using biradial transfer functions, which are additionally parameterized combinations of products of $L(x_i; b_i; b'_i)$ functions [10]. [Figure: two sigmoidal neurons with input x, weights W and biases b] Fig. 2. L units, or pairs of neurons with constrained weights, used for determination of linguistic variables. Outputs of L units $L(x_i; b_i; b'_i)$ are combined and filtered through another sigmoid $\sigma(\sum_{ij} L(x_i; b_{ij}$ ....
....rules from data has been presented. Neural networks, either density estimation (FSM) or constrained multilayered perceptrons (MLPs), are used to obtain initial sets of rules. FSM is trained either directly with the rectangular functions or by making a smooth transition from biradial functions [10] or trapezoidal functions to rectangular functions. MLPs are trained with constraints that change them into networks processing logical functions, either by simplification of typical MLPs or by incremental construction of networks performing logical functions. In this paper we have used mostly the ....
W. Duch and N. Jankowski, New neural transfer functions. Applied Math. & Comp. Science 7 (1997) 639-658
....the most flexible densities (flexible shapes of $\phi(X) = \mathrm{const}$ contours) with the smallest number of adaptive parameters. A small network with a few complex nodes is equivalent to a large network with simple nodes. Recently we have reviewed various transfer functions suitable for neural networks [5]. The simplest functions with suitable properties for density modeling are of Gaussian or approximated Gaussian type. Our favorite functions are of the biradial type: $Bi(X; D, b, s) = \prod_{i=1}^{N} \sigma(e^{s_i} \cdot (X_i - D_i + e^{b_i}))$ $(1 - \sigma(e^{s_i} \cdot (X_i - D_i - e^{b}$ ....
.... SBi have $3N+1$ parameters (if $\beta = 1 - \alpha$), $3N+2$ parameters (independent $\alpha$, $\beta$), or up to $5N$ parameters (different $\alpha_i$ and $\beta_i$ used in each dimension). In our simulations we have found that biradial functions lead to faster convergence with a smaller number of nodes than Gaussian functions [5]. The function realized by the FSM network represents crisp logical rules if the contours of constant density are defined by cuboids. Fuzzy logical rules require probability densities described by separable functions, with each one-dimensional component equivalent to a membership function ....
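The biradial function quoted above can be written out directly. The sketch below follows the reading in which each dimension contributes a rising sigmoid at $D_i - e^{b_i}$ times a falling one at $D_i + e^{b_i}$; the plus and minus signs are reconstructed from the garbled text, and the Python names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def biradial(x, d, b, s):
    """Bi(X; D, b, s): product over dimensions of a pair of sigmoids.

    d holds the centers, exp(b_i) the half-widths and exp(s_i) the slopes.
    """
    value = 1.0
    for xi, di, bi, si in zip(x, d, b, s):
        slope = math.exp(si)
        half_width = math.exp(bi)
        rising = sigmoid(slope * (xi - di + half_width))
        falling = 1.0 - sigmoid(slope * (xi - di - half_width))
        value *= rising * falling
    return value
```

Because the function is a product of one-dimensional terms, it is separable, which is the property the excerpt relies on when relating it to membership functions.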
W. Duch and N. Jankowski, "New Neural Transfer Functions," submitted to J. of Applied Mathematics and Computer Science, 1997
No context found.
W. Duch and N. Jankowski, "New neural transfer functions." Neural Computing Surveys, vol. 2, pp. 639--658, 1999.
No context found.
W. Duch and N. Jankowski. New neural transfer functions. Jour. of Applied Math. and Computer Science. submitted.
....the type of transfer functions used. A survey of transfer functions suitable for neural networks has been presented in [6], and a taxonomy of such functions in [7]. (Footnote: The authors acknowledge the POL 040 98 grant for German-Polish collaboration. W.D. is also grateful for support by the Polish Committee for Scientific Research, grant 8 T11C 006 19.) Constructive algorithms that add one transfer function at a time are very attractive but so far have been restricted to functions of the same type. Models that select or optimize the type of function that is added have not been used so far. Feature Space ....
....only radial basis functions that are separable. RBF functions are defined relative to only one center, $\|X - R\|$. Bicentral functions obtained by subtraction of two sigmoidal functions are a natural generalization of trapezoidal functions to soft continuous shapes; they use two reference vectors [6]. It may also be of advantage to consider classes of functions with nonlinear parameters that strongly influence their contours. Several classes of such functions may be defined: conical functions combining weighted activation with distance function: X;W,R, X R) W X R ) 2) change ....
W. Duch and N. Jankowski, New neural transfer functions. Neural Computing Surveys 2, 639-658 (1999)
....They use only $3N$ adaptive parameters, are localized (window-type) and separable. In contrast to the Gaussian functions, a single bicentral function may approximate rather complex decision borders. The next step towards even greater flexibility requires rotation of contours provided by each unit [11, 1]. A full $N \times N$ rotation matrix is very hard to parameterize with $N-1$ independent rotation angles. Using transfer functions with just $N-1$ additional parameters per neuron one can implement rotations defining a product form of the combination of sigmoids: $C_P(x; t, t', r) = \prod_{i}^{N}$ (A3 i ) 1 ....
W. Duch and N. Jankowski. New neural transfer functions. Journal of Applied Mathematics and Computer Science, 7(3):639--658, 1997.
....by P-rules with appropriate similarity functions. The reverse does not hold; for example the Manhattan distance function $D(X, P) = \sum_{i=1}^{N} |X_i - P_i|$ (6) does not seem to be equivalent to any combination of membership functions and T-norms. Many other distance measures are useful [9], for example Canberra: $D_{Ca}(X, Y) = \sum_{i=1}^{N} |X_i - Y_i| / |X_i + Y_i|$ (7). A more general form of rules is obtained if more than one prototype is used in the rule condition: IF among the k most similar prototypes $P_i$ class C is the most common THEN $C(X) = C$. Such classification rules allow for ....
....complex decision borders but may seem more difficult to understand and may require more prototypes (at least k) per class. In approximation problems k-prototype rules will be more useful. Oblique distribution of data may require linear combination, or nonlinear transformation, of input features [9]. The meaning of rules built with such features may be difficult to comprehend. Convex, polyhedral shapes obtained from a union of halfspaces defined by hyperplanes also do not lead to comprehensible rules. Methods. Prototype-based rules may be created in many ways. Prototypes are useful in ....
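As an illustration of the two distance functions and the k-prototype rule described above, here is a short sketch. The Canberra denominator $|X_i + Y_i|$ is a reconstruction of the garbled formula, skipping terms with a zero denominator is an added assumption, and `classify` with its tie-breaking behaviour is a hypothetical helper, not the authors' procedure.

```python
from collections import Counter


def manhattan(x, p):
    """D(X, P) = sum_i |X_i - P_i|."""
    return sum(abs(a - b) for a, b in zip(x, p))


def canberra(x, y):
    """D_Ca(X, Y) = sum_i |X_i - Y_i| / |X_i + Y_i| (zero denominators skipped)."""
    total = 0.0
    for a, b in zip(x, y):
        denominator = abs(a + b)
        if denominator > 0:
            total += abs(a - b) / denominator
    return total


def classify(x, prototypes, k, distance=manhattan):
    """Return the most common class among the k prototypes nearest to x.

    prototypes is a list of (vector, class_label) pairs; ties go to the
    class encountered first among the nearest prototypes.
    """
    nearest = sorted(prototypes, key=lambda item: distance(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With `k=1` this reduces to a single-prototype rule; larger `k` gives the more general rule form at the cost of needing at least k prototypes per class.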
Duch W, Jankowski N. New neural transfer functions. Neural Computing Surveys 2 (1999) 639-658
.... the learning process, transforming the fuzzy membership function ("soft trapezoid") into a window-type rectangular function [2, 5]. A similar smooth transformation is used in the FSM network using biradial transfer functions, which are combinations of products of $L(x_i; b_i, b'_i)$ functions [6] with some additional parameters. Outputs of L units $L(x_i; b_i, b'_i)$ are usually combined and filtered through another sigmoid, $\sigma(\sum_{ij} L(x_i; b_{ij}, b'_{ij}))$, or the product $\prod_{ij} L(x_i; b_{ij}, b'_{ij})$ of these functions is used. Initial linguistic variables have a strong ....
W. Duch and N. Jankowski. New neural transfer functions. Applied Math. & Comp. Science 7 (1997) 639-658
No context found.
W. Duch and N. Jankowski, New neural transfer functions. Applied Math. & Comp. Science 7 (1997) 639-658
.... a window-type rectangular function [3, 8] (for simplicity of notation the dependence of L functions on the slope is not shown below). A similar smooth transformation is used in the FSM network using biradial transfer functions, which are combinations of products of $L(x_i; b_i, b'_i)$ functions [9] with some additional parameters. Outputs of L units $L(x_i; b_i, b'_i)$ are usually combined and filtered through another sigmoid, $\sigma(\sum_{ij} L(x_i; b_{ij}, b'_{ij}))$, or the product $\prod_{ij} L(x_i; b_{ij}, b'_{ij})$ of these functions is used. [Figure: diagram of an L unit with input x, weights W and biases b] ....
....as benign, giving overall accuracy of 96%. Optimization of this set of rules gives: R 1 : f 2 6 # f4 3 # f 8 8 (99.8) R 2 : f 2 9 # f 5 4 # f 7 # f 8 5 (100) R 3 : f 2 10 # f4 4 # f 5 4 # f 7 3 (100) R 4 : f 2 7 # f4 9 # f 5 3 # f 7 # [4, 9] # f 8 4 (100) R 5 : f 2 # [3, 4] # f4 9 # f 5 10 # f 7 6 # f 8 8 (99.8) These rules classify only 1 benign vector as malignant (R1 and R5, the same vector) and the ELSE condition for the benign class makes 6 errors, giving 99.00% overall accuracy of this set of ....
W. Duch and N. Jankowski, "New neural transfer functions". Applied Mathematics and Computer Science 7, 639-658 (1997)
....the most flexible densities (flexible shapes of $\phi(X) = \mathrm{const}$ contours) with the smallest number of adaptive parameters. A small network with a few complex nodes is equivalent to a large network with simple nodes. Recently we have reviewed various transfer functions suitable for neural networks [5]. The simplest functions with suitable properties for density modeling are of Gaussian or approximated Gaussian type. Our favorite functions are of the biradial type: $Bi(X; D, b, s) = \prod_{i=1}^{N} \sigma(e^{s_i} \cdot (X_i - D_i + e^{b_i})) \, (1 - \sigma(e^{s_i} \cdot (X_i - D_i - e^{b_i})))$ (1) where ....
.... functions SBi have $3N+1$ parameters (if $\beta = 1 - \alpha$), $3N+2$ parameters (independent $\alpha$, $\beta$), or up to $5N$ parameters (different $\alpha_i$ and $\beta_i$ used in each dimension). In our simulations we have found that biradial functions lead to faster convergence with a smaller number of nodes than Gaussian functions [5]. The function realized by the FSM network represents crisp logical rules if the contours of constant density are defined by cuboids. Fuzzy logical rules require probability densities described by separable functions, with each one-dimensional component equivalent to a membership function ....
W. Duch and N. Jankowski, "New Neural Transfer Functions," submitted to J. of Applied Mathematics and Computer Science, 1997
....using the Feature Space Mapping network (FSM network, a model similar to RBF but based on separable functions) [16]. Sufficiently large values of the slopes are needed to change the bicentral functions into rectangular functions. Early results obtained with these functions are presented in [17]. 3 CONCLUSIONS. Most neural networks are based on either sigmoidal or Gaussian transfer functions. In the functional link networks of Pao [18] a combination of various functions, such as polynomial, periodic, sigmoidal and Gaussian functions, is used. We have presented a taxonomy of different ....
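The remark about large slopes can be checked numerically. The sketch below assumes the one-dimensional bicentral form (a rising sigmoid times a falling one) and measures how far it is from a 0/1 rectangular window away from the edges; `max_gap_to_rectangle`, the sampling grid and the `margin` value are illustrative choices only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bicentral_1d(x, center, half_width, slope):
    """One-dimensional bicentral window with an explicit slope."""
    rising = sigmoid(slope * (x - center + half_width))
    falling = 1.0 - sigmoid(slope * (x - center - half_width))
    return rising * falling


def max_gap_to_rectangle(slope, center=0.0, half_width=1.0, margin=0.1):
    """Largest deviation from the 0/1 window, ignoring points near the edges."""
    worst = 0.0
    for step in range(-300, 301):
        x = step / 100.0
        if abs(abs(x - center) - half_width) < margin:
            continue  # skip the transition zone around each edge
        target = 1.0 if abs(x - center) < half_width else 0.0
        gap = abs(bicentral_1d(x, center, half_width, slope) - target)
        worst = max(worst, gap)
    return worst
```

Calling `max_gap_to_rectangle` with slopes 1, 10 and 100 shows the deviation shrinking as the slope grows, which is the smooth-to-rectangular transition the excerpt refers to.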
W. Duch, N. Jankowski, "New neural transfer functions", J. of Applied Mathematics and Computer Science 7: 639--658, 1997.
....Bi(x; t, b, s) is possible by shifting centers t, rescaling b and s. Radial basis functions are defined relative to only one center, $\|x - t\|$. Here components of two centers are used, $t_i + e^{b_i}$ and $t_i - e^{b_i}$; therefore we have previously called these functions biradial functions [78], but perhaps the name bicentral is more appropriate. The product form leads to well-localized convex contours of bicentral functions. Exponentials $e^{s_i}$ and $e^{b_i}$ are used instead of $s_i$ and $b_i$ parameters to prevent oscillations during the learning procedure (learning becomes more stable). The ....
....per unit and may represent quite complex decision borders. Semi-bicentral functions and bicentral functions with independent slopes provide local and non-local units in one network. The next step towards even greater flexibility requires individual rotation of contours provided by each unit [78, 79]. Of course one could introduce a rotation matrix operating on the .... [Running header: Neural Computing Surveys 2, 163-212, 1999, http://www.icsi.berkeley.edu/~jagota/NCS] [Figure: plots with panel titles "Biases 1 1, Slopes .5 .5" and "Biases 5 5, Slopes 1 1"] ....
W. Duch and N. Jankowski, "New neural transfer functions", Journal of Applied Mathematics and Computer Science, vol. 7, no. 3, pp. 639--658, 1997.
....C, $2^{R_C}$ is a set of all subsets of $R_C$, and $|R|$ is the number of elements in $R$. The uncertainty $s_i$ of features may for some data depend on the values of $X_i$. Classification probabilities may in such cases be based on a direct calculation of optimal soft trapezoidal membership functions [6]. Linguistic units of neural networks with LR architecture provide such window-type membership functions, $L(x; a, b) = \sigma(x - a) - \sigma(x - b)$. Relating the slope to the input uncertainty makes it possible to calculate probabilities in agreement with the Monte Carlo sampling. A network rule node (R node) computes ....
Duch, W., Jankowski, N. (1999) New neural transfer functions. Neural Computing Surveys 2, 639-658
.... purpose some restrictions on meaningful transformations of input features should be set first, combinations of these features determined by an MLP-type layer, linguistic variables extracted by a constructive network with a transfer function capable of local response (such as the bicentral functions [11]), and finally the outputs should be combined by a rule layer. The C-MLP2LN network [12, 13] has an aggregation layer (A layer), a linguistic variables layer (L layer), a rule layer (R layer) and an output layer. This network was used previously in data mining tasks [13] but it is also well suited as ....
....which an approximation using any fuzzy system will converge very slowly. The same is true for the MLP networks, which for some data distributions may suffer from slow convergence. The importance of a proper choice of the transfer functions in neural networks has been realized only quite recently [11]. Undoubtedly there are many industrial applications of fuzzy logic, although they seem to be restricted to control problems [19]. Since global control models are impossible to construct, the space of control parameters is divided into subregions where appropriate actions are fired by rules; in ....
Duch, W., Jankowski, N. (1999) New neural transfer functions. Neural Computing Surveys 2, 639--658
....is the nearest center of a basis function to the vector $x_n$, and $e_{\min}$, $\delta_{\min}$ are some experimentally chosen constants. The growing network can be described by $f^{(n)}(x; p) = \sum_{i=1}^{k-1} w_i G_i(x; p_i) + e_n G_k(x; p_k) = \sum_{i=1}^{k} w_i G_i(x; p_i)$, where $p_k$ includes centers $x_n$ and other adaptive parameters which are set up with some initial values. (Footnote 1: For an interesting review of many other transfer functions see [3].) If the growth criteria are not satisfied the RAN network uses the LMS algorithm to estimate free parameters. Although the LMS algorithm is faster than Extended ....
....] ii (6) where $\chi^2_{n,\theta}$ is $\theta\%$ confidence on $\chi^2$ distribution for one degree of freedom. Neurons are pruned if the saliency L is too small and/or the uncertainty of the network output $R_y$ is too big. Bi-radial Transfer Functions: To obtain greater flexibility the bi-radial transfer functions [3] are used instead of Gaussians. These functions are built from products of pairs of sigmoidal functions for each variable and produce decision regions for classification of almost arbitrary shapes. $Bi(x; t, b, s) = \prod_{i=1}^{N} \sigma(e^{s_i} \cdot (x_i - t_i + e^{b_i})) \, (1 - \sigma(e^{s_i}$ ....
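The growth step described above, adding a unit $G_k$ centered at $x_n$ with weight $e_n$ when both thresholds are exceeded, can be sketched as follows. Gaussian basis functions with one shared width are an assumption made for the example, the class and method names are hypothetical, and the LMS update used when the criteria fail is deliberately left out.

```python
import math


def gaussian(x, center, width):
    """Gaussian basis function G(x; center) with a shared width (assumed)."""
    d2 = sum((a - c) ** 2 for a, c in zip(x, center))
    return math.exp(-d2 / (width ** 2))


class GrowingNetwork:
    def __init__(self, e_min, delta_min, width):
        self.e_min = e_min          # error threshold
        self.delta_min = delta_min  # distance threshold
        self.width = width
        self.units = []             # list of (weight, center) pairs

    def predict(self, x):
        return sum(w * gaussian(x, c, self.width) for w, c in self.units)

    def observe(self, x, y):
        """Add a unit at x with weight e_n if both growth criteria hold."""
        error = y - self.predict(x)
        nearest = min(
            (math.sqrt(sum((a - c) ** 2 for a, c in zip(x, center)))
             for _, center in self.units),
            default=math.inf,
        )
        if abs(error) > self.e_min and nearest > self.delta_min:
            self.units.append((error, list(x)))
            return True
        return False  # a full RAN would adjust parameters with LMS here
```

Since $G_k(x_n) = 1$ for a Gaussian centered at $x_n$, the new unit makes the network output at $x_n$ equal to the target, which matches the two equal sums in the formula.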
W. Duch and N. Jankowski. New neural transfer functions. Jour. of Applied Math. and Computer Science. submitted.
....(EKF) algorithm [2] we decided to use the EKF algorithm because it exhibits fast convergence, uses a lower number of neurons in the hidden layer [13, 14] and gives some tools which would be useful in control of the growth and pruning process. (Footnote 3: For an interesting review of many other transfer functions see [4].) The Goal of IncNet Pro. The main goal of our research was to build a network which would be able to preserve information in as complex a net as the data shown to the network during the learning time. The IncNet Pro tries to solve the above task in 4 ways: • ESTIMATION: The typical learning ....
....] ii (10) where $\chi^2_{n,\theta}$ is $\theta\%$ confidence on $\chi^2$ distribution for one degree of freedom. Neurons are pruned if the saliency L is too small and/or the uncertainty of the network output $R_y$ is too big. Bi-radial Transfer Functions: To obtain greater flexibility the bi-radial transfer functions [4] are used instead of Gaussians. These functions are built from products of pairs of sigmoidal functions for each variable and produce decision regions for classification of almost arbitrary shapes. $Bi(x; t, b, s) = \prod_{i=1}^{N} \sigma(e^{s_i} \cdot (x_i - t_i + e^{b_i})) \, (1 - \sigma(e^{s_i} \cdot (x_i - t$ ....
W. Duch and N. Jankowski. New neural transfer functions. Jour. of Applied Math. and Computer Science, 1997. (in print).
CiteSeer.IST - Copyright Penn State and NEC